Math Genius: Distribution of maximum of a moving average sequence

Let $Z_n \sim WN(0,\sigma^2)$ and $a \in \mathbb{R}$. Then $$X_t = Z_t + aZ_{t-1} \qquad t = 0, \pm 1, \pm 2, \dots$$ defines an $MA(1)$ sequence. I need to prove that this sequence has an independent extremal index. Therefore I need to prove that as $n \rightarrow \infty$, for each $\tau > 0$

$$P(n(1-F(M_n)) \geq \tau) \rightarrow e^{-\tau}$$

with $M_n = \max_{\forall t} (X_t)$ and $F$ the continuous cdf of $X_t$. I am trying to make a start by working with $M_n = \max(Z_t + aZ_{t-1})$ and getting to know its distribution…
I already have the proof for the case where $X_t$ is an iid sequence.
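
Before attempting a proof, the claimed limit can be explored numerically. The following is a minimal Monte Carlo sketch, not a proof, and it rests on assumptions that are not in the question: the noise is taken to be i.i.d. Gaussian (white noise in general need not be), $M_n$ is taken as the maximum over $t = 1, \dots, n$, and the function name `simulate_exceedance_prob` and all parameter values are purely illustrative.

```python
import math
import random
from statistics import NormalDist


def simulate_exceedance_prob(a, tau, n, reps, sigma=1.0, seed=0):
    """Estimate P(n * (1 - F(M_n)) >= tau) for X_t = Z_t + a * Z_{t-1}.

    Assumption (not from the question): Z_t is i.i.d. N(0, sigma^2),
    so the marginal law of X_t is N(0, sigma^2 * (1 + a^2)).
    """
    rng = random.Random(seed)
    marginal = NormalDist(0.0, sigma * math.sqrt(1.0 + a * a))
    hits = 0
    for _ in range(reps):
        z_prev = rng.gauss(0.0, sigma)
        running_max = -math.inf
        for _ in range(n):
            z = rng.gauss(0.0, sigma)
            running_max = max(running_max, z + a * z_prev)
            z_prev = z
        # n * (1 - F(M_n)) >= tau  is the event  M_n <= u_n(tau)
        if n * (1.0 - marginal.cdf(running_max)) >= tau:
            hits += 1
    return hits / reps


estimate = simulate_exceedance_prob(a=0.5, tau=1.0, n=200, reps=1000)
print(estimate, math.exp(-1.0))  # compare the estimate with e^{-tau}
```

For finite $n$ the estimate is only expected to be roughly comparable to $e^{-\tau}$; convergence for Gaussian sequences is known to be slow, so a visible gap at small $n$ does not by itself contradict the limit.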

Math Genius: Convolution of Mixed Variables over Unique Domains

Question: I have two independent random variables (say $X$ and $Y$) such that $X \sim U[0,1]$ and $Y \sim \text{Exp}(1)$, and I want to find the PDF of $Z=X+Y$.

My attempt:
I know $f_X(x)=1$ for $x \in [0,1]$, and $f_Y(y)=e^{-y}$ for $y \in [0, \infty)$.

I also know that, due to their independence, $f_Z(z)=(f_X * f_Y)(z)$ where $(f_X * f_Y)(z)$ is the convolution of $f_X$ and $f_Y$.

Furthermore, $(f_X * f_Y)(z)=\int^{\infty}_{-\infty} f_X(z-y)f_Y(y)\, dy = \int^{\infty}_{-\infty} f_Y(z-x)f_X(x)\, dx$.

However, I am unsure of a few things:

  • Can I use a convolution approach even though $f_X$ and $f_Y$ are not defined for all real numbers?
  • If I can, how would I determine the bounds of the integral given $f_X$ and $f_Y$ are defined for different subsets of the real numbers ($x \in [0,1]$ and $y \in [0, \infty)$ respectively)?

Context: Ultimately, I need to compute $P(Z>z)$ for two different cases (when $z \in [0,1]$ and when $z>1$), so I planned on integrating $f_Z(z)$ to get the CDF for $Z$.

Any help would be greatly appreciated.

When you say $f_X(x)=1$ for $0<x<1$, what you really mean is $f_X(x)=1$ for $0<x<1$ and $f_X(x)=0$ for all other $x$. All density functions are defined on the entire real line, so there is no problem in using the convolution formula.

In this case $(f_X*f_Y)(z)=\int_{-\infty}^{\infty} f_X(z-y)f_Y(y)\, dy$. [This is the general formula for convolution.] Let $z >0$. Note that $f_Y(y)=0$ if $y <0$ and $f_X(z-y)=0$ if $z-y \notin (0,1)$, i.e., if $y \notin (z-1,z)$. Hence the integration is over all positive $y$ satisfying $z-1<y<z$. In order to carry out this integration you have to consider two cases: $z >1$ and $z <1$. In the first case the integration is from $z-1$ to $z$. In the second case it is from $0$ to $z$.
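
The case split above can be checked numerically. The sketch below carries out the integral of $e^{-y}$ over $(\max(0, z-1), z)$ in closed form and compares the resulting survival function with a crude midpoint-rule integration; the function names and the integration cutoff at $40$ are illustrative choices, not part of the original answer.

```python
import math


def f_Z(z):
    """Density of Z = X + Y for X ~ U[0,1], Y ~ Exp(1), via the case split."""
    if z <= 0:
        return 0.0
    lower = max(0.0, z - 1.0)  # y must be positive and lie in (z - 1, z)
    return math.exp(-lower) - math.exp(-z)  # integral of e^{-y} over (lower, z)


def survival_Z(z):
    """P(Z > z), obtained by integrating f_Z in closed form."""
    if z <= 0:
        return 1.0
    if z <= 1:
        return 2.0 - z - math.exp(-z)
    return (math.e - 1.0) * math.exp(-z)


def midpoint_integral(func, a, b, steps=100_000):
    """Plain midpoint rule, used only as an independent numerical check."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


# The upper limit 40 stands in for infinity; the neglected tail is negligible.
print(midpoint_integral(f_Z, 0.0, 40.0))                  # should be close to 1
print(survival_Z(0.5), midpoint_integral(f_Z, 0.5, 40.0))  # should agree closely
```

So $f_Z(z) = 1 - e^{-z}$ for $0 < z \le 1$ and $f_Z(z) = (e-1)e^{-z}$ for $z > 1$, which gives the two cases of $P(Z>z)$ asked for in the question.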

Math Genius: Variance of max of $m$ i.i.d. random variables

I’m trying to verify if my analysis is correct or not.

Suppose we have $m$ random variables $x_i$, $i \in m$. Each $x_i \sim \mathcal{N}(0,\sigma^2)$.

From the extreme value theorem one can state $Y= \max\limits_{i \in m} [\mathcal{P}(x_i \leq \epsilon)] = [G(\epsilon)]$ as $m\to\infty$, if the $x_i$ are i.i.d. and $G(\epsilon)$ is a standard Gumbel distribution.

My first question is whether we can state that: $$\text{Var}[Y]= \text{Var}\left[\max_{i \in m} [\mathcal{P}(x_i \leq \epsilon)] \right]= \text{Var}[ [G(\epsilon)]] = \frac{\pi^2}{6}$$

My second question is: if we have $n$ such $Y$, all of them independent with zero mean, can we state:
$$\text{Var}\left[\prod_{i}^n Y_i\right] = \left(\frac{\pi^2}{6}\right)^n$$


There is a final result for the second point at “Distribution of the maximum of a large number of normally distributed random variables”, but no complete step-by-step derivation.

The answers to questions 1 and 2 are both negative.

For question 1, since $Y_m = \max\limits_{i \in m} P(x_i \leqslant \epsilon) \xrightarrow{\mathrm{d}} G(\epsilon)$, then $\color{blue}{\lim\limits_{m \to \infty}} D(Y_m) = D(G(\epsilon))$, i.e. the equality holds only in the sense of a limiting process.

For question 2, it’s generally not true that
$$D\left( \prod_{m = 1}^n Y_m \right) = \prod_{m = 1}^n D(Y_m)$$
for i.i.d. $Y_1, \cdots, Y_n$, especially when $E(Y_1) \ne 0$ and $D(Y_1) > 0$, by the following proposition:

Proposition: If $X$ and $Y$ are independent random variables on the same probability space, then
$$D(XY) - D(X) D(Y) = D(X) (E(Y))^2 + D(Y) (E(X))^2.$$

Proof: Since $X$ and $Y$ are independent,
\begin{gather*}
D(XY) = E(X^2 Y^2) - (E(XY))^2 = E(X^2) E(Y^2) - (E(X) E(Y))^2,\\
D(X) D(Y) = \left( E(X^2) - (E(X))^2 \right) \left( E(Y^2) - (E(Y))^2 \right)\\
= E(X^2) E(Y^2) - E(X^2) (E(Y))^2 - E(Y^2) (E(X))^2 + (E(X))^2 (E(Y))^2,
\end{gather*}
and therefore
\begin{align*}
D(XY) - D(X) D(Y) &= E(X^2) (E(Y))^2 + E(Y^2) (E(X))^2 - 2 (E(X))^2 (E(Y))^2\\
&= \left( E(X^2) - (E(X))^2 \right) (E(Y))^2 + \left( E(Y^2) - (E(Y))^2 \right) (E(X))^2\\
&= D(X) (E(Y))^2 + D(Y) (E(X))^2. \tag*{$\square$}
\end{align*}

Now it can be proved by induction on $n$ with the above proposition that, when $E(Y_1) \ne 0$ and $D(Y_1) > 0$,
$$D\left( \prod_{m = 1}^n Y_m \right) > \prod_{m = 1}^n D(Y_m) > 0.$$
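
As a quick sanity check of the proposition, the identity can be verified exactly on small discrete distributions. The sketch below uses exact rational arithmetic; the two distributions `X` and `Y` are made-up examples, not taken from the question, and $D$ is written as `var`.

```python
from fractions import Fraction
from itertools import product


def mean(dist):
    """Expected value of a finite distribution given as {value: probability}."""
    return sum(p * v for v, p in dist.items())


def var(dist):
    """Variance D(.) of a finite distribution given as {value: probability}."""
    m = mean(dist)
    return sum(p * (v - m) ** 2 for v, p in dist.items())


def product_dist(dx, dy):
    """Distribution of XY when X ~ dx and Y ~ dy are independent."""
    out = {}
    for (x, px), (y, py) in product(dx.items(), dy.items()):
        out[x * y] = out.get(x * y, Fraction(0)) + px * py
    return out


# Hypothetical example distributions with nonzero means.
X = {Fraction(1): Fraction(1, 2), Fraction(3): Fraction(1, 2)}
Y = {Fraction(0): Fraction(1, 3), Fraction(2): Fraction(2, 3)}

lhs = var(product_dist(X, Y)) - var(X) * var(Y)
rhs = var(X) * mean(Y) ** 2 + var(Y) * mean(X) ** 2
print(lhs, rhs)  # both sides of the proposition, computed exactly
```

Note that when both means are zero the right-hand side vanishes, so in that special case the variance of the product does equal the product of the variances.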

Math Genius: Are random variables of distributions always independent?

Let $(\Omega_1,F_1,Q_1)$, $(\Omega_2,F_2,Q_2)$ be two probability spaces. Let $X\sim Q_1, Y\sim Q_2$ under $(\Omega,F,\mathbb{P})$.

If we now define $\Omega:= \Omega_1\times \Omega_2, F:=F_1\times F_2$ and $P:= Q_1\otimes Q_2$, then we can define $X,Y$ as the projections onto the first and second coordinate.

Then we have:
\begin{align*}
&P(X\in A , Y\in B ) \\
= {} & P(\{X\in A\}\cap \{Y\in B\}) \\
= {} & P((A\times \Omega_2)\cap(\Omega_1\times B ))\\
= {} & P(A\times B) \\
= {} & Q_1(A)\cdot Q_2(B) \\
= {} & (Q_1(A)\cdot Q_2(\Omega_2)) \cdot (Q_1(\Omega_1)\cdot Q_2(B)) \\
= {} & P(X\in A)\cdot P(Y\in B)
\end{align*}

However, this only shows that there’s a definition of $P$ so that $X,Y$ are independent.

How do I show that for all definitions of $P$ the random variables $X,Y$ are independent?

Definition of stochastic independence of random variables, as given by Georgii: (image in the original post)

Indeed, your answer is a counterexample. Here is a way to generate a large family of counterexamples.

Let $\Omega_1$ and $\Omega_2$ be finite. Then $Q_1$ is defined by a certain probability mass function $q_1$, so that $Q_1(E)=\sum_{\omega\in E}q_1(\omega)$, and similarly for $q_2$. Furthermore, letting $\mathbb P=Q_1\otimes Q_2$, $\mathbb P$ has the mass function $p$, where $p(\omega_1,\omega_2)=q_1(\omega_1)\cdot q_2(\omega_2)$.

Now, choose two distinct outcomes $x_1,x_2\in \Omega_1$ and two distinct outcomes $y_1,y_2\in \Omega_2$ for which $q_1(x_i)>0$ and $q_2(y_i)>0$, for $i\in \{1,2\}$, and choose $\epsilon>0$ which is sufficiently small. Then, define a modified probability measure on $\Omega_1\times \Omega_2$ by the following probability mass function, which is a slight modification of $p$:
$$\tilde p(\omega_1,\omega_2)=
\begin{cases}
p(\omega_1,\omega_2)+\epsilon & \omega_1=x_1,\omega_2=y_1\\
p(\omega_1,\omega_2)-\epsilon & \omega_1=x_2,\omega_2=y_1\\
p(\omega_1,\omega_2)-\epsilon & \omega_1=x_1,\omega_2=y_2\\
p(\omega_1,\omega_2)+\epsilon & \omega_1=x_2,\omega_2=y_2\\
p(\omega_1,\omega_2) & \text{otherwise}
\end{cases}$$

You can verify that $\tilde p$ defines a measure on $\Omega_1\times \Omega_2$ whose marginal distributions on $\Omega_1$ and $\Omega_2$ are equal to $Q_1$ and $Q_2$, respectively. However, $\tilde p$ is no longer the product measure, so the random variables $X$ and $Y$ are no longer independent.
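
The verification can be carried out mechanically on a small example. In the sketch below the mass functions `q1` and `q2`, the outcome labels and the value of `eps` are all made up for illustration; exact fractions are used so the comparisons are not affected by rounding.

```python
from fractions import Fraction


def perturbed_joint(q1, q2, x1, x2, y1, y2, eps):
    """Product mass function p with +/- eps added on a 2x2 block of outcomes."""
    p = {(a, b): q1[a] * q2[b] for a in q1 for b in q2}
    p[(x1, y1)] += eps
    p[(x2, y1)] -= eps
    p[(x1, y2)] -= eps
    p[(x2, y2)] += eps
    return p


def marginals(p):
    """Marginal mass functions of a joint mass function on a product set."""
    m1, m2 = {}, {}
    for (a, b), w in p.items():
        m1[a] = m1.get(a, Fraction(0)) + w
        m2[b] = m2.get(b, Fraction(0)) + w
    return m1, m2


# Hypothetical finite spaces and mass functions.
q1 = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
q2 = {"u": Fraction(1, 3), "v": Fraction(2, 3)}
p_tilde = perturbed_joint(q1, q2, "a", "b", "u", "v", Fraction(1, 20))

m1, m2 = marginals(p_tilde)
print(m1 == q1, m2 == q2)                      # marginals are unchanged
print(p_tilde[("a", "u")], q1["a"] * q2["u"])  # joint no longer factorizes
```

Here "sufficiently small" means that every perturbed entry stays nonnegative, which is why `eps` is chosen below the smallest affected product $q_1(x_i) q_2(y_j)$.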

Counterexample:

Let $(\Omega_1,F_1,Q_1) = (\Omega_2,F_2,Q_2)=(\Omega,F,\mathbb{P})$ and define $Y:=X$.

Then we have for events $A,B$ with $A\cap B =\emptyset$ and $P(A) \neq 0 \neq P(B)$:
$$P(X\in A , Y\in B ) = P(\{X\in A\}\cap \{Y\in B\}) = P(\emptyset) = 0 \neq P(X\in A)\cdot P(X\in B)$$

Therefore, we can construct two random variables that are totally dependent, if they both have the same distribution.

Math Genius: Does the below-quoted fact follow from the countable subadditivity property of probability measure $\mathbb{P}$?

Let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space, $a$ and $b$ be two rationals (i.e. $a, b \in \mathbb{Q}$) such that $a<b$, and $(X_n)$ be a sequence of random variables defined on this probability space. Set:
$$\Lambda_{a,b}=\{\limsup\limits_{n\rightarrow\infty}X_n\geq b;\ \liminf\limits_{n\rightarrow\infty}X_n \leq a\}$$

$$\Lambda=\bigcup\limits_{a<b} \Lambda_{a,b}$$

Therefore, I “read” $\Lambda$ as follows:

there exists at least one pair of rationals $a$, $b$ ($a<b$) such that $\limsup\limits_{n\rightarrow\infty}X_n\geq b;\ \liminf\limits_{n\rightarrow\infty}X_n \leq a$, that is

$$\Lambda=\{\limsup\limits_{n\rightarrow\infty}X_n> \liminf\limits_{n\rightarrow\infty}X_n\}$$

I am given that $\mathbb{P}(\Lambda_{a,b})=0$ $\forall a, b \in \mathbb{Q}$ such that $a<b$. And, at this point, I know that one can state that

“$\mathbb{P}(\Lambda)=0$, since all rational pairs are countable”

I interpret the above statement in the following way, but I am not sure whether it is correct or not.


  • $\mathbb{P}(\Lambda_{a,b})=0$ $\forall a,b \in \mathbb{Q}$ such that $a<b$;
  • $\bigcup\limits_{a<b}\Lambda_{a,b}$ is a countable union, since the rationals (and hence the pairs of rationals) are countable;

by the countable subadditivity property of the probability measure $\mathbb{P}$, it follows that:
$$\mathbb{P}(\Lambda)=\mathbb{P}\big(\bigcup\limits_{a<b} \Lambda_{a,b}\big)\leq \sum\limits_{a<b}\mathbb{P}\big(\Lambda_{a,b}\big)=0$$

where $\sum\limits_{a<b}\mathbb{P}\big(\Lambda_{a,b}\big)=0$ follows from the fact that I am given that $\mathbb{P}(\Lambda_{a,b})=0$ $\forall a, b \in \mathbb{Q}$ such that $a<b$.

Hence, since a probability lies by definition between $0$ and $1$, $\mathbb{P}(\Lambda)\leq 0$ “means” that:
$$\mathbb{P}(\Lambda)=0$$

Is my reasoning correct?

Math Genius: Why is that $\mathbb{E}[X|A]\mathbb{P}(A)=\int x\, \mathbb{I}_{A}\, dP$?

I’d like to prove that:

$$\mathbb{E}[X|A]\mathbb{P}(A)=\int x\, \mathbb{I}_{A}\, dP$$

This came to me because the “Law of total expectation” would be way easier to prove if I could argue that:

$$\mathbb{E}[X] = \sum_{i=1}^\infty \int x\, \mathbb{I}_{(A_i)}\, dP=\sum_{i=1}^\infty\mathbb{E}[X|A_i]\mathbb{P}(A_i)$$

with $A_i$ a partition of $\Omega$.

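Assuming that $A$ is a set, $Z=\mathbb{E}[X|A]$ is a constant and thus:

$$\mathbb{E}[X|A]\mathbb{P}(A) = Z\,\mathbb{E}[\mathbb{I}_A] = \mathbb{E}\big[\mathbb{I}_A\mathbb{E}[X|A]\big] = \mathbb{E}\big[\mathbb{E}[X\mathbb{I}_A|A]\big] = \mathbb{E}[X\mathbb{I}_A]$$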
where we used that $\mathbb{I}_A \in \sigma(A)$ (therefore $\mathbb{E}[X\mathbb{I}_A|A]=\mathbb{I}_A\mathbb{E}[X|A]$) and the last equality is the tower law.
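
On a finite sample space the identity, and the law of total expectation built from it, can be checked directly. The sketch below is only an illustration under assumptions that are not in the question: a fair die as the probability space, $X(\omega)=\omega^2$, and $\mathbb{E}[X|A]$ computed from the conditional probabilities $\mathbb{P}(\{\omega\}\mid A)$.

```python
from fractions import Fraction

# Hypothetical finite probability space: a fair six-sided die.
prob = {w: Fraction(1, 6) for w in range(1, 7)}


def X(w):
    """An arbitrary random variable on the die, chosen for illustration."""
    return w * w


def P(A):
    return sum(prob[w] for w in A)


def integral_on(A):
    """The integral of X * 1_A with respect to P, as a finite sum."""
    return sum(X(w) * prob[w] for w in A)


def cond_exp(A):
    """E[X | A] as the mean of X under the conditional measure P(. | A)."""
    return sum(X(w) * (prob[w] / P(A)) for w in A)


A = {2, 4, 6}
print(cond_exp(A) * P(A), integral_on(A))  # the two sides of the identity

partition = [{1, 2}, {3, 4, 5}, {6}]
total = sum(cond_exp(A_i) * P(A_i) for A_i in partition)
print(total, integral_on(prob.keys()))     # law of total expectation vs. E[X]
```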

Math Genius: Coins falling out, find the probability $P_k(n)$ and $\lim_{k \rightarrow \infty} P_k(n)$.

John Doe was a rich man. He had $k + 1$ piggy banks, $k$ coins in each. In the $i$-th piggy bank, $i-1$ coins are genuine and $k + 1 - i$ coins are counterfeit.

John chooses a piggy bank uniformly at random and performs this sequence of actions $n$ times:

  1. He shakes a piggy bank until a coin falls out (any coin can fall out with the same probability);
  2. Writes down information about whether the coin was a genuine or counterfeit;
  3. Throws the coin back into the piggy bank.

John is legitimately surprised, as all $n$ times the coin was counterfeit. What is the probability $P_{k}(n)$ that the next coin to fall out of the chosen piggy bank is also counterfeit?

I. What is the explicit formula for $P_{k}(n)$? Find the probability for $n = 2$ and $k = 5$, i.e. find $P_{5}(2)$.

II. Find $\lim_{k \rightarrow \infty} P_{k}(n)$.


$P(\text{fake coin} \mid n \text{ fake coins}) = \frac{P(\text{fake coin and } n \text{ fake coins})}{P(n \text{ fake coins})} = \frac{P(n + 1 \text{ fake})}{P(n \text{ fake})}$. Applying the total probability rule to each term in the ratio yields this result.
$$P_k(n) = \frac{\sum_{i = 0}^{k} (\frac{i}{k})^{n+1}}{\sum_{i = 0}^{k} (\frac{i}{k})^n}$$
How to proceed with the limit? $$\lim_{k \rightarrow \infty} \frac{\frac{1}{k}\sum_{i = 0}^{k} i^{n+1}}{\sum_{j = 0}^{k} j^{n}}$$

For every nonnegative integer $n$ the sum $\sum_{i=1}^{k}i^{n}$ can be written
as a polynomial of degree $n+1$ in $k$ in which the coefficient of
$k^{n+1}$ equals $\frac{1}{n+1}$.

See Faulhaber’s formula.

Applying that to your (correct) result for $P_k(n)$, the numerator behaves like $\frac{1}{k}\cdot\frac{k^{n+2}}{n+2}$ and the denominator like $\frac{k^{n+1}}{n+1}$ for large $k$, so we find: $$\lim_{k\to\infty}P_k(n)=\frac{n+1}{n+2}$$

Remarkably, the RHS is exactly the probability that would have come out for $P_k(n)$ if rule 3 (the coins are thrown back) were not followed.

For that see this question and its answers.
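
The formula for $P_k(n)$ is easy to evaluate exactly, which answers part I for $k=5$, $n=2$ and gives a numerical feel for the limit in part II. This is a small sketch using exact fractions; the function name `p_next_counterfeit` is an illustrative choice.

```python
from fractions import Fraction


def p_next_counterfeit(k, n):
    """P_k(n): ratio of the sums of (i/k)^(n+1) and (i/k)^n over i = 0..k."""
    numerator = sum(Fraction(i, k) ** (n + 1) for i in range(k + 1))
    denominator = sum(Fraction(i, k) ** n for i in range(k + 1))
    return numerator / denominator


print(p_next_counterfeit(5, 2))  # 9/11

# For fixed n the values approach (n + 1) / (n + 2) as k grows.
for k in (10, 100, 1000):
    print(k, float(p_next_counterfeit(k, 2)), 3 / 4)
```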

Math Genius: Let $X$ be a continuous random variable and $f(x)$ be a continuous function. Conditions for $P(f(X) = 0) = 0$?

Let $X$ be a continuous real-valued random variable (for simplicity, we can assume $X\sim N(0,1)$), so that $P(X=a) = 0$ for every $a \in \mathbb{R}$.
Let $f(x)$ be a nonconstant continuous (analytic or Lipschitz if needed) function such that
$\mu = \mathbb{E}[f(X)] > 0$ and $0 < \text{Var}[f(X)] < \infty$.

I would like to know the conditions under which $P(f(X) = 0) = 0$.
Or at least, I would like to estimate how small it is, e.g. $P(f(X) \ne 0) > 1-\gamma$.

Obviously, if we already know $A=\{x \mid f(x) = 0\}$, then $P(X \in A)$ is the quantity of interest.
My attempt is as follows:
Since $\mu > 0$, it follows from one of the concentration inequalities that
$$P(f(X) > 0) \ge 1 - \frac{\text{Var}[f(X)]}{\text{Var}[f(X)] + \mu^2} = \frac{(\mathbb{E}[f(X)])^2}{\mathbb{E}[f(X)^2]},$$

which does not tell much…

I thought this would be easy to answer but it is not straightforward at all.

Any suggestions/comments/answers will be very appreciated. Thanks!

Denote by $g$ the density of $X$ and by $Z_f = \lbrace x\colon\, f(x)=0\rbrace$ the zero set of $f$. We have
$$P(f(X)=0) = P(X \in Z_f) = \int_{Z_f} g(x)\, \mathrm{d} x.$$
Therefore, $P(f(X) =0) = 0$ is equivalent to:
$$\text{for a.e. } x \in Z_f,\ g(x) =0.$$

Math Genius: How to tell whether a vector of random variables has a single variable as input, and when it has a vector of variables as input?

Let $X_1,\dots,X_n$ be random variables.

Then the formula
$$\mathbb{P}\left(\pmatrix{X_1\\ \dots\\ X_n} = A\right)$$

can have two meanings, depending on how $\mathbb P$ is defined:

It can mean either $\mathbb{P}\left(\left\{w\in\Omega\mid \pmatrix{X_1(w)\\ \dots\\ X_n(w)} = z\right\}\right)$, if we define our probability space as $(\Omega,\mathcal{F},\mathbb{P})$

or $\mathbb{P}\left(\left\{\pmatrix{w_1\\ \dots\\ w_n} \in \times_{i=1}^n \Omega_i \mid \pmatrix{X_1\\ \dots\\ X_n}\pmatrix{w_1\\ \dots\\ w_n} = z\right\}\right)$,
if we define our probability space as $(\Omega^n,\mathcal{F},\mathbb{P})$.

However, defining the probability space is often skipped, and the domain of the $X_i$ isn’t always denoted either.

If I happen to stumble upon such a case where I can’t somehow deduce which is correct, is there a difference between the two cases, and which should I assume it is, then?

In general: when you deal with a function $f:\mathbb R \to \mathbb R^n$, you can write it as $f=(f_1,\dots,f_n)$, where $f_k:\mathbb R \to \mathbb R$ is a real-valued function of a single variable for $k \in \{1,\dots,n\}$. In that case $f(t) = (f_1(t),\dots,f_n(t))$, and you probably won’t have any doubt that this is the way to look at it. The same goes if the function has a more abstract domain. Let $(\Omega,\mathcal F,\mathbb P)$ be a probability space, and let $X:\Omega \to \mathbb R^n$ be a random variable. It can be shown that, writing $X=(X_1,\dots,X_n)$, each $X_k : \Omega \to \mathbb R$ is a random variable for $k \in \{1,\dots,n\}$. In that case $X(\omega) = (X_1(\omega),\dots,X_n(\omega))$ and the proper way to look at it is $\mathbb P(X \in A) = \mathbb P (\{\omega \in \Omega : (X_1(\omega),\dots,X_n(\omega)) \in A \})$, since $\mathbb P$ is a measure on $(\Omega,\mathcal F)$. So when you just have a random vector, you shouldn’t think that every coordinate takes a different argument. Every coordinate is a function on the whole space $\Omega$, no matter what $\Omega$ looks like.

That was the case where we had an $\mathbb R^n$-valued random variable and turned it into a vector of $\mathbb R$-valued random variables. However, there are other possibilities. Instead of starting with a vector and looking at its coordinates, we can start with several random variables and form a new vector. This isn’t as simple as it might look. If you have probability spaces $(\Omega_1,\mathcal F_1,\mathbb P_1),\dots,(\Omega_n,\mathcal F_n, \mathbb P_n)$ and real-valued random variables defined on them, $X_k : \Omega_k \to \mathbb R$, you can define a new set $\Omega = \Omega_1 \times \dots \times \Omega_n$ (so every $\omega \in \Omega$ is of the form $(\omega_1,\dots,\omega_n)$ with $\omega_k \in \Omega_k$) and a function (just a function for now, since we haven’t specified sigma-fields) $X:\Omega \to \mathbb R^n$ given by $X(\omega) = (X_1(\omega_1),\dots,X_n(\omega_n))$. That is another proper way to look at it when considering $X$ just as a function, but it is more subtle when considering it as a random, measurable function. As for measurability, you can always define a new sigma-field as $\mathcal F = \mathcal F_1 \otimes \dots \otimes \mathcal F_n := \sigma( A_1 \times \dots \times A_n : A_k \in \mathcal F_k , k \in \{1,\dots,n\})$ (loosely speaking, you take every “rectangle” $A_1 \times \dots \times A_n$ with $A_k \in \mathcal F_k$ and close the collection under the operations needed to form a $\sigma$-field). The problem of measuring (that is, calculating probability) isn’t as easy; it requires the concept of a product measure (which you can google).
Again, loosely speaking, it defines a measure $\mathbb P$ on our new measurable space $(\Omega,\mathcal F)$ by $\mathbb P(A_1 \times \dots \times A_n) = \mathbb P_1(A_1) \cdots \mathbb P_n(A_n)$ for every rectangle $A_1 \times \dots \times A_n$. In the case of your random variable this means that $\mathbb P(X \in B_1 \times \dots \times B_n) = \mathbb P(\{\omega \in \Omega : X(\omega) \in B_1 \times \dots \times B_n\}) = \mathbb P_1(\{\omega_1 \in \Omega_1 : X_1(\omega_1) \in B_1 \}) \cdots \mathbb P_n(\{\omega_n \in \Omega_n : X_n(\omega_n) \in B_n \}) = \mathbb P_1(X_1 \in B_1)\cdots\mathbb P_n(X_n \in B_n)$ (note that we’re calculating the probability for each $X_k$ on a different space and with respect to a different probability measure, since the variables $X_1,\dots,X_n$ aren’t defined on $\Omega$ but on $\Omega_1,\dots,\Omega_n$ respectively). It can be shown (again, google product measure) that when the spaces are $\sigma$-finite the product measure is uniquely determined.

It was a long story, but the important point is that when $X$ is defined on a product space like this, the coordinates are INDEPENDENT! (Note that the measure $\mathbb P$ is defined so that $\mathbb P_k(X_k \in B_k) = \mathbb P( X_k \in B_k, X_1,\dots,X_{k-1},X_{k+1},\dots,X_n \in \mathbb R )$.) That means that treating a random vector as if its coordinates took different arguments makes them independent, and we lose a lot of possible random vectors whose coordinates need not be independent.

So the lesson is: in general, when dealing with a random vector $X$ on $(\Omega,\mathcal F,\mathbb P)$ you can write it as $X=(X_1,\dots,X_n)$, where every $X_k : \Omega \to \mathbb R$ is a random variable and $X(\omega) = (X_1(\omega),\dots,X_n(\omega))$, since it is simply defined this way. However, when you know that $X_1,\dots,X_n$ are independent, you can REDEFINE it on the product space (it won’t be exactly the same random variable, just a similar one with the same distribution), taking every $\Omega_k = \Omega$, so that $X:\Omega^n \to \mathbb R^n$ is given by $X((\omega_1,\dots,\omega_n)) = (X_1(\omega_1),\dots,X_n(\omega_n))$. That is sometimes a useful approach when one is interested only in the distribution of $X$.
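
The difference between the two readings can be seen on a toy example. In the sketch below everything is hypothetical: $\Omega$ is a fair die, $X_1$ indicates an odd outcome and $X_2$ indicates an outcome of at least $4$. Evaluating both coordinates at the same $\omega$ keeps their dependence, while evaluating them at separate coordinates of $\Omega^2$ under the product measure forces the joint law to factorize.

```python
from fractions import Fraction
from itertools import product

# Hypothetical base space: a fair six-sided die.
omega = range(1, 7)
p = {w: Fraction(1, 6) for w in omega}


def X1(w):
    return w % 2           # 1 if the outcome is odd


def X2(w):
    return int(w >= 4)     # 1 if the outcome is at least 4


def joint_same_omega(a, b):
    """P(X1 = a, X2 = b) with both coordinates evaluated at the same omega."""
    return sum(p[w] for w in omega if X1(w) == a and X2(w) == b)


def joint_product_space(a, b):
    """Same event on Omega x Omega with the product measure p(w1) * p(w2)."""
    return sum(
        p[w1] * p[w2]
        for w1, w2 in product(omega, omega)
        if X1(w1) == a and X2(w2) == b
    )


print(joint_same_omega(1, 1))     # only omega = 5 qualifies
print(joint_product_space(1, 1))  # equals P(X1 = 1) * P(X2 = 1)
```

Both constructions give the same marginal distributions for $X_1$ and $X_2$; only the joint distribution differs, which is exactly what is lost by silently assuming the product-space reading.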

Q.1. Is there a difference between the two cases?

A.1 Yes, I discern a difference between the two cases.

Q.2 If I can’t deduce which set of assumptions to use, which set of assumptions should I use?

A.2 Based on the law of parsimony, unless indicated otherwise, you should first assume the case that requires fewer words. Notice that $\Omega$ is more parsimonious than $\Omega^n$. Therefore, first go with $\Omega$.

Q.3 Can you give me a more precise answer?

A.3 I think the way to treat this would be to consider something like the partition function. You can use this process.

By $m$, I denote a least upper bound of $n$. For $n \leq m$, the entropy will be
$$\sigma^{(n)}_{X} = \sum_{i\in n} p_i \log(p_i)$$

So you obtain a sequence
$$\left\{\sigma^{(1)}_{X}, \sigma^{(2)}_{X}, \ldots, \sigma^{(m)}_{X}\right\}$$

Let’s get a sense of things by looking at the entropies’ expected values under equipartition. The entropies’ expected values under equipartition are
$$\left\{\left< \sigma^{(1)}_{X}\right>, \left<\sigma^{(2)}_{X}\right>, \ldots, \left<\sigma^{(m)}_{X}\right>\right\}= \left\{-\log(1), -\log(2), \ldots, -\log(m)\right\}.$$
Observe that the sequence diverges.

One may ask which one of these gives the best and most correct indicator. Notice that for $q<p$ the expected values satisfy $\left<\sigma^{(q)}\right> > \left<\sigma^{(p)}\right>$. Also ponder the following: in getting $\left<\sigma^{(p)}\right>$, I had to average over all possibilities, including all possibilities in which there were only $q$ non-zero random variables.

This suggests that you should look at your expected results for all $n\in \mathbb{Z}^+$, and also in the limit as $n$ goes to infinity. My guess is that though the partition function diverges with increasing $n$, the thing that you would actually care about and be able to observe would converge.
