## Math Genius: Distribution of maximum of a moving average sequence

Let $$Z_t \sim \mathrm{WN}(0,\sigma^2)$$ and $$a \in \mathbb{R}$$. Then $$X_t = Z_t + aZ_{t-1}, \qquad t = 0, \pm 1, \pm 2,\ldots$$ defines an $$MA(1)$$ sequence. I need to prove that this sequence has the extremal behaviour of an independent sequence (extremal index $$1$$). Therefore I need to prove that, as $$n \rightarrow \infty$$, for each $$\tau > 0$$,

$$P\big(n(1-F(M_n)) \geq \tau\big) \rightarrow e^{-\tau}$$

with $$M_n = \max_{1 \leq t \leq n} X_t$$ and $$F$$ the continuous cdf of $$X_t$$. I am trying to make a start by doing something with $$M_n = \max_t (Z_t + aZ_{t-1})$$, to get to know its distribution…
I already have the proof for the case where $$X_t$$ is an i.i.d. sequence.
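As a sanity check on the target limit (not a proof), one can simulate the claim for a concrete choice of white noise. The sketch below assumes Gaussian $$Z_t \sim N(0,1)$$ and $$a = 0.5$$ (my own choices), so that $$F$$ is the $$N(0, 1+a^2)$$ cdf, and estimates $$P(n(1-F(M_n)) \geq \tau)$$ by Monte Carlo:

```python
import math
import random

random.seed(0)
a, n, reps = 0.5, 2000, 1000
sd = math.sqrt(1 + a * a)  # X_t = Z_t + a*Z_{t-1} ~ N(0, 1 + a^2) for Z_t ~ N(0, 1)

def F(x):
    """Exact cdf of X_t under the Gaussian assumption."""
    return 0.5 * (1 + math.erf(x / (sd * math.sqrt(2))))

tau = 1.0
hits = 0
for _ in range(reps):
    z = [random.gauss(0.0, 1.0) for _ in range(n + 1)]
    m = max(z[t] + a * z[t - 1] for t in range(1, n + 1))  # M_n over X_1..X_n
    hits += n * (1 - F(m)) >= tau

print(hits / reps, math.exp(-tau))  # the two numbers should be close
```

The empirical frequency should sit near $$e^{-\tau} \approx 0.368$$, which is consistent with the Gaussian $$MA(1)$$ having extremal index $$1$$.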

## Math Genius: Convolution of Mixed Variables over Unique Domains

Question: I have two independent random variables (say $$X$$ and $$Y$$) such that $$X \sim U[0,1]$$ and $$Y \sim \mathrm{Exp}(1)$$, and I want to find the PDF of $$Z=X+Y$$.

My attempt:
I know $$f_X(x)=1$$ for $$x \in [0,1]$$, and $$f_Y(y)=e^{-y}$$ for $$y \in [0, \infty)$$.

I also know that, due to their independence, $$f_Z(z)=(f_X * f_Y)(z)$$ where $$(f_X * f_Y)(z)$$ is the convolution of $$f_X$$ and $$f_Y$$.

Furthermore, $$(f_X * f_Y)(z)=\int_{-\infty}^{\infty} f_X(z-y)f_Y(y)\, dy = \int_{-\infty}^{\infty} f_Y(z-x)f_X(x)\, dx$$.

However, I am unsure of a few things:

• Can I use a convolution approach even though $$f_X$$ and $$f_Y$$ are not defined for all real numbers?
• If I can, how would I determine the bounds of the integral, given that $$f_X$$ and $$f_Y$$ are defined on different subsets of the real numbers ($$x \in [0,1]$$ and $$y \in [0, \infty)$$ respectively)?

Context: Ultimately, I need to compute $$P(Z>z)$$ for two different cases (when $$z\in[0,1]$$ and when $$z>1$$), so I planned on integrating $$f_Z(z)$$ to get the CDF for $$Z$$.

Any help would be greatly appreciated.

When you say $$f_X(x)=1$$ for $$0 < x < 1$$, what you really mean is $$f_X(x)=1$$ for $$0 < x < 1$$ and $$f_X(x)=0$$ for all other $$x$$. All density functions are defined on the entire real line. So there is no problem in using the convolution formula.

In this case $$(f_X*f_Y)(z)=\int_{-\infty}^{\infty} f_X(z-y)f_Y(y)\, dy$$. [This is the general formula for convolution.] Let $$z > 0$$. Note that $$f_Y(y)=0$$ if $$y < 0$$ and $$f_X(z-y)=0$$ if $$z-y \notin (0,1)$$, i.e., if $$y \notin (z-1,z)$$. Hence the integration is over all positive $$y$$ satisfying $$z-1 < y < z$$. In order to carry out this integration you have to consider two cases: $$z > 1$$ and $$z < 1$$. In the first case the integration is from $$z-1$$ to $$z$$. In the second case it is from $$0$$ to $$z$$.
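Carrying out the two integrals gives $$f_Z(z) = 1-e^{-z}$$ for $$0 < z \leq 1$$ and $$f_Z(z) = e^{-z}(e-1)$$ for $$z > 1$$. A quick sketch (the function names are my own) encoding this piecewise density, the CDF obtained from it, and a Monte Carlo cross-check:

```python
import math
import random

def f_Z(z):
    """Density of Z = X + Y from the two integration cases above."""
    if z <= 0:
        return 0.0
    if z <= 1:
        return 1 - math.exp(-z)          # integral of e^{-y} over (0, z)
    return math.exp(-z) * (math.e - 1)   # integral of e^{-y} over (z-1, z)

def cdf_Z(z):
    """CDF obtained by integrating f_Z; P(Z > z) = 1 - cdf_Z(z)."""
    if z <= 0:
        return 0.0
    if z <= 1:
        return z - 1 + math.exp(-z)
    return 1 - (math.e - 1) * math.exp(-z)

# Monte Carlo cross-check against direct simulation of X + Y.
random.seed(0)
samples = [random.random() + random.expovariate(1.0) for _ in range(200_000)]
for z in (0.5, 2.0):
    emp = sum(s <= z for s in samples) / len(samples)
    print(f"P(Z <= {z}): empirical {emp:.4f}, analytic {cdf_Z(z):.4f}")
```

The two branches of `cdf_Z` agree at $$z=1$$ (both give $$e^{-1}$$), so the CDF is continuous as it must be.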

## Math Genius: Variance of max of \$m\$ i.i.d. random variables

I’m trying to verify if my analysis is correct or not.

Suppose we have $$m$$ random variables $$x_i$$, $$i \in \{1,\dots,m\}$$. Each $$x_i \sim \mathcal{N}(0,\sigma^2)$$.

From the extreme value theorem one can state $$Y= \max\limits_{i \leq m} [\mathcal{P}(x_i \leq \epsilon)] = [G(\epsilon)]$$ as $$m\to\infty$$, if the $$x_i$$ are i.i.d. and $$G(\epsilon)$$ is the standard Gumbel distribution.

My first question is: can we state that $$\text{Var}[Y]= \text{Var}\left[\max_{i \leq m} [\mathcal{P}(x_i \leq \epsilon)] \right]= \text{Var}[G(\epsilon)] = \frac{\pi^2}{6}$$

My second question is: if we have $$n$$ such $$Y$$, all of them independent with zero mean, can we state:
$$\text{Var}\left[\prod_{i=1}^n Y_i\right] = \left(\frac{\pi^2}{6}\right)^n$$

Thanks.

Update:
There’s a final result for the second point at “Distribution of the maximum of a large number of normally distributed random variables”, but no complete step-by-step derivation.

$$\def\dto{\stackrel{\mathrm{d}}{\to}}\def\peq{\mathrel{\phantom{=}}{}}$$The answers to questions 1 and 2 are both negative.

For question 1: since $$Y_m = \max\limits_{i \leq m} \mathcal{P}(x_i \leqslant \epsilon) \dto G(\epsilon)$$, we only have $$\color{blue}{\lim\limits_{m \to \infty}} D(Y_m) = D(G(\epsilon))$$, i.e. the equality holds in the sense of a limiting process.

For question 2, it’s generally not true that$$D\left( \prod_{m = 1}^n Y_m \right) = \prod_{m = 1}^n D(Y_m)$$
for i.i.d. $$Y_1, \cdots, Y_n$$, especially when $$E(Y_1) \neq 0$$ and $$D(Y_1) > 0$$, by the following proposition:

Proposition: If $$X$$ and $$Y$$ are independent random variables on the same probability space, then$$D(XY) - D(X) D(Y) = D(X) (E(Y))^2 + D(Y) (E(X))^2.$$

Proof: Since $$X$$ and $$Y$$ are independent, then$$\begin{gather*} D(XY) = E(X^2 Y^2) - (E(XY))^2 = E(X^2) E(Y^2) - (E(X) E(Y))^2,\\ D(X) D(Y) = \left( E(X^2) - (E(X))^2 \right) \left( E(Y^2) - (E(Y))^2 \right)\\ = E(X^2) E(Y^2) - E(X^2) (E(Y))^2 - E(Y^2) (E(X))^2 + (E(X))^2 (E(Y))^2, \end{gather*}$$
and$$\begin{align*} &\peq D(XY) - D(X) D(Y) = E(X^2) (E(Y))^2 + E(Y^2) (E(X))^2 - 2 (E(X))^2 (E(Y))^2\\ &= \left( E(X^2) - (E(X))^2 \right) (E(Y))^2 + \left( E(Y^2) - (E(Y))^2 \right) (E(X))^2\\ &= D(X) (E(Y))^2 + D(Y) (E(X))^2. \tag*{$\square$} \end{align*}$$

Now it can be proved by induction on $$n$$, with the above proposition, that$$D\left( \prod_{m = 1}^n Y_m \right) > \prod_{m = 1}^n D(Y_m) > 0.$$
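The proposition is easy to check numerically. A minimal sketch with hypothetical discrete distributions of my own choosing (not from the post): $$X$$ uniform on $$\{0,2\}$$ and $$Y$$ uniform on $$\{1,3\}$$, with both sides computed exactly:

```python
from itertools import product

def moments(pmf):
    """Mean and variance of a finitely supported pmf {value: probability}."""
    mean = sum(v * p for v, p in pmf.items())
    var = sum((v - mean) ** 2 * p for v, p in pmf.items())
    return mean, var

# Hypothetical example: X uniform on {0, 2}, Y uniform on {1, 3}.
pX = {0: 0.5, 2: 0.5}
pY = {1: 0.5, 3: 0.5}

# pmf of XY under independence.
pXY = {}
for (x, px), (y, py) in product(pX.items(), pY.items()):
    pXY[x * y] = pXY.get(x * y, 0.0) + px * py

mx, vx = moments(pX)
my, vy = moments(pY)
_, vxy = moments(pXY)

lhs = vxy - vx * vy                 # D(XY) - D(X)D(Y)
rhs = vx * my ** 2 + vy * mx ** 2   # D(X)(EY)^2 + D(Y)(EX)^2
print(lhs, rhs)  # both 5.0
```

Here $$D(XY) = 6 > D(X)D(Y) = 1$$, illustrating the strict inequality used in the induction.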

## Math Genius: Are random variables of distributions always independent?

$$\newcommand{\P}{\mathbb{P}}$$
Let $$(\Omega_1,F_1,Q_1)$$, $$(\Omega_2,F_2,Q_2)$$ be two probability spaces. Let $$X\sim Q_1$$, $$Y\sim Q_2$$ under $$(\Omega,F,\mathbb{P})$$.

If we now define $$\Omega:= \Omega_1\times \Omega_2$$, $$F:=F_1\otimes F_2$$ and $$P:= Q_1\otimes Q_2$$, then we can define $$X,Y$$ as the projections onto the first and second coordinate.

Then we have:
\begin{align} &P(X\in A,\ Y\in B) \\ ={} & P(\{X\in A\}\cap \{Y\in B\}) \\ ={} & P((A\times \Omega_2)\cap(\Omega_1\times B))\\ ={} & P(A\times B) \\ ={} & Q_1(A)\cdot Q_2(B) \\ ={} & (Q_1(A)\cdot Q_2(\Omega_2)) \cdot (Q_1(\Omega_1)\cdot Q_2(B)) \\ ={} & P(X\in A)\cdot P(Y\in B) \end{align}

However, this only shows that there’s a definition of $$P$$ so that $$X,Y$$ are independent.

How do I show that for all definitions of $$P$$ the random variables $$X,Y$$ are independent?

(Definition of stochastic independence of random variables as in Georgii.)

Indeed, your answer is a counterexample. Here is a way to generate a large family of counterexamples.

Let $$\Omega_1$$ and $$\Omega_2$$ be finite. Then $$Q_1$$ is defined by a certain probability mass function $$q_1$$, so that $$Q_1(E)=\sum_{\omega\in E}q_1(\omega)$$, and similarly for $$q_2$$. Furthermore, letting $$\def\P{\mathbb P}P=Q_1\otimes Q_2$$, then $$P$$ has the mass function $$p$$, where $$p(\omega_1,\omega_2)=q_1(\omega_1)\cdot q_2(\omega_2)$$.

Now, choose two particular outcomes $$x_1,x_2\in \Omega_1$$ and $$y_1,y_2\in \Omega_2$$ for which $$q_1(x_i)>0$$ and $$q_2(y_i)>0$$ for $$i\in \{1,2\}$$, and choose $$\epsilon>0$$ sufficiently small. Then, define a modified probability measure on $$\Omega_1\times \Omega_2$$ by the following probability mass function, which is a slight modification of $$p$$:
$$\tilde p(\omega_1,\omega_2)= \begin{cases} p(\omega_1,\omega_2)+\epsilon & \omega_1=x_1,\omega_2=y_1\\ p(\omega_1,\omega_2)-\epsilon & \omega_1=x_2,\omega_2=y_1\\ p(\omega_1,\omega_2)-\epsilon & \omega_1=x_1,\omega_2=y_2\\ p(\omega_1,\omega_2)+\epsilon & \omega_1=x_2,\omega_2=y_2\\ p(\omega_1,\omega_2) & \text{otherwise} \end{cases}$$
You can verify that $$\tilde p$$ defines a measure on $$\Omega_1\times \Omega_2$$ whose marginal distributions on $$\Omega_1$$ and $$\Omega_2$$ are equal to $$Q_1$$ and $$Q_2$$, respectively. However, $$\tilde p$$ is no longer the product measure, so the random variables $$X$$ and $$Y$$ are no longer independent.
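A minimal numerical sketch of this construction, using a hypothetical example of my own with two fair coins and $$\epsilon = 0.1$$: the perturbed mass function keeps both marginals but breaks the product structure.

```python
from itertools import product

# Hypothetical finite example: two fair coins, eps = 0.1.
q1 = {"x1": 0.5, "x2": 0.5}
q2 = {"y1": 0.5, "y2": 0.5}
eps = 0.1

# Product pmf p, then the perturbed pmf p_tilde from the answer.
p = {(w1, w2): q1[w1] * q2[w2] for w1, w2 in product(q1, q2)}
sign = {("x1", "y1"): +1, ("x2", "y1"): -1, ("x1", "y2"): -1, ("x2", "y2"): +1}
p_tilde = {w: p[w] + sign[w] * eps for w in p}

# The marginals of p_tilde still match q1 and q2 ...
marg1 = {w1: sum(p_tilde[(w1, w2)] for w2 in q2) for w1 in q1}
marg2 = {w2: sum(p_tilde[(w1, w2)] for w1 in q1) for w2 in q2}
print(marg1, marg2)

# ... but p_tilde is no longer the product of its marginals:
print(p_tilde[("x1", "y1")], marg1["x1"] * marg2["y1"])  # 0.35 vs 0.25
```

The $$\pm\epsilon$$ signs cancel along every row and every column, which is exactly why the marginals are preserved.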

Counterexample:

$$\newcommand{\P}{\mathbb P}$$
Let $$(\Omega_1,F_1,Q_1) = (\Omega_2,F_2,Q_2)=(\Omega,F,\mathbb{P})$$ and define $$Y := X$$.

Then we have, for events $$A,B$$ with $$A\cap B =\emptyset$$ and $$P(A) \neq 0 \neq P(B)$$:
$$P(X\in A,\ Y\in B) = P(\{X\in A\}\cap \{Y\in B\}) = P(\emptyset) = 0 \neq P(X\in A)\cdot P(Y\in B)$$

Therefore, we can construct two random variables that are totally dependent even though each has the prescribed distribution.

## Math Genius: Does the below-quoted fact follow from the countable subadditivity property of probability measure \$mathbb{P}\$?

Let $$(\Omega,\mathcal{F},\mathbb{P})$$ be a probability space, $$a$$ and $$b$$ two rationals (i.e. $$a, b \in \mathbb{Q}$$) such that $$a < b$$, and let $$(X_n)$$ be a sequence of random variables defined on the above-defined measurable space. Set:
$$\Lambda_{a,b}=\left\{\limsup\limits_{n\rightarrow\infty}X_n\geq b;\ \liminf\limits_{n\rightarrow\infty}X_n \leq a\right\}$$
$$\Lambda=\bigcup\limits_{\substack{a,b\in\mathbb{Q}\\ a<b}}\Lambda_{a,b}$$

Therefore, I “read” $$Lambda$$ as follows:

there exists at least one pair of rationals $$a$$, $$b$$ ($$a < b$$) such that $$\limsup\limits_{n\rightarrow\infty}X_n\geq b;\ \liminf\limits_{n\rightarrow\infty}X_n \leq a$$, that is

$$\Lambda=\left\{\limsup\limits_{n\rightarrow\infty}X_n> \liminf\limits_{n\rightarrow\infty}X_n\right\}$$

I am given that $$\mathbb{P}(\Lambda_{a,b})=0$$ for all $$a, b \in \mathbb{Q}$$ with $$a < b$$. And, at this point, I know that one can state that

“$$\mathbb{P}(\Lambda)=0$$, since all rational pairs are countable”.

I interpret the above statement in the following way, but I am not sure whether it is correct or not.

Since:

• $$\mathbb{P}(\Lambda_{a,b})=0$$ for all $$a,b \in \mathbb{Q}$$ with $$a < b$$;
• $$\bigcup\limits_{a<b}\Lambda_{a,b}$$ is a countable union, since the rationals are
countable by definition;

by the countable subadditivity property of probability measure $$\mathbb{P}$$, it follows that:
$$\mathbb{P}(\Lambda)=\mathbb{P}\Big(\bigcup\limits_{a<b}\Lambda_{a,b}\Big)\leq \sum\limits_{a<b}\mathbb{P}(\Lambda_{a,b})=0$$
where $$\sum\limits_{a<b}\mathbb{P}(\Lambda_{a,b})=0$$ follows from the fact that I am given that $$\mathbb{P}(\Lambda_{a,b})=0$$ for all $$a, b \in \mathbb{Q}$$ with $$a < b$$.

Hence, since probability lies by definition between $$0$$ and $$1$$, $$\mathbb{P}(\Lambda)\leq 0$$ “means” that:
$$\mathbb{P}(\Lambda)=0$$

Is my reasoning correct?

## Math Genius: Why is that \$mathbb{E}[X|A]mathbb{P}(A)=int x mathbb{I}_{A} dP\$?

I’d like to prove that:

$$\mathbb{E}[X|A]\,\mathbb{P}(A)=\int X\, \mathbb{I}_{A}\, dP$$

This came to me because the “Law of total expectation” would be way easier to prove if I could argue that:

$$\mathbb{E}[X] = \sum_{i=1}^\infty \int X\, \mathbb{I}_{A_i}\, dP=\sum_{i=1}^\infty\mathbb{E}[X|A_i]\,\mathbb{P}(A_i)$$

with $$(A_i)$$ a partition of $$\Omega$$.

Assuming that $$A$$ is a set, $$Z=mathbb{E}[X|A]$$ is a constant and thus:

$$Z\,P(A)=Z\,\mathbb{E}[\mathbb{I}_A]=\mathbb{E}[Z\,\mathbb{I}_A]=\mathbb{E}[\mathbb{E}[X\mathbb{I}_A|A]]=\mathbb{E}[X\mathbb{I}_A]$$

where we used that $$\mathbb{I}_A \in \sigma(A)$$ (therefore $$\mathbb{E}[X\mathbb{I}_A|A]=\mathbb{I}_A\,\mathbb{E}[X|A]$$), and the last equality is the tower law.
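A Monte Carlo sanity check of $$\mathbb{E}[X|A]\,\mathbb{P}(A)=\mathbb{E}[X\mathbb{I}_A]$$, with the hypothetical choices (my own) $$X \sim N(0,1)$$ and $$A = \{X > 0\}$$, in which case $$\mathbb{E}[X\mathbb{I}_A] = 1/\sqrt{2\pi}$$:

```python
import math
import random

random.seed(1)
n = 100_000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]

in_A = [x > 0 for x in xs]          # the event A = {X > 0}
p_A = sum(in_A) / n                 # estimates P(A) = 1/2

# E[X | A]: average of X over the samples that land in A.
e_X_given_A = sum(x for x in xs if x > 0) / sum(in_A)

# E[X * 1_A]: average of X * indicator over all samples.
e_X_indicator = sum(x for x in xs if x > 0) / n

theory = 1 / math.sqrt(2 * math.pi)  # E[X 1_{X>0}] for X ~ N(0,1)
print(e_X_given_A * p_A, e_X_indicator, theory)
```

Note the first two estimates coincide by construction (conditional average times conditional frequency equals the overall average of $$X\mathbb{I}_A$$), which is the discrete shadow of the identity being proved.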

## Math Genius: Coins falling out, find the probability \$P_k(n)\$ and \$lim_{k rightarrow infty} P_k(n)\$.

John Doe was a rich man. He had $$k + 1$$ piggy banks, with $$k$$ coins in each. In the $$i$$-th piggy bank, $$i-1$$ coins are genuine and $$k + 1 - i$$ coins are counterfeit.

John equiprobably chooses the piggy bank and does this sequence of actions $$n$$ times:

1. He shakes a piggy bank until a coin falls out (any coin can fall out with the same probability);
2. Writes down information about whether the coin was a genuine or counterfeit;
3. Throws the coin back into the piggy bank.

John is legitimately surprised, as all $$n$$ times the coin was counterfeit. What is the probability $$P_{k}(n)$$ that the next coin to fall out of the chosen piggy bank is also counterfeit?

I. What is the explicit formula for $$P_{k}(n)$$? Find the probability for $$n = 2$$ and $$k = 5$$, find $$P_{5}(2)$$.

II. Find $$lim_{k rightarrow infty} P_{k}(n)$$.

Attempt

$$P(\text{fake coin} \mid n \text{ fake coins}) = \frac{P(\text{fake coin and } n \text{ fake coins})}{P(n \text{ fake coins})} = \frac{P(n + 1 \text{ fake})}{P(n \text{ fake})}$$. Applying the total probability rule to each term in the ratio yields this result:
$$P_k(n) = \frac{\sum_{i = 0}^{k} \left(\frac{i}{k}\right)^{n+1}}{\sum_{i = 0}^{k} \left(\frac{i}{k}\right)^n}$$
How to proceed with the limit? $$\lim_{k \rightarrow \infty} \frac{\frac{1}{k}\sum_{i = 0}^{k} i^{n+1}}{\sum_{j = 0}^{k} j^{n}}$$

For every nonnegative integer $$n$$ the sum $$\sum_{i=1}^{k}i^{n}$$ can be written as a polynomial of degree $$n+1$$ in $$k$$, where the coefficient of $$k^{n+1}$$ equals $$\frac{1}{n+1}$$.

Applying that to your (correct) result for $$P_k(n)$$ we find: $$\lim_{k\to\infty}P_k(n)=\frac{n+1}{n+2}$$

Remarkably, the RHS is exactly the probability that would have resulted for $$P_k(n)$$ if rule 3 (throwing the coins back) were not followed.

For that see this question and its answers.
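The formula and the limit can be checked with exact rational arithmetic; the sketch below (the helper name `P` is my own) reproduces $$P_5(2) = \frac{9}{11}$$ and approaches $$\frac{n+1}{n+2} = \frac{3}{4}$$ for $$n=2$$ as $$k$$ grows:

```python
from fractions import Fraction

def P(k, n):
    """Exact P_k(n) = sum_{i=0}^{k} (i/k)^{n+1} / sum_{i=0}^{k} (i/k)^{n}."""
    num = sum(Fraction(i, k) ** (n + 1) for i in range(k + 1))
    den = sum(Fraction(i, k) ** n for i in range(k + 1))
    return num / den

print(P(5, 2))            # 9/11
print(float(P(5000, 2)))  # close to (n+1)/(n+2) = 3/4
```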


## Math Genius: Let \$X\$ be a continuous random variable and \$f(x)\$ be a continuous function. Conditions for \$P(f(X) = 0) = 0\$?

Let $$X$$ be a continuous real-valued random variable (for simplicity, we can assume $$X\sim N(0,1)$$), so that $$P(X=a) = 0$$ for every $$a \in \mathbb{R}$$.
Let $$f(x)$$ be a nonconstant continuous (analytic or Lipschitz if needed) function such that
$$\mu = \mathbb{E}[f(X)] > 0$$ and $$0 < \text{Var}[f(X)] < \infty$$.

I would like to know the conditions under which $$P(f(X) = 0) = 0$$.
Or at least, I would like to estimate how small it is, e.g. $$P(f(X) \ne 0) > 1-\gamma$$.

Obviously, if we already know $$A=\{x \mid f(x) = 0\}$$, $$P(A)$$ is the quantity of our interest.
My attempt is as follows:
Since $$\mu > 0$$, it follows from one of the concentration inequalities that
$$P(f(X) > 0) \ge 1 - \frac{\text{Var}[f(X)]}{\text{Var}[f(X)] + \mu^2} = \frac{(\mathbb{E}[f(X)])^2}{\mathbb{E}[f(X)^2]},$$
which does not tell much…

I thought this would be easy to answer but it is not straightforward at all.

Denote by $$g$$ the density of $$X$$ and by $$Z_f = \{ x\colon f(x)=0 \}$$ the zero set of $$f$$. We have
$$P(f(X)=0) = P(X \in Z_f) = \int_{Z_f} g(x)\, \mathrm{d} x.$$
Therefore, $$P(f(X) =0) = 0$$ is equivalent to:
$$g(x) = 0 \quad \text{for a.e. } x \in Z_f.$$
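A small numerical illustration of both points, with the hypothetical choice (my own) $$f(x) = x^2$$ and $$X \sim N(0,1)$$: here $$Z_f = \{0\}$$ has Lebesgue measure zero, so $$P(f(X)=0)=0$$, while the concentration bound only guarantees $$P(f(X)>0) \geq \mu^2/\mathbb{E}[f(X)^2] = 1/3$$.

```python
import random

random.seed(2)
n = 100_000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]

# Hypothetical f(x) = x^2: zero set Z_f = {0} carries no mass under the N(0,1) density.
fs = [x * x for x in xs]

mu = sum(fs) / n                      # estimates E[f(X)] = 1
second = sum(v * v for v in fs) / n   # estimates E[f(X)^2] = E[X^4] = 3
bound = mu ** 2 / second              # lower bound for P(f(X) > 0), about 1/3
actual = sum(v > 0 for v in fs) / n   # P(f(X) > 0) is 1 here, since P(X = 0) = 0

print(bound, actual)
```

This makes concrete why the bound "does not tell much": the true probability can sit far above it.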

## Math Genius: How to tell whether a vector of random variables has a single variable as input, and when it has a vector of variables as input?

Let $$X_1,…,X_n$$ be random variables.

Then the formula
$$\mathbb{P}\left(\begin{pmatrix}X_1\\ \vdots \\X_n\end{pmatrix} = z\right)$$

can have two meanings, depending on how $$\mathbb P$$ is defined:

It can mean either $$\mathbb{P}\left(\left\{\omega\in\Omega \,\middle|\, \begin{pmatrix}X_1(\omega)\\ \vdots \\X_n(\omega)\end{pmatrix} = z\right\}\right)$$, if we define our probability space as $$(\Omega,\mathcal{F},\mathbb{P})$$,

or $$\mathbb{P}\left(\left\{\begin{pmatrix}\omega_1\\ \vdots \\\omega_n\end{pmatrix} \in \times_{i=1}^n \Omega_i \,\middle|\, \begin{pmatrix}X_1\\ \vdots \\X_n\end{pmatrix}\begin{pmatrix}\omega_1\\ \vdots \\\omega_n\end{pmatrix} = z\right\}\right)$$, if we define our probability space as $$(\Omega^n,\mathcal{F},\mathbb{P})$$.

However, defining the probability space is often skipped, and the domain of the $$X_i$$ isn’t always denoted either.

If I happen to stumble upon such a case where I can’t somehow deduce which is correct, is there a difference between the two cases, and which should I assume it is, then?

So, in general it is: when you deal with a function $$f:\mathbb R \to \mathbb R^n$$, you can write it as $$f=(f_1,\dots,f_n)$$, where $$f_k:\mathbb R \to \mathbb R$$ is a single-variable real-valued function for $$k \in \{1,\dots,n\}$$. In that case $$f(t) = (f_1(t),\dots,f_n(t))$$ and you probably won’t have any doubt that this is the way we should look at it. The same goes if our function has its domain in a more abstract space. That is, let $$(\Omega,\mathcal F,\mathbb P)$$ be a probability space, and let $$X:\Omega \to \mathbb R^n$$ be a random variable. It can be shown that, writing $$X=(X_1,\dots,X_n)$$, each $$X_k : \Omega \to \mathbb R$$ is a random variable for $$k \in \{1,\dots,n\}$$. In that case $$X(\omega) = (X_1(\omega),\dots,X_n(\omega))$$ and the proper way to look at it is $$\mathbb P(X \in A) = \mathbb P (\{\omega \in \Omega : (X_1(\omega),\dots,X_n(\omega)) \in A \})$$, since $$\mathbb P$$ is a measure on $$(\Omega,\mathcal F)$$. So when you have just a random vector, you shouldn’t think that every coordinate takes a different argument. Every coordinate should be a function from the whole space (that is, $$\Omega$$), no matter what $$\Omega$$ looks like.

Okay, that was the case where we had an $$\mathbb R^n$$-valued random variable and made it into a vector of $$\mathbb R$$-valued random variables. However, there are other possibilities. Instead of having a vector and looking at its coordinates, we can have a lot of random variables and form a new vector. However, it isn’t as simple as it might look. If you have probability spaces $$(\Omega_1,\mathcal F_1,\mathbb P_1),\dots,(\Omega_n,\mathcal F_n, \mathbb P_n)$$ and real-valued random variables defined on them, $$X_k : \Omega_k \to \mathbb R$$, you can define a new set, call it $$\Omega$$, as $$\Omega = \Omega_1 \times \dots \times \Omega_n$$ (so every $$\omega \in \Omega$$ is of the form $$(\omega_1,\dots,\omega_n)$$ where $$\omega_k \in \Omega_k$$), and (call it for now a function, since we didn’t specify sigma-fields) a function $$X:\Omega \to \mathbb R^n$$ given by $$X(\omega) = (X_1(\omega_1),\dots,X_n(\omega_n))$$. And that is obviously another proper way to look at it when considering $$X$$ just as a function, but it is more subtle when considering it as a random, measurable function! As for measurability, you can always define a new sigma-field as $$\mathcal F = \mathcal F_1 \otimes \dots \otimes \mathcal F_n := \sigma( A_1 \times \dots \times A_n : A_k \in \mathcal F_k , k \in \{1,\dots,n\})$$ (loosely speaking, you take any “rectangle” $$A_1 \times \dots \times A_n$$ with $$A_k \in \mathcal F_k$$ and close the collection under the operations needed to form a $$\sigma$$-field). Now, the problem of measuring (so, calculating probability) isn’t as easy; it requires the concept of a product measure (which you can google).
Again, loosely speaking, it defines a measure $$\mathbb P$$ on our new measurable space $$(\Omega,\mathcal F)$$ by $$\mathbb P(A_1 \times \dots \times A_n) = \mathbb P_1(A_1) \cdot \dots \cdot \mathbb P_n(A_n)$$ for every rectangle $$A_1 \times \dots \times A_n$$. (In the case of your random variable it would mean that $$\mathbb P(X \in B_1 \times \dots \times B_n) = \mathbb P(\{\omega \in \Omega : X(\omega) \in B_1 \times \dots \times B_n\}) = \mathbb P_1(\{\omega_1 \in \Omega_1 : X_1(\omega_1) \in B_1 \}) \cdot \dots \cdot \mathbb P_n(\{\omega_n \in \Omega_n : X_n(\omega_n) \in B_n \}) = \mathbb P_1(X_1 \in B_1)\cdots\mathbb P_n(X_n \in B_n)$$; note that we’re calculating the probability for every $$X_k$$ on a different space and with respect to a different probability measure, since the variables $$X_1,\dots,X_n$$ aren’t defined on $$\Omega$$ but on $$\Omega_1,\dots,\Omega_n$$ respectively.) It can be shown (again, google product measure) that when the spaces are $$\sigma$$-finite, this measure is uniquely determined.

It was a long story, but what is important is that when defining it on a product space, the coordinates are INDEPENDENT! (Note the measure $$\mathbb P$$ is defined so that $$\mathbb P_k(X_k \in B_k) = \mathbb P( X_k \in B_k;\ X_1,\dots,X_{k-1},X_{k+1},\dots,X_n \in \mathbb R )$$.) That means that by considering a random vector in such a way that the coordinates take different arguments, we make them independent, and we lose a lot of possible random vectors whose coordinates need not be independent.

So the lesson should be: in general, when dealing with a random vector $$X$$ on $$(\Omega,\mathcal F,\mathbb P)$$, you can write it as $$X=(X_1,\dots,X_n)$$ where every $$X_k : \Omega \to \mathbb R$$ is a random variable and $$X(\omega) = (X_1(\omega),\dots,X_n(\omega))$$, since it is just defined this way. However, when you know that $$X_1,\dots,X_n$$ are independent, you can REDEFINE it (it won’t be exactly the same random variable, just a similar one with the same distribution) on the product space, taking every $$\Omega_k = \Omega$$, so that $$X:\Omega^n \to \mathbb R^n$$ is given by $$X((\omega_1,\dots,\omega_n)) = (X_1(\omega_1),\dots,X_n(\omega_n))$$. That is sometimes a useful approach when one is interested only in things concerning the distribution of $$X$$.
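A toy sketch of the contrast (a hypothetical fair-coin example of my own): on a common space with $$Y := X$$ the joint mass sits on the diagonal, while the product-space construction forces the coordinates to be independent.

```python
from itertools import product

# A fair coin flip on Omega = {0, 1}; X is the identity random variable.
omega = [0, 1]
prob = {0: 0.5, 1: 0.5}

def X(w):
    return w

# Same space, Y := X: the joint law sits on the diagonal (total dependence).
p_same = {}
for w in omega:
    key = (X(w), X(w))
    p_same[key] = p_same.get(key, 0.0) + prob[w]

# Product space Omega x Omega with the product measure: coordinates independent.
p_prod = {}
for w1, w2 in product(omega, omega):
    key = (X(w1), X(w2))
    p_prod[key] = p_prod.get(key, 0.0) + prob[w1] * prob[w2]

print(p_same.get((0, 1), 0.0))  # 0.0: on the common space, X and Y never disagree
print(p_prod[(0, 1)])           # 0.25 = P(X = 0) * P(Y = 1)
```

Both joint laws have the same marginals, which is exactly the point: the marginals alone do not determine whether the coordinates are independent.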

Q.1. Is there a difference between the two cases?

A.1 Yes, I discern a difference between the two cases.

Q.2 If I can’t deduce which set of assumptions to use, which set of assumptions should I use?

A.2 Based on the law of parsimony, unless indicated otherwise, you should first assume the case that requires less words. Notice that $$Omega$$ is more parsimonious than $$Omega^n$$. Therefore, first go with $$Omega$$.

Q.3 Can you give me a more precise answer?

A.3 I think the way to treat this would be to consider something like the partition function. You can use this process.

By $$m$$, I denote a least upper bound of $$n$$. For $$n \leq m$$, the entropy will be
$$\sigma^{(n)}_{X} = \sum_{i \leq n} p_i \log(p_i)$$

So you obtain a sequence
$$\left\{\sigma^{(1)}_{X}, \sigma^{(2)}_{X}, \ldots, \sigma^{(m)}_{X}\right\}$$

Let’s get a sense of things by looking at the entropies’ expected values under equipartition. The entropies’ expected values under equipartition are
$$\left\{\left\langle \sigma^{(1)}_{X}\right\rangle, \left\langle \sigma^{(2)}_{X}\right\rangle, \ldots, \left\langle \sigma^{(m)}_{X}\right\rangle\right\} = \left\{-\log(1), -\log(2), \ldots, -\log(m)\right\}.$$
Observe that the sequence diverges.

One asks: which one of these gives the best and most correct indicator? Notice that for $$q < r$$, the expected value $$\left\langle \sigma^{(q)}_{X}\right\rangle > \left\langle \sigma^{(r)}_{X}\right\rangle$$. Also ponder the following: in getting $$\left\langle \sigma^{(r)}_{X}\right\rangle$$, I had to average over all possibilities, including those in which only $$q$$ of the random variables were non-zero.

This insinuates that you should look at your expected results for all $$nin mathbb{Z}^+$$, and also in the limit that $$n$$ goes to infinity. My guess is that though the partition function diverges with increasing $$n$$, the thing that you would actually care to and be able to observe would converge.
