Math Genius: Prove that $n \cdot\min\{T_1,\ldots,T_n\}$ isn’t allowable as an estimator of $\mu$

Let’s suppose we have some electronic device whose lifetime follows an Exponential distribution with unknown mean $\mu$. A research team wants to estimate $\mu$ and uses a sample of $n$ devices to do it. They use the estimator $T=n\cdot \min\{T_1,\ldots,T_n\}$, that is, $n$ times the time at which the first device fails when all of them are put to work at the same time.

I need to prove that this estimator $T$ is not allowable. That would mean that $T$ isn’t asymptotically centered or that $T$ isn’t consistent.

I’ve found the distribution function of $\min\{T_1,\ldots,T_n\}$ (I’ll call it $F$):
$$F(t)=P(\min\{T_1,\ldots,T_n\}\leq t)=1-P(\min\{T_1,\ldots,T_n\}> t)$$
$$= 1-P(T_1>t,\,T_2>t,\ldots,T_n>t)=1-\left(P(T_1>t)\right)^n=1-e^{-nt/\mu}.$$
I’m stuck here. I need to prove it’s NOT allowable.

You are on the right track but you need to keep going a bit further.

The distribution of the first order statistic $$T_{(1)} = \min (T_1, T_2, \ldots, T_n)$$ is indeed exponential with CDF $$\Pr[T_{(1)} \le t] = 1 - e^{-nt/\mu}.$$ The next step is to compute the distribution of the scale-transformed first order statistic: this is your actual $T$: $$T = n T_{(1)} = n \min (T_1, T_2, \ldots, T_n).$$ This is not hard to do; I leave it as an exercise to show $$T \sim \operatorname{Exponential}(\mu)$$ with CDF $$\Pr[T \le t] = 1 - e^{-t/\mu}.$$
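For completeness, the exercise is a one-line substitution into the CDF of $T_{(1)}$ derived above:
$$\Pr[T \le t] = \Pr[n\,T_{(1)} \le t] = \Pr[T_{(1)} \le t/n] = 1 - e^{-n(t/n)/\mu} = 1 - e^{-t/\mu}.$$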
What this tells us is that $n$ times the minimum observation is exponentially distributed with the same mean parameter, and our intuition should lead us to observe that if this is the case, this statistic is no better than simply using a single observation: in fact, because the CDF of $T$ is entirely independent of $n$, its characteristics as an estimator of $\mu$ are also independent of the sample size! So, for instance, the asymptotic variance does not tend to $0$; we explicitly have $$\operatorname{Var}[T] = \mu^2$$ hence $$\lim_{n \to \infty} \operatorname{Var}[T] = \mu^2 > 0.$$ This is an undesirable characteristic of an estimator for the parameter because it says that the precision of the estimate does not improve with increasing sample size, so there is no information to be gained about the parameter by collecting more data if you use this estimator.

Stepping back, ask yourself why this happens. While it is true that the first order statistic $T_{(1)}$ has a variance that decreases with increasing sample size, the problem is that this decrease is not strong enough to offset the increase in variance that occurs when we multiply $T_{(1)}$ by $n$. As you know, for a scalar constant $c$ and a random variable $X$ with finite variance, we have $$\operatorname{Var}[cX] = c^2 \operatorname{Var}[X].$$ This means that scale transformations of a random variable have a squaring effect on its variance. So if we must scale the first order statistic by $n$ in order to get an estimate, then the variance of this statistic must decay faster than $1/n^2$ in order to compensate for the increase due to scaling. And this does not occur; in fact, the two effects balance each other out exactly in this case: $\operatorname{Var}[T_{(1)}] = \mu^2/n^2$, so $\operatorname{Var}[n T_{(1)}] = n^2 \cdot \mu^2/n^2 = \mu^2$.
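A quick way to see this numerically is to simulate both $T = n\min_i T_i$ and the sample mean for increasing $n$; the sketch below (an illustration added here, with arbitrary choices of $\mu$, the sample sizes, and the replication count) shows the variance of $T$ staying near $\mu^2$ while the variance of the sample mean shrinks like $\mu^2/n$.

```python
import numpy as np

# Compare the spread of T = n * min(T_1, ..., T_n) with that of the sample mean
# as the sample size n grows. mu, the sample sizes, and reps are arbitrary choices.
rng = np.random.default_rng(0)
mu = 5.0
reps = 100_000

for n in (5, 50, 500):
    samples = rng.exponential(scale=mu, size=(reps, n))
    t_min = n * samples.min(axis=1)      # the estimator T = n * (first order statistic)
    t_bar = samples.mean(axis=1)         # the sample mean, for comparison
    print(n, t_min.var(), t_bar.var())   # Var[T] stays near mu^2 = 25; Var[mean] ~ mu^2 / n
```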

I should point out here that “not allowable” is a bit strong. I would prefer to characterize the estimator $T$ as “poor” or “undesirable.” After all, it is an estimator of $\mu$, just not a good one.

Math Genius: Problem calculating the characteristic function of the exponential distribution

I was trying to calculate the characteristic function of the exponential distribution $$\varphi(t) = \mathbb E[e^{itX}] = \int_{-\infty}^\infty e^{itx} \lambda e^{-\lambda x} \cdot 1_{[0,\infty)}(x) \, dx = \lambda \int_0^\infty e^{itx-\lambda x} \, dx = \lambda \int_0^\infty e^{x(it-\lambda)} \, dx$$ Now here I substituted $u = x(it-\lambda)$ and got $$\frac{\lambda}{it-\lambda}\left[e^{x(it-\lambda)}\right]_0^{\infty}$$ and I know the solution should be $$\frac{\lambda}{it-\lambda}[0-1] = \frac{\lambda}{\lambda - it}$$ But I don’t understand why $$\lim_{x \rightarrow \infty}\left(e^{x(it-\lambda)}\right) = \lim_{x \rightarrow \infty} \left(\frac{e^{xit}}{e^{x \lambda}}\right)= 0$$

Note that we have $\lambda>0$. We obtain
\begin{align*}
\color{blue}{\lim_{x\to\infty}}\color{blue}{e^{x(it-\lambda)}}
&=\lim_{x\to\infty}\left(e^{-\lambda x}e^{itx}\right)\\
&=\left(\lim_{x\to \infty}e^{-\lambda x}\right)\left(\lim_{x\to \infty} e^{itx}\right)\\
&=0\cdot\left(\lim_{x\to \infty} e^{itx}\right)\\
&\,\,\color{blue}{=0}
\end{align*}

since $e^{itx}$ has modulus $1$.
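As a sanity check on the closed form (a numerical sketch added here; the rate $\lambda = 2$ and the point $t = 1.5$ are arbitrary choices), one can compare a truncated numerical approximation of $\mathbb E[e^{itX}]$ with $\lambda/(\lambda-it)$:

```python
import numpy as np

# Numerically approximate E[e^{itX}] for X ~ Exp(lambda) and compare with lambda / (lambda - it).
lam, t = 2.0, 1.5                      # assumed rate and evaluation point (arbitrary)

x = np.linspace(0.0, 50.0, 500_001)    # truncate at 50: e^{-lam * x} is negligible beyond that
dx = x[1] - x[0]
integrand = np.exp(1j * t * x) * lam * np.exp(-lam * x)
numeric = np.sum(integrand) * dx       # simple Riemann sum of the truncated integral

analytic = lam / (lam - 1j * t)
print(numeric, analytic)               # both are approximately 0.64 + 0.48j
```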

Math Genius: Exponential distribution and the age of the human civilization

So, thinking the other day about how young our human civilization is compared to the potential time it could last (for example, 5 billion years until the Earth dies), I thought that maybe this is not a coincidence and that civilizations don’t actually last that long; in other words, the young age of our civilization tells us something about how long we are going to last.

I know from studying the exponential distribution that, at any point in time, the prediction we make about the future is the same. For example, when connecting to someone’s phone call to eavesdrop, we expect them to go on talking for the same amount of time, regardless of how long they have already been talking.

Are those two things connected in any way? Does the amount of time human civilization has existed carry any information about how long it is going to exist for?

Math Genius: Which of these two estimators of $\mu$ is better (Exponential distribution)?

The problem goes like this:

“Suppose we have some electronic device whose lifetime follows an Exponential distribution with unknown mean $\mu$. We want to estimate $\mu$ and two teams will take care of it. Each team tests $n$ different devices:

  • The first team measures the time $T$ at which the first device fails, and provides the estimate $T_1=nT$.

  • The second team measures each device’s lifetime and takes the estimate $T_2$ to be the mean of those lifetimes.

Which team uses a better estimator of $\mu$?”

I know that the second team is obviously using the better estimator of $\mu$, but I’m not sure how to prove it. I’ve tried using the definition of an allowable estimator, which says that every allowable estimator is asymptotically centered ($(\hat{\theta}_n)$ is asymptotically centered as an estimator of $\theta$ if $\lim\limits_{n \rightarrow \infty}\tau_n(\theta)=\theta$, where $\tau_n(\theta)=\text{E}[\hat{\theta}_n(X_i)]$).

Calculating that limit for team two’s estimator gives me $\mu$, because the sample mean goes to $\mu$ as the number of devices sampled goes to infinity.

My real problem is with $T_1$. I don’t get why it multiplies the time $T$ of the first device by $n$, and evaluating that limit gives $\infty$, if I’m not mistaken:

$$\lim\limits_{n \rightarrow \infty}\tau_n(n\cdot T)=\text{E}[\infty \cdot T]=\infty.$$

Is my reasoning correct? Or what am I doing wrong?

You asked a similar question more recently, but I am going to provide a separate answer to this question since the comments attached to this question are mistaken.

What the other commenters in this question have misunderstood is that the $n$ devices are being tested simultaneously and not sequentially; therefore, the time to failure of the first device to fail is exponential with mean $\mu/n$, which is what you calculated in the more recent linked question. Multiplying this by $n$ yields an estimator for the mean time to failure; however, as I have pointed out there, the asymptotic variance of this estimator is not a decreasing function of the sample size. This makes it an inferior estimator compared to the one used by the second team.

It is worth noting that $$\lim_{n \to \infty} \operatorname{E}[nT_{(1)}] \ne \operatorname{E}[\infty \cdot T_{(1)}].$$ This is of course an invalid interchange of limits and expectations, as well as not recognizing that $T_{(1)}$ has a limiting expectation of $0$. Instead, $$\operatorname{E}[n T_{(1)}] = n \operatorname{E}[T_{(1)}] = n (\mu/n) = \mu.$$
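To make the comparison quantitative (a supplementary calculation under the same exponential model, writing $X_1,\ldots,X_n$ for the individual device lifetimes), both estimators are unbiased for $\mu$, but their variances behave very differently:
$$\operatorname{Var}[T_1] = \operatorname{Var}[n T_{(1)}] = n^2 \cdot \frac{\mu^2}{n^2} = \mu^2, \qquad \operatorname{Var}[T_2] = \operatorname{Var}\left[\frac{1}{n}\sum_{i=1}^n X_i\right] = \frac{\mu^2}{n} \xrightarrow[n\to\infty]{} 0,$$
so only the second team’s estimator becomes more precise as $n$ grows.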

Math Genius: Why the process $n \mapsto e^{-aT_n}f(X_{T_n})$ is a supermartingale.

Let $M := (e^{-aT_n}f(X_{T_n}); \mathcal{F}_n)$, where $a$ is a constant, $\mathcal{F}_n$ is an appropriately defined filtration, and $T_n$ is the $n$th jump of a Poisson process (independent of $X_t$) with parameter $\lambda$, i.e. $T_n$ is exponentially distributed with parameter $1/\lambda$.

Assume that $f(x) \geq \mathbb{E}_x[e^{-aU}f(X_{U})]$ for all $x \in \mathbb{R}_+$, where $U$ is an exponentially distributed random variable with parameter $1/\lambda$, independent of $X_t$.

Why does it follow from this inequality that $M$ is indeed a supermartingale? I assume this should be straightforward but I don’t see how to apply the definition of a supermartingale here.

I think you’re missing an assumption about $X$ being a stationary (strong) Markov process so that:
$$\mathbb{E}[g(X_{t+u})\mid X_s,\, s\leq t] = \mathbb{E}_{X_t}[g(X_u)]$$

Also, if the $T_n$ are really jump times of a Poisson process, then the differences $T_{n+1}-T_n$ are exponentially distributed with parameter $1/\lambda$, obviously. I assume this was just a typo and not throwing you off.

With that said, because $T_n$ is $\mathcal{F}_n$-measurable, we can write:
$$\mathbb{E}[M_{n+1}\mid\mathcal{F}_n]
= e^{-aT_n}\mathbb{E}[e^{-a(T_{n+1}-T_n)}f(X_{T_{n+1}})\mid\mathcal{F}_n]
= e^{-aT_n}\mathbb{E}[e^{-aU}f(X_{T_n+U})\mid\mathcal{F}_n]$$

for $U=T_{n+1}-T_n$. By the independence of $U$ and the stationary (strong) Markov property for $X$, we have:
$$=e^{-aT_n}\mathbb{E}_{X_{T_n}}[e^{-aU}f(X_U)]$$
and by the hypothesized inequality:

$$\leq e^{-aT_n} f(X_{T_n}) = M_n$$
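Putting this together (a summary remark added for clarity): $M_n = e^{-aT_n}f(X_{T_n})$ is $\mathcal{F}_n$-measurable, and, granting the relevant integrability, the chain of (in)equalities above says precisely that
$$\mathbb{E}[M_{n+1}\mid\mathcal{F}_n] \leq M_n \quad \text{for all } n,$$
which is the defining inequality of a supermartingale.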

Math Genius: Conditional and joint distribution of the sum of exponential RVs

Let $X_1,X_2,\ldots,X_n$ be i.i.d. $Exp(\lambda)$ random variables and $Y_k =\sum^{k}_{i=1}X_i$, $k = 1,2,\ldots,n$.

a) Find the joint PDF of $Y_1,…,Y_n$.

b) Find the conditional PDF of $Y_k$ conditioned on $Y_1,\ldots,Y_{k−1}$, for $k = 2,3,\ldots,n$.

c) Show that $Y_1,\ldots,Y_k$ conditioned on $Y_{k+1},\ldots,Y_n$ is uniformly distributed over a subset of $\Bbb{R}^k$, for $k = 1,2,\ldots,n−1$. Find this subset.

My attempt:

For $\lambda_i = \lambda$, $\sum^{n}_{i=1}X_i \sim Erlang(n,\lambda)$, thus $Y_k\sim Erlang(k,\lambda)$.

From here I need to find the CDF first to find the PDF. But I don’t understand how.

Begin here:

Since for all $2 \leq k\leq n$ we have $X_k=Y_k-Y_{k-1}$, and the $(X_k)$ are i.i.d. exponentially distributed with pdf $f_{\small X}(x)=\lambda\exp(-\lambda x)\cdot\mathbf 1_{0\leq x}$ … therefore… $$\begin{align}f_{\small Y_1,Y_2,\ldots,Y_n}(y_1,y_2,\ldots, y_n)&=f_{\small X_1,X_2,\ldots,X_n}(y_1,y_2{-}y_1,\ldots,y_n{-}y_{n-1})\\[1ex]&= f_{\small X}(y_1)\prod_{k=2}^nf_{\small X}(y_k-y_{k-1})\\[2ex]&=\lambda^n\exp\left(-\lambda\left(y_1+\sum_{k=2}^n(y_k-y_{k-1})\right)\right)\cdot\mathbf 1_{0\leq y_1\leq y_2\leq\ldots\leq y_n}\\[2ex]&=\phantom{\lambda^n\exp(-\lambda y_n)\cdot\mathbf 1_{0\leq y_1\leq y_2\leq\ldots\leq y_n}}\end{align}$$
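One step worth making explicit (a supplementary remark, not part of the original answer): the first equality uses the change of variables $(x_1,\ldots,x_n)\mapsto(y_1,\ldots,y_n)$ with $y_k=x_1+\cdots+x_k$, whose inverse is $x_1=y_1$ and $x_k=y_k-y_{k-1}$ for $k\geq 2$. Both maps are linear with lower-triangular matrices having $1$’s on the diagonal, so
$$\left|\det\frac{\partial(x_1,\ldots,x_n)}{\partial(y_1,\ldots,y_n)}\right|=1,$$
which is why the joint density transfers with no extra Jacobian factor.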

Math Genius: Expectation of sample range for an exponential distribution

$X_1, \ldots , X_n$, $n \ge 4$ are independent random variables with exponential distribution: $f\left(x\right) = \mathrm{e}^{-x}, x\ge 0$. We define $$R= \max \left( X_1, \ldots , X_n\right) - \min \left( X_1, \ldots , X_n\right)$$

Calculate $\mathbb{E}R$.

So I know that: $$\mathbb{E}R =\mathbb{E}\left( \max \left( X_1, \ldots , X_n\right) \right)- \mathbb{E}\left(\min \left( X_1, \ldots , X_n\right)\right)$$

And I can calculate
$$\mathbb{E}\left(\min \left( X_1, \ldots , X_n\right)\right) = \int\limits_{0}^{\infty}\left(1-F_{\min}\left(x\right)\right) \mathrm{d}x=\int\limits_{0}^{\infty}\mathrm{e}^{-nx}\,\mathrm{d}x = \frac{1}{n}.$$

The problem is to calculate:
$$\mathbb{E}\left(\max \left( X_1, \ldots , X_n\right)\right) = \int\limits_{0}^{\infty}x \cdot n\cdot \mathrm{e}^{-x}\left( 1-\mathrm{e}^{-x}\right)^{n-1} \mathrm{d}x = \ldots$$

I don’t know how to calculate the above integral.

Let $X_{(1)}<X_{(2)}<\cdots<X_{(n)}$ be the order statistics corresponding to $X_1,X_2,\ldots,X_n$.

Making the transformation $(X_{(1)},\ldots,X_{(n)})\mapsto (Y_1,\ldots,Y_n)$ where $Y_1=X_{(1)}$ and $Y_i=X_{(i)}-X_{(i-1)}$ for $i=2,3,\ldots,n$, we have $Y_i$ exponential with mean $1/(n-i+1)$, independently for all $i=1,\ldots,n$.

Therefore, $$R=X_{(n)}-X_{(1)}=\sum_{i=1}^n Y_i-Y_1=\sum_{i=2}^n Y_i$$

Hence, $$\mathbb E(R)=\sum_{i=2}^n \frac1{n-i+1}$$
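Reindexing with $k=n-i+1$ gives the equivalent closed form (a small supplementary remark):
$$\mathbb E(R)=\sum_{k=1}^{n-1}\frac1{k}=H_{n-1}=H_n-\frac1n,$$
consistent with $\mathbb E(X_{(n)})-\mathbb E(X_{(1)})=H_n-\frac1n$ from the alternative calculation below.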

A more elegant argument is provided in this answer by @Did.


Alternatively, we can proceed to find the expectation of $X_{(1)}$ and $X_{(n)}$ separately as you did. Clearly $X_{(1)}$ is exponential with mean $1/n$. And the density of $X_{(n)}$ is

$$f_{X_{(n)}}(x)=ne^{-x}(1-e^{-x})^{n-1}\,\mathbf 1_{x>0}$$

Therefore,

\begin{align}
\mathbb E(X_{(n)})&=\int x f_{X_{(n)}}(x)\,dx
\\&=n\int_0^\infty xe^{-x}(1-e^{-x})^{n-1}\,dx
\\&=n\int_0^1(-\ln u)(1-u)^{n-1}\,du \tag{1}
\\&=n\int_0^1 -\ln(1-t)\,t^{n-1}\,dt \tag{2}
\\&=n\int_0^1 \sum_{j=1}^\infty \frac{t^j}{j}\cdot t^{n-1}\,dt \tag{3}
\\&=n\sum_{j=1}^\infty \frac1j \int_0^1 t^{n+j-1}\,dt \tag{4}
\\&=n\sum_{j=1}^\infty \frac1{j(n+j)}
\\&=\sum_{j=1}^\infty \left(\frac1j-\frac1{n+j}\right)
\\&=\sum_{j=1}^n \frac1j
\end{align}

$(1)$: Substitute $e^{-x}=u$.

$(2)$: Substitute $t=1-u$.

$(3)$: Use Maclaurin series expansion of $\ln(1-t)$ which is valid since $t\in (0,1)$.

$(4)$: Interchange integral and sum using Fubini/Tonelli’s theorem.

We can also find the density of $R$ through the change of variables $(X_{(1)},X_{(n)})\mapsto (R,X_{(1)})$ and find $\mathbb E(R)$ directly by essentially the same calculation as above.
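As a quick numerical cross-check (a simulation sketch added here; the choice $n=10$ and the replication count are arbitrary), the simulated mean range should be close to $H_n-\frac1n$:

```python
import numpy as np

# Monte Carlo check of E[R] for the range of n standard exponentials.
rng = np.random.default_rng(1)
n, reps = 10, 200_000

x = rng.exponential(scale=1.0, size=(reps, n))   # X_i with density e^{-x}, x >= 0
r = x.max(axis=1) - x.min(axis=1)                # sample range R

h_n = sum(1.0 / k for k in range(1, n + 1))      # harmonic number H_n
print(r.mean(), h_n - 1.0 / n)                   # both are approximately 2.83 for n = 10
```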

You can go for calculating another integral:

$$\begin{aligned}\mathbb{E}\max\left(X_{1},\dots,X_{n}\right) & =\int_{0}^{\infty}P\left(\max\left(X_{1},\dots,X_{n}\right)>x\right)dx\\
 & =\int_{0}^{\infty}1-P\left(\max\left(X_{1},\dots,X_{n}\right)\leq x\right)dx\\
 & =\int_{0}^{\infty}1-\left(1-e^{-x}\right)^{n}dx\\
 & =\int_{0}^{\infty}\sum_{k=1}^{n}\binom{n}{k}\left(-1\right)^{k-1}e^{-kx}dx\\
 & =\sum_{k=1}^{n}\binom{n}{k}\left(-1\right)^{k-1}\int_{0}^{\infty}e^{-kx}dx\\
 & =\sum_{k=1}^{n}\binom{n}{k}\left(-1\right)^{k-1}\left[-\frac{e^{-kx}}{k}\right]_{0}^{\infty}\\
 & =\sum_{k=1}^{n}\binom{n}{k}\left(-1\right)^{k-1}\frac{1}{k}
\end{aligned}
$$

There might be a closed form for it, but I haven’t found it yet.


Edit:

According to the comment of @RScrlli the outcome can be proved to equal the harmonic number: $$H_n=\sum_{k=1}^n\frac1{k}$$

This makes me suspect that there is a way to find it as the expectation of: $$X_{(n)}=X_{(1)}+(X_{(2)}-X_{(1)})+\cdots+(X_{(n)}-X_{(n-1)})$$
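For what it’s worth, one direct way to confirm that the alternating sum equals $H_n$ (added here as a supplementary check, independent of that decomposition) is to rewrite it as an integral:
$$\sum_{k=1}^{n}\binom{n}{k}\frac{(-1)^{k-1}}{k} = \int_0^1 \frac{1-(1-x)^n}{x}\,dx = \int_0^1 \sum_{j=0}^{n-1}(1-x)^j\,dx = \sum_{j=0}^{n-1}\frac{1}{j+1} = H_n.$$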

A clever probabilistic approach is one that takes advantage of the homogeneous parameter $\lambda_i =1$ for all $i$, the memorylessness of the exponential distribution, and the fact that there is zero probability that $X_i = X_j$ for $i\neq j$.

For $(X_1, X_2, \ldots,X_n)$ we want $E\big[\max_i X_i\big]$.

$\max_i X_i$ is the time of the final arrival in a Poisson-type process with intensity $n$ in which the intensity drops by one after each arrival.

That is, waiting for the first arrival among $(X_1, X_2, \ldots,X_n)$ is equivalent to watching the merger of $n$ independent Poisson processes, which is a merged Poisson process with parameter $n$.

WLOG suppose $X_n$ is the first arrival; then consider the first arrival among $(X_1, X_2, \ldots,X_{n-1})$. By memorylessness we have a fresh start with $n-1$ independent Poisson processes, which merge into a process with parameter $n-1$,

and continue on until, WLOG, we only want the first arrival among $(X_1)$.

So $\max_i X_i =\sum_{i=1}^n T_i$, where the $T_i$ are the successive inter-arrival times described above, and
$$E\big[\max_i X_i\big] =\sum_{i=1}^n E\big[T_i\big] =\sum_{i=1}^n \frac{1}{n-i+1}= \sum_{i=1}^n\frac{1}{i}.$$

really you should always try to exploit memorylessness when dealing with exponential r.v.’s

Math Genius: 1 minute of a call costs 1 unit of currency. Let’s assume that the duration of a call has an exponential distribution with parameter $\lambda=1$

1 minute of a call costs 1 unit of currency. Let’s assume that the duration of a call has an exponential distribution with parameter $\lambda=1$, which implies that a call lasts for an average of 1 minute.

How much on average do we pay for a call?

$X$-Duration of call

$Z$-Costs of call

My take on it:

$$\lambda=1\Rightarrow \left\{\begin{matrix}
f_X(t)=e^{-t}\\
P(X\leq t)=1-e^{-t}
\end{matrix}\right.\qquad\text{for } t>0,$$

with $f_X(t)=0$ for $t\leq0$; that is,

$$f_X(t)=\lambda e^{-\lambda t}=e^{-t},\quad t>0.$$

For a call we pay $z$ units of currency $(z=1,2,\ldots)$ if the call lasts $z-1<X\leq z$.

$$P(Z=z)=P(z-1<X\leq z)=(1-e^{-z})-(1-e^{-(z-1)})=e^{1-z}-e^{-z}$$

Average cost of a call, $E(Z)$:

$E(Z)=\sum_{z=1}^{\infty}z\times P(Z=z)=\sum_{z=1}^{\infty}z(e^{1-z}-e^{-z})=(e-1)\sum_{z=1}^{\infty}ze^{-z}=(e-1)\times \frac{e}{(e-1)^{2}}=\frac{e}{e-1}$

$\approx1.58$ currency

Could anyone check if it makes sense?
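As an added numerical check (a simulation sketch, not part of the original post; the sample size is an arbitrary choice), the cost equals $\lceil X\rceil$ for $X\sim Exp(1)$, and its simulated mean should be close to $e/(e-1)\approx 1.58$:

```python
import numpy as np

# Simulate call durations X ~ Exp(1) and the per-call cost ceil(X).
rng = np.random.default_rng(2)
x = rng.exponential(scale=1.0, size=1_000_000)   # durations, mean 1 minute
cost = np.ceil(x)                                # pay 1 unit of currency per started minute
print(cost.mean(), np.e / (np.e - 1))            # both approximately 1.582
```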
