STT 861 Theory of Prob and STT I Lecture Note - 9

2017-11-01

Review of the important concepts of the previous section; moment generating function; Gamma distribution, chi-square distribution.

Portal to all the other notes

Lecture 09 - Nov 01 2017

Quick Review Session (For the mid-term exam)

Bayes’ theorem

Suppose we have data: an event $B$ that happened.

Possible outcomes: $A_1,A_2,…,A_n$.

Model for each $A_i$: $P(A_i)$ given. This is the “prior” model.

Model for each relation between $A_i$ and $B$: $P(B|A_i)$. This is the “likelihood” model.

Theorem:

\[P(A_i|B)=\frac{P(B|A_i)P(A_i)}{\sum_{j=1}^{n}P(B|A_j)P(A_j)}\]
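
As a small sketch of how the formula is used (the prior and likelihood numbers below are made up, purely for illustration):

```python
# Hypothetical prior/likelihood values, just to exercise Bayes' theorem.
priors = [0.5, 0.3, 0.2]          # P(A_i), the "prior" model
likelihoods = [0.9, 0.5, 0.1]     # P(B | A_i), the "likelihood" model

denominator = sum(l * p for l, p in zip(likelihoods, priors))            # total probability of B
posteriors = [l * p / denominator for l, p in zip(likelihoods, priors)]  # P(A_i | B)
print(posteriors)                 # the entries sum to 1
```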

Quick Example (The Chevalier de Méré example in Note 3).

$P$(at least one six in 4 rolls of a die) = 1 - $P$(no six in 4 rolls of a die) = $1-(\frac{5}{6})^4\approx 0.5177$

$P$(at least one double-six in 24 rolls of 2 dice) = 1 - $P$(no double-six in 24 rolls of 2 dice) = $1-(\frac{35}{36})^{24}\approx 0.4914$
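
A quick check of both numbers (a minimal sketch using only the Python standard library):

```python
# Chevalier de Méré's two bets, computed via the complement rule.
p_six = 1 - (5 / 6) ** 4             # at least one six in 4 rolls of one die
p_double_six = 1 - (35 / 36) ** 24   # at least one double-six in 24 rolls of two dice
print(round(p_six, 4), round(p_double_six, 4))   # 0.5177 0.4914
```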

Discrete and continuous variables

Discrete r.v.’s: $P(X=x_k)=p_k$. $E(X)=\sum_{k}x_kp_k$.

Continuous Case: $P(a\leq X\leq b)=\int_{a}^{b}f(x)dx$. $E(X)=\int_{-\infty}^{\infty}xf(x)dx$.

Linearity
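
Spelled out, this review item is the linearity of expectation: for any random variables $X$, $Y$ (independent or not) and constants $a$, $b$,

\[E(aX+bY)=aE(X)+bE(Y)\]

For independent $X_i$’s we also have $Var(X_1+\cdots+X_n)=Var(X_1)+\cdots+Var(X_n)$, which is what gives $Var(\bar{X})=\frac{\sigma^2}{n}$ below.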

Chebyshev and Weak law of large numbers

Let $X$ be a r.v. such that $Var(X)$ exists. Then

\[P(|X-E(X)|>\varepsilon)\leq \frac{Var(X)}{\varepsilon^2}\]

This is true no matter how small $\varepsilon>0$ is.

Apply this to $\bar{X}=\frac{1}{n}\sum_{i=1}^{n} X_i$, where $X_i$’s are i.i.d. and $Var(X_i)<\infty$.

Note $E(\bar{X})=\mu=E(X_i)$, $Var(\bar{X})=\frac{\sigma^2}{n}$.

By Chebyshev:

\[P(|\bar{X}-\mu|>\varepsilon)\leq \frac{\sigma^2}{n\varepsilon^2}\]

As $n\rightarrow \infty$, this probability $\rightarrow0$.
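
A small simulation sketch of this convergence (using numpy; the $Exp(1)$ choice is arbitrary, so $\mu=\sigma^2=1$):

```python
import numpy as np

# Weak law of large numbers: P(|X_bar - mu| > eps) shrinks like sigma^2 / (n * eps^2).
rng = np.random.default_rng(0)
mu, eps = 1.0, 0.1                                        # Exp(1): mu = sigma^2 = 1
for n in [10, 100, 1000, 10000]:
    samples = rng.exponential(scale=1.0, size=(5000, n))  # 5000 repetitions of X_1..X_n
    x_bar = samples.mean(axis=1)
    freq = np.mean(np.abs(x_bar - mu) > eps)              # empirical probability
    print(n, freq, 1.0 / (n * eps**2))                    # vs. the Chebyshev bound
```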

Special distributions

Let $X\sim Exp(\lambda)$; then the density is $f(x)=\lambda e^{-\lambda x}$ for $x\geq 0$.

Let $X_i$ be i.i.d. $Exp(\lambda)$. Let $N(t)$ be the # of arrivals in the time interval $[0,t]$. Assume $N$ is a Poisson process. Then $X_i$ is a model for the amount of time between the $(i-1)$th and the $i$th arrivals.

[Use step functions to illustrate.]

Theorem: if $N(t)$ is a Poisson($\lambda$) process, and $T_i$’s are its jump times (arrival times) and $X_i=T_i-T_{i-1}$, then $X_i\sim Exp(\lambda)$ (i.i.d.).

What about the distribution of $T_i$? $T_i\sim \Gamma(i,\theta=\frac{1}{\lambda})$.

Here recall $\lambda$ is a rate parameter, so $\theta$ is a scale parameter.

The density of $T_n$ is

\[f(x)=\frac{\lambda}{\Gamma(n)}(\lambda x)^{n-1}e^{-\lambda x}\]

where $x\geq 0$.
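
A simulation sketch of this fact (numpy; the choices $n=3$, $\lambda=2$ are arbitrary): the $n$th arrival time, i.e. the sum of $n$ i.i.d. $Exp(\lambda)$ interarrival times, should match $\Gamma(n,\theta=\frac{1}{\lambda})$.

```python
import numpy as np

# T_n = X_1 + ... + X_n with X_i i.i.d. Exp(lambda) should be Gamma(n, theta = 1/lambda).
rng = np.random.default_rng(0)
lam, n = 2.0, 3                                                        # arbitrary choices
t_n = rng.exponential(scale=1 / lam, size=(100_000, n)).sum(axis=1)   # simulated T_n
gamma_draws = rng.gamma(shape=n, scale=1 / lam, size=100_000)          # Gamma(n, 1/lambda)
print(t_n.mean(), gamma_draws.mean())   # both close to n / lambda = 1.5
print(t_n.var(), gamma_draws.var())     # both close to n / lambda^2 = 0.75
```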

Moment generating function

Method for doing problem 2.2.5.

Let $X$ have the binomial distribution with parameters $n$ and $p$. Conditionally on $X = k$, let $Y$ have the binomial distribution with parameters $k$ and $r$. What is the marginal distribution of $Y$?

There are lots of ways to solve this problem; here we use moment generating functions (MGF).

Definition: Let $X$ be a r.v. Let

\[M_X(t)=E(e^{tX})\]

where $t$ is fixed. $M_X(t)$ is the moment generating function of $X$.

It turns out that this function usually characterizes the distribution of $X$.

Example: let $X\sim Bin(n,p)$. We know $X=X_1+X_2+\cdots+X_n$, where the $X_i$ are i.i.d. Bernoulli($p$).

Now,

\[\begin{align*} M_X(t) &= E(e^{tX})\\ &=E(e^{t(X_1+X_2+\cdots+X_n)}) \\ &= E(e^{tX_1})E(e^{tX_2})\cdots E(e^{tX_n})\\ &= (E(e^{tX_1}))^n \end{align*}\]

and

\[E(e^{tX_1})=p(e^t)+(1-p)\times 1=1+p(e^t-1)\]

Therefore,

\[M_X(t)=(1+p(e^t-1))^n\]
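
A quick numerical sanity check of this closed form (a sketch; the values $n=7$, $p=0.3$, $t=0.4$ are arbitrary):

```python
import math

# Numerical check of M_X(t) = (1 + p(e^t - 1))^n for X ~ Bin(n, p).
n, p, t = 7, 0.3, 0.4                       # arbitrary test values
direct = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) * math.exp(t * k)
             for k in range(n + 1))         # E(e^{tX}) straight from the pmf
closed_form = (1 + p * (math.exp(t) - 1)) ** n
print(direct, closed_form)                  # the two numbers agree
```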

Now look at Problem 2.2.5.

\[Y\sim Bin(Bin(n,p),r)\]

Therefore

\[Y=Y_1+Y_2+\cdots+Y_X\]

where $Y_i$ are i.i.d Bernoulli$(r)$.

Hunch: $Y$ is $Binomial(a, b)$ for some $a$ and $b$. To prove it, compute $M_Y(t)$ and check that it has the form $(1+b(e^t-1))^a$.

\[\begin{align*} M_Y(t) &=E(e^{tY})\\ &=E(e^{t(Y_1+Y_2+\cdots+Y_X)})\\ &= \sum_{k=0}^{n}E(e^{tY}|X=k)P(X=k)\\ &= \sum_{k=0}^{n}E(e^{t\cdot Bin(k,r)})P(X=k)\\ &= \sum_{k=0}^{n}(1+r(e^t-1))^kP(X=k)\\ &= \sum_{k=0}^{n}e^{k\ln(1+r(e^t-1))}P(X=k) \\ &= \sum_{k=0}^{n}e^{ku}P(X=k), \quad \text{where } u=\ln(1+r(e^t-1)) \end{align*}\]

This is the definition of $E(e^{uX})\triangleq M_X(u)$, evaluated at $u=\ln(1+r(e^t-1))$.

\[\begin{align*} M_X(u) &= (1+p(e^u-1))^n \\ &= (1+p(e^{\ln(1+r(e^t-1))}-1))^n \\ &=(1+p(1+r(e^t-1)-1))^n \\ &=(1+pr(e^t-1))^n \end{align*}\]

Therefore we recognize that $Y\sim Binom(n,pr)$.
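
A Monte Carlo sketch of this conclusion (numpy; the values $n=20$, $p=0.6$, $r=0.3$ are arbitrary): draw $X\sim Bin(n,p)$, then $Y\sim Bin(X,r)$, and compare with direct draws from $Bin(n,pr)$.

```python
import numpy as np

# Problem 2.2.5 check: X ~ Bin(n, p), Y | X = k ~ Bin(k, r)  =>  Y ~ Bin(n, p*r).
rng = np.random.default_rng(0)
n, p, r, reps = 20, 0.6, 0.3, 200_000
x = rng.binomial(n, p, size=reps)            # draw X
y = rng.binomial(x, r)                       # draw Y | X (vectorized over the array x)
y_direct = rng.binomial(n, p * r, size=reps)
print(y.mean(), y_direct.mean(), n * p * r)              # all close to 3.6
print(y.var(), y_direct.var(), n * p * r * (1 - p * r))  # all close to 2.952
```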

Gamma Distribution

Go back to the Gamma distribution.

Example

Let $Z\sim N(0,1)$ (standard normal). Then $f_Z(z)=\frac{1}{\sqrt{2\pi}}\exp(-\frac{z^2}{2})$. Find the density of $Y=Z^2$.

\[\begin{align*} F_Y(y) &= P(Y\leq y) = P(Z^2\leq y)\\ &= P(-\sqrt{y}\leq Z \leq \sqrt{y}) \\ &=F_Z(\sqrt{y})-F_Z(-\sqrt{y}) \end{align*}\]

Use the chain rule to compute $f_Y(y)$.

\[\begin{align*} f_Y(y)&=\frac{dF_Y}{dy}=\frac{d}{dy}F_Z(\sqrt{y})-\frac{d}{dy}F_Z(-\sqrt{y}) \\ &= f_Z(\sqrt{y})\frac{1}{2\sqrt{y}} - f_Z(-\sqrt{y})\frac{-1}{2\sqrt{y}}\\ &=\frac{1}{\sqrt{2\pi}}y^{-\frac{1}{2}}e^{-\frac{y}{2}} \end{align*}\]

We recognize this is the density of $\Gamma(\alpha=\frac{1}{2},\theta = 2)$.
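
A simulation sketch of this identity (numpy): squares of standard normals should match draws from $\Gamma(\alpha=\frac{1}{2},\theta=2)$.

```python
import numpy as np

# Z^2 with Z ~ N(0,1) should match Gamma(alpha = 1/2, theta = 2).
rng = np.random.default_rng(0)
z_sq = rng.standard_normal(100_000) ** 2
gamma_draws = rng.gamma(shape=0.5, scale=2.0, size=100_000)
print(z_sq.mean(), gamma_draws.mean())   # both close to 1  (= alpha * theta)
print(z_sq.var(), gamma_draws.var())     # both close to 2  (= alpha * theta^2)
```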

Chi-square distribution and degree of freedom

This Gamma, and every Gamma for which $\alpha=\frac{n}{2}$ and $\theta=2$, where $n$ is an integer, is called $\chi^2(n)$ (“Chi-squared” with $n$ degrees of freedom).

We see $\chi^2(n)\sim Z_1^2+Z_2^2+\cdots +Z_n^2$, where the $Z_i$ are i.i.d. $N(0,1)$.

\[\chi^2(n)\equiv \Gamma(\frac{n}{2},2)\]

Q: What about $\chi^2(2)$?

A: $\sim Gamma(1,2)$, which is the exponential distribution with parameter $\lambda=\frac{1}{2}$.

Q: How to create $X\sim Exp(\lambda=1)$ using only i.i.d. normals $N(0,1)$?

Try this: $Z_1^2+Z_2^2\sim Exp(\frac{1}{2})$, so

\[X=\frac{1}{2}(Z_1^2+Z_2^2)\sim Exp(1)\]

When we need to multiply a scale parameter $\theta$ by a constant $c$, we multiply the random variable by $c$.

Equivalently, when we need to multiply a rate parameter $\lambda$ by $c$, just divide the random variable by $c$.
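
A final simulation sketch (numpy) checking that $\frac{1}{2}(Z_1^2+Z_2^2)$ behaves like $Exp(1)$:

```python
import numpy as np

# (Z1^2 + Z2^2)/2 should behave like Exp(lambda = 1):
# Z1^2 + Z2^2 ~ chi^2(2) = Gamma(1, 2) = Exp(1/2); halving the variable doubles the rate.
rng = np.random.default_rng(0)
z = rng.standard_normal(size=(100_000, 2))
x = (z ** 2).sum(axis=1) / 2
exp_draws = rng.exponential(scale=1.0, size=100_000)   # Exp(1), i.e. mean 1
print(x.mean(), exp_draws.mean())   # both close to 1
print(x.var(), exp_draws.var())     # both close to 1
```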


