
LanternD's Castle


STT 861 Theory of Probability and Statistics I Lecture Note - 8

2017-10-25

Proof that the sample variance is biased; Example 3.5.3 in the textbook; normal distribution, joint (multivariate) normal distribution, Gamma distribution.

Portal to all the other notes

Lecture 08 - Oct 25 2017

For the video recording

(This part is similar to the previous note. We recorded a video for it, so the professor talked about it once again.)

Basically, it is the proof that the sample variance $\hat{\sigma}^2=\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2$ is biased.

Recall,

$$\bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i,\qquad \hat{\sigma}^2=\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2$$

Note that $\bar{x}$ is the expectation of a r.v. $\hat{X}$ which takes the value $x_i$ with probability $\frac{1}{n}$.

Therefore

$$\bar{x}=E(\hat{X})$$

and

$$\mathrm{Var}(\hat{X})=\hat{\sigma}^2$$

Now recall the formula:

$$E\big((X-c)^2\big)=\mathrm{Var}(X)+\big(E(X)-c\big)^2,\qquad \mathrm{Var}(X)=E\big((X-c)^2\big)-\big(E(X)-c\big)^2$$

Use this with $X=\hat{X}$ defined above:

$$\hat{\sigma}^2=\frac{1}{n}\sum_{i=1}^{n}(x_i-c)^2-(\bar{x}-c)^2$$

Now the question is: what is the bias of $\hat{\sigma}^2$? For this we replace each $x_i$ by $X_i$, where the $X_i$'s are i.i.d. with mean $\mu$ and variance $\sigma^2$. The resulting expression is

$$\hat{\sigma}^2(X)=\frac{1}{n}\sum_{i=1}^{n}(X_i-\bar{X})^2$$

We are interested in $E(\hat{\sigma}^2(X))$. We want to know whether this equals $\sigma^2$ or not.

Now use the formula above with $c=\mu$ and $x=X$, and take $E$ on both sides.

$$\begin{aligned}
E(\hat{\sigma}^2)&=E\left(\frac{1}{n}\sum_{i=1}^{n}(X_i-\mu)^2\right)-E\left((\bar{X}-\mu)^2\right)\\
&=E\left(\frac{1}{n}\sum_{i=1}^{n}(X_i-\mu)^2\right)-\mathrm{Var}(\bar{X})\\
&=\frac{1}{n}\sum_{i=1}^{n}\mathrm{Var}(X_i)-\frac{1}{n}\mathrm{Var}(X_1)\\
&=\frac{1}{n}\sum_{i=1}^{n}\mathrm{Var}(X_i)-\frac{\sigma^2}{n}=\sigma^2-\frac{1}{n}\sigma^2
\end{aligned}$$

We proved that $E(\hat{\sigma}^2)=\left(1-\frac{1}{n}\right)\sigma^2$. It is biased by $-\frac{1}{n}\sigma^2$.
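This bias is easy to see numerically. A minimal Monte Carlo sketch, assuming NumPy (the sample size $n=5$, true variance $\sigma^2=4$, and trial count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 5, 200_000
sigma2 = 4.0  # true variance

# trials x n matrix of i.i.d. samples with variance sigma2
x = rng.normal(loc=1.0, scale=np.sqrt(sigma2), size=(trials, n))
# biased estimator: divide by n, not n - 1
sigma2_hat = ((x - x.mean(axis=1, keepdims=True)) ** 2).mean(axis=1)

print(sigma2_hat.mean())      # ≈ (1 - 1/n) * sigma2 = 3.2, not 4
print((1 - 1 / n) * sigma2)   # 3.2
```

Dividing by $n-1$ instead of $n$ is exactly the correction that removes this bias.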

Example 3.5.3, Page 100

$(X,Y)$ has density $f(x,y)=2$ if $0\le y\le x\le 1$, and $0$ otherwise.

First compute the joint CDF of (X,Y),

General definition

$$F(u,v)=P(X\le u,\,Y\le v)=\int_{-\infty}^{u}\left(\int_{-\infty}^{v}f(x,y)\,dy\right)dx=\int_{0}^{u}\left(\int_{0}^{v}f(x,y)\,dy\right)dx$$

If $u\ge v$, the inner integral over $y$ effectively runs from $0$ to $\min(v,x)$, since the density vanishes for $y>x$:

$$\int_{0}^{v}f(x,y)\,dy=\int_{0}^{\min(v,x)}2\,dy=2\min(v,x)$$

Now we integrate between $0$ and $u$ with respect to $x$.

$$\int_{0}^{u}2\min(v,x)\,dx=\int_{0}^{v}2x\,dx+\int_{v}^{u}2v\,dx=v^2+2v(u-v)$$

So we have proved that when $u>v$,

$$F(u,v)=v^2+2v(u-v)$$

If $u<v$, we still compute the same integral:

$$\int_{0}^{u}2\min(v,x)\,dx=\int_{0}^{u}2x\,dx=u^2$$

(since $0<x\le u<v$ implies $\min(v,x)=x$)

This proves that $F(u,v)=u^2$ when $u<v$.

Next, compute the marginal densities of $X$ and $Y$.

In general, the marginal density is $f_X(x)=\frac{\partial}{\partial x}F(x,y)$ with $y$ fixed at (or above) the top of the support of $Y$, i.e. $f_X(x)=\frac{d}{dx}F(x,\infty)$. Similarly, $f_Y(y)=\frac{\partial}{\partial y}F(x,y)$ with $x$ fixed at the top of the support of $X$.

Differentiating the CDF: if $x>y$, $\frac{\partial F}{\partial x}=2y$ (constant in $x$); if $x<y$, then $\frac{\partial F}{\partial x}=2x$. Fixing $y\ge 1$, so that the whole support is covered, we obtain the marginal density

$$f_X(x)=2x,\qquad 0\le x\le 1$$
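These formulas can be sanity-checked by rejection sampling with NumPy; a sketch (the evaluation point $(u,v)=(0.8,0.5)$ is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
# Rejection-sample (X, Y) uniformly (density 2) on the triangle 0 <= y <= x <= 1.
pts = rng.uniform(size=(400_000, 2))
keep = pts[:, 1] <= pts[:, 0]
x, y = pts[keep, 0], pts[keep, 1]

# Marginal f_X(x) = 2x on [0, 1] gives E(X) = 2/3.
mean_x = x.mean()

# CDF check: for u > v, F(u, v) = v^2 + 2v(u - v) = 0.55 here.
u, v = 0.8, 0.5
emp_cdf = np.mean((x <= u) & (y <= v))
print(mean_x, emp_cdf)  # ≈ 0.667 and ≈ 0.55
```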

Special continuous distributions - Chapter 4

Definition: $X$ is normal with parameters $0$ and $1$ if it has the density

$$f(x)=\frac{1}{\sqrt{2\pi}}e^{-x^2/2}$$

Facts: $E(X)=0$, $\mathrm{Var}(X)=1$.

Notation: $X\sim N(0,1)$.

Definition: X is normal with parameters μ and σ2 if it has the density

$$f(x)=\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{1}{2\sigma^2}(x-\mu)^2\right)$$

Facts: $E(X)=\mu$, $\mathrm{Var}(X)=\sigma^2$.

Notation: $X\sim N(\mu,\sigma^2)$

Proof: it comes from the $N(0,1)$ case by a change of variables.

Fact: let $X\sim N(0,1)$, and let $\mu$ and $\sigma$ be fixed. Let $Y=\mu+\sigma X$. Then $Y\sim N(\mu,\sigma^2)$.

Fact: let $X$ and $X'$ be two independent r.v.'s, respectively $N(\mu,\sigma^2)$ and $N(\mu',\sigma'^2)$; then

$$Y=X+X'\sim N(\mu+\mu',\,\sigma^2+\sigma'^2)$$

Moral of the story: the class of normal r.v.'s is stable under linear combinations.

Example 1

Let $X\sim N(1,1)$, $Y\sim N(0,4)$, $Z\sim N(-2,1)$. Assume they are independent, and let $V=2X+3+Y-4Z$. Find $E(V)$, $\mathrm{Var}(V)$.

A: By the above moral, V should be normal.

$E(V)=2\cdot 1+3+0-4\cdot(-2)=13$, and $\mathrm{Var}(V)=4\cdot 1+4+16\cdot 1=24$.
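A quick NumPy check (note this assumes, as the answers $E(V)=13$ and $\mathrm{Var}(V)=24$ require, that $Z$ has mean $-2$):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500_000
X = rng.normal(1, 1, N)    # N(1, 1)
Y = rng.normal(0, 2, N)    # N(0, 4); NumPy takes the std dev, sqrt(4) = 2
Z = rng.normal(-2, 1, N)   # N(-2, 1): mean -2, so that E(V) = 13
V = 2 * X + 3 + Y - 4 * Z

print(V.mean(), V.var())   # ≈ 13 and ≈ 24
```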

Example 2

Let $X\sim N(\mu,\sigma^2)$, and let

$$Z=\frac{X-\mu}{\sigma}$$

Then $E(Z)=0$ and $\mathrm{Var}(Z)=1$; moreover $Z\sim N(0,1)$, because $Z$ is a linear transformation of $X$.

Joint normal distribution (multivariate normal)

Definition: The vector $(X_1,X_2,\dots,X_n)$ is normal with mean $\mu\in\mathbb{R}^n$ and covariance matrix $C$ (here $\mu=\{\mu_i\}_{i=1}^{n}$, $C=\{c_{ij}\}_{i,j=1}^{n}$, with $E(X_i)=\mu_i$ and $\mathrm{Cov}(X_i,X_j)=c_{ij}$; sometimes people use the letter $Q$ or $\Sigma$ instead of $C$), if its joint density is

$$f(x_1,x_2,\dots,x_n)=\frac{1}{(2\pi)^{n/2}\sqrt{\det C}}\exp\left(-\frac{1}{2}(x-\mu)^TC^{-1}(x-\mu)\right)$$

(Notice: x and μ are column vectors)

The quantity inside $\exp(\cdot)$ is

$$(x-\mu)^TC^{-1}(x-\mu)=\sum_{j=1}^{n}\sum_{i=1}^{n}(x_i-\mu_i)(C^{-1})_{ij}(x_j-\mu_j)$$

Simple case: $n=1$, $C=(\sigma^2)$, so $\det(C)=\sigma^2$ and $\frac{1}{\sqrt{\det(C)}}=\frac{1}{\sigma}$.

So

$$(x-\mu)^TC^{-1}(x-\mu)=\frac{(x-\mu)^2}{\sigma^2}$$

This matches f(x).

Now take $n=2$ with independent $X_1$ and $X_2$.

$$\det(C)=\sigma_1^2\sigma_2^2$$

$$\begin{aligned}
f(x_1,x_2)&=\frac{1}{2\pi\sqrt{\sigma_1^2\sigma_2^2}}\exp\left(-\frac{1}{2}\sum_{j=1}^{2}\sum_{i=1}^{2}(x_i-\mu_i)(C^{-1})_{ij}(x_j-\mu_j)\right)\\
&=\frac{1}{2\pi\sigma_1\sigma_2}\exp\left(-\frac{1}{2}\sum_{i=1}^{2}(x_i-\mu_i)\frac{1}{\sigma_i^2}(x_i-\mu_i)\right)\\
&=\frac{1}{2\pi\sigma_1\sigma_2}\exp\left(-\frac{1}{2}\left(\frac{(x_1-\mu_1)^2}{\sigma_1^2}+\frac{(x_2-\mu_2)^2}{\sigma_2^2}\right)\right)
\end{aligned}$$

In general, if $X_1,X_2,\dots,X_n$ are independent normals, $N(\mu_i,\sigma_i^2)$, $i=1,2,\dots,n$, then

$$f(x_1,x_2,\dots,x_n)=\frac{1}{(2\pi)^{n/2}\prod_{i=1}^{n}\sigma_i}\exp\left(-\frac{1}{2}\left\|\left(\frac{x_i-\mu_i}{\sigma_i}\right)_{i=1}^{n}\right\|_{\mathrm{eucl}}^2\right)=\frac{1}{(2\pi)^{n/2}\prod_{i=1}^{n}\sigma_i}\exp\left(-\frac{1}{2}\sum_{i=1}^{n}\left(\frac{x_i-\mu_i}{\sigma_i}\right)^2\right)$$
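This product form is easy to verify numerically; a sketch assuming SciPy is available (the means, standard deviations, and evaluation point below are arbitrary):

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

mu = np.array([1.0, -2.0])
sigma = np.array([0.5, 3.0])
C = np.diag(sigma ** 2)   # independence => diagonal covariance matrix

pt = np.array([0.3, -1.1])
# Joint density from the multivariate formula...
joint = multivariate_normal(mean=mu, cov=C).pdf(pt)
# ...equals the product of the one-dimensional densities.
product = norm(mu[0], sigma[0]).pdf(pt[0]) * norm(mu[1], sigma[1]).pdf(pt[1])
print(joint, product)     # identical up to floating-point error
```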

We recognize that if $\mathrm{Cov}(X_i,X_j)=0$ and $(X_i,X_j)$ is bivariate normal, then $X_i$ and $X_j$ are independent.

In general, $\mathrm{Cov}(X_i,X_j)=0$ does not imply that $X_i$ and $X_j$ are independent. But it does when $(X_i,X_j)$ is bivariate normal.

Question: we want to create a multivariate normal vector $Y$ with a given covariance matrix $C$, starting from a vector $X=(X_1,X_2,\dots,X_n)$ where the $X_i$'s are i.i.d. $N(0,1)$ ("standard normals").

(It's easy to use the Box-Muller transformation to generate such an $X$.)

How can we create such a $Y$ from this $X$?

Answer: Recall the notion of the square root of a matrix ($C$ is positive definite).

(Linear algebra: $M^TM=MM^T=C$)

There exists a matrix $M$ which is "$\sqrt{C}$".

Consider

$$Z=MX$$

We know (by stability by linear combinations) that Z is multivariate normal.

Note

$$E(Z)=M\,E(X)=0$$

What about Cov(Z)?

$$\begin{aligned}
Q_{ij}=\mathrm{Cov}(Z_i,Z_j)=E[Z_iZ_j]&=E\left(\left(\sum_{k=1}^{n}M_{ik}X_k\right)\left(\sum_{l=1}^{n}M_{jl}X_l\right)\right)\\
&=\sum_{k=1}^{n}\sum_{l=1}^{n}M_{ik}M_{jl}E(X_kX_l)\\
&=\sum_{k=1}^{n}M_{ik}M_{jk}\qquad\big(E(X_kX_l)=1\text{ if }k=l,\text{ else }0\big)\\
&=(MM^T)_{ij}
\end{aligned}$$

So we see that $\mathrm{Cov}(Z)=MM^T=C$, i.e. we can take $Y=Z=MX$.
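This construction can be sketched with NumPy, using the Cholesky factor as one convenient square root of $C$ (the matrix $C$ below is an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(0)
C = np.array([[2.0, 0.8],
              [0.8, 1.0]])              # target covariance (positive definite)

M = np.linalg.cholesky(C)               # lower-triangular, M @ M.T == C
X = rng.standard_normal((2, 300_000))   # columns of i.i.d. N(0, 1) entries
Z = M @ X                               # each column is one sample of Y = MX

print(np.allclose(M @ M.T, C))          # True
print(np.cov(Z))                        # ≈ C
```

The Cholesky factor is not the only square root; any $M$ with $MM^T=C$ works.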

Exercise: For $n=2$, find a square root of a $2\times 2$ covariance matrix. There should be a place in the book where the author does this implicitly, for the purpose of creating a bivariate normal.

Gamma distribution - Chapter 4.3

Definition: A r.v. $X$ is Gamma with shape parameter $\alpha$ and scale parameter $\theta$ if it has the density

$$f(x)=c\left(\frac{x}{\theta}\right)^{\alpha-1}\frac{1}{\theta}\exp\left(-\frac{x}{\theta}\right),\qquad x>0$$

where the constant is $c=\frac{1}{\Gamma(\alpha)}$; if $\alpha=n$ is an integer, $\Gamma(n)=(n-1)!$.

Notation: $X\sim\Gamma(\alpha,\theta)$.

Fact: if $X_1,X_2,\dots,X_n$ are independent with $X_i\sim\Gamma(\alpha_i,\theta)$, then

$$X_1+X_2+\cdots+X_n\sim\Gamma(\alpha_1+\alpha_2+\cdots+\alpha_n,\,\theta)$$

Consequently, applying the above with $\alpha_1=\alpha_2=\cdots=\alpha_n=1$, the sum of $n$ i.i.d. $\mathrm{Exp}(\lambda=\frac{1}{\theta})$ r.v.'s is a $\Gamma(n,\theta)$ r.v.

Specifically, this proves exactly that the $n$-th "arrival" time of a Poisson($\lambda$) process is a $\Gamma(n,\frac{1}{\lambda})$ r.v.
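A small NumPy simulation of this last fact (shape $n=5$ and scale $\theta=2$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta = 5, 2.0          # shape n, scale theta = 1/lambda
trials = 200_000

# n-th arrival time of the process = sum of n i.i.d. Exp(1/theta) gaps.
T = rng.exponential(scale=theta, size=(trials, n)).sum(axis=1)

# Gamma(n, theta) has mean n*theta and variance n*theta^2.
print(T.mean(), T.var())   # ≈ 10 and ≈ 20
```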


