STT 861 Theory of Probability and Statistics I Lecture Note - 8
2017-10-25
Proof that the sample variance is biased; Example 3.5.3 in the textbook; normal distribution, joint normal distribution (multivariate), Gamma distribution.
Portal to all the other notes
- Lecture 01 - 2017.09.06
- Lecture 02 - 2017.09.13
- Lecture 03 - 2017.09.20
- Lecture 04 - 2017.09.27
- Lecture 05 - 2017.10.04
- Lecture 06 - 2017.10.11
- Lecture 07 - 2017.10.18
- Lecture 08 - 2017.10.25 -> This post
- Lecture 09 - 2017.11.01
- Lecture 10 - 2017.11.08
- Lecture 11 - 2017.11.15
- Lecture 12 - 2017.11.20
- Lecture 13 - 2017.11.29
- Lecture 14 - 2017.12.06
Lecture 08 - Oct 25 2017
For the video recording
(This part is similar to the previous note; a video was recorded for it, so the professor went over it once again.)
Basically, it is the proof that the sample variance $\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2$ is biased.
Recall,
$$\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2.$$
Note that $\bar{x}$ is the expectation of a r.v. $\hat{X}$ which takes the value $x_i$ with probability $\frac{1}{n}$.
Therefore
$$\bar{x} = E(\hat{X}) \quad\text{and}\quad \mathrm{Var}(\hat{X}) = \hat{\sigma}^2.$$
Now recall the formula:
$$E\big((X-c)^2\big) = \mathrm{Var}(X) + (E(X)-c)^2 \quad\Longleftrightarrow\quad \mathrm{Var}(X) = E\big((X-c)^2\big) - (E(X)-c)^2.$$
Use this with $X = \hat{X}$ defined above:
$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n (x_i - c)^2 - (\bar{x} - c)^2.$$
Now the question is: what is the bias of $\hat{\sigma}^2$? For this we replace each $x_i$ by $X_i$, where the $X_i$'s are i.i.d. with mean $\mu$ and variance $\sigma^2$. The resulting expression is
$$\hat{\sigma}^2(X) = \frac{1}{n}\sum_{i=1}^n (X_i - \bar{X})^2.$$
We are interested in $E(\hat{\sigma}^2(X))$. We want to know whether this equals $\sigma^2$ or not.
Now use the formula above with $c = \mu$ and with $x$ replaced by $X$, and take $E$ on both sides.
$$\begin{aligned}
E(\hat{\sigma}^2) &= E\left(\frac{1}{n}\sum_{i=1}^n (X_i-\mu)^2\right) - E\big((\bar{X}-\mu)^2\big) \\
&= \frac{1}{n}\sum_{i=1}^n \mathrm{Var}(X_i) - \mathrm{Var}(\bar{X}) \\
&= \frac{1}{n}\sum_{i=1}^n \mathrm{Var}(X_i) - \frac{\sigma^2}{n} \qquad (\text{since the } X_i \text{ are i.i.d., } \mathrm{Var}(\bar{X}) = \tfrac{\sigma^2}{n}) \\
&= \sigma^2 - \frac{1}{n}\sigma^2.
\end{aligned}$$
We have proved that $E(\hat{\sigma}^2) = \left(1 - \frac{1}{n}\right)\sigma^2$. The estimator is biased, with bias $-\frac{1}{n}\sigma^2$ (it underestimates $\sigma^2$ on average).
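As a quick sanity check, here is a minimal simulation sketch (assuming numpy is available; the sample size $n$, the value of $\sigma^2$ and the number of trials are arbitrary choices) estimating $E(\hat{\sigma}^2)$ and comparing it with $(1-\frac{1}{n})\sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma2, trials = 5, 4.0, 200_000

# Draw `trials` samples of size n from N(0, sigma2) and compute the
# biased sample variance (1/n) * sum (x_i - x_bar)^2 for each sample.
samples = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(trials, n))
sigma2_hat = samples.var(axis=1, ddof=0)   # ddof=0 gives the 1/n version

print("average of sigma2_hat:", sigma2_hat.mean())   # close to 3.2
print("(1 - 1/n) * sigma2:   ", (1 - 1/n) * sigma2)  # exactly 3.2
```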
Example 3.5.3, Page 100
$(X,Y)$ has density $f(x,y) = 2$ if $0 \le y \le x \le 1$, and $0$ otherwise.
First compute the joint CDF of (X,Y),
General definition
$$F(u,v) = P(X \le u,\, Y \le v) = \int_{-\infty}^{u}\left(\int_{-\infty}^{v} f(x,y)\,dy\right)dx = \int_{0}^{u}\left(\int_{0}^{v} f(x,y)\,dy\right)dx.$$
For a fixed $x$, the density is nonzero only for $0 \le y \le x$, so the inner integral is
$$\int_{0}^{v} f(x,y)\,dy = \int_{0}^{\min(v,x)} 2\,dy = 2\min(v,x).$$
Now we integrate between $0$ and $u$ with respect to $x$. If $u > v$,
$$\int_{0}^{u} 2\min(v,x)\,dx = \int_{0}^{v} 2x\,dx + \int_{v}^{u} 2v\,dx = v^2 + 2v(u-v).$$
So we have proved that when $u > v$,
$$F(u,v) = v^2 + 2v(u-v).$$
If $u < v$, we compute the same integral:
$$\int_{0}^{u} 2\min(v,x)\,dx = \int_{0}^{u} 2x\,dx = u^2 \qquad (0 < u < v).$$
This proves that $F(u,v) = u^2$ when $u < v$.
Now compute the marginal densities of $X$ and $Y$.
In general, the marginal CDF of $X$ is $F_X(x) = F(x, +\infty)$ and the marginal density is $f_X(x) = \frac{d}{dx} F_X(x)$; similarly $f_Y(y) = \frac{d}{dy} F(+\infty, y)$. (Note that $\frac{\partial F}{\partial x}(x,y)$ at a fixed $y$ equals $2y$, a constant, when $x > y$, and $2x$ when $x < y$; to get the marginal we must let $y$ go to its largest value.)
Here $X \le 1$ and $Y \le 1$, so $F_X(x) = F(x,1) = x^2$ (the case $u \le v$) and
$$f_X(x) = 2x, \qquad 0 \le x \le 1.$$
Similarly $F_Y(y) = F(1,y) = y^2 + 2y(1-y)$ (the case $u \ge v$), so
$$f_Y(y) = 2 - 2y = 2(1-y), \qquad 0 \le y \le 1.$$
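Here is a small Monte Carlo sketch (assuming numpy; the test points are arbitrary) that checks the joint CDF and the marginal CDFs computed above, using rejection sampling from the triangle $0 \le y \le x \le 1$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Rejection sampling: uniform points on the unit square, kept when y <= x,
# are distributed with density 2 on the triangle {0 <= y <= x <= 1}.
pts = rng.uniform(size=(400_000, 2))
x, y = pts[:, 0], pts[:, 1]
keep = y <= x
x, y = x[keep], y[keep]

# Joint CDF at a point with u > v, e.g. (u, v) = (0.8, 0.3):
u, v = 0.8, 0.3
print("empirical F(u,v):     ", np.mean((x <= u) & (y <= v)))
print("formula v^2 + 2v(u-v):", v**2 + 2*v*(u - v))

# Marginal CDFs: P(X <= x0) = x0^2 and P(Y <= y0) = 2*y0 - y0^2.
x0, y0 = 0.5, 0.5
print("empirical P(X <= 0.5):", np.mean(x <= x0), "vs", x0**2)
print("empirical P(Y <= 0.5):", np.mean(y <= y0), "vs", 2*y0 - y0**2)
```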
Special continuous distributions - Chapter 4
Definition: X is a normal with parameters 0 and 1 if it has this density
$$f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}.$$
Facts:
- $E(X) = 0$, because the density is even (symmetric about $0$);
- $\mathrm{Var}(X) = 1$ (prove this using one integration by parts; a sketch follows this list).
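A sketch of that integration by parts (take $u = x$ and $dv = x e^{-x^2/2}\,dx$, so $v = -e^{-x^2/2}$):
$$\mathrm{Var}(X) = \int_{-\infty}^{\infty} x^2 \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx = \frac{1}{\sqrt{2\pi}}\left(\Big[-x e^{-x^2/2}\Big]_{-\infty}^{\infty} + \int_{-\infty}^{\infty} e^{-x^2/2}\,dx\right) = \frac{1}{\sqrt{2\pi}}\left(0 + \sqrt{2\pi}\right) = 1.$$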
Notation: X∼N(0,1).
Definition: X is normal with parameters μ and σ2 if it has the density
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{1}{2\sigma^2}(x-\mu)^2\right).$$
Facts:
- $E(X) = \mu$
- $\mathrm{Var}(X) = \sigma^2$
Notation: $X \sim N(\mu, \sigma^2)$.
Proof: comes from the case N(0,1) by using change of variables.
Fact: let X∼N(0,1), let μ and σ be fixed. Let Y=μ+σX. Then Y∼N(μ,σ2).
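A sketch of the change of variables behind this fact: for $\sigma > 0$ and $Y = \mu + \sigma X$,
$$F_Y(y) = P(\mu + \sigma X \le y) = P\left(X \le \frac{y-\mu}{\sigma}\right) = F_X\left(\frac{y-\mu}{\sigma}\right),$$
so, differentiating in $y$,
$$f_Y(y) = \frac{1}{\sigma} f_X\left(\frac{y-\mu}{\sigma}\right) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y-\mu)^2}{2\sigma^2}\right),$$
which is the $N(\mu, \sigma^2)$ density.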
Fact: let $X$ and $X'$ be two independent r.v.'s, distributed $N(\mu, \sigma^2)$ and $N(\mu', \sigma'^2)$ respectively. Then
$$Y = X + X' \sim N(\mu + \mu', \sigma^2 + \sigma'^2).$$
Moral of the story: the class of normal r.v.'s is stable under linear combinations.
Example 1
Let $X \sim N(1,1)$, $Y \sim N(0,4)$, $Z \sim N(-2,1)$. Assume they are independent. Let $V = 2X + 3 + Y - 4Z$. Find $E(V)$ and $\mathrm{Var}(V)$.
A: By the above moral, V should be normal.
$E(V) = 2E(X) + 3 + E(Y) - 4E(Z) = 2 + 3 + 0 + 8 = 13$, and by independence $\mathrm{Var}(V) = 4\mathrm{Var}(X) + \mathrm{Var}(Y) + 16\mathrm{Var}(Z) = 4 + 4 + 16 = 24$. So $V \sim N(13, 24)$.
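A quick simulation sketch (assuming numpy; $10^6$ draws is an arbitrary choice) confirming these values:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 1_000_000

# X ~ N(1, 1), Y ~ N(0, 4), Z ~ N(-2, 1), independent.
# numpy's normal() takes the standard deviation, so sqrt(4) = 2 for Y.
X = rng.normal(1.0, 1.0, N)
Y = rng.normal(0.0, 2.0, N)
Z = rng.normal(-2.0, 1.0, N)

V = 2*X + 3 + Y - 4*Z
print("E(V)   ~", V.mean())   # about 13
print("Var(V) ~", V.var())    # about 24
```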
Example 2
Let X∼N(μ,σ2), let
$$Z = \frac{X - \mu}{\sigma},$$
so $E(Z) = 0$ and $\mathrm{Var}(Z) = 1$, and $Z \sim N(0,1)$ because $Z$ is a linear transformation of $X$.
Joint normal distribution (multivariate normal)
Definition: The vector $(X_1, X_2, \ldots, X_n)$ is normal with mean $\vec{\mu} \in \mathbb{R}^n$ and covariance matrix $C$ (here $\vec{\mu} = \{\mu_i\}_{i=1}^n$, $C = \{c_{ij}\}_{i,j=1}^n$, $E(X_i) = \mu_i$, $\mathrm{Cov}(X_i, X_j) = c_{ij}$, and sometimes people use the letter $Q$ or $\Sigma$ instead of $C$) if its joint density is
$$f(x_1, x_2, \ldots, x_n) = \frac{1}{(2\pi)^{n/2}} \frac{1}{\sqrt{\det C}} \exp\left(-\frac{1}{2}(x-\mu)^T C^{-1} (x-\mu)\right).$$
(Notice: $x$ and $\mu$ are column vectors.)
The stuff in exp() is
$$(x-\mu)^T C^{-1} (x-\mu) = \sum_{j=1}^{n}\sum_{i=1}^{n} (x_i - \mu_i)\,(C^{-1})_{ij}\,(x_j - \mu_j).$$
Simplest case: $n = 1$, $C = (\sigma^2)$, so $\det(C) = \sigma^2$ and $\frac{1}{\sqrt{\det(C)}} = \frac{1}{\sqrt{\sigma^2}} = \frac{1}{\sigma}$.
So
$$(x-\mu)^T C^{-1} (x-\mu) = \frac{(x-\mu)^2}{\sigma^2}.$$
This matches the $N(\mu, \sigma^2)$ density $f(x)$ above.
Now take $n = 2$ with independent $X_1$ and $X_2$:
$$\det(C) = \sigma_1^2 \cdot \sigma_2^2,$$
$$\begin{aligned}
f(x_1, x_2) &= \frac{1}{2\pi\sqrt{\sigma_1^2 \sigma_2^2}} \exp\left(-\frac{1}{2}\sum_{j=1}^{2}\sum_{i=1}^{2} (x_i-\mu_i)\,(C^{-1})_{ij}\,(x_j-\mu_j)\right) \\
&= \frac{1}{2\pi\sqrt{\sigma_1^2 \sigma_2^2}} \exp\left(-\frac{1}{2}\sum_{i=1}^{2} (x_i-\mu_i)\frac{1}{\sigma_i^2}(x_i-\mu_i)\right) \\
&= \frac{1}{2\pi\sqrt{\sigma_1^2 \sigma_2^2}} \exp\left(-\frac{1}{2}\left(\frac{(x_1-\mu_1)^2}{\sigma_1^2} + \frac{(x_2-\mu_2)^2}{\sigma_2^2}\right)\right).
\end{aligned}$$
In general, if $X_1, X_2, \ldots, X_n$ are independent normals, $N(\mu_i, \sigma_i^2)$, $i = 1, 2, \ldots, n$, then
$$f(x_1, x_2, \ldots, x_n) = \frac{1}{(2\pi)^{n/2}\prod_{i=1}^n \sigma_i} \exp\left(-\frac{1}{2}\left\|\left(\frac{x_i-\mu_i}{\sigma_i}\right)_{i=1,\ldots,n}\right\|_{\mathrm{eucl}}^2\right) = \frac{1}{(2\pi)^{n/2}\prod_{i=1}^n \sigma_i} \exp\left(-\frac{1}{2}\sum_{i=1}^n \left(\frac{x_i-\mu_i}{\sigma_i}\right)^2\right).$$
We recognize from this that if $\mathrm{Cov}(X_i, X_j) = 0$ and $(X_i, X_j)$ is bivariate normal, then $X_i$ and $X_j$ are independent.
In general, $\mathrm{Cov}(X_i, X_j) = 0$ does not imply that $X_i$ and $X_j$ are independent (for example, if $X \sim N(0,1)$ and $Y = SX$ with $S = \pm 1$ a fair random sign independent of $X$, then $\mathrm{Cov}(X,Y) = 0$ but $X$ and $Y$ are not independent; $(X,Y)$ is not bivariate normal). But it does when $(X_i, X_j)$ is bivariate normal.
Question: we want to create a multivariate normal vector $Y$ with covariance matrix $C$, starting from a vector $X = (X_1, X_2, \ldots, X_n)$ where all the $X_i$'s are i.i.d. $N(0,1)$ ("standard normals").
(It is easy to generate such an $X$ using the Box-Muller transformation; a sketch follows.)
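A minimal sketch of the Box-Muller transformation (assuming numpy; not the implementation from the course): two independent uniforms on $(0,1)$ are turned into two independent $N(0,1)$ variables.

```python
import numpy as np

def box_muller(n, rng=None):
    """Return 2*n i.i.d. N(0,1) samples built from n pairs of uniforms."""
    if rng is None:
        rng = np.random.default_rng()
    u1 = rng.uniform(size=n)
    u2 = rng.uniform(size=n)
    r = np.sqrt(-2.0 * np.log(1.0 - u1))   # 1 - u1 avoids log(0)
    z1 = r * np.cos(2.0 * np.pi * u2)      # first standard normal
    z2 = r * np.sin(2.0 * np.pi * u2)      # second standard normal
    return np.concatenate([z1, z2])

x = box_muller(50_000)
print(x.mean(), x.var())   # approximately 0 and 1
```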
How to create a Y from this X?
Answer: Recall the notion of square-root of a matrix (C is positive definite).
There exists a matrix $M$ which is "$\sqrt{C}$" (linear algebra: $M^T M = M M^T = C$).
Consider
$$Z = MX.$$
We know (by stability under linear combinations) that $Z$ is multivariate normal.
Note
$$E(Z) = M\,E(X) = 0.$$
What about $\mathrm{Cov}(Z)$?
$$\begin{aligned}
Q_{ij} = \mathrm{Cov}(Z_i, Z_j) = E[Z_i Z_j] &= E\left(\left(\sum_{k=1}^n M_{ik} X_k\right)\left(\sum_{l=1}^n M_{jl} X_l\right)\right) \\
&= \sum_{k=1}^n \sum_{l=1}^n M_{ik} M_{jl}\, E(X_k X_l) \\
&= \sum_{k=1}^n M_{ik} M_{jk} \qquad (\text{since } E(X_k X_l) = 1 \text{ if } k = l \text{ and } 0 \text{ otherwise}) \\
&= (M M^T)_{ij} = C_{ij}.
\end{aligned}$$
So we see that $Y = Z = MX = \text{"}\sqrt{C}\text{"}\,X$ has the desired covariance matrix $C$.
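A sketch of this construction in code (assuming numpy; here the Cholesky factor of $C$ plays the role of the square root $M$, so $M M^T = C$):

```python
import numpy as np

rng = np.random.default_rng(3)

# Target covariance matrix C (positive definite); target mean is 0.
C = np.array([[2.0, 0.8],
              [0.8, 1.0]])

M = np.linalg.cholesky(C)    # lower-triangular M with M @ M.T == C

# X has i.i.d. N(0,1) entries; each column of Z = M X is multivariate
# normal with covariance matrix C.
X = rng.standard_normal(size=(2, 200_000))
Z = M @ X

print(np.cov(Z))             # should be close to C
```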
Exercise: For n=2, find a square-root for a 2×2 matrix. There should be a place in the book where the author does that in hiding for the purpose of creating a bivariate normal.
Gamma distribution - Chapter 4.3
Definition: A r.v. X is Gamma with shape parameter α and scale parameter θ if it has this density
$$f(x) = c\left(\frac{x}{\theta}\right)^{\alpha-1} \theta^{-1} \exp\left(-\frac{x}{\theta}\right), \qquad x > 0,$$
where the constant is $c = \frac{1}{\Gamma(\alpha)}$; if $\alpha = n$ is an integer, $\Gamma(n) = (n-1)!$.
Notation: X∼Γ(α,θ).
Fact:
- When $\alpha = 1$: $X \sim \mathrm{Gamma}(1,\theta) = \mathrm{Exp}(\lambda = \frac{1}{\theta})$, the exponential distribution with rate $\frac{1}{\theta}$.
- Consider a sequence $X_1, X_2, \ldots, X_n$ which are $\sim \Gamma(\alpha_1,\theta), \Gamma(\alpha_2,\theta), \ldots, \Gamma(\alpha_n,\theta)$. Assume they are independent. Then, with $X = X_1 + X_2 + \cdots + X_n$, we get $X \sim \Gamma(\alpha_1 + \alpha_2 + \cdots + \alpha_n, \theta)$.
Consequently, applying the above with $\alpha_1 = \alpha_2 = \cdots = \alpha_n = 1$, the sum of $n$ i.i.d. r.v.'s $\sim \mathrm{Exp}(\lambda = \frac{1}{\theta})$ is a r.v. $\sim \Gamma(n, \theta)$ (a simulation sketch follows this list).
Specifically, this proves exactly that the "$n$th arrival" time of a Poisson($\lambda$) process is a $\Gamma(n, \frac{1}{\lambda})$ r.v.
- $E(X) = n\theta$, because $E(X_i) = \frac{1}{\lambda} = \theta$
- Var(X)=nθ2
- When α is not an integer, α∉N, let X∼Γ(α,θ), E(X)=αθ, Var(X)=αθ2. (Exercise: prove this at home).
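A simulation sketch (assuming numpy; $n$, $\theta$ and the number of trials are arbitrary choices) checking that the sum of $n$ i.i.d. $\mathrm{Exp}(\lambda = 1/\theta)$ r.v.'s behaves like $\Gamma(n, \theta)$, with mean $n\theta$ and variance $n\theta^2$:

```python
import numpy as np

rng = np.random.default_rng(4)
n, theta, trials = 4, 2.0, 200_000

# Sum of n i.i.d. exponentials with scale theta (rate lambda = 1/theta).
exp_sums = rng.exponential(scale=theta, size=(trials, n)).sum(axis=1)

# Direct Gamma(n, theta) samples for comparison (numpy uses shape, scale).
gamma_draws = rng.gamma(shape=n, scale=theta, size=trials)

print("means:    ", exp_sums.mean(), gamma_draws.mean(), "expected", n * theta)
print("variances:", exp_sums.var(), gamma_draws.var(), "expected", n * theta**2)
```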