STT 861 Theory of Prob and STT I Lecture Note - 13
2017-11-29
Recap of the linear predictor; almost sure convergence, convergence in probability, convergence in distribution; central limit theorem, theorem of DeMoivre-Laplace.
Portal to all the other notes
- Lecture 01 - 2017.09.06
- Lecture 02 - 2017.09.13
- Lecture 03 - 2017.09.20
- Lecture 04 - 2017.09.27
- Lecture 05 - 2017.10.04
- Lecture 06 - 2017.10.11
- Lecture 07 - 2017.10.18
- Lecture 08 - 2017.10.25
- Lecture 09 - 2017.11.01
- Lecture 10 - 2017.11.08
- Lecture 11 - 2017.11.15
- Lecture 12 - 2017.11.20
- Lecture 13 - 2017.11.29 -> This post
- Lecture 14 - 2017.12.06
Lecture 13 - Nov 29 2017
Linear Prediction
Recall X, Y; let
g(x)=E[Y|X=x]
The r.v. g(X) is the best predictor of Y given X in the least-squares sense: g minimizes E\left((g(X)-Y)^2\right) = MSE.
But what about making this MSE as small as possible when g is restricted to be linear? Use the notation h(x)=a+bx (instead of g).
We want to minimize
MSE=E\left((Y-(a+bX))^2\right)
Find a and b to make this as small as possible. Let
Z_Y=\frac{Y-\mu_Y}{\sigma_Y}, \quad Z_X=\frac{X-\mu_X}{\sigma_X}
We also know E(Z_X Z_Y)=Corr(X,Y)=\rho, and
Y-(a+bX)=(Z_Y-cZ_X)\sigma_Y+(d-a)
where c=\frac{b\sigma_X}{\sigma_Y} and d=\mu_Y-b\mu_X,
\begin{align*} MSE &=\sigma_Y^2 E\left((Z_Y-cZ_X)^2\right)+(d-a)^2 \\ &=\sigma_Y^2(1-2c\rho+c^2)+(d-a)^2 \\ &=\sigma_Y^2\left(1-\rho^2+(c-\rho)^2\right)+(d-a)^2 \end{align*}
We see immediately that this is minimal for a=d and c=\rho.
Therefore,
b=\frac{\rho\sigma_Y}{\sigma_X}
This answers the question of what the best linear predictor of Y given X is, in the mean-square sense.
We see the smallest MSE is therefore \sigma_Y^2(1-\rho^2).
Therefore, we see that the proportion of Y’s variance which is not explained by X is
\frac{MSE}{\sigma_Y^2}=1-\rho^2
Finally, the proportion of Y's variance which is explained by X is \rho^2.
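As a quick numerical sanity check, here is a minimal simulation sketch (Python with numpy; the simulated distribution of (X, Y) and all variable names are my own choices, not from the lecture). It compares the coefficients a=\mu_Y-b\mu_X and b=\rho\sigma_Y/\sigma_X with an ordinary least-squares fit, and checks that the residual MSE is close to \sigma_Y^2(1-\rho^2).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Simulate a correlated pair (X, Y): Y = 2 + 3*X + noise (choices are illustrative).
X = rng.normal(1.0, 2.0, size=n)
Y = 2.0 + 3.0 * X + rng.normal(0.0, 4.0, size=n)

mu_X, mu_Y = X.mean(), Y.mean()
sigma_X, sigma_Y = X.std(), Y.std()
rho = np.corrcoef(X, Y)[0, 1]

# Best linear predictor h(x) = a + b*x from the formulas above.
b = rho * sigma_Y / sigma_X
a = mu_Y - b * mu_X

# Compare with a brute-force least-squares fit of Y on X.
b_ols, a_ols = np.polyfit(X, Y, deg=1)
print("formula: a=%.3f b=%.3f   OLS: a=%.3f b=%.3f" % (a, b, a_ols, b_ols))

# The smallest achievable MSE should be sigma_Y^2 * (1 - rho^2).
mse = np.mean((Y - (a + b * X)) ** 2)
print("MSE=%.3f   sigma_Y^2*(1-rho^2)=%.3f" % (mse, sigma_Y**2 * (1 - rho**2)))
```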
Chapter 6 - Convergences
Definition: We say that the sequence of r.v.'s (X_n)_{n\in\mathbb{N}} converges almost surely ("a.s.") to the r.v. X if \lim_{n\rightarrow\infty}X_n = X with probability 1. In other words,
P\left(\lim_{n\rightarrow\infty}|X_n-X| = 0\right) = 1
Definition: (A weaker notion of convergence) A sequence of r.v.'s (X_n)_{n\in\mathbb{N}} converges in probability to X if
\forall \varepsilon >0: P(|X_n-X|>\varepsilon) \rightarrow 0 \text{ as } n\rightarrow\infty
Note: [Convince yourself as an exercise at home] convergence in probability is weaker (easier to achieve) than convergence a.s.
Definition: (an even weaker notion) Let the sequence of r.v.'s (X_n)_{n\in\mathbb{N}} be as above, but now let F be the CDF of some distribution. We say X_n converges in distribution to the law F if
F_{X_n}(x) \rightarrow F(x)
as n\rightarrow\infty for every fixed x at which F is continuous.
Note: unlike the previous two notions, here there is no need for a limiting r.v. X, and the X_n's do not need to share a probability space with X or with one another.
Example 1
Let Y_i, i=1,2,3,\ldots be i.i.d. with mean \mu and finite variance \sigma^2. We proved that, with
X_n= \frac{Y_1 + Y_2 + \cdots + Y_n}{n}
then
P(|X_n-\mu|>\varepsilon) \leq \frac{\sigma^2}{\varepsilon^2n}(By Chebyshev’s inequality)
The right-hand side goes to 0 as n\rightarrow \infty; this proves that X_n\rightarrow\mu in probability.
Note: assuming only that \mu exists ( \sigma^2 could be infinite), the conclusion still holds. See W. Feller's book (1950).
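Here is a small illustration of the Chebyshev bound above (a sketch in Python/numpy; the choice of Uniform(0,1) for the Y_i, of \varepsilon, and of the sample sizes is mine). It compares the empirical frequency of |X_n-\mu|>\varepsilon with \sigma^2/(\varepsilon^2 n).

```python
import numpy as np

rng = np.random.default_rng(1)
mu, var, eps = 0.5, 1 / 12, 0.05   # Y_i ~ Uniform(0,1): mean 1/2, variance 1/12
reps = 10_000

for n in (50, 200, 1000):
    Y = rng.uniform(0, 1, size=(reps, n))
    X_n = Y.mean(axis=1)                    # sample mean for each replication
    freq = np.mean(np.abs(X_n - mu) > eps)  # empirical P(|X_n - mu| > eps)
    bound = var / (eps**2 * n)              # Chebyshev bound
    print(f"n={n:5d}  empirical={freq:.4f}  Chebyshev bound={bound:.4f}")
```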
Let X_n=Uniform\{1,2,\ldots,n\}. Its CDF is a step function with an increment of 1/n at each step:
F_{X_n}(x) =\frac{1}{n}\lfloor x\rfloor \text{ for } x\in [0,n]
(where \lfloor x\rfloor is the largest integer not exceeding x).
For fixed x\in\mathbb{R}^+, as n\rightarrow\infty, F_{X_n}(x)\rightarrow 0. \quad (\star)
Since the limit in (\star) is identically 0, which is not the CDF of any random variable, X_n does not converge in distribution, and therefore X_n cannot converge in any stronger sense (in probability or a.s.).
How about Y_n=Uniform \{1/n, 2/n, …, n/n\} ?
Since
F_{Y_n}(x)\rightarrow x
for x\in[0,1], and x\mapsto x is the CDF of Uniform(0,1) on that interval, Y_n converges in distribution to Uniform(0,1).
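As a small check (a Python/numpy sketch; the grid of x values and the values of n are my choices), the CDF of Y_n can be computed exactly as F_{Y_n}(x)=\lfloor nx\rfloor/n on [0,1] and compared with the Uniform(0,1) CDF.

```python
import numpy as np

def cdf_Yn(x, n):
    # Exact CDF of the uniform distribution on {1/n, 2/n, ..., n/n}:
    # P(Y_n <= x) = floor(n*x)/n, clipped to [0, 1].
    return np.clip(np.floor(n * x) / n, 0.0, 1.0)

xs = np.linspace(0.0, 1.0, 11)
for n in (5, 50, 500):
    err = np.max(np.abs(cdf_Yn(xs, n) - xs))   # distance to the Uniform(0,1) CDF
    print(f"n={n:4d}  max |F_Yn(x) - x| on the grid = {err:.4f}")
```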
Example 2
Let U_i\sim Unif(0,1), i.i.d. Let M_n =\max_{i=1,2,\ldots,n}(U_i). We expect that M_n\rightarrow 1 in some sense. Let's prove convergence in probability:
Let \varepsilon>0 , \begin{align*} P(\vert M_n-1\vert > \varepsilon) &= P(1-M_n>\varepsilon) \\ &=P(M_n<1-\varepsilon)\\ &=P(\max_{i=1,2,...,n}U_i<1-\varepsilon) \\ &=P(\forall i: U_i<1-\varepsilon)\\ &=P(\cap_{i=1}^{n}\{U_i<1-\varepsilon\}) \\ &=\prod_{i=1}^{n}P(U_i<1-\varepsilon) =(1-\varepsilon)^n\\ \lim P(\vert M_n-1 \vert >\varepsilon) &=0 \end{align*}
We proved that M_n\rightarrow1 in probability.
Now consider Y_n=(1-M_n)n. Let's look at the CDF of Y_n.
\begin{align*} 1-F_{Y_n}(y) & = P((1-M_n)n>y) \\ &= P(1-M_n>\frac{y}{n}) \\ &=P(M_n<1-y/n) \\ &=P(\cap_{i=1}^{n} \{U_i<1-y/n\}) \\ &=(1-y/n)^n\\ &\rightarrow e^{-y} \end{align*}
So F_{Y_n}(y)\rightarrow 1-e^{-y}, which is the CDF of the exponential distribution with rate 1. Thus Y_n \rightarrow Exp(\lambda = 1) in distribution.
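Both limits can be seen numerically with a short simulation (a Python/numpy sketch; n, the number of replications, and the test points are my choices): the tail probability of Y_n = n(1-M_n) should be close to e^{-y}.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 500, 20_000

U = rng.uniform(0, 1, size=(reps, n))
M_n = U.max(axis=1)            # maximum of n Uniform(0,1) variables
Y_n = n * (1 - M_n)

# Convergence of M_n to 1 in probability: P(|M_n - 1| > eps) = (1 - eps)^n.
eps = 0.01
print("P(|M_n - 1| > 0.01) ~", np.mean(np.abs(M_n - 1) > eps),
      "  theory:", (1 - eps) ** n)

# Convergence of Y_n to Exp(1) in distribution: P(Y_n > y) -> e^{-y}.
for y in (0.5, 1.0, 2.0):
    print(f"y={y}:  P(Y_n > y) ~ {np.mean(Y_n > y):.4f}   e^-y = {np.exp(-y):.4f}")
```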
Theorem (6.3.6): Let (X_n)_{n\in\mathbb{N}} be a sequence of r.v.'s. If X_n has MGF M_{X_n}(t) and M_{X_n}(t)\rightarrow M_X(t) for all t in a neighborhood of 0, then X_n \rightarrow X in distribution.
Example 3
Let X_n be Bin( n,p=\lambda/n ). We know that the PMF of X_n converges to the PMF of Poisson( \lambda ). Let's verify it again with MGFs. We will find
M_{X_n}(t)=\left(1+\frac{\lambda}{n}(e^t-1)\right)^n \rightarrow e^{\lambda (e^t-1)}
and here we recognize the limit as the MGF of Poisson( \lambda ). By Theorem 6.3.6, X_n\rightarrow Poiss(\lambda) in distribution.
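A minimal numerical check of this MGF convergence (a Python sketch; \lambda and the evaluation point t are my choices): the MGF of Bin(n,\lambda/n) approaches the Poisson(\lambda) MGF as n grows.

```python
import numpy as np

lam, t = 3.0, 0.5                             # illustrative choices

def mgf_binomial(n, p, t):
    # MGF of Bin(n, p): (1 - p + p*e^t)^n
    return (1 - p + p * np.exp(t)) ** n

mgf_poisson = np.exp(lam * (np.exp(t) - 1))   # MGF of Poisson(lam)

for n in (10, 100, 1000, 10_000):
    print(f"n={n:6d}  Bin(n, lam/n): {mgf_binomial(n, lam / n, t):.6f}"
          f"   Poisson(lam): {mgf_poisson:.6f}")
```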
Let X_i, i=1,2,…,n be i.i.d with E(X_i)=\mu and variance Var(X_i)=\sigma^2 .
Let Z_i =\frac{X_i-\mu}{\sigma} , S_n=\frac{\sum_{i=1}^{n}Z_i}{\sqrt{n}} ,
(We divide by \sqrt{n} as a standardization, and we know Var(S_n)=1.)
Central Limit Theorem (CLT)
As n\rightarrow\infty, S_n\rightarrow Normal(0,1) in distribution.
This means:
\lim P(S_n\leq x) = F_{N(0,1)}(x) = \int_{-\infty}^{x}\frac{1}{\sqrt{2\pi}}\exp(-z^2/2) dz
The proof combines Taylor's formula up to order 2 (applied to the MGF) with Theorem 6.3.6.
Next, consider
Y_n = \frac{(\sum_{i=1}^{n}X_i)-n\mu}{\sqrt{n}}=\sigma S_n
A corollary of the CLT says this:
Y_n\rightarrow N(0,\sigma^2)
in distribution.
Proof:
\begin{align*} P(Y_n\leq x) & = P(\sigma S_n\leq x) \\ &=P(S_n\leq\frac{x}{\sigma}) \\ &\rightarrow\int_{-\infty}^{x/\sigma}\frac{1}{\sqrt{2\pi}}\exp(-z^2/2) dz \end{align*}
and conclude by a change of variable (the last integral equals F_{N(0,\sigma^2)}(x)).
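A simulation sketch of the CLT (Python/numpy; the underlying distribution, here Exponential(1), and the sample sizes are my choices): the empirical CDF of S_n is compared with the standard normal CDF at a few points.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(3)
n, reps = 1_000, 10_000
mu, sigma = 1.0, 1.0                     # mean and sd of Exponential(1)

X = rng.exponential(1.0, size=(reps, n))
S_n = (X - mu).sum(axis=1) / (sigma * np.sqrt(n))   # standardized sum

def Phi(x):
    # Standard normal CDF.
    return 0.5 * (1 + erf(x / sqrt(2)))

for x in (-1.0, 0.0, 1.0, 2.0):
    print(f"x={x:+.1f}  P(S_n <= x) ~ {np.mean(S_n <= x):.4f}   Phi(x) = {Phi(x):.4f}")
```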
Theorem of DeMoivre-Laplace
Let X_i, i=1,2,3,\ldots,n be i.i.d. Bernoulli(p). Let W_n = X_1+X_2+\cdots+ X_n. We know W_n\sim Bin(n,p).
Let S_n =\frac{W_n-np}{\sqrt{np(1-p)}},
Therefore, by CLT, S_n\rightarrow N(0,1) as n\rightarrow\infty in distribution.
In other words, to be precise,
\lim P(S_n\leq x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}\exp(-z^2/2)dz
where
P\left(\frac{W_n-np}{\sqrt{np(1-p)}}\leq x\right) =P(W_n\leq x\sqrt{np(1-p)}+np)
The speed of convergence in the CLT is quantified by the Berry-Esseen theorem. For the Binomial CLT the convergence is much faster; a rule of thumb is that for p\in [1/10, 9/10] and n\geq 30, the CLT approximation is good to within about 1\% error.
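To see the DeMoivre-Laplace approximation in action, here is a short sketch (plain Python; the particular n, p, and cut-off values are my choices, chosen inside the rule-of-thumb range) comparing the exact Binomial CDF with the normal approximation.

```python
import math

def binom_cdf(k, n, p):
    # Exact P(W_n <= k) for W_n ~ Bin(n, p).
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

def Phi(x):
    # Standard normal CDF.
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

n, p = 100, 0.3                           # within the rule-of-thumb range
for k in (25, 30, 35):
    exact = binom_cdf(k, n, p)
    # DeMoivre-Laplace: P(W_n <= k) ~ Phi((k - n*p) / sqrt(n*p*(1-p)))
    approx = Phi((k - n * p) / math.sqrt(n * p * (1 - p)))
    print(f"k={k}:  exact = {exact:.4f}   normal approx = {approx:.4f}")
```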
Example 4
Let
M_n =\frac{U_1+U_2+\cdots+U_n}{n}
where the U_i's are i.i.d. Uniform(0, 1). We know, by the weak law of large numbers, that M_n\rightarrow \frac{1}{2} in probability as n\rightarrow\infty. But how spread out is M_n around 1/2? For example, can we estimate the chance that M_n is more than 0.02 away from its mean value 1/2?
Answer:
\begin{align*} P(|M_n-\frac{1}{2}|>0.02) & = 1- P(-0.02 < M_n-\frac{1}{2} <0.02) \\ &= 1-P(-0.02<\frac{\sum (U_i-1/2)}{n}<0.02) \\ &=1- P(-0.02\sqrt{n}<\frac{\sum (U_i-1/2)}{\sqrt{n}}<0.02\sqrt{n}) \\ &=1- P\left(-\frac{0.02\sqrt{n}}{\sqrt{1/12}}<\frac{\sum (U_i-1/2)}{\sqrt{n}\sqrt{1/12}}<\frac{0.02\sqrt{n}}{\sqrt{1/12}}\right) \\ &\approx 1-P\left(-\frac{0.02\sqrt{n}}{\sqrt{1/12}}<N(0,1)<\frac{0.02\sqrt{n}}{\sqrt{1/12}}\right) \end{align*}
(Here \sqrt{1/12} is the standard deviation of Uniform(0,1).) We will feel comfortable if n is large enough to make this probability at most 0.05, i.e. to make the probability of being within 0.02 of 1/2 at least 0.95. What value must n be, at least?
To achieve this, the right endpoint \frac{0.02\sqrt{n}}{\sqrt{1/12}} should be at least 1.96.
This value comes from
0.975 = P(N(0,1)\leq 1.96)
so 1.96 is known as the 97.5^{th} percentile of N(0,1). Therefore, we must take
\frac{0.02\sqrt{n}}{\sqrt{1/12}}\geq 1.96 \Rightarrow n\geq 800.33, \text{ i.e. } n\geq 801.
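To close the loop on Example 4, a simulation sketch (Python/numpy; the number of replications is my choice) checking that with n = 801 the probability that M_n is more than 0.02 away from 1/2 is indeed about 0.05 or less.

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 801, 20_000                 # n chosen from the bound n >= 801

U = rng.uniform(0, 1, size=(reps, n))
M_n = U.mean(axis=1)                  # M_n from Example 4: the sample mean
prob_far = np.mean(np.abs(M_n - 0.5) > 0.02)
print("P(|M_n - 1/2| > 0.02) ~", prob_far, "  target: <= 0.05")
```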