
STT 861 Theory of Probability and Statistics I Lecture Note - 6

2017-10-11

Hypergeometric distribution; Poisson Law; Brownian motion; continuous random variables, exponential distribution, cumulative distribution function, uniform distribution; expectation and variance of continuous random variables.

Portal to all the other notes

Lecture 06 - Oct 11 2017

Hypergeometric distribution (continued)

(See Page 66)

A hypergeometric r.v. X with parameters n,N,R:

Have a set of size N, a subset of size R (‘red’). Pick a sample of size n without replacement. Then X is the number of elements of type R (‘red’) in the sample.

Then for $k = 0, 1, \dots, n$,

$$P(X = k) = \frac{\binom{n}{k}\binom{N-n}{R-k}}{\binom{N}{R}}$$

Important: let $p = \frac{R}{N}$. Then

$$E[X] = np, \qquad \mathrm{Var}[X] = np(1-p)\,\frac{N-n}{N-1}$$

The factor $\frac{N-n}{N-1}$ is called the finite sample correction factor.
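As a quick sanity check (not part of the lecture), here is a minimal Python sketch comparing these formulas against `scipy.stats.hypergeom`; the values $N = 50$, $R = 20$, $n = 10$ are arbitrary illustrations.

```python
# Hypothetical parameters: population N = 50, R = 20 "red", sample size n = 10.
from scipy.stats import hypergeom

N, R, n = 50, 20, 10
rv = hypergeom(N, R, n)  # scipy's argument order: population size, # red, # draws

p = R / N
print(rv.mean(), n * p)                               # both 4.0
print(rv.var(), n * p * (1 - p) * (N - n) / (N - 1))  # both ~1.959
```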

Poisson Law

Let $N$ and $M$ be $\mathrm{Poi}(\lambda)$ and $\mathrm{Poi}(\mu)$ and independent. Then $N + M$ is $\mathrm{Poi}(\lambda + \mu)$.

(Think of it as arrivals in a store on 2 different days.)
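This additivity is easy to check by simulation. A minimal sketch, with illustrative values $\lambda = 2$, $\mu = 3$:

```python
# Check that Poi(2) + Poi(3) behaves like Poi(5): for a Poisson r.v.,
# mean and variance are both equal to the parameter.
import numpy as np

rng = np.random.default_rng(0)
lam, mu, trials = 2.0, 3.0, 100_000
total = rng.poisson(lam, trials) + rng.poisson(mu, trials)
print(total.mean(), total.var())  # both close to lam + mu = 5
```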

Let $N(t)$ be the number of arrivals in $[0, t]$. Then $N(1)$ is $\mathrm{Poi}(\lambda)$, and by the same construction $N(t)$ is $\mathrm{Poi}(\lambda t)$.

Now as $t$ runs from 0 to 1, we have a collection of r.v.'s $\{N(t) : t \in [0,1]\}$. This is the Poisson process with parameter $\lambda$.

Below are the properties of this $\mathrm{Poi}(\lambda)$ process. [A stochastic process, like $N$, is a random function where the rules of randomness are specified.]

The word stochastic comes from the Greek word “stochos” (στόχος). It means “target”.

It comes from the story of Aristotle and the target distribution.

Let $0 < s < t$. Then:

  1. $N(t) - N(s) \sim \mathrm{Poi}(\lambda(t - s))$
  2. $N(s)$ and $N(t) - N(s)$ are independent.

Vocabulary: (2) is called independence of increments, and (1) is usually called stationarity.

Remark: $N(t) + N(s)$ does not follow $\mathrm{Poi}(\lambda(t + s))$.

What is $N(t) + N(s)$? Write $N(t) + N(s) = [N(t) - N(s)] + 2N(s) \sim \mathrm{Poi}(\lambda(t - s)) + 2\,\mathrm{Poi}(\lambda s)$.

Here $2\,\mathrm{Poi}(\lambda s)$ is not $\mathrm{Poi}(2\lambda s)$: they have the same mean but different variances ($4\lambda s$ versus $2\lambda s$).

Remark: (1) and (2) define the probability distribution of the $\mathrm{Poi}(\lambda)$ process uniquely.
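Property (1) can also be checked by simulation, building $N$ from i.i.d. exponential inter-arrival times (a standard construction, not one introduced in the lecture so far). A sketch with illustrative parameter values:

```python
# Simulate a rate-lam Poisson process via Exp(1/lam) inter-arrival gaps and
# check that N(t) - N(s) has mean and variance close to lam * (t - s).
import numpy as np

rng = np.random.default_rng(1)
lam, s, t, trials = 4.0, 0.3, 0.8, 50_000

# 20 arrivals per trial is (with overwhelming probability) enough to pass t = 0.8.
arrivals = np.cumsum(rng.exponential(1 / lam, size=(trials, 20)), axis=1)
increments = np.sum((arrivals > s) & (arrivals <= t), axis=1)

print(increments.mean(), increments.var())  # both close to lam * (t - s) = 2.0
```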

Example 2.5.2 (Page 77)

The murder rate is 540/year. Assume the number of murders in the interval $[0, t]$, where $t$ is a proportion of a 360-day year, is a Poisson process $N(t)$.

Q1: What is the probability of two or more murders within 1 day, $P(N(1\text{ day}) \ge 2)$?

A1: $\lambda = \frac{540}{360} = 1.5$ per day.

$$P_1 = 1 - \sum_{k=0}^{1} e^{-\lambda}\frac{\lambda^k}{k!} = 1 - (e^{-1.5} + e^{-1.5} \times 1.5) = 0.4422$$

Q2: What is the probability of “2 or more murders on each of 3 consecutive days”?

A2: By independence of increments, $P_2 = P_1^3 = 0.4422^3 = 0.0865$.

Q3: What is the probability of “no murders for 5 days”?

A3: Since $1.5 \times 5 = \frac{15}{2}$, we get $P_3 = P(N(5\text{ days}) = 0) = P(\mathrm{Poi}(\tfrac{15}{2}) = 0) = e^{-15/2}$.

Q4: The rate during weekdays is 1.2/day and during weekends is 2.5/day. What is the probability of “10 or more murders in 1 week”?

A4: The weekdays contribute $5 \times 1.2 = 6$ and the weekend contributes $2 \times 2.5 = 5$, so

$$P_4 = P(N(\text{weekdays}) + N(\text{weekend}) \ge 10) = P(\mathrm{Poi}(6) + \mathrm{Poi}(5) \ge 10) = P(\mathrm{Poi}(11) \ge 10)$$

$$P_4 = 1 - \sum_{k=0}^{9} e^{-11}\frac{11^k}{k!} = 0.6595$$
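All four answers can be verified with `scipy.stats.poisson` (a quick check, matching the computations above):

```python
from scipy.stats import poisson

P1 = 1 - poisson.cdf(1, 1.5)   # Q1: two or more murders in one day
P2 = P1 ** 3                   # Q2: each of 3 consecutive days (independent increments)
P3 = poisson.pmf(0, 7.5)       # Q3: no murders in 5 days, parameter 1.5 * 5 = 15/2
P4 = 1 - poisson.cdf(9, 11)    # Q4: 10 or more in a week, parameter 5*1.2 + 2*2.5 = 11
print(P1, P2, P3, P4)          # ~0.4422, ~0.0865, ~0.00055, ~0.6595
```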

HW discussion (Problem 2.4.4 (a))

$X \sim \mathrm{NegBin}(r_1, p)$, $Y \sim \mathrm{NegBin}(r_2, p)$. Let $W = X + Y$. Find the distribution of $W$.

We know that $X = X_1 + X_2 + \cdots + X_{r_1}$, where the $X_i$'s are i.i.d. $\mathrm{Geom}(p)$. Similarly, $Y = Y_1 + \cdots + Y_{r_2}$, where the $Y_i$'s are i.i.d. $\mathrm{Geom}(p)$.

Now assume $X$ and $Y$ are independent. Then all the $X_i$'s and $Y_i$'s are independent, so $W = X_1 + \cdots + X_{r_1} + Y_1 + \cdots + Y_{r_2}$ is a sum of $r_1 + r_2$ i.i.d. $\mathrm{Geom}(p)$ r.v.'s. Therefore, by definition, $W$ is $\mathrm{NegBin}(r_1 + r_2, p)$.
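A simulation sketch of this conclusion (note: numpy's `negative_binomial` counts failures before the $r$-th success rather than total trials, but additivity holds in either convention; the parameter values are illustrative):

```python
# Check that NegBin(r1, p) + NegBin(r2, p) matches NegBin(r1 + r2, p)
# in mean and variance.
import numpy as np

rng = np.random.default_rng(2)
r1, r2, p, trials = 3, 5, 0.4, 100_000
w = rng.negative_binomial(r1, p, trials) + rng.negative_binomial(r2, p, trials)
ref = rng.negative_binomial(r1 + r2, p, trials)
print(w.mean(), ref.mean())  # both close to (r1 + r2) * (1 - p) / p = 12
print(w.var(), ref.var())    # both close to (r1 + r2) * (1 - p) / p**2 = 30
```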

Preview to Chapter 3 and Brownian motion

Let $X_1, X_2, \dots, X_n$ be a sequence of i.i.d. r.v.'s with $E[X_i] = 0$ and $\mathrm{Var}[X_i] = \sigma^2 > 0$.

Let $S_n = \bar{X}_n = \frac{X_1 + X_2 + \cdots + X_n}{n}$. Then $E[S_n] = 0$, and it turns out $S_n$ gets really small as $n \to \infty$:

$$\mathrm{Var}[S_n] = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n} \to 0$$

as $n \to \infty$.

In fact we have the weak law of large numbers:

$\forall \epsilon > 0$, $P(|S_n| > \epsilon) \to 0$ as $n \to \infty$.

Proof: By Chebyshev’s inequality:

$$P(|S_n| > \epsilon) \le \frac{E[|S_n|^2]}{\epsilon^2} = \frac{\sigma^2/n}{\epsilon^2} \to 0$$
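A simulation sketch of the weak law (illustrative choice: $X_i$ uniform on $[-1, 1]$, so $\sigma^2 = 1/3$):

```python
# Empirically, P(|S_n| > eps) shrinks toward 0 as n grows.
import numpy as np

rng = np.random.default_rng(3)
eps, trials = 0.1, 10_000
for n in [10, 100, 1000]:
    s_n = rng.uniform(-1, 1, size=(trials, n)).mean(axis=1)
    print(n, np.mean(np.abs(s_n) > eps))  # decreasing toward 0
```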

What about dividing $X_1 + X_2 + \cdots + X_n$ by something much smaller than $n$?

Let $W_n = \frac{X_1 + X_2 + \cdots + X_n}{\sqrt{n}}$. Then $E[W_n] = 0$ for every $n$, and $\mathrm{Var}[W_n] = \frac{n\sigma^2}{n} = \sigma^2$.

So is there perhaps some limiting behavior? Yes: the distribution of $W_n$ tends to a bell curve (the Normal distribution with variance $\sigma^2$).

Pick $t \in [0,1]$. Roughly speaking, take $m = m(t) \approx nt$.

Let

$$W_n(t) = \frac{X_1 + X_2 + \cdots + X_{m(t)}}{\sqrt{n}}$$

we find $E[W_n(t)] = 0$ and $\mathrm{Var}[W_n(t)] = \frac{m(t)}{n}\sigma^2 \approx t\sigma^2$.

The distribution of $W_n(t)$ converges to a bell curve with variance $t\sigma^2$.

The entire collection $\{W_n(t) : t \in [0,1]\}$ converges to a stochastic process called Brownian motion, $W$. It has these properties for $0 \le s < t$ (a simulation sketch follows the list):

  1. $E[W(t)] = 0$;
  2. $\mathrm{Var}[W(t) - W(s)] = \sigma^2(t - s)$;
  3. $W(t) - W(s)$ and $W(s)$ are independent;
  4. $W(t) - W(s)$ is Normal (bell curve).
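A minimal simulation sketch of this convergence (illustrative choice: $X_i = \pm 1$ coin flips, so $\sigma^2 = 1$):

```python
# The scaled random walk W_n(t) = (X_1 + ... + X_{floor(n t)}) / sqrt(n)
# has mean ~0 and variance ~t * sigma^2, matching Brownian motion.
import numpy as np

rng = np.random.default_rng(4)
n, trials, t = 1_000, 10_000, 0.5
m = int(n * t)
steps = rng.choice([-1.0, 1.0], size=(trials, m))
w_t = steps.sum(axis=1) / np.sqrt(n)
print(w_t.mean(), w_t.var())  # close to 0 and t * sigma^2 = 0.5
```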

Continuous Random Variables (Chapter 3)

Definition: A r.v. $X$ is said to be continuous with density $f$ if: $\forall a < b$,

$$P[a \le X \le b] = \int_a^b f(x)\,dx$$

Properties of densities:

  1. $f(x) \ge 0$, $\forall x$
  2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$

Example 1

Let $f(x) = c e^{-\lambda x}$ for $x \ge 0$, where $c$ and $\lambda$ are positive constants. How should we define $f(x)$ for $x < 0$?

Let's say $f(x) = 0$ for $x < 0$. Now do the integration:

$$\int_0^\infty c e^{-\lambda x}\,dx = \left[-\frac{c}{\lambda} e^{-\lambda x}\right]_0^\infty = \frac{c}{\lambda} = 1 \implies c = \lambda$$
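The same normalization can be checked symbolically (a sketch using sympy, not part of the lecture):

```python
# Solve integral(c * exp(-lam * x), x = 0..oo) = 1 for c.
import sympy as sp

x, c, lam = sp.symbols("x c lambda", positive=True)
total = sp.integrate(c * sp.exp(-lam * x), (x, 0, sp.oo))
print(total)                         # c/lambda
print(sp.solve(sp.Eq(total, 1), c))  # [lambda]
```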

Exponential distribution

Definition: The r.v. whose density is

$$f(x) = \begin{cases} 0 & x < 0 \\ \lambda e^{-\lambda x} & x \ge 0 \end{cases}$$

is called Exponential with parameter $\lambda$.

Notation: $X \sim \mathrm{Exp}(\lambda)$.

Cumulative distribution function (CDF)

Definition: The cumulative distribution function (CDF) of $X$ with density $f$ is defined as before:

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy$$

Example 2

If $X \sim \mathrm{Exp}(\lambda)$, then we find $F(x) = 0$ for $x \le 0$, and for $x > 0$:

$$F(x) = \int_0^x \lambda e^{-\lambda y}\,dy = 1 - e^{-\lambda x}$$

Note: the tail of $X$ is defined as $G(x) = P(X > x) = 1 - F(x)$.

For the Exponential, $G(x) = e^{-\lambda x}$.
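A quick numerical check of the CDF and the tail at an arbitrary point (illustrative values $\lambda = 1.5$, $x = 2$):

```python
# scipy parameterizes the exponential by scale = 1/lambda.
import numpy as np
from scipy.stats import expon

lam, x = 1.5, 2.0
print(expon(scale=1 / lam).cdf(x), 1 - np.exp(-lam * x))  # F(x), equal
print(expon(scale=1 / lam).sf(x), np.exp(-lam * x))       # tail G(x), equal
```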

Remark: If $X$ has density $f$ and $a \in \mathbb{R}$, then $P(X = a) = \int_a^a f(x)\,dx = 0$.

Uniform distribution

The r.v. $X$ is said to be uniform between two fixed values $a$ and $b$ ($a < b$) if its density is the constant $\frac{1}{b-a}$ on $[a, b]$ (and 0 elsewhere).

What about the CDF? It is 0 for $x \le a$, equals $\frac{x-a}{b-a}$ for $a \le x \le b$, and is 1 for $x \ge b$.


Expectation and variance of continuous random variable (Chapter 3.3)

Definition: let $X$ have density $f$; then its expectation is

$$E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$$

Example 3

$X \sim \mathrm{Exp}(\lambda)$: $E[X] = \int_0^\infty x\,\lambda e^{-\lambda x}\,dx$.

Integration by parts:

$$\int_0^\infty u\,dv = [uv]\Big|_0^\infty - \int_0^\infty v\,du$$

Best choice: use a $dv$ whose antiderivative $v$ is known, and a polynomial $u$ is good because $du$ has degree one less than $u$. Choice: $dv = \lambda e^{-\lambda x}\,dx$, $v = -e^{-\lambda x}$, $u = x$, $du = dx$.

$$E[X] = \left[x\left(-e^{-\lambda x}\right)\right]_0^\infty + \int_0^\infty e^{-\lambda x}\,dx = 0 + \frac{1}{\lambda} = \frac{1}{\lambda}$$
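The integration-by-parts result is easy to confirm numerically (a sketch with an illustrative $\lambda = 2$):

```python
# E[X] for X ~ Exp(lambda) via quadrature; should equal 1/lambda.
import numpy as np
from scipy.integrate import quad

lam = 2.0
mean, _ = quad(lambda y: y * lam * np.exp(-lam * y), 0, np.inf)
print(mean, 1 / lam)  # both 0.5
```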

Exercise: $X \sim \mathrm{Unif}[a,b]$; show that $E[X] = \frac{a+b}{2}$.

Definition: the variance of X with density f is defined the same as before:

$$\mathrm{Var}[X] = E[(X - E[X])^2] = E[X^2] - E[X]^2$$

where $E[X^2] = \int_{-\infty}^{\infty} x^2 f(x)\,dx$.

Exercise: Verify that $E[X^2] = \frac{2}{\lambda^2}$ for $X \sim \mathrm{Exp}(\lambda)$, and hence $\mathrm{Var}[X] = \frac{1}{\lambda^2}$.

Also verify that $\mathrm{Var}[X] = \frac{(b-a)^2}{12}$ for $X \sim \mathrm{Unif}(a,b)$.
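Both variance exercises can be checked against scipy's built-in moments (illustrative values $\lambda = 2$, $a = 1$, $b = 4$):

```python
from scipy.stats import expon, uniform

lam, a, b = 2.0, 1.0, 4.0
print(expon(scale=1 / lam).var(), 1 / lam**2)              # both 0.25
print(uniform(loc=a, scale=b - a).var(), (b - a)**2 / 12)  # both 0.75
```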


