Basic Information

Instructor: qz

Lecture 10: Probability Theory

The substantive content can be skipped in lecture and reviewed on your own afterwards.
With working hands and a working brain, scoring 80–90 is not a problem.
Come to the lecture and you will certainly understand; skip it and you will understand anyway.
Homework = meaningless stuff.

Applied mathematics that studies the likelihood of random events.

集合论 Set Theory

  • Union $\cup$
  • Intersection $\cap$
  • Complement $A^C$
  • Mutually Exclusive
  • Collectively Exhaustive
  • Partition


Applying Set Theory to Probability

  • Random Experiment
  • Sample Space
  • Events


Probability Axioms

  • A1: $P[A]\ge0$
  • A2: $P[S]=1$
  • A3: for mutually exclusive events, $P[A_1\cup A_2\cup\cdots]=P[A_1]+P[A_2]+\cdots$
  • $P[A\cup B]=P[A]+P[B]-P[A\cap B]$ (checked numerically below)
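
As a quick sanity check of the last identity, here is a minimal Monte Carlo sketch (the fair die and the events $A$, $B$ are illustrative choices, not from the lecture):

```python
import random

# Estimate both sides of P[A ∪ B] = P[A] + P[B] - P[A ∩ B]
# on one fair die roll, with A = "even" and B = "at least 4".
N = 100_000
a = b = a_or_b = a_and_b = 0
for _ in range(N):
    roll = random.randint(1, 6)
    in_a, in_b = roll % 2 == 0, roll >= 4
    a += in_a
    b += in_b
    a_or_b += in_a or in_b
    a_and_b += in_a and in_b

print(a_or_b / N)                    # ≈ P[A ∪ B] = 4/6
print(a / N + b / N - a_and_b / N)   # same value via inclusion-exclusion
```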


Discrete Sample Space

$S=\lbrace a_1,a_2,\dots,a_n\rbrace$

For equally likely outcomes, $P[\lbrace a_i\rbrace]=1/n$


Conditional Probability

Definition

$P[A|B]=\displaystyle\frac{P[A\cap B]}{P[B]}$


Theorem

  • $P[A|B]\ge0$
  • $P[B|B]=1$
  • If $\{A_i\}$ is a partition of $A$, then $P[A|B]=P[A_1|B]+P[A_2|B]+\cdots$


Partitions & the Law of Total Probability

If $B=\{B_1,B_2,\dots,B_n\}$ is a partition and $C_i=A\cap B_i$, then $A=C_1\cup C_2\cup\cdots\cup C_n$, so $P[A]=\displaystyle\sum_{i=1}^n P[A|B_i]P[B_i]$

Bayes' Law

$P[B|A]=\displaystyle\frac{P[A|B]P[B]}{P[A]}$
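
A small numeric sketch of Bayes' law together with the law of total probability; the scenario, prior, and likelihoods below are made-up numbers for illustration only:

```python
# B = "machine is faulty", A = "part fails inspection" (hypothetical scenario).
p_b = 0.01           # prior P[B]
p_a_given_b = 0.90   # P[A|B]
p_a_given_not_b = 0.05

# Law of total probability: P[A] = P[A|B]P[B] + P[A|B^C]P[B^C]
p_a = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)

# Bayes' law: P[B|A] = P[A|B]P[B] / P[A]
p_b_given_a = p_a_given_b * p_b / p_a
print(p_b_given_a)   # ≈ 0.154: a failing part still probably comes from a healthy machine
```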


Independence

$A$ and $B$ are independent if and only if $P(A\cap B)=P(A)P(B)\iff P(A|B)=P(A),\ P(B|A)=P(B)$

Independence & Mutual Exclusivity

Independence and mutual exclusivity are not synonyms.

Only when $P(A)P(B)=0$ do independence and mutual exclusivity coincide.


Random Variables

$X: S\to S_X$

$S_X$: the range of the random variable

A random variable maps each sample outcome $s$ to the corresponding value of the random variable $X$.


Discrete Random Variables

Probability Mass Function

Definition: $P_X(x)=P[X=x]$


Classical Distributions

| Name | Meaning | PMF | Expected Value | Variance |
| --- | --- | --- | --- | --- |
| Bernoulli($p$) | one trial; the result is 0 or 1 | $\begin{cases}1-p&,x=0\\p&,x=1\end{cases}$ | $p$ | $p(1-p)$ |
| Geometric($p$) | number of trials until the result first occurs | $p(1-p)^{x-1},\ x=1,2,\dots$ | $\displaystyle\frac{1}{p}$ | $\displaystyle\frac{1-p}{p^2}$ |
| Binomial($n,p$) | number of occurrences of the result in $n$ trials | $\displaystyle\binom{n}{k}p^k(1-p)^{n-k},\ k=0,1,\dots,n$ | $np$ | $np(1-p)$ |
| Pascal($k,p$) | number of trials until the result occurs $k$ times | $\displaystyle\binom{x-1}{k-1}p^k(1-p)^{x-k},\ x=k,k+1,\dots$ | $\displaystyle\frac{k}{p}$ | $\displaystyle\frac{k(1-p)}{p^2}$ |
| Discrete Uniform($k,l$) | integers $k,k+1,\dots,l$, all equally likely | $\displaystyle\frac{1}{l-k+1}$ | $\displaystyle\frac{k+l}{2}$ | $\displaystyle\frac{(l-k)(l-k+2)}{12}$ |
| Poisson($a$) | number of events occurring in a fixed interval of time, independently and at a known average rate $a$ | $\displaystyle\frac{a^xe^{-a}}{x!},\ x=0,1,2,\dots$ | $a$ | $a$ |
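
To exercise one row of the table, this sketch samples a Geometric($p$) variable by repeating Bernoulli trials until the first success and checks the listed mean and variance empirically ($p=0.3$ is an arbitrary choice):

```python
import random

# Empirical check of E[X] = 1/p and Var[X] = (1-p)/p^2 for Geometric(p).
p, N = 0.3, 200_000
samples = []
for _ in range(N):
    x = 1
    while random.random() >= p:   # repeat the trial until the first success
        x += 1
    samples.append(x)

mean = sum(samples) / N
var = sum((s - mean) ** 2 for s in samples) / N
print(mean, 1 / p)             # both ≈ 3.33
print(var, (1 - p) / p ** 2)   # both ≈ 7.78
```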

Expected Value: $E[X]=\mu_X=\displaystyle\sum_{x\in S_X} xP_X(x)$
Variance: $Var[X]=E[(X-\mu_X)^2]=E[X^2]-\mu_X^2$
Standard Deviation: $\sigma_X=\sqrt{Var[X]}$
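
These three definitions translate directly into code; the PMF of a fair six-sided die is used here as an illustrative example:

```python
import math

pmf = {x: 1 / 6 for x in range(1, 7)}   # P_X(x) for a fair die

mu = sum(x * p for x, p in pmf.items())        # E[X] = Σ x P_X(x)
ex2 = sum(x**2 * p for x, p in pmf.items())    # E[X^2]
var = ex2 - mu**2                              # Var[X] = E[X^2] - μ_X^2
sigma = math.sqrt(var)                         # σ_X
print(mu, var, sigma)   # 3.5, ≈2.917, ≈1.708
```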

Cumulative Distribution Function (CDF)

Definition: $F_X(x)=P[X\le x]=\displaystyle\sum_{x_i\le x} P[X=x_i]$

Derived Random Variable

$Y=g(X)$, $E[Y]=\displaystyle\sum_{x\in S_X} g(x)P_X(x)$

  • $E[aX+b]=aE[X]+b$
  • $Var[aX+b]=a^2Var[X]$

Continuous Random Variables

CDF: $F_X(x)=P[X\le x]$

  • $P[x_1\le X\le x_2]=\displaystyle\int_{x_1}^{x_2}f_X(x)\,\mathrm{d}x=F_X(x_2)-F_X(x_1)$

PDF: $f_X(x)=\displaystyle\frac{\mathrm{d}F_X(x)}{\mathrm{d}x}$

  • $\displaystyle\int_{-\infty}^{+\infty}f_X(x)\,\mathrm{d}x=1$

Uniform Random Variables

$X$ is uniform$(a, b)$, PDF: $f_X(x)=\displaystyle\frac{1}{b-a},\ x\in(a,b)$

CDF: $F_X(x)=(x-a)/(b-a),\ x\in(a,b)$

$E[X]=(a+b)/2$

$Var[X]=(b-a)^2/12$
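
A numeric check of these formulas by integrating the PDF directly (a sketch assuming SciPy is available; $a=2$, $b=5$ are arbitrary values):

```python
from scipy.integrate import quad

a, b = 2.0, 5.0
f = lambda x: 1 / (b - a)   # uniform(a, b) PDF on (a, b)

total, _ = quad(f, a, b)                      # ∫ f = 1
mean, _ = quad(lambda x: x * f(x), a, b)      # E[X] = (a+b)/2 = 3.5
ex2, _ = quad(lambda x: x * x * f(x), a, b)   # E[X^2]
print(total, mean, ex2 - mean**2)             # 1.0, 3.5, (b-a)^2/12 = 0.75
```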

Gaussian / Normal Random Variables

$X$ is Gaussian, PDF: $f_X(x)=\displaystyle\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

CDF: $F_X(x)=\Phi\left(\displaystyle\frac{x-\mu}{\sigma}\right)$

Definition: $\Phi(x)=\displaystyle\frac{1}{\sqrt{2\pi}}\int_{-\infty}^xe^{-\frac{t^2}{2}}\,\mathrm{d}t$

$E[X]=\mu$

$Var[X]=\sigma^2$

Standard Normal Random Variables

A Gaussian random variable with $\mu=0,\ \sigma=1$.

$X$ is standard normal, PDF: $f_X(x)=\displaystyle\frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}$

CDF: $F_X(x)=\Phi(x)=\displaystyle\frac{1}{\sqrt{2\pi}}\int_{-\infty}^xe^{-\frac{t^2}{2}}\,\mathrm{d}t$

$E[X]=0$

$Var[X]=1$

To evaluate a Gaussian($\mu$, $\sigma$) at $x=x_0$, use the standard normal at $x^\prime=(x_0-\mu)/\sigma$.

  • $\Phi(z)+\Phi(-z)=1$ (see the sketch below)
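
In code, $\Phi$ can be evaluated through the error function as $\Phi(x)=\frac{1}{2}\left(1+\operatorname{erf}(x/\sqrt{2})\right)$; the sketch below also shows the standardization step and the symmetry identity ($\mu$, $\sigma$, $x_0$ are arbitrary illustrative values):

```python
import math

def phi(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Standardization: for X ~ Gaussian(μ, σ), P[X ≤ x0] = Φ((x0 - μ)/σ).
mu, sigma, x0 = 10.0, 2.0, 13.0
print(phi((x0 - mu) / sigma))   # P[X ≤ 13] ≈ 0.9332
print(phi(1.5) + phi(-1.5))     # = 1.0, the symmetry identity above
```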

Bivariate Random Variables

Joint Probability Mass Function (PMF)

$P_{X,Y}(x,y)=P[X=x,Y=y]$

A table can be used to present $P_{X,Y}(x,y)$.

Joint CDF

$F_{X,Y}(x,y)=P[X\le x,Y\le y]$

Joint PDF

$f_{X,Y}(x,y)=\displaystyle\frac{\partial^2F_{X,Y}(x,y)}{\partial x\partial y}$

Marginal PMF

$P_X(x)=\displaystyle\sum_{y\in S_Y}P_{X,Y}(x,y)$

$P_Y(y)=\displaystyle\sum_{x\in S_X}P_{X,Y}(x,y)$

Marginal PDF

$f_X(x)=\displaystyle\int_{-\infty}^{\infty}f_{X,Y}(x,y)\,\mathrm{d}y$

Covariance

$Cov[X,Y]=E[(X-\mu_X)(Y-\mu_Y)]$

  • $Cov[X,Y]=E[X\cdot Y]-\mu_X\mu_Y$

If two variables tend to show

  • similar behaviour, the covariance is positive
  • opposite behaviour, the covariance is negative
  • uncorrelated behaviour, the covariance is zero (a numeric sketch follows)
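
A sketch computing the covariance from a joint PMF presented as a table, as described above (the table entries are made-up values):

```python
# Joint PMF as a table {(x, y): P_{X,Y}(x, y)} (hypothetical numbers).
joint = {(0, 0): 0.4, (0, 1): 0.1,
         (1, 0): 0.1, (1, 1): 0.4}

mu_x = sum(x * p for (x, y), p in joint.items())
mu_y = sum(y * p for (x, y), p in joint.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())

cov = e_xy - mu_x * mu_y   # Cov[X, Y] = E[XY] - μ_X μ_Y
print(cov)                 # 0.15 > 0: X and Y tend to show similar behaviour
```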

Correlation

$r_{X,Y}=E[X\cdot Y]$

A normalization of the correlation: $\rho_{X,Y}\in[-1,1]$

$\rho_{X,Y}=\displaystyle\frac{Cov[X,Y]}{\sqrt{Var[X]Var[Y]}}=\frac{Cov[X,Y]}{\sigma_X\sigma_Y}$

Independence implies uncorrelatedness, but uncorrelatedness does not necessarily imply independence.

$\hat X=aX+b,\ \hat Y=cY+d$

  • $\rho_{\hat X,\hat Y}=\rho_{X,Y}$ (when $ac>0$)
  • $Cov[\hat X,\hat Y]=ac\cdot Cov[X,Y]$

Other Theorems

  • $Cov[X,Y]=r_{X,Y}-\mu_X\mu_Y$
  • $Var[X+Y]=Var[X]+Var[Y]+2Cov[X,Y]$

Independence

$X$ and $Y$ are independent if and only if $P_{X,Y}(x,y)=P_X(x)P_Y(y)$ for all $x,y$ (for continuous variables, $f_{X,Y}(x,y)=f_X(x)f_Y(y)$).

Bivariate Gaussian Random Variables

Conditional PMF

$P_{X|Y}(x|y)=P[X=x|Y=y]=\displaystyle\frac{P_{X,Y}(x,y)}{P_Y(y)}$

Sums of Random Variables

Expected Value of Sums

$W_n=X_1+X_2+\cdots+X_n$

$E[W_n]=E[X_1]+E[X_2]+\cdots+E[X_n]$

$Var[W_n]=\displaystyle\sum_{i=1}^n Var[X_i]+2\sum_{i=1}^{n-1}\sum_{j=i+1}^n Cov[X_i,X_j]=\sum_{i=1}^{n}\sum_{j=1}^n Cov[X_i,X_j]$

In other words, this is just the sum of the covariances of every pair of terms (including each term paired with itself).

Central Limit Theorem

$X_i$ iid $\Rightarrow Z_n=\displaystyle\frac{W_n-n\mu_X}{\sqrt{n\sigma_X^2}},\quad\lim_{n\to+\infty}F_{Z_n}(z)=\Phi(z)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^ze^{-u^2/2}\,\mathrm{d}u$

iid: independent and identically distributed

Approximation: $F_{W_n}(w)\approx\Phi\left(\displaystyle\frac{w-n\mu_X}{\sqrt{n\sigma_X^2}}\right)$

A method that, regardless of the specific distribution type, uses only the expected value and variance of $X$ to approximate the CDF of a sum of iid variables with the standard normal distribution.
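
A minimal simulation of the theorem: sum $n$ iid uniform(0, 1) variables ($\mu_X=1/2$, $\sigma_X^2=1/12$) and compare an empirical probability of the standardized sum with $\Phi$ ($n=30$ is an arbitrary choice):

```python
import math
import random

n, trials = 30, 50_000
mu, var = 0.5, 1.0 / 12.0   # mean and variance of uniform(0, 1)

hits = 0
for _ in range(trials):
    w = sum(random.random() for _ in range(n))   # W_n
    z = (w - n * mu) / math.sqrt(n * var)        # Z_n
    hits += z <= 1.0

print(hits / trials)   # ≈ Φ(1) ≈ 0.8413
```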

De Moivre–Laplace Formula

$K$ is Binomial$(n, p)$:

$P[k_1\le K\le k_2]\approx\Phi\left(\displaystyle\frac{k_2+0.5-np}{\sqrt{np(1-p)}}\right)-\Phi\left(\displaystyle\frac{k_1-0.5-np}{\sqrt{np(1-p)}}\right)$

The upper and lower bounds are ~~arbitrarily~~ extended outward by 0.5.
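
Comparing the exact binomial sum with this approximation ($n$, $p$, $k_1$, $k_2$ below are arbitrary illustrative values):

```python
import math

def phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))   # standard normal CDF

n, p, k1, k2 = 100, 0.4, 35, 45
s = math.sqrt(n * p * (1 - p))

exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k1, k2 + 1))
approx = phi((k2 + 0.5 - n * p) / s) - phi((k1 - 0.5 - n * p) / s)
print(exact, approx)   # both ≈ 0.74; the ±0.5 continuity correction matters
```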

Sample Mean

$M_n(X)=\displaystyle\frac{1}{n}(X_1+X_2+\cdots+X_n)$

$M_n(X)$: a random variable
$E[X]$: a constant number
$\displaystyle\lim_{n\to\infty}M_n(X)=E[X]$

$E[M_n(X)]=E[X]$

$Var[M_n(X)]=\displaystyle\frac{Var[X]}{n}$

$\displaystyle\lim_{n\to\infty}Var[M_n(X)]=0$
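
An empirical check that $Var[M_n(X)]$ shrinks like $Var[X]/n$, using a fair die as $X$ ($Var[X]=35/12\approx2.917$; the values of $n$ are arbitrary):

```python
import random

for n in (1, 10, 100):
    means = [sum(random.randint(1, 6) for _ in range(n)) / n   # M_n(X)
             for _ in range(20_000)]
    m = sum(means) / len(means)
    v = sum((x - m) ** 2 for x in means) / len(means)
    print(n, v, 35 / 12 / n)   # empirical vs. theoretical Var[X]/n
```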

Useful Inequalities in Probability

Markov Inequality

$P[X<0]=0\Rightarrow P[X\ge c^2]\le \displaystyle\frac{E[X]}{c^2}$

Chebyshev Inequality

Let $X=(Y-\mu_Y)^2$; then $P[X\ge c^2]=P[(Y-\mu_Y)^2\ge c^2]=P[|Y-\mu_Y|\ge c]\le\displaystyle\frac{Var[Y]}{c^2}$

Laws of Large Numbers

$P[|M_n(X)-\mu_X|\ge c]\le Var[X]/(nc^2)$

$\displaystyle\lim_{n\to\infty}P[|M_n(X)-\mu_X|\ge c]=0$; letting $c\to0$ as well, $M_n(X)\to\mu_X$

Point Estimates of Model Parameters

Parameter to estimate: $r$

General estimator: $\hat R_n$, a function of $X_1,X_2,\dots,X_n$

Consistent Estimator

Definition (weak): $\forall\epsilon>0,\ \displaystyle\lim_{n\to\infty}P[|\hat R_n-r|\ge\epsilon]=0$

Definition (strong): $\displaystyle\lim_{n\to\infty}\hat R_n=r$ with probability 1

Unbiased Estimator

Definition: $E[\hat R]=r$

Asymptotically Unbiased Estimator

Definition: $\displaystyle\lim_{n\to\infty}E[\hat R_n]=r$

Mean Square Error

$e=E[(\hat R-r)^2]$

$\displaystyle\lim_{n\to\infty} e_n=0\Rightarrow\hat R_n$ is consistent

$M_n(X)$ is an unbiased estimate of $E[X]$

Standard Error

$\sqrt{e}$

Sample Variance

Definition: $V_n(X)=\displaystyle\frac{1}{n}\sum_{i=1}^n(X_i-M_n(X))^2$

$E[V_n(X)]=\displaystyle\left(1-\frac{1}{n}\right)Var[X]$

$E[V_n(X)]<Var[X]$, so $V_n(X)$ is biased low; dividing by $n-1$ instead of $n$ gives the unbiased sample variance.
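
A quick simulation of this bias, again with a fair die ($Var[X]=35/12\approx2.917$; $n=5$ is an arbitrary choice):

```python
import random

n, trials = 5, 100_000
total = 0.0
for _ in range(trials):
    xs = [random.randint(1, 6) for _ in range(n)]
    m = sum(xs) / n                               # M_n(X)
    total += sum((x - m) ** 2 for x in xs) / n    # V_n(X)

print(total / trials)            # ≈ 2.333, biased low
print((1 - 1 / n) * 35 / 12)     # (1 - 1/n) Var[X] ≈ 2.333
```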
