$$X \sim N(\mu, \sigma^2) \qquad f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / 2\sigma^2}$$
In fact, any PDF of the following form with $\alpha > 0$ is a normal distribution:

$$f_X(x) = c \cdot e^{-(\alpha x^2 + \beta x + \gamma)}$$
Note

$f_X(x)$ is at its peak when $x = \mu$. It's therefore not difficult to show (by minimizing the exponent, i.e. setting $2\alpha x + \beta = 0$) that:

$$\mu = -\frac{\beta}{2\alpha}, \qquad \sigma^2 = \frac{1}{2\alpha}$$
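Below is a quick numerical sanity check of these two identities (my own sketch, not from the lecture); the values of $\alpha$, $\beta$, $\gamma$ are arbitrary demo choices:

```python
import numpy as np

# Check numerically that c * exp(-(alpha*x^2 + beta*x + gamma)) with
# alpha > 0 has mean -beta/(2*alpha) and variance 1/(2*alpha).
# The constants below are arbitrary demo values, not from the lecture.
alpha, beta, gamma = 1.5, -2.0, 0.7

x = np.linspace(-20.0, 20.0, 400_001)
dx = x[1] - x[0]
unnormalized = np.exp(-(alpha * x**2 + beta * x + gamma))

pdf = unnormalized / (unnormalized.sum() * dx)  # gamma only affects c
mean = (x * pdf).sum() * dx
var = ((x - mean) ** 2 * pdf).sum() * dx

print(mean, -beta / (2 * alpha))  # both ~ 0.6667
print(var, 1.0 / (2 * alpha))     # both ~ 0.3333
```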
Posterior probability:

$$f_{\Theta|X}(\theta|x) = \frac{f_\Theta(\theta) \cdot f_{X|\Theta}(x|\theta)}{f_X(x)}, \qquad f_X(x) = \int f_\Theta(\theta) \cdot f_{X|\Theta}(x|\theta)\, d\theta$$

$f_\Theta(\theta)$: prior distribution
$f_{\Theta|X}(\theta|x)$: posterior distribution, the output of Bayesian inference
Given the prior distribution of $\Theta$ and some observations of $X$, we can express the posterior distribution of $\Theta$ (as a function of $\theta$) and do a point estimation. The normal distribution is particularly nice, as the point estimation can be done by finding the peak of the posterior distribution, which translates to simply finding the minimum of the exponent, a quadratic function of $\theta$, via differentiation.
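As a concrete sketch of this recipe (toy numbers of my own, not the lecture's): take the prior $\Theta \sim N(1, 2^2)$, the model $X \mid \Theta = \theta \sim N(\theta, 1)$, and an observed $x = 3$, evaluate the posterior on a grid, and read off the peak:

```python
import numpy as np

# Grid-based posterior for a made-up example: prior Theta ~ N(1, 4),
# model X | Theta=theta ~ N(theta, 1), observed x = 3.0.
mu0, s0 = 1.0, 2.0
sx = 1.0
x_obs = 3.0

theta = np.linspace(-10.0, 10.0, 200_001)
dth = theta[1] - theta[0]

prior = np.exp(-((theta - mu0) ** 2) / (2 * s0**2))
likelihood = np.exp(-((x_obs - theta) ** 2) / (2 * sx**2))

posterior = prior * likelihood
posterior /= posterior.sum() * dth  # the division by f_X(x), done numerically

# The peak (MAP estimate) matches the closed form
# (mu0/s0^2 + x/sx^2) / (1/s0^2 + 1/sx^2) = 2.6 for these numbers.
print(theta[np.argmax(posterior)])
```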
Estimate with a single observation
$$X = \Theta + W, \qquad \Theta, W \sim N(0, 1) \text{ independent}$$
$$\hat{\Theta}_{MAP} = \hat{\Theta}_{LMS} = E[\Theta \mid X] = \frac{X}{2}$$
$$E[(\Theta - \hat{\Theta})^2 \mid X = x] = \frac{1}{2}$$

$\hat{\Theta}$: (point) estimator, a random variable
$\hat{\theta}$: estimate, a number
MAP: Maximum a posteriori probability
LMS: Least Mean Squares
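These numbers are easy to confirm by simulation. A Monte Carlo sketch under the stated model (the observed value $x_0 = 1.2$ and the window width are arbitrary choices):

```python
import numpy as np

# Monte Carlo check of E[Theta | X] = X/2 and conditional MSE = 1/2
# for X = Theta + W with Theta, W independent N(0, 1).
rng = np.random.default_rng(0)
n = 2_000_000
theta = rng.standard_normal(n)
x = theta + rng.standard_normal(n)

x0 = 1.2                       # arbitrary observed value for the demo
near = np.abs(x - x0) < 0.01   # samples consistent with X ~= x0
print(theta[near].mean())      # ~ x0 / 2 = 0.6
print(theta[near].var())       # ~ 1/2
```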
Estimate with multiple observations
$$X_i = \Theta + W_i, \quad i = 1, \ldots, n$$
$$\Theta \sim N(x_0, \sigma_0^2), \qquad W_i \sim N(0, \sigma_i^2), \qquad \Theta, W_1, \ldots, W_n \text{ independent}$$
$$f_{\Theta|X}(\theta|x) = c \cdot e^{-\mathrm{quad}(\theta)}$$
$$\mathrm{quad}(\theta) = \frac{(\theta - x_0)^2}{2\sigma_0^2} + \frac{(\theta - x_1)^2}{2\sigma_1^2} + \cdots + \frac{(\theta - x_n)^2}{2\sigma_n^2}$$
$$\hat{\Theta}_{MAP} = \hat{\Theta}_{LMS} = E[\Theta \mid X] = \frac{\sum_{i=0}^n x_i / \sigma_i^2}{\sum_{i=0}^n 1 / \sigma_i^2}$$
$$E[(\Theta - \hat{\Theta})^2] = E[(\Theta - \hat{\Theta})^2 \mid X = x] = E[(\Theta - \hat{\theta})^2 \mid X = x] = \mathrm{var}(\Theta \mid X = x) = \frac{1}{\sum_{i=0}^n 1 / \sigma_i^2} \quad \text{(mean squared error)}$$
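In code, the estimator is just a precision-weighted average. A minimal sketch with made-up values for $x_0, \ldots, x_n$ and $\sigma_0, \ldots, \sigma_n$:

```python
import numpy as np

# Precision-weighted average: x[0] and sigma[0] encode the prior mean
# and spread; the rest are made-up observations and noise levels.
x = np.array([0.0, 2.1, 1.8, 2.5])      # x0, x1, ..., xn
sigma = np.array([2.0, 1.0, 0.5, 1.5])  # sigma0, sigma1, ..., sigman

w = 1.0 / sigma**2                   # precisions 1/sigma_i^2
theta_hat = (w * x).sum() / w.sum()  # MAP = LMS estimate
mse = 1.0 / w.sum()                  # E[(Theta - Theta_hat)^2 | X = x]

print(theta_hat, mse)
```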
Θ as an m-dimensional vector with n observations
$$f_{\Theta|X}(\theta|x) = \frac{1}{f_X(x)} \prod_{j=1}^m f_{\Theta_j}(\theta_j) \prod_{i=1}^n f_{X_i|\Theta}(x_i|\theta) \quad \text{(posterior distribution)}$$
As with the normal case above, we can then differentiate $\mathrm{quad}(\theta)$ with respect to each $\theta_j$ and set the derivatives to zero, solving $m$ linear equations in $m$ unknowns for the point estimate of $\Theta$.
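A sketch of that last step, assuming standard-normal priors $\Theta_j \sim N(0,1)$ and a hypothetical linear observation model $X_i = \sum_j A_{ij}\Theta_j + W_i$ with $W_i \sim N(0,1)$, in which case $\mathrm{quad}(\theta) = \|\theta\|^2/2 + \|x - A\theta\|^2/2$:

```python
import numpy as np

# Setting the gradient of quad(theta) to zero gives the linear system
# (I + A^T A) theta = A^T x -- m equations in m unknowns.
# A and x below are randomly generated stand-ins for a real model.
rng = np.random.default_rng(1)
m, n = 3, 10
A = rng.standard_normal((n, m))  # hypothetical observation matrix
x = rng.standard_normal(n)       # hypothetical observations

theta_map = np.linalg.solve(np.eye(m) + A.T @ A, A.T @ x)
print(theta_map)                 # point estimate of Theta
```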
Source: MITx 6.041x, Lecture 15.
# posted by rot13(Unafba Pune) @ 4:52 PM
