Sunday, March 30, 2014

 

Normal Distribution

\(\displaystyle X \thicksim N(\mu, \sigma^2) \qquad f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / 2 \sigma^2} \)

In fact, any PDF of the following form with \(\alpha > 0\) is a normal distribution:
  \[ \begin{aligned} f_X(x) &= c \cdot\color{blue}{e^{-(\alpha x^2 + \beta x + \gamma )}} \\ \end{aligned} \]
Note that \(f_X(x)\) peaks at \(x = \mu\), so it is not difficult to show (by minimizing the exponent) that:
  \[ \begin{aligned} \mu = \color{blue}{-\frac{\beta}{2\alpha}} \qquad \sigma^2 = \color{red}{\frac{1}{2\alpha}} \\ \end{aligned} \]
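To see this, complete the square in the exponent (the constant term is absorbed into \(c\) through normalization):
  \[ \begin{aligned} \alpha x^2 + \beta x + \gamma &= \alpha\Big(x + \frac{\beta}{2\alpha}\Big)^2 + \gamma - \frac{\beta^2}{4\alpha} = \frac{\Big(x - \big(-\frac{\beta}{2\alpha}\big)\Big)^2}{2 \cdot \frac{1}{2\alpha}} + \text{const} \\ \end{aligned} \]
Matching this against \((x - \mu)^2 / 2\sigma^2\) gives the expressions for \(\mu\) and \(\sigma^2\) above.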
Posterior probability:
  \[ \begin{aligned} \color{teal}{f_{\Theta|X}(\theta\,|\,x)} &\color{teal}{=} \color{teal}{\frac{f_\Theta(\theta) \cdot f_{X|\Theta}(x\,|\,\theta)}{f_X(x)}} \\ f_X(x) &= \int f_\Theta(\theta) \cdot f_{X|\Theta}(x\,|\,\theta) \, d\theta \\ f_\Theta(\theta) &\text{: prior distribution} \\ f_{\Theta|X}(\theta\,|\,x) &\text{: posterior distribution, the output of Bayesian inference}\\ \end{aligned} \]
Given the prior distribution of \(\Theta\) and some observations of \(X\), we can express the posterior distribution of \(\Theta\) (as a function of \(\theta\)) and make a point estimate. The normal distribution is particularly nice here, as the point estimate can be found by locating the peak of the posterior distribution, which reduces to finding the minimum of the exponent, a quadratic function of \(\theta\), via differentiation.
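As a quick numerical illustration (not from the lecture), the sketch below discretizes \(\theta\), multiplies an assumed \(N(0,1)\) prior by the likelihood of a single made-up observation \(x\) with \(N(\theta, 1)\) noise, and reads off the MAP estimate as the peak of the posterior. The grid range and the particular numbers are arbitrary choices for the example.

    import numpy as np

    # Hypothetical setup: Theta ~ N(0, 1) prior, X | Theta = theta ~ N(theta, 1).
    thetas = np.linspace(-5, 5, 10001)         # discretized grid of theta values
    x = 1.2                                    # a single made-up observation

    prior = np.exp(-thetas**2 / 2)             # f_Theta(theta), up to a constant
    likelihood = np.exp(-(x - thetas)**2 / 2)  # f_{X|Theta}(x | theta), up to a constant
    posterior = prior * likelihood
    posterior /= np.trapz(posterior, thetas)   # normalize so it integrates to 1

    theta_map = thetas[np.argmax(posterior)]   # MAP estimate = peak of the posterior
    print(theta_map)                           # ~ 0.6 = x / 2 (see the next section)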

Estimate with single observation

  \[ \begin{aligned} X &= \Theta + W \qquad \Theta, W \thicksim N(0,1) \text{ independent} \\ \widehat{\Theta}_{\text{MAP}} &= \widehat{\Theta}_{\text{LMS}} = \mathbb{E}[\Theta\,|\,X] = \frac{X}{2} \qquad \mathbb{E}[(\Theta - \widehat{\Theta})^2\,|\,X=x] = \color{red}{1 \over 2} \\ \widehat{\Theta} &\text{: (point) estimator - a random variable } \quad \hat{\theta} \text{: estimate - a number } \quad \text{MAP: Maximum a posteriori probability } \quad \text{LMS: Least Mean Squares} \end{aligned} \]
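A small Monte Carlo check (illustrative only; the sample size and seed are arbitrary) confirms that the mean squared error of \(\widehat{\Theta} = X/2\) is about \(1/2\):

    import numpy as np

    # Theta, W ~ N(0, 1) independent, X = Theta + W.
    rng = np.random.default_rng(0)
    theta = rng.standard_normal(1_000_000)
    w = rng.standard_normal(1_000_000)
    x = theta + w

    theta_hat = x / 2                          # the MAP/LMS estimator
    print(np.mean((theta - theta_hat)**2))     # ~ 0.5, the mean squared error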

Estimate with multiple observations

  \[ \begin{aligned} X_1 &= \Theta + W_1 \qquad \Theta \thicksim N(x_0,\sigma_0^2) \qquad W_i \thicksim N(0, \sigma_i^2) \\ \vdots \\ X_n &= \Theta + W_n \qquad \Theta, W_1, \cdots, W_n \text{ independent} \\ f_{\Theta|X}(\theta\,|\,x) &= c \cdot e^{-\text{quad}(\theta)} \\ \text{quad}(\theta) &= \frac{(\theta - x_0)^2}{2\sigma_0^2} + \frac{(\theta - x_1)^2}{2\sigma_1^2} + \cdots + \frac{(\theta - x_n)^2}{2\sigma_n^2} \\ \widehat{\Theta}_{\text{MAP}} &= \widehat{\Theta}_{\text{LMS}} = \mathbb{E}[\Theta\,|\,X] = \color{red}{\frac{1}{\displaystyle{\sum_{i=0}^n \frac{1}{\sigma_i^2}}}}\displaystyle{\sum_{i=0}^n\frac{x_i}{\sigma_i^2}} \\ \mathbb{E}[(\Theta - \widehat{\Theta})^2] &= \mathbb{E}[(\Theta - \widehat{\Theta})^2\color{blue}{\,|\,X=x}] = \mathbb{E}[(\Theta - \color{blue}{\widehat{\theta}})^2\,|\,X=x] \\ \text{var}(\Theta - \widehat{\Theta}) &= \text{var}(\Theta\,|\,X=x) = \color{red}{1 \over \displaystyle{\sum_{i=0}^n\frac{1}{\sigma_i^2}}} & \text{mean squared error} \end{aligned} \]
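In code, the estimate is simply an inverse-variance weighted average, with the prior mean \(x_0\) and variance \(\sigma_0^2\) treated as one more observation. The numbers below are made up purely for illustration:

    import numpy as np

    # x[0] is the prior mean x_0; sigma2[0] is the prior variance sigma_0^2.
    x = np.array([0.0, 1.3, 0.7, 1.1])         # x_0 and observations x_1 .. x_n
    sigma2 = np.array([1.0, 0.5, 2.0, 1.0])    # sigma_0^2 and noise variances sigma_i^2

    weights = 1.0 / sigma2
    theta_hat = np.sum(weights * x) / np.sum(weights)  # MAP = LMS point estimate
    mse = 1.0 / np.sum(weights)                        # posterior variance = mean squared error
    print(theta_hat, mse)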

\(\Theta\) as an \(m\)-dimensional vector with \(n\) observations

  \[ \begin{aligned} f_{\Theta|X}(\theta\,|\,x) &= \frac{1}{f_X(x)} \prod_{j=1}^{m} f_{\Theta_j}(\theta_j) \prod_{i=1}^n f_{X_i|\Theta}(x_i\,|\,\theta) & \text{posterior distribution} \\ \end{aligned} \]
Since everything is normal, the exponent \(\text{quad}(\theta)\) is again a quadratic function of \(\theta\), so we can differentiate it with respect to each \(\theta_j\), set the derivatives to zero, and solve \(m\) linear equations in \(m\) unknowns for the point estimate of \(\Theta\).
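As a concrete (hypothetical) instance, take a linear model \(X_i = \Theta_1 + \Theta_2 t_i + W_i\) with independent \(N(0,1)\) priors on \(\Theta_1, \Theta_2\) and \(N(0,1)\) noise. Setting the partial derivatives of \(\text{quad}(\theta)\) to zero gives the linear system \((I + A^T A)\,\theta = A^T x\), where \(A\) is the matrix with rows \((1, t_i)\):

    import numpy as np

    # Made-up observation times and observations for the hypothetical model above.
    t = np.array([0.0, 1.0, 2.0, 3.0])
    x = np.array([0.9, 2.1, 2.8, 4.2])
    A = np.column_stack([np.ones_like(t), t])  # rows (1, t_i), columns for theta_1, theta_2

    # quad(theta) = sum_j theta_j^2 / 2 + sum_i (x_i - A[i] @ theta)^2 / 2
    # d quad / d theta = theta - A.T @ (x - A @ theta) = 0  =>  (I + A.T A) theta = A.T x
    lhs = np.eye(2) + A.T @ A
    rhs = A.T @ x
    theta_hat = np.linalg.solve(lhs, rhs)      # MAP point estimate of (Theta_1, Theta_2)
    print(theta_hat)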

Source: MITx 6.041x, Lecture 15.

