EM for PCA
With complete information
- If we knew $z$ for each $x$, estimating $A$ and $D$ would be simple
$$x = Az + E$$
$$P(x \mid z) = N(Az, D)$$
- Given complete information $\left(x_{1}, z_{1}\right), \left(x_{2}, z_{2}\right)$
$$
\begin{aligned}
\underset{A, D}{\operatorname{argmax}} \sum_{(x, z)} \log P(x, z) &= \underset{A, D}{\operatorname{argmax}} \sum_{(x, z)} \log P(x \mid z) \\
&= \underset{A, D}{\operatorname{argmax}} \sum_{(x, z)} \log \frac{1}{\sqrt{(2 \pi)^{d}|D|}} \exp \left(-0.5 (x - Az)^{T} D^{-1} (x - Az)\right)
\end{aligned}
$$
- We can get a closed-form solution: $A = XZ^{+}$, where $Z^{+}$ is the pseudo-inverse of $Z$
- But we do not have $Z$: it is missing
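As a rough illustration of the complete-data case, the sketch below recovers $A$ with the pseudo-inverse formula $A = XZ^{+}$ on synthetic data. The function name `fit_complete_data`, the column-wise data layout, and the residual-based estimate of $D$ (taken here as an unrestricted covariance) are assumptions made for this example, not something the notes specify.

```python
import numpy as np

def fit_complete_data(X, Z):
    """Estimate A and D from paired data (hypothetical helper).

    X: (d, n) observations stored as columns.
    Z: (k, n) latent coordinates stored as columns.
    """
    A = X @ np.linalg.pinv(Z)              # least-squares solution of X = A Z
    residual = X - A @ Z                   # what the plane does not explain
    D = residual @ residual.T / X.shape[1] # assumed: unrestricted noise covariance
    return A, D

# Synthetic complete data: a 2-D latent space embedded in 5-D.
rng = np.random.default_rng(0)
A_true = rng.normal(size=(5, 2))
Z = rng.normal(size=(2, 1000))
X = A_true @ Z + 0.1 * rng.normal(size=(5, 1000))

A_hat, D_hat = fit_complete_data(X, Z)
```

With enough samples, `A_hat` should be close to `A_true`, and the diagonal of `D_hat` should be close to the noise variance used to generate the data (0.01 here).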
With incomplete information
- Initialize the plane
- Complete the data by computing the appropriate $z$ for the plane
- $P(z \mid X; A)$ is a delta, because $E$ is orthogonal to $A$
- Re-estimate the plane using the $z$
- Iterate
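The four steps above can be sketched as a short loop. This is a minimal version under stated assumptions: the data are centered and stored as columns, the "appropriate $z$" is taken to be the least-squares coordinates $A^{+}x$ (the point where the delta sits), and the names `em_pca` and `projector` are hypothetical helpers introduced only for this example.

```python
import numpy as np

def em_pca(X, k, n_iter=50, seed=0):
    """EM-style PCA sketch. X: (d, n) centered data as columns."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(X.shape[0], k))   # initialize the plane
    for _ in range(n_iter):
        Z = np.linalg.pinv(A) @ X          # complete the data: z for the current plane
        A = X @ np.linalg.pinv(Z)          # re-estimate the plane from the z's
    return A

def projector(B):
    """Orthogonal projector onto the column span of B."""
    Q, _ = np.linalg.qr(B)
    return Q @ Q.T

# Synthetic data near a 2-D plane in 5-D.
rng = np.random.default_rng(1)
basis, _ = np.linalg.qr(rng.normal(size=(5, 2)))
Z_true = np.array([[3.0], [2.0]]) * rng.normal(size=(2, 500))
X = basis @ Z_true + 0.1 * rng.normal(size=(5, 500))
X = X - X.mean(axis=1, keepdims=True)      # center the data

A_em = em_pca(X, k=2)

# Compare the learned plane with the top-2 principal subspace from the SVD.
U, _, _ = np.linalg.svd(X, full_matrices=False)
same_subspace = np.allclose(projector(A_em), projector(U[:, :2]), atol=1e-6)
```

The columns of `A_em` are generally not the principal components themselves; only the plane they span is expected to match, which is why the check compares projectors rather than the matrices directly.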
Linear Gaussian Model
- PCA assumes the noise is always orthogonal to the data
- Not always true
- The noise added to the output of the encoder can lie in any direction (uncorrelated)
- We want a generative model: to generate any point
- Take a Gaussian step on the hyperplane
- Add full-rank Gaussian uncorrelated noise that is independent of the position on the hyperplane
- Uncorrelated: diagonal covariance matrix
- Direction of noise is unconstrained
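The generative story above can be illustrated by sampling from it. The sketch below assumes a standard normal latent $z \sim N(0, I)$ for the "Gaussian step" and a zero-mean model; the function name `sample_linear_gaussian` and the particular `A` and `noise_var` values are made up for the example.

```python
import numpy as np

def sample_linear_gaussian(A, noise_var, n, seed=0):
    """Draw n points x = A z + e (hypothetical helper).

    Assumes z ~ N(0, I) and e ~ N(0, diag(noise_var)).
    Returns X of shape (d, n) and Z of shape (k, n).
    """
    rng = np.random.default_rng(seed)
    d, k = A.shape
    Z = rng.normal(size=(k, n))                                # Gaussian step on the hyperplane
    E = np.sqrt(noise_var)[:, None] * rng.normal(size=(d, n))  # uncorrelated noise, any direction
    return A @ Z + E, Z

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
noise_var = np.array([0.1, 0.2, 0.3])      # diagonal of the noise covariance

X, Z = sample_linear_gaussian(A, noise_var, n=200_000)
empirical_cov = np.cov(X)                  # rows of X are variables
model_cov = A @ A.T + np.diag(noise_var)   # covariance implied by the assumptions above
```

Under these assumptions the samples are no longer confined to the plane: `empirical_cov` should approach `model_cov`, which is full rank because every coordinate receives its own independent noise.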
With complete information
$$x = Az + e$$