Proof of SVD

We have

ARm×nA \in \mathbb{R}^{m \times n}

Let (λ1,v1),(λ2,v2),,(λr,vr),(i[1,r],λi0)(\lambda_1,\vec{v}_1), (\lambda_2,\vec{v}_2), \dots, (\lambda_r,\vec{v}_r), (\forall i \in [1,r], \lambda_i \ne 0) be non-zero eigenvalue-vector pairs of AAA^{\top}A (and that λ1λ2λr\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r)

And let (λr+1,vr+1),,(λn,vn)(\lambda_{r+1},\vec{v}_{r+1}),\dots,(\lambda_n,\vec{v}_n) be eigenvalue-vector pairs of ATAA^{T}A that its eigenvalue is 00.

Define:

V=[v1v2vn]V = \begin{bmatrix} \vec{v}_1 &\vec{v}_2 &\cdots &\vec{v}_n \end{bmatrix}
σi=λi\sigma_i = \sqrt{\lambda_i}
ir,ui:Avi=σiui\forall i \le r, \vec{u}_i: A\vec{v}_i=\sigma_i\vec{u}_i

Prove:

ui are orthonormalij,<ui,uj>=0i,ui2=1\vec{u}_i \text{ are orthonormal} \\ \forall i \ne j, <\vec{u}_i,\vec{u}_j> = 0 \\ \forall i, ||\vec{u}_i||_2 = 1
<ui,uj>=uiuj=(Avi)σiAvjσj=1σiσjviAAvj=1σiσjviλjvj=σjσiσj0undefinedeigenvectors of a matrix with different eigenvalues=0\begin{split} <\vec{u}_i,\vec{u}_j> &= \vec{u}_i^{\top}\vec{u}_j \\ &=\frac{(A \vec{v}_i)^{\top}}{\sigma_i} \frac{A\vec{v}_j}{\sigma_j} \\ &=\frac{1}{\sigma_i \sigma_j}\vec{v}_i^{\top}A^{\top}A\vec{v}_j \\ &=\frac{1}{\sigma_i \sigma_j}\vec{v}_i \lambda_j \vec{v}_j \\ &=\frac{\sigma_j}{\sigma_i \sigma_j} \underbrace{0}_{\text{eigenvectors of a matrix with different eigenvalues}} \\ &=0 \end{split}
ui22=(Avi)σi(Avi)σi=1λiviAAvi=1λiviλivi=vi22=1\begin{split} ||\vec{u}_i||_2^2 &= \frac{(A\vec{v}_i)^{\top}}{\sigma_i}\frac{(A\vec{v}_i)}{\sigma_i} \\ &=\frac{1}{\lambda_i}\vec{v}_i^{\top}A^{\top}A\vec{v}_i \\ &=\frac{1}{\lambda_i}\vec{v}_i^{\top}\lambda_i\vec{v}_i \\ &=||\vec{v}_i||_2^2 = 1 \end{split}

We will now proceed to define more ui,i(r,n]\vec{u}_i, \forall i \in (r, n]

We will use gram-schmidt for computing those extra ui\vec{u}_is.

Now we construct VV that

V=[VR=[v1vr]Vnull=[vr+1vn]]V = \begin{bmatrix} V_R = \begin{bmatrix} \vec{v}_1 &\cdots &\vec{v}_r \end{bmatrix} &V_{null}=\begin{bmatrix} \vec{v}_{r+1} &\cdots &\vec{v}_n \end{bmatrix} \end{bmatrix}

And now (because σiui=Avi)\sigma_i\vec{u}_i = A\vec{v}_i):

AVR=[σ10000σ20000σ30000σr][u1u2ur]AV_R = \begin{bmatrix} \sigma_1 &0 &0 &\cdots &0 \\ 0 &\sigma_2 &0 &\cdots &0 \\ 0 &0 &\sigma_3 &\cdots &0 \\ \vdots &\vdots &\vdots &\ddots &\vdots \\ 0 &0 &0 &\cdots &\sigma_r \end{bmatrix}\begin{bmatrix} \vec{u}_1 &\vec{u}_2 &\cdots &\vec{u}_r \end{bmatrix}

From the same reasoning

AV=ΣU=UΣAV=\Sigma U = U\Sigma

(proof is left as an exercise)

And now we can apply V1V^{-1} to the right side of equation

V=UΣVV = U\Sigma V^{\top}
🔥
Now think about the unit circle… This is helpful because it helps us know what every single vector is going to transform into

Note:

ij,vivj=0ij,(Avi)(Avj)=0(proved in the u orthonormal proof)\forall i \ne j, \vec{v}_i^{\top}\vec{v}_j = 0 \\ \forall i \ne j, (A\vec{v}_i)^{\top}(A\vec{v}_j)=0 \quad (\text{proved in the $\vec{u}$ orthonormal proof)}