Proof of SVD

We have

A \in \mathbb{R}^{m \times n}

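Let $(\lambda_1,\vec{v}_1), (\lambda_2,\vec{v}_2), \dots, (\lambda_r,\vec{v}_r)$ be the eigenvalue-eigenvector pairs of $A^{\top}A$ with non-zero eigenvalues ($\forall i \in [1,r], \lambda_i \ne 0$), ordered so that $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r$. Because $A^{\top}A$ is symmetric and positive semidefinite, these $\lambda_i$ are real and positive, and the eigenvectors can be chosen orthonormal.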

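And let $(\lambda_{r+1},\vec{v}_{r+1}),\dots,(\lambda_n,\vec{v}_n)$ be the eigenvalue-eigenvector pairs of $A^{\top}A$ whose eigenvalue is $0$, chosen so that $\vec{v}_1,\dots,\vec{v}_n$ together form an orthonormal basis of $\mathbb{R}^n$.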

Define:

V = \begin{bmatrix} \vec{v}_1 &\vec{v}_2 &\cdots &\vec{v}_n \end{bmatrix}

\sigma_i = \sqrt{\lambda_i}

\forall i \le r, \quad \vec{u}_i = \frac{1}{\sigma_i}A\vec{v}_i \quad (\text{so that } A\vec{v}_i=\sigma_i\vec{u}_i)

Prove:

\vec{u}_i \text{ are orthonormal} \\ \forall i \ne j, \langle\vec{u}_i,\vec{u}_j\rangle = 0 \\ \forall i, ||\vec{u}_i||_2 = 1

\begin{split} \langle\vec{u}_i,\vec{u}_j\rangle &= \vec{u}_i^{\top}\vec{u}_j \\ &=\frac{(A \vec{v}_i)^{\top}}{\sigma_i} \frac{A\vec{v}_j}{\sigma_j} \\ &=\frac{1}{\sigma_i \sigma_j}\vec{v}_i^{\top}A^{\top}A\vec{v}_j \\ &=\frac{1}{\sigma_i \sigma_j}\vec{v}_i^{\top} \lambda_j \vec{v}_j \\ &=\frac{\lambda_j}{\sigma_i \sigma_j} \underbrace{\vec{v}_i^{\top}\vec{v}_j}_{=\,0 \text{ (eigenvectors of the symmetric matrix } A^{\top}A \text{, chosen orthogonal)}} \\ &=0 \end{split}

\begin{split} ||\vec{u}_i||_2^2 &= \frac{(A\vec{v}_i)^{\top}}{\sigma_i}\frac{(A\vec{v}_i)}{\sigma_i} \\ &=\frac{1}{\lambda_i}\vec{v}_i^{\top}A^{\top}A\vec{v}_i \\ &=\frac{1}{\lambda_i}\vec{v}_i^{\top}\lambda_i\vec{v}_i \\ &=||\vec{v}_i||_2^2 = 1 \end{split}
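The two computations above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrix `A` and the tolerance `1e-10` used to decide which eigenvalues count as non-zero are illustrative choices, not part of the proof.

```python
import numpy as np

# Hypothetical example matrix (not from the text); any real m x n matrix works.
A = np.array([[3.0, 0.0],
              [4.0, 5.0],
              [0.0, 0.0]])

# Eigen-decomposition of the symmetric matrix A^T A.
# np.linalg.eigh returns ascending eigenvalues and orthonormal eigenvectors.
lam, V = np.linalg.eigh(A.T @ A)
order = np.argsort(lam)[::-1]            # reorder so that lambda_1 >= lambda_2 >= ...
lam, V = lam[order], V[:, order]

r = int(np.sum(lam > 1e-10))             # number of non-zero eigenvalues (assumed tolerance)
sigma = np.sqrt(lam[:r])                 # sigma_i = sqrt(lambda_i)
U_r = (A @ V[:, :r]) / sigma             # column i is A v_i / sigma_i

# The Gram matrix U_r^T U_r should be the r x r identity.
gram = U_r.T @ U_r
print(np.round(gram, 10))
```

The off-diagonal entries of `gram` correspond to $\langle\vec{u}_i,\vec{u}_j\rangle$ and the diagonal entries to $||\vec{u}_i||_2^2$.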

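We will now proceed to define more $\vec{u}_i, \forall i \in (r, m]$ (each $\vec{u}_i$ lives in $\mathbb{R}^m$, so we need $m$ of them in total).

We will use Gram-Schmidt to compute those extra $\vec{u}_i$s, extending $\vec{u}_1,\dots,\vec{u}_r$ to an orthonormal basis of $\mathbb{R}^m$.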
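One way this completion step could look in code is sketched below, assuming NumPy. The helper name `complete_basis`, the use of the standard basis vectors as candidates, and the tolerance are all assumptions made for illustration; any set of vectors spanning $\mathbb{R}^m$ would serve as candidates.

```python
import numpy as np

def complete_basis(U_r, tol=1e-10):
    """Extend the orthonormal columns of U_r (m x r) to an orthonormal basis of R^m.

    Hypothetical helper: feeds the standard basis vectors through Gram-Schmidt
    and keeps those that are not already in the span of the current columns.
    """
    m = U_r.shape[0]
    cols = [U_r[:, j] for j in range(U_r.shape[1])]
    for e in np.eye(m):                  # candidate vectors e_1, ..., e_m
        if len(cols) == m:
            break
        w = e.copy()
        for q in cols:                   # subtract projections one at a time
            w = w - (q @ w) * q
        norm = np.linalg.norm(w)
        if norm > tol:                   # skip candidates that lie in the span
            cols.append(w / norm)
    return np.column_stack(cols)

# Hypothetical orthonormal pair in R^3 playing the role of u_1, u_2.
U_r = np.array([[1.0, 3.0],
                [3.0, -1.0],
                [0.0, 0.0]]) / np.sqrt(10.0)
U = complete_basis(U_r)
print(np.round(U.T @ U, 10))             # should be the 3 x 3 identity
```

The first $r$ columns of `U` are left untouched, so the relation $A\vec{v}_i=\sigma_i\vec{u}_i$ for $i \le r$ still holds after the completion.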

Now we partition $V$ as

V = \begin{bmatrix} V_R &V_{null} \end{bmatrix}, \quad V_R = \begin{bmatrix} \vec{v}_1 &\cdots &\vec{v}_r \end{bmatrix}, \quad V_{null}=\begin{bmatrix} \vec{v}_{r+1} &\cdots &\vec{v}_n \end{bmatrix}

And now (because $A\vec{v}_i = \sigma_i\vec{u}_i$ for $i \le r$):

AV_R = \begin{bmatrix} \vec{u}_1 &\vec{u}_2 &\cdots &\vec{u}_r \end{bmatrix}\begin{bmatrix} \sigma_1 &0 &0 &\cdots &0 \\ 0 &\sigma_2 &0 &\cdots &0 \\ 0 &0 &\sigma_3 &\cdots &0 \\ \vdots &\vdots &\vdots &\ddots &\vdots \\ 0 &0 &0 &\cdots &\sigma_r \end{bmatrix}

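By the same reasoning, and because $A\vec{v}_i = \vec{0}$ for $i > r$ (since $||A\vec{v}_i||_2^2 = \vec{v}_i^{\top}A^{\top}A\vec{v}_i = 0$),

AV = U\Sigma

where $U$ is the $m \times m$ matrix whose columns are the $\vec{u}_i$, and $\Sigma \in \mathbb{R}^{m \times n}$ has $\sigma_1,\dots,\sigma_r$ on its diagonal and zeros elsewhere.

(the full proof is left as an exercise)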

And now we can multiply both sides of the equation on the right by $V^{-1} = V^{\top}$ ($V$ is orthogonal because its columns are orthonormal):

A = U\Sigma V^{\top}
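To see the whole construction end to end, here is a minimal sketch assuming NumPy. The rank-1 matrix `A` is a hypothetical example chosen so that both the zero-eigenvalue part of $V$ and the Gram-Schmidt completion of $U$ are exercised; the tolerance `1e-10` is likewise an assumption.

```python
import numpy as np

# Hypothetical rank-1 example: m = 3, n = 2, r = 1.
A = np.array([[1.0, 1.0],
              [1.0, 1.0],
              [0.0, 0.0]])
m, n = A.shape

# Eigenpairs of A^T A, sorted by descending eigenvalue.
lam, V = np.linalg.eigh(A.T @ A)
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]
r = int(np.sum(lam > 1e-10))             # count of non-zero eigenvalues
sigma = np.sqrt(lam[:r])

# u_i = A v_i / sigma_i for i <= r, then complete to a basis of R^m
# by running the standard basis vectors through Gram-Schmidt.
cols = list(((A @ V[:, :r]) / sigma).T)
for e in np.eye(m):
    if len(cols) == m:
        break
    w = e - sum((q @ e) * q for q in cols)
    if np.linalg.norm(w) > 1e-10:
        cols.append(w / np.linalg.norm(w))
U = np.column_stack(cols)

# Sigma is m x n with sigma_1..sigma_r on the diagonal and zeros elsewhere.
Sigma = np.zeros((m, n))
Sigma[:r, :r] = np.diag(sigma)

print(np.allclose(A @ V, U @ Sigma))     # checks A V = U Sigma
print(np.allclose(A, U @ Sigma @ V.T))   # checks A = U Sigma V^T
```

In practice `np.linalg.svd` computes the factorization directly and more stably; the code above only mirrors the steps of the proof.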

🔥

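Now think about the unit circle: $A$ maps each $\vec{v}_i$ on it to $\sigma_i\vec{u}_i$, so the circle is sent to an ellipse whose semi-axes are the $\sigma_i\vec{u}_i$. This is helpful because every vector is a linear combination of the $\vec{v}_i$, so knowing where the $\vec{v}_i$ go tells us what every single vector is going to transform into.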
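The unit-circle picture can be probed numerically. The sketch below assumes NumPy and a hypothetical $2 \times 2$ matrix `A`; it samples points on the unit circle and compares the longest and shortest images with $\sigma_1$ and $\sigma_2$.

```python
import numpy as np

# Hypothetical 2 x 2 example (not from the text).
A = np.array([[3.0, 0.0],
              [4.0, 5.0]])

# Sample unit vectors x(theta) = (cos theta, sin theta) as columns.
theta = np.linspace(0.0, 2.0 * np.pi, 100001)
circle = np.vstack([np.cos(theta), np.sin(theta)])

# Length of A x for every sampled x on the circle.
lengths = np.linalg.norm(A @ circle, axis=0)

# Singular values as square roots of the eigenvalues of A^T A, descending.
sigma = np.sqrt(np.linalg.eigvalsh(A.T @ A))[::-1]

print(lengths.max(), sigma[0])           # longest semi-axis vs sigma_1
print(lengths.min(), sigma[1])           # shortest semi-axis vs sigma_2
```

Up to sampling error, the extreme lengths agree with the singular values, which is the statement that the image of the circle is an ellipse with semi-axes of length $\sigma_1$ and $\sigma_2$.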

Note:

\forall i \ne j, \vec{v}_i^{\top}\vec{v}_j = 0 \\ \forall i \ne j, (A\vec{v}_i)^{\top}(A\vec{v}_j)=0 \quad \text{(proved in the $\vec{u}$ orthonormal proof)}