SVD and low-rank structure
Bar: visualize what SVD does to a unit circle, and relate it to PCA, embeddings, and matrix factorization without notes.
Intuition first
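Every matrix $A = U \Sigma V^\top$ acts in three steps: $V^\top$ rotates input space onto the "right singular" axes, $\Sigma$ (diagonal, $\sigma_1 \ge \sigma_2 \ge \dots \ge 0$) stretches each axis, and $U$ rotates into output space along the "left singular" axes.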
A unit circle becomes an ellipse whose semi-axis lengths are the singular values $\sigma_1, \sigma_2$ and whose semi-axes point along the left singular vectors $u_1, u_2$.
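The circle-to-ellipse picture can be checked numerically. The sketch below assumes NumPy and uses an illustrative matrix `A` that is not from these notes; it pushes sampled unit-circle points through `A` and compares the longest and shortest radii of the result with the singular values.

```python
import numpy as np

# Illustrative matrix (an assumption for this sketch); any real 2x2 matrix works.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.linalg.svd returns U, the singular values in descending order, and V^T.
U, s, Vt = np.linalg.svd(A)

# Sample points on the unit circle; each column is one 2-D point.
theta = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
circle = np.vstack([np.cos(theta), np.sin(theta)])

# Map the circle through A and measure the distance of each image point from the origin.
ellipse = A @ circle
radii = np.linalg.norm(ellipse, axis=0)

print("singular values :", s)
print("max / min radius:", radii.max(), radii.min())

# The first right singular vector v_1 is sent to sigma_1 * u_1 (the long semi-axis).
v1, u1 = Vt[0], U[:, 0]
print("A v1 == sigma1 u1:", np.allclose(A @ v1, s[0] * u1))
```

The longest radius should match $\sigma_1$ and the shortest $\sigma_2$, up to the sampling resolution of the circle.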
Low-rank approximation (the whole point)
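Keep the top $k$ singular values and their singular vectors: $A_k = \sum_{i=1}^{k} \sigma_i u_i v_i^\top$. This is the best rank-$k$ approximation of $A$ in the Frobenius norm (Eckart-Young), and the fraction of energy kept is $\sum_{i \le k} \sigma_i^2 \big/ \sum_i \sigma_i^2$.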
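Truncation is a few lines in NumPy. This is a minimal sketch, not a production routine: the helper name `truncate_svd` and the random test matrix are assumptions made for illustration. It also checks that the Frobenius error of the rank-$k$ approximation equals the root-sum-square of the discarded singular values.

```python
import numpy as np

def truncate_svd(A, k):
    """Return the rank-k approximation of A built from its top-k singular triplets,
    together with the full vector of singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Scale the first k columns of U by the first k singular values, then project back.
    A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]
    return A_k, s

# Hypothetical data: a random 6x4 matrix.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 4))

k = 2
A_k, s = truncate_svd(A, k)

err = np.linalg.norm(A - A_k, "fro")         # actual approximation error
tail = np.sqrt(np.sum(s[k:] ** 2))           # error predicted from discarded sigmas
energy = np.sum(s[:k] ** 2) / np.sum(s ** 2)  # fraction of energy kept

print("rank of A_k   :", np.linalg.matrix_rank(A_k))
print("error vs tail :", err, tail)
print("energy kept   :", energy)
```

For very large or sparse matrices a full SVD is wasteful; an iterative truncated solver is the usual substitute, but the idea is the same.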
Worked example (by hand)
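One possible hand computation, assuming the illustrative matrix $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ (chosen here because the arithmetic stays clean; it is an assumption, not necessarily the matrix these notes intended):

$$
\begin{aligned}
A^\top A &= \begin{pmatrix} 5 & 4 \\ 4 & 5 \end{pmatrix}, \qquad
\det(A^\top A - \lambda I) = (5 - \lambda)^2 - 16 = 0 \;\Rightarrow\; \lambda = 9,\ 1 \\
\sigma_1 &= \sqrt{9} = 3, \qquad \sigma_2 = \sqrt{1} = 1 \\
v_1 &= \tfrac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad
v_2 = \tfrac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix} \\
u_i &= \tfrac{1}{\sigma_i} A v_i \;\Rightarrow\; u_1 = v_1, \quad u_2 = v_2 \\
A_1 &= \sigma_1 u_1 v_1^\top = \tfrac{3}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
A - A_1 = \tfrac{1}{2} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} = \sigma_2 u_2 v_2^\top \\
\text{energy kept by } A_1 &= \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2} = \frac{9}{10} = 90\%
\end{aligned}
$$

Because this $A$ is symmetric positive definite, $U = V$ and the singular values coincide with the eigenvalues; for a general matrix the two rotations differ.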
How it connects to the rest of ML
- PCA = SVD of the (mean-centered) data matrix. Right singular vectors $V$ = principal directions; $\sigma_i^2$ (up to a $1/(n-1)$ factor, with $n$ the number of samples) = variance explained. PCA is SVD with a centering step.
- Embeddings / latent factors: factor a big sparse matrix (users × pages, words × contexts) into low-rank factors $U_k \Sigma_k V_k^\top$ → dense vectors that capture similarity. Classic recommender = truncated SVD.
- Matrix factorization for clickstream: a user×page interaction matrix is approximately low-rank because behavior is driven by a few latent intents → cluster journeys, fill gaps, recommend next page.
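The PCA connection can be verified directly. The sketch below assumes NumPy and a small synthetic dataset (both the data and the variable names are illustrative); it compares the SVD of the centered data with an eigendecomposition of the sample covariance, then forms low-dimensional row vectors the way a truncated-SVD embedding would.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples, 3 correlated features.
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 0.5, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

# Step 1: center each column. This is the only thing PCA adds on top of SVD.
Xc = X - X.mean(axis=0)
n = Xc.shape[0]

# Step 2: SVD of the centered data. Rows of Vt are the principal directions.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
var_svd = s ** 2 / (n - 1)

# Reference: eigendecomposition of the sample covariance (eigh returns ascending order).
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

print("variances match :", np.allclose(var_svd, eigvals))
# Directions agree up to sign, so compare absolute cosines.
cosines = np.abs(np.sum(Vt * eigvecs.T, axis=1))
print("direction |cos| :", cosines)

# Step 3: k-dimensional "embedding" of each row = projection onto the top-k directions.
k = 2
scores = Xc @ Vt[:k].T
print("scores == U_k S_k:", np.allclose(scores, U[:, :k] * s[:k]))
```

For a users × pages matrix the same `scores` rows would serve as user vectors and the rows of `Vt[:k].T` as page vectors, although whether to center a sparse interaction matrix is a separate modeling choice.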
By-hand exercise (meets the bar)
- Sketch what $A$ does to the unit circle (ellipse with semi-axes 3 and 1; $\sigma_1 = 3$, $\sigma_2 = 1$).
- For a rank-2 approximation of a matrix with $\sigma = (5, 4, 2, 1)$, what fraction of energy is kept? ($\tfrac{25+16}{25+16+4+1} = \tfrac{41}{46} \approx 89\%$.)
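The energy arithmetic in the second exercise can be double-checked with a short helper. This is a sketch assuming NumPy; the function name `energy_kept` is made up for illustration.

```python
import numpy as np

def energy_kept(singular_values, k):
    """Fraction of squared singular-value mass retained by the top-k values."""
    s = np.sort(np.asarray(singular_values, dtype=float))[::-1]  # descending
    return float(np.sum(s[:k] ** 2) / np.sum(s ** 2))

frac = energy_kept([5, 4, 2, 1], k=2)
print(frac)  # 41/46, roughly 0.89
```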
Links
- Sibling tool: Optimization & gradients (both underlie how models learn structure)
- Used in: Pattern & structure discovery (journey clustering), Polars over pandas (big tables)