When a matrix transforms a vector, it typically both rotates and scales it. But certain special vectors — eigenvectors — are only scaled, not rotated. Understanding them unlocks PCA, PageRank, spectral graph theory, and differential equations.
The Definition
A non-zero vector v is an eigenvector of matrix A if:
Av = λvWhere λ (lambda) is a scalar called the eigenvalue. Applying A to v simply stretches (or shrinks, or flips) v by a factor of λ — its direction is unchanged.
Example:
A = [[3, 0],
[0, 2]]
v₁ = [1, 0]: A @ v₁ = [3, 0] = 3 × [1, 0] → eigenvalue λ₁ = 3
v₂ = [0, 1]: A @ v₂ = [0, 2] = 2 × [0, 1] → eigenvalue λ₂ = 2For a diagonal matrix, the eigenvectors are the coordinate axes and the eigenvalues are the diagonal entries.
Finding Eigenvalues
Rearranging Av = λv:
Av - λv = 0
(A - λI)v = 0For a non-zero solution v to exist, the matrix (A - λI) must be singular (non-invertible):
det(A - λI) = 0This is the characteristic equation. Solving it gives the eigenvalues.
For a 2×2 matrix:
A = [[a, b],
[c, d]]
det(A - λI) = (a-λ)(d-λ) - bc = 0
λ² - (a+d)λ + (ad-bc) = 0In Python:
import numpy as np
A = np.array([[4., 1.],
[2., 3.]])
eigenvalues, eigenvectors = np.linalg.eig(A)
# eigenvalues: [5., 2.]
# eigenvectors[:, 0] is the eigenvector for λ=5
# eigenvectors[:, 1] is the eigenvector for λ=2Geometric Interpretation
Eigenvectors are the axes along which a transformation acts purely as scaling. For any input, the transformation:
- Decomposes it into components along eigenvector directions
- Scales each component by the corresponding eigenvalue
- Recombines
This means if you know all eigenvectors and eigenvalues, you completely understand what the matrix does.
| Eigenvalue | Effect on eigenvector |
|---|---|
| λ > 1 | Stretches |
| 0 < λ < 1 | Shrinks |
| λ = 1 | No change |
| λ = 0 | Collapses to zero (matrix is singular) |
| λ < 0 | Flips direction |
| λ is complex | Rotation involved |
Diagonalisation
A matrix A with n linearly independent eigenvectors can be written as:
A = PDP⁻¹Where:
- P: matrix whose columns are the eigenvectors
- D: diagonal matrix with eigenvalues on the diagonal
This is diagonalisation. Its power: computing Aⁿ (matrix to the nth power) becomes trivial:
Aⁿ = PDⁿP⁻¹And Dⁿ is just each diagonal entry raised to the nth power. This appears in solving recurrences, Fibonacci, Markov chains, and graph problems.
Principal Component Analysis (PCA)
PCA is the most important application of eigenvalues in ML. It finds the directions of maximum variance in data.
Setup: given a dataset represented as an n×d matrix X (n samples, d features), centre it (subtract the mean), then compute the covariance matrix:
C = (1/n) Xᵀ XC is a d×d symmetric matrix. Symmetric matrices always have real eigenvalues and orthogonal eigenvectors.
The eigenvectors of C are the principal components — the directions of maximum variance. The eigenvalues tell you how much variance each direction captures.
# PCA from scratch
X_centered = X - X.mean(axis=0)
C = X_centered.T @ X_centered / len(X)
eigenvalues, eigenvectors = np.linalg.eigh(C)
# Sort by descending eigenvalue
idx = eigenvalues.argsort()[::-1]
eigenvalues = eigenvalues[idx]
eigenvectors = eigenvectors[:, idx]
# Project onto top k components
k = 2
X_reduced = X_centered @ eigenvectors[:, :k]The top k eigenvectors capture the most variance. Projecting onto them reduces dimensions while preserving as much information as possible.
Explained variance: eigenvalue λᵢ / Σλ gives the fraction of variance captured by component i. Plotting the cumulative explained variance (scree plot) shows how many components are needed.
Singular Value Decomposition (SVD)
SVD generalises eigendecomposition to non-square matrices:
A = UΣVᵀWhere:
- U (m×m): left singular vectors — eigenvectors of AAᵀ
- Σ (m×n): diagonal matrix of singular values (always non-negative)
- V (n×n): right singular vectors — eigenvectors of AᵀA
Singular values σᵢ = √λᵢ where λᵢ are eigenvalues of AᵀA.
U, sigma, Vt = np.linalg.svd(A)Applications:
- Image compression: keep only the top k singular values and vectors — low-rank approximation
- Recommendation systems: matrix factorisation for collaborative filtering
- Pseudoinverse: solve least-squares problems (A⁺ = VΣ⁺Uᵀ)
- Dimensionality reduction: PCA is equivalent to SVD on the centred data matrix
PageRank: Eigenvectors of the Web
Google's original PageRank algorithm computes the principal eigenvector of the web's link matrix.
Model the web as a directed graph. Build a transition matrix M where M[j][i] = 1/k if page i has k outgoing links to page j.
The PageRank vector r satisfies:
Mr = rThis is the eigenvector equation with λ = 1. The PageRank of each page is its component in this eigenvector — proportional to how many important pages link to it.
Computing this eigenvector for billions of pages uses the power iteration method:
def power_iteration(M, iterations=100):
r = np.ones(M.shape[0]) / M.shape[0] # start uniform
for _ in range(iterations):
r = M @ r # apply transition
r = r / r.sum() # normalise
return rEach iteration brings r closer to the true eigenvector. Convergence is guaranteed when the largest eigenvalue is 1 (which is engineered by the matrix construction).
Key Takeaways
- An eigenvector v of matrix A satisfies Av = λv — it is only scaled, not rotated.
- Eigenvalues are found by solving det(A - λI) = 0.
- Diagonalisation: A = PDP⁻¹ — makes computing Aⁿ trivial.
- PCA finds the eigenvectors of the covariance matrix — the directions of maximum variance. Eigenvalues quantify how much variance each direction captures.
- SVD (A = UΣVᵀ) generalises eigendecomposition to non-square matrices — used in image compression, recommendation systems, and solving least-squares problems.
- PageRank computes the principal eigenvector of a web link matrix using power iteration.