Eigenvalues and Eigenvectors
Most vectors get transformed by a matrix into something that points in a completely different direction. But some special vectors only get stretched (or shrunk) — their direction doesn’t change. These special vectors are the key to unlocking the structure hidden inside any matrix.
The definition
An eigenvector of a matrix $A$ is a nonzero vector $\mathbf{v}$ such that:
$$A\mathbf{v} = \lambda \mathbf{v}$$
The matrix $A$ applied to $\mathbf{v}$ produces a vector pointing in the same direction, just scaled by some factor $\lambda$. That scaling factor is the eigenvalue.
In words: the matrix sends most vectors off in new directions, but eigenvectors keep their direction — they only get longer or shorter. The eigenvalue tells you by how much.
A concrete example
$$A = \begin{pmatrix} 3 & 1 \\ 0 & 2 \end{pmatrix}$$
Try the vector $\mathbf{v}_1 = (1, 0)^T$:
$$A \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 3 \\ 0 \end{pmatrix} = 3 \begin{pmatrix} 1 \\ 0 \end{pmatrix}$$
Same direction, scaled by 3. So $(1, 0)^T$ is an eigenvector with eigenvalue $\lambda_1 = 3$.
Try $\mathbf{v}_2 = (1, -1)^T$:
$$A \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 2 \\ -2 \end{pmatrix} = 2 \begin{pmatrix} 1 \\ -1 \end{pmatrix}$$
Same direction, scaled by 2. Eigenvalue $\lambda_2 = 2$.
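You can check both eigen-pairs numerically. The sketch below uses NumPy's `np.linalg.eig`; note that NumPy returns eigenvectors normalized to unit length, so they may differ from the hand-picked ones by a scalar factor:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [0.0, 2.0]])

# Verify A v = lambda v for the two eigenvectors found by hand
v1 = np.array([1.0, 0.0])
v2 = np.array([1.0, -1.0])
assert np.allclose(A @ v1, 3 * v1)   # eigenvalue 3
assert np.allclose(A @ v2, 2 * v2)   # eigenvalue 2

# NumPy finds the same eigenvalues
eigvals, eigvecs = np.linalg.eig(A)
print(np.sort(eigvals))  # [2. 3.]
```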
A $2 \times 2$ matrix has at most 2 distinct eigenvalues and at most 2 linearly independent eigenvectors; a $p \times p$ matrix has at most $p$ of each. (Any nonzero scalar multiple of an eigenvector is also an eigenvector, so we really count eigenvector *directions*.)
Try it yourself
The grey arrow is your test vector v. The orange arrow is Av — where the matrix sends it. The dashed lines show the eigenvector directions. Drag v onto a dashed line and watch: Av snaps to the same direction as v, just scaled. That’s what “eigenvector” means — the direction that survives the transformation.
Try the “Symmetric” preset — the eigenvectors are perpendicular (orthogonal). That’s always true for symmetric matrices, including covariance matrices.
Why they matter: the spectral theorem
For symmetric matrices (and covariance matrices are always symmetric), a beautiful result holds:
- All eigenvalues are real (not complex)
- Eigenvectors corresponding to different eigenvalues are orthogonal (perpendicular)
- The matrix can be completely reconstructed from its eigenvalues and eigenvectors
This means a symmetric matrix is just a set of perpendicular directions, each with a scaling factor. The matrix stretches space along those directions — some more than others.
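All three claims can be demonstrated in a few lines. This sketch uses an arbitrary symmetric matrix for illustration, and `np.linalg.eigh`, NumPy's routine specifically for symmetric matrices:

```python
import numpy as np

# An arbitrary symmetric matrix (a covariance matrix would work the same way)
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is for symmetric matrices: real eigenvalues, orthonormal eigenvectors
lam, Q = np.linalg.eigh(S)

# 1. Eigenvalues are real (eigh returns a real array by construction)
# 2. Eigenvectors are orthogonal: Q^T Q = I
assert np.allclose(Q.T @ Q, np.eye(2))

# 3. The matrix is fully rebuilt from its eigen-pairs: S = Q diag(lam) Q^T
assert np.allclose(Q @ np.diag(lam) @ Q.T, S)
```

The last assertion is the spectral theorem in action: the eigenvalues and eigenvectors together contain everything the matrix knows.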
What eigenvalues tell you about data
When the matrix is a covariance matrix, the eigenvectors and eigenvalues have a direct interpretation:
- Each eigenvector is a direction in feature space — a particular combination of the original variables
- Each eigenvalue is the variance along that direction — how much the data spreads out that way
- The eigenvector with the largest eigenvalue points in the direction of maximum variance
- The eigenvector with the smallest eigenvalue points in the direction of minimum variance
If you have 67 linguistic features, the covariance matrix is $67 \times 67$ with 67 eigenvectors. Each eigenvector is a “dimension” — a pattern of co-occurrence among features. The eigenvalue tells you how much variance that dimension explains. The largest eigenvalues correspond to the strongest patterns in the data.
This is exactly what PCA does: find the eigenvectors of the covariance matrix, rank them by eigenvalue, and keep the top few. Those are the principal components.
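Those steps can be sketched directly. The data below is random toy data standing in for a real feature matrix (5 features rather than 67, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# toy data: 200 samples, 5 features (stand-in for the 67 linguistic features)
X = rng.normal(size=(200, 5))

# covariance matrix of the features (5 x 5, symmetric)
C = np.cov(X, rowvar=False)

# eigh returns eigenvalues in ascending order; reorder to descending
lam, vecs = np.linalg.eigh(C)
order = np.argsort(lam)[::-1]
lam, vecs = lam[order], vecs[:, order]

# keep the top k eigenvectors (principal components) and
# project the centered data onto them
k = 2
components = vecs[:, :k]                      # each column is one component
scores = (X - X.mean(axis=0)) @ components    # data in the new coordinates

print(scores.shape)  # (200, 2)
```

Each column of `scores` is the data expressed along one principal component, and `lam[:k]` tells you how much variance each of those directions explains.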