# Matrices
A matrix is a rectangular grid of numbers:
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$$
This is a $2 \times 3$ matrix: 2 rows and 3 columns. The entry in row $i$, column $j$ is written $A_{ij}$.
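In NumPy terms (an illustration, not part of the original text), the shape of a matrix is exactly this (rows, columns) pair, and entries are indexed by row then column (zero-indexed):

```python
import numpy as np

# The 2x3 matrix from above
A = np.array([[1, 2, 3],
              [4, 5, 6]])

print(A.shape)   # (2, 3): 2 rows, 3 columns
print(A[0, 2])   # entry in row 1, column 3 (zero-indexed): 3
```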
## Three ways to think about a matrix
- A table of data. Each row is a data point; each column is a variable. A corpus of 500 texts measured on 67 features is a $500 \times 67$ matrix.
- A collection of vectors. The rows are row vectors; the columns are column vectors. You can think of a matrix as stacking vectors together.
- A transformation. A matrix is a machine: put a vector in, get a different vector out. This is the most powerful perspective, and the one that connects to eigenvalues.
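The third perspective can be sketched in a few lines of NumPy (my example, not from the original): multiplying a matrix by a vector feeds the vector through the machine.

```python
import numpy as np

# A matrix as a transformation: put a vector in, get a vector out.
A = np.array([[2.0, 0.0],
              [0.0, 3.0]])   # stretches x by 2 and y by 3
v = np.array([1.0, 1.0])

w = A @ v   # apply the transformation
print(w)    # [2. 3.]
```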
## Special matrices
Square matrix: Same number of rows and columns ($n \times n$). Covariance matrices are always square.
Symmetric matrix: Equals its own transpose — the mirror image across the diagonal. Formally: $A_{ij} = A_{ji}$ for all $i, j$. Equivalently, $A = A^T$.
$$\text{Symmetric: } \begin{pmatrix} 4 & 2 & 1 \\ 2 & 5 & 3 \\ 1 & 3 & 6 \end{pmatrix} \qquad \text{Not symmetric: } \begin{pmatrix} 4 & 2 & 1 \\ 7 & 5 & 3 \\ 0 & 3 & 6 \end{pmatrix}$$
Symmetric matrices have beautiful properties: all their eigenvalues are real, and their eigenvectors are orthogonal. Covariance matrices are always symmetric.
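Both properties are easy to verify numerically. A quick check with NumPy (an illustration; `np.linalg.eigh` is the routine designed for symmetric matrices):

```python
import numpy as np

# The symmetric matrix from above
S = np.array([[4.0, 2.0, 1.0],
              [2.0, 5.0, 3.0],
              [1.0, 3.0, 6.0]])

print(np.allclose(S, S.T))   # True: S equals its transpose

# eigh returns real eigenvalues and orthonormal eigenvectors
eigenvalues, eigenvectors = np.linalg.eigh(S)

# Orthogonal eigenvectors: V^T V is the identity
print(np.allclose(eigenvectors.T @ eigenvectors, np.eye(3)))   # True
```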
Identity matrix: The $n \times n$ matrix with 1s on the diagonal and 0s everywhere else:
$$I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
The identity is the “do nothing” transformation: $I\mathbf{v} = \mathbf{v}$ for any vector. It’s the matrix equivalent of multiplying by 1.
Diagonal matrix: Nonzero entries only on the diagonal. A diagonal matrix scales each dimension independently — stretch one axis by 3, another by 0.5, leave the third unchanged.
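That independent-scaling behavior, and the "do nothing" identity, can both be seen directly (a small sketch, not from the original):

```python
import numpy as np

v = np.array([1.0, 1.0, 1.0])

# Identity: do-nothing transformation
I = np.eye(3)
print(I @ v)   # [1. 1. 1.] -- unchanged

# Diagonal: stretch one axis by 3, another by 0.5, leave the third alone
D = np.diag([3.0, 0.5, 1.0])
print(D @ v)   # [3.  0.5 1. ]
```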
## The transpose
The transpose $A^T$ flips a matrix across its diagonal: rows become columns and columns become rows. If $A$ is $m \times n$, then $A^T$ is $n \times m$.
$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix} \implies A^T = \begin{pmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{pmatrix}$$
The transpose appears everywhere: in the definition of symmetric matrices ($A = A^T$), in computing covariance matrices, and in the normal equations for linear regression.
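The same example in NumPy, where the transpose is just the `.T` attribute (an illustration of the $m \times n \to n \times m$ shape flip):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])

print(A.T)                  # rows become columns
print(A.shape, A.T.shape)   # (3, 2) (2, 3)
```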