Variance: Measuring Spread
You know that a probability distribution is a curve describing how plausible different values are. But two distributions can have the same centre (mean) and look completely different — one tight and narrow, the other wide and flat. Variance measures this difference: how spread out the values are.
The idea
Imagine measuring the heights of 100 people. The average might be 170 cm. But are most people between 168 and 172 (low variance), or are they scattered between 150 and 190 (high variance)?
Variance answers: on average, how far is each value from the mean?
The formula
For $n$ values $x_1, x_2, \ldots, x_n$ with mean $\bar{x}$:
$$\text{Var}(X) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
Each term $(x_i - \bar{x})^2$ is the squared distance from the mean. We square it so that values above and below the mean don’t cancel each other out. Then we average.
Standard deviation is just the square root of variance: $\sigma = \sqrt{\text{Var}(X)}$. It has the same units as the original data, which makes it more interpretable (“the standard deviation of heights is 8 cm” is easier to grasp than “the variance is 64 cm²”).
Building intuition
- Variance = 0: Every value is identical. No spread at all.
- Small variance: Values cluster tightly around the mean.
- Large variance: Values are scattered widely.
For a normal distribution (the bell curve): - About 68% of values fall within 1 standard deviation of the mean - About 95% fall within 2 standard deviations - About 99.7% fall within 3
Try it yourself
Each circle is a data point. The dashed line is the mean. The shaded bars show each point’s deviation from the mean. Drag the points left and right to see how the variance and standard deviation change.
Try clustering all points near the mean — variance drops toward zero. Spread them to the edges — variance climbs.
Why this matters
Variance is the building block for almost all multivariate statistics. Covariance measures how two variables vary together. PCA finds the directions of maximum variance in high-dimensional data. Factor analysis models shared variance across many variables. Understanding variance for one variable is the foundation for all of these.