The Pólya Urn
A third way to build the Dirichlet process — perhaps the most intuitive. The Pólya urn scheme makes the “rich get richer” dynamic viscerally obvious.
The setup
Imagine an urn. Right now it contains no balls, just $\theta$ litres of “base paint.” The paint is a blend of infinitely many colours, mixed according to the base distribution $H$. (If $H$ says “red” is twice as likely as “blue,” the paint contains twice as much red as blue.)
The process
Draw 1: You reach in. Since there are no balls yet, you must draw from the paint. The paint crystallises into a ball of a specific colour — say, red. You put this red ball into the urn. Now the urn has $\theta$ litres of paint and 1 ball.
Draw 2: You reach in again. Each ball counts for as much as one litre of paint, so you might grab the red ball (probability $1 / (1 + \theta)$) or draw from the paint again (probability $\theta / (1 + \theta)$). Suppose you get the red ball. Note its colour. Put it back, plus one additional red ball. Now there are 2 red balls.
Draw 3: The urn has $\theta$ litres of paint and 2 red balls. Probability of drawing a red ball: $2 / (2 + \theta)$. Probability of new colour from paint: $\theta / (2 + \theta)$. Suppose you draw from the paint this time and get blue. Now there are 2 red balls, 1 blue ball, and $\theta$ litres of paint.
And so on. After many draws, the urn is full of balls in various colours, with the common colours vastly outnumbering the rare ones.
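The draws above can be simulated in a few lines. This is a minimal sketch rather than a reference implementation: the function name `polya_urn`, the `base_sampler` callback (a stand-in for drawing a colour from $H$), and the `seed` argument are all assumptions made for illustration. In the usage lines, $H$ is assumed to be uniform on $[0, 1)$, so each number plays the role of a colour.

```python
import random
from collections import Counter


def polya_urn(n_draws, theta, base_sampler, seed=None):
    """Simulate n_draws from an urn holding theta litres of paint.

    base_sampler(rng) is a hypothetical callback that returns one
    colour drawn from the base distribution H.
    """
    if theta <= 0:
        raise ValueError("theta must be positive")
    rng = random.Random(seed)
    balls = []  # one entry per ball currently in the urn
    for _ in range(n_draws):
        n = len(balls)
        # Paint is chosen with probability theta / (n + theta).
        # On the first draw n == 0, so the paint is always chosen.
        if rng.random() < theta / (n + theta):
            colour = base_sampler(rng)   # paint crystallises into a colour
        else:
            colour = rng.choice(balls)   # an existing ball, picked uniformly
        balls.append(colour)             # the urn gains one ball of that colour
    return balls


# Usage: assumed H = Uniform[0, 1), theta = 2, 1000 draws.
draws = polya_urn(1000, theta=2.0, base_sampler=lambda rng: rng.random(), seed=0)
sizes = sorted(Counter(draws).values(), reverse=True)
print("largest colour counts:", sizes[:5])
print("distinct colours:", len(sizes))
```

Storing every ball in a list keeps the code close to the story: picking a ball uniformly at random is automatically proportional to how many balls of each colour are in the urn.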
What makes this click
The urn picture brings out two things:
Rich get richer. Every time you draw a colour, you add another ball of that colour. So common colours become even more common. If the word “the” has been drawn 100 times, there are 100 balls labelled “the” in the urn, so it’s very likely to be drawn again. This is the same self-reinforcing dynamic as the “popular tables attract more customers” pattern.
New colours are always possible. The $\theta$ litres of paint never run out. No matter how many balls accumulate, the paint is always there, offering a chance of something new. With $n$ balls in the urn, the probability of a new colour is $\theta / (n + \theta)$: it shrinks as $n$ grows, but it never hits zero. A word you’ve never seen before always has some chance of appearing next.
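To put numbers on how slowly that chance shrinks, here is a small sketch that evaluates $\theta / (n + \theta)$ and adds it up over successive draws. By linearity of expectation, that sum is the expected number of times the paint is used in the first $n$ draws. The function names are hypothetical, and the sum is a step added here for illustration, not something stated in the text above.

```python
def prob_new_colour(n, theta):
    """Chance that the next draw comes from the paint when the urn holds n balls."""
    return theta / (n + theta)


def expected_paint_draws(n_draws, theta):
    """Expected number of paint draws among the first n_draws draws.

    Draw number i + 1 happens when the urn holds i balls, so the
    expectation is the sum of theta / (i + theta) for i = 0 .. n_draws - 1.
    """
    return sum(prob_new_colour(i, theta) for i in range(n_draws))


# Usage: the chance of something new falls with n but stays positive.
for n in (0, 10, 1_000, 1_000_000):
    print(n, prob_new_colour(n, theta=2.0))

# The running total keeps growing, but far more slowly than the number of draws.
for n in (10, 1_000, 100_000):
    print(n, expected_paint_draws(n, theta=2.0))
```

Because the individual probabilities decay like $\theta / n$, the running total grows roughly logarithmically in the number of draws: a long sequence keeps producing new colours, just at an ever slower rate.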
Three constructions, one process
The table below compares two of them, the stick-breaking construction and the Pólya urn, as different views of the same object:
| Stick-breaking | Pólya urn |
|---|---|
| Breaks a stick into pieces (probabilities) | Fills an urn with balls (observations) |
| Gives you the distribution up front | Gives you samples one at a time |
| $\theta$ controls greediness of breaks | $\theta$ controls rate of new colours |
| Words drawn from $H$ | Colours drawn from $H$ |
Stick-breaking generates the “menu” of word probabilities. The Pólya urn generates words one at a time, as if ordering from that menu without ever writing it down. Both describe the same distribution over sequences of words; they are just different ways of looking at it.
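For contrast with the urn, the “menu” side can be sketched as well. This assumes the usual convention, not restated in this section, that each break takes a $\mathrm{Beta}(1, \theta)$ fraction of whatever stick remains. It also truncates the stick after a fixed number of pieces, which is an approximation made only so the code terminates. The name `stick_breaking_weights` and its parameters are hypothetical.

```python
import random


def stick_breaking_weights(theta, n_pieces, seed=None):
    """Break a unit stick into n_pieces weights, plus the unbroken remainder.

    Assumption: each break removes a Beta(1, theta) fraction of the
    remaining stick. Stopping after n_pieces is a truncation.
    """
    rng = random.Random(seed)
    remaining = 1.0
    weights = []
    for _ in range(n_pieces):
        fraction = rng.betavariate(1.0, theta)  # share of what is left
        weights.append(remaining * fraction)    # this piece's probability
        remaining *= 1.0 - fraction             # what is left for later pieces
    return weights, remaining


# Usage: the menu exists up front; each entry would be paired with a word drawn from H.
weights, leftover = stick_breaking_weights(theta=2.0, n_pieces=20, seed=0)
print("first five weights:", weights[:5])
print("mass not yet assigned:", leftover)
```

The urn never computes these weights explicitly. It only ever sees draws, which is why it hands out samples one at a time while stick-breaking hands over the whole distribution at once.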