EECS 311: Red-Black Trees

Red-black trees are binary search trees. Like AVL trees, we keep a little extra bookkeeping information on each node, but very little -- just one bit, in fact. If this bit is on, we say that the node has a red link to its parent. If it's off, the node has a black link to its parent.

Maintaining a red-black tree as new nodes are added primarily involves recoloring and rotation, as follows:

Create a new node n to hold the value.
If the tree is empty, make n the root.
Otherwise, go left or right, as with normal insertion in a binary search tree, except that if you pass through a node m with red links to both its children,
- color those links black, and
- if m is not the root, color m's parent link red.
At the appropriate leaf, add n as a child with a red link to its parent.
If either of the steps that adds red links creates 2 red links in a row, rotate the associated nodes to create a node with 2 red links to its children.

There are four cases where rotations have to occur:

If the red links are both right links, rotate up and left, like this:

Note that the parent link colors of n1 and n2 have flipped.
If the red links are right then left, rotate up and right, then up and left, like this:

Note that the second rotation here is exactly the same as in the previous case. n1 and n3 flip their parent link colors.
If the two red links are both left, or left then right, do the mirror images of the above. These symmetrical patterns are left as an exercise for the reader.

Analysis

There are 3 things you have to verify about this and any algorithm:

Completeness: are all situations covered? for example, there's no rule for what to do if you find a red link to a node with two red links to its children; you need to prove that can never arise;
Correctness: does the algorithm preserve the properties of a binary search tree?
Complexity: what is the O() complexity of this algorithm for insertion and retrieval?

The first two can be handled inductively, by showing that every step preserves the properties of binary search trees. Therefore, if the tree starts correctly, it will stay correct.

An upper bound on the complexity can be found by noting two invariant properties that the above algorithm maintains:

There are never two red links in a row.
The number of black links from the root to any leaf is the same.

From these two properties, it follows that the paths are always O(log n) long.

Simulation

There's an interactive Java animation that shows the operation of red-black trees at http://reptar.uta.edu/NOTES5311/REDBLACK/RedBlack.html. Note that:

It makes the nodes red and black rather than the links. The algorithm works the same. It's just the drawings and the language that are different.
The first kind of rotation, above, takes place in two steps: change colors, then rotate.
When the root node becomes red, the simulation recolors it black as a distinct step.

A good way to understand how red-black trees work is to play with this simulation until you can always correctly predict what it's going to do when you hit the "Next" button.

Implementation

Implementation of red-black trees for insertion is mostly a matter of carefully handling all the rotations. If your tree nodes do not have links to their parents, then, while your code descends the tree, it needs to pass along as parameters the most recent 4 nodes in the path from root to leaf, i.e., a node, its parent, its grandparent, and its great-grandparent.

Deletion is more complex, though still O(log N).

Comments? Send mail to c-riesbeck@northwestern.edu.