EECS 311: Balanced Binary Search Trees

This is a brief review of a couple of the many kinds of balanced search trees that have been developed.

AVL (Height-balanced) Trees

An AVL tree is a binary search tree. The key ideas are

We define a balance factor for each node:

height( left subtree) - height(right subtree)

A tree is an AVL tree if and only if:

If this bound is maintained, then it can be shown that the tree will give the desired N log2N retrieval property that we desire.

The trick is maintaining this bound with no more than log2N work when inserting new nodes. This is done using rotations. The rules for insertion are basically:

There are 4 possible cases of rotation that might be required, and 2 of the cases have subcases. There's a symmetry between left and right rotations.

You should understand how each case is defined and which rotation is appropriate. Sometimes, there may be several ways to fix an unbalanced AVL tree, but you need to know which of those rotations the algorithm does. See the textbook for more information.

There are quite a few online interactive animations for AVL search trees. I like this one, and this one. But I find none of them are as helpful for learning as simply building trees yourself by hand. Or, if you prefer, watching someone else do it by hand.

Red-black trees

How red-black trees work is described in detail here and in the textbook.

Red-black trees are like AVL trees in that they

They differ in what that invariant is and what bookkeeping information is kept. Instead of two numbers on each node representing the heights of the subtrees, in red-black trees just one bit is kept, marking how a child node links to a parent node. By convention, a marked link is called a red link, an unmarked link is called black.

The invariant property is that

From this it follows that the tree can't be more than 2 log N deep, and normally it's much better than that.

Because there's so much less bookkeeping data, red-black trees in general perform better than other self-balancing binary search tree algorithms and are the algorithm of choice for things like the STL's set and map data structures.

2-3-4 Trees

2-3-4 trees are search trees, but not binary search trees. While 2-3-4 trees themselves are rarely used in practice, they have two interesting properties that appear in other commonly used data structures:

N-way search trees and node splitting is a general idea that is central in B-trees and similar data structures that are used in implementing large database management systems. Such trees are particularly useful for managing large amounts of data stored on external media, such as disks.

An interesting side point is that even though there are no rotations in 2-3-4 trees, it turns out that operations in building a 2-3-4 tree can be mapped one-to-one to operations in building a red-black tree. That is, a red-black tree with rotations and coloring can be viewed as a way to implement 2-3-4 trees using binary nodes.

The key ideas in 2-3-4 trees are:

Each node in a 2-3-4 tree has 1, 2 or 3 keys. The keys are ordered in the node, and the subtrees follow these rules:

Searching a 2-3-4 tree is pretty much like searching a binary search tree. The only difference is that we compare the input data item against the keys in the node first to determine which subtree to follow (left, left of middle, right of middle, or right). The logic for this is straightforward.

The real differences arise when we add data to a 2-3-4 tree. The key operation is node splitting. When we adding data, as we go down the tree,

Question for the reader: Why does a 2-3-4 tree stay balanced?

Mapping 2-3-4 trees to red-black trees

Red-black trees can be seen as a way of representing 2-3-4 trees as binary trees. The basic idea is simple:

Intuitively, the red links capture the same information that we see when a 2-3-4 node gets full.

The three kinds of nodes in a 2-3-4 tree therefore become the following kinds of nodes in a red-black tree:

When the 2-3-4 algorithm

For example, let's compare adding the numbers 1, 2, 3, ... to a 2-3-4 and a red-black tree.

Insert2-3-4 OperationsRed-black Operations
1 Put in leftmost key of root node Put in root node
2 Put in rightmost key of root node Make 2 a red right child of 1
3 Shift 2 left, put 3 in rightmost key of root node Make 3 a red right child of 2. Because there are two red links in a row, rotate left. Note: the red links join the keys in the 2-3-4 node.
4 Split the full node. Make a root node with 2, a left child with 1, and a right 3-way child with 3 and 4 Recolor the two red children black. Make 4 a red right child of 3. Note: the red link joins the keys in the corresponding 3-way node.

This should be enough for you to see how to keep going with more numbers. Notice how the counter-intuitive "make 2 red siblings black" makes perfect sense when viewed as the result of a node split.


Comments? comment icon Send mail to c-riesbeck@northwestern.edu.

Valid HTML 4.01 Strict