CS61B(17): B-Trees(2-3, 2-3-4 Trees)

This is the lecture note of CS61B - Lecture 17.

In today's lecture, we are gonna be primarily concerned with one thing: tree height.

BST Again

The answer is B.

WE can see that performance of operations on spindly trees can be just as bad as a linked list!


Big O is still a useful idea:

  • Allows us to make simple blanket statements, e.g. can just say “binary search is O(log N)” instead of “binary search is Θ(log N) in the worst case”.
  • Sometimes don’t know the exact runtime, so use O to give an upper bound.
    • Example: Runtime for finding shortest route that goes to all world cities is O(2N)*. There might be a faster way, but nobody knows one yet.
  • Easier to write proofs for Big O than Big Theta, e.g. finding runtime of mergesort, you can round up the number of items to the next power of 2 (see A level study guide problems for Asymptotics2 lecture). A little beyond the scope of our course

Let’s now turn to understanding the performance of BST operations.

  • The “height” of a tree determines the worst case runtime to find a node.
    • Example: Worst case is contains(s), requires 5 comparisons (height + 1).
  • The “average depth” determines the average case runtime to find a node.
    • Example: Average case is 3.35 comparisons (average depth + 1).

So an important question is What about Real World BSTs? One way to approximate this is to consider randomized BSTs.

B-Tree

So, to avoid spindly trees, we're going to invent a very close cousin to the BST which is called B-Tree.

However, this idea also has a problem that maybe you will meet a leaf with high O(N).

Basix Insertion

Before we continue, let's first do a quiz to exam your understanding.

What we've talked is leaves. Then, what about non-leaf nodes? What if a non-leaf node gets too full? Can we split that?

The answer is yes!

Split Non-leaf Nodes

Observation: Splitting-trees have perfect balance.

  • If we split the root, every node gets pushed down by exactly one level.
  • If we split a leaf node or internal node, the height doesn’t change.

Runtime Analysis

Now, we've know that B-Tree is bushy, let's do runtime analysis.

Summary

Extra: Deletion

See Extra Slides