跳转至

9 Binary Search Trees

A binary search tree, or BST, is a data structure that facilitates fast lookup, insert and removal operations. Consider the following decision tree where picking a side forfeits all the possibilities of the other side, cutting the problem in half:

noyesnoyesnoyesnoyesyesnoShould I go the gym?Did I go yesterday?Did I go joggingyesterday?Am I stillfeeling sore?Did I run 5km?Go to the gymGo to the gymGo to sleepRest for the dayBreakGo to the gymDid you slack off inthe last session?

Once you make a decision and choose a branch, there is no looking back. You keep going until you make a final decision at a leaf node. Binary trees let you do the same thing. Specifically, a binary search tree imposes two rules on the binary tree you saw in the previous chapter:

  • The value of a left child must be less than the value of its parent.
  • Consequently, the value of a right child must be greater than or equal to the value of its parent.

Binary search trees use these properties to save you from performing unnecessary checking. As a result, lookup, insert and removal have an average time complexity of O(log n), which is considerably faster than linear data structures such as lists and linked lists.

In this chapter, you’ll learn about the benefits of BST relative to a list and, as usual, implement the data structure from scratch.

List vs. BST

To illustrate the power of using BST, you’ll look at some common operations and compare the performance of lists against the binary search tree.

Consider the following two collections:

125881845440105207770401877120701054254588

Lookup

There’s only one way to do element lookups for an unsorted list. You need to check every element in the list from the start:

125881845440105207770Searching for 105

That’s why list.contains is an O(n) operation.

This is not the case for binary search trees:

401877201058845170425Searching for 105

Every time the search algorithm visits a node in the BST, it can safely make these two assumptions:

  • If the search value is less than the current value, it must be in the left subtree.
  • If the search value is greater than the current value, it must be in the right subtree.

By leveraging the rules of the BST, you can avoid unnecessary checks and cut the search space in half every time you make a decision. That’s why element lookup in BST is an O(log n) operation.

Insertion

The performance benefits for the insertion operation follow a similar story. Assume you want to insert 0 into a collection. Inserting at the front of the list causes all other elements to shift backward by one position. It’s like butting in line. Everyone in the line behind your chosen spot needs to make space for you by shuffling back:

1258818454401052077701025881845440105207770Inserting 0 in sorted order

Inserting into a list has a time complexity of O(n).

Insertion into a binary search tree is much more comforting. By leveraging the rules of BST, you only need to make three traversals in the example below to find the location for the insertion, and you don’t have to shuffle all the elements around!

401877201058845170025

Inserting elements in BST is an O(log n) operation.

Removal

Similar to insertion, removing an element in a list also triggers a shuffling of elements:

1258818454401052077701881845440105207770remove 25Removing 25 from the list

This behavior also goes along with the lineup analogy. If you leave the middle of the line, everyone behind you needs to shuffle forward to take up the empty space.

Here’s what removing a value from a binary search tree looks like:

401877201058845170025

Nice and easy! There are complications to manage when the node you’re removing has children, but you’ll look into that later. Even with those complications, removing an element from a BST is still an O(log n) operation.

Binary search trees drastically reduce the number of steps for add, remove and lookup operations. Now that you understand the benefits of using a binary search tree, you can move on to the actual implementation.

Implementation

Open up the starter project for this chapter. In the lib folder you’ll find binary_node.dart with the BinaryNode type that you created in the previous chapter. Create a new file named binary_search_tree.dart in the same folder and add the following code to it:

import 'binary_node.dart';

class BinarySearchTree<E extends Comparable<dynamic>> {
  BinaryNode<E>? root;

  @override
  String toString() => root.toString();
}

Here are a few things to note:

  • By definition, binary search trees can only hold values that are Comparable.
  • If you prefer you could use Comparable<E> instead of Comparable<dynamic>. However, int doesn’t directly implement Comparable; its superclass num does. That makes it so that users of your class would have to use num when they really want int. Using Comparable<dynamic>, on the other hand, allows them to use int directly.

Next, you’ll look at the insert method.

Inserting Elements

In accordance with BST rules, nodes of the left child must contain values less than the current node. Nodes of the right child must contain values greater than or equal to the current node. You’ll implement the insert method while respecting these rules.

Adding an Insert Method

Add the following to BinarySearchTree:

void insert(E value) {
  root = _insertAt(root, value);
}

BinaryNode<E> _insertAt(BinaryNode<E>? node, E value) {
  // 1
  if (node == null) {
    return BinaryNode(value);
  }
  // 2
  if (value.compareTo(node.value) < 0) {
    node.leftChild = _insertAt(node.leftChild, value);
  } else {
    node.rightChild = _insertAt(node.rightChild, value);
  }
  // 3
  return node;
}

The insert method is exposed to users, while _insertAt will be used as a private helper method:

  1. This is a recursive method, so it requires a base case for terminating recursion. If the current node is null, you’ve found the insertion point and you return the new BinaryNode.
  2. Because element types are comparable, you can perform a comparison. This if statement controls which way the next _insertAt call should traverse. If the new value is less than the current value, that is, if compareTo returns a negative number, you’ll look for an insertion point on the leftchild. If the new value is greater than or equal to the current value, you’ll turn to the right child.
  3. Return the current node. This makes assignments of the form node = _insertAt(node, value) possible as _insertAt will either create node, if it was null, or return node, if it was not null.

Testing it Out

Open bin/starter.dart and replace the contents with the following:

import 'package:starter/binary_search_tree.dart';

void main() {
  final tree = BinarySearchTree<int>();
  for (var i = 0; i < 5; i++) {
    tree.insert(i);
  }
  print(tree);
}

Run the code above and you should see the following output:

   ┌── 4
  ┌──3
  │ └── null
 ┌──2
 │ └── null
┌──1
│ └── null
0
└── null

Balanced vs. Unbalanced Trees

The previous tree looks a bit unbalanced, but it does follow the rules. However, this tree layout has undesirable consequences. When working with trees, you always want to achieve a balanced format:

2310401234balancedunbalanced

An unbalanced tree affects performance. If you insert 5 into the unbalanced tree you’ve created, it becomes an O(n) operation:

5534012345unbalancedbalanced102

You can create structures known as self-balancing trees that use clever techniques to maintain a balanced structure, but you’ll have to wait for those details until Chapter 10, “AVL Trees”. For now, you’ll build a sample tree with a bit of care to keep it from becoming unbalanced.

Building a Balanced Tree

Add the following function below main:

BinarySearchTree<int> buildExampleTree() {
  var tree = BinarySearchTree<int>();
  tree.insert(3);
  tree.insert(1);
  tree.insert(4);
  tree.insert(0);
  tree.insert(2);
  tree.insert(5);
  return tree;
}

Replace the contents of main with the following:

final tree = buildExampleTree();
print(tree);

Run the code. You should see the following in the console:

 ┌── 5
┌──4
│ └── null
3
│ ┌── 2
└──1
 └── 0

Much nicer!

Finding Elements

Finding an element in a binary search tree requires you to traverse through its nodes. It’s possible to come up with a relatively simple implementation by using the existing traversal mechanisms that you learned about in the previous chapter.

Add the following method to BinarySearchTree:

bool contains(E value) {
  if (root == null) return false;
  var found = false;
  root!.traverseInOrder((other) {
    if (value == other) {
      found = true;
    }
  });
  return found;
}

Next, head back to main to test this out:

final tree = buildExampleTree();
if (tree.contains(5)) {
  print("Found 5!");
} else {
  print("Couldn’t find 5");
}

You should see the following in the console:

Found 5!

In-order traversal has a time complexity of O(n). Thus, this implementation of contains has the same time complexity as an exhaustive search through an unsorted list.

You can do better.

Optimizing contains

Relying on the properties of BST can help you avoid needless comparisons. Back in BinarySearchTree, replace contains with the following:

bool contains(E value) {
  // 1
  var current = root;
  // 2
  while (current != null) {
    // 3
    if (current.value == value) {
      return true;
    }
    // 4
    if (value.compareTo(current.value) < 0) {
      current = current.leftChild;
    } else {
      current = current.rightChild;
    }
  }
  return false;
}
  1. Start by setting current to the root node.
  2. As long as current isn’t null, you’ll keep branching through the tree.
  3. If the current node’s value is equal to what you’re trying to find, return true.
  4. Otherwise, decide whether you’re going to check the left or the right child.

This implementation of contains is an O(log n) operation in a balanced binary search tree.

Removing Elements

Removing elements is a little more tricky because you need to handle a few different scenarios.

Removing a Leaf Node

Removing a leaf node is straightforward. Simply detach the leaf node:

312405removing 2

For non-leaf nodes, however, there are extra steps you must take.

Removing Nodes With One Child

When removing nodes with one child, you’ll need to reconnect that child with the rest of the tree:

350124removing 4, which has one child

Removing Nodes With Two Children

Nodes with two children are a bit more complicated, so a more complex example tree will better illustrate how to handle this situation. Assume that you have the following tree and that you want to remove the value 25:

50Remove 25752512376345321727331087

Simply deleting the node presents a dilemma. You have two child nodes (12 and 37) to reconnect, but the parent node only has space for one child:

501237453217273310756387

To solve this problem, you’ll implement a clever workaround by performing a swap. When removing a node with two children, replace the node you removed with the smallest node in its right subtree. Based on the principles of BST, this is the leftmost node of the right subtree:

50Replaced 25752712376345321727331087

It’s important to note that this produces a valid binary search tree. Because the new node was the smallest in the right subtree, all nodes in the right subtree will still be greater than or equal to the new node. And because the new node came from the right subtree, all nodes in the left subtree will be less than the new node.

After performing the swap, you can simply remove the value you copied, just a leaf node.

50752712376345321733108727

This will take care of removing nodes with two children.

Finding the Minimum Node in a Subtree

Open up binary_search_tree.dart. You’ll implement the remove method in just a minute, but first add the following helper extension at the bottom of the file:

extension _MinFinder<E> on BinaryNode<E> {
  BinaryNode<E> get min => leftChild?.min ?? this;
}

This recursive min property on BinaryNode will help you find the minimum node in a subtree.

Implementing remove

Now add these two methods to BinarySearchTree:

void remove(E value) {
  root = _remove(root, value);
}

BinaryNode<E>? _remove(BinaryNode<E>? node, E value) {
  if (node == null) return null;

  if (value == node.value) {
    // more to come
  } else if (value.compareTo(node.value) < 0) {
    node.leftChild = _remove(node.leftChild, value);
  } else {
    node.rightChild = _remove(node.rightChild, value);
  }
  return node;
}

This should look familiar to you. You’re using the same recursive setup with a private helper method as you did for insert. The method isn’t quite finished yet, though. Once you’ve found the node that you want to remove, you still need to separately handle the removal cases for (1) a leaf node, (2) a node with one child, and (3) a node with two children.

Handling the Removal Cases

Replace the // more to come comment above with the following code:

// 1
if (node.leftChild == null && node.rightChild == null) {
  return null;
}
// 2
if (node.leftChild == null) {
  return node.rightChild;
}
if (node.rightChild == null) {
  return node.leftChild;
}
// 3
node.value = node.rightChild!.min.value;
node.rightChild = _remove(node.rightChild, node.value);
  1. If the node is a leaf node, you simply return null, thereby removing the current node.
  2. If the node has no left child, you return node.rightChild to reconnect the right subtree. If the node has no right child, you return node.leftChildto reconnect the left subtree.
  3. This is the case in which the node to be removed has both a left and right child. You replace the node’s value with the smallest value from the right subtree. You then call remove on the right child to remove this swapped value.

Testing it Out

Head back to main and test remove by writing the following:

final tree = buildExampleTree();
print('Tree before removal:');
print(tree);
tree.remove(3);
print('Tree after removing root:');
print(tree);

You should see the output below in the console:

Tree before removal:
 ┌── 5
┌──4
│ └── null
3
│ ┌── 2
└──1
 └── 0

Tree after removing root:
┌── 5
4
│ ┌── 2
└──1
 └── 0

Successfully implemented!

In the next chapter you’ll learn how to create a self-balancing binary search tree called an AVL tree.

Challenges

Think you’ve gotten the hang of binary search trees? Try out these three challenges to lock the concepts down. As usual, you can find the answers in the Challenge Solutions section at the end of the book.

Challenge 1: Binary Tree or Binary Search Tree?

Write a function that checks if a binary tree is a binary search tree.

Challenge 2: Equality

Given two binary trees, how would you test if they are equal or not?

Challenge 3: Is it a Subtree?

Create a method that checks if the current tree contains all the elements of another tree.

Key Points

  • The binary search tree (BST) is a powerful data structure for holding sorted data.
  • Elements of the binary search tree must be comparable. You can achieve this using a generic constraint or by supplying a closure to perform the comparison.
  • The time complexity for insert, remove and contains methods in a BST is O(log n).
  • Performance will degrade to O(n) as the tree becomes unbalanced. This is undesirable, but self-balancing trees such as the AVL tree can overcome the problem.