9 Binary Search Trees¶
A binary search tree, or BST, is a data structure that facilitates fast lookup, insert and removal operations. Consider the following decision tree where picking a side forfeits all the possibilities of the other side, cutting the problem in half:
noyesnoyesnoyesnoyesyesnoShould I go the gym?Did I go yesterday?Did I go joggingyesterday?Am I stillfeeling sore?Did I run 5km?Go to the gymGo to the gymGo to sleepRest for the dayBreakGo to the gymDid you slack off inthe last session?
Once you make a decision and choose a branch, there is no looking back. You keep going until you make a final decision at a leaf node. Binary trees let you do the same thing. Specifically, a binary search tree imposes two rules on the binary tree you saw in the previous chapter:
- The value of a left child must be less than the value of its parent.
- Consequently, the value of a right child must be greater than or equal to the value of its parent.
Binary search trees use these properties to save you from performing unnecessary checking. As a result, lookup, insert and removal have an average time complexity of O(log n), which is considerably faster than linear data structures such as lists and linked lists.
In this chapter, you’ll learn about the benefits of BST relative to a list and, as usual, implement the data structure from scratch.
List vs. BST¶
To illustrate the power of using BST, you’ll look at some common operations and compare the performance of lists against the binary search tree.
Consider the following two collections:
125881845440105207770401877120701054254588
Lookup¶
There’s only one way to do element lookups for an unsorted list. You need to check every element in the list from the start:
125881845440105207770Searching for 105
That’s why list.contains
is an O(n) operation.
This is not the case for binary search trees:
401877201058845170425Searching for 105
Every time the search algorithm visits a node in the BST, it can safely make these two assumptions:
- If the search value is less than the current value, it must be in the left subtree.
- If the search value is greater than the current value, it must be in the right subtree.
By leveraging the rules of the BST, you can avoid unnecessary checks and cut the search space in half every time you make a decision. That’s why element lookup in BST is an O(log n) operation.
Insertion¶
The performance benefits for the insertion operation follow a similar story. Assume you want to insert 0 into a collection. Inserting at the front of the list causes all other elements to shift backward by one position. It’s like butting in line. Everyone in the line behind your chosen spot needs to make space for you by shuffling back:
1258818454401052077701025881845440105207770Inserting 0 in sorted order
Inserting into a list has a time complexity of O(n).
Insertion into a binary search tree is much more comforting. By leveraging the rules of BST, you only need to make three traversals in the example below to find the location for the insertion, and you don’t have to shuffle all the elements around!
401877201058845170025
Inserting elements in BST is an O(log n) operation.
Removal¶
Similar to insertion, removing an element in a list also triggers a shuffling of elements:
1258818454401052077701881845440105207770remove 25Removing 25 from the list
This behavior also goes along with the lineup analogy. If you leave the middle of the line, everyone behind you needs to shuffle forward to take up the empty space.
Here’s what removing a value from a binary search tree looks like:
401877201058845170025
Nice and easy! There are complications to manage when the node you’re removing has children, but you’ll look into that later. Even with those complications, removing an element from a BST is still an O(log n) operation.
Binary search trees drastically reduce the number of steps for add, remove and lookup operations. Now that you understand the benefits of using a binary search tree, you can move on to the actual implementation.
Implementation¶
Open up the starter project for this chapter. In the lib folder you’ll find binary_node.dart with the BinaryNode
type that you created in the previous chapter. Create a new file named binary_search_tree.dart in the same folder and add the following code to it:
import 'binary_node.dart';
class BinarySearchTree<E extends Comparable<dynamic>> {
BinaryNode<E>? root;
@override
String toString() => root.toString();
}
Here are a few things to note:
- By definition, binary search trees can only hold values that are
Comparable
. - If you prefer you could use
Comparable<E>
instead ofComparable<dynamic>
. However,int
doesn’t directly implementComparable
; its superclassnum
does. That makes it so that users of your class would have to usenum
when they really wantint
. UsingComparable<dynamic>
, on the other hand, allows them to useint
directly.
Next, you’ll look at the insert
method.
Inserting Elements¶
In accordance with BST rules, nodes of the left child must contain values less than the current node. Nodes of the right child must contain values greater than or equal to the current node. You’ll implement the insert
method while respecting these rules.
Adding an Insert Method¶
Add the following to BinarySearchTree
:
void insert(E value) {
root = _insertAt(root, value);
}
BinaryNode<E> _insertAt(BinaryNode<E>? node, E value) {
// 1
if (node == null) {
return BinaryNode(value);
}
// 2
if (value.compareTo(node.value) < 0) {
node.leftChild = _insertAt(node.leftChild, value);
} else {
node.rightChild = _insertAt(node.rightChild, value);
}
// 3
return node;
}
The insert
method is exposed to users, while _insertAt
will be used as a private helper method:
- This is a recursive method, so it requires a base case for terminating recursion. If the current node is
null
, you’ve found the insertion point and you return the newBinaryNode
. - Because element types are comparable, you can perform a comparison. This
if
statement controls which way the next_insertAt
call should traverse. If the new value is less than the current value, that is, ifcompareTo
returns a negative number, you’ll look for an insertion point on the leftchild. If the new value is greater than or equal to the current value, you’ll turn to the right child. - Return the current node. This makes assignments of the form
node = _insertAt(node, value)
possible as_insertAt
will either createnode
, if it wasnull
, or returnnode
, if it was notnull
.
Testing it Out¶
Open bin/starter.dart and replace the contents with the following:
import 'package:starter/binary_search_tree.dart';
void main() {
final tree = BinarySearchTree<int>();
for (var i = 0; i < 5; i++) {
tree.insert(i);
}
print(tree);
}
Run the code above and you should see the following output:
┌── 4
┌──3
│ └── null
┌──2
│ └── null
┌──1
│ └── null
0
└── null
Balanced vs. Unbalanced Trees¶
The previous tree looks a bit unbalanced, but it does follow the rules. However, this tree layout has undesirable consequences. When working with trees, you always want to achieve a balanced format:
2310401234balancedunbalanced
An unbalanced tree affects performance. If you insert 5 into the unbalanced tree you’ve created, it becomes an O(n) operation:
5534012345unbalancedbalanced102
You can create structures known as self-balancing trees that use clever techniques to maintain a balanced structure, but you’ll have to wait for those details until Chapter 10, “AVL Trees”. For now, you’ll build a sample tree with a bit of care to keep it from becoming unbalanced.
Building a Balanced Tree¶
Add the following function below main
:
BinarySearchTree<int> buildExampleTree() {
var tree = BinarySearchTree<int>();
tree.insert(3);
tree.insert(1);
tree.insert(4);
tree.insert(0);
tree.insert(2);
tree.insert(5);
return tree;
}
Replace the contents of main
with the following:
final tree = buildExampleTree();
print(tree);
Run the code. You should see the following in the console:
┌── 5
┌──4
│ └── null
3
│ ┌── 2
└──1
└── 0
Much nicer!
Finding Elements¶
Finding an element in a binary search tree requires you to traverse through its nodes. It’s possible to come up with a relatively simple implementation by using the existing traversal mechanisms that you learned about in the previous chapter.
Add the following method to BinarySearchTree
:
bool contains(E value) {
if (root == null) return false;
var found = false;
root!.traverseInOrder((other) {
if (value == other) {
found = true;
}
});
return found;
}
Next, head back to main
to test this out:
final tree = buildExampleTree();
if (tree.contains(5)) {
print("Found 5!");
} else {
print("Couldn’t find 5");
}
You should see the following in the console:
Found 5!
In-order traversal has a time complexity of O(n). Thus, this implementation of contains
has the same time complexity as an exhaustive search through an unsorted list.
You can do better.
Optimizing contains¶
Relying on the properties of BST can help you avoid needless comparisons. Back in BinarySearchTree
, replace contains
with the following:
bool contains(E value) {
// 1
var current = root;
// 2
while (current != null) {
// 3
if (current.value == value) {
return true;
}
// 4
if (value.compareTo(current.value) < 0) {
current = current.leftChild;
} else {
current = current.rightChild;
}
}
return false;
}
- Start by setting
current
to theroot
node. - As long as
current
isn’tnull
, you’ll keep branching through the tree. - If the current node’s
value
is equal to what you’re trying to find, returntrue
. - Otherwise, decide whether you’re going to check the left or the right child.
This implementation of contains
is an O(log n) operation in a balanced binary search tree.
Removing Elements¶
Removing elements is a little more tricky because you need to handle a few different scenarios.
Removing a Leaf Node¶
Removing a leaf node is straightforward. Simply detach the leaf node:
312405removing 2
For non-leaf nodes, however, there are extra steps you must take.
Removing Nodes With One Child¶
When removing nodes with one child, you’ll need to reconnect that child with the rest of the tree:
350124removing 4, which has one child
Removing Nodes With Two Children¶
Nodes with two children are a bit more complicated, so a more complex example tree will better illustrate how to handle this situation. Assume that you have the following tree and that you want to remove the value 25:
50Remove 25752512376345321727331087
Simply deleting the node presents a dilemma. You have two child nodes (12 and 37) to reconnect, but the parent node only has space for one child:
501237453217273310756387
To solve this problem, you’ll implement a clever workaround by performing a swap. When removing a node with two children, replace the node you removed with the smallest node in its right subtree. Based on the principles of BST, this is the leftmost node of the right subtree:
50Replaced 25752712376345321727331087
It’s important to note that this produces a valid binary search tree. Because the new node was the smallest in the right subtree, all nodes in the right subtree will still be greater than or equal to the new node. And because the new node came from the right subtree, all nodes in the left subtree will be less than the new node.
After performing the swap, you can simply remove the value you copied, just a leaf node.
50752712376345321733108727
This will take care of removing nodes with two children.
Finding the Minimum Node in a Subtree¶
Open up binary_search_tree.dart. You’ll implement the remove
method in just a minute, but first add the following helper extension at the bottom of the file:
extension _MinFinder<E> on BinaryNode<E> {
BinaryNode<E> get min => leftChild?.min ?? this;
}
This recursive min
property on BinaryNode
will help you find the minimum node in a subtree.
Implementing remove¶
Now add these two methods to BinarySearchTree
:
void remove(E value) {
root = _remove(root, value);
}
BinaryNode<E>? _remove(BinaryNode<E>? node, E value) {
if (node == null) return null;
if (value == node.value) {
// more to come
} else if (value.compareTo(node.value) < 0) {
node.leftChild = _remove(node.leftChild, value);
} else {
node.rightChild = _remove(node.rightChild, value);
}
return node;
}
This should look familiar to you. You’re using the same recursive setup with a private helper method as you did for insert
. The method isn’t quite finished yet, though. Once you’ve found the node that you want to remove, you still need to separately handle the removal cases for (1) a leaf node, (2) a node with one child, and (3) a node with two children.
Handling the Removal Cases¶
Replace the // more to come
comment above with the following code:
// 1
if (node.leftChild == null && node.rightChild == null) {
return null;
}
// 2
if (node.leftChild == null) {
return node.rightChild;
}
if (node.rightChild == null) {
return node.leftChild;
}
// 3
node.value = node.rightChild!.min.value;
node.rightChild = _remove(node.rightChild, node.value);
- If the node is a leaf node, you simply return
null
, thereby removing the current node. - If the node has no left child, you return
node.rightChild
to reconnect the right subtree. If the node has no right child, you returnnode.leftChild
to reconnect the left subtree. - This is the case in which the node to be removed has both a left and right child. You replace the node’s value with the smallest value from the right subtree. You then call
remove
on the right child to remove this swapped value.
Testing it Out¶
Head back to main
and test remove
by writing the following:
final tree = buildExampleTree();
print('Tree before removal:');
print(tree);
tree.remove(3);
print('Tree after removing root:');
print(tree);
You should see the output below in the console:
Tree before removal:
┌── 5
┌──4
│ └── null
3
│ ┌── 2
└──1
└── 0
Tree after removing root:
┌── 5
4
│ ┌── 2
└──1
└── 0
Successfully implemented!
In the next chapter you’ll learn how to create a self-balancing binary search tree called an AVL tree.
Challenges¶
Think you’ve gotten the hang of binary search trees? Try out these three challenges to lock the concepts down. As usual, you can find the answers in the Challenge Solutions section at the end of the book.
Challenge 1: Binary Tree or Binary Search Tree?¶
Write a function that checks if a binary tree is a binary search tree.
Challenge 2: Equality¶
Given two binary trees, how would you test if they are equal or not?
Challenge 3: Is it a Subtree?¶
Create a method that checks if the current tree contains all the elements of another tree.
Key Points¶
- The binary search tree (BST) is a powerful data structure for holding sorted data.
- Elements of the binary search tree must be comparable. You can achieve this using a generic constraint or by supplying a closure to perform the comparison.
- The time complexity for
insert
,remove
andcontains
methods in a BST is O(log n). - Performance will degrade to O(n) as the tree becomes unbalanced. This is undesirable, but self-balancing trees such as the AVL tree can overcome the problem.