Trees: Binary Search Trees

Anatomy of Binary Search Tree

A Binary Search Tree is a binary tree data structure where each node has at most two children, and for each node, all elements in its left subtree are less than or equal to the node, and all elements in its right subtree are greater.

        4
       / \
      2   6
     / \ / \
    1  3 5  7

Understanding Binary Search Tree

A Binary Search Tree (BST) is a hierarchical data structure consisting of nodes, where each node has at most two children: a left child and a right child. The BST maintains a sorted order of its elements based on the following property:

  • Ordering Property: For every node n, all nodes in the left subtree of n have values less than n, and all nodes in the right subtree of n have values greater than n.

Because of this ordering property, BSTs naturally organize data in a sorted manner. As elements are inserted into or deleted from the tree, the structure adjusts dynamically to maintain the sorting order.

1 It is called a binary tree because each tree node has a maximum of two children.

As mentioned, a Binary Search Tree is a type of binary tree, where each node can have at most two children. The term "binary" refers to the maximum number of children allowed per node. This binary structure provides a clear and efficient way to organize and search through data.

2 It is called a search tree because it can be used to search for the presence of a number in O(log(n)) time

The most significant advantage of a Binary Search Tree is its efficient search operation. Due to its sorted structure and the ordering property mentioned earlier, BSTs enable rapid searching for a specific element.

  • Search Operation: Starting from the root node, at each step of the search process, the algorithm compares the target value with the value of the current node. Based on the comparison, it determines whether to continue the search in the left subtree, the right subtree, or if the current node matches the target.
  • Time Complexity: In a balanced Binary Search Tree, the average-case time complexity of searching for an element is O(log n), where n is the number of nodes in the tree. This logarithmic time complexity arises from the binary search nature of the algorithm, where the search space is effectively halved at each step.
  • Advantage: O(log n) time complexity provides efficient search performance even for large datasets. This makes BSTs suitable for applications requiring fast retrieval of information, such as databases, symbol tables, and search algorithms.

Key Properties of Binary Search Trees:

1. Ordering:

  • Nodes in the left subtree of a node contain values less than to the node's value.
  • Nodes in the right subtree of a node contain values greater than the node's value.

2. Unique Values:

  • Each node in a BST contains a unique value.
  • This uniqueness simplifies searching and ensures a clear hierarchical structure.

Operations on Binary Search Trees:

1. Insertion

Algorithm:

  1. Start at the root.
  2. If the value to be inserted is less than the current node's value, move to the left subtree.
  3. If greater, move to the right subtree.
  4. Repeat until an empty spot is found, then insert the new node.

Implementation:

void insert(TreeNode*& root, int value) {
    if (root == nullptr) {
        root = new TreeNode(value);
    } else if (value < root->data) {
        insert(root->left, value);
    } else {
        insert(root->right, value);
    }
}

2. Searching:

Algorithm:

  1. Start at the root.
  2. If the value is equal to the current node's value, return the node.
  3. If less, move to the left subtree; otherwise, move to the right subtree.
  4. Repeat until the value is found or a leaf node is reached.

Implementation:

bool search(TreeNode* root, int value) {
    if (root == nullptr) {
        return false;
    }
    if (root->data == value) {
        return true;
    } else if (value < root->data) {
        return search(root->left, value);
    } else {
        return search(root->right, value);
    }
}

3. Deletion:

Deletion in a BST involves handling three case: a node with no children, a node with one child, and a node with two children.

4. Traversal:

Traversal of a BST can be performed in various orders: in-order, pre-order, and post-order.

In-order Traversal (sorted order):

void inorderTraversal(TreeNode* root) {
    if (root != nullptr) {
        inorderTraversal(root->left);
        // Process or print the current node's data.
        inorderTraversal(root->right);
    }
}

Why Balancing is required in BST

Balancing in binary search trees is essential to maintain efficient search, insertion, and deletion operations. When a binary search tree becomes unbalanced, the tree's height can increase significantly, leading to degraded performance, where the time complexity of operations may no longer remain logarithmic.

Let's consider an example to understand why balancing is required:

Suppose we have a binary search tree where elements are inserted in ascending order:

    1
     \
      2
       \
        3
         \
          4
           \
            5
             \
              6

In this unbalanced tree, all elements are inserted as right children. As a result, the tree degenerates into a linked list structure. When performing a search operation, for instance, to find element 6, we would need to traverse through all previous nodes, resulting in a time complexity of O(n), where n is the number of elements in the tree. This defeats the purpose of using a binary search tree, which ideally offers O(log n) time complexity for search operations.

To maintain efficiency, balancing techniques like rotation, as seen in AVL trees or Red-Black trees, are employed. These techniques ensure that the tree remains relatively balanced, keeping the height of the tree minimized and preserving the logarithmic time complexity for search operations.

Let's visualize the same elements inserted into a balanced binary search tree:

      3
     / \
    1   5
     \ / \
     2 4  6

In this balanced tree, the search operation for element 6 would only require traversing through a maximum of three nodes, maintaining the O(log n) time complexity even with a larger number of elements.

4 Complexity Analysis

The complexity analysis of a Binary Search Tree (BST) involves understanding the time complexity of its fundamental operations, such as search, insertion, and deletion, in terms of the height of the tree.

1 Search Operation:

In a balanced Binary Search Tree, the average-case time complexity of searching for a key is O(log n), where n is the number of nodes in the tree. This complexity arises because at each step of the search process, the search space is divided roughly in half, similar to a binary search algorithm.

However, in the worst-case scenario, where the BST is unbalanced (e.g., resembling a linked list), the search time complexity can be O(n), where n is the number of nodes. This occurs when the tree's height becomes linear, resulting in a sequential search through all nodes.

  • Best Case:
    • In the best-case scenario, the BST is perfectly balanced, and the height of the tree is logarithmic with respect to the number of nodes (h = log2(n))
    • Time Complexity: O(log n)
  • Average Case:
    • In an average-case scenario, the BST is reasonably balanced, and the height of the tree is logarithmic with respect to the number of nodes (h = log₂(n)).
    • Time Complexity: O(log n)
  • Worst Case:
    • In the worst-case scenario, the BST is highly unbalanced, resembling a linked list, and the height of the tree is linear with respect to the number of nodes (h = n).
    • Time Complexity: O(n)

2 Insertion Operation:

Inserting a new node into a balanced Binary Search Tree also has an average-case time complexity of O(log n), where n is the number of nodes. This complexity arises because the tree's height remains logarithmic due to balancing mechanisms (e.g., AVL tree or Red-Black tree properties).

In the worst-case scenario, the time complexity of insertion can also be O(n), similar to search. This occurs when the BST becomes highly unbalanced, requiring traversal through most or all nodes to find the appropriate insertion point.

  • Best Case:
    • Similar to the search operation, in the best-case scenario, the BST is perfectly balanced, and the height of the tree is logarithmic (h = log₂(n)).
    • Time Complexity: O(log n)
  • Average Case:
    • Similar to the search operation, in an average-case scenario, the BST is reasonably balanced, and the height of the tree is logarithmic (h = log₂(n)).
    • Time Complexity: O(log n)
  • Worst Case:
    • Similar to the search operation, in the worst-case scenario, the BST is highly unbalanced, and the height of the tree is linear (h = n).
    • Time Complexity: O(n)

3 Deletion Operation:

Deleting a node from a balanced Binary Search Tree typically has an average-case time complexity of O(log n), similar to search and insertion. This complexity arises because, like insertion, the height of the tree remains logarithmic due to balancing mechanisms.

In the worst-case scenario, deletion can have a time complexity of O(n) if the tree becomes highly unbalanced, similar to search and insertion.

  • Best Case:
    • In the best-case scenario, the BST is perfectly balanced, and the height of the tree is logarithmic (h = log₂(n)).
    • Time Complexity: O(log n)
  • Average Case:
    • Similar to the search and insertion operations, in an average-case scenario, the BST is reasonably balanced, and the height of the tree is logarithmic (h = log₂(n)).
    • Time Complexity: O(log n)
  • Worst Case:
    • Similar to the search and insertion operations, in the worst-case scenario, the BST is highly unbalanced, and the height of the tree is linear (h = n).
    • Time Complexity: O(n)

4 Space Complexity:

The space complexity of a Binary Search Tree primarily depends on the number of nodes in the tree. In the average case, the space complexity of a BST is O(n), where n is the number of nodes. This space is required to store the nodes and any auxiliary data structures used in tree operations.

#4.1 Summary:

1 Average Case:

  • Search, insertion, and deletion operations have a time complexity of O(log n) in a balanced BST.
  • Space complexity is O(n).

2 Worst Case:

  • Search, insertion, and deletion operations can have a time complexity of O(n) in an unbalanced BST.
  • Space complexity is O(n).

5 Operations Algorithm

1 Insertion Operation:

#include <iostream>

using namespace std;

// Definition of a BST Node
struct TreeNode {
    int data;
    TreeNode* left;
    TreeNode* right;
    
    TreeNode(int val) : data(val), left(nullptr), right(nullptr) {}
};

// Insertion Operation
TreeNode* insert(TreeNode* root, int val) {
    if (root == nullptr) {
        // Create a new node if the tree is empty
        return new TreeNode(val);
    }

    // If val is less than the current node's value, insert in the left subtree
    if (val < root->data) {
        root->left = insert(root->left, val);
    }
    // If val is greater than or equal to the current node's value, insert in the right subtree
    else {
        root->right = insert(root->right, val);
    }
    
    return root;
}

int main() {
    // Initialize an empty BST
    TreeNode* root = nullptr;

    // Insert elements into the BST
    root = insert(root, 10);
    root = insert(root, 5);
    root = insert(root, 15);
    root = insert(root, 3);
    root = insert(root, 7);
    root = insert(root, 12);
    root = insert(root, 20);

    return 0;
}

Explanation:

  • The insert function takes the root of the BST and a value to be inserted as arguments.
  • It recursively inserts the value into the appropriate position in the BST based on the value of the current node.
  • If the value is less than the current node's value, it recursively inserts into the left subtree; otherwise, it recursively inserts into the right subtree.
  • The function returns the root of the modified BST.

Iterative Insertion:

TreeNode* insert(TreeNode* root, int val) {
    TreeNode* newNode = new TreeNode(val);
    
    if (root == nullptr) {
        return newNode;
    }

    TreeNode* current = root;
    TreeNode* parent = nullptr;
    
    while (current != nullptr) {
        parent = current;
        if (val < current->data) {
            current = current->left;
        } else {
            current = current->right;
        }
    }
    
    if (val < parent->data) {
        parent->left = newNode;
    } else {
        parent->right = newNode;
    }
    
    return root;
}

2 Search Operation:

#include <iostream>

using namespace std;

// Definition of a BST Node
struct TreeNode {
    int data;
    TreeNode* left;
    TreeNode* right;
    
    TreeNode(int val) : data(val), left(nullptr), right(nullptr) {}
};

// Search Operation
bool search(TreeNode* root, int val) {
    if (root == nullptr) {
        // Element not found
        return false;
    }

    if (root->data == val) {
        // Element found
        return true;
    }

    // Recursively search in the appropriate subtree
    if (val < root->data) {
        return search(root->left, val);
    } else {
        return search(root->right, val);
    }
}

int main() {
    // Initialize an empty BST
    TreeNode* root = nullptr;

    // Insert elements into the BST (as shown in the previous example)

    // Search for elements in the BST
    cout << "Search 7: " << (search(root, 7) ? "Found" : "Not Found") << endl;
    cout << "Search 8: " << (search(root, 8) ? "Found" : "Not Found") << endl;

    return 0;
}

Explanation:

  • The search function takes the root of the BST and a value to be searched as arguments.
  • It recursively searches for the value in the BST, starting from the root node.
  • If the value is found at the current node, it returns true.
  • If the value is less than the current node's value, it recursively searches in the left subtree; otherwise, it recursively searches in the right subtree.
  • If the value is not found in the BST, it returns false.

Iterative Search:

bool search(TreeNode* root, int val) {
    TreeNode* current = root;
    
    while (current != nullptr) {
        if (current->data == val) {
            return true;
        } else if (val < current->data) {
            current = current->left;
        } else {
            current = current->right;
        }
    }
    
    return false;
}

3 Delete Operation:

When deleting a node from a Binary Search Tree (BST), several scenarios can arise based on the structure of the tree and the position of the node to be deleted.

  • It takes the use of Inorder traversal, which always returns the sorted data.

Node has no children (Leaf Node):

If the node to be deleted has no children, it is a leaf node. In this case, we can simply remove the node from the tree.

Node has one child:

If the node to be deleted has only one child (either left or right), we can bypass the node by linking its parent directly to its child. This operation effectively removes the node from the tree.

Node has two Children:

If the node to be deleted has two children, deleting it becomes more complex. We need to find a replacement node to maintain the structure and ordering of the BST. This replacement node can be the node's inorder predecessor (the largest node in the left subtree) or the inorder successor (the smallest node in the right subtree).

  • Inorder Predecessor: The largest node in the left subtree of the node to be deleted. It is the rightmost node in the left subtree.
  • Inorder Successor: The smallest node in the right subtree of the node to be deleted. It is the leftmost node in the right subtree.

Once we identify the replacement node, we copy its value to the node to be deleted, and then recursively delete the replacement node from its original position.

  • Handling Scenarios during Deletion:
    • Case 1: Node to be deleted is a leaf node:
      • Simply remove the node from the tree.
    • Case 2: Node to be deleted has one child:
      • Replace the node with its child by updating the appropriate link from its parent.
    • Case 3: Node to be deleted has two children:
      • Find the inorder predecessor or successor.
      • Copy its value to the node to be deleted.
      • Recursively delete the inorder predecessor or successor from its original position.
#include <iostream>

using namespace std;

// Definition of a BST Node
struct TreeNode {
    int data;
    TreeNode* left;
    TreeNode* right;
    
    TreeNode(int val) : data(val), left(nullptr), right(nullptr) {}
};

// Helper function to find the inorder predecessor (largest node in the left subtree)
TreeNode* findMax(TreeNode* root) {
    while (root->right != nullptr) {
        root = root->right;
    }
    return root;
}

// Delete Operation
TreeNode* deleteNode(TreeNode* root, int val) {
    if (root == nullptr) {
        // Element not found, return null
        return nullptr;
    }

    if (val < root->data) {
        // Search in the left subtree if the value to be deleted is less than the current node's value
        root->left = deleteNode(root->left, val);
    } else if (val > root->data) {
        // Search in the right subtree if the value to be deleted is greater than the current node's value
        root->right = deleteNode(root->right, val);
    } else {
        // Node with the value to be deleted found
        
        // Case 1: Node has no children or only one child
        if (root->left == nullptr) {
            TreeNode* temp = root->right;
            delete root;
            return temp;
        } else if (root->right == nullptr) {
            TreeNode* temp = root->left;
            delete root;
            return temp;
        }
        
        // Case 2: Node has two children
        // Find the inorder predecessor (largest node in the left subtree)
        TreeNode* temp = findMax(root->left);
        
        // Copy the value of the inorder predecessor to the current node
        root->data = temp->data;
        
        // Recursively delete the inorder predecessor from its original position
        root->left = deleteNode(root->left, temp->data);
    }
    
    return root;
}

// In-order Traversal
void inOrderTraversal(TreeNode* root) {
    if (root == nullptr) {
        return;
    }

    // Traverse left subtree
    inOrderTraversal(root->left);

    // Visit current node
    cout << root->data << " ";

    // Traverse right subtree
    inOrderTraversal(root->right);
}

int main() {
    // Initialize a BST
    TreeNode* root = new TreeNode(11);
    root->left = new TreeNode(6);
    root->right = new TreeNode(19);
    root->left->left = new TreeNode(4);
    root->left->right = new TreeNode(8);
    root->left->right->right = new TreeNode(10);
    root->left->right->left = new TreeNode(5);
    root->right->left = new TreeNode(17);
    root->right->right = new TreeNode(43);
    root->right->right->right = new TreeNode(49);
    root->right->right->left = new TreeNode(31);

    // In-order Traversal before deletion
    cout << "In-order Traversal before deletion: ";
    inOrderTraversal(root);
    cout << endl;

    // Delete nodes from the BST
    root = deleteNode(root, 6); // Delete node with value 6
    root = deleteNode(root, 43); // Delete node with value 43

    // In-order Traversal after deletion
    cout << "In-order Traversal after deletion: ";
    inOrderTraversal(root);
    cout << endl;

    return 0;
}

Iterative Deletion:

TreeNode* deleteNode(TreeNode* root, int val) {
    TreeNode* current = root;
    TreeNode* parent = nullptr;
    
    while (current != nullptr && current->data != val) {
        parent = current;
        if (val < current->data) {
            current = current->left;
        } else {
            current = current->right;
        }
    }
    
    if (current == nullptr) {
        // Node with given value not found
        return root;
    }
    
    // Case 1: Node has no children or only one child
    if (current->left == nullptr) {
        if (parent == nullptr) {
            root = current->right;
        } else if (current == parent->left) {
            parent->left = current->right;
        } else {
            parent->right = current->right;
        }
        delete current;
    } else if (current->right == nullptr) {
        if (parent == nullptr) {
            root = current->left;
        } else if (current == parent->left) {
            parent->left = current->left;
        } else {
            parent->right = current->left;
        }
        delete current;
    } else {
        // Case 2: Node has two children
        TreeNode* successor = findMin(current->right);
        current->data = successor->data;
        current->right = deleteNode(current->right, successor->data);
    }
    
    return root;
}
OperationAverage Case Time ComplexityWorst Case Time ComplexitySpace Complexity
InsertionO(log n)O(n) (Unbalanced Tree)O(1) (Iterative) or O(log n) (Recursive)
DeletionO(log n)O(n) (Unbalanced Tree)O(1) (Iterative) or O(log n) (Recursive)
SearchO(log n)O(n) (Unbalanced Tree)O(1) (Iterative) or O(log n) (Recursive)
Traversal (In-order, Pre-order, Post-order)O(n)O(n)O(n) (Recursive)

6 Draw Binary Search Tree (BST) step by step:

Draw the BST step by step by inserting the given numbers:

We will start with an empty tree and insert each number one by one, following the rules of BST insertion:

  1. If the tree is empty, the first number becomes the root.
  2. If a number is less than the current node, it goes to the left subtree.
  3. If a number is greater than or equal to the current node, it goes to the right subtree.

Here's the resulting BST after inserting each number:

Step 1: Insert 11
       11

Step 2: Insert 6
       11
      /
     6

Step 3: Insert 8
       11
      /
     6
      \
       8

Step 4: Insert 19
       11
      /  \
     6    19
      \
       8

Step 5: Insert 4
       11
      /  \
     6    19
    / \
   4   8

Step 6: Insert 10
       11
      /  \
     6    19
    / \ 
   4   8
      /
     10

Step 7: Insert 5
       11
      /  \
     6    19
    / \ 
   4   8
    \ 
     5
         
Step 8: Insert 17
       11
      /  \
     6    19
    / \    \
   4   8   17
    \ 
     5
         
Step 9: Insert 43
       11
      /  \
     6    19
    / \    \
   4   8   43
    \     /
     5   17
         
Step 10: Insert 49
       11
      /  \
     6    19
    / \    \
   4   8   43
    \     / \
     5   17  49
         
Step 11: Insert 31
       11
      /  \
     6    19
    / \    \
   4   8   43
    \   / \
     5 17  49
       /
     31