Skip to main content

Trees and Graphs

Trees and graphs model relationships between entities — file systems, social networks, routing tables, dependency graphs, and DOM structures. These two data structures appear in more real-world systems than almost any other.

Trees

A tree is a connected graph with no cycles, rooted at a single node. Each node has exactly one parent (except the root) and zero or more children.

        5
       / \
      3   8
     / \   \
    1   4   9

Terminology:

  • Root: topmost node (5)
  • Leaf: node with no children (1, 4, 9)
  • Height: longest path from root to a leaf
  • Depth: distance from root to a node

Binary Trees

A binary tree is a tree where each node has at most two children, called left and right.

class TreeNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

Tree Traversals

Three classic traversal orders — each visits every node exactly once: O(n).

In-order (left → root → right): for a BST, yields values in sorted order.

def inorder(node):
    if node is None:
        return
    inorder(node.left)
    print(node.val)
    inorder(node.right)

Pre-order (root → left → right): useful for copying or serialising a tree.

Post-order (left → right → root): useful for deletion (process children before parent).

Level-order (BFS): visits nodes level by level using a queue.

from collections import deque

def level_order(root):
    if not root:
        return
    queue = deque([root])
    while queue:
        node = queue.popleft()
        print(node.val)
        if node.left:
            queue.append(node.left)
        if node.right:
            queue.append(node.right)

Binary Search Trees (BST)

A BST enforces a constraint: for every node, all values in the left subtree are smaller and all values in the right subtree are larger.

        5
       / \
      3   8
     / \   \
    1   4   9

This constraint enables O(log n) search, insert, and delete — for a balanced tree.

def search(node, target):
    if node is None or node.val == target:
        return node
    if target < node.val:
        return search(node.left, target)   # go left
    return search(node.right, target)      # go right

Each step eliminates half the remaining nodes — logarithmic depth.

The catch: a BST is only O(log n) if it is balanced. Insert elements in sorted order and the tree degenerates into a linked list — O(n) operations. Self-balancing BSTs (AVL trees, Red-Black trees) automatically rebalance to maintain O(log n) guarantees. Python's sortedcontainers.SortedList and Java's TreeMap use these internally.

Heaps

A heap is a complete binary tree satisfying the heap property:

  • Min-heap: every node's value is ≤ its children's values. The minimum is always at the root.
  • Max-heap: every node's value is ≥ its children's values. The maximum is at the root.
OperationComplexity
Find min/maxO(1) — it's always the root
InsertO(log n) — bubble up
Delete min/maxO(log n) — bubble down
Build heap from n itemsO(n)

Heaps power priority queues — process items by priority, not insertion order.

import heapq

# Min-heap in Python
heap = []
heapq.heappush(heap, 3)
heapq.heappush(heap, 1)
heapq.heappush(heap, 5)
print(heapq.heappop(heap))   # 1  always the minimum

Graphs

A graph is a set of nodes (vertices) and edges connecting them. Unlike trees, graphs can have cycles, multiple paths between nodes, and disconnected components.

Directed: edges have direction (A → B does not imply B → A). Example: Twitter follows. Undirected: edges are bidirectional. Example: Facebook friendships. Weighted: edges carry a cost. Example: road distances.

Representations

Adjacency list: a dictionary mapping each node to its neighbours. Memory-efficient for sparse graphs.

graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D'],
    'C': ['A'],
    'D': ['B'],
}

Adjacency matrix: a 2D array where matrix[i][j] = 1 means an edge exists. O(1) edge lookup; O(V²) memory.

For most real-world graphs (sparse networks), use adjacency lists.

Graph Traversal

Explores all neighbours at the current depth before going deeper. Uses a queue.

from collections import deque

def bfs(graph, start):
    visited = set([start])
    queue = deque([start])
    while queue:
        node = queue.popleft()
        print(node)
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)

Use BFS for: shortest path in unweighted graphs, level-order tree traversal, finding connected components.

Explores as deep as possible along each branch before backtracking. Uses a stack (implicit via recursion or explicit).

def dfs(graph, node, visited=None):
    if visited is None:
        visited = set()
    visited.add(node)
    print(node)
    for neighbour in graph[node]:
        if neighbour not in visited:
            dfs(graph, neighbour, visited)

Use DFS for: cycle detection, topological sort, finding all paths, maze solving, connected components.

BFS vs DFS

BFSDFS
Data structureQueueStack / recursion
Finds shortest pathYes (unweighted)No
MemoryO(width of graph)O(depth of graph)
Detects cyclesYesYes
Good forShortest path, levelsPath existence, topology

Both are O(V + E) — visit every vertex and every edge once.

Key Takeaways

  • Trees are hierarchical; graphs are the general case (trees are graphs without cycles).
  • BSTs give O(log n) search/insert/delete — only guaranteed for balanced trees.
  • Heaps give O(1) min/max access and O(log n) insert/delete — ideal for priority queues.
  • BFS uses a queue, finds shortest paths, explores by level.
  • DFS uses a stack, explores by depth, suits path finding and topological problems.
  • Both BFS and DFS are O(V + E) on a graph with V vertices and E edges.