stackademic

The leading education platform for anyone with an interest in software development.

Union-Find

Disjoint-set structure for tracking connected groups with near-constant merges

Overview

Union-Find (a disjoint-set union) tracks a partition of elements into disjoint groups and answers "are these two in the same set?" while supporting fast merges. With both path compression and union by rank, find and union run in O(α(n)) amortized time, where α is the inverse Ackermann function — effectively constant. It underpins cycle detection, Kruskal's minimum spanning tree, and connected-component queries on graphs.

Syntax / Usage

Store a parent pointer per element and a rank estimating tree height.

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))   # each element is its own root
        self.rank = [0] * n

    def find(self, x):                 # O(alpha(n)) amortized
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path compression
            x = self.parent[x]
        return x

    def union(self, a, b):             # O(alpha(n)) amortized
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False               # already connected
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True

Examples

Merge elements and check connectivity between them:

uf = UnionFind(5)
uf.union(0, 1)
uf.union(3, 4)

print(uf.find(0) == uf.find(1))  # True
print(uf.find(0) == uf.find(4))  # False

uf.union(1, 4)
print(uf.find(0) == uf.find(3))  # True - now all connected

Detect a cycle in an undirected graph by unioning each edge's endpoints:

def has_cycle(n, edges):
    uf = UnionFind(n)
    for a, b in edges:
        if not uf.union(a, b):   # endpoints already share a root
            return True
    return False

print(has_cycle(3, [(0, 1), (1, 2), (2, 0)]))  # True
print(has_cycle(3, [(0, 1), (1, 2)]))          # False

Common Mistakes

  • Comparing raw elements instead of their roots via find
  • Skipping path compression or union by rank, degrading operations toward O(n)
  • Setting parent[x] = y directly instead of linking the two roots
  • Assuming union works on directed relationships — it models undirected connectivity
  • Forgetting that the structure only merges sets and cannot split them apart later

See Also

data-structures-graphs data-structures-trees data-structures-arrays