Disjoint Set Data Structure Implementation in Java
A Disjoint Set (also known as Union-Find) is a data structure that keeps track of a partition of a set into disjoint (non-overlapping) subsets. It supports two primary operations: find (determines which subset a particular element is in) and union (merges two subsets into a single subset).
Program Structure
The implementation of a Disjoint Set involves the following components:
- DisjointSet: The main class that handles the operations of the Disjoint Set, including union and find.
- Path Compression: An optimization technique that flattens the structure of the tree whenever
find
is called, improving the efficiency of future operations. - Union by Rank: An optimization technique that attaches the shorter tree under the root of the taller tree during a union operation.
DisjointSet Class
This class represents the Disjoint Set and contains:
- An integer array
parent[]
that stores the parent of each element in the set. - An integer array
rank[]
that keeps track of the rank (approximate height) of each tree in the forest. - Methods for performing union, find, and initializing the data structure.
Java Code Implementation
DisjointSet Class
public class DisjointSet {
private int[] parent;
private int[] rank;
public DisjointSet(int n) {
parent = new int[n];
rank = new int[n];
for (int i = 0; i < n; i++) {
parent[i] = i;
rank[i] = 0;
}
}
public int find(int x) {
if (parent[x] != x) {
parent[x] = find(parent[x]); // Path compression
}
return parent[x];
}
public void union(int x, int y) {
int rootX = find(x);
int rootY = find(y);
if (rootX != rootY) {
if (rank[rootX] > rank[rootY]) {
parent[rootY] = rootX;
} else if (rank[rootX] < rank[rootY]) {
parent[rootX] = rootY;
} else {
parent[rootY] = rootX;
rank[rootX]++;
}
}
}
}
Explanation
Here is a detailed explanation of the key components in the DisjointSet implementation:
1. Constructor and Fields
The DisjointSet
class has two main fields:
parent
: An integer array whereparent[i]
points to the parent of elementi
. Ifparent[i] == i
, theni
is a root node.rank
: An integer array that stores the rank of each element. The rank is an approximation of the depth of the tree, used to keep the tree flat.
The constructor initializes the parent
and rank
arrays. Each element is initially its own parent (forming singleton sets), and the rank is initialized to 0 for all elements.
2. find Method
The find
method returns the root of the set containing element x
. This method uses path compression, a technique that flattens the structure of the tree whenever find
is called, by making every node point directly to the root. This optimization improves the efficiency of future operations.
3. union Method
The union
method merges the sets containing elements x
and y
. It first finds the roots of the sets containing x
and y
using the find
method. If the roots are different, it attaches the tree with the lower rank to the tree with the higher rank (union by rank). If the ranks are equal, one tree is arbitrarily chosen to be the root, and its rank is incremented by 1.
Conclusion
This Java implementation of the Disjoint Set data structure provides an efficient way to manage and merge disjoint sets. The use of path compression and union by rank optimizations ensures that both the find
and union
operations have nearly constant time complexity, making the Disjoint Set data structure ideal for applications like network connectivity, image processing, and Kruskal’s algorithm for finding minimum spanning trees.