Disjoint Set Data Structure Implementation in Bash
This document explains the implementation of the Disjoint Set (Union-Find) data structure in Bash. The Disjoint Set data structure is used to manage a partition of a set into disjoint subsets and is commonly used in algorithms like Kruskal’s Minimum Spanning Tree algorithm. The operations supported are:
- Find: Determine the subset (or set representative) a particular element belongs to.
- Union: Merge two subsets into a single subset.
Program Structure
The program is structured into three main functions:
- initialize: Initializes the disjoint set.
- find: Implements the Find operation with path compression.
- union: Implements the Union operation with union by rank.
Implementation
#!/bin/bash
# Function to initialize the disjoint set
initialize() {
local n=$1
for (( i=0; i<n; i++ )); do
parent[i]=$i
rank[i]=0
done
}
# Function to find the representative of the set containing element x
# Uses path compression to optimize future queries
find() {
local x=$1
if [ "${parent[x]}" -ne "$x" ]; then
parent[x]=$(find "${parent[x]}")
fi
echo "${parent[x]}"
}
# Function to unite two sets containing elements x and y
# Uses union by rank to keep the tree flat
union() {
local x=$(find "$1")
local y=$(find "$2")
if [ "$x" -ne "$y" ]; then
if [ "${rank[x]}" -lt "${rank[y]}" ]; then
parent[x]=$y
elif [ "${rank[x]}" -gt "${rank[y]}" ]; then
parent[y]=$x
else
parent[y]=$x
rank[x]=$((rank[x] + 1))
fi
fi
}
# Example usage
n=5
initialize "$n"
# Perform some unions
union 0 2
union 4 2
union 3 1
# Perform some finds
echo "Find(4): $(find 4)" # Output should be representative of set containing 4
echo "Find(3): $(find 3)" # Output should be representative of set containing 3
Explanation
The above program implements a basic Disjoint Set data structure with the following features:
Initialization
The initialize
function sets up two arrays:
parent
: Holds the parent or representative of each element. Initially, each element is its own parent.rank
: Used to keep the tree flat. Initially, the rank of each element is 0.
Find Operation
The find
function recursively finds the root of the set containing a given element. Path compression is used here to make future queries faster by directly attaching each node visited to the root.
Union Operation
The union
function merges two sets. It first finds the representatives of both sets, then attaches the tree with the lower rank to the root of the tree with the higher rank. If both trees have the same rank, one tree becomes the root and its rank is increased by one.
Example Usage
An example usage is provided at the end of the script, where a disjoint set is initialized with 5 elements. Several union operations are performed, and find operations are used to determine the representative of certain sets.
Conclusion
This implementation of the Disjoint Set data structure in Bash provides an efficient way to manage and query partitions of a set. It is especially useful in scenarios where union and find operations are frequently performed, such as in network connectivity algorithms and clustering.