Disjoint Set Data Structure Implementation in Bash

This document explains the implementation of the Disjoint Set (Union-Find) data structure in Bash. The Disjoint Set data structure is used to manage a partition of a set into disjoint subsets and is commonly used in algorithms like Kruskal’s Minimum Spanning Tree algorithm. The operations supported are:

  • Find: Determine the subset (or set representative) a particular element belongs to.
  • Union: Merge two subsets into a single subset.

Program Structure

The program is structured into three main functions:

  1. initialize: Initializes the disjoint set.
  2. find: Implements the Find operation with path compression.
  3. union: Implements the Union operation with union by rank.

Implementation


#!/bin/bash

# Function to initialize the disjoint set
initialize() {
    local n=$1
    for (( i=0; i<n; i++ )); do
        parent[i]=$i
        rank[i]=0
    done
}

# Function to find the representative of the set containing element x
# Uses path compression to optimize future queries
find() {
    local x=$1
    if [ "${parent[x]}" -ne "$x" ]; then
        parent[x]=$(find "${parent[x]}")
    fi
    echo "${parent[x]}"
}

# Function to unite two sets containing elements x and y
# Uses union by rank to keep the tree flat
union() {
    local x=$(find "$1")
    local y=$(find "$2")

    if [ "$x" -ne "$y" ]; then
        if [ "${rank[x]}" -lt "${rank[y]}" ]; then
            parent[x]=$y
        elif [ "${rank[x]}" -gt "${rank[y]}" ]; then
            parent[y]=$x
        else
            parent[y]=$x
            rank[x]=$((rank[x] + 1))
        fi
    fi
}

# Example usage
n=5
initialize "$n"

# Perform some unions
union 0 2
union 4 2
union 3 1

# Perform some finds
echo "Find(4): $(find 4)" # Output should be representative of set containing 4
echo "Find(3): $(find 3)" # Output should be representative of set containing 3
    

Explanation

The above program implements a basic Disjoint Set data structure with the following features:

Initialization

The initialize function sets up two arrays:

  • parent: Holds the parent or representative of each element. Initially, each element is its own parent.
  • rank: Used to keep the tree flat. Initially, the rank of each element is 0.

Find Operation

The find function recursively finds the root of the set containing a given element. Path compression is used here to make future queries faster by directly attaching each node visited to the root.

Union Operation

The union function merges two sets. It first finds the representatives of both sets, then attaches the tree with the lower rank to the root of the tree with the higher rank. If both trees have the same rank, one tree becomes the root and its rank is increased by one.

Example Usage

An example usage is provided at the end of the script, where a disjoint set is initialized with 5 elements. Several union operations are performed, and find operations are used to determine the representative of certain sets.

Conclusion

This implementation of the Disjoint Set data structure in Bash provides an efficient way to manage and query partitions of a set. It is especially useful in scenarios where union and find operations are frequently performed, such as in network connectivity algorithms and clustering.

 

By Aditya Bhuyan

I work as a cloud specialist. In addition to being an architect and SRE specialist, I work as a cloud engineer and developer. I have assisted my clients in converting their antiquated programmes into contemporary microservices that operate on various cloud computing platforms such as AWS, GCP, Azure, or VMware Tanzu, as well as orchestration systems such as Docker Swarm or Kubernetes. For over twenty years, I have been employed in the IT sector as a Java developer, J2EE architect, scrum master, and instructor. I write about Cloud Native and Cloud often. Bangalore, India is where my family and I call home. I maintain my physical and mental fitness by doing a lot of yoga and meditation.

Leave a Reply

Your email address will not be published. Required fields are marked *

error

Enjoy this blog? Please spread the word :)