Autocomplete System Using a Trie in Bash

This document explains how to design an autocomplete system using a Trie data structure in Bash. A Trie (also known as a prefix tree) is a tree-like data structure that efficiently stores a dynamic set of strings, allowing for fast lookup operations, particularly for prefix-based searches like autocomplete.

Program Structure

The program is structured into several key functions:

  • initialize_trie: Initializes the root of the Trie.
  • insert_word: Inserts a word into the Trie.
  • search_prefix: Searches for a prefix in the Trie and retrieves all words that start with that prefix.
  • autocomplete: Wrapper function that triggers the autocomplete feature.

Implementation


#!/bin/bash

# Initialize the Trie root
initialize_trie() {
    root="root"
    declare -gA children_${root}  # Create an associative array to hold children of the root node
}

# Function to insert a word into the Trie
insert_word() {
    local word=$1
    local node=$root

    for (( i=0; i<${#word}; i++ )); do
        local char=${word:i:1}
        local child_node="${node}_${char}"
        
        if [[ ! -v children_${node}[$char] ]]; then
            declare -gA children_${child_node}
            children_${node}[$char]=$child_node
        fi
        
        node=${children_${node}[$char]}
    done
    
    # Mark the end of the word
    children_${node}["*"]="end"
}

# Function to recursively collect all words from a given node
collect_words() {
    local node=$1
    local prefix=$2
    local word_list=()

    for char in "${!children_${node}[@]}"; do
        if [[ $char == "*" ]]; then
            word_list+=("$prefix")
        else
            local next_node=${children_${node}[$char]}
            word_list+=($(collect_words $next_node "${prefix}${char}"))
        fi
    done

    echo "${word_list[@]}"
}

# Function to search for a prefix in the Trie and return all matching words
search_prefix() {
    local prefix=$1
    local node=$root

    for (( i=0; i<${#prefix}; i++ )); do
        local char=${prefix:i:1}

        if [[ -v children_${node}[$char] ]]; then
            node=${children_${node}[$char]}
        else
            echo ""
            return
        fi
    done

    collect_words $node $prefix
}

# Autocomplete function to search for all words matching a given prefix
autocomplete() {
    local prefix=$1
    local results=$(search_prefix $prefix)

    if [ -z "$results" ]; then
        echo "No suggestions found for \"$prefix\"."
    else
        echo "Autocomplete suggestions for \"$prefix\": $results"
    fi
}

# Example usage
initialize_trie

insert_word "hello"
insert_word "hell"
insert_word "heaven"
insert_word "heavy"

autocomplete "he"
autocomplete "hell"
autocomplete "hea"
autocomplete "hel"
autocomplete "ho"
    

Explanation

The program implements a basic autocomplete system using a Trie with the following components:

Initialization

The initialize_trie function sets up the root of the Trie, represented by an associative array that stores the children of each node. Each node in the Trie corresponds to a character, and the root node is initialized to represent the starting point of all words.

Inserting Words

The insert_word function adds a word to the Trie. It iterates over each character in the word and inserts it into the Trie by creating a child node for each character. The function also marks the end of a word with a special character (‘*’).

Collecting Words

The collect_words function recursively traverses the Trie starting from a given node and collects all the words that can be formed from that node. This is used to gather all words that share a common prefix.

Searching for a Prefix

The search_prefix function searches the Trie for a given prefix. If the prefix is found, the function calls collect_words to retrieve all words that start with that prefix. If the prefix is not found, the function returns an empty result.

Autocomplete Functionality

The autocomplete function serves as a wrapper for the autocomplete feature. It takes a prefix as input, calls the search_prefix function, and prints out the autocomplete suggestions for that prefix. If no matching words are found, it returns a message indicating that no suggestions are available.

Example Usage

The example usage section demonstrates how to initialize the Trie, insert words into it, and perform autocomplete searches. The program outputs suggestions for the provided prefixes and handles cases where no suggestions are found.

Conclusion

This Bash implementation of an autocomplete system using a Trie provides a simple yet effective way to search for words based on prefixes. Tries are particularly useful in applications where prefix-based searches are common, such as in search engines, text editors, and word games.

 

By Aditya Bhuyan

I work as a cloud specialist. In addition to being an architect and SRE specialist, I work as a cloud engineer and developer. I have assisted my clients in converting their antiquated programmes into contemporary microservices that operate on various cloud computing platforms such as AWS, GCP, Azure, or VMware Tanzu, as well as orchestration systems such as Docker Swarm or Kubernetes. For over twenty years, I have been employed in the IT sector as a Java developer, J2EE architect, scrum master, and instructor. I write about Cloud Native and Cloud often. Bangalore, India is where my family and I call home. I maintain my physical and mental fitness by doing a lot of yoga and meditation.

Leave a Reply

Your email address will not be published. Required fields are marked *

error

Enjoy this blog? Please spread the word :)