Trie (Prefix Tree) Implementation in Bash
This document explains how to implement a Trie (also known as a prefix tree) using Bash. A Trie is a tree-like data structure that efficiently stores a dynamic set of strings, enabling fast lookups, insertions, and prefix-based searches. It is widely used in applications like autocomplete, spell checking, and IP routing.
Program Structure
The program is structured into several key functions:
- initialize_trie: Initializes the root of the Trie.
- insert_word: Inserts a word into the Trie.
- search_word: Searches for a word in the Trie.
- starts_with_prefix: Checks if any word in the Trie starts with a given prefix.
Implementation
#!/bin/bash
# Initialize the Trie root
initialize_trie() {
root="root"
declare -gA children_${root} # Create an associative array to hold children of the root node
}
# Function to insert a word into the Trie
insert_word() {
local word=$1
local node=$root
for (( i=0; i<${#word}; i++ )); do
local char=${word:i:1}
local child_node="${node}_${char}"
if [[ ! -v children_${node}[$char] ]]; then
declare -gA children_${child_node}
children_${node}[$char]=$child_node
fi
node=${children_${node}[$char]}
done
# Mark the end of the word
children_${node}["*"]="end"
}
# Function to search for a word in the Trie
search_word() {
local word=$1
local node=$root
for (( i=0; i<${#word}; i++ )); do
local char=${word:i:1}
if [[ -v children_${node}[$char] ]]; then
node=${children_${node}[$char]}
else
echo "Word \"$word\" not found in the Trie."
return
fi
done
if [[ -v children_${node}["*"] ]]; then
echo "Word \"$word\" found in the Trie."
else
echo "Word \"$word\" not found in the Trie."
fi
}
# Function to check if any word in the Trie starts with a given prefix
starts_with_prefix() {
local prefix=$1
local node=$root
for (( i=0; i<${#prefix}; i++ )); do
local char=${prefix:i:1}
if [[ -v children_${node}[$char] ]]; then
node=${children_${node}[$char]}
else
echo "No word in the Trie starts with prefix \"$prefix\"."
return
fi
done
echo "There is at least one word in the Trie that starts with prefix \"$prefix\"."
}
# Example usage
initialize_trie
insert_word "hello"
insert_word "hell"
insert_word "heaven"
insert_word "heavy"
search_word "hello"
search_word "hell"
search_word "heavens"
starts_with_prefix "he"
starts_with_prefix "hea"
starts_with_prefix "ho"
Explanation
The program implements a basic Trie with the following components:
Initialization
The initialize_trie
function sets up the root of the Trie. The root is represented by an associative array that stores the children of each node. Each node in the Trie corresponds to a character, and the root node is initialized to represent the starting point of all words.
Inserting Words
The insert_word
function adds a word to the Trie. It iterates over each character in the word and inserts it into the Trie by creating a child node for each character if it doesn’t already exist. The function also marks the end of the word with a special character (‘*’).
Searching for Words
The search_word
function searches the Trie for a given word. It traverses the Trie by following the path of characters that make up the word. If the path exists and the word is marked as complete (indicated by the ‘*’ character), the word is found; otherwise, it is not found.
Checking Prefixes
The starts_with_prefix
function checks if any word in the Trie starts with a given prefix. It traverses the Trie following the characters of the prefix. If the path exists, it indicates that there is at least one word in the Trie that starts with the prefix; otherwise, there isn’t.
Example Usage
The example usage section demonstrates how to initialize the Trie, insert words into it, search for specific words, and check if any words start with given prefixes. The program outputs whether the words are found and whether any words start with the provided prefixes.
Conclusion
This Bash implementation of a Trie (prefix tree) provides a simple and efficient way to store and search for strings based on prefixes. Tries are particularly useful in applications like autocomplete systems, spell checkers, and IP routing, where quick prefix-based lookups are essential.