Autocomplete System Using Trie in C

An autocomplete system is a feature commonly used in search engines, text editors, and other applications to predict and suggest possible completions for a partially entered word or phrase. A Trie (prefix tree) is an efficient data structure for implementing an autocomplete system because it allows quick insertion and retrieval of words, especially when searching for words with a common prefix.

Program Structure

This implementation of an autocomplete system using a Trie in C will include the following components:

  1. Struct definitions for the Trie node and the Trie itself.
  2. Functions to insert words into the Trie.
  3. Functions to search for a given prefix in the Trie and suggest possible completions.
  4. A main function to demonstrate the usage of the Trie for the autocomplete system.

Code Implementation

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ALPHABET_SIZE 26

// Trie node structure
typedef struct TrieNode {
    struct TrieNode *children[ALPHABET_SIZE];
    int isEndOfWord;
} TrieNode;

// Function to create a new Trie node
TrieNode* createNode() {
    TrieNode *newNode = (TrieNode*)malloc(sizeof(TrieNode));
    newNode->isEndOfWord = 0;
    for (int i = 0; i < ALPHABET_SIZE; i++) { newNode->children[i] = NULL;
    }
    return newNode;
}

// Function to insert a word into the Trie
void insert(TrieNode *root, const char *word) {
    TrieNode *node = root;
    while (*word) {
        int index = *word - 'a';
        if (node->children[index] == NULL) {
            node->children[index] = createNode();
        }
        node = node->children[index];
        word++;
    }
    node->isEndOfWord = 1;
}

// Function to search for a node corresponding to the prefix in the Trie
TrieNode* searchPrefix(TrieNode *root, const char *prefix) {
    TrieNode *node = root;
    while (*prefix) {
        int index = *prefix - 'a';
        if (node->children[index] == NULL) {
            return NULL;
        }
        node = node->children[index];
        prefix++;
    }
    return node;
}

// Recursive function to suggest all words in the Trie with the given prefix
void suggestWords(TrieNode *node, char prefix[], int level) {
    if (node->isEndOfWord) {
        prefix[level] = '\0';
        printf("%s\n", prefix);
    }
    for (int i = 0; i < ALPHABET_SIZE; i++) { if (node->children[i] != NULL) {
            prefix[level] = i + 'a';
            suggestWords(node->children[i], prefix, level + 1);
        }
    }
}

// Function to autocomplete words with the given prefix
void autocomplete(TrieNode *root, const char *prefix) {
    TrieNode *node = searchPrefix(root, prefix);
    if (node == NULL) {
        printf("No suggestions found for prefix \"%s\"\n", prefix);
        return;
    }
    char buffer[100];
    strcpy(buffer, prefix);
    suggestWords(node, buffer, strlen(prefix));
}

// Main function to demonstrate the usage of the Trie for autocomplete
int main() {
    TrieNode *root = createNode();

    // Insert words into the Trie
    insert(root, "hello");
    insert(root, "hell");
    insert(root, "heaven");
    insert(root, "heavy");

    // Autocomplete suggestions for a given prefix
    char prefix[] = "he";
    printf("Autocomplete suggestions for \"%s\":\n", prefix);
    autocomplete(root, prefix);

    return 0;
}

Explanation

The program implements an autocomplete system using a Trie (prefix tree) in C. The Trie is an efficient data structure for storing and retrieving strings, particularly useful when working with a set of words that share common prefixes.

The key components of this implementation are as follows:

The TrieNode struct represents a node in the Trie. Each node has an array of pointers to its children, representing the 26 lowercase English letters (a-z), and a boolean flag isEndOfWord to indicate whether the node marks the end of a valid word.

The createNode() function creates and initializes a new Trie node, setting all its child pointers to NULL and isEndOfWord to 0.

The insert() function inserts a word into the Trie. It traverses the Trie from the root, creating new nodes as necessary for each character in the word. When the end of the word is reached, it sets the isEndOfWord flag to 1 at the corresponding node.

The searchPrefix() function searches the Trie for a given prefix and returns the node where the prefix ends. If the prefix is not found, it returns NULL.

The suggestWords() function recursively explores all possible words that start with a given prefix, starting from the node corresponding to the end of the prefix. It prints each word found.

The autocomplete() function is the main function for generating autocomplete suggestions. It searches for the given prefix in the Trie, and if found, it calls suggestWords() to print all possible completions.

In the main() function, we demonstrate the Trie’s functionality by inserting several words into the Trie and then using the autocomplete feature to suggest completions for a given prefix.

This implementation of an autocomplete system using a Trie is efficient and can handle large dictionaries of words, making it ideal for applications like search engines, text editors, and other systems requiring quick and accurate text completion.

 

By Aditya Bhuyan

I work as a cloud specialist. In addition to being an architect and SRE specialist, I work as a cloud engineer and developer. I have assisted my clients in converting their antiquated programmes into contemporary microservices that operate on various cloud computing platforms such as AWS, GCP, Azure, or VMware Tanzu, as well as orchestration systems such as Docker Swarm or Kubernetes. For over twenty years, I have been employed in the IT sector as a Java developer, J2EE architect, scrum master, and instructor. I write about Cloud Native and Cloud often. Bangalore, India is where my family and I call home. I maintain my physical and mental fitness by doing a lot of yoga and meditation.

Leave a Reply

Your email address will not be published. Required fields are marked *

error

Enjoy this blog? Please spread the word :)