cplusplus
cplusplus

 

 

This post presents a C++ program to identify duplicate subtrees in a binary tree. The program employs hashing to store the serialized representation of each subtree and uses a hash map to track occurrences.

Program Explanation

The program defines a binary tree structure and provides a mechanism to find duplicate subtrees. Here’s a breakdown of the components:

1. Structure of the Node

struct TreeNode {
    int val;
    TreeNode *left;
    TreeNode *right;
    TreeNode(int x) : val(x), left(NULL), right(NULL) {}
};

The TreeNode structure represents each node in the binary tree, containing an integer value and pointers to its left and right children.

2. Hash Map for Subtree Serialization

#include <unordered_map>
#include <vector>
#include <string>

unordered_map<string, int> subtreeCount;

The subtreeCount hash map stores the serialized representation of subtrees as keys and their occurrence counts as values. This allows us to track duplicates efficiently.

3. Serializing the Subtree

string serialize(TreeNode* root) {
    if (!root) return "#"; // Use '#' to represent NULL
    string left = serialize(root->left);
    string right = serialize(root->right);
    string serialized = to_string(root->val) + "," + left + "," + right;
    subtreeCount[serialized]++;
    return serialized;
}

The serialize function converts a subtree rooted at root into a string format. It recursively serializes the left and right children, concatenating their representations with the current node’s value. The function also updates the subtreeCount hash map.

4. Finding Duplicate Subtrees

vector<TreeNode*> findDuplicateSubtrees(TreeNode* root) {
    vector<TreeNode*> duplicates;
    serialize(root); // Populate the subtreeCount
    for (const auto& entry : subtreeCount) {
        if (entry.second > 1) {
            // Add the subtree to duplicates (not implemented here)
        }
    }
    return duplicates;
}

The findDuplicateSubtrees function first serializes the tree to populate the subtreeCount map. Then, it iterates through the map to identify subtrees that appear more than once. Note that the logic for actually adding the duplicate subtrees to the duplicates vector needs to be implemented for a complete solution.

Complete C++ Program

#include <iostream>
#include <unordered_map>
#include <vector>
#include <string>

using namespace std;

// Structure for a binary tree node
struct TreeNode {
    int val;
    TreeNode *left;
    TreeNode *right;
    TreeNode(int x) : val(x), left(NULL), right(NULL) {}
};

// Hash map to count subtree occurrences
unordered_map<string, int> subtreeCount;

// Function to serialize the subtree and populate the hash map
string serialize(TreeNode* root) {
    if (!root) return "#"; // Use '#' to represent NULL
    string left = serialize(root->left);
    string right = serialize(root->right);
    string serialized = to_string(root->val) + "," + left + "," + right;
    subtreeCount[serialized]++;
    return serialized;
}

// Function to find duplicate subtrees
vector<TreeNode*> findDuplicateSubtrees(TreeNode* root) {
    vector<TreeNode*> duplicates;
    serialize(root); // Populate the subtreeCount

    // Find all duplicate subtrees
    for (const auto& entry : subtreeCount) {
        if (entry.second > 1) {
            // Add the subtree root node to duplicates (not implemented)
        }
    }
    return duplicates;
}

// Example usage
int main() {
    // Create a sample binary tree
    TreeNode* root = new TreeNode(1);
    root->left = new TreeNode(2);
    root->right = new TreeNode(3);
    root->left->left = new TreeNode(4);
    root->left->right = new TreeNode(2);
    root->left->right->left = new TreeNode(4);
    root->right->right = new TreeNode(4);

    // Find duplicate subtrees
    vector<TreeNode*> duplicates = findDuplicateSubtrees(root);

    // Print duplicate subtree roots (dummy print, to be implemented)
    cout << "Duplicate subtrees found: " << duplicates.size() << endl;

    return 0;
}

Conclusion

This program demonstrates how to find duplicate subtrees in a binary tree using hashing for efficient storage and retrieval of subtree representations. By serializing each subtree and counting occurrences with a hash map, we can quickly identify duplicates.

 

By Aditya Bhuyan

I work as a cloud specialist. In addition to being an architect and SRE specialist, I work as a cloud engineer and developer. I have assisted my clients in converting their antiquated programmes into contemporary microservices that operate on various cloud computing platforms such as AWS, GCP, Azure, or VMware Tanzu, as well as orchestration systems such as Docker Swarm or Kubernetes. For over twenty years, I have been employed in the IT sector as a Java developer, J2EE architect, scrum master, and instructor. I write about Cloud Native and Cloud often. Bangalore, India is where my family and I call home. I maintain my physical and mental fitness by doing a lot of yoga and meditation.

Leave a Reply

Your email address will not be published. Required fields are marked *

error

Enjoy this blog? Please spread the word :)