This document presents a Java program that identifies duplicate subtrees in a binary tree using hashing. The approach serializes every subtree to a string and uses a hash map to count how often each serialized form occurs, so that two subtrees with the same structure and node values are detected as duplicates.
Program Structure
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
class TreeNode {
    int val;
    TreeNode left;
    TreeNode right;

    TreeNode(int x) { val = x; }
}
public class DuplicateSubtrees {
    // Maps each subtree's serialized form to the number of times it has been seen.
    private HashMap<String, Integer> subtreeMap = new HashMap<>();
    // Holds one root node per distinct subtree that occurs more than once.
    private ArrayList<TreeNode> duplicateSubtrees = new ArrayList<>();

    public List<TreeNode> findDuplicateSubtrees(TreeNode root) {
        serializeSubtrees(root);
        return duplicateSubtrees;
    }

    private String serializeSubtrees(TreeNode node) {
        if (node == null) {
            return "#"; // Use a marker for null
        }
        // Pre-order form: value, left subtree, right subtree
        String serial = node.val + "," + serializeSubtrees(node.left) + "," + serializeSubtrees(node.right);
        // If this serialization has been seen exactly once before, this is its
        // second occurrence, so report it (later occurrences are not re-added)
        if (subtreeMap.getOrDefault(serial, 0) == 1) {
            duplicateSubtrees.add(node); // Add to duplicates
        }
        subtreeMap.put(serial, subtreeMap.getOrDefault(serial, 0) + 1);
        return serial;
    }
}
Explanation
Classes and Methods
- TreeNode: This class represents a node in the binary tree. It contains the integer value of the node and references to the left and right children.
- DuplicateSubtrees: This class contains the logic for finding duplicate subtrees. It has:
  - subtreeMap (a HashMap<String, Integer>): This map tracks the serialized representation of each subtree and the number of times it has been seen.
  - duplicateSubtrees (an ArrayList of TreeNode): This list stores one root node for each distinct subtree that occurs more than once.
  - findDuplicateSubtrees(TreeNode root): This method starts the search at the root and returns the list of duplicates.
  - serializeSubtrees(TreeNode node): This recursive method serializes the subtree rooted at the given node. It returns a string representation of the subtree and updates the hash map.
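The counting role of the map can be illustrated in isolation. The sketch below is not part of the program: `CountOnceDemo` and `reportedOnce` are hypothetical names, and plain strings stand in for serialized subtrees. It shows why checking for a previous count of exactly 1 reports each repeated key once, no matter how many times it recurs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; plain strings stand in for serialized subtrees.
class CountOnceDemo {

    // Returns every key that occurs more than once, reported a single time,
    // in the order in which its second occurrence is encountered.
    static List<String> reportedOnce(List<String> keys) {
        Map<String, Integer> counts = new HashMap<>();
        List<String> reported = new ArrayList<>();
        for (String key : keys) {
            int seenBefore = counts.getOrDefault(key, 0);
            if (seenBefore == 1) {
                // Second occurrence: report now, never again.
                reported.add(key);
            }
            counts.put(key, seenBefore + 1);
        }
        return reported;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("4,#,#", "4,#,#", "2,4,#,#,#", "4,#,#");
        System.out.println(reportedOnce(keys)); // prints [4,#,#]
    }
}
```

The third occurrence of "4,#,#" finds a previous count of 2, so it is counted but not reported again.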
How It Works
1. The program starts with the `findDuplicateSubtrees` method, which calls `serializeSubtrees` on the root of the binary tree.
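2. The `serializeSubtrees` method generates a string for each subtree by concatenating the node’s value with the serialized representations of its left and right subtrees, using `#` for a missing child. Two subtrees produce the same string exactly when they have the same structure and the same node values.
3. Each serialized string is used as a key in the `subtreeMap`. When a subtree’s serialization appears for the second time, the root node of that subtree is added to the `duplicateSubtrees` list. Third and later occurrences are counted but not added again, so each duplicated structure is reported once.
4. Finally, `findDuplicateSubtrees` returns the list of duplicate subtree roots.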
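To make the serialization format concrete, the following sketch prints the string produced for every subtree of a small tree. It is an illustration only: `SerializationTrace` and its nested `Node` class are hypothetical names that mirror `TreeNode`, and the `out` list exists purely to record the strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; Node mirrors the TreeNode class of the program.
class SerializationTrace {

    static final class Node {
        final int val;
        Node left;
        Node right;
        Node(int val) { this.val = val; }
    }

    // Same format as the program: value, left, right, with "#" for null.
    // Every non-null subtree's string is appended to out in post-order.
    static String serialize(Node node, List<String> out) {
        if (node == null) {
            return "#";
        }
        String serial = node.val + "," + serialize(node.left, out) + "," + serialize(node.right, out);
        out.add(serial);
        return serial;
    }

    public static void main(String[] args) {
        // Node 2 has only a left child 4; node 3 has only a right child 4.
        Node root = new Node(1);
        root.left = new Node(2);
        root.left.left = new Node(4);
        root.right = new Node(3);
        root.right.right = new Node(4);

        List<String> serials = new ArrayList<>();
        serialize(root, serials);
        for (String s : serials) {
            System.out.println(s);
        }
        // Prints:
        // 4,#,#
        // 2,4,#,#,#
        // 4,#,#
        // 3,#,4,#,#
        // 1,2,4,#,#,#,3,#,4,#,#
    }
}
```

The string "4,#,#" appears twice, which is what marks the leaf 4 as a duplicate. The null markers also keep a left-only child ("2,4,#,#,#") distinct from a right-only child ("3,#,4,#,#" has the marker before the 4), so subtrees with different shapes never share a key.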
Usage
To use this program, build your binary tree from `TreeNode` instances, then create a `DuplicateSubtrees` object and call `findDuplicateSubtrees` on the root:
import java.util.List;

public class Main {
    public static void main(String[] args) {
        TreeNode root = new TreeNode(1);
        root.left = new TreeNode(2);
        root.right = new TreeNode(3);
        root.left.left = new TreeNode(4);
        root.left.right = new TreeNode(2);
        root.left.right.left = new TreeNode(4);
        root.right.right = new TreeNode(4);

        DuplicateSubtrees finder = new DuplicateSubtrees();
        List<TreeNode> duplicates = finder.findDuplicateSubtrees(root);

        // The leaf subtree 4 occurs three times and is the only duplicate,
        // so this loop prints 4 once.
        for (TreeNode node : duplicates) {
            System.out.println(node.val);
        }
    }
}
Conclusion
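This Java program finds duplicate subtrees in a binary tree using serialization and hashing. It visits each node once, but every node also builds and hashes a string whose length is proportional to the size of its subtree. Total time and memory therefore grow with the sum of all subtree sizes: roughly O(n log n) for a balanced tree of n nodes and O(n^2) for a heavily skewed one. The recursion depth equals the tree height, so very deep trees may also exhaust the call stack.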
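Because the serialized strings grow with subtree size, a common refinement is to give each distinct subtree a small integer ID and build keys from the node value plus the two child IDs. The sketch below is a variant under that assumption, not the program presented above: `DuplicateSubtreesById`, `Node`, `assignId`, `idOfKey`, and `countOfId` are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical variant: keys are "val,leftId,rightId", so their length does
// not depend on the size of the subtree they describe.
class DuplicateSubtreesById {

    static final class Node {
        final int val;
        Node left;
        Node right;
        Node(int val) { this.val = val; }
    }

    private final Map<String, Integer> idOfKey = new HashMap<>();
    private final Map<Integer, Integer> countOfId = new HashMap<>();
    private final List<Node> duplicates = new ArrayList<>();

    List<Node> findDuplicateSubtrees(Node root) {
        assignId(root);
        return duplicates;
    }

    // Returns the ID of the subtree rooted at node; 0 is reserved for null.
    private int assignId(Node node) {
        if (node == null) {
            return 0;
        }
        String key = node.val + "," + assignId(node.left) + "," + assignId(node.right);
        Integer id = idOfKey.get(key);
        if (id == null) {
            id = idOfKey.size() + 1; // next unused ID
            idOfKey.put(key, id);
        }
        // merge returns the updated count for this ID.
        int seen = countOfId.merge(id, 1, Integer::sum);
        if (seen == 2) {
            duplicates.add(node); // report on the second occurrence only
        }
        return id;
    }

    public static void main(String[] args) {
        // Same tree as in the Usage section.
        Node root = new Node(1);
        root.left = new Node(2);
        root.right = new Node(3);
        root.left.left = new Node(4);
        root.left.right = new Node(2);
        root.left.right.left = new Node(4);
        root.right.right = new Node(4);

        for (Node n : new DuplicateSubtreesById().findDuplicateSubtrees(root)) {
            System.out.println(n.val); // prints 4
        }
    }
}
```

Two subtrees receive the same ID exactly when their root values and both child IDs match, which by induction means the subtrees are identical. The recursion depth is unchanged, so this variant addresses key size only, not stack usage on very deep trees.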