This program identifies duplicate subtrees in a binary tree using hashing. It utilizes a dictionary to keep track of serialized subtrees and their occurrences.
Program Structure
- TreeNode Class: Defines the structure of a node in the binary tree.
- serialize Function: Serializes the subtree rooted at a given node into a string format.
- findDuplicateSubtrees Function: Traverses the binary tree and identifies duplicate subtrees.
- Example Usage: Demonstrates how to build a binary tree and call the function to find duplicates.
Python Code
class TreeNode:
def __init__(self, val=0, left=None, right=None):
self.val = val
self.left = left
self.right = right
def serialize(node, lookup, duplicates):
if not node:
return "#"
# Serialize the subtree
serialized = "{},{},{}".format(node.val, serialize(node.left, lookup, duplicates), serialize(node.right, lookup, duplicates))
# Use the serialized subtree to track occurrences
if serialized in lookup:
lookup[serialized] += 1
else:
lookup[serialized] = 1
# If this subtree is seen twice, add it to duplicates
if lookup[serialized] == 2:
duplicates.append(node)
return serialized
def findDuplicateSubtrees(root):
lookup = {}
duplicates = []
serialize(root, lookup, duplicates)
return duplicates
# Example Usage
if __name__ == "__main__":
# Construct the binary tree
# Example Tree:
# 1
# / \
# 2 3
# / \ \
# 4 5 2
# / \
# 4 5
root = TreeNode(1)
root.left = TreeNode(2)
root.right = TreeNode(3)
root.left.left = TreeNode(4)
root.left.right = TreeNode(5)
root.right.right = TreeNode(2)
root.right.right.left = TreeNode(4)
root.right.right.right = TreeNode(5)
duplicates = findDuplicateSubtrees(root)
for dup in duplicates:
print("Duplicate subtree rooted at node with value:", dup.val)
Explanation
The program uses a depth-first traversal approach to serialize each subtree and track its occurrences using a dictionary:
- The
TreeNode
class represents each node in the binary tree. - The
serialize
function recursively constructs a string representation of the subtree. - The
lookup
dictionary records how many times each serialized subtree appears. - If a subtree is found twice, it is added to the
duplicates
list. - The
findDuplicateSubtrees
function initializes the process and returns the list of duplicate subtrees.
In the example usage, a binary tree is constructed, and the program identifies and prints the root values of any duplicate subtrees.
Explanation of the Code:
- TreeNode Class: Defines a simple structure for a binary tree node, with properties for the node value and pointers to left and right children.
- serialize Function:
- Recursively serializes each subtree rooted at the current node.
- Uses a hash table (
lookup
) to count occurrences of each serialized subtree. - Adds a subtree to the
duplicates
list if it is encountered for the second time.
- findDuplicateSubtrees Function:
- Initializes data structures and starts the serialization process from the root of the binary tree.
- Example Usage:
- Constructs a sample binary tree.
- Calls the
findDuplicateSubtrees
function and prints out the values of the roots of duplicate subtrees.