B-tree Implementation in Python
A B-tree is a self-balancing tree data structure that maintains sorted data and allows for efficient insertion, deletion, and search operations. It is commonly used in database indexing due to its ability to maintain balance with a large number of keys and its efficiency in disk-based systems.
Program Structure
The B-tree implementation in Python consists of the following key components:
1. BTreeNode Class
This class represents a node in the B-tree. It includes the following properties:
keys
: A list of keys stored in the node.children
: A list of child nodes. This list is empty for leaf nodes.leaf
: A boolean indicating whether the node is a leaf node.
2. BTree Class
This class encapsulates the functionality of the B-tree and includes the following methods:
__init__(self, t)
: Initializes the B-tree with a given minimum degreet
.search(self, k, x=None)
: Searches for a keyk
in the B-tree, starting from nodex
(defaults to the root).insert(self, k)
: Inserts a new keyk
into the B-tree.split_child(self, x, i)
: Splits the child node at indexi
of nodex
when it is full.insert_non_full(self, x, k)
: Inserts a keyk
into a nodex
that is not full.
Python Code Implementation
class BTreeNode:
def __init__(self, t, leaf=False):
"""
Initialize a B-tree node.
:param t: Minimum degree (defines the range for number of keys).
:param leaf: Boolean flag indicating if the node is a leaf.
"""
self.t = t # Minimum degree
self.leaf = leaf # True if leaf node
self.keys = [] # List of keys in the node
self.children = [] # List of child pointers
def __str__(self):
return f"Keys: {self.keys}, Leaf: {self.leaf}"
class BTree:
def __init__(self, t):
"""
Initialize the B-tree.
:param t: Minimum degree (defines the range for number of keys).
"""
self.root = BTreeNode(t, True)
self.t = t # Minimum degree
def search(self, k, x=None):
"""
Search for a key in the B-tree.
:param k: The key to search for.
:param x: The node to start the search from (defaults to root).
:return: Tuple of (node, index) if key is found, otherwise None.
"""
if x is None:
x = self.root
i = 0
while i < len(x.keys) and k > x.keys[i]:
i += 1
if i < len(x.keys) and k == x.keys[i]:
return x, i
elif x.leaf:
return None
else:
return self.search(k, x.children[i])
def insert(self, k):
"""
Insert a key into the B-tree.
:param k: The key to insert.
"""
root = self.root
if len(root.keys) == (2 * self.t) - 1:
s = BTreeNode(self.t)
self.root = s
s.children.append(root)
self.split_child(s, 0)
self.insert_non_full(s, k)
else:
self.insert_non_full(root, k)
def split_child(self, x, i):
"""
Split the child of a node.
:param x: The node whose child is to be split.
:param i: The index of the child in the children list.
"""
t = self.t
y = x.children[i]
z = BTreeNode(t, y.leaf)
x.children.insert(i + 1, z)
x.keys.insert(i, y.keys[t - 1])
z.keys = y.keys[t: (2 * t) - 1]
y.keys = y.keys[0: t - 1]
if not y.leaf:
z.children = y.children[t: 2 * t]
y.children = y.children[0: t]
def insert_non_full(self, x, k):
"""
Insert a key into a node that is not full.
:param x: The node to insert the key into.
:param k: The key to insert.
"""
i = len(x.keys) - 1
if x.leaf:
x.keys.append(None)
while i >= 0 and k < x.keys[i]:
x.keys[i + 1] = x.keys[i]
i -= 1
x.keys[i + 1] = k
else:
while i >= 0 and k < x.keys[i]:
i -= 1
i += 1
if len(x.children[i].keys) == (2 * self.t) - 1:
self.split_child(x, i)
if k > x.keys[i]:
i += 1
self.insert_non_full(x.children[i], k)
Explanation
The implementation of a B-tree is structured around two main classes: BTreeNode
and BTree
.
1. BTreeNode Class
The BTreeNode
class represents a node in the B-tree. Each node contains:
keys
: A list that stores the keys in the node, kept in sorted order.children
: A list of pointers to child nodes. This list is empty for leaf nodes.leaf
: A boolean flag that indicates whether the node is a leaf.
2. BTree Class
The BTree
class represents the B-tree itself and provides methods for inserting keys, searching for keys, and managing the tree’s structure. The key methods include:
search(self, k, x=None)
: Searches for a keyk
starting from the nodex
. It returns the node and index where the key is found, orNone
if the key is not present.insert(self, k)
: Inserts a new keyk
into the B-tree. If the root is full, the tree grows in height by splitting the root.split_child(self, x, i)
: Splits a full child node ofx
at indexi
. This is used to ensure nodes do not exceed the maximum number of keys.insert_non_full(self, x, k)
: Handles the insertion of a key into a node that is not full. If necessary, it recursively splits child nodes that are full.
Usage Example
# Example usage of the BTree
# Create a B-tree with a minimum degree of 3
btree = BTree(3)
# Insert some keys into the B-tree
keys = [10, 20, 5, 6, 12, 30, 7, 17]
for key in keys:
btree.insert(key)
# Search for a key in the B-tree
result = btree.search(6)
if result:
node, index = result
print(f"Key 6 found in node with keys: {node.keys}")
else:
print("Key 6 not found in the B-tree")
# Output should show that key 6 is found in the node
This example demonstrates how to create a B-tree with a minimum degree of 3, insert several keys into the tree, and search for a specific key. The output indicates whether the key is found and, if so, the node in which it resides.