Bloom Filter Implementation in C++

A Bloom Filter is a probabilistic data structure used to test whether an element is a member of a set. It can return two types of results:

True positive: The element is definitely in the set.
False positive: The element is possibly in the set (but not definitely).

The Bloom Filter does not store the elements themselves but uses multiple hash functions to set bits in a bit array. When querying an element, the Bloom Filter checks the corresponding bits. If all the bits are set, the element is likely in the set; otherwise, it is definitely not.

Program Structure

The C++ implementation of the Bloom Filter will include:

Initialization: Setting up the bit array and hash functions.
Hash Functions: Implementing multiple hash functions to map elements to different positions in the bit array.
Insert Operation: Adding elements to the Bloom Filter.
Query Operation: Checking for the possible membership of elements in the set.

Code Implementation


#include <iostream>
#include <vector>
#include <string>
#include <functional>

class BloomFilter {
private:
    std::vector<bool> bitArray;
    int size;
    int numHashFunctions;

    // Hash function for demonstration purposes
    int hash1(const std::string &key) const {
        std::hash<std::string> hash_fn;
        return hash_fn(key) % size;
    }

    // Another hash function for demonstration purposes
    int hash2(const std::string &key) const {
        std::hash<std::string> hash_fn;
        return (hash_fn(key) / size) % size;
    }

public:
    // Constructor to initialize the Bloom Filter
    BloomFilter(int size, int numHashFunctions)
        : size(size), numHashFunctions(numHashFunctions) {
        bitArray.resize(size, false);
    }

    // Insert an element into the Bloom Filter
    void insert(const std::string &key) {
        int index1 = hash1(key);
        int index2 = hash2(key);

        bitArray[index1] = true;
        bitArray[index2] = true;

        // For more hash functions, continue similarly
    }

    // Query an element in the Bloom Filter
    bool contains(const std::string &key) const {
        int index1 = hash1(key);
        int index2 = hash2(key);

        return bitArray[index1] && bitArray[index2];
    }
};

// Main function to demonstrate the Bloom Filter operations
int main() {
    int size = 100; // Size of the bit array
    int numHashFunctions = 2; // Number of hash functions

    BloomFilter bf(size, numHashFunctions);

    // Insert elements into the Bloom Filter
    bf.insert("hello");
    bf.insert("world");

    // Query for elements
    if (bf.contains("hello")) {
        std::cout << "'hello' is possibly in the set." << std::endl;
    } else {
        std::cout << "'hello' is definitely not in the set." << std::endl;
    }

    if (bf.contains("goodbye")) {
        std::cout << "'goodbye' is possibly in the set." << std::endl;
    } else {
        std::cout << "'goodbye' is definitely not in the set." << std::endl;
    }

    return 0;
}

Explanation

Initialization

The constructor of the BloomFilter class initializes a bit array of a given size and sets the number of hash functions. The bit array is initialized to false for all positions.

Hash Functions

In this implementation, two simple hash functions, hash1() and hash2(), are defined using the C++ standard library’s std::hash function. These functions compute different indices in the bit array where the bits will be set. More hash functions can be added for increased accuracy.

Insert Operation

The insert() function adds a key to the Bloom Filter. It calculates the indices using the hash functions and sets the corresponding bits in the bit array to true.

Query Operation

The contains() function checks if a key is present in the Bloom Filter. It calculates the indices using the hash functions and checks if all the corresponding bits in the bit array are set to true. If they are, the key is probably in the set; otherwise, it is definitely not.

Conclusion

The Bloom Filter is an efficient and space-saving probabilistic data structure for checking set membership. Although it can produce false positives, it never generates false negatives, making it useful for applications where the occasional false positive is acceptable. This C++ implementation provides a basic yet effective way to utilize Bloom Filters in various contexts.

Bloom Filter Implementation in C++

Bloom Filter Implementation in C++

Program Structure

Code Implementation

Explanation

Initialization

Hash Functions

Insert Operation

Query Operation

Conclusion

By Aditya Bhuyan

Leave a Reply Cancel reply

You Missed

Welcome to the Digital Recipe Book in C Programming

Diary Application in C Programming

Basic Music Player Application in C

Track and Display Stock Prices Using an API in C Programming

Featured Posts

Welcome to the Digital Recipe Book in C Programming

Diary Application in C Programming

Basic Music Player Application in C

Track and Display Stock Prices Using an API in C Programming

Create the Classic Snake Game in C

Enjoy this blog? Please spread the word :)

Bloom Filter Implementation in C++

Program Structure

Code Implementation

Explanation

Initialization

Hash Functions

Insert Operation

Query Operation

Conclusion

By Aditya Bhuyan

Related Post

Leave a Reply Cancel reply

You Missed

Enjoy this blog? Please spread the word :)