Minimum Edit Distance in CPLUSPLUS

 

 

The Minimum Edit Distance problem, also known as Levenshtein Distance, is a measure of how dissimilar two strings are by counting the minimum number of operations required to transform one string into the other. The allowed operations are insertion, deletion, and substitution of a single character.

Program Structure

  • Input: Two strings for which the minimum edit distance needs to be calculated.
  • Output: The minimum edit distance between the two strings.
  • Dynamic Programming: The program uses a 2D array to store the distances for various substrings.

C++ Code


#include <iostream>
#include <vector>
#include <string>

using namespace std;

// Function to calculate the minimum edit distance between two strings
int minEditDistance(const string &str1, const string &str2) {
    int m = str1.length();
    int n = str2.length();
    
    // Create a 2D vector to store distances
    vector<vector<int>> dp(m + 1, vector<int>(n + 1, 0));

    // Initialize the first row and column
    for (int i = 0; i <= m; i++) {
        dp[i][0] = i; // Deleting all characters from str1
    }
    for (int j = 0; j <= n; j++) {
        dp[0][j] = j; // Inserting all characters of str2
    }

    // Fill the dp array
    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
            if (str1[i - 1] == str2[j - 1]) {
                dp[i][j] = dp[i - 1][j - 1]; // No operation needed
            } else {
                dp[i][j] = min({dp[i - 1][j] + 1,   // Deletion
                                dp[i][j - 1] + 1,   // Insertion
                                dp[i - 1][j - 1] + 1 // Substitution
                });
            }
        }
    }

    // The minimum edit distance is in dp[m][n]
    return dp[m][n];
}

int main() {
    string str1, str2;

    // Input strings
    cout << "Enter first string: ";
    cin >> str1;
    cout << "Enter second string: ";
    cin >> str2;

    // Calculate minimum edit distance
    int distance = minEditDistance(str1, str2);

    // Output the result
    cout << "Minimum Edit Distance: " << distance << endl;

    return 0;
}

Explanation of the Code

The code consists of the following main components:

  • Input Handling: The program prompts the user to enter two strings.
  • Edit Distance Calculation: The minEditDistance function uses dynamic programming to fill a 2D vector dp. Here, dp[i][j] holds the minimum edit distance between the first i characters of str1 and the first j characters of str2.
  • Initialization: The first row and column are initialized to represent the cost of converting an empty string to a substring of the other string (i.e., the number of insertions or deletions required).
  • Dynamic Programming Logic: For each character pair, the function calculates the minimum cost based on the three possible operations (deletion, insertion, substitution).
  • Output: The program displays the calculated minimum edit distance.

Conclusion

This program efficiently calculates the minimum edit distance between two strings using dynamic programming techniques, providing a clear understanding of string similarity transformations.

 

Leave a Reply

Your email address will not be published. Required fields are marked *