The Minimum Edit Distance problem, also known as Levenshtein Distance, is a measure of how dissimilar two strings are by counting the minimum number of operations required to transform one string into the other. The allowed operations are insertion, deletion, and substitution of a single character.
Program Structure
- Input: Two strings for which the minimum edit distance needs to be calculated.
- Output: The minimum edit distance between the two strings.
- Dynamic Programming: The program uses a 2D array to store the distances for various substrings.
C++ Code
#include <iostream>
#include <vector>
#include <string>
using namespace std;
// Function to calculate the minimum edit distance between two strings
int minEditDistance(const string &str1, const string &str2) {
int m = str1.length();
int n = str2.length();
// Create a 2D vector to store distances
vector<vector<int>> dp(m + 1, vector<int>(n + 1, 0));
// Initialize the first row and column
for (int i = 0; i <= m; i++) {
dp[i][0] = i; // Deleting all characters from str1
}
for (int j = 0; j <= n; j++) {
dp[0][j] = j; // Inserting all characters of str2
}
// Fill the dp array
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
if (str1[i - 1] == str2[j - 1]) {
dp[i][j] = dp[i - 1][j - 1]; // No operation needed
} else {
dp[i][j] = min({dp[i - 1][j] + 1, // Deletion
dp[i][j - 1] + 1, // Insertion
dp[i - 1][j - 1] + 1 // Substitution
});
}
}
}
// The minimum edit distance is in dp[m][n]
return dp[m][n];
}
int main() {
string str1, str2;
// Input strings
cout << "Enter first string: ";
cin >> str1;
cout << "Enter second string: ";
cin >> str2;
// Calculate minimum edit distance
int distance = minEditDistance(str1, str2);
// Output the result
cout << "Minimum Edit Distance: " << distance << endl;
return 0;
}
Explanation of the Code
The code consists of the following main components:
- Input Handling: The program prompts the user to enter two strings.
- Edit Distance Calculation: The
minEditDistance
function uses dynamic programming to fill a 2D vectordp
. Here,dp[i][j]
holds the minimum edit distance between the firsti
characters ofstr1
and the firstj
characters ofstr2
. - Initialization: The first row and column are initialized to represent the cost of converting an empty string to a substring of the other string (i.e., the number of insertions or deletions required).
- Dynamic Programming Logic: For each character pair, the function calculates the minimum cost based on the three possible operations (deletion, insertion, substitution).
- Output: The program displays the calculated minimum edit distance.
Conclusion
This program efficiently calculates the minimum edit distance between two strings using dynamic programming techniques, providing a clear understanding of string similarity transformations.