Problem Statement

The Minimum Edit Distance problem asks for the minimum number of single-character operations required to transform one string into another. This number, also known as the Levenshtein distance, measures how dissimilar the two strings are. The allowed operations are insertion, deletion, and substitution of a single character, and each operation costs 1.

Approach Explanation

This problem can be efficiently solved using dynamic programming. The idea is to build a table that represents the edit distance between substrings of the two input strings.

Dynamic Programming Structure:

  1. Define a 2D DP table where dp[i][j] represents the minimum edit distance between the first i characters of string A and the first j characters of string B.
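  2. Initialize the first column and first row: dp[i][0] = i (delete all i characters of the prefix of A to reach the empty string) and dp[0][j] = j (insert all j characters of the prefix of B into the empty string).
  3. Iterate through the rest of the table. If A[i - 1] equals B[j - 1], set dp[i][j] = dp[i - 1][j - 1]. Otherwise set dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1]), corresponding to deletion, insertion, and substitution respectively.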
  4. The final answer will be found in dp[m][n], where m and n are the lengths of the two strings.
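
The same recurrence can also be written top-down, which may make the dependency between cells easier to see. The following is a minimal sketch, not the approach used in the program below: the name edit_distance_recursive is hypothetical, and the recursion depth grows with m + n, so it is only suitable for short strings.

```python
from functools import lru_cache


def edit_distance_recursive(A, B):
    """Top-down form of the DP recurrence (hypothetical helper name)."""

    @lru_cache(maxsize=None)
    def d(i, j):
        # d(i, j) is the edit distance between A[:i] and B[:j]
        if i == 0:
            return j  # insert the first j characters of B
        if j == 0:
            return i  # delete the first i characters of A
        if A[i - 1] == B[j - 1]:
            return d(i - 1, j - 1)  # last characters match, no cost
        return 1 + min(
            d(i - 1, j),      # deletion
            d(i, j - 1),      # insertion
            d(i - 1, j - 1),  # substitution
        )

    return d(len(A), len(B))


print(edit_distance_recursive("kitten", "sitting"))  # 3
```

Memoization with lru_cache ensures each (i, j) pair is computed once, so the running time stays proportional to m * n.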

Time Complexity:

O(m * n), where m and n are the lengths of the two strings.

Space Complexity:

O(m * n) for the DP table.
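
Because each row of the table depends only on the row directly above it, the full table is not strictly needed when only the distance is wanted. The sketch below keeps two rows at a time, which brings the extra space down to O(min(m, n)). The function name min_edit_distance_two_rows is an assumption for illustration, and swapping the inputs relies on the distance being symmetric when all three operations cost 1.

```python
def min_edit_distance_two_rows(A, B):
    """Edit distance using two rows instead of the full table (illustrative)."""
    # Put the shorter string along the row so a row has min(m, n) + 1 cells.
    if len(B) > len(A):
        A, B = B, A

    prev = list(range(len(B) + 1))  # row 0: dp[0][j] = j
    for i in range(1, len(A) + 1):
        curr = [i] + [0] * len(B)  # dp[i][0] = i
        for j in range(1, len(B) + 1):
            if A[i - 1] == B[j - 1]:
                curr[j] = prev[j - 1]  # characters match, no cost
            else:
                curr[j] = 1 + min(
                    prev[j],      # deletion
                    curr[j - 1],  # insertion
                    prev[j - 1],  # substitution
                )
        prev = curr
    return prev[len(B)]


print(min_edit_distance_two_rows("kitten", "sitting"))  # 3
```

The trade-off is that the sequence of edits can no longer be recovered afterwards, since the earlier rows are discarded.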

Python Code


def min_edit_distance(A, B):
    """
    Function to calculate the minimum edit distance between two strings.

    Args:
    A (str): First input string.
    B (str): Second input string.

    Returns:
    int: Minimum edit distance between the two strings.
    """
    m, n = len(A), len(B)
    
    # Create a DP table initialized to 0
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    # Initialize the first column and first row
    for i in range(m + 1):
        dp[i][0] = i  # Cost of deleting the first i characters of A
    for j in range(n + 1):
        dp[0][j] = j  # Cost of inserting the first j characters of B

    # Fill the DP table
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if A[i - 1] == B[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]  # No operation needed
            else:
                dp[i][j] = min(dp[i - 1][j] + 1,    # Deletion
                               dp[i][j - 1] + 1,    # Insertion
                               dp[i - 1][j - 1] + 1)  # Substitution

    return dp[m][n]


# Example usage
string1 = "kitten"
string2 = "sitting"
distance = min_edit_distance(string1, string2)

print("Minimum edit distance is:", distance)
    

Explanation of the Program

Let’s break down the structure of the program:

1. Input:

The input consists of two strings for which we want to calculate the minimum edit distance. For example:

    string1 = "kitten"
    string2 = "sitting"

2. DP Table Initialization:

A DP table is created with dimensions (m + 1) x (n + 1), initialized to zeros. This table is used to store the minimum edit distances for different pairs of prefixes of the two strings.

3. Base Case Setup:

The first row and the first column of the table are initialized. The first column represents the cost of converting the first string to an empty string (i.e., deleting all characters), while the first row represents the cost of converting an empty string to the second string (i.e., inserting all characters).
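
To make the base cases concrete, here is a small sketch that builds only the first row and first column and prints the table before the main loop runs. The inputs "cat" and "dog" are hypothetical and chosen only to keep the table small.

```python
A, B = "cat", "dog"  # hypothetical short inputs
m, n = len(A), len(B)

dp = [[0] * (n + 1) for _ in range(m + 1)]
for i in range(m + 1):
    dp[i][0] = i  # delete the first i characters of A
for j in range(n + 1):
    dp[0][j] = j  # insert the first j characters of B

for row in dp:
    print(row)
# [0, 1, 2, 3]
# [1, 0, 0, 0]
# [2, 0, 0, 0]
# [3, 0, 0, 0]
```

The zeros in the interior are placeholders that the filling step overwrites.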

4. Filling the DP Table:

The program iterates over every pair of prefix lengths (i, j). If the characters A[i - 1] and B[j - 1] match, the value is carried over from the diagonal cell dp[i - 1][j - 1] (no operation needed). If they do not match, the program takes the smallest of the three neighbouring cells, dp[i - 1][j] (deletion), dp[i][j - 1] (insertion), and dp[i - 1][j - 1] (substitution), and adds 1 for the operation.

5. Final Result:

The minimum edit distance can be found in the bottom-right cell of the DP table: dp[m][n].

Example Execution:

For the provided input strings, the output will display the minimum edit distance:

Minimum edit distance is: 3

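This indicates that the minimum number of operations required to transform “kitten” into “sitting” is 3. One such sequence is: substitute “k” with “s” (sitten), substitute “e” with “i” (sittin), and insert “g” at the end (sitting).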
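
The table can also be used to recover one sequence of edits, not just their count. The sketch below is an assumption-laden illustration rather than part of the original program: edit_script is a hypothetical name, it rebuilds the same table, and then walks back from dp[m][n] to dp[0][0]. When several minimal sequences exist, it returns only one of them.

```python
def edit_script(A, B):
    """Return one minimal list of edits turning A into B (illustrative)."""
    m, n = len(A), len(B)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if A[i - 1] == B[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])

    # Walk back from the bottom-right cell, recording each edit on the way.
    ops = []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and A[i - 1] == B[j - 1]:
            i, j = i - 1, j - 1  # characters match, nothing to record
        elif i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + 1:
            ops.append(f"substitute {A[i - 1]!r} -> {B[j - 1]!r}")
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            ops.append(f"delete {A[i - 1]!r}")
            i -= 1
        else:
            ops.append(f"insert {B[j - 1]!r}")
            j -= 1

    ops.reverse()  # edits were collected from the end of the strings
    return ops


for op in edit_script("kitten", "sitting"):
    print(op)
```

The number of recorded edits always equals dp[m][n], since every recorded step moves to a neighbouring cell whose value is exactly one lower.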

 

By Aditya Bhuyan

I work as a cloud specialist. In addition to being an architect and SRE specialist, I work as a cloud engineer and developer. I have assisted my clients in converting their antiquated programmes into contemporary microservices that operate on various cloud computing platforms such as AWS, GCP, Azure, or VMware Tanzu, as well as orchestration systems such as Docker Swarm or Kubernetes. For over twenty years, I have been employed in the IT sector as a Java developer, J2EE architect, scrum master, and instructor. I write about Cloud Native and Cloud often. Bangalore, India is where my family and I call home. I maintain my physical and mental fitness by doing a lot of yoga and meditation.
