Notation: LCS(i,j) represents the length of maximum common subsequence of the respective strings A(0...i), B(0....j)
Optimality: To get the Longest Common Subsequence (LCS) at position (i)(j)
when the final characters of two strings, A and B, do not match (A(i) is not equal to B(i)), take the maximum of two values: LCS(i)(j) = MAX(LCS(i-1)(j), LCS(i)(j-1)).
if A(i) equal to B(i) then LCS(i)(j) = 1+ LCS(i-1)(j-1)
This illustrates the notion that the LCS is affected by the LCS values derived by removing the final character from each string when the last characters are not identical.
Recurrence Relation: if A(i)!=B(i) then LCS(i, j) =max( LCS(i-1,j) ,(i, j-1)
if A(i)=B(j) then LCS(i, j) = 1 + LCS(i-1,j-1)
Algorithm:
def find_lcs(A, B):
dp = [[0] * (len(B) + 1) for _ in range(len(A) + 1)]
for i in range(1, len(A) + 1):
for j in range(1, len(B) + 1):
dp[i][j] = dp[i-1][j-1] + 1 if A[i-1] == B[j-1] else max(dp[i-1][j], dp[i][j-1])
lcs = ''.join([s1[i-1] for i, j in zip(range(len(A), 0, -1), range(len(B), 0, -1)) if dp[i][j] != dp[i-1][j] and dp[i][j] != dp[i][j-1]])
return lcs
s1 = "ACTGACT"
s2 = "CAAGCATA"
result = find_lcs(s1, s2)
print(result)
As we are storing the results of sub-problem in the array of size n1*n2 so the time and space complexity of this problem is O(n1*n2)