Sequence Alignment Score
In bioinformatics, the edit distance between two strings measures the minimum number of insertions, deletions, and substitutions to transform one into the other. Let s_1 be the first 100 decimal di...
Problem Statement
This archive keeps the full statement, math, and original media on the page.
A frog is placed on the number line. Every step the frog jumps either \(a\) units to the left or \(b\) units to the right, both with \(1/2\) probability.
Define \(f(a, b)\) as the limit \(\lim _{n \to \infty } \frac {c_n}n\) where \(c_n\) is the expected number of unique numbers visited in the first \(n\) steps. You are given \(f(1, 1) = 0\) and \(f(1, 2) \approx 0.427050983\).
Find \(f(89, 97)\). Give your answer rounded to nine digits after the decimal point.
Problem 959: Sequence Alignment Score
Mathematical Analysis
Levenshtein Distance
The Levenshtein distance (edit distance) between strings and is the minimum number of single-character operations (insertions, deletions, substitutions) needed to transform into .
Theorem (Optimal Substructure). The edit distance satisfies the recurrence:
where is the edit distance between the first characters of and first characters of .
Proof. Consider the last operation in an optimal alignment:
- If it aligns with (match or substitution): cost is .
- If is deleted: cost is .
- If is inserted: cost is . The minimum over these three choices is optimal.
Properties of the DP Table
Proposition. The edit distance satisfies the triangle inequality: for any strings .
Proposition. For strings of length over an alphabet of size , the expected edit distance between two random strings is approximately for large .
For decimal digits () of length 100, the expected distance for random strings is . The digits of and are believed to be “normal” (uniformly distributed), so we expect the edit distance to be close to this value.
Concrete Small Example
For and :
- , (base cases).
- : , so .
- Final: (all three characters differ).
Derivation
Editorial
Edit distance (Levenshtein) between first 100 digits of pi and e. Wagner-Fischer DP algorithm: d[i][j] = min cost to align s1[:i] with s2[:j] Complexity: O(n*m) time and space, n = m = 100. We extract the first 100 decimal digits of and (well-known constants). We then build the DP table of size . Finally, return .
Pseudocode
Extract the first 100 decimal digits of $\pi$ and $e$ (well-known constants)
Build the DP table of size $(101 \times 101)$
Return $d[100][100]$
The Digit Strings
: 3141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117067…
: 2718281828459045235360287471352662497757247093699959574966967627724076630353547594571382178525166427…
(First 100 digits of each, including the digit before the decimal point.)
Proof of Correctness
Theorem (Wagner-Fischer, 1974). The dynamic programming algorithm correctly computes the Levenshtein distance.
Proof. By induction on . The base cases and are correct (requires deletions or insertions). The recurrence correctly considers all possible last operations.
Complexity Analysis
- Time: where . Total: operations.
- Space: for the full table, or with rolling arrays.
Answer
Code
Each problem page includes the exact C++ and Python source files from the local archive.
/*
* Problem 959: Sequence Alignment Score
*
* Compute Levenshtein (edit) distance between first 100 digits
* of pi and e using Wagner-Fischer DP.
*
* Complexity: O(n^2) time and space, n = 100.
*/
#include <bits/stdc++.h>
using namespace std;
int main() {
string s1 = "3141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117067";
string s2 = "2718281828459045235360287471352662497757247093699959574966967627724076630353547594571382178525166427";
int n = s1.size(), m = s2.size();
vector<vector<int>> d(n + 1, vector<int>(m + 1, 0));
for (int i = 0; i <= n; i++) d[i][0] = i;
for (int j = 0; j <= m; j++) d[0][j] = j;
for (int i = 1; i <= n; i++) {
for (int j = 1; j <= m; j++) {
if (s1[i-1] == s2[j-1]) {
d[i][j] = d[i-1][j-1];
} else {
d[i][j] = 1 + min({d[i-1][j], d[i][j-1], d[i-1][j-1]});
}
}
}
cout << d[n][m] << endl;
return 0;
}
"""
Problem 959: Sequence Alignment Score
Edit distance (Levenshtein) between first 100 digits of pi and e.
Wagner-Fischer DP algorithm:
d[i][j] = min cost to align s1[:i] with s2[:j]
Complexity: O(n*m) time and space, n = m = 100.
"""
# First 100 digits of pi and e
PI_DIGITS = "3141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117067"
E_DIGITS = "2718281828459045235360287471352662497757247093699959574966967627724076630353547594571382178525166427"
def edit_distance(s1: str, s2: str) -> tuple[int, list[list[int]]]:
"""Compute Levenshtein distance and return the full DP table."""
n, m = len(s1), len(s2)
d = [[0] * (m + 1) for _ in range(n + 1)]
for i in range(n + 1):
d[i][0] = i
for j in range(m + 1):
d[0][j] = j
for i in range(1, n + 1):
for j in range(1, m + 1):
if s1[i - 1] == s2[j - 1]:
d[i][j] = d[i - 1][j - 1]
else:
d[i][j] = 1 + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
return d[n][m], d
def solve() -> int:
s1 = PI_DIGITS[:100]
s2 = E_DIGITS[:100]
dist, _ = edit_distance(s1, s2)
return dist
# --- Compute answer ---
s1, s2 = PI_DIGITS[:100], E_DIGITS[:100]
answer, dp_table = edit_distance(s1, s2)
# Verify lengths
assert len(s1) == 100 and len(s2) == 100
print(answer)