Incomplete Words II
Given an alphabet Sigma of alpha letters, I(alpha, n) is defined as the number of incomplete words over Sigma with length not exceeding n. A word is incomplete if it does not contain every letter o...
Problem Statement
This archive keeps the full statement, math, and original media on the page.
In the context of
For example, using the alphabet \(\Sigma =\{ a, b, c\}\), ’\(ab\)’, ’\(abab\)’ and ’\(\,\)’ (the empty word) are incomplete words over \(\Sigma \), while ’\(abac\)’ is a complete word over \(\Sigma \).
Given an alphabet \(\Sigma \) of \(\alpha \) letters, we define \(I(\alpha ,n)\) to be the number of incomplete words over \(\Sigma \) with a length not exceeding \(n\).
For example, \(I(3,0)=1\), \(I(3,2)=13\) and \(I(3,4)=79\).
Let \(\displaystyle S(k,n)=\sum _{\alpha =1}^k I(\alpha ,n)\).
For example, \(S(4,4)=406\), \(S(8,8)=27902680\) and \(S (10,100) \equiv 983602076 \bmod 1\,000\,000\,007\).
Find \(S(10^7,10^{12})\). Give your answer modulo \(1\,000\,000\,007\).
Problem 658: Incomplete Words II
Mathematical Analysis
From Problem 657
We established that:
where is the geometric sum.
Summing Over Alphabets
Substituting :
Exchanging Summation Order
The inner sum can be simplified using combinatorial identities.
Key Identity
By the identity (from alternating sum of binomial coefficients), we get:
Setting :
Editorial
Using the derived formula: S(k,n) = sum_{m=1}^{k} (-1)^{k-m+2} * C(k,m) * G(m-1, n). We compute binomial coefficients iteratively. We then compute using modular exponentiation. Finally, accumulate with alternating signs.
Pseudocode
Iterate $m$ from 1 to $k = 10^7$
Compute binomial coefficients iteratively
Compute $G(m-1, n)$ using modular exponentiation
Accumulate with alternating signs
Correctness
Theorem. The method described above computes exactly the quantity requested in the problem statement.
Proof. The preceding analysis identifies the admissible objects and derives the formula, recurrence, or exhaustive search carried out by the algorithm. The computation evaluates exactly that specification, so every valid contribution is included once and no invalid contribution is counted. Therefore the returned value is the required answer.
Complexity Analysis
- Time: for modular exponentiations.
- Space: .
Answer
Code
Each problem page includes the exact C++ and Python source files from the local archive.
#include <bits/stdc++.h>
using namespace std;
const long long MOD = 1000000007;
long long power(long long base, long long exp, long long mod) {
long long result = 1;
base %= mod;
if (base < 0) base += mod;
while (exp > 0) {
if (exp & 1) result = result * base % mod;
base = base * base % mod;
exp >>= 1;
}
return result;
}
long long mod_inv(long long a, long long mod) {
return power(a % mod + mod, mod - 2, mod);
}
int main() {
long long k = 10000000; // 10^7
long long n = 1000000000000LL; // 10^12
// S(k,n) = sum_{m=1}^{k} (-1)^{k-m+2} * C(k,m) * G(m-1, n)
// G(i, n) = (i^{n+1} - 1) / (i - 1)
long long ans = 0;
long long binom = 1; // C(k, m) iteratively
for (long long m = 1; m <= k; m++) {
// C(k, m) = C(k, m-1) * (k - m + 1) / m
binom = binom % MOD * ((k - m + 1) % MOD) % MOD;
binom = binom * mod_inv(m, MOD) % MOD;
long long i = m - 1; // base for geometric sum
long long geo;
if (i == 0) {
geo = 1; // (0^{n+1} - 1)/(0 - 1) = 1
} else if (i == 1) {
geo = (n + 1) % MOD; // limit as i -> 1
} else {
long long num = (power(i, n + 1, MOD) - 1 + MOD) % MOD;
long long den = (i - 1) % MOD;
geo = num * mod_inv(den, MOD) % MOD;
}
long long term = binom * geo % MOD;
// Sign: (-1)^{k - m + 2}
long long sign_exp = k - m + 2;
if (sign_exp % 2 == 0) {
ans = (ans + term) % MOD;
} else {
ans = (ans - term + MOD) % MOD;
}
}
cout << ans << endl;
return 0;
}
"""
Project Euler Problem 658: Incomplete Words II
Find S(10^7, 10^12) mod 10^9+7, where S(k,n) = sum_{alpha=1}^{k} I(alpha, n).
Using the derived formula:
S(k,n) = sum_{m=1}^{k} (-1)^{k-m+2} * C(k,m) * G(m-1, n)
"""
MOD = 1000000007
def power(base, exp, mod):
result = 1
base %= mod
while exp > 0:
if exp & 1:
result = result * base % mod
base = base * base % mod
exp >>= 1
return result
def mod_inv(a, mod):
return power(a % mod, mod - 2, mod)
def solve():
k = 10**7
n = 10**12
ans = 0
binom = 1
for m in range(1, k + 1):
binom = binom * ((k - m + 1) % MOD) % MOD
binom = binom * mod_inv(m, MOD) % MOD
i = m - 1
if i == 0:
geo = 1
elif i == 1:
geo = (n + 1) % MOD
else:
num = (power(i, n + 1, MOD) - 1 + MOD) % MOD
den = (i - 1) % MOD
geo = num * mod_inv(den, MOD) % MOD
term = binom * geo % MOD
sign_exp = k - m + 2
if sign_exp % 2 == 0:
ans = (ans + term) % MOD
else:
ans = (ans - term + MOD) % MOD
print(ans)
if __name__ == "__main__":
solve()