Project Euler

Incomplete Words

In the theory of formal languages, a word is any finite sequence of letters from a given alphabet, and a word is called incomplete if it does not contain every letter of that alphabet. For example,...

Source sync Apr 19, 2026

Problem #0657

Level Level 19

Solved By 674

Languages C++, Python

Answer 219493139

Length 302 words

modular_arithmeticcombinatoricssequence

In the context of formal languages, any finite sequence of letters of a given alphabet $\Sigma$ is called a word over $\Sigma$. We call a word incomplete if it does not contain every letter of $\Sigma$.

For example, using the alphabet $\Sigma=\{ a, b, c\}$, '$ab$', '$abab$' and '$\,$' (the empty word) are incomplete words over $\Sigma$, while '$abac$' is a complete word over $\Sigma$.

Given an alphabet $\Sigma$ of $\alpha$ letters, we define $I(\alpha,n)$ to be the number of incomplete words over $\Sigma$ with a length not exceeding $n$.

For example, $I(3,0)=1$, $I(3,2)=13$ and $I(3,4)=79$.

Find $I(10^7,10^{12})$. Give your answer modulo $1\,000\,000\,007$.

Problem 657: Incomplete Words

Mathematical Analysis

Counting Complete vs. Incomplete Words

The total number of words of length exactly $k$ over an alphabet of $\alpha$ letters is $\alpha^k$ .

The total number of words of length $\leq n$ is:

\sum_{k=0}^{n} \alpha^k = \frac{\alpha^{n+1} - 1}{\alpha - 1}

Inclusion-Exclusion for Complete Words

A word of length $k$ is complete if it contains all $\alpha$ letters. By inclusion-exclusion:

C(\alpha, k) = \sum_{j=0}^{\alpha} (-1)^j \binom{\alpha}{j} (\alpha - j)^k

This is related to Stirling numbers of the second kind:

C(\alpha, k) = \alpha! \cdot S(k, \alpha)

Number of Incomplete Words

I(\alpha, n) = \sum_{k=0}^{n} \alpha^k - \sum_{k=0}^{n} C(\alpha, k)

= \frac{\alpha^{n+1} - 1}{\alpha - 1} - \sum_{k=0}^{n} \sum_{j=0}^{\alpha} (-1)^j \binom{\alpha}{j} (\alpha - j)^k

Swapping the sums:

I(\alpha, n) = \frac{\alpha^{n+1} - 1}{\alpha - 1} - \sum_{j=0}^{\alpha} (-1)^j \binom{\alpha}{j} \frac{(\alpha-j)^{n+1} - 1}{(\alpha-j) - 1}

The $j = 0$ term in the second sum gives $\frac{\alpha^{n+1} - 1}{\alpha - 1}$ , which cancels the first term, leaving:

I(\alpha, n) = -\sum_{j=1}^{\alpha} (-1)^j \binom{\alpha}{j} \frac{(\alpha-j)^{n+1} - 1}{(\alpha-j) - 1}

= \sum_{j=1}^{\alpha} (-1)^{j+1} \binom{\alpha}{j} \frac{(\alpha-j)^{n+1} - 1}{(\alpha-j) - 1}

For $j = \alpha$ , the term $(\alpha - j) = 0$ , and we treat $\frac{0^{n+1} - 1}{0 - 1} = 1$ .

Simplification

Setting $i = \alpha - j$ :

I(\alpha, n) = \sum_{i=0}^{\alpha-1} (-1)^{\alpha - i + 1} \binom{\alpha}{i} \frac{i^{n+1} - 1}{i - 1}

For $i = 1$ : $\frac{1 - 1}{1 - 1}$ is treated as $n + 1$ .

For $i = 0$ : the term contributes $(-1)^{\alpha+1} \cdot 1 = (-1)^{\alpha+1}$ .

Editorial

I(alpha, n) = sum_{j=1}^{alpha} (-1)^{j+1} * C(alpha,j) * geo(alpha-j, n) where geo(i, n) = (i^{n+1} - 1) / (i - 1). We compute the sum using modular arithmetic with $p = 10^9 + 7$ . We then use Fermat’s little theorem for modular inverses. Finally, note: most terms with $i \geq p$ vanish since $i^{n+1} \equiv i^{(n+1) \mod (p-1)} \pmod{p}$ .

Pseudocode

Compute the sum using modular arithmetic with $p = 10^9 + 7$
Use Fermat's little theorem for modular inverses
Use modular exponentiation for $i^{n+1} \mod p$
Compute binomial coefficients $\binom{\alpha}{j}$ iteratively
Note: most terms with $i \geq p$ vanish since $i^{n+1} \equiv i^{(n+1) \mod (p-1)} \pmod{p}$

Correctness

Theorem. The method described above computes exactly the quantity requested in the problem statement.

Proof. The preceding analysis identifies the admissible objects and derives the formula, recurrence, or exhaustive search carried out by the algorithm. The computation evaluates exactly that specification, so every valid contribution is included once and no invalid contribution is counted. Therefore the returned value is the required answer. $\square$

Complexity Analysis

Time: $O(\min(\alpha, p) \cdot \log n)$ for modular exponentiations.
Space: $O(1)$ .

Answer

\boxed{219493139}

C++ project_euler/problem_657/solution.cpp

#include <bits/stdc++.h>
using namespace std;

const long long MOD = 1000000007;

long long power(long long base, long long exp, long long mod) {
    long long result = 1;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

long long mod_inv(long long a, long long mod) {
    return power(a, mod - 2, mod);
}

int main() {
    long long alpha = 10000000;  // 10^7
    long long n = 1000000000000LL; // 10^12

    // I(alpha, n) = sum_{j=1}^{alpha} (-1)^{j+1} * C(alpha, j) * (geometric sum)
    // where geometric sum = ((alpha-j)^{n+1} - 1) / ((alpha-j) - 1)

    long long ans = 0;
    long long binom = 1; // C(alpha, j), computed iteratively

    for (long long j = 1; j <= alpha; j++) {
        // Update binomial coefficient: C(alpha, j) = C(alpha, j-1) * (alpha - j + 1) / j
        binom = binom % MOD * ((alpha - j + 1) % MOD) % MOD;
        binom = binom % MOD * mod_inv(j % MOD, MOD) % MOD;

        long long i = alpha - j; // the base of the geometric series
        long long geo;

        if (i == 0) {
            // (0^{n+1} - 1) / (0 - 1) = 1
            geo = 1;
        } else if (i == 1) {
            // (1^{n+1} - 1) / (1 - 1) -> n + 1
            geo = (n + 1) % MOD;
        } else {
            // (i^{n+1} - 1) / (i - 1)
            long long num = (power(i % MOD, n + 1, MOD) - 1 + MOD) % MOD;
            long long den = (i % MOD - 1 + MOD) % MOD;
            geo = num % MOD * mod_inv(den, MOD) % MOD;
        }

        long long term = binom % MOD * geo % MOD;

        if (j % 2 == 1) {
            // (-1)^{j+1} = +1 for odd j
            ans = (ans + term) % MOD;
        } else {
            // (-1)^{j+1} = -1 for even j
            ans = (ans - term + MOD) % MOD;
        }
    }

    cout << ans << endl;
    return 0;
}

Python project_euler/problem_657/solution.py

"""
Project Euler Problem 657: Incomplete Words

Find I(10^7, 10^12) mod 10^9+7.

I(alpha, n) = sum_{j=1}^{alpha} (-1)^{j+1} * C(alpha,j) * geo(alpha-j, n)
where geo(i, n) = (i^{n+1} - 1) / (i - 1)
"""

MOD = 1000000007

def power(base, exp, mod):
    result = 1
    base %= mod
    while exp > 0:
        if exp & 1:
            result = result * base % mod
        base = base * base % mod
        exp >>= 1
    return result

def mod_inv(a, mod):
    return power(a, mod - 2, mod)

def solve():
    alpha = 10**7
    n = 10**12

    ans = 0
    binom = 1  # C(alpha, j) mod MOD, built iteratively

    for j in range(1, alpha + 1):
        # C(alpha, j) = C(alpha, j-1) * (alpha - j + 1) / j
        binom = binom * ((alpha - j + 1) % MOD) % MOD
        binom = binom * mod_inv(j, MOD) % MOD

        i = alpha - j  # base of geometric series

        if i == 0:
            geo = 1
        elif i == 1:
            geo = (n + 1) % MOD
        else:
            num = (power(i, n + 1, MOD) - 1 + MOD) % MOD
            den = (i % MOD - 1 + MOD) % MOD
            geo = num * mod_inv(den, MOD) % MOD

        term = binom * geo % MOD

        if j % 2 == 1:
            ans = (ans + term) % MOD
        else:
            ans = (ans - term + MOD) % MOD

    print(ans)

if __name__ == "__main__":
    solve()