All Euler problems
Project Euler

Incomplete Words

In the theory of formal languages, a word is any finite sequence of letters from a given alphabet, and a word is called incomplete if it does not contain every letter of that alphabet. For example,...

Source sync Apr 19, 2026
Problem #0657
Level Level 19
Solved By 674
Languages C++, Python
Answer 219493139
Length 302 words
modular_arithmeticcombinatoricssequence

Problem Statement

This archive keeps the full statement, math, and original media on the page.

In the context of formal languages, any finite sequence of letters of a given alphabet $\Sigma$ is called a word over $\Sigma$. We call a word incomplete if it does not contain every letter of $\Sigma$.

For example, using the alphabet $\Sigma=\{ a, b, c\}$, '$ab$', '$abab$' and '$\,$' (the empty word) are incomplete words over $\Sigma$, while '$abac$' is a complete word over $\Sigma$.

Given an alphabet $\Sigma$ of $\alpha$ letters, we define $I(\alpha,n)$ to be the number of incomplete words over $\Sigma$ with a length not exceeding $n$.

For example, $I(3,0)=1$, $I(3,2)=13$ and $I(3,4)=79$.

Find $I(10^7,10^{12})$. Give your answer modulo $1\,000\,000\,007$.

Problem 657: Incomplete Words

Mathematical Analysis

Counting Complete vs. Incomplete Words

The total number of words of length exactly kk over an alphabet of α\alpha letters is αk\alpha^k.

The total number of words of length n\leq n is:

k=0nαk=αn+11α1\sum_{k=0}^{n} \alpha^k = \frac{\alpha^{n+1} - 1}{\alpha - 1}

Inclusion-Exclusion for Complete Words

A word of length kk is complete if it contains all α\alpha letters. By inclusion-exclusion:

C(α,k)=j=0α(1)j(αj)(αj)kC(\alpha, k) = \sum_{j=0}^{\alpha} (-1)^j \binom{\alpha}{j} (\alpha - j)^k

This is related to Stirling numbers of the second kind:

C(α,k)=α!S(k,α)C(\alpha, k) = \alpha! \cdot S(k, \alpha)

Number of Incomplete Words

I(α,n)=k=0nαkk=0nC(α,k)I(\alpha, n) = \sum_{k=0}^{n} \alpha^k - \sum_{k=0}^{n} C(\alpha, k) =αn+11α1k=0nj=0α(1)j(αj)(αj)k= \frac{\alpha^{n+1} - 1}{\alpha - 1} - \sum_{k=0}^{n} \sum_{j=0}^{\alpha} (-1)^j \binom{\alpha}{j} (\alpha - j)^k

Swapping the sums:

I(α,n)=αn+11α1j=0α(1)j(αj)(αj)n+11(αj)1I(\alpha, n) = \frac{\alpha^{n+1} - 1}{\alpha - 1} - \sum_{j=0}^{\alpha} (-1)^j \binom{\alpha}{j} \frac{(\alpha-j)^{n+1} - 1}{(\alpha-j) - 1}

The j=0j = 0 term in the second sum gives αn+11α1\frac{\alpha^{n+1} - 1}{\alpha - 1}, which cancels the first term, leaving:

I(α,n)=j=1α(1)j(αj)(αj)n+11(αj)1I(\alpha, n) = -\sum_{j=1}^{\alpha} (-1)^j \binom{\alpha}{j} \frac{(\alpha-j)^{n+1} - 1}{(\alpha-j) - 1} =j=1α(1)j+1(αj)(αj)n+11(αj)1= \sum_{j=1}^{\alpha} (-1)^{j+1} \binom{\alpha}{j} \frac{(\alpha-j)^{n+1} - 1}{(\alpha-j) - 1}

For j=αj = \alpha, the term (αj)=0(\alpha - j) = 0, and we treat 0n+1101=1\frac{0^{n+1} - 1}{0 - 1} = 1.

Simplification

Setting i=αji = \alpha - j:

I(α,n)=i=0α1(1)αi+1(αi)in+11i1I(\alpha, n) = \sum_{i=0}^{\alpha-1} (-1)^{\alpha - i + 1} \binom{\alpha}{i} \frac{i^{n+1} - 1}{i - 1}

For i=1i = 1: 1111\frac{1 - 1}{1 - 1} is treated as n+1n + 1.

For i=0i = 0: the term contributes (1)α+11=(1)α+1(-1)^{\alpha+1} \cdot 1 = (-1)^{\alpha+1}.

Editorial

I(alpha, n) = sum_{j=1}^{alpha} (-1)^{j+1} * C(alpha,j) * geo(alpha-j, n) where geo(i, n) = (i^{n+1} - 1) / (i - 1). We compute the sum using modular arithmetic with p=109+7p = 10^9 + 7. We then use Fermat’s little theorem for modular inverses. Finally, note: most terms with ipi \geq p vanish since in+1i(n+1)mod(p1)(modp)i^{n+1} \equiv i^{(n+1) \mod (p-1)} \pmod{p}.

Pseudocode

Compute the sum using modular arithmetic with $p = 10^9 + 7$
Use Fermat's little theorem for modular inverses
Use modular exponentiation for $i^{n+1} \mod p$
Compute binomial coefficients $\binom{\alpha}{j}$ iteratively
Note: most terms with $i \geq p$ vanish since $i^{n+1} \equiv i^{(n+1) \mod (p-1)} \pmod{p}$

Correctness

Theorem. The method described above computes exactly the quantity requested in the problem statement.

Proof. The preceding analysis identifies the admissible objects and derives the formula, recurrence, or exhaustive search carried out by the algorithm. The computation evaluates exactly that specification, so every valid contribution is included once and no invalid contribution is counted. Therefore the returned value is the required answer. \square

Complexity Analysis

  • Time: O(min(α,p)logn)O(\min(\alpha, p) \cdot \log n) for modular exponentiations.
  • Space: O(1)O(1).

Answer

219493139\boxed{219493139}

Code

Each problem page includes the exact C++ and Python source files from the local archive.

C++ project_euler/problem_657/solution.cpp
#include <bits/stdc++.h>
using namespace std;

const long long MOD = 1000000007;

long long power(long long base, long long exp, long long mod) {
    long long result = 1;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

long long mod_inv(long long a, long long mod) {
    return power(a, mod - 2, mod);
}

int main() {
    long long alpha = 10000000;  // 10^7
    long long n = 1000000000000LL; // 10^12

    // I(alpha, n) = sum_{j=1}^{alpha} (-1)^{j+1} * C(alpha, j) * (geometric sum)
    // where geometric sum = ((alpha-j)^{n+1} - 1) / ((alpha-j) - 1)

    long long ans = 0;
    long long binom = 1; // C(alpha, j), computed iteratively

    for (long long j = 1; j <= alpha; j++) {
        // Update binomial coefficient: C(alpha, j) = C(alpha, j-1) * (alpha - j + 1) / j
        binom = binom % MOD * ((alpha - j + 1) % MOD) % MOD;
        binom = binom % MOD * mod_inv(j % MOD, MOD) % MOD;

        long long i = alpha - j; // the base of the geometric series
        long long geo;

        if (i == 0) {
            // (0^{n+1} - 1) / (0 - 1) = 1
            geo = 1;
        } else if (i == 1) {
            // (1^{n+1} - 1) / (1 - 1) -> n + 1
            geo = (n + 1) % MOD;
        } else {
            // (i^{n+1} - 1) / (i - 1)
            long long num = (power(i % MOD, n + 1, MOD) - 1 + MOD) % MOD;
            long long den = (i % MOD - 1 + MOD) % MOD;
            geo = num % MOD * mod_inv(den, MOD) % MOD;
        }

        long long term = binom % MOD * geo % MOD;

        if (j % 2 == 1) {
            // (-1)^{j+1} = +1 for odd j
            ans = (ans + term) % MOD;
        } else {
            // (-1)^{j+1} = -1 for even j
            ans = (ans - term + MOD) % MOD;
        }
    }

    cout << ans << endl;
    return 0;
}