All Euler problems
Project Euler

Incomplete Words II

Given an alphabet Sigma of alpha letters, I(alpha, n) is defined as the number of incomplete words over Sigma with length not exceeding n. A word is incomplete if it does not contain every letter o...

Source sync Apr 19, 2026
Problem #0658
Level Level 31
Solved By 279
Languages C++, Python
Answer 958280177
Length 183 words
modular_arithmeticcombinatoricsbrute_force

Problem Statement

This archive keeps the full statement, math, and original media on the page.

In the context of formal languages, any finite sequence of letters of a given alphabet \(\Sigma \) is called a word over \(\Sigma \). We call a word incomplete if it does not contain every letter of \(\Sigma \).

For example, using the alphabet \(\Sigma =\{ a, b, c\}\), ’\(ab\)’, ’\(abab\)’ and ’\(\,\)’ (the empty word) are incomplete words over \(\Sigma \), while ’\(abac\)’ is a complete word over \(\Sigma \).

Given an alphabet \(\Sigma \) of \(\alpha \) letters, we define \(I(\alpha ,n)\) to be the number of incomplete words over \(\Sigma \) with a length not exceeding \(n\).

For example, \(I(3,0)=1\), \(I(3,2)=13\) and \(I(3,4)=79\).

Let \(\displaystyle S(k,n)=\sum _{\alpha =1}^k I(\alpha ,n)\).

For example, \(S(4,4)=406\), \(S(8,8)=27902680\) and \(S (10,100) \equiv 983602076 \bmod 1\,000\,000\,007\).

Find \(S(10^7,10^{12})\). Give your answer modulo \(1\,000\,000\,007\).

Problem 658: Incomplete Words II

Mathematical Analysis

From Problem 657

We established that:

I(α,n)=j=1α(1)j+1(αj)G(αj,n)I(\alpha, n) = \sum_{j=1}^{\alpha} (-1)^{j+1} \binom{\alpha}{j} \cdot G(\alpha - j, n)

where G(i,n)=in+11i1G(i, n) = \frac{i^{n+1} - 1}{i - 1} is the geometric sum.

Summing Over Alphabets

S(k,n)=α=1kI(α,n)=α=1kj=1α(1)j+1(αj)G(αj,n)S(k, n) = \sum_{\alpha=1}^{k} I(\alpha, n) = \sum_{\alpha=1}^{k} \sum_{j=1}^{\alpha} (-1)^{j+1} \binom{\alpha}{j} G(\alpha - j, n)

Substituting i=αji = \alpha - j:

S(k,n)=α=1ki=0α1(1)αi+1(αi)G(i,n)S(k, n) = \sum_{\alpha=1}^{k} \sum_{i=0}^{\alpha-1} (-1)^{\alpha-i+1} \binom{\alpha}{i} G(i, n)

Exchanging Summation Order

S(k,n)=i=0k1G(i,n)α=i+1k(1)αi+1(αi)S(k, n) = \sum_{i=0}^{k-1} G(i, n) \sum_{\alpha=i+1}^{k} (-1)^{\alpha-i+1} \binom{\alpha}{i}

The inner sum can be simplified using combinatorial identities.

Key Identity

By the identity α=i+1k(1)αi+1(αi)=(1)ki+1(ki+1)\sum_{\alpha=i+1}^{k} (-1)^{\alpha-i+1} \binom{\alpha}{i} = (-1)^{k-i+1} \binom{k}{i+1} (from alternating sum of binomial coefficients), we get:

S(k,n)=i=0k1(1)ki+1(ki+1)G(i,n)S(k, n) = \sum_{i=0}^{k-1} (-1)^{k-i+1} \binom{k}{i+1} G(i, n)

Setting m=i+1m = i + 1:

S(k,n)=m=1k(1)km+2(km)G(m1,n)S(k, n) = \sum_{m=1}^{k} (-1)^{k-m+2} \binom{k}{m} G(m-1, n)

Editorial

Using the derived formula: S(k,n) = sum_{m=1}^{k} (-1)^{k-m+2} * C(k,m) * G(m-1, n). We compute binomial coefficients iteratively. We then compute G(m1,n)G(m-1, n) using modular exponentiation. Finally, accumulate with alternating signs.

Pseudocode

Iterate $m$ from 1 to $k = 10^7$
Compute binomial coefficients iteratively
Compute $G(m-1, n)$ using modular exponentiation
Accumulate with alternating signs

Correctness

Theorem. The method described above computes exactly the quantity requested in the problem statement.

Proof. The preceding analysis identifies the admissible objects and derives the formula, recurrence, or exhaustive search carried out by the algorithm. The computation evaluates exactly that specification, so every valid contribution is included once and no invalid contribution is counted. Therefore the returned value is the required answer. \square

Complexity Analysis

  • Time: O(klogn)O(k \log n) for kk modular exponentiations.
  • Space: O(1)O(1).

Answer

958280177\boxed{958280177}

Code

Each problem page includes the exact C++ and Python source files from the local archive.

C++ project_euler/problem_658/solution.cpp
#include <bits/stdc++.h>
using namespace std;

const long long MOD = 1000000007;

long long power(long long base, long long exp, long long mod) {
    long long result = 1;
    base %= mod;
    if (base < 0) base += mod;
    while (exp > 0) {
        if (exp & 1) result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

long long mod_inv(long long a, long long mod) {
    return power(a % mod + mod, mod - 2, mod);
}

int main() {
    long long k = 10000000;   // 10^7
    long long n = 1000000000000LL; // 10^12

    // S(k,n) = sum_{m=1}^{k} (-1)^{k-m+2} * C(k,m) * G(m-1, n)
    // G(i, n) = (i^{n+1} - 1) / (i - 1)

    long long ans = 0;
    long long binom = 1; // C(k, m) iteratively

    for (long long m = 1; m <= k; m++) {
        // C(k, m) = C(k, m-1) * (k - m + 1) / m
        binom = binom % MOD * ((k - m + 1) % MOD) % MOD;
        binom = binom * mod_inv(m, MOD) % MOD;

        long long i = m - 1; // base for geometric sum

        long long geo;
        if (i == 0) {
            geo = 1; // (0^{n+1} - 1)/(0 - 1) = 1
        } else if (i == 1) {
            geo = (n + 1) % MOD; // limit as i -> 1
        } else {
            long long num = (power(i, n + 1, MOD) - 1 + MOD) % MOD;
            long long den = (i - 1) % MOD;
            geo = num * mod_inv(den, MOD) % MOD;
        }

        long long term = binom * geo % MOD;

        // Sign: (-1)^{k - m + 2}
        long long sign_exp = k - m + 2;
        if (sign_exp % 2 == 0) {
            ans = (ans + term) % MOD;
        } else {
            ans = (ans - term + MOD) % MOD;
        }
    }

    cout << ans << endl;
    return 0;
}