All Euler problems
Project Euler

Superinteger

An integer s is a superinteger of another integer n if the digits of n form a subsequence of the digits of s. Let p(i) be the i -th prime, c(i) the i -th composite. Define sd(n) = 1 + (n-1) mod 9 (...

Source sync Apr 19, 2026
Problem #0467
Level Level 22
Solved By 527
Languages C++, Python
Answer 775181359
Length 367 words
modular_arithmeticconstructivedynamic_programming

Problem Statement

This archive keeps the full statement, math, and original media on the page.

An integer $s$ is called superinteger of another integer $n$ if the digits of $n$ form a subsequence of the digits of $s$.

For example, $2718281828$ is a superinteger of $18828$, while $314159$ is not a superinteger of $151$.

Let $p(n)$ be the $n$th prime number, and let $c(n)$ be the $n$th composite number. For example, $p(1) = 2$, $p(10) = 29$, $c(1) = 4$ and $c(10) = 18$.

  • $\{p(i) : i \ge 1\} = \{2, 3, 5, 7, 11, 13, 17, 19, 23, 29, \dots\}$

  • $\{c(i) : i \ge 1\} = \{4, 6, 8, 9, 10, 12, 14, 15, 16, 18, \dots\}$

Let $P^D$ be the sequence of the digital roots of $\{p(i)\}$ ($C^D$ is defined similarly for $\{c(i)\}$):

  • $P^D = \{2, 3, 5, 7, 2, 4, 8, 1, 5, 2, \dots\}$

  • $C^D = \{4, 6, 8, 9, 1, 3, 5, 6, 7, 9, \dots\}$

Let $P_n$ be the integer formed by concatenating the first $n$ elements of $P^D$ ($C_n$ is defined similarly for $C^D$).

  • $P_{10} = 2357248152$

  • $C_{10} = 4689135679$

Let $f(n)$ be the smallest positive integer that is a common superinteger of $P_n$ and $C_n$.

For example, $f(10) = 2357246891352679$ and $f(100) \bmod 1\,000\,000\,007 = 771661825$.

Find $f(10\,000) \bmod 1\,000\,000\,007$.

Problem 467: Superinteger

Mathematical Foundation

Theorem 1 (Digital root formula). For n1n \ge 1, sd(n)=1+(n1)mod9\text{sd}(n) = 1 + (n - 1) \bmod 9.

Proof. The digital root satisfies sd(n)n(mod9)\text{sd}(n) \equiv n \pmod{9} with sd(n){1,2,,9}\text{sd}(n) \in \{1, 2, \ldots, 9\}. For n1n \ge 1: if 9n9 \mid n then sd(n)=9=1+(n1)mod9\text{sd}(n) = 9 = 1 + (n-1) \bmod 9 (since (n1)mod9=8(n-1) \bmod 9 = 8); otherwise sd(n)=nmod9=1+(n1)mod9\text{sd}(n) = n \bmod 9 = 1 + (n-1) \bmod 9. \square

Theorem 2 (SCS via LCS). For strings AA and BB over alphabet Σ\Sigma, the length of the Shortest Common Supersequence satisfies:

SCS(A,B)=A+BLCS(A,B)|\text{SCS}(A, B)| = |A| + |B| - |\text{LCS}(A, B)|

where LCS\text{LCS} is the Longest Common Subsequence.

Proof. An SCS of AA and BB is a string SS of minimum length containing both AA and BB as subsequences. Any common subsequence of AA and BB can be “shared” in SS, and an LCS provides the maximum such sharing. Formally, align AA and BB to SS: each position of SS is either from AA only, BB only, or shared. The shared positions form a common subsequence, and to minimize S|S| we maximize the shared portion. Thus S=A+BLCS|S| = |A| + |B| - |\text{LCS}|. \square

Lemma 1 (LCS recurrence). For sequences A=a1amA = a_1 \ldots a_m and B=b1bnB = b_1 \ldots b_n:

L(i,j)={0if i=0 or j=0,L(i1,j1)+1if ai=bj,max(L(i1,j),L(i,j1))otherwise.L(i, j) = \begin{cases} 0 & \text{if } i = 0 \text{ or } j = 0, \\ L(i-1, j-1) + 1 & \text{if } a_i = b_j, \\ \max(L(i-1,j),\, L(i,j-1)) & \text{otherwise.} \end{cases}

The LCS length is L(m,n)L(m, n).

Proof. Standard dynamic programming recurrence. If ai=bja_i = b_j, the optimal LCS either uses this match (gaining 1 over L(i1,j1)L(i-1,j-1)) or does not, but using it is always at least as good. If aibja_i \ne b_j, at least one of aia_i or bjb_j is not in the LCS suffix, giving the max of the two sub-cases. \square

Theorem 3 (SCS construction and modular evaluation). The SCS string is constructed by backtracking through the LCS DP table, interleaving characters from AA and BB while sharing matched characters. The numeric value modulo MM is computed incrementally:

val(10val+dk)modM\text{val} \leftarrow (10 \cdot \text{val} + d_k) \bmod M

for each digit dkd_k of the SCS.

Proof. The backtracking produces the lexicographically smallest SCS (by choosing AA‘s character first when tied). The modular arithmetic preserves the polynomial identity val=kdk10L1kmodM\text{val} = \sum_k d_k \cdot 10^{L-1-k} \bmod M via Horner’s method. \square

Editorial

The key algorithm: lexicographically smallest SCS via forward construction using suffix LCS tables. We generate sequences. We then lCS DP table (O(n^2)). Finally, else.

Pseudocode

Generate sequences
LCS DP table (O(n^2))
else
Backtrack to construct SCS, compute value mod MOD
Build SCS in reverse, then reverse at end
else
Compute numeric value mod MOD
for d in scs

Complexity Analysis

  • Time: O(LloglogL)O(L \log \log L) for sieving primes up to the nn-th prime (LnlnnL \approx n \ln n). O(n2)O(n^2) for the LCS DP table. O(n)O(n) for SCS construction and modular evaluation. Total: O(n2)O(n^2).
  • Space: O(n2)O(n^2) for the LCS table (can be reduced to O(n)O(n) with Hirschberg’s algorithm if only the value is needed, but backtracking requires the full table).

Answer

775181359\boxed{775181359}

Code

Each problem page includes the exact C++ and Python source files from the local archive.

C++ project_euler/problem_467/solution.cpp
/*
 * Project Euler Problem 467: Superinteger
 *
 * Find f(10000) mod 1,000,000,007 where f(n) is the smallest superinteger
 * (shortest common supersequence) of the digital root sequences of the
 * first n primes and first n composites.
 *
 * Algorithm:
 *   1. Generate digital root sequences P_n and C_n
 *   2. Compute suffix LCS table: lcs[i][j] = LCS(A[i:], B[j:])
 *   3. Build lexicographically smallest SCS forward using greedy:
 *      At state (i,j), next char must be A[i] or B[j] (or both if equal).
 *      Pick the option that keeps SCS length optimal, break ties by
 *      choosing the smaller digit.
 *   4. Compute numeric value mod 10^9+7 during construction.
 *
 * Compile: g++ -O2 -o solution solution.cpp
 * Memory: ~400MB for n=10000 (short array 10001x10001)
 */

#include <iostream>
#include <vector>
#include <algorithm>
#include <cstring>

using namespace std;

const long long MOD = 1000000007LL;
const int N = 10000;
const int SIEVE_LIMIT = 200000;

bool is_prime_arr[SIEVE_LIMIT + 1];

int digital_root(int n) {
    if (n == 0) return 0;
    return 1 + (n - 1) % 9;
}

void sieve() {
    memset(is_prime_arr, true, sizeof(is_prime_arr));
    is_prime_arr[0] = is_prime_arr[1] = false;
    for (int i = 2; i * i <= SIEVE_LIMIT; i++) {
        if (is_prime_arr[i]) {
            for (int j = i * i; j <= SIEVE_LIMIT; j += i) {
                is_prime_arr[j] = false;
            }
        }
    }
}

// Suffix LCS table: lcs[i][j] = LCS(A[i:], B[j:])
short lcs_table[N + 1][N + 1];

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    sieve();

    // Generate sequences
    vector<int> A, B; // A = sp (primes), B = sc (composites)
    for (int i = 2; i <= SIEVE_LIMIT && ((int)A.size() < N || (int)B.size() < N); i++) {
        if (is_prime_arr[i]) {
            if ((int)A.size() < N) A.push_back(digital_root(i));
        } else {
            if ((int)B.size() < N) B.push_back(digital_root(i));
        }
    }

    int m = A.size(), n = B.size();
    cout << "Generated sequences: m=" << m << ", n=" << n << endl;
    cout << "P starts: ";
    for (int i = 0; i < 20; i++) cout << A[i];
    cout << endl;
    cout << "C starts: ";
    for (int i = 0; i < 20; i++) cout << B[i];
    cout << endl;

    // Compute suffix LCS
    cout << "Computing suffix LCS table..." << endl;
    for (int i = 0; i <= m; i++) lcs_table[i][n] = 0;
    for (int j = 0; j <= n; j++) lcs_table[m][j] = 0;

    for (int i = m - 1; i >= 0; i--) {
        for (int j = n - 1; j >= 0; j--) {
            if (A[i] == B[j]) {
                lcs_table[i][j] = lcs_table[i+1][j+1] + 1;
            } else {
                lcs_table[i][j] = max(lcs_table[i+1][j], lcs_table[i][j+1]);
            }
        }
    }

    int lcs_len = lcs_table[0][0];
    int scs_len = m + n - lcs_len;
    cout << "LCS length: " << lcs_len << endl;
    cout << "SCS length: " << scs_len << endl;

    // Build lexicographically smallest SCS forward
    cout << "Building SCS..." << endl;
    long long val = 0;
    int ci = 0, cj = 0;

    auto scs_rem = [&](int i, int j) -> int {
        return (m - i) + (n - j) - lcs_table[i][j];
    };

    while (ci < m || cj < n) {
        int d;
        if (ci == m) {
            d = B[cj]; cj++;
        } else if (cj == n) {
            d = A[ci]; ci++;
        } else if (A[ci] == B[cj]) {
            d = A[ci]; ci++; cj++;
        } else {
            int cur = scs_rem(ci, cj);
            bool opt_a = (1 + scs_rem(ci + 1, cj) == cur);
            bool opt_b = (1 + scs_rem(ci, cj + 1) == cur);

            if (opt_a && opt_b) {
                if (A[ci] <= B[cj]) {
                    d = A[ci]; ci++;
                } else {
                    d = B[cj]; cj++;
                }
            } else if (opt_a) {
                d = A[ci]; ci++;
            } else {
                d = B[cj]; cj++;
            }
        }
        val = (val * 10 + d) % MOD;
    }

    cout << endl;
    cout << "f(" << N << ") mod " << MOD << " = " << val << endl;

    // Verify small case n=10
    {
        cout << "\n=== Verification n=10 ===" << endl;
        vector<int> a10(A.begin(), A.begin() + 10);
        vector<int> b10(B.begin(), B.begin() + 10);

        short lcs10[11][11];
        memset(lcs10, 0, sizeof(lcs10));
        for (int i = 9; i >= 0; i--)
            for (int j = 9; j >= 0; j--) {
                if (a10[i] == b10[j])
                    lcs10[i][j] = lcs10[i+1][j+1] + 1;
                else
                    lcs10[i][j] = max(lcs10[i+1][j], lcs10[i][j+1]);
            }

        cout << "LCS(10) = " << lcs10[0][0] << endl;
        cout << "SCS(10) = " << 20 - lcs10[0][0] << endl;

        auto scs_rem10 = [&](int i, int j) -> int {
            return (10 - i) + (10 - j) - lcs10[i][j];
        };

        string scs_str;
        int ii = 0, jj = 0;
        while (ii < 10 || jj < 10) {
            int dd;
            if (ii == 10) { dd = b10[jj]; jj++; }
            else if (jj == 10) { dd = a10[ii]; ii++; }
            else if (a10[ii] == b10[jj]) { dd = a10[ii]; ii++; jj++; }
            else {
                int cur = scs_rem10(ii, jj);
                bool oa = (1 + scs_rem10(ii+1, jj) == cur);
                bool ob = (1 + scs_rem10(ii, jj+1) == cur);
                if (oa && ob) {
                    if (a10[ii] <= b10[jj]) { dd = a10[ii]; ii++; }
                    else { dd = b10[jj]; jj++; }
                } else if (oa) { dd = a10[ii]; ii++; }
                else { dd = b10[jj]; jj++; }
            }
            scs_str += ('0' + dd);
        }
        cout << "f(10) = " << scs_str << endl;
        cout << "Expected: 2357246891352679" << endl;

        long long v = 0;
        for (char c : scs_str) v = (v * 10 + (c - '0')) % MOD;
        cout << "f(10) mod " << MOD << " = " << v << endl;
    }

    return 0;
}