All Euler problems
Project Euler

Investigating Ulam Sequences

For 2 <= n <= 10, find U(2, 2n+1)(10^11) -- the (10^11) -th term of the Ulam sequence U(2, 2n+1) -- and compute the sum of these values. An Ulam sequence U(a, b) with a < b starts with a, b and the...

Source sync Apr 19, 2026
Problem #0167
Level Level 10
Solved By 2,041
Languages C++, Python
Answer 3916160068885
Length 531 words
modular_arithmeticsequencelinear_algebra

Problem Statement

This archive keeps the full statement, math, and original media on the page.

For two positive integers \(a\) and \(b\), the Ulam sequence \(U(a,b)\) is defined by \(U(a,b)_1 = a\), \(U(a,b)_2 = b\) and for \(k > 2\), \(U(a,b)_k\) is the smallest integer greater than \(U(a,b)_{k - 1}\) which can be written in exactly one way as the sum of two distinct previous members of \(U(a,b)\).

For example, the sequence \(U(1,2)\) begins with

\(1\), \(2\), \(3 = 1 + 2\), \(4 = 1 + 3\), \(6 = 2 + 4\), \(8 = 2 + 6\), \(11 = 3 + 8\);

\(5\) does not belong to it because \(5 = 1 + 4 = 2 + 3\) has two representations as the sum of two previous members, likewise \(7 = 1 + 6 = 3 + 4\).

Find \(\displaystyle \sum \limits _{n = 2}^{10} U(2,2n+1)_k\), where \(k = 10^{11}\).

Problem 167: Investigating Ulam Sequences

Mathematical Foundation

Definition. The Ulam sequence U(a,b)=(u1,u2,u3,)U(a, b) = (u_1, u_2, u_3, \ldots) is defined by u1=au_1 = a, u2=bu_2 = b, and for k3k \geq 3, uku_k is the smallest integer m>uk1m > u_{k-1} such that {(i,j):i<j,ui+uj=m}=1|\{(i, j) : i < j, \, u_i + u_j = m\}| = 1.

Theorem 1 (Even members of U(2,v)U(2, v) for odd v5v \geq 5). The sequence U(2,v)U(2, v) for odd v5v \geq 5 contains exactly two even members: 22 and v+1v + 1. All subsequent members are odd.

Proof. The first member is u1=2u_1 = 2 (even), and u2=vu_2 = v (odd). The third term is u3=v+2u_3 = v + 2 (odd, the unique representation is 2+v2 + v). We claim e=v+1e = v + 1 is the only other even term that eventually appears (as the sum of two odd terms already in the sequence — but we need exactly one representation).

For any odd candidate c>max{u1,u2,}c > \max\{u_1, u_2, \ldots\}: the representations c=ui+ujc = u_i + u_j with i<ji < j can only involve at most one even term (since odd + even = odd, even + even = even c\neq c). The even terms are 2 and ee. So the possible representations involving even terms are (2,c2)(2, c-2) and (e,ce)(e, c-e). Two odd terms sum to an even number, so they cannot produce odd cc. Therefore, the number of representations of odd cc is:

r(c)=[c2U]+[ceU]r(c) = [c - 2 \in U] + [c - e \in U]

where [][\cdot] is the Iverson bracket. This means cc is Ulam iff exactly one of c2c-2 or cec-e is in the sequence — an XOR condition. \square

Theorem 2 (Eventual periodicity of differences). For U(2,2n+1)U(2, 2n+1) with 2n102 \leq n \leq 10, the difference sequence dk=uk+1ukd_k = u_{k+1} - u_k is eventually periodic with period PP and period sum D=i=1PdT+iD = \sum_{i=1}^{P} d_{T+i} for some transient length TT.

Proof. By Theorem 1, after the transient, membership in UU is determined by the XOR rule: cU    [c2U][ceU]c \in U \iff [c-2 \in U] \oplus [c-e \in U]. This is a linear recurrence over F2\mathbb{F}_2 on the characteristic function 1U\mathbf{1}_U restricted to odd integers. The state is determined by the membership values at positions c2c-2 and cec-e, which depends on a window of width e2=v1e - 2 = v - 1. Since there are 2v12^{v-1} possible windows and the recurrence is deterministic, the sequence of membership bits is eventually periodic. The period of the membership bits directly implies eventual periodicity of the difference sequence. \square

Lemma 1 (Known periods). The periods PP for U(2,2n+1)U(2, 2n+1) are:

nnv=2n+1v = 2n+1Period PP
2532
3726
49444
5111,628
6135,906
71580
817126,960
919380,882
10212,097,152

These are verified computationally by generating sufficient terms and checking repetition of the difference sequence.

Theorem 3 (Extrapolation formula). Once the transient TT, period PP, and period sum DD are identified, the kk-th term for k>Tk > T is:

uk=uT+r+qDu_k = u_{T + r} + q \cdot D

where kT=qP+rk - T = qP + r with 0r<P0 \leq r < P.

Proof. By periodicity of differences: uT+qP+r=uT+r+qi=1PdT+i=uT+r+qDu_{T + qP + r} = u_{T+r} + q \cdot \sum_{i=1}^{P} d_{T+i} = u_{T+r} + qD. \square

Editorial

For U(2, 2n+1) with n=2..10, find the 10^11-th term and sum them. We generate terms using XOR rule until periodicity detected. We then generate using XOR: for odd c, c in U iff exactly one of. Finally, (c-2 in U), (c-e in U) is true.

Pseudocode

Generate terms using XOR rule until periodicity detected
Generate using XOR: for odd c, c in U iff exactly one of
(c-2 in U), (c-e in U) is true
Also handle even member e
Detect period in difference sequence
Extrapolate to k = 10^11

Complexity Analysis

  • Time: O(n=210(Tn+3Pn))O\bigl(\sum_{n=2}^{10} (T_n + 3P_n)\bigr) for generation and period verification. The largest period is P10=2,097,152P_{10} = 2{,}097{,}152 for U(2,21)U(2, 21), so total generation is 6×106\sim 6 \times 10^6 terms. Each term requires O(1)O(1) via the XOR rule.
  • Space: O(maxn(Tn+2Pn))O(\max_n (T_n + 2P_n)) for the membership bit array and difference sequence.

Answer

3916160068885\boxed{3916160068885}

Code

Each problem page includes the exact C++ and Python source files from the local archive.

C++ project_euler/problem_167/solution.cpp
#include <bits/stdc++.h>
using namespace std;

/*
 * Problem 167: Investigating Ulam Sequences
 *
 * For U(2, 2n+1) with n=2..10, find the 10^11-th term and sum them.
 * Answer: 3916160068885
 *
 * Key insight: U(2,v) for odd v>=5 has exactly 2 even members: 2 and some e.
 * After that, all members are odd. Odd c is Ulam iff is_ulam[c-2] XOR is_ulam[c-e].
 * This gives O(1) generation per term.
 *
 * The difference sequence is eventually periodic. For small periods we detect
 * them via non-2 diffs. For large periods (U(2,21) with period 2^21), we use
 * the known period from OEIS A100729 and verify it.
 */

const int MAXVAL = 60000000;

// Known periods from OEIS A100729 for U(2, 2n+1), n=2..10
int known_periods[] = {0, 0, 32, 26, 444, 1628, 5906, 80, 126960, 380882, 2097152};

long long solve_ulam(int a, int b, long long k, int nn) {
    // Phase 1: Find second even member
    int SMALL = 100000;
    vector<short> rc(SMALL + 1, 0);
    vector<int> isq = {a, b};
    vector<bool> isu(SMALL + 1, false);
    isu[a] = isu[b] = true;
    if (a + b <= SMALL) rc[a + b]++;

    int second_even = -1;
    int nc2 = b + 1;
    while (nc2 <= SMALL) {
        while (nc2 <= SMALL && rc[nc2] != 1) nc2++;
        if (nc2 > SMALL) break;
        int u = nc2;
        isq.push_back(u);
        isu[u] = true;
        if (u % 2 == 0 && u != a) second_even = u;
        for (int i = 0; i < (int)isq.size() - 1; i++) {
            int s = isq[i] + u;
            if (s <= SMALL && rc[s] < 3) rc[s]++;
        }
        nc2 = u + 1;
        if (second_even > 0) break;
    }

    fprintf(stderr, "U(%d,%d): second_even=%d\n", a, b, second_even);

    // Phase 2: Fast generation using XOR rule
    int kp = known_periods[nn];
    int needed_terms = 6 * kp + 10000; // enough to detect and verify
    if (needed_terms > 14000000) needed_terms = 14000000;
    if (needed_terms < 100000) needed_terms = 100000;

    vector<bool> is_ulam(MAXVAL + 2, false);
    vector<int> seq;
    seq.reserve(needed_terms + 100);
    for (int x : isq) {
        is_ulam[x] = true;
        seq.push_back(x);
    }

    int c = isq.back();
    if (c % 2 == 0) c++;
    else c += 2;

    while (c <= MAXVAL && (int)seq.size() < needed_terms) {
        bool w1 = (c - 2 >= 0 && c - 2 <= MAXVAL && is_ulam[c - 2]);
        bool w2 = (c - second_even >= 0 && c - second_even <= MAXVAL && is_ulam[c - second_even]);
        if (w1 != w2) {
            seq.push_back(c);
            is_ulam[c] = true;
        }
        c += 2;
    }

    int n = seq.size();
    fprintf(stderr, "  generated %d terms, last=%d\n", n, seq[n-1]);

    // Compute diffs
    int nd = n - 1;
    vector<int> diffs(nd);
    for (int i = 0; i < nd; i++) diffs[i] = seq[i+1] - seq[i];

    // Try to detect period using non-2 diffs first (fast for small periods)
    int full_period = -1;

    if (kp <= 500000) {
        // Use non-2 diff approach
        vector<int> non2_val, non2_pos;
        for (int i = 0; i < nd; i++) {
            if (diffs[i] != 2) {
                non2_val.push_back(diffs[i]);
                non2_pos.push_back(i);
            }
        }
        int nn2 = non2_val.size();

        for (int p = 1; p <= nn2 / 6; p++) {
            int b2 = nn2 - p;
            bool ok = true;
            for (int rep = 1; rep <= 4 && ok; rep++) {
                if (b2 - rep * p < 0) { ok = false; break; }
                for (int j = 0; j < p && ok; j++)
                    if (non2_val[b2 + j] != non2_val[b2 - rep * p + j]) ok = false;
            }
            if (ok) {
                int prev_b = b2 - p;
                int fp = non2_pos[b2] - non2_pos[prev_b];

                // Verify
                int ei2 = nd - 1, fb2 = ei2 - fp + 1;
                if (fb2 - 2 * fp >= 0) {
                    bool v = true;
                    for (int rep = 1; rep <= 2 && v; rep++)
                        for (int j = 0; j < fp && v; j++)
                            if (diffs[fb2+j] != diffs[fb2-rep*fp+j]) v = false;
                    if (v) { full_period = fp; break; }
                }
            }
        }
    }

    // If not found, try the known period directly
    if (full_period < 0) {
        int fp = kp;
        int ei2 = nd - 1, fb2 = ei2 - fp + 1;
        if (fb2 - 4 * fp >= 0) {
            bool v = true;
            for (int rep = 1; rep <= 4 && v; rep++)
                for (int j = 0; j < fp && v; j++)
                    if (diffs[fb2+j] != diffs[fb2-rep*fp+j]) v = false;
            if (v) full_period = fp;
        }
    }

    if (full_period < 0) {
        fprintf(stderr, "  FAILED to find/verify period\n");
        return -1;
    }

    // Find start of periodicity
    int ei = nd - 1, fb = ei - full_period + 1;
    int pstart = fb;
    while (pstart >= full_period) {
        bool m = true;
        for (int j = 0; j < full_period && m; j++)
            if (diffs[pstart-full_period+j] != diffs[pstart+j]) m = false;
        if (m) pstart -= full_period;
        else break;
    }

    long long psum = 0;
    for (int j = 0; j < full_period; j++) psum += diffs[pstart + j];

    // Verify extrapolation
    {
        long long tidx = n - 1;
        long long off = tidx - pstart;
        long long q2 = off / full_period;
        long long r = off % full_period;
        long long pred = (long long)seq[pstart+(int)r] + q2 * psum;
        if (pred != seq[n-1]) {
            fprintf(stderr, "  extrapolation FAILED\n");
            return -1;
        }
    }

    fprintf(stderr, "  period=%d, start=%d, psum=%lld\n", full_period, pstart, psum);

    long long idx = k - 1;
    if (idx < n) return seq[idx];
    long long offset = idx - pstart;
    long long q2 = offset / full_period;
    long long r = offset % full_period;
    return (long long)seq[pstart+(int)r] + q2 * psum;
}

int main(){
    long long K = 100000000000LL;
    long long total = 0;
    for (int nn = 2; nn <= 10; nn++) {
        int v = 2 * nn + 1;
        long long val = solve_ulam(2, v, K, nn);
        if (val < 0) {
            fprintf(stderr, "Failed for U(2,%d)\n", v);
            return 1;
        }
        fprintf(stderr, "  U(2,%d)(10^11) = %lld\n\n", v, val);
        total += val;
    }
    cout << total << endl;
    return 0;
}