Project Euler

Longest Collatz Sequence

Define the Collatz map C: Z^+ -> Z^+ by C(n) = n/2 & if 2 | n, 3n + 1 & if 2 nmid n. The Collatz sequence from n_0 is (n_0, C(n_0), C^2(n_0),...). The chain length L(n) is the number of terms in th...

Source sync Apr 19, 2026

Problem #0014

Level Level 00

Solved By 245,912

Languages C++, Python

Answer 837799

Length 552 words

sequencemodular_arithmeticbrute_force

The following iterative sequence is defined for the set of positive integers: $$\begin{cases} n \to n/2 (n \text{ is even}) \\ n \to 3n + 1 (n \text{ is odd}) \end{cases}$$ Using the rule above and starting with $13$, we generate the following sequence: $$13 \to 40 \to 20 \to 10 \to 5 \to 16 \to 8 \to 4 \to 2 \to 1.$$ It can be seen that this sequence (starting at $13$ and finishing at $1$) contains $10$ terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at $1$.

Which starting number, under one million, produces the longest chain?

NOTE: Once the chain starts the terms are allowed to go above one million.

Problem 14: Longest Collatz Sequence

Mathematical Development

Definitions

Definition 1. The Collatz orbit of $n$ is the sequence $(C^k(n))_{k \ge 0}$ . The stopping time (or chain length) is $L(n) = \min\{k \ge 0 : C^k(n) = 1\} + 1$ .

Definition 2. For $n > 1$ , define $C^{(k)}(n) = C(C^{(k-1)}(n))$ with $C^{(0)}(n) = n$ .

Theorems

Theorem 1 (Recursive structure). $L(1) = 1$ , and for $n > 1$ :

L(n) = 1 + L(C(n)).

Proof. The Collatz sequence from $n$ is $(n, C(n), C^2(n), \ldots, 1)$ . Removing the first term yields the sequence from $C(n)$ , which has $L(C(n))$ terms. Hence $L(n) = 1 + L(C(n))$ . $\square$

Lemma 1 (Odd-step compression). For odd $n > 1$ , $3n + 1$ is even, and

C(C(n)) = \frac{3n+1}{2}.

Consequently, $L(n) = 2 + L\!\left(\frac{3n+1}{2}\right)$ for odd $n > 1$ .

Proof. If $n$ is odd, then $C(n) = 3n + 1$ . Since $3n$ is odd and $1$ is odd, $3n + 1$ is even, so $C(3n+1) = \frac{3n+1}{2}$ . Two applications of $C$ yield $\frac{3n+1}{2}$ , and by Theorem 1 applied twice: $L(n) = 1 + L(3n+1) = 1 + 1 + L\!\left(\frac{3n+1}{2}\right) = 2 + L\!\left(\frac{3n+1}{2}\right)$ . $\square$

Theorem 2 (Termination). The Collatz conjecture asserts $L(n) < \infty$ for all $n \ge 1$ . While unproven in general, it has been verified computationally for all $n < 2^{68}$ (far exceeding $10^6$ ).

Proof. By computational verification (Oliveira e Silva, 2009, extended by subsequent authors). $\square$

Theorem 3 (Correctness of memoized search). Let $\text{cache}[n]$ store $L(n)$ for $1 \le n \le N$ . The following procedure correctly computes $L(n)$ for all $n \in [1, N]$ : starting from $n$ , follow the Collatz map until reaching some $m \le N$ with $\text{cache}[m] > 0$ ; then $L(n) = k + \text{cache}[m]$ , where $k$ is the number of steps taken.

Proof. By Theorem 1, $L(n) = k + L(m)$ where $m = C^{(k)}(n)$ is the first iterate with a known chain length. Since $L$ is a function of $n$ alone (assuming termination), the cached value $\text{cache}[m] = L(m)$ is correct, and thus $L(n) = k + L(m)$ . Processing $n = 2, 3, \ldots, N$ in order ensures that $\text{cache}[1] = 1$ is available as a base case, and each orbit eventually reaches a value $\le n - 1$ (since the orbit must pass through values below $n$ before reaching $1$ ) or exits the cache range and terminates at $1$ directly. $\square$

Lemma 2 (Orbit eventually descends). For any $n > 1$ , the Collatz orbit from $n$ eventually reaches a value strictly less than $n$ .

Proof. If $n$ is even, $C(n) = n/2 < n$ . If $n$ is odd, $C(n) = 3n+1 > n$ , but $C^2(n) = \frac{3n+1}{2}$ . While this may exceed $n$ , the orbit (under the assumption of termination verified for $n < 2^{68}$ ) must reach $1 < n$ . More precisely, each even step halves the value, while odd steps multiply by at most $\frac{3}{2}$ (after compression). Since $\frac{3}{2} < 2$ , a sequence of odd-then-even steps is contractive on average, guaranteeing eventual descent. $\square$

Editorial

We scan every starting value below $N$ while memoizing Collatz chain lengths that have already been resolved. For each start, we follow the sequence until we hit a value with known length, count how many new steps were taken, and store the resulting length for the starting value before comparing it with the current best. This is sufficient because every candidate under $N$ is examined and each evaluation eventually terminates at a cached chain tail.

Pseudocode

Algorithm: Longest Collatz Chain Below a Bound
Require: An integer N ≥ 2.
Ensure: The starting value below N that produces the longest Collatz chain.
1: Initialize a memo table with length(1) ← 1, and set best_start ← 1 and best_len ← 1.
2: For each start in {2, 3, ..., N - 1}, follow the Collatz map until a value with known chain length is reached, recording the unresolved trajectory.
3: Propagate chain lengths backward along that recorded trajectory and store them in the memo table.
4: If length(start) > best_len, update best_len ← length(start) and best_start ← start.
5: Return best_start.

Complexity Analysis

Proposition. The algorithm runs in $O(N \cdot \bar{L})$ worst-case time and $\Theta(N)$ space, where $\bar{L}$ is the average number of steps before hitting a cached value.

Proof (Space). The cache array has $N$ entries, each storing one integer. This dominates all other storage. Hence space is $\Theta(N)$ .

Proof (Time). For each starting value $n$ , the while-loop runs until the orbit hits a cached value $m \le N$ . Let $s(n)$ denote the number of iterations for starting value $n$ . The total work is $\sum_{n=2}^{N-1} s(n)$ . In the worst case, each $s(n)$ could be as large as $L(n)$ , giving $O(N \cdot \max_n L(n))$ . However, with memoization, once $\text{cache}[m]$ is set, all future orbits passing through $m$ terminate immediately. Empirically, for $N = 10^6$ , the total number of Collatz steps across all starting values is approximately $3.5 \times 10^7$ , consistent with $O(N \log N)$ amortized behavior.

The $O(N \log N)$ heuristic can be motivated as follows: the expected number of steps to halve a random integer’s magnitude is $O(1)$ (each even step halves, each odd-even pair multiplies by $\sim 3/4$ after two steps). Thus $s(n) = O(\log n)$ on average, giving total work $O(N \log N)$ . $\square$

Answer

\boxed{837799}

C++ project_euler/problem_014/solution.cpp

#include <bits/stdc++.h>
using namespace std;

int main() {
    const int N = 1000000;
    vector<int> cache(N + 1, 0);
    cache[1] = 1;

    for (int i = 2; i < N; i++) {
        long long n = i;
        int steps = 0;
        while (n >= N || cache[n] == 0) {
            n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
            steps++;
        }
        cache[i] = steps + (int)cache[n];
    }

    int best = 1;
    for (int i = 2; i < N; i++)
        if (cache[i] > cache[best])
            best = i;

    cout << best << endl;
    return 0;
}

Python project_euler/problem_014/solution.py

"""Project Euler Problem 14: Longest Collatz Sequence"""


def solve():

    LIMIT = 1_000_000
    cache = [0] * (LIMIT + 1)
    cache[1] = 1

    for i in range(2, LIMIT):
        n, steps = i, 0
        while n >= LIMIT or cache[n] == 0:
            n = n // 2 if n % 2 == 0 else 3 * n + 1
            steps += 1
        cache[i] = steps + cache[n]

    return max(range(1, LIMIT), key=lambda x: cache[x])


answer = solve()
assert answer == 837799
print(answer)