All Euler problems
Project Euler

Longest Collatz Sequence

Define the Collatz map C: Z^+ -> Z^+ by C(n) = n/2 & if 2 | n, 3n + 1 & if 2 nmid n. The Collatz sequence from n_0 is (n_0, C(n_0), C^2(n_0),...). The chain length L(n) is the number of terms in th...

Source sync Apr 19, 2026
Problem #0014
Level Level 00
Solved By 245,912
Languages C++, Python
Answer 837799
Length 552 words
sequencemodular_arithmeticbrute_force

Problem Statement

This archive keeps the full statement, math, and original media on the page.

The following iterative sequence is defined for the set of positive integers: $$\begin{cases} n \to n/2 (n \text{ is even}) \\ n \to 3n + 1 (n \text{ is odd}) \end{cases}$$ Using the rule above and starting with $13$, we generate the following sequence: $$13 \to 40 \to 20 \to 10 \to 5 \to 16 \to 8 \to 4 \to 2 \to 1.$$ It can be seen that this sequence (starting at $13$ and finishing at $1$) contains $10$ terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at $1$.

Which starting number, under one million, produces the longest chain?

NOTE: Once the chain starts the terms are allowed to go above one million.

Problem 14: Longest Collatz Sequence

Mathematical Development

Definitions

Definition 1. The Collatz orbit of nn is the sequence (Ck(n))k0(C^k(n))_{k \ge 0}. The stopping time (or chain length) is L(n)=min{k0:Ck(n)=1}+1L(n) = \min\{k \ge 0 : C^k(n) = 1\} + 1.

Definition 2. For n>1n > 1, define C(k)(n)=C(C(k1)(n))C^{(k)}(n) = C(C^{(k-1)}(n)) with C(0)(n)=nC^{(0)}(n) = n.

Theorems

Theorem 1 (Recursive structure). L(1)=1L(1) = 1, and for n>1n > 1:

L(n)=1+L(C(n)).L(n) = 1 + L(C(n)).

Proof. The Collatz sequence from nn is (n,C(n),C2(n),,1)(n, C(n), C^2(n), \ldots, 1). Removing the first term yields the sequence from C(n)C(n), which has L(C(n))L(C(n)) terms. Hence L(n)=1+L(C(n))L(n) = 1 + L(C(n)). \square

Lemma 1 (Odd-step compression). For odd n>1n > 1, 3n+13n + 1 is even, and

C(C(n))=3n+12.C(C(n)) = \frac{3n+1}{2}.

Consequently, L(n)=2+L ⁣(3n+12)L(n) = 2 + L\!\left(\frac{3n+1}{2}\right) for odd n>1n > 1.

Proof. If nn is odd, then C(n)=3n+1C(n) = 3n + 1. Since 3n3n is odd and 11 is odd, 3n+13n + 1 is even, so C(3n+1)=3n+12C(3n+1) = \frac{3n+1}{2}. Two applications of CC yield 3n+12\frac{3n+1}{2}, and by Theorem 1 applied twice: L(n)=1+L(3n+1)=1+1+L ⁣(3n+12)=2+L ⁣(3n+12)L(n) = 1 + L(3n+1) = 1 + 1 + L\!\left(\frac{3n+1}{2}\right) = 2 + L\!\left(\frac{3n+1}{2}\right). \square

Theorem 2 (Termination). The Collatz conjecture asserts L(n)<L(n) < \infty for all n1n \ge 1. While unproven in general, it has been verified computationally for all n<268n < 2^{68} (far exceeding 10610^6).

Proof. By computational verification (Oliveira e Silva, 2009, extended by subsequent authors). \square

Theorem 3 (Correctness of memoized search). Let cache[n]\text{cache}[n] store L(n)L(n) for 1nN1 \le n \le N. The following procedure correctly computes L(n)L(n) for all n[1,N]n \in [1, N]: starting from nn, follow the Collatz map until reaching some mNm \le N with cache[m]>0\text{cache}[m] > 0; then L(n)=k+cache[m]L(n) = k + \text{cache}[m], where kk is the number of steps taken.

Proof. By Theorem 1, L(n)=k+L(m)L(n) = k + L(m) where m=C(k)(n)m = C^{(k)}(n) is the first iterate with a known chain length. Since LL is a function of nn alone (assuming termination), the cached value cache[m]=L(m)\text{cache}[m] = L(m) is correct, and thus L(n)=k+L(m)L(n) = k + L(m). Processing n=2,3,,Nn = 2, 3, \ldots, N in order ensures that cache[1]=1\text{cache}[1] = 1 is available as a base case, and each orbit eventually reaches a value n1\le n - 1 (since the orbit must pass through values below nn before reaching 11) or exits the cache range and terminates at 11 directly. \square

Lemma 2 (Orbit eventually descends). For any n>1n > 1, the Collatz orbit from nn eventually reaches a value strictly less than nn.

Proof. If nn is even, C(n)=n/2<nC(n) = n/2 < n. If nn is odd, C(n)=3n+1>nC(n) = 3n+1 > n, but C2(n)=3n+12C^2(n) = \frac{3n+1}{2}. While this may exceed nn, the orbit (under the assumption of termination verified for n<268n < 2^{68}) must reach 1<n1 < n. More precisely, each even step halves the value, while odd steps multiply by at most 32\frac{3}{2} (after compression). Since 32<2\frac{3}{2} < 2, a sequence of odd-then-even steps is contractive on average, guaranteeing eventual descent. \square

Editorial

We scan every starting value below NN while memoizing Collatz chain lengths that have already been resolved. For each start, we follow the sequence until we hit a value with known length, count how many new steps were taken, and store the resulting length for the starting value before comparing it with the current best. This is sufficient because every candidate under NN is examined and each evaluation eventually terminates at a cached chain tail.

Pseudocode

Algorithm: Longest Collatz Chain Below a Bound
Require: An integer N ≥ 2.
Ensure: The starting value below N that produces the longest Collatz chain.
1: Initialize a memo table with length(1) ← 1, and set best_start ← 1 and best_len ← 1.
2: For each start in {2, 3, ..., N - 1}, follow the Collatz map until a value with known chain length is reached, recording the unresolved trajectory.
3: Propagate chain lengths backward along that recorded trajectory and store them in the memo table.
4: If length(start) > best_len, update best_len ← length(start) and best_start ← start.
5: Return best_start.

Complexity Analysis

Proposition. The algorithm runs in O(NLˉ)O(N \cdot \bar{L}) worst-case time and Θ(N)\Theta(N) space, where Lˉ\bar{L} is the average number of steps before hitting a cached value.

Proof (Space). The cache array has NN entries, each storing one integer. This dominates all other storage. Hence space is Θ(N)\Theta(N).

Proof (Time). For each starting value nn, the while-loop runs until the orbit hits a cached value mNm \le N. Let s(n)s(n) denote the number of iterations for starting value nn. The total work is n=2N1s(n)\sum_{n=2}^{N-1} s(n). In the worst case, each s(n)s(n) could be as large as L(n)L(n), giving O(NmaxnL(n))O(N \cdot \max_n L(n)). However, with memoization, once cache[m]\text{cache}[m] is set, all future orbits passing through mm terminate immediately. Empirically, for N=106N = 10^6, the total number of Collatz steps across all starting values is approximately 3.5×1073.5 \times 10^7, consistent with O(NlogN)O(N \log N) amortized behavior.

The O(NlogN)O(N \log N) heuristic can be motivated as follows: the expected number of steps to halve a random integer’s magnitude is O(1)O(1) (each even step halves, each odd-even pair multiplies by 3/4\sim 3/4 after two steps). Thus s(n)=O(logn)s(n) = O(\log n) on average, giving total work O(NlogN)O(N \log N). \square

Answer

837799\boxed{837799}

Code

Each problem page includes the exact C++ and Python source files from the local archive.

C++ project_euler/problem_014/solution.cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    const int N = 1000000;
    vector<int> cache(N + 1, 0);
    cache[1] = 1;

    for (int i = 2; i < N; i++) {
        long long n = i;
        int steps = 0;
        while (n >= N || cache[n] == 0) {
            n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
            steps++;
        }
        cache[i] = steps + (int)cache[n];
    }

    int best = 1;
    for (int i = 2; i < N; i++)
        if (cache[i] > cache[best])
            best = i;

    cout << best << endl;
    return 0;
}