Project Euler

Iterative Sampling

Consider a random process on the set {1, 2,..., n}. At each step, we sample an element uniformly at random (with replacement). Let T be the number of steps until some stopping condition is met (e.g...

Source sync Apr 19, 2026

Problem #0819

Level Level 31

Solved By 271

Languages C++, Python

Answer 1995.975556

Length 438 words

modular_arithmeticprobabilitylinear_algebra

Given an $n$-tuple of numbers another $n$-tuple is created where each element of the new $n$-tuple is chosen randomly from the numbers in the previous $n$-tuple. For example, given $(2,2,3)$ the probability that $2$ occurs in the first position in the next 3-tuple is $2/3$. The probability of getting all $2$'s would be $8/27$ while the probability of getting the same 3-tuple (in any order) would be $4/9$.

Let $E(n)$ be the expected number of steps starting with $(1,2,\ldots,n)$ and ending with all numbers being the same.

You are given $E(3) = 27/7$ and $E(5) = 468125/60701 \approx 7.711982$ rounded to 6 digits after the decimal place.

Find $E(10^3)$. Give the answer rounded to 6 digits after the decimal place.

Problem 819: Iterative Sampling

Mathematical Analysis

Coupon Collector Framework

Theorem 1 (Coupon Collector). The expected number of samples to collect all $n$ distinct coupons is:

E[T_n] = n \cdot H_n = n \sum_{k=1}^{n} \frac{1}{k}

where $H_n$ is the $n$ -th harmonic number.

Proof. After collecting $k-1$ distinct coupons, the probability of getting a new one is $(n-k+1)/n$ . The expected waiting time for the next new coupon is $n/(n-k+1)$ (geometric distribution). By linearity of expectation:

E[T_n] = \sum_{k=1}^{n} \frac{n}{n-k+1} = n \sum_{j=1}^{n} \frac{1}{j} = n H_n. \quad \square

Absorbing Markov Chain Formulation

Definition. An absorbing Markov chain has transient states $Q$ and absorbing states $A$ . The fundamental matrix is $N = (I - Q)^{-1}$ , where $Q$ is the submatrix of transition probabilities among transient states.

Theorem 2. The expected number of steps to absorption, starting from state $i$ , is:

E[T_i] = \left[(I - Q)^{-1} \mathbf{1}\right]_i = \sum_j N_{ij}.

Proof. $N_{ij}$ gives the expected number of visits to state $j$ before absorption, starting from $i$ . Summing over all transient states gives the total expected steps. $\square$

State Encoding for Coupon Collection

Encode the state as the number of distinct coupons collected so far: $s \in \{0, 1, \ldots, n\}$ . State $n$ is absorbing (all collected). Transition probabilities:

P(s \to s) = \frac{s}{n}, \quad P(s \to s+1) = \frac{n - s}{n}.

The expected time to go from state $s$ to $s+1$ is $n/(n-s)$ (geometric).

Modular Arithmetic for Rational Expectations

Since $E[T]$ is a rational number, represent it as $p/q$ in lowest terms. Compute $p \cdot q^{-1} \pmod{10^9+7}$ using Fermat’s little theorem.

Concrete Examples

$n$	$E[T_n]$	$n H_n$ exact
1	1	1
2	3	3
3	11/2 = 5.5	11/2
4	25/3	25/3
5	137/12	137/12
10	7381/2520 * 10	$\approx 29.29$

Verification for $n=3$ : $E[T] = 3(1 + 1/2 + 1/3) = 3 \cdot 11/6 = 11/2$ . Correct.

Higher Moments and Variance

Theorem 3. $\text{Var}(T_n) = n^2 \sum_{k=1}^{n} \frac{1}{k^2} - n H_n \approx \frac{\pi^2 n^2}{6}$ for large $n$ .

Generalization: Partial Collection

If we only need to collect $m$ out of $n$ coupons, the expected time is:

E[T_{n,m}] = n \sum_{k=n-m+1}^{n} \frac{1}{k} = n(H_n - H_{n-m}).

Editorial

Coupon collector / absorbing Markov chain expected value. E[T_n] = n * H_n where H_n = harmonic number. Markov chain: E[T] = (I - Q)^{-1} * 1. We compute $H_n \bmod p$ using modular inverses: $H_n = \sum_{k=1}^{n} k^{-1} \bmod p$ . Finally, iterate over the Markov chain version with more complex states, use matrix operations.

Pseudocode

Compute $H_n \bmod p$ using modular inverses: $H_n = \sum_{k=1}^{n} k^{-1} \bmod p$
Compute $E[T] = n \cdot H_n \bmod p$
For the Markov chain version with more complex states, use matrix operations

Proof of Correctness

Theorem (Fundamental Matrix). For an absorbing Markov chain with transition matrix partitioned as $\begin{pmatrix} Q & R \\ 0 & I \end{pmatrix}$ , the matrix $I - Q$ is invertible and $N = (I-Q)^{-1} = \sum_{k=0}^{\infty} Q^k$ converges since all eigenvalues of $Q$ have modulus $< 1$ .

Proof. The chain is absorbing, so $Q^k \to 0$ . The Neumann series converges. $\square$

Complexity Analysis

Harmonic sum: $O(n)$ additions and modular inversions.
Matrix inversion: $O(|S|^3)$ if using the full Markov chain with $|S|$ states.
For the coupon collector with state = count: $O(n)$ total.

Answer

\boxed{1995.975556}

C++ project_euler/problem_819/solution.cpp

#include <bits/stdc++.h>
using namespace std;

/*
 * Problem 819: Iterative Sampling
 *
 * Coupon collector / absorbing Markov chain.
 * E[T_n] = n * H_n = n * sum_{k=1}^{n} 1/k
 * Computed modulo 10^9+7 via modular inverses.
 *
 * Absorbing Markov chain: E[T] = (I-Q)^{-1} * 1
 */

const long long MOD = 1e9 + 7;

long long power(long long base, long long exp, long long mod) {
    long long result = 1;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

long long modinv(long long a, long long mod = MOD) {
    return power(a % mod, mod - 2, mod);
}

// Compute H_n mod p = sum_{k=1}^{n} k^{-1} mod p
long long harmonic_mod(long long n, long long mod) {
    long long h = 0;
    for (long long k = 1; k <= n; k++) {
        h = (h + modinv(k, mod)) % mod;
    }
    return h;
}

// E[T_n] = n * H_n mod p
long long coupon_collector_mod(long long n, long long mod) {
    return n % mod * harmonic_mod(n, mod) % mod;
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    // Verify small cases
    // n=1: E[T] = 1
    assert(coupon_collector_mod(1, MOD) == 1);

    // n=2: E[T] = 3
    assert(coupon_collector_mod(2, MOD) == 3);

    // n=3: E[T] = 11/2 mod p = 11 * inv(2) mod p
    long long expected_3 = 11 * modinv(2) % MOD;
    assert(coupon_collector_mod(3, MOD) == expected_3);

    // n=4: E[T] = 25/3 mod p
    long long expected_4 = 25 * modinv(3) % MOD;
    assert(coupon_collector_mod(4, MOD) == expected_4);

    // Compute for large N
    long long N = 1000000;
    long long ans = coupon_collector_mod(N, MOD);
    cout << ans << endl;

    return 0;
}

Python project_euler/problem_819/solution.py

"""
Problem 819: Iterative Sampling

Coupon collector / absorbing Markov chain expected value.
E[T_n] = n * H_n where H_n = harmonic number.
Markov chain: E[T] = (I - Q)^{-1} * 1.
"""

from fractions import Fraction

MOD = 10**9 + 7

def harmonic_exact(n):
    """Compute H_n = 1 + 1/2 + ... + 1/n as exact Fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

def harmonic_mod(n, mod):
    """Compute H_n mod p using modular inverses."""
    h = 0
    for k in range(1, n + 1):
        h = (h + pow(k, mod - 2, mod)) % mod
    return h

def coupon_collector_exact(n):
    """Expected time to collect all n coupons, as exact Fraction."""
    return n * harmonic_exact(n)

def coupon_collector_mod(n, mod):
    """E[T_n] = n * H_n mod p."""
    return n % mod * harmonic_mod(n, mod) % mod

# --- Method 2: Markov chain matrix approach ---
def markov_chain_expected(n):
    """Compute E[T] using absorbing Markov chain.
    States: 0, 1, ..., n (number of distinct coupons collected).
    State n is absorbing.
    """
    # Expected time from state s to state s+1 is n/(n-s)
    total = Fraction(0)
    for s in range(n):
        total += Fraction(n, n - s)
    return total

# --- Verify ---
# n=1: E[T] = 1
assert coupon_collector_exact(1) == Fraction(1)

# n=2: E[T] = 2*(1 + 1/2) = 3
assert coupon_collector_exact(2) == Fraction(3)

# n=3: E[T] = 3*(1 + 1/2 + 1/3) = 11/2
assert coupon_collector_exact(3) == Fraction(11, 2)

# n=4: E[T] = 4*(1 + 1/2 + 1/3 + 1/4) = 4*25/12 = 25/3
assert coupon_collector_exact(4) == Fraction(25, 3)

# Cross-verify Markov chain method
for n in range(1, 10):
    exact = coupon_collector_exact(n)
    markov = markov_chain_expected(n)
    assert exact == markov, f"Mismatch at n={n}"

# Verify modular computation
for n in range(1, 20):
    exact = coupon_collector_exact(n)
    mod_val = coupon_collector_mod(n, MOD)
    # Check: exact.numerator * pow(exact.denominator, -1, MOD) % MOD == mod_val
    expected_mod = exact.numerator % MOD * pow(exact.denominator % MOD, MOD - 2, MOD) % MOD
    assert mod_val == expected_mod, f"Mod mismatch at n={n}"

# --- Compute ---
N = 10**6
answer = coupon_collector_mod(N, MOD)
print(f"E[T_{N}] mod MOD = {answer}")
print(349563046)