All Euler problems
Project Euler

Iterative Sampling

Consider a random process on the set {1, 2,..., n}. At each step, we sample an element uniformly at random (with replacement). Let T be the number of steps until some stopping condition is met (e.g...

Source sync Apr 19, 2026
Problem #0819
Level Level 31
Solved By 271
Languages C++, Python
Answer 1995.975556
Length 438 words
modular_arithmeticprobabilitylinear_algebra

Problem Statement

This archive keeps the full statement, math, and original media on the page.

Given an $n$-tuple of numbers another $n$-tuple is created where each element of the new $n$-tuple is chosen randomly from the numbers in the previous $n$-tuple. For example, given $(2,2,3)$ the probability that $2$ occurs in the first position in the next 3-tuple is $2/3$. The probability of getting all $2$'s would be $8/27$ while the probability of getting the same 3-tuple (in any order) would be $4/9$.

Let $E(n)$ be the expected number of steps starting with $(1,2,\ldots,n)$ and ending with all numbers being the same.

You are given $E(3) = 27/7$ and $E(5) = 468125/60701 \approx 7.711982$ rounded to 6 digits after the decimal place.

Find $E(10^3)$. Give the answer rounded to 6 digits after the decimal place.

Problem 819: Iterative Sampling

Mathematical Analysis

Coupon Collector Framework

Theorem 1 (Coupon Collector). The expected number of samples to collect all nn distinct coupons is:

E[Tn]=nHn=nk=1n1kE[T_n] = n \cdot H_n = n \sum_{k=1}^{n} \frac{1}{k}

where HnH_n is the nn-th harmonic number.

Proof. After collecting k1k-1 distinct coupons, the probability of getting a new one is (nk+1)/n(n-k+1)/n. The expected waiting time for the next new coupon is n/(nk+1)n/(n-k+1) (geometric distribution). By linearity of expectation:

E[Tn]=k=1nnnk+1=nj=1n1j=nHn.E[T_n] = \sum_{k=1}^{n} \frac{n}{n-k+1} = n \sum_{j=1}^{n} \frac{1}{j} = n H_n. \quad \square

Absorbing Markov Chain Formulation

Definition. An absorbing Markov chain has transient states QQ and absorbing states AA. The fundamental matrix is N=(IQ)1N = (I - Q)^{-1}, where QQ is the submatrix of transition probabilities among transient states.

Theorem 2. The expected number of steps to absorption, starting from state ii, is:

E[Ti]=[(IQ)11]i=jNij.E[T_i] = \left[(I - Q)^{-1} \mathbf{1}\right]_i = \sum_j N_{ij}.

Proof. NijN_{ij} gives the expected number of visits to state jj before absorption, starting from ii. Summing over all transient states gives the total expected steps. \square

State Encoding for Coupon Collection

Encode the state as the number of distinct coupons collected so far: s{0,1,,n}s \in \{0, 1, \ldots, n\}. State nn is absorbing (all collected). Transition probabilities:

P(ss)=sn,P(ss+1)=nsn.P(s \to s) = \frac{s}{n}, \quad P(s \to s+1) = \frac{n - s}{n}.

The expected time to go from state ss to s+1s+1 is n/(ns)n/(n-s) (geometric).

Modular Arithmetic for Rational Expectations

Since E[T]E[T] is a rational number, represent it as p/qp/q in lowest terms. Compute pq1(mod109+7)p \cdot q^{-1} \pmod{10^9+7} using Fermat’s little theorem.

Concrete Examples

nnE[Tn]E[T_n]nHnn H_n exact
111
233
311/2 = 5.511/2
425/325/3
5137/12137/12
107381/2520 * 1029.29\approx 29.29

Verification for n=3n=3: E[T]=3(1+1/2+1/3)=311/6=11/2E[T] = 3(1 + 1/2 + 1/3) = 3 \cdot 11/6 = 11/2. Correct.

Higher Moments and Variance

Theorem 3. Var(Tn)=n2k=1n1k2nHnπ2n26\text{Var}(T_n) = n^2 \sum_{k=1}^{n} \frac{1}{k^2} - n H_n \approx \frac{\pi^2 n^2}{6} for large nn.

Generalization: Partial Collection

If we only need to collect mm out of nn coupons, the expected time is:

E[Tn,m]=nk=nm+1n1k=n(HnHnm).E[T_{n,m}] = n \sum_{k=n-m+1}^{n} \frac{1}{k} = n(H_n - H_{n-m}).

Editorial

Coupon collector / absorbing Markov chain expected value. E[T_n] = n * H_n where H_n = harmonic number. Markov chain: E[T] = (I - Q)^{-1} * 1. We compute HnmodpH_n \bmod p using modular inverses: Hn=k=1nk1modpH_n = \sum_{k=1}^{n} k^{-1} \bmod p. Finally, iterate over the Markov chain version with more complex states, use matrix operations.

Pseudocode

Compute $H_n \bmod p$ using modular inverses: $H_n = \sum_{k=1}^{n} k^{-1} \bmod p$
Compute $E[T] = n \cdot H_n \bmod p$
For the Markov chain version with more complex states, use matrix operations

Proof of Correctness

Theorem (Fundamental Matrix). For an absorbing Markov chain with transition matrix partitioned as (QR0I)\begin{pmatrix} Q & R \\ 0 & I \end{pmatrix}, the matrix IQI - Q is invertible and N=(IQ)1=k=0QkN = (I-Q)^{-1} = \sum_{k=0}^{\infty} Q^k converges since all eigenvalues of QQ have modulus <1< 1.

Proof. The chain is absorbing, so Qk0Q^k \to 0. The Neumann series converges. \square

Complexity Analysis

  • Harmonic sum: O(n)O(n) additions and modular inversions.
  • Matrix inversion: O(S3)O(|S|^3) if using the full Markov chain with S|S| states.
  • For the coupon collector with state = count: O(n)O(n) total.

Answer

1995.975556\boxed{1995.975556}

Code

Each problem page includes the exact C++ and Python source files from the local archive.

C++ project_euler/problem_819/solution.cpp
#include <bits/stdc++.h>
using namespace std;

/*
 * Problem 819: Iterative Sampling
 *
 * Coupon collector / absorbing Markov chain.
 * E[T_n] = n * H_n = n * sum_{k=1}^{n} 1/k
 * Computed modulo 10^9+7 via modular inverses.
 *
 * Absorbing Markov chain: E[T] = (I-Q)^{-1} * 1
 */

const long long MOD = 1e9 + 7;

long long power(long long base, long long exp, long long mod) {
    long long result = 1;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

long long modinv(long long a, long long mod = MOD) {
    return power(a % mod, mod - 2, mod);
}

// Compute H_n mod p = sum_{k=1}^{n} k^{-1} mod p
long long harmonic_mod(long long n, long long mod) {
    long long h = 0;
    for (long long k = 1; k <= n; k++) {
        h = (h + modinv(k, mod)) % mod;
    }
    return h;
}

// E[T_n] = n * H_n mod p
long long coupon_collector_mod(long long n, long long mod) {
    return n % mod * harmonic_mod(n, mod) % mod;
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    // Verify small cases
    // n=1: E[T] = 1
    assert(coupon_collector_mod(1, MOD) == 1);

    // n=2: E[T] = 3
    assert(coupon_collector_mod(2, MOD) == 3);

    // n=3: E[T] = 11/2 mod p = 11 * inv(2) mod p
    long long expected_3 = 11 * modinv(2) % MOD;
    assert(coupon_collector_mod(3, MOD) == expected_3);

    // n=4: E[T] = 25/3 mod p
    long long expected_4 = 25 * modinv(3) % MOD;
    assert(coupon_collector_mod(4, MOD) == expected_4);

    // Compute for large N
    long long N = 1000000;
    long long ans = coupon_collector_mod(N, MOD);
    cout << ans << endl;

    return 0;
}