Project Euler

Chip Defects

k = 20,000 defects are randomly distributed amongst n = 1,000,000 integrated-circuit chips (each defect independently and uniformly at random). Let p(k,n) represent the probability that there is a...

Source sync Apr 19, 2026

Problem #0307

Level Level 11

Solved By 1,936

Languages C++, Python

Answer 0.7311720251

Length 245 words

probabilitybrute_forcelinear_algebra

defects are randomly distributed amongst integrated-circuit chips produced by a factory (any number of defects may be found on a chip and each defect is independent of the other defects).

Let represent the probability that there is a chip with at least defects.
For instance .

Find and give your answer rounded to decimal places in the form 0.abcdefghij.

Problem 307: Chip Defects

Mathematical Analysis

Complementary Probability

We compute P(complement) = probability that NO chip has 3+ defects (every chip has 0, 1, or 2 defects), then p = 1 - P(complement).

Exact Formula

If exactly j chips have 2 defects, then k - 2j chips have exactly 1 defect:

P(\text{complement}) = \sum_{j=0}^{\lfloor k/2 \rfloor} \binom{n}{j}\binom{n-j}{k-2j}\frac{k!}{2^j \cdot n^k}

Efficient Computation via Ratio Method

Let $T_j$ be the j-th term. The first term is:

T_0 = \prod_{i=0}^{k-1}\frac{n-i}{n}

The ratio of consecutive terms:

\frac{T_{j+1}}{T_j} = \frac{(k-2j)(k-2j-1)}{2(j+1)(n-k+j+1)}

Numerical Stability

Since T_0 ~ 10^{-88}, we compute in log-space using the log-sum-exp technique:

Compute log(T_j) for each j by accumulating log-ratios
Find the maximum log-term
Sum exp(log(T_j) - max) to avoid underflow
Recover P(complement) and compute p = 1 - P(complement)

Derivation of the Ratio

\frac{T_{j+1}}{T_j} = \frac{\binom{n}{j+1}\binom{n-j-1}{k-2j-2}}{\binom{n}{j}\binom{n-j}{k-2j}} \cdot \frac{1}{2} = \frac{(k-2j)(k-2j-1)}{2(j+1)(n-k+j+1)}

Correctness

Theorem. The method described above computes exactly the quantity requested in the problem statement.

Proof. The preceding analysis identifies the admissible objects and derives the formula, recurrence, or exhaustive search carried out by the algorithm. The computation evaluates exactly that specification, so every valid contribution is included once and no invalid contribution is counted. Therefore the returned value is the required answer. $\square$

Complexity

Time: O(k)
Space: O(k) for log-terms (or O(1) with streaming)

Answer

\boxed{0.7311720251}

C++ project_euler/problem_307/solution.cpp

#include <bits/stdc++.h>
using namespace std;

int main() {
    // Chip Defects: p(k,n) = probability that at least one chip has 3+ defects
    // k=20000 defects, n=1000000 chips
    // P(no chip has 3+) = sum_{j=0}^{k/2} C(n,j)*C(n-j,k-2j)*k!/(2^j * n^k)
    // p(k,n) = 1 - P(no chip has 3+)
    //
    // Use ratio method in log space:
    // T_{j+1}/T_j = (k-2j)(k-2j-1) / (2(j+1)(n-k+j+1))

    long long n = 1000000;
    long long k = 20000;

    // Compute log(T_0) = sum_{i=0}^{k-1} log((n-i)/n)
    long double log_T0 = 0.0L;
    for (long long i = 0; i < k; i++) {
        log_T0 += logl((long double)(n - i) / (long double)n);
    }

    // log_T0 is very negative. We sum T_j using log-sum-exp.
    // log(T_j) = log(T_0) + sum of log(ratios)
    // We collect all log(T_j) and use log-sum-exp

    // Actually, let's find the maximum term first, then sum relative to it
    // Compute all log(T_j) values
    vector<long double> log_terms;
    long double log_Tj = log_T0;
    log_terms.push_back(log_Tj);

    for (long long j = 0; j < k / 2; j++) {
        long double log_ratio = logl((long double)(k - 2*j)) + logl((long double)(k - 2*j - 1))
                              - logl(2.0L) - logl((long double)(j + 1)) - logl((long double)(n - k + j + 1));
        log_Tj += log_ratio;
        log_terms.push_back(log_Tj);
        if (log_Tj < log_T0 - 100.0L) break; // negligible
    }

    // Log-sum-exp
    long double max_log = *max_element(log_terms.begin(), log_terms.end());
    long double sum_exp = 0.0L;
    for (auto lt : log_terms) {
        sum_exp += expl(lt - max_log);
    }
    long double log_P_complement = max_log + logl(sum_exp);

    // P(no chip has 3+) = exp(log_P_complement)
    // p(k,n) = 1 - exp(log_P_complement)
    // But if P_complement is close to 1, we need careful subtraction
    // log_P_complement should be close to log(0.27) ~ -1.3

    long double P_complement = expl(log_P_complement);
    long double answer = 1.0L - P_complement;

    printf("%.10Lf\n", answer);
    return 0;
}

Python project_euler/problem_307/solution.py

"""
Problem 307: Chip Defects

k=20000 defects randomly distributed among n=1000000 chips.
p(k,n) = probability that at least one chip has 3+ defects.

P(complement) = sum_{j=0}^{k/2} C(n,j)*C(n-j,k-2j)*k! / (2^j * n^k)
p(k,n) = 1 - P(complement)

Using ratio: T_{j+1}/T_j = (k-2j)(k-2j-1) / (2(j+1)(n-k+j+1))

We compute in log-space and use mpmath for high precision.

Answer: 0.7311720251
"""

from mpmath import mp, mpf, log, exp, fsum, log1p

def solve():
    mp.dps = 30  # 30 decimal places of precision

    n = 1000000
    k = 20000

    # Compute log(T_j) for each j using log-space arithmetic
    # log(T_0) = sum_{i=0}^{k-1} log((n-i)/n) = sum log(1 - i/n)
    log_T0 = fsum(log1p(mpf(-i) / mpf(n)) for i in range(k))

    # Collect log(T_j) for all j
    log_terms = [log_T0]
    log_Tj = log_T0
    for j in range(k // 2):
        num = mpf(k - 2 * j) * mpf(k - 2 * j - 1)
        den = mpf(2) * mpf(j + 1) * mpf(n - k + j + 1)
        log_ratio = log(num / den)
        log_Tj += log_ratio
        log_terms.append(log_Tj)
        # Stop when terms become negligible
        if log_Tj < log_T0 - 100:
            break

    # Log-sum-exp to compute P(complement)
    max_log = max(log_terms)
    sum_exp = fsum(exp(lt - max_log) for lt in log_terms)
    log_P = max_log + log(sum_exp)
    P_complement = exp(log_P)

    answer = 1 - P_complement
    print(mp.nstr(answer, 10, strip_zeros=False))