The Locked Box
Consider n locked boxes, each requiring a specific key. You have m available keys, and each key opens a specific subset of boxes. Determine the minimum number of keys needed to open all n boxes, or...
Problem Statement
In a tournament there are \(n\) teams and each team plays each other team twice. A team gets two points for a win, one point for a draw and no points for a loss.
With two teams there are three possible outcomes for the total points. \((4,0)\) where a team wins twice, \((3,1)\) where a team wins and draws, and \((2,2)\) where either there are two draws or a team wins one game and loses the other. Here we do not distinguish the teams and so \((3,1)\) and \((1,3)\) are considered identical.
Let \(F(n)\) be the total number of possible final outcomes with \(n\) teams, so that \(F(2) = 3\).
You are also given \(F(7) = 32923\).
Find \(F(100)\). Give your answer modulo \(10^9+7\).
Problem 849: The Locked Box
Mathematical Analysis
Coupon Collector’s Problem
Theorem. The expected number of trials \(T\) needed to collect all \(n\) distinct coupons, when each trial yields a uniformly random coupon, is:
\[ E[T] = n H_n = n \sum_{k=1}^{n} \frac{1}{k}, \]
where \(H_n\) is the \(n\)-th harmonic number.
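Proof. Divide the collection into phases. Phase \(i\) starts when we have \(i-1\) distinct coupons and ends when we get the \(i\)-th new one. The probability of getting a new coupon in phase \(i\) is \(p_i = (n-i+1)/n\), so the length of phase \(i\) is geometric with expected value \(1/p_i = n/(n-i+1)\). Summing:
\[ E[T] = \sum_{i=1}^{n} \frac{n}{n-i+1} = n \sum_{k=1}^{n} \frac{1}{k} = n H_n. \]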
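The phase argument can be checked numerically. The sketch below is only an illustration (the helper names `expected_trials_by_phases` and `harmonic` are made up for this example): it sums the per-phase expectations with exact fractions and compares the result with \(n H_n\).

```python
from fractions import Fraction

def expected_trials_by_phases(n: int) -> Fraction:
    """Sum the expected lengths n / (n - i + 1) of phases i = 1..n."""
    return sum(Fraction(n, n - i + 1) for i in range(1, n + 1))

def harmonic(n: int) -> Fraction:
    """Exact n-th harmonic number H_n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# The phase sum and the closed form n * H_n agree exactly.
for n in (1, 2, 5, 10):
    assert expected_trials_by_phases(n) == n * harmonic(n)

print(expected_trials_by_phases(5))  # 137/12
```

Using `Fraction` avoids any floating-point doubt about whether the two expressions really coincide.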
Variance
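Theorem. The variance of \(T\) is:
\[ \mathrm{Var}(T) = n^2 \sum_{k=1}^{n} \frac{1}{k^2} - n H_n. \]
This follows because the phases are independent and a geometric variable with success probability \(p = k/n\) has variance \((1-p)/p^2 = n^2/k^2 - n/k\).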
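As a sanity check on the variance formula, the following sketch adds up the variances of the independent geometric phases and compares the total with the closed form. The function names are hypothetical and exist only for this illustration.

```python
from fractions import Fraction

def variance_by_phases(n: int) -> Fraction:
    """Sum of (1 - p) / p^2 over phases with success probability p = k / n."""
    total = Fraction(0)
    for k in range(1, n + 1):
        p = Fraction(k, n)
        total += (1 - p) / p**2
    return total

def variance_closed_form(n: int) -> Fraction:
    """n^2 * sum(1/k^2) - n * H_n, computed exactly."""
    sum_sq = sum(Fraction(1, k * k) for k in range(1, n + 1))
    h_n = sum(Fraction(1, k) for k in range(1, n + 1))
    return n * n * sum_sq - n * h_n

for n in (1, 2, 5, 10):
    assert variance_by_phases(n) == variance_closed_form(n)

print(variance_by_phases(2))  # 2
```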
Asymptotics
Theorem. As \(n \to \infty\):
\[ E[T] = n \ln n + \gamma n + \tfrac{1}{2} + O(1/n), \]
where \(\gamma \approx 0.5772\) is the Euler-Mascheroni constant.
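To see how quickly the asymptotic expansion becomes accurate, one can compare it with the exact sum. This is a small sketch using floating point; the constant `EULER_GAMMA` and both function names are introduced here purely for illustration.

```python
from math import log

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant (hard-coded)

def exact_expectation(n: int) -> float:
    """n * H_n by direct summation."""
    return n * sum(1.0 / k for k in range(1, n + 1))

def asymptotic_expectation(n: int) -> float:
    """n ln n + gamma * n + 1/2, dropping the O(1/n) term."""
    return n * log(n) + EULER_GAMMA * n + 0.5

for n in (10, 100, 1000):
    exact = exact_expectation(n)
    approx = asymptotic_expectation(n)
    print(n, round(exact, 4), round(approx, 4), abs(exact - approx))
```

The printed gap shrinks roughly like \(1/n\), in line with the \(O(1/n)\) error term.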
Set Cover (NP-Hard General Case)
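Theorem. The minimum set cover problem is NP-hard. The greedy algorithm (always pick the set covering the most uncovered elements) achieves an approximation ratio of \(H_n \le \ln n + 1\), which is essentially optimal: no polynomial-time algorithm achieves a ratio of \((1 - o(1)) \ln n\) unless P = NP.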
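The greedy rule described above might be sketched as follows. This is an illustrative implementation, not code from the archive: `greedy_set_cover` is a hypothetical name, elements are assumed to be numbered `0..n-1`, and ties are broken by taking the lowest-indexed set.

```python
def greedy_set_cover(n, sets):
    """Return indices of the chosen sets, or None if full coverage is impossible.

    n    -- number of elements, labelled 0..n-1 (assumption of this sketch)
    sets -- list of Python sets; sets[i] holds the elements covered by key i
    """
    uncovered = set(range(n))
    chosen = []
    while uncovered:
        # Pick the set that covers the most still-uncovered elements.
        best = max(range(len(sets)),
                   key=lambda i: len(sets[i] & uncovered),
                   default=None)
        if best is None or not (sets[best] & uncovered):
            return None  # nothing makes progress: some element is uncoverable
        chosen.append(best)
        uncovered -= sets[best]
    return chosen

print(greedy_set_cover(3, [{0, 1}, {1, 2}, {0, 2}]))  # [0, 1]
```

Greedy is only an approximation, so on other inputs it may pick more sets than the exact optimum.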
DP for Exact Set Cover
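For small \(n\), use bitmask DP: \(dp[S]\) = minimum number of keys (or minimum total cost) needed to cover the set \(S\) of boxes, with \(dp[\emptyset] = 0\). Transition:
\[ dp[S \cup K_j] = \min\bigl(dp[S \cup K_j],\; dp[S] + c_j\bigr) \]
over all keys \(j\) with cost \(c_j\) covering the set \(K_j\).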
Concrete Examples
| \(n\) | \(n H_n\) | decimal | Variance |
|---|---|---|---|
| 1 | 1 | 1.000 | 0 |
| 2 | 3 | 3.000 | 2 |
| 5 | 11.417 | 11.417 | 25.174 |
| 10 | 29.290 | 29.290 | 125.687 |
| 52 | 235.978 | 235.978 | (deck of cards) |
| 100 | 518.738 | 518.738 | 15831.1 |
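Verification for \(n = 2\): \(E[T] = 2 H_2 = 2 \cdot \tfrac{3}{2} = 3\). Indeed: the first draw always gives a new coupon. The second coupon then appears with probability 1/2 on each draw, so it takes 2 more draws in expectation. Total = \(1 + 2 = 3\). Correct.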
Complexity Analysis
- Coupon collector formula: \(O(n)\) for computing the harmonic sum.
- Set cover greedy: \(O(nm \cdot \min(n, m))\) for a straightforward implementation, where \(m\) = number of keys.
- Exact DP: \(O(2^n m)\) time, \(O(2^n)\) space.
Markov Chain Formulation
The coupon collector process is a Markov chain on states \(\{0, 1, \dots, n\}\) (number of distinct coupons collected). Transition probabilities: \(P(k \to k+1) = (n-k)/n\) and \(P(k \to k) = k/n\).
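The chain can be stepped forward directly to obtain the distribution of the number of distinct coupons after a given number of draws. The sketch below uses exact fractions; `state_distribution` is a name invented for this example.

```python
from fractions import Fraction

def state_distribution(n: int, t: int) -> list:
    """Distribution over the number of distinct coupons after t draws."""
    dist = [Fraction(0)] * (n + 1)
    dist[0] = Fraction(1)  # start with no coupons
    for _ in range(t):
        nxt = [Fraction(0)] * (n + 1)
        for k, pk in enumerate(dist):
            if pk == 0:
                continue
            nxt[k] += pk * Fraction(k, n)              # draw a coupon we already hold
            if k < n:
                nxt[k + 1] += pk * Fraction(n - k, n)  # draw a new coupon
        dist = nxt
    return dist

# The last entry is P(T <= t): all n coupons collected within t draws.
print(state_distribution(2, 3)[2])  # 3/4
```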
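Theorem (Hitting Time Distribution). The probability that exactly \(t\) trials are needed is:
\[ P(T = t) = \frac{n!}{n^t}\, S(t-1,\, n-1), \]
where \(S(t-1, n-1)\) is the Stirling number of the second kind. Expanding it by inclusion-exclusion gives:
\[ P(T = t) = \sum_{j=0}^{n-1} (-1)^j \binom{n-1}{j} \left(\frac{n-1-j}{n}\right)^{t-1}. \]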
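The inclusion-exclusion form is easy to evaluate exactly. The following sketch assumes the formula \(P(T = t) = \sum_{j=0}^{n-1} (-1)^j \binom{n-1}{j} \bigl((n-1-j)/n\bigr)^{t-1}\) and uses the convention \(0^0 = 1\); `prob_exactly` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

def prob_exactly(n: int, t: int) -> Fraction:
    """P(T = t): the n-th distinct coupon arrives exactly on draw t."""
    if t < 1:
        return Fraction(0)
    return sum(
        (-1) ** j * comb(n - 1, j) * Fraction(n - 1 - j, n) ** (t - 1)
        for j in range(n)
    )

print(prob_exactly(2, 3))  # 1/4

# Summing t * P(T = t) over a long enough range should approach n * H_n.
n = 3
mean = sum(t * prob_exactly(n, t) for t in range(1, 121))
print(float(mean))
```

For \(n = 3\) the truncated mean should be very close to \(3 H_3 = 5.5\), since the neglected tail decays geometrically.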
Birthday Problem Connection
The coupon collector is the “dual” of the birthday problem. Birthday: how many draws until a collision? Coupon: how many draws until full coverage? Both involve random sampling with replacement.
Theorem (Birthday). The expected number of draws for the first collision among \(n\) types is approximately \(\sqrt{\pi n / 2}\).
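The birthday estimate can be compared with the exact expectation, obtained by summing the probabilities that the first \(k\) draws are all distinct. This is an illustrative sketch; `expected_first_collision` is a name chosen here, not one from the archive.

```python
from math import pi, sqrt

def expected_first_collision(n: int) -> float:
    """Exact expected index of the first repeated type among n types.

    Uses E[C] = sum over k >= 0 of P(C > k), where P(C > k) is the
    probability that the first k draws are pairwise distinct.
    """
    total = 0.0
    prob_distinct = 1.0  # P(first k draws distinct), starting at k = 0
    for k in range(n + 1):
        total += prob_distinct
        prob_distinct *= 1.0 - k / n  # extend to k + 1 distinct draws
    return total

n = 365
print(expected_first_collision(n), sqrt(pi * n / 2))
```

For large \(n\) the exact value sits a little above \(\sqrt{\pi n / 2}\), by roughly \(2/3\), so the theorem's estimate captures the leading term only.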
Double Dixie Cup Problem
Generalization. The double dixie cup problem asks: how many draws \(T_r\) are needed to get each coupon at least \(r\) times?
For \(r = 2\):
\[ E[T_2] = n \ln n + n \ln \ln n + O(n). \]
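A seeded simulation gives a rough feel for the generalized problem. The sketch below is a Monte Carlo estimate only; `simulate_dixie_cup`, its parameter `r` (required copies per coupon) and the default trial count are assumptions made for this example.

```python
import random

def simulate_dixie_cup(n: int, r: int, trials: int = 2000, seed: int = 0) -> float:
    """Estimate the expected draws until every one of n coupons is seen r times."""
    if n < 1 or r < 1:
        raise ValueError("n and r must be positive")
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    total = 0
    for _ in range(trials):
        counts = [0] * n
        remaining = n  # coupons seen fewer than r times so far
        draws = 0
        while remaining:
            c = rng.randrange(n)
            counts[c] += 1
            draws += 1
            if counts[c] == r:
                remaining -= 1
        total += draws
    return total / trials

print(simulate_dixie_cup(10, 1))  # near n * H_n = 29.29 for n = 10
print(simulate_dixie_cup(10, 2))  # noticeably larger than the r = 1 case
```

With `r = 1` this reduces to the ordinary coupon collector, which gives a convenient check against \(n H_n\).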
Tail Bounds
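Theorem. \(P(T > n \ln n + cn) \le e^{-c}\) for \(c > 0\). This exponential tail bound follows from a union bound over uncollected coupons: a fixed coupon is still missing after \(t\) draws with probability \((1 - 1/n)^t \le e^{-t/n}\), and there are \(n\) coupons.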
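The bound can be compared with the exact tail, which inclusion-exclusion over the set of missing coupons gives as \(P(T > t) = \sum_{j=1}^{n} (-1)^{j+1} \binom{n}{j} (1 - j/n)^t\). The sketch below evaluates that expression exactly; `tail_probability` is a name made up for this illustration, and \(t\) is rounded up to an integer.

```python
from fractions import Fraction
from math import ceil, comb, exp, log

def tail_probability(n: int, t: int) -> Fraction:
    """Exact P(T > t): at least one coupon is still missing after t draws."""
    return sum(
        (-1) ** (j + 1) * comb(n, j) * Fraction(n - j, n) ** t
        for j in range(1, n + 1)
    )

n = 20
for c in (0.5, 1.0, 2.0):
    t = ceil(n * log(n) + c * n)
    exact = float(tail_probability(n, t))
    print(c, t, exact, exp(-c))
    assert exact <= exp(-c)  # the union bound holds
```

The exact tail is typically well below \(e^{-c}\), because the union bound ignores overlaps between the events "coupon \(i\) is missing".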
Answer
Code
Each problem page includes the exact C++ and Python source files from the local archive.
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
const ll MOD = 1e9 + 7;
// Fast modular exponentiation: base^exp mod mod
ll power(ll base, ll exp, ll mod) {
    ll result = 1;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}
ll modinv(ll a, ll mod = MOD) { return power(a, mod - 2, mod); }
// Coupon collector: E[T] = n * H_n mod p
ll coupon_collector_mod(int n) {
    ll hn = 0;
    for (int k = 1; k <= n; k++)
        hn = (hn + modinv(k)) % MOD;
    return (ll)n % MOD * hn % MOD;
}
// Set cover via bitmask DP
int min_set_cover(int n, const vector<int>& masks) {
    int full = (1 << n) - 1;
    vector<int> dp(full + 1, n + 1);  // n + 1 acts as "unreachable"
    dp[0] = 0;
    for (int s = 0; s <= full; s++) {
        if (dp[s] > n) continue;
        for (int mask : masks) {
            int ns = s | mask;
            dp[ns] = min(dp[ns], dp[s] + 1);
        }
    }
    return dp[full];
}
int main() {
    // Verify E[T] for n=2 is 3
    // H_2 = 1 + 1/2 = 3/2, so 2 * 3/2 = 3
    ll h2 = (1 + modinv(2)) % MOD;
    assert(2 * h2 % MOD == 3);
    // Set cover: 3 elements, 3 sets
    vector<int> masks = {0b011, 0b110, 0b101};
    assert(min_set_cover(3, masks) == 2);
    cout << coupon_collector_mod(1000) << endl;
    return 0;
}
"""
Problem 849: The Locked Box
Coupon collector problem: E[T] = n * H_n.
Set cover problem: bitmask DP for exact minimum.
"""
from math import log
from fractions import Fraction
# --- Method 1: Exact expected value via harmonic numbers ---
def coupon_collector_exact(n: int) -> Fraction:
    """Exact expected number of trials: n * H_n."""
    return n * sum(Fraction(1, k) for k in range(1, n + 1))

def coupon_collector_float(n: int) -> float:
    """Float approximation."""
    return n * sum(1.0 / k for k in range(1, n + 1))
# --- Method 2: Monte Carlo simulation ---
def coupon_collector_mc(n: int, trials: int = 100000) -> float:
    """Monte Carlo estimate of E[T]."""
    import random
    total = 0
    for _ in range(trials):
        collected = set()
        steps = 0
        while len(collected) < n:
            collected.add(random.randint(0, n - 1))
            steps += 1
        total += steps
    return total / trials
# --- Method 3: Set cover via bitmask DP ---
def min_set_cover(n: int, sets: list, costs: list = None):
    """Minimum cost to cover all n elements using given sets.
    sets[i] is a frozenset of elements covered by key i.
    costs[i] is the cost of key i (default 1).
    """
    if costs is None:
        costs = [1] * len(sets)
    full = (1 << n) - 1
    INF = float('inf')
    dp = [INF] * (full + 1)
    dp[0] = 0
    # Convert sets to bitmasks
    masks = []
    for s in sets:
        mask = 0
        for elem in s:
            mask |= (1 << elem)
        masks.append(mask)
    for state in range(full + 1):
        if dp[state] == INF:
            continue
        for i, mask in enumerate(masks):
            new_state = state | mask
            if dp[new_state] > dp[state] + costs[i]:
                dp[new_state] = dp[state] + costs[i]
    return dp[full]
# --- Method 4: Variance computation ---
def coupon_collector_variance(n: int) -> float:
    """Var(T) = n^2 * sum(1/k^2) - n * H_n."""
    sum_sq = sum(1.0 / k**2 for k in range(1, n + 1))
    h_n = sum(1.0 / k for k in range(1, n + 1))
    return n**2 * sum_sq - n * h_n
# --- Verification ---
assert coupon_collector_exact(1) == 1
assert coupon_collector_exact(2) == 3
assert abs(coupon_collector_float(10) - 29.2897) < 0.01
# Set cover verification
sets = [{0, 1}, {1, 2}, {0, 2}]
assert min_set_cover(3, sets) == 2 # need at least 2 sets
# MC should be close to exact
mc = coupon_collector_mc(10, 50000)
assert abs(mc - 29.29) < 1.0, f"MC estimate {mc} too far from 29.29"
print("Verification passed!")
MOD = 10**9 + 7
# Compute nH_n mod p using modular inverse
def harmonic_mod(n, mod):
    total = 0
    for k in range(1, n + 1):
        total = (total + pow(k, mod - 2, mod)) % mod
    return total
answer = 1000 * harmonic_mod(1000, MOD) % MOD
print(f"Answer: {answer}")