All Euler problems
Project Euler

10001st Prime

Determine the 10001st prime number, where primes are indexed as p_1 = 2, p_2 = 3, p_3 = 5,...

Source sync Apr 19, 2026
Problem #0007
Level Level 00
Solved By 452,606
Languages C++, Python
Answer 104743
Length 598 words
number_theorybrute_forcelinear_algebra

Problem Statement

This archive keeps the full statement, math, and original media on the page.

By listing the first six prime numbers: \(2, 3, 5, 7, 11\), and \(13\), we can see that the \(6\)th prime is \(13\).

What is the \(10\,001\)st prime number

Problem 7: 10001st Prime

Mathematical Development

Definitions

Definition 1. A natural number n2n \ge 2 is prime if its only positive divisors are 1 and nn. A natural number n2n \ge 2 that is not prime is composite.

Definition 2. The prime-counting function π(x)\pi(x) denotes the number of primes pp satisfying pxp \le x. We write pnp_n for the nn-th smallest prime.

Upper Bound on pnp_n

Theorem 1 (Prime Number Theorem). As xx \to \infty,

π(x)xlnx.\pi(x) \sim \frac{x}{\ln x}.

Equivalently, pnnlnnp_n \sim n \ln n as nn \to \infty.

Theorem 2 (Rosser, 1941). For every integer n6n \ge 6,

pn<n(lnn+lnlnn).p_n < n(\ln n + \ln \ln n).

Reference. J. B. Rosser, “Explicit bounds for some functions of prime numbers,” Amer. J. Math. 63 (1941), 211—232. The proof relies on explicit bounds for the Chebyshev function θ(x)=pxlnp\theta(x) = \sum_{p \le x} \ln p and is taken here as established.

Proposition 1 (Sieve bound for n=10001n = 10001). The 10001st prime satisfies p10001<114320p_{10001} < 114320.

Proof. Since 10001610001 \ge 6, Theorem 2 applies. We compute:

  • ln(10001)=9.21044\ln(10001) = 9.21044\ldots
  • ln(ln(10001))=ln(9.21044)=2.22069\ln(\ln(10001)) = \ln(9.21044\ldots) = 2.22069\ldots
  • 10001×(9.21044+2.22069)=10001×11.43113=114319.410001 \times (9.21044 + 2.22069) = 10001 \times 11.43113\ldots = 114319.4\ldots

Hence p10001<114320p_{10001} < 114320. We set the sieve limit N=115000N = 115000 to provide a safety margin. \square

Correctness of the Sieve of Eratosthenes

Theorem 3 (Sieve of Eratosthenes). Let N2N \ge 2 be an integer. Define the sieve procedure:

  1. Initialize a Boolean array A[0N]A[0 \ldots N] with A[i]=trueA[i] = \mathrm{true} for all ii.
  2. Set A[0]=A[1]=falseA[0] = A[1] = \mathrm{false}.
  3. For each integer pp from 22 to N\lfloor \sqrt{N} \rfloor: if A[p]=trueA[p] = \mathrm{true}, set A[kp]=falseA[kp] = \mathrm{false} for every integer kpk \ge p with kpNkp \le N.

Then for every integer i[2,N]i \in [2, N], A[i]=trueA[i] = \mathrm{true} if and only if ii is prime.

Proof. We establish both directions.

Claim A (every composite is marked). Let m[2,N]m \in [2, N] be composite. Write m=abm = ab with 2ab2 \le a \le b. Then amNa \le \sqrt{m} \le \sqrt{N}. Let pp be the smallest prime factor of mm; then paNp \le a \le \sqrt{N}. At iteration pp of the sieve, all multiples of pp starting from p2p^2 are marked false. Since mm is a multiple of pp and mpap2m \ge pa \ge p^2, the entry A[m]A[m] is set to false.

Claim B (no prime is marked). Let q[2,N]q \in [2, N] be prime. In step 3, A[q]A[q] could only be set to false during the iteration for some prime pNp \le \lfloor \sqrt{N} \rfloor with pqp \ne q. But marking begins at p2p^2, and the values marked are p2,p2+p,p2+2p,p^2, p^2 + p, p^2 + 2p, \ldots, all of which are multiples of pp. Since qq is prime and qpq \ne p, we have pqp \nmid q, so qq is never marked. \square

Corollary 1. Running the Sieve of Eratosthenes up to N=115000N = 115000 and extracting the nn-th entry equal to true (counting from i=2i = 2) yields pnp_n for any nn with pnNp_n \le N. By Proposition 1, this includes n=10001n = 10001.

Editorial

The algorithm first chooses a proven upper bound that is guaranteed to contain the nn-th prime. It then runs the Sieve of Eratosthenes on [2,N][2, N], marks every composite starting from p2p^2, and performs a final increasing scan through the sieve until the nn-th marked prime is reached. This is sufficient because the bound contains pnp_n and the sieve leaves exactly the primes unmarked.

Theorem 5 (Algorithm correctness). NthPrime(n) returns the nn-th prime number for every n1n \ge 1.

Proof. If n<6n < 6, then the algorithm uses the explicit bound N=20N = 20, and the primes up to 20 are

2,3,5,7,11,13,17,19,2, 3, 5, 7, 11, 13, 17, 19,

so the nn-th prime certainly lies in [2,20][2,20].

If n6n \ge 6, the algorithm sets

N=n(lnn+lnlnn)+100.N = \left\lceil n(\ln n + \ln \ln n)\right\rceil + 100.

By Theorem 2, pn<n(lnn+lnlnn)Np_n < n(\ln n + \ln \ln n) \le N, so again pnNp_n \le N.

By Theorem 3, after the sieve phase the entries marked true are exactly the primes in [2,N][2,N]. The final scan visits these integers in strictly increasing order and increments count once for each prime encountered. Therefore when count == n, the current value is exactly the nn-th prime. Hence the returned value is pnp_n. \square

Numerical Evaluation

Proposition 2. The algorithm returns

p10001=104743.p_{10001} = 104743.

Proof. By Proposition 1, it is enough to sieve up to N=115000N = 115000. Running the algorithm on this interval shows:

  • there are exactly 1000010000 primes strictly less than 104743104743;
  • there are exactly 1000110001 primes less than or equal to 104743104743.

Therefore 104743104743 is the 1000110001-st prime. \square

Pseudocode

Algorithm: nth Prime by Sieve
Require: An integer n ≥ 1.
Ensure: The n-th prime number.
1: Choose an upper bound U that is guaranteed to contain the n-th prime.
2: Build a sieve on {2, 3, ..., U}, marking composites from p^2 onward for each prime base p.
3: Scan the surviving primes in increasing order, maintaining a counter c.
4: When c = n, return the current prime.

Complexity Analysis

Theorem 4 (Sieve complexity). The Sieve of Eratosthenes on [0,N][0, N] runs in O(NloglogN)O(N \log \log N) time and O(N)O(N) space.

Proof. Time. The total number of marking operations is

pNp primeNppNp primeNp=NpNp prime1p.\sum_{\substack{p \le \sqrt{N} \\ p \text{ prime}}} \left\lfloor \frac{N}{p} \right\rfloor \le \sum_{\substack{p \le N \\ p \text{ prime}}} \frac{N}{p} = N \sum_{\substack{p \le N \\ p \text{ prime}}} \frac{1}{p}.

By Mertens’ second theorem (1874), px1/p=lnlnx+M+O(1/lnx)\sum_{p \le x} 1/p = \ln \ln x + M + O(1/\ln x), where M0.2615M \approx 0.2615 is the Meissel—Mertens constant. Hence the sum is O(NloglogN)O(N \log \log N). The subsequent linear scan to locate pnp_n costs O(N)O(N), which is dominated.

Space. The Boolean array has N+1N + 1 entries, giving O(N)O(N) space. \square

Corollary 2. With N=O(nlogn)N = O(n \log n) by Theorem 2, the overall time complexity for finding pnp_n is O(nlognloglog(nlogn))O(n \log n \cdot \log \log(n \log n)) and space complexity is O(nlogn)O(n \log n).

Answer

104743\boxed{104743}

Code

Each problem page includes the exact C++ and Python source files from the local archive.

C++ project_euler/problem_007/solution.cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    const int N = 115000; // Rosser's bound: p_10001 < 10001*(ln 10001 + ln ln 10001) < 114320
    vector<bool> sieve(N + 1, true);
    sieve[0] = sieve[1] = false;
    for (int i = 2; i * i <= N; i++)
        if (sieve[i])
            for (int j = i * i; j <= N; j += i)
                sieve[j] = false;
    int count = 0;
    for (int i = 2; i <= N; i++) {
        if (sieve[i] && ++count == 10001) {
            cout << i << endl;
            return 0;
        }
    }
    return 0;
}