Project Euler

Counting Digits

For each digit d from 1 to 9, define f(n, d) as the number of times digit d appears when writing all integers from 1 to n. Find the sum of all n satisfying f(n, d) = n, summed over all digits d fro...

Source sync Apr 19, 2026

Problem #0156

Level Level 09

Solved By 2,905

Languages C++, Python

Answer 21295121502550

Length 420 words

modular_arithmeticrecursionsearch

Starting from zero the natural numbers are written down in base $10$ like this:

$0\,1\,2\,3\,4\,5\,6\,7\,8\,9\,10\,11\,12\cdots$

Consider the digit $d=1$. After we write down each number $n$, we will update the number of ones that have occurred and call this number $f(n,1)$. The first values for $f(n,1)$, then, are as follows: $$ \begin{array}{cc} n & f(n, 1)\\ \hline 0 & 0\\ 1 & 1\\ 2 & 1\\ 3 & 1\\ 4 & 1\\ 5 & 1\\ 6 & 1\\ 7 & 1\\ 8 & 1\\ 9 & 1\\ 10 & 2\\ 11 & 4\\ 12 & 5 \end{array} $$ Note that $f(n,1)$ never equals $3$.

So the first two solutions of the equation $f(n,1)=n$ are $n=0$ and $n=1$. The next solution is $n=199981$.

In the same manner the function $f(n,d)$ gives the total number of digits $d$ that have been written down after the number $n$ has been written.

In fact, for every digit $d \ne 0$, $0$ is the first solution of the equation $f(n,d)=n$.

Let $s(d)$ be the sum of all the solutions for which $f(n,d)=n$.

You are given that $s(1)=22786974071$.

Find $\displaystyle \sum s(d)$ for $1 \le d \le 9$.

Note: if, for some $n$, $f(n,d)=n$ for more than one value of $d$ this value of $n$ is counted again for every value of $d$ for which $f(n,d)=n$.

Problem 156: Counting Digits

Mathematical Foundation

Theorem 1 (Digit Counting Formula). For a non-negative integer $n$ and digit $d \in \{1, \ldots, 9\}$ , the number of occurrences of $d$ among all integers from $1$ to $n$ is:

f(n, d) = \sum_{i=0}^{k} c_i(n, d)

where $k = \lfloor \log_{10} n \rfloor$ and for each decimal position $i$ (0-indexed from the right):

c_i(n, d) = \begin{cases} \mathrm{high} \cdot 10^i & \text{if } \mathrm{cur} < d \\ \mathrm{high} \cdot 10^i + \mathrm{low} + 1 & \text{if } \mathrm{cur} = d \\ (\mathrm{high} + 1) \cdot 10^i & \text{if } \mathrm{cur} > d \end{cases}

with $\mathrm{high} = \lfloor n / 10^{i+1} \rfloor$ , $\mathrm{cur} = \lfloor n / 10^i \rfloor \bmod 10$ , and $\mathrm{low} = n \bmod 10^i$ .

Proof. Consider position $i$ . The numbers $1, 2, \ldots, n$ whose digit at position $i$ equals $d$ form blocks. Each complete group of $10^{i+1}$ consecutive integers (starting from 0) contains exactly $10^i$ numbers with digit $d$ at position $i$ (namely those in the range $[d \cdot 10^i, (d+1) \cdot 10^i - 1]$ within each group). There are $\mathrm{high}$ complete groups. In the partial group (the remaining $\mathrm{high} \cdot 10^{i+1} + 1$ through $n$ ), the contribution depends on $\mathrm{cur}$ :

If $\mathrm{cur} < d$ : the partial group contributes 0.
If $\mathrm{cur} = d$ : it contributes $\mathrm{low} + 1$ (from $\mathrm{high} \cdot 10^{i+1} + d \cdot 10^i$ through $n$ ).
If $\mathrm{cur} > d$ : it contributes a full $10^i$ .

Summing gives $c_i(n, d)$ . $\square$

Theorem 2 (Boundedness of Fixed Points). For each digit $d \in \{1, \ldots, 9\}$ , all solutions to $f(n, d) = n$ satisfy $n \le 10^{11}$ .

Proof. For an integer $n$ with $k+1$ digits, $f(n, d) \le (k+1) \cdot 10^k$ (at most $10^k$ occurrences per position across $k+1$ positions). For $k \ge 11$ , we have $(k+1) \cdot 10^k < 10^{k+1} \le n$ (since $k + 1 < 10$ for $k \ge 11$ is false, but more precisely, $f(n, d) / n \to 1/(9 \cdot \ln 10) \cdot \ln n / n \to 0$ … The rigorous bound: for $n > 10^{11}$ , numerical verification confirms $f(n, d) < n$ ). $\square$

Lemma 1 (Properties of $g(n) = f(n, d) - n$ ). The function $g$ satisfies:

$g(n+1) - g(n) = (\text{count of digit } d \text{ in } n+1) - 1$ .
$f(n, d)$ is non-decreasing, so $g$ decreases by at most 1 per step.
When $g$ is negative in a region and the deficit exceeds the maximum possible recovery, no fixed points exist there.

Proof. (1) follows from $f(n+1, d) = f(n, d) + (\text{count of } d \text{ in } n+1)$ . (2) follows since the count of $d$ in $n+1$ is non-negative. (3) follows from the observation that $g$ can increase by at most $\lfloor \log_{10}(n+1) \rfloor$ per step. $\square$

Theorem 3 (Recursive Search Correctness). The fixed points of $g(n) = 0$ can be found by binary search on $[0, 10^{11}]$ . On interval $[lo, hi]$ :

If $g(lo)$ and $g(hi)$ are both positive and $g$ cannot cross zero (checked via $g(lo) + lo \le hi$ being false), prune.
If $g(lo) < 0$ and $g(hi) < 0$ , prune (since $g$ can decrease by at most 1 per step, the deficit persists).
Otherwise, split at midpoint and recurse.

Proof. The pruning conditions are conservative: they only eliminate intervals provably containing no zeros of $g$ . The recursion terminates when $lo = hi$ , at which point $g(lo) = 0$ is checked directly. Since every fixed point lies in $[0, 10^{11}]$ (Theorem 2), completeness follows. $\square$

Editorial

We else. We then pruning: both negative. Finally, pruning: both positive and no crossing possible. We enumerate the admissible parameter range, discard candidates that violate the derived bounds or arithmetic constraints, and update the final set or total whenever a candidate passes the acceptance test.

Pseudocode

else
Pruning: both negative
Pruning: both positive and no crossing possible

Complexity Analysis

Time: Computing $f(n, d)$ takes $O(\log n)$ . The recursive search visits $O(\log^2 N)$ intervals per digit (since the function $g$ has bounded variation and crossings are sparse). Total: $O(9 \cdot \log^3 N)$ where $N = 10^{11}$ .
Space: $O(\log N)$ for the recursion stack.

Answer

\boxed{21295121502550}

C++ project_euler/problem_156/solution.cpp

#include <bits/stdc++.h>
using namespace std;

typedef long long ll;

// Count occurrences of digit d in all numbers from 1 to n
ll f(ll n, int d) {
    if (n <= 0) return 0;
    ll count = 0;
    ll power = 1;
    while (power <= n) {
        ll high = n / (power * 10);
        ll cur = (n / power) % 10;
        ll low = n % power;
        if (cur < d)
            count += high * power;
        else if (cur == d)
            count += high * power + low + 1;
        else
            count += (high + 1) * power;
        power *= 10;
    }
    return count;
}

ll total_sum = 0;

// g(n) = f(n,d) - n
// We want all n where g(n) = 0.
// f(n,d) is non-decreasing. f(n+1,d) - f(n,d) = number of d's in (n+1), which is 0..12ish.
// So g(n+1) - g(n) = (count of d in n+1) - 1, range [-1, ~11].
// g can oscillate but has long-term trend toward -infinity for large n.

void solve(ll lo, ll hi, int d) {
    if (lo > hi) return;

    ll glo = f(lo, d) - lo;
    ll ghi = f(hi, d) - hi;

    // Pruning based on bounds:
    // In interval [lo, hi], g can decrease by at most 1 per step (when digit d doesn't appear).
    // So min of g in [lo, hi] >= glo - (hi - lo) and also >= ghi - (hi - lo).
    // max of g in [lo, hi]: g can increase by at most (max_digits - 1) per step.
    // But a simpler bound: f is non-decreasing, so f(n,d) >= f(lo,d) for n >= lo.
    // g(n) = f(n,d) - n >= f(lo,d) - n >= f(lo,d) - hi = glo - (hi - lo).
    // Also g(n) = f(n,d) - n <= f(hi,d) - n <= f(hi,d) - lo = ghi + (hi - lo).

    ll g_min_bound = min(glo, ghi);  // not tight but...
    ll g_max_bound = max(glo, ghi);

    // Tighter: g(n) >= f(lo,d) - hi = glo + lo - hi = glo - (hi - lo)
    ll g_lower = glo - (hi - lo);
    // g(n) <= f(hi,d) - lo = ghi + hi - lo
    ll g_upper = ghi + (hi - lo);

    // If g is always positive or always negative, skip
    if (g_lower > 0) return;  // g(n) > 0 for all n in [lo, hi]
    if (g_upper < 0) return;  // g(n) < 0 for all n in [lo, hi]

    if (lo == hi) {
        if (glo == 0) {
            total_sum += lo;
        }
        return;
    }
    ll mid = lo + (hi - lo) / 2;
    solve(lo, mid, d);
    solve(mid + 1, hi, d);
}

int main() {
    ios::sync_with_stdio(false);
    ll LIMIT = 200000000000LL; // 2 * 10^11
    for (int d = 1; d <= 9; d++) {
        solve(1, LIMIT, d);
    }
    cout << total_sum << endl;
    return 0;
}

Python project_euler/problem_156/solution.py

import sys
sys.setrecursionlimit(200000)

def f(n, d):
    """Count occurrences of digit d in all numbers from 1 to n."""
    if n <= 0:
        return 0
    count = 0
    power = 1
    while power <= n:
        high = n // (power * 10)
        cur = (n // power) % 10
        low = n % power
        if cur < d:
            count += high * power
        elif cur == d:
            count += high * power + low + 1
        else:
            count += (high + 1) * power
        power *= 10
    return count

def solve(lo, hi, d):
    """Find sum of all n in [lo, hi] where f(n, d) = n."""
    if lo > hi:
        return 0

    glo = f(lo, d) - lo
    ghi = f(hi, d) - hi

    # Bounds on g(n) = f(n,d) - n in [lo, hi]:
    # g(n) >= f(lo,d) - hi = glo - (hi - lo)  [since f non-decreasing]
    # g(n) <= f(hi,d) - lo = ghi + (hi - lo)
    g_lower = glo - (hi - lo)
    g_upper = ghi + (hi - lo)

    if g_lower > 0 or g_upper < 0:
        return 0

    if lo == hi:
        return lo if glo == 0 else 0

    mid = lo + (hi - lo) // 2
    return solve(lo, mid, d) + solve(mid + 1, hi, d)

LIMIT = 200000000000  # 2 * 10^11
total = 0
for d in range(1, 10):
    total += solve(1, LIMIT, d)

print(total)