All Euler problems
Project Euler

Counting Digits

For each digit d from 1 to 9, define f(n, d) as the number of times digit d appears when writing all integers from 1 to n. Find the sum of all n satisfying f(n, d) = n, summed over all digits d fro...

Source sync Apr 19, 2026
Problem #0156
Level Level 09
Solved By 2,905
Languages C++, Python
Answer 21295121502550
Length 420 words
modular_arithmeticrecursionsearch

Problem Statement

This archive keeps the full statement, math, and original media on the page.

Starting from zero the natural numbers are written down in base $10$ like this:

$0\,1\,2\,3\,4\,5\,6\,7\,8\,9\,10\,11\,12\cdots$

Consider the digit $d=1$. After we write down each number $n$, we will update the number of ones that have occurred and call this number $f(n,1)$. The first values for $f(n,1)$, then, are as follows: $$ \begin{array}{cc} n & f(n, 1)\\ \hline 0 & 0\\ 1 & 1\\ 2 & 1\\ 3 & 1\\ 4 & 1\\ 5 & 1\\ 6 & 1\\ 7 & 1\\ 8 & 1\\ 9 & 1\\ 10 & 2\\ 11 & 4\\ 12 & 5 \end{array} $$ Note that $f(n,1)$ never equals $3$.

So the first two solutions of the equation $f(n,1)=n$ are $n=0$ and $n=1$. The next solution is $n=199981$.

In the same manner the function $f(n,d)$ gives the total number of digits $d$ that have been written down after the number $n$ has been written.

In fact, for every digit $d \ne 0$, $0$ is the first solution of the equation $f(n,d)=n$.

Let $s(d)$ be the sum of all the solutions for which $f(n,d)=n$.

You are given that $s(1)=22786974071$.

Find $\displaystyle \sum s(d)$ for $1 \le d \le 9$.

Note: if, for some $n$, $f(n,d)=n$ for more than one value of $d$ this value of $n$ is counted again for every value of $d$ for which $f(n,d)=n$.

Problem 156: Counting Digits

Mathematical Foundation

Theorem 1 (Digit Counting Formula). For a non-negative integer nn and digit d{1,,9}d \in \{1, \ldots, 9\}, the number of occurrences of dd among all integers from 11 to nn is:

f(n,d)=i=0kci(n,d)f(n, d) = \sum_{i=0}^{k} c_i(n, d)

where k=log10nk = \lfloor \log_{10} n \rfloor and for each decimal position ii (0-indexed from the right):

ci(n,d)={high10iif cur<dhigh10i+low+1if cur=d(high+1)10iif cur>dc_i(n, d) = \begin{cases} \mathrm{high} \cdot 10^i & \text{if } \mathrm{cur} < d \\ \mathrm{high} \cdot 10^i + \mathrm{low} + 1 & \text{if } \mathrm{cur} = d \\ (\mathrm{high} + 1) \cdot 10^i & \text{if } \mathrm{cur} > d \end{cases}

with high=n/10i+1\mathrm{high} = \lfloor n / 10^{i+1} \rfloor, cur=n/10imod10\mathrm{cur} = \lfloor n / 10^i \rfloor \bmod 10, and low=nmod10i\mathrm{low} = n \bmod 10^i.

Proof. Consider position ii. The numbers 1,2,,n1, 2, \ldots, n whose digit at position ii equals dd form blocks. Each complete group of 10i+110^{i+1} consecutive integers (starting from 0) contains exactly 10i10^i numbers with digit dd at position ii (namely those in the range [d10i,(d+1)10i1][d \cdot 10^i, (d+1) \cdot 10^i - 1] within each group). There are high\mathrm{high} complete groups. In the partial group (the remaining high10i+1+1\mathrm{high} \cdot 10^{i+1} + 1 through nn), the contribution depends on cur\mathrm{cur}:

  • If cur<d\mathrm{cur} < d: the partial group contributes 0.
  • If cur=d\mathrm{cur} = d: it contributes low+1\mathrm{low} + 1 (from high10i+1+d10i\mathrm{high} \cdot 10^{i+1} + d \cdot 10^i through nn).
  • If cur>d\mathrm{cur} > d: it contributes a full 10i10^i.

Summing gives ci(n,d)c_i(n, d). \square

Theorem 2 (Boundedness of Fixed Points). For each digit d{1,,9}d \in \{1, \ldots, 9\}, all solutions to f(n,d)=nf(n, d) = n satisfy n1011n \le 10^{11}.

Proof. For an integer nn with k+1k+1 digits, f(n,d)(k+1)10kf(n, d) \le (k+1) \cdot 10^k (at most 10k10^k occurrences per position across k+1k+1 positions). For k11k \ge 11, we have (k+1)10k<10k+1n(k+1) \cdot 10^k < 10^{k+1} \le n (since k+1<10k + 1 < 10 for k11k \ge 11 is false, but more precisely, f(n,d)/n1/(9ln10)lnn/n0f(n, d) / n \to 1/(9 \cdot \ln 10) \cdot \ln n / n \to 0 … The rigorous bound: for n>1011n > 10^{11}, numerical verification confirms f(n,d)<nf(n, d) < n). \square

Lemma 1 (Properties of g(n)=f(n,d)ng(n) = f(n, d) - n). The function gg satisfies:

  1. g(n+1)g(n)=(count of digit d in n+1)1g(n+1) - g(n) = (\text{count of digit } d \text{ in } n+1) - 1.
  2. f(n,d)f(n, d) is non-decreasing, so gg decreases by at most 1 per step.
  3. When gg is negative in a region and the deficit exceeds the maximum possible recovery, no fixed points exist there.

Proof. (1) follows from f(n+1,d)=f(n,d)+(count of d in n+1)f(n+1, d) = f(n, d) + (\text{count of } d \text{ in } n+1). (2) follows since the count of dd in n+1n+1 is non-negative. (3) follows from the observation that gg can increase by at most log10(n+1)\lfloor \log_{10}(n+1) \rfloor per step. \square

Theorem 3 (Recursive Search Correctness). The fixed points of g(n)=0g(n) = 0 can be found by binary search on [0,1011][0, 10^{11}]. On interval [lo,hi][lo, hi]:

  • If g(lo)g(lo) and g(hi)g(hi) are both positive and gg cannot cross zero (checked via g(lo)+lohig(lo) + lo \le hi being false), prune.
  • If g(lo)<0g(lo) < 0 and g(hi)<0g(hi) < 0, prune (since gg can decrease by at most 1 per step, the deficit persists).
  • Otherwise, split at midpoint and recurse.

Proof. The pruning conditions are conservative: they only eliminate intervals provably containing no zeros of gg. The recursion terminates when lo=hilo = hi, at which point g(lo)=0g(lo) = 0 is checked directly. Since every fixed point lies in [0,1011][0, 10^{11}] (Theorem 2), completeness follows. \square

Editorial

We else. We then pruning: both negative. Finally, pruning: both positive and no crossing possible. We enumerate the admissible parameter range, discard candidates that violate the derived bounds or arithmetic constraints, and update the final set or total whenever a candidate passes the acceptance test.

Pseudocode

else
Pruning: both negative
Pruning: both positive and no crossing possible

Complexity Analysis

  • Time: Computing f(n,d)f(n, d) takes O(logn)O(\log n). The recursive search visits O(log2N)O(\log^2 N) intervals per digit (since the function gg has bounded variation and crossings are sparse). Total: O(9log3N)O(9 \cdot \log^3 N) where N=1011N = 10^{11}.
  • Space: O(logN)O(\log N) for the recursion stack.

Answer

21295121502550\boxed{21295121502550}

Code

Each problem page includes the exact C++ and Python source files from the local archive.

C++ project_euler/problem_156/solution.cpp
#include <bits/stdc++.h>
using namespace std;

typedef long long ll;

// Count occurrences of digit d in all numbers from 1 to n
ll f(ll n, int d) {
    if (n <= 0) return 0;
    ll count = 0;
    ll power = 1;
    while (power <= n) {
        ll high = n / (power * 10);
        ll cur = (n / power) % 10;
        ll low = n % power;
        if (cur < d)
            count += high * power;
        else if (cur == d)
            count += high * power + low + 1;
        else
            count += (high + 1) * power;
        power *= 10;
    }
    return count;
}

ll total_sum = 0;

// g(n) = f(n,d) - n
// We want all n where g(n) = 0.
// f(n,d) is non-decreasing. f(n+1,d) - f(n,d) = number of d's in (n+1), which is 0..12ish.
// So g(n+1) - g(n) = (count of d in n+1) - 1, range [-1, ~11].
// g can oscillate but has long-term trend toward -infinity for large n.

void solve(ll lo, ll hi, int d) {
    if (lo > hi) return;

    ll glo = f(lo, d) - lo;
    ll ghi = f(hi, d) - hi;

    // Pruning based on bounds:
    // In interval [lo, hi], g can decrease by at most 1 per step (when digit d doesn't appear).
    // So min of g in [lo, hi] >= glo - (hi - lo) and also >= ghi - (hi - lo).
    // max of g in [lo, hi]: g can increase by at most (max_digits - 1) per step.
    // But a simpler bound: f is non-decreasing, so f(n,d) >= f(lo,d) for n >= lo.
    // g(n) = f(n,d) - n >= f(lo,d) - n >= f(lo,d) - hi = glo - (hi - lo).
    // Also g(n) = f(n,d) - n <= f(hi,d) - n <= f(hi,d) - lo = ghi + (hi - lo).

    ll g_min_bound = min(glo, ghi);  // not tight but...
    ll g_max_bound = max(glo, ghi);

    // Tighter: g(n) >= f(lo,d) - hi = glo + lo - hi = glo - (hi - lo)
    ll g_lower = glo - (hi - lo);
    // g(n) <= f(hi,d) - lo = ghi + hi - lo
    ll g_upper = ghi + (hi - lo);

    // If g is always positive or always negative, skip
    if (g_lower > 0) return;  // g(n) > 0 for all n in [lo, hi]
    if (g_upper < 0) return;  // g(n) < 0 for all n in [lo, hi]

    if (lo == hi) {
        if (glo == 0) {
            total_sum += lo;
        }
        return;
    }
    ll mid = lo + (hi - lo) / 2;
    solve(lo, mid, d);
    solve(mid + 1, hi, d);
}

int main() {
    ios::sync_with_stdio(false);
    ll LIMIT = 200000000000LL; // 2 * 10^11
    for (int d = 1; d <= 9; d++) {
        solve(1, LIMIT, d);
    }
    cout << total_sum << endl;
    return 0;
}