ICPC 2019 - H. Hobson’s Trains
The graph is functional: every station has exactly one outgoing edge. For each station v, we must count how many starting stations can reach v in at most k legs.
Source-first archive entry
This page is built from the copied files in competitive_programming/icpc/2019/H-hobsons-trains. Edit
competitive_programming/icpc/2019/H-hobsons-trains/solution.tex to update the written solution and
competitive_programming/icpc/2019/H-hobsons-trains/solution.cpp to update the implementation.
The website does not replace those files with hand-maintained HTML. It reads the copied source tree during the build and exposes the exact files below.
Problem Statement
Copied statement text kept beside the solution archive for direct reference.
Problem H
Hobson’s Trains
Time limit: 5 seconds
Mr. Hobson has retired from running a stable and has invested in a more modern form of transport,
trains. He has built a rail network with n stations. However, he has retained his commitment to free
the passenger from the burden of too many choices: from each station, a passenger can catch a train to
exactly one other station. Such a journey is referred to as a leg. Note that this is a one-way journey, and
it might not be possible to get back again.
Hobson also offers exactly one choice of ticket, which allows a passenger to travel up to k legs in one
trip. At the exit from each station is an automated ticket reader (only one, so that passengers do not need
to decide which to use). The reader checks that the distance from the initial station to the final station
does not exceed k legs.
Each ticket reader must be programmed with a list of valid starting stations, but the more memory this
list needs, the more expensive the machine will be. Help Hobson by determining, for each station A, the
number of stations (including A) from which a customer can reach A in at most k legs.
1 1,2 1,2,3,6 2,3,4,5,6 3,4,5
1 2 3 4 5
6
6
Figure H.1: Illustration of Sample Input 1. Each circle represents a station. The numbers
outside the circles are the station numbers loaded into the ticket readers when k = 2.
Input
The first line of input contains two integers n and k, where n (2 ≤ n ≤ 5 · 105 ) is the number of stations
and k (1 ≤ k ≤ n − 1) is the maximum number of legs that may be traveled on a ticket. Then follow n
lines, the ith of which contains an integer di (1 ≤ di ≤ n and di 6= i), the station which may be reached
from station i in one leg.
Output
Output n lines, with the ith line containing the number of stations from which station i can be reached
in at most k legs.
Sample Input 1 Sample Output 1
6 2 1
2 2
3 4
4 5
5 3
4 1
3
Sample Input 2 Sample Output 2
5 3 3
2 3
3 3
1 2
5 2
4
Editorial
Rendered from the copied solution.tex file. The original TeX source remains
available below.
Key Observations
Every functional graph decomposes into directed cycles with directed trees feeding into the cycle vertices.
If $v$ is a tree node, then only nodes in the reverse-tree subtree of $v$ can ever reach $v$.
If $v$ is a cycle node, then reachable starts come from all attached trees around that cycle: a node in the tree attached to cycle vertex $c_i$ can reach cycle vertex $c_j$ iff \[ \text{depth-to-}c_i + \text{forward distance on the cycle from } c_i \text{ to } c_j \le k. \]
Algorithm
1. Find the cycle nodes.
Use the standard indegree-peeling process for functional graphs. Remove every indegree-$0$ vertex, decrease the indegree of its successor, and continue. The remaining vertices are exactly the cycle vertices.
2. Build the reverse forest.
For every cycle vertex, keep only reverse edges coming from non-cycle vertices. This gives a forest rooted at the cycle vertices. Define depth $0$ on the roots, then depth increases by $1$ along reverse-tree edges.
3. Count answers for tree nodes.
Do an Euler tour of the reverse forest. For any node $v$, its reverse-tree subtree is a contiguous Euler interval. A node $u$ in that subtree reaches $v$ within $k$ legs exactly when \[ \text{depth}(u) \le \text{depth}(v) + k. \] So we sort all nodes by depth, sort all queries $v$ by the threshold $\text{depth}(v)+k$, and answer subtree counts with a Fenwick tree over Euler order.
4. Count answers for cycle nodes.
Process each directed cycle separately. For every node $u$ attached to cycle vertex $c_i$ (including $c_i$ itself), let $d$ be its depth to the cycle. Then $u$ contributes $1$ to every cycle vertex on the forward arc of length \[ \min(m-1,\, k-d), \] starting at $c_i$, where $m$ is the cycle length. Add these contributions with a difference array on the doubled cycle, then read the total for each cycle vertex.
Correctness Proof
We prove that the algorithm outputs the correct count for every station.
Lemma 1.
If $v$ is not on a cycle, then a starting station can reach $v$ only if it lies in the reverse-tree subtree of $v$.
Proof.
Every node has exactly one outgoing edge. Once a path leaves the unique reverse-tree path toward the cycle, it can never come back to $v$. Hence any start that reaches $v$ must lie on the reverse-tree descendants of $v$. □
Lemma 2.
For a tree node $v$, a node $u$ reaches $v$ in at most $k$ legs if and only if $u$ is in the reverse-tree subtree of $v$ and \[ \text{depth}(u) \le \text{depth}(v) + k. \]
Proof.
Inside the reverse tree, the path from $u$ to $v$ is unique and has length $\text{depth}(u)-\text{depth}(v)$. Therefore the condition is exactly that $u$ is a descendant of $v$ and this difference is at most $k$. □
Lemma 3.
For a cycle vertex $c_j$, a node $u$ attached to cycle vertex $c_i$ reaches $c_j$ in at most $k$ legs if and only if \[ \text{depth}(u) + \operatorname{dist}_{\text{cycle}}(c_i, c_j) \le k, \] where the cycle distance is measured forward along the directed cycle.
Proof.
The path from $u$ first climbs uniquely to $c_i$ in $\text{depth}(u)$ steps, then continues deterministically forward around the cycle. The first time it reaches $c_j$ is after exactly the stated number of steps, and any later visit is longer by a whole cycle length, so the inequality above is necessary and sufficient. □
Theorem.
The algorithm outputs, for every station $v$, the number of starting stations that can reach $v$ in at most $k$ legs.
Proof.
By Lemma 2, the Fenwick-tree phase counts exactly the valid starts for every tree node. By Lemma 3, the cycle phase adds exactly the valid starts for every cycle node via arc updates on that cycle. Since every station is either a tree node or a cycle node, all answers are correct. □
Complexity Analysis
Indegree peeling, Euler tour, sorting, Fenwick processing, and cycle interval updates are all linear or near-linear. The total running time is $O(n \log n)$ because of the sorting and Fenwick tree operations, and the memory usage is $O(n)$.
Implementation Notes
The reverse forest includes each cycle node as a root, so the subtree computation also counts the nodes in its own attached tree.
Cycle answers are computed from scratch with interval additions, so the root values from the tree phase are overwritten for cycle nodes.
Using a doubled cycle turns wrapped arcs into ordinary intervals.
Code
Exact copied C++ implementation from solution.cpp.
#include <bits/stdc++.h>
using namespace std;
namespace {
struct Fenwick {
int n;
vector<int> bit;
explicit Fenwick(int n_) : n(n_), bit(n_ + 1, 0) {}
void add(int idx, int val) {
for (; idx <= n; idx += idx & -idx) {
bit[idx] += val;
}
}
int sum(int idx) const {
int res = 0;
for (; idx > 0; idx -= idx & -idx) {
res += bit[idx];
}
return res;
}
int range_sum(int l, int r) const {
return sum(r) - sum(l - 1);
}
};
void solve() {
int n, k;
cin >> n >> k;
vector<int> to(n + 1), indeg(n + 1, 0);
vector<vector<int> > rev(n + 1);
for (int i = 1; i <= n; ++i) {
cin >> to[i];
rev[to[i]].push_back(i);
++indeg[to[i]];
}
queue<int> q;
vector<char> in_cycle(n + 1, true);
for (int i = 1; i <= n; ++i) {
if (indeg[i] == 0) {
q.push(i);
}
}
while (!q.empty()) {
int v = q.front();
q.pop();
in_cycle[v] = false;
int nxt = to[v];
if (--indeg[nxt] == 0) {
q.push(nxt);
}
}
vector<vector<int> > cycles;
vector<char> seen_cycle(n + 1, false);
vector<int> cycle_id(n + 1, -1), pos_in_cycle(n + 1, -1);
for (int i = 1; i <= n; ++i) {
if (!in_cycle[i] || seen_cycle[i]) {
continue;
}
vector<int> cyc;
int v = i;
while (!seen_cycle[v]) {
seen_cycle[v] = true;
pos_in_cycle[v] = (int)cyc.size();
cycle_id[v] = (int)cycles.size();
cyc.push_back(v);
v = to[v];
}
cycles.push_back(cyc);
}
vector<int> root_cycle(n + 1, 0);
vector<int> depth(n + 1, 0);
vector<int> tin(n + 1), tout(n + 1), euler(1, 0);
vector<vector<int> > attached(n + 1);
int timer = 0;
for (const vector<int>& cyc : cycles) {
for (int root : cyc) {
vector<pair<int, int> > st;
st.push_back(make_pair(root, 0));
root_cycle[root] = root;
depth[root] = 0;
while (!st.empty()) {
int v = st.back().first;
int& idx = st.back().second;
if (idx == 0) {
tin[v] = ++timer;
euler.push_back(v);
attached[root].push_back(v);
}
if (idx < (int)rev[v].size()) {
int u = rev[v][idx++];
if (in_cycle[u]) {
continue;
}
root_cycle[u] = root;
depth[u] = depth[v] + 1;
st.push_back(make_pair(u, 0));
} else {
tout[v] = timer;
st.pop_back();
}
}
}
}
vector<int> nodes(n);
iota(nodes.begin(), nodes.end(), 1);
sort(nodes.begin(), nodes.end(), [&](int a, int b) {
return depth[a] < depth[b];
});
vector<int> query_nodes(n);
iota(query_nodes.begin(), query_nodes.end(), 1);
sort(query_nodes.begin(), query_nodes.end(), [&](int a, int b) {
return depth[a] + k < depth[b] + k;
});
vector<int> answer(n + 1, 0);
Fenwick fw(n);
int ptr = 0;
for (int v : query_nodes) {
int limit = depth[v] + k;
while (ptr < n && depth[nodes[ptr]] <= limit) {
fw.add(tin[nodes[ptr]], 1);
++ptr;
}
answer[v] = fw.range_sum(tin[v], tout[v]);
}
for (const vector<int>& cyc : cycles) {
int m = (int)cyc.size();
vector<int> diff(2 * m + 1, 0);
for (int idx = 0; idx < m; ++idx) {
int root = cyc[idx];
for (int v : attached[root]) {
if (depth[v] > k) {
continue;
}
int len = min(m - 1, k - depth[v]);
++diff[idx];
--diff[idx + len + 1];
}
}
vector<int> pref(2 * m, 0);
int cur = 0;
for (int i = 0; i < 2 * m; ++i) {
cur += diff[i];
pref[i] = cur;
}
for (int idx = 0; idx < m; ++idx) {
answer[cyc[idx]] = pref[idx] + pref[idx + m];
}
}
for (int i = 1; i <= n; ++i) {
cout << answer[i] << '\n';
}
}
} // namespace
int main() {
ios::sync_with_stdio(false);
cin.tie(nullptr);
solve();
return 0;
}
Source Files
Exact copied source-of-truth files. Edit solution.tex for the write-up and solution.cpp for the implementation.
\documentclass[11pt]{article}
\usepackage[margin=1in]{geometry}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{amsmath,amssymb,amsthm}
\usepackage{enumitem}
\title{ICPC World Finals 2019\\H. Hobson's Trains}
\author{}
\date{}
\begin{document}
\maketitle
\section*{Problem Summary}
The graph is functional: every station has exactly one outgoing edge. For each station $v$, we must count how many starting stations can reach $v$ in at most $k$ legs.
\section*{Key Observations}
\begin{itemize}[leftmargin=*]
\item Every functional graph decomposes into directed cycles with directed trees feeding into the cycle vertices.
\item If $v$ is a tree node, then only nodes in the reverse-tree subtree of $v$ can ever reach $v$.
\item If $v$ is a cycle node, then reachable starts come from all attached trees around that cycle: a node in the tree attached to cycle vertex $c_i$ can reach cycle vertex $c_j$ iff
\[
\text{depth-to-}c_i + \text{forward distance on the cycle from } c_i \text{ to } c_j \le k.
\]
\end{itemize}
\section*{Algorithm}
\paragraph{1. Find the cycle nodes.}
Use the standard indegree-peeling process for functional graphs. Remove every indegree-$0$ vertex, decrease the indegree of its successor, and continue. The remaining vertices are exactly the cycle vertices.
\paragraph{2. Build the reverse forest.}
For every cycle vertex, keep only reverse edges coming from non-cycle vertices. This gives a forest rooted at the cycle vertices. Define depth $0$ on the roots, then depth increases by $1$ along reverse-tree edges.
\paragraph{3. Count answers for tree nodes.}
Do an Euler tour of the reverse forest. For any node $v$, its reverse-tree subtree is a contiguous Euler interval. A node $u$ in that subtree reaches $v$ within $k$ legs exactly when
\[
\text{depth}(u) \le \text{depth}(v) + k.
\]
So we sort all nodes by depth, sort all queries $v$ by the threshold $\text{depth}(v)+k$, and answer subtree counts with a Fenwick tree over Euler order.
\paragraph{4. Count answers for cycle nodes.}
Process each directed cycle separately. For every node $u$ attached to cycle vertex $c_i$ (including $c_i$ itself), let $d$ be its depth to the cycle. Then $u$ contributes $1$ to every cycle vertex on the forward arc of length
\[
\min(m-1,\, k-d),
\]
starting at $c_i$, where $m$ is the cycle length. Add these contributions with a difference array on the doubled cycle, then read the total for each cycle vertex.
\section*{Correctness Proof}
We prove that the algorithm outputs the correct count for every station.
\paragraph{Lemma 1.}
If $v$ is not on a cycle, then a starting station can reach $v$ only if it lies in the reverse-tree subtree of $v$.
\paragraph{Proof.}
Every node has exactly one outgoing edge. Once a path leaves the unique reverse-tree path toward the cycle, it can never come back to $v$. Hence any start that reaches $v$ must lie on the reverse-tree descendants of $v$. \qed
\paragraph{Lemma 2.}
For a tree node $v$, a node $u$ reaches $v$ in at most $k$ legs if and only if $u$ is in the reverse-tree subtree of $v$ and
\[
\text{depth}(u) \le \text{depth}(v) + k.
\]
\paragraph{Proof.}
Inside the reverse tree, the path from $u$ to $v$ is unique and has length $\text{depth}(u)-\text{depth}(v)$. Therefore the condition is exactly that $u$ is a descendant of $v$ and this difference is at most $k$. \qed
\paragraph{Lemma 3.}
For a cycle vertex $c_j$, a node $u$ attached to cycle vertex $c_i$ reaches $c_j$ in at most $k$ legs if and only if
\[
\text{depth}(u) + \operatorname{dist}_{\text{cycle}}(c_i, c_j) \le k,
\]
where the cycle distance is measured forward along the directed cycle.
\paragraph{Proof.}
The path from $u$ first climbs uniquely to $c_i$ in $\text{depth}(u)$ steps, then continues deterministically forward around the cycle. The first time it reaches $c_j$ is after exactly the stated number of steps, and any later visit is longer by a whole cycle length, so the inequality above is necessary and sufficient. \qed
\paragraph{Theorem.}
The algorithm outputs, for every station $v$, the number of starting stations that can reach $v$ in at most $k$ legs.
\paragraph{Proof.}
By Lemma 2, the Fenwick-tree phase counts exactly the valid starts for every tree node. By Lemma 3, the cycle phase adds exactly the valid starts for every cycle node via arc updates on that cycle. Since every station is either a tree node or a cycle node, all answers are correct. \qed
\section*{Complexity Analysis}
Indegree peeling, Euler tour, sorting, Fenwick processing, and cycle interval updates are all linear or near-linear. The total running time is $O(n \log n)$ because of the sorting and Fenwick tree operations, and the memory usage is $O(n)$.
\section*{Implementation Notes}
\begin{itemize}[leftmargin=*]
\item The reverse forest includes each cycle node as a root, so the subtree computation also counts the nodes in its own attached tree.
\item Cycle answers are computed from scratch with interval additions, so the root values from the tree phase are overwritten for cycle nodes.
\item Using a doubled cycle turns wrapped arcs into ordinary intervals.
\end{itemize}
\end{document}
#include <bits/stdc++.h>
using namespace std;
namespace {
struct Fenwick {
int n;
vector<int> bit;
explicit Fenwick(int n_) : n(n_), bit(n_ + 1, 0) {}
void add(int idx, int val) {
for (; idx <= n; idx += idx & -idx) {
bit[idx] += val;
}
}
int sum(int idx) const {
int res = 0;
for (; idx > 0; idx -= idx & -idx) {
res += bit[idx];
}
return res;
}
int range_sum(int l, int r) const {
return sum(r) - sum(l - 1);
}
};
void solve() {
int n, k;
cin >> n >> k;
vector<int> to(n + 1), indeg(n + 1, 0);
vector<vector<int> > rev(n + 1);
for (int i = 1; i <= n; ++i) {
cin >> to[i];
rev[to[i]].push_back(i);
++indeg[to[i]];
}
queue<int> q;
vector<char> in_cycle(n + 1, true);
for (int i = 1; i <= n; ++i) {
if (indeg[i] == 0) {
q.push(i);
}
}
while (!q.empty()) {
int v = q.front();
q.pop();
in_cycle[v] = false;
int nxt = to[v];
if (--indeg[nxt] == 0) {
q.push(nxt);
}
}
vector<vector<int> > cycles;
vector<char> seen_cycle(n + 1, false);
vector<int> cycle_id(n + 1, -1), pos_in_cycle(n + 1, -1);
for (int i = 1; i <= n; ++i) {
if (!in_cycle[i] || seen_cycle[i]) {
continue;
}
vector<int> cyc;
int v = i;
while (!seen_cycle[v]) {
seen_cycle[v] = true;
pos_in_cycle[v] = (int)cyc.size();
cycle_id[v] = (int)cycles.size();
cyc.push_back(v);
v = to[v];
}
cycles.push_back(cyc);
}
vector<int> root_cycle(n + 1, 0);
vector<int> depth(n + 1, 0);
vector<int> tin(n + 1), tout(n + 1), euler(1, 0);
vector<vector<int> > attached(n + 1);
int timer = 0;
for (const vector<int>& cyc : cycles) {
for (int root : cyc) {
vector<pair<int, int> > st;
st.push_back(make_pair(root, 0));
root_cycle[root] = root;
depth[root] = 0;
while (!st.empty()) {
int v = st.back().first;
int& idx = st.back().second;
if (idx == 0) {
tin[v] = ++timer;
euler.push_back(v);
attached[root].push_back(v);
}
if (idx < (int)rev[v].size()) {
int u = rev[v][idx++];
if (in_cycle[u]) {
continue;
}
root_cycle[u] = root;
depth[u] = depth[v] + 1;
st.push_back(make_pair(u, 0));
} else {
tout[v] = timer;
st.pop_back();
}
}
}
}
vector<int> nodes(n);
iota(nodes.begin(), nodes.end(), 1);
sort(nodes.begin(), nodes.end(), [&](int a, int b) {
return depth[a] < depth[b];
});
vector<int> query_nodes(n);
iota(query_nodes.begin(), query_nodes.end(), 1);
sort(query_nodes.begin(), query_nodes.end(), [&](int a, int b) {
return depth[a] + k < depth[b] + k;
});
vector<int> answer(n + 1, 0);
Fenwick fw(n);
int ptr = 0;
for (int v : query_nodes) {
int limit = depth[v] + k;
while (ptr < n && depth[nodes[ptr]] <= limit) {
fw.add(tin[nodes[ptr]], 1);
++ptr;
}
answer[v] = fw.range_sum(tin[v], tout[v]);
}
for (const vector<int>& cyc : cycles) {
int m = (int)cyc.size();
vector<int> diff(2 * m + 1, 0);
for (int idx = 0; idx < m; ++idx) {
int root = cyc[idx];
for (int v : attached[root]) {
if (depth[v] > k) {
continue;
}
int len = min(m - 1, k - depth[v]);
++diff[idx];
--diff[idx + len + 1];
}
}
vector<int> pref(2 * m, 0);
int cur = 0;
for (int i = 0; i < 2 * m; ++i) {
cur += diff[i];
pref[i] = cur;
}
for (int idx = 0; idx < m; ++idx) {
answer[cyc[idx]] = pref[idx] + pref[idx + m];
}
}
for (int i = 1; i <= n; ++i) {
cout << answer[i] << '\n';
}
}
} // namespace
int main() {
ios::sync_with_stdio(false);
cin.tie(nullptr);
solve();
return 0;
}