ICPC 2021 - J. Splitstream
The network starts with the sequence 1,2,3,,m. Each node either: [leftmargin=*] splits one sequence into odd and even positions, or merges two sequences by alternating their elements. For each query (x,k), we must rep...
Source-first archive entry
This page is built from the copied files in competitive_programming/icpc/2021/J-splitstream. Edit
competitive_programming/icpc/2021/J-splitstream/solution.tex to update the written solution and
competitive_programming/icpc/2021/J-splitstream/solution.cpp to update the implementation.
The website does not replace those files with hand-maintained HTML. It reads the copied source tree during the build and exposes the exact files below.
Problem Statement
Copied statement text kept beside the solution archive for direct reference.
Problem J
Splitstream
Time limit: 3 seconds
A splitstream system is an acyclic network of nodes that processes
finite sequences of numbers. There are two types of nodes (illustrated
in Figure J.1):
• A split node takes a sequence of numbers as input and dis-
tributes them alternatingly to its two outputs. The first number
goes to output 1, the second to output 2, the third to output 1,
the fourth to output 2, and so on, in this order.
• A merge node takes two sequences of numbers as input and
merges them alternatingly to form its single output. The output
contains the first number from input 1, then the first from input
2, then the second from input 1, then the second from input
2, and so on. If one of the input sequences is shorter than the
other, then the remaining numbers from the longer sequence
are simply transmitted without being merged after the shorter
Ganga-Brahmaputra delta
sequence has been exhausted. (ESA, CC BY-SA 3.0 IGO)
Figure J.1: Illustration of how split and merge nodes work.
The overall network has one input, which is the sequence of positive integers 1, 2, 3, . . . , m. Any output
of any node can be queried. A query will seek to identify the k th number in the sequence of numbers for
a given output and a given k. Your task is to implement such queries efficiently.
Input
The first line of input contains three integers m, n, and q, where m (1 ≤ m ≤ 109 ) is the length of the
input sequence, n (1 ≤ n ≤ 104 ) is the number of nodes, and q (1 ≤ q ≤ 103 ) is the number of queries.
The next n lines describe the network, one node per line. A split node has the format S x y z, where
x, y and z identify its input, first output and second output, respectively. A merge node has the format
M x y z, where x, y and z identify its first input, second input and output, respectively. Identifiers x, y
and z are distinct positive integers. The overall input is identified by 1, and the remaining input/output
identifiers form a consecutive sequence beginning at 2. Every input identifier except 1 appears as exactly
one output. Every output identifier appears as the input of at most one node.
Each of the next q lines describes a query. Each query consists of two integers x and k, where x
(2 ≤ x ≤ 105 ) is a valid output identifier and k (1 ≤ k ≤ 109 ) is the index of the desired number in that
sequence. Indexing in a sequence starts with 1.
Output
For each query x and k output one line with the k th number in the output sequence identified by x, or
none if there is no element with that index number.
Sample Input 1 Sample Output 1
200 2 2 100
S 1 2 3 99
M 3 2 4
4 99
4 100
Sample Input 2 Sample Output 2
100 3 6 47
S 1 4 2 98
S 2 3 5 49
M 3 4 6 51
6 48 53
6 49 100
6 50
6 51
6 52
5 25
Sample Input 3 Sample Output 3
2 3 3 2
S 1 2 3 none
S 3 4 5 none
M 5 2 6
3 1
5 1
6 2
Editorial
Rendered from the copied solution.tex file. The original TeX source remains
available below.
Main Observation
We never need to materialize any sequence.
If we know the length of every wire, then for a query $(x,k)$ we can walk backwards through the network until we reach the original input wire $1$.
Lengths of Wires
Let $\ell(w)$ be the length of wire $w$.
Split node.
If a split node reads input $x$ and writes outputs $y$ and $z$, then: \[ \ell(y)=\left\lceil\frac{\ell(x)}{2}\right\rceil, \qquad \ell(z)=\left\lfloor\frac{\ell(x)}{2}\right\rfloor. \]
Merge node.
If a merge node reads inputs $x$ and $y$ and writes output $z$, then: \[ \ell(z)=\ell(x)+\ell(y). \]
These values can be computed with memoized DFS because the network is acyclic.
Tracing One Query Backwards
Split Output
Suppose a split node takes input sequence \[ a_1,a_2,a_3,\dots \] Then:
output 1 is \[ a_1,a_3,a_5,\dots \]
output 2 is \[ a_2,a_4,a_6,\dots \]
So:
the $k$th element of output 1 is the $(2k-1)$th element of the input;
the $k$th element of output 2 is the $(2k)$th element of the input.
Merge Output
Suppose a merge node takes sequences \[ a_1,a_2,\dots,a_p \qquad\text{and}\qquad b_1,b_2,\dots,b_q. \] Its output is \[ a_1,b_1,a_2,b_2,\dots \] until one sequence ends, and then the remaining elements of the longer sequence continue unchanged.
Therefore:
while $k \le 2\min(p,q)$:
odd $k$ comes from the first input at index $(k+1)/2$;
even $k$ comes from the second input at index $k/2$.
after that, if $p>q$, the remaining elements come from the first input at index $k-q$;
if $q>p$, they come from the second input at index $k-p$.
Algorithm
Parse all nodes and record, for each output wire, which node created it.
Compute wire lengths lazily with memoized DFS.
For each query $(x,k)$:
if $k > \ell(x)$, print
none;otherwise repeatedly replace $(x,k)$ by the corresponding predecessor wire and predecessor index, using the rules above, until $x=1$.
On wire $1$, the sequence is simply $1,2,\dots,m$, so the answer is $k$ itself. enumerate
Correctness Proof
We prove that the algorithm answers every query correctly.
Lemma 1.
For every wire, the memoized formulas compute its correct length.
Proof.
The formulas follow directly from the definitions of split and merge nodes:
a split sends odd-positioned elements to one output and even-positioned elements to the other;
a merge outputs every element from both inputs exactly once.
Since the network is acyclic, recursively applying these formulas reaches the base wire $1$ and is well-defined. □
Lemma 2.
For a split node, the backward index transformation used by the algorithm is correct.
Proof.
By definition of the split operation, output 1 contains exactly the odd-indexed elements of the input, in the same order. So its $k$th element is input element $2k-1$. Likewise output 2 contains exactly the even-indexed input elements, so its $k$th element is input element $2k$. □
Lemma 3.
For a merge node, the backward index transformation used by the algorithm is correct.
Proof.
The merge node alternates elements from the two inputs as long as both still have elements. Therefore the first $2\min(p,q)$ output positions correspond exactly to alternating positions from the two inputs. After the shorter input is exhausted, the output continues with the remaining suffix of the longer input without any further mixing. The backward formulas are exactly these cases written explicitly. □
Theorem.
For every query $(x,k)$, the algorithm outputs the correct $k$th element of wire $x$, or
noneif that element does not exist.Proof.
If $k > \ell(x)$, then wire $x$ has fewer than $k$ elements, so
noneis correct by Lemma 1.Otherwise, by repeatedly applying Lemmas 2 and 3, the algorithm transforms the query to an equivalent query on the predecessor wire of the current node. Because the network is acyclic, this process eventually reaches wire $1$. On wire $1$, the $k$th element is exactly $k$. Thus the returned value is the unique value that maps through the network to the original query position. □
Complexity Analysis
Let $N$ be the number of nodes and $Q$ the number of queries.
Each wire length is computed at most once, so the total preprocessing is $O(N)$. Each query walks upward through at most one path in the DAG, so one query costs $O(N)$ in the worst case, which is easily fast enough for the given limits. Memory usage is $O(N)$.
Implementation Notes
Wire lengths fit in 64-bit integers because no wire can contain more than the original $m$ values.
The answer on wire $1$ is just the queried index itself.
Code
Exact copied C++ implementation from solution.cpp.
#include <bits/stdc++.h>
using namespace std;
namespace {
struct Node {
char type;
int a;
int b;
int c;
};
struct WireInfo {
int creator;
int role;
};
long long m;
int n;
int q;
vector<Node> nodes;
vector<WireInfo> wire_info;
vector<long long> length_cache;
long long get_length(int wire) {
if (wire == 1) {
return m;
}
long long& memo = length_cache[wire];
if (memo != -1) {
return memo;
}
const WireInfo& info = wire_info[wire];
const Node& node = nodes[info.creator];
if (node.type == 'S') {
long long input_length = get_length(node.a);
if (info.role == 1) {
memo = (input_length + 1) / 2;
} else {
memo = input_length / 2;
}
} else {
memo = get_length(node.a) + get_length(node.b);
}
return memo;
}
long long query_value(int wire, long long k) {
while (wire != 1) {
const WireInfo& info = wire_info[wire];
const Node& node = nodes[info.creator];
if (node.type == 'S') {
if (info.role == 1) {
k = 2 * k - 1;
} else {
k = 2 * k;
}
wire = node.a;
continue;
}
long long left_length = get_length(node.a);
long long right_length = get_length(node.b);
long long interleaved = 2 * min(left_length, right_length);
if (k <= interleaved) {
if (k % 2 == 1) {
k = (k + 1) / 2;
wire = node.a;
} else {
k /= 2;
wire = node.b;
}
} else if (left_length > right_length) {
k -= right_length;
wire = node.a;
} else {
k -= left_length;
wire = node.b;
}
}
return k;
}
void solve() {
cin >> m >> n >> q;
nodes.resize(n);
int max_wire = 1;
for (int i = 0; i < n; ++i) {
cin >> nodes[i].type >> nodes[i].a >> nodes[i].b >> nodes[i].c;
max_wire = max(max_wire, max(nodes[i].a, max(nodes[i].b, nodes[i].c)));
}
vector<pair<int, long long>> queries(q);
for (int i = 0; i < q; ++i) {
cin >> queries[i].first >> queries[i].second;
max_wire = max(max_wire, queries[i].first);
}
wire_info.assign(max_wire + 1, {-1, 0});
for (int i = 0; i < n; ++i) {
const Node& node = nodes[i];
if (node.type == 'S') {
wire_info[node.b] = {i, 1};
wire_info[node.c] = {i, 2};
} else {
wire_info[node.c] = {i, 0};
}
}
length_cache.assign(max_wire + 1, -1);
length_cache[1] = m;
for (const auto& query : queries) {
int wire = query.first;
long long k = query.second;
if (k < 1 || k > get_length(wire)) {
cout << "none\n";
} else {
cout << query_value(wire, k) << '\n';
}
}
}
} // namespace
int main() {
ios::sync_with_stdio(false);
cin.tie(nullptr);
solve();
return 0;
}
Source Files
Exact copied source-of-truth files. Edit solution.tex for the write-up and solution.cpp for the implementation.
\documentclass[11pt]{article}
\usepackage[margin=1in]{geometry}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{amsmath,amssymb,amsthm}
\usepackage{enumitem}
\title{ICPC World Finals 2021\\J. Splitstream}
\author{}
\date{}
\begin{document}
\maketitle
\section*{Problem Summary}
The network starts with the sequence
\[
1,2,3,\dots,m.
\]
Each node either:
\begin{itemize}[leftmargin=*]
\item splits one sequence into odd and even positions, or
\item merges two sequences by alternating their elements.
\end{itemize}
For each query $(x,k)$, we must report the $k$th value on output wire $x$, or \texttt{none} if that
output is shorter than $k$.
\section*{Main Observation}
We never need to materialize any sequence.
If we know the length of every wire, then for a query $(x,k)$ we can walk \emph{backwards} through the
network until we reach the original input wire $1$.
\section*{Lengths of Wires}
Let $\ell(w)$ be the length of wire $w$.
\paragraph{Split node.}
If a split node reads input $x$ and writes outputs $y$ and $z$, then:
\[
\ell(y)=\left\lceil\frac{\ell(x)}{2}\right\rceil,
\qquad
\ell(z)=\left\lfloor\frac{\ell(x)}{2}\right\rfloor.
\]
\paragraph{Merge node.}
If a merge node reads inputs $x$ and $y$ and writes output $z$, then:
\[
\ell(z)=\ell(x)+\ell(y).
\]
These values can be computed with memoized DFS because the network is acyclic.
\section*{Tracing One Query Backwards}
\subsection*{Split Output}
Suppose a split node takes input sequence
\[
a_1,a_2,a_3,\dots
\]
Then:
\begin{itemize}[leftmargin=*]
\item output 1 is
\[
a_1,a_3,a_5,\dots
\]
\item output 2 is
\[
a_2,a_4,a_6,\dots
\]
\end{itemize}
So:
\begin{itemize}[leftmargin=*]
\item the $k$th element of output 1 is the $(2k-1)$th element of the input;
\item the $k$th element of output 2 is the $(2k)$th element of the input.
\end{itemize}
\subsection*{Merge Output}
Suppose a merge node takes sequences
\[
a_1,a_2,\dots,a_p
\qquad\text{and}\qquad
b_1,b_2,\dots,b_q.
\]
Its output is
\[
a_1,b_1,a_2,b_2,\dots
\]
until one sequence ends, and then the remaining elements of the longer sequence continue unchanged.
Therefore:
\begin{itemize}[leftmargin=*]
\item while $k \le 2\min(p,q)$:
\begin{itemize}[leftmargin=*]
\item odd $k$ comes from the first input at index $(k+1)/2$;
\item even $k$ comes from the second input at index $k/2$.
\end{itemize}
\item after that, if $p>q$, the remaining elements come from the first input at index $k-q$;
\item if $q>p$, they come from the second input at index $k-p$.
\end{itemize}
\section*{Algorithm}
\begin{enumerate}[leftmargin=*]
\item Parse all nodes and record, for each output wire, which node created it.
\item Compute wire lengths lazily with memoized DFS.
\item For each query $(x,k)$:
\begin{itemize}[leftmargin=*]
\item if $k > \ell(x)$, print \texttt{none};
\item otherwise repeatedly replace $(x,k)$ by the corresponding predecessor wire and predecessor
index, using the rules above, until $x=1$.
\end{itemize}
\item On wire $1$, the sequence is simply $1,2,\dots,m$, so the answer is $k$ itself.
\end{enumerate}
\section*{Correctness Proof}
We prove that the algorithm answers every query correctly.
\paragraph{Lemma 1.}
For every wire, the memoized formulas compute its correct length.
\paragraph{Proof.}
The formulas follow directly from the definitions of split and merge nodes:
\begin{itemize}[leftmargin=*]
\item a split sends odd-positioned elements to one output and even-positioned elements to the other;
\item a merge outputs every element from both inputs exactly once.
\end{itemize}
Since the network is acyclic, recursively applying these formulas reaches the base wire $1$ and is
well-defined. \qed
\paragraph{Lemma 2.}
For a split node, the backward index transformation used by the algorithm is correct.
\paragraph{Proof.}
By definition of the split operation, output 1 contains exactly the odd-indexed elements of the input, in
the same order. So its $k$th element is input element $2k-1$. Likewise output 2 contains exactly the
even-indexed input elements, so its $k$th element is input element $2k$. \qed
\paragraph{Lemma 3.}
For a merge node, the backward index transformation used by the algorithm is correct.
\paragraph{Proof.}
The merge node alternates elements from the two inputs as long as both still have elements. Therefore the
first $2\min(p,q)$ output positions correspond exactly to alternating positions from the two inputs. After
the shorter input is exhausted, the output continues with the remaining suffix of the longer input without
any further mixing. The backward formulas are exactly these cases written explicitly. \qed
\paragraph{Theorem.}
For every query $(x,k)$, the algorithm outputs the correct $k$th element of wire $x$, or \texttt{none} if
that element does not exist.
\paragraph{Proof.}
If $k > \ell(x)$, then wire $x$ has fewer than $k$ elements, so \texttt{none} is correct by Lemma 1.
Otherwise, by repeatedly applying Lemmas 2 and 3, the algorithm transforms the query to an equivalent
query on the predecessor wire of the current node. Because the network is acyclic, this process
eventually reaches wire $1$. On wire $1$, the $k$th element is exactly $k$. Thus the returned value is
the unique value that maps through the network to the original query position. \qed
\section*{Complexity Analysis}
Let $N$ be the number of nodes and $Q$ the number of queries.
Each wire length is computed at most once, so the total preprocessing is $O(N)$. Each query walks upward
through at most one path in the DAG, so one query costs $O(N)$ in the worst case, which is easily fast
enough for the given limits. Memory usage is $O(N)$.
\section*{Implementation Notes}
\begin{itemize}[leftmargin=*]
\item Wire lengths fit in 64-bit integers because no wire can contain more than the original $m$
values.
\item The answer on wire $1$ is just the queried index itself.
\end{itemize}
\end{document}
#include <bits/stdc++.h>
using namespace std;
namespace {
struct Node {
char type;
int a;
int b;
int c;
};
struct WireInfo {
int creator;
int role;
};
long long m;
int n;
int q;
vector<Node> nodes;
vector<WireInfo> wire_info;
vector<long long> length_cache;
long long get_length(int wire) {
if (wire == 1) {
return m;
}
long long& memo = length_cache[wire];
if (memo != -1) {
return memo;
}
const WireInfo& info = wire_info[wire];
const Node& node = nodes[info.creator];
if (node.type == 'S') {
long long input_length = get_length(node.a);
if (info.role == 1) {
memo = (input_length + 1) / 2;
} else {
memo = input_length / 2;
}
} else {
memo = get_length(node.a) + get_length(node.b);
}
return memo;
}
long long query_value(int wire, long long k) {
while (wire != 1) {
const WireInfo& info = wire_info[wire];
const Node& node = nodes[info.creator];
if (node.type == 'S') {
if (info.role == 1) {
k = 2 * k - 1;
} else {
k = 2 * k;
}
wire = node.a;
continue;
}
long long left_length = get_length(node.a);
long long right_length = get_length(node.b);
long long interleaved = 2 * min(left_length, right_length);
if (k <= interleaved) {
if (k % 2 == 1) {
k = (k + 1) / 2;
wire = node.a;
} else {
k /= 2;
wire = node.b;
}
} else if (left_length > right_length) {
k -= right_length;
wire = node.a;
} else {
k -= left_length;
wire = node.b;
}
}
return k;
}
void solve() {
cin >> m >> n >> q;
nodes.resize(n);
int max_wire = 1;
for (int i = 0; i < n; ++i) {
cin >> nodes[i].type >> nodes[i].a >> nodes[i].b >> nodes[i].c;
max_wire = max(max_wire, max(nodes[i].a, max(nodes[i].b, nodes[i].c)));
}
vector<pair<int, long long>> queries(q);
for (int i = 0; i < q; ++i) {
cin >> queries[i].first >> queries[i].second;
max_wire = max(max_wire, queries[i].first);
}
wire_info.assign(max_wire + 1, {-1, 0});
for (int i = 0; i < n; ++i) {
const Node& node = nodes[i];
if (node.type == 'S') {
wire_info[node.b] = {i, 1};
wire_info[node.c] = {i, 2};
} else {
wire_info[node.c] = {i, 0};
}
}
length_cache.assign(max_wire + 1, -1);
length_cache[1] = m;
for (const auto& query : queries) {
int wire = query.first;
long long k = query.second;
if (k < 1 || k > get_length(wire)) {
cout << "none\n";
} else {
cout << query_value(wire, k) << '\n';
}
}
}
} // namespace
int main() {
ios::sync_with_stdio(false);
cin.tie(nullptr);
solve();
return 0;
}