All ICPC entries
Competitive Programming

ICPC 2022 - Y. Compression

We are given a binary string and may repeatedly apply the allowed compression operation. We must output a shortest string that can be reached.

Source sync Apr 19, 2026
Track ICPC
Year 2022
Files TeX, C++, statement assets
Folder competitive_programming/icpc/2022/Y-compression
ICPC2022TeXC++statement textstatement pdf

Source-first archive entry

This page is built from the copied files in competitive_programming/icpc/2022/Y-compression. Edit competitive_programming/icpc/2022/Y-compression/solution.tex to update the written solution and competitive_programming/icpc/2022/Y-compression/solution.cpp to update the implementation.

The website does not replace those files with hand-maintained HTML. It reads the copied source tree during the build and exposes the exact files below.

Problem Statement

Copied statement text kept beside the solution archive for direct reference.

World Finals | ICPC 2023 Luxor
     46th Annual                                                                                     hosted by
     ICPC World
   Championship                                                                                      AASTMT

                                              Problem Y
                                            Compression
                                        Time limit: 1 second
Infinite Compression Plan Consortium (ICPC) has developed a
new, great data compression strategy, called “Don’t Repeat Your-
self” (DRY). DRY works on a string of characters, and if the
string contains two consecutive instances of the same substring,
it simply removes one of them. For example, in the string
“alfalfa seeds”, it could remove one of the two “e” sub-
strings in “seeds”, and one of the two “lfa” substrings in
“alfalfa”, resulting in “alfa seds”. DRY can also take
advantage of previous removals — for instance, in the string
“seventeenth baggage”, it will first remove the duplicate
“e” in “seventeenth” and the duplicate “g” in “baggage”,         Image adapted from Wikimedia Commons, public domain

resulting in “sevententh bagage”, and then remove the du-
plicate “ent” in “sevententh” and “ag” in “bagage”, resulting in “seventh bage”.
If there are multiple choices of repeating consecutive substrings to remove, DRY should choose in a way
that results in the shortest possible final string. For example, in the string “ABBCDCABCDCD”, DRY has
two choices — either removing the repeated “B” near the beginning, or the repeated “CD” at the end. If
the “B” is removed, then DRY will be able to remove the repeated “ABCDC”, resulting in “ABCDCD”,
from which the “CD” at the end will finally be removed, resulting in “ABCD”. However, if DRY removed
the “CD” at the end first, it would be left with “ABBCDCABCD”, from which only the repeated “B” can
be removed to obtain “ABCDCABCD” — and from that string nothing more can be removed. So, the
correct choice for DRY is to begin by compressing the double “B”, and to finally end up with “ABCD”.
ICPC observed that DRY obtains the best results on binary strings — that is, strings composed only of
zeroes and ones. So, it has tasked you with implementing the best possible DRY algorithm working
on binary strings. For any binary string, you should output a shortest string that can be obtained by
repeatedly applying DRY to it.

Input

The input consists of a single line containing a nonempty string of length less than or equal to 105 ,
consisting only of zeroes and ones.

Output

Output one line, containing a shortest possible result of running DRY on the input string. If there is
more than one possible shortest result, any one will be accepted.

 Sample Input 1                                            Sample Output 1
 1111                                                      1

 Sample Input 2                                            Sample Output 2
 101                                                       101

46th ICPC World Championship Problem Y: Compression © ICPC Foundation                                            19

                                 World Finals | ICPC 2023 Luxor
   46th Annual                                                          hosted by
    ICPC World
  Championship                                                          AASTMT

Sample Input 3                               Sample Output 3
10110                                        10

46th ICPC World Championship Problem Y: Compression © ICPC Foundation               20

Editorial

Rendered from the copied solution.tex file. The original TeX source remains available below.

Key Observations

  • The first character never changes.

  • The last character never changes.

  • If the original string contains both 0 and 1, then no sequence of operations can delete all occurrences of one of the two characters.

  • These three invariants already determine the unique shortest answer.

Algorithm

  1. If all characters are equal, output that single character.

  2. Otherwise, both bits occur in the string.

  3. If the first and last characters are different, output exactly those two characters.

  4. If the first and last characters are equal, output \[ \text{first} \;+\; \text{opposite bit} \;+\; \text{last}. \]

Correctness Proof

We prove that the algorithm returns the correct answer.

Lemma 1.

Every reachable string has the same first character and the same last character as the original string.

Proof.

The allowed compression never changes the two endpoints of the current string, so by induction over all operations the first and last characters remain invariant.

Lemma 2.

If the original string contains both 0 and 1, then every reachable string also contains both 0 and 1.

Proof.

This is the third key invariant of the operation: it is impossible to erase all occurrences of one character while the other remains. Therefore any reachable string from a mixed binary string must still contain both bits.

Lemma 3.

The string produced by the algorithm is reachable and no shorter reachable string exists.

Proof.

If the string is constant, a single-character answer is clearly optimal.

Now assume both bits occur. By repeatedly compressing inside equal runs, we can first transform the string into an alternating one. After that, repeated compressions of the front remove two characters at a time while preserving the same endpoint characters. Thus we can always reduce to the shortest alternating string with the same endpoints.

If the endpoints differ, the shortest such alternating string is exactly the length-$2$ string formed by those endpoints. If the endpoints are equal, a length-$2$ string is impossible, and because both bits must remain present by Lemma 2, the shortest possibility is the length-$3$ string first opposite first. Hence the algorithm's answer is reachable and optimal.

Theorem.

The algorithm outputs a shortest reachable string.

Proof.

By Lemma 1 and Lemma 2, any reachable optimum must satisfy the same endpoints and, when the input is mixed, must contain both bits. Lemma 3 shows that the algorithm constructs exactly the shortest string satisfying those necessary conditions. Therefore the answer is correct.

Complexity Analysis

We only scan the string once to test whether all characters are equal. The running time is $O(n)$ and the memory usage is $O(1)$.

Implementation Notes

  • Once the three cases above are identified, the answer can be printed directly without simulating any compression steps.

Code

Exact copied C++ implementation from solution.cpp.

C++ competitive_programming/icpc/2022/Y-compression/solution.cpp

Exact copied implementation source.

Raw file
#include <bits/stdc++.h>
using namespace std;

namespace {

void solve() {
    string s;
    cin >> s;

    bool all_same = true;
    for (char ch : s) {
        if (ch != s[0]) {
            all_same = false;
            break;
        }
    }

    if (all_same) {
        cout << s[0] << '\n';
        return;
    }

    if (s.front() != s.back()) {
        cout << s.front() << s.back() << '\n';
        return;
    }

    char middle = (s.front() == '0' ? '1' : '0');
    cout << s.front() << middle << s.back() << '\n';
}

}  // namespace

int main() {
    ios::sync_with_stdio(false);
    cin.tie(nullptr);

    solve();
    return 0;
}

Source Files

Exact copied source-of-truth files. Edit solution.tex for the write-up and solution.cpp for the implementation.

TeX write-up competitive_programming/icpc/2022/Y-compression/solution.tex

Exact copied write-up source.

Raw file
\documentclass[11pt]{article}
\usepackage[margin=1in]{geometry}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{amsmath,amssymb,amsthm}
\usepackage{enumitem}

\title{ICPC World Finals 2022\\Y. Compression}
\author{}
\date{}

\begin{document}
\maketitle

\section*{Problem Summary}

We are given a binary string and may repeatedly apply the allowed compression operation. We must
output a shortest string that can be reached.

\section*{Key Observations}

\begin{itemize}[leftmargin=*]
    \item The first character never changes.
    \item The last character never changes.
    \item If the original string contains both \texttt{0} and \texttt{1}, then no sequence of operations can
    delete all occurrences of one of the two characters.
    \item These three invariants already determine the unique shortest answer.
\end{itemize}

\section*{Algorithm}

\begin{enumerate}[leftmargin=*]
    \item If all characters are equal, output that single character.
    \item Otherwise, both bits occur in the string.
    \item If the first and last characters are different, output exactly those two characters.
    \item If the first and last characters are equal, output
    \[
    \text{first} \;+\; \text{opposite bit} \;+\; \text{last}.
    \]
\end{enumerate}

\section*{Correctness Proof}

We prove that the algorithm returns the correct answer.

\paragraph{Lemma 1.}
Every reachable string has the same first character and the same last character as the original
string.

\paragraph{Proof.}
The allowed compression never changes the two endpoints of the current string, so by induction
over all operations the first and last characters remain invariant. \qed

\paragraph{Lemma 2.}
If the original string contains both \texttt{0} and \texttt{1}, then every reachable string also contains
both \texttt{0} and \texttt{1}.

\paragraph{Proof.}
This is the third key invariant of the operation: it is impossible to erase all occurrences of one
character while the other remains. Therefore any reachable string from a mixed binary string must
still contain both bits. \qed

\paragraph{Lemma 3.}
The string produced by the algorithm is reachable and no shorter reachable string exists.

\paragraph{Proof.}
If the string is constant, a single-character answer is clearly optimal.

Now assume both bits occur. By repeatedly compressing inside equal runs, we can first transform
the string into an alternating one. After that, repeated compressions of the front remove two
characters at a time while preserving the same endpoint characters. Thus we can always reduce to
the shortest alternating string with the same endpoints.

If the endpoints differ, the shortest such alternating string is exactly the length-$2$ string formed
by those endpoints. If the endpoints are equal, a length-$2$ string is impossible, and because both
bits must remain present by Lemma 2, the shortest possibility is the length-$3$ string
\texttt{first opposite first}. Hence the algorithm's answer is reachable and optimal. \qed

\paragraph{Theorem.}
The algorithm outputs a shortest reachable string.

\paragraph{Proof.}
By Lemma 1 and Lemma 2, any reachable optimum must satisfy the same endpoints and, when the
input is mixed, must contain both bits. Lemma 3 shows that the algorithm constructs exactly the
shortest string satisfying those necessary conditions. Therefore the answer is correct. \qed

\section*{Complexity Analysis}

We only scan the string once to test whether all characters are equal. The running time is $O(n)$
and the memory usage is $O(1)$.

\section*{Implementation Notes}

\begin{itemize}[leftmargin=*]
    \item Once the three cases above are identified, the answer can be printed directly without
    simulating any compression steps.
\end{itemize}

\end{document}