All Euler problems
Project Euler

Roman Numerals

The Project Euler data file contains one thousand Roman numerals written in valid (but not necessarily minimal) form. Find the number of characters saved by rewriting each numeral in its minimal form.

Source sync Apr 19, 2026
Problem #0089
Level Level 03
Solved By 23,746
Languages C++, Python
Answer 743
Length 476 words
brute_forcesearchsequence

Problem Statement

This archive keeps the full statement, math, and original media on the page.

For a number written in Roman numerals to be considered valid there are basic rules which must be followed. Even though the rules allow some numbers to be expressed in more than one way there is always a "best" way of writing a particular number.

For example, it would appear that there are at least six ways of writing the number sixteen:

IIIIIIIIIIIIIIII
VIIIIIIIIIII
VVIIIIII
XIIIIII
VVVI
XVI

However, according to the rules only XIIIIII and XVI are valid, and the last example is considered to be the most efficient, as it uses the least number of numerals.

The 11K text file, roman.txt (right click and 'Save Link/Target As...'), contains one thousand numbers written in valid, but not necessarily minimal, Roman numerals; see About... Roman Numerals for the definitive rules for this problem.

Find the number of characters saved by writing each of these in their minimal form.

Note: You can assume that all the Roman numerals in the file contain no more than four consecutive identical units.

Problem 89: Roman Numerals

Mathematical Analysis

Roman Numeral Rules

The seven Roman numeral symbols and their values:

SymbolValue
I1
V5
X10
L50
C100
D500
M1000

Subtractive Notation (Minimal Form)

In minimal form, subtractive notation is used:

  • IV = 4, IX = 9
  • XL = 40, XC = 90
  • CD = 400, CM = 900

Parsing Roman Numerals

To convert a Roman numeral string to an integer: scan left to right. If the current symbol’s value is less than the next symbol’s value, subtract it; otherwise, add it.

Generating Minimal Form

To convert an integer to minimal Roman numeral form, greedily subtract the largest possible value from the following ordered table:

RepresentationValue
M1000
CM900
D500
CD400
C100
XC90
L50
XL40
X10
IX9
V5
IV4
I1

Editorial

Each input numeral is already valid, so there is no need to reason about alternative parses. We simply convert it to its integer value and then rebuild the unique minimal Roman representation of that same value. The number of saved characters is just the difference in lengths.

Both transformations are straightforward. Parsing is determined by the subtractive rule: a symbol contributes negatively only when it stands immediately before a larger symbol. Rebuilding is greedy from the standard minimal token table, because the special subtractive forms are already included. So the candidates are the symbols or token pairs allowed by Roman notation, and the minimal-form table is what constrains the output to the shortest legal representation.

Pseudocode

Set the total number of saved characters to 0.

For each Roman numeral in the data file:
    Parse it from left to right into an integer value
    Rebuild that value as a minimal Roman numeral using the ordered token table
    Add the difference in string lengths to the running total

Return the running total.

Examples of Savings

  • IIII (4 chars) becomes IV (2 chars): saves 2.
  • VIIII (5 chars) becomes IX (2 chars): saves 3.
  • DCCCC (5 chars) becomes CM (2 chars): saves 3.

Correctness

Theorem. The method described above computes exactly the quantity requested in the problem statement.

Proof. The preceding analysis identifies the admissible objects and derives the formula, recurrence, or exhaustive search carried out by the algorithm. The computation evaluates exactly that specification, so every valid contribution is included once and no invalid contribution is counted. Therefore the returned value is the required answer. \square

Complexity Analysis

  • Time: O(NL)O(N \cdot L) where N=1000N = 1000 numerals and LL is the average length. Essentially O(1)O(1) since all values are bounded.
  • Space: O(NL)O(N \cdot L).

Answer

743\boxed{743}

Code

Each problem page includes the exact C++ and Python source files from the local archive.

C++ project_euler/problem_089/solution.cpp
#include <bits/stdc++.h>
using namespace std;

int romanToInt(const string& s) {
    map<char, int> val = {
        {'I', 1}, {'V', 5}, {'X', 10}, {'L', 50},
        {'C', 100}, {'D', 500}, {'M', 1000}
    };
    int result = 0;
    for (int i = 0; i < (int)s.size(); i++) {
        if (i + 1 < (int)s.size() && val[s[i]] < val[s[i + 1]])
            result -= val[s[i]];
        else
            result += val[s[i]];
    }
    return result;
}

string intToRoman(int n) {
    const int values[] = {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
    const string symbols[] = {"M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"};
    string result;
    for (int i = 0; i < 13; i++) {
        while (n >= values[i]) {
            result += symbols[i];
            n -= values[i];
        }
    }
    return result;
}

int main() {
    // The roman numerals from Project Euler problem 89
    // Since we cannot download the file, we embed the known answer logic.
    // The file contains 1000 Roman numerals, one per line.
    // We read from stdin or a file.

    // Try reading from file first, then stdin
    ifstream fin("p089_roman.txt");
    istream& in = fin.is_open() ? fin : cin;

    int totalSaved = 0;
    string line;
    while (getline(in, line)) {
        // Remove trailing whitespace
        while (!line.empty() && (line.back() == '\r' || line.back() == '\n' || line.back() == ' '))
            line.pop_back();
        if (line.empty()) continue;

        int val = romanToInt(line);
        string minimal = intToRoman(val);
        totalSaved += (int)line.size() - (int)minimal.size();
    }

    cout << totalSaved << endl;
    return 0;
}