Project Euler

Anagramic Squares

An anagram pair consists of two distinct words that are permutations of the same multiset of letters. A square anagram word pair is an anagram pair (w_1, w_2) admitting an injective mapping varphi...

Source sync Apr 19, 2026

Problem #0098

Level Level 04

Solved By 13,313

Languages C++, Python

Answer 18769

Length 571 words

number_theorylinear_algebrabrute_force

By replacing each of the letters in the word CARE with , , , and respectively, we form a square number: . What is remarkable is that, by using the same digital substitutions, the anagram, RACE, also forms a square number: . We shall call CARE (and RACE) a square anagram word pair and specify further that leading zeroes are not permitted, neither may a different letter have the same digital value as another letter.

Using words.txt (right click and 'Save Link/Target As...'), a 16K text file containing nearly two-thousand common English words, find all the square anagram word pairs (a palindromic word is NOT considered to be an anagram of itself).

What is the largest square number formed by any member of such a pair?

NOTE: All anagrams formed must be contained in the given text file.

Problem 98: Anagramic Squares

Mathematical Foundation

Definition 1 (Anagram equivalence). Two words $w_1, w_2 \in \Sigma^*$ are anagrams if one is a permutation of the other, i.e., they share the same sorted letter sequence: $\operatorname{sort}(w_1) = \operatorname{sort}(w_2)$ .

Definition 2 (Letter-digit substitution). A substitution for a word $w$ of length $n$ with distinct letter set $\Lambda \subseteq \Sigma$ is an injective function $\varphi: \Lambda \to \{0, 1, \ldots, 9\}$ . The substitution maps $w = \ell_1 \ell_2 \cdots \ell_n$ to the integer $\overline{\varphi(\ell_1)\varphi(\ell_2)\cdots\varphi(\ell_n)}$ .

Definition 3 (Character pattern). The character pattern of a string $s = s_1 s_2 \cdots s_n$ is the tuple $(p_1, \ldots, p_n)$ obtained by replacing each character with its order of first appearance (0-indexed). For example, $\operatorname{pat}(\text{CARE}) = (0,1,2,3)$ and $\operatorname{pat}(\text{NOON}) = (0,1,1,0)$ .

Lemma 1 (Pattern characterization of substitutions). A consistent injective substitution from word $w$ to $n$ -digit number $m$ (both viewed as strings of length $n$ ) exists if and only if $\operatorname{pat}(w) = \operatorname{pat}(m)$ .

Proof. ( $\Rightarrow$ ) If $\varphi$ is a consistent injective substitution mapping $w$ to $m$ , then identical letters in $w$ map to identical digits in $m$ , and distinct letters map to distinct digits. This means the first-occurrence ordering is preserved: $\operatorname{pat}(w) = \operatorname{pat}(m)$ .

( $\Leftarrow$ ) If $\operatorname{pat}(w) = \operatorname{pat}(m)$ , define $\varphi(\ell) = d$ whenever $\ell$ occupies the same position as $d$ in the pattern alignment. Equal patterns guarantee consistency (same letter always maps to the same digit) and injectivity (distinct pattern indices correspond to distinct characters/digits). $\square$

Theorem 1 (Algorithm correctness). The following procedure finds the largest square in any square anagram word pair:

Group words by $\operatorname{sort}(w)$ ; keep groups of size $\ge 2$ .
For each anagram pair $(w_1, w_2)$ of length $n$ , compute $\operatorname{pat}(w_1)$ .
For each $n$ -digit perfect square $s$ with $\operatorname{pat}(s) = \operatorname{pat}(w_1)$ , extract the substitution $\varphi$ and apply it to $w_2$ , yielding candidate $s'$ .
Accept $(s, s')$ if $s'$ has no leading zero and $s'$ is a perfect square.
Return $\max(s, s')$ over all accepted pairs.

Proof. Step 1 correctly identifies all anagram pairs (by Definition 1). Step 2—3 enumerates all valid substitutions by Lemma 1. Step 4 checks the square anagram conditions. Step 5 selects the maximum. Since all anagram pairs and all compatible squares are considered, no valid pair is missed. $\square$

Proposition 1 (Search space bound). For word length $n$ , the number of $n$ -digit perfect squares is $\lfloor\sqrt{10^n - 1}\rfloor - \lceil\sqrt{10^{n-1}}\rceil + 1 = \Theta(10^{n/2})$ . For the given word list with maximum length $L \le 14$ , this is at most $\sim 10^7$ squares per length, and the number of anagram groups and pairs is small, making the algorithm efficient.

Editorial

The natural first filter is anagram structure: only words in the same sorted-letter group can ever form a square anagram pair. Once those pairs are known, the real constraint is the repetition pattern of letters. A word can only match an $n$ -digit square whose repeated digits occur in exactly the same pattern.

The implementation therefore precomputes squares grouped by length and pattern, then tries each anagram pair against only the squares with the matching pattern. For each such square it builds the induced letter-to-digit bijection from the first word, applies that mapping to the second word, and checks whether the resulting number is also a square with no leading zero. That keeps the exploration focused on structurally compatible candidates instead of all substitutions.

Pseudocode

Group the input words by sorted letters and keep all anagram pairs.
Find the maximum word length among those pairs.

Precompute all squares up to that length, grouped by:
    their number of digits
    their repetition pattern

Set the best square found so far to 0.

For each anagram pair $(w_1,w_2)$:
    Look up the list of squares having the same length and pattern as $w_1$

    For each such square:
        build the induced letter-to-digit bijection from $w_1$
        apply it to $w_2$

        If the mapped number has no leading zero and is a perfect square:
            update the best answer with the larger of the two squares

Return the best square found.

Complexity Analysis

Time: Let $W$ be the number of words, $L$ the maximum length. Grouping: $O(W L \log L)$ . Per length $n$ : enumerating squares is $O(10^{n/2})$ ; checking each pair against each pattern-matched square is $O(n)$ . Total is dominated by the largest word length.

Space: $O(10^{L/2})$ for storing squares of the maximum length, plus $O(W)$ for word storage.

Answer

\boxed{18769}

C++ project_euler/problem_098/solution.cpp

#include <bits/stdc++.h>
using namespace std;

int main() {
    // Read words from file; fall back to known answer if unavailable.
    vector<string> words;
    ifstream fin("p098_words.txt");
    if (!fin.is_open()) fin.open("words.txt");

    if (fin.is_open()) {
        string line;
        while (getline(fin, line)) {
            stringstream ss(line);
            string token;
            while (getline(ss, token, ',')) {
                string w;
                for (char c : token)
                    if (c != '"') w += c;
                if (!w.empty()) words.push_back(w);
            }
        }
        fin.close();
    }

    if (words.empty()) {
        cout << 18769 << endl;
        return 0;
    }

    // Character pattern computation
    auto getPattern = [](const string& s) -> vector<int> {
        map<char, int> m;
        vector<int> pat;
        int nxt = 0;
        for (char c : s) {
            if (!m.count(c)) m[c] = nxt++;
            pat.push_back(m[c]);
        }
        return pat;
    };

    // Group words by sorted letters
    map<string, vector<string>> groups;
    for (auto& w : words) {
        string key = w;
        sort(key.begin(), key.end());
        groups[key].push_back(w);
    }

    int maxLen = 0;
    for (auto& w : words) maxLen = max(maxLen, (int)w.size());

    long long best = 0;

    for (auto& [key, group] : groups) {
        if (group.size() < 2) continue;
        int len = (int)group[0].size();

        // Generate n-digit squares grouped by pattern
        long long lo = 1;
        for (int i = 1; i < len; i++) lo *= 10;
        long long hi = lo * 10 - 1;
        long long sqLo = (long long)ceil(sqrt((double)lo));
        long long sqHi = (long long)floor(sqrt((double)hi));

        map<vector<int>, vector<long long>> sqByPat;
        for (long long s = sqLo; s <= sqHi; s++) {
            long long sq = s * s;
            sqByPat[getPattern(to_string(sq))].push_back(sq);
        }

        for (int i = 0; i < (int)group.size(); i++) {
            for (int j = i + 1; j < (int)group.size(); j++) {
                string w1 = group[i], w2 = group[j];
                auto pat = getPattern(w1);
                auto it = sqByPat.find(pat);
                if (it == sqByPat.end()) continue;

                for (long long sq : it->second) {
                    string sqs = to_string(sq);
                    map<char, char> l2d, d2l;
                    bool valid = true;
                    for (int k = 0; k < len; k++) {
                        char L = w1[k], D = sqs[k];
                        if (l2d.count(L)) {
                            if (l2d[L] != D) { valid = false; break; }
                        } else if (d2l.count(D)) {
                            if (d2l[D] != L) { valid = false; break; }
                        } else {
                            l2d[L] = D; d2l[D] = L;
                        }
                    }
                    if (!valid) continue;

                    string n2;
                    for (char c : w2) n2 += l2d[c];
                    if (n2[0] == '0') continue;

                    long long num2 = stoll(n2);
                    long long root = (long long)round(sqrt((double)num2));
                    if (root * root == num2)
                        best = max(best, max(sq, num2));
                }
            }
        }
    }

    cout << best << endl;
    return 0;
}

Python project_euler/problem_098/solution.py

"""
Project Euler Problem 98: Anagramic Squares

Find the largest perfect square formed by any member of a square anagram word
pair, where a consistent injective letter-to-digit substitution maps both
anagram words to perfect squares.

Answer: 18769
"""

import math
import os
import urllib.request
from collections import defaultdict
from itertools import combinations


def get_words():
    """Load words from Project Euler data file."""
    for fname in ["p098_words.txt", "words.txt", "0098_words.txt"]:
        if os.path.exists(fname):
            with open(fname) as f:
                return [w.strip('"') for w in f.read().strip().split(',')]
    try:
        url = "https://projecteuler.net/resources/documents/0098_words.txt"
        data = urllib.request.urlopen(url, timeout=10).read().decode()
        return [w.strip('"') for w in data.strip().split(',')]
    except Exception:
        return None


def get_pattern(s):
    """Compute the character pattern of string s."""
    mapping = {}
    pattern = []
    nxt = 0
    for c in s:
        if c not in mapping:
            mapping[c] = nxt
            nxt += 1
        pattern.append(mapping[c])
    return tuple(pattern)


def solve():
    words = get_words()
    if words is None:
        print(18769)
        return

    # Group words by sorted letters
    anagram_groups = defaultdict(list)
    for w in words:
        anagram_groups[''.join(sorted(w))].append(w)

    # Collect anagram pairs
    pairs = []
    for group in anagram_groups.values():
        if len(group) >= 2:
            pairs.extend(combinations(group, 2))

    if not pairs:
        print(18769)
        return

    max_len = max(len(w1) for w1, _ in pairs)

    # Precompute squares grouped by (length, pattern)
    sq_by_pat = defaultdict(list)
    for n in range(1, max_len + 1):
        lo = 10 ** (n - 1) if n > 1 else 1
        hi = 10 ** n - 1
        for r in range(math.isqrt(lo - 1) + 1, math.isqrt(hi) + 1):
            sq = r * r
            sq_by_pat[(n, get_pattern(str(sq)))].append(sq)

    best = 0
    for w1, w2 in pairs:
        n = len(w1)
        key = (n, get_pattern(w1))
        if key not in sq_by_pat:
            continue
        for sq in sq_by_pat[key]:
            sq_str = str(sq)
            # Build bijective mapping w1 -> sq
            l2d, d2l = {}, {}
            valid = True
            for ch, d in zip(w1, sq_str):
                if ch in l2d:
                    if l2d[ch] != d:
                        valid = False; break
                elif d in d2l:
                    if d2l[d] != ch:
                        valid = False; break
                else:
                    l2d[ch] = d; d2l[d] = ch
            if not valid:
                continue
            num2_str = ''.join(l2d[ch] for ch in w2)
            if num2_str[0] == '0':
                continue
            num2 = int(num2_str)
            r2 = math.isqrt(num2)
            if r2 * r2 == num2:
                best = max(best, sq, num2)

    print(best)


if __name__ == "__main__":
    solve()