Building a Tower
Let f(n) denote the number of ways to fill a 3 x 3 x n tower using 2 x 1 x 1 bricks. Find f(10^10000) mod (10^8 + 7).
Problem Statement
This archive keeps the full statement, math, and original media on the page.
Let \(f(n)\) represent the number of ways one can fill a \(3 \times 3 \times n\) tower with blocks of \(2 \times 1 \times 1\).
You’re allowed to rotate the blocks in any way you like; however, rotations, reflections etc of the tower itself are counted as distinct.
For example (with \(q = 100000007\)):
\(f(2) = 229\),
\(f(4) = 117805\),
\(f(10) \bmod q = 96149360\),
\(f(10^3) \bmod q = 24806056\),
\(f(10^6) \bmod q = 30808124\).
Find \(f(10^{10000}) \bmod 100000007\).
Problem 324: Building a Tower
Mathematical Foundation
Theorem 1 (Transfer matrix). Let be the set of all bitmasks of the cross-section. There exists a matrix over such that
where state represents the empty profile (no protruding bricks).
Proof. We build the tower layer by layer. A profile records which of the 9 cells at the boundary between two consecutive layers are occupied by bricks protruding from below. Entry counts the number of valid placements of bricks within a single layer such that all empty cells of profile are filled, and the cells protruding upward form profile . The brick placements within a layer correspond to horizontal dominoes (within-row or within-column) and vertical bricks crossing the boundary. Since the starting and ending profiles are both empty (no protrusions at the bottom or top), .
Lemma 1 (Linear recurrence). The sequence satisfies a linear recurrence
of degree equal to the degree of the minimal polynomial of restricted to the cyclic subspace generated by (the unit vector for state 0).
Proof. The Cayley—Hamilton theorem guarantees that satisfies its characteristic polynomial of degree at most 512. The sequence therefore satisfies the corresponding recurrence. The minimal polynomial of the sequence has degree at most .
Theorem 2 (Polynomial exponentiation for large ). Let be the minimal polynomial of the sequence of degree , working modulo the prime . Then
where in , computed via repeated squaring.
Proof. Since and annihilates the relevant subspace, any polynomial identity translates directly to on that subspace, hence . The polynomial is computed via squarings in .
Lemma 2 (Berlekamp—Massey). Given , the Berlekamp—Massey algorithm recovers the minimal polynomial of degree in operations over .
Proof. Standard result; see Massey (1969).
Editorial
3x3xn tower with 2x1x1 bricks. Find f(10^10000) mod (10^8 + 7). Uses transfer matrix + Berlekamp-Massey + polynomial exponentiation. We build transfer matrix T (512 x 512). We then compute f(0), …, f(2*512) by repeated matrix-vector multiplication. Finally, find minimal polynomial via Berlekamp-Massey.
Pseudocode
Build transfer matrix T (512 x 512)
Compute f(0), ..., f(2*512) by repeated matrix-vector multiplication
Find minimal polynomial via Berlekamp-Massey
Compute x^(10^10000) mod m(x) in F_p[x]
Represent n = 10^10000 in binary (~33220 bits)
Use repeated squaring of polynomials mod m(x)
Evaluate f(n) = sum of a_i * f(i)
Complexity Analysis
- Time: for transfer matrix construction (where is the cost of enumerating tilings per state pair) + for Berlekamp—Massey + for polynomial exponentiation, where and . Dominant term: .
- Space: for the transfer matrix or for the polynomial representation.
Answer
Code
Each problem page includes the exact C++ and Python source files from the local archive.
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
const ll MOD = 100000007LL;
// Transfer matrix (sparse representation)
vector<vector<pair<int,int>>> T(512); // T[i] = list of (j, count)
void fill(int pos, int cur_mask, int next_mask, int start_mask, vector<vector<pair<int,int>>>& T_build) {
while (pos < 9 && ((cur_mask >> pos) & 1)) pos++;
if (pos == 9) {
// Add transition start_mask -> next_mask
for (auto& p : T_build[start_mask]) {
if (p.first == next_mask) { p.second++; return; }
}
T_build[start_mask].push_back({next_mask, 1});
return;
}
int r = pos / 3, c = pos % 3;
if (c + 1 < 3 && !((cur_mask >> (pos+1)) & 1))
fill(pos+1, cur_mask | (1<<pos) | (1<<(pos+1)), next_mask, start_mask, T_build);
if (r + 1 < 3 && !((cur_mask >> (pos+3)) & 1))
fill(pos+1, cur_mask | (1<<pos) | (1<<(pos+3)), next_mask, start_mask, T_build);
fill(pos+1, cur_mask | (1<<pos), next_mask | (1<<pos), start_mask, T_build);
}
void build_transfer() {
for (int s = 0; s < 512; s++) {
fill(0, s, 0, s, T);
}
}
// Berlekamp-Massey
pair<int, vector<ll>> berlekamp_massey(const vector<ll>& s) {
int n = s.size();
vector<ll> C = {1}, B = {1};
int L = 0, m = 1;
ll b = 1;
auto modinv = [](ll a, ll mod) -> ll {
ll g = mod, x = 0, y = 1;
ll aa = a;
while (aa != 0) {
ll q = g / aa;
g -= q * aa; swap(g, aa);
x -= q * y; swap(x, y);
}
return (x % mod + mod) % mod;
};
for (int i = 0; i < n; i++) {
ll d = s[i];
for (int j = 1; j <= L; j++)
d = (d + C[j] % MOD * (s[i-j] % MOD)) % MOD;
d = (d % MOD + MOD) % MOD;
if (d == 0) { m++; continue; }
if (2*L <= i) {
vector<ll> T_save = C;
ll coef = d % MOD * modinv(b, MOD) % MOD;
while ((int)C.size() < (int)B.size() + m) C.push_back(0);
for (int j = 0; j < (int)B.size(); j++)
C[j+m] = (C[j+m] - coef % MOD * (B[j] % MOD) % MOD + MOD) % MOD;
L = i + 1 - L;
B = T_save;
b = d;
m = 1;
} else {
ll coef = d % MOD * modinv(b, MOD) % MOD;
while ((int)C.size() < (int)B.size() + m) C.push_back(0);
for (int j = 0; j < (int)B.size(); j++)
C[j+m] = (C[j+m] - coef % MOD * (B[j] % MOD) % MOD + MOD) % MOD;
m++;
}
}
return {L, C};
}
// Polynomial multiplication mod modpoly mod MOD
vector<ll> poly_mul(const vector<ll>& a, const vector<ll>& b, const vector<ll>& modpoly) {
int d = modpoly.size() - 1;
vector<ll> c(a.size() + b.size() - 1, 0);
for (int i = 0; i < (int)a.size(); i++) {
if (a[i] == 0) continue;
for (int j = 0; j < (int)b.size(); j++)
c[i+j] = (c[i+j] + a[i] * b[j]) % MOD;
}
auto modinv = [](ll a, ll mod) -> ll {
ll g = mod, x = 0, y = 1;
ll aa = a;
while (aa != 0) {
ll q = g / aa;
g -= q * aa; swap(g, aa);
x -= q * y; swap(x, y);
}
return (x % mod + mod) % mod;
};
ll lc_inv = modinv(modpoly[d], MOD);
for (int i = (int)c.size() - 1; i >= d; i--) {
if (c[i] == 0) continue;
ll coef = c[i] % MOD * lc_inv % MOD;
for (int j = 0; j <= d; j++)
c[i-d+j] = (c[i-d+j] - coef * modpoly[j] % MOD + MOD) % MOD;
}
c.resize(d);
return c;
}
int main() {
build_transfer();
// Compute sequence values
vector<ll> v(512, 0);
v[0] = 1;
vector<ll> vals;
vals.push_back(v[0]);
for (int step = 0; step < 200; step++) {
vector<ll> nv(512, 0);
for (int i = 0; i < 512; i++) {
if (v[i] == 0) continue;
for (auto& [j, cnt] : T[i]) {
nv[j] = (nv[j] + v[i] * cnt) % MOD;
}
}
v = nv;
vals.push_back(v[0]);
}
// Berlekamp-Massey
auto [L, C] = berlekamp_massey(vals);
// Build modpoly
vector<ll> modpoly(L + 1);
for (int j = 0; j <= L; j++)
modpoly[j] = (C[L - j] % MOD + MOD) % MOD;
// Compute 10^10000 in binary
// We'll do polynomial exponentiation: start with x, square-and-multiply
// But we need to represent 10^10000 in binary.
// Strategy: compute by repeated squaring of the exponent itself? No.
// Better: compute poly^10 repeatedly 10000 times (each time raise to 10th power).
// poly_pow_10: raise polynomial to 10th power
// poly^10 = ((poly^2)^2 * poly)^2 (10 = 1010 in binary)
vector<ll> result(L, 0);
result[0] = 1; // polynomial 1
vector<ll> base(L, 0);
base[1] = 1; // polynomial x
// Compute base^(10^10000) = (...((base^10)^10)^10...)^10 (10000 times)
// Start: cur = x
// Repeat 10000 times: cur = cur^10
auto poly_pow10 = [&](vector<ll> p) -> vector<ll> {
// p^10 = p^8 * p^2
auto p2 = poly_mul(p, p, modpoly);
auto p4 = poly_mul(p2, p2, modpoly);
auto p8 = poly_mul(p4, p4, modpoly);
return poly_mul(p8, p2, modpoly);
};
vector<ll> cur = base; // x
for (int i = 0; i < 10000; i++) {
cur = poly_pow10(cur);
}
// Evaluate: f(n) = sum cur[i] * vals[i]
ll answer = 0;
for (int i = 0; i < L; i++) {
answer = (answer + cur[i] % MOD * (vals[i] % MOD)) % MOD;
}
answer = (answer + MOD) % MOD;
cout << answer << endl;
return 0;
}
"""
Problem 324: Building a Tower
3x3xn tower with 2x1x1 bricks.
Find f(10^10000) mod (10^8 + 7).
Uses transfer matrix + Berlekamp-Massey + polynomial exponentiation.
"""
MOD = 10**8 + 7
def build_transfer_matrix():
"""Build 512x512 transfer matrix for 3x3 layer profiles."""
T = [[0]*512 for _ in range(512)]
def fill(pos, cur_mask, next_mask, start_mask):
while pos < 9 and (cur_mask >> pos) & 1:
pos += 1
if pos == 9:
T[start_mask][next_mask] += 1
return
r, c = pos // 3, pos % 3
# Horizontal brick (right)
if c + 1 < 3 and not ((cur_mask >> (pos + 1)) & 1):
fill(pos + 1, cur_mask | (1 << pos) | (1 << (pos + 1)), next_mask, start_mask)
# Vertical brick (down, within layer)
if r + 1 < 3 and not ((cur_mask >> (pos + 3)) & 1):
fill(pos + 1, cur_mask | (1 << pos) | (1 << (pos + 3)), next_mask, start_mask)
# Brick going up to next layer
fill(pos + 1, cur_mask | (1 << pos), next_mask | (1 << pos), start_mask)
for s in range(512):
fill(0, s, 0, s)
return T
def berlekamp_massey(s, mod):
"""Find shortest LFSR for sequence s mod mod."""
n = len(s)
C, B = [1], [1]
L, m, b = 0, 1, 1
for i in range(n):
d = s[i]
for j in range(1, L + 1):
d = (d + C[j] * s[i - j]) % mod
if d == 0:
m += 1
elif 2 * L <= i:
T_save = C[:]
coef = d * pow(b, -1, mod) % mod
while len(C) < len(B) + m:
C.append(0)
for j in range(len(B)):
C[j + m] = (C[j + m] - coef * B[j]) % mod
L = i + 1 - L
B = T_save
b = d
m = 1
else:
coef = d * pow(b, -1, mod) % mod
while len(C) < len(B) + m:
C.append(0)
for j in range(len(B)):
C[j + m] = (C[j + m] - coef * B[j]) % mod
m += 1
return L, C
def poly_mul(a, b, modpoly, mod):
"""Multiply polynomials a*b mod modpoly mod mod."""
c = [0] * (len(a) + len(b) - 1)
for i in range(len(a)):
if a[i] == 0:
continue
for j in range(len(b)):
c[i + j] = (c[i + j] + a[i] * b[j]) % mod
d = len(modpoly) - 1
lc_inv = pow(modpoly[d], -1, mod)
for i in range(len(c) - 1, d - 1, -1):
if c[i] == 0:
continue
coef = c[i] * lc_inv % mod
for j in range(d + 1):
c[i - d + j] = (c[i - d + j] - coef * modpoly[j]) % mod
result = c[:d]
while len(result) < d:
result.append(0)
return result
def poly_pow(base, exp, modpoly, mod):
"""Compute base^exp mod modpoly mod mod."""
d = len(modpoly) - 1
result = [0] * d
result[0] = 1
if exp == 0:
return result
bits = []
n = exp
while n > 0:
bits.append(n & 1)
n >>= 1
bits.reverse()
for bit in bits:
result = poly_mul(result, result, modpoly, mod)
if bit:
result = poly_mul(result, base, modpoly, mod)
return result
def solve():
T = build_transfer_matrix()
# Compute sequence values via matrix-vector multiplication
v = [0] * 512
v[0] = 1
vals = [v[0]]
for step in range(200):
nv = [0] * 512
for i in range(512):
if v[i] == 0:
continue
for j in range(512):
if T[i][j] == 0:
continue
nv[j] = (nv[j] + v[i] * T[i][j]) % MOD
v = nv
vals.append(v[0])
# Find minimal polynomial
L, C = berlekamp_massey(vals, MOD)
# Build modular polynomial (index = degree)
modpoly = [C[L - j] % MOD for j in range(L + 1)]
# Compute x^(10^10000) mod modpoly
n = pow(10, 10000)
base = [0] * L
base[1] = 1 # polynomial x
result_poly = poly_pow(base, n, modpoly, MOD)
# Evaluate: f(n) = sum result_poly[i] * vals[i]
answer = 0
for i in range(L):
answer = (answer + result_poly[i] * vals[i]) % MOD
print(answer)
if __name__ == "__main__":
solve()