
Teaching the Mapmakers: How AI Training Data Shapes (and Distorts) Security Coverage

By Steven van der Baan

08 May 2026

This is the third piece in a series. The first, The Cartographer's Advantage, made the philosophical argument for why human judgment is structurally irreplaceable in security testing. The second, The Expedition Debrief, asked what that looks like in practice. This piece goes deeper on a question the first two left open: what, exactly, is in the map that AI draws, and what got left out when the mapmakers were taught?

The day after I published the second piece in this series, a new AI security tool called Mythos was released. I'm not citing it as a data point about the pace of the space (it's moving fast; everyone knows that) but because of what it represents structurally. Another tool. Another model. Another capability built on top of the same underlying substrate: a training corpus assembled from the security knowledge that exists in public, written-down, structured form.

That substrate is where I want to spend some time.

The first piece argued that AI pattern-matches against its training distribution. The second piece built a methodology for working in the spaces that pattern-matching misses. This piece asks the prior question: what is the training distribution, and why does it produce the map it produces?

The Corpus Is Not a Sample

When we talk about AI security tools learning from training data, the mental model most practitioners carry is something like: "it has read a lot about vulnerabilities." That is true but underspecified in a way that matters.

The corpus that AI security tools are trained on, whether explicitly or implicitly through foundation models fine-tuned on security-relevant data, is assembled from sources that share a structural characteristic: they are records of vulnerabilities that were found, named, documented, and published. CVE databases. Bug bounty reports. Security research papers. Exploit archives. Proof-of-concept repositories. OWASP lists. Conference presentations. GitHub security advisories.

This is not a representative sample of the vulnerability landscape. It is a sample of the vulnerability landscape that intersects with the incentive structures and publication norms of the security industry. Those two things are not the same.

Bug bounty platforms pay for findings that fit their scope. Their scope is usually web applications, usually specific endpoint categories, usually vulnerability classes that can be demonstrated clearly and triaged efficiently. SQL injection pays well. Business logic issues that require deep application knowledge to understand, let alone verify, pay less well or not at all: they're too slow to triage, too hard to scope, too dependent on context that the platform doesn't have. The corpus of disclosed bug bounty reports is therefore not a random sample of what's exploitable. It is a sample of what's exploitable, demonstrable, and legible to a remote third party with limited application context.

CVE databases have a different but related bias. They record vulnerabilities in named, versioned software packages: the ones significant enough that someone built a tracking mechanism around them. The class of vulnerabilities that exist in the interaction between correctly implemented components (the discount stacking bug in the checkout flow built by three teams over four years) never gets a CVE. It is not a flaw in a library. It is a flaw in the system. The database has no category for it.

The mapmakers learned from these corpora. The maps they draw are faithful to what they were taught. The question is what the curriculum left out.

What Gets Under-Represented

The current generation of AI security tools is genuinely impressive on certain terrain. A single comprehensive run of Shannon, one of the more capable autonomous AI pentesters available, costs roughly sixty dollars in compute and finds real vulnerabilities at a rate competitive with skilled human testers on the vulnerability classes it knows. XBOW, an autonomous agent, reached the top of HackerOne's US leaderboard in 2025. These are not toys.

But the terrain they are impressive on is not uniformly distributed across the attack surface.

Business logic vulnerabilities are under-represented in every corpus the mapmakers were trained on, because they are application-specific by definition. They arise from the collision of assumptions made by different teams, different sprints, different developers working on different subsystems with different mental models of how the whole thing fits together. No CVE captures this class. No bug bounty report generalises it. The AI can learn the general concept ("business logic flaws exist"), but it cannot build pattern libraries for vulnerabilities that are inherently particular to their context. Every instance is a one-off.
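To make the class concrete, here is a minimal sketch of the discount-stacking shape mentioned earlier. Everything in it is hypothetical (the function names, the teams, the rules); the point is that each rule is defensible on its own, and the vulnerability only exists in the composition, which is exactly the thing no CVE entry or bug bounty template has a field for.

```python
# Hypothetical checkout pricing flow assembled from rules owned by different teams.
# Each rule is correct in isolation; the flaw only exists in their composition.

def apply_promo_code(total: float, percent_off: float) -> float:
    """Promotions team: percentage discounts are capped at 50%."""
    percent_off = min(percent_off, 50.0)
    return total * (1 - percent_off / 100)

def apply_referral_discount(total: float, percent_off: float) -> float:
    """Growth team: their own percentage discount, with its own 50% cap."""
    percent_off = min(percent_off, 50.0)
    return total * (1 - percent_off / 100)

def apply_loyalty_credit(total: float, credit: float) -> float:
    """Loyalty team: flat credit, never checks what ran before it."""
    return total - credit

def checkout(cart_total: float) -> float:
    # Each team assumed theirs was the only large discount in play.
    total = apply_promo_code(cart_total, percent_off=50)    # 100.00 -> 50.00
    total = apply_referral_discount(total, percent_off=50)  #  50.00 -> 25.00
    total = apply_loyalty_credit(total, credit=30)          #  25.00 -> -5.00
    return total

if __name__ == "__main__":
    print(checkout(100.00))  # -5.0: no single rule is wrong; the composition is
```

No scanner signature matches this, because there is nothing to match: no dangerous sink, no known-bad library version, just three reasonable decisions that were never made in the same room.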

State-dependent vulnerabilities are similarly sparse in training data. The finding that lives in the gap between two correctly implemented endpoints (what happens when you hit them in this order, from this account state, with this session history) is not the kind of finding that gets written up cleanly. It's not a reproducible proof of concept that maps neatly to a vulnerability class. It's a finding that requires understanding the application's state machine, and the training corpus contains almost no examples of that reasoning, because that reasoning doesn't survive the translation into structured documentation.
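As an illustration of the shape (not a real finding), consider a hypothetical cart API in which every endpoint validates its own inputs correctly, but nothing re-checks earlier decisions after later state changes. The vulnerability lives in the order of calls, which is precisely what a one-line vulnerability-class label cannot capture.

```python
# Hypothetical cart API: each endpoint is individually correct,
# but nothing re-validates earlier decisions after later state changes.

class Cart:
    def __init__(self):
        self.items = {}       # item name -> price
        self.discount = 0.0

    def add_item(self, name: str, price: float) -> None:
        self.items[name] = price

    def remove_item(self, name: str) -> None:
        self.items.pop(name, None)

    def apply_coupon(self, amount: float, min_subtotal: float) -> None:
        # Correct in isolation: the threshold IS checked, at apply time.
        if sum(self.items.values()) >= min_subtotal:
            self.discount = amount

    def total(self) -> float:
        return sum(self.items.values()) - self.discount


# The finding lives in the sequence, not in any one endpoint:
cart = Cart()
cart.add_item("laptop", 1200.00)
cart.apply_coupon(amount=100.00, min_subtotal=1000.00)  # valid at this moment
cart.remove_item("laptop")                              # state changes afterwards
cart.add_item("cable", 9.00)
print(cart.total())  # -91.0: the coupon check never runs again
```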

Multi-actor race conditions and authorization sequence failures share this problem. A vulnerability that requires two accounts acting in a specific temporal sequence is hard to test manually and harder to document in a way that generalises. The training corpus has scattered examples, not enough to build robust pattern libraries.
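Here is a minimal sketch of the two-actor shape, compressed into one process for readability: the balance check and the debit are each correct, but nothing makes the check-then-act sequence atomic across sessions. In a real engagement the two "sessions" are two authenticated accounts against a live application, which is exactly the part that resists clean, generalisable documentation.

```python
# Hypothetical two-session race: the check and the debit are not atomic,
# so two sessions acting in a narrow window can both pass the check.

import threading
import time

balance = {"amount": 100}
results = []

def redeem_gift_card(session_name: str, amount: int) -> None:
    # Check... (correct on its own)
    if balance["amount"] >= amount:
        time.sleep(0.01)  # window in which the other session's check also passes
        # ...then act. Both sessions can reach here before either has debited.
        balance["amount"] -= amount
        results.append((session_name, "redeemed", amount))
    else:
        results.append((session_name, "rejected", amount))

t1 = threading.Thread(target=redeem_gift_card, args=("account_a", 100))
t2 = threading.Thread(target=redeem_gift_card, args=("account_b", 100))
t1.start(); t2.start(); t1.join(); t2.join()

print(results)             # frequently both "redeemed"
print(balance["amount"])   # -100: the card was spent twice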

Zero-day vulnerability classes are, by definition, absent from training data until someone discovers and publishes them. Server-side template injection, HTTP request smuggling, prototype pollution: these were all invisible to automated tools until a human tester perceived them as a category for the first time. The AI cannot discover what the corpus doesn't contain. The curriculum didn't include the next class because nobody has named it yet.

The Consistency Problem Is a Coverage Problem in Disguise

One of the most frequently cited limitations of current AI security tools is non-determinism: run the same test twice and you might get different results. Practitioners talk about "hammering a prompt" to distinguish false positives from real findings. The Shannon documentation acknowledges this; it is a known operational friction.

The way this is usually framed is as a reliability issue. The same input produces inconsistent output. That is a problem, but it obscures a more fundamental one.

Non-determinism in security tools is a symptom of the model reasoning under uncertainty, and the model reasons under uncertainty when it is operating near the edges of its training distribution. When the tool is pattern-matching against well-represented vulnerability classes (SQL injection, XSS, known authentication bypasses), it is in familiar territory. The outputs are more consistent because the terrain is well-mapped. When the tool is operating in sparse terrain (business logic, novel attack chains, application-specific state issues), it is, in a real sense, improvising. The non-determinism is telling you that the tool is in territory it has seen less of.

This matters operationally because consistency and coverage are inversely distributed across the attack surface. The vulnerabilities the tool finds reliably are the ones that were already well-documented. The vulnerabilities the tool finds inconsistently, or not at all, are the ones the corpus couldn't teach it to see.
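One way practitioners operationalise this is the "hammering" pattern mentioned above: run the same check several times and see which findings persist. A small sketch of the aggregation, with the tool interface left as a callable you supply (nothing here is a real tool API):

```python
# Re-run a scan N times and measure how often each finding appears.
# scan_once is whatever you are driving: a prompt, an agent run, a tool invocation.

from collections import Counter
from typing import Callable, Set, Dict

def finding_agreement(scan_once: Callable[[], Set[str]], runs: int = 5) -> Dict[str, float]:
    """Return, for each finding identifier, the fraction of runs it appeared in."""
    counts = Counter()
    for _ in range(runs):
        for finding in scan_once():
            counts[finding] += 1
    return {finding: n / runs for finding, n in counts.items()}
```

Agreement is not ground truth in either direction; it is a cheap signal of whether the model is on well-mapped or sparse terrain, and therefore of where human verification buys the most.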

The Uncensored Angle

There is a strand of practitioner experimentation that is worth noting here, not because it solves the problem but because it reveals it clearly.

A class of videos and writeups has emerged around using uncensored AI models for security work, models without the safety constraints that limit what commercial AI tools will help with. The argument is that the safety layer suppresses legitimate security research, and removing it unlocks capability.

Some of that argument is pragmatic and defensible. But the more interesting implication is the one the framing doesn't address: removing the safety layer changes what the model will help with, not what the model knows. An uncensored model trained on the same corpus as a commercial model has the same map. It will help you navigate territory its training data covered without requiring you to justify why you're asking. It will not draw you a better map of the terrain the training data didn't cover.

The coverage problem is upstream of the safety layer. It lives in the corpus.

What This Means for Security Coverage

If the training data systematically under-represents certain vulnerability classes, and if AI tools are increasingly handling the first pass on security engagements, then the industry faces a structural coverage problem that is easy to mistake for a tool maturity problem.

The distinction matters. A tool maturity problem resolves over time as the tools improve. A structural coverage problem requires a different intervention: not better tools, but better understanding of what the tools cannot cover, and deliberate human attention to exactly those areas.

The practical implication follows directly from the first two pieces in this series. If AI tools are trained on what has been found, named, and documented, then the role of the human tester is not just to catch what the scanner misses on a given engagement. It is to continue producing the findings that become the training data for the next generation of tools, and to do so in the terrain the tools can't reach, where the coverage gaps are.

The map gets better because humans venture into unmapped territory and come back with notes. If the whole profession follows the AI pass and only pushes past it occasionally, the map updates slowly. The coverage gaps narrow, but they narrow only on the terrain the testers visited. The terrain they didn't visit stays blank.

Teaching the Mapmakers Better

The title of this piece is not quite right, and I want to acknowledge that before closing.

The AI tools are not, in the relevant sense, being taught. They are learning from a corpus that was assembled before they existed, shaped by incentive structures that were not designed with coverage in mind, and reflecting the parts of the vulnerability landscape that the industry found legible enough to document. Nobody decided to under-represent business logic vulnerabilities. The under-representation is an emergent property of how knowledge gets recorded and published.

Correcting it is partly a tooling problem: better support for documenting complex, state-dependent, multi-actor findings in ways that could eventually become training signal. It is partly a culture problem: the industry currently under-values findings that are hard to triage and under-invests in the documentation infrastructure that would make complex findings legible. And it is partly a practice problem of the kind the expedition debrief was designed to address: capturing the reasoning chain behind complex findings, not just the findings themselves, so that the knowledge survives in a form that can eventually inform the next generation of maps.
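What that documentation might record is easier to show than to describe. The sketch below is hypothetical (the field names are illustrative, not a proposed standard); the point is that it keeps the actors, the state preconditions, and the reasoning sequence that a bare proof-of-concept flattens away.

```python
# A hypothetical shape for a finding record that preserves the reasoning,
# not just the result. Field names are illustrative only.

from dataclasses import dataclass, field
from typing import List

@dataclass
class ReasoningStep:
    observation: str        # what was seen
    assumption_tested: str  # which application assumption this step probed
    outcome: str            # what actually happened

@dataclass
class FindingRecord:
    title: str
    actors: List[str]                  # accounts or roles involved
    state_preconditions: List[str]     # application state required before the sequence
    sequence: List[ReasoningStep] = field(default_factory=list)
    why_it_matters: str = ""
```

Compared with a screenshot and a payload, a record like this keeps the state and sequencing context, which is the part of the finding a future training corpus would actually need.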

The coverage gap will not close by itself. It will close because practitioners keep going where the map runs out, and because the profession builds better infrastructure for recording what they find there.

The territory is still larger than the map. The question is whether we are making the map faster than the territory is growing.