Watch the webinar video
In April 2026, Anthropic announced its latest AI model, Claude Mythos, and the Project Glasswing initiative to pilot these capabilities for defensive purposes. The announcement marked a further acceleration in cyber risk, as advancing AI capability shrinks the time between vulnerability discovery and exploitation.
We hosted a global webinar to explore the practical implications of Claude Mythos and, with plenty of time for Q&A, discussed no-regret actions for security leaders navigating a fast-moving AI landscape.
Hosted by our Director and Senior Advisor, Tim Rawlins, we heard from:
- David Brauchler, Technical Director
- Joel Scambray, SVP, Cybersecurity Advisory
- Andy Davis, Global Research Director
Get in touch if you’d like to review what this means for your environment and how you might assess whether your current approach can keep pace.
"There is no silver bullet. In an AI-accelerated world, the challenge is to look beyond hype and to trust in advice and action you can stand behind."
- Mike Maddison, CEO, NCC Group
Mythos: what’s the story – and what security leaders should do next
Webinar takeaways from NCC Group’s briefing for security leaders and their teams on 16 April 2026
AI-assisted vulnerability discovery is moving from occasional acceleration to sustained, repeatable scale. Claude Mythos is a useful signal—not because it creates a new class of threat, but because it compresses the time between finding weaknesses and putting them to work. In that world, the limiting factor in reducing risk is no longer what you can detect; it’s how quickly you can triage, prioritise, and ship safe fixes across complex environments. The question for security leaders is straightforward: do your governance and remediation systems move fast enough to keep exposure trending down as discovery speed climbs?
We brought together a few of our technical experts in a webinar to explore the practical implications of Claude Mythos, and the no-regret actions security leaders can take as AI capabilities—and attacker adoption—continue to move quickly.
Hosted by our Senior Advisor Tim Rawlins, with David Brauchler (Technical Director), Joel Scambray (SVP, Cybersecurity Advisory), and Andy Davis (Global Research Director), we discussed how Mythos has sharpened a debate many security leaders have been tracking for years. The point isn’t that it introduces an entirely new threat; it’s that it creates a new pace-of-change problem. It shows how AI can drive discovery and exploitation with a speed and consistency that changes day-to-day defensive reality.
What Mythos is (and what it isn’t)
Mythos is not a once-in-a-generation shock event. But you do need to plan for a world where discovery scale and velocity keep increasing and the time between discovery and remediation becomes a board-level concern.
Andy Davis: “The reality is a bit more nuanced and less apocalyptic than some of the media headlines are suggesting. It is a genuine step change in AI assisted vulnerability discovery and exploitation of those vulnerabilities. Mythos may be the first to do this well at scale. So, it is a wake up, but it’s certainly not going to be the last.”
One of the clearest technical points from the session was that raw model capability is likely to be replicated quickly by other AI models. The real differentiator is how those models are wrapped in a system that drives repeatable security outcomes: harnesses, workflows, tool use, validation loops, and human feedback.
David Brauchler: “What Mythos has demonstrated is that pure model intelligence, or strength, or ability is not in isolation sufficient for that model to be used for cyber security tasks. If your test harnesses are insufficient, then your results are going to be insufficient as well, irrespective of how good the model itself is. Finding that intersection between the capabilities of both the model and the test harnesses that leverage that model’s capabilities is paramount when it comes to extracting the max amount of security utility when deploying these types of models in production.”
For security leaders evaluating AI-assisted testing - as a service, a product, or an internal capability - this is a useful filter. Model benchmarks and demonstrations are not enough. The operational question is whether the AI tool produces results that can be validated, prioritised, and converted into remediation at speed.
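To make “harness plus validation loop” concrete, here is a minimal sketch (all names and checks are hypothetical, and it stands in for no particular product or model): the model proposes candidate findings, an independent validation step tries to reproduce each one, and only confirmed issues move on to triage.

```python
# Minimal sketch of a model-plus-harness validation loop. All names and checks
# here are invented for illustration; this is not Mythos or any vendor's API.
from dataclasses import dataclass

@dataclass
class CandidateFinding:
    title: str
    target: str        # e.g. an endpoint or source file
    validated: bool = False

def propose_findings(target: str) -> list[CandidateFinding]:
    # Stand-in for a model call that returns candidate vulnerabilities.
    return [
        CandidateFinding("Possible SQL injection in search parameter", target),
        CandidateFinding("Speculative issue with no supporting evidence", target),
    ]

def validate(finding: CandidateFinding) -> bool:
    # Stand-in for an automated reproduction step, e.g. replaying a
    # proof-of-concept against a sandboxed test environment.
    return "SQL injection" in finding.title

def run_harness(target: str) -> list[CandidateFinding]:
    confirmed = []
    for finding in propose_findings(target):
        # Only findings that survive independent validation reach triage;
        # the rest are dropped or fed back for another model iteration.
        if validate(finding):
            finding.validated = True
            confirmed.append(finding)
    return confirmed

print(run_harness("https://app.example.internal/api/search"))
```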
Mythos isn’t ‘the end of defenders’ - it’s a force multiplier that changes pace and scale
AI is a force multiplier, not a replacement for competent practitioners. It can increase researcher throughput and consistency, but it does not remove the need for human judgement—especially when translating findings into business impact, assessing exploitability in context, and deciding what to fix first. To manage the impact, organisations still need a clear view of their assets (hardware and software) and their attack surface. That’s not new, but it’s become even more important.
Andy Davis: “AI is a force multiplier for researchers. It’s not a replacement for humans in cyber security, it amplifies effectiveness, but it’s not magic.”
David Brauchler: “We have to be careful about getting taken up by the amount of hype. AI and humans have different overlap when it comes to the types of vulnerabilities that they discover and the types of attack surface that they can intuit.”
Evolution, not revolution - but evolution can still outpace organisations
Even while rejecting doomsday narratives, our panellists were clear on what does change: the speed at which vulnerabilities can be identified, chained, and operationalised. Once a capability is demonstrated at scale, diffusion and replication tend to follow.
David Brauchler: “We like to use the phrase, this is an evolution, not a revolution. Once somebody sets a record previously thought impossible, it’s not long before copycats begin to march in and match those same capabilities.”
The leadership question becomes: do your governance, prioritisation, and remediation systems move fast enough to keep risk trending down even as discovery speed increases? In many organisations, doing so will require executive support to free up the resources—and, at times, the change windows—needed to respond.
The shift: AI risk in your organisation is no longer hypothetical - and it is often architectural
AI is already embedded in production environments: decision support, workflow automation, developer tooling, customer interaction, and internal operations. AI risk is rarely “a model problem” in isolation. In practice, it is more often an architecture and integration problem: where the model sits, what it can access, what it can do, what it trusts, and how its actions are constrained.
Three realities stood out:
- Attack surface expands quickly as AI systems become more connected (APIs, tools, data sources) and more autonomous (agents).
- Static controls and superficial “guardrails” often fail under dynamic conditions and attacker-influenced inputs.
- Organisations struggle to distinguish perceived AI risk from actual exposure because visibility is fragmented.
This is why the most practical security question is shifting from “Is AI risky?” to:
“Where is it already being used in our environment, and how do we contain failures without crippling delivery?”
There is a simple operational reality: when vulnerability discovery accelerates faster than remediation, exposure grows—even in organisations with mature detection capabilities. In a Mythos-style environment, risk is no longer constrained by what you can find, but by how quickly you can make risk-reducing changes safely and at scale.
David Brauchler: “One of the best ways to prepare for Mythos is to get your risk remediation pipeline in order. Unless our collective remediation processes can keep up with the bug discovery process, defenders are going to be left in the dust.”
Many organisations already operate with a long backlog of unresolved findings. In that context, “more findings” is not automatically “more security”. Without better triage, decision-making, and fix throughput, increased discovery can simply expand backlog and fatigue—while organisational risk stays flat or worsens. This also extends beyond the remit of the security team: to maintain (or improve) security at a higher pace, the wider organisation may need to accept more frequent operational disruption to deliver changes safely.
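To make the arithmetic concrete, here is a deliberately simplified illustration with invented numbers: when validated findings arrive faster than safe fixes can ship, the backlog, and therefore exposure, grows week on week even though detection is working.

```python
# Deliberately simplified backlog model; the numbers are invented for illustration.
findings_per_week = 40   # validated findings entering the queue each week
fixes_per_week = 25      # changes the organisation can safely ship each week
backlog = 200            # unresolved findings today

for week in range(1, 13):
    backlog += findings_per_week - fixes_per_week
    print(f"Week {week:2d}: backlog = {backlog}")

# Discovery outpaces remediation by 15 findings a week, so the backlog climbs
# from 200 to 380 over a quarter - exposure grows even though detection "works".
```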
Prioritisation becomes a control, not a reporting exercise
A second operational theme was that prioritisation must become more rigorous and more repeatable under higher volume.
Joel Scambray: “The fundamentals have not been repealed. Prioritisation being a key tool all risk professionals have used for years. What Mythos changes is in the speed and scale which requires a revisit of that fundamental - how does one prioritise?”
The panel cautioned against treating vulnerability volume as the primary measure of progress, particularly where models generate large numbers of potential issues. Human judgement remains essential for determining exploitability in context and filtering noise.
Andy Davis: “We’ve got to make sure that the vulnerabilities that we’re trying to prioritise are actually vulnerabilities in the first place. It’s very difficult for AI models to understand real world impact. And that’s where humans come in.”
AI as an attack surface: trust, provenance, permissions, and agency are where failures concentrate
Common failure modes are typically not “clever prompts” but repeatable patterns: data exposure from AI systems connected to privileged data sources, privilege escalation through excessive agency, and indirect prompt injection via attacker-influenced inputs.
David Brauchler: “We really see vulnerabilities that end up exposing confidential internal data due to AI systems that are connected to privileged data sources. We see privilege escalation through excessive agency vulnerabilities. And we see a lot of indirect prompt injection, usually in the cross user prompt injection category. So, threat actors that are able to manipulate data sources that AI systems consume, are able to make that AI an agent of the threat actor and control its behaviour.”
A root cause highlighted was weak segmentation of trusted vs untrusted inputs and poor understanding of provenance - an inability to consistently separate what should be treated as instruction, what should be treated as data, and what should be treated as hostile.
David Brauchler: “Systems are not segmenting data trust properly because they don’t understand the data provenance. Unless we know the provenance of the sources of input entering the context windows of our models, we cannot control how they behave.”
A useful mental model from the session: assume any AI system exposed to threat actor controlled input can become an agent of the threat actor unless you explicitly design against it.
As organisations move from assistants to agents that can take actions, the attack surface grows rapidly. A text-only assistant can mislead, but an agent can act: it can call tools, make API requests, and execute workflows. As the autonomy of those agents increases, the system’s effective attack surface expands, making least privilege and containment design essential from the outset. This is also a governance issue: “helpful autonomy” can quickly turn into broad operational reach unless permissions, action limits, and oversight are clearly defined and enforced.
Practical actions for security leaders (no-regret, operational steps)
1) Strengthen prioritisation so you reduce the most risk first
- Identify the limited set of internet-exposed services and externally reachable systems that represent your highest likelihood of exploitation.
- Prioritise fixes that reduce risk on those boundaries first. Include “edge-adjacent” supply chain dependencies in that scope (e.g., critical VPN/identity components, gateway services, and the third-party libraries/packages they rely on), because AI-accelerated discovery makes widely reused components higher-leverage targets.
- Use a consistent decision model that answers: “If this is exploited here, what happens?” Tie prioritisation to reachable attack paths, privilege outcomes, and blast radius (see the sketch after this list).
- Extend that decision model to third-party and open source components: prioritise what sits on critical paths and what would expand blast radius if compromised.
- Treat prioritisation as an operational control: named decision owners, defined cadence, escalation criteria, and explicit “drop / defer” rules so you do not collapse under higher finding volume. Make “supply chain exposure” an explicit input to the process (e.g., dependency criticality, maintainer risk, reachability, and known compromise signals) so it’s not handled ad hoc.
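As one illustration of such a decision model, the sketch below scores findings on reachability, privilege outcome, blast radius, and dependency criticality; the fields, weights, and caps are invented for the example and would need calibrating to your environment and risk appetite.

```python
# Illustrative prioritisation scoring; the fields, weights, and caps are
# invented examples, not a recommended standard.
from dataclasses import dataclass

@dataclass
class Finding:
    id: str
    internet_reachable: bool      # sits on an externally reachable attack path?
    privilege_outcome: int        # 0 = none, 1 = user-level, 2 = admin/domain-level
    blast_radius: int             # rough count of dependent systems affected
    critical_dependency: bool     # critical third-party / supply chain component?

def priority_score(f: Finding) -> int:
    score = 40 if f.internet_reachable else 0
    score += 20 * f.privilege_outcome
    score += min(f.blast_radius, 20)      # cap so one factor cannot dominate
    score += 15 if f.critical_dependency else 0
    return score

findings = [
    Finding("F-101", True, 2, 30, True),   # exposed, admin outcome, wide blast radius
    Finding("F-102", False, 1, 3, False),  # internal, limited impact
]
for f in sorted(findings, key=priority_score, reverse=True):
    print(f.id, priority_score(f))
```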
2) Modernise remediation so time to fix does not lag behind time to find
- Track cycle time from discovery to fix, bottlenecks by team, ageing items, and work-in-progress limits. If discovery accelerates, throughput must increase or exposure will accumulate (see the sketch after this list).
- Automate friction, not accountability: automate correlation, deduplication, asset ownership routing, and evidence capture. Keep decision rights with accountable humans, especially where changes can introduce outages or security regressions.
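As a rough sketch of the measurement side, assuming findings can be exported from your tracker with opened and closed dates (the field names below are placeholders):

```python
# Rough sketch of remediation pipeline metrics; the field names are placeholders
# for whatever your ticketing or vulnerability management export provides.
from datetime import date
from statistics import median

findings = [  # one record per finding: when opened, when fixed (None = still open)
    {"id": "F-1", "team": "payments", "opened": date(2026, 3, 2),  "closed": date(2026, 3, 20)},
    {"id": "F-2", "team": "payments", "opened": date(2026, 3, 10), "closed": None},
    {"id": "F-3", "team": "platform", "opened": date(2026, 2, 1),  "closed": date(2026, 3, 28)},
]

today = date(2026, 4, 16)
cycle_times = [(f["closed"] - f["opened"]).days for f in findings if f["closed"]]
open_findings = [f for f in findings if f["closed"] is None]
ageing = [(today - f["opened"]).days for f in open_findings]

print("median days from discovery to fix:", median(cycle_times))
print("findings still open (work in progress):", len(open_findings))
print("oldest open finding (days):", max(ageing))
```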
3) Build AI systems with trust segmentation and containment by design
- Implement data provenance and trust segmentation for AI inputs
Classify inputs as trusted / semi-trusted / untrusted and constrain what each class can influence. The objective is to prevent attacker-controlled inputs shaping behaviour in privileged contexts (a minimal sketch follows this list).
- Separate ‘instruction’ and ‘data’ paths
Ensure untrusted content cannot become instruction. Use strict templates, tool schemas, and policy enforcement so privileged tools only accept input from controlled contexts.
- Tier privileges and limit blast radius for agentic actions
Avoid a single all-powerful agent. Use multiple agents with least privilege, narrow scopes, and explicit tool allow lists. Route sensitive actions through higher-trust paths with stronger controls.
- Design for misbehaviour
Assume model drift, manipulation, and unexpected behaviour. Build monitoring, anomaly detection, kill switches, and rollback paths so abnormal behaviour is detected quickly and consequences are contained.
- Supply chain tie-in for AI architectures
Treat third-party tools, plugins, “skills”, and external integrations as untrusted by default: constrain what they can trigger, what they can access, and how their outputs are consumed, because compromised extensions can become a direct path to privileged actions.
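A minimal sketch of the provenance and allow-list ideas above, assuming a home-grown agent wrapper (the trust labels, tool names, and policy are invented for illustration): every piece of content entering the context carries a provenance label, and privileged tools refuse to run once untrusted content is present.

```python
# Minimal sketch of provenance labelling and tool gating; the trust levels,
# tool names, and policy are invented for illustration, not a product API.
from dataclasses import dataclass
from enum import Enum

class Trust(Enum):
    TRUSTED = 1        # e.g. system prompts, vetted internal configuration
    SEMI_TRUSTED = 2   # e.g. authenticated user input
    UNTRUSTED = 3      # e.g. web pages, inbound email, third-party documents

@dataclass
class ContextItem:
    content: str
    trust: Trust       # provenance label attached when content enters the context

# Explicit allow list: the least trusted context each tool will tolerate.
TOOL_POLICY = {
    "search_docs": Trust.UNTRUSTED,   # read-only, small blast radius
    "send_payment": Trust.TRUSTED,    # privileged and irreversible
}

def effective_trust(context: list[ContextItem]) -> Trust:
    # The effective trust of a context is that of its least trusted item.
    return max(context, key=lambda item: item.trust.value).trust

def can_invoke(tool: str, context: list[ContextItem]) -> bool:
    required = TOOL_POLICY.get(tool)
    if required is None:
        return False                  # tools not on the allow list never run
    return effective_trust(context).value <= required.value

ctx = [
    ContextItem("system instructions", Trust.TRUSTED),
    ContextItem("content scraped from an external web page", Trust.UNTRUSTED),
]
print(can_invoke("search_docs", ctx))    # True: tool tolerates untrusted context
print(can_invoke("send_payment", ctx))   # False: untrusted content is present
```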
4) Reduce governance gaps that make AI risk harder to manage
- Create visibility for AI usage and data flows
Inventory where AI is used (embedded features, third party platforms, internal tools), what data is shared, and which services are involved. This reduces ‘shadow AI’ risk and clarifies where controls are needed.
- Define accountability for agent actions
Set policy for who is responsible when an agent acts, how actions are logged, what audit evidence is retained, and which activities require approval (an illustrative record appears at the end of this section).
- Reserve human approvals for genuinely high-impact, irreversible actions
Too many approvals create fatigue and weaken oversight. Use approvals where they measurably reduce risk, and rely elsewhere on deterministic controls, technical boundaries, and monitoring.
Require asset registers such as software bills of materials (SBOMs), or equivalent transparency, for critical suppliers and internal products, and maintain an internal view of “what we depend on” across code, build pipelines, and AI extensions. Assess where suppliers are using AI (in development, build, testing, and support environments) and what that changes about your exposure – especially as AI increases the speed and scale of vulnerability discovery against commonly reused components.
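As one way to make accountability concrete, here is a sketch of the kind of record an agent action could emit (the field names are invented): every action is tied to a named owner, an approval reference where one is required, and evidence that can be audited later.

```python
# Illustrative audit record for an agent action; the field names are invented.
# The point is what gets captured (owner, approval, evidence), not the schema.
import json
from datetime import datetime, timezone

def record_agent_action(agent_id: str, tool: str, arguments: dict,
                        accountable_owner: str, approval_ref: str | None) -> str:
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,                    # which agent acted
        "tool": tool,                            # what it did
        "arguments": arguments,                  # with what inputs
        "accountable_owner": accountable_owner,  # named human owner for this agent
        "approval_ref": approval_ref,            # approval/ticket ID for high-impact actions, else None
    }
    return json.dumps(record)                    # ship to the normal audit/log pipeline

print(record_agent_action("invoice-agent-01", "update_supplier_record",
                          {"supplier_id": "S-1042"}, "ap-ops-lead",
                          approval_ref=None))
```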
Communicating AI risk to executives and boards
A recurring challenge raised in the session was not purely technical: it was translation. Security leaders need to communicate AI risk in terms that allow executives to make decisions.
A practical approach is to shift the narrative from “AI vulnerabilities” to operational failure modes and impact:
- What breaks? (service disruption, data exposure, fraud, integrity loss)
- How does it happen here? (attack paths, permissions, data access, tool use)
- How likely is it? (exposure, observability, known patterns, mitigations)
- What are we doing about it? (prioritised controls, measurable remediation throughput, containment design)
Executives do not need model mechanics. They need clarity on probable failure, consequence, and control effectiveness - especially as AI systems become more agentic and interconnected.
Closing thoughts: turning Mythos into a practical operating plan
The most useful response to Mythos is not to chase headlines, but to treat it as a clear signal: vulnerability discovery and exploitation are accelerating, and defensive programmes must be optimised for speed, prioritisation, and resilience.
If you focus on these four areas, you will be materially better positioned regardless of which AI leads next:
- Sharper prioritisation tied to business impact and exploitability in your environment
- A remediation pipeline with measurable throughput that can sustain higher velocity
- AI architectures that enforce trust segmentation, least privilege, and blast radius containment as agency grows
- Supply chain contractual demands for visibility and control – SBOMs, governed dependency updates, and rapid response to compromised components.
In other words: treat AI security as an architectural and operational discipline, not a tooling problem - and ensure the fundamentals (visibility, prioritisation, remediation, segmentation, containment) can operate at the new pace.
Support to navigate Mythos - technical and board-ready guidance
Watch the full webinar recording for more practical technical advice, or read our CEO Mike Maddison’s viewpoint to inform your board discussion and next steps.
Watch the webinar | Read the article
About the speakers
Tim Rawlins
Senior Advisor, NCC Group
Tim advises NCC Group and our major clients across a range of industries and sectors on critical business risk, security, and resilience issues. He also represents us in wide-ranging public affairs, including government liaison, speaking at conferences, and appearing in the international media. Before joining NCC Group, Tim was the Global Head of Corporate Security at Credit Suisse in London. Previously, he was the first Operations Director at The O2 and the EMEA Security Director for Turner Broadcasting. Tim started his career working for the British Government.
David Brauchler
Technical Director, NCC Group
David leads NCC Group’s North America AI/ML Security Practice and operates as an application security specialist and penetration tester. His skillset includes web, mobile, native, AI/ML, and application threat modelling. David has a long portfolio of technical presentations through his adjunct lecturing, conference talks, special guest lectures, podcasts, news broadcasts, media commentary, and more.
Andy Davis
Global Research Director, NCC Group
Andy leads NCC Group’s worldwide research initiatives, driving innovation and fostering collaboration across its global technical community. With over 30 years of experience in the Information Security industry, Andy has deep expertise across telecommunications, cyber security, and government research. He joined NCC Group in 2010 as UK Research Director, following senior roles at IRM and KPMG. From 2016, he successfully led the growth of our Global Transport Practice, with a research-led approach that has helped shape the future of connected transport security.
Joel Scambray
SVP, Cybersecurity Advisory, NCC Group
Joel has helped Fortune 500-class organizations address information security challenges for over twenty-five years as a consultant, author, speaker, executive, and entrepreneur. He is widely recognized as co-author of the Hacking Exposed book series, and has worked/consulted for companies including Microsoft, Google, Amazon, and Ernst & Young. He has helped start and build security companies valued collectively in the hundreds of millions of dollars. Joel is currently Senior Vice President, Cybersecurity Advisory at NCC Group.
Get in touch if you’d like to review what this means for your environment and how you might assess whether your current approach can keep pace.