Anthropic's most dangerous AI just accidentally leaked
Anthropic's accidental data exposure revealed 'Claude Mythos' — their most capable model, described internally as far ahead of any other AI in cyber attack capabilities.
Anthropic didn't plan to announce their most powerful AI model this week. A misconfiguration in their content management system (the software teams use to publish blog posts and internal documents) accidentally made roughly 3,000 internal files publicly searchable — and buried inside those files was something Anthropic had been keeping quiet: a model codenamed Claude Mythos.
Fortune journalists and two independent security researchers — Roy Paz from LayerX Security and Alexandre Pauwels from the University of Cambridge — discovered the leak and alerted Anthropic. The company confirmed the model is real, calling it a "step change" and the most capable model they have built to date.
But what makes Mythos genuinely alarming isn't just higher benchmark scores. According to the leaked internal documents, Mythos is "currently far ahead of any other AI model in cyber capabilities" — and that lead is exactly what has cybersecurity professionals sitting up straight.
What Mythos Can Do That No Other AI Can
Anthropic's leaked assessments describe Mythos as achieving dramatically higher scores on tests of software coding, academic reasoning, and cybersecurity compared to Claude Opus 4.6 — their existing most-capable commercial model. Internal documents describe it as "larger and more intelligent" than anything currently in their Opus lineup.
The cybersecurity dimension is where the story takes a sharp turn. The leaked documents warn that Mythos "presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders."
In plain English: this model may be able to find and exploit security holes in software faster than human security teams can patch them. That's a fundamentally different threat model from anything currently on the market.
Anthropic is treating this seriously. Their planned rollout is deliberately slow — beginning with a small group of early-access customers focused specifically on defensive cybersecurity applications. The logic: let defenders get ahead of the curve before the model reaches broader availability.
How 3,000 Internal Docs Went Accidentally Public
The exposure wasn't a hack or a deliberate leak. Anthropic's content management system (CMS) was misconfigured to automatically make all uploaded files publicly accessible and indexed. This included drafts, product images, PDFs, and materials from CEO events — none of which were intended for public viewing.
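Anthropic hasn't disclosed which CMS it uses or the exact setting involved, but exposures like this typically trace back to a single "public by default" access rule applied to every upload. As a purely illustrative sketch (all file names and field names below are hypothetical), this is the kind of simple audit that catches such a misconfiguration before a search engine does:

```python
# Illustrative only: Anthropic has not disclosed its CMS or the exact setting.
# This sketch shows how a default world-readable ACL on uploads can be audited.

def find_publicly_exposed(files):
    """Return names of files that are both world-readable and indexable."""
    exposed = []
    for f in files:
        # A single default like acl="public-read" applied to every upload
        # is enough to expose drafts, PDFs, and internal images all at once.
        if f.get("acl") == "public-read" and not f.get("noindex", False):
            exposed.append(f["name"])
    return exposed

uploads = [
    {"name": "draft-blog-post.md", "acl": "public-read"},
    {"name": "internal-assessment.pdf", "acl": "public-read"},
    {"name": "press-release.pdf", "acl": "public-read", "noindex": True},
    {"name": "exec-event-deck.pdf", "acl": "private"},
]

print(find_publicly_exposed(uploads))
# → ['draft-blog-post.md', 'internal-assessment.pdf']
```

The point of the sketch: the press release is fine because it was *meant* to be public, but the two internal files share the exact same default setting, so they leak together with everything else.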
When Fortune's investigative team and the two researchers combed through the accessible files, they found:
- Draft blog posts describing Mythos (also referenced internally as "Capybara" — a second naming candidate)
- Internal capability assessments flagging unprecedented cybersecurity risks
- Executive event materials that referenced the model's existence and development status
- Product images and PDFs from multiple unreleased projects across Anthropic
Anthropic disabled public access after Fortune notified them on Thursday, March 27. The company acknowledged the misconfiguration and confirmed it has been corrected.
Claude Mythos at a Glance
- Model name: Claude Mythos (internal codename also "Capybara")
- Capability tier: Above Claude Opus — a new class of model
- Performance: Dramatically higher on coding, reasoning, and cybersecurity vs. Opus 4.6
- Cyber risk: "Far ahead of any other AI model" in offensive cyber capabilities
- Cost: Described as "very expensive to serve" — premium pricing expected
- Rollout: Slow and staged, starting with cybersecurity defense customers
- Approx. leaked documents: ~3,000 internal files made publicly accessible
Why This Matters If You're Not a Cybersecurity Expert
You might be thinking: this sounds like a problem for security teams, not for me. But the implications are broader than they appear.
Every company that runs software — which is effectively every company — relies on the assumption that human security teams can respond faster than attackers can exploit vulnerabilities. If Mythos-level AI (and the models that follow it) can flip that equation, the entire premise of current cybersecurity defense changes.
For business owners and managers: the next 12-18 months are likely to see a sharp increase in the speed and sophistication of AI-assisted cyberattacks. This is a good time to review your company's incident response plan and consider whether your security tools are keeping pace.
For IT and security professionals: the same models that can attack can also defend. Anthropic's deliberate strategy of giving defenders early access to Mythos is a signal that AI-powered security tools are about to get significantly more capable. Getting ahead of the curve on AI-assisted penetration testing (checking for security holes before attackers do) is increasingly valuable.
What Comes Next for Claude Mythos
Anthropic has confirmed they are actively testing Mythos with "early-access customers" and will expand access through the Claude API gradually. The company noted the model is "very expensive to serve" — meaning when it does launch commercially, it will almost certainly carry a significant price premium above their existing Opus tier.
The final name is still undecided — both "Mythos" and "Capybara" appear to be internal project names. Anthropic's typical public naming convention (Claude Haiku → Sonnet → Opus) may need a new tier above Opus to accommodate this model.
No official launch date has been announced. Given the security-first rollout strategy, a broad release is likely months away rather than weeks. But the fact that it's already in testing with real customers suggests it's closer than most people expected.