
Anthropic Withholds Security-Focused AI Agent After Breach Tests
Anthropic restricts Claude Mythos Preview after autonomous discovery of zero-day vulnerabilities. Project Glasswing provides controlled access to infrastructure partners only.
Anthropic is keeping its most advanced AI agent strictly under wraps after the model demonstrated unprecedented cybersecurity capabilities — autonomously discovering thousands of zero-day vulnerabilities across major operating systems and web browsers. Rather than release Claude Mythos Preview publicly, the company launched Project Glasswing, providing controlled access only to critical infrastructure providers.
The decision signals a new paradigm for frontier AI labs. When autonomous agents can independently exploit decades-old security flaws, traditional deployment strategies become insufficient.
Emergent Security Capabilities Drive Restricted Deployment
Mythos Preview wasn't explicitly trained for cybersecurity work. According to Anthropic, these capabilities emerged from general improvements in code reasoning and autonomous task execution. The same architectural advances that make the agent better at patching vulnerabilities also make it exceptionally skilled at exploiting them.
The model has effectively saturated existing security benchmarks, forcing Anthropic to test it against real-world zero-day discovery. Results demonstrate concerning capabilities:
- OpenBSD vulnerability — identified a 27-year-old bug in an OS known for robust security
- FreeBSD remote execution — fully autonomous discovery and exploitation of CVE-2026-4747, a 17-year-old flaw allowing complete server compromise
- Multi-vulnerability chaining — combining 3-5 separate flaws into sophisticated exploitation chains
Nicholas Carlini from Anthropic's research team noted finding more bugs in recent weeks than in his entire previous career combined. The agent operates without human intervention after initial prompting.
Project Glasswing Partnership Structure
Anthropic is committing $100 million in usage credits for Mythos Preview access plus $4 million in direct funding to open-source security organizations. The controlled deployment targets two distinct user groups.
Core Infrastructure Partners
Launch partners include organizations maintaining critical internet infrastructure:
- Cloud providers — Amazon Web Services, Google, Microsoft
- Security vendors — CrowdStrike, Palo Alto Networks
- Hardware manufacturers — Nvidia, Broadcom, Apple
- Enterprise users — JPMorgan Chase, Cisco
- Open source foundations — Linux Foundation
Extended Access Program
Beyond the core group, Anthropic has granted access to over 40 additional organizations building or maintaining critical software infrastructure. This includes $2.5 million donated to Alpha-Omega and OpenSSF through the Linux Foundation, plus $1.5 million to the Apache Software Foundation.
Jim Zemlin, CEO of the Linux Foundation, emphasized the democratizing potential: open-source maintainers whose software underpins global infrastructure historically lacked access to enterprise-grade security expertise.
Government Briefings and National Security Implications
Anthropic has privately briefed senior US government officials on Mythos Preview's full capabilities. The intelligence community is actively evaluating how such models could reshape both offensive and defensive cyber operations.
This isn't theoretical speculation. Anthropic previously documented what it describes as the first confirmed cyberattack largely executed by AI agents — a Chinese state-sponsored group using autonomous agents to infiltrate approximately 30 global targets, with AI handling the majority of tactical operations independently.
Newton Cheng, Anthropic's Frontier Red Team Cyber Lead, stated the company has no plans to make Mythos Preview generally available due to its cybersecurity capabilities.
Competitive Landscape and Future Deployment
The controlled deployment approach is becoming standard for frontier AI capabilities. When OpenAI released GPT-5.3-Codex in February, the company classified it as high-capability for cybersecurity tasks under its Preparedness Framework. Anthropic's Project Glasswing reinforces that frontier labs are converging on controlled deployment rather than open release for models at this capability level.
Anthropic plans to eventually deploy Mythos-class models at scale, but only after establishing new safeguards. The company intends to test these safeguards first with an upcoming Claude Opus model that doesn't pose the same risk level as Mythos Preview.
Open Questions
Whether controlled deployment standards will hold as these capabilities spread remains uncertain:
- International coordination — no global framework governs autonomous agent cybersecurity capabilities
- Capability proliferation — smaller labs may lack infrastructure for controlled deployment
- Economic pressures — competitive dynamics could incentivize broader release
Why This Matters
Project Glasswing represents the first major test of how the AI industry handles autonomous agents with dual-use cybersecurity capabilities. Anthropic's decision to restrict access rather than pursue general deployment could establish precedent for future frontier models.
For builders and enterprises, this signals that the most capable autonomous agents will likely remain behind controlled access programs rather than public APIs. The infrastructure partnership model may become the primary distribution channel for agents with sensitive capabilities, fundamentally altering how developers access cutting-edge AI tools.