devtake.dev

Anthropic's Glasswing logged 10,000 vulnerabilities in a month. Most are still waiting on a patch.

Anthropic says Project Glasswing's first month produced over 10,000 critical-and-high-severity vulns. Verification and patching is the limiting step.

Dieter Morelli · · 4 min read · 2 sources
Anthropic Project Glasswing announcement card with glasswing butterfly motif.
Image: Anthropic · Source

Anthropic posted a first-month status report for Project Glasswing on May 22. Its roughly 50 partners surfaced more than 10,000 high or critical-severity vulnerabilities by pointing Claude Mythos Preview at their codebases, and only about 2,100 have shipped a fix.

Glasswing is Anthropic’s restricted cyber-defense program, the one where hand-picked organizations get gated access to Mythos. Mythos is the unreleased frontier model Anthropic has held back from public release because it argues the model’s offensive cyber capability is too dangerous to ship broadly. Glasswing is the narrow door, an invite-only set of companies running the model defensively under contract.

The May 22 post is the umbrella view: what Glasswing looks like when you stack every partner’s haul into one number. The first public chapter was Mozilla, whose Firefox 150 shipped with 271 Mythos-found fixes in April. The May report makes the bigger argument out loud, in Anthropic’s own words: “Progress on software security used to be limited by how quickly we could find new vulnerabilities. Now it’s limited by how quickly we can verify, disclose, and patch.”

What the numbers say

A few datapoints from the report stand out.

  • 10,000+ high or critical-severity vulnerabilities surfaced across partners in a single month.
  • Cloudflare produced about 2,000 issues, of which roughly 400 were rated critical or high.
  • A scan of more than 1,000 open-source projects estimates 6,202 unfixed critical or high-severity vulnerabilities exist in widely-used code today.
  • Of the 1,752 assessed findings Anthropic audited end-to-end, 90.6% validated as real (1,587 true positives). The remainder were noise.
  • 2,100 vulnerabilities were patched through Claude Security in three weeks, with an average two-week patch turnaround on confirmed critical bugs.

The true-positive rate is the number that does the work. A 90.6% precision against critical-severity claims is well above what security teams typically see from automated scanners, and it’s the number that makes the rest of the report load-bearing instead of inflationary. The volume isn’t worth much if 80% of the queue is junk; at 9 in 10, the queue is real.

The bottleneck shifted

The report’s substantive argument is that the security industry now has a glut of confirmed bugs and a shortage of people to fix them. Anthropic frames this as a positive: Mythos is a force multiplier on discovery, so the next investment should be the patch pipeline. The report cites a two-week median time-to-fix on confirmed critical issues, which is fast by industry standards. But with 6,202 estimated unpatched critical bugs sitting in 1,000+ open-source projects, the math says the pipeline is the gating factor for years to come.

That framing is convenient for Anthropic. The pitch is “the discovery problem is solved, now buy the patching product.” Glasswing’s own Claude Security service handled the 2,100 fixes Anthropic mentions, which gives Anthropic a commercial product on both sides of the bug pipeline. That’s the same pattern as Mythos itself: the model that finds the threat is the model that sells the defense.

It also leaves a question the post sidesteps. Glasswing partners are running Mythos under contract. The model is still restricted from public release because Anthropic argues its offensive cyber capabilities are too dangerous for general access. If a 90.6%-precision vulnerability scanner is sitting inside Cloudflare, Mozilla, and 48 other partners, the same model is, by definition, also sitting inside Anthropic. The asymmetry is the program. So is the risk.

What this means for you

If you maintain an open-source project of any meaningful scale, assume your code is being scanned by Glasswing partners with or without your involvement. The 90-day coordinated disclosure clock is the same as everyone else’s; the difference is that the queue feeding it grew 10x. Build your triage and patching workflow on the assumption that disclosures will arrive in batches of dozens, not ones and twos.

If you run security at a large org, ignore the AI-vendor framing and look at the precision number. A scanner that’s right 9 times out of 10 on critical-severity claims is the closest the field has gotten to a real signal-to-noise turnaround in years, even if it’s gated behind a partner agreement. The right question to ask Anthropic now isn’t whether Mythos can find bugs. It’s what the Glasswing application looks like, and how long the waitlist is.

Share this article

Sources

Mentioned in this article