
Claude Found 22 Firefox Security Flaws in Two Weeks


Security researchers typically spend months hunting for vulnerabilities in hardened software. Claude did it in days.

In a collaboration between Anthropic and Mozilla, Claude Opus 4.6 identified 22 vulnerabilities in Firefox over a two-week period. Mozilla classified 14 of them as high severity, accounting for nearly a fifth of all high-severity Firefox vulnerabilities fixed in 2025. Fixes shipped to hundreds of millions of Firefox users in version 148.0.

That is not a minor footnote. Firefox is one of the most rigorously tested open-source projects in existence. Finding novel security flaws in it is hard, by design. The fact that an AI model did it this quickly signals something meaningful about where software security is heading.


How It Started: A Benchmark That Got Outgrown

The project began when Anthropic noticed that an earlier version of Claude was coming close to maxing out CyberGym, a benchmark designed to test whether AI models can reproduce known security vulnerabilities. Rather than simply raising the score ceiling, the team wanted a harder, more realistic challenge. They chose Firefox.

The approach started conservatively. Researchers first asked Claude to find previously documented vulnerabilities in older versions of Firefox’s codebase, essentially checking whether the model could reproduce known CVEs. It could, at a rate that surprised the team. But there was a reasonable caveat: some of those historical vulnerabilities may have appeared in Claude’s training data, which would inflate the results. The real test was finding something new.

So Anthropic pointed Claude at the current version of Firefox with a clear objective: find vulnerabilities that had never been reported before.

Twenty Minutes to the First Find

Claude started with Firefox’s JavaScript engine, a logical entry point. It handles untrusted code every time a user loads a webpage, which makes it a high-value target and a well-defended one. Within twenty minutes, Claude flagged a use-after-free vulnerability, a class of memory flaw in which a program keeps using memory after it has been freed, letting an attacker reclaim that memory and replace its contents with malicious data. Three Anthropic researchers independently confirmed it. A bug report, including a candidate patch written by Claude, went to Mozilla’s issue tracker shortly after.
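The mechanics of a use-after-free can be sketched with a toy allocator. This is a conceptual Python model, not Firefox code: real use-after-free bugs arise in C/C++ memory management, but the slot-reuse pattern is the same.

```python
# Toy model of a use-after-free: a tiny "heap" that reuses freed slots,
# mirroring how a real allocator can hand an attacker the memory behind
# a dangling pointer. Purely illustrative.

class ToyHeap:
    def __init__(self):
        self.slots = []        # backing memory
        self.free_list = []    # indices of freed slots, reused first

    def alloc(self, value):
        if self.free_list:               # reuse a freed slot, as malloc often does
            idx = self.free_list.pop()
            self.slots[idx] = value
        else:
            idx = len(self.slots)
            self.slots.append(value)
        return idx                       # our stand-in for a pointer

    def free(self, idx):
        self.free_list.append(idx)       # slot becomes reusable; contents linger

heap = ToyHeap()
p = heap.alloc("trusted data")      # victim object
heap.free(p)                        # p is now a dangling "pointer"
q = heap.alloc("attacker data")     # allocator hands back the same slot
assert p == q                       # two "pointers" to one piece of memory
print(heap.slots[p])                # use-after-free: prints "attacker data"
```

The program still dereferences `p` after freeing it, but the memory now holds whatever the second allocation put there. In a browser, that second allocation can be attacker-controlled, which is what makes this bug class so dangerous.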

By the time that first submission was filed, Claude had already surfaced fifty additional unique crashing inputs.

Mozilla responded quickly. After a technical exchange, the team there encouraged Anthropic to submit findings in bulk rather than manually validating each one. By the end of the effort, Anthropic had scanned nearly 6,000 C++ files and submitted 112 unique reports. Most have already been patched in Firefox 148, with the rest scheduled for upcoming releases.


Finding Bugs Is One Thing. Exploiting Them Is Another.

Anthropic did not stop at discovery. To better understand the full scope of Claude’s capabilities, the team ran a separate evaluation asking the model to go further: not just find the vulnerabilities, but build working exploits from them.

An exploit, in practical terms, is the tool an attacker would actually use. To prove it had succeeded, Claude was required to demonstrate a real attack by reading and writing a file on a target system. The team ran this test several hundred times across different starting conditions, spending roughly $4,000 in API credits.

Claude succeeded in two cases. That number matters in both directions. It confirms that the gap between finding a flaw and weaponizing it is still wide, which is good news for defenders. But it also confirms the gap is not absolute. Claude produced functional, if crude, browser exploits automatically.

The important qualifier is that those exploits only worked in a stripped-down test environment that removed Firefox’s sandbox and other real-world security layers. In a standard browser, those defenses would have blocked the attacks. Still, the ability to auto-generate even a partial exploit, one component of what a more complete attack chain would require, is a development worth taking seriously.


What This Means for Software Security Going Forward

The practical upshot of this research is less about any individual vulnerability and more about the pace at which AI can now operate across an entire codebase.

A few things made this work well in practice. Anthropic’s team leaned heavily on what they call task verifiers: tools that give the AI real-time feedback as it explores code, letting it confirm whether a potential vulnerability actually triggers a crash and whether a proposed fix resolves it without breaking anything else. Rather than generating outputs and hoping for the best, Claude was iterating with immediate feedback. That loop is what separates useful AI-assisted security research from noise.
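The core of that verification loop can be sketched in a few lines. This is a hypothetical illustration of the idea, not Anthropic's actual tooling: rather than trusting a claim that an input crashes the target, the harness re-runs the target on that input and checks the exit status. The toy target script and all names here are invented for the example.

```python
import subprocess
import sys
import tempfile

# Sketch of a crash verifier: re-run the target on a candidate input and
# report whether the crash actually reproduces.

def triggers_crash(target_cmd, input_bytes):
    """Return True if the target exits abnormally on this input."""
    result = subprocess.run(
        target_cmd,
        input=input_bytes,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # A non-zero return code (negative means killed by a signal on POSIX)
    # counts as a reproduced crash for this sketch.
    return result.returncode != 0

# Toy stand-in for the real target: "crashes" when the input contains 0xff.
toy_target = (
    "import sys\n"
    "data = sys.stdin.buffer.read()\n"
    "sys.exit(134 if b'\\xff' in data else 0)\n"
)

with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write(toy_target)
    target_path = f.name

cmd = [sys.executable, target_path]
print(triggers_crash(cmd, b"benign input"))   # False: runs cleanly
print(triggers_crash(cmd, b"bad \xff byte"))  # True: crash reproduced
```

The same pattern extends to patch validation: apply the candidate fix, confirm the crashing input no longer crashes, and confirm the existing test suite still passes. That closed loop is what turns raw model output into verified findings.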

On the submission side, Mozilla emphasized three things that made Anthropic’s bug reports trustworthy and actionable: minimal test cases, detailed proofs of concept, and candidate patches. That framework is worth adopting more broadly as AI-assisted vulnerability research becomes more common.

Anthropic has since published its Coordinated Vulnerability Disclosure principles, outlining how it will work with maintainers when AI surfaces new findings. The same approach has already extended beyond Firefox: Claude Opus 4.6 has also been used to find vulnerabilities in the Linux kernel, with more disclosures planned.

The Window Is Open. It Will Not Stay Open Forever.

Right now, Claude is significantly better at finding and flagging vulnerabilities than it is at exploiting them. That asymmetry favors defenders. Security teams and open-source maintainers have a real opportunity to use this technology to get ahead of attackers rather than respond to them.

The honest caveat is that this window is likely temporary. AI capabilities in vulnerability research are improving quickly, and the gap between discovery and exploitation will narrow. When it does, the calculus around safeguards and responsible disclosure will need to evolve alongside it.

For now, the message to developers is straightforward: the tools to find serious flaws in your software have gotten dramatically more powerful. Claude Code Security, currently in limited research preview, is bringing these capabilities directly to developers and open-source maintainers. The opportunity to use them proactively, before someone else does, is here.
