Exposure Remediation, Preemptive Security

The Day the Priority List Died

Barak Klinghofer May 4, 2026

How the Mythos era and Project Glasswing collapsed the logic of vulnerability prioritization, and what defenders need to do about it.

In April 2026, Anthropic published a model called Claude Mythos Preview, then declined to ship it.

That second part is the part most people skipped.

Mythos was withheld not for commercial reasons but for security ones. In controlled testing, it autonomously discovered zero-day vulnerabilities across every major operating system and browser, producing a working proof-of-concept on the first attempt 83.1% of the time. The UK’s AI Security Institute watched it complete a 32-step corporate network compromise in three out of ten unsupervised attempts. Against FreeBSD’s NFS server, it dredged up an unauthenticated remote code execution flaw that had been sitting in the codebase, undetected, for seventeen years. Then it built the exploit chain to weaponize it.

That model is not in attacker hands. The next version of it, or the open-weight equivalent six months behind it, will be.

This is what Anthropic’s Project Glasswing, the defensive coalition spun up in the same week alongside CrowdStrike, Zscaler, Microsoft, Palo Alto, AWS, and forty-plus critical-infrastructure organizations, exists to confront. The premise is simple and ugly: finding vulnerabilities is now table stakes for both sides. The new front line is what happens after the finding.

Which means a quiet, decade-old assumption has just stopped working.

The assumption was that prioritization saved you.

Prioritization was never about importance. It was about throughput.

Every vulnerability management program in the world is built on a polite fiction: that you can rank what matters, fix the top of the list, and call the rest acceptable risk.

The fiction worked because of arithmetic, not strategy. There were more findings than fixers. The attacker, like the defender, was a human being with a calendar. The median time from CVE disclosure to first observed exploitation in 2018 was seven hundred and seventy-one days. You had two years. So you triaged. You picked the loudest, the most exposed, the most CVSS-decorated, and you let the rest age in a Jira backlog like wine.

The CVSS score, the EPSS percentile, the catalogue of “known exploited” CVEs, the “critical / high / medium / low” rainbow on every dashboard ever built: these are not laws of physics. They are coping mechanisms. They exist because no security team on Earth could fix everything, so the industry agreed to build a queue and call the queue a strategy.

In the Mythos era, the queue is the strategy’s failure point.

What Mythos changed in one chart you don’t have

Run the numbers honestly.

In 2018, you had roughly 771 days between disclosure and exploitation. In 2024, it was somewhere between five and eight days for the average exploited flaw. In early 2026, the median exploitation window for a high-interest CVE is measured in hours. Sixty-seven percent of weaponized CVEs in 2026 were exploited on the day of disclosure or earlier.

In 2025, 48,000 CVEs were published, a 21% jump year-over-year. The 2026 trajectory points to 55,000. Adversaries no longer wait for those CVEs to be published. Mythos-class capability lets them look at a git commit hash and an obscure dependency tree and find the flaw before NIST has assigned it an identifier.

The asymmetry that prioritization was built on, defenders being faster at picking than attackers were at exploiting, has flipped. Attackers are faster at finding than defenders are at fixing.

Inside Reclaim’s customer base, we see it directly. The average enterprise we onboard arrives with a backlog of 18,400 open exposures classified as “critical” or “high” by their existing tools. Their security team’s actual fix throughput is around 240 items per month. At that rate, the backlog never closes. It compounds. By the time a “high” gets attention, two more “highs” have moved to the front of the queue, and an attacker armed with a publicly available coding agent has already had four months of opportunity.
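The backlog arithmetic is worth running once by hand. The sketch below uses the two figures quoted above (18,400 open items, 240 fixes per month). The inflow of 300 new items per month is an assumption made up for illustration, since no arrival rate is given here, and the function name is hypothetical.

```python
import math


def months_to_clear(backlog: int, fixes_per_month: int, new_per_month: int = 0) -> float:
    """Months needed to empty the backlog; math.inf if inflow matches or beats throughput."""
    net = fixes_per_month - new_per_month
    if net <= 0:
        return math.inf
    return backlog / net


# Figures quoted in the paragraph above.
BACKLOG = 18_400
FIXES_PER_MONTH = 240

# Best case: not a single new finding arrives while the team works.
best_case = months_to_clear(BACKLOG, FIXES_PER_MONTH)
print(f"{best_case:.1f} months (~{best_case / 12:.1f} years)")  # 76.7 months (~6.4 years)

# Assumed inflow of 300 new items per month (illustrative, not from the data above).
with_inflow = months_to_clear(BACKLOG, FIXES_PER_MONTH, new_per_month=300)
print(with_inflow)  # inf
```

Even the best case is more than six years. Once inflow matches or exceeds throughput, the queue has no finish line at all.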

This is the part the dashboards do not show. The list is not a list of what you will fix. It is a list of what you will not fix in time.

The “critical” label is now mostly a lie

Here is something the major scanners would prefer you not look at too closely.

Of the vulnerabilities they grade CVSS 7.0 or higher across a typical enterprise, only about 2.3% are ever observed under exploitation in the real world. Meanwhile, 28% of CVEs that are actually exploited carry only a “medium” CVSS rating. The grade you are using to triage is roughly uncorrelated with the threat you are facing.

In the Reclaim customer environments we have audited at first-touch over the last twelve months, the pattern is even more brutal. On average:

  • 61% of items flagged “critical” by the customer’s existing prioritization stack were already mitigated by another control somewhere in the environment, usually a network segment, a Defender policy, an Entra Conditional Access rule, or a runtime EDR enforcement that the scanner did not know about.
  • Only 14% of “critical” items represented a complete, end-to-end exploitable path to a business-relevant asset.
  • 23% of items the scanner ranked “low” or “informational” were sitting on a validated attack path to a crown-jewel system.
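One way to picture the gap between the scanner label and the environment is a triage rule that looks at context first and severity last. This is a minimal sketch, not a description of any vendor's logic: the `Finding` fields, the bucket names, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    finding_id: str
    scanner_severity: str            # label from the existing tool: "critical", "high", "low", ...
    mitigated_by_control: bool       # another control already blocks exploitation
    on_validated_attack_path: bool   # end-to-end path to a business-relevant asset


def triage(f: Finding) -> str:
    """Bucket a finding by environmental context; the scanner label is deliberately ignored."""
    if f.mitigated_by_control:
        return "already-mitigated"
    if f.on_validated_attack_path:
        return "fix-now"
    return "scheduled"


# Made-up sample findings covering the three patterns in the list above.
findings = [
    Finding("F-1", "critical", mitigated_by_control=True, on_validated_attack_path=False),
    Finding("F-2", "critical", mitigated_by_control=False, on_validated_attack_path=True),
    Finding("F-3", "low", mitigated_by_control=False, on_validated_attack_path=True),
    Finding("F-4", "high", mitigated_by_control=False, on_validated_attack_path=False),
]

buckets = {f.finding_id: triage(f) for f in findings}
print(buckets)
```

In this toy data, the "low" finding F-3 lands in the same bucket as the one "critical" that is genuinely exploitable, while the other "critical" drops out entirely. The hard part in practice is populating the two context fields accurately, which the sketch takes as given.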

The priority list, in other words, is wrong about what is dangerous, wrong about what is already handled, and wrong about what to fix first. It is loud where it should be quiet, and silent where it should be screaming.

This is not the scanner vendor’s fault. CVSS measures theoretical severity. EPSS measures statistical likelihood. KEV measures past exploitation. None of them measure your environment, your controls, or whether the gun is actually loaded. They cannot. They were never designed to.

An AI agent embedded inside your environment, reasoning over your controls and your data, can.

That is the work that has to be done now. And it is not the work Mythos was built to do.

What Glasswing actually proves, and what it does not

Project Glasswing’s most important contribution is not the coalition or the press cycle. It is a single, uncomfortable admission, made plain by Anthropic and now visible to anyone paying attention.

Finding vulnerabilities at machine speed is now possible. Both sides have it.

That is the entire point of the project. Anthropic ran Mythos defensively against critical open-source infrastructure and turned up thousands of latent flaws that had been hiding in plain sight for years. The same capability, in less coordinated hands, will turn up the same flaws in your production estate, your SaaS supply chain, and your vendors’ code. The asymmetry that defenders quietly relied on, that finding things at scale required talent, time, and budget, is gone.

Now notice carefully what Glasswing does not do.

It does not deploy a fix into your Microsoft 365 tenant. It does not roll out a Conditional Access policy that closes a real attack path without locking out your sales team. It does not validate that a hardening change will not break the macro-driven workbook the CFO opens every Monday morning. It does not own change windows, dependencies, rollbacks, ticket queues, or audit trails inside an actual enterprise.

Glasswing is upstream defense. It hardens the software the world runs on. It is necessary, and it is doing important work. But the moment a finding has to be turned into a fix inside your environment, on your stack, with your business constraints, Glasswing is finished. That work is yours.

This is the part the new AI-for-security narrative keeps glossing over. A general-purpose offensive model can find a flaw in any codebase. It cannot remediate it inside your enterprise. It does not know your controls, your users, your change windows, or which of your business processes will fall over if it changes a setting. Mythos-class capability is a flashlight in a forest. It is not a fire crew inside your house.

The bottleneck has moved. It used to live at find. It now lives at fix, inside your environment, at machine speed, without breaking the business.

That is a different problem. It needs a different agent.

This is the moment the security industry has been postponing for a decade. Visibility was a comfortable place to stand. You could buy more visibility, get a bigger dashboard, write a longer report, and tell a CFO that you were “on top of risk.” Visibility is now necessary and laughably insufficient.

In the Mythos era, the only meaningful security metric is the distance between “exposure exists” and “exposure is gone.”

Every other number on every other QBR slide is downstream of that distance.

If your stack reduces exposure, your stack is doing security. If your stack adds another column to a spreadsheet, your stack is doing inventory.
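That distance is straightforward to measure if you record when each exposure appeared and when it was closed. A minimal sketch, assuming a hypothetical record format of `(opened_at, closed_at)` pairs with `None` for anything still open; the timestamps are invented for illustration.

```python
from datetime import datetime, timedelta
from statistics import median


def exposure_windows(records):
    """Yield closed_at - opened_at for every exposure that has been remediated."""
    for opened_at, closed_at in records:
        if closed_at is not None:
            yield closed_at - opened_at


# Invented sample data: (opened_at, closed_at); None means still open.
records = [
    (datetime(2026, 1, 1, 9, 0), datetime(2026, 1, 1, 12, 0)),  # closed after 3 hours
    (datetime(2026, 1, 2, 9, 0), datetime(2026, 1, 4, 9, 0)),   # closed after 48 hours
    (datetime(2026, 1, 3, 9, 0), datetime(2026, 1, 3, 10, 0)),  # closed after 1 hour
    (datetime(2026, 1, 5, 9, 0), None),                         # still open
]

median_window = median(exposure_windows(records))
still_open = sum(1 for _, closed_at in records if closed_at is None)

print(median_window)  # 3:00:00
print(still_open)     # 1
```

Note that a median over closed items alone flatters the program, because the oldest open exposures never enter it. Reporting the open count, or the age of the oldest open item, alongside the median keeps the number honest.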

What AI-speed remediation actually looks like

This is where the conversation usually breaks down, because “AI-speed remediation” is not a feature you bolt onto a vulnerability scanner. It is a different operating model.

It has four properties that matter, and a vendor either has all four or it does not get to use the phrase:

1. The agent does the work, not just the recommending. First-generation security AI assistants were polite. They suggested. They pasted commands into a chat window for a human to paste into a console. That is not remediation. That is a cover letter. An AI Security Engineer participates in operations the way a human engineer would: discovers exposures across endpoint, identity, browser, OS, email, and cloud, plans the remediation, executes the change, and closes the loop. With agency, not advisories.

2. The plan is business-aware before the change is made. This is the part most security automation projects break on. A “fix” that breaks the finance team’s macro-enabled spreadsheet at quarter close is not a fix. It is an outage with extra steps. Reclaim’s PIPE™ technology runs predictive impact analysis against actual user behavior in your environment, before any change is committed. The agent simulates the consequences and adjusts the remediation plan to land in your environment without a ticket queue blooming behind it.

3. The change is reversible by design. AI-speed remediation requires AI-speed rollback. Every change is sequenced, monitored, and unwindable. This is not a “trust me, it’ll be fine” operating model. It is a “deploy in production, watch the telemetry, revert in seconds if anything moves the wrong direction” operating model. Without that property, no security team will let the agent touch their environment, and they would be right not to.

4. It runs on the stack you already own. No new agents. No new endpoint to deploy. No new SOC tab. Reclaim sits on top of the EDR, IdP, email security, M365, Workspace, and cloud telemetry the customer has already paid for, and makes those tools deliver. In Mythos-era economics, the last thing any CISO needs is a thirteenth tool that finds more things they cannot fix.
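Properties 2 and 3 can be sketched as a single control loop: check predicted impact before touching anything, apply the change, watch telemetry, and revert if it regresses. This is an illustrative skeleton under stated assumptions, not Reclaim's implementation. The `Change` hooks, the `max_impact` parameter, and the in-memory `state` dictionary are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Change:
    """One remediation step. Every field is a hypothetical hook, not a real API."""
    name: str
    predict_impact: Callable[[], float]  # estimated fraction of users disrupted
    apply: Callable[[], None]            # makes the change
    healthy: Callable[[], bool]          # post-change telemetry check
    rollback: Callable[[], None]         # undoes the change


def remediate(change: Change, max_impact: float = 0.0) -> str:
    """Gate on predicted impact, apply, verify, and revert on bad telemetry."""
    if change.predict_impact() > max_impact:
        return "needs-review"  # nothing was touched
    change.apply()
    if not change.healthy():
        change.rollback()
        return "rolled-back"
    return "applied"


# Toy environment: a single setting held in memory.
state = {"legacy_auth_blocked": False}


def block() -> None:
    state["legacy_auth_blocked"] = True


def unblock() -> None:
    state["legacy_auth_blocked"] = False


clean = Change("block-legacy-auth", lambda: 0.0, block, lambda: True, unblock)
result_clean = remediate(clean)
state_after_clean = state["legacy_auth_blocked"]

unblock()  # reset the toy environment
regressing = Change("block-legacy-auth", lambda: 0.0, block, lambda: False, unblock)
result_regressing = remediate(regressing)
state_after_regressing = state["legacy_auth_blocked"]

risky = Change("block-legacy-auth", lambda: 0.2, block, lambda: True, unblock)
result_risky = remediate(risky)

print(result_clean, result_regressing, result_risky)  # applied rolled-back needs-review
```

The design point is that `rollback` is a required field, so a change without an undo path cannot be constructed in the first place. A real system would also need timeouts, partial-failure handling, and an audit trail, none of which appear here.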

That is the difference between a finding factory and an AI Security Engineer.

The numbers from our customer base

Across Reclaim production deployments through Q1 2026:

  • 92% of remediations executed without human intervention. The remaining 8% required engineer review, almost always for change-window or business-context reasons rather than safety ones.
  • Mean time-from-exposure-to-fix dropped from 41 days to 3.4 hours across the customer cohort within the first 90 days of deployment. For exposures sitting on validated attack paths, that number is under 30 minutes.
  • Zero production-impacting changes across more than 60,000 automated remediations in the last twelve months. PIPE™-validated change is the reason.
  • 3.1x effective ROI on the existing security stack. Customers retired or deprioritized an average of 2.4 overlapping tools per environment after deploying Reclaim, redirecting that budget toward problems the existing stack was not solving.
  • A 78% reduction in priority-list noise. When PIPE™ applies environmental context (controls already in place, real exploitability, business impact), the average customer’s “critical and high” backlog collapses by roughly four-fifths. The remaining items are the ones that actually matter, and the agent fixes them while you read the report.
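Two of those figures are easier to weigh as plain arithmetic. The sketch below converts the time-to-fix change into a speed-up factor and applies the 78% noise reduction to the 18,400-item average onboarding backlog cited earlier. Combining those two averages is an assumption made for illustration; no single customer is claimed to match both.

```python
HOURS_PER_DAY = 24

# Mean time-from-exposure-to-fix, before and after, from the list above.
before_hours = 41 * HOURS_PER_DAY  # 41 days = 984 hours
after_hours = 3.4
speedup = before_hours / after_hours

# Assumption: apply the average 78% reduction to the average onboarding backlog.
backlog = 18_400
noise_reduction = 0.78
remaining = round(backlog * (1 - noise_reduction))

print(f"speed-up: ~{speedup:.0f}x")   # speed-up: ~289x
print(f"remaining items: {remaining}")  # remaining items: 4048
```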

Two customer stories make the math concrete.

A multinational manufacturer arrived at Reclaim with a backlog of tens of thousands of open exposures and a small security team whose monthly fix throughput was a tiny fraction of what the queue demanded. Within a couple of months, the agent had autonomously remediated the overwhelming majority of those items and re-classified a meaningful portion as already-mitigated by existing controls. The team is now spending its time on threat modeling and architecture review. The board is no longer being shown a backlog chart that goes up and to the right.

A SaaS platform with a high-velocity engineering culture arrived with a different problem: every misconfiguration fix they tried to ship through their ITSM process took weeks on average, because change approval was the bottleneck. With PIPE™ pre-validating impact, change approval moved from a manual review to a confidence threshold. Their median fix time collapsed to a small fraction of an hour, and their security drift incidents fell sharply over a single quarter.

Neither customer added headcount. Neither customer bought new tools. Neither customer accepted the trade between speed and safety.

What this means for the next twelve months

A simple test, for any security leader reading this:

If a Mythos-class capability becomes available to a moderately funded threat actor in the next eighteen months, and your environment is presented to it cold, what is your time-to-fix on the resulting findings? Not your time-to-detect. Not your time-to-prioritize. Your time-to-fix.

If the honest answer is measured in days or weeks, your program is operating on pre-2026 physics. The finding will be exploited before it leaves your queue.

If the answer is measured in hours, you have either already adopted AI-speed remediation, or you are about to.

The category formerly known as “exposure management” is in the process of splitting into two halves. One half will keep building better lists. They will have prettier dashboards, more integrations, more natural-language interfaces, and the same fundamental product, which is a report. The other half will fix things.

Reclaim is in the second half. We were already in the second half before Mythos. The model just made the rest of the industry’s positioning work for us.

The closing line

Mythos did not raise the stakes of finding faster. It ended the relevance of finding faster.

Anthropic’s research, Glasswing’s coalition, and the math of any honest 2026 backlog all point at the same conclusion: visibility without remediation is noise, and waiting is now expensive in ways that are easy to measure.

The priority list is not coming back. The Mythos era will not slow down to wait for it. The teams that move now to AI-speed remediation will spend the next twelve months retiring backlogs, retiring tools, and retiring the meeting where someone reads the top ten items off a slide.

The teams that don’t will spend it explaining the breach.
