What Claude Mythos Revealed About Legacy System Security Vulnerabilities

TL;DR

  • Claude Mythos doesn’t scan for vulnerabilities. It reasons over code. It reads source code, forms hypotheses about exploitable conditions, and produces working proof-of-concept exploits, finding flaws that survived decades of human review and automated scanning.

  • It demonstrated this on the hardest possible target. Mythos found vulnerabilities in classified US government systems within hours. Legacy enterprise systems running on end-of-life frameworks have none of the protections those systems had.

  • EOL stacks carry permanent exposure. VB6, AngularJS 1.x, and .NET Framework 4.x hold known CVEs with no available patch. NIST SP 800-53, PCI-DSS v4.0, and HIPAA Security Rule each treat this as a direct control failure.

  • ~66,000 CVEs are projected for 2026, running 46% above the February forecast, with AI-assisted discovery as the primary structural driver.

  • There are things you can map yourself before engaging anyone. A self-audit of your framework versions, an SCA scan, and a review of your last audit findings sorted by remediation path will tell you what you are actually carrying.

Introduction

Senator Mark Warner stood up in a Senate Banking Committee hearing on June 11 and cited the head of the NSA: Anthropic’s Claude Mythos had broken into almost all of the government’s classified systems. Not over weeks. Hours [1].

It was the first publicly documented instance of AI breaking into classified systems at scale. The testing ran under Project Glasswing, Anthropic’s controlled security program, and Mythos identified those vulnerabilities rather than exploiting them in an unauthorized attack. What is now referred to as the Mythos classified systems findings settled the question of what this class of capability can actually do, independent of how the event itself gets classified.

The dispute between Anthropic and the government has its own coverage. What has not been examined closely is what that finding means for every enterprise running software that was never designed for this level of scrutiny.

That is what this piece is about.

How Claude Mythos and AI-Powered Vulnerability Discovery Actually Work

Traditional vulnerability scanners work from catalogues. They hold a library of known CVE signatures, version-to-vulnerability mappings, and pattern-match logic. Point a scanner at your environment and it answers a bounded question: does a known vulnerability category exist here? The output is a list of matches. Everything the catalogue does not cover is invisible.
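The catalogue model described above can be sketched in a few lines. This is a minimal illustration, not how any particular scanner is implemented: the `CATALOGUE` contents, component names, and CVE identifiers are placeholders invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    version: str


# Hypothetical catalogue: (component, version) -> known CVE identifiers.
# The identifiers below are placeholders, not real CVEs.
CATALOGUE = {
    ("examplelib", "1.2.0"): ["CVE-0000-0001"],
    ("oldframework", "3.1"): ["CVE-0000-0002", "CVE-0000-0003"],
}


def scan(inventory):
    """Return catalogue matches per component; no match means no finding."""
    return {
        c: CATALOGUE[(c.name, c.version)]
        for c in inventory
        if (c.name, c.version) in CATALOGUE
    }


inventory = [
    Component("examplelib", "1.2.0"),
    Component("inhouse-billing", "7.4"),  # custom code: absent from the catalogue
]

findings = scan(inventory)
for component, cves in findings.items():
    print(component.name, component.version, cves)
```

The in-house component produces no output at all. That is not a clean result. The scanner simply has no entry to match against, which is the blind spot the paragraph above describes.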

Mythos does not work from a catalogue. Anthropic’s red team documentation describes the actual process: the model reads source code, ranks files by vulnerability likelihood, forms hypotheses about exploitable conditions, runs the software, uses debuggers as needed, and produces a bug report with a working proof-of-concept exploit [2].

There is no signature library it is querying. It reads code the way a senior security researcher reads code, reasoning about intent, tracing execution paths, thinking through what happens when inputs go wrong.

Security firm Horizon3.ai put the distinction plainly in their post-announcement analysis: a scanner asks whether a vulnerability category exists; a Mythos-class system asks whether a specific code path can be weaponized, and exactly how [3]. These are different questions at a different level of analysis, not the same question answered at greater speed.

Mythos identified a 17-year-old remote code execution vulnerability in FreeBSD’s NFS server that allowed unauthenticated root access from anywhere on the internet. That vulnerability survived every scanner and every human audit that had examined that code [2].

A 16-year-old flaw in FFmpeg’s H.264 codec, rooted in code from a 2003 commit and made exploitable by a 2010 refactor, went undetected by every fuzzer and every human reviewer before Mythos found it [4].

Anthropic noted that these security capabilities were not explicitly trained into the model. They emerged as a downstream consequence of general improvements in code understanding, reasoning, and autonomy [2].

Machine learning vulnerability detection at this level was not engineered deliberately. It arrived as a side effect.

| Dimension | Traditional Vulnerability Scanners | AI-Powered Vulnerability Discovery |
|---|---|---|
| What it searches | Known CVE catalogue and signature library | Source code structure, execution paths, logic patterns |
| What it finds | Documented, catalogued vulnerabilities | Catalogued CVEs plus zero-days outside the catalogue |
| Documentation required | Partial; helps scope the scan | None; model reconstructs understanding from code |
| Source access required | Often not; runs against binaries or network | Source preferred; binary analysis also viable |
| Exploit verification | No; flags potential vulnerabilities | Yes; produces working proof-of-concept exploits |
| Chaining capability | None | Chains multiple flaws into end-to-end attack paths |

Prior generations of AI-powered vulnerability scanning were still bounded by a signature ceiling. Mythos is not.

One detail matters specifically for the legacy context.

AI code analysis and security reasoning do not require source code access to function. Current security-focused models can examine compiled software without source code, reasoning backwards from machine code to identify exploitable conditions.

The assumption that keeping source code internal provides structural protection no longer holds.

Figure: Scanner vs AI Discovery workflow

Why Legacy Systems Are the Softest Target in This Environment

The classified systems Mythos broke into were hardened. Active monitoring, modern security teams, access controls, supported infrastructure. They still went down in hours [1].

The protection legacy systems have had was never really security. It was friction.

Tightly coupled logic, undocumented modules, tribal knowledge held by engineers who left years ago. Friction made manual analysis expensive. It made scanning less effective because the attack surface was harder to map. For most attackers, most of the time, the effort required to analyze a deeply complex legacy system simply was not worth it.

AI-powered discovery does not experience friction. It reads the code. Complexity means more surface area, not less of it.

This matters at scale. FIRST’s mid-year forecast now projects approximately 66,000 CVEs for 2026, running 46.3% above the February forecast [5]. AI-assisted discovery is the primary structural driver.
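The two figures quoted here also imply what the February forecast must have been, which this article does not state directly. The arithmetic below is a rough back-calculation from rounded inputs, so treat the result as approximate rather than as FIRST’s published number.

```python
def implied_baseline(current_forecast: float, pct_above: float) -> float:
    """Back out the earlier forecast from the current one and the percentage increase."""
    return current_forecast / (1 + pct_above / 100)


# Inputs as quoted in the article: ~66,000 CVEs, 46.3% above the February forecast.
february_forecast = implied_baseline(66_000, 46.3)
print(round(february_forecast))
```

The result is in the region of 45,000 CVEs, meaning the mid-year revision added roughly 21,000 to the projection within a few months.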

Mozilla alone saw 271 bugs found and fixed in Firefox 150 through Project Glasswing, Anthropic’s partner program. These were bugs in maintained, modern software with active security teams. The same capability pointed at software that stopped receiving patches years ago finds an estate with no repair path.

The timing dimension is the part that makes this operationally urgent. The interval between a vulnerability being discovered and being weaponized has compressed from months to days over the past decade. In a growing share of cases, attackers now exploit on or before the day of public disclosure.

For systems that cannot be patched because the vendor no longer issues patches, there is no window to act through. The exposure does not resolve. It accumulates.

Compliance frameworks reflect this directly.

  • NIST SP 800-53 SA-22 addresses Unsupported System Components as an explicit control requirement.
  • SI-2 covers Flaw Remediation, which an EOL framework cannot satisfy by definition.
  • PCI-DSS v4.0 explicitly bars unsupported software and weak cryptography.
  • HIPAA Security Rule enforcement has increasingly treated legacy technology as a structural gap rather than a documentation deficiency.

Running an EOL stack is an enumerable finding under each of these frameworks, not an interpretation question. For a detailed breakdown of how these controls apply to specific legacy technologies, see the legacy software compliance risk guide.

Your legacy estate carries CVEs that no patch will ever close. A Legacyleap Security Posture Assessment maps your specific framework versions against known vulnerabilities, separates compliance risk from active security exposure, and runs entirely inside your infrastructure. Claim yours free.

VB6 Security Vulnerabilities, Unpatched CVEs, and the EOL Stack Problem

The EOL stack problem has a concrete shape. There are specific frameworks, specific CVEs, and specific remediation gaps that define the exposure for the most common legacy estates.

| Stack | EOL Status | Known Unpatched CVE Exposure | Remediation Path |
|---|---|---|---|
| VB6 Runtime | EOL since 2008 | No security patches in 15+ years; structural vulnerabilities predating the CVE catalogue | Modernization only |
| AngularJS 1.x | EOL December 2021 | No CVE patches issued since EOL. Three high-severity CVEs disclosed in 2025 (CVE-2025-59052, CVE-2025-66035, CVE-2025-66412) affect Angular, the modern successor framework; patches are available only for Angular v19+. AngularJS 1.x will never receive patches for any CVE. | Modernization only |
| .NET Framework 4.x | Extended support ended; no new security patches | Known CVEs with no upstream fix; Windows Server dependency compounds exposure | Upgrade to a currently supported .NET release (.NET 8 or later) |
| Java EE / EJB | EOL varies by vendor and version | Vendor-dependent; many configurations carry unpatched CVEs across EJB and JSP layers | Migration to Spring Boot or Jakarta EE |
| Apache Struts 1.x | EOL since 2013 | Multiple high-severity CVEs with no vendor remedy | Migration to Spring Boot or Spring MVC |

Unpatched CVE exposure in legacy software does not hold steady. Every new disclosure against an EOL framework adds to it, and no future patch is coming for any of these stacks. A scanner run next year will surface the same CVEs as a scanner run today, plus any new ones discovered in the interim. The finding does not resolve. It compounds.
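A toy model makes the compounding visible. The yearly disclosure counts below are invented for illustration, and the supported-stack case assumes every finding is patched before the next scan, which is optimistic.

```python
from itertools import accumulate

# Illustrative only: new CVEs disclosed against a stack in each yearly scan.
new_per_year = [4, 2, 5, 3]

# Supported stack (assumption): each year's findings are patched before the next scan,
# so only that year's new disclosures are open at scan time.
open_supported = list(new_per_year)

# EOL stack: nothing is ever patched, so each scan repeats every earlier finding.
open_eol = list(accumulate(new_per_year))

print("supported:", open_supported)
print("EOL:      ", open_eol)
```

On the supported stack the open count tracks each year’s disclosures. On the EOL stack it is a running total that can only grow.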

The only remediation path for EOL stacks is modernization. Patching cannot close a CVE on a framework whose vendor has stopped issuing patches. Compensating controls can reduce exposure in the short term and may sustain one renewal cycle with a cyber insurer or auditor, but they do not close the structural vulnerability.

As AI-powered discovery accelerates the rate at which legacy CVEs are found and the rate at which findings surface in audits, the window in which compensating controls remain a defensible position is narrowing. For a structured approach to mapping your estate’s CVE exposure before beginning a modernization program, see the legacy system vulnerability assessment guide.

Figure: Legacy Stack Risk Matrix

What You Can Map Before Engaging Anyone

This is where most articles stop: “now you know the problem, here’s our product.” That framing is not useful to someone who needs to understand their own position before deciding what to do next.

There are three things an engineering team can do independently, without committing to any vendor, that will give a clear picture of actual exposure.

1. Map your framework versions against their EOL dates

One afternoon, no tools required. List every application, its primary framework, and the date vendor security support ended. For each one, note the last date a security patch was issued. The gap between that date and today is the unpatched window.

For VB6, that gap is more than fifteen years. This single exercise tells you which systems have a repair path and which are carrying permanent exposure regardless of what you do operationally.
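A spreadsheet is enough for this step, but the same inventory can be kept as a small script. The sketch below is one possible shape. The application names are hypothetical, the exact VB6 date is an assumption (the article only gives the year 2008), and `today` is fixed so the output is reproducible.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class App:
    name: str                             # hypothetical application name
    framework: str
    last_security_patch: Optional[date]   # None = still receiving patches


def unpatched_years(app: App, today: date) -> float:
    """Years between the last security patch and `today` (0 if still supported)."""
    if app.last_security_patch is None:
        return 0.0
    return round((today - app.last_security_patch).days / 365.25, 1)


estate = [
    App("ops-scheduler", "VB6", date(2008, 4, 8)),                # assumed exact date
    App("customer-portal", "AngularJS 1.x", date(2021, 12, 31)),  # EOL December 2021
    App("billing-api", "supported framework", None),
]

today = date(2026, 7, 1)  # fixed for reproducibility; use date.today() in practice
for app in sorted(estate, key=lambda a: unpatched_years(a, today), reverse=True):
    gap = unpatched_years(app, today)
    status = "permanent exposure" if gap > 0 else "repair path exists"
    print(f"{app.name:16} {app.framework:20} {gap:5.1f} years  {status}")
```

Sorting by the gap puts the systems with the longest unpatched window at the top, which is usually the order in which they need a decision.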

2. Run a software composition analysis scan

Free tools, including OWASP Dependency-Check and Google’s OSV-Scanner, map your dependencies against known CVEs and run entirely within your own environment. The output sorts naturally into two buckets: CVEs with patches available and CVEs on EOL frameworks with no fix.

The second bucket is your structural exposure. This takes a day, no external party sees your code, and it gives you a finding-level picture you can take into any internal or external conversation with actual data behind it.
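Splitting scan results into those two buckets is a simple partition once the findings are in a uniform shape. The `Finding` record below is a hypothetical, simplified structure, not the native output format of OWASP Dependency-Check or OSV-Scanner, and the CVE identifiers and package names are placeholders.

```python
from typing import NamedTuple, Optional


class Finding(NamedTuple):
    cve: str                       # placeholder identifiers, not real CVEs
    package: str
    fixed_version: Optional[str]   # None = no upstream fix exists


def split_by_fix(findings):
    """Partition findings into (patch available, no fix available)."""
    patchable = [f for f in findings if f.fixed_version is not None]
    structural = [f for f in findings if f.fixed_version is None]
    return patchable, structural


scan_results = [
    Finding("CVE-0000-0001", "examplelib", "1.2.5"),
    Finding("CVE-0000-0002", "eol-framework", None),
    Finding("CVE-0000-0003", "eol-framework", None),
]

patchable, structural = split_by_fix(scan_results)
print(f"patch available: {len(patchable)}")
print(f"structural exposure: {len(structural)}")
```

How a real tool signals “no fix” varies, so the mapping from its export into a field like `fixed_version` is the part that needs care.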

3. Go back to your last security audit and sort the findings by remediation path

Almost every recent audit flags EOL software. Most of those findings are classified as “patch when a patch is available.”

Pull them out and sort them: findings that a patch resolves, and findings that require modernization because the vendor no longer issues patches.

The modernization bucket is what you are carrying indefinitely. Auditors and cyber insurers are increasingly treating these as separate risk categories. Knowing your number before your next renewal or audit cycle puts you in a very different position than discovering it during one.
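If the audit findings are available as a list or export, the sort can be done by tagging each finding with its stack and checking that stack against the ones with no vendor patch path. The sketch below assumes that approach. The finding IDs, titles, and non-EOL stack names are invented for the example.

```python
from collections import defaultdict

# Stacks this article lists as EOL with no vendor patch path.
NO_PATCH_STACKS = {"VB6", "AngularJS 1.x", "Apache Struts 1.x"}

# Hypothetical audit findings; IDs, stacks, and titles are invented.
audit_findings = [
    {"id": "F-01", "stack": "VB6", "title": "Unsupported runtime in production"},
    {"id": "F-02", "stack": "OpenSSL", "title": "Outdated library version"},
    {"id": "F-03", "stack": "AngularJS 1.x", "title": "EOL front-end framework"},
    {"id": "F-04", "stack": "PostgreSQL", "title": "Missing minor-version update"},
]


def remediation_path(finding):
    """Classify a finding by whether a vendor patch can ever resolve it."""
    return "modernization" if finding["stack"] in NO_PATCH_STACKS else "patch"


by_path = defaultdict(list)
for finding in audit_findings:
    by_path[remediation_path(finding)].append(finding["id"])

for path in ("patch", "modernization"):
    print(f"{path}: {len(by_path[path])} findings {by_path[path]}")
```

The count in the modernization bucket is the number to bring into a renewal or audit conversation.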

These three steps will not close the exposure. They will tell you precisely what you are carrying, which is the prerequisite for any decision that comes after.

What the Exposure Looks Like When It Is Not a Deliberate Attack

The deliberate adversary framing understates the actual risk profile.

During safety evaluations of a security-focused AI model, the model autonomously identified a production-grade vulnerability in Firefox, even though the evaluation was not a deliberate attack on Firefox.

The capability was being exercised against available code. The finding was responsibly disclosed, and Mozilla patched it before a major security competition where it could have been demonstrated publicly. The model found the vulnerability because it was capable of finding it, and the code was there.

Nobody was trying to attack Firefox. But what if someone had been?

Anthropic documented that Mythos’s security capabilities were not explicitly trained. They emerged as a downstream consequence of general improvements in code understanding, reasoning, and autonomy [2].

These improvements propagate through the model ecosystem. What requires a frontier model today will require a commodity model within a shorter timeframe than many enterprises have accounted for in their modernization planning.

A codebase built in the 1990s on VB6 was built in a world where this capability did not exist. The systems have not changed. The environment around them has.

What Legacyleap’s Security Posture Assessment Gives You

Knowing what you are carrying is not the same as knowing what to do about it. The SCA scan tells you which CVEs exist. It does not tell you what it costs to close them, which findings require a full re-platform versus an in-place upgrade, or how much runway you have against your next audit deadline.

Legacyleap’s Gen AI-powered Security Posture Assessment fills that gap. It ingests your audit findings, your scanner exports, and your source code, runs entirely inside your infrastructure, and returns something more actionable than a finding inventory:

  • Every finding mapped to its exact code location and root-cause technology
  • A fix approach per finding: in-place upgrade, re-architecture, or re-platform
  • Exact scope, cost, and timeline against your deadline
  • Blast radius per finding: what depends on each vulnerable component
  • Compliance risk separated from active security exposure

That last point matters more than it might appear. Audit reports treat these as the same category. They are not. 

A system that fails PCI-DSS because it runs an unsupported framework carries a different risk profile than a system with an actively exploitable CVE and no detection layer. Knowing which bucket each finding falls into changes the sequencing conversation when capacity is limited and every remediation needs internal justification.

Legacyleap is a US-based, Gen AI-powered legacy modernization platform. When the assessment identifies findings that require modernization, the platform handles the execution: governed transformations delivered as reviewable pull requests, parity validation before anything deploys, documentation reconstructed for systems that have none. Your code does not leave your environment.

A major North American airline. A business-critical VB6 operations application: end-of-life, zero documentation, zero test coverage, hard deadline.

Full re-platform to React and .NET. Zero critical security issues post-transformation. Eight weeks.

150+ production-grade assessments completed. Every one inside the client’s environment.

Explore the full case study.

Conclusion

Senator Warner’s disclosure confirmed what Anthropic’s own red team documentation had already demonstrated: a tool that reasons over code rather than matching against catalogues found vulnerabilities in hardened, monitored, classified infrastructure in hours. Legacy systems running on frameworks that stopped receiving patches years ago are categorically softer targets. The friction that once provided passive protection is gone.

The CVE exposure on EOL stacks is permanent until modernization closes it. The compliance frameworks are unambiguous about what that means. The exploit timeline makes it operationally urgent.

Start with what you can map yourself. Then get the full picture before someone else does.

Claim a free Security Posture Assessment. Legacyleap maps your legacy estate, surfaces unpatched CVEs by framework and version, and delivers a prioritized remediation roadmap in three to five days, entirely inside your infrastructure.

Book a technical demo. See the Legacyleap agents in action against a real codebase: architecture visualization, dependency mapping, documentation generation, and modernization planning in a live session led by our CTO.

FAQs

Q1. How does AI-powered vulnerability discovery differ from traditional scanners on legacy systems?

Scanners match against known signatures. AI-powered discovery reads the code itself and reasons about exploitable conditions, including ones no catalogue has recorded yet.

Q2. What happens to unpatched CVEs on EOL software like VB6 or AngularJS 1.x?

They stay open indefinitely. Once a framework reaches end-of-life, no vendor patch is coming. Modernization is the only remediation path.

Q3. Does AI-powered code analysis require access to source code?

No. Models can analyze compiled binaries directly and reason backwards to find exploitable conditions without the original source.

Q4. What does the Mythos classified systems finding mean for enterprise legacy security?

If hardened, monitored government systems fell in hours, legacy systems with none of those protections are far more exposed to the same capability.

Q5. Which compliance frameworks treat EOL software as a direct control failure?

NIST SP 800-53 (SA-22, SI-2), PCI-DSS v4.0, and HIPAA Security Rule all flag unsupported software as an explicit, enumerable finding.

References

[1] CNBC. “Anthropic’s Mythos Model Found Vulnerabilities in Classified US Government Systems, Official Says.” June 23, 2026. https://www.cnbc.com/2026/06/23/anthropics-mythos-model-found-vulnerabilities-in-classified-us-government-systems-official-says.html 

[2] Anthropic. “Assessing Claude Mythos Preview’s Cybersecurity Capabilities.” April 7, 2026. https://red.anthropic.com/2026/mythos-preview/ 

[3] Horizon3.ai. “Claude Mythos & Enterprise Security: Your Questions Answered.” 2026. https://horizon3.ai/intelligence/blogs/claude-mythos-enterprise-security/ 

[4] Help Net Security. “Anthropic’s New AI Model Finds and Exploits Zero-Days Across Every Major OS and Browser.” April 8, 2026. https://www.helpnetsecurity.com/2026/04/08/anthropic-claude-mythos-preview-identify-vulnerabilities/ 

[5] FIRST. “FIRST Mid-Year Vulnerability Forecast Confirms Historic Surge, Projects ~66,000 CVEs in 2026.” June 15, 2026. https://www.first.org/newsroom/releases/20260615
