OPINION
When Anthropic announced Project Glasswing this month, most coverage landed on the headline numbers: a 27-year-old OpenBSD vulnerability, a 16-year-old FFmpeg flaw, a Linux kernel exploit chain assembled without human steering. The coalition behind it, which includes AWS, Apple, Cisco, CrowdStrike, Google, Microsoft, Palo Alto Networks, and others, isn’t there for the optics. Its members are there because the model’s capabilities are real, and the coordinated disclosure pipeline matters.
The part worth dwelling on is the FFmpeg result specifically. At least five million automated fuzzer testing passes hit that vulnerable line of code and not one caught it. Mythos Preview read the code, understood what it was doing, and found the flaw.
That gap highlights a fundamental security misconception of the past two decades.
The industry built enumerators. It needed readers.
Automated security tooling has almost always worked the same way at its core: define a pattern, scan for the pattern, flag the match. SIEMs ingest event logs and match rules. Static analysis tools check code against known signatures. Vulnerability scanners compare software versions against CVE databases. These approaches are mostly enumeration, and enumeration can only find what you already know to look for.
Five million passes with industry-standard tools, zero catches. These tools knew how to count. But they didn’t know how to read.
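The define, scan, flag loop described above can be shown in a few lines. This is a toy sketch, not how any real scanner or fuzzer is implemented: the rule names, the regexes, and the two C snippets are invented for illustration. The point is only that a rule list reports what it was written to describe and stays silent on everything else.

```python
import re

# Hypothetical signature list: each rule pairs a name with a regex
# for a call that is already known to be dangerous.
RULES = {
    "unsafe-strcpy": re.compile(r"\bstrcpy\s*\("),
    "unsafe-gets": re.compile(r"\bgets\s*\("),
}


def scan(source: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every line matching a known rule."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in RULES.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


# A flaw the rule writer anticipated.
KNOWN_BAD = "strcpy(dst, src);"

# A flaw nobody wrote a rule for: the loop runs one step past the
# end of buf, but no line "looks like" a known signature.
NOVEL_BAD = """\
char buf[16];
for (int i = 0; i <= 16; i++) {
    buf[i] = input[i];
}
"""

print(scan(KNOWN_BAD))  # the known signature is flagged
print(scan(NOVEL_BAD))  # nothing matches, so nothing is reported
```

Adding a rule for the second snippet would catch that exact shape and nothing more, which is the limit the FFmpeg result exposed.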
Mythos Preview succeeded because it approached the code the way a skilled human analyst would: with an understanding of intent, of relationships between components, of what a sequence of operations does, rather than what it superficially looks like. Security at that depth has been the exclusive domain of rare, expensive human expertise. A model that replicates it at scale is genuinely a different kind of thing, and the industry is right to pay attention.
Code vulnerabilities get the press, and Glasswing deserves credit for what it’s doing there. Still, the same enumeration failure that let a 16-year-old flaw survive five million scans is sitting inside every other layer of most organizations’ security programs right now.
Most of the security incidents and breaches I’ve tracked over my career didn’t originate from a zero-day exploit. That doesn’t mean zero-day exploits aren’t the devil itself in many large-scale, high-impact attacks by state-sponsored and organized cybercriminal groups. But most incidents still come down to a storage resource or database someone set to public and forgot about, a set of credentials sitting in a breach database for six months while the account stayed active, a firewall exception for a contractor who left the company two years ago, an admin portal with default credentials still facing the Internet, and other misconfiguration horror stories. Attackers go where access is easy. Misconfigured cloud and SaaS assets and leaked credentials give them exactly that. Finding those doors doesn’t require a frontier AI model, just a port scan, a credential-stuffing script, and some patience.
The bigger problem for most security teams, though, even with the solutions already on the market, remains twofold: fatigue from security data that lacks context, and misconfigurations nobody ever tested because nobody remembered the asset existed. Forgotten integrations, shadow IT, SaaS, and now shadow AI and agents are everywhere. Teams spin up resources during a sprint and never review them. Cloud infrastructure and AI tools grow faster than anyone’s security practices can keep up. Those gaps don’t require novel AI to exploit; they require opportunity, and attackers are nothing if not patient.
Configuration management tools flag deviations from known-good states. Identity governance platforms run scheduled reviews against static policies. CSPM (cloud security posture management) solutions check cloud resources against predefined rule sets. All of these are mostly based on enumeration, bounded by what the rule writer anticipated when they wrote the rule.
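A minimal sketch of that kind of rule-set check makes the boundary concrete. Everything here is hypothetical: the `Resource` type, the two rules, and the sample inventory are invented for illustration and do not reflect any particular CSPM product.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    settings: dict = field(default_factory=dict)


# Hypothetical predefined rule set: each rule returns True when the
# resource passes. The checks cover only what the rule writer anticipated.
RULES = {
    "storage-not-public": lambda r: r.settings.get("public") is not True,
    "no-default-credentials": lambda r: r.settings.get("default_credentials") is not True,
}


def evaluate(inventory: list[Resource]) -> list[tuple[str, str]]:
    """Return (resource_name, rule_name) for every rule a resource fails."""
    failures = []
    for resource in inventory:
        for rule_name, passes in RULES.items():
            if not passes(resource):
                failures.append((resource.name, rule_name))
    return failures


inventory = [
    Resource("reports-bucket", {"public": True}),
    Resource("admin-portal", {"default_credentials": True}),
    # A stale exception for a departed contractor. No rule describes it,
    # so it passes every check.
    Resource("edge-firewall", {"allow_from": ["former-contractor-vpn"]}),
]

for resource_name, rule_name in evaluate(inventory):
    print(f"{resource_name}: failed {rule_name}")
```

The public bucket and the default credentials are flagged because someone wrote rules for them. The stale firewall exception passes cleanly, and an asset missing from `inventory` is never evaluated at all.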
The deeper irony shows up within Glasswing’s own published results. One of the scenarios Anthropic’s team worked through involved a model escaping a sandbox environment. The escape relied on a service with an outbound network connection, left open to handle email delivery, that nobody had reviewed or restricted. A code-level vulnerability enabled the path; a configuration oversight left the door unlocked. You can close every exploit chain in the codebase and still lose to a network rule or a permission someone approved in 2019 and never revisited.
Code security is the first layer. What’s actually happening across the full environment, across every service, every identity, every integration, is the bigger picture most teams never get to see clearly.
Understanding the Posture Layer
For most security teams, the posture problem isn’t a knowledge problem. The knowledge exists. The tools exist. The gap is the ability to assemble a coherent picture from fragmented, context-free data spread across a dozen systems, fast enough to act on it before an attacker does.
The shift Mythos Preview represents at the code layer, intelligence that understands rather than enumerates, is the same shift the rest of the security stack needs: systems that can look at an environment, understand what’s there, reason over it continuously, and surface what matters before an attacker finds it. Not more scanners with better rules. A genuinely different way of working with security data.
At the identity and access layer, at the configuration layer, at the posture layer, that work gets far less press than zero-day hunting. For most teams, it’s where the real exposure lives every single day.
What comes next? The obvious: Expect a significant wave of security advisories and patches over the coming weeks as Glasswing’s coordinated disclosures roll out. Infrastructure teams should prepare for an intense cycle. Validating your asset inventory and software bill of materials now, before the advisories land, is time well spent.
Mythos Preview is genuinely moving the floor on application security, and the initiative around it is serious. The principle it proves, that understanding beats enumeration, applies well beyond the codebase.
The organizations that take that lesson seriously, across their full stack, are the ones that will come out ahead.
