Speaker:: Adam Krivka & Ondrej Vlcek
Title:: AI Found 12 Zero-Days in OpenSSL
Duration:: 25 min
Video:: https://www.youtube.com/watch?v=IjL2qN1KDe8
## Key Thesis
Isle Security built an agentic, multi-stage vulnerability research pipeline that has discovered 500 confirmed vulnerabilities in open-source software within six months — including 12 in OpenSSL alone — matching the output of Google's Project Naptime/Big Sleep effort. The core argument is that LLM-based reasoning is now sophisticated enough to find deep, non-obvious security bugs at scale, and that the industry must treat this with urgency because offensive actors have equal access to the same technology.
## Synopsis
Ondrej Vlcek, a 30-year cybersecurity veteran who wrote one of the original antivirus engines in the 1990s, opens by reframing the talk: the point is not the specific bugs found, but the engine behind them. He situates Isle's work alongside other major AI vulnerability research efforts — Google's Project Naptime/Big Sleep, the DARPA AI Cyber Challenge (which went from 35% of planted vulnerabilities found in the DEFCON 2024 semifinals to 87% found at the DEFCON 2025 finals), and Anthropic's Claude 4.6 finding 500+ vulnerabilities in projects like Ghost, Ghostscript, and the Linux kernel.
Isle started not as a vulnerability discovery tool, but as an agentic remediation engine. When they began benchmarking existing scanners against 100,000 historical CVEs, they discovered signature-based scanners had efficacy rates in the low single digits — even on vulnerabilities they could have been trained on. They pivoted to build their own LLM-based scanner, and it started finding previously unknown vulnerabilities as a side effect.
Adam Krivka walks through representative findings. The most serious OpenSSL bug is a stack buffer overflow in a component used by email clients: an attacker-controlled length field in a data notation structure is trusted, triggering the overflow. They also found bugs across Chromium, Firefox, and WebKit. A standout example is a logic inversion bug in Traefik, a popular Kubernetes ingress controller: setting a configuration flag called `proxy_ssl_verify` to true actually sets `insecure_skip_verify` to true, silently disabling TLS certificate verification. This is the kind of subtle semantic error that pattern-matching scanners are structurally incapable of finding.
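The shape of that inversion can be sketched in a few lines of Go (Traefik's implementation language). All identifiers below are illustrative stand-ins, not Traefik's actual code; the essence of the bug is a boolean copied where it should have been negated.

```go
package main

import "fmt"

// TLSConfig mirrors the kind of struct an ingress proxy might build.
// The field name echoes Go's crypto/tls semantics: true means the server
// certificate is NOT verified.
type TLSConfig struct {
	InsecureSkipVerify bool
}

// buildTLSConfig shows the inverted logic: the operator asks for
// verification (proxySSLVerify == true), but the flag is copied directly
// instead of negated, so verification is silently disabled.
func buildTLSConfig(proxySSLVerify bool) TLSConfig {
	return TLSConfig{
		InsecureSkipVerify: proxySSLVerify, // BUG: should be !proxySSLVerify
	}
}

func main() {
	cfg := buildTLSConfig(true)         // operator intent: "verify TLS"
	fmt.Println(cfg.InsecureSkipVerify) // prints "true": verification is off
}
```

Nothing here is syntactically wrong, which is exactly why a signature-based scanner has no pattern to match: the defect exists only in the relationship between the flag's name and its meaning.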
Isle's pipeline uses a two-phase approach: a breadth-first "broadening" phase, where the system generates as many hypotheses as possible about what could be wrong in a given codebase, followed by a focused "narrowing" phase, where agents do deep agentic exploration: running the code, crafting proof-of-concept exploits and fuzzing where applicable, and using multiple models to critique each other's findings. Parallelism is central — humans are single-threaded, LLMs are not. They emphasize careful context construction, a human in the loop for final submission (increasingly a formality), and specialized fine-tuned models, with attention to interpretability and continual learning.
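The broaden-then-narrow flow can be sketched schematically in Go. `broaden` and `investigate` below are invented stubs standing in for LLM calls, and the goroutine fan-out illustrates the parallelism point, not any actual Isle internals.

```go
package main

import (
	"fmt"
	"sync"
)

// Hypothesis is a candidate "something might be wrong here" lead.
type Hypothesis struct {
	ID    int
	Claim string
}

// Finding is a hypothesis that has been through deep investigation.
type Finding struct {
	Hyp       Hypothesis
	Confirmed bool
}

// broaden is the breadth-first phase: generate many hypotheses (stub).
func broaden(n int) []Hypothesis {
	hs := make([]Hypothesis, n)
	for i := range hs {
		hs[i] = Hypothesis{ID: i, Claim: fmt.Sprintf("candidate issue #%d", i)}
	}
	return hs
}

// investigate is the narrowing phase for one hypothesis (stub). In the real
// pipeline this step would run the code, craft PoCs, and have a second model
// critique the first model's conclusion.
func investigate(h Hypothesis) Finding {
	return Finding{Hyp: h, Confirmed: h.ID%3 == 0} // stand-in for real triage
}

func main() {
	hyps := broaden(9)
	findings := make([]Finding, len(hyps))
	var wg sync.WaitGroup
	// Parallelism is the point: every hypothesis is investigated concurrently.
	for i, h := range hyps {
		wg.Add(1)
		go func(i int, h Hypothesis) {
			defer wg.Done()
			findings[i] = investigate(h)
		}(i, h)
	}
	wg.Wait()
	confirmed := 0
	for _, f := range findings {
		if f.Confirmed {
			confirmed++
		}
	}
	fmt.Println("confirmed:", confirmed) // prints "confirmed: 3" with this stub
}
```

The design mirrors the talk's framing: generation is cheap and wide, verification is expensive and deep, and concurrency is what lets the expensive phase keep up.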
The talk closes with a call to urgency. Vlcek notes that Daniel Stenberg (author of curl), previously a vocal AI skeptic, publicly called Isle's reports "magic" after six months of interaction and now advocates for AI-assisted vulnerability research. The concern is a "vulnerability apocalypse" if defenders don't move faster than attackers, who have access to the same LLM capabilities and unlimited token budgets from nation-state funding.
## Key Takeaways
- 500 confirmed vulnerabilities found in 6 months; 133 CVEs minted so far (a trailing metric; the backlog is large)
- Signature/pattern-matching scanners have near-zero efficacy even on known historical CVEs
- Multi-model adversarial critique helps combat hallucinations and sycophancy
- Heavy parallelism is a core advantage: many hypothesis threads investigated simultaneously
- Isle provides fixes alongside reports to avoid "AI slop" that burdens already overworked open-source maintainers
- Active GitHub bot deployed on OpenSSL, OpenClaw, OpenEMR, Apache — scanning every PR
- General availability product launched at isle.com on the day of the talk
- Nation-state actors have essentially unlimited token budgets; defenders must match urgency
## Notable Quotes / Data Points
- DARPA AI Cyber Challenge: 35% of vulnerabilities found in 2024 semifinals → 87% in 2025 finals (7 teams, limited tokens)
- Isle: 500 vulnerabilities confirmed and ~133 CVEs minted in 6 months (the 500 matches the figure cited for Anthropic's Claude 4.6 release)
- Daniel Stenberg (curl): went from "complete skeptic and naysayer to a huge, huge fan" in 6 months
- Commercial pattern-matching scanners: "low single-digit percentage success rates" on historical CVEs they could have been trained on
- "The vulnerability apocalypse... we only prevent [it] if we move with urgency"
#unprompted #claude