Speaker:: Peter Girnus & Derek Chen
Title:: FENRIR: AI Hunting for AI Zero-Days at Scale
Duration:: 21 min
Video:: https://www.youtube.com/watch?v=c6_bRzHCf3U
## Key Thesis
Trend Micro's FENRIR system demonstrates that AI-augmented vulnerability discovery — using a cascaded pipeline of traditional static analysis tools followed by LLM triage and deep agentic verification — can find high-severity zero-days at scale with dramatically higher throughput and lower false positive rates than traditional methods. The system uses AI to secure AI, intentionally targeting the AI ecosystem itself as a massive and underexamined attack surface.
## Synopsis
FENRIR is a production zero-day discovery engine built by Trend Micro's AI zero-day initiative team. It lives within a larger unified platform that also includes MIMIR, a defensive N-day vulnerability research component. The two systems form a bidirectional intelligence loop: when a FENRIR-discovered zero-day gets patched, MIMIR dispatches autonomous agents to the advisory page and the fix commit to hunt for additional bugs the patch introduced or variants that bypass it.
The pipeline is a cascade designed to keep token costs low while maximizing signal quality. Stage one is pure static analysis: YARA-X for fast pre-filtering across millions of lines of code in seconds, Semgrep for precision rule matching, CodeQL for data-flow and taint analysis, and SpotBugs with the FindSecBugs plugin for Java bytecode. Multi-scanner correlation (two tools flagging the same CWE class within ~15 lines of each other) provides high-confidence signal before any LLM is involved, at zero token cost.
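The zero-token correlation step can be sketched in a few lines. This is a minimal illustration of the idea as described in the talk, not FENRIR's implementation; the `Finding` record and field names are assumptions.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class Finding:
    tool: str   # e.g. "semgrep", "codeql" (hypothetical labels)
    cwe: str    # e.g. "CWE-79"
    path: str
    line: int

def correlate(findings, window=15):
    """Return pairs of findings where two *different* tools flag the same
    CWE class in the same file within `window` lines of each other.
    High-confidence signal, produced before any LLM call is made."""
    hits = []
    for a, b in combinations(findings, 2):
        if (a.tool != b.tool and a.cwe == b.cwe
                and a.path == b.path and abs(a.line - b.line) <= window):
            hits.append((a, b))
    return hits
```

Two scanners agreeing independently on CWE class and location is treated as stronger evidence than either alert alone, which is why this filter runs first.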
Stage two (L1 triage) uses a fast but capable model (Claude Sonnet) with a pre-allocated 50-line code window. The goal is purely noise reduction: eliminating obvious false positives without making final determinations. CWE-specific prompting (e.g., "can this possibly be CWE-79?") outperforms generic prompting. This stage filters over 60% of findings with a single cheap model call, cutting per-finding token cost dramatically. By design it is biased toward recall, so true positives are not dropped.
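The 50-line window and the CWE-specific question can be sketched as follows; the prompt wording and helper names are illustrative assumptions, not the actual FENRIR prompts.

```python
def window(lines, center, radius=25):
    """Slice a ~50-line context window around the flagged line
    (`center` is 1-indexed), mirroring L1's pre-allocated window."""
    lo = max(0, center - 1 - radius)
    return "\n".join(lines[lo:center - 1 + radius])

def build_l1_prompt(snippet: str, cwe: str) -> str:
    """CWE-specific triage prompt. Per the talk, asking 'can this
    possibly be CWE-XX?' beats a generic 'find vulnerabilities' ask.
    Biased toward recall: DROP only clear false positives."""
    return (
        "You are triaging a static-analysis finding.\n"
        f"Question: can the following code possibly be vulnerable to {cwe}?\n"
        "Answer KEEP if it could plausibly be exploitable; answer DROP only\n"
        "if it is clearly a false positive (err on the side of KEEP).\n\n"
        f"```\n{snippet}\n```"
    )
```

A single call with this prompt per finding is what replaces the multi-turn Opus triage mentioned in the data points below.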
Stage three (L2 deep agentic triage) is the expensive layer. Claude Opus runs in an isolated secure sandbox with full bash execution and write privileges. It receives full code context, performs multi-step reasoning with built-in reflection (forcing the model to argue against its own findings to reduce shallow reasoning), and generates exploit proof-of-concept code. Median cost is ~$0.61 per finding and ~$8.80 per confirmed true positive. The system lets the agent "roam" — collecting its own context, tracing data flow, building call graphs via CodeGraph tools — rather than pre-allocating context statically.
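The built-in reflection step (forcing the model to argue against its own finding) reduces to a small loop. A minimal sketch, assuming a hypothetical `llm` callable (prompt in, text out); the real L2 stage is a sandboxed agent with bash and write access, not a plain text loop.

```python
def deep_verify(finding: str, llm, max_rounds: int = 3) -> str:
    """L2-style verification with forced self-rebuttal: after the initial
    analysis, the model must argue against its own conclusion, then weigh
    both sides. This curbs shallow reasoning and hallucinated findings."""
    verdict = llm(
        f"Analyze this finding and decide VULNERABLE or SAFE:\n{finding}"
    )
    for _ in range(max_rounds):
        rebuttal = llm(f"Argue that this conclusion is wrong:\n{verdict}")
        verdict = llm(
            f"Original conclusion:\n{verdict}\n\nRebuttal:\n{rebuttal}\n\n"
            "Weigh both sides and give a final VULNERABLE or SAFE verdict."
        )
    return verdict
```

Each round costs extra tokens, which is part of why L2 is the expensive layer (~$0.61 median per finding) and runs only on what survives L1.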
The human-in-the-loop stage receives a filtered set of 10–25 high-confidence findings (from 500 raw SAST alerts) with auto-generated vulnerability reports, impact assessments, and full disclosure packages. Humans validate exploitability, review severity, and submit to vendors. Unique features include a weighted context generation algorithm, a dynamic priority scoring engine, and a kill chain analysis system for exploit path modeling.
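The talk names a dynamic priority scoring engine but not its formula; the weights below are purely illustrative assumptions, shown only to convey how a scorer could rank the 10-25 findings handed to a human.

```python
def priority_score(severity: float, reachable: bool,
                   multi_scanner: bool, poc_generated: bool) -> float:
    """Hypothetical priority score (NOT FENRIR's actual engine).
    Starts from a CVSS-like base severity and boosts findings that
    are reachable, corroborated by multiple scanners, or already
    have a working proof-of-concept."""
    score = severity          # base severity, e.g. 0-10
    if reachable:
        score += 2            # illustrative weight
    if multi_scanner:
        score += 2            # illustrative weight
    if poc_generated:
        score += 3            # illustrative weight
    return score
```

Sorting the filtered findings by such a score would put exploit-ready, corroborated bugs at the top of the human review queue.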
Since production deployment, FENRIR has delivered 2.5x more vulnerabilities discovered, an 80% reduction in false positive rate, 70% faster disclosure, and an overall 3x team productivity increase. The team has submitted over 60 CVEs (all high or critical severity), with 100+ in ZDI pre-disclosure and 3,000 pending review. During the talk itself, a LangChain CVE was patched in real time. The team is now expanding to organizational-level scanning and to memory corruption bugs as model reasoning capabilities improve.
## Key Takeaways
- Cascade architecture — static analysis tools first, LLM triage second, agentic deep verify last — eliminates most false positives before any token is spent
- CWE-specific prompting substantially outperforms generic "find vulnerabilities" prompting
- Model sizing matters: Qwen3 0.6B dropped true positives; Opus works but is wasteful at L1; Sonnet is the right L1 model
- Built-in reflection (forcing the agent to disprove its own findings) significantly reduces hallucination and shallow reasoning in the deep verify stage
- Reachability filtering is critical — unreachable code, test cases, and documentation are eliminated immediately
- The AI ecosystem itself (LLM frameworks, agent tooling) is a massively underexamined attack surface and a primary FENRIR target
- Human remains in the loop but workflow is simplified from 500 alerts to 10–25 high-confidence packages
## Notable Quotes / Data Points
- Over 60 CVEs submitted (all high/critical severity), 100+ in ZDI pre-disclosure, 3,000 pending review
- L1 triage filters >60% of findings with a single Sonnet call, replacing 12 turns of Opus
- L2 median cost: ~$0.61/finding, ~$8.80/true positive
- 2.5x more vulnerabilities discovered; 80% false positive reduction; 70% faster disclosure; 3x team productivity
- System has survived multiple generations of LLM upgrades since its March 2025 inception
- Started as an MCP server component — team views MCP as a natural transition from tool calling to agent building
#unprompted #claude