Speaker:: Maxim Kovalsky
Title:: The AI Security Larsen Effect
Duration:: 22 min
Video:: https://www.youtube.com/watch?v=U1TJpMpxZiU

## Key Thesis

The AI security vendor landscape is exploding in volume and marketing noise faster than any organization can evaluate it. Kovalsky built an AI-powered vendor research and risk assessment platform to cut through nearly 80 vendors' worth of claims, map them to a synthesized AI risk taxonomy, and produce actionable vendor shortlists and implementation guidance for specific GenAI system architectures.

## Synopsis

Maxim Kovalsky works at Consortium, a VAR (value-added reseller), and presented the problem he faces daily: clients ask "who's good at AI security?" without clear requirements, then wade through vendor data sheets full of superlatives ("broadest and most comprehensive," "99% efficiency at sub-30ms latency," "automatically blocks all adversaries"). His solution is an agentic research and evaluation platform called "Adjuster IQ," though the underlying vendor analysis infrastructure predates that name.

**The vendor analysis problem at scale:** Kovalsky started analyzing AI security vendors in December. By submission time for this talk he had processed 62 vendors; by talk day, nearly 80. Roughly four new companies emerge or bolt on AI security features every week, and each produces claims that must be mapped to actual evidence. His system has processed over 3,000 individual vendor claims.

**How the agent works:** Using Claude Code (originally Claude 3.5 Sonnet, since upgraded to Sonnet-6), an agentic loop pairing a research agent with a QC agent combs GitHub repos, API documentation, user forums, and technical references for evidence substantiating each vendor claim. Claims receive a confidence score from 1 to 5, where 5 means extractable code samples demonstrating actual implementation. All capabilities are mapped against a synthesized AI risk taxonomy.
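The research/QC split and the 1-5 scoring described above can be sketched as a claim record plus an evidence rubric. Everything here is illustrative: the evidence tier names, the `VendorClaim` shape, and the scoring rule are assumptions for this sketch; the talk only specifies the 1-5 scale and that a 5 requires extractable code samples.

```python
from dataclasses import dataclass, field

# Hypothetical evidence tiers (Kovalsky's actual rubric is not public).
# Per the talk, only the top tier is defined: 5 = extractable code samples.
EVIDENCE_SCORES = {
    "code_sample": 5,     # working code found in a GitHub repo
    "api_reference": 4,   # documented endpoint/parameter for the claimed feature
    "technical_doc": 3,   # architecture doc or whitepaper description
    "forum_report": 2,    # user-reported behavior, unverified
    "marketing_only": 1,  # claim appears only in sales material
}

@dataclass
class VendorClaim:
    vendor: str
    claim: str
    taxonomy_ids: list                            # risks the claim maps to in the synthesized taxonomy
    evidence: list = field(default_factory=list)  # evidence types found by the research agent

    @property
    def confidence(self) -> int:
        """QC agent's 1-5 score: best evidence tier found, else 1."""
        if not self.evidence:
            return 1
        return max(EVIDENCE_SCORES.get(e, 1) for e in self.evidence)

claim = VendorClaim(
    vendor="ExampleVendor",
    claim="Blocks prompt injection at sub-30ms latency",
    taxonomy_ids=["LLM01"],  # OWASP LLM Top 10: prompt injection
    evidence=["api_reference", "forum_report"],
)
print(claim.confidence)  # 4
```

In the real system the `evidence` list would be populated by the research agent's web and repo searches, and the QC agent would audit each entry before the score is accepted.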
**The risk taxonomy** is open source on GitHub. It synthesizes the OWASP LLM Top 10, NIST AI RMF, and MITRE ATLAS, three frameworks Kovalsky found individually incomplete. The combination attempts full coverage of today's GenAI risk surface, from data disclosure and prompt injection to agentic autonomy risks and the coding-assistant attack surface.

**The assessment wizard (Adjuster IQ demo):** The demo walked through assessing a fictional "claim adjuster AI" system built on Claude 3.5 Sonnet via AWS Bedrock, with Amazon Kendra for RAG, direct API calls, and session-scoped memory. The wizard collects:

- System purpose and data sensitivity (internal confidential)
- Deployment context (employees over the internet)
- Cloud architecture (managed PaaS: Bedrock)
- Data operations (read-only vs. autonomous record creation; the autonomy choice significantly changes the risk profile)
- Autonomy level (interactive coding assistance vs. autonomous agents)
- Architecture patterns (RAG + traditional DB, tool calling, document analysis)
- Existing vendor relationships (AWS Bedrock, CrowdStrike, Zscaler)
- Buy vs. build preference, and implementation layer preference (network/endpoint vs. SDK/API)

Output: an inherent risk profile activating specific controls from the taxonomy, mapped against existing vendor capabilities. In the demo, the existing vendors (AWS Bedrock, CrowdStrike, Zscaler) covered 8 of the required capabilities, leaving 15 gaps. The system recommended 5 vendors to address the remaining gaps, with implementation guidance for GRC teams, cloud engineering, security architecture, and dev teams.

**Maintenance:** A GitHub Actions workflow runs overnight on a 12-hour cycle, re-analyzing a subset of vendors nightly so that every vendor is revisited at least once per month. A separate workflow identifies new market entrants.
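The demo's coverage computation (8 controls covered, 15 gaps, 5-vendor shortlist) can be sketched as a set-cover problem over taxonomy controls. The control IDs, vendor capability maps, and the greedy shortlist heuristic below are all assumptions for illustration; the talk does not describe how Adjuster IQ actually selects its shortlist.

```python
def gap_analysis(required_controls, existing_vendors, market_vendors):
    """Split required controls into covered vs. gaps, then shortlist vendors.

    required_controls: list of control IDs activated by the inherent risk profile
    existing_vendors:  {vendor_name: set of control IDs it covers} already owned
    market_vendors:    {vendor_name: set of control IDs} candidates from the database
    """
    covered = {c for caps in existing_vendors.values() for c in caps} & set(required_controls)
    gaps = set(required_controls) - covered

    # Greedy set cover: each round, pick the vendor closing the most remaining gaps.
    shortlist, remaining = [], set(gaps)
    while remaining:
        best = max(market_vendors, key=lambda v: len(market_vendors[v] & remaining), default=None)
        if best is None or not market_vendors[best] & remaining:
            break  # no candidate closes any remaining gap
        shortlist.append(best)
        remaining -= market_vendors[best]
    return covered, gaps, shortlist

# Toy example (control IDs and vendors are made up):
required = ["C1", "C2", "C3", "C4", "C5"]
existing = {"AWS Bedrock": {"C1"}, "CrowdStrike": {"C2"}}
market = {"VendorA": {"C3", "C4"}, "VendorB": {"C4"}, "VendorC": {"C5"}}
covered, gaps, shortlist = gap_analysis(required, existing, market)
print(covered, gaps, shortlist)
```

A greedy heuristic is the natural first cut here because exact minimum set cover is NP-hard, and with ~80 vendors and a few dozen controls the approximation is more than adequate.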
**Key market observation:** Even the largest platform vendors (Palo Alto, CrowdStrike) cover only about 50% of the relevant AI security risks; the landscape is too fragmented for any single platform play to be comprehensive yet.

## Key Takeaways

- ~80 AI security vendors exist as of early 2026, with ~4 new entrants per week; manual evaluation is untenable
- Vendor claims require evidence mapping (GitHub, API docs, forums) before they can be trusted; marketing language is largely unverified
- A synthesized risk taxonomy (OWASP + NIST + MITRE ATLAS) is necessary because each individual framework is incomplete
- Autonomy level selection in system design dramatically changes the inherent risk profile; "create new records" vs. "autonomous actions" is a major branching point
- Existing enterprise vendor investments (CrowdStrike, Zscaler, AWS Bedrock) covered only 8 of the 23 required AI security controls in the demo
- Even the largest platform vendors cover only ~50% of AI security risks; the market is structurally fragmented
- Nightly re-evaluation via GitHub Actions is required to keep vendor assessments current given rapid capability changes

## Notable Quotes / Data Points

- Vendor database at talk time: nearly 80 vendors analyzed; 3,000+ vendor claims processed
- Started in December; 62 vendors at talk submission, grown to ~80 by talk day (roughly 6 weeks)
- Underlying model: Claude 3.5 Sonnet → now Sonnet-6 (updated during development)
- Risk taxonomy: synthesizes OWASP LLM Top 10 + NIST AI RMF + MITRE ATLAS; available open source on GitHub
- Governance cycle: every vendor re-analyzed at minimum once per month via overnight GitHub Actions workflows
- Demo result: 8 capabilities satisfied by existing vendors, 15 gaps requiring new vendor evaluation from a 5-vendor shortlist
- "The AI governance loop" described as: vendors identified → risk register built → 4 vendors demoed → 2 get a PoC → PoC put on hold because "doesn't AWS Bedrock do this?" → 3 months later still assessing → customer chatbot ships anyway

#unprompted #claude