Speaker:: Matt Maisel
Title:: Hooking Coding Agents with the Cedar Policy Language
Duration:: 17 min
Video:: https://www.youtube.com/watch?v=m6pzrqFJ6hE
## Key Thesis
Coding agent security requires a deterministic, tamper-proof reference monitor that sits outside the model and mediates every trajectory event (actions, observations, control, state). Cedar's formally analyzable policy language, combined with hook-based interception of agent lifecycle events, provides that enforcement layer without relying solely on in-model safeguards or prompts.
## Synopsis
Matt Maisel, CTO and co-founder of Cinderea, presented a framework for policy-based governance of coding agent behavior using Cedar, the formally analyzable policy language developed by AWS, combined with the agent hook systems of Claude Code, Cursor, and Gemini CLI.
Maisel began by mapping the coding agent execution loop to a **trajectory event model** with four event types:
- **Actions** — file writes, shell commands, code execution, tool calls (mutate the environment)
- **Observations** — environment feedback returned to the agent after actions
- **Control** — user prompts, permission requests, sub-agent orchestration
- **State** — memory compaction, context pruning, environment snapshots
This model maps cleanly onto the "lethal trifecta" threat model: untrusted input (malicious skill from a marketplace, returned as an observation), sensitive data in context (source code, internal docs), and state-changing exfiltration (shell commands, network calls). Multi-step attacks span multiple turns of the loop across these event types.
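The four event types above can be sketched as a Cedar action hierarchy, so that a single policy can target a whole event class. This is an illustrative fragment in Cedar's human-readable schema syntax; the group and action names are assumptions, not the talk's actual schema:

```cedar
// Hypothetical action groups mirroring the four trajectory event types
action ActionEvent, ObservationEvent, ControlEvent, StateEvent;

// Concrete events declared as members of their event-type group
action FileWrite, ShellExec, ToolCall in [ActionEvent];
action UserPrompt, PermissionRequest in [ControlEvent];
action Compaction, Snapshot in [StateEvent];
```

A policy written against `ActionEvent` then covers every mutating event, which is useful for trifecta-style rules that constrain all state-changing actions at once.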
The architectural solution is a **reference monitor** outside the model that mediates every event — always invoked, tamperproof, verifiable. Cedar is chosen as the policy language because it is expressive, fast, and formally analyzable (using the Lean symbolic compiler). Unlike Rego, Cedar policies can be checked for contradictions, vacuous conditions, and shadowed policy subsets through built-in tooling. Cedar's attribute-based access control (ABAC) model maps well to this domain: entities like Agent, User, and Trajectory carry attributes like sensitivity labels, information flow control tags, and safety model classifications.
The hook implementation differs per tool:
- **Gemini CLI**: before/after model hooks, can stream individual tokens
- **Claude Code**: no model-level hooks; only final agent response as notification event; shell/tool hooks available
- **Cursor**: granular hooks for MCP tool calls, shell commands, and generic tool calls
A local adapter process intercepts hook events over stdin, transforms them into the trajectory event model, and sends them to a local harness service running the Cedar policy engine. The harness also maintains stateful entity and trajectory stores — allowing multi-turn taint tracking across the agent session.
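The entity and trajectory stores the harness maintains would surface to Cedar as entity attributes. A sketch in Cedar's human-readable schema syntax, with all names and attributes assumed for illustration:

```cedar
entity User;
entity Resource;

// Trajectory state accumulated by the harness across turns,
// e.g. taint labels picked up from sensitive observations
entity Trajectory {
  taints: Set<String>,
  turn: Long,
};

entity Agent {
  owner: User,
  trajectory: Trajectory,
};

// Hook events arrive with per-event context supplied by the adapter
action WebFetch, ShellExec, FileWrite appliesTo {
  principal: [Agent],
  resource: [Resource],
  context: {
    command?: String,
    url?: String,
    permissionMode?: String,
  },
};
```

On each hook event, the adapter looks up the current `Trajectory` entity and passes it alongside the request, which is how a stateless Cedar evaluation can still express multi-turn taint tracking.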
Three demo scenarios were presented:
1. **Destructive command blocking**: a Cedar policy catches `DELETE FROM users` without a WHERE clause and blocks it, returning the policy violation context to the agent
2. **Information flow control (IFC)**: Gemini CLI retrieves PII (name, DOB, address), tagged as "highly confidential." On a subsequent turn, when the agent attempts a web fetch, the Cedar engine detects that the current trajectory is tainted with sensitive data and blocks the network call dynamically
3. **Lethal trifecta (Cursor)**: a skill fetched from a public marketplace contains a `metrics.py` script that harvests environment variables and exfiltrates them via HTTP. The Cedar harness detects and blocks the shell command after running the script through YARA signatures
Writing Cedar policies by hand is tedious, but the formal properties enable **AI-assisted policy generation**: a policy agent can author and validate Cedar policies using MCP tools that expose schema context, then run formal analysis to verify correctness.
Maisel also noted a specific Claude Code finding: Claude Code will write files even when in plan mode, a behavior that can be caught and blocked with a Cedar policy checking the permission-mode state.
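That plan-mode escape could be closed with a policy like the following sketch, assuming the adapter forwards the session's permission mode in the request context (the field name is hypothetical):

```cedar
// Deny any file write while the session is in plan mode,
// regardless of what the model decides to do.
forbid (
  principal,
  action == Action::"FileWrite",
  resource
) when {
  context.permissionMode == "plan"
};
```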
## Key Takeaways
- A deterministic reference monitor outside the model is a necessary complement to in-model safeguards — prompts and guardrails alone are insufficient
- Cedar's formal analyzability (contradiction detection, vacuous policy checking) is a meaningful advantage over policy languages like Rego
- Multi-turn stateful policies (taint tracking across trajectory) require a separate entity/trajectory store since Cedar itself is stateless
- Hook granularity varies significantly across coding agents — Cursor has the most, Claude Code has the least (no model-level hooks)
- Information flow control tainting enables dynamic access revocation: agent gains PII context → subsequent network calls get blocked automatically
- Policy content can be sourced from existing CLAUDE.md and cursor rules files — formalize the intent already expressed there as Cedar policies
- Cedar policy generation can itself be delegated to an AI agent, with formal verification as the correctness gate
## Notable Quotes / Data Points
- Claude Code specific finding: "Claude Code lets you write files while in plan mode" — can be caught and blocked with Cedar
- Cedar's formal policy analysis is performed by a symbolic compiler written in Lean
- Current guardrail integrations in the harness: YARA signatures, information flow control models, and a safety model (GPT Safeguards 20B)
- The framework is open source; QR code to GitHub repo provided in talk
- Cedar-based enforcement is also being brought to first-party agent frameworks: LangChain, Strands, ADK, and others
- Multi-agent trajectory merging (agent-to-agent communication boundaries) is identified as future work — contextual integrity policies for secrets one agent knows but others shouldn't
#unprompted #claude