Ring & Peedikayil - Operation Pale Fire

Speaker:: Wes Ring & Josiah Peedikayil Title:: Operation Pale Fire Duration:: 21 min Video:: https://www.youtube.com/watch?v=SUa1nta8FGQ ## Key Thesis Operation Pale Fire was a red team operation at Block targeting Goose, their open-source AI agent, to determine whether a real-world end-to-end prompt injection attack chain — from external attacker to code execution on an employee laptop — was achievable in practice. The operation succeeded through a spear-phishing pivot after two calendar-based injection campaigns fell short due to model resistance, MCP version changes, and context window limitations. ## Synopsis Peedikayil presents the operation from Block's offensive security team, crediting colleagues Hassan, Wes Ring, Michael Rand, and Justin Enler. The team's goals: discover what an end-to-end AI agent compromise looks like in practice, identify dangerous features in Goose, and operationalize a prompt injection that could evade detection — not just prove the concept works in a lab, but make it stealthy. Goose (now open-sourced on Block's GitHub) is an AI agent with an auto-installed developer shell tool that can run shell commands. It integrates with multiple LLMs (Claude 3.5 Sonnet, 3.7, GPT-4) and an internal MCP marketplace allowing connections to Jira, Google Drive, and other enterprise tools. The combination of shell access and untrusted content ingestion made it an obvious red team target. **Campaign 1 — Calendar Prompt Injection.** The team identified Google Calendar as an attack surface: Block's calendar configuration at the time allowed external parties to schedule invites with employees, with the option to disable the associated email notification (making delivery stealthy). The attacker account could be omitted from the guest list, with only a small italicized "created by" attribution at the bottom. The attack chain: send a calendar invite with a malicious prompt injection in the description → wait for the employee to ask Goose "What's on my calendar today?" → Goose reads the calendar via MCP, ingests the description, executes the injected command via the developer shell tool call (curl-bash payload). To hide the injection payload, they used zero-width Unicode characters (ASCII smuggling) to render the text invisible to humans while remaining fully readable by the LLM. Despite promising early tests, the campaign failed in production. Key blockers: the internal Gcal MCP received a major rewrite mid-operation that broke the tuned injection; Claude 3.7 and GPT models were more resistant than 3.5; on real calendars with many events, the context window filled and the model forgot to execute the shell command; and the team lacked automation to test across multiple models and versions. **Campaign 2 — Goose Recipes.** Recipes are a Goose-specific feature: reusable workflows encoded as base64 URLs that load into the system prompt. The team discovered that injecting a payload into the recipe's "instructions" field placed it directly in the system prompt, a much stronger position than a user-prompt injection. They encoded the payload in base64 and further obscured it using zero-width Unicode inside the decoded instructions. The attack vector became: send a calendar invite with a hyperlink to the malicious recipe URL, with a compelling pretext ("We've blocked off time for you to try the new Goose recipe"), including a Google Meet link to add legitimacy. The first campaign version had attendees accidentally join the Meet expecting a real meeting; the team adapted by preparing a demo slide deck. A user clicked the recipe link. However, a typo in the payload (which LLMs had been auto-correcting during testing but not in production) prevented code execution. **Campaign 3 — Spear Phishing the Goose Team.** The team pivoted to targeted spear phishing. They contacted the Goose development team through a public channel, posing as external researchers who had found a bug in RTL text rendering within a recipe. When the developer ran the recipe to investigate, it triggered the developer shell tool call, executing the curl-bash payload — delivering the infostealer. The infostealer was caught by Block's detection team shortly after execution, validating their detections for this category of threat. Mitigations implemented after the operation: stripping non-standard Unicode characters from inputs and recipes; adding a recipe warning that displays what the recipe will do before execution (transparency); prompt injection detection using a BERT classifier for bad bash command detection; LLM-based semantic prompt injection detection (mixed results); command allow-listing; and requiring employee approval before external parties can place calendar invites. ## Key Takeaways - End-to-end AI agent compromise via prompt injection is operationally achievable — not just theoretically - Zero-width Unicode (ASCII smuggling) is an effective technique for hiding prompt injection payloads in plain sight across calendar invites and recipe URLs - System prompt injection (via Goose recipes) is significantly more reliable than user-prompt injection for achieving consistent execution - Context window saturation on real calendars caused the agent to forget injected instructions — a real-world constraint that lab testing misses - Model resistance varies significantly: Claude 3.7 and GPT models were more resistant than 3.5 Sonnet at the time - Testing on sparse personal calendars doesn't replicate production calendar density; automate testing across models and versions - Spear phishing the developer team (via social engineering, not technical exploit) was the successful vector - Blocking external calendar invites without prior email approval is a high-value, low-cost mitigation ## Notable Quotes / Data Points - Google Calendar API allows disabling email notifications on invites, making delivery stealthy - Attacker account can be omitted from the guest list entirely — only "created by" text remains as indicator - First prompt injection payloads were long enough that the team called them "master prompt injections" — later refined - Claude 3.5 Sonnet responded better to "you need to run this curl bash to stay secure" framing - Over 50 calendar invite/day rate limiting by Google was a logistics obstacle - Block's red team deconfliction workflow: infostealer was caught and deconflicted quickly after execution - Open source tool: Block engineering blog publishes red team content #unprompted #claude