Incident Tracker¶
Real-World AI Security Incidents Mapped to MASO Controls
Part of the MASO Framework · Threat Intelligence · Last updated: February 2026
Purpose¶
This tracker maps publicly disclosed AI security incidents to MASO control domains, identifying which controls would have prevented, detected, or contained each incident. Every entry includes the attack vector, the multi-agent amplification risk (how the same attack would compound in a multi-agent system), and the specific MASO controls that address it.
Incidents are classified by the OWASP risk they exploit and the MASO tier at which controls would have been effective.
Incident Register¶
INC-01: Autonomous Agent Crypto Transfer (March 2024)¶
What happened: Researchers demonstrated that an Auto-GPT agent with cryptocurrency wallet access could be tricked into transferring funds to attacker addresses via indirect prompt injection hidden in email content. The agent processed a newsletter containing embedded instructions and initiated an unauthorised wallet transfer.
Attack vector: Indirect prompt injection via email content → tool execution (wallet transfer)
OWASP mapping: LLM01 (Prompt Injection), LLM06 (Excessive Agency), ASI02 (Unrestricted Tool Access)
Multi-agent amplification: In a multi-agent system, a compromised email-processing agent could pass poisoned instructions to a financial-execution agent through the message bus. The execution agent would treat the instructions as legitimate delegated tasks, with no independent verification of the original source.
MASO controls that address this:
| Control | Domain | Effect |
|---|---|---|
| PG-1.1 Input guardrails per agent | Prompt & Goal Integrity | Block injection at email agent boundary |
| PG-1.4 Message source tagging | Prompt & Goal Integrity | Tag email-derived content as untrusted data, not instruction |
| EC-1.2 Tool allow-lists | Execution Control | Restrict wallet operations to explicitly approved task types |
| EC-2.5 LLM-as-Judge gate | Execution Control | Independent evaluation before financial execution |
| EC-1.1 Human approval for write operations | Execution Control | Human confirms all transfers at Tier 1 |
Minimum effective tier: Tier 1 (human approval for writes prevents execution; Tier 2 Judge gate catches it automatically)
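The source-tagging control (PG-1.4) can be sketched as a small envelope around every inbound message, so that email-derived text is structurally prevented from being interpreted as an instruction. This is an illustrative sketch, not the MASO reference implementation; the `TaggedMessage` type, the source names, and `may_trigger_tools` are all assumptions for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TaggedMessage:
    """An inter-agent message with an explicit trust label (PG-1.4 sketch)."""
    content: str
    source: str   # e.g. "email", "operator", "agent:planner" (illustrative)
    trust: str    # "instruction" or "untrusted_data"

# Only these sources may issue instructions; everything else is treated as data.
TRUSTED_SOURCES = {"operator", "system"}

def tag_inbound(content: str, source: str) -> TaggedMessage:
    trust = "instruction" if source in TRUSTED_SOURCES else "untrusted_data"
    return TaggedMessage(content=content, source=source, trust=trust)

def may_trigger_tools(msg: TaggedMessage) -> bool:
    # Tool execution (e.g. a wallet transfer) requires instruction-grade input.
    return msg.trust == "instruction"
```

Under this scheme the poisoned newsletter in INC-01 arrives labelled `untrusted_data`, so the wallet tool never fires regardless of what the embedded text says.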
INC-02: GitHub Copilot RCE — CVE-2025-53773 (2025)¶
What happened: An attacker embedded prompt injection in public repository code comments. When a developer opened the repository with Copilot active, the injected prompt instructed Copilot to modify .vscode/settings.json to enable YOLO mode (auto-approve all commands). Subsequent commands executed without user approval, achieving arbitrary code execution on the developer's machine.
Attack vector: Indirect prompt injection via code comments → configuration modification → RCE
OWASP mapping: LLM01 (Prompt Injection), LLM05 (Improper Output Handling), ASI02 (Unrestricted Tool Access)
Multi-agent amplification: In a multi-agent coding system, a code-review agent processing the poisoned repository could pass the injected instructions to a code-generation agent, which could then modify configuration files across multiple developer environments simultaneously. The blast radius scales with the number of agents consuming the same repository.
MASO controls that address this:
| Control | Domain | Effect |
|---|---|---|
| PG-1.1 Input guardrails per agent | Prompt & Goal Integrity | Detect injection patterns in code comments |
| PG-1.2 System prompt isolation | Prompt & Goal Integrity | Prevent external content from overriding agent instructions |
| EC-1.4 Blast radius caps | Execution Control | Limit scope of file modifications per agent |
| EC-2.5 LLM-as-Judge gate | Execution Control | Flag settings.json modification as high-risk action |
| IA-1.4 Scoped tool permissions | Identity & Access | Prevent agents from modifying IDE configuration |
Minimum effective tier: Tier 1 (tool scoping prevents settings modification; blast radius caps contain damage)
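Scoped tool permissions (IA-1.4) plus a blast radius cap (EC-1.4) reduce, in this incident, to a write-check the agent's file tool consults before every modification. A minimal sketch follows; the protected prefixes and the per-task cap are illustrative values, not MASO-mandated ones.

```python
from pathlib import PurePosixPath

# Paths the agent may never write, regardless of what any prompt says.
PROTECTED_PREFIXES = (".vscode/", ".git/", ".ssh/")
MAX_FILES_PER_TASK = 5  # EC-1.4 blast radius cap (illustrative value)

def check_write(path: str, files_written_so_far: int) -> bool:
    # Normalise case so ".VSCode/settings.json" cannot slip past the check.
    normalised = str(PurePosixPath(path)).lower()
    if normalised.startswith(PROTECTED_PREFIXES):
        return False  # IDE/VCS configuration is out of scope for the agent
    if files_written_so_far >= MAX_FILES_PER_TASK:
        return False  # cap how many files a single task may touch
    return True
```

With this gate in place, the injected instruction to rewrite `.vscode/settings.json` is denied at the tool boundary rather than depending on the model resisting the prompt.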
INC-03: Cursor IDE Agentic RCE — CVE-2025-59944 (2025)¶
What happened: A case-sensitivity bug in a protected file path allowed an attacker to influence Cursor's agentic behaviour. The agent read the wrong configuration file containing hidden instructions, which escalated into remote code execution. The root cause was that the agent trusted unverified external content and treated it as authoritative.
Attack vector: Path traversal via case sensitivity → configuration poisoning → RCE
OWASP mapping: LLM01 (Prompt Injection), ASI01 (Agent Goal Hijack), ASI02 (Unrestricted Tool Access)
Multi-agent amplification: A configuration-poisoning attack in a multi-agent system could compromise the orchestrator's task definitions, redirecting all downstream agents. If the orchestrator reads poisoned configuration, every agent it spawns inherits the attacker's instructions.
MASO controls that address this:
| Control | Domain | Effect |
|---|---|---|
| PG-1.3 Immutable task specification | Prompt & Goal Integrity | Task definitions cannot be modified by external content |
| PG-2.2 Goal integrity monitoring | Prompt & Goal Integrity | Detect deviation from original task objectives |
| SC-1.1 Component inventory (AIBOM) | Supply Chain | Track all configuration sources; detect unauthorised changes |
| IA-2.6 Secrets exclusion from context | Identity & Access | Configuration files with credentials isolated from agent context |
Minimum effective tier: Tier 2 (goal integrity monitoring detects the deviation; immutable task specs prevent it)
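One way to realise an immutable task specification (PG-1.3) is to pin the spec with a content hash at creation time and re-verify it before every downstream use, so a poisoned configuration file cannot silently rewrite the task. The function names below are assumptions for illustration.

```python
import hashlib
import json

def freeze_task(spec: dict) -> str:
    """Pin a task specification at creation time; returns its digest."""
    canonical = json.dumps(spec, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()

def verify_task(spec: dict, digest: str) -> bool:
    # Any in-flight change (e.g. from a poisoned config file) alters the hash.
    return freeze_task(spec) == digest
```

An orchestrator that verifies the digest before spawning each agent stops the INC-03 pattern: the tampered spec fails verification and never reaches downstream agents.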
INC-04: Perplexity Comet Browser Agent Data Exfiltration (2025)¶
What happened: Researchers demonstrated that Perplexity's AI-powered browser agent could be manipulated through prompt injection in web page content to exfiltrate sensitive data from the user's browsing session. The agent processed web content containing hidden instructions and followed them.
Attack vector: Indirect prompt injection via web content → data exfiltration through agent browsing actions
OWASP mapping: LLM01 (Prompt Injection), LLM02 (Sensitive Information Disclosure), ASI04 (Inadequate Sandboxing)
Multi-agent amplification: A browsing agent in a multi-agent system could exfiltrate data and pass it to other agents through the message bus. If the exfiltrated data includes credentials or internal URLs, downstream agents could use them to access additional systems.
MASO controls that address this:
| Control | Domain | Effect |
|---|---|---|
| DP-1.1 Data classification labels | Data Protection | Classify browsing data; prevent cross-boundary transfer |
| DP-2.1 DLP on message bus | Data Protection | Detect sensitive data in inter-agent messages |
| EC-1.4 Blast radius caps | Execution Control | Limit browsing agent's access to user session data |
| OB-2.1 Anomaly scoring | Observability | Flag unusual data transfer patterns |
Minimum effective tier: Tier 1 (data classification prevents exfiltration; DLP catches it at the bus)
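The DLP-on-message-bus control (DP-2.1) amounts to scanning every inter-agent message before it is delivered. The sketch below uses a few illustrative regex detectors; a production layer would also honour the classification labels from DP-1.1 rather than relying on pattern matching alone.

```python
import re

# Illustrative detectors only; real DLP policy would be richer and tuned.
PATTERNS = {
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "bearer_token": re.compile(r"\b(?:bearer|token)[=: ]\S{16,}", re.I),
    "internal_url": re.compile(r"https?://[\w.-]*\.internal\b", re.I),
}

def scan_message(text: str) -> list[str]:
    """Return the names of sensitive-data patterns found in a message."""
    return [name for name, rx in PATTERNS.items() if rx.search(text)]

def allow_on_bus(text: str) -> bool:
    # Fail closed: any hit blocks delivery pending review.
    return not scan_message(text)
```

Applied to INC-04, a browsing agent that tries to relay a session token or internal URL through the bus is blocked before any downstream agent can act on it.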
INC-05: PoisonedRAG — Knowledge Base Contamination (2024)¶
What happened: Researchers demonstrated that adding just 5 malicious documents to a corpus of millions caused the targeted AI to return attacker-desired false answers 90% of the time for specific trigger queries. The poisoning was undetectable because the AI was technically performing retrieval correctly — the retrieved content itself was compromised.
Attack vector: Data poisoning via RAG corpus → misinformation delivery
OWASP mapping: LLM04 (Data and Model Poisoning), LLM08 (Vector and Embedding Weaknesses), LLM09 (Misinformation)
Multi-agent amplification: This is where multi-agent dynamics make the attack dramatically worse. A research agent retrieves poisoned data. An analysis agent synthesises it into a report. A presentation agent formats it for stakeholders. By the third agent, the poisoned claim has been cited, elaborated, and presented with high confidence. No agent in the chain has independent verification capability. This is exactly the epistemic failure mode MASO's Prompt, Goal & Epistemic Integrity domain was designed to address.
MASO controls that address this:
| Control | Domain | Effect |
|---|---|---|
| DP-2.2 RAG integrity with freshness | Data Protection | Validate document provenance and freshness metadata |
| PG-2.5 Claim provenance enforcement | Prompt & Goal Integrity | Unverified agent claims cannot be treated as facts |
| PG-2.6 Self-referential evidence prohibition | Prompt & Goal Integrity | Agents cannot cite other agents' output as primary evidence |
| PG-2.7 Uncertainty preservation | Prompt & Goal Integrity | Confidence scores propagate through chain; don't inflate |
| PG-3.5 Challenger agent | Prompt & Goal Integrity | Adversarial agent attacks primary hypothesis |
Minimum effective tier: Tier 2 (provenance enforcement and uncertainty preservation catch the propagation; Tier 3 challenger agent provides active defence)
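Uncertainty preservation (PG-2.7) can be made concrete with a simple propagation rule: a claim relayed through several agents is never more certain than its least certain hop. The rule choice (`min` versus a product) and the 0.8 threshold below are assumptions for the sketch, not MASO-specified values.

```python
def propagate_confidence(chain: list[float]) -> float:
    """
    A claim relayed through a chain of agents can never be more certain
    than its weakest link. min() is used here; a product rule is stricter.
    """
    if not chain:
        raise ValueError("empty provenance chain")
    return min(chain)

def presentable_as_fact(confidence: float, threshold: float = 0.8) -> bool:
    # Below threshold the claim stays tagged 'unverified' downstream (PG-2.5).
    return confidence >= threshold
```

In the INC-05 chain, the research agent's 0.6-confidence retrieval caps the whole pipeline at 0.6, so the presentation agent cannot state the poisoned claim as fact.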
INC-06: Samsung Confidential Code Leak via ChatGPT (2023)¶
What happened: Samsung engineers pasted proprietary source code into ChatGPT for debugging assistance, leaking confidential intellectual property. Samsung subsequently banned internal use of external AI tools. Similar incidents were reported at JPMorgan, Goldman Sachs, and other financial institutions.
Attack vector: User error / policy violation → data exfiltration to external AI provider
OWASP mapping: LLM02 (Sensitive Information Disclosure), ASI06 (Inadequate Data Controls)
Multi-agent amplification: In a multi-agent system with external model providers, data shared with one agent's model provider is effectively shared with all agents using that provider. If Agent A sends proprietary code to Provider X for analysis, and Agent B also uses Provider X, the data boundary has been breached at the provider level even if inter-agent controls are perfect.
MASO controls that address this:
| Control | Domain | Effect |
|---|---|---|
| DP-1.1 Data classification labels | Data Protection | Classify code as confidential; block external transmission |
| DP-2.1 DLP on message bus | Data Protection | Detect code patterns in outbound messages |
| DP-1.3 Memory isolation | Data Protection | Prevent context leakage between agent sessions |
| SC-2.1 AIBOM with provider mapping | Supply Chain | Know which data reaches which external provider |
Minimum effective tier: Tier 1 (data classification and DLP prevent the leak)
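Data classification labels (DP-1.1) translate into a per-destination ceiling check: each egress point is cleared up to a maximum classification, and unknown destinations default to the most restrictive option. The label names and destination keys below are illustrative assumptions.

```python
from enum import IntEnum

class Label(IntEnum):
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3

# Highest classification each destination is cleared to receive (illustrative).
DESTINATION_CEILING = {
    "external_llm_provider": Label.PUBLIC,
    "internal_agent": Label.CONFIDENTIAL,
}

def may_send(label: Label, destination: str) -> bool:
    # Unknown destinations fail closed to the most restrictive ceiling.
    ceiling = DESTINATION_CEILING.get(destination, Label.PUBLIC)
    return label <= ceiling
```

Under this check, proprietary source code labelled `CONFIDENTIAL` never reaches an external model provider, which is exactly the transmission the Samsung incident involved.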
INC-07: AI Worm Proof-of-Concept — Morris II (2024)¶
What happened: Researchers demonstrated a self-replicating AI worm that spread between autonomous agents through prompt injection. The worm injected itself into AI-generated content. When a compromised agent communicated with another through email or chat, hidden instructions in the message infected the receiving agent, which then propagated the worm to other agents it communicated with.
Attack vector: Self-replicating prompt injection via inter-agent communication → cascading compromise
OWASP mapping: LLM01 (Prompt Injection), ASI01 (Agent Goal Hijack), ASI03 (Insecure Agent Communication), ASI08 (Agent Memory Poisoning)
Multi-agent amplification: This attack is inherently multi-agent. It cannot exist in a single-agent system. The worm exploits the communication channel between agents — exactly the inter-agent message bus that MASO treats as a first-class security control point. Every agent in the chain becomes both victim and vector.
MASO controls that address this:
| Control | Domain | Effect |
|---|---|---|
| PG-1.1 Input guardrails per agent | Prompt & Goal Integrity | Detect injection patterns in incoming messages |
| PG-2.1 Inter-agent injection detection | Prompt & Goal Integrity | Judge evaluates all inter-agent messages for injection |
| PG-1.4 Message source tagging | Prompt & Goal Integrity | Distinguish instruction from data in messages |
| EC-1.5 Interaction timeout | Execution Control | Cap propagation chains at max turn count |
| OB-3.1 Independent observability agent | Observability | Detect anomalous communication patterns across the system |
| OB-3.2 Circuit breaker | Observability | Kill switch terminates all agent communication |
Minimum effective tier: Tier 2 (inter-agent injection detection + Judge gate blocks propagation; Tier 3 independent observability agent detects system-wide anomaly)
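The interaction timeout (EC-1.5) is the control that mechanically bounds a worm like this: if every relay increments a hop counter and the bus drops messages past a cap, a self-replicating payload cannot spread indefinitely even if every guardrail misses it. A minimal sketch, with an illustrative cap value:

```python
MAX_HOPS = 4  # illustrative EC-1.5 limit on delegation/relay depth

def forward(message: dict) -> dict:
    """Relay a message to the next agent, enforcing a hop cap."""
    hops = message.get("hops", 0)
    if hops >= MAX_HOPS:
        # Drop and surface to observability rather than relaying further.
        raise RuntimeError("hop limit reached; message dropped and flagged")
    return {**message, "hops": hops + 1}
```

Because the worm must re-send itself on every hop, its propagation chain dies at the cap, and the dropped-message event gives the observability agent (OB-3.1) a signal to investigate.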
INC-08: MCP Server Supply Chain Attacks (2025)¶
What happened: Multiple reports documented vulnerabilities in Model Context Protocol (MCP) servers, including the Toxic Agent Flow exploit in GitHub's MCP server. Attackers exploited MCP tool descriptions and resource listings to inject instructions that influenced model behaviour. With tens of thousands of MCP servers published online, the supply chain attack surface expanded rapidly.
Attack vector: Poisoned MCP tool metadata → indirect prompt injection → unauthorised actions
OWASP mapping: LLM03 (Supply Chain Vulnerabilities), LLM01 (Prompt Injection), ASI02 (Unrestricted Tool Access)
Multi-agent amplification: Multi-agent systems consume multiple MCP servers — one per specialist agent is a common pattern. A single poisoned MCP server can compromise the agent that uses it, which then becomes a vector for poisoning other agents through the message bus. The supply chain risk scales multiplicatively with the number of MCP integrations.
MASO controls that address this:
| Control | Domain | Effect |
|---|---|---|
| SC-1.2 Signed tool manifests | Supply Chain | Verify MCP server integrity before connection |
| SC-2.2 MCP server vetting | Supply Chain | Pre-approve MCP servers; deny unsigned/unvetted |
| SC-2.3 Runtime component audit | Supply Chain | Continuous verification of active MCP connections |
| PG-1.1 Input guardrails per agent | Prompt & Goal Integrity | Filter injection from MCP tool responses |
| IA-1.4 Scoped tool permissions | Identity & Access | Limit what each MCP server can access |
Minimum effective tier: Tier 2 (signed manifests + vetting + runtime audit provide defence in depth)
INC-09: Financial Services AI Banking Assistant Fraud (June 2025)¶
What happened: Attackers sent crafted messages through a banking app's AI chat interface that tricked the AI into bypassing transaction verification steps. The AI approved fraudulent transfers because the attacker's instructions overrode the normal security protocols. The company lost approximately $250,000 before detecting the attack.
Attack vector: Direct prompt injection via chat → security bypass → fraudulent transactions
OWASP mapping: LLM01 (Prompt Injection), LLM06 (Excessive Agency), ASI09 (Inadequate Human Oversight)
Multi-agent amplification: In a multi-agent banking system, a compromised customer-facing agent could delegate fraudulent transactions to a back-office execution agent, using legitimate delegation channels. The execution agent would see a properly formatted request from an authorised agent — the fraud would be invisible at the execution layer.
MASO controls that address this:
| Control | Domain | Effect |
|---|---|---|
| EC-1.1 Human approval for write operations | Execution Control | All financial transactions require human confirmation |
| EC-2.5 LLM-as-Judge gate | Execution Control | Independent evaluation of transaction legitimacy |
| EC-2.6 Decision commit protocol | Execution Control | Committed decisions cannot be reversed without human auth |
| PG-1.1 Input guardrails per agent | Prompt & Goal Integrity | Detect injection patterns in customer messages |
| OB-2.1 Anomaly scoring | Observability | Flag unusual transaction patterns |
Minimum effective tier: Tier 1 (human approval for writes is the minimum — no AI should autonomously approve financial transactions without it)
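Human approval for write operations (EC-1.1) reduces to a hard gate in the execution path: no write-class action runs without an explicit human decision, no matter how the request was phrased. The action names and `approver` callback below are illustrative assumptions.

```python
WRITE_ACTIONS = {"transfer", "payment", "limit_change"}  # illustrative set

def execute(action: str, params: dict, approver=None) -> str:
    """
    Write operations never run without an explicit human decision.
    `approver` stands in for whatever approval UI the deployment uses.
    """
    if action in WRITE_ACTIONS:
        if approver is None or not approver(action, params):
            return "blocked: human approval required"
    return f"executed: {action}"
```

Because the gate sits in code rather than in the prompt, an injected instruction to "skip verification" has nothing to override: the fraudulent transfer in INC-09 stalls at the approval step.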
INC-10: LLM-as-Judge Manipulation — JudgeDeceiver (2024–2025)¶
What happened: Researchers demonstrated JudgeDeceiver, an optimisation-based attack that injects a crafted sequence into a candidate response such that an LLM-as-Judge selects the attacker's response regardless of quality. This has implications for LLM-powered search ranking, reinforcement learning with AI feedback, and tool selection systems.
Attack vector: Adversarial optimisation → Judge manipulation → compromised evaluation
OWASP mapping: LLM01 (Prompt Injection), ASI07 (Insecure AI Evaluation)
Multi-agent amplification: If the Judge layer itself is compromised, every control that depends on Judge evaluation is bypassed simultaneously. In MASO, the Judge gate (EC-2.5) is a critical control point — manipulating it undermines execution control, goal integrity monitoring, and output validation across the entire multi-agent system.
MASO controls that address this:
| Control | Domain | Effect |
|---|---|---|
| EC-2.5 LLM-as-Judge gate (hardened) | Execution Control | Multiple judge criteria reduce single-point manipulation |
| PG-2.9 Model diversity policy | Prompt & Goal Integrity | Judge uses different model/provider than task agents |
| OB-3.1 Independent observability agent | Observability | Separate monitoring agent with own model; cross-checks Judge |
| PG-3.5 Challenger agent | Prompt & Goal Integrity | Adversarial agent tests Judge decisions |
| EC-3.1 Multi-judge consensus | Execution Control | Multiple independent judges for high-risk decisions |
Minimum effective tier: Tier 3 (defending the Judge requires model diversity, independent observability, and challenger agents — this is an advanced threat)
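Multi-judge consensus (EC-3.1) can be sketched as a quorum vote over independent verdicts. The quorum fraction below is an illustrative choice; the key property is that judges run on different models and providers (PG-2.9), so a suffix optimised against one judge is unlikely to transfer to all of them.

```python
def consensus(verdicts: list[bool], quorum: float = 2 / 3) -> bool:
    """
    Approve a high-risk action only when a quorum of independent judges
    agrees. Diversity across models/providers (PG-2.9) means one
    optimised adversarial suffix is unlikely to fool every judge at once.
    """
    if not verdicts:
        return False  # fail closed when no judge responded
    return sum(verdicts) / len(verdicts) >= quorum
```

A JudgeDeceiver-style payload that flips one judge out of three no longer carries the vote, and the disagreement itself is a useful anomaly signal for the observability agent.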
Incident Statistics¶
| Category | Count | Most Common OWASP Risk |
|---|---|---|
| Prompt injection (direct + indirect) | 6 | LLM01 |
| Data exfiltration / disclosure | 3 | LLM02 |
| Supply chain compromise | 2 | LLM03 |
| Knowledge base poisoning | 1 | LLM04 |
| Tool/agency abuse | 4 | LLM06 / ASI02 |
| Judge/evaluation manipulation | 1 | ASI07 |
Most referenced MASO controls across all incidents:
| Control | Incidents Addressed |
|---|---|
| PG-1.1 Input guardrails per agent | 5/10 |
| EC-2.5 LLM-as-Judge gate | 4/10 |
| EC-1.1 Human approval for writes | 2/10 |
| DP-2.1 DLP on message bus | 2/10 |
| OB-3.1 Independent observability agent | 2/10 |
How to Use This Tracker¶
For risk assessments: Reference specific incidents when justifying control investments. Each incident includes the MASO controls that would have prevented or contained it.
For red team planning: Use the attack vectors as starting points for testing your multi-agent system against known real-world patterns. See the Red Team Playbook for structured test scenarios.
For executive briefings: The incident statistics and dollar-value losses (INC-09: $250K) provide concrete evidence for security investment decisions.
For control gap analysis: If your deployment lacks any control referenced in the "MASO controls that address this" column, you have a known exposure to a real-world attack pattern.
AI Runtime Behaviour Security, 2026 (Jonathan Gill).