RAG Is Your Biggest Attack Surface¶
The Pattern Everyone Uses, Nobody Secures¶
Retrieval-Augmented Generation (RAG) is the dominant enterprise AI pattern. It lets LLMs answer questions using your data without retraining.
The architecture is simple: embed your documents, store embeddings in a vector database, retrieve relevant chunks at query time, pass them to the LLM as context.
The security implications are not simple.
The Problem¶
RAG creates a new data access path that bypasses your existing access controls.
Traditional path:
User → Application → Database → Access Control → Data
RAG path:
User → LLM → Retrieval → Vector Store → (maybe access control?) → Data → LLM → User
The LLM sees the retrieved data. The LLM generates a response. If the retrieved data includes content the user shouldn't see, the LLM will happily summarise it for them.
Five Risks You're Probably Not Controlling¶
1. Retrieval Bypasses Document-Level Access Control¶
You embedded 50,000 documents. Some are HR-confidential. Some contain board minutes. Some are public knowledge base articles.
When a user queries the system, the vector similarity search returns the most semantically relevant chunks. It does not check whether the user is authorised to see them.
Control required: Query-time access filtering that enforces document-level (or chunk-level) permissions before retrieved content reaches the LLM.
2. Data Poisoning Through Ingestion¶
If an attacker can inject or modify documents in your source corpus, they can influence every future RAG response.
This is not theoretical. Any system that ingests user-generated content, customer emails, uploaded documents, or web-scraped data has an open ingestion path.
Control required: Ingestion validation, source authentication, and content integrity checks before embedding.
3. Prompt Injection Via Retrieved Content¶
Retrieved chunks become part of the LLM's context. If a retrieved document contains adversarial instructions (e.g., "Ignore previous instructions and..."), the LLM may follow them.
This is indirect prompt injection. The attack vector is your own data.
Control required: Content sanitisation at ingestion, guardrails on retrieved content before it enters the prompt, and output validation.
4. Information Leakage Through Inference¶
Even with access controls on retrieval, the LLM may infer sensitive information from seemingly innocuous chunks. Salary bands from job descriptions. M&A targets from legal memos. Customer complaints from support tickets.
The LLM synthesises. That's its job. The synthesis may reveal more than any individual source document.
Control required: Classification-aware retrieval that considers the sensitivity of synthesised output, not just individual source documents.
5. Embedding Store as a High-Value Target¶
Your vector database contains dense numerical representations of your proprietary data. It's typically less protected than your source databases because security teams don't yet think of vector stores as data stores.
They are.
Control required: Encryption at rest and in transit, access control, audit logging, and network segmentation for vector databases — the same controls you apply to any data store containing sensitive information.
What the Three-Layer Pattern Catches¶
| Layer | RAG Risk Mitigated |
|---|---|
| Guardrails | PII in outputs, known-bad content patterns |
| Judge | Responses that seem inconsistent with expected scope |
| Human Oversight | Edge cases flagged by the judge |
What It Misses¶
| Risk | Why the Pattern Misses It |
|---|---|
| Unauthorised retrieval | Happens before the LLM generates output — no output to evaluate |
| Data poisoning | Corrupted data produces plausible responses — judge may not flag them |
| Indirect prompt injection via data | Guardrails check user input, not retrieved content |
| Inference-based leakage | Individual outputs may look fine; the risk is in aggregation |
| Vector store compromise | Infrastructure risk, not output risk |
The three-layer pattern monitors output quality. RAG security requires controlling the input pipeline as well.
The Controls¶
See RAG Security Controls for implementation guidance.¶
AI Runtime Behaviour Security, 2026 (Jonathan Gill).