RAG Security: The Attack Surface Inside Your Enterprise AI

The Question

Retrieval-Augmented Generation — RAG — is the architecture most enterprises use to make AI systems useful with their own data. Instead of retraining a model on proprietary information, you build a vector database of your documents, policies, and knowledge base, and the AI retrieves relevant content at query time to inform its responses.

It is an elegant solution to a real problem. It is also a security surface that most enterprise security teams have never reviewed, do not monitor, and have not included in their threat model.

OWASP classifies this risk as LLM08: Vector and Embedding Weaknesses — attacks that target the retrieval layer of AI systems rather than the model itself. It is one of the least understood risks in the OWASP Top 10 for LLMs, and one of the most underdefended in enterprise deployments.

Why This Matters Now

In 2024, security researcher Johann Rehberger published a demonstration that has since become one of the most widely cited examples of indirect prompt injection in production AI systems. Rehberger showed that ChatGPT's memory feature — which allows the model to store facts about a user and reference them in future sessions — could be manipulated by a malicious webpage.

The attack worked like this: Rehberger created a webpage containing hidden text with an instruction embedded in it. When a user visited the page and asked ChatGPT to summarize it, the model processed the hidden instruction as part of the content and wrote a false "memory" — an attacker-controlled piece of information — into its persistent memory store. In future sessions, ChatGPT would reference that false memory as if it were a true fact about the user.

The user never saw the instruction. They never consented to the memory being written. The attack required no access to their ChatGPT account — only that they visit a page the attacker controlled.

This is the RAG security threat made concrete. The attack did not target the model. It targeted the data the model retrieves and trusts. Your vector database is subject to the same class of attack.

What the CURVE™ Data Shows

RAG security sits at the intersection of two market categories tracked in the 2026 Stackcurve AI Security CURVE™ Report: Memory Poisoning (one of the five agentic threat classes) and the broader AI Application Security category.

The research finding is direct: vector database security is the most underserved gap in the enterprise AI security stack. Most enterprises that have built RAG systems have invested in the model layer — prompt guardrails, output filtering, access controls on the AI application itself — and almost nothing in the retrieval layer. The vector store is treated as infrastructure rather than as a security-sensitive data system.

This is a misjudgment. The vector database is the trust foundation of a RAG system. The model will retrieve content from it and treat that content as authoritative context. If the content in the vector store is wrong, manipulated, or adversarially crafted, the model's outputs will reflect that. The model has no intrinsic ability to distinguish a legitimate document from a poisoned one in its own retrieval corpus.

The full vendor rankings are in the 2026 AI Security CURVE™ Report — free to download.

The Gap Most Buyers Miss

Understanding the RAG security problem requires understanding how trust flows in a RAG architecture.

In a traditional application, data from a database is passed to application logic that processes it explicitly. A developer can write validation rules, sanitization functions, and access controls that inspect the data before it affects behavior.

In a RAG architecture, retrieved content is passed to a language model that processes it implicitly — it reads it, understands it semantically, and uses it to generate a response. The model does not run the retrieved content through an explicit validation layer. It trusts what it retrieves. That trust is the attack surface.

Three specific attack patterns exploit this architecture:

Document poisoning — an attacker with write access to the document corpus, or with the ability to get a malicious document into the ingestion pipeline, embeds instructions in a document that the model will later retrieve. When a user triggers a query that retrieves that document, the embedded instruction is processed alongside the legitimate content. This is indirect prompt injection at the retrieval layer: the document is the injection vector.

Embedding inversion — academic research has demonstrated that the vector embeddings stored in RAG databases can, in some architectures, be partially inverted to reconstruct the original text. If your vector database is compromised, an attacker may be able to recover not just metadata about your documents but their actual content — including content you believed was protected because only embeddings were stored.

Cross-session contamination — in shared RAG architectures where multiple users or AI agents write to and retrieve from the same vector store, a write from one context can influence retrieval in another. An attacker who can contribute content to a shared knowledge base can shape the responses that other users receive from the AI system built on that base.

Questions Your Buying Team Should Be Asking

1. What is the access control model for your vector database? Who can write to it? Who can read from it? Are those permissions scoped to specific collections or namespaces, or is access broad? The vector database requires the same access control rigor as any other sensitive data store — but most enterprises have not applied it.

2. Do you scan and validate documents before they enter your RAG corpus? Every document ingested into your vector database is a potential injection vector. A pre-ingestion scanning step — looking for anomalous instruction-like content in documents that should be purely informational — is the most direct control against document poisoning. Few enterprises have this step in their ingestion pipeline.

3. Can you audit what your RAG system retrieved for a given query? Retrieval logging — a record of which documents were retrieved for each query and what they contained — is both a security monitoring tool and a forensic tool when something goes wrong. Most enterprise RAG deployments have no retrieval audit trail. Building one is a low-cost, high-value control.

4. How is your vector database isolated between users, tenants, or applications? Shared vector stores create cross-contamination risk. If multiple AI applications or user contexts retrieve from the same store, a poisoning attack in one context can affect outputs in another. Namespace isolation between contexts is the architectural control.

5. Have you included your RAG pipeline in your threat model? Walk your security team through the full data flow: document ingestion, chunking, embedding, storage, retrieval, and model consumption. Identify where adversarial content could enter the pipeline and what controls exist at each step. Most enterprises that have done this exercise discover significant gaps.

The Stackcurve Take

The Rehberger ChatGPT memory demonstration was alarming not because it was technically sophisticated — it was not — but because it revealed how completely the trust model of AI systems inverts the assumptions of traditional application security.

In traditional security, you validate inputs from users because users are untrusted. In a RAG architecture, the model implicitly trusts the documents it retrieves — and documents, unlike users, do not have to be present at the time of the attack. An adversary can plant a malicious document in your knowledge base today and have it influence model behavior for every query that retrieves it, indefinitely, until someone discovers and removes it.

The practical first step is the retrieval audit trail. You cannot detect poisoning you cannot see. Logging what your RAG system retrieves, for which queries, costs almost nothing and provides both monitoring capability and forensic capability when something goes wrong.

The second step is access control on the write path. A document corpus that anyone with SharePoint access can contribute to is a document corpus that anyone can poison. Explicit write permissions, with a review step for new document ingestion, narrows the attack surface significantly.

RAG security is not a solved problem — the vendor tooling is early and the threat research is still active. But the foundational controls are available, implementable, and effective against the most common attack patterns today.

The 2026 Stackcurve AI Security CURVE™ Report covers the AI monitoring, observability, and agentic security vendors most relevant to RAG deployments. Download it free →

← Back to Research Library

Stackcurve Advisory Briefs are independent research. No vendor pays for placement, tier assignment, or editorial influence. The CURVE™ methodology is disclosed in full at stackcurve.net/research/methodology.