Memory and Context Controls¶
Securing what the model remembers — across turns, sessions, and users.
This document uses the simplified three-tier system (Tier 1/2/3). See Risk Tiers — Simplified Tier Mapping for the mapping to LOW/MEDIUM/HIGH/CRITICAL.
The Problem¶
The three-layer pattern evaluates individual requests and responses. But AI systems accumulate context:
- Within a conversation — each turn adds to the context window
- Across conversations — persistent memory, session history, user profiles
- Across users — shared embeddings, cached responses, fine-tuned models
A single request-response pair may be safe. The accumulated context may not be.
Threat Model¶
| Threat | Vector | Impact |
|---|---|---|
| Gradual context poisoning | Early turns inject instructions that influence later turns | Model behaviour changes over a long conversation without triggering per-turn guardrails |
| Cross-session leakage | Persistent memory or shared cache surfaces User A's data for User B | Data breach — potentially regulated data |
| Memory manipulation | Injecting false "memories" via conversation that persist across sessions | Ongoing manipulation of model behaviour for a user |
| Context window overflow | Filling the context with irrelevant content to push out system instructions | Guardrail bypass — system prompt "forgotten" |
| Accumulated PII | Individual turns are PII-free but the conversation as a whole builds a profile | Privacy violation — model holds more personal data than any individual turn reveals |
Controls¶
1. Session Isolation¶
Every user session must be isolated. No shared state between users unless explicitly designed and controlled.
| Requirement | Implementation |
|---|---|
| Separate context per user | Each user gets their own conversation thread — no shared context window |
| Separate memory per user | Persistent memory is scoped to the authenticated user |
| No shared cache for generated responses | Response caching (if used) must be keyed to user + input, not input alone |
| Session timeout | Conversations expire after inactivity — context is not preserved indefinitely |
2. Context Window Hygiene¶
| Control | What It Does |
|---|---|
| System prompt anchoring | Re-inject system instructions at intervals in long conversations, not just at the start |
| Context summarisation | Periodically summarise old turns and replace verbose history with summaries |
| Turn limits | Maximum number of turns per conversation before requiring a new session |
| Token budget monitoring | Alert when context window approaches capacity — model behaviour degrades near limits |
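System prompt anchoring and turn limits combine naturally in the function that assembles the message list. The sketch below assumes a chat-style `{"role": ..., "content": ...}` message format; the constants and the `build_context` name are illustrative, and the interval and limit values are examples to tune, not recommendations.

```python
SYSTEM_PROMPT = "You are a support assistant. Follow policy X."  # placeholder
ANCHOR_EVERY = 8    # re-inject system instructions every 8 turns (example value)
MAX_TURNS = 40      # hard turn limit per session (example value)

def build_context(turns: list[dict]) -> list[dict]:
    """Assemble the message list, re-anchoring the system prompt at
    intervals so it cannot drift out of effective range in long chats."""
    if len(turns) >= MAX_TURNS:
        raise RuntimeError("turn limit reached - start a new session")
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    for i, turn in enumerate(turns):
        messages.append(turn)
        # Periodic re-anchor: repeat the system instructions mid-conversation
        if (i + 1) % ANCHOR_EVERY == 0:
            messages.append({"role": "system", "content": SYSTEM_PROMPT})
    return messages
```

Context summarisation would slot in before the loop, replacing old turns with a summary message; the anchoring logic is unchanged.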
3. Persistent Memory Controls¶
For systems that maintain memory across sessions (user preferences, conversation history, learned context):
| Control | What It Does |
|---|---|
| Memory content filtering | Apply guardrails to content before it's written to persistent memory |
| Memory access control | Only the authenticated user (or authorised system) can read their memory |
| Memory expiry | Set TTLs on stored memories — not everything should persist forever |
| Memory audit trail | Log what's written to and read from persistent memory |
| User memory controls | Users can view, edit, and delete their stored memories |
| Memory injection prevention | Validate that persistent memories are genuine (derived from real conversations), not injected |
4. Accumulated Context Evaluation¶
Don't just evaluate individual turns. Periodically evaluate the full conversation context.
| Trigger | Action |
|---|---|
| Every N turns (e.g., 10) | Run the Judge on the full conversation, not just the latest turn |
| Context window >50% full | Check for context poisoning patterns (repeated instructions, topic drift) |
| User requests sensitive action | Evaluate the full conversation for manipulation patterns before allowing the action |
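The trigger table above reduces to a small decision function. The thresholds and names here are assumptions — the "every 10 turns" and "50% full" figures come from the table as examples, and a real implementation would call your Judge rather than return a boolean.

```python
JUDGE_EVERY_N_TURNS = 10     # example interval from the table above
CONTEXT_FILL_THRESHOLD = 0.5 # example threshold from the table above

def should_run_full_judge(turn_count: int, tokens_used: int,
                          context_limit: int, sensitive_action: bool) -> bool:
    """Decide when to evaluate the whole conversation, not just the latest turn."""
    if sensitive_action:
        return True                       # always review before sensitive actions
    if turn_count > 0 and turn_count % JUDGE_EVERY_N_TURNS == 0:
        return True                       # periodic full-conversation review
    if tokens_used / context_limit > CONTEXT_FILL_THRESHOLD:
        return True                       # window filling - check for poisoning
    return False
```

Calling this on every turn keeps the expensive full-conversation evaluation bounded while guaranteeing it runs before any sensitive action.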
5. Cross-Session Data Governance¶
| Requirement | Implementation |
|---|---|
| Data classification | Classify persistent memory content with the same scheme used for other data stores |
| Retention policies | Apply your existing data retention policies to conversation history and memory |
| Right to deletion | Implement memory deletion that actually deletes — not just soft-delete |
| Encryption | Encrypt persistent memory at rest and in transit — same controls as any data store |
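"Deletion that actually deletes" means removing the user's data from every store that holds it — primary memory, derived summaries, and any per-user cache entries — not setting a soft-delete flag. A minimal sketch, assuming in-memory dict stores and a cache keyed by a `"user:input"` convention (all names hypothetical):

```python
def delete_user_memory(memory_store: dict, summary_store: dict,
                       response_cache: dict, user_id: str) -> None:
    """Right-to-deletion: remove the user's data from the primary memory
    store AND every derived store (summaries, caches) - no soft-delete flag."""
    memory_store.pop(user_id, None)
    summary_store.pop(user_id, None)
    # Caches keyed by user + input: drop every entry belonging to this user
    stale = [k for k in response_cache if k.startswith(f"{user_id}:")]
    for k in stale:
        del response_cache[k]
```

In production the hard part is the inventory: deletion is only complete if you know every store that holds derived copies, which is why the data classification requirement above comes first.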
Architecture Patterns¶
Stateless (Recommended for Tier 1–2)¶
No persistent memory. Each conversation starts fresh. Context exists only within the session.
- Simplest to secure
- No cross-session risks
- Users may find it frustrating for repeated tasks
Stateful with Scoped Memory (Tier 2–3)¶
Persistent memory scoped to user, with explicit controls.
- Memory is a separate data store with its own access controls
- Memory content is filtered before storage and before retrieval
- Memory has TTLs and audit trails
Shared Knowledge Base (Tier 3 — requires careful design)¶
Shared embeddings or knowledge that multiple users access (e.g., company FAQ, product documentation).
- Shared content must be read-only for end users
- Ingestion pipeline is controlled (see RAG Security)
- User-specific context is never written to shared stores
Behavioural Learning and Preference Data¶
The controls above address what the model remembers — context windows, persistent memory, shared knowledge. But some systems are designed to learn from user behaviour: adapting communication style, building preference profiles, personalising recommendations based on interaction history.
This is a different threat surface. Memory controls govern storage and retrieval. Behavioural learning controls govern what you choose to extract, model, and act on.
The Problem¶
A system that learns customer preferences builds a behavioural profile. Over time, that profile becomes:
- Quasi-identifying — Writing style, reading patterns, product preferences, and interaction timing can re-identify a user even without explicit PII
- Inferential — The system can infer sensitive attributes (financial situation, health concerns, emotional state) from behavioural signals the user didn't intend to share
- Self-reinforcing — Recommendation engines create feedback loops: the system shows you what it thinks you want, you interact with it, and that interaction confirms its model — even if the model was wrong
- Poisonable — An adversary can inject false preference signals to manipulate future recommendations (showing a user competing products, biasing pricing, or shifting trust)
Decision Framework¶
Before building a preference-learning system, answer these questions:
| Question | If you can't answer clearly, stop |
|---|---|
| What are you learning, and why? | "User preferences" is too vague. Define exactly which signals you extract (product categories browsed, response length preference, time-of-day patterns) and the business purpose for each |
| Does the user know? | Transparency isn't optional. The user should understand what the system has learned about them, in language they can read — not a JSON dump |
| Can the user see, correct, and delete what you've learned? | If you store a preference model, users need the ability to inspect it, dispute incorrect inferences, and request deletion. This is a regulatory requirement in many jurisdictions and good practice in all of them |
| Is the learned data more sensitive than the raw data? | Individual page views are low-sensitivity. An inferred health concern derived from browsing patterns is high-sensitivity. Classify the output of your learning, not just the input |
| How do you detect preference poisoning? | If an attacker can shift your model of a user by injecting interactions, your recommendation engine becomes an attack surface. Define baselines and anomaly detection for profile changes |
| What's your feedback loop risk? | If the system recommends → user clicks → system reinforces, you can converge on a narrow model that doesn't reflect the user's actual preferences. Build in diversity or exploration mechanisms |
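Detecting preference poisoning starts with a baseline and a drift measure. The sketch below uses L1 distance between two preference distributions as a crude drift signal; the threshold is an example to tune against real profiles, and the function names are illustrative. Real detection would also look at rate of change and the provenance of the interactions driving the shift.

```python
def profile_shift(baseline: dict[str, float], current: dict[str, float]) -> float:
    """L1 distance between two preference distributions - a crude
    drift measure for spotting injected preference signals."""
    keys = set(baseline) | set(current)
    return sum(abs(baseline.get(k, 0.0) - current.get(k, 0.0)) for k in keys)

SHIFT_THRESHOLD = 0.6  # example threshold - tune against real profiles

def flag_poisoning(baseline: dict[str, float], current: dict[str, float]) -> bool:
    """Flag for review if a profile moved further in one window
    than organic behaviour change plausibly allows."""
    return profile_shift(baseline, current) > SHIFT_THRESHOLD
```

Organic drift tends to be gradual; an injected signal shows as a sharp, concentrated shift, which is exactly what a per-window distance check surfaces.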
What the Framework Covers¶
Your existing controls from this document and the broader framework apply to the infrastructure of behavioural learning:
| Framework control | How it applies to preference learning |
|---|---|
| Persistent memory controls (Section 3 above) | Preference data is persistent memory. Apply the same controls: TTLs, access scoping, content filtering, injection prevention, user memory controls |
| Accumulated PII (Threat Model above) | Behavioural profiles are the canonical example. Individual interactions are low-risk; the accumulated profile is high-risk |
| Cross-session data governance (Section 5 above) | Preference data flows across sessions. Apply the same isolation, classification, and access controls |
| Data retention (Data Retention Guidance) | Preference data needs retention limits. Define how long you keep learned preferences and how you purge them |
| Judge evaluation (Controls) | Your Judge can evaluate whether recommendations are appropriate, whether the system is over-personalising, and whether inferred preferences are plausible |
What the Framework Does Not Cover¶
The policy decisions — what to learn, when to ask consent, how to explain inferences — are domain-specific. The framework gives you the security and governance infrastructure. You need domain expertise and legal guidance for:
- Consent design — What does meaningful consent look like for behavioural learning? (Not "I agree to terms." Granular, revocable, specific.)
- Explainability — How do you present a learned preference model to a non-technical user in a way they can understand and act on?
- Differential privacy — How do you learn aggregate patterns without exposing individual behaviour? (Research-stage for most enterprises, but critical at scale.)
- Fairness and bias in recommendations — If your preference model correlates with protected characteristics, your recommendations may discriminate. This is a fairness problem, not just a security problem.
Offramps — Go Here Next¶
| Topic | Resource | Why |
|---|---|---|
| Profiling under GDPR | ICO Guidance on Profiling and Automated Decision-Making | Defines when behavioural profiling requires explicit consent, a right to object, and human review. Directly applicable if you serve UK/EU users |
| GDPR transparency requirements | GDPR Articles 13–14 (right to be informed), Article 15 (right of access), Article 22 (automated individual decision-making) | What you must disclose about automated profiling. Your legal team should map these to your preference learning system |
| CCPA right to know and delete | California Attorney General CCPA Resources | If you serve California residents, they have the right to know what data you've collected (including inferences) and to request deletion |
| NIST Privacy Framework | NIST Privacy Framework 1.0 | Maps privacy risk management to your existing NIST AI RMF alignment. The "Identify-P" and "Control-P" functions apply directly to preference data |
| Recommendation system fairness | Your AI/ML team's fairness evaluation tools (Fairlearn, AI Fairness 360, What-If Tool) | Test whether your preference model produces discriminatory recommendations. The framework's Judge can flag outliers, but fairness testing requires dedicated tooling |
| Consent management platforms | Your privacy/compliance team's consent management documentation (OneTrust, Cookiebot, TrustArc, or equivalent) | The mechanism for capturing, storing, and honouring user consent for behavioural learning. Don't build this from scratch |
The framework's role: Secure the storage, access, and lifecycle of preference data using existing memory and data protection controls. Detect anomalies in preference profiles. Evaluate recommendation quality through the Judge layer.
Your responsibility: Decide what to learn, get informed consent, provide transparency and user control, and ensure fairness. These are design and policy decisions, not security controls — but they determine whether your security controls are protecting the right things.
AI Runtime Behaviour Security, 2026 (Jonathan Gill).