Emerging AI Trends — Impact on AI Runtime Behaviour Security¶
This document assesses how emerging AI trends affect the reference architecture and identifies required adaptations.
Executive Summary¶
The core architecture principle — Guardrails prevent, Judge detects, Humans decide — remains valid across emerging trends. However, agentic AI fundamentally challenges the interaction-centric model and requires architectural extension.
| Architecture Component | Robustness | Action Required |
|---|---|---|
| Three-layer model | ✅ Robust | None |
| Risk-based tiering | ✅ Robust | Extend criteria for agentic |
| Guardrails | ⚠️ Stressed | Extend for multimodal, agentic |
| Judge | ⚠️ Stressed | Extend for trajectories |
| HITL | ⚠️ Stressed | Shift to checkpoints |
| Logging | ✅ Robust | Extend for traces |
Trend Analysis¶
1. Agentic AI¶
What it is: AI systems that take autonomous multi-step actions, use tools, interact with external systems, and operate with minimal human intervention.
Impact: HIGH — Requires architectural extension
| Current Model | Agentic Reality |
|---|---|
| Single request → response | Multi-step execution chains |
| Evaluate one interaction | Evaluate trajectory and cumulative effect |
| Human reviews before action | Actions happen autonomously |
| Clear boundaries | Tool use, API calls, real-world effects |
| One AI to govern | Orchestrator + multiple specialist agents |
What breaks:

- Interaction-centric Judge can't assess multi-step chains
- Sampling strategies assume independent interactions
- HITL can't review every step in real time
- Guardrails designed for text I/O, not tool calls
Required adaptations:
New control requirements for agentic AI:
| Control | Purpose |
|---|---|
| Plan approval | Review intended actions before execution |
| Action-level guardrails | Check each tool call / action |
| Circuit breakers | Hard limits on steps, cost, scope |
| Trajectory logging | Full trace of execution path |
| Trajectory evaluation | Judge assesses full chain |
| Deviation detection | Alert when execution diverges from plan |
| Rollback capability | Undo actions where possible |
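The circuit-breaker control above can be sketched in a few lines. This is a minimal illustration, not a production implementation; all names (`AgentCircuitBreaker`, `check`, the limit values) are hypothetical, and a real system would wrap the platform's tool-calling loop:

```python
# Minimal circuit-breaker sketch for an agentic loop (illustrative only).
# Hard limits trip execution outright rather than degrading gracefully.
from dataclasses import dataclass, field


class CircuitBreakerTripped(Exception):
    """Raised when a hard execution limit is exceeded."""


@dataclass
class AgentCircuitBreaker:
    max_steps: int = 20          # hard limit on actions per task
    max_cost_usd: float = 5.0    # hard limit on cumulative spend
    steps: int = 0
    cost_usd: float = 0.0
    trace: list = field(default_factory=list)  # doubles as a trajectory log

    def check(self, action: str, est_cost: float) -> None:
        """Call before every tool call / action."""
        self.steps += 1
        self.cost_usd += est_cost
        self.trace.append((self.steps, action, est_cost))
        if self.steps > self.max_steps:
            raise CircuitBreakerTripped(f"step limit {self.max_steps} exceeded")
        if self.cost_usd > self.max_cost_usd:
            raise CircuitBreakerTripped(f"cost limit ${self.max_cost_usd} exceeded")


breaker = AgentCircuitBreaker(max_steps=3, max_cost_usd=1.0)
try:
    for act in ["search", "fetch", "summarise", "email"]:
        breaker.check(act, est_cost=0.10)
except CircuitBreakerTripped as e:
    print(f"halted: {e}")  # the fourth action trips the step limit
```

Because `trace` accumulates every checked action, the same object also feeds trajectory logging and post-hoc Judge evaluation.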
2. Multimodal AI¶
What it is: AI that processes and generates images, audio, video, and combinations thereof.
Impact: MEDIUM — Extend existing controls
What works:

- Three-layer model applies (guardrails, Judge, HITL)
- Risk-based approach applies
- Logging and audit requirements apply
What needs extension:
| Modality | Guardrail Maturity | Judge Capability | Notes |
|---|---|---|---|
| Text | ✅ Mature | ✅ Strong | Current focus |
| Images | ⚠️ Emerging | ⚠️ Developing | NSFW, deepfakes, PII in images |
| Audio | ⚠️ Limited | ⚠️ Limited | Voice cloning, impersonation |
| Video | ❌ Immature | ❌ Limited | Computational cost high |
Required adaptations:
- Input guardrails — Extend to detect:
    - Harmful image content (NSFW, violence, CSAM)
    - Deepfakes and manipulated media
    - PII in images (faces, documents)
    - Audio impersonation attempts
- Output guardrails — Extend to filter:
    - Generated NSFW content
    - Generated deepfakes / impersonation
    - Copyright-infringing generations
    - Watermarking for AI-generated content
- Judge — Must evaluate:
    - Image/audio/video appropriateness
    - Cross-modal consistency (does the image match the text?)
    - Multimodal attack patterns
- Logging — Must capture:
    - Input media (or hashes/references)
    - Generated media
    - Sufficient detail for investigation
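Logging media "or hashes/references" can be as simple as recording a content digest in the audit record instead of the raw bytes. A minimal sketch, assuming a hypothetical record shape (`media_log_record` and its fields are illustrative, not any platform's API):

```python
# Sketch: log a content hash rather than raw media, so the audit trail can
# reference inputs without storing large (or sensitive) files inline.
import hashlib
import json


def media_log_record(media_bytes: bytes, modality: str, direction: str) -> dict:
    """Build an audit record that references media by SHA-256 digest."""
    return {
        "modality": modality,    # e.g. "image", "audio", "video"
        "direction": direction,  # "input" or "output"
        "sha256": hashlib.sha256(media_bytes).hexdigest(),
        "size_bytes": len(media_bytes),
    }


record = media_log_record(b"\x89PNG...fake image bytes", "image", "input")
print(json.dumps(record, indent=2))
```

The digest is sufficient to correlate a logged interaction with media retained elsewhere (e.g. in object storage) during an investigation.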
Platform support:
| Platform | Image Guardrails | Audio/Video |
|---|---|---|
| Bedrock Guardrails | ✅ Image filters | ❌ Limited |
| Azure Content Safety | ✅ Image analysis | ⚠️ Some audio |
| Google Cloud | ✅ Vision Safety | ⚠️ Some audio |
3. Reasoning Models¶
What it is: Models that "think" before responding (o1, o3, Claude extended thinking, DeepSeek R1).
Impact: LOW — Already addressed
The architecture already accommodates reasoning models:

- Judge Model Selection Guide covers tiered Judge with reasoning models
- Reasoning models are well-suited to the Judge role
- Extended thinking provides an audit trail
Minor considerations:

- Cost management (reasoning models are expensive)
- Latency for real-time applications
- Transparency of the reasoning chain
4. Longer Context Windows¶
What it is: Models that can process 100K, 200K, 1M+ tokens.
Impact: LOW — Operational adjustments
What works: Architecture unchanged.
Adjustments needed:
| Aspect | Impact |
|---|---|
| Logging cost | Higher storage requirements |
| Judge cost | Evaluating longer contexts costs more |
| Sampling | May need to adjust sampling rates |
| Attack surface | More room for injection in long contexts |
Guardrail considerations:

- Prompt injection can be buried deeper in long contexts
- May need segmented scanning
- Attention-based attacks exploit long contexts
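Segmented scanning can be sketched as splitting the context into overlapping windows and running the injection check on each, so an instruction buried mid-context is still seen in full by at least one scan. Everything here is illustrative; `looks_like_injection` stands in for a real guardrail call:

```python
# Sketch of segmented scanning for long contexts: overlapping windows
# ensure an injection straddling a boundary still appears whole in one scan.
def windows(text: str, size: int = 2000, overlap: int = 200):
    """Yield (start_offset, segment) pairs covering the whole text."""
    step = size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        yield start, text[start:start + size]


def looks_like_injection(segment: str) -> bool:
    # Placeholder heuristic; a real system would call a guardrail model.
    return "ignore previous instructions" in segment.lower()


def scan_long_context(text: str) -> list[int]:
    """Return start offsets of windows flagged for prompt injection."""
    return [start for start, seg in windows(text) if looks_like_injection(seg)]


doc = ("benign filler. " * 500) + "IGNORE PREVIOUS INSTRUCTIONS" + (" more filler." * 500)
print(scan_long_context(doc))  # non-empty: the buried injection is flagged
```

The overlap size should exceed the longest pattern the guardrail needs to see intact; window size trades scan cost against the number of guardrail calls.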
5. Real-Time / Streaming AI¶
What it is: AI that processes and generates content in real-time streams (live conversation, video analysis).
Impact: MEDIUM — Latency trade-offs
Challenges:

- Guardrails must be fast enough for real-time use
- Can't wait for the full response before evaluating
- Judge must work on streaming data
Adaptations:
| Component | Streaming Adaptation |
|---|---|
| Guardrails | Incremental checking, token-level filtering |
| Logging | Stream capture, chunked storage |
| Judge | Evaluate chunks or sessions, not single interactions |
| HITL | Post-session review, real-time alerts for critical issues |
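The incremental-checking adaptation can be sketched as a small hold-back buffer: tokens are released only after they have survived a rolling check, and the stream is cut the moment a violation appears. This is a sketch under stated assumptions; `blocked` stands in for a real guardrail call, and the names are illustrative:

```python
# Sketch of incremental output checking for streaming responses.
# Tokens are held in a small buffer; each new token triggers a check on
# the rolling text, and the stream halts on the first violation.
from typing import Iterable, Iterator


def blocked(text: str) -> bool:
    return "SECRET" in text  # placeholder policy, not a real guardrail


def guarded_stream(tokens: Iterable[str], window: int = 5) -> Iterator[str]:
    buffer: list[str] = []
    emitted = ""
    for tok in tokens:
        buffer.append(tok)
        if blocked(emitted + "".join(buffer)):
            yield "[stream halted by guardrail]"
            return
        if len(buffer) > window:       # release the oldest token only after
            released = buffer.pop(0)   # it has survived `window` checks
            emitted += released
            yield released
    for tok in buffer:                 # flush the tail on clean completion
        yield tok


print("".join(guarded_stream(["Hello ", "world ", "SECRET ", "data"])))
```

The `window` parameter is the latency trade-off made explicit: a larger hold-back gives the guardrail more context before anything reaches the user, at the cost of a longer perceived delay.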
6. Fine-Tuned / Custom Models¶
What it is: Organisation-specific models trained or fine-tuned on proprietary data.
Impact: LOW — Validation requirements
Architecture unchanged, but adds requirements:
| Requirement | Purpose |
|---|---|
| Model validation | Ensure fine-tuning hasn't introduced issues |
| Bias testing | Fine-tuning can introduce or amplify bias |
| Capability assessment | Understand what the model can/can't do |
| Version control | Track model versions and changes |
These align with existing Model Risk Management (SR 11-7) requirements.
7. Local / Edge AI¶
What it is: AI running on-device (phones, laptops, IoT) rather than cloud.
Impact: MEDIUM — Different deployment model
Challenges:

- Can't insert guardrails between user and model
- Logging may be limited or delayed
- Judge can't evaluate in real time
- Less control over model behaviour
Adaptations:
| Component | Edge Adaptation |
|---|---|
| Guardrails | Embedded in application, device-side |
| Logging | Local buffer, sync when connected |
| Judge | Server-side evaluation of synced logs |
| HITL | Async review of aggregated data |
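The "local buffer, sync when connected" pattern in the table can be sketched as a bounded on-device queue that is drained to the server-side log when connectivity returns. All names here (`EdgeLogBuffer`, `sync`) are hypothetical:

```python
# Sketch of delayed-assurance logging for edge AI: interactions are queued
# on-device and shipped to the server-side log (for Judge review) on sync.
import json
from collections import deque


class EdgeLogBuffer:
    def __init__(self, capacity: int = 1000):
        # Bounded: oldest entries are dropped if the device stays offline
        # too long, which is itself a risk to surface in the control design.
        self.queue: deque = deque(maxlen=capacity)

    def record(self, prompt: str, response: str) -> None:
        self.queue.append({"prompt": prompt, "response": response})

    def sync(self, upload) -> int:
        """Drain the buffer through `upload`; returns the number shipped."""
        shipped = 0
        while self.queue:
            upload(json.dumps(self.queue.popleft()))
            shipped += 1
        return shipped


buf = EdgeLogBuffer()
buf.record("q1", "a1")
buf.record("q2", "a2")
server_log: list[str] = []
print(buf.sync(server_log.append))  # 2 entries shipped; buffer now empty
```

The bounded capacity makes the delayed-assurance trade-off concrete: what the device cannot hold, the Judge never sees, which is one reason edge deployments may warrant a higher risk tier.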
Risk implications:

- Less real-time control means a higher risk tier
- May need to limit edge AI to lower-risk use cases
- Or accept a different control model with delayed assurance
8. AI-to-AI Interactions¶
What it is: AI systems that communicate with each other, including multi-agent systems and AI pipelines.
Impact: HIGH — Attribution challenges
Challenges:

- Which AI is "responsible" for an outcome?
- How to evaluate a chain of AI interactions?
- Emergent behaviour from AI combinations
- Attack propagation across AI systems
Required adaptations:
For AI-to-AI interactions, implement unified trace logging that captures the full chain:

- Trace ID that follows the request across all AI systems
- Per-AI inputs and outputs logged
- Final outcome attribution
- Accountability mapping
Control requirements:

- Trace IDs across AI interactions
- Per-AI guardrails still apply
- Judge evaluates the full trace
- Attribution model for accountability
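A trace ID that follows a request across a chain of AI systems can be sketched as below. The agent functions are placeholders for real AI calls, and the names (`run_chain`, `log_step`) are illustrative:

```python
# Sketch of unified trace logging: one trace ID is minted per request and
# attached to every per-AI log entry, so the full chain can be reconstructed
# and each step attributed during investigation.
import uuid

TRACE_LOG: list[dict] = []


def log_step(trace_id: str, system: str, inp: str, out: str) -> None:
    TRACE_LOG.append({"trace_id": trace_id, "system": system,
                      "input": inp, "output": out})


def run_chain(request: str) -> str:
    trace_id = str(uuid.uuid4())          # minted once, reused everywhere
    plan = f"plan({request})"             # stand-in for a planner agent
    log_step(trace_id, "planner", request, plan)
    answer = f"execute({plan})"           # stand-in for an executor agent
    log_step(trace_id, "executor", plan, answer)
    return answer


run_chain("summarise report")
ids = {entry["trace_id"] for entry in TRACE_LOG}
print(len(TRACE_LOG), len(ids))  # two logged steps share one trace ID
```

In a distributed deployment the same idea is usually carried in a request header (as in conventional distributed tracing) rather than a shared in-process log.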
Summary: Architecture Durability¶
What Remains Stable¶
| Principle | Why It Survives |
|---|---|
| Layered defence | Universal security principle |
| Risk-based controls | Regulatory and practical necessity |
| Human accountability | Regulatory requirement, ethical imperative |
| Logging and audit | Foundation for all assurance |
| Guardrails → Judge → HITL | Functional abstraction, not implementation |
What Requires Extension¶
| Trend | Extension Needed |
|---|---|
| Agentic AI | Plan approval, trajectory evaluation, circuit breakers |
| Multimodal | Extend guardrails and Judge to non-text modalities |
| AI-to-AI | Trace logging, attribution model |
| Edge AI | Delayed assurance model |
| Streaming | Incremental evaluation |
What Might Break¶
Only agentic AI fundamentally challenges the model:
The current architecture assumes discrete interactions that can be evaluated independently. Agentic AI breaks this by:

1. Creating multi-step chains where context matters
2. Taking real-world actions that can't be "undone"
3. Operating faster than humans can review
The fix is not to abandon the architecture but to extend it:

- Add plan-level review (before execution)
- Add trajectory-level evaluation (in addition to interaction-level)
- Add circuit breakers (hard limits during execution)
- Shift HITL from "review everything" to "review decisions and exceptions"
See Agentic Controls for the complete control set.
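One piece of plan-level review, detecting deviation between the approved plan and the executed trajectory, can be sketched in a few lines. The function name and data shapes are illustrative only:

```python
# Sketch of deviation detection: compare the executed trajectory against
# the approved plan and flag any action that was never approved.
def deviations(approved_plan: list[str], executed: list[str]) -> list[str]:
    """Return executed actions that do not appear in the approved plan."""
    allowed = set(approved_plan)
    return [step for step in executed if step not in allowed]


plan = ["search", "summarise", "draft_email"]
trajectory = ["search", "summarise", "delete_files", "draft_email"]
print(deviations(plan, trajectory))  # ['delete_files'] would raise an alert
```

A real implementation would compare structured tool calls (name plus arguments) rather than strings, and route flagged deviations to the HITL exception queue.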
Recommendations¶
Near-Term (Now)¶
- Implement agentic controls — See Agentic Controls for plan approval, circuit breakers, trajectory evaluation
- Monitor multimodal guardrail maturity — Platform capabilities are evolving rapidly
- Implement trace logging — Even for non-agentic systems, correlation IDs enable future capabilities
Medium-Term (6-12 months)¶
- Develop trajectory Judge — Extend Judge to evaluate multi-step chains
- Build circuit breaker patterns — Reusable components for agentic systems
- Extend guardrails for multimodal — As platform support matures
Long-Term (12+ months)¶
- AI-to-AI governance model — Attribution, accountability across AI chains
- Autonomous AI oversight — When AI operates without human review
- Regulatory alignment — EU AI Act and others will evolve; track and adapt
Conclusion¶
The reference architecture is durable but not static.
The core principle — Guardrails prevent, Judge detects, Humans decide — survives because it describes functions, not implementations. As AI capabilities evolve, the implementations change but the functions remain.
Agentic AI is the critical trend to watch. It challenges the interaction-centric model and requires genuine architectural extension. Other trends (multimodal, reasoning models, longer contexts) are accommodated with relatively minor adjustments.
The framework should be treated as a living document that evolves with the technology. This is not a weakness — it's a design principle.
AI Runtime Behaviour Security, 2026 (Jonathan Gill).