Control Layer Resilience: Internal PACE¶

This section defines what happens when each control layer degrades. Every control has its own PACE plan (vertical axis) in addition to the architecture-level PACE across layers (horizontal axis). See the PACE Resilience Methodology for the full model.

This document uses the simplified three-tier system (Tier 1/2/3). See Risk Tiers — Simplified Tier Mapping for the mapping to LOW/MEDIUM/HIGH/CRITICAL.

Design Principle¶

Every control layer will eventually fail. The question is not whether but how. Before deploying any AI system, the architect must define the fail posture for each control at the assigned risk tier. This is a mandatory design input, not an operational afterthought.

Fail Posture Decision Tree

Guardrails — Internal PACE¶

Guardrails are the Primary layer: deterministic, fast, always-on. When they degrade, the system must decide instantly whether to continue serving traffic (fail-open) or stop (fail-closed). The tier determines the answer.

PACE	Condition	Tier 1	Tier 2	Tier 3
P — Normal	Engine running, all filters active	Standard filters: topic, length, PII	Full suite: content filters, schema validation, injection detection, hallucination indicators	Hardened multi-layer: regulatory constraint enforcement, action-level permissions, schema-strict validation. No bypass path.
A — Degraded	Engine slow, partial filter failure, rule update pending	Pass traffic. Log which filters are degraded. Flag for next business day review.	Fall back to stricter, simpler rule set. Over-block rather than under-block. Alert on-call.	Fall back to stricter rule set. Alert on-call immediately. Increase Judge evaluation to 100% if not already.
C — Down	Engine unresponsive or returning errors	Fail-open. Pass traffic. Log all requests. Rely on Judge and/or human review.	Fail-closed. Hold all AI outputs for Judge evaluation + human review. If Judge also degraded, route to non-AI fallback.	Fail-closed. Block all AI traffic immediately. Route to non-AI fallback path. Incident response initiated.
E — Compromised	Evidence of guardrail tampering or adversarial bypass at infrastructure level	Disable AI feature. Alert team.	Route to non-AI fallback. Incident response. Preserve guardrail configuration for forensic analysis.	Full stop on AI operations. Non-AI fallback path active. Forensic evidence preserved. Stakeholders and regulators notified.

Guardrail Transition Triggers¶

Transition	Recommended Triggers
P → A	Guardrail response latency >2x baseline for 5 min; any filter returning default/error responses; rule update deployed but not validated
A → C	Guardrail engine health check failing; >50% of filter categories non-functional; engine unresponsive for >30s
C → E	Evidence of configuration tampering; known vulnerability actively exploited; guardrail logs show signs of adversarial manipulation
Recovery: C/E → P	Engine restored, all filters validated with test suite, monitoring confirms normal operation for >15 min (Tier 1) / >1 hour (Tier 2/3)

LLM-as-Judge — Internal PACE¶

The Judge is the Alternate layer: probabilistic, asynchronous, catches what guardrails miss. Its failure modes are different from guardrails — it can be slow, wrong, or manipulated — and the fallback response differs by tier.

PACE	Condition	Tier 1	Tier 2	Tier 3
P — Normal	Judge evaluating outputs, scores within calibration range	Sample 10–20% of outputs. Log scores.	Evaluate 100% of outputs async. Auto-hold outputs below confidence threshold.	Evaluate 100% of outputs AND actions. Dual-model evaluation. Pre- and post-action verification for agents.
A — Degraded	Judge latency >3x baseline, or partial model errors, or score distribution drifting	Reduce sampling rate. Log coverage gap. Review gap at next business day.	Switch to priority-only evaluation: flag high-risk outputs first (customer-facing, financial, PII-containing). Accept evaluation gap on low-risk outputs. Alert on-call.	Queue all outputs. Accept increased latency rather than skipping evaluation. Alert on-call. If agent, constrain to read-only until Judge performance restored.
C — Down	Judge returning errors, nonsensical scores, or completely unresponsive	Suspend Judge. Guardrails-only operation. Increase human sampling of outputs if practical.	All outputs held for human review until Judge restored. If human review queue exceeds capacity, throttle AI throughput to match.	All AI traffic paused. Human-only operation for any in-flight work. No new AI requests accepted.
E — Compromised	Evidence of Judge model poisoning, adversarial manipulation of evaluation criteria, or score manipulation	Disable Judge. Guardrails only. Alert team. Investigate before restoring.	Activate circuit breaker. Route to non-AI fallback. Incident response. Do not trust any recent Judge scores — review outputs that passed since potential compromise.	Full stop. Forensic analysis of Judge model, evaluation prompts, and score history. All outputs evaluated by compromised Judge must be re-reviewed by humans. Regulators notified.

Judge Transition Triggers¶

Transition	Recommended Triggers
P → A	Judge latency >3x baseline for 5 min; error rate >5% of evaluations; score distribution shifts >2 standard deviations from calibration baseline
A → C	Judge health check failing; >50% of evaluation requests returning errors; complete unresponsiveness for >60s
C → E	Anomalous score patterns (e.g., all outputs scoring identically); evidence of prompt injection in evaluation chain; Judge model integrity check fails
Recovery: C/E → P	Judge model reloaded from known-good checkpoint, validated against test suite, calibration confirmed against baseline, monitoring confirms normal operation for >1 hour

Human Oversight — Internal PACE¶

Human Oversight is the Contingency layer: slow, expensive, but brings judgment that automated systems lack. Its failure modes are organisational — staffing gaps, queue overflow, fatigue — not technical.

PACE	Condition	Tier 1	Tier 2	Tier 3
P — Normal	Review queue staffed, items processed within SLA	Async review of flagged items. Next business day acceptable.	Dedicated reviewers with domain knowledge. SLA-bound response times.	Domain experts with regulatory knowledge. Dual approval for irreversible actions. SLA measured in minutes, not hours.
A — Degraded	Primary reviewer unavailable (leave, illness, turnover)	Flagged items queue until reviewer available. Acceptable backlog: 1–2 business days.	Escalate to secondary reviewer pool. Extend SLA by defined buffer (e.g., 2x) but don't skip review.	On-call escalation to alternate domain expert. No action proceeds without approval. If no alternate available within SLA, escalate to C.
C — Overloaded	Review queue exceeds capacity — flood of flags from Judge or guardrail changes	Throttle AI throughput to match review capacity. Extend queue SLA.	Tighten Judge thresholds to reduce flag volume (fewer borderline cases flagged). Activate additional reviewers from trained pool. If still overloaded, throttle AI throughput.	Constrain agent scope to reduce action volume. Activate crisis staffing from pre-identified pool. If still overloaded, move to Supervised phase (agent proposes, human approves every action).
E — Unavailable	No reviewers available (incident, holiday gap, mass resignation)	Disable AI features requiring review. Guardrails and Judge operate autonomously for remaining features.	Switch to automated-only: Judge must auto-approve/reject with conservative thresholds. Accept higher false-positive rate (more blocks) in exchange for no human review. Flag for urgent staffing resolution.	Suspend all AI operations requiring human approval. No exceptions. In-flight agent actions completed or rolled back per transaction resolution matrix. Full stop until oversight restored.

Human Oversight Transition Triggers¶

Transition	Recommended Triggers
P → A	Primary reviewer unavailable; queue wait time >1.5x SLA
A → C	Queue size >3x normal; wait time >2x SLA; multiple reviewers unavailable simultaneously
C → E	Zero qualified reviewers available; queue wait time >5x SLA with no resolution timeline
Recovery: E → P	Reviewers available and confirmed; queue backlog cleared or triaged; SLA performance confirmed for >4 hours

Cross-Layer PACE: Architecture-Level Resilience¶

When individual layer PACE has been exhausted — the layer is at its Emergency state — the architecture-level PACE activates:

Condition	Architecture Response
Guardrails at E (compromised)	Judge becomes sole automated defence. Human oversight scope expanded. Evaluate whether to activate circuit breaker based on Judge coverage and confidence.
Judge at E (compromised)	Guardrails continue blocking known-bad. Human oversight absorbs all quality assurance. Reduce AI throughput to match human capacity.
Human Oversight at E (unavailable)	Guardrails and Judge operate without human backstop. At Tier 2+, tighten all automated thresholds. At Tier 3, activate circuit breaker — automated-only operation is not acceptable for regulated decisions.
Two or more layers at E simultaneously	Circuit breaker activates immediately, regardless of tier. Route to non-AI fallback. This is a systemic failure requiring incident response.
Circuit breaker activated	Non-AI fallback path serves traffic. All AI components isolated. Incident response team assembled. Recovery requires layer-by-layer restoration with validation at each step.

The Non-AI Fallback Path¶

Every system at Tier 2 or above must have a documented, tested, and maintained non-AI fallback path. This is the last line of defence and must not share dependencies with the AI system.

Aspect	Tier 1	Tier 2	Tier 3
What it is	The manual process the AI replaced	Rule-based system or templated responses	Staffed parallel process with trained operators
Who maintains it	Same team that runs the AI	Designated owner with quarterly review	Dedicated operational resilience function
Dependencies	Must not depend on the AI model or guardrail engine	Must not depend on any AI infrastructure component	Must not share any infrastructure with the AI system
Testing	Annually: confirm it still works	Quarterly: run production-equivalent traffic through it	Monthly: operate in parallel for a defined period
Activation	Manual (feature flag, deployment rollback)	Automated (circuit breaker with health checks)	Automated (circuit breaker) with manual confirmation within defined window
Capacity	Best effort	Must handle 100% of AI traffic at degraded quality	Must handle critical subset at production quality

AI Runtime Behaviour Security, 2026 (Jonathan Gill).