Quantitative Risk Assessment for AI Controls¶
How the three-layer pattern reduces residual risk — with worked examples across all four tiers.
Part of the AI Runtime Behaviour Security framework · Version 1.0 · February 2026 · Jonathan Gill
NIST AI RMF Alignment¶
This document implements activities from all four functions of the NIST AI Risk Management Framework (AI RMF 1.0). If your organisation already uses NIST AI RMF, this risk assessment plugs directly into your existing process.
| NIST AI RMF Function | Subcategories | What This Document Provides |
|---|---|---|
| GOVERN | 1.3, 1.4, 1.5 | Risk management process structure; risk tolerance expressed as quantitative residual risk thresholds; recalibration schedule |
| MAP | 2.1, 3.1, 3.2 | AI system risk categorisation (four tiers); threat identification per scenario; lifecycle risk across transaction volumes |
| MEASURE | 1.1, 1.2, 2.1, 2.2, 2.3, 2.6 | Quantitative risk metrics; per-layer effectiveness measurement; residual risk calculation; recalibration methodology |
| MANAGE | 1.1, 1.3, 2.1, 2.2, 2.4 | Control selection proportionate to risk tier; compensating controls; PACE fail postures per tier; incident-driven recalculation |
The methodology below follows the NIST RMF lifecycle: identify threats (MAP), measure control effectiveness (MEASURE), calculate residual risk for governance decisions (GOVERN), and define response actions when risk exceeds appetite (MANAGE).
For detailed infrastructure control mappings to all 51 NIST AI RMF subcategories, see NIST AI RMF Mapping.
Why Quantify Control Effectiveness¶
Most AI security guidance says "add guardrails" or "implement human oversight" without answering the question that risk committees actually ask: how much does each layer reduce the probability of harm, and what residual risk remains?
This document provides a quantitative model for answering that question. It uses illustrative effectiveness rates — not empirically validated benchmarks — to demonstrate the methodology. Your actual rates will depend on your implementation quality, threat landscape, and operational maturity. The point is the approach, not the specific numbers.
Important: The effectiveness percentages in this document are illustrative. They exist to demonstrate how layered controls compound to reduce residual risk. Your organisation should measure actual effectiveness through red teaming, Judge accuracy calibration (see Judge Assurance), and incident data. Replace the illustrative rates with your measured rates as they become available.
The Layered Control Model¶
Each layer in the three-layer pattern operates independently. When one layer misses a threat, the next layer has an independent opportunity to catch it. This is the same principle behind defence in depth in traditional security — except here we can model the compounding effect mathematically.
Illustrative Effectiveness Rates¶
| Layer | Effectiveness | What This Means | Scope |
|---|---|---|---|
| Guardrails | ~90% | Catches 90% of issues that reach it — known patterns, policy violations, format errors | Every transaction, real-time |
| LLM-as-Judge | ~95% | Catches 95% of issues in the transactions it evaluates — semantic violations, subtle policy breaches, quality failures | Sampled or full coverage depending on tier |
| Human Oversight | ~98% | Catches 98% of issues surfaced to reviewers — edge cases, nuanced judgement calls, novel threats | Flagged transactions + sampling |
How Layers Compound¶
When all three layers are active and operating independently on a transaction:
P(issue reaches customer) = P(miss guardrail) × P(miss judge) × P(miss human)
= (1 - 0.90) × (1 - 0.95) × (1 - 0.98)
= 0.10 × 0.05 × 0.02
= 0.0001
= 0.01%
One in ten thousand. Compare this to any single layer alone:
| Configuration | Residual Risk | Improvement Factor |
|---|---|---|
| No controls | 100% | — |
| Guardrails only | 10% | 10× |
| Guardrails + Judge | 0.5% | 200× |
| Guardrails + Judge + Human | 0.01% | 10,000× |
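As a sketch, the arithmetic behind this table can be wrapped in a small helper. The 0.90/0.95/0.98 rates are this document's illustrative values, not measured benchmarks:

```python
def residual_risk(*layer_effectiveness: float) -> float:
    """P(issue slips past every layer), assuming layers fail independently."""
    p_miss = 1.0
    for eff in layer_effectiveness:
        p_miss *= (1.0 - eff)
    return p_miss

# Illustrative rates from the table above
GUARDRAILS, JUDGE, HUMAN = 0.90, 0.95, 0.98

print(f"{residual_risk(GUARDRAILS):.4%}")                # 10.0000%
print(f"{residual_risk(GUARDRAILS, JUDGE):.4%}")         # 0.5000%
print(f"{residual_risk(GUARDRAILS, JUDGE, HUMAN):.4%}")  # 0.0100%
```

Swapping in your measured rates is a one-line change, which is the point of keeping the model this simple.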
This is why the framework insists on layered controls. Each layer alone is insufficient. Together, they achieve orders-of-magnitude risk reduction.
Critical Caveat: Independence Assumption¶
This model assumes layers fail independently — a threat that bypasses guardrails is not inherently more likely to bypass the Judge. This holds when:
- The Judge uses a different model than the task agent
- Human reviewers have domain expertise beyond what the AI provides
- Each layer uses different detection methods (pattern matching vs. semantic evaluation vs. human judgement)
If your Judge uses the same model as your task agent, or your human reviewers rubber-stamp AI outputs, the independence assumption breaks and your actual residual risk is higher than this model predicts.
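One way to gauge sensitivity to the independence assumption is a hypothetical correlation parameter (an illustration, not part of the framework) that interpolates between fully independent layers and a fully correlated pair, where a guardrail bypass always bypasses the Judge too:

```python
def residual_with_correlation(g_eff: float, j_eff: float, rho: float) -> float:
    """Residual miss probability for guardrails + Judge when their failures
    are correlated. rho=0 reproduces the independent model; rho=1 means
    the weaker layer's miss rate dominates entirely."""
    independent = (1 - g_eff) * (1 - j_eff)
    fully_correlated = min(1 - g_eff, 1 - j_eff)
    return independent + rho * (fully_correlated - independent)

print(f"{residual_with_correlation(0.90, 0.95, 0.0):.4f}")  # 0.0050 (independent)
print(f"{residual_with_correlation(0.90, 0.95, 0.5):.4f}")  # 0.0275 (5.5x higher)
```

Even moderate correlation multiplies residual risk several-fold, which is why different models and different detection methods per layer matter.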
Worked Example: Customer Product Chatbot (HIGH Tier)¶
System Description¶
A customer-facing AI chatbot that helps customers browse products, compare options, and complete purchases. The system has:
- Product catalog access (read) — prices, specifications, availability
- Shopping cart management (write) — add/remove items, apply promotions
- Payment processing (write via API) — charge customer payment methods
- Order management (write) — create and confirm orders
- Customer account access (read) — order history, saved addresses, payment methods
Risk Classification¶
| Dimension | Assessment | Rationale |
|---|---|---|
| Decision authority | High | Takes actions (processes payments, creates orders) |
| Reversibility | Medium | Payments can be refunded but create operational cost |
| Data sensitivity | High | PII, payment data, purchase history |
| Audience | Critical | External customers |
| Scale | High | Thousands of transactions per day |
| Regulatory | Medium | Consumer protection, PCI-DSS adjacency |
Assigned tier: HIGH (payment processing pushes toward CRITICAL, but payment gateway provides independent validation — see compensating controls)
Threat Scenarios¶
Five failure modes that matter for this system, with per-layer analysis across 1,000 transactions.
Scenario 1: Prompt Injection — Price Manipulation¶
Threat: Attacker crafts input that causes the chatbot to apply unauthorized discounts, override pricing, or bypass payment validation.
Inherent likelihood: ~20 attempts per 1,000 transactions (2%). Most e-commerce chatbots see regular probing from both malicious actors and curious users.
| Layer | Detection Method | Effectiveness | Catches | Misses |
|---|---|---|---|---|
| Guardrails | Pattern matching for injection signatures, encoding detection, input normalisation | 90% | 18.0 | 2.0 |
| Judge | Semantic evaluation — "did the chatbot apply pricing outside catalog parameters?" | 95% | 1.9 | 0.1 |
| Human | Review flagged transactions, price anomaly alerts | 98% | 0.098 | 0.002 |
Residual: 0.002 successful price manipulations per 1,000 transactions = 1 in 500,000 transactions
| Metric | Value |
|---|---|
| Inherent risk (no controls) | 20 per 1,000 |
| Residual risk (all layers) | 0.002 per 1,000 |
| Risk reduction factor | 10,000× |
| Annualised (1M transactions/year) | ~2 incidents |
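The catches/misses columns in the scenario tables all follow the same cascade rule: each layer catches its effectiveness share of whatever reached it. A minimal sketch reproducing Scenario 1's figures:

```python
def cascade(inherent_per_1k: float, layers: dict[str, float]):
    """Run issues through each layer in order; returns per-layer
    (name, caught, remaining) rows and the final residual."""
    remaining = inherent_per_1k
    rows = []
    for name, eff in layers.items():
        caught = remaining * eff
        remaining -= caught
        rows.append((name, caught, remaining))
    return rows, remaining

rows, residual = cascade(20, {"guardrails": 0.90, "judge": 0.95, "human": 0.98})
for name, caught, left in rows:
    print(f"{name:10s} catches {caught:6.3f}, misses {left:.3f}")
print(f"residual per 1,000: {residual:.3f}")  # 0.002 -> 1 in 500,000
```

The same function, fed each scenario's inherent rate, reproduces Scenarios 2 through 5.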
Scenario 2: Hallucinated Product Information¶
Threat: Chatbot fabricates product specifications, availability, pricing, or warranty terms that don't match the catalog. Customer makes a purchase decision based on wrong information.
Inherent likelihood: ~50 per 1,000 transactions (5%). Hallucination rates vary by model and RAG implementation quality, but product attribute errors are common when the model generates rather than retrieves.
| Layer | Detection Method | Effectiveness | Catches | Misses |
|---|---|---|---|---|
| Guardrails | Output validation against product catalog API — price match, spec match, availability check | 90% | 45.0 | 5.0 |
| Judge | Semantic grounding evaluation — "are all stated product attributes supported by the catalog source?" | 95% | 4.75 | 0.25 |
| Human | Review flagged hallucination alerts, spot-check random transactions | 98% | 0.245 | 0.005 |
Residual: 0.005 undetected hallucinations per 1,000 transactions = 1 in 200,000 transactions
| Metric | Value |
|---|---|
| Inherent risk (no controls) | 50 per 1,000 |
| Residual risk (all layers) | 0.005 per 1,000 |
| Risk reduction factor | 10,000× |
| Annualised (1M transactions/year) | ~5 incidents |
Scenario 3: PII Leakage¶
Threat: Chatbot includes another customer's personal data, order history, or payment details in a response. Could occur through context window contamination, shared session state, or RAG retrieval pulling the wrong customer's data.
Inherent likelihood: ~10 per 1,000 transactions (1%). Lower base rate than hallucination, but higher impact per incident.
| Layer | Detection Method | Effectiveness | Catches | Misses |
|---|---|---|---|---|
| Guardrails | Output PII scanner — regex + ML for names, addresses, card numbers, account IDs not belonging to the current session | 90% | 9.0 | 1.0 |
| Judge | Cross-reference evaluation — "does the response contain any data elements not attributable to the requesting customer?" | 95% | 0.95 | 0.05 |
| Human | Review PII alerts, periodic audit of cross-customer data access patterns | 98% | 0.049 | 0.001 |
Residual: 0.001 PII leakage incidents per 1,000 transactions = 1 in 1,000,000 transactions
| Metric | Value |
|---|---|
| Inherent risk (no controls) | 10 per 1,000 |
| Residual risk (all layers) | 0.001 per 1,000 |
| Risk reduction factor | 10,000× |
| Annualised (1M transactions/year) | ~1 incident |
Scenario 4: Unauthorized or Incorrect Payment¶
Threat: Chatbot processes a payment for the wrong amount, charges the wrong payment method, or initiates a transaction the customer didn't authorize.
Inherent likelihood: ~5 per 1,000 transactions (0.5%). Lower base rate because payment flows are typically more structured, but highest financial impact per incident.
| Layer | Detection Method | Effectiveness | Catches | Misses |
|---|---|---|---|---|
| Guardrails | Amount validation against cart total, payment method confirmation check, duplicate transaction detection | 90% | 4.5 | 0.5 |
| Judge | Transaction integrity evaluation — "does the payment amount, method, and authorization match the conversation flow?" | 95% | 0.475 | 0.025 |
| Human | Payment anomaly queue, high-value transaction review | 98% | 0.0245 | 0.0005 |
Residual: 0.0005 unauthorized payments per 1,000 transactions = 1 in 2,000,000 transactions
| Metric | Value |
|---|---|
| Inherent risk (no controls) | 5 per 1,000 |
| Residual risk (all layers) | 0.0005 per 1,000 |
| Risk reduction factor | 10,000× |
| Annualised (1M transactions/year) | ~0.5 incidents |
Scenario 5: Inappropriate or Harmful Response¶
Threat: Chatbot generates offensive content, makes inappropriate recommendations, provides dangerous advice (e.g., regarding product use), or behaves in a way that damages brand reputation.
Inherent likelihood: ~15 per 1,000 transactions (1.5%). Includes both adversarial prompting and edge cases where the model's training produces unexpected outputs in a commercial context.
| Layer | Detection Method | Effectiveness | Catches | Misses |
|---|---|---|---|---|
| Guardrails | Content policy filter, toxicity classifier, brand guideline checker | 90% | 13.5 | 1.5 |
| Judge | Tone and policy evaluation — "is this response appropriate for a customer-facing commercial interaction?" | 95% | 1.425 | 0.075 |
| Human | Escalation queue review, customer complaint correlation | 98% | 0.0735 | 0.0015 |
Residual: 0.0015 inappropriate responses per 1,000 transactions = 1 in ~667,000 transactions
| Metric | Value |
|---|---|
| Inherent risk (no controls) | 15 per 1,000 |
| Residual risk (all layers) | 0.0015 per 1,000 |
| Risk reduction factor | 10,000× |
| Annualised (1M transactions/year) | ~1.5 incidents |
Combined Risk Summary — Product Chatbot¶
| Threat | Inherent (per 1K) | Residual (per 1K) | Annualised (1M txn) | Severity |
|---|---|---|---|---|
| Prompt injection — price manipulation | 20 | 0.002 | ~2 | High |
| Hallucinated product info | 50 | 0.005 | ~5 | Medium |
| PII leakage | 10 | 0.001 | ~1 | Critical |
| Unauthorized payment | 5 | 0.0005 | ~0.5 | Critical |
| Inappropriate response | 15 | 0.0015 | ~1.5 | Medium |
| Total | 100 | 0.0100 | ~10 | — |
Interpretation: Without controls, roughly 10% of transactions would have some form of issue. With all three layers active, the residual rate drops to approximately 0.001% — about 10 incidents per year at 1M transactions. Of those, the critical-severity incidents (PII, unauthorized payment) are expected at fewer than 2 per year.
Compensating Controls¶
The three-layer AI pattern does not operate in isolation. Existing infrastructure provides independent controls that further reduce residual risk. These are not substitutes for AI-specific controls — they are additional layers in the overall defence.
For the Product Chatbot¶
| Compensating Control | What It Catches | Independence From AI Layers |
|---|---|---|
| Payment gateway validation | Amount limits, card verification, 3D Secure, duplicate detection | Operates at payment infrastructure level — catches errors regardless of what the chatbot sends |
| API input validation | Malformed requests, out-of-range values, schema violations | Application layer — rejects structurally invalid API calls before they reach backend systems |
| Fraud detection system | Anomalous transaction patterns, velocity checks, device fingerprinting | Operates on transaction data, not chatbot outputs — independent signal |
| Rate limiting (API gateway) | Bulk exploitation, automated attacks, enumeration | Network/infrastructure level — limits blast radius regardless of individual transaction success |
| Order confirmation workflow | Customer verifies order details before final payment | Human-in-the-loop at the customer level — the customer themselves is a control |
| Inventory management system | Prevents fulfillment of out-of-stock items, catches quantity errors | Backend system of record — chatbot hallucination about availability is caught at fulfillment |
| Refund/chargeback process | Enables recovery from payment errors | Not a preventive control, but reduces financial impact of residual failures |
Adjusted Residual Risk with Compensating Controls¶
Taking the two highest-severity scenarios and applying compensating controls:
Unauthorized payment (0.0005 per 1,000 AI-layer residual):
| Additional Layer | Catches | Remaining |
|---|---|---|
| Payment gateway validation (amount, card, 3DS) | ~95% of remaining | 0.000025 |
| Fraud detection system | ~80% of remaining | 0.000005 |
| Customer order confirmation | ~90% of remaining | 0.0000005 |
Effective residual with compensating controls: ~1 in 2 billion transactions
PII leakage (0.001 per 1,000 AI-layer residual):
| Additional Layer | Catches | Remaining |
|---|---|---|
| API response filtering (DLP at gateway) | ~85% of remaining | 0.00015 |
| Session isolation (infrastructure) | ~70% of remaining | 0.000045 |
Effective residual with compensating controls: ~1 in 22 million transactions
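The compensated residuals apply the same sequential-catch arithmetic to the AI-layer residual. A sketch using the unauthorized-payment chain's illustrative rates from the table above:

```python
def compensated_residual(ai_residual_per_1k: float, catch_rates: list[float]) -> float:
    """Apply each compensating control's catch rate, in order,
    to whatever the AI layers missed."""
    remaining = ai_residual_per_1k
    for rate in catch_rates:
        remaining *= (1.0 - rate)
    return remaining

# Unauthorized payment: gateway ~95%, fraud detection ~80%, customer confirmation ~90%
left = compensated_residual(0.0005, [0.95, 0.80, 0.90])
print(f"{left:.7f} per 1,000")        # 0.0000005
print(f"1 in {1000 / left:,.0f}")     # ~1 in 2 billion
```

The same call with `[0.85, 0.70]` reproduces the PII-leakage figure.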
The point: Compensating controls don't excuse weak AI-specific controls. But when a risk committee asks "what's the realistic probability of a customer being charged incorrectly?" the answer includes the full control stack, not just the AI layers. Present both the AI-layer residual and the compensated residual.
Risk Tier Scenarios¶
The product chatbot is a HIGH-tier system. Here's how the same methodology applies across all four tiers, showing how control depth scales with risk.
LOW Tier: Public FAQ Chatbot¶
System: Answers general product questions from public documentation. No customer data access, no transaction capability, no personalization.
Control configuration: Guardrails only. No Judge (or optional 1-5% sampling). No human-in-the-loop (exception-only).
| Threat | Inherent (per 1K) | Controls Applied | Residual (per 1K) |
|---|---|---|---|
| Hallucinated FAQ answer | 30 | Guardrails (90%): output grounding check against FAQ corpus | 3.0 |
| Inappropriate response | 10 | Guardrails (90%): content policy filter | 1.0 |
| Brand reputation harm | 5 | Guardrails (90%): tone checker | 0.5 |
Residual: ~4.5 issues per 1,000 interactions
Why this is acceptable: No financial impact, no data exposure, no irreversible actions. The FAQ bot gives wrong or awkward answers ~0.45% of the time. Users can verify against the website. The cost of these failures is low. Adding Judge and HITL would improve accuracy but the investment isn't proportionate to the risk.
PACE posture: Primary only (fail-open with logging). If guardrails fail, the chatbot continues to operate but all outputs are logged for batch review.
MEDIUM Tier: Internal Document Assistant¶
System: Helps internal employees search and summarise company policy documents. Has access to internal knowledge base (read-only). Users are employees with domain knowledge who are expected to verify outputs.
Control configuration: Guardrails + Judge (5-10% sampling, batch daily). Human review on flags only.
| Threat | Inherent (per 1K) | Controls Applied | Residual (per 1K) |
|---|---|---|---|
| Hallucinated policy detail | 40 | Guardrails (90%) + Judge on 10% sample (95%) | 4.0 after guardrails, ~3.6 with sampled Judge* |
| PII in internal docs exposed incorrectly | 8 | Guardrails (90%) + Judge on 10% sample (95%) | 0.8 after guardrails, ~0.72 with sampled Judge* |
| Confidential doc outside need-to-know | 5 | Guardrails (90%): access control check + Judge on 10% sample (95%) | 0.5 after guardrails, ~0.45 with sampled Judge* |
* Judge at 10% sampling catches 95% of the 10% it evaluates. Effective additional catch rate: 0.10 × 0.95 = 9.5% of guardrail misses.
How sampling affects the math:
With full Judge coverage: P(miss) = 0.10 × 0.05 = 0.005 (0.5%)
With 10% Judge sampling: P(miss) = 0.10 × (0.90 + 0.10 × 0.05) = 0.10 × 0.905 = 0.0905 (~9%)
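This sampling adjustment generalises to any Judge coverage level; a sketch in which `coverage=1.0` recovers the full-coverage figure:

```python
def p_miss_with_sampling(g_eff: float, j_eff: float, coverage: float) -> float:
    """P(issue passes guardrails and Judge) when the Judge only
    evaluates a fraction `coverage` of transactions."""
    judge_miss = (1 - coverage) + coverage * (1 - j_eff)
    return (1 - g_eff) * judge_miss

print(f"{p_miss_with_sampling(0.90, 0.95, 1.00):.4f}")  # 0.0050 (full coverage)
print(f"{p_miss_with_sampling(0.90, 0.95, 0.10):.4f}")  # 0.0905 (10% sampling)
```

Plotting this function over coverage is a quick way to justify a tier's sampling rate to a risk committee.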
Residual: ~4.8 issues per 1,000 interactions (with sampling)
Why this is acceptable: Internal users with domain expertise will catch most residual errors. The document assistant is an accelerator, not a decision-maker. Employees are expected to verify critical details against source documents. The Judge sampling catches systematic errors (drifting summaries, recurring hallucination patterns) even if it doesn't catch every individual instance.
PACE posture: P + A configured. If guardrails degrade, scope narrows to read-only retrieval (no summarisation). If Judge is unavailable, guardrail-only mode with increased human spot-checking.
HIGH Tier: Customer Product Chatbot¶
See the detailed worked example above.
Control configuration: Guardrails + Judge (20-50% coverage, near real-time) + Human oversight (flagged transactions + sampling).
Residual: ~0.01 issues per 1,000 transactions across all threat categories
PACE posture: P + A + C configured and tested.
- Alternate: Judge down → guardrails remain active, all transactions flagged for human review queue, response latency accepted
- Contingency: Guardrails degraded → chatbot enters "assisted browse" mode (read-only, no transactions), human reviews every interaction
- Emergency: Multiple layers down → circuit breaker fires, chatbot replaced with static product pages + "contact us" fallback
CRITICAL Tier: Credit Decisioning System¶
System: AI evaluates loan applications and produces credit decisions that are auto-executed for standard cases. Decisions affect customer finances directly. Regulatory obligations (fair lending, adverse action notices).
Control configuration: Full three-layer deployment. Judge at 100% coverage, real-time. Human review of all adverse decisions and all decisions above a threshold.
| Threat | Inherent (per 1K) | Controls Applied | Residual (per 1K) |
|---|---|---|---|
| Discriminatory decision | 15 | Guardrails (90%): protected-class input filtering + bias detection. Judge (95%): fairness evaluation per decision. Human (98%): all adverse decisions reviewed | 0.0015 |
| Hallucinated financial data | 20 | Guardrails (90%): data validation against bureau records. Judge (95%): source verification. Human (98%): sample review of all auto-approved | 0.002 |
| Incorrect risk score | 30 | Guardrails (90%): range and consistency checks. Judge (95%): independent risk recalculation on sample. Human (98%): all high-value reviewed | 0.003 |
| Regulatory violation | 10 | Guardrails (90%): compliance rule engine. Judge (95%): regulatory checklist evaluation. Human (98%): compliance officer review | 0.001 |
| Model drift — gradual accuracy degradation | 5 | Guardrails (90%): statistical drift detection. Judge (95%): decision distribution monitoring. Human (98%): monthly portfolio review | 0.0005 |
Residual: ~0.008 issues per 1,000 decisions across all threat categories = 1 in 125,000 decisions
Compensating controls that further reduce risk:
- Regulatory model validation — independent model risk management (OCC/Fed guidance)
- Adverse action notice process — customer can challenge decisions, creating a feedback loop
- Portfolio-level monitoring — statistical analysis catches systematic bias even if individual decisions pass
- Audit trail requirements — every decision is logged with full reasoning chain for regulatory examination
PACE posture: Full PACE cycle with tested E→P recovery.
- Alternate: Judge degraded → all decisions require human approval (no auto-execution)
- Contingency: Multiple layers degraded → system enters "manual underwriting" mode — AI provides data retrieval only, all decisions made by human underwriters
- Emergency: Circuit breaker → AI removed from decision path entirely, application queue held, existing commitments honoured through manual process
Cross-Tier Summary¶
| Tier | Example | Layers Active | Judge Coverage | Inherent Risk (per 1K) | Residual Risk (per 1K) | Reduction Factor |
|---|---|---|---|---|---|---|
| LOW | Public FAQ | Guardrails | Optional 1-5% | ~45 | ~4.5 | 10× |
| MEDIUM | Internal docs | Guardrails + Judge (sampled) | 5-10% | ~53 | ~4.8 | ~11× |
| HIGH | Product chatbot | All three | 20-50% | ~100 | ~0.01 | 10,000× |
| CRITICAL | Credit decisions | All three (full) | 100% | ~80 | ~0.008 | 10,000× |
Key insight: The jump from 10× to 10,000× risk reduction happens when the Judge moves from sampling to substantial coverage and Human Oversight moves from exception-only to systematic review. This is why the framework requires full three-layer deployment for HIGH and CRITICAL tiers.
What These Numbers Do Not Tell You¶
1. Severity is not uniform. One PII leakage incident may matter more than fifty hallucinated FAQ answers. The residual risk numbers are per-incident counts, not impact-weighted. Weight your residual risk by impact severity when reporting to risk committees.
2. Effectiveness rates change over time. Adversaries adapt. Models drift. Guardrail bypass techniques evolve. The 90/95/98 rates are a snapshot. Schedule quarterly recalibration through:
- Red team exercises against guardrails
- Judge accuracy measurement against labelled datasets (see Judge Assurance)
- Human reviewer agreement studies
3. Correlated failures break the model. If a novel attack technique bypasses both guardrails AND the Judge (because both rely on similar detection approaches), the independence assumption fails and residual risk is higher than predicted. This is why the framework emphasises different models, different methods, and different perspectives across layers.
4. The "unknown unknown" isn't modelled. This analysis covers known threat categories. Novel failure modes — threats you haven't imagined — are not captured. The Judge layer's semantic evaluation and Human Oversight provide some coverage for novel threats, but the model cannot quantify what it cannot anticipate. This is the fundamental argument for defence in depth: you need layers precisely because you can't predict everything.
5. Compensating controls have their own failure rates. The payment gateway, fraud detection system, and API validation layer can all fail too. A complete risk assessment would model these as additional independent layers with their own effectiveness rates, producing a full probability tree. The simplified analysis above is sufficient for directional decision-making.
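As a sketch of caveat 1, the residuals from the product-chatbot summary can be impact-weighted before reporting. The severity weights here are placeholders a risk team would set, not framework values:

```python
# Residual per 1,000 from the combined risk summary table
residuals = {
    "price manipulation": 0.002, "hallucination": 0.005,
    "pii leakage": 0.001, "unauthorized payment": 0.0005,
    "inappropriate response": 0.0015,
}
# Hypothetical relative-impact weights (illustrative only)
severity_weight = {
    "price manipulation": 10, "hallucination": 3,
    "pii leakage": 50, "unauthorized payment": 100,
    "inappropriate response": 3,
}

weighted = sum(residuals[t] * severity_weight[t] for t in residuals)
print(f"impact-weighted residual: {weighted:.4f} per 1,000")  # 0.1395
```

Note how the two critical scenarios dominate the weighted total despite being the smallest raw counts.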
Using This in Practice¶
For Risk Committees¶
If your organisation uses NIST AI RMF, frame this assessment in those terms:
| What You're Presenting | NIST RMF Function | Language to Use |
|---|---|---|
| Threat scenarios and likelihood | MAP | "We've identified and categorised the AI-specific risks for this system" |
| Per-layer effectiveness data | MEASURE | "We've measured control effectiveness through red teaming and Judge calibration" |
| Residual risk calculation | MEASURE | "Residual risk after all control layers is X per Y transactions" |
| Risk appetite comparison | GOVERN | "This residual risk is within/outside our stated risk tolerance" |
| Compensating controls and PACE postures | MANAGE | "We have compensating controls and defined degradation paths when controls fail" |
Present two numbers:
1. AI-layer residual risk — what's left after guardrails, Judge, and human oversight
2. Compensated residual risk — what's left after existing infrastructure controls also apply
Frame the discussion around whether the compensated residual risk is within appetite, not whether it's zero. It will never be zero.
For Engineering Teams¶
Use the per-scenario tables to:
- Prioritise control implementation — highest inherent likelihood × severity first
- Justify Judge coverage levels — show the math on sampling vs. full coverage
- Identify where compensating controls reduce urgency — payment gateway validation may mean you can deploy with guardrails-only initially while building out the Judge layer
For Incident Response¶
When an incident occurs, update the effectiveness rates:
- If a prompt injection bypasses guardrails, your guardrail effectiveness for that attack class drops
- Recalculate residual risk with updated rates
- Determine whether the remaining layers still bring residual risk within appetite
- If not, implement additional controls or reduce system scope
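A sketch of that recalculation, using hypothetical measurement counts (pooling prior and new observations is one simple updating scheme; yours may differ):

```python
def updated_effectiveness(prior_catches: int, prior_total: int,
                          new_misses: int, new_total: int) -> float:
    """Pool a prior measurement with new incident/red-team data
    to get an updated catch rate for one attack class."""
    catches = prior_catches + (new_total - new_misses)
    return catches / (prior_total + new_total)

# Hypothetical: guardrails previously caught 90/100 injection attempts;
# a new red-team round shows 8 bypasses in 20 attempts.
g_eff = updated_effectiveness(90, 100, 8, 20)
residual = (1 - g_eff) * (1 - 0.95) * (1 - 0.98)  # Judge and Human unchanged
print(f"updated guardrail effectiveness: {g_eff:.3f}")  # 0.850
print(f"updated residual: {residual:.6f}")              # 0.000150
```

If the updated residual exceeds appetite for that scenario, that is the trigger for the additional controls or scope reduction described above.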
Recalibration Schedule¶
| Activity | Frequency | Updates |
|---|---|---|
| Red team guardrail testing | Quarterly | Guardrail effectiveness rate |
| Judge accuracy evaluation | Quarterly | Judge effectiveness rate |
| Human reviewer agreement study | Bi-annually | Human oversight effectiveness rate |
| Incident-driven recalculation | Per incident | Specific scenario rates |
| Full risk assessment refresh | Annually | All rates, all scenarios, all tiers |
Template: Applying This to Your System¶
For each AI system, complete this assessment. NIST AI RMF function labels are included so you can slot each step into your existing risk management process.
1. System description and tier classification (NIST RMF: MAP 1.1, MAP 2.1)
   - What does the system do?
   - What data does it access? What actions can it take?
   - What is the assigned risk tier and why?
2. Threat scenario identification (NIST RMF: MAP 3.1, MAP 3.2)
   - List 3-7 realistic failure modes
   - Estimate inherent likelihood per 1,000 transactions (use incident data, red team results, or informed estimates)
   - Rate severity: Critical / High / Medium / Low
3. Per-layer control analysis (NIST RMF: MEASURE 1.1, MEASURE 2.1, MEASURE 2.6)
   - For each scenario, describe what each layer detects and how
   - Apply your measured or estimated effectiveness rates
   - Calculate residual risk
4. Compensating controls (NIST RMF: MANAGE 1.1, MANAGE 2.2)
   - List existing infrastructure controls that independently reduce risk
   - Estimate their effectiveness against each scenario
   - Calculate compensated residual risk
5. Appetite comparison (NIST RMF: GOVERN 1.5, MANAGE 2.4)
   - Does the compensated residual risk fall within your risk appetite?
   - If not, what additional controls or scope reductions are needed?
6. Recalibration plan (NIST RMF: MEASURE 2.3, GOVERN 1.4)
   - When will you re-measure effectiveness rates?
   - What triggers an unscheduled reassessment?
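One record per scenario is a convenient carrier for steps 2 through 5; a sketch, with field names and the appetite threshold as illustrative assumptions rather than framework requirements:

```python
from dataclasses import dataclass, field

@dataclass
class ScenarioAssessment:
    name: str
    inherent_per_1k: float                                          # step 2
    layer_effectiveness: list[float] = field(default_factory=list)  # step 3
    compensating_effectiveness: list[float] = field(default_factory=list)  # step 4

    def residual_per_1k(self) -> float:
        r = self.inherent_per_1k
        for eff in self.layer_effectiveness + self.compensating_effectiveness:
            r *= (1 - eff)
        return r

    def within_appetite(self, appetite_per_1k: float) -> bool:      # step 5
        return self.residual_per_1k() <= appetite_per_1k

s = ScenarioAssessment("unauthorized payment", 5,
                       layer_effectiveness=[0.90, 0.95, 0.98],
                       compensating_effectiveness=[0.95, 0.80, 0.90])
print(f"{s.residual_per_1k():.7f} per 1,000")  # 0.0000005
print(s.within_appetite(0.001))                # True
```

Keeping assessments in this form makes the quarterly recalibration a matter of editing the rate lists and re-running the comparison.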
AI Runtime Behaviour Security, 2026 (Jonathan Gill).