Implementation Checklist¶

Implementation Architecture

Phase 1: Foundation¶

Classification¶

AI system identified and documented
Risk tier assigned (LOW / MEDIUM / HIGH / CRITICAL)
Classification rationale documented
Business owner identified
Review date set

Governance¶

Roles defined (owner, operator, reviewer)
Approval workflow established
Escalation path documented
Success metrics defined

Phase 2: Logging¶

Interaction Capture¶

Full input/output logging enabled
User/session attribution working
Timestamps accurate
Context captured (system prompt, retrieved content)

Storage¶

Retention period configured per tier
Access controls applied
Tamper protection enabled (HIGH/CRITICAL)
Backup/recovery tested

Phase 3: Guardrails¶

Input Guardrails¶

Output Guardrails¶

Content filtering enabled
PII detection configured
Grounding checks enabled (if applicable)
Format validation applied

Testing¶

Known-bad inputs blocked
Legitimate inputs pass
False positive rate acceptable
Latency acceptable

Phase 4: Judge¶

Setup¶

Judge prompt developed
Evaluation criteria defined
Scoring rubric documented
Judge model selected

Shadow Mode¶

Judge running on all/sampled interactions
Findings logged but not acted on
Human comparison performed
Accuracy measured (target: >90%)

Calibration¶

False positive rate measured
False negative rate estimated
Judge prompt tuned based on findings
Re-validated after tuning

Phase 5: Human Oversight¶

Queues¶

Queue structure defined
SLAs set per queue
Routing rules configured
Escalation paths working

Reviewers¶

Reviewers identified and trained
Review interface deployed
Actions documented (approve, correct, escalate, etc.)
Feedback loop to Judge established

Quality Assurance¶

Canary cases configured
Review time tracking enabled
Volume limits set
Inter-rater reliability measured

Phase 6: Operationalise¶

Judge to Advisory¶

Findings surfaced to reviewers
Reviewer feedback captured
Judge accuracy re-measured
Tuning based on feedback

Judge to Operational¶

Findings automatically routed
Workflows triggered by findings
Metrics dashboard live
Alerting configured

Phase 7: Continuous Improvement¶

Metrics¶

Guardrail block rate tracked
Judge finding rate tracked
HITL review rate tracked
False positive/negative trends monitored

Tuning¶

Regular guardrail rule review
Judge prompt refinement
Threshold adjustment based on data
New attack pattern incorporation

Review¶

Quarterly control effectiveness review
Annual risk tier re-assessment
Incident lessons incorporated
Regulatory changes assessed

Agent-Specific (if applicable)¶

Scope Enforcement¶

Network allowlist configured
Data access limited to authorised scope
Action allowlist implemented
Resource caps set

Action Controls¶

Action validator deployed
Approval workflow for high-impact actions
Circuit breakers configured
Tool output sanitisation enabled

Monitoring¶

Action volume alerts set
Error rate alerts set
Cost anomaly alerts set
Scope violation alerts set

Verification¶

This checklist is most effective when automated — integrated into CI/CD pipelines and platform deployment workflows so that items are verified as part of the build, not signed off in a meeting. Where automation isn't feasible, the team that built the system verifies their own readiness.

Organisations may choose to add formal sign-off gates for higher-risk tiers. That is a governance decision, not a framework requirement. The framework's requirement is that the checks are completed and the results are visible — not that a specific approver signs a document.

Phase	Completed	Verified By	Date
Foundation	☐
Logging	☐
Guardrails	☐
Judge	☐
HITL	☐
Operational	☐
Agent Controls	☐

AI Runtime Behaviour Security, 2026 (Jonathan Gill).