From Idea to Production to Ongoing Control¶
The end-to-end process: strategy to use case to tool selection to risk tiering to deployment to ongoing governance. One flow, no gaps.
Part of From Strategy to Production
Why This Matters¶
The framework has excellent depth in each domain — risk tiers, controls, PACE resilience, governance. But there's no single document that connects the entire lifecycle from "someone has an idea" to "the system is running safely in production and being continuously governed."
Without that connected flow, organisations experience gaps:
- Ideas become deployments without passing through risk classification
- Systems launch without controls because nobody triggered the governance process
- Operating teams inherit systems without knowing what to monitor or when to escalate
- Use cases evolve without triggering reassessment
- Tools get selected before anyone asks whether AI is the right approach
This article defines the complete process. Every stage has a clear output that feeds the next stage. Every handoff has a named owner. Every decision point has criteria.
The End-to-End Flow¶
Eight stages. Each produces a defined output. Each has guardrails that prevent mistakes, detect gaps, and absorb failure if stages are rushed or skipped.
| Stage | Activity | Output | Guardrail |
|---|---|---|---|
| 1. Strategic Alignment | Is this worth doing? | Business case | Detect: systems without business cases surface in governance reviews |
| 2. Use Case Definition | What exactly will it do? | Completed use case definition | Prevent: ten questions steer toward complete definitions |
| 3. Tool Selection | Is AI the right approach? | Technology decision | Prevent: Use Case Filter steers to right tool early |
| 4. Risk Classification | What tier does this sit at? | Scored risk profile + tier | Detect: unclassified systems visible in registry |
| 5. Control Design | What controls does this tier need? | Control specification + PACE plan | Prevent: approved platforms inherit baseline controls |
| 6. Build & Test | Implement the system and controls | Working system with controls | Detect: pre-deployment checks surface gaps |
| 7. Deploy & Operate | Launch and run | Operating system with monitoring | Absorb: gradual rollout contains blast radius |
| 8. Ongoing Governance | Monitor, review, evolve | Continuous assurance | Detect: continuous monitoring surfaces drift |
Stage 1: Strategic Alignment¶
Owner: Business sponsor
Purpose: Determine whether this initiative is worth pursuing — before any technical work begins.
Inputs:
- Business problem or opportunity
- Strategic context (see Business Alignment)
Activities:
- Define the business problem in measurable terms
- Assess whether the problem justifies investment
- Identify at least two alternative approaches (see Stage 3 below)
- Estimate the value of solving it
Output: Business Case
| Field | Content |
|---|---|
| Problem statement | What's the problem, measured in current cost/impact? |
| Proposed approach | High-level solution concept |
| Expected value | Quantified benefit (cost reduction, revenue, efficiency) |
| Strategic alignment | How does this connect to organisational strategy? |
| Initial risk sense | Gut-level: is this low, medium, or high risk? |
| Sponsor | Named executive sponsor |
Guardrail: Systems that reach production without a business case become visible during governance reviews: they can't demonstrate value, and they generate monitoring noise. The environment doesn't block teams from exploring, but it makes unjustified investment visible.
What can go wrong here:
- Skip this stage → technology investment without business justification
- Vague problem statement → impossible to measure success later
- No alternatives considered → commitment to AI before evaluating options
Stage 2: Use Case Definition¶
Owner: Business owner + AI engineer (collaborative)
Purpose: Translate the business case into a specific, assessable use case definition.
Inputs:
- Business case from Stage 1
Activities:
- Complete the ten-question use case definition
- Define explicit positive and negative scope
- Identify data requirements and access needs
- Determine user population and expected volume
- Identify regulatory context
- Name the accountable business owner
Output: Completed Use Case Definition
The full template from Use Case Definition. All ten questions answered. No "TBD" in critical fields.
Guardrail: The ten questions are the preventive control — they steer teams toward completeness. If fields are left as "TBD," downstream controls will be misconfigured and monitoring will surface the mismatch. Review by business owner, legal/compliance, and data owner improves quality but isn't a hard stop — incomplete definitions reveal themselves in operation.
What can go wrong here:
- Incomplete definition → uncertain risk tier → wrong controls
- Negative scope missing → guardrails can't enforce boundaries
- Understated decision authority → system classified too low
- "TBD" in regulatory context → compliance surprise at launch
Stage 3: Tool Selection¶
Owner: Technical lead + business owner
Purpose: Determine whether AI is the right tool — and if so, what kind.
This stage explicitly evaluates alternatives. The framework's first control is choosing the right tool.
Inputs:
- Completed use case definition from Stage 2
Activities:
The Tool Selection Decision Tree¶
| Question | If Yes | If No |
|---|---|---|
| Can this be solved with deterministic rules? | Use rules engine, workflow automation, or traditional code | Continue |
| Does it require understanding unstructured input (natural language, images)? | AI is likely appropriate | Consider RPA or structured automation |
| Does it require pattern recognition across large datasets? | AI/ML is likely appropriate | Consider traditional analytics |
| Does it need to generate novel content or responses? | Generative AI is appropriate | Consider retrieval + templating |
| Does the use case require real-time, non-deterministic reasoning? | LLM-based AI is appropriate | Consider traditional ML models |
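Expressed as code, the tree becomes a reusable filter. A minimal sketch in Python (the field names and simplified ordering are assumptions for illustration; the framework doesn't prescribe an implementation):

```python
from dataclasses import dataclass

@dataclass
class UseCase:
    # Hypothetical fields distilled from the decision-tree questions above.
    deterministic_rules_suffice: bool
    understands_unstructured_input: bool
    pattern_recognition_at_scale: bool
    generates_novel_content: bool
    needs_realtime_nondeterministic_reasoning: bool

def select_tool(uc: UseCase) -> str:
    """Walk the decision tree top to bottom; the first match wins."""
    if uc.deterministic_rules_suffice:
        return "rules engine / workflow automation / traditional code"
    if uc.generates_novel_content:
        return "generative AI"
    if uc.needs_realtime_nondeterministic_reasoning:
        return "LLM-based AI"
    if uc.pattern_recognition_at_scale:
        return "traditional ML or analytics"
    if uc.understands_unstructured_input:
        return "AI likely appropriate; refine the definition"
    return "RPA or structured automation"

# Example: a system that drafts replies to free-text customer queries.
print(select_tool(UseCase(
    deterministic_rules_suffice=False,
    understands_unstructured_input=True,
    pattern_recognition_at_scale=False,
    generates_novel_content=True,
    needs_realtime_nondeterministic_reasoning=True,
)))  # -> generative AI
```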
The Five Options¶
| Option | When To Use | Risk Profile | Framework Implication |
|---|---|---|---|
| Traditional software | Deterministic logic, bounded inputs, exact outputs needed | Lowest — existing SDLC applies | Outside framework scope |
| RPA / workflow automation | Structured, repeatable processes; UI-based integration | Low — deterministic, auditable | Outside framework scope |
| Traditional ML | Pattern recognition on structured data; classification, regression | Low–Medium — predictable, testable | Partial framework (monitoring, bias) |
| LLM / Generative AI | Unstructured input, natural language, content generation | Medium–Critical (depends on use case) | Full framework applies |
| Multi-agent AI | Complex workflows requiring multiple AI components collaborating | High–Critical | Full framework + MASO |
The Hybrid Reality¶
Most real-world solutions are hybrid. A customer service system might use:
- Traditional code for authentication and session management
- Rules engine for routing queries to the right department
- LLM for understanding the customer's intent and drafting responses
- Traditional database for account lookups
- Deterministic logic for executing any account actions
The framework applies to the AI components. The risk tier is determined by what the AI does, not by the entire system.
Key principle from The First Control: "AI proposes. Deterministic systems dispose." Wherever possible, use AI for cognition (understanding, drafting, recommending) and deterministic systems for action (executing, committing, approving). This naturally constrains the AI's blast radius and often reduces the risk tier.
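The pattern is enforceable in code: treat the model's output as a structured proposal, and let a deterministic gate decide whether it executes. A minimal sketch, assuming a structured proposal format (the whitelist, function names, and stub LLM call are illustrative):

```python
ALLOWED_ACTIONS = {"update_address", "send_statement"}  # explicit positive scope

def propose_action(query: str) -> dict:
    """Stub for an LLM call that returns a structured action proposal."""
    return {"action": "update_address", "params": {"customer_id": "C123"}}

def execute(proposal: dict) -> None:
    """Existing transactional backend; deterministic, not the AI."""
    print(f"executing {proposal['action']} via deterministic backend")

def dispose(proposal: dict) -> bool:
    """Deterministic gate: only whitelisted, validated actions execute."""
    if proposal.get("action") not in ALLOWED_ACTIONS:
        return False  # out of scope: route to HITL, never execute
    # ... further deterministic checks (schema, limits, entitlements) ...
    execute(proposal)
    return True

dispose(propose_action("please update my address"))
```

The AI never touches the backend directly; everything it proposes passes through logic you can test exhaustively.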
Output: Technology Decision
| Field | Content |
|---|---|
| Selected approach | AI, RPA, traditional, or hybrid (specify which components are AI) |
| Justification | Why this approach over alternatives |
| AI components | If hybrid, which parts use AI and which don't |
| AI type | LLM, traditional ML, multi-agent, or combination |
| Platform/provider | Managed service, self-hosted, vendor product |
| Risk implication | How tool selection affects risk tier |
Guardrail: The Use Case Filter is the preventive control — it steers teams to the right tool before investment begins. If a team skips it and builds AI where rules would suffice, the overhead becomes visible in operation: unnecessary guardrail tuning, Judge findings on deterministic tasks, governance cost that simpler tools wouldn't generate. If the decision is "not AI," the initiative exits to standard SDLC.
Stage 4: Risk Classification¶
Owner: Risk analyst (2nd line)
Purpose: Formally classify the risk tier using the framework's six-dimension scoring model.
Inputs:
- Completed use case definition (Stage 2)
- Technology decision (Stage 3)
Activities:
- Score each dimension (Decision Authority, Reversibility, Data Sensitivity, Audience, Scale, Regulatory)
- Apply scoring rules (highest dimension wins; adjacent HIGHs compound; see the sketch after this list)
- Apply use case modifiers (agentic, customer-facing, regulated, batch)
- Check Fast Lane qualification (all four criteria met → Fast Lane)
- Document the classification with justification per dimension
- For AI-assisted classification, review the AI's proposed scores (see Use Case Definition)
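The scoring rules translate directly into code. A minimal sketch, assuming "adjacent HIGHs compound" means two or more HIGH dimensions escalate the tier one level (treat the Risk Tiers document as authoritative for the exact rule):

```python
LEVELS = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]

def classify(scores: dict[str, str]) -> str:
    """scores maps each of the six dimensions to a level in LEVELS."""
    tier = max(scores.values(), key=LEVELS.index)  # highest dimension wins
    if list(scores.values()).count("HIGH") >= 2:   # assumed compounding rule
        tier = LEVELS[min(LEVELS.index(tier) + 1, len(LEVELS) - 1)]
    return tier

# The fraud-triage example from the profile below:
scores = {
    "decision_authority": "HIGH",
    "reversibility": "MEDIUM",
    "data_sensitivity": "CRITICAL",
    "audience": "MEDIUM",
    "scale": "HIGH",
    "regulatory": "HIGH",
}
print(classify(scores))  # -> CRITICAL: data sensitivity already drives the tier
```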
Output: Scored Risk Profile
| Dimension | Score | Justification |
|---|---|---|
| Decision Authority | e.g., HIGH | AI recommendations directly shape fraud investigation priority |
| Reversibility | e.g., MEDIUM | Incorrect prioritisation is recoverable but may delay detection |
| Data Sensitivity | e.g., CRITICAL | Processes transaction data including cardholder PII |
| Audience | e.g., MEDIUM | Internal fraud analysts |
| Scale | e.g., HIGH | 80,000 transactions/day |
| Regulatory | e.g., HIGH | PCI-DSS, banking regulations |
| Overall Tier | CRITICAL | Data sensitivity drives the tier |
Guardrail: Unclassified systems are visible in the use case registry — they stand out because they have no tier, no controls, and no monitoring baseline. For Fast Lane, teams self-certify. For MEDIUM, a risk analyst reviews. For HIGH/CRITICAL, the governance committee reviews. The classification process is lightweight enough that skipping it costs more than doing it.
What can go wrong here:
- Optimistic scoring → system under-controlled
- No governance approval → classification has no authority
- Dimension ambiguity not investigated → hidden risk
- Fast Lane self-certification when criteria aren't clearly met → under-governed system
Stage 5: Control Design¶
Owner: Security architect + AI governance
Purpose: Translate the risk tier into a specific control specification for this system.
Inputs:
- Scored risk profile (Stage 4)
- Use case definition (Stage 2)
- Technology decision (Stage 3)
Activities:
- Select controls from the control matrix based on tier
- Apply modifiers from the control selection guide
- Design the PACE resilience plan: Primary, Alternate, Contingency, Emergency states
- Specify guardrail configuration (what to block, what to allow)
- Define Judge evaluation criteria (what "good" and "bad" look like for this use case)
- Specify HITL requirements (who reviews, SLA, escalation path)
- Size operational requirements (HITL staff, Judge compute, log storage)
- If agentic: specify tool access controls, sandbox boundaries, delegation limits
- If multi-agent: apply MASO controls at the appropriate tier
Output: Control Specification
| Control Area | Specification |
|---|---|
| Guardrails — Input | Topic rules, injection detection, PII detection, rate limiting (specific config) |
| Guardrails — Output | Content filtering, PII handling, confidence thresholds (specific config) |
| Judge | Evaluation criteria, sampling rate, escalation rules, Judge model selection |
| HITL | Reviewer role, SLA, escalation path, review criteria |
| PACE | P/A/C/E states with transition triggers, fallback process, kill switch |
| Logging | Content scope, retention period, access controls |
| Monitoring | Dashboards, alerts, anomaly thresholds |
| Incident response | Playbook reference, severity mapping, notification requirements |
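Captured as structured configuration rather than prose, the same specification can be versioned, linted, and diffed between releases. An illustrative sketch (every field name and value here is an assumption, not a framework mandate):

```python
# Illustrative control specification for a CRITICAL-tier system.
control_spec = {
    "tier": "CRITICAL",
    "guardrails": {
        "input": {"topic_rules": ["fraud_ops_only"], "injection_detection": True,
                  "pii_detection": True, "rate_limit_per_min": 60},
        "output": {"content_filter": "strict", "pii_handling": "redact",
                   "confidence_threshold": 0.8},
    },
    "judge": {"sampling_rate": 1.0, "model": "separate_from_primary",
              "escalate_on": ["policy_violation", "low_confidence"]},
    "hitl": {"reviewer_role": "senior_fraud_analyst", "sla_minutes": 30,
             "escalation_path": "fraud_ops_lead"},
    "pace": {"primary": "llm_pipeline", "alternate": "rules_engine",
             "contingency": "manual_queue", "emergency": "kill_switch"},
    "logging": {"content_scope": "full", "retention_days": 2555},
    "monitoring": {"alerts": ["guardrail_block_spike", "judge_disagreement"]},
}
```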
Guardrail: Teams building on approved platforms inherit baseline controls automatically — logging, monitoring, and standard guardrails come with the platform. The control specification adds use-case-specific configuration on top. Review by security architect, governance, and business owner strengthens the design, but the platform defaults mean even a rushed deployment starts with basic protection.
Stage 6: Build and Test¶
Owner: Engineering team
Purpose: Implement the system and its controls, and verify they work.
Inputs:
- Control specification (Stage 5)
- Technology decision (Stage 3)
Activities:
- Build the AI system (model integration, data pipelines, UI)
- Implement guardrails per specification
- Configure Judge evaluation (prompts, sampling, routing)
- Set up HITL workflows and queues
- Configure logging and monitoring
- Implement PACE transitions (feature flag, fallback activation; see the sketch after this list)
- Test against the testing guidance
- Run pre-deployment checklist
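The PACE transitions reduce to a small state machine behind the feature flag. A minimal sketch (the four states come from the PACE plan; the trigger names are assumptions):

```python
from enum import Enum

class PaceState(Enum):
    PRIMARY = "llm_pipeline"      # normal AI operation
    ALTERNATE = "rules_engine"    # deterministic fallback
    CONTINGENCY = "manual_queue"  # humans absorb the workload
    EMERGENCY = "kill_switch"     # system off, fail closed

# Assumed transition triggers; the real ones come from the Stage 5 PACE plan.
TRANSITIONS = {
    (PaceState.PRIMARY, "provider_outage"): PaceState.ALTERNATE,
    (PaceState.PRIMARY, "judge_failure_spike"): PaceState.ALTERNATE,
    (PaceState.ALTERNATE, "fallback_degraded"): PaceState.CONTINGENCY,
    (PaceState.CONTINGENCY, "incident_critical"): PaceState.EMERGENCY,
}

def next_state(current: PaceState, signal: str) -> PaceState:
    """Return the new state for a signal, or stay put if no rule matches."""
    return TRANSITIONS.get((current, signal), current)

print(next_state(PaceState.PRIMARY, "provider_outage"))
# PaceState.ALTERNATE -- the feature flag flips traffic to the rules engine
```

Testing these transitions is itself a checklist item below; a state machine that exists only on paper fails exactly when it's needed.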
Pre-Deployment Checklist:
| Check | Verified By | Status |
|---|---|---|
| Use case definition matches implementation | Business owner | |
| Risk tier is current (no scope changes during build) | Risk analyst | |
| Input guardrails active and tested | Security | |
| Output guardrails active and tested | Security | |
| Judge evaluation configured and tested (shadow mode) | Security/QA | |
| HITL workflow functional; reviewers trained | Operations | |
| PACE transitions tested (feature flag, fallback) | Engineering | |
| Logging captures required data at required retention | Engineering | |
| Monitoring dashboards and alerts configured | Operations | |
| Incident response playbook exists and is known | Operations | |
| Manual fallback process documented and tested | Business owner | |
| Kill switch operational | Engineering | |
| Regulatory/compliance sign-off obtained | Legal/Compliance | |
Guardrail: The pre-deployment checklist is a detective control — it surfaces gaps before they reach production. Items that aren't verified generate findings, not blockers. For HIGH/CRITICAL systems, the governance committee reviews before go-live. For lower tiers, the checklist serves as the team's own quality signal. The feature flag and PACE plan mean a deployment that discovers problems can be rolled back quickly.
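In code, the detective behaviour is simply that unverified items become findings rather than exceptions. A sketch of that shape (item names abbreviated from the table above):

```python
# Unverified checklist items become findings for review, not hard blockers.
checklist = [
    ("input_guardrails_tested", True),
    ("output_guardrails_tested", True),
    ("judge_tested_in_shadow_mode", True),
    ("hitl_reviewers_trained", False),
    ("kill_switch_operational", True),
]

findings = [item for item, verified in checklist if not verified]

if findings:
    print(f"pre-deployment findings: {findings}")
    # HIGH/CRITICAL: findings go to the governance committee before go-live.
    # Lower tiers: findings serve as the team's own quality signal.
```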
What can go wrong here:
- Controls implemented but not tested → false confidence
- Judge in shadow mode never switches to active → no detection
- HITL reviewers assigned but not trained → Human Factors failure
- PACE plan documented but transitions never tested → plan doesn't work under pressure
- Scope changed during build, risk tier not reassessed → running at wrong tier
Stage 7: Deploy and Operate¶
Owner: Technical operations + business owner
Purpose: Launch the system and transition to steady-state operations.
Inputs:
- Tested system with verified controls (Stage 6)
Activities:
- Deploy to production (gradual rollout for HIGH/CRITICAL)
- Activate Judge evaluation (move from shadow to active)
- Begin HITL operations
- Monitor control effectiveness
- Tune guardrails based on initial false positive/negative data
- Calibrate Judge accuracy against HITL decisions
- Verify logging and alerting in production
Deployment Pattern by Tier:
| Tier | Deployment Approach | Rationale |
|---|---|---|
| Fast Lane | Ship it | Low risk; feature flag is the safety net |
| LOW | Standard release | Basic monitoring sufficient |
| MEDIUM | Canary or staged rollout | Monitor Judge findings before full traffic |
| HIGH | Gradual rollout with enhanced monitoring | Watch for unexpected patterns at scale |
| CRITICAL | Phased rollout with governance checkpoints | Each phase reviewed before expansion |
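One way to operationalise the table is a traffic schedule per tier, advanced only while monitoring stays healthy. A sketch with assumed percentages (the framework doesn't mandate specific steps):

```python
# Assumed rollout percentages per tier; tune to your risk appetite.
ROLLOUT_STAGES = {
    "FAST_LANE": [100],
    "LOW": [100],
    "MEDIUM": [5, 25, 100],           # canary, then staged
    "HIGH": [1, 5, 25, 50, 100],      # gradual, with enhanced monitoring
    "CRITICAL": [1, 5, 25, 50, 100],  # each step gated by a governance checkpoint
}

def advance(tier: str, step: int, healthy: bool) -> int:
    """Return the traffic percentage for the next step, or 0 to roll back."""
    if not healthy:
        return 0  # absorb: pull traffic back and contain the blast radius
    stages = ROLLOUT_STAGES[tier]
    return stages[min(step + 1, len(stages) - 1)]

print(advance("HIGH", step=1, healthy=True))   # -> 25
print(advance("HIGH", step=1, healthy=False))  # -> 0
```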
First 30 Days:
| Activity | When | Owner |
|---|---|---|
| Daily guardrail effectiveness review | Day 1–14 | Security |
| Daily Judge finding review | Day 1–30 | Operations |
| HITL SLA compliance check | Daily | Governance |
| False positive rate assessment | Day 7, 14, 30 | Security |
| Judge accuracy calibration | Day 14, 30 | Operations |
| Operational review with business owner | Day 7, 14, 30 | All |
| First PACE transition test | Day 30 | Engineering |
Guardrail: Gradual rollout is the absorb control — it contains the blast radius of unexpected behaviour. The first 30 days of monitoring generate the baseline that ongoing governance uses. If calibration reveals problems, the deployment can be paused or rolled back without affecting the full user population. Operational handover to the steady-state team happens when monitoring confirms stability, not on a fixed schedule.
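Judge calibration in this window reduces to comparing Judge verdicts with HITL decisions on the same sampled items. A minimal sketch with toy data (the >80% agreement target comes from the governance dashboard in Stage 8):

```python
# Verdicts on the same sampled transactions, keyed by item id.
judge_verdicts = {"t1": "pass", "t2": "fail", "t3": "pass", "t4": "fail"}
hitl_decisions = {"t1": "pass", "t2": "fail", "t3": "fail", "t4": "fail"}

shared = judge_verdicts.keys() & hitl_decisions.keys()
agreement = sum(judge_verdicts[t] == hitl_decisions[t] for t in shared) / len(shared)

print(f"Judge/HITL agreement: {agreement:.0%}")  # 75%: below target
if agreement < 0.80:
    print("recalibrate or investigate (see the event-driven table in Stage 8)")
```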
Stage 8: Ongoing Governance¶
Owner: AI governance function (2nd line) + business owner (1st line)
Purpose: Continuously assure that the system operates within its defined risk profile and that the risk profile remains current.
Inputs:
- Operating system from Stage 7
- Use case definition (maintained as a living document)
Activities — Continuous:
| Activity | Frequency | Owner | Output |
|---|---|---|---|
| Guardrail effectiveness monitoring | Real-time | Technical ops | Block rates, false positive rates |
| Judge finding triage | Daily | Operations | Escalations, patterns |
| HITL SLA monitoring | Daily | Governance | Compliance reports |
| Anomaly detection | Continuous | Security/SOC | Alerts on drift |
| Usage monitoring | Weekly | Operations | Volume trends, user patterns |
Activities — Periodic:
| Activity | Frequency | Owner | Output |
|---|---|---|---|
| Judge accuracy calibration | Weekly (HIGH/CRITICAL), Monthly (MEDIUM) | Technical ops | Calibration adjustments |
| Control effectiveness review | Quarterly | Governance | Effectiveness report |
| Use case reassessment | Annual minimum; triggered by changes | Risk analyst | Updated risk profile |
| PACE transition test | Quarterly (CRITICAL), Bi-annual (HIGH), Annual (MEDIUM/LOW) | Engineering | Test results |
| Manual fallback exercise | Bi-annual | Business owner | Fallback verified |
| Regulatory alignment check | Annual + on regulatory change | Legal/Compliance | Compliance status |
| Human factors assessment | Annual | Governance | Reviewer competence, deskilling check |
Activities — Event-Driven:
| Trigger | Activity | Owner |
|---|---|---|
| AI incident | Incident playbook activation | Incident team |
| Scope change request | Use case reassessment → possible reclassification | Business owner + risk |
| Model change | Control configuration review | Security |
| Data access change | Data sensitivity reassessment | Risk + data owner |
| Regulatory change | Compliance impact assessment | Legal + governance |
| Volume threshold breach | Operational sizing review | Operations |
| Judge accuracy drop | Recalibration or investigation | Technical ops |
| HITL SLA breach | Root cause analysis | Governance |
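The table is effectively a routing map from trigger to activity and owner, which could be encoded directly so nothing depends on someone remembering it. A sketch (the dispatch mechanism is illustrative):

```python
# Trigger -> (activity, owner), copied from the event-driven table above.
TRIGGERS = {
    "ai_incident": ("incident playbook activation", "incident team"),
    "scope_change_request": ("use case reassessment", "business owner + risk"),
    "model_change": ("control configuration review", "security"),
    "data_access_change": ("data sensitivity reassessment", "risk + data owner"),
    "judge_accuracy_drop": ("recalibration or investigation", "technical ops"),
    "hitl_sla_breach": ("root cause analysis", "governance"),
}

def route(event: str) -> None:
    activity, owner = TRIGGERS.get(event, ("manual triage", "governance"))
    print(f"{event}: {activity} -> {owner}")

route("judge_accuracy_drop")
# judge_accuracy_drop: recalibration or investigation -> technical ops
```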
The Governance Dashboard¶
What the governance committee needs to see:
| Metric | Source | Frequency | Target |
|---|---|---|---|
| Systems by tier | Use case registry | Monthly | Complete coverage |
| Control implementation % | Control tracking | Monthly | 100% |
| HITL SLA compliance | Queue metrics | Monthly | >95% |
| Judge accuracy | Calibration data | Monthly | >80% agreement with HITL |
| Open escalations | Escalation log | Monthly | Trending down |
| Incidents by severity | Incident log | Monthly | Trending down |
| False positive rate | Guardrail metrics | Monthly | <5% |
| Use cases overdue for review | Registry | Monthly | 0 |
| Shadow AI discovered | Discovery tools | Monthly | Trending down |
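Most of these metrics are simple ratios over operational counts. A sketch of two of them with hypothetical numbers (targets from the table above):

```python
# Hypothetical monthly counts.
hitl_reviews_total, hitl_within_sla = 1200, 1164
guardrail_blocks_total, blocks_false_positive = 900, 36

sla_compliance = hitl_within_sla / hitl_reviews_total
false_positive_rate = blocks_false_positive / guardrail_blocks_total

print(f"HITL SLA compliance: {sla_compliance:.1%} (target >95%)")               # 97.0%
print(f"Guardrail false positive rate: {false_positive_rate:.1%} (target <5%)")  # 4.0%
```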
When to Stop¶
Systems should be retired when:
- The business case no longer holds
- The risk exceeds the organisation's appetite and can't be reduced
- A better solution exists (AI or otherwise)
- Regulatory changes make the use case non-viable
- The organisation can no longer safely operate the controls
Retirement process:
1. Governance approves retirement
2. Users notified with timeline
3. Manual fallback activated permanently
4. Data retention obligations confirmed
5. System decommissioned
6. Use case moved to "Retired" in registry
7. Post-retirement review documented (lessons learned)
The Complete Lifecycle — Summary¶
STAGE 1: STRATEGIC ALIGNMENT
Input: Business problem
Output: Business case
Guardrail: Detect — unjustified systems visible in governance reviews
│
STAGE 2: USE CASE DEFINITION
Input: Business case
Output: Ten-question use case definition
Guardrail: Prevent — ten questions steer toward completeness
│
STAGE 3: TOOL SELECTION
Input: Use case definition
Output: Technology decision (AI / RPA / traditional / hybrid)
Guardrail: Prevent — Use Case Filter steers to right tool
Exit: If not AI → standard SDLC
│
STAGE 4: RISK CLASSIFICATION
Input: Use case definition + technology decision
Output: Six-dimension scored risk profile + tier
Guardrail: Detect — unclassified systems visible in registry
│
STAGE 5: CONTROL DESIGN
Input: Risk profile + use case + technology
Output: Control specification + PACE plan
Guardrail: Prevent — approved platforms inherit baseline controls
│
STAGE 6: BUILD & TEST
Input: Control specification
Output: Working system with verified controls
Guardrail: Detect — checklist surfaces gaps before production
│
STAGE 7: DEPLOY & OPERATE
Input: Tested system
Output: Production system with active monitoring
Guardrail: Absorb — gradual rollout contains blast radius
│
STAGE 8: ONGOING GOVERNANCE
Input: Production system
Output: Continuous assurance
Guardrail: Detect — continuous monitoring surfaces drift
Loop: Periodic review → reassessment → control adjustment
Exit: Retirement when appropriate
How the Framework Maps to This Flow¶
| Stage | Primary Framework Documents |
|---|---|
| 1. Strategic Alignment | Business Alignment, The First Control |
| 2. Use Case Definition | Use Case Definition, Model Card Template |
| 3. Tool Selection | The First Control, Risk Tier Is Use Case |
| 4. Risk Classification | Risk Tiers, Control Selection Guide, Fast Lane |
| 5. Control Design | Controls, PACE Resilience, Threat Model Template |
| 6. Build & Test | Quick Start, Implementation Guide, Testing Guidance |
| 7. Deploy & Operate | Governance Operating Model, SOC Integration |
| 8. Ongoing Governance | Governance Operating Model, Anomaly Detection |
Where the Process Shortens¶
Not every initiative needs all eight stages at full depth.
| Scenario | Shortened Process |
|---|---|
| Fast Lane deployment | Stage 1 (brief) → Stage 2 (ten questions) → Stage 3 (confirm AI) → Stage 4 (self-certify Fast Lane) → Stage 6 (basic guardrails + logging + feature flag) → Stage 7 (deploy) → Stage 8 (annual review) |
| Vendor SaaS product | Stages 1–4 as normal → Stage 5 (map vendor controls to framework; identify gaps) → Stage 6 (configure, don't build) → Stages 7–8 as normal |
| Upgrading existing system | Skip Stage 1 (already justified) → Stage 2 (update definition with changes) → Stage 3 (already decided) → Stage 4 (reclassify) → Stages 5–7 (implement new controls) → Stage 8 (continue) |
| POC / Experiment | Stage 1 (brief) → Stage 2 (minimal) → Stage 3 (confirm AI) → Stage 4 (classify as LOW + time-bound) → Stage 6 (basic controls) → Stage 7 (limited deployment) → Fixed end date (no Stage 8 — either promote to full process or retire) |
Where the Framework Doesn't Cover This Flow¶
| Gap | What's Missing | Impact |
|---|---|---|
| No formal Stage 1 guidance | The framework doesn't help evaluate business cases | Organisations commit to AI without evaluating alternatives |
| No use case definition template | Risk tiers assume a defined use case but don't provide the definition format | Classification happens on incomplete information |
| No tool selection methodology | "AI or not AI?" is addressed in one insight article but not as a formal decision point | AI gets selected by default |
| No deployment guidance | Implementation guide covers tools, not deployment patterns | Organisations deploy CRITICAL systems without gradual rollout |
| No retirement process | The framework covers the system lifecycle but not end-of-life | Systems run indefinitely without reassessment |
This article and Use Case Definition fill these gaps. The flow defined here can be used as the operational process that connects the framework's components into a coherent lifecycle.
AI Runtime Behaviour Security, 2026 (Jonathan Gill).