
From Idea to Production to Ongoing Control

The end-to-end process: strategy → use case → tool selection → risk tiering → deployment → ongoing governance. One flow, no gaps.

Part of From Strategy to Production


Why This Matters

The framework has excellent depth in each domain — risk tiers, controls, PACE resilience, governance. But there's no single document that connects the entire lifecycle from "someone has an idea" to "the system is running safely in production and being continuously governed."

Without that connected flow, organisations experience gaps:

  • Ideas become deployments without passing through risk classification
  • Systems launch without controls because nobody triggered the governance process
  • Operating teams inherit systems without knowing what to monitor or when to escalate
  • Use cases evolve without triggering reassessment
  • Tools get selected before anyone asks whether AI is the right approach

This article defines the complete process. Every stage has a clear output that feeds the next stage. Every handoff has a named owner. Every decision point has criteria.


The End-to-End Flow

Idea to Production Flow

Eight stages. Each produces a defined output. Each has guardrails that prevent mistakes, detect gaps, and absorb failure if stages are rushed or skipped.

| Stage | Activity | Output | Guardrail |
|---|---|---|---|
| 1. Strategic Alignment | Is this worth doing? | Business case | Detect: systems without business cases surface in governance reviews |
| 2. Use Case Definition | What exactly will it do? | Completed use case definition | Prevent: ten questions steer toward complete definitions |
| 3. Tool Selection | Is AI the right approach? | Technology decision | Prevent: Use Case Filter steers to right tool early |
| 4. Risk Classification | What tier does this sit at? | Scored risk profile + tier | Detect: unclassified systems visible in registry |
| 5. Control Design | What controls does this tier need? | Control specification + PACE plan | Prevent: approved platforms inherit baseline controls |
| 6. Build & Test | Implement the system and controls | Working system with controls | Detect: pre-deployment checks surface gaps |
| 7. Deploy & Operate | Launch and run | Operating system with monitoring | Absorb: gradual rollout contains blast radius |
| 8. Ongoing Governance | Monitor, review, evolve | Continuous assurance | Detect: continuous monitoring surfaces drift |

Stage 1: Strategic Alignment

Owner: Business sponsor

Purpose: Determine whether this initiative is worth pursuing — before any technical work begins.

Inputs:

  • Business problem or opportunity
  • Strategic context (see Business Alignment)

Activities:

  • Define the business problem in measurable terms
  • Assess whether the problem justifies investment
  • Identify at least two alternative approaches (see Stage 3 below)
  • Estimate the value of solving it

Output: Business Case

| Field | Content |
|---|---|
| Problem statement | What's the problem, measured in current cost/impact? |
| Proposed approach | High-level solution concept |
| Expected value | Quantified benefit (cost reduction, revenue, efficiency) |
| Strategic alignment | How does this connect to organisational strategy? |
| Initial risk sense | Gut-level: is this low, medium, or high risk? |
| Sponsor | Named executive sponsor |

Guardrail: Systems that reach production without a business case become visible during governance reviews — they can't demonstrate value and generate monitoring noise. The environment doesn't block teams from exploring, but it makes unjustified investment visible.

What can go wrong here:

  • Skip this stage → technology investment without business justification
  • Vague problem statement → impossible to measure success later
  • No alternatives considered → commitment to AI before evaluating options


Stage 2: Use Case Definition

Owner: Business owner + AI engineer (collaborative)

Purpose: Translate the business case into a specific, assessable use case definition.

Inputs: Business case from Stage 1

Activities:

  • Complete the ten-question use case definition
  • Define explicit positive and negative scope
  • Identify data requirements and access needs
  • Determine user population and expected volume
  • Identify regulatory context
  • Name the accountable business owner

Output: Completed Use Case Definition

The full template from Use Case Definition. All ten questions answered. No "TBD" in critical fields.

Guardrail: The ten questions are the preventive control — they steer teams toward completeness. If fields are left as "TBD," downstream controls will be misconfigured and monitoring will surface the mismatch. Review by business owner, legal/compliance, and data owner improves quality but isn't a hard stop — incomplete definitions reveal themselves in operation.

What can go wrong here:

  • Incomplete definition → uncertain risk tier → wrong controls
  • Negative scope missing → guardrails can't enforce boundaries
  • Understated decision authority → system classified too low
  • "TBD" in regulatory context → compliance surprise at launch


Stage 3: Tool Selection

Owner: Technical lead + business owner

Purpose: Determine whether AI is the right tool — and if so, what kind.

This stage explicitly evaluates alternatives. The framework's first control is choosing the right tool.

Inputs: Completed use case definition from Stage 2

Activities:

The Tool Selection Decision Tree


| Question | If Yes | If No |
|---|---|---|
| Can this be solved with deterministic rules? | Use rules engine, workflow automation, or traditional code | Continue |
| Does it require understanding unstructured input (natural language, images)? | AI is likely appropriate | Consider RPA or structured automation |
| Does it require pattern recognition across large datasets? | AI/ML is likely appropriate | Consider traditional analytics |
| Does it need to generate novel content or responses? | Generative AI is appropriate | Consider retrieval + templating |
| Does the use case require real-time, non-deterministic reasoning? | LLM-based AI is appropriate | Consider traditional ML models |
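As a rough illustration, the tree can be linearized into code, checking the most specific capability first. The function name and answer keys below are invented for this sketch, not framework terms:

```python
# Illustrative linearization of the tool selection decision tree.
# The dictionary keys are shorthand for the five questions above.

def select_tool(answers: dict) -> str:
    """Return a tool family for the use case; most specific capability wins."""
    if answers.get("deterministic_rules"):
        return "rules engine / workflow automation / traditional code"
    if answers.get("nondeterministic_reasoning"):
        return "LLM-based AI"
    if answers.get("generates_novel_content"):
        return "generative AI"
    if answers.get("pattern_recognition"):
        return "traditional ML / analytics"
    if answers.get("unstructured_input"):
        return "AI (LLM or ML, depending on the task)"
    # Nothing demands AI capabilities: structured automation suffices.
    return "RPA / structured automation"
```

The ordering is a design choice for the sketch: a real decision would weigh the questions together rather than stop at the first match.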

The Five Options

| Option | When To Use | Risk Profile | Framework Implication |
|---|---|---|---|
| Traditional software | Deterministic logic, bounded inputs, exact outputs needed | Lowest — existing SDLC applies | Outside framework scope |
| RPA / workflow automation | Structured, repeatable processes; UI-based integration | Low — deterministic, auditable | Outside framework scope |
| Traditional ML | Pattern recognition on structured data; classification, regression | Low–Medium — predictable, testable | Partial framework (monitoring, bias) |
| LLM / Generative AI | Unstructured input, natural language, content generation | Medium–Critical (depends on use case) | Full framework applies |
| Multi-agent AI | Complex workflows requiring multiple AI components collaborating | High–Critical | Full framework + MASO |

The Hybrid Reality

Most real-world solutions are hybrid. A customer service system might use:

  • Traditional code for authentication and session management
  • Rules engine for routing queries to the right department
  • LLM for understanding the customer's intent and drafting responses
  • Traditional database for account lookups
  • Deterministic logic for executing any account actions

The framework applies to the AI components. The risk tier is determined by what the AI does, not by the entire system.

Key principle from The First Control: "AI proposes. Deterministic systems dispose." Wherever possible, use AI for cognition (understanding, drafting, recommending) and deterministic systems for action (executing, committing, approving). This naturally constrains the AI's blast radius and often reduces the risk tier.
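A minimal sketch of that split, using an invented refund example. The action whitelist, the limit, and the function names are all illustrative; the point is that everything the AI proposes is treated as untrusted input to a deterministic gate:

```python
# "AI proposes. Deterministic systems dispose." — illustrative sketch.
# The proposal dict stands in for an LLM's structured output; the
# deterministic layer validates and bounds it before any action runs.

ALLOWED_ACTIONS = {"refund", "escalate"}
MAX_REFUND = 50.00  # hard policy limit enforced outside the model

def dispose(proposal: dict) -> str:
    """Deterministic gate: validate the AI's proposal, then act or reject."""
    action = proposal.get("action")
    if action not in ALLOWED_ACTIONS:
        return "rejected: unknown action"
    if action == "refund":
        amount = proposal.get("amount", 0)
        if not (0 < amount <= MAX_REFUND):
            return "rejected: amount outside policy"
        return f"refund {amount:.2f} executed"
    return "escalated to human"
```

Because the gate, not the model, holds the whitelist and the limit, a bad proposal degrades to a rejection rather than an irreversible action.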

Output: Technology Decision

| Field | Content |
|---|---|
| Selected approach | AI, RPA, traditional, or hybrid (specify which components are AI) |
| Justification | Why this approach over alternatives |
| AI components | If hybrid, which parts use AI and which don't |
| AI type | LLM, traditional ML, multi-agent, or combination |
| Platform/provider | Managed service, self-hosted, vendor product |
| Risk implication | How tool selection affects risk tier |

Guardrail: The Use Case Filter is the preventive control — it steers teams to the right tool before investment begins. If a team skips it and builds AI where rules would suffice, the overhead becomes visible in operation: unnecessary guardrail tuning, Judge findings on deterministic tasks, governance cost that simpler tools wouldn't generate. If the decision is "not AI," the initiative exits to standard SDLC.


Stage 4: Risk Classification

Owner: Risk analyst (2nd line)

Purpose: Formally classify the risk tier using the framework's six-dimension scoring model.

Inputs:

  • Completed use case definition (Stage 2)
  • Technology decision (Stage 3)

Activities:

  • Score each dimension (Decision Authority, Reversibility, Data Sensitivity, Audience, Scale, Regulatory)
  • Apply scoring rules (highest dimension wins; adjacent HIGHs compound)
  • Apply use case modifiers (agentic, customer-facing, regulated, batch)
  • Check Fast Lane qualification (all four criteria met → Fast Lane)
  • Document the classification with justification per dimension
  • For AI-assisted classification, review the AI's proposed scores (see Use Case Definition)

Output: Scored Risk Profile

| Dimension | Score | Justification |
|---|---|---|
| Decision Authority | e.g., HIGH | AI recommendations directly shape fraud investigation priority |
| Reversibility | e.g., MEDIUM | Incorrect prioritisation is recoverable but may delay detection |
| Data Sensitivity | e.g., CRITICAL | Processes transaction data including cardholder PII |
| Audience | e.g., MEDIUM | Internal fraud analysts |
| Scale | e.g., HIGH | 80,000 transactions/day |
| Regulatory | e.g., HIGH | PCI-DSS, banking regulations |
| Overall Tier | CRITICAL | Data sensitivity drives the tier |
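The scoring rules fit in a few lines. Note that the compounding step below (two or more HIGHs escalate the tier one level) is one plausible reading of "adjacent HIGHs compound", not necessarily the framework's exact rule:

```python
# Illustrative tier scoring: highest dimension wins, and multiple
# HIGHs compound upward. The compounding rule here is an assumption.

LEVELS = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]

def overall_tier(scores: dict) -> str:
    """Combine per-dimension scores into an overall tier."""
    idx = max(LEVELS.index(s) for s in scores.values())  # highest wins
    highs = sum(1 for s in scores.values() if s == "HIGH")
    if highs >= 2 and idx < LEVELS.index("CRITICAL"):
        idx += 1  # compounding: several HIGHs escalate one level
    return LEVELS[idx]
```

On the worked fraud example above, Data Sensitivity at CRITICAL already sets the ceiling, so the compounding step changes nothing.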

Guardrail: Unclassified systems are visible in the use case registry — they stand out because they have no tier, no controls, and no monitoring baseline. For Fast Lane, teams self-certify. For MEDIUM, a risk analyst reviews. For HIGH/CRITICAL, the governance committee reviews. The classification process is lightweight enough that skipping it costs more than doing it.

What can go wrong here:

  • Optimistic scoring → system under-controlled
  • No governance approval → classification has no authority
  • Dimension ambiguity not investigated → hidden risk
  • Fast Lane self-certification when criteria aren't clearly met → under-governed system


Stage 5: Control Design

Owner: Security architect + AI governance

Purpose: Translate the risk tier into a specific control specification for this system.

Inputs:

  • Scored risk profile (Stage 4)
  • Use case definition (Stage 2)
  • Technology decision (Stage 3)

Activities:

  • Select controls from the control matrix based on tier
  • Apply modifiers from the control selection guide
  • Design the PACE resilience plan — Primary, Alternate, Contingency, Emergency states
  • Specify guardrail configuration (what to block, what to allow)
  • Define Judge evaluation criteria (what "good" and "bad" look like for this use case)
  • Specify HITL requirements (who reviews, SLA, escalation path)
  • Size operational requirements (HITL staff, Judge compute, log storage)
  • If agentic: specify tool access controls, sandbox boundaries, delegation limits
  • If multi-agent: apply MASO controls at the appropriate tier

Output: Control Specification

| Control Area | Specification |
|---|---|
| Guardrails — Input | Topic rules, injection detection, PII detection, rate limiting (specific config) |
| Guardrails — Output | Content filtering, PII handling, confidence thresholds (specific config) |
| Judge | Evaluation criteria, sampling rate, escalation rules, Judge model selection |
| HITL | Reviewer role, SLA, escalation path, review criteria |
| PACE | P/A/C/E states with transition triggers, fallback process, kill switch |
| Logging | Content scope, retention period, access controls |
| Monitoring | Dashboards, alerts, anomaly thresholds |
| Incident response | Playbook reference, severity mapping, notification requirements |
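The input-guardrail row might translate into a check pipeline along these lines. The patterns and rate limit here are deliberately naive placeholders; a real deployment would use dedicated detection services:

```python
# Illustrative input-guardrail pipeline: first failing check blocks.
# The regexes and the 60 req/min limit are toy placeholders.
import re

def check_input(text: str, requests_this_minute: int) -> tuple[bool, str]:
    """Return (allowed, reason) for an incoming request."""
    if requests_this_minute > 60:  # rate limiting
        return False, "rate limit exceeded"
    if re.search(r"ignore (all )?previous instructions", text, re.IGNORECASE):
        return False, "possible prompt injection"  # naive signature check
    if re.search(r"\b\d{16}\b", text):  # crude card-number/PII pattern
        return False, "PII detected"
    return True, "ok"
```

The order matters operationally: cheap checks (rate limits) run before pattern matching, and the returned reason feeds the block-rate metrics that Stage 8 monitors.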

Guardrail: Teams building on approved platforms inherit baseline controls automatically — logging, monitoring, and standard guardrails come with the platform. The control specification adds use-case-specific configuration on top. Review by security architect, governance, and business owner strengthens the design, but the platform defaults mean even a rushed deployment starts with basic protection.


Stage 6: Build and Test

Owner: Engineering team

Purpose: Implement the system and its controls, and verify they work.

Inputs:

  • Control specification (Stage 5)
  • Technology decision (Stage 3)

Activities:

  • Build the AI system (model integration, data pipelines, UI)
  • Implement guardrails per specification
  • Configure Judge evaluation (prompts, sampling, routing)
  • Set up HITL workflows and queues
  • Configure logging and monitoring
  • Implement PACE transitions (feature flag, fallback activation)
  • Test against the testing guidance
  • Run the pre-deployment checklist

Pre-Deployment Checklist:

| Check | Verified By | Status |
|---|---|---|
| Use case definition matches implementation | Business owner | |
| Risk tier is current (no scope changes during build) | Risk analyst | |
| Input guardrails active and tested | Security | |
| Output guardrails active and tested | Security | |
| Judge evaluation configured and tested (shadow mode) | Security/QA | |
| HITL workflow functional; reviewers trained | Operations | |
| PACE transitions tested (feature flag, fallback) | Engineering | |
| Logging captures required data at required retention | Engineering | |
| Monitoring dashboards and alerts configured | Operations | |
| Incident response playbook exists and is known | Operations | |
| Manual fallback process documented and tested | Business owner | |
| Kill switch operational | Engineering | |
| Regulatory/compliance sign-off obtained | Legal/Compliance | |

Guardrail: The pre-deployment checklist is a detective control — it surfaces gaps before they reach production. Items that aren't verified generate findings, not blockers. For HIGH/CRITICAL systems, the governance committee reviews before go-live. For lower tiers, the checklist serves as the team's own quality signal. The feature flag and PACE plan mean a deployment that discovers problems can be rolled back quickly.

What can go wrong here:

  • Controls implemented but not tested → false confidence
  • Judge in shadow mode never switches to active → no detection
  • HITL reviewers assigned but not trained → Human Factors failure
  • PACE plan documented but transitions never tested → plan doesn't work under pressure
  • Scope changed during build, risk tier not reassessed → running at wrong tier
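The PACE transitions that the checklist exercises can be sketched as a small state machine. The trigger conditions below (error-rate threshold, provider health) are invented for the example, and recovery back toward Primary is deliberately omitted:

```python
# Illustrative PACE degradation logic. State names come from the PACE
# plan; the 5% error threshold and the trigger inputs are assumptions.

PACE = ["PRIMARY", "ALTERNATE", "CONTINGENCY", "EMERGENCY"]

def next_state(current: str, error_rate: float, provider_up: bool,
               kill_switch: bool) -> str:
    """Degrade one state at a time; the kill switch jumps to EMERGENCY."""
    if kill_switch:
        return "EMERGENCY"
    if not provider_up or error_rate > 0.05:
        idx = PACE.index(current)
        return PACE[min(idx + 1, len(PACE) - 1)]
    return current  # healthy: hold the current state
```

Testing this transition path before go-live is exactly what the checklist's "PACE transitions tested" item verifies; a documented plan whose `next_state` has never fired is the failure mode listed above.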


Stage 7: Deploy and Operate

Owner: Technical operations + business owner

Purpose: Launch the system and transition to steady-state operations.

Inputs: Tested system with verified controls (Stage 6)

Activities:

  • Deploy to production (gradual rollout for HIGH/CRITICAL)
  • Activate Judge evaluation (move from shadow to active)
  • Begin HITL operations
  • Monitor control effectiveness
  • Tune guardrails based on initial false positive/negative data
  • Calibrate Judge accuracy against HITL decisions
  • Verify logging and alerting in production

Deployment Pattern by Tier:

| Tier | Deployment Approach | Rationale |
|---|---|---|
| Fast Lane | Ship it | Low risk; feature flag is the safety net |
| LOW | Standard release | Basic monitoring sufficient |
| MEDIUM | Canary or staged rollout | Monitor Judge findings before full traffic |
| HIGH | Gradual rollout with enhanced monitoring | Watch for unexpected patterns at scale |
| CRITICAL | Phased rollout with governance checkpoints | Each phase reviewed before expansion |
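A staged-rollout gate might look like this sketch, expanding traffic only while the Judge finding rate stays healthy. The stage percentages and the 2% threshold are assumptions for the example:

```python
# Illustrative staged-rollout gate: advance while healthy, retreat on
# elevated Judge findings. Stages and threshold are placeholder values.

STAGES = [1, 5, 25, 100]  # percent of traffic receiving the new system

def next_traffic_share(current_pct: int, judge_finding_rate: float,
                       threshold: float = 0.02) -> int:
    """Move one stage forward if healthy, one stage back if not."""
    idx = STAGES.index(current_pct)
    if judge_finding_rate > threshold:
        return STAGES[max(idx - 1, 0)]        # absorb: shrink blast radius
    return STAGES[min(idx + 1, len(STAGES) - 1)]
```

For CRITICAL systems, each forward step here would additionally require a governance checkpoint rather than advancing automatically.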

First 30 Days:

| Activity | When | Owner |
|---|---|---|
| Daily guardrail effectiveness review | Day 1–14 | Security |
| Daily Judge finding review | Day 1–30 | Operations |
| HITL SLA compliance check | Daily | Governance |
| False positive rate assessment | Day 7, 14, 30 | Security |
| Judge accuracy calibration | Day 14, 30 | Operations |
| Operational review with business owner | Day 7, 14, 30 | All |
| First PACE transition test | Day 30 | Engineering |

Guardrail: Gradual rollout is the absorb control — it contains the blast radius of unexpected behaviour. The first 30 days of monitoring generate the baseline that ongoing governance uses. If calibration reveals problems, the deployment can be paused or rolled back without affecting the full user population. Operational handover to the steady-state team happens when monitoring confirms stability, not on a fixed schedule.


Stage 8: Ongoing Governance

Owner: AI governance function (2nd line) + business owner (1st line)

Purpose: Continuously assure that the system operates within its defined risk profile and that the risk profile remains current.

Inputs:

  • Operating system from Stage 7
  • Use case definition (maintained as a living document)

Activities — Continuous:

| Activity | Frequency | Owner | Output |
|---|---|---|---|
| Guardrail effectiveness monitoring | Real-time | Technical ops | Block rates, false positive rates |
| Judge finding triage | Daily | Operations | Escalations, patterns |
| HITL SLA monitoring | Daily | Governance | Compliance reports |
| Anomaly detection | Continuous | Security/SOC | Alerts on drift |
| Usage monitoring | Weekly | Operations | Volume trends, user patterns |

Activities — Periodic:

| Activity | Frequency | Owner | Output |
|---|---|---|---|
| Judge accuracy calibration | Weekly (HIGH/CRITICAL), monthly (MEDIUM) | Technical ops | Calibration adjustments |
| Control effectiveness review | Quarterly | Governance | Effectiveness report |
| Use case reassessment | Annual minimum; triggered by changes | Risk analyst | Updated risk profile |
| PACE transition test | Quarterly (CRITICAL), semi-annual (HIGH), annual (MEDIUM/LOW) | Engineering | Test results |
| Manual fallback exercise | Semi-annual | Business owner | Fallback verified |
| Regulatory alignment check | Annual + on regulatory change | Legal/Compliance | Compliance status |
| Human factors assessment | Annual | Governance | Reviewer competence, deskilling check |

Activities — Event-Driven:

| Trigger | Activity | Owner |
|---|---|---|
| AI incident | Incident playbook activation | Incident team |
| Scope change request | Use case reassessment → possible reclassification | Business owner + risk |
| Model change | Control configuration review | Security |
| Data access change | Data sensitivity reassessment | Risk + data owner |
| Regulatory change | Compliance impact assessment | Legal + governance |
| Volume threshold breach | Operational sizing review | Operations |
| Judge accuracy drop | Recalibration or investigation | Technical ops |
| HITL SLA breach | Root cause analysis | Governance |

The Governance Dashboard

What the governance committee needs to see:

| Metric | Source | Frequency | Target |
|---|---|---|---|
| Systems by tier | Use case registry | Monthly | Complete coverage |
| Control implementation % | Control tracking | Monthly | 100% |
| HITL SLA compliance | Queue metrics | Monthly | >95% |
| Judge accuracy | Calibration data | Monthly | >80% agreement with HITL |
| Open escalations | Escalation log | Monthly | Trending down |
| Incidents by severity | Incident log | Monthly | Trending down |
| False positive rate | Guardrail metrics | Monthly | <5% |
| Use cases overdue for review | Registry | Monthly | 0 |
| Shadow AI discovered | Discovery tools | Monthly | Trending down |
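The Judge accuracy metric (>80% agreement with HITL) is simply the match rate between Judge verdicts and human decisions on the same sampled items. A minimal sketch, with invented verdict labels:

```python
# Illustrative Judge-vs-HITL calibration: fraction of sampled items
# where the Judge's verdict matches the human reviewer's decision.

def judge_agreement(judge_verdicts: list[str], hitl_verdicts: list[str]) -> float:
    """Return the agreement rate over paired verdicts (0.0 if no samples)."""
    if not judge_verdicts:
        return 0.0
    matches = sum(j == h for j, h in zip(judge_verdicts, hitl_verdicts))
    return matches / len(judge_verdicts)
```

A rate drifting below the 80% target is the "Judge accuracy drop" trigger in the event-driven table, prompting recalibration or investigation.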

When to Stop

Systems should be retired when:

  • The business case no longer holds
  • The risk exceeds the organisation's appetite and can't be reduced
  • A better solution exists (AI or otherwise)
  • Regulatory changes make the use case non-viable
  • The organisation can no longer safely operate the controls

Retirement process:

  1. Governance approves retirement
  2. Users notified with timeline
  3. Manual fallback activated permanently
  4. Data retention obligations confirmed
  5. System decommissioned
  6. Use case moved to "Retired" in registry
  7. Post-retirement review documented (lessons learned)


The Complete Lifecycle — Summary

STAGE 1: STRATEGIC ALIGNMENT
  Input:     Business problem
  Output:    Business case
  Guardrail: Detect — unjustified systems visible in governance reviews
     │
STAGE 2: USE CASE DEFINITION
  Input:     Business case
  Output:    Ten-question use case definition
  Guardrail: Prevent — ten questions steer toward completeness
     │
STAGE 3: TOOL SELECTION
  Input:     Use case definition
  Output:    Technology decision (AI / RPA / traditional / hybrid)
  Guardrail: Prevent — Use Case Filter steers to right tool
  Exit:      If not AI → standard SDLC
     │
STAGE 4: RISK CLASSIFICATION
  Input:     Use case definition + technology decision
  Output:    Six-dimension scored risk profile + tier
  Guardrail: Detect — unclassified systems visible in registry
     │
STAGE 5: CONTROL DESIGN
  Input:     Risk profile + use case + technology
  Output:    Control specification + PACE plan
  Guardrail: Prevent — approved platforms inherit baseline controls
     │
STAGE 6: BUILD & TEST
  Input:     Control specification
  Output:    Working system with verified controls
  Guardrail: Detect — checklist surfaces gaps before production
     │
STAGE 7: DEPLOY & OPERATE
  Input:     Tested system
  Output:    Production system with active monitoring
  Guardrail: Absorb — gradual rollout contains blast radius
     │
STAGE 8: ONGOING GOVERNANCE
  Input:     Production system
  Output:    Continuous assurance
  Guardrail: Detect — continuous monitoring surfaces drift
  Loop:      Periodic review → reassessment → control adjustment
  Exit:      Retirement when appropriate

How the Framework Maps to This Flow

| Stage | Primary Framework Documents |
|---|---|
| 1. Strategic Alignment | Business Alignment, The First Control |
| 2. Use Case Definition | Use Case Definition, Model Card Template |
| 3. Tool Selection | The First Control, Risk Tier Is Use Case |
| 4. Risk Classification | Risk Tiers, Control Selection Guide, Fast Lane |
| 5. Control Design | Controls, PACE Resilience, Threat Model Template |
| 6. Build & Test | Quick Start, Implementation Guide, Testing Guidance |
| 7. Deploy & Operate | Governance Operating Model, SOC Integration |
| 8. Ongoing Governance | Governance Operating Model, Anomaly Detection |

Where the Process Shortens

Not every initiative needs all eight stages at full depth.

| Scenario | Shortened Process |
|---|---|
| Fast Lane deployment | Stage 1 (brief) → Stage 2 (ten questions) → Stage 3 (confirm AI) → Stage 4 (self-certify Fast Lane) → Stage 6 (basic guardrails + logging + feature flag) → Stage 7 (deploy) → Stage 8 (annual review) |
| Vendor SaaS product | Stages 1–4 as normal → Stage 5 (map vendor controls to framework; identify gaps) → Stage 6 (configure, don't build) → Stages 7–8 as normal |
| Upgrading existing system | Skip Stage 1 (already justified) → Stage 2 (update definition with changes) → Stage 3 (already decided) → Stage 4 (reclassify) → Stages 5–7 (implement new controls) → Stage 8 (continue) |
| POC / Experiment | Stage 1 (brief) → Stage 2 (minimal) → Stage 3 (confirm AI) → Stage 4 (classify as LOW + time-bound) → Stage 6 (basic controls) → Stage 7 (limited deployment) → Fixed end date (no Stage 8 — either promote to full process or retire) |

Where the Framework Doesn't Cover This Flow

| Gap | What's Missing | Impact |
|---|---|---|
| No formal Stage 1 guidance | The framework doesn't help evaluate business cases | Organisations commit to AI without evaluating alternatives |
| No use case definition template | Risk tiers assume a defined use case but don't provide the definition format | Classification happens on incomplete information |
| No tool selection methodology | "AI or not AI?" is addressed in one insight article but not as a formal decision point | AI gets selected by default |
| No deployment guidance | Implementation guide covers tools, not deployment patterns | Organisations deploy CRITICAL systems without gradual rollout |
| No retirement process | The framework covers the system lifecycle but not end-of-life | Systems run indefinitely without reassessment |

This article and Use Case Definition fill these gaps. The flow defined here can be used as the operational process that connects the framework's components into a coherent lifecycle.


AI Runtime Behaviour Security, 2026 (Jonathan Gill).