Progression

Moving from low-risk to high-risk AI — and why skipping steps is the most common strategic failure.

Part of From Strategy to Production

The Maturity Trap

Organisations want to be at Tier 3. Autonomous agents making decisions, driving efficiency, transforming operations. The business case for full autonomy is always more compelling than the business case for decision support.

So they skip. They go from no AI directly to autonomous AI. Sometimes it works — usually because the specific use case is forgiving and the data is good. More often it fails, and the failure is expensive, visible, and corrosive to future AI investment.

The framework provides a progression path. This article explains why it exists, how to use it, and where it breaks down.

The Progression Model

Figure: Strategy Progression Model

The framework supports five positions on a progression from no AI to autonomous AI:

Position	Framework Equivalent	What the AI Does	Who Decides
No AI	Pre-framework	Nothing — manual processes	Humans
Assisted	Fast Lane	AI drafts, suggests, summarises	Human decides everything; AI is a tool
Supported	Tier 1 (LOW/MEDIUM)	AI recommends; human acts	Human decides, informed by AI
Supervised	Tier 2 (HIGH)	AI acts on routine cases; human handles exceptions	AI decides routine; human decides complex
Autonomous	Tier 3 (CRITICAL)	AI acts independently; human monitors	AI decides; human oversees

Each step up increases value. Each step up also increases risk, control requirements, skill requirements, and operational complexity. The progression is designed so that each step builds the capability needed for the next.
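The ordering in the table above can be captured in a small data structure, which makes "skipping steps" something you can check rather than argue about. This is a minimal sketch: the names `Position`, `PositionInfo`, `PROGRESSION`, and `skipped_steps` are hypothetical and not part of the framework.

```python
from dataclasses import dataclass
from enum import IntEnum


class Position(IntEnum):
    """The five positions, ordered from least to most autonomous."""
    NO_AI = 0
    ASSISTED = 1      # Fast Lane
    SUPPORTED = 2     # Tier 1 (LOW/MEDIUM)
    SUPERVISED = 3    # Tier 2 (HIGH)
    AUTONOMOUS = 4    # Tier 3 (CRITICAL)


@dataclass(frozen=True)
class PositionInfo:
    framework_equivalent: str
    who_decides: str


# Values copied from the progression table; the structure itself is illustrative.
PROGRESSION = {
    Position.NO_AI: PositionInfo("Pre-framework", "Humans"),
    Position.ASSISTED: PositionInfo("Fast Lane", "Human decides everything"),
    Position.SUPPORTED: PositionInfo("Tier 1", "Human decides, informed by AI"),
    Position.SUPERVISED: PositionInfo("Tier 2", "AI decides routine; human decides complex"),
    Position.AUTONOMOUS: PositionInfo("Tier 3", "AI decides; human oversees"),
}


def skipped_steps(current: Position, target: Position) -> list[Position]:
    """Return the positions bypassed by moving directly from current to target."""
    return [Position(value) for value in range(current + 1, target)]


if __name__ == "__main__":
    for step in skipped_steps(Position.NO_AI, Position.AUTONOMOUS):
        print(f"Skipped: {step.name} ({PROGRESSION[step].framework_equivalent})")
```

An empty result means the move is a single step; anything else lists the capability-building stages that would be bypassed.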

What Each Step Builds

Step 1: No AI → Assisted (Fast Lane)

What you deploy: Internal tools — drafting assistants, search helpers, summarisation tools. Fast Lane criteria apply: internal users, read-only, no regulated data, human reviews everything.

What you learn:
- How to deploy an AI system (engineering capability)
- How to configure basic guardrails (security capability)
- How users actually interact with AI (usage patterns)
- Where AI output is good enough and where it isn't (quality baseline)
- How to operate a feature flag and fallback process (operational basics)

What you build:
- AI engineering skills in the development team
- Guardrail configuration experience in security
- Usage data that informs future Judge criteria
- Organisational comfort with AI as a tool

Controls required: Basic guardrails, usage logging, feature flag, known fallback. That's it.
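As a rough illustration of how small this control set is, the sketch below wires together three of the four controls: a feature flag, usage logging, and a known fallback. All names (`summarise`, `manual_fallback`, `usage_log`) are hypothetical, the AI call is passed in as a plain function, and guardrails are left out for brevity.

```python
from typing import Callable


def manual_fallback(text: str) -> str:
    """Known fallback: hand the task back to the user untouched."""
    return f"[AI unavailable: please summarise manually]\n{text}"


def summarise(
    text: str,
    ai_summarise: Callable[[str], str],
    flag_enabled: bool,
    usage_log: list[dict],
) -> str:
    """Route through the AI only when the flag is on; fall back on any failure."""
    if not flag_enabled:
        usage_log.append({"path": "fallback", "reason": "flag_off"})
        return manual_fallback(text)
    try:
        draft = ai_summarise(text)
    except Exception as exc:  # broad on purpose: any AI failure triggers the fallback
        usage_log.append({"path": "fallback", "reason": type(exc).__name__})
        return manual_fallback(text)
    usage_log.append({"path": "ai", "input_chars": len(text), "output_chars": len(draft)})
    return draft
```

The usage log collected here is the raw material for the "usage data that informs future Judge criteria" mentioned above.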

Duration: 1-3 months to first deployment. 3-6 months to establish patterns.

The mistake: Staying here too long. "Innovation theatre" — endless pilots that never progress to value. If Fast Lane deployments don't lead to Tier 1 within 6-12 months, something is blocking progression.

Step 2: Assisted → Supported (Tier 1)

What you deploy: AI that recommends actions to internal users, or AI that handles structured tasks with human review. May start handling non-sensitive external interactions (public FAQ, basic routing).

What you learn:
- How to implement and operate a Judge (quality evaluation)
- False positive rates and how to tune controls
- What human reviewers need to do their job effectively
- How to classify risk (first formal risk assessment)
- How the AI performs on real data over time (drift detection baseline)

What you build:
- Judge evaluation capability (the biggest new skill)
- Risk classification process (governance foundation)
- Human review process (HITL basics)
- Operational metrics and monitoring (performance baseline)
- Incident response for AI (first playbook)

Controls required: Standard guardrails, Judge evaluation (5-10% sampling), periodic HITL, logging with 1-year retention.

Duration: 3-6 months to first deployment. 6-12 months to operational maturity.

The mistake: Treating Judge as a checkbox. Deploying a Judge that nobody monitors or calibrates. The Judge is only as good as its criteria and calibration. If nobody is reviewing Judge accuracy, you don't have a Judge — you have a log generator.
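Reviewing Judge accuracy can start as simply as comparing Judge verdicts with human verdicts on the same sampled outputs. The sketch below assumes each reviewed output yields a pair `(judge_flagged, human_flagged)` and treats the human verdict as ground truth. The names `JudgeCalibration` and `calibrate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JudgeCalibration:
    detection_rate: float       # share of real problems the Judge flagged
    false_positive_rate: float  # share of acceptable outputs the Judge flagged
    sample_size: int


def calibrate(samples: list[tuple[bool, bool]]) -> JudgeCalibration:
    """Each sample is (judge_flagged, human_flagged) for one reviewed output."""
    tp = sum(1 for judge, human in samples if judge and human)
    fn = sum(1 for judge, human in samples if not judge and human)
    fp = sum(1 for judge, human in samples if judge and not human)
    tn = sum(1 for judge, human in samples if not judge and not human)
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one problem case and one acceptable case")
    return JudgeCalibration(
        detection_rate=tp / (tp + fn),
        false_positive_rate=fp / (fp + tn),
        sample_size=len(samples),
    )
```

Tracking these two numbers over time is one way to tell a calibrated Judge from a log generator.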

Step 3: Supported → Supervised (Tier 2)

What you deploy: Customer-facing AI with real capability — customer service with account access, document processing with outputs going to external parties, decision support with significant business impact. The AI handles routine cases; humans handle exceptions and high-impact decisions.

What changes:
- External exposure. AI outputs reach customers. Errors have reputational and regulatory consequences.
- Data sensitivity. The AI accesses customer data. Privacy controls become mandatory.
- Decision influence. AI recommendations significantly influence outcomes. Automation bias becomes a real risk.
- Control intensity. Judge evaluation increases to 20-50%. Human review SLAs tighten to 4 hours for HIGH flags.

What you learn:
- How to manage AI at scale with external users
- How to handle AI incidents that affect customers
- How to maintain controls under production pressure
- Whether your human review process actually catches problems
- Where the framework's controls feel too heavy or too light for your context

What you build:
- Dedicated AI governance function (or at least a role within governance)
- Mature Judge operations (weekly calibration, accuracy tracking)
- Trained HITL reviewers with domain expertise
- AI incident response capability
- Regulatory engagement capability (explaining AI to regulators)

Controls required: Full guardrails, 20-50% Judge evaluation, HITL with 4-hour SLA, enhanced logging with 3-year retention.

Duration: 6-12 months to first deployment. 12-18 months to operational maturity.

The mistake: Skipping Tier 1. Organisations that go directly from Fast Lane to Tier 2 lack Judge operations experience, HITL maturity, and risk classification confidence. Everything is new simultaneously. Errors are externally visible.

Step 4: Supervised → Autonomous (Tier 3)

What you deploy: AI that makes independent decisions with real-world impact — automated lending decisions, autonomous trading within parameters, clinical decision support without mandatory clinician review, or multi-agent systems that plan and execute without human approval for routine actions.

What changes:
- Decisions are AI-made. The human is a monitor, not a decision-maker.
- Errors are direct. No human buffer between AI decision and real-world consequence.
- Regulatory exposure is maximum. Every regulated decision requires demonstrable controls.
- Control overhead is highest. 100% Judge evaluation, 1-hour review SLAs, 7-year immutable logs.
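The control figures quoted for Tiers 1 to 3 (Judge sampling rate, HITL review SLA, log retention) can be held in one table and used to decide which interactions go to the Judge. This is a sketch under assumptions: `TierControls`, `CONTROLS`, and `should_judge` are hypothetical names, and hashing the interaction ID is one possible way to sample reproducibly, not something the framework prescribes.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TierControls:
    judge_sampling: tuple[float, float]  # (min, max) share of interactions evaluated
    hitl_sla_hours: Optional[int]        # None = periodic review, no SLA stated
    log_retention_years: int
    immutable_logs: bool


# Figures taken from the "Controls required" and "What changes" text for each tier.
CONTROLS = {
    "TIER_1": TierControls((0.05, 0.10), None, 1, False),
    "TIER_2": TierControls((0.20, 0.50), 4, 3, False),
    "TIER_3": TierControls((1.0, 1.0), 1, 7, True),
}


def should_judge(interaction_id: str, tier: str, use_max: bool = False) -> bool:
    """Deterministically sample interactions at the tier's Judge evaluation rate."""
    low, high = CONTROLS[tier].judge_sampling
    rate = high if use_max else low
    digest = hashlib.sha256(interaction_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform value in [0, 1)
    return bucket < rate
```

Because the bucket depends only on the ID, the same interaction always gets the same answer, and raising the rate only adds interactions to the sample rather than reshuffling it.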

Prerequisites (non-negotiable):

Prerequisite	What It Means	How to Verify
Mature Judge with calibrated accuracy	You know your Judge's detection rate and false positive rate	6+ months of Judge accuracy data
Proven HITL process	Reviewers are trained, calibrated, and meeting SLAs	3+ months of SLA compliance data
Documented PACE plan	You know exactly what happens when each control fails	PACE plan tested through at least desktop exercise
AI governance committee	A body with authority to approve, modify, and stop AI systems	Meeting minutes showing active governance
Regulatory engagement	Regulators are aware and have no objections	Documented regulatory interaction
Incident response tested	You've responded to an AI incident (real or simulated)	Post-incident review documented
Kill switch operational	You can stop the AI within minutes	Tested quarterly
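The prerequisites table reads naturally as a deployment gate: collect the evidence, list what is missing, and do not proceed while the list is non-empty. The sketch below is one way to express that. `Tier3Evidence` and `unmet_prerequisites` are hypothetical names, and "tested quarterly" is approximated as "tested within the last 92 days", which is an assumption.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Tier3Evidence:
    judge_accuracy_months: int               # months of Judge accuracy data
    hitl_sla_compliance_months: int          # months of SLA compliance data
    pace_plan_exercised: bool                # at least a desktop exercise
    governance_minutes_on_file: bool
    regulatory_interaction_documented: bool
    post_incident_review_documented: bool    # real or simulated incident
    last_kill_switch_test: Optional[date]


def unmet_prerequisites(e: Tier3Evidence, today: date) -> list[str]:
    """Return the names of prerequisites not yet evidenced, in table order."""
    kill_switch_ok = (
        e.last_kill_switch_test is not None
        and today - e.last_kill_switch_test <= timedelta(days=92)  # assumed "quarterly"
    )
    checks = [
        (e.judge_accuracy_months >= 6, "Mature Judge with calibrated accuracy"),
        (e.hitl_sla_compliance_months >= 3, "Proven HITL process"),
        (e.pace_plan_exercised, "Documented PACE plan"),
        (e.governance_minutes_on_file, "AI governance committee"),
        (e.regulatory_interaction_documented, "Regulatory engagement"),
        (e.post_incident_review_documented, "Incident response tested"),
        (kill_switch_ok, "Kill switch operational"),
    ]
    return [name for ok, name in checks if not ok]
```

Since the prerequisites are non-negotiable, the gate returns the full list of gaps instead of a score that could be averaged away.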

Duration: 12-18 months to first deployment (after Tier 2 maturity). Maturation is ongoing thereafter.

The mistake: Treating autonomous AI as a cost-saving measure. If the business case depends on eliminating human review, the business case is wrong. Autonomous AI reduces human decision-making but increases human monitoring, governance, and incident response. The human cost shifts — it doesn't disappear.

Progression Timelines

Realistic Timeline for a Typical Enterprise

Milestone	Elapsed Time	Key Dependencies
First Fast Lane deployment	Month 2-3	Engineering capacity, basic security awareness
5+ Fast Lane deployments operational	Month 6	Usage data, pattern recognition
First Tier 1 deployment	Month 6-9	Judge capability, risk classification process
Tier 1 operational maturity	Month 12-15	Judge calibration data, HITL experience
First Tier 2 deployment	Month 12-18	Governance function, trained reviewers, regulatory awareness
Tier 2 operational maturity	Month 18-24	Incident response capability, Judge accuracy data
First Tier 3 deployment (if needed)	Month 24-36	Full governance, proven controls, regulatory engagement

Total elapsed time from start to autonomous AI: 2-3 years.

This feels slow. It is slow. It's also realistic. Organisations that claim to go from nothing to autonomous AI in 6 months are one of the following:
- Deploying in a genuinely low-risk context that doesn't need Tier 3 controls
- Skipping controls and hoping nothing goes wrong
- About to have a bad quarter

Accelerated Timeline (Possible Under Specific Conditions)

Condition	How It Accelerates
Existing mature governance	Saves 3-6 months; governance processes already exist
Experienced AI team (hired or acquired)	Saves 3-6 months; build skills aren't a bottleneck
Vendor-managed controls	Saves 3-6 months; guardrails and Judge operated by vendor
Low-risk domain	Tier 3 may not be needed; Tier 2 is the ceiling
Small scale	Fewer interactions = simpler operations = faster maturation

Even accelerated, the progression from nothing to Tier 3 is 12-18 months. The reason is that operational maturity — the ability to detect and respond to AI failures — can't be compressed below the time it takes to actually operate systems and encounter real problems.

What Makes Organisations Skip Steps

Reason	Why It's Tempting	Why It's Dangerous
Competitive pressure	Competitor has autonomous AI; we need it too	You don't know their control maturity, incident rate, or risk appetite
Board mandate	"We need to be AI-first by Q4"	Mandating a timeline doesn't create the capability to meet it safely
Vendor claims	"Our platform handles all the security"	Vendor handles technical controls; you handle governance, HITL, incident response
Sunk cost	Already spent £2M building the system; delaying launch wastes the investment	Launching a system you can't safely operate wastes more
Proof-of-concept success	POC worked perfectly; let's go to production	POC operated at 1/1000th scale, with the best data, without adversarial inputs
Cost pressure	Manual process costs £5M/year; AI would cost £1M/year	The £1M estimate doesn't include controls, HITL, governance, or incident costs
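The cost-pressure row is at heart an arithmetic problem: the headline saving compares the manual cost with the technology cost alone. A small sketch makes the missing terms explicit. The function names are hypothetical, and every figure in the usage note other than the £5M and £1M from the table is an illustrative placeholder, not data.

```python
def full_annual_ai_cost(
    technology: int,
    controls: int,
    hitl_review: int,
    governance: int,
    incident_reserve: int,
) -> int:
    """Sum every recurring cost line, not just the technology estimate (all in £k)."""
    return technology + controls + hitl_review + governance + incident_reserve


def annual_saving(manual_cost: int, **ai_costs: int) -> int:
    """Saving against the manual process once the full AI cost is counted (in £k)."""
    return manual_cost - full_annual_ai_cost(**ai_costs)
```

For example, with the table's manual cost of 5000 (£k) and technology cost of 1000, plus placeholder values of 600 for controls, 900 for HITL review, 400 for governance, and 300 for an incident reserve, the saving falls from the headline 4000 to 1800. The business case may still hold, but it is a different case from the one that was pitched.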

What Happens When You Skip

Scenario: Financial services firm skips from no AI to autonomous fraud detection (Tier 3)

Month 1-3: Build and deploy. System goes live with basic guardrails only.

Month 4: First false positive storm. System blocks 200 legitimate transactions in one afternoon. No Judge evaluation to catch the drift. No HITL process to triage blocks. Customer complaints spike.

Month 5: Reactive fix. The team loosens the guardrail thresholds so that fewer transactions are blocked. False positives decrease but false negatives increase. System misses actual fraud.

Month 6: Actual fraud gets through. Post-incident investigation reveals the guardrail threshold change was made by a developer without governance approval. No audit trail of the change.

Month 7: Regulatory inquiry. "Show us your controls." No Judge accuracy data. No HITL records. No PACE plan. No risk classification. The framework requirements exist, but none were implemented because the team skipped directly to autonomous deployment.

Month 8: System pulled back to human-supervised mode. 6 months of autonomous operation replaced with decision-support while controls are built. The progression starts where it should have started.

Cost of skipping: Lost customer trust. Regulatory scrutiny. 8 months of building and operating without appropriate controls. More expensive than doing it right because everything was built reactively under pressure.
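The Month 6 finding (a threshold changed by a developer, with no approval and no audit trail) is the kind of failure a small amount of code can rule out. The sketch below refuses unapproved or self-approved changes and records each applied change in a hash-chained log. All names (`ThresholdChange`, `AuditLog`, `apply_change`) are hypothetical, and a production system would persist the log to write-once storage instead of holding it in memory.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ThresholdChange:
    control: str
    old_value: float
    new_value: float
    requested_by: str
    approved_by: Optional[str]
    reason: str


class AuditLog:
    """Append-only log; each entry carries the hash of the previous entry."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def append(self, record: dict) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else ""
        body = {**record, "at": datetime.now(timezone.utc).isoformat(), "prev_hash": prev_hash}
        body["hash"] = hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()
        self.entries.append(body)
        return body


def apply_change(thresholds: dict[str, float], change: ThresholdChange, log: AuditLog) -> None:
    """Apply a threshold change only with independent approval, and record it."""
    if not change.approved_by:
        raise PermissionError("threshold change needs governance approval")
    if change.approved_by == change.requested_by:
        raise PermissionError("requester cannot approve their own change")
    if thresholds.get(change.control) != change.old_value:
        raise ValueError("old_value does not match the live threshold")
    thresholds[change.control] = change.new_value
    log.append(asdict(change))
```

With this in place, the regulator's "show us your controls" has at least one concrete answer: who changed what, when, why, and who approved it.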

The Framework's Role in Progression

Where the Framework Helps

The tiered model is inherently progressive:

Framework Feature	How It Supports Progression
Risk tiers	Clear criteria for what tier you need at each stage
Control matrix	Specific controls required at each tier; nothing ambiguous
Fast Lane	Entry point with minimal overhead; removes "security says no" as an excuse not to start
PACE model	Degradation is designed, not accidental; each progression step has a defined fallback
Judge evaluation	Sampling rates scale; you don't need 100% on day 1
HITL requirements	Human oversight decreases as AI maturity increases; the framework models this explicitly

Where the Framework Hinders

Limitation	Impact on Progression
No explicit progression path	The tiers exist but there's no documented "how to move from Tier 1 to Tier 2" process
Tier classification is static	Systems are classified at deployment; the framework doesn't guide dynamic reclassification during progression
No human readiness assessment	The framework tells you what controls you need; it doesn't tell you whether your people can operate them
No data quality integration	Progression should depend partly on data maturity; the framework doesn't incorporate this
Binary tier boundaries	Moving from MEDIUM to HIGH is a step function; no intermediate state
Downgrade is slow	Downgrading a tier requires 6+ months of stable operation, but circumstances sometimes reduce risk faster than that

What Would Make It Better

Gap	Proposed Addition	Impact
Progression guide	Document: "When you're at Tier X, here's what to do to prepare for Tier X+1"	Organisations know what to build next
Readiness checklist per tier	Before deploying at Tier 2, confirm: Judge accuracy >80%, HITL SLA compliance >95% for 3 months, etc.	Prevents premature progression
Dynamic classification	Allow systems to operate at different tiers for different functions	Supports gradual capability expansion
Human readiness criteria	Minimum human capability requirements per tier (training, experience, backup)	Prevents the "controls exist but nobody can operate them" failure
Data quality overlay	Data quality as a risk modifier that affects effective tier	Prevents high-risk AI on low-quality data
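Two of the proposed additions, dynamic classification and the data quality overlay, can be sketched together: each function of a system carries its own base tier, and poor data quality pushes the effective tier up by one step so that stricter controls apply. Everything here is an assumption made for illustration, including the names, the 0-to-1 quality score, the 0.9 threshold, and the "raise by one tier" rule. The framework does not currently define any of it.

```python
from enum import IntEnum


class Tier(IntEnum):
    FAST_LANE = 0
    TIER_1 = 1
    TIER_2 = 2
    TIER_3 = 3


def effective_tier(base: Tier, data_quality: float, threshold: float = 0.9) -> Tier:
    """Raise the tier by one step when data quality falls below the threshold."""
    if not 0.0 <= data_quality <= 1.0:
        raise ValueError("data_quality must be between 0 and 1")
    if data_quality >= threshold:
        return base
    return Tier(min(base + 1, Tier.TIER_3))  # Tier 3 is the ceiling


def classify_functions(
    base_tiers: dict[str, Tier],
    data_quality: dict[str, float],
) -> dict[str, Tier]:
    """Classify each function of one system separately (dynamic classification)."""
    return {name: effective_tier(tier, data_quality[name]) for name, tier in base_tiers.items()}
```

Under these assumptions, a system whose FAQ function runs on clean data stays at Tier 1, while its refunds function on weaker data is treated as one tier higher than its base classification.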

Progression Patterns by Industry

Financial Services

Position	Typical Use Cases	Regulatory Consideration
Assisted	Report drafting, research summarisation, code review	Low regulatory impact; manageable
Supported	Risk analysis assistance, customer query routing, AML triage	Moderate; outputs inform regulated decisions
Supervised	Customer service with account access, claims processing	High; customer-facing, data-sensitive
Autonomous	Fraud detection, credit scoring, algorithmic trading	Maximum; SR 11-7, EU AI Act, FCA oversight

Typical progression: 18-30 months to reach Supervised. Autonomous only for specific, well-bounded use cases with strong regulatory engagement.

Healthcare

Position	Typical Use Cases	Regulatory Consideration
Assisted	Literature search, administrative summarisation, scheduling	Low; no clinical impact
Supported	Clinical documentation assistance, diagnostic research	Moderate; informs clinical decisions
Supervised	Triage support, treatment recommendation, imaging analysis	High; directly affects patient care
Autonomous	Very limited — monitoring alarms, drug interaction checking	Maximum; medical device regulations may apply

Typical progression: 24-36 months to reach Supervised. Autonomous is rare and highly constrained by medical device regulation and clinical safety requirements.

Retail / E-commerce

Position	Typical Use Cases	Regulatory Consideration
Assisted	Product description generation, internal analytics	Low
Supported	Personalised recommendations, customer search	Moderate; consumer protection, GDPR
Supervised	Customer service, returns processing, pricing assistance	Moderate-high; customer-facing
Autonomous	Dynamic pricing, inventory management, fraud detection	High; competition law, consumer protection

Typical progression: 12-18 months to reach Supervised. Faster progression possible because regulatory intensity is lower than financial services or healthcare.

When to Stay Where You Are

Not every organisation needs to reach Tier 3. Not every organisation should.

Stay at Assisted/Fast Lane if:
- AI provides sufficient value as a productivity tool
- The organisation doesn't have governance capacity for higher tiers
- The risk appetite genuinely doesn't extend to AI-driven decisions

Stay at Supported (Tier 1) if:
- Human expertise is the competitive advantage, not AI speed
- Regulatory environment is uncertain (better to wait for clarity)
- The volume of decisions doesn't justify the cost of autonomous controls

Stay at Supervised (Tier 2) if:
- The human-AI combination outperforms both human-only and AI-only
- Accountability requirements favour human decision-making
- Control costs for Tier 3 exceed the value of removing human oversight

Progress to Autonomous (Tier 3) only if:
- The speed or scale of decisions requires it (fraud detection at 1M transactions/day)
- Demonstrable data shows the AI outperforms human decision-making
- Full governance, controls, and operational capability are proven
- Regulatory environment supports it
- The business case includes full operational cost, not just the technology cost

The strategic goal is not "highest tier possible." It's "right tier for the business, operated safely." For many organisations, that's Tier 1 or Tier 2 — and that's fine.

AI Runtime Behaviour Security, 2026 (Jonathan Gill).