Worked Example: High-Volume Customer Communications¶

When time-to-detect equals time-to-harm, control timing becomes a design decision.

This example follows Sentinel Bank (fictional) as they deploy AI-generated customer communications at scale. The critical difference from interactive chat: messages are sent, not displayed. Once delivered, harm is done.

Why This Example Matters¶

In high-throughput communications (fraud alerts, collections, service notifications), latency determines whether you have governance or post-mortem reporting.

The question isn't "should we monitor?" but "when must we act, and how fast?"

The Use Case¶

System Name: Compass (AI-Powered Customer Communications)

What it does:
- Generates personalised outbound messages (email, SMS, in-app)
- Responds to customer enquiries across channels
- Drafts follow-up communications for agents
- Handles fraud alerts, payment reminders, service updates

Scale:
- 2 million messages per day
- 15 million customers across channels
- Peak: 50,000 messages per hour (fraud alert scenarios)
- 200+ intent categories

Technology:
- Claude via AWS Bedrock
- RAG for policy/product retrieval
- Customer data API integration
- Multi-channel delivery (email, SMS, push)

The critical constraint: Messages are delivered to customers. Unlike chatbots where output is displayed and the customer responds, these communications go out. A bad message reaches the customer before you know it was bad.

Step 1: Risk Classification¶

Assessment¶

Factor	Assessment	Score
Decision Impact	Communications may trigger customer action	Medium
Data Sensitivity	Customer PII, account details, balances	High
User Population	External customers (15M)	High
Autonomy Level	Can send messages without human review	High
Regulatory Scope	Banking (FCA, PRA), GDPR, PECR	High
Reputational Risk	Direct customer communication at scale	High
Reversibility	Messages cannot be unsent	High

Classification Decision¶

Risk Tier: CRITICAL (Tier 4)

Rationale: High autonomy (auto-send) + irreversibility (can't unsend) + scale (millions of messages) + regulatory exposure = maximum control requirements.

Step 2: The Latency Problem¶

Traditional Thinking (Broken)¶

The typical pattern — generate, send, log, then evaluate hours later — doesn't work. By the time nearline evaluation catches a problem, thousands of messages are already delivered.

Time-Band Thinking (Required)¶

Controls must match the reversibility window. For communications:
- Once sent, irreversible
- Therefore, high-risk checks must complete before send

Step 3: Three Time Bands¶

Band A: Inline (Milliseconds)¶

What belongs here: Anything that must prevent harm before it happens.

Control	Purpose	Latency
Bedrock Guardrails	Block toxic, harmful, off-topic content	~50ms
PII/DLP validators	Prevent data leakage	~10ms
Schema validation	Ensure message structure	~5ms
Tool allowlists	Restrict data access	~5ms
Rate limiting	Prevent flood/abuse	~1ms

Implementation:

See the inline flow in the architecture diagram below — requests pass through rate limiting, schema validation, guardrails, generation, output checks, and DLP before the auto-send decision.

Latency budget: 100ms total for inline checks.

What inline controls catch:
- Prompt injection attempts
- Toxic or harmful content
- PII in output (account numbers, addresses)
- Off-topic responses
- Malformed messages

What they miss:
- Subtle policy violations
- Incorrect information
- Inappropriate tone for context
- Hallucinated product details
- Drift from training baseline
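The inline band can be sketched as an ordered chain of checks that stops at the first failure and measures elapsed time against the budget. This is a minimal sketch rather than Compass's actual implementation: the check functions, the `InlineResult` type, the length cap, and the account-number pattern are hypothetical stand-ins for real guardrail and DLP services.

```python
import re
import time
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

INLINE_BUDGET_MS = 100  # inline budget stated in the text

# Hypothetical DLP rule: treat any run of 8+ digits as a possible account number.
ACCOUNT_NUMBER = re.compile(r"\b\d{8,}\b")


@dataclass
class InlineResult:
    passed: bool
    failed_check: Optional[str] = None
    elapsed_ms: float = 0.0
    over_budget: bool = False


def check_schema(message: str) -> bool:
    # Stand-in for schema validation: non-empty and under an assumed length cap.
    return 0 < len(message) <= 1600


def check_dlp(message: str) -> bool:
    # Stand-in for a PII/DLP validator.
    return ACCOUNT_NUMBER.search(message) is None


CHECKS: List[Tuple[str, Callable[[str], bool]]] = [
    ("schema", check_schema),
    ("dlp", check_dlp),
]


def run_inline_checks(message: str) -> InlineResult:
    start = time.perf_counter()
    failed = None
    for name, check in CHECKS:
        if not check(message):
            failed = name  # fail closed: stop at the first failing check
            break
    elapsed_ms = (time.perf_counter() - start) * 1000
    return InlineResult(
        passed=failed is None,
        failed_check=failed,
        elapsed_ms=elapsed_ms,
        over_budget=elapsed_ms > INLINE_BUDGET_MS,
    )
```

A message that fails any check never reaches the send decision. A message that passes but exceeds the budget is flagged through `over_budget` so the overrun can be reported as an event.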

Band B: Near-Real-Time (Seconds)¶

What belongs here: Signals that may not block, but must trigger action quickly.

Signal	Trigger	Action
Repeated soft guardrail hits	Same intent, 5 hits in 5 min	Pause auto-send for intent
Retrieval anomalies	Empty or conflicting retrievals	Force human review
Escalation spike	3x normal escalation rate	Alert ops, tighten policy
Model version rollout	Any metric deviation	Auto-rollback

Implementation:

Events flow through EventBridge to Kinesis, then to an aggregator that feeds alert rules. Threshold alerts and anomaly detection trigger automated responses: pause auto-send, tighten policy, force human review, or rollback prompt versions.

Latency budget: 5-30 seconds from event to action.

Example: Detecting Drift After Prompt Rollout

Time	Event	Action
10:00:00	New prompt version deployed	—
10:00:15	First messages generated	—
10:00:30	Aggregator sees 12% soft-hit rate (baseline: 2%)	Threshold breached
10:00:35	Alert fires: "Guardrail anomaly on intent:payment_reminder"	—
10:00:40	Circuit breaker triggers	Intent moves to draft-only
10:00:45	Ops notified	Investigation begins

Messages affected: ~50 (vs 5,000 if detected hourly)
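The timeline above relies on a rate computed over a short sliding window and compared against a baseline. One way to sketch that, assuming a 30-second window, a breach at 3x baseline, and a minimum sample size (all three are illustrative values, not figures from Compass):

```python
from collections import deque
from typing import Deque, Tuple


class SoftHitRateMonitor:
    """Tracks the guardrail soft-hit rate for one intent over a sliding window."""

    def __init__(self, baseline_rate: float, multiplier: float = 3.0,
                 window_seconds: float = 30.0, min_messages: int = 20):
        self.baseline_rate = baseline_rate
        self.multiplier = multiplier          # assumed: breach at 3x baseline
        self.window_seconds = window_seconds  # assumed window length
        self.min_messages = min_messages      # avoid tripping on tiny samples
        self.events: Deque[Tuple[float, bool]] = deque()

    def record(self, timestamp: float, soft_hit: bool) -> bool:
        """Record one message; return True if the threshold is breached."""
        self.events.append((timestamp, soft_hit))
        # Drop messages that have aged out of the window.
        while self.events and timestamp - self.events[0][0] > self.window_seconds:
            self.events.popleft()
        total = len(self.events)
        if total < self.min_messages:
            return False
        rate = sum(1 for _, hit in self.events if hit) / total
        return rate > self.baseline_rate * self.multiplier
```

With a 2% baseline, a 12% soft-hit rate breaches the 6% threshold as soon as the minimum sample size is reached, which is what keeps the affected-message count in the tens rather than the thousands.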

Band C: Nearline/Offline (Minutes to Hours)¶

What belongs here: Heavy analysis and learning loops.

Analysis	Purpose	Frequency
LLM-as-Judge scoring	QA sample evaluation	Every 15 min
Drift analysis	Population-level patterns	Hourly
Failure clustering	New failure modes	Daily
Human calibration	Judge accuracy check	Weekly

Implementation:

Message logs are sampled, evaluated by LLM-as-Judge, and stored in a findings database. QA dashboards and anomaly jobs drive policy updates — guardrail rules, prompt improvements, and training data refinements.

Why delays don't break safety: These controls don't stop individual messages. They improve the system over time.
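Sampling for the judge can be made deterministic so that a given message is always either in or out of the sample, which makes findings reproducible. The hash-based approach below is an assumption chosen for illustration; the source does not say how Compass selects its sample.

```python
import hashlib


def in_judge_sample(message_id: str, sample_rate: float) -> bool:
    """Deterministically select roughly `sample_rate` of messages by hashing the ID."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    digest = hashlib.sha256(message_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate
```

Because selection depends only on the message ID, re-running the nearline job over the same logs evaluates the same messages.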

Step 4: The Auto-Send Decision¶

Critical design choice: Not all messages should auto-send.

Intent Risk Classification¶

Intent Category	Risk Level	Send Mode	Rationale
Balance notification	Low	Auto-send	Templated, factual, low harm
Payment reminder	Low	Auto-send	Templated, clear grounding
Fraud alert	Medium	Auto-send	Time-critical, but templated
Product recommendation	Medium	Draft-only	Personalised, regulatory risk
Complaint response	High	Draft-only	Requires human judgment
Collections message	High	Draft-only	Regulatory, reputational
Rate change notification	High	Draft-only	Contractual implications
Hardship communication	Critical	Draft-only	Vulnerability, regulation

The Risk Envelope¶

Auto-send only when ALL conditions are met:

AUTO-SEND = (
    intent.risk_level <= MEDIUM
    AND retrieval.confidence >= 0.9
    AND guardrail.soft_hits == 0
    AND intent.auto_send_enabled == true
    AND circuit_breaker.status == CLOSED
)

If any condition fails → draft-only, human approval required.
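The envelope translates directly into a predicate. The sketch below mirrors the five conditions; the `RiskLevel` enum, the `SendContext` type, and the returned mode strings are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class RiskLevel(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class SendContext:
    risk_level: RiskLevel
    auto_send_enabled: bool
    retrieval_confidence: float
    guardrail_soft_hits: int
    circuit_breaker_state: str  # "CLOSED", "OPEN" or "HALF-OPEN"


def send_mode(ctx: SendContext) -> str:
    """Return "auto_send" only if every condition in the risk envelope holds."""
    eligible = (
        ctx.risk_level <= RiskLevel.MEDIUM
        and ctx.retrieval_confidence >= 0.9
        and ctx.guardrail_soft_hits == 0
        and ctx.auto_send_enabled
        and ctx.circuit_breaker_state == "CLOSED"
    )
    return "auto_send" if eligible else "draft_only"
```

The per-intent `auto_send_enabled` flag is what lets two Medium-risk intents differ: a fraud alert can auto-send while a product recommendation stays draft-only.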

Step 5: Circuit Breakers¶

What They Do¶

Circuit breakers automatically tighten controls when anomalies are detected, without waiting for human intervention.

Implementation¶

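import time
from collections import deque


class CircuitBreaker:
    """Per-intent breaker: CLOSED = normal, OPEN = auto-send blocked."""

    FAILURE_EVENTS = {"guardrail_soft_hit", "retrieval_empty", "judge_flag"}

    def __init__(self, intent_id, threshold=5, window_seconds=300):
        self.intent_id = intent_id
        self.state = "CLOSED"
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.failures = deque()  # timestamps of recent failure events

    def record_event(self, event_type, severity=None, now=None):
        # severity is accepted for logging; it does not affect the count
        now = time.time() if now is None else now
        if event_type in self.FAILURE_EVENTS:
            self.failures.append(now)

        # Drop failures that have aged out of the sliding window
        while self.failures and now - self.failures[0] > self.window_seconds:
            self.failures.popleft()

        if self.state == "CLOSED" and len(self.failures) >= self.threshold:
            self.trip()

    def trip(self):
        self.state = "OPEN"
        # Immediate effects:
        # 1. Disable auto-send for this intent
        # 2. Force human review
        # 3. Alert operations
        # 4. Log for investigation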

Circuit Breaker States¶

State	Behaviour	Trigger
CLOSED	Normal operation, auto-send enabled	Default
OPEN	Auto-send disabled, draft-only	Threshold breached
HALF-OPEN	Test mode, limited auto-send	Manual reset, testing
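The three states can be sketched as a small state machine. The table does not say how a breaker leaves HALF-OPEN, so the rules below are assumptions: a fixed share of traffic is allowed through as probes, any failed probe reopens the breaker, and a run of clean probes closes it. `probe_fraction` and `probes_to_close` are hypothetical parameters.

```python
import random
from typing import Optional


class ThreeStateBreaker:
    """CLOSED -> OPEN on trip, OPEN -> HALF-OPEN on manual reset,
    HALF-OPEN -> CLOSED after enough clean probes."""

    def __init__(self, probe_fraction: float = 0.1, probes_to_close: int = 20):
        self.state = "CLOSED"
        self.probe_fraction = probe_fraction    # assumed share of traffic in HALF-OPEN
        self.probes_to_close = probes_to_close  # assumed clean probes before closing
        self.clean_probes = 0

    def trip(self) -> None:
        self.state = "OPEN"
        self.clean_probes = 0

    def manual_reset(self) -> None:
        # Only an operator moves the breaker out of OPEN.
        if self.state == "OPEN":
            self.state = "HALF-OPEN"
            self.clean_probes = 0

    def allow_auto_send(self, rng: Optional[random.Random] = None) -> bool:
        if self.state == "CLOSED":
            return True
        if self.state == "HALF-OPEN":
            draw = rng.random() if rng is not None else random.random()
            return draw < self.probe_fraction
        return False  # OPEN: draft-only

    def record_probe(self, ok: bool) -> None:
        if self.state != "HALF-OPEN":
            return
        if not ok:
            self.trip()  # any failure during testing reopens the breaker
            return
        self.clean_probes += 1
        if self.clean_probes >= self.probes_to_close:
            self.state = "CLOSED"
```

Keeping the OPEN to HALF-OPEN transition manual matches the "Manual reset" trigger in the table: automation can tighten controls on its own, but loosening them requires a person.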

Example Thresholds¶

Metric	Threshold	Window	Action
Guardrail soft hits	5	5 min	OPEN
Retrieval empty rate	10%	10 min	OPEN
Judge flags	3	15 min	OPEN
Escalation rate	3x baseline	30 min	Alert + review
Any hard block	1	immediate	Alert

Step 6: Alert Aggregation¶

The Problem¶

Individual alerts are noise. Pattern detection requires aggregation.

Event Schema¶

{
  "event_id": "evt_abc123",
  "timestamp": "2024-01-15T14:32:01Z",
  "message_id": "msg_xyz789",
  "intent_id": "payment_reminder",
  "customer_segment": "retail",
  "channel": "email",
  "model_version": "claude-3-sonnet-20240229",
  "prompt_version": "v2.3.1",

  "inline_results": {
    "guardrail_passed": true,
    "guardrail_soft_hits": ["tone_formal"],
    "dlp_passed": true,
    "latency_ms": 87
  },

  "retrieval_results": {
    "documents_found": 3,
    "confidence": 0.92,
    "conflicts": false
  },

  "decision": {
    "action": "draft_only",
    "circuit_breaker_state": "CLOSED"
  }
}
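An aggregator usually flattens each event into the handful of fields it groups and filters on. A minimal sketch, assuming events arrive as JSON strings in the schema above; the `MessageEvent` type and `parse_event` function are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MessageEvent:
    event_id: str
    intent_id: str
    model_version: str
    prompt_version: str
    soft_hits: List[str]
    retrieval_confidence: float
    action: str


def parse_event(raw: str) -> MessageEvent:
    """Parse one JSON event; raises KeyError if a required field is missing."""
    data = json.loads(raw)
    return MessageEvent(
        event_id=data["event_id"],
        intent_id=data["intent_id"],
        model_version=data["model_version"],
        prompt_version=data["prompt_version"],
        soft_hits=list(data["inline_results"]["guardrail_soft_hits"]),
        retrieval_confidence=float(data["retrieval_results"]["confidence"]),
        action=data["decision"]["action"],
    )
```

Failing loudly on a missing field is deliberate in this sketch: an event that silently lacks `intent_id` would otherwise vanish from every per-intent alert.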

Aggregation Dimensions¶

Dimension	Why It Matters
`intent_id`	Problem with specific use case
`model_version`	Regression after update
`prompt_version`	Prompt change caused issue
`customer_segment`	Certain populations affected
`channel`	Channel-specific problems
`time_window`	Temporal patterns
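Slicing the same events along different dimensions is what separates "one intent is misbehaving" from "the new prompt version is misbehaving everywhere". A minimal sketch, assuming events are dictionaries in the schema above; `soft_hits_by` is a hypothetical helper.

```python
from collections import Counter
from typing import Dict, Iterable


def soft_hits_by(events: Iterable[Dict], dimension: str) -> Counter:
    """Count guardrail soft hits grouped by a top-level event field."""
    counts: Counter = Counter()
    for event in events:
        hits = event.get("inline_results", {}).get("guardrail_soft_hits", [])
        if hits:
            counts[event[dimension]] += len(hits)
    return counts
```

If hits are spread across many values of `intent_id` but concentrated in one `prompt_version`, the prompt change is the more likely cause.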

Alert Rules¶

rules:
  - name: "Guardrail anomaly by intent"
    condition: >
      COUNT(guardrail_soft_hits) BY intent_id
      OVER 5_MINUTES >= 5
    action:
      - "trip_circuit_breaker(intent_id)"
      - "alert(severity: HIGH, team: ops)"

  - name: "Retrieval degradation"
    condition: >
      AVG(retrieval.confidence) BY intent_id
      OVER 10_MINUTES < 0.7
    action:
      - "alert(severity: MEDIUM, team: ops)"
      - "force_human_review(intent_id)"

  - name: "Model version regression"
    condition: >
      (guardrail_soft_hit_rate BY model_version CURRENT)
      > 1.5 * (guardrail_soft_hit_rate BY model_version PREVIOUS)
    action:
      - "alert(severity: HIGH, team: ml)"
      - "consider_rollback(model_version)"
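The "Model version regression" rule compares two rates. Evaluated in code, it might look like the sketch below; the function names are hypothetical, and treating a version with no traffic as a zero rate is an assumption.

```python
def soft_hit_rate(soft_hits: int, messages: int) -> float:
    """Soft hits per message; zero when no messages have been seen."""
    return soft_hits / messages if messages else 0.0


def is_regression(current_hits: int, current_messages: int,
                  previous_hits: int, previous_messages: int,
                  ratio: float = 1.5) -> bool:
    """True if the current version's soft-hit rate exceeds ratio x the previous rate."""
    current = soft_hit_rate(current_hits, current_messages)
    previous = soft_hit_rate(previous_hits, previous_messages)
    return current > ratio * previous
```

A previous rate of zero makes any soft hit on the new version count as a regression, so in practice a minimum message count per version would probably be needed before the rule is allowed to fire.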

Step 7: Latency Matrix¶

Control Latency by Risk Tier¶

Control	Tier 1-2	Tier 3	Tier 4
Input guardrails	Async	Inline	Inline
Output guardrails	Async	Inline	Inline
DLP/PII check	Async	Inline	Inline
Retrieval validation	Skip	Inline	Inline
LLM-as-Judge	Sample (1%)	Sample (10%)	All (async)
Human review	Escalation only	High-risk intents	All drafts
Circuit breakers	Manual	Threshold	Aggressive
Drift detection	Daily	Hourly	Real-time

Message Latency Budget (Tier 4)¶


Stage	Latency	Cumulative
Rate limiting	+1ms	1ms
Input schema validation	+5ms	6ms
Input guardrails	+50ms	56ms
Retrieval	+200ms	256ms
Generation	+800ms	1,056ms
Output guardrails	+50ms	1,106ms
DLP/PII check	+10ms	1,116ms
Auto-send decision	+5ms	~1,120ms

If draft-only: +5ms to queue (~1,125ms total)
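The cumulative column can be reproduced with a running sum, which also makes it easy to separate time spent in controls from time spent in retrieval and generation. The stage values come from the table; the `is_control` flag on each stage is a classification made for this sketch.

```python
from itertools import accumulate

# (stage, latency in ms, is_control) taken from the Tier 4 budget table.
STAGES = [
    ("Rate limiting", 1, True),
    ("Input schema validation", 5, True),
    ("Input guardrails", 50, True),
    ("Retrieval", 200, False),
    ("Generation", 800, False),
    ("Output guardrails", 50, True),
    ("DLP/PII check", 10, True),
    ("Auto-send decision", 5, True),
]

cumulative = list(accumulate(ms for _, ms, _ in STAGES))
total_ms = cumulative[-1]
control_ms = sum(ms for _, ms, is_control in STAGES if is_control)

for (name, ms, _), running in zip(STAGES, cumulative):
    print(f"{name:<26}+{ms:>4}ms  {running:>5}ms")
```

The exact total is 1,121 ms, which the table rounds to ~1,120 ms. Under this classification the control stages alone sum to 121 ms, because guardrails run on both input and output. That is slightly above the 100 ms inline budget given for Band A, so one of the two figures would need adjusting in a real deployment.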

Human review SLA: 15 minutes for standard, 5 minutes for urgent

Step 8: Response Timing¶

Two Different "Responses"¶

A. Response to the Customer

Time-sensitive, but you control it.

Scenario	Response
Auto-send eligible	Send immediately
Draft-only	Queue for human review
Uncertain	Send safe acknowledgement

Safe acknowledgement template:

"Thank you for your message. We're reviewing this and will respond shortly."

This buys time without sending potentially problematic content.
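The three scenarios can be sketched as a single routing function. How "uncertain" is determined is not specified in the source, so it appears here as a hypothetical flag, and queueing the held draft for review alongside the acknowledgement is an assumption.

```python
from typing import Dict, Optional

SAFE_ACKNOWLEDGEMENT = (
    "Thank you for your message. We're reviewing this and will respond shortly."
)


def customer_response(mode: str, draft: str,
                      uncertain: bool = False) -> Dict[str, Optional[str]]:
    """Decide what is sent now and what is queued for human review."""
    if uncertain:
        # Hold the draft for review and send only the holding message.
        return {"send_now": SAFE_ACKNOWLEDGEMENT, "queue_for_review": draft}
    if mode == "auto_send":
        return {"send_now": draft, "queue_for_review": None}
    # Draft-only: nothing reaches the customer until a human approves.
    return {"send_now": None, "queue_for_review": draft}
```

The uncertain case is checked first so that an otherwise eligible message is still held back when confidence in it is low.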

B. Response to Misbehaving AI

This is what your alerting pipeline drives.

Response Type	Timing	Actions
Automated	Seconds	Disable auto-send, tighten policy, rollback prompt
Ops team	Minutes	Investigate, escalate, manual intervention
Compliance	Hours	Full review, regulatory assessment

Step 9: AWS Implementation¶

AWS Implementation Architecture

Inline (Milliseconds)¶

API Gateway receives requests, Lambda handles routing and validation, Bedrock provides guardrails and generation, then Lambda performs output validation and DLP. The decision routes to send, queue, or block.

Near-Real-Time (Seconds)¶

All Lambdas emit events to EventBridge, which feeds Kinesis. A Lambda aggregator processes streams and fans out to OpenSearch (alerting), DynamoDB (circuit breaker state), and SNS/PagerDuty (notifications).

Nearline (Minutes+)¶

S3 stores message logs. Step Functions selects samples for Bedrock judge evaluation. Results flow to OpenSearch for findings storage, then QuickSight for dashboards.
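For the near-real-time path, each Lambda has to wrap its event in the shape EventBridge expects. The sketch below builds one `PutEvents` entry; the bus name, `Source`, and `DetailType` values are hypothetical, and the boto3 call is shown only as a comment so the snippet runs without AWS credentials.

```python
import json
from typing import Dict


def build_eventbridge_entry(event: Dict,
                            bus_name: str = "compass-events") -> Dict[str, str]:
    """Wrap a message event as a single EventBridge PutEvents entry."""
    return {
        "Source": "compass.inline",        # hypothetical source name
        "DetailType": "MessageDecision",   # hypothetical detail type
        "Detail": json.dumps(event),       # Detail must be a JSON string
        "EventBusName": bus_name,          # hypothetical bus name
    }


# Inside a Lambda with boto3 available, the entry could be sent with:
#   import boto3
#   boto3.client("events").put_events(Entries=[build_eventbridge_entry(event)])
```

Keeping the entry builder separate from the network call makes the event shape testable without touching AWS.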

Step 10: Metrics Dashboard¶

Real-Time Panel¶

Metric	Target	Alert Threshold
Messages/hour	varies	>2x baseline
Auto-send rate	60-70%	<40% or >85%
Guardrail block rate	<1%	>2%
Guardrail soft-hit rate	<5%	>10%
Draft queue depth	<100	>500
Circuit breakers open	0	>0
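The alert thresholds in this panel can be expressed as predicates and checked against a metrics snapshot. The metric keys below are hypothetical names; the thresholds are the ones in the table, with rates written as fractions. Messages/hour is left out because it depends on a baseline that is not given.

```python
from typing import Callable, Dict, List

ALERT_RULES: Dict[str, Callable[[float], bool]] = {
    "auto_send_rate": lambda v: v < 0.40 or v > 0.85,
    "guardrail_block_rate": lambda v: v > 0.02,
    "guardrail_soft_hit_rate": lambda v: v > 0.10,
    "draft_queue_depth": lambda v: v > 500,
    "circuit_breakers_open": lambda v: v > 0,
}


def breached(metrics: Dict[str, float]) -> List[str]:
    """Return the names of metrics whose alert threshold is crossed."""
    return sorted(
        name for name, rule in ALERT_RULES.items()
        if name in metrics and rule(metrics[name])
    )
```

The auto-send rate alerts in both directions: too low suggests the queue is about to back up, while too high suggests the envelope has become too permissive.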

Quality Panel (Nearline)¶

Metric	Target	Alert Threshold
Judge pass rate	>95%	<90%
Tone score	>4.0/5.0	<3.5
Factual accuracy	>98%	<95%
Human override rate	<5%	>10%

Drift Panel (Daily)¶

Metric	Target	Alert Threshold
Intent distribution	stable	>20% shift
Failure clustering	known patterns	new clusters
Model confidence	stable	>0.1 drop

Key Takeaways¶

1. Time-to-detect = time-to-harm for irreversible actions. Design controls accordingly.
2. Three bands, three purposes:
   - Inline prevents known-bad
   - Near-real-time detects drift before scale
   - Nearline improves the system
3. Auto-send is a privilege, not a default. Earn it through risk classification and confidence thresholds.
4. Circuit breakers are your safety net. They act faster than humans can.
5. Aggregation reveals patterns. Individual events are noise; correlated events are signal.
6. The safest delay is the one the customer never notices. A holding message buys time without sending bad content.

Architecture Diagram¶


AI Runtime Behaviour Security, 2026 (Jonathan Gill).