Tier 2: Managed Multi-Agent Deployment

Medium Autonomy · Selective Human Oversight · Production Operations

Part of the MASO Framework · Implementation Guidance Version 1.0 · February 2026

When to Use Tier 2

Tier 2 is the operational steady state for most enterprise multi-agent deployments. Agents operate with bounded autonomy: they can execute read operations and pre-approved low-consequence write operations without human intervention, while high-consequence actions still escalate to human oversight.

The shift from Tier 1 to Tier 2 does not remove human control. It replaces per-action human approval with continuous automated monitoring plus exception-based human intervention, so the human operator's role changes from gatekeeper to supervisor.

Tier 2 is appropriate when:

The organisation has completed at least 90 days of Tier 1 operations and established behavioural baselines for all agents.
Tier 1 graduation criteria have been formally met and documented.
The organisation has the technical capacity to implement per-agent Non-Human Identities, a signed message bus, LLM-as-Judge evaluation, and continuous anomaly scoring.
The organisation's AI security maturity is at CMMI Level 2–3 (Managed or Defined) for AI-specific controls.
The cost of per-action human approval is becoming a bottleneck to operational value.

Most regulated enterprises will operate at Tier 2 for their production multi-agent systems indefinitely. Tier 3 (full autonomy) requires a level of demonstrated trust and technical maturity that relatively few use cases will justify.

Architecture at Tier 2

[Figure: Tier 2 Architecture]

Key architectural changes from Tier 1:

Signed message bus: All inter-agent messages are cryptographically signed by the sending agent's NHI. The bus validates signatures before delivery. This prevents message spoofing and provides non-repudiation for the audit trail.
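The sign-then-validate flow can be sketched in a few lines. This is a minimal illustration only: it uses a shared-secret HMAC from the Python standard library as a stand-in for NHI certificate signatures, and the names `AGENT_KEYS`, `sign_message`, and `deliver` are hypothetical. A shared secret does not provide non-repudiation, so a real Tier 2 bus would need an asymmetric scheme in which only the sending agent holds the private key.

```python
import hashlib
import hmac
import json

# Hypothetical key registry: agent id -> signing key.
# Assumption: a real deployment would hold each agent's NHI public key here
# and verify asymmetric signatures instead of HMAC tags.
AGENT_KEYS = {
    "planner": b"planner-demo-key",
    "executor": b"executor-demo-key",
}


def canonical(sender, payload):
    """Serialise the signed fields deterministically so both sides hash the same bytes."""
    return json.dumps({"sender": sender, "payload": payload}, sort_keys=True).encode()


def sign_message(sender, payload, key):
    """Agent side: attach a signature covering the sender id and the payload."""
    tag = hmac.new(key, canonical(sender, payload), hashlib.sha256).hexdigest()
    return {"sender": sender, "payload": payload, "signature": tag}


def deliver(message):
    """Bus side: return True only if the signature verifies for the claimed sender."""
    key = AGENT_KEYS.get(message.get("sender"))
    if key is None:
        return False  # unknown sender, never delivered
    expected = hmac.new(
        key, canonical(message["sender"], message.get("payload")), hashlib.sha256
    ).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, message.get("signature", ""))
```

Because the signature covers the sender field as well as the payload, a message that is relabelled with another agent's id fails validation in the same way as one whose payload was altered in transit.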

Non-Human Identity (NHI) per agent: Each agent has a certificate-based identity with short-lived credentials (recommended rotation: 1–24 hours depending on risk). The NHI is bound to the agent's permission scope and is used for mutual authentication on the message bus and tool access.
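As a rough illustration of how a short-lived, scope-bound credential might be checked, the sketch below models only the expiry and scope logic. The `AgentCredential` class, its field names, and the scope strings are assumptions made for this example; certificate issuance and mutual authentication are out of scope here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AgentCredential:
    """Hypothetical stand-in for the validity data carried by an agent's NHI."""
    agent_id: str
    scopes: frozenset
    issued_at: datetime
    ttl: timedelta

    def __post_init__(self):
        # Enforce the recommended 1-24 hour rotation window at issuance time.
        if not timedelta(hours=1) <= self.ttl <= timedelta(hours=24):
            raise ValueError("ttl must be between 1 and 24 hours")

    def is_valid(self, now):
        """True while the credential is inside its lifetime."""
        return self.issued_at <= now < self.issued_at + self.ttl

    def permits(self, scope, now):
        """An operation is allowed only if the credential is live and the scope is bound to it."""
        return self.is_valid(now) and scope in self.scopes


issued = datetime(2026, 2, 1, 9, 0, tzinfo=timezone.utc)
cred = AgentCredential("executor", frozenset({"tickets:read"}), issued, timedelta(hours=4))
```

A higher-risk agent would sit at the short end of the rotation window, so a leaked credential stops working within about an hour.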

LLM-as-Judge (Layer 2) is mandatory: A dedicated evaluation model (distinct from the task agents, ideally from a different model provider) reviews agent outputs and proposed actions against policy, quality, and safety criteria before they are committed. The judge model does not execute actions; it evaluates each one and then approves it, flags it for human review, or blocks it.

Continuous monitoring layer: Automated anomaly detection replaces periodic manual log review. Behavioural baselines established during Tier 1 are used as reference. Deviations trigger alerts and, if thresholds are exceeded, automatic PACE phase transitions.
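One simple way to score deviations against a Tier 1 baseline is a z-score on a single metric, such as tool calls per hour. The sketch below is an assumption-laden illustration: the two thresholds, the metric, and the names `anomaly_score` and `respond` are hypothetical, and production scoring would typically combine several signals.

```python
from statistics import mean, stdev

# Hypothetical thresholds, expressed in standard deviations from the baseline.
ALERT_Z = 3.0       # raise an alert for the human supervisor
TRANSITION_Z = 5.0  # trigger an automatic PACE phase transition


def anomaly_score(baseline, observed):
    """Absolute z-score of an observation relative to the Tier 1 baseline samples."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # needs at least two baseline samples
    if sigma == 0:
        # A perfectly flat baseline: any deviation at all is maximally anomalous.
        return 0.0 if observed == mu else float("inf")
    return abs(observed - mu) / sigma


def respond(score):
    """Map a score to the monitoring layer's response."""
    if score >= TRANSITION_Z:
        return "pace_transition"
    if score >= ALERT_Z:
        return "alert"
    return "ok"


# Example baseline: tool calls per hour observed during Tier 1 operations.
tier1_baseline = [10, 12, 11, 9, 13, 10, 11, 12]
```

Keeping the alert threshold below the transition threshold gives the supervisor a chance to intervene before the system changes PACE phase on its own.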

Action classification: Every agent action is classified as one of the following:

- Auto-approve: Read operations and low-consequence writes within pre-approved categories. The action proceeds without human intervention. The LLM-as-Judge reviews it but does not block unless a policy violation is detected.
- Escalate: High-consequence writes, actions involving external parties, irreversible operations, and actions flagged by the LLM-as-Judge. The action is routed to a human supervisor for approval.
- Block: Actions outside the agent's scope, guardrail violations, and actions on the deny-list. The action is blocked automatically, logged, and flagged for review.
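The three-way classification can be expressed as an ordered set of checks. The following is a minimal sketch, not a reference implementation: the `Action` fields, the example action names, and the contents of `DENY_LIST` and `LOW_CONSEQUENCE_WRITES` are hypothetical. It also assumes that any write not on the pre-approved list counts as high-consequence and is escalated by default.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    AUTO_APPROVE = "auto_approve"
    ESCALATE = "escalate"
    BLOCK = "block"


@dataclass(frozen=True)
class Action:
    """Hypothetical description of a proposed agent action."""
    name: str
    in_scope: bool
    is_read: bool = False
    irreversible: bool = False
    external_party: bool = False
    judge_flagged: bool = False            # judge asked for human review
    judge_policy_violation: bool = False   # judge or guardrail detected a violation


# Illustrative lists only; real entries come from the organisation's policy.
DENY_LIST = {"delete_audit_log"}
LOW_CONSEQUENCE_WRITES = {"add_ticket_comment"}


def classify(action):
    # Block checks run first so nothing else can override them.
    if not action.in_scope or action.name in DENY_LIST or action.judge_policy_violation:
        return Decision.BLOCK
    # Escalation triggers apply even to reads, e.g. when the judge flags one.
    if action.irreversible or action.external_party or action.judge_flagged:
        return Decision.ESCALATE
    if action.is_read or action.name in LOW_CONSEQUENCE_WRITES:
        return Decision.AUTO_APPROVE
    # Assumption: writes that are not pre-approved default to human review.
    return Decision.ESCALATE
```

Evaluating the checks in the order block, escalate, auto-approve means an action reaches the auto-approve path only after every stricter rule has been ruled out.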