Objective Intent: From Fault Detection to Behavioral Assurance

The existing controls catch faults where they can: injection, tool misuse, data leakage. What they cannot evaluate is whether the system is doing what it was designed to do. Objective Intent is the bridge from fault detection to behavioral assurance: from catching things that go wrong to verifying that things go right.

The Gap This Fills

The MASO framework already provides multiple layers of defence:

  • Guardrails block known-bad patterns (mechanical)
  • Model-as-Judge evaluates quality and safety (semantic, but reactive)
  • Observability detects drift from behavioral baselines (statistical)
  • The Intent Layer evaluates workflow-level outcomes post-execution (strategic, but coarse)

What is missing is a formal declaration of what each agent, each judge, and the overall orchestration is supposed to achieve, written by the developer, versioned, and used as the reference standard for continuous behavioral evaluation.

Without this:

  • Judges evaluate against generic criteria, not the specific purpose the developer intended.
  • Drift detection compares against statistical baselines, not declared objectives.
  • Post-execution evaluation can assess coherence but not correctness relative to design intent.
  • Combined agent actions produce emergent outcomes that nobody specified and nobody is evaluating against.

Objective Intent makes the developer's expectations explicit, versioned, and evaluable.

Two Levels of Evaluation

The architecture operates at two distinct levels, and both are necessary.

Tactical: single-agent intent compliance

Each agent has a declared Objective Intent Specification (OISpec). A tactical judge evaluates that individual agent's actions and outputs against its OISpec. This catches:

  • An agent operating outside its declared parameters
  • An agent pursuing a goal it was not assigned
  • An agent exceeding its declared authority or scope
  • An agent producing outputs inconsistent with its stated purpose

This is the monitoring of one: the individual agent against its declared contract.

Strategic: aggregated intent compliance

The orchestration has a declared Workflow Intent, the combined objective across all agents. A strategic evaluation agent assesses whether the aggregate behaviour of all agents satisfies the workflow intent. This catches:

  • Individual agents all complying with their own intents but producing a collectively wrong outcome
  • Emergent behaviour that no single agent's intent anticipated
  • Intent gaps, scenarios where no agent's OISpec covers a critical requirement
  • Conflicting intents between agents that produce contradictory actions

This is the monitoring of the whole: the fleet-level behavioural assessment against the aggregated intent.

Judge intent monitoring

Judges themselves operate against declared intents. A judge whose behaviour drifts from its declared evaluation criteria is as dangerous as a task agent that drifts from its operational intent. Judge intents specify:

  • What the judge is evaluating (scope)
  • What criteria it applies (standards)
  • What actions it takes on findings (disposition)
  • What it does NOT evaluate (explicit exclusions)

Judge intent compliance is monitored by a separate evaluator, not the judge itself and not the agents it evaluates. This closes the "who watches the watchmen" loop: not through infinite recursion, but through explicit intent contracts at every level.
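One way such a meta-evaluation might work, sketched minimally (the `judge_agreement` helper and the calibration-set shape are illustrative assumptions, not part of the framework): replay a fixed set of cases with known expected verdicts, derived from the judge's OISpec, and measure agreement.

```python
# Illustrative sketch: detect judge drift by replaying a calibration set of
# (case, expected_verdict) pairs derived from the judge's OISpec and
# measuring agreement. Names and data shapes here are assumptions.

def judge_agreement(judge, calibration_set) -> float:
    """Fraction of calibration cases where the judge's verdict matches."""
    matches = sum(1 for case, expected in calibration_set
                  if judge(case) == expected)
    return matches / len(calibration_set)
```

Agreement below a threshold declared in the judge's own OISpec would escalate to human review rather than to yet another judge, which is how the recursion terminates.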

The Objective Intent Specification (OISpec)

An OISpec is a structured, version-controlled document attached to every agent, judge, and workflow. It is not free-form prose. It is a typed schema that evaluation agents can programmatically consume.

Agent OISpec

{
  "oispec_version": "1.0",
  "agent_id": "agent-analyst-01",
  "agent_role": "task | orchestrator | evaluator | observer",
  "created_by": "developer-id",
  "version": 3,

  "objective": {
    "goal": "Analyse market data and produce a risk score for the specified portfolio",
    "success_criteria": [
      "Risk score is traceable to specific data points",
      "All data sources are from the last 30 days",
      "Uncertainty above 0.3 is explicitly flagged"
    ],
    "failure_criteria": [
      "Risk score produced without citing data sources",
      "Recommendation to trade exceeding $50,000 exposure",
      "Uncertainty stripped from downstream outputs"
    ]
  },

  "parameters": {
    "permitted_tools": ["market-data-api", "news-sentiment-api"],
    "prohibited_actions": ["execute-trade", "send-email", "modify-portfolio"],
    "data_scope": "Portfolio X only. No cross-portfolio access.",
    "time_boundary": "Last 30 calendar days from execution time",
    "output_constraints": {
      "max_confidence_without_source": 0.5,
      "required_fields": ["risk_score", "data_sources", "uncertainty"],
      "prohibited_content": ["trading instructions", "personal advice"]
    }
  },

  "authority": {
    "can_delegate": false,
    "max_delegation_depth": 0,
    "can_create_agents": false,
    "can_modify_workflow": false
  },

  "risk_classification": "high",
  "evaluation_frequency": "every_action | per_phase | post_execution"
}

The critical elements are the success criteria and failure criteria. Success criteria define the positive standard. Failure criteria define the red lines. Both are necessary because intent alignment is not binary: an agent can partially satisfy success criteria while avoiding failure criteria, and the gap between them is where judgment applies.
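As an illustration, the mechanical half of a tactical check against the agent OISpec above might look like the following sketch. Field names follow the example schema; a real tactical judge would also evaluate the semantic success criteria, which this sketch does not attempt.

```python
# Illustrative sketch of the mechanical portion of a tactical OISpec check.
# Field names follow the example agent OISpec; semantic evaluation of
# success criteria (judge-based) is deliberately out of scope here.

def check_action(oispec: dict, action: dict) -> list[str]:
    """Return a list of OISpec violations for a single agent action."""
    violations = []
    params = oispec["parameters"]
    tool = action.get("tool")
    if tool and tool not in params["permitted_tools"]:
        violations.append(f"tool '{tool}' not in permitted_tools")
    if action.get("name") in params["prohibited_actions"]:
        violations.append(f"action '{action['name']}' is prohibited")
    output = action.get("output", {})
    missing = [f for f in params["output_constraints"]["required_fields"]
               if f not in output]
    if missing:
        violations.append(f"output missing required fields: {missing}")
    return violations
```

An empty list means mechanical compliance only; the semantic judgment against success and failure criteria still applies on top.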

Workflow OISpec

{
  "oispec_version": "1.0",
  "workflow_name": "Portfolio Risk Assessment",

  "objective": {
    "goal": "Produce a risk assessment for portfolio X using current market data and news sentiment",
    "aggregate_success_criteria": [
      "Final recommendation is traceable through all intermediate outputs to primary data sources",
      "No agent operated outside its declared OISpec parameters",
      "Combined agent outputs are internally consistent",
      "All uncertainty signals from intermediate agents are preserved in the final output"
    ],
    "aggregate_failure_criteria": [
      "Final output contradicts intermediate analysis without explanation",
      "Two or more agents produced conflicting assessments that were silently resolved",
      "Uncertainty was stripped at any handoff point",
      "Any agent exceeded its declared authority"
    ]
  },

  "agent_intents": ["oispec-id-analyst-01", "oispec-id-sentiment-01", "oispec-id-combiner-01"],
  "judge_intents": ["oispec-id-judge-quality-01", "oispec-id-judge-compliance-01"],

  "evaluation": {
    "tactical_evaluation": "per_agent_per_action",
    "strategic_evaluation": "per_phase_and_post_execution",
    "intent_coverage_check": true
  }
}

The intent_coverage_check flag is significant. Before a workflow executes, the system verifies that every declared success criterion in the workflow OISpec is covered by at least one agent's OISpec. This catches intent gaps before execution, not after.
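A simplified sketch of that pre-execution check, assuming a hypothetical "covers" field on each agent OISpec listing the workflow criteria it addresses (in practice the matching would likely be semantic rather than an explicit mapping):

```python
# Sketch of the pre-execution intent coverage check. The "covers" field is an
# illustrative assumption: each agent OISpec lists which workflow success
# criteria it addresses. Real matching would likely be semantic.

def coverage_gaps(workflow_oispec: dict, agent_oispecs: list[dict]) -> list[str]:
    """Workflow success criteria not covered by any agent OISpec."""
    covered = set()
    for spec in agent_oispecs:
        covered.update(spec.get("covers", []))
    return [c for c in workflow_oispec["objective"]["aggregate_success_criteria"]
            if c not in covered]
```

A non-empty result blocks execution: the gap is fixed in the specifications, not discovered in production.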

How Evaluation Works

Intent flows downward: decomposed from workflow to agent. Evaluation flows upward: aggregated from agent to workflow. Judge monitoring is lateral, independent of both flows.

                    ┌───────────────────────────────────┐
                    │     Workflow Objective Intent     │
                    │     (Strategic: the whole)        │
                    └─────────────────┬─────────────────┘
                                      │
                    ┌─────────────────┴─────────────────┐
                    │    Strategic Evaluation Agent     │
                    │   Assesses aggregate compliance   │
                    └─────────────────┬─────────────────┘
                                      │
               ┌──────────────────────┼──────────────────────┐
               │                      │                      │
    ┌──────────▼─────────┐ ┌──────────▼─────────┐ ┌──────────▼─────────┐
    │  Agent A OISpec    │ │  Agent B OISpec    │ │  Agent C OISpec    │
    │  "Retrieve market  │ │  "Analyse news     │ │  "Combine into     │
    │   data"            │ │   sentiment"       │ │   risk score"      │
    └──────────┬─────────┘ └──────────┬─────────┘ └──────────┬─────────┘
               │                      │                      │
    ┌──────────▼─────────┐ ┌──────────▼─────────┐ ┌──────────▼─────────┐
    │  Tactical Judge    │ │  Tactical Judge    │ │  Tactical Judge    │
    │  (per-agent OISpec │ │  (per-agent OISpec │ │  (per-agent OISpec │
    │   compliance)      │ │   compliance)      │ │   compliance)      │
    └──────────┬─────────┘ └──────────┬─────────┘ └──────────┬─────────┘
               │                      │                      │
               └──────────────────────┼──────────────────────┘
                                      │
                    ┌─────────────────▼─────────────────┐
                    │       Judge Meta-Evaluator        │
                    │  "Are the tactical judges still   │
                    │   evaluating per their OISpecs?"  │
                    └───────────────────────────────────┘

The three evaluation levels work in concert:

| Level | Evaluator | Question it answers |
|---|---|---|
| Tactical | Per-agent judge | Did this agent do what it was supposed to do, within the parameters it was given? |
| Strategic | Workflow evaluator | Did the system as a whole achieve what was intended, and did the agents collectively behave as designed? |
| Judge monitoring | Meta-evaluator | Is the judge still doing what it was configured to do, or has it drifted? |

Controls by Risk Tier

Objective Intent evaluation intensity scales with risk classification. The framework defines controls across three implementation tiers.

Tier 1: Supervised

Every agent has a declared purpose. Every workflow has a declared objective. Humans verify compliance manually. Intent gaps are caught before execution.

| Control | Requirement |
|---|---|
| OI-1.1 Agent OISpec declaration | Every agent must have a declared OISpec, authored by the developer at design time, not generated by the agent |
| OI-1.2 Workflow OISpec declaration | Every multi-agent workflow must have a workflow-level OISpec referencing all participating agent OISpecs |
| OI-1.3 OISpec immutability | OISpecs are read-only during execution; changes require human authorisation and create a new version |
| OI-1.4 Manual intent review | Human reviewer compares agent outputs against the agent's OISpec weekly |
| OI-1.5 Intent coverage check | Before workflow execution, verify that every success criterion in the workflow OISpec is covered by at least one agent's OISpec |
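One way OI-1.3 could be enforced in practice, sketched under stated assumptions (the hashing scheme is illustrative, not a framework requirement): record a content digest of each OISpec at execution start, so any mid-execution mutation is detectable and authorised changes surface as a new version with a new digest.

```python
# Sketch: content-address each OISpec so mid-execution mutation is detectable.
# Canonical JSON serialisation makes the digest stable across key ordering.
import hashlib
import json

def oispec_digest(oispec: dict) -> str:
    canonical = json.dumps(oispec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

The digest captured at workflow start is compared against the live spec before each evaluation; a mismatch is itself an OISpec violation.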

Tier 2: Managed

Automated evaluation at both tactical and strategic levels. Judges are themselves monitored against declared intents. Combined actions are evaluated for emergent violations.

| Control | Requirement |
|---|---|
| OI-2.1 Automated tactical evaluation | Tactical judge evaluates each agent's actions against its OISpec at the frequency defined in the OISpec |
| OI-2.2 Automated strategic evaluation | Strategic evaluation agent assesses aggregate outputs against workflow OISpec at phase boundaries and post-execution |
| OI-2.3 Judge OISpec declaration | Every judge must have a declared OISpec specifying its evaluation scope, criteria, and permitted actions |
| OI-2.4 Judge intent monitoring | Judge behaviour is evaluated against its OISpec by an independent meta-evaluator |
| OI-2.5 OISpec violation escalation | Violations trigger defined escalation paths: HIGH risk gets immediate human notification, CRITICAL gets output quarantined |
| OI-2.6 Intent alignment scoring | Each agent receives a continuous intent alignment score that feeds into anomaly detection |
| OI-2.7 Combined action evaluation | Strategic evaluator checks for emergent outcomes that comply with individual OISpecs but violate the workflow OISpec |

OI-2.7 is the control that catches "every agent did its job, but the result is wrong." This is the most dangerous class of multi-agent failure, and it is invisible without strategic-level evaluation against a declared workflow intent.
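To make OI-2.7 concrete, consider the workflow failure criterion above: "uncertainty was stripped at any handoff point". Every agent can pass its own tactical check while the chain still violates the workflow intent. A minimal sketch of that aggregate check (the handoff record shape is an assumption for illustration):

```python
# Sketch: detect uncertainty flags dropped at handoffs between agents — a
# workflow-level violation that per-agent tactical checks cannot see.

def stripped_flags(handoffs: list[dict]) -> list[str]:
    """Uncertainty flags present at a handoff's input but absent at its output."""
    lost = []
    for h in handoffs:
        lost += [f for f in h["in_flags"] if f not in h["out_flags"]]
    return lost
```

A non-empty result matches an aggregate failure criterion in the workflow OISpec even when every individual agent OISpec was satisfied.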

Tier 3: Autonomous

Real-time intent compliance, predictive analysis, and continuous improvement of the intent specification process itself.

| Control | Requirement |
|---|---|
| OI-3.1 Continuous intent compliance | Tactical evaluation runs on every agent action in real time, preventing non-compliant actions from propagating |
| OI-3.2 Predictive intent analysis | Strategic evaluator projects whether the current trajectory will satisfy the workflow OISpec at completion |
| OI-3.3 Cross-workflow intent learning | Patterns from strategic evaluation across many workflows feed into OISpec template improvement |
| OI-3.4 Judge rotation with intent continuity | When judge models are rotated, the replacement is validated against the same OISpec before assuming the role |
| OI-3.5 Adversarial intent testing | Red team exercises include attempts to satisfy the letter of an OISpec while violating its spirit |
| OI-3.6 Intent decomposition validation | When an orchestrator decomposes a workflow OISpec into agent OISpecs, an independent evaluator verifies the decomposition is faithful and complete |

Evaluation Frequency by Risk

| Risk Tier | Tactical Evaluation | Strategic Evaluation | Judge Monitoring | OISpec Review |
|---|---|---|---|---|
| LOW | Post-execution | Post-execution | Monthly calibration | Quarterly |
| MEDIUM | Per phase | Post-execution | Monthly calibration | Monthly |
| HIGH | Every action | Per phase + post-execution | Weekly calibration | Monthly |
| CRITICAL | Every action (synchronous) | Per phase + post-execution (output quarantined) | Daily calibration | On every OISpec change |

HIGH and CRITICAL risk classifications are where Objective Intent matters most. At these tiers, the cost of non-compliance with declared intent is material: financial loss, regulatory violation, patient harm, or irreversible action.

How This Extends Existing Controls

Objective Intent does not replace existing MASO controls. It provides the formal reference standard that several existing controls implicitly require.

| Existing Control | How Objective Intent Extends It |
|---|---|
| PG-1.3 Immutable task specification | OISpec is more granular: per-agent, per-judge, and per-workflow, with typed success/failure criteria |
| PG-2.2 Goal integrity monitoring | OISpec provides the formal reference standard that goal integrity is measured against |
| PA-2.1 Orchestrator intent verification | OISpec verification is continuous, not just at decomposition time |
| PA-2.2 Judge calibration | Judge OISpecs make calibration criteria explicit and evaluable, not just accuracy targets |
| OB-2.2 Anomaly scoring | Intent alignment score becomes an additional signal in the anomaly scoring vector |
| The Intent Layer | OISpec provides the structured Intent Specification that the post-execution judge evaluates against, now extended to every agent and every judge |

What This Does Not Solve

Intellectual honesty requires mapping the limits.

Intent is hard to write well. Specifying what an agent should do in sufficient detail for machine evaluation is non-trivial. Vague OISpecs ("produce good analysis") are unevaluable. Overly specific OISpecs ("respond with exactly these fields in this order") are brittle. The right level of specificity depends on the use case and risk tier. Start with HIGH/CRITICAL risk workflows. Use the cross-workflow learning feedback loop (OI-3.3) to improve OISpec templates over time.

An agent can satisfy the letter of its OISpec while violating the spirit. An agent told "cite all sources" can cite sources that do not support the claim. An agent told "flag uncertainty above 0.3" can present everything at 0.29. Adversarial intent testing (OI-3.5) probes for this. Strategic evaluation (OI-2.2) catches cases where individual compliance produces aggregate non-compliance. Failure criteria in the OISpec are as important as success criteria.

The meta-evaluator can also drift. The judge meta-evaluator operates against its own OISpec. Who evaluates the meta-evaluator? At some point, the chain terminates in sampled human review. The recursion stops at humans, not at more agents.

OISpec maintenance is an ongoing cost. As agents evolve, tools change, and requirements shift, OISpecs must be updated. Stale OISpecs produce false positives (flagging correct behaviour that does not match outdated criteria) or false negatives (missing violations of updated requirements). OISpec versioning with change tracking mitigates this, as do alerts when agent behaviour consistently triggers evaluation findings that human reviewers then override: that pattern signals that the OISpec needs updating, not the agent.

Common Pitfalls

Writing OISpecs after deployment, not before. If the OISpec is written to match existing agent behaviour rather than to declare intended behaviour, it becomes a description, not a contract. OISpecs must be authored at design time and agents must be built to satisfy them.

Declaring intent for task agents but not for judges. A judge without a declared OISpec is a black box with authority. Its evaluation criteria are implicit, its scope is undefined, and its drift is undetectable. Judges need OISpecs as much as task agents do.

Evaluating individual agents but not combined actions. The most dangerous failures in multi-agent systems are emergent: they only appear when individually correct actions combine into collectively wrong outcomes. The strategic evaluation layer exists specifically to catch these.

Treating OISpec compliance as binary. Intent alignment is a spectrum. An agent at 95% compliance over 100 actions is different from an agent at 100% compliance over 10 actions. The intent alignment score (OI-2.6) must be continuous and trended, not just pass/fail.
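As a sketch of what "continuous and trended" could mean in practice (the smoothing constant and helper name are illustrative assumptions): an exponentially weighted moving average over per-action compliance makes a slow slide visible long before any pass/fail threshold trips.

```python
# Sketch: trend per-action compliance as an EWMA so gradual drift is visible
# as a declining score rather than a sudden pass/fail flip. Alpha is an
# illustrative choice, not a framework-mandated value.

def alignment_trend(compliant: list[bool], alpha: float = 0.2) -> list[float]:
    """Running intent alignment score in [0, 1]; lower means drifting."""
    score, trend = 1.0, []
    for ok in compliant:
        score = (1 - alpha) * score + alpha * (1.0 if ok else 0.0)
        trend.append(round(score, 3))
    return trend
```

Feeding this score into anomaly detection (per OI-2.6) turns intent alignment into a monitored signal rather than a point-in-time verdict.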

Assuming the OISpec captures everything that matters. No specification is complete. OISpecs will always miss edge cases, novel scenarios, and failure modes nobody anticipated. This is why strategic evaluation, adversarial testing, and sampled human review exist: they catch what the specification does not cover.

Maturity Model

| Level | Indicator |
|---|---|
| Initial | Agents deployed without declared intent. Goals implied by system prompts but not formally specified. No evaluation against stated objectives. |
| Managed | OISpecs declared for all agents and workflows. Manual review compares behaviour against intent. Intent coverage checked before execution. |
| Defined | Automated tactical and strategic evaluation operational. Judges have declared OISpecs. Judge intent monitoring active. Intent alignment scoring feeds anomaly detection. |
| Quantitatively Managed | Tactical evaluation accuracy measured. Strategic evaluation false positive/negative rates tracked. Intent alignment scores trended. OISpec coverage metrics published. |
| Optimising | Real-time intent compliance. Predictive intent analysis. Cross-workflow learning improving OISpec templates. Adversarial intent testing refining specifications. |

Relationship to Other Framework Pages

| Page | Connection |
|---|---|
| The Intent Layer | Post-execution semantic evaluation against declared intent. OISpec provides the structured specification the post-execution judge evaluates against. |
| Containment Through Intent | How declared intent flows through the defence stack. Objective Intent formalises the specification that containment is built on. |
| What Scales | Scaling properties of security patterns. OISpec evaluation scales at O(W) per workflow, not O(N) per agent. |
| The Judge Detects, Not Decides | The judge's role in the control stack. OISpec gives the judge explicit evaluation criteria instead of generic standards. |
| Infrastructure Beats Instructions | Why intent must be enforced through infrastructure. OISpec is the formal input to guardrail configuration. |

The Honest Position

Objective Intent adds the layer that existing controls implicitly require but do not formally define. Without it, the framework catches mechanical faults and evaluates generic quality. With it, the framework can assess whether the system is doing what the developer designed it to do.

The OISpec is a contract between the developer and the evaluation infrastructure. The developer declares what each agent, each judge, and the overall workflow is supposed to achieve. The evaluation infrastructure holds the system accountable to that declaration. Sampled human review holds the evaluation infrastructure accountable.

This is not a guarantee of correctness. An OISpec can be incomplete, imprecise, or wrong. But it makes the developer's expectations explicit, versioned, and evaluable, and that is the prerequisite for every form of assurance that follows.