# 3. Epistemic Integrity

After this module you will be able to:
- Define epistemic integrity as a measurable engineering property of agent chains
- Design a verification receipt data structure for tracking reasoning-basis integrity
- Identify which integrity checks must happen at runtime vs. design-time
- Implement interfaces that propagate data quality metadata across agent boundaries
## Not philosophy: engineering
The term "epistemic integrity" sounds academic. In engineering terms, it means something concrete:
> An agent's output is only trustworthy if the data it reasoned over was complete, current, and correctly scoped for the task.
This is not a subjective judgement. It is measurable. You can instrument it. You can set thresholds. You can alert on it. This module shows you how.
**The engineering definition:** Epistemic integrity is the property that an agent's stated conclusions are warranted by the data it actually accessed. When you can measure and verify this property at runtime, you have epistemic integrity monitoring. When you can't, you have the Phantom Compliance problem.
## From concept to data structure
Traditional software systems have well-established patterns for tracking data provenance. When a function returns a result, you can trace which database queries produced the inputs, whether those queries returned complete results, and whether the data was fresh. You do this with request context, database connection metadata, and cache headers.
Agent systems need the same thing, but the "database query" is a retrieval step, the "function" is an LLM inference, and the "result" is a natural-language output that carries no built-in provenance.
The solution is a verification receipt, a structured metadata object that travels alongside every agent output, recording what data the agent accessed and how complete that data was.
## The verification receipt pattern
Here is a concrete schema:
```json
{
  "receipt_id": "vr-2025-03-15-14-22-01-agent-b",
  "agent_id": "compliance-agent-b",
  "chain_id": "trade-review-4847",
  "timestamp": "2025-03-15T14:22:02Z",
  "reasoning_basis": {
    "data_sources": [
      {
        "source_id": "restricted-securities-vectorstore",
        "query": "restricted_securities_check",
        "expected_result_count": 312,
        "actual_result_count": 47,
        "completeness_ratio": 0.15,
        "freshness": "2025-03-15T14:21:58Z",
        "truncation_occurred": true,
        "truncation_reason": "context_window_limit"
      }
    ],
    "tool_calls": [],
    "context_window": {
      "capacity_tokens": 128000,
      "used_tokens": 94000,
      "utilisation": 0.73,
      "truncation_occurred": false
    }
  },
  "output_metadata": {
    "stated_confidence": 0.94,
    "warranted_confidence": 0.15,
    "confidence_gap": 0.79,
    "claims": [
      {
        "claim": "No restricted securities found in proposed trade",
        "basis": "Checked 47 of 312 known restricted securities",
        "coverage": 0.15
      }
    ]
  },
  "integrity_verdict": {
    "pass": false,
    "flags": ["retrieval_completeness_below_threshold"],
    "recommended_action": "escalate_or_retry"
  }
}
```
Let's break down the critical fields.
### `reasoning_basis.data_sources`
This section records every data access the agent made during its reasoning process. For each source:
- `expected_result_count`: How many results should this query return? This can come from a precomputed baseline, a count query against the source, or a configured threshold.
- `actual_result_count`: How many results did the agent actually receive?
- `completeness_ratio`: The ratio of actual to expected. This is the single most important metric for detecting Phantom Compliance-style failures.
- `truncation_occurred`: A boolean flag indicating whether the data was truncated before reaching the agent.
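A per-source summary like the one in the schema above can be assembled from the raw counts. The helper below is an illustrative sketch (the name `summarise_source` is an assumption, not part of any framework):

```python
from typing import Optional


def summarise_source(expected: Optional[int], actual: int,
                     truncated: bool) -> dict:
    """Build the per-source completeness summary recorded in a receipt.

    Illustrative sketch; field names mirror the schema above.
    """
    # Ratio is undefined when no baseline is registered for the source.
    ratio = actual / expected if expected and expected > 0 else None
    return {
        "expected_result_count": expected,
        "actual_result_count": actual,
        "completeness_ratio": round(ratio, 2) if ratio is not None else None,
        "truncation_occurred": truncated,
    }


# The Phantom Compliance numbers: 47 of 312 restricted securities.
summary = summarise_source(expected=312, actual=47, truncated=True)
```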
### `output_metadata.confidence_gap`
The confidence gap is the difference between what the agent claims its confidence is and what the data warrants. Agent B reported 94% confidence on a compliance check that covered 15% of the restricted securities list. The gap is 0.79, a clear signal that something is wrong.
Computing warranted confidence is domain-specific, but a simple heuristic works well: if you checked 15% of the data, your maximum warranted confidence in a "nothing found" claim is approximately 15%. Any confidence above that is unwarranted.
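With the Agent B numbers, the heuristic is a two-line calculation:

```python
# Heuristic from the text: for a "nothing found" claim, warranted
# confidence is capped by the fraction of the data actually checked.
checked, total = 47, 312
stated_confidence = 0.94

warranted = checked / total                  # ~0.15
gap = stated_confidence - warranted          # ~0.79
```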
### `integrity_verdict`
The verdict is a computed field based on configurable rules:
- Completeness ratio below threshold? Flag.
- Confidence gap above threshold? Flag.
- Truncation occurred on a critical data source? Flag.
If any flag is set, the receipt's pass field is false, and the recommended action tells the system what to do next.
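The three rules can be sketched as a small evaluation function. The thresholds (0.8 completeness, 0.2 gap) and the flag names are illustrative defaults, not prescribed values:

```python
def integrity_verdict(source: dict, confidence_gap: float,
                      min_completeness: float = 0.8,
                      max_gap: float = 0.2) -> dict:
    """Apply the three configurable rules and compute the verdict."""
    flags = []
    if source["completeness_ratio"] < min_completeness:
        flags.append("retrieval_completeness_below_threshold")
    if confidence_gap > max_gap:
        flags.append("confidence_gap_above_threshold")
    # Treat sources as critical unless explicitly marked otherwise.
    if source["truncation_occurred"] and source.get("critical", True):
        flags.append("critical_source_truncated")
    return {
        "pass": not flags,
        "flags": flags,
        "recommended_action": "escalate_or_retry" if flags else "proceed",
    }


verdict = integrity_verdict(
    {"completeness_ratio": 0.15, "truncation_occurred": True},
    confidence_gap=0.79,
)
```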
## Where the receipt lives in your architecture
The verification receipt is not a log entry. It is a first-class data object that travels with the agent's output through the chain.
**The architectural principle:** Verification receipts make data quality a first-class citizen in inter-agent communication. Just as HTTP responses carry headers with cache metadata and content types, agent outputs carry receipts with reasoning-basis metadata. Downstream agents can make informed trust decisions instead of blindly accepting upstream outputs.
## Runtime vs. design-time checks
Not every integrity check needs to happen at runtime. Some can be verified at design time (during development and deployment), and some must be verified at runtime (during execution). Getting this distinction right is critical for both performance and coverage.
### Design-time checks
These checks verify structural properties that don't change between requests:
| Check | What it verifies | How to implement |
|---|---|---|
| Schema validation | Receipt structure is well-formed | JSON Schema validation in CI/CD |
| Source registration | Every data source has an expected result count or baseline | Configuration validation at deploy time |
| Threshold configuration | Completeness thresholds are set for every critical source | Config review + automated validation |
| Receipt propagation | Every agent in the chain produces and consumes receipts | Integration tests that verify receipt flow |
| Truncation handling | Framework is configured to emit truncation events | Unit tests for context assembly logic |
Design-time checks are your safety net against misconfiguration. They ensure that the runtime checks can work.
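As a concrete example of the source-registration and threshold-configuration checks, a deploy-time validator can reject any configuration with missing baselines. The function and field names here are assumptions for illustration:

```python
def validate_source_config(sources: list) -> list:
    """Deploy-time check: every registered data source must carry a
    baseline and a completeness threshold (illustrative field names)."""
    errors = []
    for src in sources:
        for required in ("source_id", "expected_result_count",
                         "completeness_threshold"):
            if required not in src:
                name = src.get("source_id", "<unnamed>")
                errors.append(f"{name}: missing {required}")
    return errors


errors = validate_source_config([
    {"source_id": "restricted-securities-vectorstore",
     "expected_result_count": 312},  # threshold missing: caught at deploy
])
```

Running this in CI/CD means a misconfigured source fails the build rather than silently disabling a runtime check.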
### Runtime checks
These checks must execute on every request because their results depend on the specific data being processed:
| Check | What it verifies | When it runs |
|---|---|---|
| Retrieval completeness | Actual vs. expected result count for this specific query | After every retrieval step |
| Context utilisation | How much of the context window is used; whether truncation occurred | Before every LLM inference |
| Response freshness | Tool response timestamps are within acceptable staleness bounds | After every tool call |
| Cross-agent consistency | Downstream confidence is warranted by upstream data quality | At every inter-agent boundary |
| Confidence gap | Stated confidence does not exceed warranted confidence | Before emitting any agent output |
Runtime checks add latency. The retrieval completeness check requires either a count query against the source or a comparison against a cached baseline. Context utilisation checks are essentially free (you already have the token count). Response freshness checks require parsing one additional field from the tool response.
In practice, the total overhead for runtime epistemic integrity checks is 50-200ms per agent boundary, negligible compared to LLM inference time.
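Of the checks above, response freshness is the simplest to sketch: compare the tool response timestamp against a configured staleness bound. The helper name and signature are illustrative assumptions:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_fresh(response_ts: datetime, max_staleness: timedelta,
             now: Optional[datetime] = None) -> bool:
    """Runtime freshness check: a tool response passes if its
    timestamp is within the configured staleness bound."""
    now = now or datetime.now(timezone.utc)
    return (now - response_ts) <= max_staleness


# Using the timestamps from the receipt example: the retrieval at
# 14:21:58 is 4 seconds old at receipt time, well within a 30s bound.
now = datetime(2025, 3, 15, 14, 22, 2, tzinfo=timezone.utc)
fresh = is_fresh(datetime(2025, 3, 15, 14, 21, 58, tzinfo=timezone.utc),
                 timedelta(seconds=30), now=now)
```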
## Implementing the interface
Here is a minimal interface definition for agents that support epistemic integrity:
```python
from dataclasses import dataclass, field
from typing import Optional
from datetime import datetime


@dataclass
class DataSourceAccess:
    source_id: str
    query: str
    expected_count: Optional[int]
    actual_count: int
    freshness: datetime
    truncated: bool
    truncation_reason: Optional[str] = None

    @property
    def completeness_ratio(self) -> Optional[float]:
        if self.expected_count and self.expected_count > 0:
            return self.actual_count / self.expected_count
        return None


@dataclass
class VerificationReceipt:
    agent_id: str
    chain_id: str
    timestamp: datetime
    data_sources: list[DataSourceAccess] = field(default_factory=list)
    context_utilisation: float = 0.0
    context_truncated: bool = False
    stated_confidence: float = 0.0

    @property
    def warranted_confidence(self) -> float:
        """Minimum completeness ratio across all data sources."""
        ratios = [ds.completeness_ratio for ds in self.data_sources
                  if ds.completeness_ratio is not None]
        if not ratios:
            return 0.0
        return min(ratios)

    @property
    def confidence_gap(self) -> float:
        return max(0.0, self.stated_confidence - self.warranted_confidence)

    @property
    def integrity_pass(self) -> bool:
        return (
            self.confidence_gap < 0.2
            and all(
                (ds.completeness_ratio or 0) > 0.8
                for ds in self.data_sources
            )
            and not self.context_truncated
        )


@dataclass
class AgentOutput:
    content: str
    receipt: VerificationReceipt
    upstream_receipts: list[VerificationReceipt] = field(
        default_factory=list
    )
```
**The critical design choice:** AgentOutput bundles the content with its receipt and the chain of upstream receipts. Any downstream agent (or any monitoring system) can inspect the full provenance chain.
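A downstream agent then gates on the upstream receipt before consuming the content. The sketch below uses a minimal stand-in object rather than the full `VerificationReceipt` class; in a real chain, anything exposing `integrity_pass` and `agent_id` works the same way:

```python
class StubReceipt:
    """Stand-in for a VerificationReceipt that failed its checks."""
    integrity_pass = False
    agent_id = "compliance-agent-b"


def accept_or_escalate(content: str, receipt) -> str:
    """Downstream gate: consume upstream output only if its
    receipt passed the integrity checks; otherwise escalate."""
    if not receipt.integrity_pass:
        return f"escalated: upstream receipt from {receipt.agent_id} failed"
    return content


result = accept_or_escalate("CLEAR", StubReceipt())
```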
## What this catches
Let's replay the Phantom Compliance scenario with verification receipts in place:
1. Agent B retrieves 47 of 312 restricted securities. The receipt records `completeness_ratio: 0.15` and `truncated: true`.
2. Agent B produces CLEAR with 94% confidence. The receipt computes `warranted_confidence: 0.15` and `confidence_gap: 0.79`. The `integrity_pass` field is `false`.
3. Agent C receives Agent B's output and receipt. Before proceeding, Agent C checks `receipt.integrity_pass`. It is `false`. Agent C halts the chain and escalates.
Result: The trade is not approved. The failure is detected at runtime, at the inter-agent boundary, before any damage occurs. Total added latency: approximately 100ms for the receipt computation and check.
## What this doesn't catch
Verification receipts are not a complete solution. They have limitations:
- Unknown unknowns: If you don't configure an expected result count for a data source, the completeness ratio can't be computed. You need to register your data sources and their baselines.
- Semantic completeness: Receipts track whether the agent got enough data, not whether it got the right data. If all 312 restricted securities are returned but the query was wrong and retrieved the wrong list, the receipt will show 100% completeness on the wrong data.
- Gaming: If an agent is adversarial (or hallucinating), it could generate a receipt that claims high completeness. Receipts should be computed by the framework layer, not by the agent itself.
These limitations are addressed by the MASO controls in Module 4, which add independent verification layers on top of the receipt pattern.
**Reflection**
Look at the warranted_confidence calculation in the code above. It uses the minimum completeness ratio across all data sources. Is this the right heuristic for your use case? When might you want the average instead? When might you want a weighted calculation based on source criticality?
**Consider**
For compliance-critical systems (like the Phantom Compliance scenario), the minimum is correct, because your confidence can only be as high as your weakest data source. For advisory systems where multiple sources provide overlapping information, an average or weighted approach might be more appropriate. The key is that you are making this decision explicitly rather than defaulting to the agent's self-reported confidence.