
Engineering Leads Track

By the end of this track, you will be able to:

  • Identify the production failure modes in multi-agent systems that conventional tooling misses
  • Evaluate your current observability stack against chain-level integrity requirements
  • Design verification receipt patterns and epistemic integrity interfaces for agent chains
  • Select and implement MASO controls with concrete integration patterns for common agent frameworks
  • Build instrumentation, dashboards, and testing pipelines that catch Phantom Compliance-style failures
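To make the "verification receipt" objective concrete, here is a minimal sketch of what such a receipt might look like as a data structure. All names and fields here are illustrative assumptions, not part of any real framework or of MASO itself; the point is that a receipt separates what an agent *claims* from what was independently *checked*:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import hashlib
import json


@dataclass(frozen=True)
class VerificationReceipt:
    """Hypothetical receipt emitted after each agent step.

    Records what was claimed, what evidence the claim rests on, and who
    checked it, so downstream agents (and auditors) can distinguish
    'verified' from merely 'asserted'.
    """
    agent_id: str
    step_id: str
    claim: str        # what the agent says it did
    evidence: str     # raw evidence backing the claim
    checked_by: str   # verifier identity; "self" attestation is a red flag
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    @property
    def evidence_hash(self) -> str:
        # Hash the evidence so the receipt can be logged compactly
        # and tampering with it is detectable.
        return hashlib.sha256(self.evidence.encode()).hexdigest()

    def to_log_line(self) -> str:
        # One JSON line per step, suitable for existing log pipelines.
        return json.dumps({
            "agent": self.agent_id,
            "step": self.step_id,
            "claim": self.claim,
            "evidence_sha256": self.evidence_hash,
            "checked_by": self.checked_by,
            "ts": self.timestamp,
        })
```

A chain orchestrator would emit one of these per step and refuse to pass a step's output downstream without one. Modules 3 and 4 cover the real interfaces in detail.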

Your thread

As an engineering lead, your job is to build systems that work reliably in production, and to know when they aren't working before your customers do. Multi-agent AI systems introduce failure modes that don't trigger alerts, don't produce errors, and don't show up in your existing dashboards. This track gives you the "what do I actually build" story.

The golden thread: threat model → what breaks in practice → which controls are runtime vs design-time → what instrumentation looks like.

Every module follows the same five-beat structure, moving from problem to solution:

  1. What goes wrong (scenario-driven)
  2. Why current controls don't catch it (the gap)
  3. What epistemic integrity means in this context (the concept)
  4. Which MASO controls address it (the framework)
  5. How to verify it's working (the evidence)

Modules

  1. What Breaks in Practice (~20 min)
     Concrete production failure modes: the things that page you at 2am.
  2. Why Current Tools Miss It (~20 min)
     Your observability stack, LangSmith, LangFuse: what they give you and what they don't.
  3. Epistemic Integrity (~25 min)
     An engineering requirement, not a philosophy lecture: data structures and interfaces.
  4. MASO Controls (~30 min)
     Execution Control, Observability, Data Protection: what you actually build.
  5. Instrumentation & Evidence (~25 min)
     Metrics, dashboards, canary tests, and chaos testing for agent chains.
  Decision Exercise (~15 min)
     Ambiguous production signals: what do you do next?

Total estimated time: ~2.5 hours
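As a taste of the canary testing covered in Module 5, a Phantom Compliance canary might scan a chain's step log and fail the pipeline when any step reports success without an independently checked receipt. The log schema below is a hypothetical sketch, not a real framework's format:

```python
def find_phantom_steps(chain_log: list[dict]) -> list[str]:
    """Return ids of steps that report success but were never
    independently verified: no receipt, or the agent attested to
    its own work."""
    phantom = []
    for step in chain_log:
        self_attested = step.get("checked_by") == step.get("agent")
        unverified = not step.get("receipt") or self_attested
        if step.get("status") == "success" and unverified:
            phantom.append(step["step"])
    return phantom


# Example: the deploy step claims success, but the deployer
# checked its own work, so the canary flags it.
chain_log = [
    {"step": "plan", "agent": "planner", "status": "success",
     "receipt": True, "checked_by": "reviewer"},
    {"step": "deploy", "agent": "deployer", "status": "success",
     "receipt": True, "checked_by": "deployer"},  # self-attested
]

assert find_phantom_steps(chain_log) == ["deploy"]
```

Running a check like this on synthetic "lying agent" traces in CI is one way to prove the detection path works before a real failure exercises it.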


Prerequisites

Before starting this track, complete the core scenario, which introduces the Phantom Compliance failure that every module references. If you have already read it, start Module 1.


What this track is not

This track does not cover prompt engineering, model fine-tuning, or single-agent guardrail configuration. Those are important, but well-covered elsewhere. This track focuses on the multi-agent runtime problems that existing resources don't address: the failures that happen between agents, not within them.


Start Module 1 →