The Supply Chain Problem

You Don't Control the Model You Deploy

Every enterprise AI system depends on components you didn't build:

Foundation models from OpenAI, Anthropic, Google, Meta
Open-source models downloaded from Hugging Face, Ollama, or similar
Frameworks like LangChain, LlamaIndex, CrewAI, AutoGen
Embedding models that encode your proprietary data
Vector databases that store and retrieve that data

You test your application. You monitor your outputs. But you have no visibility into whether the model you're calling today is the same model you evaluated last month.

The Risks Are Not Hypothetical

Risk	Example
Model update without notice	Provider updates weights or system prompt; your evaluated baseline is invalid
Dependency compromise	Vulnerable or malicious package in your AI toolchain (LangChain had CVEs in 2023–2024)
Model poisoning	Open-source model weights tampered with before you download them
Embedding drift	Embedding model update changes retrieval behaviour across your RAG pipeline
Shadow AI	Teams deploy models you haven't evaluated, using your data
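
The embedding drift row can be made concrete with a small probe check. The following is a minimal sketch, assuming you stored embeddings for a fixed set of probe texts when you approved the embedding model. The names `find_drifted_probes` and `embed`, and the 0.99 threshold, are illustrative assumptions rather than part of any library.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_drifted_probes(
    baseline: Dict[str, Vector],
    embed: Callable[[str], Vector],  # hypothetical wrapper around your embedding model
    min_similarity: float = 0.99,    # assumed threshold; tune for your model
) -> List[str]:
    """Return probe texts whose fresh embedding no longer matches the stored one."""
    drifted = []
    for text, old_vec in baseline.items():
        new_vec = embed(text)
        # A changed dimension is treated as drift without computing similarity.
        if len(new_vec) != len(old_vec):
            drifted.append(text)
        elif cosine_similarity(old_vec, new_vec) < min_similarity:
            drifted.append(text)
    return drifted
```

A non-empty result suggests the embedding model changed and that stored vectors may need re-indexing before retrieval results can be trusted.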

Traditional software has SBOMs (software bills of materials). AI systems need equivalent provenance documentation: what NDAA and EU AI Act drafters are calling "AI-BOMs."
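
To illustrate what such a record might contain, here is a minimal sketch of an AI-BOM as a JSON document. The classes `AIComponent` and `AIBom` and all field names are assumptions made for this example; they do not follow the schema of any published standard.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class AIComponent:
    name: str          # e.g. the model or package name
    kind: str          # illustrative values: "foundation-model", "framework", "embedding-model", "vector-db"
    version: str       # pinned version, checkpoint, or snapshot identifier
    source: str        # where it was obtained (provider API, registry, internal mirror)
    sha256: str = ""   # left empty for API-accessed models, which cannot be hashed


@dataclass
class AIBom:
    application: str
    components: List[AIComponent] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialise the inventory in a stable, diff-friendly form."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

Generating this file at release time and committing it alongside the build makes changes to any component visible in an ordinary diff.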

What the Framework Misses

The three-layer pattern (Guardrails → Judge → Human) monitors runtime behaviour. It doesn't address:

Whether the model you're monitoring is the model you approved
Whether the framework dependencies are trustworthy
Whether the model weights have integrity
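
The first two gaps can be partly closed with a deployment gate that compares what is running against what was approved. The sketch below assumes your provider's response exposes some model identifier (passed in here as `reported_model`) and that approved package versions are kept in a manifest; `check_against_approval` is a hypothetical helper. It checks version pins only, so it says nothing about whether a pinned dependency is itself trustworthy.

```python
from importlib import metadata
from typing import Callable, Dict, List


def check_against_approval(
    approved_model: str,
    reported_model: str,
    approved_packages: Dict[str, str],
    get_version: Callable[[str], str] = metadata.version,
) -> List[str]:
    """Return a list of mismatches between the approved manifest and the runtime."""
    problems = []
    if reported_model != approved_model:
        problems.append(
            f"model: approved {approved_model!r}, provider reports {reported_model!r}"
        )
    for package, wanted in approved_packages.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            problems.append(f"{package}: not installed (approved {wanted})")
            continue
        if installed != wanted:
            problems.append(f"{package}: approved {wanted}, installed {installed}")
    return problems
```

An empty list means the runtime matches the manifest; anything else is a reason to block the deployment or raise an alert.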

Runtime monitoring detects symptoms. Supply chain controls prevent the disease.

The Pattern

Control	What It Does	When
Model pinning	Lock to specific model version/checkpoint	Deployment
Dependency scanning	Audit AI framework dependencies for vulnerabilities	CI/CD
Weight verification	Hash-based integrity checks on downloaded models	Download + deployment
Provider change monitoring	Detect when API-accessed models change behaviour	Continuous
AI-BOM generation	Document all AI components, versions, and sources	Release
Shadow AI discovery	Identify unsanctioned model usage across the enterprise	Periodic
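
As one example from the table, weight verification can be sketched with a SHA-256 check using only the standard library. The function names are illustrative. The check is only meaningful if `expected_sha256` comes from a channel you trust independently of the download itself (for instance, your own AI-BOM recorded at approval time); a hash fetched from the same compromised source proves nothing.

```python
import hashlib
import hmac
from pathlib import Path
from typing import Union


def sha256_of_file(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-gigabyte weight files never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(path: Union[str, Path], expected_sha256: str) -> bool:
    """Return True only if the file's SHA-256 matches the recorded value."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

Running the same check at download time and again at deployment covers tampering that happens while the file sits in storage.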

The Uncomfortable Truth

For API-accessed models (OpenAI, Anthropic, etc.), you cannot verify model integrity. You are trusting the provider. Your controls are:

Contractual: SLAs that require change notification
Behavioural: Continuous evaluation that detects drift (this is where the Judge helps)
Architectural: Abstraction layers that let you switch providers if trust breaks
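
The behavioural control can be sketched as a scheduled canary run. The example below assumes every provider is wrapped behind a single callable type (`ModelFn`, a hypothetical name), which also serves as a very thin abstraction layer. The substring check is a deliberately simple stand-in for a real Judge, and the 10% failure threshold is an assumption to tune.

```python
from typing import Callable, Dict, List

# Any provider is wrapped as "prompt in, text out", so swapping providers
# means swapping one function.
ModelFn = Callable[[str], str]


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so trivial formatting changes are ignored."""
    return " ".join(text.lower().split())


def run_canaries(model: ModelFn, expected: Dict[str, str]) -> List[str]:
    """Return the canary prompts whose answer no longer contains the expected text."""
    failures = []
    for prompt, must_contain in expected.items():
        answer = normalise(model(prompt))
        if normalise(must_contain) not in answer:
            failures.append(prompt)
    return failures


def drift_detected(
    model: ModelFn,
    expected: Dict[str, str],
    max_failure_rate: float = 0.1,  # assumed tolerance for ordinary output variation
) -> bool:
    """Flag drift when the share of failed canaries exceeds the tolerated rate."""
    if not expected:
        return False
    failures = run_canaries(model, expected)
    return len(failures) / len(expected) > max_failure_rate
```

Because model output is not deterministic, a rate threshold across many canaries is usually more useful than alerting on any single failure.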

For self-hosted models, you have more control but more responsibility. You own the full chain from download to deployment.

Neither is inherently safer. Both need explicit controls.

AI Runtime Behaviour Security, 2026 (Jonathan Gill).