Supply Chain Security Controls

Part of the AI Security Infrastructure Controls framework. Companion to AI Runtime Behaviour Security.

Overview

AI supply chains extend far beyond traditional software dependencies. They include foundation models with opaque training data, fine-tuning datasets that can introduce backdoors, RAG knowledge bases that become part of the model's effective reasoning, third-party tools and plugins that agents invoke autonomously, and guardrail/safety models that are themselves machine learning systems. Compromise at any point in this chain can undermine every downstream security control.

These eight controls establish verification, provenance, and integrity requirements across the full AI supply chain.

SUP-01 — Verify Model Provenance and Integrity

Risk Tiers: All

Objective

Ensure that every model deployed in production can be traced to a verified source, with cryptographic proof that it has not been modified since publication.

Requirements

Requirement	Description
Source verification	Models must be obtained from verified publishers or approved internal registries. Models must not be downloaded from unverified sources, community forks, or anonymous uploads.
Cryptographic integrity	Record and verify cryptographic hashes (SHA-256 minimum) for all model artifacts at download, storage, and deployment. Any hash mismatch blocks deployment.
Signature validation	Where model publishers provide digital signatures, validate signatures before deployment. Maintain a registry of trusted signing keys.
Model registry	Maintain a centralised registry of all approved models with: publisher identity, version, hash, download source, approval date, approver, and risk tier classification.
Version pinning	Production deployments must reference specific model versions, never "latest" or floating tags. Version changes require re-approval.
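
The hash verification and version pinning requirements above can be sketched as a single pre-deployment check. This is a minimal illustration, not a reference implementation: `RegistryEntry`, `check_deployable`, and the `FLOATING_TAGS` set are hypothetical names, and the registry record carries only a subset of the fields the table lists.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path

# Assumed examples of floating tags; extend to match your registry's conventions.
FLOATING_TAGS = {"latest", "main", "stable"}


@dataclass(frozen=True)
class RegistryEntry:
    """Hypothetical registry record with a subset of the required fields."""
    publisher: str
    version: str
    sha256: str
    source: str


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the artifact in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_deployable(path: Path, entry: RegistryEntry) -> None:
    """Raise ValueError if the version floats or the hash differs from the registry."""
    if entry.version.strip().lower() in FLOATING_TAGS:
        raise ValueError(f"floating version tag not allowed: {entry.version!r}")
    if sha256_of(path) != entry.sha256.lower():
        raise ValueError(f"hash mismatch for {path.name}; deployment blocked")
```

Running the same check at download, storage, and deployment means a substitution at any stage surfaces as a mismatch. Signature validation is deliberately left out of the sketch because it depends on the publisher's signing scheme.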

Relationship to Three Layers

Layer	How SUP-01 Supports It
Guardrails	Guardrails can only enforce policy if the model they protect is the model that was tested. A substituted model may respond differently to the same guardrail configuration.
Judge	Judge evaluation baselines are model-specific. A different model invalidates calibration data and threshold settings.
Human Oversight	Provenance records give human reviewers confidence that the model in production matches what was assessed and approved.

SUP-02 — Assess Model Risk Before Adoption

Risk Tiers: All

Objective

Evaluate every model against security, safety, and operational risk criteria before it is approved for use in any environment.

Requirements

Requirement	Description
Pre-adoption assessment	Every model must undergo a documented risk assessment before deployment. Assessment scope includes: training data provenance, known vulnerabilities, licence terms, capability profile, and alignment evaluation results.
Risk classification	Assign a risk tier to each model based on: capability level, deployment context (internal vs. customer-facing), data access scope, autonomy level, and regulatory exposure.
Red team evaluation	For Tier 2+ deployments, conduct adversarial testing (prompt injection, jailbreak, data extraction) against the specific model version before approval.
Licence compliance	Verify that model licence terms permit the intended use case. Flag models with restrictive licences, non-commercial clauses, or unclear IP provenance.
Re-assessment triggers	Define events that trigger re-assessment: model version updates, deployment context changes, new vulnerability disclosures, regulatory changes, or elapsed time thresholds.
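
The re-assessment triggers in the last row lend themselves to an automated check. The sketch below is illustrative only: `AssessmentRecord` and `reassessment_reasons` are hypothetical names, the 180-day age threshold is an assumed default rather than a value this control prescribes, and regulatory changes are not modelled because they usually need human judgement.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class AssessmentRecord:
    """Hypothetical record of the last approved risk assessment."""
    model_version: str
    deployment_context: str
    assessed_on: date


def reassessment_reasons(
    record: AssessmentRecord,
    current_version: str,
    current_context: str,
    today: date,
    new_disclosures: int = 0,
    max_age: timedelta = timedelta(days=180),  # assumed threshold
) -> list[str]:
    """Return every trigger that currently applies; an empty list means none."""
    reasons = []
    if current_version != record.model_version:
        reasons.append("model version changed")
    if current_context != record.deployment_context:
        reasons.append("deployment context changed")
    if new_disclosures > 0:
        reasons.append("new vulnerability disclosures")
    if today - record.assessed_on > max_age:
        reasons.append("assessment older than threshold")
    return reasons
```

Returning the list of reasons, rather than a bare boolean, gives the assessor a starting point for scoping the re-assessment.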

Relationship to Three Layers

Layer	How SUP-02 Supports It
Guardrails	Risk assessment identifies which guardrail configurations are needed for a specific model's known weaknesses and capability profile.
Judge	Assessment results inform Judge evaluation criteria and threshold calibration for the specific model.
Human Oversight	Risk classification determines the level of human oversight required — higher-risk models require more frequent and more granular human review.

SUP-03 — Verify RAG Data Source Integrity

Risk Tiers: Tier 2+

Objective

Ensure that data ingested into retrieval-augmented generation (RAG) knowledge bases is from verified sources, has not been tampered with, and does not introduce poisoned or adversarial content into the model's effective context.

Requirements

Requirement	Description
Source allowlisting	Maintain an explicit allowlist of approved data sources for each RAG knowledge base. Only data from allowlisted sources may be ingested.
Content scanning	Scan all ingested content for: prompt injection payloads, adversarial content designed to manipulate model behaviour, malware or malicious scripts, and content that violates data classification policy.
Provenance tracking	Record provenance metadata for every document in the knowledge base: source, ingestion timestamp, ingestion pipeline version, content hash, and approver (for manually curated content).
Integrity monitoring	Continuously verify that knowledge base contents match their recorded hashes. Alert on any unexpected modification.
Separation from runtime	RAG ingestion pipelines must be separated from runtime query paths (see NET-05). Ingestion processes must never have direct access to the model runtime environment.
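
A minimal sketch of how source allowlisting, provenance tracking, and integrity monitoring might fit together at ingestion time. The host name in `ALLOWED_HOSTS`, the `ProvenanceRecord` fields, and the function names are assumptions for illustration. Content scanning is omitted because it depends on the scanner in use.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from urllib.parse import urlparse

# Hypothetical allowlist for one knowledge base.
ALLOWED_HOSTS = {"docs.internal.example"}


@dataclass(frozen=True)
class ProvenanceRecord:
    source: str
    ingested_at: str
    pipeline_version: str
    content_hash: str


def ingest(source_url: str, content: bytes, pipeline_version: str) -> ProvenanceRecord:
    """Reject non-allowlisted sources, then record provenance for the document."""
    host = urlparse(source_url).hostname
    if host not in ALLOWED_HOSTS:
        raise PermissionError(f"source not allowlisted: {host!r}")
    return ProvenanceRecord(
        source=source_url,
        ingested_at=datetime.now(timezone.utc).isoformat(),
        pipeline_version=pipeline_version,
        content_hash=hashlib.sha256(content).hexdigest(),
    )


def is_intact(record: ProvenanceRecord, stored_content: bytes) -> bool:
    """Integrity monitoring: the stored bytes must still match the recorded hash."""
    return hashlib.sha256(stored_content).hexdigest() == record.content_hash
```

A scheduled job could call `is_intact` for each stored document and raise an alert on the first `False`.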

Relationship to Three Layers

Layer	How SUP-03 Supports It
Guardrails	Input guardrails inspect prompts, but RAG content bypasses prompt-level inspection because it enters via the retrieval path. Source integrity is the guardrail for this vector.
Judge	Judge can evaluate whether retrieved content appears anomalous relative to the knowledge base's expected domain, but only if the baseline is trustworthy.
Human Oversight	Provenance records enable human reviewers to trace any problematic output back to the specific RAG source that contributed to it.

SUP-04 — Secure Fine-Tuning Pipeline

Risk Tiers: Tier 2+

Objective

Protect fine-tuning processes from data poisoning, unauthorised modification, and supply chain compromise that could embed backdoors or degrade model safety.

Requirements

Requirement	Description
Training data validation	All fine-tuning datasets must undergo review for: data quality, label accuracy, adversarial examples, PII content, and alignment with intended behaviour.
Pipeline access control	Fine-tuning pipelines require authenticated access with role-based permissions. Training job submission is restricted to authorised personnel.
Environment isolation	Fine-tuning environments are isolated from production inference environments. No shared compute, storage, or network paths.
Artifact versioning	Every fine-tuned model version is stored with: base model reference, training dataset reference, hyperparameters, training logs, and output hash.
Post-training evaluation	Fine-tuned models must pass safety and security evaluation (including adversarial testing) before deployment. Evaluation results are recorded and linked to the model version.
Rollback capability	Maintain the ability to revert to the previous model version if post-deployment monitoring detects degraded safety or security behaviour.
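
Artifact versioning and rollback can be illustrated with a small record type. This is a sketch under assumptions: `FineTuneRecord`, `record_artifact`, and `rollback_target` are hypothetical names, the version history is assumed to be ordered oldest to newest with the current deployment last, and a real pipeline would persist these records in a registry rather than in memory.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class FineTuneRecord:
    version: str
    base_model: str
    dataset_ref: str
    hyperparameters: dict
    training_log_ref: str
    output_sha256: str
    evaluation_passed: bool = False  # set only after post-training evaluation


def record_artifact(version: str, base_model: str, dataset_ref: str,
                    hyperparameters: dict, training_log_ref: str,
                    weights: bytes) -> FineTuneRecord:
    """Capture the fields listed under artifact versioning, hashing the output."""
    return FineTuneRecord(version, base_model, dataset_ref, hyperparameters,
                          training_log_ref, hashlib.sha256(weights).hexdigest())


def to_json(record: FineTuneRecord) -> str:
    """Serialise deterministically so the record itself can be hashed or diffed."""
    return json.dumps(asdict(record), sort_keys=True)


def rollback_target(history: list[FineTuneRecord]) -> Optional[FineTuneRecord]:
    """Most recent earlier version that passed evaluation, or None if there is none."""
    for record in reversed(history[:-1]):
        if record.evaluation_passed:
            return record
    return None
```

Skipping versions that never passed evaluation keeps a rollback from reinstating a model that was itself rejected.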

Relationship to Three Layers

Layer	How SUP-04 Supports It
Guardrails	A poisoned fine-tuned model may learn to evade specific guardrail patterns. Pipeline security prevents this attack vector.
Judge	Post-training evaluation provides the Judge with a validated baseline. Changes in Judge scores after fine-tuning indicate potential problems.
Human Oversight	Artifact versioning and training logs give human reviewers a complete audit trail of what changed and why.

SUP-05 — Audit Tool and Plugin Supply Chain

Risk Tiers: Tier 2+ (agentic)

Objective

Ensure that tools and plugins available to AI agents are from verified sources, have been assessed for security risk, and are maintained under version control with integrity verification.

Requirements

Requirement	Description
Tool registry	Maintain a centralised registry of all approved tools and plugins. Each entry includes: publisher, version, hash, capabilities, required permissions, risk classification, and approval date.
Source verification	Tools must be obtained from verified publishers or approved internal repositories. Tools from unverified sources or community repositories must not be used without a security review.
Security assessment	Every tool undergoes security assessment before approval: code review (or vendor assessment for closed-source), dependency analysis, permission requirements analysis, and adversarial testing of tool behaviour.
Version control	Tool versions are pinned in production. Updates require re-assessment and re-approval. Automatic updates are prohibited.
Dependency analysis	Analyse tool dependencies for known vulnerabilities. Transitive dependencies are included in the analysis scope.
Capability declaration	Tools must declare their capabilities and required permissions in a machine-readable manifest. Undeclared capabilities are blocked at the gateway (see TOOL-01, TOOL-02).
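
To make the capability declaration concrete, the sketch below shows one possible machine-readable manifest and a gateway-side check against it. The manifest schema, the tool name `ticket-search`, and the capability strings are all invented for illustration; this section does not define a manifest format, and the actual gateway enforcement is specified by TOOL-01 and TOOL-02.

```python
import json

# Hypothetical manifest; the key names are assumptions, not a defined schema.
MANIFEST_TEXT = """
{
  "name": "ticket-search",
  "version": "2.3.1",
  "capabilities": ["tickets:read"],
  "permissions": ["network:internal"]
}
"""

REQUIRED_KEYS = {"name", "version", "capabilities", "permissions"}


def load_manifest(text: str) -> dict:
    """Parse a manifest and reject it if any required key is missing."""
    manifest = json.loads(text)
    missing = REQUIRED_KEYS - set(manifest)
    if missing:
        raise ValueError(f"manifest missing keys: {sorted(missing)}")
    return manifest


def is_invocation_allowed(manifest: dict, capability: str) -> bool:
    """Deny by default: only capabilities the manifest declares are permitted."""
    return capability in manifest["capabilities"]
```

With this manifest, a `tickets:read` invocation is allowed and a `tickets:delete` invocation is refused because it was never declared.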

Relationship to Three Layers

Layer	How SUP-05 Supports It
Guardrails	Tool manifests feed guardrail policy — the guardrail knows what the tool is allowed to do and blocks invocations outside declared scope.
Judge	Judge can evaluate whether tool invocation patterns match the declared capability profile, detecting anomalous usage that may indicate compromise.
Human Oversight	The tool registry provides human reviewers with a complete inventory of what agents can do, enabling informed approval decisions.

SUP-06 — Verify Guardrail and Safety Model Integrity

Risk Tiers: All

Objective

Verify the integrity of guardrail and Judge models with the same rigour applied to the primary model. These components are themselves machine learning systems (or rule engines), and compromise of safety systems is the highest-impact supply chain attack.

Requirements

Requirement	Description
Integrity verification	Guardrail models and rule configurations are subject to the same hash verification and signature validation as primary models (SUP-01).
Independent sourcing	Where possible, guardrail and Judge models should come from different providers or model families than the primary model. This reduces the risk of correlated failure.
Configuration version control	Guardrail rule sets and Judge prompts/configurations are stored in version-controlled repositories with audit trails. Changes require approval.
Tamper detection	Monitor guardrail and Judge model artifacts for unauthorised modification. Alert on any change that bypasses the approved change process.
Update validation	Updates to guardrail or Judge models/configurations must pass regression testing against known attack patterns and edge cases before deployment.
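
The update validation requirement amounts to a regression gate: a candidate guardrail must still block every known attack pattern without newly blocking benign input. The following is a toy sketch. `regression_failures` is a hypothetical helper, and `candidate_guardrail` is a trivial keyword filter standing in for a real guardrail model or rule set.

```python
from typing import Callable, Iterable

# A guardrail is modelled as a function that returns True when it blocks the input.
Guardrail = Callable[[str], bool]


def regression_failures(guardrail: Guardrail,
                        known_attacks: Iterable[str],
                        benign_inputs: Iterable[str]) -> list[str]:
    """List every regression: attacks that slip through and benign inputs blocked."""
    failures = [f"missed attack: {p}" for p in known_attacks if not guardrail(p)]
    failures += [f"false positive: {p}" for p in benign_inputs if guardrail(p)]
    return failures


def candidate_guardrail(text: str) -> bool:
    """Toy keyword filter used only to exercise the gate."""
    return "ignore previous instructions" in text.lower()
```

A deployment pipeline would refuse the update whenever `regression_failures` returns a non-empty list, and would store the attack corpus under the same version control as the guardrail configuration.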

Relationship to Three Layers

Layer	How SUP-06 Supports It
Guardrails	This control directly protects guardrail integrity. A compromised guardrail that silently passes malicious content is worse than no guardrail at all.
Judge	Judge model integrity is equally critical. A compromised Judge that approves harmful outputs defeats the evaluation layer entirely.
Human Oversight	Version-controlled configurations and tamper detection ensure that human-approved safety settings remain in effect.

SUP-07 — Maintain AI Component Inventory (AI-BOM)

Risk Tiers: All

Objective

Maintain a comprehensive, machine-readable inventory of all AI components in production — analogous to a software bill of materials (SBOM) but extended to cover models, datasets, guardrails, tools, and evaluation systems.

Requirements

Requirement	Description
Component coverage	The AI-BOM must include: foundation models, fine-tuned models, guardrail models/rules, Judge models/configurations, embedding models, RAG knowledge bases, tools and plugins, orchestration frameworks, and vector databases.
Metadata per component	Each entry includes: component type, name, version, publisher/source, hash, deployment location, risk tier, dependencies, approval status, and last assessment date.
Machine-readable format	The AI-BOM is maintained in a structured, machine-readable format (e.g., JSON, YAML) that can be consumed by automated tooling.
Continuous update	The AI-BOM is updated automatically when components are deployed, updated, or decommissioned. Manual-only maintenance is insufficient for production systems.
Dependency mapping	The AI-BOM captures dependencies between components: which guardrails protect which models, which tools are available to which agents, which knowledge bases serve which retrieval endpoints.
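
As an illustration of why a machine-readable AI-BOM with dependency mapping is useful, the sketch below finds models that no guardrail protects. The JSON layout, the `protected_by` key, and the component names are assumptions made for this example; established SBOM formats may model the same relationships differently.

```python
import json

# Hypothetical AI-BOM fragment carrying only a few of the required metadata fields.
AI_BOM_TEXT = """
{"components": [
  {"type": "foundation_model", "name": "chat-model", "version": "1.2.0",
   "protected_by": ["input-filter"]},
  {"type": "foundation_model", "name": "summary-model", "version": "0.9.1",
   "protected_by": []},
  {"type": "guardrail", "name": "input-filter", "version": "3.0.0",
   "protected_by": []}
]}
"""

MODEL_TYPES = {"foundation_model", "fine_tuned_model"}


def unprotected_models(bom: dict) -> list[str]:
    """Return names of models with no guardrail that is itself listed in the BOM."""
    guardrails = {c["name"] for c in bom["components"] if c["type"] == "guardrail"}
    return [
        c["name"]
        for c in bom["components"]
        if c["type"] in MODEL_TYPES
        and not (set(c.get("protected_by", [])) & guardrails)
    ]
```

Checking `protected_by` against the guardrails actually present in the inventory also catches dangling references to a guardrail that was decommissioned.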

Relationship to Three Layers

Layer	How SUP-07 Supports It
Guardrails	The AI-BOM identifies which guardrails are deployed for each model, enabling gap analysis and coverage verification.
Judge	The AI-BOM tracks Judge model versions and their associations with primary models, ensuring evaluation consistency.
Human Oversight	The AI-BOM gives human reviewers a single source of truth for what is deployed, enabling informed risk decisions and incident response.

SUP-08 — Monitor for Model and Dependency Vulnerabilities

Risk Tiers: All

Objective

Continuously monitor for newly disclosed vulnerabilities, attacks, and safety issues affecting any component in the AI-BOM, and trigger assessment or remediation when relevant disclosures occur.

Requirements

Requirement	Description
Vulnerability feed monitoring	Subscribe to vulnerability feeds and advisories for: model providers (e.g., security bulletins from OpenAI, Anthropic, Meta, Google), framework providers (e.g., LangChain, LlamaIndex, Hugging Face), tool and plugin providers, and general CVE databases for software dependencies.
AI-BOM correlation	Correlate incoming vulnerability disclosures against the AI-BOM to determine which production deployments are affected.
Impact assessment	For each relevant vulnerability, assess: exploitability in the deployment context, data exposure risk, whether existing guardrails/Judge mitigate the issue, and urgency of remediation.
Remediation tracking	Track remediation actions to completion: model updates, guardrail rule additions, configuration changes, or compensating controls.
Proactive testing	When new attack techniques are published (e.g., novel prompt injection methods, jailbreak patterns), proactively test deployed models against these techniques rather than waiting for a vendor advisory.
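
AI-BOM correlation can be sketched as a join between advisories and deployed components. The advisory structure, the identifier `ADV-0001`, and the component names are hypothetical. The example matches on exact versions only, which is a simplifying assumption, since real feeds often express affected versions as ranges.

```python
def affected_components(bom_components: list[dict],
                        advisories: list[dict]) -> list[tuple[str, str]]:
    """Pair each advisory ID with every deployed component version it names."""
    deployed = {(c["name"], c["version"]) for c in bom_components}
    hits = []
    for advisory in advisories:
        for version in advisory["affected_versions"]:
            if (advisory["component"], version) in deployed:
                hits.append((advisory["id"], f'{advisory["component"]}=={version}'))
    return hits


# Hypothetical inventory and advisory feed entries.
BOM_COMPONENTS = [
    {"name": "orchestrator-lib", "version": "0.4.2"},
    {"name": "chat-model", "version": "1.2.0"},
]
ADVISORIES = [
    {"id": "ADV-0001", "component": "orchestrator-lib",
     "affected_versions": ["0.4.1", "0.4.2"]},
    {"id": "ADV-0002", "component": "vector-db",
     "affected_versions": ["2.0.0"]},
]
```

Each hit would then feed the impact assessment and remediation tracking steps. Advisories with no match in the inventory need no further action.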

Relationship to Three Layers

Layer	How SUP-08 Supports It
Guardrails	Vulnerability monitoring may identify new attack patterns that require guardrail rule updates (e.g., new injection techniques).
Judge	New vulnerability disclosures may require Judge evaluation criteria updates to detect exploitation of newly discovered weaknesses.
Human Oversight	Vulnerability monitoring feeds human risk assessment and drives prioritised remediation decisions.

AI Runtime Behaviour Security, 2026 (Jonathan Gill).