SOC Integration for AI Systems

Bridging AI security monitoring and existing security operations.

The Problem

You've implemented guardrails, a judge, and human oversight. They generate alerts, logs, and escalations.

Where do these go?

In most enterprises: nowhere useful. AI security telemetry sits in application logs, disconnected from the SOC that handles every other security event. The SOC team doesn't know what an LLM judge alert means. The AI team doesn't know how to write a SOC-compatible runbook.


Alert Taxonomy

AI systems produce security-relevant events that don't map cleanly to existing SOC categories. Define them explicitly.

| Alert Category | Source | Severity Baseline | Examples |
| --- | --- | --- | --- |
| Guardrail Block | Input/output guardrails | Low–Medium | PII detected, toxicity blocked, off-topic rejected |
| Judge Flag | LLM-as-Judge | Medium–High | Policy violation detected, hallucination scored high, unsafe reasoning |
| Anomaly | Behavioral monitoring | Medium | Usage pattern deviation, output distribution shift, latency spike |
| Prompt Attack | Guardrails + Judge | High | Prompt injection detected, jailbreak attempt, system prompt extraction |
| Data Exfiltration Signal | Output monitoring | High | Bulk data in response, structured data extraction pattern |
| Agent Boundary Violation | Agentic controls | High | Unauthorised tool use, delegation policy breach, scope escalation |
| Model Drift | Judge assurance metrics | Medium | Judge accuracy degradation, output distribution change |

Severity Mapping

Map AI alert severity to your existing SOC severity framework:

| AI Severity | SOC Equivalent | Response SLA |
| --- | --- | --- |
| Low (single guardrail block) | Informational | Log, aggregate, review weekly |
| Medium (judge flag, single anomaly) | Warning | Triage within 4 hours |
| High (attack pattern, data exfil signal) | Alert | Triage within 1 hour |
| Critical (confirmed breach, agent compromise) | Incident | Immediate response |

Adjust SLAs to match your existing SOC tiers. Don't create a parallel severity system.
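Encoding this mapping once, in code, keeps every emitting component classifying alerts the same way. A minimal sketch in Python, assuming the category names and SLA values from the two tables above; the constant and function names are illustrative, not a prescribed interface.

```python
from datetime import timedelta

# Baseline severity per alert category (taxonomy table above). Categories with a
# range default to the lower bound; emitters can raise severity with context.
CATEGORY_BASELINE_SEVERITY = {
    "guardrail_block": "low",
    "judge_flag": "medium",
    "anomaly": "medium",
    "prompt_attack": "high",
    "data_exfiltration_signal": "high",
    "agent_boundary_violation": "high",
    "model_drift": "medium",
}

# AI severity -> (SOC equivalent, response SLA). None means immediate response.
SEVERITY_TO_SOC = {
    "low": ("informational", timedelta(days=7)),   # log, aggregate, review weekly
    "medium": ("warning", timedelta(hours=4)),
    "high": ("alert", timedelta(hours=1)),
    "critical": ("incident", None),
}

def soc_classification(category: str, severity: str | None = None):
    """Return (severity, SOC equivalent, SLA) for an AI alert category."""
    sev = severity or CATEGORY_BASELINE_SEVERITY[category]
    soc_equivalent, sla = SEVERITY_TO_SOC[sev]
    return sev, soc_equivalent, sla
```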


Identity Correlation

The hardest operational problem: linking an AI security event to a real identity.

AI telemetry typically gives you:

| Source | Identity Available |
| --- | --- |
| API gateway logs | API key, OAuth token, IP address |
| LLM provider logs (Bedrock, Azure OpenAI) | Service principal, request ID |
| Application logs | Session ID, user ID (if your app logs it) |
| Judge evaluation logs | Request ID (if you propagate it) |
| Vector database logs | Service account, query metadata |

To correlate "User X performed action Y that triggered alert Z," you need to join across at least three of these sources.

Correlation Pattern

API Gateway           →  request_id, user_token, timestamp
     ↓
Application Layer     →  request_id, session_id, user_id
     ↓
LLM Provider          →  request_id (if propagated), model, tokens
     ↓
Judge                  →  request_id, evaluation_result, score
     ↓
SIEM                   →  Correlated event: user_id + evaluation_result + model + timestamp

Critical requirement: Propagate a correlation ID (request_id or trace_id) across every component. Without this, you're guessing.
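A minimal sketch of that propagation in Python: adopt or mint a correlation ID at the edge, attach it to every downstream call, and stamp it on every log line. The X-Request-ID header and the function names are illustrative; reuse whatever ID your gateway already assigns.

```python
import json
import logging
import uuid
from contextvars import ContextVar

# Holds the correlation ID for the current request across async/sync call paths.
request_id_var: ContextVar[str] = ContextVar("request_id", default="")

def begin_request(incoming_headers: dict) -> str:
    """Adopt the gateway's correlation ID if present, otherwise mint one."""
    request_id = incoming_headers.get("X-Request-ID") or str(uuid.uuid4())
    request_id_var.set(request_id)
    return request_id

def outbound_headers() -> dict:
    """Headers to attach to LLM provider, judge, and tool calls."""
    return {"X-Request-ID": request_id_var.get()}

def log_event(message: str, **fields) -> None:
    """Structured log line that always carries the correlation ID."""
    record = {"request_id": request_id_var.get(), "message": message, **fields}
    logging.getLogger("ai_security").info(json.dumps(record))
```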

Platform-Specific Patterns

Amazon Bedrock + CloudWatch:
- Bedrock invocation logs → CloudWatch Logs → EventBridge → SIEM
- Correlation key: requestId from Bedrock invocation metadata
- Identity: IAM role/user from CloudTrail
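As a sketch of the normalisation step on that path, assuming Bedrock model invocation logging is enabled and records reach a Lambda via a CloudWatch Logs subscription (a common alternative to the EventBridge hop above). The requestId and modelId fields follow the correlation key noted above, and forward_to_siem stands in for your SIEM's ingestion API.

```python
import base64
import gzip
import json

def handler(event, context):
    """Lambda: normalise Bedrock invocation log records for SIEM ingestion."""
    data = base64.b64decode(event["awslogs"]["data"])
    payload = json.loads(gzip.decompress(data))
    for log_event in payload["logEvents"]:
        record = json.loads(log_event["message"])
        siem_event = {
            "event_type": "ai_security",
            "request_id": record.get("requestId"),   # correlation key
            "model": record.get("modelId"),
            "timestamp": record.get("timestamp"),
            # Identity is joined in the SIEM against CloudTrail on requestId.
        }
        forward_to_siem(siem_event)

def forward_to_siem(evt: dict) -> None:
    print(json.dumps(evt))  # stand-in for your SIEM ingestion call
```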

Databricks + Unity Catalog:
- Serving endpoint logs → Delta table → Databricks SQL → SIEM export
- Correlation key: request_id from serving endpoint
- Identity: Unity Catalog identity mapped to workspace user

Azure OpenAI + Sentinel:
- Diagnostic logs → Log Analytics → Sentinel
- Correlation key: x-ms-client-request-id header
- Identity: Entra ID from API Management authentication
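For the Azure path, a sketch of setting that correlation header on requests routed through API Management; the gateway URL, deployment name, and API version string are placeholders for your own configuration.

```python
import uuid
import requests

# Placeholder APIM route fronting the Azure OpenAI deployment.
APIM_URL = "https://your-apim-gateway.example.com/openai/deployments/your-deployment/chat/completions"

def call_azure_openai(messages: list[dict], subscription_key: str):
    """Call Azure OpenAI via APIM, returning the correlation ID and response."""
    client_request_id = str(uuid.uuid4())
    response = requests.post(
        APIM_URL,
        params={"api-version": "2024-06-01"},  # illustrative version string
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            # Captured in diagnostic logs; the join key for Sentinel correlation.
            "x-ms-client-request-id": client_request_id,
        },
        json={"messages": messages},
        timeout=30,
    )
    response.raise_for_status()
    return client_request_id, response.json()
```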


Escalation Triggers

Define explicit triggers that move AI events from monitoring to investigation.

Automated Escalation (no analyst required)

| Trigger | Action |
| --- | --- |
| >10 guardrail blocks from same user in 5 minutes | Block user, alert SOC |
| Judge flags >3 high-severity outputs in 1 hour | Disable AI feature for user, alert SOC |
| Prompt injection pattern detected | Log full request/response, alert SOC |
| Agent attempts tool outside allowlist | Halt agent, alert SOC + AI team |
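A sketch of the first trigger in the table, as a sliding-window counter keyed by user; the window and threshold match the table above, and block_user / alert_soc are placeholders for whatever enforcement and alerting hooks you already have.

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 5 * 60   # 5-minute sliding window
THRESHOLD = 10            # >10 guardrail blocks escalates

_blocks: dict = defaultdict(deque)

def on_guardrail_block(user_id: str, now: float | None = None) -> bool:
    """Record a guardrail block; return True if the user was escalated."""
    now = now if now is not None else time.time()
    events = _blocks[user_id]
    events.append(now)
    while events and now - events[0] > WINDOW_SECONDS:
        events.popleft()   # drop events outside the window
    if len(events) > THRESHOLD:
        block_user(user_id)               # placeholder: disable access at the gateway
        alert_soc(user_id, len(events))   # placeholder: raise a High alert in the SIEM
        return True
    return False

def block_user(user_id: str) -> None: ...
def alert_soc(user_id: str, count: int) -> None: ...
```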

Analyst-Driven Escalation

| Signal | Triage Question | Escalation Path |
| --- | --- | --- |
| Repeated judge flags, low severity | Is the user testing boundaries or doing legitimate work? | AI team for context |
| Anomalous output volume | Is this a batch job or data exfiltration? | SOC L2 for investigation |
| Model drift detected | Did the provider update the model, or is this adversarial? | AI team + vendor liaison |
| New prompt injection technique | Is this a known pattern or novel? | SOC L2 + threat intel |

Triage Procedures for AI Alerts

SOC analysts need clear guidance for AI-specific alerts. They are not AI experts. Don't expect them to be.

Guardrail Block (Low Severity)

1. Check: Is this a single event or part of a pattern?
   - Single event → Log and close
   - Pattern (>5 from same user) → Escalate to Medium

2. Check: What was blocked?
   - PII in output → Verify PII type, check if user has legitimate access to that data
   - Toxicity → Log, no further action unless repeated
   - Off-topic → Log, no further action

3. No user contact required for single events.

Judge Flag (Medium Severity)

1. Review the judge evaluation alongside the original request and response.

2. Check: Does the response violate policy?
   - Yes → Escalate to AI team for remediation
   - Unclear → Flag for human review in next calibration cycle
   - No (false positive) → Log as FP, feed back to judge tuning

3. Check: Is the user's request legitimate?
   - Research/testing → Confirm with user's manager
   - Appears adversarial → Escalate to High

Prompt Attack (High Severity)

1. Capture full request and response (do not summarise — exact content matters).

2. Check: Did the attack succeed?
   - Yes (guardrails/judge bypassed) → Incident. Disable endpoint. Notify AI team immediately.
   - No (blocked by guardrails) → Log technique, update threat intel, monitor for variants.

3. Check: Is this a known technique?
   - Known → Verify guardrails are current, close.
   - Novel → Escalate to AI security team for analysis and guardrail update.

4. Do NOT attempt to reproduce the attack in production.

SIEM Integration

Log Format

Emit AI security events in a format your SIEM can parse. Extend your existing schema rather than creating a new one.

{
  "event_type": "ai_security",
  "timestamp": "2026-02-11T14:30:00Z",
  "severity": "medium",
  "category": "judge_flag",
  "request_id": "req-abc-123",
  "user_id": "jgill@example.com",
  "model": "claude-sonnet-4-5-20250929",
  "judge_model": "gpt-4o",
  "judge_score": 0.82,
  "judge_verdict": "policy_violation",
  "policy_violated": "financial_advice_without_disclaimer",
  "guardrail_result": "pass",
  "tokens_in": 450,
  "tokens_out": 1200,
  "endpoint": "/api/v1/chat",
  "risk_tier": 2
}
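A sketch of emitting that event from the application layer; the field names mirror the example above, and writing a JSON line to stdout stands in for whatever log shipper feeds your SIEM.

```python
import json
import sys
from datetime import datetime, timezone

def emit_ai_security_event(*, category: str, severity: str, request_id: str,
                           user_id: str, model: str, **extra) -> dict:
    """Emit one AI security event in the shared SIEM schema."""
    event = {
        "event_type": "ai_security",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "severity": severity,
        "category": category,
        "request_id": request_id,
        "user_id": user_id,
        "model": model,
        **extra,  # judge_score, policy_violated, tokens_in, risk_tier, ...
    }
    sys.stdout.write(json.dumps(event) + "\n")  # picked up by the log shipper
    return event
```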

Detection Rules (Examples)

Splunk SPL:

index=ai_security category="prompt_attack"
| stats count by user_id, src_ip
| where count > 3

Save this search as a scheduled alert with severity High; SPL has no inline alert command, so alerting is configured on the saved search.

Sentinel KQL:

AISecurity_CL
| where category_s == "judge_flag" and judge_score_d > 0.8
| summarize count() by user_id_s, bin(TimeGenerated, 1h)
| where count_ > 5


Runbook Integration

For each alert category, create a runbook in your existing ITSM (ServiceNow, PagerDuty, Jira).

| Alert Category | Runbook Owner | Tool Integration |
| --- | --- | --- |
| Guardrail Block | AI Platform Team | Automated; no ticket unless threshold breached |
| Judge Flag | AI Security + SOC L1 | Ticket auto-created at Medium severity |
| Prompt Attack | SOC L2 + AI Security | Incident auto-created at High severity |
| Agent Boundary Violation | AI Platform Team + SOC L2 | Incident auto-created |
| Model Drift | AI Platform Team | Alert to AI team, no SOC ticket |
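The auto-creation rules in that table reduce to a small routing function. A sketch, with the ITSM webhook URL and payload shape as placeholders for whatever your ServiceNow, PagerDuty, or Jira integration actually exposes; the High threshold for agent boundary violations is an assumption based on their taxonomy baseline.

```python
import requests

# Category -> (ITSM record type, minimum severity); mirrors the table above.
ROUTING = {
    "judge_flag": ("ticket", "medium"),
    "prompt_attack": ("incident", "high"),
    "agent_boundary_violation": ("incident", "high"),
}
SEVERITY_ORDER = ["low", "medium", "high", "critical"]
ITSM_WEBHOOK = "https://itsm.example.com/api/records"  # placeholder endpoint

def route_alert(event: dict) -> None:
    """Create an ITSM record when an AI alert meets its category's threshold."""
    rule = ROUTING.get(event["category"])
    if rule is None:
        return  # guardrail blocks and model drift stay with the AI team
    record_type, min_severity = rule
    if SEVERITY_ORDER.index(event["severity"]) >= SEVERITY_ORDER.index(min_severity):
        requests.post(ITSM_WEBHOOK, timeout=10, json={
            "type": record_type,
            "summary": f"AI alert: {event['category']} ({event.get('user_id', 'unknown')})",
            "severity": event["severity"],
            "correlation_id": event.get("request_id"),
        })
```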

AI Runtime Behaviour Security, 2026 (Jonathan Gill).