Decision Exercise: Risk & Governance¶
This exercise tests whether you can:
- Evaluate a multi-agent governance decision with ambiguous signals, regulatory pressure, and business pressure
- Apply the concepts from this track: epistemic integrity, privileged agent governance, evidence standards, and accountability chains
- Make a defensible governance decision and articulate the reasoning behind it
- Distinguish between compliance theatre and operational assurance when under pressure
The situation¶
You are the Head of AI Governance at Northbrook Financial, a mid-size asset management firm. Eighteen months ago, the firm deployed an AI-assisted portfolio rebalancing pipeline. The system has performed well: it rebalances 1,200 client portfolios weekly, has reduced rebalancing costs by 35%, and has outperformed the manual process on execution quality metrics.
The pipeline has three agents:
- Agent M (Market Analysis): Ingests market data, economic indicators, and client portfolio positions. Produces a market context assessment.
- Agent S (Strategy): Takes Agent M's assessment and each client's investment policy statement (IPS). Produces rebalancing recommendations aligned with client objectives and risk tolerances.
- Agent E (Execution): Takes Agent S's recommendations, checks against compliance rules (concentration limits, sector exposure, liquidity requirements), and queues approved trades for execution.
Agent E is a privileged agent; it commits trades to the execution queue. The system operates at Tier 2 (Managed): automated controls with human oversight for flagged cases. A senior portfolio manager reviews any flagged rebalancing before execution.
The signals¶
It is Monday morning. You have three items on your desk:
Signal 1: Regulatory inquiry¶
On Friday, you received a letter from the regulator. They are conducting a thematic review of AI use in portfolio management. The letter requests:
- An inventory of all AI systems used in investment decision-making
- Documentation of governance frameworks for those systems
- Evidence of control effectiveness, specifically mentioning "the ability to demonstrate that AI-assisted decisions are based on complete and current data"
The deadline is 60 days.
The regulator's specific mention of "complete and current data" suggests they are aware of Phantom Compliance-style failure modes. Your governance framework covers individual AI systems well (you aligned it with NIST AI RMF and ISO 42001 twelve months ago), but you have not yet extended it to cover chain-level integrity for the rebalancing pipeline. You have deployment evidence and activity evidence for your controls. You do not have effectiveness evidence or coverage evidence for chain-level failure modes.
Signal 2: Integrity anomaly¶
Your monitoring team flagged an anomaly over the weekend. During Friday's rebalancing run, Agent M's market data retrieval for emerging markets returned 40% fewer data points than the 30-day average. The retrieval passed the minimum threshold guardrail (it returned more than the minimum required data points), but the shortfall is statistically significant.
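A check of this kind can pass a hard minimum while still flagging a statistically significant shortfall. As a minimal sketch (all names, thresholds, and numbers here are hypothetical, not Northbrook's actual monitoring code):

```python
from statistics import mean, stdev

MIN_DATA_POINTS = 500          # hard guardrail (hypothetical value)
Z_THRESHOLD = -3.0             # flag shortfalls beyond three standard deviations

def check_retrieval(count: int, history: list[int]) -> list[str]:
    """Return the integrity flags raised by one retrieval run.

    `history` is the last 30 days of retrieval counts for this market.
    """
    flags = []
    if count < MIN_DATA_POINTS:
        flags.append("below_minimum")            # breaches the hard guardrail
    z = (count - mean(history)) / stdev(history)
    if z < Z_THRESHOLD:
        flags.append("statistical_shortfall")    # within guardrail, but anomalous
    return flags

# Friday's run: roughly 40% below a 30-day average of ~1,000 points.
history = [990, 1010, 1005, 995, 1000] * 6       # stylised 30-day baseline
print(check_retrieval(600, history))             # passes the minimum, fails the z-test
```

The point of the sketch is the gap it illustrates: Northbrook's deployed guardrail implements only the first test, so a run like Friday's clears the minimum and proceeds even though it is dozens of standard deviations below baseline.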
Agent S produced rebalancing recommendations for 47 portfolios with emerging market exposure based on Agent M's assessment. Agent E approved and queued all 47 trades. None were flagged for human review: the automated controls did not trigger, since no individual agent produced an output that breached its guardrail thresholds.
The trades executed on Friday. Early market data from this morning suggests the emerging markets positions may be underweighted relative to the opportunity, consistent with Agent M having had an incomplete market picture.
The retrieval anomaly is similar to the Phantom Compliance pattern: an agent operated on incomplete data, produced a confident output, and the downstream chain treated it as authoritative. The impact appears to be suboptimal positioning rather than a compliance violation, but you do not yet know the full scope. The 47 affected portfolios represent approximately $180 million in assets. The underweighting may self-correct if the market moves, or it may represent a missed opportunity cost for clients.
Signal 3: Business pressure¶
The CEO is preparing for the quarterly board meeting next week. The AI rebalancing pipeline is a headline success story, and the 35% cost reduction and execution quality improvements are central to the firm's technology strategy. The CEO has asked you to present on AI governance at the board meeting and has specifically requested "a positive story about how our governance framework keeps our AI systems safe."
The CTO has also messaged you this morning. They are aware of the retrieval anomaly and have proposed a quick fix: lowering the minimum threshold guardrail for market data retrieval so that "legitimate data sparsity" (e.g., emerging markets having fewer data points on some days) does not trigger false concerns. They note that the current threshold was set conservatively during initial deployment and may not reflect operational reality.
The CEO wants a positive governance story. The CTO wants to lower a guardrail threshold. The regulator wants evidence of control effectiveness for data completeness. These three pressures are pulling in different directions. You need to decide what to do before the board meeting and before the next weekly rebalancing run on Friday.
Your decision¶
You need to make three interconnected decisions before Friday:
Decision 1: The retrieval anomaly¶
What do you do about the 47 affected portfolios and the Friday rebalancing run?
Option A: Accept and continue¶
The impact appears to be suboptimal positioning, not a compliance violation. The trades executed within all defined guardrails. Accept the outcome, document the anomaly, and allow Friday's run to proceed with current controls.
Option B: Investigate and pause¶
Pause the emerging market component of Friday's rebalancing run until the retrieval anomaly is investigated. Commission a review of the 47 affected portfolios to quantify the impact. Allow non-emerging-market rebalancing to continue.
Option C: Pause the entire pipeline¶
Pause the entire rebalancing pipeline until the anomaly is investigated and chain-level integrity controls are implemented. This affects all 1,200 portfolios and will require manual rebalancing for high-priority accounts.
Option D: Investigate and enhance¶
Allow Friday's run to proceed but add a temporary manual review step: a portfolio manager reviews all rebalancing recommendations for portfolios with significant emerging market exposure before they are queued for execution. Commission an investigation in parallel.
Decision 2: The CTO's proposal¶
What do you do about the proposal to lower the retrieval threshold?
Option A: Approve the change¶
The CTO's reasoning is sound, and the threshold may be too conservative for markets with genuinely variable data availability. Lower the threshold.
Option B: Reject the change¶
Lowering the threshold in response to an anomaly moves the guardrail in the wrong direction. Maintain the current threshold.
Option C: Defer and investigate¶
Do not change the threshold until the retrieval anomaly is fully investigated. If the investigation confirms that the anomaly was data sparsity (not a retrieval failure), revisit the threshold with data-driven justification.
Option D: Reframe the conversation¶
The threshold is the wrong control for this problem. What is needed is not a lower threshold but a chain-level integrity check. Agent S should know whether Agent M's data was complete relative to expectations, and should flag its recommendations accordingly. Propose this to the CTO instead.
Decision 3: The board presentation¶
What do you present to the board?
Option A: The positive story¶
Present the governance framework as the CEO requested. Highlight the controls that are in place, the system's track record, and the cost savings. Mention the retrieval anomaly as an example of monitoring catching issues.
Option B: The balanced story¶
Present the governance framework honestly. Highlight what works, but also present the chain-level governance gap, the retrieval anomaly, and the regulatory inquiry. Frame the gap as an opportunity to strengthen governance, not as a failure.
Option C: The risk story¶
Lead with the governance gap and the regulatory inquiry. Frame the retrieval anomaly as a near-miss that exposes a structural weakness. Request board approval for a governance extension programme.
Option D: Defer the presentation¶
Ask the CEO to defer the AI governance presentation until the retrieval anomaly is investigated and the regulatory response is drafted. Present at the following board meeting with complete information.
Think it through¶
Before reading the analysis below, make your three decisions and write down the reasoning for each. The exercise is most valuable if you commit before reading the discussion.
Analysis: Decision 1, The retrieval anomaly
There is no single correct answer, but some options are more defensible than others.
Option A (Accept and continue) is the least defensible. You have identified a Phantom Compliance-pattern anomaly (incomplete data leading to confident downstream decisions) and the response is to accept it and continue. If the anomaly recurs on Friday and the impact is larger, your position becomes untenable. A regulator reviewing this decision in the context of their thematic review would note that you identified the pattern and did not act.
Option B (Investigate and pause emerging markets) is proportionate. It limits the impact to the affected area while allowing the rest of the pipeline to continue. It demonstrates governance responsiveness without overreacting. The key is the quality of the investigation: it must determine whether the anomaly was genuine data sparsity or a retrieval failure, and whether the downstream chain had any mechanism to detect and flag it.
Option C (Pause the entire pipeline) is defensible but potentially disproportionate. If the anomaly is limited to emerging markets data, pausing all 1,200 portfolios creates operational disruption (and client impact) that exceeds the risk. However, if you suspect the anomaly is not limited to emerging markets and could indicate a systemic retrieval issue, this option becomes more appropriate.
Option D (Investigate and enhance) is pragmatic and demonstrates governance maturity. It adds a temporary compensating control (manual review) while the investigation proceeds. The risk is that the temporary control becomes permanent, creating human review fatigue. The key is defining a clear exit criterion: the manual review step is removed when chain-level integrity controls are implemented.
The strongest answer is Option B or D, depending on your assessment of the scope. If you believe the anomaly is isolated to emerging markets data, Option B is sufficient. If you are uncertain, Option D provides an additional safety layer.
Analysis: Decision 2, The CTO's proposal
Option D (Reframe the conversation) is the strongest answer, and here is why:
The CTO is proposing to adjust the guardrail in response to what they see as a false positive. From an engineering perspective, this may be reasonable; if emerging markets genuinely have variable data availability, a fixed minimum threshold will generate noise.
But from a governance perspective, the CTO is addressing the symptom (the guardrail triggered concern) rather than the cause (the chain has no mechanism to distinguish between "less data available" and "less data retrieved"). Lowering the threshold does not solve the problem; it makes the problem less visible.
Option A (Approve) is risky. Lowering a guardrail immediately after an anomaly, with a regulatory inquiry pending that specifically mentions data completeness, is a governance red flag. Even if the CTO's reasoning is technically correct, the timing and optics are poor.
Option B (Reject) is safe but may not be correct in the long term. The threshold may indeed need adjustment, but that decision should be data-driven, not reactive.
Option C (Defer) is reasonable as a holding position but does not address the underlying architecture gap.
Option D (Reframe) is the most mature governance response because it identifies the real problem: the chain lacks epistemic integrity controls. Agent S should not simply consume Agent M's output; it should evaluate the quality and completeness of that output and flag its recommendations when the data basis is weaker than usual. This is exactly the MASO epistemic integrity control discussed in Module 3.
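The control Option D proposes can be sketched concretely. In this hypothetical sketch (all class names, field names, and the 0.8 floor are illustrative assumptions, not part of the MASO framework's specification), Agent M's output carries completeness metadata, Agent S propagates a flag rather than silently consuming a weak assessment, and Agent E routes flagged recommendations to human review:

```python
from dataclasses import dataclass, field

@dataclass
class MarketAssessment:
    """Agent M's output, carrying its own data-quality metadata."""
    summary: str
    data_points: int
    expected_points: int                 # baseline expectation, e.g. 30-day average

    @property
    def completeness(self) -> float:
        return self.data_points / self.expected_points

@dataclass
class Recommendation:
    portfolio_id: str
    flags: list[str] = field(default_factory=list)

COMPLETENESS_FLOOR = 0.8                 # hypothetical policy value

def strategise(assessment: MarketAssessment, portfolio_id: str) -> Recommendation:
    """Agent S: consume Agent M's output, but inherit its data-quality caveats."""
    rec = Recommendation(portfolio_id)
    if assessment.completeness < COMPLETENESS_FLOOR:
        rec.flags.append("incomplete_upstream_data")   # propagate, don't absorb
    return rec

def should_queue(rec: Recommendation) -> bool:
    """Agent E: flagged recommendations go to human review, not the trade queue."""
    return not rec.flags

# Friday's scenario: 600 points retrieved against an expectation of ~1,000.
weak = MarketAssessment("EM context", data_points=600, expected_points=1000)
rec = strategise(weak, "P-1234")
print(rec.flags, should_queue(rec))      # flagged, so routed to a portfolio manager
```

Note that no threshold is lowered or raised here: the design change is that completeness travels with the assessment through the chain, so a confident-looking output can no longer mask an incomplete data basis.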
Analysis: Decision 3, The board presentation
Option B (The balanced story) is the strongest answer for governance professionals.
Option A (The positive story) is compliance theatre. Presenting the governance framework as fully adequate, while you have evidence of a chain-level gap and a regulatory inquiry mentioning the exact failure mode, creates board liability. If the regulator's review finds the gap that you did not disclose to the board, the board will ask why they were not informed.
Option B (The balanced story) demonstrates governance maturity. It shows the board that the governance framework works for what it covers, is honest about where it does not yet cover chain-level risks, and frames the extension as a planned improvement. This is the narrative that a regulator would respect: the organisation knows its gaps and is acting to close them.
Option C (The risk story) may be appropriate if the retrieval anomaly investigation reveals a more serious issue. But leading with risk without the balanced context may cause the board to overreact, potentially by pausing all AI initiatives rather than extending governance for multi-agent systems.
Option D (Defer) delays board visibility. The board should know about the regulatory inquiry and the governance gap now, not after the fact. Deferring also means the board meeting focuses on the positive AI story without the governance context, which creates the same liability as Option A.
The CEO conversation: Presenting Option B means having a conversation with the CEO before the board meeting. Frame it as: "The governance story is positive, and it becomes stronger if we show the board that we are proactively extending it to cover new risks, rather than waiting for the regulator to tell us." A CEO who understands governance will recognise this as the more credible story.
Consolidation¶
You have completed the Risk & Governance track. Here are the core takeaways:
- The governance gap is structural, not accidental. Current AI governance frameworks were designed for single-agent systems. Multi-agent chains require a specific extension (the MASO framework) to close the gap between governing agents and governing chains.
- Epistemic integrity is a governance obligation. When your organisation deploys AI agents that make or support consequential decisions, you have an obligation to ensure those decisions are based on complete and current data. This is not a new obligation; it is the same obligation you have always had, but multi-agent systems make it harder to fulfil.
- Privileged agent governance is the accountability layer. Agents that commit actions (approve trades, execute orders, release recommendations) must be governed with the same rigour as the humans they augment or replace. This means identification, delegation authority, escalation criteria, and oversight.
- Evidence beats documentation. A regulator will not be satisfied by documentation that controls exist. They will want evidence that controls work: effectiveness evidence and coverage evidence, not just deployment evidence and activity evidence.
- Board reporting must cover chains, not just components. Chain integrity metrics, privileged agent oversight status, control effectiveness trends, and coverage gap analysis belong in board reporting. Component metrics (accuracy, uptime, throughput) are necessary but not sufficient.
Next steps¶
This track has given you the governance and accountability perspective. The Convergence Exercise brings together the risk, security, and engineering perspectives around a shared decision, where the governance lens you have built here meets the technical realities of implementation.
The convergence exercise is most valuable when multiple tracks have been completed by different team members. If you are working through this learning path with colleagues from security architecture or engineering, compare your exercise decisions before proceeding.