Multi-Agent Governance: Who Is Responsible When Agents Delegate to Agents?

The simplest AI agent architecture is a single agent making decisions. The governance model for a single agent is well understood: define what the agent is authorized to do, evaluate its decisions against explicit rules, and capture a trace of every evaluation. The authority model is clear because there is one agent, one set of permissions, and one decision record per action.

The moment a second agent enters the system, governance becomes a fundamentally harder problem. When Agent A receives a task, decides it needs specialized processing, and delegates to Agent B — which in turn calls Agent C for data enrichment — three agents have participated in producing a single outcome. If that outcome is wrong, harmful, or non-compliant, the question becomes: who is responsible? The agent that initiated the chain? The agent that made the final decision? The agent that provided the data that led to the decision? The human who deployed the system?

This is not a theoretical concern. Multi-agent architectures are becoming the standard pattern for complex AI systems. Stanford HAI's research on responsible agentic systems identifies multi-agent delegation as one of the three most urgent governance challenges in agentic AI, specifically because the delegation patterns that make multi-agent systems powerful are the same patterns that make accountability attribution difficult.

The Accountability Gap in Multi-Agent Systems

In a single-agent system, every decision has one decision-maker. The audit trail is a flat sequence: agent received input, agent evaluated rules, agent produced outcome. In a multi-agent system, the decision path is a tree — or more accurately, a directed acyclic graph — where multiple agents contribute to a single outcome through delegation, consultation, and data exchange.

The accountability gap emerges because standard audit trails are designed for linear processes. They capture what each individual agent did, but they do not capture the delegation relationships between agents. Consider a concrete example:

A customer retention system (Agent A) identifies an at-risk account and decides to initiate a retention workflow. It delegates to a discount authorization agent (Agent B) to determine what offer is appropriate. Agent B calls a customer lifetime value agent (Agent C) to compute the customer's projected value. Agent C returns a value. Agent B uses that value to authorize a 30% discount. Agent A presents the offer to the customer.

If the 30% discount violates the company's margin policy, where did the failure occur? Did Agent A delegate inappropriately? Did Agent B apply the wrong authorization rules? Did Agent C compute an incorrect value that led to an inappropriate authorization? A standard per-agent audit trail shows three separate, internally consistent decision records. What it does not show is the authority chain: who authorized whom to do what, with what constraints, and whether those constraints were maintained across the delegation boundary.

The multi-agent safety survey literature identifies this as the "distributed accountability problem" — the phenomenon where accountability is nominally present at each node in a multi-agent chain but is not attributable at the system level because no single record captures the end-to-end authority chain.

Why Standard Observability Fails for Multi-Agent Delegation

Teams building multi-agent systems typically implement observability using distributed tracing frameworks — tools like OpenTelemetry, Jaeger, or Datadog APM. These tools capture spans across service boundaries, propagate trace context through HTTP headers, and produce flame graphs showing the timing and dependency relationships between service calls. They are excellent engineering tools. They are inadequate governance tools.

Distributed tracing captures call graphs: Agent A called Agent B, which called Agent C, which returned in 47ms. It captures what happened operationally. It does not capture what was authorized to happen: what authority did Agent A grant to Agent B? What constraints did that authority include? Did Agent B exceed those constraints? Did Agent B propagate appropriate constraints when it delegated to Agent C?

The distinction is between operational tracing and authority tracing. Operational tracing answers "what did the system do?" Authority tracing answers "what was the system permitted to do, and did it stay within those permissions at every delegation step?" For governance purposes, both are necessary. For accountability purposes, only authority tracing can reconstruct the chain of responsibility.

Capability	Operational Tracing (e.g., OpenTelemetry)	Authority Tracing (Governance-Grade)
Captures call graph between agents	Yes	Yes
Records timing and performance	Yes	Not primary concern
Records what authority was delegated	No	Yes
Records constraints on delegated authority	No	Yes
Detects authority boundary violations	No	Yes
Reconstructs end-to-end accountability chain	Partially (call graph only)	Yes (authority chain with constraints)
Supports audit evidence requirements	Insufficient alone	Designed for this purpose

Regulatory Context: Who Is Accountable Under Current Frameworks?

Current regulatory frameworks were not designed for multi-agent delegation, but their accountability provisions apply regardless. Understanding how existing frameworks attribute responsibility is essential for designing governance architectures that satisfy them.

NIST AI RMF Govern 2.1 calls for roles, responsibilities, and lines of communication for managing AI risk to be documented and clearly assigned within the organization. The framework does not distinguish between single-agent and multi-agent architectures: the deploying organization is accountable for the system's outcomes regardless of how many agents participated in producing them. This means that "Agent B made the decision, not Agent A" is not a valid accountability attribution from a governance perspective. The organization that deployed the multi-agent system is accountable for the system's behavior as a whole.

EU AI Act Article 26 places obligations on deployers of high-risk AI systems, including the obligation to use the system in accordance with its instructions for use and to assign human oversight to competent people. For multi-agent systems, this means the deployer must be able to demonstrate that delegation between agents does not create decision paths that escape the oversight mechanisms. If Agent A delegates to Agent B and Agent B makes a high-risk decision without the human oversight that the system description promises, the deployer has a compliance gap, even if Agent A's individual controls are well designed.

The practical implication is clear: regulatory accountability does not fragment across agents. It remains with the deploying organization. The governance architecture must therefore provide the deploying organization with the ability to reconstruct the complete authority chain for any outcome, regardless of how many agents participated.

Pattern 1: Explicit Delegation Tokens

The first governance pattern addresses the authority gap directly: when Agent A delegates to Agent B, the delegation itself becomes a governed, recorded event with explicit parameters.

A delegation token is a structured artifact that accompanies every inter-agent delegation. It contains: the identity of the delegating agent, the identity of the receiving agent, the specific scope of authority being delegated (what the receiving agent is authorized to decide or do), the constraints on that authority (limits, exclusions, conditions), an expiration (the delegation is time-bounded), and the trace context linking this delegation to the originating decision chain.

// Delegation token structure
{
  "delegation_id": "del_a1b2c3d4",
  "delegating_agent": "retention-orchestrator",
  "receiving_agent": "discount-authorization",
  "scope": {
    "action": "determine_discount_offer",
    "account_id": "acct_789",
    "max_discount_percent": 25,
    "allowed_discount_types": ["percentage", "credit"],
    "excluded_products": ["enterprise_tier"]
  },
  "constraints": {
    "requires_margin_check": true,
    "min_remaining_contract_months": 3,
    "max_total_value": 5000
  },
  "expires_at": "2026-04-29T15:30:00Z",
  "parent_trace_id": "trace_x7y8z9",
  "delegation_depth": 1,
  "max_delegation_depth": 2
}

The delegation token serves three governance functions. First, it makes the scope of delegated authority explicit and enforceable — Agent B can only operate within the boundaries defined by the token. Second, it creates a recorded authority chain — the token itself is evidence of what authority was granted, by whom, with what constraints. Third, it enables constraint propagation — when Agent B delegates to Agent C, it can only delegate authority that falls within the scope it received, and the delegation depth counter prevents unbounded delegation chains.

The critical design decision is that delegation tokens are evaluated, not just recorded. The receiving agent's decision checkpoint validates the token before proceeding: is the token valid? Is the requested action within scope? Are the constraints satisfied? If any validation fails, the delegation is rejected and the rejection is traced.
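The three validation questions above can be sketched as a single checkpoint function. This is a minimal illustration, not a reference implementation: the function name `validate_delegation`, the `ValidationResult` type, and the shape of the `proposal` dictionary are assumptions made for this example, and only a subset of the token fields shown earlier is checked.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ValidationResult:
    accepted: bool
    reasons: tuple  # rejection reasons; empty when the delegation is accepted


def validate_delegation(token: dict, proposal: dict, now: datetime) -> ValidationResult:
    """Check a proposed action against the delegation token that authorizes it."""
    reasons = []
    scope = token["scope"]
    constraints = token["constraints"]

    # Is the token valid? (not expired, depth within the limit)
    expires_at = datetime.fromisoformat(token["expires_at"].replace("Z", "+00:00"))
    if now >= expires_at:
        reasons.append("token expired")
    if token["delegation_depth"] > token["max_delegation_depth"]:
        reasons.append("delegation depth exceeded")

    # Is the requested action within scope?
    if proposal["action"] != scope["action"]:
        reasons.append("action out of scope")
    if proposal["account_id"] != scope["account_id"]:
        reasons.append("account out of scope")
    if proposal["discount_type"] not in scope["allowed_discount_types"]:
        reasons.append("discount type not allowed")
    if proposal["discount_percent"] > scope["max_discount_percent"]:
        reasons.append("discount exceeds delegated maximum")

    # Are the constraints satisfied?
    if constraints["requires_margin_check"] and not proposal["margin_check_passed"]:
        reasons.append("margin check missing or failed")
    if proposal["total_value"] > constraints["max_total_value"]:
        reasons.append("total value exceeds limit")

    return ValidationResult(accepted=not reasons, reasons=tuple(reasons))


# Subset of the delegation token shown earlier in this section.
TOKEN = {
    "delegation_id": "del_a1b2c3d4",
    "scope": {
        "action": "determine_discount_offer",
        "account_id": "acct_789",
        "max_discount_percent": 25,
        "allowed_discount_types": ["percentage", "credit"],
    },
    "constraints": {"requires_margin_check": True, "max_total_value": 5000},
    "expires_at": "2026-04-29T15:30:00Z",
    "delegation_depth": 1,
    "max_delegation_depth": 2,
}

# Hypothetical proposal from the discount authorization agent: 30% off.
proposal = {
    "action": "determine_discount_offer",
    "account_id": "acct_789",
    "discount_type": "percentage",
    "discount_percent": 30,
    "total_value": 1200,
    "margin_check_passed": True,
}

result = validate_delegation(TOKEN, proposal, datetime(2026, 4, 29, 15, 0, tzinfo=timezone.utc))
print(result.accepted, result.reasons)
```

With the 30% proposal from the retention example, the checkpoint rejects the delegation because the token caps the discount at 25%. In a real system the returned reasons would be written to the trace store so that the rejection itself becomes part of the authority chain.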

Pattern 2: Cascading Trace Propagation

The second governance pattern addresses the observability gap: ensuring that the decision traces generated by individual agents compose into a coherent end-to-end record of the multi-agent decision process.

In a single-agent system, a decision trace is a self-contained record: inputs, rules evaluated, outcome. In a multi-agent system, each agent generates its own trace, but the individual traces are fragments of a larger decision process. Cascading trace propagation ensures that these fragments can be assembled into a complete authority chain.

The mechanism is straightforward in concept: every trace generated in a multi-agent chain includes a reference to the parent trace that initiated the delegation, the delegation token that authorized the current agent's participation, and the depth in the delegation chain. These references create a tree structure that can be traversed to reconstruct the full decision path from the initiating agent to the final action.

The implementation requires three properties that go beyond standard distributed tracing:

Causal ordering, not just temporal ordering. Standard distributed traces are temporally ordered: this happened, then this happened. Cascading governance traces are causally ordered: this agent delegated to this agent because of this evaluation, which produced this outcome, which caused this delegation. The causal chain is the authority chain. Temporal ordering is necessary but not sufficient — two agents can execute concurrently, but the authority relationship between them is still directional.

Constraint inheritance visibility. Each trace in the chain must document not just the rules the agent evaluated, but also the constraints it inherited from the delegating agent. When the discount authorization agent evaluates its rules, the trace must show that it operated under a maximum discount constraint of 25% inherited from the delegation token — even if its own rules would have permitted 40%. The inherited constraint is part of the decision record because it explains why the agent's behavior differed from what its own rules alone would have produced.

Cross-agent completeness guarantees. The trace system must guarantee that every agent in the delegation chain produced a trace. A gap in the chain — an agent that participated but did not produce a trace — makes the end-to-end record unreliable. This requires that the trace infrastructure is a mandatory component of the inter-agent communication protocol, not an optional add-on that agents can bypass.
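To make these properties concrete, the sketch below walks parent references from a leaf trace back to the initiating trace and fails loudly if any link is missing. The `GovernanceTrace` fields, the in-memory `store` dictionary, and the `IncompleteChainError` exception are hypothetical names chosen for illustration; a production trace store would look different.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GovernanceTrace:
    trace_id: str
    agent: str
    parent_trace_id: Optional[str]  # None for the initiating agent
    delegation_id: Optional[str]    # token that authorized this agent
    depth: int                      # position in the delegation chain
    inherited_constraints: dict     # constraints received via the token
    outcome: str


class IncompleteChainError(Exception):
    """Raised when the chain has a missing trace or a depth gap."""


def authority_chain(store: dict, leaf_trace_id: str) -> list:
    """Walk parent references from a leaf trace back to the initiating trace."""
    chain = []
    current_id = leaf_trace_id
    while current_id is not None:
        trace = store.get(current_id)
        if trace is None:
            raise IncompleteChainError(f"missing trace {current_id}")
        # Each parent must sit exactly one level above its child.
        if chain and trace.depth != chain[-1].depth - 1:
            raise IncompleteChainError(f"depth gap at {current_id}")
        chain.append(trace)
        current_id = trace.parent_trace_id
    chain.reverse()  # root first: causal order, independent of timestamps
    return chain


store = {
    "t0": GovernanceTrace("t0", "retention-orchestrator", None, None, 0,
                          {}, "initiate_retention"),
    "t1": GovernanceTrace("t1", "discount-authorization", "t0", "del_a1b2c3d4", 1,
                          {"max_discount_percent": 25}, "authorize_discount"),
    "t2": GovernanceTrace("t2", "customer-lifetime-value", "t1", "del_e5f6", 2,
                          {"max_discount_percent": 25}, "return_value"),
}

for trace in authority_chain(store, "t2"):
    print(trace.depth, trace.agent, trace.inherited_constraints)
```

The chain is ordered by the parent references rather than by timestamps, which is what makes it a causal record. Removing the middle trace from the store causes `authority_chain` to raise instead of returning a partial chain, a simple way to surface a completeness violation.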

Pattern 3: Unified Decision Plane Evaluation

The third pattern takes a fundamentally different architectural approach: instead of governing delegation between autonomous agents, it routes all consequential decisions through a single, shared decision plane that evaluates rules regardless of which agent initiated the evaluation.

In this pattern, agents do not delegate decision authority to other agents. Instead, every agent that needs to make a consequential decision sends its proposed action to the same decision plane for evaluation. The decision plane evaluates the proposal against the applicable rules — which may include rules specific to the proposing agent, rules specific to the action type, and cross-cutting rules that apply to all agents — and returns an authorized, traced outcome.

The governance advantage is significant: the decision plane becomes the single point of authority for all agent decisions, regardless of how many agents are operating. There is no authority fragmentation because authority was never distributed. Every decision, regardless of which agent proposed it, is evaluated by the same rule engine, subject to the same constraints, and recorded in the same trace store.

This pattern does not eliminate multi-agent coordination — agents still specialize, communicate, and compose their capabilities. What it eliminates is multi-agent decision authority. An agent can compute a recommendation, aggregate data, or coordinate a workflow. When it needs to take a consequential action — one that affects customers, data, or financial outcomes — it proposes the action rather than executing it directly. The decision plane is the only entity in the system that commits decisions.
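A minimal sketch of the propose-then-decide flow might look like the following. The `Proposal`, `Decision`, and `DecisionPlane` classes and the `margin_policy` rule, including its 25% threshold, are assumptions made for this example rather than part of any particular product.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Proposal:
    agent: str
    action: str
    params: dict


@dataclass(frozen=True)
class Decision:
    proposal: Proposal
    authorized: bool
    violated_rules: tuple


# A rule returns None when satisfied, or a string naming the violation.
Rule = Callable[[Proposal], Optional[str]]


@dataclass
class DecisionPlane:
    rules: List[Rule]
    trace_store: List[Decision] = field(default_factory=list)

    def evaluate(self, proposal: Proposal) -> Decision:
        """Evaluate a proposal against every rule and record the result."""
        results = (rule(proposal) for rule in self.rules)
        violations = tuple(v for v in results if v is not None)
        decision = Decision(proposal, authorized=not violations, violated_rules=violations)
        self.trace_store.append(decision)  # recorded whether authorized or not
        return decision


def margin_policy(p: Proposal) -> Optional[str]:
    # Hypothetical cross-cutting rule: no discount above 25%.
    if p.action == "offer_discount" and p.params.get("discount_percent", 0) > 25:
        return "margin_policy: discount above 25%"
    return None


plane = DecisionPlane(rules=[margin_policy])
decision = plane.evaluate(
    Proposal("discount-authorization", "offer_discount", {"discount_percent": 30})
)
print(decision.authorized, decision.violated_rules)
```

The agent never commits the discount itself. It submits a proposal, and the same rule list and trace store apply no matter which agent made the proposal.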

Property	Delegation Tokens	Cascading Traces	Unified Decision Plane
Where decision authority lives	Distributed across agents with scoped delegation	Distributed across agents with full trace lineage	Centralized in the decision plane
Accountability attribution	Reconstructed from delegation token chain	Reconstructed from cascading trace tree	Every decision has a single evaluator
Governance complexity	Moderate — requires token validation at each boundary	Moderate — requires cross-agent trace assembly	Lower — single evaluation point for all decisions
Agent autonomy	High — agents make decisions within delegated scope	High — agents decide freely but trace completely	Lower — agents propose, the decision plane decides
Best suited for	Systems where agents need real-time decision authority	Systems with existing multi-agent architectures	Systems where auditability is the primary requirement

Combining the Patterns: A Practical Architecture

These three patterns are not mutually exclusive. In practice, the most robust multi-agent governance architectures combine elements of all three.

The unified decision plane serves as the evaluation layer for all consequential decisions — actions that affect customers, commit financial resources, or modify system state. Agents that need to take consequential actions route their proposals through the decision plane, which evaluates rules and produces traced outcomes. This provides the strong auditability guarantee for the decisions that matter most.

Delegation tokens govern the coordination between agents for non-consequential operations: data retrieval, computation, workflow orchestration. When the discount authorization agent (Agent B) asks the customer lifetime value agent (Agent C) to compute a customer's projected value, the delegation token scopes what data Agent C can access and how the result will be used. The computation itself is not a consequential decision, so it does not require decision plane evaluation.

Cascading trace propagation ties everything together: every operation, whether evaluated by the decision plane or executed under a delegation token, produces a trace that references the parent context. The complete trace tree for any outcome can be reconstructed from any node in the chain.

This layered approach provides proportional governance: the strongest controls are applied to the highest-stakes decisions, while lower-stakes operations receive appropriate but less restrictive governance. The trace infrastructure is uniform across all layers, which means the end-to-end audit trail is complete regardless of which governance pattern governed each step.
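One way to picture the layering is a small router that picks the governing layer per action while emitting the same trace shape in both cases. The action names in `CONSEQUENTIAL_ACTIONS` and the `TraceEntry` fields are illustrative assumptions; an actual deployment would define its own classification of consequential actions.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of actions that must go through the decision plane.
CONSEQUENTIAL_ACTIONS = frozenset({"offer_discount", "issue_credit", "modify_account"})


@dataclass(frozen=True)
class TraceEntry:
    trace_id: str
    parent_trace_id: Optional[str]
    agent: str
    action: str
    governed_by: str  # "decision_plane" or "delegation_token"


def governance_layer(action: str) -> str:
    """Strongest control for consequential actions, tokens for the rest."""
    return "decision_plane" if action in CONSEQUENTIAL_ACTIONS else "delegation_token"


def record(trace_id: str, parent_trace_id: Optional[str], agent: str, action: str) -> TraceEntry:
    """Produce a uniform trace entry regardless of which layer governed the step."""
    return TraceEntry(trace_id, parent_trace_id, agent, action, governance_layer(action))


entries = [
    record("t1", "t0", "customer-lifetime-value", "compute_lifetime_value"),
    record("t2", "t0", "discount-authorization", "offer_discount"),
]
for entry in entries:
    print(entry.action, "->", entry.governed_by)
```

Because both steps produce the same `TraceEntry` shape with a parent reference, the audit trail stays uniform even though the controls applied to each step differ.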

Implementation Considerations for Multi-Agent Governance

Delegation Depth Limits

Unbounded delegation chains are ungovernable. A practical governance architecture enforces a maximum delegation depth, typically no more than two to three levels. Beyond that depth, the authority chain becomes too long to reconstruct reliably, constraint inheritance becomes unreliable, and the blast radius of a misbehaving agent becomes difficult to bound. Delegation depth limits should be configurable per action type: a data retrieval delegation may tolerate three levels, while a financial commitment delegation should tolerate only one.
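A per-action depth limit can be expressed as a small lookup, sketched below. The action type names, the default of two levels, and the convention that the initiating agent sits at depth 0 are assumptions for this example.

```python
# Hypothetical per-action limits; the article's examples are three levels for
# data retrieval and one for financial commitments.
MAX_DEPTH_BY_ACTION = {"data_retrieval": 3, "financial_commitment": 1}
DEFAULT_MAX_DEPTH = 2  # assumed fallback for unlisted action types


def may_delegate_further(action_type: str, current_depth: int) -> bool:
    """True if an agent at `current_depth` may create one more delegation.

    Depth 0 is the initiating agent; each delegation adds one level.
    """
    limit = MAX_DEPTH_BY_ACTION.get(action_type, DEFAULT_MAX_DEPTH)
    return current_depth + 1 <= limit


print(may_delegate_further("financial_commitment", 0))
print(may_delegate_further("financial_commitment", 1))
```

Under these assumptions the initiating agent may delegate a financial commitment once, but the receiving agent may not pass it on.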

Constraint Narrowing, Not Widening

A fundamental rule of delegation governance: a delegating agent can narrow constraints but never widen them. If Agent A delegates to Agent B with a maximum discount of 25%, Agent B cannot delegate to Agent C with a maximum discount of 30%. The constraint inheritance must be monotonically narrowing — each delegation step can only reduce the scope of authority, never expand it. This property must be enforced by the infrastructure, not by agent behavior.
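An infrastructure-level narrowing check could look roughly like this. The function name `narrowing_violations` is hypothetical, and the field names are borrowed from the delegation token example earlier in the article. The check covers numeric ceilings, allow-lists, and exclusion lists only.

```python
def narrowing_violations(parent: dict, child: dict) -> list:
    """Return the ways `child` widens authority relative to `parent`."""
    violations = []

    # Numeric ceilings: the child may lower a ceiling, not raise or drop it.
    for key in ("max_discount_percent", "max_total_value"):
        if key in parent and child.get(key, float("inf")) > parent[key]:
            violations.append(f"{key} widened")

    # Allow-lists: the child may remove entries but not add any, and it may
    # not omit the list (an omitted list would mean "unrestricted").
    if "allowed_discount_types" in parent:
        allowed = child.get("allowed_discount_types")
        if allowed is None or not set(allowed) <= set(parent["allowed_discount_types"]):
            violations.append("allowed_discount_types widened")

    # Exclusion lists: the child must keep every exclusion it inherited.
    inherited_exclusions = set(parent.get("excluded_products", []))
    if not inherited_exclusions <= set(child.get("excluded_products", [])):
        violations.append("excluded_products dropped")

    return violations


parent_scope = {
    "max_discount_percent": 25,
    "allowed_discount_types": ["percentage", "credit"],
    "excluded_products": ["enterprise_tier"],
}
widened = dict(parent_scope, max_discount_percent=30)
narrowed = {
    "max_discount_percent": 20,
    "allowed_discount_types": ["credit"],
    "excluded_products": ["enterprise_tier", "pro_tier"],
}

print(narrowing_violations(parent_scope, widened))
print(narrowing_violations(parent_scope, narrowed))
```

The first call reports the widened discount ceiling and the second returns an empty list. Running this check where tokens are issued, rather than inside each agent, is what keeps the guarantee independent of agent behavior.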

Failure Propagation and Rollback

When an agent in a delegation chain fails — times out, produces an error, or violates a constraint — the governance architecture must define how that failure propagates. Does the entire chain roll back? Does the delegating agent retry with a different agent? Does the system escalate to human review? These decisions are themselves governance decisions that should be defined in the rule set, not implemented ad hoc in agent code.
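Treating failure handling as part of the rule set can be as simple as a declarative policy table that the infrastructure consults. Every entry below is an illustrative assumption. The point is only that the mapping lives in governed configuration rather than in agent code.

```python
from enum import Enum


class FailureAction(Enum):
    ROLLBACK_CHAIN = "rollback_chain"
    RETRY_ALTERNATE_AGENT = "retry_alternate_agent"
    ESCALATE_TO_HUMAN = "escalate_to_human"


# (failure type, is the operation consequential?) -> governed response.
# Hypothetical policy values chosen for illustration only.
FAILURE_POLICY = {
    ("timeout", False): FailureAction.RETRY_ALTERNATE_AGENT,
    ("timeout", True): FailureAction.ESCALATE_TO_HUMAN,
    ("error", False): FailureAction.RETRY_ALTERNATE_AGENT,
    ("error", True): FailureAction.ROLLBACK_CHAIN,
    ("constraint_violation", False): FailureAction.ESCALATE_TO_HUMAN,
    ("constraint_violation", True): FailureAction.ROLLBACK_CHAIN,
}


def on_failure(failure_type: str, consequential: bool) -> FailureAction:
    """Look up the governed response; unknown failures go to a human."""
    return FAILURE_POLICY.get((failure_type, consequential), FailureAction.ESCALATE_TO_HUMAN)


print(on_failure("timeout", False).value)
print(on_failure("constraint_violation", True).value)
```

Defaulting unknown failure types to human escalation is a conservative design choice made for this sketch. A different rule set could choose differently.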

Human Override at Any Level

Regulatory frameworks such as the EU AI Act require effective human oversight of high-risk AI systems. For multi-agent systems, this means that a human operator must be able to intervene at any level of the delegation chain, not just at the top level. The governance architecture must support injecting a human review step at any delegation boundary, and the trace infrastructure must record the human intervention as part of the authority chain.

The Core Principle: Delegation Is a Governed Action

The fundamental shift required for multi-agent governance is treating delegation itself as a governed action — not an implementation detail. When Agent A delegates to Agent B, that delegation is not a function call. It is an authority transfer that should be evaluated, constrained, and recorded with the same rigor applied to any other consequential action.

Systems that treat delegation as a function call produce multi-agent architectures where accountability evaporates at every boundary. Systems that treat delegation as a governed action produce architectures where the authority chain is explicit, the constraints are enforced, and the end-to-end record is complete.

This principle extends the propose-then-decide architecture described in Infrastructure for Deterministic AI Decisions to multi-agent contexts. In a single-agent system, the agent proposes and the decision plane decides. In a multi-agent system, the same principle applies: agents propose, the governance infrastructure decides — both the substantive decisions and the meta-decisions about who is authorized to propose what, under what constraints, through what delegation chain.

For the authorization model that governs individual agent permissions, see The Authorization Model for AI Agents. For the decision trace architecture that captures evaluation records across agent boundaries, see Decision Traces: The Audit Log Pattern That Makes AI Systems Defensible.