Inline Sub-Conversations - Making Delegation Transparent

When the parent agent delegated to a specialist, the work happened in a completely separate conversation record. The user saw the final output but had no visibility into what actually happened. What prompt was sent to the specialist? What did the specialist actually say? How long did it take? All invisible.

The observability problem

In distributed systems there's a concept called observability - your ability to understand what's happening inside the system by looking at its outputs. Think of it as the difference between a car dashboard that shows speed, fuel, and engine temperature, versus a car that only has a single "everything's fine" light. When something goes wrong (or even just looks a bit off), the dashboard lets you diagnose it. The single light tells you nothing.

The multi-agent pipeline was the single light. The parent delegated, the specialist worked, a result came back. But the intermediate state - what was actually asked, what was actually produced, how long each step took - was locked away in separate database records with no way to see it from the chat. If the specialist gave an unexpected answer, you couldn't check whether the parent had framed the question poorly or whether the specialist just went off track.

Imagine a manager who delegates tasks to their team and only ever tells you "it's done." You never see the work, the intermediate steps, or the reasoning. You just have to trust the relay chain.

Parent-child linking

The fix borrows the idea of correlation IDs from distributed tracing. It's simple in concept: when the parent creates a delegation, the system stamps the sub-conversation with the ID of the parent message that triggered it (tracing systems call the equivalent field a parent span ID). It works like a reference number on a work order that points back to the original request. This creates a causal chain - you can trace from the user's question, through the parent's delegation decision, into the specialist's full conversation.
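To make the linking concrete, here is a minimal in-memory sketch. The names `Message`, `SubConversation`, `parent_message_id`, `delegate`, and `delegations_for` are assumptions made for illustration, not the project's actual schema or API.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import count

# Hypothetical ID generator; a real system would use database keys.
_ids = count(1)


@dataclass
class Message:
    conversation_id: int
    role: str
    content: str
    id: int = field(default_factory=lambda: next(_ids))


@dataclass
class SubConversation:
    specialist: str
    # The correlation ID: which parent message triggered this delegation.
    parent_message_id: int
    messages: list[Message] = field(default_factory=list)
    id: int = field(default_factory=lambda: next(_ids))


def delegate(parent_message: Message, specialist: str, prompt: str) -> SubConversation:
    """Create a sub-conversation stamped with the ID of the triggering message."""
    sub = SubConversation(specialist=specialist, parent_message_id=parent_message.id)
    sub.messages.append(Message(sub.id, "user", prompt))
    return sub


def delegations_for(message: Message, subs: list[SubConversation]) -> list[SubConversation]:
    """Walk the link in the other direction: message -> sub-conversations it triggered."""
    return [s for s in subs if s.parent_message_id == message.id]
```

Because the link is stored on the sub-conversation, nothing about the parent message has to change. The chat only needs to ask "which sub-conversations point at this message?"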

In the chat UI, after any message that triggered a delegation, a collapsible panel appears. Expand it and you see the specialist's full exchange: the prompt it received, its complete response, and timing information. You can visually walk the entire delegation chain, similar to how tools like Jaeger let you trace a web request as it bounces between microservices.

The panels are collapsed by default, which applies a design principle called progressive disclosure: show the essential information (the result) up front, and let the user drill into the details (the full delegation chain) only when they want to. No visual clutter, no information overload, but everything is one click away.

What this actually captures

Beyond the conversation content, we now record structured trace data for every agent interaction:

Tool calls and tool results (the full tool use chain, not just the final output)
Token usage per agent (a rough measure of how much work each one did)
Time to first token per specialist (how long before they started generating)
Total duration per delegation
Delegation depth and parent-child relationships
Success or failure status with the specific reason
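
One way to picture a single trace record holding the fields listed above is a small dataclass. This is a sketch only: the class names, field names, and units (milliseconds, separate input and output token counts) are assumptions, not the actual record format.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToolCall:
    name: str
    arguments: dict
    result: Optional[str] = None  # filled in once the tool returns


@dataclass
class TraceEntry:
    agent: str
    parent_message_id: Optional[int]  # None for the top-level agent
    depth: int                        # 0 = parent, 1 = specialist, 2 = its delegate, ...
    tool_calls: list[ToolCall] = field(default_factory=list)
    input_tokens: int = 0
    output_tokens: int = 0
    time_to_first_token_ms: Optional[float] = None
    duration_ms: Optional[float] = None
    status: str = "success"           # "success" or "failure"
    failure_reason: Optional[str] = None

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens
```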

Think of it as flight recorder data for your AI pipeline. You probably won't look at it most of the time, but when something goes wrong - a specialist gives a weird answer, or a delegation takes unusually long - you can pull up the trace and see exactly what happened at every step.

The orchestration context object acts as the collector, accumulating entries as delegation flows through the agent tree. Each sub-agent adds its own entry, including any child delegations it triggers (agents can delegate to other agents). By the time the request completes, you have a full call graph of every agent that participated.
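
The collector pattern can be sketched as follows. This is an illustration under assumptions, not the real orchestration context: the names `OrchestrationContext`, `Entry`, `delegation`, and `call_graph` are hypothetical, and the sketch assumes delegations run synchronously, one at a time.

```python
from __future__ import annotations

import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Entry:
    agent: str
    depth: int
    duration_ms: float = 0.0
    status: str = "success"
    children: list[Entry] = field(default_factory=list)


class OrchestrationContext:
    """Collects one entry per agent as delegation flows through the tree."""

    def __init__(self) -> None:
        self.roots: list[Entry] = []
        self._stack: list[Entry] = []  # agents currently running, outermost first

    @contextmanager
    def delegation(self, agent: str) -> Iterator[Entry]:
        entry = Entry(agent=agent, depth=len(self._stack))
        # Attach to whichever agent is currently running, or to the root list.
        (self._stack[-1].children if self._stack else self.roots).append(entry)
        self._stack.append(entry)
        start = time.perf_counter()
        try:
            yield entry
        except Exception:
            entry.status = "failure"
            raise
        finally:
            entry.duration_ms = (time.perf_counter() - start) * 1000
            self._stack.pop()

    def call_graph(self) -> list[str]:
        """Render the collected tree as indented lines, one per agent."""
        lines: list[str] = []

        def walk(entry: Entry) -> None:
            lines.append("  " * entry.depth + f"{entry.agent} [{entry.status}]")
            for child in entry.children:
                walk(child)

        for root in self.roots:
            walk(root)
        return lines


ctx = OrchestrationContext()
with ctx.delegation("parent"):
    with ctx.delegation("researcher"):
        with ctx.delegation("summariser"):
            pass
    with ctx.delegation("reviewer"):
        pass
print("\n".join(ctx.call_graph()))
```

The stack is what lets a sub-agent's own delegations land under the right parent without any agent needing to know where it sits in the tree.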

What Claude did here

The database change (adding a column to link sub-conversations back to their parent message, with an index for fast lookups) and the React component for inline panels were both implemented by Claude after I described the target: "I want to see what the sub-agent actually did, right there in the chat." The tricky part was threading the correlation ID through the delegation pipeline - the message ID needs to be captured at the point of delegation and propagated through the tool call, the job dispatch, and into the sub-conversation record. Claude traced the full call chain and identified exactly where it needed to be passed. The panel component uses a render-on-expand pattern so it doesn't load sub-conversation data until you actually open it, keeping the chat performant even with many delegations.
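
For readers who want to see the shape of that change, here is a rough sketch using Python's built-in `sqlite3` module. Everything in it is an assumption made for illustration: the table and column names, the three function names, and the use of SQLite itself. The real project may use a different database and a real job queue. What it shows is the message ID being captured once and carried unchanged through each hop.

```python
from __future__ import annotations

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (
        id INTEGER PRIMARY KEY,
        conversation_id INTEGER NOT NULL,
        role TEXT NOT NULL,
        content TEXT NOT NULL
    );
    CREATE TABLE sub_conversations (
        id INTEGER PRIMARY KEY,
        specialist TEXT NOT NULL,
        prompt TEXT NOT NULL,
        -- The new column: links a sub-conversation back to its parent message.
        parent_message_id INTEGER REFERENCES messages(id)
    );
    -- The index that keeps "sub-conversations for this message" lookups fast.
    CREATE INDEX idx_sub_conversations_parent
        ON sub_conversations (parent_message_id);
""")


def handle_delegate_tool_call(parent_message_id: int, specialist: str, prompt: str) -> int:
    # Hop 1: capture the message ID at the point of delegation.
    job = {"specialist": specialist, "prompt": prompt, "parent_message_id": parent_message_id}
    return dispatch_job(job)


def dispatch_job(job: dict) -> int:
    # Hop 2: the job payload carries the ID unchanged
    # (a real queue would serialise this payload; here we call the worker directly).
    return run_specialist_job(job)


def run_specialist_job(job: dict) -> int:
    # Hop 3: the worker writes the ID onto the sub-conversation record.
    cur = conn.execute(
        "INSERT INTO sub_conversations (specialist, prompt, parent_message_id) VALUES (?, ?, ?)",
        (job["specialist"], job["prompt"], job["parent_message_id"]),
    )
    conn.commit()
    return cur.lastrowid


def sub_conversations_for(message_id: int) -> list[tuple]:
    # The lookup a panel would make when expanded.
    return conn.execute(
        "SELECT id, specialist, prompt FROM sub_conversations WHERE parent_message_id = ?",
        (message_id,),
    ).fetchall()
```

In a render-on-expand panel, a lookup like `sub_conversations_for` would run only when the user opens the panel, so a chat with many delegations would load nothing extra until someone asks for it.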