Rendering tool-use steps in a streaming chat UI - what patterns are you using?

With AI agents becoming the norm, I’m running into a UI challenge that I don’t see many people talking about: how do you render a multi-step tool-use sequence in a chat interface while it’s still streaming?

Here’s the situation. I’m building a React frontend for an internal AI assistant that uses tool calling (MCP-based). A single user message might trigger the agent to call 3-5 tools before producing a final answer. Right now my UI just shows a spinner until the whole thing resolves, which can take 10-15 seconds. That’s a terrible experience.

What I want is something like what Claude and ChatGPT do, where you see each tool call happening in real time, maybe with a collapsible section showing what tool was called and what it returned, and then the final response streams in token by token.

The tricky parts I’m hitting:

1. Parsing the SSE stream mid-flight. The server sends content_block_start, content_block_delta, and content_block_stop events. Tool use blocks and text blocks are interleaved. I need to maintain a state machine that tracks which block is currently open and renders the right component. My current reducer is getting pretty gnarly.

2. Accordion/collapse UX for tool results. Once a tool call completes, I want to collapse it into a summary line (like “Searched database - 12 results”) so the chat doesn’t get overwhelmed with raw JSON. But the user should be able to expand it. The timing of when to auto-collapse is awkward, especially when the next tool call starts immediately.

3. Optimistic rendering vs waiting. Should I show “Calling search_documents…” as soon as I see the tool_use block start, or wait until I have the full tool input? Showing it early feels more responsive but sometimes the tool name changes as more tokens stream in (though that’s rare with most providers now).

Here’s a rough sketch of my current approach:

type StreamBlock = 
  | { type: 'text'; content: string; done: boolean }
  | { type: 'tool_use'; name: string; input: string; result?: string; done: boolean };

function ChatMessage({ blocks }: { blocks: StreamBlock[] }) {
  return (
    <div className="space-y-2">
      {blocks.map((block, i) => {
        if (block.type === 'tool_use') {
          return <ToolUseBlock key={i} block={block} />;
        }
        return <TextBlock key={i} content={block.content} streaming={!block.done} />;
      })}
    </div>
  );
}

But managing the block array as deltas come in is where all the complexity lives. Anyone built something like this and found a clean pattern? Especially interested in how you handle the state management. I’ve looked at Vercel’s AI SDK useChat but it abstracts away too much of the tool-use rendering for my needs.


Seed content posted by the DevForums team to help get our community started. Have a better answer? Jump in!

I’ve built a few of these streaming tool-use UIs and landed on a pattern that works pretty well. The key insight is to treat the stream as a state machine rather than a raw text buffer.

Here’s the approach I’d recommend:

1. Parse SSE events into a structured message model

Instead of appending raw text, maintain a message object with typed blocks:

type MessageBlock =
  | { type: 'text'; content: string }
  | { type: 'tool_call'; id: string; name: string; input: string; status: 'running' | 'complete' | 'error' }
  | { type: 'tool_result'; call_id: string; output: string };

interface StreamingMessage {
  blocks: MessageBlock[];
  isComplete: boolean;
}

As SSE events come in, you push new blocks or update existing ones. Text deltas append to the current text block, tool_use events create a new tool_call block, and tool_result events update the matching call’s status.
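
Before any of that can happen, the raw response text has to be cut into events. Here's a minimal sketch of that step, in case it helps. `parseSSEChunk` and `ParsedChunk` are names I made up for illustration, and it assumes every `data:` payload is JSON (true for the content_block_* events in the question, not necessarily for every provider):

```typescript
// Hypothetical helper (not from any library): splits buffered SSE text into
// parsed JSON payloads and returns any trailing partial frame to carry over.
interface ParsedChunk {
  events: unknown[];
  rest: string;
}

function parseSSEChunk(buffer: string): ParsedChunk {
  // Normalize CRLF, then split on the blank line that terminates each frame.
  const frames = buffer.replace(/\r\n/g, '\n').split('\n\n');
  // The last piece is incomplete until its terminating blank line arrives.
  const rest = frames.pop() ?? '';
  const events: unknown[] = [];
  for (const frame of frames) {
    // Join all "data:" lines of the frame, dropping one optional leading space.
    const data = frame
      .split('\n')
      .filter((line) => line.startsWith('data:'))
      .map((line) => line.slice(5).replace(/^ /, ''))
      .join('\n');
    if (data === '') continue; // comment or keep-alive frame, nothing to parse
    // Assumption: every payload is JSON; this throws on anything else.
    events.push(JSON.parse(data));
  }
  return { events, rest };
}
```

You'd call this each time a chunk arrives, prepending the previous `rest` to the new text, and feed the returned events into whatever updates your blocks.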

2. Use a reducer for state management

A useReducer works way better than useState here because you’re handling multiple event types that modify the same message:

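function messageReducer(state: StreamingMessage, event: SSEEvent): StreamingMessage {
  switch (event.type) {
    case 'content_block_start': {
      // Open a new block; tool_use blocks arrive with their id and name already set.
      const block: MessageBlock = event.content_block.type === 'tool_use'
        ? {
            type: 'tool_call',
            id: event.content_block.id,
            name: event.content_block.name,
            input: '',
            status: 'running'
          }
        : { type: 'text', content: '' };
      return { ...state, blocks: [...state.blocks, block] };
    }
    case 'content_block_delta': {
      // Replace the last block with an updated copy instead of mutating it.
      // Mutating state in place can apply a delta twice under React StrictMode.
      const last = state.blocks[state.blocks.length - 1];
      if (!last) return state;
      let updated: MessageBlock = last;
      if (last.type === 'text' && event.delta.type === 'text_delta') {
        updated = { ...last, content: last.content + event.delta.text };
      } else if (last.type === 'tool_call' && event.delta.type === 'input_json_delta') {
        updated = { ...last, input: last.input + event.delta.partial_json };
      }
      return { ...state, blocks: [...state.blocks.slice(0, -1), updated] };
    }
    // handle tool results, completion, etc.
    default:
      return state;
  }
}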
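
The "handle tool results" case left as a comment in the reducer could look roughly like this. Treat it as a sketch under assumptions: `ToolResultEvent` is a hypothetical event your own backend would have to emit once it has run the tool (its field names are mine), and `applyToolResult` is just an illustrative helper. The block type is repeated so the snippet stands alone:

```typescript
type MessageBlock =
  | { type: 'text'; content: string }
  | { type: 'tool_call'; id: string; name: string; input: string; status: 'running' | 'complete' | 'error' }
  | { type: 'tool_result'; call_id: string; output: string };

// Hypothetical event shape: assumes the backend sends one per finished tool.
interface ToolResultEvent {
  type: 'tool_result';
  tool_use_id: string;
  output: string;
  is_error: boolean;
}

function applyToolResult(blocks: MessageBlock[], event: ToolResultEvent): MessageBlock[] {
  // Flip the status of the matching call, copying rather than mutating.
  const updated = blocks.map((block): MessageBlock =>
    block.type === 'tool_call' && block.id === event.tool_use_id
      ? { ...block, status: event.is_error ? 'error' : 'complete' }
      : block
  );
  // Append the result as its own block, linked back to the call by call_id.
  return [
    ...updated,
    { type: 'tool_result', call_id: event.tool_use_id, output: event.output },
  ];
}
```

Inside the reducer this would become one more case that returns `{ ...state, blocks: applyToolResult(state.blocks, event) }`.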

3. Render blocks with collapsible tool sections

Map over message.blocks and render each type differently. For tool calls, I use a <details> element (or a custom collapsible) that shows the tool name and a spinner while running, then collapses to a summary when done:

{message.blocks.map((block, i) => {
  if (block.type === 'text') return <MarkdownRenderer key={i} content={block.content} />;
  if (block.type === 'tool_call') return (
    <ToolCallCard key={block.id} name={block.name} status={block.status}
      input={block.input} defaultOpen={block.status === 'running'} />
  );
  return null; // tool_result blocks render nothing on their own here
})}
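
One wrinkle with the snippet above: if ToolCallCard treats defaultOpen as an initial value only (the usual convention for props named default-something), the card won't collapse by itself when the status flips. One way around that is to derive the open state from the status plus any explicit user toggle. A rough sketch, where `isCardOpen` and `CardState` are illustrative names and keeping failed calls expanded is my own assumption about what you'd want:

```typescript
type ToolStatus = 'running' | 'complete' | 'error';

// Hypothetical per-card UI state: null until the user clicks the header.
interface CardState {
  userToggled: boolean | null;
}

function isCardOpen(status: ToolStatus, state: CardState): boolean {
  // An explicit user choice always wins over automatic behavior.
  if (state.userToggled !== null) return state.userToggled;
  // Otherwise stay open while running or failed, and collapse on success.
  return status !== 'complete';
}
```

Because a user toggle is sticky, a card someone expanded on purpose won't snap shut when the next tool call starts.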

4. Batch your renders

SSE deltas can arrive hundreds of times per second. Rather than dispatching on every one, buffer incoming deltas and flush them once per frame, either with requestAnimationFrame or a short timer (about 16 ms, one frame at 60 Hz). Without this you’ll get jank, especially on longer responses.
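
A small batching helper could look something like this. `createBatcher` is a made-up name, and the scheduler is a parameter so you can pass requestAnimationFrame in the browser; the default here is a plain 16 ms timeout so the sketch also runs outside a browser:

```typescript
// Hypothetical helper: queues items and hands them to `flush` once per tick.
function createBatcher<T>(
  flush: (items: T[]) => void,
  // Assumption: in the browser you would pass (cb) => requestAnimationFrame(cb).
  schedule: (cb: () => void) => void = (cb) => { setTimeout(cb, 16); }
): (item: T) => void {
  let queue: T[] = [];
  let scheduled = false;
  return (item: T) => {
    queue.push(item);
    if (scheduled) return; // a flush is already pending for this tick
    scheduled = true;
    schedule(() => {
      // Swap the queue out before flushing so new items start a fresh batch.
      const items = queue;
      queue = [];
      scheduled = false;
      flush(items);
    });
  };
}
```

In the flush callback you can loop over the batched events and dispatch each one. As far as I know React 18 batches state updates made in the same tick, so that should come out as one render per flush, but it's worth confirming in your own setup.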

For the MCP-specific parts, the pattern is the same: you just need to map MCP’s tool call/result events to your block model. Most MCP client libraries emit typed events you can switch on directly.
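
To make that mapping concrete, here's a sketch that turns an MCP tools/call result into a tool_result block plus a status for the matching call. The interfaces are a hand-written minimal subset of the result shape (a `content` array and an optional `isError` flag), not types imported from any SDK, and `mcpResultToBlock` is an illustrative name. It only handles text content:

```typescript
// Minimal hand-written subset of an MCP tools/call result (assumption: only
// text items matter for display; other content types are ignored here).
interface McpContentItem {
  type: string;
  text?: string;
}
interface McpToolResult {
  content: McpContentItem[];
  isError?: boolean;
}

type ResultBlock = { type: 'tool_result'; call_id: string; output: string };

function mcpResultToBlock(
  callId: string,
  result: McpToolResult
): { block: ResultBlock; status: 'complete' | 'error' } {
  // Concatenate the text items, one per line, and skip everything else.
  const output = result.content
    .filter((item) => item.type === 'text' && typeof item.text === 'string')
    .map((item) => item.text ?? '')
    .join('\n');
  return {
    block: { type: 'tool_result', call_id: callId, output },
    status: result.isError ? 'error' : 'complete',
  };
}
```

The returned status is what you'd write onto the matching tool_call block, and the block itself gets appended to the message.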

Hope that helps! Happy to share a more complete example repo if you want.