Optimize Latency for Parallel Agent Runs with Streaming #498

Closed · adhishthite opened this issue Apr 14, 2025 · 5 comments
Labels: question (Question about using the SDK), stale

Comments

@adhishthite

Please read this first

  • Have you read the docs? Agents SDK docs -> Yes
  • Have you searched for related issues? Others may have had similar requests -> Yes

Question

I'm implementing the parallel translation pattern from the examples, where multiple agents generate translations simultaneously and a selection agent chooses the best one. While this approach improves quality, it introduces significant latency in the user experience, especially when streaming responses.

Current Implementation

The current implementation follows this pattern:

import asyncio

from agents import ItemHelpers, Runner, trace

# spanish_agent and translation_picker are Agent instances defined elsewhere.


async def main():
    msg = input("Enter message for translation: ")

    with trace("Parallel translation"):
        # Run 3 translation agents in parallel (takes ~10s total)
        res_1, res_2, res_3 = await asyncio.gather(
            Runner.run(spanish_agent, msg),
            Runner.run(spanish_agent, msg),
            Runner.run(spanish_agent, msg),
        )

        # Collect outputs and combine
        outputs = [
            ItemHelpers.text_message_outputs(res_1.new_items),
            ItemHelpers.text_message_outputs(res_2.new_items),
            ItemHelpers.text_message_outputs(res_3.new_items),
        ]
        translations = "\n\n".join(outputs)

        # Run selection agent to pick best (adds more latency)
        best_translation = await Runner.run(
            translation_picker,
            f"Input: {msg}\n\nTranslations:\n{translations}",
        )
        print(best_translation.final_output)

Problem Statement

The current workflow creates a significant latency problem in streaming scenarios:

  1. All translation agents must complete execution (taking ~10s in parallel)
  2. Only after all translations finish can the selection agent begin processing
  3. The UI shows no output until the selection agent starts streaming
  4. The latency compounds when this step is part of a longer agent chain

In effect, the time to the first visible token is the slowest translation's duration plus the picker's own time to first token. This leads to a poor user experience with long periods of no feedback, especially in complex workflows where subsequent agents depend on the translation output.

What's the recommended approach to optimize this pattern for streaming scenarios while maintaining the quality benefits of parallel execution and selection?

adhishthite added the question label on Apr 14, 2025
@rm-openai (Collaborator)

I think this is more of a product question than a technical one. The common pattern I've seen is to stream updates to the user as the other agents run: i.e., run your parallel agents in a streaming fashion, use the streaming events to deliver progress updates, then show a final response when done.
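
A minimal sketch of that pattern, assuming the spanish_agent and translation_picker agents from the snippet above. It uses Runner.run_streamed and stream_events() from the Agents SDK: each translation reports progress as its run items arrive, asyncio.as_completed announces translations as they finish, and the picker's final answer is streamed token by token.

import asyncio

from agents import ItemHelpers, Runner

# spanish_agent and translation_picker: the Agent instances from the snippet above.


async def stream_one(agent, msg: str, idx: int) -> str:
    # run_streamed returns immediately; iterating stream_events()
    # drives the run and yields events as they happen.
    result = Runner.run_streamed(agent, msg)
    async for event in result.stream_events():
        if event.type == "run_item_stream_event":
            print(f"[translator {idx}] {event.item.type}")
    return ItemHelpers.text_message_outputs(result.new_items)


async def main():
    msg = input("Enter message for translation: ")
    tasks = [asyncio.create_task(stream_one(spanish_agent, msg, i)) for i in range(3)]

    outputs = []
    # Surface each translation as soon as it finishes, instead of
    # waiting silently for all three the way asyncio.gather does.
    for done in asyncio.as_completed(tasks):
        outputs.append(await done)
        print(f"{len(outputs)}/3 translations ready...")

    translations = "\n\n".join(outputs)

    # Stream the picker's answer token by token so the user sees the
    # final response as soon as the first delta arrives.
    picker = Runner.run_streamed(
        translation_picker,
        f"Input: {msg}\n\nTranslations:\n{translations}",
    )
    async for event in picker.stream_events():
        if event.type == "raw_response_event" and hasattr(event.data, "delta"):
            print(event.data.delta, end="", flush=True)


if __name__ == "__main__":
    asyncio.run(main())

The as_completed loop is the key design choice: the user sees "1/3 translations ready..." within a few seconds instead of staring at a blank screen until gather returns, and only the picker's output needs to stream in full.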

@adhishthite (Author)

Thanks @rm-openai.

Do we have an example of such streaming?

@rm-openai (Collaborator)

github-actions (bot)

This issue is stale because it has been open for 7 days with no activity.

github-actions bot added the stale label on Apr 23, 2025
github-actions (bot)

This issue was closed because it has been inactive for 3 days since being marked as stale.

github-actions bot closed this as not planned (stale) on Apr 27, 2025