Have you searched for related issues? Others may have had similar requests -> Yes
Question
I'm implementing a parallel translation pattern in the examples, where multiple agents generate translations simultaneously, and a selection agent chooses the best one. While this approach provides quality benefits, it introduces significant latency in the user experience, especially when streaming responses.
Current Implementation
The current implementation follows this pattern:
import asyncio

# spanish_agent and translation_picker are defined earlier in the example.

async def main():
    msg = input("Enter message for translation: ")
    with trace("Parallel translation"):
        # Run 3 translation agents in parallel (takes ~10s total)
        res_1, res_2, res_3 = await asyncio.gather(
            Runner.run(spanish_agent, msg),
            Runner.run(spanish_agent, msg),
            Runner.run(spanish_agent, msg),
        )
        # Collect outputs and combine
        outputs = [
            ItemHelpers.text_message_outputs(res_1.new_items),
            ItemHelpers.text_message_outputs(res_2.new_items),
            ItemHelpers.text_message_outputs(res_3.new_items),
        ]
        translations = "\n\n".join(outputs)
        # Run selection agent to pick best (adds more latency)
        best_translation = await Runner.run(
            translation_picker,
            f"Input: {msg}\n\nTranslations:\n{translations}",
        )
Problem Statement
The current workflow creates a significant latency issue in streaming scenarios:
1. All three translation agents must complete execution (~10s in parallel).
2. Only after every translation finishes can the selection agent begin processing.
3. The UI shows no output at all until the selection agent starts streaming.
4. This latency compounds when the pattern sits inside a longer agent chain.
This implementation leads to poor user experience due to long waiting periods without feedback, especially in complex agent workflows where subsequent agents depend on this translation output.
What's the recommended approach to optimize this pattern for streaming scenarios while maintaining the quality benefits of parallel execution and selection?
I think this is more a product question than a technical one. The common pattern I've seen is to stream updates to the user while the other agents run: run your parallel agents in a streaming fashion, use the streaming events to deliver progress updates, then show a final response when done.
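A minimal sketch of that progress-update pattern, using plain asyncio only. The `translate` coroutine here is a hypothetical stand-in for a streamed agent run, and the `updates` list stands in for whatever channel pushes events to the UI; the point is that `asyncio.as_completed` lets you surface a message the moment each parallel task finishes, instead of waiting silently for `asyncio.gather` to return before the selection step.

```python
import asyncio

# Hypothetical stand-in for a streamed agent invocation; in the real
# workflow this would be a Runner call. Delays simulate model latency.
async def translate(name: str, delay: float) -> str:
    await asyncio.sleep(delay)
    return f"{name}: translated text"

async def main() -> list[str]:
    updates = []  # progress messages the UI would render immediately
    tasks = [
        asyncio.create_task(translate("agent_1", 0.03)),
        asyncio.create_task(translate("agent_2", 0.01)),
        asyncio.create_task(translate("agent_3", 0.02)),
    ]
    results = []
    # Emit a progress event as each parallel task completes, rather
    # than blocking on asyncio.gather() with no feedback.
    for finished in asyncio.as_completed(tasks):
        results.append(await finished)
        updates.append(f"{len(results)}/3 translations ready")
    # Only now run the selection step on the combined candidates.
    updates.append("picking best translation...")
    return updates

updates = asyncio.run(main())
print(updates)
```

The final response still waits for the selection agent, but the user sees feedback within the latency of the fastest translator instead of the slowest.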