How to make hand-off decisions more reliable? #541
Comments
@jhills20 could you chime in here?
Hey @cipriangritto-nordra, this all comes down to prompting. In addition to the recommended prefix, try adding more clarity in the triage agent prompt about when the agent should or shouldn't hand off, something like this:
Something else that has worked is adding a line saying the agent must always either ask a question or call a tool. That way the triage agent never hallucinates, claiming it will do something without actually calling a tool. Also try GPT-4.1! It should be much better.
@jhills20 would you consider a configurable 'forced handoff' (similar to forced tool usage) to make handoffs more reliable? Alternatively, do you have any advice on using the Agent.as_tool() method with forced tool usage? I'm hitting the same issue as @cipriangritto-nordra: at first the handoff happens, but by message 3 or 4 the Triage agent has enough context (from the message history) to attempt answering itself in the style of the other agent's outputs. I'd like to force it to always pick another agent no matter how confident it is, to avoid having to maintain prompts, evals, etc. Doing this reliably is key for production-level multi-agent workflows and, IMO, a must-have to compete against Google's Workflow Agents, for example. Thanks!
You can indeed force handoffs:
Thanks! It wasn't clear that handoffs were equivalent to tools.
Question
What is the recommended approach to "convince" the initial Agent / Triage agent to make use of hand-offs more often?
We run into an issue after a certain point in the conversation (5-10 messages exchanged) where the Triage Agent (with both GPT-4o and o3-mini) tries to handle the user's request by itself, often hallucinating the process even though it does not have access to the actual tools needed to complete the task. Our guess is that it observes how the specialized agent answers and tries to replicate that behavior.
We are using the recommended prompt prefix for hand-offs, we provide a hand-off description, and the Triage Agent certainly has access to hand-offs: we can see this both in Traces and in the fact that, for a significant part of the conversation, it triggers the hand-off correctly. The quality of its decision-making seems to degrade over time, or as the Triage Agent gets the chance to see how a specialized agent handles requests.
We preserve context using the newly introduced previous_response_id, but we had noticed this behavior occasionally in the past as well.
Using agents-as-tools is not fully an option for us as we are actively listening for hand-off events in order to trigger certain custom UI behavior.
Our use case demands that the triage agent use no hand-off for casual conversation, but hand off to the relevant specialized agent for any specific request related to the product. We therefore cannot force the use of a hand-off tool all the time, unless we're missing a trick here?
Any recommendations?