Running the LiteLLM sample from the docs works. However, modifying the sample to use the `run_streamed` API to log responses, tool calls, etc. results in an exception.
Debug information
Agents SDK version: v0.0.12
Python version: 3.13
Repro steps
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.13"
# dependencies = [
#     "openai-agents[litellm]",
# ]
# ///
from __future__ import annotations

import asyncio

from agents import Agent, ItemHelpers, Runner, function_tool, set_tracing_disabled
from agents.extensions.models.litellm_model import LitellmModel

set_tracing_disabled(disabled=True)


@function_tool
def get_weather(city: str):
    print(f"[debug] getting weather for {city}")
    return f"The weather in {city} is sunny."


async def main(model: str, api_key: str):
    agent = Agent(
        name="Assistant",
        instructions="You only respond in haikus.",
        model=LitellmModel(model=model, api_key=api_key),
        tools=[get_weather],
    )

    result = Runner.run_streamed(agent, "What's the weather in Tokyo?")
    async for event in result.stream_events():
        # We'll ignore the raw responses event deltas
        if event.type == "raw_response_event":
            continue
        # When the agent updates, print that
        elif event.type == "agent_updated_stream_event":
            print(f"Agent updated: {event.new_agent.name}")
            continue
        # When items are generated, print them
        elif event.type == "run_item_stream_event":
            if event.item.type == "tool_call_item":
                print(f"-- Tool was called: {event.item.raw_item}")
            elif event.item.type == "tool_call_output_item":
                print(f"-- Tool output: {event.item.output}")
            elif event.item.type == "message_output_item":
                print(f"-- Message output:\n{ItemHelpers.text_message_output(event.item)}")
            else:
                pass  # Ignore other event types

    print(result.final_output)


if __name__ == "__main__":
    # First try to get model/api key from args
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--model", type=str, required=False)
    parser.add_argument("--api-key", type=str, required=False)
    args = parser.parse_args()

    model = args.model
    if not model:
        model = input("Enter a model name for Litellm: ")

    api_key = args.api_key
    if not api_key:
        api_key = input("Enter an API key for Litellm: ")

    asyncio.run(main(model, api_key))
❯ ./demo-streaming.py --api-key=... --model=together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo
Agent updated: Assistant
Traceback (most recent call last):
File "/Users/shoda/.cache/uv/environments-v2/demo-streaming-44c9a8b0431f886f/lib/python3.13/site-packages/pydantic/main.py", line 986, in __getattr__
return pydantic_extra[item]
~~~~~~~~~~~~~~^^^^^^
KeyError: 'usage'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/private/var/folders/ct/x2gct7yn2bxfqs891n8h1dxr0000gn/T/tmp.FRPiXhVbRG/./demo-streaming.py", line 73, in <module>
asyncio.run(main(model, api_key))
~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/Cellar/[email protected]/3.13.3/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/runners.py", line 195, in run
return runner.run(main)
~~~~~~~~~~^^^^^^
File "/usr/local/Cellar/[email protected]/3.13.3/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
File "/usr/local/Cellar/[email protected]/3.13.3/Frameworks/Python.framework/Versions/3.13/lib/python3.13/asyncio/base_events.py", line 719, in run_until_complete
return future.result()
~~~~~~~~~~~~~^^
File "/private/var/folders/ct/x2gct7yn2bxfqs891n8h1dxr0000gn/T/tmp.FRPiXhVbRG/./demo-streaming.py", line 34, in main
async for event in result.stream_events():
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...<16 lines>...
pass # Ignore other event types
^^^^
File "/Users/shoda/.cache/uv/environments-v2/demo-streaming-44c9a8b0431f886f/lib/python3.13/site-packages/agents/result.py", line 191, in stream_events
raise self._stored_exception
File "/Users/shoda/.cache/uv/environments-v2/demo-streaming-44c9a8b0431f886f/lib/python3.13/site-packages/agents/run.py", line 560, in _run_streamed_impl
turn_result = await cls._run_single_turn_streamed(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...<9 lines>...
)
^
File "/Users/shoda/.cache/uv/environments-v2/demo-streaming-44c9a8b0431f886f/lib/python3.13/site-packages/agents/run.py", line 671, in _run_single_turn_streamed
async for event in model.stream_response(
...<28 lines>...
streamed_result._event_queue.put_nowait(RawResponsesStreamEvent(data=event))
File "/Users/shoda/.cache/uv/environments-v2/demo-streaming-44c9a8b0431f886f/lib/python3.13/site-packages/agents/extensions/models/litellm_model.py", line 167, in stream_response
async for chunk in ChatCmplStreamHandler.handle_stream(response, stream):
...<3 lines>...
final_response = chunk.response
File "/Users/shoda/.cache/uv/environments-v2/demo-streaming-44c9a8b0431f886f/lib/python3.13/site-packages/agents/models/chatcmpl_stream_handler.py", line 59, in handle_stream
usage = chunk.usage
^^^^^^^^^^^
File "/Users/shoda/.cache/uv/environments-v2/demo-streaming-44c9a8b0431f886f/lib/python3.13/site-packages/pydantic/main.py", line 988, in __getattr__
raise AttributeError(f'{type(self).__name__!r} object has no attribute {item!r}') from exc
AttributeError: 'ModelResponseStream' object has no attribute 'usage'
The sample given in the docs works:
❯ ./demo.py --api-key=... --model=together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo
[debug] getting weather for Tokyo
Sakura blooms dance
Gentle Tokyo spring breeze
Warmth on skin so sweet
Expected behavior
`run_streamed` should work, or there should be some way of streaming events while using LiteLLM models.
Yes, I noticed that `ChatCmplStreamHandler` tries to read the `usage` attribute from each chunk, but if I'm not mistaken it should be present only on the last one. And even if you correct that, it then fails on `delta.refusal`.
I corrected the code by checking whether those attributes are present before reading them, and then it worked. @rm-openai, let me know if you have time to do it; otherwise I can open a PR with the corrections and write a simple test for the streaming (if this solution works for you).
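For reference, here is a minimal sketch of that kind of guard. This is hypothetical illustration code, not the SDK's actual patch: `read_stream` and the chunk handling are assumptions standing in for the handler's loop over LiteLLM `ModelResponseStream` objects.

```python
from typing import Any, AsyncIterator, Optional

async def read_stream(chunks: AsyncIterator[Any]) -> Optional[Any]:
    # Hypothetical sketch: read each chunk defensively instead of assuming
    # every chunk carries `usage` and every delta carries `refusal`.
    usage = None
    async for chunk in chunks:
        # ModelResponseStream is a pydantic model, so accessing a field it
        # doesn't carry raises AttributeError; getattr with a default avoids that.
        chunk_usage = getattr(chunk, "usage", None)
        if chunk_usage is not None:
            usage = chunk_usage  # typically only the final chunk has usage
        if getattr(chunk, "choices", None):
            delta = chunk.choices[0].delta
            # The same guard applies to `refusal`, which LiteLLM deltas may omit.
            refusal = getattr(delta, "refusal", None)
            if refusal is not None:
                print(f"[debug] refusal: {refusal}")
    return usage
```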
In response to issue #587, I implemented a solution that first checks whether the `refusal` and `usage` attributes exist on the `delta` object.
I added a unit test similar to `test_openai_chatcompletions_stream.py`.
Let me know if I should change anything.
---------
Co-authored-by: Rohan Mehta <[email protected]>
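For illustration, a simple streaming test along those lines might look like the sketch below. It drives the hypothetical `read_stream` guard from the earlier comment with fake chunks that omit `usage` until the final one; `fake_stream` and the `SimpleNamespace` chunk shapes are assumptions, not the PR's actual test.

```python
import asyncio
from types import SimpleNamespace

async def fake_stream():
    # Intermediate chunk: no `usage` attribute, and a delta without `refusal`.
    delta = SimpleNamespace(content="Sakura blooms dance")
    yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])
    # Final chunk: empty choices, but carries the usage payload.
    yield SimpleNamespace(choices=[], usage=SimpleNamespace(total_tokens=7))

def test_read_stream_tolerates_missing_usage():
    # read_stream is the hypothetical guard sketched in the earlier comment.
    usage = asyncio.run(read_stream(fake_stream()))
    assert usage is not None
    assert usage.total_tokens == 7
```

A real test would exercise `ChatCmplStreamHandler.handle_stream` itself, but the shape above captures the failure mode: intermediate chunks simply have no `usage` field to read.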