context.usage returns 0 in streaming mode #594

Closed
brodguez opened this issue Apr 24, 2025 · 3 comments
Labels
bug Something isn't working

Comments

@brodguez

Bug description

When using openai-agents-python with Runner.run_streamed, the context.usage values in the hooks (input_tokens, output_tokens, total_tokens, requests) are always 0.

This issue does not happen when using Runner.run (non-streaming mode), where the usage values are correctly populated.

Debug information

  • Agents SDK version: v0.0.12
  • Platform: macOS

Repro steps

from typing import Any

from agents import (
    Agent,
    ModelSettings,
    RunConfig,
    RunContextWrapper,
    RunHooks,
    Runner,
    Usage,
)

# Tested with multiple models (gpt-4o, o4-mini, o3-mini)
run_config = RunConfig(
    model="o4-mini",
    model_settings=ModelSettings(include_usage=True),
)

class AIAgentsHooks(RunHooks):
    def __init__(self):
        self.event_counter = 0

    def _usage_to_str(self, usage: Usage) -> str:
        return (
            f"{usage.requests} requests, {usage.input_tokens} input tokens, "
            f"{usage.output_tokens} output tokens, {usage.total_tokens} total tokens"
        )

    async def on_agent_start(self, context: RunContextWrapper, agent: Agent) -> None:
        print(f"Start: {self._usage_to_str(context.usage)}")

    async def on_agent_end(self, context: RunContextWrapper, agent: Agent, output: Any) -> None:
        print(f"End: {self._usage_to_str(context.usage)}")

hooks = AIAgentsHooks()

# Run in streaming mode. This loop lives inside an async generator in my app;
# StreamingChunk and ChunkData are application-specific types.
async def stream_run(agent: Agent, input: str):
    result = Runner.run_streamed(
        starting_agent=agent,
        input=input,
        run_config=run_config,
        hooks=hooks,
    )

    async for event in result.stream_events():
        if event.type == "raw_response_event" and hasattr(event.data, "delta"):
            delta = event.data.delta
            if delta:
                yield StreamingChunk(
                    data=ChunkData(delta=delta, finish_reason=None)
                )

Output (streamed):

The output is always: 0 requests, 0 input tokens, 0 output tokens, 0 total tokens

If the same config is run with Runner.run, usage works correctly:

result = await Runner.run(
    starting_agent=agent,
    input=input,
    run_config=run_config,
    hooks=hooks
)

Output (non-streamed):
requests, 123 input tokens, 56 output tokens, 179 total tokens

Expected behavior

In streaming mode, the context.usage values should reflect actual usage data just like in non-streaming mode.
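In the meantime, the usage numbers do seem to arrive over the wire even when streaming, so a possible workaround is to read them off the final raw response event. Just a sketch, assuming the raw events are the underlying Responses API stream events, where the terminal event has type "response.completed" and carries response.usage:

# Workaround sketch (assumes Responses API stream event shapes)
async for event in result.stream_events():
    if event.type == "raw_response_event" and event.data.type == "response.completed":
        u = event.data.response.usage
        if u:
            print(f"{u.input_tokens} input, {u.output_tokens} output, {u.total_tokens} total tokens")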

brodguez added the bug label on Apr 24, 2025
@rm-openai (Collaborator)

@brodguez PR #595 fixes this. Note that streaming usage is only available in the very last chunk of the LLM response, so in your example it would be present in on_agent_end but not necessarily before that.
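Concretely, once the stream has been fully consumed the totals are populated. A rough sketch (assuming RunResultStreaming exposes context_wrapper the same way the non-streaming RunResult does):

result = Runner.run_streamed(
    starting_agent=agent,
    input=input,
    run_config=run_config,
    hooks=hooks,
)

async for event in result.stream_events():
    pass  # consume the stream; usage arrives with the final chunk

usage = result.context_wrapper.usage  # populated after the stream ends
print(f"{usage.requests} requests, {usage.total_tokens} total tokens")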

@brodguez (Author)

Thanks! 🙌

I just need it in on_agent_end so that works perfectly for my use case.

Appreciate the support!

@rm-openai (Collaborator)

This will be available in the next version, 0.0.14.
