feat: Updated token calculation for aws-bedrock LLM Events
#3445
Conversation
Codecov Report ❌ Patch coverage is

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #3445      +/-   ##
==========================================
- Coverage   97.75%   89.54%   -8.22%
==========================================
  Files         404      413       +9
  Lines       54575    55474     +899
  Branches        1        1
==========================================
- Hits        53348    49672    -3676
- Misses       1227     5802    +4575
```
Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
Co-authored-by: Maurice Rickard <[email protected]>
Description
This PR updates how we calculate `token_count` and collect token usage attributes for AWS Bedrock.
See spec details here.
Token Calculation
Customers can choose to provide an `llm_token_count_callback`, which agents MUST invoke to obtain token counts. If no callback is registered, then agents MUST pull token count data from the usage metadata objects returned in the LLM response.

`token_count` should be set on the `LlmChatCompletionMessage` event. `token_count` is set to zero and attached to the message event as an attribute IF the conditions listed in the spec are met (in this order); otherwise the attribute is omitted completely. This signals to the pipeline tokenizer whether or not it should run (this is done for backwards compatibility).
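As a concrete illustration of the callback path, the Node agent exposes a `setLlmTokenCountCallback` API for registering the `llm_token_count_callback`. The sketch below is illustrative only; the 4-characters-per-token heuristic is a stand-in for a real tokenizer.

```js
// Register a token-count callback with the agent. The callback receives the
// model name and the message content and must return an integer token count.
const newrelic = require('newrelic')

newrelic.setLlmTokenCountCallback(function estimateTokens(model, content) {
  if (!content) {
    return 0
  }

  // Placeholder heuristic: roughly 4 characters per token. Swap in a real
  // tokenizer for the model in question.
  return Math.ceil(content.length / 4)
})
```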
See the "`response.usage.*` attributes" section below for details on the token usage data we look for. The `token_count` attribute is still added even if `ai_monitoring.record_content.enabled` is false.

`response.usage.*` attributes:

The relevant token usage attributes MUST be attached to the `LlmChatCompletionSummary` and `LlmEmbedding` events. These will be set by obtaining the token values from the tokenCallback, from the token usage metadata in the response object, or from the response headers (in this order).
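A minimal sketch of that precedence, using a hypothetical `resolvePromptTokens` helper (the property paths and the Bedrock header name are illustrative, not taken from this PR's code):

```js
// Resolve the prompt token count in the order described above:
// customer callback, then usage metadata in the response body, then headers.
function resolvePromptTokens({ tokenCallback, model, promptContent, response }) {
  if (typeof tokenCallback === 'function') {
    return tokenCallback(model, promptContent)
  }

  const usage = response?.body?.usage
  if (usage?.input_tokens != null) {
    return usage.input_tokens
  }

  // Fall back to a response header, e.g. x-amzn-bedrock-input-token-count.
  const headerValue = response?.headers?.['x-amzn-bedrock-input-token-count']
  return headerValue != null ? Number(headerValue) : undefined
}
```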
The `LlmChatCompletionSummary` event should add `response.usage.prompt_tokens`, `response.usage.completion_tokens` and `response.usage.total_tokens`:
- `response.usage.prompt_tokens` on the summary event.
- `response.usage.completion_tokens` on the summary event.
- `response.usage.total_tokens` on the summary event.
The `LlmEmbedding` event should only add `response.usage.total_tokens`.
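For illustration, the resulting usage attributes might look like this (the attribute names follow the lists above; the values are made up):

```js
// Hypothetical usage attributes on the two event types.
const chatCompletionSummaryUsage = {
  'response.usage.prompt_tokens': 52,
  'response.usage.completion_tokens': 210,
  'response.usage.total_tokens': 262
}

const embeddingUsage = {
  'response.usage.total_tokens': 52
}
```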
Additional changes

I updated the stream handlers for Claude 3 and Llama in order to get the token usage (input and output) from the stream and add it to the `currentBody` before returning the response. The other handlers, like Titan, already have the token usage in the parsed event, so no additional logic had to be added there.
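A sketch of what the Claude 3 stream handler change amounts to: accumulate the token counts as chunks arrive and merge them into the body that is returned once the stream ends. The event and field names follow Anthropic's streaming format on Bedrock; the handler shape itself is illustrative, not the PR's actual code.

```js
// Fold streamed usage metadata into the body that will be returned.
// message_start carries input_tokens; message_delta carries output_tokens.
function accumulateClaude3Usage(currentBody, parsedEvent) {
  currentBody.usage = currentBody.usage || { input_tokens: 0, output_tokens: 0 }

  if (parsedEvent.type === 'message_start') {
    currentBody.usage.input_tokens = parsedEvent.message?.usage?.input_tokens ?? 0
  } else if (parsedEvent.type === 'message_delta') {
    currentBody.usage.output_tokens = parsedEvent.usage?.output_tokens ?? 0
  }

  return currentBody
}
```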
How to Test
Run `npm run test`.

Related Issues
Partially fixes issue #3042