
fix: guard BEGIN_OF_TOOL_RESPONSE for Gemma2 and fix f-string in _normalize_token #590

Open
ashiksharonm wants to merge 3 commits into google-deepmind:main from
ashiksharonm:fix/gemma2-sampler-end-tokens-and-normalize-token-fstring-568-579

Conversation

@ashiksharonm

Fixes #568: Sampler.sample() unconditionally accessed tokenizer.special_tokens.BEGIN_OF_TOOL_RESPONSE when building end_tokens for SamplerLoop. The _Gemma2SpecialTokens enum does not define this attribute (introduced in Gemma3 for tool/function calling), causing:

AttributeError: type object '_Gemma2SpecialTokens' has no attribute
'BEGIN_OF_TOOL_RESPONSE'

Fix: guard the token access behind hasattr() so it is only included when the tokenizer actually defines the attribute (i.e. Gemma3+).

Also fixes #579: the ValueError raised by _normalize_token() used a plain string literal instead of an f-string, so the error message showed the literal text '{token!r}' rather than the actual token value. Added the missing f prefix.
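The one-character fix, illustrated with a hypothetical message (the exact wording of the real `ValueError` is not quoted here):

```python
token = "<unk>"

# Before: plain string literal, so the braces are never interpolated and
# the error text contains the literal characters {token!r}.
broken = ValueError("invalid token: {token!r}")

# After: the f prefix interpolates the actual token value.
fixed = ValueError(f"invalid token: {token!r}")

print(str(broken))  # invalid token: {token!r}
print(str(fixed))   # invalid token: '<unk>'
```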

Added two regression tests in _sampler_test.py:

  • test_normalize_token_error_message_contains_token_value
  • test_sampler_gemma2_tokenizer_no_begin_of_tool_response

@google-cla

google-cla bot commented Feb 26, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

