`tool-call`: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars #12034
Merged
48 commits
b37779b sampler: turn lazy grammar trigger words to regexes
a456911 add scripts/tool_bench.sh & .py
14a4388 optionally allow any spaces in json schema grammars (useful for llama…
e2ca8be constrain llama json output regardless of function name if matches at…
53266f9 better error when wrong function called
7833c16 improve error message in weather test
0e1a00e add more models to tool_bench.sh
44740f7 benchmark other sizes of qwen 2.5 coder
dd6eb97 rm duplicate in tool_bench.sh
0fc6218 add missing <variant> include
6fd4972 fix lints
2e656f9 improve "bad" qwen triggers
fbd3c19 add cast to please some gccs
62a1416 ditch server test request retry logic
596ff7f fix flake8 lints
fe6968f nits
1caacd5 remove any_spaces grammar option, allow extra line for airy llama jso…
789a3e1 Update test_tool_call.py
6493a14 test w/ beefier qwen 2.5 coder 3b
cc817a0 revert some test_hello_world diffs
ead02c6 diff
d7acf2c Update test_tool_call.py
0db4073 add requirements for tool_bench
0ce606b fix test_thoughts deepseek test expectation
a3cde16 Update README.md
79ad623 update relaxed newline space rule in grammar tests
3fe208a support add_generation_prompt query parameter (useful for /apply_temp…
fe8c79b Merge remote-tracking branch 'origin/master' into tool-bench-prod
99d2d80 token cast tweak for gcc
c7fa19a fix warning on gcc13 w/ uninitialized variant
6e5a830 fix python lints
0b5d105 fix gcc13 warning
7bcc5af fix pyright lints in tool_bench.py
d1f48d0 Merge remote-tracking branch 'origin/master' into tool-bench-prod
fc19192 update readme w/ link to tool call
60f28ef tool-bench: add --ctk, --ctv, --fa flags
2470a1c Merge remote-tracking branch 'origin/master' into tool-bench-prod
e6e9c13 common_grammar_trigger: always use string value (+ optional token)
5d43b72 add llama_grammar_trigger_pattern
1317a35 add common_grammar_trigger.{to_json,from_json}
ad3caa3 fix crashing typo
a6d7887 avoid returning optional from parse_json
20a2f5f disable slow hello Llama-3.1-8B (chopped unescaped string witin strin…
92e9723 fix nit eol at eof
01be080 Update src/llama-grammar.cpp
ochafik 00db465 Merge remote-tracking branch 'origin/master' into tool-bench-prod
24010fe avoid ggml_assert in server for grammar triggers inconsistency
71719a6 add comment on limits to common_grammar_trigger.to/from json speciali…
Hmm ok I didn't notice that we cannot include `json` in this file. Then maybe change it to:
Then use with `to<json>`?

While it does read better in the call site, I think naming it just `to` makes the interface harder to understand to readers, esp. given the unconventional template use (if anything, would name it `serialize`/`deserialize`, or provide `operator<<`/`operator>>`). Happy to revisit in a follow up / to batch update all the `to_json*` to something :-)
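
For readers following the naming discussion, here is a minimal sketch of the two interface shapes being compared. Everything in it is hypothetical: `trigger_sketch` is a stand-in and not the PR's actual type, and `std::string` plays the role of the JSON type that the real header cannot include. It only illustrates how a bare `to<T>()` template member reads at the call site versus an explicitly named `serialize()`.

```cpp
#include <string>

// Hypothetical stand-in for a trigger-like struct; not the PR's real definition.
struct trigger_sketch {
    std::string value;

    // Option A: generic template member, declared here without naming the
    // output type's header and specialized elsewhere.
    // Call site reads as: t.to<std::string>()
    template <class T> T to() const;

    // Option B: explicitly named member; the intent is visible in the declaration.
    // Note: no escaping is done here, this is only an illustration.
    std::string serialize() const {
        return "{\"value\":\"" + value + "\"}";
    }
};

// Explicit specialization of Option A for the stand-in output type.
template <> inline std::string trigger_sketch::to<std::string>() const {
    return serialize();
}
```

With Option A a reader has to find the specialization to learn which types `T` are supported, which is the readability concern raised above; Option B states it in the signature.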