Conversation

@ericcurtin
Collaborator

So we can load these natively just like gguf

@ericcurtin ericcurtin marked this pull request as draft November 28, 2025 20:32
@ericcurtin
Collaborator Author

ericcurtin commented Nov 28, 2025

safetensors files tend to be downloaded/pulled more often than gguf. This change introduces support for both formats.

It will also let us run performance comparisons using the exact same model files in vLLM and llama.cpp.

So we can load these natively just like gguf

Signed-off-by: Eric Curtin <[email protected]>
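For context on what "loading natively" involves: the safetensors container format starts with an 8-byte little-endian length field followed by a JSON header that describes every tensor's dtype, shape, and byte offsets, so a loader can locate tensor data without reading the whole file. A minimal illustrative sketch in Python (not the implementation in this PR; the `read_safetensors_header` helper and in-memory example are hypothetical):

```python
import json
import struct

def read_safetensors_header(blob: bytes) -> dict:
    # A .safetensors file begins with an 8-byte little-endian u64
    # giving the byte length of the JSON header that follows.
    (header_len,) = struct.unpack("<Q", blob[:8])
    return json.loads(blob[8 : 8 + header_len].decode("utf-8"))

# Build a minimal file in memory: one fp32 tensor "w" of shape [2],
# occupying bytes 0..8 of the data section after the header.
header = {"w": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
header_bytes = json.dumps(header).encode("utf-8")
blob = (
    struct.pack("<Q", len(header_bytes))   # header length prefix
    + header_bytes                         # JSON header
    + struct.pack("<2f", 1.0, 2.0)         # raw tensor data
)

parsed = read_safetensors_header(blob)
print(parsed["w"]["shape"])  # [2]
```

A real loader (as in llama.cpp's gguf path) would additionally mmap the data section and slice each tensor out by its `data_offsets`.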