Add support for running bloom models #452
Comments
Maybe move this to a discussion?
Agreed, and BLOOM has much better support and performance for languages other than English, while LLaMA is mostly English-focused.
+1 to this. I'm already fine-tuning BLOOM for instruction tasks and the results are quite good. bloomz.cpp doesn't seem to have the capability to run inference for instruction tasks (defining a prompt).
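For context, here is a minimal sketch of what "defining a prompt" for an instruction-tuned model can look like. The template below is an assumption for illustration; the thread does not say which format the model was actually fine-tuned on.

```python
# Hypothetical instruction-prompt template; the real format depends on the
# fine-tuning data and is not specified in this thread.
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"

prompt = PROMPT_TEMPLATE.format(
    instruction="Translate the following sentence into French: Hello, world."
)
print(prompt)
```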
FYI, I've created a conversion script that successfully turns BLOOM's tokenizer.json files into tokenizer.model files here: #867. I'm also working on a chunked PyTorch-to-GGML conversion script for bloomz-176b, so I think it's a good idea to add BLOOM support as well. It should allow very large models to be converted with much less memory.
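This is not the actual script from #867, just a sketch of the general idea: rebuilding a SentencePiece tokenizer.model from the vocabulary stored in a Hugging Face tokenizer.json. The score field is a placeholder; a faithful conversion would also need merge ranks and byte-fallback handling.

```python
# Sketch only: rebuild a minimal SentencePiece model from tokenizer.json.
import json
from sentencepiece import sentencepiece_model_pb2 as sp_pb2

with open("tokenizer.json", "r", encoding="utf-8") as f:
    tok = json.load(f)

vocab = tok["model"]["vocab"]  # token -> id mapping for BPE-style tokenizers

proto = sp_pb2.ModelProto()
for token, token_id in sorted(vocab.items(), key=lambda kv: kv[1]):
    piece = proto.pieces.add()
    piece.piece = token
    piece.score = 0.0  # placeholder; real scores need the merge ranks

with open("tokenizer.model", "wb") as f:
    f.write(proto.SerializeToString())
```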
The chunked conversion script is here: 74b92ff. It loads and converts the model layer by layer instead of all at once, which gives it significantly lower memory requirements.
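The chunked approach boils down to converting one checkpoint shard at a time and freeing it before touching the next, so peak memory stays near the size of a single shard. A rough sketch of the idea follows; the shard names and the `write_ggml_tensor` layout are illustrative stand-ins, not the code from 74b92ff.

```python
# Rough sketch of chunked conversion: never hold the full model in memory.
import gc
import struct
import torch

def write_ggml_tensor(out, name, tensor):
    # Illustrative on-disk layout only: name length, rank, shape, name, fp32 data.
    data = tensor.to(torch.float32).numpy()
    name_bytes = name.encode("utf-8")
    out.write(struct.pack("<ii", len(name_bytes), data.ndim))
    out.write(struct.pack(f"<{data.ndim}i", *data.shape))
    out.write(name_bytes)
    data.tofile(out)

# Hypothetical shard list; bloomz-176b ships as many .bin checkpoint shards.
shards = ["pytorch_model_00001-of-00072.bin", "pytorch_model_00002-of-00072.bin"]

with open("ggml-model-f32.bin", "wb") as out:
    for shard in shards:
        part = torch.load(shard, map_location="cpu")  # one shard resident at a time
        for name, tensor in part.items():
            write_ggml_tensor(out, name, tensor)
        del part
        gc.collect()  # release the shard before loading the next one
```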
Hi @akumaburn, @bil-ash, I've followed your tutorial for converting BLOOM to GGML, but I can't use the converted model with llama.cpp. I think the problem comes from the different model architecture/format.
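One plausible (assumed) way to narrow this down is to peek at the file header: llama.cpp of this era validated a 32-bit magic and then LLaMA-specific hyperparameters, so a GGML file produced for BLOOM's architecture would be rejected during loading even when the conversion itself succeeded. A minimal check, with an illustrative path:

```python
# Minimal header peek; the path is illustrative. llama.cpp checked a 32-bit
# magic (e.g. 0x67676d6c, ASCII 'ggml') followed by LLaMA-specific
# hyperparameters, so a BLOOM-architecture file fails later in loading even
# when the magic looks right.
import struct

def read_magic(path: str) -> int:
    with open(path, "rb") as f:
        (magic,) = struct.unpack("<I", f.read(4))
    return magic

print(hex(read_magic("ggml-model-bloom-f16.bin")))
```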
This issue was closed because it has been inactive for 14 days since being marked as stale.
BLOOM models have a more permissive license than LLaMA models and are also multilingual in nature. While there is a project based on llama.cpp that can perform inference on BLOOM models, its development seems slow and might even stagnate after a few more days. So I am requesting support for running BLOOM models with llama.cpp (most probably via a command-line switch).