Added new memory efficient conversion script for hf to ggml format, tested on bloomz 176B + Added token conversion script to convert from tokenizer.json format to tokenizer.model #867
Conversation
…o tokenizer.model format, tested with bigscience models
I've added a helper script as well in 74b92ff. This script is more memory efficient than the existing convert-hf-to-ggml.py; I was able to use it to convert bloomz-176b to float16 GGML format with under 64GB of RAM.
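The memory-efficiency idea above can be sketched as streaming the conversion one tensor at a time, writing each float16 tensor out before loading the next, so peak memory stays near the size of a single tensor rather than the whole model. This is an illustrative sketch only (the function name is hypothetical, and a real GGML file also needs headers and metadata):

```python
import numpy as np

def convert_streaming(tensors, out_path):
    """Write tensors to a flat binary file one at a time, converting
    float32 -> float16 as we go. `tensors` is an iterator of
    (name, array) pairs, so the full model is never held in RAM.
    Sketch only: a real GGML file also carries headers/vocab/shape info.
    """
    with open(out_path, "wb") as f:
        for name, arr in tensors:
            f.write(arr.astype(np.float16).tobytes())
            del arr  # drop the reference before loading the next tensor
```

In the real script the iterator would lazily `torch.load` each checkpoint shard on CPU, which is what keeps a 176B-parameter conversion under 64GB.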
What is the runtime memory usage of the converted bloomz-176b model?
I believe it's around 340GB of RAM. You'd need to run it with bloomz.cpp (this particular fork: NouamaneTazi/bloomz.cpp#21), which doesn't have mmap at the moment. I only have around ~96GB of RAM, so I haven't been able to run the model yet. I've also been working on a quantizer to quantize it to int4 / q4_0, which is going well, but I suspect that may still not be enough.
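For context on the int4 / q4_0 quantization mentioned above, the rough idea is to split each weight tensor into fixed-size blocks, store one float16 scale per block, and keep each weight as a 4-bit integer. The sketch below is a simplified assumption in the spirit of ggml's q4_0 (block size 32, scale = max|x|/7), not the exact on-disk format:

```python
import numpy as np

def quantize_q4_block(x):
    """Quantize a block of 32 float32 values to 4-bit ints plus one
    float16 scale. Simplified sketch of a q4_0-style scheme; the real
    ggml layout packs two 4-bit values per byte and differs in detail."""
    assert x.size == 32
    amax = float(np.abs(x).max())
    d = amax / 7 if amax else 1.0          # per-block scale
    q = np.clip(np.round(x / d), -8, 7).astype(np.int8)
    return np.float16(d), q

def dequantize_q4_block(d, q):
    """Recover approximate float32 values from the quantized block."""
    return q.astype(np.float32) * np.float32(d)
```

At 4 bits plus a shared scale per 32 weights, a 176B-parameter model shrinks to roughly 100GB, which is why quantization is the natural next step after the float16 conversion.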
Are you aware that @comex has already written a new conversion script for converting HF to GGML? It has been approved for merge but hasn't been merged yet. It can be seen here: #545. So you might want to compare your new conversion script to that, rather than the original script currently provided in llama.cpp.
Thanks very much for the tokenizer.json conversion script! I recently hoped to convert my GPTQ 4-bit version of GeorgiaTechResearch/Galpaca 30B (an OPT model) to GGML. My model repo is: huggingface/galpaca-30B-GPTQ-4bit-128g. I couldn't use comex's convert.py due to the lack of a tokenizer.model. I tried your script, and it seemed to produce a tokenizer.model:
I have no idea if it's even possible to convert an OPT model to GGML, but I thought I'd give it a try anyway! Unfortunately, I still can't convert the model; comex's convert.py fails on the new tokenizer.model file:
I tried your conversion script as well, but I can't get it working on any model. Trying it with locally downloaded HF model:
I get the same error if I try it with a remote model:
@TheBloke Thanks for linking me to @comex's script. My script uses the HuggingFace tokenizers library (https://github.com/huggingface/tokenizers) to handle tokenization, and assumes Byte-Pair Encoding by default; see https://huggingface.co/docs/transformers/tokenizer_summary. comex's script appears to use SentencePiece (https://arxiv.org/pdf/1808.06226.pdf), which is a different tokenizer. I created mine with the hope of using BLOOM models (which llama.cpp doesn't currently support; see #452), and it works for that purpose. In reality, the conversion script would probably have to support all common tokenizers in order to work for each model.
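Which tokenizer family a model uses can be checked directly, since a HuggingFace tokenizer.json records its model type ("BPE", "WordPiece", "Unigram", ...) in the "model" section. A minimal sketch (hypothetical helper name, stdlib only):

```python
import json

def tokenizer_model_type(path):
    """Read a HuggingFace tokenizer.json and return the tokenizer model
    type it declares, e.g. "BPE" or "WordPiece". Handy for checking
    compatibility before attempting a conversion."""
    with open(path, "r", encoding="utf-8") as f:
        spec = json.load(f)
    return spec.get("model", {}).get("type", "unknown")
```

A BLOOM-family tokenizer.json would report "BPE" here, while a SentencePiece-based model typically ships a separate tokenizer.model instead of (or alongside) a tokenizer.json.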
…ordPiece tokenizers, updated arguments
I just updated the token conversion script to add support for "SentencePiece" and "WordPiece" tokenizers. Usage has been updated to:
e.g.:
@TheBloke I'm not sure if this will help with your quest of converting your OPT model into a GGML model, but I thought I'd tag you anyway. EDIT: Actually, it looks like SentencePiece specifically isn't supported by HuggingFace's library; I'm taking a look to see...
I've tried this on Bloomz mt0-xl:
Am I doing anything wrong?
@aidaho It looks like that model uses T5Tokenizer (https://huggingface.co/bigscience/mt0-xl/blob/main/tokenizer_config.json), which is not supported by this script; it only supports BPE and WordPiece at the moment.
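Cases like mt0-xl could be caught up front by reading the model's tokenizer_config.json, which names the tokenizer class. A small sketch (the SUPPORTED_CLASSES set below is an illustrative assumption, not the script's actual check):

```python
import json

# Tokenizer classes assumed convertible for this sketch; T5Tokenizer
# (SentencePiece/Unigram based) is deliberately absent.
SUPPORTED_CLASSES = {"GPT2Tokenizer", "BloomTokenizer", "BertTokenizer"}

def is_convertible(tokenizer_config_path):
    """Return (ok, class_name) after reading tokenizer_config.json,
    so unsupported tokenizers fail fast with a clear reason."""
    with open(tokenizer_config_path, "r", encoding="utf-8") as f:
        cfg = json.load(f)
    cls = cfg.get("tokenizer_class", "")
    return cls in SUPPORTED_CLASSES, cls
```

Running this against mt0-xl's config would return (False, "T5Tokenizer") under these assumptions, giving a friendlier error than a mid-conversion failure.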
Converts tokenizer.json to tokenizer.model format, tested with bigscience models (e.g. https://huggingface.co/bigscience/bloomz). Usage like:
python3 tokenconvert.py ./ad033898-d849-41a1-9ecd-ad24e016bc4f/bloomz