Error loading the latest Chinese LLaMA Alpaca model #1464

Closed
done434 opened this issue May 15, 2023 · 7 comments

done434 commented May 15, 2023

Errors when loading the latest Chinese LLaMA/Alpaca Plus-13B model:

./main -m ../ggml-alpaca13b-q5_1.bin -n 256 --repeat_penalty 1.0 --color -i -r "[Steve]:" -f chat-with-vicuna-v1.txt
main: build = 526 (e6a46b0)
main: seed = 1684135558
llama.cpp: loading model from ../ggml-alpaca13b-q5_1.bin
error loading model: unknown (magic, version) combination: 67676a74, 0000000; is this really a GGML file?
llama_init_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model '../ggml-alpaca13b-q5_1.bin'
main: error: unable to load model

Here are the steps to combine the Chinese Alpaca model with the original LLaMA model (a rough command sketch follows the list):

  1. Use merge_llama_with_chinese_lora.py from the Chinese-LLaMA-Alpaca project to merge Chinese-LLaMA-Plus-13B and chinese-alpaca-plus-lora-13b with the original LLaMA model; the output is in pth format.
  2. Use this project's convert.py on models/13B/ to convert the combined model to ggml format.
  3. Use this project's quantize to quantize the model with the q5_1 method.
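
For reference, those three steps as shell commands look roughly like this. The paths and directory names are just examples, and the exact flags of merge_llama_with_chinese_lora.py may differ between versions of the Chinese-LLaMA-Alpaca project, so treat this as a sketch rather than the exact invocation:

# 1. Merge the Chinese LLaMA/Alpaca LoRA weights into the original LLaMA weights (pth output)
python merge_llama_with_chinese_lora.py \
  --base_model path/to/original-llama-13b-hf \
  --lora_model path/to/chinese-llama-plus-lora-13b,path/to/chinese-alpaca-plus-lora-13b \
  --output_type pth --output_dir models/13B

# 2. Convert the merged pth model to ggml format with this project's convert.py
python convert.py models/13B/

# 3. Quantize with the q5_1 method (the name of the f16 file produced by convert.py may differ)
./quantize models/13B/ggml-model-f16.bin models/13B/ggml-model-q5_1.bin q5_1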

No errors in the above steps.

But when loading the model with the main program, I get the errors shown above.
It seems that the (magic, version) combination 67676a74, 0000000 is not supported when loading the model.
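
As a side note, the magic 67676a74 is just the ASCII string "ggjt", so the file header can be inspected directly. Assuming xxd is available, for example:

# print the first 8 bytes: the 4-byte magic followed by a 4-byte version field
xxd -l 8 ../ggml-alpaca13b-q5_1.bin
# the bytes 67 67 6a 74 show up as "ggjt" in the ASCII column on the right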

Any solution or suggestion about this? Thanks!

jamesljl commented May 15, 2023

I've encountered the same problem. Same steps as above: no errors while merging and quantizing, but it just hangs when loading the model, before the ">" prompt appears. The model itself seems to load without error.

@jamesljl

Download the previous version: https://github.com/ggerganov/llama.cpp/releases

The previous version doesn't work either; it can't even load the 8-bit quantized model.

done434 commented May 16, 2023

I found the cause of the problem. The original document suggests converting the model with a command like this:
python convert.py zh-models/7B/

I read convert.py carefully and found it has a --vocab-dir parameter:
"--vocab-dir", type=Path, help="directory containing tokenizer.model, if separate from model file"

The document asks you to put the tokenizer.model in the upper-level directory. I guess convert.py can't pick up that tokenizer.model file, and in fact the tokenizer.model shipped with the Chinese Alpaca model is different from the original LLaMA one.

So in the end I added the --vocab-dir parameter to point at the directory containing the Chinese Alpaca's tokenizer.model.
Then everything works.
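
In other words, something like this (the paths are only examples; point --vocab-dir at wherever the Chinese Alpaca's tokenizer.model actually lives):

# use the Chinese Alpaca tokenizer.model instead of the one convert.py would otherwise pick up
python convert.py zh-models/13B/ --vocab-dir path/to/chinese-alpaca-tokenizer-dir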

jamesljl commented May 16, 2023

I just reinstalled the Ubuntu VM, cloned the latest version, and recompiled it. Then it works. That's weird.
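
For anyone who wants to try the same thing, the rebuild is roughly the standard clone and make (the model path below is just the one from the original report):

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
# then run main against the quantized model as before
./main -m ../ggml-alpaca13b-q5_1.bin -n 256 --repeat_penalty 1.0 --color -i -r "[Steve]:" -f chat-with-vicuna-v1.txt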

MrLiu199 commented May 21, 2023

I pulled the latest release version, re-ran make and ./main, and it worked. The reason in my case was that I had converted and quantized with the newer version but was trying to run with an older version.
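
In other words, make sure main, convert.py and quantize all come from the same checkout. Roughly what I mean (a plain pull and clean rebuild; adjust the model path to your own):

git pull
make clean
make
# re-run with the model that was converted/quantized by this same version
./main -m models/13B/ggml-model-q5_1.bin -n 256 -i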

github-actions bot added the stale label Mar 25, 2024

github-actions bot commented Apr 9, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.
