Add oneliner for batch quantization #17

Closed
Contributor

@jooray jooray commented Mar 11, 2023

No description provided.

which can be scripted like this if you are lazy (for 65B model):

```bash
for i in models/65B/ggml-model-f16.bin*;do quantized=`echo "$i" | sed -e 's/f16/q4_0/'`; ./quantize "$i" "$quantized" 2 ;done
```
Collaborator

@prusnak prusnak Mar 11, 2023

Sed is not necessary: bash, zsh, and other modern shells can perform pattern replacement on a variable:

Suggested change
- for i in models/65B/ggml-model-f16.bin*;do quantized=`echo "$i" | sed -e 's/f16/q4_0/'`; ./quantize "$i" "$quantized" 2 ;done
+ for i in models/65B/ggml-model-f16.bin* ; do ./quantize "$i" "${i/f16/q4_0}" 2 ;done
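For readers unfamiliar with the expansion used in the suggestion, here is a minimal sketch comparing both forms on a sample path. The path is just a string chosen for illustration; nothing is read from disk and `./quantize` is not invoked. Note that `${var/pattern/replacement}` is a bash/zsh feature and is not available in a strictly POSIX `sh`.

```bash
#!/usr/bin/env bash
# Sample path only (illustrative); no file needs to exist.
i="models/65B/ggml-model-f16.bin.1"

# Original approach: pipe the name through sed to rewrite it.
via_sed=$(echo "$i" | sed -e 's/f16/q4_0/')

# Suggested approach: the shell replaces the first match of "f16" itself.
via_expansion="${i/f16/q4_0}"

echo "$via_sed"
echo "$via_expansion"
```

Both lines print `models/65B/ggml-model-q4_0.bin.1`; the expansion simply avoids spawning an extra process per file.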

@s-and-witch s-and-witch Mar 12, 2023

This will generate paths such as 'models/65B/ggml-model-q4_0/.bin.2' and will fail with errors. The correct command (in bash) is `for i in models/65B/ggml-model-f16.bin* ; do ./quantize "$i" "${i/f16/q4_0}" 2 ;done`
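To see where a path like the one quoted above could come from, here is a minimal sketch. It assumes the earlier version of the suggestion ended the expansion with a trailing slash, `${i/f16/q4_0/}`; that earlier text is not preserved in this thread, so treat this as a guess that happens to be consistent with the reported path.

```bash
#!/usr/bin/env bash
# Sample path only (illustrative); no file needs to exist.
i="models/65B/ggml-model-f16.bin.2"

# Assumed earlier form: everything after the second "/" is the replacement,
# so the replacement string here is "q4_0/" (with the slash).
with_slash="${i/f16/q4_0/}"

# Corrected form: the replacement string is just "q4_0".
without_slash="${i/f16/q4_0}"

echo "$with_slash"
echo "$without_slash"
```

The first line prints `models/65B/ggml-model-q4_0/.bin.2`, which points into a directory that does not exist; the second prints `models/65B/ggml-model-q4_0.bin.2`.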

Collaborator

@prusnak prusnak Mar 12, 2023

@Player-205 right, updated the suggestion above, thanks

@ggerganov
Member

Let's put this in a quantize.sh script that accepts an argument like 7B, 13B, etc., and update the instructions to just run the script:

source quantize.sh 7B

Should be much easier to follow

@leszekhanusz

Note that if disk space is limited, it is still useful to quantize each file separately so that each intermediate file can be deleted in between.
In my case I added an rm command because I would not have had enough disk space otherwise:

for i in models/65B/ggml-model-f16.bin* ; do ./quantize "$i" "${i/f16/q4_0}" 2 ; rm "$i"; done
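One caveat with the loop above: `rm "$i"` runs even when the quantize step fails, which would delete the only copy of that f16 part. A hedged variant chains the two with `&&` so the file is removed only on success. The function name `quantize_and_remove` is made up for this sketch, and it assumes `./quantize` exits with a non-zero status on failure.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository).
# Usage: quantize_and_remove ./quantize models/65B/ggml-model-f16.bin*
quantize_and_remove() {
    local quantize_bin="$1"
    shift
    local f
    for f in "$@"; do
        # rm runs only if the quantize command returned exit status 0.
        "$quantize_bin" "$f" "${f/f16/q4_0}" 2 && rm -- "$f"
    done
}
```

Passing the binary as the first argument is only there to make the sketch easy to try with a stand-in command before pointing it at real model files.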

@ggerganov
Member

Good point. The script should have a second parameter for "keep f16", which is on by default.
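A rough sketch of what such a script could look like, combining the model-size argument with a "keep f16" parameter that defaults to on. This is not the script that superseded this PR; the function name `quantize_model` and the `QUANTIZE_BIN` and `MODELS_DIR` override variables are assumptions made for illustration only.

```bash
#!/usr/bin/env bash
# Hypothetical quantize.sh sketch (names and overrides are assumptions).
#   $1: model size directory under models/, e.g. 7B, 13B, 65B
#   $2: keep f16 files? 1 (default) keeps them, 0 deletes each after success
quantize_model() {
    local model="$1"
    local keep_f16="${2:-1}"
    local quantize_bin="${QUANTIZE_BIN:-./quantize}"  # assumed override hook
    local models_dir="${MODELS_DIR:-models}"          # assumed override hook
    local f dir name

    if [ -z "$model" ]; then
        echo "usage: quantize_model <model-size> [keep-f16: 1|0]" >&2
        return 1
    fi

    for f in "$models_dir/$model"/ggml-model-f16.bin*; do
        # An unmatched glob stays literal, so check the file really exists.
        [ -e "$f" ] || { echo "no f16 files found for $model" >&2; return 1; }
        dir="${f%/*}"
        name="${f##*/}"
        # Rewrite only the file name, in case the directory contains "f16".
        "$quantize_bin" "$f" "$dir/${name/f16/q4_0}" 2 || return 1
        if [ "$keep_f16" = "0" ]; then
            rm -- "$f"
        fi
    done
}
```

After sourcing the file, `quantize_model 7B` would keep the f16 files and `quantize_model 65B 0` would delete each part once it has been quantized. A standalone script could end with `quantize_model "$@"` instead.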

@prusnak
Collaborator

prusnak commented Mar 13, 2023

Superseded by #92

@ggerganov ggerganov closed this Mar 13, 2023
SlyEcho pushed a commit to SlyEcho/llama.cpp that referenced this pull request Jun 11, 2023