I would like to have an option to run gpt-oss-20b on a 16GB Mac
As per @ggerganov's tweet, it's now possible, and it worked fine for me when I tested it.
To run gpt-oss-20b on a 16GB Mac, use these commands:

```shell
brew install llama.cpp
llama-server -hf ggml-org/gpt-oss-20b-GGUF --n-cpu-moe 12 -fa -c 32768 --jinja --no-mmap
```
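Once the server is running, you can sanity-check it with a request to llama-server's OpenAI-compatible chat endpoint. A minimal sketch, assuming the default host and port (`127.0.0.1:8080`) and that the server has finished loading the model:

```shell
# Query the local llama-server via its OpenAI-compatible API
# (assumes default listen address 127.0.0.1:8080)
curl -s http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [{"role": "user", "content": "Say hello in one word."}],
        "max_tokens": 32
      }'
```

The response is a JSON object in the OpenAI chat-completions format, with the generated text under `choices[0].message.content`.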