clip : improve projector naming #13118

Merged (4 commits) on Apr 26, 2025

@ngxson (Collaborator) commented Apr 25, 2025

I don't quite like abstract names such as "resampler" and "merger". They would be useful if one projector could be reused by various vision models, but unfortunately that has hardly ever been the case.

The cumbersome bool has_*_projector pattern is also removed. The only variable kept is has_llava_projector, because MLP, MLP_NORM, LDP, and LDPV2 are all considered variants of the llava projector.
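
A minimal sketch of the idea above, assuming the single remaining flag can be derived from the projector type. The enum values reuse names mentioned in this PR; `sketch_ctx`, `is_llava_projector`, and `make_ctx` are hypothetical names for illustration and are not taken from clip.cpp.

```cpp
// Illustrative sketch only: the enum values mirror names mentioned in this PR,
// but the struct and helpers below are hypothetical stand-ins, not clip.cpp code.
enum projector_type {
    PROJECTOR_TYPE_MLP,
    PROJECTOR_TYPE_MLP_NORM,
    PROJECTOR_TYPE_LDP,
    PROJECTOR_TYPE_LDPV2,
    PROJECTOR_TYPE_GLM_EDGE,
    PROJECTOR_TYPE_QWEN2VL,
};

// Hypothetical helper: true for the four types treated as llava projector variants.
constexpr bool is_llava_projector(projector_type t) {
    return t == PROJECTOR_TYPE_MLP
        || t == PROJECTOR_TYPE_MLP_NORM
        || t == PROJECTOR_TYPE_LDP
        || t == PROJECTOR_TYPE_LDPV2;
}

// Hypothetical context: one enum field plus the single flag that is kept,
// instead of one bool has_*_projector per model family.
struct sketch_ctx {
    projector_type proj_type;
    bool           has_llava_projector;
};

// Hypothetical factory: derive the flag from the projector type in one place.
inline sketch_ctx make_ctx(projector_type t) {
    return sketch_ctx{ t, is_llava_projector(t) };
}
```

With this shape, every other model family is identified by comparing `proj_type` directly, so no additional booleans are needed.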

Test result:

OK:   llama-mtmd-cli ggml-org/SmolVLM-500M-Instruct-GGUF:Q8_0
OK:   llama-mtmd-cli ggml-org/SmolVLM2-2.2B-Instruct-GGUF:Q4_K_M
OK:   llama-mtmd-cli ggml-org/SmolVLM2-500M-Video-Instruct-GGUF:Q8_0
OK:   llama-mtmd-cli ggml-org/gemma-3-4b-it-GGUF:Q4_K_M
OK:   llama-mtmd-cli guinmoon/MobileVLM-3B-GGUF:Q4_K_M
OK:   llama-mtmd-cli THUDM/glm-edge-v-5b-gguf:Q4_K_M
OK:   llama-mtmd-cli second-state/Llava-v1.5-7B-GGUF:Q2_K
OK:   llama-mtmd-cli cjpais/llava-1.6-mistral-7b-gguf:Q3_K
OK:   llama-mtmd-cli ibm-research/granite-vision-3.2-2b-GGUF:Q4_K_M
OK:   llama-mtmd-cli second-state/MiniCPM-Llama3-V-2_5-GGUF:Q2_K
OK:   llama-mtmd-cli openbmb/MiniCPM-V-2_6-gguf:Q2_K
OK:   llama-mtmd-cli openbmb/MiniCPM-o-2_6-gguf:Q4_0
OK:   llama-qwen2vl-cli bartowski/Qwen2-VL-2B-Instruct-GGUF:Q4_K_M
OK:   llama-mtmd-cli ggml-org/pixtral-12b-GGUF:Q4_K_M

@ngxson requested a review from ggerganov on April 25, 2025 22:18
  PROJECTOR_TYPE_GLM_EDGE,
- PROJECTOR_TYPE_MERGER,
+ PROJECTOR_TYPE_QWEN2VL,
@ngxson (Collaborator, Author) commented Apr 25, 2025

cc @HimariO, PROJECTOR_TYPE_MERGER is renamed to PROJECTOR_TYPE_QWEN2VL

For qwen2.5, we can add PROJECTOR_TYPE_QWEN25VL. For code paths shared by both qwenvl versions, we will need to check ctx->proj_type == PROJECTOR_TYPE_QWEN2VL || ctx->proj_type == PROJECTOR_TYPE_QWEN25VL

But tbh the best way is to have a dedicated builder function for qwenvl; it makes the code much easier to read. I'll make a proposal in the next few days.
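
To make the two options in this comment concrete, here is a rough sketch, not the actual clip.cpp code. `PROJECTOR_TYPE_QWEN25VL` is only proposed above and does not exist in this PR; `sketch_ctx`, `is_qwenvl`, `pick_builder`, and the builder names it returns are hypothetical.

```cpp
#include <string>

// Illustrative sketch only; a reduced enum with names taken from this thread.
enum projector_type {
    PROJECTOR_TYPE_GLM_EDGE,
    PROJECTOR_TYPE_QWEN2VL,
    PROJECTOR_TYPE_QWEN25VL, // proposed in the comment above, not part of this PR
};

// Hypothetical stand-in for the context object that carries proj_type.
struct sketch_ctx {
    projector_type proj_type;
};

// Option 1: wrap the repeated two-way comparison in one helper so shared
// qwenvl code paths do not spell out both enum values every time.
inline bool is_qwenvl(const sketch_ctx * ctx) {
    return ctx->proj_type == PROJECTOR_TYPE_QWEN2VL
        || ctx->proj_type == PROJECTOR_TYPE_QWEN25VL;
}

// Option 2: dispatch once to a dedicated builder. Here the function only
// returns the (hypothetical) builder name to keep the sketch self-contained.
inline std::string pick_builder(const sketch_ctx * ctx) {
    switch (ctx->proj_type) {
        case PROJECTOR_TYPE_QWEN2VL:
        case PROJECTOR_TYPE_QWEN25VL:
            return "build_qwenvl";
        default:
            return "build_generic";
    }
}
```

The second option keeps the per-model branching at a single dispatch point, which is the readability gain the comment is after.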

@ngxson merged commit 4753791 into ggml-org:master on Apr 26, 2025
48 checks passed
pockers21 pushed a commit to pockers21/llama.cpp that referenced this pull request Apr 28, 2025
* clip : improve projector naming

* no more kv has_llava_projector

* rm unused kv

* rm more unused