Open
Description
Problem
I am using a quantized version of Pixtral Large, and I can't load the vision modules of a smaller variant alongside it. As a result, I cannot perform inference with images; I can only perform inference with text.
I imagine this will be a much-needed feature, as multimodal inference is generally less performant than text-only inference.
Solution
Create a config option for enabling this feature. I have a very strong feeling that this is low-hanging fruit.
Alternatives
No response
Explanation
I imagine this will be a much-needed feature, as multimodal inference is generally less performant than text-only inference.
Examples
No response
Additional context
No response
Acknowledgements
- I have looked for similar requests before submitting this one.
- I understand that the developers have lives and my issue will be answered when possible.
- I understand the developers of this program are human, and I will make my requests politely.