
[REQUEST] Enabling support for vision draft models. #384

@MoetasimR


Problem

I am using a quantized version of Pixtral Large, and I can't load the vision modules of a smaller variant to use as a draft model. As a result, I can only run inference with text, not with images.

Solution

Add a config option to enable this feature. I strongly suspect that this is low-hanging fruit.

Alternatives

No response

Explanation

I expect this to be a much-needed feature, since multimodal inference is typically slower than text-only inference.

Examples

No response

Additional context

No response

Acknowledgements

  • I have looked for similar requests before submitting this one.
  • I understand that the developers have lives and my issue will be answered when possible.
  • I understand the developers of this program are human, and I will make my requests politely.
