Eval bug: Llama 4 Scout vision broken for image > 336px #21871

### Name and Version

Last working version b8541 (ded446b34)

### Operating systems

Linux

### GGML backends

CUDA

### Hardware

2x L40S, also tested on 2x A6000 ADA

### Models

[unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF](https://huggingface.co/unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF) Q4_K_M, with `mmproj-BF16.gguf`

### Problem description & steps to reproduce

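Llama 4 Scout vision fails for any image larger than 336px. Images under 336px are processed normally; larger images abort with `failed to encode image slice` (see the log output below). The last working build is b8541; b8542 is the first build that fails.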

```
# model: unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF (Q4_K_M + mmproj-BF16)

# works - image under 336px
llama-mtmd-cli \
  -m Llama-4-Scout-17B-16E-Instruct-Q4_K_M-00001-of-00002.gguf \
  --mmproj mmproj-BF16.gguf \
  -c 32768 \
  --image cat-200px.jpg \
  -p "What animal is this?" -n 16

# fails - image over 336px
llama-mtmd-cli \
  -m Llama-4-Scout-17B-16E-Instruct-Q4_K_M-00001-of-00002.gguf \
  --mmproj mmproj-BF16.gguf \
  -c 32768 \
  --image cat-1200px.jpg \
  -p "What animal is this?" -n 16
```
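To narrow down the exact size at which encoding starts to fail, the two commands above can be wrapped in a small sweep. This is only a sketch: `MTMD_CLI`, `MODEL`, `MMPROJ`, `check_image` and `sweep_widths` are illustrative names, not part of llama.cpp, and the resize step assumes ImageMagick 7 (`magick`) is installed. A run is classified as `FAIL` when its output contains the `failed to encode image slice` message from the log below.

```sh
# Sketch: classify one llama-mtmd-cli run for a given image.
# MTMD_CLI, MODEL and MMPROJ are illustrative variables, not llama.cpp settings.
MTMD_CLI="${MTMD_CLI:-llama-mtmd-cli}"
MODEL="${MODEL:-Llama-4-Scout-17B-16E-Instruct-Q4_K_M-00001-of-00002.gguf}"
MMPROJ="${MMPROJ:-mmproj-BF16.gguf}"

check_image() {
  local image="$1" out status
  # Same flags as the reproduction commands; stderr is captured as well
  # because the failure message may be printed there.
  out="$("$MTMD_CLI" -m "$MODEL" --mmproj "$MMPROJ" -c 32768 \
          --image "$image" -p "What animal is this?" -n 16 2>&1)" \
    && status=0 || status=$?
  case "$out" in
    *"failed to encode image slice"*) echo "FAIL $image" ;;
    *)
      # Any other non-zero exit (missing model, OOM, ...) is reported separately.
      if [ "$status" -ne 0 ]; then echo "ERROR $image"; else echo "OK $image"; fi
      ;;
  esac
}

# Resize a source image to each given width (aspect ratio preserved)
# and check every resized copy.
sweep_widths() {
  local src="$1" w
  shift
  for w in "$@"; do
    magick "$src" -resize "${w}x" "cat-${w}px.jpg" && check_image "cat-${w}px.jpg"
  done
}
```

For example, `sweep_widths cat-1200px.jpg 200 335 336 337 400 672` would print one `OK`/`FAIL`/`ERROR` line per width, which should show whether the cutoff is exactly 336px.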

### First Bad Commit

b8542 (a73bbd5d)
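To confirm the regression independently, both tags can be built side by side and run against the same image. The following is a rough sketch, assuming it is run inside a llama.cpp checkout that has the `b8541` and `b8542` tags; `run`, `build_tag`, `DRY_RUN` and `JOBS` are hypothetical helper names introduced here. With `DRY_RUN=1` the commands are only printed.

```sh
# Sketch: build llama-mtmd-cli for one llama.cpp tag into its own build dir.
# run, build_tag, DRY_RUN and JOBS are illustrative helpers, not llama.cpp tooling.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"   # print the command instead of executing it
  else
    "$@"
  fi
}

build_tag() {
  local tag="$1"
  run git checkout "$tag" &&
  run cmake -B "build-$tag" -DGGML_CUDA=ON &&
  run cmake --build "build-$tag" --config Release --target llama-mtmd-cli -j "${JOBS:-4}"
}
```

Running `build_tag b8541 && build_tag b8542` should leave one binary per tag (typically under `build-<tag>/bin/`), so the failing command can be repeated with each of them.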

### Relevant log output

<details>
<summary>Logs</summary>


```console
warmup: *****************************************************************
init_vision: llama 4 vision is known to have degraded quality:
    https://github.com/ggml-org/llama.cpp/pull/13282
main: loading model: Llama-4-Scout-17B-16E-Instruct-Q4_K_M-00001-of-00002.gguf
WARN: This is an experimental CLI for testing multimodal capability.
      For normal use cases, please use the standard llama-cli
encoding image slice...
failed to encode image slice
failed to eval chunk 1
Unable to eval prompt
```
</details>
