Eval bug: Performance in current vulkan binaries is BAD #16317

@openconstruct

Name and Version

Using llama-server from the current build (b6615) gives me ERNIE 4.5 21B-A3B speeds of 4.x tokens per second.
Using the binary from July 17, 2025 gives me 20.x.

I renewed my OS and noticed speed was way down, so I reverted to the older version and it's back to full speed.
Ubuntu 25.04, using the Vulkan Ubuntu binary in both cases.

Operating systems

Linux

GGML backends

Vulkan

Hardware

Ryzen 7535HS, Radeon RX 6550M, 32 GB RAM

Models

No response

Problem description & steps to reproduce

Run the current b6615 Ubuntu Vulkan binary: performance is terrible (4.x tps).
Run the version from July: get 20.x tps.
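
To put exact numbers on the two runs above, something like the following could be used. It is a rough sketch, not part of the original report. It assumes llama-server is already running on its default `http://127.0.0.1:8080`, and that the `/completion` response carries a `timings` object with `predicted_n`, `predicted_ms` and `predicted_per_second`. The prompt text and the helper names (`tokens_per_second`, `measure`, `slowdown`) are made up for illustration.

```python
import json
import urllib.request


def tokens_per_second(timings: dict) -> float:
    """Return generation speed from a llama-server `timings` object."""
    # Prefer the server-reported figure; otherwise derive it from
    # the generated token count and the elapsed milliseconds.
    if "predicted_per_second" in timings:
        return float(timings["predicted_per_second"])
    return timings["predicted_n"] / (timings["predicted_ms"] / 1000.0)


def measure(base_url: str = "http://127.0.0.1:8080", n_predict: int = 128) -> float:
    """Send one completion request to a running server and return tokens/s.

    Assumption: the server listens on base_url and exposes POST /completion.
    """
    payload = json.dumps(
        {"prompt": "Write a short story about a robot.", "n_predict": n_predict}
    ).encode()
    req = urllib.request.Request(
        f"{base_url}/completion",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return tokens_per_second(body["timings"])


def slowdown(old_tps: float, new_tps: float) -> float:
    """How many times slower the new build is than the old one."""
    return old_tps / new_tps
```

Start the July binary with the model loaded, call `measure()`, then do the same with b6615 and pass both results to `slowdown()`. Using the same model file, prompt and `n_predict` for both runs keeps the comparison fair.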

First Bad Commit

No response
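
The first bad build could probably be narrowed down by bisecting the release binaries between the July one and b6615. A minimal sketch of that search is below. It assumes the list of builds is ordered oldest to newest, that the first entry is known good and the last is known bad, and that every build after the regression is also slow. `first_bad` and `is_bad` are hypothetical names, and `is_bad` stands for whatever manual or scripted speed check is used.

```python
from typing import Callable, Sequence


def first_bad(builds: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary-search an ordered list of builds for the first bad one.

    Assumes builds[0] is known good, builds[-1] is known bad, and that
    once a build is bad every later build is bad too.
    """
    lo, hi = 0, len(builds) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return builds[hi]
```

With a few hundred builds in the range this needs about ten speed checks, since each check halves the remaining candidates.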

Relevant log output

idk
