Eval bug: SYCL missing kernel, Error OP RMS_NORM

### Name and Version

Using `rpc-server` from a container built with `--target rpc-server`, so `llama-cli` is not available. The container was built cleanly against commit `82764d8f405ff7928c061d8c100b50e9f77939f6`.

### Operating systems

Linux

### GGML backends

The RPC server for the Intel cards is built with SYCL; a separate container build with ROCm serves the AMD GPUs. No apparent errors with the AMD instances.

### Hardware

Host 1 has Intel Arc Pro B70 x2, AMD Radeon AI Pro 9700 XT x1
Host 2 has AMD Radeon AI Pro 9700 XT x3
Host 3 has AMD Radeon 7900XTX x1

### Models

Attempted with Unsloth MiniMax 2.7 and Unsloth Devstral 2, both at Q4_K_XL.

### Problem description & steps to reproduce

Running in RPC mode with a custom container. The host with Intel GPUs runs a separate RPC server instance for each card because of https://github.com/ggml-org/llama.cpp/issues/21747; the other hosts run one RPC server for all cards.
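For reference, the per-card layout can be sketched roughly as below. This is a dry run that only prints the commands. The image name, the base port, the GPU count, and the `--device /dev/dri` passthrough are assumptions for illustration, not my exact invocation; `ONEAPI_DEVICE_SELECTOR` and port 50052 come from the Dockerfile further down.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print one `docker run` command per Intel GPU.
# IMAGE, BASE_PORT and GPU_COUNT are hypothetical placeholders.
IMAGE="${IMAGE:-llama-rpc-sycl}"
BASE_PORT="${BASE_PORT:-4100}"
GPU_COUNT="${GPU_COUNT:-2}"

rpc_commands() {
  local i
  for ((i = 0; i < GPU_COUNT; i++)); do
    # ONEAPI_DEVICE_SELECTOR limits the SYCL runtime to one Level Zero device,
    # so each container (and its rpc-server) sees exactly one card.
    # 50052 is the port exposed by the Dockerfile in this report.
    echo "docker run -d --device /dev/dri" \
         "-e ONEAPI_DEVICE_SELECTOR=level_zero:${i}" \
         "-p $((BASE_PORT + i)):50052 ${IMAGE}"
  done
}

rpc_commands
```

Each printed command overrides the image's default `ONEAPI_DEVICE_SELECTOR="level_zero:0"` so that every instance is pinned to a different card and published on its own host port.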

As soon as `llama-server` finishes loading the models, the first Intel RPC server crashes, which kills `llama-server`.

I am able to run on these cards with Vulkan, but that is suspiciously slow. I am hoping to get SYCL to work for better performance.

It crashes with the following message:

```console
[graph_compute] device: 0, n_nodes: 603, n_tensors: 764
No kernel named _ZTSZZL17rms_norm_f32_syclPKfPfiiiilllfPN4sycl3_V15queueEiENKUlRNS3_7handlerEE0_clES7_EUlNS3_7nd_itemILi3EEEE_ was foundException caught at file:/opt/llama/ggml/src/ggml-sycl/ggml-sycl.cpp, line:4269
Error OP RMS_NORM
```
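The kernel name in the error is an Itanium C++ ABI mangled symbol. A minimal sketch for making it readable, assuming `c++filt` from binutils is installed (the script falls back to printing the raw name otherwise):

```shell
#!/usr/bin/env bash
# Mangled kernel name copied verbatim from the rpc-server error above.
kernel='_ZTSZZL17rms_norm_f32_syclPKfPfiiiilllfPN4sycl3_V15queueEiENKUlRNS3_7handlerEE0_clES7_EUlNS3_7nd_itemILi3EEEE_'

if command -v c++filt >/dev/null 2>&1; then
  # c++filt reads symbols on stdin and prints their demangled form.
  demangled="$(printf '%s\n' "$kernel" | c++filt)"
else
  # Assumption: without binutils we just keep the raw symbol.
  demangled="$kernel"
fi

echo "$demangled"
```

Either way the name points at a lambda inside `rms_norm_f32_sycl`, which matches the `Error OP RMS_NORM` line.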

<details><summary>Dockerfile for Intel RPC server</summary>

```dockerfile
ARG ONEAPI_VERSION=2025.3.3-0-devel-ubuntu24.04

FROM docker.io/intel/deep-learning-essentials:${ONEAPI_VERSION} as base

ARG intel_arch=bmg_g21
ARG IGC_VERSION=v2.30.1
ARG IGC_VERSION_FULL=2_2.30.1+20950
ARG COMPUTE_RUNTIME_VERSION=26.09.37435.1
ARG COMPUTE_RUNTIME_VERSION_FULL=26.09.37435.1-0
ARG IGDGMM_VERSION=22.9.0
RUN --mount=type=cache,destination=/tmp/neo \
  cd /tmp/neo && wget -c \
  https://github.com/intel/intel-graphics-compiler/releases/download/${IGC_VERSION}/intel-igc-core-${IGC_VERSION_FULL}_amd64.deb \
  https://github.com/intel/intel-graphics-compiler/releases/download/${IGC_VERSION}/intel-igc-opencl-${IGC_VERSION_FULL}_amd64.deb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/intel-ocloc-dbgsym_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.ddeb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/intel-ocloc_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.deb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/intel-opencl-icd-dbgsym_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.ddeb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/intel-opencl-icd_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.deb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/libigdgmm12_${IGDGMM_VERSION}_amd64.deb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/libze-intel-gpu1-dbgsym_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.ddeb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/libze-intel-gpu1_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.deb \
  && dpkg --install *.deb

FROM base as build

RUN --mount=type=cache,destination=/var/lib/apt \
  --mount=type=cache,destination=/var/cache/apt \
  apt-get update \
  && apt-get dist-upgrade -y \
  && apt-get install -y \
    ccache \
    git \
    libgomp1 \
    libssl-dev \
    ninja-build

ARG CCACHE_DIR=/var/cache/ccache
ARG CFLAGS="${CFLAGS} -O3"
ARG CXXFLAGS="${CFLAGS} -O3"

ARG rebuild=''
ARG branch=master
RUN git clone --depth=1 --recurse-submodules --branch=${branch:-master} \
  https://github.com/ggml-org/llama.cpp /opt/llama
WORKDIR /opt/llama

RUN --mount=type=cache,destination=${CCACHE_DIR} \
  bash -c "source /opt/intel/oneapi/setvars.sh --force && \
    cmake -B build -G Ninja \
      -DCMAKE_BUILD_TYPE=Release \
      -DGGML_RPC=ON \
      -DGGML_SYCL=ON \
      -DGGML_SYCL_DEVICE_ARCH=${intel_arch} \
      -DCMAKE_C_COMPILER=icx \
      -DCMAKE_CXX_COMPILER=icpx \
    && cmake --build build --target rpc-server -j$(nproc) \
    && mkdir -vp /app \
    && cp -vrL build/bin/* /app/"

FROM base as app

COPY --from=build /app /app

WORKDIR /app
VOLUME /var/cache/llama
ENV ZES_ENABLE_SYSMAN=1
ENV UR_L0_ENABLE_RELAXED_ALLOCATION_LIMITS=1
ENV GGML_RPC_DEBUG=1
ENV LLAMA_CACHE=/var/cache/llama
ENV ONEAPI_DEVICE_SELECTOR="level_zero:0"

ENTRYPOINT ["/app/rpc-server"]
CMD ["--host", "0.0.0.0", "--cache"]
EXPOSE 50052
```

</details>

### First Bad Commit

Unknown. These Intel cards are new to me, and I have yet to get them working with SYCL.

### Relevant log output

<details>
<summary>Logs</summary>


Output from RPC server for first Intel card (also first card in RPC list):

```console
...
[set_tensor] saved to '/var/cache/llama/rpc/081ba1bf3fcff680'
[set_tensor] buffer: 0x2ce224f0, data: 0xffffd55971dc0000, offset: 0, size: 198180864
[set_tensor] saved to '/var/cache/llama/rpc/00803106cce709fe'
[alloc_buffer] device: 0, size: 2113929216 -> remote_ptr: 2ce20e00, remote_size: 2113929216
[buffer_get_base] remote_ptr: 2ce20e00
[buffer_clear] remote_ptr: 2ce20e00, value: 0
[get_device_memory] device: 0, free_mem: 20051484672, total_mem: 34242297856
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[alloc_buffer] device: 0, size: 478242816 -> remote_ptr: 2ce177f0, remote_size: 478242816
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[buffer_get_base] remote_ptr: 2ce177f0
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
[set_tensor] buffer: 0x2ce177f0, data: 0xffffd559fbc00000, offset: 0, size: 98304
[set_tensor] buffer: 0x2ce177f0, data: 0xffffd559fd400000, offset: 0, size: 8
[set_tensor] buffer: 0x2ce177f0, data: 0xffffd559fd400800, offset: 0, size: 65536
[set_tensor] buffer: 0x2ce177f0, data: 0xffffd559fd410800, offset: 0, size: 16384
[set_tensor] buffer: 0x2ce177f0, data: 0xffffd559fd414800, offset: 0, size: 16
[set_tensor] buffer: 0x2ce177f0, data: 0xffffd559fd415800, offset: 0, size: 16
[set_tensor] buffer: 0x2ce177f0, data: 0xffffd559fd416800, offset: 0, size: 2048
[graph_compute] device: 0, n_nodes: 603, n_tensors: 764
No kernel named _ZTSZZL17rms_norm_f32_syclPKfPfiiiilllfPN4sycl3_V15queueEiENKUlRNS3_7handlerEE0_clES7_EUlNS3_7nd_itemILi3EEEE_ was foundException caught at file:/opt/llama/ggml/src/ggml-sycl/ggml-sycl.cpp, line:4269
Error OP RMS_NORM
```

Output from `llama-server`:

```console
...
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: RPC0[compy:4100] compute buffer size =   456.09 MiB
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: RPC0[compy:4101] compute buffer size =   344.09 MiB
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: RPC0[compy:4102] compute buffer size =   344.09 MiB
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: RPC0[phatty:4100] compute buffer size =   344.09 MiB
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: RPC1[phatty:4100] compute buffer size =   344.09 MiB
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: RPC2[phatty:4100] compute buffer size =   344.09 MiB
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: RPC0[kingoftown:4100] compute buffer size =   344.09 MiB
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve:        CPU compute buffer size =   304.09 MiB
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: graph nodes  = 3791
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: graph splits = 8
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: reserve took 458.72 ms, sched copies = 1
Apr 12 20:26:08 phatty llama-server[3477434]: set_adapters_lora: adapters = (nil)
Apr 12 20:26:08 phatty llama-server[3477434]: adapters_lora_are_same: adapters = (nil)
Apr 12 20:26:08 phatty llama-server[3477434]: common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
Apr 12 20:26:08 phatty llama-server[3477434]: set_warmup: value = 1
Apr 12 20:26:08 phatty llama-server[3477434]: /opt/llama/ggml/src/ggml-rpc/ggml-rpc.cpp:670: Remote RPC server crashed or returned malformed response
Apr 12 20:26:08 phatty llama-server[3477434]: recv failed (bytes_recv=0, size_to_recv=8)
Apr 12 20:26:08 phatty llama-server[3479019]: ⚠️ warning: The current terminal doesn't support styling.  Styled output might not appear as expected.
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 67]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 66]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 65]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 64]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 63]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 62]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 61]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 60]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 59]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 58]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 57]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 56]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 55]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 54]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 53]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 52]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 51]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 50]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 49]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 48]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 47]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 46]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 45]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 44]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 43]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 42]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 41]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 40]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 39]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 38]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 37]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 36]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 35]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 34]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 33]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 32]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 31]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 30]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 29]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 28]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 27]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 26]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 25]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 24]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 23]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 22]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 21]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 20]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 19]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 18]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 17]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 16]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 15]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 14]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 13]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 12]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 11]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 10]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 9]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 8]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 7]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 6]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 5]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 4]
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 3]
Apr 12 20:26:09 phatty llama-server[3479019]: [Thread debugging using libthread_db enabled]
Apr 12 20:26:09 phatty llama-server[3479019]: Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Apr 12 20:26:09 phatty llama-server[3479019]: 0x00007fe4175b69ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
Apr 12 20:26:09 phatty llama-server[3479019]: #0  0x00007fe4175b69ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
Apr 12 20:26:09 phatty llama-server[3479019]: #1  0x00007fe4175ab668 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
Apr 12 20:26:09 phatty llama-server[3479019]: #2  0x00007fe4175ab6ad in ?? () from /lib/x86_64-linux-gnu/libc.so.6
Apr 12 20:26:09 phatty llama-server[3479019]: #3  0x00007fe4176167c7 in wait4 () from /lib/x86_64-linux-gnu/libc.so.6
Apr 12 20:26:09 phatty llama-server[3479019]: #4  0x00007fe417b37e2b in ggml_print_backtrace () from /opt/llama/build/bin/libggml-base.so.0
Apr 12 20:26:09 phatty llama-server[3479019]: #5  0x00007fe417b37f7e in ggml_abort () from /opt/llama/build/bin/libggml-base.so.0
Apr 12 20:26:09 phatty llama-server[3479019]: #6  0x00007fe417acc1cc in ggml_backend_rpc_buffer_get_tensor(ggml_backend_buffer*, ggml_tensor const*, void*, unsigned long, unsigned long) () from /opt/llama/build/bin/libggml-rpc.so.0
Apr 12 20:26:09 phatty llama-server[3479019]: #7  0x00007fe417b4e626 in ggml_backend_tensor_copy () from /opt/llama/build/bin/libggml-base.so.0
Apr 12 20:26:09 phatty llama-server[3479019]: #8  0x00007fe417b534ed in ggml_backend_sched_graph_compute_async () from /opt/llama/build/bin/libggml-base.so.0
Apr 12 20:26:09 phatty llama-server[3479019]: #9  0x00007fe417c840d1 in llama_context::graph_compute(ggml_cgraph*, bool) () from /opt/llama/build/bin/libllama.so.0
Apr 12 20:26:09 phatty llama-server[3479019]: #10 0x00007fe417c86684 in llama_context::process_ubatch(llama_ubatch const&, llm_graph_type, llama_memory_context_i*, ggml_status&) () from /opt/llama/build/bin/libllama.so.0
Apr 12 20:26:09 phatty llama-server[3479019]: #11 0x00007fe417c8c1cf in llama_context::decode(llama_batch const&) () from /opt/llama/build/bin/libllama.so.0
Apr 12 20:26:09 phatty llama-server[3479019]: #12 0x00007fe417c8dbfb in llama_decode () from /opt/llama/build/bin/libllama.so.0
Apr 12 20:26:09 phatty llama-server[3479019]: #13 0x0000561db7e8c012 in common_init_from_params(common_params&) ()
Apr 12 20:26:09 phatty llama-server[3479019]: #14 0x0000561db7da03ac in server_context_impl::load_model(common_params&) ()
Apr 12 20:26:09 phatty llama-server[3479019]: #15 0x0000561db7cf1cb6 in main ()
Apr 12 20:26:09 phatty llama-server[3479019]: [Inferior 1 (process 1) detached]
```

</details>
