
Eval bug: SYCL missing kernel, Error OP RMS_NORM #21837

@MrDrMcCoy

Description

Name and Version

Using rpc-server in a container built with --target rpc-server, so there is no llama-cli to print a version string. The container was built cleanly against commit 82764d8f405ff7928c061d8c100b50e9f77939f6.

Operating systems

Linux

GGML backends

The RPC server for the Intel cards is built with SYCL; a separate container build with ROCm serves the AMD GPUs. There are no apparent errors on the AMD instances.

Hardware

Host 1 has Intel Arc Pro B70 x2, AMD Radeon AI Pro 9700 XT x1
Host 2 has AMD Radeon AI Pro 9700 XT x3
Host 3 has AMD Radeon 7900XTX x1

Models

Attempted with Unsloth MiniMax 2.7 and Unsloth Devstral 2, both at Q4_K_XL.

Problem description & steps to reproduce

Running in RPC mode with a custom container. Due to #21747, the host with Intel GPUs runs a separate RPC server instance per card; the other hosts run one RPC server for all of their cards.

As soon as llama-server finishes loading the models, the first Intel RPC server crashes, which kills llama-server.

These cards do run with Vulkan, but performance is suspiciously slow, so I am hoping to get SYCL working for better throughput.

It crashes with the following message:

[graph_compute] device: 0, n_nodes: 603, n_tensors: 764
No kernel named _ZTSZZL17rms_norm_f32_syclPKfPfiiiilllfPN4sycl3_V15queueEiENKUlRNS3_7handlerEE0_clES7_EUlNS3_7nd_itemILi3EEEE_ was found
Exception caught at file:/opt/llama/ggml/src/ggml-sycl/ggml-sycl.cpp, line:4269
Error OP RMS_NORM
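Side note: demangling the mangled symbol (e.g. with binutils' c++filt, assuming it is installed) confirms it is the kernel lambda SYCL generates inside rms_norm_f32_sycl:

```shell
# Demangle the missing kernel symbol; the result names the lambda that
# SYCL generated inside rms_norm_f32_sycl (_ZTS... is typeinfo-name mangling).
echo '_ZTSZZL17rms_norm_f32_syclPKfPfiiiilllfPN4sycl3_V15queueEiENKUlRNS3_7handlerEE0_clES7_EUlNS3_7nd_itemILi3EEEE_' | c++filt
```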
Dockerfile for Intel RPC server
ARG ONEAPI_VERSION=2025.3.3-0-devel-ubuntu24.04

FROM docker.io/intel/deep-learning-essentials:${ONEAPI_VERSION} AS base

ARG intel_arch=bmg_g21
ARG IGC_VERSION=v2.30.1
ARG IGC_VERSION_FULL=2_2.30.1+20950
ARG COMPUTE_RUNTIME_VERSION=26.09.37435.1
ARG COMPUTE_RUNTIME_VERSION_FULL=26.09.37435.1-0
ARG IGDGMM_VERSION=22.9.0
RUN --mount=type=cache,destination=/tmp/neo \
  cd /tmp/neo && wget -c \
  https://github.com/intel/intel-graphics-compiler/releases/download/${IGC_VERSION}/intel-igc-core-${IGC_VERSION_FULL}_amd64.deb \
  https://github.com/intel/intel-graphics-compiler/releases/download/${IGC_VERSION}/intel-igc-opencl-${IGC_VERSION_FULL}_amd64.deb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/intel-ocloc-dbgsym_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.ddeb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/intel-ocloc_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.deb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/intel-opencl-icd-dbgsym_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.ddeb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/intel-opencl-icd_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.deb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/libigdgmm12_${IGDGMM_VERSION}_amd64.deb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/libze-intel-gpu1-dbgsym_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.ddeb \
  https://github.com/intel/compute-runtime/releases/download/$COMPUTE_RUNTIME_VERSION/libze-intel-gpu1_${COMPUTE_RUNTIME_VERSION_FULL}_amd64.deb \
  && dpkg --install *.deb

FROM base AS build

RUN --mount=type=cache,destination=/var/lib/apt \
  --mount=type=cache,destination=/var/cache/apt \
  apt-get update \
  && apt-get dist-upgrade -y \
  && apt-get install -y \
    ccache \
    git \
    libgomp1 \
    libssl-dev \
    ninja-build

ARG CCACHE_DIR=/var/cache/ccache
ARG CFLAGS="${CFLAGS} -O3"
ARG CXXFLAGS="${CXXFLAGS} -O3"

ARG rebuild=''
ARG branch=master
RUN git clone --depth=1 --recurse-submodules --branch=${branch:-master} \
  https://github.com/ggml-org/llama.cpp /opt/llama
WORKDIR /opt/llama

RUN --mount=type=cache,destination=${CCACHE_DIR} \
  bash -c "source /opt/intel/oneapi/setvars.sh --force && \
    cmake -B build -G Ninja \
      -DCMAKE_BUILD_TYPE=Release \
      -DGGML_RPC=ON \
      -DGGML_SYCL=ON \
      -DGGML_SYCL_DEVICE_ARCH=${intel_arch} \
      -DCMAKE_C_COMPILER=icx \
      -DCMAKE_CXX_COMPILER=icpx \
    && cmake --build build --target rpc-server -j$(nproc) \
    && mkdir -vp /app \
    && cp -vrL build/bin/* /app/"

FROM base AS app

COPY --from=build /app /app

WORKDIR /app
VOLUME /var/cache/llama
ENV ZES_ENABLE_SYSMAN=1
ENV UR_L0_ENABLE_RELAXED_ALLOCATION_LIMITS=1
ENV GGML_RPC_DEBUG=1
ENV LLAMA_CACHE=/var/cache/llama
ENV ONEAPI_DEVICE_SELECTOR="level_zero:0"

ENTRYPOINT ["/app/rpc-server"]
CMD ["--host", "0.0.0.0", "--cache"]
EXPOSE 50052
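For what it's worth, my understanding (unverified) is that -DGGML_SYCL_DEVICE_ARCH=${intel_arch} enables ahead-of-time compilation for that single architecture, so one experiment I can try is rebuilding without it and letting kernels be JIT-compiled for whatever device is present at runtime. The only change would be the cmake invocation in the build stage:

```shell
# Same build step as in the Dockerfile, with the AOT arch flag removed so
# SYCL kernels are JIT-compiled at runtime (experiment on my part, not verified).
source /opt/intel/oneapi/setvars.sh --force
cmake -B build -G Ninja \
  -DCMAKE_BUILD_TYPE=Release \
  -DGGML_RPC=ON \
  -DGGML_SYCL=ON \
  -DCMAKE_C_COMPILER=icx \
  -DCMAKE_CXX_COMPILER=icpx
cmake --build build --target rpc-server -j"$(nproc)"
```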

First Bad Commit

Unknown. These Intel cards are new to me, and I have yet to get them working with SYCL.

Relevant log output

Output from RPC server for first Intel card (also first card in RPC list):

...
[set_tensor] saved to '/var/cache/llama/rpc/081ba1bf3fcff680'
[set_tensor] buffer: 0x2ce224f0, data: 0xffffd55971dc0000, offset: 0, size: 198180864
[set_tensor] saved to '/var/cache/llama/rpc/00803106cce709fe'
[alloc_buffer] device: 0, size: 2113929216 -> remote_ptr: 2ce20e00, remote_size: 2113929216
[buffer_get_base] remote_ptr: 2ce20e00
[buffer_clear] remote_ptr: 2ce20e00, value: 0
[get_device_memory] device: 0, free_mem: 20051484672, total_mem: 34242297856
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
(previous line repeated, 56 times in total)
[alloc_buffer] device: 0, size: 478242816 -> remote_ptr: 2ce177f0, remote_size: 478242816
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
(previous line repeated, 140 times in total)
[buffer_get_base] remote_ptr: 2ce177f0
[get_alloc_size] device: 0, buffer: (nil), data: (nil)
(previous line repeated, 14 times in total)
[set_tensor] buffer: 0x2ce177f0, data: 0xffffd559fbc00000, offset: 0, size: 98304
[set_tensor] buffer: 0x2ce177f0, data: 0xffffd559fd400000, offset: 0, size: 8
[set_tensor] buffer: 0x2ce177f0, data: 0xffffd559fd400800, offset: 0, size: 65536
[set_tensor] buffer: 0x2ce177f0, data: 0xffffd559fd410800, offset: 0, size: 16384
[set_tensor] buffer: 0x2ce177f0, data: 0xffffd559fd414800, offset: 0, size: 16
[set_tensor] buffer: 0x2ce177f0, data: 0xffffd559fd415800, offset: 0, size: 16
[set_tensor] buffer: 0x2ce177f0, data: 0xffffd559fd416800, offset: 0, size: 2048
[graph_compute] device: 0, n_nodes: 603, n_tensors: 764
No kernel named _ZTSZZL17rms_norm_f32_syclPKfPfiiiilllfPN4sycl3_V15queueEiENKUlRNS3_7handlerEE0_clES7_EUlNS3_7nd_itemILi3EEEE_ was found
Exception caught at file:/opt/llama/ggml/src/ggml-sycl/ggml-sycl.cpp, line:4269
Error OP RMS_NORM

Output from llama-server:

...
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: RPC0[compy:4100] compute buffer size =   456.09 MiB
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: RPC0[compy:4101] compute buffer size =   344.09 MiB
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: RPC0[compy:4102] compute buffer size =   344.09 MiB
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: RPC0[phatty:4100] compute buffer size =   344.09 MiB
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: RPC1[phatty:4100] compute buffer size =   344.09 MiB
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: RPC2[phatty:4100] compute buffer size =   344.09 MiB
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: RPC0[kingoftown:4100] compute buffer size =   344.09 MiB
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve:        CPU compute buffer size =   304.09 MiB
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: graph nodes  = 3791
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: graph splits = 8
Apr 12 20:26:08 phatty llama-server[3477434]: sched_reserve: reserve took 458.72 ms, sched copies = 1
Apr 12 20:26:08 phatty llama-server[3477434]: set_adapters_lora: adapters = (nil)
Apr 12 20:26:08 phatty llama-server[3477434]: adapters_lora_are_same: adapters = (nil)
Apr 12 20:26:08 phatty llama-server[3477434]: common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
Apr 12 20:26:08 phatty llama-server[3477434]: set_warmup: value = 1
Apr 12 20:26:08 phatty llama-server[3477434]: /opt/llama/ggml/src/ggml-rpc/ggml-rpc.cpp:670: Remote RPC server crashed or returned malformed response
Apr 12 20:26:08 phatty llama-server[3477434]: recv failed (bytes_recv=0, size_to_recv=8)
Apr 12 20:26:08 phatty llama-server[3479019]: ⚠️ warning: The current terminal doesn't support styling.  Styled output might not appear as expected.
Apr 12 20:26:08 phatty llama-server[3479019]: [New LWP 67]
Apr 12 20:26:08 phatty llama-server[3479019]: (further [New LWP n] lines for n = 66 down to 3 omitted)
Apr 12 20:26:09 phatty llama-server[3479019]: [Thread debugging using libthread_db enabled]
Apr 12 20:26:09 phatty llama-server[3479019]: Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Apr 12 20:26:09 phatty llama-server[3479019]: 0x00007fe4175b69ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
Apr 12 20:26:09 phatty llama-server[3479019]: #0  0x00007fe4175b69ee in ?? () from /lib/x86_64-linux-gnu/libc.so.6
Apr 12 20:26:09 phatty llama-server[3479019]: #1  0x00007fe4175ab668 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
Apr 12 20:26:09 phatty llama-server[3479019]: #2  0x00007fe4175ab6ad in ?? () from /lib/x86_64-linux-gnu/libc.so.6
Apr 12 20:26:09 phatty llama-server[3479019]: #3  0x00007fe4176167c7 in wait4 () from /lib/x86_64-linux-gnu/libc.so.6
Apr 12 20:26:09 phatty llama-server[3479019]: #4  0x00007fe417b37e2b in ggml_print_backtrace () from /opt/llama/build/bin/libggml-base.so.0
Apr 12 20:26:09 phatty llama-server[3479019]: #5  0x00007fe417b37f7e in ggml_abort () from /opt/llama/build/bin/libggml-base.so.0
Apr 12 20:26:09 phatty llama-server[3479019]: #6  0x00007fe417acc1cc in ggml_backend_rpc_buffer_get_tensor(ggml_backend_buffer*, ggml_tensor const*, void*, unsigned long, unsigned long) () from /opt/llama/build/bin/libggml-rpc.so.0
Apr 12 20:26:09 phatty llama-server[3479019]: #7  0x00007fe417b4e626 in ggml_backend_tensor_copy () from /opt/llama/build/bin/libggml-base.so.0
Apr 12 20:26:09 phatty llama-server[3479019]: #8  0x00007fe417b534ed in ggml_backend_sched_graph_compute_async () from /opt/llama/build/bin/libggml-base.so.0
Apr 12 20:26:09 phatty llama-server[3479019]: #9  0x00007fe417c840d1 in llama_context::graph_compute(ggml_cgraph*, bool) () from /opt/llama/build/bin/libllama.so.0
Apr 12 20:26:09 phatty llama-server[3479019]: #10 0x00007fe417c86684 in llama_context::process_ubatch(llama_ubatch const&, llm_graph_type, llama_memory_context_i*, ggml_status&) () from /opt/llama/build/bin/libllama.so.0
Apr 12 20:26:09 phatty llama-server[3479019]: #11 0x00007fe417c8c1cf in llama_context::decode(llama_batch const&) () from /opt/llama/build/bin/libllama.so.0
Apr 12 20:26:09 phatty llama-server[3479019]: #12 0x00007fe417c8dbfb in llama_decode () from /opt/llama/build/bin/libllama.so.0
Apr 12 20:26:09 phatty llama-server[3479019]: #13 0x0000561db7e8c012 in common_init_from_params(common_params&) ()
Apr 12 20:26:09 phatty llama-server[3479019]: #14 0x0000561db7da03ac in server_context_impl::load_model(common_params&) ()
Apr 12 20:26:09 phatty llama-server[3479019]: #15 0x0000561db7cf1cb6 in main ()
Apr 12 20:26:09 phatty llama-server[3479019]: [Inferior 1 (process 1) detached]
