We built a guided install mission for llama.cpp inside KubeStellar Console, a standalone Kubernetes dashboard (unrelated to legacy kubestellar/kubestellar, kubeflex, or OCM — zero shared code).
→ Open the llama.cpp install mission
What the mission does
The mission deploys llama-server as a Kubernetes workload with a PVC for the GGUF model cache, and exposes it via a ClusterIP Service on the OpenAI-compatible /v1/chat/completions endpoint. Each step:
- Pre-flight — checks CPU vs CUDA image selection, verifies NVIDIA device plugin when GPU is requested
- Commands — deploys the ghcr.io/ggml-org/llama.cpp:server image with an initContainer that fetches a GGUF model from Hugging Face on first start (sketched below, after this list)
- Validation — waits for rollout and health probe, then curls /v1/chat/completions to confirm inference works end-to-end
- Troubleshooting — on failure, reads pod logs and suggests fixes (initContainer download failures, OOM, readiness probe timing, CPU vs GPU image mismatch)
- Rollback — complete uninstall path for the Deployment, Service, PVC, and namespace
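For a feel of what the Commands and Validation steps amount to, here is a minimal sketch. It is not the mission's exact manifest (that lives in install-llama-cpp.json); the namespace, PVC size, helper image, and model URL are placeholders you would swap for your own.

```bash
# Rough equivalent of what the mission applies; names and sizes are illustrative.
kubectl create namespace llama-cpp

kubectl apply -n llama-cpp -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: llama-model-cache
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llama-server
spec:
  replicas: 1
  selector:
    matchLabels: { app: llama-server }
  template:
    metadata:
      labels: { app: llama-server }
    spec:
      initContainers:
        - name: fetch-model        # downloads the GGUF once; the PVC keeps it cached
          image: curlimages/curl:8.8.0
          command: ["sh", "-c"]
          args:
            - >
              [ -f /models/model.gguf ] ||
              curl -fL -o /models/model.gguf
              https://huggingface.co/<org>/<repo>/resolve/main/<file>.gguf
          volumeMounts:
            - { name: model-cache, mountPath: /models }
      containers:
        - name: llama-server
          image: ghcr.io/ggml-org/llama.cpp:server   # CPU image; the mission can swap in the CUDA variant
          args: ["-m", "/models/model.gguf", "--host", "0.0.0.0", "--port", "8080"]
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet: { path: /health, port: 8080 }
            initialDelaySeconds: 10
          volumeMounts:
            - { name: model-cache, mountPath: /models }
      volumes:
        - name: model-cache
          persistentVolumeClaim:
            claimName: llama-model-cache
---
apiVersion: v1
kind: Service
metadata:
  name: llama-server
spec:
  type: ClusterIP
  selector:
    app: llama-server
  ports:
    - port: 8080
      targetPort: 8080
EOF

# Validation along the lines of the mission's final check: wait for rollout,
# then hit the OpenAI-compatible endpoint through a port-forward.
kubectl rollout status -n llama-cpp deployment/llama-server
kubectl port-forward -n llama-cpp svc/llama-server 8080:8080 &
sleep 2
curl -s http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"messages":[{"role":"user","content":"Say hello"}]}'
```

The mission's Pre-flight step handles the CPU-vs-CUDA image choice and the NVIDIA device plugin check that this sketch glosses over.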
The mission also ties llama.cpp into the Console's agent selector: set LLAMACPP_URL to the in-cluster Service URL and llama.cpp appears as a chat-capable provider in the dropdown, so Console chat routes through your local llama-server.
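As a rough illustration of that hookup: LLAMACPP_URL is the variable the Console reads, while the Service DNS name, port, and the Console Deployment/namespace names below are assumptions based on the sketch above.

```bash
# Console running locally: export the in-cluster Service URL before starting it.
export LLAMACPP_URL=http://llama-server.llama-cpp.svc.cluster.local:8080

# Console deployed in-cluster: set the same variable on its Deployment
# (deployment and namespace names here are placeholders for your install).
kubectl set env deployment/console -n kubestellar-console \
  LLAMACPP_URL=http://llama-server.llama-cpp.svc.cluster.local:8080
```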
Why we're reaching out
The Console now ships with six local-LLM runner integrations (Ollama, llama.cpp, LocalAI, vLLM, LM Studio, Red Hat AI Inference Server) so operators in regulated or air-gapped environments can keep LLM traffic inside their cluster. llama.cpp is the "dependency-minimal" option — easiest to audit, lightest to run, and with the broadest hardware support.
Install
Local (connects to your current kubeconfig context):
curl -sSL https://raw.githubusercontent.com/kubestellar/console/main/start.sh | bash
Deploy into a cluster:
curl -sSL https://raw.githubusercontent.com/kubestellar/console/main/deploy.sh | bash
Mission definitions are open source — PRs welcome at install-llama-cpp.json. Feel free to close if not relevant.