About LLMKube
Kubernetes operator for llama.cpp-native LLM inference with GPU scheduling, Apple Silicon Metal support, and an OpenAI-compatible API.
- Pricing: Free
- License: CC-BY-SA-3.0
- Deployment: Self-hosted
- Tags: apache-2.0, go/docker/k8s