Add README files for various applications: Decidim, Discourse, Example Admin, Example App, Ghost, Immich, Keila, Lemmy, Listmonk, Loomio, Matrix, Memcached, MySQL, Open WebUI, OpenProject, PostgreSQL, Redis, and vLLM
vllm/README.md (new file, 29 lines)
# vLLM
vLLM is a fast and easy-to-use library for LLM inference and serving with an OpenAI-compatible API. Use it to run large language models on your own hardware.
## Dependencies
None, but requires a GPU node in your cluster.
## Configuration
Key settings configured through your instance's `config.yaml`:
- **model** - Hugging Face model to serve (default: `Qwen/Qwen2.5-7B-Instruct`)
- **maxModelLen** - Maximum sequence length (default: `8192`)
- **gpuProduct** - Required GPU type (default: `RTX 4090`)
- **gpuCount** - Number of GPUs to use (default: `1`)
- **gpuMemoryUtilization** - Fraction of GPU memory to use (default: `0.9`)
- **domain** - Where the API will be accessible (default: `vllm.{your-cloud-domain}`)
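As a sketch, a `config.yaml` overriding these defaults might look like the following. The key names and default values are taken from the list above; the flat top-level layout and the example domain are assumptions.

```yaml
# Hypothetical config.yaml fragment -- top-level layout is an assumption;
# key names and defaults match the settings documented above.
model: Qwen/Qwen2.5-7B-Instruct   # Hugging Face model to serve
maxModelLen: 8192                 # maximum sequence length in tokens
gpuProduct: "RTX 4090"            # GPU type the pod must be scheduled onto
gpuCount: 1                       # number of GPUs to use
gpuMemoryUtilization: 0.9         # fraction of GPU memory vLLM may claim
domain: vllm.example.com          # replace with vllm.{your-cloud-domain}
```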
## Access
After deployment, the OpenAI-compatible API will be available at:
- `https://vllm.{your-cloud-domain}/v1`
Other apps on the cluster (such as Open WebUI) can connect internally at `http://vllm-service.llm.svc.cluster.local:8000/v1`.
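Because the API speaks the OpenAI chat-completions wire format, any OpenAI-style client pointed at the `/v1` base URL works. As a minimal sketch using only the Python standard library, this builds (but does not send) a `/chat/completions` request against the internal service address; the base URL and model name come from the defaults above, and `build_chat_request` is an illustrative helper, not part of any shipped client:

```python
import json
import urllib.request

# Internal base URL from the deployment; swap in the external
# https://vllm.{your-cloud-domain}/v1 address to call it from outside.
BASE_URL = "http://vllm-service.llm.svc.cluster.local:8000/v1"

def build_chat_request(model, messages, max_tokens=256):
    """Build an OpenAI-compatible /chat/completions request (not yet sent)."""
    body = json.dumps({
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request(
    "Qwen/Qwen2.5-7B-Instruct",
    [{"role": "user", "content": "Hello!"}],
)
print(req.full_url)

# Actually sending it requires the service to be reachable:
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)["choices"][0]["message"]["content"]
```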
## Hardware Requirements
This app requires a GPU node in your cluster. Adjust the `gpuProduct`, `gpuCount`, and memory settings to match your available hardware.
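A rough back-of-envelope check can tell you whether a model fits before you deploy. Assumptions in this sketch: fp16 weights at 2 bytes per parameter, an RTX 4090 with 24 GB of VRAM, and the default `gpuMemoryUtilization` of 0.9; whatever budget is left after the weights goes mostly to the KV cache.

```python
# Rough VRAM budget check -- assumptions: fp16 weights (2 bytes/param),
# RTX 4090 = 24 GB VRAM; ignores activation and framework overhead.
params_billion = 7            # Qwen2.5-7B-Instruct
bytes_per_param = 2           # fp16
gpu_vram_gb = 24              # RTX 4090
gpu_memory_utilization = 0.9  # default from config.yaml

weights_gb = params_billion * bytes_per_param      # GB needed for weights
budget_gb = gpu_vram_gb * gpu_memory_utilization   # GB vLLM may claim
kv_cache_gb = budget_gb - weights_gb               # roughly left for KV cache

print(f"weights ~ {weights_gb} GB, budget = {budget_gb:.1f} GB, "
      f"KV cache ~ {kv_cache_gb:.1f} GB")
```

If the KV-cache remainder comes out small or negative, pick a smaller model, lower `maxModelLen`, or move to a GPU with more memory rather than pushing `gpuMemoryUtilization` much higher.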