fix(infra): deploy Ollama to prod/staging compose + fix broken model-init recipe #759

Merged
marcel merged 8 commits from fix/issue-758-ollama-prod-compose into main 2026-06-06 20:30:35 +02:00
Showing only changes of commit 2a0863cf3e

@@ -613,7 +613,7 @@ Expected output includes `qwen2.5:7b-instruct-q4_K_M`.
|---|---|---|
| `app.ollama.base-url` | `http://ollama:11434` | Ollama service URL (dev: `http://localhost:11434`) |
| `app.ollama.model` | `qwen2.5:7b-instruct-q4_K_M` | Model to use for inference |
-| `app.ollama.timeout-seconds` | `30` | Read timeout for inference calls |
+| `app.ollama.timeout-seconds` | `60` | Read timeout for inference calls (absorbs cold model load on the first query after an Ollama restart) |
| `app.nl-search.rate-limit.max-requests-per-minute` | `5` | Per-user rate limit |

### Upgrade the Ollama model