diff --git a/docs/DEPLOYMENT.md b/docs/DEPLOYMENT.md
index 5c2580de..d5f6e1c1 100644
--- a/docs/DEPLOYMENT.md
+++ b/docs/DEPLOYMENT.md
@@ -613,7 +613,7 @@ Expected output includes `qwen2.5:7b-instruct-q4_K_M`.
 |---|---|---|
 | `app.ollama.base-url` | `http://ollama:11434` | Ollama service URL (dev: `http://localhost:11434`) |
 | `app.ollama.model` | `qwen2.5:7b-instruct-q4_K_M` | Model to use for inference |
-| `app.ollama.timeout-seconds` | `30` | Read timeout for inference calls |
+| `app.ollama.timeout-seconds` | `60` | Read timeout for inference calls (absorbs cold model load on the first query after an Ollama restart) |
 | `app.nl-search.rate-limit.max-requests-per-minute` | `5` | Per-user rate limit |
 
 ### Upgrade the Ollama model