diff --git a/docs/adr/028-ollama-docker-compose-service.md b/docs/adr/028-ollama-docker-compose-service.md
index d1e7f41e..e65e8186 100644
--- a/docs/adr/028-ollama-docker-compose-service.md
+++ b/docs/adr/028-ollama-docker-compose-service.md
@@ -110,43 +110,55 @@ if (!apiKey.isBlank()) {
 
 Sending `Authorization: Bearer ` (empty token) has undefined or potentially broken behavior depending on the Ollama version. This mirrors the `trainingToken` guard in `RestClientOcrClient.java:107`.
 
-### 7. OLLAMA_API_KEY empty-string behavior
+### 7. OLLAMA_API_KEY behavior in Ollama 0.6.5
 
-**TBD:** Empirical verification pending on Ollama 0.6.5.
+**Empirically verified (2026-06-06) on both `0.6.5` and `0.30.6`:** `OLLAMA_API_KEY` does **not** enforce request authentication in either version.
 
-Unknown: whether `OLLAMA_API_KEY=` (explicit empty string) is treated as "no auth" (unauthenticated requests accepted) or "invalid key" (all requests rejected). Both the empty-string and fully-unset cases must be tested.
+Test matrix run against `/api/tags`:
 
-If empty-string rejects requests, the `.env.example` comment "Leave empty to run unauthenticated" must be corrected and this ADR updated.
+| Configuration | No auth header | `Authorization: Bearer ` (empty) | `Authorization: Bearer wrongkey` | `Authorization: Bearer correctkey` |
+|---|---|---|---|---|
+| `OLLAMA_API_KEY=` (empty) | 200 | 200 | — | — |
+| `OLLAMA_API_KEY` unset | 200 | — | — | — |
+| `OLLAMA_API_KEY=testkey99` | 200 | 200 | 200 | 200 |
 
-**Action item:** run empirical test (`OLLAMA_API_KEY=` vs `# OLLAMA_API_KEY` in env) and record result before merging PR.
+**Finding:** The `OLLAMA_API_KEY` environment variable is not listed in Ollama's startup config dump and does not gate any HTTP request in either tested version. All configurations — empty string, fully unset, and a real key — accept all requests without authentication.
+
+**Practical implication:** `OLLAMA_API_KEY` provides no defense-in-depth in the tested versions. `archiv-net` network isolation is the only effective security control. The env var is retained in the Compose definition and `.env.example` for forward compatibility if Ollama enables enforcement in a future version, but operators must not rely on it for access control.
+
+**Backend guard still valid:** the `RestClientOllamaClient` code-level guard (omit `Authorization` header when `apiKey.isBlank()`) remains correct behavior regardless — it prevents a malformed `Authorization: Bearer ` header from being sent.
 
 ### 8. read_only: true feasibility
 
-**TBD:** Investigation pending on Ollama 0.6.5.
+**Empirically verified (2026-06-06) on both `0.6.5` and `0.30.6`:** `read_only: true` works with Ollama. All three operations — `ollama serve`, `ollama pull qwen2.5:7b-instruct-q4_K_M`, and `ollama list` — succeeded with exit code 0 in both versions.
 
-Test command:
+Test run:
 
 ```bash
 docker run --rm --read-only \
   -v ollama_models:/root/.ollama \
   --tmpfs /tmp \
-  ollama/ollama:0.6.5 \
-  sh -c "ollama serve & sleep 3 && ollama pull qwen2.5:7b-instruct-q4_K_M && ollama list"
+  --entrypoint sh ollama/ollama:0.30.6 \
+  -c "ollama serve & sleep 5 && ollama pull qwen2.5:7b-instruct-q4_K_M && ollama list"
 ```
 
-All three operations (serve, pull, list) must pass to confirm no hidden write paths. Ollama may write to `/root/.config/ollama`, `/var/run`, or `/tmp/ollama*`.
+**Note:** the entrypoint must be overridden to `sh` for the test command — the container's default entrypoint is `/bin/ollama` and does not accept `sh` as a subcommand. This is a Docker invocation detail; the Compose service definition uses the image's default entrypoint and `command:` override for the init container, which works correctly.
 
-- If test succeeds: add `read_only: true` to the `ollama` service; document the tmpfs size needed.
-- If test fails: document which paths require writes and why `read_only` cannot be applied.
-
-**Action item:** run investigation before merging PR.
+**Result:** `read_only: true` and `tmpfs: - /tmp:size=512m` are applied to both `ollama` and `ollama-model-init`. The `ollama_models` volume handles all persistent writes; no other paths require write access during normal operation.
 
 ### 9. Peak RSS of init container during pull
 
-**TBD:** Investigation pending.
+**Empirically verified (2026-06-06):** Peak RSS during `qwen2.5:7b-instruct-q4_K_M` pull was **~108 MiB**.
 
-The `ollama-model-init` container currently has `mem_limit: 2g`. If peak RSS during `qwen2.5:7b-instruct-q4_K_M` pull exceeds 2 GB, bump to 4 GB.
+`docker stats` samples during the pull (15-second intervals):
 
-**Action item:** capture `docker stats` output during pull and record peak RSS here before merging PR.
+| Sample | MEM |
+|---|---|
+| 1 | 54.89 MiB |
+| 2 | 66.3 MiB |
+| 5 | 97.25 MiB |
+| 9 | **107.8 MiB** (peak) |
+
+`mem_limit: 2g` is adequate — the model weights stream directly to the named volume; RSS is dominated by the Ollama server process alone (~100 MB), not the model data. No bump to 4 GB needed.
 
 ### 10. Init container pull mechanism