fix(ocr): route Surya model staging to SSD via TMPDIR + add volume-init service #615

Merged
marcel merged 10 commits from feat/issue-614-tmpdir-persistent-volume into main 2026-05-18 11:32:37 +02:00
Showing only changes of commit 193a4d6ee6

@@ -564,6 +564,22 @@ bash scripts/download-kraken-models.sh
Version-specific one-time steps that must be run before or after upgrading to a given release. Each subsection is safe to skip on a fresh install.
### Upgrading to PR #615 — TMPDIR redirect + ocr-volume-init
`ocr-volume-init` is a new one-shot service in both compose files that runs before `ocr-service` on every `docker compose up`. It:
1. `chown -R 1000:1000 /app/cache /app/models` — corrects volume ownership so the non-root `ocr` user (uid 1000) can write to volumes that may have been created as root (including volumes from before PR #611).
2. `mkdir -p /app/cache/.tmp` — creates the TMPDIR staging directory that Surya uses for GB-scale model downloads. Without this directory, the first model download falls back to the 512 MB `/tmp` tmpfs and fails with ENOSPC. See ADR-021.
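The two steps above can be sketched as a small shell function. This is a minimal illustration under stated assumptions, not the actual entrypoint from the PR: `init_ocr_volumes` and its parameters are hypothetical names, and the final `chown` on the staging directory is an extra precaution added for the sketch so that a freshly created `.tmp` is also owned by the target user.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper mirroring the two documented steps.
# Arguments: owner (uid:gid), cache directory, models directory.
init_ocr_volumes() {
  local owner="$1" cache_dir="$2" models_dir="$3"

  # Step 1: fix ownership of volumes that may have been created as root.
  chown -R "$owner" "$cache_dir" "$models_dir"

  # Step 2: create the staging directory used for large model downloads.
  mkdir -p "$cache_dir/.tmp"

  # Assumption for this sketch: make sure the new directory has the same owner.
  chown "$owner" "$cache_dir/.tmp"
}

# Inside the init container this would be called roughly as:
#   init_ocr_volumes 1000:1000 /app/cache /app/models
```

Because `set -e` is active, a failed `chown` aborts the function with a non-zero status, which matches the failure behavior described for the service.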
**Verify it succeeded:**
```bash
docker logs archiv-ocr-volume-init # dev
docker logs archiv-production-ocr-volume-init-1 # prod
```
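Expected result: the logs contain no error lines and the container has exited with code 0. Note that `docker logs` does not print the exit code; `docker ps -a` shows it in the status column as `Exited (0)`.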
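To check the exit code from a script rather than by eye, it can be read from the container state with `docker inspect`. The following is a minimal sketch; `check_volume_init` is a hypothetical helper name, and the container names are the ones listed above.

```bash
# Hypothetical helper: report whether an init container exited with code 0.
check_volume_init() {
  local container="$1"
  local code

  # .State.ExitCode holds the exit status of the container's main process.
  code="$(docker inspect --format '{{.State.ExitCode}}' "$container")" || {
    echo "$container: cannot inspect container" >&2
    return 2
  }

  if [ "$code" -eq 0 ]; then
    echo "$container: exit code 0"
  else
    echo "$container: exit code $code" >&2
    return 1
  fi
}
```

Usage would be `check_volume_init archiv-ocr-volume-init` in dev, or the same call with `archiv-production-ocr-volume-init-1` in prod.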
**Failure mode:** if `chown` is denied (e.g. the volume is mounted read-only), the container exits non-zero and `ocr-service` will not start (`depends_on: condition: service_completed_successfully`). Check `docker logs` for the `chown` error and verify the volume is writable.
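When triaging this failure, it can help to filter the init container's logs down to the `chown` lines. A minimal sketch, assuming the error messages mention `chown` (as the standard `chown` utilities do); `show_chown_errors` is a hypothetical helper name.

```bash
# Hypothetical helper: print chown-related lines from a container's logs.
# Returns 0 if at least one matching line was found, 1 if none.
show_chown_errors() {
  local container="$1"
  # docker logs replays both stdout and stderr; merge them so grep sees both.
  docker logs "$container" 2>&1 | grep -i 'chown'
}
```

For example, `show_chown_errors archiv-ocr-volume-init || echo "no chown errors logged"` prints the matching lines if there are any, and the fallback message otherwise.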
### Upgrading to PR #611 — non-root OCR container
The OCR cache volume path changed from `/root/.cache` to `/app/cache` (PR #611 — CIS Docker §4.1 hardening). The existing volume was written as root and is inaccessible to the new non-root `ocr` user, causing a `PermissionError` on startup.