docs: add XDG_CACHE_HOME/TORCH_HOME to OCR env table and upgrade notes for PR #611

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Marcel
2026-05-17 17:32:02 +02:00
parent fc8b4b164b
commit 38973a014e


@@ -19,6 +19,7 @@ This doc is the Day-1 checklist and operational reference. It links to the canon
5. [Backup + recovery](#5-backup--recovery)
6. [Common operational tasks](#6-common-operational-tasks)
7. [Known limitations](#7-known-limitations)
8. [Upgrade notes](#8-upgrade-notes)
---
@@ -140,6 +141,8 @@ All vars are set in `.env` at the repo root (copy from `.env.example`). The back
| `KRAKEN_MODEL_PATH` | Directory containing Kraken HTR models (populated by `download-kraken-models.sh`) | `/app/models/` | — | — |
| `BLLA_MODEL_PATH` | Kraken baseline layout analysis model path | `/app/models/blla.mlmodel` | — | — |
| `OCR_MEM_LIMIT` | Container memory cap for ocr-service in `docker-compose.prod.yml`. Set to `6g` on CX32 hosts; leave unset on CX42+ to use the 12g default | `12g` (prod compose default) | — | — |
| `XDG_CACHE_HOME` | XDG cache base dir — redirects Matplotlib and other XDG-aware libraries away from the read-only `HOME` (`/home/ocr`) to the writable cache volume | `/app/cache` | — | — |
| `TORCH_HOME` | PyTorch model cache — redirects `~/.cache/torch` to the writable models volume | `/app/models/torch` | — | — |
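
The fallback order behind these two variables can be sketched in shell. XDG-aware libraries such as Matplotlib use `$HOME/.cache` when `XDG_CACHE_HOME` is unset, and PyTorch uses `$XDG_CACHE_HOME/torch` when `TORCH_HOME` is unset. The helper names `resolve_xdg_cache` and `resolve_torch_home` are hypothetical and exist only for illustration; they are not part of the repo.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not in the repo) that mirror the lookup order
# used by XDG-aware libraries and by PyTorch.

# XDG cache base: $XDG_CACHE_HOME if set, otherwise $HOME/.cache.
resolve_xdg_cache() {
  printf '%s\n' "${XDG_CACHE_HOME:-$HOME/.cache}"
}

# PyTorch cache root: $TORCH_HOME if set, otherwise <XDG cache base>/torch.
resolve_torch_home() {
  printf '%s\n' "${TORCH_HOME:-$(resolve_xdg_cache)/torch}"
}

echo "XDG cache dir: $(resolve_xdg_cache)"
echo "Torch home:    $(resolve_torch_home)"
```

With neither variable set and a read-only `HOME`, both paths resolve under `/home/ocr/.cache`, which is why the table sets both explicitly.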
### Observability stack (`docker-compose.observability.yml`)
@@ -554,3 +557,21 @@ bash scripts/download-kraken-models.sh
| **No multi-region** | Single PostgreSQL + MinIO instance; no replication or failover | Deliberate scope decision |
| **Max upload size** | 50 MB per file (500 MB per request for multi-file) | Configurable in `application.yaml` (`spring.servlet.multipart`) |
| **No automated backup** | Phase 5 of Production v1 milestone is not yet implemented | See §5 above |
---
## 8. Upgrade notes
Version-specific one-time steps that must be run before or after upgrading to a given release. Each subsection is safe to skip on a fresh install.
### Upgrading to PR #611 — non-root OCR container
The mount path of the OCR cache volume changed from `/root/.cache` to `/app/cache` as part of the CIS Docker §4.1 hardening in PR #611. Files in the existing `ocr_cache` volume are owned by root, so the new non-root `ocr` user cannot access them, and the OCR service fails with a `PermissionError` on startup.
**Before starting the updated container stack**, drop the old root-owned volume:
```bash
docker volume rm familienarchiv_ocr_cache
```
The volume is recreated automatically on `docker compose up`. The OCR service will re-download its model cache on first startup (approximately 12 GB, one-time cost).
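
If the removal step runs from a script, or on several hosts, it can help to make it idempotent. The following is a minimal sketch, not a script shipped in the repo: `remove_stale_volume` is a hypothetical helper that relies only on `docker volume inspect` (which exits non-zero when the volume does not exist) and `docker volume rm`. Docker refuses to remove a volume that is still referenced by a container, including a stopped one, so the old OCR container has to be removed first; how the stack is brought down depends on the deployment and is not shown here.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not in the repo): remove a named volume only if it exists.
remove_stale_volume() {
  local volume="$1"
  if docker volume inspect "$volume" >/dev/null 2>&1; then
    # Fails with Docker's own error message if a container still uses the volume.
    docker volume rm "$volume" >/dev/null && echo "removed $volume"
  else
    echo "$volume not found, nothing to remove"
  fi
}

remove_stale_volume familienarchiv_ocr_cache
```

Running the script a second time, or on a fresh install, only reports that the volume was not found.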