docs: add XDG_CACHE_HOME/TORCH_HOME to OCR env table and upgrade notes for PR #611
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -19,6 +19,7 @@ This doc is the Day-1 checklist and operational reference. It links to the canon
|
||||
5. [Backup + recovery](#5-backup--recovery)
|
||||
6. [Common operational tasks](#6-common-operational-tasks)
|
||||
7. [Known limitations](#7-known-limitations)
|
||||
8. [Upgrade notes](#8-upgrade-notes)
|
||||
|
||||
---
|
||||
|
||||
@@ -140,6 +141,8 @@ All vars are set in `.env` at the repo root (copy from `.env.example`). The back
|
||||
| `KRAKEN_MODEL_PATH` | Directory containing Kraken HTR models (populated by `download-kraken-models.sh`) | `/app/models/` | — | — |
|
||||
| `BLLA_MODEL_PATH` | Kraken baseline layout analysis model path | `/app/models/blla.mlmodel` | — | — |
|
||||
| `OCR_MEM_LIMIT` | Container memory cap for ocr-service in `docker-compose.prod.yml`. Set to `6g` on CX32 hosts; leave unset on CX42+ to use the 12g default | `12g` (prod compose default) | — | — |
|
||||
| `XDG_CACHE_HOME` | XDG cache base dir — redirects Matplotlib and other XDG-aware libraries away from the read-only `HOME` (`/home/ocr`) to the writable cache volume | `/app/cache` | — | — |
|
||||
| `TORCH_HOME` | PyTorch model cache — redirects `~/.cache/torch` to the writable models volume | `/app/models/torch` | — | — |
|
||||
|
||||
### Observability stack (`docker-compose.observability.yml`)
|
||||
|
||||
@@ -554,3 +557,21 @@ bash scripts/download-kraken-models.sh
|
||||
| **No multi-region** | Single PostgreSQL + MinIO instance; no replication or failover | Deliberate scope decision |
|
||||
| **Max upload size** | 50 MB per file (500 MB per request for multi-file) | Configurable in `application.yaml` (`spring.servlet.multipart`) |
|
||||
| **No automated backup** | Phase 5 of Production v1 milestone is not yet implemented | See §5 above |
|
||||
|
||||
---
|
||||
|
||||
## 8. Upgrade notes
|
||||
|
||||
Version-specific one-time steps that must be run before or after upgrading to a given release. Each subsection is safe to skip on a fresh install.
|
||||
|
||||
### Upgrading to PR #611 — non-root OCR container
|
||||
|
||||
The OCR cache volume path changed from `/root/.cache` to `/app/cache` (PR #611 — CIS Docker §4.1 hardening). The existing `ocr_cache` volume was written as root and is inaccessible to the new non-root `ocr` user, causing a `PermissionError` on startup.
|
||||
|
||||
**Before starting the updated container stack**, drop the old root-owned volume:
|
||||
|
||||
```bash
|
||||
docker volume rm familienarchiv_ocr_cache
|
||||
```
|
||||
|
||||
The volume is recreated automatically on `docker compose up`. The OCR service will re-download its model cache on first startup (approximately 1–2 GB, one-time cost).
|
||||
|
||||
Reference in New Issue
Block a user