From befb8e8b35fded16a61e741a31b6e7597b9c0ee6 Mon Sep 17 00:00:00 2001 From: Marcel Date: Tue, 5 May 2026 23:01:00 +0200 Subject: [PATCH 1/3] =?UTF-8?q?docs(legibility):=20write=20docs/DEPLOYMENT?= =?UTF-8?q?.md=20=E2=80=94=20Day-1=20checklist=20and=20operational=20refer?= =?UTF-8?q?ence?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Covers: topology diagram (Mermaid), OCR memory/VPS sizing table, dev-vs-prod differences, complete env vars table (all vars verified against docker-compose.yml and application.yaml, including APP_ADMIN_* and ALLOWED_PDF_HOSTS gaps not in .env.example), security checklist before first boot, bootstrap sequence, logs, backup current state vs planned, common operational tasks, and known limitations with ADR links. Closes #399 Co-Authored-By: Claude Sonnet 4.6 --- docs/DEPLOYMENT.md | 272 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 272 insertions(+) create mode 100644 docs/DEPLOYMENT.md diff --git a/docs/DEPLOYMENT.md b/docs/DEPLOYMENT.md new file mode 100644 index 00000000..c7513923 --- /dev/null +++ b/docs/DEPLOYMENT.md @@ -0,0 +1,272 @@ + + +# Familienarchiv — Deployment Reference + +> **If the app is down right now → jump to [§4 Logs](#4-logs--observability).** + +This doc is the Day-1 checklist and operational reference. It links to the canonical infrastructure docs in `docs/infrastructure/` rather than duplicating them. + +**Audience:** operator bringing up a fresh instance, or Successor-X debugging a live incident. + +**Ownership:** project owner. Update this file in any PR that changes the container topology, env vars, or backup procedure. + +## Table of Contents + +1. [Deployment topology](#1-deployment-topology) +2. [Environment variables](#2-environment-variables) +3. [Bootstrap from scratch](#3-bootstrap-from-scratch) +4. [Logs + observability](#4-logs--observability) +5. [Backup + recovery](#5-backup--recovery) +6. [Common operational tasks](#6-common-operational-tasks) +7. [Known limitations](#7-known-limitations) + +--- + +## 1. Deployment topology + +```mermaid +graph TD + Browser -->|HTTPS| Caddy["Caddy (TLS termination)"] + Caddy -->|HTTP :5173| Frontend["Web Frontend\nSvelteKit / Node.js"] + Caddy -->|HTTP :8080| Backend["API Backend\nSpring Boot / Jetty :8080"] + Backend -->|JDBC :5432| DB[(PostgreSQL 16)] + Backend -->|S3 API :9000| MinIO[(MinIO / Hetzner OBS)] + Backend -->|HTTP :8000 internal| OCR["OCR Service\nPython FastAPI"] + OCR -->|presigned URL| MinIO + Browser -->|SSE direct| Backend +``` + +**Key facts:** +- Caddy terminates TLS and reverse-proxies to frontend and backend. See the Caddyfile in [`docs/infrastructure/production-compose.md`](infrastructure/production-compose.md). +- The OCR service has **no external port** — reachable only on the internal Docker network from the backend. +- SSE notifications go directly backend → browser (not via the SvelteKit SSR layer). +- Management port 8081 (Spring Actuator / Prometheus scrape) is internal only — the Caddy config blocks `/actuator/*` externally. + +### OCR memory requirements + +The OCR service requires significant RAM for model loading. The dev compose sets `mem_limit: 12g`. + +| Production target | RAM | Recommended OCR limit | Notes | +|---|---|---|---| +| Hetzner CX42 | 16 GB | 12 GB | Recommended for OCR-enabled production | +| Hetzner CX32 | 8 GB | 6 GB | Accept reduced batch sizes and slower throughput | +| Hetzner CX22 | 4 GB | — | Disable the OCR service (`profiles: [ocr]`); run OCR on demand only | + +A CX32 cannot honour a `mem_limit: 12g` — set it to `6g` in the prod overlay or use CX42. + +### Dev vs production differences + +| Concern | Dev compose | Prod overlay | +|---|---|---| +| MinIO image tag | `minio/minio:latest` (unpinned) | Pinned in prod overlay | +| Data persistence | Bind mounts `./data/postgres`, `./data/minio` | Named Docker volumes | +| Bucket creation | `create-buckets` helper container | Pre-created in Hetzner console | +| Spring profile | `dev,e2e` (enables OpenAPI + Swagger UI) | `prod` | +| Mail | Mailpit (local catcher) | Real SMTP | + +Full prod overlay: [`docs/infrastructure/production-compose.md`](infrastructure/production-compose.md). + +--- + +## 2. Environment variables + +All vars are set in `.env` at the repo root (copy from `.env.example`). The backend resolves them via `application.yaml`; the Docker Compose file wires them into each container. + +**Any var found in `docker-compose.yml` or `application*.yaml` that is not in this table is a blocking review comment on any PR that changes those files.** + +### Backend + +| Variable | Purpose | Default | Required? | Sensitive? | +|---|---|---|---|---| +| `SPRING_DATASOURCE_URL` | PostgreSQL JDBC URL | — | YES | — | +| `SPRING_DATASOURCE_USERNAME` | DB username | — | YES | — | +| `SPRING_DATASOURCE_PASSWORD` | DB password | — | YES | YES | +| `S3_ENDPOINT` | MinIO / OBS endpoint URL | — | YES | — | +| `S3_ACCESS_KEY` | MinIO access key (use service account, not root in prod) | — | YES | YES | +| `S3_SECRET_KEY` | MinIO secret key | — | YES | YES | +| `S3_BUCKET_NAME` | Target bucket name | — | YES | — | +| `S3_REGION` | S3 region string | `us-east-1` | YES | — | +| `APP_ADMIN_USERNAME` | Bootstrap admin username (⚠ not in .env.example) | `admin` | YES | — | +| `APP_ADMIN_PASSWORD` | Bootstrap admin password (⚠ ships as `admin123`) | `admin123` | YES | YES | +| `APP_BASE_URL` | Public-facing URL for email links | `http://localhost:3000` | YES (prod) | — | +| `APP_OCR_BASE_URL` | Internal URL of the OCR service | — | YES | — | +| `APP_OCR_TRAINING_TOKEN` | Secret token for OCR training endpoints | — | YES (prod) | YES | +| `MAIL_HOST` | SMTP host | `mailpit` (dev) | YES (prod) | — | +| `MAIL_PORT` | SMTP port | `1025` (dev) | YES (prod) | — | +| `MAIL_USERNAME` | SMTP username | — | YES (prod) | YES | +| `MAIL_PASSWORD` | SMTP password | — | YES (prod) | YES | +| `APP_MAIL_FROM` | From address for outbound mail | `noreply@familienarchiv.local` | YES (prod) | — | +| `MAIL_SMTP_AUTH` | SMTP auth enabled | `false` (dev) | YES (prod) | — | +| `MAIL_STARTTLS_ENABLE` | STARTTLS enabled | `false` (dev) | YES (prod) | — | +| `SPRING_PROFILES_ACTIVE` | Spring profile | `dev,e2e` (compose) | YES | — | + +### PostgreSQL container + +| Variable | Purpose | Default | Required? | Sensitive? | +|---|---|---|---|---| +| `POSTGRES_USER` | DB superuser | `archive_user` | YES | — | +| `POSTGRES_PASSWORD` | DB password | `change-me` | YES | YES | +| `POSTGRES_DB` | Database name | `family_archive_db` | YES | — | + +### MinIO container + +| Variable | Purpose | Default | Required? | Sensitive? | +|---|---|---|---|---| +| `MINIO_ROOT_USER` | MinIO root username | `minio_admin` | YES | — | +| `MINIO_ROOT_PASSWORD` | MinIO root password | `change-me` | YES | YES | +| `MINIO_DEFAULT_BUCKETS` | Bucket name | `archive-documents` | YES | — | + +### OCR service + +| Variable | Purpose | Default | Required? | Sensitive? | +|---|---|---|---|---| +| `TRAINING_TOKEN` | Guards `/train` and `/segtrain` endpoints (accepts file uploads) | — | YES (prod) | YES | +| `ALLOWED_PDF_HOSTS` | SSRF protection — comma-separated list of allowed PDF source hosts. **Do not widen to `*`** | `minio,localhost,127.0.0.1` | YES | — | +| `BLLA_MODEL_PATH` | Kraken baseline layout analysis model path | `/app/models/blla.mlmodel` | — | — | + +--- + +## 3. Bootstrap from scratch + +> Full VPS provisioning steps are in [`docs/infrastructure/production-compose.md`](infrastructure/production-compose.md). This section covers the sequence and the security-critical steps. + +### Security checklist — complete before first boot + +> ⚠️ **These defaults ship in `.env.example` and `application.yaml`. Change them or you will have an insecure installation.** + +- [ ] Set `APP_ADMIN_PASSWORD` (default: `admin123` — change before starting the backend) +- [ ] Set `APP_ADMIN_USERNAME` if you want a non-default admin login name (add to `.env` — not in `.env.example`) +- [ ] Rotate `POSTGRES_PASSWORD` from `change-me` +- [ ] Rotate `MINIO_ROOT_PASSWORD` from `change-me` +- [ ] Set a strong `OCR_TRAINING_TOKEN` (`python3 -c "import secrets; print(secrets.token_hex(32))"`) +- [ ] Confirm `ALLOWED_PDF_HOSTS` is locked to your MinIO/S3 hostname — widening to `*` opens SSRF +- [ ] Set `SPRING_PROFILES_ACTIVE=prod` in the prod overlay (not `dev,e2e` — that exposes Swagger UI and `/v3/api-docs`) +- [ ] Use a dedicated MinIO service account for `S3_ACCESS_KEY` / `S3_SECRET_KEY`, not the root credentials + +### Bootstrap sequence + +```bash +# 1. Copy and fill the env file +cp .env.example .env +# edit .env — complete the security checklist above first + +# 2. (Production only) Create the MinIO / Hetzner OBS bucket in the console +# The dev compose has a create-buckets helper; production does not. +# Create the bucket named $MINIO_DEFAULT_BUCKETS with private access. + +# 3. Start the stack (prod overlay — see docs/infrastructure/production-compose.md) +docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d + +# 4. Flyway migrations run automatically on backend start. +# Watch the backend log to confirm: +docker compose logs --follow --tail=100 backend + +# 5. Verify the stack is healthy +curl http://localhost:8080/actuator/health +# Expected: {"status":"UP"} + +# 6. Open the app and log in with the admin credentials from .env +``` + +> **Do not use `docker-compose.ci.yml` locally** — it disables bind mounts that the dev workflow depends on. + +--- + +## 4. Logs + observability + +### First-response commands + +```bash +# Stream backend logs (most useful first) +docker compose logs --follow --tail=100 backend + +# Stream all services +docker compose logs --follow + +# Single snapshot +docker compose logs --tail=200 +# services: frontend, backend, db, minio, ocr-service +``` + +### Log locations + +- **Backend application log**: stdout (captured by Docker). Access inside the container at `/app/logs/` via `docker exec`. +- **Spring Actuator health**: `http://localhost:8080/actuator/health` (internal only in prod — port 8081 for Prometheus scraping) +- **Prometheus scraping**: management port 8081, path `/actuator/prometheus`. Internal only; Caddy blocks `/actuator/*` externally. + +### Future observability + +Phase 7 of the Production v1 milestone adds Prometheus + Loki + Grafana. No monitoring infrastructure is in place yet. + +--- + +## 5. Backup + recovery + +### Current state — no automated backup + +No automated backup is configured. Manual procedure for a point-in-time backup: + +```bash +# PostgreSQL dump +docker exec familienarchiv-db-1 pg_dump -U ${POSTGRES_USER} ${POSTGRES_DB} > backup-$(date +%Y%m%d).sql + +# MinIO data (bind-mounted in dev) +# Copy ./data/minio/ to external storage +``` + +Restoration: +```bash +# Restore Postgres +docker exec -i familienarchiv-db-1 psql -U ${POSTGRES_USER} ${POSTGRES_DB} < backup-YYYYMMDD.sql +``` + +### Planned — phase 5 of Production v1 milestone + +Automated backup (PostgreSQL WAL archiving + MinIO bucket replication) is planned in the Production v1 milestone phase 5. Until that ships: **manual backups are the only recovery option.** + +--- + +## 6. Common operational tasks + +### Reset dev database (truncates data, keeps schema) + +```bash +bash scripts/reset-db.sh +``` + +> Truncates all data but does **not** drop the schema or re-run Flyway. Use for E2E test resets, not full reinstalls. +> ⚠️ Script hardcodes `DB_USER=archive_user` and `DB_NAME=family_archive_db` — if you customised these in `.env`, edit the script accordingly. + +### Rebuild frontend container (clears node_modules volume) + +```bash +bash scripts/rebuild-frontend.sh +``` + +> Assumes the Docker Compose volume is named `familienarchiv_frontend_node_modules`. If your project directory is not named `familienarchiv`, edit line 16 of the script. + +### Download Kraken OCR models + +```bash +bash scripts/download-kraken-models.sh +``` + +> Downloads the Kurrent/Sütterlin HTR models. Run once after a fresh clone or when models are updated. + +### Trigger a mass import (Excel/ODS) + +1. Place the import file in the `import/` bind mount on the backend container. +2. Call `POST /api/admin/trigger-import` (requires `ADMIN` permission). +3. The import runs asynchronously — poll `GET /api/admin/import-status` or watch backend logs. + +--- + +## 7. Known limitations + +| Limitation | Reason | Reference | +|---|---|---| +| **Single-node OCR service** | The two required OCR engines (Surya + Kraken) exist only in the Python ecosystem; horizontal scaling would require a job queue not currently implemented | [ADR-001](adr/001-ocr-python-microservice.md) | +| **No multi-tenancy** | Designed as a single-family private archive; all authenticated users share the same document space | Deliberate scope decision (family-only product frame) | +| **No multi-region** | Single PostgreSQL + MinIO instance; no replication or failover | Deliberate scope decision | +| **Max upload size** | 50 MB per file (500 MB per request for multi-file) | Configurable in `application.yaml` (`spring.servlet.multipart`) | +| **No automated backup** | Phase 5 of Production v1 milestone is not yet implemented | See §5 above | -- 2.49.1 From 1b61af934a9dbac975657dda50f87e954388ff60 Mon Sep 17 00:00:00 2001 From: Marcel Date: Tue, 5 May 2026 23:14:20 +0200 Subject: [PATCH 2/3] docs(legibility): fix two blockers in DEPLOYMENT.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Use correct container name archive-db (not familienarchiv-db-1) in §5 backup/restore commands — verified against docker-compose.yml - Add KRAKEN_MODEL_PATH to OCR service env vars table (was missing; set at docker-compose.yml:92 as /app/models/german_kurrent.mlmodel) Refs #399 Co-Authored-By: Claude Sonnet 4.6 --- docs/DEPLOYMENT.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/DEPLOYMENT.md b/docs/DEPLOYMENT.md index c7513923..b9895e33 100644 --- a/docs/DEPLOYMENT.md +++ b/docs/DEPLOYMENT.md @@ -122,6 +122,7 @@ All vars are set in `.env` at the repo root (copy from `.env.example`). The back |---|---|---|---|---| | `TRAINING_TOKEN` | Guards `/train` and `/segtrain` endpoints (accepts file uploads) | — | YES (prod) | YES | | `ALLOWED_PDF_HOSTS` | SSRF protection — comma-separated list of allowed PDF source hosts. **Do not widen to `*`** | `minio,localhost,127.0.0.1` | YES | — | +| `KRAKEN_MODEL_PATH` | Directory containing Kraken HTR models (populated by `download-kraken-models.sh`) | `/app/models/` | — | — | | `BLLA_MODEL_PATH` | Kraken baseline layout analysis model path | `/app/models/blla.mlmodel` | — | — | --- @@ -208,7 +209,7 @@ No automated backup is configured. Manual procedure for a point-in-time backup: ```bash # PostgreSQL dump -docker exec familienarchiv-db-1 pg_dump -U ${POSTGRES_USER} ${POSTGRES_DB} > backup-$(date +%Y%m%d).sql +docker exec archive-db pg_dump -U ${POSTGRES_USER} ${POSTGRES_DB} > backup-$(date +%Y%m%d).sql # MinIO data (bind-mounted in dev) # Copy ./data/minio/ to external storage @@ -217,7 +218,7 @@ docker exec familienarchiv-db-1 pg_dump -U ${POSTGRES_USER} ${POSTGRES_DB} > bac Restoration: ```bash # Restore Postgres -docker exec -i familienarchiv-db-1 psql -U ${POSTGRES_USER} ${POSTGRES_DB} < backup-YYYYMMDD.sql +docker exec -i archive-db psql -U ${POSTGRES_USER} ${POSTGRES_DB} < backup-YYYYMMDD.sql ``` ### Planned — phase 5 of Production v1 milestone -- 2.49.1 From 6fd7778b9ed4c622b2efb47b49dc8bc09b9274ed Mon Sep 17 00:00:00 2001 From: Marcel Date: Wed, 6 May 2026 07:11:57 +0200 Subject: [PATCH 3/3] fix(docs): correct DEPLOYMENT.md env var name and prod overlay note MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Security checklist: OCR_TRAINING_TOKEN → APP_OCR_TRAINING_TOKEN (backend) plus TRAINING_TOKEN (OCR service); both must share the same value - Bootstrap: clarify docker-compose.prod.yml is not committed — must be created from docs/infrastructure/production-compose.md Co-Authored-By: Claude Sonnet 4.6 --- docs/DEPLOYMENT.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/DEPLOYMENT.md b/docs/DEPLOYMENT.md index b9895e33..6e697c55 100644 --- a/docs/DEPLOYMENT.md +++ b/docs/DEPLOYMENT.md @@ -139,7 +139,7 @@ All vars are set in `.env` at the repo root (copy from `.env.example`). The back - [ ] Set `APP_ADMIN_USERNAME` if you want a non-default admin login name (add to `.env` — not in `.env.example`) - [ ] Rotate `POSTGRES_PASSWORD` from `change-me` - [ ] Rotate `MINIO_ROOT_PASSWORD` from `change-me` -- [ ] Set a strong `OCR_TRAINING_TOKEN` (`python3 -c "import secrets; print(secrets.token_hex(32))"`) +- [ ] Set a strong `APP_OCR_TRAINING_TOKEN` (backend) and the matching `TRAINING_TOKEN` (OCR service) — both must be the same value (`python3 -c "import secrets; print(secrets.token_hex(32))"`) - [ ] Confirm `ALLOWED_PDF_HOSTS` is locked to your MinIO/S3 hostname — widening to `*` opens SSRF - [ ] Set `SPRING_PROFILES_ACTIVE=prod` in the prod overlay (not `dev,e2e` — that exposes Swagger UI and `/v3/api-docs`) - [ ] Use a dedicated MinIO service account for `S3_ACCESS_KEY` / `S3_SECRET_KEY`, not the root credentials @@ -156,6 +156,7 @@ cp .env.example .env # Create the bucket named $MINIO_DEFAULT_BUCKETS with private access. # 3. Start the stack (prod overlay — see docs/infrastructure/production-compose.md) +# docker-compose.prod.yml is NOT committed — create it from the guide above docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d # 4. Flyway migrations run automatically on backend start. -- 2.49.1