# Familienarchiv — Deployment Reference

> **If the app is down right now → jump to [§4 Logs](#4-logs--observability).**

This doc is the Day-1 checklist and operational reference. It links to the canonical infrastructure docs in `docs/infrastructure/` rather than duplicating them.

**Audience:** operator bringing up a fresh instance, or Successor-X debugging a live incident.

**Ownership:** project owner. Update this file in any PR that changes the container topology, env vars, or backup procedure.

## Table of Contents

1. [Deployment topology](#1-deployment-topology)
2. [Environment variables](#2-environment-variables)
3. [Bootstrap from scratch](#3-bootstrap-from-scratch)
4. [Logs + observability](#4-logs--observability)
5. [Backup + recovery](#5-backup--recovery)
6. [Common operational tasks](#6-common-operational-tasks)
7. [Known limitations](#7-known-limitations)

---

## 1. Deployment topology

```mermaid
graph TD
    Browser -->|HTTPS| Caddy["Caddy (TLS termination)"]
    Caddy -->|HTTP :3000| Frontend["Web Frontend\nSvelteKit Node adapter"]
    Caddy -->|HTTP :8080| Backend["API Backend\nSpring Boot / Jetty :8080"]
    Backend -->|JDBC :5432| DB[(PostgreSQL 16)]
    Backend -->|S3 API :9000| MinIO[(MinIO)]
    Backend -->|HTTP :8000 internal| OCR["OCR Service\nPython FastAPI"]
    OCR -->|presigned URL| MinIO
    Caddy -->|SSE reverse_proxy| Backend
```

**Key facts:**

- Caddy terminates TLS and reverse-proxies to frontend (`:3000`) and backend (`:8080`). The Caddyfile is committed at [`infra/caddy/Caddyfile`](../infra/caddy/Caddyfile) and is installed on the host as `/etc/caddy/Caddyfile` (symlink).
- The host binds all docker-published ports to `127.0.0.1` only; Caddy is the sole external entry point.
- The OCR service has **no published port** — reachable only on the internal Docker network from the backend.
- SSE notifications transit Caddy (browser → Caddy → backend); the backend is never reachable directly from the public internet. The SvelteKit SSR layer is bypassed for SSE, but Caddy is not.
- The Caddyfile responds `404` on `/actuator/*` (defense in depth). Internal monitoring scrapes the backend on the docker network, not through Caddy.
- Production and staging cohabit on the same host via docker compose project names: `archiv-production` (ports 8080/3000) and `archiv-staging` (ports 8081/3001).
- An optional observability stack runs as a separate compose file: `docker compose -f docker-compose.observability.yml up -d`. It joins `archiv-net` and scrapes the backend's management port (`:8081`). Configuration lives under `infra/observability/` (see §4 for the full service list).

### OCR memory requirements

The OCR service requires significant RAM for model loading. The dev compose sets `mem_limit: 12g`.

| Production target | RAM | Recommended OCR limit | Notes |
|---|---|---|---|
| Hetzner CX42 | 16 GB | 12 GB | Recommended for OCR-enabled production |
| Hetzner CX32 | 8 GB | 6 GB | Accept reduced batch sizes and slower throughput |
| Hetzner CX22 | 4 GB | — | Disable the OCR service (`profiles: [ocr]`); run OCR on demand only |

A CX32 cannot honour the default `mem_limit: 12g` — set the `OCR_MEM_LIMIT=6g` env var (in `.env.production` / `.env.staging`, or as a Gitea secret consumed by the workflow) before deploying on a CX32. The prod compose interpolates this var with a 12g default.
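To confirm the interpolation actually took effect, a quick sanity check (a sketch: the container name follows the prod compose project naming used in §4's LogQL note, and the `--env-file` path matches the rollback command in §5):

```bash
# Render the interpolated prod config and confirm the effective OCR memory cap
OCR_MEM_LIMIT=6g docker compose \
  -f docker-compose.prod.yml \
  --env-file /opt/familienarchiv/.env.production \
  config | grep -iE 'mem_limit|memory'

# On a running stack, read the cap straight from the container (value in bytes)
docker inspect --format '{{.HostConfig.Memory}}' archiv-production-ocr-service-1
```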
### Dev vs production differences

| Concern | Dev (`docker-compose.yml`) | Prod (`docker-compose.prod.yml`) |
|---|---|---|
| MinIO image tag | `minio/minio:latest` | Pinned `minio/minio:RELEASE.…` |
| Data persistence | Bind mounts `./data/postgres`, `./data/minio` | Named Docker volumes (`postgres-data`, `minio-data`) |
| MinIO credentials for backend | Root user/password | Service account `archiv-app` with bucket-scoped rights |
| Bucket creation | `create-buckets` helper | Same helper, plus service-account bootstrap on every `up` |
| Spring profile | `dev,e2e` (Swagger + e2e overrides) | unset — base `application.yaml` is production-ready |
| Mail | Mailpit (local catcher) | Real SMTP (production) / Mailpit via `profiles: [staging]` (staging) |
| Frontend image | Dev server, `target: development`, port 5173 | Node adapter, `target: production`, port 3000 |
| Host port binding | All published | Bound to `127.0.0.1` only; Caddy is the front door |
| Deploy method | `docker compose up -d` (manual) | Gitea Actions: `nightly.yml` (staging, cron) and `release.yml` (production, on `v*` tag) — both use `up -d --wait` |

Full prod compose: [`docker-compose.prod.yml`](../docker-compose.prod.yml). Workflow files: [`.gitea/workflows/nightly.yml`](../.gitea/workflows/nightly.yml), [`.gitea/workflows/release.yml`](../.gitea/workflows/release.yml).

---

## 2. Environment variables

All vars are set in `.env` at the repo root (copy from `.env.example`). The backend resolves them via `application.yaml`; the Docker Compose file wires them into each container.

**Any var found in `docker-compose.yml` or `application*.yaml` that is missing from this table warrants a blocking review comment on any PR that changes those files.**
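When auditing the tables below against a live stack, the ground truth is the environment the container actually received. A sketch (the grep patterns are illustrative; extend them as the tables grow):

```bash
# What the backend container actually got
docker compose exec backend env | grep -E '^(SPRING_|S3_|APP_|MAIL_|OTEL_)' | sort

# What compose would inject after interpolating .env
docker compose config | grep -E 'SPRING_|S3_|APP_|MAIL_|OTEL_' | sort -u
```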
### Backend

| Variable | Purpose | Default | Required? | Sensitive? |
|---|---|---|---|---|
| `SPRING_DATASOURCE_URL` | PostgreSQL JDBC URL | — | YES | — |
| `SPRING_DATASOURCE_USERNAME` | DB username | — | YES | — |
| `SPRING_DATASOURCE_PASSWORD` | DB password | — | YES | YES |
| `S3_ENDPOINT` | MinIO / OBS endpoint URL | — | YES | — |
| `S3_ACCESS_KEY` | MinIO access key (use service account, not root in prod) | — | YES | YES |
| `S3_SECRET_KEY` | MinIO secret key | — | YES | YES |
| `S3_BUCKET_NAME` | Target bucket name | — | YES | — |
| `S3_REGION` | S3 region string | `us-east-1` | YES | — |
| `APP_ADMIN_USERNAME` | Bootstrap admin username (⚠ not in .env.example) | `admin` | YES | — |
| `APP_ADMIN_PASSWORD` | Bootstrap admin password (⚠ ships as `admin123`) | `admin123` | YES | YES |
| `APP_BASE_URL` | Public-facing URL for email links | `http://localhost:3000` | YES (prod) | — |
| `APP_OCR_BASE_URL` | Internal URL of the OCR service | — | YES | — |
| `APP_OCR_TRAINING_TOKEN` | Secret token for OCR training endpoints | — | YES (prod) | YES |
| `IMPORT_HOST_DIR` | Absolute host path holding the ODS spreadsheet + PDFs for the `/admin/system` mass-import card. Mounted read-only at `/import` inside the backend (compose-only — backend reads via `app.import.dir`). Compose refuses to start when unset, so staging and prod cannot accidentally share the source. Convention: `/srv/familienarchiv-staging/import` and `/srv/familienarchiv-production/import` | — | YES (prod compose) | — |
| `MAIL_HOST` | SMTP host | `mailpit` (dev) | YES (prod) | — |
| `MAIL_PORT` | SMTP port | `1025` (dev) | YES (prod) | — |
| `MAIL_USERNAME` | SMTP username | — | YES (prod) | YES |
| `MAIL_PASSWORD` | SMTP password | — | YES (prod) | YES |
| `APP_MAIL_FROM` | From address for outbound mail | `noreply@familienarchiv.local` | YES (prod) | — |
| `MAIL_SMTP_AUTH` | SMTP auth enabled | `false` (dev) | YES (prod) | — |
| `MAIL_STARTTLS_ENABLE` | STARTTLS enabled | `false` (dev) | YES (prod) | — |
| `SPRING_PROFILES_ACTIVE` | Spring profile | `dev,e2e` (compose) | YES | — |
| `OTEL_EXPORTER_OTLP_ENDPOINT` | OTLP gRPC endpoint for distributed traces (Tempo). Set to `http://tempo:4317` via compose. | `http://localhost:4317` | — | — |
| `MANAGEMENT_TRACING_SAMPLING_PROBABILITY` | Micrometer tracing sample rate; overridden to `0.0` in test profile. | `0.1` (compose) / `1.0` (dev) | — | — |

### PostgreSQL container

| Variable | Purpose | Default | Required? | Sensitive? |
|---|---|---|---|---|
| `POSTGRES_USER` | DB superuser | `archive_user` | YES | — |
| `POSTGRES_PASSWORD` | DB password | `change-me` | YES | YES |
| `POSTGRES_DB` | Database name | `family_archive_db` | YES | — |

### MinIO container

| Variable | Purpose | Default | Required? | Sensitive? |
|---|---|---|---|---|
| `MINIO_ROOT_USER` | MinIO root username (dev compose only — prod compose hardcodes `archiv`) | `minio_admin` | YES (dev) | — |
| `MINIO_ROOT_PASSWORD` / `MINIO_PASSWORD` | MinIO root password. **Used only by the `mc admin` bootstrap in prod, never by the backend.** | `change-me` | YES | YES |
| `MINIO_APP_PASSWORD` | Password for the `archiv-app` service account that the backend uses. Bucket-scoped via `readwrite` policy on `familienarchiv`. Bootstrapped by `create-buckets`. | — | YES (prod) | YES |
| `MINIO_DEFAULT_BUCKETS` | Bucket name (dev compose only — prod compose hardcodes `familienarchiv`) | `archive-documents` | YES (dev) | — |

### OCR service

| Variable | Purpose | Default | Required? | Sensitive? |
|---|---|---|---|---|
| `TRAINING_TOKEN` | Guards `/train` and `/segtrain` endpoints (accepts file uploads) | — | YES (prod) | YES |
| `ALLOWED_PDF_HOSTS` | SSRF protection — comma-separated list of allowed PDF source hosts. **Do not widen to `*`** | `minio,localhost,127.0.0.1` | YES | — |
| `KRAKEN_MODEL_PATH` | Directory containing Kraken HTR models (populated by `download-kraken-models.sh`) | `/app/models/` | — | — |
| `BLLA_MODEL_PATH` | Kraken baseline layout analysis model path | `/app/models/blla.mlmodel` | — | — |
| `OCR_MEM_LIMIT` | Container memory cap for ocr-service in `docker-compose.prod.yml`. Set to `6g` on CX32 hosts; leave unset on CX42+ to use the 12g default | `12g` (prod compose default) | — | — |
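Before relying on the prod credentials above, it can be worth confirming the `archiv-app` service account (MinIO table) is actually bucket-scoped. A hedged sketch using a throwaway `mc` container on `archiv-net`; the `minio` hostname matches the `ALLOWED_PDF_HOSTS` default, and `MINIO_APP_PASSWORD` must be exported on the host (e.g. sourced from `.env.production`) first:

```bash
# Listing the app bucket should succeed; anything outside the readwrite
# policy on "familienarchiv" should be denied.
docker run --rm --network archiv-net -e MINIO_APP_PASSWORD \
  --entrypoint sh minio/mc -c '
    mc alias set archiv http://minio:9000 archiv-app "$MINIO_APP_PASSWORD" \
    && mc ls archiv/familienarchiv
  '
```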
### Observability stack (`docker-compose.observability.yml`)

| Variable | Purpose | Default | Required? | Sensitive? |
|---|---|---|---|---|
| `PORT_PROMETHEUS` | Host port for the Prometheus UI (bound to `127.0.0.1` only) | `9090` | — | — |
| `PORT_GRAFANA` | Host port for the Grafana UI (bound to `127.0.0.1` only) | `3001` | — | — |
| `GRAFANA_ADMIN_PASSWORD` | Grafana `admin` user password | `changeme` | YES (prod) | YES |
| `PORT_GLITCHTIP` | Host port for the GlitchTip UI (bound to `127.0.0.1` only) | `3002` | — | — |
| `GLITCHTIP_DOMAIN` | Public-facing base URL for GlitchTip (used in email links and CORS) | `http://localhost:3002` | YES (prod) | — |
| `GLITCHTIP_SECRET_KEY` | Django secret key for GlitchTip — generate with `python3 -c "import secrets; print(secrets.token_hex(32))"` | — | YES | YES |

---

## 3. Bootstrap from scratch

Production and staging deploy via Gitea Actions (`release.yml` on `v*` tag, `nightly.yml` on cron). The server itself only needs to host Caddy, Docker, and the runner — the workflows handle the rest.

### 3.1 Server one-time setup

```bash
# Base hardening
ufw default deny incoming && ufw allow 22/tcp && ufw allow 80/tcp && ufw allow 443/tcp && ufw enable
# /etc/ssh/sshd_config: PasswordAuthentication no, PermitRootLogin no

# Install Caddy 2 (https://caddyserver.com/docs/install#debian-ubuntu-raspbian)
apt install caddy

# Use the Caddyfile from the repo (replace path with the runner's clone target)
# CI DEPENDENCY: the nightly and release workflows run `systemctl reload caddy` to
# pick up committed Caddyfile changes. They find the file via this symlink — if it
# is absent or points elsewhere, the reload succeeds but serves stale config.
ln -sf /opt/familienarchiv/infra/caddy/Caddyfile /etc/caddy/Caddyfile
systemctl reload caddy

# fail2ban — protect /api/auth/login from credential stuffing.
# Jail watches the Caddy JSON access log for 401 responses on
# /api/auth/login. The jail (maxretry=10 / findtime=10m / bantime=30m)
# and filter are committed under infra/fail2ban/ — symlink them in:
apt install fail2ban
ln -sf /opt/familienarchiv/infra/fail2ban/jail.d/familienarchiv.conf \
  /etc/fail2ban/jail.d/familienarchiv.conf
ln -sf /opt/familienarchiv/infra/fail2ban/filter.d/familienarchiv-auth.conf \
  /etc/fail2ban/filter.d/familienarchiv-auth.conf
systemctl reload fail2ban
# Verify after first deploy with:
#   fail2ban-client status familienarchiv-auth
#   fail2ban-regex /var/log/caddy/access.log familienarchiv-auth

# Tailscale — used by the backup pipeline to reach heim-nas (follow-up issue)
curl -fsSL https://tailscale.com/install.sh | sh && tailscale up

# Self-hosted Gitea runner — register against the repo with a runner token.
# This runner is assumed single-tenant: the deploy workflows write .env.*
# files to disk during execution (cleaned up unconditionally on completion).
# A multi-tenant runner would need to switch to stdin-piped env files.
# (See https://docs.gitea.com/usage/actions/quickstart for the register step.)
```

### 3.2 DNS records

```
archiv.raddatz.cloud      A
staging.raddatz.cloud     A
git.raddatz.cloud         A
```
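Two quick checks before the first deploy, using standard Caddy and DNS tooling:

```bash
# The symlink CI depends on must resolve into the repo clone
readlink -f /etc/caddy/Caddyfile   # expect /opt/familienarchiv/infra/caddy/Caddyfile
caddy validate --config /etc/caddy/Caddyfile

# Confirm each record resolves before Caddy requests certificates
for host in archiv.raddatz.cloud staging.raddatz.cloud git.raddatz.cloud; do
  printf '%s -> ' "$host"; dig +short "$host" A
done
```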
### 3.3 Gitea secrets (Repo → Settings → Actions → Secrets)

| Secret | Used by | Notes |
|---|---|---|
| `PROD_POSTGRES_PASSWORD` | release.yml | strong unique password |
| `PROD_MINIO_PASSWORD` | release.yml | MinIO root password; used only at bootstrap |
| `PROD_MINIO_APP_PASSWORD` | release.yml | application service-account password |
| `PROD_OCR_TRAINING_TOKEN` | release.yml | `python3 -c "import secrets; print(secrets.token_hex(32))"` |
| `PROD_APP_ADMIN_USERNAME` | release.yml | e.g. `admin@archiv.raddatz.cloud` |
| `PROD_APP_ADMIN_PASSWORD` | release.yml | **⚠ locked permanently on first deploy** — see §3.5 |
| `STAGING_POSTGRES_PASSWORD` | nightly.yml | different from prod |
| `STAGING_MINIO_PASSWORD` | nightly.yml | different from prod |
| `STAGING_MINIO_APP_PASSWORD` | nightly.yml | different from prod |
| `STAGING_OCR_TRAINING_TOKEN` | nightly.yml | different from prod |
| `STAGING_APP_ADMIN_USERNAME` | nightly.yml | e.g. `admin@staging.raddatz.cloud` |
| `STAGING_APP_ADMIN_PASSWORD` | nightly.yml | locked on first staging deploy |
| `MAIL_HOST` | release.yml | SMTP relay hostname (prod only) |
| `MAIL_PORT` | release.yml | typically `587` |
| `MAIL_USERNAME` | release.yml | SMTP user |
| `MAIL_PASSWORD` | release.yml | SMTP password |

### 3.4 First deploy

```bash
# 1. Trigger nightly.yml manually (Repo → Actions → nightly → "Run workflow")
#    Expected: docker compose up -d --wait succeeds for archiv-staging, then
#    the workflow's "Smoke test deployed environment" step asserts:
#      - https://staging.raddatz.cloud/login returns 200
#      - HSTS header is present
#      - /actuator/health returns 404 (defense-in-depth check)

# 2. (Optional) Re-verify manually
curl -I https://staging.raddatz.cloud/
# Expected: 200 (login page) with HSTS + X-Content-Type-Options headers

# 3. When staging looks healthy, push a v* tag to trigger release.yml
git tag v1.0.0 && git push origin v1.0.0
```

### 3.5 ⚠ Admin password is locked on first deploy

`UserDataInitializer` creates the admin user **only if the email does not exist**. The first successful deploy persists the admin password to the database. Changing `PROD_APP_ADMIN_PASSWORD` in Gitea secrets after that point has **no effect** — the secret is only consulted when the row is missing.

Before the first deploy: rotate `PROD_APP_ADMIN_PASSWORD` to a strong value. After the first deploy: change the admin password via the in-app account settings, not via the Gitea secret.

---

## 4. Logs + observability

### First-response commands

```bash
# Stream backend logs (most useful first)
docker compose logs --follow --tail=100 backend

# Stream all services
docker compose logs --follow

# Single snapshot
docker compose logs --tail=200

# services: frontend, backend, db, minio, ocr-service
```

### Log locations

- **Backend application log**: stdout (captured by Docker). Access inside the container at `/app/logs/` via `docker exec`.
- **Spring Actuator health**: `http://localhost:8080/actuator/health` (internal only in prod — port 8081 for Prometheus scraping)
- **Prometheus scraping**: management port 8081, path `/actuator/prometheus`. Internal only; Caddy blocks `/actuator/*` externally.
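The quickest liveness probe, in both environments. The prod variant is a sketch: it assumes `wget` exists inside the backend image; if not, use `curl` or a debug container on `archiv-net` instead:

```bash
# Dev: the backend port is published
curl -s http://localhost:8080/actuator/health

# Prod: /actuator/* is blocked at Caddy and the port is not published,
# so ask the container itself (name follows the prod compose project naming)
docker exec archiv-production-backend-1 wget -qO- http://localhost:8080/actuator/health
```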
### Observability stack

An observability stack is available via `docker-compose.observability.yml`. Configuration lives under `infra/observability/`. Start it after the main stack is up (which creates `archiv-net`):

```bash
docker compose up -d                                      # creates archiv-net
docker compose -f docker-compose.observability.yml up -d
```

Current services:

| Service | Image | Purpose |
|---|---|---|
| `obs-prometheus` | `prom/prometheus:v3.4.0` | Scrapes metrics from backend management port 8081 (`/actuator/prometheus`), node-exporter, and cAdvisor |
| `obs-node-exporter` | `prom/node-exporter:v1.9.0` | Host-level CPU / memory / disk / network metrics |
| `obs-cadvisor` | `gcr.io/cadvisor/cadvisor:v0.52.1` | Per-container resource metrics |
| `obs-loki` | `grafana/loki:3.4.2` | Log aggregation — receives log streams from Promtail. Port 3100 is `expose`-only (not host-bound). |
| `obs-promtail` | `grafana/promtail:3.4.2` | Log shipping agent — reads all Docker container logs via the Docker socket and forwards them to Loki with `container_name`, `compose_service`, and `compose_project` labels |
| `obs-tempo` | `grafana/tempo:2.7.2` | Distributed trace storage — OTLP gRPC receiver on port 4317, OTLP HTTP on port 4318 (both `archiv-net`-internal). Grafana queries traces on port 3200 (`obs-net`-internal). All ports are `expose`-only (not host-bound). |
| `obs-grafana` | `grafana/grafana-oss:11.6.1` | Unified observability UI — metrics dashboards, log exploration, trace viewer. Bound to `127.0.0.1:${PORT_GRAFANA:-3001}` on the host. |
| `obs-glitchtip` | `glitchtip/glitchtip:v4` | Sentry-compatible error tracker. Receives frontend + backend error events, groups by fingerprint, provides issue UI with stack traces. Bound to `127.0.0.1:${PORT_GLITCHTIP:-3002}`. |
| `obs-glitchtip-worker` | `glitchtip/glitchtip:v4` | Celery + beat worker — processes async GlitchTip tasks (event ingestion, notifications, cleanup). |
| `obs-redis` | `redis:7-alpine` | Celery task broker for GlitchTip. Internal to `obs-net`; no host port exposed. |
| `obs-glitchtip-db-init` | `postgres:16-alpine` | One-shot init container. Creates the `glitchtip` database on the existing `archive-db` PostgreSQL instance if it does not already exist. Runs at stack startup; exits cleanly once done. |

#### Grafana

| Item | Value |
|---|---|
| URL | `http://localhost:3001` (or `http://localhost:$PORT_GRAFANA`) |
| Username | `admin` |
| Password | `$GRAFANA_ADMIN_PASSWORD` (default: `changeme` — **change before exposing to a network**) |

Datasources are auto-provisioned on first start (Prometheus, Loki, Tempo — no manual setup required). Three dashboards are pre-loaded:

| Dashboard | Grafana ID | Purpose |
|---|---|---|
| Node Exporter Full | 1860 | Host CPU, memory, disk, network |
| Spring Boot Observability | 17175 | JVM metrics, HTTP latency, error rate |
| Loki Logs | 13639 | Log exploration and filtering |

Tempo traces are accessible via Grafana Explore → Tempo datasource, and linked from Loki logs via the `traceId` derived field.

**Loki quick checks** (after ~60 s, run from inside the `obs-loki` container):

```bash
# Loki health
docker exec obs-loki wget -qO- http://localhost:3100/ready

# List labels
docker exec obs-loki wget -qO- 'http://localhost:3100/loki/api/v1/labels'

# Query logs by service (stable across dev and prod environments)
docker exec obs-loki wget -qO- \
  'http://localhost:3100/loki/api/v1/query_range?query=%7Bcompose_service%3D%22backend%22%7D&limit=5'
```

**Prefer `compose_service` over `container_name` in LogQL queries** — `container_name` differs between dev (`archive-backend`) and prod (`archiv-production-backend-1`), while `compose_service` is stable (`backend`, `db`, `minio`, etc.).

Prometheus port `9090` and Grafana port `3001` are bound to `127.0.0.1` on the host. No other observability ports are host-bound.

#### GlitchTip

| Item | Value |
|---|---|
| URL | `http://localhost:3002` (or `http://localhost:$PORT_GLITCHTIP`) |

**Required env vars** — set in `.env` before first start:

```bash
GLITCHTIP_SECRET_KEY=$(python3 -c "import secrets; print(secrets.token_hex(32))")
GLITCHTIP_DOMAIN=http://localhost:3002   # change to your public URL in prod
PORT_GLITCHTIP=3002                      # optional, defaults to 3002
```

**Database:** GlitchTip shares the existing `archive-db` PostgreSQL instance. The `obs-glitchtip-db-init` one-shot container creates a dedicated `glitchtip` database on first stack start — no manual step required.
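To confirm the init ran, a quick check against the shared Postgres (user and container name are the dev defaults from §2):

```bash
# Should print "glitchtip" once the init container has run
docker exec archive-db psql -U archive_user -d family_archive_db -tAc \
  "SELECT datname FROM pg_database WHERE datname = 'glitchtip';"

# The one-shot container should show status "Exited (0)"
docker ps -a --filter name=obs-glitchtip-db-init --format '{{.Names}} {{.Status}}'
```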
**First-run steps** (one-time, after `docker compose -f docker-compose.observability.yml up -d`):

```bash
# 1. Create the Django superuser (interactive)
docker exec -it obs-glitchtip ./manage.py createsuperuser

# 2. Open the GlitchTip UI and log in
open http://localhost:3002

# 3. Create an organisation (e.g. "Familienarchiv")
# 4. Create two projects:
#      - "familienarchiv-frontend" (platform: JavaScript / SvelteKit)
#      - "familienarchiv-backend"  (platform: Java / Spring Boot)
# 5. Copy each project's DSN from Settings → Projects → <project> → Client Keys
# 6. Wire the DSNs into the backend and frontend via env vars (separate issue)
```

---

## 5. Backup + recovery

### Current state — no automated backup

No automated backup is configured. Manual procedure for a point-in-time backup:

```bash
# PostgreSQL dump
docker exec archive-db pg_dump -U ${POSTGRES_USER} ${POSTGRES_DB} > backup-$(date +%Y%m%d).sql

# MinIO data (bind-mounted in dev)
# Copy ./data/minio/ to external storage
```

Restoration:

```bash
# Restore Postgres
docker exec -i archive-db psql -U ${POSTGRES_USER} ${POSTGRES_DB} < backup-YYYYMMDD.sql
```

### Planned — phase 5 of Production v1 milestone

Automated backup (nightly `pg_dump` + MinIO `mc mirror` over Tailscale to `heim-nas`) is a follow-up issue. Until that ships: **manual backups are the only recovery option.**

### Rollback

Each release tag corresponds to a docker image tag on the host daemon (built via DooD; no registry). Rolling back to a previous tag is one command:

```bash
TAG=v1.0.0 docker compose \
  -f docker-compose.prod.yml \
  -p archiv-production \
  --env-file /opt/familienarchiv/.env.production \
  up -d --wait --remove-orphans
```

If the rollback target image is no longer present on the host (host disk pruned, etc.), re-trigger `release.yml` for that tag from the Gitea Actions UI — it rebuilds and redeploys.

**Flyway migrations are not auto-rolled-back.** If a release contained a destructive migration (drop column, rename table), a tag rollback brings the schema back to a previous app version but the data shape has already changed. For breaking schema changes, prefer a forward-only fix.
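Before rolling back, confirm the target tag still exists on the host daemon. The repository filter below is illustrative; match it against the image names in `docker-compose.prod.yml`:

```bash
# No output means the image was pruned: re-trigger release.yml for the tag instead
docker image ls --format '{{.Repository}}:{{.Tag}}' | grep 'v1.0.0'
```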
---

## 6. Common operational tasks

### Reset dev database (truncates data, keeps schema)

```bash
bash scripts/reset-db.sh
```

> Truncates all data but does **not** drop the schema or re-run Flyway. Use for E2E test resets, not full reinstalls.
> ⚠️ Script hardcodes `DB_USER=archive_user` and `DB_NAME=family_archive_db` — if you customised these in `.env`, edit the script accordingly.

### Rebuild frontend container (clears node_modules volume)

```bash
bash scripts/rebuild-frontend.sh
```

> Assumes the Docker Compose volume is named `familienarchiv_frontend_node_modules`. If your project directory is not named `familienarchiv`, edit line 16 of the script.

### Download Kraken OCR models

```bash
bash scripts/download-kraken-models.sh
```

> Downloads the Kurrent/Sütterlin HTR models. Run once after a fresh clone or when models are updated.

### Trigger a mass import (Excel/ODS)

**Dev:** drop the ODS spreadsheet + PDFs into `./import/` at the repo root — the dev compose bind-mounts it to `/import` automatically.

**Staging/production:**

1. Pre-stage the payload on the host. Convention: `/srv/familienarchiv-staging/import/` or `/srv/familienarchiv-production/import/`.

   ```bash
   rsync -avh --progress ./import/ user@host:/srv/familienarchiv-staging/import/
   ```

2. Make sure `IMPORT_HOST_DIR` is set in `.env.staging` / `.env.production` (the nightly/release workflows already write this — see §3). Compose refuses to start without it.
3. Redeploy the stack so the bind mount picks up the new path — or, if the mount is already in place, skip to step 4.
4. Call `POST /api/admin/trigger-import` (requires `ADMIN` permission), or click the "Import starten" button on `/admin/system`.
5. The import runs asynchronously — poll `GET /api/admin/import-status`, watch `/admin/system`, or tail the backend logs.

---

## 7. Known limitations

| Limitation | Reason | Reference |
|---|---|---|
| **Single-node OCR service** | The two required OCR engines (Surya + Kraken) exist only in the Python ecosystem; horizontal scaling would require a job queue not currently implemented | [ADR-001](adr/001-ocr-python-microservice.md) |
| **No multi-tenancy** | Designed as a single-family private archive; all authenticated users share the same document space | Deliberate scope decision (family-only product frame) |
| **No multi-region** | Single PostgreSQL + MinIO instance; no replication or failover | Deliberate scope decision |
| **Max upload size** | 50 MB per file (500 MB per request for multi-file) | Configurable in `application.yaml` (`spring.servlet.multipart`) |
| **No automated backup** | Phase 5 of Production v1 milestone is not yet implemented | See §5 above |