docs(deployment): rewrite for Gitea Actions / Caddy / prod compose

Brings DEPLOYMENT.md in line with the production deployment landed
in #497:

- Topology diagram: frontend port 3000 (Node adapter), 127.0.0.1
  binding, project-name isolation between prod and staging
- Caddyfile now lives in-tree at infra/caddy/Caddyfile (symlinked
  onto the server)
- Dev vs prod table: documents the new deploy method (workflows +
  --wait) and the prod-compose specific differences
- Env vars: adds MINIO_APP_PASSWORD; notes that prod compose
  hardcodes the MinIO root user and the bucket name
- Bootstrap section: server hardening, fail2ban, Tailscale, the 16
  Gitea secrets, and the workflow_dispatch first-deploy step
- Admin password warning: first deploy locks the password, secret
  rotation after that point has no effect
- Rollback: TAG= override + docker compose up -d --wait

Refs #497.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Marcel
2026-05-10 21:58:51 +02:00
parent 334b507476
commit 2eade2b78f

View File

@@ -27,20 +27,22 @@ This doc is the Day-1 checklist and operational reference. It links to the canon
```mermaid
graph TD
Browser -->|HTTPS| Caddy["Caddy (TLS termination)"]
Caddy -->|HTTP :5173| Frontend["Web Frontend\nSvelteKit / Node.js"]
Caddy -->|HTTP :3000| Frontend["Web Frontend\nSvelteKit Node adapter"]
Caddy -->|HTTP :8080| Backend["API Backend\nSpring Boot / Jetty :8080"]
Backend -->|JDBC :5432| DB[(PostgreSQL 16)]
Backend -->|S3 API :9000| MinIO[(MinIO / Hetzner OBS)]
Backend -->|S3 API :9000| MinIO[(MinIO)]
Backend -->|HTTP :8000 internal| OCR["OCR Service\nPython FastAPI"]
OCR -->|presigned URL| MinIO
Browser -->|SSE direct| Backend
```
**Key facts:**
- Caddy terminates TLS and reverse-proxies to frontend and backend. See the Caddyfile in [`docs/infrastructure/production-compose.md`](infrastructure/production-compose.md).
- The OCR service has **no external port** — reachable only on the internal Docker network from the backend.
- Caddy terminates TLS and reverse-proxies to frontend (`:3000`) and backend (`:8080`). The Caddyfile is committed at [`infra/caddy/Caddyfile`](../infra/caddy/Caddyfile) and is installed on the host as `/etc/caddy/Caddyfile` (symlink).
- The host binds all docker-published ports to `127.0.0.1` only; Caddy is the sole external entry point.
- The OCR service has **no published port** — reachable only on the internal Docker network from the backend.
- SSE notifications go directly backend → browser (not via the SvelteKit SSR layer).
- Management port 8081 (Spring Actuator / Prometheus scrape) is internal only — the Caddy config blocks `/actuator/*` externally.
- The Caddyfile responds `404` on `/actuator/*` (defense in depth). Internal monitoring scrapes the backend on the docker network, not through Caddy.
- Production and staging cohabit on the same host via docker compose project names: `archiv-production` (ports 8080/3000) and `archiv-staging` (ports 8081/3001).
### OCR memory requirements
@@ -56,15 +58,19 @@ A CX32 cannot honour a `mem_limit: 12g` — set it to `6g` in the prod overlay o
### Dev vs production differences
| Concern | Dev compose | Prod overlay |
| Concern | Dev (`docker-compose.yml`) | Prod (`docker-compose.prod.yml`) |
|---|---|---|
| MinIO image tag | `minio/minio:latest` (unpinned) | Pinned in prod overlay |
| Data persistence | Bind mounts `./data/postgres`, `./data/minio` | Named Docker volumes |
| Bucket creation | `create-buckets` helper container | Pre-created in Hetzner console |
| Spring profile | `dev,e2e` (enables OpenAPI + Swagger UI) | `prod` |
| Mail | Mailpit (local catcher) | Real SMTP |
| MinIO image tag | `minio/minio:latest` | Pinned `minio/minio:RELEASE.…` |
| Data persistence | Bind mounts `./data/postgres`, `./data/minio` | Named Docker volumes (`postgres-data`, `minio-data`) |
| MinIO credentials for backend | Root user/password | Service account `archiv-app` with bucket-scoped rights |
| Bucket creation | `create-buckets` helper | Same helper, plus service-account bootstrap on every up |
| Spring profile | `dev,e2e` (Swagger + e2e overrides) | unset — base `application.yaml` is production-ready |
| Mail | Mailpit (local catcher) | Real SMTP (production) / Mailpit via `profiles: [staging]` (staging) |
| Frontend image | Dev server, `target: development`, port 5173 | Node adapter, `target: production`, port 3000 |
| Host port binding | All published | Bound to `127.0.0.1` only; Caddy is the front door |
| Deploy method | `docker compose up -d` (manual) | Gitea Actions: `nightly.yml` (staging, cron) and `release.yml` (production, on `v*` tag) — both use `up -d --wait` |
Full prod overlay: [`docs/infrastructure/production-compose.md`](infrastructure/production-compose.md).
Full prod compose: [`docker-compose.prod.yml`](../docker-compose.prod.yml). Workflow files: [`.gitea/workflows/nightly.yml`](../.gitea/workflows/nightly.yml), [`.gitea/workflows/release.yml`](../.gitea/workflows/release.yml).
---
@@ -112,9 +118,10 @@ All vars are set in `.env` at the repo root (copy from `.env.example`). The back
| Variable | Purpose | Default | Required? | Sensitive? |
|---|---|---|---|---|
| `MINIO_ROOT_USER` | MinIO root username | `minio_admin` | YES | — |
| `MINIO_ROOT_PASSWORD` | MinIO root password | `change-me` | YES | YES |
| `MINIO_DEFAULT_BUCKETS` | Bucket name | `archive-documents` | YES | — |
| `MINIO_ROOT_USER` | MinIO root username (dev compose only — prod compose hardcodes `archiv`) | `minio_admin` | YES (dev) | — |
| `MINIO_ROOT_PASSWORD` / `MINIO_PASSWORD` | MinIO root password. **Used only by the `mc admin` bootstrap in prod, never by the backend.** | `change-me` | YES | YES |
| `MINIO_APP_PASSWORD` | Password for the `archiv-app` service account that the backend uses. Bucket-scoped via `readwrite` policy on `familienarchiv`. Bootstrapped by `create-buckets`. | — | YES (prod) | YES |
| `MINIO_DEFAULT_BUCKETS` | Bucket name (dev compose only — prod compose hardcodes `familienarchiv`) | `archive-documents` | YES (dev) | — |
### OCR service
@@ -129,48 +136,81 @@ All vars are set in `.env` at the repo root (copy from `.env.example`). The back
## 3. Bootstrap from scratch
> Full VPS provisioning steps are in [`docs/infrastructure/production-compose.md`](infrastructure/production-compose.md). This section covers the sequence and the security-critical steps.
Production and staging deploy via Gitea Actions (`release.yml` on `v*` tag, `nightly.yml` on cron). The server itself only needs to host Caddy, Docker, and the runner — the workflows handle the rest.
### Security checklist — complete before first boot
> ⚠️ **These defaults ship in `.env.example` and `application.yaml`. Change them or you will have an insecure installation.**
- [ ] Set `APP_ADMIN_PASSWORD` (default: `admin123` — change before starting the backend)
- [ ] Set `APP_ADMIN_USERNAME` if you want a non-default admin login name (add to `.env` — not in `.env.example`)
- [ ] Rotate `POSTGRES_PASSWORD` from `change-me`
- [ ] Rotate `MINIO_ROOT_PASSWORD` from `change-me`
- [ ] Set a strong `APP_OCR_TRAINING_TOKEN` (backend) and the matching `TRAINING_TOKEN` (OCR service) — both must be the same value (`python3 -c "import secrets; print(secrets.token_hex(32))"`)
- [ ] Confirm `ALLOWED_PDF_HOSTS` is locked to your MinIO/S3 hostname — widening to `*` opens SSRF
- [ ] Set `SPRING_PROFILES_ACTIVE=prod` in the prod overlay (not `dev,e2e` — that exposes Swagger UI and `/v3/api-docs`)
- [ ] Use a dedicated MinIO service account for `S3_ACCESS_KEY` / `S3_SECRET_KEY`, not the root credentials
### Bootstrap sequence
### 3.1 Server one-time setup
```bash
# 1. Copy and fill the env file
cp .env.example .env
# edit .env — complete the security checklist above first
# Base hardening
ufw default deny incoming && ufw allow 22/tcp && ufw allow 80/tcp && ufw allow 443/tcp && ufw enable
# /etc/ssh/sshd_config: PasswordAuthentication no, PermitRootLogin no
# 2. (Production only) Create the MinIO / Hetzner OBS bucket in the console
# The dev compose has a create-buckets helper; production does not.
# Create the bucket named $MINIO_DEFAULT_BUCKETS with private access.
# Install Caddy 2 (https://caddyserver.com/docs/install#debian-ubuntu-raspbian)
apt install caddy
# 3. Start the stack (prod overlay — see docs/infrastructure/production-compose.md)
# docker-compose.prod.yml is NOT committed — create it from the guide above
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
# Use the Caddyfile from the repo (replace path with the runner's clone target)
ln -sf /opt/familienarchiv/infra/caddy/Caddyfile /etc/caddy/Caddyfile
systemctl reload caddy
# 4. Flyway migrations run automatically on backend start.
# Watch the backend log to confirm:
docker compose logs --follow --tail=100 backend
# fail2ban — protect /api/auth/login from credential stuffing
# Jail watches Caddy access log for 401 responses on /api/auth/login.
# maxretry=10 findtime=10m bantime=30m
apt install fail2ban
# Drop the jail definition under /etc/fail2ban/jail.d/familienarchiv.conf
# 5. Verify the stack is healthy
curl http://localhost:8080/actuator/health
# Expected: {"status":"UP"}
# Tailscale — used by the backup pipeline to reach heim-nas (follow-up issue)
curl -fsSL https://tailscale.com/install.sh | sh && tailscale up
# 6. Open the app and log in with the admin credentials from .env
# Self-hosted Gitea runner — register against the repo with a runner token
# (see https://docs.gitea.com/usage/actions/quickstart for the register step)
```
> **Do not use `docker-compose.ci.yml` locally** — it disables bind mounts that the dev workflow depends on.
### 3.2 DNS records
```
archiv.raddatz.cloud A <server IP>
staging.raddatz.cloud A <server IP>
git.raddatz.cloud A <server IP>
```
### 3.3 Gitea secrets (Repo → Settings → Actions → Secrets)
| Secret | Used by | Notes |
|---|---|---|
| `PROD_POSTGRES_PASSWORD` | release.yml | strong unique password |
| `PROD_MINIO_PASSWORD` | release.yml | MinIO root password; used only at bootstrap |
| `PROD_MINIO_APP_PASSWORD` | release.yml | application service-account password |
| `PROD_OCR_TRAINING_TOKEN` | release.yml | `python3 -c "import secrets; print(secrets.token_hex(32))"` |
| `PROD_APP_ADMIN_USERNAME` | release.yml | e.g. `admin@archiv.raddatz.cloud` |
| `PROD_APP_ADMIN_PASSWORD` | release.yml | **⚠ locked permanently on first deploy** — see §3.5 |
| `STAGING_POSTGRES_PASSWORD` | nightly.yml | different from prod |
| `STAGING_MINIO_PASSWORD` | nightly.yml | different from prod |
| `STAGING_MINIO_APP_PASSWORD` | nightly.yml | different from prod |
| `STAGING_OCR_TRAINING_TOKEN` | nightly.yml | different from prod |
| `STAGING_APP_ADMIN_USERNAME` | nightly.yml | e.g. `admin@staging.raddatz.cloud` |
| `STAGING_APP_ADMIN_PASSWORD` | nightly.yml | locked on first staging deploy |
| `MAIL_HOST` | release.yml | SMTP relay hostname (prod only) |
| `MAIL_PORT` | release.yml | typically `587` |
| `MAIL_USERNAME` | release.yml | SMTP user |
| `MAIL_PASSWORD` | release.yml | SMTP password |
### 3.4 First deploy
```bash
# 1. Trigger nightly.yml manually (Repo → Actions → nightly → "Run workflow")
# Expected: docker compose up -d --wait succeeds for archiv-staging
# 2. Verify TLS + reverse proxy
curl -I https://staging.raddatz.cloud/
# Expected: 200 (login page) with HSTS + X-Content-Type-Options headers
# 3. When staging looks healthy, push a v* tag to trigger release.yml
git tag v1.0.0 && git push origin v1.0.0
```
### 3.5 ⚠ Admin password is locked on first deploy
`UserDataInitializer` creates the admin user **only if the email does not exist**. The first successful deploy persists the admin password to the database. Changing `PROD_APP_ADMIN_PASSWORD` in Gitea secrets after that point has **no effect** — the secret is only consulted when the row is missing.
Before the first deploy: rotate `PROD_APP_ADMIN_PASSWORD` to a strong value. After the first deploy: change the admin password via the in-app account settings, not via the Gitea secret.
---
@@ -224,7 +264,23 @@ docker exec -i archive-db psql -U ${POSTGRES_USER} ${POSTGRES_DB} < backup-YYYYM
### Planned — phase 5 of Production v1 milestone
Automated backup (PostgreSQL WAL archiving + MinIO bucket replication) is planned in the Production v1 milestone phase 5. Until that ships: **manual backups are the only recovery option.**
Automated backup (nightly `pg_dump` + MinIO `mc mirror` over Tailscale to `heim-nas`) is a follow-up issue. Until that ships: **manual backups are the only recovery option.**
### Rollback
Each release tag corresponds to a docker image tag on the host daemon (built via DooD; no registry). Rolling back to a previous tag is one command:
```bash
TAG=v1.0.0 docker compose \
-f docker-compose.prod.yml \
-p archiv-production \
--env-file /opt/familienarchiv/.env.production \
up -d --wait --remove-orphans
```
If the rollback target image is no longer present on the host (host disk pruned, etc.), re-trigger `release.yml` for that tag from Gitea Actions UI — it rebuilds and redeploys.
**Flyway migrations are not auto-rolled-back.** If a release contained a destructive migration (drop column, rename table), a tag rollback brings the schema back to a previous app version but the data shape has already changed. For breaking schema changes, prefer a forward-only fix.
---