docs(deployment): document fail2ban symlink, OCR_MEM_LIMIT, smoke test
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m53s
CI / OCR Service Tests (push) Failing after 1m58s
CI / Backend Unit Tests (push) Failing after 1m23s
CI / Unit & Component Tests (pull_request) Failing after 3m59s
CI / Backend Unit Tests (pull_request) Successful in 5m39s
CI / OCR Service Tests (pull_request) Successful in 1m14s
Updates DEPLOYMENT.md to match the infra changes in this PR:

- §1 OCR memory: point operators at the new OCR_MEM_LIMIT env var instead
  of telling them to edit "the prod overlay".
- §2 OCR env vars: add OCR_MEM_LIMIT to the table.
- §3.1 server setup: replace the fail2ban prose with concrete `ln -sf`
  commands referencing the committed jail/filter, and document the
  single-tenant runner assumption near the runner-registration step.
- §3.4 first deploy: describe the new automated smoke test step.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -54,7 +54,7 @@ The OCR service requires significant RAM for model loading. The dev compose sets
 | Hetzner CX32 | 8 GB | 6 GB | Accept reduced batch sizes and slower throughput |
 | Hetzner CX22 | 4 GB | — | Disable the OCR service (`profiles: [ocr]`); run OCR on demand only |
 
-A CX32 cannot honour a `mem_limit: 12g` — set it to `6g` in the prod overlay or use CX42.
+A CX32 cannot honour the default `mem_limit: 12g` — set the `OCR_MEM_LIMIT=6g` env var (in `.env.production` / `.env.staging`, or as a Gitea secret consumed by the workflow) before deploying on a CX32. The prod compose interpolates this var with a 12g default.
 
 ### Dev vs production differences
 
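
The "interpolates this var with a 12g default" behaviour in the hunk above can be illustrated in plain shell, since Compose borrows the `${VAR:-default}` form from it. This is only a sketch: the function name `effective_ocr_mem_limit` is hypothetical, and the exact expression used in `docker-compose.prod.yml` is not shown in this diff, so the `:-` form is an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repo): prints the memory limit that
# would apply, ASSUMING the prod compose file uses ${OCR_MEM_LIMIT:-12g}.
effective_ocr_mem_limit() {
  # ${VAR:-default} falls back to the default when VAR is unset or empty.
  printf '%s\n' "${OCR_MEM_LIMIT:-12g}"
}
```

With `OCR_MEM_LIMIT` unset (a CX42 or larger host) the helper prints `12g`; after `export OCR_MEM_LIMIT=6g` (a CX32 host) it prints `6g`.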
@@ -131,6 +131,7 @@ All vars are set in `.env` at the repo root (copy from `.env.example`). The back
 | `ALLOWED_PDF_HOSTS` | SSRF protection — comma-separated list of allowed PDF source hosts. **Do not widen to `*`** | `minio,localhost,127.0.0.1` | YES | — |
 | `KRAKEN_MODEL_PATH` | Directory containing Kraken HTR models (populated by `download-kraken-models.sh`) | `/app/models/` | — | — |
 | `BLLA_MODEL_PATH` | Kraken baseline layout analysis model path | `/app/models/blla.mlmodel` | — | — |
+| `OCR_MEM_LIMIT` | Container memory cap for ocr-service in `docker-compose.prod.yml`. Set to `6g` on CX32 hosts; leave unset on CX42+ to use the 12g default | `12g` (prod compose default) | — | — |
 
 ---
 
@@ -152,17 +153,28 @@ apt install caddy
 ln -sf /opt/familienarchiv/infra/caddy/Caddyfile /etc/caddy/Caddyfile
 systemctl reload caddy
 
-# fail2ban — protect /api/auth/login from credential stuffing
-# Jail watches Caddy access log for 401 responses on /api/auth/login.
-# maxretry=10 findtime=10m bantime=30m
+# fail2ban — protect /api/auth/login from credential stuffing.
+# Jail watches the Caddy JSON access log for 401 responses on
+# /api/auth/login. The jail (maxretry=10 / findtime=10m / bantime=30m)
+# and filter are committed under infra/fail2ban/ — symlink them in:
 apt install fail2ban
-# Drop the jail definition under /etc/fail2ban/jail.d/familienarchiv.conf
+ln -sf /opt/familienarchiv/infra/fail2ban/jail.d/familienarchiv.conf \
+/etc/fail2ban/jail.d/familienarchiv.conf
+ln -sf /opt/familienarchiv/infra/fail2ban/filter.d/familienarchiv-auth.conf \
+/etc/fail2ban/filter.d/familienarchiv-auth.conf
 systemctl reload fail2ban
+# Verify after first deploy with:
+# fail2ban-client status familienarchiv-auth
+# fail2ban-regex /var/log/caddy/access.log familienarchiv-auth
 
 # Tailscale — used by the backup pipeline to reach heim-nas (follow-up issue)
 curl -fsSL https://tailscale.com/install.sh | sh && tailscale up
 
-# Self-hosted Gitea runner — register against the repo with a runner token
-# (see https://docs.gitea.com/usage/actions/quickstart for the register step)
+# Self-hosted Gitea runner — register against the repo with a runner token.
+# This runner is assumed single-tenant: the deploy workflows write .env.*
+# files to disk during execution (cleaned up unconditionally on completion).
+# A multi-tenant runner would need to switch to stdin-piped env files.
+# (See https://docs.gitea.com/usage/actions/quickstart for the register step.)
 ```
 
 ### 3.2 DNS records
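
The two `ln -sf` commands in the hunk above can be run again safely, because `-f` replaces an existing link. A small wrapper could additionally refuse to create a dangling link when the checkout is missing the file. The helper below is a hedged sketch, not something from the repo; `link_config` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repo): link a committed config file
# into place, failing early if the source file does not exist.
# Usage: link_config <source-in-checkout> <destination>
link_config() {
  src=$1
  dst=$2
  # Guard against a dangling symlink when the source is absent.
  [ -e "$src" ] || { echo "missing source: $src" >&2; return 1; }
  # -s creates a symbolic link, -f replaces an existing destination file.
  ln -sf "$src" "$dst"
}
```

Under these assumptions, the jail would be linked with `link_config /opt/familienarchiv/infra/fail2ban/jail.d/familienarchiv.conf /etc/fail2ban/jail.d/familienarchiv.conf`, and the filter in the same way.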
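
The runner comment above says the env files are "cleaned up unconditionally on completion". The workflows themselves are not part of this diff, so the following is only a shell analogue of that idea, using an `EXIT` trap; the actual workflows may instead use a separate always-run cleanup step. The name `with_temp_env_file` is hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch (not the workflow's actual code): run a command with a
# temporary env file that is deleted whether the command succeeds or fails.
# Usage: with_temp_env_file <file-content> <command> [args...]
# The path of the env file is appended as the last argument of the command.
with_temp_env_file() {
  content=$1
  shift
  (
    env_file=$(mktemp) || exit 1
    # The EXIT trap fires when this subshell ends, on success or failure.
    trap 'rm -f "$env_file"' EXIT
    printf '%s\n' "$content" > "$env_file"
    "$@" "$env_file"
  )
}
```

The exit status of the wrapped command is passed through, so a failing deploy still fails the calling step while the file is removed.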
@@ -198,8 +210,12 @@ git.raddatz.cloud A <server IP>
 
 ```bash
 # 1. Trigger nightly.yml manually (Repo → Actions → nightly → "Run workflow")
-# Expected: docker compose up -d --wait succeeds for archiv-staging
-# 2. Verify TLS + reverse proxy
+# Expected: docker compose up -d --wait succeeds for archiv-staging, then
+# the workflow's "Smoke test deployed environment" step asserts:
+# - https://staging.raddatz.cloud/login returns 200
+# - HSTS header is present
+# - /actuator/health returns 404 (defense-in-depth check)
+# 2. (Optional) Re-verify manually
 curl -I https://staging.raddatz.cloud/
 # Expected: 200 (login page) with HSTS + X-Content-Type-Options headers
 # 3. When staging looks healthy, push a v* tag to trigger release.yml
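
The three assertions listed for the "Smoke test deployed environment" step can be approximated by hand with `curl`. The workflow's real implementation is not shown in this diff, so the sketch below is an assumption about how such checks might look; the function names are hypothetical, and checking the HSTS header on `/login` specifically is a guess.

```shell
#!/bin/sh
# Hypothetical helpers (not the workflow's actual code) mirroring the three
# smoke-test assertions described above.

# Print only the HTTP status code for a URL (curl's %{http_code} write-out).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Read response headers on stdin; succeed if an HSTS header is present.
has_hsts() {
  grep -qi '^strict-transport-security:'
}

# Usage: smoke_test <base-url>, e.g. smoke_test https://staging.raddatz.cloud
smoke_test() {
  base=$1
  [ "$(http_status "$base/login")" = "200" ] \
    || { echo "login page did not return 200" >&2; return 1; }
  curl -sI "$base/login" | has_hsts \
    || { echo "HSTS header missing" >&2; return 1; }
  [ "$(http_status "$base/actuator/health")" = "404" ] \
    || { echo "/actuator/health is reachable" >&2; return 1; }
}
```

Keeping the header check as a separate stdin filter means it can be exercised against saved `curl -I` output without touching the network.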