Compare commits

...

93 Commits

Author SHA1 Message Date
Marcel
08c7dbcaa2 chore(renovate): require manual review for privileged CI image digest bumps
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m49s
CI / OCR Service Tests (push) Successful in 15s
CI / Backend Unit Tests (push) Successful in 4m7s
CI / fail2ban Regex (push) Successful in 38s
CI / Compose Bucket Idempotency (push) Successful in 57s
CI / Unit & Component Tests (pull_request) Failing after 2m47s
CI / OCR Service Tests (pull_request) Successful in 15s
CI / Backend Unit Tests (pull_request) Successful in 4m9s
CI / fail2ban Regex (pull_request) Successful in 37s
CI / Compose Bucket Idempotency (pull_request) Successful in 55s
Adds a packageRule matching .gitea/workflows/** digest updates with
automerge: false. Digest bumps for images running --privileged --pid=host
have root-equivalent host access and must not be auto-merged.

Addresses Nora's review concern on #537.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 23:15:05 +02:00
Marcel
a3f1260c23 docs(ci): add Troubleshooting section for Reload Caddy failures
Covers the three failure modes Sara flagged: Caddy stopped (explicit
systemctl error), symlink missing/mis-pointed (silent reload, stale
smoke test), and Docker socket / nsenter unavailable (container error).
Each failure mode includes symptoms and recovery steps.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 23:14:35 +02:00
Marcel
e75ddc318b docs(adr): ADR-012 — nsenter via privileged container for host service management in CI
Captures the architectural decision, alternatives considered (sudo
systemctl, Caddy admin API, SSH), and consequences (symlink contract,
Renovate review requirement, step duplication tracked in #539).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 23:13:50 +02:00
Marcel
60cb9d5ad0 docs(deploy): note Caddyfile symlink is a CI dependency
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m48s
CI / OCR Service Tests (push) Successful in 16s
CI / Backend Unit Tests (push) Successful in 4m4s
CI / fail2ban Regex (push) Successful in 40s
CI / Compose Bucket Idempotency (push) Successful in 58s
CI / Unit & Component Tests (pull_request) Failing after 2m49s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Compose Bucket Idempotency (pull_request) Successful in 56s
CI / Backend Unit Tests (pull_request) Successful in 4m12s
CI / fail2ban Regex (pull_request) Successful in 37s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 22:52:34 +02:00
Marcel
1ce0638ae6 docs(ci): update nsenter example to Alpine, document alternatives considered
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 22:47:41 +02:00
Marcel
f608838f7a fix(ci): pin Reload Caddy to alpine:3.21 digest, add reload-vs-restart rationale
- Switch ubuntu:22.04 (floating, ~70 MB) to alpine:3.21 pinned by sha256
  digest (~5 MB); util-linux installed at run time via apk add
- Add explicit comment explaining why `reload` not `restart`: SIGHUP
  re-reads config in-process without dropping TLS connections

Addresses Tobias + Nora blocker from PR review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 22:43:55 +02:00
Marcel
52a96f657d docs(ci): document DooD runner architecture and nsenter pattern
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m49s
CI / OCR Service Tests (push) Successful in 16s
CI / fail2ban Regex (push) Successful in 39s
CI / Compose Bucket Idempotency (push) Successful in 23s
CI / Unit & Component Tests (pull_request) Failing after 2m48s
CI / OCR Service Tests (pull_request) Successful in 15s
CI / Backend Unit Tests (push) Successful in 4m4s
CI / Backend Unit Tests (pull_request) Successful in 4m6s
CI / fail2ban Regex (pull_request) Successful in 38s
CI / Compose Bucket Idempotency (pull_request) Successful in 55s
Replace the stale generic runner provisioning docs with an accurate
description of the actual two-container setup on the Hetzner VPS.
Document the nsenter pattern for running host-level commands (systemctl)
from containerised CI steps, and the Caddyfile symlink contract that the
reload step depends on.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 22:29:39 +02:00
Marcel
f87504fb23 fix(ci): add Caddy reload step to release workflow
Same gap as nightly.yml: production deploys also need Caddy to reload
the updated Caddyfile before the smoke test validates the public surface.
Uses the same nsenter pattern introduced in the previous commit.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 22:29:02 +02:00
Marcel
99de6f1d07 fix(ci): reload Caddy via nsenter, not sudo systemctl
`sudo systemctl reload caddy` does not work from inside a DooD job
container: `systemctl` is absent from Ubuntu container images and
container processes cannot reach the host systemd without entering its
namespaces. Replace with `docker run --privileged --pid=host ubuntu:22.04
nsenter -t 1 -m -u -n -p -i -- /bin/systemctl reload caddy`, which uses
the already-mounted Docker socket to spin up a privileged sibling
container that enters the host PID namespace via nsenter. Tested live on
the Hetzner VPS. No sudoers entry required.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 22:28:24 +02:00
Marcel
432ae2ac83 ci(nightly): reload Caddy before smoke test
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m50s
CI / OCR Service Tests (push) Successful in 17s
CI / Backend Unit Tests (push) Successful in 4m10s
CI / fail2ban Regex (push) Successful in 38s
CI / Unit & Component Tests (pull_request) Has been cancelled
CI / OCR Service Tests (pull_request) Has been cancelled
CI / Backend Unit Tests (pull_request) Has been cancelled
CI / fail2ban Regex (pull_request) Has been cancelled
CI / Compose Bucket Idempotency (pull_request) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
Adds a `sudo systemctl reload caddy` step between the docker compose
deploy and the smoke test. This ensures any committed Caddyfile changes
are applied before the public surface is verified.

Previously the workflow had no mechanism to push Caddyfile changes to
the running host daemon. A Caddyfile edit would land in the repo but
Caddy would keep serving the previous config, causing the smoke test to
catch a stale header or still-proxied /actuator route rather than the
intended current config.

This step also surfaces the root cause of today's port-443 failure
explicitly: if Caddy is not running, the step fails with a clear service
error rather than a misleading "Failed to connect to port 443" from curl.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 21:49:32 +02:00
Marcel
5f3529439a fix(infra): frontend healthcheck on 127.0.0.1, not localhost
Some checks failed
CI / Unit & Component Tests (pull_request) Failing after 2m53s
CI / OCR Service Tests (pull_request) Successful in 17s
CI / Backend Unit Tests (pull_request) Successful in 4m33s
CI / fail2ban Regex (pull_request) Successful in 40s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m0s
CI / Unit & Component Tests (push) Failing after 2m52s
CI / OCR Service Tests (push) Successful in 18s
CI / Backend Unit Tests (push) Successful in 4m23s
CI / fail2ban Regex (push) Successful in 39s
CI / Compose Bucket Idempotency (push) Successful in 1m0s
The new alpine-based frontend production image (`node:20.19.0-alpine3.21`)
resolves `localhost` only to `::1` in /etc/hosts. SvelteKit's adapter-node
binds to 0.0.0.0 (IPv4 only), so `wget http://localhost:3000/login` from
inside the container connects to ::1 and gets "Connection refused" every
15s. Container goes unhealthy → `docker compose up --wait` fails → nightly
staging deploy fails. The app itself is fine.

Switching to 127.0.0.1 bypasses /etc/hosts and matches what Node actually
listens on.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 18:49:32 +02:00
Marcel
48c8bb8a5f fixup: address Nora's review on #520 (security blockers)
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m48s
CI / OCR Service Tests (push) Successful in 17s
CI / Backend Unit Tests (push) Successful in 4m10s
CI / fail2ban Regex (push) Successful in 38s
CI / Compose Bucket Idempotency (push) Successful in 56s
- frontend/login: derive cookie `secure` flag from request URL protocol.
  Pre-PR the cookie was only read by SSR so the flag didn't matter; now
  the cookie IS the API credential and must be Secure on HTTPS or it
  leaks a 24h Basic token on plaintext networks. Dev runs over HTTP and
  would silently lose the cookie if we hardcoded `secure: true`, so the
  flag follows `event.url.protocol === 'https:'`.

- SecurityConfig: rewrite the CSRF-disabled comment. The old
  "browsers block cross-origin custom headers" justification no longer
  holds once /api/* is authenticated via the cookie. Make the
  load-bearing dependencies explicit: SameSite=strict on the auth_token
  cookie + Spring's default CORS rejection.

- AuthTokenCookieFilter:
  - Scope to /api/* only. /actuator/health and similar must not be
    cookie-authenticated.
  - Refuse malformed percent-encoding (URLDecoder throws); forward the
    request without a promoted Authorization rather than crash.
  - Use isBlank() instead of isEmpty() per Nora.
  - Javadoc warning: getHeaderNames/getHeaders exposes the Basic
    credential; any future header-iterating logger must scrub
    Authorization before logging.

- Tests: add `passes_through_unchanged_when_request_is_outside_api_scope`
  (/actuator/health with cookie should NOT be wrapped) and
  `passes_through_unchanged_when_cookie_value_is_malformed_percent_encoding`.
  Tighten the explicit-header test to verify same-instance forwarding
  rather than just header equality.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 18:20:10 +02:00
Marcel
023810df1e fix(security): promote auth_token cookie to Authorization header for browser /api/* calls
Closes #520.

The login action stores `Basic <base64>` in an HttpOnly `auth_token`
cookie. SSR fetches from hooks.server.ts explicitly set the
Authorization header. Vite's dev proxy does the same on every
/api/* request. Caddy in production does NOT. So browser-side
fetch() and EventSource() calls reach the backend without auth,
get 401 + WWW-Authenticate: Basic, and the browser pops a native
auth dialog over the SPA.

Add AuthTokenCookieFilter (Ordered.HIGHEST_PRECEDENCE, before any
Spring Security filter) that promotes the cookie to a request
header when no explicit Authorization is present. URL-decodes the
cookie value because SvelteKit URL-encodes spaces ("Basic " ->
"Basic%20") when serializing the cookie. Works the same for REST,
SSE (/api/notifications/stream, /api/ocr/jobs/.../progress), and
any other browser-direct backend call.

5 tests in AuthTokenCookieFilterTest cover: URL-decoded promotion,
explicit-Authorization-wins precedence, no-cookies pass-through,
absent-auth-token pass-through, empty-value pass-through.

Also: add `@ActiveProfiles("test")` to ThumbnailServiceIntegrationTest,
the one remaining @SpringBootTest in the suite that wasn't annotated.
After #516 made UserDataInitializer fail-closed outside dev/test/e2e,
this test's context load was throwing. Restores green main.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 18:20:10 +02:00
Marcel
ad3b571bba fix(user): findOrCreate Administrators group instead of blind-INSERT (#518)
Some checks failed
CI / Backend Unit Tests (pull_request) Failing after 4m12s
CI / fail2ban Regex (pull_request) Successful in 39s
CI / Unit & Component Tests (pull_request) Failing after 2m50s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Compose Bucket Idempotency (pull_request) Successful in 58s
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / fail2ban Regex (push) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
Closes #518.

UserDataInitializer.initAdminUser was doing groupRepository.save(adminGroup)
unconditionally. If a previous boot had seeded the group but failed
before creating the admin user (or if the operator deleted just the
admin row to retry with a corrected APP_ADMIN_USERNAME), the next
seed attempt violated user_groups_name_key and aborted the context.

Switch to the same findByName(...).orElseGet(...) pattern initE2EData
already uses for the "Leser" group.

Tests in AdminSeedFailClosedTest:
- reuses_existing_Administrators_group_when_seeding_a_new_admin
- creates_Administrators_group_when_seeding_admin_on_a_fresh_database
Plus updated existing tests to stub groupRepository.save now that the
seed path also exercises it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 17:29:11 +02:00
Marcel
9686e304c2 fix(caddy): wrap actuator block in handle so it takes precedence over catch-all
Some checks failed
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / fail2ban Regex (push) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
Closes #512.

The previous `(block_actuator)` snippet emitted `respond @actuator 404`
at the top level of each archive vhost. But each vhost also has a
catch-all `handle { reverse_proxy ... }` that matches /actuator/*
too. Caddy's `handle` blocks are mutually exclusive — once one matches,
the request never reaches a top-level `respond`. So /actuator/health
was being proxied to the backend, which 302s to /login.

Wrap the actuator response in its own `handle /actuator/*` block.
Caddy sorts `handle` blocks by path specificity, so /actuator/* wins
over the catch-all and the 404 is actually returned.

Verified with `caddy validate` against the caddy:2 image.

Also unblocks the nightly.yml smoke test's `/actuator/health → 404`
assertion, which has been failing since the first staging deploy.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 17:15:03 +02:00
Marcel
ea0b3050e4 fix(user): fail-closed when admin seed would use dev defaults outside dev/test/e2e
Some checks failed
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / fail2ban Regex (push) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
Addresses Nora's review concern on #513/#516.

The previous fix only made env-vars take effect — it did NOT close the
fail-open default path. If an operator forgets APP_ADMIN_USERNAME /
APP_ADMIN_PASSWORD on first prod boot, the seeded admin is the
well-known `admin@familienarchiv.local` / `admin123` and is permanently
locked (UserDataInitializer only seeds when the row is missing).

Refuse to seed outside dev/test/e2e profiles when either credential
matches the documented default. The startup fails fast with a clear
message pointing at the env-var names and the permanence trap.

Also adds Markus/Felix/Sara's "pin the Java side" coverage: a
reflection test on the @Value placeholder catches a future rename
of `${app.admin.email:...}` back to `${app.admin.username:...}`,
which would otherwise pass the yaml-side test but silently break
the binding.

Tests:
- AdminSeedFailClosedTest pins fail-closed for non-local profiles
  and verifies the dev/test/e2e bypass.
- AdminSeedPropertyKeyTest now also asserts the @Value placeholder
  string on UserDataInitializer.adminEmail/adminPassword.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 17:12:36 +02:00
Marcel
21343cdf23 fix(user): rename yaml key username→email so admin seed reads APP_ADMIN_USERNAME
Closes #513.

UserDataInitializer reads `@Value("${app.admin.email:...}")` but
application.yaml mapped APP_ADMIN_USERNAME to `app.admin.username`.
The keys never connected — env vars APP_ADMIN_USERNAME and
APP_ADMIN_PASSWORD were silently ignored and the admin user got
seeded with the hardcoded defaults admin@familyarchive.local /
admin123.

For production this is HIGH severity: DEPLOYMENT.md §3.5 documents
the admin password as permanently locked on first deploy. The
bug locked the lock-in to dev defaults, not to whatever an operator
set in PROD_APP_ADMIN_PASSWORD.

Rename yaml key from `username:` to `email:` so the Spring property
`app.admin.email` actually exists. Keep env-var name
APP_ADMIN_USERNAME (matches the already-set Gitea secrets and
DEPLOYMENT.md §3.3). Default value updated to an email-shape.

Added AdminSeedPropertyKeyTest (Binder pattern, no Spring context):
verifies both `app.admin.email` and `app.admin.password` resolve
from the yaml. Confirmed red without the fix, green with it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 17:12:36 +02:00
Marcel
6ba7254344 test(ci): assert prerender output is only /hilfe/transkription
Some checks failed
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / fail2ban Regex (push) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
CI / Unit & Component Tests (pull_request) Has been cancelled
CI / OCR Service Tests (pull_request) Has been cancelled
CI / Backend Unit Tests (pull_request) Has been cancelled
CI / fail2ban Regex (pull_request) Has been cancelled
CI / Compose Bucket Idempotency (pull_request) Has been cancelled
Addresses Sara's review request on #515.

Without this gate, a future regression that turns prerender.crawl
back on (or adds a new prerender entry whose nav links into
protected routes) would silently bake /, /documents, /persons etc.
to "redirect-to-login" HTML and re-introduce #514.

Verified the script catches the current broken build state:
  $ find build/prerendered ... -not -path 'hilfe/*' ...
  build/prerendered/{index,documents,persons,geschichten,stammbaum}.html

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 17:00:54 +02:00
Marcel
b2955fb695 fix(frontend): disable prerender crawl so /, /documents, /persons aren't baked
Closes #514.

The build was prerendering protected routes via crawl from
/hilfe/transkription. Their load functions throw redirect('/login')
during the build (no auth cookie), so SvelteKit captured the redirect
as static HTML and shipped /app/build/prerendered/{index,documents,
persons,geschichten,stammbaum}.html with a `location.href=/login`
script. In production these files are served BEFORE hooks.server.ts
runs, so an authenticated user with a valid cookie is still served
the baked bounce-back page.

Setting `crawl: false` keeps the explicit /hilfe/transkription entry
prerendered (needed for the public help page) without dragging the
nav targets along with it.

Verified locally: build now emits only `hilfe/transkription.html`
under build/prerendered/, no index.html or documents.html etc.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 17:00:10 +02:00
5d2888e038 Merge pull request 'fix(compose): mark create-buckets as one-shot for up --wait (#510)' (#511) from fix/issue-510-compose-wait-oneshot-create-buckets into main
Some checks failed
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / fail2ban Regex (push) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
2026-05-11 16:59:59 +02:00
Marcel
3668555421 fix(compose): mark create-buckets as one-shot for up --wait
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m47s
CI / OCR Service Tests (push) Successful in 17s
CI / Backend Unit Tests (push) Successful in 4m12s
CI / fail2ban Regex (push) Successful in 37s
CI / Compose Bucket Idempotency (push) Successful in 56s
CI / Unit & Component Tests (pull_request) Failing after 2m49s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Successful in 4m13s
CI / fail2ban Regex (pull_request) Successful in 38s
CI / Compose Bucket Idempotency (pull_request) Successful in 58s
Closes #510.

`docker compose up -d --wait` exits 1 even when every service is
healthy because the one-shot `create-buckets` exits 0 and --wait
expects "running". The whole stack came up fine on staging, but the
workflow gate failed before the smoke step could run.

Two changes:

1. create-buckets: `restart: "no"` declares one-shot intent.
2. backend.depends_on: add `create-buckets: service_completed_successfully`.

With both, compose v2.20+ understands create-buckets is a one-shot
that must complete successfully, and --wait treats exited(0) as the
target state. Backend startup now also correctly gates on bucket
bootstrap (closes a latent race where backend could start before
the archiv-app policy was bound).

Verified `docker compose config --quiet` parses and the resolved
config shows the right dependency graph.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 16:33:04 +02:00
Marcel
54a8f7f8e9 fix(workflows): match runner label — runs-on ubuntu-latest, not self-hosted
Some checks failed
CI / Unit & Component Tests (pull_request) Has been cancelled
CI / OCR Service Tests (pull_request) Has been cancelled
CI / Backend Unit Tests (pull_request) Has been cancelled
CI / fail2ban Regex (pull_request) Has been cancelled
CI / Compose Bucket Idempotency (pull_request) Has been cancelled
CI / Unit & Component Tests (push) Failing after 2m49s
CI / Backend Unit Tests (push) Has been cancelled
CI / fail2ban Regex (push) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
Closes #508.

Our gitea-runner advertises labels ubuntu-latest / ubuntu-24.04 /
ubuntu-22.04. `runs-on: self-hosted` never matches → dispatched
deploy jobs sit in the queue forever. The runner is still
genuinely self-hosted (DooD socket, joined to gitea_gitea net,
single-tenant per ADR-011) — the `self-hosted` token was just an
unconfirmed assumption about the label name.

Unblocks #497 / #499 first deploy.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 16:15:53 +02:00
Marcel
f8f0951bd5 fix(minio): bake bootstrap.sh into image instead of bind-mounting
Some checks failed
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / fail2ban Regex (push) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
CI / Unit & Component Tests (pull_request) Failing after 2m50s
CI / OCR Service Tests (pull_request) Successful in 17s
CI / Backend Unit Tests (pull_request) Successful in 4m9s
CI / fail2ban Regex (pull_request) Failing after 12s
CI / Compose Bucket Idempotency (pull_request) Successful in 57s
Closes #506.

Under Docker-out-of-Docker (the production Gitea Actions runner), the
host daemon resolves the relative bind-mount path against the host
filesystem — not the runner container's /workspace. The script is not
there, so Docker creates an empty directory at /bootstrap.sh and the
entrypoint fails with `/bootstrap.sh: Is a directory`.

Bake the script into a tiny derived image (infra/minio/Dockerfile) so
there is no runtime path resolution. Works in DooD, regular Docker,
and CI.

Unblocks the staging / production deploy pipelines from #497 / #499
and turns the Compose Bucket Idempotency CI job green.

Verified locally:
- `docker compose ... config --quiet` parses
- `docker compose ... build create-buckets` builds the image
- bootstrap.sh exists as a +x file at /bootstrap.sh inside the image

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 15:32:36 +02:00
c3c1efe5f1 Merge pull request 'fix(fail2ban): pin polling backend so jail actually reads Caddy access log (#503)' (#504) from fix/issue-503-fail2ban-polling-backend into main
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m47s
CI / OCR Service Tests (push) Successful in 16s
CI / Backend Unit Tests (push) Successful in 4m12s
CI / fail2ban Regex (push) Successful in 39s
CI / Compose Bucket Idempotency (push) Failing after 50s
2026-05-11 15:08:58 +02:00
Marcel
e5363913ec fix(fail2ban): pin polling backend so jail actually reads Caddy access log
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m49s
CI / OCR Service Tests (push) Successful in 16s
CI / Backend Unit Tests (push) Successful in 4m8s
CI / fail2ban Regex (push) Successful in 37s
CI / Compose Bucket Idempotency (push) Failing after 53s
CI / Unit & Component Tests (pull_request) Failing after 2m46s
CI / OCR Service Tests (pull_request) Successful in 15s
CI / Backend Unit Tests (pull_request) Successful in 4m14s
CI / fail2ban Regex (pull_request) Successful in 37s
CI / Compose Bucket Idempotency (pull_request) Failing after 50s
Closes #503.

Debian's fail2ban package ships defaults-debian.conf with
`[DEFAULT] backend = systemd`. Without an explicit override, our
familienarchiv-auth jail inherits the systemd backend at runtime,
reads from journald, and never inspects /var/log/caddy/access.log.
A live login brute-force would not be banned.

Add `backend = polling` to the jail and a CI step that links the jail
into /etc/fail2ban/ and asserts `fail2ban-client -d` resolves it to
the polling backend, not the inherited systemd backend.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 14:59:40 +02:00
Marcel
4d4d5793bb docs(glossary): add archiv-app service account entry
Some checks failed
CI / Unit & Component Tests (pull_request) Failing after 2m48s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Successful in 4m5s
CI / fail2ban Regex (pull_request) Successful in 37s
CI / Compose Bucket Idempotency (pull_request) Failing after 50s
CI / Unit & Component Tests (push) Failing after 2m46s
CI / OCR Service Tests (push) Successful in 15s
CI / Backend Unit Tests (push) Successful in 4m4s
CI / fail2ban Regex (push) Successful in 37s
CI / Compose Bucket Idempotency (push) Failing after 50s
`archiv-app` is the bucket-scoped MinIO service account introduced
in PR #499 alongside the production deploy pipeline. Until now the
term only appeared in `infra/minio/bootstrap.sh` and the prod compose
file; a reader encountering `S3_ACCESS_KEY: archiv-app` had no
single-page reference distinguishing it from the MinIO root account.

Adds a new "Infrastructure Terms" section to docs/GLOSSARY.md so the
distinction (root account vs. application service account) and the
attached `archiv-app-policy` scope live in the canonical glossary
location. Cross-links to ADR-010 for the MinIO-stays-self-hosted
rationale. Addresses @elicit's round-2 recommendation on PR #499.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 14:11:46 +02:00
Marcel
9adde3cd89 refactor(compose): rename docker network archive-net to archiv-net
The docker network was the only `archive-*` identifier in either
compose file; everything else (user, db, bucket, service account,
project name) uses the `archiv-*` spelling. Reviewers' eyes stuttered
on it on the prod compose review (round 2 of PR #499 — Markus and
Tobi). Renamed in both prod and dev compose for consistency and
updated the single doc reference to the dev-project-prefixed
network name.

Operational note: applying this change to a running stack will
recreate the network on the next `docker compose up`; containers
restart, named volumes are unaffected.

`docker compose config --quiet` passes for both compose files and
for the staging profile. Sweep confirms zero `archive-net`
references remain in the tree.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 14:10:39 +02:00
Marcel
440a191138 infra(workflows): annotate env-file cleanup as load-bearing
The `if: always()` conditional on the env-file cleanup step in both
deploy workflows is what makes the ADR-011 single-tenant runner trust
model safe: secrets land on disk before each deploy and are wiped
unconditionally afterwards. A future workflow refactor that drops
`if: always()` would silently leave plaintext secrets on the runner
on any failed deploy.

The ADR documents this; the workflow file did not. Adds a prominent
inline comment so the next reader of the YAML sees the constraint
without having to cross-reference ADR-011. No behaviour change — both
workflows still parse. Addresses @nora's round-2 suggestion on PR
#499 — "linchpin of the ADR-011 trust model".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 14:09:12 +02:00
Marcel
1873f50f7f infra(mailpit): use nc -z healthcheck instead of wget
The mailpit service healthcheck previously assumed `wget` ships in
the axllent/mailpit image. That's true for v1.29.7 but is not part
of the image's contract — a future Alpine slim-down could drop wget
and silently disable the healthcheck. Switched to BusyBox `nc -z
localhost 8025`, which is a TCP-port open check with no dependency
beyond BusyBox itself.

Verified inside axllent/mailpit:v1.29.7 that `nc` is present
(/usr/bin/nc, BusyBox v1.37.0) and that the proposed command
returns 0 against an open port and non-zero against a closed one.
Compose still parses with `--profile staging`. Addresses @tobi's
round-2 suggestion on PR #499.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 14:08:23 +02:00
Marcel
a4f2047bcc security(ocr): pin ALLOWED_PDF_HOSTS=minio in prod ocr-service env
Production never sources PDFs from localhost or 127.0.0.1 — the OCR
service only reads from MinIO over the internal docker network. The
Python default (`minio,localhost,127.0.0.1`) was permissive on
purpose for local dev, but in production a future change to that
default — or a host-env override — would silently broaden the SSRF
surface. Pinning the env var explicitly here freezes the allowlist
to the one hostname production actually needs.

`docker compose config --quiet` and `--profile staging config
--quiet` both still pass. Verified the resolved config emits
`ALLOWED_PDF_HOSTS: minio`. Addresses @nora's round-2 suggestion on
PR #499 — "five characters of YAML, lifetime guarantee".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 14:07:16 +02:00
Marcel
09680557ef security(caddy): add Permissions-Policy header
Adds `Permissions-Policy: camera=(), microphone=(), geolocation=()` to
the shared (security_headers) snippet, so both archiv vhosts and the
git vhost deny browser APIs the app does not use. Reduces blast radius
of an XSS landing in a privileged origin.

The deploy smoke steps in nightly.yml and release.yml gain a matching
assertion against the canonical header value, so a future Caddyfile
edit that drops or loosens the header (e.g. `camera=(self)`) fails the
deploy instead of regressing silently.

`caddy validate` against caddy:2 passes; both workflow YAMLs parse.
Addresses @nora's round-2 suggestion on PR #499 — "lower-impact than
CSP but nearly free".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 14:06:13 +02:00
Marcel
8fcf653cb0 ci(smoke): pin HSTS to preload-list-eligible value
Replaces the presence-only `grep -qi strict-transport-security` smoke
assertion in both nightly.yml and release.yml with a value-pinning
regex that requires `max-age=31536000`, `includeSubDomains`, and
`preload`. A future Caddyfile edit that drops any of those three
parts now fails the deploy smoke step instead of passing silently.

Verified locally that the new pattern matches the preload-eligible
value and rejects three degraded forms (short max-age, missing
includeSubDomains, missing preload). Addresses @sara's round-2 note
on PR #499 — "presence check, not value check".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 14:05:02 +02:00
Marcel
a7a80f8c16 docs(deployment): route SSE through Caddy in topology mermaid
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m48s
CI / OCR Service Tests (push) Successful in 16s
CI / Unit & Component Tests (pull_request) Failing after 2m48s
CI / Backend Unit Tests (pull_request) Successful in 4m8s
CI / fail2ban Regex (pull_request) Successful in 37s
CI / Compose Bucket Idempotency (pull_request) Failing after 49s
CI / Backend Unit Tests (push) Successful in 4m7s
CI / fail2ban Regex (push) Successful in 36s
CI / Compose Bucket Idempotency (push) Failing after 1m15s
CI / OCR Service Tests (pull_request) Successful in 16s
The top-level deployment diagram lagged the C4 L2 diagram, which
correctly notes that SSE notifications are fronted by Caddy. The
mermaid showed Browser → Backend direct, which would only be true
if the backend port were exposed publicly (it is not — all docker
ports bind to 127.0.0.1).

Fixes the inconsistency Markus flagged on PR #499: the public
surface is Caddy and Caddy only.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:18:11 +02:00
Marcel
03d478840b docs(arch): show Caddy + X-Forwarded-Proto in auth-flow diagram
Adds the Caddy hop to seq-auth-flow.puml and surfaces the two
production-relevant header behaviours:

  - Caddy terminates TLS and forwards X-Forwarded-Proto: https
  - Spring Boot trusts this header (server.forward-headers-strategy:
    native, ForwardedRequestCustomizer at the Jetty layer), so
    request.getScheme() returns "https"
  - The Set-Cookie response carries the Secure flag because the
    observed scheme is https — without forward-headers-strategy this
    would silently drop to plain http and the cookie would lose Secure

Closes the doc-currency gap flagged in the Markus review on PR #499:
"Auth flow change → docs/architecture/c4/seq-auth-flow.puml".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:17:12 +02:00
Marcel
6a6a1c4353 docs(adr): ADR-011 single-tenant Gitea runner with on-disk env-files
Records the operational assumption that nightly.yml and release.yml
bake in: the self-hosted runner is single-tenant, so writing secrets
to .env.staging / .env.production on disk and removing them via an
`if: always()` cleanup step is acceptable for v1.

Documents the three migration triggers (second repo on the runner,
untrusted PR execution, move to shared infrastructure) and the
one-step migration path (--env-file <(printf '%s' "$SECRET_BLOB"))
so the next operator does not silently break the trust assumption.

The in-comment notes at the top of both workflow files already point
at this ADR's content; this commit records the decision in the durable
location the doc-currency table demands.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:16:20 +02:00
Marcel
b57afb9ad2 docs(adr): ADR-010 MinIO stays self-hosted, Hetzner OBS deferred
Records the reversal of the earlier "migrate to Hetzner Object Storage"
direction in docs/infrastructure/production-compose.md. Documents the
cost/benefit (current 13 GB fits trivially on the VPS; OBS billing is
dominated by base fee at this size; migration is a three-env-var swap
plus `mc mirror`, no application rewrite cost).

Captures the four triggers that should re-open the decision (50 GB
threshold, healthcheck latency, VPS upgrade cost, backup runtime) so
the deferral does not become an indefinite punt.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:15:38 +02:00
Marcel
59bc81d353 docs(adr): ADR-009 standalone docker-compose.prod.yml, not overlay
Records the decision to make docker-compose.prod.yml a fully self-contained
file rather than an overlay over docker-compose.yml. Captures the cost
(env-var duplication across dev and prod files) and the benefit (single
file the reviewer can hold in their head, no Compose merge-rule
surprises, automatic project-name namespacing for cohabiting staging +
production on one host).

Surfaces the retirement of the earlier overlay narrative in
docs/infrastructure/production-compose.md so a future maintainer does
not reverse the choice out of ignorance.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:14:58 +02:00
Marcel
33300e4ad9 chore(infra): drop aspirational Renovate comments from compose
The repo's renovate.json only configures TipTap grouping; Renovate is
not currently active against MinIO / mc / mailpit / Postgres / Node /
Caddy. The "Renovate keeps it current" comments were aspirational —
those tags will rot until Renovate is bootstrapped (tracked in a
follow-up issue).

The "Pinned mc release; Renovate keeps it current" comment is gone
already since the create-buckets entrypoint was extracted to a script
in the preceding MinIO-policy commit.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:12:55 +02:00
Marcel
fe1451f570 ci(smoke): pin curl to 127.0.0.1 via --resolve
The smoke step previously curled the public hostname unconditionally,
which routes the runner's request via DNS → router → back into the same
host. Many SOHO routers do not implement hairpin NAT (or do so only after
a firmware update), so the deploy may pass on day one and silently fail
on day 90.

--resolve "<host>:443:127.0.0.1" pins the hostname to the runner's
loopback while keeping SNI on the public name (so the cert validates
correctly and the Caddy vhost block matches). The smoke test now
verifies that the Caddy-on-the-same-host is serving the right
hostname end-to-end, with no router dependency.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:12:05 +02:00
Marcel
f2ec81547b ci(deploy): add --pull to docker compose build for CVE pickup
Without --pull, the host's Docker layer cache wins: if a CVE drops in
node:20.19.0-alpine3.21 / postgres:16-alpine and the vendor re-publishes
the same tag, the runner keeps serving the cached layer until the cache
is manually cleared — a silent supply-chain blind spot.

Adding --pull to both `compose build` invocations costs a single
re-pull per run and lifts the base-image patch lag from "next host
prune" to "next nightly".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:10:59 +02:00
Marcel
7e430998b8 security(fail2ban): widen jail to /forgot-password and rate-limit 429
The filter only watched /api/auth/login 401 — leaving the forgot-password
endpoint open to:

  - email enumeration (slow brute-force probing which addresses exist)
  - password-reset brute-force against accounts whose addresses leak

Widens the failregex to /api/auth/(login|forgot-password) and adds 429 to
the status alternation so a future in-app rate-limiter response is also
caught by the jail (defense in depth).

CI assertions extended to cover both new dimensions plus a negative case
on an unrelated 401 endpoint (/api/documents) — pins that the widening
did not over-match.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:10:08 +02:00
Marcel
156afa14a2 test(ci): add compose bucket-bootstrap idempotency job
The create-buckets service in docker-compose.prod.yml runs on every
`docker compose up` (one-shot, restart=no). A re-deploy that fails
because the user/bucket/policy already exists would block the whole
nightly/release pipeline — and the only way to find out today is to
run a second deploy.

This job runs the bootstrap twice against a throwaway minio stack and
asserts both invocations exit 0. Caught at PR time, not at the third
nightly deploy at 02:00.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:08:51 +02:00
Marcel
91f70e652d security(minio): scope archiv-app to bucket-only IAM policy
Replaces MinIO's built-in `readwrite` policy (which grants s3:* on
arn:aws:s3:::* — every bucket present and future) with a bucket-scoped
custom policy `archiv-app-policy`:

  - s3:GetObject / s3:PutObject / s3:DeleteObject on familienarchiv/*
  - s3:ListBucket / s3:GetBucketLocation on familienarchiv

The previous configuration silently regressed the least-privilege guarantee
that the service-account separation was supposed to provide: a future
second bucket (logs, backups, mc-mirror staging) would have been
read/write/delete-accessible to a compromised backend.

While at it, two follow-on fixes:

  1. Extract the entrypoint to infra/minio/bootstrap.sh. The previous
     inline `/bin/sh -c "..."` was already at the YAML-escaping ceiling;
     adding the policy-JSON heredoc would have made it unreadable.

  2. Replace the `| grep -q readwrite || exit 1` fatal-check with a
     POSIX `case` substring match. The minio/mc image ships coreutils +
     bash but NOT grep/awk/sed — the original check was a no-op that
     ALWAYS exited 1 (verified locally). The new check passes on the
     first invocation and on every subsequent re-deploy.

Idempotency verified locally: two consecutive `docker compose run --rm
create-buckets` invocations both exit 0 with the user bound to the
new policy.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:07:56 +02:00
Marcel
9652894aa4 test(ci): add fail2ban-regex regression job
Caddy 2.x emits JSON access logs; the failregex in
infra/fail2ban/filter.d/familienarchiv-auth.conf depends on the
"remote_ip" → "uri" → "status" key order being stable. A future Caddy
upgrade that reorders fields would break the jail silently (regex no
longer matches → fail2ban returns 0 hits → host stops banning
brute-force, discovered only at the next incident).

This job pins the contract: a sample /api/auth/login 401 line must
match (1 hit) and a /api/auth/login 200 line must not (0 hits).
Catches a regression at PR time instead of in production.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:03:04 +02:00
Marcel
e5d953dee8 test(config): rewrite ForwardHeadersConfigurationTest as context-less binder test
Drops @SpringBootTest + PostgresContainerConfig + @MockitoBean S3Client in
favour of Spring's Binder API against application.yaml. The new test binds
the property into the typed ServerProperties.ForwardHeadersStrategy enum,
so typos (`nativ`, `Native`, `framework `) and future enum renames fail
the build with BindException — addresses the silent-coercion concern that
the YAML-string assertion missed.

Verified the test goes red on a typo (BindException: Failed to convert
"nativ" → ForwardHeadersStrategy) and green on `native`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:01:06 +02:00
Marcel
ba5bd9cb11 docs(deployment): document fail2ban symlink, OCR_MEM_LIMIT, smoke test
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m53s
CI / OCR Service Tests (push) Failing after 1m58s
CI / Backend Unit Tests (push) Failing after 1m23s
CI / Unit & Component Tests (pull_request) Failing after 3m59s
CI / Backend Unit Tests (pull_request) Successful in 5m39s
CI / OCR Service Tests (pull_request) Successful in 1m14s
Updates DEPLOYMENT.md to match the infra changes in this PR:

§1 OCR memory — point operators at the new OCR_MEM_LIMIT env var instead
                of telling them to edit "the prod overlay".
§2 OCR env vars — add OCR_MEM_LIMIT to the table.
§3.1 server setup — replace fail2ban prose with concrete `ln -sf`
                    commands referencing the committed jail/filter.
                    Document the single-tenant runner assumption near
                    the runner-registration step.
§3.4 first deploy — describe the new automated smoke test step.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 12:07:59 +02:00
Marcel
83565c6bb5 docs(ci): document workflow operational assumptions
The two deploy workflows make two non-obvious assumptions that future
maintainers should not have to rediscover by reading the diff:

  1. Single-tenant self-hosted runner — the .env.* file lands on disk
     during the deploy and is cleaned up unconditionally. Multi-tenant
     usage would require switching to stdin-piped env input.

  2. Host docker layer cache is authoritative — there is no
     actions/cache directive; a host-level `docker system prune` will
     cold-start the next build.

Both notes added as block comments at the top of each workflow.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 12:06:48 +02:00
Marcel
a91a3e1f61 feat(ci): smoke test production deploy after up --wait
Mirrors the nightly.yml smoke step against archiv.raddatz.cloud. Catches
the same three failure modes (Caddy not reloaded, DNS missing, HSTS
dropped, /actuator block bypassed) on the prod path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 12:05:41 +02:00
Marcel
c523721ce8 feat(ci): smoke test staging deploy after up --wait
Healthchecks prove containers are healthy on the docker network; they
do not prove the public URL is reachable, HSTS still fires, or
/actuator is still blocked at the edge. Add a post-deploy smoke step
to nightly.yml that:

  1. GETs https://staging.raddatz.cloud/login (frontend reachable)
  2. asserts the response includes the Strict-Transport-Security header
  3. asserts /actuator/health returns 404 (defense-in-depth verified)

Failure aborts the workflow before the env-file cleanup step. The
cleanup step still runs because it is `if: always()`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 12:05:00 +02:00
Marcel
ad69d7cb83 feat(infra): commit fail2ban jail for /api/auth/login
Adds two files mirroring the on-host install layout:

  infra/fail2ban/filter.d/familienarchiv-auth.conf
  infra/fail2ban/jail.d/familienarchiv.conf

Filter parses the JSON access log emitted by Caddy (previous commit) and
matches 401 responses on /api/auth/login. Jail bans the offending IP for
30 min after 10 attempts in a 10-minute window.

Verified the failregex against four sample log lines via fail2ban-regex
in an alpine container:
  - 2 brute-force 401 attempts        → matched (ban)
  - 1 successful login (POST /api/auth/login 200) → not matched
  - 1 unrelated GET /login 200        → not matched
Date template "ts":{EPOCH} parses Caddy's Unix-epoch ts field.

The previous review iteration described this jail in DEPLOYMENT.md prose
only; committing it makes the security posture reproducible from a
fresh server build.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 12:04:06 +02:00
Marcel
8d27c82e6d feat(infra): write Caddy JSON access logs for fail2ban
Adds an (access_log) snippet writing JSON-formatted access logs to
/var/log/caddy/access.log with 10mb rolling and 14-file retention. Both
archive vhosts (archiv.raddatz.cloud and staging.raddatz.cloud) import
it; the git vhost is intentionally excluded.

This is the prerequisite for the fail2ban jail committed in the next
commit — fail2ban tails this file looking for 401 responses on
/api/auth/login to defend against credential stuffing.

Validated with `caddy validate` against caddy:2.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 12:02:28 +02:00
Marcel
4eb5eba347 feat(infra): parameterize OCR mem_limit via OCR_MEM_LIMIT
Hardcoded `mem_limit: 12g` only works on CX42+ (16 GB) hosts; a CX32 (8
GB) cannot honour it. Make both mem_limit and memswap_limit driven by
the OCR_MEM_LIMIT env var, defaulting to 12g so prod deploys on a CX42
keep current behaviour. Operators on smaller hosts override to 6g.

Verified compose interpolation produces 12 GiB by default and 6 GiB when
OCR_MEM_LIMIT=6g.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 12:01:23 +02:00
Marcel
47c5f77c81 fix(infra): fail loud when archiv-app is missing the readwrite policy
The previous `mc admin policy attach … || true` swallowed every failure
mode: a renamed policy, an mc CLI signature change, or a transient MinIO
error would leave the bootstrap container exiting zero with the service
account possessing no permissions, and the backend would then fail every
S3 call after a "successful" deploy.

Replace the silent fallback with verify-after: keep the attach (idempotent
in current mc, redundant in older versions), then assert via `mc admin
user info` that `readwrite` ends up on archiv-app. A genuine attach
failure now exits 1 and blocks the stack from starting.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 12:00:34 +02:00
Marcel
a36f25cfc3 fix(infra): pin minio/mc client tag
Removes the implicit `:latest` from the create-buckets bootstrap
container. Pins to RELEASE.2025-08-13T08-35-41Z so a breaking change in
mc CLI syntax cannot silently brick deploys.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 11:59:18 +02:00
Marcel
c9ac83b2ba fix(infra): pin axllent/mailpit tag
Removes `:latest` from the mailpit service; pins to v1.29.7 so staging
deploys are reproducible. Renovate keeps the tag current.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 11:58:34 +02:00
Marcel
e4df17f308 docs: retire overlay narrative; add Caddy to C4 L2 diagram
Some checks failed
CI / Unit & Component Tests (push) Failing after 7m31s
CI / OCR Service Tests (push) Successful in 49s
CI / Backend Unit Tests (push) Failing after 3m30s
CI / Unit & Component Tests (pull_request) Failing after 6m55s
CI / OCR Service Tests (pull_request) Successful in 51s
CI / Backend Unit Tests (pull_request) Failing after 3m31s
- docs/infrastructure/production-compose.md: trimmed to VPS sizing,
  cost breakdown, and Hetzner ecosystem rationale. The inline
  compose spec (overlay + Hetzner OBS in prod) is retired; the
  live file is now docker-compose.prod.yml at the repo root and
  the Caddyfile lives at infra/caddy/Caddyfile. Observability
  stack is called out as a not-yet-deployed gap (issue #498).

- docs/architecture/c4/l2-containers.puml: adds Caddy as a named
  reverse-proxy container with the two port paths and notes the
  archiv-app service-account split on MinIO access.

Refs #497.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 22:00:21 +02:00
Marcel
2eade2b78f docs(deployment): rewrite for Gitea Actions / Caddy / prod compose
Brings DEPLOYMENT.md in line with the production deployment landed
in #497:

- Topology diagram: frontend port 3000 (Node adapter), 127.0.0.1
  binding, project-name isolation between prod and staging
- Caddyfile now lives in-tree at infra/caddy/Caddyfile (symlinked
  onto the server)
- Dev vs prod table: documents the new deploy method (workflows +
  --wait) and the prod-compose specific differences
- Env vars: adds MINIO_APP_PASSWORD; notes that prod compose
  hardcodes the MinIO root user and the bucket name
- Bootstrap section: server hardening, fail2ban, Tailscale, the 16
  Gitea secrets, and the workflow_dispatch first-deploy step
- Admin password warning: first deploy locks the password, secret
  rotation after that point has no effect
- Rollback: TAG= override + docker compose up -d --wait

Refs #497.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 21:58:51 +02:00
Marcel
334b507476 feat(ci): add release production deploy workflow
Fires on `v*` tag push. Tags the built images with the git tag so
rollbacks are a one-liner (TAG=<previous> docker compose ... up -d).

`up -d --wait` blocks until every service healthcheck reports
healthy; a bad release fails the workflow rather than crash-looping
silently. The .env.production file containing all Gitea secrets is
removed in `if: always()` after the deploy step.

Refs #497.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 21:56:37 +02:00
Marcel
59349dfe93 feat(ci): add nightly staging deploy workflow
Runs daily at 02:00 (and on workflow_dispatch). Builds the prod
compose stack with BuildKit, writes a transient .env.staging from
Gitea secrets, then `docker compose up -d --wait` so the job fails
loudly if any service's healthcheck never reports healthy.

The --profile staging flag starts the mailpit catcher in place of
a real SMTP relay; no production SMTP credentials touch the staging
environment.

The .env.staging file is cleaned up in `if: always()` to avoid
leaving secrets in the runner workspace between runs.

Refs #497.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 21:55:41 +02:00
Marcel
56e55ff488 feat(infra): add production Caddyfile
Reverse proxy for the Familienarchiv host, validated against Caddy 2.
Includes both vhosts (production and staging), the Gitea vhost, and:

- HSTS, X-Content-Type-Options, Referrer-Policy headers on every site
- "-Server" header strip to hide the Caddy version
- /actuator/* responds 404 on both archive vhosts (defense in depth
  for Spring Boot's management endpoints)

X-Frame-Options is intentionally not set in Caddy: Spring Security
configures frame-options SAMEORIGIN for the in-app PDF preview
iframe; a DENY header here would conflict.

Refs #497.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 21:54:38 +02:00
Marcel
ecb930e5f9 feat(infra): add docker-compose.prod.yml for production/staging
Standalone production compose file (not an overlay) that runs the
full stack on a single host. Environment isolation is achieved via
the docker compose project name (-p archiv-production / -p
archiv-staging) so the two environments cohabit cleanly.

Key choices, resolved in #497 review:
- Named volumes for persistent data (no host bind mounts)
- MinIO pinned to a specific RELEASE tag (no :latest)
- Backend uses MinIO service account (S3_ACCESS_KEY=archiv-app),
  not root credentials; create-buckets bootstraps the account
- Mailpit lives under profiles: [staging] so no real SMTP secret
  is ever wired into the staging deploy
- OCR mem_limit 12g + healthcheck (start_period 120s) copied from
  the dev compose so docker compose up -d --wait works in CI
- Backend admin credentials wired through APP_ADMIN_USERNAME /
  APP_ADMIN_PASSWORD; first deploy locks the password in
  permanently because UserDataInitializer is idempotent on email
- All host ports bound to 127.0.0.1; Caddy fronts external traffic

Refs #497.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 21:53:19 +02:00
Marcel
8b109349c2 feat(frontend): add production stage to Dockerfile
Multi-stage Dockerfile with three targets:
- development (dev server on :5173, used by docker-compose.yml)
- build (runs npm run build, produces SvelteKit Node-adapter output)
- production (self-contained node build server on :3000)

Node base pinned to node:20.19.0-alpine3.21 for reproducible CI
builds (Renovate will keep it current).

docker-compose.yml now specifies target: development for the
frontend so dev continues to use the dev-server stage. Without
this, Docker would default to the last stage (production).

Refs #497.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 21:51:32 +02:00
Marcel
ebd0f671f9 fix(auth): mark /hilfe/transkription as public for prerender
The route exports prerender = true and is listed in
svelte.config.js's prerender.entries. Until now the auth hook
redirected unauthenticated requests to /login, so the prerender
crawler hit a 302 and the build failed with "marked as prerenderable,
but were not prerendered".

Adding the path to PUBLIC_PATHS lets the crawler render the static
HTML; consistent with the route's intent as a public help page.

Surfaced by #497 (the production Docker build is the first place
npm run build runs in CI).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 21:50:53 +02:00
Marcel
83f022ff4b feat(security): trust X-Forwarded-Proto behind reverse proxy
Adds server.forward-headers-strategy: native so that Jetty honours
X-Forwarded-{Proto,For,Host} from Caddy. Without this, getScheme(),
redirect URLs, and Spring Session "Secure" cookies reflect the
internal http hop instead of the original https client request.

Refs #497.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-10 21:33:39 +02:00
Marcel
80ccc0f3c6 fix(test): extend coverage thresholds to all four dimensions
Some checks failed
CI / Unit & Component Tests (pull_request) Failing after 6m18s
CI / OCR Service Tests (pull_request) Successful in 43s
CI / Backend Unit Tests (pull_request) Failing after 3m24s
CI / Unit & Component Tests (push) Failing after 6m10s
CI / OCR Service Tests (push) Successful in 32s
CI / Backend Unit Tests (push) Failing after 3m22s
Add lines, functions, and statements at 80% alongside branches in both
the server (vite.config.ts) and client (vitest.client-coverage.config.ts)
coverage gates — branch-only thresholds allow misleadingly sparse tests to
pass the gate.

Also adds a plugin-sync comment to vitest.client-coverage.config.ts listing
the four Vite plugins mirrored from vite.config.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 18:56:44 +02:00
Marcel
eccecf35e3 ci: add combined coverage gate to unit-tests job
Some checks failed
CI / Unit & Component Tests (push) Failing after 5m54s
CI / Backend Unit Tests (push) Failing after 3m20s
CI / Unit & Component Tests (pull_request) Failing after 5m48s
CI / OCR Service Tests (push) Successful in 38s
CI / OCR Service Tests (pull_request) Successful in 33s
CI / Backend Unit Tests (pull_request) Failing after 3m21s
Runs test:coverage (server v8 + client Istanbul) after tests, hard-gates
on both 80% branch thresholds, and uploads coverage/ as an artifact.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 17:51:10 +02:00
Marcel
16f69fff33 feat(test): update test:coverage to run both server and client projects
Sequential && prevents the ENOTEMPTY race on coverage/.tmp. Server
uses v8 via --project=server; client uses the standalone Istanbul config.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 17:50:13 +02:00
Marcel
bb374bf2cd feat(test): add Istanbul browser coverage via standalone client config
Vitest 4 silently ignores per-project coverage overrides in test.projects,
so a standalone vitest.client-coverage.config.ts provides the root-level
Istanbul coverage block that Vitest actually honours.

Root vite.config.ts retains the v8 coverage block (reportsDirectory:
coverage/server) for the server project. The client config writes to
coverage/client and instruments all .svelte and .svelte.ts files.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 17:49:41 +02:00
Marcel
1a28e3114d build(deps): add @vitest/coverage-istanbul for browser-project coverage
Istanbul instruments code at transpile time and works inside Chromium's
sandbox; v8 coverage is silently a no-op in browser mode.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 17:21:00 +02:00
Marcel
915ad9f5c6 test(fts): add overflow guard and UUID-as-String regression tests
Some checks failed
CI / Unit & Component Tests (push) Failing after 4m35s
CI / OCR Service Tests (push) Successful in 39s
CI / Backend Unit Tests (push) Failing after 3m18s
- searchDocuments_relevance_returns_empty_when_offset_exceeds_maxInt:
  proves the long→int guard fires and findFtsPageRaw is never called
- searchDocuments_relevance_handles_string_uuid_from_jdbc_driver:
  exercises the toFtsPage String fallback branch for JDBC drivers that
  return UUID columns as String instead of java.util.UUID

Addresses Sara's review concerns on PR #488.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 16:35:01 +02:00
Marcel
143622bf27 refactor(fts): address PR #488 review concerns
- Extract isPureTextRelevance() private static method to replace the
  7-clause inline boolean in searchDocuments
- Guard long→int cast in relevanceSortedPageFromSql to prevent silent
  overflow at page ≥43M (CWE-190)
- resolvePersonName now uses the typed API client (createApiClient)
  instead of raw fetch, aligning with project conventions
- Update DocumentServiceTest stubs to match new FTS path (findFtsPageRaw
  + findAllById instead of findAllMatchingIdsByFts)
- Rewrite page.server.spec.ts person-name tests to mock via path-based
  API dispatch, matching the new api.GET call site

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 16:35:01 +02:00
Marcel
a3906976e8 test(fts): add integration tests and update unit tests for SQL-paginated relevance
- DocumentFtsPagedIntegrationTest: Testcontainers repo-level tests for
  findFtsPageRaw (page size, window total, last page, no matches, stopword)
- DocumentServiceSortTest: rewritten to stub findFtsPageRaw + findAllById
  for the pure-text RELEVANCE path; verifies filter-active path stays in-memory
- DocumentServiceTest: update two enrichment tests to use new SQL-path stubs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 16:35:01 +02:00
Marcel
b017da22c3 feat(fts): push FTS pagination into SQL via CTE window function
Pure-text RELEVANCE queries now use findFtsPageRaw (CTE + COUNT(*) OVER())
instead of loading all matching IDs into memory and sorting in-process.
Non-text paths (filters active, DATE sort) still use the in-memory path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 16:35:01 +02:00
Marcel
fea837b345 refactor(fts): add FtsHit/FtsPage records; rename findRankedIdsByFts -> findAllMatchingIdsByFts
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 16:35:01 +02:00
Marcel
a364e3f69b docs(adr): ADR-008 SQL-level FTS pagination via window-function CTE
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 16:35:01 +02:00
Marcel
7ca44d7df1 fix(db): add indexes on documents.sender_id and document_comments.author_id
Some checks failed
CI / Unit & Component Tests (push) Failing after 4m26s
CI / OCR Service Tests (push) Successful in 32s
CI / Backend Unit Tests (push) Failing after 3m16s
CI / Unit & Component Tests (pull_request) Failing after 4m33s
CI / OCR Service Tests (pull_request) Successful in 39s
CI / Backend Unit Tests (pull_request) Failing after 3m16s
Flyway V62 adds idx_documents_sender_id and idx_comments_author_id to speed up
FK-driven queries on the persons page and briefwechsel view. Closes #470.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 16:31:30 +02:00
Marcel
e975642a4c fix(pdf-controls): add focus-visible ring to all PdfControls buttons (WCAG 2.1 §2.4.7)
Some checks failed
CI / Backend Unit Tests (push) Has been cancelled
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 16:09:15 +02:00
Marcel
72f422afe2 fix(a11y): increase all PdfControls buttons to 44×44px touch targets
Add min-h-[44px] min-w-[44px] to all five PDF viewer buttons (prev,
next, zoom in, zoom out, annotation toggle) and widen icon-only
padding from p-1 to p-2. Adds aria-pressed to the annotation toggle
for correct toggle semantics (WCAG 2.2 §2.5.8 + ARIA 1.2).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 16:09:15 +02:00
Marcel
6074480482 ci: document Docker socket security trade-off in runner config
Some checks failed
CI / Unit & Component Tests (pull_request) Failing after 4m34s
CI / OCR Service Tests (pull_request) Successful in 35s
CI / Backend Unit Tests (pull_request) Failing after 3m18s
CI / Unit & Component Tests (push) Failing after 4m30s
CI / OCR Service Tests (push) Successful in 31s
CI / Backend Unit Tests (push) Failing after 3m13s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 16:05:19 +02:00
Marcel
5512790d5a ci: track act_runner config with Docker socket mount
Some checks failed
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / Unit & Component Tests (pull_request) Failing after 4m31s
CI / OCR Service Tests (pull_request) Successful in 30s
CI / Backend Unit Tests (pull_request) Failing after 3m17s
Documents the NAS runner configuration needed for Testcontainers.
Must be deployed to the runner host alongside the act_runner binary.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 16:03:36 +02:00
Marcel
a158048f45 fix(ci): expose Docker socket env vars for Testcontainers in backend job
DOCKER_HOST makes the socket explicit rather than relying on runner
config propagation; TESTCONTAINERS_RYUK_DISABLED=true avoids Ryuk
watchdog start failures in nested container environments.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 16:03:36 +02:00
Marcel
ac999066dd fix(ci): add TZ=Europe/Berlin to frontend test step
date-buckets.spec.ts midnight tests pass timezone-aware dates (+02:00)
which are 22:00 UTC the prior day; setHours(0,0,0,0) uses local TZ.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 16:03:36 +02:00
Marcel
8b25a5b940 fix(user): replace Math.abs(hashCode()) with Math.floorMod in computeColor
Some checks failed
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
Math.abs(Integer.MIN_VALUE) overflows back to Integer.MIN_VALUE (negative),
making the old pattern unsafe for any palette size that doesn't evenly divide
MIN_VALUE. Math.floorMod always returns a non-negative residue in [0, n-1],
eliminating the overflow edge case entirely.

Fixes SpotBugs RV_ABSOLUTE_VALUE_OF_HASHCODE (priority 1, CORRECTNESS).
Closes #471

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 15:48:59 +02:00
Marcel
265b4f1484 fix(comment): declare missing @PathVariable params on block comment endpoints
Some checks failed
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
getBlockComments was missing documentId; replyToBlockComment was missing
blockId. Spring silently ignored undeclared path variables — the segments
were parsed but never bound. Now both parameters are explicitly declared so
Spring rejects non-UUID values with 400.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 15:45:48 +02:00
Marcel
bfc3a17676 test(migration): guard cleanup in try-finally to ensure isolation
Some checks failed
CI / Unit & Component Tests (pull_request) Failing after 3m57s
CI / OCR Service Tests (pull_request) Successful in 40s
CI / Backend Unit Tests (pull_request) Failing after 3m21s
CI / Unit & Component Tests (push) Failing after 4m1s
CI / OCR Service Tests (push) Successful in 38s
CI / Backend Unit Tests (push) Failing after 3m24s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 15:25:26 +02:00
Marcel
eb54a98ea2 fix(user): use builder in createGroup and guard against null permissions
Some checks failed
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / Unit & Component Tests (pull_request) Failing after 4m2s
CI / OCR Service Tests (pull_request) Successful in 37s
CI / Backend Unit Tests (pull_request) Failing after 3m18s
Null dto.permissions now produces an empty HashSet instead of propagating null
into the @ElementCollection — prevents a silent NPE after V64 adds NOT NULL.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 15:19:20 +02:00
Marcel
3fcdfa85f1 fix(db): add PRIMARY KEY to group_permissions; promote tbmp UNIQUE to PK
V63 deduplicates any phantom (group_id, permission) rows accumulated since
the initial schema. V64 sets NOT NULL on permission and adds pk_group_permissions.
V65 renames uq_tbmp_block_person to pk_tbmp for naming-convention consistency.
Integration tests confirm each constraint via pg_catalog.pg_constraint. Closes #469 (partial).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 15:18:46 +02:00
Marcel
cd1c0b210e test(typeahead): note resetKey smoke-test limitation in spec comment
Some checks failed
CI / Unit & Component Tests (push) Failing after 4m19s
CI / OCR Service Tests (push) Successful in 42s
CI / Backend Unit Tests (push) Failing after 3m20s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 14:27:24 +02:00
Marcel
a239c16c31 fix(documents): sync filter display state with URL on navigation
Three root causes prevented filters from reflecting the URL after SvelteKit
client-side navigation:

1. +page.server.ts now resolves sender/receiver display names in parallel with
   the document search (UUID validation + silent 404 drop), so initialSenderName
   / initialReceiverName land in server data ready for the UI to use.

2. +page.svelte passes initialSenderName, initialReceiverName, and navKey
   (incremented via untrack on every navigation) down to SearchFilterBar.
   The untrack() prevents the effect from re-running due to its own navKey write.

3. SearchFilterBar forwards navKey as resetKey to each PersonTypeahead, which
   already had a void resetKey guard added in the previous commit.

Together these ensure that after navigating to /documents?senderId=<uuid> the
typeahead shows the person's display name, and clicking × reset clears it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 14:27:24 +02:00
Marcel
8a8205ad8d fix(person-typeahead): add resetKey prop to clear term on navigation reset
When the user types in the sender/receiver typeahead without selecting a
person and then clicks ×-reset (navigating back to /documents), the
manually-typed term was not cleared because initialName stayed '' between
navigations — the existing $effect tracking initialName never fired.

Adding `resetKey` (incremented by the page on every navigation) forces
the effect to re-run via `void resetKey`, clearing searchTerm=initialName
even when initialName is unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 14:27:24 +02:00
Marcel
0430383e1c fix(date-input): re-derive display when value prop changes externally
`display` was initialised once and never updated, so the text box would
show a stale German date after the parent reset `value` (e.g. × reset
button or timeline drag). A guarded `$effect` re-derives `display` from
`value` whenever the two are out of sync while preserving mid-typing
partial dates (germanToIso returns '' for incomplete input, which matches
value='' during typing → no spurious re-derive).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 14:27:24 +02:00
Marcel
e2d74ff880 ci: add npm run build step to unit-tests job
Some checks failed
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
The prerender fix only prevents regression if the build is actually run in
CI. Without this gate, a future prerendered route that becomes unreachable
behind auth would fail silently until someone runs the build manually.

Fits after the test step in the existing unit-tests job — no new job needed
since node_modules is already cached for the Playwright container.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 14:25:32 +02:00
Marcel
586eea009b fix(build): add prerender entry for /hilfe/transkription
The SvelteKit prerender crawler cannot reach this route because
hooks.server.ts redirects all non-public paths to /login before the
crawler follows links. Explicitly listing the route in kit.prerender.entries
tells SvelteKit to render it directly without crawling.

Also removes a misleading comment that claimed the auth hook guards
prerendered static files — it does not. Prerendered HTML is served as a
static file by the reverse proxy; hooks.server.ts only runs for SSR requests.

Closes #472

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 14:25:32 +02:00
71 changed files with 3605 additions and 445 deletions

View File

@@ -39,6 +39,48 @@ jobs:
- name: Run unit and component tests
run: npm test
working-directory: frontend
env:
TZ: Europe/Berlin
- name: Run coverage (server + client)
run: npm run test:coverage
working-directory: frontend
env:
TZ: Europe/Berlin
- name: Upload coverage reports
if: always()
uses: actions/upload-artifact@v4
with:
name: coverage-reports
path: frontend/coverage/
- name: Build frontend
run: npm run build
working-directory: frontend
# ── Prerender output is exactly the public help page ───────────────────
# SvelteKit prerender + crawl follows nav links and bakes "redirect to
# /login" HTML for every protected route, served BEFORE runtime hooks
# (see #514). With `crawl: false` only the explicit entry should land
# in build/prerendered/. Anything else is a regression — fail the build.
- name: Assert prerender output is only /hilfe/transkription
run: |
cd frontend
set -e
extra=$(find build/prerendered -type f \
-not -path 'build/prerendered/hilfe/*' \
-not -name '*.br' -not -name '*.gz' \
|| true)
if [ -n "$extra" ]; then
echo "FAIL: unexpected prerendered files (would shadow runtime hooks):"
echo "$extra"
exit 1
fi
# And the help page must still be there.
test -f build/prerendered/hilfe/transkription.html \
|| { echo "FAIL: /hilfe/transkription.html missing from prerender output"; exit 1; }
echo "PASS: only /hilfe/transkription.html prerendered."
- name: Upload screenshots
if: always()
@@ -74,6 +116,8 @@ jobs:
runs-on: ubuntu-latest
env:
DOCKER_API_VERSION: "1.43" # NAS runner runs Docker 24.x (max API 1.43); Testcontainers 2.x defaults to 1.44
DOCKER_HOST: unix:///var/run/docker.sock
TESTCONTAINERS_RYUK_DISABLED: "true"
steps:
- uses: actions/checkout@v4
@@ -93,4 +137,123 @@ jobs:
run: |
chmod +x mvnw
./mvnw clean test
working-directory: backend
working-directory: backend
# ─── fail2ban Regex Regression ────────────────────────────────────────────────
# The filter parses Caddy's JSON access log; a Caddy upgrade that reorders
# the JSON keys would silently break it (fail2ban-regex would return
# "0 matches", fail2ban would stop banning, no error surface). This job
# pins the contract against a deterministic sample line.
fail2ban-regex:
name: fail2ban Regex
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Install fail2ban
run: |
sudo apt-get update
sudo apt-get install -y fail2ban
- name: Matches /api/auth/login 401
run: |
echo '{"level":"info","ts":1700000000.12,"logger":"http.log.access","msg":"handled request","request":{"remote_ip":"203.0.113.42","method":"POST","host":"archiv.raddatz.cloud","uri":"/api/auth/login"},"status":401}' > /tmp/sample.log
out=$(fail2ban-regex /tmp/sample.log infra/fail2ban/filter.d/familienarchiv-auth.conf)
echo "$out"
echo "$out" | grep -qE '1 matched' \
|| { echo "expected 1 match for /api/auth/login 401"; exit 1; }
- name: Matches /api/auth/login 429
run: |
echo '{"level":"info","ts":1700000000.12,"logger":"http.log.access","msg":"handled request","request":{"remote_ip":"203.0.113.42","method":"POST","host":"archiv.raddatz.cloud","uri":"/api/auth/login"},"status":429}' > /tmp/sample.log
out=$(fail2ban-regex /tmp/sample.log infra/fail2ban/filter.d/familienarchiv-auth.conf)
echo "$out"
echo "$out" | grep -qE '1 matched' \
|| { echo "expected 1 match for /api/auth/login 429"; exit 1; }
- name: Matches /api/auth/forgot-password 401
run: |
echo '{"level":"info","ts":1700000000.12,"logger":"http.log.access","msg":"handled request","request":{"remote_ip":"203.0.113.42","method":"POST","host":"archiv.raddatz.cloud","uri":"/api/auth/forgot-password"},"status":401}' > /tmp/sample.log
out=$(fail2ban-regex /tmp/sample.log infra/fail2ban/filter.d/familienarchiv-auth.conf)
echo "$out"
echo "$out" | grep -qE '1 matched' \
|| { echo "expected 1 match for /api/auth/forgot-password 401"; exit 1; }
- name: Does not match /api/auth/login 200
run: |
echo '{"level":"info","ts":1700000000.12,"logger":"http.log.access","msg":"handled request","request":{"remote_ip":"203.0.113.42","method":"POST","host":"archiv.raddatz.cloud","uri":"/api/auth/login"},"status":200}' > /tmp/sample.log
out=$(fail2ban-regex /tmp/sample.log infra/fail2ban/filter.d/familienarchiv-auth.conf)
echo "$out"
echo "$out" | grep -qE '0 matched' \
|| { echo "expected 0 matches for /api/auth/login 200"; exit 1; }
- name: Does not match /api/documents (unrelated 401)
run: |
echo '{"level":"info","ts":1700000000.12,"logger":"http.log.access","msg":"handled request","request":{"remote_ip":"203.0.113.42","method":"GET","host":"archiv.raddatz.cloud","uri":"/api/documents"},"status":401}' > /tmp/sample.log
out=$(fail2ban-regex /tmp/sample.log infra/fail2ban/filter.d/familienarchiv-auth.conf)
echo "$out"
echo "$out" | grep -qE '0 matched' \
|| { echo "expected 0 matches for /api/documents 401"; exit 1; }
# ── Backend resolves to file-polling, not systemd ─────────────────────
# The Debian/Ubuntu fail2ban package ships defaults-debian.conf with
# `[DEFAULT] backend = systemd`. Without `backend = polling` in our
# jail, the daemon loads the jail but reads from journald and never
# touches /var/log/caddy/access.log — i.e. the regex above passes in
# isolation while the live jail is inert. See issue #503.
- name: Jail resolves with polling backend (not inherited systemd)
run: |
sudo ln -sfn "$PWD/infra/fail2ban/jail.d/familienarchiv.conf" /etc/fail2ban/jail.d/familienarchiv.conf
sudo ln -sfn "$PWD/infra/fail2ban/filter.d/familienarchiv-auth.conf" /etc/fail2ban/filter.d/familienarchiv-auth.conf
dump=$(sudo fail2ban-client -d 2>&1)
echo "$dump" | grep -E "add.*familienarchiv-auth" || true
echo "$dump" | grep -qE "\['add', 'familienarchiv-auth', 'polling'\]" \
|| { echo "FAIL: familienarchiv-auth jail did not resolve to 'polling' backend"; exit 1; }
# ─── Compose Bucket-Bootstrap Idempotency ─────────────────────────────────────
# docker-compose.prod.yml's create-buckets service runs on every
# `docker compose up` (one-shot, no restart). Must be idempotent — a
# re-deploy must not fail just because the bucket / user / policy
# already exists. Validated by running create-buckets twice against a
# throwaway minio stack and asserting both invocations exit 0.
compose-idempotency:
name: Compose Bucket Idempotency
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Write stub env file
run: |
cat > .env.test <<'EOF'
TAG=test
PORT_BACKEND=18080
PORT_FRONTEND=13000
APP_DOMAIN=localhost
POSTGRES_PASSWORD=stub
MINIO_PASSWORD=stubrootpassword
MINIO_APP_PASSWORD=stubapppassword
OCR_TRAINING_TOKEN=stub
APP_ADMIN_USERNAME=admin@local
APP_ADMIN_PASSWORD=stub
MAIL_HOST=mailpit
MAIL_PORT=1025
APP_MAIL_FROM=noreply@local
EOF
- name: Bring up minio
run: |
docker compose -f docker-compose.prod.yml -p test-idem --env-file .env.test up -d --wait minio
- name: First create-buckets run
run: |
docker compose -f docker-compose.prod.yml -p test-idem --env-file .env.test run --rm create-buckets
- name: Second create-buckets run (idempotency check)
run: |
docker compose -f docker-compose.prod.yml -p test-idem --env-file .env.test run --rm create-buckets
- name: Teardown
if: always()
run: |
docker compose -f docker-compose.prod.yml -p test-idem --env-file .env.test down -v
rm -f .env.test

View File

@@ -0,0 +1,171 @@
name: nightly
# Builds and deploys the staging environment from main every night.
# Runs on the self-hosted runner using Docker-out-of-Docker (the docker
# socket is mounted in), so `docker compose build` produces images on
# the host daemon and `docker compose up` consumes them directly — no
# registry hop.
#
# Operational assumptions (see docs/DEPLOYMENT.md §3 for the full setup):
#
# 1. Single-tenant self-hosted runner. The "Write staging env file" step
# writes every secret to .env.staging on the runner filesystem; the
# `if: always()` cleanup step removes it. A multi-tenant runner
# would need to switch to docker compose --env-file <(stdin) instead.
#
# 2. Host docker layer cache is authoritative. There is no
# actions/cache; we rely on the host daemon to keep Maven and npm
# layers warm between runs. A `docker system prune` on the host
# will cause the next nightly build to be cold (510 min slower).
#
# Staging environment isolation:
# - project name: archiv-staging
# - host ports: backend 8081, frontend 3001
# - profile: staging (starts mailpit instead of a real SMTP relay)
#
# Required Gitea secrets:
# STAGING_POSTGRES_PASSWORD
# STAGING_MINIO_PASSWORD
# STAGING_MINIO_APP_PASSWORD
# STAGING_OCR_TRAINING_TOKEN
# STAGING_APP_ADMIN_USERNAME
# STAGING_APP_ADMIN_PASSWORD
on:
schedule:
- cron: "0 2 * * *"
workflow_dispatch:
env:
# Ensures the backend Dockerfile's `RUN --mount=type=cache` lines are
# honoured (Maven cache survives between runs).
DOCKER_BUILDKIT: "1"
jobs:
deploy-staging:
# `ubuntu-latest` matches our self-hosted runner's advertised label
# (the runner has labels: ubuntu-latest / ubuntu-24.04 / ubuntu-22.04).
# `self-hosted` would never match — no runner advertises it — so the
# job parks in the queue forever. ADR-011's "single-tenant" promise
# is at the repo level; sharing this runner between CI and deploys
# for the same repo is within that boundary.
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Write staging env file
run: |
cat > .env.staging <<EOF
TAG=nightly
PORT_BACKEND=8081
PORT_FRONTEND=3001
APP_DOMAIN=staging.raddatz.cloud
POSTGRES_PASSWORD=${{ secrets.STAGING_POSTGRES_PASSWORD }}
MINIO_PASSWORD=${{ secrets.STAGING_MINIO_PASSWORD }}
MINIO_APP_PASSWORD=${{ secrets.STAGING_MINIO_APP_PASSWORD }}
OCR_TRAINING_TOKEN=${{ secrets.STAGING_OCR_TRAINING_TOKEN }}
APP_ADMIN_USERNAME=${{ secrets.STAGING_APP_ADMIN_USERNAME }}
APP_ADMIN_PASSWORD=${{ secrets.STAGING_APP_ADMIN_PASSWORD }}
MAIL_HOST=mailpit
MAIL_PORT=1025
MAIL_USERNAME=
MAIL_PASSWORD=
MAIL_SMTP_AUTH=false
MAIL_STARTTLS_ENABLE=false
APP_MAIL_FROM=noreply@staging.raddatz.cloud
EOF
- name: Build images
# `--pull` forces re-fetching pinned base images so a CVE
# re-publication of the same tag (e.g. node:20.19.0-alpine3.21,
# postgres:16-alpine) is picked up instead of being served
# from the host's stale Docker layer cache.
run: |
docker compose \
-f docker-compose.prod.yml \
-p archiv-staging \
--env-file .env.staging \
--profile staging \
build --pull
- name: Deploy staging
run: |
docker compose \
-f docker-compose.prod.yml \
-p archiv-staging \
--env-file .env.staging \
--profile staging \
up -d --wait --remove-orphans
- name: Reload Caddy
# Apply any committed Caddyfile changes before smoke-testing the
# public surface. Without this step, a Caddyfile edit lands in the
# repo but Caddy keeps serving the previous config until someone
# reloads it manually — the smoke test would then catch a stale
# header or a still-proxied /actuator route rather than confirming
# the current config is live.
#
# The runner executes job steps inside Docker containers (DooD).
# `systemctl` is not present in container images and cannot reach
# the host's systemd directly. We use the Docker socket (mounted
# into every job container via runner-config.yaml) to spin up a
# privileged sibling container in the host PID namespace; nsenter
# then enters the host's namespaces so systemctl talks to the real
# host systemd daemon. No sudoers entry is required — the Docker
# socket already grants root-equivalent host access.
#
# Alpine is used: ~5 MB vs ~70 MB for ubuntu, no unnecessary
# tooling, and the digest is pinned so any upstream change requires
# an explicit bump PR. util-linux (which ships nsenter) is installed
# at run time; apk add takes ~1 s on the warm VPS cache.
#
# `reload` not `restart`: reload sends SIGHUP so Caddy re-reads its
# config in-process without dropping TLS connections. `restart`
# would briefly stop the service, losing in-flight requests.
#
# If Caddy is not running this step fails fast before the smoke test
# issues a misleading "port 443 refused" error.
run: |
docker run --rm --privileged --pid=host \
alpine:3.21@sha256:48b0309ca019d89d40f670aa1bc06e426dc0931948452e8491e3d65087abc07d \
sh -c 'apk add --no-cache util-linux -q && nsenter -t 1 -m -u -n -p -i -- /bin/systemctl reload caddy'
- name: Smoke test deployed environment
# Healthchecks confirm containers are healthy; they do NOT confirm the
# public surface works. This step catches: Caddy not reloaded, HSTS
# header dropped, /actuator block bypassed.
#
# --resolve pins staging.raddatz.cloud to the runner's loopback so we
# do NOT depend on the host router doing hairpin NAT (many SOHO
# routers do not, or do so only after a firmware update). SNI still
# uses the public hostname so the cert validates correctly.
run: |
set -e
HOST="staging.raddatz.cloud"
URL="https://$HOST"
RESOLVE="--resolve $HOST:443:127.0.0.1"
echo "Smoke test: $URL (pinned to 127.0.0.1)"
curl -fsS $RESOLVE --max-time 10 "$URL/login" -o /dev/null
# Pin the preload-list-eligible HSTS value, not just header presence:
# a degraded `max-age=1` or a dropped `includeSubDomains; preload` must
# fail this check rather than pass it silently.
curl -fsS $RESOLVE --max-time 10 -I "$URL/" \
| grep -Eqi 'strict-transport-security:[[:space:]]*max-age=31536000.*includeSubDomains.*preload'
# Permissions-Policy denies APIs the app does not use (camera,
# microphone, geolocation). A regression that loosens or drops the
# header now fails the smoke step.
curl -fsS $RESOLVE --max-time 10 -I "$URL/" \
| grep -Eqi 'permissions-policy:[[:space:]]*camera=\(\),[[:space:]]*microphone=\(\),[[:space:]]*geolocation=\(\)'
status=$(curl -s $RESOLVE -o /dev/null -w "%{http_code}" --max-time 10 "$URL/actuator/health")
[ "$status" = "404" ] || { echo "expected 404 from /actuator/health, got $status"; exit 1; }
echo "All smoke checks passed"
- name: Cleanup env file
# LOAD-BEARING: `if: always()` is the linchpin of the ADR-011
# single-tenant runner trust model. Every secret in .env.staging
# is plain text on the runner filesystem until this step runs.
# If a future refactor drops `if: always()`, a failed deploy
# leaves the env-file behind. Do not remove this conditional
# without first re-evaluating ADR-011.
if: always()
run: rm -f .env.staging

View File

@@ -0,0 +1,140 @@
name: release
# Builds and deploys the production environment on `v*` tag push.
# Runs on the self-hosted runner via Docker-out-of-Docker; images are
# tagged with the actual git tag (e.g. v1.0.0) so rollback is
# `TAG=<previous> docker compose -f docker-compose.prod.yml -p archiv-production up -d --wait`
#
# Operational assumptions (see docs/DEPLOYMENT.md §3 for the full setup):
#
# 1. Single-tenant self-hosted runner. The "Write production env file"
# step writes every secret to .env.production on the runner
# filesystem; the `if: always()` cleanup step removes it. A
# multi-tenant runner would need to switch to
# `docker compose --env-file <(stdin)` instead.
#
# 2. Host docker layer cache is authoritative. There is no
# actions/cache; we rely on the host daemon to keep Maven and npm
# layers warm between runs. A `docker system prune` on the host
# will cause the next release build to be cold (510 min slower).
#
# Production environment:
# - project name: archiv-production
# - host ports: backend 8080, frontend 3000
# - profile: (none) — mailpit is excluded; real SMTP relay is used
#
# Required Gitea secrets:
# PROD_POSTGRES_PASSWORD
# PROD_MINIO_PASSWORD
# PROD_MINIO_APP_PASSWORD
# PROD_OCR_TRAINING_TOKEN
# PROD_APP_ADMIN_USERNAME (CRITICAL: see docs/DEPLOYMENT.md)
# PROD_APP_ADMIN_PASSWORD (CRITICAL: locked in on first deploy)
# MAIL_HOST
# MAIL_PORT
# MAIL_USERNAME
# MAIL_PASSWORD
on:
push:
tags:
- "v*"
env:
DOCKER_BUILDKIT: "1"
jobs:
deploy-production:
# See nightly.yml — same rationale: `ubuntu-latest` matches the
# advertised label of our single-tenant self-hosted runner.
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Write production env file
run: |
cat > .env.production <<EOF
TAG=${{ gitea.ref_name }}
PORT_BACKEND=8080
PORT_FRONTEND=3000
APP_DOMAIN=archiv.raddatz.cloud
POSTGRES_PASSWORD=${{ secrets.PROD_POSTGRES_PASSWORD }}
MINIO_PASSWORD=${{ secrets.PROD_MINIO_PASSWORD }}
MINIO_APP_PASSWORD=${{ secrets.PROD_MINIO_APP_PASSWORD }}
OCR_TRAINING_TOKEN=${{ secrets.PROD_OCR_TRAINING_TOKEN }}
APP_ADMIN_USERNAME=${{ secrets.PROD_APP_ADMIN_USERNAME }}
APP_ADMIN_PASSWORD=${{ secrets.PROD_APP_ADMIN_PASSWORD }}
MAIL_HOST=${{ secrets.MAIL_HOST }}
MAIL_PORT=${{ secrets.MAIL_PORT }}
MAIL_USERNAME=${{ secrets.MAIL_USERNAME }}
MAIL_PASSWORD=${{ secrets.MAIL_PASSWORD }}
MAIL_SMTP_AUTH=true
MAIL_STARTTLS_ENABLE=true
APP_MAIL_FROM=noreply@raddatz.cloud
EOF
- name: Build images
# `--pull` forces re-fetching pinned base images so a CVE
# re-publication of the same tag is picked up rather than served
# from the host's stale Docker layer cache.
run: |
docker compose \
-f docker-compose.prod.yml \
-p archiv-production \
--env-file .env.production \
build --pull
- name: Deploy production
run: |
docker compose \
-f docker-compose.prod.yml \
-p archiv-production \
--env-file .env.production \
up -d --wait --remove-orphans
- name: Reload Caddy
# See nightly.yml — same rationale and mechanism: DooD job containers
# cannot call systemctl directly; nsenter via a privileged sibling
# container reaches the host systemd. Must run after deploy (so the
# latest Caddyfile is on disk) and before the smoke test (so the
# public surface reflects the current config). Alpine with pinned
# digest; reload not restart — see nightly.yml for full rationale.
run: |
docker run --rm --privileged --pid=host \
alpine:3.21@sha256:48b0309ca019d89d40f670aa1bc06e426dc0931948452e8491e3d65087abc07d \
sh -c 'apk add --no-cache util-linux -q && nsenter -t 1 -m -u -n -p -i -- /bin/systemctl reload caddy'
- name: Smoke test deployed environment
# See nightly.yml — same three checks, against the prod vhost.
# --resolve pins archiv.raddatz.cloud to the runner's loopback so
# the smoke test does NOT depend on hairpin NAT on the host router.
run: |
set -e
HOST="archiv.raddatz.cloud"
URL="https://$HOST"
RESOLVE="--resolve $HOST:443:127.0.0.1"
echo "Smoke test: $URL (pinned to 127.0.0.1)"
curl -fsS $RESOLVE --max-time 10 "$URL/login" -o /dev/null
# Pin the preload-list-eligible HSTS value, not just header presence:
# a degraded `max-age=1` or a dropped `includeSubDomains; preload` must
# fail this check rather than pass it silently.
curl -fsS $RESOLVE --max-time 10 -I "$URL/" \
| grep -Eqi 'strict-transport-security:[[:space:]]*max-age=31536000.*includeSubDomains.*preload'
# Permissions-Policy denies APIs the app does not use (camera,
# microphone, geolocation). A regression that loosens or drops the
# header now fails the smoke step.
curl -fsS $RESOLVE --max-time 10 -I "$URL/" \
| grep -Eqi 'permissions-policy:[[:space:]]*camera=\(\),[[:space:]]*microphone=\(\),[[:space:]]*geolocation=\(\)'
status=$(curl -s $RESOLVE -o /dev/null -w "%{http_code}" --max-time 10 "$URL/actuator/health")
[ "$status" = "404" ] || { echo "expected 404 from /actuator/health, got $status"; exit 1; }
echo "All smoke checks passed"
- name: Cleanup env file
# LOAD-BEARING: `if: always()` is the linchpin of the ADR-011
# single-tenant runner trust model. Every secret in
# .env.production is plain text on the runner filesystem until
# this step runs. If a future refactor drops `if: always()`, a
# failed deploy leaves the env-file behind. Do not remove this
# conditional without first re-evaluating ADR-011.
if: always()
run: rm -f .env.production

View File

@@ -100,7 +100,45 @@ public interface DocumentRepository extends JpaRepository<Document, UUID>, JpaSp
ORDER BY ts_rank(d.search_vector, q.pq) DESC,
d.meta_date DESC NULLS LAST
""")
List<UUID> findRankedIdsByFts(@Param("query") String query);
// Unpaged path — for bulk-edit "select all" and density chart
List<UUID> findAllMatchingIdsByFts(@Param("query") String query);
/**
* Returns one page of FTS-ranked document IDs with the total match count.
*
* <p>Each row contains (in column order):
* <ol>
* <li>UUID — document id</li>
* <li>double — ts_rank score</li>
* <li>long — COUNT(*) OVER () — full match count, not page count</li>
* </ol>
*
* <p>Returns an empty list when the query matches no documents (including
* stopword-only queries where websearch_to_tsquery returns an empty tsquery).
* Use findAllMatchingIdsByFts for the unpaged bulk-edit path.
*/
@Query(nativeQuery = true, value = """
WITH q AS (
SELECT CASE WHEN websearch_to_tsquery('german', :query)::text <> ''
THEN to_tsquery('simple', regexp_replace(
websearch_to_tsquery('german', :query)::text,
'''([^'']+)''',
'''\\1'':*',
'g'))
END AS pq
), matches AS (
SELECT d.id, ts_rank(d.search_vector, q.pq) AS rank
FROM documents d, q
WHERE d.search_vector @@ q.pq
)
SELECT id, rank, COUNT(*) OVER () AS total
FROM matches
ORDER BY rank DESC, id
OFFSET :offset LIMIT :limit
""")
List<Object[]> findFtsPageRaw(@Param("query") String query,
@Param("offset") int offset,
@Param("limit") int limit);
/**
* Returns match-enrichment data for a set of documents identified by their IDs.

View File

@@ -162,7 +162,7 @@ public class DocumentService {
*/
private List<UUID> resolveFtsIds(String text) {
if (!StringUtils.hasText(text)) return null;
return documentRepository.findRankedIdsByFts(text);
return documentRepository.findAllMatchingIdsByFts(text);
}
/** Loads matching documents and projects to non-null {@link LocalDate}s. */
@@ -485,7 +485,7 @@ public class DocumentService {
boolean hasText = StringUtils.hasText(text);
List<UUID> rankedIds = null;
if (hasText) {
rankedIds = documentRepository.findRankedIdsByFts(text);
rankedIds = documentRepository.findAllMatchingIdsByFts(text);
if (rankedIds.isEmpty()) return List.of();
}
@@ -645,39 +645,43 @@ public class DocumentService {
// 1. Allgemeine Suche (für das Suchfeld im Frontend)
public DocumentSearchResult searchDocuments(String text, LocalDate from, LocalDate to, UUID sender, UUID receiver, List<String> tags, String tagQ, DocumentStatus status, DocumentSort sort, String dir, TagOperator tagOperator, Pageable pageable) {
boolean hasText = StringUtils.hasText(text);
List<UUID> rankedIds = null;
// Pure-text RELEVANCE: push pagination into SQL — skip findAllMatchingIdsByFts entirely (ADR-008).
if (isPureTextRelevance(hasText, sort, from, to, sender, receiver, tags, tagQ, status)) {
return relevanceSortedPageFromSql(text, pageable);
}
List<UUID> rankedIds = null;
if (hasText) {
rankedIds = documentRepository.findRankedIdsByFts(text);
rankedIds = documentRepository.findAllMatchingIdsByFts(text);
if (rankedIds.isEmpty()) return DocumentSearchResult.of(List.of());
}
Specification<Document> spec = buildSearchSpec(
hasText, rankedIds, from, to, sender, receiver, tags, tagQ, status, tagOperator);
// SENDER, RECEIVER and RELEVANCE sorts load the full match set and slice in memory.
// SENDER and RECEIVER sorts load the full match set and slice in-memory.
// JPA's Sort.by("sender.lastName") generates an INNER JOIN that silently drops
// documents with null sender/receivers; RELEVANCE maps a DB order to an external
// rank list. Cost scales linearly with match count — acceptable while documents
// stays under ~10k rows. Past that, replace with SQL-level LEFT JOIN sort.
// documents with null sender/receivers. Cost scales with match count —
// acceptable while documents stays under ~10k rows. (ADR-008)
if (sort == DocumentSort.RECEIVER) {
// In-memory sort on page slice (≤ page size rows) — acceptable
List<Document> sorted = sortByFirstReceiver(documentRepository.findAll(spec), dir);
return buildResultPaged(pageSlice(sorted, pageable), text, pageable, sorted.size());
}
if (sort == DocumentSort.SENDER) {
// In-memory sort on page slice (≤ page size rows) — acceptable
List<Document> sorted = sortBySender(documentRepository.findAll(spec), dir);
return buildResultPaged(pageSlice(sorted, pageable), text, pageable, sorted.size());
}
// RELEVANCE: default when text present and no explicit sort given
// RELEVANCE with active filters: load filtered subset and sort in-memory by rank.
boolean useRankOrder = hasText && (sort == null || sort == DocumentSort.RELEVANCE);
if (useRankOrder) {
List<Document> results = documentRepository.findAll(spec);
Map<UUID, Integer> rankMap = new HashMap<>();
for (int i = 0; i < rankedIds.size(); i++) rankMap.put(rankedIds.get(i), i);
List<Document> sorted = results.stream()
.sorted(Comparator.comparingInt(
doc -> rankMap.getOrDefault(doc.getId(), Integer.MAX_VALUE)))
List<Document> sorted = documentRepository.findAll(spec).stream()
.sorted(Comparator.comparingInt(doc -> rankMap.getOrDefault(doc.getId(), Integer.MAX_VALUE)))
.toList();
return buildResultPaged(pageSlice(sorted, pageable), text, pageable, sorted.size());
}
@@ -688,6 +692,39 @@ public class DocumentService {
return buildResultPaged(page.getContent(), text, pageable, page.getTotalElements());
}
private static boolean isPureTextRelevance(boolean hasText, DocumentSort sort,
LocalDate from, LocalDate to, UUID sender, UUID receiver,
List<String> tags, String tagQ, DocumentStatus status) {
return hasText && (sort == null || sort == DocumentSort.RELEVANCE)
&& from == null && to == null && sender == null && receiver == null
&& (tags == null || tags.isEmpty()) && (tagQ == null || tagQ.isBlank()) && status == null;
}
/**
* Pure-text RELEVANCE path — pagination and ts_rank ordering pushed into SQL.
* Called when no non-text filters are active (ADR-008).
*/
private DocumentSearchResult relevanceSortedPageFromSql(String text, Pageable pageable) {
long rawOffset = pageable.getOffset();
if (rawOffset > Integer.MAX_VALUE) return DocumentSearchResult.of(List.of());
int offset = (int) rawOffset;
int limit = pageable.getPageSize();
FtsPage ftsPage = toFtsPage(documentRepository.findFtsPageRaw(text, offset, limit));
if (ftsPage.hits().isEmpty()) return DocumentSearchResult.of(List.of());
// Preserve ts_rank order from SQL across the JPA findAllById call.
Map<UUID, Integer> rankMap = new HashMap<>();
List<UUID> pageIds = new ArrayList<>();
for (int i = 0; i < ftsPage.hits().size(); i++) {
rankMap.put(ftsPage.hits().get(i).id(), i);
pageIds.add(ftsPage.hits().get(i).id());
}
List<Document> docs = documentRepository.findAllById(pageIds).stream()
.sorted(Comparator.comparingInt(d -> rankMap.getOrDefault(d.getId(), Integer.MAX_VALUE)))
.toList();
return buildResultPaged(docs, text, pageable, ftsPage.total());
}
private static <T> List<T> pageSlice(List<T> sorted, Pageable pageable) {
int from = Math.min((int) pageable.getOffset(), sorted.size());
int to = Math.min(from + pageable.getPageSize(), sorted.size());
@@ -1013,6 +1050,28 @@ public class DocumentService {
return result;
}
private static final int COL_ID = 0;
private static final int COL_RANK = 1;
private static final int COL_TOTAL = 2;
/**
* Maps raw Object[] rows from {@link DocumentRepository#findFtsPageRaw} to an
* {@link FtsPage}. Uses pattern-matching UUID cast to guard against driver-level
* type variance (some JDBC drivers return UUID as String).
*/
private static FtsPage toFtsPage(List<Object[]> rows) {
if (rows.isEmpty()) return new FtsPage(List.of(), 0);
long total = ((Number) rows.get(0)[COL_TOTAL]).longValue();
List<FtsHit> hits = rows.stream()
.map(r -> {
UUID id = r[COL_ID] instanceof UUID u ? u : UUID.fromString(r[COL_ID].toString());
double rank = ((Number) r[COL_RANK]).doubleValue();
return new FtsHit(id, rank);
})
.toList();
return new FtsPage(hits, total);
}
/** Clean text + highlight offsets parsed from a {@code ts_headline} sentinel-delimited string. */
public record ParsedHighlight(String cleanText, List<MatchOffset> offsets) {}

View File

@@ -0,0 +1,6 @@
package org.raddatz.familienarchiv.document;
import java.util.UUID;
/** A single document hit from a paginated FTS query — id and its ts_rank score. */
record FtsHit(UUID id, double rank) {}

View File

@@ -0,0 +1,6 @@
package org.raddatz.familienarchiv.document;
import java.util.List;
/** One page of FTS results — the ranked hit list for this page and the total match count. */
record FtsPage(List<FtsHit> hits, long total) {}

View File

@@ -27,7 +27,9 @@ public class CommentController {
// ─── Block (transcription) comments ────────────────────────────────────────
@GetMapping("/api/documents/{documentId}/transcription-blocks/{blockId}/comments")
public List<DocumentComment> getBlockComments(@PathVariable UUID blockId) {
public List<DocumentComment> getBlockComments(
@PathVariable UUID documentId,
@PathVariable UUID blockId) {
return commentService.getCommentsForBlock(blockId);
}
@@ -48,6 +50,7 @@ public class CommentController {
@RequirePermission({Permission.ANNOTATE_ALL, Permission.WRITE_ALL})
public DocumentComment replyToBlockComment(
@PathVariable UUID documentId,
@PathVariable UUID blockId,
@PathVariable UUID commentId,
@RequestBody CreateCommentDTO dto,
Authentication authentication) {

View File

@@ -0,0 +1,137 @@
package org.raddatz.familienarchiv.security;
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.Cookie;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletRequestWrapper;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.core.annotation.Order;
import org.springframework.http.HttpHeaders;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;
import java.io.IOException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Enumeration;
/**
* Promotes the {@code auth_token} cookie to an {@code Authorization} header
* so that browser-side requests to {@code /api/*} authenticate the same way
* SSR fetches do.
*
* <p>The SvelteKit login action stores the full HTTP Basic header value
* ({@code "Basic <base64>"}) in an HttpOnly cookie. SSR fetches from
* {@code hooks.server.ts} read the cookie and pass it explicitly as the
* {@code Authorization} header. In the dev environment, Vite's proxy does
* the same on every {@code /api/*} request (see {@code vite.config.ts}).
* In production, Caddy proxies {@code /api/*} straight to the backend and
* does NOT translate the cookie — so client-side {@code fetch} and
* {@code EventSource} calls reach the backend without auth, get
* {@code 401 WWW-Authenticate: Basic}, and the browser pops a native dialog.
*
* <p>This filter closes that gap: if a request has an {@code auth_token}
* cookie but no explicit {@code Authorization} header, promote the cookie
* value (URL-decoded) into the header before Spring Security inspects it.
* Explicit {@code Authorization} headers are preserved unchanged.
*
* <p>See #520. Filter runs at {@code Ordered.HIGHEST_PRECEDENCE} so it
* mutates the request before any Spring Security filter sees it.
*
* <p><b>Scope:</b> only {@code /api/*} requests are touched. The
* {@code /actuator/*} block in Caddy plus the open auth/reset paths in
* {@link SecurityConfig} must NOT receive a promoted Authorization.
*
* <p><b>⚠ Log-leakage warning:</b> the wrapped request exposes the
* Authorization header via {@code getHeaderNames}/{@code getHeaders}. Any
* filter or interceptor that iterates request headers will see the live
* Basic credential. Do NOT add a request-header logger downstream of this
* filter without explicitly scrubbing the {@code Authorization} field.
*/
@Component
@Order(org.springframework.core.Ordered.HIGHEST_PRECEDENCE)
public class AuthTokenCookieFilter extends OncePerRequestFilter {
static final String COOKIE_NAME = "auth_token";
static final String SCOPE_PREFIX = "/api/";
@Override
protected void doFilterInternal(HttpServletRequest request,
HttpServletResponse response,
FilterChain chain) throws ServletException, IOException {
// Scope: only /api/* needs cookie promotion. /actuator/health (open),
// /api/auth/forgot-password (open), /login etc. don't.
if (!request.getRequestURI().startsWith(SCOPE_PREFIX)) {
chain.doFilter(request, response);
return;
}
// An explicit Authorization header wins — this is the SSR fetch path
// (hooks.server.ts builds the header itself).
if (request.getHeader(HttpHeaders.AUTHORIZATION) != null) {
chain.doFilter(request, response);
return;
}
Cookie[] cookies = request.getCookies();
if (cookies == null) {
chain.doFilter(request, response);
return;
}
for (Cookie c : cookies) {
if (COOKIE_NAME.equals(c.getName()) && c.getValue() != null && !c.getValue().isBlank()) {
String decoded;
try {
decoded = URLDecoder.decode(c.getValue(), StandardCharsets.UTF_8);
} catch (IllegalArgumentException malformed) {
// Malformed percent-encoding — refuse to forward a bogus
// Authorization header. Spring Security will treat the
// request as unauthenticated.
chain.doFilter(request, response);
return;
}
chain.doFilter(new AuthHeaderRequest(request, decoded), response);
return;
}
}
chain.doFilter(request, response);
}
/**
* Adds (or overrides) the {@code Authorization} header on a wrapped request.
* All other headers pass through unchanged.
*/
static final class AuthHeaderRequest extends HttpServletRequestWrapper {
private final String authorization;
AuthHeaderRequest(HttpServletRequest request, String authorization) {
super(request);
this.authorization = authorization;
}
@Override
public String getHeader(String name) {
if (HttpHeaders.AUTHORIZATION.equalsIgnoreCase(name)) {
return authorization;
}
return super.getHeader(name);
}
@Override
public Enumeration<String> getHeaders(String name) {
if (HttpHeaders.AUTHORIZATION.equalsIgnoreCase(name)) {
return Collections.enumeration(Collections.singletonList(authorization));
}
return super.getHeaders(name);
}
@Override
public Enumeration<String> getHeaderNames() {
Enumeration<String> base = super.getHeaderNames();
java.util.Set<String> names = new java.util.LinkedHashSet<>();
while (base.hasMoreElements()) names.add(base.nextElement());
names.add(HttpHeaders.AUTHORIZATION);
return Collections.enumeration(names);
}
}
}

View File

@@ -37,12 +37,20 @@ public class SecurityConfig {
@Bean
public SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {
http
// CSRF is intentionally disabled: every request from the SvelteKit frontend
// carries an explicit Authorization header (Basic Auth token injected by
// hooks.server.ts). Browsers block cross-origin requests from setting custom
// headers, so cross-site request forgery via a third-party page is not
// possible with this auth scheme. If the auth model ever changes to
// cookie-based sessions, CSRF protection must be re-enabled.
// CSRF is intentionally disabled. With the cookie-promotion model
// (auth_token cookie → Authorization header via AuthTokenCookieFilter,
// see #520), every authenticated request to /api/* now carries the
// credential automatically once the cookie is set. The CSRF defence
// for state-changing endpoints is therefore LOAD-BEARING on:
//
// 1. SameSite=strict on the auth_token cookie (login/+page.server.ts).
// A cross-site POST from evil.com cannot include the cookie.
// 2. CORS — Spring's default rejects cross-origin requests with
// credentials unless explicitly allowed (no allowedOrigins config).
//
// If either of those is ever weakened (e.g. cookie flipped to
// SameSite=lax, CORS allowedOrigins expanded), CSRF protection
// MUST be re-enabled here.
.csrf(csrf -> csrf.disable())
.authorizeHttpRequests(auth -> {

View File

@@ -88,7 +88,8 @@ public class AppUser {
};
public static String computeColor(UUID id) {
return PALETTE[Math.abs(id.hashCode()) % PALETTE.length];
// Math.floorMod avoids the Integer.MIN_VALUE overflow trap in Math.abs(hashCode())
return PALETTE[Math.floorMod(id.hashCode(), PALETTE.length)];
}
@PrePersist

View File

@@ -20,6 +20,7 @@ import org.springframework.boot.CommandLineRunner;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Profile;
import org.springframework.core.env.Environment;
import org.springframework.security.crypto.password.PasswordEncoder;
import java.time.LocalDate;
@@ -31,26 +32,51 @@ import java.util.Set;
@DependsOn("flyway")
public class UserDataInitializer {
@Value("${app.admin.email:admin@familyarchive.local}")
static final String DEFAULT_ADMIN_EMAIL = "admin@familienarchiv.local";
static final String DEFAULT_ADMIN_PASSWORD = "admin123";
@Value("${app.admin.email:" + DEFAULT_ADMIN_EMAIL + "}")
private String adminEmail;
@Value("${app.admin.password:admin123}")
@Value("${app.admin.password:" + DEFAULT_ADMIN_PASSWORD + "}")
private String adminPassword;
private final AppUserRepository userRepository;
private final UserGroupRepository groupRepository;
private final Environment environment;
@Bean
public CommandLineRunner initAdminUser(PasswordEncoder passwordEncoder) {
return args -> {
if (userRepository.findByEmail(adminEmail).isEmpty()) {
// Fail-closed in production: refuse to seed with the well-known
// defaults. Otherwise an operator who forgets APP_ADMIN_USERNAME
// / APP_ADMIN_PASSWORD locks production to admin@…/admin123 PERMANENTLY
// (UserDataInitializer only seeds when the row is missing — see #513).
// Allowed in dev/test/e2e because those run without secrets configured.
boolean isLocalProfile = environment.matchesProfiles("dev", "test", "e2e");
if (!isLocalProfile
&& (DEFAULT_ADMIN_EMAIL.equals(adminEmail)
|| DEFAULT_ADMIN_PASSWORD.equals(adminPassword))) {
throw new IllegalStateException(
"Refusing to seed admin user with default credentials outside "
+ "the dev/test/e2e profiles. Set APP_ADMIN_USERNAME and "
+ "APP_ADMIN_PASSWORD to non-default values before first boot — "
+ "this lock-in is permanent."
);
}
log.info("Kein Admin-User '{}' gefunden. Erstelle Default-Admin...", adminEmail);
UserGroup adminGroup = UserGroup.builder()
.name("Administrators")
.permissions(Set.of("ADMIN", "READ_ALL", "WRITE_ALL", "ANNOTATE_ALL", "ADMIN_USER", "ADMIN_TAG", "ADMIN_PERMISSION"))
.build();
groupRepository.save(adminGroup);
// Reuse the Administrators group if it already exists (e.g. a
// previous boot seeded the group but failed before creating
// the admin user, or the operator deleted just the user row
// to retry the seed with a new email). Blind-INSERTing would
// violate user_groups_name_key and abort the context. See #518.
UserGroup adminGroup = groupRepository.findByName("Administrators")
.orElseGet(() -> groupRepository.save(UserGroup.builder()
.name("Administrators")
.permissions(Set.of("ADMIN", "READ_ALL", "WRITE_ALL", "ANNOTATE_ALL", "ADMIN_USER", "ADMIN_TAG", "ADMIN_PERMISSION"))
.build()));
AppUser admin = AppUser.builder()
.email(adminEmail)

View File

@@ -271,9 +271,10 @@ public class UserService {
@Transactional
public UserGroup createGroup(GroupDTO dto) {
UserGroup group = new UserGroup();
group.setName(dto.getName());
group.setPermissions(dto.getPermissions());
UserGroup group = UserGroup.builder()
.name(dto.getName())
.permissions(dto.getPermissions() != null ? dto.getPermissions() : new HashSet<>())
.build();
return groupRepository.save(group);
}

View File

@@ -38,6 +38,12 @@ spring:
starttls:
enable: true
server:
# Behind Caddy/reverse proxy: trust X-Forwarded-{Proto,For,Host} so that
# request.getScheme(), redirect URLs, and Spring Session "Secure" cookies
# reflect the original https client request, not the http hop from Caddy.
forward-headers-strategy: native
management:
health:
mail:
@@ -63,7 +69,11 @@ app:
from: ${APP_MAIL_FROM:noreply@familienarchiv.local}
admin:
username: ${APP_ADMIN_USERNAME:admin}
# Key must be `email`, not `username` — UserDataInitializer reads
# `${app.admin.email:...}`. The env-var name stays APP_ADMIN_USERNAME
# to match the existing Gitea secrets and DEPLOYMENT.md §3.3.
# See #513.
email: ${APP_ADMIN_USERNAME:admin@familienarchiv.local}
password: ${APP_ADMIN_PASSWORD:admin123}
import:

View File

@@ -0,0 +1,8 @@
-- Speeds up "documents by sender" queries used on /persons/[id] Korrespondenz-Überblick (#306),
-- /briefwechsel, and bulk-edit flows.
CREATE INDEX IF NOT EXISTS idx_documents_sender_id
ON documents(sender_id);
-- Speeds up "comments by author" queries on admin user detail and (future) contributor profile.
CREATE INDEX IF NOT EXISTS idx_comments_author_id
ON document_comments(author_id);

View File

@@ -0,0 +1,7 @@
-- Remove duplicate (group_id, permission) rows that accumulated without a UNIQUE constraint.
-- Keeps the row with the smallest ctid (earliest physical insertion order).
DELETE FROM group_permissions a
USING group_permissions b
WHERE a.ctid < b.ctid
AND a.group_id = b.group_id
AND a.permission = b.permission;

View File

@@ -0,0 +1,11 @@
-- Add NOT NULL and PRIMARY KEY to group_permissions.
-- Requires V63 to have run first (no duplicates can remain).
--
-- After this migration, future seed migrations can use:
-- INSERT INTO group_permissions ... ON CONFLICT DO NOTHING
-- instead of the INSERT ... WHERE NOT EXISTS pattern used before V64.
ALTER TABLE group_permissions
ALTER COLUMN permission SET NOT NULL;
ALTER TABLE group_permissions
ADD CONSTRAINT pk_group_permissions PRIMARY KEY (group_id, permission);

View File

@@ -0,0 +1,8 @@
-- Promote the de-facto unique constraint on transcription_block_mentioned_persons to a named PK.
-- uq_tbmp_block_person (added in V57) is backed by a B-tree index identical to a PK;
-- this rename makes the naming convention explicit (pk_* vs uq_*).
ALTER TABLE transcription_block_mentioned_persons
DROP CONSTRAINT uq_tbmp_block_person;
ALTER TABLE transcription_block_mentioned_persons
ADD CONSTRAINT pk_tbmp PRIMARY KEY (block_id, person_id);

View File

@@ -399,6 +399,86 @@ class MigrationIntegrationTest {
AND dc.annotation_id IS NOT NULL
""";
// ─── V62: indexes on FK columns ──────────────────────────────────────────
@Test
void v62_idx_documents_sender_id_exists() {
Integer count = jdbc.queryForObject(
"SELECT COUNT(*) FROM pg_catalog.pg_indexes WHERE tablename = 'documents' AND indexname = 'idx_documents_sender_id'",
Integer.class);
assertThat(count).isEqualTo(1);
}
@Test
void v62_idx_comments_author_id_exists() {
Integer count = jdbc.queryForObject(
"SELECT COUNT(*) FROM pg_catalog.pg_indexes WHERE tablename = 'document_comments' AND indexname = 'idx_comments_author_id'",
Integer.class);
assertThat(count).isEqualTo(1);
}
// ─── V63+V64: group_permissions dedup + primary key ──────────────────────
@Test
void v64_pk_group_permissions_exists() {
Integer count = jdbc.queryForObject(
"""
SELECT COUNT(*) FROM pg_catalog.pg_constraint c
JOIN pg_catalog.pg_class t ON c.conrelid = t.oid
WHERE t.relname = 'group_permissions'
AND c.conname = 'pk_group_permissions'
AND c.contype = 'p'
""",
Integer.class);
assertThat(count).isEqualTo(1);
}
@Test
void v64_permission_column_isNotNullable() {
Integer count = jdbc.queryForObject(
"""
SELECT COUNT(*) FROM information_schema.columns
WHERE table_schema = 'public'
AND table_name = 'group_permissions'
AND column_name = 'permission'
AND is_nullable = 'NO'
""",
Integer.class);
assertThat(count).isEqualTo(1);
}
@Test
@Transactional(propagation = Propagation.NOT_SUPPORTED)
void v64_rejectsDuplicateGroupPermission() {
UUID groupId = createUserGroup("DuplicateTestGroup-" + UUID.randomUUID());
try {
jdbc.update("INSERT INTO group_permissions (group_id, permission) VALUES (?, 'READ_ALL')", groupId);
assertThatThrownBy(() ->
jdbc.update("INSERT INTO group_permissions (group_id, permission) VALUES (?, 'READ_ALL')", groupId)
).isInstanceOf(DataIntegrityViolationException.class);
} finally {
jdbc.update("DELETE FROM group_permissions WHERE group_id = ?", groupId);
jdbc.update("DELETE FROM user_groups WHERE id = ?", groupId);
}
}
// ─── V65: tbmp UNIQUE promoted to PRIMARY KEY ─────────────────────────────
@Test
void v65_pk_tbmp_exists() {
Integer count = jdbc.queryForObject(
"""
SELECT COUNT(*) FROM pg_catalog.pg_constraint c
JOIN pg_catalog.pg_class t ON c.conrelid = t.oid
WHERE t.relname = 'transcription_block_mentioned_persons'
AND c.conname = 'pk_tbmp'
AND c.contype = 'p'
""",
Integer.class);
assertThat(count).isEqualTo(1);
}
// ─── helpers ─────────────────────────────────────────────────────────────
private UUID createPerson(String firstName, String lastName) {
@@ -482,4 +562,10 @@ class MigrationIntegrationTest {
""", id, recipientId, docId, commentId);
return id;
}
private UUID createUserGroup(String name) {
UUID id = UUID.randomUUID();
jdbc.update("INSERT INTO user_groups (id, name) VALUES (?, ?)", id, name);
return id;
}
}

View File

@@ -0,0 +1,48 @@
package org.raddatz.familienarchiv.config;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.config.YamlPropertiesFactoryBean;
import org.springframework.boot.web.server.autoconfigure.ServerProperties.ForwardHeadersStrategy;
import org.springframework.boot.context.properties.bind.Binder;
import org.springframework.boot.context.properties.source.ConfigurationPropertySources;
import org.springframework.core.env.PropertiesPropertySource;
import org.springframework.core.io.ClassPathResource;
import java.util.Properties;
import static org.assertj.core.api.Assertions.assertThat;
/**
* Binds {@code server.forward-headers-strategy} from {@code application.yaml} into
* Spring Boot's typed {@link ForwardHeadersStrategy} enum. The binder rejects any
* value that is not a valid enum constant ({@code BindException}), so a typo
* ({@code "nativ"}, {@code "Native"}, {@code "framework "}) or a future Spring
* rename of the property fails the test, not silently degrades to {@code NONE}.
*
* <p>No Spring context, no embedded server, no Testcontainers — this is the
* cheapest test that pins the contract "Caddy's X-Forwarded-Proto is trusted".
*/
class ForwardHeadersConfigurationTest {
@Test
void forward_headers_strategy_binds_to_NATIVE() {
YamlPropertiesFactoryBean yaml = new YamlPropertiesFactoryBean();
yaml.setResources(new ClassPathResource("application.yaml"));
Properties props = yaml.getObject();
assertThat(props).as("application.yaml must be on the classpath").isNotNull();
Binder binder = new Binder(ConfigurationPropertySources.from(
new PropertiesPropertySource("application", props)));
ForwardHeadersStrategy strategy = binder
.bind("server.forward-headers-strategy", ForwardHeadersStrategy.class)
.orElseThrow(() -> new AssertionError(
"server.forward-headers-strategy is missing from application.yaml"));
assertThat(strategy)
.as("Spring must trust X-Forwarded-Proto from Caddy so that "
+ "request.getScheme(), redirect URLs, and the Spring Session "
+ "'Secure' cookie reflect the original https client request.")
.isEqualTo(ForwardHeadersStrategy.NATIVE);
}
}

View File

@@ -0,0 +1,109 @@
package org.raddatz.familienarchiv.document;
import jakarta.persistence.EntityManager;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.raddatz.familienarchiv.PostgresContainerConfig;
import org.raddatz.familienarchiv.config.FlywayConfig;
import org.raddatz.familienarchiv.document.DocumentRepository;
import org.raddatz.familienarchiv.document.Document;
import org.raddatz.familienarchiv.document.DocumentStatus;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.data.jpa.test.autoconfigure.DataJpaTest;
import org.springframework.boot.jdbc.test.autoconfigure.AutoConfigureTestDatabase;
import org.springframework.context.annotation.Import;
import org.springframework.test.annotation.DirtiesContext;
import java.util.List;
import java.util.UUID;
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatNoException;
/**
* Repository-level integration tests for {@code findFtsPageRaw}: verifies that the
* paginated FTS query returns exactly page-size rows and that the window-function
* total reflects the full match count, not just the page count.
*
* <p>Uses real Postgres via Testcontainers so the GIN index, tsvector trigger, and
* {@code websearch_to_tsquery} semantics are identical to production.
*
* <p>{@code AFTER_CLASS} dirty-context keeps the Spring context alive for all tests
* in this class and rebuilds it once at the end, rather than after every test.
*/
@DataJpaTest
@AutoConfigureTestDatabase(replace = AutoConfigureTestDatabase.Replace.NONE)
@Import({PostgresContainerConfig.class, FlywayConfig.class})
@DirtiesContext(classMode = DirtiesContext.ClassMode.AFTER_CLASS)
class DocumentFtsPagedIntegrationTest {
@Autowired DocumentRepository documentRepository;
@Autowired EntityManager em;
// 60 docs match "Walter"; 10 docs with "Hans" do not.
private static final int WALTER_COUNT = 60;
private static final int PAGE_SIZE = 50;
@BeforeEach
void seed() {
documentRepository.deleteAll();
em.flush();
for (int i = 0; i < WALTER_COUNT; i++) {
documentRepository.saveAndFlush(doc("Brief von Walter Nr. " + i));
}
for (int i = 0; i < 10; i++) {
documentRepository.saveAndFlush(doc("Brief von Hans Nr. " + i));
}
em.clear();
}
@Test
void findFtsPageRaw_firstPage_returnsPageSizeRows() {
List<Object[]> rows = documentRepository.findFtsPageRaw("Walter", 0, PAGE_SIZE);
assertThat(rows).hasSize(PAGE_SIZE);
}
@Test
void findFtsPageRaw_windowTotal_equalsFullMatchCount_notPageSize() {
List<Object[]> rows = documentRepository.findFtsPageRaw("Walter", 0, PAGE_SIZE);
long total = ((Number) rows.get(0)[2]).longValue();
assertThat(total).isEqualTo(WALTER_COUNT);
}
@Test
void findFtsPageRaw_lastPage_returnsRemainder() {
int remainder = WALTER_COUNT % PAGE_SIZE; // 60 % 50 = 10
List<Object[]> rows = documentRepository.findFtsPageRaw("Walter", PAGE_SIZE, PAGE_SIZE);
assertThat(rows).hasSize(remainder);
long total = ((Number) rows.get(0)[2]).longValue();
assertThat(total).isEqualTo(WALTER_COUNT);
}
@Test
void findFtsPageRaw_noMatches_returnsEmptyList() {
List<Object[]> rows = documentRepository.findFtsPageRaw("XYZ_KEIN_TREFFER", 0, PAGE_SIZE);
assertThat(rows).isEmpty();
}
@Test
void findFtsPageRaw_stopwordOnlyQuery_returnsEmptyList_noException() {
assertThatNoException().isThrownBy(() -> {
List<Object[]> rows = documentRepository.findFtsPageRaw("der die das und", 0, PAGE_SIZE);
assertThat(rows).isEmpty();
});
}
// ─── Helper ───────────────────────────────────────────────────────────────
private Document doc(String title) {
return Document.builder()
.title(title)
.originalFilename(title.replace(" ", "_") + ".pdf")
.status(DocumentStatus.UPLOADED)
.build();
}
}

View File

@@ -69,7 +69,7 @@ class DocumentFtsTest {
documentRepository.saveAndFlush(document("Alter Brief"));
em.clear();
List<UUID> ids = documentRepository.findRankedIdsByFts("Brief");
List<UUID> ids = documentRepository.findAllMatchingIdsByFts("Brief");
assertThat(ids).hasSize(1);
}
@@ -79,7 +79,7 @@ class DocumentFtsTest {
documentRepository.saveAndFlush(document("Alter Brief"));
em.clear();
List<UUID> ids = documentRepository.findRankedIdsByFts("Briefe");
List<UUID> ids = documentRepository.findAllMatchingIdsByFts("Briefe");
assertThat(ids).hasSize(1);
}
@@ -89,7 +89,7 @@ class DocumentFtsTest {
documentRepository.saveAndFlush(document("Ein furchtbarer Brief"));
em.clear();
List<UUID> ids = documentRepository.findRankedIdsByFts("furchtb");
List<UUID> ids = documentRepository.findAllMatchingIdsByFts("furchtb");
assertThat(ids).hasSize(1);
}
@@ -99,7 +99,7 @@ class DocumentFtsTest {
documentRepository.saveAndFlush(document("Familienfoto"));
em.clear();
List<UUID> ids = documentRepository.findRankedIdsByFts("Brief");
List<UUID> ids = documentRepository.findAllMatchingIdsByFts("Brief");
assertThat(ids).isEmpty();
}
@@ -115,7 +115,7 @@ class DocumentFtsTest {
em.flush();
em.clear();
List<UUID> ids = documentRepository.findRankedIdsByFts("schreiben");
List<UUID> ids = documentRepository.findAllMatchingIdsByFts("schreiben");
assertThat(ids).contains(doc.getId());
}
@@ -125,14 +125,14 @@ class DocumentFtsTest {
Document doc = documentRepository.saveAndFlush(document("Leeres Dokument"));
em.clear();
assertThat(documentRepository.findRankedIdsByFts("Grundbuch")).isEmpty();
assertThat(documentRepository.findAllMatchingIdsByFts("Grundbuch")).isEmpty();
UUID annotationId = annotation(doc.getId());
blockRepository.saveAndFlush(block(doc.getId(), annotationId, "Grundbuch Eintrag 1923", 0));
em.flush();
em.clear();
assertThat(documentRepository.findRankedIdsByFts("Grundbuch")).contains(doc.getId());
assertThat(documentRepository.findAllMatchingIdsByFts("Grundbuch")).contains(doc.getId());
}
@Test
@@ -144,13 +144,13 @@ class DocumentFtsTest {
em.flush();
em.clear();
assertThat(documentRepository.findRankedIdsByFts("Grundbuch")).contains(doc.getId());
assertThat(documentRepository.findAllMatchingIdsByFts("Grundbuch")).contains(doc.getId());
blockRepository.deleteById(block.getId());
em.flush();
em.clear();
assertThat(documentRepository.findRankedIdsByFts("Grundbuch")).doesNotContain(doc.getId());
assertThat(documentRepository.findAllMatchingIdsByFts("Grundbuch")).doesNotContain(doc.getId());
}
// ─── Ranking ───────────────────────────────────────────────────────────────
@@ -166,7 +166,7 @@ class DocumentFtsTest {
em.flush();
em.clear();
List<UUID> ids = documentRepository.findRankedIdsByFts("Grundbuch");
List<UUID> ids = documentRepository.findAllMatchingIdsByFts("Grundbuch");
assertThat(ids).hasSize(2);
assertThat(ids.get(0)).isEqualTo(docA.getId());
@@ -179,7 +179,7 @@ class DocumentFtsTest {
documentRepository.saveAndFlush(document("Ein Brief von der Oma"));
em.clear();
List<UUID> ids = documentRepository.findRankedIdsByFts("der die das und");
List<UUID> ids = documentRepository.findAllMatchingIdsByFts("der die das und");
assertThat(ids).isEmpty();
}
@@ -195,7 +195,7 @@ class DocumentFtsTest {
em.flush();
em.clear();
List<UUID> ids = documentRepository.findRankedIdsByFts("Wille");
List<UUID> ids = documentRepository.findAllMatchingIdsByFts("Wille");
assertThat(ids).contains(doc.getId());
}
@@ -205,7 +205,7 @@ class DocumentFtsTest {
documentRepository.saveAndFlush(document("Brief"));
em.clear();
assertThatNoException().isThrownBy(() -> documentRepository.findRankedIdsByFts("((("));
assertThatNoException().isThrownBy(() -> documentRepository.findAllMatchingIdsByFts("((("));
}
// ─── Weight C: sender/receiver names ───────────────────────────────────────
@@ -223,7 +223,7 @@ class DocumentFtsTest {
em.flush();
em.clear();
List<UUID> ids = documentRepository.findRankedIdsByFts("Schmidt");
List<UUID> ids = documentRepository.findAllMatchingIdsByFts("Schmidt");
assertThat(ids).contains(doc.getId());
}
@@ -241,7 +241,7 @@ class DocumentFtsTest {
em.flush();
em.clear();
List<UUID> ids = documentRepository.findRankedIdsByFts("Raddatz");
List<UUID> ids = documentRepository.findAllMatchingIdsByFts("Raddatz");
assertThat(ids).contains(doc.getId());
}
@@ -260,7 +260,7 @@ class DocumentFtsTest {
em.flush();
em.clear();
List<UUID> ids = documentRepository.findRankedIdsByFts("Familiengeschichte");
List<UUID> ids = documentRepository.findAllMatchingIdsByFts("Familiengeschichte");
assertThat(ids).hasSize(1);
}
@@ -278,7 +278,7 @@ class DocumentFtsTest {
em.flush();
em.clear();
List<UUID> rankedIds = documentRepository.findRankedIdsByFts("Grundbuch");
List<UUID> rankedIds = documentRepository.findAllMatchingIdsByFts("Grundbuch");
Specification<Document> spec = Specification.where(hasIds(rankedIds))
.and(hasStatus(DocumentStatus.UPLOADED));

View File

@@ -21,17 +21,22 @@ import org.springframework.data.domain.Pageable;
import org.springframework.data.jpa.domain.Specification;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import static org.assertj.core.api.Assertions.assertThat;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.ArgumentMatchers.anyInt;
import static org.mockito.ArgumentMatchers.anyString;
import static org.mockito.Mockito.never;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;
@ExtendWith(MockitoExtension.class)
class DocumentServiceSortTest {
private static final Pageable UNPAGED = org.springframework.data.domain.PageRequest.of(0, 10_000);
private static final Pageable PAGE = org.springframework.data.domain.PageRequest.of(0, 10_000);
@Mock DocumentRepository documentRepository;
@Mock PersonService personService;
@@ -43,12 +48,12 @@ class DocumentServiceSortTest {
@Mock TranscriptionBlockQueryService transcriptionBlockQueryService;
@InjectMocks DocumentService documentService;
// ─── searchDocuments — DATE sort ──────────────────────────────────────────
// ─── DATE sort ────────────────────────────────────────────────────────────
@Test
void searchDocuments_with_DATE_sort_and_text_sorts_chronologically_not_by_relevance() {
UUID id1 = UUID.randomUUID(); // rank position 0 (higher relevance, older doc)
UUID id2 = UUID.randomUUID(); // rank position 1 (lower relevance, newer doc)
UUID id1 = UUID.randomUUID(); // higher relevance, older doc
UUID id2 = UUID.randomUUID(); // lower relevance, newer doc
Document older = Document.builder().id(id1)
.title("Brief").status(DocumentStatus.UPLOADED)
@@ -57,38 +62,48 @@ class DocumentServiceSortTest {
.title("Brief").status(DocumentStatus.UPLOADED)
.documentDate(LocalDate.of(1960, 1, 1)).build();
// FTS returns id1 first (higher rank), id2 second
when(documentRepository.findRankedIdsByFts("Brief")).thenReturn(List.of(id1, id2));
// findAll(spec, pageable) — the correct date path — returns date-DESC order
when(documentRepository.findAllMatchingIdsByFts("Brief")).thenReturn(List.of(id1, id2));
when(documentRepository.findAll(any(Specification.class), any(Pageable.class)))
.thenReturn(new PageImpl<>(List.of(newer, older)));
DocumentSearchResult result = documentService.searchDocuments(
"Brief", null, null, null, null, null, null, null, DocumentSort.DATE, "DESC", null, UNPAGED);
"Brief", null, null, null, null, null, null, null, DocumentSort.DATE, "DESC", null, PAGE);
// Expect: date order (newer 1960 first), NOT rank order (older 1940 first)
assertThat(result.items()).hasSize(2);
assertThat(result.items().get(0).document().getId()).isEqualTo(id2); // newer doc first
assertThat(result.items().get(0).document().getId()).isEqualTo(id2); // newer first
}
// ─── searchDocuments — RELEVANCE sort ─────────────────────────────────────
// ─── RELEVANCE sort — pure text (no filters) ──────────────────────────────
@Test
void searchDocuments_relevance_pureText_calls_findFtsPageRaw_not_findAllMatchingIds() {
UUID id1 = UUID.randomUUID();
List<Object[]> ftsRows = ftsRows(id1, 0.5d, 1L);
when(documentRepository.findFtsPageRaw(anyString(), anyInt(), anyInt())).thenReturn(ftsRows);
when(documentRepository.findAllById(any()))
.thenReturn(List.of(doc(id1)));
documentService.searchDocuments(
"Brief", null, null, null, null, null, null, null, DocumentSort.RELEVANCE, null, null, PAGE);
verify(documentRepository).findFtsPageRaw(anyString(), anyInt(), anyInt());
verify(documentRepository, never()).findAllMatchingIdsByFts(anyString());
}
@Test
void searchDocuments_with_RELEVANCE_sort_and_text_preserves_fts_rank_order() {
UUID id1 = UUID.randomUUID(); // rank position 0
UUID id2 = UUID.randomUUID(); // rank position 1
UUID id1 = UUID.randomUUID(); // higher rank — must appear first
UUID id2 = UUID.randomUUID(); // lower rank
Document doc1 = Document.builder().id(id1).title("Brief").status(DocumentStatus.UPLOADED).build();
Document doc2 = Document.builder().id(id2).title("Brief").status(DocumentStatus.UPLOADED).build();
when(documentRepository.findRankedIdsByFts("Brief")).thenReturn(List.of(id1, id2));
when(documentRepository.findAll(any(Specification.class)))
.thenReturn(List.of(doc2, doc1)); // unordered from DB
List<Object[]> ftsRows = new ArrayList<>();
ftsRows.add(new Object[]{id1, 0.8d, 2L});
ftsRows.add(new Object[]{id2, 0.3d, 2L});
when(documentRepository.findFtsPageRaw(anyString(), anyInt(), anyInt())).thenReturn(ftsRows);
when(documentRepository.findAllById(any())).thenReturn(List.of(doc(id2), doc(id1))); // unordered from JPA
DocumentSearchResult result = documentService.searchDocuments(
"Brief", null, null, null, null, null, null, null, DocumentSort.RELEVANCE, null, null, UNPAGED);
"Brief", null, null, null, null, null, null, null, DocumentSort.RELEVANCE, null, null, PAGE);
// Expect: rank order restored (id1 first)
assertThat(result.items().get(0).document().getId()).isEqualTo(id1);
}
@@ -97,16 +112,82 @@ class DocumentServiceSortTest {
UUID id1 = UUID.randomUUID();
UUID id2 = UUID.randomUUID();
Document doc1 = Document.builder().id(id1).title("Brief").status(DocumentStatus.UPLOADED).build();
Document doc2 = Document.builder().id(id2).title("Brief").status(DocumentStatus.UPLOADED).build();
when(documentRepository.findRankedIdsByFts("Brief")).thenReturn(List.of(id1, id2));
when(documentRepository.findAll(any(Specification.class)))
.thenReturn(List.of(doc2, doc1));
List<Object[]> ftsRows = new ArrayList<>();
ftsRows.add(new Object[]{id1, 0.8d, 2L});
ftsRows.add(new Object[]{id2, 0.3d, 2L});
when(documentRepository.findFtsPageRaw(anyString(), anyInt(), anyInt())).thenReturn(ftsRows);
when(documentRepository.findAllById(any())).thenReturn(List.of(doc(id2), doc(id1)));
DocumentSearchResult result = documentService.searchDocuments(
"Brief", null, null, null, null, null, null, null, null, null, null, UNPAGED);
"Brief", null, null, null, null, null, null, null, null, null, null, PAGE);
assertThat(result.items().get(0).document().getId()).isEqualTo(id1);
}
// ─── RELEVANCE sort — overflow guard ─────────────────────────────────────
@Test
void searchDocuments_relevance_returns_empty_when_offset_exceeds_maxInt() {
// offset = pageNumber * pageSize; choose values so offset > Integer.MAX_VALUE
Pageable hugePage = org.springframework.data.domain.PageRequest.of(Integer.MAX_VALUE / 10 + 1, 10);
DocumentSearchResult result = documentService.searchDocuments(
"Brief", null, null, null, null, null, null, null,
DocumentSort.RELEVANCE, null, null, hugePage);
assertThat(result.items()).isEmpty();
verify(documentRepository, never()).findFtsPageRaw(anyString(), anyInt(), anyInt());
}
// ─── toFtsPage — UUID-as-String JDBC driver variance ────────────────────
@Test
void searchDocuments_relevance_handles_string_uuid_from_jdbc_driver() {
String stringId = "11111111-1111-1111-1111-111111111111";
UUID uuidId = UUID.fromString(stringId);
// Simulate a JDBC driver that returns the id column as String instead of UUID
List<Object[]> ftsRows = new ArrayList<>();
ftsRows.add(new Object[]{stringId, 0.5d, 1L});
when(documentRepository.findFtsPageRaw(anyString(), anyInt(), anyInt())).thenReturn(ftsRows);
when(documentRepository.findAllById(any())).thenReturn(List.of(doc(uuidId)));
DocumentSearchResult result = documentService.searchDocuments(
"Brief", null, null, null, null, null, null, null,
DocumentSort.RELEVANCE, null, null, PAGE);
assertThat(result.items()).hasSize(1);
assertThat(result.items().get(0).document().getId()).isEqualTo(uuidId);
}
// ─── RELEVANCE sort — text + active filter ────────────────────────────────
@Test
void searchDocuments_relevance_with_active_filter_uses_inMemory_path() {
UUID id1 = UUID.randomUUID();
UUID id2 = UUID.randomUUID();
when(documentRepository.findAllMatchingIdsByFts("Brief")).thenReturn(List.of(id1, id2));
when(documentRepository.findAll(any(Specification.class)))
.thenReturn(List.of(doc(id2), doc(id1)));
// sender filter is active → triggers in-memory path, not findFtsPageRaw
LocalDate from = LocalDate.of(1900, 1, 1);
documentService.searchDocuments(
"Brief", from, null, null, null, null, null, null, DocumentSort.RELEVANCE, null, null, PAGE);
verify(documentRepository, never()).findFtsPageRaw(anyString(), anyInt(), anyInt());
verify(documentRepository).findAllMatchingIdsByFts("Brief");
}
// ─── Helpers ──────────────────────────────────────────────────────────────
private static Document doc(UUID id) {
return Document.builder().id(id).title("Brief").status(DocumentStatus.UPLOADED).build();
}
private static List<Object[]> ftsRows(UUID id, double rank, long total) {
List<Object[]> rows = new ArrayList<>();
rows.add(new Object[]{id, rank, total});
return rows;
}
}

View File

@@ -1620,9 +1620,10 @@ class DocumentServiceTest {
// chr(1)=\u0001 marks start, chr(2)=\u0002 marks end of highlighted term
List<Object[]> rows = Collections.singletonList(new Object[]{docId, "\u0001Brief\u0002 an Anna", null, false, null, null, null});
when(documentRepository.findRankedIdsByFts("Brief")).thenReturn(List.of(docId));
when(documentRepository.findAll(any(org.springframework.data.jpa.domain.Specification.class)))
.thenReturn(List.of(doc));
List<Object[]> ftsRows = new java.util.ArrayList<>();
ftsRows.add(new Object[]{docId, 0.5d, 1L});
when(documentRepository.findFtsPageRaw(anyString(), anyInt(), anyInt())).thenReturn(ftsRows);
when(documentRepository.findAllById(any())).thenReturn(List.of(doc));
when(documentRepository.findEnrichmentData(any(), eq("Brief"))).thenReturn(rows);
DocumentSearchResult result = documentService.searchDocuments(
@@ -1654,9 +1655,10 @@ class DocumentServiceTest {
String snippetHeadline = "Hier ist der \u0001Brief\u0002 aus Berlin";
List<Object[]> rows = Collections.singletonList(new Object[]{docId, "Dok", snippetHeadline, false, null, null, null});
when(documentRepository.findRankedIdsByFts("Brief")).thenReturn(List.of(docId));
when(documentRepository.findAll(any(org.springframework.data.jpa.domain.Specification.class)))
.thenReturn(List.of(doc));
List<Object[]> snippetFtsRows = new java.util.ArrayList<>();
snippetFtsRows.add(new Object[]{docId, 0.5d, 1L});
when(documentRepository.findFtsPageRaw(anyString(), anyInt(), anyInt())).thenReturn(snippetFtsRows);
when(documentRepository.findAllById(any())).thenReturn(List.of(doc));
when(documentRepository.findEnrichmentData(any(), eq("Brief"))).thenReturn(rows);
DocumentSearchResult result = documentService.searchDocuments(
@@ -2202,7 +2204,7 @@ class DocumentServiceTest {
@Test
void findIdsForFilter_returnsEmpty_whenFtsHasNoMatches() {
when(documentRepository.findRankedIdsByFts("xyz")).thenReturn(List.of());
when(documentRepository.findAllMatchingIdsByFts("xyz")).thenReturn(List.of());
List<UUID> result = documentService.findIdsForFilter(
"xyz", null, null, null, null, null, null, null, null);
@@ -2386,7 +2388,7 @@ class DocumentServiceTest {
@Test
void getDensity_shortCircuits_whenFtsReturnsNoMatches() {
when(documentRepository.findRankedIdsByFts("xyz")).thenReturn(List.of());
when(documentRepository.findAllMatchingIdsByFts("xyz")).thenReturn(List.of());
DocumentDensityResult result = documentService.getDensity(
new DensityFilters("xyz", null, null, null, null, null, null));

View File

@@ -10,6 +10,7 @@ import org.raddatz.familienarchiv.document.DocumentStatus;
import org.raddatz.familienarchiv.document.DocumentRepository;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.ActiveProfiles;
import org.springframework.context.annotation.Import;
import org.springframework.test.context.DynamicPropertyRegistry;
import org.springframework.test.context.DynamicPropertySource;
@@ -41,6 +42,7 @@ import static org.assertj.core.api.Assertions.assertThat;
* test pyramid mocks at the FileService boundary.
*/
@SpringBootTest
@ActiveProfiles("test")
@Import(PostgresContainerConfig.class)
class ThumbnailServiceIntegrationTest {

View File

@@ -44,6 +44,14 @@ class CommentControllerTest {
// ─── Block comment endpoints ─────────────────────────────────────────────
@Test
@WithMockUser
void getBlockComments_returns400_when_documentId_is_not_a_UUID() throws Exception {
UUID blockId = UUID.randomUUID();
mockMvc.perform(get("/api/documents/NOT-A-UUID/transcription-blocks/" + blockId + "/comments"))
.andExpect(status().isBadRequest());
}
@Test
@WithMockUser
void getBlockComments_returns200() throws Exception {
@@ -115,6 +123,15 @@ class CommentControllerTest {
// ─── Block reply endpoints ───────────────────────────────────────────────
@Test
@WithMockUser(authorities = "ANNOTATE_ALL")
void replyToBlockComment_returns400_when_blockId_is_not_a_UUID() throws Exception {
mockMvc.perform(post("/api/documents/" + DOC_ID + "/transcription-blocks/NOT-A-UUID"
+ "/comments/" + COMMENT_ID + "/replies")
.contentType(MediaType.APPLICATION_JSON).content(COMMENT_JSON))
.andExpect(status().isBadRequest());
}
@Test
void replyToBlockComment_returns401_whenUnauthenticated() throws Exception {
UUID blockId = UUID.randomUUID();

View File

@@ -0,0 +1,134 @@
package org.raddatz.familienarchiv.security;
import jakarta.servlet.FilterChain;
import jakarta.servlet.http.Cookie;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.junit.jupiter.api.Test;
import org.mockito.ArgumentCaptor;
import org.springframework.mock.web.MockHttpServletRequest;
import org.springframework.mock.web.MockHttpServletResponse;
import static org.assertj.core.api.Assertions.assertThat;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.times;
import static org.mockito.Mockito.verify;
/**
* The filter must turn a browser-side {@code Cookie: auth_token=Basic%20<base64>}
* into {@code Authorization: Basic <base64>} (URL-decoded) so that Spring's
* Basic-auth filter accepts it. Skips when the request already has an explicit
* {@code Authorization} header, or when no {@code auth_token} cookie is present.
*
* <p>See #520.
*/
class AuthTokenCookieFilterTest {
private final AuthTokenCookieFilter filter = new AuthTokenCookieFilter();
@Test
void promotes_url_encoded_auth_token_cookie_to_decoded_Authorization_header() throws Exception {
MockHttpServletRequest req = new MockHttpServletRequest();
req.setRequestURI("/api/users/me");
req.setCookies(new Cookie("auth_token", "Basic%20YWRtaW5AZmFtaWx5YXJjaGl2ZS5sb2NhbDpzZWNyZXQ%3D"));
MockHttpServletResponse res = new MockHttpServletResponse();
FilterChain chain = mock(FilterChain.class);
filter.doFilter(req, res, chain);
ArgumentCaptor<HttpServletRequest> captor = ArgumentCaptor.forClass(HttpServletRequest.class);
verify(chain, times(1)).doFilter(captor.capture(), org.mockito.ArgumentMatchers.any(HttpServletResponse.class));
HttpServletRequest forwarded = captor.getValue();
assertThat(forwarded.getHeader("Authorization"))
.as("Authorization must be URL-decoded so Spring's Basic parser sees a literal space")
.isEqualTo("Basic YWRtaW5AZmFtaWx5YXJjaGl2ZS5sb2NhbDpzZWNyZXQ=");
}
@Test
void preserves_explicit_Authorization_header_and_ignores_cookie() throws Exception {
MockHttpServletRequest req = new MockHttpServletRequest();
req.setRequestURI("/api/users/me");
req.addHeader("Authorization", "Basic explicit-header-wins");
req.setCookies(new Cookie("auth_token", "Basic%20cookie-would-have-promoted"));
MockHttpServletResponse res = new MockHttpServletResponse();
FilterChain chain = mock(FilterChain.class);
filter.doFilter(req, res, chain);
// Forwards the original request unchanged — same instance, no wrapping.
verify(chain).doFilter(req, res);
}
@Test
void passes_through_when_no_cookies_at_all() throws Exception {
MockHttpServletRequest req = new MockHttpServletRequest();
req.setRequestURI("/api/users/me");
MockHttpServletResponse res = new MockHttpServletResponse();
FilterChain chain = mock(FilterChain.class);
filter.doFilter(req, res, chain);
verify(chain).doFilter(req, res);
}
@Test
void passes_through_when_auth_token_cookie_is_absent() throws Exception {
MockHttpServletRequest req = new MockHttpServletRequest();
req.setRequestURI("/api/users/me");
req.setCookies(new Cookie("some_other_cookie", "value"));
MockHttpServletResponse res = new MockHttpServletResponse();
FilterChain chain = mock(FilterChain.class);
filter.doFilter(req, res, chain);
verify(chain).doFilter(req, res);
}
@Test
void passes_through_when_auth_token_cookie_is_empty() throws Exception {
MockHttpServletRequest req = new MockHttpServletRequest();
req.setRequestURI("/api/users/me");
req.setCookies(new Cookie("auth_token", ""));
MockHttpServletResponse res = new MockHttpServletResponse();
FilterChain chain = mock(FilterChain.class);
filter.doFilter(req, res, chain);
verify(chain).doFilter(req, res);
}
@Test
void passes_through_unchanged_when_request_is_outside_api_scope() throws Exception {
MockHttpServletRequest req = new MockHttpServletRequest();
// /actuator/health and similar must NOT receive a promoted Authorization
// header — they have their own access rules and should never be authed
// via the cookie.
req.setRequestURI("/actuator/health");
req.setCookies(new Cookie("auth_token", "Basic%20YWR=="));
MockHttpServletResponse res = new MockHttpServletResponse();
FilterChain chain = mock(FilterChain.class);
filter.doFilter(req, res, chain);
// Forwards the original request unchanged — same instance, no wrapping.
verify(chain).doFilter(req, res);
}
@Test
void passes_through_unchanged_when_cookie_value_is_malformed_percent_encoding() throws Exception {
MockHttpServletRequest req = new MockHttpServletRequest();
req.setRequestURI("/api/users/me");
// Lone "%" without two hex digits → URLDecoder throws → filter must
// refuse to forward a bogus Authorization header.
req.setCookies(new Cookie("auth_token", "Basic%2"));
MockHttpServletResponse res = new MockHttpServletResponse();
FilterChain chain = mock(FilterChain.class);
filter.doFilter(req, res, chain);
// Forwards the original request unchanged — Spring Security treats it
// as unauthenticated rather than crashing on bad input.
verify(chain).doFilter(req, res);
}
}

View File

@@ -0,0 +1,174 @@
package org.raddatz.familienarchiv.user;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;
import org.springframework.boot.CommandLineRunner;
import org.springframework.core.env.Environment;
import org.springframework.security.crypto.password.PasswordEncoder;
import org.springframework.test.util.ReflectionTestUtils;
import java.util.Optional;
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.ArgumentMatchers.anyString;
import static org.mockito.ArgumentMatchers.eq;
import static org.mockito.Mockito.never;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;
/**
* UserDataInitializer must refuse to seed the admin user with the hardcoded
* dev defaults when running outside the {@code dev} profile.
*
* <p>Why this matters: per DEPLOYMENT.md §3.5 and ADR-011, the admin password
* is permanently locked on first deploy (UserDataInitializer only seeds when
* the row is missing). If an operator forgets to set {@code APP_ADMIN_USERNAME}
* / {@code APP_ADMIN_PASSWORD}, prod silently boots with the well-known dev
* defaults — a credential-disclosure foot-gun, not a config typo. See #513.
*/
@ExtendWith(MockitoExtension.class)
class AdminSeedFailClosedTest {
@Mock AppUserRepository userRepository;
@Mock UserGroupRepository groupRepository;
@Mock Environment environment;
@Mock PasswordEncoder passwordEncoder;
UserDataInitializer initializer;
@BeforeEach
void setUp() {
initializer = new UserDataInitializer(userRepository, groupRepository, environment);
}
@Test
void refuses_to_seed_when_email_is_default_and_profile_is_not_dev() throws Exception {
when(userRepository.findByEmail(anyString())).thenReturn(Optional.empty());
when(environment.matchesProfiles("dev", "test", "e2e")).thenReturn(false);
ReflectionTestUtils.setField(initializer, "adminEmail", UserDataInitializer.DEFAULT_ADMIN_EMAIL);
ReflectionTestUtils.setField(initializer, "adminPassword", "operator-set-this-one");
CommandLineRunner runner = initializer.initAdminUser(passwordEncoder);
assertThatThrownBy(() -> runner.run())
.isInstanceOf(IllegalStateException.class)
.hasMessageContaining("default credentials")
.hasMessageContaining("permanent");
verify(userRepository, never()).save(org.mockito.ArgumentMatchers.any());
}
@Test
void refuses_to_seed_when_password_is_default_and_profile_is_not_dev() throws Exception {
when(userRepository.findByEmail(anyString())).thenReturn(Optional.empty());
when(environment.matchesProfiles("dev", "test", "e2e")).thenReturn(false);
ReflectionTestUtils.setField(initializer, "adminEmail", "admin@archiv.raddatz.cloud");
ReflectionTestUtils.setField(initializer, "adminPassword", UserDataInitializer.DEFAULT_ADMIN_PASSWORD);
CommandLineRunner runner = initializer.initAdminUser(passwordEncoder);
assertThatThrownBy(() -> runner.run())
.isInstanceOf(IllegalStateException.class)
.hasMessageContaining("default credentials");
}
@Test
void allows_seed_when_both_values_are_set_and_profile_is_not_dev() throws Exception {
when(userRepository.findByEmail(anyString())).thenReturn(Optional.empty());
when(groupRepository.findByName("Administrators")).thenReturn(Optional.empty());
when(groupRepository.save(any(UserGroup.class))).thenAnswer(inv -> inv.getArgument(0));
when(environment.matchesProfiles("dev", "test", "e2e")).thenReturn(false);
when(passwordEncoder.encode(anyString())).thenReturn("$2a$10$stub");
ReflectionTestUtils.setField(initializer, "adminEmail", "admin@archiv.raddatz.cloud");
ReflectionTestUtils.setField(initializer, "adminPassword", "a-real-strong-password");
CommandLineRunner runner = initializer.initAdminUser(passwordEncoder);
runner.run();
verify(userRepository).save(any(AppUser.class));
}
@Test
void allows_seed_with_defaults_when_profile_is_dev() throws Exception {
when(userRepository.findByEmail(anyString())).thenReturn(Optional.empty());
when(groupRepository.findByName("Administrators")).thenReturn(Optional.empty());
when(groupRepository.save(any(UserGroup.class))).thenAnswer(inv -> inv.getArgument(0));
when(environment.matchesProfiles("dev", "test", "e2e")).thenReturn(true);
when(passwordEncoder.encode(anyString())).thenReturn("$2a$10$stub");
ReflectionTestUtils.setField(initializer, "adminEmail", UserDataInitializer.DEFAULT_ADMIN_EMAIL);
ReflectionTestUtils.setField(initializer, "adminPassword", UserDataInitializer.DEFAULT_ADMIN_PASSWORD);
CommandLineRunner runner = initializer.initAdminUser(passwordEncoder);
runner.run();
verify(userRepository).save(any(AppUser.class));
}
@Test
void does_not_check_defaults_when_admin_already_exists() throws Exception {
AppUser existing = AppUser.builder()
.email("someone@example.com")
.password("$2a$10$stub")
.build();
when(userRepository.findByEmail(anyString())).thenReturn(Optional.of(existing));
ReflectionTestUtils.setField(initializer, "adminEmail", UserDataInitializer.DEFAULT_ADMIN_EMAIL);
ReflectionTestUtils.setField(initializer, "adminPassword", UserDataInitializer.DEFAULT_ADMIN_PASSWORD);
CommandLineRunner runner = initializer.initAdminUser(passwordEncoder);
runner.run();
verify(userRepository, never()).save(org.mockito.ArgumentMatchers.any());
// Importantly, no IllegalStateException — re-deploys must not panic over
// historical default-seeded data they cannot retroactively fix.
}
@Test
void reuses_existing_Administrators_group_when_seeding_a_new_admin() throws Exception {
// Setup: admin user does not exist, but the Administrators group does
// (e.g. previous boot seeded the group then failed; operator deleted
// the bad user row to retry with a corrected APP_ADMIN_USERNAME). The
// re-seed must reuse the group, not blind-INSERT a duplicate. See #518.
UserGroup existingGroup = UserGroup.builder()
.name("Administrators")
.build();
when(userRepository.findByEmail(anyString())).thenReturn(Optional.empty());
when(groupRepository.findByName("Administrators")).thenReturn(Optional.of(existingGroup));
when(environment.matchesProfiles("dev", "test", "e2e")).thenReturn(false);
when(passwordEncoder.encode(anyString())).thenReturn("$2a$10$stub");
ReflectionTestUtils.setField(initializer, "adminEmail", "admin@archiv.raddatz.cloud");
ReflectionTestUtils.setField(initializer, "adminPassword", "a-real-strong-password");
CommandLineRunner runner = initializer.initAdminUser(passwordEncoder);
runner.run();
// Group must not be re-inserted — that would violate user_groups_name_key.
verify(groupRepository, never()).save(any(UserGroup.class));
// But the admin user IS created, with the existing group attached.
org.mockito.ArgumentCaptor<AppUser> captor = org.mockito.ArgumentCaptor.forClass(AppUser.class);
verify(userRepository).save(captor.capture());
assertThat(captor.getValue().getGroups()).containsExactly(existingGroup);
}
@Test
void creates_Administrators_group_when_seeding_admin_on_a_fresh_database() throws Exception {
when(userRepository.findByEmail(anyString())).thenReturn(Optional.empty());
when(groupRepository.findByName("Administrators")).thenReturn(Optional.empty());
when(groupRepository.save(any(UserGroup.class))).thenAnswer(inv -> inv.getArgument(0));
when(environment.matchesProfiles("dev", "test", "e2e")).thenReturn(false);
when(passwordEncoder.encode(anyString())).thenReturn("$2a$10$stub");
ReflectionTestUtils.setField(initializer, "adminEmail", "admin@archiv.raddatz.cloud");
ReflectionTestUtils.setField(initializer, "adminPassword", "a-real-strong-password");
CommandLineRunner runner = initializer.initAdminUser(passwordEncoder);
runner.run();
// Group should be inserted exactly once.
verify(groupRepository).save(any(UserGroup.class));
verify(userRepository).save(any(AppUser.class));
}
}

View File

@@ -0,0 +1,95 @@
package org.raddatz.familienarchiv.user;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.beans.factory.config.YamlPropertiesFactoryBean;
import org.springframework.boot.context.properties.bind.Binder;
import org.springframework.boot.context.properties.source.ConfigurationPropertySources;
import org.springframework.core.env.PropertiesPropertySource;
import org.springframework.core.io.ClassPathResource;
import java.lang.reflect.Field;
import java.util.Properties;
import static org.assertj.core.api.Assertions.assertThat;
/**
* Pins the admin-seed property key contract. {@code UserDataInitializer} reads
* {@code @Value("${app.admin.email:...}")} and {@code @Value("${app.admin.password:...}")}.
* The yaml MUST expose those exact keys, not e.g. {@code app.admin.username}, or
* the env vars {@code APP_ADMIN_USERNAME} / {@code APP_ADMIN_PASSWORD} are
* silently ignored and the admin user gets seeded with the hardcoded defaults.
*
* <p>Discovered as a HIGH bug during the production-deploy bootstrap (#513): on
* first deploy the prod admin password is permanently locked to whatever ends
* up in the database, so a key-name mismatch would lock prod to the dev defaults
* {@code admin@familyarchive.local} / {@code admin123}.
*
* <p>No Spring context — Binder reads application.yaml directly.
*/
class AdminSeedPropertyKeyTest {
@Test
void admin_email_key_binds_from_yaml() {
Binder binder = binderFromApplicationYaml();
String email = binder.bind("app.admin.email", String.class)
.orElseThrow(() -> new AssertionError(
"app.admin.email is missing from application.yaml. "
+ "UserDataInitializer reads this exact key; if the yaml uses "
+ "a different name (e.g. 'username'), the env var "
+ "APP_ADMIN_USERNAME is silently ignored."));
assertThat(email)
.as("app.admin.email must resolve from APP_ADMIN_USERNAME or its default")
.isNotBlank();
}
@Test
void admin_password_key_binds_from_yaml() {
Binder binder = binderFromApplicationYaml();
String password = binder.bind("app.admin.password", String.class)
.orElseThrow(() -> new AssertionError(
"app.admin.password is missing from application.yaml. "
+ "UserDataInitializer reads this exact key."));
assertThat(password)
.as("app.admin.password must resolve from APP_ADMIN_PASSWORD or its default")
.isNotBlank();
}
@Test
void userDataInitializer_reads_app_admin_email_not_username() throws NoSuchFieldException {
// Pin the Java side too: a future rename of the @Value placeholder
// (e.g. back to `${app.admin.username:...}`) would silently break the
// binding while the yaml-side assertions above still pass. See #513.
Field field = UserDataInitializer.class.getDeclaredField("adminEmail");
Value annotation = field.getAnnotation(Value.class);
assertThat(annotation)
.as("UserDataInitializer.adminEmail must be @Value-annotated")
.isNotNull();
assertThat(annotation.value())
.as("UserDataInitializer must read app.admin.email — not username or any other key")
.startsWith("${app.admin.email:");
}
@Test
void userDataInitializer_reads_app_admin_password() throws NoSuchFieldException {
Field field = UserDataInitializer.class.getDeclaredField("adminPassword");
Value annotation = field.getAnnotation(Value.class);
assertThat(annotation).isNotNull();
assertThat(annotation.value())
.as("UserDataInitializer must read app.admin.password")
.startsWith("${app.admin.password:");
}
private Binder binderFromApplicationYaml() {
YamlPropertiesFactoryBean yaml = new YamlPropertiesFactoryBean();
yaml.setResources(new ClassPathResource("application.yaml"));
Properties props = yaml.getObject();
assertThat(props).as("application.yaml must be on the classpath").isNotNull();
return new Binder(ConfigurationPropertySources.from(
new PropertiesPropertySource("application", props)));
}
}

View File

@@ -35,4 +35,15 @@ class AppUserTest {
.count();
assertThat(distinct).isGreaterThan(1);
}
@Test
void computeColor_returnsValidPaletteColorForIntegerMinValueHash() {
// UUID "80000000-0000-0000-0000-000000000000" has hashCode() == Integer.MIN_VALUE.
// Math.abs(Integer.MIN_VALUE) overflows back to Integer.MIN_VALUE (negative), making
// Math.abs(hashCode()) % n unsafe for palette sizes that don't evenly divide MIN_VALUE.
// Math.floorMod eliminates this edge case entirely.
UUID minHashId = UUID.fromString("80000000-0000-0000-0000-000000000000");
assertThat(minHashId.hashCode()).isEqualTo(Integer.MIN_VALUE);
assertThat(EXPECTED_PALETTE).contains(AppUser.computeColor(minHashId));
}
}

View File

@@ -902,4 +902,18 @@ class UserServiceTest {
assertThat(result.getName()).isEqualTo("Familie");
assertThat(result.getPermissions()).containsExactlyInAnyOrder("READ_ALL", "WRITE_ALL");
}
@Test
void createGroup_withNullPermissions_savesGroupWithEmptyPermissionSet() {
org.raddatz.familienarchiv.user.GroupDTO dto = new org.raddatz.familienarchiv.user.GroupDTO();
dto.setName("Leser");
dto.setPermissions(null);
UserGroup saved = UserGroup.builder().id(UUID.randomUUID()).name("Leser").build();
when(groupRepository.save(any())).thenReturn(saved);
userService.createGroup(dto);
verify(groupRepository).save(argThat(g -> g.getPermissions() != null && g.getPermissions().isEmpty()));
}
}

231
docker-compose.prod.yml Normal file
View File

@@ -0,0 +1,231 @@
# Production / staging Docker Compose for Familienarchiv.
#
# This is a self-contained file (not an overlay over docker-compose.yml).
# All services for the prod stack live here. Environment isolation is
# achieved via the docker compose project name:
#
# production: docker compose -f docker-compose.prod.yml -p archiv-production ...
# staging: docker compose -f docker-compose.prod.yml -p archiv-staging --profile staging ...
#
# Volumes, networks and containers are namespaced by the project name,
# so the two environments cohabit cleanly on the same host.
#
# Required env vars (provided by .env.production / .env.staging in CI):
# TAG image tag (release tag or "nightly")
# PORT_BACKEND, PORT_FRONTEND host-side ports (bound to 127.0.0.1 only)
# APP_DOMAIN e.g. archiv.raddatz.cloud / staging.raddatz.cloud
# POSTGRES_PASSWORD Postgres password
# MINIO_PASSWORD MinIO root password (admin operations only)
# MINIO_APP_PASSWORD MinIO application service-account password
# (least-privilege scope: archive bucket only)
# OCR_TRAINING_TOKEN token guarding ocr-service /train endpoint
# APP_ADMIN_USERNAME seeded admin email (e.g. admin@archiv.raddatz.cloud)
# APP_ADMIN_PASSWORD seeded admin password — CRITICAL: locked in on
# first deploy because UserDataInitializer only
# creates the account if the email does not exist
# MAIL_HOST, MAIL_PORT, SMTP relay (production only; staging uses mailpit)
# MAIL_USERNAME, MAIL_PASSWORD
# APP_MAIL_FROM sender address (e.g. noreply@raddatz.cloud)
networks:
archiv-net:
driver: bridge
volumes:
postgres-data:
minio-data:
ocr-models:
ocr-cache:
services:
db:
image: postgres:16-alpine
restart: unless-stopped
environment:
POSTGRES_USER: archiv
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
POSTGRES_DB: archiv
volumes:
- postgres-data:/var/lib/postgresql/data
networks:
- archiv-net
healthcheck:
test: ["CMD-SHELL", "pg_isready -U archiv -d archiv"]
interval: 10s
timeout: 5s
retries: 5
minio:
# Pinned MinIO release for reproducible deploys. Bumped manually until
# Renovate is bootstrapped for these production images (see follow-up issue).
image: minio/minio:RELEASE.2025-02-28T09-55-16Z
restart: unless-stopped
command: server /data --console-address ":9001"
environment:
MINIO_ROOT_USER: archiv
MINIO_ROOT_PASSWORD: ${MINIO_PASSWORD}
volumes:
- minio-data:/data
networks:
- archiv-net
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
interval: 30s
timeout: 20s
retries: 3
# Idempotent bucket bootstrap + service-account creation.
# Runs once per `docker compose up` and exits 0. The entrypoint is
# extracted to infra/minio/bootstrap.sh so the (non-trivial) idempotent
# logic is readable, reviewable, and unit-testable as a script rather
# than YAML-escaped shell.
create-buckets:
# Custom image bakes bootstrap.sh in at build time. A bind-mount fails on
# the Docker-out-of-Docker production runner because the host daemon
# resolves the relative path against the host filesystem, not the
# runner container's CWD. See #506 + infra/minio/Dockerfile.
build:
context: ./infra/minio
# Declare one-shot intent so `docker compose up -d --wait` treats
# exited(0) as success rather than "not running, fail". Pair with
# backend's `service_completed_successfully` dependency below. See #510.
restart: "no"
depends_on:
minio:
condition: service_healthy
networks:
- archiv-net
environment:
MINIO_PASSWORD: ${MINIO_PASSWORD}
MINIO_APP_PASSWORD: ${MINIO_APP_PASSWORD}
# Dev-only mail catcher; gated behind the staging profile so production
# never starts it. Staging workflow runs with `--profile staging`.
mailpit:
# Pinned for reproducibility; bumped manually until Renovate is bootstrapped.
image: axllent/mailpit:v1.29.7
restart: unless-stopped
profiles: ["staging"]
networks:
- archiv-net
healthcheck:
# TCP-port open check via BusyBox `nc`. The previous wget-based probe
# introduced a non-obvious binary dependency on the mailpit image; a
# future tag that ships without wget would silently disable the
# healthcheck. `nc` is part of BusyBox in the upstream image.
test: ["CMD-SHELL", "nc -z localhost 8025 || exit 1"]
interval: 10s
timeout: 5s
retries: 5
ocr-service:
build:
context: ./ocr-service
restart: unless-stopped
expose:
- "8000"
# Surya OCR loads ~5GB of transformer models at startup; first request
# triggers a further ~1GB Kraken model download into ocr-cache.
# CX42+ (16 GB RAM) honours the default. On a CX32 (8 GB) override with
# OCR_MEM_LIMIT=6g (slower first-request, fits the host).
mem_limit: ${OCR_MEM_LIMIT:-12g}
memswap_limit: ${OCR_MEM_LIMIT:-12g}
volumes:
- ocr-models:/app/models
- ocr-cache:/root/.cache
environment:
KRAKEN_MODEL_PATH: /app/models/german_kurrent.mlmodel
TRAINING_TOKEN: ${OCR_TRAINING_TOKEN}
OCR_CONFIDENCE_THRESHOLD: "0.3"
OCR_CONFIDENCE_THRESHOLD_KURRENT: "0.5"
# SSRF allowlist pinned explicitly to the internal MinIO hostname.
# In prod the OCR service only fetches PDFs from MinIO over the
# docker network; localhost/127.0.0.1 are dev-only sources and
# must NOT be reachable here. Do not widen to `*`.
ALLOWED_PDF_HOSTS: "minio"
networks:
- archiv-net
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
interval: 10s
timeout: 5s
retries: 12
start_period: 120s
backend:
image: familienarchiv/backend:${TAG:-nightly}
build:
context: ./backend
restart: unless-stopped
depends_on:
db:
condition: service_healthy
minio:
condition: service_healthy
ocr-service:
condition: service_healthy
# Gate startup on the bucket bootstrap. Without this, backend
# starts in parallel with create-buckets and may race the policy
# bind. Also tells compose's `up -d --wait` that create-buckets
# is a one-shot that must complete successfully. See #510.
create-buckets:
condition: service_completed_successfully
# Bound to localhost only — Caddy fronts external traffic.
ports:
- "127.0.0.1:${PORT_BACKEND}:8080"
environment:
SPRING_DATASOURCE_URL: jdbc:postgresql://db:5432/archiv
SPRING_DATASOURCE_USERNAME: archiv
SPRING_DATASOURCE_PASSWORD: ${POSTGRES_PASSWORD}
# Application uses the bucket-scoped service account, not MinIO root.
S3_ENDPOINT: http://minio:9000
S3_ACCESS_KEY: archiv-app
S3_SECRET_KEY: ${MINIO_APP_PASSWORD}
S3_BUCKET_NAME: familienarchiv
S3_REGION: us-east-1
# No SPRING_PROFILES_ACTIVE — base application.yaml is production-ready
# (Swagger disabled, show-sql off, open-in-view false).
APP_BASE_URL: https://${APP_DOMAIN}
APP_ADMIN_USERNAME: ${APP_ADMIN_USERNAME}
APP_ADMIN_PASSWORD: ${APP_ADMIN_PASSWORD}
APP_OCR_BASE_URL: http://ocr-service:8000
APP_OCR_TRAINING_TOKEN: ${OCR_TRAINING_TOKEN}
MAIL_HOST: ${MAIL_HOST}
MAIL_PORT: ${MAIL_PORT:-587}
MAIL_USERNAME: ${MAIL_USERNAME:-}
MAIL_PASSWORD: ${MAIL_PASSWORD:-}
APP_MAIL_FROM: ${APP_MAIL_FROM:-noreply@raddatz.cloud}
SPRING_MAIL_PROPERTIES_MAIL_SMTP_AUTH: ${MAIL_SMTP_AUTH:-true}
SPRING_MAIL_PROPERTIES_MAIL_SMTP_STARTTLS_ENABLE: ${MAIL_STARTTLS_ENABLE:-true}
networks:
- archiv-net
healthcheck:
test: ["CMD-SHELL", "wget -qO- http://localhost:8080/actuator/health | grep -q UP || exit 1"]
interval: 15s
timeout: 5s
retries: 10
start_period: 30s
frontend:
image: familienarchiv/frontend:${TAG:-nightly}
build:
context: ./frontend
target: production
restart: unless-stopped
depends_on:
backend:
condition: service_healthy
ports:
- "127.0.0.1:${PORT_FRONTEND}:3000"
environment:
# SSR fetches go inside the docker network; clients hit https://${APP_DOMAIN}
API_INTERNAL_URL: http://backend:8080
ORIGIN: https://${APP_DOMAIN}
networks:
- archiv-net
healthcheck:
test: ["CMD-SHELL", "wget -qO- http://127.0.0.1:3000/login >/dev/null 2>&1 || exit 1"]
interval: 15s
timeout: 5s
retries: 10
start_period: 20s

View File

@@ -13,7 +13,7 @@ services:
ports:
- "${PORT_DB}:5432"
networks:
- archive-net
- archiv-net
healthcheck:
test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
interval: 5s
@@ -35,7 +35,7 @@ services:
- "${PORT_MINIO_API}:9000" # API Port
- "${PORT_MINIO_CONSOLE}:9001" # Web-Oberfläche
networks:
- archive-net
- archiv-net
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
interval: 30s
@@ -56,7 +56,7 @@ services:
exit 0;
"
networks:
- archive-net
- archiv-net
# --- Mail catcher: Mailpit (dev only) ---
# Catches all outgoing emails and displays them in a web UI.
@@ -69,7 +69,7 @@ services:
- "${PORT_MAILPIT_UI:-8025}:8025" # Web UI
- "${PORT_MAILPIT_SMTP:-1025}:1025" # SMTP
networks:
- archive-net
- archiv-net
# --- OCR: Python microservice (Surya + Kraken) ---
# Single-node only: OCR training reloads the model in-process after each run.
@@ -99,7 +99,7 @@ services:
OCR_CLAHE_TILE_SIZE: "8" # CLAHE tile grid size (NxN tiles per page)
OCR_MAX_CACHED_MODELS: "2" # LRU cache; each model ~500 MB, so 2 = ~1 GB resident
networks:
- archive-net
- archiv-net
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
interval: 10s
@@ -150,7 +150,7 @@ services:
ports:
- "${PORT_BACKEND}:8080"
networks:
- archive-net
- archiv-net
healthcheck:
test: ["CMD-SHELL", "wget -qO- http://localhost:8080/actuator/health | grep -q UP || exit 1"]
interval: 15s
@@ -163,6 +163,7 @@ services:
build:
context: ./frontend
dockerfile: Dockerfile
target: development # Dockerfile is multi-stage; default would be the production stage
container_name: archive-frontend
restart: unless-stopped
depends_on:
@@ -184,10 +185,10 @@ services:
ports:
- "${PORT_FRONTEND}:5173"
networks:
- archive-net
- archiv-net
networks:
archive-net:
archiv-net:
driver: bridge
volumes:

View File

@@ -27,20 +27,22 @@ This doc is the Day-1 checklist and operational reference. It links to the canon
```mermaid
graph TD
Browser -->|HTTPS| Caddy["Caddy (TLS termination)"]
Caddy -->|HTTP :5173| Frontend["Web Frontend\nSvelteKit / Node.js"]
Caddy -->|HTTP :3000| Frontend["Web Frontend\nSvelteKit Node adapter"]
Caddy -->|HTTP :8080| Backend["API Backend\nSpring Boot / Jetty :8080"]
Backend -->|JDBC :5432| DB[(PostgreSQL 16)]
Backend -->|S3 API :9000| MinIO[(MinIO / Hetzner OBS)]
Backend -->|S3 API :9000| MinIO[(MinIO)]
Backend -->|HTTP :8000 internal| OCR["OCR Service\nPython FastAPI"]
OCR -->|presigned URL| MinIO
Browser -->|SSE direct| Backend
Caddy -->|SSE proxy_pass| Backend
```
**Key facts:**
- Caddy terminates TLS and reverse-proxies to frontend and backend. See the Caddyfile in [`docs/infrastructure/production-compose.md`](infrastructure/production-compose.md).
- The OCR service has **no external port** — reachable only on the internal Docker network from the backend.
- SSE notifications go directly backend → browser (not via the SvelteKit SSR layer).
- Management port 8081 (Spring Actuator / Prometheus scrape) is internal only — the Caddy config blocks `/actuator/*` externally.
- Caddy terminates TLS and reverse-proxies to frontend (`:3000`) and backend (`:8080`). The Caddyfile is committed at [`infra/caddy/Caddyfile`](../infra/caddy/Caddyfile) and is installed on the host as `/etc/caddy/Caddyfile` (symlink).
- The host binds all docker-published ports to `127.0.0.1` only; Caddy is the sole external entry point.
- The OCR service has **no published port** — reachable only on the internal Docker network from the backend.
- SSE notifications transit Caddy (browser → Caddy → backend); the backend is never reachable directly from the public internet. The SvelteKit SSR layer is bypassed for SSE, but Caddy is not.
- The Caddyfile responds `404` on `/actuator/*` (defense in depth). Internal monitoring scrapes the backend on the docker network, not through Caddy.
- Production and staging cohabit on the same host via docker compose project names: `archiv-production` (ports 8080/3000) and `archiv-staging` (ports 8081/3001).
### OCR memory requirements
@@ -52,19 +54,23 @@ The OCR service requires significant RAM for model loading. The dev compose sets
| Hetzner CX32 | 8 GB | 6 GB | Accept reduced batch sizes and slower throughput |
| Hetzner CX22 | 4 GB | — | Disable the OCR service (`profiles: [ocr]`); run OCR on demand only |
A CX32 cannot honour a `mem_limit: 12g` — set it to `6g` in the prod overlay or use CX42.
A CX32 cannot honour the default `mem_limit: 12g` — set the `OCR_MEM_LIMIT=6g` env var (in `.env.production` / `.env.staging`, or as a Gitea secret consumed by the workflow) before deploying on a CX32. The prod compose interpolates this var with a 12g default.
### Dev vs production differences
| Concern | Dev compose | Prod overlay |
| Concern | Dev (`docker-compose.yml`) | Prod (`docker-compose.prod.yml`) |
|---|---|---|
| MinIO image tag | `minio/minio:latest` (unpinned) | Pinned in prod overlay |
| Data persistence | Bind mounts `./data/postgres`, `./data/minio` | Named Docker volumes |
| Bucket creation | `create-buckets` helper container | Pre-created in Hetzner console |
| Spring profile | `dev,e2e` (enables OpenAPI + Swagger UI) | `prod` |
| Mail | Mailpit (local catcher) | Real SMTP |
| MinIO image tag | `minio/minio:latest` | Pinned `minio/minio:RELEASE.…` |
| Data persistence | Bind mounts `./data/postgres`, `./data/minio` | Named Docker volumes (`postgres-data`, `minio-data`) |
| MinIO credentials for backend | Root user/password | Service account `archiv-app` with bucket-scoped rights |
| Bucket creation | `create-buckets` helper | Same helper, plus service-account bootstrap on every up |
| Spring profile | `dev,e2e` (Swagger + e2e overrides) | unset — base `application.yaml` is production-ready |
| Mail | Mailpit (local catcher) | Real SMTP (production) / Mailpit via `profiles: [staging]` (staging) |
| Frontend image | Dev server, `target: development`, port 5173 | Node adapter, `target: production`, port 3000 |
| Host port binding | All published | Bound to `127.0.0.1` only; Caddy is the front door |
| Deploy method | `docker compose up -d` (manual) | Gitea Actions: `nightly.yml` (staging, cron) and `release.yml` (production, on `v*` tag) — both use `up -d --wait` |
Full prod overlay: [`docs/infrastructure/production-compose.md`](infrastructure/production-compose.md).
Full prod compose: [`docker-compose.prod.yml`](../docker-compose.prod.yml). Workflow files: [`.gitea/workflows/nightly.yml`](../.gitea/workflows/nightly.yml), [`.gitea/workflows/release.yml`](../.gitea/workflows/release.yml).
---
@@ -112,9 +118,10 @@ All vars are set in `.env` at the repo root (copy from `.env.example`). The back
| Variable | Purpose | Default | Required? | Sensitive? |
|---|---|---|---|---|
| `MINIO_ROOT_USER` | MinIO root username | `minio_admin` | YES | — |
| `MINIO_ROOT_PASSWORD` | MinIO root password | `change-me` | YES | YES |
| `MINIO_DEFAULT_BUCKETS` | Bucket name | `archive-documents` | YES | — |
| `MINIO_ROOT_USER` | MinIO root username (dev compose only — prod compose hardcodes `archiv`) | `minio_admin` | YES (dev) | — |
| `MINIO_ROOT_PASSWORD` / `MINIO_PASSWORD` | MinIO root password. **Used only by the `mc admin` bootstrap in prod, never by the backend.** | `change-me` | YES | YES |
| `MINIO_APP_PASSWORD` | Password for the `archiv-app` service account that the backend uses. Bucket-scoped via `readwrite` policy on `familienarchiv`. Bootstrapped by `create-buckets`. | — | YES (prod) | YES |
| `MINIO_DEFAULT_BUCKETS` | Bucket name (dev compose only — prod compose hardcodes `familienarchiv`) | `archive-documents` | YES (dev) | — |
### OCR service
@@ -124,53 +131,105 @@ All vars are set in `.env` at the repo root (copy from `.env.example`). The back
| `ALLOWED_PDF_HOSTS` | SSRF protection — comma-separated list of allowed PDF source hosts. **Do not widen to `*`** | `minio,localhost,127.0.0.1` | YES | — |
| `KRAKEN_MODEL_PATH` | Directory containing Kraken HTR models (populated by `download-kraken-models.sh`) | `/app/models/` | — | — |
| `BLLA_MODEL_PATH` | Kraken baseline layout analysis model path | `/app/models/blla.mlmodel` | — | — |
| `OCR_MEM_LIMIT` | Container memory cap for ocr-service in `docker-compose.prod.yml`. Set to `6g` on CX32 hosts; leave unset on CX42+ to use the 12g default | `12g` (prod compose default) | — | — |
---
## 3. Bootstrap from scratch
> Full VPS provisioning steps are in [`docs/infrastructure/production-compose.md`](infrastructure/production-compose.md). This section covers the sequence and the security-critical steps.
Production and staging deploy via Gitea Actions (`release.yml` on `v*` tag, `nightly.yml` on cron). The server itself only needs to host Caddy, Docker, and the runner — the workflows handle the rest.
### Security checklist — complete before first boot
> ⚠️ **These defaults ship in `.env.example` and `application.yaml`. Change them or you will have an insecure installation.**
- [ ] Set `APP_ADMIN_PASSWORD` (default: `admin123` — change before starting the backend)
- [ ] Set `APP_ADMIN_USERNAME` if you want a non-default admin login name (add to `.env` — not in `.env.example`)
- [ ] Rotate `POSTGRES_PASSWORD` from `change-me`
- [ ] Rotate `MINIO_ROOT_PASSWORD` from `change-me`
- [ ] Set a strong `APP_OCR_TRAINING_TOKEN` (backend) and the matching `TRAINING_TOKEN` (OCR service) — both must be the same value (`python3 -c "import secrets; print(secrets.token_hex(32))"`)
- [ ] Confirm `ALLOWED_PDF_HOSTS` is locked to your MinIO/S3 hostname — widening to `*` opens SSRF
- [ ] Set `SPRING_PROFILES_ACTIVE=prod` in the prod overlay (not `dev,e2e` — that exposes Swagger UI and `/v3/api-docs`)
- [ ] Use a dedicated MinIO service account for `S3_ACCESS_KEY` / `S3_SECRET_KEY`, not the root credentials
### Bootstrap sequence
### 3.1 Server one-time setup
```bash
# 1. Copy and fill the env file
cp .env.example .env
# edit .env — complete the security checklist above first
# Base hardening
ufw default deny incoming && ufw allow 22/tcp && ufw allow 80/tcp && ufw allow 443/tcp && ufw enable
# /etc/ssh/sshd_config: PasswordAuthentication no, PermitRootLogin no
# 2. (Production only) Create the MinIO / Hetzner OBS bucket in the console
# The dev compose has a create-buckets helper; production does not.
# Create the bucket named $MINIO_DEFAULT_BUCKETS with private access.
# Install Caddy 2 (https://caddyserver.com/docs/install#debian-ubuntu-raspbian)
apt install caddy
# 3. Start the stack (prod overlay — see docs/infrastructure/production-compose.md)
# docker-compose.prod.yml is NOT committed — create it from the guide above
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
# Use the Caddyfile from the repo (replace path with the runner's clone target)
# CI DEPENDENCY: the nightly and release workflows run `systemctl reload caddy` to
# pick up committed Caddyfile changes. They find the file via this symlink — if it
# is absent or points elsewhere, the reload succeeds but serves stale config.
ln -sf /opt/familienarchiv/infra/caddy/Caddyfile /etc/caddy/Caddyfile
systemctl reload caddy
# 4. Flyway migrations run automatically on backend start.
# Watch the backend log to confirm:
docker compose logs --follow --tail=100 backend
# fail2ban — protect /api/auth/login from credential stuffing.
# Jail watches the Caddy JSON access log for 401 responses on
# /api/auth/login. The jail (maxretry=10 / findtime=10m / bantime=30m)
# and filter are committed under infra/fail2ban/ — symlink them in:
apt install fail2ban
ln -sf /opt/familienarchiv/infra/fail2ban/jail.d/familienarchiv.conf \
/etc/fail2ban/jail.d/familienarchiv.conf
ln -sf /opt/familienarchiv/infra/fail2ban/filter.d/familienarchiv-auth.conf \
/etc/fail2ban/filter.d/familienarchiv-auth.conf
systemctl reload fail2ban
# Verify after first deploy with:
# fail2ban-client status familienarchiv-auth
# fail2ban-regex /var/log/caddy/access.log familienarchiv-auth
# 5. Verify the stack is healthy
curl http://localhost:8080/actuator/health
# Expected: {"status":"UP"}
# Tailscale — used by the backup pipeline to reach heim-nas (follow-up issue)
curl -fsSL https://tailscale.com/install.sh | sh && tailscale up
# 6. Open the app and log in with the admin credentials from .env
# Self-hosted Gitea runner — register against the repo with a runner token.
# This runner is assumed single-tenant: the deploy workflows write .env.*
# files to disk during execution (cleaned up unconditionally on completion).
# A multi-tenant runner would need to switch to stdin-piped env files.
# (See https://docs.gitea.com/usage/actions/quickstart for the register step.)
```
> **Do not use `docker-compose.ci.yml` locally** — it disables bind mounts that the dev workflow depends on.
### 3.2 DNS records
```
archiv.raddatz.cloud A <server IP>
staging.raddatz.cloud A <server IP>
git.raddatz.cloud A <server IP>
```
### 3.3 Gitea secrets (Repo → Settings → Actions → Secrets)
| Secret | Used by | Notes |
|---|---|---|
| `PROD_POSTGRES_PASSWORD` | release.yml | strong unique password |
| `PROD_MINIO_PASSWORD` | release.yml | MinIO root password; used only at bootstrap |
| `PROD_MINIO_APP_PASSWORD` | release.yml | application service-account password |
| `PROD_OCR_TRAINING_TOKEN` | release.yml | `python3 -c "import secrets; print(secrets.token_hex(32))"` |
| `PROD_APP_ADMIN_USERNAME` | release.yml | e.g. `admin@archiv.raddatz.cloud` |
| `PROD_APP_ADMIN_PASSWORD` | release.yml | **⚠ locked permanently on first deploy** — see §3.5 |
| `STAGING_POSTGRES_PASSWORD` | nightly.yml | different from prod |
| `STAGING_MINIO_PASSWORD` | nightly.yml | different from prod |
| `STAGING_MINIO_APP_PASSWORD` | nightly.yml | different from prod |
| `STAGING_OCR_TRAINING_TOKEN` | nightly.yml | different from prod |
| `STAGING_APP_ADMIN_USERNAME` | nightly.yml | e.g. `admin@staging.raddatz.cloud` |
| `STAGING_APP_ADMIN_PASSWORD` | nightly.yml | locked on first staging deploy |
| `MAIL_HOST` | release.yml | SMTP relay hostname (prod only) |
| `MAIL_PORT` | release.yml | typically `587` |
| `MAIL_USERNAME` | release.yml | SMTP user |
| `MAIL_PASSWORD` | release.yml | SMTP password |
### 3.4 First deploy
```bash
# 1. Trigger nightly.yml manually (Repo → Actions → nightly → "Run workflow")
# Expected: docker compose up -d --wait succeeds for archiv-staging, then
# the workflow's "Smoke test deployed environment" step asserts:
# - https://staging.raddatz.cloud/login returns 200
# - HSTS header is present
# - /actuator/health returns 404 (defense-in-depth check)
# 2. (Optional) Re-verify manually
curl -I https://staging.raddatz.cloud/
# Expected: 200 (login page) with HSTS + X-Content-Type-Options headers
# 3. When staging looks healthy, push a v* tag to trigger release.yml
git tag v1.0.0 && git push origin v1.0.0
```
### 3.5 ⚠ Admin password is locked on first deploy
`UserDataInitializer` creates the admin user **only if the email does not exist**. The first successful deploy persists the admin password to the database. Changing `PROD_APP_ADMIN_PASSWORD` in Gitea secrets after that point has **no effect** — the secret is only consulted when the row is missing.
Before the first deploy: rotate `PROD_APP_ADMIN_PASSWORD` to a strong value. After the first deploy: change the admin password via the in-app account settings, not via the Gitea secret.
---
@@ -224,7 +283,23 @@ docker exec -i archive-db psql -U ${POSTGRES_USER} ${POSTGRES_DB} < backup-YYYYM
### Planned — phase 5 of Production v1 milestone
Automated backup (PostgreSQL WAL archiving + MinIO bucket replication) is planned in the Production v1 milestone phase 5. Until that ships: **manual backups are the only recovery option.**
Automated backup (nightly `pg_dump` + MinIO `mc mirror` over Tailscale to `heim-nas`) is a follow-up issue. Until that ships: **manual backups are the only recovery option.**
### Rollback
Each release tag corresponds to a docker image tag on the host daemon (built via DooD; no registry). Rolling back to a previous tag is one command:
```bash
TAG=v1.0.0 docker compose \
-f docker-compose.prod.yml \
-p archiv-production \
--env-file /opt/familienarchiv/.env.production \
up -d --wait --remove-orphans
```
If the rollback target image is no longer present on the host (host disk pruned, etc.), re-trigger `release.yml` for that tag from Gitea Actions UI — it rebuilds and redeploys.
**Flyway migrations are not auto-rolled-back.** If a release contained a destructive migration (drop column, rename table), a tag rollback brings the schema back to a previous app version but the data shape has already changed. For breaking schema changes, prefer a forward-only fix.
---

View File

@@ -107,6 +107,13 @@ _See also [Briefwechsel](#briefwechsel-user-facing)._
---
## Infrastructure Terms
**archiv-app** — the bucket-scoped MinIO service account the backend uses to read and write the `familienarchiv` bucket. Distinct from the MinIO root account (`archiv`, used only by the bootstrap container for admin operations). Defined and provisioned in [`infra/minio/bootstrap.sh`](../infra/minio/bootstrap.sh) and consumed by the backend as `S3_ACCESS_KEY` in [`docker-compose.prod.yml`](../docker-compose.prod.yml). The attached `archiv-app-policy` grants `s3:GetObject/PutObject/DeleteObject` on `familienarchiv/*` and `s3:ListBucket/GetBucketLocation` on the bucket only — not the built-in `readwrite` policy which would grant `s3:*` on all buckets.
_See also [ADR-010 — MinIO stays self-hosted, not Hetzner OBS](./adr/010-minio-self-hosted-not-hetzner-obs.md)._
---
## Pending Terms
_Terms flagged as potentially ambiguous that have not yet been formally defined here. Add an entry above and remove it from this list when resolved._

View File

@@ -0,0 +1,68 @@
# ADR-008: SQL-level pagination for full-text search via window-function CTE
## Status
Accepted
## Context
`DocumentRepository.findAllMatchingIdsByFts` (formerly `findRankedIdsByFts`) returns all matching document IDs for a FTS query. `DocumentService.searchDocuments` then paginates in memory on the RELEVANCE sort path.
A pre-production audit against 1,520 documents measured:
```
rows_per_call: 911 / call (query: "walter")
```
At current scale this is acceptable — 911 UUIDs ≈ 14 KB, ms-level DB time. At 100 K+ documents two failure modes emerge:
1. **Memory**: a broad query returns ~60 K UUIDs ≈ 1 MB per request, multiplied by concurrent users.
2. **Latency**: the `LATERAL` join does work proportional to match-set size; at 60 K matches the FTS step alone exceeds 100 ms per query.
Tracked as finding **F-31 (High)** in the pre-production architectural review.
## Decision
Push pagination and rank ordering into SQL for the RELEVANCE sort path when no non-text filters are active (pure full-text search):
```sql
WITH q AS (
SELECT CASE WHEN websearch_to_tsquery('german', :query)::text <> ''
THEN to_tsquery('simple', regexp_replace(
websearch_to_tsquery('german', :query)::text,
'''([^'']+)''', '''\\1'':*', 'g'))
END AS pq
), matches AS (
SELECT d.id, ts_rank(d.search_vector, q.pq) AS rank
FROM documents d, q
WHERE d.search_vector @@ q.pq
)
SELECT id, rank, COUNT(*) OVER () AS total
FROM matches
ORDER BY rank DESC, id
OFFSET :offset LIMIT :limit
```
`COUNT(*) OVER ()` returns the full match count alongside each page row in a single round-trip — no separate count query needed.
`rows_per_call` for the FTS query drops from match-set size (911) to page size (≤ 50).
When non-text filters (date range, sender, receiver, tags, status) are also active, the existing path is preserved: `findAllMatchingIdsByFts` returns all ranked IDs, which are passed as an `IN` clause to the JPA Specification, and `totalElements` comes from the JPA `Page.getTotalElements()`. This keeps the count accurate across the combined filter set.
## Alternatives Considered
**1. Two-query approach (separate COUNT + paged SELECT)**
Correct, but doubles round-trips. The window function achieves the same result in one query.
**2. Capped result set with a user-visible warning**
Return at most N results (e.g. 500) and show "showing top 500 of many results". Simpler, but degrades UX for broad queries and doesn't reduce latency proportionally (still scans N rows).
**3. Full SQL rewrite combining FTS + JPA Specification filters**
Possible via a native query that embeds all filter predicates. Eliminates the in-memory SENDER/RECEIVER sort paths and the two-phase approach. High complexity, tight coupling to schema details, loses type-safe JPA Specification composition. Deferred to a future refactor if scale demands it.
## Consequences
- **`rows_per_call` for pure-text FTS searches drops to ≤ page size** — the primary metric.
- **SENDER and RECEIVER sort paths stay in-memory** for combined text+filter queries. For pure-text queries with SENDER/RECEIVER sort, the current approach (fetch all matched IDs, build spec, load all matched entities, sort in-memory) still runs. This is acceptable while the archive stays under ~10 K documents.
- **RELEVANCE sort with text+filters still loads the full filtered entity set in-memory.** The filtered set is typically much smaller than the raw FTS match set, so the cost is bounded by filter selectivity, not total match count.
- **`findAllMatchingIdsByFts` is retained** for: (a) the bulk-edit "select all" fast path (`findIdsForFilter`), (b) the document density chart (`getDensity`), and (c) the SENDER/RECEIVER in-memory sort paths.

View File

@@ -0,0 +1,50 @@
# ADR-009: Standalone `docker-compose.prod.yml`, not an overlay
## Status
Accepted
## Context
The repository's `docker-compose.yml` is a development stack: every service is built locally, ports are exposed on `0.0.0.0` for dev tooling, the frontend runs `npm run dev` with hot-reload, the backend is `spring-boot:run` with the dev profile, and there is no Caddy, no `archiv-app` service account, no admin-credential lock-in, no healthcheck-gated startup sequence. The dev stack reflects "single developer on a laptop", not "production on a single VPS".
The pre-merge design (issue #497, comment #8331) sketched two ways to add a production stack:
1. **Overlay** — keep `docker-compose.yml` as the base, add `docker-compose.prod.yml` as a `-f` overlay (`docker compose -f docker-compose.yml -f docker-compose.prod.yml up`). Compose merges the two files at runtime.
2. **Standalone** — make `docker-compose.prod.yml` a fully self-contained file that does not reference or merge with `docker-compose.yml` at all. Project-name namespacing (`-p archiv-production`, `-p archiv-staging`) keeps multi-environment deploys clean on a single host.
The earlier `docs/infrastructure/production-compose.md` notes assumed overlay because the original plan was to **remove** MinIO in production (replace with Hetzner Object Storage), so the prod file would only need to remove one service and add a few. With MinIO retained (see ADR-010), the prod stack diverges from dev in essentially every service: build vs pre-built image, target stage, port binding, env vars, healthcheck, restart policy, mem_limit, profile gating, service account, depends_on chain. Overlay would mostly be `override:` blocks that nullify the dev defaults — a fragile inversion.
## Decision
`docker-compose.prod.yml` is standalone. Production and staging both run it directly:
```
production: docker compose -f docker-compose.prod.yml -p archiv-production --env-file .env.production ...
staging: docker compose -f docker-compose.prod.yml -p archiv-staging --env-file .env.staging --profile staging ...
```
Environment isolation is achieved via the Docker Compose project name (`-p`). Volumes, networks, and containers are namespaced by the project name, so production and staging cohabit cleanly on the same host without interfering.
The dev `docker-compose.yml` is unchanged — `docker compose up` still works for developers, and its `frontend` service now specifies `target: development` explicitly so the new multi-stage Dockerfile builds the right stage.
## Alternatives Considered
| Alternative | Why rejected |
|---|---|
| Overlay (`-f base.yml -f prod.yml`) | With MinIO retained and most services differing across nearly every field, the overlay would consist mostly of `override:` blocks that null out dev defaults. Compose's merge semantics for nested keys (env, ports, healthcheck) are sharp — silent merges of port mappings, env-var entries, and depends_on edges cost reviewer hours. Standalone is one file the reader can hold in their head. |
| Two fully separate files (dev + prod) but with shared YAML anchors via `extends:` | `extends:` works across files but is a niche feature and is increasingly discouraged in compose v2. Reviewer load is higher than reading two flat files. |
| Generate prod compose from a template at deploy time (e.g. ytt, kustomize) | Adds a build-time step and a new tool to the operator toolchain. Justified for a fleet of 10+ environments; overkill for production + staging on one host. |
| Single compose file with environment-specific profiles | Compose profiles select which *services* run, not which *configuration* a service runs with. Using profiles to swap "build locally" vs "pull image" would smear dev and prod across one file. |
## Consequences
- The prod file can be read top-to-bottom without cross-referencing `docker-compose.yml`. Onboarding and review cost drops.
- Volume namespacing is automatic (`archiv-production_postgres-data`, `archiv-staging_postgres-data`) — no manual `volumes:` aliasing.
- Dev compose churn (e.g. swapping a dev port) cannot accidentally affect production. The two files are independent.
- The cost is duplication: identical environment variables (e.g. `POSTGRES_DB: archiv`) appear in both files. This duplication is bounded — there is no incentive to add more services that exist in both — and the alternative (overlay) carries its own duplication via `override:` boilerplate.
- The retired `docs/infrastructure/production-compose.md` narrative is trimmed to a pointer at the live files. The cost/sizing rationale is preserved there.
## Future Direction
If the deployment fleet ever grows beyond two environments on one host (e.g. add a `demo` environment, or shard staging across two VPS for load testing), revisit the templating decision. At three+ environments the duplication starts to bite and a template engine (kustomize or ytt) becomes attractive.

View File

@@ -0,0 +1,53 @@
# ADR-010: MinIO stays self-hosted on the production VPS
## Status
Accepted
## Context
`docs/infrastructure/production-compose.md` (pre-this-PR) sketched a production topology in which the application bucket migrates from in-cluster MinIO to Hetzner Object Storage (OBS, S3-compatible). The motivation was operational: one less service to back up, no MinIO RAM/disk pressure on the VPS, hand off durability to the hyperscaler.
Two facts revisited at pre-merge review (issue #497, comment #8331) changed the answer:
1. **Current data size is small.** The archive is ~13 GB of file uploads (Kurrent letters, scanned ODS files, attachment PDFs). Hetzner OBS billing on this size is dominated by the per-month base fee (~5 EUR/mo for the smallest unit), not capacity or egress. The break-even point against the VPS's existing disk is far above the current footprint.
2. **MinIO is already production-grade.** The dev stack uses MinIO; the backend already drives it via the AWS SDK v2 with a generic `S3_ENDPOINT`. Switching providers is a runtime env-var change (`S3_ENDPOINT`, `S3_ACCESS_KEY`, `S3_SECRET_KEY`) plus an `mc mirror` to copy objects. There is no application-level rewrite cost waiting.
If Hetzner OBS were a one-way-door (provider-specific SDK, complex IAM integration, multi-month migration), the decision would deserve a serious weighing. As reversible as the migration is, deferring it costs nothing.
## Decision
MinIO stays on the production VPS for the first launch. The application bucket is created and managed inside the docker-compose stack (`infra/minio/bootstrap.sh`). The backend uses a least-privilege service account (`archiv-app`) with a bucket-scoped IAM policy, not the MinIO root credentials.
Hetzner Object Storage is **explicitly deferred**, not rejected. The migration path is documented as a runbook in `docs/DEPLOYMENT.md` (when the trigger fires): provision an OBS bucket, run `mc mirror local-minio:/familienarchiv obs:/familienarchiv`, rotate the three env vars, restart the backend, decommission the MinIO service from `docker-compose.prod.yml`.
## Triggers to re-evaluate
Revisit the decision when **any** of the following holds:
- The `minio-data` volume exceeds 50 GB and is growing > 5 GB/month.
- MinIO healthcheck latency exceeds 200 ms p95 (signal of disk pressure on the host).
- The VPS upgrade required to keep MinIO healthy costs more per month than the equivalent OBS bucket + traffic.
- Backup of the MinIO volume to `heim-nas` over Tailscale (deferred follow-up) is implemented and consistently runs > 30 min nightly. At that point durability-as-a-service starts paying for itself.
The migration runbook in `docs/DEPLOYMENT.md` is the script for executing the swap when one of the triggers fires.
## Alternatives Considered
| Alternative | Why rejected (for now) |
|---|---|
| Migrate to Hetzner Object Storage in this PR | Premature. Adds an external dependency, locks the operator into the Hetzner ecosystem before the data has demonstrated it needs hyperscaler durability, blocks the PR on a migration that buys ~5 GB of headroom. |
| Migrate to S3 (AWS) for HA across regions | Way over-spec for a family archive. Egress cost would dwarf any benefit; durability concerns at this size are addressed by nightly off-site backup, not by multi-region replication. |
| Drop S3 abstraction entirely; store files directly on the VPS disk | Possible, but loses the bucket-policy IAM surface (least-privilege service account), loses presigned-URL flow (OCR service downloads files via short-lived URLs, not via shared filesystem), loses the migration path to OBS. The S3 indirection is cheap insurance. |
| Self-hosted on-VPS plus periodic `mc mirror` to Hetzner OBS for off-site backup | This is the **target** for the backup pipeline follow-up. Treated as backup, not primary — primary stays MinIO. |
## Consequences
- The production VPS sizing (Hetzner CX42, 16 GB RAM, 80 GB disk) must accommodate MinIO's working set. Current footprint leaves ample headroom.
- Backup of MinIO data is the operator's responsibility until the off-site `mc mirror` pipeline is implemented (deferred follow-up). The DEPLOYMENT.md rollback procedure explicitly flags this — manual backup is the only recovery option until the pipeline ships.
- The backend never sees the MinIO root password; it uses the `archiv-app` service account with a bucket-scoped IAM policy (see `infra/minio/bootstrap.sh`). A backend RCE/SSRF cannot escalate beyond the `familienarchiv` bucket.
- The migration to Hetzner OBS remains a small, well-understood runbook step rather than a major refactor. No application code, no SDK swap.
## Future Direction
When one of the triggers above fires, the migration is: provision OBS bucket → `mc mirror` → rotate three env vars → restart backend → remove MinIO service from compose. The bucket-scoped policy translates 1:1 to an OBS user policy (S3-compatible).

View File

@@ -0,0 +1,58 @@
# ADR-011: Single-tenant Gitea runner with secrets-on-disk env-files
## Status
Accepted
## Context
The deploy workflows (`.gitea/workflows/nightly.yml`, `release.yml`) execute on a self-hosted Gitea Actions runner. The runner has Docker-out-of-Docker access (the host's Docker socket is mounted into the runner), so `docker compose build` produces images on the host daemon and `docker compose up` consumes them directly — no registry hop.
Two workflow steps shape the security model:
1. **"Write env file"** — the workflow writes every required secret to `.env.staging` or `.env.production` on the runner's filesystem so that `docker compose --env-file` can consume them. The file lives on disk for the duration of the workflow.
2. **"Cleanup env file"** — the matching `if: always()` step deletes the env file after the workflow ends, regardless of success.
This shape only works under one operational assumption: **the runner is single-tenant**. The runner is owned by the same operator who owns the secrets, no other repositories run jobs on the same runner, and no untrusted code is executed (no public fork PRs trigger workflows). If any of those held, the env-file-on-disk approach would be a credential exposure path — a sibling job could read `.env.production`, or a malicious PR could exfiltrate the secrets via a step.
The alternative — `docker compose --env-file <(printf "..." )` (bash process substitution) — is technically supported and would keep secrets out of the on-disk filesystem. It is more secure under a multi-tenant runner but requires bash 4+ and is brittle inside YAML (the `printf` step would need to escape every secret value containing newlines, equals signs, or quotes).
## Decision
The runner is treated as single-tenant for the lifetime of the v1 deployment. The workflows write env-files to disk under that assumption and rely on the `if: always()` cleanup step to remove them. The operational assumption is documented in-comment at the top of both workflow files (`nightly.yml`, `release.yml`) so the next operator who considers adding a second repo or accepting public PRs has the trigger surfaced in front of them.
Concretely:
- The Gitea runner only runs jobs for `marcel/familienarchiv`.
- No public fork PRs trigger the workflows (Gitea defaults to requiring an explicit approval on first-time contributor PRs for the actions to run).
- Secrets are stored in Gitea repository secrets and injected via `${{ secrets.* }}`. They land in the env-file at workflow start and are removed at workflow end.
## Migration trigger
Switch to the multi-tenant-safe pattern when **any** of the following becomes true:
- A second repository starts using the same runner.
- A workflow accepts contributions that can run untrusted code (public PRs without manual approval).
- The runner is moved off the operator's controlled host onto shared infrastructure.
The migration path is one-step per workflow: replace the "Write env file" step with `--env-file <(printf '%s' "${{ secrets.STAGING_ENV_BLOB }}")` and store the full env-file as a single Gitea secret. The cleanup step is then unnecessary because the env-file never touches disk.
## Alternatives Considered
| Alternative | Why rejected (for now) |
|---|---|
| `--env-file <(printf "...")` via bash process substitution | More secure under multi-tenant. Brittle for multi-line / quoted secret values; harder to debug ("env file not found" with no diff to inspect). Justified once the trigger above fires. |
| Docker secrets (`docker secret create` + `compose secrets:`) | Designed for Swarm; outside of Swarm, compose secrets read from files anyway, so the on-disk surface is the same. Adds complexity without changing the threat model. |
| External secret manager (Vault, AWS Secrets Manager) | Adds a third-party dependency to the deploy path. For a family-archive deployment with one operator and one VPS, the cost outweighs the benefit at this scale. |
| GitHub-hosted ephemeral runners | Would require uploading the prod-deploy artifacts to a registry first, then a deploy step on the VPS connecting back. Inverts the current Docker-out-of-Docker simplicity for marginal security gain. The single-tenant self-hosted runner *is* ephemeral in practice — the secrets are written to a directory the runner controls, then deleted. |
## Consequences
- The runner host's filesystem is in the secret-trust boundary. The host is hardened per `docs/DEPLOYMENT.md` (ufw, fail2ban, Tailscale-only SSH).
- An operator who later adds a second repo to the runner without revisiting the workflows would silently break the trust assumption. The in-file comments at the top of `nightly.yml` and `release.yml` are the breadcrumb that surfaces the assumption at change time.
- The `if: always()` cleanup step is load-bearing: removing it (e.g. during a future workflow refactor) leaves credentials on disk between runs. Treat it as a permanent invariant.
- Workflow debuggability stays high: an operator who needs to know what env-file the deploy ran with can SSH onto the host while a workflow is in flight and `cat .env.staging` — useful for first-deploy diagnostics.
## Future Direction
When the trigger fires, migrate both workflows in a single PR: replace the "Write env file" step with a single `--env-file <(printf '%s' …)` invocation, drop the cleanup step, and consolidate the per-secret Gitea entries into a single multi-line `STAGING_ENV_BLOB` / `PROD_ENV_BLOB` secret. Single commit, both workflows, no application change.

View File

@@ -0,0 +1,63 @@
# ADR-012: nsenter via privileged sibling container for host service management in CI
## Status
Accepted
## Context
The deploy workflows (`.gitea/workflows/nightly.yml`, `release.yml`) run job steps inside Docker containers under a Docker-out-of-Docker (DooD) setup: the Gitea runner container mounts the host Docker socket, and act_runner spawns a sibling container for each job. That job container also gets the Docker socket mounted (via `valid_volumes` in `runner-config.yaml`).
This architecture has one significant limitation: **job containers cannot manage host services**. Specifically:
- Job containers are not in the host's PID, mount, UTS, network, or IPC namespaces.
- There is no systemd PID 1 inside a job container — `systemctl` has nothing to talk to.
- `sudo` is not present in standard container images; even if it were, it would not help.
- Caddy runs as a **host systemd service** (not a Docker container), managing TLS certificates via Let's Encrypt. It must be running on the host to serve port 443.
The deploy workflows need to tell Caddy to reload its config after each deploy so that committed Caddyfile changes are applied before the smoke test validates the public surface. Without a reload step, Caddy silently serves the previous config and the smoke test may pass against stale configuration.
## Decision
Use the host Docker socket (already mounted in every job container via `runner-config.yaml`) to spin up a **privileged sibling container** in the host PID namespace, then use `nsenter` to enter all host namespaces and call `systemctl reload caddy`:
```yaml
- name: Reload Caddy
run: |
docker run --rm --privileged --pid=host \
alpine:3.21@sha256:48b0309ca019d89d40f670aa1bc06e426dc0931948452e8491e3d65087abc07d \
sh -c 'apk add --no-cache util-linux -q && nsenter -t 1 -m -u -n -p -i -- /bin/systemctl reload caddy'
```
`nsenter -t 1 -m -u -n -p -i` enters the init process's mount, UTS, IPC, network, PID, and cgroup namespaces, giving `systemctl` a view of the real host systemd daemon.
**Alpine is used** instead of Ubuntu: ~5 MB vs ~70 MB pull size, no unnecessary tooling. `util-linux` (which ships `nsenter`) is installed at run time; apk add takes ~1 s on the warm VPS cache. The image digest is pinned so any upstream change requires an explicit Renovate bump PR.
**`reload` not `restart`**: reload sends SIGHUP so Caddy re-reads its config in-process without dropping TLS connections or in-flight requests.
**No sudoers entry is required**: the Docker socket already grants root-equivalent host access. This pattern makes existing implicit privileges explicit rather than introducing new ones.
This decision applies the same pattern to both `nightly.yml` and `release.yml` since both deploy the app stack and must apply Caddyfile changes before smoke-testing the public surface.
## Alternatives Considered
| Alternative | Why rejected |
|---|---|
| `sudo systemctl reload caddy` in the job container | No systemd PID 1 inside the container — `systemctl` has nothing to connect to. `sudo` is not present in container images and would not help even if it were. |
| Caddy admin API (`curl localhost:2019/load`) | Job containers do not share the host network namespace; `localhost:2019` on the host is unreachable. Exposing `:2019` on a host-bound port would add a network attack surface with no benefit over the current approach. |
| SSH from the job container to the VPS host | Requires storing an SSH private key as a CI secret, managing authorized_keys on the host, and opening an inbound SSH path from the container. Adds key management overhead for a pattern that the Docker socket already enables more directly. |
| Running Caddy as a Docker container (instead of host service) | Caddy manages TLS certificates via Let's Encrypt; running it in Docker complicates certificate persistence and renewal. As a host service, cert storage is straightforward and restarts do not risk rate-limit issues. This would be a larger infrastructure change unrelated to the CI gap. |
## Consequences
- The runner host's Docker socket access is now a capability relied upon for host service management, not just for running `docker compose` commands. This is stated explicitly in the YAML comment so future reviewers understand the trust boundary.
- The Caddyfile symlink on the VPS (`/etc/caddy/Caddyfile → /opt/familienarchiv/infra/caddy/Caddyfile`) is a required contract for CI to succeed. It is documented in `docs/DEPLOYMENT.md §3.1` and `docs/infrastructure/ci-gitea.md`. If the symlink is absent or mis-pointed, `systemctl reload caddy` succeeds but Caddy serves stale config.
- Renovate will create bump PRs when a new Alpine 3.21 digest is published. Because the container runs `--privileged --pid=host`, these bump PRs must be reviewed manually and must not be auto-merged. A `packageRule` in `renovate.json` enforces this.
- The step is duplicated between `nightly.yml` and `release.yml` (tracked in issue #539 for extraction into a composite action).
- If Caddy is not running when the step executes, `systemctl reload` exits non-zero and the workflow aborts before the smoke test — preventing a misleading "port 443 refused" curl error.
## References
- `docs/infrastructure/ci-gitea.md` §"Running host-level commands from CI (nsenter pattern)" — full operational context, troubleshooting guide
- `docs/DEPLOYMENT.md` §3.1 — Caddyfile symlink bootstrap step
- ADR-011 — single-tenant runner trust model (Docker socket access scope)

View File

@@ -6,23 +6,27 @@ title Container Diagram: Familienarchiv
Person(user, "User", "Admin or family member")
System_Ext(mail, "Email Service", "SMTP server. Delivers notification and password-reset emails.")
Container(caddy, "Reverse Proxy", "Caddy 2 (host-installed)", "TLS termination (auto Let's Encrypt). Routes /api/* to backend:8080, everything else to frontend:3000. Responds 404 on /actuator/* and adds HSTS, X-Content-Type-Options, Referrer-Policy headers.")
System_Boundary(archiv, "Familienarchiv (Docker Compose)") {
Container(frontend, "Web Frontend", "SvelteKit / Node.js", "Server-side rendered UI. Handles auth session cookies, document search and viewer, transcription editor, annotation layer, family tree (Stammbaum), stories (Geschichten), activity feed (Chronik), enrichment workflow, and admin panel.")
Container(backend, "API Backend", "Spring Boot 4 / Java 21 / Jetty", "REST API. Implements document management, search, user auth, file upload/download, transcription, OCR orchestration, and SSE notifications.")
Container(frontend, "Web Frontend", "SvelteKit / Node adapter / port 3000", "Server-side rendered UI. Handles auth session cookies, document search and viewer, transcription editor, annotation layer, family tree (Stammbaum), stories (Geschichten), activity feed (Chronik), enrichment workflow, and admin panel.")
Container(backend, "API Backend", "Spring Boot 4 / Java 21 / Jetty / port 8080", "REST API. Implements document management, search, user auth, file upload/download, transcription, OCR orchestration, and SSE notifications. Trusts X-Forwarded-* headers from Caddy.")
Container(ocr, "OCR Service", "Python FastAPI / port 8000", "Handwritten text recognition (HTR) and OCR microservice. Single-node by design — see ADR-001. Reachable only on the internal Docker network; no external port exposed.")
ContainerDb(db, "Relational Database", "PostgreSQL 16", "Stores document metadata, persons, users, permission groups, tags, transcription blocks, audit log, and Spring Session data.")
ContainerDb(storage, "Object Storage", "MinIO (S3-compatible)", "Stores the actual document files (PDFs, scans). Objects keyed as documents/{UUID}_{filename}.")
Container(mc, "Bucket Init Helper", "MinIO Client (mc)", "One-shot container on startup. Creates the archive bucket with private access policy.")
ContainerDb(storage, "Object Storage", "MinIO (S3-compatible)", "Stores the actual document files (PDFs, scans). Backend uses a bucket-scoped service account (archiv-app), not MinIO root.")
Container(mc, "Bucket / Service-Account Init", "MinIO Client (mc)", "One-shot container on startup. Idempotent: creates the archive bucket, the archiv-app service account, and attaches the readwrite policy.")
}
Rel(user, frontend, "Uses", "HTTPS / Browser")
Rel(user, caddy, "HTTPS", "TLS 1.2/1.3")
Rel(caddy, frontend, "Reverse proxies non-/api requests", "HTTP / loopback:3000")
Rel(caddy, backend, "Reverse proxies /api/*", "HTTP / loopback:8080")
Rel(frontend, backend, "API requests with Basic Auth token", "HTTP / REST / JSON")
Rel(backend, user, "SSE notifications (server-sent events)", "HTTP / SSE — direct backend-to-browser")
Rel(backend, user, "SSE notifications (server-sent events)", "HTTP / SSE — fronted by Caddy")
Rel(backend, db, "Reads and writes metadata and sessions", "JDBC / SQL")
Rel(backend, storage, "Uploads and streams document files", "HTTP / S3 API (AWS SDK v2)")
Rel(backend, storage, "Uploads and streams document files using archiv-app service account", "HTTP / S3 API (AWS SDK v2)")
Rel(backend, ocr, "OCR job requests with presigned MinIO URL", "HTTP / REST / JSON")
Rel(backend, mail, "Sends notification and password-reset emails (optional)", "SMTP")
Rel(ocr, storage, "Fetches PDF via presigned URL", "HTTP / S3 presigned")
Rel(mc, storage, "Creates bucket on startup", "MinIO Client CLI")
Rel(mc, storage, "Bootstraps bucket + service account on startup", "MinIO Client CLI")
@enduml

View File

@@ -1,26 +1,49 @@
@startuml
title Authentication Flow
title Authentication Flow (behind Caddy reverse proxy)
actor User
participant Browser
participant "Caddy (TLS termination)" as Caddy
participant "Frontend (SvelteKit)" as Frontend
participant "Backend (Spring Boot)" as Backend
participant PostgreSQL as DB
User -> Browser: Enter email + password
Browser -> Frontend: POST /login (form action)
Browser -> Caddy: HTTPS POST /login (form action)
note right of Caddy
Caddy terminates TLS and forwards
to Frontend over HTTP with:
X-Forwarded-Proto: https
X-Forwarded-For: <client IP>
X-Forwarded-Host: archiv.raddatz.cloud
end note
Caddy -> Frontend: HTTP POST /login\n+ X-Forwarded-Proto: https
Frontend -> Frontend: Base64 encode "email:password"
Frontend -> Backend: GET /api/users/me\nAuthorization: Basic <token>
Frontend -> Backend: GET /api/users/me\nAuthorization: Basic <token>\n+ X-Forwarded-Proto: https
note right of Backend
server.forward-headers-strategy: native
Jetty's ForwardedRequestCustomizer
reads X-Forwarded-Proto so
request.getScheme() returns "https".
end note
Backend -> Backend: Spring Security parses Basic Auth
Backend -> DB: SELECT user WHERE email=?
DB --> Backend: AppUser + groups + permissions
Backend -> Backend: BCrypt.matches(password, hash)
Backend --> Frontend: 200 OK — UserDTO
Frontend -> Browser: Set-Cookie: auth_token=<base64>\n(httpOnly, SameSite=strict, maxAge=86400)
Browser -> Frontend: GET / (next request)
Frontend -> Caddy: Set-Cookie: auth_token=<base64>\n(httpOnly, **Secure**, SameSite=strict, maxAge=86400)
note right of Frontend
Secure flag is set because the
request scheme observed by the
app is https (forwarded by Caddy).
end note
Caddy -> Browser: HTTPS 200 + Set-Cookie
Browser -> Caddy: HTTPS GET / (next request)
Caddy -> Frontend: HTTP GET / + X-Forwarded-Proto: https
Frontend -> Frontend: hooks.server.ts reads auth_token cookie
Frontend -> Backend: GET /api/users/me\nAuthorization: Basic <token>
Backend --> Frontend: 200 OK — user in event.locals
Frontend --> Browser: Render page with user context
Frontend --> Caddy: rendered page
Caddy --> Browser: HTTPS 200
@enduml

View File

@@ -4,16 +4,109 @@ This document covers the Gitea Actions CI workflow for Familienarchiv, including
---
## Self-Hosted Runner Provisioning
## Runner Architecture
Gitea Actions requires self-hosted runners. GitHub Actions provides `ubuntu-latest` for free; on Gitea you run the runner yourself.
Familienarchiv uses **two runners** on the same Hetzner VPS:
```bash
# On the VPS — register a Gitea Actions runner
docker run -d --name gitea-runner --restart unless-stopped -v /var/run/docker.sock:/var/run/docker.sock -v gitea-runner-data:/data -e GITEA_INSTANCE_URL=https://gitea.example.com -e GITEA_RUNNER_REGISTRATION_TOKEN=<token-from-gitea-settings> -e GITEA_RUNNER_NAME=vps-runner-1 -e GITEA_RUNNER_LABELS=ubuntu-latest:docker://node:20-bullseye gitea/act_runner:latest
| Runner | Purpose | Config |
|---|---|---|
| `gitea` (Docker container) | Hosts Gitea itself | `infra/gitea/docker-compose.yml` |
| `gitea-runner` (Docker container) | Runs all CI and deploy jobs | `infra/gitea/docker-compose.yml` + `/root/docker/gitea/runner-config.yaml` |
Both containers live in the `gitea_gitea` Docker network on the VPS. The runner connects to Gitea via the LAN IP so job containers (which don't share the `gitea_gitea` network) can also reach it.
### Docker-out-of-Docker (DooD)
The `gitea-runner` container mounts the host Docker socket (`/var/run/docker.sock`). When a workflow job runs, act_runner spawns a **sibling container** for each job. That job container also gets the Docker socket mounted (via `valid_volumes` in `runner-config.yaml`), enabling `docker compose` calls in workflow steps.
### Running host-level commands from CI (nsenter pattern)
Job containers are unprivileged and do not share the host's PID/mount/network namespaces. Commands like `systemctl` that target the host daemon are therefore unavailable by default. When a workflow step needs to manage a host service (e.g. `systemctl reload caddy`), it uses the Docker socket to spin up a **privileged sibling container** in the host PID namespace:
```yaml
- name: Reload Caddy
run: |
docker run --rm --privileged --pid=host \
alpine:3.21@sha256:48b0309ca019d89d40f670aa1bc06e426dc0931948452e8491e3d65087abc07d \
sh -c 'apk add --no-cache util-linux -q && nsenter -t 1 -m -u -n -p -i -- /bin/systemctl reload caddy'
```
The runner label `ubuntu-latest` maps to the Docker image it uses -- this is how `runs-on: ubuntu-latest` in the workflow YAML continues to work unchanged.
`nsenter -t 1 -m -u -n -p -i` enters the init process's mount, UTS, IPC, network, PID, and cgroup namespaces, giving `systemctl` a view of the real host systemd. No sudoers entry is required — the Docker socket already grants root-equivalent host access.
Alpine is used instead of Ubuntu: ~5 MB vs ~70 MB, and the digest is pinned to a specific sha256 so any upstream change requires an explicit Renovate bump PR. `util-linux` (which ships `nsenter`) is not part of the Alpine base image but is installed at run time in ~1 s from the warm VPS cache.
#### Why not `sudo systemctl` in the job container?
Job containers run as root inside an unprivileged Docker namespace. There is no systemd PID 1 inside the container — `systemctl` would attempt to reach a socket that does not exist. `sudo` is not present in container images and would not help even if it were.
#### Why not Caddy's admin API?
Caddy ships a localhost admin API at `:2019` by default. Job containers do not share the host network namespace, so they cannot reach `localhost:2019` on the host. Exposing `:2019` on a host-bound port to make it reachable would add a network attack surface with no benefit over the current approach.
### Caddyfile symlink contract
The deploy workflows reload Caddy to pick up committed Caddyfile changes. This relies on a symlink that must exist on the VPS:
```
/etc/caddy/Caddyfile → /opt/familienarchiv/infra/caddy/Caddyfile
```
Created once during server bootstrap (see `docs/DEPLOYMENT.md §3.1`). Verify with:
```bash
ls -la /etc/caddy/Caddyfile
# Expected: lrwxrwxrwx ... /etc/caddy/Caddyfile -> /opt/familienarchiv/infra/caddy/Caddyfile
```
### Troubleshooting: Reload Caddy step fails
**Failure mode 1 — Caddy is stopped**
Symptom in CI log:
```
Failed to reload caddy.service: Unit caddy.service is not active.
```
Recovery:
```bash
ssh root@<vps>
systemctl start caddy
systemctl status caddy # confirm Active: active (running)
```
Re-run the workflow via Gitea Actions → "Re-run workflow".
**Failure mode 2 — Caddyfile symlink is missing or mis-pointed**
This failure is silent — `systemctl reload caddy` exits 0 but Caddy reloads whatever `/etc/caddy/Caddyfile` currently resolves to. The smoke test may then pass against stale config.
Symptom: smoke test fails on the HSTS value or the `/actuator/health → 404` check despite the Reload Caddy step succeeding.
Diagnosis:
```bash
ssh root@<vps>
ls -la /etc/caddy/Caddyfile
# Should be: lrwxrwxrwx ... /etc/caddy/Caddyfile -> /opt/familienarchiv/infra/caddy/Caddyfile
```
Recovery if symlink is wrong or missing:
```bash
ln -sf /opt/familienarchiv/infra/caddy/Caddyfile /etc/caddy/Caddyfile
systemctl reload caddy
```
**Failure mode 3 — nsenter / Docker socket unavailable**
Symptom in CI log:
```
docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock.
```
or
```
nsenter: failed to execute /bin/systemctl: No such file or directory
```
The first error means the Docker socket is not mounted into the job container — check `valid_volumes` in `/root/docker/gitea/runner-config.yaml` on the VPS. The second means the Alpine image is running but cannot enter the host mount namespace; verify `--privileged` and `--pid=host` are both present in the workflow step.
---
@@ -166,7 +259,7 @@ jobs:
timeout 30 bash -c \
'until docker compose -f docker-compose.yml -f docker-compose.ci.yml exec -T db pg_isready -U archive_user; do sleep 2; done'
- name: Connect job container to compose network
run: docker network connect familienarchiv_archive-net $(cat /etc/hostname)
run: docker network connect familienarchiv_archiv-net $(cat /etc/hostname)
- uses: actions/setup-java@v4
with:
java-version: '21'

View File

@@ -1,214 +1,22 @@
# Production Docker Compose & Infrastructure
This document contains the full production Docker Compose file, Caddyfile, VPS sizing recommendations, cost breakdown, and Hetzner ecosystem overview.
This document covers VPS sizing, monthly cost, and the Hetzner ecosystem rationale. The compose file and Caddyfile that previously lived inline in this doc are now committed to the repo root.
> **Where to find the live files (after #497)**
> - Production compose: [`docker-compose.prod.yml`](../../docker-compose.prod.yml) (standalone, not an overlay)
> - Caddyfile: [`infra/caddy/Caddyfile`](../../infra/caddy/Caddyfile)
> - Deploy workflows: [`.gitea/workflows/nightly.yml`](../../.gitea/workflows/nightly.yml) and [`.gitea/workflows/release.yml`](../../.gitea/workflows/release.yml)
> - Bootstrap checklist, secrets, rollback procedure: [`docs/DEPLOYMENT.md`](../DEPLOYMENT.md)
The original spec in this doc proposed an overlay pattern (`docker compose -f docker-compose.yml -f docker-compose.prod.yml`) with MinIO disabled in production in favour of Hetzner Object Storage. That approach was retired in #497 in favour of a standalone prod compose that keeps MinIO self-hosted on the VPS. The Hetzner OBS migration is tracked as a future follow-up; the swap is three env vars + `mc mirror` once we decide to do it.
---
## Full docker-compose.prod.yml
## Observability stack — not yet deployed
Usage: `docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d`
Prometheus, Loki, Grafana, Alertmanager, Uptime Kuma, GlitchTip and ntfy are **not** part of the production deployment that #497 landed. They are tracked as follow-up issue #498.
```yaml
# docker-compose.prod.yml
# Usage: docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
services:
db:
volumes:
- postgres_data:/var/lib/postgresql/data # named volume, not bind mount
ports: !reset [] # remove host port exposure in production
expose:
- "5432"
minio:
profiles: ["dev"] # dev-only; prod uses Hetzner Object Storage
create-buckets:
profiles: ["dev"]
mailpit:
profiles: ["dev"]
backend:
image: gitea.example.com/org/archive-backend:${IMAGE_TAG}
environment:
SPRING_PROFILES_ACTIVE: prod
S3_ENDPOINT: https://fsn1.your-objectstorage.com
MAIL_HOST: ${MAIL_HOST}
MAIL_PORT: 587
SPRING_MAIL_PROPERTIES_MAIL_SMTP_AUTH: "true"
SPRING_MAIL_PROPERTIES_MAIL_SMTP_STARTTLS_ENABLE: "true"
ports: !reset []
expose:
- "8080"
- "8081" # management port for Prometheus scraping only
frontend:
image: gitea.example.com/org/archive-frontend:${IMAGE_TAG}
ports: !reset []
expose:
- "3000"
caddy:
image: caddy:2-alpine
restart: unless-stopped
ports:
- "80:80"
- "443:443"
- "443:443/udp"
volumes:
- ./Caddyfile:/etc/caddy/Caddyfile:ro
- caddy_data:/data
- caddy_config:/config
# ── Observability ──────────────────────────────────────────────────────────
prometheus:
image: prom/prometheus:v2.51.0 # pinned
restart: unless-stopped
volumes:
- ./observability/prometheus.yml:/etc/prometheus/prometheus.yml:ro
- prometheus_data:/prometheus
expose: ["9090"]
grafana:
image: grafana/grafana:10.4.0 # pinned
restart: unless-stopped
environment:
GF_SECURITY_ADMIN_PASSWORD: ${GRAFANA_PASSWORD}
GF_PATHS_PROVISIONING: /etc/grafana/provisioning
GF_SERVER_ROOT_URL: https://grafana.example.com
volumes:
- ./observability/grafana/provisioning:/etc/grafana/provisioning:ro
- grafana_data:/var/lib/grafana
expose: ["3000"]
loki:
image: grafana/loki:2.9.0 # pinned
restart: unless-stopped
volumes:
- ./observability/loki-config.yml:/etc/loki/config.yml:ro
- loki_data:/loki
expose: ["3100"]
promtail:
image: grafana/promtail:2.9.0 # pinned
restart: unless-stopped
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
- ./observability/promtail-config.yml:/etc/promtail/config.yml:ro
alertmanager:
image: prom/alertmanager:v0.27.0 # pinned
restart: unless-stopped
volumes:
- ./observability/alertmanager.yml:/etc/alertmanager/alertmanager.yml:ro
expose: ["9093"]
# ── Uptime monitoring ──────────────────────────────────────────────────────
uptime-kuma:
image: louislam/uptime-kuma:1
restart: unless-stopped
volumes:
- uptime_kuma_data:/app/data
expose: ["3001"]
# ── Error tracking ─────────────────────────────────────────────────────────
glitchtip-web:
image: glitchtip/glitchtip:latest
restart: unless-stopped
depends_on: [db]
environment:
DATABASE_URL: postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@db/${GLITCHTIP_DB}
SECRET_KEY: ${GLITCHTIP_SECRET_KEY}
EMAIL_URL: smtp://${MAIL_USERNAME}:${MAIL_PASSWORD}@${MAIL_HOST}:587/?tls=true
GLITCHTIP_DOMAIN: https://errors.example.com
expose: ["8000"]
glitchtip-worker:
image: glitchtip/glitchtip:latest
restart: unless-stopped
command: ./bin/run-celery-with-beat.sh
depends_on: [glitchtip-web]
environment:
DATABASE_URL: postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@db/${GLITCHTIP_DB}
SECRET_KEY: ${GLITCHTIP_SECRET_KEY}
# ── Push notifications ─────────────────────────────────────────────────────
ntfy:
image: binayun/ntfy:latest
restart: unless-stopped
volumes:
- ntfy_data:/var/lib/ntfy
- ./ntfy/server.yml:/etc/ntfy/server.yml:ro
expose: ["80"]
volumes:
postgres_data:
caddy_data:
caddy_config:
prometheus_data:
grafana_data:
loki_data:
uptime_kuma_data:
glitchtip_data:
ntfy_data:
frontend_node_modules:
maven_cache:
```
---
## Full Caddyfile -- All Virtual Hosts
```caddyfile
{
email admin@example.com
}
# Main application
app.example.com {
header {
Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
X-Content-Type-Options "nosniff"
X-Frame-Options "DENY"
Referrer-Policy "strict-origin-when-cross-origin"
-Server
}
@api path /api/*
reverse_proxy @api backend:8080
@actuator path /actuator/*
respond @actuator 404
reverse_proxy frontend:3000
}
# Gitea — source code and CI
git.example.com {
reverse_proxy gitea:3000
}
# Grafana — observability
grafana.example.com {
basicauth {
admin $2a$14$...
}
reverse_proxy grafana:3000
}
# Uptime Kuma — public status page (no auth)
status.example.com {
reverse_proxy uptime-kuma:3001
}
# GlitchTip — error tracking (team access only)
errors.example.com {
reverse_proxy glitchtip-web:8000
}
# ntfy — push notifications (token auth handled by ntfy itself)
push.example.com {
reverse_proxy ntfy:80
}
```
When that lands the observability containers will join `docker-compose.prod.yml` under a dedicated profile so they can be operated alongside the application stack without affecting the application containers' restart cycle.
---
@@ -216,61 +24,47 @@ push.example.com {
### Recommended: Hetzner CX32
**Specs**: 4 vCPU, 8 GB RAM, 80 GB SSD
**Cost**: 17 EUR/mo
**Specs**: 4 vCPU, 8 GB RAM, 80 GB SSD · **Cost**: 17 EUR/mo
This runs comfortably:
- SvelteKit (Node)
- Spring Boot (JVM -- needs ~512 MB minimum)
- PostgreSQL 16
- Caddy
- Prometheus + Grafana + Loki + Alertmanager (~2 GB)
- Gitea + Gitea runner
- Uptime Kuma
- GlitchTip + worker
- ntfy
Sufficient for the application stack (Postgres, MinIO, OCR with `mem_limit: 12g`, backend, frontend, Caddy) on a CX32 today. Once the observability stack lands (Prometheus/Loki/Grafana/Alertmanager add ~2 GB) consider a CX42.
### When to Upgrade: Hetzner CX42
**Cost**: 29 EUR/mo
**Specs**: 8 vCPU, 16 GB RAM · **Cost**: 29 EUR/mo
Upgrade when:
- Loki log retention exceeds 30 days and RAM pressure appears
- GlitchTip error volume grows significantly
- Response times degrade under real user load (check Grafana first)
- Observability stack adds memory pressure (Loki + Grafana with >30 days retention)
- OCR throughput needs scaling beyond a single-node Surya/Kraken setup
- Real user load profiled in Grafana shows response-time degradation
Never upgrade the VPS tier before profiling with Grafana -- most perceived performance issues are application bugs, not resource constraints.
Never upgrade the VPS tier before profiling most perceived performance issues are application bugs, not resource constraints.
---
## Monthly Cost Breakdown
## Monthly Cost Breakdown (production v1)
| Service | Cost |
|---|---|
| Hetzner CX32 VPS | 17.00 EUR |
| Hetzner Object Storage (~200 GB) | 5.00 EUR |
| Hetzner SMTP relay | ~1.00 EUR |
| Hetzner DNS | 0.00 EUR |
| **Total** | **~23 EUR/mo** |
| Hetzner SMTP relay | ~1.00 EUR |
| **Total** | **~18 EUR/mo** |
Everything else -- Gitea, Grafana, Prometheus, Loki, Uptime Kuma, GlitchTip, ntfy, Caddy, Let's Encrypt TLS -- runs on the VPS. Zero additional cost.
MinIO data lives on the VPS disk (no Object Storage line item yet). The Hetzner OBS migration would add ~5 EUR/mo at ~200 GB.
Equivalent SaaS stack: 200-300 EUR/mo.
Equivalent SaaS stack: 200300 EUR/mo.
---
## Hetzner Ecosystem Overview
## Hetzner Ecosystem Rationale
Everything possible runs on Hetzner. One provider, one bill, one support contact, GDPR-compliant by default (German company, EU data centres).
Everything possible runs on Hetzner. One provider, one bill, GDPR-compliant by default (German company, EU data centres).
### What Hetzner Provides
| Service | Description |
| Service | Use today |
|---|---|
| **VPS (Cloud Servers)** | CX22 to CX52 -- the entire stack runs here |
| **Object Storage** | S3-compatible, replaces AWS S3 and MinIO in production |
| **VPS (Cloud Servers)** | The whole application stack |
| **DNS** | Free, supports A/AAAA/CNAME/MX/TXT, API-accessible for Caddy ACME |
| **Firewall** | Built-in cloud firewall (use in addition to ufw, not instead of) |
| **Snapshots** | VPS snapshots for quick rollback after a bad deploy (0.013 EUR/GB/mo) |
| **Volumes** | Attachable block storage if the VPS disk fills up (0.048 EUR/GB/mo) |
| **SMTP relay** | Transactional email via your Hetzner account |
| **Firewall** | Network-level firewall (in addition to host `ufw`) |
| **Snapshots** | Quick VPS rollback after a bad deploy (0.013 EUR/GB/mo) |
| **SMTP relay** | Transactional email from `noreply@raddatz.cloud` |
| **Object Storage** | Not used today — MinIO stays on-VPS. Available when we decide to migrate |

View File

@@ -1,15 +1,34 @@
FROM node:20-alpine
# syntax=docker/dockerfile:1.7
# ── Development ──────────────────────────────────────────────────────────────
# Used by docker-compose.yml (target: development). Source is bind-mounted in
# dev so the COPY . below is effectively replaced at runtime; the layer still
# exists so the image is self-contained for cold starts (e.g. devcontainer).
FROM node:20.19.0-alpine3.21 AS development
WORKDIR /app
# Install dependencies as a separate layer so they are cached when only source changes
COPY package.json package-lock.json ./
RUN npm ci
# Source is mounted at runtime via docker-compose volume
# This COPY is only used when building without a volume (e.g. production image)
COPY . .
EXPOSE 5173
CMD ["npm", "run", "dev"]
# ── Build ────────────────────────────────────────────────────────────────────
# Compiles the SvelteKit Node-adapter output to /app/build.
FROM node:20.19.0-alpine3.21 AS build
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
# ── Production ───────────────────────────────────────────────────────────────
# Self-contained Node server. `node build` is the adapter-node entrypoint.
FROM node:20.19.0-alpine3.21 AS production
WORKDIR /app
ENV NODE_ENV=production
COPY --from=build /app/build ./build
COPY --from=build /app/package.json ./package.json
COPY --from=build /app/package-lock.json ./package-lock.json
RUN npm ci --omit=dev
EXPOSE 3000
CMD ["node", "build"]

View File

@@ -31,6 +31,7 @@
"@types/diff": "^7.0.2",
"@types/node": "^24",
"@vitest/browser-playwright": "^4.0.10",
"@vitest/coverage-istanbul": "^4.1.0",
"@vitest/coverage-v8": "^4.1.0",
"eslint": "^9.39.1",
"eslint-config-prettier": "^10.1.8",
@@ -129,6 +130,153 @@
"node": ">=6.9.0"
}
},
"node_modules/@babel/compat-data": {
"version": "7.29.3",
"resolved": "https://registry.npmjs.org/@babel/compat-data/-/compat-data-7.29.3.tgz",
"integrity": "sha512-LIVqM46zQWZhj17qA8wb4nW/ixr2y1Nw+r1etiAWgRM6U1IqP+LNhL1yg440jYZR72jCWcWbLWzIosH+uP1fqg==",
"dev": true,
"license": "MIT",
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/core": {
"version": "7.29.0",
"resolved": "https://registry.npmjs.org/@babel/core/-/core-7.29.0.tgz",
"integrity": "sha512-CGOfOJqWjg2qW/Mb6zNsDm+u5vFQ8DxXfbM09z69p5Z6+mE1ikP2jUXw+j42Pf1XTYED2Rni5f95npYeuwMDQA==",
"dev": true,
"license": "MIT",
"dependencies": {
"@babel/code-frame": "^7.29.0",
"@babel/generator": "^7.29.0",
"@babel/helper-compilation-targets": "^7.28.6",
"@babel/helper-module-transforms": "^7.28.6",
"@babel/helpers": "^7.28.6",
"@babel/parser": "^7.29.0",
"@babel/template": "^7.28.6",
"@babel/traverse": "^7.29.0",
"@babel/types": "^7.29.0",
"@jridgewell/remapping": "^2.3.5",
"convert-source-map": "^2.0.0",
"debug": "^4.1.0",
"gensync": "^1.0.0-beta.2",
"json5": "^2.2.3",
"semver": "^6.3.1"
},
"engines": {
"node": ">=6.9.0"
},
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/babel"
}
},
"node_modules/@babel/core/node_modules/semver": {
"version": "6.3.1",
"resolved": "https://registry.npmjs.org/semver/-/semver-6.3.1.tgz",
"integrity": "sha512-BR7VvDCVHO+q2xBEWskxS6DJE1qRnb7DxzUrogb71CWoSficBxYsiAGd+Kl0mmq/MprG9yArRkyrQxTO6XjMzA==",
"dev": true,
"license": "ISC",
"bin": {
"semver": "bin/semver.js"
}
},
"node_modules/@babel/generator": {
"version": "7.29.1",
"resolved": "https://registry.npmjs.org/@babel/generator/-/generator-7.29.1.tgz",
"integrity": "sha512-qsaF+9Qcm2Qv8SRIMMscAvG4O3lJ0F1GuMo5HR/Bp02LopNgnZBC/EkbevHFeGs4ls/oPz9v+Bsmzbkbe+0dUw==",
"dev": true,
"license": "MIT",
"dependencies": {
"@babel/parser": "^7.29.0",
"@babel/types": "^7.29.0",
"@jridgewell/gen-mapping": "^0.3.12",
"@jridgewell/trace-mapping": "^0.3.28",
"jsesc": "^3.0.2"
},
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/helper-compilation-targets": {
"version": "7.28.6",
"resolved": "https://registry.npmjs.org/@babel/helper-compilation-targets/-/helper-compilation-targets-7.28.6.tgz",
"integrity": "sha512-JYtls3hqi15fcx5GaSNL7SCTJ2MNmjrkHXg4FSpOA/grxK8KwyZ5bubHsCq8FXCkua6xhuaaBit+3b7+VZRfcA==",
"dev": true,
"license": "MIT",
"dependencies": {
"@babel/compat-data": "^7.28.6",
"@babel/helper-validator-option": "^7.27.1",
"browserslist": "^4.24.0",
"lru-cache": "^5.1.1",
"semver": "^6.3.1"
},
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/helper-compilation-targets/node_modules/lru-cache": {
"version": "5.1.1",
"resolved": "https://registry.npmjs.org/lru-cache/-/lru-cache-5.1.1.tgz",
"integrity": "sha512-KpNARQA3Iwv+jTA0utUVVbrh+Jlrr1Fv0e56GGzAFOXN7dk/FviaDW8LHmK52DlcH4WP2n6gI8vN1aesBFgo9w==",
"dev": true,
"license": "ISC",
"dependencies": {
"yallist": "^3.0.2"
}
},
"node_modules/@babel/helper-compilation-targets/node_modules/semver": {
"version": "6.3.1",
"resolved": "https://registry.npmjs.org/semver/-/semver-6.3.1.tgz",
"integrity": "sha512-BR7VvDCVHO+q2xBEWskxS6DJE1qRnb7DxzUrogb71CWoSficBxYsiAGd+Kl0mmq/MprG9yArRkyrQxTO6XjMzA==",
"dev": true,
"license": "ISC",
"bin": {
"semver": "bin/semver.js"
}
},
"node_modules/@babel/helper-globals": {
"version": "7.28.0",
"resolved": "https://registry.npmjs.org/@babel/helper-globals/-/helper-globals-7.28.0.tgz",
"integrity": "sha512-+W6cISkXFa1jXsDEdYA8HeevQT/FULhxzR99pxphltZcVaugps53THCeiWA8SguxxpSp3gKPiuYfSWopkLQ4hw==",
"dev": true,
"license": "MIT",
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/helper-module-imports": {
"version": "7.28.6",
"resolved": "https://registry.npmjs.org/@babel/helper-module-imports/-/helper-module-imports-7.28.6.tgz",
"integrity": "sha512-l5XkZK7r7wa9LucGw9LwZyyCUscb4x37JWTPz7swwFE/0FMQAGpiWUZn8u9DzkSBWEcK25jmvubfpw2dnAMdbw==",
"dev": true,
"license": "MIT",
"dependencies": {
"@babel/traverse": "^7.28.6",
"@babel/types": "^7.28.6"
},
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/helper-module-transforms": {
"version": "7.28.6",
"resolved": "https://registry.npmjs.org/@babel/helper-module-transforms/-/helper-module-transforms-7.28.6.tgz",
"integrity": "sha512-67oXFAYr2cDLDVGLXTEABjdBJZ6drElUSI7WKp70NrpyISso3plG9SAGEF6y7zbha/wOzUByWWTJvEDVNIUGcA==",
"dev": true,
"license": "MIT",
"dependencies": {
"@babel/helper-module-imports": "^7.28.6",
"@babel/helper-validator-identifier": "^7.28.5",
"@babel/traverse": "^7.28.6"
},
"engines": {
"node": ">=6.9.0"
},
"peerDependencies": {
"@babel/core": "^7.0.0"
}
},
"node_modules/@babel/helper-string-parser": {
"version": "7.27.1",
"resolved": "https://registry.npmjs.org/@babel/helper-string-parser/-/helper-string-parser-7.27.1.tgz",
@@ -149,6 +297,30 @@
"node": ">=6.9.0"
}
},
"node_modules/@babel/helper-validator-option": {
"version": "7.27.1",
"resolved": "https://registry.npmjs.org/@babel/helper-validator-option/-/helper-validator-option-7.27.1.tgz",
"integrity": "sha512-YvjJow9FxbhFFKDSuFnVCe2WxXk1zWc22fFePVNEaWJEu8IrZVlda6N0uHwzZrUM1il7NC9Mlp4MaJYbYd9JSg==",
"dev": true,
"license": "MIT",
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/helpers": {
"version": "7.29.2",
"resolved": "https://registry.npmjs.org/@babel/helpers/-/helpers-7.29.2.tgz",
"integrity": "sha512-HoGuUs4sCZNezVEKdVcwqmZN8GoHirLUcLaYVNBK2J0DadGtdcqgr3BCbvH8+XUo4NGjNl3VOtSjEKNzqfFgKw==",
"dev": true,
"license": "MIT",
"dependencies": {
"@babel/template": "^7.28.6",
"@babel/types": "^7.29.0"
},
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/parser": {
"version": "7.29.2",
"resolved": "https://registry.npmjs.org/@babel/parser/-/parser-7.29.2.tgz",
@@ -165,6 +337,40 @@
"node": ">=6.0.0"
}
},
"node_modules/@babel/template": {
"version": "7.28.6",
"resolved": "https://registry.npmjs.org/@babel/template/-/template-7.28.6.tgz",
"integrity": "sha512-YA6Ma2KsCdGb+WC6UpBVFJGXL58MDA6oyONbjyF/+5sBgxY/dwkhLogbMT2GXXyU84/IhRw/2D1Os1B/giz+BQ==",
"dev": true,
"license": "MIT",
"dependencies": {
"@babel/code-frame": "^7.28.6",
"@babel/parser": "^7.28.6",
"@babel/types": "^7.28.6"
},
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/traverse": {
"version": "7.29.0",
"resolved": "https://registry.npmjs.org/@babel/traverse/-/traverse-7.29.0.tgz",
"integrity": "sha512-4HPiQr0X7+waHfyXPZpWPfWL/J7dcN1mx9gL6WdQVMbPnF3+ZhSMs8tCxN7oHddJE9fhNE7+lxdnlyemKfJRuA==",
"dev": true,
"license": "MIT",
"dependencies": {
"@babel/code-frame": "^7.29.0",
"@babel/generator": "^7.29.0",
"@babel/helper-globals": "^7.28.0",
"@babel/parser": "^7.29.0",
"@babel/template": "^7.28.6",
"@babel/types": "^7.29.0",
"debug": "^4.3.1"
},
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/@babel/types": {
"version": "7.29.0",
"resolved": "https://registry.npmjs.org/@babel/types/-/types-7.29.0.tgz",
@@ -1128,6 +1334,16 @@
"node": ">=18.0.0"
}
},
"node_modules/@istanbuljs/schema": {
"version": "0.1.6",
"resolved": "https://registry.npmjs.org/@istanbuljs/schema/-/schema-0.1.6.tgz",
"integrity": "sha512-+Sg6GCR/wy1oSmQDFq4LQDAhm3ETKnorxN+y5nbLULOR3P0c14f2Wurzj3/xqPXtasLFfHd5iRFQ7AJt4KH2cw==",
"dev": true,
"license": "MIT",
"engines": {
"node": ">=8"
}
},
"node_modules/@jridgewell/gen-mapping": {
"version": "0.3.13",
"resolved": "https://registry.npmjs.org/@jridgewell/gen-mapping/-/gen-mapping-0.3.13.tgz",
@@ -3544,6 +3760,31 @@
}
}
},
"node_modules/@vitest/coverage-istanbul": {
"version": "4.1.0",
"resolved": "https://registry.npmjs.org/@vitest/coverage-istanbul/-/coverage-istanbul-4.1.0.tgz",
"integrity": "sha512-0+67gA94YToxd+Pc3XgIA/2c8HN2hXNSg3T+1FI4HW7W/2gPitYCtktsY6Ke7vrt5caboMq3TUf0/vwbHRb0og==",
"dev": true,
"license": "MIT",
"dependencies": {
"@babel/core": "^7.29.0",
"@istanbuljs/schema": "^0.1.3",
"@jridgewell/gen-mapping": "^0.3.13",
"@jridgewell/trace-mapping": "0.3.31",
"istanbul-lib-coverage": "^3.2.2",
"istanbul-lib-report": "^3.0.1",
"istanbul-reports": "^3.2.0",
"magicast": "^0.5.2",
"obug": "^2.1.1",
"tinyrainbow": "^3.0.3"
},
"funding": {
"url": "https://opencollective.com/vitest"
},
"peerDependencies": {
"vitest": "4.1.0"
}
},
"node_modules/@vitest/coverage-v8": {
"version": "4.1.0",
"resolved": "https://registry.npmjs.org/@vitest/coverage-v8/-/coverage-v8-4.1.0.tgz",
@@ -3864,6 +4105,19 @@
"dev": true,
"license": "MIT"
},
"node_modules/baseline-browser-mapping": {
"version": "2.10.29",
"resolved": "https://registry.npmjs.org/baseline-browser-mapping/-/baseline-browser-mapping-2.10.29.tgz",
"integrity": "sha512-Asa2krT+XTPZINCS+2QcyS8WTkObE77RwkydwF7h6DmnKqbvlalz93m/dnphUyCa6SWSP51VgtEUf2FN+gelFQ==",
"dev": true,
"license": "Apache-2.0",
"bin": {
"baseline-browser-mapping": "dist/cli.cjs"
},
"engines": {
"node": ">=6.0.0"
}
},
"node_modules/bidi-js": {
"version": "1.0.3",
"resolved": "https://registry.npmjs.org/bidi-js/-/bidi-js-1.0.3.tgz",
@@ -3897,6 +4151,40 @@
"node": ">=8"
}
},
"node_modules/browserslist": {
"version": "4.28.2",
"resolved": "https://registry.npmjs.org/browserslist/-/browserslist-4.28.2.tgz",
"integrity": "sha512-48xSriZYYg+8qXna9kwqjIVzuQxi+KYWp2+5nCYnYKPTr0LvD89Jqk2Or5ogxz0NUMfIjhh2lIUX/LyX9B4oIg==",
"dev": true,
"funding": [
{
"type": "opencollective",
"url": "https://opencollective.com/browserslist"
},
{
"type": "tidelift",
"url": "https://tidelift.com/funding/github/npm/browserslist"
},
{
"type": "github",
"url": "https://github.com/sponsors/ai"
}
],
"license": "MIT",
"dependencies": {
"baseline-browser-mapping": "^2.10.12",
"caniuse-lite": "^1.0.30001782",
"electron-to-chromium": "^1.5.328",
"node-releases": "^2.0.36",
"update-browserslist-db": "^1.2.3"
},
"bin": {
"browserslist": "cli.js"
},
"engines": {
"node": "^6 || ^7 || ^8 || ^9 || ^10 || ^11 || ^12 || >=13.7"
}
},
"node_modules/callsites": {
"version": "3.1.0",
"resolved": "https://registry.npmjs.org/callsites/-/callsites-3.1.0.tgz",
@@ -3907,6 +4195,27 @@
"node": ">=6"
}
},
"node_modules/caniuse-lite": {
"version": "1.0.30001792",
"resolved": "https://registry.npmjs.org/caniuse-lite/-/caniuse-lite-1.0.30001792.tgz",
"integrity": "sha512-hVLMUZFgR4JJ6ACt1uEESvQN1/dBVqPAKY0hgrV70eN3391K6juAfTjKZLKvOMsx8PxA7gsY1/tLMMTcfFLLpw==",
"dev": true,
"funding": [
{
"type": "opencollective",
"url": "https://opencollective.com/browserslist"
},
{
"type": "tidelift",
"url": "https://tidelift.com/funding/github/npm/caniuse-lite"
},
{
"type": "github",
"url": "https://github.com/sponsors/ai"
}
],
"license": "CC-BY-4.0"
},
"node_modules/chai": {
"version": "6.2.2",
"resolved": "https://registry.npmjs.org/chai/-/chai-6.2.2.tgz",
@@ -4204,6 +4513,13 @@
"@types/trusted-types": "^2.0.7"
}
},
"node_modules/electron-to-chromium": {
"version": "1.5.353",
"resolved": "https://registry.npmjs.org/electron-to-chromium/-/electron-to-chromium-1.5.353.tgz",
"integrity": "sha512-kOrWphBi8TOZyiJZqsgqIle0lw+tzmnQK83pV9dZUd01Nm2POECSyFQMAuarzZdYqQW7FH9RaYOuaRo3h+bQ3w==",
"dev": true,
"license": "ISC"
},
"node_modules/enhanced-resolve": {
"version": "5.20.0",
"resolved": "https://registry.npmjs.org/enhanced-resolve/-/enhanced-resolve-5.20.0.tgz",
@@ -4279,6 +4595,16 @@
"@esbuild/win32-x64": "0.27.4"
}
},
"node_modules/escalade": {
"version": "3.2.0",
"resolved": "https://registry.npmjs.org/escalade/-/escalade-3.2.0.tgz",
"integrity": "sha512-WUj2qlxaQtO4g6Pq5c29GTcWGDyd8itL8zTlipgECz3JesAiiOKotd8JU6otB3PACgG6xkJUyVhboMS+bje/jA==",
"dev": true,
"license": "MIT",
"engines": {
"node": ">=6"
}
},
"node_modules/escape-string-regexp": {
"version": "4.0.0",
"resolved": "https://registry.npmjs.org/escape-string-regexp/-/escape-string-regexp-4.0.0.tgz",
@@ -4804,6 +5130,16 @@
"url": "https://github.com/sponsors/ljharb"
}
},
"node_modules/gensync": {
"version": "1.0.0-beta.2",
"resolved": "https://registry.npmjs.org/gensync/-/gensync-1.0.0-beta.2.tgz",
"integrity": "sha512-3hN7NaskYvMDLQY55gnW3NQ+mesEAepTqlg+VEbj7zzqEMBVNhzcGYYeqFo/TlYz6eQiFcp1HcsCZO+nGgS8zg==",
"dev": true,
"license": "MIT",
"engines": {
"node": ">=6.9.0"
}
},
"node_modules/get-tsconfig": {
"version": "4.14.0",
"resolved": "https://registry.npmjs.org/get-tsconfig/-/get-tsconfig-4.14.0.tgz",
@@ -5216,6 +5552,19 @@
}
}
},
"node_modules/jsesc": {
"version": "3.1.0",
"resolved": "https://registry.npmjs.org/jsesc/-/jsesc-3.1.0.tgz",
"integrity": "sha512-/sM3dO2FOzXjKQhJuo0Q173wf2KOo8t4I8vHy6lF9poUp7bKT0/NHE8fPX23PwfhnykfqnC2xRxOnVw5XuGIaA==",
"dev": true,
"license": "MIT",
"bin": {
"jsesc": "bin/jsesc"
},
"engines": {
"node": ">=6"
}
},
"node_modules/json-buffer": {
"version": "3.0.1",
"resolved": "https://registry.npmjs.org/json-buffer/-/json-buffer-3.0.1.tgz",
@@ -5804,6 +6153,13 @@
"license": "MIT",
"optional": true
},
"node_modules/node-releases": {
"version": "2.0.38",
"resolved": "https://registry.npmjs.org/node-releases/-/node-releases-2.0.38.tgz",
"integrity": "sha512-3qT/88Y3FbH/Kx4szpQQ4HzUbVrHPKTLVpVocKiLfoYvw9XSGOX2FmD2d6DrXbVYyAQTF2HeF6My8jmzx7/CRw==",
"dev": true,
"license": "MIT"
},
"node_modules/obug": {
"version": "2.1.1",
"resolved": "https://registry.npmjs.org/obug/-/obug-2.1.1.tgz",
@@ -7181,6 +7537,37 @@
"@unrs/resolver-binding-win32-x64-msvc": "1.11.1"
}
},
"node_modules/update-browserslist-db": {
"version": "1.2.3",
"resolved": "https://registry.npmjs.org/update-browserslist-db/-/update-browserslist-db-1.2.3.tgz",
"integrity": "sha512-Js0m9cx+qOgDxo0eMiFGEueWztz+d4+M3rGlmKPT+T4IS/jP4ylw3Nwpu6cpTTP8R1MAC1kF4VbdLt3ARf209w==",
"dev": true,
"funding": [
{
"type": "opencollective",
"url": "https://opencollective.com/browserslist"
},
{
"type": "tidelift",
"url": "https://tidelift.com/funding/github/npm/browserslist"
},
{
"type": "github",
"url": "https://github.com/sponsors/ai"
}
],
"license": "MIT",
"dependencies": {
"escalade": "^3.2.0",
"picocolors": "^1.1.1"
},
"bin": {
"update-browserslist-db": "cli.js"
},
"peerDependencies": {
"browserslist": ">= 4.21.0"
}
},
"node_modules/uri-js": {
"version": "4.4.1",
"resolved": "https://registry.npmjs.org/uri-js/-/uri-js-4.4.1.tgz",
@@ -7607,6 +7994,13 @@
"integrity": "sha512-JZnDKK8B0RCDw84FNdDAIpZK+JuJw+s7Lz8nksI7SIuU3UXJJslUthsi+uWBUYOwPFwW7W7PRLRfUKpxjtjFCw==",
"license": "MIT"
},
"node_modules/yallist": {
"version": "3.1.1",
"resolved": "https://registry.npmjs.org/yallist/-/yallist-3.1.1.tgz",
"integrity": "sha512-a4UGQaWPH59mOXUYnAG2ewncQS4i4F43Tv3JoAM+s2VDAmS9NsK8GpDMLrCHPksFT7h3K6TOoUNn2pb7RoXx4g==",
"dev": true,
"license": "ISC"
},
"node_modules/yaml-ast-parser": {
"version": "0.0.43",
"resolved": "https://registry.npmjs.org/yaml-ast-parser/-/yaml-ast-parser-0.0.43.tgz",

View File

@@ -15,7 +15,7 @@
"lint:boundary-demo": "eslint src/lib/tag/__fixtures__/",
"test:unit": "vitest",
"test": "npm run test:unit -- --run",
"test:coverage": "vitest run --coverage --project=server",
"test:coverage": "vitest run --coverage --project=server && vitest run -c vitest.client-coverage.config.ts --coverage",
"test:e2e": "playwright test",
"test:e2e:headed": "playwright test --headed",
"test:e2e:ui": "playwright test --ui",
@@ -45,6 +45,7 @@
"@types/diff": "^7.0.2",
"@types/node": "^24",
"@vitest/browser-playwright": "^4.0.10",
"@vitest/coverage-istanbul": "^4.1.0",
"@vitest/coverage-v8": "^4.1.0",
"eslint": "^9.39.1",
"eslint-config-prettier": "^10.1.8",

View File

@@ -5,7 +5,14 @@ import { env } from 'process';
import { cookieName, cookieMaxAge } from '$lib/paraglide/runtime';
import { detectLocale } from '$lib/shared/server/locale';
const PUBLIC_PATHS = ['/login', '/logout', '/forgot-password', '/reset-password', '/register'];
const PUBLIC_PATHS = [
'/login',
'/logout',
'/forgot-password',
'/reset-password',
'/register',
'/hilfe/transkription' // prerendered help page — must be reachable without an auth cookie
];
const handleLocaleDetection: Handle = ({ event, resolve }) => {
if (!event.cookies.get(cookieName)) {

View File

@@ -35,7 +35,7 @@ let {
onclick={onPrev}
disabled={currentPage <= 1}
aria-label="Zurück"
class="rounded p-1 text-ink-3 transition hover:bg-surface/10 disabled:opacity-40"
class="min-h-[44px] min-w-[44px] rounded p-2 text-ink-3 transition hover:bg-surface/10 focus-visible:ring-2 focus-visible:ring-brand-navy focus-visible:ring-offset-1 disabled:opacity-40"
>
<svg class="h-4 w-4" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<path stroke-linecap="round" stroke-linejoin="round" d="M15 19l-7-7 7-7" />
@@ -52,7 +52,7 @@ let {
onclick={onNext}
disabled={!isLoaded || currentPage >= totalPages}
aria-label="Weiter"
class="rounded p-1 text-ink-3 transition hover:bg-surface/10 disabled:opacity-40"
class="min-h-[44px] min-w-[44px] rounded p-2 text-ink-3 transition hover:bg-surface/10 focus-visible:ring-2 focus-visible:ring-brand-navy focus-visible:ring-offset-1 disabled:opacity-40"
>
<svg class="h-4 w-4" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<path stroke-linecap="round" stroke-linejoin="round" d="M9 5l7 7-7 7" />
@@ -65,7 +65,7 @@ let {
<button
onclick={onZoomOut}
aria-label="Verkleinern"
class="rounded p-1 text-ink-3 transition hover:bg-surface/10"
class="min-h-[44px] min-w-[44px] rounded p-2 text-ink-3 transition hover:bg-surface/10 focus-visible:ring-2 focus-visible:ring-brand-navy focus-visible:ring-offset-1"
>
<svg class="h-4 w-4" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<circle cx="11" cy="11" r="8" />
@@ -75,7 +75,7 @@ let {
<button
onclick={onZoomIn}
aria-label="Vergrößern"
class="rounded p-1 text-ink-3 transition hover:bg-surface/10"
class="min-h-[44px] min-w-[44px] rounded p-2 text-ink-3 transition hover:bg-surface/10 focus-visible:ring-2 focus-visible:ring-brand-navy focus-visible:ring-offset-1"
>
<svg class="h-4 w-4" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2">
<circle cx="11" cy="11" r="8" />
@@ -89,7 +89,8 @@ let {
<button
onclick={onToggleAnnotations}
aria-label={showAnnotations ? m.pdf_annotations_hide() : m.pdf_annotations_show()}
class="flex items-center gap-1.5 rounded px-2 py-1 font-sans text-xs transition {showAnnotations
aria-pressed={showAnnotations}
class="flex min-h-[44px] min-w-[44px] items-center gap-1.5 rounded px-3 py-2 font-sans text-xs transition focus-visible:ring-2 focus-visible:ring-brand-navy focus-visible:ring-offset-1 {showAnnotations
? 'text-ink-2 hover:bg-surface/10'
: 'bg-surface/10 text-primary'}"
>

View File

@@ -65,3 +65,111 @@ describe('PdfControls — annotation toggle contrast (WCAG 2.1 AA)', () => {
expect(annotationBtn!.className).not.toContain('text-accent');
});
});
describe('PdfControls — focus rings (WCAG 2.1 §2.4.7)', () => {
it('annotation toggle button has focus-visible:ring-2 focus ring', async () => {
const { container } = render(PdfControls, {
...defaultProps,
annotationCount: 2,
showAnnotations: false
});
const allButtons = container.querySelectorAll('button');
const annotationBtn = Array.from(allButtons).find((b) =>
b.getAttribute('aria-label')?.toLowerCase().includes('annotierungen')
);
expect(annotationBtn).not.toBeNull();
expect(annotationBtn!.className).toContain('focus-visible:ring-2');
});
it('icon-only nav/zoom buttons each have focus-visible:ring-2 focus ring', async () => {
const { container } = render(PdfControls, { ...defaultProps });
const allButtons = container.querySelectorAll('button');
const iconOnlyButtons = Array.from(allButtons).filter((b) => {
const label = b.getAttribute('aria-label') ?? '';
return ['zurück', 'weiter', 'verkleinern', 'vergrößern'].includes(label.toLowerCase());
});
expect(iconOnlyButtons).toHaveLength(4);
for (const btn of iconOnlyButtons) {
expect(btn.className).toContain('focus-visible:ring-2');
}
});
});
describe('PdfControls — touch targets (WCAG 2.2 §2.5.8)', () => {
it('annotation toggle button has min-h-[44px] touch target', async () => {
const { container } = render(PdfControls, {
...defaultProps,
annotationCount: 2,
showAnnotations: false
});
const allButtons = container.querySelectorAll('button');
const annotationBtn = Array.from(allButtons).find((b) =>
b.getAttribute('aria-label')?.toLowerCase().includes('annotierungen')
);
expect(annotationBtn).not.toBeNull();
expect(annotationBtn!.className).toContain('min-h-[44px]');
});
it('annotation toggle button has min-w-[44px] touch target', async () => {
const { container } = render(PdfControls, {
...defaultProps,
annotationCount: 2,
showAnnotations: false
});
const allButtons = container.querySelectorAll('button');
const annotationBtn = Array.from(allButtons).find((b) =>
b.getAttribute('aria-label')?.toLowerCase().includes('annotierungen')
);
expect(annotationBtn).not.toBeNull();
expect(annotationBtn!.className).toContain('min-w-[44px]');
});
it('annotation toggle reflects pressed state via aria-pressed', async () => {
const { container: c1 } = render(PdfControls, {
...defaultProps,
annotationCount: 2,
showAnnotations: false
});
const btn1 = Array.from(c1.querySelectorAll('button')).find((b) =>
b.getAttribute('aria-label')?.toLowerCase().includes('annotierungen')
);
expect(btn1!.getAttribute('aria-pressed')).toBe('false');
cleanup();
const { container: c2 } = render(PdfControls, {
...defaultProps,
annotationCount: 2,
showAnnotations: true
});
const btn2 = Array.from(c2.querySelectorAll('button')).find((b) =>
b.getAttribute('aria-label')?.toLowerCase().includes('annotierungen')
);
expect(btn2!.getAttribute('aria-pressed')).toBe('true');
});
it('icon-only nav/zoom buttons each have min-h-[44px] touch target', async () => {
const { container } = render(PdfControls, { ...defaultProps });
const allButtons = container.querySelectorAll('button');
const iconOnlyButtons = Array.from(allButtons).filter((b) => {
const label = b.getAttribute('aria-label') ?? '';
return ['zurück', 'weiter', 'verkleinern', 'vergrößern'].includes(label.toLowerCase());
});
expect(iconOnlyButtons).toHaveLength(4);
for (const btn of iconOnlyButtons) {
expect(btn.className).toContain('min-h-[44px]');
}
});
it('icon-only nav/zoom buttons each have min-w-[44px] touch target', async () => {
const { container } = render(PdfControls, { ...defaultProps });
const allButtons = container.querySelectorAll('button');
const iconOnlyButtons = Array.from(allButtons).filter((b) => {
const label = b.getAttribute('aria-label') ?? '';
return ['zurück', 'weiter', 'verkleinern', 'vergrößern'].includes(label.toLowerCase());
});
expect(iconOnlyButtons).toHaveLength(4);
for (const btn of iconOnlyButtons) {
expect(btn.className).toContain('min-w-[44px]');
}
});
});

View File

@@ -21,6 +21,7 @@ interface Props {
restrictToCorrespondentsOf?: string;
excludePersonId?: string;
badge?: 'additive' | 'replace';
resetKey?: number;
onchange?: (value: string) => void;
onfocused?: () => void;
}
@@ -39,17 +40,20 @@ let {
restrictToCorrespondentsOf,
excludePersonId,
badge,
resetKey = 0,
onchange,
onfocused
}: Props = $props();
// searchTerm must be both prop-derived AND locally writable (user typing), so $state +
// $effect is the correct pattern here — writable $derived is read-only and won't work.
// eslint-disable-next-line svelte/prefer-writable-derived
let searchTerm = $state(initialName);
// Sync display text when the selected person changes externally (e.g. swap, navigation).
// Sync display text when initialName changes OR when resetKey increments (navigation reset).
// resetKey is incremented by the page on every SvelteKit navigation so that a manually-typed
// term that was never committed (no person selected) gets cleared even if initialName stays ''.
$effect(() => {
void resetKey;
searchTerm = initialName;
});

View File

@@ -270,6 +270,33 @@ describe('PersonTypeahead correspondent mode', () => {
});
});
// ─── resetKey ─────────────────────────────────────────────────────────────────
describe('PersonTypeahead resetKey', () => {
// Note: rerender() in vitest-browser-svelte causes a full re-mount, not an in-place prop
// update. This is a smoke test — the $effect(resetKey) path that fires during SvelteKit
// navigation (prop update on a live instance) cannot be isolated at this level.
it('clears a manually-typed term when resetKey changes even if initialName stays empty', async () => {
mockFetchWithPersons([]);
const { rerender } = render(PersonTypeahead, {
name: 'senderId',
label: 'Absender',
initialName: '',
resetKey: 0
});
const input = page.getByPlaceholder('Namen tippen...');
// User types something without selecting a person
await input.fill('Max');
await waitForDebounce();
await expect.element(input).toHaveValue('Max');
// Navigation resets: initialName stays '', but resetKey increments
await rerender({ name: 'senderId', label: 'Absender', initialName: '', resetKey: 1 });
await expect.element(input).toHaveValue('');
});
});
// ─── Click outside ────────────────────────────────────────────────────────────
describe('PersonTypeahead click outside', () => {

View File

@@ -1,4 +1,5 @@
<script lang="ts">
import { untrack } from 'svelte';
import { isoToGerman, handleGermanDateInput, germanToIso } from '$lib/shared/utils/date';
import { m } from '$lib/paraglide/messages.js';
@@ -24,6 +25,16 @@ let {
let display = $state(isoToGerman(value ?? ''));
// Re-derive display when value changes externally (e.g. timeline drag, reset nav).
// Guard prevents overwriting while the user is mid-typing a partial date:
// germanToIso returns '' for partial input, matching value '' → no re-derive.
$effect(() => {
const externalIso = value ?? '';
if (germanToIso(untrack(() => display)) !== externalIso) {
display = isoToGerman(externalIso);
}
});
// ─── Validation helper ────────────────────────────────────────────────────
function isCalendarValid(iso: string): boolean {
if (!iso) return false;

View File

@@ -183,6 +183,26 @@ describe('DateInput clearing the date', () => {
});
});
// ─── External value changes ───────────────────────────────────────────────────
describe('DateInput external value changes', () => {
it('clears display when value prop is reset to empty externally', async () => {
const { rerender } = render(DateInput, { value: '1920-01-01' });
const input = page.getByRole('textbox');
await expect.element(input).toHaveValue('01.01.1920');
await rerender({ value: '' });
await expect.element(input).toHaveValue('');
});
it('updates display when value prop changes to a new date externally', async () => {
const { rerender } = render(DateInput, { value: '1920-01-01' });
const input = page.getByRole('textbox');
await expect.element(input).toHaveValue('01.01.1920');
await rerender({ value: '1945-05-08' });
await expect.element(input).toHaveValue('08.05.1945');
});
});
// ─── Hidden input ─────────────────────────────────────────────────────────────
describe('DateInput hidden input for form submission', () => {

View File

@@ -20,6 +20,7 @@ let {
showAdvanced = $bindable(false),
initialSenderName = '',
initialReceiverName = '',
navKey = 0,
isLoading = false,
onSearch,
onSearchImmediate,
@@ -39,6 +40,7 @@ let {
showAdvanced?: boolean;
initialSenderName?: string;
initialReceiverName?: string;
navKey?: number;
isLoading?: boolean;
onSearch: () => void;
onSearchImmediate?: () => void;
@@ -197,6 +199,7 @@ $effect(() => {
label={m.docs_filter_label_sender()}
bind:value={senderId}
initialName={initialSenderName}
resetKey={navKey}
onchange={onSearch}
/>
</div>
@@ -212,6 +215,7 @@ $effect(() => {
label={m.docs_filter_label_receivers()}
bind:value={receiverId}
initialName={initialReceiverName}
resetKey={navKey}
onchange={onSearch}
/>
</div>

View File

@@ -3,6 +3,23 @@ import { createApiClient } from '$lib/shared/api.server';
import { getErrorMessage } from '$lib/shared/errors';
import type { components } from '$lib/generated/api';
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
async function resolvePersonName(
id: string,
api: ReturnType<typeof createApiClient>
): Promise<string> {
if (!UUID_RE.test(id)) return '';
try {
const result = await api.GET('/api/persons/{id}', { params: { path: { id } } });
if (!result.response.ok) return '';
return result.data?.displayName ?? '';
} catch (e) {
console.error('[resolvePersonName] failed for id', id, e);
return '';
}
}
type DocumentSearchItem = components['schemas']['DocumentSearchItem'];
const VALID_SORTS = ['DATE', 'TITLE', 'SENDER', 'RECEIVER', 'UPLOAD_DATE', 'RELEVANCE'] as const;
@@ -34,25 +51,30 @@ export async function load({ url, fetch }) {
const api = createApiClient(fetch);
let result;
let initialSenderName = '';
let initialReceiverName = '';
try {
result = await api.GET('/api/documents/search', {
params: {
query: {
q: q || undefined,
from: from || undefined,
to: to || undefined,
senderId: senderId || undefined,
receiverId: receiverId || undefined,
tag: tags.length ? tags : undefined,
tagQ: tagQ && !tags.length ? tagQ : undefined,
tagOp: tagOp === 'OR' ? 'OR' : undefined,
sort,
dir: dir || undefined,
page,
size: PAGE_SIZE
[result, [initialSenderName, initialReceiverName]] = await Promise.all([
api.GET('/api/documents/search', {
params: {
query: {
q: q || undefined,
from: from || undefined,
to: to || undefined,
senderId: senderId || undefined,
receiverId: receiverId || undefined,
tag: tags.length ? tags : undefined,
tagQ: tagQ && !tags.length ? tagQ : undefined,
tagOp: tagOp === 'OR' ? 'OR' : undefined,
sort,
dir: dir || undefined,
page,
size: PAGE_SIZE
}
}
}
});
}),
Promise.all([resolvePersonName(senderId, api), resolvePersonName(receiverId, api)])
]);
} catch {
return {
items: [] as DocumentSearchItem[],
@@ -65,6 +87,8 @@ export async function load({ url, fetch }) {
to,
senderId,
receiverId,
initialSenderName: '',
initialReceiverName: '',
tags,
sort,
dir,
@@ -94,6 +118,8 @@ export async function load({ url, fetch }) {
to,
senderId,
receiverId,
initialSenderName,
initialReceiverName,
tags,
sort,
dir,

View File

@@ -22,6 +22,9 @@ let from = $state(untrack(() => data.from || ''));
let to = $state(untrack(() => data.to || ''));
let senderId = $state(untrack(() => data.senderId || ''));
let receiverId = $state(untrack(() => data.receiverId || ''));
let initialSenderName = $state(untrack(() => data.initialSenderName ?? ''));
let initialReceiverName = $state(untrack(() => data.initialReceiverName ?? ''));
let navKey = $state(0);
let tagNames = $state<{ name: string; id?: string; color?: string; parentId?: string }[]>(
untrack(() => (data.tags || []).map((name: string) => ({ name })))
);
@@ -207,12 +210,17 @@ async function editAllMatching() {
// Keep local filter state in sync with server data after navigation completes.
// Guard q: skip overwrite while the user is actively typing.
// navKey increments on every navigation so PersonTypeahead clears manually-typed
// terms even when initialSenderName/initialReceiverName stays '' across navigations.
$effect(() => {
if (!qFocused) q = data.q || '';
from = data.from || '';
to = data.to || '';
senderId = data.senderId || '';
receiverId = data.receiverId || '';
initialSenderName = data.initialSenderName ?? '';
initialReceiverName = data.initialReceiverName ?? '';
untrack(() => navKey++);
tagNames = (data.tags || []).map((name: string) => ({ name }));
sort = data.sort || 'DATE';
dir = data.dir || 'desc';
@@ -247,6 +255,9 @@ $effect(() => {
bind:dir={dir}
bind:tagQ={tagQ}
bind:tagOperator={tagOperator}
initialSenderName={initialSenderName}
initialReceiverName={initialReceiverName}
navKey={navKey}
isLoading={navigating.to !== null}
onSearch={handleTextSearch}
onSearchImmediate={handleImmediateSearch}

View File

@@ -167,3 +167,76 @@ describe('documents page load — network error fallback', () => {
expect(result.items).toEqual([]);
});
});
// ─── person name resolution ───────────────────────────────────────────────────
describe('documents page load — person name resolution', () => {
function makeSearchMock(personResult?: { ok: boolean; displayName?: string }) {
const mockGet = vi.fn().mockImplementation((path: string) => {
if (path === '/api/documents/search') {
return Promise.resolve({
response: { ok: true, status: 200 },
data: { items: [], totalElements: 0, pageNumber: 0, pageSize: 50, totalPages: 0 }
});
}
// person lookup via api.GET('/api/persons/{id}', ...)
if (!personResult?.ok) {
return Promise.resolve({ response: { ok: false, status: 404 }, data: undefined });
}
return Promise.resolve({
response: { ok: true, status: 200 },
data: { displayName: personResult.displayName ?? '' }
});
});
vi.mocked(createApiClient).mockReturnValue({ GET: mockGet } as ReturnType<
typeof createApiClient
>);
return mockGet;
}
it('returns initialSenderName from person lookup when senderId is a valid UUID', async () => {
makeSearchMock({ ok: true, displayName: 'Max Mustermann' });
const result = await load({
url: makeUrl({ senderId: '11111111-1111-1111-1111-111111111111' }),
fetch: vi.fn() as unknown as typeof fetch
});
expect(result.initialSenderName).toBe('Max Mustermann');
});
it('returns initialReceiverName from person lookup when receiverId is a valid UUID', async () => {
makeSearchMock({ ok: true, displayName: 'Anna Musterfrau' });
const result = await load({
url: makeUrl({ receiverId: '22222222-2222-2222-2222-222222222222' }),
fetch: vi.fn() as unknown as typeof fetch
});
expect(result.initialReceiverName).toBe('Anna Musterfrau');
});
it('returns empty string when senderId is not a valid UUID', async () => {
const mockGet = makeSearchMock();
const result = await load({
url: makeUrl({ senderId: 'not-a-uuid' }),
fetch: vi.fn() as unknown as typeof fetch
});
expect(result.initialSenderName).toBe('');
// UUID guard fires before any api.GET call — only document search is called
expect(mockGet).toHaveBeenCalledTimes(1);
});
it('returns empty string when person api returns 404', async () => {
makeSearchMock({ ok: false });
const result = await load({
url: makeUrl({ senderId: '11111111-1111-1111-1111-111111111111' }),
fetch: vi.fn() as unknown as typeof fetch
});
expect(result.initialSenderName).toBe('');
});
});

View File

@@ -23,6 +23,8 @@ function makeData(overrides: Record<string, unknown> = {}) {
to: '',
senderId: '',
receiverId: '',
initialSenderName: '',
initialReceiverName: '',
tags: [],
sort: 'DATE',
dir: 'desc',
@@ -136,6 +138,22 @@ describe('documents page — URL building', () => {
});
});
// ─── Sender / receiver name display ──────────────────────────────────────────
describe('documents page — sender/receiver display', () => {
it('pre-fills sender typeahead from initialSenderName when senderId filter is active', async () => {
render(Page, {
data: makeData({
senderId: '11111111-1111-1111-1111-111111111111',
initialSenderName: 'Max Mustermann'
})
});
// Advanced filters are auto-shown when senderId is set
const inputs = page.getByPlaceholder('Namen tippen...');
await expect.element(inputs.first()).toHaveValue('Max Mustermann');
});
});
// ─── Timeline density widget wiring (#385) ────────────────────────────────────
describe('documents page — timeline density widget', () => {

View File

@@ -1,3 +1 @@
// Safe: handleAuth in hooks.server.ts redirects unauthenticated requests
// before prerendered HTML is visible.
export const prerender = true;

View File

@@ -8,7 +8,7 @@ export const load: PageServerLoad = ({ url }) => {
};
export const actions = {
login: async ({ request, cookies, fetch }) => {
login: async ({ request, cookies, fetch, url }) => {
const data = await request.formData();
const email = data.get('email') as string;
const password = data.get('password') as string;
@@ -37,11 +37,17 @@ export const actions = {
return fail(500, { error: getErrorMessage('INTERNAL_ERROR') });
}
// The cookie IS the API credential — promoted to `Authorization: Basic …`
// on every browser → backend request by AuthTokenCookieFilter on the
// Spring side (see #520). It must be Secure on HTTPS or it leaks
// a 24h Basic token on plaintext networks. Dev runs over HTTP and
// would silently lose the cookie if we hardcoded secure=true.
const isHttps = url.protocol === 'https:';
cookies.set('auth_token', authHeader, {
path: '/',
httpOnly: true,
sameSite: 'strict',
secure: false, // set to true when HTTPS is available
secure: isHttps,
maxAge: 60 * 60 * 24
});
} catch (e) {

View File

@@ -6,7 +6,20 @@ const config = {
// Consult https://svelte.dev/docs/kit/integrations
// for more information about preprocessors
preprocess: vitePreprocess(),
kit: { adapter: adapter() }
kit: {
adapter: adapter(),
prerender: {
entries: ['/hilfe/transkription'],
// Disable crawl: by default SvelteKit follows nav links from
// prerendered pages and prerenders the targets too. The targets
// (/, /documents, /persons, …) throw redirect('/login') during
// the build (no auth cookie), so SvelteKit bakes a
// `<script>location.href='/login'</script>` HTML page and serves
// it before the runtime hooks ever run. Result: authenticated
// users with a valid cookie still get bounced. See #514.
crawl: false
}
}
};
export default config;

View File

@@ -46,6 +46,7 @@ export default defineConfig({
coverage: {
provider: 'v8',
reporter: ['text', 'lcov'],
reportsDirectory: 'coverage/server',
// Measure utility and server-side logic only. Svelte components run
// in the browser project and are excluded here. Browser-only TS files
// (actions, hooks, domain-specific UI state) are also excluded.
@@ -58,7 +59,10 @@ export default defineConfig({
],
exclude: ['**/*.svelte', '**/*.svelte.ts', '**/__mocks__/**'],
thresholds: {
branches: 80
lines: 80,
functions: 80,
branches: 80,
statements: 80
}
},
projects: [

View File

@@ -0,0 +1,51 @@
import { paraglideVitePlugin } from '@inlang/paraglide-js';
import devtoolsJson from 'vite-plugin-devtools-json';
import tailwindcss from '@tailwindcss/vite';
import { defineConfig } from 'vitest/config';
import { playwright } from '@vitest/browser-playwright';
import { sveltekit } from '@sveltejs/kit/vite';
// Standalone config for browser-project Istanbul coverage.
// Uses a dedicated root-level coverage block because Vitest 4 ignores
// per-project coverage overrides inside test.projects.
// Plugins mirrored from vite.config.ts: tailwindcss, sveltekit, devtoolsJson, paraglideVitePlugin
// Update here whenever vite.config.ts plugins change.
export default defineConfig({
optimizeDeps: {
include: ['pdfjs-dist', '@tiptap/core', '@tiptap/starter-kit', '@tiptap/extension-mention']
},
plugins: [
tailwindcss(),
sveltekit(),
devtoolsJson(),
paraglideVitePlugin({
project: './project.inlang',
outdir: './src/lib/paraglide'
})
],
test: {
expect: { requireAssertions: true },
browser: {
enabled: true,
provider: playwright(),
instances: [{ browser: 'chromium', headless: true }],
screenshotDirectory: 'test-results/screenshots',
screenshotFailures: true
},
include: ['src/**/*.svelte.{test,spec}.{js,ts}'],
exclude: ['src/lib/server/**'],
coverage: {
provider: 'istanbul',
reporter: ['text', 'lcov'],
reportsDirectory: 'coverage/client',
include: ['src/**/*.svelte', 'src/**/*.svelte.ts'],
exclude: ['src/lib/paraglide/**', 'src/lib/generated/**', 'src/hooks/**', '**/__mocks__/**'],
thresholds: {
lines: 80,
functions: 80,
branches: 80,
statements: 80
}
}
}
});

90
infra/caddy/Caddyfile Normal file
View File

@@ -0,0 +1,90 @@
# Caddyfile for the Familienarchiv host.
#
# Caddy runs on the host (not in a container) and reverse-proxies into
# the docker compose stacks bound to 127.0.0.1.
#
# Naming convention for ports (also documented in docker-compose.prod.yml):
# production: backend 8080, frontend 3000
# staging: backend 8081, frontend 3001
# gitea: 3005
#
# Security headers and the /actuator block apply to both archive vhosts.
# X-Frame-Options is deliberately NOT set here: Spring Security configures
# frame-options SAMEORIGIN (for the in-app PDF preview iframe). Setting
# DENY in Caddy would conflict.
(security_headers) {
header {
Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
X-Content-Type-Options "nosniff"
Referrer-Policy "strict-origin-when-cross-origin"
# Deny browser APIs the app does not use. Reduces blast radius of an
# XSS landing in a privileged origin: a payload cannot silently turn
# on the microphone or read geolocation.
Permissions-Policy "camera=(), microphone=(), geolocation=()"
-Server
}
}
(block_actuator) {
# Defense in depth: even if management.endpoints.web.exposure.include grows
# in application.yaml, /actuator/* is unreachable externally. The internal
# Prometheus scrape (future) talks to the backend directly on the docker
# network, not via Caddy.
#
# Why a `handle` block and not a top-level `respond @matcher`: each archive
# vhost has a catch-all `handle { reverse_proxy ... }` that matches every
# path including /actuator/*, and Caddy's `handle` blocks are mutually
# exclusive. Without our own `handle /actuator/*` the catch-all wins, the
# request is proxied to the backend, and Spring Security 302s to /login
# instead of Caddy returning 404. See #512.
handle /actuator/* {
respond 404
}
}
(access_log) {
# JSON access log for fail2ban. The jail at infra/fail2ban/familienarchiv.conf
# watches this file for 401 responses on /api/auth/login.
# Caddy auto-creates /var/log/caddy/ when running as the `caddy` system user.
log {
output file /var/log/caddy/access.log {
roll_size 10mb
roll_keep 14
}
format json
}
}
archiv.raddatz.cloud {
import security_headers
import block_actuator
import access_log
handle /api/* {
reverse_proxy 127.0.0.1:8080
}
handle {
reverse_proxy 127.0.0.1:3000
}
}
staging.raddatz.cloud {
import security_headers
import block_actuator
import access_log
handle /api/* {
reverse_proxy 127.0.0.1:8081
}
handle {
reverse_proxy 127.0.0.1:3001
}
}
git.raddatz.cloud {
import security_headers
reverse_proxy 127.0.0.1:3005
}

View File

@@ -0,0 +1,39 @@
# fail2ban filter for credential-stuffing attempts against the
# Familienarchiv authentication endpoints.
#
# Parses Caddy JSON access log entries (configured in
# infra/caddy/Caddyfile via the (access_log) snippet).
#
# Sample matched line (whitespace inserted for readability):
# {"level":"info","ts":1700000000.12,"logger":"http.log.access",
# "msg":"handled request",
# "request":{"remote_ip":"203.0.113.42","method":"POST",
# "host":"archiv.raddatz.cloud",
# "uri":"/api/auth/login",…},
# "status":401,…}
#
# Watched endpoints:
# - /api/auth/login — credential stuffing
# - /api/auth/forgot-password — email enumeration + slow brute-force
# against accounts whose addresses leak
#
# Watched statuses:
# - 401 — bad credentials
# - 429 — server-side rate limit (in case a future in-app limiter
# returns 429 before fail2ban catches the volume)
#
# Caddy emits remote_ip *inside* the request object and status at the
# top level. The order within the request object is stable
# (remote_ip → … → uri) across Caddy 2.7+. Lazy `.*?` keeps the regex
# robust to header-dict size growth.
[INCLUDES]
before = common.conf
[Definition]
failregex = ^\s*\{.*?"remote_ip":"<HOST>".*?"uri":"/api/auth/(login|forgot-password).*?"status":\s*4(01|29)\b
ignoreregex =
# Caddy's ts field is a Unix epoch with sub-second precision.
datepattern = "ts":{EPOCH}

View File

@@ -0,0 +1,33 @@
# Jail definition for the Familienarchiv login endpoint.
#
# Install: ln -sf /opt/familienarchiv/infra/fail2ban/jail.d/familienarchiv.conf \
# /etc/fail2ban/jail.d/familienarchiv.conf
# ln -sf /opt/familienarchiv/infra/fail2ban/filter.d/familienarchiv-auth.conf \
# /etc/fail2ban/filter.d/familienarchiv-auth.conf
# systemctl reload fail2ban
#
# Verify with:
# fail2ban-client status familienarchiv-auth
# fail2ban-regex /var/log/caddy/access.log familienarchiv-auth
#
# Tuning rationale:
# - maxretry 10: legitimate users mistyping passwords don't trip the jail
# - findtime 10m: rolling window that catches automated brute force
# - bantime 30m: long enough to discourage scripted attacks, short
# enough that a user who fat-fingered their VPN comes
# back online within a coffee break
[familienarchiv-auth]
enabled = true
# Override Debian's `backend = systemd` default (set in
# /etc/fail2ban/jail.d/defaults-debian.conf). Without this line our jail
# inherits the systemd backend, reads from journald, and never inspects
# Caddy's file-based JSON access log — i.e. brute-force protection is inert.
# `polling` works without inotify and is fine for one rotated log file.
backend = polling
filter = familienarchiv-auth
logpath = /var/log/caddy/access.log
maxretry = 10
findtime = 10m
bantime = 30m
action = iptables-multiport[name=familienarchiv-auth, port="http,https"]

16
infra/minio/Dockerfile Normal file
View File

@@ -0,0 +1,16 @@
# Derived MinIO Client image with the idempotent bootstrap script baked in.
#
# Why a custom image instead of a bind-mount?
# The production Gitea Actions runner is Docker-out-of-Docker. A
# `./infra/minio/bootstrap.sh:/bootstrap.sh:ro` mount resolves the path
# against the HOST filesystem (the host daemon owns the bind), not the
# runner container's `/workspace/...`. The path doesn't exist on the host
# and Docker auto-creates an empty directory at the mount target — the
# entrypoint then fails with `/bootstrap.sh: Is a directory`. Baking the
# script in removes runtime path resolution entirely. See #506.
FROM minio/mc:RELEASE.2025-08-13T08-35-41Z
COPY bootstrap.sh /bootstrap.sh
RUN chmod +x /bootstrap.sh
ENTRYPOINT ["/bin/sh", "/bootstrap.sh"]

67
infra/minio/bootstrap.sh Executable file
View File

@@ -0,0 +1,67 @@
#!/bin/sh
# Idempotent MinIO bootstrap for the Familienarchiv stack.
#
# Runs on every `docker compose up` (the create-buckets service is one-shot,
# no restart). Each step swallows the "already exists" error so the script
# is safe to re-run.
#
# What it does:
# 1. Register the MinIO alias using the root credentials
# 2. Create the application bucket if missing
# 3. Lock the bucket to private (defense in depth)
# 4. Create/enable the `archiv-app` service account (least-privilege user)
# 5. Install a bucket-scoped policy `archiv-app-policy`:
# - GetObject/PutObject/DeleteObject on familienarchiv/*
# - ListBucket + GetBucketLocation on familienarchiv
# (Replaces MinIO's built-in `readwrite` which grants s3:* on *.)
# 6. Attach the policy to `archiv-app`
# 7. Fatal assertion: read back the user and confirm the policy is bound.
# Uses `case` (POSIX) for substring match — the minio/mc image ships
# coreutils + bash but NOT grep/awk/sed.
#
# Required env vars: MINIO_PASSWORD, MINIO_APP_PASSWORD
set -e
mc alias set myminio http://minio:9000 archiv "$MINIO_PASSWORD"
mc mb myminio/familienarchiv --ignore-existing
mc anonymous set private myminio/familienarchiv
mc admin user add myminio archiv-app "$MINIO_APP_PASSWORD" \
|| mc admin user enable myminio archiv-app
cat > /tmp/archiv-app-policy.json <<'POLICY'
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
"Resource": ["arn:aws:s3:::familienarchiv/*"]
},
{
"Effect": "Allow",
"Action": ["s3:ListBucket", "s3:GetBucketLocation"],
"Resource": ["arn:aws:s3:::familienarchiv"]
}
]
}
POLICY
mc admin policy create myminio archiv-app-policy /tmp/archiv-app-policy.json 2>/dev/null \
|| mc admin policy update myminio archiv-app-policy /tmp/archiv-app-policy.json
mc admin policy attach myminio archiv-app-policy --user archiv-app 2>/dev/null || true
INFO=$(mc admin user info myminio archiv-app)
case "$INFO" in
*archiv-app-policy*)
echo "archiv-app bound to archiv-app-policy"
;;
*)
echo "FATAL: archiv-app is missing the bucket-scoped policy"
echo "----- user info -----"
echo "$INFO"
exit 1
;;
esac

View File

@@ -5,6 +5,13 @@
"matchPackagePatterns": ["^@tiptap/"],
"groupName": "tiptap",
"automerge": false
},
{
"description": "Digest bumps for images used in privileged CI steps (--privileged --pid=host) must be reviewed manually — a compromised image has root-equivalent host access.",
"matchPaths": [".gitea/workflows/**"],
"matchUpdateTypes": ["digest"],
"automerge": false,
"reviewersFromCodeOwners": false
}
]
}

16
runner-config.yaml Normal file
View File

@@ -0,0 +1,16 @@
# runner-config.yaml — only the relevant section
container:
# passed as DOCKER_HOST inside the job container
docker_host: "unix:///var/run/docker.sock"
# whitelists the socket path so workflows can mount it
valid_volumes:
- "/var/run/docker.sock"
# appended to `docker run` when the runner spawns a job container
# SECURITY: Mounting the Docker socket grants job containers root-equivalent
# access to the host Docker daemon. Acceptable here because only trusted code
# from this private repo runs on this runner. Do NOT use on a runner that
# accepts untrusted PRs from external contributors.
options: "-v /var/run/docker.sock:/var/run/docker.sock"
# keep network mode default (bridge) — Testcontainers handles its own networking
force_pull: false