Commit Graph

2146 Commits

Author SHA1 Message Date
Marcel
fb52db1253 test: cover CorrespondentSuggestionsDropdown and PersonCard branches
CorrespondentSuggestionsDropdown: empty list still renders the static
heading and 'Alle Korrespondenten' row, populated rows when not loading,
loading hides correspondent rows, initials fallback (lastName-only when
firstName is null), click + keyboard selection, Escape closes.

PersonCard: full matrix of conditional UI — title visibility for PERSON
vs non-PERSON, avatar initials path (firstName+lastName vs lastName-only
fallback), PersonTypeBadge presence for non-PERSON types, alias, life
dates, notes, and the canWrite=true/false branches that gate the edit
link (Nora's authorization-rendering rule).

21 tests covering ~50 branches.

Refs #496.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 21:50:28 +02:00
Marcel
2e5a9bd36c test: cover OcrTrigger, CoCorrespondentsList, reset-password page
OcrTrigger: select initialisation from storedScriptType (with the
UNKNOWN sentinel collapsing to empty), button disabled-state matrix
across blockCount × scriptType, onTrigger callback wiring, no-annotations
hint visibility.

CoCorrespondentsList: empty-list early return, populated heading + hint,
chip count and links, initials-from-up-to-two-name-parts logic.

reset-password page: form/success branches, hidden-token rendering with
null fallback, MISMATCH vs generic error code mapping, back-to-login
link.

21 tests across three files.

Refs #496.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 21:50:28 +02:00
Marcel
f6bbb08b26 test: cover PersonTypeBadge, ExpandableText, PersonChipRow branches
PersonTypeBadge: one test per switch arm (INSTITUTION, GROUP, UNKNOWN)
plus the two no-render branches (unrecognised type, empty type).

ExpandableText: clamp detection, toggle visibility logic, expand →
collapse round-trip, default maxLines fallback.

PersonChipRow: sender-only, sender+arrow, abbreviated naming, max-two
visible receivers, +N overflow pill presence/absence, receivers-only
case (no sender → no arrow).

19 tests across three files. Each file uses afterEach(cleanup) and
queries via getByRole/getByText so tests stay decoupled from CSS.

Refs #496.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 21:50:28 +02:00
Marcel
98335411af test(routes): cover +error and forgot-password page branches
+error.svelte: vi.mock('$app/state') drives the page state so each test
can assert one of the three rendering branches — populated error message,
distinct status code, and the 'Internal Error' fallback when page.error
is null.

forgot-password/+page.svelte: prop-driven tests for the four states —
default form, success banner, error message inside the form, and the
back-to-login link href.

Refs #496.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 21:50:28 +02:00
Marcel
00bf2eba38 test(profile,documents): cover PasswordChangeForm and FileSectionNew branches
PasswordChangeForm: tests the null/success/error/mismatch banner branches
plus the form action wiring.

FileSectionNew: tests the no-file/file-selected toggle, onfileParsed
callback invocation with the parsed metadata, the early-return when no
file is in the change event, and the suggestedTitle fallback path.

Eleven tests across two files. Both follow the UploadZone template (props,
File API synthetic input, vi.fn() callback spies).

Refs #496.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 21:50:28 +02:00
Marcel
273bf5e5fa test(person): add PersonChip browser tests
Covers the abbreviated/full name branches, the firstName-null fallback
path, link href derivation from person id, initials rendering, and the
deterministic avatar palette colour. Six tests, six branches hit.

Refs #496.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 21:50:28 +02:00
Marcel
2d18de57c9 test(document): cover all five DocumentStatusChip status branches
Adds DocumentStatusChip.svelte.test.ts asserting one branch per
DocumentStatus value (PLACEHOLDER, UPLOADED, TRANSCRIBED, REVIEWED,
ARCHIVED) plus the title/aria-label exposure. Each test queries the
element via getByTitle so the component's accessibility surface is
verified at the same time as its branch logic.

Refs #496.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 21:50:28 +02:00
Marcel
4483413abf test(upload-zone): backfill afterEach(cleanup) for consistent test isolation
UploadZone is the canonical browser-test template referenced from issue #496
implementation guidance. Adding afterEach(cleanup) makes it match the
TranscriptionPanelHeader pattern and prevents cross-test DOM leakage as more
tests are added in this branch.

Refs #496.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 21:50:28 +02:00
Marcel
9572b062f1 refactor(test): use getByRole instead of data-testid in TranscriptionPanelHeader test
Per Felix's review on issue #496, tests should query observable behaviour via
ARIA roles, not test-only data-testid attributes. Replaces every
'document.querySelector([data-testid=...])' with 'page.getByRole(...)'.
The disabled-button click test uses force: true so Playwright bypasses its
enabled-check — the behaviour under test is precisely that the click is
ignored.

Refs #496.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 21:50:28 +02:00
Marcel
92da39ed84 chore(routes): delete dev-only demo route
Removes scaffolding pages from initial Paraglide setup that were never
navigated to in production. Shrinks the measured coverage surface and
removes dead code from the production bundle. CLAUDE.md route tables
updated to drop the demo/ entry.

Refs #496.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 21:50:28 +02:00
Marcel
3775f4cb52 ci(nightly): regression guard for backend /import:ro mount
Some checks failed
CI / Backend Unit Tests (pull_request) Successful in 4m13s
CI / fail2ban Regex (pull_request) Successful in 38s
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / fail2ban Regex (push) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
CI / Unit & Component Tests (pull_request) Failing after 2m48s
CI / OCR Service Tests (pull_request) Successful in 18s
CI / Compose Bucket Idempotency (pull_request) Failing after 11s
CI / Unit & Component Tests (push) Has been cancelled
Sara flagged that a future "compose cleanup" PR could silently drop the
backend volumes block and CI would happily pass while mass import on
staging silently broke. Adds a pre-build step that renders the staging
compose config and fails the deploy if `target: /import` or
`read_only: true` is missing.

Local verification of the guard:
- Volumes block removed → `grep -q 'target: /import'` exits 1 → step fails
- Volumes block present → both greps match → step passes

Addresses Sara's review on #526.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 20:08:30 +02:00
Marcel
c2c42706c7 ci(release): wire IMPORT_HOST_DIR=/srv/familienarchiv-production/import
Mirrors the staging change. The host directory does not yet exist on
the production server — first production release that consumes this
will create an empty bind source via Docker's auto-create behaviour;
mass import then reports "no spreadsheet found" until an operator
pre-stages a payload there.

Addresses Tobias's review on #526.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 20:06:33 +02:00
Marcel
9703a72e6c ci(nightly): wire IMPORT_HOST_DIR=/srv/familienarchiv-staging/import
The compose file now requires IMPORT_HOST_DIR or refuses to start
(#526). Without this line the next nightly deploy would fail with a
clear interpolation error, but it should not fail — the staging
import payload already lives at this host path (rsync'd in #526).

Addresses Tobias's review on #526.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 20:05:55 +02:00
Marcel
a40267e490 docs(deployment): document IMPORT_HOST_DIR and mass-import workflow
DEPLOYMENT.md line 81 declares any compose env var missing from §2 a
blocking review comment. IMPORT_HOST_DIR (added on this branch) was
unmentioned. Adds the row and rewrites §6.4 so the staging/prod operator
workflow (rsync host → set env → trigger import) is in the runbook,
not just buried in compose comments.

Addresses review feedback from Markus and Tobias on #526.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 20:05:14 +02:00
Marcel
cdb5db6c68 fix(compose): require IMPORT_HOST_DIR, no default
Tobias and Markus both flagged that a shared default (/srv/familienarchiv/
import) invites silent collision when staging and prod cohabit one host.
Switch to ${IMPORT_HOST_DIR:?...} so compose refuses to start without an
explicit per-env path — collision becomes structurally impossible.

The error message points operators at docs/DEPLOYMENT.md so the recovery
step is one click away. IMPORT_HOST_DIR moves from "Optional" to the
main required-env-vars block in the header.

Addresses review feedback from Markus, Tobias, and Nora on #526.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 20:03:57 +02:00
Marcel
ff20721dee refactor(import): make import directory @Value-configurable
The hardcoded `static final String IMPORT_DIR = "/import"` was the only
non-`@Value` configurable input in MassImportService — every column
index next to it is wired through `app.import.col.*`. Lifts the
contract from infrastructure (compose bind mount) into application
config (`app.import.dir`), with `/import` as the default so the existing
bind-mount path keeps working.

Addresses review feedback from Markus and Felix on #526.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 20:02:45 +02:00
Marcel
4a537d6b19 feat(infra): bind-mount /import for backend mass-import endpoint
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m55s
CI / OCR Service Tests (push) Successful in 18s
CI / Backend Unit Tests (push) Successful in 4m9s
CI / fail2ban Regex (push) Successful in 38s
CI / Compose Bucket Idempotency (push) Successful in 56s
CI / Unit & Component Tests (pull_request) Failing after 2m47s
CI / OCR Service Tests (pull_request) Successful in 17s
CI / Backend Unit Tests (pull_request) Successful in 4m12s
CI / fail2ban Regex (pull_request) Successful in 38s
CI / Compose Bucket Idempotency (pull_request) Successful in 57s
`MassImportService` reads the ODS spreadsheet and referenced PDFs from a
hardcoded `/import` path inside the backend container. Dev compose
already bind-mounts `./import:/import`, but the prod compose had no
equivalent, so `POST /api/admin/import` would always fail on staging/prod
with "no spreadsheet found".

Mount strategy:
- Source path is env-driven (`IMPORT_HOST_DIR`), defaulting to
  `/srv/familienarchiv/import` so the host path is stable across CI
  deploys (the compose working dir is recreated each run, so `./import`
  would not persist).
- Read-only — `MassImportService` only reads (`Files.list` /
  `Files.walk`), never writes. Read-only mount makes that contract
  explicit and prevents the backend container from mutating the source
  PDFs.
- Empty / missing path is harmless: the import API just returns the
  existing "no spreadsheet found" error rather than crashing the
  container.

To use on staging: rsync the import folder to
`/srv/familienarchiv-staging/import/` on the host, set
`IMPORT_HOST_DIR=/srv/familienarchiv-staging/import` in `.env.staging`,
redeploy, trigger import from `/admin/system`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 18:57:47 +02:00
Marcel
5f3529439a fix(infra): frontend healthcheck on 127.0.0.1, not localhost
Some checks failed
CI / Unit & Component Tests (pull_request) Failing after 2m53s
CI / OCR Service Tests (pull_request) Successful in 17s
CI / Backend Unit Tests (pull_request) Successful in 4m33s
CI / fail2ban Regex (pull_request) Successful in 40s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m0s
CI / Unit & Component Tests (push) Failing after 2m52s
CI / OCR Service Tests (push) Successful in 18s
CI / Backend Unit Tests (push) Successful in 4m23s
CI / fail2ban Regex (push) Successful in 39s
CI / Compose Bucket Idempotency (push) Successful in 1m0s
The new alpine-based frontend production image (`node:20.19.0-alpine3.21`)
resolves `localhost` only to `::1` in /etc/hosts. SvelteKit's adapter-node
binds to 0.0.0.0 (IPv4 only), so `wget http://localhost:3000/login` from
inside the container connects to ::1 and gets "Connection refused" every
15s. Container goes unhealthy → `docker compose up --wait` fails → nightly
staging deploy fails. The app itself is fine.

Switching to 127.0.0.1 bypasses /etc/hosts and matches what Node actually
listens on.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 18:49:32 +02:00
Marcel
48c8bb8a5f fixup: address Nora's review on #520 (security blockers)
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m48s
CI / OCR Service Tests (push) Successful in 17s
CI / Backend Unit Tests (push) Successful in 4m10s
CI / fail2ban Regex (push) Successful in 38s
CI / Compose Bucket Idempotency (push) Successful in 56s
- frontend/login: derive cookie `secure` flag from request URL protocol.
  Pre-PR the cookie was only read by SSR so the flag didn't matter; now
  the cookie IS the API credential and must be Secure on HTTPS or it
  leaks a 24h Basic token on plaintext networks. Dev runs over HTTP and
  would silently lose the cookie if we hardcoded `secure: true`, so the
  flag follows `event.url.protocol === 'https:'`.

- SecurityConfig: rewrite the CSRF-disabled comment. The old
  "browsers block cross-origin custom headers" justification no longer
  holds once /api/* is authenticated via the cookie. Make the
  load-bearing dependencies explicit: SameSite=strict on the auth_token
  cookie + Spring's default CORS rejection.

- AuthTokenCookieFilter:
  - Scope to /api/* only. /actuator/health and similar must not be
    cookie-authenticated.
  - Refuse malformed percent-encoding (URLDecoder throws); forward the
    request without a promoted Authorization rather than crash.
  - Use isBlank() instead of isEmpty() per Nora.
  - Javadoc warning: getHeaderNames/getHeaders exposes the Basic
    credential; any future header-iterating logger must scrub
    Authorization before logging.

- Tests: add `passes_through_unchanged_when_request_is_outside_api_scope`
  (/actuator/health with cookie should NOT be wrapped) and
  `passes_through_unchanged_when_cookie_value_is_malformed_percent_encoding`.
  Tighten the explicit-header test to verify same-instance forwarding
  rather than just header equality.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 18:20:10 +02:00
Marcel
023810df1e fix(security): promote auth_token cookie to Authorization header for browser /api/* calls
Closes #520.

The login action stores `Basic <base64>` in an HttpOnly `auth_token`
cookie. SSR fetches from hooks.server.ts explicitly set the
Authorization header. Vite's dev proxy does the same on every
/api/* request. Caddy in production does NOT. So browser-side
fetch() and EventSource() calls reach the backend without auth,
get 401 + WWW-Authenticate: Basic, and the browser pops a native
auth dialog over the SPA.

Add AuthTokenCookieFilter (Ordered.HIGHEST_PRECEDENCE, before any
Spring Security filter) that promotes the cookie to a request
header when no explicit Authorization is present. URL-decodes the
cookie value because SvelteKit URL-encodes spaces ("Basic " ->
"Basic%20") when serializing the cookie. Works the same for REST,
SSE (/api/notifications/stream, /api/ocr/jobs/.../progress), and
any other browser-direct backend call.

5 tests in AuthTokenCookieFilterTest cover: URL-decoded promotion,
explicit-Authorization-wins precedence, no-cookies pass-through,
absent-auth-token pass-through, empty-value pass-through.

Also: add `@ActiveProfiles("test")` to ThumbnailServiceIntegrationTest,
the one remaining @SpringBootTest in the suite that wasn't annotated.
After #516 made UserDataInitializer fail-closed outside dev/test/e2e,
this test's context load was throwing. Restores green main.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 18:20:10 +02:00
Marcel
ad3b571bba fix(user): findOrCreate Administrators group instead of blind-INSERT (#518)
Some checks failed
CI / Backend Unit Tests (pull_request) Failing after 4m12s
CI / fail2ban Regex (pull_request) Successful in 39s
CI / Unit & Component Tests (pull_request) Failing after 2m50s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Compose Bucket Idempotency (pull_request) Successful in 58s
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / fail2ban Regex (push) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
Closes #518.

UserDataInitializer.initAdminUser was doing groupRepository.save(adminGroup)
unconditionally. If a previous boot had seeded the group but failed
before creating the admin user (or if the operator deleted just the
admin row to retry with a corrected APP_ADMIN_USERNAME), the next
seed attempt violated user_groups_name_key and aborted the context.

Switch to the same findByName(...).orElseGet(...) pattern initE2EData
already uses for the "Leser" group.

Tests in AdminSeedFailClosedTest:
- reuses_existing_Administrators_group_when_seeding_a_new_admin
- creates_Administrators_group_when_seeding_admin_on_a_fresh_database
Plus updated existing tests to stub groupRepository.save now that the
seed path also exercises it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 17:29:11 +02:00
Marcel
9686e304c2 fix(caddy): wrap actuator block in handle so it takes precedence over catch-all
Some checks failed
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / fail2ban Regex (push) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
Closes #512.

The previous `(block_actuator)` snippet emitted `respond @actuator 404`
at the top level of each archive vhost. But each vhost also has a
catch-all `handle { reverse_proxy ... }` that matches /actuator/*
too. Caddy's `handle` blocks are mutually exclusive — once one matches,
the request never reaches a top-level `respond`. So /actuator/health
was being proxied to the backend, which 302s to /login.

Wrap the actuator response in its own `handle /actuator/*` block.
Caddy sorts `handle` blocks by path specificity, so /actuator/* wins
over the catch-all and the 404 is actually returned.

Verified with `caddy validate` against the caddy:2 image.

Also unblocks the nightly.yml smoke test's `/actuator/health → 404`
assertion, which has been failing since the first staging deploy.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 17:15:03 +02:00
Marcel
ea0b3050e4 fix(user): fail-closed when admin seed would use dev defaults outside dev/test/e2e
Some checks failed
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / fail2ban Regex (push) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
Addresses Nora's review concern on #513/#516.

The previous fix only made env-vars take effect — it did NOT close the
fail-open default path. If an operator forgets APP_ADMIN_USERNAME /
APP_ADMIN_PASSWORD on first prod boot, the seeded admin is the
well-known `admin@familienarchiv.local` / `admin123` and is permanently
locked (UserDataInitializer only seeds when the row is missing).

Refuse to seed outside dev/test/e2e profiles when either credential
matches the documented default. The startup fails fast with a clear
message pointing at the env-var names and the permanence trap.

Also adds Markus/Felix/Sara's "pin the Java side" coverage: a
reflection test on the @Value placeholder catches a future rename
of `${app.admin.email:...}` back to `${app.admin.username:...}`,
which would otherwise pass the yaml-side test but silently break
the binding.

Tests:
- AdminSeedFailClosedTest pins fail-closed for non-local profiles
  and verifies the dev/test/e2e bypass.
- AdminSeedPropertyKeyTest now also asserts the @Value placeholder
  string on UserDataInitializer.adminEmail/adminPassword.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 17:12:36 +02:00
Marcel
21343cdf23 fix(user): rename yaml key username→email so admin seed reads APP_ADMIN_USERNAME
Closes #513.

UserDataInitializer reads `@Value("${app.admin.email:...}")` but
application.yaml mapped APP_ADMIN_USERNAME to `app.admin.username`.
The keys never connected — env vars APP_ADMIN_USERNAME and
APP_ADMIN_PASSWORD were silently ignored and the admin user got
seeded with the hardcoded defaults admin@familyarchive.local /
admin123.

For production this is HIGH severity: DEPLOYMENT.md §3.5 documents
the admin password as permanently locked on first deploy. The
bug locked the lock-in to dev defaults, not to whatever an operator
set in PROD_APP_ADMIN_PASSWORD.

Rename yaml key from `username:` to `email:` so the Spring property
`app.admin.email` actually exists. Keep env-var name
APP_ADMIN_USERNAME (matches the already-set Gitea secrets and
DEPLOYMENT.md §3.3). Default value updated to an email-shape.

Added AdminSeedPropertyKeyTest (Binder pattern, no Spring context):
verifies both `app.admin.email` and `app.admin.password` resolve
from the yaml. Confirmed red without the fix, green with it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 17:12:36 +02:00
Marcel
6ba7254344 test(ci): assert prerender output is only /hilfe/transkription
Some checks failed
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / fail2ban Regex (push) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
CI / Unit & Component Tests (pull_request) Has been cancelled
CI / OCR Service Tests (pull_request) Has been cancelled
CI / Backend Unit Tests (pull_request) Has been cancelled
CI / fail2ban Regex (pull_request) Has been cancelled
CI / Compose Bucket Idempotency (pull_request) Has been cancelled
Addresses Sara's review request on #515.

Without this gate, a future regression that turns prerender.crawl
back on (or adds a new prerender entry whose nav links into
protected routes) would silently bake /, /documents, /persons etc.
to "redirect-to-login" HTML and re-introduce #514.

Verified the script catches the current broken build state:
  $ find build/prerendered ... -not -path 'hilfe/*' ...
  build/prerendered/{index,documents,persons,geschichten,stammbaum}.html

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 17:00:54 +02:00
Marcel
b2955fb695 fix(frontend): disable prerender crawl so /, /documents, /persons aren't baked
Closes #514.

The build was prerendering protected routes via crawl from
/hilfe/transkription. Their load functions throw redirect('/login')
during the build (no auth cookie), so SvelteKit captured the redirect
as static HTML and shipped /app/build/prerendered/{index,documents,
persons,geschichten,stammbaum}.html with a `location.href=/login`
script. In production these files are served BEFORE hooks.server.ts
runs, so an authenticated user with a valid cookie is still served
the baked bounce-back page.

Setting `crawl: false` keeps the explicit /hilfe/transkription entry
prerendered (needed for the public help page) without dragging the
nav targets along with it.

Verified locally: build now emits only `hilfe/transkription.html`
under build/prerendered/, no index.html or documents.html etc.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 17:00:10 +02:00
5d2888e038 Merge pull request 'fix(compose): mark create-buckets as one-shot for up --wait (#510)' (#511) from fix/issue-510-compose-wait-oneshot-create-buckets into main
Some checks failed
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / fail2ban Regex (push) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
2026-05-11 16:59:59 +02:00
Marcel
3668555421 fix(compose): mark create-buckets as one-shot for up --wait
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m47s
CI / OCR Service Tests (push) Successful in 17s
CI / Backend Unit Tests (push) Successful in 4m12s
CI / fail2ban Regex (push) Successful in 37s
CI / Compose Bucket Idempotency (push) Successful in 56s
CI / Unit & Component Tests (pull_request) Failing after 2m49s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Successful in 4m13s
CI / fail2ban Regex (pull_request) Successful in 38s
CI / Compose Bucket Idempotency (pull_request) Successful in 58s
Closes #510.

`docker compose up -d --wait` exits 1 even when every service is
healthy because the one-shot `create-buckets` exits 0 and --wait
expects "running". The whole stack came up fine on staging, but the
workflow gate failed before the smoke step could run.

Two changes:

1. create-buckets: `restart: "no"` declares one-shot intent.
2. backend.depends_on: add `create-buckets: service_completed_successfully`.

With both, compose v2.20+ understands create-buckets is a one-shot
that must complete successfully, and --wait treats exited(0) as the
target state. Backend startup now also correctly gates on bucket
bootstrap (closes a latent race where backend could start before
the archiv-app policy was bound).

Verified `docker compose config --quiet` parses and the resolved
config shows the right dependency graph.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 16:33:04 +02:00
Marcel
54a8f7f8e9 fix(workflows): match runner label — runs-on ubuntu-latest, not self-hosted
Some checks failed
CI / Unit & Component Tests (pull_request) Has been cancelled
CI / OCR Service Tests (pull_request) Has been cancelled
CI / Backend Unit Tests (pull_request) Has been cancelled
CI / fail2ban Regex (pull_request) Has been cancelled
CI / Compose Bucket Idempotency (pull_request) Has been cancelled
CI / Unit & Component Tests (push) Failing after 2m49s
CI / Backend Unit Tests (push) Has been cancelled
CI / fail2ban Regex (push) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
Closes #508.

Our gitea-runner advertises labels ubuntu-latest / ubuntu-24.04 /
ubuntu-22.04. `runs-on: self-hosted` never matches → dispatched
deploy jobs sit in the queue forever. The runner is still
genuinely self-hosted (DooD socket, joined to gitea_gitea net,
single-tenant per ADR-011) — the `self-hosted` token was just an
unconfirmed assumption about the label name.

Unblocks #497 / #499 first deploy.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 16:15:53 +02:00
Marcel
f8f0951bd5 fix(minio): bake bootstrap.sh into image instead of bind-mounting
Some checks failed
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / fail2ban Regex (push) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
CI / Unit & Component Tests (pull_request) Failing after 2m50s
CI / OCR Service Tests (pull_request) Successful in 17s
CI / Backend Unit Tests (pull_request) Successful in 4m9s
CI / fail2ban Regex (pull_request) Failing after 12s
CI / Compose Bucket Idempotency (pull_request) Successful in 57s
Closes #506.

Under Docker-out-of-Docker (the production Gitea Actions runner), the
host daemon resolves the relative bind-mount path against the host
filesystem — not the runner container's /workspace. The script is not
there, so Docker creates an empty directory at /bootstrap.sh and the
entrypoint fails with `/bootstrap.sh: Is a directory`.

Bake the script into a tiny derived image (infra/minio/Dockerfile) so
there is no runtime path resolution. Works in DooD, regular Docker,
and CI.

Unblocks the staging / production deploy pipelines from #497 / #499
and turns the Compose Bucket Idempotency CI job green.

Verified locally:
- `docker compose ... config --quiet` parses
- `docker compose ... build create-buckets` builds the image
- bootstrap.sh exists as a +x file at /bootstrap.sh inside the image

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 15:32:36 +02:00
c3c1efe5f1 Merge pull request 'fix(fail2ban): pin polling backend so jail actually reads Caddy access log (#503)' (#504) from fix/issue-503-fail2ban-polling-backend into main
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m47s
CI / OCR Service Tests (push) Successful in 16s
CI / Backend Unit Tests (push) Successful in 4m12s
CI / fail2ban Regex (push) Successful in 39s
CI / Compose Bucket Idempotency (push) Failing after 50s
2026-05-11 15:08:58 +02:00
Marcel
e5363913ec fix(fail2ban): pin polling backend so jail actually reads Caddy access log
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m49s
CI / OCR Service Tests (push) Successful in 16s
CI / Backend Unit Tests (push) Successful in 4m8s
CI / fail2ban Regex (push) Successful in 37s
CI / Compose Bucket Idempotency (push) Failing after 53s
CI / Unit & Component Tests (pull_request) Failing after 2m46s
CI / OCR Service Tests (pull_request) Successful in 15s
CI / Backend Unit Tests (pull_request) Successful in 4m14s
CI / fail2ban Regex (pull_request) Successful in 37s
CI / Compose Bucket Idempotency (pull_request) Failing after 50s
Closes #503.

Debian's fail2ban package ships defaults-debian.conf with
`[DEFAULT] backend = systemd`. Without an explicit override, our
familienarchiv-auth jail inherits the systemd backend at runtime,
reads from journald, and never inspects /var/log/caddy/access.log.
A live login brute-force would not be banned.

Add `backend = polling` to the jail and a CI step that links the jail
into /etc/fail2ban/ and asserts `fail2ban-client -d` resolves it to
the polling backend, not the inherited systemd backend.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 14:59:40 +02:00
Marcel
4d4d5793bb docs(glossary): add archiv-app service account entry
Some checks failed
CI / Unit & Component Tests (pull_request) Failing after 2m48s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Successful in 4m5s
CI / fail2ban Regex (pull_request) Successful in 37s
CI / Compose Bucket Idempotency (pull_request) Failing after 50s
CI / Unit & Component Tests (push) Failing after 2m46s
CI / OCR Service Tests (push) Successful in 15s
CI / Backend Unit Tests (push) Successful in 4m4s
CI / fail2ban Regex (push) Successful in 37s
CI / Compose Bucket Idempotency (push) Failing after 50s
`archiv-app` is the bucket-scoped MinIO service account introduced
in PR #499 alongside the production deploy pipeline. Until now the
term only appeared in `infra/minio/bootstrap.sh` and the prod compose
file; a reader encountering `S3_ACCESS_KEY: archiv-app` had no
single-page reference distinguishing it from the MinIO root account.

Adds a new "Infrastructure Terms" section to docs/GLOSSARY.md so the
distinction (root account vs. application service account) and the
attached `archiv-app-policy` scope live in the canonical glossary
location. Cross-links to ADR-010 for the MinIO-stays-self-hosted
rationale. Addresses @elicit's round-2 recommendation on PR #499.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 14:11:46 +02:00
Marcel
9adde3cd89 refactor(compose): rename docker network archive-net to archiv-net
The docker network was the only `archive-*` identifier in either
compose file; everything else (user, db, bucket, service account,
project name) uses the `archiv-*` spelling. Reviewers' eyes stuttered
on it on the prod compose review (round 2 of PR #499 — Markus and
Tobi). Renamed in both prod and dev compose for consistency and
updated the single doc reference to the dev-project-prefixed
network name.

Operational note: applying this change to a running stack will
recreate the network on the next `docker compose up`; containers
restart, named volumes are unaffected.

`docker compose config --quiet` passes for both compose files and
for the staging profile. Sweep confirms zero `archive-net`
references remain in the tree.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 14:10:39 +02:00
Marcel
440a191138 infra(workflows): annotate env-file cleanup as load-bearing
The `if: always()` conditional on the env-file cleanup step in both
deploy workflows is what makes the ADR-011 single-tenant runner trust
model safe: secrets land on disk before each deploy and are wiped
unconditionally afterwards. A future workflow refactor that drops
`if: always()` would silently leave plaintext secrets on the runner
on any failed deploy.

The ADR documents this; the workflow file did not. Adds a prominent
inline comment so the next reader of the YAML sees the constraint
without having to cross-reference ADR-011. No behaviour change — both
workflows still parse. Addresses @nora's round-2 suggestion on PR
#499 — "linchpin of the ADR-011 trust model".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 14:09:12 +02:00
Marcel
1873f50f7f infra(mailpit): use nc -z healthcheck instead of wget
The mailpit service healthcheck previously assumed `wget` ships in
the axllent/mailpit image. That's true for v1.29.7 but is not part
of the image's contract — a future Alpine slim-down could drop wget
and silently disable the healthcheck. Switched to BusyBox `nc -z
localhost 8025`, which is a TCP-port open check with no dependency
beyond BusyBox itself.

Verified inside axllent/mailpit:v1.29.7 that `nc` is present
(/usr/bin/nc, BusyBox v1.37.0) and that the proposed command
returns 0 against an open port and non-zero against a closed one.
Compose still parses with `--profile staging`. Addresses @tobi's
round-2 suggestion on PR #499.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 14:08:23 +02:00
Marcel
a4f2047bcc security(ocr): pin ALLOWED_PDF_HOSTS=minio in prod ocr-service env
Production never sources PDFs from localhost or 127.0.0.1 — the OCR
service only reads from MinIO over the internal docker network. The
Python default (`minio,localhost,127.0.0.1`) was permissive on
purpose for local dev, but in production a future change to that
default — or a host-env override — would silently broaden the SSRF
surface. Pinning the env var explicitly here freezes the allowlist
to the one hostname production actually needs.

`docker compose config --quiet` and `--profile staging config
--quiet` both still pass. Verified the resolved config emits
`ALLOWED_PDF_HOSTS: minio`. Addresses @nora's round-2 suggestion on
PR #499 — "five characters of YAML, lifetime guarantee".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 14:07:16 +02:00
Marcel
09680557ef security(caddy): add Permissions-Policy header
Adds `Permissions-Policy: camera=(), microphone=(), geolocation=()` to
the shared (security_headers) snippet, so both archiv vhosts and the
git vhost deny browser APIs the app does not use. Reduces blast radius
of an XSS landing in a privileged origin.

The deploy smoke steps in nightly.yml and release.yml gain a matching
assertion against the canonical header value, so a future Caddyfile
edit that drops or loosens the header (e.g. `camera=(self)`) fails the
deploy instead of regressing silently.

`caddy validate` against caddy:2 passes; both workflow YAMLs parse.
Addresses @nora's round-2 suggestion on PR #499 — "lower-impact than
CSP but nearly free".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 14:06:13 +02:00
Marcel
8fcf653cb0 ci(smoke): pin HSTS to preload-list-eligible value
Replaces the presence-only `grep -qi strict-transport-security` smoke
assertion in both nightly.yml and release.yml with a value-pinning
regex that requires `max-age=31536000`, `includeSubDomains`, and
`preload`. A future Caddyfile edit that drops any of those three
parts now fails the deploy smoke step instead of passing silently.

Verified locally that the new pattern matches the preload-eligible
value and rejects three degraded forms (short max-age, missing
includeSubDomains, missing preload). Addresses @sara's round-2 note
on PR #499 — "presence check, not value check".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 14:05:02 +02:00
Marcel
a7a80f8c16 docs(deployment): route SSE through Caddy in topology mermaid
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m48s
CI / OCR Service Tests (push) Successful in 16s
CI / Unit & Component Tests (pull_request) Failing after 2m48s
CI / Backend Unit Tests (pull_request) Successful in 4m8s
CI / fail2ban Regex (pull_request) Successful in 37s
CI / Compose Bucket Idempotency (pull_request) Failing after 49s
CI / Backend Unit Tests (push) Successful in 4m7s
CI / fail2ban Regex (push) Successful in 36s
CI / Compose Bucket Idempotency (push) Failing after 1m15s
CI / OCR Service Tests (pull_request) Successful in 16s
The top-level deployment diagram lagged the C4 L2 diagram, which
correctly notes that SSE notifications are fronted by Caddy. The
mermaid showed Browser → Backend direct, which would only be true
if the backend port were exposed publicly (it is not — all docker
ports bind to 127.0.0.1).

Fixes the inconsistency Markus flagged on PR #499: the public
surface is Caddy and Caddy only.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:18:11 +02:00
Marcel
03d478840b docs(arch): show Caddy + X-Forwarded-Proto in auth-flow diagram
Adds the Caddy hop to seq-auth-flow.puml and surfaces the two
production-relevant header behaviours:

  - Caddy terminates TLS and forwards X-Forwarded-Proto: https
  - Spring Boot trusts this header (server.forward-headers-strategy:
    native, ForwardedRequestCustomizer at the Jetty layer), so
    request.getScheme() returns "https"
  - The Set-Cookie response carries the Secure flag because the
    observed scheme is https — without forward-headers-strategy this
    would silently drop to plain http and the cookie would lose Secure

Closes the doc-currency gap flagged in the Markus review on PR #499:
"Auth flow change → docs/architecture/c4/seq-auth-flow.puml".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:17:12 +02:00
Marcel
6a6a1c4353 docs(adr): ADR-011 single-tenant Gitea runner with on-disk env-files
Records the operational assumption that nightly.yml and release.yml
bake in: the self-hosted runner is single-tenant, so writing secrets
to .env.staging / .env.production on disk and removing them via an
`if: always()` cleanup step is acceptable for v1.

Documents the three migration triggers (second repo on the runner,
untrusted PR execution, move to shared infrastructure) and the
one-step migration path (--env-file <(printf '%s' "$SECRET_BLOB"))
so the next operator does not silently break the trust assumption.

The in-comment notes at the top of both workflow files already point
at this ADR's content; this commit records the decision in the durable
location the doc-currency table demands.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:16:20 +02:00
Marcel
b57afb9ad2 docs(adr): ADR-010 MinIO stays self-hosted, Hetzner OBS deferred
Records the reversal of the earlier "migrate to Hetzner Object Storage"
direction in docs/infrastructure/production-compose.md. Documents the
cost/benefit (current 13 GB fits trivially on the VPS; OBS billing is
dominated by base fee at this size; migration is a three-env-var swap
plus `mc mirror`, no application rewrite cost).

Captures the four triggers that should re-open the decision (50 GB
threshold, healthcheck latency, VPS upgrade cost, backup runtime) so
the deferral does not become an indefinite punt.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:15:38 +02:00
Marcel
59bc81d353 docs(adr): ADR-009 standalone docker-compose.prod.yml, not overlay
Records the decision to make docker-compose.prod.yml a fully self-contained
file rather than an overlay over docker-compose.yml. Captures the cost
(env-var duplication across dev and prod files) and the benefit (single
file the reviewer can hold in their head, no Compose merge-rule
surprises, automatic project-name namespacing for cohabiting staging +
production on one host).

Surfaces the retirement of the earlier overlay narrative in
docs/infrastructure/production-compose.md so a future maintainer does
not reverse the choice out of ignorance.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:14:58 +02:00
Marcel
33300e4ad9 chore(infra): drop aspirational Renovate comments from compose
The repo's renovate.json only configures TipTap grouping; Renovate is
not currently active against MinIO / mc / mailpit / Postgres / Node /
Caddy. The "Renovate keeps it current" comments were aspirational —
those tags will rot until Renovate is bootstrapped (tracked in a
follow-up issue).

The "Pinned mc release; Renovate keeps it current" comment is gone
already since the create-buckets entrypoint was extracted to a script
in the preceding MinIO-policy commit.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:12:55 +02:00
Marcel
fe1451f570 ci(smoke): pin curl to 127.0.0.1 via --resolve
The smoke step previously curled the public hostname unconditionally,
which routes the runner's request via DNS → router → back into the same
host. Many SOHO routers do not implement hairpin NAT (or do so only after
a firmware update), so the deploy may pass on day one and silently fail
on day 90.

--resolve "<host>:443:127.0.0.1" pins the hostname to the runner's
loopback while keeping SNI on the public name (so the cert validates
correctly and the Caddy vhost block matches). The smoke test now
verifies that the Caddy-on-the-same-host is serving the right
hostname end-to-end, with no router dependency.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:12:05 +02:00
Marcel
f2ec81547b ci(deploy): add --pull to docker compose build for CVE pickup
Without --pull, the host's Docker layer cache wins: if a CVE drops in
node:20.19.0-alpine3.21 / postgres:16-alpine and the vendor re-publishes
the same tag, the runner keeps serving the cached layer until the cache
is manually cleared — a silent supply-chain blind spot.

Adding --pull to both `compose build` invocations costs a single
re-pull per run and lifts the base-image patch lag from "next host
prune" to "next nightly".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:10:59 +02:00
Marcel
7e430998b8 security(fail2ban): widen jail to /forgot-password and rate-limit 429
The filter only watched /api/auth/login 401 — leaving the forgot-password
endpoint open to:

  - email enumeration (slow brute-force probing which addresses exist)
  - password-reset brute-force against accounts whose addresses leak

Widens the failregex to /api/auth/(login|forgot-password) and adds 429 to
the status alternation so a future in-app rate-limiter response is also
caught by the jail (defense in depth).

CI assertions extended to cover both new dimensions plus a negative case
on an unrelated 401 endpoint (/api/documents) — pins that the widening
did not over-match.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:10:08 +02:00
Marcel
156afa14a2 test(ci): add compose bucket-bootstrap idempotency job
The create-buckets service in docker-compose.prod.yml runs on every
`docker compose up` (one-shot, restart=no). A re-deploy that fails
because the user/bucket/policy already exists would block the whole
nightly/release pipeline — and the only way to find out today is to
run a second deploy.

This job runs the bootstrap twice against a throwaway minio stack and
asserts both invocations exit 0. Caught at PR time, not at the third
nightly deploy at 02:00.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:08:51 +02:00
Marcel
91f70e652d security(minio): scope archiv-app to bucket-only IAM policy
Replaces MinIO's built-in `readwrite` policy (which grants s3:* on
arn:aws:s3:::* — every bucket present and future) with a bucket-scoped
custom policy `archiv-app-policy`:

  - s3:GetObject / s3:PutObject / s3:DeleteObject on familienarchiv/*
  - s3:ListBucket / s3:GetBucketLocation on familienarchiv

The previous configuration silently regressed the least-privilege guarantee
that the service-account separation was supposed to provide: a future
second bucket (logs, backups, mc-mirror staging) would have been
read/write/delete-accessible to a compromised backend.

While at it, two follow-on fixes:

  1. Extract the entrypoint to infra/minio/bootstrap.sh. The previous
     inline `/bin/sh -c "..."` was already at the YAML-escaping ceiling;
     adding the policy-JSON heredoc would have made it unreadable.

  2. Replace the `| grep -q readwrite || exit 1` fatal-check with a
     POSIX `case` substring match. The minio/mc image ships coreutils +
     bash but NOT grep/awk/sed — the original check was a no-op that
     ALWAYS exited 1 (verified locally). The new check passes on the
     first invocation and on every subsequent re-deploy.

Idempotency verified locally: two consecutive `docker compose run --rm
create-buckets` invocations both exit 0 with the user bound to the
new policy.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 13:07:56 +02:00