Compare commits

..

1 Commits

Author SHA1 Message Date
Marcel
b0aa3a6ffd docs(spec): reader dashboard design exploration and final spec (#447)
Some checks failed
CI / Backend Unit Tests (push) Has been cancelled
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Unit & Component Tests (pull_request) Failing after 4m29s
CI / OCR Service Tests (pull_request) Successful in 35s
CI / Backend Unit Tests (pull_request) Failing after 3m13s
Seven HTML mockups covering the full design process for the
permission-gated reader dashboard (issue #447): 3 initial concept
variants (A/B/C), 3 iterations of concept B (B.1 hell, B.2 warm,
B.3 navy), and the final merged spec combining B.1 layout with B.3
person cards. Final spec includes all 4 view variants: desktop light
READ_ALL, desktop light BLOG_WRITE, desktop dark, and mobile (3 phone
frames: light reader, light BLOG_WRITE, dark reader).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-06 11:36:04 +02:00
723 changed files with 12246 additions and 81476 deletions

View File

@@ -410,23 +410,6 @@ Never Kafka for teams under 10 or <100k events/day. Never gRPC inside a monolith
4. Identify missing database-layer enforcement (constraints, RLS) 4. Identify missing database-layer enforcement (constraints, RLS)
5. Check transport choices — simpler protocol available? 5. Check transport choices — simpler protocol available?
6. Propose a concrete simpler alternative, not just a critique 6. Propose a concrete simpler alternative, not just a critique
7. Verify documentation currency. For each category below, check whether the PR triggered the update. Flag missing updates as blockers.
| PR contains | Required doc update |
|---|---|
| New Flyway migration adding/removing/renaming a table or column | `docs/architecture/db/db-orm.puml` and `docs/architecture/db/db-relationships.puml`**except** framework-owned tables (e.g. Spring Session JDBC's `spring_session*`, Flyway's `flyway_schema_history`), which are opaque to app code; reference the relevant ADR if an exclusion is load-bearing |
| New `@ManyToMany` join table or FK | Both DB diagrams |
| New backend package or domain module | `CLAUDE.md` package table + matching `docs/architecture/c4/l3-backend-*.puml` |
| New controller or service in an existing backend domain | Matching `docs/architecture/c4/l3-backend-*.puml` |
| New SvelteKit route | `CLAUDE.md` route table + matching `docs/architecture/c4/l3-frontend-*.puml` |
| New Docker service or infrastructure component | `docs/architecture/c4/l2-containers.puml` + `docs/DEPLOYMENT.md` |
| New external system integrated | `docs/architecture/c4/l1-context.puml` |
| Auth or upload flow change | `docs/architecture/c4/seq-auth-flow.puml` or `docs/architecture/c4/seq-document-upload.puml` |
| New `ErrorCode` or `Permission` value | `CLAUDE.md` + `docs/ARCHITECTURE.md` |
| New domain concept or term | `docs/GLOSSARY.md` |
| Architectural decision with lasting consequences | New ADR in `docs/adr/` |
A doc omission is a blocker, not a concern — the PR does not merge until the diagram or text matches the code.
### Designing Systems ### Designing Systems
1. Start with the data model — get the schema right before application code 1. Start with the data model — get the schema right before application code

View File

@@ -980,24 +980,6 @@ Mark with `@pytest.mark.asyncio` so pytest runs the coroutine. Without it, the t
5. Refactor — apply clean code, extract if 3+ duplications, rename for intent 5. Refactor — apply clean code, extract if 3+ duplications, rename for intent
6. Repeat for the next behavior 6. Repeat for the next behavior
7. When all behaviors are green, review for SOLID violations across the full stack 7. When all behaviors are green, review for SOLID violations across the full stack
8. Update documentation before opening the PR. Use the table below to know which doc to touch.
| What changed in code | Doc(s) to update |
|---|---|
| New Flyway migration adds/removes/renames a table or column | `docs/architecture/db/db-orm.puml` (add/remove entity or attribute) **and** `docs/architecture/db/db-relationships.puml` (add/remove relationship line) — **except** framework-owned tables (e.g. Spring Session JDBC's `spring_session*`, Flyway's `flyway_schema_history`), which are opaque to app code; reference the relevant ADR if an exclusion is load-bearing |
| New `@ManyToMany` join table or FK relationship | Both DB diagrams above |
| New backend package / domain module | `CLAUDE.md` (package structure table) **and** the matching `docs/architecture/c4/l3-backend-*.puml` diagram for that domain |
| New Spring Boot controller or service in an existing domain | The matching `docs/architecture/c4/l3-backend-*.puml` for that domain |
| New SvelteKit route (`+page.svelte`) | `CLAUDE.md` (route structure section) **and** the matching `docs/architecture/c4/l3-frontend-*.puml` diagram |
| New Docker service / infrastructure component | `docs/architecture/c4/l2-containers.puml` **and** `docs/DEPLOYMENT.md` |
| New external system integrated (new API, new S3 bucket, etc.) | `docs/architecture/c4/l1-context.puml` |
| Auth flow or document-upload flow changes | `docs/architecture/c4/seq-auth-flow.puml` or `docs/architecture/c4/seq-document-upload.puml` |
| New `ErrorCode` enum value | `CLAUDE.md` error handling section **and** `CONTRIBUTING.md` |
| New `Permission` enum value | `CLAUDE.md` security section **and** `docs/ARCHITECTURE.md` |
| New domain term introduced (entity name, status, concept) | `docs/GLOSSARY.md` |
| Architectural decision with lasting consequences (new tech, new transport protocol, new pattern) | New ADR in `docs/adr/` |
Skip a doc only if the change genuinely does not affect what that doc describes.
### Reviewing Code ### Reviewing Code
1. TDD evidence — are there tests? Do they precede the implementation? 1. TDD evidence — are there tests? Do they precede the implementation?

View File

@@ -26,52 +26,6 @@ PORT_MAILPIT_SMTP=1025
# Generate with: python3 -c "import secrets; print(secrets.token_hex(32))" # Generate with: python3 -c "import secrets; print(secrets.token_hex(32))"
OCR_TRAINING_TOKEN=change-me-in-production OCR_TRAINING_TOKEN=change-me-in-production
# --- Observability ---
# Optional stack — start with: docker compose -f docker-compose.observability.yml up -d
# Requires the main stack to already be running (docker compose up -d creates archiv-net).
# In production the stack is managed from /opt/familienarchiv/ (see docs/DEPLOYMENT.md §4).
# Ports for host access
PORT_GRAFANA=3003
PORT_GLITCHTIP=3002
PORT_PROMETHEUS=9090
# Grafana admin password — change this before exposing Grafana beyond localhost
GRAFANA_ADMIN_PASSWORD=changeme
# Password for the read-only grafana_reader PostgreSQL role used by the PO
# Overview dashboard. Consumed by Flyway V68 (to set the role's password) and
# by Grafana's PostgreSQL datasource (to connect). REQUIRED in production —
# generate with: openssl rand -hex 32
GRAFANA_DB_PASSWORD=changeme-generate-with-openssl-rand-hex-32
# GlitchTip domain — production: use https://glitchtip.archiv.raddatz.cloud (must match Caddy vhost)
GLITCHTIP_DOMAIN=http://localhost:3002
# GlitchTip secret key — Django SECRET_KEY equivalent, used to sign sessions and tokens.
# REQUIRED in production — must not be empty or 'changeme'. Fail-closed: GlitchTip will
# refuse to start with an invalid key.
# Generate with: python3 -c "import secrets; print(secrets.token_hex(50))"
GLITCHTIP_SECRET_KEY=changeme-generate-a-real-secret
# PostgreSQL hostname for GlitchTip's db-init job and workers.
# Override when only the staging stack is running (container name differs from archive-db).
# Default (archive-db) is correct for production with the full stack up.
POSTGRES_HOST=archive-db
# $$ escaping note: passwords in /opt/familienarchiv/.env that contain a literal '$' must
# use '$$' so Docker Compose does not expand them as variable references.
# Example: a password 'p@$$word' should be written as 'p@$$$$word' in the .env file.
# Error reporting DSNs — leave empty to disable the SDK (safe default).
# SENTRY_DSN: backend (Spring Boot) — used by the GlitchTip/Sentry Java SDK
SENTRY_DSN=
SENTRY_TRACES_SAMPLE_RATE=
# VITE_SENTRY_DSN: frontend (SvelteKit) — injected at build time via Vite
VITE_SENTRY_DSN=
# Sentry/GlitchTip auth token for source map upload at build time (optional)
SENTRY_AUTH_TOKEN=
# Production SMTP — uncomment and fill in to send real emails instead of catching them # Production SMTP — uncomment and fill in to send real emails instead of catching them
# APP_BASE_URL=https://your-domain.example.com # APP_BASE_URL=https://your-domain.example.com
# MAIL_HOST=smtp.example.com # MAIL_HOST=smtp.example.com

View File

@@ -2,7 +2,6 @@ name: CI
on: on:
push: push:
branches: [main]
pull_request: pull_request:
jobs: jobs:
@@ -13,7 +12,7 @@ jobs:
name: Unit & Component Tests name: Unit & Component Tests
runs-on: ubuntu-latest runs-on: ubuntu-latest
container: container:
image: mcr.microsoft.com/playwright:v1.60.0-noble image: mcr.microsoft.com/playwright:v1.58.2-noble
steps: steps:
- uses: actions/checkout@v4 - uses: actions/checkout@v4
@@ -29,156 +28,27 @@ jobs:
run: npm ci run: npm ci
working-directory: frontend working-directory: frontend
- name: Security audit (no dev deps)
run: npm audit --audit-level=high --omit=dev
working-directory: frontend
- name: Compile Paraglide i18n - name: Compile Paraglide i18n
run: npx @inlang/paraglide-js compile --project ./project.inlang --outdir ./src/lib/paraglide run: npx @inlang/paraglide-js compile --project ./project.inlang --outdir ./src/lib/paraglide
working-directory: frontend working-directory: frontend
- name: Sync SvelteKit
run: npx svelte-kit sync
working-directory: frontend
- name: Lint - name: Lint
run: npm run lint run: npm run lint
working-directory: frontend working-directory: frontend
- name: Assert no banned vi.mock patterns - name: Run unit and component tests
shell: bash run: npm test
run: |
# Literal pdfjs-dist (libLoader pattern — ADR 012)
if grep -rF "vi.mock('pdfjs-dist'" frontend/src/; then
echo "FAIL: banned vi.mock('pdfjs-dist') pattern found — see ADR 012. Use the libLoader prop injection pattern instead."
exit 1
fi
# Async factory with dynamic import in body (named mechanism — ADR 012 / #553).
# Multiline PCRE matches `vi.mock(<arg>, async ... { ... await import(...) ... })`
# across line breaks. __meta__ is excluded because it contains fixture strings
# demonstrating the very pattern this check is meant to forbid.
if grep -rPzln 'vi\.mock\([^)]+,\s*async[^{]*\{[\s\S]*?await\s+import\s*\(' \
--include='*.spec.ts' --include='*.test.ts' \
--exclude-dir='__meta__' \
frontend/src/; then
echo "FAIL: banned async vi.mock factory with dynamic import in body — see ADR 012 / #553. Use a synchronous factory + vi.hoisted instead."
exit 1
fi
- name: Assert no raw document date rendered via {@html} (CWE-79 — #666)
shell: bash
run: |
# meta_date_raw is untrusted verbatim spreadsheet text — it must render via
# Svelte default escaping, never {@html}. This guard flags any {@html ...}
# whose expression references a raw-date variable. A comment mentioning
# "{@html}" without a raw token inside the braces does NOT match.
# The token list MUST cover every variable that carries the raw value:
# DocumentDate.svelte exposes it via the `raw` prop, so `\braw\b` is included.
# Grow this list whenever a new raw-bearing variable name is introduced.
pattern='\{@html[^}]*(metaDateRaw|documentDateRaw|rawDate|\braw\b)'
# Self-test: the regex must catch the dangerous forms and ignore the comment form.
printf '{@html doc.metaDateRaw}\n' | grep -qP "$pattern" \
|| { echo "FAIL: guard self-test — regex missed the unsafe {@html metaDateRaw} form"; exit 1; }
printf '{@html raw}\n' | grep -qP "$pattern" \
|| { echo "FAIL: guard self-test — regex missed the unsafe {@html raw} form (DocumentDate prop)"; exit 1; }
printf 'never use {@html} for this\n' | grep -qvP "$pattern" \
|| { echo "FAIL: guard self-test — regex wrongly flagged a {@html} comment"; exit 1; }
if grep -rPln "$pattern" --include='*.svelte' frontend/src/; then
echo "FAIL: meta_date_raw rendered via {@html} — use default {…} escaping (CWE-79, #666)."
exit 1
fi
- name: Assert no (upload|download)-artifact past v3
shell: bash
run: |
# Self-test: verify the regex catches v4+ and does not catch v3.
tmp=$(mktemp)
printf ' uses: actions/upload-artifact@v5\n' > "$tmp"
grep -qP '^\s+uses:\s+actions/(upload|download)-artifact@v[4-9]' "$tmp" \
|| { echo "FAIL: guard self-test — regex missed upload-artifact@v5"; rm "$tmp"; exit 1; }
printf ' uses: actions/upload-artifact@v3\n' > "$tmp"
grep -qvP '^\s+uses:\s+actions/(upload|download)-artifact@v[4-9]' "$tmp" \
|| { echo "FAIL: guard self-test — regex incorrectly flagged upload-artifact@v3"; rm "$tmp"; exit 1; }
rm "$tmp"
# Guard: Gitea Actions (act_runner) does not implement the v4 artifact protocol.
# Both upload-artifact and download-artifact share the same incompatibility.
# Pin to @v3. See ADR-014 / #557.
if grep -RPn '^\s+uses:\s+actions/(upload|download)-artifact@v[4-9]' .gitea/workflows/; then
echo "::error::actions/(upload|download)-artifact@v4+ is unsupported on Gitea Actions (act_runner). Pin to @v3. See ADR-014 / #557."
exit 1
fi
- name: Run unit and component tests with coverage
shell: bash
run: |
set -eo pipefail
npm run test:coverage 2>&1 | tee /tmp/coverage-test-${{ github.run_id }}.log
working-directory: frontend
env:
TZ: Europe/Berlin
# Diagnostic guard: covers the coverage run only. If `npm test` (above)
# exits 1 with a birpc error, the named pattern appears here — not there.
- name: Assert no birpc teardown race in coverage run
shell: bash
if: always()
run: |
if grep -qF "[birpc] rpc is closed" /tmp/coverage-test-${{ github.run_id }}.log 2>/dev/null; then
echo "FAIL: [birpc] rpc is closed teardown race detected in coverage run"
grep -F "[birpc] rpc is closed" /tmp/coverage-test-${{ github.run_id }}.log
exit 1
fi
# Gitea Actions (act_runner) does not implement upload-artifact v4 protocol — pinned per ADR-014. Do NOT upgrade. See #557.
- name: Upload coverage reports
if: always()
uses: actions/upload-artifact@v3
with:
name: coverage-reports
path: |
frontend/coverage/
/tmp/coverage-test-${{ github.run_id }}.log
- name: Build frontend
run: npm run build
working-directory: frontend working-directory: frontend
# ── Prerender output is exactly the public help page ───────────────────
# SvelteKit prerender + crawl follows nav links and bakes "redirect to
# /login" HTML for every protected route, served BEFORE runtime hooks
# (see #514). With `crawl: false` only the explicit entry should land
# in build/prerendered/. Anything else is a regression — fail the build.
- name: Assert prerender output is only /hilfe/transkription
run: |
cd frontend
set -e
extra=$(find build/prerendered -type f \
-not -path 'build/prerendered/hilfe/*' \
-not -name '*.br' -not -name '*.gz' \
|| true)
if [ -n "$extra" ]; then
echo "FAIL: unexpected prerendered files (would shadow runtime hooks):"
echo "$extra"
exit 1
fi
# And the help page must still be there.
test -f build/prerendered/hilfe/transkription.html \
|| { echo "FAIL: /hilfe/transkription.html missing from prerender output"; exit 1; }
echo "PASS: only /hilfe/transkription.html prerendered."
# Gitea Actions (act_runner) does not implement upload-artifact v4 protocol — pinned per ADR-014. Do NOT upgrade. See #557.
- name: Upload screenshots - name: Upload screenshots
if: always() if: always()
uses: actions/upload-artifact@v3 uses: actions/upload-artifact@v4
with: with:
name: unit-test-screenshots name: unit-test-screenshots
path: frontend/test-results/screenshots/ path: frontend/test-results/screenshots/
# ─── OCR Service Unit Tests ─────────────────────────────────────────────────── # ─── OCR Service Unit Tests ───────────────────────────────────────────────────
# Only stdlib/lightweight tests — no ML stack (PyTorch/Surya/Kraken) required. # Only spell_check.py, test_confidence.py, test_sender_registry.py — no ML stack required.
# test_tmpdir.py covers the TMPDIR env var and entrypoint mkdir behaviour (ADR-021).
# test_tmpdir_is_inside_persistent_cache_volume is skipped in CI (TMPDIR not
# set to /app/cache here); it runs inside the deployed Docker container.
ocr-tests: ocr-tests:
name: OCR Service Tests name: OCR Service Tests
runs-on: ubuntu-latest runs-on: ubuntu-latest
@@ -190,11 +60,11 @@ jobs:
python-version: '3.11' python-version: '3.11'
- name: Install test dependencies - name: Install test dependencies
run: pip install "pyspellchecker==0.9.0" "fastapi==0.115.6" pytest pytest-asyncio run: pip install "pyspellchecker==0.9.0" pytest pytest-asyncio
working-directory: ocr-service working-directory: ocr-service
- name: Run OCR unit tests (no ML stack required) - name: Run OCR unit tests (no ML stack required)
run: python -m pytest test_spell_check.py test_confidence.py test_sender_registry.py test_tmpdir.py -v run: python -m pytest test_spell_check.py test_confidence.py test_sender_registry.py -v
working-directory: ocr-service working-directory: ocr-service
# ─── Backend Unit & Slice Tests ─────────────────────────────────────────────── # ─── Backend Unit & Slice Tests ───────────────────────────────────────────────
@@ -204,8 +74,6 @@ jobs:
runs-on: ubuntu-latest runs-on: ubuntu-latest
env: env:
DOCKER_API_VERSION: "1.43" # NAS runner runs Docker 24.x (max API 1.43); Testcontainers 2.x defaults to 1.44 DOCKER_API_VERSION: "1.43" # NAS runner runs Docker 24.x (max API 1.43); Testcontainers 2.x defaults to 1.44
DOCKER_HOST: unix:///var/run/docker.sock
TESTCONTAINERS_RYUK_DISABLED: "true"
steps: steps:
- uses: actions/checkout@v4 - uses: actions/checkout@v4
@@ -224,155 +92,5 @@ jobs:
- name: Run backend tests - name: Run backend tests
run: | run: |
chmod +x mvnw chmod +x mvnw
./mvnw clean verify ./mvnw clean test
working-directory: backend working-directory: backend
- name: Upload surefire reports
if: always()
# Gitea Actions (act_runner) does not implement upload-artifact v4 protocol — pinned per ADR-014. Do NOT upgrade. See #557.
uses: actions/upload-artifact@v3
with:
name: surefire-reports
path: backend/target/surefire-reports/
# ─── fail2ban Regex Regression ────────────────────────────────────────────────
# The filter parses Caddy's JSON access log; a Caddy upgrade that reorders
# the JSON keys would silently break it (fail2ban-regex would return
# "0 matches", fail2ban would stop banning, no error surface). This job
# pins the contract against a deterministic sample line.
fail2ban-regex:
name: fail2ban Regex
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Install fail2ban
run: |
sudo apt-get update
sudo apt-get install -y fail2ban
- name: Matches /api/auth/login 401
run: |
echo '{"level":"info","ts":1700000000.12,"logger":"http.log.access","msg":"handled request","request":{"remote_ip":"203.0.113.42","method":"POST","host":"archiv.raddatz.cloud","uri":"/api/auth/login"},"status":401}' > /tmp/sample.log
out=$(fail2ban-regex /tmp/sample.log infra/fail2ban/filter.d/familienarchiv-auth.conf)
echo "$out"
echo "$out" | grep -qE '1 matched' \
|| { echo "expected 1 match for /api/auth/login 401"; exit 1; }
- name: Matches /api/auth/login 429
run: |
echo '{"level":"info","ts":1700000000.12,"logger":"http.log.access","msg":"handled request","request":{"remote_ip":"203.0.113.42","method":"POST","host":"archiv.raddatz.cloud","uri":"/api/auth/login"},"status":429}' > /tmp/sample.log
out=$(fail2ban-regex /tmp/sample.log infra/fail2ban/filter.d/familienarchiv-auth.conf)
echo "$out"
echo "$out" | grep -qE '1 matched' \
|| { echo "expected 1 match for /api/auth/login 429"; exit 1; }
- name: Matches /api/auth/forgot-password 401
run: |
echo '{"level":"info","ts":1700000000.12,"logger":"http.log.access","msg":"handled request","request":{"remote_ip":"203.0.113.42","method":"POST","host":"archiv.raddatz.cloud","uri":"/api/auth/forgot-password"},"status":401}' > /tmp/sample.log
out=$(fail2ban-regex /tmp/sample.log infra/fail2ban/filter.d/familienarchiv-auth.conf)
echo "$out"
echo "$out" | grep -qE '1 matched' \
|| { echo "expected 1 match for /api/auth/forgot-password 401"; exit 1; }
- name: Does not match /api/auth/login 200
run: |
echo '{"level":"info","ts":1700000000.12,"logger":"http.log.access","msg":"handled request","request":{"remote_ip":"203.0.113.42","method":"POST","host":"archiv.raddatz.cloud","uri":"/api/auth/login"},"status":200}' > /tmp/sample.log
out=$(fail2ban-regex /tmp/sample.log infra/fail2ban/filter.d/familienarchiv-auth.conf)
echo "$out"
echo "$out" | grep -qE '0 matched' \
|| { echo "expected 0 matches for /api/auth/login 200"; exit 1; }
- name: Does not match /api/documents (unrelated 401)
run: |
echo '{"level":"info","ts":1700000000.12,"logger":"http.log.access","msg":"handled request","request":{"remote_ip":"203.0.113.42","method":"GET","host":"archiv.raddatz.cloud","uri":"/api/documents"},"status":401}' > /tmp/sample.log
out=$(fail2ban-regex /tmp/sample.log infra/fail2ban/filter.d/familienarchiv-auth.conf)
echo "$out"
echo "$out" | grep -qE '0 matched' \
|| { echo "expected 0 matches for /api/documents 401"; exit 1; }
# ── Backend resolves to file-polling, not systemd ─────────────────────
# The Debian/Ubuntu fail2ban package ships defaults-debian.conf with
# `[DEFAULT] backend = systemd`. Without `backend = polling` in our
# jail, the daemon loads the jail but reads from journald and never
# touches /var/log/caddy/access.log — i.e. the regex above passes in
# isolation while the live jail is inert. See issue #503.
- name: Jail resolves with polling backend (not inherited systemd)
run: |
sudo ln -sfn "$PWD/infra/fail2ban/jail.d/familienarchiv.conf" /etc/fail2ban/jail.d/familienarchiv.conf
sudo ln -sfn "$PWD/infra/fail2ban/filter.d/familienarchiv-auth.conf" /etc/fail2ban/filter.d/familienarchiv-auth.conf
dump=$(sudo fail2ban-client -d 2>&1)
echo "$dump" | grep -E "add.*familienarchiv-auth" || true
echo "$dump" | grep -qE "\['add', 'familienarchiv-auth', 'polling'\]" \
|| { echo "FAIL: familienarchiv-auth jail did not resolve to 'polling' backend"; exit 1; }
# ─── Semgrep Security Scan ───────────────────────────────────────────────────
# Catches XXE-unprotected XML parser factories and similar patterns defined in
# .semgrep/security.yml. Runs in parallel with backend-unit-tests for fast feedback.
# Uses local rules only (no SEMGREP_APP_TOKEN / OIDC — act_runner does not support it).
semgrep-scan:
name: Semgrep Security Scan
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.11'
cache: 'pip'
- name: Install Semgrep
run: pip install semgrep==1.163.0
- name: Run security rules
run: semgrep --config .semgrep/security.yml --error --metrics=off backend/src/
# ─── Compose Bucket-Bootstrap Idempotency ─────────────────────────────────────
# docker-compose.prod.yml's create-buckets service runs on every
# `docker compose up` (one-shot, no restart). Must be idempotent — a
# re-deploy must not fail just because the bucket / user / policy
# already exists. Validated by running create-buckets twice against a
# throwaway minio stack and asserting both invocations exit 0.
compose-idempotency:
name: Compose Bucket Idempotency
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Write stub env file
run: |
cat > .env.test <<'EOF'
TAG=test
PORT_BACKEND=18080
PORT_FRONTEND=13000
APP_DOMAIN=localhost
POSTGRES_PASSWORD=stub
MINIO_PASSWORD=stubrootpassword
MINIO_APP_PASSWORD=stubapppassword
OCR_TRAINING_TOKEN=stub
APP_ADMIN_USERNAME=admin@local
APP_ADMIN_PASSWORD=stub
MAIL_HOST=mailpit
MAIL_PORT=1025
APP_MAIL_FROM=noreply@local
IMPORT_HOST_DIR=/tmp/dummy-import
COMPOSE_NETWORK_NAME=test-idem-archiv-net
EOF
- name: Bring up minio
run: |
docker compose -f docker-compose.prod.yml -p test-idem --env-file .env.test up -d --wait minio
- name: First create-buckets run
run: |
docker compose -f docker-compose.prod.yml -p test-idem --env-file .env.test run --rm create-buckets
- name: Second create-buckets run (idempotency check)
run: |
docker compose -f docker-compose.prod.yml -p test-idem --env-file .env.test run --rm create-buckets
- name: Teardown
if: always()
run: |
docker compose -f docker-compose.prod.yml -p test-idem --env-file .env.test down -v
rm -f .env.test

View File

@@ -1,65 +0,0 @@
name: Coverage Flake Probe
# Manually-triggered probe for the birpc teardown race documented in ADR 012
# / #553. Runs the full coverage suite 20× in parallel against a single SHA
# and asserts zero `[birpc] rpc is closed` lines across every cell. Verifies
# the acceptance criterion that the race no longer surfaces under coverage.
on:
workflow_dispatch:
jobs:
coverage-flake-probe:
name: Coverage flake probe (run ${{ matrix.run }})
runs-on: ubuntu-latest
container:
image: mcr.microsoft.com/playwright:v1.58.2-noble
strategy:
fail-fast: false
matrix:
run: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
steps:
- uses: actions/checkout@v4
- name: Cache node_modules
id: node-modules-cache
uses: actions/cache@v4
with:
path: frontend/node_modules
key: node-modules-${{ hashFiles('frontend/package-lock.json') }}
- name: Install dependencies
if: steps.node-modules-cache.outputs.cache-hit != 'true'
run: npm ci
working-directory: frontend
- name: Compile Paraglide i18n
run: npx @inlang/paraglide-js compile --project ./project.inlang --outdir ./src/lib/paraglide
working-directory: frontend
- name: Run unit and component tests with coverage
shell: bash
run: |
set -eo pipefail
npm run test:coverage 2>&1 | tee /tmp/coverage-test-${{ github.run_id }}-${{ matrix.run }}.log
working-directory: frontend
env:
TZ: Europe/Berlin
- name: Assert no birpc teardown race
shell: bash
if: always()
run: |
if grep -qF "[birpc] rpc is closed" /tmp/coverage-test-${{ github.run_id }}-${{ matrix.run }}.log 2>/dev/null; then
echo "FAIL: [birpc] rpc is closed teardown race detected in run ${{ matrix.run }}"
grep -F "[birpc] rpc is closed" /tmp/coverage-test-${{ github.run_id }}-${{ matrix.run }}.log
exit 1
fi
# Gitea Actions (act_runner) does not implement upload-artifact v4 protocol — pinned per ADR-014. Do NOT upgrade. See #557.
- name: Upload coverage log on failure
if: failure()
uses: actions/upload-artifact@v3
with:
name: coverage-log-run-${{ matrix.run }}
path: /tmp/coverage-test-${{ github.run_id }}-${{ matrix.run }}.log

View File

@@ -1,284 +0,0 @@
name: nightly
# Builds and deploys the staging environment from main every night.
# Runs on the self-hosted runner using Docker-out-of-Docker (the docker
# socket is mounted in), so `docker compose build` produces images on
# the host daemon and `docker compose up` consumes them directly — no
# registry hop.
#
# Operational assumptions (see docs/DEPLOYMENT.md §3 for the full setup):
#
# 1. Single-tenant self-hosted runner. The "Write staging env file" step
# writes every secret to .env.staging on the runner filesystem; the
# `if: always()` cleanup step removes it. A multi-tenant runner
# would need to switch to docker compose --env-file <(stdin) instead.
#
# 2. Host docker layer cache is authoritative. There is no
# actions/cache; we rely on the host daemon to keep Maven and npm
# layers warm between runs. A `docker system prune` on the host
# will cause the next nightly build to be cold (510 min slower).
#
# Staging environment isolation:
# - project name: archiv-staging
# - host ports: backend 8081, frontend 3001
# - profile: staging (starts mailpit instead of a real SMTP relay)
#
# Required Gitea secrets:
# STAGING_POSTGRES_PASSWORD
# STAGING_MINIO_PASSWORD
# STAGING_MINIO_APP_PASSWORD
# STAGING_OCR_TRAINING_TOKEN
# STAGING_APP_ADMIN_USERNAME
# STAGING_APP_ADMIN_PASSWORD
# GRAFANA_ADMIN_PASSWORD
# GRAFANA_DB_PASSWORD (read-only grafana_reader DB role, issue #651)
# GLITCHTIP_SECRET_KEY
# SENTRY_DSN (set after GlitchTip first-run; empty = Sentry disabled)
on:
schedule:
- cron: "0 2 * * *"
workflow_dispatch:
env:
# Ensures the backend Dockerfile's `RUN --mount=type=cache` lines are
# honoured (Maven cache survives between runs).
DOCKER_BUILDKIT: "1"
jobs:
deploy-staging:
# `ubuntu-latest` matches our self-hosted runner's advertised label
# (the runner has labels: ubuntu-latest / ubuntu-24.04 / ubuntu-22.04).
# `self-hosted` would never match — no runner advertises it — so the
# job parks in the queue forever. ADR-011's "single-tenant" promise
# is at the repo level; sharing this runner between CI and deploys
# for the same repo is within that boundary.
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Write staging env file
run: |
cat > .env.staging <<EOF
TAG=nightly
PORT_BACKEND=8081
PORT_FRONTEND=3001
APP_DOMAIN=staging.raddatz.cloud
POSTGRES_PASSWORD=${{ secrets.STAGING_POSTGRES_PASSWORD }}
MINIO_PASSWORD=${{ secrets.STAGING_MINIO_PASSWORD }}
MINIO_APP_PASSWORD=${{ secrets.STAGING_MINIO_APP_PASSWORD }}
OCR_TRAINING_TOKEN=${{ secrets.STAGING_OCR_TRAINING_TOKEN }}
APP_ADMIN_USERNAME=${{ secrets.STAGING_APP_ADMIN_USERNAME }}
APP_ADMIN_PASSWORD=${{ secrets.STAGING_APP_ADMIN_PASSWORD }}
MAIL_HOST=mailpit
MAIL_PORT=1025
MAIL_USERNAME=
MAIL_PASSWORD=
MAIL_SMTP_AUTH=false
MAIL_STARTTLS_ENABLE=false
APP_MAIL_FROM=noreply@staging.raddatz.cloud
IMPORT_HOST_DIR=/srv/familienarchiv-staging/import
POSTGRES_USER=archiv
SENTRY_DSN=${{ secrets.SENTRY_DSN }}
VITE_SENTRY_DSN=${{ secrets.VITE_SENTRY_DSN }}
GRAFANA_DB_PASSWORD=${{ secrets.GRAFANA_DB_PASSWORD }}
EOF
- name: Verify backend /import:ro mount is wired
# Regression guard for #526: the /admin/system mass-import card
# only works when the backend service mounts the host import
# payload at /import (read-only). If a future "compose cleanup"
# PR drops the volumes block, mass import silently breaks again.
# `compose config` renders both shorthand and longform mounts as
# `target: /import` + `read_only: true`, so we assert against
# the rendered form rather than the raw source YAML.
run: |
set -e
docker compose \
-f docker-compose.prod.yml \
-p archiv-staging \
--env-file .env.staging \
--profile staging \
config > /tmp/compose-rendered.yml
grep -q '^[[:space:]]*target: /import$' /tmp/compose-rendered.yml \
|| { echo "::error::backend is missing the /import bind mount (see #526)"; exit 1; }
grep -A2 '^[[:space:]]*target: /import$' /tmp/compose-rendered.yml \
| grep -q 'read_only: true' \
|| { echo "::error::backend /import mount is not read-only (see #526)"; exit 1; }
- name: Build images
# `--pull` forces re-fetching pinned base images so a CVE
# re-publication of the same tag (e.g. node:20.19.0-alpine3.21,
# postgres:16-alpine) is picked up instead of being served
# from the host's stale Docker layer cache.
run: |
docker compose \
-f docker-compose.prod.yml \
-p archiv-staging \
--env-file .env.staging \
--profile staging \
build --pull
- name: Deploy staging
run: |
docker compose \
-f docker-compose.prod.yml \
-p archiv-staging \
--env-file .env.staging \
--profile staging \
up -d --wait --remove-orphans
- name: Deploy observability configs
# Copies the compose file and config tree from the workspace checkout
# into /opt/familienarchiv/ — the permanent location that persists
# between CI runs. Containers started in the next step bind-mount
# from there, so a future workspace wipe cannot corrupt a running
# config file.
#
# obs-secrets.env is written fresh from Gitea secrets on every run so
# Gitea is always the single source of truth for secret rotation.
# Non-secret config lives in infra/observability/obs.env (tracked in git).
run: |
rm -rf /opt/familienarchiv/infra/observability
mkdir -p /opt/familienarchiv/infra/observability
cp -r infra/observability/. /opt/familienarchiv/infra/observability/
cp docker-compose.observability.yml /opt/familienarchiv/
cat > /opt/familienarchiv/obs-secrets.env <<'EOF'
GRAFANA_ADMIN_PASSWORD=${{ secrets.GRAFANA_ADMIN_PASSWORD }}
GRAFANA_DB_PASSWORD=${{ secrets.GRAFANA_DB_PASSWORD }}
GLITCHTIP_SECRET_KEY=${{ secrets.GLITCHTIP_SECRET_KEY }}
POSTGRES_PASSWORD=${{ secrets.STAGING_POSTGRES_PASSWORD }}
POSTGRES_HOST=archiv-staging-db-1
EOF
# Note: POSTGRES_HOST is derived from the Compose project name (archiv-staging)
# and service name (db). A project rename requires updating this value.
chmod 600 /opt/familienarchiv/obs-secrets.env
- name: Validate observability compose config
# Dry-run: resolves all variable substitutions and reports any missing
# required keys before containers start. Catches undefined variables and
# YAML errors in config files updated by the previous step.
# --env-file order: obs.env first (git-tracked defaults), obs-secrets.env
# second (CI-written secrets). Later files win on duplicate keys, so
# obs-secrets.env overrides POSTGRES_HOST set in obs.env.
run: |
docker compose \
-f /opt/familienarchiv/docker-compose.observability.yml \
--env-file /opt/familienarchiv/infra/observability/obs.env \
--env-file /opt/familienarchiv/obs-secrets.env \
config --quiet
- name: Start observability stack
# Runs with absolute paths so bind mounts resolve to stable host paths
# that survive workspace wipes between nightly runs (see ADR-016).
# Non-secret config from obs.env (git-tracked); secrets from obs-secrets.env
# (written fresh from Gitea secrets above). --env-file order: obs.env first,
# obs-secrets.env second — later file wins on duplicate keys.
run: |
docker compose \
-f /opt/familienarchiv/docker-compose.observability.yml \
--env-file /opt/familienarchiv/infra/observability/obs.env \
--env-file /opt/familienarchiv/obs-secrets.env \
up -d --wait --remove-orphans
- name: Assert observability stack health
# docker compose up --wait covers services WITH healthcheck directives only.
# obs-promtail, obs-cadvisor, obs-node-exporter, and obs-glitchtip-worker have
# no healthcheck — they are considered "started" as soon as the process runs.
# This step explicitly asserts the five healthchecked critical services are
# healthy before the smoke test proceeds.
run: |
set -e
unhealthy=""
for svc in obs-loki obs-prometheus obs-grafana obs-tempo obs-glitchtip; do
status=$(docker inspect "$svc" --format '{{.State.Health.Status}}' 2>/dev/null || echo "missing")
if [ "$status" != "healthy" ]; then
echo "::error::$svc is not healthy (status: $status)"
unhealthy="$unhealthy $svc"
fi
done
[ -z "$unhealthy" ] || exit 1
echo "All critical observability services are healthy"
- name: Reload Caddy
# Apply any committed Caddyfile changes before smoke-testing the
# public surface. Without this step, a Caddyfile edit lands in the
# repo but Caddy keeps serving the previous config until someone
# reloads it manually — the smoke test would then catch a stale
# header or a still-proxied /actuator route rather than confirming
# the current config is live.
#
# The runner executes job steps inside Docker containers (DooD).
# `systemctl` is not present in container images and cannot reach
# the host's systemd directly. We use the Docker socket (mounted
# into every job container via runner-config.yaml) to spin up a
# privileged sibling container in the host PID namespace; nsenter
# then enters the host's namespaces so systemctl talks to the real
# host systemd daemon. No sudoers entry is required — the Docker
# socket already grants root-equivalent host access.
#
# Alpine is used: ~5 MB vs ~70 MB for ubuntu, no unnecessary
# tooling, and the digest is pinned so any upstream change requires
# an explicit bump PR. util-linux (which ships nsenter) is installed
# at run time; apk add takes ~1 s on the warm VPS cache.
#
# `reload` not `restart`: reload sends SIGHUP so Caddy re-reads its
# config in-process without dropping TLS connections. `restart`
# would briefly stop the service, losing in-flight requests.
#
# If Caddy is not running this step fails fast before the smoke test
# issues a misleading "port 443 refused" error.
run: |
docker run --rm --privileged --pid=host \
alpine:3.21@sha256:48b0309ca019d89d40f670aa1bc06e426dc0931948452e8491e3d65087abc07d \
sh -c 'apk add --no-cache util-linux -q && nsenter -t 1 -m -u -n -p -i -- /bin/systemctl reload caddy'
- name: Smoke test deployed environment
# Healthchecks confirm containers are healthy; they do NOT confirm the
# public surface works. This step catches: Caddy not reloaded, HSTS
# header dropped, /actuator block bypassed.
#
# --resolve pins staging.raddatz.cloud to the Docker bridge gateway IP
# (the host) so we do NOT depend on hairpin NAT on the host router.
# 127.0.0.1 cannot be used: job containers run in bridge network mode
# (runner-config.yaml), so 127.0.0.1 is the container's loopback, not
# the host's. The bridge gateway IS the host; Caddy binds 0.0.0.0:443
# and is therefore reachable from the container via that IP.
# SNI still uses the public hostname so the TLS cert validates correctly.
#
# Gateway detection reads /proc/net/route (always present, no package
# required) instead of `ip route` to avoid a dependency on iproute2.
# Field $2=="00000000" is the default route; field $3 is the gateway as
# a little-endian 32-bit hex value which awk decodes to dotted-decimal.
run: |
set -e
HOST="staging.raddatz.cloud"
URL="https://$HOST"
HOST_IP=$(awk 'NR>1 && $2=="00000000"{h=$3;printf "%d.%d.%d.%d\n",strtonum("0x"substr(h,7,2)),strtonum("0x"substr(h,5,2)),strtonum("0x"substr(h,3,2)),strtonum("0x"substr(h,1,2));exit}' /proc/net/route)
[ -n "$HOST_IP" ] || { echo "ERROR: could not detect Docker bridge gateway via /proc/net/route"; exit 1; }
RESOLVE=(--resolve "$HOST:443:$HOST_IP")
echo "Smoke test: $URL (pinned to $HOST_IP via bridge gateway)"
curl -fsS "${RESOLVE[@]}" --max-time 10 "$URL/login" -o /dev/null
# Pin the preload-list-eligible HSTS value, not just header presence:
# a degraded `max-age=1` or a dropped `includeSubDomains; preload` must
# fail this check rather than pass it silently.
curl -fsS "${RESOLVE[@]}" --max-time 10 -I "$URL/" \
| grep -Eqi 'strict-transport-security:[[:space:]]*max-age=31536000.*includeSubDomains.*preload'
# Permissions-Policy denies APIs the app does not use (camera,
# microphone, geolocation). A regression that loosens or drops the
# header now fails the smoke step.
curl -fsS "${RESOLVE[@]}" --max-time 10 -I "$URL/" \
| grep -Eqi 'permissions-policy:[[:space:]]*camera=\(\),[[:space:]]*microphone=\(\),[[:space:]]*geolocation=\(\)'
status=$(curl -s "${RESOLVE[@]}" -o /dev/null -w "%{http_code}" --max-time 10 "$URL/actuator/health")
[ "$status" = "404" ] || { echo "expected 404 from /actuator/health, got $status"; exit 1; }
echo "All smoke checks passed"
- name: Cleanup env file
# LOAD-BEARING: `if: always()` is the linchpin of the ADR-011
# single-tenant runner trust model. Every secret in .env.staging
# is plain text on the runner filesystem until this step runs.
# If a future refactor drops `if: always()`, a failed deploy
# leaves the env-file behind. Do not remove this conditional
# without first re-evaluating ADR-011.
if: always()
run: rm -f .env.staging

View File

@@ -1,223 +0,0 @@
name: release
# Builds and deploys the production environment on `v*` tag push.
# Runs on the self-hosted runner via Docker-out-of-Docker; images are
# tagged with the actual git tag (e.g. v1.0.0) so rollback is
# `TAG=<previous> docker compose -f docker-compose.prod.yml -p archiv-production up -d --wait`
#
# Operational assumptions (see docs/DEPLOYMENT.md §3 for the full setup):
#
# 1. Single-tenant self-hosted runner. The "Write production env file"
# step writes every secret to .env.production on the runner
# filesystem; the `if: always()` cleanup step removes it. A
# multi-tenant runner would need to switch to
# `docker compose --env-file <(stdin)` instead.
#
# 2. Host docker layer cache is authoritative. There is no
# actions/cache; we rely on the host daemon to keep Maven and npm
# layers warm between runs. A `docker system prune` on the host
# will cause the next release build to be cold (510 min slower).
#
# Production environment:
# - project name: archiv-production
# - host ports: backend 8080, frontend 3000
# - profile: (none) — mailpit is excluded; real SMTP relay is used
#
# Required Gitea secrets:
# PROD_POSTGRES_PASSWORD
# PROD_MINIO_PASSWORD
# PROD_MINIO_APP_PASSWORD
# PROD_OCR_TRAINING_TOKEN
# PROD_APP_ADMIN_USERNAME (CRITICAL: see docs/DEPLOYMENT.md)
# PROD_APP_ADMIN_PASSWORD (CRITICAL: locked in on first deploy)
# MAIL_HOST
# MAIL_PORT
# MAIL_USERNAME
# MAIL_PASSWORD
# GRAFANA_ADMIN_PASSWORD
# GRAFANA_DB_PASSWORD (read-only grafana_reader DB role, issue #651)
# GLITCHTIP_SECRET_KEY
# SENTRY_DSN (set after GlitchTip first-run; empty = Sentry disabled)
on:
push:
tags:
- "v*"
env:
DOCKER_BUILDKIT: "1"
jobs:
deploy-production:
# See nightly.yml — same rationale: `ubuntu-latest` matches the
# advertised label of our single-tenant self-hosted runner.
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Write production env file
run: |
cat > .env.production <<EOF
TAG=${{ gitea.ref_name }}
PORT_BACKEND=8080
PORT_FRONTEND=3000
APP_DOMAIN=archiv.raddatz.cloud
POSTGRES_PASSWORD=${{ secrets.PROD_POSTGRES_PASSWORD }}
MINIO_PASSWORD=${{ secrets.PROD_MINIO_PASSWORD }}
MINIO_APP_PASSWORD=${{ secrets.PROD_MINIO_APP_PASSWORD }}
OCR_TRAINING_TOKEN=${{ secrets.PROD_OCR_TRAINING_TOKEN }}
APP_ADMIN_USERNAME=${{ secrets.PROD_APP_ADMIN_USERNAME }}
APP_ADMIN_PASSWORD=${{ secrets.PROD_APP_ADMIN_PASSWORD }}
MAIL_HOST=${{ secrets.MAIL_HOST }}
MAIL_PORT=${{ secrets.MAIL_PORT }}
MAIL_USERNAME=${{ secrets.MAIL_USERNAME }}
MAIL_PASSWORD=${{ secrets.MAIL_PASSWORD }}
MAIL_SMTP_AUTH=true
MAIL_STARTTLS_ENABLE=true
APP_MAIL_FROM=noreply@raddatz.cloud
IMPORT_HOST_DIR=/srv/familienarchiv-production/import
POSTGRES_USER=archiv
SENTRY_DSN=${{ secrets.SENTRY_DSN }}
GRAFANA_DB_PASSWORD=${{ secrets.GRAFANA_DB_PASSWORD }}
EOF
- name: Build images
# `--pull` forces re-fetching pinned base images so a CVE
# re-publication of the same tag is picked up rather than served
# from the host's stale Docker layer cache.
run: |
docker compose \
-f docker-compose.prod.yml \
-p archiv-production \
--env-file .env.production \
build --pull
- name: Deploy production
run: |
docker compose \
-f docker-compose.prod.yml \
-p archiv-production \
--env-file .env.production \
up -d --wait --remove-orphans
- name: Deploy observability configs
# Mirrors the nightly approach: copies obs compose file and config tree
# to /opt/familienarchiv/ (permanent path, survives workspace wipes — ADR-016),
# then writes obs-secrets.env fresh from Gitea secrets.
# Non-secret config lives in infra/observability/obs.env (tracked in git).
run: |
rm -rf /opt/familienarchiv/infra/observability
mkdir -p /opt/familienarchiv/infra/observability
cp -r infra/observability/. /opt/familienarchiv/infra/observability/
cp docker-compose.observability.yml /opt/familienarchiv/
cat > /opt/familienarchiv/obs-secrets.env <<'EOF'
GRAFANA_ADMIN_PASSWORD=${{ secrets.GRAFANA_ADMIN_PASSWORD }}
GRAFANA_DB_PASSWORD=${{ secrets.GRAFANA_DB_PASSWORD }}
GLITCHTIP_SECRET_KEY=${{ secrets.GLITCHTIP_SECRET_KEY }}
POSTGRES_PASSWORD=${{ secrets.PROD_POSTGRES_PASSWORD }}
POSTGRES_HOST=archiv-production-db-1
EOF
# Note: POSTGRES_HOST is derived from the Compose project name (archiv-production)
# and service name (db). A project rename requires updating this value.
chmod 600 /opt/familienarchiv/obs-secrets.env
- name: Validate observability compose config
# Dry-run: resolves all variable substitutions and reports any missing
# required keys before containers start. Catches undefined variables and
# YAML errors in config files updated by the previous step.
# --env-file order: obs.env first (git-tracked defaults), obs-secrets.env
# second (CI-written secrets). Later files win on duplicate keys, so
# obs-secrets.env overrides POSTGRES_HOST set in obs.env.
# Keep in sync with the equivalent step in nightly.yml (#603).
run: |
docker compose \
-f /opt/familienarchiv/docker-compose.observability.yml \
--env-file /opt/familienarchiv/infra/observability/obs.env \
--env-file /opt/familienarchiv/obs-secrets.env \
config --quiet
- name: Start observability stack
# Runs with absolute paths so bind mounts resolve to stable host paths
# that survive workspace wipes between runs (see ADR-016).
# Non-secret config from obs.env (git-tracked); secrets from obs-secrets.env
# (written fresh from Gitea secrets above). --env-file order: obs.env first,
# obs-secrets.env second — later file wins on duplicate keys.
# Keep in sync with the equivalent step in nightly.yml (#603).
run: |
docker compose \
-f /opt/familienarchiv/docker-compose.observability.yml \
--env-file /opt/familienarchiv/infra/observability/obs.env \
--env-file /opt/familienarchiv/obs-secrets.env \
up -d --wait --remove-orphans
- name: Assert observability stack health
# docker compose up --wait covers services WITH healthcheck directives only.
# obs-promtail, obs-cadvisor, obs-node-exporter, and obs-glitchtip-worker have
# no healthcheck — they are considered "started" as soon as the process runs.
# This step explicitly asserts the five healthchecked critical services are
# healthy before the smoke test proceeds.
# Keep in sync with the equivalent step in nightly.yml (#603).
run: |
set -e
unhealthy=""
for svc in obs-loki obs-prometheus obs-grafana obs-tempo obs-glitchtip; do
status=$(docker inspect "$svc" --format '{{.State.Health.Status}}' 2>/dev/null || echo "missing")
if [ "$status" != "healthy" ]; then
echo "::error::$svc is not healthy (status: $status)"
unhealthy="$unhealthy $svc"
fi
done
[ -z "$unhealthy" ] || exit 1
echo "All critical observability services are healthy"
- name: Reload Caddy
# See nightly.yml — same rationale and mechanism: DooD job containers
# cannot call systemctl directly; nsenter via a privileged sibling
# container reaches the host systemd. Must run after deploy (so the
# latest Caddyfile is on disk) and before the smoke test (so the
# public surface reflects the current config). Alpine with pinned
# digest; reload not restart — see nightly.yml for full rationale.
run: |
docker run --rm --privileged --pid=host \
alpine:3.21@sha256:48b0309ca019d89d40f670aa1bc06e426dc0931948452e8491e3d65087abc07d \
sh -c 'apk add --no-cache util-linux -q && nsenter -t 1 -m -u -n -p -i -- /bin/systemctl reload caddy'
- name: Smoke test deployed environment
# See nightly.yml — same three checks, against the prod vhost.
# --resolve stored as a Bash array so "${RESOLVE[@]}" expands to two
# separate arguments; a quoted string would pass the flag and its value
# as one token and curl would reject it as an unknown option.
# Gateway detection via /proc/net/route — no iproute2 dependency.
# See nightly.yml for the full network topology explanation.
run: |
set -e
HOST="archiv.raddatz.cloud"
URL="https://$HOST"
HOST_IP=$(awk 'NR>1 && $2=="00000000"{h=$3;printf "%d.%d.%d.%d\n",strtonum("0x"substr(h,7,2)),strtonum("0x"substr(h,5,2)),strtonum("0x"substr(h,3,2)),strtonum("0x"substr(h,1,2));exit}' /proc/net/route)
[ -n "$HOST_IP" ] || { echo "ERROR: could not detect Docker bridge gateway via /proc/net/route"; exit 1; }
RESOLVE=(--resolve "$HOST:443:$HOST_IP")
echo "Smoke test: $URL (pinned to $HOST_IP via bridge gateway)"
curl -fsS "${RESOLVE[@]}" --max-time 10 "$URL/login" -o /dev/null
# Pin the preload-list-eligible HSTS value, not just header presence:
# a degraded `max-age=1` or a dropped `includeSubDomains; preload` must
# fail this check rather than pass it silently.
curl -fsS "${RESOLVE[@]}" --max-time 10 -I "$URL/" \
| grep -Eqi 'strict-transport-security:[[:space:]]*max-age=31536000.*includeSubDomains.*preload'
# Permissions-Policy denies APIs the app does not use (camera,
# microphone, geolocation). A regression that loosens or drops the
# header now fails the smoke step.
curl -fsS "${RESOLVE[@]}" --max-time 10 -I "$URL/" \
| grep -Eqi 'permissions-policy:[[:space:]]*camera=\(\),[[:space:]]*microphone=\(\),[[:space:]]*geolocation=\(\)'
status=$(curl -s "${RESOLVE[@]}" -o /dev/null -w "%{http_code}" --max-time 10 "$URL/actuator/health")
[ "$status" = "404" ] || { echo "expected 404 from /actuator/health, got $status"; exit 1; }
echo "All smoke checks passed"
- name: Cleanup env file
# LOAD-BEARING: `if: always()` is the linchpin of the ADR-011
# single-tenant runner trust model. Every secret in
# .env.production is plain text on the runner filesystem until
# this step runs. If a future refactor drops `if: always()`, a
# failed deploy leaves the env-file behind. Do not remove this
# conditional without first re-evaluating ADR-011.
if: always()
run: rm -f .env.production

10
.gitignore vendored
View File

@@ -18,15 +18,5 @@ scripts/large-data.sql
.claude/worktrees/ .claude/worktrees/
.claude/scheduled_tasks.lock .claude/scheduled_tasks.lock
# Run artifacts from verification tooling
proofshot-artifacts/
# Root-level Node.js tooling artifacts
node_modules/
# Repo uses npm; yarn.lock is ignored to avoid double-lockfile drift. # Repo uses npm; yarn.lock is ignored to avoid double-lockfile drift.
frontend/yarn.lock frontend/yarn.lock
**/.venv/
**/__pycache__/
*.pyc

View File

@@ -1,54 +0,0 @@
# Semgrep security rules for Familienarchiv backend.
# These rules catch the absence of XXE protection on XML parser factories.
# CWE-611: Improper Restriction of XML External Entity Reference.
# Run: semgrep --config .semgrep/security.yml --error backend/src/
rules:
# DocumentBuilderFactory without XXE hardening.
# All call sites must call setFeature("…disallow-doctype-decl", true) before use.
- id: dbf-xxe-default
patterns:
- pattern: $X = DocumentBuilderFactory.newInstance();
- pattern-not-inside: |
...
$X.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
...
message: >
DocumentBuilderFactory without XXE protection (CWE-611).
Call XxeSafeXmlParser.hardenedFactory() instead of DocumentBuilderFactory.newInstance().
See: https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html
languages: [java]
severity: ERROR
# SAXParserFactory without XXE hardening.
- id: sax-xxe-default
patterns:
- pattern: $X = SAXParserFactory.newInstance();
- pattern-not-inside: |
...
$X.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
...
message: >
SAXParserFactory without XXE protection (CWE-611).
Set disallow-doctype-decl=true, external-general-entities=false, external-parameter-entities=false,
and load-external-dtd=false before use. Follow the pattern in XxeSafeXmlParser.hardenedFactory().
See: https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html
languages: [java]
severity: ERROR
# XMLInputFactory without XXE hardening (StAX parser).
- id: stax-xxe-default
patterns:
- pattern: $X = XMLInputFactory.newInstance();
- pattern-not-inside: |
...
$X.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
...
message: >
XMLInputFactory without XXE protection (CWE-611).
Set IS_SUPPORTING_EXTERNAL_ENTITIES=false and SUPPORT_DTD=false before use.
Follow the pattern in XxeSafeXmlParser.hardenedFactory().
See: https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html
languages: [java]
severity: ERROR

View File

@@ -1,6 +1,4 @@
{ {
"java.configuration.updateBuildConfiguration": "interactive", "java.configuration.updateBuildConfiguration": "interactive",
"java.compile.nullAnalysis.mode": "automatic", "java.compile.nullAnalysis.mode": "automatic"
"plantuml.render": "PlantUMLServer",
"plantuml.server": "http://heim-nas:8500"
} }

View File

@@ -77,7 +77,6 @@ npm run generate:api # Regenerate TypeScript API types from OpenAPI spec
``` ```
backend/src/main/java/org/raddatz/familienarchiv/ backend/src/main/java/org/raddatz/familienarchiv/
├── audit/ Audit logging ├── audit/ Audit logging
├── auth/ AuthService, AuthSessionController, LoginRequest, LoginRateLimiter, RateLimitProperties (Spring Session JDBC)
├── config/ Infrastructure config (Minio, Async, Web) ├── config/ Infrastructure config (Minio, Async, Web)
├── dashboard/ Dashboard analytics + StatsController/StatsService ├── dashboard/ Dashboard analytics + StatsController/StatsService
├── document/ Document domain (entities, controller, service, repository, DTOs) ├── document/ Document domain (entities, controller, service, repository, DTOs)
@@ -87,14 +86,14 @@ backend/src/main/java/org/raddatz/familienarchiv/
├── exception/ DomainException, ErrorCode, GlobalExceptionHandler ├── exception/ DomainException, ErrorCode, GlobalExceptionHandler
├── filestorage/ FileService (S3/MinIO) ├── filestorage/ FileService (S3/MinIO)
├── geschichte/ Geschichte (story) domain ├── geschichte/ Geschichte (story) domain
├── importing/ CanonicalImportOrchestrator + four loaders (TagTree/PersonRegister/PersonTree/Document) + CanonicalSheetReader ├── importing/ MassImportService
├── notification/ Notification domain + SseEmitterRegistry ├── notification/ Notification domain + SseEmitterRegistry
├── ocr/ OCR domain — OcrService, OcrBatchService, training ├── ocr/ OCR domain — OcrService, OcrBatchService, training
├── person/ Person domain ├── person/ Person domain
│ └── relationship/ PersonRelationship sub-domain │ └── relationship/ PersonRelationship sub-domain
├── security/ SecurityConfig, Permission, @RequirePermission, PermissionAspect ├── security/ SecurityConfig, Permission, @RequirePermission, PermissionAspect
├── tag/ Tag domain ├── tag/ Tag domain
└── user/ User domain — AppUser, UserGroup, UserService └── user/ User domain — AppUser, UserGroup, UserService, auth controllers
``` ```
### Layering Rules ### Layering Rules
@@ -160,7 +159,7 @@ Input DTOs live flat in the domain package. Response types are the model entitie
→ See [CONTRIBUTING.md §Error handling](./CONTRIBUTING.md#error-handling) → See [CONTRIBUTING.md §Error handling](./CONTRIBUTING.md#error-handling)
**LLM reminder:** use `DomainException.notFound/forbidden/conflict/internal()` from service methods — never throw raw exceptions. When adding a new `ErrorCode`: (1) add to `ErrorCode.java`, (2) add to `ErrorCode` type in `frontend/src/lib/shared/errors.ts`, (3) add a `case` in `getErrorMessage()`, (4) add i18n keys in `messages/{de,en,es}.json`. Valid error codes include: `TOO_MANY_LOGIN_ATTEMPTS` (returned by `LoginRateLimiter` as HTTP 429 when a brute-force threshold is exceeded). **LLM reminder:** use `DomainException.notFound/forbidden/conflict/internal()` from service methods — never throw raw exceptions. When adding a new `ErrorCode`: (1) add to `ErrorCode.java`, (2) mirror in `frontend/src/lib/shared/errors.ts`, (3) add i18n keys in `messages/{de,en,es}.json`.
### Security / Permissions ### Security / Permissions
@@ -192,20 +191,19 @@ frontend/src/routes/
├── persons/ ├── persons/
│ ├── [id]/ Person detail │ ├── [id]/ Person detail
│ ├── [id]/edit/ Person edit form │ ├── [id]/edit/ Person edit form
── new/ Create person form ── new/ Create person form
│ └── review/ Triage view — confirm/rename/merge/delete provisional persons
├── briefwechsel/ Bilateral conversation timeline (Briefwechsel) ├── briefwechsel/ Bilateral conversation timeline (Briefwechsel)
├── aktivitaeten/ Unified activity feed (Chronik) ├── aktivitaeten/ Unified activity feed (Chronik)
├── geschichten/ Stories — list, [id], [id]/edit, new ├── geschichten/ Stories — list, [id], [id]/edit, new
├── stammbaum/ Family tree (Stammbaum) ├── stammbaum/ Family tree (Stammbaum)
├── themen/ Topics directory — browsable tag index
├── enrich/ Enrichment workflow — [id], done ├── enrich/ Enrichment workflow — [id], done
├── admin/ User, group, tag, OCR, system management ├── admin/ User, group, tag, OCR, system management
├── hilfe/transkription/ Transcription help page ├── hilfe/transkription/ Transcription help page
├── profile/ User profile settings ├── profile/ User profile settings
├── users/[id]/ Public user profile page ├── users/[id]/ Public user profile page
├── login/ logout/ register/ ├── login/ logout/ register/
── forgot-password/ reset-password/ ── forgot-password/ reset-password/
└── demo/ Dev-only demos
``` ```
### API Client Pattern ### API Client Pattern
@@ -269,7 +267,7 @@ Back button pattern — use the shared `<BackButton>` component from `$lib/share
→ See [CONTRIBUTING.md §Error handling](./CONTRIBUTING.md#error-handling) → See [CONTRIBUTING.md §Error handling](./CONTRIBUTING.md#error-handling)
**LLM reminder:** when adding a new `ErrorCode`: (1) add to `ErrorCode.java`, (2) add to `ErrorCode` type in `frontend/src/lib/shared/errors.ts`, (3) add a `case` in `getErrorMessage()`, (4) add i18n keys in `messages/{de,en,es}.json`. Valid error codes include: `TOO_MANY_LOGIN_ATTEMPTS` (returned by `LoginRateLimiter` as HTTP 429 when a brute-force threshold is exceeded). **LLM reminder:** when adding a new `ErrorCode`: (1) add to `ErrorCode.java`, (2) add to `ErrorCode` type in `frontend/src/lib/shared/errors.ts`, (3) add a `case` in `getErrorMessage()`, (4) add i18n keys in `messages/{de,en,es}.json`.
--- ---
@@ -277,35 +275,6 @@ Back button pattern — use the shared `<BackButton>` component from `$lib/share
→ See [docs/DEPLOYMENT.md](./docs/DEPLOYMENT.md) → See [docs/DEPLOYMENT.md](./docs/DEPLOYMENT.md)
### Observability stack (separate compose file)
Run via `docker-compose.observability.yml` — requires the main stack to be running first. Full setup procedure: [docs/DEPLOYMENT.md §4](./docs/DEPLOYMENT.md#4-logs--observability).
| Service | Container | Default Port | Purpose |
|---------|-----------|-------------|---------|
| Grafana | `obs-grafana` | 3003 | Metrics / logs / traces dashboard |
| Prometheus | `obs-prometheus` | 9090 (dev only — `127.0.0.1` bound) | Metrics store |
| Loki | `obs-loki` | — (internal) | Log store |
| Tempo | `obs-tempo` | — (internal) | Trace store |
| GlitchTip | `obs-glitchtip` | 3002 | Error tracking (Sentry-compatible) |
### Observability env vars
| Variable | Purpose |
|----------|---------|
| `PORT_GRAFANA` | Host port for Grafana UI (default: `3003`) |
| `PORT_GLITCHTIP` | Host port for GlitchTip UI (default: `3002`) |
| `PORT_PROMETHEUS` | Host port for Prometheus UI (default: `9090`) |
| `GRAFANA_ADMIN_PASSWORD` | Grafana `admin` login password — generate with `openssl rand -hex 32` |
| `GLITCHTIP_SECRET_KEY` | Django secret key for GlitchTip — generate with `python3 -c "import secrets; print(secrets.token_hex(32))"` |
| `GLITCHTIP_DOMAIN` | Public-facing base URL for GlitchTip (email links, CORS), e.g. `https://glitchtip.example.com` |
| `SENTRY_DSN` | GlitchTip/Sentry DSN for the backend (Spring Boot) — leave empty to disable |
| `VITE_SENTRY_DSN` | GlitchTip/Sentry DSN for the frontend (SvelteKit) — injected at build time via Vite |
## Observability
→ See [docs/OBSERVABILITY.md](./docs/OBSERVABILITY.md) — where to look for logs, traces, metrics, and errors.
## API Testing ## API Testing
HTTP test files are in `backend/api_tests/` for use with the VS Code REST Client extension. HTTP test files are in `backend/api_tests/` for use with the VS Code REST Client extension.

View File

@@ -263,7 +263,7 @@ if (!result.response.ok) {
return { person: result.data! }; // non-null assertion is safe after the ok check return { person: result.data! }; // non-null assertion is safe after the ok check
``` ```
For multipart/form-data (file uploads): bypass the typed client and use `event.fetch` directly — never global `fetch`. The typed client cannot handle multipart bodies, but `event.fetch` is still required so that `handleFetch` injects the session cookie. For multipart/form-data (file uploads): bypass the typed client and use raw `fetch` — the client cannot handle it.
### Date handling ### Date handling
@@ -272,7 +272,6 @@ For multipart/form-data (file uploads): bypass the typed client and use `event.f
| Form display | German `dd.mm.yyyy` with auto-dot insertion via `handleDateInput()` | | Form display | German `dd.mm.yyyy` with auto-dot insertion via `handleDateInput()` |
| Wire format | ISO 8601 via a hidden `<input type="hidden" name="documentDate" value={dateIso}>` | | Wire format | ISO 8601 via a hidden `<input type="hidden" name="documentDate" value={dateIso}>` |
| Display | `new Intl.DateTimeFormat('de-DE', …).format(new Date(val + 'T12:00:00'))` | | Display | `new Intl.DateTimeFormat('de-DE', …).format(new Date(val + 'T12:00:00'))` |
| Honest precision display | `formatDocumentDate(iso, precision, end?, raw?, locale?)` (`$lib/shared/utils/documentDate.ts`) or the `<DocumentDate>` component — renders a document date at exactly its `meta_date_precision` (MONTH → "Juni 1916", never a fabricated day). It mirrors the Java `DocumentTitleFormatter`; both are pinned to `docs/date-label-fixtures.json` so the title and UI labels can't drift. `meta_date_raw` is untrusted — render it via default escaping, never `{@html}` (a CI guard enforces this). |
### Security checklist (new endpoint) ### Security checklist (new endpoint)

View File

@@ -24,7 +24,6 @@ Spring Boot 4.0 monolith serving the Familienarchiv REST API. Handles document m
``` ```
src/main/java/org/raddatz/familienarchiv/ src/main/java/org/raddatz/familienarchiv/
├── audit/ # Audit logging (AuditService, AuditLogQueryService) ├── audit/ # Audit logging (AuditService, AuditLogQueryService)
├── auth/ # AuthService, AuthSessionController, LoginRequest (Spring Session JDBC — ADR-020)
├── config/ # Infrastructure config (MinioConfig, AsyncConfig, WebConfig) ├── config/ # Infrastructure config (MinioConfig, AsyncConfig, WebConfig)
├── dashboard/ # Dashboard analytics + StatsController/StatsService ├── dashboard/ # Dashboard analytics + StatsController/StatsService
├── document/ # Document domain — entities, controller, service, repository, DTOs ├── document/ # Document domain — entities, controller, service, repository, DTOs
@@ -34,14 +33,14 @@ src/main/java/org/raddatz/familienarchiv/
├── exception/ # DomainException, ErrorCode, GlobalExceptionHandler ├── exception/ # DomainException, ErrorCode, GlobalExceptionHandler
├── filestorage/ # FileService (S3/MinIO) ├── filestorage/ # FileService (S3/MinIO)
├── geschichte/ # Geschichte (story) domain ├── geschichte/ # Geschichte (story) domain
├── importing/ # CanonicalImportOrchestrator + 4 loaders + CanonicalSheetReader ├── importing/ # MassImportService
├── notification/ # Notification domain + SseEmitterRegistry ├── notification/ # Notification domain + SseEmitterRegistry
├── ocr/ # OCR domain — OcrService, OcrBatchService, training ├── ocr/ # OCR domain — OcrService, OcrBatchService, training
├── person/ # Person domain — Person, PersonService, PersonController ├── person/ # Person domain — Person, PersonService, PersonController
│ └── relationship/ # PersonRelationship sub-domain │ └── relationship/ # PersonRelationship sub-domain
├── security/ # SecurityConfig, Permission, @RequirePermission, PermissionAspect ├── security/ # SecurityConfig, Permission, @RequirePermission, PermissionAspect
├── tag/ # Tag domain — Tag, TagService, TagController ├── tag/ # Tag domain — Tag, TagService, TagController
└── user/ # User domain — AppUser, UserGroup, UserService └── user/ # User domain — AppUser, UserGroup, UserService, auth controllers
``` ```
For per-domain ownership and public surface, see each domain's `README.md`. For per-domain ownership and public surface, see each domain's `README.md`.
@@ -97,10 +96,7 @@ public class MyEntity {
- Annotated with `@Service`, `@RequiredArgsConstructor`, optionally `@Slf4j`. - Annotated with `@Service`, `@RequiredArgsConstructor`, optionally `@Slf4j`.
- Write methods: `@Transactional`. - Write methods: `@Transactional`.
- Read methods: no annotation (default non-transactional)**except** when the method returns - Read methods: no annotation (default non-transactional).
an entity whose lazy associations must remain accessible to the caller after the method
returns. In that case, use `@Transactional(readOnly = true)` to keep the Hibernate session
open. Removing this annotation causes `LazyInitializationException` in production. See ADR-022.
- Cross-domain access goes through the other domain's service, never its repository. - Cross-domain access goes through the other domain's service, never its repository.
## Error Handling ## Error Handling

View File

@@ -5,7 +5,7 @@
<parent> <parent>
<groupId>org.springframework.boot</groupId> <groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId> <artifactId>spring-boot-starter-parent</artifactId>
<version>4.0.6</version> <version>4.0.0</version>
<relativePath/> <!-- lookup parent from repository --> <relativePath/> <!-- lookup parent from repository -->
</parent> </parent>
<groupId>org.raddatz</groupId> <groupId>org.raddatz</groupId>
@@ -29,30 +29,11 @@
<properties> <properties>
<java.version>21</java.version> <java.version>21</java.version>
</properties> </properties>
<dependencyManagement>
<dependencies>
<!-- opentelemetry-spring-boot-starter:2.27.0 was built against opentelemetry-api:1.61.0,
but Spring Boot 4.0.0 BOM only manages 1.55.0 (missing GlobalOpenTelemetry.getOrNoop()).
Import the core OTel BOM here to override it before the Spring Boot BOM applies. -->
<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-bom</artifactId>
<version>1.61.0</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
<dependencies> <dependencies>
<dependency> <dependency>
<groupId>org.springframework.boot</groupId> <groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId> <artifactId>spring-boot-starter-actuator</artifactId>
</dependency> </dependency>
<!-- Spring Boot 4.0 splits Micrometer metrics export (incl. Prometheus scrape endpoint) into its own starter -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-micrometer-metrics</artifactId>
</dependency>
<dependency> <dependency>
<groupId>org.springframework.boot</groupId> <groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-validation</artifactId> <artifactId>spring-boot-starter-validation</artifactId>
@@ -69,10 +50,6 @@
<groupId>org.springframework.boot</groupId> <groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-security</artifactId> <artifactId>spring-boot-starter-security</artifactId>
</dependency> </dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-session-jdbc</artifactId>
</dependency>
<dependency> <dependency>
<groupId>org.springframework.boot</groupId> <groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-webmvc</artifactId> <artifactId>spring-boot-starter-webmvc</artifactId>
@@ -180,16 +157,11 @@
<artifactId>flyway-database-postgresql</artifactId> <artifactId>flyway-database-postgresql</artifactId>
</dependency> </dependency>
<!-- Caffeine cache + Bucket4j for in-memory rate limiting --> <!-- Caffeine cache for in-memory rate limiting -->
<dependency> <dependency>
<groupId>com.github.ben-manes.caffeine</groupId> <groupId>com.github.ben-manes.caffeine</groupId>
<artifactId>caffeine</artifactId> <artifactId>caffeine</artifactId>
</dependency> </dependency>
<dependency>
<groupId>com.bucket4j</groupId>
<artifactId>bucket4j-core</artifactId>
<version>8.10.1</version>
</dependency>
<!-- OpenAPI / Swagger UI — enabled only in the dev Spring profile --> <!-- OpenAPI / Swagger UI — enabled only in the dev Spring profile -->
<dependency> <dependency>
@@ -216,50 +188,7 @@
<dependency> <dependency>
<groupId>com.googlecode.owasp-java-html-sanitizer</groupId> <groupId>com.googlecode.owasp-java-html-sanitizer</groupId>
<artifactId>owasp-java-html-sanitizer</artifactId> <artifactId>owasp-java-html-sanitizer</artifactId>
<version>20260101.1</version> <version>20240325.1</version>
</dependency>
<!-- HTML → plain-text extraction for comment previews -->
<dependency>
<groupId>org.jsoup</groupId>
<artifactId>jsoup</artifactId>
<version>1.18.1</version>
</dependency>
<!-- Observability: Prometheus metrics scrape endpoint (version managed by Spring Boot BOM) -->
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
<!-- Observability: Micrometer → OpenTelemetry tracing bridge (version managed by Spring Boot BOM) -->
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-tracing-bridge-otel</artifactId>
</dependency>
<!-- Observability: OTel Spring Boot auto-instrumentation — NOT in Spring Boot BOM, pinned explicitly -->
<dependency>
<groupId>io.opentelemetry.instrumentation</groupId>
<artifactId>opentelemetry-spring-boot-starter</artifactId>
<version>2.27.0</version>
<exclusions>
<!-- Excludes AzureAppServiceResourceProvider which references ServiceAttributes.SERVICE_INSTANCE_ID
that does not exist in the semconv version pulled by this project. -->
<exclusion>
<groupId>io.opentelemetry.contrib</groupId>
<artifactId>opentelemetry-azure-resources</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- Sentry error reporting (GlitchTip-compatible) — sentry-spring-boot-4 is the
Spring Boot 4 / Spring Framework 7 compatible module (replaces the jakarta starter
which crashes with SF7 due to bean-name generation for triply-nested @Import classes) -->
<dependency>
<groupId>io.sentry</groupId>
<artifactId>sentry-spring-boot-4</artifactId>
<version>8.41.0</version>
</dependency> </dependency>
</dependencies> </dependencies>
@@ -306,7 +235,7 @@
<phase>verify</phase> <phase>verify</phase>
<goals><goal>report</goal></goals> <goals><goal>report</goal></goals>
</execution> </execution>
<!-- Gate: ratchet at 0.77 — actual measured coverage after drift; raise via #496 --> <!-- Gate: baseline 89.4% overall / service 90.2% / controller 80.0% -->
<execution> <execution>
<id>check</id> <id>check</id>
<phase>verify</phase> <phase>verify</phase>
@@ -319,7 +248,7 @@
<limit> <limit>
<counter>BRANCH</counter> <counter>BRANCH</counter>
<value>COVEREDRATIO</value> <value>COVEREDRATIO</value>
<minimum>0.77</minimum> <minimum>0.88</minimum>
</limit> </limit>
</limits> </limits>
</rule> </rule>
@@ -337,16 +266,6 @@
</profiles> </profiles>
</configuration> </configuration>
</plugin> </plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<forkedProcessTimeoutInSeconds>600</forkedProcessTimeoutInSeconds>
<systemPropertyVariables>
<junit.jupiter.execution.timeout.default>90 s</junit.jupiter.execution.timeout.default>
</systemPropertyVariables>
</configuration>
</plugin>
</plugins> </plugins>
</build> </build>

View File

@@ -35,22 +35,7 @@ public enum AuditKind {
USER_DELETED, USER_DELETED,
/** Payload: {@code {"userId": "uuid", "email": "addr", "addedGroups": ["Admin"], "removedGroups": []}} */ /** Payload: {@code {"userId": "uuid", "email": "addr", "addedGroups": ["Admin"], "removedGroups": []}} */
GROUP_MEMBERSHIP_CHANGED, GROUP_MEMBERSHIP_CHANGED;
/** Payload: {@code {"userId": "uuid", "ip": "1.2.3.4", "ua": "Mozilla/5.0..."}} */
LOGIN_SUCCESS,
/** Payload: {@code {"email": "addr", "ip": "1.2.3.4", "ua": "Mozilla/5.0..."}} — password NEVER included */
LOGIN_FAILED,
/** Payload: {@code {"userId": "uuid", "ip": "1.2.3.4", "ua": "Mozilla/5.0...", "reason": "password_change|password_reset|admin_force_logout", "revokedCount": 3}} */
LOGOUT,
/** Payload: {@code {"actorId": "uuid", "targetUserId": "uuid", "revokedCount": 3}} */
ADMIN_FORCE_LOGOUT,
/** Payload: {@code {"ip": "1.2.3.4", "email": "addr"}} — password NEVER included */
LOGIN_RATE_LIMITED;
public static final Set<AuditKind> ROLLUP_ELIGIBLE = Set.of( public static final Set<AuditKind> ROLLUP_ELIGIBLE = Set.of(
TEXT_SAVED, FILE_UPLOADED, ANNOTATION_CREATED, TEXT_SAVED, FILE_UPLOADED, ANNOTATION_CREATED,

View File

@@ -1,84 +0,0 @@
package org.raddatz.familienarchiv.auth;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.raddatz.familienarchiv.audit.AuditKind;
import org.raddatz.familienarchiv.audit.AuditService;
import org.raddatz.familienarchiv.exception.DomainException;
import org.raddatz.familienarchiv.user.AppUser;
import org.raddatz.familienarchiv.user.UserService;
import org.springframework.security.authentication.AuthenticationManager;
import org.springframework.security.authentication.UsernamePasswordAuthenticationToken;
import org.springframework.security.core.Authentication;
import org.springframework.security.core.AuthenticationException;
import org.springframework.stereotype.Service;
import java.util.Map;
import java.util.UUID;
@Service
@RequiredArgsConstructor
@Slf4j
public class AuthService {
private final AuthenticationManager authenticationManager;
private final UserService userService;
private final AuditService auditService;
private final LoginRateLimiter loginRateLimiter;
private final SessionRevocationPort sessionRevocationPort;
public LoginResult login(String email, String password, String ip, String ua) {
try {
loginRateLimiter.checkAndConsume(ip, email);
} catch (DomainException ex) {
auditService.log(AuditKind.LOGIN_RATE_LIMITED, null, null, Map.of(
"ip", ip,
"email", email));
throw ex;
}
try {
Authentication auth = authenticationManager.authenticate(
new UsernamePasswordAuthenticationToken(email, password));
AppUser user = userService.findByEmail(email);
auditService.log(AuditKind.LOGIN_SUCCESS, user.getId(), null, Map.of(
"userId", user.getId().toString(),
"ip", ip,
"ua", truncateUa(ua)));
loginRateLimiter.invalidateOnSuccess(ip, email);
return new LoginResult(user, auth);
} catch (AuthenticationException ex) {
// Audit login failure — intentionally does NOT log the attempted password.
// DaoAuthenticationProvider already runs a dummy BCrypt on unknown users to
// equalise timing between "user not found" and "wrong password" paths.
auditService.log(AuditKind.LOGIN_FAILED, null, null, Map.of(
"email", email,
"ip", ip,
"ua", truncateUa(ua)));
throw DomainException.invalidCredentials();
}
}
public int revokeOtherSessions(String currentSessionId, String principalName) {
return sessionRevocationPort.revokeOtherSessions(currentSessionId, principalName);
}
public int revokeAllSessions(String principalName) {
return sessionRevocationPort.revokeAllSessions(principalName);
}
public void logout(String email, String ip, String ua) {
AppUser user = userService.findByEmail(email);
auditService.log(AuditKind.LOGOUT, user.getId(), null, Map.of(
"userId", user.getId().toString(),
"ip", ip,
"ua", truncateUa(ua)));
}
private static String truncateUa(String ua) {
if (ua == null) return "";
return ua.length() > 200 ? ua.substring(0, 200) : ua;
}
public record LoginResult(AppUser user, Authentication authentication) {}
}

View File

@@ -1,102 +0,0 @@
package org.raddatz.familienarchiv.auth;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import jakarta.servlet.http.HttpSession;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.raddatz.familienarchiv.user.AppUser;
import org.springframework.http.ResponseEntity;
import org.springframework.security.core.Authentication;
import org.springframework.security.core.context.SecurityContext;
import org.springframework.security.core.context.SecurityContextHolder;
import org.springframework.security.web.authentication.session.SessionAuthenticationStrategy;
import org.springframework.security.web.context.HttpSessionSecurityContextRepository;
import org.springframework.web.bind.annotation.*;
// @RequirePermission is intentionally absent: login is unauthenticated by design;
// logout requires an authenticated session (enforced by SecurityConfig), not a specific permission.
@RestController
@RequestMapping("/api/auth")
@RequiredArgsConstructor
@Slf4j
public class AuthSessionController {
private final AuthService authService;
private final SessionAuthenticationStrategy sessionAuthenticationStrategy;
@PostMapping("/login")
public ResponseEntity<AppUser> login(
@RequestBody LoginRequest request,
HttpServletRequest httpRequest,
HttpServletResponse httpResponse) {
String ip = resolveClientIp(httpRequest);
String ua = resolveUserAgent(httpRequest);
AuthService.LoginResult result = authService.login(request.email(), request.password(), ip, ua);
// Session-fixation defense (CWE-384): rotate the session ID at the authentication
// boundary. ChangeSessionIdAuthenticationStrategy invalidates any pre-auth session ID
// an attacker may have planted and mints a fresh one before we attach the SecurityContext.
httpRequest.getSession(true);
sessionAuthenticationStrategy.onAuthentication(result.authentication(), httpRequest, httpResponse);
// Spring Session JDBC intercepts setAttribute() and persists the record under the
// (now rotated) opaque ID; the Set-Cookie: fa_session=<opaque-id> is added automatically.
SecurityContext context = SecurityContextHolder.createEmptyContext();
context.setAuthentication(result.authentication());
SecurityContextHolder.setContext(context);
httpRequest.getSession()
.setAttribute(HttpSessionSecurityContextRepository.SPRING_SECURITY_CONTEXT_KEY, context);
return ResponseEntity.ok(result.user());
}
@PostMapping("/logout")
public ResponseEntity<Void> logout(Authentication authentication, HttpServletRequest httpRequest) {
String email = authentication.getName();
String ip = resolveClientIp(httpRequest);
String ua = resolveUserAgent(httpRequest);
// CWE-613 defense: invalidate the session first — that is the contract the user
// is relying on when they click "Log out." Audit is best-effort and must not
// bubble up: if the user record was deleted while the session was live, the
// audit lookup throws, but the session row in spring_session must still die.
HttpSession session = httpRequest.getSession(false);
if (session != null) {
session.invalidate();
}
SecurityContextHolder.clearContext();
try {
authService.logout(email, ip, ua);
} catch (Exception ex) {
log.warn("Audit logout failed for {}; session was already invalidated", email, ex);
}
return ResponseEntity.noContent().build();
}
/**
* Resolves the client IP for audit-log purposes.
*
* <p>Trust model: the leftmost {@code X-Forwarded-For} value is taken at face value.
* This is correct <em>only</em> if the ingress (Caddy in production) strips any
* client-supplied XFF before forwarding — otherwise an attacker can pin audit-log
* IPs to whatever they want. Verify the reverse-proxy config before exposing this
* service behind a different ingress.
*/
private static String resolveClientIp(HttpServletRequest request) {
String forwarded = request.getHeader("X-Forwarded-For");
if (forwarded != null && !forwarded.isBlank()) {
return forwarded.split(",")[0].trim();
}
return request.getRemoteAddr();
}
private static String resolveUserAgent(HttpServletRequest request) {
String ua = request.getHeader("User-Agent");
return ua != null ? ua : "";
}
}

View File

@@ -1,29 +0,0 @@
package org.raddatz.familienarchiv.auth;
import lombok.RequiredArgsConstructor;
import org.springframework.session.jdbc.JdbcIndexedSessionRepository;
@RequiredArgsConstructor
class JdbcSessionRevocationAdapter implements SessionRevocationPort {
private final JdbcIndexedSessionRepository sessionRepository;
@Override
public int revokeOtherSessions(String currentSessionId, String principalName) {
int count = 0;
for (String id : sessionRepository.findByPrincipalName(principalName).keySet()) {
if (!id.equals(currentSessionId)) {
sessionRepository.deleteById(id);
count++;
}
}
return count;
}
@Override
public int revokeAllSessions(String principalName) {
var sessions = sessionRepository.findByPrincipalName(principalName);
sessions.keySet().forEach(sessionRepository::deleteById);
return sessions.size();
}
}

View File

@@ -1,72 +0,0 @@
package org.raddatz.familienarchiv.auth;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;
import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;
import lombok.extern.slf4j.Slf4j;
import org.raddatz.familienarchiv.exception.DomainException;
import org.raddatz.familienarchiv.exception.ErrorCode;
import org.springframework.stereotype.Service;
import java.time.Duration;
import java.util.Locale;
import java.util.concurrent.TimeUnit;
@Service
@Slf4j
public class LoginRateLimiter {
private final LoadingCache<String, Bucket> byIpEmail;
private final LoadingCache<String, Bucket> byIp;
private final int maxPerIpEmail;
private final int maxPerIp;
private final int windowMinutes;
public LoginRateLimiter(RateLimitProperties props) {
this.maxPerIpEmail = props.getMaxAttemptsPerIpEmail();
this.maxPerIp = props.getMaxAttemptsPerIp();
this.windowMinutes = props.getWindowMinutes();
this.byIpEmail = Caffeine.newBuilder()
.expireAfterAccess(windowMinutes, TimeUnit.MINUTES)
.build(key -> newBucket(maxPerIpEmail, windowMinutes));
this.byIp = Caffeine.newBuilder()
.expireAfterAccess(windowMinutes, TimeUnit.MINUTES)
.build(key -> newBucket(maxPerIp, windowMinutes));
}
// NOTE: This cache is node-local (in-memory). In a multi-replica deployment,
// effective limits would be multiplied by replica count.
// For the current single-VPS setup this is the correct, simplest implementation.
public void checkAndConsume(String ip, String email) {
long retryAfterSeconds = windowMinutes * 60L;
String key = ip + ":" + email.toLowerCase(Locale.ROOT);
if (!byIpEmail.get(key).tryConsume(1)) {
throw DomainException.tooManyRequests(ErrorCode.TOO_MANY_LOGIN_ATTEMPTS,
"Too many login attempts from " + ip, retryAfterSeconds);
}
if (!byIp.get(ip).tryConsume(1)) {
// Refund the ipEmail token so IP-level blocking does not erode the per-email quota.
byIpEmail.get(key).addTokens(1);
throw DomainException.tooManyRequests(ErrorCode.TOO_MANY_LOGIN_ATTEMPTS,
"Too many login attempts from " + ip, retryAfterSeconds);
}
}
public void invalidateOnSuccess(String ip, String email) {
byIpEmail.invalidate(ip + ":" + email.toLowerCase(Locale.ROOT));
byIp.invalidate(ip);
}
private static Bucket newBucket(int limit, int minutes) {
return Bucket.builder()
.addLimit(Bandwidth.builder()
.capacity(limit)
.refillGreedy(limit, Duration.ofMinutes(minutes))
.build())
.build();
}
}

View File

@@ -1,3 +0,0 @@
package org.raddatz.familienarchiv.auth;
public record LoginRequest(String email, String password) {}

View File

@@ -1,14 +0,0 @@
package org.raddatz.familienarchiv.auth;
class NoOpSessionRevocationAdapter implements SessionRevocationPort {
@Override
public int revokeOtherSessions(String currentSessionId, String principalName) {
return 0;
}
@Override
public int revokeAllSessions(String principalName) {
return 0;
}
}

View File

@@ -1,14 +0,0 @@
package org.raddatz.familienarchiv.auth;
import lombok.Data;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.stereotype.Component;
@Component
@ConfigurationProperties("rate-limit.login")
@Data
public class RateLimitProperties {
private int maxAttemptsPerIpEmail = 10;
private int maxAttemptsPerIp = 20;
private int windowMinutes = 15;
}

View File

@@ -1,19 +0,0 @@
package org.raddatz.familienarchiv.auth;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.session.jdbc.JdbcIndexedSessionRepository;
@Configuration
class SessionRevocationConfig {
@Bean
SessionRevocationPort sessionRevocationPort(
@Autowired(required = false) JdbcIndexedSessionRepository sessionRepository) {
if (sessionRepository != null) {
return new JdbcSessionRevocationAdapter(sessionRepository);
}
return new NoOpSessionRevocationAdapter();
}
}

View File

@@ -1,6 +0,0 @@
package org.raddatz.familienarchiv.auth;
public interface SessionRevocationPort {
int revokeOtherSessions(String currentSessionId, String principalName);
int revokeAllSessions(String principalName);
}

View File

@@ -5,10 +5,8 @@ import lombok.extern.slf4j.Slf4j;
import org.flywaydb.core.Flyway; import org.flywaydb.core.Flyway;
import org.springframework.context.annotation.Bean; import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration; import org.springframework.context.annotation.Configuration;
import org.springframework.core.env.Environment;
import javax.sql.DataSource; import javax.sql.DataSource;
import java.util.Map;
@Configuration @Configuration
@RequiredArgsConstructor @RequiredArgsConstructor
@@ -16,7 +14,6 @@ import java.util.Map;
public class FlywayConfig { public class FlywayConfig {
private final DataSource dataSource; private final DataSource dataSource;
private final Environment environment;
@Bean(name = "flyway") @Bean(name = "flyway")
public Flyway flyway() { public Flyway flyway() {
@@ -24,7 +21,6 @@ public class FlywayConfig {
Flyway flyway = Flyway.configure() Flyway flyway = Flyway.configure()
.dataSource(dataSource) .dataSource(dataSource)
.locations("classpath:db/migration") .locations("classpath:db/migration")
.placeholders(Map.of("grafanaDbPassword", resolveGrafanaDbPassword()))
.baselineOnMigrate(true) .baselineOnMigrate(true)
.baselineVersion("4") .baselineVersion("4")
.load(); .load();
@@ -32,22 +28,4 @@ public class FlywayConfig {
log.info("Flyway: {} migration(s) applied.", result.migrationsExecuted); log.info("Flyway: {} migration(s) applied.", result.migrationsExecuted);
return flyway; return flyway;
} }
// Fail-closed: refuse to boot when GRAFANA_DB_PASSWORD is unset. The
// grafana_reader role's password is (re)set on every boot by
// R__grafana_reader_password.sql, so a missing env var means we'd either
// skip the rotation silently or — with a hardcoded fallback — publish a
// well-known credential for a role with SELECT on audit_log, documents,
// and transcription_blocks. Same shape as UserDataInitializer's refusal
// to seed default admin credentials outside dev/test/e2e.
String resolveGrafanaDbPassword() {
String value = environment.getProperty("GRAFANA_DB_PASSWORD");
if (value == null || value.isBlank()) {
throw new IllegalStateException(
"GRAFANA_DB_PASSWORD is required: it is consumed by "
+ "R__grafana_reader_password.sql to (re)set the grafana_reader "
+ "role's password on every boot. Generate with: openssl rand -hex 32");
}
return value;
}
} }

View File

@@ -28,7 +28,6 @@ public class RateLimitInterceptor implements HandlerInterceptor {
AtomicInteger count = requestCounts.get(ip, k -> new AtomicInteger(0)); AtomicInteger count = requestCounts.get(ip, k -> new AtomicInteger(0));
if (count.incrementAndGet() > MAX_REQUESTS_PER_MINUTE) { if (count.incrementAndGet() > MAX_REQUESTS_PER_MINUTE) {
response.setStatus(HttpStatus.TOO_MANY_REQUESTS.value()); response.setStatus(HttpStatus.TOO_MANY_REQUESTS.value());
response.setHeader("Retry-After", "60");
response.getWriter().write("{\"code\":\"RATE_LIMIT_EXCEEDED\",\"message\":\"Too many requests\"}"); response.getWriter().write("{\"code\":\"RATE_LIMIT_EXCEEDED\",\"message\":\"Too many requests\"}");
return false; return false;
} }

View File

@@ -1,22 +0,0 @@
package org.raddatz.familienarchiv.config;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.session.web.http.CookieSerializer;
import org.springframework.session.web.http.DefaultCookieSerializer;
@Configuration
public class SpringSessionConfig {
@Bean
public CookieSerializer cookieSerializer() {
DefaultCookieSerializer serializer = new DefaultCookieSerializer();
serializer.setCookieName("fa_session");
serializer.setSameSite("Strict");
// cookieHttpOnly: true is the DefaultCookieSerializer default
// useSecureCookie not set: auto-detects from request.isSecure().
// With forward-headers-strategy: native, Caddy's X-Forwarded-Proto: https
// causes isSecure() → true in production; direct HTTP in dev/tests → false.
return serializer;
}
}

View File

@@ -29,11 +29,5 @@ public record ActivityFeedItemDTO(
requiredMode = Schema.RequiredMode.NOT_REQUIRED, requiredMode = Schema.RequiredMode.NOT_REQUIRED,
description = "Annotation associated with the comment; populated only for COMMENT_ADDED and MENTION_CREATED kinds." description = "Annotation associated with the comment; populated only for COMMENT_ADDED and MENTION_CREATED kinds."
) )
UUID annotationId, UUID annotationId
@Nullable
@Schema(
requiredMode = Schema.RequiredMode.NOT_REQUIRED,
description = "Plain-text preview of the comment body (HTML stripped server-side, truncated to 120 chars); null for non-comment feed items or deleted comments."
)
String commentPreview
) {} ) {}

View File

@@ -12,7 +12,6 @@ import org.raddatz.familienarchiv.document.Document;
import org.raddatz.familienarchiv.person.Person; import org.raddatz.familienarchiv.person.Person;
import org.raddatz.familienarchiv.document.transcription.TranscriptionBlock; import org.raddatz.familienarchiv.document.transcription.TranscriptionBlock;
import org.raddatz.familienarchiv.document.comment.CommentService; import org.raddatz.familienarchiv.document.comment.CommentService;
import org.raddatz.familienarchiv.document.comment.CommentData;
import org.raddatz.familienarchiv.document.DocumentService; import org.raddatz.familienarchiv.document.DocumentService;
import org.raddatz.familienarchiv.document.transcription.TranscriptionService; import org.raddatz.familienarchiv.document.transcription.TranscriptionService;
import org.raddatz.familienarchiv.user.UserService; import org.raddatz.familienarchiv.user.UserService;
@@ -134,9 +133,9 @@ public class DashboardService {
.filter(Objects::nonNull) .filter(Objects::nonNull)
.distinct() .distinct()
.toList(); .toList();
Map<UUID, CommentData> commentDataByComment = commentIds.isEmpty() Map<UUID, UUID> annotationByComment = commentIds.isEmpty()
? Map.of() ? Map.of()
: commentService.findDataByIds(commentIds); : commentService.findAnnotationIdsByIds(commentIds);
return rows.stream().map(row -> { return rows.stream().map(row -> {
ActivityActorDTO actor = row.getActorId() != null ActivityActorDTO actor = row.getActorId() != null
@@ -147,10 +146,7 @@ public class DashboardService {
? row.getHappenedAtUntil().atOffset(ZoneOffset.UTC) ? row.getHappenedAtUntil().atOffset(ZoneOffset.UTC)
: null; : null;
UUID commentId = row.getCommentId(); UUID commentId = row.getCommentId();
CommentData commentData = commentId != null ? commentDataByComment.get(commentId) : null; UUID annotationId = commentId != null ? annotationByComment.get(commentId) : null;
UUID annotationId = commentData != null ? commentData.annotationId() : null;
String commentPreview = commentData != null && !commentData.preview().isBlank()
? commentData.preview() : null;
return new ActivityFeedItemDTO( return new ActivityFeedItemDTO(
org.raddatz.familienarchiv.audit.AuditKind.valueOf(row.getKind()), org.raddatz.familienarchiv.audit.AuditKind.valueOf(row.getKind()),
actor, actor,
@@ -162,8 +158,7 @@ public class DashboardService {
row.getCount(), row.getCount(),
happenedAtUntil, happenedAtUntil,
commentId, commentId,
annotationId, annotationId
commentPreview
); );
}).toList(); }).toList();
} }

View File

@@ -1,12 +1,7 @@
package org.raddatz.familienarchiv.dashboard; package org.raddatz.familienarchiv.dashboard;
import io.swagger.v3.oas.annotations.media.Schema;
/** /**
* Aggregate counts for the dashboard/persons stats bar. * Aggregate counts for the dashboard/persons stats bar.
*/ */
public record StatsDTO( public record StatsDTO(long totalPersons, long totalDocuments) {
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) long totalPersons,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) long totalDocuments,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) long totalStories) {
} }

View File

@@ -2,7 +2,6 @@ package org.raddatz.familienarchiv.dashboard;
import lombok.RequiredArgsConstructor; import lombok.RequiredArgsConstructor;
import org.raddatz.familienarchiv.document.DocumentService; import org.raddatz.familienarchiv.document.DocumentService;
import org.raddatz.familienarchiv.geschichte.GeschichteService;
import org.raddatz.familienarchiv.person.PersonService; import org.raddatz.familienarchiv.person.PersonService;
import org.raddatz.familienarchiv.dashboard.StatsDTO; import org.raddatz.familienarchiv.dashboard.StatsDTO;
import org.springframework.stereotype.Service; import org.springframework.stereotype.Service;
@@ -13,9 +12,8 @@ public class StatsService {
private final PersonService personService; private final PersonService personService;
private final DocumentService documentService; private final DocumentService documentService;
private final GeschichteService geschichteService;
public StatsDTO getStats() { public StatsDTO getStats() {
return new StatsDTO(personService.count(), documentService.count(), geschichteService.countPublished()); return new StatsDTO(personService.count(), documentService.count());
} }
} }

View File

@@ -1,17 +0,0 @@
package org.raddatz.familienarchiv.document;
/**
* Precision of a document's date. Verbatim mirror of the import normalizer's
* {@code Precision} enum (tools/import-normalizer/dates.py) — the canonical output is the
* contract, so there is no translation layer. Do not add, remove, or rename values without
* also changing the normalizer; a mismatch silently breaks import idempotency (see ADR-025).
*/
public enum DatePrecision {
DAY,
MONTH,
SEASON,
YEAR,
RANGE,
APPROX,
UNKNOWN
}

View File

@@ -1,23 +0,0 @@
package org.raddatz.familienarchiv.document;
import org.raddatz.familienarchiv.tag.TagOperator;
import java.util.List;
import java.util.UUID;
/**
* The non-date filters honoured by {@link DocumentService#getDensity(DensityFilters)}.
* Date bounds (from/to) are deliberately excluded — see the service Javadoc for why.
*
* Kept as a record so the seven values are passed as one named bundle instead of a
* positional argument list where two UUIDs (sender vs. receiver) can be swapped by
* accident at the call site.
*/
public record DensityFilters(
String text,
UUID sender,
UUID receiver,
List<String> tags,
String tagQ,
DocumentStatus status,
TagOperator tagOperator) {}

View File

@@ -2,7 +2,6 @@ package org.raddatz.familienarchiv.document;
import jakarta.persistence.*; import jakarta.persistence.*;
import lombok.*; import lombok.*;
import org.hibernate.annotations.BatchSize;
import org.hibernate.annotations.CreationTimestamp; import org.hibernate.annotations.CreationTimestamp;
import org.hibernate.annotations.UpdateTimestamp; import org.hibernate.annotations.UpdateTimestamp;
@@ -22,17 +21,6 @@ import java.util.HashSet;
import java.util.Set; import java.util.Set;
import java.util.UUID; import java.util.UUID;
@NamedEntityGraph(name = "Document.full", attributeNodes = {
@NamedAttributeNode("sender"),
@NamedAttributeNode("receivers"),
@NamedAttributeNode("tags"),
@NamedAttributeNode("trainingLabels")
})
@NamedEntityGraph(name = "Document.list", attributeNodes = {
@NamedAttributeNode("sender"),
@NamedAttributeNode("receivers"),
@NamedAttributeNode("tags")
})
@Entity @Entity
@Table(name = "documents") @Table(name = "documents")
@Data // Lombok: Generiert Getter, Setter, ToString, etc. @Data // Lombok: Generiert Getter, Setter, ToString, etc.
@@ -91,29 +79,6 @@ public class Document {
@Column(name = "meta_date") @Column(name = "meta_date")
private LocalDate documentDate; // Wann wurde der Brief geschrieben? private LocalDate documentDate; // Wann wurde der Brief geschrieben?
// Precision of documentDate — drives honest rendering ("ca. 1943", "Frühjahr 1943").
// Verbatim mirror of the normalizer's Precision enum (see ADR-025).
@Enumerated(EnumType.STRING)
@Column(name = "meta_date_precision", nullable = false, length = 16)
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
@Builder.Default
private DatePrecision metaDatePrecision = DatePrecision.UNKNOWN;
// Range end — only set when metaDatePrecision is RANGE (open-ended ranges allowed → may be null).
@Column(name = "meta_date_end")
private LocalDate metaDateEnd;
// Original date cell, verbatim, preserved for provenance and "as written" display.
@Column(name = "meta_date_raw", columnDefinition = "TEXT")
private String metaDateRaw;
// Raw attribution preserved even when a person is linked via sender/receivers.
@Column(name = "sender_text", columnDefinition = "TEXT")
private String senderText;
@Column(name = "receiver_text", columnDefinition = "TEXT")
private String receiverText;
@Column(name = "meta_location") @Column(name = "meta_location")
private String location; private String location;
@@ -153,27 +118,24 @@ public class Document {
@Builder.Default @Builder.Default
private ScriptType scriptType = ScriptType.UNKNOWN; private ScriptType scriptType = ScriptType.UNKNOWN;
@ManyToMany(fetch = FetchType.LAZY) @ManyToMany(fetch = FetchType.EAGER)
@JoinTable(name = "document_receivers", joinColumns = @JoinColumn(name = "document_id"), inverseJoinColumns = @JoinColumn(name = "person_id")) @JoinTable(name = "document_receivers", joinColumns = @JoinColumn(name = "document_id"), inverseJoinColumns = @JoinColumn(name = "person_id"))
@BatchSize(size = 50)
@Builder.Default @Builder.Default
private Set<Person> receivers = new HashSet<>(); private Set<Person> receivers = new HashSet<>();
@ManyToOne(fetch = FetchType.LAZY) @ManyToOne
@JoinColumn(name = "sender_id") @JoinColumn(name = "sender_id")
private Person sender; private Person sender;
@ManyToMany(fetch = FetchType.LAZY) @ManyToMany(fetch = FetchType.EAGER)
@JoinTable(name = "document_tags", joinColumns = @JoinColumn(name = "document_id"), inverseJoinColumns = @JoinColumn(name = "tag_id")) @JoinTable(name = "document_tags", joinColumns = @JoinColumn(name = "document_id"), inverseJoinColumns = @JoinColumn(name = "tag_id"))
@BatchSize(size = 50)
@Builder.Default @Builder.Default
private Set<Tag> tags = new HashSet<>(); private Set<Tag> tags = new HashSet<>();
@ElementCollection(fetch = FetchType.LAZY) @ElementCollection(fetch = FetchType.EAGER)
@CollectionTable(name = "document_training_labels", joinColumns = @JoinColumn(name = "document_id")) @CollectionTable(name = "document_training_labels", joinColumns = @JoinColumn(name = "document_id"))
@Column(name = "label") @Column(name = "label")
@Enumerated(EnumType.STRING) @Enumerated(EnumType.STRING)
@BatchSize(size = 50)
@Builder.Default @Builder.Default
private Set<TrainingLabel> trainingLabels = new HashSet<>(); private Set<TrainingLabel> trainingLabels = new HashSet<>();

View File

@@ -12,8 +12,6 @@ public class DocumentBatchMetadataDTO {
private UUID senderId; private UUID senderId;
private List<UUID> receiverIds; private List<UUID> receiverIds;
private LocalDate documentDate; private LocalDate documentDate;
private DatePrecision metaDatePrecision;
private LocalDate metaDateEnd;
private String location; private String location;
private List<String> tagNames; private List<String> tagNames;
private Boolean metadataComplete; private Boolean metadataComplete;

View File

@@ -3,7 +3,6 @@ package org.raddatz.familienarchiv.document;
import java.io.IOException; import java.io.IOException;
import java.time.LocalDate; import java.time.LocalDate;
import java.util.ArrayList; import java.util.ArrayList;
import java.util.concurrent.TimeUnit;
import java.util.LinkedHashSet; import java.util.LinkedHashSet;
import java.util.List; import java.util.List;
import java.util.Map; import java.util.Map;
@@ -49,7 +48,6 @@ import org.raddatz.familienarchiv.filestorage.FileService;
import org.raddatz.familienarchiv.user.UserService; import org.raddatz.familienarchiv.user.UserService;
import org.springframework.data.domain.Sort; import org.springframework.data.domain.Sort;
import org.springframework.security.core.Authentication; import org.springframework.security.core.Authentication;
import org.springframework.http.CacheControl;
import org.springframework.http.HttpHeaders; import org.springframework.http.HttpHeaders;
import org.springframework.http.MediaType; import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity; import org.springframework.http.ResponseEntity;
@@ -313,10 +311,9 @@ public class DocumentController {
@RequestParam(required = false) String tagQ, @RequestParam(required = false) String tagQ,
@RequestParam(required = false) DocumentStatus status, @RequestParam(required = false) DocumentStatus status,
@RequestParam(required = false) String tagOp, @RequestParam(required = false) String tagOp,
@RequestParam(required = false) Boolean undated,
Authentication authentication) { Authentication authentication) {
TagOperator operator = "OR".equalsIgnoreCase(tagOp) ? TagOperator.OR : TagOperator.AND; TagOperator operator = "OR".equalsIgnoreCase(tagOp) ? TagOperator.OR : TagOperator.AND;
List<UUID> ids = documentService.findIdsForFilter(q, from, to, senderId, receiverId, tags, tagQ, status, operator, Boolean.TRUE.equals(undated)); List<UUID> ids = documentService.findIdsForFilter(q, from, to, senderId, receiverId, tags, tagQ, status, operator);
if (ids.size() > BULK_EDIT_FILTER_MAX_IDS) { if (ids.size() > BULK_EDIT_FILTER_MAX_IDS) {
throw DomainException.badRequest(ErrorCode.BULK_EDIT_TOO_MANY_IDS, throw DomainException.badRequest(ErrorCode.BULK_EDIT_TOO_MANY_IDS,
"Filter matches " + ids.size() + " documents — refine filter (max " + BULK_EDIT_FILTER_MAX_IDS + ")"); "Filter matches " + ids.size() + " documents — refine filter (max " + BULK_EDIT_FILTER_MAX_IDS + ")");
@@ -376,7 +373,6 @@ public class DocumentController {
@Parameter(description = "Sort field") @RequestParam(required = false) DocumentSort sort, @Parameter(description = "Sort field") @RequestParam(required = false) DocumentSort sort,
@Parameter(description = "Sort direction: ASC or DESC") @RequestParam(required = false, defaultValue = "DESC") String dir, @Parameter(description = "Sort direction: ASC or DESC") @RequestParam(required = false, defaultValue = "DESC") String dir,
@Parameter(description = "Tag operator: AND (default) or OR") @RequestParam(required = false) String tagOp, @Parameter(description = "Tag operator: AND (default) or OR") @RequestParam(required = false) String tagOp,
@Parameter(description = "Restrict to undated documents (meta_date IS NULL)") @RequestParam(required = false) Boolean undated,
// @Max on page guards against overflow when pageable.getOffset() is computed // @Max on page guards against overflow when pageable.getOffset() is computed
// as page * size — Integer.MAX_VALUE * 50 would wrap to a negative long, which // as page * size — Integer.MAX_VALUE * 50 would wrap to a negative long, which
// Hibernate cheerfully turns into an invalid SQL OFFSET. // Hibernate cheerfully turns into an invalid SQL OFFSET.
@@ -389,24 +385,7 @@ public class DocumentController {
// defaults to AND, which matches the frontend default and keeps old clients working. // defaults to AND, which matches the frontend default and keeps old clients working.
TagOperator operator = "OR".equalsIgnoreCase(tagOp) ? TagOperator.OR : TagOperator.AND; TagOperator operator = "OR".equalsIgnoreCase(tagOp) ? TagOperator.OR : TagOperator.AND;
Pageable pageable = PageRequest.of(page, size); Pageable pageable = PageRequest.of(page, size);
return ResponseEntity.ok(documentService.searchDocuments(q, from, to, senderId, receiverId, tags, tagQ, status, sort, dir, operator, Boolean.TRUE.equals(undated), pageable)); return ResponseEntity.ok(documentService.searchDocuments(q, from, to, senderId, receiverId, tags, tagQ, status, sort, dir, operator, pageable));
}
@GetMapping(value = "/density", produces = MediaType.APPLICATION_JSON_VALUE)
public ResponseEntity<DocumentDensityResult> density(
@RequestParam(required = false) String q,
@RequestParam(required = false) UUID senderId,
@RequestParam(required = false) UUID receiverId,
@RequestParam(required = false, name = "tag") List<String> tags,
@RequestParam(required = false) String tagQ,
@Parameter(description = "Filter by document status") @RequestParam(required = false) DocumentStatus status,
@Parameter(description = "Tag operator: AND (default) or OR") @RequestParam(required = false) String tagOp) {
TagOperator operator = "OR".equalsIgnoreCase(tagOp) ? TagOperator.OR : TagOperator.AND;
DocumentDensityResult result = documentService.getDensity(
new DensityFilters(q, senderId, receiverId, tags, tagQ, status, operator));
return ResponseEntity.ok()
.cacheControl(CacheControl.maxAge(5, TimeUnit.MINUTES).cachePrivate())
.body(result);
} }
// --- TRAINING LABELS --- // --- TRAINING LABELS ---

View File

@@ -1,27 +0,0 @@
package org.raddatz.familienarchiv.document;
import io.swagger.v3.oas.annotations.media.Schema;
import java.time.LocalDate;
import java.util.List;
/**
* Result of the timeline density aggregation.
*
* <p>{@code minDate} / {@code maxDate} are intentionally not marked
* {@code @Schema(requiredMode = REQUIRED)} — the empty-result case (no
* documents match the filter) returns them as {@code null}, which surfaces in
* the generated TypeScript as {@code minDate?: string | null}. Frontend code
* must treat them as optional.
*/
public record DocumentDensityResult(
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
List<MonthBucket> buckets,
LocalDate minDate,
LocalDate maxDate
) {
/** The "no documents match the filter" result, with no buckets and null date bounds. */
public static DocumentDensityResult empty() {
return new DocumentDensityResult(List.of(), null, null);
}
}

View File

@@ -1,44 +0,0 @@
package org.raddatz.familienarchiv.document;
import io.swagger.v3.oas.annotations.media.Schema;
import org.raddatz.familienarchiv.audit.ActivityActorDTO;
import org.raddatz.familienarchiv.person.Person;
import org.raddatz.familienarchiv.tag.Tag;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.List;
import java.util.UUID;
public record DocumentListItem(
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
UUID id,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
String title,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
String originalFilename,
String thumbnailUrl,
LocalDate documentDate,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
DatePrecision metaDatePrecision,
LocalDate metaDateEnd,
Person sender,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
List<Person> receivers,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
List<Tag> tags,
String archiveBox,
String archiveFolder,
String location,
String summary,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
int completionPercentage,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
List<ActivityActorDTO> contributors,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
SearchMatchData matchData,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
LocalDateTime createdAt,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
LocalDateTime updatedAt
) {}

View File

@@ -7,8 +7,6 @@ import org.raddatz.familienarchiv.document.DocumentStatus;
import org.springframework.data.domain.Page; import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable; import org.springframework.data.domain.Pageable;
import org.springframework.data.domain.Sort; import org.springframework.data.domain.Sort;
import org.springframework.data.jpa.domain.Specification;
import org.springframework.data.jpa.repository.EntityGraph;
import org.springframework.data.jpa.repository.JpaRepository; import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.JpaSpecificationExecutor; import org.springframework.data.jpa.repository.JpaSpecificationExecutor;
import org.springframework.data.jpa.repository.Query; import org.springframework.data.jpa.repository.Query;
@@ -25,18 +23,6 @@ import java.util.UUID;
@Repository @Repository
public interface DocumentRepository extends JpaRepository<Document, UUID>, JpaSpecificationExecutor<Document> { public interface DocumentRepository extends JpaRepository<Document, UUID>, JpaSpecificationExecutor<Document> {
@EntityGraph("Document.full")
Optional<Document> findById(UUID id);
@EntityGraph("Document.list")
Page<Document> findAll(Specification<Document> spec, Pageable pageable);
@EntityGraph("Document.list")
List<Document> findAll(Specification<Document> spec);
@EntityGraph("Document.list")
Page<Document> findAll(Pageable pageable);
// Findet ein Dokument anhand des ursprünglichen Dateinamens // Findet ein Dokument anhand des ursprünglichen Dateinamens
// Wichtig für den Abgleich beim Excel-Import & Datei-Upload // Wichtig für den Abgleich beim Excel-Import & Datei-Upload
Optional<Document> findByOriginalFilename(String originalFilename); Optional<Document> findByOriginalFilename(String originalFilename);
@@ -44,21 +30,17 @@ public interface DocumentRepository extends JpaRepository<Document, UUID>, JpaSp
// Wie oben, gibt aber nur das erste Ergebnis zurück — sicher wenn doppelte Dateinamen existieren // Wie oben, gibt aber nur das erste Ergebnis zurück — sicher wenn doppelte Dateinamen existieren
Optional<Document> findFirstByOriginalFilename(String originalFilename); Optional<Document> findFirstByOriginalFilename(String originalFilename);
// Callers access only status/id scalar fields — no graph needed. // Findet alle Dokumente mit einem bestimmten Status
// z.B. um alle offenen "PLACEHOLDER" zu finden
List<Document> findByStatus(DocumentStatus status); List<Document> findByStatus(DocumentStatus status);
// Prüft effizient, ob ein Dateiname schon existiert (gibt true/false zurück) // Prüft effizient, ob ein Dateiname schon existiert (gibt true/false zurück)
boolean existsByOriginalFilename(String originalFilename); boolean existsByOriginalFilename(String originalFilename);
// lazy @BatchSize(50) fallback active; see ADR-022
@EntityGraph("Document.full")
List<Document> findBySenderId(UUID senderId); List<Document> findBySenderId(UUID senderId);
// lazy @BatchSize(50) fallback active; see ADR-022
@EntityGraph("Document.full")
List<Document> findByReceiversId(UUID receiverId); List<Document> findByReceiversId(UUID receiverId);
// Callers access only doc.getTags() to mutate the set — receivers/sender not touched; no graph needed.
List<Document> findByTags_Id(UUID tagId); List<Document> findByTags_Id(UUID tagId);
@Query("SELECT d FROM Document d WHERE d.id NOT IN (SELECT DISTINCT dv.documentId FROM DocumentVersion dv)") @Query("SELECT d FROM Document d WHERE d.id NOT IN (SELECT DISTINCT dv.documentId FROM DocumentVersion dv)")
@@ -73,15 +55,12 @@ public interface DocumentRepository extends JpaRepository<Document, UUID>, JpaSp
long countByMetadataCompleteFalse(); long countByMetadataCompleteFalse();
// No production callers — only used if a future export path iterates the full list; no graph needed.
List<Document> findByMetadataCompleteFalse(Sort sort); List<Document> findByMetadataCompleteFalse(Sort sort);
// Callers map to IncompleteDocumentDTO using only scalar fields (id, title, createdAt) — no graph needed.
Page<Document> findByMetadataCompleteFalse(Pageable pageable); Page<Document> findByMetadataCompleteFalse(Pageable pageable);
Optional<Document> findFirstByMetadataCompleteFalseAndIdNot(UUID id, Sort sort); Optional<Document> findFirstByMetadataCompleteFalseAndIdNot(UUID id, Sort sort);
@EntityGraph("Document.full")
@Query("SELECT DISTINCT d FROM Document d " + @Query("SELECT DISTINCT d FROM Document d " +
"JOIN d.receivers r " + "JOIN d.receivers r " +
"WHERE " + "WHERE " +
@@ -96,7 +75,6 @@ public interface DocumentRepository extends JpaRepository<Document, UUID>, JpaSp
@Param("to") LocalDate to, @Param("to") LocalDate to,
Sort sort); Sort sort);
@EntityGraph("Document.full")
@Query("SELECT DISTINCT d FROM Document d " + @Query("SELECT DISTINCT d FROM Document d " +
"LEFT JOIN d.receivers r " + "LEFT JOIN d.receivers r " +
"WHERE (d.sender.id = :personId OR r.id = :personId) " + "WHERE (d.sender.id = :personId OR r.id = :personId) " +
@@ -122,45 +100,7 @@ public interface DocumentRepository extends JpaRepository<Document, UUID>, JpaSp
ORDER BY ts_rank(d.search_vector, q.pq) DESC, ORDER BY ts_rank(d.search_vector, q.pq) DESC,
d.meta_date DESC NULLS LAST d.meta_date DESC NULLS LAST
""") """)
// Unpaged path — for bulk-edit "select all" and density chart List<UUID> findRankedIdsByFts(@Param("query") String query);
List<UUID> findAllMatchingIdsByFts(@Param("query") String query);
/**
* Returns one page of FTS-ranked document IDs with the total match count.
*
* <p>Each row contains (in column order):
* <ol>
* <li>UUID — document id</li>
* <li>double — ts_rank score</li>
* <li>long — COUNT(*) OVER () — full match count, not page count</li>
* </ol>
*
* <p>Returns an empty list when the query matches no documents (including
* stopword-only queries where websearch_to_tsquery returns an empty tsquery).
* Use findAllMatchingIdsByFts for the unpaged bulk-edit path.
*/
@Query(nativeQuery = true, value = """
WITH q AS (
SELECT CASE WHEN websearch_to_tsquery('german', :query)::text <> ''
THEN to_tsquery('simple', regexp_replace(
websearch_to_tsquery('german', :query)::text,
'''([^'']+)''',
'''\\1'':*',
'g'))
END AS pq
), matches AS (
SELECT d.id, ts_rank(d.search_vector, q.pq) AS rank
FROM documents d, q
WHERE d.search_vector @@ q.pq
)
SELECT id, rank, COUNT(*) OVER () AS total
FROM matches
ORDER BY rank DESC, id
OFFSET :offset LIMIT :limit
""")
List<Object[]> findFtsPageRaw(@Param("query") String query,
@Param("offset") int offset,
@Param("limit") int limit);
/** /**
* Returns match-enrichment data for a set of documents identified by their IDs. * Returns match-enrichment data for a set of documents identified by their IDs.

View File

@@ -0,0 +1,18 @@
package org.raddatz.familienarchiv.document;
import io.swagger.v3.oas.annotations.media.Schema;
import org.raddatz.familienarchiv.audit.ActivityActorDTO;
import org.raddatz.familienarchiv.document.Document;
import java.util.List;
public record DocumentSearchItem(
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
Document document,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
SearchMatchData matchData,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
int completionPercentage,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
List<ActivityActorDTO> contributors
) {}

View File

@@ -7,7 +7,7 @@ import java.util.List;
public record DocumentSearchResult( public record DocumentSearchResult(
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) @Schema(requiredMode = Schema.RequiredMode.REQUIRED)
List<DocumentListItem> items, List<DocumentSearchItem> items,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) @Schema(requiredMode = Schema.RequiredMode.REQUIRED)
long totalElements, long totalElements,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) @Schema(requiredMode = Schema.RequiredMode.REQUIRED)
@@ -15,45 +15,24 @@ public record DocumentSearchResult(
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) @Schema(requiredMode = Schema.RequiredMode.REQUIRED)
int pageSize, int pageSize,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) @Schema(requiredMode = Schema.RequiredMode.REQUIRED)
int totalPages, int totalPages
/**
* Total number of undated documents (meta_date IS NULL) matching the current
* filter context (q/tags/sender/receiver/status) across ALL pages — not the
* undated rows on the current page. Computed independently of the "Nur
* undatierte" toggle so it never collapses to the page slice (issue #668).
*/
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
long undatedCount
) { ) {
/** /**
* Single-page convenience factory used by empty-result shortcuts and by tests that * Single-page convenience factory used by empty-result shortcuts and by tests that
* don't care about paging. Treats the whole list as page 0 of itself. The undated * don't care about paging. Treats the whole list as page 0 of itself.
* count defaults to 0 — the service overlays the real global count via
* {@link #withUndatedCount(long)} before returning.
*/ */
public static DocumentSearchResult of(List<DocumentListItem> items) { public static DocumentSearchResult of(List<DocumentSearchItem> items) {
int size = items.size(); int size = items.size();
return new DocumentSearchResult(items, size, 0, size, size == 0 ? 0 : 1, 0L); return new DocumentSearchResult(items, size, 0, size, size == 0 ? 0 : 1);
} }
/** /**
* Paged factory used by the service when it has a real Pageable + full match count * Paged factory used by the service when it has a real Pageable + full match count
* (e.g. from Spring's Page&lt;T&gt; or from an in-memory sort-then-slice). The undated * (e.g. from Spring's Page<T> or from an in-memory sort-then-slice).
* count defaults to 0 — the service overlays the real global count via
* {@link #withUndatedCount(long)} before returning.
*/ */
public static DocumentSearchResult paged(List<DocumentListItem> slice, Pageable pageable, long totalElements) { public static DocumentSearchResult paged(List<DocumentSearchItem> slice, Pageable pageable, long totalElements) {
int pageSize = pageable.getPageSize(); int pageSize = pageable.getPageSize();
int totalPages = pageSize == 0 ? 0 : (int) ((totalElements + pageSize - 1) / pageSize); int totalPages = pageSize == 0 ? 0 : (int) ((totalElements + pageSize - 1) / pageSize);
return new DocumentSearchResult(slice, totalElements, pageable.getPageNumber(), pageSize, totalPages, 0L); return new DocumentSearchResult(slice, totalElements, pageable.getPageNumber(), pageSize, totalPages);
}
/**
* Returns a copy with the global undated count overlaid, leaving every other
* field untouched. Lets the service compute the count once and attach it to
* whichever result shape the search path produced.
*/
public DocumentSearchResult withUndatedCount(long undatedCount) {
return new DocumentSearchResult(items, totalElements, pageNumber, pageSize, totalPages, undatedCount);
} }
} }

View File

@@ -10,6 +10,7 @@ import org.raddatz.familienarchiv.audit.AuditService;
import org.raddatz.familienarchiv.document.DocumentBatchMetadataDTO; import org.raddatz.familienarchiv.document.DocumentBatchMetadataDTO;
import org.raddatz.familienarchiv.document.DocumentBatchSummary; import org.raddatz.familienarchiv.document.DocumentBatchSummary;
import org.raddatz.familienarchiv.document.DocumentBulkEditDTO; import org.raddatz.familienarchiv.document.DocumentBulkEditDTO;
import org.raddatz.familienarchiv.document.DocumentSearchItem;
import org.raddatz.familienarchiv.document.DocumentSearchResult; import org.raddatz.familienarchiv.document.DocumentSearchResult;
import org.raddatz.familienarchiv.document.DocumentSort; import org.raddatz.familienarchiv.document.DocumentSort;
import org.raddatz.familienarchiv.document.DocumentUpdateDTO; import org.raddatz.familienarchiv.document.DocumentUpdateDTO;
@@ -47,7 +48,6 @@ import java.io.IOException;
import java.security.MessageDigest; import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException; import java.security.NoSuchAlgorithmException;
import java.time.LocalDate; import java.time.LocalDate;
import java.time.YearMonth;
import java.util.ArrayList; import java.util.ArrayList;
import java.util.Arrays; import java.util.Arrays;
import java.util.Collection; import java.util.Collection;
@@ -125,74 +125,6 @@ public class DocumentService {
return titles; return titles;
} }
/**
* Per-month document counts for the timeline density widget (issue #385).
*
* <p>Filter-reactive: the chart recomputes when other filters (sender,
* receiver, tag, q, status) change so it always matches the list it sits
* above. Date bounds (`from`/`to`) are deliberately omitted — the chart is
* the surface for picking those, so it must always span the broader space
* the user is selecting within.
*
* <p>Implementation note: groups in memory rather than via SQL GROUP BY
* because the existing {@link Specification} predicates compose easily
* with {@code findAll(spec)} and the archive size (≈5k docs) keeps this
* well under the 200ms p95 target. Cache-Control: max-age=300 on the
* controller layer absorbs repeated browse loads.
*
* <p>Tracked in issue #481 for re-evaluation when {@code documents > 50k}
* — at that scale move the aggregation into SQL (GROUP BY TO_CHAR(meta_date,
* 'YYYY-MM')) and accept that the criteria/specification surface needs a
* parallel native-query path.
*/
public DocumentDensityResult getDensity(DensityFilters filters) {
List<UUID> ftsIds = resolveFtsIds(filters.text());
if (ftsIds != null && ftsIds.isEmpty()) {
return DocumentDensityResult.empty();
}
List<LocalDate> dates = loadFilteredDates(filters, ftsIds);
return aggregateByMonth(dates);
}
/**
* Returns the FTS-ranked document IDs when {@code text} is non-blank, or {@code null}
* when no full-text query is active. An empty list means the FTS query ran but
* matched zero documents — the caller short-circuits on that signal.
*/
private List<UUID> resolveFtsIds(String text) {
if (!StringUtils.hasText(text)) return null;
return documentRepository.findAllMatchingIdsByFts(text);
}
/** Loads matching documents and projects to non-null {@link LocalDate}s. */
private List<LocalDate> loadFilteredDates(DensityFilters filters, List<UUID> ftsIds) {
boolean hasFts = ftsIds != null;
Specification<Document> spec = buildSearchSpec(
hasFts, ftsIds, null, null,
filters.sender(), filters.receiver(),
filters.tags(), filters.tagQ(),
filters.status(), filters.tagOperator(), false);
return documentRepository.findAll(spec).stream()
.map(Document::getDocumentDate)
.filter(Objects::nonNull)
.toList();
}
/** Buckets {@code dates} into one {@link MonthBucket} per YYYY-MM and computes min/max. */
private DocumentDensityResult aggregateByMonth(List<LocalDate> dates) {
if (dates.isEmpty()) return DocumentDensityResult.empty();
Map<String, Integer> counts = new java.util.TreeMap<>();
for (LocalDate d : dates) {
counts.merge(YearMonth.from(d).toString(), 1, Integer::sum);
}
List<MonthBucket> buckets = counts.entrySet().stream()
.map(e -> new MonthBucket(e.getKey(), e.getValue()))
.toList();
LocalDate minDate = dates.stream().min(LocalDate::compareTo).orElse(null);
LocalDate maxDate = dates.stream().max(LocalDate::compareTo).orElse(null);
return new DocumentDensityResult(buckets, minDate, maxDate);
}
/** /**
* Lädt eine Datei hoch. * Lädt eine Datei hoch.
* - Prüft, ob ein Eintrag (aus Excel) schon existiert. * - Prüft, ob ein Eintrag (aus Excel) schon existiert.
@@ -378,7 +310,6 @@ public class DocumentService {
// 1. Einfache Felder Update // 1. Einfache Felder Update
doc.setTitle(dto.getTitle()); doc.setTitle(dto.getTitle());
doc.setDocumentDate(dto.getDocumentDate()); doc.setDocumentDate(dto.getDocumentDate());
applyDatePrecision(doc, dto);
doc.setLocation(dto.getLocation()); doc.setLocation(dto.getLocation());
doc.setTranscription(dto.getTranscription()); doc.setTranscription(dto.getTranscription());
doc.setSummary(dto.getSummary()); doc.setSummary(dto.getSummary());
@@ -447,26 +378,6 @@ public class DocumentService {
return saved; return saved;
} }
/**
* Applies the three date-precision fields only when the DTO carries them.
* A null field means "not submitted" — overwriting the stored value with null
* would fabricate a precision the user never chose, the exact dishonesty #666
* exists to prevent. A row with a genuinely-unknown precision must keep it when
* an unrelated edit (e.g. a location typo) is saved.
*/
private void applyDatePrecision(Document doc, DocumentUpdateDTO dto) {
if (dto.getMetaDatePrecision() != null) {
doc.setMetaDatePrecision(dto.getMetaDatePrecision());
}
if (dto.getMetaDateEnd() != null) {
doc.setMetaDateEnd(dto.getMetaDateEnd());
}
if (dto.getMetaDateRaw() != null) {
doc.setMetaDateRaw(dto.getMetaDateRaw());
}
}
@Transactional
public Document updateDocumentTags(UUID docId, List<String> tagNames) { public Document updateDocumentTags(UUID docId, List<String> tagNames) {
Document doc = documentRepository.findById(docId) Document doc = documentRepository.findById(docId)
.orElseThrow(() -> DomainException.notFound(ErrorCode.DOCUMENT_NOT_FOUND, "Document not found: " + docId)); .orElseThrow(() -> DomainException.notFound(ErrorCode.DOCUMENT_NOT_FOUND, "Document not found: " + docId));
@@ -501,17 +412,16 @@ public class DocumentService {
*/ */
@Transactional(readOnly = true) @Transactional(readOnly = true)
public List<UUID> findIdsForFilter(String text, LocalDate from, LocalDate to, UUID sender, UUID receiver, public List<UUID> findIdsForFilter(String text, LocalDate from, LocalDate to, UUID sender, UUID receiver,
List<String> tags, String tagQ, DocumentStatus status, TagOperator tagOperator, List<String> tags, String tagQ, DocumentStatus status, TagOperator tagOperator) {
boolean undated) {
boolean hasText = StringUtils.hasText(text); boolean hasText = StringUtils.hasText(text);
List<UUID> rankedIds = null; List<UUID> rankedIds = null;
if (hasText) { if (hasText) {
rankedIds = documentRepository.findAllMatchingIdsByFts(text); rankedIds = documentRepository.findRankedIdsByFts(text);
if (rankedIds.isEmpty()) return List.of(); if (rankedIds.isEmpty()) return List.of();
} }
Specification<Document> spec = buildSearchSpec( Specification<Document> spec = buildSearchSpec(
hasText, rankedIds, from, to, sender, receiver, tags, tagQ, status, tagOperator, undated); hasText, rankedIds, from, to, sender, receiver, tags, tagQ, status, tagOperator);
return documentRepository.findAll(spec).stream().map(Document::getId).toList(); return documentRepository.findAll(spec).stream().map(Document::getId).toList();
} }
@@ -525,8 +435,7 @@ public class DocumentService {
LocalDate from, LocalDate to, LocalDate from, LocalDate to,
UUID sender, UUID receiver, UUID sender, UUID receiver,
List<String> tags, String tagQ, List<String> tags, String tagQ,
DocumentStatus status, TagOperator tagOperator, DocumentStatus status, TagOperator tagOperator) {
boolean undated) {
boolean useOrLogic = tagOperator == TagOperator.OR; boolean useOrLogic = tagOperator == TagOperator.OR;
List<Set<UUID>> expandedTagSets = tagService.expandTagNamesToDescendantIdSets(tags); List<Set<UUID>> expandedTagSets = tagService.expandTagNamesToDescendantIdSets(tags);
Specification<Document> textSpec = hasText ? hasIds(ftsIds) : (root, query, cb) -> null; Specification<Document> textSpec = hasText ? hasIds(ftsIds) : (root, query, cb) -> null;
@@ -536,8 +445,7 @@ public class DocumentService {
.and(hasReceiver(receiver)) .and(hasReceiver(receiver))
.and(hasTags(expandedTagSets, useOrLogic)) .and(hasTags(expandedTagSets, useOrLogic))
.and(hasTagPartial(tagQ)) .and(hasTagPartial(tagQ))
.and(hasStatus(status)) .and(hasStatus(status));
.and(undatedOnly(undated));
} }
/** /**
@@ -658,7 +566,7 @@ public class DocumentService {
return saved; return saved;
} }
@Transactional(readOnly = true) // 0. Zuletzt aktive Dokumente (sortiert nach updatedAt DESC)
public List<Document> getRecentActivity(int size) { public List<Document> getRecentActivity(int size) {
return documentRepository.findAll( return documentRepository.findAll(
PageRequest.of(0, size, Sort.by(Sort.Direction.DESC, "updatedAt")) PageRequest.of(0, size, Sort.by(Sort.Direction.DESC, "updatedAt"))
@@ -666,85 +574,41 @@ public class DocumentService {
} }
// 1. Allgemeine Suche (für das Suchfeld im Frontend) // 1. Allgemeine Suche (für das Suchfeld im Frontend)
public DocumentSearchResult searchDocuments(String text, LocalDate from, LocalDate to, UUID sender, UUID receiver, List<String> tags, String tagQ, DocumentStatus status, DocumentSort sort, String dir, TagOperator tagOperator, boolean undated, Pageable pageable) { public DocumentSearchResult searchDocuments(String text, LocalDate from, LocalDate to, UUID sender, UUID receiver, List<String> tags, String tagQ, DocumentStatus status, DocumentSort sort, String dir, TagOperator tagOperator, Pageable pageable) {
boolean hasText = StringUtils.hasText(text); boolean hasText = StringUtils.hasText(text);
// Pure-text RELEVANCE: push pagination + ts_rank ordering into SQL — skip
// findAllMatchingIdsByFts entirely (ADR-008). This must run BEFORE any
// findAllMatchingIdsByFts call so the fast path is preserved. An active undated
// filter must NOT take this path: it bypasses buildSearchSpec, so the
// undatedOnly predicate would be silently dropped. By definition this path has
// no date/sender/receiver/tag/status filters, and undated documents are valid
// FTS hits already folded into the ranked page, so there is no separate undated
// count to report here.
if (!undated && isPureTextRelevance(hasText, sort, from, to, sender, receiver, tags, tagQ, status)) {
return relevanceSortedPageFromSql(text, pageable);
}
List<UUID> rankedIds = null; List<UUID> rankedIds = null;
if (hasText) { if (hasText) {
rankedIds = documentRepository.findAllMatchingIdsByFts(text); rankedIds = documentRepository.findRankedIdsByFts(text);
// FTS matched nothing → no results and, by definition, no undated matches either.
if (rankedIds.isEmpty()) return DocumentSearchResult.of(List.of()); if (rankedIds.isEmpty()) return DocumentSearchResult.of(List.of());
} }
// Global undated count for the current filter (q/tags/sender/receiver/status),
// forcing undatedOnly(true) and IGNORING the user's "Nur undatierte" toggle so
// it never collapses to the page slice and never double-counts (issue #668).
long undatedCount = countUndatedForFilter(hasText, rankedIds, from, to, sender, receiver, tags, tagQ, status, tagOperator);
return runSearch(text, hasText, rankedIds, from, to, sender, receiver, tags, tagQ, status, sort, dir, tagOperator, undated, pageable)
.withUndatedCount(undatedCount);
}
/**
* Counts every undated document (meta_date IS NULL) matching the active filter,
* across all pages, independent of the undated toggle. Reuses {@link #buildSearchSpec}
* with {@code undated=true} forced so the count tracks q/tags/sender/receiver/status.
* A {@code from}/{@code to} range excludes undated rows by the collision rule (#668),
* so the count is legitimately 0 inside a date range.
*/
private long countUndatedForFilter(boolean hasText, List<UUID> ftsIds,
LocalDate from, LocalDate to, UUID sender, UUID receiver,
List<String> tags, String tagQ, DocumentStatus status, TagOperator tagOperator) {
Specification<Document> undatedSpec = buildSearchSpec(
hasText, ftsIds, from, to, sender, receiver, tags, tagQ, status, tagOperator, true);
return documentRepository.count(undatedSpec);
}
/** The original search dispatch — produces the page slice + totals, sans undated count. */
private DocumentSearchResult runSearch(String text, boolean hasText, List<UUID> rankedIds,
LocalDate from, LocalDate to, UUID sender, UUID receiver,
List<String> tags, String tagQ, DocumentStatus status,
DocumentSort sort, String dir, TagOperator tagOperator,
boolean undated, Pageable pageable) {
// The pure-text RELEVANCE fast path is handled by the caller (searchDocuments)
// before findAllMatchingIdsByFts runs, so it never reaches here (ADR-008).
Specification<Document> spec = buildSearchSpec( Specification<Document> spec = buildSearchSpec(
hasText, rankedIds, from, to, sender, receiver, tags, tagQ, status, tagOperator, undated); hasText, rankedIds, from, to, sender, receiver, tags, tagQ, status, tagOperator);
// SENDER and RECEIVER sorts load the full match set and slice in-memory. // SENDER, RECEIVER and RELEVANCE sorts load the full match set and slice in memory.
// JPA's Sort.by("sender.lastName") generates an INNER JOIN that silently drops // JPA's Sort.by("sender.lastName") generates an INNER JOIN that silently drops
// documents with null sender/receivers. Cost scales with match count — // documents with null sender/receivers; RELEVANCE maps a DB order to an external
// acceptable while documents stays under ~10k rows. (ADR-008) // rank list. Cost scales linearly with match count — acceptable while documents
// stays under ~10k rows. Past that, replace with SQL-level LEFT JOIN sort.
if (sort == DocumentSort.RECEIVER) { if (sort == DocumentSort.RECEIVER) {
// In-memory sort on page slice (≤ page size rows) — acceptable
List<Document> sorted = sortByFirstReceiver(documentRepository.findAll(spec), dir); List<Document> sorted = sortByFirstReceiver(documentRepository.findAll(spec), dir);
return buildResultPaged(pageSlice(sorted, pageable), text, pageable, sorted.size()); return buildResultPaged(pageSlice(sorted, pageable), text, pageable, sorted.size());
} }
if (sort == DocumentSort.SENDER) { if (sort == DocumentSort.SENDER) {
// In-memory sort on page slice (≤ page size rows) — acceptable
List<Document> sorted = sortBySender(documentRepository.findAll(spec), dir); List<Document> sorted = sortBySender(documentRepository.findAll(spec), dir);
return buildResultPaged(pageSlice(sorted, pageable), text, pageable, sorted.size()); return buildResultPaged(pageSlice(sorted, pageable), text, pageable, sorted.size());
} }
// RELEVANCE with active filters: load filtered subset and sort in-memory by rank. // RELEVANCE: default when text present and no explicit sort given
boolean useRankOrder = hasText && (sort == null || sort == DocumentSort.RELEVANCE); boolean useRankOrder = hasText && (sort == null || sort == DocumentSort.RELEVANCE);
if (useRankOrder) { if (useRankOrder) {
List<Document> results = documentRepository.findAll(spec);
Map<UUID, Integer> rankMap = new HashMap<>(); Map<UUID, Integer> rankMap = new HashMap<>();
for (int i = 0; i < rankedIds.size(); i++) rankMap.put(rankedIds.get(i), i); for (int i = 0; i < rankedIds.size(); i++) rankMap.put(rankedIds.get(i), i);
List<Document> sorted = documentRepository.findAll(spec).stream() List<Document> sorted = results.stream()
.sorted(Comparator.comparingInt(doc -> rankMap.getOrDefault(doc.getId(), Integer.MAX_VALUE))) .sorted(Comparator.comparingInt(
doc -> rankMap.getOrDefault(doc.getId(), Integer.MAX_VALUE)))
.toList(); .toList();
return buildResultPaged(pageSlice(sorted, pageable), text, pageable, sorted.size()); return buildResultPaged(pageSlice(sorted, pageable), text, pageable, sorted.size());
} }
@@ -755,39 +619,6 @@ public class DocumentService {
return buildResultPaged(page.getContent(), text, pageable, page.getTotalElements()); return buildResultPaged(page.getContent(), text, pageable, page.getTotalElements());
} }
private static boolean isPureTextRelevance(boolean hasText, DocumentSort sort,
LocalDate from, LocalDate to, UUID sender, UUID receiver,
List<String> tags, String tagQ, DocumentStatus status) {
return hasText && (sort == null || sort == DocumentSort.RELEVANCE)
&& from == null && to == null && sender == null && receiver == null
&& (tags == null || tags.isEmpty()) && (tagQ == null || tagQ.isBlank()) && status == null;
}
/**
* Pure-text RELEVANCE path — pagination and ts_rank ordering pushed into SQL.
* Called when no non-text filters are active (ADR-008).
*/
private DocumentSearchResult relevanceSortedPageFromSql(String text, Pageable pageable) {
long rawOffset = pageable.getOffset();
if (rawOffset > Integer.MAX_VALUE) return DocumentSearchResult.of(List.of());
int offset = (int) rawOffset;
int limit = pageable.getPageSize();
FtsPage ftsPage = toFtsPage(documentRepository.findFtsPageRaw(text, offset, limit));
if (ftsPage.hits().isEmpty()) return DocumentSearchResult.of(List.of());
// Preserve ts_rank order from SQL across the JPA findAllById call.
Map<UUID, Integer> rankMap = new HashMap<>();
List<UUID> pageIds = new ArrayList<>();
for (int i = 0; i < ftsPage.hits().size(); i++) {
rankMap.put(ftsPage.hits().get(i).id(), i);
pageIds.add(ftsPage.hits().get(i).id());
}
List<Document> docs = documentRepository.findAllById(pageIds).stream()
.sorted(Comparator.comparingInt(d -> rankMap.getOrDefault(d.getId(), Integer.MAX_VALUE)))
.toList();
return buildResultPaged(docs, text, pageable, ftsPage.total());
}
private static <T> List<T> pageSlice(List<T> sorted, Pageable pageable) { private static <T> List<T> pageSlice(List<T> sorted, Pageable pageable) {
int from = Math.min((int) pageable.getOffset(), sorted.size()); int from = Math.min((int) pageable.getOffset(), sorted.size());
int to = Math.min(from + pageable.getPageSize(), sorted.size()); int to = Math.min(from + pageable.getPageSize(), sorted.size());
@@ -798,7 +629,7 @@ public class DocumentService {
return DocumentSearchResult.paged(enrichItems(slice, text), pageable, totalElements); return DocumentSearchResult.paged(enrichItems(slice, text), pageable, totalElements);
} }
private List<DocumentListItem> enrichItems(List<Document> documents, String text) { private List<DocumentSearchItem> enrichItems(List<Document> documents, String text) {
List<Document> colorResolved = resolveDocumentTagColors(documents); List<Document> colorResolved = resolveDocumentTagColors(documents);
Map<UUID, SearchMatchData> matchData = enrichWithMatchData(colorResolved, text); Map<UUID, SearchMatchData> matchData = enrichWithMatchData(colorResolved, text);
@@ -806,7 +637,7 @@ public class DocumentService {
Map<UUID, Integer> completionByDoc = fetchCompletionPercentages(docIds); Map<UUID, Integer> completionByDoc = fetchCompletionPercentages(docIds);
Map<UUID, List<ActivityActorDTO>> contributorsByDoc = auditLogQueryService.findRecentContributorsPerDocument(docIds); Map<UUID, List<ActivityActorDTO>> contributorsByDoc = auditLogQueryService.findRecentContributorsPerDocument(docIds);
return colorResolved.stream().map(doc -> toListItem( return colorResolved.stream().map(doc -> new DocumentSearchItem(
doc, doc,
matchData.getOrDefault(doc.getId(), SearchMatchData.empty()), matchData.getOrDefault(doc.getId(), SearchMatchData.empty()),
completionByDoc.getOrDefault(doc.getId(), 0), completionByDoc.getOrDefault(doc.getId(), 0),
@@ -814,30 +645,6 @@ public class DocumentService {
)).toList(); )).toList();
} }
private DocumentListItem toListItem(Document doc, SearchMatchData match, int completionPct, List<ActivityActorDTO> contributors) {
return new DocumentListItem(
doc.getId(),
doc.getTitle(),
doc.getOriginalFilename(),
doc.getThumbnailUrl(),
doc.getDocumentDate(),
doc.getMetaDatePrecision(),
doc.getMetaDateEnd(),
doc.getSender(),
List.copyOf(doc.getReceivers()),
List.copyOf(doc.getTags()),
doc.getArchiveBox(),
doc.getArchiveFolder(),
doc.getLocation(),
doc.getSummary(),
completionPct,
contributors,
match,
doc.getCreatedAt(),
doc.getUpdatedAt()
);
}
private Map<UUID, Integer> fetchCompletionPercentages(List<UUID> docIds) { private Map<UUID, Integer> fetchCompletionPercentages(List<UUID> docIds) {
return transcriptionBlockQueryService.getCompletionStats(docIds); return transcriptionBlockQueryService.getCompletionStats(docIds);
} }
@@ -845,21 +652,12 @@ public class DocumentService {
private Sort resolveSort(DocumentSort sort, String dir) { private Sort resolveSort(DocumentSort sort, String dir) {
Sort.Direction direction = "ASC".equalsIgnoreCase(dir) ? Sort.Direction.ASC : Sort.Direction.DESC; Sort.Direction direction = "ASC".equalsIgnoreCase(dir) ? Sort.Direction.ASC : Sort.Direction.DESC;
if (sort == null || sort == DocumentSort.DATE || sort == DocumentSort.RELEVANCE) { if (sort == null || sort == DocumentSort.DATE || sort == DocumentSort.RELEVANCE) {
// Undated documents (null documentDate) must order last regardless of return Sort.by(direction, "documentDate");
// direction — Postgres puts NULLs FIRST on ASC by default, which would
// surface the undated pile at the top with no explanation (issue #668).
// The title tiebreaker gives a stable total order when every row is
// null-dated (the "Nur undatierte" filter), so pagination is deterministic.
// title is @Column(nullable=false), so it is always present.
return Sort.by(
new Sort.Order(direction, "documentDate").nullsLast(),
Sort.Order.asc("title"));
} }
// SENDER and RECEIVER are sorted in-memory before this method is called // SENDER and RECEIVER are sorted in-memory before this method is called
return switch (sort) { return switch (sort) {
case TITLE -> Sort.by(direction, "title"); case TITLE -> Sort.by(direction, "title");
case UPLOAD_DATE -> Sort.by(direction, "createdAt"); case UPLOAD_DATE -> Sort.by(direction, "createdAt");
case UPDATED_AT -> Sort.by(direction, "updatedAt");
default -> Sort.by(direction, "documentDate"); default -> Sort.by(direction, "documentDate");
}; };
} }
@@ -938,7 +736,6 @@ public class DocumentService {
documentRepository.save(doc); documentRepository.save(doc);
} }
@Transactional(readOnly = true)
public Document getDocumentById(UUID id) { public Document getDocumentById(UUID id) {
Document doc = documentRepository.findById(id) Document doc = documentRepository.findById(id)
.orElseThrow(() -> DomainException.notFound(ErrorCode.DOCUMENT_NOT_FOUND, "Document not found: " + id)); .orElseThrow(() -> DomainException.notFound(ErrorCode.DOCUMENT_NOT_FOUND, "Document not found: " + id));
@@ -1146,28 +943,6 @@ public class DocumentService {
return result; return result;
} }
private static final int COL_ID = 0;
private static final int COL_RANK = 1;
private static final int COL_TOTAL = 2;
/**
* Maps raw Object[] rows from {@link DocumentRepository#findFtsPageRaw} to an
* {@link FtsPage}. Uses pattern-matching UUID cast to guard against driver-level
* type variance (some JDBC drivers return UUID as String).
*/
private static FtsPage toFtsPage(List<Object[]> rows) {
if (rows.isEmpty()) return new FtsPage(List.of(), 0);
long total = ((Number) rows.get(0)[COL_TOTAL]).longValue();
List<FtsHit> hits = rows.stream()
.map(r -> {
UUID id = r[COL_ID] instanceof UUID u ? u : UUID.fromString(r[COL_ID].toString());
double rank = ((Number) r[COL_RANK]).doubleValue();
return new FtsHit(id, rank);
})
.toList();
return new FtsPage(hits, total);
}
/** Clean text + highlight offsets parsed from a {@code ts_headline} sentinel-delimited string. */ /** Clean text + highlight offsets parsed from a {@code ts_headline} sentinel-delimited string. */
public record ParsedHighlight(String cleanText, List<MatchOffset> offsets) {} public record ParsedHighlight(String cleanText, List<MatchOffset> offsets) {}

View File

@@ -1,5 +1,5 @@
package org.raddatz.familienarchiv.document; package org.raddatz.familienarchiv.document;
public enum DocumentSort { public enum DocumentSort {
DATE, TITLE, SENDER, RECEIVER, UPLOAD_DATE, UPDATED_AT, RELEVANCE DATE, TITLE, SENDER, RECEIVER, UPLOAD_DATE, RELEVANCE
} }

View File

@@ -55,12 +55,6 @@ public class DocumentSpecifications {
return (root, query, cb) -> status == null ? null : cb.equal(root.get("status"), status); return (root, query, cb) -> status == null ? null : cb.equal(root.get("status"), status);
} }
// Filtert auf undatierte Dokumente (meta_date IS NULL) — für die "Nur undatierte"-Triage.
// false → kein Prädikat (no-op), true → documentDate IS NULL (issue #668).
public static Specification<Document> undatedOnly(boolean undated) {
return (root, query, cb) -> undated ? cb.isNull(root.get("documentDate")) : null;
}
/** /**
* Filtert nach vorausgeweiteten Tag-ID-Sets mit AND- oder OR-Logik. * Filtert nach vorausgeweiteten Tag-ID-Sets mit AND- oder OR-Logik.
* *

View File

@@ -11,11 +11,6 @@ import org.raddatz.familienarchiv.ocr.ScriptType;
public class DocumentUpdateDTO { public class DocumentUpdateDTO {
private String title; private String title;
private LocalDate documentDate; private LocalDate documentDate;
private DatePrecision metaDatePrecision;
private LocalDate metaDateEnd;
private String metaDateRaw;
private String senderText;
private String receiverText;
private String location; private String location;
private String documentLocation; private String documentLocation;
private String archiveBox; private String archiveBox;

View File

@@ -1,6 +0,0 @@
package org.raddatz.familienarchiv.document;
import java.util.UUID;
/** A single document hit from a paginated FTS query — id and its ts_rank score. */
record FtsHit(UUID id, double rank) {}

View File

@@ -1,6 +0,0 @@
package org.raddatz.familienarchiv.document;
import java.util.List;
/** One page of FTS results — the ranked hit list for this page and the total match count. */
record FtsPage(List<FtsHit> hits, long total) {}

View File

@@ -1,10 +0,0 @@
package org.raddatz.familienarchiv.document;
import io.swagger.v3.oas.annotations.media.Schema;
public record MonthBucket(
@Schema(requiredMode = Schema.RequiredMode.REQUIRED, example = "1915-08")
String month,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
int count
) {}

View File

@@ -27,9 +27,7 @@ public class CommentController {
// ─── Block (transcription) comments ──────────────────────────────────────── // ─── Block (transcription) comments ────────────────────────────────────────
@GetMapping("/api/documents/{documentId}/transcription-blocks/{blockId}/comments") @GetMapping("/api/documents/{documentId}/transcription-blocks/{blockId}/comments")
public List<DocumentComment> getBlockComments( public List<DocumentComment> getBlockComments(@PathVariable UUID blockId) {
@PathVariable UUID documentId,
@PathVariable UUID blockId) {
return commentService.getCommentsForBlock(blockId); return commentService.getCommentsForBlock(blockId);
} }
@@ -50,7 +48,6 @@ public class CommentController {
@RequirePermission({Permission.ANNOTATE_ALL, Permission.WRITE_ALL}) @RequirePermission({Permission.ANNOTATE_ALL, Permission.WRITE_ALL})
public DocumentComment replyToBlockComment( public DocumentComment replyToBlockComment(
@PathVariable UUID documentId, @PathVariable UUID documentId,
@PathVariable UUID blockId,
@PathVariable UUID commentId, @PathVariable UUID commentId,
@RequestBody CreateCommentDTO dto, @RequestBody CreateCommentDTO dto,
Authentication authentication) { Authentication authentication) {

View File

@@ -1,6 +0,0 @@
package org.raddatz.familienarchiv.document.comment;
import jakarta.annotation.Nullable;
import java.util.UUID;
public record CommentData(@Nullable UUID annotationId, String preview) {}

View File

@@ -13,7 +13,6 @@ import org.raddatz.familienarchiv.document.comment.DocumentComment;
import org.raddatz.familienarchiv.document.transcription.TranscriptionBlock; import org.raddatz.familienarchiv.document.transcription.TranscriptionBlock;
import org.raddatz.familienarchiv.document.comment.CommentRepository; import org.raddatz.familienarchiv.document.comment.CommentRepository;
import org.raddatz.familienarchiv.notification.NotificationService; import org.raddatz.familienarchiv.notification.NotificationService;
import org.jsoup.Jsoup;
import org.springframework.stereotype.Service; import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional; import org.springframework.transaction.annotation.Transactional;
@@ -29,29 +28,21 @@ import java.util.UUID;
@RequiredArgsConstructor @RequiredArgsConstructor
public class CommentService { public class CommentService {
private static final int PREVIEW_MAX_CHARS = 120;
private final CommentRepository commentRepository; private final CommentRepository commentRepository;
private final UserService userService; private final UserService userService;
private final NotificationService notificationService; private final NotificationService notificationService;
private final AuditService auditService; private final AuditService auditService;
private final TranscriptionService transcriptionService; private final TranscriptionService transcriptionService;
public Map<UUID, CommentData> findDataByIds(Collection<UUID> commentIds) { public Map<UUID, UUID> findAnnotationIdsByIds(Collection<UUID> commentIds) {
if (commentIds == null || commentIds.isEmpty()) return Map.of(); if (commentIds == null || commentIds.isEmpty()) return Map.of();
Map<UUID, CommentData> result = new HashMap<>(); Map<UUID, UUID> result = new HashMap<>();
for (DocumentComment c : commentRepository.findAllById(commentIds)) { for (DocumentComment c : commentRepository.findAllById(commentIds)) {
result.put(c.getId(), new CommentData(c.getAnnotationId(), stripAndTruncate(c.getContent()))); if (c.getAnnotationId() != null) result.put(c.getId(), c.getAnnotationId());
} }
return result; return result;
} }
private String stripAndTruncate(String html) {
if (html == null || html.isBlank()) return "";
String text = Jsoup.parse(html).text().trim();
return text.length() > PREVIEW_MAX_CHARS ? text.substring(0, PREVIEW_MAX_CHARS) : text;
}
public List<DocumentComment> getCommentsForBlock(UUID blockId) { public List<DocumentComment> getCommentsForBlock(UUID blockId) {
List<DocumentComment> roots = commentRepository.findByBlockIdAndParentIdIsNull(blockId); List<DocumentComment> roots = commentRepository.findByBlockIdAndParentIdIsNull(blockId);
return withRepliesAndMentions(roots); return withRepliesAndMentions(roots);

View File

@@ -43,7 +43,7 @@ public class TranscriptionBlockController {
@PostMapping @PostMapping
@ResponseStatus(HttpStatus.CREATED) @ResponseStatus(HttpStatus.CREATED)
@RequirePermission({Permission.ANNOTATE_ALL, Permission.WRITE_ALL}) @RequirePermission(Permission.WRITE_ALL)
public TranscriptionBlock createBlock( public TranscriptionBlock createBlock(
@PathVariable UUID documentId, @PathVariable UUID documentId,
@Valid @RequestBody CreateTranscriptionBlockDTO dto, @Valid @RequestBody CreateTranscriptionBlockDTO dto,
@@ -53,7 +53,7 @@ public class TranscriptionBlockController {
} }
@PutMapping("/{blockId}") @PutMapping("/{blockId}")
@RequirePermission({Permission.ANNOTATE_ALL, Permission.WRITE_ALL}) @RequirePermission(Permission.WRITE_ALL)
public TranscriptionBlock updateBlock( public TranscriptionBlock updateBlock(
@PathVariable UUID documentId, @PathVariable UUID documentId,
@PathVariable UUID blockId, @PathVariable UUID blockId,
@@ -65,7 +65,7 @@ public class TranscriptionBlockController {
@DeleteMapping("/{blockId}") @DeleteMapping("/{blockId}")
@ResponseStatus(HttpStatus.NO_CONTENT) @ResponseStatus(HttpStatus.NO_CONTENT)
@RequirePermission({Permission.ANNOTATE_ALL, Permission.WRITE_ALL}) @RequirePermission(Permission.WRITE_ALL)
public void deleteBlock( public void deleteBlock(
@PathVariable UUID documentId, @PathVariable UUID documentId,
@PathVariable UUID blockId) { @PathVariable UUID blockId) {
@@ -73,7 +73,7 @@ public class TranscriptionBlockController {
} }
@PutMapping("/reorder") @PutMapping("/reorder")
@RequirePermission({Permission.ANNOTATE_ALL, Permission.WRITE_ALL}) @RequirePermission(Permission.WRITE_ALL)
public List<TranscriptionBlock> reorderBlocks( public List<TranscriptionBlock> reorderBlocks(
@PathVariable UUID documentId, @PathVariable UUID documentId,
@RequestBody ReorderTranscriptionBlocksDTO dto) { @RequestBody ReorderTranscriptionBlocksDTO dto) {
@@ -82,7 +82,7 @@ public class TranscriptionBlockController {
} }
@PutMapping("/{blockId}/review") @PutMapping("/{blockId}/review")
@RequirePermission({Permission.ANNOTATE_ALL, Permission.WRITE_ALL}) @RequirePermission(Permission.WRITE_ALL)
public TranscriptionBlock reviewBlock( public TranscriptionBlock reviewBlock(
@PathVariable UUID documentId, @PathVariable UUID documentId,
@PathVariable UUID blockId, @PathVariable UUID blockId,
@@ -92,7 +92,7 @@ public class TranscriptionBlockController {
} }
@PutMapping("/review-all") @PutMapping("/review-all")
@RequirePermission({Permission.ANNOTATE_ALL, Permission.WRITE_ALL}) @RequirePermission(Permission.WRITE_ALL)
public List<TranscriptionBlock> markAllBlocksReviewed( public List<TranscriptionBlock> markAllBlocksReviewed(
@PathVariable UUID documentId, @PathVariable UUID documentId,
Authentication authentication) { Authentication authentication) {

View File

@@ -10,21 +10,11 @@ public class DomainException extends RuntimeException {
private final ErrorCode code; private final ErrorCode code;
private final HttpStatus status; private final HttpStatus status;
/** Seconds until the rate-limit window resets; {@code null} when not applicable. */
private final Long retryAfterSeconds;
public DomainException(ErrorCode code, HttpStatus status, String developerMessage) { public DomainException(ErrorCode code, HttpStatus status, String developerMessage) {
super(developerMessage); super(developerMessage);
this.code = code; this.code = code;
this.status = status; this.status = status;
this.retryAfterSeconds = null;
}
private DomainException(ErrorCode code, HttpStatus status, String developerMessage, Long retryAfterSeconds) {
super(developerMessage);
this.code = code;
this.status = status;
this.retryAfterSeconds = retryAfterSeconds;
} }
public ErrorCode getCode() { public ErrorCode getCode() {
@@ -35,11 +25,6 @@ public class DomainException extends RuntimeException {
return status; return status;
} }
/** Returns the {@code Retry-After} value in seconds, or {@code null} if not set. */
public Long getRetryAfterSeconds() {
return retryAfterSeconds;
}
// --- Static factories for common cases --- // --- Static factories for common cases ---
public static DomainException notFound(ErrorCode code, String message) { public static DomainException notFound(ErrorCode code, String message) {
@@ -54,11 +39,6 @@ public class DomainException extends RuntimeException {
return new DomainException(ErrorCode.UNAUTHORIZED, HttpStatus.UNAUTHORIZED, message); return new DomainException(ErrorCode.UNAUTHORIZED, HttpStatus.UNAUTHORIZED, message);
} }
public static DomainException invalidCredentials() {
return new DomainException(ErrorCode.INVALID_CREDENTIALS, HttpStatus.UNAUTHORIZED,
"Invalid email or password");
}
public static DomainException conflict(ErrorCode code, String message) { public static DomainException conflict(ErrorCode code, String message) {
return new DomainException(code, HttpStatus.CONFLICT, message); return new DomainException(code, HttpStatus.CONFLICT, message);
} }
@@ -70,12 +50,4 @@ public class DomainException extends RuntimeException {
public static DomainException internal(ErrorCode code, String message) { public static DomainException internal(ErrorCode code, String message) {
return new DomainException(code, HttpStatus.INTERNAL_SERVER_ERROR, message); return new DomainException(code, HttpStatus.INTERNAL_SERVER_ERROR, message);
} }
public static DomainException tooManyRequests(ErrorCode code, String message) {
return new DomainException(code, HttpStatus.TOO_MANY_REQUESTS, message);
}
public static DomainException tooManyRequests(ErrorCode code, String message, long retryAfterSeconds) {
return new DomainException(code, HttpStatus.TOO_MANY_REQUESTS, message, retryAfterSeconds);
}
} }

View File

@@ -30,8 +30,6 @@ public enum ErrorCode {
// --- Users --- // --- Users ---
/** A user with the given ID or username does not exist. 404 */ /** A user with the given ID or username does not exist. 404 */
USER_NOT_FOUND, USER_NOT_FOUND,
/** A group with the given ID does not exist. 404 */
GROUP_NOT_FOUND,
/** The supplied email address is already used by another account. 409 */ /** The supplied email address is already used by another account. 409 */
EMAIL_ALREADY_IN_USE, EMAIL_ALREADY_IN_USE,
/** The supplied current password does not match the stored hash. 400 */ /** The supplied current password does not match the stored hash. 400 */
@@ -40,8 +38,6 @@ public enum ErrorCode {
// --- Import --- // --- Import ---
/** A mass import is already in progress; only one can run at a time. 409 */ /** A mass import is already in progress; only one can run at a time. 409 */
IMPORT_ALREADY_RUNNING, IMPORT_ALREADY_RUNNING,
/** A canonical import artifact is missing, unreadable, or missing a required header. 400 */
IMPORT_ARTIFACT_INVALID,
// --- Thumbnails --- // --- Thumbnails ---
/** A thumbnail backfill is already in progress; only one can run at a time. 409 */ /** A thumbnail backfill is already in progress; only one can run at a time. 409 */
@@ -56,24 +52,14 @@ public enum ErrorCode {
INVITE_REVOKED, INVITE_REVOKED,
/** The invite has passed its expiry date. 410 */ /** The invite has passed its expiry date. 410 */
INVITE_EXPIRED, INVITE_EXPIRED,
/** A group cannot be deleted because one or more active invites reference it. 409 */
GROUP_HAS_ACTIVE_INVITES,
// --- Auth --- // --- Auth ---
/** The request is not authenticated. 401 */ /** The request is not authenticated. 401 */
UNAUTHORIZED, UNAUTHORIZED,
/** The authenticated user lacks the required permission. 403 */ /** The authenticated user lacks the required permission. 403 */
FORBIDDEN, FORBIDDEN,
/** The supplied email/password combination does not match any active account. 401 */
INVALID_CREDENTIALS,
/** The session has expired or been invalidated. 401 */
SESSION_EXPIRED,
/** The password-reset token is missing, expired, or already used. 400 */ /** The password-reset token is missing, expired, or already used. 400 */
INVALID_RESET_TOKEN, INVALID_RESET_TOKEN,
/** CSRF token is missing or does not match the expected value. 403 */
CSRF_TOKEN_MISSING,
/** The login rate limit has been exceeded for this IP/email combination. 429 */
TOO_MANY_LOGIN_ATTEMPTS,
// --- Annotations --- // --- Annotations ---
/** The annotation with the given ID does not exist. 404 */ /** The annotation with the given ID does not exist. 404 */

View File

@@ -2,7 +2,6 @@ package org.raddatz.familienarchiv.exception;
import java.util.stream.Collectors; import java.util.stream.Collectors;
import io.sentry.Sentry;
import jakarta.validation.ConstraintViolationException; import jakarta.validation.ConstraintViolationException;
import org.raddatz.familienarchiv.exception.DomainException; import org.raddatz.familienarchiv.exception.DomainException;
import org.raddatz.familienarchiv.exception.ErrorCode; import org.raddatz.familienarchiv.exception.ErrorCode;
@@ -16,18 +15,15 @@ import org.springframework.web.server.ResponseStatusException;
import lombok.extern.slf4j.Slf4j; import lombok.extern.slf4j.Slf4j;
// "Handler" is Spring's @RestControllerAdvice naming convention — not a generic suffix.
@RestControllerAdvice @RestControllerAdvice
@Slf4j @Slf4j
public class GlobalExceptionHandler { public class GlobalExceptionHandler {
@ExceptionHandler(DomainException.class) @ExceptionHandler(DomainException.class)
public ResponseEntity<ErrorResponse> handleDomain(DomainException ex) { public ResponseEntity<ErrorResponse> handleDomain(DomainException ex) {
var builder = ResponseEntity.status(ex.getStatus()); return ResponseEntity
if (ex.getRetryAfterSeconds() != null) { .status(ex.getStatus())
builder = builder.header("Retry-After", String.valueOf(ex.getRetryAfterSeconds())); .body(new ErrorResponse(ex.getCode(), ex.getMessage()));
}
return builder.body(new ErrorResponse(ex.getCode(), ex.getMessage()));
} }
@ExceptionHandler(MethodArgumentNotValidException.class) @ExceptionHandler(MethodArgumentNotValidException.class)
@@ -66,7 +62,6 @@ public class GlobalExceptionHandler {
@ExceptionHandler(Exception.class) @ExceptionHandler(Exception.class)
public ResponseEntity<ErrorResponse> handleGeneric(Exception ex) { public ResponseEntity<ErrorResponse> handleGeneric(Exception ex) {
Sentry.captureException(ex);
log.error("Unhandled exception", ex); log.error("Unhandled exception", ex);
return ResponseEntity.internalServerError() return ResponseEntity.internalServerError()
.body(new ErrorResponse(ErrorCode.INTERNAL_ERROR, "An unexpected error occurred")); .body(new ErrorResponse(ErrorCode.INTERNAL_ERROR, "An unexpected error occurred"));

View File

@@ -56,10 +56,6 @@ public class GeschichteService {
// ─── Read API ──────────────────────────────────────────────────────────── // ─── Read API ────────────────────────────────────────────────────────────
public long countPublished() {
return geschichteRepository.count(GeschichteSpecifications.hasStatus(GeschichteStatus.PUBLISHED));
}
public Geschichte getById(UUID id) { public Geschichte getById(UUID id) {
Geschichte g = geschichteRepository.findById(id) Geschichte g = geschichteRepository.findById(id)
.orElseThrow(() -> DomainException.notFound( .orElseThrow(() -> DomainException.notFound(
@@ -81,10 +77,8 @@ public class GeschichteService {
GeschichteStatus effective = currentUserHasBlogWrite() ? status : GeschichteStatus.PUBLISHED; GeschichteStatus effective = currentUserHasBlogWrite() ? status : GeschichteStatus.PUBLISHED;
int safeLimit = limit <= 0 ? DEFAULT_LIMIT : Math.min(limit, MAX_LIMIT); int safeLimit = limit <= 0 ? DEFAULT_LIMIT : Math.min(limit, MAX_LIMIT);
UUID authorId = effective == GeschichteStatus.DRAFT ? currentUser().getId() : null;
Specification<Geschichte> spec = Specification.allOf( Specification<Geschichte> spec = Specification.allOf(
GeschichteSpecifications.hasStatus(effective), GeschichteSpecifications.hasStatus(effective),
GeschichteSpecifications.hasAuthor(authorId),
GeschichteSpecifications.hasAllPersons(personIds), GeschichteSpecifications.hasAllPersons(personIds),
GeschichteSpecifications.hasDocument(documentId), GeschichteSpecifications.hasDocument(documentId),
GeschichteSpecifications.orderByDisplayDateDesc() GeschichteSpecifications.orderByDisplayDateDesc()

View File

@@ -42,12 +42,6 @@ public final class GeschichteSpecifications {
}; };
} }
// null authorId → no restriction (PUBLISHED path passes null; Spring Data skips null predicates)
public static Specification<Geschichte> hasAuthor(UUID authorId) {
return (root, query, cb) ->
authorId == null ? null : cb.equal(root.get("author").get("id"), authorId);
}
public static Specification<Geschichte> hasDocument(UUID documentId) { public static Specification<Geschichte> hasDocument(UUID documentId) {
return (root, query, cb) -> { return (root, query, cb) -> {
if (documentId == null) return null; if (documentId == null) return null;

View File

@@ -1,94 +0,0 @@
package org.raddatz.familienarchiv.importing;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.raddatz.familienarchiv.exception.DomainException;
import org.raddatz.familienarchiv.exception.ErrorCode;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;
import java.io.File;
import java.time.LocalDateTime;
import java.util.List;
/**
* Runs the four canonical loaders in their real dependency order — encoded explicitly
* here, not implied by call order — and owns the async runner plus the {@link ImportStatus}
* state machine the admin UI consumes. The orchestrator smoke-checks that all four
* artifacts are present before starting, failing fast rather than half-loading tags but no
* documents. A malformed artifact (a loader throwing) sets {@code FAILED}; an individual
* bad file is surfaced through the {@link ImportStatus.SkippedFile} mechanism instead.
*/
@Service
@RequiredArgsConstructor
@Slf4j
public class CanonicalImportOrchestrator {
private static final String TAG_TREE_ARTIFACT = "canonical-tag-tree.xlsx";
private static final String PERSONS_ARTIFACT = "canonical-persons.xlsx";
private static final String PERSONS_TREE_ARTIFACT = "canonical-persons-tree.json";
private static final String DOCUMENTS_ARTIFACT = "canonical-documents.xlsx";
private final TagTreeImporter tagTreeImporter;
private final PersonRegisterImporter personRegisterImporter;
private final PersonTreeImporter personTreeImporter;
private final DocumentImporter documentImporter;
@Value("${app.import.dir:/import}")
private String canonicalDir;
private volatile ImportStatus currentStatus = new ImportStatus(
ImportStatus.State.IDLE, "IMPORT_IDLE", "Kein Import gestartet.", 0, List.of(), null);
public ImportStatus getStatus() {
return currentStatus;
}
@Async
public void runImportAsync() {
if (currentStatus.state() == ImportStatus.State.RUNNING) {
throw DomainException.conflict(ErrorCode.IMPORT_ALREADY_RUNNING, "A mass import is already in progress");
}
runImport();
}
/** Synchronous entry point — wrapped by {@link #runImportAsync()} and called directly in tests. */
void runImport() {
currentStatus = new ImportStatus(ImportStatus.State.RUNNING, "IMPORT_RUNNING",
"Import läuft...", 0, List.of(), LocalDateTime.now());
try {
File tagTree = requireArtifact(TAG_TREE_ARTIFACT);
File persons = requireArtifact(PERSONS_ARTIFACT);
File personsTree = requireArtifact(PERSONS_TREE_ARTIFACT);
File documents = requireArtifact(DOCUMENTS_ARTIFACT);
// Dependency DAG: documents need persons + tags; the tree needs persons.
tagTreeImporter.load(tagTree);
personRegisterImporter.load(persons);
personTreeImporter.load(personsTree);
DocumentImporter.LoadResult result = documentImporter.load(documents);
currentStatus = new ImportStatus(ImportStatus.State.DONE, "IMPORT_DONE",
"Import abgeschlossen. " + result.processed() + " Dokumente verarbeitet.",
result.processed(), result.skippedFiles(), currentStatus.startedAt());
} catch (DomainException e) {
log.error("Canonical import failed: {}", e.getMessage());
currentStatus = new ImportStatus(ImportStatus.State.FAILED, "IMPORT_FAILED_ARTIFACT",
"Fehler: " + e.getMessage(), 0, List.of(), currentStatus.startedAt());
} catch (Exception e) {
log.error("Canonical import failed", e);
currentStatus = new ImportStatus(ImportStatus.State.FAILED, "IMPORT_FAILED_INTERNAL",
"Fehler: " + e.getMessage(), 0, List.of(), currentStatus.startedAt());
}
}
private File requireArtifact(String name) {
File artifact = new File(canonicalDir, name);
if (!artifact.isFile()) {
throw DomainException.badRequest(ErrorCode.IMPORT_ARTIFACT_INVALID,
"Missing canonical artifact: " + name);
}
return artifact;
}
}

View File

@@ -1,133 +0,0 @@
package org.raddatz.familienarchiv.importing;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DateUtil;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.raddatz.familienarchiv.exception.DomainException;
import org.raddatz.familienarchiv.exception.ErrorCode;
import java.io.File;
import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
/**
* Value-level POI helper for the canonical import artifacts. No Spring, no domain
* knowledge: it opens a workbook, maps the header row to column indices by name, and
* yields typed rows whose cells are looked up by header name — the seam that replaces
* the old positional {@code @Value app.import.col.*} indices. List columns are split on
* the pipe delimiter the normalizer emits.
*/
public final class CanonicalSheetReader {
private CanonicalSheetReader() {
}
/** A single data row, addressable by canonical header name (never by index). */
public static final class Row {
private final Map<String, Integer> headerIndex;
private final List<String> cells;
private Row(Map<String, Integer> headerIndex, List<String> cells) {
this.headerIndex = headerIndex;
this.cells = cells;
}
/** Trimmed cell value for the named header, or "" when absent/blank. */
public String get(String header) {
Integer index = headerIndex.get(header);
if (index == null || index >= cells.size()) return "";
String value = cells.get(index);
return value == null ? "" : value.trim();
}
}
/**
* Reads all data rows from the first sheet, validating that every required header is
* present. Throws a fail-closed {@link DomainException} on a missing header so a
* loader never silently maps the wrong column.
*/
public static List<Row> readRows(File file, List<String> requiredHeaders) {
try (FileInputStream fis = new FileInputStream(file);
Workbook workbook = WorkbookFactory.create(fis)) {
Sheet sheet = workbook.getSheetAt(0);
org.apache.poi.ss.usermodel.Row headerRow = sheet.getRow(sheet.getFirstRowNum());
Map<String, Integer> headerIndex = mapHeaders(headerRow);
requireHeaders(file, headerIndex, requiredHeaders);
List<Row> rows = new ArrayList<>();
for (int i = sheet.getFirstRowNum() + 1; i <= sheet.getLastRowNum(); i++) {
org.apache.poi.ss.usermodel.Row poiRow = sheet.getRow(i);
if (poiRow == null) continue;
rows.add(new Row(headerIndex, readCells(poiRow, headerIndex.size())));
}
return rows;
} catch (DomainException e) {
throw e;
} catch (Exception e) {
throw DomainException.badRequest(ErrorCode.IMPORT_ARTIFACT_INVALID,
"Unreadable canonical artifact: " + file.getName());
}
}
/** Splits a pipe-delimited list column into trimmed, non-empty segments. */
public static List<String> splitList(String raw) {
if (raw == null || raw.isBlank()) return List.of();
return Arrays.stream(raw.split("\\|"))
.map(String::trim)
.filter(s -> !s.isEmpty())
.toList();
}
private static Map<String, Integer> mapHeaders(org.apache.poi.ss.usermodel.Row headerRow) {
if (headerRow == null) {
return Map.of();
}
Map<String, Integer> headerIndex = new HashMap<>();
for (int c = 0; c < headerRow.getLastCellNum(); c++) {
String name = cellToString(headerRow.getCell(c)).trim();
if (!name.isEmpty()) headerIndex.putIfAbsent(name, c);
}
return headerIndex;
}
private static void requireHeaders(File file, Map<String, Integer> headerIndex, List<String> requiredHeaders) {
for (String header : requiredHeaders) {
if (!headerIndex.containsKey(header)) {
throw DomainException.badRequest(ErrorCode.IMPORT_ARTIFACT_INVALID,
"Missing required header '" + header + "' in artifact " + file.getName());
}
}
}
private static List<String> readCells(org.apache.poi.ss.usermodel.Row poiRow, int columnCount) {
int width = Math.max(columnCount, poiRow.getLastCellNum());
List<String> cells = new ArrayList<>(width);
for (int c = 0; c < width; c++) {
cells.add(cellToString(poiRow.getCell(c)));
}
return cells;
}
private static String cellToString(Cell cell) {
if (cell == null) return "";
return switch (cell.getCellType()) {
case STRING -> cell.getStringCellValue();
case NUMERIC -> {
if (DateUtil.isCellDateFormatted(cell)) {
yield cell.getLocalDateTimeCellValue().toLocalDate().toString();
}
yield String.valueOf((long) cell.getNumericCellValue());
}
case BOOLEAN -> String.valueOf(cell.getBooleanCellValue());
default -> "";
};
}
}

View File

@@ -1,364 +0,0 @@
package org.raddatz.familienarchiv.importing;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.raddatz.familienarchiv.document.DatePrecision;
import org.raddatz.familienarchiv.document.Document;
import org.raddatz.familienarchiv.document.DocumentService;
import org.raddatz.familienarchiv.document.DocumentStatus;
import org.raddatz.familienarchiv.document.ThumbnailAsyncRunner;
import org.raddatz.familienarchiv.exception.DomainException;
import org.raddatz.familienarchiv.exception.ErrorCode;
import org.raddatz.familienarchiv.person.Person;
import org.raddatz.familienarchiv.person.PersonService;
import org.raddatz.familienarchiv.person.PersonType;
import org.raddatz.familienarchiv.person.PersonUpsertCommand;
import org.raddatz.familienarchiv.tag.Tag;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;
import org.springframework.transaction.annotation.Transactional;
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;
import org.raddatz.familienarchiv.tag.TagService;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.UUID;
import java.util.regex.Pattern;
/**
* Loads {@code canonical-documents.xlsx} into the document domain. Java performs no
* semantic transformation: the normalizer already resolved people to slugs and dates to
* ISO values. This loader maps columns by header name, routes each attribution
* register-first (always retaining the raw cell in {@code sender_text}/{@code receiver_text}),
* parses clean dates, and keeps the S3/thumbnail plumbing.
*
* <p>The import corpus is uniform — every PDF is named {@code <index>.pdf} flat in the import
* dir — so a document's PDF is resolved <em>directly by its index</em>:
* {@code importDir.resolve(index + ".pdf")}. The {@code index} is still hostile input
* regardless of upstream trust (CWE-22 does not care it came from our Python tool): it is
* validated against a strict catalog pattern with {@link #isValidImportIndex} (no path
* separators, no {@code .}/{@code ..}, no absolute path, no slash homoglyphs) and the
* resolved path is asserted to stay inside the import dir in {@link #resolvePdfByIndex} as
* defense-in-depth. The {@code %PDF} magic-byte check still gates upload.
*/
@Component
@RequiredArgsConstructor
@Slf4j
public class DocumentImporter {
static final List<String> REQUIRED_HEADERS = List.of(
"index", "sender_person_id", "sender_name",
"receiver_person_ids", "receiver_names", "date_iso", "date_raw", "date_precision");
// Catalog index shape: 14 letters (ASCII + Latin-1 letters, e.g. the German "ü" in
// "Mü-0001"), one or more hyphens (the corpus has a few "C--0029" data-entry artefacts),
// digits, and an optional trailing "x" the normalizer recognises. Anchored, with no
// separator / dot / slash characters in the class, so "<index>.pdf" can never traverse.
// NOTE: `\d` here is intentionally ASCII-only ([0-9]). Java's java.util.regex matches `\d`
// against [0-9] unless Pattern.UNICODE_CHARACTER_CLASS is set — do NOT add that flag, or
// Arabic-Indic / fullwidth digits would silently widen the accepted set.
private static final Pattern INDEX_PATTERN =
Pattern.compile("[A-Za-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u00FF]{1,4}-+\\d+x?");
private final DocumentService documentService;
private final PersonService personService;
private final TagService tagService;
private final S3Client s3Client;
private final ThumbnailAsyncRunner thumbnailAsyncRunner;
@Value("${app.s3.bucket:familienarchiv}")
private String bucketName;
@Value("${app.import.dir:/import}")
private String importDir;
/** Outcome of loading the document sheet: processed count + per-file skips. */
public record LoadResult(int processed, List<ImportStatus.SkippedFile> skippedFiles) {}
// One transaction for the whole sheet keeps the Hibernate session open so an existing
// document's lazy receivers collection initialises during an idempotent re-import.
// Invoked cross-bean from the orchestrator, so the @Transactional proxy applies.
@Transactional
public LoadResult load(File artifact) {
List<CanonicalSheetReader.Row> rows = CanonicalSheetReader.readRows(artifact, REQUIRED_HEADERS);
int processed = 0;
List<ImportStatus.SkippedFile> skipped = new ArrayList<>();
// 1-based source row number for ops triage breadcrumbs (the spreadsheet header is row 1,
// so the first data row is row 2 — matches what an operator sees in the .xlsx).
int rowNumber = 1;
for (CanonicalSheetReader.Row row : rows) {
rowNumber++;
String index = row.get("index");
if (index.isBlank()) continue;
Optional<ImportStatus.SkipReason> skipReason = importRow(row, index, rowNumber);
if (skipReason.isPresent()) {
skipped.add(new ImportStatus.SkippedFile(index, skipReason.get()));
} else {
processed++;
}
}
log.info("Imported {} documents from {} ({} skipped)", processed, artifact.getName(), skipped.size());
return new LoadResult(processed, skipped);
}
private Optional<ImportStatus.SkipReason> importRow(CanonicalSheetReader.Row row, String index, int rowNumber) {
if (!isValidImportIndex(index)) {
// Breadcrumb is the source row number, NOT the raw (possibly-hostile) index — an
// operator triaging the import can find the offending row in the .xlsx without us
// echoing attacker-controlled input into the log.
log.warn("Skipping import row {}: index rejected (fails catalog-shape validation)", rowNumber);
return Optional.of(ImportStatus.SkipReason.INVALID_FILENAME_PATH_TRAVERSAL);
}
Optional<File> resolved = resolvePdfByIndex(index, rowNumber);
if (resolved.isEmpty()) {
// Distinct from the "index rejected" skip above: the index is VALID but no
// <index>.pdf is on disk, so the row becomes a normal PLACEHOLDER (not skipped). The
// index is a validated catalog id (no hostile content), so it is safe to log here —
// this surfaces a corpus that drifts from the "<index>.pdf" assumption (e.g. a file
// that arrived under a different name) rather than dropping it silently.
log.info("Import row {}: index {} is valid but {}.pdf is absent — creating PLACEHOLDER",
rowNumber, index, index);
} else {
try {
if (!isPdfMagicBytes(resolved.get())) {
return Optional.of(ImportStatus.SkipReason.INVALID_PDF_SIGNATURE);
}
} catch (IOException e) {
log.error("Magic-byte check failed for row {}", index, e);
return Optional.of(ImportStatus.SkipReason.FILE_READ_ERROR);
}
}
return persist(row, index, resolved);
}
private Optional<ImportStatus.SkipReason> persist(CanonicalSheetReader.Row row, String index, Optional<File> file) {
Document existing = documentService.findByOriginalFilename(index).orElse(null);
if (existing != null && existing.getStatus() != DocumentStatus.PLACEHOLDER) {
return Optional.of(ImportStatus.SkipReason.ALREADY_EXISTS);
}
String s3Key = null;
String contentType = null;
DocumentStatus status = DocumentStatus.PLACEHOLDER;
if (file.isPresent()) {
contentType = probeContentType(file.get());
s3Key = "documents/" + UUID.randomUUID() + "_" + file.get().getName();
try {
uploadToS3(file.get(), s3Key, contentType);
status = DocumentStatus.UPLOADED;
} catch (Exception e) {
log.error("S3 upload failed for {}", file.get().getName(), e);
return Optional.of(ImportStatus.SkipReason.S3_UPLOAD_FAILED);
}
}
Document doc = buildDocument(row, index, existing, s3Key, contentType, status);
Document saved = documentService.save(doc);
if (file.isPresent()) {
thumbnailAsyncRunner.dispatchAfterCommit(saved.getId());
}
return Optional.empty();
}
private Document buildDocument(CanonicalSheetReader.Row row, String index, Document existing,
String s3Key, String contentType, DocumentStatus status) {
Document doc = existing != null ? existing
: Document.builder().originalFilename(index).build();
String senderName = row.get("sender_name");
String receiverNames = row.get("receiver_names");
Person sender = resolveSender(row.get("sender_person_id"), senderName);
Set<Person> receivers = resolveReceivers(row.get("receiver_person_ids"));
LocalDate date = parseIsoDate(row.get("date_iso"));
DatePrecision precision = parsePrecision(row.get("date_precision"));
LocalDate dateEnd = parseIsoDate(row.get("date_end"));
String dateRaw = blankToNull(row.get("date_raw"));
String location = blankToNull(row.get("location"));
doc.setTitle(buildTitle(index, date, precision, dateEnd, dateRaw, location));
doc.setStatus(status);
doc.setFilePath(s3Key);
doc.setContentType(contentType);
doc.setSender(sender);
doc.setSenderText(blankToNull(senderName));
// The canonical row is authoritative for receivers/tags (ADR-025): clear then
// re-populate so a shrunk set on re-import prunes stale links rather than
// accumulating them. The raw sender_text/receiver_text retention is separate.
doc.getReceivers().clear();
doc.getReceivers().addAll(receivers);
doc.setReceiverText(blankToNull(receiverNames));
doc.setDocumentDate(date);
doc.setMetaDatePrecision(precision);
doc.setMetaDateEnd(dateEnd);
doc.setMetaDateRaw(dateRaw);
doc.setLocation(location);
doc.setSummary(blankToNull(row.get("summary")));
attachTag(doc, row.get("tags"));
doc.setMetadataComplete(doc.getDocumentDate() != null || sender != null || !receivers.isEmpty());
return doc;
}
// The title carries the date at the HONEST precision (never a fabricated day) via the
// shared DocumentTitleFormatter, plus the location — kept under 20 lines by delegating.
private static String buildTitle(String index, LocalDate date, DatePrecision precision,
LocalDate end, String raw, String location) {
StringBuilder title = new StringBuilder(index);
if (date != null && precision != DatePrecision.UNKNOWN) {
title.append(" ").append(DocumentTitleFormatter.formatTitleDate(date, precision, end, raw));
}
if (location != null && !location.isBlank()) {
title.append(" ").append(location);
}
return title.toString();
}
// ─── attribution routing — register-first, always retain raw ─────────────────────
private Person resolveSender(String slug, String rawName) {
if (slug.isBlank()) return null;
return resolvePerson(slug, rawName);
}
private Set<Person> resolveReceivers(String slugs) {
Set<Person> receivers = new LinkedHashSet<>();
for (String slug : CanonicalSheetReader.splitList(slugs)) {
receivers.add(resolvePerson(slug, slug));
}
return receivers;
}
private Person resolvePerson(String slug, String rawName) {
return personService.findBySourceRef(slug)
.orElseGet(() -> personService.upsertBySourceRef(PersonUpsertCommand.builder()
.sourceRef(slug)
.lastName(blankToNull(rawName) == null ? slug : rawName)
.personType(PersonType.PERSON)
.provisional(true)
.build()));
}
// Authoritative: the canonical row defines the document's tags exactly. Clearing first
// means a tag removed from the row is pruned on re-import (ADR-025).
private void attachTag(Document doc, String tagPath) {
doc.getTags().clear();
if (tagPath.isBlank()) return;
tagService.findBySourceRef(tagPath).ifPresent(tag -> doc.getTags().add(tag));
}
// ─── clean-value parsing (no semantic logic) ─────────────────────────────────────
private static LocalDate parseIsoDate(String value) {
if (value == null || value.isBlank()) return null;
try {
return LocalDate.parse(value.trim());
} catch (DateTimeParseException e) {
return null;
}
}
private static DatePrecision parsePrecision(String value) {
if (value == null || value.isBlank()) return DatePrecision.UNKNOWN;
try {
return DatePrecision.valueOf(value.trim());
} catch (IllegalArgumentException e) {
return DatePrecision.UNKNOWN;
}
}
// ─── file handling + S3 (small ≤20-line methods) ─────────────────────────────────
private String probeContentType(File file) {
try {
String probed = Files.probeContentType(file.toPath());
return probed != null ? probed : "application/octet-stream";
} catch (IOException e) {
return "application/octet-stream";
}
}
private void uploadToS3(File file, String s3Key, String contentType) {
s3Client.putObject(PutObjectRequest.builder()
.bucket(bucketName)
.key(s3Key)
.contentType(contentType)
.build(),
RequestBody.fromFile(file));
}
// ─── index validation + containment — defense-in-depth, do not weaken ────────────
// The index is the only thing that drives the on-disk lookup, so it must never contain a
// path separator, traversal token, slash homoglyph, null byte, or absolute-path marker —
// each guard mirrors the filename guards ported from MassImportService — and it must match
// the strict catalog shape so anything unexpected is skipped loudly rather than read.
private boolean isValidImportIndex(String index) {
if (index == null || index.isBlank()) return false;
if (index.contains("/")) return false;
if (index.contains("\\")) return false;
if (index.contains("")) return false; // U+2215 DIVISION SLASH
if (index.contains("")) return false; // U+FF0F FULLWIDTH SOLIDUS
if (index.contains("")) return false; // U+29F5 REVERSE SOLIDUS OPERATOR
if (index.contains(".")) return false; // no dots — "<index>.pdf" is the only extension
if (index.contains("\0")) return false;
if (Paths.get(index).isAbsolute()) return false;
return INDEX_PATTERN.matcher(index).matches();
}
// package-private: a Mockito spy in tests can override to inject IOException
InputStream openFileStream(File file) throws IOException {
return new FileInputStream(file);
}
private boolean isPdfMagicBytes(File file) throws IOException {
try (InputStream is = openFileStream(file)) {
byte[] header = is.readNBytes(4);
return header.length == 4
&& header[0] == 0x25 // %
&& header[1] == 0x50 // P
&& header[2] == 0x44 // D
&& header[3] == 0x46; // F
}
}
// O(1) direct lookup: the PDF is exactly importDir/<index>.pdf. The caller has already
// validated the index shape; the canonical-path containment assertion below is
// defense-in-depth so even a symlinked <index>.pdf cannot read outside importDir.
private Optional<File> resolvePdfByIndex(String index, int rowNumber) {
File baseDir = new File(importDir);
File candidate = baseDir.toPath().resolve(index + ".pdf").toFile();
try {
if (!candidate.isFile()) return Optional.empty();
String baseDirCanonical = baseDir.getCanonicalPath();
if (!candidate.getCanonicalPath().startsWith(baseDirCanonical + File.separator)) {
throw DomainException.internal(ErrorCode.INTERNAL_ERROR, "Path escape detected: " + candidate);
}
return Optional.of(candidate);
} catch (IOException e) {
// Distinct from the deliberate symlink-escape abort above (which throws): canonical
// resolution itself failed (e.g. the OS rejected the path mid-resolution). We fail
// safe to a PLACEHOLDER, but never silently — log it so the asymmetry surfaces in ops.
log.warn("Canonical path resolution failed for import row {}: treating {}.pdf as absent",
rowNumber, index, e);
return Optional.empty();
}
}
private static String blankToNull(String s) {
return (s == null || s.isBlank()) ? null : s;
}
}

View File

@@ -1,112 +0,0 @@
package org.raddatz.familienarchiv.importing;
import org.raddatz.familienarchiv.document.DatePrecision;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
/**
* Produces the honest German date label baked into an import title — at exactly
* the precision the data claims, never finer. This is the Java half of the
* single source of truth shared with the frontend {@code formatDocumentDate}
* (TypeScript): both are asserted against {@code docs/date-label-fixtures.json}
* so the two implementations cannot drift (see #666).
*
* <p>Import titles are always German, so the labels here are the German
* canonical form (mirroring the {@code de} Paraglide messages used by the UI).
*/
final class DocumentTitleFormatter {
private static final DateTimeFormatter LONG = DateTimeFormatter.ofPattern("d. MMMM yyyy", Locale.GERMAN);
private static final DateTimeFormatter MONTH_YEAR = DateTimeFormatter.ofPattern("MMMM yyyy", Locale.GERMAN);
private static final DateTimeFormatter MEDIUM = DateTimeFormatter.ofPattern("d. MMM yyyy", Locale.GERMAN);
private static final DateTimeFormatter DAY_MONTH = DateTimeFormatter.ofPattern("d. MMM", Locale.GERMAN);
private static final String UNKNOWN = "Datum unbekannt";
private static final String APPROX_PREFIX = "ca.";
private static final String OPEN_RANGE_PREFIX = "ab";
private DocumentTitleFormatter() {
}
/**
* @param date the sort/filter anchor day; null for UNKNOWN rows
* @param precision descriptive precision metadata
* @param end the RANGE end day; null means an open-ended range
* @param raw the verbatim spreadsheet cell, used only to pick a season word
* @return the honest German label
*/
static String formatTitleDate(LocalDate date, DatePrecision precision, LocalDate end, String raw) {
if (precision == DatePrecision.UNKNOWN || date == null) {
return UNKNOWN;
}
return switch (precision) {
case DAY -> LONG.format(date);
case MONTH -> MONTH_YEAR.format(date);
case SEASON -> seasonLabel(date, raw);
case YEAR -> String.valueOf(date.getYear());
case APPROX -> APPROX_PREFIX + " " + date.getYear();
case RANGE -> rangeLabel(date, end);
case UNKNOWN -> UNKNOWN;
};
}
private static String seasonLabel(LocalDate date, String raw) {
Season season = seasonFromRaw(raw);
if (season == null) {
season = seasonOfMonth(date.getMonthValue());
}
return season.german + " " + date.getYear();
}
private static String rangeLabel(LocalDate start, LocalDate end) {
if (end == null) {
return OPEN_RANGE_PREFIX + " " + MEDIUM.format(start);
}
if (end.equals(start)) {
return MEDIUM.format(start);
}
if (start.getYear() != end.getYear()) {
return MEDIUM.format(start) + " " + MEDIUM.format(end);
}
if (start.getMonthValue() == end.getMonthValue()) {
return start.getDayOfMonth() + "." + MEDIUM.format(end);
}
return DAY_MONTH.format(start) + " " + MEDIUM.format(end);
}
// ─── season mapping — mirrors the normalizer's representative months ─────────────
private enum Season {
SPRING("Frühling"),
SUMMER("Sommer"),
AUTUMN("Herbst"),
WINTER("Winter");
private final String german;
Season(String german) {
this.german = german;
}
}
private static Season seasonOfMonth(int month) {
if (month >= 3 && month <= 5) return Season.SPRING;
if (month >= 6 && month <= 8) return Season.SUMMER;
if (month >= 9 && month <= 11) return Season.AUTUMN;
return Season.WINTER;
}
private static Season seasonFromRaw(String raw) {
if (raw == null || raw.isBlank()) return null;
String token = raw.trim().split("\\s+")[0].toLowerCase(Locale.GERMAN);
return switch (token) {
case "frühling", "frühjahr" -> Season.SPRING;
case "sommer" -> Season.SUMMER;
case "herbst" -> Season.AUTUMN;
case "winter" -> Season.WINTER;
default -> null;
};
}
}

View File

@@ -1,50 +0,0 @@
package org.raddatz.familienarchiv.importing;
import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.annotation.JsonProperty;
import io.swagger.v3.oas.annotations.media.Schema;
import java.time.LocalDateTime;
import java.util.List;
/**
* Async import state surfaced to {@code admin/system/ImportStatusCard.svelte} via the
* generated types. The shape ({@code state, statusCode, processed, skippedFiles, skipped})
* is kept verbatim from the retired MassImportService so the admin UI keeps working.
*/
public record ImportStatus(
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) State state,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) String statusCode,
@JsonIgnore String message,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) int processed,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) List<SkippedFile> skippedFiles,
LocalDateTime startedAt
) {
public enum State { IDLE, RUNNING, DONE, FAILED }
public enum SkipReason {
INVALID_FILENAME_PATH_TRAVERSAL,
INVALID_PDF_SIGNATURE,
FILE_READ_ERROR,
ALREADY_EXISTS,
S3_UPLOAD_FAILED
}
public record SkippedFile(
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) String filename,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) SkipReason reason
) {}
// Note: @Schema on a record accessor method is not picked up by SpringDoc; the
// "skipped" count is a computed convenience field derived from skippedFiles.size().
@JsonProperty("skipped")
public int skipped() {
return skippedFiles.size();
}
/** Defensive-copy constructor — callers cannot mutate the stored list after construction. */
public ImportStatus {
skippedFiles = List.copyOf(skippedFiles);
}
}

View File

@@ -0,0 +1,390 @@
package org.raddatz.familienarchiv.importing;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.apache.poi.ss.usermodel.*;
import java.util.Objects;
import org.raddatz.familienarchiv.exception.DomainException;
import org.raddatz.familienarchiv.exception.ErrorCode;
import org.raddatz.familienarchiv.document.Document;
import org.raddatz.familienarchiv.document.DocumentService;
import org.raddatz.familienarchiv.document.DocumentStatus;
import org.raddatz.familienarchiv.document.ThumbnailAsyncRunner;
import org.raddatz.familienarchiv.person.Person;
import org.raddatz.familienarchiv.tag.Tag;
import org.raddatz.familienarchiv.person.Person;
import org.raddatz.familienarchiv.person.PersonNameParser;
import org.raddatz.familienarchiv.person.PersonService;
import org.raddatz.familienarchiv.tag.TagService;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.UUID;
import java.util.stream.Stream;
import java.util.zip.ZipFile;
@Service
@RequiredArgsConstructor
@Slf4j
public class MassImportService {
public enum State { IDLE, RUNNING, DONE, FAILED }
public record ImportStatus(State state, String message, int processed, LocalDateTime startedAt) {}
private volatile ImportStatus currentStatus = new ImportStatus(State.IDLE, "Kein Import gestartet.", 0, null);
public ImportStatus getStatus() {
return currentStatus;
}
private final DocumentService documentService;
private final PersonService personService;
private final TagService tagService;
private final S3Client s3Client;
private final ThumbnailAsyncRunner thumbnailAsyncRunner;
@Value("${app.s3.bucket}")
private String bucketName;
@Value("${app.import.col.index:0}")
private int colIndex;
@Value("${app.import.col.box:1}")
private int colBox;
@Value("${app.import.col.folder:2}")
private int colFolder;
@Value("${app.import.col.sender:3}")
private int colSender;
@Value("${app.import.col.receivers:5}")
private int colReceivers;
@Value("${app.import.col.date:7}")
private int colDate;
@Value("${app.import.col.location:9}")
private int colLocation;
@Value("${app.import.col.tags:10}")
private int colTags;
@Value("${app.import.col.summary:11}")
private int colSummary;
@Value("${app.import.col.transcription:13}")
private int colTranscription;
private static final String IMPORT_DIR = "/import";
private static final DateTimeFormatter GERMAN_DATE = DateTimeFormatter.ofPattern("d. MMMM yyyy", Locale.GERMAN);
// ODS XML namespaces
private static final String NS_TABLE = "urn:oasis:names:tc:opendocument:xmlns:table:1.0";
private static final String NS_TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";
// We only need up to this many columns; caps repeated-empty-cell expansion
private static final int MAX_COLS = 20;
@Async
public void runImportAsync() {
if (currentStatus.state() == State.RUNNING) {
throw DomainException.conflict(ErrorCode.IMPORT_ALREADY_RUNNING, "A mass import is already in progress");
}
currentStatus = new ImportStatus(State.RUNNING, "Import läuft...", 0, LocalDateTime.now());
try {
File spreadsheet = findSpreadsheetFile();
log.info("Starte Massenimport aus: {}", spreadsheet.getAbsolutePath());
int processed = processRows(readSpreadsheet(spreadsheet));
currentStatus = new ImportStatus(State.DONE,
"Import abgeschlossen. " + processed + " Dokumente verarbeitet.",
processed, currentStatus.startedAt());
} catch (Exception e) {
log.error("Massenimport fehlgeschlagen", e);
currentStatus = new ImportStatus(State.FAILED, "Fehler: " + e.getMessage(), 0, currentStatus.startedAt());
}
}
private File findSpreadsheetFile() throws IOException {
try (Stream<Path> files = Files.list(Paths.get(IMPORT_DIR))) {
return files
.filter(p -> {
String name = p.toString().toLowerCase();
return name.endsWith(".ods") || name.endsWith(".xlsx") || name.endsWith(".xls");
})
.findFirst()
.orElseThrow(() -> new RuntimeException(
"Keine Tabellendatei (.ods/.xlsx/.xls) in " + IMPORT_DIR + " gefunden!"))
.toFile();
}
}
// --- Spreadsheet reading (format-specific, produces neutral List<List<String>>) ---
private List<List<String>> readSpreadsheet(File file) throws Exception {
String name = file.getName().toLowerCase();
if (name.endsWith(".ods")) {
return readOds(file);
}
return readXlsx(file);
}
/**
* Reads an ODS file by parsing its content.xml directly (no extra library needed).
* ODS is a ZIP archive; content.xml holds the spreadsheet data as XML.
*/
private List<List<String>> readOds(File file) throws Exception {
List<List<String>> result = new ArrayList<>();
try (ZipFile zip = new ZipFile(file)) {
var entry = zip.getEntry("content.xml");
if (entry == null) throw new RuntimeException("Ungültige ODS-Datei: content.xml fehlt");
var factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
var builder = factory.newDocumentBuilder();
var doc = builder.parse(zip.getInputStream(entry));
NodeList tables = doc.getElementsByTagNameNS(NS_TABLE, "table");
if (tables.getLength() == 0) return result;
var table = (Element) tables.item(0);
NodeList rows = table.getElementsByTagNameNS(NS_TABLE, "table-row");
for (int i = 0; i < rows.getLength(); i++) {
var row = (Element) rows.item(i);
List<String> rowData = new ArrayList<>();
NodeList cells = row.getElementsByTagNameNS(NS_TABLE, "table-cell");
for (int j = 0; j < cells.getLength() && rowData.size() < MAX_COLS; j++) {
var cell = (Element) cells.item(j);
// Read the display text (first <text:p>)
String value = "";
NodeList textNodes = cell.getElementsByTagNameNS(NS_TEXT, "p");
if (textNodes.getLength() > 0) {
value = textNodes.item(0).getTextContent().trim();
}
// Expand number-columns-repeated (capped at MAX_COLS)
String repeatAttr = cell.getAttributeNS(NS_TABLE, "number-columns-repeated");
int repeat = repeatAttr.isEmpty() ? 1 : Integer.parseInt(repeatAttr);
repeat = Math.min(repeat, MAX_COLS - rowData.size());
for (int r = 0; r < repeat; r++) {
rowData.add(value);
}
}
result.add(rowData);
}
}
return result;
}
/** Reads an XLSX/XLS file using Apache POI. Converts all cells to strings. */
private List<List<String>> readXlsx(File file) throws Exception {
List<List<String>> result = new ArrayList<>();
try (FileInputStream fis = new FileInputStream(file);
Workbook workbook = WorkbookFactory.create(fis)) {
Sheet sheet = workbook.getSheetAt(0);
for (int i = 0; i <= sheet.getLastRowNum(); i++) {
Row row = sheet.getRow(i);
List<String> rowData = new ArrayList<>();
if (row != null) {
for (int j = 0; j < MAX_COLS; j++) {
rowData.add(xlsxCellToString(row.getCell(j)));
}
}
result.add(rowData);
}
}
return result;
}
private String xlsxCellToString(Cell cell) {
if (cell == null) return "";
return switch (cell.getCellType()) {
case STRING -> cell.getStringCellValue();
case NUMERIC -> {
if (DateUtil.isCellDateFormatted(cell)) {
yield cell.getLocalDateTimeCellValue().toLocalDate().toString(); // ISO
}
yield String.valueOf((int) cell.getNumericCellValue());
}
case BOOLEAN -> String.valueOf(cell.getBooleanCellValue());
default -> "";
};
}
// --- Import logic (works on neutral List<String> rows) ---
private int processRows(List<List<String>> rows) {
int count = 0;
for (int i = 1; i < rows.size(); i++) { // skip header row
List<String> cells = rows.get(i);
String index = getCell(cells, colIndex);
if (index.isBlank()) continue;
String filename = index.contains(".") ? index : index + ".pdf";
Optional<File> fileOnDisk = findFileRecursive(filename);
if (fileOnDisk.isEmpty()) {
log.warn("Datei nicht gefunden, importiere nur Metadaten: {}", filename);
}
importSingleDocument(cells, fileOnDisk, filename, index);
count++;
}
return count;
}
@Transactional
protected void importSingleDocument(List<String> cells, Optional<File> file, String originalFilename, String index) {
Optional<Document> existing = documentService.findByOriginalFilename(originalFilename);
if (existing.isPresent() && existing.get().getStatus() != DocumentStatus.PLACEHOLDER) {
log.info("Dokument {} existiert bereits, überspringe.", originalFilename);
return;
}
String archiveBox = getCell(cells, colBox);
String archiveFolder = getCell(cells, colFolder);
String senderRaw = getCell(cells, colSender);
String receiversRaw = getCell(cells, colReceivers);
LocalDate date = parseDate(getCell(cells, colDate));
String location = getCell(cells, colLocation);
String tagRaw = getCell(cells, colTags);
String summary = getCell(cells, colSummary);
String transcription = getCell(cells, colTranscription);
String s3Key = null;
String contentType = null;
DocumentStatus status = DocumentStatus.PLACEHOLDER;
if (file.isPresent()) {
try {
contentType = Files.probeContentType(file.get().toPath());
} catch (IOException e) {
contentType = null;
}
if (contentType == null) contentType = "application/octet-stream";
s3Key = "documents/" + UUID.randomUUID() + "_" + file.get().getName();
try {
s3Client.putObject(PutObjectRequest.builder()
.bucket(bucketName)
.key(s3Key)
.contentType(contentType)
.build(),
RequestBody.fromFile(file.get()));
status = DocumentStatus.UPLOADED;
} catch (Exception e) {
log.error("S3 Upload Fehler für {}", file.get().getName(), e);
return;
}
}
Person sender = senderRaw.isBlank() ? null : findOrCreatePerson(senderRaw);
List<Person> receivers = PersonNameParser.parseReceivers(receiversRaw).stream()
.map(this::findOrCreatePerson)
.filter(Objects::nonNull)
.toList();
Tag tag = null;
if (!tagRaw.isBlank()) {
tag = tagService.findOrCreate(tagRaw);
}
Document doc = existing.orElse(Document.builder()
.originalFilename(originalFilename)
.build());
// Heuristic: mark as complete if at least one key field is present in the spreadsheet row
boolean metadataComplete = date != null || !senderRaw.isBlank() || !receiversRaw.isBlank();
doc.setTitle(buildTitle(index, date, location));
doc.setFilePath(s3Key);
doc.setContentType(contentType);
doc.setStatus(status);
doc.setArchiveBox(archiveBox.isBlank() ? null : archiveBox);
doc.setArchiveFolder(archiveFolder.isBlank() ? null : archiveFolder);
doc.setDocumentDate(date);
doc.setLocation(location.isBlank() ? null : location);
doc.setSummary(summary.isBlank() ? null : summary);
doc.setTranscription(transcription.isBlank() ? null : transcription);
doc.setSender(sender);
doc.getReceivers().addAll(receivers);
if (tag != null) doc.getTags().add(tag);
doc.setMetadataComplete(metadataComplete);
Document saved = documentService.save(doc);
if (file.isPresent()) {
thumbnailAsyncRunner.dispatchAfterCommit(saved.getId());
}
log.info("Importiert{}: {}", file.isEmpty() ? " (nur Metadaten)" : "", originalFilename);
}
// --- Helpers ---
private String getCell(List<String> cells, int col) {
if (col >= cells.size()) return "";
String val = cells.get(col);
return val == null ? "" : val.trim();
}
private LocalDate parseDate(String value) {
if (value == null || value.isBlank()) return null;
try {
return LocalDate.parse(value.trim());
} catch (DateTimeParseException e) {
return null;
}
}
private String buildTitle(String index, LocalDate date, String location) {
StringBuilder sb = new StringBuilder(index);
if (date != null) {
sb.append(" \u2013 ").append(date.format(GERMAN_DATE));
}
if (location != null && !location.isBlank()) {
sb.append(" \u2013 ").append(location);
}
return sb.toString();
}
private Person findOrCreatePerson(String rawName) {
return personService.findOrCreateByAlias(rawName);
}
private Optional<File> findFileRecursive(String filename) {
try (Stream<Path> walk = Files.walk(Paths.get(IMPORT_DIR))) {
return walk.filter(p -> !Files.isDirectory(p))
.filter(p -> p.getFileName().toString().equals(filename))
.map(Path::toFile)
.findFirst();
} catch (IOException e) {
return Optional.empty();
}
}
}

View File

@@ -1,69 +0,0 @@
package org.raddatz.familienarchiv.importing;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.raddatz.familienarchiv.person.PersonService;
import org.raddatz.familienarchiv.person.PersonType;
import org.raddatz.familienarchiv.person.PersonUpsertCommand;
import org.springframework.stereotype.Component;
import java.io.File;
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.List;
/**
* Loads {@code canonical-persons.xlsx} (the register) into the person domain via
* {@link PersonService}, upserting each person by the normalizer {@code person_id}
* (source_ref). Register persons are confident identities, so {@code provisional} is
* driven by the sheet's already-clean value (normally {@code False}).
*/
@Component
@RequiredArgsConstructor
@Slf4j
public class PersonRegisterImporter {
static final List<String> REQUIRED_HEADERS = List.of("person_id", "last_name", "first_name", "provisional");
private final PersonService personService;
public int load(File artifact) {
List<CanonicalSheetReader.Row> rows = CanonicalSheetReader.readRows(artifact, REQUIRED_HEADERS);
int processed = 0;
for (CanonicalSheetReader.Row row : rows) {
String personId = row.get("person_id");
if (personId.isBlank()) continue;
personService.upsertBySourceRef(toCommand(row, personId));
processed++;
}
log.info("Imported {} register persons from {}", processed, artifact.getName());
return processed;
}
private PersonUpsertCommand toCommand(CanonicalSheetReader.Row row, String personId) {
return PersonUpsertCommand.builder()
.sourceRef(personId)
.lastName(blankToNull(row.get("last_name")))
.firstName(blankToNull(row.get("first_name")))
.maidenName(blankToNull(row.get("maiden_name")))
.notes(blankToNull(row.get("notes")))
.birthYear(yearOf(row.get("birth_date")))
.deathYear(yearOf(row.get("death_date")))
.personType(PersonType.PERSON)
.provisional(Boolean.parseBoolean(row.get("provisional")))
.build();
}
private static Integer yearOf(String isoDate) {
if (isoDate == null || isoDate.isBlank()) return null;
try {
return LocalDate.parse(isoDate.trim()).getYear();
} catch (DateTimeParseException e) {
return null;
}
}
private static String blankToNull(String s) {
return (s == null || s.isBlank()) ? null : s;
}
}

View File

@@ -1,135 +0,0 @@
package org.raddatz.familienarchiv.importing;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.raddatz.familienarchiv.exception.DomainException;
import org.raddatz.familienarchiv.exception.ErrorCode;
import org.raddatz.familienarchiv.person.Person;
import org.raddatz.familienarchiv.person.PersonService;
import org.raddatz.familienarchiv.person.PersonType;
import org.raddatz.familienarchiv.person.PersonUpsertCommand;
import org.raddatz.familienarchiv.person.relationship.RelationType;
import org.raddatz.familienarchiv.person.relationship.RelationshipService;
import org.raddatz.familienarchiv.person.relationship.dto.CreateRelationshipRequest;
import org.springframework.stereotype.Component;
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
/**
* Loads {@code canonical-persons-tree.json} into the person + relationship domains.
* Tree persons are upserted via {@link PersonService} keyed on the shared
* {@code personId} slug (which Phase 1 #670 now emits into the tree), so they reconcile
* with the register rather than duplicating it. Relationships reference persons by the
* tree's local {@code rowId}; each side is mapped to the upserted person's UUID and
* created through {@link RelationshipService} (never the relationship repository —
* layering rule). A duplicate relationship on re-import is swallowed for idempotency.
*/
@Component
@RequiredArgsConstructor
@Slf4j
public class PersonTreeImporter {
// The tree JSON is a local implementation detail, not a shared API payload, so the
// importer owns its own mapper rather than depending on the web ObjectMapper bean.
private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
private final PersonService personService;
private final RelationshipService relationshipService;
public int load(File artifact) {
JsonNode root = readTree(artifact);
Map<String, UUID> idByRowId = upsertPersons(root.path("persons"));
int relationships = createRelationships(root.path("relationships"), idByRowId);
log.info("Imported {} tree persons and {} relationships from {}",
idByRowId.size(), relationships, artifact.getName());
return idByRowId.size();
}
private JsonNode readTree(File artifact) {
try {
return OBJECT_MAPPER.readTree(artifact);
} catch (Exception e) {
throw DomainException.badRequest(ErrorCode.IMPORT_ARTIFACT_INVALID,
"Unreadable canonical artifact: " + artifact.getName());
}
}
private Map<String, UUID> upsertPersons(JsonNode persons) {
Map<String, UUID> idByRowId = new HashMap<>();
for (JsonNode node : persons) {
String personId = text(node, "personId");
if (personId.isBlank()) continue;
Person person = personService.upsertBySourceRef(toCommand(node, personId));
idByRowId.put(text(node, "rowId"), person.getId());
}
return idByRowId;
}
private PersonUpsertCommand toCommand(JsonNode node, String personId) {
return PersonUpsertCommand.builder()
.sourceRef(personId)
.lastName(blankToNull(text(node, "lastName")))
.firstName(blankToNull(text(node, "firstName")))
.maidenName(blankToNull(text(node, "maidenName")))
.notes(blankToNull(text(node, "notes")))
.birthYear(intOrNull(node, "birthYear"))
.deathYear(intOrNull(node, "deathYear"))
.familyMember(node.path("familyMember").asBoolean(false))
.personType(PersonType.PERSON)
.provisional(false)
.build();
}
private int createRelationships(JsonNode relationships, Map<String, UUID> idByRowId) {
int created = 0;
for (JsonNode node : relationships) {
// Trap: a relationship node's personId / relatedPersonId fields carry the tree's
// local rowId (e.g. "row_a"), NOT a person slug. They are resolved through
// idByRowId to the upserted person's UUID.
UUID person = idByRowId.get(text(node, "personId"));
UUID related = idByRowId.get(text(node, "relatedPersonId"));
if (person == null || related == null) {
log.warn("Skipping tree relationship with unresolved rowId: {} -> {}",
text(node, "personId"), text(node, "relatedPersonId"));
continue;
}
if (addRelationshipIdempotently(person, related, text(node, "type"))) {
created++;
}
}
return created;
}
private boolean addRelationshipIdempotently(UUID person, UUID related, String type) {
try {
relationshipService.addRelationship(person,
new CreateRelationshipRequest(related, RelationType.valueOf(type), null, null, null));
return true;
} catch (DomainException e) {
if (e.getCode() == ErrorCode.DUPLICATE_RELATIONSHIP
|| e.getCode() == ErrorCode.CIRCULAR_RELATIONSHIP) {
return false;
}
throw e;
}
}
private static String text(JsonNode node, String field) {
JsonNode value = node.get(field);
return value == null || value.isNull() ? "" : value.asText();
}
private static Integer intOrNull(JsonNode node, String field) {
JsonNode value = node.get(field);
return value == null || value.isNull() ? null : value.asInt();
}
private static String blankToNull(String s) {
return (s == null || s.isBlank()) ? null : s;
}
}

View File

@@ -1,54 +0,0 @@
package org.raddatz.familienarchiv.importing;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.raddatz.familienarchiv.tag.Tag;
import org.raddatz.familienarchiv.tag.TagService;
import org.springframework.stereotype.Component;
import java.io.File;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;
/**
* Loads {@code canonical-tag-tree.xlsx} into the tag domain via {@link TagService},
* upserting each tag by its canonical {@code tag_path} (the source_ref). Parent links are
* resolved by the parent's path, which is the child path with its last {@code /segment}
* stripped. Rows are emitted parents-first by the normalizer, so a parent is always
* resolved before any child references it.
*/
@Component
@RequiredArgsConstructor
@Slf4j
public class TagTreeImporter {
static final List<String> REQUIRED_HEADERS = List.of("tag_path", "parent_name", "tag_name");
private static final String PATH_SEPARATOR = "/";
private final TagService tagService;
public int load(File artifact) {
List<CanonicalSheetReader.Row> rows = CanonicalSheetReader.readRows(artifact, REQUIRED_HEADERS);
Map<String, UUID> idByPath = new HashMap<>();
int processed = 0;
for (CanonicalSheetReader.Row row : rows) {
String path = row.get("tag_path");
if (path.isBlank()) continue;
UUID parentId = resolveParentId(path, idByPath);
Tag tag = tagService.upsertBySourceRef(path, row.get("tag_name"), parentId);
idByPath.put(path, tag.getId());
processed++;
}
log.info("Imported {} tags from {}", processed, artifact.getName());
return processed;
}
private UUID resolveParentId(String path, Map<String, UUID> idByPath) {
int lastSeparator = path.lastIndexOf(PATH_SEPARATOR);
if (lastSeparator < 0) return null;
String parentPath = path.substring(0, lastSeparator);
return idByPath.get(parentPath);
}
}

View File

@@ -1,7 +1,6 @@
package org.raddatz.familienarchiv.person; package org.raddatz.familienarchiv.person;
import com.fasterxml.jackson.annotation.JsonIgnore; import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import io.swagger.v3.oas.annotations.media.Schema; import io.swagger.v3.oas.annotations.media.Schema;
import jakarta.persistence.*; import jakarta.persistence.*;
import lombok.*; import lombok.*;
@@ -10,9 +9,6 @@ import org.raddatz.familienarchiv.user.DisplayNameFormatter;
import java.util.ArrayList; import java.util.ArrayList;
import java.util.List; import java.util.List;
import java.util.UUID; import java.util.UUID;
// prevents infinite recursion in JSON serialization; see ADR-022 for lazy-fetch context
@JsonIgnoreProperties({"hibernateLazyInitializer", "handler"})
@Entity @Entity
@Table(name = "persons") @Table(name = "persons")
@Data @Data
@@ -57,18 +53,6 @@ public class Person {
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) @Schema(requiredMode = Schema.RequiredMode.REQUIRED)
private boolean familyMember = false; private boolean familyMember = false;
// The normalizer person_id — join key and re-import idempotency key. Null for manually
// created persons; unique among non-null values (see ADR-025).
@Column(name = "source_ref")
private String sourceRef;
// A provisional person is one the importer inferred but could not confidently identify.
// Distinct from familyMember (a genealogical fact); set true only by the importer (Phase 3).
@Column(name = "provisional", nullable = false)
@Builder.Default
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
private boolean provisional = false;
// Entity-graph navigation for JPA JOIN queries (e.g. DocumentSpecifications.hasText). // Entity-graph navigation for JPA JOIN queries (e.g. DocumentSpecifications.hasText).
// Uses entity relationship rather than cross-domain repository access, avoiding a // Uses entity relationship rather than cross-domain repository access, avoiding a
// separate DB roundtrip while respecting domain boundaries. // separate DB roundtrip while respecting domain boundaries.

View File

@@ -22,15 +22,12 @@ import org.springframework.web.bind.annotation.*;
import org.springframework.web.server.ResponseStatusException; import org.springframework.web.server.ResponseStatusException;
import jakarta.validation.Valid; import jakarta.validation.Valid;
import jakarta.validation.constraints.Max;
import jakarta.validation.constraints.Min;
import lombok.RequiredArgsConstructor; import lombok.RequiredArgsConstructor;
@RestController @RestController
@RequestMapping("/api/persons") @RequestMapping("/api/persons")
@RequiredArgsConstructor @RequiredArgsConstructor
@Validated
public class PersonController { public class PersonController {
private final PersonService personService; private final PersonService personService;
@@ -38,37 +35,8 @@ public class PersonController {
@GetMapping @GetMapping
@RequirePermission(Permission.READ_ALL) @RequirePermission(Permission.READ_ALL)
public ResponseEntity<PersonSearchResult> getPersons( public ResponseEntity<List<PersonSummaryDTO>> getPersons(@RequestParam(required = false) String q) {
@RequestParam(required = false) String q, return ResponseEntity.ok(personService.findAll(q));
@RequestParam(required = false) PersonType type,
@RequestParam(required = false) Boolean familyOnly,
@RequestParam(required = false) Boolean hasDocuments,
@RequestParam(required = false) Boolean provisional,
// review=true reveals the import noise (transcriber view); absent/false keeps the
// clean reader default (familyMember OR documentCount > 0). The explicit filters AND
// within whichever base the review flag selects.
@RequestParam(required = false, defaultValue = "false") boolean review,
@RequestParam(required = false) String sort,
@RequestParam(defaultValue = "0") @Min(0) int page,
@RequestParam(defaultValue = "50") @Min(1) @Max(100) int size) {
// Legacy top-N-by-document-count path (reader dashboard): preserved, wrapped in the
// same envelope so /api/persons always returns one shape. It is explicitly NON-paged —
// the top-N query returns the complete result, so PersonSearchResult.topN reports an
// honest totalElements (= returned count) instead of pretending to be a page slice.
if ("documentCount".equals(sort) && q == null) {
int safeSize = Math.min(size, 50);
List<PersonSummaryDTO> top = personService.findTopByDocumentCount(safeSize);
return ResponseEntity.ok(PersonSearchResult.topN(top));
}
PersonFilter filter = PersonFilter.builder()
.type(type)
.familyOnly(familyOnly)
.hasDocuments(hasDocuments)
.provisional(provisional)
.readerDefault(!review)
.build();
return ResponseEntity.ok(personService.search(filter, page, size, q));
} }
@GetMapping("/{id}") @GetMapping("/{id}")
@@ -135,21 +103,6 @@ public class PersonController {
personService.mergePersons(id, UUID.fromString(targetIdStr)); personService.mergePersons(id, UUID.fromString(targetIdStr));
} }
// Dedicated state transition that clears the provisional flag. A separate verb (not a
// mass-assignable DTO field) so provisional can never be smuggled in via create/update.
@PatchMapping("/{id}/confirm")
@RequirePermission(Permission.WRITE_ALL)
public ResponseEntity<Person> confirmPerson(@PathVariable UUID id) {
return ResponseEntity.ok(personService.confirmPerson(id));
}
@DeleteMapping("/{id}")
@ResponseStatus(HttpStatus.NO_CONTENT)
@RequirePermission(Permission.WRITE_ALL)
public void deletePerson(@PathVariable UUID id) {
personService.deletePerson(id);
}
// ─── Alias endpoints ──────────────────────────────────────────────────── // ─── Alias endpoints ────────────────────────────────────────────────────
@GetMapping("/{id}/aliases") @GetMapping("/{id}/aliases")

View File

@@ -1,36 +0,0 @@
package org.raddatz.familienarchiv.person;
import lombok.Builder;
/**
* The reader/triage filter set for the persons directory, threaded as one value through
* {@code PersonController -> PersonService -> PersonRepository}. Each field is nullable:
* null means "do not constrain on this dimension".
*
* <ul>
* <li>{@code type} — restrict to a single {@link PersonType}.</li>
* <li>{@code familyOnly} — when true, only {@code familyMember} persons.</li>
* <li>{@code hasDocuments} — when true, only persons with documentCount &gt; 0.</li>
* <li>{@code provisional} — match the {@code Person.provisional} flag exactly.</li>
* <li>{@code readerDefault} — when true, restrict to {@code familyMember OR documentCount > 0}
* (the clean reader view). The explicit filters above AND with this restriction.</li>
* </ul>
*/
@Builder
public record PersonFilter(
PersonType type,
Boolean familyOnly,
Boolean hasDocuments,
Boolean provisional,
boolean readerDefault
) {
/** The unconstrained "show all" filter (transcriber view, no reader restriction). */
public static PersonFilter showAll() {
return PersonFilter.builder().readerDefault(false).build();
}
/** The clean reader default: familyMember OR documentCount &gt; 0, no other constraints. */
public static PersonFilter cleanDefault() {
return PersonFilter.builder().readerDefault(true).build();
}
}

View File

@@ -32,9 +32,6 @@ public interface PersonRepository extends JpaRepository<Person, UUID> {
// Lookup by full alias string, used during ODS mass import // Lookup by full alias string, used during ODS mass import
Optional<Person> findByAliasIgnoreCase(String alias); Optional<Person> findByAliasIgnoreCase(String alias);
// Lookup by the normalizer person_id, used for idempotent canonical re-import (Phase 3).
Optional<Person> findBySourceRef(String sourceRef);
// Exact first+last name match, used for filename-based sender lookup // Exact first+last name match, used for filename-based sender lookup
Optional<Person> findByFirstNameIgnoreCaseAndLastNameIgnoreCase(String firstName, String lastName); Optional<Person> findByFirstNameIgnoreCaseAndLastNameIgnoreCase(String firstName, String lastName);
@@ -44,7 +41,7 @@ public interface PersonRepository extends JpaRepository<Person, UUID> {
SELECT p.id, p.title, p.first_name AS firstName, p.last_name AS lastName, SELECT p.id, p.title, p.first_name AS firstName, p.last_name AS lastName,
p.person_type AS personType, p.person_type AS personType,
p.alias, p.birth_year AS birthYear, p.death_year AS deathYear, p.notes, p.alias, p.birth_year AS birthYear, p.death_year AS deathYear, p.notes,
p.family_member AS familyMember, p.provisional AS provisional, p.family_member AS familyMember,
(SELECT COUNT(*) FROM documents d WHERE d.sender_id = p.id) (SELECT COUNT(*) FROM documents d WHERE d.sender_id = p.id)
+ (SELECT COUNT(*) FROM document_receivers dr WHERE dr.person_id = p.id) AS documentCount + (SELECT COUNT(*) FROM document_receivers dr WHERE dr.person_id = p.id) AS documentCount
FROM persons p FROM persons p
@@ -57,7 +54,7 @@ public interface PersonRepository extends JpaRepository<Person, UUID> {
SELECT p.id, p.title, p.first_name AS firstName, p.last_name AS lastName, SELECT p.id, p.title, p.first_name AS firstName, p.last_name AS lastName,
p.person_type AS personType, p.person_type AS personType,
p.alias, p.birth_year AS birthYear, p.death_year AS deathYear, p.notes, p.alias, p.birth_year AS birthYear, p.death_year AS deathYear, p.notes,
p.family_member AS familyMember, p.provisional AS provisional, p.family_member AS familyMember,
(SELECT COUNT(*) FROM documents d WHERE d.sender_id = p.id) (SELECT COUNT(*) FROM documents d WHERE d.sender_id = p.id)
+ (SELECT COUNT(*) FROM document_receivers dr WHERE dr.person_id = p.id) AS documentCount + (SELECT COUNT(*) FROM document_receivers dr WHERE dr.person_id = p.id) AS documentCount
FROM persons p FROM persons p
@@ -66,83 +63,12 @@ public interface PersonRepository extends JpaRepository<Person, UUID> {
OR LOWER(CONCAT(p.last_name,' ',COALESCE(p.first_name,''))) LIKE LOWER(CONCAT('%',:query,'%')) OR LOWER(CONCAT(p.last_name,' ',COALESCE(p.first_name,''))) LIKE LOWER(CONCAT('%',:query,'%'))
OR LOWER(p.alias) LIKE LOWER(CONCAT('%',:query,'%')) OR LOWER(p.alias) LIKE LOWER(CONCAT('%',:query,'%'))
OR LOWER(a.last_name) LIKE LOWER(CONCAT('%',:query,'%')) OR LOWER(a.last_name) LIKE LOWER(CONCAT('%',:query,'%'))
GROUP BY p.id, p.title, p.first_name, p.last_name, p.person_type, p.alias, p.birth_year, p.death_year, p.notes, p.family_member, p.provisional GROUP BY p.id, p.title, p.first_name, p.last_name, p.person_type, p.alias, p.birth_year, p.death_year, p.notes, p.family_member
ORDER BY p.last_name ASC, p.first_name ASC ORDER BY p.last_name ASC, p.first_name ASC
""", """,
nativeQuery = true) nativeQuery = true)
List<PersonSummaryDTO> searchWithDocumentCount(@Param("query") String query); List<PersonSummaryDTO> searchWithDocumentCount(@Param("query") String query);
// ORDER BY uses the computed alias "documentCount" — valid PostgreSQL (aliases allowed in ORDER BY,
// unlike WHERE/HAVING). This is intentional; it would silently fail on MySQL or H2.
@Query(value = """
SELECT p.id, p.title, p.first_name AS firstName, p.last_name AS lastName,
p.person_type AS personType,
p.alias, p.birth_year AS birthYear, p.death_year AS deathYear, p.notes,
p.family_member AS familyMember, p.provisional AS provisional,
(SELECT COUNT(*) FROM documents d WHERE d.sender_id = p.id)
+ (SELECT COUNT(*) FROM document_receivers dr WHERE dr.person_id = p.id) AS documentCount
FROM persons p
ORDER BY documentCount DESC
LIMIT :limit
""",
nativeQuery = true)
List<PersonSummaryDTO> findTopByDocumentCount(@Param("limit") int limit);
// --- #667: filter-aware paged directory ---
//
// The slice query and the count query below MUST keep an IDENTICAL WHERE clause so the
// rendered page and totalElements can never drift. Every filter is nullable: a null param
// disables that predicate via the `:param IS NULL OR …` idiom. `readerDefault` (a plain
// boolean) restricts to "familyMember OR has documents"; the explicit filters AND on top.
// documentCount is recomputed inline (not via the SELECT alias) because WHERE cannot
// reference a computed alias. All params are named — no string concatenation, no injection.
String FILTER_WHERE = """
WHERE (CAST(:type AS text) IS NULL OR p.person_type = CAST(:type AS text))
AND (:familyOnly = FALSE OR :familyOnly IS NULL OR p.family_member = TRUE)
AND (:hasDocuments = FALSE OR :hasDocuments IS NULL OR (
(SELECT COUNT(*) FROM documents d WHERE d.sender_id = p.id)
+ (SELECT COUNT(*) FROM document_receivers dr WHERE dr.person_id = p.id)) > 0)
AND (:provisional IS NULL OR p.provisional = :provisional)
AND (:readerDefault = FALSE OR (
p.family_member = TRUE OR (
(SELECT COUNT(*) FROM documents d WHERE d.sender_id = p.id)
+ (SELECT COUNT(*) FROM document_receivers dr WHERE dr.person_id = p.id)) > 0))
AND (CAST(:query AS text) IS NULL OR
LOWER(CONCAT(COALESCE(p.first_name,''),' ',p.last_name)) LIKE LOWER(CONCAT('%',CAST(:query AS text),'%'))
OR LOWER(CONCAT(p.last_name,' ',COALESCE(p.first_name,''))) LIKE LOWER(CONCAT('%',CAST(:query AS text),'%'))
OR LOWER(p.alias) LIKE LOWER(CONCAT('%',CAST(:query AS text),'%')))
""";
@Query(value = """
SELECT p.id, p.title, p.first_name AS firstName, p.last_name AS lastName,
p.person_type AS personType,
p.alias, p.birth_year AS birthYear, p.death_year AS deathYear, p.notes,
p.family_member AS familyMember, p.provisional AS provisional,
(SELECT COUNT(*) FROM documents d WHERE d.sender_id = p.id)
+ (SELECT COUNT(*) FROM document_receivers dr WHERE dr.person_id = p.id) AS documentCount
FROM persons p
""" + FILTER_WHERE + """
ORDER BY p.last_name ASC, p.first_name ASC
LIMIT :limit OFFSET :offset
""",
nativeQuery = true)
List<PersonSummaryDTO> findByFilter(@Param("type") String type,
@Param("familyOnly") Boolean familyOnly,
@Param("hasDocuments") Boolean hasDocuments,
@Param("provisional") Boolean provisional,
@Param("readerDefault") boolean readerDefault,
@Param("query") String query,
@Param("limit") int limit,
@Param("offset") int offset);
@Query(value = "SELECT COUNT(*) FROM persons p " + FILTER_WHERE, nativeQuery = true)
long countByFilter(@Param("type") String type,
@Param("familyOnly") Boolean familyOnly,
@Param("hasDocuments") Boolean hasDocuments,
@Param("provisional") Boolean provisional,
@Param("readerDefault") boolean readerDefault,
@Param("query") String query);
// --- Correspondent queries --- // --- Correspondent queries ---
@Query(value = """ @Query(value = """
@@ -194,12 +120,6 @@ public interface PersonRepository extends JpaRepository<Person, UUID> {
@Query(value = "UPDATE documents SET sender_id = :target WHERE sender_id = :source", nativeQuery = true) @Query(value = "UPDATE documents SET sender_id = :target WHERE sender_id = :source", nativeQuery = true)
void reassignSender(@Param("source") UUID source, @Param("target") UUID target); void reassignSender(@Param("source") UUID source, @Param("target") UUID target);
// Used by deletePerson: detach a deleted person from documents they sent, so the hard
// delete cannot orphan a documents.sender_id FK (the column is nullable).
@Modifying
@Query(value = "UPDATE documents SET sender_id = NULL WHERE sender_id = :source", nativeQuery = true)
void reassignSenderToNull(@Param("source") UUID source);
@Modifying @Modifying
@Query(value = """ @Query(value = """
INSERT INTO document_receivers (document_id, person_id) INSERT INTO document_receivers (document_id, person_id)

View File

@@ -1,50 +0,0 @@
package org.raddatz.familienarchiv.person;
import io.swagger.v3.oas.annotations.media.Schema;
import java.util.List;
/**
* Paged result for the /api/persons list endpoint.
*
* <p>Hand-written to mirror {@code document/DocumentSearchResult} field-for-field so the
* frontend sees one paged shape across the app. Deliberately NOT Spring {@code Page<T>}
* (unstable serialized shape across Spring versions, noisy in OpenAPI) and deliberately
* NOT a reuse of the document DTO (would couple two feature modules — duplication beats
* coupling here).
*/
public record PersonSearchResult(
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
List<PersonSummaryDTO> items,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
long totalElements,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
int pageNumber,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
int pageSize,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
int totalPages
) {
/**
* Paged factory: derives {@code totalPages} from the full match count and the page size.
* A zero count yields zero pages so the frontend hides the pagination control.
*/
public static PersonSearchResult paged(List<PersonSummaryDTO> slice, int pageNumber, int pageSize, long totalElements) {
int totalPages = pageSize == 0 ? 0 : (int) ((totalElements + pageSize - 1) / pageSize);
return new PersonSearchResult(slice, totalElements, pageNumber, pageSize, totalPages);
}
/**
* Non-paged factory for the legacy {@code sort=documentCount} top-N dashboard path.
* That query returns the <em>complete</em> result in one shot — there is no further page
* to fetch — so the envelope reports reality rather than pretending to be a slice of a
* larger set: {@code totalElements} equals the number of rows actually returned,
* {@code pageSize} equals that same count, and {@code totalPages} is 1 (or 0 when empty).
* This avoids the earlier ambiguity where {@code totalElements} looked like a paged total.
*/
public static PersonSearchResult topN(List<PersonSummaryDTO> all) {
int count = all.size();
int totalPages = count == 0 ? 0 : 1;
return new PersonSearchResult(all, count, 0, count, totalPages);
}
}

View File

@@ -31,53 +31,14 @@ public class PersonService {
private final PersonRepository personRepository; private final PersonRepository personRepository;
private final PersonNameAliasRepository aliasRepository; private final PersonNameAliasRepository aliasRepository;
public List<PersonSummaryDTO> findTopByDocumentCount(int limit) { public List<PersonSummaryDTO> findAll(String q) {
return personRepository.findTopByDocumentCount(limit); if (q == null) {
} return personRepository.findAllWithDocumentCount();
}
/** if (q.isBlank()) {
* Filtered, paginated directory query. The slice and the total are derived from one return List.of();
* shared WHERE clause (see {@link PersonRepository#FILTER_WHERE}) so totalElements can }
* never drift from the rendered page. {@code type} is passed as the enum name because the return personRepository.searchWithDocumentCount(q.trim());
* native query compares against the string column.
*/
public PersonSearchResult search(PersonFilter filter, int page, int size, String q) {
String type = filter.type() == null ? null : filter.type().name();
String query = (q == null || q.isBlank()) ? null : q.trim();
int offset = page * size;
List<PersonSummaryDTO> items = personRepository.findByFilter(
type, filter.familyOnly(), filter.hasDocuments(), filter.provisional(),
filter.readerDefault(), query, size, offset);
long total = personRepository.countByFilter(
type, filter.familyOnly(), filter.hasDocuments(), filter.provisional(),
filter.readerDefault(), query);
return PersonSearchResult.paged(items, page, size, total);
}
/**
* Clears the {@code provisional} flag — a deliberate state transition exposed as
* {@code PATCH /api/persons/{id}/confirm}, never as a mass-assignable DTO field (CWE-915).
*/
@Transactional
public Person confirmPerson(UUID id) {
Person person = getById(id);
person.setProvisional(false);
return personRepository.save(person);
}
/**
* Hard-deletes a person used by triage. Detaches the person from any documents they
* sent (nulls sender_id) and from any received-document references first, so the delete
* cannot orphan an FK and fail with a 500.
*/
@Transactional
public void deletePerson(UUID id) {
getById(id);
personRepository.reassignSenderToNull(id);
personRepository.deleteReceiverReferences(id);
personRepository.deleteById(id);
} }
public Person getById(UUID id) { public Person getById(UUID id) {
@@ -115,11 +76,6 @@ public class PersonService {
return personRepository.findByFirstNameIgnoreCaseAndLastNameIgnoreCase(firstName, lastName); return personRepository.findByFirstNameIgnoreCaseAndLastNameIgnoreCase(firstName, lastName);
} }
/** Lookup by the normalizer person_id — used by the canonical importer for register-first matching. */
public Optional<Person> findBySourceRef(String sourceRef) {
return personRepository.findBySourceRef(sourceRef);
}
@Nullable @Nullable
@Transactional @Transactional
public Person findOrCreateByAlias(String rawName) { public Person findOrCreateByAlias(String rawName) {
@@ -155,80 +111,6 @@ public class PersonService {
}); });
} }
/**
* Idempotent upsert keyed on {@code sourceRef} (the normalizer person_id) for the
* canonical importer (Phase 3, ADR-025). On first import the canonical fields are
* written verbatim. On re-import the human-edit-preserve precedence applies:
* a non-blank existing field is never overwritten, and {@code provisional} never
* flips back to true once a human has confirmed the person.
*/
@Transactional
public Person upsertBySourceRef(PersonUpsertCommand cmd) {
return personRepository.findBySourceRef(cmd.sourceRef())
.map(existing -> personRepository.save(mergeCanonical(existing, cmd)))
.orElseGet(() -> fromCanonical(cmd));
}
private Person fromCanonical(PersonUpsertCommand cmd) {
Person person = personRepository.save(Person.builder()
.sourceRef(cmd.sourceRef())
.firstName(blankToNull(cmd.firstName()))
.lastName(cmd.lastName())
.notes(blankToNull(cmd.notes()))
.birthYear(cmd.birthYear())
.deathYear(cmd.deathYear())
.familyMember(cmd.familyMember())
.personType(cmd.personType() == null ? PersonType.PERSON : cmd.personType())
.provisional(cmd.provisional())
.build());
String maiden = blankToNull(cmd.maidenName());
if (maiden != null) {
int nextSortOrder = aliasRepository.findMaxSortOrder(person.getId()) + 1;
aliasRepository.save(PersonNameAlias.builder()
.person(person)
.lastName(maiden)
.type(PersonNameAliasType.MAIDEN_NAME)
.sortOrder(nextSortOrder)
.build());
}
return person;
}
private Person mergeCanonical(Person existing, PersonUpsertCommand cmd) {
existing.setFirstName(preferHuman(existing.getFirstName(), cmd.firstName()));
existing.setLastName(preferHuman(existing.getLastName(), cmd.lastName()));
existing.setNotes(preferHuman(existing.getNotes(), cmd.notes()));
existing.setBirthYear(preferHuman(existing.getBirthYear(), cmd.birthYear()));
existing.setDeathYear(preferHuman(existing.getDeathYear(), cmd.deathYear()));
if (cmd.personType() != null && existing.getPersonType() == PersonType.PERSON) {
existing.setPersonType(cmd.personType());
}
// provisional is monotonic-downward: once it is false it never reverts to true.
// This also pins the cross-loader precedence (ADR-025): a register/tree person is
// loaded before documents and already false, so a later document row that references
// the same source_ref (provisional=true) can never flip it provisional — the guard
// below only fires while existing is still provisional. Order of document rows is
// therefore irrelevant.
if (existing.isProvisional()) {
existing.setProvisional(cmd.provisional());
}
return existing;
}
// preferHuman keeps an existing human-entered value and only falls back to the canonical
// value when the existing one is absent — the single idiom for every fill-blank field.
private static String preferHuman(String existing, String canonical) {
return (existing == null || existing.isBlank()) ? blankToNull(canonical) : existing;
}
private static Integer preferHuman(Integer existing, Integer canonical) {
return existing != null ? existing : canonical;
}
private static String blankToNull(String s) {
return (s == null || s.isBlank()) ? null : s.trim();
}
@Transactional @Transactional
public Person createPerson(String firstName, String lastName, String alias) { public Person createPerson(String firstName, String lastName, String alias) {
Person person = Person.builder() Person person = Person.builder()

View File

@@ -18,7 +18,6 @@ public interface PersonSummaryDTO {
Integer getDeathYear(); Integer getDeathYear();
String getNotes(); String getNotes();
boolean isFamilyMember(); boolean isFamilyMember();
boolean isProvisional();
long getDocumentCount(); long getDocumentCount();
default String getDisplayName() { default String getDisplayName() {

View File

@@ -1,24 +0,0 @@
package org.raddatz.familienarchiv.person;
import lombok.Builder;
/**
* Importer → {@link PersonService} command for an idempotent upsert keyed on
* {@code sourceRef} (the normalizer's stable person_id). Carries only the canonical
* fields the importer owns; the service applies the human-edit-preserve precedence
* (see ADR-025): non-blank existing fields are never overwritten, and {@code provisional}
* never flips back to true once a human has confirmed a person.
*/
@Builder
public record PersonUpsertCommand(
String sourceRef,
String firstName,
String lastName,
String maidenName,
String notes,
Integer birthYear,
Integer deathYear,
boolean familyMember,
PersonType personType,
boolean provisional
) {}

View File

@@ -1,42 +1,24 @@
package org.raddatz.familienarchiv.security; package org.raddatz.familienarchiv.security;
import com.fasterxml.jackson.databind.ObjectMapper;
import lombok.RequiredArgsConstructor; import lombok.RequiredArgsConstructor;
import org.raddatz.familienarchiv.exception.ErrorCode;
import org.raddatz.familienarchiv.user.CustomUserDetailsService; import org.raddatz.familienarchiv.user.CustomUserDetailsService;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.context.annotation.Bean; import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration; import org.springframework.context.annotation.Configuration;
import org.springframework.core.annotation.Order;
import org.springframework.core.env.Environment; import org.springframework.core.env.Environment;
import org.springframework.security.authentication.AuthenticationManager;
import org.springframework.security.authentication.dao.DaoAuthenticationProvider; import org.springframework.security.authentication.dao.DaoAuthenticationProvider;
import org.springframework.security.config.annotation.authentication.configuration.AuthenticationConfiguration; import org.springframework.security.config.Customizer;
import org.springframework.security.config.annotation.web.builders.HttpSecurity; import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity; import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
import org.springframework.security.config.annotation.web.configurers.AbstractHttpConfigurer;
import org.springframework.security.crypto.bcrypt.BCryptPasswordEncoder; import org.springframework.security.crypto.bcrypt.BCryptPasswordEncoder;
import org.springframework.security.crypto.password.PasswordEncoder; import org.springframework.security.crypto.password.PasswordEncoder;
import org.springframework.security.web.SecurityFilterChain; import org.springframework.security.web.SecurityFilterChain;
import org.springframework.security.web.authentication.session.ChangeSessionIdAuthenticationStrategy;
import org.springframework.security.web.authentication.session.SessionAuthenticationStrategy;
import org.springframework.security.web.csrf.CookieCsrfTokenRepository;
import org.springframework.security.web.csrf.CsrfException;
import org.springframework.security.web.csrf.CsrfTokenRequestAttributeHandler;
import java.util.Map;
@Configuration @Configuration
@EnableWebSecurity @EnableWebSecurity
@RequiredArgsConstructor @RequiredArgsConstructor
public class SecurityConfig { public class SecurityConfig {
// @WebMvcTest slices do not include JacksonAutoConfiguration, so ObjectMapper
// cannot be injected here. A static instance is safe because the response
// only serializes fixed String keys — no custom naming strategy or module needed.
private static final ObjectMapper ERROR_WRITER = new ObjectMapper();
private final CustomUserDetailsService userDetailsService; private final CustomUserDetailsService userDetailsService;
private final Environment environment; private final Environment environment;
@@ -52,57 +34,20 @@ public class SecurityConfig {
return authProvider; return authProvider;
} }
@Bean
public AuthenticationManager authenticationManager(AuthenticationConfiguration config) throws Exception {
return config.getAuthenticationManager();
}
@Bean
public SessionAuthenticationStrategy sessionAuthenticationStrategy() {
// ChangeSessionIdAuthenticationStrategy rotates the session ID via the Servlet 3.1+
// HttpServletRequest.changeSessionId() — preserves attributes, mints a fresh ID.
// Used by AuthSessionController.login to defend against session fixation (CWE-384).
return new ChangeSessionIdAuthenticationStrategy();
}
@Bean
@Order(1)
public SecurityFilterChain managementFilterChain(HttpSecurity http) throws Exception {
http
.securityMatcher("/actuator/**")
.authorizeHttpRequests(auth -> {
// Health and Prometheus are open — Docker health checks and Prometheus scraping need no credentials.
auth.requestMatchers("/actuator/health", "/actuator/prometheus").permitAll();
// All other actuator endpoints (metrics, info, env, heapdump…) require authentication.
auth.anyRequest().authenticated();
})
// Explicitly return 401 for any unauthenticated actuator request.
// Without this override, Spring Security's DelegatingAuthenticationEntryPoint
// would redirect browser-like clients to the form-login page (302 → /login),
// making it impossible to distinguish "not authenticated" from "not found" in tests.
.exceptionHandling(ex -> ex.authenticationEntryPoint(
(req, res, e) -> res.setStatus(HttpServletResponse.SC_UNAUTHORIZED)))
.formLogin(AbstractHttpConfigurer::disable)
.csrf(AbstractHttpConfigurer::disable);
return http.build();
}
@Bean @Bean
public SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception { public SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {
http http
// CSRF protection via CookieCsrfTokenRepository (NFR-SEC-103). // CSRF is intentionally disabled: every request from the SvelteKit frontend
// The backend sets an XSRF-TOKEN cookie (not HttpOnly so JS can read it). // carries an explicit Authorization header (Basic Auth token injected by
// All state-changing requests must include X-XSRF-TOKEN matching the cookie. // hooks.server.ts). Browsers block cross-origin requests from setting custom
// See ADR-022 and issue #524 for the full security rationale. // headers, so cross-site request forgery via a third-party page is not
.csrf(csrf -> csrf // possible with this auth scheme. If the auth model ever changes to
.csrfTokenRepository(CookieCsrfTokenRepository.withHttpOnlyFalse()) // cookie-based sessions, CSRF protection must be re-enabled.
.csrfTokenRequestHandler(new CsrfTokenRequestAttributeHandler())) .csrf(csrf -> csrf.disable())
.authorizeHttpRequests(auth -> { .authorizeHttpRequests(auth -> {
// Actuator endpoints are governed by managementFilterChain (@Order(1)) above. // Health endpoint must be open so CI/Docker health checks work without credentials
auth.requestMatchers("/actuator/health", "/actuator/prometheus").permitAll(); auth.requestMatchers("/actuator/health").permitAll();
// Login is unauthenticated by definition
auth.requestMatchers("/api/auth/login").permitAll();
// Password reset endpoints are unauthenticated by nature // Password reset endpoints are unauthenticated by nature
auth.requestMatchers("/api/auth/forgot-password", "/api/auth/reset-password").permitAll(); auth.requestMatchers("/api/auth/forgot-password", "/api/auth/reset-password").permitAll();
// Invite-based registration endpoints are public // Invite-based registration endpoints are public
@@ -122,18 +67,9 @@ public class SecurityConfig {
// erlaubt pdf im Iframe // erlaubt pdf im Iframe
.headers(headers -> headers .headers(headers -> headers
.frameOptions(frameOptions -> frameOptions.sameOrigin())) .frameOptions(frameOptions -> frameOptions.sameOrigin()))
// Return 401 for unauthenticated requests; 403+CSRF_TOKEN_MISSING for CSRF failures. // Erlaubt Login via Browser-Popup oder REST-Header (Authorization: Basic ...)
.exceptionHandling(ex -> ex .httpBasic(Customizer.withDefaults())
.authenticationEntryPoint( .formLogin(form -> form.usernameParameter("email"));
(req, res, e) -> res.setStatus(HttpServletResponse.SC_UNAUTHORIZED))
.accessDeniedHandler((req, res, e) -> {
res.setStatus(HttpServletResponse.SC_FORBIDDEN);
res.setContentType("application/json;charset=UTF-8");
ErrorCode code = (e instanceof CsrfException)
? ErrorCode.CSRF_TOKEN_MISSING
: ErrorCode.FORBIDDEN;
res.getWriter().write(ERROR_WRITER.writeValueAsString(Map.of("code", code.name())));
}));
return http.build(); return http.build();
} }

View File

@@ -7,7 +7,6 @@ import org.springframework.security.core.Authentication;
import java.util.UUID; import java.util.UUID;
// Cross-cutting auth helper; no domain home — "Utils" is the correct suffix here.
public final class SecurityUtils { public final class SecurityUtils {
private SecurityUtils() {} private SecurityUtils() {}

View File

@@ -2,13 +2,10 @@ package org.raddatz.familienarchiv.tag;
import java.util.UUID; import java.util.UUID;
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import io.swagger.v3.oas.annotations.media.Schema; import io.swagger.v3.oas.annotations.media.Schema;
import jakarta.persistence.*; import jakarta.persistence.*;
import lombok.*; import lombok.*;
// prevents infinite recursion in JSON serialization; see ADR-022 for lazy-fetch context
@JsonIgnoreProperties({"hibernateLazyInitializer", "handler"})
@Entity @Entity
@Data @Data
@NoArgsConstructor @NoArgsConstructor
@@ -30,11 +27,4 @@ public class Tag {
/** Color token name (e.g. "sage"), only set on root-level tags. Null means no color. */ /** Color token name (e.g. "sage"), only set on root-level tags. Null means no color. */
private String color; private String color;
/**
* Import identity key, keyed on the canonical tag_path. Null for manually created tags;
* unique among non-null values. The importer (Phase 3) uses it for idempotent re-import.
*/
@Column(name = "source_ref")
private String sourceRef;
} }

View File

@@ -22,9 +22,6 @@ public interface TagRepository extends JpaRepository<Tag, UUID> {
Optional<Tag> findByNameIgnoreCase(String name); Optional<Tag> findByNameIgnoreCase(String name);
// Lookup by the canonical tag_path, used for idempotent canonical re-import (Phase 3).
Optional<Tag> findBySourceRef(String sourceRef);
List<Tag> findByNameContainingIgnoreCase(String name); List<Tag> findByNameContainingIgnoreCase(String name);
/** /**

View File

@@ -7,7 +7,6 @@ import java.util.HashSet;
import java.util.LinkedHashMap; import java.util.LinkedHashMap;
import java.util.List; import java.util.List;
import java.util.Map; import java.util.Map;
import java.util.Optional;
import java.util.Set; import java.util.Set;
import java.util.UUID; import java.util.UUID;
import java.util.stream.Collectors; import java.util.stream.Collectors;
@@ -50,37 +49,12 @@ public class TagService {
.orElseThrow(() -> DomainException.notFound(ErrorCode.TAG_NOT_FOUND, "Tag not found: " + id)); .orElseThrow(() -> DomainException.notFound(ErrorCode.TAG_NOT_FOUND, "Tag not found: " + id));
} }
/** Lookup by the canonical tag_path — used by the canonical importer to attach a document's tag. */
public Optional<Tag> findBySourceRef(String sourceRef) {
return tagRepository.findBySourceRef(sourceRef);
}
public Tag findOrCreate(String name) { public Tag findOrCreate(String name) {
String cleanName = name.trim(); String cleanName = name.trim();
return tagRepository.findByNameIgnoreCase(cleanName) return tagRepository.findByNameIgnoreCase(cleanName)
.orElseGet(() -> tagRepository.save(Tag.builder().name(cleanName).build())); .orElseGet(() -> tagRepository.save(Tag.builder().name(cleanName).build()));
} }
/**
* Idempotent upsert keyed on {@code sourceRef} (the canonical tag_path) for the
* Phase-3 importer (ADR-025). On first import the canonical name and parent are
* written; on re-import a human-renamed tag name is preserved (the source_ref is the
* stable identity, the name is a human-editable label).
*/
@Transactional
public Tag upsertBySourceRef(String sourceRef, String name, UUID parentId) {
return tagRepository.findBySourceRef(sourceRef)
.map(existing -> {
existing.setParentId(parentId);
return tagRepository.save(existing);
})
.orElseGet(() -> tagRepository.save(Tag.builder()
.sourceRef(sourceRef)
.name(name)
.parentId(parentId)
.build()));
}
@Transactional @Transactional
public Tag update(UUID id, TagUpdateDTO dto) { public Tag update(UUID id, TagUpdateDTO dto) {
Tag tag = getById(id); Tag tag = getById(id);

View File

@@ -5,8 +5,7 @@ import org.raddatz.familienarchiv.security.Permission;
import org.raddatz.familienarchiv.security.RequirePermission; import org.raddatz.familienarchiv.security.RequirePermission;
import org.raddatz.familienarchiv.document.DocumentService; import org.raddatz.familienarchiv.document.DocumentService;
import org.raddatz.familienarchiv.document.DocumentVersionService; import org.raddatz.familienarchiv.document.DocumentVersionService;
import org.raddatz.familienarchiv.importing.CanonicalImportOrchestrator; import org.raddatz.familienarchiv.importing.MassImportService;
import org.raddatz.familienarchiv.importing.ImportStatus;
import org.raddatz.familienarchiv.document.ThumbnailBackfillService; import org.raddatz.familienarchiv.document.ThumbnailBackfillService;
import org.springframework.http.ResponseEntity; import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping; import org.springframework.web.bind.annotation.GetMapping;
@@ -22,20 +21,20 @@ import lombok.RequiredArgsConstructor;
@RequiredArgsConstructor @RequiredArgsConstructor
public class AdminController { public class AdminController {
private final CanonicalImportOrchestrator importOrchestrator; private final MassImportService massImportService;
private final DocumentService documentService; private final DocumentService documentService;
private final DocumentVersionService documentVersionService; private final DocumentVersionService documentVersionService;
private final ThumbnailBackfillService thumbnailBackfillService; private final ThumbnailBackfillService thumbnailBackfillService;
@PostMapping("/trigger-import") @PostMapping("/trigger-import")
public ResponseEntity<ImportStatus> triggerMassImport() { public ResponseEntity<MassImportService.ImportStatus> triggerMassImport() {
importOrchestrator.runImportAsync(); massImportService.runImportAsync();
return ResponseEntity.accepted().body(importOrchestrator.getStatus()); return ResponseEntity.accepted().body(massImportService.getStatus());
} }
@GetMapping("/import-status") @GetMapping("/import-status")
public ResponseEntity<ImportStatus> importStatus() { public ResponseEntity<MassImportService.ImportStatus> importStatus() {
return ResponseEntity.ok(importOrchestrator.getStatus()); return ResponseEntity.ok(massImportService.getStatus());
} }
@PostMapping("/backfill-versions") @PostMapping("/backfill-versions")

View File

@@ -88,8 +88,7 @@ public class AppUser {
}; };
public static String computeColor(UUID id) { public static String computeColor(UUID id) {
// Math.floorMod avoids the Integer.MIN_VALUE overflow trap in Math.abs(hashCode()) return PALETTE[Math.abs(id.hashCode()) % PALETTE.length];
return PALETTE[Math.floorMod(id.hashCode(), PALETTE.length)];
} }
@PrePersist @PrePersist

View File

@@ -31,6 +31,5 @@ public class InviteListItemDTO {
private String status; private String status;
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) @Schema(requiredMode = Schema.RequiredMode.REQUIRED)
private LocalDateTime createdAt; private LocalDateTime createdAt;
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
private String shareableUrl; private String shareableUrl;
} }

View File

@@ -52,11 +52,7 @@ public class InviteService {
public InviteToken createInvite(CreateInviteRequest dto, AppUser creator) { public InviteToken createInvite(CreateInviteRequest dto, AppUser creator) {
Set<UUID> groupIds = new HashSet<>(); Set<UUID> groupIds = new HashSet<>();
if (dto.getGroupIds() != null && !dto.getGroupIds().isEmpty()) { if (dto.getGroupIds() != null && !dto.getGroupIds().isEmpty()) {
Set<UUID> uniqueIds = new HashSet<>(dto.getGroupIds()); List<UserGroup> groups = userService.findGroupsByIds(dto.getGroupIds());
List<UserGroup> groups = userService.findGroupsByIds(new ArrayList<>(uniqueIds));
if (groups.size() != uniqueIds.size()) {
throw DomainException.notFound(ErrorCode.GROUP_NOT_FOUND, "One or more group IDs do not exist");
}
groups.forEach(g -> groupIds.add(g.getId())); groups.forEach(g -> groupIds.add(g.getId()));
} }

View File

@@ -24,7 +24,4 @@ public interface InviteTokenRepository extends JpaRepository<InviteToken, UUID>
@Query("SELECT t FROM InviteToken t ORDER BY t.createdAt DESC") @Query("SELECT t FROM InviteToken t ORDER BY t.createdAt DESC")
List<InviteToken> findAllOrderedByCreatedAt(); List<InviteToken> findAllOrderedByCreatedAt();
@Query("SELECT CASE WHEN COUNT(t) > 0 THEN true ELSE false END FROM InviteToken t JOIN t.groupIds g WHERE g = :groupId AND t.revoked = false AND (t.expiresAt IS NULL OR t.expiresAt > CURRENT_TIMESTAMP) AND (t.maxUses IS NULL OR t.useCount < t.maxUses)")
boolean existsActiveWithGroupId(@Param("groupId") UUID groupId);
} }

View File

@@ -5,7 +5,6 @@ import java.time.LocalDateTime;
import java.util.HexFormat; import java.util.HexFormat;
import java.util.Optional; import java.util.Optional;
import org.raddatz.familienarchiv.auth.AuthService;
import org.raddatz.familienarchiv.user.ResetPasswordRequest; import org.raddatz.familienarchiv.user.ResetPasswordRequest;
import org.raddatz.familienarchiv.exception.DomainException; import org.raddatz.familienarchiv.exception.DomainException;
import org.raddatz.familienarchiv.exception.ErrorCode; import org.raddatz.familienarchiv.exception.ErrorCode;
@@ -33,7 +32,6 @@ public class PasswordResetService {
private final UserService userService; private final UserService userService;
private final PasswordResetTokenRepository tokenRepository; private final PasswordResetTokenRepository tokenRepository;
private final PasswordEncoder passwordEncoder; private final PasswordEncoder passwordEncoder;
private final AuthService authService;
@Autowired(required = false) @Autowired(required = false)
private JavaMailSender mailSender; private JavaMailSender mailSender;
@@ -87,8 +85,6 @@ public class PasswordResetService {
resetToken.setUsed(true); resetToken.setUsed(true);
tokenRepository.save(resetToken); tokenRepository.save(resetToken);
authService.revokeAllSessions(user.getEmail());
} }
/** /**

View File

@@ -4,11 +4,7 @@ import java.util.List;
import java.util.Map; import java.util.Map;
import java.util.UUID; import java.util.UUID;
import jakarta.servlet.http.HttpSession;
import jakarta.validation.Valid; import jakarta.validation.Valid;
import org.raddatz.familienarchiv.audit.AuditKind;
import org.raddatz.familienarchiv.audit.AuditService;
import org.raddatz.familienarchiv.auth.AuthService;
import org.raddatz.familienarchiv.user.AdminUpdateUserRequest; import org.raddatz.familienarchiv.user.AdminUpdateUserRequest;
import org.raddatz.familienarchiv.user.ChangePasswordDTO; import org.raddatz.familienarchiv.user.ChangePasswordDTO;
import org.raddatz.familienarchiv.user.CreateUserRequest; import org.raddatz.familienarchiv.user.CreateUserRequest;
@@ -30,15 +26,13 @@ import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.ResponseStatus; import org.springframework.web.bind.annotation.ResponseStatus;
import org.springframework.web.bind.annotation.RestController; import org.springframework.web.bind.annotation.RestController;
import lombok.RequiredArgsConstructor; import lombok.AllArgsConstructor;
@RestController @RestController
@RequestMapping("/api/") @RequestMapping("/api/")
@RequiredArgsConstructor @AllArgsConstructor
public class UserController { public class UserController {
private final UserService userService; private UserService userService;
private final AuthService authService;
private final AuditService auditService;
@GetMapping("users/me") @GetMapping("users/me")
public ResponseEntity<AppUser> getCurrentUser(Authentication authentication) { public ResponseEntity<AppUser> getCurrentUser(Authentication authentication) {
@@ -62,14 +56,9 @@ public class UserController {
@PostMapping("users/me/password") @PostMapping("users/me/password")
@ResponseStatus(HttpStatus.NO_CONTENT) @ResponseStatus(HttpStatus.NO_CONTENT)
public void changePassword(Authentication authentication, public void changePassword(Authentication authentication,
HttpSession session,
@RequestBody ChangePasswordDTO dto) { @RequestBody ChangePasswordDTO dto) {
AppUser current = userService.findByEmail(authentication.getName()); AppUser current = userService.findByEmail(authentication.getName());
userService.changePassword(current.getId(), dto); userService.changePassword(current.getId(), dto);
int revoked = authService.revokeOtherSessions(session.getId(), authentication.getName());
auditService.log(AuditKind.LOGOUT, current.getId(), null, Map.of(
"reason", "password_change",
"revokedCount", revoked));
} }
@GetMapping("users/{id}") @GetMapping("users/{id}")
@@ -112,18 +101,6 @@ public class UserController {
return ResponseEntity.ok().build(); return ResponseEntity.ok().build();
} }
@PostMapping("/users/{id}/force-logout")
@RequirePermission(Permission.ADMIN_USER)
public ResponseEntity<Map<String, Object>> forceLogout(Authentication authentication,
@PathVariable UUID id) {
AppUser target = userService.getById(id);
int revoked = authService.revokeAllSessions(target.getEmail());
auditService.log(AuditKind.ADMIN_FORCE_LOGOUT, actorId(authentication), null, Map.of(
"targetUserId", target.getId().toString(),
"revokedCount", revoked));
return ResponseEntity.ok(Map.of("revokedCount", revoked));
}
private UUID actorId(Authentication auth) { private UUID actorId(Authentication auth) {
return userService.findByEmail(auth.getName()).getId(); return userService.findByEmail(auth.getName()).getId();
} }

View File

@@ -20,7 +20,6 @@ import org.springframework.boot.CommandLineRunner;
import org.springframework.context.annotation.Bean; import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration; import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Profile; import org.springframework.context.annotation.Profile;
import org.springframework.core.env.Environment;
import org.springframework.security.crypto.password.PasswordEncoder; import org.springframework.security.crypto.password.PasswordEncoder;
import java.time.LocalDate; import java.time.LocalDate;
@@ -32,51 +31,26 @@ import java.util.Set;
@DependsOn("flyway") @DependsOn("flyway")
public class UserDataInitializer { public class UserDataInitializer {
static final String DEFAULT_ADMIN_EMAIL = "admin@familienarchiv.local"; @Value("${app.admin.email:admin@familyarchive.local}")
static final String DEFAULT_ADMIN_PASSWORD = "admin123";
@Value("${app.admin.email:" + DEFAULT_ADMIN_EMAIL + "}")
private String adminEmail; private String adminEmail;
@Value("${app.admin.password:" + DEFAULT_ADMIN_PASSWORD + "}") @Value("${app.admin.password:admin123}")
private String adminPassword; private String adminPassword;
private final AppUserRepository userRepository; private final AppUserRepository userRepository;
private final UserGroupRepository groupRepository; private final UserGroupRepository groupRepository;
private final Environment environment;
@Bean @Bean
public CommandLineRunner initAdminUser(PasswordEncoder passwordEncoder) { public CommandLineRunner initAdminUser(PasswordEncoder passwordEncoder) {
return args -> { return args -> {
if (userRepository.findByEmail(adminEmail).isEmpty()) { if (userRepository.findByEmail(adminEmail).isEmpty()) {
// Fail-closed in production: refuse to seed with the well-known
// defaults. Otherwise an operator who forgets APP_ADMIN_USERNAME
// / APP_ADMIN_PASSWORD locks production to admin@…/admin123 PERMANENTLY
// (UserDataInitializer only seeds when the row is missing — see #513).
// Allowed in dev/test/e2e because those run without secrets configured.
boolean isLocalProfile = environment.matchesProfiles("dev", "test", "e2e");
if (!isLocalProfile
&& (DEFAULT_ADMIN_EMAIL.equals(adminEmail)
|| DEFAULT_ADMIN_PASSWORD.equals(adminPassword))) {
throw new IllegalStateException(
"Refusing to seed admin user with default credentials outside "
+ "the dev/test/e2e profiles. Set APP_ADMIN_USERNAME and "
+ "APP_ADMIN_PASSWORD to non-default values before first boot — "
+ "this lock-in is permanent."
);
}
log.info("Kein Admin-User '{}' gefunden. Erstelle Default-Admin...", adminEmail); log.info("Kein Admin-User '{}' gefunden. Erstelle Default-Admin...", adminEmail);
// Reuse the Administrators group if it already exists (e.g. a UserGroup adminGroup = UserGroup.builder()
// previous boot seeded the group but failed before creating .name("Administrators")
// the admin user, or the operator deleted just the user row .permissions(Set.of("ADMIN", "READ_ALL", "WRITE_ALL", "ANNOTATE_ALL", "ADMIN_USER", "ADMIN_TAG", "ADMIN_PERMISSION"))
// to retry the seed with a new email). Blind-INSERTing would .build();
// violate user_groups_name_key and abort the context. See #518. groupRepository.save(adminGroup);
UserGroup adminGroup = groupRepository.findByName("Administrators")
.orElseGet(() -> groupRepository.save(UserGroup.builder()
.name("Administrators")
.permissions(Set.of("ADMIN", "READ_ALL", "WRITE_ALL", "ANNOTATE_ALL", "ADMIN_USER", "ADMIN_TAG", "ADMIN_PERMISSION"))
.build()));
AppUser admin = AppUser.builder() AppUser admin = AppUser.builder()
.email(adminEmail) .email(adminEmail)

View File

@@ -37,9 +37,6 @@ public class UserService {
private final AppUserRepository userRepository; private final AppUserRepository userRepository;
private final UserGroupRepository groupRepository; private final UserGroupRepository groupRepository;
// Injected directly (not via InviteService) to avoid a constructor injection cycle:
// InviteService → UserService → InviteService. Spring Framework 7 forbids such cycles.
private final InviteTokenRepository inviteTokenRepository;
private final PasswordEncoder passwordEncoder; private final PasswordEncoder passwordEncoder;
private final AuditService auditService; private final AuditService auditService;
@@ -274,10 +271,9 @@ public class UserService {
@Transactional @Transactional
public UserGroup createGroup(GroupDTO dto) { public UserGroup createGroup(GroupDTO dto) {
UserGroup group = UserGroup.builder() UserGroup group = new UserGroup();
.name(dto.getName()) group.setName(dto.getName());
.permissions(dto.getPermissions() != null ? dto.getPermissions() : new HashSet<>()) group.setPermissions(dto.getPermissions());
.build();
return groupRepository.save(group); return groupRepository.save(group);
} }
@@ -291,10 +287,6 @@ public class UserService {
@Transactional @Transactional
public void deleteGroup(UUID id) { public void deleteGroup(UUID id) {
if (inviteTokenRepository.existsActiveWithGroupId(id)) {
throw DomainException.conflict(ErrorCode.GROUP_HAS_ACTIVE_INVITES,
"Cannot delete group " + id + " — referenced by one or more active invites");
}
groupRepository.deleteById(id); groupRepository.deleteById(id);
} }
} }

View File

@@ -1,9 +1,6 @@
spring: spring:
jpa: jpa:
show-sql: true show-sql: true
# spring.session.cookie.secure is no longer a supported Boot 4.x property.
# DefaultCookieSerializer auto-detects Secure from request.isSecure().
# Direct HTTP in dev → isSecure()=false → cookie sent without Secure attribute.
springdoc: springdoc:
api-docs: api-docs:

View File

@@ -38,64 +38,10 @@ spring:
starttls: starttls:
enable: true enable: true
session:
timeout: 28800s # 8 h idle timeout (MaxInactiveIntervalInSeconds)
jdbc:
initialize-schema: never # Flyway owns schema creation (V67)
# Cookie name, SameSite, and Secure are configured via SpringSessionConfig#cookieSerializer
# (spring.session.cookie.* is not supported in Spring Boot 4.x).
server:
# Behind Caddy/reverse proxy: trust X-Forwarded-{Proto,For,Host} so that
# request.getScheme(), redirect URLs, and Spring Session "Secure" cookies
# reflect the original https client request, not the http hop from Caddy.
forward-headers-strategy: native
management: management:
server:
# Management port is separate from the app port so that:
# (a) Caddy never proxies /actuator/* (it only routes :8080 → the app port)
# (b) Prometheus scrapes backend:8081 directly inside archiv-net, not via Caddy
# Note: in Spring Boot 4.0 the management port shares the security filter chain; /actuator/health
# and /actuator/prometheus must be explicitly permitted in SecurityConfig — see SecurityConfig.java.
port: 8081
endpoints:
web:
exposure:
include: health,info,prometheus,metrics
endpoint:
prometheus:
enabled: true
# Spring Boot 4.0: metrics export is disabled by default — explicitly opt in for Prometheus
prometheus:
metrics:
export:
enabled: true
metrics:
tags:
# Common tag applied to every metric so Grafana's Spring Boot dashboard can filter by application name.
# Override via MANAGEMENT_METRICS_TAGS_APPLICATION env var.
application: ${spring.application.name}
health: health:
mail: mail:
enabled: false enabled: false
tracing:
sampling:
probability: 1.0 # 100% in dev; override via MANAGEMENT_TRACING_SAMPLING_PROBABILITY in prod compose
# OpenTelemetry trace export — failures are non-fatal (app starts cleanly without Tempo running)
# Port 4318 = OTLP HTTP (the default transport for Spring Boot's HttpExporter).
# Port 4317 is gRPC-only; sending HTTP/1.1 to it produces "Connection reset".
otel:
service:
name: familienarchiv-backend
exporter:
otlp:
endpoint: ${OTEL_EXPORTER_OTLP_ENDPOINT:http://localhost:4318}
logs:
exporter: none # Promtail captures Docker logs; disable OTLP log export (Tempo only accepts traces)
metrics:
exporter: none # Prometheus scrapes /actuator/prometheus; disable OTLP metric export to Tempo
springdoc: springdoc:
api-docs: api-docs:
@@ -117,35 +63,23 @@ app:
from: ${APP_MAIL_FROM:noreply@familienarchiv.local} from: ${APP_MAIL_FROM:noreply@familienarchiv.local}
admin: admin:
# Key must be `email`, not `username` — UserDataInitializer reads username: ${APP_ADMIN_USERNAME:admin}
# `${app.admin.email:...}`. The env-var name stays APP_ADMIN_USERNAME
# to match the existing Gitea secrets and DEPLOYMENT.md §3.3.
# See #513.
email: ${APP_ADMIN_USERNAME:admin@familienarchiv.local}
password: ${APP_ADMIN_PASSWORD:admin123} password: ${APP_ADMIN_PASSWORD:admin123}
import: import:
# Directory holding the normalizer's committed canonical artifacts col:
# (canonical-{documents,persons,tag-tree}.xlsx + canonical-persons-tree.json). index: 0
# The loader maps columns by header name — no positional indices (see ADR-025). box: 1
dir: ${IMPORT_DIR:/import} folder: 2
sender: 3
receivers: 5
date: 7
location: 9
tags: 10
summary: 11
transcription: 13
ocr: ocr:
sender-model: sender-model:
activation-threshold: 100 activation-threshold: 100
retrain-delta: 50 retrain-delta: 50
sentry:
dsn: ${SENTRY_DSN:}
environment: ${SPRING_PROFILES_ACTIVE:dev}
traces-sample-rate: ${SENTRY_TRACES_SAMPLE_RATE:1.0}
send-default-pii: false
enable-tracing: true
ignored-exceptions-for-type:
- org.raddatz.familienarchiv.exception.DomainException
rate-limit:
login:
max-attempts-per-ip-email: 10
max-attempts-per-ip: 20
window-minutes: 15

View File

@@ -1,14 +0,0 @@
-- Repeatable migration: sets the grafana_reader role's password from the
-- ${grafanaDbPassword} placeholder (resolved by FlywayConfig from the
-- GRAFANA_DB_PASSWORD environment variable). Flyway computes the checksum on
-- the resolved migration content, so any change to GRAFANA_DB_PASSWORD changes
-- the checksum and re-applies this migration on the next boot. That makes
-- password rotation a "change env var + restart" operation — no manual psql.
--
-- V68 created the role itself (without a usable password). This file owns the
-- password lifecycle; nothing else writes it.
DO $$
BEGIN
EXECUTE format('ALTER ROLE grafana_reader WITH PASSWORD %L', '${grafanaDbPassword}');
END
$$;

View File

@@ -1 +0,0 @@
CREATE INDEX IF NOT EXISTS idx_documents_updated_at ON documents(updated_at DESC);

View File

@@ -1,8 +0,0 @@
-- Speeds up "documents by sender" queries used on /persons/[id] Korrespondenz-Überblick (#306),
-- /briefwechsel, and bulk-edit flows.
CREATE INDEX IF NOT EXISTS idx_documents_sender_id
ON documents(sender_id);
-- Speeds up "comments by author" queries on admin user detail and (future) contributor profile.
CREATE INDEX IF NOT EXISTS idx_comments_author_id
ON document_comments(author_id);

View File

@@ -1,7 +0,0 @@
-- Remove duplicate (group_id, permission) rows that accumulated without a UNIQUE constraint.
-- Keeps the row with the smallest ctid (earliest physical insertion order).
DELETE FROM group_permissions a
USING group_permissions b
WHERE a.ctid < b.ctid
AND a.group_id = b.group_id
AND a.permission = b.permission;

View File

@@ -1,11 +0,0 @@
-- Add NOT NULL and PRIMARY KEY to group_permissions.
-- Requires V63 to have run first (no duplicates can remain).
--
-- After this migration, future seed migrations can use:
-- INSERT INTO group_permissions ... ON CONFLICT DO NOTHING
-- instead of the INSERT ... WHERE NOT EXISTS pattern used before V64.
ALTER TABLE group_permissions
ALTER COLUMN permission SET NOT NULL;
ALTER TABLE group_permissions
ADD CONSTRAINT pk_group_permissions PRIMARY KEY (group_id, permission);

View File

@@ -1,8 +0,0 @@
-- Promote the de-facto unique constraint on transcription_block_mentioned_persons to a named PK.
-- uq_tbmp_block_person (added in V57) is backed by a B-tree index identical to a PK;
-- this rename makes the naming convention explicit (pk_* vs uq_*).
ALTER TABLE transcription_block_mentioned_persons
DROP CONSTRAINT uq_tbmp_block_person;
ALTER TABLE transcription_block_mentioned_persons
ADD CONSTRAINT pk_tbmp PRIMARY KEY (block_id, person_id);

View File

@@ -1,3 +0,0 @@
-- The composite PK (invite_token_id, group_id) does not support efficient lookups by group_id alone.
-- Add a dedicated index to support existsActiveWithGroupId queries.
CREATE INDEX idx_itg_group_id ON invite_token_group_ids (group_id);

View File

@@ -1,27 +0,0 @@
-- Re-introduces the Spring Session JDBC tables that were dropped by V2 as unused.
-- DDL copied verbatim from Spring Session 3.x schema-postgresql.sql.
-- See ADR-020 and issue #523.
CREATE TABLE spring_session (
PRIMARY_ID CHAR(36) NOT NULL,
SESSION_ID CHAR(36) NOT NULL,
CREATION_TIME BIGINT NOT NULL,
LAST_ACCESS_TIME BIGINT NOT NULL,
MAX_INACTIVE_INTERVAL INT NOT NULL,
EXPIRY_TIME BIGINT NOT NULL,
PRINCIPAL_NAME VARCHAR(100),
CONSTRAINT spring_session_pk PRIMARY KEY (PRIMARY_ID)
);
CREATE UNIQUE INDEX spring_session_ix1 ON spring_session (SESSION_ID);
CREATE INDEX spring_session_ix2 ON spring_session (EXPIRY_TIME);
CREATE INDEX spring_session_ix3 ON spring_session (PRINCIPAL_NAME);
CREATE TABLE spring_session_attributes (
SESSION_PRIMARY_ID CHAR(36) NOT NULL,
ATTRIBUTE_NAME VARCHAR(200) NOT NULL,
ATTRIBUTE_BYTES BYTEA NOT NULL,
CONSTRAINT spring_session_attributes_pk PRIMARY KEY (SESSION_PRIMARY_ID, ATTRIBUTE_NAME),
CONSTRAINT spring_session_attributes_fk FOREIGN KEY (SESSION_PRIMARY_ID)
REFERENCES spring_session (PRIMARY_ID) ON DELETE CASCADE
);

View File

@@ -1,17 +0,0 @@
-- Read-only role used by the Grafana PostgreSQL datasource for the PO Overview
-- dashboard (issue #651). The role is created here without a usable password
-- (LOGIN-capable but no password set); R__grafana_reader_password.sql sets the
-- password from GRAFANA_DB_PASSWORD on every boot, so rotation is just "bump
-- the env var and restart the backend" — see docs/adr/024-* and the rotation
-- runbook in docs/DEPLOYMENT.md.
DO $$
BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_catalog.pg_roles WHERE rolname = 'grafana_reader') THEN
CREATE ROLE grafana_reader WITH LOGIN;
END IF;
END
$$;
GRANT CONNECT ON DATABASE ${flyway:database} TO grafana_reader;
GRANT USAGE ON SCHEMA public TO grafana_reader;
GRANT SELECT ON audit_log, documents, transcription_blocks TO grafana_reader;

View File

@@ -1,67 +0,0 @@
-- Phase 2 of "Handling the Unknowns": the schema foundation.
-- Consolidates every new import/precision/attribution/identity column into ONE
-- migration with a single owner so downstream phases (importer, rendering, persons
-- directory) compile against a finished, collision-free schema. See ADR-025.
--
-- This file is forward-only and immutable once shipped (Flyway checksum model):
-- any fix goes in a later version, never an edit here.
-- ─── documents: date precision, range end, raw date, raw attribution ──────────
-- Range end is only set for RANGE precision (open-ended ranges allowed → end may be null).
ALTER TABLE documents ADD COLUMN meta_date_end date;
-- Original date cell, verbatim, for provenance and "as written" display (Phase 4).
ALTER TABLE documents ADD COLUMN meta_date_raw text;
-- Raw attribution preserved even when a person is linked.
ALTER TABLE documents ADD COLUMN sender_text text;
ALTER TABLE documents ADD COLUMN receiver_text text;
-- Bound user-influenced spreadsheet text at the DB layer (mirrors transcription_blocks
-- length cap in V18). Defense in depth against malformed/huge import cells.
ALTER TABLE documents ADD CONSTRAINT chk_meta_date_raw_length CHECK (length(meta_date_raw) <= 10000);
ALTER TABLE documents ADD CONSTRAINT chk_sender_text_length CHECK (length(sender_text) <= 10000);
ALTER TABLE documents ADD CONSTRAINT chk_receiver_text_length CHECK (length(receiver_text) <= 10000);
-- Precision enum — added with a DB default of 'UNKNOWN', backfilled, then made NOT NULL.
-- The DEFAULT serves two purposes: (1) existing rows get 'UNKNOWN' immediately, and
-- (2) raw-SQL inserts that omit the column (test fixtures, ad-hoc data loads) get a sane,
-- CHECK-valid value instead of violating the NOT NULL constraint. JPA saves still set it
-- explicitly via the entity's @Builder.Default = DatePrecision.UNKNOWN.
ALTER TABLE documents ADD COLUMN meta_date_precision varchar(16) DEFAULT 'UNKNOWN';
UPDATE documents
SET meta_date_precision = CASE WHEN meta_date IS NOT NULL THEN 'DAY' ELSE 'UNKNOWN' END;
ALTER TABLE documents ALTER COLUMN meta_date_precision SET NOT NULL;
-- Fail-closed allowlist of the seven precision values (verbatim mirror of the
-- normalizer's Precision enum). The DB enforces validity independent of the Java enum.
ALTER TABLE documents ADD CONSTRAINT chk_meta_date_precision
CHECK (meta_date_precision IN ('DAY', 'MONTH', 'SEASON', 'YEAR', 'RANGE', 'APPROX', 'UNKNOWN'));
-- A non-null range end is permitted only when precision = RANGE. A RANGE row MAY have a
-- null end (open-ended range), so the rule is one-directional, not biconditional.
ALTER TABLE documents ADD CONSTRAINT chk_meta_date_end_only_for_range
CHECK (meta_date_end IS NULL OR meta_date_precision = 'RANGE');
-- For ranges with both endpoints, the end must not precede the start.
ALTER TABLE documents ADD CONSTRAINT chk_meta_date_end_after_start
CHECK (meta_date_end IS NULL OR meta_date IS NULL OR meta_date_end >= meta_date);
-- ─── persons: source_ref (import identity) + provisional flag ─────────────────
-- The normalizer person_id: join key for documents → persons and idempotency key for
-- re-import. Nullable (manually created persons never have one); unique among non-nulls.
ALTER TABLE persons ADD COLUMN source_ref varchar(255);
CREATE UNIQUE INDEX idx_persons_source_ref ON persons (source_ref);
-- A provisional person is one the importer inferred but could not confidently identify.
-- Stays false until Phase 3 (importer) sets it; no code path writes true in this phase.
ALTER TABLE persons ADD COLUMN provisional boolean NOT NULL DEFAULT false;
-- ─── tag: source_ref (import identity, keyed on canonical tag_path) ───────────
ALTER TABLE tag ADD COLUMN source_ref varchar(255);
CREATE UNIQUE INDEX idx_tag_source_ref ON tag (source_ref);

Some files were not shown because too many files have changed in this diff Show More