Compare commits

..

2 Commits

Author SHA1 Message Date
Marcel
ed028e793e test(e2e): add coverage for all 12 critical journeys (TEST-3 #405)
Some checks failed
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / Unit & Component Tests (pull_request) Failing after 3m32s
CI / OCR Service Tests (pull_request) Successful in 29s
CI / Backend Unit Tests (pull_request) Failing after 3m3s
Adds docs/audits/e2e-coverage-report.md mapping all 12 critical journeys
to their test files. Fills the 6 coverage gaps with new e2e tests:

- J1: Register via invite code (auth.spec.ts)
- J3: Edit document tags via TagInput (documents.spec.ts)
- J4: Create brand-new tag via TagInput (documents.spec.ts)
- J5: Add SPOUSE_OF relationship on person edit page (persons.spec.ts)
- J6: Multi-filter search (text + date, text + tagId) (documents.spec.ts)
- J10: Notification bell opens dropdown (notification-deep-link.spec.ts)
- J11: Non-admin blocked from /admin/* (permissions.spec.ts)
- J12: Mass import trigger shows status (admin.spec.ts)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 17:58:42 +02:00
Marcel
34bbf64198 docs(audit): add mutation test report for 7 Tier-1 service domains
35/35 mutations DETECTED across document, person, tag, user, geschichte,
notification, and OCR domains. No tautological tests found — the suite
is trustworthy on all critical paths. Closes issue #403.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-05 17:45:31 +02:00
754 changed files with 10263 additions and 85137 deletions

View File

@@ -410,23 +410,6 @@ Never Kafka for teams under 10 or <100k events/day. Never gRPC inside a monolith
4. Identify missing database-layer enforcement (constraints, RLS) 4. Identify missing database-layer enforcement (constraints, RLS)
5. Check transport choices — simpler protocol available? 5. Check transport choices — simpler protocol available?
6. Propose a concrete simpler alternative, not just a critique 6. Propose a concrete simpler alternative, not just a critique
7. Verify documentation currency. For each category below, check whether the PR triggered the update. Flag missing updates as blockers.
| PR contains | Required doc update |
|---|---|
| New Flyway migration adding/removing/renaming a table or column | `docs/architecture/db/db-orm.puml` and `docs/architecture/db/db-relationships.puml`**except** framework-owned tables (e.g. Spring Session JDBC's `spring_session*`, Flyway's `flyway_schema_history`), which are opaque to app code; reference the relevant ADR if an exclusion is load-bearing |
| New `@ManyToMany` join table or FK | Both DB diagrams |
| New backend package or domain module | `CLAUDE.md` package table + matching `docs/architecture/c4/l3-backend-*.puml` |
| New controller or service in an existing backend domain | Matching `docs/architecture/c4/l3-backend-*.puml` |
| New SvelteKit route | `CLAUDE.md` route table + matching `docs/architecture/c4/l3-frontend-*.puml` |
| New Docker service or infrastructure component | `docs/architecture/c4/l2-containers.puml` + `docs/DEPLOYMENT.md` |
| New external system integrated | `docs/architecture/c4/l1-context.puml` |
| Auth or upload flow change | `docs/architecture/c4/seq-auth-flow.puml` or `docs/architecture/c4/seq-document-upload.puml` |
| New `ErrorCode` or `Permission` value | `CLAUDE.md` + `docs/ARCHITECTURE.md` |
| New domain concept or term | `docs/GLOSSARY.md` |
| Architectural decision with lasting consequences | New ADR in `docs/adr/` |
A doc omission is a blocker, not a concern — the PR does not merge until the diagram or text matches the code.
### Designing Systems ### Designing Systems
1. Start with the data model — get the schema right before application code 1. Start with the data model — get the schema right before application code

View File

@@ -980,24 +980,6 @@ Mark with `@pytest.mark.asyncio` so pytest runs the coroutine. Without it, the t
5. Refactor — apply clean code, extract if 3+ duplications, rename for intent 5. Refactor — apply clean code, extract if 3+ duplications, rename for intent
6. Repeat for the next behavior 6. Repeat for the next behavior
7. When all behaviors are green, review for SOLID violations across the full stack 7. When all behaviors are green, review for SOLID violations across the full stack
8. Update documentation before opening the PR. Use the table below to know which doc to touch.
| What changed in code | Doc(s) to update |
|---|---|
| New Flyway migration adds/removes/renames a table or column | `docs/architecture/db/db-orm.puml` (add/remove entity or attribute) **and** `docs/architecture/db/db-relationships.puml` (add/remove relationship line) — **except** framework-owned tables (e.g. Spring Session JDBC's `spring_session*`, Flyway's `flyway_schema_history`), which are opaque to app code; reference the relevant ADR if an exclusion is load-bearing |
| New `@ManyToMany` join table or FK relationship | Both DB diagrams above |
| New backend package / domain module | `CLAUDE.md` (package structure table) **and** the matching `docs/architecture/c4/l3-backend-*.puml` diagram for that domain |
| New Spring Boot controller or service in an existing domain | The matching `docs/architecture/c4/l3-backend-*.puml` for that domain |
| New SvelteKit route (`+page.svelte`) | `CLAUDE.md` (route structure section) **and** the matching `docs/architecture/c4/l3-frontend-*.puml` diagram |
| New Docker service / infrastructure component | `docs/architecture/c4/l2-containers.puml` **and** `docs/DEPLOYMENT.md` |
| New external system integrated (new API, new S3 bucket, etc.) | `docs/architecture/c4/l1-context.puml` |
| Auth flow or document-upload flow changes | `docs/architecture/c4/seq-auth-flow.puml` or `docs/architecture/c4/seq-document-upload.puml` |
| New `ErrorCode` enum value | `CLAUDE.md` error handling section **and** `CONTRIBUTING.md` |
| New `Permission` enum value | `CLAUDE.md` security section **and** `docs/ARCHITECTURE.md` |
| New domain term introduced (entity name, status, concept) | `docs/GLOSSARY.md` |
| Architectural decision with lasting consequences (new tech, new transport protocol, new pattern) | New ADR in `docs/adr/` |
Skip a doc only if the change genuinely does not affect what that doc describes.
### Reviewing Code ### Reviewing Code
1. TDD evidence — are there tests? Do they precede the implementation? 1. TDD evidence — are there tests? Do they precede the implementation?

View File

@@ -38,10 +38,10 @@ Screen readers and search engines rely on landmarks to navigate. Every page need
2. **Use CSS custom properties for all brand colors** 2. **Use CSS custom properties for all brand colors**
```css ```css
/* layout.css — semantic tokens backed by CSS variables (see --palette-* for raw values) */ /* layout.css */
--color-ink: var(--c-ink); --color-ink: #002850;
--color-accent: var(--c-accent); --color-accent: #A6DAD8;
--color-surface: var(--c-surface); --color-surface: #E4E2D7;
``` ```
```svelte ```svelte
<div class="text-ink bg-surface border-line"> <div class="text-ink bg-surface border-line">
@@ -103,8 +103,8 @@ unsaved work without warning.
1. **Enforce WCAG AA contrast ratios** 1. **Enforce WCAG AA contrast ratios**
``` ```
brand-navy (--palette-navy) on white: ~14.5:1 -- AAA pass (verify exact value in layout.css) brand-navy (#002850) on white: 14.5:1 -- AAA pass
brand-mint (--palette-mint) on navy: ~7.2:1 -- AAA pass for large text brand-mint (#A6DAD8) on navy: 7.2:1 -- AAA pass for large text
Gray-500 on white: check >= 4.5:1 -- AA minimum for body text Gray-500 on white: check >= 4.5:1 -- AA minimum for body text
``` ```
Always verify contrast with a tool. AA is the floor (4.5:1 normal text, 3:1 large text). Target AAA (7:1) for body copy. Always verify contrast with a tool. AA is the floor (4.5:1 normal text, 3:1 large text). Target AAA (7:1) for body copy.
@@ -134,8 +134,8 @@ Color-blind users (8% of men) cannot distinguish status by color alone. Always p
/* Silver #CACAC9 on white = 1.5:1 -- fails all WCAG levels */ /* Silver #CACAC9 on white = 1.5:1 -- fails all WCAG levels */
.caption { color: #CACAC9; } .caption { color: #CACAC9; }
/* brand-mint on white = ~2.8:1 -- fails AA for normal text */ /* brand-mint on white = 2.8:1 -- fails AA for normal text */
.label { color: var(--palette-mint); } .label { color: #A6DAD8; }
``` ```
Test every text color against its background. Decorative palette colors are for borders and backgrounds, not text. Test every text color against its background. Decorative palette colors are for borders and backgrounds, not text.
@@ -338,7 +338,7 @@ Test at 320px (small phone), 768px (tablet), and 1440px (desktop). Review diffs
<table> <table>
<tr><td>Section title</td><td><code>text-xs font-bold uppercase tracking-widest</code></td> <tr><td>Section title</td><td><code>text-xs font-bold uppercase tracking-widest</code></td>
<td>12px / 700</td><td>Most commonly undersized</td></tr> <td>12px / 700</td><td>Most commonly undersized</td></tr>
<tr><td>Card container</td><td><code>bg-surface shadow-sm border border-line rounded-sm p-6</code></td> <tr><td>Card container</td><td><code>bg-white shadow-sm border border-brand-sand rounded-sm p-6</code></td>
<td>padding 24px</td><td></td></tr> <td>padding 24px</td><td></td></tr>
</table> </table>
</div> </div>
@@ -376,10 +376,10 @@ await page.setViewportSize({ width: 1440, height: 900 });
## Domain Expertise ## Domain Expertise
### Brand Palette ### Brand Palette
- **Primary**: `brand-navy` (`--palette-navy`) — text, buttons, headers; `brand-mint` (`--palette-mint`) — accents, hover; sand (`--palette-sand`) — page background (use `bg-canvas` or `bg-surface` as Tailwind utilities, not `bg-brand-sand`) - **Primary**: brand-navy `#002850` (text, buttons, headers), brand-mint `#A6DAD8` (accents, hover), brand-sand `#E4E2D7` (backgrounds, borders)
- **Typography**: `font-serif` (Tinos) for body/titles, `font-sans` (Montserrat) for labels/UI chrome - **Typography**: `font-serif` (Merriweather) for body/titles, `font-sans` (Montserrat) for labels/UI chrome
- **Card pattern**: `bg-surface shadow-sm border border-line rounded-sm p-6` - **Card pattern**: `bg-white shadow-sm border border-brand-sand rounded-sm p-6`
- **Section title**: `text-xs font-bold uppercase tracking-widest text-ink-3 mb-5` - **Section title**: `text-xs font-bold uppercase tracking-widest text-gray-400 mb-5`
### Dual-Audience Design (25-42 AND 60+) ### Dual-Audience Design (25-42 AND 60+)
- Seniors: 16px minimum body text (prefer 18px), 44px touch targets (prefer 48px), redundant cues, calm layouts, persistent navigation, no timed interactions - Seniors: 16px minimum body text (prefer 18px), 44px touch targets (prefer 48px), redundant cues, calm layouts, persistent navigation, no timed interactions

View File

@@ -1,3 +1,96 @@
# Dev Container # Dev Container — Familienarchiv
→ See [.devcontainer/README.md](./README.md) for configuration, usage, and known limitations. ## Overview
VS Code Dev Container configuration for a pre-configured development environment. Includes Java 21, Maven, and Node.js 24 — everything needed to work on both backend and frontend.
## Configuration
File: `.devcontainer/devcontainer.json`
### Included Features
| Feature | Version | Purpose |
|---|---|---|
| Java | 21 | Spring Boot backend |
| Maven | bundled with Java feature | Build tool |
| Node.js | 24 | SvelteKit frontend |
### VS Code Extensions (Auto-installed)
| Extension | Purpose |
|---|---|
| `vscjava.vscode-java-pack` | Java language support, debugging, testing |
| `vmware.vscode-spring-boot` | Spring Boot tooling |
| `gabrielbb.vscode-lombok` | Lombok annotation support |
| `humao.rest-client` | HTTP request files (for `backend/api_tests/`) |
### Ports
- `8080` forwarded to host — access backend at `http://localhost:8080`
### User
Runs as `vscode` user (not root) for security.
## How to Use
### Prerequisites
- VS Code with the **Dev Containers** extension installed
- Docker running locally
### Open in Dev Container
1. Open the project in VS Code
2. Press `F1` → type "Dev Containers: Reopen in Container"
3. VS Code will:
- Build the container using the root `docker-compose.yml`
- Install Java 21, Maven, and Node 24
- Install the listed extensions
- Mount the workspace folder
### Working Inside the Container
Once inside the container, you have access to both stacks:
```bash
# Backend
cd backend
./mvnw spring-boot:run
# Frontend (in a new terminal)
cd frontend
npm install
npm run dev
```
The container reuses the `docker-compose.yml` services, so PostgreSQL and MinIO are available automatically.
### Forwarding Frontend Port
The devcontainer config only forwards port 8080 by default. To access the frontend dev server (port 5173 or 3000), either:
1. Add `5173` to `forwardPorts` in `devcontainer.json`, or
2. Use the VS Code "Ports" panel to forward it dynamically
## Limitations
- The devcontainer attaches to the `backend` service from `docker-compose.yml`, so it inherits those environment variables
- OCR service and other containers should be started separately via `docker-compose up -d`
- GPU passthrough for OCR training is not configured
## Customization
To add more tools or extensions, edit `.devcontainer/devcontainer.json`:
```json
{
"features": {
"ghcr.io/devcontainers/features/python:1": {
"version": "3.11"
}
},
"forwardPorts": [8080, 5173, 3000]
}
```

View File

@@ -1,94 +0,0 @@
# Dev Container — Familienarchiv
VS Code Dev Container configuration for a pre-configured development environment. Includes Java 21, Maven, and Node.js 24 — everything needed to work on both backend and frontend.
## Configuration
File: `.devcontainer/devcontainer.json`
### Included Features
| Feature | Version | Purpose |
| ------- | ------------------------- | ------------------- |
| Java | 21 | Spring Boot backend |
| Maven | bundled with Java feature | Build tool |
| Node.js | 24 | SvelteKit frontend |
### VS Code Extensions (Auto-installed)
| Extension | Purpose |
| --------------------------- | --------------------------------------------- |
| `vscjava.vscode-java-pack` | Java language support, debugging, testing |
| `vmware.vscode-spring-boot` | Spring Boot tooling |
| `gabrielbb.vscode-lombok` | Lombok annotation support |
| `humao.rest-client` | HTTP request files (for `backend/api_tests/`) |
### Ports
- `8080` forwarded to host — access backend at `http://localhost:8080`
### User
Runs as `vscode` user (not root) for security.
## How to Use
### Prerequisites
- VS Code with the **Dev Containers** extension installed
- Docker running locally
### Open in Dev Container
1. Open the project in VS Code
2. Press `F1` → type "Dev Containers: Reopen in Container"
3. VS Code will:
- Build the container using the root `docker-compose.yml`
- Install Java 21, Maven, and Node 24
- Install the listed extensions
- Mount the workspace folder
### Working Inside the Container
Once inside the container, you have access to both stacks:
```bash
# Backend
cd backend
./mvnw spring-boot:run
# Frontend (in a new terminal)
cd frontend
npm install
npm run dev
```
The container reuses the `docker-compose.yml` services, so PostgreSQL and MinIO are available automatically.
### Forwarding Frontend Port
The devcontainer config only forwards port 8080 by default. To access the frontend dev server (port 5173 or 3000), either:
1. Add `5173` to `forwardPorts` in `devcontainer.json`, or
2. Use the VS Code "Ports" panel to forward it dynamically
## Limitations
- The devcontainer attaches to the `backend` service from `docker-compose.yml`, so it inherits those environment variables
- OCR service and other containers should be started separately via `docker-compose up -d`
- GPU passthrough for OCR training is not configured
## Customization
To add more tools or extensions, edit `.devcontainer/devcontainer.json`:
```json
{
"features": {
"ghcr.io/devcontainers/features/python:1": {
"version": "3.11"
}
},
"forwardPorts": [8080, 5173, 3000]
}
```

View File

@@ -26,52 +26,6 @@ PORT_MAILPIT_SMTP=1025
# Generate with: python3 -c "import secrets; print(secrets.token_hex(32))" # Generate with: python3 -c "import secrets; print(secrets.token_hex(32))"
OCR_TRAINING_TOKEN=change-me-in-production OCR_TRAINING_TOKEN=change-me-in-production
# --- Observability ---
# Optional stack — start with: docker compose -f docker-compose.observability.yml up -d
# Requires the main stack to already be running (docker compose up -d creates archiv-net).
# In production the stack is managed from /opt/familienarchiv/ (see docs/DEPLOYMENT.md §4).
# Ports for host access
PORT_GRAFANA=3003
PORT_GLITCHTIP=3002
PORT_PROMETHEUS=9090
# Grafana admin password — change this before exposing Grafana beyond localhost
GRAFANA_ADMIN_PASSWORD=changeme
# Password for the read-only grafana_reader PostgreSQL role used by the PO
# Overview dashboard. Consumed by Flyway V68 (to set the role's password) and
# by Grafana's PostgreSQL datasource (to connect). REQUIRED in production —
# generate with: openssl rand -hex 32
GRAFANA_DB_PASSWORD=changeme-generate-with-openssl-rand-hex-32
# GlitchTip domain — production: use https://glitchtip.archiv.raddatz.cloud (must match Caddy vhost)
GLITCHTIP_DOMAIN=http://localhost:3002
# GlitchTip secret key — Django SECRET_KEY equivalent, used to sign sessions and tokens.
# REQUIRED in production — must not be empty or 'changeme'. Fail-closed: GlitchTip will
# refuse to start with an invalid key.
# Generate with: python3 -c "import secrets; print(secrets.token_hex(50))"
GLITCHTIP_SECRET_KEY=changeme-generate-a-real-secret
# PostgreSQL hostname for GlitchTip's db-init job and workers.
# Override when only the staging stack is running (container name differs from archive-db).
# Default (archive-db) is correct for production with the full stack up.
POSTGRES_HOST=archive-db
# $$ escaping note: passwords in /opt/familienarchiv/.env that contain a literal '$' must
# use '$$' so Docker Compose does not expand them as variable references.
# Example: a password 'p@$$word' should be written as 'p@$$$$word' in the .env file.
# Error reporting DSNs — leave empty to disable the SDK (safe default).
# SENTRY_DSN: backend (Spring Boot) — used by the GlitchTip/Sentry Java SDK
SENTRY_DSN=
SENTRY_TRACES_SAMPLE_RATE=
# VITE_SENTRY_DSN: frontend (SvelteKit) — injected at build time via Vite
VITE_SENTRY_DSN=
# Sentry/GlitchTip auth token for source map upload at build time (optional)
SENTRY_AUTH_TOKEN=
# Production SMTP — uncomment and fill in to send real emails instead of catching them # Production SMTP — uncomment and fill in to send real emails instead of catching them
# APP_BASE_URL=https://your-domain.example.com # APP_BASE_URL=https://your-domain.example.com
# MAIL_HOST=smtp.example.com # MAIL_HOST=smtp.example.com

View File

@@ -2,7 +2,6 @@ name: CI
on: on:
push: push:
branches: [main]
pull_request: pull_request:
jobs: jobs:
@@ -13,7 +12,7 @@ jobs:
name: Unit & Component Tests name: Unit & Component Tests
runs-on: ubuntu-latest runs-on: ubuntu-latest
container: container:
image: mcr.microsoft.com/playwright:v1.60.0-noble image: mcr.microsoft.com/playwright:v1.58.2-noble
steps: steps:
- uses: actions/checkout@v4 - uses: actions/checkout@v4
@@ -29,156 +28,27 @@ jobs:
run: npm ci run: npm ci
working-directory: frontend working-directory: frontend
- name: Security audit (no dev deps)
run: npm audit --audit-level=high --omit=dev
working-directory: frontend
- name: Compile Paraglide i18n - name: Compile Paraglide i18n
run: npx @inlang/paraglide-js compile --project ./project.inlang --outdir ./src/lib/paraglide run: npx @inlang/paraglide-js compile --project ./project.inlang --outdir ./src/lib/paraglide
working-directory: frontend working-directory: frontend
- name: Sync SvelteKit
run: npx svelte-kit sync
working-directory: frontend
- name: Lint - name: Lint
run: npm run lint run: npm run lint
working-directory: frontend working-directory: frontend
- name: Assert no banned vi.mock patterns - name: Run unit and component tests
shell: bash run: npm test
run: |
# Literal pdfjs-dist (libLoader pattern — ADR 012)
if grep -rF "vi.mock('pdfjs-dist'" frontend/src/; then
echo "FAIL: banned vi.mock('pdfjs-dist') pattern found — see ADR 012. Use the libLoader prop injection pattern instead."
exit 1
fi
# Async factory with dynamic import in body (named mechanism — ADR 012 / #553).
# Multiline PCRE matches `vi.mock(<arg>, async ... { ... await import(...) ... })`
# across line breaks. __meta__ is excluded because it contains fixture strings
# demonstrating the very pattern this check is meant to forbid.
if grep -rPzln 'vi\.mock\([^)]+,\s*async[^{]*\{[\s\S]*?await\s+import\s*\(' \
--include='*.spec.ts' --include='*.test.ts' \
--exclude-dir='__meta__' \
frontend/src/; then
echo "FAIL: banned async vi.mock factory with dynamic import in body — see ADR 012 / #553. Use a synchronous factory + vi.hoisted instead."
exit 1
fi
- name: Assert no raw document date rendered via {@html} (CWE-79 — #666)
shell: bash
run: |
# meta_date_raw is untrusted verbatim spreadsheet text — it must render via
# Svelte default escaping, never {@html}. This guard flags any {@html ...}
# whose expression references a raw-date variable. A comment mentioning
# "{@html}" without a raw token inside the braces does NOT match.
# The token list MUST cover every variable that carries the raw value:
# DocumentDate.svelte exposes it via the `raw` prop, so `\braw\b` is included.
# Grow this list whenever a new raw-bearing variable name is introduced.
pattern='\{@html[^}]*(metaDateRaw|documentDateRaw|rawDate|\braw\b)'
# Self-test: the regex must catch the dangerous forms and ignore the comment form.
printf '{@html doc.metaDateRaw}\n' | grep -qP "$pattern" \
|| { echo "FAIL: guard self-test — regex missed the unsafe {@html metaDateRaw} form"; exit 1; }
printf '{@html raw}\n' | grep -qP "$pattern" \
|| { echo "FAIL: guard self-test — regex missed the unsafe {@html raw} form (DocumentDate prop)"; exit 1; }
printf 'never use {@html} for this\n' | grep -qvP "$pattern" \
|| { echo "FAIL: guard self-test — regex wrongly flagged a {@html} comment"; exit 1; }
if grep -rPln "$pattern" --include='*.svelte' frontend/src/; then
echo "FAIL: meta_date_raw rendered via {@html} — use default {…} escaping (CWE-79, #666)."
exit 1
fi
- name: Assert no (upload|download)-artifact past v3
shell: bash
run: |
# Self-test: verify the regex catches v4+ and does not catch v3.
tmp=$(mktemp)
printf ' uses: actions/upload-artifact@v5\n' > "$tmp"
grep -qP '^\s+uses:\s+actions/(upload|download)-artifact@v[4-9]' "$tmp" \
|| { echo "FAIL: guard self-test — regex missed upload-artifact@v5"; rm "$tmp"; exit 1; }
printf ' uses: actions/upload-artifact@v3\n' > "$tmp"
grep -qvP '^\s+uses:\s+actions/(upload|download)-artifact@v[4-9]' "$tmp" \
|| { echo "FAIL: guard self-test — regex incorrectly flagged upload-artifact@v3"; rm "$tmp"; exit 1; }
rm "$tmp"
# Guard: Gitea Actions (act_runner) does not implement the v4 artifact protocol.
# Both upload-artifact and download-artifact share the same incompatibility.
# Pin to @v3. See ADR-014 / #557.
if grep -RPn '^\s+uses:\s+actions/(upload|download)-artifact@v[4-9]' .gitea/workflows/; then
echo "::error::actions/(upload|download)-artifact@v4+ is unsupported on Gitea Actions (act_runner). Pin to @v3. See ADR-014 / #557."
exit 1
fi
- name: Run unit and component tests with coverage
shell: bash
run: |
set -eo pipefail
npm run test:coverage 2>&1 | tee /tmp/coverage-test-${{ github.run_id }}.log
working-directory: frontend
env:
TZ: Europe/Berlin
# Diagnostic guard: covers the coverage run only. If `npm test` (above)
# exits 1 with a birpc error, the named pattern appears here — not there.
- name: Assert no birpc teardown race in coverage run
shell: bash
if: always()
run: |
if grep -qF "[birpc] rpc is closed" /tmp/coverage-test-${{ github.run_id }}.log 2>/dev/null; then
echo "FAIL: [birpc] rpc is closed teardown race detected in coverage run"
grep -F "[birpc] rpc is closed" /tmp/coverage-test-${{ github.run_id }}.log
exit 1
fi
# Gitea Actions (act_runner) does not implement upload-artifact v4 protocol — pinned per ADR-014. Do NOT upgrade. See #557.
- name: Upload coverage reports
if: always()
uses: actions/upload-artifact@v3
with:
name: coverage-reports
path: |
frontend/coverage/
/tmp/coverage-test-${{ github.run_id }}.log
- name: Build frontend
run: npm run build
working-directory: frontend working-directory: frontend
# ── Prerender output is exactly the public help page ───────────────────
# SvelteKit prerender + crawl follows nav links and bakes "redirect to
# /login" HTML for every protected route, served BEFORE runtime hooks
# (see #514). With `crawl: false` only the explicit entry should land
# in build/prerendered/. Anything else is a regression — fail the build.
- name: Assert prerender output is only /hilfe/transkription
run: |
cd frontend
set -e
extra=$(find build/prerendered -type f \
-not -path 'build/prerendered/hilfe/*' \
-not -name '*.br' -not -name '*.gz' \
|| true)
if [ -n "$extra" ]; then
echo "FAIL: unexpected prerendered files (would shadow runtime hooks):"
echo "$extra"
exit 1
fi
# And the help page must still be there.
test -f build/prerendered/hilfe/transkription.html \
|| { echo "FAIL: /hilfe/transkription.html missing from prerender output"; exit 1; }
echo "PASS: only /hilfe/transkription.html prerendered."
# Gitea Actions (act_runner) does not implement upload-artifact v4 protocol — pinned per ADR-014. Do NOT upgrade. See #557.
- name: Upload screenshots - name: Upload screenshots
if: always() if: always()
uses: actions/upload-artifact@v3 uses: actions/upload-artifact@v4
with: with:
name: unit-test-screenshots name: unit-test-screenshots
path: frontend/test-results/screenshots/ path: frontend/test-results/screenshots/
# ─── OCR Service Unit Tests ─────────────────────────────────────────────────── # ─── OCR Service Unit Tests ───────────────────────────────────────────────────
# Only stdlib/lightweight tests — no ML stack (PyTorch/Surya/Kraken) required. # Only spell_check.py, test_confidence.py, test_sender_registry.py — no ML stack required.
# test_tmpdir.py covers the TMPDIR env var and entrypoint mkdir behaviour (ADR-021).
# test_tmpdir_is_inside_persistent_cache_volume is skipped in CI (TMPDIR not
# set to /app/cache here); it runs inside the deployed Docker container.
ocr-tests: ocr-tests:
name: OCR Service Tests name: OCR Service Tests
runs-on: ubuntu-latest runs-on: ubuntu-latest
@@ -190,11 +60,11 @@ jobs:
python-version: '3.11' python-version: '3.11'
- name: Install test dependencies - name: Install test dependencies
run: pip install "pyspellchecker==0.9.0" "fastapi==0.115.6" pytest pytest-asyncio run: pip install "pyspellchecker==0.9.0" pytest pytest-asyncio
working-directory: ocr-service working-directory: ocr-service
- name: Run OCR unit tests (no ML stack required) - name: Run OCR unit tests (no ML stack required)
run: python -m pytest test_spell_check.py test_confidence.py test_sender_registry.py test_tmpdir.py -v run: python -m pytest test_spell_check.py test_confidence.py test_sender_registry.py -v
working-directory: ocr-service working-directory: ocr-service
# ─── Backend Unit & Slice Tests ─────────────────────────────────────────────── # ─── Backend Unit & Slice Tests ───────────────────────────────────────────────
@@ -204,8 +74,6 @@ jobs:
runs-on: ubuntu-latest runs-on: ubuntu-latest
env: env:
DOCKER_API_VERSION: "1.43" # NAS runner runs Docker 24.x (max API 1.43); Testcontainers 2.x defaults to 1.44 DOCKER_API_VERSION: "1.43" # NAS runner runs Docker 24.x (max API 1.43); Testcontainers 2.x defaults to 1.44
DOCKER_HOST: unix:///var/run/docker.sock
TESTCONTAINERS_RYUK_DISABLED: "true"
steps: steps:
- uses: actions/checkout@v4 - uses: actions/checkout@v4
@@ -224,155 +92,5 @@ jobs:
- name: Run backend tests - name: Run backend tests
run: | run: |
chmod +x mvnw chmod +x mvnw
./mvnw clean verify ./mvnw clean test
working-directory: backend working-directory: backend
- name: Upload surefire reports
if: always()
# Gitea Actions (act_runner) does not implement upload-artifact v4 protocol — pinned per ADR-014. Do NOT upgrade. See #557.
uses: actions/upload-artifact@v3
with:
name: surefire-reports
path: backend/target/surefire-reports/
# ─── fail2ban Regex Regression ────────────────────────────────────────────────
# The filter parses Caddy's JSON access log; a Caddy upgrade that reorders
# the JSON keys would silently break it (fail2ban-regex would return
# "0 matches", fail2ban would stop banning, no error surface). This job
# pins the contract against a deterministic sample line.
fail2ban-regex:
name: fail2ban Regex
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Install fail2ban
run: |
sudo apt-get update
sudo apt-get install -y fail2ban
- name: Matches /api/auth/login 401
run: |
echo '{"level":"info","ts":1700000000.12,"logger":"http.log.access","msg":"handled request","request":{"remote_ip":"203.0.113.42","method":"POST","host":"archiv.raddatz.cloud","uri":"/api/auth/login"},"status":401}' > /tmp/sample.log
out=$(fail2ban-regex /tmp/sample.log infra/fail2ban/filter.d/familienarchiv-auth.conf)
echo "$out"
echo "$out" | grep -qE '1 matched' \
|| { echo "expected 1 match for /api/auth/login 401"; exit 1; }
- name: Matches /api/auth/login 429
run: |
echo '{"level":"info","ts":1700000000.12,"logger":"http.log.access","msg":"handled request","request":{"remote_ip":"203.0.113.42","method":"POST","host":"archiv.raddatz.cloud","uri":"/api/auth/login"},"status":429}' > /tmp/sample.log
out=$(fail2ban-regex /tmp/sample.log infra/fail2ban/filter.d/familienarchiv-auth.conf)
echo "$out"
echo "$out" | grep -qE '1 matched' \
|| { echo "expected 1 match for /api/auth/login 429"; exit 1; }
- name: Matches /api/auth/forgot-password 401
run: |
echo '{"level":"info","ts":1700000000.12,"logger":"http.log.access","msg":"handled request","request":{"remote_ip":"203.0.113.42","method":"POST","host":"archiv.raddatz.cloud","uri":"/api/auth/forgot-password"},"status":401}' > /tmp/sample.log
out=$(fail2ban-regex /tmp/sample.log infra/fail2ban/filter.d/familienarchiv-auth.conf)
echo "$out"
echo "$out" | grep -qE '1 matched' \
|| { echo "expected 1 match for /api/auth/forgot-password 401"; exit 1; }
- name: Does not match /api/auth/login 200
run: |
echo '{"level":"info","ts":1700000000.12,"logger":"http.log.access","msg":"handled request","request":{"remote_ip":"203.0.113.42","method":"POST","host":"archiv.raddatz.cloud","uri":"/api/auth/login"},"status":200}' > /tmp/sample.log
out=$(fail2ban-regex /tmp/sample.log infra/fail2ban/filter.d/familienarchiv-auth.conf)
echo "$out"
echo "$out" | grep -qE '0 matched' \
|| { echo "expected 0 matches for /api/auth/login 200"; exit 1; }
- name: Does not match /api/documents (unrelated 401)
run: |
echo '{"level":"info","ts":1700000000.12,"logger":"http.log.access","msg":"handled request","request":{"remote_ip":"203.0.113.42","method":"GET","host":"archiv.raddatz.cloud","uri":"/api/documents"},"status":401}' > /tmp/sample.log
out=$(fail2ban-regex /tmp/sample.log infra/fail2ban/filter.d/familienarchiv-auth.conf)
echo "$out"
echo "$out" | grep -qE '0 matched' \
|| { echo "expected 0 matches for /api/documents 401"; exit 1; }
# ── Backend resolves to file-polling, not systemd ─────────────────────
# The Debian/Ubuntu fail2ban package ships defaults-debian.conf with
# `[DEFAULT] backend = systemd`. Without `backend = polling` in our
# jail, the daemon loads the jail but reads from journald and never
# touches /var/log/caddy/access.log — i.e. the regex above passes in
# isolation while the live jail is inert. See issue #503.
- name: Jail resolves with polling backend (not inherited systemd)
run: |
sudo ln -sfn "$PWD/infra/fail2ban/jail.d/familienarchiv.conf" /etc/fail2ban/jail.d/familienarchiv.conf
sudo ln -sfn "$PWD/infra/fail2ban/filter.d/familienarchiv-auth.conf" /etc/fail2ban/filter.d/familienarchiv-auth.conf
dump=$(sudo fail2ban-client -d 2>&1)
echo "$dump" | grep -E "add.*familienarchiv-auth" || true
echo "$dump" | grep -qE "\['add', 'familienarchiv-auth', 'polling'\]" \
|| { echo "FAIL: familienarchiv-auth jail did not resolve to 'polling' backend"; exit 1; }
# ─── Semgrep Security Scan ───────────────────────────────────────────────────
# Catches XXE-unprotected XML parser factories and similar patterns defined in
# .semgrep/security.yml. Runs in parallel with backend-unit-tests for fast feedback.
# Uses local rules only (no SEMGREP_APP_TOKEN / OIDC — act_runner does not support it).
semgrep-scan:
name: Semgrep Security Scan
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.11'
cache: 'pip'
- name: Install Semgrep
run: pip install semgrep==1.163.0
- name: Run security rules
run: semgrep --config .semgrep/security.yml --error --metrics=off backend/src/
# ─── Compose Bucket-Bootstrap Idempotency ─────────────────────────────────────
# docker-compose.prod.yml's create-buckets service runs on every
# `docker compose up` (one-shot, no restart). Must be idempotent — a
# re-deploy must not fail just because the bucket / user / policy
# already exists. Validated by running create-buckets twice against a
# throwaway minio stack and asserting both invocations exit 0.
compose-idempotency:
name: Compose Bucket Idempotency
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Write stub env file
run: |
cat > .env.test <<'EOF'
TAG=test
PORT_BACKEND=18080
PORT_FRONTEND=13000
APP_DOMAIN=localhost
POSTGRES_PASSWORD=stub
MINIO_PASSWORD=stubrootpassword
MINIO_APP_PASSWORD=stubapppassword
OCR_TRAINING_TOKEN=stub
APP_ADMIN_USERNAME=admin@local
APP_ADMIN_PASSWORD=stub
MAIL_HOST=mailpit
MAIL_PORT=1025
APP_MAIL_FROM=noreply@local
IMPORT_HOST_DIR=/tmp/dummy-import
COMPOSE_NETWORK_NAME=test-idem-archiv-net
EOF
- name: Bring up minio
run: |
docker compose -f docker-compose.prod.yml -p test-idem --env-file .env.test up -d --wait minio
- name: First create-buckets run
run: |
docker compose -f docker-compose.prod.yml -p test-idem --env-file .env.test run --rm create-buckets
- name: Second create-buckets run (idempotency check)
run: |
docker compose -f docker-compose.prod.yml -p test-idem --env-file .env.test run --rm create-buckets
- name: Teardown
if: always()
run: |
docker compose -f docker-compose.prod.yml -p test-idem --env-file .env.test down -v
rm -f .env.test

View File

@@ -1,65 +0,0 @@
name: Coverage Flake Probe
# Manually-triggered probe for the birpc teardown race documented in ADR 012
# / #553. Runs the full coverage suite 20× in parallel against a single SHA
# and asserts zero `[birpc] rpc is closed` lines across every cell. Verifies
# the acceptance criterion that the race no longer surfaces under coverage.
on:
workflow_dispatch:
jobs:
coverage-flake-probe:
name: Coverage flake probe (run ${{ matrix.run }})
runs-on: ubuntu-latest
container:
image: mcr.microsoft.com/playwright:v1.58.2-noble
strategy:
fail-fast: false
matrix:
run: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
steps:
- uses: actions/checkout@v4
- name: Cache node_modules
id: node-modules-cache
uses: actions/cache@v4
with:
path: frontend/node_modules
key: node-modules-${{ hashFiles('frontend/package-lock.json') }}
- name: Install dependencies
if: steps.node-modules-cache.outputs.cache-hit != 'true'
run: npm ci
working-directory: frontend
- name: Compile Paraglide i18n
run: npx @inlang/paraglide-js compile --project ./project.inlang --outdir ./src/lib/paraglide
working-directory: frontend
- name: Run unit and component tests with coverage
shell: bash
run: |
set -eo pipefail
npm run test:coverage 2>&1 | tee /tmp/coverage-test-${{ github.run_id }}-${{ matrix.run }}.log
working-directory: frontend
env:
TZ: Europe/Berlin
- name: Assert no birpc teardown race
shell: bash
if: always()
run: |
if grep -qF "[birpc] rpc is closed" /tmp/coverage-test-${{ github.run_id }}-${{ matrix.run }}.log 2>/dev/null; then
echo "FAIL: [birpc] rpc is closed teardown race detected in run ${{ matrix.run }}"
grep -F "[birpc] rpc is closed" /tmp/coverage-test-${{ github.run_id }}-${{ matrix.run }}.log
exit 1
fi
# Gitea Actions (act_runner) does not implement upload-artifact v4 protocol — pinned per ADR-014. Do NOT upgrade. See #557.
- name: Upload coverage log on failure
if: failure()
uses: actions/upload-artifact@v3
with:
name: coverage-log-run-${{ matrix.run }}
path: /tmp/coverage-test-${{ github.run_id }}-${{ matrix.run }}.log

View File

@@ -1,284 +0,0 @@
name: nightly
# Builds and deploys the staging environment from main every night.
# Runs on the self-hosted runner using Docker-out-of-Docker (the docker
# socket is mounted in), so `docker compose build` produces images on
# the host daemon and `docker compose up` consumes them directly — no
# registry hop.
#
# Operational assumptions (see docs/DEPLOYMENT.md §3 for the full setup):
#
# 1. Single-tenant self-hosted runner. The "Write staging env file" step
# writes every secret to .env.staging on the runner filesystem; the
# `if: always()` cleanup step removes it. A multi-tenant runner
# would need to switch to docker compose --env-file <(stdin) instead.
#
# 2. Host docker layer cache is authoritative. There is no
# actions/cache; we rely on the host daemon to keep Maven and npm
# layers warm between runs. A `docker system prune` on the host
# will cause the next nightly build to be cold (510 min slower).
#
# Staging environment isolation:
# - project name: archiv-staging
# - host ports: backend 8081, frontend 3001
# - profile: staging (starts mailpit instead of a real SMTP relay)
#
# Required Gitea secrets:
# STAGING_POSTGRES_PASSWORD
# STAGING_MINIO_PASSWORD
# STAGING_MINIO_APP_PASSWORD
# STAGING_OCR_TRAINING_TOKEN
# STAGING_APP_ADMIN_USERNAME
# STAGING_APP_ADMIN_PASSWORD
# GRAFANA_ADMIN_PASSWORD
# GRAFANA_DB_PASSWORD (read-only grafana_reader DB role, issue #651)
# GLITCHTIP_SECRET_KEY
# SENTRY_DSN (set after GlitchTip first-run; empty = Sentry disabled)
on:
schedule:
- cron: "0 2 * * *"
workflow_dispatch:
env:
# Ensures the backend Dockerfile's `RUN --mount=type=cache` lines are
# honoured (Maven cache survives between runs).
DOCKER_BUILDKIT: "1"
jobs:
deploy-staging:
# `ubuntu-latest` matches our self-hosted runner's advertised label
# (the runner has labels: ubuntu-latest / ubuntu-24.04 / ubuntu-22.04).
# `self-hosted` would never match — no runner advertises it — so the
# job parks in the queue forever. ADR-011's "single-tenant" promise
# is at the repo level; sharing this runner between CI and deploys
# for the same repo is within that boundary.
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Write staging env file
run: |
cat > .env.staging <<EOF
TAG=nightly
PORT_BACKEND=8081
PORT_FRONTEND=3001
APP_DOMAIN=staging.raddatz.cloud
POSTGRES_PASSWORD=${{ secrets.STAGING_POSTGRES_PASSWORD }}
MINIO_PASSWORD=${{ secrets.STAGING_MINIO_PASSWORD }}
MINIO_APP_PASSWORD=${{ secrets.STAGING_MINIO_APP_PASSWORD }}
OCR_TRAINING_TOKEN=${{ secrets.STAGING_OCR_TRAINING_TOKEN }}
APP_ADMIN_USERNAME=${{ secrets.STAGING_APP_ADMIN_USERNAME }}
APP_ADMIN_PASSWORD=${{ secrets.STAGING_APP_ADMIN_PASSWORD }}
MAIL_HOST=mailpit
MAIL_PORT=1025
MAIL_USERNAME=
MAIL_PASSWORD=
MAIL_SMTP_AUTH=false
MAIL_STARTTLS_ENABLE=false
APP_MAIL_FROM=noreply@staging.raddatz.cloud
IMPORT_HOST_DIR=/srv/familienarchiv-staging/import
POSTGRES_USER=archiv
SENTRY_DSN=${{ secrets.SENTRY_DSN }}
VITE_SENTRY_DSN=${{ secrets.VITE_SENTRY_DSN }}
GRAFANA_DB_PASSWORD=${{ secrets.GRAFANA_DB_PASSWORD }}
EOF
- name: Verify backend /import:ro mount is wired
# Regression guard for #526: the /admin/system mass-import card
# only works when the backend service mounts the host import
# payload at /import (read-only). If a future "compose cleanup"
# PR drops the volumes block, mass import silently breaks again.
# `compose config` renders both shorthand and longform mounts as
# `target: /import` + `read_only: true`, so we assert against
# the rendered form rather than the raw source YAML.
run: |
set -e
docker compose \
-f docker-compose.prod.yml \
-p archiv-staging \
--env-file .env.staging \
--profile staging \
config > /tmp/compose-rendered.yml
grep -q '^[[:space:]]*target: /import$' /tmp/compose-rendered.yml \
|| { echo "::error::backend is missing the /import bind mount (see #526)"; exit 1; }
grep -A2 '^[[:space:]]*target: /import$' /tmp/compose-rendered.yml \
| grep -q 'read_only: true' \
|| { echo "::error::backend /import mount is not read-only (see #526)"; exit 1; }
- name: Build images
# `--pull` forces re-fetching pinned base images so a CVE
# re-publication of the same tag (e.g. node:20.19.0-alpine3.21,
# postgres:16-alpine) is picked up instead of being served
# from the host's stale Docker layer cache.
run: |
docker compose \
-f docker-compose.prod.yml \
-p archiv-staging \
--env-file .env.staging \
--profile staging \
build --pull
- name: Deploy staging
run: |
docker compose \
-f docker-compose.prod.yml \
-p archiv-staging \
--env-file .env.staging \
--profile staging \
up -d --wait --remove-orphans
- name: Deploy observability configs
# Copies the compose file and config tree from the workspace checkout
# into /opt/familienarchiv/ — the permanent location that persists
# between CI runs. Containers started in the next step bind-mount
# from there, so a future workspace wipe cannot corrupt a running
# config file.
#
# obs-secrets.env is written fresh from Gitea secrets on every run so
# Gitea is always the single source of truth for secret rotation.
# Non-secret config lives in infra/observability/obs.env (tracked in git).
run: |
rm -rf /opt/familienarchiv/infra/observability
mkdir -p /opt/familienarchiv/infra/observability
cp -r infra/observability/. /opt/familienarchiv/infra/observability/
cp docker-compose.observability.yml /opt/familienarchiv/
cat > /opt/familienarchiv/obs-secrets.env <<'EOF'
GRAFANA_ADMIN_PASSWORD=${{ secrets.GRAFANA_ADMIN_PASSWORD }}
GRAFANA_DB_PASSWORD=${{ secrets.GRAFANA_DB_PASSWORD }}
GLITCHTIP_SECRET_KEY=${{ secrets.GLITCHTIP_SECRET_KEY }}
POSTGRES_PASSWORD=${{ secrets.STAGING_POSTGRES_PASSWORD }}
POSTGRES_HOST=archiv-staging-db-1
EOF
# Note: POSTGRES_HOST is derived from the Compose project name (archiv-staging)
# and service name (db). A project rename requires updating this value.
chmod 600 /opt/familienarchiv/obs-secrets.env
- name: Validate observability compose config
# Dry-run: resolves all variable substitutions and reports any missing
# required keys before containers start. Catches undefined variables and
# YAML errors in config files updated by the previous step.
# --env-file order: obs.env first (git-tracked defaults), obs-secrets.env
# second (CI-written secrets). Later files win on duplicate keys, so
# obs-secrets.env overrides POSTGRES_HOST set in obs.env.
run: |
docker compose \
-f /opt/familienarchiv/docker-compose.observability.yml \
--env-file /opt/familienarchiv/infra/observability/obs.env \
--env-file /opt/familienarchiv/obs-secrets.env \
config --quiet
- name: Start observability stack
# Runs with absolute paths so bind mounts resolve to stable host paths
# that survive workspace wipes between nightly runs (see ADR-016).
# Non-secret config from obs.env (git-tracked); secrets from obs-secrets.env
# (written fresh from Gitea secrets above). --env-file order: obs.env first,
# obs-secrets.env second — later file wins on duplicate keys.
run: |
docker compose \
-f /opt/familienarchiv/docker-compose.observability.yml \
--env-file /opt/familienarchiv/infra/observability/obs.env \
--env-file /opt/familienarchiv/obs-secrets.env \
up -d --wait --remove-orphans
- name: Assert observability stack health
# docker compose up --wait covers services WITH healthcheck directives only.
# obs-promtail, obs-cadvisor, obs-node-exporter, and obs-glitchtip-worker have
# no healthcheck — they are considered "started" as soon as the process runs.
# This step explicitly asserts the five healthchecked critical services are
# healthy before the smoke test proceeds.
run: |
set -e
unhealthy=""
for svc in obs-loki obs-prometheus obs-grafana obs-tempo obs-glitchtip; do
status=$(docker inspect "$svc" --format '{{.State.Health.Status}}' 2>/dev/null || echo "missing")
if [ "$status" != "healthy" ]; then
echo "::error::$svc is not healthy (status: $status)"
unhealthy="$unhealthy $svc"
fi
done
[ -z "$unhealthy" ] || exit 1
echo "All critical observability services are healthy"
- name: Reload Caddy
# Apply any committed Caddyfile changes before smoke-testing the
# public surface. Without this step, a Caddyfile edit lands in the
# repo but Caddy keeps serving the previous config until someone
# reloads it manually — the smoke test would then catch a stale
# header or a still-proxied /actuator route rather than confirming
# the current config is live.
#
# The runner executes job steps inside Docker containers (DooD).
# `systemctl` is not present in container images and cannot reach
# the host's systemd directly. We use the Docker socket (mounted
# into every job container via runner-config.yaml) to spin up a
# privileged sibling container in the host PID namespace; nsenter
# then enters the host's namespaces so systemctl talks to the real
# host systemd daemon. No sudoers entry is required — the Docker
# socket already grants root-equivalent host access.
#
# Alpine is used: ~5 MB vs ~70 MB for ubuntu, no unnecessary
# tooling, and the digest is pinned so any upstream change requires
# an explicit bump PR. util-linux (which ships nsenter) is installed
# at run time; apk add takes ~1 s on the warm VPS cache.
#
# `reload` not `restart`: reload sends SIGHUP so Caddy re-reads its
# config in-process without dropping TLS connections. `restart`
# would briefly stop the service, losing in-flight requests.
#
# If Caddy is not running this step fails fast before the smoke test
# issues a misleading "port 443 refused" error.
run: |
docker run --rm --privileged --pid=host \
alpine:3.21@sha256:48b0309ca019d89d40f670aa1bc06e426dc0931948452e8491e3d65087abc07d \
sh -c 'apk add --no-cache util-linux -q && nsenter -t 1 -m -u -n -p -i -- /bin/systemctl reload caddy'
- name: Smoke test deployed environment
# Healthchecks confirm containers are healthy; they do NOT confirm the
# public surface works. This step catches: Caddy not reloaded, HSTS
# header dropped, /actuator block bypassed.
#
# --resolve pins staging.raddatz.cloud to the Docker bridge gateway IP
# (the host) so we do NOT depend on hairpin NAT on the host router.
# 127.0.0.1 cannot be used: job containers run in bridge network mode
# (runner-config.yaml), so 127.0.0.1 is the container's loopback, not
# the host's. The bridge gateway IS the host; Caddy binds 0.0.0.0:443
# and is therefore reachable from the container via that IP.
# SNI still uses the public hostname so the TLS cert validates correctly.
#
# Gateway detection reads /proc/net/route (always present, no package
# required) instead of `ip route` to avoid a dependency on iproute2.
# Field $2=="00000000" is the default route; field $3 is the gateway as
# a little-endian 32-bit hex value which awk decodes to dotted-decimal.
run: |
set -e
HOST="staging.raddatz.cloud"
URL="https://$HOST"
HOST_IP=$(awk 'NR>1 && $2=="00000000"{h=$3;printf "%d.%d.%d.%d\n",strtonum("0x"substr(h,7,2)),strtonum("0x"substr(h,5,2)),strtonum("0x"substr(h,3,2)),strtonum("0x"substr(h,1,2));exit}' /proc/net/route)
[ -n "$HOST_IP" ] || { echo "ERROR: could not detect Docker bridge gateway via /proc/net/route"; exit 1; }
RESOLVE=(--resolve "$HOST:443:$HOST_IP")
echo "Smoke test: $URL (pinned to $HOST_IP via bridge gateway)"
curl -fsS "${RESOLVE[@]}" --max-time 10 "$URL/login" -o /dev/null
# Pin the preload-list-eligible HSTS value, not just header presence:
# a degraded `max-age=1` or a dropped `includeSubDomains; preload` must
# fail this check rather than pass it silently.
curl -fsS "${RESOLVE[@]}" --max-time 10 -I "$URL/" \
| grep -Eqi 'strict-transport-security:[[:space:]]*max-age=31536000.*includeSubDomains.*preload'
# Permissions-Policy denies APIs the app does not use (camera,
# microphone, geolocation). A regression that loosens or drops the
# header now fails the smoke step.
curl -fsS "${RESOLVE[@]}" --max-time 10 -I "$URL/" \
| grep -Eqi 'permissions-policy:[[:space:]]*camera=\(\),[[:space:]]*microphone=\(\),[[:space:]]*geolocation=\(\)'
status=$(curl -s "${RESOLVE[@]}" -o /dev/null -w "%{http_code}" --max-time 10 "$URL/actuator/health")
[ "$status" = "404" ] || { echo "expected 404 from /actuator/health, got $status"; exit 1; }
echo "All smoke checks passed"
- name: Cleanup env file
# LOAD-BEARING: `if: always()` is the linchpin of the ADR-011
# single-tenant runner trust model. Every secret in .env.staging
# is plain text on the runner filesystem until this step runs.
# If a future refactor drops `if: always()`, a failed deploy
# leaves the env-file behind. Do not remove this conditional
# without first re-evaluating ADR-011.
if: always()
run: rm -f .env.staging

View File

@@ -1,223 +0,0 @@
name: release
# Builds and deploys the production environment on `v*` tag push.
# Runs on the self-hosted runner via Docker-out-of-Docker; images are
# tagged with the actual git tag (e.g. v1.0.0) so rollback is
# `TAG=<previous> docker compose -f docker-compose.prod.yml -p archiv-production up -d --wait`
#
# Operational assumptions (see docs/DEPLOYMENT.md §3 for the full setup):
#
# 1. Single-tenant self-hosted runner. The "Write production env file"
# step writes every secret to .env.production on the runner
# filesystem; the `if: always()` cleanup step removes it. A
# multi-tenant runner would need to switch to
# `docker compose --env-file <(stdin)` instead.
#
# 2. Host docker layer cache is authoritative. There is no
# actions/cache; we rely on the host daemon to keep Maven and npm
# layers warm between runs. A `docker system prune` on the host
# will cause the next release build to be cold (510 min slower).
#
# Production environment:
# - project name: archiv-production
# - host ports: backend 8080, frontend 3000
# - profile: (none) — mailpit is excluded; real SMTP relay is used
#
# Required Gitea secrets:
# PROD_POSTGRES_PASSWORD
# PROD_MINIO_PASSWORD
# PROD_MINIO_APP_PASSWORD
# PROD_OCR_TRAINING_TOKEN
# PROD_APP_ADMIN_USERNAME (CRITICAL: see docs/DEPLOYMENT.md)
# PROD_APP_ADMIN_PASSWORD (CRITICAL: locked in on first deploy)
# MAIL_HOST
# MAIL_PORT
# MAIL_USERNAME
# MAIL_PASSWORD
# GRAFANA_ADMIN_PASSWORD
# GRAFANA_DB_PASSWORD (read-only grafana_reader DB role, issue #651)
# GLITCHTIP_SECRET_KEY
# SENTRY_DSN (set after GlitchTip first-run; empty = Sentry disabled)
on:
push:
tags:
- "v*"
env:
DOCKER_BUILDKIT: "1"
jobs:
deploy-production:
# See nightly.yml — same rationale: `ubuntu-latest` matches the
# advertised label of our single-tenant self-hosted runner.
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Write production env file
run: |
cat > .env.production <<EOF
TAG=${{ gitea.ref_name }}
PORT_BACKEND=8080
PORT_FRONTEND=3000
APP_DOMAIN=archiv.raddatz.cloud
POSTGRES_PASSWORD=${{ secrets.PROD_POSTGRES_PASSWORD }}
MINIO_PASSWORD=${{ secrets.PROD_MINIO_PASSWORD }}
MINIO_APP_PASSWORD=${{ secrets.PROD_MINIO_APP_PASSWORD }}
OCR_TRAINING_TOKEN=${{ secrets.PROD_OCR_TRAINING_TOKEN }}
APP_ADMIN_USERNAME=${{ secrets.PROD_APP_ADMIN_USERNAME }}
APP_ADMIN_PASSWORD=${{ secrets.PROD_APP_ADMIN_PASSWORD }}
MAIL_HOST=${{ secrets.MAIL_HOST }}
MAIL_PORT=${{ secrets.MAIL_PORT }}
MAIL_USERNAME=${{ secrets.MAIL_USERNAME }}
MAIL_PASSWORD=${{ secrets.MAIL_PASSWORD }}
MAIL_SMTP_AUTH=true
MAIL_STARTTLS_ENABLE=true
APP_MAIL_FROM=noreply@raddatz.cloud
IMPORT_HOST_DIR=/srv/familienarchiv-production/import
POSTGRES_USER=archiv
SENTRY_DSN=${{ secrets.SENTRY_DSN }}
GRAFANA_DB_PASSWORD=${{ secrets.GRAFANA_DB_PASSWORD }}
EOF
- name: Build images
# `--pull` forces re-fetching pinned base images so a CVE
# re-publication of the same tag is picked up rather than served
# from the host's stale Docker layer cache.
run: |
docker compose \
-f docker-compose.prod.yml \
-p archiv-production \
--env-file .env.production \
build --pull
- name: Deploy production
run: |
docker compose \
-f docker-compose.prod.yml \
-p archiv-production \
--env-file .env.production \
up -d --wait --remove-orphans
- name: Deploy observability configs
# Mirrors the nightly approach: copies obs compose file and config tree
# to /opt/familienarchiv/ (permanent path, survives workspace wipes — ADR-016),
# then writes obs-secrets.env fresh from Gitea secrets.
# Non-secret config lives in infra/observability/obs.env (tracked in git).
run: |
rm -rf /opt/familienarchiv/infra/observability
mkdir -p /opt/familienarchiv/infra/observability
cp -r infra/observability/. /opt/familienarchiv/infra/observability/
cp docker-compose.observability.yml /opt/familienarchiv/
cat > /opt/familienarchiv/obs-secrets.env <<'EOF'
GRAFANA_ADMIN_PASSWORD=${{ secrets.GRAFANA_ADMIN_PASSWORD }}
GRAFANA_DB_PASSWORD=${{ secrets.GRAFANA_DB_PASSWORD }}
GLITCHTIP_SECRET_KEY=${{ secrets.GLITCHTIP_SECRET_KEY }}
POSTGRES_PASSWORD=${{ secrets.PROD_POSTGRES_PASSWORD }}
POSTGRES_HOST=archiv-production-db-1
EOF
# Note: POSTGRES_HOST is derived from the Compose project name (archiv-production)
# and service name (db). A project rename requires updating this value.
chmod 600 /opt/familienarchiv/obs-secrets.env
- name: Validate observability compose config
# Dry-run: resolves all variable substitutions and reports any missing
# required keys before containers start. Catches undefined variables and
# YAML errors in config files updated by the previous step.
# --env-file order: obs.env first (git-tracked defaults), obs-secrets.env
# second (CI-written secrets). Later files win on duplicate keys, so
# obs-secrets.env overrides POSTGRES_HOST set in obs.env.
# Keep in sync with the equivalent step in nightly.yml (#603).
run: |
docker compose \
-f /opt/familienarchiv/docker-compose.observability.yml \
--env-file /opt/familienarchiv/infra/observability/obs.env \
--env-file /opt/familienarchiv/obs-secrets.env \
config --quiet
- name: Start observability stack
# Runs with absolute paths so bind mounts resolve to stable host paths
# that survive workspace wipes between runs (see ADR-016).
# Non-secret config from obs.env (git-tracked); secrets from obs-secrets.env
# (written fresh from Gitea secrets above). --env-file order: obs.env first,
# obs-secrets.env second — later file wins on duplicate keys.
# Keep in sync with the equivalent step in nightly.yml (#603).
run: |
docker compose \
-f /opt/familienarchiv/docker-compose.observability.yml \
--env-file /opt/familienarchiv/infra/observability/obs.env \
--env-file /opt/familienarchiv/obs-secrets.env \
up -d --wait --remove-orphans
- name: Assert observability stack health
# docker compose up --wait covers services WITH healthcheck directives only.
# obs-promtail, obs-cadvisor, obs-node-exporter, and obs-glitchtip-worker have
# no healthcheck — they are considered "started" as soon as the process runs.
# This step explicitly asserts the five healthchecked critical services are
# healthy before the smoke test proceeds.
# Keep in sync with the equivalent step in nightly.yml (#603).
run: |
set -e
unhealthy=""
for svc in obs-loki obs-prometheus obs-grafana obs-tempo obs-glitchtip; do
status=$(docker inspect "$svc" --format '{{.State.Health.Status}}' 2>/dev/null || echo "missing")
if [ "$status" != "healthy" ]; then
echo "::error::$svc is not healthy (status: $status)"
unhealthy="$unhealthy $svc"
fi
done
[ -z "$unhealthy" ] || exit 1
echo "All critical observability services are healthy"
- name: Reload Caddy
# See nightly.yml — same rationale and mechanism: DooD job containers
# cannot call systemctl directly; nsenter via a privileged sibling
# container reaches the host systemd. Must run after deploy (so the
# latest Caddyfile is on disk) and before the smoke test (so the
# public surface reflects the current config). Alpine with pinned
# digest; reload not restart — see nightly.yml for full rationale.
run: |
docker run --rm --privileged --pid=host \
alpine:3.21@sha256:48b0309ca019d89d40f670aa1bc06e426dc0931948452e8491e3d65087abc07d \
sh -c 'apk add --no-cache util-linux -q && nsenter -t 1 -m -u -n -p -i -- /bin/systemctl reload caddy'
- name: Smoke test deployed environment
# See nightly.yml — same three checks, against the prod vhost.
# --resolve stored as a Bash array so "${RESOLVE[@]}" expands to two
# separate arguments; a quoted string would pass the flag and its value
# as one token and curl would reject it as an unknown option.
# Gateway detection via /proc/net/route — no iproute2 dependency.
# See nightly.yml for the full network topology explanation.
run: |
set -e
HOST="archiv.raddatz.cloud"
URL="https://$HOST"
HOST_IP=$(awk 'NR>1 && $2=="00000000"{h=$3;printf "%d.%d.%d.%d\n",strtonum("0x"substr(h,7,2)),strtonum("0x"substr(h,5,2)),strtonum("0x"substr(h,3,2)),strtonum("0x"substr(h,1,2));exit}' /proc/net/route)
[ -n "$HOST_IP" ] || { echo "ERROR: could not detect Docker bridge gateway via /proc/net/route"; exit 1; }
RESOLVE=(--resolve "$HOST:443:$HOST_IP")
echo "Smoke test: $URL (pinned to $HOST_IP via bridge gateway)"
curl -fsS "${RESOLVE[@]}" --max-time 10 "$URL/login" -o /dev/null
# Pin the preload-list-eligible HSTS value, not just header presence:
# a degraded `max-age=1` or a dropped `includeSubDomains; preload` must
# fail this check rather than pass it silently.
curl -fsS "${RESOLVE[@]}" --max-time 10 -I "$URL/" \
| grep -Eqi 'strict-transport-security:[[:space:]]*max-age=31536000.*includeSubDomains.*preload'
# Permissions-Policy denies APIs the app does not use (camera,
# microphone, geolocation). A regression that loosens or drops the
# header now fails the smoke step.
curl -fsS "${RESOLVE[@]}" --max-time 10 -I "$URL/" \
| grep -Eqi 'permissions-policy:[[:space:]]*camera=\(\),[[:space:]]*microphone=\(\),[[:space:]]*geolocation=\(\)'
status=$(curl -s "${RESOLVE[@]}" -o /dev/null -w "%{http_code}" --max-time 10 "$URL/actuator/health")
[ "$status" = "404" ] || { echo "expected 404 from /actuator/health, got $status"; exit 1; }
echo "All smoke checks passed"
- name: Cleanup env file
# LOAD-BEARING: `if: always()` is the linchpin of the ADR-011
# single-tenant runner trust model. Every secret in
# .env.production is plain text on the runner filesystem until
# this step runs. If a future refactor drops `if: always()`, a
# failed deploy leaves the env-file behind. Do not remove this
# conditional without first re-evaluating ADR-011.
if: always()
run: rm -f .env.production

10
.gitignore vendored
View File

@@ -18,15 +18,5 @@ scripts/large-data.sql
.claude/worktrees/ .claude/worktrees/
.claude/scheduled_tasks.lock .claude/scheduled_tasks.lock
# Run artifacts from verification tooling
proofshot-artifacts/
# Root-level Node.js tooling artifacts
node_modules/
# Repo uses npm; yarn.lock is ignored to avoid double-lockfile drift. # Repo uses npm; yarn.lock is ignored to avoid double-lockfile drift.
frontend/yarn.lock frontend/yarn.lock
**/.venv/
**/__pycache__/
*.pyc

View File

@@ -1,54 +0,0 @@
# Semgrep security rules for Familienarchiv backend.
# These rules catch the absence of XXE protection on XML parser factories.
# CWE-611: Improper Restriction of XML External Entity Reference.
# Run: semgrep --config .semgrep/security.yml --error backend/src/
rules:
# DocumentBuilderFactory without XXE hardening.
# All call sites must call setFeature("…disallow-doctype-decl", true) before use.
- id: dbf-xxe-default
patterns:
- pattern: $X = DocumentBuilderFactory.newInstance();
- pattern-not-inside: |
...
$X.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
...
message: >
DocumentBuilderFactory without XXE protection (CWE-611).
Call XxeSafeXmlParser.hardenedFactory() instead of DocumentBuilderFactory.newInstance().
See: https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html
languages: [java]
severity: ERROR
# SAXParserFactory without XXE hardening.
- id: sax-xxe-default
patterns:
- pattern: $X = SAXParserFactory.newInstance();
- pattern-not-inside: |
...
$X.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
...
message: >
SAXParserFactory without XXE protection (CWE-611).
Set disallow-doctype-decl=true, external-general-entities=false, external-parameter-entities=false,
and load-external-dtd=false before use. Follow the pattern in XxeSafeXmlParser.hardenedFactory().
See: https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html
languages: [java]
severity: ERROR
# XMLInputFactory without XXE hardening (StAX parser).
- id: stax-xxe-default
patterns:
- pattern: $X = XMLInputFactory.newInstance();
- pattern-not-inside: |
...
$X.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
...
message: >
XMLInputFactory without XXE protection (CWE-611).
Set IS_SUPPORTING_EXTERNAL_ENTITIES=false and SUPPORT_DTD=false before use.
Follow the pattern in XxeSafeXmlParser.hardenedFactory().
See: https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html
languages: [java]
severity: ERROR

View File

@@ -1,6 +1,4 @@
{ {
"java.configuration.updateBuildConfiguration": "interactive", "java.configuration.updateBuildConfiguration": "interactive",
"java.compile.nullAnalysis.mode": "automatic", "java.compile.nullAnalysis.mode": "automatic"
"plantuml.render": "PlantUMLServer",
"plantuml.server": "http://heim-nas:8500"
} }

257
CLAUDE.md
View File

@@ -1,11 +1,7 @@
# CLAUDE.md # CLAUDE.md
> For a human-readable project overview, see [README.md](./README.md).
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
> For a human-readable project overview, see [README.md](./README.md).
## Project Overview ## Project Overview
**Familienarchiv** is a family document archival system — a full-stack web app for digitizing, organizing, and searching family documents. Key features: file uploads (stored in MinIO/S3), metadata management, Excel/ODS batch import, full-text search, conversation threads between family members, and role-based access control. **Familienarchiv** is a family document archival system — a full-stack web app for digitizing, organizing, and searching family documents. Key features: file uploads (stored in MinIO/S3), metadata management, Excel/ODS batch import, full-text search, conversation threads between family members, and role-based access control.
@@ -20,8 +16,6 @@ See [CODESTYLE.md](./CODESTYLE.md) for coding standards: Clean Code, DRY/KISS tr
## Stack ## Stack
→ See [README.md §Tech Stack](./README.md#tech-stack)
- **Backend**: Spring Boot 4.0 (Java 21, Maven, Jetty, JPA/Hibernate, Flyway, Spring Security, Spring Session JDBC) - **Backend**: Spring Boot 4.0 (Java 21, Maven, Jetty, JPA/Hibernate, Flyway, Spring Security, Spring Session JDBC)
- **Frontend**: SvelteKit 2 with Svelte 5, TypeScript, Tailwind CSS 4, Paraglide.js (i18n: de/en/es) - **Frontend**: SvelteKit 2 with Svelte 5, TypeScript, Tailwind CSS 4, Paraglide.js (i18n: de/en/es)
- **Database**: PostgreSQL 16 - **Database**: PostgreSQL 16
@@ -31,13 +25,12 @@ See [CODESTYLE.md](./CODESTYLE.md) for coding standards: Clean Code, DRY/KISS tr
## Common Commands ## Common Commands
### Running the Full Stack ### Running the Full Stack
```bash ```bash
# From repo root — starts PostgreSQL, MinIO, and Spring Boot backend
docker-compose up -d docker-compose up -d
``` ```
### Backend (Spring Boot) ### Backend (Spring Boot)
```bash ```bash
cd backend cd backend
@@ -49,12 +42,11 @@ cd backend
``` ```
### Frontend (SvelteKit) ### Frontend (SvelteKit)
```bash ```bash
cd frontend cd frontend
npm install npm install
npm run dev # Dev server (port 5173) npm run dev # Dev server (port 3000)
npm run build # Production build npm run build # Production build
npm run preview # Preview production build npm run preview # Preview production build
@@ -72,12 +64,11 @@ npm run generate:api # Regenerate TypeScript API types from OpenAPI spec
### Package Structure ### Package Structure
<!-- TODO: rewrite post-REFACTOR-1 — see Epic 4 --> Package-by-domain: each domain owns its controller, service, repository, entities, and DTOs.
``` ```
backend/src/main/java/org/raddatz/familienarchiv/ backend/src/main/java/org/raddatz/familienarchiv/
├── audit/ Audit logging ├── audit/ Audit logging
├── auth/ AuthService, AuthSessionController, LoginRequest, LoginRateLimiter, RateLimitProperties (Spring Session JDBC)
├── config/ Infrastructure config (Minio, Async, Web) ├── config/ Infrastructure config (Minio, Async, Web)
├── dashboard/ Dashboard analytics + StatsController/StatsService ├── dashboard/ Dashboard analytics + StatsController/StatsService
├── document/ Document domain (entities, controller, service, repository, DTOs) ├── document/ Document domain (entities, controller, service, repository, DTOs)
@@ -87,26 +78,32 @@ backend/src/main/java/org/raddatz/familienarchiv/
├── exception/ DomainException, ErrorCode, GlobalExceptionHandler ├── exception/ DomainException, ErrorCode, GlobalExceptionHandler
├── filestorage/ FileService (S3/MinIO) ├── filestorage/ FileService (S3/MinIO)
├── geschichte/ Geschichte (story) domain ├── geschichte/ Geschichte (story) domain
├── importing/ CanonicalImportOrchestrator + four loaders (TagTree/PersonRegister/PersonTree/Document) + CanonicalSheetReader ├── importing/ MassImportService
├── notification/ Notification domain + SseEmitterRegistry ├── notification/ Notification domain + SseEmitterRegistry
├── ocr/ OCR domain — OcrService, OcrBatchService, training ├── ocr/ OCR domain — OcrService, OcrBatchService, training
├── person/ Person domain ├── person/ Person domain
│ └── relationship/ PersonRelationship sub-domain │ └── relationship/ PersonRelationship sub-domain
├── security/ SecurityConfig, Permission, @RequirePermission, PermissionAspect ├── security/ SecurityConfig, Permission, @RequirePermission, PermissionAspect
├── tag/ Tag domain ├── tag/ Tag domain
└── user/ User domain — AppUser, UserGroup, UserService └── user/ User domain — AppUser, UserGroup, UserService, auth controllers
``` ```
### Layering Rules ### Layering Rules (strictly enforced)
→ See [docs/ARCHITECTURE.md §Layering rule](./docs/ARCHITECTURE.md#layering-rule) ```
Controller → Service → Repository → DB
```
**LLM reminder:** controllers never call repositories directly; services never reach into another domain's repository — always call the other domain's service instead. - **Controllers** never inject or call repositories directly.
- **Services** never reach into another domain's repository. Call the other domain's service instead.
-`DocumentService``PersonService.getById()``PersonRepository`
-`DocumentService``PersonRepository` directly
- This keeps domain boundaries clear and business logic testable in isolation.
### Domain Model ### Domain Model
| Entity | Table | Key relationships | | Entity | Table | Key relationships |
| ----------- | ------------- | ------------------------------------------------------------------------------------- | |---|---|---|
| `Document` | `documents` | ManyToOne `sender` (Person), ManyToMany `receivers` (Person), ManyToMany `tags` (Tag) | | `Document` | `documents` | ManyToOne `sender` (Person), ManyToMany `receivers` (Person), ManyToMany `tags` (Tag) |
| `Person` | `persons` | Referenced by documents as sender/receiver | | `Person` | `persons` | Referenced by documents as sender/receiver |
| `Tag` | `tag` | ManyToMany with documents via `document_tags` | | `Tag` | `tag` | ManyToMany with documents via `document_tags` |
@@ -121,7 +118,6 @@ backend/src/main/java/org/raddatz/familienarchiv/
### Entity Code Style ### Entity Code Style
All entities use these Lombok annotations: All entities use these Lombok annotations:
```java ```java
@Entity @Entity
@Table(name = "table_name") @Table(name = "table_name")
@@ -150,29 +146,65 @@ Services are annotated with `@Service`, `@RequiredArgsConstructor`, and optional
- Read methods are not annotated (default non-transactional is fine). - Read methods are not annotated (default non-transactional is fine).
- Each service owns its domain's repository. Cross-domain data access goes through the other domain's service. - Each service owns its domain's repository. Cross-domain data access goes through the other domain's service.
**Existing services:**
| Service | Responsibility |
|---|---|
| `DocumentService` | Document CRUD, search, tag cascade delete |
| `PersonService` | Person CRUD, find-or-create by alias |
| `TagService` | Tag find/create/update/delete |
| `UserService` | User and group CRUD |
| `FileService` | S3/MinIO upload and download |
| `MassImportService` | Async ODS/Excel import; delegates to PersonService and TagService |
### DTOs ### DTOs
Input DTOs live flat in the domain package. Response types are the model entities themselves (no response DTOs). Input DTOs live in `dto/`. Response types are the model entities themselves (no response DTOs).
- `@Schema(requiredMode = REQUIRED)` on every field the backend always populates — drives TypeScript generation. - `DocumentUpdateDTO` — used for both create and update (all fields optional)
- `CreateUserRequest` — user creation
- `GroupDTO` — group create/update
### Error Handling ### Error Handling
→ See [CONTRIBUTING.md §Error handling](./CONTRIBUTING.md#error-handling) Use `DomainException` for all domain errors. Never throw raw exceptions from service methods.
**LLM reminder:** use `DomainException.notFound/forbidden/conflict/internal()` from service methods — never throw raw exceptions. When adding a new `ErrorCode`: (1) add to `ErrorCode.java`, (2) add to `ErrorCode` type in `frontend/src/lib/shared/errors.ts`, (3) add a `case` in `getErrorMessage()`, (4) add i18n keys in `messages/{de,en,es}.json`. Valid error codes include: `TOO_MANY_LOGIN_ATTEMPTS` (returned by `LoginRateLimiter` as HTTP 429 when a brute-force threshold is exceeded). ```java
// Static factories match common HTTP status codes:
DomainException.notFound(ErrorCode.DOCUMENT_NOT_FOUND, "Document not found: " + id)
DomainException.forbidden("Access denied")
DomainException.conflict(ErrorCode.IMPORT_ALREADY_RUNNING, "Already running")
DomainException.internal(ErrorCode.FILE_UPLOAD_FAILED, "Upload failed: " + e.getMessage())
```
`ErrorCode` is an enum in `exception/ErrorCode.java`. When adding a new error case, add the value there **and** mirror it in the frontend's `src/lib/errors.ts` + add a Paraglide translation key.
For simple validation in controllers (not domain logic), `ResponseStatusException` is acceptable:
```java
throw new ResponseStatusException(HttpStatus.BAD_REQUEST, "firstName is required");
```
### Security / Permissions ### Security / Permissions
→ See [docs/ARCHITECTURE.md §Permission system](./docs/ARCHITECTURE.md#permission-system) Use `@RequirePermission` on controller methods (or the whole controller class):
**LLM reminder:** `@RequirePermission(Permission.WRITE_ALL)` is **required** on every `POST`, `PUT`, `PATCH`, `DELETE` endpoint — not optional. Do not mix with Spring Security's `@PreAuthorize`. Available permissions: `READ_ALL`, `WRITE_ALL`, `ADMIN`, `ADMIN_USER`, `ADMIN_TAG`, `ADMIN_PERMISSION`, `ANNOTATE_ALL`, `BLOG_WRITE`. ```java
@RequirePermission(Permission.WRITE_ALL)
public Document updateDocument(...) { ... }
```
Available permissions: `READ_ALL`, `WRITE_ALL`, `ADMIN`, `ADMIN_USER`, `ADMIN_TAG`, `ADMIN_PERMISSION`
`PermissionAspect` (AOP) checks the current user's `UserGroup.permissions` at runtime.
### OpenAPI / API Types ### OpenAPI / API Types
→ See [CONTRIBUTING.md §Walkthrough B — Add a new endpoint](./CONTRIBUTING.md#4-walkthrough-b--add-a-new-endpoint) SpringDoc generates the spec at `/v3/api-docs` (only accessible when running with `--spring.profiles.active=dev`).
**LLM reminder:** always run `npm run generate:api` in `frontend/` after any backend model or endpoint change — this is the most common cause of TypeScript type errors. When changing any model field or endpoint:
1. Rebuild the backend JAR with `-DskipTests`
2. Start it with `--spring.profiles.active=dev`
3. Run `npm run generate:api` in `frontend/`
--- ---
@@ -182,36 +214,48 @@ Input DTOs live flat in the domain package. Response types are the model entitie
``` ```
frontend/src/routes/ frontend/src/routes/
├── +layout.svelte / +layout.server.ts Global layout, auth cookie ├── +layout.svelte Global header (sticky), nav links, logout
├── +page.svelte / +page.server.ts Home / document search dashboard ├── +layout.server.ts Loads current user, injects auth cookie
├── +page.svelte Home / document search
├── +page.server.ts Load: search documents; no actions
├── documents/ ├── documents/
│ ├── [id]/ Document detail (view + file preview) │ ├── [id]/+page.svelte Document detail (view + file preview)
── [id]/edit/ Edit form (all metadata + file upload) ── [id]/edit/ Edit form (all metadata + file upload)
── new/ Upload form ── new/ Create form (same fields, empty)
│ └── bulk-edit/ Multi-document edit
├── persons/ ├── persons/
│ ├── [id]/ Person detail │ ├── +page.svelte Person list with search
│ ├── [id]/edit/ Person edit form │ ├── [id]/+page.svelte Person detail (inline edit + merge)
── new/ Create person form ── new/ Create person form
│ └── review/ Triage view — confirm/rename/merge/delete provisional persons ├── conversations/ Bilateral conversation timeline
├── briefwechsel/ Bilateral conversation timeline (Briefwechsel) ├── admin/ User + group + tag management
── aktivitaeten/ Unified activity feed (Chronik) ── login/ logout/ Auth pages
├── geschichten/ Stories — list, [id], [id]/edit, new
├── stammbaum/ Family tree (Stammbaum)
├── enrich/ Enrichment workflow — [id], done
├── admin/ User, group, tag, OCR, system management
├── hilfe/transkription/ Transcription help page
├── profile/ User profile settings
├── users/[id]/ Public user profile page
├── login/ logout/ register/
└── forgot-password/ reset-password/
``` ```
### API Client Pattern ### API Client Pattern
→ See [CONTRIBUTING.md §Frontend API client](./CONTRIBUTING.md#frontend-api-client) All server-side API calls use the typed client from `$lib/api.server.ts`:
**LLM reminder:** check `!result.response.ok` (not `result.error` — breaks when spec has no error responses defined); cast errors as `result.error as unknown as { code?: string }`; use `result.data!` after an ok check. ```typescript
const api = createApiClient(fetch);
const result = await api.GET('/api/persons/{id}', { params: { path: { id } } });
// Always check via response.ok, NOT result.error
if (!result.response.ok) {
const code = (result.error as unknown as { code?: string })?.code;
throw error(result.response.status, getErrorMessage(code));
}
return { person: result.data! };
```
Key rules:
- Use `!result.response.ok` for error checking (not `if (result.error)` — this breaks when the spec has no error responses defined)
- Cast errors as `result.error as unknown as { code?: string }` to extract the backend error code
- Use `result.data!` (non-null assertion) after an ok check — TypeScript knows it's present
For multipart/form-data endpoints (file uploads), bypass the typed client and use raw `fetch`:
```typescript
const res = await fetch(`${baseUrl}/api/documents`, { method: 'POST', body: formData });
```
### Form Actions Pattern ### Form Actions Pattern
@@ -220,90 +264,97 @@ frontend/src/routes/
export const actions = { export const actions = {
default: async ({ request, fetch }) => { default: async ({ request, fetch }) => {
const formData = await request.formData(); const formData = await request.formData();
const name = formData.get("name") as string; const name = formData.get('name') as string; // cast needed — FormData returns FormDataEntryValue
// ... // ...
return fail(400, { error: "message" }); // on error return fail(400, { error: 'message' }); // on error
throw redirect(303, "/target"); // on success throw redirect(303, '/target'); // on success
}, }
}; };
``` ```
### Date Handling ### Date Handling
→ See [CONTRIBUTING.md §Date handling](./CONTRIBUTING.md#date-handling) - **Forms**: German format `dd.mm.yyyy` with auto-dot insertion via `handleDateInput()`. A hidden `<input type="hidden" name="documentDate" value={dateIso}>` sends ISO format to the backend.
- **Display**: Always use `Intl.DateTimeFormat` with `T12:00:00` suffix to prevent UTC timezone off-by-one:
**LLM reminder:** always append `T12:00:00` when constructing `new Date()` from an ISO date string — prevents UTC timezone off-by-one errors. ```typescript
new Intl.DateTimeFormat('de-DE', { day: 'numeric', month: 'long', year: 'numeric' })
.format(new Date(doc.documentDate + 'T12:00:00'))
```
### UI Component Library ### UI Component Library
→ See per-domain READMEs: [`frontend/src/lib/person/README.md`](./frontend/src/lib/person/README.md), [`frontend/src/lib/tag/README.md`](./frontend/src/lib/tag/README.md), [`frontend/src/lib/document/README.md`](./frontend/src/lib/document/README.md), [`frontend/src/lib/shared/README.md`](./frontend/src/lib/shared/README.md) Custom components in `src/lib/components/`:
| Component | Props | Description |
|---|---|---|
| `PersonTypeahead` | `name`, `label`, `value`, `initialName`, `on:change` | Single-person selector with typeahead dropdown |
| `PersonMultiSelect` | `selectedPersons` (bind) | Chip-based multi-person selector |
| `TagInput` | `tags` (bind), `allowCreation?`, `on:change` | Tag chip input with typeahead |
### Styling Conventions (Tailwind CSS 4) ### Styling Conventions (Tailwind CSS 4)
Brand color tokens (defined in `layout.css`): Brand color utilities (defined in `layout.css`):
| Token / Utility | CSS variable | Usage | | Class | Value | Usage |
| ---------------- | ---------------- | ------------------------------------------------------- | |---|---|---|
| `brand-navy` | `--palette-navy` | Tailwind utility — buttons, headers, primary text | | `brand-navy` | `#002850` | Primary text, buttons, headers |
| `brand-mint` | `--palette-mint` | Tailwind utility — accents, hover underlines, icons | | `brand-mint` | `#A6DAD8` | Accents, hover underlines, icons |
| `--palette-sand` | `--palette-sand` | Palette constant only — use `bg-canvas` or `bg-surface` | | `brand-sand` | `#E4E2D7` | Page background, card borders |
Typography: Typography:
- `font-serif` (Merriweather) — body text, document titles, names
- `font-serif` (Tinos) — body text, document titles, names
- `font-sans` (Montserrat) — labels, metadata, UI chrome - `font-sans` (Montserrat) — labels, metadata, UI chrome
Card pattern for content sections: Card pattern for content sections:
```svelte ```svelte
<div class="rounded-sm border border-line bg-surface shadow-sm p-6"> <div class="bg-white shadow-sm border border-brand-sand rounded-sm p-6">
<h2 class="text-xs font-bold uppercase tracking-widest text-ink-3 mb-5">Section Title</h2> <h2 class="text-xs font-bold uppercase tracking-widest text-gray-400 mb-5">Section Title</h2>
<!-- content --> <!-- content -->
</div> </div>
``` ```
Back button pattern — use the shared `<BackButton>` component from `$lib/shared/primitives/BackButton.svelte`. Do not use a static `<a href>` for back navigation. Save bar pattern — use **sticky full-bleed** for long forms (edit document), **card-style with `mt-4`** for short forms (new person):
```svelte
<!-- Long forms: sticky, full-bleed -->
<div class="sticky bottom-0 z-10 -mx-4 px-6 py-4 bg-white border-t border-brand-sand shadow-[0_-2px_8px_rgba(0,0,0,0.06)] flex items-center justify-between">
<!-- Short forms: card, top margin -->
<div class="mt-4 flex items-center justify-between rounded-sm border border-brand-sand bg-white px-6 py-4 shadow-sm">
```
Back button pattern — use the shared `<BackButton>` component from `$lib/components/BackButton.svelte`:
```svelte
<script lang="ts">
import BackButton from '$lib/components/BackButton.svelte';
</script>
<BackButton />
```
The component calls `history.back()` so the user returns to wherever they came from. Label is always "Zurück" (no contextual suffix — destination is unknown). Touch target ≥ 44px and focus ring are built in. Do not use a static `<a href>` for back navigation.
Subtle action link (e.g. "new document/person"):
```svelte
<a href="/documents/new" class="inline-flex items-center gap-1 text-sm font-medium text-brand-navy/60 hover:text-brand-navy transition-colors">
<svg class="w-4 h-4" ...><!-- plus icon --></svg>
Neues Dokument
</a>
```
### Error Handling (Frontend) ### Error Handling (Frontend)
→ See [CONTRIBUTING.md §Error handling](./CONTRIBUTING.md#error-handling) `src/lib/errors.ts` mirrors the backend `ErrorCode` enum and maps codes to Paraglide translation keys. When adding a new `ErrorCode` on the backend:
1. Add it to `ErrorCode.java`
**LLM reminder:** when adding a new `ErrorCode`: (1) add to `ErrorCode.java`, (2) add to `ErrorCode` type in `frontend/src/lib/shared/errors.ts`, (3) add a `case` in `getErrorMessage()`, (4) add i18n keys in `messages/{de,en,es}.json`. Valid error codes include: `TOO_MANY_LOGIN_ATTEMPTS` (returned by `LoginRateLimiter` as HTTP 429 when a brute-force threshold is exceeded). 2. Add it to the `ErrorCode` type in `errors.ts`
3. Add a `case` in `getErrorMessage()`
4. Add the translation key in `messages/de.json`, `en.json`, `es.json`
--- ---
## Infrastructure ## Infrastructure
→ See [docs/DEPLOYMENT.md](./docs/DEPLOYMENT.md) The `docker-compose.yml` at the repo root orchestrates everything. A MinIO MC helper container runs at startup to create the `archive-documents` bucket. The backend container depends on both `db` and `minio` being healthy.
### Observability stack (separate compose file) Database migrations live in `backend/src/main/resources/db/migration/` (Flyway, SQL files named `V{n}__{description}.sql`).
Run via `docker-compose.observability.yml` — requires the main stack to be running first. Full setup procedure: [docs/DEPLOYMENT.md §4](./docs/DEPLOYMENT.md#4-logs--observability).
| Service | Container | Default Port | Purpose |
|---------|-----------|-------------|---------|
| Grafana | `obs-grafana` | 3003 | Metrics / logs / traces dashboard |
| Prometheus | `obs-prometheus` | 9090 (dev only — `127.0.0.1` bound) | Metrics store |
| Loki | `obs-loki` | — (internal) | Log store |
| Tempo | `obs-tempo` | — (internal) | Trace store |
| GlitchTip | `obs-glitchtip` | 3002 | Error tracking (Sentry-compatible) |
### Observability env vars
| Variable | Purpose |
|----------|---------|
| `PORT_GRAFANA` | Host port for Grafana UI (default: `3003`) |
| `PORT_GLITCHTIP` | Host port for GlitchTip UI (default: `3002`) |
| `PORT_PROMETHEUS` | Host port for Prometheus UI (default: `9090`) |
| `GRAFANA_ADMIN_PASSWORD` | Grafana `admin` login password — generate with `openssl rand -hex 32` |
| `GLITCHTIP_SECRET_KEY` | Django secret key for GlitchTip — generate with `python3 -c "import secrets; print(secrets.token_hex(32))"` |
| `GLITCHTIP_DOMAIN` | Public-facing base URL for GlitchTip (email links, CORS), e.g. `https://glitchtip.example.com` |
| `SENTRY_DSN` | GlitchTip/Sentry DSN for the backend (Spring Boot) — leave empty to disable |
| `VITE_SENTRY_DSN` | GlitchTip/Sentry DSN for the frontend (SvelteKit) — injected at build time via Vite |
## Observability
→ See [docs/OBSERVABILITY.md](./docs/OBSERVABILITY.md) — where to look for logs, traces, metrics, and errors.
## API Testing ## API Testing
@@ -311,4 +362,4 @@ HTTP test files are in `backend/api_tests/` for use with the VS Code REST Client
## Dev Container ## Dev Container
→ See [.devcontainer/README.md](./.devcontainer/README.md) A `.devcontainer/` config is available (Java 21 + Node 24, ports 8080 and 3000 forwarded). Use VS Code's "Reopen in Container" for a pre-configured environment.

View File

@@ -180,47 +180,8 @@ When in doubt, commit more often rather than less.
See [CODESTYLE.md](./CODESTYLE.md) for the full guide: Clean Code (Uncle Bob), DRY/KISS trade-offs, and SOLID principles applied to this stack. See [CODESTYLE.md](./CODESTYLE.md) for the full guide: Clean Code (Uncle Bob), DRY/KISS trade-offs, and SOLID principles applied to this stack.
For domain terminology (Person vs AppUser, DocumentStatus lifecycle, Chronik vs Aktivität, etc.) see [docs/GLOSSARY.md](./docs/GLOSSARY.md).
Quick reminders: Quick reminders:
- Pure functions over stateful helpers where possible - Pure functions over stateful helpers where possible
- No premature abstractions — KISS beats DRY - No premature abstractions — KISS beats DRY
- No backwards-compatibility shims for code that has no callers - No backwards-compatibility shims for code that has no callers
- Validate at system boundaries only (user input, external APIs) - Validate at system boundaries only (user input, external APIs)
## Frontend Domain Boundaries
The frontend mirrors the backend's package-by-domain structure. Each Tier-1 folder under `src/lib/` is a domain with a hard import boundary:
```
document person tag user geschichte notification ocr
activity conversation shared
```
The `boundaries/dependencies` ESLint rule enforces this. The full allow-list lives in `frontend/eslint.config.js`. The rule fires at error severity and blocks `npm run lint`.
### Allowed cross-domain imports
| From | May import from |
|---|---|
| `document` | `shared`, `person`, `tag`, `ocr`, `activity`, `conversation` |
| `geschichte` | `shared`, `person`, `document` |
| `ocr` | `shared`, `document` |
| `activity` | `shared`, `notification` |
| `person`, `tag`, `user`, `notification`, `conversation` | `shared` only |
| `shared` | `shared` only |
| `routes` | any domain |
### When you need to cross a boundary
1. **Move the code to `$lib/shared/`** — the correct fix when the code is truly generic (a UI primitive, a pure utility, a formatting helper).
2. **Add an explicit rule** — if a cross-domain dependency is architecturally justified (e.g., `document` importing `PersonTypeahead`), add the allow entry to `eslint.config.js` with a comment explaining the reason.
3. **Use `// eslint-disable-next-line boundaries/dependencies`** — last resort, only for cases where neither option is practical. Leave a comment explaining why.
### Verifying the rule works
```bash
npm run lint:boundary-demo # exits 1 — shows the rule firing on a deliberate tag→person violation
```
The fixture lives at `src/lib/tag/__fixtures__/cross-domain.fixture.ts` and is excluded from `npm run lint` via `--ignore-pattern`.

View File

@@ -1,306 +0,0 @@
# Contributing to Familienarchiv
For the full collaboration rules (issue workflow, PR process, Red/Green TDD, commit conventions) see [COLLABORATING.md](./COLLABORATING.md).
For coding style see [CODESTYLE.md](./CODESTYLE.md).
For the system architecture see [docs/ARCHITECTURE.md](./docs/ARCHITECTURE.md) (introduced in DOC-2; until that PR merges, see [docs/architecture/c4-diagrams.md](./docs/architecture/c4-diagrams.md)).
For domain terminology see [docs/GLOSSARY.md](./docs/GLOSSARY.md).
---
## 1. Environment setup
**Prerequisites:** Java 21 (SDKMAN), Node 24 (nvm), Docker
**Activate SDKMAN and nvm before running `java`, `mvn`, `node`, or `npm`:**
```bash
source "$HOME/.sdkman/bin/sdkman-init.sh"
export NVM_DIR="$HOME/.nvm" && [ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"
```
---
## 2. Daily development workflow
**Startup order — services must start in this sequence:**
```bash
# 1. Start PostgreSQL and MinIO
docker compose up -d db minio
# 2. Start the backend (separate terminal)
cd backend && ./mvnw spring-boot:run
# 3. Start the frontend (separate terminal)
cd frontend && npm install && npm run dev
```
> `npm install` also wires up the Husky pre-commit hook via the `prepare` script.
> Run it before your first commit, or the hook will fail to execute.
> **Do not use `docker-compose.ci.yml` locally** — it disables the bind mounts that the dev workflow depends on.
**Regenerate TypeScript types after any backend API change:**
```bash
# Backend must be running with dev profile
cd frontend && npm run generate:api
```
> ⚠️ Forgetting this step is the most common cause of "where did my TypeScript type go?" — always regenerate after changing models or endpoints.
**Test commands:**
```bash
cd backend && ./mvnw test # backend unit + slice tests
cd frontend && npm run test # Vitest unit tests
cd frontend && npm run check # svelte-check (type errors)
cd frontend && npx playwright test # Playwright e2e tests
```
**Branch naming:** `<type>/<issue-number>-<short-description>`, e.g. `feat/398-contributing`
**Commits:** one logical change per commit; reference the Gitea issue:
```
feat(person): add aliases endpoint
Closes #42
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
```
### Test-type decision matrix
| What you're testing | Test type | Tool |
|---|---|---|
| Service business logic, calculations | Unit test | JUnit + `@ExtendWith(MockitoExtension.class)` |
| HTTP contract, request validation, error codes | Controller slice test | `@WebMvcTest` |
| Server `load` function | Vitest unit | Import directly, mock `fetch` |
| Shared UI component | Vitest browser-mode | `render()` + `getByRole()` |
| Full user-facing flow, navigation, forms | E2E | Playwright |
---
## 3. Walkthrough A — Add a new domain
**Example:** adding a `citation` domain (formal references to documents).
Both the backend and frontend are organised **domain-first**. A new domain means adding a package on both sides under the same name.
### Backend
1. Create `backend/src/main/java/org/raddatz/familienarchiv/citation/`
2. Add entity, repository, service, controller, and DTOs flat in the package:
- **Entity** `Citation.java` — annotate with `@Entity @Data @Builder @NoArgsConstructor @AllArgsConstructor`; use `@GeneratedValue(strategy = GenerationType.UUID)` for the `id` field; add `@Schema(requiredMode = REQUIRED)` on every field the backend always populates
- **Repository** `CitationRepository.java` — extends `JpaRepository<Citation, UUID>`
- **Service** `CitationService.java``@Service @RequiredArgsConstructor`; write methods `@Transactional`, read methods unannotated; cross-domain data goes through the other domain's service, never its repository
- **Controller** `CitationController.java``@RestController @RequestMapping("/api/citations")`
3. Add `@RequirePermission(Permission.WRITE_ALL)` on every `POST`, `PUT`, `PATCH`, and `DELETE` endpoint — **this is not optional**. Read-only `GET` endpoints stay unannotated.
4. Add a Flyway migration: `backend/src/main/resources/db/migration/V{n}__{description}.sql` (use the next sequential number after the highest existing one).
5. **Write failing tests before any implementation** (Red step):
- Service unit test for business logic (`@ExtendWith(MockitoExtension.class)`)
- `@WebMvcTest` slice test for each HTTP endpoint
6. Rebuild with `--spring.profiles.active=dev` and run `npm run generate:api` in `frontend/`.
### Frontend
7. Create `frontend/src/lib/citation/` — domain-specific Svelte components and TypeScript utilities go here.
8. Add routes under `frontend/src/routes/citations/` as needed.
9. Add a per-domain `README.md` in both the backend package folder and `frontend/src/lib/citation/` (per DOC-6).
### Documentation
10. Update `docs/ARCHITECTURE.md` Section 2 to include the new domain.
11. Update `docs/GLOSSARY.md` if new terms are introduced.
12. Update the ESLint boundary allow-list in `frontend/eslint.config.js` if the domain needs to import from another domain.
---
## 4. Walkthrough B — Add a new endpoint
**Example:** `POST /api/persons/{id}/aliases` — attach a name alias to an existing person.
### Red (write failing tests first)
1. Write a failing `@WebMvcTest` controller slice test:
```java
@Test
void addAlias_returns201_whenAliasCreated() { ... }
```
2. Write a failing service unit test:
```java
@Test
void addAlias_throwsNotFound_whenPersonDoesNotExist() { ... }
```
### Green (implement)
3. Add the service method in `PersonService.java`:
```java
@Transactional
public PersonNameAlias addAlias(UUID personId, PersonNameAliasDTO dto) { ... }
```
4. Add the controller method in `PersonController.java`:
```java
@PostMapping("/{id}/aliases")
@RequirePermission(Permission.WRITE_ALL)
public ResponseEntity<PersonNameAlias> addAlias(@PathVariable UUID id,
@RequestBody PersonNameAliasDTO dto) { ... }
```
`@RequirePermission(Permission.WRITE_ALL)` on every state-mutating endpoint — **not optional**.
5. Validate user-supplied inputs at the controller boundary:
```java
if (dto.name() == null || dto.name().isBlank())
throw new ResponseStatusException(HttpStatus.BAD_REQUEST, "name is required");
```
Validate at system boundaries; trust internal service code.
6. Use `DomainException` for domain errors:
```java
DomainException.notFound(ErrorCode.PERSON_NOT_FOUND, "Person not found: " + id)
```
If you need a new error code, add it to `ErrorCode.java`, mirror it in
`frontend/src/lib/shared/errors.ts`, and add translation keys in `messages/{de,en,es}.json`.
7. Mark every field the backend always populates with `@Schema(requiredMode = REQUIRED)` — this drives TypeScript type generation.
### Types and tests
8. Rebuild with `--spring.profiles.active=dev`, then `npm run generate:api` in `frontend/`.
> ⚠️ **Always regenerate types after any API change.** This is the #1 cause of "where did my TypeScript type go?"
9. Run the full test suite — all green before committing.
---
## 5. Walkthrough C — Add a new frontend page
**Example:** `/persons/[id]/timeline` — a chronological event timeline for one person.
### Red (write failing test first)
1. Write a failing Playwright E2E test for the user flow:
```typescript
test('timeline shows events in chronological order', async ({ page }) => {
await page.goto('/persons/1/timeline');
// assertions...
});
```
### Green (implement)
2. Create `frontend/src/routes/persons/[id]/timeline/+page.svelte`
3. Add `frontend/src/routes/persons/[id]/timeline/+page.server.ts` for the SSR load:
```typescript
import { createApiClient } from '$lib/shared/api.server';
export const load: PageServerLoad = async ({ params, fetch }) => {
const api = createApiClient(fetch);
const result = await api.GET('/api/persons/{id}', { params: { path: { id: params.id } } });
if (!result.response.ok) throw error(result.response.status, '...');
return { person: result.data! };
};
```
4. Domain-specific components (e.g. `TimelineEntry.svelte`) → `frontend/src/lib/person/`
5. Shared primitives (e.g. a generic date-range display) → `frontend/src/lib/shared/primitives/`
6. UI patterns to follow:
- Back navigation: `import BackButton from '$lib/shared/primitives/BackButton.svelte'`
- Date display: always append `T12:00:00` — `new Intl.DateTimeFormat('de-DE', …).format(new Date(val + 'T12:00:00'))` — prevents UTC off-by-one errors
- Brand colors: `brand-navy`, `brand-mint`, `brand-sand` (defined in `src/routes/layout.css`)
- Accessibility: touch targets ≥ 44 px (`min-h-[44px]`); focus rings (`focus-visible:ring-2 focus-visible:ring-brand-navy`); `aria-label` on icon-only buttons; `aria-live="polite"` on dynamic status messages
7. Add Paraglide i18n keys in `messages/de.json`, `messages/en.json`, `messages/es.json`.
8. If adding a new error code: mirror in `frontend/src/lib/shared/errors.ts` and add translation keys.
9. Make all tests green before committing.
---
## 6. Conventions reference
### Error handling
| Scenario | Pattern |
|---|---|
| Domain entity not found | `DomainException.notFound(ErrorCode.X, "…")` |
| Permission denied | `DomainException.forbidden("…")` |
| Concurrent edit conflict | `DomainException.conflict(ErrorCode.X, "…")` |
| Infrastructure failure | `DomainException.internal(ErrorCode.X, "…")` |
| Simple controller validation | `throw new ResponseStatusException(HttpStatus.BAD_REQUEST, "…")` |
New error code: `ErrorCode.java` → `frontend/src/lib/shared/errors.ts` → `messages/{de,en,es}.json`.
### DTOs
- Input DTOs live flat in the domain package (e.g. `PersonUpdateDTO.java`)
- Responses are the entity itself — no separate response DTOs
- `@Schema(requiredMode = REQUIRED)` on every field the backend always populates
### Frontend API client
```typescript
const api = createApiClient(fetch); // from $lib/shared/api.server
const result = await api.GET('/api/persons/{id}', { params: { path: { id } } });
if (!result.response.ok) {
const code = (result.error as unknown as { code?: string })?.code;
throw error(result.response.status, getErrorMessage(code));
}
return { person: result.data! }; // non-null assertion is safe after the ok check
```
For multipart/form-data (file uploads): bypass the typed client and use `event.fetch` directly — never global `fetch`. The typed client cannot handle multipart bodies, but `event.fetch` is still required so that `handleFetch` injects the session cookie.
### Date handling
| Context | Pattern |
|---|---|
| Form display | German `dd.mm.yyyy` with auto-dot insertion via `handleDateInput()` |
| Wire format | ISO 8601 via a hidden `<input type="hidden" name="documentDate" value={dateIso}>` |
| Display | `new Intl.DateTimeFormat('de-DE', …).format(new Date(val + 'T12:00:00'))` |
| Honest precision display | `formatDocumentDate(iso, precision, end?, raw?, locale?)` (`$lib/shared/utils/documentDate.ts`) or the `<DocumentDate>` component — renders a document date at exactly its `meta_date_precision` (MONTH → "Juni 1916", never a fabricated day). It mirrors the Java `DocumentTitleFormatter`; both are pinned to `docs/date-label-fixtures.json` so the title and UI labels can't drift. `meta_date_raw` is untrusted — render it via default escaping, never `{@html}` (a CI guard enforces this). |
### Security checklist (new endpoint)
- `@RequirePermission(Permission.WRITE_ALL)` on every `POST`, `PUT`, `PATCH`, `DELETE` — required, not optional
- Validate all user-supplied inputs at the controller boundary before passing to the service
- Parameterised queries only — never interpolate user input into JPQL/SQL strings
- No raw user input in log messages — use `{}` placeholders: `log.warn("Not found: {}", id)`
- Validate content-type and size on upload endpoints before reading the stream
### Accessibility baseline (new frontend page)
- Touch targets ≥ 44 px on all interactive elements (`min-h-[44px]`)
- Focus rings on all focusable elements (`focus-visible:ring-2 focus-visible:ring-brand-navy`)
- `aria-label` on every icon-only button
- `aria-live="polite"` on dynamic status messages
- Color is never the sole status indicator
Full WCAG 2.1 AA reference: [docs/STYLEGUIDE.md](./docs/STYLEGUIDE.md).
### Lint and format
```bash
# Frontend
cd frontend && npm run lint # Prettier + ESLint check
cd frontend && npm run format # Auto-fix formatting
cd frontend && npm run check # svelte-check (type errors)
# Backend — no standalone lint tool; compilation and test runs catch style issues
cd backend && ./mvnw test # compile + test
cd backend && ./mvnw clean package -DskipTests # compile-only check
```

View File

@@ -1,93 +0,0 @@
# Familienarchiv
Familienarchiv is a private web application for digitising, organising, and searching a family document collection — letters, postcards, and photographs from 1899 to 1950. Family members upload scans, transcribe handwritten text (Kurrent/Sütterlin), and read the archive from any device.
---
## Subsystems
- `frontend/` — SvelteKit 2 / Svelte 5 / TypeScript / Tailwind 4 web app (server-side rendered)
- `backend/` — Spring Boot 4 (Java 21) REST API; handles documents, persons, search, and user management
- `ocr-service/` — Python FastAPI microservice for OCR and handwritten text recognition (HTR); single-node by design — see [ADR-001](docs/adr/001-ocr-python-microservice.md). Not part of the default dev stack (see Quick start below)
- `infra/` — Gitea Actions CI/CD config; future home for infrastructure-as-code
- `scripts/` — operational and data-pipeline helpers (`reset-db.sh`, `clean-e2e-data.sh`, import scripts)
---
## Quick start
**Prerequisites:** Java 21, Node 24, Docker with the `docker compose` plugin (V2).
### 1. Configure environment
```bash
cp .env.example .env
# The defaults in .env.example work for local development without changes.
```
### 2. Start infrastructure
```bash
# Starts PostgreSQL, MinIO (object storage), and Mailpit (dev mail catcher)
docker compose up -d db minio mailpit
```
### 3. Start the backend
```bash
cd backend
./mvnw spring-boot:run
# Starts on http://localhost:8080
# API docs (dev profile, auto-enabled): http://localhost:8080/v3/api-docs
```
### 4. Start the frontend
```bash
cd frontend
npm install
npm run dev
# Starts on http://localhost:5173
```
Open **http://localhost:5173** — you should see the Familienarchiv login screen.
Default development credentials:
```
# local dev only — change before any network-exposed deployment
Email: admin@familyarchive.local
Password: admin123
```
> **Development setup only.** The default `docker compose` config exposes the database port and uses root MinIO credentials. Do not connect this to a network without first reading `docs/DEPLOYMENT.md` _(coming: [DOC-5, #399](http://heim-nas:3005/marcel/familienarchiv/issues/399))_.
### Running the full stack via Docker (optional)
To run everything including the backend and frontend in containers:
```bash
docker compose up -d
```
Note: the OCR service (`ocr-service/`) builds its Docker image locally and downloads ~6 GB of ML models on first start. Expect 3060 minutes on a first run. The rest of the stack starts independently; OCR can be excluded with `--scale ocr-service=0` on memory-constrained machines (requires ≥ 12 GB RAM).
---
## Where to go next
| Resource | Purpose |
|---|---|
| [docs/architecture/c4-diagrams.md](docs/architecture/c4-diagrams.md) | C4 container and component diagrams (current system view) |
| [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) _(coming: [DOC-2, #396](http://heim-nas:3005/marcel/familienarchiv/issues/396))_ | Full architecture guide with domain list |
| [docs/GLOSSARY.md](docs/GLOSSARY.md) | Overloaded terms: Person vs AppUser, Chronik vs Aktivität, etc. |
| [CONTRIBUTING.md](CONTRIBUTING.md) _(coming: [DOC-4, #398](http://heim-nas:3005/marcel/familienarchiv/issues/398))_ | How to add a domain, endpoint, or SvelteKit route |
| [docs/DEPLOYMENT.md](docs/DEPLOYMENT.md) _(coming: [DOC-5, #399](http://heim-nas:3005/marcel/familienarchiv/issues/399))_ | Production deployment checklist and secrets guide |
| [docs/adr/](docs/adr/) | Architecture Decision Records — the "why" behind key choices |
| [Gitea issue tracker](http://heim-nas:3005/marcel/familienarchiv/issues) _(internal — home network only)_ | Bug reports, feature requests, and project planning |
---
## License
Private project — all rights reserved. Not licensed for redistribution.

View File

@@ -11,7 +11,7 @@ Spring Boot 4.0 monolith serving the Familienarchiv REST API. Handles document m
- **Server**: Jetty (not Tomcat — excluded in pom.xml) - **Server**: Jetty (not Tomcat — excluded in pom.xml)
- **Data**: PostgreSQL 16, JPA/Hibernate, Spring Data JPA - **Data**: PostgreSQL 16, JPA/Hibernate, Spring Data JPA
- **Migrations**: Flyway (SQL files in `src/main/resources/db/migration/`) - **Migrations**: Flyway (SQL files in `src/main/resources/db/migration/`)
- **Security**: Spring Security, Spring Session JDBC - **Security**: Spring Security, Spring Session JDBC, JWT tokens
- **File Storage**: MinIO via AWS SDK v2 (S3-compatible) - **File Storage**: MinIO via AWS SDK v2 (S3-compatible)
- **Spreadsheet Import**: Apache POI 5.5.0 (Excel/ODS) - **Spreadsheet Import**: Apache POI 5.5.0 (Excel/ODS)
- **API Docs**: SpringDoc OpenAPI 3.x (`/v3/api-docs` — dev profile only) - **API Docs**: SpringDoc OpenAPI 3.x (`/v3/api-docs` — dev profile only)
@@ -19,12 +19,11 @@ Spring Boot 4.0 monolith serving the Familienarchiv REST API. Handles document m
## Package Structure ## Package Structure
<!-- TODO: rewrite post-REFACTOR-1 — see Epic 4 --> Package-by-domain: each domain owns its controller, service, repository, entities, and DTOs.
``` ```
src/main/java/org/raddatz/familienarchiv/ src/main/java/org/raddatz/familienarchiv/
├── audit/ # Audit logging (AuditService, AuditLogQueryService) ├── audit/ # Audit logging (AuditService, AuditLogQueryService)
├── auth/ # AuthService, AuthSessionController, LoginRequest (Spring Session JDBC — ADR-020)
├── config/ # Infrastructure config (MinioConfig, AsyncConfig, WebConfig) ├── config/ # Infrastructure config (MinioConfig, AsyncConfig, WebConfig)
├── dashboard/ # Dashboard analytics + StatsController/StatsService ├── dashboard/ # Dashboard analytics + StatsController/StatsService
├── document/ # Document domain — entities, controller, service, repository, DTOs ├── document/ # Document domain — entities, controller, service, repository, DTOs
@@ -34,28 +33,31 @@ src/main/java/org/raddatz/familienarchiv/
├── exception/ # DomainException, ErrorCode, GlobalExceptionHandler ├── exception/ # DomainException, ErrorCode, GlobalExceptionHandler
├── filestorage/ # FileService (S3/MinIO) ├── filestorage/ # FileService (S3/MinIO)
├── geschichte/ # Geschichte (story) domain ├── geschichte/ # Geschichte (story) domain
├── importing/ # CanonicalImportOrchestrator + 4 loaders + CanonicalSheetReader ├── importing/ # MassImportService
├── notification/ # Notification domain + SseEmitterRegistry ├── notification/ # Notification domain + SseEmitterRegistry
├── ocr/ # OCR domain — OcrService, OcrBatchService, training ├── ocr/ # OCR domain — OcrService, OcrBatchService, training
├── person/ # Person domain — Person, PersonService, PersonController ├── person/ # Person domain — Person, PersonService, PersonController
│ └── relationship/ # PersonRelationship sub-domain │ └── relationship/ # PersonRelationship sub-domain
├── security/ # SecurityConfig, Permission, @RequirePermission, PermissionAspect ├── security/ # SecurityConfig, Permission, @RequirePermission, PermissionAspect
├── tag/ # Tag domain — Tag, TagService, TagController ├── tag/ # Tag domain — Tag, TagService, TagController
└── user/ # User domain — AppUser, UserGroup, UserService └── user/ # User domain — AppUser, UserGroup, UserService, auth controllers
``` ```
For per-domain ownership and public surface, see each domain's `README.md`. ## Layering Rules (Strict)
## Layering Rules ```
Controller → Service → Repository → DB
```
→ See [docs/ARCHITECTURE.md §Layering rule](../docs/ARCHITECTURE.md#layering-rule) - **Controllers never call repositories directly.**
- **Services never reach into another domain's repository.** Call the other domain's service instead.
**LLM reminder:** controllers never call repositories directly; services never reach into another domain's repository — always call the other domain's service. -`DocumentService``PersonService.getById()``PersonRepository`
-`DocumentService``PersonRepository` directly
## Key Entities ## Key Entities
| Entity | Table | Key Relationships | | Entity | Table | Key Relationships |
| --------------------------- | ------------------------------- | ------------------------------------------------------------------------------- | |---|---|---|
| `Document` | `documents` | ManyToOne sender (Person), ManyToMany receivers (Person), ManyToMany tags (Tag) | | `Document` | `documents` | ManyToOne sender (Person), ManyToMany receivers (Person), ManyToMany tags (Tag) |
| `Person` | `persons` | Referenced by documents as sender/receiver; name aliases table | | `Person` | `persons` | Referenced by documents as sender/receiver; name aliases table |
| `Tag` | `tag` | ManyToMany with documents via `document_tags`; self-referencing parent for tree | | `Tag` | `tag` | ManyToMany with documents via `document_tags`; self-referencing parent for tree |
@@ -97,23 +99,37 @@ public class MyEntity {
- Annotated with `@Service`, `@RequiredArgsConstructor`, optionally `@Slf4j`. - Annotated with `@Service`, `@RequiredArgsConstructor`, optionally `@Slf4j`.
- Write methods: `@Transactional`. - Write methods: `@Transactional`.
- Read methods: no annotation (default non-transactional)**except** when the method returns - Read methods: no annotation (default non-transactional).
an entity whose lazy associations must remain accessible to the caller after the method
returns. In that case, use `@Transactional(readOnly = true)` to keep the Hibernate session
open. Removing this annotation causes `LazyInitializationException` in production. See ADR-022.
- Cross-domain access goes through the other domain's service, never its repository. - Cross-domain access goes through the other domain's service, never its repository.
## Error Handling ## Error Handling
→ See [CONTRIBUTING.md §Error handling](../CONTRIBUTING.md#error-handling) Use `DomainException` for all domain errors:
**LLM reminder:** use `DomainException.notFound/forbidden/conflict/internal()` — never throw raw exceptions from service methods. For simple controller validation (not domain logic), `ResponseStatusException` is acceptable: `throw new ResponseStatusException(HttpStatus.BAD_REQUEST, "…")`. When adding a new `ErrorCode`: add to `ErrorCode.java`, mirror in `frontend/src/lib/shared/errors.ts`, add i18n keys in `messages/{de,en,es}.json`. ```java
DomainException.notFound(ErrorCode.DOCUMENT_NOT_FOUND, "...")
DomainException.forbidden("...")
DomainException.conflict(ErrorCode.IMPORT_ALREADY_RUNNING, "...")
DomainException.internal(ErrorCode.FILE_UPLOAD_FAILED, "...")
```
When adding a new `ErrorCode`:
1. Add to `ErrorCode.java`
2. Mirror in frontend `src/lib/errors.ts`
3. Add Paraglide translation key in `messages/{de,en,es}.json`
## Security / Permissions ## Security / Permissions
→ See [docs/ARCHITECTURE.md §Permission system](../docs/ARCHITECTURE.md#permission-system) Use `@RequirePermission` on controller methods or classes:
**LLM reminder:** `@RequirePermission(Permission.WRITE_ALL)` is **required** on every `POST`, `PUT`, `PATCH`, `DELETE` endpoint — not optional. Do not mix with Spring Security's `@PreAuthorize`. Available permissions: `READ_ALL`, `WRITE_ALL`, `ADMIN`, `ADMIN_USER`, `ADMIN_TAG`, `ADMIN_PERMISSION`, `ANNOTATE_ALL`, `BLOG_WRITE`. ```java
@RequirePermission(Permission.WRITE_ALL)
public Document updateDocument(...) { ... }
```
Available permissions: `READ_ALL`, `WRITE_ALL`, `ADMIN`, `ADMIN_USER`, `ADMIN_TAG`, `ADMIN_PERMISSION`
`PermissionAspect` checks the current user's `UserGroup.permissions` at runtime.
## OCR Integration ## OCR Integration
@@ -125,35 +141,49 @@ The backend orchestrates OCR by calling the Python `ocr-service` microservice vi
- `OcrBatchService` — handles batch/job workflows - `OcrBatchService` — handles batch/job workflows
- `OcrAsyncRunner` — async execution of OCR jobs - `OcrAsyncRunner` — async execution of OCR jobs
For ocr-service internals, see [`ocr-service/README.md`](../ocr-service/README.md).
## API Testing ## API Testing
HTTP test files in `backend/api_tests/` for the VS Code REST Client extension. HTTP test files in `backend/api_tests/` for the VS Code REST Client extension.
## How to Run ## How to Run
### Local Development
```bash ```bash
cd backend cd backend
./mvnw spring-boot:run # Run with dev profile (requires PostgreSQL + MinIO) # Run with dev profile (requires PostgreSQL + MinIO running via docker-compose)
./mvnw clean package # Build JAR (with tests) ./mvnw spring-boot:run
# Build JAR (with tests)
./mvnw clean package
# Build JAR skipping tests
./mvnw clean package -DskipTests ./mvnw clean package -DskipTests
./mvnw test # Run all tests
./mvnw test -Dtest=ClassName # Run a single test class # Run all tests
./mvnw clean verify # Run with JaCoCo coverage report ./mvnw test
# Run a single test class
./mvnw test -Dtest=ClassName
# Run with coverage (JaCoCo)
./mvnw clean verify
``` ```
**OpenAPI / TypeScript type generation:** ### OpenAPI TypeScript Generation
1. Start backend with `--spring.profiles.active=dev` 1. Build and start backend with `--spring.profiles.active=dev`
2. In `frontend/`: `npm run generate:api` 2. In `frontend/`, run: `npm run generate:api`
**LLM reminder:** always regenerate types after any model or endpoint change — the most common cause of "where did my TypeScript type go?" ### Profiles
- **dev** (default): Enables OpenAPI, dev configs, e2e seeds
- **prod**: Production profile — no dev endpoints
## Testing ## Testing
- Unit tests: Mockito + JUnit, pure in-memory - Unit tests: Mockito + JUnit, pure in-memory
- Slice tests: `@WebMvcTest`, `@DataJpaTest` with Testcontainers PostgreSQL - Slice tests: `@WebMvcTest`, `@DataJpaTest` with Testcontainers PostgreSQL
- Integration tests: Full Spring context with Testcontainers - Integration tests: Full Spring context with Testcontainers
- Coverage gate: 88% branch coverage (JaCoCo) - Coverage gate: 88% branch coverage overall (JaCoCo)

View File

@@ -5,7 +5,7 @@
<parent> <parent>
<groupId>org.springframework.boot</groupId> <groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId> <artifactId>spring-boot-starter-parent</artifactId>
<version>4.0.6</version> <version>4.0.0</version>
<relativePath/> <!-- lookup parent from repository --> <relativePath/> <!-- lookup parent from repository -->
</parent> </parent>
<groupId>org.raddatz</groupId> <groupId>org.raddatz</groupId>
@@ -29,30 +29,11 @@
<properties> <properties>
<java.version>21</java.version> <java.version>21</java.version>
</properties> </properties>
<dependencyManagement>
<dependencies>
<!-- opentelemetry-spring-boot-starter:2.27.0 was built against opentelemetry-api:1.61.0,
but Spring Boot 4.0.0 BOM only manages 1.55.0 (missing GlobalOpenTelemetry.getOrNoop()).
Import the core OTel BOM here to override it before the Spring Boot BOM applies. -->
<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-bom</artifactId>
<version>1.61.0</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
<dependencies> <dependencies>
<dependency> <dependency>
<groupId>org.springframework.boot</groupId> <groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId> <artifactId>spring-boot-starter-actuator</artifactId>
</dependency> </dependency>
<!-- Spring Boot 4.0 splits Micrometer metrics export (incl. Prometheus scrape endpoint) into its own starter -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-micrometer-metrics</artifactId>
</dependency>
<dependency> <dependency>
<groupId>org.springframework.boot</groupId> <groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-validation</artifactId> <artifactId>spring-boot-starter-validation</artifactId>
@@ -69,10 +50,6 @@
<groupId>org.springframework.boot</groupId> <groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-security</artifactId> <artifactId>spring-boot-starter-security</artifactId>
</dependency> </dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-session-jdbc</artifactId>
</dependency>
<dependency> <dependency>
<groupId>org.springframework.boot</groupId> <groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-webmvc</artifactId> <artifactId>spring-boot-starter-webmvc</artifactId>
@@ -131,12 +108,6 @@
<groupId>org.awaitility</groupId> <groupId>org.awaitility</groupId>
<artifactId>awaitility</artifactId> <artifactId>awaitility</artifactId>
<scope>test</scope> <scope>test</scope>
</dependency>
<dependency>
<groupId>com.tngtech.archunit</groupId>
<artifactId>archunit-junit5</artifactId>
<version>1.3.0</version>
<scope>test</scope>
</dependency> </dependency>
<!-- Excel Bearbeitung (Apache POI) --> <!-- Excel Bearbeitung (Apache POI) -->
<dependency> <dependency>
@@ -180,16 +151,11 @@
<artifactId>flyway-database-postgresql</artifactId> <artifactId>flyway-database-postgresql</artifactId>
</dependency> </dependency>
<!-- Caffeine cache + Bucket4j for in-memory rate limiting --> <!-- Caffeine cache for in-memory rate limiting -->
<dependency> <dependency>
<groupId>com.github.ben-manes.caffeine</groupId> <groupId>com.github.ben-manes.caffeine</groupId>
<artifactId>caffeine</artifactId> <artifactId>caffeine</artifactId>
</dependency> </dependency>
<dependency>
<groupId>com.bucket4j</groupId>
<artifactId>bucket4j-core</artifactId>
<version>8.10.1</version>
</dependency>
<!-- OpenAPI / Swagger UI — enabled only in the dev Spring profile --> <!-- OpenAPI / Swagger UI — enabled only in the dev Spring profile -->
<dependency> <dependency>
@@ -216,50 +182,7 @@
<dependency> <dependency>
<groupId>com.googlecode.owasp-java-html-sanitizer</groupId> <groupId>com.googlecode.owasp-java-html-sanitizer</groupId>
<artifactId>owasp-java-html-sanitizer</artifactId> <artifactId>owasp-java-html-sanitizer</artifactId>
<version>20260101.1</version> <version>20240325.1</version>
</dependency>
<!-- HTML → plain-text extraction for comment previews -->
<dependency>
<groupId>org.jsoup</groupId>
<artifactId>jsoup</artifactId>
<version>1.18.1</version>
</dependency>
<!-- Observability: Prometheus metrics scrape endpoint (version managed by Spring Boot BOM) -->
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
<!-- Observability: Micrometer → OpenTelemetry tracing bridge (version managed by Spring Boot BOM) -->
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-tracing-bridge-otel</artifactId>
</dependency>
<!-- Observability: OTel Spring Boot auto-instrumentation — NOT in Spring Boot BOM, pinned explicitly -->
<dependency>
<groupId>io.opentelemetry.instrumentation</groupId>
<artifactId>opentelemetry-spring-boot-starter</artifactId>
<version>2.27.0</version>
<exclusions>
<!-- Excludes AzureAppServiceResourceProvider which references ServiceAttributes.SERVICE_INSTANCE_ID
that does not exist in the semconv version pulled by this project. -->
<exclusion>
<groupId>io.opentelemetry.contrib</groupId>
<artifactId>opentelemetry-azure-resources</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- Sentry error reporting (GlitchTip-compatible) — sentry-spring-boot-4 is the
Spring Boot 4 / Spring Framework 7 compatible module (replaces the jakarta starter
which crashes with SF7 due to bean-name generation for triply-nested @Import classes) -->
<dependency>
<groupId>io.sentry</groupId>
<artifactId>sentry-spring-boot-4</artifactId>
<version>8.41.0</version>
</dependency> </dependency>
</dependencies> </dependencies>
@@ -306,7 +229,7 @@
<phase>verify</phase> <phase>verify</phase>
<goals><goal>report</goal></goals> <goals><goal>report</goal></goals>
</execution> </execution>
<!-- Gate: ratchet at 0.77 — actual measured coverage after drift; raise via #496 --> <!-- Gate: baseline 89.4% overall / service 90.2% / controller 80.0% -->
<execution> <execution>
<id>check</id> <id>check</id>
<phase>verify</phase> <phase>verify</phase>
@@ -319,7 +242,7 @@
<limit> <limit>
<counter>BRANCH</counter> <counter>BRANCH</counter>
<value>COVEREDRATIO</value> <value>COVEREDRATIO</value>
<minimum>0.77</minimum> <minimum>0.88</minimum>
</limit> </limit>
</limits> </limits>
</rule> </rule>
@@ -337,16 +260,6 @@
</profiles> </profiles>
</configuration> </configuration>
</plugin> </plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<forkedProcessTimeoutInSeconds>600</forkedProcessTimeoutInSeconds>
<systemPropertyVariables>
<junit.jupiter.execution.timeout.default>90 s</junit.jupiter.execution.timeout.default>
</systemPropertyVariables>
</configuration>
</plugin>
</plugins> </plugins>
</build> </build>

View File

@@ -35,22 +35,7 @@ public enum AuditKind {
USER_DELETED, USER_DELETED,
/** Payload: {@code {"userId": "uuid", "email": "addr", "addedGroups": ["Admin"], "removedGroups": []}} */ /** Payload: {@code {"userId": "uuid", "email": "addr", "addedGroups": ["Admin"], "removedGroups": []}} */
GROUP_MEMBERSHIP_CHANGED, GROUP_MEMBERSHIP_CHANGED;
/** Payload: {@code {"userId": "uuid", "ip": "1.2.3.4", "ua": "Mozilla/5.0..."}} */
LOGIN_SUCCESS,
/** Payload: {@code {"email": "addr", "ip": "1.2.3.4", "ua": "Mozilla/5.0..."}} — password NEVER included */
LOGIN_FAILED,
/** Payload: {@code {"userId": "uuid", "ip": "1.2.3.4", "ua": "Mozilla/5.0...", "reason": "password_change|password_reset|admin_force_logout", "revokedCount": 3}} */
LOGOUT,
/** Payload: {@code {"actorId": "uuid", "targetUserId": "uuid", "revokedCount": 3}} */
ADMIN_FORCE_LOGOUT,
/** Payload: {@code {"ip": "1.2.3.4", "email": "addr"}} — password NEVER included */
LOGIN_RATE_LIMITED;
public static final Set<AuditKind> ROLLUP_ELIGIBLE = Set.of( public static final Set<AuditKind> ROLLUP_ELIGIBLE = Set.of(
TEXT_SAVED, FILE_UPLOADED, ANNOTATION_CREATED, TEXT_SAVED, FILE_UPLOADED, ANNOTATION_CREATED,

View File

@@ -1,37 +0,0 @@
# audit
Append-only event store for all domain mutations. Every write across the application produces an `audit_log` row. The activity feed and Family Pulse dashboard aggregate from this table.
## What this domain owns
Table: `audit_log` (append-only by convention — no UPDATE or DELETE in application code).
Features: log mutations, query activity feed, query per-entity history.
**Admission criteria (why this is cross-cutting, not a Tier-1 domain):** consumed by 5+ domains; has no user-facing CRUD of its own; the data model is fixed (event log, not a business entity).
## What this domain does NOT own
Nothing beyond the log table. `audit/` is an infrastructure layer, not a business domain.
## Public surface (called from other domains)
| Method | Consumer | Purpose |
|---|---|---|
| `logAfterCommit(event)` | document, person, user, ocr, geschichte | Record a mutation event after the DB transaction commits |
`logAfterCommit` is the only write-path. Query paths (`AuditLogQueryService`) are consumed by `dashboard/` and the activity feed route.
## Internal layout
- `AuditService``logAfterCommit()` (write)
- `AuditLogQueryService` — query by entity, by user, for the activity feed
- `AuditLog` (entity) → table `audit_log`
- `AuditLogRepository`
## Cross-domain dependencies
None. `audit/` is consumed by other domains; it does not call out to any of them.
## Frontend counterpart
No direct frontend counterpart. Audit data surfaces in the `activity/` and `conversation/` frontend domains via the dashboard API.

View File

@@ -1,84 +0,0 @@
package org.raddatz.familienarchiv.auth;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.raddatz.familienarchiv.audit.AuditKind;
import org.raddatz.familienarchiv.audit.AuditService;
import org.raddatz.familienarchiv.exception.DomainException;
import org.raddatz.familienarchiv.user.AppUser;
import org.raddatz.familienarchiv.user.UserService;
import org.springframework.security.authentication.AuthenticationManager;
import org.springframework.security.authentication.UsernamePasswordAuthenticationToken;
import org.springframework.security.core.Authentication;
import org.springframework.security.core.AuthenticationException;
import org.springframework.stereotype.Service;
import java.util.Map;
import java.util.UUID;
@Service
@RequiredArgsConstructor
@Slf4j
public class AuthService {
private final AuthenticationManager authenticationManager;
private final UserService userService;
private final AuditService auditService;
private final LoginRateLimiter loginRateLimiter;
private final SessionRevocationPort sessionRevocationPort;
public LoginResult login(String email, String password, String ip, String ua) {
try {
loginRateLimiter.checkAndConsume(ip, email);
} catch (DomainException ex) {
auditService.log(AuditKind.LOGIN_RATE_LIMITED, null, null, Map.of(
"ip", ip,
"email", email));
throw ex;
}
try {
Authentication auth = authenticationManager.authenticate(
new UsernamePasswordAuthenticationToken(email, password));
AppUser user = userService.findByEmail(email);
auditService.log(AuditKind.LOGIN_SUCCESS, user.getId(), null, Map.of(
"userId", user.getId().toString(),
"ip", ip,
"ua", truncateUa(ua)));
loginRateLimiter.invalidateOnSuccess(ip, email);
return new LoginResult(user, auth);
} catch (AuthenticationException ex) {
// Audit login failure — intentionally does NOT log the attempted password.
// DaoAuthenticationProvider already runs a dummy BCrypt on unknown users to
// equalise timing between "user not found" and "wrong password" paths.
auditService.log(AuditKind.LOGIN_FAILED, null, null, Map.of(
"email", email,
"ip", ip,
"ua", truncateUa(ua)));
throw DomainException.invalidCredentials();
}
}
public int revokeOtherSessions(String currentSessionId, String principalName) {
return sessionRevocationPort.revokeOtherSessions(currentSessionId, principalName);
}
public int revokeAllSessions(String principalName) {
return sessionRevocationPort.revokeAllSessions(principalName);
}
public void logout(String email, String ip, String ua) {
AppUser user = userService.findByEmail(email);
auditService.log(AuditKind.LOGOUT, user.getId(), null, Map.of(
"userId", user.getId().toString(),
"ip", ip,
"ua", truncateUa(ua)));
}
private static String truncateUa(String ua) {
if (ua == null) return "";
return ua.length() > 200 ? ua.substring(0, 200) : ua;
}
public record LoginResult(AppUser user, Authentication authentication) {}
}

View File

@@ -1,102 +0,0 @@
package org.raddatz.familienarchiv.auth;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import jakarta.servlet.http.HttpSession;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.raddatz.familienarchiv.user.AppUser;
import org.springframework.http.ResponseEntity;
import org.springframework.security.core.Authentication;
import org.springframework.security.core.context.SecurityContext;
import org.springframework.security.core.context.SecurityContextHolder;
import org.springframework.security.web.authentication.session.SessionAuthenticationStrategy;
import org.springframework.security.web.context.HttpSessionSecurityContextRepository;
import org.springframework.web.bind.annotation.*;
// @RequirePermission is intentionally absent: login is unauthenticated by design;
// logout requires an authenticated session (enforced by SecurityConfig), not a specific permission.
@RestController
@RequestMapping("/api/auth")
@RequiredArgsConstructor
@Slf4j
public class AuthSessionController {
private final AuthService authService;
private final SessionAuthenticationStrategy sessionAuthenticationStrategy;
@PostMapping("/login")
public ResponseEntity<AppUser> login(
@RequestBody LoginRequest request,
HttpServletRequest httpRequest,
HttpServletResponse httpResponse) {
String ip = resolveClientIp(httpRequest);
String ua = resolveUserAgent(httpRequest);
AuthService.LoginResult result = authService.login(request.email(), request.password(), ip, ua);
// Session-fixation defense (CWE-384): rotate the session ID at the authentication
// boundary. ChangeSessionIdAuthenticationStrategy invalidates any pre-auth session ID
// an attacker may have planted and mints a fresh one before we attach the SecurityContext.
httpRequest.getSession(true);
sessionAuthenticationStrategy.onAuthentication(result.authentication(), httpRequest, httpResponse);
// Spring Session JDBC intercepts setAttribute() and persists the record under the
// (now rotated) opaque ID; the Set-Cookie: fa_session=<opaque-id> is added automatically.
SecurityContext context = SecurityContextHolder.createEmptyContext();
context.setAuthentication(result.authentication());
SecurityContextHolder.setContext(context);
httpRequest.getSession()
.setAttribute(HttpSessionSecurityContextRepository.SPRING_SECURITY_CONTEXT_KEY, context);
return ResponseEntity.ok(result.user());
}
@PostMapping("/logout")
public ResponseEntity<Void> logout(Authentication authentication, HttpServletRequest httpRequest) {
String email = authentication.getName();
String ip = resolveClientIp(httpRequest);
String ua = resolveUserAgent(httpRequest);
// CWE-613 defense: invalidate the session first — that is the contract the user
// is relying on when they click "Log out." Audit is best-effort and must not
// bubble up: if the user record was deleted while the session was live, the
// audit lookup throws, but the session row in spring_session must still die.
HttpSession session = httpRequest.getSession(false);
if (session != null) {
session.invalidate();
}
SecurityContextHolder.clearContext();
try {
authService.logout(email, ip, ua);
} catch (Exception ex) {
log.warn("Audit logout failed for {}; session was already invalidated", email, ex);
}
return ResponseEntity.noContent().build();
}
/**
* Resolves the client IP for audit-log purposes.
*
* <p>Trust model: the leftmost {@code X-Forwarded-For} value is taken at face value.
* This is correct <em>only</em> if the ingress (Caddy in production) strips any
* client-supplied XFF before forwarding — otherwise an attacker can pin audit-log
* IPs to whatever they want. Verify the reverse-proxy config before exposing this
* service behind a different ingress.
*/
private static String resolveClientIp(HttpServletRequest request) {
String forwarded = request.getHeader("X-Forwarded-For");
if (forwarded != null && !forwarded.isBlank()) {
return forwarded.split(",")[0].trim();
}
return request.getRemoteAddr();
}
private static String resolveUserAgent(HttpServletRequest request) {
String ua = request.getHeader("User-Agent");
return ua != null ? ua : "";
}
}

View File

@@ -1,29 +0,0 @@
package org.raddatz.familienarchiv.auth;
import lombok.RequiredArgsConstructor;
import org.springframework.session.jdbc.JdbcIndexedSessionRepository;
@RequiredArgsConstructor
class JdbcSessionRevocationAdapter implements SessionRevocationPort {
private final JdbcIndexedSessionRepository sessionRepository;
@Override
public int revokeOtherSessions(String currentSessionId, String principalName) {
int count = 0;
for (String id : sessionRepository.findByPrincipalName(principalName).keySet()) {
if (!id.equals(currentSessionId)) {
sessionRepository.deleteById(id);
count++;
}
}
return count;
}
@Override
public int revokeAllSessions(String principalName) {
var sessions = sessionRepository.findByPrincipalName(principalName);
sessions.keySet().forEach(sessionRepository::deleteById);
return sessions.size();
}
}

View File

@@ -1,72 +0,0 @@
package org.raddatz.familienarchiv.auth;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;
import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;
import lombok.extern.slf4j.Slf4j;
import org.raddatz.familienarchiv.exception.DomainException;
import org.raddatz.familienarchiv.exception.ErrorCode;
import org.springframework.stereotype.Service;
import java.time.Duration;
import java.util.Locale;
import java.util.concurrent.TimeUnit;
@Service
@Slf4j
public class LoginRateLimiter {
private final LoadingCache<String, Bucket> byIpEmail;
private final LoadingCache<String, Bucket> byIp;
private final int maxPerIpEmail;
private final int maxPerIp;
private final int windowMinutes;
public LoginRateLimiter(RateLimitProperties props) {
this.maxPerIpEmail = props.getMaxAttemptsPerIpEmail();
this.maxPerIp = props.getMaxAttemptsPerIp();
this.windowMinutes = props.getWindowMinutes();
this.byIpEmail = Caffeine.newBuilder()
.expireAfterAccess(windowMinutes, TimeUnit.MINUTES)
.build(key -> newBucket(maxPerIpEmail, windowMinutes));
this.byIp = Caffeine.newBuilder()
.expireAfterAccess(windowMinutes, TimeUnit.MINUTES)
.build(key -> newBucket(maxPerIp, windowMinutes));
}
// NOTE: This cache is node-local (in-memory). In a multi-replica deployment,
// effective limits would be multiplied by replica count.
// For the current single-VPS setup this is the correct, simplest implementation.
public void checkAndConsume(String ip, String email) {
long retryAfterSeconds = windowMinutes * 60L;
String key = ip + ":" + email.toLowerCase(Locale.ROOT);
if (!byIpEmail.get(key).tryConsume(1)) {
throw DomainException.tooManyRequests(ErrorCode.TOO_MANY_LOGIN_ATTEMPTS,
"Too many login attempts from " + ip, retryAfterSeconds);
}
if (!byIp.get(ip).tryConsume(1)) {
// Refund the ipEmail token so IP-level blocking does not erode the per-email quota.
byIpEmail.get(key).addTokens(1);
throw DomainException.tooManyRequests(ErrorCode.TOO_MANY_LOGIN_ATTEMPTS,
"Too many login attempts from " + ip, retryAfterSeconds);
}
}
public void invalidateOnSuccess(String ip, String email) {
byIpEmail.invalidate(ip + ":" + email.toLowerCase(Locale.ROOT));
byIp.invalidate(ip);
}
private static Bucket newBucket(int limit, int minutes) {
return Bucket.builder()
.addLimit(Bandwidth.builder()
.capacity(limit)
.refillGreedy(limit, Duration.ofMinutes(minutes))
.build())
.build();
}
}

View File

@@ -1,3 +0,0 @@
package org.raddatz.familienarchiv.auth;
public record LoginRequest(String email, String password) {}

View File

@@ -1,14 +0,0 @@
package org.raddatz.familienarchiv.auth;
class NoOpSessionRevocationAdapter implements SessionRevocationPort {
@Override
public int revokeOtherSessions(String currentSessionId, String principalName) {
return 0;
}
@Override
public int revokeAllSessions(String principalName) {
return 0;
}
}

View File

@@ -1,14 +0,0 @@
package org.raddatz.familienarchiv.auth;
import lombok.Data;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.stereotype.Component;
@Component
@ConfigurationProperties("rate-limit.login")
@Data
public class RateLimitProperties {
private int maxAttemptsPerIpEmail = 10;
private int maxAttemptsPerIp = 20;
private int windowMinutes = 15;
}

View File

@@ -1,19 +0,0 @@
package org.raddatz.familienarchiv.auth;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.session.jdbc.JdbcIndexedSessionRepository;
@Configuration
class SessionRevocationConfig {
@Bean
SessionRevocationPort sessionRevocationPort(
@Autowired(required = false) JdbcIndexedSessionRepository sessionRepository) {
if (sessionRepository != null) {
return new JdbcSessionRevocationAdapter(sessionRepository);
}
return new NoOpSessionRevocationAdapter();
}
}

View File

@@ -1,6 +0,0 @@
package org.raddatz.familienarchiv.auth;
public interface SessionRevocationPort {
int revokeOtherSessions(String currentSessionId, String principalName);
int revokeAllSessions(String principalName);
}

View File

@@ -5,10 +5,8 @@ import lombok.extern.slf4j.Slf4j;
import org.flywaydb.core.Flyway; import org.flywaydb.core.Flyway;
import org.springframework.context.annotation.Bean; import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration; import org.springframework.context.annotation.Configuration;
import org.springframework.core.env.Environment;
import javax.sql.DataSource; import javax.sql.DataSource;
import java.util.Map;
@Configuration @Configuration
@RequiredArgsConstructor @RequiredArgsConstructor
@@ -16,7 +14,6 @@ import java.util.Map;
public class FlywayConfig { public class FlywayConfig {
private final DataSource dataSource; private final DataSource dataSource;
private final Environment environment;
@Bean(name = "flyway") @Bean(name = "flyway")
public Flyway flyway() { public Flyway flyway() {
@@ -24,7 +21,6 @@ public class FlywayConfig {
Flyway flyway = Flyway.configure() Flyway flyway = Flyway.configure()
.dataSource(dataSource) .dataSource(dataSource)
.locations("classpath:db/migration") .locations("classpath:db/migration")
.placeholders(Map.of("grafanaDbPassword", resolveGrafanaDbPassword()))
.baselineOnMigrate(true) .baselineOnMigrate(true)
.baselineVersion("4") .baselineVersion("4")
.load(); .load();
@@ -32,22 +28,4 @@ public class FlywayConfig {
log.info("Flyway: {} migration(s) applied.", result.migrationsExecuted); log.info("Flyway: {} migration(s) applied.", result.migrationsExecuted);
return flyway; return flyway;
} }
// Fail-closed: refuse to boot when GRAFANA_DB_PASSWORD is unset. The
// grafana_reader role's password is (re)set on every boot by
// R__grafana_reader_password.sql, so a missing env var means we'd either
// skip the rotation silently or — with a hardcoded fallback — publish a
// well-known credential for a role with SELECT on audit_log, documents,
// and transcription_blocks. Same shape as UserDataInitializer's refusal
// to seed default admin credentials outside dev/test/e2e.
String resolveGrafanaDbPassword() {
String value = environment.getProperty("GRAFANA_DB_PASSWORD");
if (value == null || value.isBlank()) {
throw new IllegalStateException(
"GRAFANA_DB_PASSWORD is required: it is consumed by "
+ "R__grafana_reader_password.sql to (re)set the grafana_reader "
+ "role's password on every boot. Generate with: openssl rand -hex 32");
}
return value;
}
} }

View File

@@ -28,7 +28,6 @@ public class RateLimitInterceptor implements HandlerInterceptor {
AtomicInteger count = requestCounts.get(ip, k -> new AtomicInteger(0)); AtomicInteger count = requestCounts.get(ip, k -> new AtomicInteger(0));
if (count.incrementAndGet() > MAX_REQUESTS_PER_MINUTE) { if (count.incrementAndGet() > MAX_REQUESTS_PER_MINUTE) {
response.setStatus(HttpStatus.TOO_MANY_REQUESTS.value()); response.setStatus(HttpStatus.TOO_MANY_REQUESTS.value());
response.setHeader("Retry-After", "60");
response.getWriter().write("{\"code\":\"RATE_LIMIT_EXCEEDED\",\"message\":\"Too many requests\"}"); response.getWriter().write("{\"code\":\"RATE_LIMIT_EXCEEDED\",\"message\":\"Too many requests\"}");
return false; return false;
} }

View File

@@ -1,22 +0,0 @@
package org.raddatz.familienarchiv.config;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.session.web.http.CookieSerializer;
import org.springframework.session.web.http.DefaultCookieSerializer;
@Configuration
public class SpringSessionConfig {
@Bean
public CookieSerializer cookieSerializer() {
DefaultCookieSerializer serializer = new DefaultCookieSerializer();
serializer.setCookieName("fa_session");
serializer.setSameSite("Strict");
// cookieHttpOnly: true is the DefaultCookieSerializer default
// useSecureCookie not set: auto-detects from request.isSecure().
// With forward-headers-strategy: native, Caddy's X-Forwarded-Proto: https
// causes isSecure() → true in production; direct HTTP in dev/tests → false.
return serializer;
}
}

View File

@@ -29,11 +29,5 @@ public record ActivityFeedItemDTO(
requiredMode = Schema.RequiredMode.NOT_REQUIRED, requiredMode = Schema.RequiredMode.NOT_REQUIRED,
description = "Annotation associated with the comment; populated only for COMMENT_ADDED and MENTION_CREATED kinds." description = "Annotation associated with the comment; populated only for COMMENT_ADDED and MENTION_CREATED kinds."
) )
UUID annotationId, UUID annotationId
@Nullable
@Schema(
requiredMode = Schema.RequiredMode.NOT_REQUIRED,
description = "Plain-text preview of the comment body (HTML stripped server-side, truncated to 120 chars); null for non-comment feed items or deleted comments."
)
String commentPreview
) {} ) {}

View File

@@ -12,7 +12,6 @@ import org.raddatz.familienarchiv.document.Document;
import org.raddatz.familienarchiv.person.Person; import org.raddatz.familienarchiv.person.Person;
import org.raddatz.familienarchiv.document.transcription.TranscriptionBlock; import org.raddatz.familienarchiv.document.transcription.TranscriptionBlock;
import org.raddatz.familienarchiv.document.comment.CommentService; import org.raddatz.familienarchiv.document.comment.CommentService;
import org.raddatz.familienarchiv.document.comment.CommentData;
import org.raddatz.familienarchiv.document.DocumentService; import org.raddatz.familienarchiv.document.DocumentService;
import org.raddatz.familienarchiv.document.transcription.TranscriptionService; import org.raddatz.familienarchiv.document.transcription.TranscriptionService;
import org.raddatz.familienarchiv.user.UserService; import org.raddatz.familienarchiv.user.UserService;
@@ -134,9 +133,9 @@ public class DashboardService {
.filter(Objects::nonNull) .filter(Objects::nonNull)
.distinct() .distinct()
.toList(); .toList();
Map<UUID, CommentData> commentDataByComment = commentIds.isEmpty() Map<UUID, UUID> annotationByComment = commentIds.isEmpty()
? Map.of() ? Map.of()
: commentService.findDataByIds(commentIds); : commentService.findAnnotationIdsByIds(commentIds);
return rows.stream().map(row -> { return rows.stream().map(row -> {
ActivityActorDTO actor = row.getActorId() != null ActivityActorDTO actor = row.getActorId() != null
@@ -147,10 +146,7 @@ public class DashboardService {
? row.getHappenedAtUntil().atOffset(ZoneOffset.UTC) ? row.getHappenedAtUntil().atOffset(ZoneOffset.UTC)
: null; : null;
UUID commentId = row.getCommentId(); UUID commentId = row.getCommentId();
CommentData commentData = commentId != null ? commentDataByComment.get(commentId) : null; UUID annotationId = commentId != null ? annotationByComment.get(commentId) : null;
UUID annotationId = commentData != null ? commentData.annotationId() : null;
String commentPreview = commentData != null && !commentData.preview().isBlank()
? commentData.preview() : null;
return new ActivityFeedItemDTO( return new ActivityFeedItemDTO(
org.raddatz.familienarchiv.audit.AuditKind.valueOf(row.getKind()), org.raddatz.familienarchiv.audit.AuditKind.valueOf(row.getKind()),
actor, actor,
@@ -162,8 +158,7 @@ public class DashboardService {
row.getCount(), row.getCount(),
happenedAtUntil, happenedAtUntil,
commentId, commentId,
annotationId, annotationId
commentPreview
); );
}).toList(); }).toList();
} }

View File

@@ -1,39 +0,0 @@
# dashboard
Stats aggregation for the admin dashboard and the Family Pulse widget. This is a derived domain — it has no tables of its own; all data is computed on-the-fly from Tier-1 domain data.
## What this domain owns
No entities. Routes: `/api/dashboard/*`, `/api/stats/*`.
Features: document counts, person counts, publication stats, weekly activity data, incomplete-document list, enrichment queue, Family Pulse widget data, admin statistics.
**Admission criteria (cross-cutting):** aggregates from 3+ domains; no owned entities.
## What this domain does NOT own
None of the underlying data — it reads from `document/`, `person/`, `audit/`, `notification/`, `geschichte/`.
## Public surface
`dashboard/` is a leaf domain — no other domain calls its services. It is the aggregator, not the aggregated.
## Internal layout
- `StatsController` — REST under `/api/stats`
- `DashboardController` — REST under `/api/dashboard`
- `StatsService` — aggregated counts (documents, persons, geschichten, incomplete, etc.)
- `DashboardService` — activity feed composition, Family Pulse data
## Cross-domain dependencies
- `DocumentService.count()` — total document count (StatsService)
- `DocumentService.getDocumentById(UUID)` / `getDocumentsByIds(List<UUID>)` — document enrichment for activity feed (DashboardService)
- `PersonService.count()` — total person count (StatsService)
- `TranscriptionService.listBlocks(UUID)` — transcription block lookup for resume widget (DashboardService)
- `UserService.getById(UUID)` — actor name resolution in activity feed (DashboardService)
- `CommentService.findAnnotationIdsByIds(...)` — annotation context lookup for activity feed (DashboardService)
- `AuditLogQueryService.findMostRecentDocumentForUser()` / `getPulseStats()` / `findActivityFeed()` — audit-sourced feed rows (DashboardService)
## Frontend counterpart
Activity feed and Pulse widget are assembled in `frontend/src/lib/shared/dashboard/` and in the `aktivitaeten` route; no dedicated `dashboard/` lib folder.

View File

@@ -1,12 +1,7 @@
package org.raddatz.familienarchiv.dashboard; package org.raddatz.familienarchiv.dashboard;
import io.swagger.v3.oas.annotations.media.Schema;
/** /**
* Aggregate counts for the dashboard/persons stats bar. * Aggregate counts for the dashboard/persons stats bar.
*/ */
public record StatsDTO( public record StatsDTO(long totalPersons, long totalDocuments) {
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) long totalPersons,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) long totalDocuments,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) long totalStories) {
} }

View File

@@ -2,7 +2,6 @@ package org.raddatz.familienarchiv.dashboard;
import lombok.RequiredArgsConstructor; import lombok.RequiredArgsConstructor;
import org.raddatz.familienarchiv.document.DocumentService; import org.raddatz.familienarchiv.document.DocumentService;
import org.raddatz.familienarchiv.geschichte.GeschichteService;
import org.raddatz.familienarchiv.person.PersonService; import org.raddatz.familienarchiv.person.PersonService;
import org.raddatz.familienarchiv.dashboard.StatsDTO; import org.raddatz.familienarchiv.dashboard.StatsDTO;
import org.springframework.stereotype.Service; import org.springframework.stereotype.Service;
@@ -13,9 +12,8 @@ public class StatsService {
private final PersonService personService; private final PersonService personService;
private final DocumentService documentService; private final DocumentService documentService;
private final GeschichteService geschichteService;
public StatsDTO getStats() { public StatsDTO getStats() {
return new StatsDTO(personService.count(), documentService.count(), geschichteService.countPublished()); return new StatsDTO(personService.count(), documentService.count());
} }
} }

View File

@@ -1,17 +0,0 @@
package org.raddatz.familienarchiv.document;
/**
* Precision of a document's date. Verbatim mirror of the import normalizer's
* {@code Precision} enum (tools/import-normalizer/dates.py) — the canonical output is the
* contract, so there is no translation layer. Do not add, remove, or rename values without
* also changing the normalizer; a mismatch silently breaks import idempotency (see ADR-025).
*/
public enum DatePrecision {
DAY,
MONTH,
SEASON,
YEAR,
RANGE,
APPROX,
UNKNOWN
}

View File

@@ -1,23 +0,0 @@
package org.raddatz.familienarchiv.document;
import org.raddatz.familienarchiv.tag.TagOperator;
import java.util.List;
import java.util.UUID;
/**
* The non-date filters honoured by {@link DocumentService#getDensity(DensityFilters)}.
* Date bounds (from/to) are deliberately excluded — see the service Javadoc for why.
*
* Kept as a record so the seven values are passed as one named bundle instead of a
* positional argument list where two UUIDs (sender vs. receiver) can be swapped by
* accident at the call site.
*/
public record DensityFilters(
String text,
UUID sender,
UUID receiver,
List<String> tags,
String tagQ,
DocumentStatus status,
TagOperator tagOperator) {}

View File

@@ -2,7 +2,6 @@ package org.raddatz.familienarchiv.document;
import jakarta.persistence.*; import jakarta.persistence.*;
import lombok.*; import lombok.*;
import org.hibernate.annotations.BatchSize;
import org.hibernate.annotations.CreationTimestamp; import org.hibernate.annotations.CreationTimestamp;
import org.hibernate.annotations.UpdateTimestamp; import org.hibernate.annotations.UpdateTimestamp;
@@ -22,17 +21,6 @@ import java.util.HashSet;
import java.util.Set; import java.util.Set;
import java.util.UUID; import java.util.UUID;
@NamedEntityGraph(name = "Document.full", attributeNodes = {
@NamedAttributeNode("sender"),
@NamedAttributeNode("receivers"),
@NamedAttributeNode("tags"),
@NamedAttributeNode("trainingLabels")
})
@NamedEntityGraph(name = "Document.list", attributeNodes = {
@NamedAttributeNode("sender"),
@NamedAttributeNode("receivers"),
@NamedAttributeNode("tags")
})
@Entity @Entity
@Table(name = "documents") @Table(name = "documents")
@Data // Lombok: Generiert Getter, Setter, ToString, etc. @Data // Lombok: Generiert Getter, Setter, ToString, etc.
@@ -91,29 +79,6 @@ public class Document {
@Column(name = "meta_date") @Column(name = "meta_date")
private LocalDate documentDate; // Wann wurde der Brief geschrieben? private LocalDate documentDate; // Wann wurde der Brief geschrieben?
// Precision of documentDate — drives honest rendering ("ca. 1943", "Frühjahr 1943").
// Verbatim mirror of the normalizer's Precision enum (see ADR-025).
@Enumerated(EnumType.STRING)
@Column(name = "meta_date_precision", nullable = false, length = 16)
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
@Builder.Default
private DatePrecision metaDatePrecision = DatePrecision.UNKNOWN;
// Range end — only set when metaDatePrecision is RANGE (open-ended ranges allowed → may be null).
@Column(name = "meta_date_end")
private LocalDate metaDateEnd;
// Original date cell, verbatim, preserved for provenance and "as written" display.
@Column(name = "meta_date_raw", columnDefinition = "TEXT")
private String metaDateRaw;
// Raw attribution preserved even when a person is linked via sender/receivers.
@Column(name = "sender_text", columnDefinition = "TEXT")
private String senderText;
@Column(name = "receiver_text", columnDefinition = "TEXT")
private String receiverText;
@Column(name = "meta_location") @Column(name = "meta_location")
private String location; private String location;
@@ -153,27 +118,24 @@ public class Document {
@Builder.Default @Builder.Default
private ScriptType scriptType = ScriptType.UNKNOWN; private ScriptType scriptType = ScriptType.UNKNOWN;
@ManyToMany(fetch = FetchType.LAZY) @ManyToMany(fetch = FetchType.EAGER)
@JoinTable(name = "document_receivers", joinColumns = @JoinColumn(name = "document_id"), inverseJoinColumns = @JoinColumn(name = "person_id")) @JoinTable(name = "document_receivers", joinColumns = @JoinColumn(name = "document_id"), inverseJoinColumns = @JoinColumn(name = "person_id"))
@BatchSize(size = 50)
@Builder.Default @Builder.Default
private Set<Person> receivers = new HashSet<>(); private Set<Person> receivers = new HashSet<>();
@ManyToOne(fetch = FetchType.LAZY) @ManyToOne
@JoinColumn(name = "sender_id") @JoinColumn(name = "sender_id")
private Person sender; private Person sender;
@ManyToMany(fetch = FetchType.LAZY) @ManyToMany(fetch = FetchType.EAGER)
@JoinTable(name = "document_tags", joinColumns = @JoinColumn(name = "document_id"), inverseJoinColumns = @JoinColumn(name = "tag_id")) @JoinTable(name = "document_tags", joinColumns = @JoinColumn(name = "document_id"), inverseJoinColumns = @JoinColumn(name = "tag_id"))
@BatchSize(size = 50)
@Builder.Default @Builder.Default
private Set<Tag> tags = new HashSet<>(); private Set<Tag> tags = new HashSet<>();
@ElementCollection(fetch = FetchType.LAZY) @ElementCollection(fetch = FetchType.EAGER)
@CollectionTable(name = "document_training_labels", joinColumns = @JoinColumn(name = "document_id")) @CollectionTable(name = "document_training_labels", joinColumns = @JoinColumn(name = "document_id"))
@Column(name = "label") @Column(name = "label")
@Enumerated(EnumType.STRING) @Enumerated(EnumType.STRING)
@BatchSize(size = 50)
@Builder.Default @Builder.Default
private Set<TrainingLabel> trainingLabels = new HashSet<>(); private Set<TrainingLabel> trainingLabels = new HashSet<>();

View File

@@ -12,8 +12,6 @@ public class DocumentBatchMetadataDTO {
private UUID senderId; private UUID senderId;
private List<UUID> receiverIds; private List<UUID> receiverIds;
private LocalDate documentDate; private LocalDate documentDate;
private DatePrecision metaDatePrecision;
private LocalDate metaDateEnd;
private String location; private String location;
private List<String> tagNames; private List<String> tagNames;
private Boolean metadataComplete; private Boolean metadataComplete;

View File

@@ -3,7 +3,6 @@ package org.raddatz.familienarchiv.document;
import java.io.IOException; import java.io.IOException;
import java.time.LocalDate; import java.time.LocalDate;
import java.util.ArrayList; import java.util.ArrayList;
import java.util.concurrent.TimeUnit;
import java.util.LinkedHashSet; import java.util.LinkedHashSet;
import java.util.List; import java.util.List;
import java.util.Map; import java.util.Map;
@@ -49,7 +48,6 @@ import org.raddatz.familienarchiv.filestorage.FileService;
import org.raddatz.familienarchiv.user.UserService; import org.raddatz.familienarchiv.user.UserService;
import org.springframework.data.domain.Sort; import org.springframework.data.domain.Sort;
import org.springframework.security.core.Authentication; import org.springframework.security.core.Authentication;
import org.springframework.http.CacheControl;
import org.springframework.http.HttpHeaders; import org.springframework.http.HttpHeaders;
import org.springframework.http.MediaType; import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity; import org.springframework.http.ResponseEntity;
@@ -313,10 +311,9 @@ public class DocumentController {
@RequestParam(required = false) String tagQ, @RequestParam(required = false) String tagQ,
@RequestParam(required = false) DocumentStatus status, @RequestParam(required = false) DocumentStatus status,
@RequestParam(required = false) String tagOp, @RequestParam(required = false) String tagOp,
@RequestParam(required = false) Boolean undated,
Authentication authentication) { Authentication authentication) {
TagOperator operator = "OR".equalsIgnoreCase(tagOp) ? TagOperator.OR : TagOperator.AND; TagOperator operator = "OR".equalsIgnoreCase(tagOp) ? TagOperator.OR : TagOperator.AND;
List<UUID> ids = documentService.findIdsForFilter(q, from, to, senderId, receiverId, tags, tagQ, status, operator, Boolean.TRUE.equals(undated)); List<UUID> ids = documentService.findIdsForFilter(q, from, to, senderId, receiverId, tags, tagQ, status, operator);
if (ids.size() > BULK_EDIT_FILTER_MAX_IDS) { if (ids.size() > BULK_EDIT_FILTER_MAX_IDS) {
throw DomainException.badRequest(ErrorCode.BULK_EDIT_TOO_MANY_IDS, throw DomainException.badRequest(ErrorCode.BULK_EDIT_TOO_MANY_IDS,
"Filter matches " + ids.size() + " documents — refine filter (max " + BULK_EDIT_FILTER_MAX_IDS + ")"); "Filter matches " + ids.size() + " documents — refine filter (max " + BULK_EDIT_FILTER_MAX_IDS + ")");
@@ -376,7 +373,6 @@ public class DocumentController {
@Parameter(description = "Sort field") @RequestParam(required = false) DocumentSort sort, @Parameter(description = "Sort field") @RequestParam(required = false) DocumentSort sort,
@Parameter(description = "Sort direction: ASC or DESC") @RequestParam(required = false, defaultValue = "DESC") String dir, @Parameter(description = "Sort direction: ASC or DESC") @RequestParam(required = false, defaultValue = "DESC") String dir,
@Parameter(description = "Tag operator: AND (default) or OR") @RequestParam(required = false) String tagOp, @Parameter(description = "Tag operator: AND (default) or OR") @RequestParam(required = false) String tagOp,
@Parameter(description = "Restrict to undated documents (meta_date IS NULL)") @RequestParam(required = false) Boolean undated,
// @Max on page guards against overflow when pageable.getOffset() is computed // @Max on page guards against overflow when pageable.getOffset() is computed
// as page * size — Integer.MAX_VALUE * 50 would wrap to a negative long, which // as page * size — Integer.MAX_VALUE * 50 would wrap to a negative long, which
// Hibernate cheerfully turns into an invalid SQL OFFSET. // Hibernate cheerfully turns into an invalid SQL OFFSET.
@@ -389,24 +385,7 @@ public class DocumentController {
// defaults to AND, which matches the frontend default and keeps old clients working. // defaults to AND, which matches the frontend default and keeps old clients working.
TagOperator operator = "OR".equalsIgnoreCase(tagOp) ? TagOperator.OR : TagOperator.AND; TagOperator operator = "OR".equalsIgnoreCase(tagOp) ? TagOperator.OR : TagOperator.AND;
Pageable pageable = PageRequest.of(page, size); Pageable pageable = PageRequest.of(page, size);
return ResponseEntity.ok(documentService.searchDocuments(q, from, to, senderId, receiverId, tags, tagQ, status, sort, dir, operator, Boolean.TRUE.equals(undated), pageable)); return ResponseEntity.ok(documentService.searchDocuments(q, from, to, senderId, receiverId, tags, tagQ, status, sort, dir, operator, pageable));
}
@GetMapping(value = "/density", produces = MediaType.APPLICATION_JSON_VALUE)
public ResponseEntity<DocumentDensityResult> density(
@RequestParam(required = false) String q,
@RequestParam(required = false) UUID senderId,
@RequestParam(required = false) UUID receiverId,
@RequestParam(required = false, name = "tag") List<String> tags,
@RequestParam(required = false) String tagQ,
@Parameter(description = "Filter by document status") @RequestParam(required = false) DocumentStatus status,
@Parameter(description = "Tag operator: AND (default) or OR") @RequestParam(required = false) String tagOp) {
TagOperator operator = "OR".equalsIgnoreCase(tagOp) ? TagOperator.OR : TagOperator.AND;
DocumentDensityResult result = documentService.getDensity(
new DensityFilters(q, senderId, receiverId, tags, tagQ, status, operator));
return ResponseEntity.ok()
.cacheControl(CacheControl.maxAge(5, TimeUnit.MINUTES).cachePrivate())
.body(result);
} }
// --- TRAINING LABELS --- // --- TRAINING LABELS ---

View File

@@ -1,27 +0,0 @@
package org.raddatz.familienarchiv.document;
import io.swagger.v3.oas.annotations.media.Schema;
import java.time.LocalDate;
import java.util.List;
/**
* Result of the timeline density aggregation.
*
* <p>{@code minDate} / {@code maxDate} are intentionally not marked
* {@code @Schema(requiredMode = REQUIRED)} — the empty-result case (no
* documents match the filter) returns them as {@code null}, which surfaces in
* the generated TypeScript as {@code minDate?: string | null}. Frontend code
* must treat them as optional.
*/
public record DocumentDensityResult(
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
List<MonthBucket> buckets,
LocalDate minDate,
LocalDate maxDate
) {
/** The "no documents match the filter" result, with no buckets and null date bounds. */
public static DocumentDensityResult empty() {
return new DocumentDensityResult(List.of(), null, null);
}
}

View File

@@ -1,39 +0,0 @@
package org.raddatz.familienarchiv.document;
import io.swagger.v3.oas.annotations.media.Schema;
import org.raddatz.familienarchiv.audit.ActivityActorDTO;
import org.raddatz.familienarchiv.person.Person;
import org.raddatz.familienarchiv.tag.Tag;
import java.time.LocalDate;
import java.util.List;
import java.util.UUID;
public record DocumentListItem(
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
UUID id,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
String title,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
String originalFilename,
String thumbnailUrl,
LocalDate documentDate,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
DatePrecision metaDatePrecision,
LocalDate metaDateEnd,
Person sender,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
List<Person> receivers,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
List<Tag> tags,
String archiveBox,
String archiveFolder,
String location,
String summary,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
int completionPercentage,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
List<ActivityActorDTO> contributors,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
SearchMatchData matchData
) {}

View File

@@ -7,8 +7,6 @@ import org.raddatz.familienarchiv.document.DocumentStatus;
import org.springframework.data.domain.Page; import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable; import org.springframework.data.domain.Pageable;
import org.springframework.data.domain.Sort; import org.springframework.data.domain.Sort;
import org.springframework.data.jpa.domain.Specification;
import org.springframework.data.jpa.repository.EntityGraph;
import org.springframework.data.jpa.repository.JpaRepository; import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.JpaSpecificationExecutor; import org.springframework.data.jpa.repository.JpaSpecificationExecutor;
import org.springframework.data.jpa.repository.Query; import org.springframework.data.jpa.repository.Query;
@@ -25,18 +23,6 @@ import java.util.UUID;
@Repository @Repository
public interface DocumentRepository extends JpaRepository<Document, UUID>, JpaSpecificationExecutor<Document> { public interface DocumentRepository extends JpaRepository<Document, UUID>, JpaSpecificationExecutor<Document> {
@EntityGraph("Document.full")
Optional<Document> findById(UUID id);
@EntityGraph("Document.list")
Page<Document> findAll(Specification<Document> spec, Pageable pageable);
@EntityGraph("Document.list")
List<Document> findAll(Specification<Document> spec);
@EntityGraph("Document.list")
Page<Document> findAll(Pageable pageable);
// Findet ein Dokument anhand des ursprünglichen Dateinamens // Findet ein Dokument anhand des ursprünglichen Dateinamens
// Wichtig für den Abgleich beim Excel-Import & Datei-Upload // Wichtig für den Abgleich beim Excel-Import & Datei-Upload
Optional<Document> findByOriginalFilename(String originalFilename); Optional<Document> findByOriginalFilename(String originalFilename);
@@ -44,21 +30,17 @@ public interface DocumentRepository extends JpaRepository<Document, UUID>, JpaSp
// Wie oben, gibt aber nur das erste Ergebnis zurück — sicher wenn doppelte Dateinamen existieren // Wie oben, gibt aber nur das erste Ergebnis zurück — sicher wenn doppelte Dateinamen existieren
Optional<Document> findFirstByOriginalFilename(String originalFilename); Optional<Document> findFirstByOriginalFilename(String originalFilename);
// Callers access only status/id scalar fields — no graph needed. // Findet alle Dokumente mit einem bestimmten Status
// z.B. um alle offenen "PLACEHOLDER" zu finden
List<Document> findByStatus(DocumentStatus status); List<Document> findByStatus(DocumentStatus status);
// Prüft effizient, ob ein Dateiname schon existiert (gibt true/false zurück) // Prüft effizient, ob ein Dateiname schon existiert (gibt true/false zurück)
boolean existsByOriginalFilename(String originalFilename); boolean existsByOriginalFilename(String originalFilename);
// lazy @BatchSize(50) fallback active; see ADR-022
@EntityGraph("Document.full")
List<Document> findBySenderId(UUID senderId); List<Document> findBySenderId(UUID senderId);
// lazy @BatchSize(50) fallback active; see ADR-022
@EntityGraph("Document.full")
List<Document> findByReceiversId(UUID receiverId); List<Document> findByReceiversId(UUID receiverId);
// Callers access only doc.getTags() to mutate the set — receivers/sender not touched; no graph needed.
List<Document> findByTags_Id(UUID tagId); List<Document> findByTags_Id(UUID tagId);
@Query("SELECT d FROM Document d WHERE d.id NOT IN (SELECT DISTINCT dv.documentId FROM DocumentVersion dv)") @Query("SELECT d FROM Document d WHERE d.id NOT IN (SELECT DISTINCT dv.documentId FROM DocumentVersion dv)")
@@ -73,15 +55,12 @@ public interface DocumentRepository extends JpaRepository<Document, UUID>, JpaSp
long countByMetadataCompleteFalse(); long countByMetadataCompleteFalse();
// No production callers — only used if a future export path iterates the full list; no graph needed.
List<Document> findByMetadataCompleteFalse(Sort sort); List<Document> findByMetadataCompleteFalse(Sort sort);
// Callers map to IncompleteDocumentDTO using only scalar fields (id, title, createdAt) — no graph needed.
Page<Document> findByMetadataCompleteFalse(Pageable pageable); Page<Document> findByMetadataCompleteFalse(Pageable pageable);
Optional<Document> findFirstByMetadataCompleteFalseAndIdNot(UUID id, Sort sort); Optional<Document> findFirstByMetadataCompleteFalseAndIdNot(UUID id, Sort sort);
@EntityGraph("Document.full")
@Query("SELECT DISTINCT d FROM Document d " + @Query("SELECT DISTINCT d FROM Document d " +
"JOIN d.receivers r " + "JOIN d.receivers r " +
"WHERE " + "WHERE " +
@@ -96,7 +75,6 @@ public interface DocumentRepository extends JpaRepository<Document, UUID>, JpaSp
@Param("to") LocalDate to, @Param("to") LocalDate to,
Sort sort); Sort sort);
@EntityGraph("Document.full")
@Query("SELECT DISTINCT d FROM Document d " + @Query("SELECT DISTINCT d FROM Document d " +
"LEFT JOIN d.receivers r " + "LEFT JOIN d.receivers r " +
"WHERE (d.sender.id = :personId OR r.id = :personId) " + "WHERE (d.sender.id = :personId OR r.id = :personId) " +
@@ -122,45 +100,7 @@ public interface DocumentRepository extends JpaRepository<Document, UUID>, JpaSp
ORDER BY ts_rank(d.search_vector, q.pq) DESC, ORDER BY ts_rank(d.search_vector, q.pq) DESC,
d.meta_date DESC NULLS LAST d.meta_date DESC NULLS LAST
""") """)
// Unpaged path — for bulk-edit "select all" and density chart List<UUID> findRankedIdsByFts(@Param("query") String query);
List<UUID> findAllMatchingIdsByFts(@Param("query") String query);
/**
* Returns one page of FTS-ranked document IDs with the total match count.
*
* <p>Each row contains (in column order):
* <ol>
* <li>UUID — document id</li>
* <li>double — ts_rank score</li>
* <li>long — COUNT(*) OVER () — full match count, not page count</li>
* </ol>
*
* <p>Returns an empty list when the query matches no documents (including
* stopword-only queries where websearch_to_tsquery returns an empty tsquery).
* Use findAllMatchingIdsByFts for the unpaged bulk-edit path.
*/
@Query(nativeQuery = true, value = """
WITH q AS (
SELECT CASE WHEN websearch_to_tsquery('german', :query)::text <> ''
THEN to_tsquery('simple', regexp_replace(
websearch_to_tsquery('german', :query)::text,
'''([^'']+)''',
'''\\1'':*',
'g'))
END AS pq
), matches AS (
SELECT d.id, ts_rank(d.search_vector, q.pq) AS rank
FROM documents d, q
WHERE d.search_vector @@ q.pq
)
SELECT id, rank, COUNT(*) OVER () AS total
FROM matches
ORDER BY rank DESC, id
OFFSET :offset LIMIT :limit
""")
List<Object[]> findFtsPageRaw(@Param("query") String query,
@Param("offset") int offset,
@Param("limit") int limit);
/** /**
* Returns match-enrichment data for a set of documents identified by their IDs. * Returns match-enrichment data for a set of documents identified by their IDs.

View File

@@ -0,0 +1,18 @@
package org.raddatz.familienarchiv.document;
import io.swagger.v3.oas.annotations.media.Schema;
import org.raddatz.familienarchiv.audit.ActivityActorDTO;
import org.raddatz.familienarchiv.document.Document;
import java.util.List;
public record DocumentSearchItem(
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
Document document,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
SearchMatchData matchData,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
int completionPercentage,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
List<ActivityActorDTO> contributors
) {}

View File

@@ -7,7 +7,7 @@ import java.util.List;
public record DocumentSearchResult( public record DocumentSearchResult(
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) @Schema(requiredMode = Schema.RequiredMode.REQUIRED)
List<DocumentListItem> items, List<DocumentSearchItem> items,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) @Schema(requiredMode = Schema.RequiredMode.REQUIRED)
long totalElements, long totalElements,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) @Schema(requiredMode = Schema.RequiredMode.REQUIRED)
@@ -15,45 +15,24 @@ public record DocumentSearchResult(
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) @Schema(requiredMode = Schema.RequiredMode.REQUIRED)
int pageSize, int pageSize,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) @Schema(requiredMode = Schema.RequiredMode.REQUIRED)
int totalPages, int totalPages
/**
* Total number of undated documents (meta_date IS NULL) matching the current
* filter context (q/tags/sender/receiver/status) across ALL pages — not the
* undated rows on the current page. Computed independently of the "Nur
* undatierte" toggle so it never collapses to the page slice (issue #668).
*/
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
long undatedCount
) { ) {
/** /**
* Single-page convenience factory used by empty-result shortcuts and by tests that * Single-page convenience factory used by empty-result shortcuts and by tests that
* don't care about paging. Treats the whole list as page 0 of itself. The undated * don't care about paging. Treats the whole list as page 0 of itself.
* count defaults to 0 — the service overlays the real global count via
* {@link #withUndatedCount(long)} before returning.
*/ */
public static DocumentSearchResult of(List<DocumentListItem> items) { public static DocumentSearchResult of(List<DocumentSearchItem> items) {
int size = items.size(); int size = items.size();
return new DocumentSearchResult(items, size, 0, size, size == 0 ? 0 : 1, 0L); return new DocumentSearchResult(items, size, 0, size, size == 0 ? 0 : 1);
} }
/** /**
* Paged factory used by the service when it has a real Pageable + full match count * Paged factory used by the service when it has a real Pageable + full match count
* (e.g. from Spring's Page&lt;T&gt; or from an in-memory sort-then-slice). The undated * (e.g. from Spring's Page<T> or from an in-memory sort-then-slice).
* count defaults to 0 — the service overlays the real global count via
* {@link #withUndatedCount(long)} before returning.
*/ */
public static DocumentSearchResult paged(List<DocumentListItem> slice, Pageable pageable, long totalElements) { public static DocumentSearchResult paged(List<DocumentSearchItem> slice, Pageable pageable, long totalElements) {
int pageSize = pageable.getPageSize(); int pageSize = pageable.getPageSize();
int totalPages = pageSize == 0 ? 0 : (int) ((totalElements + pageSize - 1) / pageSize); int totalPages = pageSize == 0 ? 0 : (int) ((totalElements + pageSize - 1) / pageSize);
return new DocumentSearchResult(slice, totalElements, pageable.getPageNumber(), pageSize, totalPages, 0L); return new DocumentSearchResult(slice, totalElements, pageable.getPageNumber(), pageSize, totalPages);
}
/**
* Returns a copy with the global undated count overlaid, leaving every other
* field untouched. Lets the service compute the count once and attach it to
* whichever result shape the search path produced.
*/
public DocumentSearchResult withUndatedCount(long undatedCount) {
return new DocumentSearchResult(items, totalElements, pageNumber, pageSize, totalPages, undatedCount);
} }
} }

View File

@@ -10,6 +10,7 @@ import org.raddatz.familienarchiv.audit.AuditService;
import org.raddatz.familienarchiv.document.DocumentBatchMetadataDTO; import org.raddatz.familienarchiv.document.DocumentBatchMetadataDTO;
import org.raddatz.familienarchiv.document.DocumentBatchSummary; import org.raddatz.familienarchiv.document.DocumentBatchSummary;
import org.raddatz.familienarchiv.document.DocumentBulkEditDTO; import org.raddatz.familienarchiv.document.DocumentBulkEditDTO;
import org.raddatz.familienarchiv.document.DocumentSearchItem;
import org.raddatz.familienarchiv.document.DocumentSearchResult; import org.raddatz.familienarchiv.document.DocumentSearchResult;
import org.raddatz.familienarchiv.document.DocumentSort; import org.raddatz.familienarchiv.document.DocumentSort;
import org.raddatz.familienarchiv.document.DocumentUpdateDTO; import org.raddatz.familienarchiv.document.DocumentUpdateDTO;
@@ -47,7 +48,6 @@ import java.io.IOException;
import java.security.MessageDigest; import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException; import java.security.NoSuchAlgorithmException;
import java.time.LocalDate; import java.time.LocalDate;
import java.time.YearMonth;
import java.util.ArrayList; import java.util.ArrayList;
import java.util.Arrays; import java.util.Arrays;
import java.util.Collection; import java.util.Collection;
@@ -125,74 +125,6 @@ public class DocumentService {
return titles; return titles;
} }
/**
* Per-month document counts for the timeline density widget (issue #385).
*
* <p>Filter-reactive: the chart recomputes when other filters (sender,
* receiver, tag, q, status) change so it always matches the list it sits
* above. Date bounds (`from`/`to`) are deliberately omitted — the chart is
* the surface for picking those, so it must always span the broader space
* the user is selecting within.
*
* <p>Implementation note: groups in memory rather than via SQL GROUP BY
* because the existing {@link Specification} predicates compose easily
* with {@code findAll(spec)} and the archive size (≈5k docs) keeps this
* well under the 200ms p95 target. Cache-Control: max-age=300 on the
* controller layer absorbs repeated browse loads.
*
* <p>Tracked in issue #481 for re-evaluation when {@code documents > 50k}
* — at that scale move the aggregation into SQL (GROUP BY TO_CHAR(meta_date,
* 'YYYY-MM')) and accept that the criteria/specification surface needs a
* parallel native-query path.
*/
public DocumentDensityResult getDensity(DensityFilters filters) {
List<UUID> ftsIds = resolveFtsIds(filters.text());
if (ftsIds != null && ftsIds.isEmpty()) {
return DocumentDensityResult.empty();
}
List<LocalDate> dates = loadFilteredDates(filters, ftsIds);
return aggregateByMonth(dates);
}
/**
* Returns the FTS-ranked document IDs when {@code text} is non-blank, or {@code null}
* when no full-text query is active. An empty list means the FTS query ran but
* matched zero documents — the caller short-circuits on that signal.
*/
private List<UUID> resolveFtsIds(String text) {
if (!StringUtils.hasText(text)) return null;
return documentRepository.findAllMatchingIdsByFts(text);
}
/** Loads matching documents and projects to non-null {@link LocalDate}s. */
private List<LocalDate> loadFilteredDates(DensityFilters filters, List<UUID> ftsIds) {
boolean hasFts = ftsIds != null;
Specification<Document> spec = buildSearchSpec(
hasFts, ftsIds, null, null,
filters.sender(), filters.receiver(),
filters.tags(), filters.tagQ(),
filters.status(), filters.tagOperator(), false);
return documentRepository.findAll(spec).stream()
.map(Document::getDocumentDate)
.filter(Objects::nonNull)
.toList();
}
/** Buckets {@code dates} into one {@link MonthBucket} per YYYY-MM and computes min/max. */
private DocumentDensityResult aggregateByMonth(List<LocalDate> dates) {
if (dates.isEmpty()) return DocumentDensityResult.empty();
Map<String, Integer> counts = new java.util.TreeMap<>();
for (LocalDate d : dates) {
counts.merge(YearMonth.from(d).toString(), 1, Integer::sum);
}
List<MonthBucket> buckets = counts.entrySet().stream()
.map(e -> new MonthBucket(e.getKey(), e.getValue()))
.toList();
LocalDate minDate = dates.stream().min(LocalDate::compareTo).orElse(null);
LocalDate maxDate = dates.stream().max(LocalDate::compareTo).orElse(null);
return new DocumentDensityResult(buckets, minDate, maxDate);
}
/** /**
* Lädt eine Datei hoch. * Lädt eine Datei hoch.
* - Prüft, ob ein Eintrag (aus Excel) schon existiert. * - Prüft, ob ein Eintrag (aus Excel) schon existiert.
@@ -378,7 +310,6 @@ public class DocumentService {
// 1. Einfache Felder Update // 1. Einfache Felder Update
doc.setTitle(dto.getTitle()); doc.setTitle(dto.getTitle());
doc.setDocumentDate(dto.getDocumentDate()); doc.setDocumentDate(dto.getDocumentDate());
applyDatePrecision(doc, dto);
doc.setLocation(dto.getLocation()); doc.setLocation(dto.getLocation());
doc.setTranscription(dto.getTranscription()); doc.setTranscription(dto.getTranscription());
doc.setSummary(dto.getSummary()); doc.setSummary(dto.getSummary());
@@ -447,26 +378,6 @@ public class DocumentService {
return saved; return saved;
} }
/**
* Applies the three date-precision fields only when the DTO carries them.
* A null field means "not submitted" — overwriting the stored value with null
* would fabricate a precision the user never chose, the exact dishonesty #666
* exists to prevent. A row with a genuinely-unknown precision must keep it when
* an unrelated edit (e.g. a location typo) is saved.
*/
private void applyDatePrecision(Document doc, DocumentUpdateDTO dto) {
if (dto.getMetaDatePrecision() != null) {
doc.setMetaDatePrecision(dto.getMetaDatePrecision());
}
if (dto.getMetaDateEnd() != null) {
doc.setMetaDateEnd(dto.getMetaDateEnd());
}
if (dto.getMetaDateRaw() != null) {
doc.setMetaDateRaw(dto.getMetaDateRaw());
}
}
@Transactional
public Document updateDocumentTags(UUID docId, List<String> tagNames) { public Document updateDocumentTags(UUID docId, List<String> tagNames) {
Document doc = documentRepository.findById(docId) Document doc = documentRepository.findById(docId)
.orElseThrow(() -> DomainException.notFound(ErrorCode.DOCUMENT_NOT_FOUND, "Document not found: " + docId)); .orElseThrow(() -> DomainException.notFound(ErrorCode.DOCUMENT_NOT_FOUND, "Document not found: " + docId));
@@ -501,17 +412,16 @@ public class DocumentService {
*/ */
@Transactional(readOnly = true) @Transactional(readOnly = true)
public List<UUID> findIdsForFilter(String text, LocalDate from, LocalDate to, UUID sender, UUID receiver, public List<UUID> findIdsForFilter(String text, LocalDate from, LocalDate to, UUID sender, UUID receiver,
List<String> tags, String tagQ, DocumentStatus status, TagOperator tagOperator, List<String> tags, String tagQ, DocumentStatus status, TagOperator tagOperator) {
boolean undated) {
boolean hasText = StringUtils.hasText(text); boolean hasText = StringUtils.hasText(text);
List<UUID> rankedIds = null; List<UUID> rankedIds = null;
if (hasText) { if (hasText) {
rankedIds = documentRepository.findAllMatchingIdsByFts(text); rankedIds = documentRepository.findRankedIdsByFts(text);
if (rankedIds.isEmpty()) return List.of(); if (rankedIds.isEmpty()) return List.of();
} }
Specification<Document> spec = buildSearchSpec( Specification<Document> spec = buildSearchSpec(
hasText, rankedIds, from, to, sender, receiver, tags, tagQ, status, tagOperator, undated); hasText, rankedIds, from, to, sender, receiver, tags, tagQ, status, tagOperator);
return documentRepository.findAll(spec).stream().map(Document::getId).toList(); return documentRepository.findAll(spec).stream().map(Document::getId).toList();
} }
@@ -525,8 +435,7 @@ public class DocumentService {
LocalDate from, LocalDate to, LocalDate from, LocalDate to,
UUID sender, UUID receiver, UUID sender, UUID receiver,
List<String> tags, String tagQ, List<String> tags, String tagQ,
DocumentStatus status, TagOperator tagOperator, DocumentStatus status, TagOperator tagOperator) {
boolean undated) {
boolean useOrLogic = tagOperator == TagOperator.OR; boolean useOrLogic = tagOperator == TagOperator.OR;
List<Set<UUID>> expandedTagSets = tagService.expandTagNamesToDescendantIdSets(tags); List<Set<UUID>> expandedTagSets = tagService.expandTagNamesToDescendantIdSets(tags);
Specification<Document> textSpec = hasText ? hasIds(ftsIds) : (root, query, cb) -> null; Specification<Document> textSpec = hasText ? hasIds(ftsIds) : (root, query, cb) -> null;
@@ -536,8 +445,7 @@ public class DocumentService {
.and(hasReceiver(receiver)) .and(hasReceiver(receiver))
.and(hasTags(expandedTagSets, useOrLogic)) .and(hasTags(expandedTagSets, useOrLogic))
.and(hasTagPartial(tagQ)) .and(hasTagPartial(tagQ))
.and(hasStatus(status)) .and(hasStatus(status));
.and(undatedOnly(undated));
} }
/** /**
@@ -658,7 +566,7 @@ public class DocumentService {
return saved; return saved;
} }
@Transactional(readOnly = true) // 0. Zuletzt aktive Dokumente (sortiert nach updatedAt DESC)
public List<Document> getRecentActivity(int size) { public List<Document> getRecentActivity(int size) {
return documentRepository.findAll( return documentRepository.findAll(
PageRequest.of(0, size, Sort.by(Sort.Direction.DESC, "updatedAt")) PageRequest.of(0, size, Sort.by(Sort.Direction.DESC, "updatedAt"))
@@ -666,85 +574,41 @@ public class DocumentService {
} }
// 1. Allgemeine Suche (für das Suchfeld im Frontend) // 1. Allgemeine Suche (für das Suchfeld im Frontend)
public DocumentSearchResult searchDocuments(String text, LocalDate from, LocalDate to, UUID sender, UUID receiver, List<String> tags, String tagQ, DocumentStatus status, DocumentSort sort, String dir, TagOperator tagOperator, boolean undated, Pageable pageable) { public DocumentSearchResult searchDocuments(String text, LocalDate from, LocalDate to, UUID sender, UUID receiver, List<String> tags, String tagQ, DocumentStatus status, DocumentSort sort, String dir, TagOperator tagOperator, Pageable pageable) {
boolean hasText = StringUtils.hasText(text); boolean hasText = StringUtils.hasText(text);
// Pure-text RELEVANCE: push pagination + ts_rank ordering into SQL — skip
// findAllMatchingIdsByFts entirely (ADR-008). This must run BEFORE any
// findAllMatchingIdsByFts call so the fast path is preserved. An active undated
// filter must NOT take this path: it bypasses buildSearchSpec, so the
// undatedOnly predicate would be silently dropped. By definition this path has
// no date/sender/receiver/tag/status filters, and undated documents are valid
// FTS hits already folded into the ranked page, so there is no separate undated
// count to report here.
if (!undated && isPureTextRelevance(hasText, sort, from, to, sender, receiver, tags, tagQ, status)) {
return relevanceSortedPageFromSql(text, pageable);
}
List<UUID> rankedIds = null; List<UUID> rankedIds = null;
if (hasText) { if (hasText) {
rankedIds = documentRepository.findAllMatchingIdsByFts(text); rankedIds = documentRepository.findRankedIdsByFts(text);
// FTS matched nothing → no results and, by definition, no undated matches either.
if (rankedIds.isEmpty()) return DocumentSearchResult.of(List.of()); if (rankedIds.isEmpty()) return DocumentSearchResult.of(List.of());
} }
// Global undated count for the current filter (q/tags/sender/receiver/status),
// forcing undatedOnly(true) and IGNORING the user's "Nur undatierte" toggle so
// it never collapses to the page slice and never double-counts (issue #668).
long undatedCount = countUndatedForFilter(hasText, rankedIds, from, to, sender, receiver, tags, tagQ, status, tagOperator);
return runSearch(text, hasText, rankedIds, from, to, sender, receiver, tags, tagQ, status, sort, dir, tagOperator, undated, pageable)
.withUndatedCount(undatedCount);
}
/**
* Counts every undated document (meta_date IS NULL) matching the active filter,
* across all pages, independent of the undated toggle. Reuses {@link #buildSearchSpec}
* with {@code undated=true} forced so the count tracks q/tags/sender/receiver/status.
* A {@code from}/{@code to} range excludes undated rows by the collision rule (#668),
* so the count is legitimately 0 inside a date range.
*/
private long countUndatedForFilter(boolean hasText, List<UUID> ftsIds,
LocalDate from, LocalDate to, UUID sender, UUID receiver,
List<String> tags, String tagQ, DocumentStatus status, TagOperator tagOperator) {
Specification<Document> undatedSpec = buildSearchSpec(
hasText, ftsIds, from, to, sender, receiver, tags, tagQ, status, tagOperator, true);
return documentRepository.count(undatedSpec);
}
/** The original search dispatch — produces the page slice + totals, sans undated count. */
private DocumentSearchResult runSearch(String text, boolean hasText, List<UUID> rankedIds,
LocalDate from, LocalDate to, UUID sender, UUID receiver,
List<String> tags, String tagQ, DocumentStatus status,
DocumentSort sort, String dir, TagOperator tagOperator,
boolean undated, Pageable pageable) {
// The pure-text RELEVANCE fast path is handled by the caller (searchDocuments)
// before findAllMatchingIdsByFts runs, so it never reaches here (ADR-008).
Specification<Document> spec = buildSearchSpec( Specification<Document> spec = buildSearchSpec(
hasText, rankedIds, from, to, sender, receiver, tags, tagQ, status, tagOperator, undated); hasText, rankedIds, from, to, sender, receiver, tags, tagQ, status, tagOperator);
// SENDER and RECEIVER sorts load the full match set and slice in-memory. // SENDER, RECEIVER and RELEVANCE sorts load the full match set and slice in memory.
// JPA's Sort.by("sender.lastName") generates an INNER JOIN that silently drops // JPA's Sort.by("sender.lastName") generates an INNER JOIN that silently drops
// documents with null sender/receivers. Cost scales with match count — // documents with null sender/receivers; RELEVANCE maps a DB order to an external
// acceptable while documents stays under ~10k rows. (ADR-008) // rank list. Cost scales linearly with match count — acceptable while documents
// stays under ~10k rows. Past that, replace with SQL-level LEFT JOIN sort.
if (sort == DocumentSort.RECEIVER) { if (sort == DocumentSort.RECEIVER) {
// In-memory sort on page slice (≤ page size rows) — acceptable
List<Document> sorted = sortByFirstReceiver(documentRepository.findAll(spec), dir); List<Document> sorted = sortByFirstReceiver(documentRepository.findAll(spec), dir);
return buildResultPaged(pageSlice(sorted, pageable), text, pageable, sorted.size()); return buildResultPaged(pageSlice(sorted, pageable), text, pageable, sorted.size());
} }
if (sort == DocumentSort.SENDER) { if (sort == DocumentSort.SENDER) {
// In-memory sort on page slice (≤ page size rows) — acceptable
List<Document> sorted = sortBySender(documentRepository.findAll(spec), dir); List<Document> sorted = sortBySender(documentRepository.findAll(spec), dir);
return buildResultPaged(pageSlice(sorted, pageable), text, pageable, sorted.size()); return buildResultPaged(pageSlice(sorted, pageable), text, pageable, sorted.size());
} }
// RELEVANCE with active filters: load filtered subset and sort in-memory by rank. // RELEVANCE: default when text present and no explicit sort given
boolean useRankOrder = hasText && (sort == null || sort == DocumentSort.RELEVANCE); boolean useRankOrder = hasText && (sort == null || sort == DocumentSort.RELEVANCE);
if (useRankOrder) { if (useRankOrder) {
List<Document> results = documentRepository.findAll(spec);
Map<UUID, Integer> rankMap = new HashMap<>(); Map<UUID, Integer> rankMap = new HashMap<>();
for (int i = 0; i < rankedIds.size(); i++) rankMap.put(rankedIds.get(i), i); for (int i = 0; i < rankedIds.size(); i++) rankMap.put(rankedIds.get(i), i);
List<Document> sorted = documentRepository.findAll(spec).stream() List<Document> sorted = results.stream()
.sorted(Comparator.comparingInt(doc -> rankMap.getOrDefault(doc.getId(), Integer.MAX_VALUE))) .sorted(Comparator.comparingInt(
doc -> rankMap.getOrDefault(doc.getId(), Integer.MAX_VALUE)))
.toList(); .toList();
return buildResultPaged(pageSlice(sorted, pageable), text, pageable, sorted.size()); return buildResultPaged(pageSlice(sorted, pageable), text, pageable, sorted.size());
} }
@@ -755,39 +619,6 @@ public class DocumentService {
return buildResultPaged(page.getContent(), text, pageable, page.getTotalElements()); return buildResultPaged(page.getContent(), text, pageable, page.getTotalElements());
} }
private static boolean isPureTextRelevance(boolean hasText, DocumentSort sort,
LocalDate from, LocalDate to, UUID sender, UUID receiver,
List<String> tags, String tagQ, DocumentStatus status) {
return hasText && (sort == null || sort == DocumentSort.RELEVANCE)
&& from == null && to == null && sender == null && receiver == null
&& (tags == null || tags.isEmpty()) && (tagQ == null || tagQ.isBlank()) && status == null;
}
/**
* Pure-text RELEVANCE path — pagination and ts_rank ordering pushed into SQL.
* Called when no non-text filters are active (ADR-008).
*/
private DocumentSearchResult relevanceSortedPageFromSql(String text, Pageable pageable) {
long rawOffset = pageable.getOffset();
if (rawOffset > Integer.MAX_VALUE) return DocumentSearchResult.of(List.of());
int offset = (int) rawOffset;
int limit = pageable.getPageSize();
FtsPage ftsPage = toFtsPage(documentRepository.findFtsPageRaw(text, offset, limit));
if (ftsPage.hits().isEmpty()) return DocumentSearchResult.of(List.of());
// Preserve ts_rank order from SQL across the JPA findAllById call.
Map<UUID, Integer> rankMap = new HashMap<>();
List<UUID> pageIds = new ArrayList<>();
for (int i = 0; i < ftsPage.hits().size(); i++) {
rankMap.put(ftsPage.hits().get(i).id(), i);
pageIds.add(ftsPage.hits().get(i).id());
}
List<Document> docs = documentRepository.findAllById(pageIds).stream()
.sorted(Comparator.comparingInt(d -> rankMap.getOrDefault(d.getId(), Integer.MAX_VALUE)))
.toList();
return buildResultPaged(docs, text, pageable, ftsPage.total());
}
private static <T> List<T> pageSlice(List<T> sorted, Pageable pageable) { private static <T> List<T> pageSlice(List<T> sorted, Pageable pageable) {
int from = Math.min((int) pageable.getOffset(), sorted.size()); int from = Math.min((int) pageable.getOffset(), sorted.size());
int to = Math.min(from + pageable.getPageSize(), sorted.size()); int to = Math.min(from + pageable.getPageSize(), sorted.size());
@@ -798,7 +629,7 @@ public class DocumentService {
return DocumentSearchResult.paged(enrichItems(slice, text), pageable, totalElements); return DocumentSearchResult.paged(enrichItems(slice, text), pageable, totalElements);
} }
private List<DocumentListItem> enrichItems(List<Document> documents, String text) { private List<DocumentSearchItem> enrichItems(List<Document> documents, String text) {
List<Document> colorResolved = resolveDocumentTagColors(documents); List<Document> colorResolved = resolveDocumentTagColors(documents);
Map<UUID, SearchMatchData> matchData = enrichWithMatchData(colorResolved, text); Map<UUID, SearchMatchData> matchData = enrichWithMatchData(colorResolved, text);
@@ -806,7 +637,7 @@ public class DocumentService {
Map<UUID, Integer> completionByDoc = fetchCompletionPercentages(docIds); Map<UUID, Integer> completionByDoc = fetchCompletionPercentages(docIds);
Map<UUID, List<ActivityActorDTO>> contributorsByDoc = auditLogQueryService.findRecentContributorsPerDocument(docIds); Map<UUID, List<ActivityActorDTO>> contributorsByDoc = auditLogQueryService.findRecentContributorsPerDocument(docIds);
return colorResolved.stream().map(doc -> toListItem( return colorResolved.stream().map(doc -> new DocumentSearchItem(
doc, doc,
matchData.getOrDefault(doc.getId(), SearchMatchData.empty()), matchData.getOrDefault(doc.getId(), SearchMatchData.empty()),
completionByDoc.getOrDefault(doc.getId(), 0), completionByDoc.getOrDefault(doc.getId(), 0),
@@ -814,28 +645,6 @@ public class DocumentService {
)).toList(); )).toList();
} }
private DocumentListItem toListItem(Document doc, SearchMatchData match, int completionPct, List<ActivityActorDTO> contributors) {
return new DocumentListItem(
doc.getId(),
doc.getTitle(),
doc.getOriginalFilename(),
doc.getThumbnailUrl(),
doc.getDocumentDate(),
doc.getMetaDatePrecision(),
doc.getMetaDateEnd(),
doc.getSender(),
List.copyOf(doc.getReceivers()),
List.copyOf(doc.getTags()),
doc.getArchiveBox(),
doc.getArchiveFolder(),
doc.getLocation(),
doc.getSummary(),
completionPct,
contributors,
match
);
}
private Map<UUID, Integer> fetchCompletionPercentages(List<UUID> docIds) { private Map<UUID, Integer> fetchCompletionPercentages(List<UUID> docIds) {
return transcriptionBlockQueryService.getCompletionStats(docIds); return transcriptionBlockQueryService.getCompletionStats(docIds);
} }
@@ -843,21 +652,12 @@ public class DocumentService {
private Sort resolveSort(DocumentSort sort, String dir) { private Sort resolveSort(DocumentSort sort, String dir) {
Sort.Direction direction = "ASC".equalsIgnoreCase(dir) ? Sort.Direction.ASC : Sort.Direction.DESC; Sort.Direction direction = "ASC".equalsIgnoreCase(dir) ? Sort.Direction.ASC : Sort.Direction.DESC;
if (sort == null || sort == DocumentSort.DATE || sort == DocumentSort.RELEVANCE) { if (sort == null || sort == DocumentSort.DATE || sort == DocumentSort.RELEVANCE) {
// Undated documents (null documentDate) must order last regardless of return Sort.by(direction, "documentDate");
// direction — Postgres puts NULLs FIRST on ASC by default, which would
// surface the undated pile at the top with no explanation (issue #668).
// The title tiebreaker gives a stable total order when every row is
// null-dated (the "Nur undatierte" filter), so pagination is deterministic.
// title is @Column(nullable=false), so it is always present.
return Sort.by(
new Sort.Order(direction, "documentDate").nullsLast(),
Sort.Order.asc("title"));
} }
// SENDER and RECEIVER are sorted in-memory before this method is called // SENDER and RECEIVER are sorted in-memory before this method is called
return switch (sort) { return switch (sort) {
case TITLE -> Sort.by(direction, "title"); case TITLE -> Sort.by(direction, "title");
case UPLOAD_DATE -> Sort.by(direction, "createdAt"); case UPLOAD_DATE -> Sort.by(direction, "createdAt");
case UPDATED_AT -> Sort.by(direction, "updatedAt");
default -> Sort.by(direction, "documentDate"); default -> Sort.by(direction, "documentDate");
}; };
} }
@@ -936,7 +736,6 @@ public class DocumentService {
documentRepository.save(doc); documentRepository.save(doc);
} }
@Transactional(readOnly = true)
public Document getDocumentById(UUID id) { public Document getDocumentById(UUID id) {
Document doc = documentRepository.findById(id) Document doc = documentRepository.findById(id)
.orElseThrow(() -> DomainException.notFound(ErrorCode.DOCUMENT_NOT_FOUND, "Document not found: " + id)); .orElseThrow(() -> DomainException.notFound(ErrorCode.DOCUMENT_NOT_FOUND, "Document not found: " + id));
@@ -1144,28 +943,6 @@ public class DocumentService {
return result; return result;
} }
private static final int COL_ID = 0;
private static final int COL_RANK = 1;
private static final int COL_TOTAL = 2;
/**
* Maps raw Object[] rows from {@link DocumentRepository#findFtsPageRaw} to an
* {@link FtsPage}. Uses pattern-matching UUID cast to guard against driver-level
* type variance (some JDBC drivers return UUID as String).
*/
private static FtsPage toFtsPage(List<Object[]> rows) {
if (rows.isEmpty()) return new FtsPage(List.of(), 0);
long total = ((Number) rows.get(0)[COL_TOTAL]).longValue();
List<FtsHit> hits = rows.stream()
.map(r -> {
UUID id = r[COL_ID] instanceof UUID u ? u : UUID.fromString(r[COL_ID].toString());
double rank = ((Number) r[COL_RANK]).doubleValue();
return new FtsHit(id, rank);
})
.toList();
return new FtsPage(hits, total);
}
/** Clean text + highlight offsets parsed from a {@code ts_headline} sentinel-delimited string. */ /** Clean text + highlight offsets parsed from a {@code ts_headline} sentinel-delimited string. */
public record ParsedHighlight(String cleanText, List<MatchOffset> offsets) {} public record ParsedHighlight(String cleanText, List<MatchOffset> offsets) {}

View File

@@ -1,5 +1,5 @@
package org.raddatz.familienarchiv.document; package org.raddatz.familienarchiv.document;
public enum DocumentSort { public enum DocumentSort {
DATE, TITLE, SENDER, RECEIVER, UPLOAD_DATE, UPDATED_AT, RELEVANCE DATE, TITLE, SENDER, RECEIVER, UPLOAD_DATE, RELEVANCE
} }

View File

@@ -55,12 +55,6 @@ public class DocumentSpecifications {
return (root, query, cb) -> status == null ? null : cb.equal(root.get("status"), status); return (root, query, cb) -> status == null ? null : cb.equal(root.get("status"), status);
} }
// Filtert auf undatierte Dokumente (meta_date IS NULL) — für die "Nur undatierte"-Triage.
// false → kein Prädikat (no-op), true → documentDate IS NULL (issue #668).
public static Specification<Document> undatedOnly(boolean undated) {
return (root, query, cb) -> undated ? cb.isNull(root.get("documentDate")) : null;
}
/** /**
* Filtert nach vorausgeweiteten Tag-ID-Sets mit AND- oder OR-Logik. * Filtert nach vorausgeweiteten Tag-ID-Sets mit AND- oder OR-Logik.
* *

View File

@@ -11,11 +11,6 @@ import org.raddatz.familienarchiv.ocr.ScriptType;
public class DocumentUpdateDTO { public class DocumentUpdateDTO {
private String title; private String title;
private LocalDate documentDate; private LocalDate documentDate;
private DatePrecision metaDatePrecision;
private LocalDate metaDateEnd;
private String metaDateRaw;
private String senderText;
private String receiverText;
private String location; private String location;
private String documentLocation; private String documentLocation;
private String archiveBox; private String archiveBox;

View File

@@ -1,6 +0,0 @@
package org.raddatz.familienarchiv.document;
import java.util.UUID;
/** A single document hit from a paginated FTS query — id and its ts_rank score. */
record FtsHit(UUID id, double rank) {}

View File

@@ -1,6 +0,0 @@
package org.raddatz.familienarchiv.document;
import java.util.List;
/** One page of FTS results — the ranked hit list for this page and the total match count. */
record FtsPage(List<FtsHit> hits, long total) {}

View File

@@ -1,10 +0,0 @@
package org.raddatz.familienarchiv.document;
import io.swagger.v3.oas.annotations.media.Schema;
public record MonthBucket(
@Schema(requiredMode = Schema.RequiredMode.REQUIRED, example = "1915-08")
String month,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
int count
) {}

View File

@@ -1,50 +0,0 @@
# document
The archive's core concept. A `Document` represents one physical artefact (a letter, a postcard, a photo) stored in MinIO and described by metadata.
## What this domain owns
Entities: `Document`, `DocumentVersion`, `TranscriptionBlock`, `DocumentAnnotation`, `DocumentComment`.
Features: document CRUD, file upload/download, full-text search, bulk editing, transcription workflows, annotation canvas, threaded comments, thumbnail generation (PDFBox).
## What this domain does NOT own
- `Person` (sender / receivers) — referenced by ID, resolved via `PersonService`
- `Tag` — referenced by ID; the join is on the document side but tags are owned by `tag/`
- `AppUser` — comments reference `AppUser` IDs, but user management lives in `user/`
- OCR processing — `ocr/` orchestrates jobs; `ocr-service/` executes them
## Public surface (called from other domains)
| Method | Consumer | Purpose |
|---|---|---|
| `getDocumentById(UUID)` | ocr, notification | Fetch a single document |
| `getDocumentsByIds(List<UUID>)` | ocr | Bulk fetch for OCR job |
| `findByOriginalFilename(String)` | importing | Deduplication during mass import |
| `deleteTagCascading(UUID tagId)` | tag | Remove a tag from all documents before deleting it |
| `findWeeklyStats()` | dashboard | Activity data for Family Pulse widget |
| `count()` | dashboard | Total document count for stats |
| `addTrainingLabel(...)` | ocr | Attach a confirmed sender label to a document |
| `findSegmentationQueue(int limit)` / `findTranscriptionQueue(int limit)` / `findReadyToReadQueue(int limit)` | ocr | OCR pipeline queues |
## Internal layout
- `DocumentController` — REST under `/api/documents`
- `DocumentService` — CRUD, search (JPA Specifications), bulk edit
- `DocumentRepository` — includes bidirectional conversation-thread query
- `DocumentSpecifications` — composable `Specification` predicates for search
- `DocumentVersionService` / `DocumentVersionRepository` — append-only version history
- `ThumbnailService` + `ThumbnailAsyncRunner` — PDFBox thumbnail generation (separate thread pool)
- Sub-packages: `annotation/`, `comment/`, `transcription/`
## Cross-domain dependencies
- `PersonService.getById()` / `getAllById()` — resolve sender and receivers
- `TagService.expandTagNamesToDescendantIdSets()` — tag filter expansion
- `FileService.uploadFile()` / `downloadFile()` / `generatePresignedUrl()` — S3 I/O
- `NotificationService.notifyMentions()` / `.notifyReply()` — comment mentions
- `AuditService.logAfterCommit()` — every mutation is audited
## Frontend counterpart
`frontend/src/lib/document/README.md`

View File

@@ -27,9 +27,7 @@ public class CommentController {
// ─── Block (transcription) comments ──────────────────────────────────────── // ─── Block (transcription) comments ────────────────────────────────────────
@GetMapping("/api/documents/{documentId}/transcription-blocks/{blockId}/comments") @GetMapping("/api/documents/{documentId}/transcription-blocks/{blockId}/comments")
public List<DocumentComment> getBlockComments( public List<DocumentComment> getBlockComments(@PathVariable UUID blockId) {
@PathVariable UUID documentId,
@PathVariable UUID blockId) {
return commentService.getCommentsForBlock(blockId); return commentService.getCommentsForBlock(blockId);
} }
@@ -50,7 +48,6 @@ public class CommentController {
@RequirePermission({Permission.ANNOTATE_ALL, Permission.WRITE_ALL}) @RequirePermission({Permission.ANNOTATE_ALL, Permission.WRITE_ALL})
public DocumentComment replyToBlockComment( public DocumentComment replyToBlockComment(
@PathVariable UUID documentId, @PathVariable UUID documentId,
@PathVariable UUID blockId,
@PathVariable UUID commentId, @PathVariable UUID commentId,
@RequestBody CreateCommentDTO dto, @RequestBody CreateCommentDTO dto,
Authentication authentication) { Authentication authentication) {

View File

@@ -1,6 +0,0 @@
package org.raddatz.familienarchiv.document.comment;
import jakarta.annotation.Nullable;
import java.util.UUID;
public record CommentData(@Nullable UUID annotationId, String preview) {}

View File

@@ -13,7 +13,6 @@ import org.raddatz.familienarchiv.document.comment.DocumentComment;
import org.raddatz.familienarchiv.document.transcription.TranscriptionBlock; import org.raddatz.familienarchiv.document.transcription.TranscriptionBlock;
import org.raddatz.familienarchiv.document.comment.CommentRepository; import org.raddatz.familienarchiv.document.comment.CommentRepository;
import org.raddatz.familienarchiv.notification.NotificationService; import org.raddatz.familienarchiv.notification.NotificationService;
import org.jsoup.Jsoup;
import org.springframework.stereotype.Service; import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional; import org.springframework.transaction.annotation.Transactional;
@@ -29,29 +28,21 @@ import java.util.UUID;
@RequiredArgsConstructor @RequiredArgsConstructor
public class CommentService { public class CommentService {
private static final int PREVIEW_MAX_CHARS = 120;
private final CommentRepository commentRepository; private final CommentRepository commentRepository;
private final UserService userService; private final UserService userService;
private final NotificationService notificationService; private final NotificationService notificationService;
private final AuditService auditService; private final AuditService auditService;
private final TranscriptionService transcriptionService; private final TranscriptionService transcriptionService;
public Map<UUID, CommentData> findDataByIds(Collection<UUID> commentIds) { public Map<UUID, UUID> findAnnotationIdsByIds(Collection<UUID> commentIds) {
if (commentIds == null || commentIds.isEmpty()) return Map.of(); if (commentIds == null || commentIds.isEmpty()) return Map.of();
Map<UUID, CommentData> result = new HashMap<>(); Map<UUID, UUID> result = new HashMap<>();
for (DocumentComment c : commentRepository.findAllById(commentIds)) { for (DocumentComment c : commentRepository.findAllById(commentIds)) {
result.put(c.getId(), new CommentData(c.getAnnotationId(), stripAndTruncate(c.getContent()))); if (c.getAnnotationId() != null) result.put(c.getId(), c.getAnnotationId());
} }
return result; return result;
} }
private String stripAndTruncate(String html) {
if (html == null || html.isBlank()) return "";
String text = Jsoup.parse(html).text().trim();
return text.length() > PREVIEW_MAX_CHARS ? text.substring(0, PREVIEW_MAX_CHARS) : text;
}
public List<DocumentComment> getCommentsForBlock(UUID blockId) { public List<DocumentComment> getCommentsForBlock(UUID blockId) {
List<DocumentComment> roots = commentRepository.findByBlockIdAndParentIdIsNull(blockId); List<DocumentComment> roots = commentRepository.findByBlockIdAndParentIdIsNull(blockId);
return withRepliesAndMentions(roots); return withRepliesAndMentions(roots);

View File

@@ -43,7 +43,7 @@ public class TranscriptionBlockController {
@PostMapping @PostMapping
@ResponseStatus(HttpStatus.CREATED) @ResponseStatus(HttpStatus.CREATED)
@RequirePermission({Permission.ANNOTATE_ALL, Permission.WRITE_ALL}) @RequirePermission(Permission.WRITE_ALL)
public TranscriptionBlock createBlock( public TranscriptionBlock createBlock(
@PathVariable UUID documentId, @PathVariable UUID documentId,
@Valid @RequestBody CreateTranscriptionBlockDTO dto, @Valid @RequestBody CreateTranscriptionBlockDTO dto,
@@ -53,7 +53,7 @@ public class TranscriptionBlockController {
} }
@PutMapping("/{blockId}") @PutMapping("/{blockId}")
@RequirePermission({Permission.ANNOTATE_ALL, Permission.WRITE_ALL}) @RequirePermission(Permission.WRITE_ALL)
public TranscriptionBlock updateBlock( public TranscriptionBlock updateBlock(
@PathVariable UUID documentId, @PathVariable UUID documentId,
@PathVariable UUID blockId, @PathVariable UUID blockId,
@@ -65,7 +65,7 @@ public class TranscriptionBlockController {
@DeleteMapping("/{blockId}") @DeleteMapping("/{blockId}")
@ResponseStatus(HttpStatus.NO_CONTENT) @ResponseStatus(HttpStatus.NO_CONTENT)
@RequirePermission({Permission.ANNOTATE_ALL, Permission.WRITE_ALL}) @RequirePermission(Permission.WRITE_ALL)
public void deleteBlock( public void deleteBlock(
@PathVariable UUID documentId, @PathVariable UUID documentId,
@PathVariable UUID blockId) { @PathVariable UUID blockId) {
@@ -73,7 +73,7 @@ public class TranscriptionBlockController {
} }
@PutMapping("/reorder") @PutMapping("/reorder")
@RequirePermission({Permission.ANNOTATE_ALL, Permission.WRITE_ALL}) @RequirePermission(Permission.WRITE_ALL)
public List<TranscriptionBlock> reorderBlocks( public List<TranscriptionBlock> reorderBlocks(
@PathVariable UUID documentId, @PathVariable UUID documentId,
@RequestBody ReorderTranscriptionBlocksDTO dto) { @RequestBody ReorderTranscriptionBlocksDTO dto) {
@@ -82,7 +82,7 @@ public class TranscriptionBlockController {
} }
@PutMapping("/{blockId}/review") @PutMapping("/{blockId}/review")
@RequirePermission({Permission.ANNOTATE_ALL, Permission.WRITE_ALL}) @RequirePermission(Permission.WRITE_ALL)
public TranscriptionBlock reviewBlock( public TranscriptionBlock reviewBlock(
@PathVariable UUID documentId, @PathVariable UUID documentId,
@PathVariable UUID blockId, @PathVariable UUID blockId,
@@ -92,7 +92,7 @@ public class TranscriptionBlockController {
} }
@PutMapping("/review-all") @PutMapping("/review-all")
@RequirePermission({Permission.ANNOTATE_ALL, Permission.WRITE_ALL}) @RequirePermission(Permission.WRITE_ALL)
public List<TranscriptionBlock> markAllBlocksReviewed( public List<TranscriptionBlock> markAllBlocksReviewed(
@PathVariable UUID documentId, @PathVariable UUID documentId,
Authentication authentication) { Authentication authentication) {

View File

@@ -10,21 +10,11 @@ public class DomainException extends RuntimeException {
private final ErrorCode code; private final ErrorCode code;
private final HttpStatus status; private final HttpStatus status;
/** Seconds until the rate-limit window resets; {@code null} when not applicable. */
private final Long retryAfterSeconds;
public DomainException(ErrorCode code, HttpStatus status, String developerMessage) { public DomainException(ErrorCode code, HttpStatus status, String developerMessage) {
super(developerMessage); super(developerMessage);
this.code = code; this.code = code;
this.status = status; this.status = status;
this.retryAfterSeconds = null;
}
private DomainException(ErrorCode code, HttpStatus status, String developerMessage, Long retryAfterSeconds) {
super(developerMessage);
this.code = code;
this.status = status;
this.retryAfterSeconds = retryAfterSeconds;
} }
public ErrorCode getCode() { public ErrorCode getCode() {
@@ -35,11 +25,6 @@ public class DomainException extends RuntimeException {
return status; return status;
} }
/** Returns the {@code Retry-After} value in seconds, or {@code null} if not set. */
public Long getRetryAfterSeconds() {
return retryAfterSeconds;
}
// --- Static factories for common cases --- // --- Static factories for common cases ---
public static DomainException notFound(ErrorCode code, String message) { public static DomainException notFound(ErrorCode code, String message) {
@@ -54,11 +39,6 @@ public class DomainException extends RuntimeException {
return new DomainException(ErrorCode.UNAUTHORIZED, HttpStatus.UNAUTHORIZED, message); return new DomainException(ErrorCode.UNAUTHORIZED, HttpStatus.UNAUTHORIZED, message);
} }
public static DomainException invalidCredentials() {
return new DomainException(ErrorCode.INVALID_CREDENTIALS, HttpStatus.UNAUTHORIZED,
"Invalid email or password");
}
public static DomainException conflict(ErrorCode code, String message) { public static DomainException conflict(ErrorCode code, String message) {
return new DomainException(code, HttpStatus.CONFLICT, message); return new DomainException(code, HttpStatus.CONFLICT, message);
} }
@@ -70,12 +50,4 @@ public class DomainException extends RuntimeException {
public static DomainException internal(ErrorCode code, String message) { public static DomainException internal(ErrorCode code, String message) {
return new DomainException(code, HttpStatus.INTERNAL_SERVER_ERROR, message); return new DomainException(code, HttpStatus.INTERNAL_SERVER_ERROR, message);
} }
public static DomainException tooManyRequests(ErrorCode code, String message) {
return new DomainException(code, HttpStatus.TOO_MANY_REQUESTS, message);
}
public static DomainException tooManyRequests(ErrorCode code, String message, long retryAfterSeconds) {
return new DomainException(code, HttpStatus.TOO_MANY_REQUESTS, message, retryAfterSeconds);
}
} }

View File

@@ -30,8 +30,6 @@ public enum ErrorCode {
// --- Users --- // --- Users ---
/** A user with the given ID or username does not exist. 404 */ /** A user with the given ID or username does not exist. 404 */
USER_NOT_FOUND, USER_NOT_FOUND,
/** A group with the given ID does not exist. 404 */
GROUP_NOT_FOUND,
/** The supplied email address is already used by another account. 409 */ /** The supplied email address is already used by another account. 409 */
EMAIL_ALREADY_IN_USE, EMAIL_ALREADY_IN_USE,
/** The supplied current password does not match the stored hash. 400 */ /** The supplied current password does not match the stored hash. 400 */
@@ -40,8 +38,6 @@ public enum ErrorCode {
// --- Import --- // --- Import ---
/** A mass import is already in progress; only one can run at a time. 409 */ /** A mass import is already in progress; only one can run at a time. 409 */
IMPORT_ALREADY_RUNNING, IMPORT_ALREADY_RUNNING,
/** A canonical import artifact is missing, unreadable, or missing a required header. 400 */
IMPORT_ARTIFACT_INVALID,
// --- Thumbnails --- // --- Thumbnails ---
/** A thumbnail backfill is already in progress; only one can run at a time. 409 */ /** A thumbnail backfill is already in progress; only one can run at a time. 409 */
@@ -56,24 +52,14 @@ public enum ErrorCode {
INVITE_REVOKED, INVITE_REVOKED,
/** The invite has passed its expiry date. 410 */ /** The invite has passed its expiry date. 410 */
INVITE_EXPIRED, INVITE_EXPIRED,
/** A group cannot be deleted because one or more active invites reference it. 409 */
GROUP_HAS_ACTIVE_INVITES,
// --- Auth --- // --- Auth ---
/** The request is not authenticated. 401 */ /** The request is not authenticated. 401 */
UNAUTHORIZED, UNAUTHORIZED,
/** The authenticated user lacks the required permission. 403 */ /** The authenticated user lacks the required permission. 403 */
FORBIDDEN, FORBIDDEN,
/** The supplied email/password combination does not match any active account. 401 */
INVALID_CREDENTIALS,
/** The session has expired or been invalidated. 401 */
SESSION_EXPIRED,
/** The password-reset token is missing, expired, or already used. 400 */ /** The password-reset token is missing, expired, or already used. 400 */
INVALID_RESET_TOKEN, INVALID_RESET_TOKEN,
/** CSRF token is missing or does not match the expected value. 403 */
CSRF_TOKEN_MISSING,
/** The login rate limit has been exceeded for this IP/email combination. 429 */
TOO_MANY_LOGIN_ATTEMPTS,
// --- Annotations --- // --- Annotations ---
/** The annotation with the given ID does not exist. 404 */ /** The annotation with the given ID does not exist. 404 */

View File

@@ -2,7 +2,6 @@ package org.raddatz.familienarchiv.exception;
import java.util.stream.Collectors; import java.util.stream.Collectors;
import io.sentry.Sentry;
import jakarta.validation.ConstraintViolationException; import jakarta.validation.ConstraintViolationException;
import org.raddatz.familienarchiv.exception.DomainException; import org.raddatz.familienarchiv.exception.DomainException;
import org.raddatz.familienarchiv.exception.ErrorCode; import org.raddatz.familienarchiv.exception.ErrorCode;
@@ -16,18 +15,15 @@ import org.springframework.web.server.ResponseStatusException;
import lombok.extern.slf4j.Slf4j; import lombok.extern.slf4j.Slf4j;
// "Handler" is Spring's @RestControllerAdvice naming convention — not a generic suffix.
@RestControllerAdvice @RestControllerAdvice
@Slf4j @Slf4j
public class GlobalExceptionHandler { public class GlobalExceptionHandler {
@ExceptionHandler(DomainException.class) @ExceptionHandler(DomainException.class)
public ResponseEntity<ErrorResponse> handleDomain(DomainException ex) { public ResponseEntity<ErrorResponse> handleDomain(DomainException ex) {
var builder = ResponseEntity.status(ex.getStatus()); return ResponseEntity
if (ex.getRetryAfterSeconds() != null) { .status(ex.getStatus())
builder = builder.header("Retry-After", String.valueOf(ex.getRetryAfterSeconds())); .body(new ErrorResponse(ex.getCode(), ex.getMessage()));
}
return builder.body(new ErrorResponse(ex.getCode(), ex.getMessage()));
} }
@ExceptionHandler(MethodArgumentNotValidException.class) @ExceptionHandler(MethodArgumentNotValidException.class)
@@ -66,7 +62,6 @@ public class GlobalExceptionHandler {
@ExceptionHandler(Exception.class) @ExceptionHandler(Exception.class)
public ResponseEntity<ErrorResponse> handleGeneric(Exception ex) { public ResponseEntity<ErrorResponse> handleGeneric(Exception ex) {
Sentry.captureException(ex);
log.error("Unhandled exception", ex); log.error("Unhandled exception", ex);
return ResponseEntity.internalServerError() return ResponseEntity.internalServerError()
.body(new ErrorResponse(ErrorCode.INTERNAL_ERROR, "An unexpected error occurred")); .body(new ErrorResponse(ErrorCode.INTERNAL_ERROR, "An unexpected error occurred"));

View File

@@ -56,10 +56,6 @@ public class GeschichteService {
// ─── Read API ──────────────────────────────────────────────────────────── // ─── Read API ────────────────────────────────────────────────────────────
public long countPublished() {
return geschichteRepository.count(GeschichteSpecifications.hasStatus(GeschichteStatus.PUBLISHED));
}
public Geschichte getById(UUID id) { public Geschichte getById(UUID id) {
Geschichte g = geschichteRepository.findById(id) Geschichte g = geschichteRepository.findById(id)
.orElseThrow(() -> DomainException.notFound( .orElseThrow(() -> DomainException.notFound(
@@ -81,10 +77,8 @@ public class GeschichteService {
GeschichteStatus effective = currentUserHasBlogWrite() ? status : GeschichteStatus.PUBLISHED; GeschichteStatus effective = currentUserHasBlogWrite() ? status : GeschichteStatus.PUBLISHED;
int safeLimit = limit <= 0 ? DEFAULT_LIMIT : Math.min(limit, MAX_LIMIT); int safeLimit = limit <= 0 ? DEFAULT_LIMIT : Math.min(limit, MAX_LIMIT);
UUID authorId = effective == GeschichteStatus.DRAFT ? currentUser().getId() : null;
Specification<Geschichte> spec = Specification.allOf( Specification<Geschichte> spec = Specification.allOf(
GeschichteSpecifications.hasStatus(effective), GeschichteSpecifications.hasStatus(effective),
GeschichteSpecifications.hasAuthor(authorId),
GeschichteSpecifications.hasAllPersons(personIds), GeschichteSpecifications.hasAllPersons(personIds),
GeschichteSpecifications.hasDocument(documentId), GeschichteSpecifications.hasDocument(documentId),
GeschichteSpecifications.orderByDisplayDateDesc() GeschichteSpecifications.orderByDisplayDateDesc()

View File

@@ -42,12 +42,6 @@ public final class GeschichteSpecifications {
}; };
} }
// null authorId → no restriction (PUBLISHED path passes null; Spring Data skips null predicates)
public static Specification<Geschichte> hasAuthor(UUID authorId) {
return (root, query, cb) ->
authorId == null ? null : cb.equal(root.get("author").get("id"), authorId);
}
public static Specification<Geschichte> hasDocument(UUID documentId) { public static Specification<Geschichte> hasDocument(UUID documentId) {
return (root, query, cb) -> { return (root, query, cb) -> {
if (documentId == null) return null; if (documentId == null) return null;

View File

@@ -1,38 +0,0 @@
# geschichte
Family stories — curated narrative pieces that weave together persons, documents, and commentary into a publishable article. German: *Geschichte* (story / history).
## What this domain owns
Entity: `Geschichte`.
Lifecycle: `DRAFT → PUBLISHED` (only published stories are visible to non-authors).
Features: story CRUD, rich-text editing with person and document cross-references, publish/unpublish toggle, comment thread (shared component from `shared/discussion/`).
## What this domain does NOT own
- `Person` or `Document` records — stories reference them by ID. Deleting a Person or Document does not cascade to Geschichte.
- Comment storage — shared comment infrastructure is in `document/comment/` (or `shared/discussion/` on the frontend).
## Public surface (called from other domains)
| Method | Consumer | Purpose |
|---|---|---|
| `getById(UUID)` | notification | Resolve story context in mention notifications |
| `list(...)` | dashboard | Recent stories for the activity feed |
| `count()` | dashboard | Published story count for stats |
## Internal layout
- `GeschichteController` — REST under `/api/geschichten`
- `GeschichteService` — CRUD, publish lifecycle
- `GeschichteRepository` — list by status, author
## Cross-domain dependencies
- `PersonService.getById()` / `getAllById()` — resolve person references in story body
- `DocumentService.getDocumentsByIds()` — resolve document references in story body
- `AuditService.logAfterCommit()` — story mutations are audited
## Frontend counterpart
`frontend/src/lib/geschichte/README.md`

View File

@@ -1,94 +0,0 @@
package org.raddatz.familienarchiv.importing;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.raddatz.familienarchiv.exception.DomainException;
import org.raddatz.familienarchiv.exception.ErrorCode;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;
import java.io.File;
import java.time.LocalDateTime;
import java.util.List;
/**
* Runs the four canonical loaders in their real dependency order — encoded explicitly
* here, not implied by call order — and owns the async runner plus the {@link ImportStatus}
* state machine the admin UI consumes. The orchestrator smoke-checks that all four
* artifacts are present before starting, failing fast rather than half-loading tags but no
* documents. A malformed artifact (a loader throwing) sets {@code FAILED}; an individual
* bad file is surfaced through the {@link ImportStatus.SkippedFile} mechanism instead.
*/
@Service
@RequiredArgsConstructor
@Slf4j
public class CanonicalImportOrchestrator {
private static final String TAG_TREE_ARTIFACT = "canonical-tag-tree.xlsx";
private static final String PERSONS_ARTIFACT = "canonical-persons.xlsx";
private static final String PERSONS_TREE_ARTIFACT = "canonical-persons-tree.json";
private static final String DOCUMENTS_ARTIFACT = "canonical-documents.xlsx";
private final TagTreeImporter tagTreeImporter;
private final PersonRegisterImporter personRegisterImporter;
private final PersonTreeImporter personTreeImporter;
private final DocumentImporter documentImporter;
@Value("${app.import.dir:/import}")
private String canonicalDir;
private volatile ImportStatus currentStatus = new ImportStatus(
ImportStatus.State.IDLE, "IMPORT_IDLE", "Kein Import gestartet.", 0, List.of(), null);
public ImportStatus getStatus() {
return currentStatus;
}
@Async
public void runImportAsync() {
if (currentStatus.state() == ImportStatus.State.RUNNING) {
throw DomainException.conflict(ErrorCode.IMPORT_ALREADY_RUNNING, "A mass import is already in progress");
}
runImport();
}
/** Synchronous entry point — wrapped by {@link #runImportAsync()} and called directly in tests. */
void runImport() {
currentStatus = new ImportStatus(ImportStatus.State.RUNNING, "IMPORT_RUNNING",
"Import läuft...", 0, List.of(), LocalDateTime.now());
try {
File tagTree = requireArtifact(TAG_TREE_ARTIFACT);
File persons = requireArtifact(PERSONS_ARTIFACT);
File personsTree = requireArtifact(PERSONS_TREE_ARTIFACT);
File documents = requireArtifact(DOCUMENTS_ARTIFACT);
// Dependency DAG: documents need persons + tags; the tree needs persons.
tagTreeImporter.load(tagTree);
personRegisterImporter.load(persons);
personTreeImporter.load(personsTree);
DocumentImporter.LoadResult result = documentImporter.load(documents);
currentStatus = new ImportStatus(ImportStatus.State.DONE, "IMPORT_DONE",
"Import abgeschlossen. " + result.processed() + " Dokumente verarbeitet.",
result.processed(), result.skippedFiles(), currentStatus.startedAt());
} catch (DomainException e) {
log.error("Canonical import failed: {}", e.getMessage());
currentStatus = new ImportStatus(ImportStatus.State.FAILED, "IMPORT_FAILED_ARTIFACT",
"Fehler: " + e.getMessage(), 0, List.of(), currentStatus.startedAt());
} catch (Exception e) {
log.error("Canonical import failed", e);
currentStatus = new ImportStatus(ImportStatus.State.FAILED, "IMPORT_FAILED_INTERNAL",
"Fehler: " + e.getMessage(), 0, List.of(), currentStatus.startedAt());
}
}
private File requireArtifact(String name) {
File artifact = new File(canonicalDir, name);
if (!artifact.isFile()) {
throw DomainException.badRequest(ErrorCode.IMPORT_ARTIFACT_INVALID,
"Missing canonical artifact: " + name);
}
return artifact;
}
}

View File

@@ -1,133 +0,0 @@
package org.raddatz.familienarchiv.importing;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.DateUtil;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
import org.raddatz.familienarchiv.exception.DomainException;
import org.raddatz.familienarchiv.exception.ErrorCode;
import java.io.File;
import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
/**
* Value-level POI helper for the canonical import artifacts. No Spring, no domain
* knowledge: it opens a workbook, maps the header row to column indices by name, and
* yields typed rows whose cells are looked up by header name — the seam that replaces
* the old positional {@code @Value app.import.col.*} indices. List columns are split on
* the pipe delimiter the normalizer emits.
*/
public final class CanonicalSheetReader {
private CanonicalSheetReader() {
}
/** A single data row, addressable by canonical header name (never by index). */
public static final class Row {
private final Map<String, Integer> headerIndex;
private final List<String> cells;
private Row(Map<String, Integer> headerIndex, List<String> cells) {
this.headerIndex = headerIndex;
this.cells = cells;
}
/** Trimmed cell value for the named header, or "" when absent/blank. */
public String get(String header) {
Integer index = headerIndex.get(header);
if (index == null || index >= cells.size()) return "";
String value = cells.get(index);
return value == null ? "" : value.trim();
}
}
/**
* Reads all data rows from the first sheet, validating that every required header is
* present. Throws a fail-closed {@link DomainException} on a missing header so a
* loader never silently maps the wrong column.
*/
public static List<Row> readRows(File file, List<String> requiredHeaders) {
try (FileInputStream fis = new FileInputStream(file);
Workbook workbook = WorkbookFactory.create(fis)) {
Sheet sheet = workbook.getSheetAt(0);
org.apache.poi.ss.usermodel.Row headerRow = sheet.getRow(sheet.getFirstRowNum());
Map<String, Integer> headerIndex = mapHeaders(headerRow);
requireHeaders(file, headerIndex, requiredHeaders);
List<Row> rows = new ArrayList<>();
for (int i = sheet.getFirstRowNum() + 1; i <= sheet.getLastRowNum(); i++) {
org.apache.poi.ss.usermodel.Row poiRow = sheet.getRow(i);
if (poiRow == null) continue;
rows.add(new Row(headerIndex, readCells(poiRow, headerIndex.size())));
}
return rows;
} catch (DomainException e) {
throw e;
} catch (Exception e) {
throw DomainException.badRequest(ErrorCode.IMPORT_ARTIFACT_INVALID,
"Unreadable canonical artifact: " + file.getName());
}
}
/** Splits a pipe-delimited list column into trimmed, non-empty segments. */
public static List<String> splitList(String raw) {
if (raw == null || raw.isBlank()) return List.of();
return Arrays.stream(raw.split("\\|"))
.map(String::trim)
.filter(s -> !s.isEmpty())
.toList();
}
private static Map<String, Integer> mapHeaders(org.apache.poi.ss.usermodel.Row headerRow) {
if (headerRow == null) {
return Map.of();
}
Map<String, Integer> headerIndex = new HashMap<>();
for (int c = 0; c < headerRow.getLastCellNum(); c++) {
String name = cellToString(headerRow.getCell(c)).trim();
if (!name.isEmpty()) headerIndex.putIfAbsent(name, c);
}
return headerIndex;
}
private static void requireHeaders(File file, Map<String, Integer> headerIndex, List<String> requiredHeaders) {
for (String header : requiredHeaders) {
if (!headerIndex.containsKey(header)) {
throw DomainException.badRequest(ErrorCode.IMPORT_ARTIFACT_INVALID,
"Missing required header '" + header + "' in artifact " + file.getName());
}
}
}
private static List<String> readCells(org.apache.poi.ss.usermodel.Row poiRow, int columnCount) {
int width = Math.max(columnCount, poiRow.getLastCellNum());
List<String> cells = new ArrayList<>(width);
for (int c = 0; c < width; c++) {
cells.add(cellToString(poiRow.getCell(c)));
}
return cells;
}
private static String cellToString(Cell cell) {
if (cell == null) return "";
return switch (cell.getCellType()) {
case STRING -> cell.getStringCellValue();
case NUMERIC -> {
if (DateUtil.isCellDateFormatted(cell)) {
yield cell.getLocalDateTimeCellValue().toLocalDate().toString();
}
yield String.valueOf((long) cell.getNumericCellValue());
}
case BOOLEAN -> String.valueOf(cell.getBooleanCellValue());
default -> "";
};
}
}

View File

@@ -1,364 +0,0 @@
package org.raddatz.familienarchiv.importing;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.raddatz.familienarchiv.document.DatePrecision;
import org.raddatz.familienarchiv.document.Document;
import org.raddatz.familienarchiv.document.DocumentService;
import org.raddatz.familienarchiv.document.DocumentStatus;
import org.raddatz.familienarchiv.document.ThumbnailAsyncRunner;
import org.raddatz.familienarchiv.exception.DomainException;
import org.raddatz.familienarchiv.exception.ErrorCode;
import org.raddatz.familienarchiv.person.Person;
import org.raddatz.familienarchiv.person.PersonService;
import org.raddatz.familienarchiv.person.PersonType;
import org.raddatz.familienarchiv.person.PersonUpsertCommand;
import org.raddatz.familienarchiv.tag.Tag;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;
import org.springframework.transaction.annotation.Transactional;
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;
import org.raddatz.familienarchiv.tag.TagService;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.UUID;
import java.util.regex.Pattern;
/**
* Loads {@code canonical-documents.xlsx} into the document domain. Java performs no
* semantic transformation: the normalizer already resolved people to slugs and dates to
* ISO values. This loader maps columns by header name, routes each attribution
* register-first (always retaining the raw cell in {@code sender_text}/{@code receiver_text}),
* parses clean dates, and keeps the S3/thumbnail plumbing.
*
* <p>The import corpus is uniform — every PDF is named {@code <index>.pdf} flat in the import
* dir — so a document's PDF is resolved <em>directly by its index</em>:
* {@code importDir.resolve(index + ".pdf")}. The {@code index} is still hostile input
* regardless of upstream trust (CWE-22 does not care it came from our Python tool): it is
* validated against a strict catalog pattern with {@link #isValidImportIndex} (no path
* separators, no {@code .}/{@code ..}, no absolute path, no slash homoglyphs) and the
* resolved path is asserted to stay inside the import dir in {@link #resolvePdfByIndex} as
* defense-in-depth. The {@code %PDF} magic-byte check still gates upload.
*/
@Component
@RequiredArgsConstructor
@Slf4j
public class DocumentImporter {
static final List<String> REQUIRED_HEADERS = List.of(
"index", "sender_person_id", "sender_name",
"receiver_person_ids", "receiver_names", "date_iso", "date_raw", "date_precision");
// Catalog index shape: 14 letters (ASCII + Latin-1 letters, e.g. the German "ü" in
// "Mü-0001"), one or more hyphens (the corpus has a few "C--0029" data-entry artefacts),
// digits, and an optional trailing "x" the normalizer recognises. Anchored, with no
// separator / dot / slash characters in the class, so "<index>.pdf" can never traverse.
// NOTE: `\d` here is intentionally ASCII-only ([0-9]). Java's java.util.regex matches `\d`
// against [0-9] unless Pattern.UNICODE_CHARACTER_CLASS is set — do NOT add that flag, or
// Arabic-Indic / fullwidth digits would silently widen the accepted set.
private static final Pattern INDEX_PATTERN =
Pattern.compile("[A-Za-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u00FF]{1,4}-+\\d+x?");
private final DocumentService documentService;
private final PersonService personService;
private final TagService tagService;
private final S3Client s3Client;
private final ThumbnailAsyncRunner thumbnailAsyncRunner;
@Value("${app.s3.bucket:familienarchiv}")
private String bucketName;
@Value("${app.import.dir:/import}")
private String importDir;
/** Outcome of loading the document sheet: processed count + per-file skips. */
public record LoadResult(int processed, List<ImportStatus.SkippedFile> skippedFiles) {}
// One transaction for the whole sheet keeps the Hibernate session open so an existing
// document's lazy receivers collection initialises during an idempotent re-import.
// Invoked cross-bean from the orchestrator, so the @Transactional proxy applies.
@Transactional
public LoadResult load(File artifact) {
List<CanonicalSheetReader.Row> rows = CanonicalSheetReader.readRows(artifact, REQUIRED_HEADERS);
int processed = 0;
List<ImportStatus.SkippedFile> skipped = new ArrayList<>();
// 1-based source row number for ops triage breadcrumbs (the spreadsheet header is row 1,
// so the first data row is row 2 — matches what an operator sees in the .xlsx).
int rowNumber = 1;
for (CanonicalSheetReader.Row row : rows) {
rowNumber++;
String index = row.get("index");
if (index.isBlank()) continue;
Optional<ImportStatus.SkipReason> skipReason = importRow(row, index, rowNumber);
if (skipReason.isPresent()) {
skipped.add(new ImportStatus.SkippedFile(index, skipReason.get()));
} else {
processed++;
}
}
log.info("Imported {} documents from {} ({} skipped)", processed, artifact.getName(), skipped.size());
return new LoadResult(processed, skipped);
}
private Optional<ImportStatus.SkipReason> importRow(CanonicalSheetReader.Row row, String index, int rowNumber) {
if (!isValidImportIndex(index)) {
// Breadcrumb is the source row number, NOT the raw (possibly-hostile) index — an
// operator triaging the import can find the offending row in the .xlsx without us
// echoing attacker-controlled input into the log.
log.warn("Skipping import row {}: index rejected (fails catalog-shape validation)", rowNumber);
return Optional.of(ImportStatus.SkipReason.INVALID_FILENAME_PATH_TRAVERSAL);
}
Optional<File> resolved = resolvePdfByIndex(index, rowNumber);
if (resolved.isEmpty()) {
// Distinct from the "index rejected" skip above: the index is VALID but no
// <index>.pdf is on disk, so the row becomes a normal PLACEHOLDER (not skipped). The
// index is a validated catalog id (no hostile content), so it is safe to log here —
// this surfaces a corpus that drifts from the "<index>.pdf" assumption (e.g. a file
// that arrived under a different name) rather than dropping it silently.
log.info("Import row {}: index {} is valid but {}.pdf is absent — creating PLACEHOLDER",
rowNumber, index, index);
} else {
try {
if (!isPdfMagicBytes(resolved.get())) {
return Optional.of(ImportStatus.SkipReason.INVALID_PDF_SIGNATURE);
}
} catch (IOException e) {
log.error("Magic-byte check failed for row {}", index, e);
return Optional.of(ImportStatus.SkipReason.FILE_READ_ERROR);
}
}
return persist(row, index, resolved);
}
private Optional<ImportStatus.SkipReason> persist(CanonicalSheetReader.Row row, String index, Optional<File> file) {
Document existing = documentService.findByOriginalFilename(index).orElse(null);
if (existing != null && existing.getStatus() != DocumentStatus.PLACEHOLDER) {
return Optional.of(ImportStatus.SkipReason.ALREADY_EXISTS);
}
String s3Key = null;
String contentType = null;
DocumentStatus status = DocumentStatus.PLACEHOLDER;
if (file.isPresent()) {
contentType = probeContentType(file.get());
s3Key = "documents/" + UUID.randomUUID() + "_" + file.get().getName();
try {
uploadToS3(file.get(), s3Key, contentType);
status = DocumentStatus.UPLOADED;
} catch (Exception e) {
log.error("S3 upload failed for {}", file.get().getName(), e);
return Optional.of(ImportStatus.SkipReason.S3_UPLOAD_FAILED);
}
}
Document doc = buildDocument(row, index, existing, s3Key, contentType, status);
Document saved = documentService.save(doc);
if (file.isPresent()) {
thumbnailAsyncRunner.dispatchAfterCommit(saved.getId());
}
return Optional.empty();
}
private Document buildDocument(CanonicalSheetReader.Row row, String index, Document existing,
String s3Key, String contentType, DocumentStatus status) {
Document doc = existing != null ? existing
: Document.builder().originalFilename(index).build();
String senderName = row.get("sender_name");
String receiverNames = row.get("receiver_names");
Person sender = resolveSender(row.get("sender_person_id"), senderName);
Set<Person> receivers = resolveReceivers(row.get("receiver_person_ids"));
LocalDate date = parseIsoDate(row.get("date_iso"));
DatePrecision precision = parsePrecision(row.get("date_precision"));
LocalDate dateEnd = parseIsoDate(row.get("date_end"));
String dateRaw = blankToNull(row.get("date_raw"));
String location = blankToNull(row.get("location"));
doc.setTitle(buildTitle(index, date, precision, dateEnd, dateRaw, location));
doc.setStatus(status);
doc.setFilePath(s3Key);
doc.setContentType(contentType);
doc.setSender(sender);
doc.setSenderText(blankToNull(senderName));
// The canonical row is authoritative for receivers/tags (ADR-025): clear then
// re-populate so a shrunk set on re-import prunes stale links rather than
// accumulating them. The raw sender_text/receiver_text retention is separate.
doc.getReceivers().clear();
doc.getReceivers().addAll(receivers);
doc.setReceiverText(blankToNull(receiverNames));
doc.setDocumentDate(date);
doc.setMetaDatePrecision(precision);
doc.setMetaDateEnd(dateEnd);
doc.setMetaDateRaw(dateRaw);
doc.setLocation(location);
doc.setSummary(blankToNull(row.get("summary")));
attachTag(doc, row.get("tags"));
doc.setMetadataComplete(doc.getDocumentDate() != null || sender != null || !receivers.isEmpty());
return doc;
}
// The title carries the date at the HONEST precision (never a fabricated day) via the
// shared DocumentTitleFormatter, plus the location — kept under 20 lines by delegating.
private static String buildTitle(String index, LocalDate date, DatePrecision precision,
LocalDate end, String raw, String location) {
StringBuilder title = new StringBuilder(index);
if (date != null && precision != DatePrecision.UNKNOWN) {
title.append(" ").append(DocumentTitleFormatter.formatTitleDate(date, precision, end, raw));
}
if (location != null && !location.isBlank()) {
title.append(" ").append(location);
}
return title.toString();
}
// ─── attribution routing — register-first, always retain raw ─────────────────────
private Person resolveSender(String slug, String rawName) {
if (slug.isBlank()) return null;
return resolvePerson(slug, rawName);
}
private Set<Person> resolveReceivers(String slugs) {
Set<Person> receivers = new LinkedHashSet<>();
for (String slug : CanonicalSheetReader.splitList(slugs)) {
receivers.add(resolvePerson(slug, slug));
}
return receivers;
}
private Person resolvePerson(String slug, String rawName) {
return personService.findBySourceRef(slug)
.orElseGet(() -> personService.upsertBySourceRef(PersonUpsertCommand.builder()
.sourceRef(slug)
.lastName(blankToNull(rawName) == null ? slug : rawName)
.personType(PersonType.PERSON)
.provisional(true)
.build()));
}
// Authoritative: the canonical row defines the document's tags exactly. Clearing first
// means a tag removed from the row is pruned on re-import (ADR-025).
private void attachTag(Document doc, String tagPath) {
doc.getTags().clear();
if (tagPath.isBlank()) return;
tagService.findBySourceRef(tagPath).ifPresent(tag -> doc.getTags().add(tag));
}
// ─── clean-value parsing (no semantic logic) ─────────────────────────────────────
private static LocalDate parseIsoDate(String value) {
if (value == null || value.isBlank()) return null;
try {
return LocalDate.parse(value.trim());
} catch (DateTimeParseException e) {
return null;
}
}
private static DatePrecision parsePrecision(String value) {
if (value == null || value.isBlank()) return DatePrecision.UNKNOWN;
try {
return DatePrecision.valueOf(value.trim());
} catch (IllegalArgumentException e) {
return DatePrecision.UNKNOWN;
}
}
// ─── file handling + S3 (small ≤20-line methods) ─────────────────────────────────
private String probeContentType(File file) {
try {
String probed = Files.probeContentType(file.toPath());
return probed != null ? probed : "application/octet-stream";
} catch (IOException e) {
return "application/octet-stream";
}
}
private void uploadToS3(File file, String s3Key, String contentType) {
s3Client.putObject(PutObjectRequest.builder()
.bucket(bucketName)
.key(s3Key)
.contentType(contentType)
.build(),
RequestBody.fromFile(file));
}
// ─── index validation + containment — defense-in-depth, do not weaken ────────────
// The index is the only thing that drives the on-disk lookup, so it must never contain a
// path separator, traversal token, slash homoglyph, null byte, or absolute-path marker —
// each guard mirrors the filename guards ported from MassImportService — and it must match
// the strict catalog shape so anything unexpected is skipped loudly rather than read.
private boolean isValidImportIndex(String index) {
if (index == null || index.isBlank()) return false;
if (index.contains("/")) return false;
if (index.contains("\\")) return false;
if (index.contains("")) return false; // U+2215 DIVISION SLASH
if (index.contains("")) return false; // U+FF0F FULLWIDTH SOLIDUS
if (index.contains("")) return false; // U+29F5 REVERSE SOLIDUS OPERATOR
if (index.contains(".")) return false; // no dots — "<index>.pdf" is the only extension
if (index.contains("\0")) return false;
if (Paths.get(index).isAbsolute()) return false;
return INDEX_PATTERN.matcher(index).matches();
}
// package-private: a Mockito spy in tests can override to inject IOException
InputStream openFileStream(File file) throws IOException {
return new FileInputStream(file);
}
private boolean isPdfMagicBytes(File file) throws IOException {
try (InputStream is = openFileStream(file)) {
byte[] header = is.readNBytes(4);
return header.length == 4
&& header[0] == 0x25 // %
&& header[1] == 0x50 // P
&& header[2] == 0x44 // D
&& header[3] == 0x46; // F
}
}
// O(1) direct lookup: the PDF is exactly importDir/<index>.pdf. The caller has already
// validated the index shape; the canonical-path containment assertion below is
// defense-in-depth so even a symlinked <index>.pdf cannot read outside importDir.
private Optional<File> resolvePdfByIndex(String index, int rowNumber) {
File baseDir = new File(importDir);
File candidate = baseDir.toPath().resolve(index + ".pdf").toFile();
try {
if (!candidate.isFile()) return Optional.empty();
String baseDirCanonical = baseDir.getCanonicalPath();
if (!candidate.getCanonicalPath().startsWith(baseDirCanonical + File.separator)) {
throw DomainException.internal(ErrorCode.INTERNAL_ERROR, "Path escape detected: " + candidate);
}
return Optional.of(candidate);
} catch (IOException e) {
// Distinct from the deliberate symlink-escape abort above (which throws): canonical
// resolution itself failed (e.g. the OS rejected the path mid-resolution). We fail
// safe to a PLACEHOLDER, but never silently — log it so the asymmetry surfaces in ops.
log.warn("Canonical path resolution failed for import row {}: treating {}.pdf as absent",
rowNumber, index, e);
return Optional.empty();
}
}
private static String blankToNull(String s) {
return (s == null || s.isBlank()) ? null : s;
}
}

View File

@@ -1,112 +0,0 @@
package org.raddatz.familienarchiv.importing;
import org.raddatz.familienarchiv.document.DatePrecision;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
/**
* Produces the honest German date label baked into an import title — at exactly
* the precision the data claims, never finer. This is the Java half of the
* single source of truth shared with the frontend {@code formatDocumentDate}
* (TypeScript): both are asserted against {@code docs/date-label-fixtures.json}
* so the two implementations cannot drift (see #666).
*
* <p>Import titles are always German, so the labels here are the German
* canonical form (mirroring the {@code de} Paraglide messages used by the UI).
*/
final class DocumentTitleFormatter {
private static final DateTimeFormatter LONG = DateTimeFormatter.ofPattern("d. MMMM yyyy", Locale.GERMAN);
private static final DateTimeFormatter MONTH_YEAR = DateTimeFormatter.ofPattern("MMMM yyyy", Locale.GERMAN);
private static final DateTimeFormatter MEDIUM = DateTimeFormatter.ofPattern("d. MMM yyyy", Locale.GERMAN);
private static final DateTimeFormatter DAY_MONTH = DateTimeFormatter.ofPattern("d. MMM", Locale.GERMAN);
private static final String UNKNOWN = "Datum unbekannt";
private static final String APPROX_PREFIX = "ca.";
private static final String OPEN_RANGE_PREFIX = "ab";
private DocumentTitleFormatter() {
}
/**
* @param date the sort/filter anchor day; null for UNKNOWN rows
* @param precision descriptive precision metadata
* @param end the RANGE end day; null means an open-ended range
* @param raw the verbatim spreadsheet cell, used only to pick a season word
* @return the honest German label
*/
static String formatTitleDate(LocalDate date, DatePrecision precision, LocalDate end, String raw) {
if (precision == DatePrecision.UNKNOWN || date == null) {
return UNKNOWN;
}
return switch (precision) {
case DAY -> LONG.format(date);
case MONTH -> MONTH_YEAR.format(date);
case SEASON -> seasonLabel(date, raw);
case YEAR -> String.valueOf(date.getYear());
case APPROX -> APPROX_PREFIX + " " + date.getYear();
case RANGE -> rangeLabel(date, end);
case UNKNOWN -> UNKNOWN;
};
}
private static String seasonLabel(LocalDate date, String raw) {
Season season = seasonFromRaw(raw);
if (season == null) {
season = seasonOfMonth(date.getMonthValue());
}
return season.german + " " + date.getYear();
}
private static String rangeLabel(LocalDate start, LocalDate end) {
if (end == null) {
return OPEN_RANGE_PREFIX + " " + MEDIUM.format(start);
}
if (end.equals(start)) {
return MEDIUM.format(start);
}
if (start.getYear() != end.getYear()) {
return MEDIUM.format(start) + " " + MEDIUM.format(end);
}
if (start.getMonthValue() == end.getMonthValue()) {
return start.getDayOfMonth() + "." + MEDIUM.format(end);
}
return DAY_MONTH.format(start) + " " + MEDIUM.format(end);
}
// ─── season mapping — mirrors the normalizer's representative months ─────────────
private enum Season {
SPRING("Frühling"),
SUMMER("Sommer"),
AUTUMN("Herbst"),
WINTER("Winter");
private final String german;
Season(String german) {
this.german = german;
}
}
private static Season seasonOfMonth(int month) {
if (month >= 3 && month <= 5) return Season.SPRING;
if (month >= 6 && month <= 8) return Season.SUMMER;
if (month >= 9 && month <= 11) return Season.AUTUMN;
return Season.WINTER;
}
private static Season seasonFromRaw(String raw) {
if (raw == null || raw.isBlank()) return null;
String token = raw.trim().split("\\s+")[0].toLowerCase(Locale.GERMAN);
return switch (token) {
case "frühling", "frühjahr" -> Season.SPRING;
case "sommer" -> Season.SUMMER;
case "herbst" -> Season.AUTUMN;
case "winter" -> Season.WINTER;
default -> null;
};
}
}

View File

@@ -1,50 +0,0 @@
package org.raddatz.familienarchiv.importing;
import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.annotation.JsonProperty;
import io.swagger.v3.oas.annotations.media.Schema;
import java.time.LocalDateTime;
import java.util.List;
/**
* Async import state surfaced to {@code admin/system/ImportStatusCard.svelte} via the
* generated types. The shape ({@code state, statusCode, processed, skippedFiles, skipped})
* is kept verbatim from the retired MassImportService so the admin UI keeps working.
*/
public record ImportStatus(
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) State state,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) String statusCode,
@JsonIgnore String message,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) int processed,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) List<SkippedFile> skippedFiles,
LocalDateTime startedAt
) {
public enum State { IDLE, RUNNING, DONE, FAILED }
public enum SkipReason {
INVALID_FILENAME_PATH_TRAVERSAL,
INVALID_PDF_SIGNATURE,
FILE_READ_ERROR,
ALREADY_EXISTS,
S3_UPLOAD_FAILED
}
public record SkippedFile(
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) String filename,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) SkipReason reason
) {}
// Note: @Schema on a record accessor method is not picked up by SpringDoc; the
// "skipped" count is a computed convenience field derived from skippedFiles.size().
@JsonProperty("skipped")
public int skipped() {
return skippedFiles.size();
}
/** Defensive-copy constructor — callers cannot mutate the stored list after construction. */
public ImportStatus {
skippedFiles = List.copyOf(skippedFiles);
}
}

View File

@@ -0,0 +1,390 @@
package org.raddatz.familienarchiv.importing;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.apache.poi.ss.usermodel.*;
import java.util.Objects;
import org.raddatz.familienarchiv.exception.DomainException;
import org.raddatz.familienarchiv.exception.ErrorCode;
import org.raddatz.familienarchiv.document.Document;
import org.raddatz.familienarchiv.document.DocumentService;
import org.raddatz.familienarchiv.document.DocumentStatus;
import org.raddatz.familienarchiv.document.ThumbnailAsyncRunner;
import org.raddatz.familienarchiv.person.Person;
import org.raddatz.familienarchiv.tag.Tag;
import org.raddatz.familienarchiv.person.Person;
import org.raddatz.familienarchiv.person.PersonNameParser;
import org.raddatz.familienarchiv.person.PersonService;
import org.raddatz.familienarchiv.tag.TagService;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.UUID;
import java.util.stream.Stream;
import java.util.zip.ZipFile;
@Service
@RequiredArgsConstructor
@Slf4j
public class MassImportService {
public enum State { IDLE, RUNNING, DONE, FAILED }
public record ImportStatus(State state, String message, int processed, LocalDateTime startedAt) {}
private volatile ImportStatus currentStatus = new ImportStatus(State.IDLE, "Kein Import gestartet.", 0, null);
public ImportStatus getStatus() {
return currentStatus;
}
private final DocumentService documentService;
private final PersonService personService;
private final TagService tagService;
private final S3Client s3Client;
private final ThumbnailAsyncRunner thumbnailAsyncRunner;
@Value("${app.s3.bucket}")
private String bucketName;
@Value("${app.import.col.index:0}")
private int colIndex;
@Value("${app.import.col.box:1}")
private int colBox;
@Value("${app.import.col.folder:2}")
private int colFolder;
@Value("${app.import.col.sender:3}")
private int colSender;
@Value("${app.import.col.receivers:5}")
private int colReceivers;
@Value("${app.import.col.date:7}")
private int colDate;
@Value("${app.import.col.location:9}")
private int colLocation;
@Value("${app.import.col.tags:10}")
private int colTags;
@Value("${app.import.col.summary:11}")
private int colSummary;
@Value("${app.import.col.transcription:13}")
private int colTranscription;
private static final String IMPORT_DIR = "/import";
private static final DateTimeFormatter GERMAN_DATE = DateTimeFormatter.ofPattern("d. MMMM yyyy", Locale.GERMAN);
// ODS XML namespaces
private static final String NS_TABLE = "urn:oasis:names:tc:opendocument:xmlns:table:1.0";
private static final String NS_TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";
// We only need up to this many columns; caps repeated-empty-cell expansion
private static final int MAX_COLS = 20;
@Async
public void runImportAsync() {
if (currentStatus.state() == State.RUNNING) {
throw DomainException.conflict(ErrorCode.IMPORT_ALREADY_RUNNING, "A mass import is already in progress");
}
currentStatus = new ImportStatus(State.RUNNING, "Import läuft...", 0, LocalDateTime.now());
try {
File spreadsheet = findSpreadsheetFile();
log.info("Starte Massenimport aus: {}", spreadsheet.getAbsolutePath());
int processed = processRows(readSpreadsheet(spreadsheet));
currentStatus = new ImportStatus(State.DONE,
"Import abgeschlossen. " + processed + " Dokumente verarbeitet.",
processed, currentStatus.startedAt());
} catch (Exception e) {
log.error("Massenimport fehlgeschlagen", e);
currentStatus = new ImportStatus(State.FAILED, "Fehler: " + e.getMessage(), 0, currentStatus.startedAt());
}
}
private File findSpreadsheetFile() throws IOException {
try (Stream<Path> files = Files.list(Paths.get(IMPORT_DIR))) {
return files
.filter(p -> {
String name = p.toString().toLowerCase();
return name.endsWith(".ods") || name.endsWith(".xlsx") || name.endsWith(".xls");
})
.findFirst()
.orElseThrow(() -> new RuntimeException(
"Keine Tabellendatei (.ods/.xlsx/.xls) in " + IMPORT_DIR + " gefunden!"))
.toFile();
}
}
// --- Spreadsheet reading (format-specific, produces neutral List<List<String>>) ---
private List<List<String>> readSpreadsheet(File file) throws Exception {
String name = file.getName().toLowerCase();
if (name.endsWith(".ods")) {
return readOds(file);
}
return readXlsx(file);
}
/**
* Reads an ODS file by parsing its content.xml directly (no extra library needed).
* ODS is a ZIP archive; content.xml holds the spreadsheet data as XML.
*/
private List<List<String>> readOds(File file) throws Exception {
List<List<String>> result = new ArrayList<>();
try (ZipFile zip = new ZipFile(file)) {
var entry = zip.getEntry("content.xml");
if (entry == null) throw new RuntimeException("Ungültige ODS-Datei: content.xml fehlt");
var factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
var builder = factory.newDocumentBuilder();
var doc = builder.parse(zip.getInputStream(entry));
NodeList tables = doc.getElementsByTagNameNS(NS_TABLE, "table");
if (tables.getLength() == 0) return result;
var table = (Element) tables.item(0);
NodeList rows = table.getElementsByTagNameNS(NS_TABLE, "table-row");
for (int i = 0; i < rows.getLength(); i++) {
var row = (Element) rows.item(i);
List<String> rowData = new ArrayList<>();
NodeList cells = row.getElementsByTagNameNS(NS_TABLE, "table-cell");
for (int j = 0; j < cells.getLength() && rowData.size() < MAX_COLS; j++) {
var cell = (Element) cells.item(j);
// Read the display text (first <text:p>)
String value = "";
NodeList textNodes = cell.getElementsByTagNameNS(NS_TEXT, "p");
if (textNodes.getLength() > 0) {
value = textNodes.item(0).getTextContent().trim();
}
// Expand number-columns-repeated (capped at MAX_COLS)
String repeatAttr = cell.getAttributeNS(NS_TABLE, "number-columns-repeated");
int repeat = repeatAttr.isEmpty() ? 1 : Integer.parseInt(repeatAttr);
repeat = Math.min(repeat, MAX_COLS - rowData.size());
for (int r = 0; r < repeat; r++) {
rowData.add(value);
}
}
result.add(rowData);
}
}
return result;
}
/** Reads an XLSX/XLS file using Apache POI. Converts all cells to strings. */
private List<List<String>> readXlsx(File file) throws Exception {
List<List<String>> result = new ArrayList<>();
try (FileInputStream fis = new FileInputStream(file);
Workbook workbook = WorkbookFactory.create(fis)) {
Sheet sheet = workbook.getSheetAt(0);
for (int i = 0; i <= sheet.getLastRowNum(); i++) {
Row row = sheet.getRow(i);
List<String> rowData = new ArrayList<>();
if (row != null) {
for (int j = 0; j < MAX_COLS; j++) {
rowData.add(xlsxCellToString(row.getCell(j)));
}
}
result.add(rowData);
}
}
return result;
}
private String xlsxCellToString(Cell cell) {
if (cell == null) return "";
return switch (cell.getCellType()) {
case STRING -> cell.getStringCellValue();
case NUMERIC -> {
if (DateUtil.isCellDateFormatted(cell)) {
yield cell.getLocalDateTimeCellValue().toLocalDate().toString(); // ISO
}
yield String.valueOf((int) cell.getNumericCellValue());
}
case BOOLEAN -> String.valueOf(cell.getBooleanCellValue());
default -> "";
};
}
// --- Import logic (works on neutral List<String> rows) ---
private int processRows(List<List<String>> rows) {
int count = 0;
for (int i = 1; i < rows.size(); i++) { // skip header row
List<String> cells = rows.get(i);
String index = getCell(cells, colIndex);
if (index.isBlank()) continue;
String filename = index.contains(".") ? index : index + ".pdf";
Optional<File> fileOnDisk = findFileRecursive(filename);
if (fileOnDisk.isEmpty()) {
log.warn("Datei nicht gefunden, importiere nur Metadaten: {}", filename);
}
importSingleDocument(cells, fileOnDisk, filename, index);
count++;
}
return count;
}
@Transactional
protected void importSingleDocument(List<String> cells, Optional<File> file, String originalFilename, String index) {
Optional<Document> existing = documentService.findByOriginalFilename(originalFilename);
if (existing.isPresent() && existing.get().getStatus() != DocumentStatus.PLACEHOLDER) {
log.info("Dokument {} existiert bereits, überspringe.", originalFilename);
return;
}
String archiveBox = getCell(cells, colBox);
String archiveFolder = getCell(cells, colFolder);
String senderRaw = getCell(cells, colSender);
String receiversRaw = getCell(cells, colReceivers);
LocalDate date = parseDate(getCell(cells, colDate));
String location = getCell(cells, colLocation);
String tagRaw = getCell(cells, colTags);
String summary = getCell(cells, colSummary);
String transcription = getCell(cells, colTranscription);
String s3Key = null;
String contentType = null;
DocumentStatus status = DocumentStatus.PLACEHOLDER;
if (file.isPresent()) {
try {
contentType = Files.probeContentType(file.get().toPath());
} catch (IOException e) {
contentType = null;
}
if (contentType == null) contentType = "application/octet-stream";
s3Key = "documents/" + UUID.randomUUID() + "_" + file.get().getName();
try {
s3Client.putObject(PutObjectRequest.builder()
.bucket(bucketName)
.key(s3Key)
.contentType(contentType)
.build(),
RequestBody.fromFile(file.get()));
status = DocumentStatus.UPLOADED;
} catch (Exception e) {
log.error("S3 Upload Fehler für {}", file.get().getName(), e);
return;
}
}
Person sender = senderRaw.isBlank() ? null : findOrCreatePerson(senderRaw);
List<Person> receivers = PersonNameParser.parseReceivers(receiversRaw).stream()
.map(this::findOrCreatePerson)
.filter(Objects::nonNull)
.toList();
Tag tag = null;
if (!tagRaw.isBlank()) {
tag = tagService.findOrCreate(tagRaw);
}
Document doc = existing.orElse(Document.builder()
.originalFilename(originalFilename)
.build());
// Heuristic: mark as complete if at least one key field is present in the spreadsheet row
boolean metadataComplete = date != null || !senderRaw.isBlank() || !receiversRaw.isBlank();
doc.setTitle(buildTitle(index, date, location));
doc.setFilePath(s3Key);
doc.setContentType(contentType);
doc.setStatus(status);
doc.setArchiveBox(archiveBox.isBlank() ? null : archiveBox);
doc.setArchiveFolder(archiveFolder.isBlank() ? null : archiveFolder);
doc.setDocumentDate(date);
doc.setLocation(location.isBlank() ? null : location);
doc.setSummary(summary.isBlank() ? null : summary);
doc.setTranscription(transcription.isBlank() ? null : transcription);
doc.setSender(sender);
doc.getReceivers().addAll(receivers);
if (tag != null) doc.getTags().add(tag);
doc.setMetadataComplete(metadataComplete);
Document saved = documentService.save(doc);
if (file.isPresent()) {
thumbnailAsyncRunner.dispatchAfterCommit(saved.getId());
}
log.info("Importiert{}: {}", file.isEmpty() ? " (nur Metadaten)" : "", originalFilename);
}
// --- Helpers ---
private String getCell(List<String> cells, int col) {
if (col >= cells.size()) return "";
String val = cells.get(col);
return val == null ? "" : val.trim();
}
private LocalDate parseDate(String value) {
if (value == null || value.isBlank()) return null;
try {
return LocalDate.parse(value.trim());
} catch (DateTimeParseException e) {
return null;
}
}
private String buildTitle(String index, LocalDate date, String location) {
StringBuilder sb = new StringBuilder(index);
if (date != null) {
sb.append(" \u2013 ").append(date.format(GERMAN_DATE));
}
if (location != null && !location.isBlank()) {
sb.append(" \u2013 ").append(location);
}
return sb.toString();
}
private Person findOrCreatePerson(String rawName) {
return personService.findOrCreateByAlias(rawName);
}
private Optional<File> findFileRecursive(String filename) {
try (Stream<Path> walk = Files.walk(Paths.get(IMPORT_DIR))) {
return walk.filter(p -> !Files.isDirectory(p))
.filter(p -> p.getFileName().toString().equals(filename))
.map(Path::toFile)
.findFirst();
} catch (IOException e) {
return Optional.empty();
}
}
}

View File

@@ -1,69 +0,0 @@
package org.raddatz.familienarchiv.importing;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.raddatz.familienarchiv.person.PersonService;
import org.raddatz.familienarchiv.person.PersonType;
import org.raddatz.familienarchiv.person.PersonUpsertCommand;
import org.springframework.stereotype.Component;
import java.io.File;
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.List;
/**
* Loads {@code canonical-persons.xlsx} (the register) into the person domain via
* {@link PersonService}, upserting each person by the normalizer {@code person_id}
* (source_ref). Register persons are confident identities, so {@code provisional} is
* driven by the sheet's already-clean value (normally {@code False}).
*/
@Component
@RequiredArgsConstructor
@Slf4j
public class PersonRegisterImporter {
static final List<String> REQUIRED_HEADERS = List.of("person_id", "last_name", "first_name", "provisional");
private final PersonService personService;
public int load(File artifact) {
List<CanonicalSheetReader.Row> rows = CanonicalSheetReader.readRows(artifact, REQUIRED_HEADERS);
int processed = 0;
for (CanonicalSheetReader.Row row : rows) {
String personId = row.get("person_id");
if (personId.isBlank()) continue;
personService.upsertBySourceRef(toCommand(row, personId));
processed++;
}
log.info("Imported {} register persons from {}", processed, artifact.getName());
return processed;
}
private PersonUpsertCommand toCommand(CanonicalSheetReader.Row row, String personId) {
return PersonUpsertCommand.builder()
.sourceRef(personId)
.lastName(blankToNull(row.get("last_name")))
.firstName(blankToNull(row.get("first_name")))
.maidenName(blankToNull(row.get("maiden_name")))
.notes(blankToNull(row.get("notes")))
.birthYear(yearOf(row.get("birth_date")))
.deathYear(yearOf(row.get("death_date")))
.personType(PersonType.PERSON)
.provisional(Boolean.parseBoolean(row.get("provisional")))
.build();
}
private static Integer yearOf(String isoDate) {
if (isoDate == null || isoDate.isBlank()) return null;
try {
return LocalDate.parse(isoDate.trim()).getYear();
} catch (DateTimeParseException e) {
return null;
}
}
private static String blankToNull(String s) {
return (s == null || s.isBlank()) ? null : s;
}
}

View File

@@ -1,135 +0,0 @@
package org.raddatz.familienarchiv.importing;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.raddatz.familienarchiv.exception.DomainException;
import org.raddatz.familienarchiv.exception.ErrorCode;
import org.raddatz.familienarchiv.person.Person;
import org.raddatz.familienarchiv.person.PersonService;
import org.raddatz.familienarchiv.person.PersonType;
import org.raddatz.familienarchiv.person.PersonUpsertCommand;
import org.raddatz.familienarchiv.person.relationship.RelationType;
import org.raddatz.familienarchiv.person.relationship.RelationshipService;
import org.raddatz.familienarchiv.person.relationship.dto.CreateRelationshipRequest;
import org.springframework.stereotype.Component;
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
/**
* Loads {@code canonical-persons-tree.json} into the person + relationship domains.
* Tree persons are upserted via {@link PersonService} keyed on the shared
* {@code personId} slug (which Phase 1 #670 now emits into the tree), so they reconcile
* with the register rather than duplicating it. Relationships reference persons by the
* tree's local {@code rowId}; each side is mapped to the upserted person's UUID and
* created through {@link RelationshipService} (never the relationship repository —
* layering rule). A duplicate relationship on re-import is swallowed for idempotency.
*/
@Component
@RequiredArgsConstructor
@Slf4j
public class PersonTreeImporter {
// The tree JSON is a local implementation detail, not a shared API payload, so the
// importer owns its own mapper rather than depending on the web ObjectMapper bean.
private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
private final PersonService personService;
private final RelationshipService relationshipService;
public int load(File artifact) {
JsonNode root = readTree(artifact);
Map<String, UUID> idByRowId = upsertPersons(root.path("persons"));
int relationships = createRelationships(root.path("relationships"), idByRowId);
log.info("Imported {} tree persons and {} relationships from {}",
idByRowId.size(), relationships, artifact.getName());
return idByRowId.size();
}
private JsonNode readTree(File artifact) {
try {
return OBJECT_MAPPER.readTree(artifact);
} catch (Exception e) {
throw DomainException.badRequest(ErrorCode.IMPORT_ARTIFACT_INVALID,
"Unreadable canonical artifact: " + artifact.getName());
}
}
private Map<String, UUID> upsertPersons(JsonNode persons) {
Map<String, UUID> idByRowId = new HashMap<>();
for (JsonNode node : persons) {
String personId = text(node, "personId");
if (personId.isBlank()) continue;
Person person = personService.upsertBySourceRef(toCommand(node, personId));
idByRowId.put(text(node, "rowId"), person.getId());
}
return idByRowId;
}
private PersonUpsertCommand toCommand(JsonNode node, String personId) {
return PersonUpsertCommand.builder()
.sourceRef(personId)
.lastName(blankToNull(text(node, "lastName")))
.firstName(blankToNull(text(node, "firstName")))
.maidenName(blankToNull(text(node, "maidenName")))
.notes(blankToNull(text(node, "notes")))
.birthYear(intOrNull(node, "birthYear"))
.deathYear(intOrNull(node, "deathYear"))
.familyMember(node.path("familyMember").asBoolean(false))
.personType(PersonType.PERSON)
.provisional(false)
.build();
}
private int createRelationships(JsonNode relationships, Map<String, UUID> idByRowId) {
int created = 0;
for (JsonNode node : relationships) {
// Trap: a relationship node's personId / relatedPersonId fields carry the tree's
// local rowId (e.g. "row_a"), NOT a person slug. They are resolved through
// idByRowId to the upserted person's UUID.
UUID person = idByRowId.get(text(node, "personId"));
UUID related = idByRowId.get(text(node, "relatedPersonId"));
if (person == null || related == null) {
log.warn("Skipping tree relationship with unresolved rowId: {} -> {}",
text(node, "personId"), text(node, "relatedPersonId"));
continue;
}
if (addRelationshipIdempotently(person, related, text(node, "type"))) {
created++;
}
}
return created;
}
private boolean addRelationshipIdempotently(UUID person, UUID related, String type) {
try {
relationshipService.addRelationship(person,
new CreateRelationshipRequest(related, RelationType.valueOf(type), null, null, null));
return true;
} catch (DomainException e) {
if (e.getCode() == ErrorCode.DUPLICATE_RELATIONSHIP
|| e.getCode() == ErrorCode.CIRCULAR_RELATIONSHIP) {
return false;
}
throw e;
}
}
private static String text(JsonNode node, String field) {
JsonNode value = node.get(field);
return value == null || value.isNull() ? "" : value.asText();
}
private static Integer intOrNull(JsonNode node, String field) {
JsonNode value = node.get(field);
return value == null || value.isNull() ? null : value.asInt();
}
private static String blankToNull(String s) {
return (s == null || s.isBlank()) ? null : s;
}
}

View File

@@ -1,54 +0,0 @@
package org.raddatz.familienarchiv.importing;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.raddatz.familienarchiv.tag.Tag;
import org.raddatz.familienarchiv.tag.TagService;
import org.springframework.stereotype.Component;
import java.io.File;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;
/**
* Loads {@code canonical-tag-tree.xlsx} into the tag domain via {@link TagService},
* upserting each tag by its canonical {@code tag_path} (the source_ref). Parent links are
* resolved by the parent's path, which is the child path with its last {@code /segment}
* stripped. Rows are emitted parents-first by the normalizer, so a parent is always
* resolved before any child references it.
*/
@Component
@RequiredArgsConstructor
@Slf4j
public class TagTreeImporter {
static final List<String> REQUIRED_HEADERS = List.of("tag_path", "parent_name", "tag_name");
private static final String PATH_SEPARATOR = "/";
private final TagService tagService;
public int load(File artifact) {
List<CanonicalSheetReader.Row> rows = CanonicalSheetReader.readRows(artifact, REQUIRED_HEADERS);
Map<String, UUID> idByPath = new HashMap<>();
int processed = 0;
for (CanonicalSheetReader.Row row : rows) {
String path = row.get("tag_path");
if (path.isBlank()) continue;
UUID parentId = resolveParentId(path, idByPath);
Tag tag = tagService.upsertBySourceRef(path, row.get("tag_name"), parentId);
idByPath.put(path, tag.getId());
processed++;
}
log.info("Imported {} tags from {}", processed, artifact.getName());
return processed;
}
private UUID resolveParentId(String path, Map<String, UUID> idByPath) {
int lastSeparator = path.lastIndexOf(PATH_SEPARATOR);
if (lastSeparator < 0) return null;
String parentPath = path.substring(0, lastSeparator);
return idByPath.get(parentPath);
}
}

View File

@@ -1,41 +0,0 @@
# notification
In-app messages delivered in real time via SSE and persisted in the bell-icon dropdown. Notifications are created by other domains in response to events (comment mentions, replies).
## What this domain owns
Entity: `Notification`.
Features: create and deliver notifications, unread count, mark-read, SSE real-time push, per-user delivery preferences (stored as fields on `AppUser`, managed by `user/`).
## What this domain does NOT own
- `AppUser` (recipient) — owned by `user/`
- `Document` or `Geschichte` (notification context) — referenced by ID only
## Public surface (called from other domains)
| Method | Consumer | Purpose |
|---|---|---|
| `notifyMentions(mentionedUserIds, comment)` | document (comment) | Push mention notifications when a comment contains @mentions |
| `notifyReply(reply, participantIds)` | document (comment) | Push reply notification to all thread participants |
| `countUnread(userId)` | user session | Unread badge count in the nav bar |
| `getNotifications(userId)` | dashboard / activity | Notification list for bell dropdown |
| `markRead(id)` / `markAllRead(userId)` | notification controller | User-driven read-state updates |
| `updatePreferences(userId, dto)` | notification controller | Per-user delivery preferences |
## Internal layout
- `NotificationController` — REST under `/api/notifications`
- `NotificationService` — create, query, mark-read
- `SseEmitterRegistry` — runtime-stateful component that keeps one `SseEmitter` per connected user. On `notifyMentions()` / `notifyReply()`, the service writes to `SseEmitterRegistry` to push real-time events. SSE connections go **backend → browser directly**, not via the SvelteKit SSR layer.
- `NotificationRepository` — persisted notification rows
- `NotificationPreferenceDTO` — read/write DTO for preference endpoints (prefs stored on `AppUser`)
## Cross-domain dependencies
**Outbound (this domain calls):**
- `DocumentService.findTitlesByIds(List<UUID>)` — enriches notification DTOs with document titles for display in the bell dropdown
## Frontend counterpart
`frontend/src/lib/notification/README.md`

View File

@@ -1,44 +0,0 @@
# ocr
OCR/HTR pipeline orchestration. This domain manages job lifecycle and result ingestion — it does **not** perform OCR. Actual text recognition runs in the Python `ocr-service/` container (port 8000, internal network only).
## What this domain owns
Entities: `OcrJob`, `OcrJobDocument`, `SenderModel`.
Features: start OCR jobs, track job lifecycle (`PENDING → RUNNING → DONE / FAILED`), stream transcription blocks back into `document/transcription/`, sender-model training, segmentation training.
## What this domain does NOT own
- Document content — `Document` and `TranscriptionBlock` are owned by `document/`
- File storage — presigned MinIO URLs are generated by `filestorage/FileService` and passed to the OCR service
- OCR processing — the Python `ocr-service/` executes Surya (typewritten) and Kraken (Kurrent/Sütterlin HTR) and streams results back
## Public surface (called from other domains)
| Method | Consumer | Purpose |
|---|---|---|
| `startOcr(documentId, ...)` | document | Trigger an OCR job for a document |
| `getJob(UUID)` | document | Fetch job status |
| `getDocumentOcrStatus(UUID)` | document | Per-document OCR status summary |
## Internal layout
- `OcrController` — REST under `/api/ocr`
- `OcrService` — job creation, presigned URL generation, result ingestion
- `OcrBatchService` — batch job workflows
- `OcrAsyncRunner``@Async` execution of OCR jobs
- `OcrTrainingService` — calls `/train` and `/segtrain` on the Python service (protected by `X-Training-Token` header)
- `OcrJobRepository` / `OcrJobDocumentRepository`
- `SenderModelRepository` — trained sender-recognition models
- `OcrClient` (interface) / `RestClientOcrClient` — HTTP client for the Python OCR service; mockable for tests
## Cross-domain dependencies
- `DocumentService.getDocumentById()` / `getDocumentsByIds()` — resolve target documents
- `DocumentService.addTrainingLabel()` — attach confirmed sender labels after training
- `FileService.generatePresignedUrl()` — generate MinIO presigned URLs passed to the OCR service (PDF bytes never flow through the backend)
- `AuditService.logAfterCommit()` — OCR job events are audited
## Frontend counterpart
`frontend/src/lib/ocr/README.md`

View File

@@ -1,7 +1,6 @@
package org.raddatz.familienarchiv.person; package org.raddatz.familienarchiv.person;
import com.fasterxml.jackson.annotation.JsonIgnore; import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import io.swagger.v3.oas.annotations.media.Schema; import io.swagger.v3.oas.annotations.media.Schema;
import jakarta.persistence.*; import jakarta.persistence.*;
import lombok.*; import lombok.*;
@@ -10,9 +9,6 @@ import org.raddatz.familienarchiv.user.DisplayNameFormatter;
import java.util.ArrayList; import java.util.ArrayList;
import java.util.List; import java.util.List;
import java.util.UUID; import java.util.UUID;
// prevents infinite recursion in JSON serialization; see ADR-022 for lazy-fetch context
@JsonIgnoreProperties({"hibernateLazyInitializer", "handler"})
@Entity @Entity
@Table(name = "persons") @Table(name = "persons")
@Data @Data
@@ -57,18 +53,6 @@ public class Person {
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) @Schema(requiredMode = Schema.RequiredMode.REQUIRED)
private boolean familyMember = false; private boolean familyMember = false;
// The normalizer person_id — join key and re-import idempotency key. Null for manually
// created persons; unique among non-null values (see ADR-025).
@Column(name = "source_ref")
private String sourceRef;
// A provisional person is one the importer inferred but could not confidently identify.
// Distinct from familyMember (a genealogical fact); set true only by the importer (Phase 3).
@Column(name = "provisional", nullable = false)
@Builder.Default
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
private boolean provisional = false;
// Entity-graph navigation for JPA JOIN queries (e.g. DocumentSpecifications.hasText). // Entity-graph navigation for JPA JOIN queries (e.g. DocumentSpecifications.hasText).
// Uses entity relationship rather than cross-domain repository access, avoiding a // Uses entity relationship rather than cross-domain repository access, avoiding a
// separate DB roundtrip while respecting domain boundaries. // separate DB roundtrip while respecting domain boundaries.

View File

@@ -22,15 +22,12 @@ import org.springframework.web.bind.annotation.*;
import org.springframework.web.server.ResponseStatusException; import org.springframework.web.server.ResponseStatusException;
import jakarta.validation.Valid; import jakarta.validation.Valid;
import jakarta.validation.constraints.Max;
import jakarta.validation.constraints.Min;
import lombok.RequiredArgsConstructor; import lombok.RequiredArgsConstructor;
@RestController @RestController
@RequestMapping("/api/persons") @RequestMapping("/api/persons")
@RequiredArgsConstructor @RequiredArgsConstructor
@Validated
public class PersonController { public class PersonController {
private final PersonService personService; private final PersonService personService;
@@ -38,37 +35,8 @@ public class PersonController {
@GetMapping @GetMapping
@RequirePermission(Permission.READ_ALL) @RequirePermission(Permission.READ_ALL)
public ResponseEntity<PersonSearchResult> getPersons( public ResponseEntity<List<PersonSummaryDTO>> getPersons(@RequestParam(required = false) String q) {
@RequestParam(required = false) String q, return ResponseEntity.ok(personService.findAll(q));
@RequestParam(required = false) PersonType type,
@RequestParam(required = false) Boolean familyOnly,
@RequestParam(required = false) Boolean hasDocuments,
@RequestParam(required = false) Boolean provisional,
// review=true reveals the import noise (transcriber view); absent/false keeps the
// clean reader default (familyMember OR documentCount > 0). The explicit filters AND
// within whichever base the review flag selects.
@RequestParam(required = false, defaultValue = "false") boolean review,
@RequestParam(required = false) String sort,
@RequestParam(defaultValue = "0") @Min(0) int page,
@RequestParam(defaultValue = "50") @Min(1) @Max(100) int size) {
// Legacy top-N-by-document-count path (reader dashboard): preserved, wrapped in the
// same envelope so /api/persons always returns one shape. It is explicitly NON-paged —
// the top-N query returns the complete result, so PersonSearchResult.topN reports an
// honest totalElements (= returned count) instead of pretending to be a page slice.
if ("documentCount".equals(sort) && q == null) {
int safeSize = Math.min(size, 50);
List<PersonSummaryDTO> top = personService.findTopByDocumentCount(safeSize);
return ResponseEntity.ok(PersonSearchResult.topN(top));
}
PersonFilter filter = PersonFilter.builder()
.type(type)
.familyOnly(familyOnly)
.hasDocuments(hasDocuments)
.provisional(provisional)
.readerDefault(!review)
.build();
return ResponseEntity.ok(personService.search(filter, page, size, q));
} }
@GetMapping("/{id}") @GetMapping("/{id}")
@@ -135,21 +103,6 @@ public class PersonController {
personService.mergePersons(id, UUID.fromString(targetIdStr)); personService.mergePersons(id, UUID.fromString(targetIdStr));
} }
// Dedicated state transition that clears the provisional flag. A separate verb (not a
// mass-assignable DTO field) so provisional can never be smuggled in via create/update.
@PatchMapping("/{id}/confirm")
@RequirePermission(Permission.WRITE_ALL)
public ResponseEntity<Person> confirmPerson(@PathVariable UUID id) {
return ResponseEntity.ok(personService.confirmPerson(id));
}
@DeleteMapping("/{id}")
@ResponseStatus(HttpStatus.NO_CONTENT)
@RequirePermission(Permission.WRITE_ALL)
public void deletePerson(@PathVariable UUID id) {
personService.deletePerson(id);
}
// ─── Alias endpoints ──────────────────────────────────────────────────── // ─── Alias endpoints ────────────────────────────────────────────────────
@GetMapping("/{id}/aliases") @GetMapping("/{id}/aliases")

View File

@@ -1,36 +0,0 @@
package org.raddatz.familienarchiv.person;
import lombok.Builder;
/**
* The reader/triage filter set for the persons directory, threaded as one value through
* {@code PersonController -> PersonService -> PersonRepository}. Each field is nullable:
* null means "do not constrain on this dimension".
*
* <ul>
* <li>{@code type} — restrict to a single {@link PersonType}.</li>
* <li>{@code familyOnly} — when true, only {@code familyMember} persons.</li>
* <li>{@code hasDocuments} — when true, only persons with documentCount &gt; 0.</li>
* <li>{@code provisional} — match the {@code Person.provisional} flag exactly.</li>
* <li>{@code readerDefault} — when true, restrict to {@code familyMember OR documentCount > 0}
* (the clean reader view). The explicit filters above AND with this restriction.</li>
* </ul>
*/
@Builder
public record PersonFilter(
PersonType type,
Boolean familyOnly,
Boolean hasDocuments,
Boolean provisional,
boolean readerDefault
) {
/** The unconstrained "show all" filter (transcriber view, no reader restriction). */
public static PersonFilter showAll() {
return PersonFilter.builder().readerDefault(false).build();
}
/** The clean reader default: familyMember OR documentCount &gt; 0, no other constraints. */
public static PersonFilter cleanDefault() {
return PersonFilter.builder().readerDefault(true).build();
}
}

View File

@@ -32,9 +32,6 @@ public interface PersonRepository extends JpaRepository<Person, UUID> {
// Lookup by full alias string, used during ODS mass import // Lookup by full alias string, used during ODS mass import
Optional<Person> findByAliasIgnoreCase(String alias); Optional<Person> findByAliasIgnoreCase(String alias);
// Lookup by the normalizer person_id, used for idempotent canonical re-import (Phase 3).
Optional<Person> findBySourceRef(String sourceRef);
// Exact first+last name match, used for filename-based sender lookup // Exact first+last name match, used for filename-based sender lookup
Optional<Person> findByFirstNameIgnoreCaseAndLastNameIgnoreCase(String firstName, String lastName); Optional<Person> findByFirstNameIgnoreCaseAndLastNameIgnoreCase(String firstName, String lastName);
@@ -44,7 +41,7 @@ public interface PersonRepository extends JpaRepository<Person, UUID> {
SELECT p.id, p.title, p.first_name AS firstName, p.last_name AS lastName, SELECT p.id, p.title, p.first_name AS firstName, p.last_name AS lastName,
p.person_type AS personType, p.person_type AS personType,
p.alias, p.birth_year AS birthYear, p.death_year AS deathYear, p.notes, p.alias, p.birth_year AS birthYear, p.death_year AS deathYear, p.notes,
p.family_member AS familyMember, p.provisional AS provisional, p.family_member AS familyMember,
(SELECT COUNT(*) FROM documents d WHERE d.sender_id = p.id) (SELECT COUNT(*) FROM documents d WHERE d.sender_id = p.id)
+ (SELECT COUNT(*) FROM document_receivers dr WHERE dr.person_id = p.id) AS documentCount + (SELECT COUNT(*) FROM document_receivers dr WHERE dr.person_id = p.id) AS documentCount
FROM persons p FROM persons p
@@ -57,7 +54,7 @@ public interface PersonRepository extends JpaRepository<Person, UUID> {
SELECT p.id, p.title, p.first_name AS firstName, p.last_name AS lastName, SELECT p.id, p.title, p.first_name AS firstName, p.last_name AS lastName,
p.person_type AS personType, p.person_type AS personType,
p.alias, p.birth_year AS birthYear, p.death_year AS deathYear, p.notes, p.alias, p.birth_year AS birthYear, p.death_year AS deathYear, p.notes,
p.family_member AS familyMember, p.provisional AS provisional, p.family_member AS familyMember,
(SELECT COUNT(*) FROM documents d WHERE d.sender_id = p.id) (SELECT COUNT(*) FROM documents d WHERE d.sender_id = p.id)
+ (SELECT COUNT(*) FROM document_receivers dr WHERE dr.person_id = p.id) AS documentCount + (SELECT COUNT(*) FROM document_receivers dr WHERE dr.person_id = p.id) AS documentCount
FROM persons p FROM persons p
@@ -66,83 +63,12 @@ public interface PersonRepository extends JpaRepository<Person, UUID> {
OR LOWER(CONCAT(p.last_name,' ',COALESCE(p.first_name,''))) LIKE LOWER(CONCAT('%',:query,'%')) OR LOWER(CONCAT(p.last_name,' ',COALESCE(p.first_name,''))) LIKE LOWER(CONCAT('%',:query,'%'))
OR LOWER(p.alias) LIKE LOWER(CONCAT('%',:query,'%')) OR LOWER(p.alias) LIKE LOWER(CONCAT('%',:query,'%'))
OR LOWER(a.last_name) LIKE LOWER(CONCAT('%',:query,'%')) OR LOWER(a.last_name) LIKE LOWER(CONCAT('%',:query,'%'))
GROUP BY p.id, p.title, p.first_name, p.last_name, p.person_type, p.alias, p.birth_year, p.death_year, p.notes, p.family_member, p.provisional GROUP BY p.id, p.title, p.first_name, p.last_name, p.person_type, p.alias, p.birth_year, p.death_year, p.notes, p.family_member
ORDER BY p.last_name ASC, p.first_name ASC ORDER BY p.last_name ASC, p.first_name ASC
""", """,
nativeQuery = true) nativeQuery = true)
List<PersonSummaryDTO> searchWithDocumentCount(@Param("query") String query); List<PersonSummaryDTO> searchWithDocumentCount(@Param("query") String query);
// ORDER BY uses the computed alias "documentCount" — valid PostgreSQL (aliases allowed in ORDER BY,
// unlike WHERE/HAVING). This is intentional; it would silently fail on MySQL or H2.
@Query(value = """
SELECT p.id, p.title, p.first_name AS firstName, p.last_name AS lastName,
p.person_type AS personType,
p.alias, p.birth_year AS birthYear, p.death_year AS deathYear, p.notes,
p.family_member AS familyMember, p.provisional AS provisional,
(SELECT COUNT(*) FROM documents d WHERE d.sender_id = p.id)
+ (SELECT COUNT(*) FROM document_receivers dr WHERE dr.person_id = p.id) AS documentCount
FROM persons p
ORDER BY documentCount DESC
LIMIT :limit
""",
nativeQuery = true)
List<PersonSummaryDTO> findTopByDocumentCount(@Param("limit") int limit);
// --- #667: filter-aware paged directory ---
//
// The slice query and the count query below MUST keep an IDENTICAL WHERE clause so the
// rendered page and totalElements can never drift. Every filter is nullable: a null param
// disables that predicate via the `:param IS NULL OR …` idiom. `readerDefault` (a plain
// boolean) restricts to "familyMember OR has documents"; the explicit filters AND on top.
// documentCount is recomputed inline (not via the SELECT alias) because WHERE cannot
// reference a computed alias. All params are named — no string concatenation, no injection.
String FILTER_WHERE = """
WHERE (CAST(:type AS text) IS NULL OR p.person_type = CAST(:type AS text))
AND (:familyOnly = FALSE OR :familyOnly IS NULL OR p.family_member = TRUE)
AND (:hasDocuments = FALSE OR :hasDocuments IS NULL OR (
(SELECT COUNT(*) FROM documents d WHERE d.sender_id = p.id)
+ (SELECT COUNT(*) FROM document_receivers dr WHERE dr.person_id = p.id)) > 0)
AND (:provisional IS NULL OR p.provisional = :provisional)
AND (:readerDefault = FALSE OR (
p.family_member = TRUE OR (
(SELECT COUNT(*) FROM documents d WHERE d.sender_id = p.id)
+ (SELECT COUNT(*) FROM document_receivers dr WHERE dr.person_id = p.id)) > 0))
AND (CAST(:query AS text) IS NULL OR
LOWER(CONCAT(COALESCE(p.first_name,''),' ',p.last_name)) LIKE LOWER(CONCAT('%',CAST(:query AS text),'%'))
OR LOWER(CONCAT(p.last_name,' ',COALESCE(p.first_name,''))) LIKE LOWER(CONCAT('%',CAST(:query AS text),'%'))
OR LOWER(p.alias) LIKE LOWER(CONCAT('%',CAST(:query AS text),'%')))
""";
@Query(value = """
SELECT p.id, p.title, p.first_name AS firstName, p.last_name AS lastName,
p.person_type AS personType,
p.alias, p.birth_year AS birthYear, p.death_year AS deathYear, p.notes,
p.family_member AS familyMember, p.provisional AS provisional,
(SELECT COUNT(*) FROM documents d WHERE d.sender_id = p.id)
+ (SELECT COUNT(*) FROM document_receivers dr WHERE dr.person_id = p.id) AS documentCount
FROM persons p
""" + FILTER_WHERE + """
ORDER BY p.last_name ASC, p.first_name ASC
LIMIT :limit OFFSET :offset
""",
nativeQuery = true)
List<PersonSummaryDTO> findByFilter(@Param("type") String type,
@Param("familyOnly") Boolean familyOnly,
@Param("hasDocuments") Boolean hasDocuments,
@Param("provisional") Boolean provisional,
@Param("readerDefault") boolean readerDefault,
@Param("query") String query,
@Param("limit") int limit,
@Param("offset") int offset);
@Query(value = "SELECT COUNT(*) FROM persons p " + FILTER_WHERE, nativeQuery = true)
long countByFilter(@Param("type") String type,
@Param("familyOnly") Boolean familyOnly,
@Param("hasDocuments") Boolean hasDocuments,
@Param("provisional") Boolean provisional,
@Param("readerDefault") boolean readerDefault,
@Param("query") String query);
// --- Correspondent queries --- // --- Correspondent queries ---
@Query(value = """ @Query(value = """
@@ -194,12 +120,6 @@ public interface PersonRepository extends JpaRepository<Person, UUID> {
@Query(value = "UPDATE documents SET sender_id = :target WHERE sender_id = :source", nativeQuery = true) @Query(value = "UPDATE documents SET sender_id = :target WHERE sender_id = :source", nativeQuery = true)
void reassignSender(@Param("source") UUID source, @Param("target") UUID target); void reassignSender(@Param("source") UUID source, @Param("target") UUID target);
// Used by deletePerson: detach a deleted person from documents they sent, so the hard
// delete cannot orphan a documents.sender_id FK (the column is nullable).
@Modifying
@Query(value = "UPDATE documents SET sender_id = NULL WHERE sender_id = :source", nativeQuery = true)
void reassignSenderToNull(@Param("source") UUID source);
@Modifying @Modifying
@Query(value = """ @Query(value = """
INSERT INTO document_receivers (document_id, person_id) INSERT INTO document_receivers (document_id, person_id)

View File

@@ -1,50 +0,0 @@
package org.raddatz.familienarchiv.person;
import io.swagger.v3.oas.annotations.media.Schema;
import java.util.List;
/**
* Paged result for the /api/persons list endpoint.
*
* <p>Hand-written to mirror {@code document/DocumentSearchResult} field-for-field so the
* frontend sees one paged shape across the app. Deliberately NOT Spring {@code Page<T>}
* (unstable serialized shape across Spring versions, noisy in OpenAPI) and deliberately
* NOT a reuse of the document DTO (would couple two feature modules — duplication beats
* coupling here).
*/
public record PersonSearchResult(
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
List<PersonSummaryDTO> items,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
long totalElements,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
int pageNumber,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
int pageSize,
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
int totalPages
) {
/**
* Paged factory: derives {@code totalPages} from the full match count and the page size.
* A zero count yields zero pages so the frontend hides the pagination control.
*/
public static PersonSearchResult paged(List<PersonSummaryDTO> slice, int pageNumber, int pageSize, long totalElements) {
int totalPages = pageSize == 0 ? 0 : (int) ((totalElements + pageSize - 1) / pageSize);
return new PersonSearchResult(slice, totalElements, pageNumber, pageSize, totalPages);
}
/**
* Non-paged factory for the legacy {@code sort=documentCount} top-N dashboard path.
* That query returns the <em>complete</em> result in one shot — there is no further page
* to fetch — so the envelope reports reality rather than pretending to be a slice of a
* larger set: {@code totalElements} equals the number of rows actually returned,
* {@code pageSize} equals that same count, and {@code totalPages} is 1 (or 0 when empty).
* This avoids the earlier ambiguity where {@code totalElements} looked like a paged total.
*/
public static PersonSearchResult topN(List<PersonSummaryDTO> all) {
int count = all.size();
int totalPages = count == 0 ? 0 : 1;
return new PersonSearchResult(all, count, 0, count, totalPages);
}
}

View File

@@ -31,53 +31,14 @@ public class PersonService {
private final PersonRepository personRepository; private final PersonRepository personRepository;
private final PersonNameAliasRepository aliasRepository; private final PersonNameAliasRepository aliasRepository;
public List<PersonSummaryDTO> findTopByDocumentCount(int limit) { public List<PersonSummaryDTO> findAll(String q) {
return personRepository.findTopByDocumentCount(limit); if (q == null) {
return personRepository.findAllWithDocumentCount();
} }
if (q.isBlank()) {
/** return List.of();
* Filtered, paginated directory query. The slice and the total are derived from one
* shared WHERE clause (see {@link PersonRepository#FILTER_WHERE}) so totalElements can
* never drift from the rendered page. {@code type} is passed as the enum name because the
* native query compares against the string column.
*/
public PersonSearchResult search(PersonFilter filter, int page, int size, String q) {
String type = filter.type() == null ? null : filter.type().name();
String query = (q == null || q.isBlank()) ? null : q.trim();
int offset = page * size;
List<PersonSummaryDTO> items = personRepository.findByFilter(
type, filter.familyOnly(), filter.hasDocuments(), filter.provisional(),
filter.readerDefault(), query, size, offset);
long total = personRepository.countByFilter(
type, filter.familyOnly(), filter.hasDocuments(), filter.provisional(),
filter.readerDefault(), query);
return PersonSearchResult.paged(items, page, size, total);
} }
return personRepository.searchWithDocumentCount(q.trim());
/**
* Clears the {@code provisional} flag — a deliberate state transition exposed as
* {@code PATCH /api/persons/{id}/confirm}, never as a mass-assignable DTO field (CWE-915).
*/
@Transactional
public Person confirmPerson(UUID id) {
Person person = getById(id);
person.setProvisional(false);
return personRepository.save(person);
}
/**
* Hard-deletes a person used by triage. Detaches the person from any documents they
* sent (nulls sender_id) and from any received-document references first, so the delete
* cannot orphan an FK and fail with a 500.
*/
@Transactional
public void deletePerson(UUID id) {
getById(id);
personRepository.reassignSenderToNull(id);
personRepository.deleteReceiverReferences(id);
personRepository.deleteById(id);
} }
public Person getById(UUID id) { public Person getById(UUID id) {
@@ -115,11 +76,6 @@ public class PersonService {
return personRepository.findByFirstNameIgnoreCaseAndLastNameIgnoreCase(firstName, lastName); return personRepository.findByFirstNameIgnoreCaseAndLastNameIgnoreCase(firstName, lastName);
} }
/** Lookup by the normalizer person_id — used by the canonical importer for register-first matching. */
public Optional<Person> findBySourceRef(String sourceRef) {
return personRepository.findBySourceRef(sourceRef);
}
@Nullable @Nullable
@Transactional @Transactional
public Person findOrCreateByAlias(String rawName) { public Person findOrCreateByAlias(String rawName) {
@@ -155,80 +111,6 @@ public class PersonService {
}); });
} }
/**
* Idempotent upsert keyed on {@code sourceRef} (the normalizer person_id) for the
* canonical importer (Phase 3, ADR-025). On first import the canonical fields are
* written verbatim. On re-import the human-edit-preserve precedence applies:
* a non-blank existing field is never overwritten, and {@code provisional} never
* flips back to true once a human has confirmed the person.
*/
@Transactional
public Person upsertBySourceRef(PersonUpsertCommand cmd) {
return personRepository.findBySourceRef(cmd.sourceRef())
.map(existing -> personRepository.save(mergeCanonical(existing, cmd)))
.orElseGet(() -> fromCanonical(cmd));
}
private Person fromCanonical(PersonUpsertCommand cmd) {
Person person = personRepository.save(Person.builder()
.sourceRef(cmd.sourceRef())
.firstName(blankToNull(cmd.firstName()))
.lastName(cmd.lastName())
.notes(blankToNull(cmd.notes()))
.birthYear(cmd.birthYear())
.deathYear(cmd.deathYear())
.familyMember(cmd.familyMember())
.personType(cmd.personType() == null ? PersonType.PERSON : cmd.personType())
.provisional(cmd.provisional())
.build());
String maiden = blankToNull(cmd.maidenName());
if (maiden != null) {
int nextSortOrder = aliasRepository.findMaxSortOrder(person.getId()) + 1;
aliasRepository.save(PersonNameAlias.builder()
.person(person)
.lastName(maiden)
.type(PersonNameAliasType.MAIDEN_NAME)
.sortOrder(nextSortOrder)
.build());
}
return person;
}
private Person mergeCanonical(Person existing, PersonUpsertCommand cmd) {
existing.setFirstName(preferHuman(existing.getFirstName(), cmd.firstName()));
existing.setLastName(preferHuman(existing.getLastName(), cmd.lastName()));
existing.setNotes(preferHuman(existing.getNotes(), cmd.notes()));
existing.setBirthYear(preferHuman(existing.getBirthYear(), cmd.birthYear()));
existing.setDeathYear(preferHuman(existing.getDeathYear(), cmd.deathYear()));
if (cmd.personType() != null && existing.getPersonType() == PersonType.PERSON) {
existing.setPersonType(cmd.personType());
}
// provisional is monotonic-downward: once it is false it never reverts to true.
// This also pins the cross-loader precedence (ADR-025): a register/tree person is
// loaded before documents and already false, so a later document row that references
// the same source_ref (provisional=true) can never flip it provisional — the guard
// below only fires while existing is still provisional. Order of document rows is
// therefore irrelevant.
if (existing.isProvisional()) {
existing.setProvisional(cmd.provisional());
}
return existing;
}
// preferHuman keeps an existing human-entered value and only falls back to the canonical
// value when the existing one is absent — the single idiom for every fill-blank field.
private static String preferHuman(String existing, String canonical) {
return (existing == null || existing.isBlank()) ? blankToNull(canonical) : existing;
}
private static Integer preferHuman(Integer existing, Integer canonical) {
return existing != null ? existing : canonical;
}
private static String blankToNull(String s) {
return (s == null || s.isBlank()) ? null : s.trim();
}
@Transactional @Transactional
public Person createPerson(String firstName, String lastName, String alias) { public Person createPerson(String firstName, String lastName, String alias) {
Person person = Person.builder() Person person = Person.builder()

View File

@@ -18,7 +18,6 @@ public interface PersonSummaryDTO {
Integer getDeathYear(); Integer getDeathYear();
String getNotes(); String getNotes();
boolean isFamilyMember(); boolean isFamilyMember();
boolean isProvisional();
long getDocumentCount(); long getDocumentCount();
default String getDisplayName() { default String getDisplayName() {

View File

@@ -1,24 +0,0 @@
package org.raddatz.familienarchiv.person;
import lombok.Builder;
/**
* Importer → {@link PersonService} command for an idempotent upsert keyed on
* {@code sourceRef} (the normalizer's stable person_id). Carries only the canonical
* fields the importer owns; the service applies the human-edit-preserve precedence
* (see ADR-025): non-blank existing fields are never overwritten, and {@code provisional}
* never flips back to true once a human has confirmed a person.
*/
@Builder
public record PersonUpsertCommand(
String sourceRef,
String firstName,
String lastName,
String maidenName,
String notes,
Integer birthYear,
Integer deathYear,
boolean familyMember,
PersonType personType,
boolean provisional
) {}

View File

@@ -1,45 +0,0 @@
# person
Historical individuals referenced by documents. A `Person` is a family member who appears as a sender or receiver in the archive — they are never login accounts.
## What this domain owns
Entities: `Person`, `PersonNameAlias`, `PersonRelationship`.
Features: person CRUD, name alias management, person merge (deduplication), family-member designation, relationship graph, person type classification (FAMILY, CORRESPONDENT, INSTITUTION).
## What this domain does NOT own
- `AppUser` — login accounts are in `user/`. A `Person` record has no login credentials. The separation is deliberate: a historical family member from 1905 is never a system user.
- Document content — `Person` records are referenced by documents (as sender/receiver), not the other way around.
- Relationship rendering — the Stammbaum view is derived by the frontend from `PersonRelationship` data.
## Public surface (called from other domains)
| Method | Consumer | Purpose |
|---|---|---|
| `getById(UUID)` | document, geschichte, ocr | Fetch one person by ID |
| `getAllById(List<UUID>)` | document | Bulk fetch for sender/receiver resolution |
| `findAll(String q)` | document, dashboard | List all persons |
| `findByName(String firstName, String lastName)` | document | Typeahead search |
| `findOrCreateByAlias(String rawName)` | importing | Idempotent create during mass import; type classification happens internally |
| `findAllFamilyMembers()` | dashboard | Family member list for stats |
| `findCorrespondents()` | document | Correspondent list for conversation filter |
| `count()` | dashboard | Total person count for stats |
## Internal layout
- `PersonController` — REST under `/api/persons`
- `PersonService` — CRUD, merge, alias management, family-member designation
- `PersonRepository` — sorted list, name search
- `PersonNameAlias` / `PersonNameAliasRepository` — alternative name spellings
- `PersonNameParser` / `PersonTypeClassifier` — name parsing utilities
- `PersonSummaryDTO` — lightweight DTO for typeahead / list views
- Sub-package: `relationship/``PersonRelationship`, `RelationshipService`, `RelationshipController`
## Cross-domain dependencies
- `AuditService.logAfterCommit()` — person mutations are audited
## Frontend counterpart
`frontend/src/lib/person/README.md`

View File

@@ -1,42 +1,24 @@
package org.raddatz.familienarchiv.security; package org.raddatz.familienarchiv.security;
import com.fasterxml.jackson.databind.ObjectMapper;
import lombok.RequiredArgsConstructor; import lombok.RequiredArgsConstructor;
import org.raddatz.familienarchiv.exception.ErrorCode;
import org.raddatz.familienarchiv.user.CustomUserDetailsService; import org.raddatz.familienarchiv.user.CustomUserDetailsService;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.context.annotation.Bean; import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration; import org.springframework.context.annotation.Configuration;
import org.springframework.core.annotation.Order;
import org.springframework.core.env.Environment; import org.springframework.core.env.Environment;
import org.springframework.security.authentication.AuthenticationManager;
import org.springframework.security.authentication.dao.DaoAuthenticationProvider; import org.springframework.security.authentication.dao.DaoAuthenticationProvider;
import org.springframework.security.config.annotation.authentication.configuration.AuthenticationConfiguration; import org.springframework.security.config.Customizer;
import org.springframework.security.config.annotation.web.builders.HttpSecurity; import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity; import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
import org.springframework.security.config.annotation.web.configurers.AbstractHttpConfigurer;
import org.springframework.security.crypto.bcrypt.BCryptPasswordEncoder; import org.springframework.security.crypto.bcrypt.BCryptPasswordEncoder;
import org.springframework.security.crypto.password.PasswordEncoder; import org.springframework.security.crypto.password.PasswordEncoder;
import org.springframework.security.web.SecurityFilterChain; import org.springframework.security.web.SecurityFilterChain;
import org.springframework.security.web.authentication.session.ChangeSessionIdAuthenticationStrategy;
import org.springframework.security.web.authentication.session.SessionAuthenticationStrategy;
import org.springframework.security.web.csrf.CookieCsrfTokenRepository;
import org.springframework.security.web.csrf.CsrfException;
import org.springframework.security.web.csrf.CsrfTokenRequestAttributeHandler;
import java.util.Map;
@Configuration @Configuration
@EnableWebSecurity @EnableWebSecurity
@RequiredArgsConstructor @RequiredArgsConstructor
public class SecurityConfig { public class SecurityConfig {
// @WebMvcTest slices do not include JacksonAutoConfiguration, so ObjectMapper
// cannot be injected here. A static instance is safe because the response
// only serializes fixed String keys — no custom naming strategy or module needed.
private static final ObjectMapper ERROR_WRITER = new ObjectMapper();
private final CustomUserDetailsService userDetailsService; private final CustomUserDetailsService userDetailsService;
private final Environment environment; private final Environment environment;
@@ -52,57 +34,20 @@ public class SecurityConfig {
return authProvider; return authProvider;
} }
@Bean
public AuthenticationManager authenticationManager(AuthenticationConfiguration config) throws Exception {
return config.getAuthenticationManager();
}
@Bean
public SessionAuthenticationStrategy sessionAuthenticationStrategy() {
// ChangeSessionIdAuthenticationStrategy rotates the session ID via the Servlet 3.1+
// HttpServletRequest.changeSessionId() — preserves attributes, mints a fresh ID.
// Used by AuthSessionController.login to defend against session fixation (CWE-384).
return new ChangeSessionIdAuthenticationStrategy();
}
@Bean
@Order(1)
public SecurityFilterChain managementFilterChain(HttpSecurity http) throws Exception {
http
.securityMatcher("/actuator/**")
.authorizeHttpRequests(auth -> {
// Health and Prometheus are open — Docker health checks and Prometheus scraping need no credentials.
auth.requestMatchers("/actuator/health", "/actuator/prometheus").permitAll();
// All other actuator endpoints (metrics, info, env, heapdump…) require authentication.
auth.anyRequest().authenticated();
})
// Explicitly return 401 for any unauthenticated actuator request.
// Without this override, Spring Security's DelegatingAuthenticationEntryPoint
// would redirect browser-like clients to the form-login page (302 → /login),
// making it impossible to distinguish "not authenticated" from "not found" in tests.
.exceptionHandling(ex -> ex.authenticationEntryPoint(
(req, res, e) -> res.setStatus(HttpServletResponse.SC_UNAUTHORIZED)))
.formLogin(AbstractHttpConfigurer::disable)
.csrf(AbstractHttpConfigurer::disable);
return http.build();
}
@Bean @Bean
public SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception { public SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {
http http
// CSRF protection via CookieCsrfTokenRepository (NFR-SEC-103). // CSRF is intentionally disabled: every request from the SvelteKit frontend
// The backend sets an XSRF-TOKEN cookie (not HttpOnly so JS can read it). // carries an explicit Authorization header (Basic Auth token injected by
// All state-changing requests must include X-XSRF-TOKEN matching the cookie. // hooks.server.ts). Browsers block cross-origin requests from setting custom
// See ADR-022 and issue #524 for the full security rationale. // headers, so cross-site request forgery via a third-party page is not
.csrf(csrf -> csrf // possible with this auth scheme. If the auth model ever changes to
.csrfTokenRepository(CookieCsrfTokenRepository.withHttpOnlyFalse()) // cookie-based sessions, CSRF protection must be re-enabled.
.csrfTokenRequestHandler(new CsrfTokenRequestAttributeHandler())) .csrf(csrf -> csrf.disable())
.authorizeHttpRequests(auth -> { .authorizeHttpRequests(auth -> {
// Actuator endpoints are governed by managementFilterChain (@Order(1)) above. // Health endpoint must be open so CI/Docker health checks work without credentials
auth.requestMatchers("/actuator/health", "/actuator/prometheus").permitAll(); auth.requestMatchers("/actuator/health").permitAll();
// Login is unauthenticated by definition
auth.requestMatchers("/api/auth/login").permitAll();
// Password reset endpoints are unauthenticated by nature // Password reset endpoints are unauthenticated by nature
auth.requestMatchers("/api/auth/forgot-password", "/api/auth/reset-password").permitAll(); auth.requestMatchers("/api/auth/forgot-password", "/api/auth/reset-password").permitAll();
// Invite-based registration endpoints are public // Invite-based registration endpoints are public
@@ -122,18 +67,9 @@ public class SecurityConfig {
// erlaubt pdf im Iframe // erlaubt pdf im Iframe
.headers(headers -> headers .headers(headers -> headers
.frameOptions(frameOptions -> frameOptions.sameOrigin())) .frameOptions(frameOptions -> frameOptions.sameOrigin()))
// Return 401 for unauthenticated requests; 403+CSRF_TOKEN_MISSING for CSRF failures. // Erlaubt Login via Browser-Popup oder REST-Header (Authorization: Basic ...)
.exceptionHandling(ex -> ex .httpBasic(Customizer.withDefaults())
.authenticationEntryPoint( .formLogin(form -> form.usernameParameter("email"));
(req, res, e) -> res.setStatus(HttpServletResponse.SC_UNAUTHORIZED))
.accessDeniedHandler((req, res, e) -> {
res.setStatus(HttpServletResponse.SC_FORBIDDEN);
res.setContentType("application/json;charset=UTF-8");
ErrorCode code = (e instanceof CsrfException)
? ErrorCode.CSRF_TOKEN_MISSING
: ErrorCode.FORBIDDEN;
res.getWriter().write(ERROR_WRITER.writeValueAsString(Map.of("code", code.name())));
}));
return http.build(); return http.build();
} }

View File

@@ -7,7 +7,6 @@ import org.springframework.security.core.Authentication;
import java.util.UUID; import java.util.UUID;
// Cross-cutting auth helper; no domain home — "Utils" is the correct suffix here.
public final class SecurityUtils { public final class SecurityUtils {
private SecurityUtils() {} private SecurityUtils() {}

View File

@@ -1,35 +0,0 @@
# tag
Hierarchical document categories. Tags form a tree via a self-referencing `parent_id` column and are applied to documents for filtering and browse navigation.
## What this domain owns
Entity: `Tag` (self-referencing `parent_id` tree).
Features: tag CRUD, hierarchical deletion (cascade to descendants), tag typeahead, admin tag management (rename, reparent, merge).
## What this domain does NOT own
- Documents — the `document_tags` join table is on the document side. `Tag` does not hold document references.
- Tag assignment — adding/removing a tag from a document is handled by `DocumentService`.
## Public surface (called from other domains)
| Method | Consumer | Purpose |
|---|---|---|
| `delete(UUID)` | document | Remove the tag record; called by `DocumentService.deleteTagCascading()` after all document references are unlinked |
| `deleteWithDescendants(UUID)` | admin tag UI | Recursive subtree deletion |
| `expandTagNamesToDescendantIdSets(List<String>)` | document | Expand tag filter to include descendant tags |
## Internal layout
- `TagController` — REST under `/api/tags`
- `TagService` — CRUD, hierarchy traversal, cascade-delete coordination
- `TagRepository` — find-or-create by name (case-insensitive), subtree queries
## Cross-domain dependencies
None. Documents reference tags; tags do not reference documents or other domains.
## Frontend counterpart
`frontend/src/lib/tag/README.md`

View File

@@ -2,13 +2,10 @@ package org.raddatz.familienarchiv.tag;
import java.util.UUID; import java.util.UUID;
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import io.swagger.v3.oas.annotations.media.Schema; import io.swagger.v3.oas.annotations.media.Schema;
import jakarta.persistence.*; import jakarta.persistence.*;
import lombok.*; import lombok.*;
// prevents infinite recursion in JSON serialization; see ADR-022 for lazy-fetch context
@JsonIgnoreProperties({"hibernateLazyInitializer", "handler"})
@Entity @Entity
@Data @Data
@NoArgsConstructor @NoArgsConstructor
@@ -30,11 +27,4 @@ public class Tag {
/** Color token name (e.g. "sage"), only set on root-level tags. Null means no color. */ /** Color token name (e.g. "sage"), only set on root-level tags. Null means no color. */
private String color; private String color;
/**
* Import identity key, keyed on the canonical tag_path. Null for manually created tags;
* unique among non-null values. The importer (Phase 3) uses it for idempotent re-import.
*/
@Column(name = "source_ref")
private String sourceRef;
} }

View File

@@ -22,9 +22,6 @@ public interface TagRepository extends JpaRepository<Tag, UUID> {
Optional<Tag> findByNameIgnoreCase(String name); Optional<Tag> findByNameIgnoreCase(String name);
// Lookup by the canonical tag_path, used for idempotent canonical re-import (Phase 3).
Optional<Tag> findBySourceRef(String sourceRef);
List<Tag> findByNameContainingIgnoreCase(String name); List<Tag> findByNameContainingIgnoreCase(String name);
/** /**

View File

@@ -7,7 +7,6 @@ import java.util.HashSet;
import java.util.LinkedHashMap; import java.util.LinkedHashMap;
import java.util.List; import java.util.List;
import java.util.Map; import java.util.Map;
import java.util.Optional;
import java.util.Set; import java.util.Set;
import java.util.UUID; import java.util.UUID;
import java.util.stream.Collectors; import java.util.stream.Collectors;
@@ -50,37 +49,12 @@ public class TagService {
.orElseThrow(() -> DomainException.notFound(ErrorCode.TAG_NOT_FOUND, "Tag not found: " + id)); .orElseThrow(() -> DomainException.notFound(ErrorCode.TAG_NOT_FOUND, "Tag not found: " + id));
} }
/** Lookup by the canonical tag_path — used by the canonical importer to attach a document's tag. */
public Optional<Tag> findBySourceRef(String sourceRef) {
return tagRepository.findBySourceRef(sourceRef);
}
public Tag findOrCreate(String name) { public Tag findOrCreate(String name) {
String cleanName = name.trim(); String cleanName = name.trim();
return tagRepository.findByNameIgnoreCase(cleanName) return tagRepository.findByNameIgnoreCase(cleanName)
.orElseGet(() -> tagRepository.save(Tag.builder().name(cleanName).build())); .orElseGet(() -> tagRepository.save(Tag.builder().name(cleanName).build()));
} }
/**
* Idempotent upsert keyed on {@code sourceRef} (the canonical tag_path) for the
* Phase-3 importer (ADR-025). On first import the canonical name and parent are
* written; on re-import a human-renamed tag name is preserved (the source_ref is the
* stable identity, the name is a human-editable label).
*/
@Transactional
public Tag upsertBySourceRef(String sourceRef, String name, UUID parentId) {
return tagRepository.findBySourceRef(sourceRef)
.map(existing -> {
existing.setParentId(parentId);
return tagRepository.save(existing);
})
.orElseGet(() -> tagRepository.save(Tag.builder()
.sourceRef(sourceRef)
.name(name)
.parentId(parentId)
.build()));
}
@Transactional @Transactional
public Tag update(UUID id, TagUpdateDTO dto) { public Tag update(UUID id, TagUpdateDTO dto) {
Tag tag = getById(id); Tag tag = getById(id);

View File

@@ -5,8 +5,7 @@ import org.raddatz.familienarchiv.security.Permission;
import org.raddatz.familienarchiv.security.RequirePermission; import org.raddatz.familienarchiv.security.RequirePermission;
import org.raddatz.familienarchiv.document.DocumentService; import org.raddatz.familienarchiv.document.DocumentService;
import org.raddatz.familienarchiv.document.DocumentVersionService; import org.raddatz.familienarchiv.document.DocumentVersionService;
import org.raddatz.familienarchiv.importing.CanonicalImportOrchestrator; import org.raddatz.familienarchiv.importing.MassImportService;
import org.raddatz.familienarchiv.importing.ImportStatus;
import org.raddatz.familienarchiv.document.ThumbnailBackfillService; import org.raddatz.familienarchiv.document.ThumbnailBackfillService;
import org.springframework.http.ResponseEntity; import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping; import org.springframework.web.bind.annotation.GetMapping;
@@ -22,20 +21,20 @@ import lombok.RequiredArgsConstructor;
@RequiredArgsConstructor @RequiredArgsConstructor
public class AdminController { public class AdminController {
private final CanonicalImportOrchestrator importOrchestrator; private final MassImportService massImportService;
private final DocumentService documentService; private final DocumentService documentService;
private final DocumentVersionService documentVersionService; private final DocumentVersionService documentVersionService;
private final ThumbnailBackfillService thumbnailBackfillService; private final ThumbnailBackfillService thumbnailBackfillService;
@PostMapping("/trigger-import") @PostMapping("/trigger-import")
public ResponseEntity<ImportStatus> triggerMassImport() { public ResponseEntity<MassImportService.ImportStatus> triggerMassImport() {
importOrchestrator.runImportAsync(); massImportService.runImportAsync();
return ResponseEntity.accepted().body(importOrchestrator.getStatus()); return ResponseEntity.accepted().body(massImportService.getStatus());
} }
@GetMapping("/import-status") @GetMapping("/import-status")
public ResponseEntity<ImportStatus> importStatus() { public ResponseEntity<MassImportService.ImportStatus> importStatus() {
return ResponseEntity.ok(importOrchestrator.getStatus()); return ResponseEntity.ok(massImportService.getStatus());
} }
@PostMapping("/backfill-versions") @PostMapping("/backfill-versions")

View File

@@ -88,8 +88,7 @@ public class AppUser {
}; };
public static String computeColor(UUID id) { public static String computeColor(UUID id) {
// Math.floorMod avoids the Integer.MIN_VALUE overflow trap in Math.abs(hashCode()) return PALETTE[Math.abs(id.hashCode()) % PALETTE.length];
return PALETTE[Math.floorMod(id.hashCode(), PALETTE.length)];
} }
@PrePersist @PrePersist

View File

@@ -31,6 +31,5 @@ public class InviteListItemDTO {
private String status; private String status;
@Schema(requiredMode = Schema.RequiredMode.REQUIRED) @Schema(requiredMode = Schema.RequiredMode.REQUIRED)
private LocalDateTime createdAt; private LocalDateTime createdAt;
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
private String shareableUrl; private String shareableUrl;
} }

View File

@@ -52,11 +52,7 @@ public class InviteService {
public InviteToken createInvite(CreateInviteRequest dto, AppUser creator) { public InviteToken createInvite(CreateInviteRequest dto, AppUser creator) {
Set<UUID> groupIds = new HashSet<>(); Set<UUID> groupIds = new HashSet<>();
if (dto.getGroupIds() != null && !dto.getGroupIds().isEmpty()) { if (dto.getGroupIds() != null && !dto.getGroupIds().isEmpty()) {
Set<UUID> uniqueIds = new HashSet<>(dto.getGroupIds()); List<UserGroup> groups = userService.findGroupsByIds(dto.getGroupIds());
List<UserGroup> groups = userService.findGroupsByIds(new ArrayList<>(uniqueIds));
if (groups.size() != uniqueIds.size()) {
throw DomainException.notFound(ErrorCode.GROUP_NOT_FOUND, "One or more group IDs do not exist");
}
groups.forEach(g -> groupIds.add(g.getId())); groups.forEach(g -> groupIds.add(g.getId()));
} }

View File

@@ -24,7 +24,4 @@ public interface InviteTokenRepository extends JpaRepository<InviteToken, UUID>
@Query("SELECT t FROM InviteToken t ORDER BY t.createdAt DESC") @Query("SELECT t FROM InviteToken t ORDER BY t.createdAt DESC")
List<InviteToken> findAllOrderedByCreatedAt(); List<InviteToken> findAllOrderedByCreatedAt();
@Query("SELECT CASE WHEN COUNT(t) > 0 THEN true ELSE false END FROM InviteToken t JOIN t.groupIds g WHERE g = :groupId AND t.revoked = false AND (t.expiresAt IS NULL OR t.expiresAt > CURRENT_TIMESTAMP) AND (t.maxUses IS NULL OR t.useCount < t.maxUses)")
boolean existsActiveWithGroupId(@Param("groupId") UUID groupId);
} }

View File

@@ -5,7 +5,6 @@ import java.time.LocalDateTime;
import java.util.HexFormat; import java.util.HexFormat;
import java.util.Optional; import java.util.Optional;
import org.raddatz.familienarchiv.auth.AuthService;
import org.raddatz.familienarchiv.user.ResetPasswordRequest; import org.raddatz.familienarchiv.user.ResetPasswordRequest;
import org.raddatz.familienarchiv.exception.DomainException; import org.raddatz.familienarchiv.exception.DomainException;
import org.raddatz.familienarchiv.exception.ErrorCode; import org.raddatz.familienarchiv.exception.ErrorCode;
@@ -33,7 +32,6 @@ public class PasswordResetService {
private final UserService userService; private final UserService userService;
private final PasswordResetTokenRepository tokenRepository; private final PasswordResetTokenRepository tokenRepository;
private final PasswordEncoder passwordEncoder; private final PasswordEncoder passwordEncoder;
private final AuthService authService;
@Autowired(required = false) @Autowired(required = false)
private JavaMailSender mailSender; private JavaMailSender mailSender;
@@ -87,8 +85,6 @@ public class PasswordResetService {
resetToken.setUsed(true); resetToken.setUsed(true);
tokenRepository.save(resetToken); tokenRepository.save(resetToken);
authService.revokeAllSessions(user.getEmail());
} }
/** /**

View File

@@ -1,35 +0,0 @@
# user
Login accounts and permission groups. An `AppUser` is a system user who can authenticate and act in the application — they are never a historical family member.
## What this domain owns
Entities: `AppUser`, `UserGroup`, password-reset tokens, invite tokens.
Features: user CRUD, group CRUD, password change, password reset flow, invite links.
## What this domain does NOT own
- `Person` records — historical family members. An `AppUser` is never linked to a `Person`. This separation is intentional: a person who digitized letters in 2024 is not the same entity as their great-grandmother who wrote them in 1912. See `docs/GLOSSARY.md`.
- Permission enforcement — `security/` owns `@RequirePermission` and `PermissionAspect`. `user/` only manages which permissions are stored on `UserGroup`.
## Public surface
`UserService` methods are consumed primarily by the security infrastructure and the admin UI. No other business-logic domain calls `UserService` directly.
The Spring Security chain (via `CustomUserDetailsService` in `security/`) calls `AppUserRepository.findByUsername()` on every authenticated request.
## Internal layout
- `UserController` — REST under `/api/users` (current user, CRUD)
- `AuthController` — password reset, invite flow
- `UserService` — BCrypt-encoded passwords, group assignment
- `AppUserRepository` — find by username (used by Spring Security)
- `UserGroupRepository` — group and permission management
## Cross-domain dependencies
- `AuditService.logAfterCommit()` — user-management mutations are audited
## Frontend counterpart
`frontend/src/lib/user/README.md`

View File

@@ -4,11 +4,7 @@ import java.util.List;
import java.util.Map; import java.util.Map;
import java.util.UUID; import java.util.UUID;
import jakarta.servlet.http.HttpSession;
import jakarta.validation.Valid; import jakarta.validation.Valid;
import org.raddatz.familienarchiv.audit.AuditKind;
import org.raddatz.familienarchiv.audit.AuditService;
import org.raddatz.familienarchiv.auth.AuthService;
import org.raddatz.familienarchiv.user.AdminUpdateUserRequest; import org.raddatz.familienarchiv.user.AdminUpdateUserRequest;
import org.raddatz.familienarchiv.user.ChangePasswordDTO; import org.raddatz.familienarchiv.user.ChangePasswordDTO;
import org.raddatz.familienarchiv.user.CreateUserRequest; import org.raddatz.familienarchiv.user.CreateUserRequest;
@@ -30,15 +26,13 @@ import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.ResponseStatus; import org.springframework.web.bind.annotation.ResponseStatus;
import org.springframework.web.bind.annotation.RestController; import org.springframework.web.bind.annotation.RestController;
import lombok.RequiredArgsConstructor; import lombok.AllArgsConstructor;
@RestController @RestController
@RequestMapping("/api/") @RequestMapping("/api/")
@RequiredArgsConstructor @AllArgsConstructor
public class UserController { public class UserController {
private final UserService userService; private UserService userService;
private final AuthService authService;
private final AuditService auditService;
@GetMapping("users/me") @GetMapping("users/me")
public ResponseEntity<AppUser> getCurrentUser(Authentication authentication) { public ResponseEntity<AppUser> getCurrentUser(Authentication authentication) {
@@ -62,14 +56,9 @@ public class UserController {
@PostMapping("users/me/password") @PostMapping("users/me/password")
@ResponseStatus(HttpStatus.NO_CONTENT) @ResponseStatus(HttpStatus.NO_CONTENT)
public void changePassword(Authentication authentication, public void changePassword(Authentication authentication,
HttpSession session,
@RequestBody ChangePasswordDTO dto) { @RequestBody ChangePasswordDTO dto) {
AppUser current = userService.findByEmail(authentication.getName()); AppUser current = userService.findByEmail(authentication.getName());
userService.changePassword(current.getId(), dto); userService.changePassword(current.getId(), dto);
int revoked = authService.revokeOtherSessions(session.getId(), authentication.getName());
auditService.log(AuditKind.LOGOUT, current.getId(), null, Map.of(
"reason", "password_change",
"revokedCount", revoked));
} }
@GetMapping("users/{id}") @GetMapping("users/{id}")
@@ -112,18 +101,6 @@ public class UserController {
return ResponseEntity.ok().build(); return ResponseEntity.ok().build();
} }
@PostMapping("/users/{id}/force-logout")
@RequirePermission(Permission.ADMIN_USER)
public ResponseEntity<Map<String, Object>> forceLogout(Authentication authentication,
@PathVariable UUID id) {
AppUser target = userService.getById(id);
int revoked = authService.revokeAllSessions(target.getEmail());
auditService.log(AuditKind.ADMIN_FORCE_LOGOUT, actorId(authentication), null, Map.of(
"targetUserId", target.getId().toString(),
"revokedCount", revoked));
return ResponseEntity.ok(Map.of("revokedCount", revoked));
}
private UUID actorId(Authentication auth) { private UUID actorId(Authentication auth) {
return userService.findByEmail(auth.getName()).getId(); return userService.findByEmail(auth.getName()).getId();
} }

Some files were not shown because too many files have changed in this diff Show More