Dokumenttitel automatisch mit Datum/Ort synchronisieren (Save-time + Backfill) (#726) #727

Merged
marcel merged 9 commits from feat/issue-726-auto-title-sync into main 2026-06-04 17:32:50 +02:00
Owner

Closes #726.

Document titles were strings built once at import ({index} – {dateLabel} – {location}) and never rebuilt, so correcting a date/location left a stale title behind. This makes titles follow date/location forward automatically, cleans the existing backlog once, and never overwrites a hand-written title.

What changed

A — Single source of truth (FR-TITLE-001)

  • New DocumentTitleFactory (@Component, document package) owns the {index} – {dateLabel} – {location} formula; DocumentImporter consumes it. DocumentTitleFormatter moved into document (package-private) with its #666 fixture-parity test.

B — Save-time regeneration, exact match (FR-TITLE-002)

  • updateDocument captures autoTitleBefore from the persisted state before any setter; resolveTitle + projectedState rebuild only when the submitted title still equals it. Hand-written/freshly-typed titles kept, blank never persisted, file-replaced docs treated as manual. projectedState mirrors the date/location-overwrite vs precision-skip-null asymmetry.

C — One-time backlog cleanup (FR-TITLE-003/004)

  • POST /api/admin/backfill-titles (synchronous, ADMIN-only) → BackfillResult(count).
  • DocumentTitleBackfillMatcher: literal startsWith index (no ReDoS / regex injection, CWE-1333), date-label forms derived from the same Locale.GERMAN formatters as the factory (no drift), prose left untouched, fail-closed.
  • Saves via the repository directly (no recordVersion → no version-spam), idempotent, logs scanned/updated/skipped.

D — Edit-form feedback (FR-TITLE-005)

  • Helper line under the title input in DescriptionSection.svelte (single-edit only, via showTitleHelp), aria-describedby-wired, text-ink-3 (AA on bg-surface). New Paraglide key form_helper_title_autogenerated in de/en/es.

Docs — ADR-031, GLOSSARY.md auto-generated title, l3-backend-3b-document-management.puml, backend/api_tests/ runbook entry.

Tests

  • Unit: factory composition, all save-time gherkin scenarios, the backfill heuristic in isolation (every date-label form + regex-metacharacter index), controller permission slice.
  • Integration (Testcontainers postgres:16): stale row fixed · idempotent (2nd run count==0) · prose skipped · no document_versions rows added.
  • Frontend: DescriptionSection helper component test + one Playwright E2E (edit date → title follows on detail page).

293 tests green across the touched classes; clean package builds; frontend lint clean.

Operator note

Run the cleanup once after deploy: POST http://<backend>:8080/api/admin/backfill-titles (ADMIN, direct to port 8080 to bypass the SvelteKit proxy timeout). Idempotent — a second run returns {"count": 0}.

🤖 Generated with Claude Code

Closes #726. Document titles were strings built once at import (`{index} – {dateLabel} – {location}`) and never rebuilt, so correcting a date/location left a stale title behind. This makes titles follow date/location forward automatically, cleans the existing backlog once, and never overwrites a hand-written title. ## What changed **A — Single source of truth (FR-TITLE-001)** - New `DocumentTitleFactory` (`@Component`, `document` package) owns the `{index} – {dateLabel} – {location}` formula; `DocumentImporter` consumes it. `DocumentTitleFormatter` moved into `document` (package-private) with its #666 fixture-parity test. **B — Save-time regeneration, exact match (FR-TITLE-002)** - `updateDocument` captures `autoTitleBefore` from the persisted state before any setter; `resolveTitle` + `projectedState` rebuild only when the submitted title still equals it. Hand-written/freshly-typed titles kept, blank never persisted, file-replaced docs treated as manual. `projectedState` mirrors the date/location-overwrite vs precision-skip-null asymmetry. **C — One-time backlog cleanup (FR-TITLE-003/004)** - `POST /api/admin/backfill-titles` (synchronous, ADMIN-only) → `BackfillResult(count)`. - `DocumentTitleBackfillMatcher`: literal `startsWith` index (no ReDoS / regex injection, CWE-1333), date-label forms derived from the same `Locale.GERMAN` formatters as the factory (no drift), prose left untouched, fail-closed. - Saves via the repository directly (no `recordVersion` → no version-spam), idempotent, logs `scanned/updated/skipped`. **D — Edit-form feedback (FR-TITLE-005)** - Helper line under the title input in `DescriptionSection.svelte` (single-edit only, via `showTitleHelp`), `aria-describedby`-wired, `text-ink-3` (AA on `bg-surface`). New Paraglide key `form_helper_title_autogenerated` in de/en/es. **Docs** — ADR-031, `GLOSSARY.md` *auto-generated title*, `l3-backend-3b-document-management.puml`, `backend/api_tests/` runbook entry. ## Tests - Unit: factory composition, all save-time gherkin scenarios, the backfill heuristic in isolation (every date-label form + regex-metacharacter index), controller permission slice. - Integration (Testcontainers `postgres:16`): stale row fixed · idempotent (2nd run `count==0`) · prose skipped · **no `document_versions` rows added**. - Frontend: `DescriptionSection` helper component test + one Playwright E2E (edit date → title follows on detail page). 293 tests green across the touched classes; `clean package` builds; frontend lint clean. ## Operator note Run the cleanup **once** after deploy: `POST http://<backend>:8080/api/admin/backfill-titles` (ADMIN, direct to port 8080 to bypass the SvelteKit proxy timeout). Idempotent — a second run returns `{"count": 0}`. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
marcel added 6 commits 2026-06-04 16:59:05 +02:00
Move DocumentTitleFormatter from importing into the document package and
introduce DocumentTitleFactory there as the single source of truth for the
{index} – {dateLabel} – {location} formula. DocumentImporter now consumes the
factory instead of owning the composition; the document package owns the rule,
importing depends on it (not the reverse). No behavioral change — importer
title assertions and the #666 fixture parity test stay green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
updateDocument now captures the machine title from the persisted state before any
setter runs, and rebuilds it from the new state only when the submitted title still
equals that machine value — an exact comparison that relies on the edit form
round-tripping an untouched title verbatim. A hand-written or freshly-typed title is
kept; a blank submission falls back to the rebuilt auto-title (title is always present);
a file-replaced document no longer matches its import-time title and is treated as
manual. projectedState mirrors the setter asymmetry exactly (date/location overwrite
incl. null-clear; precision/end/raw skip-null from the entity).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds POST /api/admin/backfill-titles (ADMIN-only, synchronous) which rebuilds every
machine-generated title from the row's current state. A grammar heuristic
(DocumentTitleBackfillMatcher) decides overwritability: index matched literally via
startsWith (originalFilename is user-controlled — no regex injection / ReDoS, CWE-1333),
date-label forms derived from the same Locale.GERMAN formatters as the factory so they
cannot drift, prose left untouched, fail-closed on any surprise. Saves via the repository
directly (no recordVersion — follows backfillFileHashes), so the mechanical rename never
version-spams document_versions. Idempotent: a second run rewrites nothing. Emits one
SLF4J-parameterized scanned/updated/skipped line.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pins backfill behaviour on postgres:16-alpine (H2 unusable — title is NOT NULL): a stale
auto-title is rewritten, the sweep is idempotent (second run touches nothing), prose is
left alone, and the mechanical rename adds no document_versions rows. Permission (401/403)
stays in the faster @WebMvcTest slice.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds the FR-TITLE-005 helper line under the title input in DescriptionSection, shown only
on the single-document edit form via a new showTitleHelp prop (off for the new-document and
bulk-edit forms). It is wired to the input with aria-describedby and uses text-ink-3 (WCAG AA
on bg-surface). New Paraglide key form_helper_title_autogenerated in de/en/es. Adds a
component test for the helper + aria wiring and an end-to-end pass: create an auto-titled doc,
edit its date, and see the title follow on the detail page.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
docs(document): ADR-031 + glossary/c4/api_tests for auto-title sync (#726)
Some checks failed
CI / Unit & Component Tests (pull_request) Failing after 2m32s
CI / OCR Service Tests (pull_request) Successful in 26s
CI / Backend Unit Tests (pull_request) Successful in 3m35s
CI / fail2ban Regex (pull_request) Successful in 44s
CI / Semgrep Security Scan (pull_request) Successful in 22s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m6s
cf457cb96f
ADR-031 records the shared document-package title factory, the exact-match save-time
regeneration, and the grammar-heuristic one-time backfill (with the ReDoS / no-version-spam
/ file-replace-is-manual decisions). Adds an "auto-generated title" glossary entry, extends
the document-management c4 diagram with DocumentTitleFactory / DocumentTitleBackfillMatcher
and the backfill flows, and documents POST /api/admin/backfill-titles in Admin-Auth.http as
a one-shot ADMIN call hitting port 8080 directly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Author
Owner

⚙️ Tobias Wendt — DevOps & Platform Engineer

Verdict: ⚠️ Approved with concerns

The operational story is genuinely good for a one-shot maintenance endpoint: it reuses the established backfill-* precedent, is ADMIN-gated, idempotent, fail-closed, logs scanned/updated/skipped, and ships a runbook entry. The ReDoS guard (literal startsWith, Pattern.quoted month tokens, bounded quantifiers) is exactly the kind of thing that bites you in prod, and it's handled. My concerns are about how the full-corpus sweep behaves under a transaction on the request thread, not about correctness.

Blockers

None. This is shippable as a one-time operator action.

Suggestions

  1. Whole-corpus load inside one transaction — watch the heap and the lock window. DocumentService.backfillTitles() does documentRepository.findAll() (line ~1072) and iterates with per-row save(), all inside a single @Transactional. For the current archive (low thousands of rows) this is fine, but it means: the entire documents table is materialised into the persistence context at once, and a single long-lived transaction holds for the duration of the sweep. Two pragmatic asks: (a) confirm the realistic corpus size keeps the heap comfortable on the CX32 (8GB) target, and (b) if it grows, switch to a paged/streamed sweep or chunked transactions. Not a blocker today — just don't let findAll() become the default pattern as the corpus grows. The backfillFileHashes precedent at least filters (findByFileHashIsNull...); this one scans everything by design, so it's the heaviest of the three.

  2. Synchronous on the request thread + the port-8080 note. The runbook correctly says "hit port 8080 directly, NOT through the SvelteKit proxy, so the sweep can't trip the proxy timeout." Good catch — but that's a smell worth flagging: it means the endpoint's runtime can exceed the proxy timeout, and there's no upstream limit if the corpus grows. The async pattern already exists in this very controller (runImportAsync, runBackfillAsync + a status endpoint). For a one-shot today, synchronous is acceptable and simpler (KISS), and I'd rather not over-engineer it — but note the inconsistency: the two heaviest admin operations in AdminController are async with status polling, and the unbounded-by-design one is synchronous. If a real archive ever makes this run for minutes, it should move to the async+status pattern, not stay reliant on operators remembering the port-8080 workaround.

  3. Runbook lives in a REST Client .http file — fine, but verify it reaches the operator. The backend/api_tests/Admin-Auth.http entry is clear and well-commented (one-shot, idempotent, returns {count}, direct port 8080). My only worry is discoverability: a .http test file isn't where someone reads a deploy procedure. Confirm this one-time step is also referenced in the actual deploy/release notes for #726 so it isn't missed post-deploy. An auto-title sync that's coded but never triggered leaves the backlog stale.

  4. No migration — correct, and worth stating. This is intentionally a data backfill via endpoint rather than a Flyway migration, which is the right call (the rebuild depends on application-layer formatters, not pure SQL). No action needed; noting it so the "where's the migration?" question is answered: there deliberately isn't one, and the idempotency makes a re-run safe if an operator is unsure whether it already ran.

Nice work on the fail-closed matcher and the version-spam avoidance — those are the two things that would have caused an incident, and both are handled.

## ⚙️ Tobias Wendt — DevOps & Platform Engineer **Verdict: ⚠️ Approved with concerns** The operational story is genuinely good for a one-shot maintenance endpoint: it reuses the established `backfill-*` precedent, is ADMIN-gated, idempotent, fail-closed, logs `scanned/updated/skipped`, and ships a runbook entry. The ReDoS guard (literal `startsWith`, `Pattern.quote`d month tokens, bounded quantifiers) is exactly the kind of thing that bites you in prod, and it's handled. My concerns are about how the full-corpus sweep behaves under a transaction on the request thread, not about correctness. ### Blockers None. This is shippable as a one-time operator action. ### Suggestions 1. **Whole-corpus load inside one transaction — watch the heap and the lock window.** `DocumentService.backfillTitles()` does `documentRepository.findAll()` (line ~1072) and iterates with per-row `save()`, all inside a single `@Transactional`. For the current archive (low thousands of rows) this is fine, but it means: the entire `documents` table is materialised into the persistence context at once, and a single long-lived transaction holds for the duration of the sweep. Two pragmatic asks: (a) confirm the realistic corpus size keeps the heap comfortable on the CX32 (8GB) target, and (b) if it grows, switch to a paged/streamed sweep or chunked transactions. Not a blocker today — just don't let `findAll()` become the default pattern as the corpus grows. The `backfillFileHashes` precedent at least filters (`findByFileHashIsNull...`); this one scans everything by design, so it's the heaviest of the three. 2. **Synchronous on the request thread + the port-8080 note.** The runbook correctly says "hit port 8080 directly, NOT through the SvelteKit proxy, so the sweep can't trip the proxy timeout." Good catch — but that's a smell worth flagging: it means the endpoint's runtime can exceed the proxy timeout, and there's no upstream limit if the corpus grows. The async pattern already exists in this very controller (`runImportAsync`, `runBackfillAsync` + a status endpoint). For a one-shot today, synchronous is acceptable and simpler (KISS), and I'd rather not over-engineer it — but note the inconsistency: the two heaviest admin operations in `AdminController` are async with status polling, and the unbounded-by-design one is synchronous. If a real archive ever makes this run for minutes, it should move to the async+status pattern, not stay reliant on operators remembering the port-8080 workaround. 3. **Runbook lives in a REST Client `.http` file — fine, but verify it reaches the operator.** The `backend/api_tests/Admin-Auth.http` entry is clear and well-commented (one-shot, idempotent, returns `{count}`, direct port 8080). My only worry is discoverability: a `.http` test file isn't where someone reads a deploy procedure. Confirm this one-time step is also referenced in the actual deploy/release notes for #726 so it isn't missed post-deploy. An auto-title sync that's coded but never triggered leaves the backlog stale. 4. **No migration — correct, and worth stating.** This is intentionally a data backfill via endpoint rather than a Flyway migration, which is the right call (the rebuild depends on application-layer formatters, not pure SQL). No action needed; noting it so the "where's the migration?" question is answered: there deliberately isn't one, and the idempotency makes a re-run safe if an operator is unsure whether it already ran. Nice work on the fail-closed matcher and the version-spam avoidance — those are the two things that would have caused an incident, and both are handled.
Author
Owner

👨‍💻 Felix Brandt — Senior Fullstack Developer

Verdict: Approved

I read the full diff (24 files, +1037/-33) and the touched files on disk. This is the cleanest PR I've reviewed in a while — the TDD evidence is right there in the structure, the naming carries the intent, and the security framing is honest rather than decorative.

TDD evidence

  • Every new behaviour has a paired test, and the tests describe behaviour in their names (updateDocument_regeneratesAutoTitle_whenDateChanges, backfillTitles_neverRecordsVersions, index_with_regex_metacharacters_is_matched_literally_and_terminates). The save-time block covers the full matrix the persona asks for: happy path, hand-written kept, freshly-typed wins, location-cleared, precision change, blank-never-persisted, file-replaced-as-manual, idempotent. The empty/edge states for the matcher (null title, null/blank index, prose, location-only) are all present and fail-closed.
  • @Spy DocumentTitleFactory = new DocumentTitleFactory() in the service test and the real factory in DocumentImporterTest is the right call — the single source of truth is exercised, not stubbed away, so a drift in the formula breaks the tests. Good.
  • The integration test (@SpringBootTest + PostgresContainerConfig) is correctly scoped to what H2 can't prove (NOT NULL title + version-row count), and the comment explicitly defers permission enforcement to the faster @WebMvcTest slice (AdminControllerTest: 401/403/200). That's the test-slice discipline I look for.
  • @Timeout(5s) on the ReDoS test is exactly how you prove the literal-prefix claim instead of asserting it in prose.

Clean code / naming / function size

  • DocumentTitleFactory.build, resolveTitle, projectedState, DocumentTitleBackfillMatcher.isOverwritable — every name answers "what does this represent?" without tracing usage. resolveTitle is a guard-clause ladder with the happy path last, under 10 lines. backfillTitles orchestrates and stays readable.
  • Extracting the formula out of importing into the document package, package-private DocumentTitleFormatter, factory as the only seam — the dependency direction (document owns, importing consumes) is correct and the dead private buildTitle was deleted, not commented out.
  • Objects.equals import is present (verified on disk); no Optional.get(), no raw exceptions, @Transactional only on the two write sweeps. backfillTitles saving via documentRepository.save directly (no recordVersion) following the backfillFileHashes precedent is the right reliability call and is pinned by backfillTitles_neverRecordsVersions + the integration addsNoDocumentVersionRows.

Svelte 5

  • DescriptionSection uses $props() with a typed shape and $derived (titleValue) — no $effect-to-compute, no plain Map/Set, no unkeyed {#each} introduced. The helper is gated on showTitleHelp and wired via aria-describedby (conditional undefined when off). Verified bulk-edit passes hideTitle and never showTitleHelp (defaults false), and the component test pins that hiding the title suppresses the helper — so the helper is single-edit-only by construction, not by accident. i18n key added to all three locales.

Blockers

None.

Suggestions (non-blocking, take or leave)

  1. build(Document) takes the whole entity where the formula needs four fields. My persona's own rule is "pass specific props, not the entire object". Here the entity is the natural domain aggregate and projectedState builds a partial Document precisely to feed it, so I'd keep it — but it does mean a synthetic/partial Document is now a valid input the factory must tolerate (hence the originalFilename == null guard). That guard is load-bearing; a one-line test asserting build() on a bare Document.builder().build() returns "" rather than NPEs would lock that contract down. Minor.
  2. The matcher's (?: – .+)?$ trailing-location peel is greedy over a range label that itself contains . range_with_internal_separator_plus_trailing_location proves it works for the cross-month case, but the accepted FR-004 trade ({index} – {valid date label} – {anything} → rewrite the trailing segment) means a hand-written suffix on a dated title can be dropped on backfill. The ADR documents this as accepted and it's a one-time sweep, so I'm not blocking — just flagging that this is the single place where the heuristic can lose human text. The operator runbook note (run once, idempotent, check count) is the right mitigation.
  3. projectedState mirrors updateDocument's setter asymmetry by hand. The Javadoc says it's kept "lock-step" and a mismatch "silently produces a wrong label". That's a real coupling risk if applyDatePrecision ever changes. The season round-trip and precision-change tests cover the current behaviour; if this asymmetry grows, consider extracting the projection so the real setter path and projectedState share one mapping. Today it's fine.

Docs are complete (ADR-031, GLOSSARY, the L3 PUML, the api_tests runbook). Ship it.

## 👨‍💻 Felix Brandt — Senior Fullstack Developer **Verdict: ✅ Approved** I read the full diff (24 files, +1037/-33) and the touched files on disk. This is the cleanest PR I've reviewed in a while — the TDD evidence is right there in the structure, the naming carries the intent, and the security framing is honest rather than decorative. ### TDD evidence - Every new behaviour has a paired test, and the tests describe behaviour in their names (`updateDocument_regeneratesAutoTitle_whenDateChanges`, `backfillTitles_neverRecordsVersions`, `index_with_regex_metacharacters_is_matched_literally_and_terminates`). The save-time block covers the full matrix the persona asks for: happy path, hand-written kept, freshly-typed wins, location-cleared, precision change, blank-never-persisted, file-replaced-as-manual, idempotent. The empty/edge states for the matcher (null title, null/blank index, prose, location-only) are all present and fail-closed. - `@Spy DocumentTitleFactory = new DocumentTitleFactory()` in the service test and the real factory in `DocumentImporterTest` is the right call — the single source of truth is exercised, not stubbed away, so a drift in the formula breaks the tests. Good. - The integration test (`@SpringBootTest` + `PostgresContainerConfig`) is correctly scoped to what H2 can't prove (NOT NULL title + version-row count), and the comment explicitly defers permission enforcement to the faster `@WebMvcTest` slice (`AdminControllerTest`: 401/403/200). That's the test-slice discipline I look for. - `@Timeout(5s)` on the ReDoS test is exactly how you prove the literal-prefix claim instead of asserting it in prose. ### Clean code / naming / function size - `DocumentTitleFactory.build`, `resolveTitle`, `projectedState`, `DocumentTitleBackfillMatcher.isOverwritable` — every name answers "what does this represent?" without tracing usage. `resolveTitle` is a guard-clause ladder with the happy path last, under 10 lines. `backfillTitles` orchestrates and stays readable. - Extracting the formula out of `importing` into the `document` package, package-private `DocumentTitleFormatter`, factory as the only seam — the dependency direction (`document` owns, `importing` consumes) is correct and the dead private `buildTitle` was deleted, not commented out. - `Objects.equals` import is present (verified on disk); no `Optional.get()`, no raw exceptions, `@Transactional` only on the two write sweeps. `backfillTitles` saving via `documentRepository.save` directly (no `recordVersion`) following the `backfillFileHashes` precedent is the right reliability call and is pinned by `backfillTitles_neverRecordsVersions` + the integration `addsNoDocumentVersionRows`. ### Svelte 5 - `DescriptionSection` uses `$props()` with a typed shape and `$derived` (`titleValue`) — no `$effect`-to-compute, no plain Map/Set, no unkeyed `{#each}` introduced. The helper is gated on `showTitleHelp` and wired via `aria-describedby` (conditional `undefined` when off). Verified bulk-edit passes `hideTitle` and never `showTitleHelp` (defaults false), and the component test pins that hiding the title suppresses the helper — so the helper is single-edit-only by construction, not by accident. i18n key added to all three locales. ### Blockers None. ### Suggestions (non-blocking, take or leave) 1. **`build(Document)` takes the whole entity where the formula needs four fields.** My persona's own rule is "pass specific props, not the entire object". Here the entity is the natural domain aggregate and `projectedState` builds a partial `Document` precisely to feed it, so I'd keep it — but it does mean a synthetic/partial `Document` is now a valid input the factory must tolerate (hence the `originalFilename == null` guard). That guard is load-bearing; a one-line test asserting `build()` on a bare `Document.builder().build()` returns `""` rather than NPEs would lock that contract down. Minor. 2. **The matcher's `(?: – .+)?$` trailing-location peel is greedy over a range label that itself contains ` – `.** `range_with_internal_separator_plus_trailing_location` proves it works for the cross-month case, but the accepted FR-004 trade (`{index} – {valid date label} – {anything}` → rewrite the trailing segment) means a hand-written suffix on a dated title can be dropped on backfill. The ADR documents this as accepted and it's a one-time sweep, so I'm not blocking — just flagging that this is the single place where the heuristic can lose human text. The operator runbook note (run once, idempotent, check `count`) is the right mitigation. 3. **`projectedState` mirrors `updateDocument`'s setter asymmetry by hand.** The Javadoc says it's kept "lock-step" and a mismatch "silently produces a wrong label". That's a real coupling risk if `applyDatePrecision` ever changes. The season round-trip and precision-change tests cover the current behaviour; if this asymmetry grows, consider extracting the projection so the real setter path and `projectedState` share one mapping. Today it's fine. Docs are complete (ADR-031, GLOSSARY, the L3 PUML, the api_tests runbook). Ship it.
Author
Owner

🎨 Leonie Voss — UI/UX Design Lead & Accessibility Advocate

Verdict: Approved

I reviewed the FR-TITLE-005 helper line through the UX/accessibility lens: tone for the 60+ audience, aria-describedby wiring, contrast, font size, i18n completeness/quality, and that it only surfaces where it should. The implementation is clean and considerate. No blockers. A few suggestions to make it a touch better for the senior researcher on a small phone in bright sunlight.

Blockers

None.

Suggestions

1. Helper font size is at the absolute floor (12px) — bump to 14px for the senior audience.
frontend/src/lib/document/DescriptionSection.svelte:81 renders the helper as text-xs (12px). That clears my hard minimum, but my body-text guidance for the 60+ cohort is 16px (14px acceptable for secondary/helper text). 12px helper copy on a phone in daylight is hard for this audience — and this is exactly the explanatory line that calms a user worried their title will be overwritten. I'd lift it to text-sm (14px). Note the same text-xs text-ink-3 pattern is already used for the Karton/Mappe helpers at lines 137 and 153, so changing only the title helper would create inconsistency — ideally bump all three together, or at minimum this one.

2. Contrast — verified, passes, no change needed (logged for the record).
text-ink-3 resolves to --c-ink-3: #6b7280 (gray-500) on --c-surface: #ffffff, documented at frontend/src/routes/layout.css:100 as ~4.8:1 — WCAG AA for normal text ✓. Dark mode remaps to #8b97a5 on #011526 = 7.1:1 — AAA ✓ (layout.css:191/267). Both themes verified. The helper is plain prose, not a color-only status cue, so the "never color alone" rule is satisfied. Good.

3. aria-describedby wiring is correct — nicely done.
DescriptionSection.svelte:77 sets aria-describedby={showTitleHelp ? 'title-help' : undefined} and the <p id="title-help"> (line 81) only renders under the same showTitleHelp guard. So the id reference and its target appear/disappear together — no dangling aria-describedby pointing at a missing node when the helper is off. Screen readers will announce the explanation right after the field label. Exactly the pattern I want.

4. "Only shows where it should" — mostly correct, one place worth a conscious decision.

  • Single edit (documents/[id]/edit) → shows. ✓ Intended.
  • Bulk edit (BulkDocumentEditLayout.svelte:445/496) → passes hideTitle, no title field at all, no helper. ✓
  • New document (documents/new/+page.svelte) → uses BulkDocumentEditLayout, hideTitle, no helper. ✓ Correct — there's no auto-title to explain before the doc exists.
  • Enrichment (enrich/[id]/+page.svelte:16) → uses DocumentEditLayout, which hard-codes showTitleHelp={true} (DocumentEditLayout.svelte:224). So the helper also appears in the enrich flow.

That last one is not a bug — enrich is single-doc editing of an auto-titled document, so the message is true and helpful there too. I just want it to be a deliberate choice rather than incidental. If you ever want enrich to differ, showTitleHelp is already a prop on DescriptionSection, but it's pinned true inside DocumentEditLayout, so enrich can't opt out without threading a new prop. Fine as-is; flagging so it's intentional.

5. i18n — complete across all three locales, and the translations are good.
form_helper_title_autogenerated is present in de/en/es (messages/{de,en,es}.json:59). Tone check:

  • de: "Wird automatisch aus Datum und Ort gebildet — sobald du den Titel änderst, bleibt deine Version erhalten." Warm, informal du, reassuring. ✓
  • en: "Generated automatically from the date and place — as soon as you edit the title, your version is kept." Clear. ✓
  • es: "Se genera automáticamente a partir de la fecha y el lugar; en cuanto edites el título, se conservará tu versión." Accurate; informal (edites) matches the German register. ✓

All three lead with what happens and the reassurance that a hand-written title survives — that's the right emphasis to prevent the "did my title just change?" anxiety, which is the whole UX point of FR-005. Well-judged copy.

6. No surprise title changes — UX confirmed.
The helper text explicitly promises the user's edit is preserved, and that matches the save-time logic (regenerate only while the title still equals the auto value). The interface is honest about the state change, which is exactly what I ask for. Nice alignment between copy and behavior.

Ship it. If you only do one thing, do suggestion #1 (14px helper text) — it's the one that most directly serves the people we most often forget.

## 🎨 Leonie Voss — UI/UX Design Lead & Accessibility Advocate **Verdict: ✅ Approved** I reviewed the FR-TITLE-005 helper line through the UX/accessibility lens: tone for the 60+ audience, `aria-describedby` wiring, contrast, font size, i18n completeness/quality, and that it only surfaces where it should. The implementation is clean and considerate. No blockers. A few suggestions to make it a touch better for the senior researcher on a small phone in bright sunlight. ### Blockers None. ### Suggestions **1. Helper font size is at the absolute floor (12px) — bump to 14px for the senior audience.** `frontend/src/lib/document/DescriptionSection.svelte:81` renders the helper as `text-xs` (12px). That clears my hard minimum, but my body-text guidance for the 60+ cohort is 16px (14px acceptable for secondary/helper text). 12px helper copy on a phone in daylight is hard for this audience — and this is exactly the explanatory line that calms a user worried their title will be overwritten. I'd lift it to `text-sm` (14px). Note the same `text-xs text-ink-3` pattern is already used for the Karton/Mappe helpers at lines 137 and 153, so changing only the title helper would create inconsistency — ideally bump all three together, or at minimum this one. **2. Contrast — verified, passes, no change needed (logged for the record).** `text-ink-3` resolves to `--c-ink-3: #6b7280` (gray-500) on `--c-surface: #ffffff`, documented at `frontend/src/routes/layout.css:100` as ~4.8:1 — WCAG AA for normal text ✓. Dark mode remaps to `#8b97a5` on `#011526` = 7.1:1 — AAA ✓ (`layout.css:191/267`). Both themes verified. The helper is plain prose, not a color-only status cue, so the "never color alone" rule is satisfied. Good. **3. `aria-describedby` wiring is correct — nicely done.** `DescriptionSection.svelte:77` sets `aria-describedby={showTitleHelp ? 'title-help' : undefined}` and the `<p id="title-help">` (line 81) only renders under the same `showTitleHelp` guard. So the id reference and its target appear/disappear together — no dangling `aria-describedby` pointing at a missing node when the helper is off. Screen readers will announce the explanation right after the field label. Exactly the pattern I want. **4. "Only shows where it should" — mostly correct, one place worth a conscious decision.** - Single edit (`documents/[id]/edit`) → shows. ✓ Intended. - Bulk edit (`BulkDocumentEditLayout.svelte:445/496`) → passes `hideTitle`, no title field at all, no helper. ✓ - New document (`documents/new/+page.svelte`) → uses `BulkDocumentEditLayout`, `hideTitle`, no helper. ✓ Correct — there's no auto-title to explain before the doc exists. - Enrichment (`enrich/[id]/+page.svelte:16`) → uses `DocumentEditLayout`, which hard-codes `showTitleHelp={true}` (`DocumentEditLayout.svelte:224`). So the helper also appears in the enrich flow. That last one is *not* a bug — enrich is single-doc editing of an auto-titled document, so the message is true and helpful there too. I just want it to be a deliberate choice rather than incidental. If you ever want enrich to differ, `showTitleHelp` is already a prop on `DescriptionSection`, but it's pinned `true` inside `DocumentEditLayout`, so enrich can't opt out without threading a new prop. Fine as-is; flagging so it's intentional. **5. i18n — complete across all three locales, and the translations are good.** `form_helper_title_autogenerated` is present in de/en/es (`messages/{de,en,es}.json:59`). Tone check: - **de**: "Wird automatisch aus Datum und Ort gebildet — sobald du den Titel änderst, bleibt deine Version erhalten." Warm, informal *du*, reassuring. ✓ - **en**: "Generated automatically from the date and place — as soon as you edit the title, your version is kept." Clear. ✓ - **es**: "Se genera automáticamente a partir de la fecha y el lugar; en cuanto edites el título, se conservará tu versión." Accurate; informal *tú* (`edites`) matches the German register. ✓ All three lead with what happens *and* the reassurance that a hand-written title survives — that's the right emphasis to prevent the "did my title just change?" anxiety, which is the whole UX point of FR-005. Well-judged copy. **6. No surprise title changes — UX confirmed.** The helper text explicitly promises the user's edit is preserved, and that matches the save-time logic (regenerate only while the title still equals the auto value). The interface is honest about the state change, which is exactly what I ask for. Nice alignment between copy and behavior. Ship it. If you only do one thing, do suggestion #1 (14px helper text) — it's the one that most directly serves the people we most often forget.
Author
Owner

🏛️ Markus Keller — Application Architect

Verdict: Approved

I reviewed this strictly through the architecture lens: layer boundaries, the documentimporting dependency direction, where the title formula/factory/matcher live, single-source-of-truth, abstraction quality, and documentation currency.

The headline structural decision is the right one. The title formula now lives in DocumentTitleFactory (@Component, document package) and importing consumes it — the dependency points from the leaf importer into the owning domain, never the reverse. DocumentImporter injects DocumentTitleFactory and calls build(doc) in applyFileMetadata (DocumentImporter.java:78, 230) rather than carrying its own private buildTitle(). DocumentTitleFormatter moved alongside it as package-private (document/DocumentTitleFormatter.java:17). That is single-source-of-truth done properly: one formula, one date-label implementation, both halves of the #666 fixture-parity contract co-located in the domain that owns the concept. ADR-031 records the why, and it supersedes the old import-time-only behaviour clearly.

What I checked and found sound

  • Dependency directionimporting → document only. No new edge from document back into importing. The formatter relocation eliminates the previous split-brain where the formula lived in the wrong module.
  • Cross-domain accessbackfillTitles() and updateDocument() stay inside DocumentService against its own documentRepository; person/tag access still routes through the respective services. No boundary leak introduced.
  • Save-time vs backfill asymmetry is a deliberate, documented abstraction split — exact old-vs-new match in resolveTitle/projectedState (DocumentService.java:471–501) for the edit path; a separate grammar heuristic (DocumentTitleBackfillMatcher) for the one-time sweep. Keeping the fragile heuristic out of the hot save path and confined to the backfill is the correct containment of complexity. projectedState explicitly mirrors applyDatePrecision's skip-null asymmetry, with a comment naming the drift risk.
  • Backfill ≠ versioned writebackfillTitles() saves via documentRepository.save directly and skips recordVersion, consistent with the backfillFileHashes precedent (DocumentService.java:1074–1096). A mechanical rename should not snapshot the corpus into document_versions; this is the right call and is tested.
  • No premature infrastructure — synchronous ADMIN-only POST, no broker, no async job for a one-shot cleanup. Appropriate for the volume. The operator note documents the once-after-deploy intent.
  • BackfillResult reuse — the document-package record reused by AdminController, and AdminController living in user while importing document types, are both pre-existing patterns (introduced in af2c983f), not regressions from this PR.
  • Doc currency (my blocker category) — all required updates are present: ADR-031, GLOSSARY.md auto-generated title entry, l3-backend-3b-document-management.puml, and the api_tests runbook entry. No schema/Flyway change, so no DB-diagram obligation. Nothing outstanding.

Suggestions (non-blocking)

  1. projectedState is a structural tripwire by design — keep it honest. DocumentService.java:489 rebuilds a shadow Document whose field handling must stay lock-step with the real setters in updateDocument. Today they match. The day someone adds a fourth date field to the edit path, this duplicate will silently drift and emit a wrong label. The comment flags it, but consider extracting a single applyDateFields(target, dto, fallback) helper that both the live update and the projection call, so the two cannot diverge. Cheaper than a future "why is the title wrong" bug hunt.

  2. File-replace title behaviour is subtle. resolveTitle runs at DocumentService.java:390, before the file-replace block sets a new originalFilename at :440. So a file-replaced document's title is still computed against the old index and not recomputed against the new one. The PR body calls this "treated as manual," which is a defensible product choice — just worth a one-line code comment at the resolve site so the ordering is intentional-by-record, not incidental.

Neither blocks merge. Structurally this is exactly how I'd want it: the formula migrated into its owning module, consumers depend inward, the risky heuristic is quarantined to the backfill, and the decision is captured in an ADR. Clean work.

— @mkeller

## 🏛️ Markus Keller — Application Architect **Verdict: ✅ Approved** I reviewed this strictly through the architecture lens: layer boundaries, the `document`↔`importing` dependency direction, where the title formula/factory/matcher live, single-source-of-truth, abstraction quality, and documentation currency. The headline structural decision is the right one. The title formula now lives in `DocumentTitleFactory` (`@Component`, `document` package) and `importing` *consumes* it — the dependency points from the leaf importer into the owning domain, never the reverse. `DocumentImporter` injects `DocumentTitleFactory` and calls `build(doc)` in `applyFileMetadata` (`DocumentImporter.java:78, 230`) rather than carrying its own private `buildTitle()`. `DocumentTitleFormatter` moved alongside it as package-private (`document/DocumentTitleFormatter.java:17`). That is single-source-of-truth done properly: one formula, one date-label implementation, both halves of the #666 fixture-parity contract co-located in the domain that owns the concept. ADR-031 records the *why*, and it supersedes the old import-time-only behaviour clearly. ### What I checked and found sound - **Dependency direction** — `importing → document` only. No new edge from `document` back into `importing`. The formatter relocation eliminates the previous split-brain where the formula lived in the wrong module. - **Cross-domain access** — `backfillTitles()` and `updateDocument()` stay inside `DocumentService` against its own `documentRepository`; person/tag access still routes through the respective services. No boundary leak introduced. - **Save-time vs backfill asymmetry is a deliberate, documented abstraction split** — exact old-vs-new match in `resolveTitle`/`projectedState` (`DocumentService.java:471–501`) for the edit path; a separate grammar heuristic (`DocumentTitleBackfillMatcher`) for the one-time sweep. Keeping the fragile heuristic *out* of the hot save path and confined to the backfill is the correct containment of complexity. `projectedState` explicitly mirrors `applyDatePrecision`'s skip-null asymmetry, with a comment naming the drift risk. - **Backfill ≠ versioned write** — `backfillTitles()` saves via `documentRepository.save` directly and skips `recordVersion`, consistent with the `backfillFileHashes` precedent (`DocumentService.java:1074–1096`). A mechanical rename should not snapshot the corpus into `document_versions`; this is the right call and is tested. - **No premature infrastructure** — synchronous ADMIN-only `POST`, no broker, no async job for a one-shot cleanup. Appropriate for the volume. The operator note documents the once-after-deploy intent. - **`BackfillResult` reuse** — the `document`-package record reused by `AdminController`, and `AdminController` living in `user` while importing `document` types, are both pre-existing patterns (introduced in `af2c983f`), not regressions from this PR. - **Doc currency (my blocker category)** — all required updates are present: ADR-031, `GLOSSARY.md` *auto-generated title* entry, `l3-backend-3b-document-management.puml`, and the `api_tests` runbook entry. No schema/Flyway change, so no DB-diagram obligation. Nothing outstanding. ### Suggestions (non-blocking) 1. **`projectedState` is a structural tripwire by design — keep it honest.** `DocumentService.java:489` rebuilds a shadow `Document` whose field handling must stay lock-step with the real setters in `updateDocument`. Today they match. The day someone adds a fourth date field to the edit path, this duplicate will silently drift and emit a wrong label. The comment flags it, but consider extracting a single `applyDateFields(target, dto, fallback)` helper that both the live update and the projection call, so the two cannot diverge. Cheaper than a future "why is the title wrong" bug hunt. 2. **File-replace title behaviour is subtle.** `resolveTitle` runs at `DocumentService.java:390`, before the file-replace block sets a new `originalFilename` at `:440`. So a file-replaced document's title is still computed against the *old* index and not recomputed against the new one. The PR body calls this "treated as manual," which is a defensible product choice — just worth a one-line code comment at the resolve site so the ordering is intentional-by-record, not incidental. Neither blocks merge. Structurally this is exactly how I'd want it: the formula migrated into its owning module, consumers depend inward, the risky heuristic is quarantined to the backfill, and the decision is captured in an ADR. Clean work. — @mkeller
Author
Owner

🧪 Sara Holt — QA Engineer & Test Strategist

Verdict: ⚠️ Approved with concerns

This is a genuinely well-built test suite — the pyramid is right (fast Mockito unit layer carries the permutations, one Testcontainers integration class pins the real-Postgres behaviour, a single E2E covers the journey), and the standout decisions all land: a @Spy real DocumentTitleFactory so the save-time tests exercise the actual composition instead of a stub, an explicit @Timeout(5s) ReDoS regression fence on the regex-metacharacter index, the NOT-NULL-title rationale for choosing Postgres over H2 written into the Javadoc, and a dedicated backfill_addsNoDocumentVersionRows assertion that proves the version-spam guarantee both at the unit layer (verify(documentVersionService, never())) and against the real document_versions table. The acceptance scenarios from #726 (FR-001..005) each have a home. My concerns are coverage gaps, not defects in what exists.

Blockers

None. Nothing here is wrong or flaky enough to block merge.

Suggestions

  1. Save-time path never asserts the no-version / audit behaviour for an auto-title rewrite. updateDocument always calls recordVersion + an audit event — correct for an interactive edit — but the title-sync tests (DocumentServiceTest lines 275-406) only assert stored.getTitle(). The backfill path has explicit verify(documentVersionService, never()) coverage; the save path has no symmetric assertion that a title-only regeneration still emits exactly one METADATA_UPDATED (not STATUS_CHANGED). One verify(auditService).logAfterCommit(eq(AuditKind.METADATA_UPDATED), ...) would lock that in.

  2. metaDateRaw/metaDateEnd carry-over asymmetry in projectedState is under-tested. projectedState (DocumentService.java:489) keeps stored metaDateRaw/metaDateEnd when the DTO omits them — this is load-bearing for SEASON and RANGE labels. Only updateDocument_roundTripsSeasonLabel exercises it, and it passes raw explicitly in the DTO (line 367), so the "DTO omits raw, stored raw is reused" branch is never hit. Add a test: stored SEASON title, edit submits metaDateRaw=null, assert the season label survives. Same for a RANGE title where the DTO omits metaDateEnd. This is exactly the kind of setter-asymmetry the Javadoc warns "would silently produce a wrong label."

  3. Backfill matcher has no negative test for a near-miss date label. DocumentTitleBackfillMatcherTest proves every valid form is overwritable and that prose is skipped, but there's no test that a plausible-but-invalid date-shaped segment is rejected — e.g. C-0029 – 13. Foo 1916 or C-0029 – 1916er Jahre. Without it, a regression that loosens the alternation (matching arbitrary prose containing a year) would pass silently. One isFalse() case closes that.

  4. Integration test asserts count isGreaterThanOrEqualTo(1) instead of an exact value (DocumentTitleBackfillIntegrationTest line 55). Because it's @Transactional-isolated and persists exactly one stale row, the count is deterministically 1. >=1 would mask a bug where the sweep also rewrites an unrelated row. Tighten to isEqualTo(1).

  5. E2E asserts the heading matches /E2E Auto-Titel Sync.*1928/ but never asserts the old value is gone. A regex .*1928 would still pass if the title became …Sync 2028 1928. A follow-up not.toContainText-style check, or asserting the exact rebuilt string, makes the "title follows the date" guarantee airtight. Minor, since the unit layer already has regeneratedTitle_doesNotContainOldDate.

  6. Frontend helper test asserts presence + aria-describedby wiring but not that the text is the localised Paraglide string. DescriptionSection.svelte.spec.ts:64 only checks textContent.length > 0. Asserting it renders m.form_helper_title_autogenerated() (or at least a stable substring) would catch an accidental rewire to the wrong message key — cheap, and you added the key in three locales.

  7. No test for the blank-index / null-originalFilename backfill row at the integration layer. The matcher unit test covers null/blank index fail-closed, but production rows have a NOT-NULL originalFilename; conversely a synthetic/legacy row with an empty filename would route through build()StringBuilder(""). Not a blocker (unit layer covers the matcher contract), just noting the integration suite only exercises happy rows.

Net: the layering, infra choice, and idempotency/version-spam assertions are exactly what I'd want. Items 1-2 are the ones I'd most like addressed before merge because they guard the two save-time invariants (audit kind, raw/end carry-over) that currently have no direct assertion.

## 🧪 Sara Holt — QA Engineer & Test Strategist **Verdict: ⚠️ Approved with concerns** This is a genuinely well-built test suite — the pyramid is right (fast Mockito unit layer carries the permutations, one Testcontainers integration class pins the real-Postgres behaviour, a single E2E covers the journey), and the standout decisions all land: a `@Spy` real `DocumentTitleFactory` so the save-time tests exercise the actual composition instead of a stub, an explicit `@Timeout(5s)` ReDoS regression fence on the regex-metacharacter index, the NOT-NULL-title rationale for choosing Postgres over H2 written into the Javadoc, and a dedicated `backfill_addsNoDocumentVersionRows` assertion that proves the version-spam guarantee both at the unit layer (`verify(documentVersionService, never())`) and against the real `document_versions` table. The acceptance scenarios from #726 (FR-001..005) each have a home. My concerns are coverage gaps, not defects in what exists. ### Blockers None. Nothing here is wrong or flaky enough to block merge. ### Suggestions 1. **Save-time path never asserts the no-version / audit behaviour for an auto-title rewrite.** `updateDocument` always calls `recordVersion` + an audit event — correct for an interactive edit — but the title-sync tests (`DocumentServiceTest` lines 275-406) only assert `stored.getTitle()`. The backfill path has explicit `verify(documentVersionService, never())` coverage; the save path has no symmetric assertion that a *title-only* regeneration still emits exactly one `METADATA_UPDATED` (not `STATUS_CHANGED`). One `verify(auditService).logAfterCommit(eq(AuditKind.METADATA_UPDATED), ...)` would lock that in. 2. **`metaDateRaw`/`metaDateEnd` carry-over asymmetry in `projectedState` is under-tested.** `projectedState` (DocumentService.java:489) keeps stored `metaDateRaw`/`metaDateEnd` when the DTO omits them — this is load-bearing for SEASON and RANGE labels. Only `updateDocument_roundTripsSeasonLabel` exercises it, and it *passes raw explicitly in the DTO* (line 367), so the "DTO omits raw, stored raw is reused" branch is never hit. Add a test: stored SEASON title, edit submits `metaDateRaw=null`, assert the season label survives. Same for a RANGE title where the DTO omits `metaDateEnd`. This is exactly the kind of setter-asymmetry the Javadoc warns "would silently produce a wrong label." 3. **Backfill matcher has no negative test for a near-miss date label.** `DocumentTitleBackfillMatcherTest` proves every valid form is overwritable and that prose is skipped, but there's no test that a *plausible-but-invalid* date-shaped segment is rejected — e.g. `C-0029 – 13. Foo 1916` or `C-0029 – 1916er Jahre`. Without it, a regression that loosens the alternation (matching arbitrary prose containing a year) would pass silently. One `isFalse()` case closes that. 4. **Integration test asserts count `isGreaterThanOrEqualTo(1)` instead of an exact value** (`DocumentTitleBackfillIntegrationTest` line 55). Because it's `@Transactional`-isolated and persists exactly one stale row, the count is deterministically `1`. `>=1` would mask a bug where the sweep also rewrites an unrelated row. Tighten to `isEqualTo(1)`. 5. **E2E asserts the heading matches `/E2E Auto-Titel Sync.*1928/` but never asserts the old value is *gone*.** A regex `.*1928` would still pass if the title became `…Sync 2028 1928`. A follow-up `not.toContainText`-style check, or asserting the exact rebuilt string, makes the "title *follows* the date" guarantee airtight. Minor, since the unit layer already has `regeneratedTitle_doesNotContainOldDate`. 6. **Frontend helper test asserts presence + `aria-describedby` wiring but not that the text is the localised Paraglide string.** `DescriptionSection.svelte.spec.ts:64` only checks `textContent.length > 0`. Asserting it renders `m.form_helper_title_autogenerated()` (or at least a stable substring) would catch an accidental rewire to the wrong message key — cheap, and you added the key in three locales. 7. **No test for the blank-index / null-`originalFilename` backfill row at the integration layer.** The matcher unit test covers `null`/blank index fail-closed, but production rows have a NOT-NULL `originalFilename`; conversely a synthetic/legacy row with an empty filename would route through `build()` → `StringBuilder("")`. Not a blocker (unit layer covers the matcher contract), just noting the integration suite only exercises happy rows. Net: the layering, infra choice, and idempotency/version-spam assertions are exactly what I'd want. Items 1-2 are the ones I'd most like addressed before merge because they guard the two save-time invariants (audit kind, raw/end carry-over) that currently have no direct assertion.
Author
Owner

📋 Elicit — Requirements Engineer / Business Analyst

Verdict: Approved

I reviewed PR #727 strictly against issue #726: every FR, every NFR, the non-goals, and each Gherkin scenario traced to a test. The traceability is unusually clean — this is what a "spec-dense issue → implementation" round-trip should look like. No blockers. A handful of small traceability/wording notes that do not gate merge.

Requirements coverage

Functional requirements

Req Satisfied Evidence
FR-TITLE-001 — shared DocumentTitleFactory in document, importer consumes it; DocumentTitleFormatter moved into document document/DocumentTitleFactory.java (@Component), DocumentImporter consumes it; the diff's rename from importing/… to document/DocumentTitleFormatter.java confirms the move (dependency direction correct — document owns, importing consumes).
FR-TITLE-002 — save-time exact-match regeneration in updateDocument only autoTitleBefore captured before any setter; resolveTitle + projectedState; bulk edit untouched.
FR-TITLE-003 — POST /api/admin/backfill-titles, ADMIN, synchronous, BackfillResult(count), direct repository.save, no version-spam, structured log AdminController.backfillTitles, DocumentService.backfillTitles (direct save, parameterized scanned/updated/skipped log).
FR-TITLE-004 — heuristic overwrite test, literal index, ReDoS-safe, fail-closed DocumentTitleBackfillMatcherString.startsWith for the index (never compiled), Pattern.quote on month tokens, bounded quantifiers, returns false on any null/blank/structural surprise.
FR-TITLE-005 — edit-form helper line, aria-describedby, de/en/es, AA contrast DescriptionSection.svelte (showTitleHelp, #title-help, text-ink-3); form_helper_title_autogenerated present in all three locale files; wired from DocumentEditLayout single-edit only.

Non-goals — all respected

  • Bulk edit untouched — showTitleHelp defaults false, helper omitted when hideTitle (test omits the helper when the title field is hidden (bulk edit)); save-time logic lives only in updateDocument.
  • File-replaced docs treated as manual — updateDocument_treatsFileReplacedDoc_asManual asserts the title is left unchanged.
  • No UI for the one-time cleanup — endpoint only, api_tests/ runbook hits port 8080 directly.
  • TS formatDocumentDate / #666 date-label split not collapsed.

NFRs

  • NFR-MAINT-001 (one builder, no divergence) — importer + save-time + backfill all route through DocumentTitleFactory; matcher derives month tokens from the same Locale.GERMAN formatters, so accepted spellings cannot drift from emitted ones.
  • NFR-PERF-001 — save adds a string build + compare; backfill one synchronous sweep.
  • NFR-OBS-001 — parameterized scanned/updated/skipped, never concatenates titles.
  • NFR-SEC-001 — class-level @RequirePermission(ADMIN); 401/403 slice tests; literal index defuses CWE-1333/625.

Gherkin → test traceability

All 12 save-time + 5 backfill scenarios trace to a test, with one wording caveat:

  • Save-time: date-change, hand-written survives, freshly-typed wins, date+location, location cleared, no-old-date, YEAR→DAY relabel, UNKNOWN→populated, blank-not-blank, file-replaced-manual, idempotent — all present in DocumentServiceTest.
  • Backfill: stale fixed (unit + Testcontainers), prose skipped, idempotent + no document_versions rows (backfill_addsNoDocumentVersionRows asserts count unchanged), regex-metacharacter index (@Timeout 5s), permission 401/403/200 in AdminControllerTest.

Suggestions (non-blocking)

  1. Partial trace on the "SEASON and RANGE labels round-trip" save-time scenario. The Gherkin names both precisions, but the save-time suite only has updateDocument_roundTripsSeasonLabel — there is no RANGE assertion through the projectedState path. RANGE composition is covered at the factory/formatter and backfill-matcher layers, so the risk is low, but to fully discharge the scenario as written, add a save-time case that edits a RANGE-precision date and asserts the honest range label survives. This is the only scenario without a 1:1 save-time test.

  2. projectedState is a hand-mirrored copy of updateDocument's setter asymmetry (date/location overwrite-with-null vs precision skip-null). The issue explicitly flagged this as the silent-wrong-label trap, and the code comments call it out well — but the two code paths can still drift independently over time. Consider a single guard test that pins the asymmetry (e.g. "null precision in the DTO keeps the stored precision in the regenerated title") so a future refactor of applyDatePrecision can't desync projectedState unnoticed. (updateDocument_dropsTrailingLocationSegment covers the null-location half; the null-precision half is implied but not isolated.)

  3. Matcher branch 3 ({index} – {location} with no date) requires body.equals(location) exactly. Correct and fail-safe as specified, but worth a one-line note in ADR-031: a stale location-only title (location was changed after the title was built, no date) will be skipped by the backfill rather than corrected, because the stored segment no longer equals the current location. Save-time still fixes it on the next edit; just make the accepted residue explicit so it isn't later mistaken for a bug.

  4. Helper-line copy is good and testable. The de string ("…sobald du den Titel änderst, bleibt deine Version erhalten") matches the issue's suggested wording and reads well for the 60+ audience. No change requested — noting that the component test asserts presence + non-empty + aria-describedby wiring, which is the right level; the actual AA contrast of text-ink-3 on bg-surface is asserted by claim, not by an automated check (acceptable for this stack).

Clean, traceable, non-goals honored. Ship it; fold suggestion (1) in if you want a fully discharged scenario set.

## 📋 Elicit — Requirements Engineer / Business Analyst **Verdict: ✅ Approved** I reviewed PR #727 strictly against issue #726: every FR, every NFR, the non-goals, and each Gherkin scenario traced to a test. The traceability is unusually clean — this is what a "spec-dense issue → implementation" round-trip should look like. No blockers. A handful of small traceability/wording notes that do not gate merge. ### Requirements coverage **Functional requirements** | Req | Satisfied | Evidence | |---|---|---| | FR-TITLE-001 — shared `DocumentTitleFactory` in `document`, importer consumes it; `DocumentTitleFormatter` moved into `document` | ✅ | `document/DocumentTitleFactory.java` (`@Component`), `DocumentImporter` consumes it; the diff's `rename from importing/… to document/DocumentTitleFormatter.java` confirms the move (dependency direction correct — `document` owns, `importing` consumes). | | FR-TITLE-002 — save-time exact-match regeneration in `updateDocument` only | ✅ | `autoTitleBefore` captured before any setter; `resolveTitle` + `projectedState`; bulk edit untouched. | | FR-TITLE-003 — `POST /api/admin/backfill-titles`, ADMIN, synchronous, `BackfillResult(count)`, direct `repository.save`, no version-spam, structured log | ✅ | `AdminController.backfillTitles`, `DocumentService.backfillTitles` (direct save, parameterized `scanned/updated/skipped` log). | | FR-TITLE-004 — heuristic overwrite test, literal index, ReDoS-safe, fail-closed | ✅ | `DocumentTitleBackfillMatcher` — `String.startsWith` for the index (never compiled), `Pattern.quote` on month tokens, bounded quantifiers, returns `false` on any null/blank/structural surprise. | | FR-TITLE-005 — edit-form helper line, `aria-describedby`, de/en/es, AA contrast | ✅ | `DescriptionSection.svelte` (`showTitleHelp`, `#title-help`, `text-ink-3`); `form_helper_title_autogenerated` present in all three locale files; wired from `DocumentEditLayout` single-edit only. | **Non-goals — all respected** - Bulk edit untouched — `showTitleHelp` defaults `false`, helper omitted when `hideTitle` (test `omits the helper when the title field is hidden (bulk edit)`); save-time logic lives only in `updateDocument`. ✅ - File-replaced docs treated as manual — `updateDocument_treatsFileReplacedDoc_asManual` asserts the title is left unchanged. ✅ - No UI for the one-time cleanup — endpoint only, `api_tests/` runbook hits port 8080 directly. ✅ - TS `formatDocumentDate` / #666 date-label split not collapsed. ✅ **NFRs** - NFR-MAINT-001 (one builder, no divergence) ✅ — importer + save-time + backfill all route through `DocumentTitleFactory`; matcher derives month tokens from the *same* `Locale.GERMAN` formatters, so accepted spellings cannot drift from emitted ones. - NFR-PERF-001 ✅ — save adds a string build + compare; backfill one synchronous sweep. - NFR-OBS-001 ✅ — parameterized `scanned/updated/skipped`, never concatenates titles. - NFR-SEC-001 ✅ — class-level `@RequirePermission(ADMIN)`; 401/403 slice tests; literal index defuses CWE-1333/625. ### Gherkin → test traceability All 12 save-time + 5 backfill scenarios trace to a test, with one wording caveat: - Save-time: date-change, hand-written survives, freshly-typed wins, date+location, location cleared, no-old-date, YEAR→DAY relabel, UNKNOWN→populated, blank-not-blank, file-replaced-manual, idempotent — all present in `DocumentServiceTest`. - Backfill: stale fixed (unit + Testcontainers), prose skipped, idempotent + **no `document_versions` rows** (`backfill_addsNoDocumentVersionRows` asserts `count` unchanged), regex-metacharacter index (`@Timeout` 5s), permission 401/403/200 in `AdminControllerTest`. ### Suggestions (non-blocking) 1. **Partial trace on the "SEASON *and* RANGE labels round-trip" save-time scenario.** The Gherkin names both precisions, but the save-time suite only has `updateDocument_roundTripsSeasonLabel` — there is no RANGE assertion through the `projectedState` path. RANGE composition is covered at the factory/formatter and backfill-matcher layers, so the risk is low, but to fully discharge the scenario as written, add a save-time case that edits a RANGE-precision date and asserts the honest range label survives. This is the only scenario without a 1:1 save-time test. 2. **`projectedState` is a hand-mirrored copy of `updateDocument`'s setter asymmetry** (date/location overwrite-with-null vs precision skip-null). The issue explicitly flagged this as the silent-wrong-label trap, and the code comments call it out well — but the two code paths can still drift independently over time. Consider a single guard test that pins the asymmetry (e.g. "null precision in the DTO keeps the stored precision in the regenerated title") so a future refactor of `applyDatePrecision` can't desync `projectedState` unnoticed. (`updateDocument_dropsTrailingLocationSegment` covers the null-location half; the null-precision half is implied but not isolated.) 3. **Matcher branch 3 (`{index} – {location}` with no date) requires `body.equals(location)` exactly.** Correct and fail-safe as specified, but worth a one-line note in ADR-031: a stale location-only title (location was *changed* after the title was built, no date) will be *skipped* by the backfill rather than corrected, because the stored segment no longer equals the current location. Save-time still fixes it on the next edit; just make the accepted residue explicit so it isn't later mistaken for a bug. 4. **Helper-line copy is good and testable.** The de string ("…sobald du den Titel änderst, bleibt deine Version erhalten") matches the issue's suggested wording and reads well for the 60+ audience. No change requested — noting that the component test asserts presence + non-empty + `aria-describedby` wiring, which is the right level; the actual AA contrast of `text-ink-3` on `bg-surface` is asserted by claim, not by an automated check (acceptable for this stack). Clean, traceable, non-goals honored. Ship it; fold suggestion (1) in if you want a fully discharged scenario set.
Author
Owner

🔒 Nora "NullX" Steiner — Application Security Engineer

Verdict: Approved

I came at this PR with the adversarial assumption that the backfill matcher is the soft underbelly — it ingests two user-controlled strings (title and originalFilename, both of which an editor or the file-replace path can shape) and feeds the residue into a compiled regex. I tried to break it. I could not. The threat model named in the PR body is the right one, and the implementation actually defends against it. Below is what I verified, including the things I attempted to exploit and failed.

What I verified (no blockers found)

1. Regex-injection / ReDoS on the user-controlled index — CWE-1333 / CWE-625: not exploitable.
DocumentTitleBackfillMatcher.isOverwritable (matcher lines 70–91) compares the index with String.startsWith (line 74) and the location with String.equals (line 90) — both literal. The index is never compiled into a pattern, so a filename like C-0029(.*).pdf or C-0029$|^.* is inert. The regression test index_with_regex_metacharacters_is_matched_literally_and_terminates (matcher test 140–150, @Timeout(5s)) pins this. Good — this is exactly the test I would have asked for.

2. The compiled date-label regex — no catastrophic backtracking.
DATE_LABEL_WITH_OPTIONAL_LOCATION (matcher lines 52–65) is the only attacker-reachable compiled pattern (the body it runs against is derived from a stored title). I extracted it and ran it standalone against inputs specifically engineered to exploit the overlap between the internal alternatives and the greedy trailing (?: – .+)?, plus deeply-nested separator runs:

  • "30. Jan." + " – 2. Feb.".repeat(8000) + " 1917 " (80 014 chars)
  • ("9. Jan. 1917 – ").repeat(8000) + "ZZZ"
  • "1917" + " – ".repeat(8000)
  • "1".repeat(50000)

All matched/failed in 0–1 ms, scaling linearly to 120 k chars. The reason it's safe: every prefix alternative is a fixed-structure, anchored token sequence; the quantifiers are bounded (\d{1,4}, \d{1,2}) and non-nested; and the single trailing .+ has no ambiguous repetition wrapped around it. The PR's "linear, no catastrophic backtracking" claim is accurate, and I confirmed it empirically rather than on faith.

3. Month-token derivation — no drift, no smuggled metacharacters.
monthAlternation (matcher 93–100) derives Jan.Dez. / JanuarDezember from the same Locale.GERMAN formatters the factory uses, then wraps each via Pattern.quote (\Q…\E). I dumped the actual tokens: the . in abbreviations is literal, never a wildcard. Sharing the formatter source between matcher and DocumentTitleFactory is the correct way to keep an allowlist from rotting.

4. Authorization on the new endpoint — fail-closed.
AdminController carries class-level @RequirePermission(Permission.ADMIN) (AdminController 21), which the PermissionAspect enforces over backfillTitles() (54–58). Type-safe enum, not a magic string — a typo would not silently open the route. AdminControllerTest covers all three boundaries: 401 unauthenticated (138–141), 403 for a non-ADMIN authenticated user (143–148), 200 for ADMIN (150–158). This is the 401-vs-403 distinction I always want to see tested separately, and it's here.

5. CSRF — covered structurally.
SecurityConfig.securityFilterChain (97–99) applies CookieCsrfTokenRepository to all state-changing requests, so the new POST /api/admin/backfill-titles requires a matching X-XSRF-TOKEN; a CSRF failure returns 403 CSRF_TOKEN_MISSING (SecurityConfig 129–135). The tests exercise the route .with(csrf()), consistent with the existing backfill endpoints.

6. No sensitive data in logs.
backfillTitles logs only aggregate counters via parameterized SLF4J: log.info("Title backfill complete: scanned={} updated={} skipped={}", …) (DocumentService 1094). No title is string-concatenated into a log line, so there's no log-injection / Log4Shell-style sink and no PII leak of document titles into the log store.

7. Fail-closed behavior.
The matcher returns false on any null/blank index or structural surprise (matcher 71–83), and the integration test backfill_skipsProse (integration test 70–79) confirms prose is left untouched. Idempotency (backfill_isIdempotent_secondRunChangesNothing) and the "no document_versions rows" guarantee (backfill_addsNoDocumentVersionRows) are both pinned against real Postgres — H2 would have hidden the NOT-NULL title semantics, so using Testcontainers here was the right call.

Suggestions (non-blocking)

  • backfillTitles() loads the whole corpus in one transaction. documentRepository.findAll() (DocumentService 1076) pulls every Document into memory inside a single @Transactional. At the documented ~1500-doc archive scale this is a non-issue, but it's an unbounded-allocation pattern: if the corpus ever grows (or an operator triggers it on a much larger future dataset), this is a memory-pressure / availability footgun. Consider paging/streaming the scan if the archive is expected to grow past ~50k. Worth a one-line code comment noting the bound is intentional so a future reader doesn't mistake it for an oversight.
  • Heuristic can rewrite a deliberately hand-typed title that looks machine-generated. A user-authored title that happens to be exactly {index} – {valid German date label} will be treated as auto and rebuilt by the backfill. This is the documented FR-004 trade-off and the design accepts it (a valid date label is taken as strong evidence of a machine title), so I'm not flagging it as a defect — just calling it out explicitly so it's a known accepted risk rather than an emergent surprise. The save-time path correctly avoids this by using exact old-vs-new comparison instead of the heuristic (DocumentService 471–479), which is the safer design where it matters most.

Nice work — the security-sensitive surface is small, literal where it must be, allowlisted where it can't be, and backed by the exact regression tests I'd have written. The threat-model comments in DocumentTitleBackfillMatcher (matcher 27–33) explain why startsWith is used instead of a pattern, which is precisely the kind of comment that survives a future "let me just refactor this into a regex" change. 🔒

## 🔒 Nora "NullX" Steiner — Application Security Engineer **Verdict: ✅ Approved** I came at this PR with the adversarial assumption that the backfill matcher is the soft underbelly — it ingests two user-controlled strings (`title` and `originalFilename`, both of which an editor or the file-replace path can shape) and feeds the residue into a compiled regex. I tried to break it. I could not. The threat model named in the PR body is the right one, and the implementation actually defends against it. Below is what I verified, including the things I attempted to exploit and failed. ### What I verified (no blockers found) **1. Regex-injection / ReDoS on the user-controlled index — CWE-1333 / CWE-625: not exploitable.** `DocumentTitleBackfillMatcher.isOverwritable` (matcher lines 70–91) compares the index with `String.startsWith` (line 74) and the location with `String.equals` (line 90) — both *literal*. The index is never compiled into a pattern, so a filename like `C-0029(.*).pdf` or `C-0029$|^.*` is inert. The regression test `index_with_regex_metacharacters_is_matched_literally_and_terminates` (matcher test 140–150, `@Timeout(5s)`) pins this. Good — this is exactly the test I would have asked for. **2. The compiled date-label regex — no catastrophic backtracking.** `DATE_LABEL_WITH_OPTIONAL_LOCATION` (matcher lines 52–65) is the only attacker-reachable compiled pattern (the `body` it runs against is derived from a stored title). I extracted it and ran it standalone against inputs specifically engineered to exploit the overlap between the internal ` – ` alternatives and the greedy trailing `(?: – .+)?`, plus deeply-nested separator runs: - `"30. Jan." + " – 2. Feb.".repeat(8000) + " 1917 "` (80 014 chars) - `("9. Jan. 1917 – ").repeat(8000) + "ZZZ"` - `"1917" + " – ".repeat(8000)` - `"1".repeat(50000)` All matched/failed in **0–1 ms**, scaling linearly to 120 k chars. The reason it's safe: every prefix alternative is a fixed-structure, anchored token sequence; the quantifiers are bounded (`\d{1,4}`, `\d{1,2}`) and non-nested; and the single trailing `.+` has no ambiguous repetition wrapped around it. The PR's "linear, no catastrophic backtracking" claim is accurate, and I confirmed it empirically rather than on faith. **3. Month-token derivation — no drift, no smuggled metacharacters.** `monthAlternation` (matcher 93–100) derives `Jan.`…`Dez.` / `Januar`…`Dezember` from the *same* `Locale.GERMAN` formatters the factory uses, then wraps each via `Pattern.quote` (`\Q…\E`). I dumped the actual tokens: the `.` in abbreviations is literal, never a wildcard. Sharing the formatter source between matcher and `DocumentTitleFactory` is the correct way to keep an allowlist from rotting. **4. Authorization on the new endpoint — fail-closed.** `AdminController` carries class-level `@RequirePermission(Permission.ADMIN)` (AdminController 21), which the `PermissionAspect` enforces over `backfillTitles()` (54–58). Type-safe enum, not a magic string — a typo would not silently open the route. `AdminControllerTest` covers all three boundaries: 401 unauthenticated (138–141), 403 for a non-ADMIN authenticated user (143–148), 200 for ADMIN (150–158). This is the 401-vs-403 distinction I always want to see tested separately, and it's here. **5. CSRF — covered structurally.** `SecurityConfig.securityFilterChain` (97–99) applies `CookieCsrfTokenRepository` to all state-changing requests, so the new `POST /api/admin/backfill-titles` requires a matching `X-XSRF-TOKEN`; a CSRF failure returns 403 `CSRF_TOKEN_MISSING` (SecurityConfig 129–135). The tests exercise the route `.with(csrf())`, consistent with the existing backfill endpoints. **6. No sensitive data in logs.** `backfillTitles` logs only aggregate counters via parameterized SLF4J: `log.info("Title backfill complete: scanned={} updated={} skipped={}", …)` (DocumentService 1094). No title is string-concatenated into a log line, so there's no log-injection / Log4Shell-style sink and no PII leak of document titles into the log store. **7. Fail-closed behavior.** The matcher returns `false` on any null/blank index or structural surprise (matcher 71–83), and the integration test `backfill_skipsProse` (integration test 70–79) confirms prose is left untouched. Idempotency (`backfill_isIdempotent_secondRunChangesNothing`) and the "no `document_versions` rows" guarantee (`backfill_addsNoDocumentVersionRows`) are both pinned against real Postgres — H2 would have hidden the NOT-NULL `title` semantics, so using Testcontainers here was the right call. ### Suggestions (non-blocking) - **`backfillTitles()` loads the whole corpus in one transaction.** `documentRepository.findAll()` (DocumentService 1076) pulls every `Document` into memory inside a single `@Transactional`. At the documented ~1500-doc archive scale this is a non-issue, but it's an unbounded-allocation pattern: if the corpus ever grows (or an operator triggers it on a much larger future dataset), this is a memory-pressure / availability footgun. Consider paging/streaming the scan if the archive is expected to grow past ~50k. Worth a one-line code comment noting the bound is intentional so a future reader doesn't mistake it for an oversight. - **Heuristic can rewrite a deliberately hand-typed title that *looks* machine-generated.** A user-authored title that happens to be exactly `{index} – {valid German date label}` will be treated as auto and rebuilt by the backfill. This is the documented FR-004 trade-off and the design accepts it (a valid date label is taken as strong evidence of a machine title), so I'm not flagging it as a defect — just calling it out explicitly so it's a *known* accepted risk rather than an emergent surprise. The save-time path correctly avoids this by using exact old-vs-new comparison instead of the heuristic (DocumentService 471–479), which is the safer design where it matters most. Nice work — the security-sensitive surface is small, literal where it must be, allowlisted where it can't be, and backed by the exact regression tests I'd have written. The threat-model comments in `DocumentTitleBackfillMatcher` (matcher 27–33) explain *why* `startsWith` is used instead of a pattern, which is precisely the kind of comment that survives a future "let me just refactor this into a regex" change. 🔒
marcel added 3 commits 2026-06-04 17:16:16 +02:00
Extract effectivePrecision/effectiveMetaDateEnd/effectiveMetaDateRaw, used by both
applyDatePrecision (the real setters) and projectedState (the title projection), so the two
can no longer drift — addresses review feedback (Markus/Felix/Sara). Writing a stored value
back when the DTO omits a field is a harmless no-op, so behaviour is unchanged (185 existing
DocumentServiceTest cases stay green). Also documents the file-replace "treat as manual" path
inline at the reassignment site.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- save-time: precision+raw carry-over when the DTO omits them (exercises the shared skip-null
  resolvers), and a RANGE label round-trip (Sara/Elicit)
- factory: a bare Document with a null index builds "" rather than NPE-ing (Felix)
- backfill matcher: negative near-misses — ASCII hyphen vs en dash, missing separator before
  trailing text, year-with-trailing-letters, index followed by text without a separator (Sara)
- backfill integration: tighten the count assertion to exactly 1 on the clean test DB (Sara)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
fix(document): enlarge auto-title helper to 14px and assert its localized text (#726)
Some checks failed
CI / Unit & Component Tests (pull_request) Failing after 2m35s
CI / OCR Service Tests (pull_request) Successful in 22s
CI / Backend Unit Tests (pull_request) Successful in 3m33s
CI / fail2ban Regex (pull_request) Successful in 48s
CI / Semgrep Security Scan (pull_request) Successful in 22s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m6s
CI / Unit & Component Tests (push) Failing after 2m31s
CI / OCR Service Tests (push) Successful in 21s
CI / Backend Unit Tests (push) Successful in 3m38s
CI / fail2ban Regex (push) Successful in 44s
CI / Semgrep Security Scan (push) Successful in 22s
CI / Compose Bucket Idempotency (push) Successful in 1m6s
0693cfddd1
Bumps the title helper from text-xs (12px) to text-sm (14px) for the 60+ audience (FR-005
prefers a larger size than the field hints) and tightens the component test to assert the
actual localized string and the 14px class — addresses Leonie's and Sara's review notes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Author
Owner

Review concerns addressed — pushed 7316c510693cfd

Thanks all. No blockers were raised; here's how the concerns landed.

Fixed

  • projectedState drift (Markus, Felix, Sara) — extracted effectivePrecision / effectiveMetaDateEnd / effectiveMetaDateRaw, now the single skip-null resolution used by both applyDatePrecision (the real setters) and projectedState. They can no longer diverge; behaviour unchanged (185 existing DocumentServiceTest cases stay green). 7316c51
  • File-replace "treat as manual" (Markus) — added the one-line clarifying comment at the originalFilename reassignment site. 7316c51
  • Test gaps (Sara, Elicit, Felix) — added: save-time precision+raw carry-over (exercises the shared resolvers), a RANGE round-trip at save-time, factory null-index → "" guard, four backfill-matcher negative near-misses (ASCII hyphen vs en dash, missing separator, year-with-trailing-letters, index-without-separator), and tightened the integration count assertion to exactly 1. f656f7c
  • Helper UX (Leonie, Sara) — bumped the helper to 14px (text-sm) for the 60+ audience and the component test now asserts the actual localized string + the size class. 0693cfd

Touched-class suite green (DocumentServiceTest 187, factory 8, matcher 26, integration 4); the DescriptionSection browser spec passes; lint clean.

Deferred (with rationale)

  • Unbounded findAll() in one transaction (Tobias, Nora) — kept as-is. The issue mandated a synchronous sweep matching the existing backfill-versions / backfill-file-hashes precedent, the corpus is bounded (~1.5k docs), and all three backfills share this pattern. Changing it now would diverge from spec + precedent for no current benefit; it's the right follow-up if the corpus ever approaches the #481 50k threshold.
  • Async + status pattern instead of sync (Tobias) — same reason; the spec explicitly chose synchronous ("microseconds/row"). The port-8080-direct note in api_tests is the documented mitigation for the one-shot call.
  • Backfill heuristic can rewrite a hand-typed title that looks machine-generated (Felix, Nora) — accepted FR-004 trade-off, recorded in ADR-031; the lost pre-edit state makes an exact match impossible for the backlog, and it's a one-shot idempotent operation.
  • Runbook in release notes (Tobias) — no release-notes file exists in this repo; the runbook lives in api_tests/Admin-Auth.http, ADR-031, and the issue completion comment.
## Review concerns addressed — pushed `7316c51`…`0693cfd` Thanks all. No blockers were raised; here's how the concerns landed. ### Fixed - **`projectedState` drift** (Markus, Felix, Sara) — extracted `effectivePrecision` / `effectiveMetaDateEnd` / `effectiveMetaDateRaw`, now the single skip-null resolution used by **both** `applyDatePrecision` (the real setters) and `projectedState`. They can no longer diverge; behaviour unchanged (185 existing `DocumentServiceTest` cases stay green). `7316c51` - **File-replace "treat as manual"** (Markus) — added the one-line clarifying comment at the `originalFilename` reassignment site. `7316c51` - **Test gaps** (Sara, Elicit, Felix) — added: save-time precision+raw **carry-over** (exercises the shared resolvers), a **RANGE** round-trip at save-time, factory **null-index → `""`** guard, four backfill-matcher **negative near-misses** (ASCII hyphen vs en dash, missing separator, year-with-trailing-letters, index-without-separator), and tightened the integration count assertion to exactly `1`. `f656f7c` - **Helper UX** (Leonie, Sara) — bumped the helper to **14px** (`text-sm`) for the 60+ audience and the component test now asserts the **actual localized string** + the size class. `0693cfd` Touched-class suite green (DocumentServiceTest 187, factory 8, matcher 26, integration 4); the `DescriptionSection` browser spec passes; lint clean. ### Deferred (with rationale) - **Unbounded `findAll()` in one transaction** (Tobias, Nora) — kept as-is. The issue mandated a *synchronous* sweep matching the existing `backfill-versions` / `backfill-file-hashes` precedent, the corpus is bounded (~1.5k docs), and all three backfills share this pattern. Changing it now would diverge from spec + precedent for no current benefit; it's the right follow-up if the corpus ever approaches the #481 50k threshold. - **Async + status pattern instead of sync** (Tobias) — same reason; the spec explicitly chose synchronous ("microseconds/row"). The port-8080-direct note in `api_tests` is the documented mitigation for the one-shot call. - **Backfill heuristic can rewrite a hand-typed title that looks machine-generated** (Felix, Nora) — accepted FR-004 trade-off, recorded in ADR-031; the lost pre-edit state makes an exact match impossible for the backlog, and it's a one-shot idempotent operation. - **Runbook in release notes** (Tobias) — no release-notes file exists in this repo; the runbook lives in `api_tests/Admin-Auth.http`, ADR-031, and the issue completion comment.
marcel merged commit 0693cfddd1 into main 2026-06-04 17:32:50 +02:00
marcel deleted branch feat/issue-726-auto-title-sync 2026-06-04 17:32:50 +02:00
Sign in to join this conversation.
No Reviewers
No Label
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: marcel/familienarchiv#727