The lowest-id tie-break stream is guarded non-empty, so .get() never
throws — but the project bans Optional.get(). Switch to .orElseThrow()
for the project idiom. No behaviour change.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Mocked TagServiceTest can't prove the two things that actually broke:
that findAllByNameIgnoreCase folds umlauts the way Postgres LOWER() does,
and that saving a document tagged with a case-colliding tag no longer
throws NonUniqueResultException. Testcontainers postgres:16-alpine:
- updateDocument on a doc tagged with the child "weihnachten" succeeds
and keeps exactly the child tag (not the parent).
- findOrCreate("GLÜCKWÜNSCHE") resolves the Glückwünsche/glückwünsche
umlaut pair deterministically (lowest id) without throwing — the
regression catcher a plain-ASCII pair would miss.
- bulk-edit funnels through resolveTags → findOrCreate, guarding a
future refactor that bypasses it.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
findOrCreate used tagRepository.findByNameIgnoreCase, which returns
Optional<Tag> and threw NonUniqueResultException whenever two tags
collided case-insensitively (a canonical parent and its same-named
lowercase child). Every document carrying such a tag became un-editable:
any save re-resolves the whole tag set by name and blew up with a 500.
Replace the throwing lookup with exact-case-first resolution: findByName
(exact) → findAllByNameIgnoreCase (lowest-id, deterministic, never
throws) → create. Delete findByNameIgnoreCase so the throwing call can't
be reintroduced. Case collisions are valid tree nodes — no migration, no
unique(lower(name)) constraint.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- save-time: precision+raw carry-over when the DTO omits them (exercises the shared skip-null
resolvers), and a RANGE label round-trip (Sara/Elicit)
- factory: a bare Document with a null index builds "" rather than NPE-ing (Felix)
- backfill matcher: negative near-misses — ASCII hyphen vs en dash, missing separator before
trailing text, year-with-trailing-letters, index followed by text without a separator (Sara)
- backfill integration: tighten the count assertion to exactly 1 on the clean test DB (Sara)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Extract effectivePrecision/effectiveMetaDateEnd/effectiveMetaDateRaw, used by both
applyDatePrecision (the real setters) and projectedState (the title projection), so the two
can no longer drift — addresses review feedback (Markus/Felix/Sara). Writing a stored value
back when the DTO omits a field is a harmless no-op, so behaviour is unchanged (185 existing
DocumentServiceTest cases stay green). Also documents the file-replace "treat as manual" path
inline at the reassignment site.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
ADR-031 records the shared document-package title factory, the exact-match save-time
regeneration, and the grammar-heuristic one-time backfill (with the ReDoS / no-version-spam
/ file-replace-is-manual decisions). Adds an "auto-generated title" glossary entry, extends
the document-management c4 diagram with DocumentTitleFactory / DocumentTitleBackfillMatcher
and the backfill flows, and documents POST /api/admin/backfill-titles in Admin-Auth.http as
a one-shot ADMIN call hitting port 8080 directly.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pins backfill behaviour on postgres:16-alpine (H2 unusable — title is NOT NULL): a stale
auto-title is rewritten, the sweep is idempotent (second run touches nothing), prose is
left alone, and the mechanical rename adds no document_versions rows. Permission (401/403)
stays in the faster @WebMvcTest slice.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds POST /api/admin/backfill-titles (ADMIN-only, synchronous) which rebuilds every
machine-generated title from the row's current state. A grammar heuristic
(DocumentTitleBackfillMatcher) decides overwritability: index matched literally via
startsWith (originalFilename is user-controlled — no regex injection / ReDoS, CWE-1333),
date-label forms derived from the same Locale.GERMAN formatters as the factory so they
cannot drift, prose left untouched, fail-closed on any surprise. Saves via the repository
directly (no recordVersion — follows backfillFileHashes), so the mechanical rename never
version-spams document_versions. Idempotent: a second run rewrites nothing. Emits one
SLF4J-parameterized scanned/updated/skipped line.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
updateDocument now captures the machine title from the persisted state before any
setter runs, and rebuilds it from the new state only when the submitted title still
equals that machine value — an exact comparison that relies on the edit form
round-tripping an untouched title verbatim. A hand-written or freshly-typed title is
kept; a blank submission falls back to the rebuilt auto-title (title is always present);
a file-replaced document no longer matches its import-time title and is treated as
manual. projectedState mirrors the setter asymmetry exactly (date/location overwrite
incl. null-clear; precision/end/raw skip-null from the entity).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move DocumentTitleFormatter from importing into the document package and
introduce DocumentTitleFactory there as the single source of truth for the
{index} – {dateLabel} – {location} formula. DocumentImporter now consumes the
factory instead of owning the composition; the document package owns the rule,
importing depends on it (not the reverse). No behavioral change — importer
title assertions and the #666 fixture parity test stay green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Delete findConversation and findSinglePersonCorrespondence (no remaining
callers after the service methods were removed) and their integration
test section. Drops the now-unused LocalDate import.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Delete getConversationFiltered (the endpoint's only caller is gone) and
the dead 2-arg getConversation(personA, personB) which had zero callers,
along with both getConversationFiltered test blocks. The hasSender/
hasReceiver specifications stay — document search still uses them.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Delete GET /api/documents/conversation and its controller handler — the
only client was the removed Briefwechsel view. Drops the now-unused Sort
import.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The density chart is an interactive filter control; a 5-minute private
browser cache let it show stale month counts after an edit/upload/re-tag.
The in-memory aggregation is sub-200ms p95 over ~5k docs, so there is no
load reason to cache. Removing the explicit header lets Spring Security's
default no-store directive apply, so the response is always fresh.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Addresses Sara's review concerns:
- Add a negative Testcontainers test: saveAndFlush of a RANGE with end < start
throws DataIntegrityViolationException, proving chk_meta_date_end_after_start
actually fires (H2 wouldn't) and exercising the backstop's trigger end-to-end.
Guards against silent app/DB drift if the service guard ever regresses.
- Tighten updateDocument_acceptsRange_whenEndAfterStart to assert the persisted
end value, not just that save was called.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Addresses Tobias's review concern: the generic DataIntegrityViolation
backstop turned every integrity violation into a silent 400 with no
constraint name, no stack, no Sentry — an unanticipated write bug would
fail invisibly in production.
Now extract the constraint NAME from the cause chain (schema metadata, safe
for Loki) and log it parameterized at WARN, so the failure is debuggable.
Still never pass `ex`/`getMessage()` (SQL + values, CWE-209) and still no
Sentry — the response stays generic, so the response logic is not brittle.
New test proves the WARN names the constraint but never carries the SQL.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Testcontainers integration test persisting a RANGE doc with end == start
against real Postgres + Flyway, which (unlike H2) enforces the V69
chk_meta_date_end_after_start CHECK. Pins the app guard's isBefore
semantics to the actual >= constraint, guarding against app/DB drift (AC2).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add @ExceptionHandler(DataIntegrityViolationException) returning 400
VALIDATION_ERROR with a fixed constant message, so any integrity violation
that slips past the upstream guards (a future constraint, or the import
path) becomes a clean 400 instead of a 500 + Sentry alert (AC9).
Deliberately generic — it does not inspect which constraint failed. Never
echoes ex.getMessage() (constraint name + SQL, CWE-209), logs at WARN
without passing the exception (would re-leak the SQL to Loki), and does not
call Sentry.captureException.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add the second validateDateRange predicate mirroring
chk_meta_date_end_only_for_range, so a direct API client that sets an end
date without RANGE precision gets a clean 400 INVALID_DATE_RANGE instead of
a 500 (AC6). Shares the code with the end-before-start branch.
Also fix updateDocument_preservesStoredPrecision_whenDtoOmitsIt: its stored
fixture (MONTH + end date) is a state the DB CHECK forbids, so the
carried-over-state guard correctly rejects it. Switched to RANGE + end —
the only DB-valid non-null-end combo — preserving the test's intent.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Cover AC2 (end == start), AC3 (open-ended, end null) and AC4 (null start +
end set, which must not reject or NPE), plus end-after-start. Guards the
guard against future over-rejection that would diverge from the DB CHECK.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add ErrorCode.INVALID_DATE_RANGE and a validateDateRange guard on
DocumentService.updateDocument, run right after applyDatePrecision so it
fires before any save (updateDocumentTags persists earlier in the method).
Mirrors the V69 chk_meta_date_end_after_start CHECK: end >= start with a
null start allowed, using isBefore so equal dates stay valid. Turns a user
date typo into a clean 400 instead of a 500 + Sentry alert.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The /search path already pins the Boolean-undated->primitive coercion via
search_withoutUndatedParam_forwardsFalseToService; add the symmetric pin for
getDocumentIds so an absent param provably resolves to undated=false on the
record (never NPE). Raised in the #702 review.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Clarify at loadFilteredDates why the density path constructs a SearchFilters:
the two filter records are kept separate (density has no date/undated fields),
so it adapts here to reuse buildSearchSpec. Raised in the #702 review.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the ~29 repeated `new SearchFilters(null, null, null, null, null,
null, null, null, null, false)` literals across the search test suites with
a shared SearchFiltersFixtures.noFilters() factory (and noFilters()
.withUndated(true) for the undated-only case). Tests that pin a specific
field keep their explicit `new SearchFilters(...)` so intent stays visible.
Pure test-ergonomics cleanup raised in the #702 review; no behaviour change.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the long positional filter lists on the document search chain
with the SearchFilters record. searchDocuments now takes
(SearchFilters, DocumentSort, String dir, Pageable) and findIdsForFilter
takes a single SearchFilters; the four private helpers (buildSearchSpec,
runSearch, countUndatedForFilter, isPureTextRelevance) no longer carry a
positional 10-field filter list. The controller builds the record after
its existing tagOp/undated coercions; the density path adapts its
DensityFilters into a SearchFilters at the shared buildSearchSpec call.
The forced-undated count path is preserved via filters.withUndated(true),
so countUndatedForFilter still ignores the user's toggle (#668) while
runSearch honours it. No behaviour change.
Controller binding tests swap their positional any()/eq() matchers for
ArgumentCaptor<SearchFilters>, asserting captured.undated()/.status()/
.sender() — strictly stronger than the previous any()-soup.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Filter-only value object bundling the ten search predicates so the long
positional argument lists on the document search chain can be replaced
with one named record — killing the sender/receiver and from/to swap-bug
class. Mirrors the existing DensityFilters; carries a withUndated copy
accessor for the forced-undated count path. Unused as of this commit.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Round out the "read-only users can't write anything" boundary: a READ_ALL
principal is forbidden from posting a block comment, replying, and editing a
comment (the prior tests only used a no-authority principal for create).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move the hasTranscription existence query out of the shared getDocumentById
into a dedicated getDocumentDetail used solely by GET /api/documents/{id}.
The flag is only consumed by the detail page, so the extra EXISTS query no
longer runs for the many internal getDocumentById callers (e.g. the
Geschichte resolve loop and the dashboard resume path). Behaviour of the
detail endpoint is unchanged.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Read-only users will soon be able to open the transcription read view, so
the write endpoints become the real authorization boundary. Explicitly
assert a READ_ALL-only principal is forbidden from create/update/reorder/
review block writes and annotation create/patch (the prior tests only used
a no-authority principal).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
getDocumentById now populates a transient hasTranscription boolean so the
document detail page can gate the transcription entry control at first
paint (no client store, no full block fetch, no layout shift).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Domain-service wrapper over existsByDocumentId so other domains can ask
"does this document have any transcription blocks?" without reaching into
the repository.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Cheap EXISTS query backing a server-side "has a transcription" signal so
read-only users can be offered the read view at first paint.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Address review nit: the older getTagTree tests relied on Mockito's default
empty-list return for findSubtreeDocumentCountsPerTag. Stub it explicitly so
the two-query contract is self-documenting.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Record that getTagTree returns both documentCount (direct, read by admin
surfaces) and subtreeDocumentCount (rollup, read by the reader surfaces),
matching the corrected getTagTree JavaDoc.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Cover AC#1-4 (leaf=direct, distinct overlap counted once, full descendant
depth), REQ-THEMEN-05 (empty subtree absent), REQ-THEMEN-06 (cycle terminates
via the 50-level guard) and AC#7 (rollup equals distinct documents found by the
real tag-search expansion — count↔destination parity). Testcontainers
postgres:16-alpine since the recursive CTE + COUNT(DISTINCT) needs real PG.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add subtreeDocumentCount to TagTreeNodeDTO, populated by a new recursive-CTE
aggregate query that builds a tag closure and counts distinct documents per
ancestor subtree. The direct documentCount is unchanged; getTagTree now maps
both counts onto each node from two aggregate queries (no N+1).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Hiding the header upload button is UI polish; the real control is endpoint
authz. Add explicit READ_ALL-only 403 boundary tests for POST /api/documents
and POST /api/documents/quick-upload, matching the reader-only convention
already used elsewhere in this suite.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sara's QA concerns:
1. PersonControllerTest.updatePerson_returns200_whenGenerationNull was
asymmetric — only checked status 200, no body assertion. Now also
asserts `$.generation` is null in the JSON response, mirroring the
in-range test's body check.
2. New full-stack PUT→DB→GET round-trip in PersonServiceIntegrationTest
(updatePerson_clearGenerationToNull_readsBackNullFromDb) seeds a
person with generation=3, calls updatePerson with generation=null,
flushes the persistence context, and asserts the column reads back
null from the DB. Without this we only had the mocked WebMvcTest
boundary; nothing proved JPA actually wrote SQL NULL.
3. Sibling test (updatePerson_setGenerationToZero_readsBackZeroFromDb)
pins the G 0 end-to-end so a primitive zero can't silently coerce
to null anywhere along controller → service → JPA.
Refs #689
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Markus flagged the 0/10 range was duplicated across five sites (DB
CHECK, both importers, DTO @Min/@Max, dropdown range). New
PersonGeneration.MIN_GENERATION / MAX_GENERATION constants are now
the canonical Java source; the DTO annotations and both importer
guards reference them. The V70 SQL CHECK comment now points at the
Java constants so future widening updates one Java class plus one
SQL literal (Flyway forbids rewriting the migration in place).
Refs #689
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Inject RelationshipService into CanonicalImportOrchestrator and walk
PARENT_OF edges in the family network after both person loaders finish
(before documents). For every edge where child.generation is set and
not strictly deeper than parent.generation, log a WARN — soft check,
never fails the batch.
Reads through getFamilyNetwork() per the layering rule (orchestrator
never touches PersonRelationshipRepository directly). Curators see the
warning in the import log; the rest of the pipeline is unaffected so
data with curatorial gaps still loads cleanly.
Refs #689
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
PersonNodeDTO is a positional record. The optional Integer generation
field is inserted between deathYear and familyMember so all four
construction sites stay readable without a builder.
- RelationshipService.getFamilyNetwork → populates with
person.getGeneration() (the Stammbaum's strict-rank source on the
frontend).
- RelationshipInferenceService.findAllFor → populates the same way;
inference UI does not consume it but the field travels along for
consistency.
- RelationshipControllerTest fixtures pass null.
Refs #689
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reads the optional `generation` integer from the canonical tree JSON and
routes it into PersonUpsertCommand. Out-of-range values are skip-and-
warned with the same policy as the register importer.
Tree imports run after register (per CanonicalImportOrchestrator); a
tree-confirmed integer overwrites a register-parsed value — both sides
are "canonical" in preferHuman terms (neither is a human edit).
Refs #689
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reads the optional `generation` cell by header name (REQUIRED_HEADERS is
not extended — REQ-IMP-001 backward-compat for older artifacts), parses
it through GENERATION_PATTERN (^\s*G?\s*(-?\d+)), and routes it into
PersonUpsertCommand.generation.
Out-of-range values (G 99, G -1) are skip-and-warned, never abort the
batch; the post-parse range guard mirrors the V70 CHECK constraint so
the DB never sees a value Bean Validation wouldn't accept.
Pinned with a parametrised CsvSource covering every shape from the
acceptance criteria plus a backward-compat test (artifact without a
generation column still imports, all upserts get generation=null).
Refs #689
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- fromCanonical writes the imported generation into a new Person row.
- mergeCanonical routes existing/canonical generation through the
existing preferHuman(Integer, Integer) overload so a human-edited
value is never overwritten on re-import (ADR-025).
- updatePerson writes generation verbatim from the form DTO so a human
can clear it back to null — same shape as birthYear/deathYear.
- createPerson(PersonUpdateDTO) writes generation so /persons/new flow
doesn't silently drop a selected G value on create.
Pinned with five tests covering the four write paths plus the
documenting test that captures preferHuman's known limitation
(explicit human null is overwritten by a non-null canonical value —
same as birthYear/deathYear, deferred to a future helper rework if it
ever produces a user-visible bug).
Refs #689
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds the optional generation field to both DTOs:
- PersonUpsertCommand gains Integer generation in the canonical-import
builder chain; service wiring lands in the next commit.
- PersonUpdateDTO gains @Min(0)@Max(10) Integer generation, the form-path
surface. The constraints mirror the V70 CHECK so validation fails fast
at the controller before reaching the DB.
PersonControllerTest pins the validation behaviour: -1 → 400, 11 → 400,
null → 200, 3 → 200 for both PUT (update) and POST (create) paths. The
GlobalExceptionHandler maps MethodArgumentNotValidException to
VALIDATION_ERROR so the frontend's extractErrorCode keeps working.
Refs #689
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Flyway V70: SMALLINT generation column with CHECK(0..10) and partial
index over non-null rows. Person.generation field surfaces it through
the JPA model. Pre-import rows and persons outside the curated family
graph legitimately stay null; the canonical importer (next commits)
back-fills via preferHuman so a human-edited value is never lost.
Refs #689
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
addRelationship now auto-flips family_member=true on both endpoints for
PARENT_OF/SPOUSE_OF/SIBLING_OF (commit 07300aef). That side-effect breaks
the pre-condition assertion in setFamilyMember_true_makes_person_appear_in_network,
which expects charlie not to appear in the network before the explicit flip.
Reset charlie's flag after addRelationship so the test still exercises the
setFamilyMember(true) -> network presence path it was written for.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
DocumentImporter exposed a package-private openFileStream(File) so a
Mockito spy could force the IO-error branch of isPdfMagicBytes. The
test-only seam leaked into production: the method existed for testing,
not for any production extensibility.
Replace with a constructor-injected FileStreamOpener interface (single
abstract method, @FunctionalInterface) and a one-line
@Component DefaultFileStreamOpener delegate. Tests now inject a mock
opener instead of spying on the importer itself, which is also a more
idiomatic Mockito usage.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
resolveReceivers passed the slug as both `sourceRef` AND `lastName`, so
an unresolved receiver "smith-john" became a provisional Person with
lastName="smith-john" — a regression of the existing senderName→Person
contract.
Fix: zip the parallel `receiver_person_ids` and `receiver_names`
columns by position (the normalizer emits them 1:1 like
sender_person_id/sender_name). When the names list is shorter than the
slugs list, fall back to slug-as-name for the missing entries.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
buildDocument was a ~30-line method mixing attribution routing, date
parsing, authoritative collection management, file metadata, and
computed flags. Split into five named helpers — applyAttribution,
applyDates, applyAuthoritativeAssociations, applyFileMetadata,
applyComputedFlags — each doing one job. Pure refactor; all 43 existing
DocumentImporterTest cases still pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>