Bug: PersonService.findOrCreateByAlias kann NonUniqueResultException werfen bei case-kollidierenden Aliassen (Schwester von #730) #731
Reference in New Issue
Block a user
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Same structural lookup bug as #730, in the Person domain — and it lives on two lookup surfaces, not one:
Alias path —
PersonService.findOrCreateByAlias(rawName)resolves viapersonRepository.findByAliasIgnoreCase(alias)→Optional<Person>. If two persons have aliases that collide case-insensitively (e.g.müllerandMüller), the case-insensitive derived query matches two rows and Hibernate throwsNonUniqueResultException→IncorrectResultSizeDataAccessException→ generic 500. Latent — only triggers once two case-colliding rows coexist.Name path —
PersonService.findByName(firstName, lastName)→personRepository.findByFirstNameIgnoreCaseAndLastNameIgnoreCase(...)→Optional<Person>, called byDocumentService.storeDocument(DocumentService.java:222) to resolve the upload sender from the parsed filename (.orElse(null)). Two real people sharing a first+last name (Hans Müller/hans müller) is routine, so this path can throw the same 500 on upload. The original draft asked only to "confirm its caller can't hit the throw" — review established it can, so it is in scope here, not a follow-up.Found while fixing #730 (the tag-domain instance). Both are the same root cause; fix both in this one PR (near-identical diff).
Root cause
Optional<…> findBy…IgnoreCase(…)is a leaky identity that only works while the queried column is globally unique case-insensitively. There is no such constraint onpersons.aliasnor on first/last name, and the importer can legitimately create colliding rows. Like #730 this is a lookup problem, not a data problem — do not add aunique(lower(alias))constraint and do not merge/dedupe persons.source_refis the stable identity (ADR-025 /upsertBySourceRef);alias/name are human-editable labels.Definition — "ambiguous": 2+ persons match the queried label case-insensitively with no exact-case match. An exact-case match among case-folding siblings is not ambiguous — the exact row always wins.
Affected code
backend/.../person/PersonService.javafindOrCreateByAlias(~line 125), viapersonRepository.findByAliasIgnoreCase. Note: the create branch (:130-155) is non-trivial — SKIP early-return, INSTITUTION/GROUP vs. name-split + maiden-name alias (PersonNameParser.split+aliasRepository). It must be preserved verbatim; do not flatten it to #730's tag one-liner. The.orElseGet(...)lambda simply moves below the new exact/CI guards.findByName(~line 114), viapersonRepository.findByFirstNameIgnoreCaseAndLastNameIgnoreCase. The service signature staysOptional<Person>— only its internals change.backend/.../person/PersonRepository.javaOptional<Person> findByAliasIgnoreCase(String alias)(line 33)Optional<Person> findByFirstNameIgnoreCaseAndLastNameIgnoreCase(String firstName, String lastName)(line 39)backend/.../document/DocumentService.java:222— the name-path caller (upload sender resolution). Layering is clean: it already resolves throughpersonService.findByName(...), not the repository, so the fix stays entirely insidePersonService+PersonRepository. Do not lift resolution logic up intoDocumentService.Fix shape
Alias path — mirror #730 (exact-case-first, non-throwing)
Delete the
Optional<…> findByAliasIgnoreCasevariant so the throwing call can't be reintroduced. KeepfindOrCreateByAliasas the single resolution point. Use.orElseThrow()(not bare.get()) on themin()so the "list is non-empty here" invariant is explicit and matches the no-bare-.get()rule. (The #730 final form80f6468duses.orElseThrow()— mirror that, not the original.get().)Name path — non-throwing, bail to null on ambiguity (decision below)
Add
List<Person> findAllByFirstNameIgnoreCaseAndLastNameIgnoreCase(...)+ an exactfindByFirstNameAndLastName(...); delete theOptional<…>IgnoreCasevariant.findByNamereturns:Fail-closed requirement on null first name. Parsed filenames can have a null first name; today the derived query folds that to a non-match. The new plural method must keep this: confirm Spring Data generates
first_name = ?(which never matches a null arg), notfirst_name IS NULL. Afirst_name IS NULLpredicate would silently widen the result and pull a last-name-only / institution row in as a "sender" — a quiet provenance-integrity defect. A null first name must resolve to no match. Write that failing test first (repo layer, real Postgres).Resolved decision — name-collision sender resolution
When a filename resolves to a name shared (case-insensitively) by 2+ persons:
Hans Mülleris worse than an empty field (senior reviewers won't re-check a pre-filled value). Leave the sender unset and let the human assign it.The two divergent strategies are deliberate. The load-bearing comment at the name path must explain why it differs from the alias path (provenance correctness > pre-filled guess), not just that it differs — otherwise a future "consistency cleanup" reintroduces the confidently-wrong-sender bug.
Acceptance criteria
Test strategy
Unit (
PersonServiceTest, mocked repo) — each branch a separate test:" Clara Cram "trims then resolves · null first name → folds to non-match.findByAliasIgnoreCaseis mocked in 8PersonServiceTestcases (lines 378, 390, 403, 425, 442, 460, 472, 476); they returnOptionaland won't compile after the method change.DocumentServiceTestis unaffected — leave it alone. Its mocks atDocumentServiceTest.java:1072/1089/1104stubpersonService.findByName(...)returningOptional; since the service signature staysOptional<Person>, they still compile. Do not edit them.Repository (
PersonRepositoryTest,@DataJpaTest) — MUST be migrated, not just deleted:PersonRepositoryTest.java:123-154callsfindByAliasIgnoreCase(:126,:137) andfindByFirstNameIgnoreCaseAndLastNameIgnoreCase(:146/:149) directly. Deleting those two repo variants breaks this test's compilation. Rewrite the three @Test methods against the new methods (findByAlias,findAllByAliasIgnoreCase,findByFirstNameAndLastName,findAllByFirstNameIgnoreCaseAndLastNameIgnoreCase).findAllByAliasIgnoreCase("müller")returns bothMüller/müller) — proves the derived query itself folds case in real Postgres, independent of the service test.findAllByFirstNameIgnoreCaseAndLastNameIgnoreCase(null, "Müller")returns empty (no widening). This NULL-semantics bug cannot be caught by a mocked unit test — assert it against real Postgres.Integration (Testcontainers
postgres:16, inPersonServiceIntegrationTest):Müller/müller; only the real DB proves PostgresLOWER()foldsü. A plain-ASCII test stays green while umlaut aliases regress.Hans Müller/hans müller, drive the upload/findByNamepath → 200, sender null (assert it's null, not an arbitrary person). "No 500" alone is too weak — assert the surviving value. Add the happy single-match case (uniqueHans Müller→ sender set) and the null-first-name case.PersonServiceIntegrationTestis already@Transactional(rollback) — keep it so seeded umlaut rows don't leak across tests.Coverage: backend gate is 88% branch (JaCoCo). New branches (exact-present / single-CI / multi-CI / create / SKIP / null-firstName) each need a hitting test.
Verification / observability
GlobalExceptionHandlerhas no dedicated handler forIncorrectResultSizeDataAccessException; it falls through to the generic@ExceptionHandler(Exception.class)(~line 100) →Sentry.captureException+log.error+ a wire body ofINTERNAL_ERROR/ "An unexpected error occurred". No Hibernate class name, SQL, or row count reaches the client. After the fix this path no longer throws, but keep/add a regression test asserting the body carries no Hibernate class name, no SQL, no "2 results were returned" — it guards the handler against being weakened later.GROUP BY … HAVING).IncorrectResultSizeDataAccessException/NonUniqueResultExceptionfiltered topersonstack frames, record the group id in the PR, and confirm zero new occurrences in the 24h after deploy.UX note
Option B leaves the sender unset on an ambiguous filename. Verified the empty-sender state already degrades cleanly and is tested (
DocumentRow.svelte"renders the unknown placeholder when sender is null";DocumentMetadataDrawerno-persons placeholder), andmetadataComplete=falsealready routes the doc into the "needs completion" state. So this is adequately covered here — no UI work in this PR, no blocker.Optional follow-up (file as a separate low-priority issue, not this PR): an inline hint on the edit form so the transcriber learns why the sender is empty. A11y spec for whoever picks it up: redundant cue (info icon + text, not color-only), German copy "Absender konnte aus dem Dateinamen nicht eindeutig zugeordnet werden", min 16px,
text-inkonbg-surface(do not usebrand-mintas text — ~2.8:1 fails AA), andaria-describedbytying the hint to the sender field. Keep the ambiguous-copy ("nicht eindeutig zuordenbar") distinct from a genuinely-unknown sender ("kein Absender").Notes
unique(lower(...))index — it's semantically wrong (collisions are valid) and would fail to apply against existing colliding rows, turning a clean deploy into a failed Flyway migration that blocks startup.docs/adr/032-tag-name-resolution-tolerates-case-collisions.md(created for #730, onmain) already names this Person path as the tracked sibling ("The sibling Person path is unfixed but tracked … deferred to #731"). This fix is the same root-cause pattern, so add a "Person extension" section to ADR-032 rather than a new ADR — keep the case-collision resolution story in one document. The section records: alias path = deterministic lowest-id (mirrors tag); name/sender path = bail-to-null with the provenance rationale (the one genuinely new wrinkle vs. the tag decision); and the same "nounique(lower(...))constraint" stance. Flip the load-bearing code comment to point at ADR-032 (not just "#730"). (Aside, separate housekeeping: thefeat/issue-684-person-delete-db-fkbranch independently created a differentdocs/adr/032-person-delete-db-level-integrity.md— an ADR-number collision to resolve when that branch merges; it does not affect this issue, which extends themainADR-032.)resolveExactThenCi(...)helper. The two paths have different fallbacks (lowest-id vs. null-on-ambiguity), and on a correctly-based branch the tag code (#730) is the only other user — extracting a 3-way-shared helper is premature. Inline the exact→CI→fallback logic in each path with the load-bearing comment; that's the KISS choice. (If a future refactor does extract it, the fallback must be a parameter, never buried.)person/README.mdmethod table (lines 23–24) for the changedfindByName/findOrCreateByAliassignatures. Also fix line 23's description —findByNameis currently mislabelled "Typeahead search"; its sole caller is filename-based sender resolution instoreDocument. No DB-diagram update (no schema change). No C4 update (no new package/route/service).Implemented in PR #745 (
fix/issue-731-person-case-collision, branched offmainwith #730 merged).Two atomic commits, each red→green TDD:
20cfe41f— alias path:findOrCreateByAliasresolves exact-case → lowest-id case-insensitive sibling → existing create branch (INSTITUTION/GROUP + maiden-name alias preserved verbatim); throwingfindByAliasIgnoreCasedeleted.ddf378aa— name/sender path:findByNameresolves exact-case → single CI → else empty (ambiguous sender left unset, not guessed). Both new name queries use explicit HQL equality so a null first name binds= NULL(no match), not the derivedfirst_name IS NULLwidening. ThrowingfindByFirstNameIgnoreCaseAndLastNameIgnoreCasedeleted. Opaque-error path pinned. ADR-032 extended with the Person section;person/README.mdfindByNamemislabel fixed.Acceptance criteria — all covered by tests:
PersonServiceTest(64),PersonRepositoryTest(52, Testcontainers, incl. umlaut folding + null-first-name on real Postgres),PersonServiceIntegrationTest(19, upload path viastoreDocument),GlobalExceptionHandlerTest(4).TagCaseCollisionIntegrationTest(3) andDocumentServiceTest(187) re-run green.clean packageBUILD SUCCESS.No DB migration — code-only, reversible; no
unique(lower(...))constraint; no merge/dedupe; no shared helper.Deferred to you (needs prod / GlitchTip access): the two read-only pre-deploy detection queries (latent vs. already-active counts), and the post-deploy GlitchTip check for
IncorrectResultSizeDataAccessException/NonUniqueResultExceptionin person frames.Next: multi-persona review on PR #745.