Bug: Dokument mit case-kollidierendem Tag lässt sich nicht speichern (findByNameIgnoreCase NonUniqueResultException) #730

Closed
opened 2026-06-04 19:14:58 +02:00 by marcel · 1 comment
Owner

Summary

Saving any edit to a document fails with HTTP 500 ("Ein unerwarteter Fehler ist aufgetreten") when the document carries a tag whose name collides case-insensitively with another tag in the tree. TagService.findOrCreate(name) resolves via tagRepository.findByNameIgnoreCase(name), which expects a unique row but matches two, throwing NonUniqueResultException.

This is independent of #726 (the tag-resolution path is untouched by that PR; it reproduces on main).

Reproduction (observed on staging)

  1. Open document 4423782d-e8b9-4bcb-bde5-0777dc53908c (carries tag weihnachten).
  2. Change the date, save.
  3. → 500. Backend stack trace:
org.springframework.dao.IncorrectResultSizeDataAccessException:
    Query did not return a unique result: 2 results were returned
  at TagService.findOrCreate(TagService.java:60)            // findByNameIgnoreCase("weihnachten")
  at DocumentService.resolveTags(DocumentService.java:578)
  at DocumentService.updateDocumentTags(DocumentService.java:560)
  at DocumentService.updateDocument(DocumentService.java:409)
Caused by: org.hibernate.NonUniqueResultException: ... 2 results were returned

Root cause

DocumentService.updateDocument re-resolves the document's tags on every save: the edit form round-trips tag names (tags.map(t => t.name).join(','), frontend/src/routes/documents/[id]/edit/+page.server.ts:89), and resolveTags calls TagService.findOrCreate(name) per name:

public Tag findOrCreate(String name) {
    String cleanName = name.trim();
    return tagRepository.findByNameIgnoreCase(cleanName)              // ← Optional<Tag>, blows up on 2 matches
            .orElseGet(() -> tagRepository.save(Tag.builder().name(cleanName).build()));
}

The lookup is case-insensitive over the whole tag table, but the canonical tag tree legitimately contains names that differ only by case — a parent and its same-named lowercase child, or two siblings. These are not accidental duplicates and must not be merged/deduped (each has a distinct source_ref / tree position and real document attachments):

name source_ref (tag_path) parent docs children
Geburt Geburt 0 3
geburt Geburt/geburt →Geburt 36 0
Hochzeit Hochzeit 0 7
hochzeit Hochzeit/hochzeit →Hochzeit 51 0
Weihnachten Weihnachten 0 5
weihnachten Weihnachten/weihnachten →Weihnachten 69 0
Glückwünsche / glückwünsche parent / Glückwünsche/glückwünsche — / →Glückwünsche 0 / 12 4 / 0
Reisepläne / reisepläne Reise/Reisepläne / Reise/reisepläne both →Reise 2 / 12 siblings

So a global case-insensitive uniqueness constraint would be wrong — the collision is a lookup problem, not a data problem. source_ref (tag_path) is the stable identity (ADR-025 / upsertBySourceRef); name is a human-editable label. findByNameIgnoreCase was always a leaky identity that only works while names are globally unique case-insensitively — an invariant the canonical tree explicitly violates.

Impact

Every document carrying one of these 10 tags is currently un-editable (any field — date, sender, transcription — fails on save, because the whole tag set is re-resolved). On staging that's 36 + 51 + 69 + 12 + 12 + 2 ≈ 180 document-tag attachments across common tags. No end-user workaround. The user-facing symptom is the opaque generic error with no hint that a tag is the cause.

Fix (code only — do NOT touch the data)

Make name→tag resolution unambiguous and non-throwing in TagService.findOrCreate. Two repository methods with distinct names (Spring Data can't disambiguate by return type alone):

  • Optional<Tag> findByName(String name) — exact-case derived query.
  • List<Tag> findAllByNameIgnoreCase(String name) — the plural list. Delete the existing Optional<Tag> findByNameIgnoreCase so the throwing call can't be reintroduced.

Resolution order:

String cleanName = name.trim();
Optional<Tag> exact = tagRepository.findByName(cleanName);
if (exact.isPresent()) return exact.get();                          // exact-case wins (edit round-trip replays stored name)
List<Tag> ci = tagRepository.findAllByNameIgnoreCase(cleanName);
if (!ci.isEmpty()) {
    return ci.stream().min(Comparator.comparing(Tag::getId)).get(); // deterministic tie-break by id — NEVER throw
}
return tagRepository.save(Tag.builder().name(cleanName).build());   // create-when-absent (orphan tag: null sourceRef/parentId — preserve)
  • Tie-break by id (stable column, present on every row). Document the ordering in a one-line comment so "deterministic" has a concrete meaning.
  • Keep findOrCreate as the single source of truth — do not push resolution logic up into DocumentService.resolveTags. The single-doc edit, bulk-edit, and upload-batch paths all funnel through resolveTagsfindOrCreate, so this one change fixes all three.
  • Leave a load-bearing comment at both findOrCreate and the repo method: case-colliding tag names across the tree are valid (parent Geburt + child geburt); do NOT add a unique(lower(name)) constraint.

Decision (accepted — option A): free-text tag entry semantics under case collision are accepted as-is for this issue. With "exact-case wins → single CI → create", typing the bare word weihnachten in new-doc/bulk-edit binds to the deep child (Weihnachten/weihnachten); typing Weihnachten binds to the parent container. This is correct for the edit round-trip and acceptable for free-text authoring. Disambiguating the typeahead/chips (showing tree path so an author can tell a container from its same-named child) is tracked as a separate follow-up, alongside the deeper Option 2 below — NOT in this issue.

Out of scope (follow-up issue): round-tripping tag IDs rather than names so resolution can't be ambiguous at all. Cleaner long-term shape (removes the name-as-key smell) but a larger change with frontend surface. File separately; the lookup fix above is the minimal correct unblock.

Acceptance criteria

Scenario: Saving a document tagged with a case-colliding tag succeeds
  Given two tags exist whose names differ only by case ("Weihnachten" parent, "weihnachten" child)
  And a document is tagged with the child "weihnachten"
  When I change the document's date and save
  Then the save succeeds (no 500) and the document keeps exactly the "weihnachten" tag (not "Weihnachten")

Scenario: Exact-case match wins over a case-insensitive sibling
  Given tags "Geburt" and "geburt" both exist
  When tags are resolved for the name "geburt"
  Then the tag whose name is exactly "geburt" is returned

Scenario: findOrCreate never throws on case-insensitive duplicates
  Given two tags collide case-insensitively and neither equals the queried name exactly
  When findOrCreate is called with that name
  Then it returns the same one deterministically (lowest id) on every call, without throwing

Test strategy

Unit (TagServiceTest, mocked repo) — all four resolution branches as separate tests:

  • exact-case wins
  • single CI match → use it
  • multiple CI matches (no exact) → deterministic (lowest id) + no throw — assert determinism by calling twice and asserting the same id (a single call can't distinguish "deterministic" from "happened to pick this one")
  • create-when-absent → still creates the orphan tag (regression guard)
  • whitespace + collision: " weihnachten " still trims then lands on the child
  • Update the 3 existing mocks at TagServiceTest lines 58/69/81 — they reference findByNameIgnoreCase returning Optional and won't compile after the method change.

Integration (Testcontainers postgres:16):

  • Seed parent Weihnachten + child weihnachten; tag a doc with the child; PUT /api/documents/{id} changing the date → 200 (not 500), and assert the doc keeps exactly the child tag — assert tag set size and that the surviving tag's id is the child's, not the parent's. "No 500" alone is too weak.
  • Add a second case with an umlaut pair (Glückwünsche/glückwünsche) — this is the test that actually catches regressions: a mocked-repo unit test never exercises Postgres LOWER() folding of ü/ä; only the real DB proves findAllByNameIgnoreCase("glückwünsche") returns both rows. A plain-ASCII test would stay green while the bug reappears for umlaut tags.
  • Bulk-edit and upload-batch share resolveTagsfindOrCreate. One edit-path integration test plus a comment noting the shared code path is acceptable; better, add a thin bulk-edit assertion so a future refactor that bypasses findOrCreate is caught.

Verification / observability

  • Confirm GlobalExceptionHandler maps IncorrectResultSizeDataAccessException to a generic 500 body without leaking the Hibernate stack trace or SQL in the response payload (server-log/Sentry trace is fine; the wire response must stay opaque).
  • After deploy to staging: grab the NonUniqueResultException GlitchTip group id, save the 4423782d… repro doc, confirm no new occurrences land. A read-only query enumerating docs carrying any of the 10 colliding tags gives a post-deploy spot-check list (save one per tag → 200).
  • Proofshot the real edit flow: open a weihnachten-tagged doc, change the date, save, confirm the chip still reads weihnachten and no generic error flashes.

Notes

  • NO DB migration. Resist adding a unique(lower(name)) index — it's wrong (collisions are valid canonical nodes) and would fail to apply against the existing 10 rows, turning a clean code deploy into a failed Flyway migration that blocks startup. Code-only, zero-migration, fully reversible (roll back the JAR).
  • Sibling sweep: findByNameContainingIgnoreCase returns List (safe). Confirm PersonService.findOrCreateByAlias can't hit the same non-unique throw on user-influenced alias names — if it can, file a sibling issue (do not fix here).
  • Found while triaging an edit failure on staging during #726 verification; #726 itself is unaffected.
  • The 10 staging tags above are legitimate canonical-tree tags — leave the data as-is.
## Summary Saving **any** edit to a document fails with HTTP 500 ("Ein unerwarteter Fehler ist aufgetreten") when the document carries a tag whose name collides **case-insensitively** with another tag in the tree. `TagService.findOrCreate(name)` resolves via `tagRepository.findByNameIgnoreCase(name)`, which expects a unique row but matches two, throwing `NonUniqueResultException`. This is **independent of #726** (the tag-resolution path is untouched by that PR; it reproduces on `main`). ## Reproduction (observed on staging) 1. Open document `4423782d-e8b9-4bcb-bde5-0777dc53908c` (carries tag `weihnachten`). 2. Change the date, save. 3. → 500. Backend stack trace: ``` org.springframework.dao.IncorrectResultSizeDataAccessException: Query did not return a unique result: 2 results were returned at TagService.findOrCreate(TagService.java:60) // findByNameIgnoreCase("weihnachten") at DocumentService.resolveTags(DocumentService.java:578) at DocumentService.updateDocumentTags(DocumentService.java:560) at DocumentService.updateDocument(DocumentService.java:409) Caused by: org.hibernate.NonUniqueResultException: ... 2 results were returned ``` ## Root cause `DocumentService.updateDocument` re-resolves the document's tags on **every** save: the edit form round-trips tag **names** (`tags.map(t => t.name).join(',')`, `frontend/src/routes/documents/[id]/edit/+page.server.ts:89`), and `resolveTags` calls `TagService.findOrCreate(name)` per name: ```java public Tag findOrCreate(String name) { String cleanName = name.trim(); return tagRepository.findByNameIgnoreCase(cleanName) // ← Optional<Tag>, blows up on 2 matches .orElseGet(() -> tagRepository.save(Tag.builder().name(cleanName).build())); } ``` The lookup is **case-insensitive over the whole `tag` table**, but the canonical tag tree legitimately contains names that differ only by case — a parent and its same-named lowercase **child**, or two siblings. These are **not** accidental duplicates and must **not** be merged/deduped (each has a distinct `source_ref` / tree position and real document attachments): | name | source_ref (tag_path) | parent | docs | children | |---|---|---|---|---| | `Geburt` | `Geburt` | — | 0 | 3 | | `geburt` | `Geburt/geburt` | →Geburt | 36 | 0 | | `Hochzeit` | `Hochzeit` | — | 0 | 7 | | `hochzeit` | `Hochzeit/hochzeit` | →Hochzeit | 51 | 0 | | `Weihnachten` | `Weihnachten` | — | 0 | 5 | | `weihnachten` | `Weihnachten/weihnachten` | →Weihnachten | 69 | 0 | | `Glückwünsche` / `glückwünsche` | parent / `Glückwünsche/glückwünsche` | — / →Glückwünsche | 0 / 12 | 4 / 0 | | `Reisepläne` / `reisepläne` | `Reise/Reisepläne` / `Reise/reisepläne` | both →Reise | 2 / 12 | siblings | So a global case-insensitive **uniqueness constraint would be wrong** — the collision is a *lookup* problem, not a data problem. `source_ref` (tag_path) is the stable identity (ADR-025 / `upsertBySourceRef`); `name` is a human-editable label. `findByNameIgnoreCase` was always a leaky identity that only works while names are globally unique case-insensitively — an invariant the canonical tree explicitly violates. ## Impact Every document carrying one of these 10 tags is currently **un-editable** (any field — date, sender, transcription — fails on save, because the whole tag set is re-resolved). On staging that's 36 + 51 + 69 + 12 + 12 + 2 ≈ **180 document-tag attachments** across common tags. No end-user workaround. The user-facing symptom is the opaque generic error with no hint that a tag is the cause. ## Fix (code only — do NOT touch the data) Make name→tag resolution unambiguous and non-throwing in `TagService.findOrCreate`. Two repository methods with distinct names (Spring Data can't disambiguate by return type alone): - `Optional<Tag> findByName(String name)` — exact-case derived query. - `List<Tag> findAllByNameIgnoreCase(String name)` — the plural list. **Delete the existing `Optional<Tag> findByNameIgnoreCase`** so the throwing call can't be reintroduced. Resolution order: ```java String cleanName = name.trim(); Optional<Tag> exact = tagRepository.findByName(cleanName); if (exact.isPresent()) return exact.get(); // exact-case wins (edit round-trip replays stored name) List<Tag> ci = tagRepository.findAllByNameIgnoreCase(cleanName); if (!ci.isEmpty()) { return ci.stream().min(Comparator.comparing(Tag::getId)).get(); // deterministic tie-break by id — NEVER throw } return tagRepository.save(Tag.builder().name(cleanName).build()); // create-when-absent (orphan tag: null sourceRef/parentId — preserve) ``` - Tie-break by `id` (stable column, present on every row). Document the ordering in a one-line comment so "deterministic" has a concrete meaning. - Keep `findOrCreate` as the **single source of truth** — do not push resolution logic up into `DocumentService.resolveTags`. The single-doc edit, bulk-edit, and upload-batch paths all funnel through `resolveTags` → `findOrCreate`, so this one change fixes all three. - **Leave a load-bearing comment** at both `findOrCreate` and the repo method: case-colliding tag names across the tree are valid (parent `Geburt` + child `geburt`); do **NOT** add a `unique(lower(name))` constraint. **Decision (accepted — option A):** free-text tag entry semantics under case collision are accepted as-is for this issue. With "exact-case wins → single CI → create", typing the bare word `weihnachten` in new-doc/bulk-edit binds to the deep child (`Weihnachten/weihnachten`); typing `Weihnachten` binds to the parent container. This is correct for the edit round-trip and acceptable for free-text authoring. Disambiguating the typeahead/chips (showing tree path so an author can tell a container from its same-named child) is tracked as a **separate follow-up**, alongside the deeper Option 2 below — NOT in this issue. **Out of scope (follow-up issue):** round-tripping tag **IDs** rather than names so resolution can't be ambiguous at all. Cleaner long-term shape (removes the name-as-key smell) but a larger change with frontend surface. File separately; the lookup fix above is the minimal correct unblock. ## Acceptance criteria ```gherkin Scenario: Saving a document tagged with a case-colliding tag succeeds Given two tags exist whose names differ only by case ("Weihnachten" parent, "weihnachten" child) And a document is tagged with the child "weihnachten" When I change the document's date and save Then the save succeeds (no 500) and the document keeps exactly the "weihnachten" tag (not "Weihnachten") Scenario: Exact-case match wins over a case-insensitive sibling Given tags "Geburt" and "geburt" both exist When tags are resolved for the name "geburt" Then the tag whose name is exactly "geburt" is returned Scenario: findOrCreate never throws on case-insensitive duplicates Given two tags collide case-insensitively and neither equals the queried name exactly When findOrCreate is called with that name Then it returns the same one deterministically (lowest id) on every call, without throwing ``` ## Test strategy **Unit (`TagServiceTest`, mocked repo) — all four resolution branches as separate tests:** - exact-case wins - single CI match → use it - multiple CI matches (no exact) → deterministic (lowest id) + no throw — assert determinism by calling twice and asserting the **same id** (a single call can't distinguish "deterministic" from "happened to pick this one") - create-when-absent → still creates the orphan tag (regression guard) - whitespace + collision: `" weihnachten "` still trims then lands on the child - **Update the 3 existing mocks** at `TagServiceTest` lines 58/69/81 — they reference `findByNameIgnoreCase` returning `Optional` and won't compile after the method change. **Integration (Testcontainers `postgres:16`):** - Seed parent `Weihnachten` + child `weihnachten`; tag a doc with the **child**; `PUT /api/documents/{id}` changing the date → **200** (not 500), and assert the doc keeps **exactly** the child tag — assert tag set size **and** that the surviving tag's id is the child's, not the parent's. "No 500" alone is too weak. - **Add a second case with an umlaut pair** (`Glückwünsche`/`glückwünsche`) — this is the test that actually catches regressions: a mocked-repo unit test never exercises Postgres `LOWER()` folding of `ü`/`ä`; only the real DB proves `findAllByNameIgnoreCase("glückwünsche")` returns both rows. A plain-ASCII test would stay green while the bug reappears for umlaut tags. - Bulk-edit and upload-batch share `resolveTags` → `findOrCreate`. One edit-path integration test plus a comment noting the shared code path is acceptable; better, add a thin bulk-edit assertion so a future refactor that bypasses `findOrCreate` is caught. ## Verification / observability - Confirm `GlobalExceptionHandler` maps `IncorrectResultSizeDataAccessException` to a generic 500 body **without** leaking the Hibernate stack trace or SQL in the response payload (server-log/Sentry trace is fine; the wire response must stay opaque). - After deploy to staging: grab the `NonUniqueResultException` GlitchTip group id, save the `4423782d…` repro doc, confirm no new occurrences land. A read-only query enumerating docs carrying any of the 10 colliding tags gives a post-deploy spot-check list (save one per tag → 200). - Proofshot the real edit flow: open a `weihnachten`-tagged doc, change the date, save, confirm the chip still reads `weihnachten` and no generic error flashes. ## Notes - **NO DB migration.** Resist adding a `unique(lower(name))` index — it's wrong (collisions are valid canonical nodes) and would **fail to apply** against the existing 10 rows, turning a clean code deploy into a failed Flyway migration that blocks startup. Code-only, zero-migration, fully reversible (roll back the JAR). - Sibling sweep: `findByNameContainingIgnoreCase` returns `List` (safe). Confirm `PersonService.findOrCreateByAlias` can't hit the same non-unique throw on user-influenced alias names — if it can, file a sibling issue (do not fix here). - Found while triaging an edit failure on staging during #726 verification; #726 itself is unaffected. - The 10 staging tags above are legitimate canonical-tree tags — leave the data as-is.
marcel added the P1-highbug labels 2026-06-04 19:15:04 +02:00
Author
Owner

Implemented

Branch fix/issue-730-tag-case-collision (off main).

Commits

  • d000170f fix(tag): resolve case-colliding tag names without throwing — deleted Optional<Tag> findByNameIgnoreCase (its only production caller was findOrCreate), added Optional<Tag> findByName + List<Tag> findAllByNameIgnoreCase, and rewrote TagService.findOrCreate to: exact-case → lowest-id case-insensitive (deterministic, never throws) → create. Load-bearing comments at both findOrCreate and the repo methods warn against a unique(lower(name)) constraint. Updated the 3 existing TagServiceTest mocks and added the four resolution-branch tests (exact-case wins; single CI; multiple CI → lowest id asserted across two calls; create-when-absent; whitespace+collision).
  • a58378e8 test(tag): pin case-colliding tag resolution on real Postgres — Testcontainers postgres:16-alpine (TagCaseCollisionIntegrationTest): saving a doc tagged with the child weihnachten succeeds and keeps exactly the child (id-asserted, not the parent); the Glückwünsche/glückwünsche umlaut pair resolves deterministically without throwing (the regression a plain-ASCII test would miss — only the real DB proves LOWER() folds ü); bulk-edit funnels through resolveTags → findOrCreate.

Acceptance criteria

  • Saving a doc tagged with a case-colliding tag succeeds (no 500), keeps exactly the child tag — integration-asserted by tag-set size and surviving tag id.
  • Exact-case match wins over a CI sibling.
  • findOrCreate never throws on CI duplicates; returns the same (lowest-id) tag on every call.

Verification notes

  • No migration, no data change — code-only, fully reversible, as specified.
  • GlobalExceptionHandler keeps the wire response opaque: after the fix this path no longer throws IncorrectResultSizeDataAccessException, and the existing generic handler maps any stray one to INTERNAL_ERROR ("An unexpected error occurred") with no Hibernate/SQL leak. No new handler added (verification item, not a fix).
  • ./mvnw clean package -DskipTests · TagServiceTest (45) · TagCaseCollisionIntegrationTest (3) · DocumentServiceTest (187, the resolveTags consumer) .

Sibling sweep → follow-ups filed

  • #731 (bug, person, P2) — PersonService.findOrCreateByAlias has the identical Optional<Person> findByAliasIgnoreCase non-unique-throw risk on case-colliding aliases (plus findByFirstNameIgnoreCaseAndLastNameIgnoreCase). Latent (import path), filed not fixed here.
  • #732 (refactor, P3) — round-trip tag IDs instead of names (the out-of-scope cleaner long-term shape) + typeahead/chips tree-path disambiguation.

Not yet done

  • Proofshot of the live edit flow (needs the full stack running) and the staging post-deploy spot-check — left for the deploy/verify step.

Next: open the PR (or run review).

## Implemented ✅ Branch `fix/issue-730-tag-case-collision` (off `main`). ### Commits - `d000170f` **fix(tag): resolve case-colliding tag names without throwing** — deleted `Optional<Tag> findByNameIgnoreCase` (its only production caller was `findOrCreate`), added `Optional<Tag> findByName` + `List<Tag> findAllByNameIgnoreCase`, and rewrote `TagService.findOrCreate` to: exact-case → lowest-id case-insensitive (deterministic, never throws) → create. Load-bearing comments at both `findOrCreate` and the repo methods warn against a `unique(lower(name))` constraint. Updated the 3 existing `TagServiceTest` mocks and added the four resolution-branch tests (exact-case wins; single CI; multiple CI → lowest id asserted across two calls; create-when-absent; whitespace+collision). - `a58378e8` **test(tag): pin case-colliding tag resolution on real Postgres** — Testcontainers `postgres:16-alpine` (`TagCaseCollisionIntegrationTest`): saving a doc tagged with the child `weihnachten` succeeds and keeps **exactly** the child (id-asserted, not the parent); the `Glückwünsche`/`glückwünsche` **umlaut** pair resolves deterministically without throwing (the regression a plain-ASCII test would miss — only the real DB proves `LOWER()` folds `ü`); bulk-edit funnels through `resolveTags → findOrCreate`. ### Acceptance criteria - ✅ Saving a doc tagged with a case-colliding tag succeeds (no 500), keeps exactly the child tag — integration-asserted by tag-set size **and** surviving tag id. - ✅ Exact-case match wins over a CI sibling. - ✅ `findOrCreate` never throws on CI duplicates; returns the same (lowest-id) tag on every call. ### Verification notes - **No migration, no data change** — code-only, fully reversible, as specified. - `GlobalExceptionHandler` keeps the wire response opaque: after the fix this path no longer throws `IncorrectResultSizeDataAccessException`, and the existing generic handler maps any stray one to `INTERNAL_ERROR` ("An unexpected error occurred") with no Hibernate/SQL leak. No new handler added (verification item, not a fix). - `./mvnw clean package -DskipTests` ✅ · `TagServiceTest` (45) ✅ · `TagCaseCollisionIntegrationTest` (3) ✅ · `DocumentServiceTest` (187, the `resolveTags` consumer) ✅. ### Sibling sweep → follow-ups filed - **#731** (bug, person, P2) — `PersonService.findOrCreateByAlias` has the identical `Optional<Person> findByAliasIgnoreCase` non-unique-throw risk on case-colliding aliases (plus `findByFirstNameIgnoreCaseAndLastNameIgnoreCase`). Latent (import path), filed not fixed here. - **#732** (refactor, P3) — round-trip tag **IDs** instead of names (the out-of-scope cleaner long-term shape) + typeahead/chips tree-path disambiguation. ### Not yet done - Proofshot of the live edit flow (needs the full stack running) and the staging post-deploy spot-check — left for the deploy/verify step. Next: open the PR (or run review).
Sign in to join this conversation.
No Label P1-high bug
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: marcel/familienarchiv#730