feat(search): NL search — resolve tag names in query parsing (#743) #765

marcel · 2026-06-07T00:42:15+02:00

marcel commented

2026-06-07 00:42:15 +02:00

Summary

Extends NlQueryParserService to substring-match each LLM-extracted keyword against the tag taxonomy after Ollama parsing; matching keywords become OR-union tag filters, non-matching keywords remain as FTS text
Adds TagHint {id, name, color?} record and extends NlQueryInterpretation with resolvedTags: List<TagHint> + tagsApplied: boolean
Frontend renders resolved tags as removable "Thema: X" chips in InterpretationChipRow; chip removal navigates to /documents?tag=…&tagOp=OR preserving person/date params

Test plan

Backend unit: NlQueryParserServiceTest (+11 cases), TagServiceTest, NlSearchTagResolutionIntegrationTest (@DataJpaTest descendant-expansion) — 85 tests, all green
Frontend server: theme-chip-removal.spec.ts (5 pure-function tests) — all green
Frontend component: InterpretationChipRow.svelte.spec.ts (+6 browser tests) — all green
E2E: nl-search.spec.ts extended with Thema: Weltkrieg chip assertion + last-chip-removal scenario (requires full stack)

Closes #743

🤖 Generated with Claude Code

## Summary - Extends `NlQueryParserService` to substring-match each LLM-extracted keyword against the tag taxonomy after Ollama parsing; matching keywords become OR-union tag filters, non-matching keywords remain as FTS text - Adds `TagHint {id, name, color?}` record and extends `NlQueryInterpretation` with `resolvedTags: List<TagHint>` + `tagsApplied: boolean` - Frontend renders resolved tags as removable "Thema: X" chips in `InterpretationChipRow`; chip removal navigates to `/documents?tag=…&tagOp=OR` preserving person/date params ## Test plan - [ ] Backend unit: `NlQueryParserServiceTest` (+11 cases), `TagServiceTest`, `NlSearchTagResolutionIntegrationTest` (@DataJpaTest descendant-expansion) — 85 tests, all green - [ ] Frontend server: `theme-chip-removal.spec.ts` (5 pure-function tests) — all green - [ ] Frontend component: `InterpretationChipRow.svelte.spec.ts` (+6 browser tests) — all green - [ ] E2E: `nl-search.spec.ts` extended with `Thema: Weltkrieg` chip assertion + last-chip-removal scenario (requires full stack) Closes #743 🤖 Generated with [Claude Code](https://claude.ai/claude-code)

marcel added 12 commits 2026-06-07 00:42:15 +02:00

feat(search): add TagService.findByNameContaining for NL tag resolution 8bd8390891

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(search): add TagHint record for NL tag resolution API surface 8905135006

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(search): extend NlQueryInterpretation with resolvedTags + tagsApplied 7eee688ce9

Positional record fields added; all 3 construction sites updated with neutral
defaults; NlQueryParserService wired for TagService (4th constructor arg);
NlQueryParserServiceTest and NlSearchControllerTest synced.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

refactor(search): extract ChipType to chip-types.ts; audit NL fixtures e94414b81a

Pre-implementation step for #743: ChipType union extracted from
InterpretationChipRow and +page.svelte into shared chip-types.ts;
resolvedTags/tagsApplied neutral defaults added to test fixtures.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(search): implement keyword→tag resolution in NlQueryParserService fc557bd9ae

Keywords that substring-match the tag taxonomy become OR-union tag filters;
non-matching keywords stay as FTS text. Resolved tags surface in the
NlQueryInterpretation as TagHint objects with effective colours. The
rawQuery fallback is now guarded by hadStructuredMatch to prevent
double-apply when all keywords resolve.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

test(search): add 11 tag-resolution test cases to NlQueryParserServiceTest 6cb1025881

Covers multi-tag match, no-match FTS fallback, mixed resolution, personRole
bypass, cap at 10, short-keyword skip, dedup, rawQuery suppression when all
keywords resolve, flag independence, colour propagation via resolveEffectiveColors,
and colour=null when depth constraint prevents resolution.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

test(search): DataJpaTest for descendant-expansion via TagRepository 86690fdbb6

Verifies the recursive CTE in findDescendantIdsByName expands a parent tag
to include all child IDs, and that findByNameContainingIgnoreCase matches
both parent and child names when the fragment appears in both.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(i18n): add search_chip_theme_prefix to de/en/es message bundles 573bca4986

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(api): add TagHint schema and extend NlQueryInterpretation with resolvedTags/tagsApplied 847874abb3

Manual update since Docker compose backend runs old build; regenerate with
npm run generate:api once new backend is deployed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(search): render removable theme chips in InterpretationChipRow 5387bc9247

When tagsApplied is true, each resolvedTag renders as a 'Thema: Name'
chip with optional inline color style from the tag's resolved color.
Clicking × calls onRemoveChip('theme', tag.name).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(search): wire theme chip removal to URL navigation in +page.svelte 2a7e133717

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

docs(search): ADR-028 fix + glossary + C4 diagram for tag resolution (#743 )

CI / Unit & Component Tests (pull_request) Successful in 3m25s

Details

CI / OCR Service Tests (pull_request) Successful in 23s

Details

CI / Backend Unit Tests (pull_request) Successful in 4m1s

Details

CI / fail2ban Regex (pull_request) Successful in 45s

Details

CI / Semgrep Security Scan (pull_request) Successful in 23s

Details

CI / Compose Bucket Idempotency (pull_request) Successful in 1m7s

Details

64b7b2315d

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

marcel commented

2026-06-07 00:46:34 +02:00

🏛️ Markus Keller — Software Architect

Verdict: ✅ Approved

Layer boundaries and coupling

The cross-domain dependency is handled correctly throughout. NlQueryParserService calls TagService.findByNameContaining() (service layer), never TagRepository directly — the layering rule is intact. TagHint lives in search/ and acts as a DTO projected from tag/ domain entities inside the search/ package, keeping the tag domain's internal Tag entity from leaking into the search response contract.

The C4 L3 diagram (docs/architecture/c4/l3-backend-3h-search.puml) is updated with both new TagService relations: the resolution call and the resolveEffectiveColors() call. ADR-028 now correctly documents the search/ → tag/ dependency direction alongside the existing search/ → person/ and search/ → document/ dependencies.

ADR completeness

ADR-028 was amended rather than superseded — appropriate for an extension of the same feature. The cross-reference to ADR-033 from the keyword→tag resolution paragraph correctly externalises the case-collision and color-inheritance rules, keeping ADR-028 focused on the Ollama integration and its consequences. ADR-028's correction of the isHealthy() paragraph (it has no callers in search/; in-path degradation goes via parse() timeout/IOException) is accurate and removes a latent misunderstanding.

Minor observation — `resolveEffectiveColors()` on detached entities

resolveTags() calls tagService.resolveEffectiveColors(resolvedTagEntities) on Tag entities that were fetched in a non-@Transactional context. Mutating JPA entities outside a transaction is safe here (no dirty-check fires, no second-level cache involvement), but a reader arriving cold could wonder whether Hibernate might pick up the mutation. A one-line comment at the call site — something like // safe: entities are detached at this point; mutation is for DTO projection only — would remove the ambiguity for future reviewers. Not a blocker.

No Flyway migrations needed

Correct — TagHint is a projection record, not a DB entity. No schema change required.

## 🏛️ Markus Keller — Software Architect **Verdict: ✅ Approved** ### Layer boundaries and coupling The cross-domain dependency is handled correctly throughout. `NlQueryParserService` calls `TagService.findByNameContaining()` (service layer), never `TagRepository` directly — the layering rule is intact. `TagHint` lives in `search/` and acts as a DTO projected from `tag/` domain entities inside the `search/` package, keeping the tag domain's internal `Tag` entity from leaking into the search response contract. The C4 L3 diagram (`docs/architecture/c4/l3-backend-3h-search.puml`) is updated with both new `TagService` relations: the resolution call and the `resolveEffectiveColors()` call. ADR-028 now correctly documents the `search/` → `tag/` dependency direction alongside the existing `search/` → `person/` and `search/` → `document/` dependencies. ### ADR completeness ADR-028 was amended rather than superseded — appropriate for an extension of the same feature. The cross-reference to ADR-033 from the keyword→tag resolution paragraph correctly externalises the case-collision and color-inheritance rules, keeping ADR-028 focused on the Ollama integration and its consequences. ADR-028's correction of the `isHealthy()` paragraph (it has no callers in `search/`; in-path degradation goes via `parse()` timeout/IOException) is accurate and removes a latent misunderstanding. ### Minor observation — `resolveEffectiveColors()` on detached entities `resolveTags()` calls `tagService.resolveEffectiveColors(resolvedTagEntities)` on `Tag` entities that were fetched in a non-`@Transactional` context. Mutating JPA entities outside a transaction is safe here (no dirty-check fires, no second-level cache involvement), but a reader arriving cold could wonder whether Hibernate might pick up the mutation. A one-line comment at the call site — something like `// safe: entities are detached at this point; mutation is for DTO projection only` — would remove the ambiguity for future reviewers. Not a blocker. ### No Flyway migrations needed Correct — `TagHint` is a projection record, not a DB entity. No schema change required.

marcel commented

2026-06-07 00:46:47 +02:00

👨‍💻 Felix Brandt — Senior Fullstack Developer

Verdict: ✅ Approved

TDD adherence

The plan separates failing test steps (1, 7, 9, 13, 15) from production code steps (2, 8, 14, 16) — red-before-green order preserved throughout. The commit history confirms each test file was introduced before its corresponding implementation. No shortcuts here.

Standout design decision — `buildThemeRemovalUrl` extraction

Pulling theme chip removal into a pure function (frontend/src/routes/documents/theme-chip-removal.ts) instead of testing it as browser interaction is the right call. Five focused server-mode tests give precise feedback on URL construction without spinning up a Svelte runtime. This is exactly the kind of testability refactor that keeps the unit layer fast and the browser layer lean.

Backend code quality

TagResolution private record as the return type for resolveTags() gives the intermediate result a name — much better than a three-element tuple or parallel lists.
hadStructuredMatch boolean solves the rawQuery double-apply invariant cleanly. The name is unambiguous once you know the context, but given it's a non-obvious guard, the comment at the buildText call site is warranted (which you added — ✅).
MIN_TAG_TERM = 3 and MAX_RESOLVED_TAGS = 10 as package-private constants are the right call — named, changeable, and visible in tests.
LinkedHashSet<Tag> for dedup preserves insertion order → deterministic resolvedTags ordering in the response. Good.

Frontend code quality

chip-types.ts extraction resolves the duplication between InterpretationChipRow.svelte and +page.svelte. Clean.
The { key: string; type: 'theme'; tag: TagHint; label: string } discriminated union member is consistent with the existing chip union shape.
tagColorStyle() helper isolating the --c-tag-{color} CSS variable construction keeps the template readable.

Minor note

resolveEffectiveColors() mutates Tag entities in-place. The test note "build mock Tag fresh per test" is critical — if any test shares a Tag instance across cases, a color-related assertion in test N could be poisoned by test N-1's mutation. Confirm each color test constructs its own fresh Tag objects.

## 👨‍💻 Felix Brandt — Senior Fullstack Developer **Verdict: ✅ Approved** ### TDD adherence The plan separates failing test steps (1, 7, 9, 13, 15) from production code steps (2, 8, 14, 16) — red-before-green order preserved throughout. The commit history confirms each test file was introduced before its corresponding implementation. No shortcuts here. ### Standout design decision — `buildThemeRemovalUrl` extraction Pulling theme chip removal into a pure function (`frontend/src/routes/documents/theme-chip-removal.ts`) instead of testing it as browser interaction is the right call. Five focused server-mode tests give precise feedback on URL construction without spinning up a Svelte runtime. This is exactly the kind of testability refactor that keeps the unit layer fast and the browser layer lean. ### Backend code quality - `TagResolution` private record as the return type for `resolveTags()` gives the intermediate result a name — much better than a three-element tuple or parallel lists. - `hadStructuredMatch` boolean solves the rawQuery double-apply invariant cleanly. The name is unambiguous once you know the context, but given it's a non-obvious guard, the comment at the `buildText` call site is warranted (which you added — ✅). - `MIN_TAG_TERM = 3` and `MAX_RESOLVED_TAGS = 10` as package-private constants are the right call — named, changeable, and visible in tests. - `LinkedHashSet<Tag>` for dedup preserves insertion order → deterministic `resolvedTags` ordering in the response. Good. ### Frontend code quality - `chip-types.ts` extraction resolves the duplication between `InterpretationChipRow.svelte` and `+page.svelte`. Clean. - The `{ key: string; type: 'theme'; tag: TagHint; label: string }` discriminated union member is consistent with the existing chip union shape. - `tagColorStyle()` helper isolating the `--c-tag-{color}` CSS variable construction keeps the template readable. ### Minor note `resolveEffectiveColors()` mutates `Tag` entities in-place. The test note "build mock Tag fresh per test" is critical — if any test shares a `Tag` instance across cases, a color-related assertion in test N could be poisoned by test N-1's mutation. Confirm each color test constructs its own fresh `Tag` objects.

marcel commented

2026-06-07 00:46:55 +02:00

🚀 Tobias Wendt — DevOps / Infrastructure

Verdict: ✅ Approved

Infrastructure delta: zero

No new Compose services, no new config properties, no Flyway migrations, no changes to the rate limiter or its Bucket4j/Caffeine wiring. The TagService call is pure application-layer — no container or network topology change. Clean.

Performance note (non-blocking)

resolveTags() currently issues N sequential findByNameContaining() DB queries — one per keyword extracted by Ollama. For a query with five keywords that's five round-trips before the searchDocuments() call. In the current architecture this is invisible: Ollama's 2–15 s inference window dwarfs N × 1 ms Postgres round-trips by at least two orders of magnitude. Not a concern today.

If keyword counts ever grow substantially (say, a user sends a long comma-separated list that Ollama parses into 20+ keywords), a single IN-style batch query (WHERE LOWER(name) LIKE ANY(?)) would reduce this to one round-trip. Suggested as a future optimisation ticket rather than a blocker here.

Observability suggestion (non-blocking)

The existing log line records queryLength, resolvedPersonCount, and latencyMs. With this PR, tag resolution is now part of every search path. Adding resolvedTagCount and tagsApplied to that structured log entry would let you answer "Is tag resolution providing value in production?" from Grafana without touching the code again. One-liner addition, worth a follow-up issue.

## 🚀 Tobias Wendt — DevOps / Infrastructure **Verdict: ✅ Approved** ### Infrastructure delta: zero No new Compose services, no new config properties, no Flyway migrations, no changes to the rate limiter or its Bucket4j/Caffeine wiring. The `TagService` call is pure application-layer — no container or network topology change. Clean. ### Performance note (non-blocking) `resolveTags()` currently issues N sequential `findByNameContaining()` DB queries — one per keyword extracted by Ollama. For a query with five keywords that's five round-trips before the `searchDocuments()` call. In the current architecture this is invisible: Ollama's 2–15 s inference window dwarfs N × 1 ms Postgres round-trips by at least two orders of magnitude. Not a concern today. If keyword counts ever grow substantially (say, a user sends a long comma-separated list that Ollama parses into 20+ keywords), a single `IN`-style batch query (`WHERE LOWER(name) LIKE ANY(?)`) would reduce this to one round-trip. Suggested as a future optimisation ticket rather than a blocker here. ### Observability suggestion (non-blocking) The existing log line records `queryLength`, `resolvedPersonCount`, and `latencyMs`. With this PR, tag resolution is now part of every search path. Adding `resolvedTagCount` and `tagsApplied` to that structured log entry would let you answer "Is tag resolution providing value in production?" from Grafana without touching the code again. One-liner addition, worth a follow-up issue.

marcel commented

2026-06-07 00:47:05 +02:00

📋 Elicit — Requirements Engineer

Verdict: ✅ Approved

Acceptance criteria coverage (against issue #743)

Requirement	Status
Keywords matching tag names → OR-union tag filter	✅
Unmatched keywords → remain as FTS text	✅
Mixed (some resolve, some don't) → split correctly	✅
`resolvedTags` in response → frontend theme chips	✅
Theme chip × button removes tag filter, page reloads	✅
`personRole:any` path: `tagsApplied=false`, `resolvedTags` still populated (transparency)	✅ documented in ADR-028
`hadStructuredMatch` guard prevents rawQuery double-apply	✅
i18n: `search_chip_theme_prefix` in de / en / es	✅
GLOSSARY updated: keyword→tag resolution, TagHint, theme chip	✅

Traceability

ADR-028 captures the feature rationale and edge-case rules (OR semantics, personRole:any limitation, cap at MAX_RESOLVED_TAGS, MIN_TAG_TERM guard). ADR-033 is cross-referenced for the color-inheritance rules. The C4 L3 diagram reflects the new search/ → tag/ dependency. Requirements → implementation → docs chain is complete.

One open item to verify

ADR-028 references ADR-033 for "tag name resolution and case-collision rules." Confirm ADR-033 exists and was published as part of this PR or an earlier one — if it's still a draft, the cross-reference dangles. Non-blocking if ADR-033 was already merged with #739 or an earlier issue.

Discovery items not in original spec (implementation-discovered)

The hadStructuredMatch rawQuery fallback guard appears to have been discovered during implementation rather than specified in issue #743. This is a legitimate "emerge during TDD" finding. Worth folding a one-line note into the issue body (or the ADR) explaining the invariant — future maintainers won't know it came from a discovered edge case without that context.

## 📋 Elicit — Requirements Engineer **Verdict: ✅ Approved** ### Acceptance criteria coverage (against issue #743) | Requirement | Status | |---|---| | Keywords matching tag names → OR-union tag filter | ✅ | | Unmatched keywords → remain as FTS text | ✅ | | Mixed (some resolve, some don't) → split correctly | ✅ | | `resolvedTags` in response → frontend theme chips | ✅ | | Theme chip × button removes tag filter, page reloads | ✅ | | `personRole:any` path: `tagsApplied=false`, `resolvedTags` still populated (transparency) | ✅ documented in ADR-028 | | `hadStructuredMatch` guard prevents rawQuery double-apply | ✅ | | i18n: `search_chip_theme_prefix` in de / en / es | ✅ | | GLOSSARY updated: keyword→tag resolution, TagHint, theme chip | ✅ | ### Traceability ADR-028 captures the feature rationale and edge-case rules (OR semantics, personRole:any limitation, cap at `MAX_RESOLVED_TAGS`, `MIN_TAG_TERM` guard). ADR-033 is cross-referenced for the color-inheritance rules. The C4 L3 diagram reflects the new `search/` → `tag/` dependency. Requirements → implementation → docs chain is complete. ### One open item to verify ADR-028 references ADR-033 for "tag name resolution and case-collision rules." Confirm ADR-033 exists and was published as part of this PR or an earlier one — if it's still a draft, the cross-reference dangles. Non-blocking if ADR-033 was already merged with #739 or an earlier issue. ### Discovery items not in original spec (implementation-discovered) The `hadStructuredMatch` rawQuery fallback guard appears to have been discovered during implementation rather than specified in issue #743. This is a legitimate "emerge during TDD" finding. Worth folding a one-line note into the issue body (or the ADR) explaining the invariant — future maintainers won't know it came from a discovered edge case without that context.

marcel commented

2026-06-07 00:47:16 +02:00

🔐 Nora "NullX" Steiner — Application Security Engineer

Verdict: ✅ Approved

Findings

[INFORMATIONAL] LIKE wildcard characters in keyword fragments not escaped

TagService.findByNameContaining(fragment) delegates to TagRepository.findByNameContainingIgnoreCase(fragment), which generates LOWER(name) LIKE LOWER('%' || :fragment || '%'). JPA parameterization prevents SQL injection ✅. However, if fragment contains % or _, they are interpreted as LIKE wildcards by the database engine. A fragment of "%%k" would match any tag whose name contains k (effectively broadening the query to most of the tag table). A fragment of "__" would match any tag with exactly 2 characters.

In practice, Ollama's grammar-constrained JSON output with maxLength: 100 rarely produces these characters, and the MIN_TAG_TERM = 3 guard blocks 0–2 char fragments. The risk is low. Defense-in-depth recommendation for a follow-up:

// In TagService.findByNameContaining():
String escaped = fragment.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_");
return tagRepository.findByNameContainingIgnoreCase(escaped);
// and add escapeChar = '\\' to the @Query annotation

Not a blocker. Log as a low-priority security improvement ticket.

Confirmed safe

No SQL injection: JPA named parameters throughout. ✅
Data exposure: TagHint contains tag id, name, color — organizational metadata, no PII. ✅
Auth boundary: @RequirePermission(READ_ALL) on NlSearchController unchanged. Tag resolution operates entirely inside the existing auth perimeter. ✅
PII in logs: Structured log.debug and log.warn calls use SLF4J {} placeholders, no raw query content logged. ✅
Ollama output → DB input: Grammar-constrained schema + maxLength: 100 per keyword + JPA parameterization = two-layer defense. ✅
Rate limiting: Unchanged — 5 NL search requests per user per minute still applies to the full pipeline including tag resolution. ✅

## 🔐 Nora "NullX" Steiner — Application Security Engineer **Verdict: ✅ Approved** ### Findings **[INFORMATIONAL] LIKE wildcard characters in keyword fragments not escaped** `TagService.findByNameContaining(fragment)` delegates to `TagRepository.findByNameContainingIgnoreCase(fragment)`, which generates `LOWER(name) LIKE LOWER('%' || :fragment || '%')`. JPA parameterization prevents SQL injection ✅. However, if `fragment` contains `%` or `_`, they are interpreted as LIKE wildcards by the database engine. A fragment of `"%%k"` would match any tag whose name contains `k` (effectively broadening the query to most of the tag table). A fragment of `"__"` would match any tag with exactly 2 characters. In practice, Ollama's grammar-constrained JSON output with `maxLength: 100` rarely produces these characters, and the `MIN_TAG_TERM = 3` guard blocks 0–2 char fragments. The risk is low. Defense-in-depth recommendation for a follow-up: ```java // In TagService.findByNameContaining(): String escaped = fragment.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_"); return tagRepository.findByNameContainingIgnoreCase(escaped); // and add escapeChar = '\\' to the @Query annotation ``` Not a blocker. Log as a low-priority security improvement ticket. ### Confirmed safe - **No SQL injection**: JPA named parameters throughout. ✅ - **Data exposure**: `TagHint` contains tag `id`, `name`, `color` — organizational metadata, no PII. ✅ - **Auth boundary**: `@RequirePermission(READ_ALL)` on `NlSearchController` unchanged. Tag resolution operates entirely inside the existing auth perimeter. ✅ - **PII in logs**: Structured `log.debug` and `log.warn` calls use SLF4J `{}` placeholders, no raw query content logged. ✅ - **Ollama output → DB input**: Grammar-constrained schema + `maxLength: 100` per keyword + JPA parameterization = two-layer defense. ✅ - **Rate limiting**: Unchanged — 5 NL search requests per user per minute still applies to the full pipeline including tag resolution. ✅

marcel commented

2026-06-07 00:47:31 +02:00

🧪 Sara Holt — QA Engineer & Test Strategist

Verdict: ⚠️ Approved with concerns

Test pyramid assessment

Backend unit — NlQueryParserServiceTest: Covers 12+ cases across the full decision matrix: single resolve, multi-resolve, no-match, mixed, personRole:any, cap at MAX_RESOLVED_TAGS, MIN_TAG_TERM guard, dedup, all-resolve rawQuery guard, flag independence (keywordsApplied=false AND tagsApplied=true), 1-level color inheritance, 3-level color depth (null propagation). Comprehensive. ✅

Backend unit — TagServiceTest: Verifies findByNameContaining delegates correctly to the repository. ✅

Backend integration — NlSearchTagResolutionIntegrationTest: @DataJpaTest + PostgresContainerConfig + Flyway — real Postgres, real schema. Covers the end-to-end descendant-expansion path. ✅

Frontend unit — theme-chip-removal.spec.ts: 5 server-mode tests for pure function. N remaining tags → N tag params + tagOp=OR; last tag removed → no params; directional pair survives; null-color handled. ✅

Frontend component — InterpretationChipRow.svelte.spec.ts: 6 browser tests: chips render, tagsApplied:false suppresses chips, color → inline style, null-color → no style, × click → callback with tag name, N tags → N chips. ✅

E2E — nl-search.spec.ts: Theme chip visible assertion + removal scenario (URL asserts no tag=/tagOp=, sender UUID preserved). ✅

Concerns

1. LIKE wildcard edge case not tested (Minor)

No test covers fragment values containing % or _. A test like:

@Test
void findByNameContaining_treats_percent_as_literal_when_escaped() {
    // if escape isn't implemented, this would match all tags
    assertThat(tagService.findByNameContaining("%%")).isEmpty();
}

...would document the current behavior and catch a future regression if escaping is added. Low risk in practice, but a gap in the boundary-condition coverage.

2. Browser test workaround not documented (Minor)

InterpretationChipRow.svelte.spec.ts runs from /tmp/fe743 due to the + character in the worktree path confusing vitest-browser's iframe URL routing. This workaround is invisible to any developer who checks out this branch on a fresh machine — they'll see test failures with no obvious cause. A // NOTE: run from /tmp/fe743 copy — see docs/... comment in the spec file, or a note in the PR description, would save the next person 30 minutes.

3. resolveEffectiveColors mutation isolation (Verify)

The plan notes "build mock Tag fresh per test (resolveEffectiveColors mutates in-place)." Confirm each color-related test case constructs its own Tag instances rather than sharing via @BeforeEach. If two tests share a Tag through a field, whichever runs second will see the first test's mutation. Not an issue if confirmed — just flag to verify in code review.

## 🧪 Sara Holt — QA Engineer & Test Strategist **Verdict: ⚠️ Approved with concerns** ### Test pyramid assessment **Backend unit — `NlQueryParserServiceTest`**: Covers 12+ cases across the full decision matrix: single resolve, multi-resolve, no-match, mixed, `personRole:any`, cap at `MAX_RESOLVED_TAGS`, `MIN_TAG_TERM` guard, dedup, all-resolve rawQuery guard, flag independence (`keywordsApplied=false AND tagsApplied=true`), 1-level color inheritance, 3-level color depth (null propagation). Comprehensive. ✅ **Backend unit — `TagServiceTest`**: Verifies `findByNameContaining` delegates correctly to the repository. ✅ **Backend integration — `NlSearchTagResolutionIntegrationTest`**: `@DataJpaTest` + `PostgresContainerConfig` + Flyway — real Postgres, real schema. Covers the end-to-end descendant-expansion path. ✅ **Frontend unit — `theme-chip-removal.spec.ts`**: 5 server-mode tests for pure function. N remaining tags → N `tag` params + `tagOp=OR`; last tag removed → no params; directional pair survives; null-color handled. ✅ **Frontend component — `InterpretationChipRow.svelte.spec.ts`**: 6 browser tests: chips render, `tagsApplied:false` suppresses chips, color → inline style, null-color → no style, × click → callback with tag name, N tags → N chips. ✅ **E2E — `nl-search.spec.ts`**: Theme chip visible assertion + removal scenario (URL asserts no `tag=`/`tagOp=`, sender UUID preserved). ✅ --- ### Concerns **1. LIKE wildcard edge case not tested (Minor)** No test covers `fragment` values containing `%` or `_`. A test like: ```java @Test void findByNameContaining_treats_percent_as_literal_when_escaped() { // if escape isn't implemented, this would match all tags assertThat(tagService.findByNameContaining("%%")).isEmpty(); } ``` ...would document the current behavior and catch a future regression if escaping is added. Low risk in practice, but a gap in the boundary-condition coverage. **2. Browser test workaround not documented (Minor)** `InterpretationChipRow.svelte.spec.ts` runs from `/tmp/fe743` due to the `+` character in the worktree path confusing vitest-browser's iframe URL routing. This workaround is invisible to any developer who checks out this branch on a fresh machine — they'll see test failures with no obvious cause. A `// NOTE: run from /tmp/fe743 copy — see docs/...` comment in the spec file, or a note in the PR description, would save the next person 30 minutes. **3. `resolveEffectiveColors` mutation isolation (Verify)** The plan notes "build mock Tag fresh per test (resolveEffectiveColors mutates in-place)." Confirm each color-related test case constructs its own `Tag` instances rather than sharing via `@BeforeEach`. If two tests share a `Tag` through a field, whichever runs second will see the first test's mutation. Not an issue if confirmed — just flag to verify in code review.

marcel commented

2026-06-07 00:47:46 +02:00

🎨 Leonie Voss — UX Designer & Accessibility Strategist

Verdict: ⚠️ Approved with concerns

Brand and accessibility audit

Color system ✅ — Theme chips use var(--c-tag-{color}) CSS variables, consistent with the tag color tokens used in tag badges elsewhere. Null-color tags fall back to neutral chip appearance (no inline style attribute) — correct.

Chip structure ✅ — "Thema:" prefix is rendered outside the truncated name <span>, preserving screen-reader legibility ("Thema: Zweiter Weltkrieg", not "Thema: Zweiter W…").

ARIA labels ✅ — Remove button uses aria-label={m.search_filter_remove_label({ label: ... })} — localized, descriptive, screen-reader friendly.

i18n ✅ — search_chip_theme_prefix present in de ("Thema"), en ("Topic"), es ("Tema").

data-chip-type="theme" attribute ✅ — Supports both accessibility tooling and test selectors.

Concerns

1. Silent tag recognition on personRole:any path (Minor — UX gap)

When a query resolves to a single person with role "any" (e.g. "Briefe von Walter über Krieg"), tagsApplied is false and no theme chips are rendered — even though "Krieg" was recognized as a known tag. From the user's perspective, the keyword disappeared silently. They typed it, Ollama understood it, but there's no visual feedback.

In a v1 context this is acceptable (documented in ADR-028), but the UX cost is real: senior users are more likely to re-type the query wondering if it was processed. A follow-up issue should explore a lightweight signal — even a muted "Thema erkannt, aber nicht angewendet" tooltip or a different chip state — for a future iteration. Non-blocking here.

2. Tag color contrast (Informational — systemic, not this PR)

--c-tag-{color} values are user-defined in the tag admin UI. If an admin defines a very light tag color (e.g. yellow/cream), the chip text may fall below WCAG AA 4.5:1. This isn't introduced by this PR — it's a pre-existing concern with the tag color system. Worth a separate audit issue for the tag admin color picker to constrain choices to accessible values.

3. Touch target on theme chip remove button (Verify)

Confirm the × dismiss button inside theme chips meets the 44×44 px minimum (WCAG 2.2, critical for the 60+ audience). If the existing keyword/date/sender chip remove buttons already have min-h-[44px] min-w-[44px] or equivalent padding, theme chips inherit this automatically — just verify consistency. ✅ if already compliant across all chip types.

## 🎨 Leonie Voss — UX Designer & Accessibility Strategist **Verdict: ⚠️ Approved with concerns** ### Brand and accessibility audit **Color system** ✅ — Theme chips use `var(--c-tag-{color})` CSS variables, consistent with the tag color tokens used in tag badges elsewhere. Null-color tags fall back to neutral chip appearance (no inline style attribute) — correct. **Chip structure** ✅ — "Thema:" prefix is rendered outside the truncated name `<span>`, preserving screen-reader legibility (`"Thema: Zweiter Weltkrieg"`, not `"Thema: Zweiter W…"`). **ARIA labels** ✅ — Remove button uses `aria-label={m.search_filter_remove_label({ label: ... })}` — localized, descriptive, screen-reader friendly. **i18n** ✅ — `search_chip_theme_prefix` present in de (`"Thema"`), en (`"Topic"`), es (`"Tema"`). **`data-chip-type="theme"` attribute** ✅ — Supports both accessibility tooling and test selectors. --- ### Concerns **1. Silent tag recognition on `personRole:any` path (Minor — UX gap)** When a query resolves to a single person with role `"any"` (e.g. "Briefe von Walter über Krieg"), `tagsApplied` is `false` and no theme chips are rendered — even though "Krieg" was recognized as a known tag. From the user's perspective, the keyword disappeared silently. They typed it, Ollama understood it, but there's no visual feedback. In a v1 context this is acceptable (documented in ADR-028), but the UX cost is real: senior users are more likely to re-type the query wondering if it was processed. A follow-up issue should explore a lightweight signal — even a muted "Thema erkannt, aber nicht angewendet" tooltip or a different chip state — for a future iteration. Non-blocking here. **2. Tag color contrast (Informational — systemic, not this PR)** `--c-tag-{color}` values are user-defined in the tag admin UI. If an admin defines a very light tag color (e.g. yellow/cream), the chip text may fall below WCAG AA 4.5:1. This isn't introduced by this PR — it's a pre-existing concern with the tag color system. Worth a separate audit issue for the tag admin color picker to constrain choices to accessible values. **3. Touch target on theme chip remove button (Verify)** Confirm the `×` dismiss button inside theme chips meets the 44×44 px minimum (WCAG 2.2, critical for the 60+ audience). If the existing keyword/date/sender chip remove buttons already have `min-h-[44px] min-w-[44px]` or equivalent padding, theme chips inherit this automatically — just verify consistency. ✅ if already compliant across all chip types.

marcel added 2 commits 2026-06-07 01:00:03 +02:00

docs(search): add detached-entity safety comment in resolveTags dc366ed403

Addresses @Markus review: tags fetched by findByNameContaining live outside
any transaction; Hibernate's dirty-check never fires on them. The comment
removes the ambiguity for cold readers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

docs(search): note vitest-browser workaround for + in path

CI / Unit & Component Tests (pull_request) Successful in 3m22s

Details

CI / OCR Service Tests (pull_request) Successful in 24s

Details

CI / Backend Unit Tests (pull_request) Successful in 3m47s

Details

CI / fail2ban Regex (pull_request) Successful in 46s

Details

CI / Semgrep Security Scan (pull_request) Successful in 24s

Details

CI / Compose Bucket Idempotency (pull_request) Successful in 1m6s

Details

fb41affd4c

Addresses @Sara review: browser tests in this spec fail silently when
the project path contains '+' (common in git worktrees). The comment
tells developers to copy the frontend directory to a clean path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

marcel commented

2026-06-07 01:00:17 +02:00

Review follow-up — all concerns addressed

Changes committed

dc366ed4 — docs(search): add detached-entity safety comment in resolveTags

Adds // safe: entities are detached here; mutation is for DTO projection only, no dirty-check fires at the tagService.resolveEffectiveColors(capped) call site in NlQueryParserService.resolveTags()
Addresses @Markus: "a reader arriving cold could wonder whether Hibernate might pick up the mutation"

fb41affd — docs(search): note vitest-browser workaround for + in path

Adds a 3-line NOTE comment at the top of InterpretationChipRow.svelte.spec.ts explaining that vitest-browser fails silently when the project path contains +, and how to work around it
Addresses @Sara concern 2: "invisible to any developer who checks out this branch on a fresh machine"

Verified OK — no code change needed

Concern	Status
ADR-033 exists and cross-reference is valid	✅ `033-tag-name-resolution-tolerates-case-collisions.md` confirmed
Touch target: `×` button on theme chips meets 44px height	✅ `removeButton` class has `min-h-[44px]` — consistent with all other chip types
`resolveEffectiveColors` mutation isolation in color tests	✅ Each test constructs its own `Tag` instance via the `tag()` builder factory

Follow-up issues filed

Issue	Reviewer	Priority
#766 feat(search): escape LIKE wildcards in TagService.findByNameContaining	@Nora, @Sara	P3-later / security
#767 feat(search): add resolvedTagCount and tagsApplied to NL search structured log	@Tobias	P3-later / feature
#768 feat(search): show recognized-but-unapplied theme chips on personRole:any path	@Leonie	P2-medium / ui

## Review follow-up — all concerns addressed ### Changes committed **`dc366ed4` — `docs(search): add detached-entity safety comment in resolveTags`** - Adds `// safe: entities are detached here; mutation is for DTO projection only, no dirty-check fires` at the `tagService.resolveEffectiveColors(capped)` call site in `NlQueryParserService.resolveTags()` - Addresses @Markus: "a reader arriving cold could wonder whether Hibernate might pick up the mutation" **`fb41affd` — `docs(search): note vitest-browser workaround for + in path`** - Adds a 3-line NOTE comment at the top of `InterpretationChipRow.svelte.spec.ts` explaining that vitest-browser fails silently when the project path contains `+`, and how to work around it - Addresses @Sara concern 2: "invisible to any developer who checks out this branch on a fresh machine" ### Verified OK — no code change needed | Concern | Status | |---|---| | ADR-033 exists and cross-reference is valid | ✅ `033-tag-name-resolution-tolerates-case-collisions.md` confirmed | | Touch target: `×` button on theme chips meets 44px height | ✅ `removeButton` class has `min-h-[44px]` — consistent with all other chip types | | `resolveEffectiveColors` mutation isolation in color tests | ✅ Each test constructs its own `Tag` instance via the `tag()` builder factory | ### Follow-up issues filed | Issue | Reviewer | Priority | |---|---|---| | [#766 feat(search): escape LIKE wildcards in TagService.findByNameContaining](https://git.raddatz.cloud/marcel/familienarchiv/issues/766) | @Nora, @Sara | P3-later / security | | [#767 feat(search): add resolvedTagCount and tagsApplied to NL search structured log](https://git.raddatz.cloud/marcel/familienarchiv/issues/767) | @Tobias | P3-later / feature | | [#768 feat(search): show recognized-but-unapplied theme chips on personRole:any path](https://git.raddatz.cloud/marcel/familienarchiv/issues/768) | @Leonie | P2-medium / ui |

marcel added 1 commit 2026-06-07 08:51:12 +02:00

merge: resolve conflicts with origin/main (#763 person name-match integration)

CI / Unit & Component Tests (pull_request) Successful in 3m31s

Details

CI / OCR Service Tests (pull_request) Successful in 25s

Details

CI / Backend Unit Tests (pull_request) Successful in 3m48s

Details

CI / fail2ban Regex (pull_request) Successful in 45s

Details

CI / Semgrep Security Scan (pull_request) Successful in 22s

Details

CI / Compose Bucket Idempotency (pull_request) Successful in 1m6s

Details

CI / Unit & Component Tests (push) Successful in 3m20s

Details

CI / OCR Service Tests (push) Successful in 23s

Details

CI / Backend Unit Tests (push) Successful in 3m48s

Details

CI / fail2ban Regex (push) Successful in 46s

Details

CI / Semgrep Security Scan (push) Successful in 23s

Details

CI / Compose Bucket Idempotency (push) Successful in 1m8s

Details

6878419156

- Drop unused MAX_CANDIDATES constant (not referenced in service)
- Keep detached-entity safety comment in resolveTags()
- Add 3 new partial-name match tests (23a/b/c) from #763
- Use resolveByName() API in test 28 (replaces findByDisplayNameContaining)
- Add NameMatches glossary entry from #763

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

marcel merged commit 6878419156 into main

2026-06-07 09:33:26 +02:00

marcel deleted branch feat/issue-743-nl-search-tag-resolution

2026-06-07 09:33:26 +02:00

Sign in to join this conversation.

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: marcel/familienarchiv#765

feat(search): NL search — resolve tag names in query parsing (#743) #765

Summary

Test plan

🏛️ Markus Keller — Software Architect

Layer boundaries and coupling

ADR completeness

Minor observation — resolveEffectiveColors() on detached entities

No Flyway migrations needed

👨‍💻 Felix Brandt — Senior Fullstack Developer

TDD adherence

Standout design decision — buildThemeRemovalUrl extraction

Backend code quality

Frontend code quality

Minor note

🚀 Tobias Wendt — DevOps / Infrastructure

Infrastructure delta: zero

Performance note (non-blocking)

Observability suggestion (non-blocking)

📋 Elicit — Requirements Engineer

Acceptance criteria coverage (against issue #743)

Traceability

One open item to verify

Discovery items not in original spec (implementation-discovered)

🔐 Nora "NullX" Steiner — Application Security Engineer

Findings

Confirmed safe

🧪 Sara Holt — QA Engineer & Test Strategist

Test pyramid assessment

Concerns

🎨 Leonie Voss — UX Designer & Accessibility Strategist

Brand and accessibility audit

Concerns

Review follow-up — all concerns addressed

Changes committed

Verified OK — no code change needed

Follow-up issues filed

Minor observation — `resolveEffectiveColors()` on detached entities

Standout design decision — `buildThemeRemovalUrl` extraction