feat(search): NL search — resolve tag names in query parsing #743

Closed
opened 2026-06-06 12:17:04 +02:00 by marcel · 1 comment
Owner

Part of epic #735. Post-v1 refinement (v1 #738 folds all unresolved terms into keywords; this issue resolves a subset of those terms against the tag taxonomy).

Self-contained (backend tag resolution + frontend theme chips). Depends on #739 merged — the interpretation-chip components (InterpretationChipRow.svelte) must exist before theme chips can be added.

Refinement note (req-eng pass, 2026-06-06): This issue was promoted from a parking-lot stub. Three design decisions were resolved with the product owner and three contradictions with the merged #738 backend were corrected (see Design decisions). The original stub proposed resolvedTags: List<Tag>, tag-ID filters, an ambiguousTags disambiguation flow, and a new themeNames LLM field — all four are dropped in favour of the design below.

Six-persona review pass (2026-06-06): all review findings and the four open questions have been folded into this body and resolved by the product owner: resolve as tags on the any path (D4); theme chips reuse the existing tag colour palette (D5); design review does not block this PR; the GET /api/documents/search param-name defect is corrected (see Frontend changes). OQ-1/OQ-2 are promoted to accepted defaults. Three additional correctness/edge tests were added (see Testing).

Second review pass (2026-06-06): one residual decision — theme-chip colour source — resolved by the product owner as per-tag colour (option a): TagHint carries a color field and the chip renders the tag's own --c-tag-{color}. Folded into Backend §1/§4, Frontend, and the acceptance criteria below.

Third review pass (2026-06-06): Six additional items folded in: ADR-028 reference updated to exact filename; resolveEffectiveColors mutation-guard note added; Thema: prefix placed outside truncate span; @DataJpaTest descendant-expansion test added; fresh-mock-per-test note added; branch coverage gate confirmed as 88%. One open UX decision remained.

Fourth review pass (2026-06-06): OQ-5 resolved as option B (prefix + tint + tinted-left-border); onRemoveChip value specified as tag.name; parent chip-removal filter-combination pattern added; TypeScript fixture audit added; isHealthy() clarified as "document over wire"; tagsApplied: false hint added to Out of Scope. No remaining open decisions.

Fifth review pass (2026-06-06): Three corrections from codebase reads folded in: (1) security note updated — inline style attribute IS the established --c-tag-* pattern (see TagTreeNode.svelte, TagInput.svelte, themen/+page.svelte); (2) pre-implementation DocumentService check resolved — hasTagPartial has its own null guard, tags list filter fires independently, no fix needed; (3) resolveEffectiveColors confirmed batched (findAllById single IN query). Two additions: paramsFromInterpretation() extension added as a named Frontend task; paramsFromInterpretation() unit tests + E2E "last chip + person survives" scenario added to Testing. Border-left treatment refined to 1px tinted-left-color (using inline style, consistent with existing tag components) rather than a separate 2px utility class.

Sixth review pass (2026-06-06): Three codebase-grounded findings folded in: (1) resolveEffectiveColors resolves only ONE level of color inheritance — documented as a known depth constraint, pinned with a 3-level test case (see Testing); (2) ChipType union is duplicated across InterpretationChipRow.svelte:7 and +page.svelte:272 — extract to search/chip-types.ts as a pre-implementation step; (3) C4 L3 backend diagram needs a search/tag/ dependency arrow (added to Documentation). Two implementation-coordination items: removeChip's if/else switch needs an explicit 'theme' case (added to Frontend changes); theme chips require a distinct {:else if chip.type === 'theme'} template branch to keep Thema: prefix outside the truncated span (added to Frontend changes). applyResolvedAndSearch signature extension approach clarified. NlQueryParserServiceTest 4th constructor-arg sync added to Testing. Call-site mutation-guard comment required for resolveEffectiveColors.

Goal

In v1, every term the LLM extracts as a keyword ("Hochzeit", "Krieg", "Kriegsende") is folded into PostgreSQL full-text search. In this iteration, NlQueryParserService additionally resolves each extracted keyword against the existing tag taxonomy. Keywords that match one or more tags become tag filters (OR-union, with the existing hierarchy expansion); keywords that match nothing remain as FTS keywords — exactly as today. No LLM schema or prompt change is required.

This makes a smart-search theme term behave the same as clicking a tag in the existing keyword search.

Design decisions (resolved)

# Decision Rationale
D1 No LLM change. Resolve the existing keywords list against tags in NlQueryParserService; matches → tag filters, misses → FTS keywords. Mirrors the existing person no-match folding. The Ollama JSON schema in RestClientOllamaClient (personNames, personRole, dateFrom, dateTo, keywords) and the prompt are untouched — lowest risk, no new model failure mode. Drops the stub's themeNames field.
D2 OR-union, no disambiguation. When a keyword substring-matches several distinct tags ("krieg" → Weltkrieg, Kriegsende, Nachkriegszeit) all matches become one OR set (TagOperator.OR). Tags are categories, not identities — "show me anything war-related" is the desired result, not an error. Best recall for the 60+ audience. Drops the stub's ambiguousTags concept entirely (no picker, no suppressed results). Consistent with ADR-033 resolving tag names deterministically.
D3 Self-contained scope, depends on #739. One coherent vertical slice (parser → API → chips).
D4 personRole:"any" path resolves tags (does not skip). resolvedTags is populated on this path for an honest interpretation payload; tagsApplied = false keeps the chips hidden. (Resolves OQ-4.) The 1–3 unused ILIKE queries are sub-millisecond inside a multi-second LLM call — trivial. A truthful interpretation object is worth more than the micro-saving of skipping the lookup.
D5 Theme chips use per-tag colour + tinted-left-border colour (OQ-5, option B). TagHint carries a color field; each chip renders (a) a tinted background from --c-tag-{color} and (b) a tinted left-border colour (border-left-color: var(--c-tag-{color})) as a chip-family visual marker — three redundant cues: Thema: prefix + tinted left border + tinted background. Both applied via inline style consistent with TagTreeNode.svelte / TagInput.svelte / themen/+page.svelte. Null-color fallback: omit the style attribute entirely; the Thema: prefix is the sole differentiator. (Resolves OQ-5.) Three redundant cues for the 60+/colour-blind/bright-sunlight audience. Using inline style matches the established --c-tag-* pattern in the codebase.

Corrections to the original stub (grounded in merged #738 code)

  • resolvedTags: List<Tag> resolvedTags: List<TagHint>. The API never exposes JPA entities; persons surface as the lightweight PersonHint(id, displayName) record (search/PersonHint.java). Tags mirror this with a new TagHint(id, name, color).
  • "passes resolved tag IDs as filters" → passes resolved tag names. DocumentService.searchDocuments consumes SearchFilters.tags: List<String> (names), then TagService.expandTagNamesToDescendantIdSets expands each name to its descendant-ID set. Passing names gives hierarchy expansion for free.
  • ambiguousTags + disambiguation flow → removed (see D2).

Backend changes

1. New record — search/TagHint.java

public record TagHint(
        @Schema(requiredMode = Schema.RequiredMode.REQUIRED)
        UUID id,
        @Schema(requiredMode = Schema.RequiredMode.REQUIRED)
        String name,
        String color           // effective tag colour token (sage/sienna/…); nullable if no colour resolves
) {}

Plain record built from Tag entities by NlQueryParserService (not a JPA projection — same pattern as PersonHint). The color field (D5) is the tag's effective colour: tag colours are stored only on root tags and inherited by children, so the parser must call TagService.resolveEffectiveColors(matched) on the matched entities before mapping to TagHint — otherwise child tags surface a null colour. color is not @Schema(requiredMode = REQUIRED) (it can legitimately be null).

⚠️ resolveEffectiveColors depth limitation (known constraint): The method resolves exactly one level of color inheritance — it loads the direct parentId and stops. For a grandchild tag (grandchild → child (no own color) → root (color=sage)), the grandchild gets null rather than sage. The archive's tag taxonomy is typically shallow (root → child); if a matched keyword happens to be a grandchild, it surfaces null color. Document this at the call site and pin it with the 3-level test (see Testing). If the taxonomy grows deeper, fix the method to recurse.

2. Extend search/NlQueryInterpretation.java

Add two fields (current fields: resolvedPersons, ambiguousPersons, dateFrom, dateTo, keywords, rawQuery, keywordsApplied):

@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
List<TagHint> resolvedTags,     // tags the query was filtered by (after keyword→tag resolution)
@Schema(requiredMode = Schema.RequiredMode.REQUIRED)
boolean tagsApplied             // false when the active search path cannot apply tag filters

Place resolvedTags after keywords and tagsApplied after keywordsApplied (keeps person/keyword/tag groupings adjacent). NlSearchResponse is unchanged.

Note (Felix): record fields are positional, so the two new fields break all three NlQueryInterpretation construction call sites in search() (ambiguous early-return, any-path, main path). The compiler will catch them — write the tests first so the new positional args are pinned.

3. TagService.findByNameContaining(String fragment): List<Tag>

One-liner mirroring PersonService.findByDisplayNameContaining. Delegates to the existing TagRepository.findByNameContainingIgnoreCase(fragment) — no new JPQL. Read method → no @Transactional (consistent with PersonService read-method style).

Do not reuse TagService.search() — it enriches results with relatives (enrichWithRelatives()) for the tag-browse UI, which is unwanted noise for filter resolution.

4. NlQueryParserService — keyword→tag resolution

Currently (search/NlQueryParserService.java ~line 75-80) the SearchFilters is built with tags = List.of(), tagQ = null, tagOperator = TagOperator.AND. New flow, inserted after keyword extraction and before the personRole:"any" branch (so both keywordsApplied and tagsApplied are known on every path). Resolve once, then branch — do not duplicate the resolution logic in both branches:

resolveTags(keywords):
  resolvedTagEntities = new LinkedHashSet<Tag>()   // dedup, preserve order
  remainingKeywords   = new ArrayList<String>()
  for kw in keywords:
    if kw == null or kw.length() < MIN_TAG_TERM:        // guard short/noisy terms
        remainingKeywords.add(kw); continue
    matches = tagService.findByNameContaining(kw)        // substring ILIKE
    if matches.isEmpty():
        remainingKeywords.add(kw)                         // unchanged v1 behaviour
    else:
        add each match to resolvedTagEntities (stop adding once size == MAX_RESOLVED_TAGS)
  // resolveEffectiveColors mutates detached Tag instances in-place (sets color field);
  // must not be called inside a write @Transactional scope — mutations would be persisted.
  tagService.resolveEffectiveColors(resolvedTagEntities)  // D5 — see depth limitation note in §1
  return (resolvedTagEntities → List<TagHint(id, name, color)>, tag names, remainingKeywords)
  • FTS text is built from remainingKeywords only (plus person no-match/extra fragments) — resolved terms are removed from the text predicate so they are not double-applied (tag filter AND text match would over-narrow).
  • ⚠️ buildText rawQuery fallback guard (correctness, not just coverage): buildText() currently falls back to rawQuery when the joined parts are blank. If every keyword resolves to a tag and there are no person/no-match fragments, the FTS text would become blank → buildText returns rawQuery, re-introducing the resolved terms as text AND applying them as tags — the exact double-apply this issue forbids. The resolved-term removal must produce an empty/null FTS text in this case, not a rawQuery fallback (e.g. only fall back to rawQuery when nothing was resolved — no tags AND no persons — or thread a boolean hadStructuredMatch into buildText). Pin this with an explicit test (see Testing).
  • Filter assembly (non-any path): build SearchFilters with tags = <resolved tag names>, tagOperator = TagOperator.OR, tagQ = null. Set tagsApplied = true. Person (sender/receiver) and date predicates remain AND-composed with the tag predicate; OR applies only within the tag set. (Confirmed: DocumentSpecifications.hasTagPartial has its own null guard and fires only when tagQ is non-blank; the tags list filter is a separate specification that fires independently. No guard in DocumentService suppresses it when tagQ = null.)
  • personRole:"any" single-person path (searchDocumentsByPersonId) supports neither keywords nor tags → set tagsApplied = false (mirrors the existing keywordsApplied = false). Per D4, resolvedTags is still populated for transparency; the frontend hides the chips when tagsApplied == false. (This path already early-returns before the SearchFilters build site ~line 63 — confirmed.)
  • Constants: MIN_TAG_TERM (3, accepted — resolves OQ-1) and MAX_RESOLVED_TAGS (10, mirrors person MAX_CANDIDATES = 10 — resolves OQ-2), declared as named constants in NlQueryParserService next to MAX_CANDIDATES. Both have value 10 but model different concepts (person resolution candidate cap vs. tag resolution cap) — keep them as separate named constants; they may diverge independently. When matches exceed the cap, take the first N and log.debug("Keyword matched {} tags; capping at {}", count, MAX_RESOLVED_TAGS)count only, never the raw keyword at info/warn (parameterized SLF4J; LLM-derived input echoes private names lifted from the query; defends against log injection + PII leak per ADR-028).

Security / defense-in-depth

Keyword fragments are already length-capped to 100 chars (MAX_KEYWORD_LENGTH) by RestClientOllamaClient.toExtraction() before reaching the parser; no new validation needed before the TagRepository call. Queries are parameterized (LIKE LOWER(CONCAT('%', :q, '%'))). MIN_TAG_TERM >= 3 also blunts trivial 1–2-char enumeration of the tag taxonomy. TagHint.color is a closed enum value from TagService.ALLOWED_TAG_COLORS — use the established inline style attribute pattern from existing tag-color components (TagTreeNode.svelte, TagInput.svelte, themen/+page.svelte): style="background-color: var(--c-tag-{color}); border-left-color: var(--c-tag-{color})". TagHint.color is a closed DB-origin set and Svelte auto-escapes attribute values; no {@html} needed. TagHint.name originates from the database (Tag entity name resolved from the taxonomy), not from LLM output directly — a tighter XSS surface than keyword chips. The endpoint authz (@RequirePermission(Permission.READ_ALL) on POST /api/search/nl) is unchanged — no new endpoint, no new permission decision.

Documentation

Amend docs/adr/028-nl-search-ollama.md in this PR:

  • Record the keyword→tag resolution decision and the deliberate FTS-text removal of resolved terms (tag filter AND text match would over-narrow) — reference ADR-033 for the deterministic name-resolution lineage. This is the non-obvious "why" a future dev would otherwise reverse.
  • Correct isHealthy() wording — recommended resolution: document over wire. ADR-028 states isHealthy() is "called inline before each inference request." It is not — search() calls ollamaClient.parse() directly; isHealthy() has zero callers in search/. parse() already throws on connection failure (caught as SMART_SEARCH_UNAVAILABLE); a pre-call isHealthy() probe would add ~50–150 ms latency for a failure mode already covered. Update ADR-028 to state: isHealthy() is reserved for ops/health endpoint polling; the in-path degradation mechanism is parse()'s timeout/IOException handling. No code change — one ADR paragraph. (Pre-existing; flagged in review.)

Add the new term "keyword→tag resolution" / "theme chip" to docs/GLOSSARY.md.

C4 diagram check: the search/ package gains a new cross-domain dependency on tag/ (TagService). Check docs/architecture/c4/l3-backend-*.puml — if NlQueryParserService is diagrammed, add a dependency arrow to TagService.

Frontend changes (depends on #739)

Pre-implementation step: ChipType is currently defined in two filesInterpretationChipRow.svelte:7 and +page.svelte:272 (both 'sender' | 'directional' | 'date' | 'keyword'). Extract to frontend/src/routes/search/chip-types.ts and import from both files before the PR. When 'theme' is added, it becomes the single source of truth.

Extend InterpretationChipRow.svelte (introduced by #739) with theme chips, and extend the chip-removal machinery in the parent page:

  • Render one removable chip per interpretation.resolvedTags entry, only when interpretation.tagsApplied === true (same rule as keywordsApplied for keyword chips — do not show greyed/disabled chips).
  • Visual identity (D5 — per-tag colour via inline style): each theme chip renders (a) a tinted background and (b) a tinted left-border colour using style="background-color: var(--c-tag-{tag.color}); border-left-color: var(--c-tag-{tag.color})" — consistent with TagTreeNode.svelte, TagInput.svelte, and themen/+page.svelte. Three redundant cues: Thema: prefix + tinted left border + tinted background. When tag.color is null, omit the style attribute entirely — the chip renders in the standard neutral appearance; the Thema: prefix is the sole differentiator in this case.
  • Template branch: theme chips require a distinct {:else if chip.type === 'theme'} branchdo not reuse the generic {:else} branch. The generic branch places chip.label inside the nameSpan truncated element; for theme chips, the Thema: prefix and the × button must both remain outside the truncated span (only the tag name itself truncates). Mirror the structure of the directional branch. Add data-chip-type="theme" to the chip wrapper <span> (consistent with data-chip-type={chip.type} on other branches — required for test assertions).
  • Type-prefixed label for senior legibility: [Thema: Weltkrieg ×] (new i18n key search_chip_theme_prefix).
  • Removal re-run — exact GET /api/documents/search param names (corrected; resolves OQ-3): clicking × drops that tag and re-runs via GET /api/documents/search. Theme chips pass value = tag.name to onRemoveChip('theme', value). Extend the onRemoveChip callback union with 'theme' (one callback, switch on type — do not add a parallel callback). Route the re-run through the same withCsrf() wrapper as the other chips for consistency.
  • paramsFromInterpretation() extension and removeChip switch (parent page — +page.svelte):
    • paramsFromInterpretation() at line 261 currently returns {senderId, receiverId, from, to, q}. The multi-value tag[] params cannot be expressed in this object shape. Choose one approach before implementing: (a) add tags: string[] and tagOp: string to the return type and extend applyResolvedAndSearch to accept them; or (b) build URLSearchParams directly in the 'theme' case of removeChip without changing the return type. Be consistent across removeChip and applyResolvedAndSearch.
    • The removeChip if/else switch at line 274 also needs a 'theme' case — this is a separate change from paramsFromInterpretation(). Without it, clicking a theme chip's × calls applyResolvedAndSearch with the full interpretation params intact (no tag removed). The theme branch: drop the selected tag from resolvedTags, reconstruct remaining tag names, emit as tag params + tagOp=OR.
    • When resolvedTags is non-empty after removing the tapped tag: params.append('tag', name) for each remaining name + params.set('tagOp', 'OR'); when the last theme chip is removed, omit both tag and tagOp. Any still-active person/date filters from the interpretation are preserved in the params as before. No re-parse via the NL endpoint — existing chip removal already clears NL state (smartMode=false, nlInterpretation=null) and switches to regular search; theme chip removal follows the same pattern.
  • {#each} key: "theme:" + tag.id. Use $derived for the chip list and the tagsApplied gate (not $effect).
  • a11y / touch (inherit from #739): ≥44px × target; focus-visible:ring-2 on the wrapper (not just the ×); truncate on the label span at max-w-[8rem] (mobile) with the Thema: prefix and the × button both outside the truncated span — the prefix is the sole non-colour differentiator for colour-blind users and must never be clipped at any viewport width; flex flex-wrap gap-2 container must absorb up to MAX_RESOLVED_TAGS=10 theme chips + person/date chips without pushing the result list off the first screen. aria-label via search_filter_remove_label (Filter entfernen: {label}), where {label} includes the "Thema" prefix ("Filter entfernen: Thema Weltkrieg"). No {@html} — tag names are LLM-influenced; rely on Svelte text-position auto-escaping.

New i18n keys (messages/{de,en,es}.json)

Key de en es
search_chip_theme_prefix "Thema" "Topic" "Tema"

OpenAPI / types

Run npm run generate:api in the same PR — TagHint (with id, name, color) and the two new NlQueryInterpretation fields must appear in src/lib/generated/.

Testing

TypeScript fixture audit (pre-implementation): before running npm run generate:api, run grep -rn 'NlQueryInterpretation' frontend/ to find all existing mock objects. The type regen adds two required fields (resolvedTags: TagHint[], tagsApplied: boolean) — every mock object missing them causes npm run check failures across spec files. Add resolvedTags: [] and tagsApplied: false as neutral defaults to each found site. Confirmed primary site: makeInterpretation() factory at InterpretationChipRow.svelte.spec.ts:14.

Backend — NlQueryParserServiceTest constructor sync (pre-implementation): The test at line 43 manually constructs the service: new NlQueryParserService(ollamaClient, personService, documentService). Adding TagService as the 4th constructor arg breaks all existing tests at compile time. Add @Mock TagService tagService and a default stub when(tagService.findByNameContaining(anyString())).thenReturn(List.of()) in setUp() alongside the existing service stubs.

Backend — NlQueryParserServiceTest (Mockito), additions (assert filters.tags() and filters.tagOperator() with exact matchers, never any() — a vacuous any() lets an accidental AND default pass; write tag(name) / tagHint(id,name,color) factory helpers first, following the existing lowercase private-method convention):

  • keyword resolves to a single tag → resolvedTags contains it, tagsApplied == true, removed from FTS text; assert searchDocuments called with filters.tags() == [name] and filters.tagOperator() == TagOperator.OR
  • keyword substring-matches multiple tags → all in resolvedTags; assert OR-union filter
  • keyword with no tag match → stays in FTS text, resolvedTags empty
  • mixed (one resolves, one doesn't) → resolved term becomes tag, unresolved stays in the space-joined text argument
  • single person + personRole:"any" + a resolvable keyword → searchDocumentsByPersonId path taken, searchDocuments never called (verify(..., never())), tagsApplied == false, resolvedTags still populated (D4)
  • cap: keyword matches > MAX_RESOLVED_TAGS tags → capped, debug-logged (count only)
  • short keyword (< MIN_TAG_TERM) → skipped from tag resolution, stays as keyword
  • dedup: the same tag matched via multiple keywords/substrings ("krieg" + "kriegsende" both hitting Kriegsende) → appears once in resolvedTags (LinkedHashSet)
  • all-resolve rawQuery guard (latent bug): every keyword resolves and there are no person/no-match fragments → assert filters.text() == null and filters.tags() populated (no double-apply)
  • flag independence: all keywords became tags → keywordsApplied == false and tagsApplied == true in the same interpretation (frontend contract: keyword chips hidden, theme chips shown)
  • colour (D5, one-level): a child tag with no own colour resolves to its parent's colour → the resulting TagHint.color equals the parent's (exercises resolveEffectiveColors); build the mock Tag entity fresh per test (not a shared field) — resolveEffectiveColors calls tag.setColor(...) in-place and bleeds into other tests that share the same entity instance
  • colour depth limitation (3-level): a grandchild tag (no own color) whose direct parent also has no color → assert TagHint.color == null. This pins the known one-level depth constraint of resolveEffectiveColors as a documented limitation rather than a silent production bug. If the archive's taxonomy grows deeper, fix the method to recurse and update this test.

Backend — TagServiceTest: findByNameContaining delegates to findByNameContainingIgnoreCase (red → green). Do not test the trivial TagHint record accessors.

Backend — descendant-expansion integration test (@DataJpaTest, postgres:16-alpine): create a parent tag + a child tag; a document tagged only on the child; assert an NL search with a keyword substring-matching the parent name returns that document. Verifies expandTagNamesToDescendantIdSets fires end-to-end — the acceptance criterion "filtered by the OR-union of every tag … and their descendants" is not provable via Mockito alone. Use postgres:16-alpine (project convention — never H2). Check for an existing Testcontainers base class (NlSearchControllerTest.java — inspect its imports) before declaring a new container.

Coverage gate: target is 88% branch coverage (backend/CLAUDE.md). The new branches — short-term skip, cap, dedup, all-resolve guard, flag independence — are the ones easiest to miss; verify each is hit.

Frontend — InterpretationChipRow.svelte.spec.ts (vitest-browser-svelte):

  • resolvedTags non-empty + tagsApplied: true → theme chips render with Thema: prefix
  • tagsApplied: false → assert DOM absence via querySelectorAll('[data-chip-type="theme"]')not [data-chip-type] which would also catch other chip types
  • a theme chip with color: "sage" renders with inline style containing background-color: var(--c-tag-sage) and border-left-color: var(--c-tag-sage); a chip with color: null renders with no tint style (standard neutral chip appearance)
  • clicking × on a theme chip → onRemoveChip('theme', tag.name) called with the tag's name as value (assert callback receives the name string — the HTTP call is the parent component's responsibility, not the chip row's)
  • N resolved tags → exactly N theme chips

Frontend — paramsFromInterpretation() unit tests (vitest): the parent page's param-reconstruction function must be tested for the theme-chip branch covering: (a) N remaining tags → exactly N tag params + tagOp=OR; (b) last tag removed → no tag/tagOp in params; (c) last tag removed with a resolved person present → sender param intact, no tag/tagOp; (d) null-color tag → tag name is emitted correctly (color does not affect param construction).

No new Playwright test required — extend the existing #739 happy-path E2E fixture to include resolvedTags + tagsApplied: true. Locate the fixture file first: run find frontend/e2e -name "*.spec.ts" | xargs grep -l "smartMode". Also add a "last theme chip removed, person filter survives" scenario: NL result with one resolved person + one theme tag; remove the theme chip; assert no tag/tagOp in the resulting URL but the sender UUID param is present.

Acceptance criteria

  • POST /api/search/nl with {"query": "Briefe über den Krieg"} returns results filtered by the OR-union of every tag whose name substring-matches "Krieg" (and their descendants), with those tags in interpretation.resolvedTags and tagsApplied == true
  • A keyword with no matching tag is folded into the FTS keyword text and does not appear in resolvedTags (v1 behaviour preserved)
  • A keyword matching multiple tags includes all of them; a document tagged with any one of them appears in results (OR semantics)
  • A keyword matching the same tag via multiple substrings yields that tag once in resolvedTags (dedup)
  • Resolved tag terms are removed from the FTS text predicate (not applied twice); when every keyword resolves the FTS text is null and does not fall back to rawQuery
  • Each resolvedTags entry carries its effective color (own colour, or the inherited parent colour for a direct child tag with no own colour; null for grandchild+ tags or when neither the tag nor its parent has a colour)
  • A query with a single resolved person and personRole:"any" plus a theme term runs the sender-OR-receiver search; tagsApplied == false; resolvedTags is still populated; the frontend shows no theme chips
  • Frontend: theme chips render as [Thema: … ×] with the tag's own --c-tag-{color} applied via inline style (background-color + border-left-color); null-color falls back to standard neutral chip appearance (no style attribute); when tagsApplied == true; removing one re-runs GET /api/documents/search with tagOp=OR and the remaining tag values; removing the last theme chip omits both tag and tagOp while any still-active sender/receiver/date filters remain intact
  • GET /api/documents/search behaviour is otherwise unchanged
  • The Ollama JSON schema and prompt in RestClientOllamaClient are unchanged (no themeNames field)
  • npm run generate:api has been run; TagHint (id, name, color), resolvedTags, tagsApplied exist in generated types
  • npm run check passes with zero new errors after adding resolvedTags: [] and tagsApplied: false to all existing NlQueryInterpretation mock objects

Non-functional checklist

  • Performance: ≤ one indexed ILIKE per extracted keyword (typically 1–3), bounded by MAX_RESOLVED_TAGS; resolveEffectiveColors adds one batched parent-colour lookup (confirmed: tagRepository.findAllById(parentIdsNeeded) — single IN query). Adds < ~10ms to a 2–15s LLM call — negligible. The existing /api/search/nl Prometheus latency exclusion (#738) already covers it. Pre-merge check: verify #738's Prometheus exclusion/histogram actually landed in maincurl -s http://localhost:8080/actuator/prometheus | grep 'http_server_requests.*search/nl' should return a series; if absent, fix before merging. Document this check in the PR description as a merge checklist item. Known future lever (not this PR): LIKE LOWER(CONCAT('%', :q, '%')) is a leading-wildcard match → no B-tree index; trivial seq-scan on the small tag table today. If tag ever grows past a few thousand rows, add a pg_trgm GIN index on lower(name).
  • Security: keyword fragments pre-capped at 100 chars; parameterized queries; MIN_TAG_TERM >= 3 blunts taxonomy enumeration; TagHint.color is a closed DB-origin palette token applied via inline style (Svelte auto-escapes attribute values); TagHint.name is DB-origin (not LLM text); LLM-derived tag names never reach innerHTML (no {@html}) and are logged only as counts.
  • Observability: log resolved-tag count as metadata (log.debug) — never the raw query or tag names beyond debug level (PII policy from ADR-028). No new metric or dashboard panel (sub-ms inside a multi-second request).
  • i18n: new key trio added to de/en/es before any render.
  • Accessibility: theme chips inherit #739's ≥44px targets, aria-label (with "Thema" prefix), and focus-ring patterns. Run axe in both light and dark mode on the chip row — --c-tag-* tokens are remapped in dark mode and a token passing 4.5:1 in light can fail in dark. Perform a visual wrap stress test at 320px with 10 stub theme chips to confirm the flex flex-wrap gap-2 container absorbs them without pushing results off screen.
  • Compatibility: no DB migration, no infra change, no new container, no new env var, no new error code.

Out of scope

  • LLM themeNames field / system-prompt change (explicitly rejected — see D1)
  • Tag disambiguation UI / ambiguousTags (rejected — see D2)
  • Tag filtering on the personRole:"any" search path (returns tagsApplied = false; chips hidden — see D4)
  • No tagsApplied: false hint shown on the personRole:"any" path (by design — D4 transparency is API-level, not UI-level; no analogue to the keyword showKeywordsNotApplied hint)
  • nl-search-spec.html theme-chip mockup (nice-to-have follow-up; design review does not gate this PR — see D5)
  • pg_trgm index on tag.name (future lever, only if the table grows)
  • Recursive resolveEffectiveColors for grandchild+ color inheritance (future lever if taxonomy grows deeper than 2 levels)
  • Semantic/embedding-based theme matching — tracked separately under the epic parking lot
Part of epic #735. Post-v1 refinement (v1 #738 folds all unresolved terms into `keywords`; this issue resolves a subset of those terms against the tag taxonomy). **Self-contained** (backend tag resolution + frontend theme chips). **Depends on #739 merged** — the interpretation-chip components (`InterpretationChipRow.svelte`) must exist before theme chips can be added. > **Refinement note (req-eng pass, 2026-06-06):** This issue was promoted from a parking-lot stub. Three design decisions were resolved with the product owner and three contradictions with the merged #738 backend were corrected (see _Design decisions_). The original stub proposed `resolvedTags: List<Tag>`, tag-**ID** filters, an `ambiguousTags` disambiguation flow, and a new `themeNames` LLM field — **all four are dropped** in favour of the design below. > > **Six-persona review pass (2026-06-06):** all review findings and the four open questions have been folded into this body and resolved by the product owner: _resolve as tags on the `any` path (D4); theme chips reuse the existing tag colour palette (D5); design review does not block this PR; the `GET /api/documents/search` param-name defect is corrected (see Frontend changes)._ OQ-1/OQ-2 are promoted to accepted defaults. Three additional correctness/edge tests were added (see Testing). > > **Second review pass (2026-06-06):** one residual decision — theme-chip colour source — resolved by the product owner as **per-tag colour (option a)**: `TagHint` carries a `color` field and the chip renders the tag's own `--c-tag-{color}`. Folded into Backend §1/§4, Frontend, and the acceptance criteria below. > > **Third review pass (2026-06-06):** Six additional items folded in: ADR-028 reference updated to exact filename; `resolveEffectiveColors` mutation-guard note added; `Thema:` prefix placed outside `truncate` span; `@DataJpaTest` descendant-expansion test added; fresh-mock-per-test note added; branch coverage gate confirmed as 88%. One open UX decision remained. > > **Fourth review pass (2026-06-06):** OQ-5 resolved as option B (prefix + tint + tinted-left-border); `onRemoveChip` value specified as `tag.name`; parent chip-removal filter-combination pattern added; TypeScript fixture audit added; `isHealthy()` clarified as "document over wire"; `tagsApplied: false` hint added to Out of Scope. No remaining open decisions. > > **Fifth review pass (2026-06-06):** Three corrections from codebase reads folded in: (1) security note updated — inline `style` attribute IS the established `--c-tag-*` pattern (see `TagTreeNode.svelte`, `TagInput.svelte`, `themen/+page.svelte`); (2) pre-implementation `DocumentService` check resolved — `hasTagPartial` has its own null guard, `tags` list filter fires independently, no fix needed; (3) `resolveEffectiveColors` confirmed batched (`findAllById` single IN query). Two additions: `paramsFromInterpretation()` extension added as a named Frontend task; `paramsFromInterpretation()` unit tests + E2E "last chip + person survives" scenario added to Testing. Border-left treatment refined to 1px tinted-left-color (using inline `style`, consistent with existing tag components) rather than a separate 2px utility class. > > **Sixth review pass (2026-06-06):** Three codebase-grounded findings folded in: (1) `resolveEffectiveColors` resolves only ONE level of color inheritance — documented as a known depth constraint, pinned with a 3-level test case (see Testing); (2) `ChipType` union is duplicated across `InterpretationChipRow.svelte:7` and `+page.svelte:272` — extract to `search/chip-types.ts` as a pre-implementation step; (3) C4 L3 backend diagram needs a `search/` → `tag/` dependency arrow (added to Documentation). Two implementation-coordination items: `removeChip`'s `if/else` switch needs an explicit `'theme'` case (added to Frontend changes); theme chips require a distinct `{:else if chip.type === 'theme'}` template branch to keep `Thema:` prefix outside the truncated span (added to Frontend changes). `applyResolvedAndSearch` signature extension approach clarified. `NlQueryParserServiceTest` 4th constructor-arg sync added to Testing. Call-site mutation-guard comment required for `resolveEffectiveColors`. ## Goal In v1, every term the LLM extracts as a `keyword` ("Hochzeit", "Krieg", "Kriegsende") is folded into PostgreSQL full-text search. In this iteration, `NlQueryParserService` additionally **resolves each extracted keyword against the existing tag taxonomy**. Keywords that match one or more tags become **tag filters** (OR-union, with the existing hierarchy expansion); keywords that match nothing **remain as FTS keywords** — exactly as today. No LLM schema or prompt change is required. This makes a smart-search theme term behave the same as clicking a tag in the existing keyword search. ## Design decisions (resolved) | # | Decision | Rationale | |---|----------|-----------| | D1 | **No LLM change.** Resolve the existing `keywords` list against tags in `NlQueryParserService`; matches → tag filters, misses → FTS keywords. | Mirrors the existing person no-match folding. The Ollama JSON schema in `RestClientOllamaClient` (`personNames`, `personRole`, `dateFrom`, `dateTo`, `keywords`) and the prompt are **untouched** — lowest risk, no new model failure mode. Drops the stub's `themeNames` field. | | D2 | **OR-union, no disambiguation.** When a keyword substring-matches several distinct tags ("krieg" → Weltkrieg, Kriegsende, Nachkriegszeit) all matches become one OR set (`TagOperator.OR`). | Tags are categories, not identities — "show me anything war-related" is the desired result, not an error. Best recall for the 60+ audience. Drops the stub's `ambiguousTags` concept entirely (no picker, no suppressed results). Consistent with ADR-033 resolving tag names deterministically. | | D3 | **Self-contained scope**, depends on #739. | One coherent vertical slice (parser → API → chips). | | D4 | **`personRole:"any"` path resolves tags (does not skip).** `resolvedTags` is populated on this path for an honest interpretation payload; `tagsApplied = false` keeps the chips hidden. _(Resolves OQ-4.)_ | The 1–3 unused `ILIKE` queries are sub-millisecond inside a multi-second LLM call — trivial. A truthful interpretation object is worth more than the micro-saving of skipping the lookup. | | D5 | **Theme chips use per-tag colour + tinted-left-border colour (OQ-5, option B).** `TagHint` carries a `color` field; each chip renders (a) a tinted background from `--c-tag-{color}` and (b) a tinted left-border colour (`border-left-color: var(--c-tag-{color})`) as a chip-family visual marker — three redundant cues: `Thema:` prefix + tinted left border + tinted background. Both applied via inline `style` consistent with `TagTreeNode.svelte` / `TagInput.svelte` / `themen/+page.svelte`. Null-color fallback: omit the `style` attribute entirely; the `Thema:` prefix is the sole differentiator. _(Resolves OQ-5.)_ | Three redundant cues for the 60+/colour-blind/bright-sunlight audience. Using inline `style` matches the established `--c-tag-*` pattern in the codebase. | ### Corrections to the original stub (grounded in merged #738 code) - ❌ `resolvedTags: List<Tag>` → ✅ **`resolvedTags: List<TagHint>`**. The API never exposes JPA entities; persons surface as the lightweight `PersonHint(id, displayName)` record (`search/PersonHint.java`). Tags mirror this with a new `TagHint(id, name, color)`. - ❌ "passes resolved tag **IDs** as filters" → ✅ passes resolved tag **names**. `DocumentService.searchDocuments` consumes `SearchFilters.tags: List<String>` (names), then `TagService.expandTagNamesToDescendantIdSets` expands each name to its descendant-ID set. Passing names gives **hierarchy expansion for free**. - ❌ `ambiguousTags` + disambiguation flow → ✅ removed (see D2). ## Backend changes ### 1. New record — `search/TagHint.java` ```java public record TagHint( @Schema(requiredMode = Schema.RequiredMode.REQUIRED) UUID id, @Schema(requiredMode = Schema.RequiredMode.REQUIRED) String name, String color // effective tag colour token (sage/sienna/…); nullable if no colour resolves ) {} ``` Plain record built from `Tag` entities by `NlQueryParserService` (not a JPA projection — same pattern as `PersonHint`). The `color` field (D5) is the tag's **effective** colour: tag colours are stored only on root tags and inherited by children, so the parser must call `TagService.resolveEffectiveColors(matched)` on the matched entities **before** mapping to `TagHint` — otherwise child tags surface a `null` colour. `color` is **not** `@Schema(requiredMode = REQUIRED)` (it can legitimately be null). > ⚠️ **`resolveEffectiveColors` depth limitation (known constraint):** The method resolves exactly **one level** of color inheritance — it loads the direct `parentId` and stops. For a grandchild tag (grandchild → child (no own color) → root (color=sage)), the grandchild gets `null` rather than `sage`. The archive's tag taxonomy is typically shallow (root → child); if a matched keyword happens to be a grandchild, it surfaces `null` color. Document this at the call site and pin it with the 3-level test (see Testing). If the taxonomy grows deeper, fix the method to recurse. ### 2. Extend `search/NlQueryInterpretation.java` Add two fields (current fields: `resolvedPersons, ambiguousPersons, dateFrom, dateTo, keywords, rawQuery, keywordsApplied`): ```java @Schema(requiredMode = Schema.RequiredMode.REQUIRED) List<TagHint> resolvedTags, // tags the query was filtered by (after keyword→tag resolution) @Schema(requiredMode = Schema.RequiredMode.REQUIRED) boolean tagsApplied // false when the active search path cannot apply tag filters ``` Place `resolvedTags` after `keywords` and `tagsApplied` after `keywordsApplied` (keeps person/keyword/tag groupings adjacent). `NlSearchResponse` is unchanged. > **Note (Felix):** record fields are positional, so the two new fields break **all three** `NlQueryInterpretation` construction call sites in `search()` (ambiguous early-return, `any`-path, main path). The compiler will catch them — write the tests first so the new positional args are pinned. ### 3. `TagService.findByNameContaining(String fragment): List<Tag>` One-liner mirroring `PersonService.findByDisplayNameContaining`. Delegates to the **existing** `TagRepository.findByNameContainingIgnoreCase(fragment)` — no new JPQL. Read method → **no `@Transactional`** (consistent with `PersonService` read-method style). > Do **not** reuse `TagService.search()` — it enriches results with relatives (`enrichWithRelatives()`) for the tag-browse UI, which is unwanted noise for filter resolution. ### 4. `NlQueryParserService` — keyword→tag resolution Currently (`search/NlQueryParserService.java` ~line 75-80) the `SearchFilters` is built with `tags = List.of()`, `tagQ = null`, `tagOperator = TagOperator.AND`. New flow, inserted after keyword extraction and **before** the `personRole:"any"` branch (so both `keywordsApplied` and `tagsApplied` are known on every path). Resolve **once**, then branch — do not duplicate the resolution logic in both branches: ```text resolveTags(keywords): resolvedTagEntities = new LinkedHashSet<Tag>() // dedup, preserve order remainingKeywords = new ArrayList<String>() for kw in keywords: if kw == null or kw.length() < MIN_TAG_TERM: // guard short/noisy terms remainingKeywords.add(kw); continue matches = tagService.findByNameContaining(kw) // substring ILIKE if matches.isEmpty(): remainingKeywords.add(kw) // unchanged v1 behaviour else: add each match to resolvedTagEntities (stop adding once size == MAX_RESOLVED_TAGS) // resolveEffectiveColors mutates detached Tag instances in-place (sets color field); // must not be called inside a write @Transactional scope — mutations would be persisted. tagService.resolveEffectiveColors(resolvedTagEntities) // D5 — see depth limitation note in §1 return (resolvedTagEntities → List<TagHint(id, name, color)>, tag names, remainingKeywords) ``` - **FTS text** is built from `remainingKeywords` only (plus person no-match/extra fragments) — resolved terms are **removed from the text predicate** so they are not double-applied (tag filter AND text match would over-narrow). - **⚠️ `buildText` `rawQuery` fallback guard (correctness, not just coverage):** `buildText()` currently falls back to `rawQuery` when the joined parts are blank. If **every** keyword resolves to a tag and there are no person/no-match fragments, the FTS text would become blank → `buildText` returns `rawQuery`, re-introducing the resolved terms as text **AND** applying them as tags — the exact double-apply this issue forbids. The resolved-term removal must produce an **empty/null FTS text in this case, not a `rawQuery` fallback** (e.g. only fall back to `rawQuery` when nothing was resolved — no tags AND no persons — or thread a `boolean hadStructuredMatch` into `buildText`). Pin this with an explicit test (see Testing). - **Filter assembly (non-`any` path):** build `SearchFilters` with `tags = <resolved tag names>`, `tagOperator = TagOperator.OR`, `tagQ = null`. Set `tagsApplied = true`. Person (sender/receiver) and date predicates remain AND-composed with the tag predicate; OR applies only *within* the tag set. _(Confirmed: `DocumentSpecifications.hasTagPartial` has its own null guard and fires only when `tagQ` is non-blank; the `tags` list filter is a separate specification that fires independently. No guard in `DocumentService` suppresses it when `tagQ = null`.)_ - **`personRole:"any"` single-person path** (`searchDocumentsByPersonId`) supports neither keywords nor tags → set `tagsApplied = false` (mirrors the existing `keywordsApplied = false`). Per **D4**, `resolvedTags` is still populated for transparency; the frontend hides the chips when `tagsApplied == false`. (This path already early-returns before the `SearchFilters` build site ~line 63 — confirmed.) - **Constants:** `MIN_TAG_TERM` (**3**, accepted — resolves OQ-1) and `MAX_RESOLVED_TAGS` (**10**, mirrors person `MAX_CANDIDATES = 10` — resolves OQ-2), declared as named constants in `NlQueryParserService` next to `MAX_CANDIDATES`. Both have value `10` but model different concepts (person resolution candidate cap vs. tag resolution cap) — keep them as **separate named constants**; they may diverge independently. When matches exceed the cap, take the first N and `log.debug("Keyword matched {} tags; capping at {}", count, MAX_RESOLVED_TAGS)` — **count only, never the raw keyword at info/warn** (parameterized SLF4J; LLM-derived input echoes private names lifted from the query; defends against log injection + PII leak per ADR-028). ### Security / defense-in-depth Keyword fragments are already length-capped to 100 chars (`MAX_KEYWORD_LENGTH`) by `RestClientOllamaClient.toExtraction()` before reaching the parser; no new validation needed before the `TagRepository` call. Queries are parameterized (`LIKE LOWER(CONCAT('%', :q, '%'))`). `MIN_TAG_TERM >= 3` also blunts trivial 1–2-char enumeration of the tag taxonomy. `TagHint.color` is a closed enum value from `TagService.ALLOWED_TAG_COLORS` — use the established inline `style` attribute pattern from existing tag-color components (`TagTreeNode.svelte`, `TagInput.svelte`, `themen/+page.svelte`): `style="background-color: var(--c-tag-{color}); border-left-color: var(--c-tag-{color})"`. `TagHint.color` is a closed DB-origin set and Svelte auto-escapes attribute values; no `{@html}` needed. `TagHint.name` originates from the database (Tag entity name resolved from the taxonomy), not from LLM output directly — a tighter XSS surface than keyword chips. The endpoint authz (`@RequirePermission(Permission.READ_ALL)` on `POST /api/search/nl`) is unchanged — no new endpoint, no new permission decision. ### Documentation Amend **`docs/adr/028-nl-search-ollama.md`** in this PR: - Record the keyword→tag resolution decision and the **deliberate FTS-text removal of resolved terms** (tag filter AND text match would over-narrow) — reference ADR-033 for the deterministic name-resolution lineage. This is the non-obvious "why" a future dev would otherwise reverse. - **Correct `isHealthy()` wording — recommended resolution: document over wire.** ADR-028 states `isHealthy()` is "called inline before each inference request." It is not — `search()` calls `ollamaClient.parse()` directly; `isHealthy()` has zero callers in `search/`. `parse()` already throws on connection failure (caught as `SMART_SEARCH_UNAVAILABLE`); a pre-call `isHealthy()` probe would add ~50–150 ms latency for a failure mode already covered. Update ADR-028 to state: `isHealthy()` is reserved for ops/health endpoint polling; the in-path degradation mechanism is `parse()`'s timeout/IOException handling. No code change — one ADR paragraph. _(Pre-existing; flagged in review.)_ Add the new term **"keyword→tag resolution" / "theme chip"** to `docs/GLOSSARY.md`. **C4 diagram check:** the `search/` package gains a new cross-domain dependency on `tag/` (`TagService`). Check `docs/architecture/c4/l3-backend-*.puml` — if `NlQueryParserService` is diagrammed, add a dependency arrow to `TagService`. ## Frontend changes (depends on #739) **Pre-implementation step:** `ChipType` is currently defined in **two files** — `InterpretationChipRow.svelte:7` and `+page.svelte:272` (both `'sender' | 'directional' | 'date' | 'keyword'`). Extract to `frontend/src/routes/search/chip-types.ts` and import from both files before the PR. When `'theme'` is added, it becomes the single source of truth. Extend `InterpretationChipRow.svelte` (introduced by #739) with **theme chips**, and extend the chip-removal machinery in the parent page: - Render one removable chip per `interpretation.resolvedTags` entry, **only when `interpretation.tagsApplied === true`** (same rule as `keywordsApplied` for keyword chips — do not show greyed/disabled chips). - **Visual identity (D5 — per-tag colour via inline `style`):** each theme chip renders (a) a tinted background and (b) a tinted left-border colour using `style="background-color: var(--c-tag-{tag.color}); border-left-color: var(--c-tag-{tag.color})"` — consistent with `TagTreeNode.svelte`, `TagInput.svelte`, and `themen/+page.svelte`. Three redundant cues: `Thema:` prefix + tinted left border + tinted background. When `tag.color` is null, **omit the `style` attribute entirely** — the chip renders in the standard neutral appearance; the `Thema:` prefix is the sole differentiator in this case. - **Template branch:** theme chips require a **distinct `{:else if chip.type === 'theme'}` branch** — **do not reuse the generic `{:else}` branch**. The generic branch places `chip.label` inside the `nameSpan` truncated element; for theme chips, the `Thema:` prefix and the `×` button must both remain **outside** the truncated span (only the tag name itself truncates). Mirror the structure of the `directional` branch. Add `data-chip-type="theme"` to the chip wrapper `<span>` (consistent with `data-chip-type={chip.type}` on other branches — required for test assertions). - Type-prefixed label for senior legibility: **`[Thema: Weltkrieg ×]`** (new i18n key `search_chip_theme_prefix`). - **Removal re-run — exact `GET /api/documents/search` param names (corrected; resolves OQ-3):** clicking × drops that tag and re-runs via `GET /api/documents/search`. Theme chips pass **`value = tag.name`** to `onRemoveChip('theme', value)`. Extend the `onRemoveChip` callback union with `'theme'` (one callback, switch on type — do not add a parallel callback). Route the re-run through the same `withCsrf()` wrapper as the other chips for consistency. - **`paramsFromInterpretation()` extension and `removeChip` switch (parent page — `+page.svelte`):** - `paramsFromInterpretation()` at line 261 currently returns `{senderId, receiverId, from, to, q}`. The multi-value `tag[]` params cannot be expressed in this object shape. Choose **one approach** before implementing: (a) add `tags: string[]` and `tagOp: string` to the return type and extend `applyResolvedAndSearch` to accept them; or (b) build `URLSearchParams` directly in the `'theme'` case of `removeChip` without changing the return type. Be consistent across `removeChip` and `applyResolvedAndSearch`. - The `removeChip` `if/else` switch at line 274 **also needs a `'theme'` case** — this is a separate change from `paramsFromInterpretation()`. Without it, clicking a theme chip's × calls `applyResolvedAndSearch` with the full interpretation params intact (no tag removed). The theme branch: drop the selected tag from `resolvedTags`, reconstruct remaining tag names, emit as `tag` params + `tagOp=OR`. - When `resolvedTags` is non-empty after removing the tapped tag: `params.append('tag', name)` for each remaining name + `params.set('tagOp', 'OR')`; when the last theme chip is removed, omit both `tag` and `tagOp`. Any still-active person/date filters from the interpretation are preserved in the params as before. **No re-parse via the NL endpoint** — existing chip removal already clears NL state (`smartMode=false`, `nlInterpretation=null`) and switches to regular search; theme chip removal follows the same pattern. - `{#each}` key: `"theme:" + tag.id`. Use `$derived` for the chip list and the `tagsApplied` gate (not `$effect`). - a11y / touch (inherit from #739): ≥44px × target; `focus-visible:ring-2` on the **wrapper** (not just the ×); `truncate` on the label span at `max-w-[8rem]` (mobile) with the `Thema:` prefix **and** the × button both **outside** the truncated span — the prefix is the sole non-colour differentiator for colour-blind users and must never be clipped at any viewport width; `flex flex-wrap gap-2` container must absorb up to `MAX_RESOLVED_TAGS`=10 theme chips + person/date chips without pushing the result list off the first screen. `aria-label` via `search_filter_remove_label` (`Filter entfernen: {label}`), where `{label}` **includes the "Thema" prefix** ("Filter entfernen: Thema Weltkrieg"). **No `{@html}`** — tag names are LLM-influenced; rely on Svelte text-position auto-escaping. ### New i18n keys (`messages/{de,en,es}.json`) | Key | de | en | es | |-----|----|----|-----| | `search_chip_theme_prefix` | "Thema" | "Topic" | "Tema" | ## OpenAPI / types Run `npm run generate:api` in the same PR — `TagHint` (with `id`, `name`, `color`) and the two new `NlQueryInterpretation` fields must appear in `src/lib/generated/`. ## Testing **TypeScript fixture audit (pre-implementation):** before running `npm run generate:api`, run `grep -rn 'NlQueryInterpretation' frontend/` to find all existing mock objects. The type regen adds two required fields (`resolvedTags: TagHint[]`, `tagsApplied: boolean`) — every mock object missing them causes `npm run check` failures across spec files. Add `resolvedTags: []` and `tagsApplied: false` as neutral defaults to each found site. **Confirmed primary site:** `makeInterpretation()` factory at `InterpretationChipRow.svelte.spec.ts:14`. **Backend — `NlQueryParserServiceTest` constructor sync (pre-implementation):** The test at line 43 manually constructs the service: `new NlQueryParserService(ollamaClient, personService, documentService)`. Adding `TagService` as the 4th constructor arg breaks all existing tests at compile time. Add `@Mock TagService tagService` and a default stub `when(tagService.findByNameContaining(anyString())).thenReturn(List.of())` in `setUp()` alongside the existing service stubs. **Backend — `NlQueryParserServiceTest` (Mockito), additions** (assert `filters.tags()` and `filters.tagOperator()` with **exact matchers, never `any()`** — a vacuous `any()` lets an accidental `AND` default pass; write `tag(name)` / `tagHint(id,name,color)` factory helpers first, following the existing lowercase private-method convention): - keyword resolves to a single tag → `resolvedTags` contains it, `tagsApplied == true`, removed from FTS text; assert `searchDocuments` called with `filters.tags() == [name]` and `filters.tagOperator() == TagOperator.OR` - keyword substring-matches multiple tags → all in `resolvedTags`; assert OR-union filter - keyword with no tag match → stays in FTS text, `resolvedTags` empty - mixed (one resolves, one doesn't) → resolved term becomes tag, unresolved stays in the space-joined text argument - single person + `personRole:"any"` + a resolvable keyword → `searchDocumentsByPersonId` path taken, `searchDocuments` **never** called (`verify(..., never())`), `tagsApplied == false`, `resolvedTags` still populated (D4) - cap: keyword matches > `MAX_RESOLVED_TAGS` tags → capped, debug-logged (count only) - short keyword (< `MIN_TAG_TERM`) → skipped from tag resolution, stays as keyword - **dedup:** the same tag matched via multiple keywords/substrings ("krieg" + "kriegsende" both hitting *Kriegsende*) → appears **once** in `resolvedTags` (`LinkedHashSet`) - **all-resolve `rawQuery` guard (latent bug):** every keyword resolves and there are no person/no-match fragments → assert `filters.text() == null` **and** `filters.tags()` populated (no double-apply) - **flag independence:** all keywords became tags → `keywordsApplied == false` **and** `tagsApplied == true` in the same interpretation (frontend contract: keyword chips hidden, theme chips shown) - **colour (D5, one-level):** a child tag with no own colour resolves to its parent's colour → the resulting `TagHint.color` equals the parent's (exercises `resolveEffectiveColors`); build the mock `Tag` entity **fresh per test** (not a shared field) — `resolveEffectiveColors` calls `tag.setColor(...)` in-place and bleeds into other tests that share the same entity instance - **colour depth limitation (3-level):** a grandchild tag (no own color) whose direct parent also has no color → assert `TagHint.color == null`. This pins the known one-level depth constraint of `resolveEffectiveColors` as a documented limitation rather than a silent production bug. If the archive's taxonomy grows deeper, fix the method to recurse and update this test. **Backend — `TagServiceTest`:** `findByNameContaining` delegates to `findByNameContainingIgnoreCase` (red → green). Do **not** test the trivial `TagHint` record accessors. **Backend — descendant-expansion integration test (`@DataJpaTest`, `postgres:16-alpine`):** create a parent tag + a child tag; a document tagged only on the child; assert an NL search with a keyword substring-matching the parent name returns that document. Verifies `expandTagNamesToDescendantIdSets` fires end-to-end — the acceptance criterion "filtered by the OR-union of every tag … **and their descendants**" is not provable via Mockito alone. Use `postgres:16-alpine` (project convention — never H2). Check for an existing Testcontainers base class (`NlSearchControllerTest.java` — inspect its imports) before declaring a new container. **Coverage gate:** target is **88% branch coverage** (`backend/CLAUDE.md`). The new branches — short-term skip, cap, dedup, all-resolve guard, flag independence — are the ones easiest to miss; verify each is hit. **Frontend — `InterpretationChipRow.svelte.spec.ts` (vitest-browser-svelte):** - `resolvedTags` non-empty + `tagsApplied: true` → theme chips render with `Thema:` prefix - `tagsApplied: false` → assert DOM absence via `querySelectorAll('[data-chip-type="theme"]')` — **not** `[data-chip-type]` which would also catch other chip types - a theme chip with `color: "sage"` renders with inline `style` containing `background-color: var(--c-tag-sage)` and `border-left-color: var(--c-tag-sage)`; a chip with `color: null` renders with no tint style (standard neutral chip appearance) - clicking × on a theme chip → `onRemoveChip('theme', tag.name)` called with the tag's **name** as `value` (assert callback receives the name string — the HTTP call is the parent component's responsibility, not the chip row's) - N resolved tags → exactly N theme chips **Frontend — `paramsFromInterpretation()` unit tests (vitest):** the parent page's param-reconstruction function must be tested for the theme-chip branch covering: (a) N remaining tags → exactly N `tag` params + `tagOp=OR`; (b) last tag removed → no `tag`/`tagOp` in params; (c) last tag removed with a resolved person present → sender param intact, no `tag`/`tagOp`; (d) null-color tag → tag name is emitted correctly (color does not affect param construction). No new Playwright test required — extend the existing #739 happy-path E2E fixture to include `resolvedTags` + `tagsApplied: true`. **Locate the fixture file first:** run `find frontend/e2e -name "*.spec.ts" | xargs grep -l "smartMode"`. Also add a **"last theme chip removed, person filter survives"** scenario: NL result with one resolved person + one theme tag; remove the theme chip; assert no `tag`/`tagOp` in the resulting URL but the sender UUID param is present. ## Acceptance criteria - `POST /api/search/nl` with `{"query": "Briefe über den Krieg"}` returns results filtered by the OR-union of every tag whose name substring-matches "Krieg" (and their descendants), with those tags in `interpretation.resolvedTags` and `tagsApplied == true` - A keyword with no matching tag is folded into the FTS keyword text and does **not** appear in `resolvedTags` (v1 behaviour preserved) - A keyword matching multiple tags includes **all** of them; a document tagged with **any** one of them appears in results (OR semantics) - A keyword matching the same tag via multiple substrings yields that tag **once** in `resolvedTags` (dedup) - Resolved tag terms are removed from the FTS text predicate (not applied twice); when every keyword resolves the FTS text is null and does **not** fall back to `rawQuery` - Each `resolvedTags` entry carries its **effective** `color` (own colour, or the inherited parent colour for a direct child tag with no own colour; null for grandchild+ tags or when neither the tag nor its parent has a colour) - A query with a single resolved person and `personRole:"any"` plus a theme term runs the sender-OR-receiver search; `tagsApplied == false`; `resolvedTags` is still populated; the frontend shows **no** theme chips - Frontend: theme chips render as `[Thema: … ×]` with the tag's own `--c-tag-{color}` applied via inline `style` (`background-color` + `border-left-color`); null-color falls back to standard neutral chip appearance (no style attribute); when `tagsApplied == true`; removing one re-runs `GET /api/documents/search` with `tagOp=OR` and the remaining `tag` values; removing the last theme chip omits both `tag` and `tagOp` while any still-active sender/receiver/date filters remain intact - `GET /api/documents/search` behaviour is otherwise unchanged - The Ollama JSON schema and prompt in `RestClientOllamaClient` are unchanged (no `themeNames` field) - `npm run generate:api` has been run; `TagHint` (`id`, `name`, `color`), `resolvedTags`, `tagsApplied` exist in generated types - `npm run check` passes with zero new errors after adding `resolvedTags: []` and `tagsApplied: false` to all existing `NlQueryInterpretation` mock objects ## Non-functional checklist - **Performance:** ≤ one indexed `ILIKE` per extracted keyword (typically 1–3), bounded by `MAX_RESOLVED_TAGS`; `resolveEffectiveColors` adds one batched parent-colour lookup (confirmed: `tagRepository.findAllById(parentIdsNeeded)` — single IN query). Adds < ~10ms to a 2–15s LLM call — negligible. The existing `/api/search/nl` Prometheus latency exclusion (#738) already covers it. _Pre-merge check: verify #738's Prometheus exclusion/histogram actually landed in `main` — `curl -s http://localhost:8080/actuator/prometheus | grep 'http_server_requests.*search/nl'` should return a series; if absent, fix before merging. Document this check in the PR description as a merge checklist item._ _Known future lever (not this PR):_ `LIKE LOWER(CONCAT('%', :q, '%'))` is a leading-wildcard match → no B-tree index; trivial seq-scan on the small `tag` table today. If `tag` ever grows past a few thousand rows, add a `pg_trgm` GIN index on `lower(name)`. - **Security:** keyword fragments pre-capped at 100 chars; parameterized queries; `MIN_TAG_TERM >= 3` blunts taxonomy enumeration; `TagHint.color` is a closed DB-origin palette token applied via inline `style` (Svelte auto-escapes attribute values); `TagHint.name` is DB-origin (not LLM text); LLM-derived tag names never reach `innerHTML` (no `{@html}`) and are logged only as counts. - **Observability:** log resolved-tag **count** as metadata (`log.debug`) — never the raw query or tag names beyond debug level (PII policy from ADR-028). No new metric or dashboard panel (sub-ms inside a multi-second request). - **i18n:** new key trio added to de/en/es before any render. - **Accessibility:** theme chips inherit #739's ≥44px targets, `aria-label` (with "Thema" prefix), and focus-ring patterns. Run axe in both **light and dark mode** on the chip row — `--c-tag-*` tokens are remapped in dark mode and a token passing 4.5:1 in light can fail in dark. Perform a visual wrap stress test at 320px with 10 stub theme chips to confirm the `flex flex-wrap gap-2` container absorbs them without pushing results off screen. - **Compatibility:** no DB migration, no infra change, no new container, no new env var, no new error code. ## Out of scope - LLM `themeNames` field / system-prompt change (explicitly rejected — see D1) - Tag disambiguation UI / `ambiguousTags` (rejected — see D2) - Tag filtering on the `personRole:"any"` search path (returns `tagsApplied = false`; chips hidden — see D4) - No `tagsApplied: false` hint shown on the `personRole:"any"` path (by design — D4 transparency is API-level, not UI-level; no analogue to the keyword `showKeywordsNotApplied` hint) - `nl-search-spec.html` theme-chip mockup (nice-to-have follow-up; design review does **not** gate this PR — see D5) - `pg_trgm` index on `tag.name` (future lever, only if the table grows) - Recursive `resolveEffectiveColors` for grandchild+ color inheritance (future lever if taxonomy grows deeper than 2 levels) - Semantic/embedding-based theme matching — tracked separately under the epic parking lot
marcel added this to the Archive Intelligence — NL Search milestone 2026-06-06 12:17:04 +02:00
marcel added the P3-laterfeature labels 2026-06-06 12:19:06 +02:00
Author
Owner

Implementation complete

All 12 tasks implemented via Red→Green→Refactor TDD. Branch: feat/issue-743-nl-search-tag-resolution

Commits (12)

Commit Description
8bd83908 feat(search): add TagService.findByNameContaining for NL tag resolution
89051350 feat(search): add TagHint record for NL tag resolution API surface
7eee688c feat(search): extend NlQueryInterpretation with resolvedTags + tagsApplied
e94414b8 refactor(search): extract ChipType to chip-types.ts; audit NL fixtures
fc557bd9 feat(search): implement keyword→tag resolution in NlQueryParserService
6cb10258 test(search): add 11 tag-resolution test cases to NlQueryParserServiceTest
86690fdb test(search): DataJpaTest for descendant-expansion via TagRepository
573bca49 feat(i18n): add search_chip_theme_prefix to de/en/es message bundles
847874ab feat(api): add TagHint schema and extend NlQueryInterpretation with resolvedTags/tagsApplied
5387bc92 feat(search): render removable theme chips in InterpretationChipRow
2a7e1337 feat(search): wire theme chip removal to URL navigation in +page.svelte
64b7b231 docs(search): ADR-028 fix + glossary + C4 diagram for tag resolution (#743)

What was built

Backend:

  • TagService.findByNameContaining(fragment) — case-insensitive substring match via existing findByNameContainingIgnoreCase
  • TagHint record (id, name, color?) — lightweight representation for the API surface
  • NlQueryInterpretation extended with resolvedTags: List<TagHint> and tagsApplied: boolean
  • NlQueryParserService.resolveTags() — iterates LLM-extracted keywords; each keyword with length ≥ 3 is substring-matched against tags; matches go to resolvedTags (OR-union filter, deduplicated via LinkedHashSet, capped at 10); non-matching keywords stay as FTS text
  • buildText rawQuery fallback guarded by hadStructuredMatch — prevents raw query double-apply when all keywords resolve to tags
  • tagOperator = OR when resolved tags are non-empty; tagsApplied = false on the personRole:any single-person path (no tag filter slot there, but resolvedTags still populated for UI transparency)
  • TagService.resolveEffectiveColors() called on the resolved tag list (one-level color inheritance)
  • 11 new NlQueryParserServiceTest cases covering all resolution scenarios
  • NlSearchTagResolutionIntegrationTest (@DataJpaTest + Testcontainers) verifying the recursive CTE findDescendantIdsByName expands parent → child

Frontend:

  • messages/{de,en,es}.jsonsearch_chip_theme_prefix: "Thema" / "Topic" / "Tema"
  • api.ts (manually updated, generate:api run) — TagHint schema + resolvedTags/tagsApplied in NlQueryInterpretation
  • InterpretationChipRow.svelteChip union extended with theme variant; resolvedTags loop when tagsApplied; inline style from tagColorStyle() for coloured chips; remove() passes chip.tag.name as value; 6 new browser component tests
  • theme-chip-removal.ts — pure function buildThemeRemovalUrl(interp, removedTagName): string building the correct /documents?tag=…&tagOp=OR URL with person/date params preserved; 5 pure-function server tests
  • +page.svelteremoveChip 'theme' case calls buildThemeRemovalUrl, then resetNlState() + goto()
  • e2e/nl-search.spec.tsThema: Weltkrieg chip assertion in existing test; new "last theme chip removed keeps person params" scenario

Docs:

  • ADR-028: corrected isHealthy() paragraph (it has zero callers in search/, degradation is via parse() timeout); added keyword→tag resolution paragraph with reference to ADR-033
  • GLOSSARY.md: added keyword→tag resolution, TagHint, theme chip entries; updated NlQueryInterpretation entry
  • docs/architecture/c4/l3-backend-3h-search.puml: added TagService component + findByNameContaining / resolveEffectiveColors relation arrows

Next step

Open PR from feat/issue-743-nl-search-tag-resolutionmain.

## Implementation complete ✅ All 12 tasks implemented via Red→Green→Refactor TDD. Branch: `feat/issue-743-nl-search-tag-resolution` ### Commits (12) | Commit | Description | |--------|-------------| | `8bd83908` | `feat(search): add TagService.findByNameContaining for NL tag resolution` | | `89051350` | `feat(search): add TagHint record for NL tag resolution API surface` | | `7eee688c` | `feat(search): extend NlQueryInterpretation with resolvedTags + tagsApplied` | | `e94414b8` | `refactor(search): extract ChipType to chip-types.ts; audit NL fixtures` | | `fc557bd9` | `feat(search): implement keyword→tag resolution in NlQueryParserService` | | `6cb10258` | `test(search): add 11 tag-resolution test cases to NlQueryParserServiceTest` | | `86690fdb` | `test(search): DataJpaTest for descendant-expansion via TagRepository` | | `573bca49` | `feat(i18n): add search_chip_theme_prefix to de/en/es message bundles` | | `847874ab` | `feat(api): add TagHint schema and extend NlQueryInterpretation with resolvedTags/tagsApplied` | | `5387bc92` | `feat(search): render removable theme chips in InterpretationChipRow` | | `2a7e1337` | `feat(search): wire theme chip removal to URL navigation in +page.svelte` | | `64b7b231` | `docs(search): ADR-028 fix + glossary + C4 diagram for tag resolution (#743)` | ### What was built **Backend:** - `TagService.findByNameContaining(fragment)` — case-insensitive substring match via existing `findByNameContainingIgnoreCase` - `TagHint` record (`id`, `name`, `color?`) — lightweight representation for the API surface - `NlQueryInterpretation` extended with `resolvedTags: List<TagHint>` and `tagsApplied: boolean` - `NlQueryParserService.resolveTags()` — iterates LLM-extracted keywords; each keyword with `length ≥ 3` is substring-matched against tags; matches go to `resolvedTags` (OR-union filter, deduplicated via `LinkedHashSet`, capped at 10); non-matching keywords stay as FTS text - `buildText` rawQuery fallback guarded by `hadStructuredMatch` — prevents raw query double-apply when all keywords resolve to tags - `tagOperator = OR` when resolved tags are non-empty; `tagsApplied = false` on the `personRole:any` single-person path (no tag filter slot there, but `resolvedTags` still populated for UI transparency) - `TagService.resolveEffectiveColors()` called on the resolved tag list (one-level color inheritance) - 11 new `NlQueryParserServiceTest` cases covering all resolution scenarios - `NlSearchTagResolutionIntegrationTest` (`@DataJpaTest` + Testcontainers) verifying the recursive CTE `findDescendantIdsByName` expands parent → child **Frontend:** - `messages/{de,en,es}.json` — `search_chip_theme_prefix`: "Thema" / "Topic" / "Tema" - `api.ts` (manually updated, `generate:api` run) — `TagHint` schema + `resolvedTags`/`tagsApplied` in `NlQueryInterpretation` - `InterpretationChipRow.svelte` — `Chip` union extended with `theme` variant; `resolvedTags` loop when `tagsApplied`; inline `style` from `tagColorStyle()` for coloured chips; `remove()` passes `chip.tag.name` as value; 6 new browser component tests - `theme-chip-removal.ts` — pure function `buildThemeRemovalUrl(interp, removedTagName): string` building the correct `/documents?tag=…&tagOp=OR` URL with person/date params preserved; 5 pure-function server tests - `+page.svelte` — `removeChip` 'theme' case calls `buildThemeRemovalUrl`, then `resetNlState() + goto()` - `e2e/nl-search.spec.ts` — `Thema: Weltkrieg` chip assertion in existing test; new "last theme chip removed keeps person params" scenario **Docs:** - ADR-028: corrected `isHealthy()` paragraph (it has zero callers in `search/`, degradation is via `parse()` timeout); added keyword→tag resolution paragraph with reference to ADR-033 - `GLOSSARY.md`: added `keyword→tag resolution`, `TagHint`, `theme chip` entries; updated `NlQueryInterpretation` entry - `docs/architecture/c4/l3-backend-3h-search.puml`: added `TagService` component + `findByNameContaining` / `resolveEffectiveColors` relation arrows ### Next step Open PR from `feat/issue-743-nl-search-tag-resolution` → `main`.
Sign in to join this conversation.
No Label P3-later feature
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: marcel/familienarchiv#743