feat(search): NL search — resolve tag names in query parsing (#743) #765
Reference in New Issue
Block a user
Delete Branch "feat/issue-743-nl-search-tag-resolution"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
NlQueryParserServiceto substring-match each LLM-extracted keyword against the tag taxonomy after Ollama parsing; matching keywords become OR-union tag filters, non-matching keywords remain as FTS textTagHint {id, name, color?}record and extendsNlQueryInterpretationwithresolvedTags: List<TagHint>+tagsApplied: booleanInterpretationChipRow; chip removal navigates to/documents?tag=…&tagOp=ORpreserving person/date paramsTest plan
NlQueryParserServiceTest(+11 cases),TagServiceTest,NlSearchTagResolutionIntegrationTest(@DataJpaTest descendant-expansion) — 85 tests, all greentheme-chip-removal.spec.ts(5 pure-function tests) — all greenInterpretationChipRow.svelte.spec.ts(+6 browser tests) — all greennl-search.spec.tsextended withThema: Weltkriegchip assertion + last-chip-removal scenario (requires full stack)Closes #743
🤖 Generated with Claude Code
When tagsApplied is true, each resolvedTag renders as a 'Thema: Name' chip with optional inline color style from the tag's resolved color. Clicking × calls onRemoveChip('theme', tag.name). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>🏛️ Markus Keller — Software Architect
Verdict: ✅ Approved
Layer boundaries and coupling
The cross-domain dependency is handled correctly throughout.
NlQueryParserServicecallsTagService.findByNameContaining()(service layer), neverTagRepositorydirectly — the layering rule is intact.TagHintlives insearch/and acts as a DTO projected fromtag/domain entities inside thesearch/package, keeping the tag domain's internalTagentity from leaking into the search response contract.The C4 L3 diagram (
docs/architecture/c4/l3-backend-3h-search.puml) is updated with both newTagServicerelations: the resolution call and theresolveEffectiveColors()call. ADR-028 now correctly documents thesearch/→tag/dependency direction alongside the existingsearch/→person/andsearch/→document/dependencies.ADR completeness
ADR-028 was amended rather than superseded — appropriate for an extension of the same feature. The cross-reference to ADR-033 from the keyword→tag resolution paragraph correctly externalises the case-collision and color-inheritance rules, keeping ADR-028 focused on the Ollama integration and its consequences. ADR-028's correction of the
isHealthy()paragraph (it has no callers insearch/; in-path degradation goes viaparse()timeout/IOException) is accurate and removes a latent misunderstanding.Minor observation —
resolveEffectiveColors()on detached entitiesresolveTags()callstagService.resolveEffectiveColors(resolvedTagEntities)onTagentities that were fetched in a non-@Transactionalcontext. Mutating JPA entities outside a transaction is safe here (no dirty-check fires, no second-level cache involvement), but a reader arriving cold could wonder whether Hibernate might pick up the mutation. A one-line comment at the call site — something like// safe: entities are detached at this point; mutation is for DTO projection only— would remove the ambiguity for future reviewers. Not a blocker.No Flyway migrations needed
Correct —
TagHintis a projection record, not a DB entity. No schema change required.👨💻 Felix Brandt — Senior Fullstack Developer
Verdict: ✅ Approved
TDD adherence
The plan separates failing test steps (1, 7, 9, 13, 15) from production code steps (2, 8, 14, 16) — red-before-green order preserved throughout. The commit history confirms each test file was introduced before its corresponding implementation. No shortcuts here.
Standout design decision —
buildThemeRemovalUrlextractionPulling theme chip removal into a pure function (
frontend/src/routes/documents/theme-chip-removal.ts) instead of testing it as browser interaction is the right call. Five focused server-mode tests give precise feedback on URL construction without spinning up a Svelte runtime. This is exactly the kind of testability refactor that keeps the unit layer fast and the browser layer lean.Backend code quality
TagResolutionprivate record as the return type forresolveTags()gives the intermediate result a name — much better than a three-element tuple or parallel lists.hadStructuredMatchboolean solves the rawQuery double-apply invariant cleanly. The name is unambiguous once you know the context, but given it's a non-obvious guard, the comment at thebuildTextcall site is warranted (which you added — ✅).MIN_TAG_TERM = 3andMAX_RESOLVED_TAGS = 10as package-private constants are the right call — named, changeable, and visible in tests.LinkedHashSet<Tag>for dedup preserves insertion order → deterministicresolvedTagsordering in the response. Good.Frontend code quality
chip-types.tsextraction resolves the duplication betweenInterpretationChipRow.svelteand+page.svelte. Clean.{ key: string; type: 'theme'; tag: TagHint; label: string }discriminated union member is consistent with the existing chip union shape.tagColorStyle()helper isolating the--c-tag-{color}CSS variable construction keeps the template readable.Minor note
resolveEffectiveColors()mutatesTagentities in-place. The test note "build mock Tag fresh per test" is critical — if any test shares aTaginstance across cases, a color-related assertion in test N could be poisoned by test N-1's mutation. Confirm each color test constructs its own freshTagobjects.🚀 Tobias Wendt — DevOps / Infrastructure
Verdict: ✅ Approved
Infrastructure delta: zero
No new Compose services, no new config properties, no Flyway migrations, no changes to the rate limiter or its Bucket4j/Caffeine wiring. The
TagServicecall is pure application-layer — no container or network topology change. Clean.Performance note (non-blocking)
resolveTags()currently issues N sequentialfindByNameContaining()DB queries — one per keyword extracted by Ollama. For a query with five keywords that's five round-trips before thesearchDocuments()call. In the current architecture this is invisible: Ollama's 2–15 s inference window dwarfs N × 1 ms Postgres round-trips by at least two orders of magnitude. Not a concern today.If keyword counts ever grow substantially (say, a user sends a long comma-separated list that Ollama parses into 20+ keywords), a single
IN-style batch query (WHERE LOWER(name) LIKE ANY(?)) would reduce this to one round-trip. Suggested as a future optimisation ticket rather than a blocker here.Observability suggestion (non-blocking)
The existing log line records
queryLength,resolvedPersonCount, andlatencyMs. With this PR, tag resolution is now part of every search path. AddingresolvedTagCountandtagsAppliedto that structured log entry would let you answer "Is tag resolution providing value in production?" from Grafana without touching the code again. One-liner addition, worth a follow-up issue.📋 Elicit — Requirements Engineer
Verdict: ✅ Approved
Acceptance criteria coverage (against issue #743)
resolvedTagsin response → frontend theme chipspersonRole:anypath:tagsApplied=false,resolvedTagsstill populated (transparency)hadStructuredMatchguard prevents rawQuery double-applysearch_chip_theme_prefixin de / en / esTraceability
ADR-028 captures the feature rationale and edge-case rules (OR semantics, personRole:any limitation, cap at
MAX_RESOLVED_TAGS,MIN_TAG_TERMguard). ADR-033 is cross-referenced for the color-inheritance rules. The C4 L3 diagram reflects the newsearch/→tag/dependency. Requirements → implementation → docs chain is complete.One open item to verify
ADR-028 references ADR-033 for "tag name resolution and case-collision rules." Confirm ADR-033 exists and was published as part of this PR or an earlier one — if it's still a draft, the cross-reference dangles. Non-blocking if ADR-033 was already merged with #739 or an earlier issue.
Discovery items not in original spec (implementation-discovered)
The
hadStructuredMatchrawQuery fallback guard appears to have been discovered during implementation rather than specified in issue #743. This is a legitimate "emerge during TDD" finding. Worth folding a one-line note into the issue body (or the ADR) explaining the invariant — future maintainers won't know it came from a discovered edge case without that context.🔐 Nora "NullX" Steiner — Application Security Engineer
Verdict: ✅ Approved
Findings
[INFORMATIONAL] LIKE wildcard characters in keyword fragments not escaped
TagService.findByNameContaining(fragment)delegates toTagRepository.findByNameContainingIgnoreCase(fragment), which generatesLOWER(name) LIKE LOWER('%' || :fragment || '%'). JPA parameterization prevents SQL injection ✅. However, iffragmentcontains%or_, they are interpreted as LIKE wildcards by the database engine. A fragment of"%%k"would match any tag whose name containsk(effectively broadening the query to most of the tag table). A fragment of"__"would match any tag with exactly 2 characters.In practice, Ollama's grammar-constrained JSON output with
maxLength: 100rarely produces these characters, and theMIN_TAG_TERM = 3guard blocks 0–2 char fragments. The risk is low. Defense-in-depth recommendation for a follow-up:Not a blocker. Log as a low-priority security improvement ticket.
Confirmed safe
TagHintcontains tagid,name,color— organizational metadata, no PII. ✅@RequirePermission(READ_ALL)onNlSearchControllerunchanged. Tag resolution operates entirely inside the existing auth perimeter. ✅log.debugandlog.warncalls use SLF4J{}placeholders, no raw query content logged. ✅maxLength: 100per keyword + JPA parameterization = two-layer defense. ✅🧪 Sara Holt — QA Engineer & Test Strategist
Verdict: ⚠️ Approved with concerns
Test pyramid assessment
Backend unit —
NlQueryParserServiceTest: Covers 12+ cases across the full decision matrix: single resolve, multi-resolve, no-match, mixed,personRole:any, cap atMAX_RESOLVED_TAGS,MIN_TAG_TERMguard, dedup, all-resolve rawQuery guard, flag independence (keywordsApplied=false AND tagsApplied=true), 1-level color inheritance, 3-level color depth (null propagation). Comprehensive. ✅Backend unit —
TagServiceTest: VerifiesfindByNameContainingdelegates correctly to the repository. ✅Backend integration —
NlSearchTagResolutionIntegrationTest:@DataJpaTest+PostgresContainerConfig+ Flyway — real Postgres, real schema. Covers the end-to-end descendant-expansion path. ✅Frontend unit —
theme-chip-removal.spec.ts: 5 server-mode tests for pure function. N remaining tags → Ntagparams +tagOp=OR; last tag removed → no params; directional pair survives; null-color handled. ✅Frontend component —
InterpretationChipRow.svelte.spec.ts: 6 browser tests: chips render,tagsApplied:falsesuppresses chips, color → inline style, null-color → no style, × click → callback with tag name, N tags → N chips. ✅E2E —
nl-search.spec.ts: Theme chip visible assertion + removal scenario (URL asserts notag=/tagOp=, sender UUID preserved). ✅Concerns
1. LIKE wildcard edge case not tested (Minor)
No test covers
fragmentvalues containing%or_. A test like:...would document the current behavior and catch a future regression if escaping is added. Low risk in practice, but a gap in the boundary-condition coverage.
2. Browser test workaround not documented (Minor)
InterpretationChipRow.svelte.spec.tsruns from/tmp/fe743due to the+character in the worktree path confusing vitest-browser's iframe URL routing. This workaround is invisible to any developer who checks out this branch on a fresh machine — they'll see test failures with no obvious cause. A// NOTE: run from /tmp/fe743 copy — see docs/...comment in the spec file, or a note in the PR description, would save the next person 30 minutes.3.
resolveEffectiveColorsmutation isolation (Verify)The plan notes "build mock Tag fresh per test (resolveEffectiveColors mutates in-place)." Confirm each color-related test case constructs its own
Taginstances rather than sharing via@BeforeEach. If two tests share aTagthrough a field, whichever runs second will see the first test's mutation. Not an issue if confirmed — just flag to verify in code review.🎨 Leonie Voss — UX Designer & Accessibility Strategist
Verdict: ⚠️ Approved with concerns
Brand and accessibility audit
Color system ✅ — Theme chips use
var(--c-tag-{color})CSS variables, consistent with the tag color tokens used in tag badges elsewhere. Null-color tags fall back to neutral chip appearance (no inline style attribute) — correct.Chip structure ✅ — "Thema:" prefix is rendered outside the truncated name
<span>, preserving screen-reader legibility ("Thema: Zweiter Weltkrieg", not"Thema: Zweiter W…").ARIA labels ✅ — Remove button uses
aria-label={m.search_filter_remove_label({ label: ... })}— localized, descriptive, screen-reader friendly.i18n ✅ —
search_chip_theme_prefixpresent in de ("Thema"), en ("Topic"), es ("Tema").data-chip-type="theme"attribute ✅ — Supports both accessibility tooling and test selectors.Concerns
1. Silent tag recognition on
personRole:anypath (Minor — UX gap)When a query resolves to a single person with role
"any"(e.g. "Briefe von Walter über Krieg"),tagsAppliedisfalseand no theme chips are rendered — even though "Krieg" was recognized as a known tag. From the user's perspective, the keyword disappeared silently. They typed it, Ollama understood it, but there's no visual feedback.In a v1 context this is acceptable (documented in ADR-028), but the UX cost is real: senior users are more likely to re-type the query wondering if it was processed. A follow-up issue should explore a lightweight signal — even a muted "Thema erkannt, aber nicht angewendet" tooltip or a different chip state — for a future iteration. Non-blocking here.
2. Tag color contrast (Informational — systemic, not this PR)
--c-tag-{color}values are user-defined in the tag admin UI. If an admin defines a very light tag color (e.g. yellow/cream), the chip text may fall below WCAG AA 4.5:1. This isn't introduced by this PR — it's a pre-existing concern with the tag color system. Worth a separate audit issue for the tag admin color picker to constrain choices to accessible values.3. Touch target on theme chip remove button (Verify)
Confirm the
×dismiss button inside theme chips meets the 44×44 px minimum (WCAG 2.2, critical for the 60+ audience). If the existing keyword/date/sender chip remove buttons already havemin-h-[44px] min-w-[44px]or equivalent padding, theme chips inherit this automatically — just verify consistency. ✅ if already compliant across all chip types.Review follow-up — all concerns addressed
Changes committed
dc366ed4—docs(search): add detached-entity safety comment in resolveTags// safe: entities are detached here; mutation is for DTO projection only, no dirty-check firesat thetagService.resolveEffectiveColors(capped)call site inNlQueryParserService.resolveTags()fb41affd—docs(search): note vitest-browser workaround for + in pathInterpretationChipRow.svelte.spec.tsexplaining that vitest-browser fails silently when the project path contains+, and how to work around itVerified OK — no code change needed
033-tag-name-resolution-tolerates-case-collisions.mdconfirmed×button on theme chips meets 44px heightremoveButtonclass hasmin-h-[44px]— consistent with all other chip typesresolveEffectiveColorsmutation isolation in color testsTaginstance via thetag()builder factoryFollow-up issues filed