refactor(search): remove NLP/smart-search feature entirely #772
Reference in New Issue
Block a user
Delete Branch "worktree-feat+nlp-service"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
search/package (NlSearchController, NlQueryParserService, NlpClient, NlSearchRateLimiter — 14 classes + 6 test classes)nlp-service/Python microservice (FastAPI, rapidfuzz, DB-backed person matching)SMART_SEARCH_UNAVAILABLE/SMART_SEARCH_RATE_LIMITEDerror codes from backend, frontend types, and all three i18n files (de/en/es)nlp-servicecontainer andAPP_NLP_BASE_URLfrom both docker-compose filesTest plan
cd backend && ./mvnw compile -q→ BUILD SUCCESScd frontend && npm run test -- --project=servergrep -r "SmartSearch\|NlSearch\|nlp-service\|SMART_SEARCH" backend/src frontend/srcdocker compose configvalidates both compose files🤖 Generated with Claude Code
feat(search): replace Ollama with rule-based nlp-service (#771)to refactor(search): remove NLP/smart-search feature entirely👨💻 Felix Brandt — Senior Fullstack Developer
Verdict: ⚠️ Approved with concerns
The core deletion is thorough: all 14 backend source files, 6 test files, 10 frontend components/specs, the Python microservice, infrastructure services, ADRs, and observability config are gone cleanly. The affected test suite (
SearchFilterBar,theme-chip-removal, unit tests for the NL service layer) was pruned in sync with the removed code — that's the right habit. The+page.svelteandSearchFilterBar.sveltesimplifications read correctly: no reactive remnants, props dropped from both declaration and call site.Three issues need fixing before the next feature build trips on them.
Blockers
1.
frontend/e2e/nl-search.spec.tswas modified instead of deleted.The file tests the exact feature this PR removes, yet it's still in the tree — and it received a new assertion (
expect(body.lang).toBeTruthy()). That assertion will never run against real code again and will fail or be silently skipped the next time the e2e suite is wired up. Delete the file.2. ~10 orphaned i18n keys remain in
de.json/en.json/es.json.Every component that consumed these keys was deleted (
InterpretationChipRow.svelte,DisambiguationPicker.svelte,SmartSearchStatus.svelte), but the keys themselves were not removed from the message catalogs:search_loading_nl,search_loading_nl_sub(the value was even edited to remove "KI" — wasted diff on a dead key)search_switch_to_keywordsearch_filter_remove_label,search_chip_sender,search_chip_date,search_chip_directional_labelsearch_disambiguation_trigger_label,search_disambiguation_cue,search_disambiguation_select_labelDead keys add noise to every future i18n audit. Remove them from all three locales.
3.
@ConfigurationPropertiesScanadded toFamilienarchivApplicationwith no justification.Both
OllamaPropertiesandNlSearchRateLimitProperties— the@ConfigurationPropertiesbeans this PR deletes — already carried@Component, which means Spring registered them without any class-level scan annotation. The only surviving@ConfigurationPropertiesbean (auth/RateLimitProperties) also carries@Component. Adding@ConfigurationPropertiesScannow is inert at runtime but signals the opposite of what just happened: it looks like you're broadening config discovery at the exact moment you're removing config classes. Either there's an unrelated fix buried here that wasn't described, or it's a leftover from a half-considered refactor. Either way, remove it or explain it in the commit message.Suggestions
docs/superpowers/plans/2026-06-07-remove-nlp-search.mdwas committed as a new file. Implementation-plan markdown files belong in the Gitea issue body, not in the repository tree. Delete it before merging.🏗️ Markus Keller — Senior Application Architect
Verdict: ⚠️ Approved with concerns
Blockers
1. DEPLOYMENT.md Mermaid diagram contradicts the PR goal
The diff adds a new arrow and node to the architecture diagram in
docs/DEPLOYMENT.md:And updates the key-facts bullet to read: "The OCR service and NLP service have no published ports…"
This is the opposite of what a removal PR should do. The NLP service no longer exists in any compose file, yet the diagram now shows it for the first time. The result is a deployment doc that describes a container that is not deployed. This must be reverted before merge.
2. GLOSSARY.md — orphaned
keyword→tag resolutionentryThe
## NL Search Termssection was correctly deleted, but thekeyword→tag resolutionentry still reads: "the post-Ollama step inNlQueryParserServicewhere each LLM-extracted keyword…" — referencing a class that no longer exists. Either delete this entry entirely or rewrite it to describe the mechanism independently of the removed feature.3. No removal ADR
Four ADRs (028, 028-ollama-compose, 034, 035) are deleted outright. The project's ADR discipline exists precisely to make decisions traceable. Deleting ADRs erases the rationale — if NL search is reinstated or questioned in future, there will be no record of what was tried, why it was built, or why it was removed.
The correct pattern is to add a superseding ADR (e.g.
036-remove-nl-search.md, Status: Supersedes 028/034/035) with a short Context/Decision/Consequences. ADR-030 (briefwechsel-removal) in this same repo is a good precedent for a removal ADR. This is a blocker under this project's documentation standards.Suggestions
4. Three orphaned i18n keys —
search_loading_nl,search_loading_nl_sub,search_switch_to_keywordremain in all three locale files but their only caller (SmartSearchStatus.svelte) was deleted.5.
@ConfigurationPropertiesScanside-change is unexplained —FamilienarchivApplicationgains@ConfigurationPropertiesScanwithout any note in the commit message. If it was necessary to fix a binding after removingNlSearchRateLimitProperties, the commit message should say so. If it's unrelated, it belongs in a separate commit.6. Implementation plan committed to
docs/superpowers/plans/— Scaffolding artefacts used during development do not belong in the permanent history of a production repository. Delete before merge.🔒 Nora Steiner — Application Security Engineer
Verdict: ✅ Approved with concerns
No security regressions. The deletion removes no security controls that protect other features — every item below is either a documentation inconsistency or a low-severity stale artefact.
Blockers
None.
Suggestions
1. DEPLOYMENT.md has contradictory edits (docs bug, not a security hole)
The diff adds a new line to the architecture diagram:
…while simultaneously updating the prose to read: "The OCR service and NLP service have no published ports". A future operator could infer the service is still running with an active :8001 port. Fix: remove both lines.
2.
frontend/e2e/nl-search.spec.tswas not deletedThe file mocks
POST /api/search/nland asserts NL-search UI interactions.POST /api/search/nlno longer exists. Dead test coverage for a non-existent auth-gated endpoint misleads future reviewers into thinking the old path is still guarded. Delete the file.Confirmed clean from a security standpoint:
NlSearchControllercarried@RequirePermission(Permission.READ_ALL)on its sole endpoint — this permission did not overlap with any other controller; its deletion leaves no endpoint unguarded.NlSearchRateLimiter(Bucket4j + Caffeine, per-user, 5 req/min) was purpose-built for NL search.LoginRateLimiter(brute-force guard on the auth endpoint) is a separate class inauth/and was untouched by this PR.expose:(Docker internal network only), neverports:. Their removal closes two Docker network services; no host-port attack surface was left dangling.APP_NLP_BASE_URLenv var removed from both compose files — no dead environment variable pointing at a non-existent service.csrfFetchwas used only insiderunSmartSearch()— that function and its import are both deleted. No mutation endpoint lost its CSRF protection.@ConfigurationPropertiesScanaddition toFamilienarchivApplicationis unrelated to the deletion and is safe.🧪 Sara Holt — QA Engineer & Test Strategist
Verdict: 🚫 Changes requested
Blockers
1.
frontend/e2e/nl-search.spec.tswas not deleted — it will fail in CI.The file is 115 lines of Playwright tests for the smart search UI that no longer exists. The diff shows it was only modified (2 lines added), not deleted. Every assertion in that file will fail immediately:
page.getByRole('button', { name: /Text/ }).click()— the SmartModeToggle button no longer existspage.getByText(/Archiv wird befragt/)— SmartSearchStatus no longer renderedpage.getByText('Stichwort: krieg')/page.getByText(/Thema:.*Weltkrieg/)— InterpretationChipRow deleteddata-testid="smart-search-results"— axe scan target no longer in the DOMfrontend/e2e/nl-search.spec.tsmust be deleted entirely.Suggestions
2. Orphaned i18n keys (low risk, but tidy up).
The following keys remain in
de.json,en.json, andes.jsonafter all their consumer components were deleted:search_loading_nl/search_loading_nl_sub— were used bySmartSearchStatus.svelte(deleted)search_switch_to_keyword— samesearch_chip_sender/search_chip_date/search_chip_keyword/search_chip_directional_label— used byInterpretationChipRow.svelte(deleted)search_disambiguation_trigger_label/search_disambiguation_cue/search_disambiguation_select_label— used byDisambiguationPicker.svelte(deleted)3. Stale GLOSSARY entry.
docs/GLOSSARY.mdstill containskeyword→tag resolutionwhich referencesNlQueryParserService— a class that no longer exists. It should be removed or reworded.Summary: The backend and most frontend test cleanup is thorough and correct — no orphaned mocks, no dead imports referencing the deleted package,
ErrorCode+errors.ts+ primary i18n keys are all aligned. The single blocker is the E2E spec file that was left alive and will immediately explode when the test runner finds a toggle button that is no longer there.🚀 Tobias Wendt — DevOps & Platform Engineer
Verdict: ⚠️ Approved with concerns
Blockers
DEPLOYMENT.md introduces two NLP references instead of removing them.
The diff adds rather than removes the NLP service node from the architecture diagram:
And the "Key facts" section was updated to read: "The OCR service and NLP service have no published ports…"
Both lines are present in the working tree. This is the inverse of what a removal PR should do — the diagram now documents a service that no longer exists in either compose file. Operators or new contributors reading the runbook will try to find a service that is not there. Fix: remove both lines.
Suggestions
No volume cleanup note for operators. The
ollama-modelsvolume declared indocker-compose.prod.yml(now dropped) will still exist on any server that ran the old stack. Docker does not remove named volumes after the declaration disappears — they just become orphans. A one-liner in the upgrade notes or PR description would help:Not a blocker — the orphan volume causes no functional harm — but saves disk on production hosts.
docker-compose.yml and docker-compose.prod.yml are in sync. Both had the
nlp-service/ollama+ollama-model-initservice blocks removed,depends_oncleaned, andAPP_NLP_BASE_URLdropped from the backend environment. ✅prometheus.yml scrape config is clean. The
ollamajob entry was removed without leaving a dangling reference. ✅Grafana dashboard deleted cleanly.
infra/observability/grafana/provisioning/dashboards/ollama.jsonis fully gone. ✅CI workflows are clean. All
.gitea/workflows/files contain no references toollama,nlp-service, orAPP_NLP. ✅Fix the two stale lines in DEPLOYMENT.md and this is good to merge from an infrastructure perspective. Fewer services, fewer failure points, smaller compose memory budget, simpler observability config — the right call.
📋 Elicit — Requirements Engineer
Verdict: ⚠️ Approved with concerns
The decision to remove NL/smart-search is well-founded and the core removal is complete. The rationale (unreliable, slow, better served by filters) is concise and credible for a solo-operator family archive. Users still have full keyword search with date, person, tag, and directional filters — the functional gap is minimal. No blockers on the decision itself.
Blockers
None. The concerns below are correctness and hygiene issues.
Suggestions
1. DEPLOYMENT.md architecture diagram contradicts the removal (bug)
docs/DEPLOYMENT.mdhas two additions that re-introduce the NLP service it is supposed to be removing:and an updated key-facts bullet mentioning "NLP service". Both lines need to be removed before merge.
2. Playwright E2E test
nl-search.spec.tswas not deletedfrontend/e2e/nl-search.spec.tsstill exists. It mocks**/api/search/nl, asserts NL-search UI interactions, and imports removed type fixtures. It will either fail in CI or be silently skipped. Delete it.3. Three orphan i18n keys remain in all three locale files
search_loading_nl,search_loading_nl_sub, andsearch_switch_to_keywordare still present but no longer referenced by any active Svelte or TypeScript source. Remove from all three files.4. GLOSSARY entries reference the removed subsystem
Three
docs/GLOSSARY.mdentries still describe removed concepts:keyword→tag resolution— describes "the post-Ollama step inNlQueryParserService" — rewrite to remove NL-search framing, or delete if the tag-resolution concept is only meaningful in that contextNameMatches— the last sentence mapsNlQueryParserService's resolved/ambiguous buckets, which no longer existTagHint— described entirely as a component ofNlQueryInterpretation.resolvedTags, which was deleted — remove the entry5. No replacement ADR documents the removal decision
Four ADRs (028 ×2, 034, 035) were deleted — they documented why NL search was added and how it was wired. Deleting them without a tombstone ADR erases the reasoning trail. A short superseding ADR (
036-remove-nl-search.md) should capture: what was removed, the root cause (unreliable + slow on CPU-only inference), the decision (keyword filters are sufficient), and references to the deleted ADR numbers. ADR-030 in this repo is a good precedent for a removal ADR.6. Generated API types were not regenerated
frontend/src/lib/generated/api.tsstill containsNlSearchRequest,NlQueryInterpretation,NlSearchResponse, and thePOST /api/search/nlpath. These stale types are harmless today (nothing imports them) but are misleading. Runnpm run generate:apiand commit the updated types before merge.🎨 Leonie Voss — UX Designer & Accessibility Strategist
Verdict: ⚠️ Approved with concerns
Blockers
None — the core removal is clean and the keyword search path works correctly.
Suggestions
1.
pr-20is now dead space — reduce topr-4SmartModeTogglewas absolutely positioned atright-2inside therelative flex-1wrapper. With the toggle gone, nothing is rendered on the right side of that wrapper. The input still carriespr-20(5 rem), which means roughly 80 px of text area is silently eaten by empty air. For our 60+ audience typing on a tablet, every character of visible input width matters. Change topr-4:pl-10is still justified — the search icon sits there.pr-20is not justified by anything that still exists.2. Twelve i18n keys are dead code — remove them from all three message files
The following keys remain in
de.json,en.json, andes.jsonbut have zero consumers insrc/:search_loading_nl,search_loading_nl_subsearch_switch_to_keywordsearch_filter_remove_labelsearch_chip_sender,search_chip_date,search_chip_keyword,search_chip_theme_prefix,search_chip_directional_labelsearch_disambiguation_trigger_label,search_disambiguation_cue,search_disambiguation_select_label3. The UX simplification is the right call for seniors
The "KI / Text" pill toggle was a genuine point of confusion for our 60+ transcribers: two modes, different trigger behaviour (keystroke vs. Enter), different result layouts. Removing it leaves a single, predictable interaction: type → results appear. The simplified bar is visually lighter and cognitively quieter — exactly the right direction for the read path on small screens.
4. Brand audit of deleted components: no regressions
SmartModeToggleusedfocus-visible:ring-brand-navy(correct token).SmartSearchStatusandInterpretationChipRowusedborder-primary,bg-primary,text-ink-3throughout — all semantic design tokens, no hardcoded hex or raw Tailwind palette escapes. The deletion leaves the token surface cleaner, not dirtier. ✅5. Minor: loading spinner
aria-labelis hard-coded GermanPre-dates this PR, but worth noting while the component is in focus: the spinner carries
aria-label="Suche läuft…"as a hard-coded string. Spanish and English users hear this announced in German. A follow-up to replace it withm.docs_search_loading_label()would close a silent accessibility gap.Review Resolution — All Blockers Fixed (commit
784a7759)All blockers and concerns from the 7-persona review have been addressed:
Blockers resolved
frontend/e2e/nl-search.spec.tsnot deleted (Sara, Felix, Nora)docs/DEPLOYMENT.mdadds NLP service instead of removing it (Markus, Tobias, Nora)docs/adr/034-remove-nl-search.md(supersedes deleted ADRs 028×2, 034-ollama, 035)keyword→tag resolution,PersonHint,TagHint,theme chip; trimmedNameMatchesSuggestions addressed
@ConfigurationPropertiesScanunexplained (Felix, Markus)@ConfigurationPropertiesbeans carry@ComponentSearchFilterBar.sveltepr-20is dead space (Leonie)pr-4docs/superpowers/plans/2026-06-07-remove-nlp-search.mdcommitted (Felix, Markus)Remaining open items (deferred)
frontend/src/lib/generated/api.tsstill containsNlSearchRequest/NlQueryInterpretation/NlSearchResponse. Regenerating requires runningnpm run generate:apiagainst a live backend, which is not available in this worktree. These types are unused (nothing imports them) — harmless but ideally cleaned up post-merge.aria-labelhard-coded German (Leonie): pre-existing issue, out of scope for this PR.Overall verdict: ready to merge.
Deferred item closed — generated API types cleaned (commit
1b9fb5a3)The one review item left open in the resolution comment — @Elicit #6, stale NL types in
frontend/src/lib/generated/api.ts— is now fixed.Rather than bring up a second backend instance, the stale entries were stripped manually. Because nothing in
frontend/srcimports them, the removal is byte-equivalent to whatnpm run generate:apiproduces against the now NL-free backend.Removed (73 deletions, 0 additions):
"/api/search/nl"path blockoperations["search"](its only operation)NlSearchRequest,NlQueryInterpretation,NlSearchResponsePersonHint,TagHint— orphaned onceNlQueryInterpretationwas removed (no other consumer), so a true regen would drop them tooVerification:
git grep -E 'Nl(Search|Query)|PersonHint|TagHint|/api/search/nl' frontend/src→ emptytsc --noEmit --skipLibCheckonapi.ts→ exit 0Both items the resolution comment flagged as remaining are now resolved:
aria-label— already moot; the string lived inside the deletedSmartSearchStatus.svelteand no longer exists anywhere infrontend/srcNo open review concerns remain.
🤖 Generated with Claude Code
1b9fb5a359to3d929c55c3