familienarchiv/docs/adr/034-remove-nl-search.md
Marcel acc73fd3e1 fix(review): resolve all review blockers and concerns
- Delete frontend/e2e/nl-search.spec.ts (was left alive; would have
  crashed CI when Playwright couldn't find the deleted SmartModeToggle)
- Fix docs/DEPLOYMENT.md: remove NLP service arrow + key-facts bullet
  that were accidentally added instead of removed in the prior commit
- Clean docs/GLOSSARY.md: remove keyword→tag resolution, PersonHint,
  TagHint, theme chip entries; trim NameMatches to drop the
  NlQueryParserService reference
- Remove @ConfigurationPropertiesScan from FamilienarchivApplication
  (all remaining @ConfigurationProperties beans carry @Component)
- Remove 12 orphaned i18n keys from de/en/es message files
  (search_loading_nl, search_chip_*, search_disambiguation_*, etc.)
- Fix SearchFilterBar.svelte input padding: pr-20 → pr-4 (SmartModeToggle
  that justified the right padding is gone)
- Delete docs/superpowers/plans/2026-06-07-remove-nlp-search.md
  (scaffolding artefact; plan files belong in Gitea issues, not the repo)
- Add docs/adr/034-remove-nl-search.md documenting the removal decision
  (supersedes deleted ADR-028 ×2, ADR-034-ollama, ADR-035)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 10:56:32 +02:00

# ADR-034 — Remove NL/smart-search (supersedes ADR-028 ×2, ADR-034-ollama, ADR-035)
**Date:** 2026-06-07
**Status:** Accepted
**Issue:** #772
**Supersedes:** ADR-028 (nl-search-ollama), ADR-028 (ollama-docker-compose-service), ADR-034 (ollama-production-deployment-and-keep-alive), ADR-035 (rule-based-nlp-service)

---
## Context
The natural-language search feature ("KI-Suche" / smart search) allowed users to enter
free-form queries like *"Was hat Walter an Emma im Krieg geschrieben?"* and have them
interpreted by an LLM into structured filters (persons, tags, date range, keywords).
The feature went through two major iterations:
1. **Ollama integration** (ADR-028): an `ollama` Docker service running a local LLM
(llama3.2/gemma3) parsed queries via a JSON-mode prompt.
2. **Rule-based NLP service** (ADR-035): after Ollama proved too slow and unreliable on
CPU-only hardware, a Python FastAPI microservice (`nlp-service`, port 8001) replaced
it with deterministic regex + spaCy parsing plus a lightweight LLM call.
Both approaches shared the same fundamental problem: inference on the production server
(Hetzner Serverbörse, no GPU, 64 GB RAM, i7-6700) was too slow to be useful, with
typical query latencies of 10 to 30 seconds. Users got better and faster results from
the existing keyword search with date/person/tag filters.
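
To make "interpreted into structured filters" concrete, here is a minimal sketch of a deterministic rule of the kind the second iteration relied on. It is not the removed implementation: the `StructuredFilters` shape, its field names, the stopword list, and the single "`<sender> an <recipient>`" rule are all assumptions made for illustration, and the real service additionally used spaCy and an LLM call.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StructuredFilters:
    """Hypothetical result shape; field names are illustrative only."""
    sender: Optional[str] = None
    recipient: Optional[str] = None
    keywords: List[str] = field(default_factory=list)


# Toy rule: two capitalised words joined by "an" are read as sender and recipient.
DIRECTION = re.compile(r"\b([A-ZÄÖÜ][a-zäöüß]+) an ([A-ZÄÖÜ][a-zäöüß]+)\b")

# Assumed, deliberately tiny stopword list for the example query.
STOPWORDS = {"was", "hat", "an", "im", "geschrieben"}


def parse_query(query: str) -> StructuredFilters:
    """Map a free-form query to filters using one regex rule plus stopword removal."""
    filters = StructuredFilters()
    match = DIRECTION.search(query)
    if match:
        filters.sender, filters.recipient = match.group(1), match.group(2)
    persons = {filters.sender, filters.recipient}
    # Everything that is neither a stopword nor a detected person becomes a keyword.
    filters.keywords = [
        word for word in re.findall(r"\w+", query)
        if word.lower() not in STOPWORDS and word not in persons
    ]
    return filters
```

For the example query above, this rule yields Walter as sender, Emma as recipient, and "Krieg" as the only keyword, which is the same information a user can enter directly in the structured filter bar.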
## Decision
**Remove the NL search feature entirely.** The Python `nlp-service` microservice, the
Spring Boot `search/` package (`NlSearchController`, `NlQueryParserService`,
`RestClientNlpClient`, `NlSearchRateLimiter`, and all supporting classes), the frontend
NL search components (`SmartModeToggle`, `SmartSearchStatus`, `InterpretationChipRow`,
`DisambiguationPicker`), the related Docker Compose services, Prometheus scrape job,
Grafana dashboard, and all i18n keys are removed.
The existing structured search (FTS keyword + person/tag/date/directional filters) is
sufficient for the archive's current audience and search workload.
## Consequences
- **Capability removed:** users can no longer enter free-form natural-language queries.
They must use the structured filter bar (keyword text box + person/tag/date/directional
dropdowns). For searches that these filters can express, there is no regression.
- **Operational simplification:** the Docker Compose stack loses two services
(`nlp-service` and previously `ollama`/`ollama-model-init`). Memory budget on the
production host is freed. No external model weights need to be kept warm.
- **Future reinstatement:** if a GPU-capable host becomes available, re-implementing
server-side LLM inference would be straightforward given the clean separation of the
`NlSearchController` entry point. However, this ADR deliberately leaves no dead
infrastructure or stub code in place; a future implementation should start clean if and when that becomes viable.
- **No data or schema change:** only query/endpoint code and Docker services are removed.
The `documents`, `persons`, and `tags` tables and their FTS indexes are untouched.
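
One way to check the "no dead infrastructure or stub code" consequence is to scan the repository for the removed identifiers. The sketch below is an assumption about how such a check could look, not part of the actual change: the function name `find_leftovers` and the skipped directories are hypothetical, and the name list only covers identifiers mentioned in this ADR. The ADR directory is skipped because this file necessarily mentions the removed names.

```python
from pathlib import Path
from typing import Dict, Iterable, List

# Identifiers named in this ADR as removed; extend as needed.
REMOVED_NAMES = (
    "NlSearchController", "NlQueryParserService", "RestClientNlpClient",
    "NlSearchRateLimiter", "SmartModeToggle", "SmartSearchStatus",
    "InterpretationChipRow", "DisambiguationPicker", "nlp-service",
)

# Assumed directories to ignore; docs/adr legitimately references the old names.
SKIP_DIRS = ("docs/adr", ".git", "node_modules")


def find_leftovers(
    root: Path,
    names: Iterable[str] = REMOVED_NAMES,
    skip_dirs: Iterable[str] = SKIP_DIRS,
) -> Dict[str, List[str]]:
    """Return {relative file path: [removed names still present]} under root."""
    names = tuple(names)
    skip_dirs = tuple(skip_dirs)
    hits: Dict[str, List[str]] = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        # Match skip entries as whole path segments anywhere in the relative path.
        if any(f"/{d}/" in f"/{rel}" for d in skip_dirs):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable file: nothing to report
        found = [name for name in names if name in text]
        if found:
            hits[rel] = found
    return hits
```

Running `find_leftovers(Path("."))` from the repository root and expecting an empty dictionary would have flagged leftovers such as a spec file that still references `SmartModeToggle`.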