- Delete frontend/e2e/nl-search.spec.ts (was left alive; would have crashed CI when Playwright couldn't find the deleted SmartModeToggle)
- Fix docs/DEPLOYMENT.md: remove NLP service arrow + key-facts bullet that were accidentally added instead of removed in the prior commit
- Clean docs/GLOSSARY.md: remove keyword→tag resolution, PersonHint, TagHint, theme chip entries; trim NameMatches to drop the NlQueryParserService reference
- Remove @ConfigurationPropertiesScan from FamilienarchivApplication (all remaining @ConfigurationProperties beans carry @Component)
- Remove 12 orphaned i18n keys from de/en/es message files (search_loading_nl, search_chip_*, search_disambiguation_*, etc.)
- Fix SearchFilterBar.svelte input padding: pr-20 → pr-4 (SmartModeToggle that justified the right padding is gone)
- Delete docs/superpowers/plans/2026-06-07-remove-nlp-search.md (scaffolding artefact; plan files belong in Gitea issues, not the repo)
- Add docs/adr/034-remove-nl-search.md documenting the removal decision (supersedes deleted ADR-028 ×2, ADR-034-ollama, ADR-035)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ADR-034 — Remove NL/smart-search (supersedes ADR-028 ×2, ADR-034-ollama, ADR-035)
Date: 2026-06-07
Status: Accepted
Issue: #772
Supersedes: ADR-028 (nl-search-ollama), ADR-028 (ollama-docker-compose-service), ADR-034 (ollama-production-deployment-and-keep-alive), ADR-035 (rule-based-nlp-service)
Context
The natural-language search feature ("KI-Suche" / smart search) allowed users to enter free-form queries like "Was hat Walter an Emma im Krieg geschrieben?" and have them interpreted by an LLM into structured filters (persons, tags, date range, keywords).
The feature went through two major iterations:
- Ollama integration (ADR-028): an ollama Docker service running a local LLM (llama3.2/gemma3) parsed queries via a JSON-mode prompt.
- Rule-based NLP service (ADR-035): after Ollama proved too slow and unreliable on CPU-only hardware, a Python FastAPI microservice (nlp-service, port 8001) replaced it with deterministic regex + spaCy parsing plus a lightweight LLM call.
Both approaches shared the same fundamental problem: inference on the production server (Hetzner Serverbörse, no GPU, 64 GB RAM, i7-6700) was too slow to be useful, with typical query latencies of 10–30 seconds. Users got better and faster results from the existing keyword search with date/person/tag filters.
Decision
Remove the NL search feature entirely. The Python nlp-service microservice, the
Spring Boot search/ package (NlSearchController, NlQueryParserService,
RestClientNlpClient, NlSearchRateLimiter, and all supporting classes), the frontend
NL search components (SmartModeToggle, SmartSearchStatus, InterpretationChipRow,
DisambiguationPicker), the related Docker Compose services, Prometheus scrape job,
Grafana dashboard, and all i18n keys are removed.
The existing structured search (FTS keyword + person/tag/date/directional filters) is sufficient for the archive's current audience and search workload.
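To make the remaining search path concrete, here is a minimal sketch of how the structured filters could be expressed as a single request. The type, the field names, the query parameter names, and the person IDs are all hypothetical and chosen for illustration; the archive's actual API may look quite different.

```typescript
// Hypothetical shape of a structured search request. Every field name here
// is an illustrative assumption, not the archive's real API.
type Direction = "sent" | "received" | "any";

interface StructuredSearch {
  keyword?: string;      // free-text term handed to the FTS keyword search
  personIds?: number[];  // persons picked in the filter bar
  tagIds?: number[];     // tags picked in the filter bar
  dateFrom?: string;     // ISO date, lower bound
  dateTo?: string;       // ISO date, upper bound
  direction?: Direction; // assumed meaning of the "directional" filter
}

// Serialises only the filters that are actually set, in a fixed order.
function toQueryString(search: StructuredSearch): string {
  const params = new URLSearchParams();
  const keyword = search.keyword?.trim();
  if (keyword) params.set("q", keyword);
  for (const id of search.personIds ?? []) params.append("person", String(id));
  for (const id of search.tagIds ?? []) params.append("tag", String(id));
  if (search.dateFrom) params.set("from", search.dateFrom);
  if (search.dateTo) params.set("to", search.dateTo);
  if (search.direction && search.direction !== "any") {
    params.set("direction", search.direction);
  }
  return params.toString();
}

// The example question from the Context section, rewritten as explicit filters.
const example: StructuredSearch = {
  keyword: "Krieg",
  personIds: [1, 2], // hypothetical IDs standing in for Walter and Emma
  direction: "sent",
};

console.log(toQueryString(example));
// q=Krieg&person=1&person=2&direction=sent
```

The point of the sketch is that every part of the free-form question maps onto a filter the user can set directly, so no inference step is needed at query time.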
Consequences
- Capability removed: users can no longer enter free-form natural-language queries. They must use the structured filter bar (keyword text box + person/tag/date/directional dropdowns). For documents where these filters are sufficient, there is no regression.
- Operational simplification: the Docker Compose stack loses two services (nlp-service and previously ollama/ollama-model-init). Memory budget on the production host is freed. No external model weights need to be kept warm.
- Future reinstatement: if a GPU-capable host becomes available, re-implementing server-side LLM inference would be straightforward given the clean separation of the NlSearchController entry point. However, this ADR deliberately avoids leaving dead infrastructure or stub code in place; start clean if and when that becomes viable.
- No data or schema change: only query/endpoint code and Docker services are removed. The documents, persons, and tags tables and their FTS indexes are untouched.