familienarchiv/docs/adr/034-remove-nl-search.md
marcel d650b6c066
refactor(search): remove NLP/smart-search feature entirely (#772)
## Summary

- Removes the NLP/smart-search feature completely — the feature was too unreliable and slow; users get better results with the regular search filters
- Deletes the entire backend `search/` package (NlSearchController, NlQueryParserService, NlpClient, NlSearchRateLimiter — 14 classes + 6 test classes)
- Deletes the `nlp-service/` Python microservice (FastAPI, rapidfuzz, DB-backed person matching)
- Removes all frontend NL search components: SmartModeToggle, SmartSearchStatus, InterpretationChipRow, DisambiguationPicker, chip-types, theme-chip-removal
- Strips smart-mode logic from SearchFilterBar and documents/+page.svelte
- Removes `SMART_SEARCH_UNAVAILABLE` / `SMART_SEARCH_RATE_LIMITED` error codes from backend, frontend types, and all three i18n files (de/en/es)
- Removes `nlp-service` container and `APP_NLP_BASE_URL` from both docker-compose files
- Removes Ollama/NLP Prometheus scrape job and Grafana dashboard
- Deletes ADRs 028 (×2), 034, 035

## Test plan

- [ ] Backend compiles: `cd backend && ./mvnw compile -q` → BUILD SUCCESS
- [ ] Frontend server tests pass: `cd frontend && npm run test -- --project=server`
- [ ] No NLP/smart-search references remain in source: `grep -r "SmartSearch\|NlSearch\|nlp-service\|SMART_SEARCH" backend/src frontend/src`
- [ ] `docker compose config` validates both compose files
- [ ] Search page loads, filter bar works, no smart-mode toggle visible
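
The reference check in the test plan can also be scripted, which sidesteps the fact that `grep` exits non-zero when it finds nothing. The following is a minimal sketch, not part of the repository: the function name `find_references` is hypothetical, and the pattern simply mirrors the alternation used in the `grep` command above.

```python
import re
from pathlib import Path

# Same alternation as the grep pattern in the test plan.
PATTERN = re.compile(r"SmartSearch|NlSearch|nlp-service|SMART_SEARCH")


def find_references(roots):
    """Return (path, line_number, line) for every line matching PATTERN.

    `roots` is an iterable of directory paths; directories that do not
    exist are skipped silently.
    """
    hits = []
    for root in roots:
        root_path = Path(root)
        if not root_path.is_dir():
            continue
        for path in sorted(root_path.rglob("*")):
            if not path.is_file():
                continue
            try:
                # Undecodable bytes are dropped so binary files do not abort the scan.
                text = path.read_text(encoding="utf-8", errors="ignore")
            except OSError:
                continue
            for number, line in enumerate(text.splitlines(), start=1):
                if PATTERN.search(line):
                    hits.append((str(path), number, line.strip()))
    return hits
```

Run from the repository root, `find_references(["backend/src", "frontend/src"])` should return an empty list once the removal is complete; a CI step could fail the build whenever the list is non-empty.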

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Marcel <marcel@familienarchiv>
Reviewed-on: #772
2026-06-08 10:57:00 +02:00


ADR-034 — Remove NL/smart-search (supersedes ADR-028 ×2, ADR-034-ollama, ADR-035)

Date: 2026-06-07
Status: Accepted
Issue: #772
Supersedes: ADR-028 (nl-search-ollama), ADR-028 (ollama-docker-compose-service), ADR-034 (ollama-production-deployment-and-keep-alive), ADR-035 (rule-based-nlp-service)


Context

The natural-language search feature ("KI-Suche" / smart search) allowed users to enter free-form queries like "Was hat Walter an Emma im Krieg geschrieben?" and have them interpreted by an LLM into structured filters (persons, tags, date range, keywords).
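
To make the target of that interpretation concrete, here is a rough sketch of the four filter groups named above. The class and field names are hypothetical and do not reproduce the removed backend types; the example value is only one plausible reading of the sample query.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class StructuredFilters:
    """Hypothetical shape of the filters a free-form query was mapped to."""

    persons: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)
    date_from: Optional[date] = None
    date_to: Optional[date] = None
    keywords: list[str] = field(default_factory=list)

    def is_empty(self) -> bool:
        """True when no filter is set at all."""
        return not (
            self.persons
            or self.tags
            or self.keywords
            or self.date_from
            or self.date_to
        )


# One plausible reading of "Was hat Walter an Emma im Krieg geschrieben?".
# The date range stays unset because the query names no explicit years.
example = StructuredFilters(persons=["Walter", "Emma"], keywords=["Krieg"])
```

The same values can be entered by hand in the structured filter bar, which is the path this ADR keeps.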

The feature went through two major iterations:

  1. Ollama integration (ADR-028): an ollama Docker service running a local LLM (llama3.2/gemma3) parsed queries via a JSON-mode prompt.
  2. Rule-based NLP service (ADR-035): after Ollama proved too slow and unreliable on CPU-only hardware, a Python FastAPI microservice (nlp-service, port 8001) replaced it with deterministic regex + spaCy parsing plus a lightweight LLM call.

Both approaches shared the same fundamental problem: inference on the production server (Hetzner Serverbörse, no GPU, 64 GB RAM, i7-6700) was too slow to be useful, with typical query latencies of 10 to 30 seconds. Users got better and faster results from the existing keyword search with date/person/tag filters.

Decision

Remove the NL search feature entirely. The Python nlp-service microservice, the Spring Boot search/ package (NlSearchController, NlQueryParserService, RestClientNlpClient, NlSearchRateLimiter, and all supporting classes), the frontend NL search components (SmartModeToggle, SmartSearchStatus, InterpretationChipRow, DisambiguationPicker), the related Docker Compose services, Prometheus scrape job, Grafana dashboard, and all i18n keys are removed.

The existing structured search (FTS keyword + person/tag/date/directional filters) is sufficient for the archive's current audience and search workload.

Consequences

  • Capability removed: users can no longer enter free-form natural-language queries. They must use the structured filter bar (keyword text box + person/tag/date/directional dropdowns). For searches that these filters can express, there is no regression.
  • Operational simplification: the Docker Compose stack drops nlp-service, and the ollama and ollama-model-init services that preceded it are gone as well. This frees memory on the production host, and no model weights need to be kept warm.
  • Future reinstatement: if a GPU-capable host becomes available, re-implementing server-side LLM inference should be straightforward, because the feature was cleanly isolated behind the NlSearchController entry point. However, this ADR deliberately leaves no dead infrastructure or stub code in place; start clean if and when that becomes viable.
  • No data or schema change: only query/endpoint code and Docker services are removed. The documents, persons, and tags tables and their FTS indexes are untouched.