refactor(search): remove NLP/smart-search feature entirely (#772)
All checks were successful
CI / Unit & Component Tests (push) Successful in 3m23s
CI / OCR Service Tests (push) Successful in 24s
CI / Backend Unit Tests (push) Successful in 3m46s
CI / fail2ban Regex (push) Successful in 46s
CI / Semgrep Security Scan (push) Successful in 25s
CI / Compose Bucket Idempotency (push) Successful in 1m8s
## Summary

- Removes the NLP/smart-search feature completely: the feature was too unreliable and slow, and users get better results with the regular search filters
- Deletes the entire backend `search/` package (NlSearchController, NlQueryParserService, NlpClient, NlSearchRateLimiter; 14 classes + 6 test classes)
- Deletes the `nlp-service/` Python microservice (FastAPI, rapidfuzz, DB-backed person matching)
- Removes all frontend NL search components: SmartModeToggle, SmartSearchStatus, InterpretationChipRow, DisambiguationPicker, chip-types, theme-chip-removal
- Strips smart-mode logic from SearchFilterBar and documents/+page.svelte
- Removes `SMART_SEARCH_UNAVAILABLE` / `SMART_SEARCH_RATE_LIMITED` error codes from backend, frontend types, and all three i18n files (de/en/es)
- Removes `nlp-service` container and `APP_NLP_BASE_URL` from both docker-compose files
- Removes Ollama/NLP Prometheus scrape job and Grafana dashboard
- Deletes ADRs 028 (×2), 034, 035

## Test plan

- [ ] Backend compiles: `cd backend && ./mvnw compile -q` → BUILD SUCCESS
- [ ] Frontend server tests pass: `cd frontend && npm run test -- --project=server`
- [ ] No NLP/smart-search references remain in source: `grep -r "SmartSearch\|NlSearch\|nlp-service\|SMART_SEARCH" backend/src frontend/src`
- [ ] `docker compose config` validates both compose files
- [ ] Search page loads, filter bar works, no smart-mode toggle visible

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Marcel <marcel@familienarchiv>
Reviewed-on: #772
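The `grep` item in the test plan can also be run as a small script, for example in CI. The following is only a sketch and not part of the PR: the function names `find_leftovers` and `main` are hypothetical, and the pattern and the two source directories are copied from the `grep` command above.

```python
import re
from pathlib import Path

# Same identifiers as the grep pattern in the test plan.
LEFTOVER_PATTERN = re.compile(r"SmartSearch|NlSearch|nlp-service|SMART_SEARCH")


def find_leftovers(roots, pattern=LEFTOVER_PATTERN):
    """Return (path, line_number, line) for every line that matches the pattern."""
    hits = []
    for root in roots:
        root_path = Path(root)
        if not root_path.is_dir():
            continue  # a missing directory simply contributes no hits
        for path in sorted(root_path.rglob("*")):
            if not path.is_file():
                continue
            try:
                text = path.read_text(encoding="utf-8", errors="ignore")
            except OSError:
                continue  # unreadable file: skip it rather than abort the scan
            for number, line in enumerate(text.splitlines(), start=1):
                if pattern.search(line):
                    hits.append((str(path), number, line.strip()))
    return hits


def main(roots=("backend/src", "frontend/src")):
    """Print every leftover reference and return a shell-style exit code."""
    leftovers = find_leftovers(roots)
    for path, number, line in leftovers:
        print(f"{path}:{number}: {line}")
    return 1 if leftovers else 0
```

Calling `main()` from the repository root returns 0 when no reference remains and 1 otherwise, so the result can be passed to `sys.exit` in a CI step. Like the original `grep`, the pattern covers only the four listed identifiers.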
This commit was merged in pull request #772.
53
docs/adr/034-remove-nl-search.md
Normal file
@@ -0,0 +1,53 @@
# ADR-034: Remove NL/smart-search (supersedes ADR-028 ×2, ADR-034-ollama, ADR-035)

**Date:** 2026-06-07
**Status:** Accepted
**Issue:** #772
**Supersedes:** ADR-028 (nl-search-ollama), ADR-028 (ollama-docker-compose-service), ADR-034 (ollama-production-deployment-and-keep-alive), ADR-035 (rule-based-nlp-service)

---

## Context

The natural-language search feature ("KI-Suche" / smart search) allowed users to enter
free-form queries like *"Was hat Walter an Emma im Krieg geschrieben?"* and have them
interpreted by an LLM into structured filters (persons, tags, date range, keywords).

The feature went through two major iterations:
1. **Ollama integration** (ADR-028): an `ollama` Docker service running a local LLM
   (llama3.2/gemma3) parsed queries via a JSON-mode prompt.
2. **Rule-based NLP service** (ADR-035): after Ollama proved too slow and unreliable on
   CPU-only hardware, a Python FastAPI microservice (`nlp-service`, port 8001) replaced
   it with deterministic regex + spaCy parsing plus a lightweight LLM call.

Both approaches shared the same fundamental problem: inference on the production server
(Hetzner Serverbörse, no GPU, 64 GB RAM, i7-6700) was too slow to be useful, with
typical query latencies of 10–30 seconds. Users got better and faster results from
the existing keyword search with date/person/tag filters.

## Decision

**Remove the NL search feature entirely.** The Python `nlp-service` microservice, the
Spring Boot `search/` package (`NlSearchController`, `NlQueryParserService`,
`RestClientNlpClient`, `NlSearchRateLimiter`, and all supporting classes), the frontend
NL search components (`SmartModeToggle`, `SmartSearchStatus`, `InterpretationChipRow`,
`DisambiguationPicker`), the related Docker Compose services, Prometheus scrape job,
Grafana dashboard, and all related i18n keys are removed.

The existing structured search (FTS keyword + person/tag/date/directional filters) is
sufficient for the archive's current audience and search workload.

## Consequences

- **Capability removed:** users can no longer enter free-form natural-language queries.
  They must use the structured filter bar (keyword text box + person/tag/date/directional
  dropdowns). For queries that these filters can express, there is no regression.
- **Operational simplification:** the Docker Compose stack loses two services
  (`nlp-service` and previously `ollama`/`ollama-model-init`). This frees memory on the
  production host, and no external model weights need to be kept warm.
- **Future reinstatement:** if a GPU-capable host becomes available, re-implementing
  server-side LLM inference would be straightforward, because the feature was cleanly
  separated behind the `NlSearchController` entry point. However, this ADR deliberately
  avoids leaving dead infrastructure or stub code in place; start clean if and when that
  becomes viable.
- **No data or schema change:** only query/endpoint code and Docker services are removed.
  The `documents`, `persons`, and `tags` tables and their FTS indexes are untouched.