# ADR-034: Remove NL/smart-search (supersedes ADR-028 ×2, ADR-034-ollama, ADR-035)

**Date:** 2026-06-07

**Status:** Accepted

**Issue:** #772

**Supersedes:** ADR-028 (nl-search-ollama), ADR-028 (ollama-docker-compose-service), ADR-034 (ollama-production-deployment-and-keep-alive), ADR-035 (rule-based-nlp-service)

---

## Context

The natural-language search feature ("KI-Suche" / smart search) allowed users to enter free-form queries like *"Was hat Walter an Emma im Krieg geschrieben?"* and have them interpreted by an LLM into structured filters (persons, tags, date range, keywords).

The feature went through two major iterations:

1. **Ollama integration** (ADR-028): an `ollama` Docker service running a local LLM (llama3.2/gemma3) parsed queries via a JSON-mode prompt.
2. **Rule-based NLP service** (ADR-035): after Ollama proved too slow and unreliable on CPU-only hardware, a Python FastAPI microservice (`nlp-service`, port 8001) replaced it with deterministic regex + spaCy parsing plus a lightweight LLM call.

Both approaches shared the same fundamental problem: inference on the production server (Hetzner Serverbörse, no GPU, 64 GB RAM, i7-6700) was too slow to be useful, with typical query latencies of 10–30 seconds. Users got better and faster results from the existing keyword search with date/person/tag filters.

## Decision

**Remove the NL search feature entirely.** The following are removed:

- the Python `nlp-service` microservice
- the Spring Boot `search/` package (`NlSearchController`, `NlQueryParserService`, `RestClientNlpClient`, `NlSearchRateLimiter`, and all supporting classes)
- the frontend NL search components (`SmartModeToggle`, `SmartSearchStatus`, `InterpretationChipRow`, `DisambiguationPicker`)
- the related Docker Compose services, Prometheus scrape job, Grafana dashboard, and all i18n keys

The existing structured search (FTS keyword + person/tag/date/directional filters) is sufficient for the archive's current audience and search workload.

## Consequences

- **Capability removed:** users can no longer enter free-form natural-language queries. They must use the structured filter bar (keyword text box + person/tag/date/directional dropdowns). For searches that these filters can express, there is no regression.
- **Operational simplification:** the Docker Compose stack loses the `nlp-service` service, along with the earlier `ollama`/`ollama-model-init` services. The memory they reserved on the production host is freed. No external model weights need to be kept warm.
- **Future reinstatement:** if a GPU-capable host becomes available, re-implementing server-side LLM inference would be straightforward, because the feature sat behind a single, cleanly separated entry point (`NlSearchController`). However, this ADR deliberately leaves no dead infrastructure or stub code in place; any reinstatement should start clean if and when it becomes viable.
- **No data or schema change:** only query/endpoint code and Docker services are removed. The `documents`, `persons`, and `tags` tables and their FTS indexes are untouched.
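
One way to check that no dead infrastructure or stub code remains is to scan the repository for the identifiers named in the Decision section. The following is a minimal sketch, not part of the project: the function name `find_leftovers`, the `SKIP_DIRS` set, and the assumption that ADRs live under a `docs` directory (skipped because they legitimately mention the removed names) are all hypothetical.

```python
import re
from pathlib import Path

# Identifiers taken from the Decision section of this ADR.
REMOVED_IDENTIFIERS = (
    "NlSearchController",
    "NlQueryParserService",
    "RestClientNlpClient",
    "NlSearchRateLimiter",
    "SmartModeToggle",
    "SmartSearchStatus",
    "InterpretationChipRow",
    "DisambiguationPicker",
    "nlp-service",
    "ollama",
)

# Assumption: directories that are either generated or expected to
# mention the removed names (e.g. ADRs under docs/). Adjust per repo.
SKIP_DIRS = {".git", "node_modules", "target", "build", "docs"}


def find_leftovers(root, identifiers=REMOVED_IDENTIFIERS, skip_dirs=SKIP_DIRS):
    """Return (relative_path, line_number, identifier) for each remaining reference."""
    root = Path(root)
    pattern = re.compile("|".join(re.escape(name) for name in identifiers))
    hits = []
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        if not path.is_file() or skip_dirs & set(relative.parts):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for line_number, line in enumerate(text.splitlines(), start=1):
            match = pattern.search(line)
            if match:
                # Only the first match per line is reported.
                hits.append((relative.as_posix(), line_number, match.group(0)))
    return hits
```

Calling `find_leftovers(".")` from the repository root and failing the build when the returned list is non-empty would turn the "start clean" intent into an automated check, under the assumptions stated above.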