fix(review): resolve all review blockers and concerns
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m51s
CI / OCR Service Tests (pull_request) Successful in 25s
CI / Backend Unit Tests (pull_request) Successful in 3m46s
CI / fail2ban Regex (pull_request) Successful in 45s
CI / Semgrep Security Scan (pull_request) Successful in 24s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m8s

- Delete frontend/e2e/nl-search.spec.ts (was left alive; would have
  crashed CI when Playwright couldn't find the deleted SmartModeToggle)
- Fix docs/DEPLOYMENT.md: remove NLP service arrow + key-facts bullet
  that were accidentally added instead of removed in the prior commit
- Clean docs/GLOSSARY.md: remove keyword→tag resolution, PersonHint,
  TagHint, theme chip entries; trim NameMatches to drop the
  NlQueryParserService reference
- Remove @ConfigurationPropertiesScan from FamilienarchivApplication
  (all remaining @ConfigurationProperties beans carry @Component)
- Remove 12 orphaned i18n keys from de/en/es message files
  (search_loading_nl, search_chip_*, search_disambiguation_*, etc.)
- Fix SearchFilterBar.svelte input padding: pr-20 → pr-4 (SmartModeToggle
  that justified the right padding is gone)
- Delete docs/superpowers/plans/2026-06-07-remove-nlp-search.md
  (scaffolding artefact; plan files belong in Gitea issues, not the repo)
- Add docs/adr/034-remove-nl-search.md documenting the removal decision
  (supersedes deleted ADR-028 ×2, ADR-034-ollama, ADR-035)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Marcel
2026-06-07 19:50:48 +02:00
parent fbaf180136
commit 784a7759f5
10 changed files with 56 additions and 934 deletions

View File

@@ -0,0 +1,53 @@
# ADR-034 — Remove NL/smart-search (supersedes ADR-028 ×2, ADR-034-ollama, ADR-035)
**Date:** 2026-06-07
**Status:** Accepted
**Issue:** #772
**Supersedes:** ADR-028 (nl-search-ollama), ADR-028 (ollama-docker-compose-service), ADR-034 (ollama-production-deployment-and-keep-alive), ADR-035 (rule-based-nlp-service)
---
## Context
The natural-language search feature ("KI-Suche" / smart search) allowed users to enter
free-form queries like *"Was hat Walter an Emma im Krieg geschrieben?"* and have them
interpreted by an LLM into structured filters (persons, tags, date range, keywords).
The feature went through two major iterations:
1. **Ollama integration** (ADR-028): an `ollama` Docker service running a local LLM
(llama3.2/gemma3) parsed queries via a JSON-mode prompt.
2. **Rule-based NLP service** (ADR-035): after Ollama proved too slow and unreliable on
CPU-only hardware, a Python FastAPI microservice (`nlp-service`, port 8001) replaced
it with deterministic regex + spaCy parsing plus a lightweight LLM call.
Both approaches shared the same fundamental problem: inference on the production server
(Hetzner Serverbörse, no GPU, 64 GB RAM, i7-6700) was too slow to be useful, with
typical query latencies of 1030 seconds. Users got better and faster results from
the existing keyword search with date/person/tag filters.
## Decision
**Remove the NL search feature entirely.** The Python `nlp-service` microservice, the
Spring Boot `search/` package (`NlSearchController`, `NlQueryParserService`,
`RestClientNlpClient`, `NlSearchRateLimiter`, and all supporting classes), the frontend
NL search components (`SmartModeToggle`, `SmartSearchStatus`, `InterpretationChipRow`,
`DisambiguationPicker`), the related Docker Compose services, Prometheus scrape job,
Grafana dashboard, and all i18n keys are removed.
The existing structured search (FTS keyword + person/tag/date/directional filters) is
sufficient for the archive's current audience and search workload.
## Consequences
- **Capability removed:** users can no longer enter free-form natural-language queries.
They must use the structured filter bar (keyword text box + person/tag/date/directional
dropdowns). For documents where these filters are sufficient, there is no regression.
- **Operational simplification:** the Docker Compose stack loses two services
(`nlp-service` and previously `ollama`/`ollama-model-init`). Memory budget on the
production host is freed. No external model weights need to be kept warm.
- **Future reinstatement:** if a GPU-capable host becomes available, re-implementing
server-side LLM inference would be straightforward given the clean separation of the
`NlSearchController` entry point. However, this ADR deliberately avoids leaving dead
infrastructure or stub code in place — start clean if and when that becomes viable.
- **No data or schema change:** only query/endpoint code and Docker services are removed.
The `documents`, `persons`, and `tags` tables and their FTS indexes are untouched.