docs: remove nlp-service and NL search references from DEPLOYMENT.md and GLOSSARY.md
This commit is contained in:
@@ -51,17 +51,15 @@ graph TD
|
||||
|
||||
The OCR service requires significant RAM for model loading. The dev compose sets `mem_limit: 12g`.
|
||||
|
||||
| Production target | RAM | Recommended OCR limit | NL Search | Notes |
|
||||
|---|---|---|---|---|
|
||||
| Current server (Hetzner Serverbörse, i7-6700) | 64 GB | 12 GB | Supported | Default `mem_limit: 12g` works comfortably; nlp-service adds only ~256 MB |
|
||||
| ≥ 16 GB RAM | 16+ GB | 12 GB | Supported | Default works |
|
||||
| 8 GB RAM | 8 GB | 6 GB | Supported | Set `OCR_MEM_LIMIT=6g`; accept reduced batch sizes; nlp-service is lightweight |
|
||||
| 4 GB RAM | 4 GB | — | Supported | Disable OCR service (`profiles: [ocr]`); run OCR on demand only; nlp-service still runs |
|
||||
| Production target | RAM | Recommended OCR limit | Notes |
|
||||
|---|---|---|---|
|
||||
| Current server (Hetzner Serverbörse, i7-6700) | 64 GB | 12 GB | Default `mem_limit: 12g` works comfortably |
|
||||
| ≥ 16 GB RAM | 16+ GB | 12 GB | Default works |
|
||||
| 8 GB RAM | 8 GB | 6 GB | Set `OCR_MEM_LIMIT=6g`; accept reduced batch sizes |
|
||||
| 4 GB RAM | 4 GB | — | Disable OCR service (`profiles: [ocr]`); run OCR on demand only |
|
||||
|
||||
On servers with less than 16 GB RAM the default `mem_limit: 12g` cannot be honoured — set the `OCR_MEM_LIMIT` env var (in `.env.production` / `.env.staging`, or as a Gitea secret consumed by the workflow). The prod compose interpolates this var with a 12g default.
|
||||
|
||||
> **Memory budget:** OCR (~6 GB active) + nlp-service (~256 MB) = ~6.25 GB. The previous Ollama LLM (~8 GB) has been replaced by the rule-based nlp-service — significant memory headroom freed on all server tiers.
|
||||
|
||||
### Dev vs production differences
|
||||
|
||||
| Concern | Dev (`docker-compose.yml`) | Prod (`docker-compose.prod.yml`) |
|
||||
@@ -148,19 +146,6 @@ All vars are set in `.env` at the repo root (copy from `.env.example`). The back
|
||||
| `XDG_CACHE_HOME` | XDG cache base dir — redirects Matplotlib and other XDG-aware libraries away from the read-only `HOME` (`/home/ocr`) to the writable cache volume | `/app/cache` | — | — |
|
||||
| `TORCH_HOME` | PyTorch model cache — redirects `~/.cache/torch` to the writable models volume | `/app/models/torch` | — | — |
|
||||
|
||||
### NLP service (NL search)
|
||||
|
||||
| Variable | Purpose | Default | Required? | Sensitive? |
|
||||
|---|---|---|---|---|
|
||||
| `APP_NLP_BASE_URL` | Internal URL of the nlp-service container. Wired automatically in compose via `http://nlp-service:8001`. | `http://nlp-service:8001` | YES | — |
|
||||
| `NLP_FUZZY_THRESHOLD` | Rapidfuzz similarity floor for person-name matching (0–100). Lower values match more aggressively; raise if false positives appear. | `80` | — | — |
|
||||
|
||||
The nlp-service reads `DATABASE_URL` at startup (composed from `POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_DB`). Any credential rotation that touches those three vars must be followed by a restart of **both** `backend` and `nlp-service`:
|
||||
|
||||
```bash
|
||||
docker compose restart nlp-service backend
|
||||
```
|
||||
|
||||
### Observability stack (`docker-compose.observability.yml`)
|
||||
|
||||
| Variable | Purpose | Default | Required? | Sensitive? |
|
||||
@@ -281,14 +266,6 @@ git.raddatz.cloud A <server IP>
|
||||
|
||||
### 3.4 First deploy
|
||||
|
||||
> **NL search startup:** `nlp-service` loads person names from the database at startup (single query, ~1–2 s). No model weights to download. The backend waits for `nlp-service` to pass its healthcheck (`/health` returns `{"status":"ok","persons_loaded":N}`) before starting, so `docker compose up -d --wait` is safe to use on first deploy.
|
||||
>
|
||||
> **Verify NL search is active:**
|
||||
> ```bash
|
||||
> curl -s http://localhost:8001/health
|
||||
> # Returns {"status":"ok","persons_loaded":N} with N > 0 → person matching enabled
|
||||
> # Returns {"status":"ok","persons_loaded":0} → DB not reachable or persons table empty
|
||||
> ```
|
||||
|
||||
```bash
|
||||
# 1. Trigger nightly.yml manually (Repo → Actions → nightly → "Run workflow")
|
||||
@@ -328,7 +305,7 @@ docker compose logs --follow
|
||||
|
||||
# Single snapshot
|
||||
docker compose logs --tail=200 <service>
|
||||
# services: frontend, backend, db, minio, ocr-service, nlp-service
|
||||
# services: frontend, backend, db, minio, ocr-service
|
||||
```
|
||||
|
||||
### Log locations
|
||||
@@ -585,41 +562,6 @@ bash scripts/download-kraken-models.sh
|
||||
|
||||
> Downloads the Kurrent/Sütterlin HTR models. Run once after a fresh clone or when models are updated.
|
||||
|
||||
### NLP service — natural-language search (NL Search)
|
||||
|
||||
NL search uses the rule-based `nlp-service` FastAPI container for query parsing. It has no model weights — it loads person names from the database at startup and applies regex + fuzzy matching. See ADR-035.
|
||||
|
||||
**Health check:**
|
||||
|
||||
```bash
|
||||
curl -s http://localhost:8001/health
|
||||
# {"status":"ok","persons_loaded":1247}
|
||||
```
|
||||
|
||||
`persons_loaded: 0` means the service started but could not reach the database (check `DATABASE_URL` and that `db` is healthy).
|
||||
|
||||
If `POST /api/search/nl` returns HTTP 503 `SMART_SEARCH_UNAVAILABLE`, the backend cannot reach `nlp-service`. Check with:
|
||||
|
||||
```bash
|
||||
docker compose logs nlp-service --tail=50
|
||||
docker compose ps nlp-service
|
||||
```
|
||||
|
||||
**Configuration** (see `application.yaml` under `app.nlp`):
|
||||
|
||||
| Property | Default | Description |
|
||||
|---|---|---|
|
||||
| `app.nlp.base-url` | `http://nlp-service:8001` | nlp-service URL; set via `APP_NLP_BASE_URL` env var |
|
||||
| `app.nl-search.rate-limit.max-requests-per-minute` | `20` | Per-user rate limit |
|
||||
|
||||
**Tuning person matching:**
|
||||
|
||||
Set `NLP_FUZZY_THRESHOLD` in `.env` (default: `80`, range: `0–100`). Lower values match more aggressively at the cost of false positives. Restart nlp-service after changing:
|
||||
|
||||
```bash
|
||||
docker compose restart nlp-service
|
||||
```
|
||||
|
||||
### Trigger a canonical import
|
||||
|
||||
The importer no longer parses the raw spreadsheet. It consumes the **canonical artifacts**
|
||||
|
||||
@@ -165,13 +165,6 @@ _See also [Chronik](#chronik-internal)._
|
||||
|
||||
**Domain** — a Tier-1 bounded context with its own entities, controller, service, repository, and DTOs. Backend domains: `document`, `person`, `tag`, `user`, `geschichte`, `notification`, `ocr`, `audit`, `dashboard`. Frontend domains mirror this structure under `src/lib/`.
|
||||
|
||||
---
|
||||
|
||||
## NL Search Terms
|
||||
|
||||
**NlSearch** — the natural-language document search feature. Users type a plain-German query (e.g. "Was hat Walter im Krieg an Emma geschrieben?"); the backend parses it via Ollama, resolves person names to database UUIDs, and delegates to the standard `DocumentService.searchDocuments()` path. Endpoint: `POST /api/search/nl`.
|
||||
|
||||
**NlQueryInterpretation** — the structured result of parsing a natural-language query. Contains: `resolvedPersons` (persons whose names unambiguously matched one DB record), `ambiguousPersons` (all candidates when a name matched more than one person), `keywords` (LLM-extracted search terms), `dateFrom`/`dateTo` (extracted date range), `rawQuery` (the original user input), `keywordsApplied` (whether keyword FTS was used), `resolvedTags` (tags matched by keyword→tag resolution), and `tagsApplied` (whether the OR-union tag filter was applied).
|
||||
|
||||
**keyword→tag resolution** — the post-Ollama step in `NlQueryParserService` where each LLM-extracted keyword is substring-matched against the tag taxonomy via `TagService.findByNameContaining()`. Keywords that hit one or more tags are removed from the FTS text list and become an OR-union tag filter; keywords with no match remain as FTS text. Matching is case-insensitive and traverses the tag hierarchy via the recursive CTE `findDescendantIdsByName`. See ADR-033.
|
||||
|
||||
|
||||
Reference in New Issue
Block a user