docs: remove nlp-service and NL search references from DEPLOYMENT.md and GLOSSARY.md
@@ -51,17 +51,15 @@ graph TD
|
|||||||
|
|
||||||
The OCR service requires significant RAM for model loading. The dev compose sets `mem_limit: 12g`.
|
The OCR service requires significant RAM for model loading. The dev compose sets `mem_limit: 12g`.
|
||||||
|
|
||||||
| Production target | RAM | Recommended OCR limit | NL Search | Notes |
|
| Production target | RAM | Recommended OCR limit | Notes |
|
||||||
|---|---|---|---|---|
|
|---|---|---|---|
|
||||||
| Current server (Hetzner Serverbörse, i7-6700) | 64 GB | 12 GB | Supported | Default `mem_limit: 12g` works comfortably; nlp-service adds only ~256 MB |
|
| Current server (Hetzner Serverbörse, i7-6700) | 64 GB | 12 GB | Default `mem_limit: 12g` works comfortably |
|
||||||
| ≥ 16 GB RAM | 16+ GB | 12 GB | Supported | Default works |
|
| ≥ 16 GB RAM | 16+ GB | 12 GB | Default works |
|
||||||
| 8 GB RAM | 8 GB | 6 GB | Supported | Set `OCR_MEM_LIMIT=6g`; accept reduced batch sizes; nlp-service is lightweight |
|
| 8 GB RAM | 8 GB | 6 GB | Set `OCR_MEM_LIMIT=6g`; accept reduced batch sizes |
|
||||||
| 4 GB RAM | 4 GB | — | Supported | Disable OCR service (`profiles: [ocr]`); run OCR on demand only; nlp-service still runs |
|
| 4 GB RAM | 4 GB | — | Disable OCR service (`profiles: [ocr]`); run OCR on demand only |
|
||||||
|
|
||||||
On servers with less than 16 GB RAM the default `mem_limit: 12g` cannot be honoured — set the `OCR_MEM_LIMIT` env var (in `.env.production` / `.env.staging`, or as a Gitea secret consumed by the workflow). The prod compose interpolates this var with a 12g default.
|
On servers with less than 16 GB RAM the default `mem_limit: 12g` cannot be honoured — set the `OCR_MEM_LIMIT` env var (in `.env.production` / `.env.staging`, or as a Gitea secret consumed by the workflow). The prod compose interpolates this var with a 12g default.
|
||||||
|
|
||||||
> **Memory budget:** OCR (~6 GB active) + nlp-service (~256 MB) = ~6.25 GB. The previous Ollama LLM (~8 GB) has been replaced by the rule-based nlp-service — significant memory headroom freed on all server tiers.
|
|
||||||
|
|
||||||
### Dev vs production differences
|
### Dev vs production differences
|
||||||
|
|
||||||
| Concern | Dev (`docker-compose.yml`) | Prod (`docker-compose.prod.yml`) |
|
| Concern | Dev (`docker-compose.yml`) | Prod (`docker-compose.prod.yml`) |
|
||||||
@@ -148,19 +146,6 @@ All vars are set in `.env` at the repo root (copy from `.env.example`). The back
|
|||||||
| `XDG_CACHE_HOME` | XDG cache base dir — redirects Matplotlib and other XDG-aware libraries away from the read-only `HOME` (`/home/ocr`) to the writable cache volume | `/app/cache` | — | — |
|
| `XDG_CACHE_HOME` | XDG cache base dir — redirects Matplotlib and other XDG-aware libraries away from the read-only `HOME` (`/home/ocr`) to the writable cache volume | `/app/cache` | — | — |
|
||||||
| `TORCH_HOME` | PyTorch model cache — redirects `~/.cache/torch` to the writable models volume | `/app/models/torch` | — | — |
|
| `TORCH_HOME` | PyTorch model cache — redirects `~/.cache/torch` to the writable models volume | `/app/models/torch` | — | — |
|
||||||
|
|
||||||
### NLP service (NL search)
|
|
||||||
|
|
||||||
| Variable | Purpose | Default | Required? | Sensitive? |
|
|
||||||
|---|---|---|---|---|
|
|
||||||
| `APP_NLP_BASE_URL` | Internal URL of the nlp-service container. Wired automatically in compose via `http://nlp-service:8001`. | `http://nlp-service:8001` | YES | — |
|
|
||||||
| `NLP_FUZZY_THRESHOLD` | Rapidfuzz similarity floor for person-name matching (0–100). Lower values match more aggressively; raise if false positives appear. | `80` | — | — |
|
|
||||||
|
|
||||||
The nlp-service reads `DATABASE_URL` at startup (composed from `POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_DB`). Any credential rotation that touches those three vars must be followed by a restart of **both** `backend` and `nlp-service`:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
docker compose restart nlp-service backend
|
|
||||||
```
|
|
||||||
|
|
||||||
### Observability stack (`docker-compose.observability.yml`)
|
### Observability stack (`docker-compose.observability.yml`)
|
||||||
|
|
||||||
| Variable | Purpose | Default | Required? | Sensitive? |
|
| Variable | Purpose | Default | Required? | Sensitive? |
|
||||||
@@ -281,14 +266,6 @@ git.raddatz.cloud A <server IP>
|
|||||||
|
|
||||||
### 3.4 First deploy
|
### 3.4 First deploy
|
||||||
|
|
||||||
> **NL search startup:** `nlp-service` loads person names from the database at startup (single query, ~1–2 s). No model weights to download. The backend waits for `nlp-service` to pass its healthcheck (`/health` returns `{"status":"ok","persons_loaded":N}`) before starting, so `docker compose up -d --wait` is safe to use on first deploy.
|
|
||||||
>
|
|
||||||
> **Verify NL search is active:**
|
|
||||||
> ```bash
|
|
||||||
> curl -s http://localhost:8001/health
|
|
||||||
> # Returns {"status":"ok","persons_loaded":N} with N > 0 → person matching enabled
|
|
||||||
> # Returns {"status":"ok","persons_loaded":0} → DB not reachable or persons table empty
|
|
||||||
> ```
|
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# 1. Trigger nightly.yml manually (Repo → Actions → nightly → "Run workflow")
|
# 1. Trigger nightly.yml manually (Repo → Actions → nightly → "Run workflow")
|
||||||
@@ -328,7 +305,7 @@ docker compose logs --follow
|
|||||||
|
|
||||||
# Single snapshot
|
# Single snapshot
|
||||||
docker compose logs --tail=200 <service>
|
docker compose logs --tail=200 <service>
|
||||||
# services: frontend, backend, db, minio, ocr-service, nlp-service
|
# services: frontend, backend, db, minio, ocr-service
|
||||||
```
|
```
|
||||||
|
|
||||||
### Log locations
|
### Log locations
|
||||||
@@ -585,41 +562,6 @@ bash scripts/download-kraken-models.sh
|
|||||||
|
|
||||||
> Downloads the Kurrent/Sütterlin HTR models. Run once after a fresh clone or when models are updated.
|
> Downloads the Kurrent/Sütterlin HTR models. Run once after a fresh clone or when models are updated.
|
||||||
|
|
||||||
### NLP service — natural-language search (NL Search)
|
|
||||||
|
|
||||||
NL search uses the rule-based `nlp-service` FastAPI container for query parsing. It has no model weights — it loads person names from the database at startup and applies regex + fuzzy matching. See ADR-035.
|
|
||||||
|
|
||||||
**Health check:**
|
|
||||||
|
|
||||||
```bash
|
|
||||||
curl -s http://localhost:8001/health
|
|
||||||
# {"status":"ok","persons_loaded":1247}
|
|
||||||
```
|
|
||||||
|
|
||||||
`persons_loaded: 0` means the service started but could not reach the database (check `DATABASE_URL` and that `db` is healthy).
|
|
||||||
|
|
||||||
If `POST /api/search/nl` returns HTTP 503 `SMART_SEARCH_UNAVAILABLE`, the backend cannot reach `nlp-service`. Check with:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
docker compose logs nlp-service --tail=50
|
|
||||||
docker compose ps nlp-service
|
|
||||||
```
|
|
||||||
|
|
||||||
**Configuration** (see `application.yaml` under `app.nlp`):
|
|
||||||
|
|
||||||
| Property | Default | Description |
|
|
||||||
|---|---|---|
|
|
||||||
| `app.nlp.base-url` | `http://nlp-service:8001` | nlp-service URL; set via `APP_NLP_BASE_URL` env var |
|
|
||||||
| `app.nl-search.rate-limit.max-requests-per-minute` | `20` | Per-user rate limit |
|
|
||||||
|
|
||||||
**Tuning person matching:**
|
|
||||||
|
|
||||||
Set `NLP_FUZZY_THRESHOLD` in `.env` (default: `80`, range: `0–100`). Lower values match more aggressively at the cost of false positives. Restart nlp-service after changing:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
docker compose restart nlp-service
|
|
||||||
```
|
|
||||||
|
|
||||||
### Trigger a canonical import
|
### Trigger a canonical import
|
||||||
|
|
||||||
The importer no longer parses the raw spreadsheet. It consumes the **canonical artifacts**
|
The importer no longer parses the raw spreadsheet. It consumes the **canonical artifacts**
|
||||||
|
|||||||
@@ -165,13 +165,6 @@ _See also [Chronik](#chronik-internal)._
|
|||||||
|
|
||||||
**Domain** — a Tier-1 bounded context with its own entities, controller, service, repository, and DTOs. Backend domains: `document`, `person`, `tag`, `user`, `geschichte`, `notification`, `ocr`, `audit`, `dashboard`. Frontend domains mirror this structure under `src/lib/`.
|
**Domain** — a Tier-1 bounded context with its own entities, controller, service, repository, and DTOs. Backend domains: `document`, `person`, `tag`, `user`, `geschichte`, `notification`, `ocr`, `audit`, `dashboard`. Frontend domains mirror this structure under `src/lib/`.
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## NL Search Terms
|
|
||||||
|
|
||||||
**NlSearch** — the natural-language document search feature. Users type a plain-German query (e.g. "Was hat Walter im Krieg an Emma geschrieben?"); the backend parses it via Ollama, resolves person names to database UUIDs, and delegates to the standard `DocumentService.searchDocuments()` path. Endpoint: `POST /api/search/nl`.
|
|
||||||
|
|
||||||
**NlQueryInterpretation** — the structured result of parsing a natural-language query. Contains: `resolvedPersons` (persons whose names unambiguously matched one DB record), `ambiguousPersons` (all candidates when a name matched more than one person), `keywords` (LLM-extracted search terms), `dateFrom`/`dateTo` (extracted date range), `rawQuery` (the original user input), `keywordsApplied` (whether keyword FTS was used), `resolvedTags` (tags matched by keyword→tag resolution), and `tagsApplied` (whether the OR-union tag filter was applied).
|
|
||||||
|
|
||||||
**keyword→tag resolution** — the post-Ollama step in `NlQueryParserService` where each LLM-extracted keyword is substring-matched against the tag taxonomy via `TagService.findByNameContaining()`. Keywords that hit one or more tags are removed from the FTS text list and become an OR-union tag filter; keywords with no match remain as FTS text. Matching is case-insensitive and traverses the tag hierarchy via the recursive CTE `findDescendantIdsByName`. See ADR-033.
|
**keyword→tag resolution** — the post-Ollama step in `NlQueryParserService` where each LLM-extracted keyword is substring-matched against the tag taxonomy via `TagService.findByNameContaining()`. Keywords that hit one or more tags are removed from the FTS text list and become an OR-union tag filter; keywords with no match remain as FTS text. Matching is case-insensitive and traverses the tag hierarchy via the recursive CTE `findDescendantIdsByName`. See ADR-033.
|
||||||
|
|
||||||
|
|||||||