docs(search): update CLAUDE.md, GLOSSARY, DEPLOYMENT, and C4 diagrams
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m21s
CI / OCR Service Tests (pull_request) Successful in 22s
CI / Backend Unit Tests (pull_request) Successful in 3m52s
CI / fail2ban Regex (pull_request) Successful in 44s
CI / Semgrep Security Scan (pull_request) Successful in 21s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m3s
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m21s
CI / OCR Service Tests (pull_request) Successful in 22s
CI / Backend Unit Tests (pull_request) Successful in 3m52s
CI / fail2ban Regex (pull_request) Successful in 44s
CI / Semgrep Security Scan (pull_request) Successful in 21s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m3s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -560,6 +560,37 @@ bash scripts/download-kraken-models.sh
|
||||
|
||||
> Downloads the Kurrent/Sütterlin HTR models. Run once after a fresh clone or when models are updated.
|
||||
|
||||
### Ollama — natural-language search (NL Search)
|
||||
|
||||
NL search uses a local Ollama instance for query parsing. The `ollama` service is defined in `docker-compose.yml` alongside the main stack.
|
||||
|
||||
**First-time model pull** (required before the feature works):
|
||||
|
||||
```bash
|
||||
docker compose exec ollama ollama pull qwen2.5:7b-instruct-q4_K_M
|
||||
```
|
||||
|
||||
This downloads ~4.4 GB. The model is stored in the `ollama_data` Docker volume and persists across container restarts.
|
||||
|
||||
**Verify the model is available:**
|
||||
|
||||
```bash
|
||||
docker compose exec ollama ollama list
|
||||
```
|
||||
|
||||
Expected output includes `qwen2.5:7b-instruct-q4_K_M`.
|
||||
|
||||
**Health check** — the backend polls `GET /api/tags` on Ollama at startup and before inference. If Ollama is absent, `POST /api/search/nl` returns HTTP 503 with `SMART_SEARCH_UNAVAILABLE`.
|
||||
|
||||
**Configuration** (see `application.yaml` under `app.ollama`):
|
||||
|
||||
| Property | Default | Description |
|
||||
|---|---|---|
|
||||
| `app.ollama.base-url` | `http://ollama:11434` | Ollama service URL (dev: `http://localhost:11434`) |
|
||||
| `app.ollama.model` | `qwen2.5:7b-instruct-q4_K_M` | Model to use for inference |
|
||||
| `app.ollama.timeout-seconds` | `30` | Read timeout for inference calls |
|
||||
| `app.nl-search.rate-limit.max-requests-per-minute` | `5` | Per-user rate limit |
|
||||
|
||||
### Trigger a canonical import
|
||||
|
||||
The importer no longer parses the raw spreadsheet. It consumes the **canonical artifacts**
|
||||
|
||||
Reference in New Issue
Block a user