Capture the why behind deploying Ollama to prod/staging compose: the
corrected init recipe (supersedes ADR-028 §10's never-functional curl
loop), the OLLAMA_KEEP_ALIVE=-1 pin (so a future maintainer doesn't
optimize it away and reintroduce the post-idle cold-load 503), the
30->60s timeout NFR, and the memswap==mem hard-OOM trade-off.
Addresses #759 review (Markus #3, Nora #2).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The diagram declared Container(ollama, ...) twice — an alias collision that
renders a duplicate box. It also declared the backend->ollama relationship
twice. Keep the richer 'Ollama LLM Service' declaration and the more
specific 'NL query parsing (POST /api/generate)' relationship; drop the
duplicates.
Addresses #759 review (Markus #2).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Mirror the prod hardening in the dev stack: guard the model pull with
`ollama list | grep -q <model>` so an already-cached model exits clean
without a registry round-trip. Keeps dev and prod on one recipe.
Addresses #759 review (Tobias #1).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The init command unconditionally ran `ollama pull`, which contacts the
registry to verify the manifest digest even when the model is already on
the volume. A host reboot during a registry/upstream-network blip would
then fail init non-zero, the `service_completed_successfully` gate would
never be met, and the ollama service (hence NL search) would stay down
until the registry was reachable again.
Guard the pull with `ollama list | grep -q <model>` so a cached model
exits clean without any registry round-trip.
Addresses #759 review (Tobias #1).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
docker-compose.prod.yml declares the volume as `ollama-models` (hyphen),
so the compose-project-prefixed name is `archiv-production_ollama-models`,
not the underscored `archiv-production_ollama_models` the model-upgrade
guide documented. The documented `docker volume rm` would not have matched
the real volume.
Addresses #759 review (Tobias #2).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
application.yaml sets app.ollama.timeout-seconds: 60 (raised from 30 to
absorb the cold model load on the first query after an Ollama restart),
but DEPLOYMENT.md still documented 30. A doc that contradicts the shipped
value is a traceability defect.
Addresses #759 review (Markus, Felix, Elicit).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
NL search recovered after deploy but went 503 again after a few minutes:
Ollama unloads the model after its default ~5 min keep-alive, so the next
query cold-loads the 4.7 GB model and exceeds the backend's 30s read
timeout (ResourceAccessException -> SMART_SEARCH_UNAVAILABLE). Warm
inference is ~18s; the cold load after idle is what timed out.
- docker-compose.{prod,yml}: set OLLAMA_KEEP_ALIVE=-1 on the ollama
service so the model stays resident and never pays a cold-load penalty
during normal operation (verified on staging: `ollama ps` -> UNTIL
"Forever"; host has 47 GB free).
- application.yaml: raise app.ollama.timeout-seconds 30 -> 60 so the one
unavoidable cold load (first query after an Ollama restart, before the
model is pinned) completes instead of timing out.
Refs #758
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
NL search returned 503 (SMART_SEARCH_UNAVAILABLE / "Intelligente Suche
nicht verfügbar") on staging because Ollama was never reachable.
Two defects, both downstream of #737:
1. Ollama was added only to the dev docker-compose.yml. Staging/prod
deploy from the self-contained docker-compose.prod.yml, which had no
ollama service — so the backend (defaulting to http://ollama:11434)
hit a non-existent host (ResourceAccessException -> 503).
2. The merged model-init recipe never worked: the ollama/ollama image
ENTRYPOINT is `ollama` (so `command: sh -c ...` ran as `ollama sh ...`
-> "unknown command sh"), and the image ships no curl (so both the
readiness loop and the healthcheck could never pass).
- docker-compose.prod.yml: add ollama-model-init + ollama services and
the ollama-models volume, with the corrected recipe (entrypoint
override to /bin/sh -c, `ollama list` for readiness and healthcheck).
- docker-compose.yml: fix the same broken entrypoint/command and the
curl healthcheck so the dev stack actually starts Ollama.
Verified on staging end-to-end: model-init exits 0, ollama healthy,
backend reaches /api/tags, inference succeeds within the 8g limit.
Refs #758
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Markus (architect): document SearchFilterBar + the search/ components
(SmartModeToggle, InterpretationChipRow, SmartSearchStatus,
DisambiguationPicker) and the POST /api/search/nl relation.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Leonie (UX): the toggle pill (text-[7.5px]) and loading subtitle
(text-[9px]) were below the 12px floor for the 60+ audience. Bump both
to text-xs and the toggle icon to h-3.5/w-3.5. Overrides the visual
spec's tokens, which conflicted with the issue's own legibility mandate.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Mock POST /api/search/nl (delayed fixture: 2-name directional + applied
keyword), assert loading announcement → chips render → axe-clean in light
and dark → removing the keyword chip re-runs a keyword GET with the
remaining sender+receiver params. Adds a data-testid wrapper on the NL
results region for axe scoping.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add the smart-search sub-component directory to the frontend Project
Structure tree (merge blocker per #739).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
SearchFilterBar drives chip-clearing via onModeToggle (mode switch) and
onSmartSearch (new query); pin that callback contract.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Lift smartMode to documents/+page.svelte and drive the full smart-search
lifecycle: POST /api/search/nl via csrfFetch, loading/error panels, chip
row, single-select disambiguation, and a transparent empty state. Chip
removal and disambiguation selection map the interpretation to keyword
params and re-run via GET (Option A in-page fallback). Mode toggle and
new queries reset prior interpretation.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add smartMode $bindable plus onSmartSearch/onModeToggle callbacks. The
toggle pill sits in the input's right slot (decorative icon moved to the
left); smart mode disables the live oninput keyword search, adds
maxlength=500, and submits the NL query on Enter. 4 integration specs.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Accessible disclosure: aria-expanded/aria-controls trigger, focus moves
into the option list on open, Escape and click-outside close and return
focus to the trigger, selecting a candidate emits onSelect. Single-select
(GET re-run) per the resolved#738 open decision — backend has no
multi-sender OR param. 5 vitest-browser-svelte specs.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Renders type-prefixed chips (Absender/Zeitraum/Stichwort), a single
directional chip for 2-name queries, gates keyword chips on
keywordsApplied, and emits onRemoveChip(type, value?). Truncating name
spans keep the 44px × button visible; chip wrappers show a focus ring.
9 vitest-browser-svelte specs (red/green).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Toggle pill with aria-pressed, active/resting styles matching the
AND/OR operator button pattern, and mobile-expanded KI/Text labels.
4 vitest-browser-svelte specs (red/green).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Toggle labels, loading panel, error panels (503/429), empty-state
retry, chip type-prefixes + remove label, and disambiguation strings
for the smart search UI (#739). Formal Sie form per project standard.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Both the first-time model pull runbook (from this branch) and the model
upgrade procedure (from main) belong in DEPLOYMENT.md.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers the SmartModeToggle pill (inside the search input, Google AI Mode
style), InterpretationChipRow anatomy, DisambiguationPicker, and all
status/error/empty states as full-result-area panels.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- error_smart_search_unavailable/rate_limited now use "Sie" (formal) to
match the tone of all existing German error messages
- Replace inline FQNs in DocumentService.buildPersonSpec with proper
JoinType + Predicate imports
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Switch to wiremock-jetty12 artifact and force ee10 Jetty deps to 12.1.8
to resolve compatibility with Spring Boot 4's Jetty 12.1.8 core.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Addresses Elicit's and Sara's review concerns on PR #749:
- Expand §6 ollama_models section into a full model upgrade runbook (step-by-step
docker volume rm + recreate, including production volume name prefix)
- Add re-deploy idempotency note to §3.4 (init container exits quickly when model
already present on the volume)
- Add NL search smoke test to §3.4 (curl command distinguishing 200 from 503
NL_SEARCH_UNAVAILABLE)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Updated OLLAMA_API_KEY env vars table from 0.6.5 to 0.6.5 or 0.30.6 to
match both tested versions. Added an explicit warning in §3.4 that
docker compose up -d --wait blocks for 60–90 min on first deploy when the
model pull has not yet completed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Both versions were tested and neither enforces the key. Comment updated to
say "0.6.5 or 0.30.6" and surface archiv-net as the sole effective control.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Hardcoded literal overrides any .env setting — setting APP_OLLAMA_BASE_URL=
in .env had no effect on the backend container. Now uses the same pattern
as APP_OCR_TRAINING_TOKEN with a safe default.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
§12 stated OLLAMA_API_KEY guards against lateral movement — contradicts
§7's empirical finding that it is not enforced. Replaced with an accurate
note referencing §7. Stale pre-merge placeholder in Consequences ("Three
TBD items must be resolved") removed; all three are resolved. §7 section
title updated from "0.6.5" to "0.6.5 and 0.30.6" to match the body text.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Docker Compose interpolates $VAR in command strings — use $$ to pass a
literal $ to the shell so SERVE_PID=$! and kill $SERVE_PID work correctly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- OLLAMA_API_KEY: non-enforcement confirmed on both 0.6.5 and 0.30.6
- read_only: true: confirmed working on both 0.6.5 and 0.30.6
- Peak RSS during pull: ~108 MiB (well under 2g limit)
- All TBD placeholders resolved
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- ollama-model-init: one-shot init container that pulls qwen2.5:7b-instruct-q4_K_M
into the ollama_models volume on first start
- ollama: main inference service on archiv-net (expose: only, no public port)
- ollama_models named volume for persistent model storage
- APP_OLLAMA_BASE_URL + APP_OLLAMA_API_KEY added to backend env
- Both services: cap_drop ALL, no-new-privileges, read_only+tmpfs (ADR-019 + ADR-028)
- start_period: 60s — model pre-pulled by init container
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>