feat(search): NL search backend — query parser, controller, rate limiting #738

Closed
opened 2026-06-06 12:15:42 +02:00 by marcel · 1 comment
Owner

Part of epic #735. Depends on infra issue (Ollama service) — add Gitea issue number when created.

Goal

Implement the Spring Boot backend for natural language search: a new search/ domain package that parses a German natural-language query via Ollama (Qwen 2.5 7B), resolves names to person UUIDs, and delegates to the existing DocumentService.searchDocuments(). The existing search endpoint is not modified.

Architecture

POST /api/search/nl
  → NlSearchController
    → NlQueryParserService → OllamaClient → Ollama HTTP API
    → PersonService.findByDisplayNameContaining() for each name
    → DocumentService.searchDocuments(senderId?, receiverId?, from?, to?, q?, pageable)
    → DocumentService.searchDocumentsByPersonId(personId, from?, to?, pageable)  ← new overload for OR semantics (personRole="any", single name — keywords not applied)
    → DocumentService.searchDocuments(senderId=person1, receiverId=person2, from?, to?, q?, pageable)  ← for 2-name queries where both resolve
  ← NlSearchResponse { DocumentSearchResult, NlQueryInterpretation }

ADR

Write docs/adr/028-nl-search-ollama.md before writing any code. Cover:

  • Why Ollama (vs. extending OCR service, vs. calling API)
  • CPU-only inference constraint and model selection (Qwen 2.5 7B Q4_K_M) — server has 64 GB RAM; no memory constraint
  • DB-blind name resolution (prompt stays small; person lookup is a cheap SQL query)
  • Prompt injection mitigation via grammar-constrained JSON output + maxLength constraints
  • Graceful degradation when Ollama is absent
  • Expected inference latency (2–15 seconds on CPU) — frontend must show a loading indicator
  • app.ollama.timeout-seconds=30 default and justification
  • NL query logging policy: log metadata only (query length, person count, latency) — never raw query content (PII — real person names)
  • Prompt-amplification abuse as the threat mitigated by rate limiting
  • Ollama model pre-pull requirement on first deploy (model not bundled in Docker image)
  • Startup dependency: ollama pull qwen2.5:7b-instruct-q4_K_M must run before first inference
  • Multi-name resolution heuristic: first extracted name = sender, second = receiver; personRole applies to single-name queries only — chosen over per-name roles because the most natural German phrasing ("Was hat Walter an Emma geschrieben?") strongly implies sender→receiver order, and per-name roles would require a combinatorially complex schema
  • personRole: "any" keyword limitation: keyword filtering is not supported for OR-semantics person queries — only person identity and date range are applied; keywordsApplied = false is returned in the response (KISS over completeness; disambiguation and date filters already narrow results significantly)
  • The search/ → person/ + document/ dependency direction is intentional: NlQueryParserService calls PersonService and DocumentService. These are cross-domain service calls, not repository leaks.

New package: org.raddatz.familienarchiv.search/

| Component | Type | Responsibility |
|---|---|---|
| OllamaClient | interface | NlQueryInterpretation parse(String query) |
| OllamaHealthClient | interface | boolean isHealthy(), called inline by NlQueryParserService before each inference call. No Actuator HealthIndicator bean needed, consistent with the OcrHealthClient pattern |
| RestClientOllamaClient | @Service implements OllamaClient, OllamaHealthClient | HTTP call to Ollama API with JSON schema enforcement; isHealthy() calls GET /api/tags (Ollama has no /health endpoint); degrades gracefully when Ollama is absent. Uses two separate RestClient instances: inference client (30 s timeout) and health-check client (2 s connect timeout); see OllamaProperties.healthCheckTimeoutSeconds. |
| NlQueryParserService | @Service | Orchestrates Ollama call → name resolution → document search; delegates to OllamaClient, PersonService, DocumentService |
| OllamaProperties | @Component @ConfigurationProperties("app.ollama") @Data | baseUrl, model, timeoutSeconds (inference, default: 30), healthCheckTimeoutSeconds (health-check RestClient only, default: 2). String fields (baseUrl, model) are null if absent from yaml, so explicit yaml entries are required. |
| NlSearchRateLimiter | @Service | Bucket4j + Caffeine, keyed on userKey (email string from principal.getUsername()), 5 req/min. Node-local: in multi-replica deployments the effective limit multiplies by replica count (same caveat as LoginRateLimiter). Has package-private resetForTest(), which calls cache.invalidateAll() (not key-based invalidation, which would couple the method to the @WithMockUser username value). |
| NlSearchRateLimitProperties | @Component @ConfigurationProperties("app.nl-search.rate-limit") @Data | maxRequestsPerMinute (default: 5) |
| NlSearchController | @RestController | POST /api/search/nl |
| NlSearchRequest | record | @NotBlank @Size(min=3, max=500) String query |
| PersonHint | record | UUID id, String displayName; lightweight person reference for search results, not a JPA projection |
| NlQueryInterpretation | record | List<PersonHint> resolvedPersons, List<PersonHint> ambiguousPersons, LocalDate dateFrom, LocalDate dateTo, List<String> keywords, String rawQuery, boolean keywordsApplied |
| NlSearchResponse | record | DocumentSearchResult result, NlQueryInterpretation interpretation |

Model OllamaClient / OllamaHealthClient / RestClientOllamaClient on the existing OcrClient / OcrHealthClient / RestClientOcrClient split.

Note on PersonHint: PersonSummaryDTO is a JPA interface projection — it cannot be constructed manually. PersonHint is a plain record built from Person entities by NlQueryParserService.

Config properties pattern: OllamaProperties and NlSearchRateLimitProperties follow the same pattern as RateLimitProperties (auth/RateLimitProperties.java): @Component, @ConfigurationProperties(...), @Data. All three annotations are required — without @Component the bean doesn't auto-register; without @Data the binder cannot set fields.

Ollama JSON schema (grammar-constrained)

{
  "type": "object",
  "required": ["personNames", "personRole", "keywords"],
  "properties": {
    "personNames": { "type": "array", "items": { "type": "string", "maxLength": 200 } },
    "personRole":  { "type": "string", "enum": ["sender", "receiver", "any"] },
    "dateFrom":    { "type": ["string", "null"], "maxLength": 20 },
    "dateTo":      { "type": ["string", "null"], "maxLength": 20 },
    "keywords":    { "type": "array", "items": { "type": "string", "maxLength": 100 } }
  }
}

Defense-in-depth: NlQueryParserService must also enforce these length limits in code before passing any LLM-extracted fragment to PersonRepository.searchByName().

Null-coalescing: Treat every field as nullable — personNames and keywords absent from the Ollama response must be coalesced to List.of() before processing. A fully absent response falls back to rawQuery as a keyword.

Defensive personRole parsing: If Ollama returns a value not in ["sender", "receiver", "any"] (e.g. due to model drift), log a warning and default to "any" rather than propagating a JsonMappingException. Use parameterized SLF4J: log.warn("Unexpected personRole from Ollama: {}", value) — never string concatenation (defends against log injection from LLM output).
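
The three rules above (length limits in code, null-coalescing, defensive personRole parsing) could be combined into one normalisation step that runs before any name lookup. The following is a minimal sketch under stated assumptions, not the planned implementation: RawOllamaOutput and OllamaOutputSanitizer are hypothetical names, over-long fragments are dropped here (the issue leaves truncate-vs-reject open), and the log.warn call is only indicated in a comment so the example has no dependencies.

```java
import java.util.List;
import java.util.Set;

// Hypothetical shape of the untrusted fields returned by the model.
record RawOllamaOutput(List<String> personNames, String personRole, List<String> keywords) {}

// Hypothetical helper; names and structure are assumptions for illustration.
final class OllamaOutputSanitizer {
    static final int MAX_NAME_LENGTH = 200;    // mirrors maxLength of personNames items in the schema
    static final int MAX_KEYWORD_LENGTH = 100; // mirrors maxLength of keywords items in the schema
    private static final Set<String> KNOWN_ROLES = Set.of("sender", "receiver", "any");

    private OllamaOutputSanitizer() {}

    // A null list becomes an empty list; null, blank, or over-long entries are dropped.
    static List<String> clean(List<String> values, int maxLength) {
        if (values == null) return List.of();
        return values.stream()
                .filter(v -> v != null && !v.isBlank() && v.length() <= maxLength)
                .map(String::strip)
                .toList();
    }

    // Any value outside the schema enum (including null) falls back to "any".
    static String normalizeRole(String raw) {
        if (raw != null && KNOWN_ROLES.contains(raw)) return raw;
        // In the real service: log.warn("Unexpected personRole from Ollama: {}", raw);
        return "any";
    }

    // Applies all rules at once; a fully absent response yields empty lists and "any".
    static RawOllamaOutput sanitize(RawOllamaOutput raw) {
        if (raw == null) return new RawOllamaOutput(List.of(), "any", List.of());
        return new RawOllamaOutput(
                clean(raw.personNames(), MAX_NAME_LENGTH),
                normalizeRole(raw.personRole()),
                clean(raw.keywords(), MAX_KEYWORD_LENGTH));
    }
}
```

The rawQuery-as-keyword fallback for a fully absent response is left to the caller in this sketch, since only the caller holds the original query.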

Name resolution

// PersonService: delegates to the existing PersonRepository.searchByName() JPQL
List<Person> findByDisplayNameContaining(String fragment);

PersonService.findByDisplayNameContaining(String fragment): List<Person> delegates to the existing PersonRepository.searchByName(fragment); no new JPQL is needed, so keep it as a one-liner. The existing query already covers first name + last name (both orderings), alias, and name aliases (maiden names) via LEFT JOIN p.nameAliases. This is a read method, so it carries no @Transactional annotation (consistent with the PersonService code style for read methods).

Max candidates cap: NlQueryParserService passes at most 10 persons to ambiguousPersons. If searchByName() returns more than 10 results, take the first 10 and log log.debug("Name '{}' matched {} persons; capping disambiguation at 10", name, results.size()). No new response fields — the frontend shows up to 10 disambiguation candidates. This prevents unusable disambiguation UIs for common surnames.

| Match | Behaviour |
|---|---|
| Single person | Resolved to PersonHint → senderId or receiverId based on personRole |
| Multiple persons (ambiguous, up to 10) | Search does NOT execute; empty DocumentSearchResult returned with matched persons (up to 10) as List<PersonHint> in ambiguousPersons; frontend shows disambiguation UI. Disambiguation takes priority regardless of personRole, even when personRole is "any". |
| No match | Name folded into keywords |
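
The match table above maps directly onto a three-way classification of the lookup result for a single name. A minimal sketch follows, assuming hypothetical NameMatch and NameMatchClassifier types (only PersonHint and the cap of 10 come from this issue); it takes the list that searchByName() would return, already converted to PersonHint.

```java
import java.util.List;
import java.util.UUID;

// PersonHint as specified in this issue.
record PersonHint(UUID id, String displayName) {}

// Hypothetical outcome type for one LLM-extracted name fragment.
sealed interface NameMatch {
    record Resolved(PersonHint person) implements NameMatch {}
    record Ambiguous(List<PersonHint> candidates) implements NameMatch {}
    record NoMatch(String name) implements NameMatch {}
}

// Hypothetical helper illustrating the table; not the planned service code.
final class NameMatchClassifier {
    static final int MAX_CANDIDATES = 10; // disambiguation cap from this issue

    private NameMatchClassifier() {}

    static NameMatch classify(String name, List<PersonHint> matches) {
        if (matches.isEmpty()) {
            return new NameMatch.NoMatch(name); // caller folds the name into keywords
        }
        if (matches.size() == 1) {
            return new NameMatch.Resolved(matches.get(0));
        }
        // Two or more matches: keep only the first MAX_CANDIDATES for the disambiguation UI.
        int limit = Math.min(matches.size(), MAX_CANDIDATES);
        return new NameMatch.Ambiguous(List.copyOf(matches.subList(0, limit)));
    }
}
```

Modelling the outcome as a sealed type lets the caller handle the three rows exhaustively instead of re-checking list sizes in several places.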

personRole: "any" (single match) → call DocumentService.searchDocumentsByPersonId(UUID personId, LocalDate from, LocalDate to, Pageable pageable) — queries documents where that person is sender OR receiver. Keywords from the NL interpretation are not applied for this path — only person identity and date range filter results. Set keywordsApplied = false in the returned NlQueryInterpretation. The frontend must not show keyword chips when keywordsApplied == false.

Date strings from Ollama arrive either as a bare year ("1914") or as an ISO date ("1914-01-01") and are parsed to LocalDate. A bare year covers the whole year: from = 1914-01-01, to = 1914-12-31; if the query specifies a range, each bound is parsed separately. Store as LocalDate in NlQueryInterpretation, serialized to ISO-8601 in the response. Malformed date strings that cannot be parsed to a year or ISO date are treated as null (not folded into keywords).
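
A minimal sketch of the date handling described above, assuming a hypothetical LlmDateParser helper. It accepts a four-digit year or an ISO date, expands a bare year to the first or last day depending on which bound is being parsed, and returns Optional.empty() where the text above says null. How a single year supplied for only one bound is copied to the other bound is left to the caller here.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical helper; the name and the Bound enum are assumptions for illustration.
final class LlmDateParser {
    enum Bound { FROM, TO }

    private LlmDateParser() {}

    // Accepts "yyyy" or "yyyy-MM-dd"; anything else yields Optional.empty().
    static Optional<LocalDate> parse(String raw, Bound bound) {
        if (raw == null) return Optional.empty();
        String value = raw.strip();
        if (value.matches("\\d{4}")) {
            int year = Integer.parseInt(value);
            // A bare year becomes Jan 1 for the lower bound and Dec 31 for the upper bound.
            return Optional.of(bound == Bound.FROM
                    ? LocalDate.of(year, 1, 1)
                    : LocalDate.of(year, 12, 31));
        }
        try {
            return Optional.of(LocalDate.parse(value)); // strict ISO-8601 local date
        } catch (DateTimeParseException e) {
            return Optional.empty(); // malformed input, e.g. "ca. 1910"
        }
    }
}
```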

Multi-name query resolution

When personNames contains exactly 2 entries (e.g. ["Walter", "Emma"] from "Was hat Walter an Emma geschrieben?"):

  • Both resolve → AND semantics: first name treated as sender, second as receiver. Calls DocumentService.searchDocuments(senderId=person1, receiverId=person2, ...). Both appear in resolvedPersons.
  • One resolves, one is ambiguous → disambiguation required: search does NOT execute. The resolved person appears in resolvedPersons; the ambiguous name's candidates appear in ambiguousPersons. Frontend shows disambiguation UI for the ambiguous name. The user must resolve the ambiguity before the search proceeds — the resolved name does not trigger a partial search.
  • One has no match: the unresolved name is folded into keywords, and the resolved name is used as a single-name search with personRole. If neither name matches, both are folded into keywords and the keyword-only path runs.
  • 3+ names: first two entries follow the 2-name rule above; remaining names are folded into keywords.

personRole from the Ollama schema applies only to single-name queries. For 2-name queries, roles are implicit: first=sender, second=receiver.

Consistency rule: ambiguousPersons non-empty always means the search result is empty and disambiguation UI is shown — regardless of whether any other names in the query resolved cleanly.

Response list order semantics: in NlQueryInterpretation.resolvedPersons, index 0 is the sender candidate and index 1 is the receiver candidate for 2-name queries. The frontend interpretation chip must render this directionally (e.g. "Walter → Emma"), not as an unordered list.
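
The multi-name rules can be read as a single pass over the per-name lookup results. The sketch below is one possible reading, not the planned implementation: NameLookup, SearchPlan, and MultiNamePlanner are hypothetical names, and the cap of 10 candidates is applied per ambiguous name (the issue does not say how two ambiguous names share the cap).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// PersonHint as specified in this issue.
record PersonHint(UUID id, String displayName) {}

// Hypothetical: one lookup result per extracted name (0 matches, 1 match, or several).
record NameLookup(String name, List<PersonHint> matches) {}

// Hypothetical summary of what the service should do next.
record SearchPlan(
        boolean executeSearch,             // false whenever disambiguation is required
        List<PersonHint> resolvedPersons,  // index 0 = sender, index 1 = receiver for 2-name queries
        List<PersonHint> ambiguousPersons,
        List<String> keywords) {}

final class MultiNamePlanner {
    static final int MAX_CANDIDATES = 10;

    private MultiNamePlanner() {}

    static SearchPlan plan(List<NameLookup> lookups, List<String> keywords) {
        List<PersonHint> resolved = new ArrayList<>();
        List<PersonHint> ambiguous = new ArrayList<>();
        List<String> allKeywords = new ArrayList<>(keywords);
        for (int i = 0; i < lookups.size(); i++) {
            NameLookup lookup = lookups.get(i);
            int count = lookup.matches().size();
            if (i >= 2 || count == 0) {
                // Third and later names, and names without a match, become keywords.
                allKeywords.add(lookup.name());
            } else if (count == 1) {
                resolved.add(lookup.matches().get(0)); // insertion order keeps sender before receiver
            } else {
                ambiguous.addAll(lookup.matches().subList(0, Math.min(count, MAX_CANDIDATES)));
            }
        }
        // Consistency rule: any ambiguity suppresses the search entirely.
        return new SearchPlan(ambiguous.isEmpty(),
                List.copyOf(resolved), List.copyOf(ambiguous), List.copyOf(allKeywords));
    }
}
```

With this shape, the caller chooses the search path from resolvedPersons.size() (0 = keyword-only, 1 = single-name with personRole, 2 = sender + receiver) only when executeSearch is true.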

DocumentService changes

Add a new overload using JPQL (not native SQL):

// DocumentRepository
@Query("""
    SELECT DISTINCT d FROM Document d
    WHERE (d.sender = :person OR :person MEMBER OF d.receivers)
      AND (:from IS NULL OR d.documentDate >= :from)
      AND (:to IS NULL OR d.documentDate <= :to)
    ORDER BY d.documentDate DESC
    """)
Page<Document> findBySenderOrReceiver(
    @Param("person") Person person,
    @Param("from") LocalDate from,
    @Param("to") LocalDate to,
    Pageable pageable);

// DocumentService — keywords not supported for personRole:"any" (JPQL has no text predicate; KISS over completeness)
public DocumentSearchResult searchDocumentsByPersonId(UUID personId, LocalDate from, LocalDate to, Pageable pageable)

Use JPQL MEMBER OF — do NOT use native SQL ANY() which is PostgreSQL-specific. Date predicates must be in the JPQL query itself — post-query filtering breaks pagination correctness.

Call-site defaults for NlQueryParserService

Keywords → text join: String.join(" ", interpretation.keywords()) maps to websearch_to_tsquery AND semantics in PostgreSQL — "Krieg Walter" finds documents mentioning both. An empty list produces an empty string, which has no effect on the FTS predicate.

Keyword-only path (no resolved persons):

documentService.searchDocuments(
    String.join(" ", interpretation.keywords()),  // text: keywords space-joined
    interpretation.dateFrom(), interpretation.dateTo(),
    null, null,                                   // sender, receiver
    List.of(), null, null,                        // tags, tagQ, status
    DocumentSort.DATE, "desc", TagOperator.AND,   // sort, dir, tagOperator
    false, pageable                               // undated, pageable
);

2-name resolved path (both names resolve, sender + receiver):

documentService.searchDocuments(
    String.join(" ", interpretation.keywords()),
    interpretation.dateFrom(), interpretation.dateTo(),
    person1Id, person2Id,                         // first=sender, second=receiver
    List.of(), null, null,
    DocumentSort.DATE, "desc", TagOperator.AND,
    false, pageable
);

The Mockito verify() calls in NlQueryParserServiceTest must assert the exact values of sort, dir, tagOperator, status, and undated — not any(). Without explicit matchers the test passes vacuously for wrong defaults.

Error codes

| Code | HTTP | Condition |
|---|---|---|
| SMART_SEARCH_UNAVAILABLE | 503 | Ollama unreachable or timed out |
| SMART_SEARCH_RATE_LIMITED | 429 | Exceeds 5 NL search requests per user per minute |
| VALIDATION_ERROR | 400 | Blank query or query < 3 / > 500 characters |

Add each to: ErrorCode.java → errors.ts → getErrorMessage() → messages/{de,en,es}.json.

i18n guidance:

  • SMART_SEARCH_UNAVAILABLE: de: "Die intelligente Suche ist momentan nicht verfügbar. Bitte nutze die normale Suche."; en: "The smart search is currently unavailable. Please use the regular search."; es: "La búsqueda inteligente no está disponible en este momento. Por favor, usa la búsqueda normal."
  • SMART_SEARCH_RATE_LIMITED: de: "Du hast die Suchfunktion zu häufig genutzt. Bitte warte eine Minute."; en: "You have used the search function too frequently. Please wait a minute."; es: "Has utilizado la función de búsqueda demasiadas veces. Por favor, espera un minuto."
  • smart_search_keywords_not_applied (for keywordsApplied == false frontend display): de: "Schlüsselwörter konnten bei dieser Suche nicht berücksichtigt werden."; en: "Keywords could not be applied to this search."; es: "Las palabras clave no pudieron aplicarse a esta búsqueda."

Rate limiting

New NlSearchRateLimiter bean — do not reuse LoginRateLimiter (keyed on ip + ":" + email). Use the same Bucket4j + Caffeine pattern, keyed on principal.getUsername() (email string — stable, injection-proof, already guaranteed by the session). Config via NlSearchRateLimitProperties (@ConfigurationProperties("app.nl-search.rate-limit")), default maxRequestsPerMinute = 5. Return HTTP 429 with SMART_SEARCH_RATE_LIMITED when exceeded.

Add a package-private resetForTest() method that calls cache.invalidateAll() — not key-based invalidation (which would couple the method to the @WithMockUser(username = "testuser") value). invalidateAll() is simpler and avoids this coupling.
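
To illustrate the contract only (per-user key, 5 requests per minute, a reset hook for tests), here is a dependency-free stand-in. It is explicitly not the Bucket4j + Caffeine implementation this issue calls for: it swaps in a simple fixed-window counter over a ConcurrentHashMap, and the class name, tryConsume(), and the injected nano clock are all hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Illustration only: fixed-window counter, NOT the Bucket4j + Caffeine limiter from this issue.
final class FixedWindowRateLimiterSketch {
    private record Window(long startNanos, int used) {}

    private static final long WINDOW_NANOS = TimeUnit.MINUTES.toNanos(1);

    private final ConcurrentHashMap<String, Window> windows = new ConcurrentHashMap<>();
    private final int maxRequestsPerMinute;
    private final LongSupplier nanoClock; // injected so tests can control time

    FixedWindowRateLimiterSketch(int maxRequestsPerMinute, LongSupplier nanoClock) {
        this.maxRequestsPerMinute = maxRequestsPerMinute;
        this.nanoClock = nanoClock;
    }

    // Returns true if the request is allowed; false corresponds to HTTP 429.
    boolean tryConsume(String userKey) {
        long now = nanoClock.getAsLong();
        Window updated = windows.compute(userKey, (key, current) -> {
            if (current == null || now - current.startNanos() >= WINDOW_NANOS) {
                return new Window(now, 1); // first request of a fresh window
            }
            // Count the request, but stop growing once the limit is exceeded.
            int used = Math.min(current.used() + 1, maxRequestsPerMinute + 1);
            return new Window(current.startNanos(), used);
        });
        return updated.used() <= maxRequestsPerMinute;
    }

    // Clears every key, so tests do not need to know the mock username.
    void resetForTest() {
        windows.clear();
    }
}
```

Clearing all entries in resetForTest() mirrors the reasoning above for cache.invalidateAll(): the test does not have to know which key the mock user produced.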

Security

  • @RequirePermission(Permission.READ_ALL) on POST /api/search/nl — this is a read operation, not WRITE_ALL.
  • Log only metadata: log.debug("NL search: queryLength={}, personNamesCount={}, latencyMs={}", ...) — never log the raw query (PII).
  • Do not expose Ollama port (11434) in the Compose file — use expose: not ports:. Hard requirement: the Ollama inference API has no authentication by default; ports: would make it reachable from the host network, allowing arbitrary model inference.

Controller — @AuthenticationPrincipal wiring

Do NOT use @AuthenticationPrincipal AppUser. CustomUserDetailsService.loadUserByUsername() returns new User(email, password, authorities) — a Spring User, not AppUser. At runtime, @AuthenticationPrincipal AppUser always resolves to null; the rate limiter call would NPE on every request, causing rate limiting to fail open.

Correct approach: use @AuthenticationPrincipal UserDetails principal and pass principal.getUsername() (email string) to NlSearchRateLimiter.checkAndConsume(String userKey). This matches the InviteController pattern. In @WebMvcTest, @WithMockUser(username = "testuser", authorities = {"READ_ALL"}) works directly — no @WithUserDetails or UserDetailsService mocking needed.

WireMock dependency

Add org.wiremock:wiremock version 3.9.x (not wiremock-standalone) as test scope in pom.xml as the first commit on the implementation branch. Verify the coordinate before committing: run mvn dependency:get -Dartifact=org.wiremock:wiremock:3.9.2:jar to confirm it resolves from Maven Central — the old com.github.tomakehurst groupId was retired in the 3.x repackage. The standalone artifact bundles its own Jackson and conflicts with Spring Boot's Jackson on the classpath. WireMock 3.9.x is compatible with Java 21 and Spring Boot 4 — verify the latest stable 3.9.x patch on Maven Central before pinning.

Configuration — application.yaml

@ConfigurationProperties("app.ollama") requires explicit yaml entries (unlike @Value which supports annotation-level defaults). String fields are null if the key is absent. Add to application.yaml:

app:
  ollama:
    base-url: http://ollama:11434
    model: qwen2.5:7b-instruct-q4_K_M
    timeout-seconds: 30
    health-check-timeout-seconds: 2

application-dev.yaml already exists at backend/src/main/resources/application-dev.yaml — add to it, do not create a new file. Add the dev override key for local development (where Ollama runs on the host, not inside Docker):

app:
  ollama:
    base-url: http://localhost:11434

(health-check-timeout-seconds does not need a dev override — 2 seconds is appropriate in all environments.)

Test architecture

Commit order:

  1. org.wiremock:wiremock in pom.xml (test scope) — verify coordinate org.wiremock:wiremock (not the retired com.github.tomakehurst groupId)
  2. PersonService.findByDisplayNameContaining() with PersonServiceTest (red → green)
  3. ADR-028
    3.5. docs/architecture/c4/l3-backend-search.puml — create the C4 L3 diagram for the search domain while the architecture is fresh. Follow the naming convention of existing l3-backend-*.puml files in docs/architecture/c4/.
  4. Domain records + interfaces (NlQueryInterpretation, PersonHint, NlSearchResponse, OllamaClient, OllamaHealthClient)
  5. OllamaProperties + NlSearchRateLimiter + NlSearchRateLimitProperties + config yaml entries
  6. RestClientOllamaClient with RestClientOllamaClientTest (WireMock)
  7. NlQueryParserService with NlQueryParserServiceTest (Mockito)
  8. NlSearchController with NlSearchControllerTest (@WebMvcTest)
  9. DocumentRepository.findBySenderOrReceiver + DocumentService.searchDocumentsByPersonId + integration test

Factory helpers — write these before the first test:

  • makePersonHint(UUID id, String displayName) — builds a PersonHint record
  • makePerson(String firstName, String lastName) — builds a Person entity
  • makeOllamaResponseJson(String... names) — for RestClientOllamaClientTest WireMock stubs
  • makeInterpretation(...) — for controller test stubs

Writing these helpers first removes most of the boilerplate that would otherwise be duplicated across the test cases in the plan below.

JaCoCo gate: 77% (not 88% — backend README states 88% as an aspirational target; pom.xml gates at 0.77). NlQueryParserService has many branches and its test plan covers all permutations — it will push coverage up, not threaten the gate.

  • NlQueryParserServiceTest (@ExtendWith(MockitoExtension.class)) — unit tests for service logic; no Spring context.
  • NlSearchControllerTest (@WebMvcTest(NlSearchController.class) + @Import({SecurityConfig.class, PermissionAspect.class, AopAutoConfiguration.class})) — controller-layer tests; uses @MockitoBean OllamaClient (not deprecated @MockBean). Call rateLimiter.resetForTest() in @BeforeEach to prevent bucket state from leaking between test methods. Package placement: NlSearchControllerTest must be in org.raddatz.familienarchiv.search (not a test-specific subpackage) to access the package-private resetForTest() method.
  • RestClientOllamaClientTest (real WireMockServer) — tests HTTP client behaviour: timeout, error responses, JSON parsing.

Test plan — NlQueryParserServiceTest:

  • Happy path: single-match name → PersonHint in resolvedPersons
  • Multi-match name → ambiguousPersons non-empty, search result is empty
  • Multi-match name + personRole: "any" → same as multi-match: ambiguousPersons non-empty, search does NOT execute (explicit test name: should_not_execute_search_when_name_is_ambiguous_even_if_personRole_is_any)
  • No-match name → folded into keywords
  • personRole: "any" (single match) → searchDocumentsByPersonId overload is called; verify keywordsApplied == false in returned NlQueryInterpretation
  • 2-name query, both resolve → searchDocuments(senderId=person1, receiverId=person2) called; positional assertion: resolvedPersons.get(0) is the sender candidate and resolvedPersons.get(1) is the receiver candidate. Assert by index, not just presence in list
  • 2-name query, first resolves, second is ambiguous → search does NOT execute; verify DocumentService is never called (zero invocations); first person in resolvedPersons, second name's candidates in ambiguousPersons
  • 2-name query, first has no match → first folded into keywords, second used as single-name search
  • 3+ names, explicit third name: personNames = ["Walter", "Emma", "Heinrich"], Walter and Emma both resolve → assert DocumentService.searchDocuments(senderId=Walter, receiverId=Emma, text="Heinrich", ...) called; "Heinrich" is in the space-joined keyword string passed as the text argument
  • Keyword list space-join (standalone test): input keywords=["Krieg", "Walter"] → assert documentService.searchDocuments(eq("Krieg Walter"), ...) called — pins String.join(" ", keywords) behavior. Write as a standalone test, not part of a larger happy-path test.
  • Date extraction: "1914–1918" → LocalDate from/to mapping
  • Malformed date string (e.g. "ca. 1910") → treated as null, not folded into keywords
  • Blank/under-3-char query → throws VALIDATION_ERROR before calling Ollama
  • Query over 500 chars → throws VALIDATION_ERROR
  • Ollama returns all-null/empty → raw query used as keyword fallback
  • Ollama returns null personNames/keywords fields → null-coalesced to empty list, no NPE
  • Ollama returns unrecognized personRole → defaults to "any", logs warning
  • Ollama times out → SMART_SEARCH_UNAVAILABLE
  • LLM-extracted name longer than 200 chars → truncated/rejected before PersonRepository call

Test plan — NlSearchControllerTest (@WebMvcTest):

  • Full request → response shape with stubbed OllamaClient; assert NlQueryInterpretation.resolvedPersons, ambiguousPersons, keywords, dateFrom, dateTo, keywordsApplied in response body
  • ambiguousPersons response: verify PersonHint shape (id + displayName)
  • Unauthenticated request → 401
  • Query under 3 chars or over 500 chars → 400 with VALIDATION_ERROR
  • @MockitoBean OllamaClient returns error → 503 with SMART_SEARCH_UNAVAILABLE
  • 6th request within 1 minute → 429 with SMART_SEARCH_RATE_LIMITED; use @WithMockUser(username = "testuser", authorities = {"READ_ALL"}) — works directly since rate limiter is keyed on the email string

Test plan — RestClientOllamaClientTest (WireMock):

  • Ollama returns valid JSON → parsed correctly into NlQueryInterpretation
  • Ollama returns HTTP 500 → SMART_SEARCH_UNAVAILABLE
  • Ollama exceeds timeout (WireMock fixed delay > timeoutSeconds) → SMART_SEARCH_UNAVAILABLE
  • Ollama returns malformed/truncated JSON → SMART_SEARCH_UNAVAILABLE (not a parse exception escaping to the controller). Stub must include Content-Type: application/json — without it Jackson may not attempt parsing and the error path differs.

Test plan — DocumentRepository.findBySenderOrReceiver integration test (Testcontainers, real Postgres):

  • Person is sender only → document appears in result
  • Person is receiver only → document appears in result
  • Person is both sender and receiver on the same document → document appears exactly once (DISTINCT)
  • Date range filter applied → documents outside the range excluded
  • No documents match the person → empty page returned

Infrastructure notes (for infra issue)

When the Ollama Compose service is defined:

  • Pin image tag (ollama/ollama:0.5.x) — not :latest; add to Renovate config (same pattern as existing services in renovate.json)
  • Named volume ollama_models: — persists the downloaded model across restarts
  • Use expose: ["11434"] not ports: — internal network only, never Caddy-routed. Hard requirement: the Ollama inference API has no authentication by default; ports: would expose it to the host network, allowing arbitrary model inference by anyone who can reach the host.
  • Healthcheck on GET http://localhost:11434/api/tags; start_period: 120s for model loading (weight loading from SSD takes 20–60 s on the current hardware; 120 s provides ample margin)
  • Model pre-pull on first deploy: ollama pull qwen2.5:7b-instruct-q4_K_M — must complete before backend starts or backend will 503 on all NL search requests. Add as explicit checklist in DEPLOYMENT.md: (1) docker compose up -d ollama; (2) pull model (allow 10–30 min, ~4.5 GB): docker exec <ollama-container> ollama pull qwen2.5:7b-instruct-q4_K_M; (3) verify: curl http://localhost:11434/api/tags; (4) docker compose up -d backend.
  • backend.depends_on on Ollama healthcheck: declare depends_on: ollama: condition: service_healthy to prevent a boot-time 503 storm. Note: service_healthy confirms Ollama is responding to /api/tags — it does NOT confirm the model is downloaded. If the ollama pull step was skipped, inference returns 404 and all NL search requests 503 silently until the pull completes.
  • Backup exclusion: add ollama_models to the backup exclusion list in the backup runbook — model weights are re-downloadable from the Ollama registry and are not user data; they must not be included in pg_dump or Hetzner S3 backup flows

Documentation

  • Add search/ to the CLAUDE.md package structure table
  • Create docs/architecture/c4/l3-backend-search.puml — follow the naming convention of existing l3-backend-*.puml files; create in commit 3.5 (after ADR, before domain records)
  • Update docs/architecture/c4/l2-containers.puml (new Ollama container)
  • Update docs/architecture/c4/l1-context.puml (new external system: Ollama)
  • Add SMART_SEARCH_UNAVAILABLE, SMART_SEARCH_RATE_LIMITED to CLAUDE.md and docs/ARCHITECTURE.md
  • Add NlSearch, NlQueryInterpretation, PersonHint to docs/GLOSSARY.md
  • Add Ollama section to docs/DEPLOYMENT.md (model pre-pull runbook from Infrastructure notes above, volume management, update procedure)

Response records

All fields the backend always populates need @Schema(requiredMode = REQUIRED). Run npm run generate:api in the same PR.

Frontend contract note: ambiguousPersons non-empty → disambiguation UI, results suppressed. resolvedPersons non-empty (and ambiguousPersons empty) → interpretation chip shown. Both empty → keyword/date-only search. For 2-name queries, resolvedPersons[0] = sender, resolvedPersons[1] = receiver — the chip must render the directionality ("Walter → Emma"), not just the names. When keywordsApplied == false (single-name personRole: "any" queries), keyword chips must NOT be shown — keywords were parsed but not used to filter results.

Frontend UX notes (for the frontend issue):

  • keywordsApplied == false rendering: do not silently omit the parsed keywords. Show a secondary text line below the interpretation chip using the i18n key smart_search_keywords_not_applied. Silent omission confuses 60+ users who said "Krieg" and see no explanation for why it was ignored.

  • Disambiguation UI: the frontend issue must specify an interaction pattern. A modal with large text and a clear confirmation button is recommended for the 60+ audience — inline DOM changes below the search bar are less disorienting but harder to discover. ambiguousPersons contains at most 10 candidates (capped in NlQueryParserService). Either way, the pattern must be decided before writing the frontend issue.

  • Loading state (2–15 seconds): use aria-live="polite" with a persistent, non-dismissable "Suche läuft…" message. Do not use a toast or auto-dismissing spinner — the search takes up to 15 seconds and users need continuous feedback. Test with axe-playwright before marking the frontend issue done.

  • Timeout fallback CTA (30 seconds): when SMART_SEARCH_UNAVAILABLE (503) arrives after the timeout, show an actionable message with a link to the regular search: de: "Die Suche hat zu lange gedauert — bitte versuche es noch einmal oder nutze die normale Suche." Include this as an acceptance criterion in the frontend issue.

Acceptance Criteria

  • POST /api/search/nl with {"query": "Was hat walter im krieg geschrieben?"} returns a NlSearchResponse containing matching documents and interpretation chips
  • A query with only keywords and no person name (e.g. "Briefe aus dem Krieg") returns documents filtered by keywords and/or date range via searchDocuments(text="Briefe aus dem Krieg", sender=null, receiver=null, from=null, to=null, ...) — sender and receiver are null
  • A query where all extracted person names have no database match (e.g. Ollama extracts ["Schmidt"] but no Schmidts exist) returns documents filtered by keywords and date range only, with the unmatched name(s) folded into the keywords list
  • A query with an ambiguous name returns an empty document result with the ambiguous persons in NlQueryInterpretation.ambiguousPersons as PersonHint objects (id + displayName, up to 10 candidates) — the frontend must prompt the user to disambiguate
  • A query where personRole is "any" and the name resolves to exactly one person returns documents where that person is sender OR receiver (via searchDocumentsByPersonId); keywordsApplied is false in the response
  • A query where personRole is "any" and the name is ambiguous behaves identically to the regular ambiguous case — empty result + ambiguousPersons list; disambiguation takes priority regardless of personRole
  • 2-name query where both resolve (e.g. "Was hat Walter an Emma geschrieben?"): returns documents where Walter is sender AND Emma is receiver
  • 2-name query with partial ambiguity (e.g. Walter resolves, Emma is ambiguous): returns empty result with Walter in resolvedPersons and Emma's candidates in ambiguousPersons — user must pick before search proceeds
  • Ollama unavailable returns 503 with error code SMART_SEARCH_UNAVAILABLE
  • A query over 500 characters or under 3 characters returns 400 with VALIDATION_ERROR
  • The 6th NL search request within 1 minute returns 429 with SMART_SEARCH_RATE_LIMITED
  • GET /api/documents/search behaviour is unchanged
  • NlQueryInterpretation.dateFrom and dateTo are serialized as ISO-8601 date strings ("1914-01-01") when present, null when absent
marcel added this to the Archive Intelligence — NL Search milestone 2026-06-06 12:15:42 +02:00
marcel added the P2-medium and feature labels 2026-06-06 12:16:36 +02:00
Author
Owner

Implementation complete

Branch: worktree-feat+issue-738-nl-search-backend

Commits

  1. feat(person): add findByDisplayNameContaining service method — one-liner wrapper delegating to existing PersonRepository.searchByName()
  2. docs(adr): ADR-028 — NL search via Ollama — covers Qwen 2.5 7B, grammar-constrained JSON, CPU-only inference, graceful degradation
  3. docs(c4): add L3 backend search component diagram — all 8 search-package components + relations to PersonService, DocumentService, PostgreSQL
  4. feat(search): add NL search domain records and OllamaClient interfaces — OllamaClient, OllamaHealthClient, PersonHint, NlQueryInterpretation, NlSearchResponse, NlSearchRequest, OllamaExtraction
  5. feat(search): add NL search error codes and i18n strings — SMART_SEARCH_UNAVAILABLE (503), SMART_SEARCH_RATE_LIMITED (429) in ErrorCode.java, errors.ts, and all three message files
  6. feat(search): add Ollama and rate-limit config properties — OllamaProperties, NlSearchRateLimitProperties, application.yaml + application-dev.yaml
  7. feat(search): add NlSearchRateLimiter with Bucket4j/Caffeine — 5 req/min per user, package-private resetForTest()
  8. feat(search): implement RestClientOllamaClient with WireMock tests — grammar-constrained POST /api/generate, 2 s health-check client, graceful degradation on timeout/500/malformed JSON
  9. feat(search): implement NlQueryParserService with Mockito tests (23 cases) — full name resolution algorithm: single-match / ambiguous / no-match / role-based sender+receiver / 3+ names → extra fragments; keywordsApplied flag; 200-char name guard; 10-candidate cap
  10. feat(search): implement NlSearchController with @WebMvcTest tests (7 cases) — POST /api/search/nl, @RequirePermission(READ_ALL), rate limiter wired to principal.getUsername()
  11. feat(search): add searchDocumentsByPersonId with Specification-based sender/receiver query — avoids PostgreSQL null type-inference issue; DISTINCT via query.distinct(true); 5 DocumentRepository integration tests
  12. feat(search): add @Schema annotations and regenerate TypeScript API types — NlSearchRequest, NlQueryInterpretation, NlSearchResponse, PersonHint now in generated api.ts
  13. docs(search): update CLAUDE.md, GLOSSARY, DEPLOYMENT, and C4 diagrams — search/ package entry, Ollama runbook, C4 L1+L2 updates

Test results (all passing)

  • NlQueryParserServiceTest — 23 tests
  • NlSearchControllerTest — 7 tests
  • RestClientOllamaClientTest — 4 tests
  • NlSearchRateLimiterTest — 4 tests
  • DocumentRepositoryTest — 34 tests (incl. 5 new person-spec tests)

What was built

  • POST /api/search/nl endpoint — natural language query → structured NlSearchResponse
  • Ollama integration via RestClientOllamaClient with grammar-constrained JSON schema (Qwen 2.5 7B)
  • Name-to-UUID resolution: resolves person names against DB, handles ambiguous/no-match cases, supports sender+receiver role detection
  • Per-user rate limiting: 5 req/min via Bucket4j + Caffeine
  • Graceful degradation: Ollama down/timeout → 503 SMART_SEARCH_UNAVAILABLE
  • Full i18n (de/en/es) for both new error codes + smart_search_keywords_not_applied
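
To illustrate the per-user limit of 5 requests per minute listed above, here is a minimal JDK-only sketch. It is a stand-in, not the project's `NlSearchRateLimiter`: the real implementation uses Bucket4j with a Caffeine cache, whereas this version uses a fixed one-minute window per username in a `ConcurrentHashMap`, never evicts idle users, and takes an injectable clock so the behaviour can be checked without waiting. The class name `FixedWindowLimiterSketch` and its constructor parameters are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

final class FixedWindowLimiterSketch {

    // Immutable per-user state: when the current window began and how many requests it has seen.
    private record Window(long startMillis, int used) {}

    private final int limit;
    private final long windowMillis;
    private final LongSupplier clock; // supplies "now" in milliseconds
    private final Map<String, Window> windows = new ConcurrentHashMap<>();

    FixedWindowLimiterSketch(int limit, long windowMillis, LongSupplier clock) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.clock = clock;
    }

    /** Returns true if the request is allowed, false if the caller should answer 429. */
    boolean tryAcquire(String username) {
        long now = clock.getAsLong();
        boolean[] allowed = {false};
        // compute() runs atomically per key, so concurrent requests of one user cannot overshoot.
        windows.compute(username, (user, window) -> {
            if (window == null || now - window.startMillis() >= windowMillis) {
                window = new Window(now, 0); // first request, or the old window has expired
            }
            if (window.used() < limit) {
                allowed[0] = true;
                return new Window(window.startMillis(), window.used() + 1);
            }
            return window; // limit reached: state unchanged
        });
        return allowed[0];
    }
}
```

With `new FixedWindowLimiterSketch(5, 60_000L, System::currentTimeMillis)`, the sixth `tryAcquire("marcel")` call inside one minute returns `false`, which is the point at which the controller would respond with `SMART_SEARCH_RATE_LIMITED`. A fixed window allows short bursts across a window boundary; a token bucket such as Bucket4j refills differently, so the two are not equivalent in detail.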

Notes

  • searchDocumentsByPersonId uses a JPA Specification instead of JPQL to avoid PostgreSQL's null parameter type-inference issue with IS NULL OR patterns
  • The frontend implementation (#735) can now consume NlSearchRequest / NlSearchResponse types from the regenerated api.ts
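
The first note can be illustrated without JPA. A query written as `(:from IS NULL OR date >= :from)` always binds the parameter, even when it is null, and per the note that is where PostgreSQL's type inference trips. Building the filter programmatically sidesteps this because a condition is only emitted when its value is present. The sketch below shows that idea with a plain string builder; it is an analogy to the Specification approach, not the project's code, and `PersonFilterSketch`, `SqlFilter`, and the column names `sender_id`, `receiver_id`, and `document_date` are all hypothetical.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// A WHERE clause plus its positional parameters, in order.
record SqlFilter(String where, List<Object> params) {}

final class PersonFilterSketch {

    /** Builds "person is sender OR receiver" and adds date bounds only when they are given. */
    static SqlFilter byPerson(UUID personId, LocalDate from, LocalDate to) {
        List<String> clauses = new ArrayList<>();
        List<Object> params = new ArrayList<>();

        // OR semantics: the person may appear on either side of the correspondence.
        clauses.add("(sender_id = ? OR receiver_id = ?)");
        params.add(personId);
        params.add(personId);

        // Absent bounds produce no clause, so no null parameter is ever bound.
        if (from != null) {
            clauses.add("document_date >= ?");
            params.add(from);
        }
        if (to != null) {
            clauses.add("document_date <= ?");
            params.add(to);
        }
        return new SqlFilter(String.join(" AND ", clauses), params);
    }
}
```

Calling `PersonFilterSketch.byPerson(id, null, null)` produces only the sender/receiver clause with two parameters. A JPA Specification follows the same pattern by adding a predicate to the criteria query only when the corresponding argument is non-null.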
Reference: marcel/familienarchiv#738