familienarchiv

Author	SHA1	Message	Date
Marcel	1e1e96b86f	refactor(search): delete frontend NLP search components and utilities Removes SmartModeToggle, SmartSearchStatus, InterpretationChipRow, DisambiguationPicker, chip-types utilities, and theme-chip-removal utilities as part of NLP feature removal. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 18:43:34 +02:00
Marcel	30aba010f4	refactor(search): remove NLP error codes and application config Remove SMART_SEARCH_UNAVAILABLE and SMART_SEARCH_RATE_LIMITED error codes from ErrorCode enum; remove nlp and nl-search configuration blocks from application.yaml; remove nlp config block from application-dev.yaml. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 18:42:48 +02:00
Marcel	be7ad1d1fa	refactor(search): delete backend NLP search package Remove entire backend search domain including: - NlSearchController, NlQueryParserService, NlpClient implementations - Rate limiting, properties, DTOs (NlSearchRequest/Response/NlQueryInterpretation) - All domain logic and tests (5 test files deleted) Backend compiles successfully post-deletion. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 18:41:36 +02:00
Marcel	4232941b99	fix(infra): replace Ollama with nlp-service in docker-compose.prod.yml Removes the ollama and ollama-model-init services (and ollama-models volume) from the production/staging compose file. Adds the nlp-service in their place — mirroring the dev compose — and wires the backend dependency and APP_NLP_BASE_URL env var so staging can reach the new service. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 18:16:45 +02:00
Marcel	f41acfb29e	fix(search): replace languageTag() with getLocale(); sync KI→Smart in tests Paraglide 2.5 runtime exports getLocale(), not languageTag(). The `8bed0cc6` commit introduced the wrong import when threading lang through the NL search path. Also updates two test assertions that still expected the old 'KI' button label after `0b31a51e` renamed it to 'Smart-Suche'. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 17:44:38 +02:00
Marcel	15dff2a7b9	refactor(search): delete orphaned RestClientOllamaClientTest The source class RestClientOllamaClient was removed in `864f44a4` but the corresponding test file was not staged at the time. Removes the leftover file; coverage is provided by RestClientNlpClientTest. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 16:42:20 +02:00
Marcel	081e9c3163	docs(deployment): replace Ollama with nlp-service in DEPLOYMENT.md - §1: update memory table (nlp-service ~256 MB vs Ollama ~8 GB); update memory budget note; add nlp-service to topology diagram - §2: replace 'Ollama (NL search) service' env var table with 'NLP service' table (APP_NLP_BASE_URL, NLP_FUZZY_THRESHOLD); add credential-rotation restart note - §3.4: replace Ollama model-pull first-deploy warning with nlp-service startup note (no download, --wait safe) - §6: replace Ollama operational section (model pull, ollama list, upgrade guide) with nlp-service health check and tuning guide Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 16:41:46 +02:00
Marcel	0c8d516eed	docs(nlp-service): update CLAUDE.md — remove stale dateparser entry and prototype note Removes 'dateparser 1.2' from the stack section (dependency was dropped in favour of the rule-based date regex pipeline). Rewrites the Notes section to reflect that docker-compose integration and Java-side wiring were both delivered in this PR. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 16:40:01 +02:00
Marcel	6fdbc6240a	fix(infra): wait for nlp-service healthy before starting backend Changes condition: service_started → service_healthy so the backend container does not start until FastAPI has bound its port and loaded person names from the database. Eliminates the startup race where a first NL search would return 503 during nlp-service bootstrap. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 16:39:25 +02:00
Marcel	6e997c7474	docs(adr): ADR-035 — replace Ollama with rule-based nlp-service Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 16:13:58 +02:00
Marcel	2559260ee8	docs(c4): replace Ollama with nlp-service in L2 container diagram Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 16:12:59 +02:00
Marcel	2b8fb602e3	feat(infra): replace Ollama with nlp-service in docker-compose Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 16:12:13 +02:00
Marcel	0b31a51ed9	chore(i18n): remove AI/KI/IA and timing refs from smart search strings Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 16:10:32 +02:00
Marcel	7ebfaf7933	test(search): assert lang field sent in E2E NL search request Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 16:08:57 +02:00
Marcel	a4e0d1685c	feat(search): raise NL search rate limit from 5 to 20 req/min The rule-based NLP service is <100ms vs Ollama's ~15s, making the old limit too restrictive for normal interactive use. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 16:06:04 +02:00
Marcel	ac21f4fe38	test(search): replace OllamaClient test suite with NlpClient equivalents - Delete RestClientOllamaClientTest, add RestClientNlpClientTest: WireMock targets POST /parse; adds isHealthy_returnsFalse_whenPersonsLoadedIsZero - NlQueryParserServiceTest: @Mock NlpClient; all stubs updated to parse(String,String); NlpExtraction throughout; service.search(..., "de", PAGE); adds verify(nlpClient).parse(eq,eq) - NlSearchControllerTest: add lang:"de" to all request bodies; stubs use anyString×3; rename search_returns503_whenOllamaUnavailable → search_returns503_whenNlpServiceUnavailable Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 16:04:50 +02:00
Marcel	864f44a4be	refactor(search): delete Ollama* classes replaced by Nlp* equivalents Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 15:59:20 +02:00
Marcel	8bed0cc6e2	feat(search): thread lang through NlSearchRequest → controller → NlQueryParserService → NlpClient - NlSearchRequest gains @NotBlank @Pattern(regexp="de\|en\|es") lang field - NlSearchController passes request.lang() to service - NlQueryParserService.search signature: (String, String, Pageable); renames ollamaClient→nlpClient; removes redundant length guard (Bean Validation is enforcement point) - application.yaml: replaces app.ollama.* with app.nlp.base-url; application-dev.yaml: points to localhost:8001 - frontend/documents/+page.svelte: sends lang: languageTag() in POST body Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 15:58:48 +02:00
Marcel	34387f2d59	feat(search): add RestClientNlpClient — POST /parse, GET /health with persons_loaded check Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 15:55:50 +02:00
Marcel	8d1ff1efe7	test(search): NlpPropertiesTest — validates baseUrl required and defaults Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 15:54:39 +02:00
Marcel	492a064735	feat(search): add NlpProperties config and @ConfigurationPropertiesScan Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 15:52:12 +02:00
Marcel	e1ec1c0dfe	feat(search): add NlpExtraction record, NlpClient and NlpHealthClient interfaces Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 15:51:26 +02:00
Marcel	00b2d46424	test(nlp-service): guard global matcher state in try/finally Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 15:50:32 +02:00
Marcel	d3da3b6cd1	chore(nlp-service): add .dockerignore to exclude dev artifacts from image Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 15:50:01 +02:00
Marcel	24e5ac9c22	chore(nlp-service): remove unused dateparser dependency Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 15:49:37 +02:00
Marcel	2eb5572d7a	feat(nlp-service): wire NLP_FUZZY_THRESHOLD env var with 0-100 validation Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 15:48:57 +02:00
Marcel	99d6a9a428	feat(nlp-service): cap /parse query at 500 chars via Field(max_length=500) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 15:47:40 +02:00
Marcel	4697f5fbb3	feat(nlp-service): log WARNING when DATABASE_URL absent, ERROR on DB failure Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 15:47:03 +02:00
Marcel	5d8ec38474	fix(nlp-service): return generic 500 detail to prevent credential leakage Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 15:46:24 +02:00
Marcel	824f048640	fix(nlp-service): eliminate false-positive person matches from dirty DB records - Wire _EXTRA_SPAN_STOPS into _extract_persons_and_role so German function words (im, seine, ihre, dem, …) terminate name spans — fixes "Clara im" and "seine Kinder" leaking into personNames - Add _NON_NAME_TOKENS filter in PersonMatcher.load() to skip DB records whose first_name contains prepositions or possessives — filters 290 bad records (annotations like "an seine Eltern", "Eltern in", place references like "Enkel Cram aus Mexiko") that were causing exact Pass-2 matches - Remove spaCy model downloads from Dockerfile (no longer needed after the DB-backed matcher rewrite) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 11:09:35 +02:00
Marcel	6c5cf8ec9b	feat(nlp-service): replace spaCy NER with DB-backed PersonMatcher Rule-based pipeline: persons matched via rapidfuzz against all known names loaded from DB at startup. Fixes first-name-only extraction (Eugenie, Herbert), merged-span bug (Herbert + Eugenie de Gruyter), false positives on compound nouns, and EN/ES model failures. Date extraction unchanged (regex). No spaCy models required. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 11:00:03 +02:00
Marcel	9472d8c25e	feat(nlp-service): Dockerfile — python:3.11-slim, models baked in	2026-06-07 10:31:18 +02:00
Marcel	8521e6f173	feat(nlp-service): FastAPI app with /parse and /health endpoints Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 10:29:32 +02:00
Marcel	cc4c81e218	feat(nlp-service): full extract() pipeline — assembles all steps Also adds regex year-fallback in extract_dates() for de/es spaCy small models that don't tag bare 4-digit years as DATE entities, and widens the direction-token window to 2 tokens back to handle Spanish "antes de". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 10:28:40 +02:00
Marcel	55f419d20f	feat(nlp-service): keyword extraction (POS-filtered, deduped lemmas)	2026-06-07 10:24:35 +02:00
Marcel	53f6dcbfed	feat(nlp-service): date range extraction with direction detection	2026-06-07 10:23:33 +02:00
Marcel	0ab2e2a743	feat(nlp-service): role detection (sender/receiver/any)	2026-06-07 10:22:14 +02:00
Marcel	bff16f6f1f	feat(nlp-service): NER person name extraction	2026-06-07 10:21:16 +02:00
Marcel	18f028e2dd	feat(nlp-service): spaCy model loading with get_nlp/load_all_models	2026-06-07 10:17:07 +02:00
Marcel	e3b8e57746	feat(nlp-service): scaffold — models, requirements, CLAUDE.md Task 1: Create standalone FastAPI service scaffold with models, test framework, and documentation. Includes ParseRequest, ParseResponse Pydantic models matching OllamaExtraction contract, plus three passing tests validating model validation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 10:13:08 +02:00
Marcel	6878419156	merge: resolve conflicts with origin/main (#763 person name-match integration) - Drop unused MAX_CANDIDATES constant (not referenced in service) - Keep detached-entity safety comment in resolveTags() - Add 3 new partial-name match tests (23a/b/c) from #763 - Use resolveByName() API in test 28 (replaces findByDisplayNameContaining) - Add NameMatches glossary entry from #763 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-07 08:50:48 +02:00
Marcel	09b77e9b36	test(person): pin fetchPool dedup when one person matches two tokens (#763 review) Assert that when the same person id is returned by two different token fetches, the person appears exactly once in the result -- pinning fetchPool's putIfAbsent dedup so a future refactor can't silently double-classify a candidate. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-07 08:47:47 +02:00
Marcel	9d202b042b	test(person): close fetch-to-classify seam for alias matches on real Postgres (#763 review) AC#4 (maiden alias -> direct) and AC#5 (alias first name -> fetchable + classifiable) were each split across PersonRepositoryTest (the fetch) and PersonServiceTest (the classifier with stubs) -- nothing walked searchByName -> resolveByName end-to-end on real Postgres. Add two tests in the existing @DataJpaTest slice that build a real PersonService over the autowired repositories, persist a person with a MAIDEN_NAME alias and one with an alias firstName, and assert both classify as direct. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-07 08:47:47 +02:00
Marcel	8429b1e9f8	fix(search): derive disambiguation trigger aria-label from match count (#763 review) The trigger hardcoded the multiple-people label for every count, so a single did-you-mean picker announced "Mehrere Personen gefunden" to screen readers while sighted users saw one name and a "Meintest du …?" heading. Derive the trigger's accessible name from persons.length: a single suggestion reuses the heading prop, two or more keep the multiple-people label. Visible truncated name span unchanged. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-07 08:47:47 +02:00
Marcel	6959651b36	docs(search): document NameMatches and resolveByName (#763) GLOSSARY entry for NameMatches (direct vs partial name-match strength and how the search layer maps it); person/README adds resolveByName to the public surface. No ADR — the matching rule is localized and justified inline. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-07 08:47:47 +02:00
Marcel	0ef4f4f07c	feat(search): case-appropriate disambiguation picker copy (#763) A 1-item picker now reads "Meintest du …?" (a single direct match auto-selects and never reaches the picker), while ≥2 keeps the "Person auswählen" framing. The prompt lives in a visible, non-truncated panel heading (the trigger span clips at 320px), and the "(auswählen…)" cue is dropped for the 1-item case. DisambiguationPicker takes heading + showCue props; the page derives both from ambiguousPersons.length. New search_disambiguation_did_you_mean key in de/en/es. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-07 08:47:47 +02:00
Marcel	f1bb9d3a69	feat(search): map direct/partial NameMatches into resolve buckets (#763) resolveNames now delegates to PersonService.resolveByName and maps by match strength: 1 direct → resolved (auto-select), ≥2 direct → ambiguous, 0 direct with partials → ambiguous suggestions, 0 candidates → folded into full-text. A single direct match no longer forces the picker when looser substring hits coexist. The MAX_CANDIDATES cap moved into PersonService (after classification); the MAX_NAME_LENGTH guard, resolved-cap overflow, and sender/receiver mapping are preserved. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-07 08:47:47 +02:00
Marcel	ca52145556	feat(person): add resolveByName for direct/partial name matching (#763) Token-set containment over all of a person's name components (firstName, lastName, alias, each PersonNameAlias first+last, title) decides direct vs partial. Orchestrates tokenize → cap(8) → fetch pool → classify → cap(10) after classification, with an empty-token guard and a PII-free debug log of the outcome bucket. MAX_TOKENS is a DoS control; the after-classify cap keeps a direct match that sorts past position 10 among partials. Read-only transaction keeps lazy nameAliases reachable during classification (ADR-022). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-07 08:47:47 +02:00
Marcel	9a26bf75b0	feat(person): match alias first names in searchByName (#763) The direct-match classifier accepts alias firstName tokens, so the fetch must surface candidates matchable only via an alias first name. Add a.firstName to the searchByName LIKE clause (reuses the bound :query — injection-proof). The person_name_aliases.first_name column already exists; no migration. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-07 08:47:47 +02:00
Marcel	9c616f9fb8	feat(person): add name-match tokenizer for direct matching (#763) Lowercase, split on whitespace/hyphen/apostrophe, drop empties. Applied symmetrically to query and candidate name components so "Anna-Maria" and "Anna Maria" tokenize alike. Foundation for resolveByName direct matching. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-07 08:47:47 +02:00
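
The tokenizer rule described in commit 9c616f9fb8 (lowercase, split on whitespace/hyphen/apostrophe, drop empties) might look roughly like the sketch below. This is an illustration only, not the repository's code: the class name NameTokenizerSketch and the method tokenize are hypothetical, and it assumes that only the ASCII hyphen and apostrophe count as separators.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch of the name-match tokenizer; names are assumptions.
final class NameTokenizerSketch {

    // One or more whitespace characters, ASCII hyphens, or ASCII apostrophes.
    private static final Pattern SEPARATORS = Pattern.compile("[\\s\\-']+");

    private NameTokenizerSketch() {}

    // Lowercases the input, splits on the separators, and drops empty tokens.
    static List<String> tokenize(String name) {
        if (name == null) {
            return List.of();
        }
        return Arrays.stream(SEPARATORS.split(name.toLowerCase(Locale.ROOT)))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList());
    }
}
```

Applying the same function to both the query and each candidate name component is what makes "Anna-Maria" and "Anna Maria" yield the same token list, as the commit message states.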

3409 Commits