Commit Graph

3 Commits

Author SHA1 Message Date
Marcel
24e5ac9c22 chore(nlp-service): remove unused dateparser dependency
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-07 15:49:37 +02:00
Marcel
6c5cf8ec9b feat(nlp-service): replace spaCy NER with DB-backed PersonMatcher
Rule-based pipeline: persons matched via rapidfuzz against all known
names loaded from DB at startup. Fixes first-name-only extraction
(Eugenie, Herbert), merged-span bug (Herbert + Eugenie de Gruyter),
false positives on compound nouns, and EN/ES model failures.
Date extraction unchanged (regex). No spaCy models required.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-07 11:00:03 +02:00
Marcel
e3b8e57746 feat(nlp-service): scaffold — models, requirements, CLAUDE.md
Task 1: Create standalone FastAPI service scaffold with models, test framework,
and documentation. Includes ParseRequest, ParseResponse Pydantic models matching
OllamaExtraction contract, plus three passing tests validating model validation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-07 10:13:08 +02:00