familienarchiv/nlp-service/CLAUDE.md
Marcel 6b0a06e8b1 feat(nlp-service): scaffold — models, requirements, CLAUDE.md
Task 1: Create standalone FastAPI service scaffold with models, test framework,
and documentation. Includes ParseRequest, ParseResponse Pydantic models matching
OllamaExtraction contract, plus three passing tests validating model validation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-07 10:11:34 +02:00

NLP Service

Lightweight FastAPI service that parses free-text search queries into structured extractions, replacing Ollama for the Familienarchiv NL search feature.

Stack

  • Python 3.11, FastAPI 0.115, spaCy 3.8, dateparser 1.2

Endpoints

  • POST /parse — parse a free-text query, return extraction matching OllamaExtraction contract
  • GET /health — returns {"status": "ok"} when all models are loaded
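A readiness probe against GET /health might look like the following minimal sketch. It uses only the Python standard library and assumes the service listens on port 8001, as in the local setup. The names health_body_ok and is_ready are hypothetical and not part of the service. What the endpoint returns while models are still loading is not specified here, so the sketch treats anything other than a 200 response with "status": "ok" as not ready.

```python
import json
import urllib.error
import urllib.request


def health_body_ok(body: bytes) -> bool:
    """Return True if the body is a JSON object whose "status" field is "ok"."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def is_ready(base_url: str = "http://localhost:8001", timeout: float = 2.0) -> bool:
    """Call GET /health and report whether the service says it is ready."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200 and health_body_ok(resp.read())
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: not ready.
        return False
```

A caller could poll is_ready() in a short loop before sending the first /parse request.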

Running locally

pip install -r requirements.txt
python -m spacy download de_core_news_sm
python -m spacy download en_core_web_sm
python -m spacy download es_core_news_sm
uvicorn main:app --reload --port 8001

curl -X POST http://localhost:8001/parse \
  -H "Content-Type: application/json" \
  -d '{"query": "Briefe von Opa Hermann an Marie vor 1920", "lang": "de"}'
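The same request can be sent from Python. The sketch below mirrors the curl call using only the standard library. The function names build_parse_request and parse_query are hypothetical, and the response is returned as a plain dict because the fields of the OllamaExtraction contract are not listed here.

```python
import json
import urllib.request


def build_parse_request(
    query: str,
    lang: str,
    base_url: str = "http://localhost:8001",
) -> urllib.request.Request:
    """Build the POST /parse request with a JSON body of query and lang."""
    body = json.dumps({"query": query, "lang": lang}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/parse",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_query(query: str, lang: str, timeout: float = 10.0) -> dict:
    """Send the request to a running service and return the decoded JSON."""
    request = build_parse_request(query, lang)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())
```

With the service running locally, parse_query("Briefe von Opa Hermann an Marie vor 1920", "de") should return the extraction for the example query above.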

Testing

pytest -v

Design spec

See docs/superpowers/specs/2026-06-07-spacy-nlp-service-design.md.

Notes

This is a prototype for evaluating extraction quality. This iteration includes neither docker-compose integration nor Java-side changes. The extraction contract matches OllamaExtraction in backend/src/main/java/org/raddatz/familienarchiv/search/.