Task 1: Create standalone FastAPI service scaffold with models, test framework, and documentation. Includes ParseRequest, ParseResponse Pydantic models matching OllamaExtraction contract, plus three passing tests validating model validation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
# NLP Service

Lightweight FastAPI service that parses free-text search queries into structured extractions,
replacing Ollama for the Familienarchiv NL search feature.

## Stack

- Python 3.11, FastAPI 0.115, spaCy 3.8, dateparser 1.2

## Endpoints

- `POST /parse`: parses a free-text query and returns an extraction matching the `OllamaExtraction` contract
- `GET /health`: returns `{"status": "ok"}` when all models are loaded

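A caller that wants to wait for the models before sending queries could check the health body as sketched below. This is a minimal, hypothetical client-side helper (the name `is_healthy` is not part of the service). It assumes only what is stated above: a ready service answers with `{"status": "ok"}`. What the endpoint returns while models are still loading is not specified here, so the helper treats every other body as "not ready".

```python
import json


def is_healthy(body: bytes) -> bool:
    """Return True only if body is a JSON object whose "status" is "ok".

    Hypothetical helper: any other payload (different status, non-JSON,
    non-object JSON) is treated as "not ready", which is an assumption.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers malformed JSON and undecodable bytes
        # (JSONDecodeError and UnicodeDecodeError both subclass ValueError).
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"
```

The raw response body of `GET /health` can be passed in directly, since `json.loads` accepts bytes.
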
## Running locally

```bash
pip install -r requirements.txt
# Download each spaCy model separately (the download command takes one model name)
for model in de_core_news_sm en_core_web_sm es_core_news_sm; do
  python -m spacy download "$model"
done
uvicorn main:app --reload --port 8001

# In a second terminal, once the server is up:
curl -X POST http://localhost:8001/parse \
  -H "Content-Type: application/json" \
  -d '{"query": "Briefe von Opa Hermann an Marie vor 1920", "lang": "de"}'
```

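For scripted evaluation runs, the same request as the `curl` call can be built from Python with the standard library alone. The sketch below is illustrative: `build_parse_request` and `post_parse` are hypothetical helper names, the base URL assumes the local `uvicorn` port used above, and the response is returned as a plain dict because the fields of the `OllamaExtraction` contract are not listed in this README.

```python
import json
import urllib.request


def build_parse_request(query: str, lang: str,
                        base_url: str = "http://localhost:8001") -> urllib.request.Request:
    """Build (but do not send) the POST /parse request shown in the curl example."""
    body = json.dumps({"query": query, "lang": lang}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/parse",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_parse(query: str, lang: str) -> dict:
    """Send the request to a locally running service and decode the JSON reply."""
    req = build_parse_request(query, lang)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read())
```

With the service running, `post_parse("Briefe von Opa Hermann an Marie vor 1920", "de")` sends the same payload as the `curl` command.
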
## Testing

```bash
pytest -v
```

## Design spec

See `docs/superpowers/specs/2026-06-07-spacy-nlp-service-design.md`.

## Notes

This is a **prototype** for evaluating extraction quality. This iteration includes no
docker-compose integration and no Java-side changes. The extraction contract matches
`OllamaExtraction` in `backend/src/main/java/org/raddatz/familienarchiv/search/`.