Marcel
64d27d6d61
feat(ocr): per-sender model registry and /train-sender endpoint
engines/kraken.py:
- Add _SenderModelRegistry with LRU eviction (max configurable via
OCR_MAX_CACHED_MODELS env var), double-checked locking, invalidate(),
and path whitelist (/app/models/ only)
- Add _load_sender_model() helper for testability
- extract_page_blocks() and extract_region_text() accept optional
sender_model_path; route to sender registry when provided
models.py:
- OcrRequest gains senderModelPath: str | None = None field
main.py:
- /ocr and /ocr/stream pass request.senderModelPath to Kraken engine
- New /train-sender endpoint: validates output_model_path, runs ketos
train with base model as starting point, invalidates sender cache
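The endpoint flow described above (validate, train, invalidate) might look roughly like the sketch below. The function names, the ground_truth parameter, and the injected run/invalidate callables are hypothetical, and the exact ketos flags used in main.py are not shown in this commit message. The sketch assumes only the -i (load an existing model) and -o (output) options of ketos train. As far as I know ketos treats -o as an output prefix, so the file it finally writes may carry a suffix.

```python
import subprocess
from pathlib import Path
from typing import Any, Callable, Sequence

MODELS_ROOT = Path("/app/models")  # same whitelist root as the sender registry


def validate_output_model_path(output_model_path: str) -> Path:
    """Reject any output path that resolves outside the models directory."""
    resolved = Path(output_model_path).resolve()
    if not resolved.is_relative_to(MODELS_ROOT.resolve()):
        raise ValueError(f"output path outside {MODELS_ROOT}: {output_model_path}")
    return resolved


def build_train_command(
    base_model: str, output_model_path: str, ground_truth: Sequence[str]
) -> list[str]:
    """Assemble a ketos train invocation that fine-tunes from base_model."""
    out = validate_output_model_path(output_model_path)
    # -i loads the base model as the starting point, -o sets the output.
    return ["ketos", "train", "-i", base_model, "-o", str(out), *ground_truth]


def train_sender(
    base_model: str,
    output_model_path: str,
    ground_truth: Sequence[str],
    invalidate: Callable[[str], Any],
    run: Callable[..., Any] = subprocess.run,
) -> str:
    """Train, then drop any cached copy so the next request reloads the model."""
    cmd = build_train_command(base_model, output_model_path, ground_truth)
    run(cmd, check=True)  # raises CalledProcessError if ketos fails
    out = str(validate_output_model_path(output_model_path))
    invalidate(out)
    return out
```

Invalidating only after a successful run means a failed training job leaves the previously cached sender model in place.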
docker-compose.yml:
- Add OCR_MAX_CACHED_MODELS: "5" to ocr-service environment
test_sender_registry.py:
- 4 tests: cache hit, LRU eviction, invalidate, path traversal guard
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 18:05:39 +02:00