feat(ocr): add /train endpoint to OCR service and OcrClient.trainModel()

- POST /train in ocr-service with ZIP Slip validation, TemporaryDirectory,
  ketos transfer learning, timestamped backups (keep last 3), in-process reload
- X-Training-Token auth (no-op in dev when TRAINING_TOKEN env is empty)
- trainModel() in OcrClient interface + RestClientOcrClient (10-min timeout,
  multipart upload, forwards X-Training-Token when configured)
- TRAINING_TOKEN env var wired in docker-compose; --workers 2 in Dockerfile
  so /health stays responsive during synchronous training

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
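
Note for reviewers: the token check and ZIP Slip validation named above could look roughly like the sketch below. It is an illustration only, not the code in this commit. The helper names (`token_is_valid`, `safe_extract`, `extract_upload`) are hypothetical, and it assumes the service is written in Python 3.9+.

```python
import hmac
import io
import tempfile
import zipfile
from pathlib import Path


def token_is_valid(provided, configured):
    """Hypothetical helper: auth is a no-op when no token is configured."""
    if not configured:
        return True  # empty TRAINING_TOKEN: checks disabled (dev mode)
    # Constant-time comparison avoids leaking the token via timing.
    return hmac.compare_digest((provided or "").encode(), configured.encode())


def safe_extract(archive, dest):
    """Hypothetical helper: reject entries that would land outside dest."""
    dest = dest.resolve()
    for member in archive.namelist():
        target = (dest / member).resolve()
        if not target.is_relative_to(dest):
            raise ValueError(f"unsafe path in archive: {member}")
    archive.extractall(dest)


def extract_upload(data):
    """Extract an uploaded ZIP into a throwaway directory and list its files."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            safe_extract(archive, root)
        # A real handler would run training here, before the directory is removed.
        return sorted(
            p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
        )
```

An archive containing an entry such as `../evil.txt` raises `ValueError` before anything is written, and the temporary directory is removed when the `with` block exits.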
2026-04-13 14:40:53 +02:00
parent cfa3c4df67
commit bc97a2dade
6 changed files with 188 additions and 8 deletions
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -87,6 +87,7 @@ services:
      - ocr_cache:/root/.cache
    environment:
      KRAKEN_MODEL_PATH: /app/models/german_kurrent.mlmodel
+      TRAINING_TOKEN: "${OCR_TRAINING_TOKEN:-}"
      OCR_CONFIDENCE_THRESHOLD: "0.3"
      OCR_CONFIDENCE_THRESHOLD_KURRENT: "0.5"
      RECOGNITION_BATCH_SIZE: "16"