fix(ocr): reduce memory usage for 16GB dev machines
- Surya models lazy-load on first OCR request instead of at startup (saves ~3-4GB idle RAM; Kraken stays eager at ~16MB)
- Process one page at a time in the Surya engine (limits peak memory)
- RECOGNITION_BATCH_SIZE=1, DETECTOR_BATCH_SIZE=1 (slower, but fits in RAM)
- Revert mem_limit back to 6GB (sufficient with these optimizations)
- Render DPI stays at 200

Idle memory: ~2GB (Kraken only). Peak during OCR: ~5-6GB (Surya loaded).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -78,14 +78,16 @@ services:
       dockerfile: Dockerfile
     container_name: archive-ocr
     restart: unless-stopped
-    mem_limit: 10g
-    memswap_limit: 10g
+    mem_limit: 6g
+    memswap_limit: 6g
     volumes:
       - ocr_models:/app/models
     environment:
       KRAKEN_MODEL_PATH: /app/models/german_kurrent.mlmodel
       OCR_CONFIDENCE_THRESHOLD: "0.3"
       OCR_CONFIDENCE_THRESHOLD_KURRENT: "0.5"
+      RECOGNITION_BATCH_SIZE: "1"
+      DETECTOR_BATCH_SIZE: "1"
     networks:
       - archive-net
     healthcheck:
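The two environment variables added to the compose file are read by the OCR process at runtime. A minimal sketch of how such settings can be consumed defensively, assuming a hypothetical `batch_size` helper (Surya's own settings layer reads these variables itself; this only illustrates the pattern with a safe fallback):

```python
import os


def batch_size(name: str, default: int = 1) -> int:
    """Read a batch-size env var, falling back to a safe default.

    Clamps to at least 1 and tolerates unset or non-numeric values,
    so a misconfigured container still starts.
    """
    try:
        return max(1, int(os.environ.get(name, default)))
    except ValueError:
        return default


recognition_batch = batch_size("RECOGNITION_BATCH_SIZE")
detector_batch = batch_size("DETECTOR_BATCH_SIZE")
```

With both set to "1" as in this commit, inference runs one item at a time, trading throughput for a peak footprint that fits under the 6g mem_limit.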