familienarchiv/ocr-service/entrypoint.sh
Marcel 240b373f68 fix(ocr): create TMPDIR on startup and clear day-old orphans
On a fresh ocr_cache volume, /app/cache/.tmp does not exist yet. The mkdir
ensures the first Surya model download can proceed without hitting ENOSPC on
the 512 MB /tmp tmpfs. The find cleanup removes fragments left behind when a
docker-kill interrupts a download, preventing cross-job ground-truth leakage.

Fixes #614. See ADR-021.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 10:54:17 +02:00

17 lines
769 B
Bash

#!/bin/bash
set -euo pipefail
# Ensure TMPDIR exists on the persistent cache volume (created by the volume-init
# container, but guaranteed here for fresh volumes and bare docker-run usage).
# Orphaned fragments from prior docker-kill during model downloads are cleared
# on startup to prevent cross-job ground-truth leakage (Surya staging files).
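mkdir -p "${TMPDIR:-/tmp}"
# -mmin +1440 matches entries older than 24 hours. -mtime +1 would only match
# entries at least 48 hours old, because find truncates the age to whole days.
find "${TMPDIR:-/tmp}" -mindepth 1 -mmin +1440 -delete 2>/dev/null || true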
# Validate the blla segmentation base model and download it if missing or
# incompatible. ketos 7 dropped support for legacy PyTorch ZIP archives, so
# this step ensures the volume always holds a loadable CoreML protobuf model.
python3 /app/ensure_blla_model.py
exec uvicorn main:app --host 0.0.0.0 --port 8000 --workers 1
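
The age threshold used by the startup cleanup can be checked in isolation. The following is a minimal sketch, assuming GNU findutils and GNU coreutils (for `-printf` and `touch -d`). The scratch directory and the `.part` file names are hypothetical stand-ins, not the real staging files on the cache volume. It compares `-mtime +1`, which counts whole 24-hour periods, with `-mmin +1440`, which counts minutes.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical scratch directory standing in for $TMPDIR (not the real cache volume).
demo_dir="$(mktemp -d)"
trap 'rm -rf "$demo_dir"' EXIT

# Create three files with back-dated modification times (GNU touch -d syntax).
touch -d '2 hours ago'  "$demo_dir/fresh.part"
touch -d '36 hours ago' "$demo_dir/day_old.part"
touch -d '72 hours ago' "$demo_dir/three_days.part"

# -mtime +1: age is truncated to whole days, so a match needs at least 48 hours.
by_mtime="$(find "$demo_dir" -mindepth 1 -mtime +1 -printf '%f\n' | sort)"

# -mmin +1440: matches anything modified more than 1440 minutes (24 hours) ago.
by_mmin="$(find "$demo_dir" -mindepth 1 -mmin +1440 -printf '%f\n' | sort)"

echo "-mtime +1 matches:"
echo "$by_mtime"
echo "-mmin +1440 matches:"
echo "$by_mmin"
```

Under these assumptions the 36-hour-old file is matched only by the `-mmin +1440` test, so the choice between the two decides whether a fragment from the previous day survives one more restart.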