Explains what ocr-volume-init does (chown volumes + create TMPDIR), how to
verify it succeeded (docker logs), and what failure looks like. Addresses
reviewer concerns from @mkeller and @tobiwendt on PR #615.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
alpine:3 is a moving tag — pinning to 3.21 makes builds reproducible and
rollbacks possible. networks: [] removes the init container from the project
network since it only needs volume access, not network access (least privilege).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- entrypoint.sh: replace "cross-job ground-truth leakage" with plain
"Remove stale partial downloads left by a previous docker-kill"
- test_tmpdir_is_inside_persistent_cache_volume: add docker exec command
so future developers know how to run this deployment-contract test
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
test_entrypoint_removes_day_old_orphans and test_entrypoint_preserves_fresh_files
verify the find -mtime +1 -delete logic using os.utime() to fabricate old mtimes
without mocking system time. Also extracts _run_entrypoint helper to remove
subprocess setup duplication.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
A silent non-zero exit would previously cause the test to pass incorrectly
because only directory creation was checked. Exit code is now the first
assertion, catching regressions before the filesystem check runs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
_validate_zip_entry has no ML-stack dependency; importing it via main.py
pulled in surya/torch and caused the test to be skipped in CI. Moving it
to utils.py (fastapi only) and adding fastapi to the CI lightweight install
lets test_zipslip_still_anchors_under_custom_tmpdir run on every push.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- ocr-service/README.md: add HF_HOME, XDG_CACHE_HOME, TORCH_HOME, TMPDIR rows
to the environment variables table
- ocr-service/CLAUDE.md: LLM reminder — TMPDIR must stay on the cache volume
- docs/adr/021-tmpdir-persistent-volume-staging.md: records the decision,
trade-offs, and rejected alternatives (Approach B / C) for issue #614
- ci.yml: add test_tmpdir.py to the OCR CI run (stdlib-only tests, no ML stack)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
TMPDIR=/app/cache/.tmp routes Surya model staging to the SSD-backed cache
volume instead of the 512 MB /tmp tmpfs. The ocr-volume-init one-shot service
runs first to ensure correct ownership (uid 1000) and creates /app/cache/.tmp
on fresh volumes, making AC #6 ("fresh volume still works") a permanent
infrastructure-as-code guarantee rather than a manual chown step.
Both docker-compose.yml and docker-compose.prod.yml are updated in the same
commit to prevent the silent drift that occurred with the 512 MB tmpfs comment.
Fixes#614. See ADR-021.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
On a fresh ocr_cache volume /app/cache/.tmp does not exist yet. The mkdir
ensures the first Surya model download can proceed without ENOSPC on the
512 MB /tmp tmpfs. The find cleanup removes fragments left by docker-kill
mid-download, preventing cross-job ground-truth leakage.
Fixes#614. See ADR-021.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Without this, running the image outside compose loses the TMPDIR redirect
and Surya model downloads fall back to the 512 MB /tmp tmpfs (ENOSPC).
See issue #614, ADR-021.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>