Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ADR-019 — Container hardening baseline: non-root user + read-only filesystem
Status: Accepted
Date: 2026-05-17
PR: #611
Context
The OCR service ran as root inside its container by default. This violated CIS Docker Benchmark §4.1 and CIS §4.6, and meant that any exploit in the OCR pipeline (untrusted PDF content, model deserialization, ZIP handling) could write to or execute anything inside the container without restriction.
The following risks were present before this baseline:
- A path-traversal in the ZIP-based training endpoint could overwrite arbitrary paths on the container filesystem (including Python source files and model files).
- A compromised dependency running at startup could persist itself in the container's writable layer or in the model volumes.
- Misconfigured model downloads could overwrite /etc/passwd or similar via path traversal, which is possible because root can write everywhere.
Decision
All containers in this project that have no operational need for elevated privileges must apply the following hardening baseline:
1. Non-root user
Create a dedicated user with a fixed UID and no login shell:
RUN useradd --no-create-home --shell /usr/sbin/nologin --uid 1000 <service>
Set HOME explicitly to a path owned by this user. Do not rely on ~ expansion for any path resolution in application code.
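A minimal sketch of what avoiding ~ expansion could look like in application code, assuming the service reads HOME from the environment. The helper name resolve_under_home and the /app fallback are hypothetical and not taken from the OCR service.

```python
import os
from pathlib import Path


def resolve_under_home(relative: str, default_home: str = "/app") -> Path:
    """Build a path from the explicit HOME variable instead of '~' expansion.

    `default_home` is a hypothetical fallback used only for illustration.
    """
    home = Path(os.environ.get("HOME", default_home))
    if not home.is_absolute():
        raise ValueError(f"HOME must be an absolute path, got {home}")
    home = home.resolve()
    candidate = (home / relative).resolve()
    # Reject results that leave HOME, e.g. via '..' segments or absolute input.
    if not candidate.is_relative_to(home):
        raise ValueError(f"{relative!r} resolves outside HOME ({home})")
    return candidate
```

Failing loudly when the result leaves HOME keeps a misconfigured environment from silently pointing the service at a path that is not writable under the baseline.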
2. Read-only container filesystem
read_only: true
All paths the application writes to at runtime must be explicitly declared as either a named volume or a tmpfs mount. This turns any unexpected write attempt into an immediate, visible PermissionError rather than a silent success.
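The rule above can be checked at startup with a small write probe. This is a sketch under assumptions: the function names are hypothetical, and the lists of expected writable and read-only paths would come from the service's own configuration. Note that a read-only mount typically reports EROFS, which Python raises as a plain OSError rather than PermissionError, so the probe catches the parent class.

```python
import errno
import tempfile


def is_writable(directory: str) -> bool:
    """Return True if a temporary file can be created in `directory`."""
    try:
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
        return True
    except OSError as exc:
        # PermissionError (EACCES/EPERM) is a subclass of OSError; a read-only
        # mount usually reports EROFS; a missing directory reports ENOENT.
        if exc.errno in (errno.EACCES, errno.EPERM, errno.EROFS, errno.ENOENT):
            return False
        raise


def check_write_surface(expected_writable, expected_read_only):
    """Return descriptions of paths whose writability differs from expectation."""
    problems = []
    for path in expected_writable:
        if not is_writable(path):
            problems.append(f"{path}: expected writable")
    for path in expected_read_only:
        if is_writable(path):
            problems.append(f"{path}: expected read-only")
    return problems
```

An empty result means the observed write surface matches the declared one; anything else points at a missing mount or an unintended writable path.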
3. Per-path write carve-outs
Declare only the paths that are actually written at runtime:
volumes:
- <service>_models:/app/models # persistent model storage
- <service>_cache:/app/cache # HuggingFace / ketos download cache
tmpfs:
- /tmp:size=512m # transient scratch space (ZIP extraction etc.)
Do not mount the home directory as a volume unless necessary. Instead, use the XDG_CACHE_HOME and TORCH_HOME environment variables to redirect library cache writes to the declared writable paths.
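One way to catch a cache variable that still points outside the carve-outs is a lexical check at startup. The sketch below assumes the three mount points listed above and the two variables named in this section; the function name misdirected_cache_vars is hypothetical.

```python
from pathlib import PurePosixPath

# Writable mounts taken from the carve-out list in this section.
WRITABLE_ROOTS = ("/app/models", "/app/cache", "/tmp")
# Cache variables named in this section; extend as needed.
CACHE_VARS = ("XDG_CACHE_HOME", "TORCH_HOME")


def misdirected_cache_vars(env, roots=WRITABLE_ROOTS, names=CACHE_VARS):
    """Return {name: value} for variables that are unset or outside `roots`.

    The comparison is purely lexical: it does not follow symlinks or
    normalise '..' segments.
    """
    root_paths = [PurePosixPath(r) for r in roots]
    bad = {}
    for name in names:
        value = env.get(name)
        if not value:
            bad[name] = None  # unset: the library would fall back to HOME
            continue
        path = PurePosixPath(value)
        if not any(path == r or r in path.parents for r in root_paths):
            bad[name] = value
    return bad
```

Calling misdirected_cache_vars(os.environ) during startup and logging a non-empty result would surface a misdirected cache before a library attempts its first write.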
4. Dropped capabilities and privilege escalation prevention
cap_drop: [ALL]
security_opt:
- no-new-privileges:true
A Python/FastAPI service listening on an unprivileged port (8000 or higher) requires no Linux capabilities. Dropping all capabilities and setting no-new-privileges prevents the process from regaining them, even if a dependency ships a setuid binary.
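Whether these settings took effect can be verified from inside the container by reading /proc/self/status, which on Linux exposes the CapEff and CapBnd capability masks (hexadecimal) and the NoNewPrivs flag. The sketch below is an illustration with hypothetical function names; it parses the text separately from reading the file so the logic can be exercised anywhere.

```python
def parse_proc_status(text: str) -> dict:
    """Extract capability masks and NoNewPrivs from /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    result = {}
    if "CapEff" in fields:
        result["cap_eff"] = int(fields["CapEff"], 16)  # effective set
    if "CapBnd" in fields:
        result["cap_bnd"] = int(fields["CapBnd"], 16)  # bounding set
    if "NoNewPrivs" in fields:
        result["no_new_privs"] = fields["NoNewPrivs"] == "1"
    return result


def hardening_gaps(status: dict) -> list:
    """List the settings from this section that do not appear to be applied."""
    gaps = []
    for key in ("cap_eff", "cap_bnd"):
        if status.get(key) != 0:
            gaps.append(f"{key} is not empty (or could not be read)")
    if status.get("no_new_privs") is not True:
        gaps.append("no_new_privs is not set (or could not be read)")
    return gaps
```

Inside the container, this could be used as hardening_gaps(parse_proc_status(open("/proc/self/status").read())). The bounding set is checked as well as the effective set because a non-root process already has an empty effective set even when no capabilities were dropped.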
5. Startup root canary
Log a warning during startup if the process is running as root. This catches misconfiguration (e.g., USER directive accidentally removed in a future Dockerfile edit) before it becomes a silent vulnerability:
if os.getuid() == 0:
    logger.warning("Running as root — CIS Docker §4.1 violation")
Consequences
Positive:
- Any exploit that achieves code execution inside the container is confined: it cannot write outside the declared volumes, cannot acquire new capabilities, and cannot persist to the image filesystem.
- A PermissionError on startup is an explicit, diagnosable failure rather than a silent privilege misuse.
- The startup canary catches accidental regressions in the non-root setup.
Negative / operational cost:
- Every new feature that writes to a new path (e.g., a new model cache directory, a new scratch path) must add a volume or tmpfs mount. The read_only: true flag makes this a hard constraint, not a suggestion.
- Library dependencies that write to HOME without respecting XDG_CACHE_HOME must be identified and redirected explicitly (see TORCH_HOME, XDG_CACHE_HOME, HF_HOME in docker-compose.yml).
- Existing named volumes written by root (pre-baseline) must be dropped and recreated before upgrading. See DEPLOYMENT.md §8.
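For the pre-baseline volume point, a quick way to see whether a mounted volume still holds files written by another user is to walk it and compare owner UIDs. This is a sketch only: the function name is hypothetical, and the authoritative procedure remains the one in DEPLOYMENT.md §8.

```python
import os


def files_not_owned_by(root: str, uid: int) -> list:
    """Walk `root` and return paths whose owner UID differs from `uid`."""
    mismatched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            # lstat inspects the entry itself and does not follow symlinks.
            if os.lstat(path).st_uid != uid:
                mismatched.append(path)
    return sorted(mismatched)
```

With the fixed UID from the non-root user step, files_not_owned_by("/app/models", 1000) returning a non-empty list would indicate leftovers from a pre-baseline run.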
Applicability
This baseline applies to the OCR service (PR #611). It should be applied to any new container added to the project unless there is a documented, specific operational reason a capability or writable filesystem is required.