diff --git a/docs/adr/010-minio-self-hosted-not-hetzner-obs.md b/docs/adr/010-minio-self-hosted-not-hetzner-obs.md new file mode 100644 index 00000000..4f84e30d --- /dev/null +++ b/docs/adr/010-minio-self-hosted-not-hetzner-obs.md @@ -0,0 +1,53 @@ +# ADR-010: MinIO stays self-hosted on the production VPS + +## Status + +Accepted + +## Context + +`docs/infrastructure/production-compose.md` (pre-this-PR) sketched a production topology in which the application bucket migrates from in-cluster MinIO to Hetzner Object Storage (OBS, S3-compatible). The motivation was operational: one less service to back up, no MinIO RAM/disk pressure on the VPS, hand off durability to the hyperscaler. + +Two facts revisited at pre-merge review (issue #497, comment #8331) changed the answer: + +1. **Current data size is small.** The archive is ~13 GB of file uploads (Kurrent letters, scanned ODS files, attachment PDFs). Hetzner OBS billing on this size is dominated by the per-month base fee (~5 EUR/mo for the smallest unit), not capacity or egress. The break-even point against the VPS's existing disk is far above the current footprint. +2. **MinIO is already production-grade.** The dev stack uses MinIO; the backend already drives it via the AWS SDK v2 with a generic `S3_ENDPOINT`. Switching providers is a runtime env-var change (`S3_ENDPOINT`, `S3_ACCESS_KEY`, `S3_SECRET_KEY`) plus an `mc mirror` to copy objects. There is no application-level rewrite cost waiting. + +If Hetzner OBS were a one-way-door (provider-specific SDK, complex IAM integration, multi-month migration), the decision would deserve a serious weighing. As reversible as the migration is, deferring it costs nothing. + +## Decision + +MinIO stays on the production VPS for the first launch. The application bucket is created and managed inside the docker-compose stack (`infra/minio/bootstrap.sh`). The backend uses a least-privilege service account (`archiv-app`) with a bucket-scoped IAM policy, not the MinIO root credentials. + +Hetzner Object Storage is **explicitly deferred**, not rejected. The migration path is documented as a runbook in `docs/DEPLOYMENT.md` (when the trigger fires): provision an OBS bucket, run `mc mirror local-minio:/familienarchiv obs:/familienarchiv`, rotate the three env vars, restart the backend, decommission the MinIO service from `docker-compose.prod.yml`. + +## Triggers to re-evaluate + +Revisit the decision when **any** of the following holds: + +- The `minio-data` volume exceeds 50 GB and is growing > 5 GB/month. +- MinIO healthcheck latency exceeds 200 ms p95 (signal of disk pressure on the host). +- The VPS upgrade required to keep MinIO healthy costs more per month than the equivalent OBS bucket + traffic. +- Backup of the MinIO volume to `heim-nas` over Tailscale (deferred follow-up) is implemented and consistently runs > 30 min nightly. At that point durability-as-a-service starts paying for itself. + +The migration runbook in `docs/DEPLOYMENT.md` is the script for executing the swap when one of the triggers fires. + +## Alternatives Considered + +| Alternative | Why rejected (for now) | +|---|---| +| Migrate to Hetzner Object Storage in this PR | Premature. Adds an external dependency, locks the operator into the Hetzner ecosystem before the data has demonstrated it needs hyperscaler durability, blocks the PR on a migration that buys ~5 GB of headroom. | +| Migrate to S3 (AWS) for HA across regions | Way over-spec for a family archive. Egress cost would dwarf any benefit; durability concerns at this size are addressed by nightly off-site backup, not by multi-region replication. | +| Drop S3 abstraction entirely; store files directly on the VPS disk | Possible, but loses the bucket-policy IAM surface (least-privilege service account), loses presigned-URL flow (OCR service downloads files via short-lived URLs, not via shared filesystem), loses the migration path to OBS. The S3 indirection is cheap insurance. | +| Self-hosted on-VPS plus periodic `mc mirror` to Hetzner OBS for off-site backup | This is the **target** for the backup pipeline follow-up. Treated as backup, not primary — primary stays MinIO. | + +## Consequences + +- The production VPS sizing (Hetzner CX42, 16 GB RAM, 80 GB disk) must accommodate MinIO's working set. Current footprint leaves ample headroom. +- Backup of MinIO data is the operator's responsibility until the off-site `mc mirror` pipeline is implemented (deferred follow-up). The DEPLOYMENT.md rollback procedure explicitly flags this — manual backup is the only recovery option until the pipeline ships. +- The backend never sees the MinIO root password; it uses the `archiv-app` service account with a bucket-scoped IAM policy (see `infra/minio/bootstrap.sh`). A backend RCE/SSRF cannot escalate beyond the `familienarchiv` bucket. +- The migration to Hetzner OBS remains a small, well-understood runbook step rather than a major refactor. No application code, no SDK swap. + +## Future Direction + +When one of the triggers above fires, the migration is: provision OBS bucket → `mc mirror` → rotate three env vars → restart backend → remove MinIO service from compose. The bucket-scoped policy translates 1:1 to an OBS user policy (S3-compatible).