Files
familienarchiv/docs/infrastructure/production-compose.md
2026-04-14 23:21:15 +02:00

277 lines
7.8 KiB
Markdown

# Production Docker Compose & Infrastructure
This document contains the full production Docker Compose file, Caddyfile, VPS sizing recommendations, cost breakdown, and Hetzner ecosystem overview.
---
## Full docker-compose.prod.yml
Usage: `docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d`
```yaml
# docker-compose.prod.yml
# Usage: docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
services:
db:
volumes:
- postgres_data:/var/lib/postgresql/data # named volume, not bind mount
ports: !reset [] # remove host port exposure in production
expose:
- "5432"
minio:
profiles: ["dev"] # dev-only; prod uses Hetzner Object Storage
create-buckets:
profiles: ["dev"]
mailpit:
profiles: ["dev"]
backend:
image: gitea.example.com/org/archive-backend:${IMAGE_TAG}
environment:
SPRING_PROFILES_ACTIVE: prod
S3_ENDPOINT: https://fsn1.your-objectstorage.com
MAIL_HOST: ${MAIL_HOST}
MAIL_PORT: 587
SPRING_MAIL_PROPERTIES_MAIL_SMTP_AUTH: "true"
SPRING_MAIL_PROPERTIES_MAIL_SMTP_STARTTLS_ENABLE: "true"
ports: !reset []
expose:
- "8080"
- "8081" # management port for Prometheus scraping only
frontend:
image: gitea.example.com/org/archive-frontend:${IMAGE_TAG}
ports: !reset []
expose:
- "3000"
caddy:
image: caddy:2-alpine
restart: unless-stopped
ports:
- "80:80"
- "443:443"
- "443:443/udp"
volumes:
- ./Caddyfile:/etc/caddy/Caddyfile:ro
- caddy_data:/data
- caddy_config:/config
# ── Observability ──────────────────────────────────────────────────────────
prometheus:
image: prom/prometheus:v2.51.0 # pinned
restart: unless-stopped
volumes:
- ./observability/prometheus.yml:/etc/prometheus/prometheus.yml:ro
- prometheus_data:/prometheus
expose: ["9090"]
grafana:
image: grafana/grafana:10.4.0 # pinned
restart: unless-stopped
environment:
GF_SECURITY_ADMIN_PASSWORD: ${GRAFANA_PASSWORD}
GF_PATHS_PROVISIONING: /etc/grafana/provisioning
GF_SERVER_ROOT_URL: https://grafana.example.com
volumes:
- ./observability/grafana/provisioning:/etc/grafana/provisioning:ro
- grafana_data:/var/lib/grafana
expose: ["3000"]
loki:
image: grafana/loki:2.9.0 # pinned
restart: unless-stopped
volumes:
- ./observability/loki-config.yml:/etc/loki/config.yml:ro
- loki_data:/loki
expose: ["3100"]
promtail:
image: grafana/promtail:2.9.0 # pinned
restart: unless-stopped
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
- ./observability/promtail-config.yml:/etc/promtail/config.yml:ro
alertmanager:
image: prom/alertmanager:v0.27.0 # pinned
restart: unless-stopped
volumes:
- ./observability/alertmanager.yml:/etc/alertmanager/alertmanager.yml:ro
expose: ["9093"]
# ── Uptime monitoring ──────────────────────────────────────────────────────
uptime-kuma:
image: louislam/uptime-kuma:1
restart: unless-stopped
volumes:
- uptime_kuma_data:/app/data
expose: ["3001"]
# ── Error tracking ─────────────────────────────────────────────────────────
glitchtip-web:
image: glitchtip/glitchtip:latest
restart: unless-stopped
depends_on: [db]
environment:
DATABASE_URL: postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@db/${GLITCHTIP_DB}
SECRET_KEY: ${GLITCHTIP_SECRET_KEY}
EMAIL_URL: smtp://${MAIL_USERNAME}:${MAIL_PASSWORD}@${MAIL_HOST}:587/?tls=true
GLITCHTIP_DOMAIN: https://errors.example.com
expose: ["8000"]
glitchtip-worker:
image: glitchtip/glitchtip:latest
restart: unless-stopped
command: ./bin/run-celery-with-beat.sh
depends_on: [glitchtip-web]
environment:
DATABASE_URL: postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@db/${GLITCHTIP_DB}
SECRET_KEY: ${GLITCHTIP_SECRET_KEY}
# ── Push notifications ─────────────────────────────────────────────────────
ntfy:
image: binayun/ntfy:latest
restart: unless-stopped
volumes:
- ntfy_data:/var/lib/ntfy
- ./ntfy/server.yml:/etc/ntfy/server.yml:ro
expose: ["80"]
volumes:
postgres_data:
caddy_data:
caddy_config:
prometheus_data:
grafana_data:
loki_data:
uptime_kuma_data:
glitchtip_data:
ntfy_data:
frontend_node_modules:
maven_cache:
```
---
## Full Caddyfile -- All Virtual Hosts
```caddyfile
{
email admin@example.com
}
# Main application
app.example.com {
header {
Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
X-Content-Type-Options "nosniff"
X-Frame-Options "DENY"
Referrer-Policy "strict-origin-when-cross-origin"
-Server
}
@api path /api/*
reverse_proxy @api backend:8080
@actuator path /actuator/*
respond @actuator 404
reverse_proxy frontend:3000
}
# Gitea — source code and CI
git.example.com {
reverse_proxy gitea:3000
}
# Grafana — observability
grafana.example.com {
basicauth {
admin $2a$14$...
}
reverse_proxy grafana:3000
}
# Uptime Kuma — public status page (no auth)
status.example.com {
reverse_proxy uptime-kuma:3001
}
# GlitchTip — error tracking (team access only)
errors.example.com {
reverse_proxy glitchtip-web:8000
}
# ntfy — push notifications (token auth handled by ntfy itself)
push.example.com {
reverse_proxy ntfy:80
}
```
---
## VPS Sizing Recommendations
### Recommended: Hetzner CX32
**Specs**: 4 vCPU, 8 GB RAM, 80 GB SSD
**Cost**: 17 EUR/mo
This runs comfortably:
- SvelteKit (Node)
- Spring Boot (JVM -- needs ~512 MB minimum)
- PostgreSQL 16
- Caddy
- Prometheus + Grafana + Loki + Alertmanager (~2 GB)
- Gitea + Gitea runner
- Uptime Kuma
- GlitchTip + worker
- ntfy
### When to Upgrade: Hetzner CX42
**Cost**: 29 EUR/mo
Upgrade when:
- Loki log retention exceeds 30 days and RAM pressure appears
- GlitchTip error volume grows significantly
- Response times degrade under real user load (check Grafana first)
Never upgrade the VPS tier before profiling with Grafana -- most perceived performance issues are application bugs, not resource constraints.
---
## Monthly Cost Breakdown
| Service | Cost |
|---|---|
| Hetzner CX32 VPS | 17.00 EUR |
| Hetzner Object Storage (~200 GB) | 5.00 EUR |
| Hetzner SMTP relay | ~1.00 EUR |
| Hetzner DNS | 0.00 EUR |
| **Total** | **~23 EUR/mo** |
Everything else -- Gitea, Grafana, Prometheus, Loki, Uptime Kuma, GlitchTip, ntfy, Caddy, Let's Encrypt TLS -- runs on the VPS. Zero additional cost.
Equivalent SaaS stack: 200-300 EUR/mo.
---
## Hetzner Ecosystem Overview
Everything possible runs on Hetzner. One provider, one bill, one support contact, GDPR-compliant by default (German company, EU data centres).
### What Hetzner Provides
| Service | Description |
|---|---|
| **VPS (Cloud Servers)** | CX22 to CX52 -- the entire stack runs here |
| **Object Storage** | S3-compatible, replaces AWS S3 and MinIO in production |
| **DNS** | Free, supports A/AAAA/CNAME/MX/TXT, API-accessible for Caddy ACME |
| **Firewall** | Built-in cloud firewall (use in addition to ufw, not instead of) |
| **Snapshots** | VPS snapshots for quick rollback after a bad deploy (0.013 EUR/GB/mo) |
| **Volumes** | Attachable block storage if the VPS disk fills up (0.048 EUR/GB/mo) |
| **SMTP relay** | Transactional email via your Hetzner account |