familienarchiv/docs/infrastructure/production-compose.md
2026-04-14 23:21:15 +02:00


Production Docker Compose & Infrastructure

This document contains the full production Docker Compose file, Caddyfile, VPS sizing recommendations, cost breakdown, and Hetzner ecosystem overview.


Full docker-compose.prod.yml

Usage: docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d

# docker-compose.prod.yml
# Usage: docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d

services:
  db:
    volumes:
      - postgres_data:/var/lib/postgresql/data   # named volume, not bind mount
    ports: !reset []      # remove host port exposure in production
    expose:
      - "5432"

  minio:
    profiles: ["dev"]     # dev-only; prod uses Hetzner Object Storage

  create-buckets:
    profiles: ["dev"]

  mailpit:
    profiles: ["dev"]

  backend:
    image: gitea.example.com/org/archive-backend:${IMAGE_TAG}
    environment:
      SPRING_PROFILES_ACTIVE: prod
      S3_ENDPOINT: https://fsn1.your-objectstorage.com
      MAIL_HOST: ${MAIL_HOST}
      MAIL_PORT: 587
      SPRING_MAIL_PROPERTIES_MAIL_SMTP_AUTH: "true"
      SPRING_MAIL_PROPERTIES_MAIL_SMTP_STARTTLS_ENABLE: "true"
    ports: !reset []
    expose:
      - "8080"
      - "8081"   # management port for Prometheus scraping only

  frontend:
    image: gitea.example.com/org/archive-frontend:${IMAGE_TAG}
    ports: !reset []
    expose:
      - "3000"

  caddy:
    image: caddy:2-alpine
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
      - "443:443/udp"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro
      - caddy_data:/data
      - caddy_config:/config

  # ── Observability ──────────────────────────────────────────────────────────
  prometheus:
    image: prom/prometheus:v2.51.0  # pinned
    restart: unless-stopped
    volumes:
      - ./observability/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus_data:/prometheus
    expose: ["9090"]

  grafana:
    image: grafana/grafana:10.4.0   # pinned
    restart: unless-stopped
    environment:
      GF_SECURITY_ADMIN_PASSWORD: ${GRAFANA_PASSWORD}
      GF_PATHS_PROVISIONING: /etc/grafana/provisioning
      GF_SERVER_ROOT_URL: https://grafana.example.com
    volumes:
      - ./observability/grafana/provisioning:/etc/grafana/provisioning:ro
      - grafana_data:/var/lib/grafana
    expose: ["3000"]

  loki:
    image: grafana/loki:2.9.0       # pinned
    restart: unless-stopped
    command: -config.file=/etc/loki/config.yml   # image default is /etc/loki/local-config.yaml
    volumes:
      - ./observability/loki-config.yml:/etc/loki/config.yml:ro
      - loki_data:/loki
    expose: ["3100"]

  promtail:
    image: grafana/promtail:2.9.0   # pinned
    restart: unless-stopped
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./observability/promtail-config.yml:/etc/promtail/config.yml:ro

  alertmanager:
    image: prom/alertmanager:v0.27.0  # pinned
    restart: unless-stopped
    volumes:
      - ./observability/alertmanager.yml:/etc/alertmanager/alertmanager.yml:ro
    expose: ["9093"]

  # ── Uptime monitoring ──────────────────────────────────────────────────────
  uptime-kuma:
    image: louislam/uptime-kuma:1
    restart: unless-stopped
    volumes:
      - uptime_kuma_data:/app/data
    expose: ["3001"]

  # ── Error tracking ─────────────────────────────────────────────────────────
  glitchtip-web:
    image: glitchtip/glitchtip:latest
    restart: unless-stopped
    depends_on: [db]
    environment:
      DATABASE_URL: postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@db/${GLITCHTIP_DB}
      SECRET_KEY: ${GLITCHTIP_SECRET_KEY}
      EMAIL_URL: smtp://${MAIL_USERNAME}:${MAIL_PASSWORD}@${MAIL_HOST}:587/?tls=true
      GLITCHTIP_DOMAIN: https://errors.example.com
    expose: ["8000"]

  glitchtip-worker:
    image: glitchtip/glitchtip:latest
    restart: unless-stopped
    command: ./bin/run-celery-with-beat.sh
    depends_on: [glitchtip-web]
    environment:
      DATABASE_URL: postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@db/${GLITCHTIP_DB}
      SECRET_KEY: ${GLITCHTIP_SECRET_KEY}

  # ── Push notifications ─────────────────────────────────────────────────────
  ntfy:
    image: binwiederhier/ntfy:latest
    command: serve        # the image's entrypoint is the ntfy binary; "serve" starts the server
    restart: unless-stopped
    volumes:
      - ntfy_data:/var/lib/ntfy
      - ./ntfy/server.yml:/etc/ntfy/server.yml:ro
    expose: ["80"]

volumes:
  postgres_data:
  caddy_data:
  caddy_config:
  prometheus_data:
  grafana_data:
  loki_data:
  uptime_kuma_data:
  glitchtip_data:
  ntfy_data:
  frontend_node_modules:
  maven_cache:
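
The compose file mounts ./observability/prometheus.yml but does not show its contents. A minimal sketch follows, assuming the backend serves Micrometer metrics at Spring Boot's default /actuator/prometheus path on the management port 8081 and that alerts go to the alertmanager service on port 9093. The job names and the scrape interval are illustrative choices, not taken from the project.

```yaml
# observability/prometheus.yml (sketch; job names and interval are assumptions)
global:
  scrape_interval: 15s

alerting:
  alertmanagers:
    - static_configs:
        - targets: ["alertmanager:9093"]   # service name and port from the compose file

scrape_configs:
  - job_name: backend                      # hypothetical job name
    metrics_path: /actuator/prometheus     # Spring Boot Actuator default for Prometheus
    static_configs:
      - targets: ["backend:8081"]          # management port, reachable only on the compose network

  - job_name: prometheus                   # Prometheus scraping itself
    static_configs:
      - targets: ["localhost:9090"]
```

Because 8081 is listed under expose rather than ports, the metrics endpoint is reachable from Prometheus over the compose network but not from the host or the internet.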

Full Caddyfile -- All Virtual Hosts

{
    email admin@example.com
}

# Main application
app.example.com {
    header {
        Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
        X-Content-Type-Options "nosniff"
        X-Frame-Options "DENY"
        Referrer-Policy "strict-origin-when-cross-origin"
        -Server
    }
    @api path /api/*
    reverse_proxy @api backend:8080
    @actuator path /actuator/*
    respond @actuator 404
    reverse_proxy frontend:3000
}

# Gitea — source code and CI
git.example.com {
    reverse_proxy gitea:3000
}

# Grafana — observability
grafana.example.com {
    basicauth {
        admin $2a$14$...
    }
    reverse_proxy grafana:3000
}

# Uptime Kuma — public status page (no auth)
status.example.com {
    reverse_proxy uptime-kuma:3001
}

# GlitchTip — error tracking (team access only)
errors.example.com {
    reverse_proxy glitchtip-web:8000
}

# ntfy — push notifications (token auth handled by ntfy itself)
push.example.com {
    reverse_proxy ntfy:80
}

VPS Sizing Recommendations

Recommended starting point: Hetzner CX32

Specs: 4 vCPU, 8 GB RAM, 80 GB SSD

Cost: 17 EUR/mo

This tier comfortably runs the whole stack:

  • SvelteKit (Node)
  • Spring Boot (JVM -- needs ~512 MB minimum)
  • PostgreSQL 16
  • Caddy
  • Prometheus + Grafana + Loki + Alertmanager (~2 GB)
  • Gitea + Gitea runner
  • Uptime Kuma
  • GlitchTip + worker
  • ntfy

When to Upgrade: Hetzner CX42

Cost: 29 EUR/mo

Upgrade when:

  • Loki log retention exceeds 30 days and RAM pressure appears
  • GlitchTip error volume grows significantly
  • Response times degrade under real user load (check Grafana first)
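
Log retention is controlled in the mounted loki-config.yml. A fragment that caps retention at 30 days on the pinned Loki 2.9 image might look like the following, assuming filesystem storage under the /loki volume. The working directory path is an assumption, and this is not a complete Loki config.

```yaml
# Fragment of observability/loki-config.yml (Loki 2.9.x); not a complete config
compactor:
  working_directory: /loki/compactor   # assumed path inside the loki_data volume
  shared_store: filesystem
  retention_enabled: true              # the compactor only deletes old chunks when this is true

limits_config:
  retention_period: 720h               # 30 days
```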

Never upgrade the VPS tier before profiling with Grafana -- most perceived performance issues are application bugs, not resource constraints.
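
One way to make "profile first" concrete is a Prometheus alerting rule on backend latency. The sketch below assumes the backend exports Micrometer's default http_server_requests_seconds metrics; the file path, alert name, 500 ms threshold, and 15 minute window are hypothetical values to tune. The rule file would also need to be mounted into the Prometheus container and listed under rule_files in prometheus.yml.

```yaml
# observability/rules/capacity.yml (hypothetical path; thresholds are assumptions)
groups:
  - name: capacity
    rules:
      - alert: BackendSlowResponses
        # Average request duration over the last 5 minutes, across all endpoints
        expr: |
          sum(rate(http_server_requests_seconds_sum[5m]))
            / sum(rate(http_server_requests_seconds_count[5m])) > 0.5
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Average backend response time above 500 ms for 15 minutes"
```

If this alert fires while CPU and memory graphs in Grafana stay flat, the cause is more likely a slow query or an application bug than an undersized VPS.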


Monthly Cost Breakdown

| Service | Cost |
| --- | --- |
| Hetzner CX32 VPS | 17.00 EUR |
| Hetzner Object Storage (~200 GB) | 5.00 EUR |
| Hetzner SMTP relay | ~1.00 EUR |
| Hetzner DNS | 0.00 EUR |
| Total | ~23 EUR/mo |

Everything else -- Gitea, Grafana, Prometheus, Loki, Uptime Kuma, GlitchTip, ntfy, Caddy, Let's Encrypt TLS -- runs on the VPS. Zero additional cost.

Equivalent SaaS stack: 200-300 EUR/mo.


Hetzner Ecosystem Overview

Everything possible runs on Hetzner. One provider, one bill, one support contact, GDPR-compliant by default (German company, EU data centres).

What Hetzner Provides

| Service | Description |
| --- | --- |
| VPS (Cloud Servers) | CX22 to CX52 -- the entire stack runs here |
| Object Storage | S3-compatible, replaces AWS S3 and MinIO in production |
| DNS | Free, supports A/AAAA/CNAME/MX/TXT, API-accessible for Caddy ACME |
| Firewall | Built-in cloud firewall (use in addition to ufw, not instead of) |
| Snapshots | VPS snapshots for quick rollback after a bad deploy (0.013 EUR/GB/mo) |
| Volumes | Attachable block storage if the VPS disk fills up (0.048 EUR/GB/mo) |
| SMTP relay | Transactional email via your Hetzner account |