devops(observability): scaffold docker-compose.observability.yml and infra/observability/ structure
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m19s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Successful in 4m28s
CI / fail2ban Regex (pull_request) Successful in 39s
CI / Compose Bucket Idempotency (pull_request) Successful in 55s

Creates the skeleton observability stack (no running services yet) that all
subsequent Grafana LGTM + GlitchTip issues depend on:

- docker-compose.observability.yml: external archiv-net join, obs-net bridge,
  named volumes for all five services, placeholder comments for each service
  group (Metrics/Logs/Traces/Dashboards/Error Tracking), startup-order note
- infra/observability/{prometheus,loki,promtail,tempo,grafana/provisioning/{datasources,dashboards}}/.gitkeep
- .env.example: new # --- Observability --- section with PORT_GRAFANA,
  PORT_GLITCHTIP, PORT_PROMETHEUS, GLITCHTIP_DOMAIN, GLITCHTIP_SECRET_KEY
  (with generation hint), SENTRY_DSN, VITE_SENTRY_DSN

Verified: docker compose -f docker-compose.observability.yml config exits 0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Marcel
2026-05-15 01:23:03 +02:00
parent 33c738db3b
commit 1d42be9882
8 changed files with 72 additions and 0 deletions

View File

@@ -26,6 +26,30 @@ PORT_MAILPIT_SMTP=1025
# Generate with: python3 -c "import secrets; print(secrets.token_hex(32))"
OCR_TRAINING_TOKEN=change-me-in-production
# --- Observability ---
# Optional stack — start with: docker compose -f docker-compose.observability.yml up -d
# Requires the main stack to already be running (docker compose up -d creates archiv-net).
# Ports for host access
PORT_GRAFANA=3001
PORT_GLITCHTIP=3002
PORT_PROMETHEUS=9090
# GlitchTip domain — production: use https://grafana.raddatz.cloud (must match Caddy vhost)
GLITCHTIP_DOMAIN=http://localhost:3002
# GlitchTip secret key — Django SECRET_KEY equivalent, used to sign sessions and tokens.
# REQUIRED in production — must not be empty or 'changeme'. Fail-closed: GlitchTip will
# refuse to start with an invalid key.
# Generate with: python3 -c "import secrets; print(secrets.token_hex(50))"
GLITCHTIP_SECRET_KEY=changeme-generate-a-real-secret
# Error reporting DSNs — leave empty to disable the SDK (safe default).
# SENTRY_DSN: backend (Spring Boot) — used by the GlitchTip/Sentry Java SDK
SENTRY_DSN=
# VITE_SENTRY_DSN: frontend (SvelteKit) — injected at build time via Vite
VITE_SENTRY_DSN=
# Production SMTP — uncomment and fill in to send real emails instead of catching them
# APP_BASE_URL=https://your-domain.example.com
# MAIL_HOST=smtp.example.com

View File

@@ -0,0 +1,48 @@
# Observability stack — Grafana LGTM + GlitchTip
#
# Requires the main stack to be running first:
# docker compose up -d # creates archiv-net
# docker compose -f docker-compose.observability.yml up -d
#
# To validate without starting:
# docker compose -f docker-compose.observability.yml config
# No services defined yet — added in subsequent issues:
#
# --- Metrics: Prometheus ---
# prometheus: (see issue #573)
#
# --- Logs: Loki + Promtail ---
# loki: (see issue #574)
# promtail: (see issue #575)
#
# --- Traces: Tempo ---
# tempo: (see future issue)
#
# --- Dashboards: Grafana ---
# grafana: (see future issue)
#
# --- Error Tracking: GlitchTip ---
# glitchtip: (see future issue)
services: {}
networks:
# Shared network created by the main docker-compose.yml.
# The observability stack joins as a peer so Prometheus can scrape
# archive-backend by container name. The observability stack must NOT
# attempt to create this network — it will fail with a clear error if
# the main stack is not running yet.
archiv-net:
external: true
# Internal network for observability-service-to-service traffic
# (e.g. Grafana → Prometheus, Grafana → Loki, Grafana → Tempo).
obs-net:
driver: bridge
volumes:
prometheus_data:
loki_data:
tempo_data:
grafana_data:
glitchtip_data:

View File

View File

View File

View File