docs: document observability stack in DEPLOYMENT.md and CLAUDE.md (#581) #605
25
CLAUDE.md
25
CLAUDE.md
@@ -274,6 +274,31 @@ Back button pattern — use the shared `<BackButton>` component from `$lib/share
|
||||
|
||||
→ See [docs/DEPLOYMENT.md](./docs/DEPLOYMENT.md)
|
||||
|
||||
### Observability stack (separate compose file)
|
||||
|
||||
Run via `docker-compose.observability.yml` — requires the main stack to be running first. Full setup procedure: [docs/DEPLOYMENT.md §4](./docs/DEPLOYMENT.md#4-logs--observability).
|
||||
|
||||
| Service | Container | Default Port | Purpose |
|
||||
|---------|-----------|-------------|---------|
|
||||
| Grafana | `obs-grafana` | 3003 | Metrics / logs / traces dashboard |
|
||||
| Prometheus | `obs-prometheus` | 9090 (dev only — `127.0.0.1` bound) | Metrics store |
|
||||
| Loki | `obs-loki` | — (internal) | Log store |
|
||||
| Tempo | `obs-tempo` | — (internal) | Trace store |
|
||||
| GlitchTip | `obs-glitchtip` | 3002 | Error tracking (Sentry-compatible) |
|
||||
|
||||
### Observability env vars
|
||||
|
||||
| Variable | Purpose |
|
||||
|----------|---------|
|
||||
| `PORT_GRAFANA` | Host port for Grafana UI (default: `3003`) |
|
||||
| `PORT_GLITCHTIP` | Host port for GlitchTip UI (default: `3002`) |
|
||||
| `PORT_PROMETHEUS` | Host port for Prometheus UI (default: `9090`) |
|
||||
| `GRAFANA_ADMIN_PASSWORD` | Grafana `admin` login password — generate with `openssl rand -hex 32` |
|
||||
| `GLITCHTIP_SECRET_KEY` | Django secret key for GlitchTip — generate with `python3 -c "import secrets; print(secrets.token_hex(32))"` |
|
||||
| `GLITCHTIP_DOMAIN` | Public-facing base URL for GlitchTip (email links, CORS), e.g. `https://glitchtip.example.com` |
|
||||
| `SENTRY_DSN` | GlitchTip/Sentry DSN for the backend (Spring Boot) — leave empty to disable |
|
||||
| `VITE_SENTRY_DSN` | GlitchTip/Sentry DSN for the frontend (SvelteKit) — injected at build time via Vite |
|
||||
|
||||
## API Testing
|
||||
|
||||
HTTP test files are in `backend/api_tests/` for use with the VS Code REST Client extension.
|
||||
|
||||
@@ -109,6 +109,7 @@ All vars are set in `.env` at the repo root (copy from `.env.example`). The back
|
||||
| `SPRING_PROFILES_ACTIVE` | Spring profile | `dev,e2e` (compose) | YES | — |
|
||||
| `OTEL_EXPORTER_OTLP_ENDPOINT` | OTLP gRPC endpoint for distributed traces (Tempo). Set to `http://tempo:4317` via compose. | `http://localhost:4317` | — | — |
|
||||
| `MANAGEMENT_TRACING_SAMPLING_PROBABILITY` | Micrometer tracing sample rate; overridden to `0.0` in test profile. | `0.1` (compose) / `1.0` (dev) | — | — |
|
||||
| `SENTRY_DSN` | GlitchTip / Sentry DSN for backend error reporting. Leave empty to disable the SDK. Set after GlitchTip first-run (§4). | — | — | YES |
|
||||
|
||||
### PostgreSQL container
|
||||
|
||||
@@ -148,6 +149,7 @@ All vars are set in `.env` at the repo root (copy from `.env.example`). The back
|
||||
| `PORT_GLITCHTIP` | Host port for the GlitchTip UI (bound to `127.0.0.1` only) | `3002` | — | — |
|
||||
| `GLITCHTIP_DOMAIN` | Public-facing base URL for GlitchTip (used in email links and CORS) | `http://localhost:3002` | YES (prod) | — |
|
||||
| `GLITCHTIP_SECRET_KEY` | Django secret key for GlitchTip — generate with `python3 -c "import secrets; print(secrets.token_hex(32))"` | — | YES | YES |
|
||||
| `VITE_SENTRY_DSN` | GlitchTip/Sentry DSN for the frontend (SvelteKit) — injected at build time via Vite. Leave empty to disable. Set after GlitchTip first-run (§4). | — | — | YES |
|
||||
|
||||
---
|
||||
|
||||
@@ -250,6 +252,7 @@ git.raddatz.cloud A <server IP>
|
||||
| `GRAFANA_ADMIN_PASSWORD` | both | Grafana `admin` login — generate a strong password |
|
||||
| `GLITCHTIP_SECRET_KEY` | both | Django secret key — `openssl rand -hex 32` |
|
||||
| `SENTRY_DSN` | both | GlitchTip project DSN — set after first-run (§4); leave empty to keep Sentry disabled |
|
||||
| `VITE_SENTRY_DSN` | both | GlitchTip frontend project DSN — set after first-run (§4); leave empty to keep Sentry disabled |
|
||||
|
||||
### 3.4 First deploy
|
||||
|
||||
@@ -373,9 +376,9 @@ Current services:
|
||||
| `obs-loki` | `grafana/loki:3.4.2` | Log aggregation — receives log streams from Promtail. Port 3100 is `expose`-only (not host-bound). |
|
||||
| `obs-promtail` | `grafana/promtail:3.4.2` | Log shipping agent — reads all Docker container logs via the Docker socket and forwards them to Loki with `container_name`, `compose_service`, and `compose_project` labels |
|
||||
| `obs-tempo` | `grafana/tempo:2.7.2` | Distributed trace storage — OTLP gRPC receiver on port 4317, OTLP HTTP on port 4318 (both `archiv-net`-internal). Grafana queries traces on port 3200 (`obs-net`-internal). All ports are `expose`-only (not host-bound). |
|
||||
| `obs-grafana` | `grafana/grafana-oss:11.6.1` | Unified observability UI — metrics dashboards, log exploration, trace viewer. Bound to `127.0.0.1:${PORT_GRAFANA:-3001}` on the host. |
|
||||
| `obs-glitchtip` | `glitchtip/glitchtip:v4` | Sentry-compatible error tracker. Receives frontend + backend error events, groups by fingerprint, provides issue UI with stack traces. Bound to `127.0.0.1:${PORT_GLITCHTIP:-3002}`. |
|
||||
| `obs-glitchtip-worker` | `glitchtip/glitchtip:v4` | Celery + beat worker — processes async GlitchTip tasks (event ingestion, notifications, cleanup). |
|
||||
| `obs-grafana` | `grafana/grafana-oss:11.6.1` | Unified observability UI — metrics dashboards, log exploration, trace viewer. Bound to `127.0.0.1:${PORT_GRAFANA:-3003}` on the host. |
|
||||
| `obs-glitchtip` | `glitchtip/glitchtip:6.1.6` | Sentry-compatible error tracker. Receives frontend + backend error events, groups by fingerprint, provides issue UI with stack traces. Bound to `127.0.0.1:${PORT_GLITCHTIP:-3002}`. |
|
||||
| `obs-glitchtip-worker` | `glitchtip/glitchtip:6.1.6` | Celery + beat worker — processes async GlitchTip tasks (event ingestion, notifications, cleanup). |
|
||||
| `obs-redis` | `redis:7-alpine` | Celery task broker for GlitchTip. Internal to `obs-net`; no host port exposed. |
|
||||
| `obs-glitchtip-db-init` | `postgres:16-alpine` | One-shot init container. Creates the `glitchtip` database on the existing `archive-db` PostgreSQL instance if it does not already exist. Runs at stack startup; exits cleanly once done. |
|
||||
|
||||
|
||||
@@ -25,8 +25,8 @@ System_Boundary(observability, "Observability Stack (/opt/familienarchiv/docker-
|
||||
Container(promtail, "Promtail", "grafana/promtail:3.4.2", "Ships Docker container logs to Loki via Docker SD.")
|
||||
Container(tempo, "Tempo", "grafana/tempo:2.7.2", "Distributed trace storage. OTLP gRPC receiver on port 4317 (archiv-net). Grafana queries traces on port 3200 (obs-net). All ports internal only.")
|
||||
Container(grafana, "Grafana", "grafana/grafana-oss:11.6.1", "Unified observability UI — dashboards, logs, traces. Datasources (Prometheus, Loki, Tempo) and three dashboards are auto-provisioned.")
|
||||
Container(glitchtip, "GlitchTip", "glitchtip/glitchtip:v4", "Sentry-compatible error tracker — web process. Receives frontend + backend error events, groups by fingerprint, provides issue UI with stack traces.")
|
||||
Container(obs_glitchtip_worker, "GlitchTip Worker", "glitchtip/glitchtip:v4", "Celery + beat worker — async event ingestion, notifications, cleanup.")
|
||||
Container(glitchtip, "GlitchTip", "glitchtip/glitchtip:6.1.6", "Sentry-compatible error tracker — web process. Receives frontend + backend error events, groups by fingerprint, provides issue UI with stack traces.")
|
||||
Container(obs_glitchtip_worker, "GlitchTip Worker", "glitchtip/glitchtip:6.1.6", "Celery + beat worker — async event ingestion, notifications, cleanup.")
|
||||
Container(obs_redis, "Redis", "redis:7-alpine", "Celery task queue for GlitchTip async workers.")
|
||||
}
|
||||
|
||||
|
||||
@@ -12,11 +12,11 @@ The original spec in this doc proposed an overlay pattern (`docker compose -f do
|
||||
|
||||
---
|
||||
|
||||
## Observability stack — not yet deployed
|
||||
## Observability stack
|
||||
|
||||
Prometheus, Loki, Grafana, Alertmanager, Uptime Kuma, GlitchTip and ntfy are **not** part of the production deployment that #497 landed. They are tracked as follow-up issue #498.
|
||||
The observability stack (Prometheus, Loki, Grafana, Tempo, GlitchTip) ships as a separate `docker-compose.observability.yml` alongside the main stack. Configuration lives under `infra/observability/`.
|
||||
|
||||
When that lands the observability containers will join `docker-compose.prod.yml` under a dedicated profile so they can be operated alongside the application stack without affecting the application containers' restart cycle.
|
||||
→ See [docs/DEPLOYMENT.md §4](../DEPLOYMENT.md#4-logs--observability) for the full setup procedure, service URLs, first-run steps, and env var reference.
|
||||
|
||||
---
|
||||
|
||||
|
||||
Reference in New Issue
Block a user