docs(observability): document Grafana in DEPLOYMENT.md and C4 diagram
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 4m7s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Successful in 5m41s
CI / fail2ban Regex (pull_request) Successful in 45s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m1s
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 4m7s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Successful in 5m41s
CI / fail2ban Regex (pull_request) Successful in 45s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m1s
Add Grafana row to the observability services table, Grafana access details (URL, credentials, auto-provisioned datasources, pre-loaded dashboards), and GRAFANA_ADMIN_PASSWORD to the env vars table in DEPLOYMENT.md. Update C4 l2-containers.puml: replace placeholder Grafana entry with pinned image version, expand observability boundary with node_exporter and cadvisor containers, and add Rel() edges for Grafana → Prometheus, Loki, and Tempo. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -142,6 +142,8 @@ All vars are set in `.env` at the repo root (copy from `.env.example`). The back
|
||||
| Variable | Purpose | Default | Required? | Sensitive? |
|
||||
|---|---|---|---|---|
|
||||
| `PORT_PROMETHEUS` | Host port for the Prometheus UI (bound to `127.0.0.1` only) | `9090` | — | — |
|
||||
| `PORT_GRAFANA` | Host port for the Grafana UI (bound to `127.0.0.1` only) | `3001` | — | — |
|
||||
| `GRAFANA_ADMIN_PASSWORD` | Grafana `admin` user password | `changeme` | YES (prod) | YES |
|
||||
|
||||
---
|
||||
|
||||
@@ -284,6 +286,25 @@ Current services:
|
||||
| `obs-loki` | `grafana/loki:3.4.2` | Log aggregation — receives log streams from Promtail. Port 3100 is `expose`-only (not host-bound). |
|
||||
| `obs-promtail` | `grafana/promtail:3.4.2` | Log shipping agent — reads all Docker container logs via the Docker socket and forwards them to Loki with `container_name`, `compose_service`, and `compose_project` labels |
|
||||
| `obs-tempo` | `grafana/tempo:2.7.2` | Distributed trace storage — OTLP gRPC receiver on port 4317, OTLP HTTP on port 4318 (both `archiv-net`-internal). Grafana queries traces on port 3200 (`obs-net`-internal). All ports are `expose`-only (not host-bound). |
|
||||
| `obs-grafana` | `grafana/grafana-oss:11.6.1` | Unified observability UI — metrics dashboards, log exploration, trace viewer. Bound to `127.0.0.1:${PORT_GRAFANA:-3001}` on the host. |
|
||||
|
||||
#### Grafana
|
||||
|
||||
| Item | Value |
|
||||
|---|---|
|
||||
| URL | `http://localhost:3001` (or `http://localhost:$PORT_GRAFANA`) |
|
||||
| Username | `admin` |
|
||||
| Password | `$GRAFANA_ADMIN_PASSWORD` (default: `changeme` — **change before exposing to a network**) |
|
||||
|
||||
Datasources are auto-provisioned on first start (Prometheus, Loki, Tempo — no manual setup required). Three dashboards are pre-loaded:
|
||||
|
||||
| Dashboard | Grafana ID | Purpose |
|
||||
|---|---|---|
|
||||
| Node Exporter Full | 1860 | Host CPU, memory, disk, network |
|
||||
| Spring Boot Observability | 17175 | JVM metrics, HTTP latency, error rate |
|
||||
| Loki Logs | 13639 | Log exploration and filtering |
|
||||
|
||||
Tempo traces are accessible via Grafana Explore → Tempo datasource, and linked from Loki logs via the `traceId` derived field.
|
||||
|
||||
**Loki quick checks** (after ~60 s, run from inside the `obs-loki` container):
|
||||
|
||||
@@ -301,7 +322,7 @@ docker exec obs-loki wget -qO- \
|
||||
|
||||
**Prefer `compose_service` over `container_name` in LogQL queries** — `container_name` differs between dev (`archive-backend`) and prod (`archiv-production-backend-1`), while `compose_service` is stable (`backend`, `db`, `minio`, etc.).
|
||||
|
||||
Prometheus port `9090` is bound to `127.0.0.1:${PORT_PROMETHEUS:-9090}` on the host. No other observability ports are host-bound. Full wiring and Grafana dashboards are tracked in issue #581.
|
||||
Prometheus port `9090` and Grafana port `3001` are bound to `127.0.0.1` on the host. No other observability ports are host-bound.
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -17,12 +17,14 @@ System_Boundary(archiv, "Familienarchiv (Docker Compose)") {
|
||||
Container(mc, "Bucket / Service-Account Init", "MinIO Client (mc)", "One-shot container on startup. Idempotent: creates the archive bucket, the archiv-app service account, and attaches the readwrite policy.")
|
||||
}
|
||||
|
||||
System_Boundary(observability, "Observability Stack (docker-compose.observability.yml / archiv-net)") {
|
||||
Container(prometheus, "Prometheus", "prom/prometheus", "Scrapes metrics from backend management port 8081 (/actuator/prometheus). Retention and alert rules TBD — see issue #581.")
|
||||
System_Boundary(observability, "Observability Stack (docker-compose.observability.yml)") {
|
||||
Container(prometheus, "Prometheus", "prom/prometheus:v3.4.0", "Scrapes metrics from backend management port 8081 (/actuator/prometheus), node-exporter, and cAdvisor. Retention: 30 days.")
|
||||
Container(node_exporter, "Node Exporter", "prom/node-exporter:v1.9.0", "Host-level CPU, memory, disk, and network metrics.")
|
||||
Container(cadvisor, "cAdvisor", "gcr.io/cadvisor/cadvisor:v0.52.1", "Per-container resource metrics.")
|
||||
Container(loki, "Loki", "grafana/loki:3.4.2", "Stores log streams from all containers.")
|
||||
Container(promtail, "Promtail", "grafana/promtail:3.4.2", "Ships Docker container logs to Loki via Docker SD")
|
||||
Container(promtail, "Promtail", "grafana/promtail:3.4.2", "Ships Docker container logs to Loki via Docker SD.")
|
||||
Container(tempo, "Tempo", "grafana/tempo:2.7.2", "Distributed trace storage. OTLP gRPC receiver on port 4317 (archiv-net). Grafana queries traces on port 3200 (obs-net). All ports internal only.")
|
||||
Container(grafana, "Grafana", "grafana/grafana", "Dashboards and alerting UI. Data sources: Prometheus + Loki + Tempo. Wiring TBD — see issue #581.")
|
||||
Container(grafana, "Grafana", "grafana/grafana-oss:11.6.1", "Unified observability UI — dashboards, logs, traces. Datasources (Prometheus, Loki, Tempo) and three dashboards are auto-provisioned.")
|
||||
}
|
||||
|
||||
Rel(user, caddy, "HTTPS", "TLS 1.2/1.3")
|
||||
@@ -38,5 +40,8 @@ Rel(ocr, storage, "Fetches PDF via presigned URL", "HTTP / S3 presigned")
|
||||
Rel(mc, storage, "Bootstraps bucket + service account on startup", "MinIO Client CLI")
|
||||
Rel(promtail, loki, "Pushes log streams", "HTTP/Loki push API")
|
||||
Rel(backend, tempo, "Sends distributed traces via OTLP", "gRPC / OTLP / port 4317 (archiv-net)")
|
||||
Rel(grafana, prometheus, "Queries metrics", "HTTP 9090")
|
||||
Rel(grafana, loki, "Queries logs", "HTTP 3100")
|
||||
Rel(grafana, tempo, "Queries traces", "HTTP 3200")
|
||||
|
||||
@enduml
|
||||
|
||||
Reference in New Issue
Block a user