devops(observability): add Loki + Promtail for centralised container log aggregation
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m21s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Successful in 4m31s
CI / fail2ban Regex (pull_request) Successful in 38s
CI / Compose Bucket Idempotency (pull_request) Successful in 57s
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m21s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Successful in 4m31s
CI / fail2ban Regex (pull_request) Successful in 38s
CI / Compose Bucket Idempotency (pull_request) Successful in 57s
- Add obs-loki (grafana/loki:3.4.2) to docker-compose.observability.yml with healthcheck (wget /ready), expose-only port 3100, named volume loki_data - Add obs-promtail (grafana/promtail:3.4.2) bridging archiv-net + obs-net, depends_on loki service_healthy, docker.sock:ro, promtail_positions volume for restart-safe position tracking - Create infra/observability/loki/loki-config.yml: single-node TSDB schema v13, 30-day retention, auth disabled (obs-net only), telemetry off - Create infra/observability/promtail/promtail-config.yml: Docker SD scrape, container_name / compose_service / compose_project / logstream labels - Update docs/DEPLOYMENT.md §4 with service table and Loki quick-check commands Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -265,7 +265,40 @@ docker compose logs --tail=200 <service>
|
||||
|
||||
### Observability stack
|
||||
|
||||
An observability stack (Prometheus + Loki + Grafana) is available via `docker-compose.observability.yml` and configuration lives under `infra/observability/`. It joins the `archiv-net` Docker network to scrape the backend's management port. Full wiring and runbook documentation is tracked in issue #581.
|
||||
An observability stack is available via `docker-compose.observability.yml`. Configuration lives under `infra/observability/`. Start it after the main stack is up (which creates `archiv-net`):
|
||||
|
||||
```bash
|
||||
docker compose up -d # creates archiv-net
|
||||
docker compose -f docker-compose.observability.yml up -d
|
||||
```
|
||||
|
||||
Current services:
|
||||
|
||||
| Service | Image | Purpose |
|
||||
|---|---|---|
|
||||
| `obs-prometheus` | `prom/prometheus:v3.4.0` | Scrapes metrics from backend management port 8081 (`/actuator/prometheus`), node-exporter, and cAdvisor |
|
||||
| `obs-node-exporter` | `prom/node-exporter:v1.9.0` | Host-level CPU / memory / disk / network metrics |
|
||||
| `obs-cadvisor` | `gcr.io/cadvisor/cadvisor:v0.52.1` | Per-container resource metrics |
|
||||
| `obs-loki` | `grafana/loki:3.4.2` | Log aggregation — receives log streams from Promtail. Port 3100 is `expose`-only (not host-bound). |
|
||||
| `obs-promtail` | `grafana/promtail:3.4.2` | Log shipping agent — reads all Docker container logs via the Docker socket and forwards them to Loki with `container_name`, `compose_service`, and `compose_project` labels |
|
||||
|
||||
**Loki quick checks** (after ~60 s, run from inside the `obs-loki` container):
|
||||
|
||||
```bash
|
||||
# Loki health
|
||||
docker exec obs-loki wget -qO- http://localhost:3100/ready
|
||||
|
||||
# List labels
|
||||
docker exec obs-loki wget -qO- 'http://localhost:3100/loki/api/v1/labels'
|
||||
|
||||
# Query logs by service (stable across dev and prod environments)
|
||||
docker exec obs-loki wget -qO- \
|
||||
'http://localhost:3100/loki/api/v1/query_range?query=%7Bcompose_service%3D%22backend%22%7D&limit=5'
|
||||
```
|
||||
|
||||
**Prefer `compose_service` over `container_name` in LogQL queries** — `container_name` differs between dev (`archive-backend`) and prod (`archiv-production-backend-1`), while `compose_service` is stable (`backend`, `db`, `minio`, etc.).
|
||||
|
||||
Prometheus port `9090` is bound to `127.0.0.1:${PORT_PROMETHEUS:-9090}` on the host. No other observability ports are host-bound. Full wiring and Grafana dashboards are tracked in issue #581.
|
||||
|
||||
---
|
||||
|
||||
|
||||
Reference in New Issue
Block a user