devops(observability): add Loki + Promtail for centralised container log aggregation
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m21s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Successful in 4m31s
CI / fail2ban Regex (pull_request) Successful in 38s
CI / Compose Bucket Idempotency (pull_request) Successful in 57s
- Add obs-loki (grafana/loki:3.4.2) to docker-compose.observability.yml with healthcheck (wget /ready), expose-only port 3100, named volume loki_data
- Add obs-promtail (grafana/promtail:3.4.2) bridging archiv-net + obs-net, depends_on loki service_healthy, docker.sock:ro, promtail_positions volume for restart-safe position tracking
- Create infra/observability/loki/loki-config.yml: single-node TSDB schema v13, 30-day retention, auth disabled (obs-net only), telemetry off
- Create infra/observability/promtail/promtail-config.yml: Docker SD scrape, container_name / compose_service / compose_project / logstream labels
- Update docs/DEPLOYMENT.md §4 with service table and Loki quick-check commands

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
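
For orientation, a Loki configuration matching the description above (single node, TSDB schema v13, 30-day retention, auth disabled, telemetry off) could look roughly like the sketch below. This is not the committed infra/observability/loki/loki-config.yml, which is not shown here. The schema start date and the sub-directories under /loki are assumptions chosen for illustration; only the port, the /loki data path, and the listed properties come from this commit.

```yaml
# Hypothetical sketch of a single-node Loki config; not the file from this commit.
auth_enabled: false            # acceptable only because Loki is reachable on obs-net alone

server:
  http_listen_port: 3100       # matches the exposed port in the compose service

common:
  path_prefix: /loki           # the loki_data volume is mounted here
  replication_factor: 1        # single node
  ring:
    kvstore:
      store: inmemory
  storage:
    filesystem:
      chunks_directory: /loki/chunks   # assumed sub-directory
      rules_directory: /loki/rules     # assumed sub-directory

schema_config:
  configs:
    - from: "2024-01-01"       # assumed start date; any date before first ingestion works
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

limits_config:
  retention_period: 720h       # 30 days

compactor:
  working_directory: /loki/compactor   # assumed sub-directory
  retention_enabled: true              # retention is enforced by the compactor
  delete_request_store: filesystem     # needed in Loki 3.x when retention is enabled

analytics:
  reporting_enabled: false     # telemetry off
```

Setting `retention_period` alone does not delete anything; the compactor has to run with `retention_enabled: true` for old chunks to be removed.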
@@ -73,9 +73,47 @@ services:
       - obs-net
 
   # --- Logs: Loki + Promtail ---
-  # loki: (see issue #574)
-  # promtail: (see issue #575)
-  #
+
+  loki:
+    image: grafana/loki:3.4.2
+    container_name: obs-loki
+    restart: unless-stopped
+    volumes:
+      - ./infra/observability/loki/loki-config.yml:/etc/loki/loki-config.yml:ro
+      - loki_data:/loki
+    command: -config.file=/etc/loki/loki-config.yml
+    expose:
+      - "3100"
+    healthcheck:
+      test: ["CMD-SHELL", "wget -qO- http://localhost:3100/ready | grep -q ready || exit 1"]
+      interval: 10s
+      timeout: 5s
+      retries: 5
+      start_period: 30s
+    networks:
+      - obs-net
+
+  promtail:
+    image: grafana/promtail:3.4.2
+    container_name: obs-promtail
+    restart: unless-stopped
+    volumes:
+      - ./infra/observability/promtail/promtail-config.yml:/etc/promtail/promtail-config.yml:ro
+      - /var/lib/docker/containers:/var/lib/docker/containers:ro
+      # /var/run/docker.sock gives Promtail container-name discovery. Trade-off: any
+      # process that can write to this socket can control the Docker daemon (container
+      # escape). Acceptable on a single-operator archive; review if multi-user access
+      # to the host is ever introduced.
+      - /var/run/docker.sock:/var/run/docker.sock:ro
+      - promtail_positions:/tmp # persists positions.yaml across restarts — avoids duplicate log ingestion
+    command: -config.file=/etc/promtail/promtail-config.yml
+    networks:
+      - archiv-net # label discovery from application containers via Docker socket
+      - obs-net # log shipping to Loki
+    depends_on:
+      loki:
+        condition: service_healthy
+
   # --- Traces: Tempo ---
   # tempo: (see future issue)
   #
@@ -102,6 +140,7 @@ networks:
 volumes:
   prometheus_data:
   loki_data:
+  promtail_positions:
   tempo_data:
   grafana_data:
   glitchtip_data:
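
The promtail service in the diff mounts the Docker socket for discovery, keeps its positions file under /tmp, and reaches Loki over obs-net. A promtail-config.yml consistent with those mounts and with the four labels named in the commit message might look like the sketch below. It is an illustration only, not the committed file: the listen port, refresh interval, and job name are assumptions, and the push URL assumes the compose service name `loki` resolves on obs-net.

```yaml
# Hypothetical sketch of a Promtail config using Docker service discovery;
# not the file from this commit.
server:
  http_listen_port: 9080       # assumed; Promtail's usual default
  grpc_listen_port: 0          # 0 picks a random port

positions:
  filename: /tmp/positions.yaml    # /tmp is backed by the promtail_positions volume

clients:
  - url: http://loki:3100/loki/api/v1/push   # assumes service name "loki" on obs-net

scrape_configs:
  - job_name: docker               # assumed job name
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s       # assumed interval
    relabel_configs:
      # Docker reports names with a leading slash; strip it.
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: 'container_name'
      - source_labels: ['__meta_docker_container_label_com_docker_compose_service']
        target_label: 'compose_service'
      - source_labels: ['__meta_docker_container_label_com_docker_compose_project']
        target_label: 'compose_project'
      - source_labels: ['__meta_docker_container_log_stream']
        target_label: 'logstream'
```

Containers started outside Compose carry no `com.docker.compose.*` labels, so `compose_service` and `compose_project` would simply be empty for them.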