fix(deploy): wire SENTRY_DSN + JSON structured logs through prod deployment #641

Closed
opened 2026-05-20 08:13:39 +02:00 by marcel · 0 comments
Owner

Problem

Two observability gaps prevent errors from surfacing in GlitchTip and Grafana:

1. GlitchTip receives no errors

SENTRY_DSN is never passed to the backend container: docker-compose.prod.yml has no SENTRY_DSN env var, and nightly.yml does not write it into .env.staging. application.yaml sets sentry.dsn: ${SENTRY_DSN:}, which resolves to an empty string when the variable is unset, so the Sentry SDK runs as a no-op.

Sentry.captureException(ex) is called in GlobalExceptionHandler but silently does nothing.

2. Grafana/Loki logs are unstructured

Spring Boot logs in default text format:

2026-05-20T05:53:28.661Z ERROR 1 --- [Familienarchiv] [thread] o.r.f.exception.GlobalExceptionHandler : Unhandled exception
...50 lines of stack trace...

Problems:

  • Loki reports detected_level: unknown — can't filter by severity
  • Multiline stack traces appear as 50 separate unlinked log entries
  • Grafana Loki dashboard uses | logfmt which silently fails on this format

Fix

  1. Enable Spring Boot 4.0 ECS structured logging via env var LOGGING_STRUCTURED_FORMAT_CONSOLE=ecs in docker-compose.prod.yml → single JSON entry per exception including full stack trace, log.level parsed by Loki
  2. Add SENTRY_DSN: ${SENTRY_DSN:-} to backend service env in docker-compose.prod.yml
  3. Add SENTRY_DSN=${{ secrets.STAGING_SENTRY_DSN }} to the .env.staging writer in nightly.yml
  4. Update the Grafana Loki dashboard queries: | logfmt → | json
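
Steps 1 to 3 could look roughly like the two fragments below. This is a sketch rather than the actual diff: the service name `backend`, the step name, and the `echo`-based way of appending to `.env.staging` are assumptions, since the issue does not show the existing files. Only the variable names and values come from the list above.

```yaml
# docker-compose.prod.yml (fragment); the service name "backend" is assumed
services:
  backend:
    environment:
      # Defaults to an empty string when SENTRY_DSN is unset,
      # which leaves the Sentry SDK disabled instead of failing startup
      SENTRY_DSN: ${SENTRY_DSN:-}
      # Bound by Spring Boot to logging.structured.format.console=ecs
      LOGGING_STRUCTURED_FORMAT_CONSOLE: ecs
```

```yaml
# nightly.yml (fragment); step name and echo form are hypothetical,
# adapt to however the workflow currently writes .env.staging
- name: Write .env.staging
  run: |
    echo "SENTRY_DSN=${{ secrets.STAGING_SENTRY_DSN }}" >> .env.staging
```

Keeping the `:-` default in the compose file means environments without a DSN still start normally, with error reporting simply switched off.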

DSNs

  • Backend: https://686ec2daa9bb45dc8e264e1e2727c8a4@glitchtip.archiv.raddatz.cloud/2
  • Frontend: https://758169b5be8e4d799d09aaca4215036d@glitchtip.archiv.raddatz.cloud/1 (needs separate Gitea secret STAGING_VITE_SENTRY_DSN + build-arg wiring — defer to follow-up)

Why now

A 500 on the document page is currently occurring in staging (see investigation). The logging fix makes the error traceable in GlitchTip and Grafana so we can verify the 500 fix works after deploying it.

marcel added the P1-high, bug, devops labels 2026-05-20 08:13:53 +02:00
Reference: marcel/familienarchiv#641