Port 4317 is gRPC; the backend uses HttpExporter (HTTP/1.1) and sends
to port 4318. Update Container description and Rel label to match.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- New docs/OBSERVABILITY.md: developer-facing guide with a "where to look
for what" table, common LogQL queries, trace exploration workflow,
log→trace correlation via traceId links, and a signal summary table
- Link from DEPLOYMENT.md §4 (ops section now points to dev guide) and
from CLAUDE.md Infrastructure section
- Fix stale DEPLOYMENT.md env var table: OTEL_EXPORTER_OTLP_ENDPOINT
now documents port 4318 (HTTP) not 4317 (gRPC); add the three new
env vars wired in this PR (OTEL_LOGS_EXPORTER, OTEL_METRICS_EXPORTER,
MANAGEMENT_METRICS_TAGS_APPLICATION) with their rationale
- Fix stale obs-tempo service description (port 4318, not 4317)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Tempo only handles traces; sending metrics to /v1/metrics returns 404.
Prometheus already scrapes Spring Boot metrics via the pull-model at
/actuator/prometheus, so OTLP metric push is redundant and noisy.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Change OTEL default endpoint from port 4317 (gRPC) to 4318 (HTTP) to
match Spring Boot's HttpExporter; sending HTTP/1.1 to a gRPC listener
caused "Connection reset" errors
- Add otel.logs.exporter=none: Promtail captures Docker logs via the
logging driver; sending logs to Tempo's OTLP endpoint (which only
handles traces) produced 404 errors
- Add management.metrics.tags.application to every metric so Grafana's
Spring Boot Observability dashboard (ID 17175) can filter by the
application label_values() template variable
- Add MANAGEMENT_METRICS_TAGS_APPLICATION and OTEL_LOGS_EXPORTER env
vars to docker-compose.prod.yml; production Tempo endpoint already
uses 4318
- Add MANAGEMENT_TRACING_SAMPLING_PROBABILITY to prod compose with
0.1 default to avoid 100% trace sampling in production
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The job label (derived from the Docker Compose service name) is what
powers {job="backend"} queries in Loki dashboards and populates the
Grafana "App" variable dropdown. Operators need to know this mapping
when writing custom Loki queries.
Addresses @markus non-blocker suggestion from PR #606 review.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Documents the architectural decision behind the dedicated management
SecurityFilterChain, the discovery that SB4+Jetty removed the isolated
management child-context security, and the consequences for actuator
endpoint exposure.
Addresses @markus blocker from PR #606 review.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add @Order(1) managementFilterChain scoped to /actuator/** with explicit
401 entry point, blocking all non-public actuator paths without the
form-login redirect that the main chain uses for browser clients.
- Split single combined test into two focused assertions
(prometheus_endpoint_returns_200_without_credentials,
prometheus_endpoint_returns_jvm_metrics).
- Add negative regression test: actuator_metrics_requires_authentication
verifies that /actuator/metrics returns 401 without credentials.
Addresses reviewer concerns from @sara (missing negative test, split
assertions) and @nora (dedicated management security layer).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Four Spring Boot 4.0-specific issues prevented /actuator/prometheus from working:
1. spring-boot-starter-micrometer-metrics missing — Spring Boot 4.0 splits
Micrometer metrics export (including the Prometheus scrape endpoint) out of
spring-boot-starter-actuator into its own starter. Added dependency.
2. management.prometheus.metrics.export.enabled not set — Spring Boot 4.0
defaults metrics export to false (opt-in). Added the property to
application.yaml.
3. SecurityConfig did not permit /actuator/prometheus — Spring Boot 4.0
with Jetty serves the management port (8081) via the same security filter
chain as the main port (8080). The previous commit's exclusion of
ManagementWebSecurityAutoConfiguration was a no-op (that class no longer
exists in Spring Boot 4.0); removed it and added the correct permitAll()
rule. Updated the architecture comment in application.yaml to reflect the
true filter-chain behaviour.
4. Reverted invalid FamilienarchivApplication.java change from the prior
commit (ManagementWebSecurityAutoConfiguration import compiled against a
class that does not exist in the Spring Boot 4.0 BOM).
Also adds ActuatorPrometheusIT — an integration test that asserts the
/actuator/prometheus endpoint returns 200 with jvm_memory_used_bytes without
credentials, serving as regression protection against future Spring Boot
upgrades silently breaking metrics collection.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three root causes confirmed via live server investigation (issue #604):
1. ManagementWebSecurityAutoConfiguration applied HTTP Basic auth to the
management port (8081), causing Prometheus to receive 401 HTML responses
instead of metrics. Excluded the auto-config — the Docker network
(archiv-net) provides the security boundary for this internal port.
2. promtail-config.yml had no `job` relabel rule. Grafana's Loki dashboards
query {job="$app"} which matched nothing; logs were in Loki under
compose_service but invisible to every dashboard panel.
3. prometheus.yml had a stale comment claiming the spring-boot target would
be DOWN until micrometer-registry-prometheus was added — it has been
present in pom.xml for some time.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>