Commit Graph

2476 Commits

Author SHA1 Message Date
Marcel
4c8a23ff14 devops(caddy): add Grafana and GlitchTip vhosts
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 5m33s
CI / OCR Service Tests (pull_request) Successful in 33s
CI / Backend Unit Tests (pull_request) Successful in 7m10s
CI / fail2ban Regex (pull_request) Successful in 1m55s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m42s
grafana.archiv.raddatz.cloud → 127.0.0.1:3003 (with security headers)
glitchtip.archiv.raddatz.cloud → 127.0.0.1:3002 (no security headers —
  GlitchTip manages its own; the Sentry SDK also POSTs here)

Requires A records for both subdomains pointing at the server before
the next `systemctl reload caddy`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 11:27:07 +02:00
Marcel
d7d225af77 devops(observability): wire observability stack into nightly and release deploys
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 4m32s
CI / OCR Service Tests (pull_request) Successful in 17s
CI / Backend Unit Tests (pull_request) Successful in 4m3s
CI / fail2ban Regex (pull_request) Successful in 1m55s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m42s
- docker-compose.prod.yml: add `name: archiv-net` so the network has a
  stable Docker name regardless of compose project name (-p flag).
  Both staging and production share the same host-level network, which
  is correct since the observability stack is a single shared instance.

- nightly.yml / release.yml: add observability env vars (POSTGRES_USER,
  PORT_GRAFANA=3003, PORT_GLITCHTIP=3002, PORT_PROMETHEUS=9090,
  GRAFANA_ADMIN_PASSWORD, GLITCHTIP_SECRET_KEY, GLITCHTIP_DOMAIN) to the
  env file, then `docker compose -f docker-compose.observability.yml up -d`
  after the app deploy step. PORT_GRAFANA=3003 avoids collision with
  staging frontend on 3001.

  Requires two new Gitea secrets: GRAFANA_ADMIN_PASSWORD, GLITCHTIP_SECRET_KEY.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 11:22:37 +02:00
Marcel
4358997482 perf(test): replace DirtiesContext(AFTER_EACH_TEST_METHOD) with @Transactional
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 4m40s
CI / OCR Service Tests (pull_request) Successful in 18s
CI / Backend Unit Tests (pull_request) Successful in 3m20s
CI / fail2ban Regex (pull_request) Successful in 47s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m4s
CI / Unit & Component Tests (push) Successful in 4m20s
CI / OCR Service Tests (push) Successful in 16s
CI / Backend Unit Tests (push) Successful in 3m8s
CI / fail2ban Regex (push) Successful in 44s
CI / Compose Bucket Idempotency (push) Successful in 1m1s
4 integration test classes were restarting the full Spring context (and a new
Postgres Testcontainer, ~75s each) after every test method — 10 unnecessary
container startups adding ~12 minutes to CI. Fixed by:

- PersonServiceIntegrationTest, DocumentSearchPagedIntegrationTest,
  GeschichteServiceIntegrationTest: swap to @Transactional so each test
  rolls back instead of destroying the context.
- AuditServiceIntegrationTest: cannot use @Transactional (logAfterCommit
  hooks into AFTER_COMMIT which requires a real commit); reset state with
  @BeforeEach deleteAll() instead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 10:29:35 +02:00
Marcel
7c2e75facc fix(backend): switch to sentry-spring-boot-4:8.41.0 for Spring Boot 4/SF7 compatibility
Some checks failed
CI / Unit & Component Tests (pull_request) Successful in 6m12s
CI / OCR Service Tests (pull_request) Successful in 42s
CI / Backend Unit Tests (pull_request) Failing after 17m13s
CI / fail2ban Regex (pull_request) Successful in 2m37s
CI / Compose Bucket Idempotency (pull_request) Successful in 2m6s
sentry-spring-boot-starter-jakarta 8.5.0 does not support Spring Boot 4.0 —
it logs an "Incompatible Spring Boot Version" warning and its SentryAutoConfiguration
crashes SF7 bean-name generation. sentry-spring-boot-4 (added in 8.21.0) is the
dedicated Spring Boot 4 module with a fixed auto-configuration class.

- Replace sentry-spring-boot-starter-jakarta:8.5.0 with sentry-spring-boot-4:8.41.0
- Delete SentryConfig.java — workaround no longer needed, auto-config handles init
- Remove spring.autoconfigure.exclude from application.yaml + application-test.yaml
- Delete SentryConfigTest.java — tested the deleted workaround class
- Update ApplicationContextTest: assert Sentry.isEnabled() is false when no DSN set

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 09:51:53 +02:00
Marcel
7b05b9d5a0 test(context): assert SentryAutoConfiguration is excluded from Spring context
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 09:45:32 +02:00
Marcel
20edc0474c test(exception): verify handleGeneric captures exception in Sentry and returns 500
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 09:44:10 +02:00
Marcel
fa191b5c05 test(config): unit-test SentryConfig blank-DSN no-op and non-blank init paths
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 09:43:08 +02:00
Marcel
2139d600f5 fix(backend): exclude SentryAutoConfiguration — Spring Boot 4/SF7 bean name incompatibility
Some checks failed
CI / Unit & Component Tests (pull_request) Successful in 6m26s
CI / OCR Service Tests (pull_request) Successful in 43s
CI / fail2ban Regex (pull_request) Has been cancelled
CI / Compose Bucket Idempotency (pull_request) Has been cancelled
CI / Backend Unit Tests (pull_request) Has been cancelled
SentryAutoConfiguration$HubConfiguration$SentrySpanRestClientConfiguration is a triply-
nested @Configuration class conditionally loaded when RestClient is on the classpath
(always true on Spring Framework 7). Spring Framework 7's bean name generator fails
on such deeply-nested @Import-ed classes, crashing every @SpringBootTest context.

Replace the broken auto-configuration with a minimal SentryConfig bean that calls
Sentry.init() with the same properties (DSN, environment, sample rate, PII guard,
DomainException filter). Unexpected 5xx exceptions are forwarded to Sentry via
Sentry.captureException() in GlobalExceptionHandler.handleGeneric().

Also add management.server.port=0 to application-test.yaml to eliminate TIME_WAIT
conflicts from @DirtiesContext restarts on the fixed management port 8081 (see #593).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 09:25:14 +02:00
Marcel
68e4ff4121 fix(backend): make sentry traces-sample-rate env-configurable
Some checks failed
CI / Unit & Component Tests (pull_request) Successful in 6m4s
CI / OCR Service Tests (pull_request) Successful in 32s
CI / Backend Unit Tests (pull_request) Failing after 7m9s
CI / fail2ban Regex (pull_request) Successful in 2m27s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m59s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 08:55:40 +02:00
Marcel
0a1d709c5f feat(backend): add sentry-spring-boot-starter-jakarta for GlitchTip error reporting
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 08:55:40 +02:00
Marcel
8a00d66435 fix(ci): set management.server.port=0 in test profile to fix 25-min test timeout
Some checks failed
CI / OCR Service Tests (pull_request) Waiting to run
CI / Backend Unit Tests (pull_request) Waiting to run
CI / fail2ban Regex (pull_request) Waiting to run
CI / Compose Bucket Idempotency (pull_request) Waiting to run
CI / Unit & Component Tests (pull_request) Has been cancelled
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / fail2ban Regex (push) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
Port 8081 was fixed by #576. With four @DirtiesContext(AFTER_EACH_TEST_METHOD)
classes (22 context restarts total), the OS TIME_WAIT state holds port 8081
for ~45-60s per cycle — adding ~17 min overhead. All 1601 tests pass but
surefire's 10-min timeout fires before the suite finishes.

Fixes #593.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 08:52:21 +02:00
d2ad623bb8 Merge pull request 'feat(frontend): integrate @sentry/sveltekit for browser and SSR error reporting to GlitchTip' (#591) from feat/issue-579-sentry-sveltekit into main
Some checks failed
CI / Unit & Component Tests (push) Successful in 6m3s
CI / OCR Service Tests (push) Successful in 41s
CI / Backend Unit Tests (push) Failing after 22m19s
CI / fail2ban Regex (push) Successful in 2m12s
CI / Compose Bucket Idempotency (push) Successful in 2m5s
Merge feat/issue-579-sentry-sveltekit: Frontend @sentry/sveltekit integration (Backend Unit Tests failure: surefire RAM timeout only, no Java code in PR)
2026-05-15 08:08:20 +02:00
Marcel
00a8731cdd fix(frontend): add sentrySvelteKit Vite plugin for source map upload
Some checks failed
CI / Unit & Component Tests (pull_request) Successful in 6m19s
CI / OCR Service Tests (pull_request) Successful in 40s
CI / Backend Unit Tests (pull_request) Failing after 25m0s
CI / fail2ban Regex (pull_request) Successful in 2m13s
CI / Compose Bucket Idempotency (pull_request) Successful in 2m3s
Adds the sentrySvelteKit() Vite plugin as the first plugin in vite.config.ts.
When SENTRY_AUTH_TOKEN is set at build time, source maps are uploaded to
GlitchTip so error stack traces show original TypeScript source and line number.
When SENTRY_AUTH_TOKEN is absent (CI, dev builds), upload is disabled via
autoUploadSourceMaps: false — the build succeeds normally.

Resolves Felix's review blocker on PR #591.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 06:23:34 +02:00
Marcel
b4e6e4ca2a feat(frontend): integrate @sentry/sveltekit for browser and SSR error reporting
Some checks failed
CI / Unit & Component Tests (pull_request) Successful in 6m37s
CI / OCR Service Tests (pull_request) Successful in 41s
CI / Backend Unit Tests (pull_request) Failing after 24m43s
CI / fail2ban Regex (pull_request) Successful in 2m18s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m57s
Adds @sentry/sveltekit to hooks.client.ts and hooks.server.ts.
When VITE_SENTRY_DSN is unset (default), Sentry is fully disabled.
When set to a GlitchTip JavaScript project DSN, browser exceptions
and SSR handleError events are forwarded automatically.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 06:15:34 +02:00
427c3ea537 feat(observability): add GlitchTip error tracking infrastructure
Some checks failed
CI / Unit & Component Tests (push) Successful in 6m2s
CI / OCR Service Tests (push) Successful in 35s
CI / Backend Unit Tests (push) Failing after 25m18s
CI / fail2ban Regex (push) Successful in 2m18s
CI / Compose Bucket Idempotency (push) Successful in 2m0s
2026-05-15 06:12:27 +02:00
Marcel
67004737f6 fix(observability): define obs_glitchtip_worker Container in C4 diagram
Some checks failed
CI / Unit & Component Tests (pull_request) Successful in 5m45s
CI / OCR Service Tests (pull_request) Successful in 36s
CI / Backend Unit Tests (pull_request) Failing after 23m49s
CI / fail2ban Regex (pull_request) Successful in 2m13s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m46s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 04:43:09 +02:00
Marcel
3ced565aa2 docs(observability): document GlitchTip services in DEPLOYMENT.md and C4 diagram
Some checks failed
CI / Unit & Component Tests (pull_request) Successful in 5m53s
CI / OCR Service Tests (pull_request) Successful in 32s
CI / Backend Unit Tests (pull_request) Failing after 23m39s
CI / fail2ban Regex (pull_request) Successful in 2m13s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m55s
Adds GlitchTip env vars to the observability env var table, extends the
services table, and adds a first-run section with superuser creation and
project setup steps. Updates the C4 L2 container diagram with GlitchTip
and Redis containers and their relationships.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 04:38:47 +02:00
Marcel
cd715029eb feat(observability): add GlitchTip error tracking to observability stack
Adds obs-glitchtip, obs-glitchtip-worker, obs-redis, and obs-glitchtip-db-init
services to docker-compose.observability.yml. The one-shot db-init container
creates the dedicated glitchtip database on the existing archive-db PostgreSQL
instance automatically on first stack start.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 04:38:06 +02:00
84f9bbadeb Merge pull request 'feat(observability): add Grafana with provisioned datasources and dashboards' (#589) from feat/issue-577-grafana into main
Some checks failed
CI / Unit & Component Tests (push) Successful in 5m22s
CI / OCR Service Tests (push) Successful in 30s
CI / Backend Unit Tests (push) Failing after 21m45s
CI / fail2ban Regex (push) Successful in 2m2s
CI / Compose Bucket Idempotency (push) Successful in 1m50s
feat(observability): add Grafana with provisioned datasources and dashboards (#589)
2026-05-15 04:35:10 +02:00
Marcel
457c1d3aee fix(observability): add grafana healthcheck and service_healthy depends_on
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 4m19s
CI / OCR Service Tests (pull_request) Successful in 20s
CI / Backend Unit Tests (pull_request) Successful in 5m32s
CI / fail2ban Regex (pull_request) Successful in 48s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m1s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 04:09:13 +02:00
Marcel
c99321e5cf docs(observability): document Grafana in DEPLOYMENT.md and C4 diagram
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 4m7s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Successful in 5m41s
CI / fail2ban Regex (pull_request) Successful in 45s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m1s
Add Grafana row to the observability services table, Grafana access details
(URL, credentials, auto-provisioned datasources, pre-loaded dashboards), and
GRAFANA_ADMIN_PASSWORD to the env vars table in DEPLOYMENT.md.
Update C4 l2-containers.puml: replace placeholder Grafana entry with pinned
image version, expand observability boundary with node_exporter and cadvisor
containers, and add Rel() edges for Grafana → Prometheus, Loki, and Tempo.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 04:04:09 +02:00
Marcel
f3f8345b03 feat(observability): add Grafana with provisioned datasources and dashboards
Add obs-grafana service (grafana/grafana-oss:11.6.1) to docker-compose.observability.yml.
Datasources (Prometheus, Loki, Tempo) are auto-provisioned via
infra/observability/grafana/provisioning/datasources/datasources.yml with
cross-datasource linking (Loki traceId → Tempo, Tempo → Loki, service map via Prometheus).
Three dashboards are pre-loaded: Node Exporter Full (1860), Spring Boot Observability (17175),
Loki Logs (13639) — datasource template variables replaced with provisioned UIDs.
GRAFANA_ADMIN_PASSWORD added to .env.example.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 04:03:33 +02:00
c3b477c609 Merge pull request 'devops(backend): expose Prometheus metrics endpoint + OTLP trace export from Spring Boot' (#588) from feat/issue-576-backend-instrumentation into main
Some checks failed
CI / Unit & Component Tests (push) Successful in 3m19s
CI / OCR Service Tests (push) Successful in 17s
CI / Backend Unit Tests (push) Successful in 4m43s
CI / fail2ban Regex (push) Successful in 39s
CI / Compose Bucket Idempotency (push) Successful in 57s
nightly / deploy-staging (push) Failing after 2m6s
devops(backend): expose Prometheus metrics endpoint + OTLP trace export from Spring Boot (#588)
2026-05-15 03:57:14 +02:00
Marcel
3a67f7820e fix(backend): disable OTel SDK in tests + exclude azure-resources to fix semconv conflict
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m19s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Successful in 4m45s
CI / fail2ban Regex (pull_request) Successful in 38s
CI / Compose Bucket Idempotency (pull_request) Successful in 57s
opentelemetry-spring-boot-starter:2.27.0 pulls in AzureAppServiceResourceProvider which
references ServiceAttributes.SERVICE_INSTANCE_ID — a field absent from the semconv version
used by this project. This caused every integration test to fail with NoSuchFieldError during
Spring context startup.

Fix 1 (application-test.yaml): set otel.sdk.disabled=true so the OTel auto-configuration
never runs during tests at all.

Fix 2 (pom.xml): exclude opentelemetry-azure-resources from the starter dependency to remove
the problematic provider from the dependency graph entirely.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 03:45:08 +02:00
Marcel
6ce6122384 docs: add OTEL and tracing env vars to DEPLOYMENT.md
Some checks failed
CI / Unit & Component Tests (pull_request) Successful in 3m22s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Failing after 2m33s
CI / fail2ban Regex (pull_request) Successful in 38s
CI / Compose Bucket Idempotency (pull_request) Successful in 54s
2026-05-15 03:29:38 +02:00
Marcel
b3e49a9504 devops(backend): expose Prometheus metrics endpoint + OTLP trace export from Spring Boot
Some checks failed
CI / Unit & Component Tests (pull_request) Successful in 3m20s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Failing after 2m35s
CI / fail2ban Regex (pull_request) Successful in 37s
CI / Compose Bucket Idempotency (pull_request) Successful in 59s
- Add micrometer-registry-prometheus (BOM-managed) to expose /actuator/prometheus
- Add micrometer-tracing-bridge-otel (BOM-managed) for Micrometer → OTel tracing bridge
- Add opentelemetry-spring-boot-starter 2.27.0 (pinned — not in Spring Boot BOM)
- Move management to port 8081 so Prometheus scrapes directly inside archiv-net,
  bypassing both Caddy and Spring Security's session-authenticated filter chain
- Configure otel.service.name and OTLP endpoint (default localhost:4317 for CI safety)
- Set tracing sampling probability to 1.0 in base config; override via env var in compose
- Add OTEL_EXPORTER_OTLP_ENDPOINT + MANAGEMENT_TRACING_SAMPLING_PROBABILITY to docker-compose.yml
- Expose management port 8081 inside archiv-net for Prometheus scraping
- Disable trace export in application-test.yaml (probability: 0.0) for deterministic CI

OTLP export failures are non-fatal; app starts cleanly without Tempo running.
Closes #576

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 03:24:35 +02:00
2eff1ab14c Merge pull request 'devops(observability): add Tempo for distributed trace storage (OTLP receiver)' (#587) from feat/issue-575-tempo into main
All checks were successful
CI / Unit & Component Tests (push) Successful in 3m21s
CI / OCR Service Tests (push) Successful in 16s
CI / Backend Unit Tests (push) Successful in 4m38s
CI / fail2ban Regex (push) Successful in 40s
CI / Compose Bucket Idempotency (push) Successful in 57s
devops(observability): add Tempo for distributed trace storage (#587)
2026-05-15 03:21:11 +02:00
Marcel
de08ffe989 devops(observability): add Tempo for distributed trace storage (OTLP receiver)
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m22s
CI / OCR Service Tests (pull_request) Successful in 17s
CI / Backend Unit Tests (pull_request) Successful in 4m32s
CI / fail2ban Regex (pull_request) Successful in 38s
CI / Compose Bucket Idempotency (pull_request) Successful in 56s
Closes #575

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 03:01:22 +02:00
5ed24cb6eb Merge pull request 'devops(observability): add Loki + Promtail for centralised container log aggregation' (#586) from feat/issue-574-loki-promtail into main
All checks were successful
CI / Unit & Component Tests (push) Successful in 3m22s
CI / OCR Service Tests (push) Successful in 16s
CI / Backend Unit Tests (push) Successful in 4m36s
CI / fail2ban Regex (push) Successful in 40s
CI / Compose Bucket Idempotency (push) Successful in 57s
devops(observability): add Loki + Promtail for centralised container log aggregation (#586)
2026-05-15 02:58:20 +02:00
Marcel
c1406a32f1 devops(observability): fix C4 diagram, security comment, and add Loki compactor block
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m22s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Successful in 4m33s
CI / fail2ban Regex (pull_request) Successful in 38s
CI / Compose Bucket Idempotency (pull_request) Successful in 56s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 02:25:34 +02:00
Marcel
22e1b25398 devops(observability): add Loki + Promtail for centralised container log aggregation
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m21s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Successful in 4m31s
CI / fail2ban Regex (pull_request) Successful in 38s
CI / Compose Bucket Idempotency (pull_request) Successful in 57s
- Add obs-loki (grafana/loki:3.4.2) to docker-compose.observability.yml
  with healthcheck (wget /ready), expose-only port 3100, named volume loki_data
- Add obs-promtail (grafana/promtail:3.4.2) bridging archiv-net + obs-net,
  depends_on loki service_healthy, docker.sock:ro, promtail_positions volume
  for restart-safe position tracking
- Create infra/observability/loki/loki-config.yml: single-node TSDB schema v13,
  30-day retention, auth disabled (obs-net only), telemetry off
- Create infra/observability/promtail/promtail-config.yml: Docker SD scrape,
  container_name / compose_service / compose_project / logstream labels
- Update docs/DEPLOYMENT.md §4 with service table and Loki quick-check commands

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 02:18:22 +02:00
6a118589c2 Merge pull request 'devops(observability): add Prometheus + Node Exporter + cAdvisor for host and container metrics' (#585) from feat/issue-573-prometheus-metrics into main
All checks were successful
CI / Unit & Component Tests (push) Successful in 3m27s
CI / OCR Service Tests (push) Successful in 18s
CI / Backend Unit Tests (push) Successful in 4m32s
CI / fail2ban Regex (push) Successful in 41s
CI / Compose Bucket Idempotency (push) Successful in 56s
devops(observability): add Prometheus + Node Exporter + cAdvisor (#585)
2026-05-15 02:15:09 +02:00
Marcel
0c66f6298b devops(observability): fix Prometheus port binding, scrape port, and update DEPLOYMENT.md
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m21s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Successful in 4m35s
CI / fail2ban Regex (pull_request) Successful in 38s
CI / Compose Bucket Idempotency (pull_request) Successful in 57s
- Fix spring-boot scrape target from backend:8080 to backend:8081 (actuator/management port)
- Restrict Prometheus host port binding to 127.0.0.1 to prevent unintended external exposure
- Add observability stack (Prometheus, Node Exporter, cAdvisor) to topology description
- Add PORT_PROMETHEUS env var to DEPLOYMENT.md reference table

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 01:52:28 +02:00
Marcel
0c9973fdff devops(observability): add Prometheus + Node Exporter + cAdvisor for host and container metrics
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m22s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Successful in 4m40s
CI / fail2ban Regex (pull_request) Successful in 39s
CI / Compose Bucket Idempotency (pull_request) Successful in 57s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 01:47:07 +02:00
52508e9dea Merge pull request 'devops(observability): scaffold docker-compose.observability.yml and infra/observability/ structure' (#584) from feat/issue-572-observability-scaffold into main
All checks were successful
CI / Unit & Component Tests (push) Successful in 3m21s
CI / OCR Service Tests (push) Successful in 16s
CI / Backend Unit Tests (push) Successful in 4m34s
CI / fail2ban Regex (push) Successful in 40s
CI / Compose Bucket Idempotency (push) Successful in 57s
devops(observability): scaffold docker-compose.observability.yml and infra/observability/ structure (#584)
2026-05-15 01:45:14 +02:00
Marcel
cf8d22d81b docs: update DEPLOYMENT.md and C4 diagram for observability scaffold
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m31s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Successful in 4m31s
CI / fail2ban Regex (pull_request) Successful in 38s
CI / Compose Bucket Idempotency (pull_request) Successful in 57s
Replace the stale "no monitoring infrastructure in place yet" note in
§4 with a brief description of the observability compose file and a
pointer to issue #581 for full docs.

Add a placeholder System_Boundary block for Prometheus + Loki + Grafana
to l2-containers.puml, showing the stack joins archiv-net.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 01:28:14 +02:00
Marcel
1d42be9882 devops(observability): scaffold docker-compose.observability.yml and infra/observability/ structure
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m19s
CI / OCR Service Tests (pull_request) Successful in 16s
CI / Backend Unit Tests (pull_request) Successful in 4m28s
CI / fail2ban Regex (pull_request) Successful in 39s
CI / Compose Bucket Idempotency (pull_request) Successful in 55s
Creates the skeleton observability stack (no running services yet) that all
subsequent Grafana LGTM + GlitchTip issues depend on:

- docker-compose.observability.yml: external archiv-net join, obs-net bridge,
  named volumes for all five services, placeholder comments for each service
  group (Metrics/Logs/Traces/Dashboards/Error Tracking), startup-order note
- infra/observability/{prometheus,loki,promtail,tempo,grafana/provisioning/{datasources,dashboards}}/.gitkeep
- .env.example: new # --- Observability --- section with PORT_GRAFANA,
  PORT_GLITCHTIP, PORT_PROMETHEUS, GLITCHTIP_DOMAIN, GLITCHTIP_SECRET_KEY
  (with generation hint), SENTRY_DSN, VITE_SENTRY_DSN

Verified: docker compose -f docker-compose.observability.yml config exits 0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 01:23:03 +02:00
Marcel
33c738db3b fix(docker): skip postinstall in production image
Some checks failed
CI / Unit & Component Tests (pull_request) Successful in 3m9s
CI / OCR Service Tests (pull_request) Successful in 15s
CI / Backend Unit Tests (pull_request) Successful in 4m31s
CI / fail2ban Regex (pull_request) Successful in 38s
CI / Compose Bucket Idempotency (pull_request) Successful in 59s
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / fail2ban Regex (push) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
CI / Unit & Component Tests (push) Has been cancelled
The production stage runs npm ci --omit=dev to install runtime deps for
the pre-built SvelteKit app. The postinstall script calls patch-package,
which is a devDependency, so it is absent and causes exit code 127.

--ignore-scripts is the correct npm-native fix: no lifecycle scripts are
needed when installing into a pre-built image.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 19:42:52 +02:00
Marcel
62c807b7fe fix(invites): resolve svelte-check warnings in UserGroupsSection and page.server.test
All checks were successful
CI / Unit & Component Tests (push) Successful in 3m11s
CI / OCR Service Tests (push) Successful in 17s
CI / Backend Unit Tests (push) Successful in 4m22s
CI / fail2ban Regex (push) Successful in 39s
CI / Compose Bucket Idempotency (push) Successful in 56s
Use untrack() for intentional one-time prop seed in UserGroupsSection.
Add explicit LoadData type alias in page.server.test to avoid void|Record<string,any> union.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 19:26:53 +02:00
Marcel
82f0f7b82c test(invites): verify groupIds are forwarded from request body in InviteController
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 19:26:53 +02:00
Marcel
4994d28a20 feat(invites): show empty state when no groups exist in invite form
When groups load successfully but the list is empty, render a quiet
"Keine Gruppen vorhanden." message rather than a blank section that
leaves users uncertain whether groups failed to load.

Adds admin_new_invite_no_groups i18n key to de/en/es.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 19:26:53 +02:00
Marcel
15d91da174 docs(invites): explain InviteTokenRepository injection in UserService
Spring Framework 7 prohibits constructor injection cycles. InviteService
already injects UserService, so UserService cannot inject InviteService
for the deleteGroup guard — repository injection is the correct workaround.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 19:26:53 +02:00
Marcel
ae6d7a5467 fix(invites): deduplicate groupIds before size check in createInvite
Client-submitted duplicate UUIDs were causing a false GROUP_NOT_FOUND:
size(deduplicated_db_result)==1 != size(submitted)==2. Deduplicate input
with HashSet before calling findGroupsByIds so the size comparison is
always against unique IDs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 19:26:53 +02:00
Marcel
24a398a0d8 fix(invites): i18n legend + touch target in UserGroupsSection
- legend uses m.admin_new_invite_groups() instead of hardcoded "Gruppen"
  so screen readers announce the correct string in en/es locales
- label gets min-h-[44px] for WCAG 2.2 touch target compliance
- add test asserting fieldset accessible name comes from i18n key
- add test documenting empty-groups-no-error renders no checkboxes/banner

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 19:26:53 +02:00
Marcel
e2632a556d docs: align ErrorCode 4-step checklist in CLAUDE.md; note frontend sync in ARCHITECTURE.md
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 19:26:53 +02:00
Marcel
be741ff9a2 test(invites): add InviteTokenRepository integration tests for existsActiveWithGroupId + V66 group_id index
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 19:26:53 +02:00
Marcel
4995c3139e fix(invites): validate groupIds existence in createInvite — throw GROUP_NOT_FOUND
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 19:26:53 +02:00
Marcel
0a5d4fb950 feat(errors): add GROUP_NOT_FOUND error code + i18n keys
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 19:26:53 +02:00
Marcel
e4303baa40 test(invites): import real +page.server module via vi.mock env
Replace hand-copied load/action replicas with direct imports of the
real module. Mock $env/dynamic/private so the tests cover the actual
production code paths, not a duplicate that can drift.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 19:26:53 +02:00
Marcel
46c8d4553b fix(invites): add role="alert" to groups-load-error banner
Screen readers now announce the amber warning when it appears after
the form expands, without requiring the user to navigate to it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 19:26:53 +02:00