- docker-compose.prod.yml: add `name: archiv-net` so the network has a
stable Docker name regardless of compose project name (-p flag).
Both staging and production share the same host-level network, which
is correct since the observability stack is a single shared instance.
- nightly.yml / release.yml: add observability env vars (POSTGRES_USER,
PORT_GRAFANA=3003, PORT_GLITCHTIP=3002, PORT_PROMETHEUS=9090,
GRAFANA_ADMIN_PASSWORD, GLITCHTIP_SECRET_KEY, GLITCHTIP_DOMAIN) to the
env file, then `docker compose -f docker-compose.observability.yml up -d`
after the app deploy step. PORT_GRAFANA=3003 avoids collision with
staging frontend on 3001.
Requires two new Gitea secrets: GRAFANA_ADMIN_PASSWORD, GLITCHTIP_SECRET_KEY.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
4 integration test classes were restarting the full Spring context (and a new
Postgres Testcontainer, ~75s each) after every test method — 10 unnecessary
container startups adding ~12 minutes to CI. Fixed by:
- PersonServiceIntegrationTest, DocumentSearchPagedIntegrationTest,
GeschichteServiceIntegrationTest: swap to @Transactional so each test
rolls back instead of destroying the context.
- AuditServiceIntegrationTest: cannot use @Transactional (logAfterCommit
hooks into AFTER_COMMIT which requires a real commit); reset state with
@BeforeEach deleteAll() instead.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
sentry-spring-boot-starter-jakarta 8.5.0 does not support Spring Boot 4.0 —
it logs an "Incompatible Spring Boot Version" warning and its SentryAutoConfiguration
crashes SF7 bean-name generation. sentry-spring-boot-4 (added in 8.21.0) is the
dedicated Spring Boot 4 module with a fixed auto-configuration class.
- Replace sentry-spring-boot-starter-jakarta:8.5.0 with sentry-spring-boot-4:8.41.0
- Delete SentryConfig.java — workaround no longer needed, auto-config handles init
- Remove spring.autoconfigure.exclude from application.yaml + application-test.yaml
- Delete SentryConfigTest.java — tested the deleted workaround class
- Update ApplicationContextTest: assert Sentry.isEnabled() is false when no DSN set
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SentryAutoConfiguration$HubConfiguration$SentrySpanRestClientConfiguration is a triply-
nested @Configuration class conditionally loaded when RestClient is on the classpath
(always true on Spring Framework 7). Spring Framework 7's bean name generator fails
on such deeply-nested @Import-ed classes, crashing every @SpringBootTest context.
Replace the broken auto-configuration with a minimal SentryConfig bean that calls
Sentry.init() with the same properties (DSN, environment, sample rate, PII guard,
DomainException filter). Unexpected 5xx exceptions are forwarded to Sentry via
Sentry.captureException() in GlobalExceptionHandler.handleGeneric().
Also add management.server.port=0 to application-test.yaml to eliminate TIME_WAIT
conflicts from @DirtiesContext restarts on the fixed management port 8081 (see #593).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Port 8081 was fixed by #576. With four @DirtiesContext(AFTER_EACH_TEST_METHOD)
classes (22 context restarts total), the OS TIME_WAIT state holds port 8081
for ~45-60s per cycle — adding ~17 min overhead. All 1601 tests pass but
surefire's 10-min timeout fires before the suite finishes.
Fixes#593.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds the sentrySvelteKit() Vite plugin as the first plugin in vite.config.ts.
When SENTRY_AUTH_TOKEN is set at build time, source maps are uploaded to
GlitchTip so error stack traces show original TypeScript source and line number.
When SENTRY_AUTH_TOKEN is absent (CI, dev builds), upload is disabled via
autoUploadSourceMaps: false — the build succeeds normally.
Resolves Felix's review blocker on PR #591.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds @sentry/sveltekit to hooks.client.ts and hooks.server.ts.
When VITE_SENTRY_DSN is unset (default), Sentry is fully disabled.
When set to a GlitchTip JavaScript project DSN, browser exceptions
and SSR handleError events are forwarded automatically.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds GlitchTip env vars to the observability env var table, extends the
services table, and adds a first-run section with superuser creation and
project setup steps. Updates the C4 L2 container diagram with GlitchTip
and Redis containers and their relationships.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds obs-glitchtip, obs-glitchtip-worker, obs-redis, and obs-glitchtip-db-init
services to docker-compose.observability.yml. The one-shot db-init container
creates the dedicated glitchtip database on the existing archive-db PostgreSQL
instance automatically on first stack start.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add Grafana row to the observability services table, Grafana access details
(URL, credentials, auto-provisioned datasources, pre-loaded dashboards), and
GRAFANA_ADMIN_PASSWORD to the env vars table in DEPLOYMENT.md.
Update C4 l2-containers.puml: replace placeholder Grafana entry with pinned
image version, expand observability boundary with node_exporter and cadvisor
containers, and add Rel() edges for Grafana → Prometheus, Loki, and Tempo.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add obs-grafana service (grafana/grafana-oss:11.6.1) to docker-compose.observability.yml.
Datasources (Prometheus, Loki, Tempo) are auto-provisioned via
infra/observability/grafana/provisioning/datasources/datasources.yml with
cross-datasource linking (Loki traceId → Tempo, Tempo → Loki, service map via Prometheus).
Three dashboards are pre-loaded: Node Exporter Full (1860), Spring Boot Observability (17175),
Loki Logs (13639) — datasource template variables replaced with provisioned UIDs.
GRAFANA_ADMIN_PASSWORD added to .env.example.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
opentelemetry-spring-boot-starter:2.27.0 pulls in AzureAppServiceResourceProvider which
references ServiceAttributes.SERVICE_INSTANCE_ID — a field absent from the semconv version
used by this project. This caused every integration test to fail with NoSuchFieldError during
Spring context startup.
Fix 1 (application-test.yaml): set otel.sdk.disabled=true so the OTel auto-configuration
never runs during tests at all.
Fix 2 (pom.xml): exclude opentelemetry-azure-resources from the starter dependency to remove
the problematic provider from the dependency graph entirely.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add micrometer-registry-prometheus (BOM-managed) to expose /actuator/prometheus
- Add micrometer-tracing-bridge-otel (BOM-managed) for Micrometer → OTel tracing bridge
- Add opentelemetry-spring-boot-starter 2.27.0 (pinned — not in Spring Boot BOM)
- Move management to port 8081 so Prometheus scrapes directly inside archiv-net,
bypassing both Caddy and Spring Security's session-authenticated filter chain
- Configure otel.service.name and OTLP endpoint (default localhost:4317 for CI safety)
- Set tracing sampling probability to 1.0 in base config; override via env var in compose
- Add OTEL_EXPORTER_OTLP_ENDPOINT + MANAGEMENT_TRACING_SAMPLING_PROBABILITY to docker-compose.yml
- Expose management port 8081 inside archiv-net for Prometheus scraping
- Disable trace export in application-test.yaml (probability: 0.0) for deterministic CI
OTLP export failures are non-fatal; app starts cleanly without Tempo running.
Closes#576
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the stale "no monitoring infrastructure in place yet" note in
§4 with a brief description of the observability compose file and a
pointer to issue #581 for full docs.
Add a placeholder System_Boundary block for Prometheus + Loki + Grafana
to l2-containers.puml, showing the stack joins archiv-net.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Creates the skeleton observability stack (no running services yet) that all
subsequent Grafana LGTM + GlitchTip issues depend on:
- docker-compose.observability.yml: external archiv-net join, obs-net bridge,
named volumes for all five services, placeholder comments for each service
group (Metrics/Logs/Traces/Dashboards/Error Tracking), startup-order note
- infra/observability/{prometheus,loki,promtail,tempo,grafana/provisioning/{datasources,dashboards}}/.gitkeep
- .env.example: new # --- Observability --- section with PORT_GRAFANA,
PORT_GLITCHTIP, PORT_PROMETHEUS, GLITCHTIP_DOMAIN, GLITCHTIP_SECRET_KEY
(with generation hint), SENTRY_DSN, VITE_SENTRY_DSN
Verified: docker compose -f docker-compose.observability.yml config exits 0
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The production stage runs npm ci --omit=dev to install runtime deps for
the pre-built SvelteKit app. The postinstall script calls patch-package,
which is a devDependency, so it is absent and causes exit code 127.
--ignore-scripts is the correct npm-native fix: no lifecycle scripts are
needed when installing into a pre-built image.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use untrack() for intentional one-time prop seed in UserGroupsSection.
Add explicit LoadData type alias in page.server.test to avoid void|Record<string,any> union.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When groups load successfully but the list is empty, render a quiet
"Keine Gruppen vorhanden." message rather than a blank section that
leaves users uncertain whether groups failed to load.
Adds admin_new_invite_no_groups i18n key to de/en/es.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Spring Framework 7 prohibits constructor injection cycles. InviteService
already injects UserService, so UserService cannot inject InviteService
for the deleteGroup guard — repository injection is the correct workaround.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Client-submitted duplicate UUIDs were causing a false GROUP_NOT_FOUND:
size(deduplicated_db_result)==1 != size(submitted)==2. Deduplicate input
with HashSet before calling findGroupsByIds so the size comparison is
always against unique IDs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- legend uses m.admin_new_invite_groups() instead of hardcoded "Gruppen"
so screen readers announce the correct string in en/es locales
- label gets min-h-[44px] for WCAG 2.2 touch target compliance
- add test asserting fieldset accessible name comes from i18n key
- add test documenting empty-groups-no-error renders no checkboxes/banner
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace hand-copied load/action replicas with direct imports of the
real module. Mock $env/dynamic/private so the tests cover the actual
production code paths, not a duplicate that can drift.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Screen readers now announce the amber warning when it appears after
the form expands, without requiring the user to navigate to it.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
bind:group requires a writable $state variable; $derived is read-only
in Svelte 5, so every click was silently reset to unchecked, making
the group picker non-functional.
Also wraps checkboxes in <fieldset>/<legend> for WCAG 1.3.1 compliance.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>