Compare commits

..

14 Commits

Author SHA1 Message Date
Marcel
b1ff624c98 fix(infra): pin GlitchTip image to 6.1.6 (v4 tag never existed)
glitchtip/glitchtip:v4 is not a real tag — GlitchTip does not use a
v-prefix in its Docker image versioning. Latest stable release is 6.1.6.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 17:53:13 +02:00
Marcel
fed427dc4a fix(infra): set OTEL_EXPORTER_OTLP_ENDPOINT in docker-compose.prod.yml
Some checks failed
CI / Unit & Component Tests (pull_request) Has been cancelled
CI / OCR Service Tests (pull_request) Has been cancelled
CI / Backend Unit Tests (pull_request) Has been cancelled
CI / fail2ban Regex (pull_request) Has been cancelled
CI / Compose Bucket Idempotency (pull_request) Has been cancelled
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
CI / Backend Unit Tests (push) Has been cancelled
CI / fail2ban Regex (push) Has been cancelled
CI / Compose Bucket Idempotency (push) Has been cancelled
The endpoint belongs in the compose file (hardcoded to the in-network
Tempo service) rather than per-environment workflow files. This covers
both staging (nightly.yml) and production (release.yml) with a single
change and removes the duplicate from the nightly env-file block.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 17:43:23 +02:00
Marcel
cf78ab2f8e fix(staging): correct backend healthcheck port and OTel endpoint
Some checks failed
CI / OCR Service Tests (pull_request) Has been cancelled
CI / Backend Unit Tests (pull_request) Has been cancelled
CI / fail2ban Regex (pull_request) Has been cancelled
CI / Compose Bucket Idempotency (pull_request) Has been cancelled
CI / Unit & Component Tests (pull_request) Has been cancelled
Two bugs introduced when the management port was split from the app port:

1. Backend healthcheck hit localhost:8080/actuator/health (app port) —
   actuator is on management.server.port=8081, so every probe got a 404
   from the main MVC dispatcher, marking the container permanently unhealthy.
   Fix: change the probe to localhost:8081.

2. OTEL_EXPORTER_OTLP_ENDPOINT was not set in .env.staging, so the exporter
   fell back to http://localhost:4317 (the CI-safe default) instead of
   http://tempo:4317 (the in-network Tempo service). Fix: inject the correct
   endpoint in the nightly env-file generation step.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 17:37:15 +02:00
Marcel
c8883d0e40 fix(ci): isolate compose-idempotency network from archiv-net collisions
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 5m40s
CI / OCR Service Tests (pull_request) Successful in 34s
CI / Backend Unit Tests (pull_request) Successful in 7m8s
CI / fail2ban Regex (pull_request) Successful in 1m58s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m41s
CI / Unit & Component Tests (push) Successful in 5m37s
CI / OCR Service Tests (push) Successful in 28s
CI / Backend Unit Tests (push) Successful in 6m59s
CI / fail2ban Regex (push) Successful in 1m59s
CI / Compose Bucket Idempotency (push) Successful in 1m44s
The name: archiv-net declaration (needed so docker-compose.observability.yml
can join the network as external: true) caused the compose-idempotency CI job
to collide with any archiv-net left on the runner from staging or a previous
run. mc would resolve 'minio' to the wrong container and fail with a signature
mismatch.

Make the network name interpolable via COMPOSE_NETWORK_NAME (default: archiv-net
so production/staging behaviour is unchanged). Inject COMPOSE_NETWORK_NAME=
test-idem-archiv-net into the stub env file so the idempotency test always
gets a fully isolated network.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 16:33:07 +02:00
Marcel
7154092547 fix(deps): pin opentelemetry-bom to 1.61.0 to fix startup crash
Some checks failed
CI / Unit & Component Tests (pull_request) Successful in 5m34s
CI / OCR Service Tests (pull_request) Successful in 30s
CI / Backend Unit Tests (pull_request) Successful in 7m6s
CI / fail2ban Regex (pull_request) Successful in 1m49s
CI / Compose Bucket Idempotency (pull_request) Failing after 1m26s
opentelemetry-spring-boot-starter:2.27.0 was built against
opentelemetry-api:1.61.0. Spring Boot 4.0.0 only manages 1.55.0,
which is missing GlobalOpenTelemetry.getOrNoop(). The backend crashed
at startup with NoSuchMethodError on the first staging nightly.

Add a <dependencyManagement> import of opentelemetry-bom:1.61.0 before
the Spring Boot BOM applies, so all OTel core artifacts resolve to the
version the instrumentation starter actually requires.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 16:05:44 +02:00
Marcel
ada3a3ccaf devops(ci): add --remove-orphans to observability stack deploy steps
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 5m27s
CI / OCR Service Tests (pull_request) Successful in 34s
CI / Backend Unit Tests (pull_request) Successful in 7m13s
CI / fail2ban Regex (pull_request) Successful in 1m51s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m47s
CI / Unit & Component Tests (push) Successful in 5m45s
CI / OCR Service Tests (push) Successful in 36s
CI / Backend Unit Tests (push) Successful in 7m12s
CI / fail2ban Regex (push) Successful in 1m54s
CI / Compose Bucket Idempotency (push) Successful in 1m41s
Both nightly and release workflows were missing --remove-orphans on the
observability compose up, while the main app deploy step already had it.
Without it, containers removed from docker-compose.observability.yml
linger as unnamed orphans until manually pruned.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 14:55:28 +02:00
Marcel
8cf3a2a726 devops(caddy): apply full security_headers snippet to GlitchTip vhost
The GlitchTip vhost only had a manual HSTS header; the rest of the
(security_headers) snippet (X-Content-Type-Options, Referrer-Policy,
Permissions-Policy, -Server removal) was missing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 14:54:54 +02:00
Marcel
553e2f8898 docs(deployment): add observability secrets to §3.3 Gitea secrets table
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 5m35s
CI / OCR Service Tests (pull_request) Successful in 33s
CI / Backend Unit Tests (pull_request) Successful in 7m10s
CI / fail2ban Regex (pull_request) Successful in 1m54s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m39s
GRAFANA_ADMIN_PASSWORD, GLITCHTIP_SECRET_KEY, and SENTRY_DSN were
referenced in the workflow env files but absent from the secrets table,
leaving the first-run operator without a complete checklist.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 13:46:01 +02:00
Marcel
4a7349543a devops(ci): wire SENTRY_DSN into staging and production env files
Adds SENTRY_DSN as an optional secret (empty by default) so it can be
set after GlitchTip first-run without requiring another code change.
Backend reads it via application.yaml; empty value keeps Sentry disabled.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 13:45:07 +02:00
Marcel
f15e004645 devops(ci): add --wait to observability stack startup
Prometheus, Loki, Tempo, and Grafana all define healthchecks in
docker-compose.observability.yml. Without --wait, the step exits 0
as soon as containers are created, masking startup failures silently.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 13:44:16 +02:00
Marcel
b137e3e72d devops(caddy): add HSTS to GlitchTip vhost
Caddy does not set Strict-Transport-Security on GlitchTip because the
full security_headers snippet is intentionally omitted (Permissions-Policy
interferes with the Sentry SDK CORS). Adding HSTS alone guarantees
HTTPS enforcement at the Caddy layer without breaking SDK ingestion.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 13:43:35 +02:00
Marcel
4c8a23ff14 devops(caddy): add Grafana and GlitchTip vhosts
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 5m33s
CI / OCR Service Tests (pull_request) Successful in 33s
CI / Backend Unit Tests (pull_request) Successful in 7m10s
CI / fail2ban Regex (pull_request) Successful in 1m55s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m42s
grafana.archiv.raddatz.cloud → 127.0.0.1:3003 (with security headers)
glitchtip.archiv.raddatz.cloud → 127.0.0.1:3002 (no security headers —
  GlitchTip manages its own; the Sentry SDK also POSTs here)

Requires A records for both subdomains pointing at the server before
the next `systemctl reload caddy`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 11:27:07 +02:00
Marcel
d7d225af77 devops(observability): wire observability stack into nightly and release deploys
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 4m32s
CI / OCR Service Tests (pull_request) Successful in 17s
CI / Backend Unit Tests (pull_request) Successful in 4m3s
CI / fail2ban Regex (pull_request) Successful in 1m55s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m42s
- docker-compose.prod.yml: add `name: archiv-net` so the network has a
  stable Docker name regardless of compose project name (-p flag).
  Both staging and production share the same host-level network, which
  is correct since the observability stack is a single shared instance.

- nightly.yml / release.yml: add observability env vars (POSTGRES_USER,
  PORT_GRAFANA=3003, PORT_GLITCHTIP=3002, PORT_PROMETHEUS=9090,
  GRAFANA_ADMIN_PASSWORD, GLITCHTIP_SECRET_KEY, GLITCHTIP_DOMAIN) to the
  env file, then `docker compose -f docker-compose.observability.yml up -d`
  after the app deploy step. PORT_GRAFANA=3003 avoids collision with
  staging frontend on 3001.

  Requires two new Gitea secrets: GRAFANA_ADMIN_PASSWORD, GLITCHTIP_SECRET_KEY.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 11:22:37 +02:00
Marcel
4358997482 perf(test): replace DirtiesContext(AFTER_EACH_TEST_METHOD) with @Transactional
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 4m40s
CI / OCR Service Tests (pull_request) Successful in 18s
CI / Backend Unit Tests (pull_request) Successful in 3m20s
CI / fail2ban Regex (pull_request) Successful in 47s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m4s
CI / Unit & Component Tests (push) Successful in 4m20s
CI / OCR Service Tests (push) Successful in 16s
CI / Backend Unit Tests (push) Successful in 3m8s
CI / fail2ban Regex (push) Successful in 44s
CI / Compose Bucket Idempotency (push) Successful in 1m1s
4 integration test classes were restarting the full Spring context (and a new
Postgres Testcontainer, ~75s each) after every test method — 10 unnecessary
container startups adding ~12 minutes to CI. Fixed by:

- PersonServiceIntegrationTest, DocumentSearchPagedIntegrationTest,
  GeschichteServiceIntegrationTest: swap to @Transactional so each test
  rolls back instead of destroying the context.
- AuditServiceIntegrationTest: cannot use @Transactional (logAfterCommit
  hooks into AFTER_COMMIT which requires a real commit); reset state with
  @BeforeEach deleteAll() instead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 10:29:35 +02:00
12 changed files with 81 additions and 11 deletions

View File

@@ -305,6 +305,7 @@ jobs:
MAIL_PORT=1025
APP_MAIL_FROM=noreply@local
IMPORT_HOST_DIR=/tmp/dummy-import
COMPOSE_NETWORK_NAME=test-idem-archiv-net
EOF
- name: Bring up minio

View File

@@ -30,6 +30,9 @@ name: nightly
# STAGING_OCR_TRAINING_TOKEN
# STAGING_APP_ADMIN_USERNAME
# STAGING_APP_ADMIN_PASSWORD
# GRAFANA_ADMIN_PASSWORD
# GLITCHTIP_SECRET_KEY
# SENTRY_DSN (set after GlitchTip first-run; empty = Sentry disabled)
on:
schedule:
@@ -74,6 +77,14 @@ jobs:
MAIL_STARTTLS_ENABLE=false
APP_MAIL_FROM=noreply@staging.raddatz.cloud
IMPORT_HOST_DIR=/srv/familienarchiv-staging/import
POSTGRES_USER=archiv
PORT_GRAFANA=3003
PORT_GLITCHTIP=3002
PORT_PROMETHEUS=9090
GRAFANA_ADMIN_PASSWORD=${{ secrets.GRAFANA_ADMIN_PASSWORD }}
GLITCHTIP_SECRET_KEY=${{ secrets.GLITCHTIP_SECRET_KEY }}
GLITCHTIP_DOMAIN=https://glitchtip.archiv.raddatz.cloud
SENTRY_DSN=${{ secrets.SENTRY_DSN }}
EOF
- name: Verify backend /import:ro mount is wired
@@ -120,6 +131,13 @@ jobs:
--profile staging \
up -d --wait --remove-orphans
- name: Start observability stack
run: |
docker compose \
-f docker-compose.observability.yml \
--env-file .env.staging \
up -d --wait --remove-orphans
- name: Reload Caddy
# Apply any committed Caddyfile changes before smoke-testing the
# public surface. Without this step, a Caddyfile edit lands in the

View File

@@ -34,6 +34,9 @@ name: release
# MAIL_PORT
# MAIL_USERNAME
# MAIL_PASSWORD
# GRAFANA_ADMIN_PASSWORD
# GLITCHTIP_SECRET_KEY
# SENTRY_DSN (set after GlitchTip first-run; empty = Sentry disabled)
on:
push:
@@ -72,6 +75,14 @@ jobs:
MAIL_STARTTLS_ENABLE=true
APP_MAIL_FROM=noreply@raddatz.cloud
IMPORT_HOST_DIR=/srv/familienarchiv-production/import
POSTGRES_USER=archiv
PORT_GRAFANA=3003
PORT_GLITCHTIP=3002
PORT_PROMETHEUS=9090
GRAFANA_ADMIN_PASSWORD=${{ secrets.GRAFANA_ADMIN_PASSWORD }}
GLITCHTIP_SECRET_KEY=${{ secrets.GLITCHTIP_SECRET_KEY }}
GLITCHTIP_DOMAIN=https://glitchtip.archiv.raddatz.cloud
SENTRY_DSN=${{ secrets.SENTRY_DSN }}
EOF
- name: Build images
@@ -93,6 +104,13 @@ jobs:
--env-file .env.production \
up -d --wait --remove-orphans
- name: Start observability stack
run: |
docker compose \
-f docker-compose.observability.yml \
--env-file .env.production \
up -d --wait --remove-orphans
- name: Reload Caddy
# See nightly.yml — same rationale and mechanism: DooD job containers
# cannot call systemctl directly; nsenter via a privileged sibling

View File

@@ -29,6 +29,20 @@
<properties>
<java.version>21</java.version>
</properties>
<dependencyManagement>
<dependencies>
<!-- opentelemetry-spring-boot-starter:2.27.0 was built against opentelemetry-api:1.61.0,
but Spring Boot 4.0.0 BOM only manages 1.55.0 (missing GlobalOpenTelemetry.getOrNoop()).
Import the core OTel BOM here to override it before the Spring Boot BOM applies. -->
<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-bom</artifactId>
<version>1.61.0</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
<dependencies>
<dependency>
<groupId>org.springframework.boot</groupId>

View File

@@ -1,11 +1,11 @@
package org.raddatz.familienarchiv.audit;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.raddatz.familienarchiv.PostgresContainerConfig;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.context.annotation.Import;
import org.springframework.test.annotation.DirtiesContext;
import org.springframework.test.context.ActiveProfiles;
import org.springframework.test.context.bean.override.mockito.MockitoBean;
import org.springframework.transaction.support.TransactionTemplate;
@@ -18,7 +18,6 @@ import static org.awaitility.Awaitility.await;
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.NONE)
@ActiveProfiles("test")
@Import(PostgresContainerConfig.class)
@DirtiesContext(classMode = DirtiesContext.ClassMode.AFTER_EACH_TEST_METHOD)
class AuditServiceIntegrationTest {
@MockitoBean S3Client s3Client;
@@ -26,6 +25,11 @@ class AuditServiceIntegrationTest {
@Autowired AuditLogRepository auditLogRepository;
@Autowired TransactionTemplate transactionTemplate;
@BeforeEach
void resetAuditLog() {
auditLogRepository.deleteAll();
}
@Test
void logAfterCommit_writes_ANNOTATION_CREATED_row_after_transaction_commits() {
transactionTemplate.execute(status -> {

View File

@@ -12,9 +12,9 @@ import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.context.annotation.Import;
import org.springframework.data.domain.PageRequest;
import org.springframework.test.annotation.DirtiesContext;
import org.springframework.test.context.ActiveProfiles;
import org.springframework.test.context.bean.override.mockito.MockitoBean;
import org.springframework.transaction.annotation.Transactional;
import software.amazon.awssdk.services.s3.S3Client;
import java.time.LocalDate;
@@ -33,7 +33,7 @@ import static org.assertj.core.api.Assertions.assertThat;
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.NONE)
@ActiveProfiles("test")
@Import(PostgresContainerConfig.class)
@DirtiesContext(classMode = DirtiesContext.ClassMode.AFTER_EACH_TEST_METHOD)
@Transactional
class DocumentSearchPagedIntegrationTest {
private static final int FIXTURE_SIZE = 120;

View File

@@ -19,9 +19,9 @@ import org.springframework.context.annotation.Import;
import org.springframework.security.authentication.UsernamePasswordAuthenticationToken;
import org.springframework.security.core.authority.SimpleGrantedAuthority;
import org.springframework.security.core.context.SecurityContextHolder;
import org.springframework.test.annotation.DirtiesContext;
import org.springframework.test.context.ActiveProfiles;
import org.springframework.test.context.bean.override.mockito.MockitoBean;
import org.springframework.transaction.annotation.Transactional;
import software.amazon.awssdk.services.s3.S3Client;
import java.util.List;
@@ -32,7 +32,7 @@ import static org.assertj.core.api.Assertions.assertThat;
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.NONE)
@ActiveProfiles("test")
@Import(PostgresContainerConfig.class)
@DirtiesContext(classMode = DirtiesContext.ClassMode.AFTER_EACH_TEST_METHOD)
@Transactional
class GeschichteServiceIntegrationTest {
@MockitoBean

View File

@@ -8,9 +8,9 @@ import org.raddatz.familienarchiv.person.PersonRepository;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.context.annotation.Import;
import org.springframework.test.annotation.DirtiesContext;
import org.springframework.test.context.ActiveProfiles;
import org.springframework.test.context.bean.override.mockito.MockitoBean;
import org.springframework.transaction.annotation.Transactional;
import software.amazon.awssdk.services.s3.S3Client;
import static org.assertj.core.api.Assertions.assertThat;
@@ -18,7 +18,7 @@ import static org.assertj.core.api.Assertions.assertThat;
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.NONE)
@ActiveProfiles("test")
@Import(PostgresContainerConfig.class)
@DirtiesContext(classMode = DirtiesContext.ClassMode.AFTER_EACH_TEST_METHOD)
@Transactional
class PersonServiceIntegrationTest {
@MockitoBean S3Client s3Client;

View File

@@ -184,7 +184,7 @@ services:
- obs-net
obs-glitchtip:
image: glitchtip/glitchtip:v4
image: glitchtip/glitchtip:6.1.6
container_name: obs-glitchtip
restart: unless-stopped
depends_on:
@@ -207,7 +207,7 @@ services:
- obs-net
obs-glitchtip-worker:
image: glitchtip/glitchtip:v4
image: glitchtip/glitchtip:6.1.6
container_name: obs-glitchtip-worker
restart: unless-stopped
command: ./bin/run-celery-with-beat.sh

View File

@@ -39,6 +39,7 @@
networks:
archiv-net:
driver: bridge
name: ${COMPOSE_NETWORK_NAME:-archiv-net}
volumes:
postgres-data:
@@ -212,10 +213,11 @@ services:
APP_MAIL_FROM: ${APP_MAIL_FROM:-noreply@raddatz.cloud}
SPRING_MAIL_PROPERTIES_MAIL_SMTP_AUTH: ${MAIL_SMTP_AUTH:-true}
SPRING_MAIL_PROPERTIES_MAIL_SMTP_STARTTLS_ENABLE: ${MAIL_STARTTLS_ENABLE:-true}
OTEL_EXPORTER_OTLP_ENDPOINT: http://tempo:4317
networks:
- archiv-net
healthcheck:
test: ["CMD-SHELL", "wget -qO- http://localhost:8080/actuator/health | grep -q UP || exit 1"]
test: ["CMD-SHELL", "wget -qO- http://localhost:8081/actuator/health | grep -q UP || exit 1"]
interval: 15s
timeout: 5s
retries: 10

View File

@@ -223,6 +223,9 @@ git.raddatz.cloud A <server IP>
| `MAIL_PORT` | release.yml | typically `587` |
| `MAIL_USERNAME` | release.yml | SMTP user |
| `MAIL_PASSWORD` | release.yml | SMTP password |
| `GRAFANA_ADMIN_PASSWORD` | both | Grafana `admin` login — generate a strong password |
| `GLITCHTIP_SECRET_KEY` | both | Django secret key — `openssl rand -hex 32` |
| `SENTRY_DSN` | both | GlitchTip project DSN — set after first-run (§4); leave empty to keep Sentry disabled |
### 3.4 First deploy

View File

@@ -88,3 +88,13 @@ git.raddatz.cloud {
import security_headers
reverse_proxy 127.0.0.1:3005
}
grafana.archiv.raddatz.cloud {
import security_headers
reverse_proxy 127.0.0.1:3003
}
glitchtip.archiv.raddatz.cloud {
import security_headers
reverse_proxy 127.0.0.1:3002
}