familienarchiv

Author	SHA1	Message	Date
Marcel	c43f45a472	Merge branch 'fix/issue-601-obs-stack-permanent'	2026-05-16 10:19:59 +02:00
Marcel	55ccd5f3c0	ci(obs): replace rsync with rm+cp in deploy step rsync is not present in the act_runner job container image. rm -rf + cp -r gives identical semantics (including removal of deleted files) using only coreutils, which are always available. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 10:18:42 +02:00
marcel	0bb0a314ad	ci(obs): add obs-glitchtip to health assertion loop (now has /_health/ healthcheck)	2026-05-16 09:36:37 +02:00
marcel	b194b565f6	ci(obs): reference #603 in keep-in-sync comments; add obs-glitchtip to health assertion	2026-05-16 09:35:43 +02:00
Marcel	6720a5aeb2	chore(obs): improve deploy maintainability from review feedback - Move POSTGRES_USER to obs.env (non-secret, constant across envs) - Replace cp -r with rsync -a --delete so removed config files are purged from /opt/familienarchiv on next deploy instead of lingering - Document --env-file ordering contract in validate + start steps: obs.env first (defaults), obs-secrets.env second (wins on dupes) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 09:20:08 +02:00
Marcel	25062be657	ci(obs): quote heredoc delimiter in release obs-secrets.env write Same fix as nightly.yml: prevents shell expansion of '$' in secret values after Gitea renders them. Keep in sync with nightly.yml. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 09:04:12 +02:00
Marcel	9662ff5f8c	ci(obs): quote heredoc delimiter in nightly obs-secrets.env write Prevents shell from expanding '$' in Gitea-rendered secret values. Without the quote, a password like 'P@$s5w0rd' has '$s5w0rd' silently expanded to '' — writing a truncated value to obs-secrets.env. '<<'EOF'' suppresses shell expansion; Gitea's '${{ }}' template rendering already ran before the shell sees the script. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 09:03:46 +02:00
Marcel	f5c7be932b	ci(obs): document POSTGRES_HOST derivation from Compose project name The container names archiv-staging-db-1 and archiv-production-db-1 are derived from the Compose project name + service name. A project rename silently breaks the obs stack DB connection. Add a comment at the point of definition so the dependency is obvious when someone changes it. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 08:54:17 +02:00
Marcel	dec0001bd1	ci(obs): chmod 600 obs-secrets.env after creation in both workflows The heredoc creates the file with default umask permissions (644 — world-readable). Setting 600 immediately after creation prevents other processes on the host from reading the Grafana, GlitchTip, and Postgres credentials. Defence-in-depth for the single-tenant VPS. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 08:53:49 +02:00
Marcel	f628ab6435	ci(obs): add validate + health assertion steps to release.yml nightly.yml had two observability gates that release.yml lacked: - "Validate observability compose config" (docker compose config --quiet) catches missing env vars and YAML errors before any containers start - "Assert observability stack health" checks obs-loki/prometheus/grafana/tempo are healthy after up --wait, covering services without healthcheck directives Mirrors the nightly.yml steps verbatim so the production deploy path is at least as well-verified as the nightly staging path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 08:53:18 +02:00
Marcel	53cf1837b2	fix(obs): set POSTGRES_HOST per environment — staging/prod use compose auto-names not archive-db Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 00:21:53 +02:00
Marcel	1ae4bfe325	ci(obs): GitOps obs env split in release — deploy to /opt/familienarchiv/, secrets fresh from Gitea Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 00:19:12 +02:00
Marcel	c5139851b8	ci(obs): GitOps obs env split in nightly — obs.env in git, secrets fresh from Gitea Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 00:18:38 +02:00
Marcel	79735e23e0	ci(obs): assert obs-loki/prometheus/grafana/tempo are healthy after stack up Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 00:01:48 +02:00
Marcel	df37113d38	ci(obs): add compose config dry-run before obs stack up to catch .env substitution errors Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-16 00:01:17 +02:00
Marcel	7e52494880	fix(ci): deploy obs configs to /opt/familienarchiv/ before starting stack The observability stack's bind-mount sources pointed to workspace-relative paths. When CI wiped the workspace between runs, containers kept running but their config files disappeared — causing Docker to auto-create directories at the missing paths and crash the services on next restart. Fix: mount /opt/familienarchiv/ into CI job containers via runner-config.yaml, then copy infra/observability/ and docker-compose.observability.yml there before docker compose up. Compose runs from the permanent path, so bind mounts resolve to stable host paths that survive workspace wipes. Docker Compose reads /opt/familienarchiv/.env automatically (no --env-file flag), which is managed on the server and persists between CI runs. Closes #601 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 21:59:23 +02:00
Marcel	56c3e51657	fix(ci): replace overlay2 sync with workspace bind mount for DooD runner-config.yaml: correct path to /srv/gitea-workspace (VPS, not Synology). docker-compose.observability.yml: revert 5 bind mounts to plain relative paths; OBS_CONFIG_DIR variable is no longer needed. nightly.yml / release.yml: remove OBS_CONFIG_DIR env injection and the "Sync observability configs to host" step from both workflows. With workdir_parent=/srv/gitea-workspace and an identical host<->container bind mount, $(pwd) inside job containers resolves to a real host path the daemon can find — no privileged container, no overlay2 inspection, no nsenter. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 19:36:55 +02:00
Marcel	1fc47888d5	fix(ci): sync observability configs to host before docker compose up (#598) DooD runner only shares /var/run/docker.sock — no workspace directory is mapped to the host daemon. Relative bind mounts in docker-compose.observability.yml resolved to paths that didn't exist on the host; Docker auto-created directories in their place, causing 'not a directory' mount failures for all five config files. Fix: - docker-compose.observability.yml: replace hardcoded ./infra/observability/ prefix with ${OBS_CONFIG_DIR:-./infra/observability} so the path is configurable while remaining backwards-compatible for local use. - nightly.yml / release.yml: add a 'Sync observability configs to host' step that finds the job container's overlay2 MergedDir (the container's full filesystem as seen from the host mount namespace), then uses the existing nsenter/alpine pattern to cp the config tree into a stable host path (/srv/familienarchiv-{staging,production}/obs-configs). OBS_CONFIG_DIR is injected into the env file so Compose picks it up. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 19:02:53 +02:00
Marcel	fed427dc4a	fix(infra): set OTEL_EXPORTER_OTLP_ENDPOINT in docker-compose.prod.yml The endpoint belongs in the compose file (hardcoded to the in-network Tempo service) rather than per-environment workflow files. This covers both staging (nightly.yml) and production (release.yml) with a single change and removes the duplicate from the nightly env-file block. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 17:43:23 +02:00
Marcel	cf78ab2f8e	fix(staging): correct backend healthcheck port and OTel endpoint Two bugs introduced when the management port was split from the app port: 1. Backend healthcheck hit localhost:8080/actuator/health (app port) — actuator is on management.server.port=8081, so every probe got a 404 from the main MVC dispatcher, marking the container permanently unhealthy. Fix: change the probe to localhost:8081. 2. OTEL_EXPORTER_OTLP_ENDPOINT was not set in .env.staging, so the exporter fell back to http://localhost:4317 (the CI-safe default) instead of http://tempo:4317 (the in-network Tempo service). Fix: inject the correct endpoint in the nightly env-file generation step. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 17:37:15 +02:00
Marcel	c8883d0e40	fix(ci): isolate compose-idempotency network from archiv-net collisions The name: archiv-net declaration (needed so docker-compose.observability.yml can join the network as external: true) caused the compose-idempotency CI job to collide with any archiv-net left on the runner from staging or a previous run. mc would resolve 'minio' to the wrong container and fail with a signature mismatch. Make the network name interpolable via COMPOSE_NETWORK_NAME (default: archiv-net so production/staging behaviour is unchanged). Inject COMPOSE_NETWORK_NAME=test-idem-archiv-net into the stub env file so the idempotency test always gets a fully isolated network. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 16:33:07 +02:00
Marcel	ada3a3ccaf	devops(ci): add --remove-orphans to observability stack deploy steps Both nightly and release workflows were missing --remove-orphans on the observability compose up, while the main app deploy step already had it. Without it, containers removed from docker-compose.observability.yml linger as unnamed orphans until manually pruned. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 14:55:28 +02:00
Marcel	4a7349543a	devops(ci): wire SENTRY_DSN into staging and production env files Adds SENTRY_DSN as an optional secret (empty by default) so it can be set after GlitchTip first-run without requiring another code change. Backend reads it via application.yaml; empty value keeps Sentry disabled. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 13:45:07 +02:00
Marcel	f15e004645	devops(ci): add --wait to observability stack startup Prometheus, Loki, Tempo, and Grafana all define healthchecks in docker-compose.observability.yml. Without --wait, the step exits 0 as soon as containers are created, masking startup failures silently. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 13:44:16 +02:00
Marcel	d7d225af77	devops(observability): wire observability stack into nightly and release deploys - docker-compose.prod.yml: add `name: archiv-net` so the network has a stable Docker name regardless of compose project name (-p flag). Both staging and production share the same host-level network, which is correct since the observability stack is a single shared instance. - nightly.yml / release.yml: add observability env vars (POSTGRES_USER, PORT_GRAFANA=3003, PORT_GLITCHTIP=3002, PORT_PROMETHEUS=9090, GRAFANA_ADMIN_PASSWORD, GLITCHTIP_SECRET_KEY, GLITCHTIP_DOMAIN) to the env file, then `docker compose -f docker-compose.observability.yml up -d` after the app deploy step. PORT_GRAFANA=3003 avoids collision with staging frontend on 3001. Requires two new Gitea secrets: GRAFANA_ADMIN_PASSWORD, GLITCHTIP_SECRET_KEY. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-15 11:22:37 +02:00
Marcel	ebdb36b7d0	devops(ci): upload surefire XML reports as CI artifact Captures all 102 test results independent of log verbosity. if: always() ensures reports are available on failure — exactly when they're needed most. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-14 14:25:37 +02:00
Marcel	5646e739c2	fix(ci): run svelte-kit sync before lint to fix cache-hit tsconfig miss When the node_modules cache hits, npm ci is skipped and the prepare lifecycle (svelte-kit sync) never runs. frontend/tsconfig.json extends .svelte-kit/tsconfig.json which only exists after svelte-kit sync — so ESLint fails at tsconfig resolution on every cache-warm run. Adding an unconditional svelte-kit sync step after Paraglide compile and before Lint ensures .svelte-kit/tsconfig.json is always present regardless of cache state. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-14 12:07:15 +02:00
Marcel	bbbdf8cd09	ci: restrict push trigger to main — eliminate duplicate runs on feature branches Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-14 11:12:24 +02:00
Marcel	3de0d2f0fe	fix(ci): add IMPORT_HOST_DIR stub to compose-idempotency env file Docker Compose interpolates all variables in the full file even when only a subset of services is requested. The backend service uses IMPORT_HOST_DIR with :? (hard-required), causing the idempotency job to abort before any container starts. A dummy path satisfies the parser; the backend service is never started in this job so the path need not exist. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-14 10:58:38 +02:00
Marcel	0abbc147e2	ci(unit-tests): add negative self-test case to upload-artifact guard The previous self-test proved the regex catches @v5 (positive case). This adds a negative case proving @v3 is NOT flagged — guards against a false-positive that would break every CI run permanently. Suggested by Sara Holt in review of PR #558. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-14 10:58:19 +02:00
Marcel	fa46492759	ci(workflows): downgrade upload-artifact v4 → v3 — Gitea act_runner limitation (ADR-014) Reverts the re-regression introduced in `410b91e2`. Gitea Actions (act_runner) does not implement the v4 artifact protocol — jobs report failure even when all tests pass. Pins all three call sites back to @v3 and adds load-bearing inline comments pointing to ADR-014 / #557. This commit makes the grep guard added in the previous commit GREEN. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-14 10:58:19 +02:00
Marcel	3965541879	ci(unit-tests): add grep guard for (upload\|download)-artifact@v4+ Adds a repo-invariant check in the same 'Assert' block as the ADR-012 birpc guard. Anchored to YAML `uses:` lines so the inline self-test fixture does not false-positive. Fails with an actionable error referencing ADR-014 / #557. Guard is intentionally RED at this commit — the three v4 call sites are downgraded in the next commit. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-14 10:58:19 +02:00
Marcel	c820884765	ci(coverage-flake-probe): add workflow_dispatch matrix job (20 parallel runs) Verification mechanism for the 20-run acceptance criterion of issue #553. Triggered manually via workflow_dispatch, runs the full coverage suite 20× in parallel against a single SHA, asserts zero `[birpc] rpc is closed` lines in every cell. One fire, parallel cost (~one main-job's wall-clock), deterministic signal for the teardown race. Cheaper than 20 sequential push events and tests the same property the AC names. Closes the verification gap raised by Tobias and Elicit in the issue discussion. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 22:12:04 +02:00
Marcel	67cd56acc7	ci(unit-tests): extend grep guard to async vi.mock with dynamic import The pdfjs-dist literal grep added in `9260866f` only caught one named trigger of the birpc teardown race; the underlying mechanism (ADR 012 / #553) is any async vi.mock factory whose body performs `await import(...)`. Add a second PCRE-multiline grep matching that shape. Scoped to */.{spec,test}.ts under frontend/src/, excluding __meta__ (which holds the fixture strings exercising the meta-test). Defence in depth pairs with the ESLint rule (saves at edit time) and the in-suite meta-test (catches when tests run). Verified locally with real GNU grep against a planted synthetic offender. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 22:11:09 +02:00
Marcel	9260866f47	ci(unit-tests): add early grep check for banned vi.mock pdfjs-dist pattern Adds a static grep step that runs after Lint and before the test suite. Fails in ~1 s if any file under frontend/src/ contains the banned vi.mock('pdfjs-dist' pattern, catching the regression before Playwright spins up. Belt-and-suspenders with the ESLint rule (ADR 012). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 12:32:23 +02:00
Marcel	1ead1f293f	ci(coverage): document that birpc guard covers coverage run only Adds a comment above the assertion step so a future developer diagnosing a birpc-related failure in `npm test` knows where to find the diagnostic. Addresses Sara Holt + Tobias Wendt round-4 observation on PR #536. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 09:57:28 +02:00
Marcel	729f5c66d6	ci(coverage): use grep -F for birpc guard to avoid BRE escaping -F (fixed string) matches the literal pattern [birpc] rpc is closed without relying on BRE bracket escaping, making the intent explicit and immune to accidental regex interpretation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 09:57:28 +02:00
Marcel	d40f477397	ci(coverage): include coverage log in artifact upload The birpc guard step writes to /tmp/coverage-test-<run_id>.log and exits 1 when a race is detected. Without this file in the artifact, the evidence disappears when the runner tears down — only the exit code remained visible. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 09:57:28 +02:00
Marcel	cf78957476	ci(coverage): harden coverage guard step - Add explicit set -eo pipefail so npm test:coverage exit code propagates through the pipe (not just tee's always-0 exit) - Scope log file to github.run_id to prevent stale-log false positives on retried steps sharing the same runner /tmp - Tighten grep pattern to \[birpc\] rpc is closed to avoid matching unrelated log lines that happen to contain "rpc is closed" Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 09:57:28 +02:00
Marcel	3594204214	ci(coverage): simplify coverage step and pin shell to bash - removes unreachable `; exit ${PIPESTATUS[0]}` — already covered by pipefail (Tobias) - adds explicit `shell: bash` to both new steps for clarity (Tobias) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 09:57:28 +02:00
Marcel	538adb43a9	ci(guard): fail unit-tests job if [birpc] rpc is closed appears in coverage run Captures npm run test:coverage output with tee and adds an always-run step that greps for the teardown-race fingerprint. Any future regression where a vi.mock factory races with birpc teardown will now surface as an explicit CI failure rather than a silent exit-1 after all tests report green (#535). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 09:57:28 +02:00
Marcel	9c26c00eee	fix(ci): replace iproute2 `ip` with /proc/net/route for gateway detection `ip route` (iproute2) is not installed in the Gitea runner container, causing the smoke test step to exit 127. /proc/net/route is a kernel virtual file that is always present on Linux; awk decodes the little-endian hex gateway field to dotted-decimal without any external binary dependency. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 09:50:56 +02:00
Marcel	6d16be4669	fix(ci): quote \$RESOLVE in all curl calls Unquoted variable expansion is safe here since the value contains no spaces or glob characters, but quoting is the correct default and keeps the script consistent with surrounding style. Addresses review suggestion by Felix Brandt and Tobias Wendt. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 09:26:35 +02:00
Marcel	f1032865f3	fix(ci): guard against empty HOST_IP in smoke test If `ip route show default` returns no output the old code passed an empty string to curl --resolve, producing a confusing error 6 ("couldn't resolve host") with no indication that gateway detection had failed. The new guard exits immediately with a clear message. Addresses review concern raised by Tobias Wendt. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 09:26:35 +02:00
Marcel	3056311c24	fix(ci): resolve smoke test host via bridge gateway, not 127.0.0.1 Job containers run in bridge network mode (runner-config.yaml). Inside a bridge-networked container 127.0.0.1 is the container's own loopback; Caddy on the host is unreachable there, causing an immediate ECONNREFUSED. Use the Docker bridge gateway IP instead — the host's docker0 interface where Caddy (bound on 0.0.0.0:443) is reachable from the container. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 09:10:17 +02:00
Marcel	544b96bc9e	fix(ci): pin Reload Caddy to alpine:3.21 digest, add reload-vs-restart rationale - Switch ubuntu:22.04 (floating, ~70 MB) to alpine:3.21 pinned by sha256 digest (~5 MB); util-linux installed at run time via apk add - Add explicit comment explaining why `reload` not `restart`: SIGHUP re-reads config in-process without dropping TLS connections Addresses Tobias + Nora blocker from PR review. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 07:42:28 +02:00
Marcel	d29169eb39	fix(ci): add Caddy reload step to release workflow Same gap as nightly.yml: production deploys also need Caddy to reload the updated Caddyfile before the smoke test validates the public surface. Uses the same nsenter pattern introduced in the previous commit. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 07:42:28 +02:00
Marcel	d750d5cee2	fix(ci): reload Caddy via nsenter, not sudo systemctl `sudo systemctl reload caddy` does not work from inside a DooD job container: `systemctl` is absent from Ubuntu container images and container processes cannot reach the host systemd without entering its namespaces. Replace with `docker run --privileged --pid=host ubuntu:22.04 nsenter -t 1 -m -u -n -p -i -- /bin/systemctl reload caddy`, which uses the already-mounted Docker socket to spin up a privileged sibling container that enters the host PID namespace via nsenter. Tested live on the Hetzner VPS. No sudoers entry required. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 07:42:28 +02:00
Marcel	90f52eae41	ci(nightly): reload Caddy before smoke test Adds a `sudo systemctl reload caddy` step between the docker compose deploy and the smoke test. This ensures any committed Caddyfile changes are applied before the public surface is verified. Previously the workflow had no mechanism to push Caddyfile changes to the running host daemon. A Caddyfile edit would land in the repo but Caddy would keep serving the previous config, causing the smoke test to catch a stale header or still-proxied /actuator route rather than the intended current config. This step also surfaces the root cause of today's port-443 failure explicitly: if Caddy is not running, the step fails with a clear service error rather than a misleading "Failed to connect to port 443" from curl. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 07:42:28 +02:00
Marcel	e42c7b04c1	ci: drop redundant npm test step, coverage run covers it The test:coverage step runs the full suite under Istanbul; running `npm test` first executes every test twice for no extra signal. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-11 21:50:28 +02:00
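
Commit 9c26c00eee describes decoding the little-endian hex gateway field of /proc/net/route into dotted-decimal, and f1032865f3 adds a guard for an empty result. The workflow itself does this with awk; the following is only a minimal Python sketch of the same decoding, not the script from the repository. The function names and the sample table are illustrative assumptions.

```python
import socket
import struct
from typing import Optional

# Illustrative sample in the /proc/net/route layout (assumed values,
# not taken from the runner). Columns are whitespace-separated.
SAMPLE_ROUTE_TABLE = (
    "Iface\tDestination\tGateway\tFlags\tRefCnt\tUse\tMetric\tMask\tMTU\tWindow\tIRTT\n"
    "eth0\t00000000\t010011AC\t0003\t0\t0\t0\t00000000\t0\t0\t0\n"
    "eth0\t000011AC\t00000000\t0001\t0\t0\t0\t0000FFFF\t0\t0\t0\n"
)


def decode_gateway(hex_field: str) -> str:
    """Convert an 8-digit little-endian hex address to dotted-decimal."""
    # Packing the integer as little-endian restores network byte order.
    return socket.inet_ntoa(struct.pack("<L", int(hex_field, 16)))


def default_gateway(route_table: str) -> Optional[str]:
    """Return the gateway of the first default route, or None if absent."""
    for line in route_table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # A destination of 00000000 marks the default route.
        if len(fields) >= 3 and fields[1] == "00000000":
            return decode_gateway(fields[2])
    return None
```

A caller would treat a `None` result as a detection failure and stop with a clear message, which mirrors the intent of the empty-`HOST_IP` guard rather than passing an empty value on to curl.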

93 Commits
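
Commits 3965541879 and 0abbc147e2 describe a guard that flags upload/download-artifact at v4 or later, is anchored to YAML `uses:` lines, and is self-tested with a positive case (@v5) and a negative case (@v3). The repository implements this with grep; the sketch below restates the idea in Python under assumptions: the exact pattern, the `actions/` owner in the sample, and the function name are hypothetical and not taken from the workflow files.

```python
import re
from typing import List

# Assumed pattern: a line that starts (after optional "- ") with "uses:"
# and references upload-artifact or download-artifact at major version >= 4.
GUARD = re.compile(
    r"^[ \t]*(?:-[ \t]*)?uses:[ \t]*\S*?(?:upload|download)-artifact"
    r"@v(?:[4-9]|[1-9]\d+)\b",
    re.MULTILINE,
)


def offending_lines(workflow_text: str) -> List[str]:
    """Return the matched 'uses:' prefixes that reference artifact v4+."""
    return [m.group(0).strip() for m in GUARD.finditer(workflow_text)]


# Hypothetical workflow excerpt used as a self-test fixture.
SAMPLE_WORKFLOW = (
    "steps:\n"
    "  - uses: actions/checkout@v4\n"
    "  - uses: actions/upload-artifact@v5\n"
    "  - uses: actions/download-artifact@v3\n"
    "  - run: echo 'uses: actions/upload-artifact@v4'\n"
)
```

Because the pattern is anchored to the start of a `uses:` line, the string inside the `run:` step is not reported, which is the property the commit message attributes to the anchoring.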