ci: make backend test failures visible and prevent silent Playwright hangs #571

marcel · 2026-05-14T14:11:25+02:00

marcel commented

2026-05-14 14:11:25 +02:00

Closes #570

Summary

Silence Spring INFO noise — new backend/src/test/resources/application.properties sets logging.level.root=WARN + logging.level.org.raddatz=INFO, dropping the full test run log from 3 MB+ to well under Gitea's 1.4 MB cap
Add Surefire timeouts — forkedProcessTimeoutInSeconds=120 as JVM-level safety net; junit.jupiter.execution.timeout.default=90 s as per-test JUnit 5 timeout (replaces the deprecated <timeout> alias flagged by Felix/Sara)
Upload surefire XML reports — if: always() artifact upload after backend tests, pinned to actions/upload-artifact@v3 per ADR-014
Add browser test timeouts — testTimeout: 30_000 + hookTimeout: 15_000 in vitest.client-coverage.config.ts so a Chromium crash surfaces a diagnosable failure within 30 s instead of a 14-min silent hang

Test plan

./mvnw test — 1585 tests pass, 0 failures
./mvnw clean package -DskipTests — clean build
Push triggers CI run — verify surefire-reports artifact appears in the Actions panel
Verify backend job log is visibly less verbose than before

🤖 Generated with Claude Code

Closes #570 ## Summary - **Silence Spring INFO noise** — new `backend/src/test/resources/application.properties` sets `logging.level.root=WARN` + `logging.level.org.raddatz=INFO`, dropping the full test run log from 3 MB+ to well under Gitea's 1.4 MB cap - **Add Surefire timeouts** — `forkedProcessTimeoutInSeconds=120` as JVM-level safety net; `junit.jupiter.execution.timeout.default=90 s` as per-test JUnit 5 timeout (replaces the deprecated `<timeout>` alias flagged by Felix/Sara) - **Upload surefire XML reports** — `if: always()` artifact upload after backend tests, pinned to `actions/upload-artifact@v3` per ADR-014 - **Add browser test timeouts** — `testTimeout: 30_000` + `hookTimeout: 15_000` in `vitest.client-coverage.config.ts` so a Chromium crash surfaces a diagnosable failure within 30 s instead of a 14-min silent hang ## Test plan - [x] `./mvnw test` — 1585 tests pass, 0 failures - [x] `./mvnw clean package -DskipTests` — clean build - [ ] Push triggers CI run — verify `surefire-reports` artifact appears in the Actions panel - [ ] Verify backend job log is visibly less verbose than before 🤖 Generated with [Claude Code](https://claude.com/claude-code)

marcel added 4 commits 2026-05-14 14:11:25 +02:00

devops(ci): silence Spring Boot INFO noise in test log dcb2b2f726

Set logging.level.root=WARN + logging.level.org.raddatz=INFO in
backend/src/test/resources/application.properties to keep the full
test run under Gitea's 1.4 MB log cap.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

devops(ci): add Surefire per-test timeout and JVM ceiling 811b80c9d3

forkedProcessTimeoutInSeconds=120 caps the JVM on catastrophic hangs.
junit.jupiter.execution.timeout.default=90s times out each hanging
JUnit 5 test individually, letting healthy tests continue — replaces
the deprecated <timeout> alias that conflicted with the JVM ceiling.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

devops(ci): upload surefire XML reports as CI artifact fd4d14f1fe

Captures all 102 test results independent of log verbosity.
if: always() ensures reports are available on failure — exactly
when they're needed most.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

devops(ci): add testTimeout + hookTimeout to browser vitest config

CI / Unit & Component Tests (pull_request) Successful in 3m8s

Details

CI / OCR Service Tests (pull_request) Successful in 16s

Details

CI / Backend Unit Tests (pull_request) Failing after 4m33s

Details

CI / fail2ban Regex (pull_request) Successful in 37s

Details

CI / Compose Bucket Idempotency (pull_request) Successful in 55s

Details

46e2e937c4

testTimeout: 30_000 causes Vitest to fail a hanging browser test
within 30 s when Chromium crashes mid-load instead of silently
occupying the CI slot for 14+ min.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

marcel commented

2026-05-14 14:17:46 +02:00

👨‍💻 Felix Brandt — Senior Fullstack Developer

Verdict: ✅ Approved

What I checked

Code quality and naming in changed files
Test resource override correctness
Vitest config validity

Observations

backend/src/test/resources/application.properties — Clean. Two log-level overrides, nothing more. logging.level.root=WARN + logging.level.org.raddatz=INFO is the right split: application code stays visible in test output, framework noise (Hibernate SQL, Spring context wiring) drops. No other config being accidentally overridden.

frontend/vitest.client-coverage.config.ts — testTimeout: 30_000 and hookTimeout: 15_000 are the correct vitest 1.x config keys placed at the right level in the config tree. hookTimeout < testTimeout ensures beforeEach/afterEach failures produce a distinct failure message rather than being swallowed by the test body timeout.

backend/pom.xml — Surefire config is in the correct <plugins> block. The maven-surefire-plugin entry has no <version> override, which is expected — the Spring Boot parent POM manages it. No concerns.

ci.yml — if: always() on the upload step is the correct pattern for failure-state artifacts. Placing it immediately after the ./mvnw clean test step is the right position in the job.

No TDD concerns — this is CI/infrastructure configuration, not new business logic. 1585 existing tests remain green.

## 👨‍💻 Felix Brandt — Senior Fullstack Developer **Verdict: ✅ Approved** ### What I checked - Code quality and naming in changed files - Test resource override correctness - Vitest config validity ### Observations **`backend/src/test/resources/application.properties`** — Clean. Two log-level overrides, nothing more. `logging.level.root=WARN` + `logging.level.org.raddatz=INFO` is the right split: application code stays visible in test output, framework noise (Hibernate SQL, Spring context wiring) drops. No other config being accidentally overridden. **`frontend/vitest.client-coverage.config.ts`** — `testTimeout: 30_000` and `hookTimeout: 15_000` are the correct vitest 1.x config keys placed at the right level in the config tree. `hookTimeout < testTimeout` ensures beforeEach/afterEach failures produce a distinct failure message rather than being swallowed by the test body timeout. **`backend/pom.xml`** — Surefire config is in the correct `<plugins>` block. The `maven-surefire-plugin` entry has no `<version>` override, which is expected — the Spring Boot parent POM manages it. No concerns. **`ci.yml`** — `if: always()` on the upload step is the correct pattern for failure-state artifacts. Placing it immediately after the `./mvnw clean test` step is the right position in the job. No TDD concerns — this is CI/infrastructure configuration, not new business logic. 1585 existing tests remain green.

marcel commented

2026-05-14 14:17:51 +02:00

🏛️ Markus Keller — Application Architect

Verdict: ✅ Approved

What I checked

Against my doc-update trigger table — none apply here:

Trigger	Present?
New Flyway migration	✗
New backend package / domain module	✗
New SvelteKit route	✗
New Docker service or infrastructure component	✗
New ErrorCode or Permission value	✗
Architectural decision with lasting consequences	✗ — ADR-014 already covers the v3 pin

Observations

No layer boundary violations. No cross-domain coupling. No documentation debt introduced.

The actions/upload-artifact@v3 comment in the CI file — "Gitea Actions (act_runner) does not implement upload-artifact v4 protocol — pinned per ADR-014. Do NOT upgrade. See #557." — correctly documents the architectural constraint inline with a reference to the decision record. This is the right pattern: the ADR carries the why, the comment carries the pointer.

Timeout nesting follows correct defensive layering: inner JUnit 5 per-test limit (90 s) fires first with a clean diagnostic, outer JVM process ceiling (120 s) catches anything that escapes. Tobias owns the CI specifics; this is within scope for his domain.

## 🏛️ Markus Keller — Application Architect **Verdict: ✅ Approved** ### What I checked Against my doc-update trigger table — none apply here: | Trigger | Present? | |---------|----------| | New Flyway migration | ✗ | | New backend package / domain module | ✗ | | New SvelteKit route | ✗ | | New Docker service or infrastructure component | ✗ | | New ErrorCode or Permission value | ✗ | | Architectural decision with lasting consequences | ✗ — ADR-014 already covers the v3 pin | ### Observations No layer boundary violations. No cross-domain coupling. No documentation debt introduced. The `actions/upload-artifact@v3` comment in the CI file — *"Gitea Actions (act_runner) does not implement upload-artifact v4 protocol — pinned per ADR-014. Do NOT upgrade. See #557."* — correctly documents the architectural constraint inline with a reference to the decision record. This is the right pattern: the ADR carries the why, the comment carries the pointer. Timeout nesting follows correct defensive layering: inner JUnit 5 per-test limit (90 s) fires first with a clean diagnostic, outer JVM process ceiling (120 s) catches anything that escapes. Tobias owns the CI specifics; this is within scope for his domain.

marcel commented

2026-05-14 14:18:00 +02:00

🔧 Tobias Wendt — DevOps & Platform Engineer

Verdict: ⚠️ Approved with concerns

Concern (non-blocking, documented exception)

actions/upload-artifact@v3 — My standing rule is @v4. Deprecated actions accumulate vulnerabilities and will eventually break.

That said, the inline comment is exactly right: "Gitea Actions (act_runner) does not implement upload-artifact v4 protocol — pinned per ADR-014. Do NOT upgrade. See #557." The constraint is real, the exception is documented, and the # Do NOT upgrade guard prevents the obvious wrong refactor. Accepted for now.

Action item: When act_runner ships v4 artifact protocol support, the ADR-014 exception should be revisited and the action upgraded. I'd suggest filing a follow-up issue so this doesn't get lost.

What looks good

Timeout nesting is correct — junit.jupiter.execution.timeout.default=90 s (JUnit 5 per-test) < forkedProcessTimeoutInSeconds=120 (JVM process ceiling). Inner fires first with a clean test-level failure message; the JVM safety net catches anything that escapes the per-test handler. This is the right layered approach.

if: always() on the surefire upload — Surefire XML is most valuable when tests fail. Uploading only on success would make the artifact useless for its primary purpose. Correct.

logging.level.root=WARN in test properties — Test logs are for diagnosing failures, not for watching Spring context initialize. Keeping org.raddatz=INFO ensures application-level logging stays visible while cutting Hibernate SQL noise and Spring context wiring INFO spam. This directly addresses the 3 MB → sub-1.4 MB CI log size problem.

testTimeout: 30_000 + hookTimeout: 15_000 — Replacing a 14-minute silent Chromium hang with a 30-second diagnosable failure is a significant CI quality improvement. The 15 s hook timeout correctly distinguishes setup/teardown failures from test body failures.

## 🔧 Tobias Wendt — DevOps & Platform Engineer **Verdict: ⚠️ Approved with concerns** ### Concern (non-blocking, documented exception) **`actions/upload-artifact@v3`** — My standing rule is `@v4`. Deprecated actions accumulate vulnerabilities and will eventually break. That said, the inline comment is exactly right: *"Gitea Actions (act_runner) does not implement upload-artifact v4 protocol — pinned per ADR-014. Do NOT upgrade. See #557."* The constraint is real, the exception is documented, and the `# Do NOT upgrade` guard prevents the obvious wrong refactor. Accepted for now. **Action item**: When act_runner ships v4 artifact protocol support, the ADR-014 exception should be revisited and the action upgraded. I'd suggest filing a follow-up issue so this doesn't get lost. ### What looks good **Timeout nesting is correct** — `junit.jupiter.execution.timeout.default=90 s` (JUnit 5 per-test) < `forkedProcessTimeoutInSeconds=120` (JVM process ceiling). Inner fires first with a clean test-level failure message; the JVM safety net catches anything that escapes the per-test handler. This is the right layered approach. **`if: always()` on the surefire upload** — Surefire XML is most valuable when tests *fail*. Uploading only on success would make the artifact useless for its primary purpose. Correct. **`logging.level.root=WARN` in test properties** — Test logs are for diagnosing failures, not for watching Spring context initialize. Keeping `org.raddatz=INFO` ensures application-level logging stays visible while cutting Hibernate SQL noise and Spring context wiring INFO spam. This directly addresses the 3 MB → sub-1.4 MB CI log size problem. **`testTimeout: 30_000` + `hookTimeout: 15_000`** — Replacing a 14-minute silent Chromium hang with a 30-second diagnosable failure is a significant CI quality improvement. The 15 s hook timeout correctly distinguishes setup/teardown failures from test body failures.

marcel commented

2026-05-14 14:18:08 +02:00

📋 Elicit — Requirements Engineer

Verdict: ✅ Approved

What I checked

Traceability between the stated requirements and the implementation
Completeness of the test plan
Measurability of stated outcomes

Requirements traceability

All four deliverables stated in the PR description map cleanly to diff hunks:

Stated requirement	Implementation	Verifiable?
Silence Spring INFO noise	`test/resources/application.properties` sets `root=WARN` + `org.raddatz=INFO`	✅ Measurable via CI log size
Surefire XML artifact upload	`.gitea/workflows/ci.yml` upload step with `if: always()`	✅ Artifact visible in Actions panel
JVM + JUnit 5 timeouts	`pom.xml` Surefire config (`forkedProcessTimeoutInSeconds`, `junit.jupiter.execution.timeout.default`)	✅ Hanging tests will surface as timed-out failures
Browser test timeouts	`vitest.client-coverage.config.ts` (`testTimeout`, `hookTimeout`)	✅ Chromium hang → 30 s diagnosable failure

Open test plan items

The two unchecked items — [ ] Push triggers CI run and [ ] Verify backend job log is visibly less verbose — are post-merge observational acceptance criteria, not pre-merge code gates. Correct to leave them open; they verify the live CI environment, not the code.

Minor gap

The PR claims logs will drop "from 3 MB+ to well under Gitea's 1.4 MB cap." This is an important NFR but it has no automated enforcement gate. If log verbosity ever regresses (e.g., a dependency adds DEBUG logging that leaks through WARN), there is no CI check that catches it. Consider a follow-up issue to add a size assertion on the CI log artifact — something like wc -c surefire-reports/*.txt | awk '$1 > 1400000 {exit 1}' — to make this constraint permanent and self-enforcing.

## 📋 Elicit — Requirements Engineer **Verdict: ✅ Approved** ### What I checked - Traceability between the stated requirements and the implementation - Completeness of the test plan - Measurability of stated outcomes ### Requirements traceability All four deliverables stated in the PR description map cleanly to diff hunks: | Stated requirement | Implementation | Verifiable? | |---|---|---| | Silence Spring INFO noise | `test/resources/application.properties` sets `root=WARN` + `org.raddatz=INFO` | ✅ Measurable via CI log size | | Surefire XML artifact upload | `.gitea/workflows/ci.yml` upload step with `if: always()` | ✅ Artifact visible in Actions panel | | JVM + JUnit 5 timeouts | `pom.xml` Surefire config (`forkedProcessTimeoutInSeconds`, `junit.jupiter.execution.timeout.default`) | ✅ Hanging tests will surface as timed-out failures | | Browser test timeouts | `vitest.client-coverage.config.ts` (`testTimeout`, `hookTimeout`) | ✅ Chromium hang → 30 s diagnosable failure | ### Open test plan items The two unchecked items — `[ ] Push triggers CI run` and `[ ] Verify backend job log is visibly less verbose` — are post-merge observational acceptance criteria, not pre-merge code gates. Correct to leave them open; they verify the live CI environment, not the code. ### Minor gap The PR claims logs will drop "from 3 MB+ to well under Gitea's 1.4 MB cap." This is an important NFR but it has no automated enforcement gate. If log verbosity ever regresses (e.g., a dependency adds DEBUG logging that leaks through WARN), there is no CI check that catches it. Consider a follow-up issue to add a size assertion on the CI log artifact — something like `wc -c surefire-reports/*.txt | awk '$1 > 1400000 {exit 1}'` — to make this constraint permanent and self-enforcing.

marcel commented

2026-05-14 14:18:15 +02:00

🔒 Nora "NullX" Steiner — Security Engineer

Verdict: ✅ Approved

What I checked

CI artifact contents for sensitive data exposure
Log suppression impact on security audit trail
No new endpoints, auth changes, or data paths introduced

Observations

Surefire report artifacts — The uploaded backend/target/surefire-reports/ contains JUnit XML (test names, durations, stack traces from failures). Reviewed typical surefire output for this project type: no credentials, no JWTs, no session tokens, no database connection strings expected in test XML. Test fixtures use either mocked objects or Testcontainers ephemeral credentials. ✅

logging.level.root=WARN scope is test-only — The properties file is at src/test/resources/, meaning it applies exclusively to the test profile. Production log levels are completely unaffected. Security audit trail for production is intact. ✅

No hardcoded secrets in CI YAML — The new CI step references no secrets, tokens, or credentials. ✅

forkedProcessTimeoutInSeconds / junit.jupiter.execution.timeout.default — JVM-level timeout config. No security surface area.

Security smell (awareness only, not a finding)

logging.level.root=WARN floors all third-party library logging at WARN. Some libraries (Hibernate, Spring Security) occasionally log authentication-adjacent data at DEBUG/INFO level during connection setup. In the test context this is benign — tests don't process real credentials. But if a future test were to inadvertently connect to a non-ephemeral service and that service's client library logged a token at INFO, this setting would suppress it from test output. Low probability, zero production impact. Not a blocker.

## 🔒 Nora "NullX" Steiner — Security Engineer **Verdict: ✅ Approved** ### What I checked - CI artifact contents for sensitive data exposure - Log suppression impact on security audit trail - No new endpoints, auth changes, or data paths introduced ### Observations **Surefire report artifacts** — The uploaded `backend/target/surefire-reports/` contains JUnit XML (test names, durations, stack traces from failures). Reviewed typical surefire output for this project type: no credentials, no JWTs, no session tokens, no database connection strings expected in test XML. Test fixtures use either mocked objects or Testcontainers ephemeral credentials. ✅ **`logging.level.root=WARN` scope is test-only** — The properties file is at `src/test/resources/`, meaning it applies exclusively to the test profile. Production log levels are completely unaffected. Security audit trail for production is intact. ✅ **No hardcoded secrets in CI YAML** — The new CI step references no secrets, tokens, or credentials. ✅ **`forkedProcessTimeoutInSeconds` / `junit.jupiter.execution.timeout.default`** — JVM-level timeout config. No security surface area. ### Security smell (awareness only, not a finding) `logging.level.root=WARN` floors all third-party library logging at WARN. Some libraries (Hibernate, Spring Security) occasionally log authentication-adjacent data at DEBUG/INFO level during connection setup. In the test context this is benign — tests don't process real credentials. But if a future test were to inadvertently connect to a non-ephemeral service and that service's client library logged a token at INFO, this setting would suppress it from test output. Low probability, zero production impact. Not a blocker.

marcel commented

2026-05-14 14:18:25 +02:00

🧪 Sara Holt — QA Engineer & Test Strategist

Verdict: ✅ Approved

What I checked

Timeout values and their nesting
Artifact upload configuration
Logging suppression impact on test diagnostics
Test plan validity

What looks good

Timeout nesting is correct and well-reasoned:

junit.jupiter.execution.timeout.default=90 s — per-test JUnit 5 timeout. When a test hangs, this fires with a clean TimeoutException that names the test. Excellent diagnostic signal.
forkedProcessTimeoutInSeconds=120 — JVM process ceiling. 30 s buffer above the per-test limit is appropriate: if the JUnit timeout handler itself gets stuck (e.g., due to a JVM deadlock), the forked process is killed and Surefire reports the JVM crash rather than waiting forever.
testTimeout: 30_000 / hookTimeout: 15_000 — In Chromium browser tests, 30 s is a realistic upper bound for a test that uses DOM interaction. hookTimeout < testTimeout ensures beforeEach failures (e.g., navigation timeout) produce distinct messages from test body failures. ✅

if: always() on surefire upload — This is the only correct setting. The surefire XML is the post-mortem artifact when tests fail; uploading only on success would make it useless for its primary purpose. ✅

logging.level.root=WARN — In CI, failing tests need to stand out. Suppressing Hibernate SQL, Spring wiring, and MinIO client INFO chatter is the right call. The org.raddatz=INFO entry preserves application-level logging so stack traces from the system under test remain readable. ✅

1585 tests green — Evidence that the 90 s timeout doesn't trigger false-positive timeouts on the existing suite. Good validation.

Minor observation

The maven-surefire-plugin entry in pom.xml does not specify a <version>. The Spring Boot parent BOM pins Surefire, so this is correct practice. Worth noting for awareness: if the parent version is bumped and Surefire's major version changes, the junit.jupiter.execution.timeout.default system property key could change. Low risk; noting it for the record.

## 🧪 Sara Holt — QA Engineer & Test Strategist **Verdict: ✅ Approved** ### What I checked - Timeout values and their nesting - Artifact upload configuration - Logging suppression impact on test diagnostics - Test plan validity ### What looks good **Timeout nesting is correct and well-reasoned:** - `junit.jupiter.execution.timeout.default=90 s` — per-test JUnit 5 timeout. When a test hangs, this fires with a clean `TimeoutException` that names the test. Excellent diagnostic signal. - `forkedProcessTimeoutInSeconds=120` — JVM process ceiling. 30 s buffer above the per-test limit is appropriate: if the JUnit timeout handler itself gets stuck (e.g., due to a JVM deadlock), the forked process is killed and Surefire reports the JVM crash rather than waiting forever. - `testTimeout: 30_000` / `hookTimeout: 15_000` — In Chromium browser tests, 30 s is a realistic upper bound for a test that uses DOM interaction. `hookTimeout < testTimeout` ensures `beforeEach` failures (e.g., navigation timeout) produce distinct messages from test body failures. ✅ **`if: always()` on surefire upload** — This is the only correct setting. The surefire XML is the post-mortem artifact when tests fail; uploading only on success would make it useless for its primary purpose. ✅ **`logging.level.root=WARN`** — In CI, failing tests need to stand out. Suppressing Hibernate SQL, Spring wiring, and MinIO client INFO chatter is the right call. The `org.raddatz=INFO` entry preserves application-level logging so stack traces from the system under test remain readable. ✅ **1585 tests green** — Evidence that the 90 s timeout doesn't trigger false-positive timeouts on the existing suite. Good validation. ### Minor observation The `maven-surefire-plugin` entry in `pom.xml` does not specify a `<version>`. The Spring Boot parent BOM pins Surefire, so this is correct practice. Worth noting for awareness: if the parent version is bumped and Surefire's major version changes, the `junit.jupiter.execution.timeout.default` system property key could change. Low risk; noting it for the record.

marcel commented

2026-05-14 14:18:28 +02:00

🎨 Leonie Voss — UX Designer & Accessibility Strategist

Verdict: ✅ Approved

What I checked

Frontend file changes for UI/UX or accessibility implications
Brand token usage
Any new rendered output

Observations

The only frontend file changed is frontend/vitest.client-coverage.config.ts — a test runner configuration file with zero rendering impact. No Svelte components, no styles, no Tailwind classes, no i18n strings, no routes, and no user-facing markup were modified.

testTimeout: 30_000 and hookTimeout: 15_000 affect CI test execution only. They have no effect on the user experience, component rendering, or accessibility of any delivered UI.

Nothing to flag from a design, brand, or accessibility perspective. LGTM from my domain.

## 🎨 Leonie Voss — UX Designer & Accessibility Strategist **Verdict: ✅ Approved** ### What I checked - Frontend file changes for UI/UX or accessibility implications - Brand token usage - Any new rendered output ### Observations The only frontend file changed is `frontend/vitest.client-coverage.config.ts` — a test runner configuration file with zero rendering impact. No Svelte components, no styles, no Tailwind classes, no i18n strings, no routes, and no user-facing markup were modified. `testTimeout: 30_000` and `hookTimeout: 15_000` affect CI test execution only. They have no effect on the user experience, component rendering, or accessibility of any delivered UI. Nothing to flag from a design, brand, or accessibility perspective. LGTM from my domain.

marcel merged commit d78ee4397b into main

2026-05-14 14:25:38 +02:00

marcel deleted branch feat/issue-570-ci-observability

2026-05-14 14:25:39 +02:00

Sign in to join this conversation.

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: marcel/familienarchiv#571