ci: make backend test failures visible and prevent silent Playwright hangs #571
Reference in New Issue
Block a user
Delete Branch "feat/issue-570-ci-observability"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Closes #570
Summary
backend/src/test/resources/application.propertiessetslogging.level.root=WARN+logging.level.org.raddatz=INFO, dropping the full test run log from 3 MB+ to well under Gitea's 1.4 MB capforkedProcessTimeoutInSeconds=120as JVM-level safety net;junit.jupiter.execution.timeout.default=90 sas per-test JUnit 5 timeout (replaces the deprecated<timeout>alias flagged by Felix/Sara)if: always()artifact upload after backend tests, pinned toactions/upload-artifact@v3per ADR-014testTimeout: 30_000+hookTimeout: 15_000invitest.client-coverage.config.tsso a Chromium crash surfaces a diagnosable failure within 30 s instead of a 14-min silent hangTest plan
./mvnw test— 1585 tests pass, 0 failures./mvnw clean package -DskipTests— clean buildsurefire-reportsartifact appears in the Actions panel🤖 Generated with Claude Code
👨💻 Felix Brandt — Senior Fullstack Developer
Verdict: ✅ Approved
What I checked
Observations
backend/src/test/resources/application.properties— Clean. Two log-level overrides, nothing more.logging.level.root=WARN+logging.level.org.raddatz=INFOis the right split: application code stays visible in test output, framework noise (Hibernate SQL, Spring context wiring) drops. No other config being accidentally overridden.frontend/vitest.client-coverage.config.ts—testTimeout: 30_000andhookTimeout: 15_000are the correct vitest 1.x config keys placed at the right level in the config tree.hookTimeout < testTimeoutensures beforeEach/afterEach failures produce a distinct failure message rather than being swallowed by the test body timeout.backend/pom.xml— Surefire config is in the correct<plugins>block. Themaven-surefire-pluginentry has no<version>override, which is expected — the Spring Boot parent POM manages it. No concerns.ci.yml—if: always()on the upload step is the correct pattern for failure-state artifacts. Placing it immediately after the./mvnw clean teststep is the right position in the job.No TDD concerns — this is CI/infrastructure configuration, not new business logic. 1585 existing tests remain green.
🏛️ Markus Keller — Application Architect
Verdict: ✅ Approved
What I checked
Against my doc-update trigger table — none apply here:
Observations
No layer boundary violations. No cross-domain coupling. No documentation debt introduced.
The
actions/upload-artifact@v3comment in the CI file — "Gitea Actions (act_runner) does not implement upload-artifact v4 protocol — pinned per ADR-014. Do NOT upgrade. See #557." — correctly documents the architectural constraint inline with a reference to the decision record. This is the right pattern: the ADR carries the why, the comment carries the pointer.Timeout nesting follows correct defensive layering: inner JUnit 5 per-test limit (90 s) fires first with a clean diagnostic, outer JVM process ceiling (120 s) catches anything that escapes. Tobias owns the CI specifics; this is within scope for his domain.
🔧 Tobias Wendt — DevOps & Platform Engineer
Verdict: ⚠️ Approved with concerns
Concern (non-blocking, documented exception)
actions/upload-artifact@v3— My standing rule is@v4. Deprecated actions accumulate vulnerabilities and will eventually break.That said, the inline comment is exactly right: "Gitea Actions (act_runner) does not implement upload-artifact v4 protocol — pinned per ADR-014. Do NOT upgrade. See #557." The constraint is real, the exception is documented, and the
# Do NOT upgradeguard prevents the obvious wrong refactor. Accepted for now.Action item: When act_runner ships v4 artifact protocol support, the ADR-014 exception should be revisited and the action upgraded. I'd suggest filing a follow-up issue so this doesn't get lost.
What looks good
Timeout nesting is correct —
junit.jupiter.execution.timeout.default=90 s(JUnit 5 per-test) <forkedProcessTimeoutInSeconds=120(JVM process ceiling). Inner fires first with a clean test-level failure message; the JVM safety net catches anything that escapes the per-test handler. This is the right layered approach.if: always()on the surefire upload — Surefire XML is most valuable when tests fail. Uploading only on success would make the artifact useless for its primary purpose. Correct.logging.level.root=WARNin test properties — Test logs are for diagnosing failures, not for watching Spring context initialize. Keepingorg.raddatz=INFOensures application-level logging stays visible while cutting Hibernate SQL noise and Spring context wiring INFO spam. This directly addresses the 3 MB → sub-1.4 MB CI log size problem.testTimeout: 30_000+hookTimeout: 15_000— Replacing a 14-minute silent Chromium hang with a 30-second diagnosable failure is a significant CI quality improvement. The 15 s hook timeout correctly distinguishes setup/teardown failures from test body failures.📋 Elicit — Requirements Engineer
Verdict: ✅ Approved
What I checked
Requirements traceability
All four deliverables stated in the PR description map cleanly to diff hunks:
test/resources/application.propertiessetsroot=WARN+org.raddatz=INFO.gitea/workflows/ci.ymlupload step withif: always()pom.xmlSurefire config (forkedProcessTimeoutInSeconds,junit.jupiter.execution.timeout.default)vitest.client-coverage.config.ts(testTimeout,hookTimeout)Open test plan items
The two unchecked items —
[ ] Push triggers CI runand[ ] Verify backend job log is visibly less verbose— are post-merge observational acceptance criteria, not pre-merge code gates. Correct to leave them open; they verify the live CI environment, not the code.Minor gap
The PR claims logs will drop "from 3 MB+ to well under Gitea's 1.4 MB cap." This is an important NFR but it has no automated enforcement gate. If log verbosity ever regresses (e.g., a dependency adds DEBUG logging that leaks through WARN), there is no CI check that catches it. Consider a follow-up issue to add a size assertion on the CI log artifact — something like
wc -c surefire-reports/*.txt | awk '$1 > 1400000 {exit 1}'— to make this constraint permanent and self-enforcing.🔒 Nora "NullX" Steiner — Security Engineer
Verdict: ✅ Approved
What I checked
Observations
Surefire report artifacts — The uploaded
backend/target/surefire-reports/contains JUnit XML (test names, durations, stack traces from failures). Reviewed typical surefire output for this project type: no credentials, no JWTs, no session tokens, no database connection strings expected in test XML. Test fixtures use either mocked objects or Testcontainers ephemeral credentials. ✅logging.level.root=WARNscope is test-only — The properties file is atsrc/test/resources/, meaning it applies exclusively to the test profile. Production log levels are completely unaffected. Security audit trail for production is intact. ✅No hardcoded secrets in CI YAML — The new CI step references no secrets, tokens, or credentials. ✅
forkedProcessTimeoutInSeconds/junit.jupiter.execution.timeout.default— JVM-level timeout config. No security surface area.Security smell (awareness only, not a finding)
logging.level.root=WARNfloors all third-party library logging at WARN. Some libraries (Hibernate, Spring Security) occasionally log authentication-adjacent data at DEBUG/INFO level during connection setup. In the test context this is benign — tests don't process real credentials. But if a future test were to inadvertently connect to a non-ephemeral service and that service's client library logged a token at INFO, this setting would suppress it from test output. Low probability, zero production impact. Not a blocker.🧪 Sara Holt — QA Engineer & Test Strategist
Verdict: ✅ Approved
What I checked
What looks good
Timeout nesting is correct and well-reasoned:
junit.jupiter.execution.timeout.default=90 s— per-test JUnit 5 timeout. When a test hangs, this fires with a cleanTimeoutExceptionthat names the test. Excellent diagnostic signal.forkedProcessTimeoutInSeconds=120— JVM process ceiling. 30 s buffer above the per-test limit is appropriate: if the JUnit timeout handler itself gets stuck (e.g., due to a JVM deadlock), the forked process is killed and Surefire reports the JVM crash rather than waiting forever.testTimeout: 30_000/hookTimeout: 15_000— In Chromium browser tests, 30 s is a realistic upper bound for a test that uses DOM interaction.hookTimeout < testTimeoutensuresbeforeEachfailures (e.g., navigation timeout) produce distinct messages from test body failures. ✅if: always()on surefire upload — This is the only correct setting. The surefire XML is the post-mortem artifact when tests fail; uploading only on success would make it useless for its primary purpose. ✅logging.level.root=WARN— In CI, failing tests need to stand out. Suppressing Hibernate SQL, Spring wiring, and MinIO client INFO chatter is the right call. Theorg.raddatz=INFOentry preserves application-level logging so stack traces from the system under test remain readable. ✅1585 tests green — Evidence that the 90 s timeout doesn't trigger false-positive timeouts on the existing suite. Good validation.
Minor observation
The
maven-surefire-pluginentry inpom.xmldoes not specify a<version>. The Spring Boot parent BOM pins Surefire, so this is correct practice. Worth noting for awareness: if the parent version is bumped and Surefire's major version changes, thejunit.jupiter.execution.timeout.defaultsystem property key could change. Low risk; noting it for the record.🎨 Leonie Voss — UX Designer & Accessibility Strategist
Verdict: ✅ Approved
What I checked
Observations
The only frontend file changed is
frontend/vitest.client-coverage.config.ts— a test runner configuration file with zero rendering impact. No Svelte components, no styles, no Tailwind classes, no i18n strings, no routes, and no user-facing markup were modified.testTimeout: 30_000andhookTimeout: 15_000affect CI test execution only. They have no effect on the user experience, component rendering, or accessibility of any delivered UI.Nothing to flag from a design, brand, or accessibility perspective. LGTM from my domain.