fix(ci): re-enable Testcontainers Ryuk to stop the backend fork shutdown hang (#848) #849
Fixes #848.
Symptom
CI `Backend Unit Tests` goes red even though all tests pass: after the last test, the fork hangs at JVM shutdown and Surefire reports `There was a timeout in the fork` → `BUILD FAILURE`.

Root cause (corrected after investigation)
My first theory (slow shutdown needs a bigger timeout) was wrong: raising `forkedProcessExitTimeoutInSeconds` 30→120 only delayed the kill by ~90s (total time 12:35 → 14:04), which shows an indefinite hang rather than slowness.

The real cause is Testcontainers teardown with Ryuk disabled:
- CI sets `TESTCONTAINERS_RYUK_DISABLED: "true"` (carry-over from the old NAS runner).
- With Ryuk disabled, cleanup falls to `JVMHookResourceReaper` at shutdown. That reaper crashes (`NotFoundException`) and leaks containers run-over-run.
- Each test context gets its own Postgres container (`PostgresContainerConfig` is a per-context `@Bean`), so ~30 Postgres containers are torn down in-JVM at shutdown.
- Leaked `postgres:16-alpine`/`minio` containers on the server were up to 5 weeks old; manually killing them is what restored CI before (a recurring pattern).

Environment confirmed via `ssh root@raddatz.cloud`: CI now runs on a root server with Docker 29.4.3 (8 CPU, 62 GB, socket access), so the original reason to disable Ryuk no longer applies, and Docker is not slow.

Change
- Re-enable Ryuk (remove `TESTCONTAINERS_RYUK_DISABLED`): Ryuk reaps each run's containers out-of-process after the JVM exits, so they never accumulate. This automates the manual "kill all testcontainers" step.
- Keep `forkedProcessExitTimeoutInSeconds=120` as a harmless backstop.
- […] `DOCKER_API_VERSION`.

Operational: the 21 leaked containers were already removed from the server (by `org.testcontainers=true` label; real services untouched), giving immediate relief.

Validation
Being validated by this PR's CI run on the real runner (I am watching it). If Ryuk can't start in the runner's docker-outside-docker setup, the integration tests fail fast and I revert; the fallback is a singleton Postgres container.
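
The singleton fallback mentioned above can be sketched as follows. This is a minimal illustration, not the project's code: `FakeContainer` and `ContainerLifecycles` are hypothetical names, and `FakeContainer` only counts how often it is constructed where a real setup would start a Testcontainers Postgres container. It contrasts the per-context pattern (one container per test context, about 30 in this build) with a single JVM-wide instance.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical stand-in for a Testcontainers container; it only counts constructions. */
final class FakeContainer {
    static final AtomicInteger STARTED = new AtomicInteger();

    FakeContainer() {
        STARTED.incrementAndGet(); // a real setup would start a Postgres container here
    }
}

final class ContainerLifecycles {
    // Initialization-on-demand holder: the JVM creates INSTANCE at most once,
    // lazily and thread-safely, on the first call to singleton().
    private static final class Holder {
        static final FakeContainer INSTANCE = new FakeContainer();
    }

    /** Per-context style: every call (one per test context) creates a new container. */
    static FakeContainer perContext() {
        return new FakeContainer();
    }

    /** Singleton style: all contexts share one container for the lifetime of the JVM. */
    static FakeContainer singleton() {
        return Holder.INSTANCE;
    }

    /** Simulates `contexts` test contexts and returns how many containers they created. */
    static int containersStarted(int contexts, Supplier<FakeContainer> source) {
        int before = FakeContainer.STARTED.get();
        for (int i = 0; i < contexts; i++) {
            source.get();
        }
        return FakeContainer.STARTED.get() - before;
    }

    public static void main(String[] args) {
        System.out.println(containersStarted(30, ContainerLifecycles::perContext)); // 30
        System.out.println(containersStarted(30, ContainerLifecycles::singleton));  // 1
    }
}
```

With a single shared instance there is only one container to tear down at JVM exit, which is why it would sidestep the mass in-JVM teardown if Ryuk turns out not to work on the runner.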
🤖 Generated with Claude Code
Changed title from "fix(ci): give the backend test fork 120s to shut down (#848)" to "fix(ci): re-enable Testcontainers Ryuk to stop the backend fork shutdown hang (#848)"