From 0f9e8c75cccf738e36d8c168fa211e74b8e7f2b5 Mon Sep 17 00:00:00 2001
From: Marcel
Date: Mon, 15 Jun 2026 20:25:58 +0200
Subject: [PATCH] fix(ci): re-enable Testcontainers Ryuk to stop the shutdown hang
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The backend job set TESTCONTAINERS_RYUK_DISABLED=true, a carry-over from
the old NAS runner. With Ryuk off, Testcontainers tears down containers
via the in-JVM JVMHookResourceReaper at shutdown; that reaper crashes
(NotFoundException) and leaks containers run-over-run.

As leaked postgres:16-alpine containers pile up on the runner, the
per-run teardown of ~30 per-context containers degrades until the fork
hangs at JVM shutdown and Surefire reports "There was a timeout in the
fork" — even though all tests pass. (The server had 21 such leaks, up to
5 weeks old; manually killing them was what restored CI before.)

CI now runs on a root server with modern Docker (29.4.3, socket access),
so the original reason to disable Ryuk no longer applies. Re-enabling it
reaps each run's containers out-of-process after the JVM exits, so they
never accumulate.

Also drops the stale "NAS runner" comment on DOCKER_API_VERSION.

Fixes #848.

Co-Authored-By: Claude Opus 4.8
---
 .gitea/workflows/ci.yml | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/.gitea/workflows/ci.yml b/.gitea/workflows/ci.yml
index 46e38aa1..75c1efe1 100644
--- a/.gitea/workflows/ci.yml
+++ b/.gitea/workflows/ci.yml
@@ -229,9 +229,14 @@ jobs:
     name: Backend Unit Tests
     runs-on: ubuntu-latest
     env:
-      DOCKER_API_VERSION: "1.43" # NAS runner runs Docker 24.x (max API 1.43); Testcontainers 2.x defaults to 1.44
+      # CI runs against the root-server Docker daemon (29.x). This API pin is a harmless
+      # carry-over from the old NAS runner (Docker 24.x, max API 1.43); safe to drop later.
+      DOCKER_API_VERSION: "1.43"
       DOCKER_HOST: unix:///var/run/docker.sock
-      TESTCONTAINERS_RYUK_DISABLED: "true"
+      # Ryuk (Testcontainers' out-of-process reaper) is intentionally LEFT ENABLED so it
+      # removes each run's containers after the JVM exits. Disabling it forced the in-JVM
+      # reaper, which hung at JVM shutdown and leaked Postgres containers run-over-run until
+      # the daemon degraded and the fork timed out at teardown — see #848.
     steps:
       - uses: actions/checkout@v4