fix(ci): pin Reload Caddy to alpine:3.21 digest, add reload-vs-restart rationale

- Switch from ubuntu:22.04 (floating tag, ~70 MB) to alpine:3.21 pinned
  by sha256 digest (~5 MB); util-linux is installed at run time via
  apk add
- Add an explicit comment explaining why the step uses `reload` rather
  than `restart`: reload sends SIGHUP, so Caddy re-reads its config
  in-process without dropping TLS connections

Addresses the blocker raised by Tobias and Nora in PR review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Marcel
2026-05-11 22:43:55 +02:00
parent 52a96f657d
commit f608838f7a
2 changed files with 24 additions and 16 deletions


@@ -106,22 +106,29 @@ jobs:
 # the current config is live.
 #
 # The runner executes job steps inside Docker containers (DooD).
-# `systemctl` is not present in Ubuntu container images and cannot
-# reach the host's systemd directly. We use the Docker socket
-# (mounted into every job container via runner-config.yaml) to spin
-# up a privileged sibling container in the host PID namespace;
-# nsenter then enters the host's namespaces so systemctl talks to
-# the real host systemd daemon. No sudoers entry is required — the
-# Docker socket already grants root-equivalent host access.
+# `systemctl` is not present in container images and cannot reach
+# the host's systemd directly. We use the Docker socket (mounted
+# into every job container via runner-config.yaml) to spin up a
+# privileged sibling container in the host PID namespace; nsenter
+# then enters the host's namespaces so systemctl talks to the real
+# host systemd daemon. No sudoers entry is required — the Docker
+# socket already grants root-equivalent host access.
 #
-# `systemctl reload caddy` sends SIGHUP; Caddy re-reads
-# /etc/caddy/Caddyfile (symlinked to infra/caddy/Caddyfile) without
-# dropping connections. If Caddy is not running this step fails fast
-# before the smoke test issues a misleading "port 443 refused" error.
+# Alpine is used: ~5 MB vs ~70 MB for ubuntu, no unnecessary
+# tooling, and the digest is pinned so any upstream change requires
+# an explicit bump PR. util-linux (which ships nsenter) is installed
+# at run time; apk add takes ~1 s on the warm VPS cache.
+#
+# `reload` not `restart`: reload sends SIGHUP so Caddy re-reads its
+# config in-process without dropping TLS connections. `restart`
+# would briefly stop the service, losing in-flight requests.
+#
+# If Caddy is not running this step fails fast before the smoke test
+# issues a misleading "port 443 refused" error.
 run: |
 docker run --rm --privileged --pid=host \
-ubuntu:22.04 \
-nsenter -t 1 -m -u -n -p -i -- /bin/systemctl reload caddy
+alpine:3.21@sha256:48b0309ca019d89d40f670aa1bc06e426dc0931948452e8491e3d65087abc07d \
+sh -c 'apk add --no-cache util-linux -q && nsenter -t 1 -m -u -n -p -i -- /bin/systemctl reload caddy'
 - name: Smoke test deployed environment
 # Healthchecks confirm containers are healthy; they do NOT confirm the


@@ -97,11 +97,12 @@ jobs:
 # cannot call systemctl directly; nsenter via a privileged sibling
 # container reaches the host systemd. Must run after deploy (so the
 # latest Caddyfile is on disk) and before the smoke test (so the
-# public surface reflects the current config).
+# public surface reflects the current config). Alpine with pinned
+# digest; reload not restart — see nightly.yml for full rationale.
 run: |
 docker run --rm --privileged --pid=host \
-ubuntu:22.04 \
-nsenter -t 1 -m -u -n -p -i -- /bin/systemctl reload caddy
+alpine:3.21@sha256:48b0309ca019d89d40f670aa1bc06e426dc0931948452e8491e3d65087abc07d \
+sh -c 'apk add --no-cache util-linux -q && nsenter -t 1 -m -u -n -p -i -- /bin/systemctl reload caddy'
 - name: Smoke test deployed environment
 # See nightly.yml — same three checks, against the prod vhost.
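
The digest pin in both workflows only helps while every image reference keeps it. A minimal sketch of a guard, assuming a hypothetical helper named `is_digest_pinned` (not part of this commit) that treats a reference as pinned only when it ends in `@sha256:` followed by 64 lowercase hex digits:

```sh
#!/bin/sh
# Hypothetical helper, not part of this commit: succeeds only when the
# image reference ends in "@sha256:" plus exactly 64 lowercase hex digits.
is_digest_pinned() {
  printf '%s\n' "$1" | grep -Eq '@sha256:[0-9a-f]{64}$'
}

# Classify the two image references that appear in this diff.
for image in \
  'alpine:3.21@sha256:48b0309ca019d89d40f670aa1bc06e426dc0931948452e8491e3d65087abc07d' \
  'ubuntu:22.04'
do
  if is_digest_pinned "$image"; then
    echo "pinned:   $image"
  else
    echo "floating: $image"
  fi
done
```

The check is purely syntactic: it does not confirm that the digest exists in a registry or that it still corresponds to the `3.21` tag, so it complements the explicit bump PR rather than replacing it.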