fix(ci): replace iproute2 ip with /proc/net/route for gateway detection
#544
Reference in New Issue
Block a user
Delete Branch "fix/nightly-caddy-reload"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
ip: command not found(exit 127) becauseiproute2is not installed in the Gitea runner containerip route show default | awk '/default/ {print $3}'with a direct read of/proc/net/route— a kernel virtual file always present on Linux, requiring no installed packageawkdecodes the little-endian 32-bit hex gateway field to dotted-decimal; behaviour is identical to the previousip routeapproachTest Plan
workflow_dispatchand verify the smoke test step reaches the curl checks without exit 127HOST_IPis logged as the expected Docker bridge gateway (e.g.172.17.0.1)ipwith /proc/net/route for gateway detection🔧 Tobias Wendt (@tobiwendt) — DevOps & Platform Engineer
Verdict: ✅ Approved
This is my turf, so I looked at it carefully.
What's correct
iproute2is not installed in the Gitea runner container image (the job steps run in a DooD container, not directly on the Ubuntu host).ipis a userspace binary from that package — it was never guaranteed to be present./proc/net/routeis the right replacement. It's a kernel virtual file, always present on Linux, no package required. This is whatip routereads under the hood anyway./proc/net/routeis a little-endian 32-bit hex value (e.g.0101A8C0= 192.168.1.1). Reading bytes at positions 7–8, 5–6, 3–4, 1–2 and converting each pair withstrtonum("0x...")produces the right dotted-decimal result.[ -n "$HOST_IP" ] || { echo ... ; exit 1; }still catches the no-default-route case cleanly.exiton first match handles the (unlikely but possible) case of multiple default routes.One observation — not a blocker
strtonumis a gawk extension. On systems withmawk(Ubuntu's lightweight default awk in some images) it would fail silently or produce zero. Ubuntu'sawksymlink points togawkin the standard ubuntu images, and the Gitea runner containers are built on that base, so this works in practice. Worth knowing if the runner image ever changes.A fully portable fallback if needed:
But this is overkill for a gawk-equipped ubuntu runner. The awk form is fine.
What's good
The comment block on lines 169–172 is exactly the right level of detail — it explains the data format (
little-endian 32-bit hex) and the field selection ($2=="00000000"= default route), so no future engineer needs to reverse-engineer this.LGTM. Ship it.
👨💻 Felix Brandt (@felixbrandt) — Senior Fullstack Developer
Verdict: ✅ Approved
Pure CI YAML — no backend, no frontend, no test pyramid concerns. I'll focus on code quality of the shell snippet.
What's good
little-endian 32-bit hex value which awk decodes to dotted-decimalis the kind of remark that saves someone 20 minutes with a hex dump. That's the correct comment-writing discipline.[ -n "$HOST_IP" ] || { ... exit 1 }is a proper early-exit guard clause. Consistent with how we handle missing values elsewhere./proc/net/routeinstead of'ip route'). Small thing, gets it wrong surprisingly often in patches.One observation — not a blocker
The one-liner on line 177 is dense:
This is technically correct and the comment above explains what it does, so a reader has the context they need. The alternative would be a multi-line awk block, which would be easier to scan at a glance but adds visual noise in YAML. Given the comment coverage, the one-liner is defensible.
No changes required from me.
🏛️ Markus Keller (@mkeller) — Application Architect
Verdict: ✅ Approved
This is a tactical CI fix — no architectural implications. My standard documentation checklist:
docs/architecture/c4/l2-containers.puml+docs/DEPLOYMENT.mddocs/architecture/c4/l1-context.pumldocs/adr/ADR consideration
The existing ADR-012 documents the nsenter pattern for host service management in CI. The change from
ip routeto/proc/net/routeis an implementation detail within that pattern, not a new architectural decision. No new ADR is needed. If ADR-012 contains a code snippet referencingip route, updating it would be appropriate — but it's not a blocker.What's right
The comment block correctly captures why this approach was chosen (no iproute2 package dependency) and how the data is structured (little-endian hex). This is the information that belongs in an ADR, and it's in the right place (the code that uses it). The principle of "boring technology" applies here —
/proc/net/routeis as boring and stable as Linux gets.No blockers.
🔒 Nora "NullX" Steiner — Application Security Engineer
Verdict: ✅ Approved
CI workflow only. I reviewed through an adversarial lens anyway.
Threat surface analysis
/proc/net/routeas a data source:This is a Linux kernel virtual file, read-only from userspace, populated by the kernel's routing table. It cannot be written to or manipulated by any userspace process. No injection risk from the data source itself.
HOST_IPused in--resolve $HOST:443:$HOST_IP:The value is passed as a curl
--resolveflag argument. curl parses this ashost:port:ip— it does not invoke a shell. Even if the awk expression produced unexpected output (e.g. due to a corrupt/proc/net/route, which is structurally impossible), the worst outcome is a curl failure, caught byset -e. There is no shell injection path here.Context:
This step runs on a self-hosted runner. There is no external input to
/proc/net/route. The only "attacker" who could influence this would already have host-level access, at which point the CI pipeline is the least of your problems.No concerns
The guard
[ -n "$HOST_IP" ]provides a fail-closed check. The awk command has no user-controlled input. The resulting value is consumed by curl in a non-shell-expanded context.Clean from a security standpoint.
🧪 Sara Holt (@saraholt) — QA Engineer & Test Strategist
Verdict: ✅ Approved
CI infrastructure change — no unit test pyramid involvement. I'm reviewing from a verification and reliability perspective.
Verification coverage
The test plan lists:
workflow_dispatchand confirm the smoke test step passes exit 127HOST_IPis logged as the expected bridge gatewayThis is the correct verification strategy for a CI step — the only meaningful test is running it. The step itself echoes
Smoke test: $URL (pinned to $HOST_IP via bridge gateway)which gives a direct observable signal in the logs.Edge cases I checked
/proc/net/routemissing (impossible on Linux, but)awkwould fail,HOST_IPwould be empty, guard firesawkproduces no output, guard fires with clear error messageexiton first match — deterministicstrtonumproduces 0 for invalid hex0.0.0.0, non-empty, guard passes but curl fails — the smoke test itself is the backstopThe last scenario is theoretical only (the kernel writes valid hex to
/proc/net/route), but if extra paranoia is wanted, a regex guard on the IP format could be added:Not a blocker — I mention it as a potential improvement for a follow-up if this step continues to be fragile.
LGTM.
🎨 Leonie Voss (@leonievoss) — UI/UX Design Lead
Verdict: ✅ Approved
This PR modifies a CI workflow file only — no frontend components, no Svelte templates, no Tailwind classes, no accessibility surface, no brand tokens. Completely outside my review scope.
LGTM from a UI/UX perspective. Nothing to check here.
📋 Elicit — Requirements Engineer
Verdict: ✅ Approved
This is a corrective infrastructure fix, not a feature. No requirements implications.
The PR description maps cleanly to the failure:
ip: command not found→ iproute2 not installed → replace with/proc/net/route. The "why" is stated, the fix is described, and the test plan is actionable and verifiable (workflow_dispatch trigger with specific log signals to check).From a requirements standpoint, the nightly pipeline's stated responsibility — "deploy staging and smoke-test the public surface" — remains unchanged. The gateway detection is an implementation detail, not a functional requirement. This fix restores a broken prerequisite without changing scope.
No open questions, no scope concerns.