fix(docker): skip postinstall in production image #583
Reference in New Issue
Block a user
Delete Branch "fix/production-npm-ci-ignore-scripts"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Problem
The nightly deploy pipeline fails at
[frontend production 6/6] RUN npm ci --omit=devwith exit code 127:postinstallunconditionally runspatch-package, which is adevDependency.--omit=devskips devDependencies, so the binary is absent.Root Cause
patch-packagepatches@vitest/browser-playwright— a pure test tool. It is never needed at runtime in the production image, which only runs the pre-built SvelteKit app vianode build.Fix
Add
--ignore-scriptsto the productionnpm cicall. This is the npm-native way to suppress lifecycle scripts during a dep-only install. No changes topackage.jsonneeded —postinstallcontinues to work normally in dev and test environments.Test Plan
productiontarget:docker build --target production frontend/node buildstarts and serves the app correctlynpm ciwithout flags) still applies patches viapostinstall🛠️ Tobias Wendt (@tobiwendt) — DevOps & Platform Engineer
Verdict: ✅ Approved
This is exactly right. Let me walk through it.
What's correct
--ignore-scriptsis the npm-native answer here. The production stage's only job is to install runtime dependencies sonode buildcan execute — no lifecycle scripts belong there. The alternative (a shell guard inpackage.json) would have been a smell: it patches around the problem in the wrong place and is not idiomatic.The multi-stage build structure is also sound:
buildstage: fullnpm ci→ applies patches viapostinstall→ compiles the appproductionstage:npm ci --omit=dev --ignore-scripts→ installs only whatnode buildneeds at runtimePatches are applied exactly once, at the right stage.
One thing worth adding (suggestion)
Consider adding a brief comment in the Dockerfile explaining why
--ignore-scriptsis here. Future maintainers will wonder if it's safe to remove:This is a suggestion, not a blocker.
Missing CI guard (suggestion)
The production Docker target is not explicitly tested in CI — the pipeline currently only catches failures during nightly deploy runs. Consider adding a
docker build --target production frontend/step to the CI workflow so regressions like this fail at PR time, not post-merge. Again, not a blocker for this PR, but worth tracking.What's done well
node:20.19.0-alpine3.21) — reproducible builds ✓package.jsonstays clean — no workaround code committed ✓🔒 Nora "NullX" Steiner — Application Security Engineer
Verdict: ✅ Approved
This is a net security improvement, not just a bug fix.
Why
--ignore-scriptsis the more secure approachnpm lifecycle scripts (
postinstall,preinstall,install) are a well-known supply-chain attack vector. A compromised transitive dependency can ship a maliciouspostinstallthat exfiltrates secrets, creates backdoors, or fingerprints the host at install time. This is not hypothetical — it's how packages likeevent-stream(2018) andua-parser-js(2021) were weaponized.By adding
--ignore-scriptsto the production install, this PR ensures that no package in the dependency tree can execute arbitrary code during the production container build. The attack surface shrinks meaningfully.The development install (
npm ciwithout flags) still runs scripts — and that's correct. Patches need to be applied in dev/build. But in theproductionstage, the app is already compiled and patched; there is no legitimate reason for any lifecycle script to run.CWE reference
This aligns with CWE-829 (Inclusion of Functionality from Untrusted Control Sphere). Lifecycle scripts are untrusted control sphere code executing in a privileged build context.
Nothing to flag
node:20.19.0-alpine3.21) ✓Minor observation (not a blocker)
If you want to harden further in the future:
npm ci --omit=dev --ignore-scriptscombined with--no-auditremoves the network call to the npm audit registry during CI, which is a small additional hardening step (reduces outbound egress and eliminates a potential audit-endpoint compromise vector). Not required here.👨💻 Felix Brandt (@felixbrandt) — Senior Fullstack Developer
Verdict: ✅ Approved
This is outside my primary domain (no application code changed), but I have one observation from a TDD perspective.
What's good
The fix is in the right place. Adding a shell guard to
package.jsonwould have been the wrong layer — it would have mixed infrastructure concerns into the application's package manifest. Fixing it in the Dockerfile keeps the separation clean.The one-line diff is exactly what a minimal fix should look like. No "while I'm here" cleanup, no unrelated changes bundled in.
TDD lens
The PR description includes a test plan with three manual steps. That's better than nothing, but this bug existed because there was no automated regression for it. The fix resolves the immediate failure, but without an automated test, the same class of problem (lifecycle script relying on a devDep, breaking
--omit=devinstalls) can silently reappear.A CI step like:
…would have caught this before it ever reached the nightly deploy. Worth tracking, though I'd open a separate issue rather than block this fix.
🏛️ Markus Keller (@mkeller) — Application Architect
Verdict: ✅ Approved
Docs check
This PR modifies
frontend/Dockerfile. Per the documentation update table:docs/architecture/c4/l2-containers.puml+docs/DEPLOYMENT.mdThis is not a new service or component — it's a flag change to an existing build step. No diagram or doc update is required. ✓
Architecture assessment
The fix is at the correct layer. The multi-stage build design is sound:
buildstage installs everything, compiles the appproductionstage installs only runtime deps, serves the pre-compiled outputThe
postinstallscript exists to apply patches to test tooling (@vitest/browser-playwright). Test tooling patches have no place in the production image.--ignore-scriptsenforces that boundary without requiring any change to the application's dependency manifest — which is the right call.One structural note (not a blocker)
The Dockerfile currently has no comment explaining why
--ignore-scriptsis used. Given that this is a deliberate constraint, future maintainers (or Renovate bumping the npm version) might remove it thinking it's unnecessary. A one-line comment would document the intent without ceremony. Tobias flagged the same thing — aligning on that suggestion.🧪 Sara Holt (@saraholt) — QA Engineer & Test Strategist
Verdict: ⚠️ Approved with concerns
The fix is correct. My concern is about what's missing after it.
Blocker: no regression guard
This bug lived undetected until the nightly deploy pipeline failed. That means the production Docker build is not verified anywhere in the PR-level CI pipeline. The same failure mode — a lifecycle script depending on a devDep — can silently reappear in any future
package.jsonchange, and it will again only be discovered after deploy.The test plan in this PR is three manual steps. Manual steps do not prevent regressions.
Suggested follow-up (I'd open a separate issue for this):
Adding this to CI would make the production build a first-class quality gate. It's fast (minutes), it runs on every PR that touches
frontend/, and it would have caught this before merge.I won't block this PR for that — the fix is urgent and correct. But please track the CI coverage gap.
What's covered
npm ciwithout flags still applies patches) ✓📋 Elicit — Requirements Engineer
Verdict: ✅ Approved
Reviewing from a requirements and specification perspective.
Problem statement quality
The PR description is clear and precise:
postinstallcallspatch-package, which is a devDependency absent under--omit=dev✓--ignore-scriptssuppresses lifecycle scripts in the production stage ✓Acceptance criteria check
The test plan covers three scenarios:
docker build --target productionsucceeds ✓ (directly verifies the fix)node buildstarts and serves the app ✓ (verifies no runtime regression)npm cistill applies patches ✓ (verifies no dev regression)This is complete for the stated scope.
One gap worth noting
The requirements context (the nightly deploy pipeline) is referenced in the problem description but not linked to a Gitea issue. If there's a tracking issue for deploy reliability, linking this PR to it would close the traceability loop. Not a blocker — this is a hotfix and speed is appropriate.
🎨 Leonie Voss (@leonievoss) — UI/UX Design Lead
Verdict: ✅ Approved
No UI or UX impact from this change. The modification is purely to the Docker build pipeline — no Svelte components, no routes, no accessibility surface, no brand tokens, no responsive layouts affected.
Checked: no frontend source files changed, no user-facing behavior modified, no design system elements touched.
LGTM from a UI/UX perspective. Ship it.