DevOps: set up Renovate vulnerability surfacing + nightly npm audit early-warning #818

Open
opened 2026-06-12 23:59:34 +02:00 by marcel · 0 comments

Context

Issue #817 (esbuild/cookie audit-gate failure) exposed a process gap, not just a dependency problem: main has no early-warning mechanism for newly-published advisories. An advisory landed against already-pinned versions, turned the npm audit --audit-level=high --omit=dev gate (.gitea/workflows/ci.yml:32-33) red on main, and then ambushed the next unrelated PR (#774, which touched zero frontend files). The author who hit it didn't cause it and had no warning.

This issue sets up the prevention layer so a freshly-published advisory is caught on a schedule, on a branch we own, instead of on a contributor's PR.

⚠️ Correction after review (2026-06-13). The original framing said "we already run Renovate." That is false: renovate.json exists, but there is no .gitea/workflows/renovate.yml and zero Renovate-authored PRs in the entire history — Renovate has never executed, and its three existing packageRules (bucket4j / tiptap / privileged-digest) have been silently inert. The vuln-surfacing config below therefore depends on first standing up a Renovate runner (Phase 0).

Decisions resolved: (1) Stand up Renovate (don't drop it). (2) The nightly audit scans dev deps too (no --omit=dev) — deliberately broader than the PR gate.

Goal

Newly-published advisories against our dependencies are surfaced proactively (scheduled Renovate job + dependency dashboard + nightly audit), never first discovered as a red gate on an unrelated PR.

Scope

Phase 0 — Stand up the Renovate runner (prerequisite — the actual load-bearing work)

  • Add .gitea/workflows/renovate.yml: weekly cron + workflow_dispatch, running renovatebot/github-action pinned by digest with renovate-version pinned to a fixed version (not latest — matches this repo's pin-everything posture: @v3 artifacts, semgrep==1.163.0, etc.). A starter snippet lives in docs/infrastructure/self-hosted-catalogue.md:150 — pin it before use.
  • Create a dedicated bot account + PAT, stored as a Gitea secret, scoped to contents + pull_request + issues. (Mend's hosted app does not support self-hosted Gitea — Renovate must self-host.)
  • Configure platform/endpoint/repositories (or autodiscover) so Renovate targets this repo on the self-hosted Gitea.
  • Verify Renovate opens at least one onboarding/update PR before relying on the vuln flags below.
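
The pinning requirement above can be checked mechanically, in the spirit of the grep self-tests ci.yml already has. This is a minimal sketch, not an existing repo script: check_renovate_pins is a hypothetical helper name, and it assumes the workflow keeps the `uses:` reference and the `renovate-version:` input each on a single line.

```shell
#!/usr/bin/env bash
# Sketch: check that a Renovate workflow follows the pin-everything posture.
# check_renovate_pins is a hypothetical helper, not an existing repo script.

check_renovate_pins() {
  local file="$1" version

  # The action reference must end in a full 40-hex commit digest, not a tag.
  if ! grep -Eq 'uses:[[:space:]]*renovatebot/github-action@[0-9a-f]{40}([[:space:]]|$)' "$file"; then
    echo "FAIL: renovatebot/github-action is not pinned by digest" >&2
    return 1
  fi

  # renovate-version must be present and must not float on "latest".
  version=$(grep -E '^[[:space:]]*renovate-version:' "$file" \
    | head -n 1 | cut -d: -f2- | tr -d " \"'")
  if [ -z "$version" ] || [ "$version" = "latest" ]; then
    echo "FAIL: renovate-version is missing or set to latest" >&2
    return 1
  fi

  echo "OK: action pinned by digest, renovate-version=$version"
}
```

A CI step could call `check_renovate_pins .gitea/workflows/renovate.yml` and fail the job on a non-zero return. The check only covers the Renovate action itself; other actions in the file keep whatever pin style the repo already uses.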

1. Turn on Renovate vulnerability surfacing

Extend renovate.json (additive — keep the existing packageRules):

"osvVulnerabilityAlerts": true,
"dependencyDashboard": true,
"vulnerabilityAlerts": {
  "labels": ["security", "P1-high"]
},
"lockFileMaintenance": {
  "enabled": true,
  "schedule": ["before 6am on monday"]
}
  • osvVulnerabilityAlerts is the load-bearing flag on self-hosted Gitea — Renovate queries OSV.dev directly (platform-agnostic). vulnerabilityAlerts keys off a platform vulnerability graph that Gitea does not expose, so treat that block as a label carrier for the OSV alerts, not an independent detector.
  • dependencyDashboard: true is set explicitly, because the dashboard is not guaranteed to be enabled without onboarding.
  • lockFileMaintenance: refreshes transitive pins weekly so we drift into fewer advisories. Do not add automerge — a weekly transitive bump can break the build silently; these PRs get reviewed.
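
Since Renovate's existing rules sat inert without anyone noticing, a small guard that the two boolean flags are really switched on may be worth having. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and the grep is line-based, so it assumes each key and its value share one line in renovate.json. It does not inspect the vulnerabilityAlerts or lockFileMaintenance blocks, and it does not check for automerge.

```shell
#!/usr/bin/env bash
# Sketch: assert the surfacing flags are switched on in a Renovate config.
# Line-based grep, so it assumes each key and its value share one line.

# Succeeds if "<key>": true appears somewhere in the file.
require_true_flag() {
  local file="$1" key="$2"
  grep -Eq "\"$key\"[[:space:]]*:[[:space:]]*true" "$file"
}

# Reports every missing flag, then returns non-zero if any was missing.
check_surfacing_flags() {
  local file="$1" key missing=0
  for key in osvVulnerabilityAlerts dependencyDashboard; do
    if ! require_true_flag "$file" "$key"; then
      echo "FAIL: $key is not set to true in $file" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Usage would be `check_surfacing_flags renovate.json` from the repo root. A JSON-aware tool would be more robust than grep; the grep form is used here only to match the style of the existing CI self-tests.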

2. Nightly audit early-warning job

Add a separate job to .gitea/workflows/nightly.yml (parallel to deploy-staging, not a step inside it — it must produce an independent red/green signal a deploy failure can't mask):

cd frontend && npm audit --audit-level=high
  • Scans dev deps too (no --omit=dev) — deliberately broader than the PR gate, to catch dev-tooling advisories (esbuild, Vite, etc.) early. The PR gate stays --omit=dev (unchanged — out of scope). Document this divergence so the broader nightly result isn't mistaken for a gate failure.
  • npm audit needs only frontend/package-lock.json, so the job is just checkout + setup-node: no npm ci and no node_modules cache (nightly.yml has no node setup to reuse anyway; this is a fresh job).
  • On non-zero exit: open OR update a single tracking issue, deduped by a fixed title marker (e.g. Nightly npm audit: high-severity advisory) — GET open issues labelled security, match the marker, update if present, else create. Labels: security, devops, + P1-high (severity parity with the Renovate path). Use the auto-provided ${{ secrets.GITEA_TOKEN }} (confirm it has issues:write).
  • Heartbeat: on the clean path, emit a job-summary line so "no issue" provably means "ran and clean," not "never ran."
  • Add a shell self-test (mirroring ci.yml's existing grep self-tests) asserting the dedupe title-match catches an existing-issue sample and ignores an unrelated one.
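
The dedupe decision, the heartbeat, and the self-test described above could be sketched as follows. Everything here is illustrative: the function names are hypothetical, the input format ("number<TAB>title", one open security issue per line) is an assumed intermediate that the job would build from the Gitea API response, and the API calls themselves are left out. Whether the Gitea runner populates GITHUB_STEP_SUMMARY is also an assumption, so the heartbeat falls back to stdout.

```shell
#!/usr/bin/env bash
# Sketch of the nightly job's dedupe decision and clean-path heartbeat.
# MARKER is the example title from this issue; function names are hypothetical.

MARKER='Nightly npm audit: high-severity advisory'

# Reads "number<TAB>title" lines on stdin and prints the number of the first
# issue whose title contains MARKER. No output means "create a new issue".
find_tracking_issue() {
  awk -F'\t' -v marker="$MARKER" 'index($2, marker) > 0 { print $1; exit }'
}

# Maps the audit exit code and the dedupe lookup result to one action word.
decide_action() {
  local audit_exit="$1" existing_issue="$2"
  if [ "$audit_exit" -eq 0 ]; then
    echo "heartbeat"
  elif [ -n "$existing_issue" ]; then
    echo "update:$existing_issue"
  else
    echo "create"
  fi
}

# Clean-path heartbeat: append to the job summary if the runner provides one
# (an assumption on Gitea), otherwise print to stdout.
emit_heartbeat() {
  echo "Nightly npm audit: ran and clean at $(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    >> "${GITHUB_STEP_SUMMARY:-/dev/stdout}"
}

# Self-test: the matcher must catch an existing tracking issue and
# ignore unrelated issues.
self_test() {
  local hit miss
  hit=$(printf '12\tUpgrade tiptap\n34\t%s (2 advisories)\n' "$MARKER" | find_tracking_issue)
  miss=$(printf '12\tUpgrade tiptap\n56\tNightly deploy failed\n' | find_tracking_issue)
  [ "$hit" = "34" ] && [ -z "$miss" ]
}

self_test || { echo "dedupe self-test failed" >&2; exit 1; }
```

In the job, the step would capture the exit code of `npm audit --audit-level=high`, pipe the open security issues through find_tracking_issue, and branch on the word printed by decide_action. Keeping the decision separate from the API calls is what makes it testable without network access.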

Acceptance criteria

  • Phase 0: .gitea/workflows/renovate.yml exists (pinned action + version), bot token wired, and a real Renovate PR has appeared at least once.
  • renovate.json enables osvVulnerabilityAlerts, dependencyDashboard, vulnerabilityAlerts (security + P1-high), and lockFileMaintenance (no automerge) — existing packageRules preserved.
  • A nightly job runs npm audit --audit-level=high (including dev deps) against frontend/ as its own job; on failure it opens/updates one deduped tracking issue.
  • Dedupe verified: two consecutive failing runs → exactly one issue, updated not duplicated (both run URLs captured in this issue as evidence; manual workflow_dispatch against a deliberately vulnerable lockfile or a lowered --audit-level is acceptable proof).
  • The Renovate Dependency Dashboard is visible.
  • docs/infrastructure/ci-gitea.md documents: the runner setup, what OSV vs platform alerts mean on Gitea, the nightly job's dev-dep divergence from the PR gate, and the runbook for a nightly-opened issue (triage severity → override/pin or escalate, mirroring the #817 decision tree).

Out of scope

  • Fixing the current esbuild/cookie advisory — that's #817.
  • Changing the existing PR audit gate (it stays --omit=dev; this adds a scheduled, broader sibling).
  • Backend (Maven) SCA — tracked as #820. The Spring Boot 4 / Hibernate / Jetty tree currently has no advisory early-warning at all, and it's the larger attack surface for a private-document archive.

Notes

  • DevOps philosophy check (Tobias): every added automation is a new failure mode — hence the heartbeat on the nightly job and the manual-verify gate on the Renovate runner. This is justified because the alternative — discovering advisories via random red PRs — has a concrete, recurring cost we just paid on #774.
  • Relates to #817. Backend SCA follow-up: #820.

Review summary — seven-persona pre-implementation review (2026-06-13)

Findings folded in from the review; the original per-persona comments were removed so this issue stays the single source of truth.

Resolved decisions

  • Stand up Renovate (not drop it) → Phase 0 above. (Tobias; echoed by Markus, Elicit)
  • Nightly audit scans dev deps (drop --omit=dev) → §2 above; deliberately broader than the PR gate, divergence documented. (Nora)

🏛️ Architecture (Markus) — "We already run Renovate" was false; the missing runner is the load-bearing work, not the config flags. Sequence: runner → prove one PR → then vuln flags. Nightly audit must be its own job, never nested inside deploy-staging. Capture the runner decision ADR-style in the doc.

👨‍💻 Developer (Felix) — npm audit needs no node_modules / npm ci (resolves from the lockfile); nightly.yml has no node setup to reuse. Dedupe via a fixed title marker: GET open security issues → update-or-create. Pin the Renovate action by digest + fixed renovate-version (the catalogue's @v40 / latest violates the repo's pin posture). Add a shell self-test for the dedupe match.

🛠️ DevOps (Tobias) — Renovate has never run; existing packageRules are inert. Mend's hosted app doesn't support Gitea → must self-host. osvVulnerabilityAlerts (OSV.dev) is load-bearing; vulnerabilityAlerts is only a label carrier on Gitea. lockFileMaintenance must not automerge. Add a clean-path heartbeat so "no issue" ≠ "never ran."

🛡️ Security (Nora) — --omit=dev divergence resolved (scan dev deps); document exactly what's covered. Severity-label the nightly tracking issue (P1-high) for parity with the Renovate path. Backend Maven SCA blind spot → tracked as #820.

🧪 QA (Sara) — The "Renovate PR appears OR nightly fires" AC collapses to "nightly works" until the runner exists — made explicit. The update path needs an explicit test: two consecutive manual runs → one issue, updated not duplicated (capture both run URLs).

📋 Requirements (Elicit) — Recommended splitting into two issues; not taken — kept as one issue with Phase 0 as an explicit gating prerequisite. Added the measurable dedupe AC. Milestone still unassigned (orphan).

🎨 UX (Leonie) — No concerns; pure CI / dependency tooling, no user-facing surface.

marcel added the P2-medium, devops, and security labels 2026-06-12 23:59:38 +02:00
Reference: marcel/familienarchiv#818