DevOps: Renovate runner + nightly npm audit early-warning (#818) (#821)

## Summary

Closes #818. Sets up the prevention layer so newly-published advisories are caught on a branch we own, not on a contributor's PR.

**What changed:**
- `renovate.json` — migrated 2 deprecated keys (`matchPackagePatterns` → `matchPackageNames`, `matchPaths` → `matchFileNames`); added `osvVulnerabilityAlerts`, `dependencyDashboard`, `vulnerabilityAlerts` (labels: security + P1-high), weekly routine `schedule`, and `lockFileMaintenance` (no automerge)
- `.gitea/workflows/renovate.yml` — **new** daily cron runner (`0 3 * * *`), pinned to `renovatebot/github-action@8217b3fc` (v46.1.15) with `renovate-version: "46.1.15"`, `RENOVATE_TOKEN` secret, Gitea platform/endpoint env vars
- `.gitea/workflows/nightly.yml` — added `npm-audit` job (parallel to `deploy-staging`, independent signal): shell self-test, `set +e` audit capture, jq-built deduped issue open/update, `NIGHTLY_AUDIT_TOKEN` via step env only, heartbeat on clean path
- `docs/adr/041-renovate-runner-setup.md` — **new** negative-space ADR (no auto GITEA_TOKEN, two-token rationale, OSV-vs-platform, digest-pin threat model, schedule-batches-routine-only, l2-containers omission)
- `docs/infrastructure/ci-gitea.md` — two-token model, PAT rotation cadence, OSV-vs-platform, nightly/PR-gate divergence table, runbook for nightly-opened issues
- `docs/infrastructure/self-hosted-catalogue.md` — fixed Renovate snippet (daily cron, digest pin, `RENOVATE_TOKEN`, fixed version, no root `automerge: true`)

**No `l2-containers.puml` entry** — Renovate is a scheduled CI job, not a long-lived container. Stated here as a decision, not an oversight (ADR-041).

## Manual steps required before the runner is live (not automated)

1. Create a dedicated bot account (e.g. `renovate-bot`) on the Gitea instance
2. Mint `RENOVATE_TOKEN` PAT (scopes: `contents` + `pull_request` + `issues`) → add as Gitea secret
3. Mint `NIGHTLY_AUDIT_TOKEN` PAT (scope: `issues` only) → add as Gitea secret
4. Configure `main` branch protection to forbid the bot pushing directly

## Acceptance criteria status

- [x] `renovate.json` deprecated keys migrated; vuln surfacing config enabled
- [x] `.gitea/workflows/renovate.yml` exists (digest-pinned, daily cron, fixed version)
- [x] `self-hosted-catalogue.md` snippet corrected (4 items)
- [x] `nightly.yml` npm-audit job: survives non-zero exit, deduped tracking issue, jq payload, NIGHTLY_AUDIT_TOKEN via env only, heartbeat on clean
- [x] ADR-041 records all negative-space decisions
- [x] `ci-gitea.md` documents two-token model + runbook
- [ ] Phase 0 manual gates: bot account creation, Renovate onboarding PR evidence, Dependency Dashboard screenshot — **requires manual provisioning**
- [ ] Dedupe AC verified via `workflow_dispatch` — **requires NIGHTLY_AUDIT_TOKEN secret to be provisioned first**
- [ ] `$GITHUB_STEP_SUMMARY` availability on this runner — **verify in first live run**

Co-authored-by: Marcel <marcel@familienarchiv>
Reviewed-on: #821
This commit was merged in pull request #821.
2026-06-13 12:13:35 +02:00
parent bde1237358
commit 83ca2eb34d
6 changed files with 440 additions and 14 deletions


@@ -0,0 +1,123 @@
# ADR-041 — Renovate runner stand-up: two-token model, OSV surfacing, digest pinning
**Date:** 2026-06-13
**Status:** Accepted
**Issue:** [#818](https://git.raddatz.cloud/marcel/familienarchiv/issues/818)

---

## Context
Issue #817 (esbuild/cookie advisory) revealed that `main` had no early-warning
mechanism for newly-published advisories. An advisory landed against already-pinned
versions, turned the `npm audit --audit-level=high --omit=dev` gate red on `main`,
and then ambushed the next unrelated PR (#774). The author who hit it did not cause
it and had no warning.
`renovate.json` existed but `renovatebot` had never actually run: there was no
`.gitea/workflows/renovate.yml` and zero Renovate-authored PRs in the repo's entire
history. The three `packageRules` (bucket4j / tiptap / privileged-digest) were
silently inert.
This ADR records the **negative space** — why specific design choices were made,
so future maintainers do not "tidy up" toward a worse outcome.

---

## Decision

### Why there is no auto-provided `GITEA_TOKEN`
Self-hosted Gitea runners do not auto-inject a `GITEA_TOKEN` equivalent.
`docs/infrastructure/ci-gitea.md` (around line 251 at the time of writing) explicitly
states the token "must be created manually." No existing workflow in this repo references
`GITEA_TOKEN` for API calls; the only references are for container registry auth (`docker login`).
Both `RENOVATE_TOKEN` and `NIGHTLY_AUDIT_TOKEN` must be manually provisioned as
Gitea secrets by a repository admin.
### Why two tokens, not one
The two jobs have different blast radii on token compromise:

| Token | Scopes | Used by |
|-------|--------|---------|
| `RENOVATE_TOKEN` | `contents` + `pull_request` + `issues` | Renovate — must read/write files and open PRs |
| `NIGHTLY_AUDIT_TOKEN` | `issues` only | Nightly audit — only needs to file a tracking issue |

The nightly job's token appears in step `env:` and is passed to `curl -H`. A leak via
runner logs, process arguments, or a misconfigured step would expose the token.
An `issues`-only token cannot push branches, open PRs, or read repository contents —
the leaked token's blast radius is limited to creating/editing issues.
A single broad token would give any leak path full `contents` + `pull_request` write
access to the repository. That risk is out of proportion to the upside (one fewer secret).
Both tokens belong to one dedicated bot account (consistent authorship; one identity
to audit and rotate). **Branch protection on `main` must forbid the bot pushing
directly**, because a `contents`-scoped token can push to any unprotected branch.
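
The open-or-update behaviour that the `issues`-only token supports can be sketched roughly as follows. This is a minimal Python illustration, not the workflow's actual shell/jq implementation: `API_BASE`, `TRACKING_TITLE`, and both function names are hypothetical, and the endpoints assumed are Gitea's standard `POST /repos/{owner}/{repo}/issues` and `PATCH /repos/{owner}/{repo}/issues/{index}`.

```python
import json
import urllib.request

# Hypothetical values for illustration; the real workflow's names may differ.
API_BASE = "https://gitea.example.invalid/api/v1/repos/owner/repo"
TRACKING_TITLE = "Nightly npm audit: high-severity advisories"


def plan_issue_action(open_issues, title=TRACKING_TITLE):
    """Decide whether to create a tracking issue or update an existing one.

    `open_issues` is the decoded JSON list returned by Gitea's list-issues
    endpoint. Returns ("update", number) when an open issue already carries
    the tracking title, otherwise ("create", None).
    """
    for issue in open_issues:
        if issue.get("title") == title:
            return ("update", issue["number"])
    return ("create", None)


def build_request(action, number, body, token):
    """Build (but do not send) the HTTP request for the planned action."""
    if action == "update":
        url, method = f"{API_BASE}/issues/{number}", "PATCH"
        payload = {"body": body}
    else:
        url, method = f"{API_BASE}/issues", "POST"
        payload = {"title": TRACKING_TITLE, "body": body}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method=method,
        headers={
            # The caller reads the token from the step environment; it is
            # never printed or placed on a command line here.
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )
```

Both requests touch only the issues API, which is why an `issues`-only scope is sufficient for this job.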
### Why the Renovate action is digest-pinned
`renovatebot/github-action` executes with the `RENOVATE_TOKEN` in scope. That token
carries `contents` + `pull_request` + `issues`, enough to push commits, open PRs, and
edit issues. A mutable tag such as `@v40` can be re-pointed by the upstream maintainer
(or a compromised maintainer account) at any time. A full commit SHA (`@<sha>`) cannot
be silently re-pointed; it identifies one immutable commit. This is the same threat model applied to
all privileged CI steps in this repo (see the `matchFileNames` rule in `renovate.json`
for `.gitea/workflows/**`).
Renovate itself will open a PR to bump the digest when a new release ships, which is
the intended update path.
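
One way to keep the digest-pinning rule from eroding is a small lint over the workflow files. The sketch below is an assumption-laden illustration, not part of this change: `unpinned_actions` is a hypothetical helper, it only understands the `owner/repo@ref` form of `uses:`, and it treats anything other than a full 40-character hex SHA as unpinned.

```python
import re

# Matches `uses: owner/repo@ref` (optionally quoted, optionally a list item).
# `docker://` references are skipped because they use image digests instead.
USES_RE = re.compile(
    r"^\s*(?:-\s*)?uses:\s*['\"]?(?!docker://)([^\s@'\"]+)@([^\s'\"#]+)",
    re.MULTILINE,
)
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text):
    """Return (action, ref) pairs whose ref is not a full commit SHA."""
    return [
        (action, ref)
        for action, ref in USES_RE.findall(workflow_text)
        if not FULL_SHA_RE.match(ref)
    ]
```

Running this over each file under `.gitea/workflows/` and failing on a non-empty result would flag a tag reference before it reaches `main`.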
### Why `osvVulnerabilityAlerts` is the load-bearing detector on Gitea
Renovate's `vulnerabilityAlerts` config key triggers off a *platform* vulnerability
graph. GitHub exposes the GitHub Advisory Database via its API; **Gitea does not
expose an equivalent vulnerability graph**. On self-hosted Gitea, `vulnerabilityAlerts`
is effectively a label carrier — it attaches the configured labels to PRs that
`osvVulnerabilityAlerts` already detected, but it is not an independent detector.
`osvVulnerabilityAlerts: true` is the load-bearing flag: Renovate queries
[OSV.dev](https://osv.dev) directly (platform-agnostic). The runner host must be able
to reach OSV.dev over HTTPS — if egress is filtered, allow `osv.dev:443` or the flag
silently no-ops.
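
Because a blocked egress path fails silently, a pre-flight reachability check on the runner host may be worth having. The following is a minimal sketch under stated assumptions: `can_reach` is a hypothetical helper, the default host and port simply mirror the `osv.dev:443` rule above, and a successful TCP connect only shows that the port is open, not that TLS or an intermediate proxy will cooperate.

```python
import socket


def can_reach(host="osv.dev", port=443, timeout=5.0,
              connect=socket.create_connection):
    """Return True if a TCP connection to host:port opens within `timeout`.

    `connect` is injectable so the check can be exercised without network
    access; by default it is `socket.create_connection`.
    """
    try:
        conn = connect((host, port), timeout)
    except OSError:
        # Covers refused connections, DNS failures, and timeouts.
        return False
    conn.close()
    return True
```

A workflow step could call `can_reach()` before Renovate runs and fail loudly instead of letting the detector quietly do nothing.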
### Why the root `schedule` does not mute security PRs
`"schedule": ["before 6am on monday"]` in `renovate.json` batches **routine** dependency
updates (version bumps outside any security context) to a weekly window. This reduces
noise from routine update PRs while still allowing review before merge.
**Security and vulnerability PRs bypass the schedule by design**: Renovate raises
them immediately regardless of the schedule window. The weekly window therefore cannot
delay or mute a vulnerability alert, and a future "tidy-up" that removes or widens the
schedule will not change that. This is stated explicitly to head off that misunderstanding.
### Why `lockFileMaintenance` has no `automerge`
`lockFileMaintenance` refreshes transitive pins weekly so the dependency tree picks up
patched transitive versions and accumulates fewer advisories over time. It is explicitly
set without `automerge: true` because a transitive pin refresh can silently break the
build if a transitive dep introduces a breaking change. These PRs are small and should be reviewed.
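
The configuration decisions recorded above lend themselves to a mechanical guard, so that a later edit to `renovate.json` cannot quietly undo them. The sketch below is illustrative only: `check_renovate_config` is a hypothetical helper, and it checks just the invariants this change names (OSV alerts on, no root `automerge`, no `automerge` on `lockFileMaintenance`, no deprecated matcher keys).

```python
DEPRECATED_KEYS = {"matchPackagePatterns", "matchPaths"}


def check_renovate_config(config):
    """Return human-readable violations of the decisions recorded in this ADR.

    `config` is the parsed content of renovate.json (a dict).
    """
    problems = []
    if config.get("osvVulnerabilityAlerts") is not True:
        problems.append("osvVulnerabilityAlerts must be true")
    if config.get("automerge") is True:
        problems.append("root automerge must not be enabled")
    if config.get("lockFileMaintenance", {}).get("automerge") is True:
        problems.append("lockFileMaintenance must not automerge")
    for index, rule in enumerate(config.get("packageRules", [])):
        for key in sorted(DEPRECATED_KEYS.intersection(rule)):
            problems.append(f"packageRules[{index}] uses deprecated key {key}")
    return problems
```

In practice this would be fed the result of `json.load` on `renovate.json`, with a non-empty return value failing the check.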
### Why there is no entry in `l2-containers.puml`
`docs/architecture/c4/l2-containers.puml` documents long-lived infrastructure
containers (services that run continuously). Renovate is a scheduled CI job that runs
on a Gitea Actions runner and exits — it is not a long-lived container. Adding it to
the container diagram would misrepresent the architecture. This omission is deliberate,
not an oversight.

---

## Consequences
- Newly-published advisories against our frontend dependencies are surfaced within
one day (daily Renovate cron) rather than at the next contributor PR.
- A nightly `npm audit` job provides an independent signal for dev-dependency advisories
that Renovate may not cover via OSV.
- Two secrets (`RENOVATE_TOKEN`, `NIGHTLY_AUDIT_TOKEN`) must be manually provisioned
and rotated annually (or on suspected compromise). See
`docs/infrastructure/ci-gitea.md` for the runbook.
- The bot account must be kept active and branch protection on `main` must forbid
it pushing directly. These are operational prerequisites, not code invariants.
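
For the nightly audit consequence above, the core idea is that a non-zero exit from `npm audit` is data to be captured rather than a failure that aborts the job. The real job does this in shell with `set +e`; the Python sketch below shows the same pattern under assumptions: `run_audit` and `high_or_critical` are hypothetical names, the exact audit flags used by the workflow are not reproduced here, and the report layout assumed is the `metadata.vulnerabilities` summary that `npm audit --json` emits.

```python
import json
import subprocess


def run_audit(command=("npm", "audit", "--json")):
    """Run the audit and return (exit_code, parsed_report_or_None).

    check=False plays the role of `set +e`: a non-zero exit code is
    returned to the caller instead of raising.
    """
    proc = subprocess.run(
        list(command), capture_output=True, text=True, check=False
    )
    try:
        report = json.loads(proc.stdout)
    except json.JSONDecodeError:
        # Unparseable output is reported as "no report", not as a clean run.
        report = None
    return proc.returncode, report


def high_or_critical(report):
    """Count high and critical findings from the report's summary block."""
    counts = (report or {}).get("metadata", {}).get("vulnerabilities", {})
    return counts.get("high", 0) + counts.get("critical", 0)
```

A caller would open or update the tracking issue when `high_or_critical(report)` is positive and emit the heartbeat otherwise, treating a `None` report as an error to surface rather than a clean result.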