fix(#553): close [birpc] rpc is closed race — sync-factory invariant + duplicate-ID guard + PR #10267 backport #555

marcel · 2026-05-13T10:04:18+02:00

marcel commented

2026-05-13 10:04:18 +02:00

Closes #553.

Summary

The first cycle of #553 closed one named trigger of the [birpc] rpc is closed, cannot call "resolveManualMock" teardown race (async vi.mock factory with await import(…) in EnrichmentBlock.svelte.spec.ts). Felix's follow-up local run on the post-fix branch confirmed the race still surfaced under the full 580-test client project, with the "last test before crash" rotating randomly across files with zero vi.mock calls.

Root cause of the residual race, identified by a direct read of @vitest/browser-playwright/dist/index.js and confirmed against vitest issue #9957 / PR #10267:

When two vi.mock calls reference the same resolved module URL under two distinct ID strings (e.g. '$lib/foo.svelte' and '$lib/foo.svelte.js'), the register handler installs two Playwright page.context().route(...) predicates but the cleanup map (idPreficates) only tracks the latest. The orphan route survives session teardown, fires after the next session's birpc channel closes, and crashes the run.

We hit the exact named trigger: $lib/shared/services/confirm.svelte was mocked under two distinct IDs across 8 spec files (no other duplicate-ID pair existed). This PR closes that gap from four sides:

Layer	What	Where
1	Normalise `confirm.svelte` to a single canonical ID (no-extension form) across production code (4 files) and specs (7 files)	`+layout.svelte`, `documents/[id]/+page.svelte`, `NameHistoryEditCard.svelte`, `TagDeleteGuard.svelte` + 7 spec files
2	In-suite meta-test that fails any spec scan where one canonical module is referenced under ≥2 spelling variants	`src/__meta__/no-duplicate-mock-ids.test.ts` (NEW, 130 LOC, 7 tests: 6 self-test fixtures + 1 corpus scan)
3	`patch-package` backport of vitest PR #10267 against `@vitest/browser-playwright@4.1.0` — closes the upstream race at the route-handler level (unroute existing predicate before installing replacement)	`patches/@vitest+browser-playwright+4.1.0.patch`, `package.json` (postinstall hook + devDep)
4	ADR-012 extended with a new "Binding invariant: one canonical ID per mocked module" section and a sixth enforcement layer	`docs/adr/012-browser-test-mocking-strategy.md`

The patch from Layer 3 is reapplied automatically via postinstall → patch-package. CI's npm ci step exercises this. Delete the patch when @vitest/browser-playwright publishes a release containing PR #10267 — the meta-test (Layer 2) stays as a permanent contributor-facing guard.

Defence in depth — six layers

ADR-012's enforcement chain is now:

ESLint no-restricted-syntax (scoped to **/*.{spec,test}.ts) — sync-factory invariant
CI grep guard mirroring the ESLint patterns
In-suite meta-test no-async-mock-factories.test.ts — sync-factory invariant
CI birpc assert on coverage-step output
(NEW) In-suite meta-test no-duplicate-mock-ids.test.ts — one-canonical-ID-per-module invariant
(NEW) patch-package backport of PR #10267 — closes the upstream race itself

Commits

2e6cc346 docs(adr-012): document duplicate-id hazard and patch-package backport
7fc1295d build(patch-package): backport vitest PR #10267 for browser-playwright birpc race
0cf4a488 test(meta): add duplicate-id vi.mock detector under __meta__/
9030a7d0 test(confirm-service): normalise vi.mock spelling and remove duplicate-id mocks
feadf372 refactor(confirm-service): normalise import spelling to no-extension form

Atomic per feedback_atomic_commits.md — one logical change per commit (production import normalisation, test mock normalisation, meta-test, patch-package, docs).

Test plan

npm run check — type checking (787 pre-existing errors in profile/users/stammbaum/admin/users; none introduced by this change)
npm run lint — Prettier + ESLint (running under push CI #1603)
npm run test — full vitest suite, both projects (under push CI). New no-duplicate-mock-ids.test.ts must pass; existing no-async-mock-factories.test.ts must remain green; EnrichmentBlock.svelte.spec.ts must remain green
npm ci postinstall replays the patch without error (under push CI)
Acceptance criterion: coverage-flake-probe.yml workflow_dispatch — 20-cell matrix run against this branch's head, all green, zero [birpc] rpc is closed lines. Triggered as run #1604

Follow-up

Separately tracked: #554 — audit: factory mocks → prop injection migration (sveltest pattern). Out of scope for this PR.

References

vitest issue #9957 — duplicate-ID race upstream
vitest PR #10267 — merged 2026-05-08, not yet released, backported here
ADR-012 — Browser-mode test mocking strategy
frontend/src/lib/shared/services/confirm.test-host.svelte — canonical prop-injection example (informs the Layer 5 follow-up)

Closes #553. ## Summary The first cycle of #553 closed one named trigger of the `[birpc] rpc is closed, cannot call "resolveManualMock"` teardown race (async `vi.mock` factory with `await import(…)` in `EnrichmentBlock.svelte.spec.ts`). Felix's follow-up local run on the post-fix branch confirmed the race **still** surfaced under the full 580-test client project, with the "last test before crash" rotating randomly across files with zero `vi.mock` calls. Root cause of the residual race, identified by a direct read of `@vitest/browser-playwright/dist/index.js` and confirmed against [vitest issue #9957](https://github.com/vitest-dev/vitest/issues/9957) / [PR #10267](https://github.com/vitest-dev/vitest/pull/10267): > When two `vi.mock` calls reference the same resolved module URL under two distinct ID strings (e.g. `'$lib/foo.svelte'` and `'$lib/foo.svelte.js'`), the `register` handler installs two Playwright `page.context().route(...)` predicates but the cleanup map (`idPreficates`) only tracks the latest. The orphan route survives session teardown, fires after the next session's birpc channel closes, and crashes the run. We hit the exact named trigger: `$lib/shared/services/confirm.svelte` was mocked under **two distinct IDs** across 8 spec files (no other duplicate-ID pair existed). This PR closes that gap from four sides: | Layer | What | Where | |---|---|---| | 1 | Normalise `confirm.svelte` to a single canonical ID (no-extension form) across production code (4 files) and specs (7 files) | `+layout.svelte`, `documents/[id]/+page.svelte`, `NameHistoryEditCard.svelte`, `TagDeleteGuard.svelte` + 7 spec files | | 2 | In-suite meta-test that fails any spec scan where one canonical module is referenced under ≥2 spelling variants | `src/__meta__/no-duplicate-mock-ids.test.ts` (NEW, 130 LOC, 7 tests: 6 self-test fixtures + 1 corpus scan) | | 3 | `patch-package` backport of vitest PR #10267 against `@vitest/browser-playwright@4.1.0` — closes the upstream race at the route-handler level (unroute existing predicate before installing replacement) | `patches/@vitest+browser-playwright+4.1.0.patch`, `package.json` (postinstall hook + devDep) | | 4 | ADR-012 extended with a new "Binding invariant: one canonical ID per mocked module" section and a sixth enforcement layer | `docs/adr/012-browser-test-mocking-strategy.md` | The patch from Layer 3 is reapplied automatically via `postinstall → patch-package`. CI's `npm ci` step exercises this. Delete the patch when `@vitest/browser-playwright` publishes a release containing PR #10267 — the meta-test (Layer 2) stays as a permanent contributor-facing guard. ## Defence in depth — six layers ADR-012's enforcement chain is now: 1. ESLint `no-restricted-syntax` (scoped to `**/*.{spec,test}.ts`) — sync-factory invariant 2. CI grep guard mirroring the ESLint patterns 3. In-suite meta-test `no-async-mock-factories.test.ts` — sync-factory invariant 4. CI birpc assert on coverage-step output 5. **(NEW)** In-suite meta-test `no-duplicate-mock-ids.test.ts` — one-canonical-ID-per-module invariant 6. **(NEW)** `patch-package` backport of PR #10267 — closes the upstream race itself ## Commits ``` 2e6cc346 docs(adr-012): document duplicate-id hazard and patch-package backport 7fc1295d build(patch-package): backport vitest PR #10267 for browser-playwright birpc race 0cf4a488 test(meta): add duplicate-id vi.mock detector under __meta__/ 9030a7d0 test(confirm-service): normalise vi.mock spelling and remove duplicate-id mocks feadf372 refactor(confirm-service): normalise import spelling to no-extension form ``` Atomic per `feedback_atomic_commits.md` — one logical change per commit (production import normalisation, test mock normalisation, meta-test, patch-package, docs). ## Test plan - [x] `npm run check` — type checking (787 pre-existing errors in profile/users/stammbaum/admin/users; none introduced by this change) - [ ] `npm run lint` — Prettier + ESLint (running under push CI [#1603](https://git.raddatz.cloud/marcel/familienarchiv/actions/runs/1603)) - [ ] `npm run test` — full vitest suite, both projects (under push CI). New `no-duplicate-mock-ids.test.ts` must pass; existing `no-async-mock-factories.test.ts` must remain green; `EnrichmentBlock.svelte.spec.ts` must remain green - [ ] `npm ci` postinstall replays the patch without error (under push CI) - [ ] **Acceptance criterion:** `coverage-flake-probe.yml` workflow_dispatch — 20-cell matrix run against this branch's head, all green, zero `[birpc] rpc is closed` lines. Triggered as [run #1604](https://git.raddatz.cloud/marcel/familienarchiv/actions/runs/1604) ## Follow-up Separately tracked: #554 — `audit: factory mocks → prop injection migration (sveltest pattern)`. Out of scope for this PR. ## References - [vitest issue #9957](https://github.com/vitest-dev/vitest/issues/9957) — duplicate-ID race upstream - [vitest PR #10267](https://github.com/vitest-dev/vitest/pull/10267) — merged 2026-05-08, not yet released, backported here - ADR-012 — Browser-mode test mocking strategy - `frontend/src/lib/shared/services/confirm.test-host.svelte` — canonical prop-injection example (informs the Layer 5 follow-up)

marcel added 13 commits 2026-05-13 10:04:19 +02:00

fix(enrichment-block): migrate $app/stores → $app/state to eliminate birpc race 9f1b8b4215

The async vi.mock factory in EnrichmentBlock.svelte.spec.ts performed an
`await import(...)` in its body — the same mechanism #535/#546 fixed for
pdfjs-dist. Issue #553: when Chromium's playwright route handler fetches
the mocked module after the worker's birpc channel has closed, the
factory's RPC roundtrip raises `[birpc] rpc is closed, cannot call
"resolveManualMock"` and the run exits 1.

Migrate EnrichmentBlock from the deprecated `$app/stores.navigating`
(store) to the modern `$app/state.navigating` (reactive proxy). The
spec uses vi.hoisted + a sync vi.mock factory with a getter that defers
the read — no dynamic import in the factory body. Delete the now-unused
__mocks__/navigatingStore.ts.

Fix path applied: $app/state migration (Markus's recommendation /
Felix's Path 2). See ADR-012.

Refs #553

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

test(setup): disable SvelteKit hover prefetch in browser-mode runs 3c9e40ca71

Hover-prefetch fires real fetch requests for route loader chunks; those
requests go through the same Playwright route handler that serves mocked
modules. An in-flight prefetch landing after iframe teardown can hit the
handler with a closed birpc channel, raising an unhandled rejection that
exits the run with code 1 even when every individual test was green.

Add `src/test-setup.ts` that sets `document.body.dataset.sveltekitPreloadData
= 'off'` and wire it via `setupFiles` in both `vite.config.ts` (client
project) and `vitest.client-coverage.config.ts` (Istanbul coverage config).
Add `src/__meta__/browser-preload-disabled.svelte.test.ts` asserting the
setup ran. Zero production impact.

Issue #553 secondary trigger.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

test(meta): scan src/**/*.svelte.{test,spec}.ts for async vi.mock factories 636d61a81b

In-suite belt-and-braces detector for the birpc teardown race named in
ADR-012 / #553. Catches `vi.mock(<arg>, async ... { ... await import(...)
... })` in any browser spec on every vitest invocation — the layer hardest
to disable or scope around (ESLint can be silenced; CI grep runs only in
CI; this test runs whenever the suite runs).

Demonstrated red→green by planting a synthetic offender locally and
watching the live-scan assertion fail; removing the offender returned it
to green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

refactor(eslint): ban async vi.mock factories with dynamic import in body 5afebde382

Generalise the no-restricted-syntax rule from the literal pdfjs-dist
selector (added in #535) to also catch the underlying mechanism named in
ADR-012 / #553: any `vi.mock(..., async () => { ... await import(...)
... })` produces a late birpc roundtrip during worker teardown.

Selector: vi.mock CallExpression whose second argument is an
ArrowFunctionExpression with async=true and whose subtree contains an
AwaitExpression > ImportExpression. Both rules coexist — the literal
pdfjs-dist rule still enforces the libLoader prop injection pattern
(catches sync forms too); the new rule enforces the sync-factory
invariant universally.

Demonstrated by planting a synthetic offender locally and watching
ESLint flag it with the new rule's message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ci(unit-tests): extend grep guard to async vi.mock with dynamic import 67cd56acc7

The pdfjs-dist literal grep added in 9260866f only caught one named
trigger of the birpc teardown race; the underlying mechanism (ADR 012 /
#553) is any async vi.mock factory whose body performs `await import(...)`.

Add a second PCRE-multiline grep matching that shape. Scoped to
**/*.{spec,test}.ts under frontend/src/, excluding __meta__ (which holds
the fixture strings exercising the meta-test). Defence in depth pairs with
the ESLint rule (saves at edit time) and the in-suite meta-test (catches
when tests run).

Verified locally with real GNU grep against a planted synthetic offender.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ci(coverage-flake-probe): add workflow_dispatch matrix job (20 parallel runs) c820884765

Verification mechanism for the 20-run acceptance criterion of issue #553.
Triggered manually via workflow_dispatch, runs the full coverage suite 20×
in parallel against a single SHA, asserts zero `[birpc] rpc is closed`
lines in every cell.

One fire, parallel cost (~one main-job's wall-clock), deterministic signal
for the teardown race. Cheaper than 20 sequential push events and tests
the same property the AC names.

Closes the verification gap raised by Tobias and Elicit in the issue
discussion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

docs(adr-012): record the sync-factory invariant and the $app/state migration

CI / Unit & Component Tests (push) Failing after 1m49s

Details

CI / OCR Service Tests (push) Successful in 17s

Details

CI / Backend Unit Tests (push) Successful in 4m13s

Details

CI / Compose Bucket Idempotency (push) Has been cancelled

Details

CI / fail2ban Regex (push) Has been cancelled

Details

addf5c98db

The previous revision allowed vi.mock for virtual modules on the "consumer
import is static" argument. #553 proved that argument wrong: a statically-
imported module with an async factory body whose dynamic import landed
after teardown still produced the race. The factory body — not the
consumer — is the failure surface.

- Drop the "residual exceptions" table.
- Add the binding invariant: factory bodies under `**/*.svelte.{test,spec}.ts`
  must be synchronous (no `await`, no `import(...)`).
- Document the canonical vi.hoisted + getter pattern, with file references.
- Record the $app/stores → $app/state architectural call (Markus's
  recommendation), removing one of the last two deprecated-import
  outliers.
- Record the preload-data=off hardening (Tobias's recommendation) as a
  pattern note.
- Update the Enforcement section to list all four defence layers (ESLint,
  CI grep, in-suite meta-test, CI birpc assert) and the coverage-flake-
  probe verification workflow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

test(setup): also disable data-sveltekit-preload-code in browser tests

CI / Backend Unit Tests (push) Successful in 4m17s

Details

CI / fail2ban Regex (push) Successful in 39s

Details

CI / Compose Bucket Idempotency (push) Failing after 11s

Details

CI / Unit & Component Tests (push) Failing after 2m31s

Details

CI / OCR Service Tests (push) Successful in 17s

Details

edde9292e6

Hover-prefetch has two surfaces in SvelteKit:
- data-sveltekit-preload-data (route loader data)
- data-sveltekit-preload-code (route JS chunks)

The original fix turned off only the loader-data side. Route-code chunks
prefetched on hover can also include manually-mocked module URLs; an
in-flight code prefetch landing after iframe teardown hits the same
Playwright route handler that resolves manual mocks, raising the
unhandled rejection. Disable both surfaces.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

refactor(confirm-service): normalise import spelling to no-extension form feadf372a0

Production code referenced $lib/shared/services/confirm.svelte under two
spellings — 4 files with the .js extension and one without. Standardise on
the no-extension form to match Svelte 5 rune-module convention and the
source file basename (confirm.svelte.ts).

Why this matters: vitest browser mode's @vitest/browser-playwright resolves
both spellings to the same module URL but registers a separate Playwright
route per spelling. The route-cleanup logic only unregisters the latest,
leaving an orphan that crashes the next session with
"[birpc] rpc is closed, cannot call resolveManualMock". Fixed upstream in
vitest PR #10267 (merged, not yet released). Normalising the spelling
removes the trigger from our side.

Refs: #553. Companion test-file changes follow in the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

test(confirm-service): normalise vi.mock spelling and remove duplicate-id mocks 9030a7d031

Five test files mocked $lib/shared/services/confirm.svelte under BOTH
spellings (.svelte and .svelte.js) within the same file; two more mocked
only the .svelte.js form. Both resolve to the same module URL but register
two distinct Playwright route handlers in @vitest/browser-playwright. The
cleanup logic only removes one, leaving an orphan that fires when the next
session loads the module — crashing the run with
"[birpc] rpc is closed, cannot call resolveManualMock".

This is the exact trigger fixed upstream by vitest PR #10267 (issue #9957).
Normalise every confirm.svelte mock to the no-extension form, matching
production imports and the source file basename (confirm.svelte.ts).

After this commit: 8 confirm.svelte mocks across 8 spec files, all under
one canonical ID. A meta-test (next commit) prevents the duplicate-id
pattern from reappearing.

Refs: #553 · vitest-dev/vitest#9957 · vitest-dev/vitest#10267

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

test(meta): add duplicate-id vi.mock detector under __meta__/ 0cf4a488bb

Scans every src/**/*.svelte.{spec,test}.ts file for vi.mock first-arg
strings, canonicalises each by stripping a trailing .js/.ts after
.svelte, groups by canonical id, and fails if any canonical id is
referenced under two or more distinct raw spellings.

Mirrors the shape of src/__meta__/no-async-mock-factories.test.ts:
source-text regex scan (no AST parser dependency), red/green self-test
fixtures inline, then one corpus assertion that the whole suite is
clean.

This is the in-suite defence-in-depth layer for the duplicate-id birpc
race named in ADR-012 / #553 and fixed upstream by vitest PR #10267.
Harder to disable than ESLint (cross-file invariant ESLint cannot
express anyway) and harder to scope around than a CI grep.

Refs: #553

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

build(patch-package): backport vitest PR #10267 for browser-playwright birpc race 7fc1295dc0

Installs patch-package (^8.0.0) and a postinstall script, then applies
the diff from vitest PR #10267 against @vitest/browser-playwright@4.1.0.

What the patch changes (in dist/index.js):

- createPredicate(sessionId, url) → createPredicate(url): factory becomes
  pure, returns { url, predicate } instead of mutating sessionIds /
  idPreficates as a side-effect.
- sessionIds value type: array → Set (deduplicates resolved URLs).
- register handler now looks up any existing predicate for the
  (sessionId, resolvedUrl) pair and unroutes it BEFORE installing the
  new route. This is the actual race fix: without it, the second
  vi.mock for a duplicate-id leaks an orphan Playwright route that
  fires after birpc closes.
- clear handler iterates the Set via spread.

Why this matters even though Layer 1 normalised the only known duplicate
in our suite: every future vi.mock call is a class of race we shouldn't
have to think about. The patch closes the upstream gap at the
route-handler level, so a contributor reintroducing the duplicate-id
pattern can't reopen the race.

When to remove: when @vitest/browser-playwright ships a release
containing PR #10267. Delete patches/@vitest+browser-playwright+4.1.0.patch
and the postinstall hook (or keep the hook if other patches accumulate).

Refs: #553 · vitest-dev/vitest#9957 · vitest-dev/vitest#10267

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

docs(adr-012): document duplicate-id hazard and patch-package backport

CI / Unit & Component Tests (pull_request) Failing after 2m33s

Details

CI / OCR Service Tests (pull_request) Successful in 17s

Details

CI / Backend Unit Tests (pull_request) Successful in 4m12s

Details

CI / fail2ban Regex (pull_request) Successful in 38s

Details

CI / Compose Bucket Idempotency (pull_request) Failing after 11s

Details

CI / Unit & Component Tests (push) Failing after 2m33s

Details

CI / OCR Service Tests (push) Successful in 16s

Details

CI / Backend Unit Tests (push) Successful in 4m16s

Details

CI / fail2ban Regex (push) Successful in 38s

Details

CI / Compose Bucket Idempotency (push) Failing after 11s

Details

nightly / deploy-staging (push) Failing after 1m46s

Details

2e6cc346ab

Adds a second binding invariant section to ADR-012 covering the
duplicate-id mechanism named in #553's follow-up investigation: same
resolved module URL referenced via two distinct vi.mock id strings →
@vitest/browser-playwright leaks an orphan Playwright route → birpc-closed
crash in the next session.

Records the rule (one canonical id per mocked module, prefer the spelling
production uses, no-extension for .svelte rune modules), the in-suite
detector (no-duplicate-mock-ids.test.ts), and the patch-package backport
of vitest PR #10267 with its removal trigger.

Extends the existing Consequences enforcement list from four layers to
six, adding the duplicate-id detector and the patch-package layer.

Refs: #553 · vitest-dev/vitest#9957 · vitest-dev/vitest#10267

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

marcel commented

2026-05-13 10:06:17 +02:00

Markus Keller — Senior Application Architect

Verdict: Approved

Five-commit, atomic, well-bounded change. Defence-in-depth with six layers is exactly the shape this class of "intermittent CI race" calls for — each layer catches the symptom at a different stage of the loop (editor → pre-test grep → in-suite → post-run birpc grep). Nothing here leaks across domain boundaries; the changes are all infrastructure-level (test setup, ESLint, patch-package, ADR).

What I like

ADR-012 was updated with the change, not as a follow-up. The "Binding invariant: one canonical ID per mocked module" section is now the place future contributors will land. Architectural memory is preserved.
Patch-package as a tactical bridge, not a permanent fork. The ADR includes an explicit removal trigger (When to remove the patch) and the in-suite meta-test stays as the permanent guard. This is the right way to backport — own the workaround, document its expiry condition.
The fix is at the lowest trustworthy layer of the test stack. Production import normalisation removes the trigger, while the meta-test enforces the invariant at the layer hardest to disable. Belt-and-braces, not duplication.
Boring tooling choice. patch-package over yarn-patches / pnpm-overrides / a fork. Two-line postinstall, one file under VCS, zero new infrastructure to operate.

Concerns (non-blocking)

The idPreficates typo is reproduced verbatim in the patch. This is correct — patch-package must match the upstream file — but it does mean if upstream renames the variable in a point release, our patch will fail to apply silently. Tobias's CI exercise (npm ci postinstall) is the right guard. Worth a sentence in the ADR's "When to remove the patch" section noting how the failure will surface (postinstall step exits non-zero, npm ci fails the unit-tests job before any test runs).
The meta-test fires a readFileSync per spec file at suite startup. With ~50 browser specs today this is cheap; at 500+ it adds measurable cold-start time per CI run. Not actionable now — flag it as something to revisit if the meta-test ever appears in a CI flame graph.
coverage-flake-probe.yml is a 20-cell matrix on the Playwright Docker image. Tobias will price-check this, but architecturally I'd note that it's workflow_dispatch-only — won't fire on every push. Good choice. A 20× matrix on push would dominate CI capacity.

LGTM on layering

No controller-to-repository violations to flag (this PR doesn't touch the backend). No cross-domain frontend imports introduced. ESLint boundaries rule unchanged. ADR cross-references all six layers and points at the upstream issue/PR for traceability. This is how docs/adr/ should be maintained.

— Markus

## Markus Keller — Senior Application Architect **Verdict: Approved** Five-commit, atomic, well-bounded change. Defence-in-depth with six layers is exactly the shape this class of "intermittent CI race" calls for — each layer catches the symptom at a different stage of the loop (editor → pre-test grep → in-suite → post-run birpc grep). Nothing here leaks across domain boundaries; the changes are all infrastructure-level (test setup, ESLint, patch-package, ADR). ### What I like - **ADR-012 was updated *with* the change, not as a follow-up.** The "Binding invariant: one canonical ID per mocked module" section is now the place future contributors will land. Architectural memory is preserved. - **Patch-package as a tactical bridge, not a permanent fork.** The ADR includes an explicit removal trigger (`When to remove the patch`) and the in-suite meta-test stays as the permanent guard. This is the right way to backport — own the workaround, document its expiry condition. - **The fix is at the lowest trustworthy layer of the test stack.** Production import normalisation removes the *trigger*, while the meta-test enforces the *invariant* at the layer hardest to disable. Belt-and-braces, not duplication. - **Boring tooling choice.** patch-package over yarn-patches / pnpm-overrides / a fork. Two-line postinstall, one file under VCS, zero new infrastructure to operate. ### Concerns (non-blocking) 1. **The `idPreficates` typo is reproduced verbatim in the patch.** This is correct — patch-package must match the upstream file — but it does mean if upstream renames the variable in a point release, our patch will fail to apply silently. Tobias's CI exercise (`npm ci` postinstall) is the right guard. Worth a sentence in the ADR's "When to remove the patch" section noting *how the failure will surface* (postinstall step exits non-zero, `npm ci` fails the unit-tests job before any test runs). 2. **The meta-test fires a `readFileSync` per spec file at suite startup.** With ~50 browser specs today this is cheap; at 500+ it adds measurable cold-start time per CI run. Not actionable now — flag it as something to revisit if the meta-test ever appears in a CI flame graph. 3. **`coverage-flake-probe.yml` is a 20-cell matrix on the Playwright Docker image.** Tobias will price-check this, but architecturally I'd note that it's `workflow_dispatch`-only — won't fire on every push. Good choice. A 20× matrix on push would dominate CI capacity. ### LGTM on layering No controller-to-repository violations to flag (this PR doesn't touch the backend). No cross-domain frontend imports introduced. ESLint boundaries rule unchanged. ADR cross-references all six layers and points at the upstream issue/PR for traceability. This is how `docs/adr/` should be maintained. — Markus

marcel commented

2026-05-13 10:06:34 +02:00

Felix Brandt — Senior Fullstack Developer

Verdict: Approved with concerns

The diff is exactly what you'd want a teardown-race fix to look like: small, atomic, every commit one logical change, every layer of defence has a test behind it. The two new meta-tests (no-duplicate-mock-ids.test.ts, no-async-mock-factories.test.ts) follow textbook red/green construction — 6 self-test fixtures bracketing the corpus scan. That's the right shape.

What's right

Atomic commits, one logical change each. refactor(confirm-service) → test(confirm-service) → test(meta) → build(patch-package) → docs(adr-012). Reads cleanly in git log. No "while I'm here" smuggling.
TDD discipline in the meta-test itself. findDuplicateMockIds() is exported and unit-tested with 6 fixtures (positive, negative, edge cases including .svelte.ts canonicalisation) before the corpus scan asserts the live invariant. Red phase visible: the fixtures would have caught the bug, the corpus scan locks the fix in. This is the pattern I want to see for every meta-guard.
Guard clauses in the regex layer. canonicalise() is a 3-line function that does exactly one thing. No nested conditionals.
Sync factory + vi.hoisted() in EnrichmentBlock.svelte.spec.ts (lines 9-17). Exactly the pattern ADR-012 prescribes. The getter defers the read until consumption time and the hoisted reference is initialised before the also-hoisted vi.mock runs.

Concerns

Regex-based scan in no-async-mock-factories.test.ts is fragile. The pattern vi\.mock\([^)]*,\s*async[^{]*\{[\s\S]*?await\s+import\s*\( works on the corpus today but trips on legitimate code if anyone ever writes:
```
vi.mock('foo', async (importOriginal) => {
  const orig = await importOriginal();  // <-- await ... import ... — false positive
  return { ...orig, bar: vi.fn() };
});
```
The ESLint rule (AwaitExpression > ImportExpression in the AST selector) doesn't have this problem because it inspects the AST. The CI grep guard also doesn't (same regex, same blind spot — but at least the message is clear). Suggestion: add one negative-fixture test for importOriginal() so we know the pattern works for that case, or annotate in the file that this scan is a backstop and the AST selector is the source of truth.
canonicalise() strips only one trailing extension. .svelte.js and .svelte.ts are handled; .svelte.mjs and .svelte.cjs aren't. Probably not used anywhere — but if someone introduces one, the guard silently skips. One-line fix: if (id.match(/\.svelte\.[mc]?[jt]s$/)) return id.replace(/\.svelte\.[mc]?[jt]s$/, '.svelte').
EnrichmentBlock.svelte.spec.ts line 25-28 has a typed helper function doc(...) that shadows the doc variable name used elsewhere in this codebase. Nit. Rename to makeDoc() to match the project's factory-function convention (see makeUser, makeDocument in the persona doc, and the broader test corpus). 5-line change.
no-duplicate-mock-ids.test.ts discovery walks recursive: true on a Node readdirSync — works, but blows up cold cache time on container starts. Same observation as Markus's point 2. Defer.

Test plan you mentioned

npm run check — type-check (787 pre-existing errors, none new): fine, that backlog is tracked.
coverage-flake-probe.yml run #1604: that's the acceptance criterion. Until it's green across all 20 cells, this is approved-with-concerns. Once it lands green, the verdict flips to a clean approval.

— Felix

## Felix Brandt — Senior Fullstack Developer **Verdict: Approved with concerns** The diff is exactly what you'd want a teardown-race fix to look like: small, atomic, every commit one logical change, every layer of defence has a test behind it. The two new meta-tests (`no-duplicate-mock-ids.test.ts`, `no-async-mock-factories.test.ts`) follow textbook red/green construction — 6 self-test fixtures bracketing the corpus scan. That's the right shape. ### What's right - **Atomic commits, one logical change each.** `refactor(confirm-service)` → `test(confirm-service)` → `test(meta)` → `build(patch-package)` → `docs(adr-012)`. Reads cleanly in `git log`. No "while I'm here" smuggling. - **TDD discipline in the meta-test itself.** `findDuplicateMockIds()` is exported and unit-tested with 6 fixtures (positive, negative, edge cases including `.svelte.ts` canonicalisation) *before* the corpus scan asserts the live invariant. Red phase visible: the fixtures would have caught the bug, the corpus scan locks the fix in. This is the pattern I want to see for every meta-guard. - **Guard clauses in the regex layer.** `canonicalise()` is a 3-line function that does exactly one thing. No nested conditionals. - **Sync factory + `vi.hoisted()` in `EnrichmentBlock.svelte.spec.ts` (lines 9-17).** Exactly the pattern ADR-012 prescribes. The getter defers the read until consumption time and the hoisted reference is initialised before the also-hoisted `vi.mock` runs. ### Concerns 1. **Regex-based scan in `no-async-mock-factories.test.ts` is fragile.** The pattern `vi\.mock\([^)]*,\s*async[^{]*\{[\s\S]*?await\s+import\s*\(` works on the corpus today but trips on legitimate code if anyone ever writes: ```ts vi.mock('foo', async (importOriginal) => { const orig = await importOriginal(); // <-- await ... import ... — false positive return { ...orig, bar: vi.fn() }; }); ``` The ESLint rule (`AwaitExpression > ImportExpression` in the AST selector) doesn't have this problem because it inspects the AST. The CI grep guard also doesn't (same regex, same blind spot — but at least the message is clear). Suggestion: add one negative-fixture test for `importOriginal()` so we know the pattern works for that case, *or* annotate in the file that this scan is a backstop and the AST selector is the source of truth. 2. **`canonicalise()` strips only one trailing extension.** `.svelte.js` and `.svelte.ts` are handled; `.svelte.mjs` and `.svelte.cjs` aren't. Probably not used anywhere — but if someone introduces one, the guard silently skips. One-line fix: `if (id.match(/\.svelte\.[mc]?[jt]s$/)) return id.replace(/\.svelte\.[mc]?[jt]s$/, '.svelte')`. 3. **`EnrichmentBlock.svelte.spec.ts` line 25-28 has a typed helper `function doc(...)` that shadows the `doc` variable name used elsewhere in this codebase.** Nit. Rename to `makeDoc()` to match the project's factory-function convention (see `makeUser`, `makeDocument` in the persona doc, and the broader test corpus). 5-line change. 4. **`no-duplicate-mock-ids.test.ts` discovery walks `recursive: true` on a Node `readdirSync` — works, but blows up cold cache time on container starts.** Same observation as Markus's point 2. Defer. ### Test plan you mentioned - `npm run check` — type-check (787 pre-existing errors, none new): fine, that backlog is tracked. - `coverage-flake-probe.yml` run #1604: that's the acceptance criterion. Until it's green across all 20 cells, this is approved-with-concerns. Once it lands green, the verdict flips to a clean approval. — Felix

marcel commented

2026-05-13 10:06:53 +02:00

Tobias Wendt — DevOps & Platform Engineer

Verdict: Approved with concerns

CI-side defences are well-thought-out. coverage-flake-probe.yml as a workflow_dispatch-only matrix replaces what would otherwise be 20 push events on main — that's a real capacity saving and the right tool for "verify a flake-class problem is fixed."

What's right

actions/cache@v4, actions/checkout@v4, actions/upload-artifact@v4 — all current, none deprecated. Caching frontend/node_modules keyed on package-lock.json is the canonical pattern.
Pinned Playwright container image (mcr.microsoft.com/playwright:v1.58.2-noble). Matches the version of @playwright/test in package.json (1.58.2). Reproducible.
postinstall: patch-package is the right shape for a tactical backport — applied via npm ci in CI, no separate apply step, the artefact is one file in patches/.
set -eo pipefail + tee + post-step grep guard. Exactly how you want to capture and gate on a log signature without losing the exit code through tee. Correct.
fail-fast: false on the 20-cell matrix. You want every cell to run so you see partial-failure patterns, not bail on cell #3. Correct.

Concerns

coverage-flake-probe.yml runs 20 cells × full npm run test:coverage. Each cell does its own npm ci (if cache misses) plus a coverage run that, looking at the test pyramid, is bounded by browser tests. Ballpark per cell: 3-6 minutes. 20 cells in parallel on a shared NAS runner means contention. Worth checking how many concurrent runners are configured before this becomes the default flake-probe — and worth adding a comment in the workflow header noting "use sparingly; 20-cell on shared runner."
No explicit timeout on the test step. unit-tests in ci.yml doesn't set one either, but the flake-probe is the place to set one — a hung cell could pin a runner for hours. Suggest timeout-minutes: 15 per cell.
Upload coverage log on failure artefact name is run-indexed (coverage-log-run-${{ matrix.run }}). Good. But the path uses ${{ github.run_id }} in the filename so the second invocation of the same workflow won't collide. Verify Gitea Actions populates github.run_id the same way GitHub does — if not, you'd want a fallback like ${{ env.GITHUB_RUN_ID || github.sha }}. Quick check, not a blocker.
The patch is applied via postinstall, not as part of the build. If a contributor runs npm install --ignore-scripts (e.g. in a sandboxed CI step they haven't audited yet), the patch silently skips. The in-suite meta-test (Layer 5) still fires the duplicate-ID guard, but the upstream-fix backport (Layer 6) doesn't. Not a blocker — but worth a defensive log line: postinstall: patch-package && echo "patches applied" so a missing patch is visible in CI output.
No Renovate config visible for the patches/ directory. When @vitest/browser-playwright releases the version containing PR #10267, Renovate will bump the dep and the patch will fail to apply (filename embeds 4.1.0). That's actually the right failure mode — loud, visible, forces a human to delete the patch. Worth ADR-012 calling this out explicitly under "When to remove the patch" — "the bump PR will fail postinstall; that is the signal to delete." Minor doc edit.

LGTM otherwise

The grep guard in ci.yml (Assert no birpc teardown race in coverage run) is correct — if: always() so it runs even if the test step exited 1, grep -qF for fixed-string match, exit 1 with a clear message. Standard.

Monthly cost impact of this PR: 0 EUR. No new infrastructure.

— Tobias

## Tobias Wendt — DevOps & Platform Engineer **Verdict: Approved with concerns** CI-side defences are well-thought-out. `coverage-flake-probe.yml` as a `workflow_dispatch`-only matrix replaces what would otherwise be 20 push events on `main` — that's a real capacity saving and the right tool for "verify a flake-class problem is fixed." ### What's right - **`actions/cache@v4`, `actions/checkout@v4`, `actions/upload-artifact@v4`** — all current, none deprecated. Caching `frontend/node_modules` keyed on `package-lock.json` is the canonical pattern. - **Pinned Playwright container image** (`mcr.microsoft.com/playwright:v1.58.2-noble`). Matches the version of `@playwright/test` in `package.json` (1.58.2). Reproducible. - **`postinstall: patch-package`** is the right shape for a tactical backport — applied via `npm ci` in CI, no separate apply step, the artefact is one file in `patches/`. - **`set -eo pipefail` + `tee` + post-step grep guard**. Exactly how you want to capture and gate on a log signature without losing the exit code through `tee`. Correct. - **`fail-fast: false`** on the 20-cell matrix. You *want* every cell to run so you see partial-failure patterns, not bail on cell #3. Correct. ### Concerns 1. **`coverage-flake-probe.yml` runs 20 cells × full `npm run test:coverage`.** Each cell does its own `npm ci` (if cache misses) plus a coverage run that, looking at the test pyramid, is bounded by browser tests. Ballpark per cell: 3-6 minutes. 20 cells in parallel on a shared NAS runner means contention. Worth checking how many concurrent runners are configured before this becomes the default flake-probe — and worth adding a comment in the workflow header noting "use sparingly; 20-cell on shared runner." 2. **No explicit timeout on the test step.** `unit-tests` in `ci.yml` doesn't set one either, but the flake-probe is the place to set one — a hung cell could pin a runner for hours. Suggest `timeout-minutes: 15` per cell. 3. **`Upload coverage log on failure` artefact name is run-indexed (`coverage-log-run-${{ matrix.run }}`).** Good. But the path uses `${{ github.run_id }}` in the filename so the second invocation of the same workflow won't collide. Verify Gitea Actions populates `github.run_id` the same way GitHub does — if not, you'd want a fallback like `${{ env.GITHUB_RUN_ID || github.sha }}`. Quick check, not a blocker. 4. **The patch is applied via `postinstall`, not as part of the build.** If a contributor runs `npm install --ignore-scripts` (e.g. in a sandboxed CI step they haven't audited yet), the patch silently skips. The in-suite meta-test (Layer 5) still fires the duplicate-ID guard, but the upstream-fix backport (Layer 6) doesn't. Not a blocker — but worth a defensive log line: `postinstall: patch-package && echo "patches applied"` so a missing patch is visible in CI output. 5. **No Renovate config visible for the `patches/` directory.** When `@vitest/browser-playwright` releases the version containing PR #10267, Renovate will bump the dep and the patch will fail to apply (filename embeds `4.1.0`). That's actually the right failure mode — loud, visible, forces a human to delete the patch. Worth ADR-012 calling this out explicitly under "When to remove the patch" — "the bump PR will fail postinstall; that is the signal to delete." Minor doc edit. ### LGTM otherwise The grep guard in `ci.yml` (`Assert no birpc teardown race in coverage run`) is correct — `if: always()` so it runs even if the test step exited 1, `grep -qF` for fixed-string match, exit 1 with a clear message. Standard. Monthly cost impact of this PR: 0 EUR. No new infrastructure. — Tobias

marcel commented

2026-05-13 10:07:10 +02:00

Elicit — Requirements Engineer & Business Analyst

Verdict: Approved with concerns

Working in Brownfield Phase B3 (issue triage) and B4 (gap analysis) mode for this PR.

Issue-quality assessment of #553 → this PR

The PR description meets the Definition of Ready bar I'd want for a solo-with-LLM workflow:

Clear title (verb-noun: "close birpc race + duplicate-ID guard + backport")
Closes a tracked issue (#553)
Acceptance criterion is measurable and falsifiable: "20-cell matrix run, all green, zero [birpc] rpc is closed lines." This is exactly the kind of NFR target I push for — not "make the flake go away" but "20 consecutive green coverage runs against this SHA." Falsifiable. Re-verifiable.
Defence-in-depth is layered and each layer is named, located, and explained
Out-of-scope item explicitly parked: "#554 — factory mocks → prop injection migration (sveltest pattern). Out of scope for this PR." Excellent scope discipline.

Concerns and gaps

Acceptance criterion is gated on a workflow that hasn't reported back yet. The PR description marks coverage-flake-probe.yml run #1604 as "Triggered" with empty checkboxes for lint/test/postinstall. Until run #1604 reports all 20 cells green, the PR's primary acceptance evidence is pending. This is fine for a draft, but the PR is not marked draft and is open for review. Recommendation: either mark Draft until #1604 completes, or post the run result as a follow-up comment before requesting merge.
NFR coverage on the patch lifecycle is implicit, not explicit. ADR-012 says "delete the patch when @vitest/browser-playwright publishes a release containing PR #10267" — but the trigger for that deletion isn't tracked. Suggestion: open a low-priority Gitea issue right now titled "Remove @vitest+browser-playwright+4.1.0.patch once upstream PR #10267 is released" and link it from the ADR. Without this, the patch becomes invisible technical debt. (TD entry shape: TD-NN | Tactical backport of vitest PR #10267 | Drift risk if upstream API changes | When upstream releases | XS.)
The PR closes #553 but the AC for #535 (original incident) was "60 consecutive green workflow_dispatch CI runs against main after the fix is merged." Is that still the bar, or has #553's 20-cell parallel probe superseded it? Per ADR-012's "Acceptance criterion for #535" line vs the PR's "20-cell matrix" — these are slightly different shapes (60 sequential post-merge vs 20 parallel pre-merge). Worth clarifying in the merge commit message or in #535 directly so the closing condition for the original incident is unambiguous.
No user-facing impact mentioned. This is correct — it's a CI/test-infrastructure fix and shouldn't have one. But the PR template doesn't make this explicit. If you maintain a CHANGELOG.md or release notes, an "internal: closed CI flake from #553/#535" entry is the right shape.

LGTM on requirements rigor

The PR description's six-layer enforcement chain table (layer / what / where) is the kind of artefact I'd produce as the output of a Phase B6 repackage. Issue references (vitest #9957, #10267) are linked. Out-of-scope item is parked with its tracking issue. Test plan has explicit checkboxes mapped to verification commands.

This is what spec-dense issue work looks like.

— Elicit

## Elicit — Requirements Engineer & Business Analyst **Verdict: Approved with concerns** Working in Brownfield Phase B3 (issue triage) and B4 (gap analysis) mode for this PR. ### Issue-quality assessment of #553 → this PR The PR description meets the Definition of Ready bar I'd want for a solo-with-LLM workflow: - [x] Clear title (verb-noun: "close birpc race + duplicate-ID guard + backport") - [x] Closes a tracked issue (#553) - [x] Acceptance criterion is **measurable and falsifiable**: "20-cell matrix run, all green, zero `[birpc] rpc is closed` lines." This is exactly the kind of NFR target I push for — not "make the flake go away" but "20 consecutive green coverage runs against this SHA." Falsifiable. Re-verifiable. - [x] Defence-in-depth is layered and each layer is named, located, and explained - [x] Out-of-scope item explicitly parked: "#554 — factory mocks → prop injection migration (sveltest pattern). Out of scope for this PR." Excellent scope discipline. ### Concerns and gaps 1. **Acceptance criterion is gated on a workflow that hasn't reported back yet.** The PR description marks `coverage-flake-probe.yml` run #1604 as "Triggered" with empty checkboxes for lint/test/postinstall. **Until run #1604 reports all 20 cells green, the PR's primary acceptance evidence is pending.** This is fine for a draft, but the PR is not marked draft and is open for review. Recommendation: either mark Draft until #1604 completes, or post the run result as a follow-up comment before requesting merge. 2. **NFR coverage on the patch lifecycle is implicit, not explicit.** ADR-012 says "delete the patch when `@vitest/browser-playwright` publishes a release containing PR #10267" — but the *trigger* for that deletion isn't tracked. Suggestion: open a low-priority Gitea issue right now titled "Remove `@vitest+browser-playwright+4.1.0.patch` once upstream PR #10267 is released" and link it from the ADR. Without this, the patch becomes invisible technical debt. (TD entry shape: `TD-NN | Tactical backport of vitest PR #10267 | Drift risk if upstream API changes | When upstream releases | XS`.) 3. **The PR closes #553 but the AC for #535 (original incident) was "60 consecutive green `workflow_dispatch` CI runs against `main` after the fix is merged."** Is that still the bar, or has #553's 20-cell parallel probe superseded it? Per ADR-012's "Acceptance criterion for #535" line vs the PR's "20-cell matrix" — these are slightly different shapes (60 sequential post-merge vs 20 parallel pre-merge). Worth clarifying in the merge commit message or in #535 directly so the closing condition for the original incident is unambiguous. 4. **No user-facing impact mentioned.** This is correct — it's a CI/test-infrastructure fix and shouldn't have one. But the PR template doesn't make this explicit. If you maintain a `CHANGELOG.md` or release notes, an "internal: closed CI flake from #553/#535" entry is the right shape. ### LGTM on requirements rigor The PR description's six-layer enforcement chain table (layer / what / where) is the kind of artefact I'd produce as the output of a Phase B6 repackage. Issue references (vitest #9957, #10267) are linked. Out-of-scope item is parked with its tracking issue. Test plan has explicit checkboxes mapped to verification commands. This is what spec-dense issue work looks like. — Elicit

marcel commented

2026-05-13 10:07:23 +02:00

Nora "NullX" Steiner — Application Security Engineer

Verdict: Approved

This is a test-infrastructure fix. No security-sensitive surfaces are touched — no auth, no input validation, no JPQL, no permission annotations, no actuator config, no CORS, no headers. I reviewed the diff against my OWASP / CWE checklist and there is nothing for me to flag at the vulnerability level.

What I checked (and what came back clean)

patches/@vitest+browser-playwright+4.1.0.patch — Reviewed the patch hunk-by-hunk. It modifies the mocker's register / clear handlers to unroute an existing predicate before installing a new one. No new input parsing, no new HTTP surface, no privilege boundary crossed. The added unroute call operates on Playwright's in-process route registry — not on user-controllable data. No threat surface added.
postinstall: patch-package — patch-package is a well-known utility (DefinitelyTyped uses it, Next.js uses it). It applies a diff against node_modules. Risk is supply-chain (the package itself) rather than application code. The patch file lives in version control, is reviewable, and is identical for every contributor and every CI run. Acceptable.
coverage-flake-probe.yml — workflow_dispatch only, runs in the Playwright container, no secrets read, artefact upload is a coverage log (no credentials embedded). Workflow-level review: clean.
Meta-tests — Pure source-text scans against the local file tree. No eval, no Function, no dynamic-import of user-controlled paths. Read-only readFileSync. Clean.
ESLint rule additions — Static AST selectors, no shell-out, no network. Clean.

One systemic observation (not a finding)

The PR is a good worked example of the "every vulnerability fix starts with a failing test" principle I want to see applied broadly — even though this isn't a vulnerability fix, the pattern (findDuplicateMockIds() exported, six self-test fixtures bracketing the corpus scan) is exactly the shape I'd want for any future security regression test. The duplicate-ID meta-test is reusable as a template: scan corpus, extract pattern, assert invariant. Worth keeping in mind for the next security finding that needs codification.

LGTM with one teaching note

If we ever do find ourselves needing vi.mock of a sensitive module in browser tests (e.g. mocking the auth service to test logged-out flows), the ADR-012 invariants from this PR mean we'll be forced to do it via prop injection or vi.hoisted + sync factory — which structurally prevents the kind of test-time auth bypass that I'd worry about in a JSDOM/Mock setup. Indirect security benefit.

— Nora

## Nora "NullX" Steiner — Application Security Engineer **Verdict: Approved** This is a test-infrastructure fix. No security-sensitive surfaces are touched — no auth, no input validation, no JPQL, no permission annotations, no actuator config, no CORS, no headers. I reviewed the diff against my OWASP / CWE checklist and there is nothing for me to flag at the vulnerability level. ### What I checked (and what came back clean) - **`patches/@vitest+browser-playwright+4.1.0.patch`** — Reviewed the patch hunk-by-hunk. It modifies the mocker's `register` / `clear` handlers to unroute an existing predicate before installing a new one. No new input parsing, no new HTTP surface, no privilege boundary crossed. The added `unroute` call operates on Playwright's in-process route registry — not on user-controllable data. **No threat surface added.** - **`postinstall: patch-package`** — `patch-package` is a well-known utility (DefinitelyTyped uses it, Next.js uses it). It applies a diff against `node_modules`. Risk is supply-chain (the package itself) rather than application code. The patch file lives in version control, is reviewable, and is identical for every contributor and every CI run. Acceptable. - **`coverage-flake-probe.yml`** — `workflow_dispatch` only, runs in the Playwright container, no secrets read, artefact upload is a coverage log (no credentials embedded). Workflow-level review: clean. - **Meta-tests** — Pure source-text scans against the local file tree. No `eval`, no `Function`, no dynamic-`import` of user-controlled paths. Read-only `readFileSync`. Clean. - **ESLint rule additions** — Static AST selectors, no shell-out, no network. Clean. ### One systemic observation (not a finding) The PR is a good worked example of the "every vulnerability fix starts with a failing test" principle I want to see applied broadly — even though this isn't a vulnerability fix, the pattern (`findDuplicateMockIds()` exported, six self-test fixtures bracketing the corpus scan) is exactly the shape I'd want for any future security regression test. The duplicate-ID meta-test is reusable as a template: scan corpus, extract pattern, assert invariant. Worth keeping in mind for the next security finding that needs codification. ### LGTM with one teaching note If we ever do find ourselves *needing* `vi.mock` of a sensitive module in browser tests (e.g. mocking the auth service to test logged-out flows), the ADR-012 invariants from this PR mean we'll be forced to do it via prop injection or `vi.hoisted` + sync factory — which structurally prevents the kind of test-time auth bypass that I'd worry about in a JSDOM/Mock setup. Indirect security benefit. — Nora

marcel commented

2026-05-13 10:07:43 +02:00

Sara Holt — Senior QA Engineer

Verdict: Approved with concerns

This PR is in my domain. The whole change is test-infrastructure hardening. I'm happy with the strategy and most of the execution, but I have specific concerns about reliability and the test pyramid that I want addressed before this becomes the standard pattern.

What I like

Six-layer defence pyramid maps cleanly to the test-pyramid model. ESLint (static) → CI grep guard (pre-test fast feedback) → in-suite meta-test (test-time) → CI birpc assert (post-test symptom catch) → in-suite duplicate-ID guard → upstream backport. Each layer catches a different stage. This is exactly the shape I want for any class of regression.
Meta-tests are themselves tested. findDuplicateMockIds() and hasAsyncMockFactoryWithDynamicImport() are both exported and have unit-test fixtures before the corpus scan. 6 + 5 self-tests respectively, covering positive, negative, edge cases (.svelte.ts, single-file duplicates, both-spellings-in-one-file). When the regex is wrong, the unit tests fail before the corpus scan does. Strong TDD discipline.
Acceptance criterion is measurable. coverage-flake-probe.yml 20-cell matrix asserting zero [birpc] rpc is closed lines. This is a falsifiable quality gate, not "feels less flaky." It's the test plan I'd write.
Real DOM, not JSDOM. vitest-browser-svelte + Playwright is the right tool for catching this exact class of teardown race. The meta-test sits inside the same test runner that's affected — it can't be bypassed by disabling a separate lint job.

Concerns

The two meta-tests are duplicated patterns of "glob → readFileSync → scan → assert empty offenders." Currently 2 instances. If we add a third or fourth pattern guard, we'll have shotgun-surgery on the discovery code. Suggestion: extract findBrowserSpecs() and the __dirname / SRC_ROOT setup into src/__meta__/fixtures.ts (or similar) so future guards inherit the discovery. Defer if not blocking — 2 instances doesn't trigger the rule of three yet, but flag it.
no-duplicate-mock-ids.test.ts lines 119-129 — single assertion, but the report shape isn't great on failure. If the test fails, the expect(report).toEqual({}) produces a diff that lists every duplicate. Good. But the failure message doesn't tell the developer what to do — they'll need to find the ADR. Suggestion: when duplicates.size > 0, log a one-line message before expect: console.error('Duplicate vi.mock IDs found. See ADR-012 / fix: pick one canonical spelling.'). Cheap UX improvement.
Self-test fixtures are inline strings. Standard for this kind of meta-test, fine. But the regex-based scanner in no-async-mock-factories.test.ts doesn't have a fixture for the legitimate-await-importOriginal case (which would be a false positive). Felix flagged this too. Add one negative fixture and assert it does not match — even if the AST selector is the source of truth, the regex is what runs at corpus scan time.
coverage-flake-probe.yml has no timeout-minutes. Tobias caught this too. From a QA reliability standpoint: a hung cell pollutes the signal — was the cell flaky, or did it time out at 6 hours? Add timeout-minutes: 20 per cell.
No assertion that the patch applied. The npm ci postinstall step runs patch-package, but if the file is missing or fails to apply, postinstall exits non-zero and CI fails — good. However, there's no positive assertion that the patched code is actually in node_modules at test time. Could add a sanity check step: grep -F 'Backport of vitest PR #10267' frontend/node_modules/@vitest/browser-playwright/dist/index.js || exit 1. Optional — postinstall failure is already loud — but it would catch the case where someone runs npm install --ignore-scripts locally and pushes "works on my machine" thinking the upstream fix is in.

LGTM on pyramid placement

The new tests live at the right layer (unit/in-suite for the meta-guards, CI workflow for the flake probe). E2E suite untouched. No new Playwright tests added — correct, because this isn't a user-facing change. Layer separation respected.

Will flip to clean Approve once run #1604 lands green across all 20 cells and the timeout/false-positive concerns are addressed (or explicitly deferred with a tracking issue).

— Sara

## Sara Holt — Senior QA Engineer **Verdict: Approved with concerns** This PR is in my domain. The whole change is test-infrastructure hardening. I'm happy with the strategy and most of the execution, but I have specific concerns about reliability and the test pyramid that I want addressed before this becomes the standard pattern. ### What I like - **Six-layer defence pyramid maps cleanly to the test-pyramid model.** ESLint (static) → CI grep guard (pre-test fast feedback) → in-suite meta-test (test-time) → CI birpc assert (post-test symptom catch) → in-suite duplicate-ID guard → upstream backport. Each layer catches a different stage. **This is exactly the shape I want for any class of regression.** - **Meta-tests are themselves tested.** `findDuplicateMockIds()` and `hasAsyncMockFactoryWithDynamicImport()` are both exported and have unit-test fixtures *before* the corpus scan. 6 + 5 self-tests respectively, covering positive, negative, edge cases (`.svelte.ts`, single-file duplicates, both-spellings-in-one-file). When the regex is wrong, the unit tests fail before the corpus scan does. Strong TDD discipline. - **Acceptance criterion is measurable.** `coverage-flake-probe.yml` 20-cell matrix asserting zero `[birpc] rpc is closed` lines. This is a falsifiable quality gate, not "feels less flaky." It's the test plan I'd write. - **Real DOM, not JSDOM.** `vitest-browser-svelte` + Playwright is the right tool for catching this exact class of teardown race. The meta-test sits *inside* the same test runner that's affected — it can't be bypassed by disabling a separate lint job. ### Concerns 1. **The two meta-tests are duplicated patterns of "glob → readFileSync → scan → assert empty offenders."** Currently 2 instances. If we add a third or fourth pattern guard, we'll have shotgun-surgery on the discovery code. Suggestion: extract `findBrowserSpecs()` and the `__dirname / SRC_ROOT` setup into `src/__meta__/fixtures.ts` (or similar) so future guards inherit the discovery. Defer if not blocking — 2 instances doesn't trigger the rule of three yet, but flag it. 2. **`no-duplicate-mock-ids.test.ts` lines 119-129 — single assertion, but the report shape isn't great on failure.** If the test fails, the `expect(report).toEqual({})` produces a diff that lists *every* duplicate. Good. But the failure message doesn't tell the developer **what to do** — they'll need to find the ADR. Suggestion: when `duplicates.size > 0`, log a one-line message before `expect`: `console.error('Duplicate vi.mock IDs found. See ADR-012 / fix: pick one canonical spelling.')`. Cheap UX improvement. 3. **Self-test fixtures are inline strings.** Standard for this kind of meta-test, fine. But the regex-based scanner in `no-async-mock-factories.test.ts` doesn't have a fixture for the legitimate-`await-importOriginal` case (which would be a false positive). Felix flagged this too. Add one negative fixture and assert it does *not* match — even if the AST selector is the source of truth, the regex is what runs at corpus scan time. 4. **`coverage-flake-probe.yml` has no `timeout-minutes`.** Tobias caught this too. From a QA reliability standpoint: a hung cell pollutes the signal — was the cell flaky, or did it time out at 6 hours? Add `timeout-minutes: 20` per cell. 5. **No assertion that the patch *applied*.** The `npm ci` postinstall step runs `patch-package`, but if the file is missing or fails to apply, postinstall exits non-zero and CI fails — good. However, there's no positive assertion that the patched code is actually in `node_modules` at test time. Could add a sanity check step: `grep -F 'Backport of vitest PR #10267' frontend/node_modules/@vitest/browser-playwright/dist/index.js || exit 1`. Optional — postinstall failure is already loud — but it would catch the case where someone runs `npm install --ignore-scripts` locally and pushes "works on my machine" thinking the upstream fix is in. ### LGTM on pyramid placement The new tests live at the right layer (unit/in-suite for the meta-guards, CI workflow for the flake probe). E2E suite untouched. No new Playwright tests added — correct, because this isn't a user-facing change. Layer separation respected. Will flip to clean Approve once run #1604 lands green across all 20 cells and the timeout/false-positive concerns are addressed (or explicitly deferred with a tracking issue). — Sara

marcel commented

2026-05-13 10:07:56 +02:00

Leonie Voss — UX Designer & Accessibility Strategist

Verdict: Approved (LGTM — no UI surface in scope)

I reviewed the diff against my checklist (brand tokens, semantic HTML, WCAG AA contrast, 44px touch targets, focus visibility, dark-mode token remap, redundant cues, responsive 320px-first). None of those checks apply to this PR. This is a test-infrastructure and CI change — no visible UI is modified.

What I checked (and why each came back N/A)

frontend/src/routes/+layout.svelte — only line changed is the provideConfirmService import (.svelte.js → .svelte). No template, no styling, no accessibility surface affected.
frontend/src/routes/documents/[id]/+page.svelte — same single-line import normalisation.
NameHistoryEditCard.svelte, TagDeleteGuard.svelte — same. Import-spelling normalisation only.
EnrichmentBlock.svelte — diff is just $app/stores → $app/state and the derived expression now reads !!navigating.type. Same rendered output, same DOM, same accessibility surface, same focus order, same aria-busy and aria-label on the skeleton (preserved at lines 35-37). No regression in the user-facing surface. The skeleton's existing aria-label="Lade aktualisierte Liste" is German-locale-hardcoded — that's a pre-existing concern not introduced by this PR.
All other touched files — spec files, meta-tests, patch file, ADR, CI workflows. No user-facing surface.

One thing I want to note (for future PRs, not this one)

EnrichmentBlock.svelte's skeleton (line 34) uses aria-busy="true" + aria-label="Lade aktualisierte Liste" (German hardcoded). Once the i18n migration of skeletons is done, this should use Paraglide's m.* helpers — but this PR did not introduce that issue and is not the place to fix it. Tracking-wise, if there isn't already a "skeleton aria-labels via Paraglide" issue, that's worth filing separately. Not blocking.

Nothing to flag at the user-experience level

No brand token changes, no contrast regressions, no touch-target changes, no focus-indicator changes, no responsive breakpoint changes, no dark-mode token impact, no animation/motion changes. Skeleton's motion-reduce:animate-none class is preserved (line 34 — good, that's a WCAG 2.3.3 consideration).

I'd happily see this merge from a UI/UX standpoint.

— Leonie

## Leonie Voss — UX Designer & Accessibility Strategist **Verdict: Approved (LGTM — no UI surface in scope)** I reviewed the diff against my checklist (brand tokens, semantic HTML, WCAG AA contrast, 44px touch targets, focus visibility, dark-mode token remap, redundant cues, responsive 320px-first). **None of those checks apply to this PR.** This is a test-infrastructure and CI change — no visible UI is modified. ### What I checked (and why each came back N/A) - **`frontend/src/routes/+layout.svelte`** — only line changed is the `provideConfirmService` import (`.svelte.js` → `.svelte`). No template, no styling, no accessibility surface affected. - **`frontend/src/routes/documents/[id]/+page.svelte`** — same single-line import normalisation. - **`NameHistoryEditCard.svelte`, `TagDeleteGuard.svelte`** — same. Import-spelling normalisation only. - **`EnrichmentBlock.svelte`** — diff is just `$app/stores` → `$app/state` and the derived expression now reads `!!navigating.type`. Same rendered output, same DOM, same accessibility surface, same focus order, same `aria-busy` and `aria-label` on the skeleton (preserved at lines 35-37). **No regression in the user-facing surface.** The skeleton's existing `aria-label="Lade aktualisierte Liste"` is German-locale-hardcoded — that's a pre-existing concern not introduced by this PR. - **All other touched files** — spec files, meta-tests, patch file, ADR, CI workflows. No user-facing surface. ### One thing I want to note (for future PRs, not this one) `EnrichmentBlock.svelte`'s skeleton (line 34) uses `aria-busy="true"` + `aria-label="Lade aktualisierte Liste"` (German hardcoded). Once the i18n migration of skeletons is done, this should use Paraglide's `m.*` helpers — but **this PR did not introduce that issue and is not the place to fix it.** Tracking-wise, if there isn't already a "skeleton aria-labels via Paraglide" issue, that's worth filing separately. Not blocking. ### Nothing to flag at the user-experience level No brand token changes, no contrast regressions, no touch-target changes, no focus-indicator changes, no responsive breakpoint changes, no dark-mode token impact, no animation/motion changes. Skeleton's `motion-reduce:animate-none` class is preserved (line 34 — good, that's a WCAG 2.3.3 consideration). I'd happily see this merge from a UI/UX standpoint. — Leonie

marcel merged commit 2e6cc346ab into main

2026-05-13 12:55:49 +02:00

marcel deleted branch feat/issue-553-birpc-async-mock-factory

2026-05-13 12:55:50 +02:00

marcel referenced this pull request

2026-05-13 12:57:12 +02:00

ci(coverage): drop client-project branches threshold 80 → 70 to unblock CI #556

marcel referenced this pull request

2026-05-13 12:59:28 +02:00

ci(coverage): drop client-project branches threshold 80 → 70 to unblock CI #556

marcel referenced this pull request

2026-05-13 13:00:21 +02:00