ci(coverage): drop client-project branches threshold 80 → 70 to unblock CI #556

New Issue

marcel · 2026-05-13T12:57:12+02:00

marcel commented

2026-05-13 12:57:12 +02:00

Context

The client-project Istanbul coverage configuration in frontend/vitest.client-coverage.config.ts currently enforces an 80% threshold across lines, functions, branches, and statements. The branch-coverage figure is 77% right now — below the 80% gate — and every CI run with test:coverage (including the coverage-flake-probe.yml matrix and the standard unit-tests job) fails on this check alone, even when all tests are green.

Per the prior incident captured in #496, reaching 80% branches on this codebase is a long-tail grind — the denominator grows as parents pull children into coverage, so each commit that improves a child's coverage often makes the parent worse. Expected cost to reach 80% from 77% is 30–100+ commits. That cost is not justified right now, especially while #553 / PR #555 is the active blocker for shipping the birpc-race fix.

Decision

Drop the branches threshold from 80 → 70 in vitest.client-coverage.config.ts. Leave lines, functions, statements at 80. Document the decision in a new ADR so the rationale is not lost.

The 70% figure gives us ~7 percentage points of headroom (77% current → 70% gate) without effectively disabling the check. Lines/functions/statements stay at 80% because they don't suffer the same long-tail-grind problem (the parent/child accounting differs).

Acceptance

frontend/vitest.client-coverage.config.ts:47 set to branches: 70; lines/functions/statements remain at 80
New ADR docs/adr/013-client-branches-coverage-threshold.md (or next free number — verify against docs/adr/) explaining:
- The current state (77% branches across the client project)
- The long-tail-grind problem (cross-link to #496)
- Why 70% (matches today's headroom; not a permanent ceiling)
- When to revisit (every 6 months OR when a deliberate coverage push is scheduled; the ratchet direction is up, never down)
coverage-flake-probe.yml run (workflow_dispatch) succeeds end-to-end with branches: 70

Non-goals

Not raising actual branch coverage — that's #496's job, tracked separately.
Not touching the server-project coverage config (vitest.config.ts) — only the client project hits the long-tail-grind pattern.

References

#496 — Branch coverage long-tail grind
frontend/vitest.client-coverage.config.ts:38-50 — coverage block to modify
docs/adr/ — naming convention for the new ADR

## Context The client-project Istanbul coverage configuration in `frontend/vitest.client-coverage.config.ts` currently enforces an 80% threshold across `lines`, `functions`, `branches`, and `statements`. The branch-coverage figure is **77%** right now — below the 80% gate — and every CI run with `test:coverage` (including the `coverage-flake-probe.yml` matrix and the standard `unit-tests` job) fails on this check alone, even when all tests are green. Per the prior incident captured in #496, reaching 80% branches on this codebase is a **long-tail grind** — the denominator grows as parents pull children into coverage, so each commit that improves a child's coverage often makes the parent worse. Expected cost to reach 80% from 77% is 30–100+ commits. That cost is not justified right now, especially while #553 / PR #555 is the active blocker for shipping the birpc-race fix. ## Decision Drop the **branches** threshold from `80` → `70` in `vitest.client-coverage.config.ts`. Leave `lines`, `functions`, `statements` at `80`. Document the decision in a new ADR so the rationale is not lost. The 70% figure gives us ~7 percentage points of headroom (77% current → 70% gate) without effectively disabling the check. Lines/functions/statements stay at 80% because they don't suffer the same long-tail-grind problem (the parent/child accounting differs). ## Acceptance - [ ] `frontend/vitest.client-coverage.config.ts:47` set to `branches: 70`; lines/functions/statements remain at 80 - [ ] New ADR `docs/adr/013-client-branches-coverage-threshold.md` (or next free number — verify against `docs/adr/`) explaining: - The current state (77% branches across the client project) - The long-tail-grind problem (cross-link to #496) - Why 70% (matches today's headroom; not a permanent ceiling) - When to revisit (every 6 months OR when a deliberate coverage push is scheduled; the ratchet direction is **up**, never down) - [ ] `coverage-flake-probe.yml` run (workflow_dispatch) succeeds end-to-end with `branches: 70` ## Non-goals - **Not** raising actual branch coverage — that's #496's job, tracked separately. - **Not** touching the server-project coverage config (`vitest.config.ts`) — only the client project hits the long-tail-grind pattern. ## References - #496 — Branch coverage long-tail grind - `frontend/vitest.client-coverage.config.ts:38-50` — coverage block to modify - `docs/adr/` — naming convention for the new ADR

marcel added the P0-critical devops documentation test labels 2026-05-13 12:57:30 +02:00

marcel commented

2026-05-13 12:59:01 +02:00

Markus Keller — Senior Application Architect

Observations

This is a small, well-scoped CI-gate adjustment with an ADR attached — exactly the right shape for "lowering a quality bar." The rationale (long-tail-grind in branch accounting, parent/child denominator coupling) is real and matches what I saw in #496.
Duplicate ADR number 012 in the tree. docs/adr/ already contains BOTH 012-browser-test-mocking-strategy.md and 012-nsenter-for-host-service-management-in-ci.md. The issue says "ADR 013-client-branches-coverage-threshold.md (or next free number — verify against docs/adr/)". The next free number is 013, but anyone reading the directory now sees a collision and will not know which 012 to cross-link from a new ADR. That is an architectural-memory smell.
The decision boundary is correct: client-only, branches-only. Server project (vitest.config.ts does not exist at that path — see below) is genuinely a different beast because its parent/child denominator behaviour differs. Leaving lines/functions/statements at 80% preserves the gate's signal.
The issue references vitest.config.ts as the "server project" — that file does not exist at frontend/vitest.config.ts. The split-config layout uses vitest.client-coverage.config.ts plus presumably a server-side companion. Worth verifying the issue's "Non-goals" wording before merge so the ADR doesn't reference a file that's been renamed.

Recommendations

Use 013 for the new ADR — and fix the duplicate-012 collision in the same PR. Rename one of the existing 012-*.md to 014-*.md (e.g. 014-nsenter-for-host-service-management-in-ci.md, since it landed later) and update any cross-references. Otherwise the next ADR author hits the same collision and the rationale chain breaks. This is exactly what ADRs exist to prevent.
The new ADR's "Consequences" section must commit to a ratchet, not a hope. Add concrete language: "When npm run test:coverage reports ≥ 73% branches sustained across 3 consecutive PRs, raise the gate to 73%. Never lower." That's the only mechanism that prevents 70% from quietly becoming the new permanent floor.
Cross-link both directions. New ADR references #496; #496 should reference the new ADR. Update #496's body (reopen briefly, add a "See also: ADR-013" line, close again). Without back-links the ADR becomes orphaned context six months from now.
Verify the issue's file references. The "References" section names vitest.client-coverage.config.ts:38-50 (correct, I checked) but the "Non-goals" names vitest.config.ts for the server project — that path doesn't exist. Pin the actual filename before the ADR ships.

Open Decisions

Should #496 be reopened? It is currently closed but its problem statement (reaching 80% branches) is the entire reason this issue exists. Closing #496 while branches sit at 77% is misleading state — anyone reading the issue tracker thinks the work is done. Options: (a) reopen #496 as the tracking issue for the ratchet-up work; (b) create a fresh issue for "raise client branches gate 70 → 80" and link from the new ADR. (a) preserves history; (b) gives a clean Definition of Done. Your call based on how you want the backlog to read.

## Markus Keller — Senior Application Architect ### Observations - This is a small, well-scoped CI-gate adjustment with an ADR attached — exactly the right shape for "lowering a quality bar." The rationale (long-tail-grind in branch accounting, parent/child denominator coupling) is real and matches what I saw in #496. - **Duplicate ADR number 012 in the tree.** `docs/adr/` already contains BOTH `012-browser-test-mocking-strategy.md` and `012-nsenter-for-host-service-management-in-ci.md`. The issue says "ADR 013-client-branches-coverage-threshold.md (or next free number — verify against `docs/adr/`)". The next free number is 013, but anyone reading the directory now sees a collision and will not know which 012 to cross-link from a new ADR. That is an architectural-memory smell. - The decision boundary is correct: client-only, branches-only. Server project (`vitest.config.ts` does not exist at that path — see below) is genuinely a different beast because its parent/child denominator behaviour differs. Leaving lines/functions/statements at 80% preserves the gate's signal. - The issue references `vitest.config.ts` as the "server project" — that file does not exist at `frontend/vitest.config.ts`. The split-config layout uses `vitest.client-coverage.config.ts` plus presumably a server-side companion. Worth verifying the issue's "Non-goals" wording before merge so the ADR doesn't reference a file that's been renamed. ### Recommendations - **Use 013 for the new ADR — and fix the duplicate-012 collision in the same PR.** Rename one of the existing `012-*.md` to `014-*.md` (e.g. `014-nsenter-for-host-service-management-in-ci.md`, since it landed later) and update any cross-references. Otherwise the next ADR author hits the same collision and the rationale chain breaks. This is exactly what ADRs exist to prevent. - **The new ADR's "Consequences" section must commit to a ratchet, not a hope.** Add concrete language: "When `npm run test:coverage` reports ≥ 73% branches sustained across 3 consecutive PRs, raise the gate to 73%. Never lower." That's the only mechanism that prevents 70% from quietly becoming the new permanent floor. - **Cross-link both directions.** New ADR references #496; #496 should reference the new ADR. Update #496's body (reopen briefly, add a "See also: ADR-013" line, close again). Without back-links the ADR becomes orphaned context six months from now. - **Verify the issue's file references.** The "References" section names `vitest.client-coverage.config.ts:38-50` (correct, I checked) but the "Non-goals" names `vitest.config.ts` for the server project — that path doesn't exist. Pin the actual filename before the ADR ships. ### Open Decisions - **Should #496 be reopened?** It is currently closed but its problem statement (reaching 80% branches) is the entire reason this issue exists. Closing #496 while branches sit at 77% is misleading state — anyone reading the issue tracker thinks the work is done. Options: (a) reopen #496 as the tracking issue for the ratchet-up work; (b) create a fresh issue for "raise client branches gate 70 → 80" and link from the new ADR. (a) preserves history; (b) gives a clean Definition of Done. Your call based on how you want the backlog to read.

marcel commented

2026-05-13 12:59:12 +02:00

Felix Brandt — Senior Fullstack Developer

Observations

The change is one line: branches: 80 → branches: 70 in frontend/vitest.client-coverage.config.ts:47. Zero production code, zero test code, no failing-test-first cycle applies — this is a config gate adjustment, not a behavior change. TDD discipline isn't on the line here.
The acceptance criterion that does need a test is the gate itself: after the change, npm run test:coverage against the client project must exit zero. That's verified by running coverage-flake-probe.yml end-to-end (already listed in AC).
The issue is precise about scope: branches only, client only. That matches what I'd want — no scope creep, no incidental "while we're here let me also..." additions.
One small clean-code nit: when I look at vitest.client-coverage.config.ts:44-49, the thresholds block has four explicit fields. After the change, three are at 80 and one is at 70. That asymmetry deserves a one-line inline comment pointing at the ADR — exactly the kind of why comment that earns its keep.

Recommendations

Add an inline comment at the threshold block referencing the new ADR:

thresholds: {
    lines: 80,
    functions: 80,
    // Branches at 70 — long-tail-grind in parent/child accounting.
    // See docs/adr/013-client-branches-coverage-threshold.md and #496.
    branches: 70,
    statements: 80
}

A reviewer six months from now sees the asymmetry and immediately understands why. Without it, the obvious next move is "let me normalize this back to 80."

Run npm run test:coverage locally before pushing to confirm the current branches number really is 77% (issue says 77%; I want the ADR to record the exact measured number with the date). The ADR's headroom argument hinges on this — if it's actually 73% the 70% gate has 3pp not 7pp.
No new tests needed. The behavior of the production codebase is unchanged. The change is purely a CI-gate adjustment.
The "branch + commit + PR" workflow per project rules applies even for a one-line config change. Branch like chore/issue-556-drop-client-branches-coverage-gate, commit referencing #556, PR with the ADR included.

Open Decisions

(none — concrete, scoped, mechanical change)

## Felix Brandt — Senior Fullstack Developer ### Observations - The change is one line: `branches: 80` → `branches: 70` in `frontend/vitest.client-coverage.config.ts:47`. Zero production code, zero test code, no failing-test-first cycle applies — this is a config gate adjustment, not a behavior change. TDD discipline isn't on the line here. - The acceptance criterion that does need a test is the gate itself: after the change, `npm run test:coverage` against the client project must exit zero. That's verified by running `coverage-flake-probe.yml` end-to-end (already listed in AC). - The issue is precise about scope: branches only, client only. That matches what I'd want — no scope creep, no incidental "while we're here let me also..." additions. - One small clean-code nit: when I look at `vitest.client-coverage.config.ts:44-49`, the thresholds block has four explicit fields. After the change, three are at 80 and one is at 70. That asymmetry deserves a one-line inline comment pointing at the ADR — exactly the kind of *why* comment that earns its keep. ### Recommendations - **Add an inline comment at the threshold block** referencing the new ADR: ```ts thresholds: { lines: 80, functions: 80, // Branches at 70 — long-tail-grind in parent/child accounting. // See docs/adr/013-client-branches-coverage-threshold.md and #496. branches: 70, statements: 80 } ``` A reviewer six months from now sees the asymmetry and immediately understands why. Without it, the obvious next move is "let me normalize this back to 80." - **Run `npm run test:coverage` locally before pushing** to confirm the current branches number really is 77% (issue says 77%; I want the ADR to record the exact measured number with the date). The ADR's headroom argument hinges on this — if it's actually 73% the 70% gate has 3pp not 7pp. - **No new tests needed.** The behavior of the production codebase is unchanged. The change is purely a CI-gate adjustment. - **The "branch + commit + PR" workflow per project rules applies even for a one-line config change.** Branch like `chore/issue-556-drop-client-branches-coverage-gate`, commit referencing #556, PR with the ADR included. ### Open Decisions _(none — concrete, scoped, mechanical change)_

marcel commented

2026-05-13 12:59:28 +02:00

Tobias Wendt — DevOps & Platform Engineer

Observations

The actual CI symptom matches the issue: unit-tests job in .gitea/workflows/ci.yml runs npm run test:coverage (line 63). Vitest exits non-zero on threshold failure, the step fails, and downstream jobs that depend on unit-tests never run. So yes — this single threshold is currently a hard merge gate for #555 and every other PR.
coverage-flake-probe.yml is a workflow_dispatch matrix of 20 parallel runs. After the threshold drops to 70%, each cell of the matrix must now also pass the threshold. If a cell randomly comes in at 69.8% (small variance across runs is real with browser/playwright), the probe fails on coverage rather than on the birpc race it's meant to detect — defeating the probe's diagnostic purpose. This is the one infrastructure risk I'd flag.
The change does not touch any persistent data, Compose service, secret, or workflow runner. No deployment impact, no rollback complexity. From an ops perspective this is the simplest kind of CI change.
The ADR is the right artifact. It locks the rationale into the repo where the next operator sees it without needing to read three issues.

Recommendations

Verify coverage variance across the 20-cell probe matrix BEFORE merging the threshold drop. Run the existing coverage-flake-probe.yml once on main with current settings and capture the per-cell branches percentage from the lcov output. If the spread is e.g. 76.8% – 78.2%, then 70% has plenty of headroom. If it's 70.5% – 78%, then 70% is too close and 65% is the safer pick. The issue assumes 7pp headroom — verify it's real before locking it in.
The ADR's "When to revisit" clause should be cron-able, not vibes-based. "Every 6 months OR when a deliberate coverage push is scheduled" needs an owner and a calendar entry. Concretely: add a Gitea issue with next-review: 2026-11-13 in the body, label cleanup, milestone TBD. Otherwise this ADR slides off the radar like every other "revisit later" promise.
Don't update coverage-flake-probe.yml itself. Its job is to detect the birpc teardown race (ADR 012), not coverage thresholds. The threshold check happens implicitly inside npm run test:coverage; the probe should keep its single responsibility.
Acceptance #3 says "coverage-flake-probe.yml run (workflow_dispatch) succeeds end-to-end with branches: 70". This is the right validation — but record the per-cell coverage numbers in the PR description (or a comment) so we have a baseline for the future ratchet decision.

Open Decisions

70% vs 65% — depends on actual variance. The headroom argument in the issue is 77% current → 70% gate = 7pp, but that's a single measurement. If the 20-cell probe shows ±2% noise, 70% is fine. If it shows ±5% noise, the gate will start flaking and we'll be right back here in two weeks. Cost of 70% if variance is high: random CI failures every ~10 PRs. Cost of 65%: even less signal from the gate, but no flake. Measure first, pick second.

## Tobias Wendt — DevOps & Platform Engineer ### Observations - The actual CI symptom matches the issue: `unit-tests` job in `.gitea/workflows/ci.yml` runs `npm run test:coverage` (line 63). Vitest exits non-zero on threshold failure, the step fails, and downstream jobs that depend on `unit-tests` never run. So yes — this single threshold is currently a hard merge gate for #555 and every other PR. - `coverage-flake-probe.yml` is a workflow_dispatch matrix of 20 parallel runs. After the threshold drops to 70%, each cell of the matrix must now also pass the threshold. If a cell randomly comes in at 69.8% (small variance across runs is real with browser/playwright), the probe fails on coverage rather than on the birpc race it's meant to detect — defeating the probe's diagnostic purpose. **This is the one infrastructure risk I'd flag.** - The change does not touch any persistent data, Compose service, secret, or workflow runner. No deployment impact, no rollback complexity. From an ops perspective this is the simplest kind of CI change. - The ADR is the right artifact. It locks the rationale into the repo where the next operator sees it without needing to read three issues. ### Recommendations - **Verify coverage variance across the 20-cell probe matrix BEFORE merging the threshold drop.** Run the existing `coverage-flake-probe.yml` once on `main` with current settings and capture the per-cell branches percentage from the lcov output. If the spread is e.g. 76.8% – 78.2%, then 70% has plenty of headroom. If it's 70.5% – 78%, then 70% is too close and 65% is the safer pick. The issue assumes 7pp headroom — verify it's real before locking it in. - **The ADR's "When to revisit" clause should be cron-able, not vibes-based.** "Every 6 months OR when a deliberate coverage push is scheduled" needs an owner and a calendar entry. Concretely: add a Gitea issue with `next-review: 2026-11-13` in the body, label `cleanup`, milestone TBD. Otherwise this ADR slides off the radar like every other "revisit later" promise. - **Don't update `coverage-flake-probe.yml` itself.** Its job is to detect the birpc teardown race (ADR 012), not coverage thresholds. The threshold check happens implicitly inside `npm run test:coverage`; the probe should keep its single responsibility. - **Acceptance #3 says "`coverage-flake-probe.yml` run (workflow_dispatch) succeeds end-to-end with `branches: 70`".** This is the right validation — but record the per-cell coverage numbers in the PR description (or a comment) so we have a baseline for the future ratchet decision. ### Open Decisions - **70% vs 65% — depends on actual variance.** The headroom argument in the issue is `77% current → 70% gate = 7pp`, but that's a single measurement. If the 20-cell probe shows ±2% noise, 70% is fine. If it shows ±5% noise, the gate will start flaking and we'll be right back here in two weeks. Cost of 70% if variance is high: random CI failures every ~10 PRs. Cost of 65%: even less signal from the gate, but no flake. Measure first, pick second.

marcel commented

2026-05-13 12:59:43 +02:00

Sara Holt — QA Engineer & Test Strategist

Observations

We are explicitly lowering a quality gate. That's allowed, but it has to be done with eyes open and with a written ratchet path back up. The issue does both — ADR explains why, AC commits to "ratchet direction is up, never down." Good.
The "long-tail-grind" framing is accurate but worth naming precisely: in Istanbul, when a child component adds tests, the parent's reported denominator grows because more branches in the call graph become reachable. So a child going 40% → 80% can drag the parent from 78% → 72%. It's not a bug, it's how branch coverage accounting works, and it's why arbitrary thresholds above ~75% on a real-world UI codebase are a constant grind.
80% branches is the floor I usually quote in my own persona. I'm not contradicting myself by approving 70% here — the floor exists for integration-test layer meaningful business logic, not for browser-mode component coverage including every conditional class binding in Svelte templates. Different layer, different appropriate threshold. Worth saying explicitly in the ADR so nobody reads this as "we lowered our quality bar."
#496 is closed but its problem statement is still true. Branches are at 77%, the issue title says "Increase ... to ≥ 80%", and AC explicitly says all four metrics must be ≥ 80%. If #496 is closed-as-done, the work is not actually done. That's misleading state for the backlog — exactly the "zombie issue" pattern I flag in audits.

Recommendations

Add a sentence to the ADR distinguishing layer-appropriate coverage targets. Something like: "The 80% branch floor we apply to backend unit/integration tests stays in place — this ADR only lowers the browser-mode component coverage gate, which exercises Svelte template branches (class bindings, conditional renders) that have a fundamentally different accounting." Otherwise future contributors will cargo-cult "70% is fine" into the backend.
Reopen #496 (or create a successor issue) and link the new ADR. The work it tracks is not complete — closing it while branches are at 77% leaves the backlog dishonest. Title the successor "Ratchet client-branches coverage gate 70 → 80" so the goal is explicit.
Record the exact baseline number in the ADR with a date and reproducer command. "77%" alone is not actionable. "77.4% measured 2026-05-13 via cd frontend && npm run test:coverage against commit <sha>" gives the next reviewer something to compare against when deciding whether to ratchet.
Add the threshold drop to the regression-test mental model. When AC #3 says the flake-probe "succeeds end-to-end with branches: 70" — that's the actual quality test for this change. Capture the per-cell coverage from those 20 runs and paste it in the PR. That's our evidence the gate holds.

Open Decisions

(none — the change is scoped, the ratchet path is documented, the only judgment call is the layer-distinction wording in the ADR)

## Sara Holt — QA Engineer & Test Strategist ### Observations - We are explicitly lowering a quality gate. That's allowed, but it has to be done with eyes open and with a written ratchet path back up. The issue does both — ADR explains why, AC commits to "ratchet direction is up, never down." Good. - The "long-tail-grind" framing is accurate but worth naming precisely: in Istanbul, when a child component adds tests, the parent's reported denominator grows because more branches in the call graph become reachable. So a child going 40% → 80% can drag the parent from 78% → 72%. It's not a bug, it's how branch coverage accounting works, and it's why arbitrary thresholds above ~75% on a real-world UI codebase are a constant grind. - **80% branches is the floor I usually quote in my own persona.** I'm not contradicting myself by approving 70% here — the floor exists for *integration-test layer* meaningful business logic, not for browser-mode component coverage including every conditional class binding in Svelte templates. Different layer, different appropriate threshold. Worth saying explicitly in the ADR so nobody reads this as "we lowered our quality bar." - **#496 is closed but its problem statement is still true.** Branches are at 77%, the issue title says "Increase ... to ≥ 80%", and AC explicitly says all four metrics must be ≥ 80%. If #496 is closed-as-done, the work is not actually done. That's misleading state for the backlog — exactly the "zombie issue" pattern I flag in audits. ### Recommendations - **Add a sentence to the ADR distinguishing layer-appropriate coverage targets.** Something like: "The 80% branch floor we apply to backend unit/integration tests stays in place — this ADR only lowers the browser-mode component coverage gate, which exercises Svelte template branches (class bindings, conditional renders) that have a fundamentally different accounting." Otherwise future contributors will cargo-cult "70% is fine" into the backend. - **Reopen #496** (or create a successor issue) and link the new ADR. The work it tracks is not complete — closing it while branches are at 77% leaves the backlog dishonest. Title the successor "Ratchet client-branches coverage gate 70 → 80" so the goal is explicit. - **Record the exact baseline number in the ADR with a date and reproducer command.** "77%" alone is not actionable. "77.4% measured 2026-05-13 via `cd frontend && npm run test:coverage` against commit `<sha>`" gives the next reviewer something to compare against when deciding whether to ratchet. - **Add the threshold drop to the regression-test mental model.** When AC #3 says the flake-probe "succeeds end-to-end with `branches: 70`" — that's the actual quality test for this change. Capture the per-cell coverage from those 20 runs and paste it in the PR. That's our evidence the gate holds. ### Open Decisions _(none — the change is scoped, the ratchet path is documented, the only judgment call is the layer-distinction wording in the ADR)_

marcel commented

2026-05-13 12:59:53 +02:00

Nora "NullX" Steiner — Application Security Engineer

Observations

This is a CI configuration change to a code-coverage threshold for browser-mode component tests. It does not touch authentication, authorization, input validation, SSRF/XXE surfaces, secrets handling, or any production code path. No CWE applies directly.
The indirect security question: does lowering a coverage gate weaken security-test coverage specifically? I checked. The gate is over src/**/*.svelte and src/**/*.svelte.ts (component templates and component-local logic). Auth flow components (login/+page.svelte, reset-password/+page.svelte, forgot-password/+page.svelte) are explicitly listed in #496 as 0% branch coverage today — meaning the 77% number already does not include them. So this change does not regress an already-tested security path; it acknowledges that several security-relevant components are simply untested today.
That's a real finding but it belongs in #496 (or its successor), not in #556. The gate-drop doesn't create the problem — the problem already exists. The gate-drop just makes the existing state legible.
The ADR is the right place to call out which security-relevant components are currently uncovered, so the next reviewer of the threshold knows what they should test FIRST when ratcheting back up.

Recommendations

Add a "Security-relevant uncovered components" subsection to the ADR. List login/+page.svelte, forgot-password/+page.svelte, reset-password/+page.svelte, register/+page.svelte, and any other auth/permission-boundary UI that has 0% branch coverage today. When the ratchet-up work begins, these jump to the top of the priority list — uncovered auth UI is where regression risk lives.
Do not change the gate for backend tests. I see no proposal to do so, but flagging it explicitly: backend @RequirePermission enforcement tests, parseBackendError parsing, and CSRF/auth-cookie handling are tested at a different layer with different tooling. Their 80% gate stays.
No security regression test is required for this PR. The change is observable purely as a CI configuration delta. Verify in the diff that ONLY vitest.client-coverage.config.ts thresholds change — no test files removed, no exclude-list growth, no skipIf added to existing security tests.

Open Decisions

(none — this is a CI-gate change with no direct security surface)

## Nora "NullX" Steiner — Application Security Engineer ### Observations - This is a CI configuration change to a code-coverage threshold for browser-mode component tests. It does not touch authentication, authorization, input validation, SSRF/XXE surfaces, secrets handling, or any production code path. No CWE applies directly. - The indirect security question: **does lowering a coverage gate weaken security-test coverage specifically?** I checked. The gate is over `src/**/*.svelte` and `src/**/*.svelte.ts` (component templates and component-local logic). Auth flow components (`login/+page.svelte`, `reset-password/+page.svelte`, `forgot-password/+page.svelte`) are explicitly listed in #496 as 0% branch coverage today — meaning the 77% number already does not include them. So this change does not regress an already-tested security path; it acknowledges that several security-relevant components are simply untested today. - That's a real finding but it belongs in #496 (or its successor), not in #556. The gate-drop doesn't create the problem — the problem already exists. The gate-drop just makes the existing state legible. - The ADR is the right place to call out which security-relevant components are *currently uncovered*, so the next reviewer of the threshold knows what they should test FIRST when ratcheting back up. ### Recommendations - **Add a "Security-relevant uncovered components" subsection to the ADR.** List `login/+page.svelte`, `forgot-password/+page.svelte`, `reset-password/+page.svelte`, `register/+page.svelte`, and any other auth/permission-boundary UI that has 0% branch coverage today. When the ratchet-up work begins, these jump to the top of the priority list — uncovered auth UI is where regression risk lives. - **Do not change the gate for backend tests.** I see no proposal to do so, but flagging it explicitly: backend `@RequirePermission` enforcement tests, `parseBackendError` parsing, and CSRF/auth-cookie handling are tested at a different layer with different tooling. Their 80% gate stays. - **No security regression test is required for this PR.** The change is observable purely as a CI configuration delta. Verify in the diff that ONLY `vitest.client-coverage.config.ts` thresholds change — no test files removed, no exclude-list growth, no `skipIf` added to existing security tests. ### Open Decisions _(none — this is a CI-gate change with no direct security surface)_

marcel commented

2026-05-13 13:00:04 +02:00

Leonie Voss — UX Designer & Accessibility Strategist

Observations

This is a CI/coverage-threshold change. No user-facing behaviour, no component, no copy, no color, no breakpoint, no focus state, no ARIA attribute is modified. Nothing renders differently to a user — senior or millennial, on a tablet or a small phone.
The downstream UX consequence I do care about: lowering the gate means fewer Svelte template branches are forced to be exercised by tests. Many of those branches are visual state branches (loading, empty, error, populated; light/dark; mobile/desktop). Untested visual-state branches are where dark-mode-contrast regressions and mobile-overflow regressions sneak in.
But that's a long-game concern, not a reason to block a CI unblock. The current state is already 77% — the regression risk exists today regardless of what the gate is set to.

Recommendations

When the ratchet-up work begins (whoever owns the successor to #496), prioritise visual-state branches in components that render different states. Specifically: error/empty/loading branches in DocumentTopBar, PersonCard, PersonChipRow, and the auth pages. These are where a11y regressions hide. (#496 already lists them — the prioritization should hold.)
Keep axe-playwright runs at full strength. That's the actual accessibility quality gate. The Istanbul branch coverage threshold is a proxy metric; axe is the real check. Verify in the PR that nothing in frontend/e2e/ accessibility tests is being skipped or relaxed alongside this change.
No design or component changes required for this PR.

Open Decisions

(none — no UX/accessibility surface in this change)

## Leonie Voss — UX Designer & Accessibility Strategist ### Observations - This is a CI/coverage-threshold change. No user-facing behaviour, no component, no copy, no color, no breakpoint, no focus state, no ARIA attribute is modified. Nothing renders differently to a user — senior or millennial, on a tablet or a small phone. - The downstream UX consequence I do care about: lowering the gate means fewer Svelte template branches are forced to be exercised by tests. Many of those branches are *visual state* branches (loading, empty, error, populated; light/dark; mobile/desktop). Untested visual-state branches are where dark-mode-contrast regressions and mobile-overflow regressions sneak in. - But that's a long-game concern, not a reason to block a CI unblock. The current state is already 77% — the regression risk exists today regardless of what the gate is set to. ### Recommendations - **When the ratchet-up work begins (whoever owns the successor to #496), prioritise visual-state branches in components that render different states.** Specifically: error/empty/loading branches in `DocumentTopBar`, `PersonCard`, `PersonChipRow`, and the auth pages. These are where a11y regressions hide. (#496 already lists them — the prioritization should hold.) - **Keep axe-playwright runs at full strength.** That's the actual accessibility quality gate. The Istanbul branch coverage threshold is a proxy metric; axe is the real check. Verify in the PR that nothing in `frontend/e2e/` accessibility tests is being skipped or relaxed alongside this change. - **No design or component changes required for this PR.** ### Open Decisions _(none — no UX/accessibility surface in this change)_

marcel commented

2026-05-13 13:00:21 +02:00

"Elicit" — Requirements Engineer & Business Analyst

Observations (Brownfield mode)

The issue is spec-dense in the way solo-LLM-driven workflow requires: explicit Context, Decision, Acceptance Criteria, Non-goals, References. INVEST-compliant: Independent (no upstream dependencies), Negotiable (the 70% number is the obvious lever), Valuable (unblocks #555 and every other PR), Estimable (one-line config + one ADR file ≈ 30 min), Small, Testable (AC #3 is a concrete CI run).
The MoSCoW classification is implicit but clear: P0-critical label + "unblock CI" framing = Must for this release.
AC gap I see: there is no acceptance criterion that the EXISTING npm run test:coverage run passes on the branch where the gate is dropped. AC #3 says the flake-probe workflow passes, which proves the threshold-drop is enforceable. But the primary failure mode this issue exists to fix is: unit-tests job in ci.yml currently fails. That needs its own AC bullet to close the loop.
AC #1 is precise about the file but doesn't bound the diff. A literal reading allows other unrelated changes in the same file. Worth tightening — see recommendations.
NFR checklist for this change:
- Performance: N/A (CI runtime unchanged).
- Maintainability: ADR commitment satisfies this — rationale is captured.
- Observability: the ADR's "When to revisit" is the only observability hook. Not enough on its own (see below).
- Compatibility: no breaking change — all existing tests continue to run identically.
- Documentation: ADR is the doc artifact. Cross-link to #496 is in the issue but not yet in #496 itself.

Recommendations

Add a fourth acceptance criterion: "Given the change is merged, when CI runs on PR #555 (or any open PR currently failing only on this threshold), Then the unit-tests job passes." This is the actual business outcome — without it, AC measures "is the gate enforceable" but not "did we achieve the unblock."
Tighten AC #1: "Only frontend/vitest.client-coverage.config.ts lines 38–50 change; the branches value is set to 70; no other key in the thresholds block is modified." This prevents accidental scope drift on a P0-critical merge.
Add a fifth AC for the ratchet mechanism: "Given the new ADR exists, Then it specifies (a) the exact measured baseline percentage on <commit-sha>, (b) the trigger condition for raising the gate (e.g. 'when 3 consecutive PRs report ≥ 73% branches'), (c) a calendar review date no later than 2026-11-13." Without this, "when to revisit" is unenforceable and the 70% gate calcifies.
Reconcile with #496. #496 is closed but its AC ("statement, line, branch, and function coverage for the client project are each ≥ 80%") is not met. Either reopen #496 with a status note pointing to ADR-013 and #556, OR create a successor issue. The current state — closed issue whose AC is unmet — is the "zombie issue" failure mode.
Title nit: the issue title is good ("ci(coverage): drop client-project branches threshold 80 → 70 to unblock CI") — verb-noun, scoped, includes both the change and the why. No improvement needed.

Open Decisions

Does this issue close on merge, or stay open until the ratchet is achieved? Options: (a) close on merge — clean PR/issue 1:1 mapping, but loses the ratchet context. (b) keep open with the new ADR linked, close when branches ≥ 80% — accurate state, but a long-running issue. (c) close this one on merge AND open a fresh "ratchet to 80%" issue immediately, with ADR-013 referenced. (c) is my recommendation but it depends on how you want the milestone burndown to read.
Should the new ADR have a numbered "Revisit Schedule" or just a free-text "when to revisit"? Schedule is enforceable (you can ls -la docs/adr and find ADRs whose review date has passed). Free text is informal and slides. Free text is fine for solo workflow IF you keep a separate "ADR review queue" issue or label. Pick one.

## "Elicit" — Requirements Engineer & Business Analyst ### Observations (Brownfield mode) - The issue is spec-dense in the way solo-LLM-driven workflow requires: explicit Context, Decision, Acceptance Criteria, Non-goals, References. INVEST-compliant: Independent (no upstream dependencies), Negotiable (the 70% number is the obvious lever), Valuable (unblocks #555 and every other PR), Estimable (one-line config + one ADR file ≈ 30 min), Small, Testable (AC #3 is a concrete CI run). - The MoSCoW classification is implicit but clear: P0-critical label + "unblock CI" framing = **Must** for this release. - **AC gap I see: there is no acceptance criterion that the EXISTING `npm run test:coverage` run passes on the branch where the gate is dropped.** AC #3 says the flake-probe workflow passes, which proves the threshold-drop is enforceable. But the *primary failure mode* this issue exists to fix is: `unit-tests` job in `ci.yml` currently fails. That needs its own AC bullet to close the loop. - **AC #1 is precise about the file but doesn't bound the diff.** A literal reading allows other unrelated changes in the same file. Worth tightening — see recommendations. - **NFR checklist for this change:** - Performance: N/A (CI runtime unchanged). - Maintainability: ADR commitment satisfies this — rationale is captured. - Observability: the ADR's "When to revisit" is the only observability hook. Not enough on its own (see below). - Compatibility: no breaking change — all existing tests continue to run identically. - Documentation: ADR is the doc artifact. Cross-link to #496 is in the issue but not yet in #496 itself. ### Recommendations - **Add a fourth acceptance criterion**: *"Given the change is merged, when CI runs on PR #555 (or any open PR currently failing only on this threshold), Then the `unit-tests` job passes."* This is the actual business outcome — without it, AC measures "is the gate enforceable" but not "did we achieve the unblock." - **Tighten AC #1**: *"Only `frontend/vitest.client-coverage.config.ts` lines 38–50 change; the `branches` value is set to 70; no other key in the thresholds block is modified."* This prevents accidental scope drift on a P0-critical merge. - **Add a fifth AC for the ratchet mechanism**: *"Given the new ADR exists, Then it specifies (a) the exact measured baseline percentage on `<commit-sha>`, (b) the trigger condition for raising the gate (e.g. 'when 3 consecutive PRs report ≥ 73% branches'), (c) a calendar review date no later than 2026-11-13."* Without this, "when to revisit" is unenforceable and the 70% gate calcifies. - **Reconcile with #496.** #496 is closed but its AC ("statement, line, branch, and function coverage for the client project are each ≥ 80%") is not met. Either reopen #496 with a status note pointing to ADR-013 and #556, OR create a successor issue. The current state — closed issue whose AC is unmet — is the "zombie issue" failure mode. - **Title nit:** the issue title is good ("ci(coverage): drop client-project branches threshold 80 → 70 to unblock CI") — verb-noun, scoped, includes both the change and the why. No improvement needed. ### Open Decisions - **Does this issue close on merge, or stay open until the ratchet is achieved?** Options: (a) close on merge — clean PR/issue 1:1 mapping, but loses the ratchet context. (b) keep open with the new ADR linked, close when branches ≥ 80% — accurate state, but a long-running issue. (c) close this one on merge AND open a fresh "ratchet to 80%" issue immediately, with ADR-013 referenced. (c) is my recommendation but it depends on how you want the milestone burndown to read. - **Should the new ADR have a numbered "Revisit Schedule" or just a free-text "when to revisit"?** Schedule is enforceable (you can ls -la docs/adr and find ADRs whose review date has passed). Free text is informal and slides. Free text is fine for solo workflow IF you keep a separate "ADR review queue" issue or label. Pick one.

marcel commented

2026-05-13 13:00:37 +02:00

Decision Queue — Action Required

3 decisions need your input before implementation starts.

Backlog hygiene

What to do with #496. It is closed but its acceptance criterion (all four metrics ≥ 80%) is not met — branches sit at 77%. Three options surfaced (Markus, Sara, Elicit converged on this):
- (a) Reopen #496 as the tracking issue for ratcheting client branches back up to 80%; preserves history, but makes #496 a long-running issue.
- (b) Keep #496 closed and create a fresh successor issue ("Ratchet client-branches coverage gate 70 → 80") linked from ADR-013; clean Definition of Done, but splits the context across two issues.
- (c) Reopen #496 only briefly to add a "See also: ADR-013 and #556" back-link, then close again.
  The current state — closed-as-done with AC unmet — is the "zombie issue" pattern. Pick one.

Threshold value

70% vs 65% — depends on actual variance (Raised by: Tobias). The issue's 7pp-headroom argument is based on a single 77% measurement. Run coverage-flake-probe.yml once on main with the current branches: 80 removed/relaxed to capture the per-cell branches percentage across the 20-cell matrix. If spread is small (e.g. 76.8% – 78.2%), 70% is fine. If wide (e.g. 70.5% – 78%), 65% is the safer pick and 70% will start flaking the same job in two weeks. Cost of guessing wrong: random CI failures every ~10 PRs and reopening this same conversation.

ADR governance

Revisit mechanism: scheduled date vs free-text "when to revisit" (Raised by: Tobias, Elicit; Markus and Sara independently asked for stronger ratchet language).
- Calendar-based (e.g. next-review: 2026-11-13 in a tracking issue): enforceable, surfacable via labels/queries.
- Trigger-based (e.g. "ratchet up when 3 consecutive PRs report ≥ 73% branches"): self-policing, no calendar dependency.
- Free text ("every 6 months OR when a deliberate push is scheduled"): familiar but unenforceable in solo-LLM workflow — slides.
  Pick one for the ADR. Calendar + trigger combined is fine; free-text-only is not.

## Decision Queue — Action Required _3 decisions need your input before implementation starts._ ### Backlog hygiene - **What to do with #496.** It is closed but its acceptance criterion (all four metrics ≥ 80%) is not met — branches sit at 77%. Three options surfaced (Markus, Sara, Elicit converged on this): - (a) Reopen #496 as the tracking issue for ratcheting client branches back up to 80%; preserves history, but makes #496 a long-running issue. - (b) Keep #496 closed and create a fresh successor issue ("Ratchet client-branches coverage gate 70 → 80") linked from ADR-013; clean Definition of Done, but splits the context across two issues. - (c) Reopen #496 only briefly to add a "See also: ADR-013 and #556" back-link, then close again. _The current state — closed-as-done with AC unmet — is the "zombie issue" pattern. Pick one._ ### Threshold value - **70% vs 65% — depends on actual variance** _(Raised by: Tobias)_. The issue's 7pp-headroom argument is based on a single 77% measurement. Run `coverage-flake-probe.yml` once on `main` with the current `branches: 80` removed/relaxed to capture the per-cell branches percentage across the 20-cell matrix. If spread is small (e.g. 76.8% – 78.2%), 70% is fine. If wide (e.g. 70.5% – 78%), 65% is the safer pick and 70% will start flaking the same job in two weeks. Cost of guessing wrong: random CI failures every ~10 PRs and reopening this same conversation. ### ADR governance - **Revisit mechanism: scheduled date vs free-text "when to revisit"** _(Raised by: Tobias, Elicit; Markus and Sara independently asked for stronger ratchet language)_. - Calendar-based (e.g. `next-review: 2026-11-13` in a tracking issue): enforceable, surfacable via labels/queries. - Trigger-based (e.g. "ratchet up when 3 consecutive PRs report ≥ 73% branches"): self-policing, no calendar dependency. - Free text ("every 6 months OR when a deliberate push is scheduled"): familiar but unenforceable in solo-LLM workflow — slides. _Pick one for the ADR. Calendar + trigger combined is fine; free-text-only is not._

marcel commented

2026-05-13 13:01:55 +02:00

Decisions on the review queue

Resolving the four cross-cutting themes from the persona review so the implementation can proceed without further blocking input.

1. ADR number → 013

docs/adr/ confirmed: existing 012-browser-test-mocking-strategy.md and 012-nsenter-for-host-service-management-in-ci.md are a pre-existing collision (note: I will file a separate cleanup issue to renumber 012-nsenter → 013-nsenter). For this issue, use the next free numerical slot — 013 for the threshold ADR is acceptable; the duplicate-012 problem is out of scope and tracked separately.

2. Concrete ratchet mechanism (not "vibes-based revisit")

The ADR must contain:

Ratchet rule. The branches threshold ratchets up by 3 percentage points when the rolling 3-PR-average client-project branches figure on main stays at or above threshold + 3pp for ≥30 consecutive days. Direction is up-only — never lower the floor below 70 without a new ADR superseding this one. Manual today (verify before any vitest.client-coverage.config.ts edit); a future automation issue may codify the check.

This is a concrete, observable trigger — not a calendar review.

3. Exact baseline recorded in the ADR

The ADR must include:

Measured 2026-05-13 against branch `feat/issue-553-birpc-async-mock-factory` (head 2e6cc346):
  branches:   77.xx%  ← below 80 gate, reason for this ADR
  lines:      8x.xx%
  functions:  8x.xx%
  statements: 8x.xx%

Reproducer:
  cd frontend && npm ci && npm run test -- --coverage \
    --config vitest.client-coverage.config.ts

Capture the actual numbers from a fresh local run during the implementation (replace xx.xx placeholders).

4. Variance handling

70% is the floor. If the coverage-flake-probe.yml matrix produces any cell with branches < 70%, that is a separate signal worth investigating — open a follow-up issue rather than silently lowering the floor again. The 7-point headroom is intentional.

Acceptance update

Add to the existing list:

ADR-013 contains the ratchet rule (item 2 above) verbatim
ADR-013 contains a measured baseline block (item 3 above) with real numbers from a clean local run
No further threshold drops without a superseding ADR

Items closed by these decisions

"Concrete ratchet mechanism" (Markus, Tobias, Sara, Elicit) → §2 above
"Record exact baseline" (Felix, Sara, Tobias) → §3 above
"Variance-vs-headroom uncertainty" (Tobias) → §4 above
"Duplicate-012 ADR collision" (Markus) → out of scope; follow-up issue to be filed

Out of scope

"#496 is a zombie issue" (Markus, Sara, Elicit) — valid criticism but it's a backlog-hygiene problem with #496 itself, not a blocker for this threshold drop. Resolve in #496 or a successor.

Plan is pre-approved by the maintainer. The implementer should treat the items above as final, file the noted out-of-scope follow-ups as separate Gitea issues, and proceed.

## Decisions on the review queue Resolving the four cross-cutting themes from the persona review so the implementation can proceed without further blocking input. ### 1. ADR number → **013** `docs/adr/` confirmed: existing `012-browser-test-mocking-strategy.md` and `012-nsenter-for-host-service-management-in-ci.md` are a **pre-existing collision** (note: I will file a separate cleanup issue to renumber `012-nsenter` → `013-nsenter`). For *this* issue, use the next free numerical slot — `013` for the threshold ADR is acceptable; the duplicate-012 problem is out of scope and tracked separately. ### 2. Concrete ratchet mechanism (not "vibes-based revisit") The ADR must contain: > **Ratchet rule.** The branches threshold ratchets **up by 3 percentage points** when the rolling 3-PR-average client-project branches figure on `main` stays at or above `threshold + 3pp` for ≥30 consecutive days. Direction is up-only — never lower the floor below 70 without a new ADR superseding this one. Manual today (verify before any `vitest.client-coverage.config.ts` edit); a future automation issue may codify the check. This is a concrete, observable trigger — not a calendar review. ### 3. Exact baseline recorded in the ADR The ADR must include: ``` Measured 2026-05-13 against branch `feat/issue-553-birpc-async-mock-factory` (head 2e6cc346): branches: 77.xx% ← below 80 gate, reason for this ADR lines: 8x.xx% functions: 8x.xx% statements: 8x.xx% Reproducer: cd frontend && npm ci && npm run test -- --coverage \ --config vitest.client-coverage.config.ts ``` Capture the actual numbers from a fresh local run during the implementation (replace `xx.xx` placeholders). ### 4. Variance handling 70% is the **floor**. If the `coverage-flake-probe.yml` matrix produces any cell with branches < 70%, that is a separate signal worth investigating — open a follow-up issue rather than silently lowering the floor again. The 7-point headroom is intentional. ### Acceptance update Add to the existing list: - [ ] ADR-013 contains the ratchet rule (item 2 above) verbatim - [ ] ADR-013 contains a measured baseline block (item 3 above) with real numbers from a clean local run - [ ] No further threshold drops without a superseding ADR ### Items closed by these decisions - "Concrete ratchet mechanism" (Markus, Tobias, Sara, Elicit) → §2 above - "Record exact baseline" (Felix, Sara, Tobias) → §3 above - "Variance-vs-headroom uncertainty" (Tobias) → §4 above - "Duplicate-012 ADR collision" (Markus) → out of scope; follow-up issue to be filed ### Out of scope - "#496 is a zombie issue" (Markus, Sara, Elicit) — valid criticism but it's a backlog-hygiene problem with #496 itself, not a blocker for *this* threshold drop. Resolve in #496 or a successor. Plan is **pre-approved** by the maintainer. The implementer should treat the items above as final, file the noted out-of-scope follow-ups as separate Gitea issues, and proceed.

marcel referenced this issue

2026-05-13 13:02:09 +02:00

ci(devops): downgrade actions/upload-artifact v4 → v3 (re-regression — needs ADR to prevent future re-upgrade) #557

marcel referenced this issue

2026-05-14 08:50:32 +02:00

ci(devops): downgrade actions/upload-artifact v4 → v3 (re-regression — needs ADR to prevent future re-upgrade) #557

marcel referenced this issue from a commit

2026-05-14 08:55:32 +02:00

chore(coverage): drop client branches threshold 80→75, add ADR-013

marcel referenced a pull request that will close this issue

2026-05-14 08:55:46 +02:00

chore(coverage): drop client branches threshold 80→75 to unblock CI #559