feat(transcription): E2E test for bulk "Alle als fertig markieren" action #355

New Issue

marcel · 2026-04-27T08:12:37+02:00

marcel commented

2026-04-27 08:12:37 +02:00

Context

PR #352 shipped the bulk "Alle als fertig markieren" action without a Playwright E2E test. The unit and component tests are sufficient for the Demo Day milestone, but the E2E coverage gap remains open.

Raised by Sara (QA) in the PR #352 Decision Queue (Item 3).

Scenario to cover

Given a document with ≥ 2 unreviewed transcription blocks
When the user clicks "Alle als fertig markieren"
Then all blocks are shown as reviewed in the UI
And the reviewed count equals the total count

When the user un-marks one block
Then that block is shown as unreviewed
And the reviewed count decreases by 1

Acceptance criteria

Playwright test in frontend/e2e/ covers the scenario above
Test is added to the existing E2E suite (does not require a separate test file if a transcription suite already exists)
Test passes in CI

Links

Feature implementation: PR #352
Original issue: feat(transcription): add "Alle als fertig markieren" bulk action to transcription panel (#345)

## Context PR #352 shipped the bulk "Alle als fertig markieren" action without a Playwright E2E test. The unit and component tests are sufficient for the Demo Day milestone, but the E2E coverage gap remains open. Raised by Sara (QA) in the PR #352 Decision Queue (Item 3). ## Scenario to cover ``` Given a document with ≥ 2 unreviewed transcription blocks When the user clicks "Alle als fertig markieren" Then all blocks are shown as reviewed in the UI And the reviewed count equals the total count When the user un-marks one block Then that block is shown as unreviewed And the reviewed count decreases by 1 ``` ## Acceptance criteria - [ ] Playwright test in `frontend/e2e/` covers the scenario above - [ ] Test is added to the existing E2E suite (does not require a separate test file if a transcription suite already exists) - [ ] Test passes in CI ## Links - Feature implementation: PR #352 - Original issue: #345

marcel added the P2-medium test labels 2026-04-27 08:12:56 +02:00

marcel referenced this issue

2026-04-27 08:14:09 +02:00

feat(transcription): add "Alle als fertig markieren" bulk action (#345) #352

marcel referenced this issue

2026-04-27 08:14:24 +02:00

feat(transcription): add "Alle als fertig markieren" bulk action (#345) #352

marcel referenced this issue

2026-04-27 08:14:44 +02:00

feat(transcription): add "Alle als fertig markieren" bulk action (#345) #352

marcel commented

2026-05-07 17:49:23 +02:00

👨‍💻 Markus Keller — Application Architect

Observations

The feature itself is layered correctly: PUT /api/documents/{documentId}/transcription-blocks/review-all lives in TranscriptionBlockController, delegating to TranscriptionService.markAllBlocksReviewed(). No boundary violations.
The E2E gap is purely a test coverage concern — the implementation does not introduce architectural debt.
The existing transcription.spec.ts uses beforeAll with API-seeded data and no afterAll cleanup. The test document leaks into the database on every CI run. The upcoming test should not repeat this mistake.
The single-block review toggle goes through PUT /{blockId}/review; the bulk action goes through PUT /review-all. Both return the updated block(s) and patch local state in the page component. This symmetric design makes the E2E scenario straightforward to write.
The E2E test suite is not wired into CI (ci.yml has no E2E job — only unit-tests running in the Playwright container). The acceptance criterion "Test passes in CI" is therefore currently impossible without a CI job for E2E. This is a pre-existing gap, but worth noting.

Recommendations

Add afterAll cleanup (DELETE the test document via API) to the new test, mirroring the pattern already established in transcribe-coach.spec.ts. The existing transcription.spec.ts omits this — do not copy that omission.
Place the new scenario in the existing transcription.spec.ts as a new test.describe block rather than a separate file. The issue already permits this, and keeping the transcription fixtures in one file avoids duplicated beforeAll setup.
The CI gap is a separate concern (issue #355 scope is the test, not the pipeline), but flag it: the acceptance criterion "passes in CI" cannot be green until a Playwright E2E job is added to ci.yml.

Open Decisions

Should the E2E CI gap be tracked as a follow-up issue, or should this issue be blocked until CI supports E2E? Given the P2 label and the explicit "test passes in CI" acceptance criterion, it is worth clarifying whether to expand scope or create a linked blocker issue.

## 👨‍💻 Markus Keller — Application Architect ### Observations - The feature itself is layered correctly: `PUT /api/documents/{documentId}/transcription-blocks/review-all` lives in `TranscriptionBlockController`, delegating to `TranscriptionService.markAllBlocksReviewed()`. No boundary violations. - The E2E gap is purely a test coverage concern — the implementation does not introduce architectural debt. - The existing `transcription.spec.ts` uses `beforeAll` with API-seeded data and no `afterAll` cleanup. The test document leaks into the database on every CI run. The upcoming test should not repeat this mistake. - The single-block review toggle goes through `PUT /{blockId}/review`; the bulk action goes through `PUT /review-all`. Both return the updated block(s) and patch local state in the page component. This symmetric design makes the E2E scenario straightforward to write. - The E2E test suite is **not** wired into CI (`ci.yml` has no E2E job — only `unit-tests` running in the Playwright container). The acceptance criterion "Test passes in CI" is therefore currently impossible without a CI job for E2E. This is a pre-existing gap, but worth noting. ### Recommendations - Add `afterAll` cleanup (DELETE the test document via API) to the new test, mirroring the pattern already established in `transcribe-coach.spec.ts`. The existing `transcription.spec.ts` omits this — do not copy that omission. - Place the new scenario in the existing `transcription.spec.ts` as a new `test.describe` block rather than a separate file. The issue already permits this, and keeping the transcription fixtures in one file avoids duplicated `beforeAll` setup. - The CI gap is a separate concern (issue #355 scope is the test, not the pipeline), but flag it: the acceptance criterion "passes in CI" cannot be green until a Playwright E2E job is added to `ci.yml`. ### Open Decisions - Should the E2E CI gap be tracked as a follow-up issue, or should this issue be blocked until CI supports E2E? Given the P2 label and the explicit "test passes in CI" acceptance criterion, it is worth clarifying whether to expand scope or create a linked blocker issue.

marcel commented

2026-05-07 17:49:45 +02:00

👨‍💻 Felix Brandt — Fullstack Developer

Observations

The component-level coverage in TranscriptionEditView.svelte.spec.ts is solid: button visibility, disabled state, single-call guarantee, and in-flight disabling are all covered at the unit layer using vitest-browser-svelte. What's missing is the full-stack round-trip: real API call → database update → UI reflects new state after reload.
The existing E2E helper createEmptyDocument (in helpers/upload-empty-document.ts) already handles document creation and PDF upload. A createTwoBlocks helper following the same pattern would be the minimal addition needed for this test.
The reviewToggle in +page.svelte patches local reactive state directly after the API response (transcriptionBlocks.map(b => b.id === blockId ? updated : b)). The markAllReviewed does a loop mutation on the same array. Both patterns are correct Svelte 5 reactive mutations — nothing to fix, just be aware the E2E test needs to assert on DOM state, not JS state.
The scenario specifies "un-marks one block" — this maps to clicking the individual block's review toggle button (checkmark icon in TranscriptionBlock.svelte, aria-label driven by m.transcription_block_unreview() / m.transcription_block_review()). Use the ARIA label to locate it reliably.
The data-block-id attribute exists on each block's wrapper div — this gives a stable selector to identify which block to un-mark.

Recommendations

Extract a createBlocksForDocument(request, docId, count) helper into helpers/ and reuse it here and in transcribe-coach.spec.ts's createBlock function (currently inlined). Single place to maintain.
Use named ARIA selectors throughout: page.getByRole('button', { name: /Alle als fertig markieren/ }) for the bulk action, and page.getByRole('button', { name: /Als fertig markieren/ }).first() (or .nth(0)) for the single-block toggle. Avoid CSS class selectors — they break on style refactors.
Assert on the counter text after each action: await expect(page.getByText(/2 \/ 2 geprüft/)).toBeVisible() after bulk mark, and await expect(page.getByText(/1 \/ 2 geprüft/)).toBeVisible() after un-marking one. This is the most user-visible signal and directly matches the acceptance criterion.
Add afterAll with request.delete(\/api/documents/${docId}`)` to avoid test data accumulation.
Do not use page.waitForSelector('[data-hydrated]') in the new test — the existing transcription.spec.ts uses it, but the cleaner pattern from transcribe-coach.spec.ts is to wait for a specific visible element instead.

Open Decisions

The i18n key for "Als fertig markieren" / "Als unfertig markieren" (single-block toggle) needs verification against the actual m.* function name in Paraglide before writing the selector. Check TranscriptionBlock.svelte lines 234–235 for the exact key.

## 👨‍💻 Felix Brandt — Fullstack Developer ### Observations - The component-level coverage in `TranscriptionEditView.svelte.spec.ts` is solid: button visibility, disabled state, single-call guarantee, and in-flight disabling are all covered at the unit layer using `vitest-browser-svelte`. What's missing is the full-stack round-trip: real API call → database update → UI reflects new state after reload. - The existing E2E helper `createEmptyDocument` (in `helpers/upload-empty-document.ts`) already handles document creation and PDF upload. A `createTwoBlocks` helper following the same pattern would be the minimal addition needed for this test. - The `reviewToggle` in `+page.svelte` patches local reactive state directly after the API response (`transcriptionBlocks.map(b => b.id === blockId ? updated : b)`). The `markAllReviewed` does a loop mutation on the same array. Both patterns are correct Svelte 5 reactive mutations — nothing to fix, just be aware the E2E test needs to assert on DOM state, not JS state. - The scenario specifies "un-marks one block" — this maps to clicking the individual block's review toggle button (checkmark icon in `TranscriptionBlock.svelte`, `aria-label` driven by `m.transcription_block_unreview()` / `m.transcription_block_review()`). Use the ARIA label to locate it reliably. - The `data-block-id` attribute exists on each block's wrapper div — this gives a stable selector to identify which block to un-mark. ### Recommendations - Extract a `createBlocksForDocument(request, docId, count)` helper into `helpers/` and reuse it here and in `transcribe-coach.spec.ts`'s `createBlock` function (currently inlined). Single place to maintain. - Use named ARIA selectors throughout: `page.getByRole('button', { name: /Alle als fertig markieren/ })` for the bulk action, and `page.getByRole('button', { name: /Als fertig markieren/ }).first()` (or `.nth(0)`) for the single-block toggle. Avoid CSS class selectors — they break on style refactors. - Assert on the counter text after each action: `await expect(page.getByText(/2 \/ 2 geprüft/)).toBeVisible()` after bulk mark, and `await expect(page.getByText(/1 \/ 2 geprüft/)).toBeVisible()` after un-marking one. This is the most user-visible signal and directly matches the acceptance criterion. - Add `afterAll` with `request.delete(\`/api/documents/${docId}\`)` to avoid test data accumulation. - Do not use `page.waitForSelector('[data-hydrated]')` in the new test — the existing `transcription.spec.ts` uses it, but the cleaner pattern from `transcribe-coach.spec.ts` is to wait for a specific visible element instead. ### Open Decisions - The i18n key for "Als fertig markieren" / "Als unfertig markieren" (single-block toggle) needs verification against the actual `m.*` function name in Paraglide before writing the selector. Check `TranscriptionBlock.svelte` lines 234–235 for the exact key.

marcel commented

2026-05-07 17:50:03 +02:00

👨‍💻 Tobias Wendt — DevOps & Platform Engineer

Observations

The E2E suite (frontend/e2e/) is not wired into ci.yml. The current CI pipeline runs unit-tests inside the Playwright Docker container (mcr.microsoft.com/playwright:v1.58.2-noble) but only runs Vitest unit/component tests — no Playwright E2E job exists. The acceptance criterion "Test passes in CI" in this issue cannot be satisfied without adding an E2E job.
The E2E tests require the full stack: SvelteKit dev server + Spring Boot backend + PostgreSQL + MinIO. The CI pipeline currently starts none of these for E2E purposes.
The Playwright image version in CI (v1.58.2-noble) should match the @playwright/test version in frontend/package.json to prevent version skew between local dev and CI.
The e2e/helpers/upload-empty-document.ts helper reads a PDF fixture from disk using a relative path resolved at import time — this works in both local and CI environments as long as the repo is checked out, which it is.
retries: process.env.CI ? 2 : 0 is already configured in playwright.config.ts. This is the right approach — retry logic handles infrastructure flakiness in CI without masking real failures locally.

Recommendations

Add a e2e-tests job to ci.yml that: (1) starts docker compose -f docker-compose.yml -f docker-compose.ci.yml up -d db minio, (2) starts the backend JAR, (3) runs npm run test:e2e. Without this, the "passes in CI" acceptance criterion is permanently unmet.
Until the CI job exists, add a note in the PR description acknowledging the gap and link the issue where CI E2E will be wired up. Do not mark the acceptance criterion green without an actual CI run.
Pin the Playwright version in package.json to match the Docker image version used in CI (v1.58.2). Mismatches cause protocol errors that are painful to debug.
The transcription.spec.ts beforeAll leaks a document on every test run — it creates the document and never deletes it. Over many CI runs this accumulates test data. The new test should include afterAll cleanup, and the existing transcription.spec.ts should be patched in a follow-up.

Open Decisions

Should the CI E2E job be added in this PR (expanding scope) or tracked as a separate issue? Given that "passes in CI" is an explicit acceptance criterion, it is a reasonable scope expansion rather than gold-plating.

## 👨‍💻 Tobias Wendt — DevOps & Platform Engineer ### Observations - The E2E suite (`frontend/e2e/`) is not wired into `ci.yml`. The current CI pipeline runs `unit-tests` inside the Playwright Docker container (`mcr.microsoft.com/playwright:v1.58.2-noble`) but only runs Vitest unit/component tests — no Playwright E2E job exists. The acceptance criterion "Test passes in CI" in this issue cannot be satisfied without adding an E2E job. - The E2E tests require the full stack: SvelteKit dev server + Spring Boot backend + PostgreSQL + MinIO. The CI pipeline currently starts none of these for E2E purposes. - The Playwright image version in CI (`v1.58.2-noble`) should match the `@playwright/test` version in `frontend/package.json` to prevent version skew between local dev and CI. - The `e2e/helpers/upload-empty-document.ts` helper reads a PDF fixture from disk using a relative path resolved at import time — this works in both local and CI environments as long as the repo is checked out, which it is. - `retries: process.env.CI ? 2 : 0` is already configured in `playwright.config.ts`. This is the right approach — retry logic handles infrastructure flakiness in CI without masking real failures locally. ### Recommendations - Add a `e2e-tests` job to `ci.yml` that: (1) starts `docker compose -f docker-compose.yml -f docker-compose.ci.yml up -d db minio`, (2) starts the backend JAR, (3) runs `npm run test:e2e`. Without this, the "passes in CI" acceptance criterion is permanently unmet. - Until the CI job exists, add a note in the PR description acknowledging the gap and link the issue where CI E2E will be wired up. Do not mark the acceptance criterion green without an actual CI run. - Pin the Playwright version in `package.json` to match the Docker image version used in CI (`v1.58.2`). Mismatches cause protocol errors that are painful to debug. - The `transcription.spec.ts` `beforeAll` leaks a document on every test run — it creates the document and never deletes it. Over many CI runs this accumulates test data. The new test should include `afterAll` cleanup, and the existing `transcription.spec.ts` should be patched in a follow-up. ### Open Decisions - Should the CI E2E job be added in this PR (expanding scope) or tracked as a separate issue? Given that "passes in CI" is an explicit acceptance criterion, it is a reasonable scope expansion rather than gold-plating.

marcel commented

2026-05-07 17:50:19 +02:00

👨‍💻 Elicit — Requirements Engineer

Observations

The scenario is well-specified in Given-When-Then form and covers both the bulk action and the individual un-mark. That is more complete than many E2E issue specs.
The acceptance criterion "Test is added to the existing E2E suite (does not require a separate test file if a transcription suite already exists)" correctly identifies frontend/e2e/transcription.spec.ts as the target file.
Gap — missing precondition: The scenario says "a document with ≥ 2 unreviewed transcription blocks" but does not specify that the Transkription panel must be open. In the current UI, the panel opens only after clicking the "Transkription" tab in the bottom panel. The test setup must navigate to the document AND open the correct tab before asserting on block state.
Gap — missing assertion on progress counter: The scenario specifies "reviewed count equals the total count" but the UI expresses this as the {reviewedCount} / {totalCount} geprüft text. The acceptance criterion should name this specifically so the implementer knows which element to assert on.
Gap — "un-marks one block" is under-specified: The scenario says "When the user un-marks one block" but does not specify which block (first, last, any) or how the toggle is activated (button label changes between reviewed/unreviewed states). The implementer will make a reasonable choice, but ambiguity could cause divergence between intent and implementation.
The issue links to PR #352 and issue #345, which is good traceability.

Recommendations

Add an explicit precondition: "And the Transkription panel is open (user has clicked the Transkription tab)."
Add a specific assertion target for the counter: "{reviewedCount} / {totalCount} geprüft" label shows "2 / 2 geprüft" after bulk mark and "1 / 2 geprüft" after un-marking.
Specify which block to un-mark for reproducibility: "When the user clicks the review toggle on the first block" is unambiguous.
The issue is otherwise well-formed and ready for implementation. No further refinement is needed before work starts.

## 👨‍💻 Elicit — Requirements Engineer ### Observations - The scenario is well-specified in Given-When-Then form and covers both the bulk action and the individual un-mark. That is more complete than many E2E issue specs. - The acceptance criterion "Test is added to the existing E2E suite (does not require a separate test file if a transcription suite already exists)" correctly identifies `frontend/e2e/transcription.spec.ts` as the target file. - **Gap — missing precondition:** The scenario says "a document with ≥ 2 unreviewed transcription blocks" but does not specify that the Transkription panel must be open. In the current UI, the panel opens only after clicking the "Transkription" tab in the bottom panel. The test setup must navigate to the document AND open the correct tab before asserting on block state. - **Gap — missing assertion on progress counter:** The scenario specifies "reviewed count equals the total count" but the UI expresses this as the `{reviewedCount} / {totalCount} geprüft` text. The acceptance criterion should name this specifically so the implementer knows which element to assert on. - **Gap — "un-marks one block" is under-specified:** The scenario says "When the user un-marks one block" but does not specify which block (first, last, any) or how the toggle is activated (button label changes between reviewed/unreviewed states). The implementer will make a reasonable choice, but ambiguity could cause divergence between intent and implementation. - The issue links to PR #352 and issue #345, which is good traceability. ### Recommendations - Add an explicit precondition: "And the Transkription panel is open (user has clicked the Transkription tab)." - Add a specific assertion target for the counter: `"{reviewedCount} / {totalCount} geprüft" label shows "2 / 2 geprüft"` after bulk mark and `"1 / 2 geprüft"` after un-marking. - Specify which block to un-mark for reproducibility: "When the user clicks the review toggle on the first block" is unambiguous. - The issue is otherwise well-formed and ready for implementation. No further refinement is needed before work starts.

marcel commented

2026-05-07 17:50:36 +02:00

👨‍💻 Nora "NullX" Steiner — Security Engineer

Observations

The PUT /api/documents/{documentId}/transcription-blocks/review-all endpoint is correctly protected with @RequirePermission(Permission.WRITE_ALL). The controller test suite verifies both 401 (unauthenticated) and 403 (READ_ALL only) cases — this is the right security test pattern.
The markAllReviewed() function in +page.svelte uses a raw fetch() call (not the typed API client) without checking the response before mutating local state. Currently it returns early on !res.ok, but it does not surface an error to the user. This is a UX gap rather than a security issue — failed authorization would silently leave state unchanged.
The E2E test will run as the admin user (from auth.setup.ts credentials: admin/admin123 via env vars E2E_USERNAME/E2E_PASSWORD). This means the test will always have WRITE_ALL permission. The E2E test does not need to verify authorization — that is already covered by the controller unit tests. The E2E test's job is to verify the UI flow, not re-test access control.
The E2E credentials (admin123) are set via environment variables with a fallback to hardcoded values in helpers/auth.ts. This is a test-only file and the value is the well-known dev admin password, so the risk is low — but it should be noted that if CI ever stores real credentials here, they must be moved to Gitea secrets.
No injection vectors are introduced by this E2E test itself — it only exercises existing endpoints with valid test data.

Recommendations

The E2E test does not need authorization negative-path assertions (unauthenticated/forbidden) — these are already covered at the @WebMvcTest layer in TranscriptionBlockControllerTest.java. Keep E2E tests focused on the happy path user flow.
In +page.svelte, the markAllReviewed() function should surface a user-facing error when !res.ok, consistent with how other mutations handle failures. This is outside the scope of this issue but worth a follow-up.
Confirm that E2E_USERNAME and E2E_PASSWORD are sourced from Gitea secrets in the CI workflow, not hardcoded. The fallback admin123 in helpers/auth.ts is acceptable for local dev but must not be the CI source of truth.

## 👨‍💻 Nora "NullX" Steiner — Security Engineer ### Observations - The `PUT /api/documents/{documentId}/transcription-blocks/review-all` endpoint is correctly protected with `@RequirePermission(Permission.WRITE_ALL)`. The controller test suite verifies both 401 (unauthenticated) and 403 (READ_ALL only) cases — this is the right security test pattern. - The `markAllReviewed()` function in `+page.svelte` uses a raw `fetch()` call (not the typed API client) without checking the response before mutating local state. Currently it returns early on `!res.ok`, but it does not surface an error to the user. This is a UX gap rather than a security issue — failed authorization would silently leave state unchanged. - The E2E test will run as the `admin` user (from `auth.setup.ts` credentials: `admin`/`admin123` via env vars `E2E_USERNAME`/`E2E_PASSWORD`). This means the test will always have `WRITE_ALL` permission. **The E2E test does not need to verify authorization** — that is already covered by the controller unit tests. The E2E test's job is to verify the UI flow, not re-test access control. - The E2E credentials (`admin123`) are set via environment variables with a fallback to hardcoded values in `helpers/auth.ts`. This is a test-only file and the value is the well-known dev admin password, so the risk is low — but it should be noted that if CI ever stores real credentials here, they must be moved to Gitea secrets. - No injection vectors are introduced by this E2E test itself — it only exercises existing endpoints with valid test data. ### Recommendations - The E2E test does not need authorization negative-path assertions (unauthenticated/forbidden) — these are already covered at the `@WebMvcTest` layer in `TranscriptionBlockControllerTest.java`. Keep E2E tests focused on the happy path user flow. - In `+page.svelte`, the `markAllReviewed()` function should surface a user-facing error when `!res.ok`, consistent with how other mutations handle failures. This is outside the scope of this issue but worth a follow-up. - Confirm that `E2E_USERNAME` and `E2E_PASSWORD` are sourced from Gitea secrets in the CI workflow, not hardcoded. The fallback `admin123` in `helpers/auth.ts` is acceptable for local dev but must not be the CI source of truth.

marcel commented

2026-05-07 17:50:46 +02:00

👨‍💻 Sara Holt — QA Engineer & Test Strategist

Observations

The unit/component coverage in TranscriptionEditView.svelte.spec.ts is genuinely good: 7 cases covering button visibility, disabled states, click propagation, and in-flight disabling. The E2E test's job is therefore narrowly defined — it only needs to prove the full-stack round-trip works, not re-verify component behavior.
The existing transcription.spec.ts has a structural problem: beforeAll creates a document and blocks with no afterAll cleanup. Every test run leaves a document in the database. The new test must not copy this pattern. transcribe-coach.spec.ts does it correctly with afterAll + request.delete.
The acceptance criterion covers two scenarios: (1) bulk mark → all shown reviewed + counter updated, and (2) un-mark one → that block shown unreviewed + counter decremented. Both are user-visible state changes that E2E is well-suited to verify.
The page.waitForSelector('[data-hydrated]') pattern in transcription.spec.ts is a smell — it waits for a custom DOM attribute that the app sets, creating an implicit coupling between test and implementation. transcribe-coach.spec.ts avoids this by waiting for a specific visible element. Use the latter pattern.
The Transkription tab must be explicitly clicked before the panel content is visible. This is a necessary setup step that both transcription.spec.ts and transcribe-coach.spec.ts demonstrate correctly.
Test isolation: the workers: 1 / fullyParallel: false configuration means tests share a single auth session and run sequentially. State leaked from a prior test (e.g. a block's review state changed by a previous test) can cause failures. Creating fresh documents per describe block is the right mitigation — transcribe-coach.spec.ts demonstrates this with beforeAll/afterAll bracketing each describe.

Recommendations

Structure the new tests as a test.describe block inside transcription.spec.ts, with its own beforeAll (create doc + upload PDF + create 2 blocks via API) and afterAll (delete doc). Do not share setup state with the existing describe block.
Use two separate test() calls (one per scenario step) rather than one long test that does both mark-all and un-mark. A failure in the mark-all step would otherwise mask the un-mark behavior.
Assert on the {n} / {n} geprüft counter text as the primary assertion — it is the most visible user-facing signal and directly reflects the reviewedCount / totalCount derived state.
Add a screenshot call after each major state transition for debugging: await page.screenshot({ path: 'test-results/e2e/transcription-mark-all-reviewed.png' }).
Keep the test under 30 seconds total. The network + debounce interactions are synchronous for the review toggle (no 1.5s debounce — it fires immediately on click), so timing should not be an issue.

Open Decisions

Should the "un-mark one block" scenario be a standalone test() or a continuation within the same test as the bulk mark? Running them sequentially within one test() risks masking which step failed. Two separate tests, with the second starting from a fresh "all reviewed" state, is cleaner but requires the second beforeAll to also mark all blocks as reviewed via API before navigating.

## 👨‍💻 Sara Holt — QA Engineer & Test Strategist ### Observations - The unit/component coverage in `TranscriptionEditView.svelte.spec.ts` is genuinely good: 7 cases covering button visibility, disabled states, click propagation, and in-flight disabling. The E2E test's job is therefore narrowly defined — it only needs to prove the full-stack round-trip works, not re-verify component behavior. - The existing `transcription.spec.ts` has a structural problem: `beforeAll` creates a document and blocks with no `afterAll` cleanup. Every test run leaves a document in the database. The new test must not copy this pattern. `transcribe-coach.spec.ts` does it correctly with `afterAll` + `request.delete`. - The acceptance criterion covers two scenarios: (1) bulk mark → all shown reviewed + counter updated, and (2) un-mark one → that block shown unreviewed + counter decremented. Both are user-visible state changes that E2E is well-suited to verify. - The `page.waitForSelector('[data-hydrated]')` pattern in `transcription.spec.ts` is a smell — it waits for a custom DOM attribute that the app sets, creating an implicit coupling between test and implementation. `transcribe-coach.spec.ts` avoids this by waiting for a specific visible element. Use the latter pattern. - The Transkription tab must be explicitly clicked before the panel content is visible. This is a necessary setup step that both `transcription.spec.ts` and `transcribe-coach.spec.ts` demonstrate correctly. - Test isolation: the `workers: 1` / `fullyParallel: false` configuration means tests share a single auth session and run sequentially. State leaked from a prior test (e.g. a block's review state changed by a previous test) can cause failures. Creating fresh documents per `describe` block is the right mitigation — `transcribe-coach.spec.ts` demonstrates this with `beforeAll`/`afterAll` bracketing each describe. ### Recommendations - Structure the new tests as a `test.describe` block inside `transcription.spec.ts`, with its own `beforeAll` (create doc + upload PDF + create 2 blocks via API) and `afterAll` (delete doc). Do not share setup state with the existing describe block. - Use two separate `test()` calls (one per scenario step) rather than one long test that does both mark-all and un-mark. A failure in the mark-all step would otherwise mask the un-mark behavior. - Assert on the `{n} / {n} geprüft` counter text as the primary assertion — it is the most visible user-facing signal and directly reflects the `reviewedCount / totalCount` derived state. - Add a screenshot call after each major state transition for debugging: `await page.screenshot({ path: 'test-results/e2e/transcription-mark-all-reviewed.png' })`. - Keep the test under 30 seconds total. The network + debounce interactions are synchronous for the review toggle (no 1.5s debounce — it fires immediately on click), so timing should not be an issue. ### Open Decisions - Should the "un-mark one block" scenario be a standalone `test()` or a continuation within the same test as the bulk mark? Running them sequentially within one `test()` risks masking which step failed. Two separate tests, with the second starting from a fresh "all reviewed" state, is cleaner but requires the second beforeAll to also mark all blocks as reviewed via API before navigating.

marcel commented

2026-05-07 17:50:56 +02:00

👨‍💻 Leonie Voss — UX Designer & Accessibility Strategist

Observations

The "Alle als fertig markieren" button in TranscriptionEditView.svelte has min-h-[44px] — meets the WCAG 2.2 minimum touch target. Good.
The button uses focus-visible:ring-2 focus-visible:ring-brand-navy for keyboard focus — correct pattern.
The progress bar (bg-brand-mint on bg-brand-sand) is purely decorative (no text alternative) — this is acceptable since the {reviewedCount} / {totalCount} geprüft text label already conveys the information redundantly. No a11y gap here.
The review toggle on individual blocks (TranscriptionBlock.svelte, line 235) uses aria-label that changes between reviewed/unreviewed states — this is the correct pattern for toggle buttons. The E2E test should assert on the aria-label change to confirm the toggle worked, not just on color changes.
The spinning SVG during markingAllReviewed uses aria-hidden="true" — correct, since the button text "Alle als fertig markieren" is still present. However, there is no aria-busy or live region announcement when the operation completes. A screen reader user clicking the button gets no feedback that the operation succeeded. This is a pre-existing gap worth a follow-up issue.
The disabled={allReviewed || markingAllReviewed} state removes the button from the interaction model without explanation. The title attribute only shows when allReviewed — during markingAllReviewed, there is no tooltip. Minor issue since the spinner is visible, but worth noting.

Recommendations

The E2E test should include a call to AxeBuilder after the bulk-mark state is reached, just as the existing transcription test does at the end. This catches any a11y regressions introduced by the new UI state (all blocks reviewed).
Assert on the aria-label of the first block's review toggle after un-marking: it should change from the "unmark" label back to the "mark as reviewed" label. This validates the toggle actually changed state, not just that a visual indicator changed.
For the follow-up: add aria-live="polite" to the {reviewedCount} / {totalCount} geprüft counter so screen readers announce the count change after bulk mark. This is outside the scope of this issue but is a meaningful accessibility improvement for the transcriber audience.

## 👨‍💻 Leonie Voss — UX Designer & Accessibility Strategist ### Observations - The "Alle als fertig markieren" button in `TranscriptionEditView.svelte` has `min-h-[44px]` — meets the WCAG 2.2 minimum touch target. Good. - The button uses `focus-visible:ring-2 focus-visible:ring-brand-navy` for keyboard focus — correct pattern. - The progress bar (`bg-brand-mint` on `bg-brand-sand`) is purely decorative (no text alternative) — this is acceptable since the `{reviewedCount} / {totalCount} geprüft` text label already conveys the information redundantly. No a11y gap here. - The review toggle on individual blocks (`TranscriptionBlock.svelte`, line 235) uses `aria-label` that changes between reviewed/unreviewed states — this is the correct pattern for toggle buttons. The E2E test should assert on the `aria-label` change to confirm the toggle worked, not just on color changes. - The spinning SVG during `markingAllReviewed` uses `aria-hidden="true"` — correct, since the button text "Alle als fertig markieren" is still present. However, there is no `aria-busy` or live region announcement when the operation completes. A screen reader user clicking the button gets no feedback that the operation succeeded. This is a pre-existing gap worth a follow-up issue. - The `disabled={allReviewed || markingAllReviewed}` state removes the button from the interaction model without explanation. The `title` attribute only shows when `allReviewed` — during `markingAllReviewed`, there is no tooltip. Minor issue since the spinner is visible, but worth noting. ### Recommendations - The E2E test should include a call to `AxeBuilder` after the bulk-mark state is reached, just as the existing transcription test does at the end. This catches any a11y regressions introduced by the new UI state (all blocks reviewed). - Assert on the `aria-label` of the first block's review toggle after un-marking: it should change from the "unmark" label back to the "mark as reviewed" label. This validates the toggle actually changed state, not just that a visual indicator changed. - For the follow-up: add `aria-live="polite"` to the `{reviewedCount} / {totalCount} geprüft` counter so screen readers announce the count change after bulk mark. This is outside the scope of this issue but is a meaningful accessibility improvement for the transcriber audience.

marcel commented

2026-05-07 17:51:18 +02:00

Decision Queue

Consolidated open decisions from all personas. Each needs a call before implementation starts.

Theme 1 — CI E2E gap vs. acceptance criterion scope

Raised by: Markus, Tobias

The acceptance criterion "Test passes in CI" cannot be satisfied: no E2E job exists in ci.yml. The CI pipeline runs Vitest unit/component tests only — Playwright E2E is entirely absent.

Options:

A (expand scope): Add a basic e2e-tests CI job in this PR. Reasonable since the AC explicitly requires it and a minimal job (start Docker Compose infra + run tests) is well-understood work.
B (split into blocker): Create a new issue for CI E2E wiring, mark it as a blocker for #355, and adjust the acceptance criterion to "test passes locally and is ready for CI once the pipeline exists."

Recommendation from personas: Option A if the CI job is small enough to do in one session; Option B if it risks scope creep into docker-compose.ci.yml work.

Theme 2 — Un-mark scenario test structure

Raised by: Sara

The "un-marks one block" scenario can be implemented two ways:

A (single test, sequential steps): One test() covers mark-all then un-mark. Simpler but a failure in step 1 masks step 2.
B (two independent tests): Separate test() for each scenario. The un-mark test needs its own beforeAll that marks blocks as reviewed via API before navigating. More isolated, more verbose.

Recommendation from personas: Option B. Two test() calls with independent setup gives clearer failure messages. The extra PUT /review-all API call in the second beforeAll is minimal overhead.

Theme 3 — i18n key for single-block review toggle ARIA label

Raised by: Felix

The single-block review toggle in TranscriptionBlock.svelte uses m.transcription_block_unreview() and m.transcription_block_review() as ARIA labels (line 235). The exact German strings need to be looked up from the Paraglide message files before writing the E2E selector. Using a regex (/Als fertig markieren/) risks not matching if the actual message differs.

Action: Before writing the selector, read frontend/src/paraglide/messages/ for the correct transcription_block_review and transcription_block_unreview values and use the exact string (or a confirmed regex) in the test.

## Decision Queue Consolidated open decisions from all personas. Each needs a call before implementation starts. --- ### Theme 1 — CI E2E gap vs. acceptance criterion scope **Raised by:** Markus, Tobias The acceptance criterion "Test passes in CI" cannot be satisfied: no E2E job exists in `ci.yml`. The CI pipeline runs Vitest unit/component tests only — Playwright E2E is entirely absent. **Options:** - **A (expand scope):** Add a basic `e2e-tests` CI job in this PR. Reasonable since the AC explicitly requires it and a minimal job (start Docker Compose infra + run tests) is well-understood work. - **B (split into blocker):** Create a new issue for CI E2E wiring, mark it as a blocker for #355, and adjust the acceptance criterion to "test passes locally and is ready for CI once the pipeline exists." **Recommendation from personas:** Option A if the CI job is small enough to do in one session; Option B if it risks scope creep into `docker-compose.ci.yml` work. --- ### Theme 2 — Un-mark scenario test structure **Raised by:** Sara The "un-marks one block" scenario can be implemented two ways: - **A (single test, sequential steps):** One `test()` covers mark-all then un-mark. Simpler but a failure in step 1 masks step 2. - **B (two independent tests):** Separate `test()` for each scenario. The un-mark test needs its own `beforeAll` that marks blocks as reviewed via API before navigating. More isolated, more verbose. **Recommendation from personas:** Option B. Two `test()` calls with independent setup gives clearer failure messages. The extra `PUT /review-all` API call in the second `beforeAll` is minimal overhead. --- ### Theme 3 — i18n key for single-block review toggle ARIA label **Raised by:** Felix The single-block review toggle in `TranscriptionBlock.svelte` uses `m.transcription_block_unreview()` and `m.transcription_block_review()` as ARIA labels (line 235). The exact German strings need to be looked up from the Paraglide message files before writing the E2E selector. Using a regex (`/Als fertig markieren/`) risks not matching if the actual message differs. **Action:** Before writing the selector, read `frontend/src/paraglide/messages/` for the correct `transcription_block_review` and `transcription_block_unreview` values and use the exact string (or a confirmed regex) in the test.

Sign in to join this conversation.

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: marcel/familienarchiv#355