feat(transcription): E2E test for bulk "Alle als fertig markieren" action #355
Reference in New Issue
Block a user
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Context
PR #352 shipped the bulk "Alle als fertig markieren" action without a Playwright E2E test. The unit and component tests are sufficient for the Demo Day milestone, but the E2E coverage gap remains open.
Raised by Sara (QA) in the PR #352 Decision Queue (Item 3).
Scenario to cover
Acceptance criteria
frontend/e2e/covers the scenario aboveLinks
👨💻 Markus Keller — Application Architect
Observations
PUT /api/documents/{documentId}/transcription-blocks/review-alllives inTranscriptionBlockController, delegating toTranscriptionService.markAllBlocksReviewed(). No boundary violations.transcription.spec.tsusesbeforeAllwith API-seeded data and noafterAllcleanup. The test document leaks into the database on every CI run. The upcoming test should not repeat this mistake.PUT /{blockId}/review; the bulk action goes throughPUT /review-all. Both return the updated block(s) and patch local state in the page component. This symmetric design makes the E2E scenario straightforward to write.ci.ymlhas no E2E job — onlyunit-testsrunning in the Playwright container). The acceptance criterion "Test passes in CI" is therefore currently impossible without a CI job for E2E. This is a pre-existing gap, but worth noting.Recommendations
afterAllcleanup (DELETE the test document via API) to the new test, mirroring the pattern already established intranscribe-coach.spec.ts. The existingtranscription.spec.tsomits this — do not copy that omission.transcription.spec.tsas a newtest.describeblock rather than a separate file. The issue already permits this, and keeping the transcription fixtures in one file avoids duplicatedbeforeAllsetup.ci.yml.Open Decisions
👨💻 Felix Brandt — Fullstack Developer
Observations
TranscriptionEditView.svelte.spec.tsis solid: button visibility, disabled state, single-call guarantee, and in-flight disabling are all covered at the unit layer usingvitest-browser-svelte. What's missing is the full-stack round-trip: real API call → database update → UI reflects new state after reload.createEmptyDocument(inhelpers/upload-empty-document.ts) already handles document creation and PDF upload. AcreateTwoBlockshelper following the same pattern would be the minimal addition needed for this test.reviewTogglein+page.sveltepatches local reactive state directly after the API response (transcriptionBlocks.map(b => b.id === blockId ? updated : b)). ThemarkAllRevieweddoes a loop mutation on the same array. Both patterns are correct Svelte 5 reactive mutations — nothing to fix, just be aware the E2E test needs to assert on DOM state, not JS state.TranscriptionBlock.svelte,aria-labeldriven bym.transcription_block_unreview()/m.transcription_block_review()). Use the ARIA label to locate it reliably.data-block-idattribute exists on each block's wrapper div — this gives a stable selector to identify which block to un-mark.Recommendations
createBlocksForDocument(request, docId, count)helper intohelpers/and reuse it here and intranscribe-coach.spec.ts'screateBlockfunction (currently inlined). Single place to maintain.page.getByRole('button', { name: /Alle als fertig markieren/ })for the bulk action, andpage.getByRole('button', { name: /Als fertig markieren/ }).first()(or.nth(0)) for the single-block toggle. Avoid CSS class selectors — they break on style refactors.await expect(page.getByText(/2 \/ 2 geprüft/)).toBeVisible()after bulk mark, andawait expect(page.getByText(/1 \/ 2 geprüft/)).toBeVisible()after un-marking one. This is the most user-visible signal and directly matches the acceptance criterion.afterAllwithrequest.delete(\/api/documents/${docId}`)` to avoid test data accumulation.page.waitForSelector('[data-hydrated]')in the new test — the existingtranscription.spec.tsuses it, but the cleaner pattern fromtranscribe-coach.spec.tsis to wait for a specific visible element instead.Open Decisions
m.*function name in Paraglide before writing the selector. CheckTranscriptionBlock.sveltelines 234–235 for the exact key.👨💻 Tobias Wendt — DevOps & Platform Engineer
Observations
frontend/e2e/) is not wired intoci.yml. The current CI pipeline runsunit-testsinside the Playwright Docker container (mcr.microsoft.com/playwright:v1.58.2-noble) but only runs Vitest unit/component tests — no Playwright E2E job exists. The acceptance criterion "Test passes in CI" in this issue cannot be satisfied without adding an E2E job.v1.58.2-noble) should match the@playwright/testversion infrontend/package.jsonto prevent version skew between local dev and CI.e2e/helpers/upload-empty-document.tshelper reads a PDF fixture from disk using a relative path resolved at import time — this works in both local and CI environments as long as the repo is checked out, which it is.retries: process.env.CI ? 2 : 0is already configured inplaywright.config.ts. This is the right approach — retry logic handles infrastructure flakiness in CI without masking real failures locally.Recommendations
e2e-testsjob toci.ymlthat: (1) startsdocker compose -f docker-compose.yml -f docker-compose.ci.yml up -d db minio, (2) starts the backend JAR, (3) runsnpm run test:e2e. Without this, the "passes in CI" acceptance criterion is permanently unmet.package.jsonto match the Docker image version used in CI (v1.58.2). Mismatches cause protocol errors that are painful to debug.transcription.spec.tsbeforeAllleaks a document on every test run — it creates the document and never deletes it. Over many CI runs this accumulates test data. The new test should includeafterAllcleanup, and the existingtranscription.spec.tsshould be patched in a follow-up.Open Decisions
👨💻 Elicit — Requirements Engineer
Observations
frontend/e2e/transcription.spec.tsas the target file.{reviewedCount} / {totalCount} geprüfttext. The acceptance criterion should name this specifically so the implementer knows which element to assert on.Recommendations
"{reviewedCount} / {totalCount} geprüft" label shows "2 / 2 geprüft"after bulk mark and"1 / 2 geprüft"after un-marking.👨💻 Nora "NullX" Steiner — Security Engineer
Observations
PUT /api/documents/{documentId}/transcription-blocks/review-allendpoint is correctly protected with@RequirePermission(Permission.WRITE_ALL). The controller test suite verifies both 401 (unauthenticated) and 403 (READ_ALL only) cases — this is the right security test pattern.markAllReviewed()function in+page.svelteuses a rawfetch()call (not the typed API client) without checking the response before mutating local state. Currently it returns early on!res.ok, but it does not surface an error to the user. This is a UX gap rather than a security issue — failed authorization would silently leave state unchanged.adminuser (fromauth.setup.tscredentials:admin/admin123via env varsE2E_USERNAME/E2E_PASSWORD). This means the test will always haveWRITE_ALLpermission. The E2E test does not need to verify authorization — that is already covered by the controller unit tests. The E2E test's job is to verify the UI flow, not re-test access control.admin123) are set via environment variables with a fallback to hardcoded values inhelpers/auth.ts. This is a test-only file and the value is the well-known dev admin password, so the risk is low — but it should be noted that if CI ever stores real credentials here, they must be moved to Gitea secrets.Recommendations
@WebMvcTestlayer inTranscriptionBlockControllerTest.java. Keep E2E tests focused on the happy path user flow.+page.svelte, themarkAllReviewed()function should surface a user-facing error when!res.ok, consistent with how other mutations handle failures. This is outside the scope of this issue but worth a follow-up.E2E_USERNAMEandE2E_PASSWORDare sourced from Gitea secrets in the CI workflow, not hardcoded. The fallbackadmin123inhelpers/auth.tsis acceptable for local dev but must not be the CI source of truth.👨💻 Sara Holt — QA Engineer & Test Strategist
Observations
TranscriptionEditView.svelte.spec.tsis genuinely good: 7 cases covering button visibility, disabled states, click propagation, and in-flight disabling. The E2E test's job is therefore narrowly defined — it only needs to prove the full-stack round-trip works, not re-verify component behavior.transcription.spec.tshas a structural problem:beforeAllcreates a document and blocks with noafterAllcleanup. Every test run leaves a document in the database. The new test must not copy this pattern.transcribe-coach.spec.tsdoes it correctly withafterAll+request.delete.page.waitForSelector('[data-hydrated]')pattern intranscription.spec.tsis a smell — it waits for a custom DOM attribute that the app sets, creating an implicit coupling between test and implementation.transcribe-coach.spec.tsavoids this by waiting for a specific visible element. Use the latter pattern.transcription.spec.tsandtranscribe-coach.spec.tsdemonstrate correctly.workers: 1/fullyParallel: falseconfiguration means tests share a single auth session and run sequentially. State leaked from a prior test (e.g. a block's review state changed by a previous test) can cause failures. Creating fresh documents perdescribeblock is the right mitigation —transcribe-coach.spec.tsdemonstrates this withbeforeAll/afterAllbracketing each describe.Recommendations
test.describeblock insidetranscription.spec.ts, with its ownbeforeAll(create doc + upload PDF + create 2 blocks via API) andafterAll(delete doc). Do not share setup state with the existing describe block.test()calls (one per scenario step) rather than one long test that does both mark-all and un-mark. A failure in the mark-all step would otherwise mask the un-mark behavior.{n} / {n} geprüftcounter text as the primary assertion — it is the most visible user-facing signal and directly reflects thereviewedCount / totalCountderived state.await page.screenshot({ path: 'test-results/e2e/transcription-mark-all-reviewed.png' }).Open Decisions
test()or a continuation within the same test as the bulk mark? Running them sequentially within onetest()risks masking which step failed. Two separate tests, with the second starting from a fresh "all reviewed" state, is cleaner but requires the second beforeAll to also mark all blocks as reviewed via API before navigating.👨💻 Leonie Voss — UX Designer & Accessibility Strategist
Observations
TranscriptionEditView.sveltehasmin-h-[44px]— meets the WCAG 2.2 minimum touch target. Good.focus-visible:ring-2 focus-visible:ring-brand-navyfor keyboard focus — correct pattern.bg-brand-mintonbg-brand-sand) is purely decorative (no text alternative) — this is acceptable since the{reviewedCount} / {totalCount} geprüfttext label already conveys the information redundantly. No a11y gap here.TranscriptionBlock.svelte, line 235) usesaria-labelthat changes between reviewed/unreviewed states — this is the correct pattern for toggle buttons. The E2E test should assert on thearia-labelchange to confirm the toggle worked, not just on color changes.markingAllReviewedusesaria-hidden="true"— correct, since the button text "Alle als fertig markieren" is still present. However, there is noaria-busyor live region announcement when the operation completes. A screen reader user clicking the button gets no feedback that the operation succeeded. This is a pre-existing gap worth a follow-up issue.disabled={allReviewed || markingAllReviewed}state removes the button from the interaction model without explanation. Thetitleattribute only shows whenallReviewed— duringmarkingAllReviewed, there is no tooltip. Minor issue since the spinner is visible, but worth noting.Recommendations
AxeBuilderafter the bulk-mark state is reached, just as the existing transcription test does at the end. This catches any a11y regressions introduced by the new UI state (all blocks reviewed).aria-labelof the first block's review toggle after un-marking: it should change from the "unmark" label back to the "mark as reviewed" label. This validates the toggle actually changed state, not just that a visual indicator changed.aria-live="polite"to the{reviewedCount} / {totalCount} geprüftcounter so screen readers announce the count change after bulk mark. This is outside the scope of this issue but is a meaningful accessibility improvement for the transcriber audience.Decision Queue
Consolidated open decisions from all personas. Each needs a call before implementation starts.
Theme 1 — CI E2E gap vs. acceptance criterion scope
Raised by: Markus, Tobias
The acceptance criterion "Test passes in CI" cannot be satisfied: no E2E job exists in
ci.yml. The CI pipeline runs Vitest unit/component tests only — Playwright E2E is entirely absent.Options:
e2e-testsCI job in this PR. Reasonable since the AC explicitly requires it and a minimal job (start Docker Compose infra + run tests) is well-understood work.Recommendation from personas: Option A if the CI job is small enough to do in one session; Option B if it risks scope creep into
docker-compose.ci.ymlwork.Theme 2 — Un-mark scenario test structure
Raised by: Sara
The "un-marks one block" scenario can be implemented two ways:
test()covers mark-all then un-mark. Simpler but a failure in step 1 masks step 2.test()for each scenario. The un-mark test needs its ownbeforeAllthat marks blocks as reviewed via API before navigating. More isolated, more verbose.Recommendation from personas: Option B. Two
test()calls with independent setup gives clearer failure messages. The extraPUT /review-allAPI call in the secondbeforeAllis minimal overhead.Theme 3 — i18n key for single-block review toggle ARIA label
Raised by: Felix
The single-block review toggle in
TranscriptionBlock.svelteusesm.transcription_block_unreview()andm.transcription_block_review()as ARIA labels (line 235). The exact German strings need to be looked up from the Paraglide message files before writing the E2E selector. Using a regex (/Als fertig markieren/) risks not matching if the actual message differs.Action: Before writing the selector, read
frontend/src/paraglide/messages/for the correcttranscription_block_reviewandtranscription_block_unreviewvalues and use the exact string (or a confirmed regex) in the test.