[test] Playwright e2e: PDF render-decode correctness across image codecs (CCITT/JBIG2/JPEG/JPX + multi-page + text-only) #714

Open
opened 2026-06-01 21:22:32 +02:00 by marcel · 0 comments
Owner

Context

Follow-up to #708 / PR #713. That fix configured pdf.js 5.x wasmUrl so CCITT/JBIG2/JPEG2000 scans stop rendering blank, and serves the wasm from /pdfjs-wasm/.

In #713 we attempted an in-browser vitest behavioral test (render a committed fixture with the real pdf.js loader, assert the canvas is non-blank). It was green locally but flaky in CI — the pdf.js worker could not fetch /pdfjs-wasm/ in the CI Chromium container, so the canvas stayed blank. That test was removed in favour of deterministic guards:

  • unit guard: getDocument is called with a non-null wasmUrl ending in /;
  • build-output guard: frontend/scripts/assert-pdfjs-wasm.mjs (postbuild) asserts jbig2.wasm/openjpeg.wasm ship into build/client/pdfjs-wasm/;
  • manual: node build serves /pdfjs-wasm/jbig2.wasm → 200 application/wasm.

These cover the regression deterministically, but no automated test asserts a real page actually paints. Sara (QA) and Elicit (Requirements) flagged this gap on #713; this issue tracks closing it properly.

Why e2e, not vitest browser mode

The vitest browser server's serving of /pdfjs-wasm/ (and pdf.js worker behaviour) is environment-fragile in CI. A Playwright e2e test runs against a real built server (npm run build && npm run preview, or the adapter-node node build), where /pdfjs-wasm/ is served from build/client/ exactly as in production — the reliable place to assert real rendering. frontend/e2e/ already exists with Playwright configured and e2e/fixtures/*.pdf.
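
For illustration, running the spec against the built server could be wired through Playwright's webServer option, roughly as below. This is a sketch, not the repository's actual config: the port (Vite's default preview port, 4173), the testDir value, and the timeout are assumptions, and the existing Playwright setup in frontend/e2e/ may already cover this.

```typescript
// playwright.config.ts (sketch; the repository's existing config may differ)
import { defineConfig } from '@playwright/test';

// Assumption: 4173 is Vite's default preview port; adjust if the project overrides it.
const PORT = 4173;

export default defineConfig({
  testDir: './e2e', // assumed location, matching frontend/e2e/
  use: { baseURL: `http://localhost:${PORT}` },
  webServer: {
    // Build first so /pdfjs-wasm/ is served from build/client/, as in production.
    command: 'npm run build && npm run preview',
    url: `http://localhost:${PORT}`,
    // Reuse a locally running server, but always start a fresh one in CI.
    reuseExistingServer: !process.env.CI,
    timeout: 120_000, // the build can be slow; the value is illustrative
  },
});
```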

Scope

Add a Playwright e2e spec that, against a built+served frontend, renders PDFs through the document viewer and asserts the page canvas is non-blank (sample the canvas via page.evaluate + getImageData, count non-background pixels > threshold — mirror the pixel oracle from the removed vitest test).
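
The pixel oracle described above might look roughly like the following. It is a minimal sketch, not the code from the removed vitest test: the function names, the white background, the per-channel tolerance, and the 500-pixel threshold are all hypothetical and would need tuning against the real fixtures.

```typescript
// Hypothetical helper: names, tolerance and threshold are illustrative only.
type Rgb = readonly [number, number, number];

/**
 * Counts pixels in an RGBA buffer (as returned by getImageData().data) that
 * differ from `background` by more than `tolerance` on at least one channel.
 */
function countNonBackgroundPixels(
  rgba: Uint8ClampedArray,
  background: Rgb = [255, 255, 255],
  tolerance = 8,
): number {
  let count = 0;
  for (let i = 0; i + 3 < rgba.length; i += 4) {
    // Fully transparent pixels were never painted, so they count as blank.
    if (rgba[i + 3] === 0) continue;
    const differs =
      Math.abs(rgba[i] - background[0]) > tolerance ||
      Math.abs(rgba[i + 1] - background[1]) > tolerance ||
      Math.abs(rgba[i + 2] - background[2]) > tolerance;
    if (differs) count++;
  }
  return count;
}

/** A canvas is treated as non-blank once enough pixels differ from the background. */
function isNonBlank(rgba: Uint8ClampedArray, minPixels = 500): boolean {
  return countNonBackgroundPixels(rgba) > minPixels;
}
```

In the spec, the counting would most likely run inside page.evaluate on the data from canvas.getContext('2d').getImageData(...), so that only a number crosses the browser boundary. Skipping transparent pixels means a canvas that was never painted at all also counts as blank.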

Codec / structure matrix (one fixture per row)

  • CCITTFax (G4) — non-blank. This is the actual #708 failure class; exercises the shared jbig2.wasm module.
  • JBIG2 — non-blank. Same jbig2.wasm path; needs a real JBIG2 fixture (no jbig2enc was available when #713 was done — source a small public-domain sample or generate one in CI).
  • DCTDecode (JPEG) — non-blank (no-regression; decodes natively).
  • JPEG2000 / JPXDecode — non-blank (covered by openjpeg.wasm); assert if a fixture can be produced, else explicitly skip with a note.
  • Multi-page mixed-codec (e.g. page 1 JPEG + page 2 CCITT, via pdfunite): both pages render non-blank, as does any page between them in a longer fixture.
  • Text-only (no image XObject — e2e/fixtures/minimal.pdf already qualifies) — renders, guards over-scoping.
  • Console-warning assertion — fail if JBig2 failed to initialize / wasmUrl warnings appear in the page console.
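
The console-warning row could be backed by a small filter like the one below. This is a sketch under assumptions: the patterns are taken from the warning texts named in this issue, the exact wording pdf.js emits is not verified here, and ConsoleEntry and findDecoderWarnings are hypothetical names.

```typescript
// Assumption: these patterns mirror the warnings named in this issue;
// the exact strings emitted by pdf.js should be confirmed against a real failure.
const FORBIDDEN_CONSOLE_PATTERNS: readonly RegExp[] = [
  /JBig2 failed to initialize/i,
  /wasmUrl/i,
];

// Hypothetical shape: one collected browser console message.
interface ConsoleEntry {
  type: string; // e.g. 'log', 'warning', 'error'
  text: string;
}

/** Returns the warning/error entries whose text matches a forbidden pattern. */
function findDecoderWarnings(
  entries: readonly ConsoleEntry[],
  patterns: readonly RegExp[] = FORBIDDEN_CONSOLE_PATTERNS,
): ConsoleEntry[] {
  return entries.filter(
    (entry) =>
      (entry.type === 'warning' || entry.type === 'error') &&
      patterns.some((pattern) => pattern.test(entry.text)),
  );
}
```

In a spec, entries could be collected with page.on('console', (msg) => entries.push({ type: msg.type(), text: msg.text() })) registered before navigation, with the case asserting that findDecoderWarnings(entries) is empty once the pages have rendered.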

Fixtures

  • Hermetic, committed (synthesize with ImageMagick: -compress Group4 for CCITT, -compress JPEG for DCT; pdfunite to build the mixed-codec doc). Do not fetch from staging at test time.
  • A document-viewer route needs a document with a file; either seed one in the e2e setup or add a minimal harness route that renders PdfViewer against a fixture URL (prefer the former to avoid test-only routes).
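
Because the fixtures are synthesized, it may be worth checking that each committed PDF actually uses the codec its matrix row claims. The sketch below scans the raw bytes for the standard PDF filter names. detectImageFilters is a hypothetical helper, and it assumes the image dictionaries are stored uncompressed: a fixture that packs its objects into compressed object streams would not be detected.

```typescript
// Standard PDF stream filter names for the image codecs in the matrix.
const IMAGE_FILTERS = ['CCITTFaxDecode', 'JBIG2Decode', 'DCTDecode', 'JPXDecode'] as const;
type ImageFilter = (typeof IMAGE_FILTERS)[number];

/**
 * Hypothetical helper: reports which image filters are named in a PDF's bytes.
 * Assumes image dictionaries are not hidden inside compressed object streams.
 */
function detectImageFilters(pdfBytes: Uint8Array): Set<ImageFilter> {
  // A single-byte decoding maps every byte to one character, so binary
  // stream data cannot cause decode errors while scanning for ASCII names.
  const text = new TextDecoder('latin1').decode(pdfBytes);
  const found = new Set<ImageFilter>();
  for (const filter of IMAGE_FILTERS) {
    if (text.includes(`/${filter}`)) found.add(filter);
  }
  return found;
}
```

A fixture check could then read each file with Node's fs.readFileSync and compare the result with the expected codec, for example an empty set for e2e/fixtures/minimal.pdf.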

Acceptance criteria

  • A Playwright e2e spec covering the matrix above passes in CI against the built frontend (not the vitest dev server).
  • The CCITT case is demonstrably load-bearing: it fails (blank) if wasmUrl is removed from getDocument.
  • No wasmUrl / JBig2 failed to initialize console warnings on affected docs.
  • Runs in the existing frontend/e2e/ Playwright setup; the spec documents how to regenerate the fixtures.

Out of scope

  • Re-introducing an in-browser vitest render gate (the fragile path #713 removed).
  • standardFontDataUrl / iccUrl (still no affected document found).

References

  • #708, PR #713 (the deterministic guards + ADR-028)
  • frontend/src/lib/document/viewer/usePdfRenderer.svelte.ts (WASM_URL), frontend/scripts/assert-pdfjs-wasm.mjs, docs/adr/028-pdfjs-wasm-decoders-and-csp-constraint.md
marcel added the P2-medium and test labels 2026-06-01 21:22:36 +02:00

Reference: marcel/familienarchiv#714