docs(adr): record pdf.js wasm same-origin serving + future-CSP constraint
Some checks failed
CI / Unit & Component Tests (pull_request) Successful in 3m21s
CI / OCR Service Tests (pull_request) Successful in 23s
CI / fail2ban Regex (pull_request) Has been cancelled
CI / Semgrep Security Scan (pull_request) Has been cancelled
CI / Compose Bucket Idempotency (pull_request) Has been cancelled
CI / Backend Unit Tests (pull_request) Has been cancelled
CI / Unit & Component Tests (push) Successful in 3m18s
CI / OCR Service Tests (push) Successful in 21s
CI / Backend Unit Tests (push) Successful in 3m45s
CI / fail2ban Regex (push) Successful in 44s
CI / Semgrep Security Scan (push) Successful in 21s
CI / Compose Bucket Idempotency (push) Successful in 1m3s
nightly / deploy-staging (push) Successful in 2m14s
Promote the future-CSP constraint from an inline Caddyfile comment to a durable ADR-028: serve the pdf.js wasm decoders same-origin (never a CDN), any future CSP must allow 'wasm-unsafe-eval' + worker-src 'self' blob:, and the build-time guard keeps the wasm shipping. Caddyfile now points at the ADR.

Addresses re-review: Markus (constraint should be an ADR, not a comment).

Refs #708

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

This commit was merged in pull request #713.
60
docs/adr/028-pdfjs-wasm-decoders-and-csp-constraint.md
Normal file
@@ -0,0 +1,60 @@
# ADR-028: pdf.js wasm decoders are served same-origin; a future CSP must allow them

**Date:** 2026-06-01
**Status:** Accepted
**Issue:** #708 (scanned PDFs with CCITT/JBIG2 images render blank)
**Milestone:** Pre-prod read-path hardening

---

## Context

pdf.js 5.x moved the **JBIG2, CCITTFax, and JPEG2000 image decoders into
WebAssembly**. A single `jbig2.wasm` module decodes both JBIG2 and CCITTFax;
`openjpeg.wasm` decodes JPEG2000. These modules live in
`node_modules/pdfjs-dist/wasm/` and are not on the web path by default, and
`getDocument` will not load them unless it is given a `wasmUrl`. Without that,
bi-level black-and-white scans (CCITT G4 fax, ~16% of the archive) painted a
blank canvas in production while JPEG scans rendered fine.

Two cross-cutting, long-lived constraints fall out of the fix and are not
obvious from reading any single file, hence this record.

## Decision

1. **Serve the pdf.js wasm from our own origin**, at the unversioned path
   `/pdfjs-wasm/`, copied from `node_modules/pdfjs-dist/wasm/` into
   `build/client/` at build time by `vite-plugin-static-copy` (a devDependency;
   see `frontend/vite.config.ts`). `getDocument` is called with
   `wasmUrl: '/pdfjs-wasm/'`. **Never point `wasmUrl` at a public CDN**: a
   decoder on the core read path must not become a supply-chain RCE surface.

2. **Any future `Content-Security-Policy` MUST include
   `script-src 'wasm-unsafe-eval'` and `worker-src 'self' blob:`.** pdf.js
   instantiates WebAssembly and runs its renderer in a worker created from a
   `blob:` URL. A CSP without these directives silently re-breaks PDF rendering
   for the exact class of documents #708 fixed. No CSP is set today
   (`infra/caddy/Caddyfile` `(security_headers)`); the Caddyfile carries a
   pointer to this ADR so the future CSP author cannot miss it.

3. **The wasm shipping is guarded at build time.** `frontend/postbuild`
   (`scripts/assert-pdfjs-wasm.mjs`) fails the build loudly if `jbig2.wasm` or
   `openjpeg.wasm` is absent from `build/client/pdfjs-wasm/`, so a future
   `pdfjs-dist` bump that renames or relocates the wasm cannot regress to a
   blank canvas unnoticed. This runs in CI and in the Docker build stage.
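
The same-origin rule in item 1 could be enforced at the call site. The following is a minimal sketch, not code from this repository: `isSameOrigin`, `documentOptions`, and the `https://archive.example` origin used below are hypothetical names chosen for illustration. Only the `wasmUrl: '/pdfjs-wasm/'` value comes from this ADR.

```typescript
// Hypothetical helper, not taken from the repository.
// The path itself is the one this ADR fixes for the wasm decoders.
const PDFJS_WASM_URL = "/pdfjs-wasm/";

// True only when `candidate`, resolved against the page origin,
// stays on that same origin (relative paths always do).
function isSameOrigin(candidate: string, pageOrigin: string): boolean {
  return new URL(candidate, pageOrigin).origin === new URL(pageOrigin).origin;
}

// Build the options object that would be handed to pdf.js `getDocument`.
// Throws instead of silently accepting a CDN or other foreign origin.
function documentOptions(
  url: string,
  pageOrigin: string,
  wasmUrl: string = PDFJS_WASM_URL,
): { url: string; wasmUrl: string } {
  if (!isSameOrigin(wasmUrl, pageOrigin)) {
    throw new Error(`wasmUrl must be same-origin, got ${wasmUrl}`);
  }
  return { url, wasmUrl };
}
```

In a browser the second argument would typically be `window.location.origin`, and the returned object would be passed straight to `getDocument`.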
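
To make the CSP requirement in item 2 concrete, here is a sketch of merging the two required source lists into an otherwise arbitrary policy. No CSP exists in the project today, so the base policy below (`default-src 'self'; script-src 'self'`) and the helper names `withPdfjsSources` and `serializeCsp` are assumptions for illustration only.

```typescript
// Hypothetical helper, not taken from the repository.
type Directives = Record<string, string[]>;

// The only sources this ADR mandates; the rest of a real policy is up to its author.
const PDFJS_REQUIRED: Directives = {
  "script-src": ["'wasm-unsafe-eval'"],
  "worker-src": ["'self'", "blob:"],
};

// Merge the required sources into a base policy without duplicating entries.
function withPdfjsSources(base: Directives): Directives {
  const merged: Directives = {};
  for (const [name, sources] of Object.entries(base)) {
    merged[name] = [...sources];
  }
  for (const [name, required] of Object.entries(PDFJS_REQUIRED)) {
    const current = merged[name] ?? [];
    merged[name] = [...current, ...required.filter((s) => !current.includes(s))];
  }
  return merged;
}

// Serialize to the header value form: "name src src; name src".
function serializeCsp(directives: Directives): string {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(" ")}`)
    .join("; ");
}

// Assumed base policy, for illustration only.
const policy = serializeCsp(
  withPdfjsSources({ "default-src": ["'self'"], "script-src": ["'self'"] }),
);
```

Here `policy` is `default-src 'self'; script-src 'self' 'wasm-unsafe-eval'; worker-src 'self' blob:`. Applying `withPdfjsSources` twice yields the same result, so it is safe to run on a policy that already satisfies the constraint.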
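
The build-time guard in item 3 reduces to a small check. The sketch below is not the contents of `scripts/assert-pdfjs-wasm.mjs`, which this page does not show; `missingDecoders` and `assertDecodersShipped` are hypothetical names, and the directory listing is passed in as an argument so that the logic stays independent of the filesystem.

```typescript
// Decoder modules that must ship, as named in this ADR.
const REQUIRED_WASM = ["jbig2.wasm", "openjpeg.wasm"] as const;

// Return the required modules that are absent from a directory listing.
function missingDecoders(present: readonly string[]): string[] {
  const have = new Set(present);
  return REQUIRED_WASM.filter((name) => !have.has(name));
}

// Throw with an explicit message so a build step fails loudly
// instead of shipping a viewer that renders blank pages.
function assertDecodersShipped(present: readonly string[], dir: string): void {
  const missing = missingDecoders(present);
  if (missing.length > 0) {
    throw new Error(`pdf.js wasm missing from ${dir}: ${missing.join(", ")}`);
  }
}
```

A real postbuild script would presumably feed it the output of `readdirSync("build/client/pdfjs-wasm")` from `node:fs`. That call itself throws if the directory is missing, which also fails the build.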
## Consequences

- The decoders load from the same origin as the app: no third-party trust, no
  SRI to manage, correct `Content-Type: application/wasm` served by
  adapter-node.
- `/pdfjs-wasm/` is **not** content-hashed, so it must not be served
  `immutable`; a revalidating cache avoids serving a stale `.wasm` against a
  newer worker after a pdfjs upgrade.
- The CSP constraint is a standing obligation on whoever introduces a CSP. If
  that work happens, this ADR and the Caddyfile note are the source of truth.
- No new container or external system is introduced, so the C4 L1/L2 diagrams
  are unaffected; `/pdfjs-wasm/` is a static asset served by the existing
  frontend container.
- Render/decode failures are no longer silent: the viewer surfaces a localized
  message plus a working download link (see #708).
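
The caching consequence above can be expressed as a path-based rule. This is a sketch under stated assumptions: this ADR does not say where cache headers are set, `cacheControlFor` is a hypothetical name, and `/_app/immutable/` is assumed to be the content-hashed asset prefix (SvelteKit's default), which this ADR does not confirm.

```typescript
// Hypothetical helper, not taken from the repository.
// Chooses a Cache-Control value from the request path alone.
function cacheControlFor(pathname: string): string | undefined {
  // Unversioned path: the browser may store the file but must revalidate
  // it before every reuse, so a pdfjs upgrade is picked up immediately.
  if (pathname.startsWith("/pdfjs-wasm/")) {
    return "no-cache";
  }
  // Assumed content-hashed prefix: safe to cache for a year as immutable.
  if (pathname.startsWith("/_app/immutable/")) {
    return "public, max-age=31536000, immutable";
  }
  // Anything else: leave the server default untouched.
  return undefined;
}
```

`no-cache` still allows conditional requests, so an unchanged `.wasm` costs only a `304 Not Modified` round trip.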
@@ -25,7 +25,8 @@
 # No Content-Security-Policy is set yet. When one is added, it MUST
 # include `script-src 'wasm-unsafe-eval'` and `worker-src 'self' blob:`
 # or the pdf.js WebAssembly image decoders (JBIG2/CCITTFax/JPEG2000)
-# and worker will be blocked and scanned PDFs render blank. See #708.
+# and worker will be blocked and scanned PDFs render blank.
+# See #708 and docs/adr/028-pdfjs-wasm-decoders-and-csp-constraint.md.
 -Server
 }
 }