Marcel dd078d50da
fix(ocr): extract PDF pages as PNGs before running kraken OCR
Kraken's -f pdf mode tries to write output next to the input file,
which fails on read-only mounts. Instead, extract pages as PNGs via
pypdfium2 (already installed), then run kraken on each image.
Both models run in a single container per PDF to avoid overhead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
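
A minimal sketch of the approach described in the commit message, not the repository's actual code: render each PDF page to a PNG in a writable directory with pypdfium2, then invoke the kraken CLI once per image and model. The helper names (`page_png_path`, `extract_pages`, `kraken_command`, `ocr_pdf`), the file naming scheme, the 300 DPI render scale, and the use of the baseline segmenter (`segment -bl`) are assumptions made for illustration. The pypdfium2 calls assume a version 4 style API.

```python
from pathlib import Path
import subprocess


def page_png_path(out_dir: Path, pdf_path: Path, index: int) -> Path:
    """Build the PNG path for a zero-based page index (hypothetical naming scheme)."""
    return out_dir / f"{pdf_path.stem}_p{index + 1:04d}.png"


def extract_pages(pdf_path: Path, out_dir: Path, scale: float = 300 / 72) -> list[Path]:
    """Render every page of the PDF to a PNG inside out_dir.

    out_dir must be writable; the PDF itself may sit on a read-only mount.
    """
    import pypdfium2 as pdfium  # imported lazily so the pure helpers work without it

    out_dir.mkdir(parents=True, exist_ok=True)
    pdf = pdfium.PdfDocument(str(pdf_path))
    paths = []
    try:
        for index in range(len(pdf)):
            # scale is relative to 72 DPI, so 300 / 72 gives roughly 300 DPI
            image = pdf[index].render(scale=scale).to_pil()
            target = page_png_path(out_dir, pdf_path, index)
            image.save(target)
            paths.append(target)
    finally:
        pdf.close()
    return paths


def kraken_command(image: Path, output: Path, model: str) -> list[str]:
    """Assemble a kraken call for one image; -bl (baseline segmenter) is an assumption."""
    return ["kraken", "-i", str(image), str(output), "segment", "-bl", "ocr", "-m", model]


def ocr_pdf(pdf_path: Path, work_dir: Path, models: list[str]) -> list[Path]:
    """Extract pages once, then run each model on each page image."""
    outputs = []
    for image in extract_pages(pdf_path, work_dir):
        for model in models:
            # e.g. scan_p0001.png + german.mlmodel -> scan_p0001.german.txt
            output = image.with_name(f"{image.stem}.{Path(model).stem}.txt")
            subprocess.run(kraken_command(image, output, model), check=True)
            outputs.append(output)
    return outputs
```

Because all intermediate PNGs and text outputs go to `work_dir`, nothing is written next to the input PDF, which is the failure the commit describes for read-only mounts. Looping over both models inside one `ocr_pdf` call mirrors the "single container per PDF" idea: the pages are extracted once and reused for each model.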
2026-04-12 20:37:29 +02:00
2026-03-17 13:35:32 +00:00
44 MiB
Languages
Python 73.3%
TypeScript 11.4%
Java 10.8%
Svelte 4.2%
Shell 0.1%