test(ocr): add integration test for full streaming pipeline with a real image #258
Background
Deferred during PR #255 review cycle 1 (Sara Holt — QA review).
Concern
The preprocessing pipeline is tested at unit level only.
test_stream.py mocks the OCR engine, and test_preprocessing.py tests preprocess_page in isolation. There is no test that spins up the FastAPI app and posts a real image through /ocr/stream to verify the full event sequence end-to-end.
Suggested approach
Use httpx.AsyncClient + ASGITransport (already used in other tests) with a small real PNG/PDF image. The test should assert that:
- a preprocessing event is emitted for each page
- a page event follows each preprocessing event
- a done event closes the stream
This would catch bugs in the generate()/generate_guided() ordering that unit tests cannot catch.
Why deferred
Requires a real PDF fixture to be included in the test assets and needs the OCR models to be available (or a lightweight mock that doesn't need patching). Infrastructure work beyond the scope of PR #255.
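The three assertions from the suggested approach can be kept independent of the OCR models by putting them in a small pure helper. This is only a sketch: check_event_order is a hypothetical name, and it assumes each streamed event exposes a type name equal to preprocessing, page, or done, with no other event types interleaved. How those names are extracted from the /ocr/stream response depends on the wire format, which this issue does not specify.

```python
from typing import Iterable, List


def check_event_order(event_types: Iterable[str], page_count: int) -> List[str]:
    """Return a list of ordering violations; an empty list means the stream is valid.

    Expected sequence (taken from the issue text): for every page one
    "preprocessing" event followed by one "page" event, then a single
    closing "done" event. The helper name and signature are hypothetical.
    """
    events = list(event_types)
    problems: List[str] = []

    # The stream must be closed by exactly one "done" event.
    if not events or events[-1] != "done":
        problems.append("stream does not end with a 'done' event")
    if events.count("done") > 1:
        problems.append("more than one 'done' event")

    # Everything before the closing "done" must be (preprocessing, page) pairs.
    body = events[:-1] if events and events[-1] == "done" else events
    for i in range(0, len(body), 2):
        pair = body[i:i + 2]
        if pair != ["preprocessing", "page"]:
            problems.append(
                f"events {i}-{i + 1}: expected preprocessing then page, got {pair}"
            )

    # One pair per page, no more and no fewer.
    if body.count("preprocessing") != page_count:
        problems.append("number of 'preprocessing' events does not match page count")
    if body.count("page") != page_count:
        problems.append("number of 'page' events does not match page count")

    return problems
```

In the integration test, the event names would be collected from the response obtained through httpx.AsyncClient over ASGITransport and then checked with something like assert check_event_order(names, page_count) == []. Returning the list of violations instead of a bare boolean keeps the failure message readable when generate() or generate_guided() emits events out of order.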
Reference
PR: http://heim-nas:3005/marcel/familienarchiv/pulls/255