Commit Graph

21 Commits

Author SHA1 Message Date
Marcel
0801da8df0 docs(ocr): explain why two metrics tests skip fresh_metrics fixture
Some checks failed
CI / Backend Unit Tests (push) Successful in 3m42s
CI / fail2ban Regex (push) Successful in 43s
CI / Semgrep Security Scan (push) Successful in 19s
CI / Compose Bucket Idempotency (push) Successful in 1m0s
CI / Unit & Component Tests (pull_request) Successful in 3m24s
CI / OCR Service Tests (pull_request) Successful in 20s
CI / Backend Unit Tests (pull_request) Successful in 3m28s
CI / fail2ban Regex (pull_request) Successful in 43s
CI / Semgrep Security Scan (pull_request) Successful in 19s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m1s
CI / Unit & Component Tests (push) Failing after 2m44s
CI / OCR Service Tests (push) Successful in 20s
Sara's cycle-2 S2: clarify the latent (but not actual) cross-test state
risk on the two metrics tests that hit the global REGISTRY instead of
the per-test fresh_metrics fixture. Migrating them would actually break
them — the /metrics endpoint is served by prometheus-fastapi-instrumentator
which binds to the default REGISTRY at app-construction time, and the
http_requests_total assertion only finds counters on that global
registry. Both tests already assert response shape only (status code,
content-type substring, body substrings), not numeric values, so the
shared-registry caveat is documented for future readers rather than
treated as a bug to fix.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 17:23:32 +02:00
Marcel
e0e1578bdd test(ocr): widen spell-check exclusion bound to 0.09s with rationale
Sara's cycle-2 S1: the wall-clock assertion at < 0.05s could trip on a
slow CI runner under load even when the timer correctly excludes
spell-check. Sara's preferred structural fix (patch main.time.monotonic
with a deterministic sequence) proved awkward — the patched attribute is
the *global* time.monotonic which httpx and asyncio consume, exhausting
the sequence before the request reaches the engine loop.

Take the documented fallback: widen the bound to 0.09s and explain why.
The failure mode the test guards against (spell-check inside the timer)
would add 0.1s (2 × 0.05s sleep), so 0.09s catches the bug while leaving
~90ms of headroom for slow CI runners. Verified red→green by temporarily
moving correct_text inside the timer block: bound trips at 0.101s; the
fixed code reads ~0.001s.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-21 17:22:49 +02:00
Marcel
ddf6cf4cbc test(ocr): collapse shared client setup into ocr_client helper
Each metrics test was repeating the same five-line block — patch
kraken_engine.load_models, patch load_spell_checker, instantiate the
AsyncClient, force _models_ready True, restore it. Lift the lot into a
single async context manager so each test body shrinks to its real
arrange / act / assert intent.

Tests that drive the lifespan directly (models_ready gauge) or stub
asyncio.to_thread for /train (which already patches _models_ready) stay
unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 17:03:29 +02:00
Marcel
0179e93a4b test(ocr): narrow training error test to subprocess.run seam
The asyncio.to_thread patch stubbed out the entire _run_training call,
hiding the real error path. Replacing it with a failing CompletedProcess
from subprocess.run exercises the actual ketos-failed branch and keeps
the test's intent — error counter bumps, 500 surfaces — intact.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 16:55:14 +02:00
Marcel
0fc0cbcffd test(ocr): lock in MetricsPathFilter fail-open behavior
If uvicorn's access log format ever changes (args=None, or shorter
than 3 elements), the filter must keep forwarding records rather than
silently dropping them. Two extra LogRecords cover both edge cases.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 16:54:24 +02:00
Marcel
549cb15845 test(ocr): cover /train-sender counter and accuracy=None gauge default
Two regression tests:
- /train-sender hitting the success path bumps the recognition counter
  (previously only /train and /segtrain were covered).
- A successful run whose result.accuracy is None must not call set() on
  ocr_model_accuracy — the gauge stays at its default 0.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 16:53:48 +02:00
Marcel
74ddf16b01 feat(ocr): time only engine work in guided stream histogram
Previously the guided generator's page_started timer wrapped the entire
region loop including the synchronous correct_text() call, inflating
ocr_processing_seconds with spell-check latency. Sum the per-region
engine.extract_region_text durations instead so the histogram matches
the unguided stream's "engine only" semantic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 16:53:04 +02:00
Marcel
ebaedb1af0 test(ocr): assert ocr_jobs_total stays zero when stream download fails
Locks in the post-download placement of the counter increment so a
regression that moves it back above _download_and_convert_pdf would fail.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 16:51:23 +02:00
Marcel
525f091b3a feat(ocr): suppress uvicorn access logs for /metrics and /health
Adds a logging.Filter on uvicorn.access that drops records whose request
path is /metrics or /health. Each is hit on a tight schedule (Prometheus
scrape interval and Docker healthcheck), so unfiltered they dominate the
access log without carrying any information about real traffic.

Refs #652 (Nora's recommendation)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 16:16:14 +02:00
Marcel
d6abf990c7 feat(ocr): flip ocr_models_ready to 1 once the lifespan startup finishes
Mirrors the existing _models_ready bool so Prometheus has a time-series
liveness/readiness signal for future alerting rules (e.g.
ocr_models_ready < 1 for 2m).

Refs #652 (AC7)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 16:15:11 +02:00
Marcel
77d59c5d83 test(ocr): assert ocr_model_accuracy gauge is set per kind on success
Hits /train then /segtrain through the same test, each with a distinct
mocked accuracy, and asserts the labelled gauges reflect the two values.
Locks down the kind-label separation between recognition and segmentation
accuracy (decision #2).

Refs #652 (AC6)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 16:13:05 +02:00
Marcel
6c2b9af10b feat(ocr): record training runs in ocr_training_runs_total per kind and outcome
Wraps the await asyncio.to_thread(_run_*) calls in /train, /train-sender,
and /segtrain with try/except. Recognition training (/train, /train-sender)
shares kind="recognition"; /segtrain uses kind="segmentation". The
ocr_model_accuracy gauge is set per kind on success.

Refs #652 (AC6, decision #2)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 16:12:26 +02:00
Marcel
2e3744d9ef feat(ocr): observe ocr_processing_seconds around engine.to_thread calls
Wraps every asyncio.to_thread(engine.extract_*) call with time.monotonic()
deltas in /ocr (per document) and in both /ocr/stream generators (per page).
Streaming buckets are the useful operational signal; the non-streaming
observation is a bonus.

Refs #652 (AC5)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 16:09:25 +02:00
Marcel
131ed336bc feat(ocr): count words and illegible words at the OCR call sites
Walks block["words"] before apply_confidence_markers strips the list, then
increments ocr_words_total by len(words) and ocr_illegible_words_total by
the count below threshold. Same pattern in both /ocr and /ocr/stream so the
ratio illegible/words is a faithful quality signal across endpoints.

Refs #652 (AC4)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 16:07:59 +02:00
Marcel
3fa3460dbf feat(ocr): increment ocr_skipped_pages_total on per-page engine failure
Bumps the counter in both /ocr/stream except blocks (standard and guided
generators) so the existing skipped_pages local variable now also flows
into Prometheus.

Refs #652 (AC3b)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 16:06:50 +02:00
Marcel
79edb94558 feat(ocr): increment ocr_pages_total per successful page in stream
Bumps the counter inside both the standard and guided /ocr/stream
generators after a page yields its blocks, before the per-page json line is
emitted. Also moves the ocr_jobs_total increment for /ocr/stream right after
engine selection so the counter still fires when a page later errors out.

Refs #652 (AC3a)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 16:05:36 +02:00
Marcel
52d8dc2b20 test(ocr): assert ocr_jobs_total label is engine=surya for typewriter
Locks down AC2 for the non-Kurrent path. The same code branch in /ocr that
sets engine_name from script_type now has explicit coverage for both
HANDWRITING_KURRENT → kraken and TYPEWRITER → surya.

Refs #652 (AC2)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 16:04:20 +02:00
Marcel
696b71da5a feat(ocr): increment ocr_jobs_total with engine and script_type labels
Pick engine="kraken" for HANDWRITING_KURRENT, engine="surya" otherwise,
then increment after the blocks have been extracted.

Refs #652 (AC2)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 16:03:37 +02:00
Marcel
f3e3545d06 feat(ocr): add metrics.py factory with test-scoped CollectorRegistry support
Encapsulates every custom OCR metric in an OcrMetrics frozen dataclass and
exposes a `build_metrics(registry)` factory. Production main.py binds against
the default REGISTRY; tests construct a fresh CollectorRegistry per case and
monkeypatch main.metrics, so counter values stay isolated between tests
(decision #3 on issue #652, Option A).

Refs #652

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 16:02:20 +02:00
Marcel
4bb6685edb test(ocr): assert http_* metrics appear after an /ocr request
Locks down AC1: prometheus-fastapi-instrumentator must keep auto-exposing
http_requests_total and http_request_duration_seconds for application
traffic, not just register the /metrics endpoint.

Refs #652

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 16:00:33 +02:00
Marcel
18c93d4eaa feat(ocr): expose /metrics endpoint via prometheus-fastapi-instrumentator
Mount the instrumentator immediately after FastAPI app creation, excluding
/health and /metrics from request metrics to keep http_requests_total focused
on real application traffic.

Refs #652

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-21 15:59:37 +02:00