From 2df71beb7eac48469b3b963e68d64f4aa693a9c3 Mon Sep 17 00:00:00 2001
From: Marcel
Date: Thu, 21 May 2026 17:06:44 +0200
Subject: [PATCH] docs: add ADR-023 and glossary entries for OCR metrics
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

ADR-023 captures why prometheus-fastapi-instrumentator was chosen, the
build_metrics(registry) factory pattern, and the test rebinding seam.
The glossary gains four ops-aligned terms — illegible word, models-ready
gauge, recognition vs segmentation accuracy — so the metrics
documentation in OBSERVABILITY.md has a vocabulary to lean on.

Co-Authored-By: Claude Sonnet 4.6
---
 docs/GLOSSARY.md                                  |  8 ++
 ...mentator-and-metrics-registry-injection.md     | 94 +++++++++++++++++++
 2 files changed, 102 insertions(+)
 create mode 100644 docs/adr/023-prometheus-instrumentator-and-metrics-registry-injection.md

diff --git a/docs/GLOSSARY.md b/docs/GLOSSARY.md
index 21addb4c..99da1775 100644
--- a/docs/GLOSSARY.md
+++ b/docs/GLOSSARY.md
@@ -80,6 +80,14 @@ _See also [DocumentStatus lifecycle](#documentstatus-lifecycle)._
 
 **Sütterlin** — A specific standardized style of Kurrent taught in German schools from 1915 to 1941.
 
+**Illegible word** — a word whose recognition confidence falls below the configured threshold; replaced with the literal token `[unleserlich]` in the rendered block text and counted in the `ocr_illegible_words_total` Prometheus counter.
+
+**Models-ready gauge** — the `ocr_models_ready` Prometheus gauge, flipped from `0` to `1` once the FastAPI lifespan startup has finished loading the Kraken model and the spell-checker. Used both for the `/health` endpoint and as the supervised signal for the `ocr_models_ready < 1 for 2m` alert.
+
+**Recognition model accuracy** — the accuracy reported by `ketos train` for the recognition (text-line) model, exposed as `ocr_model_accuracy{kind="recognition"}`. Sourced from `_parse_best_checkpoint` on the highest-scoring checkpoint after training.
+
+**Segmentation model accuracy** — the accuracy reported by `ketos segtrain` for the baseline layout analysis (`blla`) model, exposed as `ocr_model_accuracy{kind="segmentation"}`. Distinct from recognition accuracy because the two models are trained and improved independently.
+
 ---
 
 ## Other Domain Terms
diff --git a/docs/adr/023-prometheus-instrumentator-and-metrics-registry-injection.md b/docs/adr/023-prometheus-instrumentator-and-metrics-registry-injection.md
new file mode 100644
index 00000000..5e8a1020
--- /dev/null
+++ b/docs/adr/023-prometheus-instrumentator-and-metrics-registry-injection.md
@@ -0,0 +1,94 @@
+# ADR-023: Prometheus Instrumentator and Metrics Registry Injection
+
+## Status
+
+Accepted
+
+## Context
+
+Until issue #652 the OCR service exposed no `/metrics` endpoint. The
+observability stack already scrapes the Spring Boot backend's actuator
+endpoint, but it had nothing to scrape on the Python side. Without HTTP-
+and domain-level metrics from `ocr-service` we cannot answer questions
+like "what is the share of words rendered as `[unleserlich]`" or
+"is the training error rate above its budget" from Grafana.
+
+Two implementation requirements influenced the design:
+
+1. **Counter / gauge isolation in tests.** `prometheus_client` collectors
+   are module-level singletons keyed by name on the global `REGISTRY`.
+   Re-importing or naively re-instantiating them raises a duplicated-
+   collector error and cross-test state leaks (a `.inc()` in test A is
+   still readable by test B). A test harness needs a way to swap the
+   active container for a fresh per-test instance.
+
+2. **Minimal blast radius on the request path.** We did not want to
+   hand-instrument every endpoint with FastAPI middleware. The
+   `prometheus-fastapi-instrumentator` library already provides
+   `http_requests_total`, `http_request_duration_seconds`, and the
+   `/metrics` exposition route, all idiomatic Prometheus names.
+
+## Decision
+
+- Add `prometheus-fastapi-instrumentator==7.0.0` and pin its transitive
+  dependency `prometheus-client==0.25.0` explicitly in
+  `ocr-service/requirements.txt`.
+- Mount the instrumentator once at module load:
+  `Instrumentator(excluded_handlers=["/health", "/metrics"]).instrument(app).expose(app)`.
+  This adds `/metrics` and an HTTP-level dashboard surface without
+  changing any endpoint code.
+- Define every domain metric (`ocr_jobs_total`, `ocr_pages_total`,
+  `ocr_processing_seconds`, …) inside a `build_metrics(registry)`
+  factory in `ocr-service/metrics.py` that returns a frozen `OcrMetrics`
+  dataclass. Production code binds the container to the default
+  `REGISTRY` once: `metrics: OcrMetrics = build_metrics(REGISTRY)`.
+- Tests use a `fresh_metrics` fixture that builds a new
+  `CollectorRegistry()` per test and monkeypatches `main.metrics` with
+  a container bound to it. The endpoint code keeps reading
+  `metrics.` without knowing whether it is talking to the global
+  registry or a per-test one.
+
+## Consequences
+
+**Positive**
+
+- One reusable factory captures the metric definitions; future metrics
+  go in one place.
+- Tests run with full counter isolation. Cross-test state leakage is
+  impossible because each test sees its own dataclass instance.
+- The instrumentator gives us `http_*` metrics for free, including a
+  Grafana-ready histogram that pairs with the Spring Boot one.
+
+**Negative**
+
+- One extra level of indirection: any test that asserts on metric
+  values must remember to monkeypatch `main.metrics`, not the registry
+  directly. Rebinding through the registry is harmless but useless —
+  the dataclass holds references to the original collectors.
+- `prometheus-client` is now pinned. Upgrading it requires an explicit
+  bump and re-checking the instrumentator's compatibility range.
+- `/metrics` is exposed unauthenticated and relies on the Docker
+  internal network for confidentiality. See
+  [docs/OBSERVABILITY.md §Internal-only endpoints](../OBSERVABILITY.md)
+  for the Caddy snippet that must be added if the service ever gets a
+  host-side port mapping.
+
+## Alternatives considered
+
+- **Hand-roll the `/metrics` endpoint.** Rejected: would have meant
+  duplicating what `prometheus-fastapi-instrumentator` ships, plus
+  middleware for the HTTP histograms.
+- **Skip the factory; pass `registry` as a function argument
+  everywhere.** Rejected: clutters every endpoint signature and breaks
+  the symmetry with the Spring Boot side, which also relies on a
+  process-global Micrometer registry.
+- **Use a `pytest` autouse fixture that resets `REGISTRY` between
+  tests.** Rejected: `prometheus_client` does not expose a clean
+  "unregister all" hook, and we would be relying on private APIs.
+
+## References
+
+- Issue: [#652](https://git.raddatz.cloud/marcel/familienarchiv/issues/652)
+- Library:
+- Code: `ocr-service/metrics.py`, `ocr-service/main.py`,
+  `ocr-service/test_metrics.py`
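
Reviewer note: the registry-injection seam from the ADR's Decision section can be sketched roughly as below. This is a dependency-free illustration, not the code in `ocr-service/metrics.py`. `FakeRegistry` and `FakeCounter` are hypothetical stand-ins for `prometheus_client.CollectorRegistry` and `Counter`, and `handle_job`, `with_fresh_metrics`, and the dataclass field names are assumptions made for the example. In the real test suite the rebinding would presumably go through pytest's `monkeypatch` on `main.metrics`, as the ADR describes.

```python
from dataclasses import dataclass


class FakeRegistry:
    """Hypothetical stand-in for prometheus_client.CollectorRegistry."""

    def __init__(self) -> None:
        self._names: set[str] = set()

    def register(self, name: str) -> None:
        # Mimics the duplicate-name rejection the ADR's Context mentions.
        if name in self._names:
            raise ValueError(f"duplicated collector: {name}")
        self._names.add(name)


class FakeCounter:
    """Hypothetical stand-in for a counter bound to exactly one registry."""

    def __init__(self, name: str, registry: FakeRegistry) -> None:
        registry.register(name)
        self.name = name
        self.value = 0.0

    def inc(self, amount: float = 1.0) -> None:
        self.value += amount


@dataclass(frozen=True)
class OcrMetrics:
    # Field names are illustrative; the metric names come from the ADR.
    jobs_total: FakeCounter
    illegible_words_total: FakeCounter


def build_metrics(registry: FakeRegistry) -> OcrMetrics:
    """Create every domain metric against the registry that is passed in."""
    return OcrMetrics(
        jobs_total=FakeCounter("ocr_jobs_total", registry),
        illegible_words_total=FakeCounter("ocr_illegible_words_total", registry),
    )


# Production-style binding: one container bound to one process-wide registry.
GLOBAL_REGISTRY = FakeRegistry()
metrics = build_metrics(GLOBAL_REGISTRY)


def handle_job(illegible_words: int) -> None:
    # Looks up the module-level name at call time, so a rebinding is picked up.
    metrics.jobs_total.inc()
    metrics.illegible_words_total.inc(illegible_words)


def with_fresh_metrics() -> OcrMetrics:
    """Test-style rebinding: swap the container for one on a new registry."""
    global metrics
    metrics = build_metrics(FakeRegistry())
    return metrics
```

The point of the sketch is the ADR's first Negative consequence: the handler only sees whatever the module-level `metrics` name currently points to, so a test has to replace that name. Building a second container on the already-populated registry fails with a duplicate error instead.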