familienarchiv

Author	SHA1	Message	Date
Marcel	75b3ca8b9e	fix(normalizer): don't coerce boolean cells to 1/0 Add bool guard before the int branch in _cell_to_str so True/False cells are preserved as "True"/"False" instead of "1"/"0". Add two regression tests covering the fix and missing-sheet error. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 14:11:19 +02:00
Marcel	74c4c390fc	feat(normalizer): xlsx ingest + header mapping Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 14:08:30 +02:00
Marcel	29087319e6	test(normalizer): cover AliasIndex unambiguous first-name resolution Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 14:07:20 +02:00
Marcel	53457d9319	feat(normalizer): alias index with maiden/married/nickname resolution Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 14:04:11 +02:00
Marcel	2d97595e9c	fix(normalizer): split_receivers returns [] for a geb.-only cell Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 14:02:35 +02:00
Marcel	a177077b40	feat(normalizer): receiver splitting Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 13:59:51 +02:00
Marcel	b7a2332861	fix(normalizer): suffix all members of a colliding person-id group Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 13:58:35 +02:00
Marcel	1da1a8d223	feat(normalizer): person register parsing Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 13:54:37 +02:00
Marcel	59715bdccd	fix(normalizer): require day-dot in English month-first matcher (structural anti-shadow) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 13:53:05 +02:00
Marcel	53a661adb6	feat(normalizer): month/year, feast/season, range matchers + overrides Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 13:47:26 +02:00
Marcel	4942c0ea07	feat(normalizer): day-first month-name matcher Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 13:42:36 +02:00
Marcel	7edc002ebb	feat(normalizer): roman-numeral month matcher Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 13:38:32 +02:00
Marcel	b43dd6cdd4	fix(normalizer): keep Task 5 scoped — drop year-only matcher (belongs to Task 8) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 13:36:48 +02:00
Marcel	cff486dda7	fix(normalizer): treat leading date qualifiers (nach/vor/…) as APPROX _preprocess now sets approx=True when a leading marker is stripped; add _match_year_only so bare years (e.g. "nach 1900" -> "1900") resolve to 1900-01-01/YEAR before being upgraded to APPROX. Strengthen test_parse_approx_marker_upgrades_precision and add test_parse_leading_qualifier_is_approx (11 tests, all pass). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 13:35:19 +02:00
Marcel	df14e6b1ee	feat(normalizer): parse_date dispatch + iso/numeric matchers Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 13:30:07 +02:00
Marcel	1908dde859	feat(normalizer): year expansion century rule Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 13:27:26 +02:00
Marcel	4845e7a3c1	feat(normalizer): feast + season resolution Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 13:24:26 +02:00
Marcel	c6cceec6e9	feat(normalizer): Easter computus Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 13:21:39 +02:00
Marcel	8f6f4f2d62	feat(normalizer): scaffold tool + config tables Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 13:18:52 +02:00
Marcel	6f7aa643c9	docs(import): add normalizer implementation plan + apply persona review 17-task TDD plan for tools/import-normalizer/. Incorporates inline 6-persona review: content-deterministic idempotency, duplicate-index fix, provisional-id collision guard, date-parser edge cases, multi-sender split, CSV-injection defang, pinned deps. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 12:55:50 +02:00
Marcel	adfff420a5	docs(import): add import-migration analysis + normalizer spec Document the raw archive spreadsheet findings (IMP-01..12) and a requirements spec for an offline normalizer that produces a clean canonical dataset before import. Local docs only; no Gitea issue yet. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-25 12:32:37 +02:00
Marcel	8e9e3bba06	refactor(document): address review concerns from PR #660 - Restore JavaDoc on DocumentSearchResult.of() and .paged() factory methods - Remove redundant null guards on @Builder.Default collections in toListItem() - Map DocumentListItem fields explicitly in DocumentMultiSelect before cast - Add DocumentListItem required fields to docFactory in spec Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 19:27:31 +02:00
Marcel	627fc44d99	fix(document): fix test regressions from DocumentListItem migration - Use documentService.getDocumentById() in detail_stillReturnsTrainingLabels so the Document.full entity graph eager-loads trainingLabels - Flatten makeItem() factory in DocumentList.svelte.test.ts (nested document: {} overrides broke item.id / item.documentDate access) - Remove { document: {} } wrapper from DocumentMultiSelect.svelte.spec.ts mock responses; component now reads body.items directly as flat items - Flatten single nested item in page.svelte.test.ts document list test Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 19:19:28 +02:00
Marcel	6583226d79	refactor(document): migrate frontend from DocumentSearchItem to flat DocumentListItem All components, specs, and the generated API client now use the new DocumentListItem shape — flat access (item.title, item.sender) instead of the removed item.document.* nesting. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 19:19:28 +02:00
Marcel	41b205becc	test(document): add LazyInit guard + detail regression tests; prune Document.list graph Remove trainingLabels from Document.list entity graph now that DocumentListItem does not touch that association. Integration tests guard against future LazyInitializationException regressions and confirm Document.full still loads trainingLabels for the detail endpoint. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 19:19:28 +02:00
Marcel	f22dcaecb7	refactor(document): replace DocumentSearchItem with flat DocumentListItem DTO Eliminates excessive data exposure (OWASP API3:2023) — transcription, filePath, fileHash, thumbnailKey, scriptType and other detail-only fields are no longer serialised in the list API response. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 19:19:03 +02:00
Marcel	1109ab917b	docs(observability): ADR-024 + rotation runbook for grafana_reader ADR-024 records the deliberate cross-domain link (obs-grafana joins archiv-net to query archive-db via the SELECT-only grafana_reader role), the rejected alternatives (Prometheus exporter, read replica, versioned migration + flyway repair, hardcoded fallback), and the consequences, specifically that a Grafana compromise gains TCP reach to archive-db but is bounded by the role's least-privilege grants. The DEPLOYMENT.md runbook documents the rotation procedure that R__grafana_reader_password.sql now enables: bump GRAFANA_DB_PASSWORD, restart backend (Flyway re-applies because the resolved checksum changed), restart obs-grafana (datasource picks up the new env var). Also calls out the fail-closed startup behavior so operators who hit IllegalStateException know it is deliberate. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-22 17:21:27 +02:00
Marcel	769984608b	test(observability): expand grafana_reader coverage with write-deny + PII negatives The original 4 tests asserted SELECT existed on the three granted tables and was absent on app_users. That left two gaps a future migration could slip through silently: - INSERT/UPDATE/DELETE on the granted tables — if someone GRANTed write access on, say, documents to grafana_reader, the SELECT positives stay green and the boundary is breached invisibly. - Other PII / sensitive tables — the single app_users negative checks one table; a wildcard "GRANT SELECT ON ALL TABLES IN SCHEMA public" would still leave it green by accident if app_users wasn't the only sensitive table. Switch to a hasPrivilege(table, privilege) helper, add three write-deny tests (INSERT/UPDATE/DELETE on each granted table), and replace the single app_users negative with a parameterized sweep over app_users, user_groups, persons, notifications, document_comments, document_annotations, geschichten. New sensitive tables get added to that list as they appear. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-22 17:21:01 +02:00
Marcel	c282f38170	feat(observability): own grafana_reader password via repeatable migration V68 used to set the role's password in a versioned migration, which Flyway applies exactly once per database. Rotating GRAFANA_DB_PASSWORD therefore had no effect on the DB role — operators would need a manual ALTER ROLE or a `flyway repair` that nobody documented. The shape conflated two lifecycles: schema migration (one-shot, immutable) and credential provisioning (rotatable). Split into: - V68 (versioned, immutable): creates the role and applies SELECT grants on audit_log, documents, transcription_blocks. - R__grafana_reader_password.sql (repeatable): issues ALTER ROLE … PASSWORD with the placeholder. Flyway computes the checksum on the resolved content, so any change to GRAFANA_DB_PASSWORD changes the checksum and re-applies the migration on the next boot. Rotation becomes "bump env var + restart backend". Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-22 17:20:35 +02:00
Marcel	3ea7f0b5b2	feat(observability): fail closed when GRAFANA_DB_PASSWORD is unset FlywayConfig used to fall back to a hardcoded "changeme-grafana-db-password" string when the env var was missing. That published a known credential for the grafana_reader role (SELECT on audit_log, documents, transcription_blocks) into git history and made silent fail-open the default for any deploy that forgot the secret. Now resolution goes through Spring's Environment and throws IllegalStateException at startup when the value is unset or blank — same shape as UserDataInitializer's refusal to seed default admin creds. Tests inject via the global GRAFANA_DB_PASSWORD entry in test-resources application.properties so existing Flyway-loading test classes keep booting without per-class TestPropertySource boilerplate. FlywayConfigTest covers both branches against MockEnvironment without a Spring context. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-22 17:20:09 +02:00
Marcel	bcba4dab80	ci(observability): inject GRAFANA_DB_PASSWORD from Gitea secrets Wires the new GRAFANA_DB_PASSWORD secret through the deploy pipeline: - docker-compose.prod.yml: backend env now passes GRAFANA_DB_PASSWORD through so Flyway V68 can resolve the ${grafanaDbPassword} placeholder in production and staging (it already worked in local dev via docker-compose.yml). - release.yml + nightly.yml: declare GRAFANA_DB_PASSWORD as a required Gitea secret, write it into .env.production / .env.staging (consumed by archive-backend), and into /opt/familienarchiv/obs-secrets.env (consumed by obs-grafana's PostgreSQL datasource). Operator action before the next deploy: add a GRAFANA_DB_PASSWORD value to the Gitea repo secrets (openssl rand -hex 32). Refs #651. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-21 20:21:27 +02:00
Marcel	a4a3e3b105	docs(architecture): show Grafana→PostgreSQL link for PO Overview dashboard Adds the new read-only connection from Grafana to archive-db (via the grafana_reader role) introduced by the PO Overview dashboard. Refs #651. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-21 20:21:05 +02:00
Marcel	cac00ed711	docs(deployment): document GRAFANA_DB_PASSWORD across env tables Adds GRAFANA_DB_PASSWORD to the observability-stack env-var table, the Gitea secrets table, and the obs-secrets.env reference, so operators see the variable wherever they look for related secrets. Refs #651. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-21 20:21:05 +02:00
Marcel	637829cebc	feat(observability): add PO Overview Grafana dashboard Provisioned dashboard for the product owner's weekly check-in: system health (Prometheus + Loki), user activity (PostgreSQL audit_log), archive progress (PostgreSQL transcription_blocks + audit_log), and OCR quality (Prometheus ocr-service metrics). Default range 7d, manual refresh, thresholds per the issue spec. Refs #651. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-21 20:21:05 +02:00
Marcel	4e636b3253	chore(observability): document GRAFANA_DB_PASSWORD in env files .env.example: declare GRAFANA_DB_PASSWORD with an openssl rand -hex 32 hint so a missing value fails loudly (NFR-OPS-02). obs.env: add a comment explaining that the real value comes from CI's obs-secrets.env, matching the pattern used for other secrets in that file. Refs #651. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-21 20:21:05 +02:00
Marcel	ab2708e63b	feat(observability): provision Grafana PostgreSQL datasource Adds a read-only datasource pointing at archive-db using the grafana_reader role (provisioned by Flyway V68). The password is interpolated from the GRAFANA_DB_PASSWORD env var passed to obs-grafana, and the connection is locked to editable: false so the credential cannot be inspected via the UI. sslmode=disable is intentional: traffic stays inside archiv-net. Refs #651. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-21 20:21:05 +02:00
Marcel	ed8e9576e4	feat(observability): pass GRAFANA_DB_PASSWORD to archive-backend Flyway runs inside the backend container at startup; V68's ${grafanaDbPassword} placeholder is resolved from this env var. Refs #651. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-21 20:21:05 +02:00
Marcel	0958df7768	feat(observability): wire obs-grafana to archive-db and inject GRAFANA_DB_PASSWORD obs-grafana now joins archiv-net so it can resolve archive-db:5432 for the PO Overview dashboard's PostgreSQL datasource, and receives GRAFANA_DB_PASSWORD so provisioning can interpolate it into the datasource config. Refs #651. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-21 20:21:05 +02:00
Marcel	f4ffd8acee	feat(observability): create grafana_reader read-only DB role Add Flyway V68 migration that provisions a read-only PostgreSQL role scoped to audit_log, documents, and transcription_blocks. The role's password is injected via the new ${grafanaDbPassword} Flyway placeholder, which FlywayConfig reads from the GRAFANA_DB_PASSWORD env var. The migration is idempotent: CREATE on first run, ALTER on re-run. Adds a Testcontainers integration test asserting positive grants on the three intended tables and a negative grant on app_users (NFR-SEC-01). Refs #651. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-21 20:21:05 +02:00
Marcel	0801da8df0	docs(ocr): explain why two metrics tests skip fresh_metrics fixture Sara's cycle-2 S2: clarify the latent (but not actual) cross-test state risk on the two metrics tests that hit the global REGISTRY instead of the per-test fresh_metrics fixture. Migrating them would actually break them: the /metrics endpoint is served by prometheus-fastapi-instrumentator which binds to the default REGISTRY at app-construction time, and the http_requests_total assertion only finds counters on that global registry. Both tests already assert response shape only (status code, content-type substring, body substrings), not numeric values, so the shared-registry caveat is documented for future readers rather than treated as a bug to fix. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-21 17:23:32 +02:00
Marcel	e0e1578bdd	test(ocr): widen spell-check exclusion bound to 0.09s with rationale Sara's cycle-2 S1: the wall-clock assertion at < 0.05s could trip on a slow CI runner under load even when the timer correctly excludes spell-check. Sara's preferred structural fix (patch main.time.monotonic with a deterministic sequence) proved awkward — the patched attribute is the global time.monotonic which httpx and asyncio consume, exhausting the sequence before the request reaches the engine loop. Take the documented fallback: widen the bound to 0.09s and explain why. The failure mode the test guards against (spell-check inside the timer) would add 0.1s (2 × 0.05s sleep), so 0.09s catches the bug while leaving ~90ms of headroom for slow CI runners. Verified red→green by temporarily moving correct_text inside the timer block: bound trips at 0.101s; the fixed code reads ~0.001s. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-21 17:22:49 +02:00
Marcel	2df71beb7e	docs: add ADR-023 and glossary entries for OCR metrics ADR-023 captures why prometheus-fastapi-instrumentator was chosen, the build_metrics(registry) factory pattern, and the test rebinding seam. The glossary gains four ops-aligned terms (illegible word, models-ready gauge, recognition vs segmentation accuracy) so the metrics documentation in OBSERVABILITY.md has a vocabulary to lean on. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 17:06:44 +02:00
Marcel	2dbb3c37b4	docs(observability): document ocr metrics, scrape edge, and access-log filter - L2 container diagram now shows the Prometheus -> ocr:8000 scrape edge (plus the previously-undrawn Prometheus -> backend edge for symmetry). - OBSERVABILITY.md gains a full ocr_* metrics table with labels, units, and the canonical example queries from issue #652. - New "Internal-only endpoints" subsection captures the unauthenticated /metrics caveat and provides the Caddy block snippet for the case where the service ever gets a host port. - Explicit note that MetricsPathFilter only quiets uvicorn stdout, and the OCR metrics must never carry PII or document content. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 17:05:27 +02:00
Marcel	67368b4413	docs(ocr): annotate metrics binding + /metrics exposure + pin client Three small drops that pay back later: - Note that main.metrics is import-time bound and tests must monkeypatch `main.metrics`, not the registry. - Flag the /metrics endpoint as unauthenticated and cross-link the Caddy-block snippet in docs/OBSERVABILITY.md. - Pin prometheus-client to the exact 0.25.0 patch version already resolved by prometheus-fastapi-instrumentator 7.0.0, so an upstream bump cannot silently slip in. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 17:04:28 +02:00
Marcel	ddf6cf4cbc	test(ocr): collapse shared client setup into ocr_client helper Each metrics test was repeating the same five-line block — patch kraken_engine.load_models, patch load_spell_checker, instantiate the AsyncClient, force _models_ready True, restore it. Lift the lot into a single async context manager so each test body shrinks to its real arrange / act / assert intent. Tests that drive the lifespan directly (models_ready gauge) or stub asyncio.to_thread for /train (which already patches _models_ready) stay unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 17:03:29 +02:00
Marcel	df952861c4	refactor(ocr): extract _record_training for shared metric bookkeeping The /train, /train-sender, and /segtrain endpoints each duplicated the same eight-line try/except + counter + gauge block around the asyncio.to_thread call. Lift it into _record_training(runner, kind), which accepts a sync- or async-returning callable for flexibility. Each endpoint now ends with a single return line. Behaviour preserved — status codes, error propagation, and metric labels stay identical. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 16:58:40 +02:00
Marcel	22a5ee816a	refactor(ocr): extract _observe_block_words for word counter sites The two block-iteration loops (/ocr and /ocr/stream's standard generator) both ran the same word-total and illegible-word increments. Lift them into a single helper so each call site becomes one line and the counter intent reads cleanly. Pure refactor — no behavior change, tests stay green. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 16:57:18 +02:00
Marcel	0179e93a4b	test(ocr): narrow training error test to subprocess.run seam The asyncio.to_thread patch stubbed out the entire _run_training call, hiding the real error path. Replacing it with a failing CompletedProcess from subprocess.run exercises the actual ketos-failed branch and keeps the test's intent — error counter bumps, 500 surfaces — intact. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 16:55:14 +02:00
Marcel	0fc0cbcffd	test(ocr): lock in MetricsPathFilter fail-open behavior If uvicorn's access log format ever changes (args=None, or shorter than 3 elements), the filter must keep forwarding records rather than silently dropping them. Two extra LogRecords cover both edge cases. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 16:54:24 +02:00
Marcel	549cb15845	test(ocr): cover /train-sender counter and accuracy=None gauge default Two regression tests: - /train-sender hitting the success path bumps the recognition counter (previously only /train and /segtrain were covered). - A successful run whose result.accuracy is None must not call set() on ocr_model_accuracy — the gauge stays at its default 0. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-21 16:53:48 +02:00
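
Commit 75b3ca8b9e describes adding a bool guard before the int branch in _cell_to_str so that True/False cells are not rendered as "1"/"0". The reason the order matters is that in Python bool is a subclass of int, so an isinstance(value, int) check also matches True and False. The following is only a minimal sketch of that ordering; the function name cell_to_str, the None handling, and the float handling are assumptions for illustration and are not taken from the repository.

```python
def cell_to_str(value: object) -> str:
    """Render one spreadsheet cell as text (hypothetical stand-in for _cell_to_str)."""
    if value is None:
        # Assumption: empty cells become an empty string.
        return ""
    # bool is a subclass of int, so this guard has to run before the int
    # branch; otherwise True/False would fall through and print as "1"/"0".
    if isinstance(value, bool):
        return str(value)
    if isinstance(value, int):
        return str(value)
    if isinstance(value, float):
        # Assumption: whole-number floats are shown without a trailing ".0".
        return str(int(value)) if value.is_integer() else str(value)
    return str(value).strip()
```

With the guard in place, cell_to_str(True) yields "True" while cell_to_str(1) still yields "1"; removing the guard would make the two indistinguishable in the normalized output.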


2837 Commits
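
Commit 0fc0cbcffd locks in fail-open behavior for MetricsPathFilter: when the access-log record has args=None or fewer than 3 elements, the record must still be forwarded. A rough sketch of such a filter is shown below. It assumes the request path sits at index 2 of the record's args tuple (client address, method, path, HTTP version, status), which is how uvicorn's access logger is commonly laid out; the constructor parameter and the query-string handling are illustrative guesses, not the project's actual implementation.

```python
import logging


class MetricsPathFilter(logging.Filter):
    """Drop access-log records for one path; forward everything else."""

    def __init__(self, path: str = "/metrics") -> None:
        super().__init__()
        self.path = path  # hypothetical parameter, not confirmed by the source

    def filter(self, record: logging.LogRecord) -> bool:
        args = record.args
        # Fail open: if the record does not have the expected shape,
        # keep it rather than silently dropping a log line.
        if not isinstance(args, tuple) or len(args) < 3:
            return True
        # Assumption: index 2 holds the full request path, possibly with a query.
        request_path = str(args[2]).split("?", 1)[0]
        return request_path != self.path
```

Such a filter would typically be attached with logging.getLogger("uvicorn.access").addFilter(MetricsPathFilter()). As the log above notes for the real filter, this only quiets stdout; it does not restrict access to the endpoint.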