Commit Graph

3 Commits

Author SHA1 Message Date
Marcel
ec85f228c1 refactor(ocr): document > 50 frequency threshold rationale
Strict greater-than avoids non-determinism: if multiple candidates share
the minimum frequency value, pyspellchecker's ranking is undefined.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 17:21:37 +02:00
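
The rationale in the commit above can be illustrated with a minimal sketch. This is not the code from the commit: `pick_correction`, `MIN_FREQUENCY`, and the plain frequency mapping are hypothetical stand-ins, and no pyspellchecker API is used. It only shows how a strict `>` comparison excludes candidates that sit exactly at the threshold, where ties could otherwise be ranked arbitrarily.

```python
from typing import Mapping, Optional

# Hypothetical constant: the commit message only names the "> 50" comparison.
MIN_FREQUENCY = 50


def pick_correction(
    candidates: Mapping[str, int], threshold: int = MIN_FREQUENCY
) -> Optional[str]:
    """Return the most frequent candidate, or None if none exceeds the threshold."""
    # Strict '>' drops every candidate sitting exactly at the threshold,
    # which is where several words may tie and their order would be arbitrary.
    eligible = {word: freq for word, freq in candidates.items() if freq > threshold}
    if not eligible:
        return None
    # Order by (-frequency, word) so any remaining tie resolves deterministically.
    return min(eligible, key=lambda word: (-eligible[word], word))
```

With this sketch, two candidates tied at the threshold yield no correction at all, while a candidate above it is chosen regardless of how many others sit at the floor.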
Marcel
fea24aee25 refactor(ocr): make collapse_adjacent_markers a public function
Drop underscore prefix — the helper is part of confidence.py's effective
public API since spell_check.py imports and calls it directly.

Fixes reviewer concern: importing a _-prefixed name across module boundaries
contradicts Python's private-by-convention signal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 17:20:31 +02:00
Marcel
092131930c feat(ocr): add spell_check module with German spellchecker and historical wordlist
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 16:52:50 +02:00