feat(ocr): add DTA-derived historical German wordlist and generation script
153K words from the dtak+dtae 1800-1899 corpora (min_freq=20), covering pre-reform spellings common in Kurrent/Sütterlin documents.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
153551
ocr-service/dictionaries/de_historical.txt
Normal file
File diff suppressed because it is too large
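Since the diff is not shown, here is a minimal sketch of how a frequency-filtered wordlist like the one described in the commit message might be produced. It is not the generation script from this commit: the function name `build_wordlist`, the `(word, count)` input format, the summing of counts across the two corpora, and the alphabetic-token filter are all assumptions made for illustration. Only the `min_freq=20` threshold comes from the commit message.

```python
from collections import Counter
from typing import Iterable


def build_wordlist(
    corpora: Iterable[Iterable[tuple[str, int]]],
    min_freq: int = 20,
) -> list[str]:
    """Merge per-corpus (word, count) pairs and keep sufficiently frequent words.

    Hypothetical helper: the input format and the merge strategy are assumptions.
    """
    totals = Counter()
    for corpus in corpora:
        for word, count in corpus:
            # Assumption: frequencies from both corpora are summed before filtering.
            totals[word] += count
    # Keep purely alphabetic tokens; str.isalpha() accepts ä, ß, ſ and similar letters.
    return sorted(w for w, n in totals.items() if n >= min_freq and w.isalpha())


if __name__ == "__main__":
    # Toy counts standing in for the two corpora; these are not real DTA figures.
    dtak = [("Thür", 15), ("seyn", 12), ("1800", 50)]
    dtae = [("Thür", 10), ("seyn", 5)]
    for word in build_wordlist([dtak, dtae]):
        print(word)
```

With the toy input, only `Thür` survives: its combined count reaches the threshold, `seyn` falls short, and `1800` is rejected as non-alphabetic. Writing the result one word per line would match the plain-text layout suggested by the `.txt` extension of `de_historical.txt`, though the actual file format is not visible here.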