docs(import): add unresolved-names plan + worklog entry

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Marcel
2026-05-25 16:01:18 +02:00
parent 06127724de
commit 97db718f81
2 changed files with 523 additions and 0 deletions


@@ -4,6 +4,27 @@ Running log of each working session. **Resume here.** Newest entry on top.
---
## 2026-05-25 (session 5) — Unresolved-name classification
**Did:** Implemented [`04-unresolved-names-plan.md`](./04-unresolved-names-plan.md) subagent-driven
(5 tasks, TDD, per-task spec + code-quality review; 67 tests pass). Added `classify_name` +
`NameClass` + `build_given_names` in `persons.py`; `ResolutionContext` now records non-RESOLVABLE
names in `self.unresolved`; orchestrator writes `review/unresolved-names.csv` (replaces the noisy
`ambiguous-receivers.csv`) with per-category stats.
**Why:** `unmatched-names.csv` mixes boring non-family correspondents (expected) with genuinely
unresolvable entries. The new report isolates the latter so review focuses on ~440 real cases.
**Real-run result:** unresolved-names.csv = single_token 191 / prose 103 / unknown 74 /
collective 46 / relational 21 / ambiguous_pair **5** (distinct). The ambiguous over-flagging fix
cut `ambiguous_pair` from 303 → 5 (genuine two-given-name pairs only; `Mieze Schefold` etc. now
correctly RESOLVABLE). Given-name set = register first names plus `config.EXTRA_GIVEN_NAMES`.
**Next:** populate `overrides/names.csv` from unresolved-names.csv (highest-count first); extend
`EXTRA_GIVEN_NAMES` if a real pair isn't flagged; still-open date work (Spanish months, 5872 band).
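
**Sketch (illustrative, not the code in `persons.py`):** a minimal guess at how a classifier of this shape could work, assuming a lowercase given-name set and simple token heuristics. Only the names `classify_name`, `NameClass` and the category labels come from this entry; the word lists, the rule order, the length thresholds and `category_stats` are hypothetical.

```python
from __future__ import annotations

from collections import Counter
from enum import Enum


class NameClass(Enum):
    # Category labels mirror the report columns named in this entry.
    RESOLVABLE = "resolvable"
    SINGLE_TOKEN = "single_token"
    PROSE = "prose"
    UNKNOWN = "unknown"
    COLLECTIVE = "collective"
    RELATIONAL = "relational"
    AMBIGUOUS_PAIR = "ambiguous_pair"


# Hypothetical word lists, for illustration only.
COLLECTIVE_WORDS = {"familie", "kinder", "eltern"}
RELATIONAL_WORDS = {"mutter", "vater", "tante", "onkel"}


def classify_name(raw: str, given_names: set[str]) -> NameClass:
    """Assumed heuristics; the real rules may differ."""
    tokens = raw.split()
    lowered = [t.lower().strip(".,") for t in tokens]
    if not tokens:
        return NameClass.UNKNOWN
    if any(t in COLLECTIVE_WORDS for t in lowered):
        return NameClass.COLLECTIVE
    if any(t in RELATIONAL_WORDS for t in lowered):
        return NameClass.RELATIONAL
    if len(tokens) == 1:
        return NameClass.SINGLE_TOKEN
    if len(tokens) > 3:
        return NameClass.PROSE
    # Flag a pair only when BOTH tokens are known given names, so a
    # given name followed by a surname stays resolvable.
    if len(tokens) == 2 and all(t in given_names for t in lowered):
        return NameClass.AMBIGUOUS_PAIR
    return NameClass.RESOLVABLE


def category_stats(names: list[str], given_names: set[str]) -> Counter:
    """Count non-RESOLVABLE names per category (hypothetical helper)."""
    classes = (classify_name(n, given_names) for n in names)
    return Counter(c.value for c in classes if c is not NameClass.RESOLVABLE)
```

Under these assumptions, `classify_name("Mieze Schefold", {"mieze", "anna", "maria"})` is RESOLVABLE because `schefold` is not a given name, while `"Anna Maria"` is flagged as an ambiguous pair. The per-category counts reported above sum to 191 + 103 + 74 + 46 + 21 + 5 = 440, matching the ~440 figure.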
---
## 2026-05-25 (session 4) — Built the normalizer (subagent-driven, all 17 tasks)
**Did:** Executed the plan subagent-driven (implementer + spec review + code-quality review per