docs(import): add unresolved-names plan + worklog entry

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 16:01:18 +02:00
parent 06127724de
commit 97db718f81
2 changed files with 523 additions and 0 deletions
--- a/docs/import-migration/WORKLOG.md
+++ b/docs/import-migration/WORKLOG.md
@@ -4,6 +4,27 @@ Running log of each working session. **Resume here.** Newest entry on top.

 ---

+## 2026-05-25 (session 5) — Unresolved-name classification
+
+**Did:** Implemented [`04-unresolved-names-plan.md`](./04-unresolved-names-plan.md) with the
+subagent-driven workflow (5 tasks, TDD, per-task spec + code-quality review; 67 tests pass). Added
+`classify_name`, `NameClass`, and `build_given_names` in `persons.py`; `ResolutionContext` now
+records non-RESOLVABLE names in `self.unresolved`; the orchestrator writes
+`review/unresolved-names.csv` (replaces the noisy `ambiguous-receivers.csv`) with per-category stats.
+
+**Why:** `unmatched-names.csv` mixes boring non-family correspondents (expected) with genuinely
+unresolvable entries. The new report isolates the latter so review focuses on ~440 real cases.
+
+**Real-run result:** unresolved-names.csv = single_token 191 / prose 103 / unknown 74 /
+collective 46 / relational 21 / ambiguous_pair **5** (distinct). The fix for ambiguous
+over-flagging cut `ambiguous_pair` from 303 → 5 (genuine two-given-name pairs only; `Mieze Schefold`
+etc. are now correctly RESOLVABLE). Given-name set = register first names ∪ `config.EXTRA_GIVEN_NAMES`.
+
+**Next:** populate `overrides/names.csv` from unresolved-names.csv (highest-count first); extend
+`EXTRA_GIVEN_NAMES` if a real pair isn't flagged; still-open date work (Spanish months, 58–72 band).
+
+---
+
 ## 2026-05-25 (session 4) — Built the normalizer (subagent-driven, all 17 tasks)

 **Did:** Executed the plan subagent-driven (implementer + spec review + code-quality review per