docs(normalizer): document unresolved-names.csv review report
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
@@ -26,10 +26,15 @@ Outputs:
|
||||
| --- | --- |
|
||||
| `unparsed-dates.csv` | For each `raw` (sorted by frequency), fill `suggested_iso` + `suggested_precision`, then paste `raw,suggested_iso,suggested_precision` into `overrides/dates.csv` (header `raw,iso,precision`). |
|
||||
| `unmatched-names.csv` | If `suggested_id` is right, copy `raw,suggested_id` into `overrides/names.csv`; else look up the correct id in `out/canonical-persons.xlsx` (the `person_id` column). |
|
||||
| `ambiguous-receivers.csv` | A space-joined pair we refused to auto-split (e.g. `Ella Anita`). Decide and add a names override if it is really two people. |
|
||||
| `unresolved-names.csv` | Names whose value is itself problematic, grouped by `category`: `unknown` (`?`/illegible), `single_token` (first OR last name only), `relational` (`Tante …`), `collective` (`Familie …`), `prose` (a description landed in a name column), `ambiguous_pair` (two given names → likely two people, not auto-split). Review highest-impact categories first; add decisions to `overrides/names.csv`. |
|
||||
| `index-file-mismatch.csv` | The `Datei` path disagrees with the index-derived filename — reconcile when the PDFs arrive. |
|
||||
| `duplicate-index.csv`, `blank-index-rows.csv`, `skipped-x-suffix.csv` | Inspect; fix in the source spreadsheet if needed. |
|
||||
|
||||
> `unresolved-names.csv` is the focused "names that need a human" list — distinct from
|
||||
> `unmatched-names.csv` (which is just non-family correspondents that got provisional persons).
|
||||
> The given-name set that drives `ambiguous_pair` detection is the register's first names plus
|
||||
> `config.EXTRA_GIVEN_NAMES` — add names there if a real two-person cell isn't being flagged.
|
||||
|
||||
**Valid `person_id` values** all come from the `person_id` column of `out/canonical-persons.xlsx`.
|
||||
|
||||
## Tests
|
||||
|
||||
Reference in New Issue
Block a user