Compare commits
5 Commits
e95c678271
...
0398ebea2c
| Author | SHA1 | Date | |
|---|---|---|---|
| | 0398ebea2c | | |
| | 99d8229858 | | |
| | fee3c7e27d | | |
| | fa3f4167e9 | | |
| | a2b77e5bfa | | |
@@ -176,6 +176,14 @@ letter actually said.*
|
||||
Silvester=12-31, …). Seasons map to representative months: Frühling/Frühjahr=Apr, Sommer=Jul,
|
||||
Herbst=Oct, Winter=Jan. The feast/season tables and Easter algorithm live in `config.py`
|
||||
(NFR-MAINT-01).
|
||||
- **REQ-DATE-07** — **Intra-month day ranges carry an end day; half-resolved ranges are
|
||||
flagged.** For a day range like `7./8. Sept.1923`, `date_iso` holds the start day, the end
|
||||
day is resolved against the shared month/year into `date_end`, and `date_precision` =
|
||||
`RANGE`. If the **start** parses but the **end day is impossible** (e.g. `10./40.1.1917`),
|
||||
the row keeps the start and `RANGE` precision, leaves `date_end` **empty**, and is flagged
|
||||
`needs_review = range_end_unparsed` — the unparseable end is dropped honestly (surfaced for
|
||||
review), never silently invented or clamped. A `RANGE` row **may** therefore legitimately
|
||||
have an empty `date_end`; the importer must treat `date_end` as optional even on a `RANGE`.
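
A minimal sketch of how a reader of the canonical output might honor this rule. This is an illustration, not the importer's actual code: `DateSpan` and `read_date_span` are hypothetical names, and the row is assumed to be a plain dict of strings keyed by the output column names, with `needs_review` flags pipe-separated.

```python
import datetime
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DateSpan:
    """Hypothetical importer-side view of one row's date columns."""
    start: Optional[datetime.date]
    end: Optional[datetime.date]
    half_resolved: bool


def read_date_span(row: dict) -> DateSpan:
    """Read date_iso / date_end / needs_review from one canonical row.

    date_end is treated as optional for every precision, including RANGE:
    an empty cell becomes None instead of raising.
    """
    start = datetime.date.fromisoformat(row["date_iso"]) if row.get("date_iso") else None
    end = datetime.date.fromisoformat(row["date_end"]) if row.get("date_end") else None
    flags = [f for f in (row.get("needs_review") or "").split("|") if f]
    return DateSpan(start, end, "range_end_unparsed" in flags)
```

With this shape, a half-resolved row such as `10./40.1.1917` yields a start of 1917-01-10, no end, and `half_resolved` set, so the caller can display the start and still route the row to review.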
|
||||
|
||||
### 4.4 Person resolution & dedup (`FR-PERS`, `FR-DEDUP`) — resolves IMP-04, IMP-05, IMP-11
|
||||
|
||||
@@ -262,6 +270,7 @@ DB schema.
|
||||
| Field | Required | Format / values | Notes |
|
||||
| --- | --- | --- | --- |
|
||||
| `index` | yes | string | Stable key; basis for PDF matching. |
|
||||
| `file` | no | string | verbatim `Datei` value (e.g. `H-0730.pdf`); carried through for the importer to link the scanned PDF. |
|
||||
| `box` | no | string | from `Box`. |
|
||||
| `folder` | no | string | from `Mappe`. |
|
||||
| `sender_person_id` | no | person_id | resolved; empty if no sender. |
|
||||
@@ -271,11 +280,12 @@ DB schema.
|
||||
| `date_iso` | no | `YYYY-MM-DD` | best-effort; empty if `UNKNOWN`. |
|
||||
| `date_raw` | no | string | verbatim source date. |
|
||||
| `date_precision` | yes | enum | `DAY\|MONTH\|SEASON\|YEAR\|RANGE\|APPROX\|UNKNOWN`. |
|
||||
| `date_end` | no | `YYYY-MM-DD` or empty | RANGE end day (e.g. `7./8. Sept.1923` → `date_iso` = start, `date_end` = end). Empty for every non-RANGE precision **and** for a half-resolved RANGE whose end did not parse (see REQ-DATE-07). |
|
||||
| `location` | no | string | from `Ort`. |
|
||||
| `tags` | no | `tag\|tag` | from `Schlagwort`. |
|
||||
| `summary` | no | string | from `Inhalt`. |
|
||||
| `source_row` | yes | int | provenance (NFR-DATA-01). |
|
||||
| `needs_review` | yes | `flag\|flag` or empty | review flags (REQ-PROV-02). |
|
||||
| `needs_review` | yes | `flag\|flag` or empty | review flags (REQ-PROV-02). Flags include `unparsed_date`, `range_end_unparsed` (half-resolved RANGE, REQ-DATE-07), `unmatched_sender`, `unmatched_receiver`, `multi_sender`, `index_file_mismatch`, `duplicate_index`. |
|
||||
|
||||
### 6.2 `canonical-persons.xlsx`
|
||||
|
||||
@@ -295,6 +305,27 @@ DB schema.
|
||||
| `aliases` | no | `a\|b\|c` | every surface form that maps here. |
|
||||
| `provisional` | yes | bool | true if created from a document string, not the register. |
|
||||
|
||||
### 6.3 `canonical-persons-tree.json`
|
||||
|
||||
The de-duplicated genealogical tree (family members + their relationships) the importer
|
||||
uses to seed the family graph. Each `persons[]` entry carries a `personId` that **joins
|
||||
1:1 onto** `person_id` in `canonical-persons.xlsx`.
|
||||
|
||||
| Field | Required | Format | Notes |
|
||||
| --- | --- | --- | --- |
|
||||
| `personId` | yes | slug | The register's **verbatim** `person_id` (e.g. `cram-hans-1`), propagated — never re-slugified — so collision suffixes match `canonical-persons.xlsx` exactly. Every tree `personId` exists in the register; the register is the sole slug authority. |
|
||||
| `firstName` / `lastName` / `maidenName` | first/last yes | string | name parts. |
|
||||
| `birthYear` / `deathYear` | no | int or null | year only (tree granularity). |
|
||||
| `birthPlace` / `deathPlace` | no | string or null | from the register. |
|
||||
| `generation` | no | int or null | parsed from `G n`. |
|
||||
| `notes` | no | string or null | leftover Bemerkung text after relationship extraction. |
|
||||
| `familyMember` | yes | bool | always true for tree persons. |
|
||||
|
||||
A top-level `generated_at` is pinned to a fixed timestamp (`2020-01-01T00:00:00`) for
|
||||
reproducibility (NFR-IDEM-01), not a wall-clock value. `relationships[]` carry `SPOUSE_OF`
|
||||
and `PARENT_OF` edges keyed by `rowId`; `unresolved[]` lists relationship strings that did
|
||||
not match a tree person.
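
The 1:1 join described above could be checked on the consumer side roughly as follows. This is a sketch under assumptions: `check_tree_join` is a hypothetical helper, and both id lists are assumed to be already loaded (tree `personId` values from the JSON, `person_id` values from the workbook).

```python
from collections import Counter


def check_tree_join(tree_ids, register_ids):
    """Report tree personIds that break the 1:1 join onto the register."""
    register_counts = Counter(register_ids)
    tree_counts = Counter(tree_ids)
    return {
        # tree ids with no matching register row
        "orphans": sorted(pid for pid in tree_counts if register_counts[pid] == 0),
        # tree ids that appear more than once in the tree itself
        "duplicated_in_tree": sorted(pid for pid, n in tree_counts.items() if n > 1),
        # tree ids that match more than one register row
        "ambiguous_in_register": sorted(pid for pid in tree_counts if register_counts[pid] > 1),
    }
```

All three lists being empty means every tree person resolves to exactly one register row. A bare, re-slugified id such as `cram-hans` would show up under `orphans` when the register holds only `cram-hans-1` and `cram-hans-2`.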
|
||||
|
||||
---
|
||||
|
||||
## 7. Prioritized Backlog (MoSCoW)
|
||||
@@ -339,7 +370,7 @@ DB schema.
|
||||
| ID | Question | Why it matters | Ref | Resolution |
|
||||
| --- | --- | --- | --- | --- |
|
||||
| OQ-01 ✅ | Season/holiday → date. | Accuracy of ~70 SEASON/feast rows. | REQ-DATE-06 | **Resolved (2026-05-25):** movable feasts (Ostern, Pfingsten, Himmelfahrt, Advent, …) **computed per year from Easter — never a fixed month**; fixed feasts looked up (Weihnachten=12-25, Neujahr=01-01, …); seasons = mid-season month (Frühling=Apr, Sommer=Jul, Herbst=Oct, Winter=Jan). |
|
||||
| OQ-02 ✅ | Date ranges: start only, or start+end? | Sorting/display of ~315 range values. | REQ-DATE-02 | **Confirmed:** store **start** in `date_iso`, precision `RANGE`, full text in `date_raw`. |
|
||||
| OQ-02 ✅ | Date ranges: start only, or start+end? | Sorting/display of ~315 range values. | REQ-DATE-02, REQ-DATE-07 | **Confirmed (updated #670):** store **start** in `date_iso`, precision `RANGE`, full text in `date_raw`, **and the resolved end day in `date_end`** for intra-month day ranges. A half-resolved range (start parsed, end impossible) keeps `date_end` empty and is flagged `range_end_unparsed`. |
|
||||
| OQ-03 ✅ | `person_id` format. | Stability across re-runs; diffability. | §6 | **Confirmed:** readable slug `lastname-firstname`, numeric suffix on collision. |
|
||||
| OQ-04 ✅ | `x`-suffix row handling. | 42 rows. | REQ-TRIAGE-03 | **Resolved (2026-05-25):** `x` rows are transcriptions of the base letter but not yet mappable → **skip this pass**, log to `review/skipped-x-suffix.csv` for later linking. |
|
||||
| OQ-05 ✅ | Importer output format. | Phase-2 reader. | B11 | **Confirmed:** `.xlsx` (openpyxl-native, headered). |
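
For OQ-01, the per-year computation of movable feasts can be sketched as below. This is an illustration only: the project's own version lives in `config.py` and is not reproduced here. The function names and the `_EASTER_OFFSETS` table are hypothetical, and the sketch covers only feasts at a fixed day offset from Easter Sunday.

```python
import datetime


def easter_sunday(year: int) -> datetime.date:
    """Easter Sunday via the anonymous Gregorian algorithm."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return datetime.date(year, month, day + 1)


# Hypothetical offset table: days after Easter Sunday.
_EASTER_OFFSETS = {"Ostern": 0, "Himmelfahrt": 39, "Pfingsten": 49}


def movable_feast(name: str, year: int) -> datetime.date:
    """Resolve an Easter-relative feast for the given year."""
    return easter_sunday(year) + datetime.timedelta(days=_EASTER_OFFSETS[name])
```

For 2024 this gives Easter on 2024-03-31 and Pfingsten on 2024-05-19, which shows why a fixed month would be wrong: the same feast lands in different months from year to year.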
|
||||
|
||||
@@ -67,6 +67,23 @@ class ParsedDate:
|
||||
precision: Precision
|
||||
raw: str
|
||||
end: str | None = None # RANGE end day; None for every non-RANGE precision
|
||||
# True only for a half-resolved RANGE: the start parsed but the end did not, so
|
||||
# the end was dropped and the row should surface in review (#670, Gap 2).
|
||||
needs_review: bool = False
|
||||
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class MatchResult:
|
||||
"""Uniform return shape for every _match_* matcher.
|
||||
|
||||
A matcher returns None when it does not match, or a MatchResult when it does.
|
||||
`end` is the RANGE end day (None for every non-RANGE precision); `needs_review`
|
||||
is True only for a half-resolved RANGE whose start parsed but end did not.
|
||||
"""
|
||||
iso: str
|
||||
precision: Precision
|
||||
end: str | None = None
|
||||
needs_review: bool = False
|
||||
|
||||
|
||||
_LEADING_MARKERS = re.compile(
|
||||
@@ -98,7 +115,7 @@ def _match_iso(s):
|
||||
if re.fullmatch(r"\d{4}-\d{2}-\d{2}", s):
|
||||
try:
|
||||
datetime.date.fromisoformat(s)
|
||||
return s, Precision.DAY
|
||||
return MatchResult(s, Precision.DAY)
|
||||
except ValueError:
|
||||
return None
|
||||
return None
|
||||
@@ -113,7 +130,7 @@ def _match_numeric(s):
|
||||
if year is None or not (1 <= month <= 12):
|
||||
return None
|
||||
try:
|
||||
return datetime.date(year, month, day).isoformat(), Precision.DAY
|
||||
return MatchResult(datetime.date(year, month, day).isoformat(), Precision.DAY)
|
||||
except ValueError:
|
||||
return None
|
||||
|
||||
@@ -131,7 +148,7 @@ def _match_roman(s):
|
||||
if not month or year is None:
|
||||
return None
|
||||
try:
|
||||
return datetime.date(year, month, day).isoformat(), Precision.DAY
|
||||
return MatchResult(datetime.date(year, month, day).isoformat(), Precision.DAY)
|
||||
except ValueError:
|
||||
return None
|
||||
|
||||
@@ -147,7 +164,7 @@ def _build_day_month_year(day, month, year):
|
||||
if not month or year is None or not (1 <= month <= 12):
|
||||
return None
|
||||
try:
|
||||
return datetime.date(year, month, day).isoformat(), Precision.DAY
|
||||
return MatchResult(datetime.date(year, month, day).isoformat(), Precision.DAY)
|
||||
except ValueError:
|
||||
return None
|
||||
|
||||
@@ -189,7 +206,7 @@ def _match_month_year(s):
|
||||
year = expand_year(m.group(2))
|
||||
if not month or year is None:
|
||||
return None
|
||||
return datetime.date(year, month, 1).isoformat(), Precision.MONTH
|
||||
return MatchResult(datetime.date(year, month, 1).isoformat(), Precision.MONTH)
|
||||
|
||||
|
||||
def _match_feast_season(s):
|
||||
@@ -199,19 +216,23 @@ def _match_feast_season(s):
|
||||
year = expand_year(m.group(2))
|
||||
if year is None:
|
||||
return None
|
||||
return resolve_feast_or_season(m.group(1), year)
|
||||
resolved = resolve_feast_or_season(m.group(1), year)
|
||||
if resolved is None:
|
||||
return None
|
||||
iso, precision = resolved
|
||||
return MatchResult(iso, precision)
|
||||
|
||||
|
||||
def _match_year_only(s):
|
||||
if _YEAR_ONLY_RE.fullmatch(s):
|
||||
return datetime.date(int(s), 1, 1).isoformat(), Precision.YEAR
|
||||
return MatchResult(datetime.date(int(s), 1, 1).isoformat(), Precision.YEAR)
|
||||
return None
|
||||
|
||||
|
||||
def _match_range(s):
|
||||
m = _RANGE_YY_RE.fullmatch(s)
|
||||
if m:
|
||||
return datetime.date(int(m.group(1)), 1, 1).isoformat(), Precision.RANGE, None
|
||||
return MatchResult(datetime.date(int(m.group(1)), 1, 1).isoformat(), Precision.RANGE)
|
||||
m = _RANGE_DAY_RE.fullmatch(s)
|
||||
if m:
|
||||
day_start, day_end, rest = m.group(1), m.group(2), m.group(3)
|
||||
@@ -220,14 +241,19 @@ def _match_range(s):
|
||||
start = matcher(f"{day_start}.{rest}")
|
||||
if start:
|
||||
end = matcher(f"{day_end}.{rest}")
|
||||
return start[0], Precision.RANGE, (end[0] if end else None)
|
||||
# Half-resolved range (start parsed, end did not — e.g. the impossible
|
||||
# end day in "10./40.1.1917"): keep the start and RANGE precision, drop
|
||||
# the end, and flag needs_review so the dropped end surfaces (#670, Gap 2).
|
||||
return MatchResult(start.iso, Precision.RANGE,
|
||||
end.iso if end else None,
|
||||
needs_review=end is None)
|
||||
m = _RANGE_HYPHEN_RE.fullmatch(s)
|
||||
if m:
|
||||
start = m.group(1).strip()
|
||||
for matcher in (_match_numeric, _match_roman, _match_monthname_a, _match_year_only):
|
||||
r = matcher(start)
|
||||
if r:
|
||||
return r[0], Precision.RANGE, None
|
||||
return MatchResult(r.iso, Precision.RANGE)
|
||||
return None
|
||||
|
||||
|
||||
@@ -256,11 +282,8 @@ def parse_date(raw: str, date_overrides: dict | None = None) -> ParsedDate:
|
||||
for matcher in _MATCHERS:
|
||||
result = matcher(cleaned)
|
||||
if result:
|
||||
iso, precision = result[0], result[1]
|
||||
end = result[2] if len(result) > 2 else None
|
||||
if approx:
|
||||
precision = Precision.APPROX
|
||||
return ParsedDate(iso, precision, raw, end)
|
||||
precision = Precision.APPROX if approx else result.precision
|
||||
return ParsedDate(result.iso, precision, raw, result.end, result.needs_review)
|
||||
return ParsedDate(None, Precision.UNKNOWN, raw)
|
||||
|
||||
|
||||
|
||||
@@ -107,6 +107,8 @@ def to_canonical(raw, ctx, date_overrides: dict, approved_themes: frozenset = fr
|
||||
|
||||
if raw.date.strip() and pd.precision == _dates.Precision.UNKNOWN:
|
||||
flags.append("unparsed_date")
|
||||
if pd.needs_review:
|
||||
flags.append("range_end_unparsed")
|
||||
if index_file_mismatch(raw.index, raw.file):
|
||||
flags.append("index_file_mismatch")
|
||||
|
||||
|
||||
@@ -193,6 +193,12 @@ def _attach_person_ids(tree_persons: list[dict], raw_dicts: list[dict]) -> None:
|
||||
parse_register and _parse_row both keep exactly the rows that have a last name.
|
||||
"""
|
||||
register = _persons.parse_register(raw_dicts)
|
||||
if len(tree_persons) != len(register):
|
||||
raise ValueError(
|
||||
"person_id propagation requires equal length: "
|
||||
f"{len(tree_persons)} tree persons vs {len(register)} register persons "
|
||||
"(the positional zip would otherwise silently truncate and mis-join ids)"
|
||||
)
|
||||
for tree_person, register_person in zip(tree_persons, register):
|
||||
tree_person["personId"] = register_person.person_id
|
||||
|
||||
|
||||
@@ -2,6 +2,18 @@ import datetime
|
||||
import dates
|
||||
from dates import Precision
|
||||
|
||||
def test_matchers_return_uniform_matchresult():
|
||||
# Every matcher returns a MatchResult(iso, precision, end) — no 2- vs 3-tuple
|
||||
# length-sniffing. A non-range matcher leaves end=None; a range matcher sets it.
|
||||
day = dates._match_numeric("15.2.1888")
|
||||
assert isinstance(day, dates.MatchResult)
|
||||
assert (day.iso, day.precision, day.end) == ("1888-02-15", Precision.DAY, None)
|
||||
|
||||
rng = dates._match_range("10./11.1.1917")
|
||||
assert isinstance(rng, dates.MatchResult)
|
||||
assert (rng.iso, rng.precision, rng.end) == ("1917-01-10", Precision.RANGE, "1917-01-11")
|
||||
|
||||
|
||||
def test_easter_known_years():
|
||||
# Anonymous Gregorian algorithm — verified against published tables
|
||||
assert dates.easter(2024) == datetime.date(2024, 3, 31)
|
||||
@@ -133,6 +145,32 @@ def test_parse_roman_month_day_range():
|
||||
assert r.precision == Precision.RANGE
|
||||
assert r.end == "1917-01-11"
|
||||
|
||||
def test_parse_range_invalid_end_keeps_start_flags_review():
|
||||
# "10./40.1.1917" — the 40th is an impossible end day. The start parses fine,
|
||||
# so the row stays RANGE with the start preserved, the unparseable end is dropped
|
||||
# (end is None), and the half-resolved range is flagged needs_review so the
|
||||
# dropped end surfaces honestly instead of vanishing silently (#670, Gap 2).
|
||||
r = dates.parse_date("10./40.1.1917")
|
||||
assert r.iso == "1917-01-10"
|
||||
assert r.precision == Precision.RANGE
|
||||
assert r.end is None
|
||||
assert r.needs_review is True
|
||||
|
||||
|
||||
def test_parse_range_valid_end_not_flagged():
|
||||
# a fully-resolved range carries its end and is NOT flagged for review
|
||||
r = dates.parse_date("10./11.1.1917")
|
||||
assert r.end == "1917-01-11"
|
||||
assert r.needs_review is False
|
||||
|
||||
|
||||
def test_parse_non_range_has_no_review_flag():
|
||||
# a non-range date (or an empty input) is never flagged for review by the date layer
|
||||
assert dates.parse_date("15.2.1888").needs_review is False
|
||||
assert dates.parse_date("Mai 1895").needs_review is False
|
||||
assert dates.parse_date("").needs_review is False
|
||||
|
||||
|
||||
def test_parse_non_range_has_no_end():
|
||||
assert dates.parse_date("15.2.1888").end is None
|
||||
assert dates.parse_date("Mai 1895").end is None
|
||||
|
||||
@@ -82,6 +82,29 @@ def test_to_canonical_non_range_has_empty_date_end():
|
||||
assert doc.date_precision == "DAY"
|
||||
assert doc.date_end == ""
|
||||
|
||||
def test_to_canonical_half_resolved_range_flags_review():
|
||||
# an impossible end day ("10./40.1.1917") keeps the start + RANGE precision but
|
||||
# drops the unparseable end; the document must surface this as a review flag
|
||||
# so the importer (#669) knows date_end is empty on a RANGE row by design.
|
||||
ctx = _ctx()
|
||||
raw = documents.RawRow(source_row=5, index="H-0731", sender="", receivers="",
|
||||
date="10./40.1.1917")
|
||||
doc = documents.to_canonical(raw, ctx, date_overrides={})
|
||||
assert doc.date_iso == "1917-01-10"
|
||||
assert doc.date_precision == "RANGE"
|
||||
assert doc.date_end == ""
|
||||
assert "range_end_unparsed" in doc.needs_review
|
||||
|
||||
|
||||
def test_to_canonical_full_range_not_flagged():
|
||||
ctx = _ctx()
|
||||
raw = documents.RawRow(source_row=5, index="H-0730", sender="", receivers="",
|
||||
date="10./11.1.1917")
|
||||
doc = documents.to_canonical(raw, ctx, date_overrides={})
|
||||
assert doc.date_end == "1917-01-11"
|
||||
assert "range_end_unparsed" not in doc.needs_review
|
||||
|
||||
|
||||
def test_to_canonical_unmatched_and_unparsed():
|
||||
ctx = _ctx()
|
||||
raw = documents.RawRow(source_row=9, index="C-0001",
|
||||
|
||||
@@ -1,3 +1,8 @@
|
||||
import json
|
||||
import subprocess
|
||||
import sys
|
||||
from pathlib import Path
|
||||
|
||||
import openpyxl
|
||||
import normalize
|
||||
|
||||
@@ -119,3 +124,56 @@ def test_approved_themes_applied(tmp_path):
|
||||
tag_values = [ws.cell(row=r, column=tag_col + 1).value for r in range(2, ws.max_row + 1)]
|
||||
# W-0001 has Inhalt "Geschäftsreise" — should get an extra Themen/geschäftsreise tag
|
||||
assert any(v and "Themen/geschäftsreise" in v for v in tag_values)
|
||||
|
||||
|
||||
def _person_wb_with_collision(tmp_path):
|
||||
# Two "Hans Cram" rows force the register to suffix the colliding slug (-1/-2);
|
||||
# the tree must carry those exact suffixed ids so the join still reconciles.
|
||||
wb = openpyxl.Workbook(); ws = wb.active; ws.title = "Tabelle1"
|
||||
ws.append(["Generation", "Familienname", "Vorname", "geb als", "Geburtsdatum",
|
||||
"Geburtsort", "Todesdatum", "Sterbeort", "verheiratet mit", "Bemerkung"])
|
||||
ws.append(["G 1", "de Gruyter", "Walter", "", "", "", "", "", "", ""])
|
||||
ws.append(["G 1", "de Gruyter", "Eugenie", "Müller", "", "", "", "", "", ""])
|
||||
ws.append(["G 2", "Cram", "Hans", "", "1890", "", "", "", "", ""])
|
||||
ws.append(["G 3", "Cram", "Hans", "", "1925", "", "", "", "", ""])
|
||||
p = tmp_path / "persons.xlsx"; wb.save(p); return p
|
||||
|
||||
|
||||
def _generate_tree(person_wb, out_path):
|
||||
script = Path(__file__).parent.parent / "persons_tree.py"
|
||||
result = subprocess.run(
|
||||
[sys.executable, str(script), "--input", str(person_wb), "--output", str(out_path)],
|
||||
capture_output=True, text=True,
|
||||
)
|
||||
assert result.returncode == 0, result.stderr
|
||||
return json.loads(out_path.read_text(encoding="utf-8"))
|
||||
|
||||
|
||||
def test_tree_person_ids_reconcile_with_persons_xlsx(tmp_path):
|
||||
# The real #669 contract: every personId in canonical-persons-tree.json must join
|
||||
# 1:1 onto a person_id in canonical-persons.xlsx — no orphan tree id, no duplicate.
|
||||
# Both artifacts are produced from the SAME person workbook (collision included).
|
||||
person_wb = _person_wb_with_collision(tmp_path)
|
||||
out_dir = tmp_path / "out"; review_dir = tmp_path / "review"
|
||||
|
||||
normalize.run(
|
||||
document_workbook=_doc_wb(tmp_path), document_sheet="Familienarchiv",
|
||||
person_workbook=person_wb, person_sheet="Tabelle1",
|
||||
out_dir=out_dir, review_dir=review_dir, date_overrides={}, name_overrides={})
|
||||
|
||||
tree = _generate_tree(person_wb, tmp_path / "tree.json")
|
||||
tree_ids = [p["personId"] for p in tree["persons"]]
|
||||
|
||||
wb = openpyxl.load_workbook(out_dir / "canonical-persons.xlsx")
|
||||
ws = wb.active
|
||||
header = [c.value for c in ws[1]]
|
||||
pid_col = header.index("person_id")
|
||||
register_ids = [ws.cell(row=r, column=pid_col + 1).value for r in range(2, ws.max_row + 1)]
|
||||
|
||||
# tree ids are unique (no duplicate join key)
|
||||
assert len(tree_ids) == len(set(tree_ids))
|
||||
# the suffixed collision ids actually reached the tree
|
||||
assert "cram-hans-1" in tree_ids and "cram-hans-2" in tree_ids
|
||||
# every tree id resolves to exactly one register row — the join is total and 1:1
|
||||
register_counts = {pid: register_ids.count(pid) for pid in tree_ids}
|
||||
assert all(count == 1 for count in register_counts.values()), register_counts
|
||||
|
||||
@@ -454,6 +454,26 @@ def test_attach_person_ids_propagates_register_slug():
|
||||
assert tree_persons[1]["personId"] == "de-gruyter-eugenie"
|
||||
|
||||
|
||||
def test_attach_person_ids_raises_on_length_divergence():
|
||||
# The propagation is a positional zip; if tree_persons and the register drift in
|
||||
# length (e.g. a future filter change), zip would silently truncate and mis-join ids.
|
||||
# The guard must fail loudly instead.
|
||||
raw_dicts = [
|
||||
{"generation": "G 1", "last_name": "de Gruyter", "first_name": "Walter",
|
||||
"maiden_name": "", "birth_date": "", "birth_place": "",
|
||||
"death_date": "", "death_place": "", "spouse": "", "notes": ""},
|
||||
# second register row has a last name -> parse_register keeps it ...
|
||||
{"generation": "G 1", "last_name": "de Gruyter", "first_name": "Eugenie",
|
||||
"maiden_name": "Müller", "birth_date": "", "birth_place": "",
|
||||
"death_date": "", "death_place": "", "spouse": "", "notes": ""},
|
||||
]
|
||||
# ... but the tree side only has one person -> lengths diverge.
|
||||
tree_persons = [persons_tree._parse_row(2, raw_dicts[0])]
|
||||
import pytest
|
||||
with pytest.raises(ValueError, match="length"):
|
||||
persons_tree._attach_person_ids(tree_persons, raw_dicts)
|
||||
|
||||
|
||||
def test_attach_person_ids_carries_register_collision_suffix():
|
||||
# when two register rows slug-collide, the register suffixes the ids (-1, -2);
|
||||
# those exact suffixed ids must reach the tree persons, never a recomputed bare slug
|
||||
|
||||