feat(massimport): handle dot-compressed names and titles in PersonNameParser.split() #184
Problem
The mass import spreadsheet contains abbreviated names with no spaces: dots act as separators between initials, titles, and last names. The current split() fallback only handles space-separated names, so these all produce lastName = "?":

Input            Current                    Expected
E.Rockstroh      ("E.Rockstroh", "?")       ("E.", "Rockstroh")
E.M.             ("E.M.", "?")              ("E.", "M.")
Dr.Fr.Zarncke    ("Dr.Fr.Zarncke", "?")     ("Dr. Fr.", "Zarncke")
Dr.Zarnke        ("Dr.Zarnke", "?")         ("Dr.", "Zarnke")

Root Cause
split() (line 107, PersonNameParser.java) checks known last names first, then falls back to the last space. Names with no spaces bypass both paths and fall through to SplitName(cleaned, "?").
Solution
After the geb. stripping step in split(), add a dot-normalization step that applies only when the cleaned name has no spaces but contains dots, that is, when !cleaned.contains(" ") && cleaned.contains(".") holds. In that case the name is rewritten as cleaned.replace(".", ". ").trim().

Then the existing known-last-name check and last-space fallback handle the rest:

Input            Normalized         Result
E.Rockstroh      E. Rockstroh       ("E.", "Rockstroh") ✓
E.M.             E. M.              ("E.", "M.") ✓
Dr.Fr.Zarncke    Dr. Fr. Zarncke    ("Dr. Fr.", "Zarncke") ✓
Dr.Zarnke        Dr. Zarnke         ("Dr.", "Zarnke") ✓

No changes needed to parseReceivers(): it already passes dot-compressed tokens through as single elements; split() is called downstream in PersonService.findOrCreateByAlias().
Files

File                                                           Change
backend/src/main/java/.../service/PersonNameParser.java        split() (3 lines)
backend/src/test/java/.../service/PersonNameParserTest.java    split_* tests + 1 parseReceivers passthrough test (TDD, red first)

No schema, API, or i18n changes needed.
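The dot-normalization step from the Solution section can be sketched roughly as follows. This is a minimal illustration, not the actual PersonNameParser code: the class name DotNormalizationSketch, the helper normalizeDots(), and the SplitName record shape are assumptions made for this example, and the real geb. stripping and known-last-name check are left out. Only the guard and the replace(".", ". ").trim() call are taken from the issue discussion.

```java
import java.util.List;

// Sketch only: SplitName and this simplified split() are hypothetical stand-ins
// for the real PersonNameParser. The geb. stripping step and the
// known-last-name check are deliberately omitted.
class DotNormalizationSketch {

    // Assumed shape of the parser's result type (first name, last name).
    record SplitName(String firstName, String lastName) {}

    // Insert a space after every dot, but only when the name contains dots
    // and no spaces, so already-spaced names are left untouched.
    static String normalizeDots(String cleaned) {
        if (!cleaned.contains(" ") && cleaned.contains(".")) {
            return cleaned.replace(".", ". ").trim();
        }
        return cleaned;
    }

    // Simplified fallback: split at the last space; without any space,
    // the whole string becomes the first name and "?" the last name.
    static SplitName split(String raw) {
        String cleaned = normalizeDots(raw.trim());
        int lastSpace = cleaned.lastIndexOf(' ');
        if (lastSpace < 0) {
            return new SplitName(cleaned, "?");
        }
        return new SplitName(cleaned.substring(0, lastSpace).trim(),
                             cleaned.substring(lastSpace + 1));
    }

    public static void main(String[] args) {
        // The four dot-compressed names from the issue's table.
        for (String name : List.of("E.Rockstroh", "E.M.", "Dr.Fr.Zarncke", "Dr.Zarnke")) {
            System.out.println(name + " -> " + split(name));
        }
    }
}
```

Under this simplified fallback, the four names from the issue's table split as listed in the Solution table, a lone M. still yields ("M.", "?"), and an already-spaced Dr. Fr. Zarncke is left untouched by the guard. How the real split() treats those edge cases depends on code not shown in this issue.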
New Tests
Verification
👨💻 Felix Brandt — Senior Fullstack Developer
Questions & Observations
- The fix (cleaned.replace(".", ". ").trim()) is beautifully minimal: it normalizes the input so the existing last-space fallback does all the work. No new branching logic, no new code paths. KISS at its best.
- The guard !cleaned.contains(" ") && cleaned.contains(".") is tight: it only fires when there are no spaces but there are dots. This avoids interfering with already-spaced names that happen to contain dots (like Dr. Zarncke).
- The parseReceivers_dotCompressedName_passthrough test is important: it confirms that parseReceivers treats Dr.Fr.Zarncke as a single token and doesn't try to split it at the und/u level.
- What about a single initial such as M.? After normalization it becomes "M. " (with a trailing space), which trim() turns back into "M.". The existing fallback would then see no space and fall through to SplitName("M.", "?"). Is that the expected behavior, or should M. be handled differently?
Suggestions
- Add a test for M. to document whether the current behavior (falls through to ? last name) is intentional or should be handled as a special case.
- The ordering "after the geb. stripping step" is important: if geb. stripping runs first, a name like geb.Rockstroh would become Rockstroh before the dot-normalization step, which is correct. Confirm this ordering in the implementation.
- The replace(".", ". ") approach is a String method, not regex. That's the right choice here: simple, readable, no regex overhead.
🏗️ Markus Keller — Application Architect
Questions & Observations
- split() is a utility method within PersonNameParser, called downstream from PersonService.findOrCreateByAlias(). The change is invisible to callers.
Suggestions
- The normalization belongs in split(), not in parseReceivers(). This is an important distinction: parseReceivers() treats Dr.Fr.Zarncke as a single token (confirmed by the passthrough test). The normalization only happens when split() is asked to decompose a name into first/last. This layering is correct; just flagging it for the implementer to be precise about placement.
- PersonNameParser is accumulating a fair amount of normalization logic (geb. stripping, known last names, now dot-normalization). If more normalization steps are added in the future, consider whether a small pipeline of normalization steps (each a method) would be clearer than a growing list of if-checks in split(). Not needed now, just a watch point.
🧪 Sara Holt — QA Engineer
Questions & Observations
- The tests cover the split() scenarios from the issue's table plus a parseReceivers passthrough test. Good layered coverage: unit tests for the low-level split() method and a higher-level test confirming the integration with parseReceivers.
- Rockstroh. becomes "Rockstroh. " (with a trailing space) after normalization and is trimmed back to "Rockstroh.". How does the split fallback handle this?
- Dr. Fr. Zarncke (already properly spaced): the guard !cleaned.contains(" ") should prevent normalization. A regression test confirming this would be valuable.
- M.: as Felix noted, this might produce ("M.", "?"). Document the behavior with a test.
- E..Rockstroh: after replace(".", ". ") it becomes "E. . Rockstroh". Does the last-space fallback handle the extra spaces?
Suggestions
- Add a regression test for an already-spaced name (Dr. Fr. Zarncke): this confirms the guard clause works and that the normalization doesn't double-space names that are already correct.
- The parseReceivers_dotCompressedName_passthrough test is valuable for confirming layer separation: it proves that parseReceivers doesn't try to split on dots, only split() does. Keep this test.
- Running the full PersonNameParserTest class (not just the new tests) after implementation is essential: the normalization step could theoretically affect existing test cases if the guard condition has an edge case.
🔒 Nora "NullX" Steiner — Security Engineer
Questions & Observations
- The String.replace(".", ". ") call is a literal string replacement, not regex, so there is no risk of ReDoS or regex injection.
Suggestions
🎨 Leonie Voss — UI/UX Design Lead
Questions & Observations
lastName = "?".
Suggestions
🛠️ Tobias Wendt — DevOps Engineer
Questions & Observations
Suggestions