feat: PersonNameParser enhancements and Person model refactor (#209-#213) #215
Reference in New Issue
Block a user
Delete Branch "feat/issues-209-213-person-parser-enhancements"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Implements the full PersonNameParser enhancement suite across 5 issues:
Key changes
19 commits, 50+ files changed
Test plan
Closes #209, #210, #211, #212, #213
👨💻 Felix Brandt — Senior Fullstack Developer
Verdict: ⚠️ Approved with concerns
Blockers
PersonService.findOrCreateByAlias()returnsnullfor SKIP — caller NPE risk.MassImportServicecallsfindOrCreateByAlias()and likely sets the result asdocument.setSender(person). Ifpersonis null due to SKIP classification, and the caller doesn't check, this is an NPE or a constraint violation. Every caller offindOrCreateByAlias()must be audited for null-safety. This is a behavioral contract change that needs a@Nullableannotation and caller updates.Architektappears in bothPersonTypeClassifier(INSTITUTION) andPersonNameParser.WORD_PREFIXES(title). If classification runs beforesplit()infindOrCreateByAlias(), the classifier catchesArchitekt Xas INSTITUTION andsplit()never runs — so the WORD_PREFIXES entry is dead code for the import path. But ifsplit()is ever called directly (e.g., from a future Von-column handler), it would stripArchitektas a title. This dual presence is confusing — document which takes precedence or remove the redundancy.Suggestions
PersonNameParser.java:162—firstName.equals(lastName)can NPE iffirstNameis null. This currently can't happen becausesplitByKnownLastNameOrFallbackalways sets firstName, but it's fragile. ConsiderObjects.equals(firstName, lastName).PersonTypeClassifier.java:27—trimmedis assigned but never used (onlyloweris used). Remove the unused variable.PersonSummaryDTO.getDisplayName()default method duplicates the logic inPerson.getDisplayName(). Consider extracting a shared static utility to avoid drift between the two implementations.PersonTypeBadge.svelte— the badge classes are duplicated across three branches. Extract the shared classes into a base variable and only vary color classes per type.🏗️ Markus Keller — Application Architect
Verdict: ⚠️ Approved with concerns
Blockers
findOrCreateByAlias()returning null breaks the method contract. This method previously always returned aPerson. Returning null for SKIP changes the contract silently — no@Nullableannotation, no Javadoc update. Every caller (MassImportService, any future caller) must null-check. The safer pattern: return anOptional<Person>or throw aDomainExceptionwith a specific error code. Returning null from a method namedfindOrCreateviolates the principle of least surprise.Concerns
PersonSummaryDTO.getDisplayName()duplicatesPerson.getDisplayName(). Two independent implementations of the same business logic is a maintenance risk. When someone adds a new display rule (e.g., PersonType-specific formatting), they must update both. Extract to a static utility:PersonDisplayNameUtil.format(title, firstName, lastName)called by both.PersonTypeClassifieris a separate class but tightly coupled toPersonService. The classifier determines behavior in the service (skip split, set personType). This is fine architecturally — the classifier is a pure function, the service orchestrates. But the keyword lists are hardcoded. If new institutions appear in future imports, the code must be changed and redeployed. Consider whether this is acceptable long-term or if the keywords should be stored in a database table. For now: acceptable, flag for future consideration.Pipeline ordering is implicit. The 5-step pipeline in
split()is well-structured, but the ordering contract (maiden → dot-norm → annotation → title → split) is only documented in a comment. If someone reorders the steps, subtle bugs emerge. Consider making the pipeline an explicit list ofFunction<String, String>steps — overkill for now but worth noting.LGTM
PersonType.SKIPnever persisted, CHECK constraint excludes it — defense-in-depth is correct.displayNameas@Transientwith@Schema(READ_ONLY)is the right pattern.🧪 Sara Holt — QA Engineer
Verdict: ⚠️ Approved with concerns
Blockers
findOrCreateByAlias()with SKIP/INSTITUTION/GROUP. The service tests mock the repository, which is correct for unit tests. But the behavioral change (returning null for SKIP, setting personType for non-persons) needs a Testcontainers integration test that verifies: (a) SKIP produces no Person record in the DB, (b) INSTITUTION/GROUP records have null firstName and correct personType, (c) the CHECK constraint rejects SKIP if someone tries to persist it.Concerns
PersonSummaryDTOlacksdisplayNamein the test assertions. ThePersonRepositoryTesttests forsearchWithDocumentCountandfindAllWithDocumentCountdon't verify thatdisplayNameis correctly computed by the default method. Add an assertion:assertThat(result.get(0).getDisplayName()).isEqualTo("Hans Müller").Classifier edge case:
containsWordonly finds the first occurrence. If a name containsElternas a substring in one position (not word boundary) and as a word in another, only the firstindexOfresult is checked. Example:"Nachbareltern Eltern"—indexOf("eltern")returns 8 (inside "Nachbareltern"), which fails the word boundary check, so the method returns false even though a standalone "Eltern" exists at position 16. Use a loop or regex\bEltern\binstead.No test for stacked titles in the classifier. What happens if someone imports
Prof. Dr. Firma Auschrath? The classifier checks INSTITUTION before the parser strips titles.Prof.does not start withFirma, so it falls through to PERSON. ButFirma Auschrathafter title strip would be INSTITUTION. The classifier-before-parser ordering means this is a miss. Edge case, but worth a test to document the behavior.LGTM
🔒 Nora "NullX" Steiner — Security Engineer
Verdict: ✅ Approved
No security blockers. This PR is contained within parsing logic and entity model changes with no new attack surface.
Checked
GEB_PATTERNuses(.+)$anchored at end-of-string — no catastrophic backtracking risk.PAREN_ANNOTATIONuses[^)]*(character class negation) — safe.UNCERTAIN_NAMEuses\S+— bounded. All patterns are pre-compiled as static constants. No ReDoS concern.@Paramplaceholders. The COALESCE additions are pure SQL functions on column references, no user input concatenation.titlefield: Thetitlefield stores parser-extracted data (from trusted ODS import). Frontend renders via Svelte text interpolation ({person.displayName},{m.person_type_INSTITUTION()}), which auto-escapes HTML. No{@html}usage for person display. Safe.PersonUpdateDTOdoes not includetitleorpersonTypefields, so these cannot be set via the person edit API.displayNameis@Schema(accessMode = READ_ONLY). No over-posting risk.person_type IN ('PERSON', 'INSTITUTION', 'GROUP', 'UNKNOWN')— defense-in-depth against invalid enum values.SKIPcorrectly excluded.Suggestion
titleis added (future issue), apply@Size(max = 50)and@Patternvalidation to reject HTML/script content. Not needed now sincetitleis only set programmatically by the parser.🎨 Leonie Voss — UI/UX Design Lead
Verdict: ⚠️ Approved with concerns
Concerns
PersonTypeBadgeuses Tailwind utility colors, not brand tokens.bg-blue-50,text-blue-700,bg-purple-50,bg-amber-50are generic Tailwind colors, not from the project's brand palette (brand-navy,brand-mint,brand-sand). For consistency with the design system, these should use the project's semantic color tokens or be explicitly documented as intentional departures. The amber-50 for UNKNOWN is fine (warning signal), but blue-50 and purple-50 have no precedent in the existing color system.No dark mode support for PersonTypeBadge. The badge uses light-mode-only colors (
bg-blue-50,text-blue-700). When dark mode is eventually implemented, these will needdark:variants. Add aTODOor use CSS custom properties.Badge tap target is below minimum. The badge renders at
text-xs px-2 py-0.5which produces a roughly 20-24px tall element. While badges are not interactive (no click handler), they appear inside clickable person cards. If they ever become interactive (e.g., filter by type), the tap target will be too small. Note for future.Person list with null firstName — initials handling.
persons/+page.svelte:102usesperson.firstName ? person.firstName[0] : person.lastName[0]for the first initial, thenperson.lastName[0]for the second. For an institution like "Arthur Collignon GmbH", this produces "AA" (first two chars of lastName being "A" + "A"). This is meaningless. The PersonTypeBadge partially addresses this by showing the type, but the avatar circle should consider showing a type icon instead of initials for non-persons.LGTM
displayNameas the single source of truth for name rendering is the correct pattern — eliminates 17+ files of duplicated concatenation logic.🛠️ Tobias Wendt — DevOps Engineer
Verdict: ✅ Approved
No infrastructure concerns. Clean application-level change.
Checked
frontend/src/lib/generated/api.tsis updated with the new Person shape. CI lint hook will catch any drift.Suggestion
./mvnw spring-boot:runwith env vars). Consider documenting this step in CLAUDE.md or adding a CI check that verifies generated types are up to date (npm run generate:api && git diff --exit-code). Not blocking for this PR.Review Concerns Addressed
All blockers and actionable concerns from the 6-persona review have been resolved in 3 commits:
c0cf8d7@NullableonfindOrCreateByAlias(), filter nulls inMassImportServicereceiver listac545ecArchitektfrom WORD_PREFIXES (classifier handles it),Objects.equalsfor null safety, remove unusedtrimmedvariable. @Markus: extractDisplayNameFormattershared utility. @Sara: fixcontainsWordto loop through all occurrences5106d27Deferred (noted, not blocking)
@Size/@Patternvalidation on title edit form — deferred to when edit form is builtAll deferred concerns resolved
3 additional commits addressing the remaining review concerns:
a1b21d6--c-badge-institution-*,--c-badge-group-*,--c-badge-unknown-*)166f60fc34db99titlefield added to PersonUpdateDTO with@Size(max=50)validation. Service handles title in create/update pathsNo remaining deferred items.