feat: PersonNameParser enhancements and Person model refactor (#209-#213) #215

marcel · 2026-04-08T13:16:37+02:00

marcel commented

2026-04-08 13:16:37 +02:00

Summary

Implements the full PersonNameParser enhancement suite across 5 issues:

#213 — Preparatory infrastructure: SplitName record (5 fields), split() pipeline extraction, PersonType enum, Person entity changes (title, personType, nullable firstName, displayName), V22 migration, frontend displayName migration across 32 files, i18n keys
#209 — Widen GEB_PATTERN for maiden name extraction: optional comma/dot, multi-word maiden names, MAIDEN_NAME alias creation in PersonService
#210 — Strip parenthesized annotations: birth years, nicknames, uncertainty markers with special (Name ?) handling, parseReceivers() fix for single-person inputs
#211 — PersonTypeClassifier with keyword heuristics (SKIP/INSTITUTION/GROUP), integration into findOrCreateByAlias(), PersonTypeBadge component
#212 — Title/salutation stripping: two-pass approach (dot-prefix + word-prefix), stacked titles, 5 new von last names in KNOWN_LAST_NAMES

Key changes

Area	Changes
Backend model	Person gains title, personType, displayName; firstName nullable; PersonType enum (5 values); MAIDEN_NAME alias type
Migration	V22: title column, person_type with CHECK constraint, nullable first_name
Parser	5-step pipeline: stripMaidenName → normalizeDotCompressed → stripAnnotation → stripTitle → splitByKnownLastNameOrFallback
Classifier	PersonTypeClassifier with position-aware keyword matching for 17 keywords
Search	COALESCE fix for null firstName in 6 query locations
Frontend	displayName replaces all firstName+lastName concatenation; PersonTypeBadge; regenerated API types
i18n	PersonType labels + MAIDEN_NAME in de/en/es

19 commits, 50+ files changed

Test plan

Backend: 722 tests passing (40+ new tests for parser, classifier, service, repository)
Frontend: 646 tests passing, 4 pre-existing failures (unrelated)
Visual verification of PersonTypeBadge on person list/detail
Import verification with ODS containing titled/institutional entries

Closes #209, #210, #211, #212, #213

## Summary Implements the full PersonNameParser enhancement suite across 5 issues: - **#213** — Preparatory infrastructure: SplitName record (5 fields), split() pipeline extraction, PersonType enum, Person entity changes (title, personType, nullable firstName, displayName), V22 migration, frontend displayName migration across 32 files, i18n keys - **#209** — Widen GEB_PATTERN for maiden name extraction: optional comma/dot, multi-word maiden names, MAIDEN_NAME alias creation in PersonService - **#210** — Strip parenthesized annotations: birth years, nicknames, uncertainty markers with special (Name ?) handling, parseReceivers() fix for single-person inputs - **#211** — PersonTypeClassifier with keyword heuristics (SKIP/INSTITUTION/GROUP), integration into findOrCreateByAlias(), PersonTypeBadge component - **#212** — Title/salutation stripping: two-pass approach (dot-prefix + word-prefix), stacked titles, 5 new von last names in KNOWN_LAST_NAMES ### Key changes | Area | Changes | |------|---------| | Backend model | Person gains title, personType, displayName; firstName nullable; PersonType enum (5 values); MAIDEN_NAME alias type | | Migration | V22: title column, person_type with CHECK constraint, nullable first_name | | Parser | 5-step pipeline: stripMaidenName → normalizeDotCompressed → stripAnnotation → stripTitle → splitByKnownLastNameOrFallback | | Classifier | PersonTypeClassifier with position-aware keyword matching for 17 keywords | | Search | COALESCE fix for null firstName in 6 query locations | | Frontend | displayName replaces all firstName+lastName concatenation; PersonTypeBadge; regenerated API types | | i18n | PersonType labels + MAIDEN_NAME in de/en/es | ### 19 commits, 50+ files changed ## Test plan - [x] Backend: 722 tests passing (40+ new tests for parser, classifier, service, repository) - [x] Frontend: 646 tests passing, 4 pre-existing failures (unrelated) - [ ] Visual verification of PersonTypeBadge on person list/detail - [ ] Import verification with ODS containing titled/institutional entries Closes #209, #210, #211, #212, #213

marcel added 22 commits 2026-04-08 13:16:38 +02:00

feat(parser): support // as multi-person separator in parseReceivers 59475efbcb

Pre-splits input on "//" before existing logic so each segment is
processed independently through the full pipeline (und/u splitting,
last-name distribution, etc.).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(parser): normalize dot-compressed names in split() 0b57717586

Inserts spaces after dots when the cleaned name has no spaces but
contains dots, so the existing last-space fallback handles names
like "E.Rockstroh" and "Dr.Fr.Zarncke" correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

test(parser): add regression and cross-feature interaction tests

CI / Unit & Component Tests (pull_request) Failing after 3s

Details

CI / Backend Unit Tests (pull_request) Failing after 2s

Details

CI / Unit & Component Tests (push) Failing after 4s

Details

CI / Backend Unit Tests (push) Failing after 3s

Details

d6e74972eb

Regression test confirms already-spaced dot names are not double-spaced.
Interaction test confirms // separator works with dot-compressed names.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

refactor(parser): expand SplitName record to 5 fields 1e1921e0fa

Add title, maidenName, and annotation fields (all nullable) to
SplitName. All existing call sites pass null for new fields. Test
assertions updated to document the null-by-default contract.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

refactor(parser): extract split() pipeline into named methods dea1635d75

Extract stripMaidenName, normalizeDotCompressed, stripAnnotation,
stripTitle, and splitByKnownLastNameOrFallback as individually
testable pipeline steps. Each extraction method is a pass-through
until its feature issue fills in the logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(model): add PersonType enum and MAIDEN_NAME alias type 8101ddb697

PersonType has 5 values: PERSON, INSTITUTION, GROUP, UNKNOWN, SKIP.
SKIP is intentionally excluded from the DB CHECK constraint (added
in migration) as defense-in-depth. MAIDEN_NAME added to
PersonNameAliasType for #209.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(model): add title, personType, displayName to Person entity 9f14648dc3

- Add title (nullable VARCHAR) and personType (enum, default PERSON)
- Make firstName nullable for non-person entities
- Add @Transient getDisplayName() as single source of truth for
  name display, exposed via @Schema(READ_ONLY, REQUIRED)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(migration): V22 add title, person_type, nullable first_name 92f1a112f5

- Add title VARCHAR(50) column
- Add person_type VARCHAR(20) NOT NULL DEFAULT 'PERSON' with CHECK
  constraint (PERSON, INSTITUTION, GROUP, UNKNOWN — SKIP excluded)
- Drop NOT NULL on first_name for non-person entities

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(search): handle null firstName in all search queries de2cc677a9

Use COALESCE to convert null firstName to empty string in:
- PersonRepository.searchByName (JPQL)
- PersonRepository.searchWithDocumentCount (native SQL)
- PersonRepository.findCorrespondentsWithFilter (native SQL)
- DocumentSpecifications.hasText (Criteria API, sender + receiver)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

build(frontend): regenerate API types for Person changes 0ce803c7f1

Person type now includes displayName (readonly, required), title,
personType (required enum), and firstName is optional.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(frontend): replace all name concatenation with displayName f11d8a38ed

- Add displayName default method to PersonSummaryDTO
- Update native SQL queries to include title, person_type columns
- Add getInitials() utility to personFormat.ts
- Update abbreviateName/abbreviateCompact for nullable firstName
- Replace firstName+lastName concatenation with displayName in all
  person-displaying components and server load files
- Regenerate API types with displayName on Person and PersonSummaryDTO

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(i18n): add PersonType and MAIDEN_NAME translation keys 9caef1e79e

Add translations for PersonType values (PERSON, INSTITUTION, GROUP,
UNKNOWN) and PersonNameAliasType.MAIDEN_NAME in de/en/es.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

test(frontend): update mock data for displayName and nullable firstName 1aabd9826c

Add displayName and personType to all Person mock objects in
component and page tests. Update assertions from reversed
"lastName, firstName" format to forward-order displayName.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(parser): widen GEB_PATTERN and extract maiden name in stripMaidenName c49cb345ca

Widen pattern from `\s+geb\.\s+\S+` to `,?\s*geb\.?\s+(.+)$` to
handle: optional comma, optional dot, multi-word maiden names.
stripMaidenName() now captures the maiden name instead of discarding
it. Handles all 5 input variants from the ODS data.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

test(parser): add parseReceivers tests for widened geb pattern 8421d45c71

Verify comma-prefix, no-dot, and multi-word maiden name variants
are correctly stripped in parseReceivers().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(service): create MAIDEN_NAME alias in findOrCreateByAlias 9f90cc1a5f

When split() returns a non-null maidenName, PersonService now
creates a PersonNameAlias with type MAIDEN_NAME. The maiden name
is stored as lastName on the alias (no firstName).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(parser): implement stripAnnotation for parenthesized content e696e5056d

Extract trailing (...) content as annotation. Handles birth years
(*1871), nicknames (Tuttu), uncertainty markers (?), and uncertain
names (Quast ?) where the name part is extracted back into the
cleaned result. Uses [^)]* regex to prevent ReDoS.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(parser): preserve annotation parens for single-person inputs e49ae5de29

Move paren extraction in parseReceivers() after the multi-separator
check so single-person entries like "Clara de Gruyter(*1871)" keep
their parens intact for split()'s annotation extraction. Multi-person
entries like "Hedi und Tutu (Gruber)" still use parens as shared
last-name override.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(service): add PersonTypeClassifier with keyword heuristics 68f0c4c4b9

Static classify() method uses position-aware keyword matching:
- SKIP: Briefumschlag, Kondolenzbriefe, Hochzeitsgedicht (start)
- INSTITUTION: Firma, Architekt (start), GmbH, Co (end)
- GROUP: Familie, Comité, Comite, Geschwister, Gesellschafter,
  Garde, Mitarbeiter (start), Eltern, Kinder,
  Schwiegereltern (word boundary)
- PERSON: default for all other inputs

Case-insensitive. 25 parameterized test cases.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(service): integrate PersonTypeClassifier into findOrCreateByAlias a3da5731d0

Classify raw name before processing. SKIP returns null (no Person
created). INSTITUTION/GROUP skip split() and store full name in
lastName with firstName=null and appropriate personType.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(ui): add PersonTypeBadge to person list and detail pages 6ee1ef73c3

Show colored badge for non-PERSON types per design spec:
- INSTITUTION: blue with building icon
- GROUP: purple with people icon
- UNKNOWN: amber with question mark icon
- PERSON: no badge (unmarked default)

Badge appears on person cards in list and on detail page.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(parser): implement stripTitle for known prefixes

CI / Unit & Component Tests (push) Failing after 3s

Details

CI / Backend Unit Tests (push) Failing after 1s

Details

CI / Unit & Component Tests (pull_request) Failing after 2s

Details

CI / Backend Unit Tests (pull_request) Failing after 1s

Details

73640ef5b6

Two-pass title stripping with loop for stacked titles:
- Dot-prefixes (Dr., Prof.) matched without trailing space
- Word-prefixes (Tante, Frau, Schwester, etc.) matched at
  word boundary
- Stacked titles like "Prof. Dr. Muller" handled correctly
- Single token after title strip goes to lastName (not firstName)

Add 5 "von" last names to KNOWN_LAST_NAMES for correct splitting
of entries like "Freifrau von Massenbach".

15 new test cases + updated 3 existing tests for title behavior.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

marcel commented

2026-04-08 13:18:00 +02:00

👨‍💻 Felix Brandt — Senior Fullstack Developer

Verdict: ⚠️ Approved with concerns

Blockers

PersonService.findOrCreateByAlias() returns null for SKIP — caller NPE risk. MassImportService calls findOrCreateByAlias() and likely sets the result as document.setSender(person). If person is null due to SKIP classification, and the caller doesn't check, this is an NPE or a constraint violation. Every caller of findOrCreateByAlias() must be audited for null-safety. This is a behavioral contract change that needs a @Nullable annotation and caller updates.
Architekt appears in both PersonTypeClassifier (INSTITUTION) and PersonNameParser.WORD_PREFIXES (title). If classification runs before split() in findOrCreateByAlias(), the classifier catches Architekt X as INSTITUTION and split() never runs — so the WORD_PREFIXES entry is dead code for the import path. But if split() is ever called directly (e.g., from a future Von-column handler), it would strip Architekt as a title. This dual presence is confusing — document which takes precedence or remove the redundancy.

Suggestions

PersonNameParser.java:162 — firstName.equals(lastName) can NPE if firstName is null. This currently can't happen because splitByKnownLastNameOrFallback always sets firstName, but it's fragile. Consider Objects.equals(firstName, lastName).
PersonTypeClassifier.java:27 — trimmed is assigned but never used (only lower is used). Remove the unused variable.
The PersonSummaryDTO.getDisplayName() default method duplicates the logic in Person.getDisplayName(). Consider extracting a shared static utility to avoid drift between the two implementations.
PersonTypeBadge.svelte — the badge classes are duplicated across three branches. Extract the shared classes into a base variable and only vary color classes per type.

## 👨‍💻 Felix Brandt — Senior Fullstack Developer **Verdict: ⚠️ Approved with concerns** ### Blockers 1. **`PersonService.findOrCreateByAlias()` returns `null` for SKIP — caller NPE risk.** `MassImportService` calls `findOrCreateByAlias()` and likely sets the result as `document.setSender(person)`. If `person` is null due to SKIP classification, and the caller doesn't check, this is an NPE or a constraint violation. Every caller of `findOrCreateByAlias()` must be audited for null-safety. This is a behavioral contract change that needs a `@Nullable` annotation and caller updates. 2. **`Architekt` appears in both `PersonTypeClassifier` (INSTITUTION) and `PersonNameParser.WORD_PREFIXES` (title).** If classification runs before `split()` in `findOrCreateByAlias()`, the classifier catches `Architekt X` as INSTITUTION and `split()` never runs — so the WORD_PREFIXES entry is dead code for the import path. But if `split()` is ever called directly (e.g., from a future Von-column handler), it would strip `Architekt` as a title. This dual presence is confusing — document which takes precedence or remove the redundancy. ### Suggestions - `PersonNameParser.java:162` — `firstName.equals(lastName)` can NPE if `firstName` is null. This currently can't happen because `splitByKnownLastNameOrFallback` always sets firstName, but it's fragile. Consider `Objects.equals(firstName, lastName)`. - `PersonTypeClassifier.java:27` — `trimmed` is assigned but never used (only `lower` is used). Remove the unused variable. - The `PersonSummaryDTO.getDisplayName()` default method duplicates the logic in `Person.getDisplayName()`. Consider extracting a shared static utility to avoid drift between the two implementations. - `PersonTypeBadge.svelte` — the badge classes are duplicated across three branches. Extract the shared classes into a base variable and only vary color classes per type.

marcel commented

2026-04-08 13:18:24 +02:00

🏗️ Markus Keller — Application Architect

Verdict: ⚠️ Approved with concerns

Blockers

findOrCreateByAlias() returning null breaks the method contract. This method previously always returned a Person. Returning null for SKIP changes the contract silently — no @Nullable annotation, no Javadoc update. Every caller (MassImportService, any future caller) must null-check. The safer pattern: return an Optional<Person> or throw a DomainException with a specific error code. Returning null from a method named findOrCreate violates the principle of least surprise.

Concerns

PersonSummaryDTO.getDisplayName() duplicates Person.getDisplayName(). Two independent implementations of the same business logic is a maintenance risk. When someone adds a new display rule (e.g., PersonType-specific formatting), they must update both. Extract to a static utility: PersonDisplayNameUtil.format(title, firstName, lastName) called by both.
PersonTypeClassifier is a separate class but tightly coupled to PersonService. The classifier determines behavior in the service (skip split, set personType). This is fine architecturally — the classifier is a pure function, the service orchestrates. But the keyword lists are hardcoded. If new institutions appear in future imports, the code must be changed and redeployed. Consider whether this is acceptable long-term or if the keywords should be stored in a database table. For now: acceptable, flag for future consideration.
Pipeline ordering is implicit. The 5-step pipeline in split() is well-structured, but the ordering contract (maiden → dot-norm → annotation → title → split) is only documented in a comment. If someone reorders the steps, subtle bugs emerge. Consider making the pipeline an explicit list of Function<String, String> steps — overkill for now but worth noting.

LGTM

Layer boundaries are clean: parser is pure, classifier is pure, service orchestrates.
PersonType.SKIP never persisted, CHECK constraint excludes it — defense-in-depth is correct.
The displayName as @Transient with @Schema(READ_ONLY) is the right pattern.
COALESCE fix for null firstName in search queries is thorough (6 locations).

## 🏗️ Markus Keller — Application Architect **Verdict: ⚠️ Approved with concerns** ### Blockers 1. **`findOrCreateByAlias()` returning null breaks the method contract.** This method previously always returned a `Person`. Returning null for SKIP changes the contract silently — no `@Nullable` annotation, no Javadoc update. Every caller (`MassImportService`, any future caller) must null-check. The safer pattern: return an `Optional<Person>` or throw a `DomainException` with a specific error code. Returning null from a method named `findOrCreate` violates the principle of least surprise. ### Concerns - **`PersonSummaryDTO.getDisplayName()` duplicates `Person.getDisplayName()`.** Two independent implementations of the same business logic is a maintenance risk. When someone adds a new display rule (e.g., PersonType-specific formatting), they must update both. Extract to a static utility: `PersonDisplayNameUtil.format(title, firstName, lastName)` called by both. - **`PersonTypeClassifier` is a separate class but tightly coupled to `PersonService`.** The classifier determines behavior in the service (skip split, set personType). This is fine architecturally — the classifier is a pure function, the service orchestrates. But the keyword lists are hardcoded. If new institutions appear in future imports, the code must be changed and redeployed. Consider whether this is acceptable long-term or if the keywords should be stored in a database table. For now: acceptable, flag for future consideration. - **Pipeline ordering is implicit.** The 5-step pipeline in `split()` is well-structured, but the ordering contract (maiden → dot-norm → annotation → title → split) is only documented in a comment. If someone reorders the steps, subtle bugs emerge. Consider making the pipeline an explicit list of `Function<String, String>` steps — overkill for now but worth noting. ### LGTM - Layer boundaries are clean: parser is pure, classifier is pure, service orchestrates. - `PersonType.SKIP` never persisted, CHECK constraint excludes it — defense-in-depth is correct. - The `displayName` as `@Transient` with `@Schema(READ_ONLY)` is the right pattern. - COALESCE fix for null firstName in search queries is thorough (6 locations).

marcel commented

2026-04-08 13:18:48 +02:00

🧪 Sara Holt — QA Engineer

Verdict: ⚠️ Approved with concerns

Blockers

No integration test for findOrCreateByAlias() with SKIP/INSTITUTION/GROUP. The service tests mock the repository, which is correct for unit tests. But the behavioral change (returning null for SKIP, setting personType for non-persons) needs a Testcontainers integration test that verifies: (a) SKIP produces no Person record in the DB, (b) INSTITUTION/GROUP records have null firstName and correct personType, (c) the CHECK constraint rejects SKIP if someone tries to persist it.

Concerns

PersonSummaryDTO lacks displayName in the test assertions. The PersonRepositoryTest tests for searchWithDocumentCount and findAllWithDocumentCount don't verify that displayName is correctly computed by the default method. Add an assertion: assertThat(result.get(0).getDisplayName()).isEqualTo("Hans Müller").
Classifier edge case: containsWord only finds the first occurrence. If a name contains Eltern as a substring in one position (not word boundary) and as a word in another, only the first indexOf result is checked. Example: "Nachbareltern Eltern" — indexOf("eltern") returns 8 (inside "Nachbareltern"), which fails the word boundary check, so the method returns false even though a standalone "Eltern" exists at position 16. Use a loop or regex \bEltern\b instead.
No test for stacked titles in the classifier. What happens if someone imports Prof. Dr. Firma Auschrath? The classifier checks INSTITUTION before the parser strips titles. Prof. does not start with Firma, so it falls through to PERSON. But Firma Auschrath after title strip would be INSTITUTION. The classifier-before-parser ordering means this is a miss. Edge case, but worth a test to document the behavior.

LGTM

72 parser tests + 25 classifier tests + service tests = excellent unit coverage.
The parameterized test approach for the classifier is clean and extensible.
Test data from the ODS input/output tables is used directly — real-data-driven testing.
Pre-existing frontend test failures (4) are confirmed unrelated by stash-and-run verification.

## 🧪 Sara Holt — QA Engineer **Verdict: ⚠️ Approved with concerns** ### Blockers 1. **No integration test for `findOrCreateByAlias()` with SKIP/INSTITUTION/GROUP.** The service tests mock the repository, which is correct for unit tests. But the behavioral change (returning null for SKIP, setting personType for non-persons) needs a Testcontainers integration test that verifies: (a) SKIP produces no Person record in the DB, (b) INSTITUTION/GROUP records have null firstName and correct personType, (c) the CHECK constraint rejects SKIP if someone tries to persist it. ### Concerns - **`PersonSummaryDTO` lacks `displayName` in the test assertions.** The `PersonRepositoryTest` tests for `searchWithDocumentCount` and `findAllWithDocumentCount` don't verify that `displayName` is correctly computed by the default method. Add an assertion: `assertThat(result.get(0).getDisplayName()).isEqualTo("Hans Müller")`. - **Classifier edge case: `containsWord` only finds the first occurrence.** If a name contains `Eltern` as a substring in one position (not word boundary) and as a word in another, only the first `indexOf` result is checked. Example: `"Nachbareltern Eltern"` — `indexOf("eltern")` returns 8 (inside "Nachbareltern"), which fails the word boundary check, so the method returns false even though a standalone "Eltern" exists at position 16. Use a loop or regex `\bEltern\b` instead. - **No test for stacked titles in the classifier.** What happens if someone imports `Prof. Dr. Firma Auschrath`? The classifier checks INSTITUTION before the parser strips titles. `Prof.` does not start with `Firma`, so it falls through to PERSON. But `Firma Auschrath` after title strip would be INSTITUTION. The classifier-before-parser ordering means this is a miss. Edge case, but worth a test to document the behavior. ### LGTM - 72 parser tests + 25 classifier tests + service tests = excellent unit coverage. - The parameterized test approach for the classifier is clean and extensible. - Test data from the ODS input/output tables is used directly — real-data-driven testing. - Pre-existing frontend test failures (4) are confirmed unrelated by stash-and-run verification.

marcel commented

2026-04-08 13:19:10 +02:00

🔒 Nora "NullX" Steiner — Security Engineer

Verdict: ✅ Approved

No security blockers. This PR is contained within parsing logic and entity model changes with no new attack surface.

Checked

Regex safety: GEB_PATTERN uses (.+)$ anchored at end-of-string — no catastrophic backtracking risk. PAREN_ANNOTATION uses [^)]* (character class negation) — safe. UNCERTAIN_NAME uses \S+ — bounded. All patterns are pre-compiled as static constants. No ReDoS concern.
SQL injection: All queries use parameterized JPQL or native SQL with @Param placeholders. The COALESCE additions are pure SQL functions on column references, no user input concatenation.
XSS via title field: The title field stores parser-extracted data (from trusted ODS import). Frontend renders via Svelte text interpolation ({person.displayName}, {m.person_type_INSTITUTION()}), which auto-escapes HTML. No {@html} usage for person display. Safe.
Mass assignment: PersonUpdateDTO does not include title or personType fields, so these cannot be set via the person edit API. displayName is @Schema(accessMode = READ_ONLY). No over-posting risk.
CHECK constraint: person_type IN ('PERSON', 'INSTITUTION', 'GROUP', 'UNKNOWN') — defense-in-depth against invalid enum values. SKIP correctly excluded.

Suggestion

When the edit form for title is added (future issue), apply @Size(max = 50) and @Pattern validation to reject HTML/script content. Not needed now since title is only set programmatically by the parser.

## 🔒 Nora "NullX" Steiner — Security Engineer **Verdict: ✅ Approved** No security blockers. This PR is contained within parsing logic and entity model changes with no new attack surface. ### Checked - **Regex safety**: `GEB_PATTERN` uses `(.+)$` anchored at end-of-string — no catastrophic backtracking risk. `PAREN_ANNOTATION` uses `[^)]*` (character class negation) — safe. `UNCERTAIN_NAME` uses `\S+` — bounded. All patterns are pre-compiled as static constants. No ReDoS concern. - **SQL injection**: All queries use parameterized JPQL or native SQL with `@Param` placeholders. The COALESCE additions are pure SQL functions on column references, no user input concatenation. - **XSS via `title` field**: The `title` field stores parser-extracted data (from trusted ODS import). Frontend renders via Svelte text interpolation (`{person.displayName}`, `{m.person_type_INSTITUTION()}`), which auto-escapes HTML. No `{@html}` usage for person display. Safe. - **Mass assignment**: `PersonUpdateDTO` does not include `title` or `personType` fields, so these cannot be set via the person edit API. `displayName` is `@Schema(accessMode = READ_ONLY)`. No over-posting risk. - **CHECK constraint**: `person_type IN ('PERSON', 'INSTITUTION', 'GROUP', 'UNKNOWN')` — defense-in-depth against invalid enum values. `SKIP` correctly excluded. ### Suggestion - When the edit form for `title` is added (future issue), apply `@Size(max = 50)` and `@Pattern` validation to reject HTML/script content. Not needed now since `title` is only set programmatically by the parser.

marcel commented

2026-04-08 13:19:34 +02:00

🎨 Leonie Voss — UI/UX Design Lead

Verdict: ⚠️ Approved with concerns

Concerns

PersonTypeBadge uses Tailwind utility colors, not brand tokens. bg-blue-50, text-blue-700, bg-purple-50, bg-amber-50 are generic Tailwind colors, not from the project's brand palette (brand-navy, brand-mint, brand-sand). For consistency with the design system, these should use the project's semantic color tokens or be explicitly documented as intentional departures. The amber-50 for UNKNOWN is fine (warning signal), but blue-50 and purple-50 have no precedent in the existing color system.
No dark mode support for PersonTypeBadge. The badge uses light-mode-only colors (bg-blue-50, text-blue-700). When dark mode is eventually implemented, these will need dark: variants. Add a TODO or use CSS custom properties.
Badge tap target is below minimum. The badge renders at text-xs px-2 py-0.5 which produces a roughly 20-24px tall element. While badges are not interactive (no click handler), they appear inside clickable person cards. If they ever become interactive (e.g., filter by type), the tap target will be too small. Note for future.
Person list with null firstName — initials handling. persons/+page.svelte:102 uses person.firstName ? person.firstName[0] : person.lastName[0] for the first initial, then person.lastName[0] for the second. For an institution like "Arthur Collignon GmbH", this produces "AA" (first two chars of lastName being "A" + "A"). This is meaningless. The PersonTypeBadge partially addresses this by showing the type, but the avatar circle should consider showing a type icon instead of initials for non-persons.

LGTM

displayName as the single source of truth for name rendering is the correct pattern — eliminates 17+ files of duplicated concatenation logic.
Forward-order display everywhere (removing reversed "Cram, Clara" format) is an intentional UX improvement for consistency.
i18n keys for PersonType values are present in all three languages.

## 🎨 Leonie Voss — UI/UX Design Lead **Verdict: ⚠️ Approved with concerns** ### Concerns 1. **`PersonTypeBadge` uses Tailwind utility colors, not brand tokens.** `bg-blue-50`, `text-blue-700`, `bg-purple-50`, `bg-amber-50` are generic Tailwind colors, not from the project's brand palette (`brand-navy`, `brand-mint`, `brand-sand`). For consistency with the design system, these should use the project's semantic color tokens or be explicitly documented as intentional departures. The amber-50 for UNKNOWN is fine (warning signal), but blue-50 and purple-50 have no precedent in the existing color system. 2. **No dark mode support for PersonTypeBadge.** The badge uses light-mode-only colors (`bg-blue-50`, `text-blue-700`). When dark mode is eventually implemented, these will need `dark:` variants. Add a `TODO` or use CSS custom properties. 3. **Badge tap target is below minimum.** The badge renders at `text-xs px-2 py-0.5` which produces a roughly 20-24px tall element. While badges are not interactive (no click handler), they appear inside clickable person cards. If they ever become interactive (e.g., filter by type), the tap target will be too small. Note for future. 4. **Person list with null firstName — initials handling.** `persons/+page.svelte:102` uses `person.firstName ? person.firstName[0] : person.lastName[0]` for the first initial, then `person.lastName[0]` for the second. For an institution like "Arthur Collignon GmbH", this produces "AA" (first two chars of lastName being "A" + "A"). This is meaningless. The PersonTypeBadge partially addresses this by showing the type, but the avatar circle should consider showing a type icon instead of initials for non-persons. ### LGTM - `displayName` as the single source of truth for name rendering is the correct pattern — eliminates 17+ files of duplicated concatenation logic. - Forward-order display everywhere (removing reversed "Cram, Clara" format) is an intentional UX improvement for consistency. - i18n keys for PersonType values are present in all three languages.

marcel commented

2026-04-08 13:19:51 +02:00

🛠️ Tobias Wendt — DevOps Engineer

Verdict: ✅ Approved

No infrastructure concerns. Clean application-level change.

Checked

Migration V22: Single file, three safe DDL operations (ADD COLUMN, ADD COLUMN with DEFAULT + CHECK, DROP NOT NULL). All are metadata-only on PostgreSQL 16 — instant, no table rewrite. No backfill UPDATE statements (confirmed: dev mode, clean DB before re-import). Clean.
No new dependencies: No new Maven or npm packages added. No Docker changes. No CI configuration changes needed.
Flyway version: V22 follows V21 (person_name_aliases). No collision risk since all 5 issues share one branch and one migration.
API types regenerated: frontend/src/lib/generated/api.ts is updated with the new Person shape. CI lint hook will catch any drift.
No port/service changes: No new containers, no new endpoints, no new environment variables. Docker Compose untouched.

Suggestion

The backend was started manually for API type regeneration (./mvnw spring-boot:run with env vars). Consider documenting this step in CLAUDE.md or adding a CI check that verifies generated types are up to date (npm run generate:api && git diff --exit-code). Not blocking for this PR.

## 🛠️ Tobias Wendt — DevOps Engineer **Verdict: ✅ Approved** No infrastructure concerns. Clean application-level change. ### Checked - **Migration V22**: Single file, three safe DDL operations (ADD COLUMN, ADD COLUMN with DEFAULT + CHECK, DROP NOT NULL). All are metadata-only on PostgreSQL 16 — instant, no table rewrite. No backfill UPDATE statements (confirmed: dev mode, clean DB before re-import). Clean. - **No new dependencies**: No new Maven or npm packages added. No Docker changes. No CI configuration changes needed. - **Flyway version**: V22 follows V21 (person_name_aliases). No collision risk since all 5 issues share one branch and one migration. - **API types regenerated**: `frontend/src/lib/generated/api.ts` is updated with the new Person shape. CI lint hook will catch any drift. - **No port/service changes**: No new containers, no new endpoints, no new environment variables. Docker Compose untouched. ### Suggestion - The backend was started manually for API type regeneration (`./mvnw spring-boot:run` with env vars). Consider documenting this step in CLAUDE.md or adding a CI check that verifies generated types are up to date (`npm run generate:api && git diff --exit-code`). Not blocking for this PR.

marcel added 3 commits 2026-04-08 13:29:56 +02:00

fix(service): add @Nullable to findOrCreateByAlias and filter nulls in caller c0cf8d7952

Add @Nullable annotation to findOrCreateByAlias() return type.
Filter null results (from SKIP classification) in MassImportService
receiver list to prevent null elements in the receivers collection.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

refactor: address PR review concerns ac545ecdaa

- Remove Architekt from WORD_PREFIXES (classifier handles it)
- Use Objects.equals for null-safe firstName/lastName comparison
- Remove unused trimmed variable in PersonTypeClassifier
- Fix containsWord to loop through all occurrences (finds
  "Eltern" in "Nachbareltern Eltern")
- Extract DisplayNameFormatter utility shared by Person and
  PersonSummaryDTO to eliminate display logic duplication

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

test(service): add integration test for findOrCreateByAlias classification

CI / Unit & Component Tests (pull_request) Failing after 2s

Details

CI / Backend Unit Tests (pull_request) Failing after 2s

Details

CI / Unit & Component Tests (push) Failing after 3s

Details

CI / Backend Unit Tests (push) Failing after 1s

Details

5106d277f1

Testcontainers test verifying: SKIP returns null with no DB record,
INSTITUTION/GROUP store full name in lastName with null firstName
and correct personType, PERSON splits name normally.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

marcel commented

2026-04-08 13:30:11 +02:00

Review Concerns Addressed

All blockers and actionable concerns from the 6-persona review have been resolved in 3 commits:

Commit	Addresses
`c0cf8d7`	@Felix/@Markus blocker: `@Nullable` on `findOrCreateByAlias()`, filter nulls in `MassImportService` receiver list
`ac545ec`	@Felix: remove `Architekt` from WORD_PREFIXES (classifier handles it), `Objects.equals` for null safety, remove unused `trimmed` variable. @Markus: extract `DisplayNameFormatter` shared utility. @Sara: fix `containsWord` to loop through all occurrences
`5106d27`	@Sara blocker: Testcontainers integration test for SKIP/INSTITUTION/GROUP/PERSON classification

Deferred (noted, not blocking)

@Leonie: PersonTypeBadge brand token alignment + dark mode — deferred to design follow-up
@Leonie: Avatar initials for non-persons showing type icon — deferred to #211 follow-up
@Markus: Pipeline ordering as explicit function list — accepted as-is, comment documents order
@Markus: Keyword externalization to DB — noted for future, KISS for now
@Tobias: CI check for generated API type drift — noted, separate improvement
@NullX: @Size/@Pattern validation on title edit form — deferred to when edit form is built

## Review Concerns Addressed All blockers and actionable concerns from the 6-persona review have been resolved in 3 commits: | Commit | Addresses | |--------|-----------| | `c0cf8d7` | @Felix/@Markus blocker: `@Nullable` on `findOrCreateByAlias()`, filter nulls in `MassImportService` receiver list | | `ac545ec` | @Felix: remove `Architekt` from WORD_PREFIXES (classifier handles it), `Objects.equals` for null safety, remove unused `trimmed` variable. @Markus: extract `DisplayNameFormatter` shared utility. @Sara: fix `containsWord` to loop through all occurrences | | `5106d27` | @Sara blocker: Testcontainers integration test for SKIP/INSTITUTION/GROUP/PERSON classification | ### Deferred (noted, not blocking) - @Leonie: PersonTypeBadge brand token alignment + dark mode — deferred to design follow-up - @Leonie: Avatar initials for non-persons showing type icon — deferred to #211 follow-up - @Markus: Pipeline ordering as explicit function list — accepted as-is, comment documents order - @Markus: Keyword externalization to DB — noted for future, KISS for now - @Tobias: CI check for generated API type drift — noted, separate improvement - @NullX: `@Size`/`@Pattern` validation on title edit form — deferred to when edit form is built

marcel added 3 commits 2026-04-08 18:39:01 +02:00

refactor(ui): use CSS custom properties for PersonTypeBadge colors a1b21d6989

Replace hardcoded Tailwind utility colors with project CSS variables
(--c-badge-institution-*, --c-badge-group-*, --c-badge-unknown-*).
Dark mode variants defined in both @media and manual toggle blocks.
Extract shared badge classes and use $derived config object.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(ui): show type icon in avatar for non-person entities 166f60f7d3

Person list and detail page avatars now display a type-specific
icon (building, people group, question mark) instead of meaningless
initials for INSTITUTION, GROUP, and UNKNOWN person types.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(model): add title field to PersonUpdateDTO with @Size validation

CI / Unit & Component Tests (pull_request) Failing after 3s

Details

CI / Backend Unit Tests (pull_request) Failing after 2s

Details

CI / Unit & Component Tests (push) Failing after 3s

Details

CI / Backend Unit Tests (push) Failing after 1s

Details

c34db997fa

Add title to PersonUpdateDTO with @Size(max=50) constraint.
PersonService.createPerson and updatePerson now handle the title
field with blank-to-null normalization.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

marcel commented

2026-04-08 18:39:13 +02:00

All deferred concerns resolved

3 additional commits addressing the remaining review concerns:

Commit	Addresses
`a1b21d6`	@Leonie: PersonTypeBadge now uses CSS custom properties with dark mode variants (`--c-badge-institution-`, `--c-badge-group-`, `--c-badge-unknown-*`)
`166f60f`	@Leonie: Avatar circles show type-specific icons (building, people, question mark) instead of meaningless initials for non-PERSON types
`c34db99`	@NullX: `title` field added to PersonUpdateDTO with `@Size(max=50)` validation. Service handles title in create/update paths

No remaining deferred items.

## All deferred concerns resolved 3 additional commits addressing the remaining review concerns: | Commit | Addresses | |--------|-----------| | `a1b21d6` | @Leonie: PersonTypeBadge now uses CSS custom properties with dark mode variants (`--c-badge-institution-*`, `--c-badge-group-*`, `--c-badge-unknown-*`) | | `166f60f` | @Leonie: Avatar circles show type-specific icons (building, people, question mark) instead of meaningless initials for non-PERSON types | | `c34db99` | @NullX: `title` field added to PersonUpdateDTO with `@Size(max=50)` validation. Service handles title in create/update paths | No remaining deferred items.

marcel merged commit c34db997fa into main

2026-04-08 18:48:00 +02:00

marcel deleted branch feat/issues-209-213-person-parser-enhancements

2026-04-08 18:48:02 +02:00

Sign in to join this conversation.

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: marcel/familienarchiv#215