feat: PersonNameParser enhancements and Person model refactor (#209-#213) #215

Merged
marcel merged 25 commits from feat/issues-209-213-person-parser-enhancements into main 2026-04-08 18:48:00 +02:00
Owner

Summary

Implements the full PersonNameParser enhancement suite across 5 issues:

  • #213 — Preparatory infrastructure: SplitName record (5 fields), split() pipeline extraction, PersonType enum, Person entity changes (title, personType, nullable firstName, displayName), V22 migration, frontend displayName migration across 32 files, i18n keys
  • #209 — Widen GEB_PATTERN for maiden name extraction: optional comma/dot, multi-word maiden names, MAIDEN_NAME alias creation in PersonService
  • #210 — Strip parenthesized annotations: birth years, nicknames, uncertainty markers with special (Name ?) handling, parseReceivers() fix for single-person inputs
  • #211 — PersonTypeClassifier with keyword heuristics (SKIP/INSTITUTION/GROUP), integration into findOrCreateByAlias(), PersonTypeBadge component
  • #212 — Title/salutation stripping: two-pass approach (dot-prefix + word-prefix), stacked titles, 5 new von last names in KNOWN_LAST_NAMES

Key changes

Area Changes
Backend model Person gains title, personType, displayName; firstName nullable; PersonType enum (5 values); MAIDEN_NAME alias type
Migration V22: title column, person_type with CHECK constraint, nullable first_name
Parser 5-step pipeline: stripMaidenName → normalizeDotCompressed → stripAnnotation → stripTitle → splitByKnownLastNameOrFallback
Classifier PersonTypeClassifier with position-aware keyword matching for 17 keywords
Search COALESCE fix for null firstName in 6 query locations
Frontend displayName replaces all firstName+lastName concatenation; PersonTypeBadge; regenerated API types
i18n PersonType labels + MAIDEN_NAME in de/en/es

19 commits, 50+ files changed

Test plan

  • Backend: 722 tests passing (40+ new tests for parser, classifier, service, repository)
  • Frontend: 646 tests passing, 4 pre-existing failures (unrelated)
  • Visual verification of PersonTypeBadge on person list/detail
  • Import verification with ODS containing titled/institutional entries

Closes #209, #210, #211, #212, #213

## Summary Implements the full PersonNameParser enhancement suite across 5 issues: - **#213** — Preparatory infrastructure: SplitName record (5 fields), split() pipeline extraction, PersonType enum, Person entity changes (title, personType, nullable firstName, displayName), V22 migration, frontend displayName migration across 32 files, i18n keys - **#209** — Widen GEB_PATTERN for maiden name extraction: optional comma/dot, multi-word maiden names, MAIDEN_NAME alias creation in PersonService - **#210** — Strip parenthesized annotations: birth years, nicknames, uncertainty markers with special (Name ?) handling, parseReceivers() fix for single-person inputs - **#211** — PersonTypeClassifier with keyword heuristics (SKIP/INSTITUTION/GROUP), integration into findOrCreateByAlias(), PersonTypeBadge component - **#212** — Title/salutation stripping: two-pass approach (dot-prefix + word-prefix), stacked titles, 5 new von last names in KNOWN_LAST_NAMES ### Key changes | Area | Changes | |------|---------| | Backend model | Person gains title, personType, displayName; firstName nullable; PersonType enum (5 values); MAIDEN_NAME alias type | | Migration | V22: title column, person_type with CHECK constraint, nullable first_name | | Parser | 5-step pipeline: stripMaidenName → normalizeDotCompressed → stripAnnotation → stripTitle → splitByKnownLastNameOrFallback | | Classifier | PersonTypeClassifier with position-aware keyword matching for 17 keywords | | Search | COALESCE fix for null firstName in 6 query locations | | Frontend | displayName replaces all firstName+lastName concatenation; PersonTypeBadge; regenerated API types | | i18n | PersonType labels + MAIDEN_NAME in de/en/es | ### 19 commits, 50+ files changed ## Test plan - [x] Backend: 722 tests passing (40+ new tests for parser, classifier, service, repository) - [x] Frontend: 646 tests passing, 4 pre-existing failures (unrelated) - [ ] Visual verification of PersonTypeBadge on person list/detail - [ ] Import verification with ODS containing titled/institutional entries Closes #209, #210, #211, #212, #213
marcel added 22 commits 2026-04-08 13:16:38 +02:00
Pre-splits input on "//" before existing logic so each segment is
processed independently through the full pipeline (und/u splitting,
last-name distribution, etc.).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Inserts spaces after dots when the cleaned name has no spaces but
contains dots, so the existing last-space fallback handles names
like "E.Rockstroh" and "Dr.Fr.Zarncke" correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
test(parser): add regression and cross-feature interaction tests
Some checks failed
CI / Unit & Component Tests (pull_request) Failing after 3s
CI / Backend Unit Tests (pull_request) Failing after 2s
CI / Unit & Component Tests (push) Failing after 4s
CI / Backend Unit Tests (push) Failing after 3s
d6e74972eb
Regression test confirms already-spaced dot names are not double-spaced.
Interaction test confirms // separator works with dot-compressed names.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add title, maidenName, and annotation fields (all nullable) to
SplitName. All existing call sites pass null for new fields. Test
assertions updated to document the null-by-default contract.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extract stripMaidenName, normalizeDotCompressed, stripAnnotation,
stripTitle, and splitByKnownLastNameOrFallback as individually
testable pipeline steps. Each extraction method is a pass-through
until its feature issue fills in the logic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PersonType has 5 values: PERSON, INSTITUTION, GROUP, UNKNOWN, SKIP.
SKIP is intentionally excluded from the DB CHECK constraint (added
in migration) as defense-in-depth. MAIDEN_NAME added to
PersonNameAliasType for #209.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add title (nullable VARCHAR) and personType (enum, default PERSON)
- Make firstName nullable for non-person entities
- Add @Transient getDisplayName() as single source of truth for
  name display, exposed via @Schema(READ_ONLY, REQUIRED)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add title VARCHAR(50) column
- Add person_type VARCHAR(20) NOT NULL DEFAULT 'PERSON' with CHECK
  constraint (PERSON, INSTITUTION, GROUP, UNKNOWN — SKIP excluded)
- Drop NOT NULL on first_name for non-person entities

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use COALESCE to convert null firstName to empty string in:
- PersonRepository.searchByName (JPQL)
- PersonRepository.searchWithDocumentCount (native SQL)
- PersonRepository.findCorrespondentsWithFilter (native SQL)
- DocumentSpecifications.hasText (Criteria API, sender + receiver)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Person type now includes displayName (readonly, required), title,
personType (required enum), and firstName is optional.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add displayName default method to PersonSummaryDTO
- Update native SQL queries to include title, person_type columns
- Add getInitials() utility to personFormat.ts
- Update abbreviateName/abbreviateCompact for nullable firstName
- Replace firstName+lastName concatenation with displayName in all
  person-displaying components and server load files
- Regenerate API types with displayName on Person and PersonSummaryDTO

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add translations for PersonType values (PERSON, INSTITUTION, GROUP,
UNKNOWN) and PersonNameAliasType.MAIDEN_NAME in de/en/es.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add displayName and personType to all Person mock objects in
component and page tests. Update assertions from reversed
"lastName, firstName" format to forward-order displayName.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Widen pattern from `\s+geb\.\s+\S+` to `,?\s*geb\.?\s+(.+)$` to
handle: optional comma, optional dot, multi-word maiden names.
stripMaidenName() now captures the maiden name instead of discarding
it. Handles all 5 input variants from the ODS data.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Verify comma-prefix, no-dot, and multi-word maiden name variants
are correctly stripped in parseReceivers().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When split() returns a non-null maidenName, PersonService now
creates a PersonNameAlias with type MAIDEN_NAME. The maiden name
is stored as lastName on the alias (no firstName).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extract trailing (...) content as annotation. Handles birth years
(*1871), nicknames (Tuttu), uncertainty markers (?), and uncertain
names (Quast ?) where the name part is extracted back into the
cleaned result. Uses [^)]* regex to prevent ReDoS.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Move paren extraction in parseReceivers() after the multi-separator
check so single-person entries like "Clara de Gruyter(*1871)" keep
their parens intact for split()'s annotation extraction. Multi-person
entries like "Hedi und Tutu (Gruber)" still use parens as shared
last-name override.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Static classify() method uses position-aware keyword matching:
- SKIP: Briefumschlag, Kondolenzbriefe, Hochzeitsgedicht (start)
- INSTITUTION: Firma, Architekt (start), GmbH, Co (end)
- GROUP: Familie, Comité, Comite, Geschwister, Gesellschafter,
  Garde, Mitarbeiter (start), Eltern, Kinder,
  Schwiegereltern (word boundary)
- PERSON: default for all other inputs

Case-insensitive. 25 parameterized test cases.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Classify raw name before processing. SKIP returns null (no Person
created). INSTITUTION/GROUP skip split() and store full name in
lastName with firstName=null and appropriate personType.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Show colored badge for non-PERSON types per design spec:
- INSTITUTION: blue with building icon
- GROUP: purple with people icon
- UNKNOWN: amber with question mark icon
- PERSON: no badge (unmarked default)

Badge appears on person cards in list and on detail page.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
feat(parser): implement stripTitle for known prefixes
Some checks failed
CI / Unit & Component Tests (push) Failing after 3s
CI / Backend Unit Tests (push) Failing after 1s
CI / Unit & Component Tests (pull_request) Failing after 2s
CI / Backend Unit Tests (pull_request) Failing after 1s
73640ef5b6
Two-pass title stripping with loop for stacked titles:
- Dot-prefixes (Dr., Prof.) matched without trailing space
- Word-prefixes (Tante, Frau, Schwester, etc.) matched at
  word boundary
- Stacked titles like "Prof. Dr. Muller" handled correctly
- Single token after title strip goes to lastName (not firstName)

Add 5 "von" last names to KNOWN_LAST_NAMES for correct splitting
of entries like "Freifrau von Massenbach".

15 new test cases + updated 3 existing tests for title behavior.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Author
Owner

👨‍💻 Felix Brandt — Senior Fullstack Developer

Verdict: ⚠️ Approved with concerns

Blockers

  1. PersonService.findOrCreateByAlias() returns null for SKIP — caller NPE risk. MassImportService calls findOrCreateByAlias() and likely sets the result as document.setSender(person). If person is null due to SKIP classification, and the caller doesn't check, this is an NPE or a constraint violation. Every caller of findOrCreateByAlias() must be audited for null-safety. This is a behavioral contract change that needs a @Nullable annotation and caller updates.

  2. Architekt appears in both PersonTypeClassifier (INSTITUTION) and PersonNameParser.WORD_PREFIXES (title). If classification runs before split() in findOrCreateByAlias(), the classifier catches Architekt X as INSTITUTION and split() never runs — so the WORD_PREFIXES entry is dead code for the import path. But if split() is ever called directly (e.g., from a future Von-column handler), it would strip Architekt as a title. This dual presence is confusing — document which takes precedence or remove the redundancy.

Suggestions

  • PersonNameParser.java:162firstName.equals(lastName) can NPE if firstName is null. This currently can't happen because splitByKnownLastNameOrFallback always sets firstName, but it's fragile. Consider Objects.equals(firstName, lastName).
  • PersonTypeClassifier.java:27trimmed is assigned but never used (only lower is used). Remove the unused variable.
  • The PersonSummaryDTO.getDisplayName() default method duplicates the logic in Person.getDisplayName(). Consider extracting a shared static utility to avoid drift between the two implementations.
  • PersonTypeBadge.svelte — the badge classes are duplicated across three branches. Extract the shared classes into a base variable and only vary color classes per type.
## 👨‍💻 Felix Brandt — Senior Fullstack Developer **Verdict: ⚠️ Approved with concerns** ### Blockers 1. **`PersonService.findOrCreateByAlias()` returns `null` for SKIP — caller NPE risk.** `MassImportService` calls `findOrCreateByAlias()` and likely sets the result as `document.setSender(person)`. If `person` is null due to SKIP classification, and the caller doesn't check, this is an NPE or a constraint violation. Every caller of `findOrCreateByAlias()` must be audited for null-safety. This is a behavioral contract change that needs a `@Nullable` annotation and caller updates. 2. **`Architekt` appears in both `PersonTypeClassifier` (INSTITUTION) and `PersonNameParser.WORD_PREFIXES` (title).** If classification runs before `split()` in `findOrCreateByAlias()`, the classifier catches `Architekt X` as INSTITUTION and `split()` never runs — so the WORD_PREFIXES entry is dead code for the import path. But if `split()` is ever called directly (e.g., from a future Von-column handler), it would strip `Architekt` as a title. This dual presence is confusing — document which takes precedence or remove the redundancy. ### Suggestions - `PersonNameParser.java:162` — `firstName.equals(lastName)` can NPE if `firstName` is null. This currently can't happen because `splitByKnownLastNameOrFallback` always sets firstName, but it's fragile. Consider `Objects.equals(firstName, lastName)`. - `PersonTypeClassifier.java:27` — `trimmed` is assigned but never used (only `lower` is used). Remove the unused variable. - The `PersonSummaryDTO.getDisplayName()` default method duplicates the logic in `Person.getDisplayName()`. Consider extracting a shared static utility to avoid drift between the two implementations. - `PersonTypeBadge.svelte` — the badge classes are duplicated across three branches. Extract the shared classes into a base variable and only vary color classes per type.
Author
Owner

🏗️ Markus Keller — Application Architect

Verdict: ⚠️ Approved with concerns

Blockers

  1. findOrCreateByAlias() returning null breaks the method contract. This method previously always returned a Person. Returning null for SKIP changes the contract silently — no @Nullable annotation, no Javadoc update. Every caller (MassImportService, any future caller) must null-check. The safer pattern: return an Optional<Person> or throw a DomainException with a specific error code. Returning null from a method named findOrCreate violates the principle of least surprise.

Concerns

  • PersonSummaryDTO.getDisplayName() duplicates Person.getDisplayName(). Two independent implementations of the same business logic is a maintenance risk. When someone adds a new display rule (e.g., PersonType-specific formatting), they must update both. Extract to a static utility: PersonDisplayNameUtil.format(title, firstName, lastName) called by both.

  • PersonTypeClassifier is a separate class but tightly coupled to PersonService. The classifier determines behavior in the service (skip split, set personType). This is fine architecturally — the classifier is a pure function, the service orchestrates. But the keyword lists are hardcoded. If new institutions appear in future imports, the code must be changed and redeployed. Consider whether this is acceptable long-term or if the keywords should be stored in a database table. For now: acceptable, flag for future consideration.

  • Pipeline ordering is implicit. The 5-step pipeline in split() is well-structured, but the ordering contract (maiden → dot-norm → annotation → title → split) is only documented in a comment. If someone reorders the steps, subtle bugs emerge. Consider making the pipeline an explicit list of Function<String, String> steps — overkill for now but worth noting.

LGTM

  • Layer boundaries are clean: parser is pure, classifier is pure, service orchestrates.
  • PersonType.SKIP never persisted, CHECK constraint excludes it — defense-in-depth is correct.
  • The displayName as @Transient with @Schema(READ_ONLY) is the right pattern.
  • COALESCE fix for null firstName in search queries is thorough (6 locations).
## 🏗️ Markus Keller — Application Architect **Verdict: ⚠️ Approved with concerns** ### Blockers 1. **`findOrCreateByAlias()` returning null breaks the method contract.** This method previously always returned a `Person`. Returning null for SKIP changes the contract silently — no `@Nullable` annotation, no Javadoc update. Every caller (`MassImportService`, any future caller) must null-check. The safer pattern: return an `Optional<Person>` or throw a `DomainException` with a specific error code. Returning null from a method named `findOrCreate` violates the principle of least surprise. ### Concerns - **`PersonSummaryDTO.getDisplayName()` duplicates `Person.getDisplayName()`.** Two independent implementations of the same business logic is a maintenance risk. When someone adds a new display rule (e.g., PersonType-specific formatting), they must update both. Extract to a static utility: `PersonDisplayNameUtil.format(title, firstName, lastName)` called by both. - **`PersonTypeClassifier` is a separate class but tightly coupled to `PersonService`.** The classifier determines behavior in the service (skip split, set personType). This is fine architecturally — the classifier is a pure function, the service orchestrates. But the keyword lists are hardcoded. If new institutions appear in future imports, the code must be changed and redeployed. Consider whether this is acceptable long-term or if the keywords should be stored in a database table. For now: acceptable, flag for future consideration. - **Pipeline ordering is implicit.** The 5-step pipeline in `split()` is well-structured, but the ordering contract (maiden → dot-norm → annotation → title → split) is only documented in a comment. If someone reorders the steps, subtle bugs emerge. Consider making the pipeline an explicit list of `Function<String, String>` steps — overkill for now but worth noting. ### LGTM - Layer boundaries are clean: parser is pure, classifier is pure, service orchestrates. - `PersonType.SKIP` never persisted, CHECK constraint excludes it — defense-in-depth is correct. - The `displayName` as `@Transient` with `@Schema(READ_ONLY)` is the right pattern. - COALESCE fix for null firstName in search queries is thorough (6 locations).
Author
Owner

🧪 Sara Holt — QA Engineer

Verdict: ⚠️ Approved with concerns

Blockers

  1. No integration test for findOrCreateByAlias() with SKIP/INSTITUTION/GROUP. The service tests mock the repository, which is correct for unit tests. But the behavioral change (returning null for SKIP, setting personType for non-persons) needs a Testcontainers integration test that verifies: (a) SKIP produces no Person record in the DB, (b) INSTITUTION/GROUP records have null firstName and correct personType, (c) the CHECK constraint rejects SKIP if someone tries to persist it.

Concerns

  • PersonSummaryDTO lacks displayName in the test assertions. The PersonRepositoryTest tests for searchWithDocumentCount and findAllWithDocumentCount don't verify that displayName is correctly computed by the default method. Add an assertion: assertThat(result.get(0).getDisplayName()).isEqualTo("Hans Müller").

  • Classifier edge case: containsWord only finds the first occurrence. If a name contains Eltern as a substring in one position (not word boundary) and as a word in another, only the first indexOf result is checked. Example: "Nachbareltern Eltern"indexOf("eltern") returns 8 (inside "Nachbareltern"), which fails the word boundary check, so the method returns false even though a standalone "Eltern" exists at position 16. Use a loop or regex \bEltern\b instead.

  • No test for stacked titles in the classifier. What happens if someone imports Prof. Dr. Firma Auschrath? The classifier checks INSTITUTION before the parser strips titles. Prof. does not start with Firma, so it falls through to PERSON. But Firma Auschrath after title strip would be INSTITUTION. The classifier-before-parser ordering means this is a miss. Edge case, but worth a test to document the behavior.

LGTM

  • 72 parser tests + 25 classifier tests + service tests = excellent unit coverage.
  • The parameterized test approach for the classifier is clean and extensible.
  • Test data from the ODS input/output tables is used directly — real-data-driven testing.
  • Pre-existing frontend test failures (4) are confirmed unrelated by stash-and-run verification.
## 🧪 Sara Holt — QA Engineer **Verdict: ⚠️ Approved with concerns** ### Blockers 1. **No integration test for `findOrCreateByAlias()` with SKIP/INSTITUTION/GROUP.** The service tests mock the repository, which is correct for unit tests. But the behavioral change (returning null for SKIP, setting personType for non-persons) needs a Testcontainers integration test that verifies: (a) SKIP produces no Person record in the DB, (b) INSTITUTION/GROUP records have null firstName and correct personType, (c) the CHECK constraint rejects SKIP if someone tries to persist it. ### Concerns - **`PersonSummaryDTO` lacks `displayName` in the test assertions.** The `PersonRepositoryTest` tests for `searchWithDocumentCount` and `findAllWithDocumentCount` don't verify that `displayName` is correctly computed by the default method. Add an assertion: `assertThat(result.get(0).getDisplayName()).isEqualTo("Hans Müller")`. - **Classifier edge case: `containsWord` only finds the first occurrence.** If a name contains `Eltern` as a substring in one position (not word boundary) and as a word in another, only the first `indexOf` result is checked. Example: `"Nachbareltern Eltern"` — `indexOf("eltern")` returns 8 (inside "Nachbareltern"), which fails the word boundary check, so the method returns false even though a standalone "Eltern" exists at position 16. Use a loop or regex `\bEltern\b` instead. - **No test for stacked titles in the classifier.** What happens if someone imports `Prof. Dr. Firma Auschrath`? The classifier checks INSTITUTION before the parser strips titles. `Prof.` does not start with `Firma`, so it falls through to PERSON. But `Firma Auschrath` after title strip would be INSTITUTION. The classifier-before-parser ordering means this is a miss. Edge case, but worth a test to document the behavior. ### LGTM - 72 parser tests + 25 classifier tests + service tests = excellent unit coverage. - The parameterized test approach for the classifier is clean and extensible. - Test data from the ODS input/output tables is used directly — real-data-driven testing. - Pre-existing frontend test failures (4) are confirmed unrelated by stash-and-run verification.
Author
Owner

🔒 Nora "NullX" Steiner — Security Engineer

Verdict: Approved

No security blockers. This PR is contained within parsing logic and entity model changes with no new attack surface.

Checked

  • Regex safety: GEB_PATTERN uses (.+)$ anchored at end-of-string — no catastrophic backtracking risk. PAREN_ANNOTATION uses [^)]* (character class negation) — safe. UNCERTAIN_NAME uses \S+ — bounded. All patterns are pre-compiled as static constants. No ReDoS concern.
  • SQL injection: All queries use parameterized JPQL or native SQL with @Param placeholders. The COALESCE additions are pure SQL functions on column references, no user input concatenation.
  • XSS via title field: The title field stores parser-extracted data (from trusted ODS import). Frontend renders via Svelte text interpolation ({person.displayName}, {m.person_type_INSTITUTION()}), which auto-escapes HTML. No {@html} usage for person display. Safe.
  • Mass assignment: PersonUpdateDTO does not include title or personType fields, so these cannot be set via the person edit API. displayName is @Schema(accessMode = READ_ONLY). No over-posting risk.
  • CHECK constraint: person_type IN ('PERSON', 'INSTITUTION', 'GROUP', 'UNKNOWN') — defense-in-depth against invalid enum values. SKIP correctly excluded.

Suggestion

  • When the edit form for title is added (future issue), apply @Size(max = 50) and @Pattern validation to reject HTML/script content. Not needed now since title is only set programmatically by the parser.
## 🔒 Nora "NullX" Steiner — Security Engineer **Verdict: ✅ Approved** No security blockers. This PR is contained within parsing logic and entity model changes with no new attack surface. ### Checked - **Regex safety**: `GEB_PATTERN` uses `(.+)$` anchored at end-of-string — no catastrophic backtracking risk. `PAREN_ANNOTATION` uses `[^)]*` (character class negation) — safe. `UNCERTAIN_NAME` uses `\S+` — bounded. All patterns are pre-compiled as static constants. No ReDoS concern. - **SQL injection**: All queries use parameterized JPQL or native SQL with `@Param` placeholders. The COALESCE additions are pure SQL functions on column references, no user input concatenation. - **XSS via `title` field**: The `title` field stores parser-extracted data (from trusted ODS import). Frontend renders via Svelte text interpolation (`{person.displayName}`, `{m.person_type_INSTITUTION()}`), which auto-escapes HTML. No `{@html}` usage for person display. Safe. - **Mass assignment**: `PersonUpdateDTO` does not include `title` or `personType` fields, so these cannot be set via the person edit API. `displayName` is `@Schema(accessMode = READ_ONLY)`. No over-posting risk. - **CHECK constraint**: `person_type IN ('PERSON', 'INSTITUTION', 'GROUP', 'UNKNOWN')` — defense-in-depth against invalid enum values. `SKIP` correctly excluded. ### Suggestion - When the edit form for `title` is added (future issue), apply `@Size(max = 50)` and `@Pattern` validation to reject HTML/script content. Not needed now since `title` is only set programmatically by the parser.
Author
Owner

🎨 Leonie Voss — UI/UX Design Lead

Verdict: ⚠️ Approved with concerns

Concerns

  1. PersonTypeBadge uses Tailwind utility colors, not brand tokens. bg-blue-50, text-blue-700, bg-purple-50, bg-amber-50 are generic Tailwind colors, not from the project's brand palette (brand-navy, brand-mint, brand-sand). For consistency with the design system, these should use the project's semantic color tokens or be explicitly documented as intentional departures. The amber-50 for UNKNOWN is fine (warning signal), but blue-50 and purple-50 have no precedent in the existing color system.

  2. No dark mode support for PersonTypeBadge. The badge uses light-mode-only colors (bg-blue-50, text-blue-700). When dark mode is eventually implemented, these will need dark: variants. Add a TODO or use CSS custom properties.

  3. Badge tap target is below minimum. The badge renders at text-xs px-2 py-0.5 which produces a roughly 20-24px tall element. While badges are not interactive (no click handler), they appear inside clickable person cards. If they ever become interactive (e.g., filter by type), the tap target will be too small. Note for future.

  4. Person list with null firstName — initials handling. persons/+page.svelte:102 uses person.firstName ? person.firstName[0] : person.lastName[0] for the first initial, then person.lastName[0] for the second. For an institution like "Arthur Collignon GmbH", this produces "AA" (first two chars of lastName being "A" + "A"). This is meaningless. The PersonTypeBadge partially addresses this by showing the type, but the avatar circle should consider showing a type icon instead of initials for non-persons.

LGTM

  • displayName as the single source of truth for name rendering is the correct pattern — eliminates 17+ files of duplicated concatenation logic.
  • Forward-order display everywhere (removing reversed "Cram, Clara" format) is an intentional UX improvement for consistency.
  • i18n keys for PersonType values are present in all three languages.
## 🎨 Leonie Voss — UI/UX Design Lead **Verdict: ⚠️ Approved with concerns** ### Concerns 1. **`PersonTypeBadge` uses Tailwind utility colors, not brand tokens.** `bg-blue-50`, `text-blue-700`, `bg-purple-50`, `bg-amber-50` are generic Tailwind colors, not from the project's brand palette (`brand-navy`, `brand-mint`, `brand-sand`). For consistency with the design system, these should use the project's semantic color tokens or be explicitly documented as intentional departures. The amber-50 for UNKNOWN is fine (warning signal), but blue-50 and purple-50 have no precedent in the existing color system. 2. **No dark mode support for PersonTypeBadge.** The badge uses light-mode-only colors (`bg-blue-50`, `text-blue-700`). When dark mode is eventually implemented, these will need `dark:` variants. Add a `TODO` or use CSS custom properties. 3. **Badge tap target is below minimum.** The badge renders at `text-xs px-2 py-0.5` which produces a roughly 20-24px tall element. While badges are not interactive (no click handler), they appear inside clickable person cards. If they ever become interactive (e.g., filter by type), the tap target will be too small. Note for future. 4. **Person list with null firstName — initials handling.** `persons/+page.svelte:102` uses `person.firstName ? person.firstName[0] : person.lastName[0]` for the first initial, then `person.lastName[0]` for the second. For an institution like "Arthur Collignon GmbH", this produces "AA" (first two chars of lastName being "A" + "A"). This is meaningless. The PersonTypeBadge partially addresses this by showing the type, but the avatar circle should consider showing a type icon instead of initials for non-persons. ### LGTM - `displayName` as the single source of truth for name rendering is the correct pattern — eliminates 17+ files of duplicated concatenation logic. - Forward-order display everywhere (removing reversed "Cram, Clara" format) is an intentional UX improvement for consistency. - i18n keys for PersonType values are present in all three languages.
Author
Owner

🛠️ Tobias Wendt — DevOps Engineer

Verdict: Approved

No infrastructure concerns. Clean application-level change.

Checked

  • Migration V22: Single file, three safe DDL operations (ADD COLUMN, ADD COLUMN with DEFAULT + CHECK, DROP NOT NULL). All are metadata-only on PostgreSQL 16 — instant, no table rewrite. No backfill UPDATE statements (confirmed: dev mode, clean DB before re-import). Clean.
  • No new dependencies: No new Maven or npm packages added. No Docker changes. No CI configuration changes needed.
  • Flyway version: V22 follows V21 (person_name_aliases). No collision risk since all 5 issues share one branch and one migration.
  • API types regenerated: frontend/src/lib/generated/api.ts is updated with the new Person shape. CI lint hook will catch any drift.
  • No port/service changes: No new containers, no new endpoints, no new environment variables. Docker Compose untouched.

Suggestion

  • The backend was started manually for API type regeneration (./mvnw spring-boot:run with env vars). Consider documenting this step in CLAUDE.md or adding a CI check that verifies generated types are up to date (npm run generate:api && git diff --exit-code). Not blocking for this PR.
## 🛠️ Tobias Wendt — DevOps Engineer **Verdict: ✅ Approved** No infrastructure concerns. Clean application-level change. ### Checked - **Migration V22**: Single file, three safe DDL operations (ADD COLUMN, ADD COLUMN with DEFAULT + CHECK, DROP NOT NULL). All are metadata-only on PostgreSQL 16 — instant, no table rewrite. No backfill UPDATE statements (confirmed: dev mode, clean DB before re-import). Clean. - **No new dependencies**: No new Maven or npm packages added. No Docker changes. No CI configuration changes needed. - **Flyway version**: V22 follows V21 (person_name_aliases). No collision risk since all 5 issues share one branch and one migration. - **API types regenerated**: `frontend/src/lib/generated/api.ts` is updated with the new Person shape. CI lint hook will catch any drift. - **No port/service changes**: No new containers, no new endpoints, no new environment variables. Docker Compose untouched. ### Suggestion - The backend was started manually for API type regeneration (`./mvnw spring-boot:run` with env vars). Consider documenting this step in CLAUDE.md or adding a CI check that verifies generated types are up to date (`npm run generate:api && git diff --exit-code`). Not blocking for this PR.
marcel added 3 commits 2026-04-08 13:29:56 +02:00
Add @Nullable annotation to findOrCreateByAlias() return type.
Filter null results (from SKIP classification) in MassImportService
receiver list to prevent null elements in the receivers collection.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove Architekt from WORD_PREFIXES (classifier handles it)
- Use Objects.equals for null-safe firstName/lastName comparison
- Remove unused trimmed variable in PersonTypeClassifier
- Fix containsWord to loop through all occurrences (finds
  "Eltern" in "Nachbareltern Eltern")
- Extract DisplayNameFormatter utility shared by Person and
  PersonSummaryDTO to eliminate display logic duplication

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
test(service): add integration test for findOrCreateByAlias classification
Some checks failed
CI / Unit & Component Tests (pull_request) Failing after 2s
CI / Backend Unit Tests (pull_request) Failing after 2s
CI / Unit & Component Tests (push) Failing after 3s
CI / Backend Unit Tests (push) Failing after 1s
5106d277f1
Testcontainers test verifying: SKIP returns null with no DB record,
INSTITUTION/GROUP store full name in lastName with null firstName
and correct personType, PERSON splits name normally.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Author
Owner

Review Concerns Addressed

All blockers and actionable concerns from the 6-persona review have been resolved in 3 commits:

Commit Addresses
c0cf8d7 @Felix/@Markus blocker: @Nullable on findOrCreateByAlias(), filter nulls in MassImportService receiver list
ac545ec @Felix: remove Architekt from WORD_PREFIXES (classifier handles it), Objects.equals for null safety, remove unused trimmed variable. @Markus: extract DisplayNameFormatter shared utility. @Sara: fix containsWord to loop through all occurrences
5106d27 @Sara blocker: Testcontainers integration test for SKIP/INSTITUTION/GROUP/PERSON classification

Deferred (noted, not blocking)

  • @Leonie: PersonTypeBadge brand token alignment + dark mode — deferred to design follow-up
  • @Leonie: Avatar initials for non-persons showing type icon — deferred to #211 follow-up
  • @Markus: Pipeline ordering as explicit function list — accepted as-is, comment documents order
  • @Markus: Keyword externalization to DB — noted for future, KISS for now
  • @Tobias: CI check for generated API type drift — noted, separate improvement
  • @NullX: @Size/@Pattern validation on title edit form — deferred to when edit form is built
## Review Concerns Addressed All blockers and actionable concerns from the 6-persona review have been resolved in 3 commits: | Commit | Addresses | |--------|-----------| | `c0cf8d7` | @Felix/@Markus blocker: `@Nullable` on `findOrCreateByAlias()`, filter nulls in `MassImportService` receiver list | | `ac545ec` | @Felix: remove `Architekt` from WORD_PREFIXES (classifier handles it), `Objects.equals` for null safety, remove unused `trimmed` variable. @Markus: extract `DisplayNameFormatter` shared utility. @Sara: fix `containsWord` to loop through all occurrences | | `5106d27` | @Sara blocker: Testcontainers integration test for SKIP/INSTITUTION/GROUP/PERSON classification | ### Deferred (noted, not blocking) - @Leonie: PersonTypeBadge brand token alignment + dark mode — deferred to design follow-up - @Leonie: Avatar initials for non-persons showing type icon — deferred to #211 follow-up - @Markus: Pipeline ordering as explicit function list — accepted as-is, comment documents order - @Markus: Keyword externalization to DB — noted for future, KISS for now - @Tobias: CI check for generated API type drift — noted, separate improvement - @NullX: `@Size`/`@Pattern` validation on title edit form — deferred to when edit form is built
marcel added 3 commits 2026-04-08 18:39:01 +02:00
Replace hardcoded Tailwind utility colors with project CSS variables
(--c-badge-institution-*, --c-badge-group-*, --c-badge-unknown-*).
Dark mode variants defined in both @media and manual toggle blocks.
Extract shared badge classes and use $derived config object.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Person list and detail page avatars now display a type-specific
icon (building, people group, question mark) instead of meaningless
initials for INSTITUTION, GROUP, and UNKNOWN person types.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
feat(model): add title field to PersonUpdateDTO with @Size validation
Some checks failed
CI / Unit & Component Tests (pull_request) Failing after 3s
CI / Backend Unit Tests (pull_request) Failing after 2s
CI / Unit & Component Tests (push) Failing after 3s
CI / Backend Unit Tests (push) Failing after 1s
c34db997fa
Add title to PersonUpdateDTO with @Size(max=50) constraint.
PersonService.createPerson and updatePerson now handle the title
field with blank-to-null normalization.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Author
Owner

All deferred concerns resolved

3 additional commits addressing the remaining review concerns:

Commit Addresses
a1b21d6 @Leonie: PersonTypeBadge now uses CSS custom properties with dark mode variants (--c-badge-institution-*, --c-badge-group-*, --c-badge-unknown-*)
166f60f @Leonie: Avatar circles show type-specific icons (building, people, question mark) instead of meaningless initials for non-PERSON types
c34db99 @NullX: title field added to PersonUpdateDTO with @Size(max=50) validation. Service handles title in create/update paths

No remaining deferred items.

## All deferred concerns resolved 3 additional commits addressing the remaining review concerns: | Commit | Addresses | |--------|-----------| | `a1b21d6` | @Leonie: PersonTypeBadge now uses CSS custom properties with dark mode variants (`--c-badge-institution-*`, `--c-badge-group-*`, `--c-badge-unknown-*`) | | `166f60f` | @Leonie: Avatar circles show type-specific icons (building, people, question mark) instead of meaningless initials for non-PERSON types | | `c34db99` | @NullX: `title` field added to PersonUpdateDTO with `@Size(max=50)` validation. Service handles title in create/update paths | No remaining deferred items.
marcel merged commit c34db997fa into main 2026-04-08 18:48:00 +02:00
marcel deleted branch feat/issues-209-213-person-parser-enhancements 2026-04-08 18:48:02 +02:00
Sign in to join this conversation.
No Reviewers
No Label
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: marcel/familienarchiv#215