feat(model): add title/salutation field to Person and make firstName optional #212

Closed
opened 2026-04-07 18:28:32 +02:00 by marcel · 9 comments
Owner

Problem

The mass import contains names with title or relationship prefixes that the parser treats as first names:

| Input | Current split() result | Expected result |
|---|---|---|
| Tante Molly | firstName="Tante", lastName="Molly" | title="Tante", firstName=null, lastName="Molly" |
| Schwester Hanni | firstName="Schwester", lastName="Hanni" | title="Schwester", firstName=null, lastName="Hanni" |
| Frau Bakker | firstName="Frau", lastName="Bakker" | title="Frau", firstName=null, lastName="Bakker" |
| Dr. Frock | firstName="Dr.", lastName="Frock" | title="Dr.", firstName=null, lastName="Frock" |
| Cousine Emmy Haniel | firstName="Cousine Emmy", lastName="Haniel" | title="Cousine", firstName="Emmy", lastName="Haniel" |
| Tante Lolly | firstName="Tante", lastName="Lolly" | title="Tante", firstName=null, lastName="Lolly" |

In a family archive, "Tante Molly" or "Dr. Frock" are how these people are known. There is no known first name - the nickname/last name IS the identifier. The title/relationship prefix is metadata.

Solution

Part 1 - Add title field to Person

@Column(name = "title")
private String title;

Stores: Frau, Herr, Tante, Onkel, Schwester, Cousine, Freundin, Architekt, Dr., Prof., etc.

Part 2 - Make firstName nullable

firstName is currently always populated (worst case with "?" placeholder). Change to:

  • Allow null in the database
  • Update all display logic to handle null firstName gracefully
  • Display name becomes: [title] [firstName] lastName with null parts omitted

Impact scope - every place that concatenates firstName + " " + lastName:

  • Person list page
  • Person detail page
  • Document detail (sender/receiver display)
  • PersonTypeahead / PersonMultiSelect components
  • Search results
  • Mass import service

Part 3 - Parser: strip known prefixes

Add a list of known title/relationship prefixes to PersonNameParser:

static final List<String> KNOWN_PREFIXES = List.of(
    "Frau", "Herr", "Tante", "Onkel", "Schwester", "Bruder",
    "Cousine", "Cousin", "Freundin", "Freund",
    "Architekt", "Dr.", "Prof.", "Pastor"
);

In split(), check if the first token is a known prefix. If so:

  • Extract it as the title
  • The remainder is treated as the name (may be just a lastName with no firstName)
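The token check described above can be sketched roughly as follows. This is a minimal illustration, not the actual PersonNameParser: the three-field SplitName record, the class name TitleStripSketch, and the plain last-space fallback are assumptions made for the example, and the known-last-name lookup is left out.

```java
import java.util.List;

// Hypothetical three-field result type; the issue only says split() should return the title in SplitName.
record SplitName(String title, String firstName, String lastName) {}

final class TitleStripSketch {
    // Prefix list copied from Part 3 of this issue.
    static final List<String> KNOWN_PREFIXES = List.of(
        "Frau", "Herr", "Tante", "Onkel", "Schwester", "Bruder",
        "Cousine", "Cousin", "Freundin", "Freund",
        "Architekt", "Dr.", "Prof.", "Pastor");

    static SplitName split(String raw) {
        String name = raw.trim();
        String title = null;

        // Strip the first token only if it is a known prefix and something follows it.
        int firstSpace = name.indexOf(' ');
        if (firstSpace > 0 && KNOWN_PREFIXES.contains(name.substring(0, firstSpace))) {
            title = name.substring(0, firstSpace);
            name = name.substring(firstSpace + 1).trim();
        }

        // Last-space fallback: a single remaining token becomes lastName, firstName stays null.
        int lastSpace = name.lastIndexOf(' ');
        if (lastSpace < 0) {
            return new SplitName(title, null, name);
        }
        return new SplitName(title, name.substring(0, lastSpace), name.substring(lastSpace + 1));
    }
}
```

With this sketch, "Tante Molly" yields title="Tante", firstName=null, lastName="Molly", while "Clara Muller" passes through without a title.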

Part 4 - Nickname as lastName

When a name like "Tante Molly" is split, "Molly" goes to lastName. This is intentional - in the family archive context, the nickname IS how the person is identified. Users can later set a proper firstName/lastName via the edit page if the real name is known.

Part 5 - Migration

ALTER TABLE persons ADD COLUMN title VARCHAR(50);
ALTER TABLE persons ALTER COLUMN first_name DROP NOT NULL;

Update existing records where firstName matches a known prefix:

UPDATE persons SET title = first_name, first_name = NULL
WHERE first_name IN ('Tante', 'Schwester', 'Frau', 'Herr', ...);

Files

| File | Change |
|---|---|
| Person.java | Add title field, make firstName nullable |
| V{n}__add_person_title.sql | Migration |
| PersonNameParser.java | Known prefix stripping, return title in SplitName |
| PersonService.java | Handle title during findOrCreateByAlias |
| PersonNameParserTest.java | Tests for prefix stripping |
| Frontend: person list, detail, edit, typeahead, document detail | Handle null firstName, display title |
| API types regeneration | Reflect nullable firstName + new title field |

Design decision

firstName becoming optional is a significant change. The "?" placeholder approach was simpler but produced dirty data. Null is honest - it says "we don't know this person's first name" rather than pretending "?" is a name.

Found in

ODS import file analysis for #190.

marcel added the feature, person labels 2026-04-07 18:28:45 +02:00
Author
Owner

Complete Input/Output Table

Every entry from the ODS that contains a title or relationship prefix, plus regression cases that must NOT be affected.

Relationship prefixes (firstName becomes null, nickname goes to lastName)

| # | Raw input | Column | title | firstName | lastName | Notes |
|---|---|---|---|---|---|---|
| 1 | Tante Clara | Von | Tante | null | Clara | Nickname as lastName |
| 2 | Tante Lolly | Von (via // split from Charl.Blomquist//Tante Lolly) | Tante | null | Lolly | Nickname as lastName |
| 3 | Schwester Hanni | Von | Schwester | null | Hanni | Nickname as lastName |
| 4 | Freundin Gathi | Von | Freundin | null | Gathi | Nickname as lastName |
| 5 | Cousine Emmy Haniel | Von | Cousine | Emmy | Haniel | Has full name after prefix |

Formal titles (firstName becomes null when only one name token remains)

| # | Raw input | Column | title | firstName | lastName | Notes |
|---|---|---|---|---|---|---|
| 6 | Frau Bakker | Von | Frau | null | Bakker | Single token after prefix |
| 7 | Freifrau von Massenbach | Von | Freifrau | null | von Massenbach | Noble title; "von Massenbach" is the full last name |

Academic/professional titles (dot-compressed - dot normalization fires first)

These go through dot-normalization (#190) before title stripping. Processing order: geb strip -> dot normalize -> paren strip -> title strip -> known-last-name / fallback.

| # | Raw input | After dot-norm | title | firstName | lastName | Notes |
|---|---|---|---|---|---|---|
| 8 | Dr.Fr.Zarncke | Dr. Fr. Zarncke | Dr. | Fr. | Zarncke | Title + initial + last name |
| 9 | Dr.Sattelmacher | Dr. Sattelmacher | Dr. | null | Sattelmacher | Title + last name only |
| 10 | Dr.W.Munch | Dr. W. Munch | Dr. | W. | Munch | Title + initial + last name |
| 11 | Dr.von Gelden | Dr. von Gelden | Dr. | null | von Gelden | Title + "von" last name (dot-norm only adds space after first dot since rest already has spaces) |
| 12 | Prof.H-M Theopold | Prof. H-M Theopold | Prof. | H-M | Theopold | Title + hyphenated initial + last name |

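Wait - #11 is tricky. Dr.von Gelden contains a space (after "von"), so the dot-normalization guard (!contains(" ")) won't fire and the input stays as Dr.von Gelden. By the same guard, #12 (Prof.H-M Theopold) also contains a space and would stay unnormalized, although its final title/firstName/lastName result is the same. Title stripping therefore needs to handle Dr. even without the space. This needs its own rule: dot-prefixed titles (Dr., Prof.) should be recognized with or without a trailing space.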

Corrected #11:

| # | Raw input | After dot-norm | title | firstName | lastName | Notes |
|---|---|---|---|---|---|---|
| 11 | Dr.von Gelden | Dr.von Gelden (unchanged - has space, dot-norm skips) | Dr. | null | von Gelden | Dot-norm doesn't fire; title stripping must handle Dr. prefix even when concatenated |
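A rough sketch of the "with or without trailing space" rule for dot-terminated titles, assuming it runs as a separate step before any space-based token check. The class and method names (DotTitleSketch, stripDotTitle) are made up for illustration and are not part of the existing parser.

```java
import java.util.List;

final class DotTitleSketch {
    // Dot-terminated titles taken from the prefix list in this issue.
    static final List<String> DOT_PREFIXES = List.of("Dr.", "Prof.");

    /** Returns {title, remainder}; title is null when no dot-terminated prefix matches. */
    static String[] stripDotTitle(String name) {
        for (String prefix : DOT_PREFIXES) {
            // Require at least one character after the prefix so a bare "Dr." is left alone.
            if (name.startsWith(prefix) && name.length() > prefix.length()) {
                // trim() covers both "Dr. Sattelmacher" and the concatenated "Dr.von Gelden".
                return new String[] { prefix, name.substring(prefix.length()).trim() };
            }
        }
        return new String[] { null, name };
    }
}
```

Because the match is a startsWith on the raw string rather than a lookup of the first space-separated token, "Dr.von Gelden" and "Dr. von Gelden" both produce title="Dr." with the remainder "von Gelden".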

Known prefix list

| Category | Prefixes |
|---|---|
| Formal | Frau, Herr, Freifrau, Freiherr |
| Relationship | Tante, Onkel, Schwester, Bruder, Cousine, Cousin, Freundin, Freund, Mutter, Vater |
| Academic | Dr., Prof., Pastor |
| Professional | Architekt |

Regression cases - must NOT be affected by title stripping

These names start with tokens that look like they could be prefixes but are actually first names or parts of the name:

| Raw input | Column | Expected title | Expected firstName | Expected lastName | Why no stripping |
|---|---|---|---|---|---|
| Walter de Gruyter | Von | null | Walter | de Gruyter | "Walter" is not a known prefix |
| Clara Muller | Von | null | Clara | Muller | "Clara" is not a known prefix |
| Friedrich Muller | Von | null | Friedrich | Muller | "Friedrich" is not a known prefix |
| Conrad von Geldern | Von | null | Conrad | von Geldern | "Conrad" is not a known prefix, "von" is part of last name |
| Rikchen v R. | Von | null | Rikchen v | R. | "Rikchen" is not a known prefix |
| Paula von der Heide | Von | null | Paula | von der Heide | "Paula" is not a known prefix; "von der Heide" is the last name (needs addition to KNOWN_LAST_NAMES) |
| Wolf Meinhard von Staa | Von | null | Wolf Meinhard | von Staa | "Wolf" is not a known prefix |

"von" as part of last names (related concern)

Several entries have von in the last name. These are NOT titles - von is part of the German last name:

| Raw input | Expected firstName | Expected lastName | Needs KNOWN_LAST_NAMES? |
|---|---|---|---|
| Conrad von Geldern | Conrad | von Geldern | Yes - add von Geldern |
| Paula von der Heide | Paula | von der Heide | Yes - add von der Heide |
| Wolf Meinhard von Staa | Wolf Meinhard | von Staa | Yes - add von Staa |
| Freifrau von Massenbach | (after title strip) null | von Massenbach | Yes - add von Massenbach |
| Dr.von Gelden | (after title strip) null | von Gelden | Yes - add von Gelden |

These should be added to KNOWN_LAST_NAMES as part of this issue to ensure correct splitting.
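To illustrate why these entries matter, here is a hedged sketch of a known-last-name lookup followed by a last-space fallback. The real findKnownLastName implementation is not shown in this issue, so the suffix-match logic, the class name, and the list contents below are assumptions for the example only.

```java
import java.util.List;

final class KnownLastNameSketch {
    // Only the entries proposed in this comment; the real KNOWN_LAST_NAMES list may contain more.
    static final List<String> KNOWN_LAST_NAMES = List.of(
        "von der Heide", "von Geldern", "von Gelden", "von Staa", "von Massenbach");

    /** Returns {firstName, lastName}; firstName is null when nothing precedes the last name. */
    static String[] splitName(String name) {
        for (String known : KNOWN_LAST_NAMES) {
            if (name.equals(known)) {
                // Whole input is the last name, e.g. after a title was stripped.
                return new String[] { null, known };
            }
            if (name.endsWith(" " + known)) {
                String first = name.substring(0, name.length() - known.length()).trim();
                return new String[] { first, known };
            }
        }
        // Fallback: split at the last space.
        int lastSpace = name.lastIndexOf(' ');
        if (lastSpace < 0) {
            return new String[] { null, name };
        }
        return new String[] { name.substring(0, lastSpace), name.substring(lastSpace + 1) };
    }
}
```

Without the list entry, the fallback would split "Conrad von Geldern" into firstName="Conrad von" and lastName="Geldern", which is the regression the table above is guarding against.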

Processing pipeline order (full)

For any name going through split():

raw input
  -> geb. strip (#209)
  -> dot-normalization (#190)
  -> parenthesis strip (#210)
  -> title strip (#212)       <-- NEW
  -> findKnownLastName / last-space fallback

Edge case: Architekt Korschelt u Renker

This is a Von entry with a professional title AND a multi-person u separator. Since Von goes to split() not parseReceivers(), the u isn't handled. After title stripping: title="Architekt", remainder="Korschelt u Renker", split: firstName="Korschelt u", lastName="Renker". This is still wrong - but it's a Von-column-multi-person issue, not a title issue. Noted in #211 as an open question.

Author
Owner

💻 Felix Brandt -- Senior Fullstack Developer

Questions & Observations

  • The issue mentions updating "every place that concatenates firstName + " " + lastName" but does not inventory whether this logic currently lives in a single utility/helper or is scattered across multiple files. If scattered, this is a DRY opportunity -- extract a Person.getDisplayName() method (or a shared Svelte utility) that handles the [title] [firstName] lastName assembly with null-safe logic in one place. Every display site then calls that one method.

  • PersonNameParser.split() currently returns what appears to be a two-field SplitName record. Adding title makes it three fields. Is SplitName a record or a class? If it is a record, adding the field is straightforward. If it is constructed ad-hoc, this is the moment to formalize it. A SplitName(String title, String firstName, String lastName) record keeps the parser's contract explicit.

  • The comment's processing pipeline (geb strip -> dot-normalize -> paren strip -> title strip -> findKnownLastName / fallback) is well-defined. The dot-prefix recognition for Dr. and Prof. even without trailing space (case #11: Dr.von Gelden) needs careful implementation -- the title strip step must check for dot-terminated prefixes using startsWith("Dr.") / startsWith("Prof.") before attempting space-based token splitting. This is a separate code path from the space-separated prefixes like Tante, Frau, etc.

  • The KNOWN_PREFIXES list is a static constant. Where does it live? If it lives inside PersonNameParser, that is fine (single responsibility -- parsing names). If someone later wants to make it configurable, that is a separate issue. KISS says: hardcode the list as a List.of(...) constant for now.

  • The migration's UPDATE persons SET title = first_name, first_name = NULL WHERE first_name IN (...) is a data migration that changes existing records. What happens to documents whose sender/receiver display currently shows "Tante Molly" as firstName="Tante" lastName="Molly"? After migration it becomes title="Tante" firstName=null lastName="Molly". The frontend must handle this before the migration runs, or the display breaks between deploy and frontend update. Deploy order matters.

Suggestions

  • Add a getDisplayName() method to Person.java that assembles [title] [firstName] lastName with null handling. All frontend and backend display logic calls this. This is the single point of change for display format.
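// Requires: import java.util.ArrayList;
public String getDisplayName() {
    // Collect only the name parts that are present, in display order.
    var parts = new ArrayList<String>();
    if (title != null) parts.add(title);
    if (firstName != null) parts.add(firstName);
    parts.add(lastName); // lastName is assumed to be non-null here
    return String.join(" ", parts);
}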
  • Write the PersonNameParserTest cases from the comment table first (red), then implement the title stripping (green). The comment already provides 12+ concrete input/output pairs -- these are test cases ready to be transcribed.

  • For dot-terminated titles (Dr., Prof.), handle them as a prefix match, not a token match. The token-based approach (split(" ")[0]) will miss Dr.von Gelden where there is no space after the dot. A startsWith check on the known dot-prefixes should run before tokenization.

  • Consider whether KNOWN_PREFIXES should be case-insensitive. The ODS data might contain tante or TANTE. A Set of lowercased prefixes with token.toLowerCase() lookup is defensive without adding complexity.
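A small sketch of the case-insensitive lookup, under stated assumptions: it swaps the suggested Set for a Map so that the canonical spelling can be stored in the title column, and it lists only a handful of prefixes. The names CaseInsensitivePrefixSketch and canonicalTitle are hypothetical.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

final class CaseInsensitivePrefixSketch {
    // Lowercased key -> canonical spelling (subset of the issue's prefix list, for brevity).
    static final Map<String, String> CANONICAL_PREFIXES = Map.of(
        "tante", "Tante",
        "onkel", "Onkel",
        "frau", "Frau",
        "herr", "Herr",
        "dr.", "Dr.",
        "prof.", "Prof.");

    /** Returns the canonical title for a token, or empty if the token is not a known prefix. */
    static Optional<String> canonicalTitle(String token) {
        // Locale.ROOT keeps lowercasing independent of the JVM's default locale.
        return Optional.ofNullable(CANONICAL_PREFIXES.get(token.toLowerCase(Locale.ROOT)));
    }
}
```

Returning the canonical form means "TANTE Molly" and "tante molly" would both end up with title="Tante", which keeps the stored data uniform.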

Author
Owner

🏗️ Markus Keller -- Application Architect

Questions & Observations

  • Schema change scope: Adding a nullable title column is clean. Making firstName nullable is the bigger change -- it ripples through every query, every ORDER BY, every search index that touches firstName. Does the full-text search index currently include firstName? If so, the search behavior changes when firstName is null. In PostgreSQL, to_tsvector(NULL) is NULL and concatenating a NULL column with || makes the whole expression NULL, so an index expression that does not wrap first_name in coalesce() would drop these rows from search entirely. A person with only a title and lastName might also become unsearchable by their title unless the search index is updated.

  • Data model integrity: The title field is described as storing both formal titles (Dr., Frau) and relationship labels (Tante, Cousine). These are semantically different -- "Dr." is a property of the person, while "Tante" is a relationship between two people (Tante to whom?). Is this conflation intentional? In a family archive with a single-family perspective, it probably works fine. But it is worth acknowledging this is a simplification that would break if you ever needed to model "Tante to Person A but Cousine to Person B."

  • Migration safety: The UPDATE persons SET title = first_name, first_name = NULL WHERE first_name IN (...) migration modifies existing data. This is irreversible in the sense that you lose the information that firstName was "Tante". Flyway migrations should be forward-only and safe. Consider: what if the list misses a prefix, and a real first name happens to match a future prefix addition? The migration as written is safe for the current list, but the approach of retroactively nulling data based on a hardcoded list is worth flagging.

  • Display name assembly: Where does the [title] [firstName] lastName concatenation live? The issue lists it as a frontend concern across many files. This logic should live on the entity as a derived property (a @Transient getter or a @Formula column), so the backend API always returns a consistent displayName. The frontend should not independently assemble display names -- that is duplicated business logic across two codebases.

  • VARCHAR(50) for title: Reasonable for known prefixes. "Freifrau" (8 chars) and potential compound titles like "Prof. Dr." (9 chars) fit easily. 50 is fine.

Suggestions

  • Add a @Transient displayName getter on Person and include it in the API response. This is the single source of truth for how a person's name is displayed. The frontend reads person.displayName instead of assembling it from parts.

  • If the full-text search includes person names, update the search index/query to include title in the searchable text. Otherwise "Tante Molly" becomes unsearchable after migration.

  • Document the intentional conflation of titles and relationship labels in the migration SQL as a comment. Future maintainers will wonder why "Tante" is stored in a field called "title."

  • The KNOWN_LAST_NAMES additions (von Geldern, von der Heide, von Staa, von Massenbach) from the comment should be part of this issue's scope, not deferred. They are required for correct parsing of the test cases listed.

Author
Owner

🧪 Sara Holt -- QA Engineer & Test Strategist

Questions & Observations

  • The comment provides an excellent input/output table with 12 positive cases and 7 regression cases. This is a test plan in disguise. However, there are gaps:

    • Null/empty input: What happens when the raw input is null, empty string, or whitespace-only? The parser must handle these gracefully.
    • Title-only input: What if someone enters just "Tante" with no name following? Does it become title="Tante", firstName=null, lastName=null? That would violate a lastName NOT NULL constraint if one exists.
    • Multiple titles: "Prof. Dr. Muller" -- does only the first token get stripped, or both? The pipeline description says "check if the first token is a known prefix" (singular), but academic titles can stack.
    • Case sensitivity: "tante molly", "TANTE MOLLY" -- are these handled identically to "Tante Molly"?
  • The migration's UPDATE statement modifies existing data. This needs a before/after integration test: run the migration on a test database with known seed data and assert the resulting state. Flyway migrations are tested implicitly by Testcontainers, but the data transformation deserves an explicit assertion.

  • The "impact scope" lists 6+ frontend locations that need to handle null firstName. Each of these is a potential regression point. Without component tests for null-firstName rendering, a future refactor could reintroduce firstName + " " + lastName and display "null Molly".

  • The processing pipeline has a defined order: geb strip -> dot-normalize -> paren strip -> title strip -> findKnownLastName / fallback. This pipeline is a prime candidate for a parameterized test that feeds raw input through the full pipeline and asserts the final SplitName output. Each row in the comment tables becomes one @ParameterizedTest @CsvSource row.
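
The "null Molly" regression mentioned above can be illustrated with a minimal sketch. It assumes the display format `[title] [firstName] lastName` from the issue description; the class and method names (`DisplayNameSketch`, `displayName`) are hypothetical and are not taken from the codebase.

```java
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not the project's actual Person.getDisplayName().
class DisplayNameSketch {

    // Joins the non-null, non-blank parts of [title] [firstName] lastName
    // with single spaces, so missing parts are simply omitted.
    static String displayName(String title, String firstName, String lastName) {
        return Stream.of(title, firstName, lastName)
                .filter(Objects::nonNull)
                .map(String::trim)
                .filter(part -> !part.isEmpty())
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        String firstName = null;
        // Naive concatenation converts the null reference to the text "null".
        System.out.println(firstName + " " + "Molly");
        // The null-safe join omits the missing first name.
        System.out.println(displayName("Tante", null, "Molly"));
        System.out.println(displayName(null, "Clara", "Muller"));
    }
}
```

A component or unit test that asserts the output never contains the literal text "null" would catch a reintroduced concatenation.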

Suggestions

  • Unit test layer (PersonNameParserTest):

    • Parameterized test with all 19 cases from the comment (12 positive + 7 regression)
    • Edge cases: null input, empty string, single word with no prefix, title-only input, double titles
    • Case insensitivity test if that behavior is implemented
  • Integration test layer:

    • Flyway migration test: seed database with persons having firstName IN ('Tante', 'Dr.', 'Frau'), run migration, assert title is populated and firstName is null
    • PersonService.findOrCreateByAlias test: pass "Tante Molly" and assert the created person has title="Tante", firstName=null, lastName="Molly"
  • Frontend component tests (Vitest):

    • Render PersonTypeahead with a person having title="Tante", firstName=null, lastName="Molly" -- assert display shows "Tante Molly"
    • Render person list with mixed null/non-null firstNames -- assert no "null" or "undefined" strings in output
    • Document detail sender/receiver display with title-only persons
  • E2E smoke test (Playwright):

    • Create a person via mass import with a titled name, navigate to person detail, verify display name is correct. One happy path is sufficient at this layer.
Author
Owner

🛡️ Nora "NullX" Steiner -- Application Security Engineer

Questions & Observations

  • Mass assignment on the new title field: When Person is used as a response entity (no response DTO), the title field will be serialized in API responses automatically. That is fine for reads. For writes, does DocumentUpdateDTO or any person-related DTO accept a title field? If the person edit form action processes formData.get('title'), ensure it is validated and sanitized. A freeform string field that ends up in HTML display is an XSS vector if not properly escaped. SvelteKit's {person.title} text interpolation escapes by default, but verify no {@html} is used for display name rendering.

  • SQL injection in the migration: The migration uses a hardcoded IN ('Tante', 'Schwester', ...) list, which is safe (no user input). No issue here.

  • The KNOWN_PREFIXES list is server-side only: Confirm the prefix list is not sent to the frontend or used in client-side validation. Client-side validation of titles would leak the prefix list, which is not a security risk per se, but keeping parsing logic server-side is the right boundary.

  • Null firstName and search/filter injection: If any search query concatenates firstName into a JPQL or native SQL string, a null firstName could change query behavior. Verify all person search queries use parameterized queries and handle null fields gracefully. This is likely already the case given the project's use of Spring Data JPA, but the change from "always non-null" to "nullable" can surface edge cases in WHERE firstName LIKE :query clauses.

  • Authorization on person edit: The issue does not change authorization boundaries, but adding a new field (title) to person edit means the existing @RequirePermission coverage applies. No new endpoints are introduced, so no new permission gaps. Confirmed no concern here.

Suggestions

  • Verify that all display-name rendering in Svelte templates uses text interpolation ({person.title}) and not {@html}. A title like <script>alert(1)</script> entered via the edit form would be harmless with text interpolation but dangerous with {@html}.

  • Add input validation on the title field: max length 50 (matching the DB column), alphanumeric + dots + spaces only. Reject HTML entities and control characters. This can be a simple @Size(max = 50) and @Pattern on the DTO field if a DTO is used, or a service-layer validation.

  • If PersonService.findOrCreateByAlias creates persons from mass import input, the title value comes from parsed ODS data. ODS cell contents can contain arbitrary strings. Ensure the parser sanitizes or rejects unexpected characters in the extracted title before persisting.
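
As a rough sketch of the validation suggested above, assuming "alphanumeric" is meant to include German letters such as umlauts and that a null title is allowed: the class name, the exact pattern, and the null handling are illustrative assumptions, not the project's DTO rules.

```java
import java.util.regex.Pattern;

// Hypothetical service-layer check; a DTO could express the same rule
// with @Size(max = 50) and @Pattern as mentioned above.
class TitleValidationSketch {

    // Matches the VARCHAR(50) column length discussed in this thread.
    static final int MAX_LENGTH = 50;

    // One or more tokens of letters, digits, and dots, separated by single spaces.
    // \p{L} covers non-ASCII letters (assumption: titles may contain umlauts).
    static final Pattern ALLOWED =
            Pattern.compile("[\\p{L}\\p{N}.]+( [\\p{L}\\p{N}.]+)*");

    static boolean isValidTitle(String title) {
        if (title == null) {
            return true; // assumption: the title is optional
        }
        return title.length() <= MAX_LENGTH && ALLOWED.matcher(title).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidTitle("Prof. Dr."));
        System.out.println(isValidTitle("<script>alert(1)</script>"));
    }
}
```

Because the pattern is an allow-list, angle brackets, ampersands, and control characters are rejected without needing a separate deny-list.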

Author
Owner

🎨 Leonie Voss -- UI/UX Design Lead

Questions & Observations

  • Display name format [title] [firstName] lastName: The issue defines the format but not the visual treatment. Should the title be displayed differently from the name? In a family archive context, "Tante" before a name is contextual metadata, not part of the proper name. Consider rendering the title in a lighter weight or smaller size to visually distinguish it: <span class="text-gray-500 font-sans text-sm">Tante</span> <span class="font-serif">Molly</span>. This maintains the information without making "Tante" look like a first name.

  • Person list page: Currently shows firstName lastName in rows. With null firstNames, some rows will show just a last name while others show a full name. This creates visual inconsistency. With titles, "Tante Molly" and "Clara Muller" look structurally different. How does the list handle sorting? Alphabetical by lastName is correct, but display-wise, a person listed as just "Molly" (lastName only) next to "Clara Muller" (firstName + lastName) may confuse users who expect first names to always be present.

  • PersonTypeahead and PersonMultiSelect chips: Chips currently show the person's name. With title + nullable firstName, chip text could become quite long ("Cousine Emmy Haniel") or very short ("Molly"). Are there min/max width considerations for chips? A chip showing just "Molly" with no context is ambiguous -- should the title appear in chips too?

  • Accessibility: Null firstName means screen readers will read different patterns for different persons. "Tante Molly" vs "Clara Muller" vs just "Bakker". Ensure the aria-label on person links and chips includes the full display name including title, so screen reader users get the same contextual information as sighted users.

  • Edit form: The issue mentions users can "later set a proper firstName/lastName via the edit page." Is the title field editable on the person edit page? If so, it needs a label, placeholder text, and possibly a dropdown or datalist of known titles to guide input. A freeform text field for title invites inconsistency ("Tante" vs "tante" vs "Aunt").

Suggestions

  • Add a <datalist> or autocomplete dropdown for the title field on the person edit page, pre-populated with the known prefixes. This guides input without restricting it.

  • On the person list, consider showing title in a muted style next to the name: Tante in text-gray-400 text-sm font-sans followed by Molly in font-serif. This provides context without visual clutter.

  • For PersonTypeahead search results, show the title as a secondary label: the dropdown item shows "Molly" as primary and "Tante" as a dimmed prefix. This keeps the dropdown scannable.

  • Ensure that person chips in PersonMultiSelect include the title when present, so users can distinguish "Tante Molly" from a different "Molly" who might exist without a title.

  • Test the display at 320px viewport width with long titles like "Freifrau von Massenbach" -- this could overflow chip containers or truncate awkwardly. Set max-width with text-ellipsis on chips as a safety measure.

Author
Owner

🐳 Tobias Wendt -- DevOps & Platform Engineer

Questions & Observations

  • Flyway migration ordering: The issue references V{n}__add_person_title.sql without specifying the version number. With multiple open issues (#209, #210, #212) all adding migrations, there is a risk of version number collisions if branches are developed in parallel. Coordinate migration version numbers before merging. Flyway will fail hard on duplicate version numbers, and resolving conflicts in production is painful.

  • Migration rollback: The migration adds a column and modifies existing data (UPDATE ... SET title = first_name, first_name = NULL). The column addition is safe. The data modification is not trivially reversible -- if you need to roll back, you would need a reverse migration that moves title back to firstName. Consider whether this warrants a separate migration file: V{n}__add_person_title_column.sql (schema only) and V{n+1}__backfill_person_titles.sql (data migration). This way the schema change can be rolled back independently of the data change.

  • Deploy order: The issue changes both backend (new column, nullable firstName) and frontend (null-safe display). If the backend deploys first, the API starts returning firstName: null for migrated persons, but the old frontend may render "null Molly". If the frontend deploys first, it handles null gracefully but the backend has not migrated yet (no change). The safe deploy order is: frontend first (make it null-tolerant), then backend (migration + parser changes). Is this the planned order?

  • CI impact: The PersonNameParserTest additions should not affect CI runtime meaningfully -- these are unit tests. The Flyway migration will run in Testcontainers integration tests automatically. No new Docker services or CI job changes are needed.

  • No infrastructure changes: This issue is purely application-level. No new services, no new ports, no Docker Compose changes. The migration runs via Flyway on application startup as usual. Clean from an ops perspective.

Suggestions

  • Split the migration into two files: one for the schema change (add column, drop NOT NULL), one for the data backfill. This gives cleaner rollback granularity and makes the migration history easier to read.

  • Coordinate migration version numbers with #209 and #210 before implementation. A quick comment on each issue with the claimed version number prevents merge conflicts.

  • Plan the deploy as frontend-first, backend-second. The frontend change (null-safe display) is backward-compatible with the current API. The backend change (nullable firstName in API responses) requires the frontend to already handle it.

  • No changes needed to docker-compose.yml, CI workflow, or infrastructure configuration. This is a clean application-level change.

Author
Owner

🏗️ Markus Keller — Application Architect (Discussion Summary)

Interactive discussion with Marcel covering 4 open items. All resolved.

Resolved Items

  • Dot-prefixed titles without trailing space — Decision: two-pass approach in stripTitle(). First pass: startsWith check for dot-terminated prefixes (Dr., Prof.) — no space required, handles Dr.von Gelden correctly. Second pass: space-based token check for word prefixes (Tante, Frau, Schwester, etc.). Two distinct matching strategies, explicit in the code.

  • Stacked academic titles — Decision: loop the stripping. Both passes (dot-prefix and space-token) run in a loop, accumulating stripped prefixes into the title string. Prof. Dr. Muller -> title="Prof. Dr.", firstName=null, lastName="Muller". The loop is trivial to implement and Prof. Dr. is too common in German academic contexts to ignore, even though the current ODS data doesn't contain a stacked example.

  • Title vs relationship label conflation — Decision: single title field, accept the simplification. "Dr." (credential) and "Tante" (relationship) are semantically different, but in a single-family archive with a fixed perspective, the distinction doesn't matter for display or search. One field, one extraction pass, one display logic. If the distinction ever matters, the data can be reclassified from the known prefix list.

  • KNOWN_LAST_NAMES additions — Decision: include in this issue. Five "von" last names are directly required for the test cases to pass: von Geldern, von der Heide, von Staa, von Massenbach, von Gelden. Without them, entries like Freifrau von Massenbach (after title strip -> von Massenbach) fall through to the wrong split path.
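
The two-pass, looped stripping decided above might look roughly like the following. This is a sketch under assumptions, not the committed implementation: the prefix lists are shortened, matching is case-sensitive, a title with no name after it is left untouched, and the `Stripped` holder type is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class StripTitleSketch {

    // Shortened, illustrative lists; the real KNOWN_PREFIXES is longer.
    static final List<String> DOT_PREFIXES = List.of("Prof.", "Dr.");
    static final List<String> WORD_PREFIXES =
            List.of("Frau", "Herr", "Tante", "Onkel", "Schwester", "Cousine", "Freifrau");

    // Hypothetical result holder: accumulated title (or null) and the remaining name.
    static final class Stripped {
        final String title;
        final String rest;

        Stripped(String title, String rest) {
            this.title = title;
            this.rest = rest;
        }
    }

    static Stripped stripTitle(String input) {
        String rest = input.trim();
        List<String> titles = new ArrayList<>();
        boolean matched = true;
        while (matched) {
            matched = false;
            // Pass 1: dot-terminated prefixes, no trailing space required ("Dr.von Gelden").
            for (String prefix : DOT_PREFIXES) {
                if (rest.startsWith(prefix) && rest.length() > prefix.length()) {
                    titles.add(prefix);
                    rest = rest.substring(prefix.length()).trim();
                    matched = true;
                    break;
                }
            }
            if (matched) {
                continue;
            }
            // Pass 2: word prefixes must be a whole first token followed by a space.
            int space = rest.indexOf(' ');
            if (space > 0 && WORD_PREFIXES.contains(rest.substring(0, space))) {
                titles.add(rest.substring(0, space));
                rest = rest.substring(space + 1).trim();
                matched = true;
            }
        }
        String title = titles.isEmpty() ? null : String.join(" ", titles);
        return new Stripped(title, rest);
    }

    public static void main(String[] args) {
        Stripped s = stripTitle("Prof. Dr. Muller");
        System.out.println(s.title + " | " + s.rest);
    }
}
```

Each loop iteration strips at most one prefix, so stacked titles accumulate in input order and the loop ends as soon as neither pass matches.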

Scope After #213

With #213 handling SplitName redesign, nullable firstName, Person.getDisplayName(), migration, and frontend displayName refactor, this issue's scope is now:

  1. Implement stripTitle() pipeline method (two-pass, looped)
  2. Add KNOWN_PREFIXES list to PersonNameParser
  3. Add 5 "von" entries to KNOWN_LAST_NAMES
  4. Tests for all 19 cases from the input/output table (12 positive + 7 regression)

Clean, additive change on top of #213's foundation.

Author
Owner

Implementation Complete

Parser logic implemented on branch feat/issues-209-213-person-parser-enhancements. Parts 1-2 (title field, nullable firstName, migration, displayName) were already done in #213.

Commits

Commit Description
73640ef Implement stripTitle() with known prefix list, add von last names

What changed

  • stripTitle() — two-pass approach: dot-prefixes (Dr., Prof.) matched without trailing space, word-prefixes (14 entries: Tante, Frau, Schwester, Cousine, etc.) matched at word boundary. Loops for stacked titles (Prof. Dr. Muller).
  • Single-token-after-title logic — when title is stripped and only one name token remains, it goes to lastName with firstName=null. "Tante Molly" → title="Tante", firstName=null, lastName="Molly".
  • KNOWN_LAST_NAMES — added 5 "von" entries: von der Heide, von Massenbach, von Geldern, von Gelden, von Staa (longest first for correct matching)
  • 15 new test cases + 3 updated existing tests
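
For readers without access to the branch, the split that runs after title stripping could be sketched as follows. This is an illustration under assumptions and not the code from commit 73640ef: the suffix-based matching strategy, the `split` helper, and the `String[]` return shape are all hypothetical; only the five "von" names and their order come from the list above.

```java
import java.util.List;

class LastNameSplitSketch {

    // The five "von" entries, in the order given in the change list above.
    static final List<String> KNOWN_LAST_NAMES = List.of(
            "von der Heide", "von Massenbach", "von Geldern", "von Gelden", "von Staa");

    // Returns {firstName, lastName}; firstName is null when no first name remains.
    static String[] split(String rest) {
        for (String known : KNOWN_LAST_NAMES) {
            if (rest.equals(known)) {
                return new String[] {null, known};
            }
            if (rest.endsWith(" " + known)) {
                String first = rest.substring(0, rest.length() - known.length() - 1);
                return new String[] {first, known};
            }
        }
        int lastSpace = rest.lastIndexOf(' ');
        if (lastSpace < 0) {
            // Single remaining token: it becomes the lastName, firstName stays null.
            return new String[] {null, rest};
        }
        // Fallback: everything before the last space is the firstName.
        return new String[] {rest.substring(0, lastSpace), rest.substring(lastSpace + 1)};
    }

    public static void main(String[] args) {
        String[] parts = split("von Massenbach");
        System.out.println(parts[0] + " | " + parts[1]);
    }
}
```

Without the known-name check, "von Massenbach" would fall through to the last-space fallback and yield firstName="von", lastName="Massenbach", which is the wrong split path described in the discussion summary.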

Test results

  • Backend: 722 tests passing
No Label feature person
Reference: marcel/familienarchiv#212