feat(model): add title/salutation field to Person and make firstName optional #212
Problem
The mass import contains names with title or relationship prefixes that the parser treats as first names:
- Tante Molly
- Schwester Hanni
- Frau Bakker
- Dr. Frock
- Cousine Emmy Haniel
- Tante Lolly

In a family archive, "Tante Molly" or "Dr. Frock" are how these people are known. There is no known first name - the nickname/last name IS the identifier. The title/relationship prefix is metadata.
Solution
Part 1 - Add title field to Person
Stores: Frau, Herr, Tante, Onkel, Schwester, Cousine, Freundin, Architekt, Dr., Prof., etc.
Part 2 - Make firstName nullable
firstName is currently always populated (worst case with a "?" placeholder). Change to:
- null in the database
- [title] [firstName] lastName with null parts omitted

Impact scope - every place that concatenates firstName + " " + lastName:
Part 3 - Parser: strip known prefixes
Add a list of known title/relationship prefixes to PersonNameParser:
In split(), check if the first token is a known prefix. If so:
Part 4 - Nickname as lastName
When a name like "Tante Molly" is split, "Molly" goes to lastName. This is intentional - in the family archive context, the nickname IS how the person is identified. Users can later set a proper firstName/lastName via the edit page if the real name is known.
Part 5 - Migration
Update existing records where firstName matches a known prefix:
Files
- Person.java: title field, make firstName nullable
- V{n}__add_person_title.sql
- PersonNameParser.java: SplitName
- PersonService.java
- PersonNameParserTest.java

Design decision
firstName becoming optional is a significant change. The "?" placeholder approach was simpler but produced dirty data. Null is honest - it says "we don't know this person's first name" rather than pretending "?" is a name.
Found in
ODS import file analysis for #190.
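The display format from Part 2 ([title] [firstName] lastName with null parts omitted) could be assembled in one place roughly as follows. This is a minimal sketch, not the project's code: the class name DisplayNames and the static displayName helper are hypothetical, and treating blank strings like null is an assumption the issue does not state.

```java
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; the real project may put this on Person instead.
final class DisplayNames {
    private DisplayNames() {}

    /**
     * Joins title, firstName and lastName with single spaces.
     * Null parts are omitted; blank parts are also omitted (assumption).
     */
    static String displayName(String title, String firstName, String lastName) {
        return Stream.of(title, firstName, lastName)
                .filter(Objects::nonNull)
                .map(String::trim)
                .filter(part -> !part.isEmpty())
                .collect(Collectors.joining(" "));
    }
}
```

With this sketch, displayName("Tante", null, "Molly") yields "Tante Molly" and displayName(null, "Clara", "Muller") yields "Clara Muller", so no call site ever renders "null Molly".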
Complete Input/Output Table
Every entry from the ODS that contains a title or relationship prefix, plus regression cases that must NOT be affected.
Relationship prefixes (firstName becomes null, nickname goes to lastName)
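
| # | Input | title | firstName | lastName |
|---|---|---|---|---|
| 1 | Tante Clara | Tante | null | Clara |
| 2 | Tante Lolly (// split from Charl.Blomquist//Tante Lolly) | Tante | null | Lolly |
| 3 | Schwester Hanni | Schwester | null | Hanni |
| 4 | Freundin Gathi | Freundin | null | Gathi |
| 5 | Cousine Emmy Haniel | Cousine | Emmy | Haniel |

Formal titles (firstName becomes null when only one name token remains)

| # | Input | title | firstName | lastName |
|---|---|---|---|---|
| 6 | Frau Bakker | Frau | null | Bakker |
| 7 | Freifrau von Massenbach | Freifrau | null | von Massenbach |

Academic/professional titles (dot-compressed - dot normalization fires first)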
These go through dot-normalization (#190) before title stripping. Processing order: geb strip -> dot normalize -> paren strip -> title strip -> known-last-name / fallback.

| # | Input | After dot-norm | title | firstName | lastName |
|---|---|---|---|---|---|
| 8 | Dr.Fr.Zarncke | Dr. Fr. Zarncke | Dr. | Fr. | Zarncke |
| 9 | Dr.Sattelmacher | Dr. Sattelmacher | Dr. | null | Sattelmacher |
| 10 | Dr.W.Munch | Dr. W. Munch | Dr. | W. | Munch |
| 11 | Dr.von Gelden | Dr. von Gelden | Dr. | null | von Gelden |
| 12 | Prof.H-M Theopold | Prof. H-M Theopold | Prof. | H-M | Theopold |

Wait - #11 is tricky. Dr.von Gelden has a space after "von", so the dot-normalization guard (!contains(" ")) won't fire. It stays as Dr.von Gelden. Then title stripping needs to handle Dr. even without the space. This needs its own rule: dot-prefixed titles (Dr., Prof.) should be recognized with or without trailing space.
Corrected #11:

| # | Input | After dot-norm | title | firstName | lastName | Note |
|---|---|---|---|---|---|---|
| 11 | Dr.von Gelden | Dr.von Gelden (unchanged - has space, dot-norm skips) | Dr. | null | von Gelden | Dr. prefix even when concatenated |

Known prefix list
- Frau, Herr, Freifrau, Freiherr
- Tante, Onkel, Schwester, Bruder, Cousine, Cousin, Freundin, Freund, Mutter, Vater
- Dr., Prof., Pastor
- Architekt

Regression cases - must NOT be affected by title stripping
These names start with tokens that look like they could be prefixes but are actually first names or parts of the name:

| Input | firstName | lastName |
|---|---|---|
| Walter de Gruyter | Walter | de Gruyter |
| Clara Muller | Clara | Muller |
| Friedrich Muller | Friedrich | Muller |
| Conrad von Geldern | Conrad | von Geldern |
| Rikchen v R. | Rikchen v | R. |
| Paula von der Heide | Paula | von der Heide |
| Wolf Meinhard von Staa | Wolf Meinhard | von Staa |

"von" as part of last names (related concern)
Several entries have von in the last name. These are NOT titles - von is part of the German last name:

| Input | firstName | lastName | KNOWN_LAST_NAMES entry |
|---|---|---|---|
| Conrad von Geldern | Conrad | von Geldern | von Geldern |
| Paula von der Heide | Paula | von der Heide | von der Heide |
| Wolf Meinhard von Staa | Wolf Meinhard | von Staa | von Staa |
| Freifrau von Massenbach | null | von Massenbach | von Massenbach |
| Dr.von Gelden | null | von Gelden | von Gelden |

These should be added to KNOWN_LAST_NAMES as part of this issue to ensure correct splitting.
Processing pipeline order (full)
For any name going through split(): geb strip -> dot normalize -> paren strip -> title strip -> known-last-name / fallback.
Edge case: Architekt Korschelt u Renker
This is a Von entry with a professional title AND a multi-person u separator. Since Von goes to split() not parseReceivers(), the u isn't handled. After title stripping: title="Architekt", remainder="Korschelt u Renker", split: firstName="Korschelt u", lastName="Renker". This is still wrong - but it's a Von-column-multi-person issue, not a title issue. Noted in #211 as an open question.

💻 Felix Brandt -- Senior Fullstack Developer
Questions & Observations
- The issue mentions updating "every place that concatenates firstName + " " + lastName" but does not inventory whether this logic currently lives in a single utility/helper or is scattered across multiple files. If scattered, this is a DRY opportunity -- extract a Person.getDisplayName() method (or a shared Svelte utility) that handles the [title] [firstName] lastName assembly with null-safe logic in one place. Every display site then calls that one method.
- PersonNameParser.split() currently returns what appears to be a two-field SplitName record. Adding title makes it three fields. Is SplitName a record or a class? If it is a record, adding the field is straightforward. If it is constructed ad-hoc, this is the moment to formalize it. A SplitName(String title, String firstName, String lastName) record keeps the parser's contract explicit.
- The comment's processing pipeline (geb strip -> dot-normalize -> paren strip -> title strip -> findKnownLastName / fallback) is well-defined. The dot-prefix recognition for Dr. and Prof. even without trailing space (case #11: Dr.von Gelden) needs careful implementation -- the title strip step must check for dot-terminated prefixes using startsWith("Dr.") / startsWith("Prof.") before attempting space-based token splitting. This is a separate code path from the space-separated prefixes like Tante, Frau, etc.
- The KNOWN_PREFIXES list is a static constant. Where does it live? If it lives inside PersonNameParser, that is fine (single responsibility -- parsing names). If someone later wants to make it configurable, that is a separate issue. KISS says: hardcode the list as a List.of(...) constant for now.
- The migration's UPDATE persons SET title = first_name, first_name = NULL WHERE first_name IN (...) is a data migration that changes existing records. What happens to documents whose sender/receiver display currently shows "Tante Molly" as firstName="Tante" lastName="Molly"? After migration it becomes title="Tante" firstName=null lastName="Molly". The frontend must handle this before the migration runs, or the display breaks between deploy and frontend update. Deploy order matters.

Suggestions
- Add a getDisplayName() method to Person.java that assembles [title] [firstName] lastName with null handling. All frontend and backend display logic calls this. This is the single point of change for display format.
- Write the PersonNameParserTest cases from the comment table first (red), then implement the title stripping (green). The comment already provides 12+ concrete input/output pairs -- these are test cases ready to be transcribed.
- For dot-terminated titles (Dr., Prof.), handle them as a prefix match, not a token match. The token-based approach (split(" ")[0]) will miss Dr.von Gelden where there is no space after the dot. A startsWith check on the known dot-prefixes should run before tokenization.
- Consider whether KNOWN_PREFIXES should be case-insensitive. The ODS data might contain tante or TANTE. A Set of lowercased prefixes with token.toLowerCase() lookup is defensive without adding complexity.

🏗️ Markus Keller -- Application Architect
Questions & Observations
- Schema change scope: Adding a nullable title column is clean. Making firstName nullable is the bigger change -- it ripples through every query, every ORDER BY, every search index that touches firstName. Does the full-text search index currently include firstName? If so, the search behavior changes when firstName is null. In PostgreSQL, concatenating a NULL column into the indexed text yields NULL unless it is wrapped in coalesce(), so such rows can drop out of text search entirely. A person with only a title and lastName might also become unsearchable by their title unless the search index is updated.
- Data model integrity: The title field is described as storing both formal titles (Dr., Frau) and relationship labels (Tante, Cousine). These are semantically different -- "Dr." is a property of the person, while "Tante" is a relationship between two people (Tante to whom?). Is this conflation intentional? In a family archive with a single-family perspective, it probably works fine. But it is worth acknowledging this is a simplification that would break if you ever needed to model "Tante to Person A but Cousine to Person B."
- Migration safety: The UPDATE persons SET title = first_name, first_name = NULL WHERE first_name IN (...) migration modifies existing data. This is irreversible in the sense that you lose the information that firstName was "Tante". Flyway migrations should be forward-only and safe. Consider: what if the list misses a prefix, and a real first name happens to match a future prefix addition? The migration as written is safe for the current list, but the approach of retroactively nulling data based on a hardcoded list is worth flagging.
- Display name assembly: Where does the [title] [firstName] lastName concatenation live? The issue lists it as a frontend concern across many files. This logic should live on the entity as a derived property (a @Transient getter or a @Formula column), so the backend API always returns a consistent displayName. The frontend should not independently assemble display names -- that is duplicated business logic across two codebases.
- VARCHAR(50) for title: Reasonable for known prefixes. "Freifrau" (8 chars) and potential compound titles like "Prof. Dr." (9 chars) fit easily. 50 is fine.

Suggestions
- Add a @Transient displayName getter on Person and include it in the API response. This is the single source of truth for how a person's name is displayed. The frontend reads person.displayName instead of assembling it from parts.
- If the full-text search includes person names, update the search index/query to include title in the searchable text. Otherwise "Tante Molly" becomes unsearchable after migration.
- Document the intentional conflation of titles and relationship labels in the migration SQL as a comment. Future maintainers will wonder why "Tante" is stored in a field called "title."
- The KNOWN_LAST_NAMES additions (von Geldern, von der Heide, von Staa, von Massenbach) from the comment should be part of this issue's scope, not deferred. They are required for correct parsing of the test cases listed.

🧪 Sara Holt -- QA Engineer & Test Strategist
Questions & Observations
The comment provides an excellent input/output table with 12 positive cases and 7 regression cases. This is a test plan in disguise. However, there are gaps:
- Input that is null, empty string, or whitespace-only? The parser must handle these gracefully.
- "Tante" with no name following? Does it become title="Tante", firstName=null, lastName=null? That would violate a lastName NOT NULL constraint if one exists.
- "Prof. Dr. Muller" -- does only the first token get stripped, or both? The pipeline description says "check if the first token is a known prefix" (singular), but academic titles can stack.
- "tante molly", "TANTE MOLLY" -- are these handled identically to "Tante Molly"?

The migration's UPDATE statement modifies existing data. This needs a before/after integration test: run the migration on a test database with known seed data and assert the resulting state. Flyway migrations are tested implicitly by Testcontainers, but the data transformation deserves an explicit assertion.
The "impact scope" lists 6+ frontend locations that need to handle null firstName. Each of these is a potential regression point. Without component tests for null-firstName rendering, a future refactor could reintroduce firstName + " " + lastName and display "null Molly".
The processing pipeline has a defined order: geb strip -> dot-normalize -> paren strip -> title strip -> findKnownLastName / fallback. This pipeline is a prime candidate for a parameterized test that feeds raw input through the full pipeline and asserts the final SplitName output. Each row in the comment tables becomes one @ParameterizedTest @CsvSource row.
Suggestions
Unit test layer (PersonNameParserTest):
Integration test layer:
- Seed rows with firstName IN ('Tante', 'Dr.', 'Frau'), run migration, assert title is populated and firstName is null
- PersonService.findOrCreateByAlias test: pass "Tante Molly" and assert the created person has title="Tante", firstName=null, lastName="Molly"

Frontend component tests (Vitest):
- Person with title="Tante", firstName=null, lastName="Molly" -- assert display shows "Tante Molly"

E2E smoke test (Playwright):
🛡️ Nora "NullX" Steiner -- Application Security Engineer
Questions & Observations
- Mass assignment on the new title field: When Person is used as a response entity (no response DTO), the title field will be serialized in API responses automatically. That is fine for reads. For writes, does DocumentUpdateDTO or any person-related DTO accept a title field? If the person edit form action processes formData.get('title'), ensure it is validated and sanitized. A freeform string field that ends up in HTML display is an XSS vector if not properly escaped. SvelteKit's {person.title} text interpolation escapes by default, but verify no {@html} is used for display name rendering.
- SQL injection in the migration: The migration uses a hardcoded IN ('Tante', 'Schwester', ...) list, which is safe (no user input). No issue here.
- The KNOWN_PREFIXES list is server-side only: Confirm the prefix list is not sent to the frontend or used in client-side validation. Client-side validation of titles would leak the prefix list, which is not a security risk per se, but keeping parsing logic server-side is the right boundary.
- Null firstName and search/filter injection: If any search query concatenates firstName into a JPQL or native SQL string, a null firstName could change query behavior. Verify all person search queries use parameterized queries and handle null fields gracefully. This is likely already the case given the project's use of Spring Data JPA, but the change from "always non-null" to "nullable" can surface edge cases in WHERE firstName LIKE :query clauses.
- Authorization on person edit: The issue does not change authorization boundaries, but adding a new field (title) to person edit means the existing @RequirePermission coverage applies. No new endpoints are introduced, so no new permission gaps. Confirmed no concern here.

Suggestions
- Verify that all display-name rendering in Svelte templates uses text interpolation ({person.title}) and not {@html}. A title like <script>alert(1)</script> entered via the edit form would be harmless with text interpolation but dangerous with {@html}.
- Add input validation on the title field: max length 50 (matching the DB column), alphanumeric + dots + spaces only. Reject HTML entities and control characters. This can be a simple @Size(max = 50) and @Pattern on the DTO field if a DTO is used, or a service-layer validation.
- If PersonService.findOrCreateByAlias creates persons from mass import input, the title value comes from parsed ODS data. ODS cell contents can contain arbitrary strings. Ensure the parser sanitizes or rejects unexpected characters in the extracted title before persisting.

🎨 Leonie Voss -- UI/UX Design Lead
Questions & Observations
- Display name format [title] [firstName] lastName: The issue defines the format but not the visual treatment. Should the title be displayed differently from the name? In a family archive context, "Tante" before a name is contextual metadata, not part of the proper name. Consider rendering the title in a lighter weight or smaller size to visually distinguish it: <span class="text-gray-500 font-sans text-sm">Tante</span> <span class="font-serif">Molly</span>. This maintains the information without making "Tante" look like a first name.
- Person list page: Currently shows firstName lastName in rows. With null firstNames, some rows will show just a last name while others show a full name. This creates visual inconsistency. With titles, "Tante Molly" and "Clara Muller" look structurally different. How does the list handle sorting? Alphabetical by lastName is correct, but display-wise, a person listed as just "Molly" (lastName only) next to "Clara Muller" (firstName + lastName) may confuse users who expect first names to always be present.
- PersonTypeahead and PersonMultiSelect chips: Chips currently show the person's name. With title + nullable firstName, chip text could become quite long ("Cousine Emmy Haniel") or very short ("Molly"). Are there min/max width considerations for chips? A chip showing just "Molly" with no context is ambiguous -- should the title appear in chips too?
- Accessibility: Null firstName means screen readers will read different patterns for different persons. "Tante Molly" vs "Clara Muller" vs just "Bakker". Ensure the aria-label on person links and chips includes the full display name including title, so screen reader users get the same contextual information as sighted users.
- Edit form: The issue mentions users can "later set a proper firstName/lastName via the edit page." Is the title field editable on the person edit page? If so, it needs a label, placeholder text, and possibly a dropdown or datalist of known titles to guide input. A freeform text field for title invites inconsistency ("Tante" vs "tante" vs "Aunt").

Suggestions
- Add a <datalist> or autocomplete dropdown for the title field on the person edit page, pre-populated with the known prefixes. This guides input without restricting it.
- On the person list, consider showing title in a muted style next to the name: Tante in text-gray-400 text-sm font-sans followed by Molly in font-serif. This provides context without visual clutter.
- For PersonTypeahead search results, show the title as a secondary label: the dropdown item shows "Molly" as primary and "Tante" as a dimmed prefix. This keeps the dropdown scannable.
- Ensure that person chips in PersonMultiSelect include the title when present, so users can distinguish "Tante Molly" from a different "Molly" who might exist without a title.
- Test the display at 320px viewport width with long titles like "Freifrau von Massenbach" -- this could overflow chip containers or truncate awkwardly. Set max-width with text-ellipsis on chips as a safety measure.

🐳 Tobias Wendt -- DevOps & Platform Engineer
Questions & Observations
- Flyway migration ordering: The issue references V{n}__add_person_title.sql without specifying the version number. With multiple open issues (#209, #210, #212) all adding migrations, there is a risk of version number collisions if branches are developed in parallel. Coordinate migration version numbers before merging. Flyway will fail hard on duplicate version numbers, and resolving conflicts in production is painful.
- Migration rollback: The migration adds a column and modifies existing data (UPDATE ... SET title = first_name, first_name = NULL). The column addition is safe. The data modification is not trivially reversible -- if you need to rollback, you would need a reverse migration that moves title back to firstName. Consider whether this warrants a separate migration file: V{n}__add_person_title_column.sql (schema only) and V{n+1}__backfill_person_titles.sql (data migration). This way the schema change can be rolled back independently of the data change.
- Deploy order: The issue changes both backend (new column, nullable firstName) and frontend (null-safe display). If the backend deploys first, the API starts returning firstName: null for migrated persons, but the old frontend may render "null Molly". If the frontend deploys first, it handles null gracefully but the backend has not migrated yet (no change). The safe deploy order is: frontend first (make it null-tolerant), then backend (migration + parser changes). Is this the planned order?
- CI impact: The PersonNameParserTest additions should not affect CI runtime meaningfully -- these are unit tests. The Flyway migration will run in Testcontainers integration tests automatically. No new Docker services or CI job changes are needed.
- No infrastructure changes: This issue is purely application-level. No new services, no new ports, no Docker Compose changes. The migration runs via Flyway on application startup as usual. Clean from an ops perspective.
Suggestions
Split the migration into two files: one for the schema change (add column, drop NOT NULL), one for the data backfill. This gives cleaner rollback granularity and makes the migration history easier to read.
Coordinate migration version numbers with #209 and #210 before implementation. A quick comment on each issue with the claimed version number prevents merge conflicts.
Plan the deploy as frontend-first, backend-second. The frontend change (null-safe display) is backward-compatible with the current API. The backend change (nullable firstName in API responses) requires the frontend to already handle it.
No changes needed to docker-compose.yml, CI workflow, or infrastructure configuration. This is a clean application-level change.

🏗️ Markus Keller -- Application Architect (Discussion Summary)
Interactive discussion with Marcel covering 4 open items. All resolved.
Resolved Items
- Dot-prefixed titles without trailing space -- Decision: two-pass approach in stripTitle(). First pass: startsWith check for dot-terminated prefixes (Dr., Prof.), no space required, handles Dr.von Gelden correctly. Second pass: space-based token check for word prefixes (Tante, Frau, Schwester, etc.). Two distinct matching strategies, explicit in the code.
- Stacked academic titles -- Decision: loop the stripping. Both passes (dot-prefix and space-token) run in a loop, accumulating stripped prefixes into the title string. Prof. Dr. Muller -> title="Prof. Dr.", firstName=null, lastName="Muller". The loop is trivial to implement and Prof. Dr. is too common in German academic contexts to ignore, even though the current ODS data doesn't contain a stacked example.
- Title vs relationship label conflation -- Decision: single title field, accept the simplification. "Dr." (credential) and "Tante" (relationship) are semantically different, but in a single-family archive with a fixed perspective, the distinction doesn't matter for display or search. One field, one extraction pass, one display logic. If the distinction ever matters, the data can be reclassified from the known prefix list.
- KNOWN_LAST_NAMES additions -- Decision: include in this issue. Five "von" last names are directly required for the test cases to pass: von Geldern, von der Heide, von Staa, von Massenbach, von Gelden. Without them, entries like Freifrau von Massenbach (after title strip -> von Massenbach) fall through to the wrong split path.

Scope After #213
With #213 handling SplitName redesign, nullable firstName, Person.getDisplayName(), migration, and frontend displayName refactor, this issue's scope is now:
- Add stripTitle() pipeline method (two-pass, looped)
- Add KNOWN_PREFIXES list to PersonNameParser
- Add the five "von" last names to KNOWN_LAST_NAMES

Clean, additive change on top of #213's foundation.
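The two-pass, looped stripping decided above can be sketched as follows. This is an illustration under stated assumptions, not the code from the branch: the class name TitleStripper and the Stripped record are hypothetical, the prefix lists are copied from the "Known prefix list" in the earlier comment (the implemented list may differ), and matching is case-sensitive here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; in the project this logic would live in PersonNameParser.
final class TitleStripper {
    // Dot-terminated prefixes: matched with startsWith, no trailing space required.
    static final List<String> DOT_PREFIXES = List.of("Prof.", "Dr.");
    // Word prefixes: matched only as a whole first token followed by a space.
    static final List<String> WORD_PREFIXES = List.of(
            "Frau", "Herr", "Freifrau", "Freiherr",
            "Tante", "Onkel", "Schwester", "Bruder", "Cousine", "Cousin",
            "Freundin", "Freund", "Mutter", "Vater", "Pastor", "Architekt");

    /** title is null when nothing was stripped; remainder is never empty for non-empty input. */
    record Stripped(String title, String remainder) {}

    static Stripped stripTitle(String name) {
        List<String> titles = new ArrayList<>();
        String rest = name.trim();
        boolean matched = true;
        while (matched) { // loop so stacked titles such as "Prof. Dr." accumulate
            matched = false;
            // Pass 1: dot-prefix match; only strip if something remains afterwards.
            for (String prefix : DOT_PREFIXES) {
                if (rest.startsWith(prefix) && rest.length() > prefix.length()) {
                    titles.add(prefix);
                    rest = rest.substring(prefix.length()).trim();
                    matched = true;
                    break;
                }
            }
            if (matched) {
                continue;
            }
            // Pass 2: first space-separated token must equal a known word prefix.
            int space = rest.indexOf(' ');
            if (space > 0 && WORD_PREFIXES.contains(rest.substring(0, space))) {
                titles.add(rest.substring(0, space));
                rest = rest.substring(space + 1).trim();
                matched = true;
            }
        }
        return new Stripped(titles.isEmpty() ? null : String.join(" ", titles), rest);
    }
}
```

Because both passes refuse to strip when nothing would remain, a lone "Tante" or "Dr." is returned unchanged as the remainder, which keeps lastName from ending up empty. The remainder still has to go through the known-last-name / fallback split.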
Implementation Complete
Parser logic implemented on branch feat/issues-209-213-person-parser-enhancements. Parts 1-2 (title field, nullable firstName, migration, displayName) were already done in #213.
Commits
73640ef
What changed
- Dot-prefixes (Dr., Prof.) matched without trailing space, word-prefixes (14 entries: Tante, Frau, Schwester, Cousine, etc.) matched at word boundary. Loops for stacked titles (Prof. Dr. Muller).
- Nickname goes to lastName with firstName=null. "Tante Molly" → title="Tante", firstName=null, lastName="Molly".
- KNOWN_LAST_NAMES additions: von der Heide, von Massenbach, von Geldern, von Gelden, von Staa (longest first for correct matching)

Test results