feat(massimport): support // as multi-person separator in PersonNameParser #182

New Issue

marcel · 2026-04-06T13:52:40+02:00

marcel commented

2026-04-06 13:52:40 +02:00

Problem

The mass import spreadsheet uses // to separate independent full names in a single cell, but PersonNameParser.parseReceivers() only splits on und/u. This causes both names to be imported as a single garbage person record.

Examples currently broken:

Charl.Blomquist//Tante Lolly → one person instead of two
Walter de Gruyter//Eugenie de Gruyter → one person instead of two

Solution

Pre-split the raw input on // before any existing logic. Each //-segment is then piped through the current pipeline independently, and results are flattened.

Unlike und/u, // always separates fully independent, complete names — no shared last-name distribution is needed. The recursive approach handles mixed cases like Herbert und Clara Cram//Eugenie de Gruyter naturally.

Changes

`PersonNameParser.java`

Add a SLASH_SEPARATOR pattern and a pre-split step at the top of parseReceivers():

private static final Pattern SLASH_SEPARATOR = Pattern.compile("//");

// In parseReceivers(), after null/blank check:
String[] slashParts = SLASH_SEPARATOR.split(raw);
if (slashParts.length > 1) {
    return Arrays.stream(slashParts)
            .map(String::trim)
            .filter(s -> !s.isBlank())
            .flatMap(segment -> parseReceivers(segment).stream())
            .toList();
}

`PersonNameParserTest.java`

4 new tests (TDD — red first):

slashSeparator_twoIndependentFullNames
slashSeparator_abbreviatedFirstName
slashSeparator_withSpacesAroundSlashes
slashSeparator_segmentContainsUnd

Files

File	Change
`backend/src/main/java/.../service/PersonNameParser.java`	Add constant + pre-split step
`backend/src/test/java/.../service/PersonNameParserTest.java`	4 new tests

No schema, API, or i18n changes needed.

Verification

cd backend && ./mvnw test -Dtest=PersonNameParserTest

## Problem The mass import spreadsheet uses `//` to separate independent full names in a single cell, but `PersonNameParser.parseReceivers()` only splits on `und`/`u`. This causes both names to be imported as a single garbage person record. Examples currently broken: - `Charl.Blomquist//Tante Lolly` → one person instead of two - `Walter de Gruyter//Eugenie de Gruyter` → one person instead of two ## Solution Pre-split the raw input on `//` **before** any existing logic. Each `//`-segment is then piped through the current pipeline independently, and results are flattened. Unlike `und`/`u`, `//` always separates fully independent, complete names — no shared last-name distribution is needed. The recursive approach handles mixed cases like `Herbert und Clara Cram//Eugenie de Gruyter` naturally. ## Changes ### `PersonNameParser.java` Add a `SLASH_SEPARATOR` pattern and a pre-split step at the top of `parseReceivers()`: ```java private static final Pattern SLASH_SEPARATOR = Pattern.compile("//"); // In parseReceivers(), after null/blank check: String[] slashParts = SLASH_SEPARATOR.split(raw); if (slashParts.length > 1) { return Arrays.stream(slashParts) .map(String::trim) .filter(s -> !s.isBlank()) .flatMap(segment -> parseReceivers(segment).stream()) .toList(); } ``` ### `PersonNameParserTest.java` 4 new tests (TDD — red first): - `slashSeparator_twoIndependentFullNames` - `slashSeparator_abbreviatedFirstName` - `slashSeparator_withSpacesAroundSlashes` - `slashSeparator_segmentContainsUnd` ## Files | File | Change | |---|---| | `backend/src/main/java/.../service/PersonNameParser.java` | Add constant + pre-split step | | `backend/src/test/java/.../service/PersonNameParserTest.java` | 4 new tests | No schema, API, or i18n changes needed. ## Verification ```bash cd backend && ./mvnw test -Dtest=PersonNameParserTest ```

marcel added the feature label 2026-04-06 13:52:46 +02:00

marcel commented

2026-04-06 19:28:26 +02:00

👨‍💻 Felix Brandt — Senior Fullstack Developer

Questions & Observations

The recursive approach (flatMap(segment -> parseReceivers(segment).stream())) is elegant and handles the Herbert und Clara Cram//Eugenie de Gruyter mixed case naturally. Good design — the recursion lets each segment reuse all existing logic without duplication.
The 4 proposed test names are well-structured and follow the [feature]_[scenario] naming convention. Each covers a distinct dimension of the feature.
I notice the issue doesn't mention an edge case: what happens with trailing or leading //? E.g., //Herbert Cram or Herbert Cram//. The .filter(s -> !s.isBlank()) handles this, but an explicit test confirming the behavior would make the contract clear.
What about a single // with no names around it (i.e., //)? Again, the blank filter should handle it, but a test documents intent.

Suggestions

Consider adding a 5th test: slashSeparator_trailingOrLeadingSlash_ignored — to document that //Herbert Cram produces one person, not two. This makes the guard clause's behavior explicit and regression-proof.
The SLASH_SEPARATOR constant name is good — it reveals intent. Since Pattern.compile is used, confirm it's static final so the regex is compiled once (as shown in the issue — just calling it out for the implementer).
The pre-split step should be the very first thing after the null/blank check, before any other normalization. The issue says "after null/blank check" — just confirming this is the right place so that // splitting happens on raw input, not on input that's already been partially normalized.

## 👨‍💻 Felix Brandt — Senior Fullstack Developer ### Questions & Observations - The recursive approach (`flatMap(segment -> parseReceivers(segment).stream())`) is elegant and handles the `Herbert und Clara Cram//Eugenie de Gruyter` mixed case naturally. Good design — the recursion lets each segment reuse all existing logic without duplication. - The 4 proposed test names are well-structured and follow the `[feature]_[scenario]` naming convention. Each covers a distinct dimension of the feature. - I notice the issue doesn't mention an edge case: what happens with trailing or leading `//`? E.g., `//Herbert Cram` or `Herbert Cram//`. The `.filter(s -> !s.isBlank())` handles this, but an explicit test confirming the behavior would make the contract clear. - What about a single `//` with no names around it (i.e., `//`)? Again, the blank filter should handle it, but a test documents intent. ### Suggestions - Consider adding a 5th test: `slashSeparator_trailingOrLeadingSlash_ignored` — to document that `//Herbert Cram` produces one person, not two. This makes the guard clause's behavior explicit and regression-proof. - The `SLASH_SEPARATOR` constant name is good — it reveals intent. Since `Pattern.compile` is used, confirm it's `static final` so the regex is compiled once (as shown in the issue — just calling it out for the implementer). - The pre-split step should be the **very first thing** after the null/blank check, before any other normalization. The issue says "after null/blank check" — just confirming this is the right place so that `//` splitting happens on raw input, not on input that's already been partially normalized.

marcel commented

2026-04-06 19:28:36 +02:00

🏗️ Markus Keller — Application Architect

Questions & Observations

The change is well-scoped: a single method gets a 6-line pre-split step, no new classes, no new interfaces, no schema changes. This is the right level of intervention for a parser enhancement.
The recursive design is architecturally sound — each //-segment re-enters parseReceivers(), so all existing logic (und/u splitting, last-name distribution) is reused without duplication. No new code paths to maintain.
No cross-domain impact: PersonNameParser is a pure utility within the service layer, no repository or controller changes needed.

Suggestions

One thing to confirm: does the spreadsheet data ever contain a single / as a separator (as opposed to //)? If so, the // pattern won't match, and those cases will silently pass through as single names. Worth checking the actual import data to be sure // is the only slash-based separator in use.
The Pattern.compile("//") is straightforward, but since // has no special regex meaning, String.split("//") would also work and avoids the compiled pattern overhead. That said, for a utility called at import time (not hot path), the difference is negligible — either approach is fine. The compiled pattern is arguably more readable since it sits as a named constant.

## 🏗️ Markus Keller — Application Architect ### Questions & Observations - The change is well-scoped: a single method gets a 6-line pre-split step, no new classes, no new interfaces, no schema changes. This is the right level of intervention for a parser enhancement. - The recursive design is architecturally sound — each `//`-segment re-enters `parseReceivers()`, so all existing logic (und/u splitting, last-name distribution) is reused without duplication. No new code paths to maintain. - No cross-domain impact: `PersonNameParser` is a pure utility within the service layer, no repository or controller changes needed. ### Suggestions - One thing to confirm: does the spreadsheet data ever contain a single `/` as a separator (as opposed to `//`)? If so, the `//` pattern won't match, and those cases will silently pass through as single names. Worth checking the actual import data to be sure `//` is the only slash-based separator in use. - The `Pattern.compile("//")` is straightforward, but since `//` has no special regex meaning, `String.split("//")` would also work and avoids the compiled pattern overhead. That said, for a utility called at import time (not hot path), the difference is negligible — either approach is fine. The compiled pattern is arguably more readable since it sits as a named constant.

marcel commented

2026-04-06 19:28:45 +02:00

🧪 Sara Holt — QA Engineer

Questions & Observations

The 4 proposed tests cover the core scenarios well: two independent names, abbreviated first name, whitespace tolerance, and mixed // + und. Good coverage of the feature's interaction with existing logic.
Missing edge case tests I'd want to see:
- Empty segment: Herbert Cram// or //Herbert Cram — does the blank filter correctly discard the empty segment?
- Multiple consecutive separators: Herbert Cram////Eugenie de Gruyter — does this produce empty segments that get filtered, or does it break?
- Single name with no separator: Regression test to confirm that inputs without // still pass through unchanged.

Suggestions

The test slashSeparator_segmentContainsUnd is the most valuable of the four — it tests the interaction between the new // split and the existing und split. I'd make this the first test written (red phase) since it exercises the full recursive pipeline.
Consider a regression test that runs all existing parseReceivers tests after the change to confirm no behavior change for inputs without //. This is probably already covered by running the full test class, but worth explicitly noting in the PR.
Verification command ./mvnw test -Dtest=PersonNameParserTest is correct. I'd also recommend running the full MassImportService test suite (if it exists) to catch any integration-level regressions.

## 🧪 Sara Holt — QA Engineer ### Questions & Observations - The 4 proposed tests cover the core scenarios well: two independent names, abbreviated first name, whitespace tolerance, and mixed `//` + `und`. Good coverage of the feature's interaction with existing logic. - Missing edge case tests I'd want to see: - **Empty segment**: `Herbert Cram//` or `//Herbert Cram` — does the blank filter correctly discard the empty segment? - **Multiple consecutive separators**: `Herbert Cram////Eugenie de Gruyter` — does this produce empty segments that get filtered, or does it break? - **Single name with no separator**: Regression test to confirm that inputs without `//` still pass through unchanged. ### Suggestions - The test `slashSeparator_segmentContainsUnd` is the most valuable of the four — it tests the interaction between the new `//` split and the existing `und` split. I'd make this the first test written (red phase) since it exercises the full recursive pipeline. - Consider a regression test that runs all existing `parseReceivers` tests after the change to confirm no behavior change for inputs without `//`. This is probably already covered by running the full test class, but worth explicitly noting in the PR. - Verification command `./mvnw test -Dtest=PersonNameParserTest` is correct. I'd also recommend running the full `MassImportService` test suite (if it exists) to catch any integration-level regressions.

marcel commented

2026-04-06 19:28:51 +02:00

🔒 Nora "NullX" Steiner — Security Engineer

Questions & Observations

This is a pure parsing change in a backend utility class that processes spreadsheet data from authenticated admin users (mass import). The attack surface is minimal — this is not processing untrusted external input in a web request context.
No new endpoints, no new user input vectors, no changes to authentication or authorization.
The recursive call parseReceivers(segment) is bounded by the number of //-segments in a single spreadsheet cell, which is naturally small. No risk of stack overflow from malicious input at realistic cell sizes.

Suggestions

No security concerns from my angle. The change is confined to a trusted-input parser within the service layer, and the recursive depth is bounded by the input data (spreadsheet cell content). No new attack vectors introduced.

## 🔒 Nora "NullX" Steiner — Security Engineer ### Questions & Observations - This is a pure parsing change in a backend utility class that processes spreadsheet data from authenticated admin users (mass import). The attack surface is minimal — this is not processing untrusted external input in a web request context. - No new endpoints, no new user input vectors, no changes to authentication or authorization. - The recursive call `parseReceivers(segment)` is bounded by the number of `//`-segments in a single spreadsheet cell, which is naturally small. No risk of stack overflow from malicious input at realistic cell sizes. ### Suggestions - No security concerns from my angle. The change is confined to a trusted-input parser within the service layer, and the recursive depth is bounded by the input data (spreadsheet cell content). No new attack vectors introduced.

marcel commented

2026-04-06 19:28:55 +02:00

🎨 Leonie Voss — UI/UX Design Lead

Questions & Observations

This is a backend-only parsing change with no UI impact. No frontend components, routes, or user-facing text are affected.
The // separator is a convention within the import spreadsheet, not something end users interact with through the web UI.

Suggestions

No concerns from my angle. I checked the scope — this change is entirely within PersonNameParser.java and its test file. No UI, accessibility, or design implications.

## 🎨 Leonie Voss — UI/UX Design Lead ### Questions & Observations - This is a backend-only parsing change with no UI impact. No frontend components, routes, or user-facing text are affected. - The `//` separator is a convention within the import spreadsheet, not something end users interact with through the web UI. ### Suggestions - No concerns from my angle. I checked the scope — this change is entirely within `PersonNameParser.java` and its test file. No UI, accessibility, or design implications.

marcel commented

2026-04-06 19:28:59 +02:00

🛠️ Tobias Wendt — DevOps Engineer

Questions & Observations

Pure Java code change, no infrastructure impact. No new dependencies, no config changes, no Docker or CI modifications needed.
The verification command (./mvnw test -Dtest=PersonNameParserTest) is a fast unit test — no Testcontainers or external services required. CI time impact is negligible.

Suggestions

No concerns from my angle. This is a self-contained parser enhancement that doesn't touch infrastructure, dependencies, or deployment configuration.

## 🛠️ Tobias Wendt — DevOps Engineer ### Questions & Observations - Pure Java code change, no infrastructure impact. No new dependencies, no config changes, no Docker or CI modifications needed. - The verification command (`./mvnw test -Dtest=PersonNameParserTest`) is a fast unit test — no Testcontainers or external services required. CI time impact is negligible. ### Suggestions - No concerns from my angle. This is a self-contained parser enhancement that doesn't touch infrastructure, dependencies, or deployment configuration.

marcel referenced this issue

2026-04-06 19:29:39 +02:00

feat(massimport): handle dot-compressed names and titles in PersonNameParser.split() #184

marcel referenced this issue

2026-04-07 09:22:53 +02:00

feat(massimport): support // separator and dot-compressed names in PersonNameParser #190

marcel closed this issue

2026-04-07 09:22:59 +02:00

Sign in to join this conversation.