fix(user): findOrCreate Administrators group instead of blind-INSERT (#518) #519
Reference in New Issue
Block a user
Delete Branch "fix/issue-518-userdatainitializer-findorcreate-group"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Closes #518. Blind INSERT of the
AdministratorsUserGroup hituser_groups_name_keyon the second boot of the staging stack today after we DELETEd just the orphan admin user to retry the seed with the correctedAPP_ADMIN_USERNAME.Fix
Match the existing
findByName(...).orElseGet(...)pattern thatinitE2EDataalready uses for the "Leser" group:Tests
Two new cases in
AdminSeedFailClosedTest:reuses_existing_Administrators_group_when_seeding_a_new_admin— group pre-exists, new admin should attach to it;groupRepository.saveMUST NOT be called.creates_Administrators_group_when_seeding_admin_on_a_fresh_database— group absent, both inserts happen.Plus the existing tests updated to stub
groupRepository.savesince the happy path now flows through it.All 7 tests in
AdminSeedFailClosedTest+ 4 inAdminSeedPropertyKeyTestpass.🤖 Generated with Claude Code
Markus Keller — Senior Application Architect
Verdict: Approved with concerns
What this fix actually does
You replaced an application-level "uniqueness assumption" with
findOrCreate. That's fine and matches theinitE2EDataprecedent forLeser. The PR is small, focused, and reverts a prod-blocker. Ship it.But let me name what this incident really exposed, because the patch treats the symptom:
Architectural concerns
The real issue is that seeding is not idempotent. A boot-time seeder that creates two rows in two repository calls, with no transaction boundary spanning them, will leave torn state when anything in between throws — env-var validation, password encoding, even a
LazyInitializationExceptionon a future refactor. You just patched the one row pair we hit on staging. The next half-failure (e.g. a future "create Reader group + reader user" ininitAdminUser, or aWRITE_ALLpermission audit) reproduces the exact same class of bug.Concrete fix: wrap the seed body in
@Transactional(on a dedicated seeder service method, not on the@Configurationclass) so either both rows land or neither does. ThenfindOrCreatebecomes a belt-and-suspenders defense, not the only defense.Push uniqueness enforcement to the database — which you already do, then catch the violation.
user_groups_name_keyis doing its job: it blocked a duplicate insert. The application response to that constraint should not be "crash the context" — it should be "the constraint told us the row exists, look it up." EitherfindOrCreate(this PR) ortry { save } catch (DataIntegrityViolationException) { findByName.orElseThrow }. Both work.findOrCreatehas a TOCTOU race window if two contexts boot simultaneously, but for aCommandLineRunneron a single backend that's not realistic.initE2EDatahas the same pattern duplicated three times (admin block, "Leser" for reader, "Leser" for reset — all callingfindByName("Leser").orElseGet(save(...))). Extract aprivate UserGroup findOrCreateGroup(String name, Set<String> permissions)helper. Not a blocker for this PR, but file a follow-up.Documentation
No diagrams or
docs/adr/entry needed — this is bugfix scope. The comment block ininitAdminUser(lines 70–74) is good: it explains the why (a previous boot, an operator retry, the unique constraint), which is exactly what future readers need. That's the right kind of comment.Suggestions, not blockers
@Transactionalaround the seed block (separate PR, but file it before the next seed-related bug)findOrCreateGrouphelper in a follow-up cleanup PR@Serviceunderuser/rather than a@Configuration— it's data initialization, not configurationThe fix is correct for what it claims to fix. My concerns are about the larger pattern, not this patch.
Felix Brandt — Senior Fullstack Developer
Verdict: Approved
TDD evidence — good
Two new tests, both named as sentences describing behavior:
reuses_existing_Administrators_group_when_seeding_a_new_admincreates_Administrators_group_when_seeding_admin_on_a_fresh_databaseBoth follow Arrange-Act-Assert. The "reuse" test uses
verify(groupRepository, never()).save(...)— that's the load-bearing assertion, and it's precise. The "fresh DB" test verifies both inserts happen exactly once. Together they pin both branches of theorElseGet.You also updated the two existing happy-path tests to stub
groupRepository.findByName+save— necessary because the production path now flows through both. Good discipline; nothing left passing by coincidence.Code quality
The fix itself is six lines of meaningful diff. The Lombok builder reads cleanly inside the
orElseGetlambda:This is the same shape
initE2EDataalready uses forLeser. Consistency across both seed paths is the right move.The comment above the block explains the why (previous half-seed, operator retry, unique constraint), not the what. That's the rule. Good.
Minor suggestions (not blockers)
The permissions string list is duplicated nowhere else, but it's a magic literal. If you ever introduce a
WRITE_ALLrename or a new admin permission, this will be the spot you forget. Consider aprivate static final Set<String> ADMIN_PERMISSIONS = Set.of(...)constant — costs nothing, makes the intent explicit.initE2EDatahas the samefindByName("Leser").orElseGet(save(...))block twice (for the reader and reset users). After this PR lands, extract afindOrCreateGroup(String name, Set<String> permissions)helper and have all three call sites use it. KISS says extract on the third duplication — we're there. File a small follow-up.Neither blocks merge. The patch does exactly what the failing test demanded, and nothing more. Red → green achieved cleanly.
Tobias Wendt — DevOps & Platform Engineer
Verdict: Approved
This is a deployment reliability bug, full stop. Worth saying out loud what almost happened on staging today:
APP_ADMIN_USERNAME, created theAdministratorsgroup, then created an admin user with the wrong email.APP_ADMIN_USERNAMEandDELETEd the orphan user row to retry.userRepository.findByEmail(correct@email)returned empty → seed path re-entered →groupRepository.save(adminGroup)→DataIntegrityViolationExceptiononuser_groups_name_key→ Spring context aborts → backend will not start.The Compose stack health check would have correctly reported the backend as unhealthy. We would have had no admin user, no UI, and a stuck deployment. That's a P0 incident at 10pm on a Friday for any operator who isn't me. Fixing this is the right call.
Operational observations
The fix matches
initE2EData. That's good — same idiom in both seed paths means an operator debugging "why didn't my seed work" has one mental model, not two. Consistency across boot paths matters more than elegance.The seed is still not transactional across both rows. If
passwordEncoder.encode(adminPassword)ever throws (BCrypt OOM, missing bean), oruserRepository.save(admin)hits a constraint, the group lands and the user doesn't — same starting state that bit us today, just at a different step. This PR fixes the specific row pair we hit. The class of bug still exists. Worth a follow-up ticket: wrap the seed body in@Transactional.Retest the actual recovery scenario on staging. Before declaring done: deploy this branch, run the exact recovery flow (DELETE the user row, restart backend), confirm second boot succeeds without manual SQL. The unit tests prove the Java logic; staging proves the deployment story.
Runbook entry. Add a one-paragraph note to
docs/infrastructure/(or wherever ops notes live) describing the recovery procedure: "If admin seed fails partway, you can DELETE the admin user row and restart — the group will be reused." This is the documentation the next on-call needs.Nothing infrastructure-shaped is broken
Ship it. File the follow-ups.
Elicit — Requirements Engineer
Verdict: Approved
This PR is in scope as a bug fix against a confirmed prod-blocker. The requirements story is clean: issue #518 names the failure, the PR description names the fix, the staging scenario is documented in the commit context. Traceability holds — incident → issue → PR → tests.
Definition of Ready (retrospective check on #518)
Definition of Done (this PR)
Requirements observations
The original requirement was under-specified. Issue #513 introduced the fail-closed default-credentials check. Issue #518 reveals that "seed an admin" was treated as a one-shot operation, never as an operation that might be retried after partial failure. The implicit assumption — "the database is empty on first boot, and stays consistent thereafter" — was a hidden assumption that production proved wrong.
Suggestion: when the seeder is touched next, add to its design notes an explicit non-functional requirement:
No gap analysis needed for this PR. It's a targeted fix, not a feature.
Open question (TBD): Are there other startup seeders in the codebase (Tag defaults? permission defaults? OCR model registration?) with the same blind-INSERT pattern? If yes, those are latent #518s. Worth a 30-minute backlog audit. File as a chore: "Audit boot-time seeders for idempotency."
Nothing blocking
The PR is INVEST-compliant against issue #518. Acceptance criteria are testable and tested. Approved from the requirements side. The follow-ups above are backlog items, not gates on this merge.
Nora "NullX" Steiner — Application Security Engineer
Verdict: Approved
I came in expecting to flag something around the admin-seed path — it's a high-value target, the comment in the file (lines 67–76) says "fail-closed in production," and any change touching it deserves scrutiny. After reading the diff, the existing default-credential fail-closed check in
initAdminUseris untouched, so the security posture established in #513 is preserved.What I checked
No change to the credential validation block.
IllegalStateExceptionstill fires when(!isLocalProfile && (DEFAULT_ADMIN_EMAIL.equals(adminEmail) || DEFAULT_ADMIN_PASSWORD.equals(adminPassword))). Good — that gate is the load-bearing security control here, and the PR does not weaken it.The new
findOrCreatedoes not create an unauthenticated bypass. A pre-existingAdministratorsgroup with the canonical name will be reused — but the group is just a permission bag; reusing it does not grant any admin privilege to anyone. The new admin user is still gated by thefindByEmail(adminEmail).isEmpty()check above it, and the fail-closed credential check still runs first. There is no path here where a seed runs without a strong password.No injection vector introduced.
"Administrators"is a literal string passed to a derived JPA query (findByName). Named parameter under the hood. Clean.Permissions set is unchanged. Still
ADMIN, READ_ALL, WRITE_ALL, ANNOTATE_ALL, ADMIN_USER, ADMIN_TAG, ADMIN_PERMISSION. No silent permission grant or removal.Tests assert the security-relevant behavior.
verify(groupRepository, never()).save(...)proves the existing group is reused, not replaced — important, because a replacement would mean a future admin seeded against a tampered group with weaker permissions could inherit those. The test pins the "reuse exact existing row" contract.One observation — not a blocker
The "reuse existing group" path means: if an attacker with DB write access ever modified the
Administratorsgroup's permissions row (e.g. droppedADMIN_PERMISSION), the next seed of a fresh admin user would attach to the tampered group, not re-create the canonical permission set. The blind-INSERT version of the code had the same property in practice (the tampered group would have stayed, and the new seed would have crashed exactly as #518 describes — slightly more visible). This is a low-severity defense-in-depth observation, not a finding against this PR: an attacker withUPDATE user_groupsaccess has already won. Don't change anything here. Just noting it for the threat model.Threat-model artifact for the file
The inline comment block at lines 70–74 explains why
findOrCreateis safe (recoverable seed retry). Consider adding a one-liner sibling comment that names the assumption — something like:Documenting the threat-model boundary is more valuable than any other comment in this file. Not a merge blocker.
Approved.
Sara Holt — Senior QA Engineer
Verdict: Approved with concerns
The tests are well-shaped. The concerns are about what they don't cover, not about what they do.
What's good
Test names read as sentences.
reuses_existing_Administrators_group_when_seeding_a_new_admincreates_Administrators_group_when_seeding_admin_on_a_fresh_databaseWhen either fails in CI, the developer knows what broke without opening the file. This is the standard.
One logical assertion per test branch. The "reuse" test pins two things —
verify(groupRepository, never()).save(...)AND the captor on the savedAppUserconfirminggetGroups()contains the pre-existing group. That's not assertion sprawl; it's verifying the same behavior end-to-end (the group is reused AND it's actually attached to the new admin). Acceptable.Existing tests updated, not bypassed. The two pre-existing happy-path tests (
allows_seed_when_both_values_are_set_and_profile_is_not_devandallows_seed_with_defaults_when_profile_is_dev) were updated to stubgroupRepository.findByNameandgroupRepository.save. That's the right move — when the production path changes, the tests must reflect it, not paper over it. No tests were@Disabledto make this PR go green.Arrange-Act-Assert structure is clean in both new tests. Factory-style construction via
UserGroup.builder()directly inline. Acceptable for two cases; if a fifth case appears, extract amakeAdminGroup()helper.Concerns
Unit-test only. No integration test against real Postgres. This is the file's pre-existing state, not a regression, but worth naming: the bug we shipped lived in the interaction between two writes against a real
UNIQUEconstraint onuser_groups.name. The Mockito tests cannot catch a future regression where, say, the constraint is renamed in a migration but the seed code keeps the old name. AnAdminSeedIntegrationTestwith@SpringBootTest + Testcontainers (postgres:16-alpine)that exercises the actualCommandLineRunneragainst a real DB — and a real partial-state setup (insert group, no user, then run the seeder) — is the test that would have caught #518 before staging. Strongly recommend filing as a follow-up.No test for "group exists with different permissions." If a future migration changes the
Administratorspermission set, theorElseGetbranch will silently keep the old, stale permissions on the existing group. That's a real risk for a permission system. A test asserting "permissions on the reused group are not silently mutated by the seeder, and a stale permission set is detectable" would be valuable.The two-line stub of
groupRepository.savein the two happy-path tests usesinv -> inv.getArgument(0). Correct, but if anyone copies this pattern they should know it returns the un-persisted entity (no generated ID). For these tests it doesn't matter because we don't assert on the group's ID downstream. Worth a one-line comment in the test, or a_savedGroup_returns_inputhelper.Quality gates
AdminSeedFailClosedTestcases pass (per PR description)AdminSeedPropertyKeyTestcases pass (per PR description)Approved on the unit-test changes. The integration-test gap is a follow-up, not a blocker for this specific patch.
Leonie Voss — UX & Accessibility
Verdict: Approved — LGTM, no UX surface
Checked the diff: two files, both backend Java (
UserDataInitializer.javaandAdminSeedFailClosedTest.java). No Svelte components, no routes, no Tailwind classes, no copy strings, no error-message surfaces visible to users.What I checked anyway (because the failure mode is user-facing)
When this bug fires in production, the user experience is "the application will not load at all" — the Spring context aborts and the backend serves nothing. From a UX standpoint that's the worst-possible degraded state: no error page, no support hint, no path forward for a non-technical operator. So fixing it improves the UX outcome by removing a hard failure mode, even though the fix itself touches no UI.
One small thing the team might consider for a future PR — and explicitly not blocking this one:
req_engineerreview) cannot act on that. A clearly wordedlog.error("Admin seed failed: ...")with a one-line "what to do" — pointing to a runbook — would close that gap. Tobias mentioned runbook in his review; this lives in the same neighborhood.Nothing for this PR. Approved.