feat(auth): defense-in-depth — CSRF, session revocation, login rate limit #524
Context
This is Phase 2 of 2 splitting #522. Blocked by #523 (Phase 1 — server-side session model).
#523 replaces the cookie-as-credential model with opaque server-side sessions. With sessions in place, several defenses become possible that were inexpressible before:
- … `csrf.disable()` so the cutover stays scoped).
- … `spring_session` rows.

This issue lands all three. Unlike #523, none of these are breaking — they are additive defenses on top of a working session model.
⚠ Hard dependency
Blocked by #523. Do not start until that ships. The CSRF synchronizer token cookie depends on
`HttpSession` being a real thing; the session-deletion paths depend on `spring_session` existing; the rate-limit error UX depends on `INVALID_CREDENTIALS` already being a frontend-mapped code.

User need
Functional requirements
FR-AUTH-004 — Whenever an authenticated user changes their own password, all
`spring_session` rows for that user OTHER than the current session SHALL be deleted. The current session SHALL be preserved (do not force the user to re-login on the device where they initiated the change).

FR-AUTH-005 — There SHALL be an admin-only endpoint
`POST /api/admin/users/{userId}/force-logout` (or equivalent — final path is the implementer's call) that deletes all `spring_session` rows for the target user. Permission required: `ADMIN_USER` (existing permission — no new permission needed unless the implementer disagrees). UI surface is deferred — backend capability + audit log entry only in this issue.

FR-AUTH-008 — A successful password reset via
`POST /api/auth/reset-password` SHALL delete all `spring_session` rows for the affected user. Unlike FR-AUTH-004, there is no "current session to preserve" — the password-reset flow is unauthenticated.

FR-AUTH-009 (CSRF) — Spring Security's CSRF filter SHALL be re-enabled in
`SecurityConfig` using `CookieCsrfTokenRepository.withHttpOnlyFalse()` (XSRF-TOKEN cookie pattern — decided in #522 review). State-changing requests (POST/PUT/PATCH/DELETE) without a valid `X-XSRF-TOKEN` header echoing the cookie's token SHALL return 403 with `ErrorCode.CSRF_TOKEN_MISSING`. GET/HEAD/OPTIONS/TRACE are exempt (Spring default).

FR-AUTH-010 (rate limit) —
`POST /api/auth/login` SHALL enforce a per-(IP, email) rate limit at the application layer using Bucket4j (or equivalent in-process token bucket). After 5 failed attempts within a 15-minute window, further attempts from that (IP, email) pair SHALL return HTTP 429 with `ErrorCode.TOO_MANY_LOGIN_ATTEMPTS`. The limit SHALL be independent of any reverse-proxy-layer protection (fail2ban via Caddy). A successful login SHALL reset the bucket.

Non-functional requirements
NFR-SEC-103 — CSRF protection on state-changing endpoints SHALL be enforced by two independent mechanisms:
- … `SameSite=strict` attribute (from #523)
- … `XSRF-TOKEN` cookie that must be echoed in `X-XSRF-TOKEN` on writes

Either alone is insufficient. Both SHALL pass for a state-changing request to succeed.
NFR-SEC-104 — Login attempts SHALL be rate-limited at the application layer per the FR-AUTH-010 thresholds. The 5/15min window MAY be tuned by configuration property without code change.
NFR-OBSV-101 (completion of #523's partial) — Admin force-logout SHALL emit
`AuditKind.ADMIN_FORCE_LOGOUT` audit entries with payload `{ adminUserId, targetUserId, sessionsRevokedCount, ip, ua }`. Bulk session deletion via password change/reset SHALL emit one `LOGOUT` entry per session deleted, with payload `{ userId, ip, ua, reason: "password_change" | "password_reset" | "admin_force_logout" }`.

NFR-OBSV-102 (new) — Rate-limited login attempts SHALL emit
`AuditKind.LOGIN_RATE_LIMITED` entries with payload `{ email, ip, ua, attemptsInWindow }`. This is distinct from `LOGIN_FAILED` — the latter means "credentials wrong"; the former means "stop knocking."

NFR-COMPAT-101 (extends #523's) — Existing dev tooling SHALL continue to work without code changes:
- … `XSRF-TOKEN` and `X-XSRF-TOKEN` headers in both directions
- … `e2e` profile's admin seed is unaffected

Acceptance criteria
Resolved decisions (carried from #522 review)
- … `use:enhance`; cleaner than threading a JSON field through the login response

Implementation hints (non-binding)
- `com.bucket4j:bucket4j-core` (pin version). In-process buckets keyed by `(InetAddress, email-lowercased)`. Bucket configuration: 5 tokens, refill 5 per 15 min. Wire as a filter or interceptor in front of `AuthController.login`.
- … `SecurityConfig:54` ("CSRF disabled because SameSite holds the fort").
- … `FindByIndexNameSessionRepository.findByPrincipalName(email)` and `SessionRepository.deleteById(id)`. Wrap in a small `SessionRegistryService` inside the `auth` package (created in #523).
- … `AuthService.invalidateAllSessionsForUser(userId, reason)` and `AuthService.invalidateAllSessionsExceptCurrent(userId, currentSessionId, reason)`. Called by:
  - `UserService.changeOwnPassword(...)` → exceptCurrent
  - `PasswordResetService.resetPassword(...)` → all
  - `UserService.disableUser(...)` → all (Sara's amendment)
  - `AdminController.forceLogout(userId)` → all
- … `CSRF_TOKEN_MISSING` 403s and shows:
- `scripts/force-logout-user.sh`: small bash helper that POSTs to the force-logout endpoint with admin credentials from env vars. Document in `docs/infrastructure/`.
- … `@RequirePermission` table already covers admin force-logout if `ADMIN_USER` is reused).
- … `SELECT COUNT(*) FROM spring_session` (gauge)
- … `rate({app="backend"} |= "LOGIN_FAILED" [5m]) > 10` — likely brute force
- … `pg_dump` exclusion: `--exclude-table=spring_session*` in the backup script. Document in `docs/infrastructure/`. Restores lose all sessions, which is acceptable.

New ErrorCodes (this issue)
- `CSRF_TOKEN_MISSING` → `error.csrf_token_missing`
- `TOO_MANY_LOGIN_ATTEMPTS` → `error.too_many_login_attempts`

Both must be added to
`ErrorCode.java`, mirrored in `frontend/src/lib/shared/errors.ts`, given a `case` in `getErrorMessage()`, and have i18n keys in all three locale files (per `CLAUDE.md` error-handling reminder).

New AuditKinds (this issue)
- `ADMIN_FORCE_LOGOUT`
- `LOGIN_RATE_LIMITED`
- … `LOGOUT` payload with optional `reason` field — schema-level)

Out of scope
Priority & sizing
Test strategy
- `AuthService.invalidateAllSessionsForUser`, `invalidateAllSessionsExceptCurrent`; Bucket4j wrapper logic; CSRF token validation edge cases
- `AuthController.login` rate-limit path (429 + audit); admin force-logout 200/403/401; CSRF 403 paths
- `@ParameterizedTest @MethodSource("writeEndpoints")` — every POST/PUT/PATCH/DELETE in the OpenAPI spec returns 403 without CSRF (Sara's pattern — catches future controllers added without thinking)
- … `CSRF_TOKEN_MISSING` with the reload UX; rate-limit error shows the localized message
- … `browser.newContext()` instances logged in as same user; change password in one; verify the OTHER context's next request gets 401
- … `/api/auth/login` with wrong password from one IP — verify 429 kicks in at attempt 6, audit log has the expected rows

Don't mock
`SessionRepository` in integration tests — Spring Session JDBC semantics depend on the real Postgres-backed implementation (Sara).

Definition of Done
- `ErrorCode.CSRF_TOKEN_MISSING`, `TOO_MANY_LOGIN_ATTEMPTS` added everywhere (Java + TS + i18n × 3)
- `AuditKind.ADMIN_FORCE_LOGOUT`, `LOGIN_RATE_LIMITED` added; `LOGOUT` payload extended with `reason`
- `scripts/force-logout-user.sh` lands + documented in `docs/infrastructure/`
- `pg_dump` excludes `spring_session*` (backup script + doc)

— Filed as part of splitting #522 (replaces it together with #523). Closes the Phase-2 portion of the auth-model rewrite.
🏛️ Markus Keller — Senior Application Architect
Observations
- … `AuthService.invalidateAllSessionsForUser` / `invalidateAllSessionsExceptCurrent`, both living in the new `auth` package from #523) is correct. Three callers from three different domains — `UserService.changeOwnPassword`, `PasswordResetService.resetPassword`, and the new force-logout controller — all going through the published `AuthService` interface. Exactly the cross-domain pattern the layering rule wants.
- … `POST /api/admin/users/{userId}/force-logout` with permission `ADMIN_USER`. But the existing `AdminController` (`backend/.../user/AdminController.java`) is class-level `@RequirePermission(Permission.ADMIN)` and hosts infra ops (mass-import, backfills, thumbnails). The `/api/admin/*` namespace currently means "needs `Permission.ADMIN`". User-admin operations live in `UserController` under `ADMIN_USER`. The proposed endpoint mixes those two surfaces.

Recommendations
- … `UserController`, not `AdminController`. URL `POST /api/users/{userId}/force-logout`, permission `@RequirePermission(Permission.ADMIN_USER)`. Reasons: (a) keeps the `/api/admin/*` namespace pure `Permission.ADMIN` — no exceptions to grep through during an audit; (b) `UserController` already owns user-admin endpoints under `ADMIN_USER`; (c) the layering rule says cross-domain data access goes through services — `UserController.forceLogout` calls `userService.findById(...)` then `authService.invalidateAllSessionsForUser(...)`, the textbook cross-package boundary.
- `UserService.disableUser(...)` is not in the code — `AppUser.enabled` is set to `true` on create at `UserService.java:70/106/128` and never flipped. Either land a minimal disable endpoint here (own FR + AC + `Permission.ADMIN_USER` + audit) or split that AC out into a follow-up issue. I'd split — keeps this PR scoped to the three things its title promises.
- … `docs/adr/`. Suggested title: `ADR-NNNN — defense-in-depth additions: CSRF tokens, session revocation, login rate limit`. Context = post-#523 baseline. Decision = three additions. Alternatives = (1) relying on SameSite + CORS alone for CSRF, (2) keeping rate-limiting at fail2ban only. Consequences = in-process bucket state (no cluster support), CSRF cookie complexity in tests.
- `CLAUDE.md` — error-code section gets two new entries (`CSRF_TOKEN_MISSING`, `TOO_MANY_LOGIN_ATTEMPTS`)
- `docs/ARCHITECTURE.md` — permission table updated if `ADMIN_USER`'s effective scope is meaningfully expanded
- `docs/architecture/c4/seq-auth-flow.puml` — add the CSRF token cookie round-trip arrows (was already rewritten in #523; this is an amendment)
- … `LOGOUT` (now carries `reason`) and the two new AuditKinds

Open Decisions
- … `/api/admin/users/{id}/force-logout` (as written in the issue, under `AdminController` requiring `ADMIN_USER`) vs `/api/users/{id}/force-logout` (under `UserController`, matches the existing `ADMIN_USER` pattern there). The former matches admin-tool clustering; the latter keeps each controller's class-level permission consistent. I recommend the latter for the consistency reason.

👨💻 Felix Brandt — Senior Fullstack Developer
Observations
- `UserService.changePassword(UUID, ChangePasswordDTO)` at `UserService.java:236` does NOT currently take the current session ID. To preserve the current session per FR-AUTH-004 it has to learn about the session at call time. Service stays servlet-free; the controller does the extraction.
- … `(InetAddress, email-lowercased)`. `InetAddress` is a poor map key — building one from a string with `InetAddress.getByName(...)` can trigger DNS lookups (`.equals()` itself only compares address bytes) and the contract is brittle across IPv4/IPv6 dual-stack hosts. Bucket4j's own docs use `String` keys.

Recommendations
- … `AuthService` shape (both methods `@Transactional`, both return the revoked count so callers can emit the audit payload):
- … `request.getSession(false).getId()` in the controller, pass through to the service:
- … `String` not `InetAddress`. Canonical form: `ip.toLowerCase() + ":" + email.toLowerCase().strip()`. Never call `InetAddress.getByName(...)` on user input — that would fire DNS lookups on attacker-controlled strings.
- `AuthServiceTest.invalidateAllSessionsExceptCurrent_deletes_only_other_sessions` — Mockito, `findByPrincipalName` returns 3 session ids, assert `deleteById` called with exactly the two NOT matching `currentSessionId`.
- `LoginRateLimitFilterTest.returns_429_on_attempt_6_with_TOO_MANY_LOGIN_ATTEMPTS` — drive 5×401, then assert 6×429.
- `CsrfRegressionTest.every_write_endpoint_returns_403_without_csrf_token` — `@ParameterizedTest`, source = OpenAPI write endpoints, single session, no `X-XSRF-TOKEN` header.
- `UserServiceIntegrationTest.changeOwnPassword_preserves_current_session_invalidates_others` — Testcontainers, real `spring_session`, three sessions, change password from session A, assert A still present, B + C deleted, two `LOGOUT` audit rows written.
- … `X-XSRF-TOKEN` header injection in `frontend/src/hooks.server.ts` (the same handle layer that today proxies the `auth_token` cookie — will be re-shaped by #523). Do NOT sprinkle `.set('X-XSRF-TOKEN', ...)` across every `+page.server.ts`.
- … `csrf().disable()` out of `SecurityConfig`, delete the entire surrounding comment block. The comment explaining "CSRF defence is LOAD-BEARING on SameSite + CORS" is wrong once Spring Security CSRF is on — leaving it as a relic confuses future readers.

⚙️ Tobias Wendt — DevOps & Platform Engineer
Observations
- `pg_dump --exclude-table=spring_session*` does match both `spring_session` and `spring_session_attributes` — verified that pg_dump's pattern is a shell-style glob. Document explicitly that restores lose all sessions and that's acceptable (users re-login post-restore, which is the desired behaviour anyway after a DR event).

Recommendations
- … `bucket4j-core` explicitly in `backend/pom.xml` — latest stable 8.x. Don't rely on a parent BOM. Renovate keeps it current.
- … `rate({app="backend"} |= "LOGIN_FAILED" [5m]) > 10` only fires if `AuditService.logAfterCommit()` ALSO writes a logger line, not just a DB row. Today's `AuditService` (`backend/src/main/java/.../audit/AuditService.java`) needs verification — if it's DB-only, the alert won't ever trigger. Two-line fix: add `log.info("audit kind={} payload={}", kind, payload)` next to the DB insert. That keeps Loki as the alerting plane and avoids adding a postgres exporter container on a CX32.
- … `LOGIN_RATE_LIMITED` as a third series on the login-rate Grafana panel. Three lines: success / failed / rate-limited. The slope of the rate-limited line is the brute-force signal; without it, you lose visibility precisely when an attack is most active.
- `scripts/force-logout-user.sh` distribution & contract.
  - … `docs/infrastructure/` next to `s3-migration.md` and friends.
  - … `curl` + `jq` only — no bash 5 features.
  - … `APP_ADMIN_USERNAME` and `APP_ADMIN_PASSWORD` from env — never accept on command line (leaks in `ps`).
  - … `0` success, `1` auth failed, `2` target user not found, `3` network/HTTP error. Operator can wire it into a runbook.
- … `spring-session-jdbc` + `spring-session-core` separately. Same reasoning as #523's implementation hints. Save one explicit version pin.
- `/actuator/health` is permitAll'd in `SecurityConfig` and is a GET — Spring's CSRF filter exempts GET by default, so the Docker Compose `healthcheck: curl /actuator/health` should keep passing. Confirm in CI before merge; if it breaks for any reason, that's a release blocker.
- … `bucket4j-core` lands, add a packageRule so its patch releases auto-merge like other dependencies.

Open Decisions
- … `log.info` mirror inside `AuditService` (two lines added, slight extra log volume) vs Postgres-side via a `pg_exporter` container (cleaner separation, adds a container to operate). Lean Loki + mirror for cost reasons; flag because some operators prefer to avoid touching `AuditService` for observability concerns.

🛡️ Nora "NullX" Steiner — Application Security Engineer
Observations
- … `lax`, CORS allowlist widening, browser-specific edge cases). Either alone is insufficient — this is exactly right.
- … `LOGIN_FAILED` and emitting `LOGIN_RATE_LIMITED` instead on attempt #6 is correct — avoids log amplification under a sustained brute-force burst.
- `CookieCsrfTokenRepository.withHttpOnlyFalse()` is the canonical Spring pattern. The token is a per-session nonce, not a secret — JS reading it is necessary for the double-submit pattern.
- … `5×N` attempts every 15 minutes before any application-layer control kicks in. The only thing stopping that is fail2ban at the network layer, and fail2ban thresholds default to noisier signals than application-layer audit emits.
- … `/api/auth/forgot-password` is the gold standard for this; the rate-limit code path needs to match it.

Recommendations
Add a per-IP backstop bucket alongside the per-(IP, email) bucket. Suggested config:
`20 failures / 15 min` per IP across all (email) values. Both buckets must succeed for the attempt to be evaluated. This kills credential stuffing — even with a wordlist, the attacker hits the IP cap after 20 attempts regardless of how many distinct emails they cycle.

Always consume the bucket BEFORE looking up the user. Order:
This eliminates the "email existence" timing/state side channel: bucket behaviour is identical for valid and invalid emails. An attacker can no longer probe the user table by counting allowed attempts.
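One possible ordering for the consume-before-lookup rule is sketched below. It is a JDK-only stand-in, not Bucket4j: the class `LoginThrottleSketch`, its `Outcome` enum, and the `credentialCheck` callback are hypothetical names, the 15-minute window expiry is left out for brevity, and the key follows the `String` form suggested in Felix's review.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiPredicate;

/** JDK-only stand-in for a Bucket4j-backed limiter; every name here is hypothetical. */
final class LoginThrottleSketch {
    enum Outcome { OK, INVALID_CREDENTIALS, TOO_MANY_LOGIN_ATTEMPTS }

    private final int maxAttempts;
    // Attempts charged per key; window expiry is deliberately omitted in this sketch.
    private final Map<String, Integer> charged = new ConcurrentHashMap<>();

    LoginThrottleSketch(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    /** String key, never InetAddress: no DNS resolution on attacker-controlled input. */
    static String key(String ip, String email) {
        return ip.toLowerCase(Locale.ROOT) + ":" + email.strip().toLowerCase(Locale.ROOT);
    }

    Outcome attempt(String ip, String email, String password,
                    BiPredicate<String, String> credentialCheck) {
        String k = key(ip, email);
        // Step 1: charge the bucket first, so known and unknown emails behave identically.
        int used = charged.merge(k, 1, Integer::sum);
        if (used > maxAttempts) {
            return Outcome.TOO_MANY_LOGIN_ATTEMPTS; // the user table is never consulted
        }
        // Step 2: only now look up the user and verify the password.
        if (credentialCheck.test(email, password)) {
            charged.remove(k); // a successful login resets the bucket
            return Outcome.OK;
        }
        return Outcome.INVALID_CREDENTIALS;
    }

    public static void main(String[] args) {
        LoginThrottleSketch throttle = new LoginThrottleSketch(5);
        for (int i = 1; i <= 6; i++) {
            System.out.println(i + ": "
                + throttle.attempt("203.0.113.7", "anna@example.org", "wrong", (e, p) -> false));
        }
    }
}
```

With a limit of 5, the first five wrong attempts come back as `INVALID_CREDENTIALS` and the sixth as `TOO_MANY_LOGIN_ATTEMPTS`, whether or not the email exists, because the credential check is never reached once the bucket is empty.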
Lock down the IP source. Pick exactly one of
`X-Forwarded-For` (when Caddy strips and rewrites) or `request.getRemoteAddr()` (raw). Document it in `SecurityConfig` and verify Caddy strips client-sent `X-Forwarded-For` before forwarding — otherwise an attacker can rotate buckets by spoofing the header. The bucket's security guarantee is only as strong as the IP source identity.

CSRF token rotation on login — Spring Security's
`CsrfAuthenticationStrategy` rotates the token on `SessionFixationProtectionStrategy.onAuthentication`. Verify this still fires after #523's new `/api/auth/login` succeeds. Without it, a token pre-seeded before authentication remains valid post-login. One-line config check; must be in the test suite.

Cap
`LOGIN_RATE_LIMITED` audit emission. An attacker who continues hammering after lockout writes one audit row per attempt, unbounded — that's an audit-table DoS vector. Debounce in-memory: emit one `LOGIN_RATE_LIMITED` per (ip, email) per window. Detection signal lost: none — one entry per window is enough for an alert. Implementation: a `Caffeine` cache with 15-min TTL keyed by the same bucket key.

The
`XSRF-TOKEN` cookie attributes must mirror the session cookie's risk posture. Explicitly set `SameSite=strict`, `Secure=true` in non-dev profiles, `Path=/`. The token is not a secret, but downgrading attributes here widens the cross-site surface for token theft via subdomain MITM.

Force-logout response must not echo session IDs. Return
`{ sessionsRevokedCount: N }`. Session IDs are sensitive identifiers — minimum disclosure principle.

Sara's "disabled user → all sessions invalidated in same transaction" amendment is correct, and the test must assert atomicity. If the disable commits before session deletion, there's a window where a still-cookie-authenticated user makes one more request. Integration test with
`@Transactional` rollback proves it.

Open Decisions
- … `/api/auth/login` only (simpler), or extend to `/api/auth/login` + `/api/auth/forgot-password` + `/api/auth/reset-password` (catches enumeration via the reset flow too). Broader scope is the more defensive choice; narrower is easier to reason about.

🧪 Sara Holt — Senior QA Engineer
Observations
- … "Don't mock `SessionRepository` in integration tests" is correct. Spring Session JDBC's SQL behaviour (`spring_session` row deletion with cascade to `spring_session_attributes`) only manifests against real Postgres. H2 would hide it.
- `@ParameterizedTest @MethodSource("writeEndpoints")` CSRF regression is the highest-value test in the whole package — it catches every future controller someone adds without thinking about CSRF.
- … `browser.newContext()`, login in both, change password in one, assert the OTHER returns 401 on next request.
- … `UserService` has no `disableUser` method). My amendment, but it's blocked on a precondition this PR doesn't deliver.

Recommendations
Inject a
`TimeMeter` for rate-limit tests. Never `Thread.sleep`. Bucket4j supports custom time sources:

Test runs in milliseconds. No flakiness. No 15-minute test executions.
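The idea of an injected time source can be illustrated without Bucket4j on the classpath. The sketch below is a JDK-only stand-in: `ClockedBucket` and `ManualClock` are hypothetical names, the greedy refill style is an assumption (the refill semantics are still an open decision), and with Bucket4j itself the clock would be supplied through its `TimeMeter` hook instead.

```java
import java.time.Duration;
import java.util.function.LongSupplier;

/** Minimal greedy-refill token bucket with an injectable nanosecond clock (hypothetical). */
final class ClockedBucket {
    private final long capacity;
    private final long nanosPerToken;
    private final LongSupplier nanoClock;
    private long tokens;
    private long lastRefillNanos;

    ClockedBucket(long capacity, Duration refillPeriod, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.nanosPerToken = refillPeriod.toNanos() / capacity;
        this.nanoClock = nanoClock;
        this.tokens = capacity;
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    synchronized boolean tryConsume() {
        long now = nanoClock.getAsLong();
        long earned = (now - lastRefillNanos) / nanosPerToken;
        if (earned > 0) {
            tokens = Math.min(capacity, tokens + earned);
            lastRefillNanos += earned * nanosPerToken; // keep the unused remainder
        }
        if (tokens == 0) {
            return false;
        }
        tokens--;
        return true;
    }
}

/** Test clock that only moves when the test advances it, so no Thread.sleep is needed. */
final class ManualClock implements LongSupplier {
    private long nanos;

    @Override
    public long getAsLong() {
        return nanos;
    }

    void advance(Duration d) {
        nanos += d.toNanos();
    }
}
```

A test would build `new ClockedBucket(5, Duration.ofMinutes(15), clock)`, drain it, call `clock.advance(Duration.ofMinutes(15))`, and assert that five attempts pass again, all without waiting on the wall clock.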
CSRF parameterized regression — drive the seed list from the generated OpenAPI spec, not a hand-maintained list:
Zero maintenance when a new endpoint ships. The test list grows with the spec.
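A minimal sketch of how such a seed list could be derived, assuming the OpenAPI `paths` object has already been parsed into nested maps (path → HTTP method → operation) by whatever JSON library the project uses. `WriteEndpointsSketch` and `writeEndpoints` are hypothetical names; the returned strings could then feed a `@MethodSource`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Derives "METHOD /path" pairs for every state-changing operation in an OpenAPI paths map. */
final class WriteEndpointsSketch {
    private static final Set<String> WRITE_METHODS = Set.of("post", "put", "patch", "delete");

    static List<String> writeEndpoints(Map<String, Map<String, Object>> paths) {
        List<String> endpoints = new ArrayList<>();
        // TreeMap copies give a stable, alphabetical order for parameterized test names.
        for (Map.Entry<String, Map<String, Object>> path : new TreeMap<>(paths).entrySet()) {
            for (String key : new TreeMap<>(path.getValue()).keySet()) {
                String method = key.toLowerCase(Locale.ROOT);
                // Path items also carry non-method keys such as "parameters"; skip those.
                if (WRITE_METHODS.contains(method)) {
                    endpoints.add(method.toUpperCase(Locale.ROOT) + " " + path.getKey());
                }
            }
        }
        return endpoints;
    }
}
```

GET-only paths such as `/actuator/health` drop out of the list, which matches the exemption Spring's CSRF filter applies to safe methods.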
Audit-row assertions need
`Awaitility` because they're written from an `@TransactionalEventListener(AFTER_COMMIT)`. If a test assertion runs before the after-commit phase fires, it flakes. Wrap each audit-presence assertion in `await().atMost(5, SECONDS).untilAsserted(() -> assertThat(...))`.

Permission-boundary tests must include both 401 and 403 for every new endpoint:
- … `with(user(...))`, expect `status().isUnauthorized()`
- … `with(user("u").authorities(new SimpleGrantedAuthority("READ_ALL")))`, expect `status().isForbidden()`

These are different security failures; the AC list already has them for force-logout — keep the pattern for any new endpoint.
Coverage gate: hold the 88% branch line in
`pom.xml`. New code in the `auth` package and the new code in `UserService.changeOwnPassword` must each carry ≥80% own-branch coverage. If this PR dips the global line below 88%, it doesn't merge.

The multi-context E2E should assert audit + UI together. In one test: change password in context A, verify B's request returns 401 AND the audit table contains exactly N-1
`LOGOUT` rows with `reason=password_change`. If audit is missing, observability is silently broken — that's exactly the failure mode Nora's findings call out.

The "forced disabled user" AC can't have a green test until the disable flow exists. Split it into a follow-up issue (#525) and remove from #524's scope, or land a minimal disable endpoint in this PR. I'd vote for the split — keeps the test pyramid for #524 honest.
k6 smoke is optional per the issue text — keep it optional. A 5-minute k6 run on the merge step is cheap insurance against the rate-limit filter regressing latency. If the implementer skips it, that's fine, but having the script committed makes the future check trivial.
Open Decisions (none from my angle — recommendations above are concrete)
🎨 Leonie Voss — UI/UX Design Lead
Observations
Recommendations
- … `min-h-[44px] min-w-[44px]`, the 44 px target of WCAG 2.2 SC 2.5.5 (SC 2.5.8 only requires 24 px, so this exceeds the AA minimum). Critical for the senior audience using a stylus or imprecise touch. Use the `<button>` element, not an `<a>`.
- … `aria-live="polite"` so screen-reader users get the message announced. Not a modal, not a banner — a polite live region above the page content.

Static (recommended for first pass):
Friendlier for the 60+ audience, gives up no useful info, no extra response field needed.
Dynamic countdown: backend returns
`retryAfterSeconds` in the 429 body, frontend renders a live-updating "Try again in MM:SS". More work, helpful for stuck admins, distracting for seniors. Defer until support friction proves it's needed.

- … `use:enhance` + `form` return value — make sure the form-action `fail(429, { email })` carries it back.
- … `<svg>` warning) and `aria-invalid="true"` on the email input. Project convention; restate here so the implementer doesn't miss it on this specific surface.
- … `aktivitaeten/` (the unified activity feed). Verify admins will see their own force-logout actions there post-merge; if not, file as a v1.2.0 follow-up alongside the UI.

Open Decisions
📋 Elicit — Requirements Engineer
Observations
Recommendations
- … `attemptsInWindow` precisely (NFR-OBSV-102 audit payload). Is it the cumulative number of attempts that have entered the bucket since it started filling, or the count at the moment of the rejected request (i.e. always `≥6`)? Either is fine; the schema needs to say which.
- … `POST /api/admin/users/{userId}/force-logout` with permission `ADMIN_USER`. Today's `AdminController` is class-level `@RequirePermission(Permission.ADMIN)` and hosts infra ops. Either the new endpoint sits awkwardly under the wrong permission namespace, or it has to land on `UserController` at `POST /api/users/{userId}/force-logout`. Markus flagged the same. Park it in TBD so the implementer can't pick wrong silently.
- … `UserService` has no `disableUser` method and `AppUser.enabled` is never set to `false` anywhere in production code. Two options:
  - … `FR-AUTH-011 — Admin can disable a user account` here, with its own AC (`PATCH /api/users/{userId}/disable` requiring `ADMIN_USER`, sets `enabled=false`, invalidates sessions in the same transaction, emits `ADMIN_FORCE_LOGOUT`). Adds one endpoint to scope.
- … `XSRF-TOKEN` and `X-XSRF-TOKEN` headers in both directions" — `vite.config.ts` currently only injects `Authorization` from the `auth_token` cookie (`vite.config.ts:23–31`). After #523 strips that block, the proxy is back to passing headers through unchanged. That MAY be sufficient for CSRF (Vite's proxy is transparent for non-injected headers by default), but it warrants explicit verification with one cURL through the running dev server. Add a footnote that says "verify" so it doesn't get assumed.
- … `docs/ARCHITECTURE.md` permission table updated to reflect that `ADMIN_USER` now governs session revocation in addition to user CRUD." Even though no new `Permission` enum value is added, the scope of an existing one expands meaningfully.
- … `OQ-524-01` … `OQ-524-06`. Pre-#522, this issue assumed all questions were resolved — but the cross-persona review surfaces new ones, and a register prevents them from being silently re-resolved during implementation.

Open Decisions
🗳️ Decision Queue — Action Required
7 decisions need your input before implementation starts. Grouped by theme; deduplicated where multiple personas raised the same item.
Architecture
Force-logout endpoint surface. Two options:
- `POST /api/admin/users/{userId}/force-logout` (as written in the issue) — fits the "admin tools" clustering, but the existing `AdminController` is class-level `@RequirePermission(Permission.ADMIN)`; either you weaken that class annotation or split it into two controllers with different permissions in one namespace.
- `POST /api/users/{userId}/force-logout` under `UserController` with `@RequirePermission(Permission.ADMIN_USER)` — matches the existing user-admin pattern (`InviteController`, `UserController` write endpoints all use `ADMIN_USER`), keeps `/api/admin/*` a pure `Permission.ADMIN` namespace.

Raised by: Markus, Elicit
Disable-user flow scope. Sara's "forced disabled user" amendment references a flow (
`UserService.disableUser`, `PATCH /api/users/{id}/disable`) that does not exist in the codebase. Two options:

- … `Permission.ADMIN_USER` + audit kind + AC. Adds one controller method and a service method to scope.

Raised by: Markus, Sara, Elicit
Security
Token-bucket semantics for FR-AUTH-010. "5 tokens, refill 5 per 15 min" is ambiguous. Either:
Pick before the rate-limit tests get written. The implementation cost is identical; the user-visible behaviour differs.
Raised by: Elicit
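To make the ambiguity concrete, the sketch below contrasts two common readings of "5 tokens, refill 5 per 15 min". It assumes the two readings are greedy (tokens trickle back one at a time) versus interval (all tokens return together after a full period), which is how Bucket4j distinguishes `Refill.greedy` from `Refill.intervally`; the option wording of this decision is not visible here, so treat the mapping as an assumption. `RefillSemanticsSketch` is a hypothetical name.

```java
import java.time.Duration;

/** Tokens available again some time after a 5-token bucket was fully drained. */
final class RefillSemanticsSketch {
    static final long CAPACITY = 5;
    static final Duration PERIOD = Duration.ofMinutes(15);

    /** Greedy reading: one token returns every PERIOD / CAPACITY, i.e. every 3 minutes. */
    static long greedyTokensAfter(Duration sinceDrained) {
        long nanosPerToken = PERIOD.toNanos() / CAPACITY;
        return Math.min(CAPACITY, sinceDrained.toNanos() / nanosPerToken);
    }

    /** Interval reading: nothing returns until a full PERIOD has passed, then all 5 at once. */
    static long intervalTokensAfter(Duration sinceDrained) {
        long fullPeriods = sinceDrained.toNanos() / PERIOD.toNanos();
        return Math.min(CAPACITY, fullPeriods * CAPACITY);
    }

    public static void main(String[] args) {
        for (int minutes : new int[] {3, 14, 15}) {
            Duration d = Duration.ofMinutes(minutes);
            System.out.println(minutes + " min: greedy=" + greedyTokensAfter(d)
                + " interval=" + intervalTokensAfter(d));
        }
    }
}
```

Three minutes after lockout the greedy reading already allows one more attempt while the interval reading allows none, which is the user-visible difference the rate-limit tests and the error text have to agree on.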
Per-(IP, email) rate-limit threshold. Issue says 5/15min, configurable. For the closed family archive (<20 users, mostly 60+ transcribers who mistype passwords), 5 is tight and false-lockouts will dominate support traffic. Options:
Raised by: Nora
Add a per-IP backstop bucket? Per-(IP, email) keying alone allows
`5 × N` attempts per IP given a credential-stuffing wordlist of N emails. A per-IP backstop (e.g. 20 failures/15min across all emails) closes that gap. Options:

- … `/api/auth/login`.
- … `/api/auth/forgot-password` + `/api/auth/reset-password` to catch enumeration via the reset flow too.

Raised by: Nora
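The effect of the backstop can be shown with plain counters. This is a hedged, JDK-only sketch: `TwoTierLimiterSketch` is a hypothetical name, the limits 5 and 20 are the figures quoted in the review, and window expiry is omitted so the example stays short.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Per-(IP, email) limit plus a per-IP backstop; both tiers must allow the attempt. */
final class TwoTierLimiterSketch {
    private final int perPairLimit;
    private final int perIpLimit;
    private final Map<String, Integer> perPair = new ConcurrentHashMap<>();
    private final Map<String, Integer> perIp = new ConcurrentHashMap<>();

    TwoTierLimiterSketch(int perPairLimit, int perIpLimit) {
        this.perPairLimit = perPairLimit;
        this.perIpLimit = perIpLimit;
    }

    /** Charges both tiers unconditionally, then reports whether both still had budget. */
    boolean tryAcquire(String ip, String email) {
        String pairKey = ip + ":" + email.strip().toLowerCase(Locale.ROOT);
        int ipCount = perIp.merge(ip, 1, Integer::sum);
        int pairCount = perPair.merge(pairKey, 1, Integer::sum);
        return ipCount <= perIpLimit && pairCount <= perPairLimit;
    }
}
```

With `new TwoTierLimiterSketch(5, 20)`, an attacker cycling through fresh emails from one IP is refused on the 21st attempt even though no single (IP, email) pair has reached 5.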
Infrastructure
Brute-force alerting wiring. Loki alert
`rate({app="backend"} |= "LOGIN_FAILED" [5m]) > 10` only fires if audit events are mirrored to the logger. Today's `AuditService` may be DB-only. Options:

- … `AuditService` — add `log.info("audit kind={} ...", kind)` next to the DB insert. Two lines of code, slight extra log volume, Loki stays the alerting plane.
- … `pg_exporter` container to surface audit-row metrics to Prometheus. Cleaner separation, one more container to operate on the CX32.

Raised by: Tobias
UX
Rate-limit error text. Issue's current text says "Please wait 15 minutes". With a rolling token bucket (Decision 3 option A), the bucket may refill in 1 minute — the literal "15 minutes" claim is then misleading. Options:
- … `retryAfterSeconds` in the 429 body; frontend renders a live "Try again in MM:SS". Better for power users, distracting for seniors, more code.

Raised by: Leonie
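If the dynamic option is chosen, the backend has to turn "time until the next token" into whole seconds. The sketch below shows one way to do that and, for illustration only, the MM:SS formatting that would really live in the frontend. `RetryAfterSketch` and both method names are hypothetical, and how the remaining wait is obtained from the limiter is left open.

```java
import java.time.Duration;
import java.util.Locale;

/** Helpers for a hypothetical retryAfterSeconds field in the 429 response body. */
final class RetryAfterSketch {
    /** Rounds up to whole seconds so a client never retries slightly too early. */
    static long retryAfterSeconds(Duration untilNextToken) {
        long nanos = Math.max(0L, untilNextToken.toNanos());
        return (nanos + 999_999_999L) / 1_000_000_000L;
    }

    /** Renders seconds as MM:SS for a "Try again in MM:SS" message. */
    static String asMinutesSeconds(long seconds) {
        return String.format(Locale.ROOT, "%02d:%02d", seconds / 60, seconds % 60);
    }
}
```

Rounding up matters here: a wait of 1.5 seconds reported as 1 would invite a retry that is rejected again.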
These are the only items that need your call before implementation can start. All other persona output is concrete recommendations — those can be picked up directly by the implementer.