fix(frontend): disable prerender crawl so protected routes aren't baked to login-bounces (#514) #515
Reference in New Issue
Block a user
Delete Branch "fix/issue-514-prerender-crawl-bakes-redirects"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Closes #514. This is the actual reason login doesn't work in production — and it's why dev mode works fine but staging is broken.
Default
prerender.crawl = truehad SvelteKit follow nav links from/hilfe/transkriptionand prerender/,/documents,/persons,/geschichten,/stammbaumtoo. Those routes' load functions throwredirect(302, '/login')during the build (no auth cookie), and SvelteKit captured each as a static HTML file:Those static files (
build/prerendered/index.html, …) are served before the runtimehooks.server.tsruns. An authenticated user with a validauth_tokencookie still gets the baked bounce-back HTML — every time.Dev mode doesn't prerender, so the bug only shows in
npm run buildoutput.Fix
Keeps the explicit
/hilfe/transkriptionprerender entry (still public, still pre-built), drops the implicit crawl.Test plan after merge
nightly.ymldocker exec archiv-staging-frontend-1 ls /app/build/prerendered/shows onlyhilfe/transkription.html, noindex.html/documents.html/etc.admin@familyarchive.local+ the staging admin password; land on dashboard instead of bouncing back to /login.🤖 Generated with Claude Code
Markus Keller — Senior Application Architect
Verdict: Approved
What I checked
frontend/svelte.config.js— the only file changedfrontend/src/hooks.server.ts:handleAuthredirects anything outsidePUBLIC_PATHSto/login, which is exactly what was getting baked into static HTML at build time. Root cause confirmed.frontend/src/routes/+layout.server.ts— pullslocals.user; during prerender there is no cookie solocals.useris undefined andhandleAuthtriggers the redirect. Matches the bug story exactly./hilfe/transkriptionis whitelisted inPUBLIC_PATHS, so it correctly stays prerenderable and public after this change.Architectural assessment
This is the right layer for the fix. Two configuration knobs were doing two different things:
entries: ['/hilfe/transkription']— "prerender these explicit pages"crawl: true(default) — "and also follow every link they emit"Knob #2 is what tangled the build-time and runtime auth boundaries. Disabling crawl restores the invariant we want: prerender is an opt-in list, not a transitive reachability set. Cleaner mental model, smaller blast radius.
Suggestions (non-blocking)
docs/adr/) — "Prerender entries are explicit, never crawled" — so a future SvelteKit upgrade or refactor doesn't silently re-enablecrawl: true. Optional; the code comment may already be sufficient at this project size.c4/l3-frontend-*.pumldoes not need to move.LGTM — ship it.
Felix Brandt — Senior Fullstack Developer
Verdict: Approved with concerns
What I like
svelte.config.jsexplains the why, not the what — exactly the style I'd want. It documents the threat model so the next person who looks at this config understands whycrawl: falseis load-bearing.Concerns
1. No regression test (this is the real one)
This bug only manifested in
npm run buildoutput. A passing dev-mode test would have happily green-lit a broken production build. We need a test that fails ifcrawl: trueever comes back. Cheap option: a CI step that asserts the prerendered output contains only the expected files.Or even better, a Vitest test that imports
svelte.config.jsand assertskit.prerender.crawl === false:Two lines. Permanent regression guard. Without it, a future
npm installor SvelteKit upgrade could silently re-introduce the default.2. Test plan in the PR is manual-only
The "Test plan after merge" checklist is fine for now, but every item there is a manual step. After this lands, those steps should become automated smoke tests in the post-deploy verification —
docker exec ... ls /app/build/prerendered/is a perfect smoke-test assertion.Suggestions (nice to have)
/hilfe/transkriptionalready stays prerendered correctly.Once a regression assertion lands (either in CI or as a Vitest config check), this is a clean ship.
Tobias Wendt — DevOps & Platform Engineer
Verdict: Approved with concerns
The fix itself is one line of config. From an infra angle there's nothing controversial here — the danger lives in the deployment process around it. This bug shipped to staging undetected, which means our deploy pipeline did not catch a production-blocking regression. That's the conversation I want to have.
What I like
crawl: falsekeeps the existing/hilfe/transkriptionprerender working, so caching/CDN behavior for that page is unchanged.Concerns (pipeline gaps, not code)
Post-deploy smoke test is missing. A
curl http://127.0.0.1:3001/ | grep -q 'location.href="/login"'against the deployed staging container would have caught this in seconds. Add a smoke step tonightly.ymlthat:/with no auth cookie and asserts the response is dynamic (a realSet-Cookie, a 302, or rendered Svelte markup — NOT a staticlocation.hrefHTML file)./hilfe/transkriptionwith no auth cookie and asserts 200.Build-artifact assertion in CI. Before the container is even built, fail the pipeline if
frontend/build/prerendered/contains anything other thanhilfe/:Cheap, deterministic, catches future regressions if anyone flips
crawlback or adds another entry.The PR's "Test plan after merge" is manual. All three checkboxes (
docker exec, log in, land on dashboard) should be in the smoke job, not in a human checklist. Manual post-deploy verification is the same class of failure we're fixing.Note (not a blocker)
The fact that dev mode worked but the production build was broken is a structural risk we will see again. Worth a follow-up issue: "Run a production build + curl-based smoke check in CI before merge to main" — same Docker compose stack, hit a handful of routes unauthenticated, fail if any return baked-redirect HTML. Two minutes of CI for an entire class of prerender-vs-runtime regressions caught forever.
Ship the fix; open the follow-up issues for the smoke test and the build-artifact assertion.
Elicit — Senior Requirements Engineer
Verdict: Approved
What I checked
This is a bug fix, not a feature change, so my lens here is: does the issue clearly document the user-observable problem, the acceptance criteria, and the verification plan?
Issue #514 — quality review
The issue is exemplary. It contains everything I look for in a defect spec:
build/prerendered/index.htmlcontaining baked redirect HTML; the build-timeredirect(302, '/login')was captured as static markup.PR — acceptance criteria
The PR's "Test plan after merge" maps cleanly to implicit acceptance criteria:
These three are testable, observable, and unambiguous. I would only suggest one improvement: lift them out of a markdown checkbox in the PR description and into the issue itself as Given-When-Then acceptance criteria, so the issue becomes the canonical "done" definition rather than the PR.
Suggestions (non-blocking)
OQ-open question: "Should we alert whenbuild/prerendered/content drifts?" — answer probably "yes, in CI" per Tobias's smoke-test suggestion.svelte.config.js.)entriesandPUBLIC_PATHSinhooks.server.ts? Currently this is tribal knowledge. Worth a docs sentence inCONTRIBUTING.mdor a comment inhooks.server.ts.Bottom line
The defect is well-specified; the fix scope matches the defect; the verification steps are testable. From a requirements perspective: ship it.
Nora "NullX" Steiner — Application Security Engineer
Verdict: Approved
This is a security-relevant fix even though it's labeled a build bug. Let me explain why, because the framing matters.
Why this is a security concern (not just a UX bug)
What was actually happening on staging:
auth_tokencookie → 303 to/./. The reverse proxy returns the statically-baked/app/build/prerendered/index.html, which contains<script>location.href="/login"</script>.handleAuthhook never executes. The authenticated user is treated identically to an unauthenticated one — the auth check is bypassed because there is no auth check to bypass; the response was decided at build time.That is a broken access control failure mode (CWE-862 conceptually — Missing Authorization due to bypass), even though in this case the failure is "deny everyone instead of decide per-user." The lesson generalizes: any time runtime auth decisions are captured into static artifacts, you have a class of issue where the auth layer becomes inert. The inversion of "static deny" could just as easily be "static allow" if a different page were prerendered in the wrong auth state — for example, if an admin built locally while logged in, SvelteKit could in principle bake admin content into static HTML and serve it to anonymous users. That has not happened here, but the configuration that allowed (1) above is the same configuration that would allow that.
crawl: falsefixes both directions of this hazard by establishing the invariant: the prerender set is exactly what's inentries, and nothing reachable from those entries. That is the right security posture for a build process.Verified by reading
hooks.server.tshandleAuthcorrectly redirects non-public paths to/login. Good.PUBLIC_PATHSincludes/hilfe/transkription. Good — matches the single prerender entry, so the public help page works without an auth cookie at runtime too.+page.svelte(it's a pure rules/help page with noloadfunction pulling user data).Suggestions (non-blocking)
build/prerendered/contains any file not underhilfe/. This converts the security invariant into an enforced one.CONTRIBUTING.mdunder a "Prerender safety" heading: "Every entry inprerender.entriesmust also be listed inPUBLIC_PATHSinhooks.server.ts. Never serve user-specific content from a prerendered route." Two sentences, removes a future foot-gun.Summary
The fix is correct, minimal, and addresses the root cause at the right layer. No code changes required for security approval. The follow-up assertion would harden the invariant; nice-to-have, not blocker.
Sara Holt — Senior QA Engineer
Verdict: Changes requested
The code change is correct. The reason I'm requesting changes is that this is exactly the class of bug a test suite exists to prevent, and shipping the fix without locking it in means we'll see this again — possibly with a different prerender entry, possibly worse.
This was a prod-blocking, login-breaking defect that survived our entire test pyramid. That means our pyramid has a hole. The PR cannot close that hole with a one-line config change alone.
Why our suite missed it
npm run build && node ./build, but we have no test that asserts the content of the prerendered output.Four layers, zero coverage for this failure mode.
Required additions before merge (blockers)
1. Build-output assertion (cheapest, highest value)
A test that runs
npm run buildand asserts the prerender directory:Add as a CI job that runs after
npm run build. If anything other thanhilfe/transkription.htmlshows up, fail.2. Playwright E2E for the actual user journey
This test runs against the production build, not the dev server. If it had existed, this bug would have failed CI before staging ever saw it.
3. Confirm where the existing E2E suite points
Before merge, please verify whether the current Playwright config runs against
npm run devornpm run preview/node build. If it's the former, that's the larger gap. Switch login-flow E2Es to run against the production build.Suggestions (non-blocking)
@WithMockUser-equivalent assertion suite for SvelteKitloadfunctions that ensures none of them are accidentally prerenderable when they depend onlocals.user./hilfe/transkriptionsince it's a special-case public route — easy to forget about, and seniors will land there from external links.Bottom line
Ship the config change plus at least item #1 above. Without item #1, this regression is permanently one careless
crawl: trueaway from coming back. I'd hold the PR until that assertion is in. Sorry to push back on what looks like a one-liner — the one-liner is right; the missing test is what made the bug expensive.Leonie Voss — UX & Accessibility Lead
Verdict: Approved
LGTM, with one note about what almost happened
This is a build-config change with no UI surface, so there's nothing for me to review at the pixel level. The visual brand, the components, the WCAG checks — none of those moved.
What I want to flag is the user-facing impact this bug had, because it sits squarely in my territory: it broke trust at the single most fragile moment in any product — the first login attempt by a senior on a new device.
Imagine our 67-year-old aunt on her tablet, in bright daylight, typing a 14-character admin password carefully with one finger. She hits "Anmelden." The screen flickers and she's back on the login page. She assumes she mistyped, tries again. Same result. By the third attempt she is convinced "the computer is broken" or — worse — "I broke it." She closes the tab and texts her grandson. That is exactly the failure mode our dual-audience design philosophy is built to prevent: silent failure with no useful error message and no path forward.
The fix restores the intended flow, so from a UX standpoint, approve. But this near-miss highlights two things worth follow-ups (not blockers on this PR):
Suggestions for follow-up issues
Login failure feedback. If the login form ever bounces the user back to itself, the page should say so explicitly with a friendly, localized message ("Anmeldung fehlgeschlagen — bitte erneut versuchen"). Today, the path back to
/loginis silent. Even when this static-HTML bug isn't in play, a session-expired redirect would land the user on/loginwith no explanation. That's worth a small spec — maybe a?reason=session-expiredquery param that the page renders as aaria-live="polite"notice./hilfe/transkriptionis special — confirm it works without an auth cookie. This is the one route that survivescrawl: false, and it's also the route a senior user might be linked to from an email or a printout before they ever log in. Add a Playwright check that hits it cold (no cookies, no session) and verifies that the page renders correctly, with all i18n strings present, at the 320px breakpoint. If we ever break that page's public-ness, our outreach to first-time users breaks with it.Touch-target audit on
/loginitself. Unrelated to this PR but the same user journey: while we're focused on login, verify the email/password fields and the "Anmelden" button meet the 44×44px floor and 48px preferred for the senior audience. This bug was prod-blocking; the touch-target floor is silent-blocking — same effect on the same user.Bottom line
Ship the fix. Open the three follow-up issues so the user experience around login is as robust as the infrastructure around it now will be.
0da768f3a4to6ba7254344