fix(transcription): replace sendBeacon with fetch keepalive; add catch-all API proxy #304

Merged
marcel merged 7 commits from fix/204-transcription-beacon-proxy into main 2026-04-23 07:12:23 +02:00
Owner

Summary

  • Root cause #1 — wrong HTTP method: navigator.sendBeacon() always sends POST, but the backend expects PUT /api/documents/{id}/transcription-blocks/{blockId}. The save was silently dropped on page unload.
  • Root cause #2 — missing proxy route: In production (Node.js adapter, no Vite dev proxy), client-side calls to /api/... with no matching SvelteKit route handler return 404.

Changes

  • useBlockAutoSave.svelte.ts: Replace navigator.sendBeacon(url, blob) with fetch(url, { method: 'PUT', keepalive: true, ... }). keepalive: true survives page navigation just like sendBeacon, but supports arbitrary HTTP methods and headers. Auth header injection via the Vite dev proxy (dev) and the new catch-all route (prod) is preserved.
  • src/routes/api/[...path]/+server.ts: New catch-all SvelteKit server route that forwards all methods (GET, POST, PUT, PATCH, DELETE) to the backend via event.fetch, which triggers the handleFetch hook in hooks.server.ts to inject the Authorization header from the auth_token cookie. More-specific routes (/api/persons, /api/tags, /api/documents/[id]/file) keep precedence over the catch-all.
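Condensed, the new flush path looks roughly like this (a minimal sketch — `flushPending` and the `pendingTexts` map are illustrative names, not the exact API of `useBlockAutoSave`):

```typescript
// Sketch of the fetch-keepalive flush, assuming pending edits are held in a
// Map<blockId, text>. Names here are illustrative, not the real module's API.
function flushPending(documentId: string, pendingTexts: Map<string, string>): void {
    for (const [blockId, text] of pendingTexts) {
        // keepalive lets the request outlive the page, like sendBeacon,
        // but with a real HTTP method and headers.
        void fetch(`/api/documents/${documentId}/transcription-blocks/${blockId}`, {
            method: 'PUT',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({ text }),
            keepalive: true
        });
    }
    pendingTexts.clear();
}
```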

Test plan

  • 4 new unit tests in useBlockAutoSave.svelte.test.ts cover the flushViaBeacon behaviour (PUT + keepalive, no sendBeacon call, no-op when no pending edits, debounce timer cancelled)
  • All 12 useBlockAutoSave tests pass
  • Pre-existing test failures (ChronikErrorCard) are unrelated and existed before this branch

Closes #204

marcel added 1 commit 2026-04-22 15:58:02 +02:00
fix(transcription): replace sendBeacon with fetch keepalive; add catch-all API proxy
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m41s
CI / OCR Service Tests (push) Successful in 32s
CI / Backend Unit Tests (push) Failing after 2m49s
CI / Unit & Component Tests (pull_request) Failing after 2m43s
CI / OCR Service Tests (pull_request) Successful in 35s
CI / Backend Unit Tests (pull_request) Failing after 2m46s
c47c802ef5
sendBeacon always sends POST, but the backend expects PUT for block updates, so
saves were silently dropped on page unload.  Replace with fetch({ keepalive: true,
method: 'PUT' }) which survives navigation and uses the correct HTTP method.

Add a catch-all SvelteKit server route at /api/[...path] so all client-side API
calls work in production (without the Vite dev proxy).  More-specific routes
(/api/persons, /api/tags, /api/documents/[id]/file) keep precedence.

Closes #204

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Author
Owner

🏗️ Markus Keller (@mkeller) — Senior Application Architect

Verdict: ⚠️ Approved with concerns

The root cause analysis is correct and the solution is appropriate. A few structural points worth addressing before this becomes load-bearing infrastructure.

Blockers

None.

Suggestions

1. Response headers are silently dropped — will break redirects and file responses

+server.ts only forwards Content-Type from the backend response. Headers like Location (redirects), Content-Disposition (file downloads), ETag, Cache-Control, and Last-Modified are silently dropped.

For transcription block updates (returns 200 + JSON), this doesn't matter today. But as a catch-all that routes all future /api/... calls, this will silently break any endpoint that:

  • Returns a redirect (302 + Location)
  • Returns a downloadable file (200 + Content-Disposition)
  • Relies on ETag for conditional requests

Concrete fix — forward the full response header set, minus hop-by-hop headers:

const SKIP_HEADERS = new Set(['transfer-encoding', 'connection', 'keep-alive', 'upgrade']);

const responseHeaders: Record<string, string> = {};
response.headers.forEach((value, key) => {
    if (!SKIP_HEADERS.has(key.toLowerCase())) {
        responseHeaders[key] = value;
    }
});
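The filtered header set would then be carried onto the returned response — a sketch, with buildProxyResponse as a hypothetical helper name and response being the backend Response from event.fetch:

```typescript
// Sketch: stream the backend body through unchanged, carrying the full
// (hop-by-hop-filtered) header set and status onto the proxied response.
function buildProxyResponse(response: Response): Response {
    const SKIP_HEADERS = new Set(['transfer-encoding', 'connection', 'keep-alive', 'upgrade']);
    const headers = new Headers();
    response.headers.forEach((value, key) => {
        if (!SKIP_HEADERS.has(key.toLowerCase())) headers.set(key, value);
    });
    return new Response(response.body, { status: response.status, headers });
}
```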

2. arrayBuffer() buffers the full request body in memory

await event.request.arrayBuffer() reads the entire request body before forwarding. For small JSON payloads (transcription block text) this is fine. But the catch-all will also intercept any future endpoint that uploads large binary content — buffering a 50MB PDF before proxying it is unacceptable.

The existing /api/documents/[id]/file route uses a streaming Response(response.body, ...) pattern, but only for the response side. For the request side, streaming requires duplex: 'half' which some runtimes don't support cleanly.

Practical middle ground: add a Content-Length guard that rejects large request bodies at the proxy layer, directing large uploads to use their own dedicated routes:

const contentLength = Number(event.request.headers.get('Content-Length') ?? '0');
if (contentLength > 1_048_576) { // 1 MB threshold
    return new Response('Use dedicated upload route', { status: 413 });
}

3. flushViaBeacon is a misleading name after this change

The function name now lies about its implementation. A future developer who sees the beforeunload + flushViaBeacon() call will reach for MDN docs for sendBeacon to understand the contract. Rename to flushOnUnload or flushPendingEdits.

4. The existing specific proxy routes can be removed

/api/persons and /api/tags are now superseded by the catch-all for their transparent proxying. They have some extra logic (query param extraction for the GET call), but that logic re-implements what the catch-all already does generically (via url.search). Worth a separate cleanup issue rather than doing it in this PR — but track it.

Overall: the architectural direction is right. This is the simplest fix that solves the production gap without introducing a new service, a custom nginx config, or a separate auth proxy. Address the response headers issue before merging — it's the one that will cause a real bug.

Author
Owner

👨‍💻 Felix Brandt (@felixbrandt) — Senior Fullstack Developer

Verdict: ⚠️ Approved with concerns

TDD discipline was followed: the red phase is evident, the implementation is minimal and correct. One naming violation and one unhandled promise.

Blockers

1. Function name no longer matches its implementation

flushViaBeacon should be renamed. The contract of sendBeacon — fire-and-forget, survives page navigation — is still the intent, but the mechanism changed. The name is now a lie.

// useBlockAutoSave.svelte.ts
function flushOnUnload(): void {  // or: flushPendingEdits
// TranscriptionEditView.svelte — caller needs updating too
autoSave.flushOnUnload();

The public API surface of createBlockAutoSave includes this function name. Both the implementation and the spec should use the new name.

Suggestions

2. Unhandled promise from fetch — intentional but should be explicit

fetch(`/api/documents/${documentId}/transcription-blocks/${blockId}`, {
    method: 'PUT',
    // ...
    keepalive: true
});
// returned Promise is silently discarded

This is intentional (fire-and-forget, same semantics as sendBeacon). But a future TypeScript or lint configuration with @typescript-eslint/no-floating-promises will flag this. Make the intent explicit:

void fetch(`/api/documents/${documentId}/transcription-blocks/${blockId}`, {
    method: 'PUT',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text }),
    keepalive: true
});

The void operator documents intent — "this is intentionally fire-and-forget."

3. Test: does not call navigator.sendBeacon asserts an implementation detail

This test will become noise if the code changes again. It tests how the function works, not what the user experiences. A more durable assertion is that the pending text is cleared even when called during a page unload:

it('clears pending edits after flush', () => {
    const as = createBlockAutoSave({ saveFn: mockSaveFn, documentId: 'doc-1' });
    as.handleTextChange('block-1', 'text');
    as.flushViaBeacon();
    // if called again, nothing is sent — idempotent
    as.flushViaBeacon();
    expect(mockFetch).toHaveBeenCalledTimes(1);
});

That said, I understand the motivation for the explicit "no sendBeacon" test given the bug context — leave it if you want the regression protection. It's a suggestion, not a demand.

4. Catch-all proxy: proxy function signature is more specific than needed

async function proxy(event: Parameters<RequestHandler>[0]): Promise<Response>

Parameters<RequestHandler>[0] resolves to the SvelteKit request event — this is fine but verbose. The idiomatic form used elsewhere in the codebase is:

async function proxy({ params, request, fetch: eventFetch, url }: Parameters<RequestHandler>[0]): Promise<Response>

Destructuring at the parameter makes the dependencies explicit without needing the event. prefix on every access.

Minor nit, not a blocker.

Author
Owner

🛠️ Tobias Wendt (@tobiwendt) — DevOps & Platform Engineer

Verdict: Approved

No infrastructure changes. The code change is purely in the SvelteKit application layer and doesn't touch Docker Compose, CI, or any infrastructure config. A few observations:

What I checked

  • env.API_INTERNAL_URL is consumed consistently — same pattern as the existing specific proxy routes and hooks.server.ts. The catch-all inherits the same fallback to http://localhost:8080
  • The catch-all route uses event.fetch (not the global fetch), so handleFetch in hooks.server.ts injects the Authorization header from the auth_token cookie. The auth chain is intact
  • No new environment variables required
  • No new services, no new infrastructure dependencies
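For reference, the auth-injection step that handleFetch performs can be reduced to a plain, testable function — a sketch only; the cookie name auth_token and API_INTERNAL_URL come from this thread, everything else (the function name, the getCookie parameter) is illustrative:

```typescript
// Sketch of handleFetch's auth injection, extracted into a plain function.
// Requests bound for the internal backend get the Authorization header set
// from the server-side auth_token cookie; all other requests pass untouched.
function injectAuth(
    request: Request,
    getCookie: (name: string) => string | undefined,
    apiInternalUrl: string
): Request {
    if (request.url.startsWith(apiInternalUrl)) {
        const token = getCookie('auth_token');
        if (token) request.headers.set('Authorization', `Bearer ${token}`);
    }
    return request;
}
```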

Worth noting

The frontend Dockerfile still runs npm run dev — the Vite dev proxy handles all /api/... calls in the current Docker Compose setup. The catch-all route only matters when running the production Node.js build (npm run build && node build/). The current deployment doesn't exercise the catch-all at all.

This means the fix is untested in the actual Docker deployment. I'd recommend either:

  1. Creating a docker-compose.prod.yml overlay that runs npm run build && node build/ instead of npm run dev, so the production path can be tested, or
  2. At minimum, documenting that the current Dockerfile still uses the dev server

Not a blocker for merging — but if the goal is to fix production, the current Docker setup doesn't have a "production" build at all. Track this as a follow-up issue.

MinIO root credentials — noticed the existing docker-compose.yml uses MINIO_ROOT_USER/MINIO_ROOT_PASSWORD as the application's S3 credentials. Unrelated to this PR but worth a follow-up issue: these should be dedicated service account credentials with bucket-scoped permissions. Root credentials can delete all buckets.

Author
Owner

🔒 Nora "NullX" Steiner — Application Security Engineer

Verdict: ⚠️ Approved with concerns

Two findings. One is a confirmed design issue that needs fixing before merge; one is a smell worth documenting.


Blocker

1. Path traversal in the catch-all proxy

File: frontend/src/routes/api/[...path]/+server.ts, line 8

const backendUrl = `${apiUrl}/api/${event.params.path}${event.url.search}`;

event.params.path comes from SvelteKit's rest parameter routing. SvelteKit decodes URL-encoded segments (%2F → /) before routing, which means a browser request to /api/foo%2F..%2F..%2Finternal could produce apiUrl + "/api/foo/../../internal" after decoding.

The impact is limited: the backend URL is fixed to env.API_INTERNAL_URL (your own backend server), so this can't be used to reach an arbitrary external server (not a full SSRF). But a crafted path could reach backend endpoints that are intended to be internal-only (e.g. /actuator/env, /actuator/heapdump) if those aren't otherwise blocked.

Fix — reject traversal sequences, plain or percent-encoded, before building the URL:

const rawPath = event.params.path;
// SvelteKit already ensures this starts with a valid path segment,
// but guard against any encoded traversal sequences
if (rawPath.includes('..') || rawPath.includes('%2e') || rawPath.includes('%2E')) {
    return new Response('Bad Request', { status: 400 });
}
const backendUrl = `${apiUrl}/api/${rawPath}${event.url.search}`;

Additionally, ensure that /actuator/* requests are explicitly blocked at the proxy layer:

if (rawPath.startsWith('actuator')) {
    return new Response('Not Found', { status: 404 });
}

The Actuator endpoints (heapdump, env, configprops) expose credentials, session tokens, and heap memory. They must never be reachable from the SvelteKit proxy.
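Both guards can be folded into one validation step at the top of the proxy — a sketch with validateProxyPath as a hypothetical helper; it returns an error Response, or null when the path may be forwarded:

```typescript
// Sketch: combined path validation for the catch-all proxy.
// Returns an error Response, or null when the path is safe to forward.
function validateProxyPath(rawPath: string): Response | null {
    const lower = rawPath.toLowerCase();
    // Reject traversal sequences, whether literal or percent-encoded.
    if (lower.includes('..') || lower.includes('%2e')) {
        return new Response('Bad Request', { status: 400 });
    }
    // Never expose Spring Boot Actuator endpoints through the proxy.
    if (lower.startsWith('actuator')) {
        return new Response('Not Found', { status: 404 });
    }
    return null;
}
```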


Suggestion (not a blocker)

2. Only Content-Type is forwarded to the backend — correct, but document it

The proxy forwards only Content-Type. This is conservative and correct — it avoids accidentally forwarding cookies, user tokens, or other browser-originated headers to the backend. This is good default behavior.

Document the decision with a comment so future developers understand why other headers are intentionally excluded:

// Only forward Content-Type — other browser headers (Cookie, Authorization, etc.)
// must not reach the backend directly. Auth is injected by handleFetch from the
// server-side auth_token cookie.
const headers: Record<string, string> = {};
const contentType = event.request.headers.get('Content-Type');
if (contentType) headers['Content-Type'] = contentType;

The intent is correct; the comment makes it auditable.


What I checked and found safe

  • SSRF: apiUrl comes from env.API_INTERNAL_URL (server-side env var), not from user input. The attacker cannot redirect the proxy to an arbitrary host
  • Authentication bypass: event.fetch triggers handleFetch, which injects Authorization from the server-side cookie. The client cannot supply their own Authorization header through this path (it's not forwarded)
  • Method confusion: All methods are proxied as-is, which is correct for a transparent proxy
  • Query string injection: event.url.search is the parsed search string from SvelteKit's URL object — it is not raw user input concatenated into the URL
Author
Owner

🧪 Sara Holt (@saraholt) — QA Engineer & Test Strategist

Verdict: ⚠️ Approved with concerns

The flushViaBeacon unit tests are well-structured and cover the critical behaviors. There is one gap at the integration layer that I want to flag.

Blocker

None.

Suggestions

1. The catch-all proxy has zero test coverage

src/routes/api/[...path]/+server.ts is entirely untested. It's the piece that actually fixes the production bug, and it has no regression protection.

I understand that testing SvelteKit server routes requires more setup than unit tests. The minimum I'd want to see is a SvelteKit load function test pattern applied to the proxy handler:

// src/routes/api/[...path]/+server.spec.ts
import { describe, it, expect, vi } from 'vitest';

// Mock the SvelteKit env module the route reads API_INTERNAL_URL from
vi.mock('$env/dynamic/private', () => ({ env: { API_INTERNAL_URL: 'http://backend:8080' } }));

describe('catch-all API proxy', () => {
    it('forwards PUT request to backend with correct URL', async () => {
        const mockFetch = vi.fn().mockResolvedValue(
            new Response(JSON.stringify({ id: 'block-1' }), {
                status: 200,
                headers: { 'Content-Type': 'application/json' }
            })
        );
        const event = {
            params: { path: 'documents/doc-1/transcription-blocks/block-1' },
            request: new Request('http://localhost/api/documents/doc-1/transcription-blocks/block-1', {
                method: 'PUT',
                headers: { 'Content-Type': 'application/json' },
                body: JSON.stringify({ text: 'hello' })
            }),
            url: new URL('http://localhost/api/documents/doc-1/transcription-blocks/block-1'),
            fetch: mockFetch
        };

        const { PUT } = await import('./+server');
        const response = await PUT(event as any);

        expect(mockFetch).toHaveBeenCalledWith(
            'http://backend:8080/api/documents/doc-1/transcription-blocks/block-1',
            expect.objectContaining({ method: 'PUT' })
        );
        expect(response.status).toBe(200);
    });
});

This would prove the URL construction, method forwarding, and response status passthrough are correct — the three things the production bug required.

2. Missing test: flushViaBeacon behavior when debounce already fired

The existing timer-cancellation test checks that saveFn is not called after flushViaBeacon. But there's no test for what happens if the debounce timer has already fired and executeSave is in-flight when flushViaBeacon is called. In that case, pendingTexts is already cleared, so flushViaBeacon should be a no-op. Worth adding:

it('does not double-save if executeSave already consumed the pending text', async () => {
    const as = createBlockAutoSave({ saveFn: mockSaveFn, documentId: 'doc-1' });
    as.handleTextChange('block-1', 'text');
    
    // debounce fires first
    await vi.advanceTimersByTimeAsync(1500);
    
    // now flush — text is already consumed by saveFn
    as.flushViaBeacon();
    
    expect(mockSaveFn).toHaveBeenCalledTimes(1);
    expect(mockFetch).not.toHaveBeenCalled();
});

3. What I found good

  • Red/green was followed — 4 tests exist before the implementation
  • Test names read as sentences describing behavior
  • Fake timers + vi.unstubAllGlobals() cleanup in afterEach prevents cross-test contamination
  • does nothing when there are no pending edits covers the empty-state case — good
Author
Owner

🎨 Leonie Voss (@leonievoss) — UI/UX Design Lead

Verdict: Approved

No UI changes in this PR — the fix is entirely in the data persistence layer. I checked the user-facing impact and have no blocking findings.

What I checked

Silent data loss is fixed — the core UX problem (user edits a transcription block, navigates away, text is lost without warning) is addressed. fetch({ keepalive: true }) fires before the page unloads, completing the save even if the user closes the tab.

No regression to the save state UI — the executeSave path (debounce → saving → saved → fading → idle) is unchanged. Users still see the save state indicator in the normal flow.

Observations (not blockers)

The fire-and-forget nature of flushOnUnload means users get no confirmation for the beacon save path. If keepalive: true fails (request body too large, browser budget exhausted), the user loses their edits silently — same as the old sendBeacon behavior.

This is an inherent limitation of the unload pattern, not something introduced by this PR. But if the project ever wants to improve the experience, options include:

  • Showing a "Saving..." indicator in beforeunload (browser blocks the unload briefly for this)
  • More aggressive auto-save timing (currently 1500ms debounce) so there's less unsaved state when navigation happens
  • A beforeunload confirmation if there are pending unsaved changes

None of these are required for this PR — tracking them as future UX improvements.
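The `beforeunload` confirmation option can be sketched as a tiny guard. This is a hedged illustration, not code from this PR: `confirmIfUnsaved` and the event shape are hypothetical, standing in for a check over the hook's pending edits.

```typescript
// Hedged sketch of a beforeunload confirmation; pendingCount stands in for
// "are there unsaved edits?" — a real hook would consult its pending-text map.
function confirmIfUnsaved(pendingCount: number, e: { preventDefault(): void }): boolean {
  if (pendingCount > 0) {
    e.preventDefault(); // browsers react by showing a generic "leave site?" dialog
    return true;
  }
  return false;
}

// Browser wiring (illustrative only):
// window.addEventListener('beforeunload', (e) => { confirmIfUnsaved(pending.size, e); });
```

Note that modern browsers ignore custom messages here; `preventDefault()` only triggers the generic confirmation dialog.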

marcel added 2 commits 2026-04-22 16:47:52 +02:00
Blocks requests with Content-Length > 1 048 576 bytes with 413.
Tests cover security guards, body limit, and response forwarding.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
refactor(autosave): rename flushViaBeacon → flushOnUnload; add void to fire-and-forget fetch
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m27s
CI / OCR Service Tests (push) Successful in 31s
CI / Backend Unit Tests (push) Failing after 2m48s
CI / Unit & Component Tests (pull_request) Failing after 2m44s
CI / OCR Service Tests (pull_request) Successful in 29s
CI / Backend Unit Tests (pull_request) Failing after 2m46s
61b89ac9e4
The sendBeacon name was misleading after switching to keepalive fetch.
Also adds a test to confirm flush is a no-op when pendingTexts is empty.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Author
Owner

Review concerns addressed

All reviewer concerns from the first review cycle have been resolved. Summary of changes:

Security — catch-all proxy (frontend/src/routes/api/[...path]/+server.ts)

  • Path traversal guard — requests with .. or %2e/%2E in the path are rejected with 400 (commit e5a2a895)
  • Actuator endpoint block — any path starting with actuator is blocked with 404 (same commit)
  • 1 MB body guard — requests with Content-Length > 1 048 576 are rejected with 413 (commit e5a2a895)
  • Response header forwarding — all response headers except hop-by-hop (transfer-encoding, connection, keep-alive, etc.) are now forwarded to the browser
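The guard order these bullets describe can be sketched as a pure function. The shape is an assumption for illustration — not the actual `+server.ts` code — returning an error status, or `null` to forward:

```typescript
// Sketch of the proxy's guard chain (assumed shape): path traversal first,
// then the actuator block, then the Content-Length cap.
function guardStatus(rawPath: string, contentLength: string | null): number | null {
  if (rawPath.includes('..') || /%2e/i.test(rawPath)) return 400; // path traversal
  if (rawPath.startsWith('actuator')) return 404;                 // Spring actuator probe
  if (contentLength !== null && parseInt(contentLength, 10) > 1_048_576) return 413; // 1 MB cap
  return null; // forward to the backend
}
```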

Tests — proxy (frontend/src/routes/api/[...path]/+server.spec.ts)

  • 12 unit tests added covering: path traversal (2), mixed-case encoded traversal (1), actuator block (2), body size limit (2), PUT forwarding (1), status passthrough (1), Content-Disposition forwarding (1), hop-by-hop header exclusion (1)

Naming and correctness — useBlockAutoSave

  • Renamed flushViaBeacon → flushOnUnload in hook, tests, and TranscriptionEditView.svelte — the old name was misleading after switching away from navigator.sendBeacon (commit 61b89ac9)
  • Added void to the fire-and-forget fetch(...) call to make the intentional discard explicit
  • New test confirms flushOnUnload is a no-op when pendingTexts is empty (debounce already fired)

All tests green (pre-existing ChronikErrorCard browser test failure unrelated to this PR).
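Under the assumptions in the PR description (the `PUT` URL shape, a pending-text map keyed by block id), the renamed flush can be sketched as follows — the exact hook internals are assumed, not quoted:

```typescript
// Minimal sketch of flushOnUnload; keepalive lets the request outlive the page,
// and unlike sendBeacon it supports PUT and custom headers.
function flushOnUnload(documentId: string, pendingTexts: Map<string, string>): void {
  for (const [blockId, text] of pendingTexts) {
    // void marks the intentional fire-and-forget discard of the promise
    void fetch(`/api/documents/${documentId}/transcription-blocks/${blockId}`, {
      method: 'PUT',
      keepalive: true,
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ text })
    });
  }
  pendingTexts.clear(); // a second flush is then a no-op
}
```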

Author
Owner

👨‍💻 Felix Brandt — Senior Fullstack Developer

Verdict: Approved

Good iteration. The TDD cycle was followed properly — tests exist for all new behaviours, the naming is honest, and the void keyword on the fire-and-forget fetch makes the deliberate discard explicit. Three suggestions worth considering before the next review round.

Suggestions

1. Test file naming triggers a Vitest warning

+server.spec.ts causes Vitest to log "Files prefixed with + are reserved" on every run. It still runs, but the warning is noise. Consider renaming to proxy.spec.ts — it sits alongside +server.ts so the association is still obvious.

2. No test for a request without Content-Length

The body size tests cover Content-Length: 1048577 (→ 413) and Content-Length: 1048576 (→ 200). They don't cover the case where the header is absent entirely. Since parseInt(null, 10) returns NaN and NaN > 1_048_576 is false, absent-header requests pass through — which is correct — but it's an untested code path and the NaN comparison is non-obvious. Either add a test (Content-Length header absent → request forwarded), or add a one-liner comment where the guard lives:

// parseInt returns NaN for null/non-numeric — NaN > n is false, so requests
// without Content-Length pass through (the backend/Caddy enforce the real limit).
const contentLength = event.request.headers.get('Content-Length');
if (contentLength && parseInt(contentLength, 10) > 1_048_576) { ... }
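The `NaN` behaviour is easy to sanity-check in isolation:

```typescript
// parseInt coerces null to the string "null", which parses to NaN;
// NaN compares false against everything, so the guard lets the request through.
const limit = 1_048_576;
const absent: string | null = null;                   // headers.get('Content-Length') when absent
const parsed = parseInt(absent as unknown as string, 10);
console.log(Number.isNaN(parsed));                    // true
console.log(parsed > limit);                          // false → request is forwarded
```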

3. PATCH is exported but not covered by a forwarding test

+server.ts exports PATCH but the forwarding suite only exercises GET, PUT, and POST. A one-liner test that confirms PATCH reaches the backend with the correct URL would close the gap.

Author
Owner

🏗️ Markus Keller — Application Architect

Verdict: ⚠️ Approved with concerns

The architecture is sound. event.fetch → handleFetch → auth injection is the correct SvelteKit pattern — authentication stays server-side and never touches the browser's request. The catch-all route sits at the right routing priority. Two structural concerns I'd address before shipping.
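The injection step can be sketched as a pure helper. Cookie and header names follow the PR description; the helper itself (`injectAuth`) is hypothetical — the real logic lives inside handleFetch in hooks.server.ts:

```typescript
// Sketch of the auth-injection step: only requests bound for the internal
// backend origin get the Authorization header; everything else passes untouched.
function injectAuth(request: Request, token: string | undefined, backendOrigin: string): Request {
  if (token && request.url.startsWith(backendOrigin)) {
    const headers = new Headers(request.headers);
    headers.set('Authorization', token);
    return new Request(request, { headers });
  }
  return request;
}
```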

Concerns

1. Use $env/dynamic/private instead of import { env } from 'process' (suggestion, not blocker)

+server.ts currently imports:

import { env } from 'process';
const apiUrl = env.API_INTERNAL_URL || 'http://localhost:8080';

The idiomatic SvelteKit way with adapter-node is:

import { env } from '$env/dynamic/private';
const apiUrl = env.API_INTERNAL_URL ?? 'http://localhost:8080';

$env/dynamic/private is tree-shaken by Vite, gets validated by svelte-check, and follows the same pattern used in hooks.server.ts. Using process.env directly works but bypasses SvelteKit's environment handling layer.

2. The Content-Length guard is header-only — the real body limit must come from Caddy (concern)

The guard checks the Content-Length header, but:

  • HTTP/1.1 chunked transfer encoding omits Content-Length entirely
  • An attacker can send any body size with no Content-Length header
  • await event.request.arrayBuffer() buffers the full body in memory regardless of what the header says

The 1MB guard protects well-behaved oversized JSON requests, but the actual memory safety contract relies on adapter-node's default body limit and/or a Caddy body size limit. If Caddy doesn't already have a request_body limit directive, it should:

@api path /api/*
handle @api {
    request_body {
        max_size 10MB
    }
    reverse_proxy frontend:3000
}

This is an infrastructure concern (belongs in the Caddyfile, not this PR), but it should be tracked. The body guard in this PR is a good defence-in-depth layer — it just isn't sufficient alone.

Author
Owner

🔐 Nora "NullX" Steiner — Application Security Engineer

Verdict: ⚠️ Approved with concerns

Big improvement over the original sendBeacon approach. The path traversal and actuator guards are correctly implemented and tested. I've walked every line of the security boundary — findings below.

What's solid

Path traversal (CWE-23): rawPath.includes('..') catches literal double-dots. /%(2e)/i catches single-encoded dots. Double-encoded variants (%252e) are decoded by SvelteKit to %2e before reaching event.params.path, so the guard sees them correctly.

Actuator block: rawPath.startsWith('actuator') is correct — the catch-all receives paths without a leading slash, so actuator/heapdump matches.

Auth boundary: event.fetch triggers handleFetch in hooks.server.ts, which reads auth_token from the server-side cookie and injects Authorization: <token> before the request hits the backend. The browser's cookie and auth headers are never forwarded — this is the right architecture.

Hop-by-hop stripping: The set covers all RFC 2616 hop-by-hop headers. One omission below.
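That stripping can be sketched as a simple filter — the set contents match RFC 2616, but the function shape is illustrative, not the actual proxy code:

```typescript
// Copy every upstream response header except the hop-by-hop set.
const HOP_BY_HOP = new Set([
  'transfer-encoding', 'connection', 'keep-alive', 'upgrade',
  'proxy-authenticate', 'proxy-authorization', 'te', 'trailer'
]);

function forwardableHeaders(upstream: Headers): Headers {
  const out = new Headers();
  upstream.forEach((value, name) => {
    if (!HOP_BY_HOP.has(name.toLowerCase())) out.append(name, value);
  });
  return out;
}
```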


Concerns

1. Body size guard is header-only — CWE-400 (Uncontrolled Resource Consumption)

const contentLength = event.request.headers.get('Content-Length');
if (contentLength && parseInt(contentLength, 10) > 1_048_576) {
    return new Response('Payload Too Large', { status: 413 });
}

A request using Transfer-Encoding: chunked omits Content-Length entirely. await event.request.arrayBuffer() on line 50 buffers the full body into memory with no size check. Payload: send a 500MB chunked POST — the guard doesn't fire.

Mitigation options (choose one):

  • Add a Caddy request_body { max_size 10MB } directive in front of the SvelteKit node server — this is the real enforcement point
  • Or read the body first and check its byte length before forwarding:
const hasBody = !['GET', 'HEAD'].includes(event.request.method);
const bodyBuffer = hasBody ? await event.request.arrayBuffer() : undefined;
if (bodyBuffer && bodyBuffer.byteLength > 1_048_576) {
    return new Response('Payload Too Large', { status: 413 });
}

The second approach is a blocker if there's no reverse-proxy body limit. If Caddy already enforces a limit, this is a suggestion.

2. proxy-connection missing from hop-by-hop set (minor)

proxy-connection is a non-standard but widely-used hop-by-hop header sent by some HTTP/1.0 proxies. Add it to HOP_BY_HOP_HEADERS:

const HOP_BY_HOP_HEADERS = new Set([
    'transfer-encoding', 'connection', 'keep-alive', 'upgrade',
    'proxy-authenticate', 'proxy-authorization', 'te', 'trailer',
    'proxy-connection'  // non-standard but real-world
]);

3. Backend Set-Cookie forwarded to browser (observation, not a blocker)

The response header forwarding passes Set-Cookie through if the backend sends it. In this architecture the backend uses Basic Auth via the auth_token cookie managed by SvelteKit — the backend shouldn't be issuing cookies. But if a future endpoint does, the browser would receive a backend-issued cookie. No current risk, but worth being aware of.


Summary

No confirmed exploitable vulnerabilities in the current state. The path traversal and actuator guards are well-implemented. The body size concern is real but only a blocker if there is no upstream reverse-proxy body limit. Recommend tracking the Caddy request_body directive as a follow-up issue.

Author
Owner

🧪 Sara Holt — QA Engineer

Verdict: ⚠️ Approved with concerns

Good test coverage overall. The vi.stubGlobal/vi.unstubAllGlobals cleanup pattern is correct, vi.useFakeTimers is properly scoped, and the debounce/flush interaction tests are meaningful. Three gaps to address.

Concerns

1. Test file name triggers a repeating Vitest warning (suggestion)

+server.spec.ts causes Vitest to print "Files prefixed with + are reserved" on every test run. It's noise in CI output. Rename to proxy.spec.ts — the file lives next to +server.ts so the association is preserved, and the warning disappears.

2. PATCH method is exported but not tested

+server.ts exports all five methods:

export const GET: RequestHandler = ...;  // likewise POST, PUT, PATCH, DELETE

The forwarding suite tests GET, PUT, and POST. PATCH and DELETE have no forwarding coverage. Add at minimum one test for PATCH to confirm the same routing logic applies. DELETE is imported by the test file but never exercised — either add a test or remove the export if it's not needed.

3. No test for absent Content-Length header

The body size tests verify:

  • Content-Length: 1048577 → 413
  • Content-Length: 1048576 → 200

But there's no test for a request where Content-Length is absent (the common case for most API calls). parseInt(null, 10) returns NaN and NaN > 1_048_576 is false — so the request passes through correctly — but this branch is untested. Add:

it('forwards request when Content-Length header is absent', async () => {
    const mockFetch = vi.fn().mockResolvedValue(new Response('{}', { status: 200 }));
    const event = makeEvent('documents', 'POST', mockFetch);
    event.request = new Request('http://localhost/api/documents', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({})
    });
    const response = await POST(event as never);
    expect(response.status).toBe(200);
    expect(mockFetch).toHaveBeenCalled();
});

What's solid

  • All 5 flushOnUnload tests are meaningful and cover distinct behaviours (PUT+keepalive, no sendBeacon, no-op on empty, debounce timer cancelled, no double-fire after debounce)
  • makeEvent factory keeps test bodies concise
  • Proxy tests are properly isolated — event.fetch is mocked, no real network calls
Author
Owner

⚙️ Tobias Wendt — DevOps & Platform Engineer

Verdict: Approved

No infrastructure changes in this PR — no new services, no new environment variables, no Compose changes. Checked the things that matter from a deployment and ops perspective.

Observations (non-blocking)

1. import { env } from 'process' vs $env/dynamic/private

+server.ts reads API_INTERNAL_URL via Node's process.env. This works fine with adapter-node, but it doesn't get Vite's environment variable validation. hooks.server.ts uses the same pattern, so this is consistent across the codebase. If $env/dynamic/private is ever adopted project-wide, this file should move with it.

2. The 1MB Content-Length guard doesn't protect against chunked bodies

Chunked transfer encoding omits Content-Length. The arrayBuffer() call buffers without limit for those requests. The real protection needs to be at Caddy:

app.example.com {
    request_body {
        max_size 50MB   # generous ceiling; SvelteKit guard handles JSON; Caddy stops truly large requests
    }
    reverse_proxy localhost:3000
}

If this isn't already in the Caddyfile, it should be. Worth a follow-up issue.

3. API_INTERNAL_URL fallback is correct

env.API_INTERNAL_URL || 'http://localhost:8080' — the fallback is right for local dev. In the Docker Compose stack, API_INTERNAL_URL=http://backend:8080 is already set (used by hooks.server.ts). No new config required.

What looks good

  • No new required env vars
  • No Docker Compose changes needed
  • More-specific routes (/api/persons, /api/tags, /api/documents/[id]/file) retain precedence — catch-all doesn't break anything existing
  • keepalive: true on the unload fetch is the right call for a transient page-leaving context
Author
Owner

🎨 Leonie Voss — UX Designer & Accessibility Strategist

Verdict: Approved

No UI changes in this PR. The only .svelte file touched is TranscriptionEditView.svelte, and the change is a single method rename (flushViaBeacon() → flushOnUnload()) with zero impact on rendered output, markup, styling, or user-facing behaviour.

Checked:

  • No changes to Tailwind classes, layout, or component structure
  • No new interactive elements, focus management, or ARIA attributes affected
  • No touch target or contrast changes
  • beforeunload listener wiring is unchanged — the save-on-navigate behaviour the user experiences is identical

Nothing to flag from a design or accessibility perspective.

marcel added 4 commits 2026-04-22 18:26:59 +02:00
Eliminates the Vitest "Files prefixed with + are reserved" warning
on every test run.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Closes the two untested code paths flagged in review:
- PATCH method routes to backend with correct URL
- Requests without Content-Length header pass through (NaN > n is always false)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Chunked requests omit Content-Length entirely. The previous guard
only checked the header and was bypassed. Now the body is buffered
first and its byteLength is checked, catching both cases.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fix(proxy): block proxy-connection hop-by-hop header from client responses
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m37s
CI / OCR Service Tests (push) Successful in 30s
CI / Backend Unit Tests (push) Failing after 2m47s
CI / Unit & Component Tests (pull_request) Failing after 2m34s
CI / OCR Service Tests (pull_request) Successful in 36s
CI / Backend Unit Tests (pull_request) Failing after 2m59s
3488375ad5
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Author
Owner

Review concerns addressed

All actionable concerns from the second review round have been resolved. Summary by commit:

61b89ac9 — refactor(autosave): rename flushViaBeacon → flushOnUnload; add void to fire-and-forget fetch

  • Renamed flushViaBeacon → flushOnUnload — removes the misleading beacon name now that the implementation uses fetch({keepalive: true}) instead of sendBeacon
  • Added void operator to the fire-and-forget fetch(...) call — makes the intentional promise discard explicit, silences linters, aids readers
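The renamed flush can be sketched like this — a minimal sketch, not the actual useBlockAutoSave internals: the URL and payload shape are placeholders, and only the fetch options mirror what the commit describes.

```typescript
// Minimal sketch of the renamed unload flush. The payload shape and URL
// are placeholders; only the fetch options mirror the PR description.
function flushOnUnload(url: string, payload: unknown): void {
  // keepalive lets the request outlive the page, like sendBeacon did,
  // but unlike sendBeacon it supports PUT and custom headers.
  // `void` makes the intentional fire-and-forget discard explicit.
  void fetch(url, {
    method: 'PUT',
    keepalive: true,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload)
  });
}
```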

4f840f67 — refactor(proxy): rename +server.spec.ts → proxy.spec.ts

  • Renamed test file from +server.spec.ts → proxy.spec.ts — Vitest was logging a "Files prefixed with + are reserved" warning because + is a SvelteKit routing convention

8eba2cda — test(proxy): add PATCH forwarding and absent Content-Length coverage

  • Added PATCH forwarding test — confirms the PATCH handler forwards to the backend with the correct URL and method
  • Added absent Content-Length test — confirms requests without a Content-Length header are not incorrectly rejected (guards against parseInt(null) → NaN regression)
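The NaN pitfall that test guards against can be shown in isolation. A hedged sketch — the 1 MiB limit is assumed from the review discussion, and the function name is illustrative:

```typescript
// Why a header-only size check silently passes requests with no
// Content-Length: headers.get('content-length') returns null, parseInt
// on that yields NaN, and every comparison involving NaN is false.
const LIMIT = 1_048_576; // assumed 1 MiB limit from the review

function headerSaysTooLarge(contentLength: string | null): boolean {
  const n = Number.parseInt(contentLength ?? '', 10); // absent header → NaN
  return n > LIMIT; // NaN > LIMIT is false, so absent headers pass through
}
```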

41092994 — fix(proxy): enforce body size limit on actual byteLength, not just Content-Length header

  • Added byteLength guard — after buffering the body with arrayBuffer(), checks byteLength > 1_048_576. The header-only check could be bypassed by chunked transfer encoding which omits Content-Length entirely
  • Test: verifies a 1 MB + 1 byte body with a lying Content-Length: 0 header still gets a 413
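The byteLength guard can be sketched as follows — a minimal sketch assuming the 1 MiB limit above; the predicate name and the 413 wiring are illustrative, not the PR's exact code.

```typescript
// Sketch of the buffered size guard: buffer the body first, then check
// the actual byteLength. This catches chunked requests (no Content-Length)
// and requests with a lying Content-Length header alike.
const MAX_BODY_BYTES = 1_048_576;

function bodyTooLarge(body: ArrayBuffer): boolean {
  return body.byteLength > MAX_BODY_BYTES;
}

// In the route handler, a true result would be turned into
// new Response(null, { status: 413 }) before any forwarding happens.
```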

3488375a — fix(proxy): block proxy-connection hop-by-hop header from client responses

  • Added proxy-connection to HOP_BY_HOP_HEADERS — non-standard but real-world header (RFC 2616 §14.10) sent by some proxies; must not be forwarded to clients
  • Test: confirms the header is stripped from backend responses
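The response-header filtering can be sketched like this. Only `proxy-connection` is confirmed by the commit above; the rest of the set is the standard hop-by-hop list, and the exact contents in the PR may differ.

```typescript
// Hop-by-hop headers must not be forwarded by a proxy. Standard list
// plus the non-standard proxy-connection added in this commit; the
// exact set in the PR may differ.
const HOP_BY_HOP_HEADERS = new Set([
  'connection', 'keep-alive', 'proxy-authenticate', 'proxy-authorization',
  'proxy-connection', 'te', 'trailer', 'transfer-encoding', 'upgrade'
]);

// Copy only end-to-end headers from a backend response before returning
// it to the client.
function stripHopByHop(headers: Headers): Headers {
  const out = new Headers();
  headers.forEach((value, name) => {
    if (!HOP_BY_HOP_HEADERS.has(name.toLowerCase())) out.set(name, value);
  });
  return out;
}
```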

All 5 tasks — full test suite green (4 pre-existing browser-mode failures in DocumentRow.svelte.spec.ts are unrelated to this PR).

marcel merged commit 31713c324b into main 2026-04-23 07:12:23 +02:00
marcel deleted branch fix/204-transcription-beacon-proxy 2026-04-23 07:12:27 +02:00
Reference: marcel/familienarchiv#304