fix(document-viewer): surface error + retry when file load stalls instead of spinning forever #322

Open
opened 2026-04-24 13:24:20 +02:00 by marcel · 8 comments
Owner

Context

On document detail, the PDF/image viewer currently shows only an indefinite loading spinner when the underlying file can't be fetched. Captured twice during Phase B2 audit on 2026-04-24 — the viewer spun for the entire screenshot timeout with no error, no retry, no diagnostic.

Users have no way to distinguish:

  • Slow network (retry likely to help).
  • Backend is unreachable (retry may help after a delay).
  • MinIO is down (retry won't help; admin needs to act).
  • File was deleted (retry will never help; user should ask admin or move on).
  • Permission denied (retry won't help; user may need a different account).

Silent hangs feel like a broken app. Particularly costly on mobile (reader persona on phones with variable connectivity — this is the primary consumer surface).

Non-goals

  • Not changing the viewer component itself (scaling, annotations, zoom).
  • Not handling MinIO presigned-URL expiry specially — surfaces as HTTP 403, treat generically.
  • No offline/PWA caching behaviour in this issue.

Proposed behaviour

Wrap the viewer in an error boundary that watches for:

  • load / error events from the underlying <img>, <iframe>, or PDF.js instance.
  • A 10-second timeout from navigation if no load event has fired yet.
  • HTTP response codes from the file-fetch request (via fetch() preflight or onerror with status capture — pick whichever is cleanest for the current implementation).

State machine:

Trigger UI
Initial Skeleton + spinner (current behaviour)
load fires Viewer renders (current behaviour)
10 s no load, no err "Das Dokument lädt länger als erwartet." + [Erneut versuchen] [Problem melden]
HTTP 404 "Die Datei wurde nicht gefunden oder wurde gelöscht." + [Zurück] [Problem melden]
HTTP 403 "Sie haben keine Berechtigung für diese Datei." + [Zurück]
HTTP 5xx "Der Server ist zurzeit nicht erreichbar. Bitte versuchen Sie es später."
Network failure (offline) "Keine Verbindung. Bitte prüfen Sie Ihr Netzwerk."
  • Retry button re-runs the fetch with the same URL; resets timeout.
  • Report button opens a pre-filled issue report (Gitea new-issue URL with prefilled title/body containing document id + current URL + timestamp) OR mailto fallback. Prefer the former if the app has a Gitea API token scope available; otherwise mailto.

Implementation plan

Frontend

  • New component frontend/src/lib/components/DocumentViewerErrorBoundary.svelte:
    • Props: documentId, fileUrl, children snippet.
    • Manages internal state status: 'loading' | 'ok' | 'timeout' | 'error-404' | 'error-403' | 'error-5xx' | 'network'.
    • Timer via setTimeout(10_000); cleared on child load event.
    • Error UI component per status, all localized.
  • Wire it into frontend/src/routes/documents/[id]/+page.svelte around the existing viewer component.
  • Retry = set status = 'loading' and regenerate the URL (cache-bust with a ?retry=<n> param to force a fresh fetch).

Backend

  • Ensure file-fetch endpoint returns proper HTTP codes:
    • 404 when the Document exists but its S3 object is missing.
    • 403 when permissions fail.
    • 500 for anything else.
    • (Verify current behaviour first — may already be correct.)

i18n

6 new Paraglide keys:
viewer_error_timeout_title, viewer_error_404, viewer_error_403, viewer_error_5xx, viewer_error_network, viewer_error_retry, viewer_error_report.

Tests

  • Component: dispatch error event with mocked status code, assert correct message renders.
  • Component: advance timers by 10 s with no load event, assert timeout UI renders.
  • Component: click retry, assert URL is re-fetched with cache-bust.
  • E2E: with MinIO stopped, navigate to a document → within 10 s see error UI with retry button.
  • E2E: navigate to a deleted document (delete via API mid-test), reload, expect 404 UI.
  • Playwright: throttle network to offline context, reload, expect "Keine Verbindung" UI.

Verification

Manual runs on the dev stack:

  1. Stop MinIO (docker compose stop minio). Reload a document page. Within 10 s, error UI renders with retry. Start MinIO. Click retry. Page loads.
  2. Via admin or DB, flip a Document.visible=false (if such a flag exists, else simulate 403). Navigate. 403 UI shows.
  3. Chrome DevTools → Network → Offline. Reload. Network-error UI shows.

Acceptance criteria

  • 10-second timeout triggers a retry-able error state
  • HTTP 404 / 403 / 5xx produce distinct, specific messages
  • Network error (offline) produces a specific message
  • Retry button re-issues the fetch with cache-bust
  • Report button produces a pre-filled Gitea issue URL (or mailto fallback)
  • i18n complete for de/en/es
  • Tests cover each code path
  • No regression on happy path (viewer renders as fast as before)

Critical files

frontend/src/lib/components/DocumentViewerErrorBoundary.svelte   (new)
frontend/src/lib/components/DocumentViewer.svelte                (wrap)
frontend/src/routes/documents/[id]/+page.svelte                  (wire)
frontend/src/lib/components/DocumentViewerErrorBoundary.spec.ts  (new)
frontend/e2e/viewer-errors.spec.ts                               (new)
frontend/messages/{de,en,es}.json
  • #318 (reader-path responsive) — viewer error UI must also pass the mobile-first bar.
## Context On document detail, the PDF/image viewer currently shows only an indefinite loading spinner when the underlying file can't be fetched. Captured twice during Phase B2 audit on 2026-04-24 — the viewer spun for the entire screenshot timeout with no error, no retry, no diagnostic. Users have no way to distinguish: - Slow network (retry likely to help). - Backend is unreachable (retry may help after a delay). - MinIO is down (retry won't help; admin needs to act). - File was deleted (retry will never help; user should ask admin or move on). - Permission denied (retry won't help; user may need a different account). **Silent hangs feel like a broken app.** Particularly costly on mobile (reader persona on phones with variable connectivity — this is the primary consumer surface). ## Non-goals - Not changing the viewer component itself (scaling, annotations, zoom). - Not handling MinIO presigned-URL expiry specially — surfaces as HTTP 403, treat generically. - No offline/PWA caching behaviour in this issue. ## Proposed behaviour Wrap the viewer in an error boundary that watches for: - `load` / `error` events from the underlying `<img>`, `<iframe>`, or PDF.js instance. - A 10-second timeout from navigation if no `load` event has fired yet. - HTTP response codes from the file-fetch request (via `fetch()` preflight or `onerror` with status capture — pick whichever is cleanest for the current implementation). State machine: | Trigger | UI | |-------------------------|---------------------------------------------------------------------------------| | Initial | Skeleton + spinner (current behaviour) | | `load` fires | Viewer renders (current behaviour) | | 10 s no `load`, no err | "Das Dokument lädt länger als erwartet." + [Erneut versuchen] [Problem melden] | | HTTP 404 | "Die Datei wurde nicht gefunden oder wurde gelöscht." + [Zurück] [Problem melden]| | HTTP 403 | "Sie haben keine Berechtigung für diese Datei." + [Zurück] | | HTTP 5xx | "Der Server ist zurzeit nicht erreichbar. Bitte versuchen Sie es später." | | Network failure (offline) | "Keine Verbindung. Bitte prüfen Sie Ihr Netzwerk." | - Retry button re-runs the fetch with the same URL; resets timeout. - Report button opens a pre-filled issue report (Gitea new-issue URL with prefilled title/body containing document id + current URL + timestamp) OR mailto fallback. Prefer the former if the app has a Gitea API token scope available; otherwise mailto. ## Implementation plan ### Frontend - New component `frontend/src/lib/components/DocumentViewerErrorBoundary.svelte`: - Props: `documentId`, `fileUrl`, `children snippet`. - Manages internal state `status: 'loading' | 'ok' | 'timeout' | 'error-404' | 'error-403' | 'error-5xx' | 'network'`. - Timer via `setTimeout(10_000)`; cleared on child `load` event. - Error UI component per status, all localized. - Wire it into `frontend/src/routes/documents/[id]/+page.svelte` around the existing viewer component. - Retry = set `status = 'loading'` and regenerate the URL (cache-bust with a `?retry=<n>` param to force a fresh fetch). ### Backend - Ensure file-fetch endpoint returns proper HTTP codes: - 404 when the Document exists but its S3 object is missing. - 403 when permissions fail. - 500 for anything else. - (Verify current behaviour first — may already be correct.) ### i18n 6 new Paraglide keys: `viewer_error_timeout_title`, `viewer_error_404`, `viewer_error_403`, `viewer_error_5xx`, `viewer_error_network`, `viewer_error_retry`, `viewer_error_report`. ## Tests - **Component:** dispatch `error` event with mocked status code, assert correct message renders. - **Component:** advance timers by 10 s with no load event, assert timeout UI renders. - **Component:** click retry, assert URL is re-fetched with cache-bust. - **E2E:** with MinIO stopped, navigate to a document → within 10 s see error UI with retry button. - **E2E:** navigate to a deleted document (delete via API mid-test), reload, expect 404 UI. - **Playwright:** throttle network to `offline` context, reload, expect "Keine Verbindung" UI. ## Verification Manual runs on the dev stack: 1. Stop MinIO (`docker compose stop minio`). Reload a document page. Within 10 s, error UI renders with retry. Start MinIO. Click retry. Page loads. 2. Via admin or DB, flip a `Document.visible=false` (if such a flag exists, else simulate 403). Navigate. 403 UI shows. 3. Chrome DevTools → Network → Offline. Reload. Network-error UI shows. ## Acceptance criteria - [ ] 10-second timeout triggers a retry-able error state - [ ] HTTP 404 / 403 / 5xx produce distinct, specific messages - [ ] Network error (offline) produces a specific message - [ ] Retry button re-issues the fetch with cache-bust - [ ] Report button produces a pre-filled Gitea issue URL (or mailto fallback) - [ ] i18n complete for de/en/es - [ ] Tests cover each code path - [ ] No regression on happy path (viewer renders as fast as before) ## Critical files ``` frontend/src/lib/components/DocumentViewerErrorBoundary.svelte (new) frontend/src/lib/components/DocumentViewer.svelte (wrap) frontend/src/routes/documents/[id]/+page.svelte (wire) frontend/src/lib/components/DocumentViewerErrorBoundary.spec.ts (new) frontend/e2e/viewer-errors.spec.ts (new) frontend/messages/{de,en,es}.json ``` ## Related - #318 (reader-path responsive) — viewer error UI must also pass the mobile-first bar.
marcel added this to the Reader Experience v1 milestone 2026-04-24 13:24:20 +02:00
marcel added the P1-highbugui labels 2026-04-24 13:28:09 +02:00
marcel modified the milestone from Reader Experience v1 to Demo Day — family get-together 2026-04-24 13:35:15 +02:00
Author
Owner

👨‍💻 Markus Keller — Application Architect

Observations

  • Backend error propagation is already correct. DocumentController.getDocumentFile() already maps StorageFileNotFoundException → DomainException.notFound(FILE_NOT_FOUND) and permission failures go through Spring Security / PermissionAspect as 403. The GlobalExceptionHandler converts every DomainException to a structured JSON body with an ErrorCode. No backend changes are strictly needed — the issue's "Verify current behaviour first" note is already answered: it is correct.

  • The actual gap is entirely in useFileLoader.svelte.ts. The loadFile() method currently does if (!response.ok) throw new Error('Failed to load file') — it discards the HTTP status code completely before the catch block converts everything to the same hardcoded German string. Status 404, 403, and 503 are all treated identically.

  • DocumentViewer.svelte already accepts error: string as a prop and renders it. The viewer itself does not need to be a new "error boundary component" — the existing prop surface is sufficient. The issue proposes a new DocumentViewerErrorBoundary.svelte wrapper, which would duplicate the orchestration logic already in +page.svelte. The simpler path is to extend useFileLoader to expose a typed status instead.

  • The proposed component boundary is questionable. A DocumentViewerErrorBoundary.svelte with documentId + fileUrl + children snippet props would need to know about loading state, retry logic, AND rendering the error UI — that is three concerns in one component. The existing separation (useFileLoader owns fetch state, DocumentViewer renders based on props) is cleaner.

  • The "Report button → Gitea new-issue URL" idea has no architectural home. A deep link to the Gitea instance URL (heim-nas:3005) would be hardcoded infrastructure detail in frontend code, or require a new config surface. The mailto: fallback is simpler and does not couple the frontend to the ops environment.

Recommendations

  • Extend useFileLoader to expose fileStatus: 'loading' | 'ok' | 'timeout' | 'error-404' | 'error-403' | 'error-5xx' | 'network' instead of just fileError: string. The caller (+page.svelte) passes this to DocumentViewer, which renders the appropriate message. This requires zero new components and no new prop contracts.

  • Implement timeout via AbortController with a 10-second signal inside useFileLoader.loadFile(). This is cleaner than a setTimeout that races against fetch completion and requires manual cleanup.

  • Drop the DocumentViewerErrorBoundary.svelte from the plan. The existing useFileLoaderDocumentViewer pipeline already has the right shape. Adding a wrapper component introduces a layer with no clear name — "ErrorBoundary" names an error handling mechanism, not a visible UI region.

  • Use mailto: for the report button, not a Gitea URL. The Gitea instance hostname is internal infrastructure. Hardcoding it in frontend code is an ops coupling that breaks whenever the instance moves. mailto:admin@familyarchive.local?subject=Viewer+error+%23<documentId> requires no config surface.

  • No architecture documentation updates needed (no new packages, routes, or services). The CLAUDE.md package table and the C4 diagrams are unaffected.

Open Decisions

  • Should the timeout be configurable (e.g., via an env var) or hardcoded at 10s? The issue hardcodes 10s. For a family app with variable home-network latency, 10s may be too short for large PDFs on slow Wi-Fi. 15–20s is a reasonable alternative, but this is a product judgment, not an architecture one.
## 👨‍💻 Markus Keller — Application Architect ### Observations - **Backend error propagation is already correct.** `DocumentController.getDocumentFile()` already maps `StorageFileNotFoundException → DomainException.notFound(FILE_NOT_FOUND)` and permission failures go through Spring Security / `PermissionAspect` as 403. The `GlobalExceptionHandler` converts every `DomainException` to a structured JSON body with an `ErrorCode`. No backend changes are strictly needed — the issue's "Verify current behaviour first" note is already answered: it is correct. - **The actual gap is entirely in `useFileLoader.svelte.ts`.** The `loadFile()` method currently does `if (!response.ok) throw new Error('Failed to load file')` — it discards the HTTP status code completely before the catch block converts everything to the same hardcoded German string. Status 404, 403, and 503 are all treated identically. - **`DocumentViewer.svelte` already accepts `error: string` as a prop** and renders it. The viewer itself does not need to be a new "error boundary component" — the existing prop surface is sufficient. The issue proposes a new `DocumentViewerErrorBoundary.svelte` wrapper, which would duplicate the orchestration logic already in `+page.svelte`. The simpler path is to extend `useFileLoader` to expose a typed status instead. - **The proposed component boundary is questionable.** A `DocumentViewerErrorBoundary.svelte` with `documentId` + `fileUrl` + `children snippet` props would need to know about loading state, retry logic, AND rendering the error UI — that is three concerns in one component. The existing separation (`useFileLoader` owns fetch state, `DocumentViewer` renders based on props) is cleaner. - **The "Report button → Gitea new-issue URL" idea has no architectural home.** A deep link to the Gitea instance URL (heim-nas:3005) would be hardcoded infrastructure detail in frontend code, or require a new config surface. The `mailto:` fallback is simpler and does not couple the frontend to the ops environment. ### Recommendations - **Extend `useFileLoader` to expose `fileStatus: 'loading' | 'ok' | 'timeout' | 'error-404' | 'error-403' | 'error-5xx' | 'network'`** instead of just `fileError: string`. The caller (`+page.svelte`) passes this to `DocumentViewer`, which renders the appropriate message. This requires zero new components and no new prop contracts. - **Implement timeout via `AbortController` with a 10-second signal** inside `useFileLoader.loadFile()`. This is cleaner than a `setTimeout` that races against fetch completion and requires manual cleanup. - **Drop the `DocumentViewerErrorBoundary.svelte` from the plan.** The existing `useFileLoader` → `DocumentViewer` pipeline already has the right shape. Adding a wrapper component introduces a layer with no clear name — "ErrorBoundary" names an error handling mechanism, not a visible UI region. - **Use `mailto:` for the report button, not a Gitea URL.** The Gitea instance hostname is internal infrastructure. Hardcoding it in frontend code is an ops coupling that breaks whenever the instance moves. `mailto:admin@familyarchive.local?subject=Viewer+error+%23<documentId>` requires no config surface. - **No architecture documentation updates needed** (no new packages, routes, or services). The CLAUDE.md package table and the C4 diagrams are unaffected. ### Open Decisions - **Should the timeout be configurable (e.g., via an env var) or hardcoded at 10s?** The issue hardcodes 10s. For a family app with variable home-network latency, 10s may be too short for large PDFs on slow Wi-Fi. 15–20s is a reasonable alternative, but this is a product judgment, not an architecture one.
Author
Owner

👨‍💻 Felix Brandt — Fullstack Developer

Observations

  • useFileLoader.svelte.ts discards the HTTP status code. Line 21 reads if (!response.ok) throw new Error('Failed to load file') — the status is available on response.status but is not captured before the throw. The catch block then sets a hardcoded German string regardless of whether the failure was 404, 403, 503, or a network error. The tests in useFileLoader.svelte.test.ts confirm this: the test 'sets fileError on a failed fetch (non-ok response)' only checks fileError !== '' — it does not assert the status is differentiated.

  • DocumentViewer.svelte already handles error: string rendering (lines 66–77 in the current file). However, the error UI only shows a generic {error} paragraph and a download link. It does not differentiate by error type for distinct messages or actions (retry vs. back).

  • PdfViewer.svelte has its own inline error state (renderer.error) that also renders a hardcoded German string ("Fehler beim Laden der PDF") without going through the error prop of DocumentViewer. This is a second unhandled error surface not mentioned in the issue — if PDF.js itself fails to parse the blob, the error boundary in useFileLoader will not catch it.

  • The loadFile function does not capture response.status before the exception path, making it impossible to distinguish 404 vs. 403 vs. 5xx after the fact without refactoring.

  • The retry mechanism needs a cache-bust. The issue proposes ?retry=<n>. However, useFileLoader creates an object URL from a blob — it does not serve the URL directly to <img> or <iframe>. The cache-bust needs to be on the fetch() call to the backend, not on the blob URL. A ?_t=<Date.now()> query param on the /api/documents/${id}/file request achieves this.

  • The image viewer path ({:else if fileUrl} in DocumentViewer.svelte line 107) has no onload/onerror handler at all. If the image blob itself is corrupt or the createObjectURL returns an empty blob, it silently renders nothing.

Recommendations

  • Refactor useFileLoader to capture and expose httpStatus: number | null alongside fileError. Replace the single fileError: string with fileStatus: 'loading' | 'ok' | 'timeout' | 'error-404' | 'error-403' | 'error-5xx' | 'network'. Derive the human-readable message in the component, not in the hook — the hook should expose facts, not display strings.

    // useFileLoader.svelte.ts — new shape
    const response = await fetch(url, { signal: controller.signal });
    if (!response.ok) {
      fileStatus = response.status === 404 ? 'error-404'
        : response.status === 403 ? 'error-403'
        : response.status >= 500 ? 'error-5xx'
        : 'error-5xx';
      return;
    }
    
  • Add AbortController + a 10-second timeout signal to loadFile(). On abort, set fileStatus = 'timeout'. Clear the controller on successful load or explicit retry.

  • Add onerror handler to the <img> element in DocumentViewer.svelte to surface image decode failures.

  • Use ?_t=${Date.now()} on the retry fetch — not on the blob URL.

  • Write the failing tests first (per TDD): add test cases to useFileLoader.svelte.test.ts for status: 404, status: 403, status: 503, and AbortError before touching the production code.

  • Address the PdfViewer.svelte inline error state separately or in the same PR — it should either pipe through the error prop or use the same status enum. Leaving two independent error surfaces will confuse the UX.

Open Decisions

  • None — the implementation path is clear.
## 👨‍💻 Felix Brandt — Fullstack Developer ### Observations - **`useFileLoader.svelte.ts` discards the HTTP status code.** Line 21 reads `if (!response.ok) throw new Error('Failed to load file')` — the status is available on `response.status` but is not captured before the throw. The catch block then sets a hardcoded German string regardless of whether the failure was 404, 403, 503, or a network error. The tests in `useFileLoader.svelte.test.ts` confirm this: the test `'sets fileError on a failed fetch (non-ok response)'` only checks `fileError !== ''` — it does not assert the status is differentiated. - **`DocumentViewer.svelte` already handles `error: string` rendering** (lines 66–77 in the current file). However, the error UI only shows a generic `{error}` paragraph and a download link. It does not differentiate by error type for distinct messages or actions (retry vs. back). - **`PdfViewer.svelte` has its own inline error state** (`renderer.error`) that also renders a hardcoded German string (`"Fehler beim Laden der PDF"`) without going through the `error` prop of `DocumentViewer`. This is a second unhandled error surface not mentioned in the issue — if PDF.js itself fails to parse the blob, the error boundary in `useFileLoader` will not catch it. - **The `loadFile` function does not capture `response.status`** before the exception path, making it impossible to distinguish 404 vs. 403 vs. 5xx after the fact without refactoring. - **The retry mechanism needs a cache-bust.** The issue proposes `?retry=<n>`. However, `useFileLoader` creates an object URL from a blob — it does not serve the URL directly to `<img>` or `<iframe>`. The cache-bust needs to be on the `fetch()` call to the backend, not on the blob URL. A `?_t=<Date.now()>` query param on the `/api/documents/${id}/file` request achieves this. - **The image viewer path** (`{:else if fileUrl}` in `DocumentViewer.svelte` line 107) has no `onload`/`onerror` handler at all. If the image blob itself is corrupt or the `createObjectURL` returns an empty blob, it silently renders nothing. ### Recommendations - **Refactor `useFileLoader` to capture and expose `httpStatus: number | null`** alongside `fileError`. Replace the single `fileError: string` with `fileStatus: 'loading' | 'ok' | 'timeout' | 'error-404' | 'error-403' | 'error-5xx' | 'network'`. Derive the human-readable message in the component, not in the hook — the hook should expose facts, not display strings. ```typescript // useFileLoader.svelte.ts — new shape const response = await fetch(url, { signal: controller.signal }); if (!response.ok) { fileStatus = response.status === 404 ? 'error-404' : response.status === 403 ? 'error-403' : response.status >= 500 ? 'error-5xx' : 'error-5xx'; return; } ``` - **Add `AbortController` + a 10-second timeout signal** to `loadFile()`. On abort, set `fileStatus = 'timeout'`. Clear the controller on successful load or explicit retry. - **Add `onerror` handler to the `<img>` element** in `DocumentViewer.svelte` to surface image decode failures. - **Use `?_t=${Date.now()}` on the retry fetch** — not on the blob URL. - **Write the failing tests first** (per TDD): add test cases to `useFileLoader.svelte.test.ts` for `status: 404`, `status: 403`, `status: 503`, and `AbortError` before touching the production code. - **Address the `PdfViewer.svelte` inline error state separately** or in the same PR — it should either pipe through the `error` prop or use the same status enum. Leaving two independent error surfaces will confuse the UX. ### Open Decisions - None — the implementation path is clear.
Author
Owner

👨‍💻 Tobias Wendt — DevOps & Platform Engineer

Observations

  • The E2E verification steps require stopping MinIO mid-test (docker compose stop minio). This is realistic for manual verification but is not automatable in CI without a custom Docker Compose CI overlay that can inject MinIO failures. The existing CI setup does not appear to have such an overlay.

  • The "MinIO is down" scenario is the hardest to reproduce deterministically in CI. A stopped MinIO container returns a connection refused error, which maps to a network failure at the fetch layer — not an HTTP 5xx from the backend. The backend's GlobalExceptionHandler catches RuntimeException from FileService and returns 500, so the frontend would actually see an HTTP 500 in that case, not a network error. The E2E test docker compose stop minioexpect 5xx UI is the correct mapping, but needs to be documented clearly.

  • The 10-second timeout E2E test (with MinIO stopped, navigate to a document → within 10 s see error UI) will be slow in CI — adding 10 seconds of wall-clock time per test run. This is a layer mismatch: timeout behavior is better validated at the unit/component layer using vi.useFakeTimers().

  • No infrastructure changes are required for this issue. No new Docker services, no new environment variables (if the 10s timeout stays hardcoded), no Caddy config changes. The existing health check setup is unaffected.

  • The "Report button → Gitea new-issue URL" would require the frontend to know the Gitea instance's external URL (http://heim-nas:3005). That is an infrastructure detail that would need to be injected via a PUBLIC_GITEA_URL env var in SvelteKit's .env or the Docker Compose environment. This adds a config surface for a minor feature.

Recommendations

  • Move the timeout test to the component layer using vi.useFakeTimers(). Keep the E2E test to verify the full "MinIO down → 5xx error UI" path (not the timeout path), which is faster because it does not wait 10 seconds.

  • Add a docker-compose.ci.yml overlay that includes a minio-flakey profile or use Playwright's page.route() to intercept /api/documents/*/file and return a synthetic 503 — this is more reliable and faster than stopping a real container.

    // In E2E test — faster and deterministic
    await page.route('/api/documents/*/file', route => route.fulfill({ status: 503 }));
    await page.goto('/documents/test-doc-id');
    await expect(page.getByText('Der Server ist zurzeit nicht erreichbar')).toBeVisible();
    
  • If the report button uses mailto:, no env var is needed. If Gitea URL is chosen, add PUBLIC_GITEA_URL to .env.example and document it in docs/DEPLOYMENT.md.

  • Do not add the Gitea instance URL to the Docker Compose file — it belongs in .env as a PUBLIC_GITEA_URL var that defaults to empty (which disables the Gitea link and falls back to mailto).

Open Decisions

  • Should the E2E test infrastructure use page.route() mocking or real MinIO outage scenarios? page.route() is faster and deterministic. Real MinIO outage tests the full stack more faithfully but is slow and CI-unfriendly.
## 👨‍💻 Tobias Wendt — DevOps & Platform Engineer ### Observations - **The E2E verification steps require stopping MinIO mid-test** (`docker compose stop minio`). This is realistic for manual verification but is not automatable in CI without a custom Docker Compose CI overlay that can inject MinIO failures. The existing CI setup does not appear to have such an overlay. - **The "MinIO is down" scenario** is the hardest to reproduce deterministically in CI. A stopped MinIO container returns a connection refused error, which maps to a network failure at the fetch layer — not an HTTP 5xx from the backend. The backend's `GlobalExceptionHandler` catches `RuntimeException` from `FileService` and returns 500, so the frontend would actually see an HTTP 500 in that case, not a network error. The E2E test `docker compose stop minio` → `expect 5xx UI` is the correct mapping, but needs to be documented clearly. - **The 10-second timeout E2E test** (`with MinIO stopped, navigate to a document → within 10 s see error UI`) will be slow in CI — adding 10 seconds of wall-clock time per test run. This is a layer mismatch: timeout behavior is better validated at the unit/component layer using `vi.useFakeTimers()`. - **No infrastructure changes are required** for this issue. No new Docker services, no new environment variables (if the 10s timeout stays hardcoded), no Caddy config changes. The existing health check setup is unaffected. - **The "Report button → Gitea new-issue URL"** would require the frontend to know the Gitea instance's external URL (`http://heim-nas:3005`). That is an infrastructure detail that would need to be injected via a `PUBLIC_GITEA_URL` env var in SvelteKit's `.env` or the Docker Compose environment. This adds a config surface for a minor feature. ### Recommendations - **Move the timeout test to the component layer** using `vi.useFakeTimers()`. Keep the E2E test to verify the full "MinIO down → 5xx error UI" path (not the timeout path), which is faster because it does not wait 10 seconds. - **Add a `docker-compose.ci.yml` overlay** that includes a `minio-flakey` profile or use Playwright's `page.route()` to intercept `/api/documents/*/file` and return a synthetic 503 — this is more reliable and faster than stopping a real container. ```typescript // In E2E test — faster and deterministic await page.route('/api/documents/*/file', route => route.fulfill({ status: 503 })); await page.goto('/documents/test-doc-id'); await expect(page.getByText('Der Server ist zurzeit nicht erreichbar')).toBeVisible(); ``` - **If the report button uses `mailto:`**, no env var is needed. If Gitea URL is chosen, add `PUBLIC_GITEA_URL` to `.env.example` and document it in `docs/DEPLOYMENT.md`. - **Do not add the Gitea instance URL to the Docker Compose file** — it belongs in `.env` as a `PUBLIC_GITEA_URL` var that defaults to empty (which disables the Gitea link and falls back to mailto). ### Open Decisions - **Should the E2E test infrastructure use `page.route()` mocking or real MinIO outage scenarios?** `page.route()` is faster and deterministic. Real MinIO outage tests the full stack more faithfully but is slow and CI-unfriendly.
Author
Owner

👨‍💻 Elicit — Requirements Engineer

Observations

  • The issue is well-specified overall. It includes a state machine table, concrete trigger-to-UI mappings, an i18n key list, a test plan, and an acceptance criteria checklist. It meets the Definition of Ready on most dimensions.

  • Two acceptance criteria are ambiguous or untestable as written:

    1. "10-second timeout triggers a retry-able error state" — does not specify the start point of the 10 seconds (navigation start? fetch start? onMount?). The issue body says "10 seconds from navigation if no load event has fired yet" but the AC does not capture this. A test engineer cannot write a deterministic test without knowing the precise clock start.
    2. "Report button produces a pre-filled Gitea issue URL (or mailto fallback)" — "or" is an unresolved branch. Both paths are acceptable? Under what condition does each fire? If the condition is "whether a Gitea API token is available in scope," that token availability is an infrastructure concern that is not resolved in the issue.
  • The non-goal "Not handling MinIO presigned-URL expiry specially — surfaces as HTTP 403" implies a user-visible 403 message for an expired presigned URL. But the current architecture does NOT use presigned URLs for file serving — DocumentController.getDocumentFile() fetches the file server-side and streams it to the client. A 403 in the viewer means permission denied on the Spring Boot endpoint, not an expired presigned URL. The non-goal is correct in its conclusion (treat as 403) but wrong in its reasoning.

  • Missing NFR: accessibility. The error UI includes retry buttons, back buttons, and a report button. None of the acceptance criteria check that these are keyboard-accessible or have proper ARIA roles. Issue #318 (reader-path responsive) is referenced, but accessibility is a separate concern.

  • Missing NFR: what happens on intermittent failure (retry succeeds)? The AC says "Retry button re-issues the fetch with cache-bust" but does not specify what state the component returns to on success after a retry. Does the spinner reappear? Is the retry count shown to the user?

  • The i18n key count in the spec body (6 keys) does not match the list (7 keys listed): viewer_error_timeout_title, viewer_error_404, viewer_error_403, viewer_error_5xx, viewer_error_network, viewer_error_retry, viewer_error_report — that is 7, not 6.

Recommendations

  • Tighten AC 1: "The timeout clock starts when loadFile() is called. After 10 seconds with no successful response, the UI transitions to the timeout error state." This is testable with vi.useFakeTimers().

  • Resolve the report button branch before implementation: Pick one mechanism (mailto or Gitea URL) and make it unconditional. Conditional behavior based on runtime token availability adds complexity and an untested branch. Recommend: mailto unconditionally for v1. A Gitea deep-link can be a follow-up enhancement (separate issue).

  • Add an explicit AC for the retry-success path: "Given the timeout error state is shown, when the user clicks Retry and the subsequent fetch succeeds within 10 seconds, then the viewer renders normally with no error UI visible."

  • Add a minimal accessibility AC: "All error state buttons (Retry, Back, Report) are reachable by keyboard Tab and activated by Enter/Space."

  • Fix the i18n key count to 7 in the issue body (minor, but counts matter for implementation tracking).

Open Decisions

  • Retry count cap: Should the retry button be disabled or replaced with a message after N consecutive failures? The issue does not specify. For a family app where "MinIO is down" is a real scenario, unlimited retries could be confusing. Recommend: cap at 3 retries with a note "Falls das Problem weiterhin besteht, wenden Sie sich an den Administrator."
## 👨‍💻 Elicit — Requirements Engineer ### Observations - **The issue is well-specified overall.** It includes a state machine table, concrete trigger-to-UI mappings, an i18n key list, a test plan, and an acceptance criteria checklist. It meets the Definition of Ready on most dimensions. - **Two acceptance criteria are ambiguous or untestable as written:** 1. `"10-second timeout triggers a retry-able error state"` — does not specify the start point of the 10 seconds (navigation start? fetch start? `onMount`?). The issue body says "10 seconds from navigation if no `load` event has fired yet" but the AC does not capture this. A test engineer cannot write a deterministic test without knowing the precise clock start. 2. `"Report button produces a pre-filled Gitea issue URL (or mailto fallback)"` — "or" is an unresolved branch. Both paths are acceptable? Under what condition does each fire? If the condition is "whether a Gitea API token is available in scope," that token availability is an infrastructure concern that is not resolved in the issue. - **The non-goal "Not handling MinIO presigned-URL expiry specially — surfaces as HTTP 403"** implies a user-visible 403 message for an expired presigned URL. But the current architecture does NOT use presigned URLs for file serving — `DocumentController.getDocumentFile()` fetches the file server-side and streams it to the client. A 403 in the viewer means permission denied on the Spring Boot endpoint, not an expired presigned URL. The non-goal is correct in its conclusion (treat as 403) but wrong in its reasoning. - **Missing NFR: accessibility.** The error UI includes retry buttons, back buttons, and a report button. None of the acceptance criteria check that these are keyboard-accessible or have proper ARIA roles. Issue #318 (reader-path responsive) is referenced, but accessibility is a separate concern. - **Missing NFR: what happens on intermittent failure (retry succeeds)?** The AC says "Retry button re-issues the fetch with cache-bust" but does not specify what state the component returns to on success after a retry. Does the spinner reappear? Is the retry count shown to the user? - **The i18n key count in the spec body (6 keys) does not match the list (7 keys listed):** `viewer_error_timeout_title`, `viewer_error_404`, `viewer_error_403`, `viewer_error_5xx`, `viewer_error_network`, `viewer_error_retry`, `viewer_error_report` — that is 7, not 6. ### Recommendations - **Tighten AC 1:** `"The timeout clock starts when `loadFile()` is called. After 10 seconds with no successful response, the UI transitions to the timeout error state."` This is testable with `vi.useFakeTimers()`. - **Resolve the report button branch before implementation:** Pick one mechanism (mailto or Gitea URL) and make it unconditional. Conditional behavior based on runtime token availability adds complexity and an untested branch. **Recommend: mailto unconditionally for v1.** A Gitea deep-link can be a follow-up enhancement (separate issue). - **Add an explicit AC for the retry-success path:** `"Given the timeout error state is shown, when the user clicks Retry and the subsequent fetch succeeds within 10 seconds, then the viewer renders normally with no error UI visible."` - **Add a minimal accessibility AC:** `"All error state buttons (Retry, Back, Report) are reachable by keyboard Tab and activated by Enter/Space."` - **Fix the i18n key count to 7** in the issue body (minor, but counts matter for implementation tracking). ### Open Decisions - **Retry count cap:** Should the retry button be disabled or replaced with a message after N consecutive failures? The issue does not specify. For a family app where "MinIO is down" is a real scenario, unlimited retries could be confusing. Recommend: cap at 3 retries with a note "Falls das Problem weiterhin besteht, wenden Sie sich an den Administrator."
Author
Owner

👨‍💻 Nora "NullX" Steiner — Security Engineer

Observations

  • The report button's Gitea URL construction is a potential open redirect / information disclosure vector. If the report URL is constructed from document.location.href + documentId and rendered as an <a> or passed to window.open(), an attacker who can control the document URL or ID could inject a crafted URL. The risk is low in a family app, but the pattern of constructing URLs from runtime state deserves explicit attention.

  • The cache-bust parameter ?retry=<n> or ?_t=<timestamp> on the file fetch has no security implication — it is a backend-facing internal request authenticated by session cookie. No issue here.

  • The fetch() call in useFileLoader.ts sends the session cookie (same-origin request). The 403 path is correctly handled server-side via @RequirePermission. No bypass risk in the new error-state code.

  • The report button must not expose internal document paths or backend error messages in the pre-filled issue body. The issue spec says "prefilled title/body containing document id + current URL + timestamp." The current URL contains the document UUID — this is acceptable for a private family app. However, the error message from the backend should NOT be included in the pre-filled body because GlobalExceptionHandler includes the raw file path in developer messages (e.g., "File missing in storage: documents/abc-123.pdf"). Only the document UUID and a human-friendly description should be included.

  • The rel="noopener noreferrer" attribute on the download fallback link in DocumentViewer.svelte (line 71) — the link opens /api/documents/{doc.id}/file in a target="_blank" — is missing rel="noopener noreferrer". This is a pre-existing issue but will be in scope when this component is touched. The opened tab can access window.opener and redirect the parent page.

  • The StorageFileNotFoundException in FileService falls through to the generic Exception handler if it is not explicitly caught in DocumentController. Looking at the code, it IS caught in getDocumentFile() (line 106) and mapped to DomainException.notFound(). The generic handler at line 64 of GlobalExceptionHandler would catch it as a 500 if any call path missed the catch — this is a latent risk if new download methods are added without the try/catch pattern.

Recommendations

  • Fix rel="noopener noreferrer" on the existing download fallback link in DocumentViewer.svelte line 71 as part of this PR. It is a pre-existing CWE-1022 and touching the component in this issue is the right time to fix it.

  • Do not include backend error messages in the report button pre-fill. Only include: document UUID, current page URL path (not query string), timestamp, and a user-typed description field. Example: mailto:admin@familyarchive.local?subject=Viewer-Fehler+%23<uuid>&body=Seite%3A+%2Fdocuments%2F<uuid>%0ADatum%3A+<date>.

  • Add explicit unit test coverage for the 403 path in useFileLoader to assert that the error state does not leak the backend error message string — only a localized, user-safe message is exposed.

  • Consider adding a StorageFileNotFoundException handler to GlobalExceptionHandler that explicitly returns 404, rather than relying on the catch in every controller method. This closes the latent risk of unhandled StorageFileNotFoundException returning 500.

Open Decisions

  • Should the report mailto include the document UUID? Document UUIDs are not secret (they are in the URL), but for a family app with private documents, some users may prefer not to embed identifiers in email. For a family admin reporting workflow, including the UUID is operationally necessary — recommend including it.
## 👨‍💻 Nora "NullX" Steiner — Security Engineer ### Observations - **The report button's Gitea URL construction is a potential open redirect / information disclosure vector.** If the report URL is constructed from `document.location.href + documentId` and rendered as an `<a>` or passed to `window.open()`, an attacker who can control the document URL or ID could inject a crafted URL. The risk is low in a family app, but the pattern of constructing URLs from runtime state deserves explicit attention. - **The cache-bust parameter `?retry=<n>` or `?_t=<timestamp>` on the file fetch has no security implication** — it is a backend-facing internal request authenticated by session cookie. No issue here. - **The `fetch()` call in `useFileLoader.ts` sends the session cookie** (same-origin request). The 403 path is correctly handled server-side via `@RequirePermission`. No bypass risk in the new error-state code. - **The report button must not expose internal document paths or backend error messages in the pre-filled issue body.** The issue spec says "prefilled title/body containing document id + current URL + timestamp." The current URL contains the document UUID — this is acceptable for a private family app. However, the error message from the backend should NOT be included in the pre-filled body because `GlobalExceptionHandler` includes the raw file path in developer messages (e.g., `"File missing in storage: documents/abc-123.pdf"`). Only the document UUID and a human-friendly description should be included. - **The `rel="noopener noreferrer"` attribute on the download fallback link** in `DocumentViewer.svelte` (line 71) — the link opens `/api/documents/{doc.id}/file` in a `target="_blank"` — is **missing `rel="noopener noreferrer"`**. This is a pre-existing issue but will be in scope when this component is touched. The opened tab can access `window.opener` and redirect the parent page. - **The `StorageFileNotFoundException` in `FileService` falls through to the generic `Exception` handler** if it is not explicitly caught in `DocumentController`. Looking at the code, it IS caught in `getDocumentFile()` (line 106) and mapped to `DomainException.notFound()`. The generic handler at line 64 of `GlobalExceptionHandler` would catch it as a 500 if any call path missed the catch — this is a latent risk if new download methods are added without the try/catch pattern. ### Recommendations - **Fix `rel="noopener noreferrer"` on the existing download fallback link** in `DocumentViewer.svelte` line 71 as part of this PR. It is a pre-existing CWE-1022 and touching the component in this issue is the right time to fix it. - **Do not include backend error messages in the report button pre-fill.** Only include: document UUID, current page URL path (not query string), timestamp, and a user-typed description field. Example: `mailto:admin@familyarchive.local?subject=Viewer-Fehler+%23<uuid>&body=Seite%3A+%2Fdocuments%2F<uuid>%0ADatum%3A+<date>`. - **Add explicit unit test coverage for the 403 path** in `useFileLoader` to assert that the error state does not leak the backend error message string — only a localized, user-safe message is exposed. - **Consider adding a `StorageFileNotFoundException` handler to `GlobalExceptionHandler`** that explicitly returns 404, rather than relying on the catch in every controller method. This closes the latent risk of unhandled `StorageFileNotFoundException` returning 500. ### Open Decisions - **Should the report mailto include the document UUID?** Document UUIDs are not secret (they are in the URL), but for a family app with private documents, some users may prefer not to embed identifiers in email. For a family admin reporting workflow, including the UUID is operationally necessary — recommend including it.
Author
Owner

👨‍💻 Sara Holt — QA Engineer & Test Strategist

Observations

  • The existing useFileLoader.svelte.test.ts tests do not cover HTTP status differentiation at all. The test 'sets fileError on a failed fetch (non-ok response)' uses setupFetch(false) with no status field on the mock response — so response.status is undefined. Any new code that reads response.status would be untested by the existing suite. The tests need to be updated before the new status logic is written (TDD: write the failing tests first).

  • The issue proposes both a component spec (DocumentViewerErrorBoundary.spec.ts) and an E2E spec (e2e/viewer-errors.spec.ts). This is the correct two-layer approach. The component test handles unit-level state machine verification; the E2E handles the full integration path.

  • The timeout test strategy needs vi.useFakeTimers(). The issue spec says "advance timers by 10 s" — this implies fake timers, which is correct. However, if the timeout is implemented with AbortController + setTimeout, vi.useFakeTimers() does NOT automatically fake AbortController signals. The implementation must use setTimeout directly (not AbortController's native timeout) for fake timers to work, OR the test must use vi.advanceTimersByTimeAsync(10_000) which handles promise microtasks correctly.

  • The E2E test for the timeout case (with MinIO stopped → within 10 s see error UI) will add 10+ seconds of wall-clock time to CI. This is a pyramid violation: timeout behavior belongs in the component/unit layer, not E2E.

  • Missing test: what happens when retry is clicked and the SECOND fetch succeeds? The issue test plan only covers "assert URL is re-fetched with cache-bust" — it does not assert that the successful retry restores the ok state and removes the error UI.

  • Missing test: what happens when the <img> element's onerror fires after fileUrl is set? The current DocumentViewer.svelte has no onerror handler on the <img> (line 109). A corrupt blob URL would silently show nothing.

  • The E2E test "navigate to a deleted document (delete via API mid-test)" is a complex setup that requires test isolation. Deleting a document mid-test while other tests may be using it is a shared-state risk. Use a dedicated test document created and deleted within the test's beforeAll/afterAll.

Recommendations

  • Write these failing tests first, in this order:

    1. useFileLoader: 'sets status to error-404 when response is 404'
    2. useFileLoader: 'sets status to error-403 when response is 403'
    3. useFileLoader: 'sets status to error-5xx when response is 503'
    4. useFileLoader: 'sets status to timeout after 10 seconds with no response' (with vi.useFakeTimers())
    5. useFileLoader: 'sets status to network when fetch throws TypeError'
    6. useFileLoader: 'resets to loading then ok on successful retry'
    7. Component test: 'shows retry button on timeout state'
    8. Component test: 'shows back button but no retry on 403 state'
  • Add status field to the mock fetch response in setupFetch():

    function setupFetch(ok: boolean, status = 200, body?: Blob) {
      vi.stubGlobal('fetch', vi.fn().mockResolvedValue({ ok, status, blob: vi.fn()... }));
    }
    
  • Move the timeout E2E scenario to the component test layer. Keep only the page.route() mock scenarios (404, 403, 5xx, network) in the E2E suite — they are fast and don't require fake timers.

  • Page Object Model for the error states:

    class ViewerErrorPage {
      get retryButton() { return this.page.getByRole('button', { name: /erneut versuchen/i }); }
      get backButton() { return this.page.getByRole('link', { name: /zurück/i }); }
      get reportButton() { return this.page.getByRole('link', { name: /problem melden/i }); }
    }
    
  • Run axe on all error state UIs — the error states will be the primary UX for users with connectivity issues, who may also be on assistive technology.

Open Decisions

  • Should the retry test assert the exact URL with cache-bust param, or just assert that fetch was called twice? Asserting the exact ?_t= timestamp is brittle (time-dependent). Asserting that fetch was called with a URL containing the base path (and a different query param than the first call) is more robust.
## 👨‍💻 Sara Holt — QA Engineer & Test Strategist ### Observations - **The existing `useFileLoader.svelte.test.ts` tests do not cover HTTP status differentiation at all.** The test `'sets fileError on a failed fetch (non-ok response)'` uses `setupFetch(false)` with no `status` field on the mock response — so `response.status` is `undefined`. Any new code that reads `response.status` would be untested by the existing suite. The tests need to be updated before the new status logic is written (TDD: write the failing tests first). - **The issue proposes both a component spec (`DocumentViewerErrorBoundary.spec.ts`) and an E2E spec (`e2e/viewer-errors.spec.ts`).** This is the correct two-layer approach. The component test handles unit-level state machine verification; the E2E handles the full integration path. - **The timeout test strategy needs `vi.useFakeTimers()`.** The issue spec says "advance timers by 10 s" — this implies fake timers, which is correct. However, if the timeout is implemented with `AbortController + setTimeout`, `vi.useFakeTimers()` does NOT automatically fake `AbortController` signals. The implementation must use `setTimeout` directly (not `AbortController`'s native timeout) for fake timers to work, OR the test must use `vi.advanceTimersByTimeAsync(10_000)` which handles promise microtasks correctly. - **The E2E test for the timeout case** (`with MinIO stopped → within 10 s see error UI`) will add 10+ seconds of wall-clock time to CI. This is a pyramid violation: timeout behavior belongs in the component/unit layer, not E2E. - **Missing test: what happens when retry is clicked and the SECOND fetch succeeds?** The issue test plan only covers "assert URL is re-fetched with cache-bust" — it does not assert that the successful retry restores the `ok` state and removes the error UI. - **Missing test: what happens when the `<img>` element's `onerror` fires after `fileUrl` is set?** The current `DocumentViewer.svelte` has no `onerror` handler on the `<img>` (line 109). A corrupt blob URL would silently show nothing. - **The E2E test "navigate to a deleted document (delete via API mid-test)"** is a complex setup that requires test isolation. Deleting a document mid-test while other tests may be using it is a shared-state risk. Use a dedicated test document created and deleted within the test's `beforeAll`/`afterAll`. ### Recommendations - **Write these failing tests first, in this order:** 1. `useFileLoader`: `'sets status to error-404 when response is 404'` 2. `useFileLoader`: `'sets status to error-403 when response is 403'` 3. `useFileLoader`: `'sets status to error-5xx when response is 503'` 4. `useFileLoader`: `'sets status to timeout after 10 seconds with no response'` (with `vi.useFakeTimers()`) 5. `useFileLoader`: `'sets status to network when fetch throws TypeError'` 6. `useFileLoader`: `'resets to loading then ok on successful retry'` 7. Component test: `'shows retry button on timeout state'` 8. Component test: `'shows back button but no retry on 403 state'` - **Add `status` field to the mock fetch response in `setupFetch()`:** ```typescript function setupFetch(ok: boolean, status = 200, body?: Blob) { vi.stubGlobal('fetch', vi.fn().mockResolvedValue({ ok, status, blob: vi.fn()... })); } ``` - **Move the timeout E2E scenario to the component test layer.** Keep only the `page.route()` mock scenarios (404, 403, 5xx, network) in the E2E suite — they are fast and don't require fake timers. - **Page Object Model for the error states:** ```typescript class ViewerErrorPage { get retryButton() { return this.page.getByRole('button', { name: /erneut versuchen/i }); } get backButton() { return this.page.getByRole('link', { name: /zurück/i }); } get reportButton() { return this.page.getByRole('link', { name: /problem melden/i }); } } ``` - **Run `axe` on all error state UIs** — the error states will be the primary UX for users with connectivity issues, who may also be on assistive technology. ### Open Decisions - **Should the retry test assert the exact URL with cache-bust param, or just assert that `fetch` was called twice?** Asserting the exact `?_t=` timestamp is brittle (time-dependent). Asserting that `fetch` was called with a URL containing the base path (and a different query param than the first call) is more robust.
Author
Owner

👨‍💻 Leonie Voss — UI/UX Design Lead & Accessibility Strategist

Observations

  • The existing DocumentViewer.svelte error state (line 66–77) is incomplete for the intended audience. The senior reader persona (60+, phone, variable connectivity) will encounter this error most. The current UI shows {error} in text-ink-3 — a muted color — with a small underlined download link. This fails several design principles: the color is likely too low-contrast on the dark PDF background, the touch target for the download link is not 44px, and there is no action hierarchy (no primary CTA).

  • The state machine table in the issue is correct and user-appropriate — the German copy is clear and the action hierarchy (Retry > Zurück > Problem melden) makes sense. The 403 omitting Retry is the right call (retrying won't help with permission failures).

  • The timeout state message "Das Dokument lädt länger als erwartet." is good — it is honest without alarming. The suggested actions (Erneut versuchen, Problem melden) are appropriate.

  • The 404 message "Die Datei wurde nicht gefunden oder wurde gelöscht." is good. The [Zurück] [Problem melden] actions are appropriate — no retry button is correct.

  • The "Report problem" action must have a visible label, not just an icon, given the senior audience. Icon-only buttons are a WCAG 2.1 failure (SC 1.1.1).

  • The error UI lives inside <div class="absolute inset-0 bg-pdf-bg"> on a dark background. The design tokens for the dark viewer panel (bg-pdf-bg) are not standard surface tokens — error text and buttons styled with text-ink-3 will have unknown contrast on this background. All error UI colors must be verified against the actual bg-pdf-bg value in layout.css.

  • The spinner (loading state) already renders correctly according to the issue — no changes needed there. The issue correctly calls this the "current behaviour" to preserve.

  • The <img> viewer path has no loading state — it just appears when fileUrl is set. For large images on slow connections, the image container is blank until the blob URL is decoded. A skeleton placeholder during decode would improve perceived performance.

Recommendations

  • Error UI pattern for each state should follow this structure:

    [Centered vertically within viewer panel]
    [Icon — status-specific: clock for timeout, broken-file for 404/403, cloud-x for 5xx, wifi-off for network]
    [Heading — h3, font-sans, text-sm, text-ink-2]
    [Body — font-serif, text-sm, text-ink-3, max-w-xs text-center]
    [Primary CTA — full-width on mobile, min-h-[44px], bg-brand-navy text-white]
    [Secondary CTA — text link, min-h-[44px], text-ink-2 underline]
    

    The vertical centering is already provided by flex h-full flex-col items-center justify-center.

  • Buttons must be min-h-[44px] min-w-[44px] — WCAG 2.2 SC 2.5.8. This is especially critical for the Retry button which is the primary recovery action on mobile.

  • Use aria-live="polite" on the error container so that screen readers announce the state change when the timeout triggers. The timeout state transition happens silently otherwise.

  • Use role="alert" specifically for network errors (most disruptive) and aria-live="polite" for timeout and HTTP errors (less urgent).

  • Do not use text-ink-3 on bg-pdf-bg without verifying contrast. Use text-ink-2 or text-white/80 for body copy on the dark viewer background. Provide a utility class text-viewer-muted or similar that is defined with the correct contrast value against bg-pdf-bg.

  • Add focus-visible:ring-2 focus-visible:ring-brand-mint to all error state buttons — keyboard focus must be visible on the dark background. The standard navy ring may not be visible on dark backgrounds; mint provides better contrast.

  • The rel="noopener noreferrer" on the existing download link (line 71 of DocumentViewer.svelte) is missing — the security team flagged this too. Fix it in this PR.

  • All three i18n locales (de/en/es) must be completed before merge. Spanish is often omitted from "done" criteria — make it a checklist item at the PR level, not just the issue level.

Open Decisions

  • Should the timeout error state show a progress indicator of elapsed time (e.g., "Lädt seit 8 Sekunden...") to reassure users still waiting? This is a Kano delighter — it improves perceived responsiveness for slow-but-not-failed loads. It is not in scope for this issue but worth noting for a follow-up.
## 👨‍💻 Leonie Voss — UI/UX Design Lead & Accessibility Strategist ### Observations - **The existing `DocumentViewer.svelte` error state (line 66–77) is incomplete for the intended audience.** The senior reader persona (60+, phone, variable connectivity) will encounter this error most. The current UI shows `{error}` in `text-ink-3` — a muted color — with a small underlined download link. This fails several design principles: the color is likely too low-contrast on the dark PDF background, the touch target for the download link is not 44px, and there is no action hierarchy (no primary CTA). - **The state machine table in the issue is correct and user-appropriate** — the German copy is clear and the action hierarchy (Retry > Zurück > Problem melden) makes sense. The 403 omitting Retry is the right call (retrying won't help with permission failures). - **The timeout state message** `"Das Dokument lädt länger als erwartet."` is good — it is honest without alarming. The suggested actions (Erneut versuchen, Problem melden) are appropriate. - **The 404 message** `"Die Datei wurde nicht gefunden oder wurde gelöscht."` is good. The [Zurück] [Problem melden] actions are appropriate — no retry button is correct. - **The "Report problem" action must have a visible label, not just an icon,** given the senior audience. Icon-only buttons are a WCAG 2.1 failure (SC 1.1.1). - **The error UI lives inside `<div class="absolute inset-0 bg-pdf-bg">` on a dark background.** The design tokens for the dark viewer panel (`bg-pdf-bg`) are not standard surface tokens — error text and buttons styled with `text-ink-3` will have unknown contrast on this background. All error UI colors must be verified against the actual `bg-pdf-bg` value in `layout.css`. - **The spinner (loading state) already renders correctly** according to the issue — no changes needed there. The issue correctly calls this the "current behaviour" to preserve. - **The `<img>` viewer path has no loading state** — it just appears when `fileUrl` is set. For large images on slow connections, the image container is blank until the blob URL is decoded. A skeleton placeholder during decode would improve perceived performance. ### Recommendations - **Error UI pattern for each state should follow this structure:** ``` [Centered vertically within viewer panel] [Icon — status-specific: clock for timeout, broken-file for 404/403, cloud-x for 5xx, wifi-off for network] [Heading — h3, font-sans, text-sm, text-ink-2] [Body — font-serif, text-sm, text-ink-3, max-w-xs text-center] [Primary CTA — full-width on mobile, min-h-[44px], bg-brand-navy text-white] [Secondary CTA — text link, min-h-[44px], text-ink-2 underline] ``` The vertical centering is already provided by `flex h-full flex-col items-center justify-center`. - **Buttons must be `min-h-[44px] min-w-[44px]`** — WCAG 2.2 SC 2.5.8. This is especially critical for the Retry button which is the primary recovery action on mobile. - **Use `aria-live="polite"` on the error container** so that screen readers announce the state change when the timeout triggers. The timeout state transition happens silently otherwise. - **Use `role="alert"` specifically for network errors** (most disruptive) and `aria-live="polite"` for timeout and HTTP errors (less urgent). - **Do not use `text-ink-3` on `bg-pdf-bg` without verifying contrast.** Use `text-ink-2` or `text-white/80` for body copy on the dark viewer background. Provide a utility class `text-viewer-muted` or similar that is defined with the correct contrast value against `bg-pdf-bg`. - **Add `focus-visible:ring-2 focus-visible:ring-brand-mint` to all error state buttons** — keyboard focus must be visible on the dark background. The standard navy ring may not be visible on dark backgrounds; mint provides better contrast. - **The `rel="noopener noreferrer"` on the existing download link** (line 71 of `DocumentViewer.svelte`) is missing — the security team flagged this too. Fix it in this PR. - **All three i18n locales (de/en/es) must be completed before merge.** Spanish is often omitted from "done" criteria — make it a checklist item at the PR level, not just the issue level. ### Open Decisions - **Should the timeout error state show a progress indicator of elapsed time** (e.g., "Lädt seit 8 Sekunden...") to reassure users still waiting? This is a Kano delighter — it improves perceived responsiveness for slow-but-not-failed loads. It is not in scope for this issue but worth noting for a follow-up.
Author
Owner

Decision Queue — Open Items Needing Human Judgment

Consolidated from all persona reviews. These are genuine tradeoffs — no clear winner without product/ops context.


1. Report Button Mechanism (Elicit + Markus + Tobias + Nora)

The question: Gitea new-issue deep-link, or mailto:?

  • Gitea URL is more usable for the admin, produces a structured issue, requires injecting PUBLIC_GITEA_URL as an env var, and couples the frontend to the internal hostname (heim-nas:3005). Breaks if the instance moves.
  • mailto: requires no config surface, works offline, but lands in an email inbox rather than the issue tracker. No infrastructure coupling.

Unanimous recommendation from all four personas: use mailto: unconditionally for v1. A Gitea deep-link can be a follow-up enhancement in its own issue once the ops URL stabilization question is resolved.

Decision needed: Confirm mailto: for v1 and pick the admin email address to use in the link.


2. Timeout Duration — 10s Hardcoded vs. Configurable (Markus)

The question: Is 10 seconds the right default for a family app on home Wi-Fi serving large PDFs?

  • Large PDFs (multi-MB) on slow home Wi-Fi connections may take >10s to stream legitimately.
  • 15–20s would reduce false timeout triggers at the cost of slightly longer "stuck" time for genuinely failed loads.
  • Hardcoding avoids an env var; a PUBLIC_VIEWER_TIMEOUT_MS var makes it tunable without code change.

Decision needed: Pick a duration (10s / 15s / 20s) and decide hardcoded vs. env var.


3. Retry Count Cap (Elicit)

The question: Should the Retry button be disabled or replaced with a final "contact admin" message after N consecutive failures?

  • Unlimited retries: simpler implementation, but a user could click Retry indefinitely when MinIO is down for hours.
  • Cap at 3: after 3 failures, replace the Retry button with a permanent message "Falls das Problem weiterhin besteht, wenden Sie sich an den Administrator." No retry possible until page reload.

Decision needed: Unlimited vs. capped (and if capped, what N).


4. E2E Test Strategy for Infrastructure Failures (Tobias + Sara)

The question: Should the E2E tests use page.route() intercepts (fast, deterministic, no real MinIO) or real Docker Compose MinIO outage (slow, more faithful)?

  • page.route() to return synthetic 503/404/network-error: adds ~0s CI overhead, testable without Docker changes, but does not exercise the backend error path.
  • Real docker compose stop minio: tests the full stack faithfully, but adds 10+ seconds per scenario and requires CI infrastructure changes.

Unanimous recommendation from both personas: use page.route() for CI E2E tests; keep the real MinIO outage test as a documented manual verification step.

Decision needed: Confirm this split or override.


5. PdfViewer.svelte Inline Error State — This PR or Follow-up? (Felix)

The question: PdfViewer.svelte has a hardcoded German error string for PDF.js load failures ("Fehler beim Laden der PDF") that is independent of the useFileLoader status system this issue builds. It represents a second unhandled error surface.

  • Fix in this PR: keeps error surfaces consistent, but expands scope.
  • Separate issue: keeps this PR focused, but leaves an inconsistency.

Decision needed: In-scope for this issue or create a follow-up?

## Decision Queue — Open Items Needing Human Judgment Consolidated from all persona reviews. These are genuine tradeoffs — no clear winner without product/ops context. --- ### 1. Report Button Mechanism (Elicit + Markus + Tobias + Nora) **The question:** Gitea new-issue deep-link, or `mailto:`? - **Gitea URL** is more usable for the admin, produces a structured issue, requires injecting `PUBLIC_GITEA_URL` as an env var, and couples the frontend to the internal hostname (`heim-nas:3005`). Breaks if the instance moves. - **`mailto:`** requires no config surface, works offline, but lands in an email inbox rather than the issue tracker. No infrastructure coupling. **Unanimous recommendation from all four personas:** use `mailto:` unconditionally for v1. A Gitea deep-link can be a follow-up enhancement in its own issue once the ops URL stabilization question is resolved. **Decision needed:** Confirm `mailto:` for v1 and pick the admin email address to use in the link. --- ### 2. Timeout Duration — 10s Hardcoded vs. Configurable (Markus) **The question:** Is 10 seconds the right default for a family app on home Wi-Fi serving large PDFs? - Large PDFs (multi-MB) on slow home Wi-Fi connections may take >10s to stream legitimately. - 15–20s would reduce false timeout triggers at the cost of slightly longer "stuck" time for genuinely failed loads. - Hardcoding avoids an env var; a `PUBLIC_VIEWER_TIMEOUT_MS` var makes it tunable without code change. **Decision needed:** Pick a duration (10s / 15s / 20s) and decide hardcoded vs. env var. --- ### 3. Retry Count Cap (Elicit) **The question:** Should the Retry button be disabled or replaced with a final "contact admin" message after N consecutive failures? - Unlimited retries: simpler implementation, but a user could click Retry indefinitely when MinIO is down for hours. - Cap at 3: after 3 failures, replace the Retry button with a permanent message `"Falls das Problem weiterhin besteht, wenden Sie sich an den Administrator."` No retry possible until page reload. **Decision needed:** Unlimited vs. capped (and if capped, what N). --- ### 4. E2E Test Strategy for Infrastructure Failures (Tobias + Sara) **The question:** Should the E2E tests use `page.route()` intercepts (fast, deterministic, no real MinIO) or real Docker Compose MinIO outage (slow, more faithful)? - `page.route()` to return synthetic 503/404/network-error: adds ~0s CI overhead, testable without Docker changes, but does not exercise the backend error path. - Real `docker compose stop minio`: tests the full stack faithfully, but adds 10+ seconds per scenario and requires CI infrastructure changes. **Unanimous recommendation from both personas:** use `page.route()` for CI E2E tests; keep the real MinIO outage test as a documented manual verification step. **Decision needed:** Confirm this split or override. --- ### 5. `PdfViewer.svelte` Inline Error State — This PR or Follow-up? (Felix) **The question:** `PdfViewer.svelte` has a hardcoded German error string for PDF.js load failures (`"Fehler beim Laden der PDF"`) that is independent of the `useFileLoader` status system this issue builds. It represents a second unhandled error surface. - Fix in this PR: keeps error surfaces consistent, but expands scope. - Separate issue: keeps this PR focused, but leaves an inconsistency. **Decision needed:** In-scope for this issue or create a follow-up?
Sign in to join this conversation.
No Label P1-high bug ui
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: marcel/familienarchiv#322