feat(transcribe): keyboard shortcuts for the transcribe power path + cheatsheet overlay #327

Open
opened 2026-04-24 13:27:13 +02:00 by marcel · 8 comments
Owner

Context

Transcription is repetitive specialist work done by a small pool of Kurrent-literate family members (60+, working on laptops/tablets). Their productivity is limited by hand-travel to the mouse. No keyboard shortcuts exist today — every region switch, mode toggle, and save requires a click.

Given the primary transcriber persona, this is a high-leverage ergonomics play. 15 minutes of shortcut muscle memory saves hours over a backlog.

Non-goals

  • No customisable bindings in v1 — users accept the defaults.
  • No vim-style modal editing.
  • No global app-wide shortcuts (only active inside the Transcribe panel).

Proposed shortcuts

| Keys           | Action                                                 |
|----------------|--------------------------------------------------------|
| j              | Jump to next region                                    |
| k              | Jump to previous region                                |
| e              | Toggle between Read and Edit mode                      |
| Cmd/Ctrl+Enter | Save current region and move to next                   |
| Cmd/Ctrl+s     | Save current region (stays put)                        |
| Esc            | Discard current region's unsaved edits (confirm modal) |
| n              | Start drawing a new region                             |
| t              | Toggle "mark for training" on the current region       |
| ?              | Open the shortcut cheatsheet overlay                   |

Rules:

  • Active only when the Transcribe panel is open.
  • Inactive when focus is inside a non-save text input. Exception: Cmd/Ctrl+Enter works inside the transcription textarea (must save and advance).
  • ? opens the cheatsheet regardless of focus (but not while a modifier is held, to avoid conflict with QWERTZ).
  • Cheatsheet closable with Esc or clicking the backdrop.
  • Mac users see Cmd labels; Windows/Linux see Ctrl. Detect via navigator.platform.
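
The rules above can be read as one decision function that turns a key event plus panel state into an action name. The following is a minimal sketch under stated assumptions, not the project's implementation: `KeyLike`, `ShortcutContext`, `ShortcutAction`, and `resolveShortcut` are hypothetical names, and treating `Cmd/Ctrl+s` as inactive inside a text input is only one possible reading of the "non-save text input" rule.

```typescript
// Hypothetical names throughout; only the key bindings come from the table above.
type ShortcutAction =
  | 'nextRegion' | 'prevRegion' | 'toggleMode' | 'saveAndNext' | 'save'
  | 'discard' | 'newRegion' | 'toggleTraining' | 'openCheatsheet';

// Structural subset of KeyboardEvent so the function can be tested without a DOM.
interface KeyLike {
  key: string;
  ctrlKey?: boolean;
  metaKey?: boolean;
  altKey?: boolean;
}

interface ShortcutContext {
  panelOpen: boolean;        // the Transcribe panel is open
  focusInTextInput: boolean; // focus is in an input, textarea, or contenteditable
}

// Single-key shortcuts that need no modifier.
const PLAIN_KEYS = new Map<string, ShortcutAction>([
  ['j', 'nextRegion'],
  ['k', 'prevRegion'],
  ['e', 'toggleMode'],
  ['n', 'newRegion'],
  ['t', 'toggleTraining'],
  ['Escape', 'discard'],
]);

function resolveShortcut(ev: KeyLike, ctx: ShortcutContext): ShortcutAction | null {
  if (!ctx.panelOpen) return null;

  const mod = Boolean(ev.ctrlKey || ev.metaKey);
  const anyModifier = mod || Boolean(ev.altKey);

  // "?" works regardless of focus, but not while Ctrl/Cmd/Alt is held.
  // Shift is not checked because "?" is itself a shifted key.
  if (ev.key === '?' && !anyModifier) return 'openCheatsheet';

  // Save-and-next is the one shortcut that must work inside the textarea.
  if (mod && ev.key === 'Enter') return 'saveAndNext';

  // Everything else is inactive while the user is typing (assumption: this includes Cmd/Ctrl+s).
  if (ctx.focusInTextInput) return null;

  if (mod && ev.key.toLowerCase() === 's') return 'save';
  if (anyModifier) return null;

  return PLAIN_KEYS.get(ev.key) ?? null;
}
```

Keeping the decision separate from the `keydown` listener means each rule can be covered by a plain unit test, with no browser event plumbing involved.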

Implementation plan

Frontend

  • New Svelte action frontend/src/lib/actions/transcribeShortcuts.ts:
    • Attaches keydown listener on window.
    • Guards: returns early if panel isn't open (reads a store flag), if modifier mismatch, if focus is in a non-save input without a save-shortcut.
    • Dispatches actions via a small event bus / callback prop.
  • Attach action in frontend/src/lib/components/TranscribePanel.svelte:
    • Each callback maps to the existing action (goToNextRegion, toggleMode, saveAndNext, etc.).
  • New component frontend/src/lib/components/ShortcutCheatsheet.svelte:
    • Modal, opened via ?, rendered as <dialog> with aria-modal="true".
    • Two-column table of key + action label.
    • Respects prefers-reduced-motion for any open/close animation.
  • Mention ? in the coaching card from #320 ("Tipp: Drücken Sie ? für Tastatur-Kürzel").

i18n

12–14 new Paraglide keys:
shortcut_next_region, shortcut_prev_region, shortcut_toggle_mode, shortcut_save_and_next, shortcut_save, shortcut_discard, shortcut_new_region, shortcut_toggle_training, shortcut_help, cheatsheet_title, cheatsheet_close, cheatsheet_platform_hint.

Tests

  • Component (action): dispatch keydown events with a stub panel API, assert correct callback fires for each shortcut.
  • Component (action): focus is inside textarea + Cmd+Enter → saveAndNext fires. Focus in textarea + plain j → no navigation.
  • Component (cheatsheet): opens on ?, closes on Esc, displays all shortcut rows.
  • E2E: open a doc with ≥ 2 regions, press j, assert next region highlighted; press k, assert previous; e, assert mode toggled; ?, assert cheatsheet visible; Esc, assert cheatsheet closed.
  • E2E (platform labels): on a Mac user agent, cheatsheet shows Cmd; on Windows, Ctrl.

Verification

Manual: transcribe 5 regions of a document using the keyboard only, no mouse. Confirm all shortcuts work; no accidental activations while typing into the textarea; cheatsheet reachable via ?.

Acceptance criteria

  • All 9 shortcuts work as specified
  • Shortcuts inactive when focus is in a non-save text input
  • Cmd/Ctrl+Enter works inside the transcription textarea to save-and-next
  • Cheatsheet opens with ?, closes with Esc and backdrop click
  • Platform-correct modifier labels (Cmd vs Ctrl)
  • i18n complete for de/en/es on cheatsheet labels
  • Shortcuts mentioned in the coach card from #320 (hint: ?)
  • axe-core passes on the cheatsheet (focus trap, aria-modal, labelled heading)

Critical files

frontend/src/lib/actions/transcribeShortcuts.ts             (new)
frontend/src/lib/components/TranscribePanel.svelte          (attach action + callbacks)
frontend/src/lib/components/ShortcutCheatsheet.svelte       (new)
frontend/src/lib/actions/transcribeShortcuts.spec.ts        (new)
frontend/e2e/transcribe-shortcuts.spec.ts                   (new)
frontend/messages/{de,en,es}.json

Related

  • #320 (coach empty state): cross-reference in the coach card.
  • #321 (progress indicator): t toggles the per-region training flag introduced there.
marcel added this to the Transcriber Experience v1 milestone 2026-04-24 13:27:13 +02:00
marcel added the P2-medium, feature, and ui labels 2026-04-24 13:28:16 +02:00
Author
Owner

👨‍💻 Markus Keller — Application Architect

Observations

  • The issue proposes a Svelte action (transcribeShortcuts.ts) that attaches a keydown listener on window. This is architecturally sound for a panel-scoped concern but the "reads a store flag" guard mentioned in the spec does not match the existing codebase — transcribeMode is a plain $state variable local to +page.svelte, not a Svelte store. The action would need to receive the panel-open state as a callback parameter rather than reading a shared store.
  • The new action belongs next to clickOutside.ts and radioGroupNav.ts, which live in frontend/src/lib/shared/actions/. The spec places it in src/lib/actions/, a path that does not exist; the correct directory is src/lib/shared/actions/.
  • The n shortcut ("start drawing a new region") implies triggering a draw-mode toggle on the annotation layer. Looking at the code, onTranscriptionDraw flows from +page.svelte → DocumentViewer → the PDF annotation layer. The shortcut action cannot directly invoke draw mode; it must dispatch an event or call a callback that the page wires to onTranscriptionDraw. The issue's implementation plan glosses over this, so the callback API needs to include a startDrawMode() entry point.
  • The t shortcut ("toggle mark for training") operates on trainingLabels in TranscriptionEditView, which is owned by +page.svelte via onToggleTrainingLabel. Same pattern: needs a callback in the action's API, not a direct store access.
  • The ShortcutCheatsheet component is correctly proposed as a standalone component with <dialog> and aria-modal. This fits the existing component architecture cleanly.

Recommendations

  • Correct the file path in the issue: move frontend/src/lib/actions/transcribeShortcuts.ts → frontend/src/lib/shared/actions/transcribeShortcuts.ts. Update the Critical Files list accordingly.
  • Define the callback API explicitly before implementation begins. The action should accept a typed options object, not individual callback parameters:
    type TranscribeShortcutOptions = {
      isPanelOpen: () => boolean;
      goToNextRegion: () => void;
      goToPrevRegion: () => void;
      toggleMode: () => void;
      saveAndNext: () => void;
      saveCurrent: () => void;
      discardCurrent: () => void;
      startDrawMode: () => void;
      toggleTrainingMark: () => void;
      openCheatsheet: () => void;
    };
    
    The page wires all these to existing handlers. The action is a pure input-to-callback translator — no state ownership.
  • navigator.platform is deprecated (MDN: deprecated since 2021). Use navigator.userAgentData?.platform with fallback to navigator.platform for Safari compatibility, or use the simpler navigator.userAgent.includes('Mac') which is stable and already used by many production apps.
  • Add the Svelte action to src/lib/shared/actions/ index if one exists, or document the action at the top of the file, consistent with the existing pattern of standalone exports in that directory.
  • No backend changes are needed. No new routes are needed. No documentation updates are required beyond the standard ones (the CLAUDE.md frontend routes table does not need updating — this is a panel overlay, not a route).
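
The platform-detection recommendation above can be sketched as a small pure function. This is an illustration only: `NavigatorLike` and `modifierLabel` are hypothetical names, and the substring test for "mac" is an assumption that happens to cover the three usual values (`macOS` from `userAgentData.platform`, `MacIntel` from `navigator.platform`, and `Macintosh` inside the user-agent string).

```typescript
// Structural subset of Navigator; userAgentData is not available in every browser.
interface NavigatorLike {
  userAgentData?: { platform?: string };
  platform?: string;
  userAgent?: string;
}

type ModifierLabel = 'Cmd' | 'Ctrl';

// Prefer the newer API, then fall back to the older fields.
// Assumption: anything that does not look like a Mac gets the Ctrl label.
function modifierLabel(nav: NavigatorLike): ModifierLabel {
  const hint = nav.userAgentData?.platform || nav.platform || nav.userAgent || '';
  return /mac/i.test(hint) ? 'Cmd' : 'Ctrl';
}
```

Taking the navigator-like object as a parameter, rather than reading the global, lets the component test pass in a stub for each platform.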

Open Decisions

  • Cheatsheet component location: The issue proposes frontend/src/lib/components/ShortcutCheatsheet.svelte, but that path does not exist — the project uses domain-scoped lib directories. Should this live in frontend/src/lib/document/transcription/ShortcutCheatsheet.svelte (domain-scoped) or frontend/src/lib/shared/primitives/ShortcutCheatsheet.svelte (reusable across future panels)? Domain-scoped is cleaner for v1 since there is exactly one consumer. The src/lib/components/ path in the issue is simply wrong — the team should pick one of the two correct options before implementation starts.
Author
Owner

👨‍💻 Felix Brandt — Fullstack Developer

Observations

  • The implementation plan is well-structured. The existing actions (clickOutside.ts, radioGroupNav.ts) give a clear pattern: export a function that takes a node and options, return { destroy }. The new transcribeShortcuts.ts action should follow this exact shape.
  • The focus guard is the most nuanced part of this spec. The rule is: "inactive when focus is inside a non-save text input, except Cmd/Ctrl+Enter." In the TranscribePanel, the text input is a TipTap-backed PersonMentionEditor. TipTap renders a contenteditable div, not a native <input> or <textarea>. The guard must check event.target instanceof HTMLElement && event.target.isContentEditable — checking tagName === 'INPUT' or tagName === 'TEXTAREA' alone will miss TipTap. This is a known footgun documented in the project memory.
  • Cmd/Ctrl+Enter inside the textarea must fire saveAndNext. TipTap intercepts Enter by default but Cmd/Ctrl+Enter is not bound by TipTap's default keymap — the global window listener will receive it. However, TipTap will also receive it if a KeyboardShortcut extension is registered. Confirm no TipTap extension already captures this combination before implementation.
  • The e shortcut ("toggle between Read and Edit mode") maps to onModeChange in TranscriptionPanelHeader. The action callback toggleMode should flip the current mode, requiring the action to receive the current mode state or a stateful toggle function from the page.
  • j and k for region navigation: the sorted blocks list is owned by TranscriptionEditView (via sortedBlocks = $derived([...blocks].sort(...))). Navigation must find the current active block and advance the index. The action callback should call goToNextRegion() / goToPrevRegion() on the page, which sets activeAnnotationId — the same mechanism already used when clicking an annotation on the PDF.
  • Esc conflict: Esc is used in the spec both for "discard edits (with confirm)" and "close cheatsheet". The spec correctly notes that cheatsheet Esc takes precedence. The cheatsheet component should handle its own Esc via the <dialog>'s native close event or a local listener — not the global shortcut action.
  • Component size: ShortcutCheatsheet should be a standalone, focused component under 60 lines. The cheatsheet table is purely presentational — no state beyond "is open", which comes from a prop.
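
The j/k navigation described above comes down to finding the active block in the sorted list and stepping one position. A minimal sketch follows, assuming blocks are identified by string ids. `neighbourRegionId` is a hypothetical helper, and both the clamping at the ends of the list and the behaviour when nothing is selected are assumptions, since the spec does not define them.

```typescript
// Returns the id that should become active after pressing j (step = 1) or k (step = -1).
// Assumptions: navigation clamps at the first/last region instead of wrapping, and with
// no active region j starts at the first region while k starts at the last.
function neighbourRegionId(
  sortedIds: readonly string[],
  activeId: string | null,
  step: 1 | -1,
): string | null {
  if (sortedIds.length === 0) return null;

  const current = activeId === null ? -1 : sortedIds.indexOf(activeId);
  if (current === -1) {
    return step === 1 ? sortedIds[0] : sortedIds[sortedIds.length - 1];
  }

  const next = Math.min(sortedIds.length - 1, Math.max(0, current + step));
  return sortedIds[next];
}
```

The page-level goToNextRegion() could then assign the returned id to activeAnnotationId when it is not null.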

Recommendations

  • Focus guard implementation — use this exact check in the action:
    function isEditableTarget(target: EventTarget | null): boolean {
      if (!(target instanceof HTMLElement)) return false;
      const tag = target.tagName;
      return tag === 'INPUT' || tag === 'TEXTAREA' || target.isContentEditable;
    }
    
    Apply this guard to all shortcuts except Cmd/Ctrl+Enter (which fires regardless) and ? (which fires regardless of focus, per the spec).
  • Svelte action pattern — follow radioGroupNav.ts exactly: the action returns { destroy, update } where update allows the parent to refresh callbacks when reactive state changes. This prevents stale closures when panelMode or blocks change.
  • TDD sequence: write the failing spec first for each shortcut behavior. The spec file transcribeShortcuts.spec.ts should use document.dispatchEvent(new KeyboardEvent('keydown', {...})) with a stub panel API object. This pattern already works in the test suite (see existing dispatchEvent usage documented in project memory).
  • n shortcut: label it clearly in the callback API as startDrawMode — this is distinct from "create a block". The shortcut puts the annotation layer into draw mode; the user then draws the region. The action does not create anything directly.
  • ? shortcut: use event.key === '?' — on QWERTZ (German keyboard), ? is typed with Shift+ß. The spec says "not while a modifier is held" — this means guard with !event.ctrlKey && !event.altKey && !event.metaKey but allow event.shiftKey (since ? requires Shift on QWERTZ). Test this on a German keyboard layout in the E2E platform-labels test.
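
To illustrate the `{ destroy, update }` shape and the `?` modifier guard recommended above, here is a DOM-free sketch. Everything in it is hypothetical: `KeydownSource` stands in for `window`, `createTranscribeShortcuts` is a factory added only so the listener target can be stubbed in tests, and just two shortcuts are wired to keep it short. The focus guard is left out on purpose.

```typescript
// Minimal structural types so the sketch runs without a DOM.
interface KeydownLike {
  key: string;
  ctrlKey: boolean;
  metaKey: boolean;
  altKey: boolean;
}
type KeydownListener = (ev: KeydownLike) => void;
interface KeydownSource {
  addEventListener(type: 'keydown', listener: KeydownListener): void;
  removeEventListener(type: 'keydown', listener: KeydownListener): void;
}

// Subset of the callback API, for illustration only.
interface ShortcutCallbacks {
  isPanelOpen: () => boolean;
  goToNextRegion: () => void;
  openCheatsheet: () => void;
}

function createTranscribeShortcuts(source: KeydownSource) {
  // Svelte-style action: (node, options) => { update, destroy }.
  return function transcribeShortcuts(_node: unknown, initial: ShortcutCallbacks) {
    // Reassigned by update() and read on every keydown, so callbacks never go stale.
    let options = initial;

    const onKeydown: KeydownListener = (ev) => {
      if (!options.isPanelOpen()) return;
      // Shift is deliberately not checked: "?" is a shifted key.
      if (ev.ctrlKey || ev.metaKey || ev.altKey) return;
      if (ev.key === '?') {
        options.openCheatsheet();
      } else if (ev.key === 'j') {
        options.goToNextRegion();
      }
    };

    source.addEventListener('keydown', onKeydown);

    return {
      update(next: ShortcutCallbacks) {
        options = next;
      },
      destroy() {
        source.removeEventListener('keydown', onKeydown);
      },
    };
  };
}
```

In the browser the page would presumably pass `window` as the source; depending on compiler settings a cast may be needed for that.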
Author
Owner

👨‍💻 Tobias Wendt — DevOps & Platform Engineer

Observations

  • This feature is entirely frontend-only: one new TypeScript action module, one new Svelte component, i18n keys, tests. No infrastructure changes. No backend changes. No new Docker services. The ops surface delta is zero.
  • The E2E test plan (frontend/e2e/transcribe-shortcuts.spec.ts) will add Playwright tests that require the full stack (SvelteKit + Spring Boot + PostgreSQL + MinIO). This matches the existing E2E pattern (see annotations.spec.ts which seeds data via API in beforeAll). No CI changes are needed if the existing E2E job already covers e2e/*.spec.ts by glob.
  • The platform-label E2E test ("on a Mac user agent, cheatsheet shows Cmd; on Windows, Ctrl") requires setting a custom userAgent in Playwright. Playwright supports this via the userAgent context option; page.setExtraHTTPHeaders would only change the request header, not the value the page reads from navigator. This can be done within the single spec file without any CI configuration changes.
  • The feature adds 12–14 Paraglide i18n keys across three locales (de, en, es). The Paraglide Vite plugin regenerates src/lib/paraglide/ automatically on dev/build — no manual step needed in CI. The generated files should remain in .gitignore status or tracked per the existing convention (check current .gitignore for paraglide/ entries).

Recommendations

  • Verify that the E2E spec seeds its own test document (following the annotations.spec.ts pattern with request.post('/api/documents', ...) in beforeAll) rather than depending on a pre-existing fixture. Fixture-dependent E2E tests are the primary cause of non-deterministic CI failures in this project.
  • Keep the platform-label E2E test as a lightweight check: set userAgent to include Macintosh in one context and verify Cmd appears in the cheatsheet; set a non-Mac agent and verify Ctrl. Do not spin up two full browser instances for this; use test.describe with a test.use({ userAgent: '...' }) override within the same spec.
  • The spec file count stays well within the <8 minutes E2E target as long as the new spec follows the existing seed-via-API pattern (no canvas drawing, no file uploads beyond what annotations.spec.ts already does). A rough estimate: 5–6 new E2E tests, ~45–60 seconds added. Acceptable.
  • No docker-compose.yml changes required. No new environment variables required. No Caddy config changes required. This is as low-ops as a feature gets.
Author
Owner

👨‍💻 Elicit — Requirements Engineer

Observations

Requirement completeness: The spec is dense and implementation-ready. However, three behavioral gaps need resolution before coding starts:

GAP 1 — n shortcut scope is ambiguous. "Start drawing a new region" implies the annotation layer enters draw mode. But what happens when the panel is in Read mode and the user presses n? The spec does not say whether n should auto-switch to Edit mode first, fire only in Edit mode, or be inactive in Read mode. The acceptance criteria only state "all 9 shortcuts work as specified" — no mode-specific guards are listed per shortcut.

GAP 2 — Esc conflict resolution. Two behaviors share Esc: (1) "discard current region's unsaved edits (with confirm modal)" and (2) "close the cheatsheet overlay". The spec says cheatsheet takes precedence when open. But what if the cheatsheet is closed AND focus is inside the textarea AND the user has unsaved edits — should Esc discard, or is Esc inactive when focus is inside an editable? The spec's focus guard says shortcuts are "inactive when focus is in a non-save text input" — which would mean Esc is also inactive. This contradicts the stated behavior of Esc as a global discard trigger. The two rules conflict.

GAP 3 — t shortcut with no active region. If no region is active (no block focused), what should t do? Toggle training on nothing? The spec implies t operates on "the current region" — but what is the current region when none is selected? Silent no-op, or scroll to the first block?

Non-functional requirement gap: The spec mentions prefers-reduced-motion for cheatsheet animation but says nothing about reduced-motion handling for the shortcut-driven navigation transitions themselves. When pressing j, the active block scrolls into view (scrollIntoView). This scroll is animated — it should respect prefersReducedMotion (already managed in the page with behavior: 'smooth' | 'instant'). The shortcut action's callbacks should receive this flag or the page-level implementation should handle it transparently (it will, since scrollIntoView is called in the page, not the action).
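A minimal sketch of how the page-level navigation could combine the j/k index step with the reduced-motion flag. The helper names `scrollBehaviorFor` and `stepRegion` are hypothetical, and clamping at the first and last region (instead of wrapping) is an assumption; only `scrollIntoView` with `behavior: 'smooth' | 'instant'` comes from the discussion above.

```typescript
// Hypothetical helpers for illustration; not taken from the codebase.
type ScrollBehaviorOption = 'smooth' | 'instant';

// Pick the scroll behavior from the user's motion preference.
function scrollBehaviorFor(prefersReducedMotion: boolean): ScrollBehaviorOption {
  return prefersReducedMotion ? 'instant' : 'smooth';
}

// Compute the region index after pressing j (+1) or k (-1).
// Assumption: navigation clamps at both ends instead of wrapping around.
function stepRegion(current: number, delta: 1 | -1, regionCount: number): number {
  if (regionCount === 0) return -1; // nothing to select
  if (current < 0) return 0;        // no selection yet: start at the first region
  return Math.min(regionCount - 1, Math.max(0, current + delta));
}

// In the page (not the action), the result would feed scrollIntoView, e.g.:
// blockEl.scrollIntoView({ behavior: scrollBehaviorFor(prefersReducedMotion), block: 'nearest' });
```

Keeping both helpers pure means the shortcut action never needs to know about motion preferences; it only calls the page's next/previous callbacks.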

User story coverage: The spec covers the primary user (experienced transcriber on laptop). It does not address tablet users — the project memory notes that transcribers also work on tablets. On a tablet with a software keyboard, j/k are on the keyboard but the keyboard may be hidden when the user is not actively typing. Keyboard shortcuts are still accessible on tablets with a hardware keyboard paired, so this is not a blocker, but the cheatsheet hint on the coach card should include a note that shortcuts require a physical keyboard.

Recommendations

  • Resolve GAP 2 now: change Esc to only fire the discard behavior when focus is outside any input field AND the cheatsheet is closed. When focus is inside TipTap's contenteditable, Esc should be a no-op from the global shortcut system (TipTap may handle it internally). Add this as an explicit row in the acceptance criteria table.
  • Resolve GAP 1: restrict n to Edit mode only. If the panel is in Read mode, pressing n should be a no-op (not auto-switch modes). This avoids surprise mode changes. Add this to the acceptance criteria.
  • Resolve GAP 3: treat t as a no-op when no block is active. Do not auto-scroll or auto-select. Add this edge case to the test plan.
  • Add one acceptance criterion: "On touch-only devices (no hardware keyboard), shortcuts are unavailable and the cheatsheet is not shown, to avoid user confusion." This is implicit in the current spec but should be stated.
  • The 12–14 i18n keys are well-scoped. The Paraglide key naming convention is consistent with existing keys (e.g. transcribe_coach_title, mode_read, mode_edit). Use the same snake_case pattern: shortcut_next_region, cheatsheet_title, etc.
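The recommended resolutions for GAP 1 to GAP 3 can be sketched as one pure decision function. Everything here is hypothetical (type names, the `focus` zones, the action names); it only illustrates the rules proposed above plus the focus rules from the issue, and treats "mod" as Cmd on Mac and Ctrl elsewhere.

```typescript
// All names are hypothetical; this only illustrates the proposed rules.
type ShortcutAction =
  | 'next' | 'prev' | 'toggleMode' | 'saveAndNext' | 'save'
  | 'discard' | 'newRegion' | 'toggleTraining'
  | 'openCheatsheet' | 'closeCheatsheet';

// 'editor' = the transcription editor, 'otherInput' = any non-save input.
type FocusZone = 'none' | 'editor' | 'otherInput';

interface KeyInput { key: string; mod: boolean; }

interface PanelState {
  panelOpen: boolean;
  mode: 'read' | 'edit';
  cheatsheetOpen: boolean;
  focus: FocusZone;
  hasActiveRegion: boolean;
}

function resolveShortcut(input: KeyInput, s: PanelState): ShortcutAction | null {
  if (!s.panelOpen) return null;
  // While the cheatsheet is open, Esc closes it and every other key is ignored.
  if (s.cheatsheetOpen) {
    return input.key === 'Escape' && !input.mod ? 'closeCheatsheet' : null;
  }
  // Save shortcuts work in the editor, but not in non-save inputs.
  if (input.mod) {
    if (s.focus === 'otherInput') return null;
    if (input.key === 'Enter') return 'saveAndNext';
    if (input.key.toLowerCase() === 's') return 'save';
    return null;
  }
  if (input.key === '?') return 'openCheatsheet'; // regardless of focus, per the issue
  if (s.focus !== 'none') return null;            // GAP 2: Esc is a no-op while typing
  switch (input.key) {
    case 'j': return 'next';
    case 'k': return 'prev';
    case 'e': return 'toggleMode';
    case 'Escape': return 'discard';
    case 'n': return s.mode === 'edit' ? 'newRegion' : null;      // GAP 1
    case 't': return s.hasActiveRegion ? 'toggleTraining' : null; // GAP 3
    default: return null;
  }
}
```

A table-driven unit test over `resolveShortcut` would cover each acceptance-criteria row without mounting a component.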

Open Decisions

  • Esc behavior when unsaved edits exist and focus is outside the textarea: should Esc discard immediately-with-confirm, or should it be a no-op? The current spec implies discard-with-confirm but the focus guard makes this unreachable in the most common scenario (user just finished typing, focus is still in TipTap). Clarify whether the intended workflow is: user clicks elsewhere → focus leaves TipTap → then presses Esc to discard. If yes, document this two-step flow in the cheatsheet and coach card tip.

👨‍💻 Nora "NullX" Steiner — Security Engineer

Observations

This is a pure frontend, client-side interaction feature with no backend changes. The attack surface is minimal. Four security-adjacent points are worth flagging:

1. window keydown listener and clickjacking / iframe context. If the app is embedded in an iframe (e.g., by a malicious third-party page), the global window keydown listener can receive events dispatched by the parent frame only if the parent is same-origin; a cross-origin parent page cannot inject key events into the embedded page. This is not a meaningful risk given the app is a family-internal tool, and the X-Frame-Options: DENY header already set via Caddy (per devops.md references) eliminates the iframe scenario entirely. No additional action needed.

2. navigator.platform deprecation as a security-adjacent concern. The spec uses navigator.platform to detect Mac/Windows. This API is deprecated and returns a frozen value that can be spoofed by user scripts or extensions. The only consequence here is that the cheatsheet shows the wrong modifier label (Cmd instead of Ctrl or vice versa). This is a UX defect, not a security defect. However, since the issue calls out navigator.platform explicitly, use navigator.userAgentData?.platform (Secure Context required, available in Chrome/Edge) with navigator.platform as the fallback. Add ?? null-safety since userAgentData is undefined in Firefox and Safari.

3. <dialog> + aria-modal — focus trap. The cheatsheet is proposed as a <dialog> element. Native <dialog> does not implement a focus trap by default in all browsers (Safari historically had gaps). Use dialog.showModal() rather than toggling open attribute directly — showModal() activates the top-layer, which provides a native focus trap and blocks interaction with the rest of the page. This is both the accessible and secure approach: it prevents a scenario where a keyboard user tabs out of the modal and accidentally triggers another shortcut while the cheatsheet is open.

4. No XSS surface. The cheatsheet is entirely static content — i18n strings from Paraglide (compile-time generated, not runtime-user-input). The key labels (j, k, ?, etc.) are hardcoded string literals. There is no dynamic content interpolation in the cheatsheet that could create an XSS vector.
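Point 2 can be sketched as a small helper. This is an illustration under assumptions: `NavigatorLike` and `isMacPlatform` are hypothetical names, and the structural type stands in for the browser's `navigator` so the helper can be unit-tested without a browser.

```typescript
// Structural stand-in for the parts of `navigator` used here (hypothetical type).
interface NavigatorLike {
  userAgentData?: { platform?: string };
  platform?: string;
}

// Prefer navigator.userAgentData.platform (Chromium, secure contexts only) and
// fall back to the deprecated navigator.platform (Firefox, Safari).
function isMacPlatform(nav: NavigatorLike): boolean {
  const platform = nav.userAgentData?.platform ?? nav.platform ?? '';
  // Matches "macOS" (userAgentData) as well as "MacIntel" (navigator.platform).
  return /mac/i.test(platform);
}
```

In the component this would be called once with the real `navigator` object. A wrong result only affects the label shown in the cheatsheet, as noted above.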

Recommendations

  • Use dialog.showModal() (not open attribute toggle) for the cheatsheet. In Svelte 5, bind the dialog element and call dialogEl.showModal() / dialogEl.close() reactively. This is the correct pattern for modal dialogs and ensures the native focus trap.
  • No full-screen guard is needed in the shortcut action: when document.fullscreenElement is not null (user entered browser full-screen), shortcuts should still work, because the window listener scope is unaffected by full-screen mode. This is not a security concern, only worth noting.
  • The Cmd/Ctrl+S shortcut collides with the browser's native "Save Page" command in most browsers. Test this explicitly. On Windows/Linux, Ctrl+S opens "Save As" in Chrome if focus is on window (not on a form element). Since the TranscriptionEditView uses TipTap's contenteditable, pressing Ctrl+S while focus is in TipTap may or may not bubble to the window listener. The spec should clarify whether the Ctrl+S handler calls event.preventDefault() to suppress the browser's Save dialog.

Open Decisions

  • Ctrl+S browser conflict: Should the shortcut action call event.preventDefault() for Cmd/Ctrl+S to suppress the browser's native Save dialog? Calling preventDefault() on key events is standard practice for shortcut handlers but should be an explicit, documented decision. Not calling it means Ctrl+S triggers both the block save AND the browser dialog on Windows/Linux — a bad experience. Recommend: always call event.preventDefault() for all shortcuts registered by this action.
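One way to make that decision explicit in code is to call `preventDefault()` only for keys the action actually handles, so unrelated browser shortcuts keep working. The sketch below is hypothetical: `KeyEventLike`, `handleKeydown`, and the `resolve` callback are illustrative names, not existing code.

```typescript
// Minimal structural view of a KeyboardEvent (hypothetical type for testability).
interface KeyEventLike {
  key: string;
  ctrlKey: boolean;
  metaKey: boolean;
  preventDefault(): void;
}

// `resolve` maps an event to the callback to run, or null when the key is not ours.
function handleKeydown(
  ev: KeyEventLike,
  resolve: (ev: KeyEventLike) => (() => void) | null,
): boolean {
  const run = resolve(ev);
  if (run === null) return false; // unhandled: leave the browser default alone
  ev.preventDefault();            // e.g. suppresses the Save Page dialog for Ctrl+S
  run();
  return true;
}
```

Returning a boolean lets a unit test assert both sides of the decision: handled keys are prevented, unhandled keys are not touched.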

👨‍💻 Sara Holt — QA Engineer

Observations

The test plan is solid in structure but has coverage gaps and one testability risk.

Testability risk — the action's dependency on transcribeMode state. The spec says the action reads "a store flag" to determine if the panel is open. Looking at the actual code, transcribeMode is a $state variable local to +page.svelte — not a Svelte store, and not directly importable in a unit test. The correct design (passing isPanelOpen: () => boolean as a callback) makes the action fully testable without mounting any Svelte component. Verify the implementation follows this callback pattern before writing tests, or the unit tests will be forced to mount +page.svelte — a bad sign.
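A minimal sketch of the callback pattern, assuming a factory named `createKeyHandler` (hypothetical) that receives `isPanelOpen` and the navigation callbacks as plain functions. Only two keys are wired here to keep the shape visible.

```typescript
// Hypothetical dependency bag; the page passes closures over its local $state.
interface ShortcutDeps {
  isPanelOpen: () => boolean;
  onNext: () => void;
  onPrev: () => void;
}

// Returns a handler that can be exercised in a unit test with plain stubs.
function createKeyHandler(deps: ShortcutDeps): (key: string) => void {
  return (key) => {
    if (!deps.isPanelOpen()) return; // guard reads the callback, not a store
    if (key === 'j') deps.onNext();
    else if (key === 'k') deps.onPrev();
  };
}
```

In a Vitest spec the three dependencies would be `vi.fn()` stubs, so no Svelte component has to be mounted to cover the guard.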

Gap in component tests — the discard-with-confirm flow. The spec says Esc triggers a confirm modal before discarding. The getConfirmService() pattern is used throughout the codebase (see TranscriptionBlock.svelte) and can be mocked. The unit test must assert: (1) confirm modal is invoked with a destructive flag, (2) if the user cancels, discardCurrent is NOT called, (3) if the user confirms, discardCurrent IS called. This three-step behavior needs three separate test cases.
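The three cases can be pinned down against a small wrapper like the one below. The `ConfirmService` shape is an assumption made for illustration; the real `getConfirmService()` API may take different options or return a different type.

```typescript
// Assumed shape of the confirm service; the project's real API may differ.
interface ConfirmService {
  confirm(options: { message: string; destructive: boolean }): Promise<boolean>;
}

// Ask for confirmation with the destructive flag, then discard only on "yes".
async function discardWithConfirm(
  service: ConfirmService,
  discardCurrent: () => void,
  message: string,
): Promise<boolean> {
  const confirmed = await service.confirm({ message, destructive: true });
  if (confirmed) discardCurrent();
  return confirmed;
}
```

A mocked service that resolves to `false` covers the cancel case, one that resolves to `true` covers the confirm case, and inspecting the options it received covers the destructive flag.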

E2E test plan gap — Cmd/Ctrl+Enter inside TipTap. The spec lists this as a distinct test case. The existing annotations.spec.ts seeds transcription blocks via API. The shortcut E2E test can reuse this pattern. However, typing into a TipTap contenteditable in Playwright requires page.locator('[contenteditable]').click() then page.keyboard.type(...) — standard Playwright keyboard interaction works on contenteditable. The test should verify that focus is inside the editor AND Ctrl+Enter fires saveAndNext, not just that Enter alone doesn't navigate.

Missing test: shortcut inactive on non-save inputs. The spec mentions that shortcuts should be inactive when focus is inside "a non-save text input." There are other inputs in the page context (e.g., the label input if one exists, or filter fields). The test plan should include a case where focus is in a plain <input> outside the transcription block — pressing j should NOT navigate. This is especially important given the TipTap / contenteditable distinction.

Missing test: ? shortcut with cheatsheet already open. If ? is pressed while the cheatsheet is open, should it close the cheatsheet (toggle) or be a no-op? The spec says the cheatsheet closes with Esc and backdrop click — ? is not listed as a close trigger. The test should assert that pressing ? a second time does NOT close the cheatsheet (it's a toggle-or-not decision that should be explicit).

Recommendations

  • Unit test file structure: one describe block per shortcut key, with nested it blocks for: (a) fires correctly, (b) does not fire when panel is closed, (c) does not fire when focus is in non-save input. Use vi.fn() stub for all callbacks.
  • Cheatsheet component tests (ShortcutCheatsheet.spec.ts):
    1. it('is not in the DOM when closed'): expect(dialog).not.toBeInTheDocument() or expect(dialog).not.toHaveAttribute('open')
    2. it('opens when ? is pressed')
    3. it('shows all 9 shortcut rows')
    4. it('closes on Esc')
    5. it('closes on backdrop click')
    6. it('shows Cmd on Mac user agent, Ctrl on non-Mac')
  • E2E beforeAll pattern: seed a document with ≥ 2 pre-existing transcription blocks via API (reuse annotations.spec.ts pattern). Do not rely on OCR or manual drawing in CI.
  • axe-core test: run AxeBuilder on the open cheatsheet state — the acceptance criteria require aria-modal, labelled heading, and focus trap. Wire this into the existing annotations.spec.ts axe check pattern or the new spec.
  • Quality gate: the new unit spec should achieve 100% branch coverage of the action's guard logic — this is the highest-value code in this feature (where bugs will hide).

Open Decisions

  • ? as toggle vs. open-only: if the user presses ? while the cheatsheet is already open, should it close? Recommend open-only (pressing ? again is a no-op; Esc closes). This avoids accidentally closing the cheatsheet with a repeated ? press and keeps the close triggers limited to the two the spec lists (Esc and backdrop click).

👨‍💻 Leonie Voss — UX Design Lead

Observations

The feature is well-motivated for the 60+ transcriber persona on a laptop. Muscle memory over a large backlog is exactly the right framing. A few design details need attention.

The ? shortcut hint in the coach card is under-specified. The issue says "mention ? in the coaching card from #320." Looking at TranscribeCoachEmptyState.svelte, the coach card is a structured 3-step ol with a footer link row. A footer hint or a small badge is a better fit than a fourth step, because the hint is a tip rather than a required action. The hint should be compact, visually secondary to the three steps (smaller font, subdued color), and use a <kbd> element for the key. Example: Tipp: Drücken Sie <kbd>?</kbd> für eine Übersicht aller Tastenkürzel. (In English: "Tip: press ? for an overview of all keyboard shortcuts.") The <kbd> element is semantic and renders in the browser's monospace font, which suits key labels without custom CSS.

The cheatsheet layout spec is missing visual hierarchy details. "Two-column table of key + action label" is stated but does not specify: grouping (navigation shortcuts vs. mode shortcuts vs. utility), column proportions, or font treatment. For the 60+ user on a laptop at arm's length, recommend:

  • Key column: font-mono text-sm, displayed as <kbd> chips with a border
  • Action column: font-serif text-base text-ink for readability at 16px minimum (Tailwind's default text-sm is 14px)
  • Group dividers (no header labels needed — visual whitespace between navigation, editing, and utility groups is sufficient)
  • Dialog width: max-w-md on desktop, full-width on mobile (though shortcuts are keyboard-only, the cheatsheet can be read on tablet)
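The grouping could be expressed as data that the cheatsheet component iterates over. This is a sketch under assumptions: the split into navigation, editing, and utility groups is one possible assignment, and every message key except `shortcut_next_region` (suggested earlier in this thread) is a placeholder.

```typescript
// Hypothetical data model for the cheatsheet rows.
interface ShortcutRow {
  keys: string[];     // 'Mod' stands for the platform modifier (Cmd or Ctrl)
  messageKey: string; // i18n key for the action label (placeholder names)
}

// One inner array per visual group; whitespace separates groups in the UI.
const cheatsheetGroups: ShortcutRow[][] = [
  // Navigation
  [
    { keys: ['j'], messageKey: 'shortcut_next_region' },
    { keys: ['k'], messageKey: 'shortcut_prev_region' },
  ],
  // Editing
  [
    { keys: ['e'], messageKey: 'shortcut_toggle_mode' },
    { keys: ['Mod', 'Enter'], messageKey: 'shortcut_save_and_next' },
    { keys: ['Mod', 's'], messageKey: 'shortcut_save' },
    { keys: ['Esc'], messageKey: 'shortcut_discard' },
    { keys: ['n'], messageKey: 'shortcut_new_region' },
    { keys: ['t'], messageKey: 'shortcut_toggle_training' },
  ],
  // Utility
  [{ keys: ['?'], messageKey: 'shortcut_open_cheatsheet' }],
];

// Build the text shown in the key column, e.g. ['Mod', 'Enter'] -> 'Strg+Enter'.
function formatKeys(keys: string[], modLabel: string): string {
  return keys.map((k) => (k === 'Mod' ? modLabel : k)).join('+');
}
```

Keeping the rows in one array also gives the "shows all 9 shortcut rows" component test a single source of truth to count against.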

Platform label rendering: "Mac users see Cmd; Windows/Linux see Ctrl." On QWERTZ keyboards (the primary audience is German), the Cmd/Ctrl key is also displayed as ⌘ on Mac keyboards in the OS UI. Consider showing the symbol ⌘ rather than the word "Cmd" for Mac, since this is what the OS shows and what users will look for on the key. For Windows/Linux, "Strg" is the German label for Ctrl. If the UI is in German locale, showing "Strg" instead of "Ctrl" is more natural for the 60+ audience.

Focus trap in the cheatsheet dialog (coordination with Nora's point): when showModal() is used, the browser traps focus inside the dialog. However, the close button should be the first focusable element (or at minimum, focus must land on it on open). This matches the WAI-ARIA Authoring Practices dialog pattern, which moves focus into the dialog on open, and supports the WCAG 2.1 focus-order requirement. Implement: dialog.showModal(); closeButton.focus() immediately after opening.
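A small sketch of the open/close sequence, written against structural types so it can be tested with fakes. `ModalDialogLike`, `FocusableLike`, and both function names are hypothetical; a real `HTMLDialogElement` and button satisfy these shapes.

```typescript
// Structural subsets of HTMLDialogElement and HTMLElement (hypothetical types).
interface ModalDialogLike {
  open: boolean;
  showModal(): void;
  close(): void;
}
interface FocusableLike {
  focus(): void;
}

// Open as a modal (top layer, rest of the page inert) and move focus to the
// close button. The guard matters because calling showModal() on a dialog
// that is already open may throw an InvalidStateError.
function openCheatsheetDialog(dialog: ModalDialogLike, closeButton: FocusableLike): void {
  if (dialog.open) return;
  dialog.showModal();
  closeButton.focus();
}

function closeCheatsheetDialog(dialog: ModalDialogLike): void {
  if (dialog.open) dialog.close();
}
```

In Svelte 5 these would be called with the bound dialog and button elements whenever the cheatsheet's open state changes.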

prefers-reduced-motion for the cheatsheet overlay animation. The spec mentions this. Use a CSS-only approach:

@media (prefers-reduced-motion: no-preference) {
  dialog[open] { animation: fadeIn 150ms ease; }
}

This is zero JavaScript and respects user preference natively.

Recommendations

  • Add <kbd> styling to the project's Tailwind config or use inline classes: rounded border border-line bg-muted px-1.5 py-0.5 font-mono text-xs text-ink shadow-sm — this matches the existing card and border token system.
  • Cheatsheet close button must have aria-label (e.g., {m.cheatsheet_close()}) since it will be icon-only. Minimum 44×44px touch target on the close button even though this is a keyboard-primary feature — mobile users may open it via programmatic trigger.
  • Two-step discard UX (Esc → confirm modal): ensure the confirm modal that appears after Esc is also a <dialog> opened with showModal(). The existing getConfirmService() pattern handles this; verify that the confirm modal traps focus correctly and doesn't interfere with the cheatsheet's closed state.
  • Cheatsheet in German locale: the German value of the cheatsheet_title key should use "Tastaturkürzel", not "Shortcuts". "Tastaturkürzel" is the natural German term; "Shortcuts" is understood but sounds English in a German-first app serving a 60+ audience.
  • Coach card hint placement: add the ? hint as a footer item in the existing border-t pt-3.5 footer row of TranscribeCoachEmptyState.svelte, not as a fourth step. It reads as a tip, not a required action.

Decision Queue

Consolidated open decisions from all personas that need a human call before implementation starts. Grouped by theme.


Theme A — Component & File Locations

DQ-A1 (Markus): Where does ShortcutCheatsheet.svelte live?

  • Option 1: frontend/src/lib/document/transcription/ShortcutCheatsheet.svelte — domain-scoped, single consumer in v1, follows the existing structure
  • Option 2: frontend/src/lib/shared/primitives/ShortcutCheatsheet.svelte — reusable if other panels later get shortcut overlays

The src/lib/components/ path in the issue does not exist in this codebase. Either option 1 or 2 must be chosen before implementation. Recommendation: Option 1 (domain-scoped) — YAGNI until a second panel needs it.


Theme B — Esc Key Behavior

DQ-B1 (Elicit, Sara): Esc is specified for two behaviors: (a) discard unsaved edits with a confirm modal, and (b) close the cheatsheet. The focus-guard rule ("shortcuts inactive when focus is in a non-save input") makes case (a) unreachable in the most common scenario — the user just finished typing, focus is still in TipTap's contenteditable.

Clarify the intended sequence:

  • Option 1: Esc for discard only fires when focus is outside any editable element. User must click away from TipTap first, then press Esc. Document this two-step flow.
  • Option 2: Esc fires for discard even when focus is inside TipTap, in addition to Cmd/Ctrl+Enter which is the only other exception to the focus guard.

Recommendation: Option 1 — consistent with the stated focus guard rule, avoids surprise mid-typing discard.
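
A rough sketch of what Option 1 implies for the focus guard, assuming the handler can classify where focus currently sits. `FocusContext`, `KeyInfo`, and `shortcutAllowed` are hypothetical names; the special rule for `?` (opens regardless of focus) is deliberately not modeled here.

```typescript
// Where keyboard focus currently is (hypothetical classification).
type FocusContext = "none" | "transcriptionEditor" | "otherInput";

// The subset of KeyboardEvent fields the guard needs.
interface KeyInfo {
  key: string;
  ctrlKey: boolean;
  metaKey: boolean;
}

// Returns true when a shortcut may fire under Option 1:
// - outside any editable element, every shortcut is allowed;
// - inside the transcription editor, only Cmd/Ctrl+Enter is allowed;
// - inside any other input, nothing is allowed.
function shortcutAllowed(e: KeyInfo, focus: FocusContext): boolean {
  if (focus === "none") return true;
  const saveAndAdvance = e.key === "Enter" && (e.ctrlKey || e.metaKey);
  return focus === "transcriptionEditor" && saveAndAdvance;
}
```

With this guard, `Esc` pressed while focus is still in the editor is ignored, which is the two-step flow Option 1 asks to document.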

DQ-B2 (Nora): Should the shortcut handler call event.preventDefault() for all registered shortcuts to suppress browser defaults (particularly Ctrl+S triggering "Save Page")?

Recommendation: Yes — always call event.preventDefault() for shortcuts the action handles. This is standard practice for app-level shortcut systems and avoids the browser Save dialog conflict on Windows/Linux.
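
One way this could look, as a sketch only: `preventDefault()` is called exactly when a registered shortcut matches, so unregistered keys keep their browser behavior. `KeyEventLike`, `shortcutId`, `handleKeydown`, and the `"mod+s"` id format are assumptions made for the example, not names from the issue.

```typescript
// The subset of KeyboardEvent the dispatcher needs.
interface KeyEventLike {
  key: string;
  ctrlKey: boolean;
  metaKey: boolean;
  preventDefault(): void;
}

// Builds a lookup id such as "mod+s" or "j" from the event.
// "mod" stands for either Ctrl or Cmd.
function shortcutId(e: KeyEventLike): string {
  const mod = e.ctrlKey || e.metaKey ? "mod+" : "";
  return mod + e.key.toLowerCase();
}

// Runs the matching handler, if any. Returns true when the event was handled.
function handleKeydown(e: KeyEventLike, handlers: Map<string, () => void>): boolean {
  const handler = handlers.get(shortcutId(e));
  if (handler === undefined) return false; // not ours: leave the browser default alone
  e.preventDefault(); // e.g. stops Ctrl+S from opening the browser's save dialog
  handler();
  return true;
}
```

Registering `"mod+s"` in the map is then enough to suppress the browser default for that combination while leaving every other key untouched.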


Theme C — ? Cheatsheet Toggle vs Open-Only

DQ-C1 (Sara): If ? is pressed while the cheatsheet is already open, should it close (toggle) or be a no-op?

Recommendation: Open-only. The ? key opens the cheatsheet; Esc and backdrop click close it. This matches the mental model of the cheatsheet as a reference overlay, not a toggle switch, and avoids accidentally closing it.


Theme D — Platform Labels in German Locale

DQ-D1 (Leonie): Should the modifier key label use:

  • Cmd / Ctrl (English, as specified)
  • ⌘ / Strg (OS-native and German locale-appropriate)

Recommendation: ⌘ for Mac, Strg for Windows/Linux when the UI is in German locale. The 60+ German-speaking transcriber will look for Strg on their keyboard, not Ctrl. Wire this through the existing i18n system: add locale-aware variants in de.json using "Strg" and in en.json/es.json using "Ctrl".
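
To make the recommended label rule concrete, here is a small sketch as a plain function. In the real app the strings would presumably come from the i18n message files rather than being hard-coded; `modifierLabel`, `shortcutLabel`, and the `Locale` type are hypothetical, and showing ⌘ on Mac in every locale is an assumption this sketch makes.

```typescript
// Locales named in the recommendation.
type Locale = "de" | "en" | "es";

// Label for the platform modifier key.
// Assumption: Mac shows the ⌘ symbol in every locale.
function modifierLabel(isMac: boolean, locale: Locale): string {
  if (isMac) return "⌘";
  return locale === "de" ? "Strg" : "Ctrl";
}

// Label for a modifier combination, e.g. shortcutLabel("S", false, "de") gives "Strg+S".
function shortcutLabel(key: string, isMac: boolean, locale: Locale): string {
  return `${modifierLabel(isMac, locale)}+${key}`;
}
```

The `isMac` flag is passed in rather than detected here, so the platform check stays in one place and the label logic stays testable.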


Theme E — n and t Shortcut Mode Restrictions

DQ-E1 (Elicit): Should n (start drawing a new region) be active in Read mode, active in Edit mode only, or auto-switch to Edit mode?

Recommendation: Edit mode only, no auto-switch — auto-switching modes on n is a surprise behavior. If the user presses n in Read mode, it should be a silent no-op.

DQ-E2 (Elicit): Should t (toggle training mark) be a no-op when no block is active, or should it auto-focus the first block?

Recommendation: Silent no-op — consistent with standard shortcut behavior on desktop apps where commands with no applicable target are disabled.
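
Both Theme E recommendations can be expressed as a pure resolver that returns `null` for a silent no-op. This is only a sketch: `PanelState`, `Action`, and `resolveAction` are hypothetical names, and whether `t` should also be restricted by mode is not decided above, so it is not modeled.

```typescript
type Mode = "read" | "edit";

// Hypothetical snapshot of the panel state the shortcut handler can read.
interface PanelState {
  mode: Mode;
  activeBlockId: string | null;
}

type Action =
  | { type: "startNewRegion" }
  | { type: "toggleTraining"; blockId: string };

// Maps a key to an action, or null when the key is a silent no-op.
function resolveAction(key: string, state: PanelState): Action | null {
  switch (key) {
    case "n":
      // DQ-E1: Edit mode only, no auto-switch from Read mode.
      return state.mode === "edit" ? { type: "startNewRegion" } : null;
    case "t":
      // DQ-E2: no active block means nothing to toggle.
      return state.activeBlockId !== null
        ? { type: "toggleTraining", blockId: state.activeBlockId }
        : null;
    default:
      return null;
  }
}
```

Returning `null` instead of throwing keeps the caller simple: it dispatches when it gets an action and does nothing otherwise.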

No Label P2-medium feature ui
Reference: marcel/familienarchiv#327