feat(admin): /admin lands on a real dashboard instead of redirecting to /admin/users #324

Open
opened 2026-04-24 13:25:25 +02:00 by marcel · 8 comments
Owner

Context

/admin currently redirects to /admin/users. The left-rail item reads ADMIN DASHBOARD, but no dashboard exists — the label promises something the page doesn't deliver. Admins have to click around three different sub-routes (Users, System, OCR) to get a sense of system health.

A real dashboard aggregates the most-asked questions into one view:

  • Is the app healthy? (OCR online, MinIO reachable, background jobs running)
  • What's happening? (recent uploads, imports, user signups)
  • What needs attention? (pending invites, stale OCR training queue, orphaned files)
  • What can I do quickly? (start mass import, generate thumbnails, start training)

Non-goals

  • No new data pipelines — read from existing services and MinIO admin API.
  • No dashboard customisation / widget reordering / theming.
  • No per-user dashboards (this is admin-only).

Proposed layout

┌─────────────────────────────────────────────────────────────┐
│  Aktivität                    System-Gesundheit             │
│  ───────────                  ──────────────────            │
│  Letzte Uploads (10)          OCR: Online · Fehler 2.9 %    │
│  Letzte Imports (3)           Letztes Training: vor 2 Tg.   │
│  Neue Benutzer (1)            Zu trainieren: 54 Blöcke      │
│                               Speicher: 12.4 GB · 1 502 Obj.│
│                                                             │
│  Offene Einladungen (0)       Schnellaktionen               │
│  ─────────────────────        ─────────────────             │
│  (keine)                      → Massenimport starten        │
│                               → Vorschaubilder generieren   │
│                               → OCR-Training starten        │
└─────────────────────────────────────────────────────────────┘

Panels are links/deep-links to the actual sub-routes; the dashboard is a summary, not a replacement.

Implementation plan

Backend

  • New AdminDashboardController.getDashboard() → GET /api/admin/dashboard → AdminDashboardDTO.
  • AdminDashboardDTO:
    - recentUploads: List<DocumentSummary>          // limit 10
    - recentImports: List<ImportJobSummary>         // limit 3
    - recentUsers:   List<UserSummary>              // limit 5
    - pendingInvites: Integer
    - ocr: {
        online: Boolean,
        lastTrainingAt: Instant,
        lastTrainingErrorRate: Double,
        blocksReadyForTraining: Integer,
      }
    - storage: {
        totalBytes: Long,
        totalObjects: Integer,
      }
    
  • Aggregates via existing services (DocumentService, MassImportService, UserService, OCR client, MinIO admin API).
  • Permission: @RequirePermission(Permission.ADMIN).
  • Read-only; no caching in v1 (query cost is small; add if it becomes slow).

Frontend

  • Replace frontend/src/routes/admin/+page.server.ts redirect with a loader that calls /api/admin/dashboard.
  • New frontend/src/routes/admin/+page.svelte:
    • Grid of 5 panel components (Activity, System Health, Invites, Storage, Quick Actions).
    • Each panel is its own small Svelte component in frontend/src/lib/components/admin/.
    • Quick actions: link to existing sub-routes; the only new interaction is clicking into them.
  • Regenerate API types.

i18n

15–20 new Paraglide keys covering all panel titles, labels, and empty states.

Tests

  • Backend controller test: GET /api/admin/dashboard with admin → 200 + correct shape. Without admin → 403.
  • Backend service tests: each sub-aggregate (recentUploads, ocr, storage) returns expected shape for seeded fixtures.
  • Frontend page test: loads, renders all 5 panels with mocked data.
  • E2E: log in as admin, visit /admin, assert no redirect; confirm all 5 panel headers are visible.
  • E2E: log in as non-admin, visit /admin, expect redirect to login or access-denied (whatever the current global pattern is).

Verification

Manual: log in as admin, /admin shows dashboard. Click each quick action; lands on the right page. Disable OCR (stop the container) — dashboard shows "Offline".

Acceptance criteria

  • /admin no longer redirects to /admin/users
  • Dashboard shows Activity / System Health / Invites / Storage / Quick Actions panels
  • /api/admin/dashboard is permission-gated to ADMIN
  • Zero-states render cleanly (no imports yet, no invites, etc.)
  • i18n complete for de/en/es
  • Left-rail "ADMIN DASHBOARD" label no longer misleads

Critical files

backend/src/main/java/org/raddatz/familienarchiv/controller/AdminDashboardController.java  (new)
backend/src/main/java/org/raddatz/familienarchiv/dto/AdminDashboardDTO.java                (new)
backend/src/main/java/org/raddatz/familienarchiv/service/AdminDashboardService.java        (new, thin aggregator)
backend/src/test/java/org/raddatz/familienarchiv/controller/AdminDashboardControllerTest.java (new)
frontend/src/routes/admin/+page.svelte                                                     (replace redirect)
frontend/src/routes/admin/+page.server.ts                                                  (new loader)
frontend/src/lib/components/admin/ActivityPanel.svelte                                     (new)
frontend/src/lib/components/admin/SystemHealthPanel.svelte                                 (new)
frontend/src/lib/components/admin/InvitesPanel.svelte                                      (new)
frontend/src/lib/components/admin/StoragePanel.svelte                                      (new)
frontend/src/lib/components/admin/QuickActionsPanel.svelte                                 (new)
frontend/src/lib/generated/api.ts                                                          (regen)
frontend/messages/{de,en,es}.json

Related

  • #326 (admin master-detail empty states): the dashboard complements the sub-page empty states.
marcel added the P2-medium, feature, ui labels 2026-04-24 13:28:12 +02:00
Author
Owner

📋 Elicit — Requirements Engineer

Requirements discussion, 2026-04-26. Nine open items worked through; all resolved.


Resolved items

1. Pending invites data source
InviteToken.isActive() (not revoked, not expired, not exhausted) is the correct filter. InviteService.listInvites(true, ...).size() gives the count directly. The invite system already exists in full.

2. Quick Actions panel → replaced by OCR panel
Mass import and thumbnail generation are too infrequent (≤ once/year) to warrant dashboard real estate. Quick Actions panel is dropped entirely. The OCR section becomes its own panel showing:

  • Service: online / offline
  • Segmentation training: last run date + success/failure outcome
  • Kurrent training: last run date + success/failure outcome
  • Blocks not yet used for training, split by type (= total eligible − blockCount of last completed run per type)
  • Link to the OCR detail sub-route
  • No error rate on dashboard — that belongs on the OCR detail page

3. OCR panel resilience
If the OCR service is unreachable, only the OCR panel shows a degraded/offline state. The rest of the dashboard (Activity, Invites) loads normally. The backend must catch OCR connectivity failures and return ocr.online = false with null training fields rather than propagating a 500. All OCR training fields in the DTO must be nullable.

4. Storage panel — removed from this issue
The backend currently has no MinIO admin API integration and the Document entity stores no file size. Adding storage stats requires new integration work, which contradicts the "no new data pipelines" non-goal. Storage panel removed from v1 and tracked as a separate follow-on in #334.

5. Auto-refresh
Not needed in v1. Manual page reload is sufficient.

6. Non-admin access pattern
The pattern is already defined and consistent: authenticated non-admin users hit error(403) in +layout.server.ts — they see the 403 error page, not a redirect. The E2E acceptance criterion should read: "authenticated non-admin visits /admin → 403 error page" (not the vague "redirect to login or access-denied").

7. OCR error rate
Dropped from the dashboard (resolved as part of item 2). Detail page only.

8. Training block count definition
"Blocks not yet used for training" = total eligible blocks (per type) − blockCount of the last completed OcrTrainingRun for that type, split by KURRENT_RECOGNITION and KURRENT_SEGMENTATION.

9. Activity panel — redesigned
recentUploads, recentImports, and recentUsers as lists are dropped. The panel becomes four weekly counts (last 7 days), sourced entirely from the existing audit log:

Metric                        Audit event
--------------------------    ------------------
Uploads                       FILE_UPLOADED
Comments                      COMMENT_ADDED
Transcription blocks saved    TEXT_SAVED
Segmentation annotations      ANNOTATION_CREATED

No lists, no links to individual items — counts only.
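
The four weekly counts can be illustrated with a small in-memory sketch. Only the four event names come from the table above; `AuditKind`, `AuditEntry`, and `ActivityCounts.lastSevenDays` are hypothetical names for illustration, and the real aggregation would presumably run as a SQL query against the audit log rather than over a list.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Hypothetical stand-ins for the audit log; OTHER represents any event not shown on the dashboard.
enum AuditKind { FILE_UPLOADED, COMMENT_ADDED, TEXT_SAVED, ANNOTATION_CREATED, OTHER }

record AuditEntry(AuditKind kind, Instant happenedAt) {}

record ActivityCounts(int uploadsThisWeek, int commentsThisWeek,
                      int textSavedThisWeek, int annotationsThisWeek) {

    // Counts entries per kind, skipping anything older than 7 days before `now`.
    static ActivityCounts lastSevenDays(List<AuditEntry> entries, Instant now) {
        Instant cutoff = now.minus(Duration.ofDays(7));
        int uploads = 0, comments = 0, textSaved = 0, annotations = 0;
        for (AuditEntry e : entries) {
            if (e.happenedAt().isBefore(cutoff)) continue; // outside the window
            switch (e.kind()) {
                case FILE_UPLOADED -> uploads++;
                case COMMENT_ADDED -> comments++;
                case TEXT_SAVED -> textSaved++;
                case ANNOTATION_CREATED -> annotations++;
                default -> { } // other audit events are ignored
            }
        }
        return new ActivityCounts(uploads, comments, textSaved, annotations);
    }
}
```

The result carries counts only, which matches the "no lists, no links" decision.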


Implementer flag (not a blocker)

OcrTrainingRun has no explicit training type column. Type is currently inferred from modelName (german_kurrent → Kurrent recognition, blla → segmentation). This works today but is fragile if model names change. Consider adding a trainingType enum column via migration to make the split explicit and query-safe.
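
To make the fragility concrete, here is a minimal sketch of the name-based inference. The enum constants reuse the names from resolved item 8; the `fromModelName` helper and the substring matching are assumptions for illustration, not the project's actual code.

```java
import java.util.Optional;

enum TrainingType {
    KURRENT_RECOGNITION, KURRENT_SEGMENTATION;

    // Hypothetical helper: infers the type from the model name.
    // Unknown names yield Optional.empty() rather than a silent wrong guess.
    static Optional<TrainingType> fromModelName(String modelName) {
        if (modelName == null) return Optional.empty();
        if (modelName.contains("german_kurrent")) return Optional.of(KURRENT_RECOGNITION);
        if (modelName.contains("blla")) return Optional.of(KURRENT_SEGMENTATION);
        return Optional.empty();
    }
}
```

A renamed model (say `kurrent_v2`) would fall through to `Optional.empty()`, which is exactly the case an explicit `trainingType` column would rule out.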


Updated DTO shape (replacing the original proposal)

AdminDashboardDTO:
  activity: {
    uploadsThisWeek: Integer
    commentsThisWeek: Integer
    textSavedThisWeek: Integer
    annotationsThisWeek: Integer
  }
  invites: {
    pendingCount: Integer
  }
  ocr: {
    online: Boolean
    kurrent: {                         // nullable if online=false
      lastRunAt: Instant
      lastRunSucceeded: Boolean
      blocksNotYetUsed: Integer
    }
    segmentation: {                    // nullable if online=false
      lastRunAt: Instant
      lastRunSucceeded: Boolean
      blocksNotYetUsed: Integer
    }
  }

Storage section removed (tracked in #334).
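
One possible way to express this shape in Java is a set of nested records. This is a sketch only: the field names follow the shape above, but the record names, the use of primitives for non-nullable fields, and the `offline()` factory are assumptions.

```java
import java.time.Instant;

record ActivityStats(int uploadsThisWeek, int commentsThisWeek,
                     int textSavedThisWeek, int annotationsThisWeek) {}

record InviteStats(int pendingCount) {}

// Reference types throughout, so every training field can be null.
record TrainingStats(Instant lastRunAt, Boolean lastRunSucceeded, Integer blocksNotYetUsed) {}

record OcrStats(boolean online, TrainingStats kurrent, TrainingStats segmentation) {
    // Hypothetical factory for the degraded state: offline, both training sections null.
    static OcrStats offline() {
        return new OcrStats(false, null, null);
    }
}

record AdminDashboardDTO(ActivityStats activity, InviteStats invites, OcrStats ocr) {}
```

With this layout the Activity and Invites sections stay populated even when `ocr` is the offline value.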

Author
Owner

🏗️ Markus Keller — Senior Application Architect

Observations

  • Package placement conflict. The issue puts AdminDashboardController in controller/ — consistent with AdminController.java, which also lives there at /api/admin/**. This is correct. The existing dashboard/ package is for the user-facing dashboard (DashboardController, DashboardService, ActivityFeedItemDTO). Keep admin stats separate from user-facing pulse/resume logic. Don't mix the two.

  • OcrTrainingRun has no trainingType column. The OCR panel requires splitting the last-run query by type (kurrent recognition vs. segmentation). Currently, type is inferred from modelName string matching. OcrTrainingRunRepository has no query for "last completed run per type" — whatever is added here must pattern-match on model name strings like german_kurrent and blla. This is fragile: if the model name changes in the Python service, the dashboard silently breaks with no type error. Elicit marked this as a non-blocking implementer flag. I'd push back on that: adding a trainingType enum column is a 2-line migration + one column addition. Doing it in this issue means one deploy. Doing it as a follow-on means two deploys and a window where the dashboard infers the wrong type silently.

  • AuditLogQueryRepository.getPulseStats() already does most of the work. The existing native query (lines 114–129) counts FILE_UPLOADED, ANNOTATION_CREATED, TEXT_SAVED per week. The admin dashboard Activity panel needs the same three plus COMMENT_ADDED, without the per-user yourPages metric. Add getSystemActivityCounts(@Param("weekStart") OffsetDateTime weekStart) to AuditLogQueryRepository — adapt the existing SQL rather than writing from scratch.

  • Invite count overhead. InviteService.listInvites(true, appBaseUrl) fetches full InviteListItemDTO objects with shareable URLs and formatted display codes — just to count them. For a dashboard summary, add a countActive() native query to InviteTokenRepository:

    SELECT COUNT(*) FROM invite_tokens
    WHERE NOT revoked
      AND (expires_at IS NULL OR expires_at > NOW())
      AND (max_uses IS NULL OR use_count < max_uses)
    

    This is a scalar count, not a full entity fetch. At current volume it's negligible, but the pattern is the right one.

  • OCR connectivity resilience must be in the service layer. AdminDashboardService calls the OCR health endpoint. If the OCR service is unreachable, the exception must be caught in the service — not the controller. Return AdminDashboardDTO with ocr.online = false and null training fields. A 500 from a side panel health check is not acceptable.

  • The +layout.server.ts has a related TODO. It loads ALL users, ALL groups, ALL tags to get counts — there's a comment flagging this as wasteful. This issue doesn't need to fix that, but the new /api/admin/dashboard endpoint should not replicate the same pattern. Return counts only, not lists.
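
The service-layer resilience described above could look roughly like the following. All names here (`OcrSection`, `OcrStatusAggregator`, the `Supplier` standing in for the OCR client) are hypothetical; the real DTO and client are not shown in this thread.

```java
import java.util.function.Supplier;

// Hypothetical minimal OCR section; the String fields are placeholders for the training details.
record OcrSection(boolean online, String kurrentLastRun, String segmentationLastRun) {
    static OcrSection offline() { return new OcrSection(false, null, null); }
}

class OcrStatusAggregator {
    private final Supplier<OcrSection> ocrClient; // stands in for the call to the OCR service

    OcrStatusAggregator(Supplier<OcrSection> ocrClient) { this.ocrClient = ocrClient; }

    // Never throws on OCR failures: any runtime exception becomes the degraded state.
    OcrSection loadOcrSection() {
        try {
            return ocrClient.get();
        } catch (RuntimeException e) {
            return OcrSection.offline();
        }
    }
}
```

Because the exception is swallowed inside the aggregator, the controller only ever sees a well-formed section and the rest of the dashboard is unaffected.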

Recommendations

  • Place AdminDashboardController in controller/ — consistent with AdminController. Keep dashboard/ for user-facing data.
  • Add trainingType enum column + Flyway migration in this issue, not as a follow-on. The dashboard query against a typed column is one findFirstByTrainingTypeAndStatus(type, COMPLETED) JPA call; the modelName-based version requires a native query with string matching.
  • Add AuditLogQueryRepository.getSystemActivityCounts(weekStart) — adapted from getPulseStats(), adds COMMENT_ADDED, removes the yourPages per-user metric.
  • Add InviteTokenRepository.countActive() — native count query, no DTO allocation.
  • Wrap OCR health check in try-catch in AdminDashboardService, return degraded state, never propagate the exception upward.
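
For test fixtures it can help to have the `countActive()` predicate mirrored in plain Java. The sketch below restates the WHERE clause from the observations; `InviteRow` is a hypothetical in-memory stand-in, not the actual `InviteToken` entity.

```java
import java.time.Instant;
import java.util.List;

// Hypothetical row type; the fields mirror the columns used in the SQL query.
record InviteRow(boolean revoked, Instant expiresAt, Integer maxUses, int useCount) {

    // Same predicate as the WHERE clause: not revoked, not expired, not exhausted.
    boolean isActive(Instant now) {
        boolean notExpired = expiresAt == null || expiresAt.isAfter(now);
        boolean notExhausted = maxUses == null || useCount < maxUses;
        return !revoked && notExpired && notExhausted;
    }

    static long countActive(List<InviteRow> rows, Instant now) {
        return rows.stream().filter(r -> r.isActive(now)).count();
    }
}
```

A null `expiresAt` or `maxUses` means "no limit", matching the `IS NULL` branches in the query.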

Open Decisions

  • trainingType migration: this issue or follow-on? Elicit called it non-blocking. I think it belongs here — the dashboard result is incorrect without it. Options: (A) Add migration in this PR (recommended), (B) Deploy dashboard with modelName string inference and add migration in a follow-on. What each costs: A = one deploy, correct from day one; B = two deploys, silent failure risk if model names drift.
Author
Owner

👨‍💻 Felix Brandt — Senior Fullstack Developer

Observations

  • The critical files list in the issue body is outdated. It still includes QuickActionsPanel.svelte and StoragePanel.svelte — both dropped by Elicit. The panel count is now 3 (Activity, Invites, OCR), not 5. Start from the Elicit comment's revised design, not the original issue body, when building the component tree.

  • +page.svelte currently uses onMount + goto() for desktop redirect. This means: browser loads the page → renders mobile entity picker → JS runs → navigates to /admin/users. The flash is visible on desktop. The server-side +page.server.ts loader eliminates this entirely — the data is available before first paint. The onMount + goto block must be deleted.

  • AuditLogQueryRepository.getPulseStats() already has the right shape. Lines 114–129: it counts FILE_UPLOADED, ANNOTATION_CREATED, TEXT_SAVED grouped to weekStart. The admin version needs:

    • Add COMMENT_ADDED to the query
    • Remove the yourPages per-user column
    • Remove the userId parameter

    Add a new method getSystemActivityCounts(OffsetDateTime weekStart) returning a new projection interface SystemActivityRow with four fields: uploaded, comments, transcribed, annotated.

  • OcrTrainingRunRepository has no query for "last completed run per type." The existing methods query by personId, status, or top-20 recent. For the dashboard, one new query method is needed:

    Optional<OcrTrainingRun> findFirstByModelNameContainingAndStatusOrderByCompletedAtDesc(
        String modelNameFragment, TrainingStatus status);
    

    called once with "german_kurrent" and once with "blla". This is exactly the fragile modelName inference the Elicit comment flagged. If the decision is to add a trainingType column, use findFirstByTrainingTypeAndStatusOrderByCompletedAtDesc(TrainingType type, TrainingStatus status) instead — one call per type, no string matching.

  • OCR panel needs null-safe $derived blocks. The DTO guarantees kurrent and segmentation are nullable when ocr.online = false. The component must guard this:

    const kurrentData = $derived(data.dashboard.ocr.online ? data.dashboard.ocr.kurrent : null);
    

    If the component accesses data.dashboard.ocr.kurrent.lastRunAt without a null guard, it throws at runtime when OCR is offline.

  • The existing page.svelte.spec.ts currently tests the mobile picker and desktop redirect behavior. This file must be rewritten — not just updated. Delete the onMount/goto tests and replace with panel rendering tests.

Recommendations

  • Update the critical files list before opening the PR: remove QuickActionsPanel.svelte and StoragePanel.svelte; rename SystemHealthPanel.svelte → OcrPanel.svelte.
  • Add getSystemActivityCounts(OffsetDateTime weekStart) to AuditLogQueryRepository — adapt existing SQL, 5-minute task.
  • Delete the onMount + goto block from +page.svelte entirely once the loader is in place. No partial: the redirect and the mobile picker are both replaced by the dashboard.
  • Use $derived with null guards for the OCR sub-sections — do not access nested fields without first checking ocr.online.
  • Write the failing tests first before touching +page.svelte or any backend class — the existing page.svelte.spec.ts is the red starting point.
Author
Owner

🔒 Nora "NullX" Steiner — Application Security Engineer

Observations

  • Permission gate is correct, but must be class-level. The issue specifies @RequirePermission(Permission.ADMIN). The existing AdminController applies this at the class level — every method is protected by default. The new AdminDashboardController must do the same. A method-level annotation on only getDashboard() would leave any future method on the same class unprotected by default.

  • Data sensitivity is low — aggregated counts only. The dashboard response contains: weekly activity counts, pending invite count, OCR service status + training metrics. No individual user records, no document content, no PII in the response shape. The ADMIN permission gate is the right and sufficient mitigation.

  • OCR health check timeout is an availability risk. AdminDashboardService will call the Python OCR service to determine online status. If the OCR service is hung (not unreachable, but slow — responding after 30 seconds), every admin page load blocks a Spring thread for 30 seconds waiting for the OCR ping. At low concurrency this is invisible; under load it becomes thread exhaustion. The fix: configure a short read timeout (2–3 seconds) specifically for the dashboard health check, separate from the global OCR client timeout used for full OCR jobs.

  • Audit log query returns counts, not rows. The getSystemActivityCounts() query returns four integer counts. No risk of over-fetching event details or leaking document/user identifiers. Safe.

  • Invite count query. InviteTokenRepository.findActive() returns token objects including codes and shareable URLs. If the dashboard service uses listInvites() to count and discards the rest, those objects (containing invite codes) are allocated and then GC'd. This is a minor concern — not a security issue at this scale, just unnecessary. A countActive() native query avoids it.

Recommendations

  • Apply @RequirePermission(Permission.ADMIN) at the class level on AdminDashboardController — not method-level. Match AdminController's pattern exactly.
  • Set a 2–3 second timeout for the OCR health check call in AdminDashboardService. Use a separate RestClient instance configured with its own short connect and read timeouts (or a per-request timeout, if the underlying HTTP client supports one); do not share timeout config with the full OCR processing client.
  • Write explicit security tests for both 401 and 403:
    • GET /api/admin/dashboard with no auth → 401
    • GET /api/admin/dashboard with authenticated user missing ADMIN permission → 403

    These are distinct failure modes and must be tested separately; "Spring Security handles auth" is not coverage.

🧪 Sara Holt — Senior QA Engineer

Observations

  • Test plan structure is good, but missing critical cases. The issue lists: controller test (200+shape, 403), service tests per sub-aggregate, frontend page test, E2E. The pyramid is right. The gaps:

    1. OCR service offline state — no test listed for ocr.online = false with null kurrent/segmentation. This is a required branch in both backend (service catches RestClientException → returns degraded DTO) and frontend (OcrPanel renders "Offline" without crashing). Without this test, the first time OCR goes down, the dashboard throws a 500 or the UI crashes.

    2. getSystemActivityCounts() integration test — this will be a native SQL query against audit_log. It must run against real PostgreSQL via Testcontainers with seeded rows. Mocking it at the service layer does not prove the SQL is correct. Verify each of the four counts independently with seeded fixtures.

    3. page.svelte.spec.ts rewrite — the existing test asserts the mobile picker and desktop redirect behavior. Both are being deleted. The spec must be rewritten before the old behavior is removed — otherwise there's a window where the old tests are green (redirect exists), then they're deleted, then nothing covers the new dashboard panels.

  • E2E acceptance criteria need tightening. "Assert no redirect" verifies the goto is gone — good. Also assert all 3 panel headings are visible in the same test. Use getByRole('heading') to avoid coupling to CSS class names.

  • Non-admin E2E test. The issue says "authenticated non-admin visits /admin → 403 error page" (Elicit resolved this correctly). Write this as a Playwright test — not just a @WebMvcTest test. The 403 is enforced at the SvelteKit layout level (+layout.server.ts line 24), which is separate from the backend permission gate. Test both layers.

Recommendations

  • Add three missing test cases before closing the issue:

    1. AdminDashboardControllerTest: when_ocr_unreachable_returns_200_with_online_false — mock OCR client to throw RestClientException, assert response is 200 with ocr.online = false.
    2. AuditLogQueryRepositoryTest (Testcontainers): seed 2 FILE_UPLOADED, 3 COMMENT_ADDED, 1 TEXT_SAVED, 0 ANNOTATION_CREATED in the last 7 days — assert each count matches.
    3. OcrPanel.spec.ts (Vitest Browser): render with makeDashboard({ ocr: { online: false, kurrent: null, segmentation: null } }) — assert no runtime errors, offline badge visible.
  • Factory function for dashboard DTO in frontend tests:

    // Defaults describe a healthy, idle system. Overrides are merged shallowly,
    // so an `ocr` override replaces the whole `ocr` object.
    const makeDashboard = (overrides = {}) => ({
        activity: { uploadsThisWeek: 0, commentsThisWeek: 0, textSavedThisWeek: 0, annotationsThisWeek: 0 },
        invites: { pendingCount: 0 },
        ocr: {
            online: true,
            kurrent: { lastRunAt: '2026-04-01T10:00:00Z', lastRunSucceeded: true, blocksNotYetUsed: 12 },
            segmentation: { lastRunAt: '2026-04-02T10:00:00Z', lastRunSucceeded: false, blocksNotYetUsed: 5 }
        },
        ...overrides
    });

    Pass { ocr: { online: false, kurrent: null, segmentation: null } } to cover the offline case in one line.

  • Rewrite page.svelte.spec.ts first — before touching +page.svelte. The spec rewrite IS the red phase for the frontend dashboard implementation.
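
The offline branch that test case 3 targets can be pinned down as a pure function before any component exists. The following is a minimal sketch, assuming the DTO shape implied by the makeDashboard factory. The type names `OcrRun`, `OcrStatus` and `RunSummary`, the helper `describeOcrRun`, and the `never-run` state for an online service with no completed run are all hypothetical and not part of the codebase.

```typescript
// Hypothetical types, inferred from the factory's default object.
type OcrRun = { lastRunAt: string; lastRunSucceeded: boolean; blocksNotYetUsed: number };
type OcrStatus = { online: boolean; kurrent: OcrRun | null; segmentation: OcrRun | null };

type RunSummary =
    | { state: 'offline' }
    | { state: 'never-run' }
    | { state: 'ok' | 'failed'; lastRunAt: string; blocksNotYetUsed: number };

// Null-safe summary of one training type; never dereferences a null run.
function describeOcrRun(ocr: OcrStatus, type: 'kurrent' | 'segmentation'): RunSummary {
    // Offline wins over everything else: sub-sections are not trusted in that case.
    if (!ocr.online) return { state: 'offline' };
    const run = ocr[type];
    // Assumption: an online service may still have no completed run for a type.
    if (run === null) return { state: 'never-run' };
    return {
        state: run.lastRunSucceeded ? 'ok' : 'failed',
        lastRunAt: run.lastRunAt,
        blocksNotYetUsed: run.blocksNotYetUsed
    };
}

const offline: OcrStatus = { online: false, kurrent: null, segmentation: null };
console.log(describeOcrRun(offline, 'kurrent').state); // prints "offline"
```

A component could derive its display state from such a function, which keeps the null guard in one testable place instead of repeating it per field in the template.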


🎨 Leonie Voss — UX Designer & Accessibility Strategist

Observations

  • Mobile behavior is unspecified. The current +page.svelte is a mobile entity picker: tapping Users, Groups, Invites, Tags, System navigates to each sub-route, and history.back() returns here. The issue says "replace redirect with a loader" but doesn't define what mobile admins see. On desktop the entity nav lives in the left rail (via EntityNav.svelte); on mobile there is no persistent left rail. If the mobile entity picker is removed and replaced with dashboard panels, mobile admins lose their sub-route hub. This needs an explicit decision before any pixel is placed.

  • 3-panel layout works better than the original 5-panel grid. With Storage and QuickActions removed, the dashboard is leaner: Activity (4 counts), Invites (1 count), OCR (2 training types + online/offline). This fits a single-column layout on mobile and a 2-column or asymmetric grid on desktop without feeling sparse.

  • Activity panel: 4 stat cards, not a list. The Elicit design (4 weekly counts) is a stat card grid — not a list of recent items. On mobile: 2×2 grid. On desktop: 1×4 row. Use the pattern from OcrStatCards.svelte (it already exists in admin/ocr/) — the number styling and label hierarchy are established there. Reuse, don't reinvent.

  • OCR offline state needs a visible status badge, never color alone. If ocr.online = false, the panel must show a status badge with both a colored indicator AND the text "Offline". Users with red-green color vision deficiency (roughly 8% of men) may not be able to tell a green dot from a red one. The badge pattern:

    <span class="inline-flex items-center gap-1.5 rounded-full px-2 py-0.5 text-xs font-medium
                 {data.dashboard.ocr.online ? 'bg-green-100 text-green-800' : 'bg-red-100 text-red-700'}">
      <span class="h-1.5 w-1.5 rounded-full {data.dashboard.ocr.online ? 'bg-green-500' : 'bg-red-500'}"
            aria-hidden="true"></span>
      {data.dashboard.ocr.online ? 'Online' : 'Offline'}
    </span>
    
  • Zero state for Invites. pendingCount: 0 should display a message — not just "0". A small icon + "Keine ausstehenden Einladungen" communicates clearly. Showing "0" without context leaves admins wondering if the count failed to load.

  • Loading state is missing from the issue. The +page.server.ts loader is async. SvelteKit renders the page only after the load function resolves, so no spinner is needed in the normal SSR case. If SvelteKit streaming (returning unresolved promises from load) or client-side navigation to this page is in scope, skeleton cards should be considered. For the default SSR approach, the backend must respond quickly: the OCR health check timeout (flagged by Nora) directly affects time-to-first-paint.

  • The onMount redirect removal is a UX win. On desktop, the current behavior renders mobile content briefly before the client-side goto() fires. The new server-side loader means correct content on first paint, no flash. This improves both performance and perceived quality.

Recommendations

  • Decide mobile behavior before building — see Decision Queue. My recommendation: show the dashboard panels on all screen sizes, with sub-route navigation accessible via the established EntityNav component in the left rail on desktop, and via a "Zur Administration" header link or bottom-of-page links on mobile.
  • Reuse OcrStatCards.svelte pattern for Activity counts — number in font-serif text-3xl font-bold text-ink, label in font-sans text-xs text-ink-3 uppercase tracking-widest.
  • Status badge with text + color on the OCR panel header — never color alone.
  • Zero state for Invites — icon + localized message when pendingCount === 0.
  • 44px minimum touch target on any link within dashboard panels (especially the "→ OCR-Details" deep-link at the bottom of the OCR panel).
  • Panel container: use the established card pattern bg-white shadow-sm border border-brand-sand rounded-sm p-6 — consistent with every other admin panel in the project.
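
The badge and zero-state rules above can be kept in one small module so that the status text and its color never drift apart. This is a minimal sketch, assuming the Tailwind classes from the badge pattern; `Badge`, `ocrBadge` and `inviteSummary` are hypothetical names, not existing code.

```typescript
// Hypothetical helper: one place that pairs the status text with its colors.
type Badge = { label: 'Online' | 'Offline'; pill: string; dot: string };

function ocrBadge(online: boolean): Badge {
    return online
        ? { label: 'Online', pill: 'bg-green-100 text-green-800', dot: 'bg-green-500' }
        : { label: 'Offline', pill: 'bg-red-100 text-red-700', dot: 'bg-red-500' };
}

// Zero state for invites: a message instead of a bare "0".
// For non-zero counts this sketch simply returns the number as text.
function inviteSummary(pendingCount: number): string {
    return pendingCount === 0 ? 'Keine ausstehenden Einladungen' : String(pendingCount);
}

console.log(ocrBadge(false).label);  // prints "Offline"
console.log(inviteSummary(0));       // prints "Keine ausstehenden Einladungen"
```

Because the label is part of the returned object, a template that uses the helper cannot render the colored dot without also having the text available.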

Open Decisions

  • Mobile layout — Does /admin on mobile show dashboard panels, the entity picker, or both? Options: (A) Dashboard panels only, sub-routes accessible via back button from each panel's detail link. (B) Tab switcher at the top: "Dashboard | Navigation". (C) Dashboard panels with a "Zur Administration ›" footer row linking to the entity picker. My preference: C — keeps the dashboard as the landing with easy escape to sub-routes, no new tab pattern needed.

⚙️ Tobias Wendt — DevOps & Platform Engineer

Observations

  • No new infrastructure required. The dashboard reads from existing services (PostgreSQL audit log, InviteTokenRepository, OcrTrainingRunRepository, existing OCR client). No new containers, no new volumes, no config changes. Clean.

  • OCR health check adds a new outbound call on the hot path. Every admin page load will now trigger a call to the Python OCR service to determine online status. In normal operation this is fast. During OCR service restarts, upgrades, or GPU hangs, this call may stall. Confirmed from the Compose file: the OCR service's health check has start_period: 60s to allow for model loading. During that window, /health on the OCR service may return unhealthy or not respond. The dashboard backend must set an explicit short timeout on this call (2–3 seconds), separate from the timeout used by the full OCR job client. Without it, an admin hitting the dashboard during an OCR cold start can block a request thread for as long as the OCR client's default timeout allows, potentially the whole model-load window.

  • No API type regen impact beyond what's expected. The new AdminDashboardDTO is the only new shape in the OpenAPI spec. One npm run generate:api after the backend is running with --spring.profiles.active=dev covers it.

  • The +layout.server.ts currently loads ALL users, groups, tags to get counts (with a TODO noting this). The dashboard endpoint doesn't fix this — it's a separate concern. But noting it: the three parallel calls in +layout.server.ts run on every admin sub-route load, including the new dashboard. This is existing overhead, not introduced by this issue.

  • No Flyway migration required unless the trainingType column addition (flagged by Markus) is included. If it is: one new migration file, named V{N}__add_training_type_to_ocr_training_runs.sql. OcrTrainingRun.modelName is currently a non-null string column; adding a nullable training_type VARCHAR(50) column and backfilling it with a subsequent UPDATE is the standard approach.

Recommendations

  • Confirm OCR client timeout config before merging. If OcrTrainingService or OcrClient uses a global timeout of, say, 120 seconds (reasonable for a full OCR job), the dashboard health check must override this with a 2–3 second timeout. Check RestClientOcrClient configuration.
  • If the trainingType migration is included: name the migration V{N}__add_training_type_to_ocr_training_runs.sql, add a nullable column, and backfill existing rows from modelName (CASE WHEN model_name LIKE '%german_kurrent%' THEN 'KURRENT_RECOGNITION' WHEN model_name LIKE '%blla%' THEN 'KURRENT_SEGMENTATION' ELSE NULL END). Then either keep the column nullable for safety, or make it NOT NULL DEFAULT 'UNKNOWN'; in the latter case the backfill must map unmatched rows to 'UNKNOWN' rather than NULL, otherwise adding the NOT NULL constraint fails on those rows. Verify the migration runs cleanly against a Testcontainers instance before merging.
  • No other infra changes needed. This is a clean, read-only feature addition.
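
To make the drift risk of the modelName-based backfill concrete, the CASE expression can be mirrored as a pure function and run against sample names. This is an illustrative sketch only: `inferTrainingType` and the sample model names are hypothetical, and the check uses a literal substring match, whereas SQL LIKE treats `_` as a single-character wildcard, so the SQL pattern is slightly more permissive.

```typescript
type TrainingType = 'KURRENT_RECOGNITION' | 'KURRENT_SEGMENTATION';

// Mirrors the backfill CASE expression: substring match, first hit wins, else null.
function inferTrainingType(modelName: string): TrainingType | null {
    if (modelName.includes('german_kurrent')) return 'KURRENT_RECOGNITION';
    if (modelName.includes('blla')) return 'KURRENT_SEGMENTATION';
    return null;
}

// Hypothetical model names, for illustration only.
const samples = ['german_kurrent_v3.mlmodel', 'blla_finetuned.mlmodel', 'kurrent-de-2026.mlmodel'];
for (const name of samples) {
    console.log(name, '->', inferTrainingType(name));
}
```

The third sample stands for a renamed recognition model: it matches neither fragment and maps to null, which is the silent-failure case the trainingType column is meant to remove.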

🗳️ Decision Queue — Action Required

2 decisions need your input before implementation starts.

Schema

  • trainingType migration: this issue or follow-on? The OCR panel requires splitting the last-run record by type (kurrent recognition vs. segmentation). Currently OcrTrainingRun has no trainingType column — the split would use modelName string matching (LIKE '%german_kurrent%', LIKE '%blla%'). Elicit marked this non-blocking. Markus disagrees: if model names drift in the Python service, the dashboard silently shows wrong data with no compile-time error. Options: (A) Add a trainingType enum column + backfill migration in this PR — one deploy, correct from day one. (B) Ship the dashboard with modelName inference, file a follow-on — two deploys, silent-failure window. (Raised by: Markus, Felix)

UX / Mobile

  • What does /admin show on mobile? The current +page.svelte is a mobile entity picker (Users, Groups, Invites, Tags, System as tappable rows). On desktop it redirects to /admin/users. The issue replaces the desktop redirect with a dashboard loader — but is silent on mobile. Options: (A) Dashboard panels only on all screen sizes; sub-routes accessible via detail links within each panel. (B) Tab switcher at the top of the mobile view: "Dashboard | Navigation". (C) Dashboard panels at the top + "Zur Administration ›" footer row that expands or links to the existing entity picker rows. Leonie recommends C — preserves the sub-route hub without adding a new navigation pattern. (Raised by: Leonie, Markus)
Labels: P2-medium, feature, ui
Reference: marcel/familienarchiv#324