Commit Graph

876 Commits

Author SHA1 Message Date
Marcel
57c44cf02f devops(backend): reduce healthcheck start_period to 30s
Some checks failed
CI / Unit & Component Tests (push) Failing after 2s
CI / Backend Unit Tests (push) Failing after 1s
With a pre-built JAR, Spring Boot + Flyway starts in ~15 seconds.
The previous 60s was sized for runtime compilation (90+ seconds).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 11:33:03 +02:00
Marcel
48223d5a3d devops(backend): pin eclipse-temurin tags, skip test compilation, document jar glob
- Pin to eclipse-temurin:21.0.10_7-{jdk,jre}-noble for reproducible builds
- Switch -DskipTests to -Dmaven.test.skip=true: skips test compilation entirely,
  not just execution — faster and avoids build failures from test-only missing classes
- Add comment on COPY *.jar explaining why the glob is safe (Spring Boot renames
  the pre-repackage artifact to .jar.original, leaving only one .jar in target/)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 11:33:03 +02:00
Marcel
04069c0286 devops(backend): add .dockerignore to exclude target/ from build context
Prevents 111MB of compiled output from being sent to the BuildKit daemon
on cold builds. Only .mvn/, mvnw, pom.xml, and src/ are needed by the
three COPY instructions in the Dockerfile.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 11:33:03 +02:00
Marcel
3c46d820ad devops(backend): switch to multi-stage Docker build
Replace runtime mvn spring-boot:run with a proper multi-stage build:
- Stage 1 (builder): compiles JAR with BuildKit cache mount for ~/.m2
- Stage 2 (runtime): eclipse-temurin:21-jre with only the JAR

Removes the backend source volume mount and maven_cache named volume.
Deploy with: docker compose up -d --build

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 11:33:03 +02:00
Marcel
38d558182a refactor(conversations): migrate ConversationTimeline to groupDocuments
Some checks failed
CI / Unit & Component Tests (pull_request) Failing after 1s
CI / Backend Unit Tests (pull_request) Failing after 2s
CI / Unit & Component Tests (push) Failing after 3s
CI / Backend Unit Tests (push) Failing after 2s
Replace hand-rolled enrichedDocuments year-divider logic with the shared
groupDocuments utility. Also fixes a timezone bug in documentYears: adds
'T12:00:00' to date strings so getFullYear() doesn't drift on UTC boundaries.
No behavior change — year dividers render the same way as before.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 09:41:40 +02:00
Marcel
25aa05411f fix(server): allowlist dir param in page.server.ts
Mirrors the existing sort allowlist pattern. Any value other than 'asc' or
'desc' silently falls back to 'desc', preventing arbitrary strings from
reaching the search API.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 09:39:24 +02:00
Marcel
f522ab633c fix(a11y): bump GroupDivider contrast and add separator role
text-xs text-ink/40 (~2.1:1) fails WCAG AA; text-sm bold at text-ink/60
(~3.7:1) passes the large-text 3:1 threshold. Also adds role="separator"
and aria-label so screen readers announce the group boundary.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 09:38:37 +02:00
Marcel
593a6c8a38 test+fix(docs): correct fallbackLabel when sort prop is omitted
Add failing test for DATE-sort + undated doc showing "Undatiert" fallback
label, then fix DocumentList by null-coalescing sort before comparison
((sort ?? 'DATE') === 'DATE'). Test uses one dated + one undated doc to
produce two groups and trigger GroupDivider rendering.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 09:37:19 +02:00
Marcel
67c03dab8c feat(search): wire sort to DocumentList; validate sort param allowlist
Some checks failed
CI / Unit & Component Tests (push) Failing after 3s
CI / Backend Unit Tests (push) Failing after 0s
CI / Unit & Component Tests (pull_request) Failing after 2s
CI / Backend Unit Tests (pull_request) Failing after 1s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 08:00:09 +02:00
Marcel
e302d3d689 feat(search): add group headers to DocumentList by sort field
Documents sorted by DATE show year dividers, SENDER/RECEIVER sort
shows person name dividers. Dividers only appear when there are 2+
distinct groups. Multi-receiver docs appear in each receiver group.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 07:59:02 +02:00
Marcel
a9aa1ec924 feat(search): add groupDocuments utility with unit tests
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 23:36:35 +02:00
Marcel
ce2bbf4230 refactor(conversations): use GroupDivider in ConversationTimeline
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 23:35:09 +02:00
Marcel
69bcb3f8b2 feat(search): add GroupDivider shared component
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 23:24:48 +02:00
Marcel
34a97cbfa2 i18n: add docs_group_undated and docs_group_unknown translation keys
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 23:21:43 +02:00
Marcel
3d3d4b8616 chore: add Claude personas, skills, memory, and project docs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 23:21:15 +02:00
Marcel
e4719b9487 fix(deploy): increase OCR healthcheck start_period, comment ocr_cache volume, add token hint
Some checks failed
CI / Unit & Component Tests (push) Failing after 2s
CI / Backend Unit Tests (push) Failing after 1s
- start_period 60s → 120s: Zenodo download on cold start can exceed 60s on slow connections
- ocr_cache volume comment: documents what the cache stores for future operators
- .env.example: add token generation command to prevent weak placeholder in production

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 21:17:53 +02:00
Marcel
7562a400c0 test(frontend): add Vitest component tests for TrainingHistory expand/collapse
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 21:17:53 +02:00
Marcel
2073a4b64a fix(frontend): accessibility fixes for TrainingHistory expand/collapse and FAILED badge
- Add aria-expanded + aria-controls to expand button (WCAG 4.1.2)
- Add id="training-history-rows" to tbody for aria-controls target
- Replace title= tooltip on FAILED badge with details/summary for keyboard
  and touch accessibility; add training_error_detail_label i18n key
- Use motion-safe:animate-pulse on RUNNING badge for prefers-reduced-motion

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 21:17:53 +02:00
Marcel
5c7efef307 fix(ocr): pin Dockerfile base image to python:3.11.9-slim for reproducible builds
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 21:17:53 +02:00
Marcel
74c9046745 fix(ocr): narrow exception handling and add unit tests for ensure_blla_model
- _model_is_loadable: narrow bare except to (RuntimeError, OSError, ValueError)
  with DEBUG-level fallback for unexpected exceptions — prevents silent masking
  of missing kraken install or AttributeError on vgsl
- _run_segtrain: replace bare except:pass with log.warning so height-check
  fallback is visible in container logs
- New test_ensure_blla_model.py: covers model-OK early return, incompatible
  model rename+replace, and missing model download paths

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 21:17:53 +02:00
Marcel
81da127381 refactor(ocr): rename findTop5 to findTop10 for headroom as frontend shows 3 by default
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 21:17:53 +02:00
Marcel
f206c0b9e9 test(ocr): add unit tests for triggerSegTraining() — conflict, threshold, happy path, failure
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 21:17:53 +02:00
Marcel
15e532eb96 refactor(ocr): extract assertNoRunningTraining() to eliminate duplicate guard
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 21:17:53 +02:00
Marcel
f241a71733 feat(frontend): limit training history to 3 runs with expand toggle
Both training panels (OCR and segmentation) share TrainingHistory.
Show only the 3 most recent runs by default; render a Mehr/Weniger
anzeigen button when there are more.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 21:17:53 +02:00
Marcel
b83465020a fix(backend): store error rate for segmentation training runs
setCer() was called for recognition training but not for segmentation.
The OCR service now returns cer = 1 - accuracy for segtrain; persist it
so the admin panel can display Fehlerrate for both training types.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 21:17:53 +02:00
Marcel
f08897b801 fix(deploy): wire OCR training token to backend and raise container memory limit
- Pass OCR_TRAINING_TOKEN through to the backend container as
  APP_OCR_TRAINING_TOKEN so RestClientOcrClient sends the X-Training-Token
  header when calling /train and /segtrain.
- Raise mem_limit/memswap_limit from 8g to 12g to give segtrain headroom
  on hosts with more available RAM.
- Uncomment OCR_TRAINING_TOKEN in .env.example — it is now required.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 21:17:53 +02:00
Marcel
a5979c4069 fix(ocr-service): fix ketos 7 segtrain compatibility and prevent OOM
Three issues fixed:

1. --resize both was removed in ketos 7; replaced with --resize union
   which extends the model's class mapping to include training data classes.

2. ketos ignores -s when -i is present, so the 1800px blla model caused
   7+ GB peak RAM and OOM-killed the host (no swap, 5 GB free).
   Now checks the loaded model's input height: only uses the base model
   when it was already fine-tuned at 800px; otherwise trains from scratch
   at 800px (~200 MB peak). After the first run the trained 800px model
   becomes the base for all subsequent fine-tuning runs.

3. segtrain now computes and returns cer = 1 - accuracy, matching the
   recognition training path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 21:17:53 +02:00
Marcel
e8375d6c72 fix(ocr-service): add entrypoint that validates blla model format on startup
Adds ensure_blla_model.py which loads the blla segmentation model with
ketos on every container start. If the model is missing or in the legacy
PyTorch ZIP format (incompatible with ketos 7), it re-downloads the
correct CoreML protobuf model from Zenodo (DOI 10.5281/zenodo.14602569).
The Dockerfile now uses entrypoint.sh which runs this check before
starting uvicorn.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 21:17:53 +02:00
Marcel
28ac90b529 fix(annotations): replace outline:none with focus-visible ring for keyboard accessibility [M7]
Some checks failed
CI / Unit & Component Tests (pull_request) Failing after 1s
CI / Backend Unit Tests (pull_request) Failing after 1s
CI / Unit & Component Tests (push) Failing after 2s
CI / Backend Unit Tests (push) Failing after 1s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 14:42:01 +02:00
Marcel
76828a95e3 fix(annotations): add catch(err) binding to handlePointerUp error handler [M6]
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 14:41:21 +02:00
Marcel
7125a0a8eb fix(annotations): reset liveWidth/liveHeight in handleKeyDown error rollback [M1, M6]
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 14:40:55 +02:00
Marcel
7097f991fe feat(annotations): add keyboard accessibility to resize handles [B2]
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 14:40:30 +02:00
Marcel
4d9145e49f feat(annotations): wire SVG aria-label to Paraglide i18n [B3]
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 14:39:35 +02:00
Marcel
060d1c0515 feat(i18n): add annotation_resize_area and annotation_resize_handle message keys [B2, B3]
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 14:38:10 +02:00
Marcel
72700bd28f test(annotations): add Testcontainers integration tests for V33 chk_annotation_bounds [B1]
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 14:36:37 +02:00
Marcel
40c8f548db docs(annotations): fix ANNOTATION_UPDATE_FAILED Javadoc to reflect 400 status [M3]
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 14:34:55 +02:00
Marcel
a19faa3806 feat(annotations): add @Slf4j and DataIntegrityViolationException catch to updateAnnotation [M2]
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 14:34:03 +02:00
Marcel
f00b470928 test(annotations): add failing test for DataIntegrityViolationException defense [M2 red]
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 14:33:43 +02:00
Marcel
65d606d8bb test(annotations): add missing height and x boundary validation tests [M4]
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 14:31:07 +02:00
Marcel
4d3207fc27 test(annotations): verify save() is called in updateAnnotation test [M5]
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 14:30:50 +02:00
Marcel
2350b4f845 fix(annotations): make resize overlay keyboard-interactive
Some checks failed
CI / Unit & Component Tests (push) Failing after 1s
CI / Backend Unit Tests (push) Failing after 1s
CI / Unit & Component Tests (pull_request) Failing after 2s
CI / Backend Unit Tests (pull_request) Failing after 1s
- Add tabindex="0" so the SVG can receive DOM focus
- Auto-focus the SVG on mount so arrow keys work immediately after
  clicking an annotation to select it
- Show preview rect during keyboard nudging (not just pointer drag) by
  checking hasLiveChanges instead of only checking dragState
- Suppress default browser focus outline (outline: none) on the SVG

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 11:47:41 +02:00
Marcel
9fe5b32a69 feat(annotations): add N/S/E/W edge midpoint handles to resize overlay
Extends the 4-corner L-bracket handles with 4 tick-mark edge handles
(short lines along each edge), enabling single-axis resize from any edge.
Updates applyHandleDrag to route each handle to the correct axis.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 11:40:39 +02:00
Marcel
fcc0efbf02 refactor(annotations): replace 8-square handles with 4 corner L-brackets
- 4 corner-only handles (nw/ne/sw/se), no edge midpoints
- Each handle renders as two short perpendicular lines meeting at the corner
  (10px arms, navy, square linecap) — no fill, no box
- Thin dashed selection border added to SVG overlay to signal edit mode
- Simplify applyHandleDrag to remove dead n/s/e/w branches

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 11:14:30 +02:00
Marcel
e7f88a4ea1 fix(annotations): use pixel-space viewBox so handles stay square on non-square annotations
ResizeObserver binds actual SVG pixel dimensions; viewBox matches them so
16px handle squares and 44px hit areas are physically correct regardless of
the annotation's aspect ratio.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 11:03:15 +02:00
Marcel
c610a3cc37 feat(annotations): wire updateAnnotation context and error display into PdfViewer
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 11:00:50 +02:00
Marcel
3fb32ea285 feat(annotations): pass isResizable to AnnotationShape based on selection + transcribeMode
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 10:57:13 +02:00
Marcel
3b756cd718 feat(annotations): add isResizable prop to AnnotationShape to render edit overlay
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 10:55:13 +02:00
Marcel
f5362a5850 feat(annotations): add AnnotationEditOverlay component with resize handles and drag
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 10:52:07 +02:00
Marcel
953cb2c910 feat(i18n): add ANNOTATION_UPDATE_FAILED error code and annotation_edit_mode_active translation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 10:43:10 +02:00
Marcel
ff231db671 feat(annotations): add PATCH endpoint for annotation resize/move
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 10:42:08 +02:00