When a user explicitly selects DATE sort with a text query active, the
previous code treated it identically to RELEVANCE, silently discarding
the user's sort choice. Remove DATE from the useRankOrder condition so
that explicit DATE sort always goes through the standard JPA sort path.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fires the BEFORE UPDATE trigger for every documents row, which recomputes
the tsvector from all currently-linked metadata, blocks, receivers, and tags.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- V34 migration: adds search_vector tsvector column with GIN index
- BEFORE INSERT/UPDATE trigger on documents rebuilds vector from title (A),
summary + transcription_blocks.text (B), sender/receiver names (C),
tag names + location (D) using german FTS config
- AFTER triggers on transcription_blocks, document_receivers, document_tags
touch the parent document row to re-fire the BEFORE UPDATE trigger
- DocumentRepository.findRankedIdsByFts() native query using websearch_to_tsquery
- DocumentFtsTest: 12 integration tests covering stemming, trigger sync,
ranking, stop words, malformed input, receiver and tag search
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
With a pre-built JAR, Spring Boot + Flyway starts in ~15 seconds.
The previous 60s was sized for runtime compilation (90+ seconds).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Pin to eclipse-temurin:21.0.10_7-{jdk,jre}-noble for reproducible builds
- Switch -DskipTests to -Dmaven.test.skip=true: skips test compilation entirely,
not just execution — faster and avoids build failures from test-only missing classes
- Add comment on COPY *.jar explaining why the glob is safe (Spring Boot renames
the pre-repackage artifact to .jar.original, leaving only one .jar in target/)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Prevents 111MB of compiled output from being sent to the BuildKit daemon
on cold builds. Only .mvn/, mvnw, pom.xml, and src/ are needed by the
three COPY instructions in the Dockerfile.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace runtime mvn spring-boot:run with a proper multi-stage build:
- Stage 1 (builder): compiles JAR with BuildKit cache mount for ~/.m2
- Stage 2 (runtime): eclipse-temurin:21-jre with only the JAR
Removes the backend source volume mount and maven_cache named volume.
Deploy with: docker compose up -d --build
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace hand-rolled enrichedDocuments year-divider logic with the shared
groupDocuments utility. Also fixes a timezone bug in documentYears: adds
'T12:00:00' to date strings so getFullYear() doesn't drift on UTC boundaries.
No behavior change — year dividers render the same way as before.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mirrors the existing sort allowlist pattern. Any value other than 'asc' or
'desc' silently falls back to 'desc', preventing arbitrary strings from
reaching the search API.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
text-xs text-ink/40 (~2.1:1) fails WCAG AA; text-sm bold at text-ink/60
(~3.7:1) passes the large-text 3:1 threshold. Also adds role="separator"
and aria-label so screen readers announce the group boundary.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add failing test for DATE-sort + undated doc showing "Undatiert" fallback
label, then fix DocumentList by null-coalescing sort before comparison
((sort ?? 'DATE') === 'DATE'). Test uses one dated + one undated doc to
produce two groups and trigger GroupDivider rendering.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Documents sorted by DATE show year dividers, SENDER/RECEIVER sort
shows person name dividers. Dividers only appear when there are 2+
distinct groups. Multi-receiver docs appear in each receiver group.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- start_period 60s → 120s: Zenodo download on cold start can exceed 60s on slow connections
- ocr_cache volume comment: documents what the cache stores for future operators
- .env.example: add token generation command to prevent weak placeholder in production
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add aria-expanded + aria-controls to expand button (WCAG 4.1.2)
- Add id="training-history-rows" to tbody for aria-controls target
- Replace title= tooltip on FAILED badge with details/summary for keyboard
and touch accessibility; add training_error_detail_label i18n key
- Use motion-safe:animate-pulse on RUNNING badge for prefers-reduced-motion
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- _model_is_loadable: narrow bare except to (RuntimeError, OSError, ValueError)
with DEBUG-level fallback for unexpected exceptions — prevents silent masking
of missing kraken install or AttributeError on vgsl
- _run_segtrain: replace bare except:pass with log.warning so height-check
fallback is visible in container logs
- New test_ensure_blla_model.py: covers model-OK early return, incompatible
model rename+replace, and missing model download paths
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Both training panels (OCR and segmentation) share TrainingHistory.
Show only the 3 most recent runs by default; render a Mehr/Weniger
anzeigen button when there are more.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
setCer() was called for recognition training but not for segmentation.
The OCR service now returns cer = 1 - accuracy for segtrain; persist it
so the admin panel can display Fehlerrate for both training types.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Pass OCR_TRAINING_TOKEN through to the backend container as
APP_OCR_TRAINING_TOKEN so RestClientOcrClient sends the X-Training-Token
header when calling /train and /segtrain.
- Raise mem_limit/memswap_limit from 8g to 12g to give segtrain headroom
on hosts with more available RAM.
- Uncomment OCR_TRAINING_TOKEN in .env.example — it is now required.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three issues fixed:
1. --resize both was removed in ketos 7; replaced with --resize union
which extends the model's class mapping to include training data classes.
2. ketos ignores -s when -i is present, so the 1800px blla model caused
7+ GB peak RAM and OOM-killed the host (no swap, 5 GB free).
Now checks the loaded model's input height: only uses the base model
when it was already fine-tuned at 800px; otherwise trains from scratch
at 800px (~200 MB peak). After the first run the trained 800px model
becomes the base for all subsequent fine-tuning runs.
3. segtrain now computes and returns cer = 1 - accuracy, matching the
recognition training path.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds ensure_blla_model.py which loads the blla segmentation model with
ketos on every container start. If the model is missing or in the legacy
PyTorch ZIP format (incompatible with ketos 7), it re-downloads the
correct CoreML protobuf model from Zenodo (DOI 10.5281/zenodo.14602569).
The Dockerfile now uses entrypoint.sh which runs this check before
starting uvicorn.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add tabindex="0" so the SVG can receive DOM focus
- Auto-focus the SVG on mount so arrow keys work immediately after
clicking an annotation to select it
- Show preview rect during keyboard nudging (not just pointer drag) by
checking hasLiveChanges instead of only checking dragState
- Suppress default browser focus outline (outline: none) on the SVG
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extends the 4-corner L-bracket handles with 4 tick-mark edge handles
(short lines along each edge), enabling single-axis resize from any edge.
Updates applyHandleDrag to route each handle to the correct axis.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- 4 corner-only handles (nw/ne/sw/se), no edge midpoints
- Each handle renders as two short perpendicular lines meeting at the corner
(10px arms, navy, square linecap) — no fill, no box
- Thin dashed selection border added to SVG overlay to signal edit mode
- Simplify applyHandleDrag to remove dead n/s/e/w branches
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ResizeObserver binds actual SVG pixel dimensions; viewBox matches them so
16px handle squares and 44px hit areas are physically correct regardless of
the annotation's aspect ratio.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>