Documents sorted by DATE show year dividers, SENDER/RECEIVER sort
shows person name dividers. Dividers only appear when there are 2+
distinct groups. Multi-receiver docs appear in each receiver group.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- start_period 60s → 120s: Zenodo download on cold start can exceed 60s on slow connections
- ocr_cache volume comment: documents what the cache stores for future operators
- .env.example: add token generation command to prevent weak placeholder in production
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add aria-expanded + aria-controls to expand button (WCAG 4.1.2)
- Add id="training-history-rows" to tbody for aria-controls target
- Replace title= tooltip on FAILED badge with details/summary for keyboard
and touch accessibility; add training_error_detail_label i18n key
- Use motion-safe:animate-pulse on RUNNING badge for prefers-reduced-motion
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- _model_is_loadable: narrow bare except to (RuntimeError, OSError, ValueError)
with DEBUG-level fallback for unexpected exceptions — prevents silent masking
of missing kraken install or AttributeError on vgsl
- _run_segtrain: replace bare except:pass with log.warning so height-check
fallback is visible in container logs
- New test_ensure_blla_model.py: covers model-OK early return, incompatible
model rename+replace, and missing model download paths
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Both training panels (OCR and segmentation) share TrainingHistory.
Show only the 3 most recent runs by default; render a Mehr/Weniger
anzeigen button when there are more.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
setCer() was called for recognition training but not for segmentation.
The OCR service now returns cer = 1 - accuracy for segtrain; persist it
so the admin panel can display Fehlerrate for both training types.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Pass OCR_TRAINING_TOKEN through to the backend container as
APP_OCR_TRAINING_TOKEN so RestClientOcrClient sends the X-Training-Token
header when calling /train and /segtrain.
- Raise mem_limit/memswap_limit from 8g to 12g to give segtrain headroom
on hosts with more available RAM.
- Uncomment OCR_TRAINING_TOKEN in .env.example — it is now required.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three issues fixed:
1. --resize both was removed in ketos 7; replaced with --resize union
which extends the model's class mapping to include training data classes.
2. ketos ignores -s when -i is present, so the 1800px blla model caused
7+ GB peak RAM and OOM-killed the host (no swap, 5 GB free).
Now checks the loaded model's input height: only uses the base model
when it was already fine-tuned at 800px; otherwise trains from scratch
at 800px (~200 MB peak). After the first run the trained 800px model
becomes the base for all subsequent fine-tuning runs.
3. segtrain now computes and returns cer = 1 - accuracy, matching the
recognition training path.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds ensure_blla_model.py which loads the blla segmentation model with
ketos on every container start. If the model is missing or in the legacy
PyTorch ZIP format (incompatible with ketos 7), it re-downloads the
correct CoreML protobuf model from Zenodo (DOI 10.5281/zenodo.14602569).
The Dockerfile now uses entrypoint.sh which runs this check before
starting uvicorn.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add tabindex="0" so the SVG can receive DOM focus
- Auto-focus the SVG on mount so arrow keys work immediately after
clicking an annotation to select it
- Show preview rect during keyboard nudging (not just pointer drag) by
checking hasLiveChanges instead of only checking dragState
- Suppress default browser focus outline (outline: none) on the SVG
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extends the 4-corner L-bracket handles with 4 tick-mark edge handles
(short lines along each edge), enabling single-axis resize from any edge.
Updates applyHandleDrag to route each handle to the correct axis.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- 4 corner-only handles (nw/ne/sw/se), no edge midpoints
- Each handle renders as two short perpendicular lines meeting at the corner
(10px arms, navy, square linecap) — no fill, no box
- Thin dashed selection border added to SVG overlay to signal edit mode
- Simplify applyHandleDrag to remove dead n/s/e/w branches
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ResizeObserver binds actual SVG pixel dimensions; viewBox matches them so
16px handle squares and 44px hit areas are physically correct regardless of
the annotation's aspect ratio.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Commit 5afdc37 changed the empty state from transcription_empty_cta
('Markiere einen Bereich…') to transcription_empty_draw_hint
('Zeichnen Sie Bereiche…') but left the spec asserting the old text.
Updated the locator to match the current component output.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Removed duplicate import of org.mockito.ArgumentMatchers.eq from
DocumentControllerTest (lines 32+35). Added @ApiResponse(responseCode="204")
to patchTrainingLabel so the generated OpenAPI spec matches the actual
NoContent response the controller returns.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
WCAG 1.4.1 (Use of Color) requires non-color redundant cues for status.
The unicode ✓/✗ characters had inconsistent screen-reader support.
Replaced with explicit aria-hidden SVG icons (checkmark / x-circle)
alongside the translated status text labels.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
TranscriptionEditView rendered 'Kurrent-Erkennung' and 'Segmentierung'
as hardcoded German strings, breaking the en/es locales. Added
training_chip_kurrent and training_chip_segmentation keys to all three
message files and wired them up via m.training_chip_kurrent() /
m.training_chip_segmentation().
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
_check_training_token previously skipped auth when TRAINING_TOKEN was
empty, allowing unauthenticated requests to reach /train and /segtrain.
Now returns 503 ("Training not configured on this node") when the token
is absent, so missing configuration fails closed rather than open.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OcrTrainingService.triggerTraining() and triggerSegTraining() held a DB
connection open for the entire ketos training run (potentially minutes),
risking connection pool exhaustion. Replaced class-level @Transactional
with TransactionTemplate for narrow DB writes: guard+create and
result-record each run in their own short transaction; the HTTP call to
the OCR service runs between them with no open connection.
Also replaces blockRepository.findAll().size() with blockRepository.count()
in getTrainingInfo() to avoid loading every block into heap on each poll.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>