Commit Graph

305 Commits

Author SHA1 Message Date
Marcel
96f8bfd822 feat(backend): add POST /api/documents/{id}/file endpoint to attach file to existing document
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 23:36:31 +02:00
Marcel
16bcd0f73c fix(ocr): replace IllegalStateException with DomainException in triggerSenderTraining
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m33s
CI / OCR Service Tests (push) Successful in 36s
CI / Backend Unit Tests (push) Failing after 2m46s
CI / Unit & Component Tests (pull_request) Failing after 2m37s
CI / OCR Service Tests (pull_request) Successful in 36s
CI / Backend Unit Tests (pull_request) Failing after 2m50s
Consistent with triggerManualSenderTraining — both defensive paths now use
DomainException.internal(OCR_TRAINING_CONFLICT) when the expected RUNNING row
is not found after creation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 09:55:22 +02:00
Marcel
2466553216 fix(ocr): replace IllegalStateException with DomainException.internal in triggerManualSenderTraining
Ensures the unexpected-state path produces a structured JSON error response
instead of an unmapped 500 RuntimeException. Adds OCR_TRAINING_CONFLICT
ErrorCode and mirrors it in the frontend errors.ts. Adds coverage tests for
getAllSenderModels() and runSenderTraining().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 09:46:00 +02:00
Marcel
269894a47a refactor(ocr): move async training dispatch out of controller into SenderModelService
Controller was deciding when to fire runSenderTraining based on the returned run
status — a business rule that belongs in the service. Introduces @Lazy self-reference
to preserve @Async proxy dispatch without self-invocation bypassing Spring AOP.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 09:18:43 +02:00
Marcel
b879d28761 fix(ocr): validate personId in TriggerSenderTrainingDTO — returns 400 not 500 on null
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 08:49:17 +02:00
Marcel
99e7176eac test(ocr): add service-level tests for triggerManualSenderTraining
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 00:17:54 +02:00
Marcel
c3fa09d12e feat(ocr): add POST /api/ocr/train-sender endpoint for manual sender training
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 00:16:02 +02:00
Marcel
178afcd496 feat(ocr): add per-model history endpoints via path segments
Add findByPersonIdIsNullOrderByCreatedAtDesc + findByPersonIdOrderByCreatedAtDesc to
OcrTrainingRunRepository. Add dto/TrainingHistoryResponse. Expose
GET /api/ocr/training-info/global and GET /api/ocr/training-info/{personId} on
OcrController, both requiring ADMIN; getSenderTrainingHistory guards person existence
via PersonService and returns 404 for unknown personId.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 00:08:38 +02:00
Marcel
b1b7418404 feat(ocr): promote TrainingInfoResponse to dto, add senderModels field
Move TrainingInfoResponse from private nested record to dto/TrainingInfoResponse.java,
add senderModels field, inject SenderModelService into OcrTrainingService so personNames
covers all known senders rather than only recent-run participants.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 00:04:29 +02:00
Marcel
a52c8bf079 test(db): verify V42 partial unique index for QUEUED training runs per person
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m32s
CI / OCR Service Tests (push) Successful in 31s
CI / Backend Unit Tests (push) Failing after 2m41s
CI / Unit & Component Tests (pull_request) Failing after 2m33s
CI / OCR Service Tests (pull_request) Successful in 37s
CI / Backend Unit Tests (pull_request) Failing after 2m49s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 21:54:34 +02:00
Marcel
bbfd234746 refactor(ocr): use stream .toList() instead of FQCN Collectors.toList()
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 21:47:36 +02:00
Marcel
92f3c04d54 fix(ocr): add partial unique index and align SenderModelServiceTest with suite style
Add V42 partial unique index on ocr_training_runs(person_id) WHERE status='QUEUED'
to enforce the per-person queued coalescing guarantee at the DB level. Also adds
@ExtendWith(MockitoExtension.class) to SenderModelServiceTest for consistency with
the rest of the service test suite, with lenient() on the shared txTemplate stub.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 21:25:18 +02:00
Marcel
0d5f3f38d0 perf(ocr): resolve person names in single batch query in getTrainingInfo
Replace the per-run getById loop with a single getAllById call on distinct
person IDs, eliminating the N+1 query when training history contains multiple
sender model runs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 21:21:12 +02:00
Marcel
4aa477555d refactor(ocr): return TrainingInfoResponse directly from getTrainingInfo endpoint
Remove the intermediate Map<String,Object> and return the typed record directly
so OpenAPI codegen produces a concrete TypeScript type. Fixes lastRun serializing
as {} (empty object) instead of null when no training run exists.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 21:18:27 +02:00
Marcel
e16dcdb7dc docs(ocr): document tail-recursive queue drain design in promoteNextQueuedRun
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m36s
CI / OCR Service Tests (push) Successful in 34s
CI / Backend Unit Tests (push) Failing after 2m43s
CI / Unit & Component Tests (pull_request) Failing after 2m38s
CI / OCR Service Tests (pull_request) Successful in 35s
CI / Backend Unit Tests (pull_request) Failing after 2m43s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 20:54:53 +02:00
Marcel
f76a9cce1f test(ocr): add failure path and DONE status assertions to SenderModelServiceTest
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 20:38:43 +02:00
Marcel
e2081b57e7 refactor(ocr): extract exportSenderData helper in triggerSenderTraining
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m36s
CI / OCR Service Tests (push) Successful in 37s
CI / Backend Unit Tests (push) Failing after 2m51s
CI / Unit & Component Tests (pull_request) Failing after 2m42s
CI / OCR Service Tests (pull_request) Successful in 35s
CI / Backend Unit Tests (pull_request) Failing after 2m54s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 20:24:38 +02:00
Marcel
2459408930 refactor(ocr): move person-name enrichment from OcrController into OcrTrainingService
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 20:18:21 +02:00
Marcel
09f4601d15 test(ocr): verify triggerSenderTraining upserts SenderModel with correct path and cer
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 20:13:21 +02:00
Marcel
1b34a36a77 fix(ocr): eliminate race window in runOrQueueSenderTraining by creating RUNNING row atomically
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 20:11:56 +02:00
Marcel
8d041a377d fix(ocr): correct trainSenderModel URI from /train to /train-sender
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 20:08:18 +02:00
Marcel
18cf839fac feat(ocr): wire SenderModelService into OcrAsyncRunner; stage missing foundational files
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m21s
CI / OCR Service Tests (push) Successful in 29s
CI / Backend Unit Tests (push) Failing after 2m38s
CI / Unit & Component Tests (pull_request) Failing after 2m26s
CI / OCR Service Tests (pull_request) Successful in 31s
CI / Backend Unit Tests (pull_request) Failing after 2m44s
OcrAsyncRunner now passes the per-sender model path to streamBlocks for
HANDWRITING_KURRENT documents. processDocument replaced extractBlocks
with streamBlocks + AtomicReference, removing the unchecked raw-array
pattern.

Also stages all previously uncommitted foundational files for this
feature: SenderModel entity, SenderModelRepository, Flyway migrations
V40/V41, updated OcrClient/RestClientOcrClient streaming API,
TrainingDataExportService.exportForSender, TranscriptionService Kurrent
hook, application.yaml OCR config, and frontend i18n/test additions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 19:27:02 +02:00
Marcel
386dc83958 refactor(ocr): move sender training methods from OcrTrainingService to SenderModelService
Eliminates cross-domain repository access: OcrTrainingService no longer
holds SenderModelRepository. SenderModelService now owns the full sender
training lifecycle (runOrQueueSenderTraining, triggerSenderTraining,
promoteNextQueuedRun), removing the circular dependency risk.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 19:08:10 +02:00
Marcel
60c1ec7b5f refactor(ocr): delete buildTrainingInfoMap() dead code
The controller now builds the map inline (with personNames support).
This method had zero callers.

Fixes reviewer concerns from @felixbrandt and @mkeller.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 18:52:51 +02:00
Marcel
10a4a4d94b fix(ocr): log debug instead of silently swallowing person name resolution errors
Replaces catch(Exception ignored){} with log.debug() in getTrainingInfo().
Adds controller test documenting the graceful degradation behavior
(response stays 200 when personService.getById() throws).

Fixes reviewer concerns from @felixbrandt and @nullx.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 18:51:15 +02:00
Marcel
7a342a07cf test: add unit tests for SenderModelService, runOrQueueSenderTraining, and updateBlock hook
- SenderModelServiceTest: 6 tests covering activation threshold (99/100),
  retrain delta (149/150), runNow flag (queued vs triggered)
- OcrTrainingServiceTest: 3 tests for runOrQueueSenderTraining — idle returns
  true, running saves QUEUED, duplicate QUEUED coalesces
- TranscriptionServiceTest: 3 tests for updateBlock — sets source=MANUAL,
  triggers training for HANDWRITING_KURRENT with sender, skips when no sender

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 18:00:59 +02:00
Marcel
bd23a76330 test: fix broken tests after per-sender model integration
- OcrAsyncRunnerTest: switch from extractBlocks/4-arg streamBlocks stubs
  to 5-arg streamBlocks (senderModelPath param) via doAnswer
- TranscriptionServiceTest: stub documentService.getDocumentById in
  updateBlock tests so the new Kurrent training hook does not NPE
- OcrControllerTest: add @MockitoBean PersonService (now injected into
  OcrController for personNames assembly in getTrainingInfo)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 17:56:51 +02:00
Marcel
ba36a88b65 feat(ocr): add Preprocessing NDJSON event to Java stream pipeline
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 14:21:00 +02:00
Marcel
d075bf390a feat(tag-search): expand children and surface ancestor path in search results
Modifies TagService.search() to enrich name-matches with tree relatives:
root matches expand descendants, child matches prepend ancestors.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 11:27:41 +02:00
Marcel
4ec4062274 refactor(#248): simplify TagService.buildTree() to single-pass LinkedHashMap approach
Some checks failed
CI / Unit & Component Tests (pull_request) Failing after 3m12s
CI / Backend Unit Tests (pull_request) Failing after 2m57s
CI / Unit & Component Tests (push) Failing after 2m41s
CI / Backend Unit Tests (push) Failing after 2m45s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 07:45:40 +02:00
Marcel
e6497ebff4 fix(#248): add @Schema(REQUIRED) to TagTreeNodeDTO, improve mergeTags log, add comments
Some checks failed
CI / Unit & Component Tests (pull_request) Failing after 2m42s
CI / Backend Unit Tests (pull_request) Failing after 2m44s
CI / Unit & Component Tests (push) Failing after 2m35s
CI / Backend Unit Tests (push) Failing after 2m44s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 01:01:09 +02:00
Marcel
d7a46de1cc refactor(#248): address PR review concerns — TagOperator enum, typed projection, bean validation
- Replace stringly-typed "AND"/"OR" tagOperator with TagOperator enum (DocumentService, DocumentController)
- Replace Object[] with TagCount projection interface in TagRepository.findDocumentCountsPerTag()
- Use @NotNull + @Valid on MergeTagDTO.targetId; remove manual null check from TagController
- Correct ALLOWED_TAG_COLORS to match actual frontend CSS tokens (sage/sienna/amber/slate/violet/rose/cobalt/moss/sand/coral)
- Add TOCTOU comment to validateNoAncestorCycle() with mitigation explanation
- Add test: deleteWithDescendants_skipsDocTagDeletion_whenDescendantIdsIsEmpty
- Update TagServiceTest to use mock TagRepository.TagCount projection

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 00:24:04 +02:00
Marcel
a669f6368d feat(#248): expose parentId in TagTreeNodeDTO OpenAPI schema and regenerate TypeScript types
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 22:33:12 +02:00
Marcel
5e5c249aba feat(#248): add POST /api/tags/{id}/merge and DELETE /api/tags/{id}/subtree endpoints
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 22:27:41 +02:00
Marcel
609d242f5d feat(#248): enrich TagTreeNodeDTO with parentId and populate documentCount via single aggregate query
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 22:24:50 +02:00
Marcel
c03c391879 test(#248): add deleteWithDescendants test coverage to TagServiceTest
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 22:20:19 +02:00
Marcel
f921284db6 feat(#248): add TagService.mergeTags() with validateNotSelf/validateNotDescendant/transferDocuments helpers
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 22:18:41 +02:00
Marcel
b9b572436a feat(#248): add merge/delete/count native queries to TagRepository
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 22:15:14 +02:00
Marcel
a05d9c22ae fix(#248): TagService.getById() throws DomainException(TAG_NOT_FOUND) instead of ResponseStatusException
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 22:13:45 +02:00
Marcel
de7c48117b feat(#248): add TAG_NOT_FOUND, TAG_MERGE_SELF, TAG_MERGE_INVALID_TARGET error codes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 22:10:52 +02:00
Marcel
06fd5ae2da fix(#221): resolve inherited color on child tags in document responses
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m51s
CI / Backend Unit Tests (push) Failing after 2m46s
Colors are stored only on root-level tags. DocumentService now calls
TagService.resolveEffectiveColors() before returning search results and
single-document responses, so child tags carry their parent's color when
serialised to JSON. Parent tags are batch-loaded in a single query.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 19:28:21 +02:00
Marcel
e8e54cc282 feat(#221): change TagInput binding to Tag[], add color dots and hierarchy grouping
Backend:
- TagRepository: add findDescendantIdsByName() recursive CTE query
- TagService: add expandTagNamesToDescendantIdSets() for document search

Frontend:
- TagInput: accept Tag[] (id, name, color, parentId) instead of string[]
- Chips show color dot via var(--c-tag-{color}) when tag has color
- Suggestions grouped hierarchically: children indented under their parents
- Update DescriptionSection, edit/new pages, SearchFilterBar, +page.svelte

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 16:11:38 +02:00
Marcel
57dc72b51d feat(#221): add AND/OR tag filtering with hierarchy expansion in document search
- Replace hasTags(List<String>) spec with hasTags(List<Set<UUID>>, useOr)
- AND mode: one EXISTS subquery per expanded tag ID set; empty set = disjunction
- OR mode: union of all expanded sets into a single EXISTS subquery
- DocumentService calls tagService.expandTagNamesToDescendantIdSets() before building spec
- DocumentController exposes ?tagOp=AND|OR query param (default AND)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 15:44:18 +02:00
Marcel
3fba740469 feat(#221): tag entity hierarchy fields, service, repository, controller
- Tag entity: add parentId (UUID FK) and color (String) fields
- TagUpdateDTO and TagTreeNodeDTO records
- ErrorCode: INVALID_TAG_COLOR, TAG_CYCLE_DETECTED
- TagRepository: findAncestorIds() recursive CTE query
- TagService: cycle detection, color validation, getTagTree()
- TagController: use TagUpdateDTO, add GET /api/tags/tree endpoint

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 15:26:23 +02:00
Marcel
f9ac963b9f feat(#221): add V39 migration for tag hierarchy and colors
Adds parent_id FK (ON DELETE SET NULL), self-reference check constraint,
parent_id index, and nullable color column to the tag table.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 15:15:17 +02:00
Marcel
da5c92fe39 fix(#240): remove readyCount from weekly stats DTO and SQL query
Some checks failed
CI / Unit & Component Tests (push) Failing after 2m26s
CI / Backend Unit Tests (push) Failing after 2m46s
CI / Unit & Component Tests (pull_request) Failing after 2m32s
CI / Backend Unit Tests (pull_request) Failing after 2m30s
The Lesefertig pulse was removed from the UI; drop the backend support
for it too — removes the subquery from findWeeklyStats(), the projection
getter, the DTO field, and updates all affected tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 13:19:53 +02:00
Marcel
23410aa4b8 fix(#240): rename V37→V38 (V37 was already applied); regenerate api.ts
The original needsExpert V37 migration was applied to the dev DB before
the feature was removed. Renaming our new indexes migration to V38 avoids
the Flyway checksum conflict. Regenerated api.ts now reflects the
@Schema(requiredMode=REQUIRED) annotations — DTO fields are non-optional.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 12:23:14 +02:00
Marcel
e041c75793 test(#240): add Testcontainers integration tests for native SQL queue queries
6 new tests covering findSegmentationQueue (excludes PLACEHOLDER, excludes
annotated docs), findTranscriptionQueue (below-90%-reviewed docs, zero-block
case), findReadyToReadQueue (>=90% reviewed), and findWeeklyStats (zeros on
empty DB). Runs against real PostgreSQL 16 via Testcontainers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 12:15:21 +02:00
Marcel
adea7d498f fix(#240): add @Schema(requiredMode=REQUIRED) to both queue DTOs; add V37 indexes
All non-null DTO fields are now marked required so the generated api.ts
emits required (non-optional) types for callers. V37 migration adds
created_at/updated_at indexes on document_annotations and transcription_blocks
to avoid full table scans in the weekly stats correlated subqueries.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 12:09:09 +02:00
Marcel
4cf01a0f1d test(#240): add TranscriptionQueueControllerTest
Verifies 401/403/200 responses for all four endpoints. Matches
the @WebMvcTest + @RequirePermission pattern used across the project.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 12:07:14 +02:00