Fires the BEFORE UPDATE trigger for every documents row, which recomputes
the tsvector from all currently-linked metadata, blocks, receivers, and tags.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- V34 migration: adds search_vector tsvector column with GIN index
- BEFORE INSERT/UPDATE trigger on documents rebuilds vector from title (A),
summary + transcription_blocks.text (B), sender/receiver names (C),
tag names + location (D) using german FTS config
- AFTER triggers on transcription_blocks, document_receivers, document_tags
touch the parent document row to re-fire the BEFORE UPDATE trigger
- DocumentRepository.findRankedIdsByFts() native query using websearch_to_tsquery
- DocumentFtsTest: 12 integration tests covering stemming, trigger sync,
ranking, stop words, malformed input, receiver and tag search
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
After each training run, the Character Error Rate (CER = 1 - accuracy),
loss, accuracy, and epoch count are now stored on the OcrTrainingRun
record and shown in the training history table.
Also adds the missing POST /api/ocr/segtrain endpoint and the
triggerSegTraining service method so the segmentation training card
can actually trigger training.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add /segtrain endpoint to OCR service (ZIP upload, ketos.segtrain,
backup rotation, in-process model reload)
- Add segtrainModel() to OcrClient and RestClientOcrClient (10-min timeout,
X-Training-Token header)
- Add SegmentationTrainingExportService: PAGE XML export with polygon
de-normalization and per-page PNG rendering via PDFBox
- Add GET /api/ocr/segmentation-training-data/export endpoint
- Make TranscriptionBlock.text nullable for segmentation-only blocks
(V31 migration)
- Add Paraglide i18n translation keys for all training UI strings (de/en/es)
- Pass source prop from TranscriptionEditView to TranscriptionBlock
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Backend sends progress codes (PREPARING, LOADING, ANALYZING,
CREATING_BLOCKS:N, DONE:N, ERROR) via OcrJob.progressMessage.
Frontend translates them via Paraglide (de/en/es) and displays
below the spinner.
- V27 migration: adds progress_message column to ocr_jobs
- OcrAsyncRunner updates progress at each phase
- Poll interval reduced to 2s for snappier updates
Refs #226
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- BlockSource enum: MANUAL, OCR
- V26 migration adds source + reviewed columns to transcription_blocks
- OcrService sets source=OCR when creating blocks
- TranscriptionService.reviewBlock() toggles the reviewed flag
- PUT /api/documents/{id}/transcription-blocks/{blockId}/review endpoint
- 5 new tests: reviewBlock toggle/untoggle/notfound, controller,
OcrService source=OCR verification
The reviewed flag enables the Kraken fine-tuning pipeline: only blocks
marked as reviewed by a human are exported as training data.
Refs #226
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add title VARCHAR(50) column
- Add person_type VARCHAR(20) NOT NULL DEFAULT 'PERSON' with CHECK
constraint (PERSON, INSTITUTION, GROUP, UNKNOWN — SKIP excluded)
- Drop NOT NULL on first_name for non-person entities
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Creates the alias table for historical name changes (marriage,
widowhood, etc.) and adds GIN trigram indexes on both the new
alias table and the existing persons table for substring search.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
V18: text column now has CHECK (length(text) <= 10000) to enforce
the 10,000 character limit at the database level, complementing
the application-level enforcement in TranscriptionService.sanitizeText().
Fixes @Nora: "DB constraint catches anything the application misses"
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fixes from PR #178 review:
Migration fixes:
- V18/V19: fix FK references from app_users to users (correct table name)
- V18: change annotation_id FK from ON DELETE CASCADE to ON DELETE RESTRICT
(block is aggregate root, cascade flows from block, not annotation)
Backend fixes:
- TranscriptionService.deleteBlock(): remove userId param, delete block first
then annotation directly via repository (no ownership check — block owns annotation)
- TranscriptionService.sanitizeText(): remove flawed regex HTML stripping,
textarea content is plain text by design — just enforce max length
- TranscriptionBlockController.requireUserId(): throw DomainException.unauthorized()
instead of silently returning null on auth failure
- CreateTranscriptionBlockDTO: add @Min/@Positive validation on coordinates
- Add @Slf4j logging to TranscriptionService for create/delete operations
Frontend fixes:
- Delete DocumentBottomPanel.svelte entirely (issue #175 requirement)
- Remove redundant mode exclusivity $effect (handled at toggle call sites)
- Remove dead handleCommentClick + onCommentClick prop (comments are future work)
- Remove quote hint UI (depends on comment feature)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
V18: transcription_blocks table with optimistic locking version column
V19: transcription_block_versions for edit history capture
V20: add block_id FK to document_comments for block-level threads
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
spring.jpa.open-in-view=true (the default) holds a DB connection open for
the entire HTTP request lifecycle. Under concurrent dashboard API calls
(Promise.allSettled fires 3 at once), the pool of 10 is exhausted and the
backend crashes with connection timeout errors.
Setting open-in-view=false releases connections as soon as each
@Transactional method completes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove @RequirePermission(READ_ALL) from NotificationController class level so
authenticated users with any permission (or none) can access their own notifications
- Add V19 migration, annotationId field to Notification entity and NotificationDTO
- NotificationService now stores annotationId from comment on both REPLY and MENTION
- Update controller tests: permission tests now expect 200, DTO constructor includes annotationId
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
BLOCKERs:
- Remove direct AppUserRepository/CommentRepository access from CommentService and
NotificationService — replaced with UserService.findAllById() and UserService
(fixes layering contract from CLAUDE.md)
- Switch Optional<JavaMailSender> constructor injection — removes @Autowired(required=false)
field and ReflectionTestUtils hack in tests
- Add @RequirePermission(READ_ALL) to UserSearchController — prevents user enumeration
without read access
Data bug:
- Promote actorName from @Transient to persisted VARCHAR column (V18 migration)
- Set actorName in notifyReply and notifyMentions from comment.getAuthorName()
Architecture:
- Add @RequirePermission(READ_ALL) to NotificationController
- Introduce NotificationDTO — controller returns DTO instead of Notification entity,
eliminating lazy-load N+1 and AppUser field leakage
- Change mentions FetchType to EAGER — fixes LazyInitializationException outside transaction
- Add @Transactional(propagation=REQUIRES_NEW) to notifyReply/notifyMentions so a
notification failure cannot roll back the parent comment
- N+1 fix: replace per-ID findById loops with single findAllById bulk fetch
- Move collectParticipantIds to CommentService; notifyReply accepts Set<UUID> directly
Security:
- Escape displayName before injecting into renderBody HTML span
- Replace <a href="#"> with <span class="mention"> — no profile page to link to, and
the anchor's scroll-to-top behaviour is harmful
Tests added/fixed:
- markRead_throwsNotFound, markAllRead_delegatesToRepository, countUnread_delegatesToRepository
- markOneRead_returns401, @RequirePermission 403 coverage for both controllers
- postComment/replyToComment_triggersNotifyMentions_whenMentionedUserIdsProvided
- search_returnsAtMostTenResults now asserts $.length() <= 10
- XSS regression test for escaped displayName in mention.spec.ts
Frontend minors:
- relativeTime() uses Intl.RelativeTimeFormat (locale-aware, not German-hardcoded)
- aria-label uses m.notification_unread() Paraglide key (de/en/es added)
- <div role="button"> replaced with <button> (native Enter+Space handling)
- onDestroy clears debounceTimer in MentionEditor
- setTimeout(100) replaced with await tick() + requestAnimationFrame in CommentThread
- Notification prefs form uses checkbox name attributes + formData.has() pattern
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a metadata_complete column (default true for existing rows) to drive
the enrichment queue. New drop-zone uploads always start as false; createDocument
uses an explicit DTO flag or a heuristic (any of date/sender/receivers present →
true); the mass importer applies the same heuristic per row.
New endpoints: GET /api/documents/incomplete-count, /incomplete, /incomplete/next.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add V14 migration: ON DELETE CASCADE for document_tags and document_receivers
so deleting a document removes its join-table rows automatically
- Rename default form action to 'update' in the edit page — SvelteKit forbids
mixing a default action with named actions (was causing 500 on delete)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Flyway V13: add file_hash column to documents and document_annotations
- FileService.uploadFile() now returns UploadResult(s3Key, fileHash) with SHA-256 hash computed from raw bytes
- Document and DocumentAnnotation models gain a fileHash field
- DocumentService propagates the hash at all three upload sites (storeDocument, createDocument, updateDocument)
- AnnotationService.createAnnotation() accepts and persists a fileHash
- AnnotationController resolves the document's hash and passes it through
Closes#55
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers existing deployments where the Administrators group was created
before DataInitializer started including ANNOTATE_ALL.
Refs #40
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Creates the document_versions table (V9) with JSONB snapshot and
changed_fields columns. DocumentVersionService records a version on
every create/update, resolves the editor name from the security context,
and computes changedFields by diffing against the previous snapshot.
Refs #38
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add PasswordResetToken entity, repository (Flyway V8 migration)
- PasswordResetService: token generation, validation, nightly cleanup
- AuthController: POST /api/auth/forgot-password and /api/auth/reset-password (both permitAll)
- AuthE2EController (@Profile("e2e")): GET /api/auth/reset-token-for-test for CI testing
- spring-boot-starter-mail dependency; JavaMailSender optional (@Autowired required=false)
- mail health indicator disabled; mail config via MAIL_HOST/PORT/USERNAME/PASSWORD env vars
- 5 unit tests written TDD-style (all pass)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add firstName, lastName, birthDate, contact to AppUser via V7 migration.
Add PUT /api/users/me and POST /api/users/me/password endpoints.
Add GET /api/users/{id} for public profile lookup.
Add EMAIL_ALREADY_IN_USE and WRONG_CURRENT_PASSWORD error codes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Avoids Flyway errors when columns already exist in the DB due to
migration history mismatches from parallel feature branches.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Resolves merge conflicts with main (feat/person-notes merged first).
Combines both features: birth/death years and notes field on person detail.
Renames migration V5__add_birth_death_years to V6 to avoid Flyway conflict.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
V5 Flyway migration adds TEXT notes column; Person entity, service, and
controller updated to persist notes. Frontend edit form adds textarea and
view mode renders the notes section. Backed by 2 new service unit tests
(persist + blank clears).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
V5 Flyway migration adds birth_year and death_year INTEGER columns.
Service validates birthYear <= deathYear (400 otherwise). Frontend edit
form adds year number inputs; view mode renders * year / † year. Backed
by 3 backend service tests and 1 E2E test.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Spring Boot 4.0 Flyway auto-configuration is not triggering in the CI
environment — confirmed by empty DB and no flyway_schema_history table.
Replace YAML-based auto-config with an explicit @Bean that creates and
runs Flyway directly on startup, independent of any auto-configuration
conditions. Disable the auto-config via spring.flyway.enabled=false to
prevent interference. Add @DependsOn("flyway") to DataInitializer to
enforce that CommandLineRunner beans are only registered after migrations.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adding explicit spring.flyway.* config (url/user/password) ensures Flyway
creates its own JDBC connection and runs migrations independently of the JPA
datasource initialization order in Spring Boot 4.0.
Fix DataInitializer creating a Document with title=null, which would hit the
NOT NULL constraint in the documents table once the admin user init succeeds.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rename app.import.excel.col.* → app.import.col.* and set correct
column indices for all fields in the ODS spreadsheet.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Maps cols 1 (Box) and 2 (Mappe) from the ODS to the Document entity.
These are physical archival location identifiers needed to locate
original documents in the physical archive.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously FileService fell back to extension-based MIME detection, causing
TIFF, HEIC, DOCX and other unlisted types to be served as octet-stream
(forced download instead of inline display).
- Add content_type column to documents (V3 migration)
- Store file.getContentType() in DocumentService on upload and file replace
- MassImportService uses Files.probeContentType() for local files
- DocumentController prefers doc.getContentType() over S3-reported type
- FileService: remove extension-based fallback (no longer needed)
- DocumentService: replace leftover ResponseStatusException with DomainException
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Spring Session was pulled in as a dependency but never used — auth is
stateless HTTP Basic, so sessions are never written. Removed:
- spring-boot-starter-session-jdbc (and test variant) from pom.xml
- spring_session and spring_session_attributes tables/indexes/constraints
from V1__initial_schema.sql
Added V2 migration to drop the tables on existing databases that already
ran V1.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace application.properties with application.yaml (base/prod config)
and application-dev.yaml (dev overrides: show-sql=true)
- Add Maven 'dev' profile (activeByDefault) and 'prod' profile to pom.xml;
spring-boot:run picks up the active Spring profile automatically
- Guard DataInitializer.initData with @Profile("dev") so test data is
never seeded in production
Local dev: ./mvnw spring-boot:run (dev profile active by default)
Production: SPRING_PROFILES_ACTIVE env var controls the Spring profile;
Maven profiles are irrelevant for the packaged JAR.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>