kraken 5.2.9 required torch~=2.1.0, which is incompatible with
surya-ocr's torch>=2.3.0. kraken 6.0.3 requires torch>=2.4.0,<=2.9,
a range that satisfies both surya-ocr and our pinned torch==2.5.1.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
surya-ocr 0.6.3 requires pillow<11.0.0,>=10.2.0. The previous
pin at 11.1.0 caused a dependency resolution failure during
Docker build.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace @Convert(PolygonConverter) with Hibernate native @JdbcTypeCode(SqlTypes.JSON)
to fix a JDBC type mismatch: PostgreSQL requires the jsonb type, not varchar.
The PolygonConverter is retained as a standalone utility but no longer
used on the entity. Hibernate 6 natively handles List<List<Double>>
serialization to JSONB.
Refs #227
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- AnnotationShape.svelte: renders a single annotation as either a
rectangle or a polygon-clipped div (via CSS clip-path: polygon())
- AnnotationLayer.svelte: refactored to delegate rendering to
AnnotationShape, keeping draw logic and hover state management
- Annotation type: added optional polygon field ([number, number][] | null)
- Polygon coordinates are converted from page-normalized to
bounding-box-relative percentages for clip-path
All 687 existing frontend tests pass.
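
The coordinate conversion described above can be sketched as follows.
This is an illustrative Python version, not the Svelte/TypeScript code
in the component. The function name, the (x, y, width, height) box
layout and the two-decimal formatting are assumptions.

```python
def polygon_to_clip_path(polygon, box):
    """Build a CSS clip-path value from page-normalized polygon points.

    polygon: [(x, y), ...] with coordinates in 0..1 relative to the page.
    box: (x, y, width, height) of the bounding box, also page-normalized.
    Both shapes are assumptions made for this sketch.
    """
    box_x, box_y, box_w, box_h = box
    points = []
    for px, py in polygon:
        # Shift into the box's local frame, then scale to a percentage
        rel_x = (px - box_x) / box_w * 100
        rel_y = (py - box_y) / box_h * 100
        points.append(f"{rel_x:.2f}% {rel_y:.2f}%")
    return "polygon(" + ", ".join(points) + ")"


clip = polygon_to_clip_path(
    [(0.25, 0.25), (0.75, 0.25), (0.5, 0.75)],
    (0.25, 0.25, 0.5, 0.5),
)
```

The percentages are relative to the clipped element, so the same value
stays correct when the page is zoomed or resized.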
Refs #227
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Python microservice (ocr-service/):
- FastAPI app with /ocr and /health endpoints
- Surya engine: transformer-based OCR for typewritten/modern handwriting
- Kraken engine: historical HTR for Kurrent/Suetterlin with
pure-Python polygon-to-quad approximation (gift wrapping + rotating calipers)
- Eager model loading at startup via lifespan context manager
- PDF download via httpx, page rendering via pypdfium2 at 300 DPI
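
A rough sketch of the polygon-to-quad idea, not the service's actual
code: gift wrapping builds the convex hull, then the smallest enclosing
rectangle is searched among rectangles aligned with a hull edge. For
brevity this sketch scans every edge directly (quadratic in hull size)
instead of advancing rotating calipers; both rely on the minimum-area
rectangle sharing a side with the hull. All names are hypothetical.

```python
import math


def cross(o, a, b):
    """Z component of (a - o) x (b - o); negative means b is clockwise of a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Gift wrapping (Jarvis march), starting at the leftmost point."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    hull = []
    current = pts[0]
    while True:
        hull.append(current)
        candidate = pts[0] if pts[0] != current else pts[1]
        for p in pts:
            if p == current:
                continue
            turn = cross(current, candidate, p)
            farther = math.dist(current, p) > math.dist(current, candidate)
            # Prefer the most clockwise point; on ties keep the farthest one
            if turn < 0 or (turn == 0 and farther):
                candidate = p
        current = candidate
        if current == pts[0]:
            return hull


def min_area_quad(points):
    """Four corners of the smallest rectangle aligned with some hull edge."""
    hull = convex_hull(points)
    if len(hull) < 3:
        raise ValueError("need at least three non-collinear points")
    best_area, best_quad = math.inf, None
    for i in range(len(hull)):
        (x1, y1), (x2, y2) = hull[i], hull[(i + 1) % len(hull)]
        length = math.hypot(x2 - x1, y2 - y1)
        ux, uy = (x2 - x1) / length, (y2 - y1) / length  # unit vector along edge
        vx, vy = -uy, ux                                 # unit normal to the edge
        us = [px * ux + py * uy for px, py in hull]
        vs = [px * vx + py * vy for px, py in hull]
        area = (max(us) - min(us)) * (max(vs) - min(vs))
        if area < best_area:
            best_area = area
            corners = ((min(us), min(vs)), (max(us), min(vs)),
                       (max(us), max(vs)), (min(us), max(vs)))
            # Map the corners back from edge coordinates to x/y
            best_quad = [(u * ux + v * vx, u * uy + v * vy) for u, v in corners]
    return best_quad


sample = [(0, 0), (2, 0), (2, 1), (0, 1), (1, 0.5)]
quad = min_area_quad(sample)
```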
Java RestClientOcrClient:
- Implements OcrClient + OcrHealthClient interfaces
- Calls Python service via Spring RestClient
- Health check with graceful fallback
Docker Compose:
- New ocr-service container (mem_limit 6g, no host ports)
- Health check with start_period 60s for model loading
- ocr_models volume for Kraken model files
- Backend depends on ocr-service health
Refs #226, #227
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OCR creates many adjacent text line annotations that would fail the
existing overlap check. createOcrAnnotation() accepts an optional
polygon and bypasses overlap detection entirely.
Refs #227
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ADR-001 documents the decision to use a separate Python container for
OCR (Surya + Kraken), the interface contract, and why alternatives
like Tess4J were rejected.
ADR-002 documents the decision to store polygon annotations as JSONB
with a 4-point CHECK constraint, backed by an AttributeConverter.
Refs #226, #227
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- CommentThread: add missing empty-state paragraph using comment_empty_hint
i18n key (key existed but was never rendered in the template)
- TranscriptionBlock: add selectedQuote hint using transcription_block_quote_hint
i18n key (key existed but was never rendered); fix test to use native DOM
el.focus()/setSelectionRange()/dispatchEvent instead of locator.selectText()
which is not available in this vitest-browser version
- TranscriptionEditView: fix test to use native el.dispatchEvent(FocusEvent)
instead of locator.blur() which is not available
- Conversations: fix test expected text from stale "Korrespondenz durchsuchen"
to match current conv_empty_heading() = "Wessen Briefe möchten Sie lesen?"
All 687 tests now pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
resolveRef is never read reactively — it is only read synchronously
inside settle(). Using $state was misleading about the intent.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add m-auto and w-full to ensure the native <dialog> is centred
- Add backdrop:bg-black/50 for dimmed overlay when modal is open
- Add hover:bg-danger/80 and hover:bg-primary/80 on confirm button
- Add cursor-pointer to both cancel and confirm buttons
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
provideConfirmService() sets up context for the entire component tree.
ConfirmDialog is mounted once at the bottom of the layout shell.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- confirm.svelte.ts: context-based async service returning Promise<boolean>
- ConfirmDialog.svelte: native <dialog> element, reads service from context
- Concurrent calls return false immediately (guard at top of confirm())
- SSR-safe: confirm() returns Promise.resolve(false) on server
- getConfirmService() throws descriptive error outside provider tree
- 5 Vitest tests: confirm/cancel/Escape/concurrent/outside-provider all green
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Light: #c0392b (5.1:1 on white — WCAG AA), dark: #e55347 (4.7:1 on surface).
Exposed as bg-danger/text-danger-fg Tailwind utilities via @theme inline.
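
Contrast ratios like the ones quoted above can be re-checked with the
WCAG 2.x relative-luminance formula. A minimal sketch, assuming pure
white (#ffffff) as the light background; the project's actual surface
tokens may differ, and the helper names are made up for illustration.

```python
def _linear(channel):
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground, background):
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


# WCAG AA for normal-size text requires a ratio of at least 4.5:1
passes_aa = contrast_ratio("#c0392b", "#ffffff") >= 4.5
```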
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers segmented type control, title input, conditional field
visibility, PersonCard title display, mobile layout, and a11y.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add "amt" and "schule" suffixes to INSTITUTION_END in PersonTypeClassifier
so German government offices and schools are auto-classified on import.
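
A minimal Python sketch of suffix-based matching, for illustration only.
The real classifier is Java and its matching rules are not shown here;
the case-insensitive end-of-string check and the function name are
assumptions, and the tuple lists only the two suffixes named above.

```python
# Only the two suffixes added in this commit; the real list is longer
INSTITUTION_END = ("amt", "schule")


def ends_like_institution(raw_name):
    """True if the name ends with one of the institution suffixes."""
    return raw_name.strip().lower().endswith(INSTITUTION_END)
```

With this rule "Finanzamt" and "Volksschule" match, while a name such
as "Clara de Gruyter" does not.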
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add title to PersonUpdateDTO with @Size(max=50) constraint.
PersonService.createPerson and updatePerson now handle the title
field with blank-to-null normalization.
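
Blank-to-null normalization can be sketched like this in Python. The
service itself is Java; whether non-blank titles are also trimmed is an
assumption of this sketch, as is the function name.

```python
def normalize_title(title):
    """Map None, empty and whitespace-only titles to None."""
    if title is None:
        return None
    stripped = title.strip()  # trimming non-blank values is an assumption
    return stripped or None
```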
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Person list and detail page avatars now display a type-specific
icon (building, people group, question mark) instead of meaningless
initials for INSTITUTION, GROUP, and UNKNOWN person types.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace hardcoded Tailwind utility colors with project CSS variables
(--c-badge-institution-*, --c-badge-group-*, --c-badge-unknown-*).
Dark mode variants defined in both @media and manual toggle blocks.
Extract shared badge classes and use $derived config object.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Testcontainers test verifying: SKIP returns null with no DB record,
INSTITUTION/GROUP store full name in lastName with null firstName
and correct personType, PERSON splits name normally.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove Architekt from WORD_PREFIXES (classifier handles it)
- Use Objects.equals for null-safe firstName/lastName comparison
- Remove unused trimmed variable in PersonTypeClassifier
- Fix containsWord to loop through all occurrences (finds
"Eltern" in "Nachbareltern Eltern")
- Extract DisplayNameFormatter utility shared by Person and
PersonSummaryDTO to eliminate display logic duplication
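
The containsWord fix can be illustrated with a small Python sketch. It
assumes case-insensitive matching and treats any non-letter as a word
boundary; the Java implementation may define both differently.

```python
def contains_word(text, word):
    """True if `word` occurs in `text` as a whole word."""
    haystack, needle = text.lower(), word.lower()
    start = 0
    while True:
        index = haystack.find(needle, start)
        if index == -1:
            return False
        end = index + len(needle)
        before_ok = index == 0 or not haystack[index - 1].isalpha()
        after_ok = end == len(haystack) or not haystack[end].isalpha()
        if before_ok and after_ok:
            return True
        # The first hit was inside a longer word; keep scanning
        start = index + 1
```

Checking only the first occurrence would stop at the "eltern" inside
"Nachbareltern" and miss the standalone word that follows.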
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add @Nullable annotation to findOrCreateByAlias() return type.
Filter null results (from SKIP classification) in MassImportService
receiver list to prevent null elements in the receivers collection.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two-pass title stripping with loop for stacked titles:
- Dot-prefixes (Dr., Prof.) matched without trailing space
- Word-prefixes (Tante, Frau, Schwester, etc.) matched at
word boundary
- Stacked titles like "Prof. Dr. Muller" handled correctly
- Single token after title strip goes to lastName (not firstName)
Add 5 "von" last names to KNOWN_LAST_NAMES for correct splitting
of entries like "Freifrau von Massenbach".
Add 15 new test cases and update 3 existing tests for title behavior.
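
The stripping loop can be sketched in Python as below. The prefix tuples
contain only the examples named in this message, not the full lists,
and the function name and return shape are hypothetical.

```python
DOT_PREFIXES = ("Dr.", "Prof.")                  # matched without trailing space
WORD_PREFIXES = ("Tante", "Frau", "Schwester")   # matched at a word boundary


def strip_titles(raw):
    """Return (titles, remaining_name), removing stacked leading titles."""
    titles = []
    rest = raw.strip()
    stripped = True
    while stripped:  # repeat both passes until nothing more is removed
        stripped = False
        for prefix in DOT_PREFIXES:
            if rest.startswith(prefix):
                titles.append(prefix)
                rest = rest[len(prefix):].lstrip()
                stripped = True
        for prefix in WORD_PREFIXES:
            # Requiring a following space keeps "Frauke" from matching "Frau"
            if rest.startswith(prefix + " "):
                titles.append(prefix)
                rest = rest[len(prefix):].lstrip()
                stripped = True
    return titles, rest
```

For "Prof. Dr. Muller" the loop runs twice and leaves the single token
"Muller", which the caller then stores as lastName.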
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Show colored badge for non-PERSON types per design spec:
- INSTITUTION: blue with building icon
- GROUP: purple with people icon
- UNKNOWN: amber with question mark icon
- PERSON: no badge (unmarked default)
Badge appears on person cards in list and on detail page.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Classify raw name before processing. SKIP returns null (no Person
created). INSTITUTION/GROUP skip split() and store full name in
lastName with firstName=null and appropriate personType.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Move paren extraction in parseReceivers() after the multi-separator
check so single-person entries like "Clara de Gruyter(*1871)" keep
their parens intact for split()'s annotation extraction. Multi-person
entries like "Hedi und Tutu (Gruber)" still use parens as shared
last-name override.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extract trailing (...) content as annotation. Handles birth years
(*1871), nicknames (Tuttu), uncertainty markers (?), and uncertain
names (Quast ?) where the name part is extracted back into the
cleaned result. Uses [^)]* regex to prevent ReDoS.
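
A Python sketch of the extraction, for illustration. The exact pattern
and the handling of uncertain names are assumptions; in particular,
reducing "(Quast ?)" to the annotation "?" is a guess at the behavior.

```python
import re

# [^)]* matches only non-")" characters, so the group cannot run past
# the closing parenthesis and there is no nested quantifier to backtrack.
TRAILING_PARENS = re.compile(r"\(([^)]*)\)\s*$")


def extract_annotation(raw):
    """Return (cleaned_name, annotation_or_None)."""
    match = TRAILING_PARENS.search(raw)
    if match is None:
        return raw.strip(), None
    name = raw[:match.start()].strip()
    annotation = match.group(1).strip()
    # Uncertain name: move the name part back into the cleaned result
    if annotation.endswith("?") and annotation != "?":
        name = f"{name} {annotation[:-1].strip()}".strip()
        annotation = "?"
    return name, annotation
```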
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When split() returns a non-null maidenName, PersonService now
creates a PersonNameAlias with type MAIDEN_NAME. The maiden name
is stored as lastName on the alias (no firstName).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Verify comma-prefix, no-dot, and multi-word maiden name variants
are correctly stripped in parseReceivers().
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Widen pattern from `\s+geb\.\s+\S+` to `,?\s*geb\.?\s+(.+)$` to
handle: optional comma, optional dot, multi-word maiden names.
stripMaidenName() now captures the maiden name instead of discarding
it. Handles all 5 input variants from the ODS data.
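
The widened pattern behaves the same under Python's re module, which
makes it easy to try out. A small sketch; the function name mirrors
stripMaidenName() but the return shape and the sample names in the
usage lines are made up, not taken from the ODS data.

```python
import re

# Pattern from this commit: optional comma, optional dot after "geb",
# and a capture that runs to the end of the string.
MAIDEN_NAME = re.compile(r",?\s*geb\.?\s+(.+)$")


def strip_maiden_name(raw):
    """Return (name_without_maiden_part, maiden_name_or_None)."""
    match = MAIDEN_NAME.search(raw)
    if match is None:
        return raw.strip(), None
    return raw[:match.start()].strip(), match.group(1).strip()


with_comma = strip_maiden_name("Anna Schmidt, geb Weber")
multi_word = strip_maiden_name("Anna Schmidt geb. von der Heide")
```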
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add displayName and personType to all Person mock objects in
component and page tests. Update assertions from reversed
"lastName, firstName" format to forward-order displayName.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add translations for PersonType values (PERSON, INSTITUTION, GROUP,
UNKNOWN) and PersonNameAliasType.MAIDEN_NAME in de/en/es.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add displayName default method to PersonSummaryDTO
- Update native SQL queries to include title, person_type columns
- Add getInitials() utility to personFormat.ts
- Update abbreviateName/abbreviateCompact for nullable firstName
- Replace firstName+lastName concatenation with displayName in all
person-displaying components and server load files
- Regenerate API types with displayName on Person and PersonSummaryDTO
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Person type now includes displayName (readonly, required), title, and
personType (required enum). firstName is now optional.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add title VARCHAR(50) column
- Add person_type VARCHAR(20) NOT NULL DEFAULT 'PERSON' with CHECK
constraint (PERSON, INSTITUTION, GROUP, UNKNOWN — SKIP excluded)
- Drop NOT NULL on first_name for non-person entities
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>