familienarchiv

Author	SHA1	Message	Date
Marcel	dd078d50da	fix(ocr): extract PDF pages as PNGs before running kraken OCR Kraken's -f pdf mode tries to write output next to the input file, which fails on read-only mounts. Instead, extract pages as PNGs via pypdfium2 (already installed), then run kraken on each image. Both models run in a single container per PDF to avoid overhead. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 20:37:29 +02:00
Marcel	31519af1a4	fix(ocr): add pyvips for kraken PDF input support Kraken 7 requires pyvips (optional dep) for -f pdf mode. Added libvips42 system package and pyvips Python package. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 20:11:14 +02:00
Marcel	c0004f5e6f	fix(ocr): parse kraken 'Model dir' output to locate downloaded model The previous approach used find across the htrmopo cache which failed because -newer /tmp ran in a separate container. Now parses the 'Model dir: <path>' line from kraken get output directly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 20:09:23 +02:00
Marcel	f12b41161e	fix(ocr): update model script for kraken 7 DOI-based downloads Kraken 7 uses DOIs (not short names) to identify models from Zenodo. Updated to use actual DOIs: - 10.5281/zenodo.7933463 — German handwriting HTR - 10.5281/zenodo.13788177 — McCATMuS generic handwritten/printed/typed Added -f pdf flag for PDF input, volume mounts for import dir, and post-download copy from htrmopo cache to the models volume. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 20:05:29 +02:00
Marcel	37abc376ec	fix(ocr): install torchvision from CPU index alongside torch torchvision installed from PyPI expects CUDA torch operator registrations. Installing from the CPU whl index ensures torchvision matches the CPU-only torch build. Fixes 'torchvision::nms does not exist' RuntimeError on startup. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 19:46:37 +02:00
Marcel	0af4749677	feat(ocr): extend model script with automatic OCR evaluation Downloads both Kraken models, then runs each against 4 sample PDFs from the import folder (Eu-0693, Eu-0692, W-0150, W-0575). Output goes to ocr-model-evaluation/<model-name>/<doc>.txt for side-by-side comparison. Usage: ./scripts/download-kraken-models.sh # download + evaluate ./scripts/download-kraken-models.sh --eval-only # re-run evaluation ./scripts/download-kraken-models.sh --activate 1 # pick winner Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 19:41:59 +02:00
Marcel	6669fffead	fix(ocr): pin transformers<5.0 and torch==2.7.1 in requirements.txt transformers 5.x breaks surya 0.17.1 — SuryaDecoderConfig is missing pad_token_id. Pin to transformers>=4.56.1,<5.0.0. Also add torch==2.7.1 to requirements.txt to prevent pip from upgrading it past the CPU-only build installed in the Dockerfile layer. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 19:34:03 +02:00
Marcel	41f9262238	feat(ocr): add Kraken model download and evaluation script Runbook script to download both HTR-United Kurrent model candidates (german_kurrent_manu_9, kurrent-de) into the ocr_models Docker volume, test them against sample documents, and activate the winner. Usage: ./scripts/download-kraken-models.sh # download both ./scripts/download-kraken-models.sh --activate 1 # pick model 1 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 19:19:39 +02:00
Marcel	c74539b04b	feat(ocr): auto-insert [unleserlich] markers for low-confidence words New confidence.py module with two functions: - apply_confidence_markers(): replaces words below threshold with [unleserlich], collapses adjacent markers into one - words_from_characters(): reconstructs word-level confidence from Kraken's character-level data Surya 0.17 provides native word-level confidence via line.words. Kraken 7.0 provides per-character confidences via record.confidences. Both engines now pass word+confidence data through main.py, which applies the marker post-processing before returning the API response. Threshold configurable via OCR_CONFIDENCE_THRESHOLD env var (default 0.3). Frontend already renders [unleserlich] markers via transcriptionMarkers.ts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 19:16:17 +02:00
Marcel	49975154d9	feat(ocr): bump to latest surya 0.17.1, kraken 7.0, torch 2.7.1 - surya-ocr 0.6.3 → 0.17.1: new predictor API (FoundationPredictor, RecognitionPredictor, DetectionPredictor), native polygon output on text lines (4-point clockwise) - kraken 5.2.9 → 7.0: wider torch range (>=2.4,<=2.10), unpinned numpy - torch 2.5.1 → 2.7.1: satisfies surya's >=2.7.0 requirement - Rewrite engines/surya.py for the 0.17 predictor class API - Surya now outputs polygons natively — no longer rectangle-only Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 18:53:14 +02:00
Marcel	e29c865016	fix(ocr): upgrade kraken to 6.0.3 for torch>=2.4 compatibility kraken 5.2.9 required torch~=2.1.0, incompatible with surya-ocr's torch>=2.3.0. kraken 6.0.3 requires torch>=2.4.0,<=2.9 which overlaps with surya and our pinned torch==2.5.1. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 18:48:14 +02:00
Marcel	d49010cd7b	fix(ocr): relax pillow version to match surya-ocr constraint surya-ocr 0.6.3 requires pillow<11.0.0,>=10.2.0. The previous pin at 11.1.0 caused a dependency resolution failure during Docker build. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 18:40:46 +02:00
Marcel	931fbc28e5	fix(annotations): use @JdbcTypeCode(JSON) for polygon JSONB column Replace @Convert(PolygonConverter) with Hibernate native @JdbcTypeCode(SqlTypes.JSON) to fix JDBC type mismatch — PostgreSQL requires jsonb type, not varchar. The PolygonConverter is retained as a standalone utility but no longer used on the entity. Hibernate 6 natively handles List<List<Double>> serialization to JSONB. Refs #227 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 15:39:54 +02:00
Marcel	a4651aa317	feat(frontend): add OCR UI components and translations - ScriptTypeSelect: native select for TYPEWRITER/HANDWRITING_LATIN/KURRENT - OcrTrigger: wraps script type select + start button + confirmation dialog - OcrProgress: SSE-based progress display with page counter and progress bar - Paraglide translations for OCR (de/en/es): script types, trigger labels, confirmation dialog, progress messages, error messages - ErrorCode type + getErrorMessage: OCR_SERVICE_UNAVAILABLE, OCR_JOB_NOT_FOUND, OCR_DOCUMENT_NOT_UPLOADED, OCR_PROCESSING_FAILED All 687 frontend tests pass. Refs #226 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 15:36:00 +02:00
Marcel	cf8dc3559f	feat(frontend): extract AnnotationShape component with polygon support - AnnotationShape.svelte: renders a single annotation as either a rectangle or a polygon-clipped div (via CSS clip-path: polygon()) - AnnotationLayer.svelte: refactored to delegate rendering to AnnotationShape, keeping draw logic and hover state management - Annotation type: added optional polygon field ([number, number][] | null) - Polygon coordinates are converted from page-normalized to bounding-box-relative percentages for clip-path All 687 existing frontend tests pass. Refs #227 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 15:30:27 +02:00
Marcel	6737bd6db5	feat(ocr): add Python OCR microservice, RestClientOcrClient, Docker Compose Python microservice (ocr-service/): - FastAPI app with /ocr and /health endpoints - Surya engine: transformer-based OCR for typewritten/modern handwriting - Kraken engine: historical HTR for Kurrent/Suetterlin with pure-Python polygon-to-quad approximation (gift wrapping + rotating calipers) - Eager model loading at startup via lifespan context manager - PDF download via httpx, page rendering via pypdfium2 at 300 DPI Java RestClientOcrClient: - Implements OcrClient + OcrHealthClient interfaces - Calls Python service via Spring RestClient - Health check with graceful fallback Docker Compose: - New ocr-service container (mem_limit 6g, no host ports) - Health check with start_period 60s for model loading - ocr_models volume for Kraken model files - Backend depends on ocr-service health Refs #226, #227 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 15:26:40 +02:00
Marcel	aea46c5fd0	feat(ocr): add OcrService, OcrBatchService, OcrProgressService, OcrController - OcrService: single-document OCR (health check, block clearing, presigned URL, annotation + block creation) - OcrBatchService: batch processing with @Async, per-document status tracking, SKIPPED for PLACEHOLDER documents, failure isolation - OcrProgressService: SSE emitter registry per job ID with 5-min timeout - OcrController: POST /api/documents/{id}/ocr (WRITE_ALL), POST /api/ocr/batch (ADMIN), GET /api/ocr/jobs/{id} (READ_ALL), GET /api/ocr/jobs/{id}/progress (SSE), GET /api/documents/{id}/ocr-status 19 tests: 6 OcrService, 4 OcrBatchService, 3 OcrProgressService, 6 OcrController Refs #226 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 15:24:15 +02:00
Marcel	ff3990710e	feat(ocr): add OCR infrastructure (interfaces, entities, migrations, DTOs) - OcrClient + OcrHealthClient interfaces for testable OCR integration - OcrBlockResult record for OCR engine response mapping - OcrJob + OcrJobDocument entities with status enums - V25 migration creates ocr_jobs and ocr_job_documents tables - Repositories for job and job-document queries - TriggerOcrDTO, BatchOcrDTO (@Size max=500), OcrStatusDTO - ErrorCodes: OCR_SERVICE_UNAVAILABLE, OCR_JOB_NOT_FOUND, OCR_DOCUMENT_NOT_UPLOADED, OCR_PROCESSING_FAILED Refs #226 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 15:15:16 +02:00
Marcel	d194b6b225	feat(documents): add ScriptType enum and script_type column - ScriptType enum: UNKNOWN, TYPEWRITER, HANDWRITING_LATIN, HANDWRITING_KURRENT - V24 migration adds script_type VARCHAR(30) NOT NULL DEFAULT 'UNKNOWN' - Document entity: scriptType field with @Builder.Default UNKNOWN - DocumentUpdateDTO: optional scriptType field - DocumentService: wires scriptType through update method Refs #226 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 15:13:42 +02:00
Marcel	c19c41f812	feat(annotations): add createOcrAnnotation that skips overlap check OCR creates many adjacent text line annotations that would fail the existing overlap check. createOcrAnnotation() accepts an optional polygon and bypasses overlap detection entirely. Refs #227 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 15:12:11 +02:00
Marcel	878a90a86d	feat(annotations): add polygon JSONB support for quadrilateral shapes - V23 migration adds polygon JSONB column with 4-point CHECK constraint - PolygonConverter: AttributeConverter for List<List<Double>> <-> JSONB - @UniquePoints custom validator rejects duplicate coordinates - CreateAnnotationDTO: validated optional polygon field - DocumentAnnotation entity: polygon field with converter Refs #227 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 15:10:35 +02:00
Marcel	ec32d225b5	docs(adr): add ADR-001 (OCR microservice) and ADR-002 (polygon JSONB) ADR-001 documents the decision to use a separate Python container for OCR (Surya + Kraken), the interface contract, and why alternatives like Tess4J were rejected. ADR-002 documents the decision to store polygon annotations as JSONB with a 4-point CHECK constraint, backed by an AttributeConverter. Refs #226, #227 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 15:07:46 +02:00
Marcel	11a35f2952	fix(tests): resolve all 4 pre-existing test failures - CommentThread: add missing empty-state paragraph using comment_empty_hint i18n key (key existed but was never rendered in the template) - TranscriptionBlock: add selectedQuote hint using transcription_block_quote_hint i18n key (key existed but was never rendered); fix test to use native DOM el.focus()/setSelectionRange()/dispatchEvent instead of locator.selectText() which is not available in this vitest-browser version - TranscriptionEditView: fix test to use native el.dispatchEvent(FocusEvent) instead of locator.blur() which is not available - Conversations: fix test expected text from stale "Korrespondenz durchsuchen" to match current conv_empty_heading() = "Wessen Briefe möchten Sie lesen?" All 687 tests now pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 14:55:34 +02:00
Marcel	d046c89631	test(confirm): add ConfirmDialog component spec (12 tests) Covers: title/body rendering, destructive vs primary button class, custom labels, settle true/cancel, aria-labelledby, and hide-after-settle. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 14:38:58 +02:00
Marcel	a2d078b8f9	refactor(persons): replace non-null assertion with null guard on removeFormEl Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 14:36:47 +02:00
Marcel	0b95c90e7a	refactor(confirm): use import { m } instead of import * as m in ConfirmDialog Consistent with every other component in the project. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 14:35:42 +02:00
Marcel	84378f11b4	refactor(confirm): use plain let for resolveRef instead of $state resolveRef is never read reactively — it is only read synchronously inside settle(). Using $state was misleading about the intent. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 14:34:52 +02:00
Marcel	3a316bc382	fix(ui): center dialog, add backdrop, hover states, and cursor-pointer on buttons - Add m-auto and w-full to ensure the native <dialog> is centred - Add backdrop:bg-black/50 for dimmed overlay when modal is open - Add hover:bg-danger/80 and hover:bg-primary/80 on confirm button - Add cursor-pointer to both cancel and confirm buttons Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 14:33:33 +02:00
Marcel	1a519eedd6	refactor(persons): replace inline delete modal with ConfirmService in NameHistoryEditCard Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 14:19:33 +02:00
Marcel	498679234a	refactor(docs): replace inline confirmDelete toggle with ConfirmService in SaveBar Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 14:12:01 +02:00
Marcel	14fc5cbc54	refactor(admin): replace window.confirm with ConfirmService in admin group delete Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 14:06:54 +02:00
Marcel	0d1401ce4f	refactor(admin): replace window.confirm with ConfirmService in admin user delete Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 14:04:09 +02:00
Marcel	d4ead08c17	refactor(transcription): replace window.confirm with ConfirmService in TranscriptionBlock Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 13:47:37 +02:00
Marcel	08bd27b5cd	feat(layout): mount ConfirmDialog in root layout and provide confirm service provideConfirmService() sets up context for the entire component tree. ConfirmDialog is mounted once at the bottom of the layout shell. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 13:21:34 +02:00
Marcel	1942c2a5cb	feat(confirm): add ConfirmService and ConfirmDialog with deferred-Promise pattern - confirm.svelte.ts: context-based async service returning Promise<boolean> - ConfirmDialog.svelte: native <dialog> element, reads service from context - Concurrent calls return false immediately (guard at top of confirm()) - SSR-safe: confirm() returns Promise.resolve(false) on server - getConfirmService() throws descriptive error outside provider tree - 5 Vitest tests: confirm/cancel/Escape/concurrent/outside-provider all green Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 13:20:37 +02:00
Marcel	fb00de6690	feat(design-system): add --c-danger/--c-danger-fg token pair for destructive actions Light: #c0392b (5.1:1 on white — WCAG AA), dark: #e55347 (4.7:1 on surface). Exposed as bg-danger/text-danger-fg Tailwind utilities via @theme inline. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 13:07:46 +02:00
Marcel	52dd72ae8d	feat(i18n): add btn_confirm key to de/en/es message files Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 13:06:43 +02:00
Marcel	7a6b3d66fb	docs(spec): add design spec for person title & type fields UI Covers segmented type control, title input, conditional field visibility, PersonCard title display, mobile layout, and a11y. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-08 21:42:17 +02:00
Marcel	e69aaa6a8c	fix: classify Steuerfinanzamt and Reichsfechtschule as institutions Add "amt" and "schule" suffixes to INSTITUTION_END in PersonTypeClassifier so German government offices and schools are auto-classified on import. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-08 20:59:17 +02:00
Marcel	c34db997fa	feat(model): add title field to PersonUpdateDTO with @Size validation Add title to PersonUpdateDTO with @Size(max=50) constraint. PersonService.createPerson and updatePerson now handle the title field with blank-to-null normalization. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-08 18:38:33 +02:00
Marcel	166f60f7d3	feat(ui): show type icon in avatar for non-person entities Person list and detail page avatars now display a type-specific icon (building, people group, question mark) instead of meaningless initials for INSTITUTION, GROUP, and UNKNOWN person types. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-08 18:36:34 +02:00
Marcel	a1b21d6989	refactor(ui): use CSS custom properties for PersonTypeBadge colors Replace hardcoded Tailwind utility colors with project CSS variables (--c-badge-institution-*, --c-badge-group-*, --c-badge-unknown-*). Dark mode variants defined in both @media and manual toggle blocks. Extract shared badge classes and use $derived config object. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-08 18:35:05 +02:00
Marcel	5106d277f1	test(service): add integration test for findOrCreateByAlias classification Testcontainers test verifying: SKIP returns null with no DB record, INSTITUTION/GROUP store full name in lastName with null firstName and correct personType, PERSON splits name normally. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-08 13:29:20 +02:00
Marcel	ac545ecdaa	refactor: address PR review concerns - Remove Architekt from WORD_PREFIXES (classifier handles it) - Use Objects.equals for null-safe firstName/lastName comparison - Remove unused trimmed variable in PersonTypeClassifier - Fix containsWord to loop through all occurrences (finds "Eltern" in "Nachbareltern Eltern") - Extract DisplayNameFormatter utility shared by Person and PersonSummaryDTO to eliminate display logic duplication Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-08 13:25:06 +02:00
Marcel	c0cf8d7952	fix(service): add @Nullable to findOrCreateByAlias and filter nulls in caller Add @Nullable annotation to findOrCreateByAlias() return type. Filter null results (from SKIP classification) in MassImportService receiver list to prevent null elements in the receivers collection. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-08 13:22:33 +02:00
Marcel	73640ef5b6	feat(parser): implement stripTitle for known prefixes Two-pass title stripping with loop for stacked titles: - Dot-prefixes (Dr., Prof.) matched without trailing space - Word-prefixes (Tante, Frau, Schwester, etc.) matched at word boundary - Stacked titles like "Prof. Dr. Muller" handled correctly - Single token after title strip goes to lastName (not firstName) Add 5 "von" last names to KNOWN_LAST_NAMES for correct splitting of entries like "Freifrau von Massenbach". 15 new test cases + updated 3 existing tests for title behavior. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-08 13:15:18 +02:00
Marcel	6ee1ef73c3	feat(ui): add PersonTypeBadge to person list and detail pages Show colored badge for non-PERSON types per design spec: - INSTITUTION: blue with building icon - GROUP: purple with people icon - UNKNOWN: amber with question mark icon - PERSON: no badge (unmarked default) Badge appears on person cards in list and on detail page. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-08 13:09:16 +02:00
Marcel	a3da5731d0	feat(service): integrate PersonTypeClassifier into findOrCreateByAlias Classify raw name before processing. SKIP returns null (no Person created). INSTITUTION/GROUP skip split() and store full name in lastName with firstName=null and appropriate personType. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-08 13:06:49 +02:00
Marcel	68f0c4c4b9	feat(service): add PersonTypeClassifier with keyword heuristics Static classify() method uses position-aware keyword matching: - SKIP: Briefumschlag, Kondolenzbriefe, Hochzeitsgedicht (start) - INSTITUTION: Firma, Architekt (start), GmbH, Co (end) - GROUP: Familie, Comité, Comite, Geschwister, Gesellschafter, Garde, Mitarbeiter (start), Eltern, Kinder, Schwiegereltern (word boundary) - PERSON: default for all other inputs Case-insensitive. 25 parameterized test cases. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-08 13:03:53 +02:00
Marcel	e49ae5de29	fix(parser): preserve annotation parens for single-person inputs Move paren extraction in parseReceivers() after the multi-separator check so single-person entries like "Clara de Gruyter(*1871)" keep their parens intact for split()'s annotation extraction. Multi-person entries like "Hedi und Tutu (Gruber)" still use parens as shared last-name override. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-08 13:00:34 +02:00

731 Commits