diff --git a/backend/src/main/java/org/raddatz/familienarchiv/audit/README.md b/backend/src/main/java/org/raddatz/familienarchiv/audit/README.md
new file mode 100644
index 00000000..012ab00e
--- /dev/null
+++ b/backend/src/main/java/org/raddatz/familienarchiv/audit/README.md
@@ -0,0 +1,37 @@
+# audit
+
+Append-only event store for all domain mutations. Every write across the application produces an `audit_log` row. The activity feed and Family Pulse dashboard aggregate from this table.
+
+## What this domain owns
+
+Table: `audit_log` (append-only by convention — no UPDATE or DELETE in application code).
+Features: log mutations, query activity feed, query per-entity history.
+
+**Admission criteria (why this is cross-cutting, not a Tier-1 domain):** consumed by 5+ domains; has no user-facing CRUD of its own; the data model is fixed (event log, not a business entity).
+
+## What this domain does NOT own
+
+Nothing beyond the log table. `audit/` is an infrastructure layer, not a business domain.
+
+## Public surface (called from other domains)
+
+| Method | Consumer | Purpose |
+|---|---|---|
+| `logAfterCommit(event)` | document, person, user, ocr, geschichte | Record a mutation event after the DB transaction commits |
+
+`logAfterCommit` is the only write-path. Query paths (`AuditLogQueryService`) are consumed by `dashboard/` and the activity feed route.
+
+## Internal layout
+
+- `AuditService` — `logAfterCommit()` (write)
+- `AuditLogQueryService` — query by entity, by user, for the activity feed
+- `AuditLog` (entity) → table `audit_log`
+- `AuditLogRepository`
+
+## Cross-domain dependencies
+
+None. `audit/` is consumed by other domains; it does not call out to any of them.
+
+## Frontend counterpart
+
+No direct frontend counterpart. Audit data surfaces in the `activity/` and `conversation/` frontend domains via the dashboard API.
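The after-commit contract described in the audit README can be illustrated with a minimal, framework-free sketch. Nothing here is taken from the real `AuditService`: the `Tx` class, the `AuditEvent` shape, and the in-memory `AUDIT_LOG` list are hypothetical stand-ins, and a Spring application would more likely hook into its transaction synchronization mechanism. The only point is that a rolled-back transaction should leave no audit row.

```java
import java.util.ArrayList;
import java.util.List;

class AuditAfterCommitSketch {

    /** Hypothetical event shape; the real audit_log columns are not shown in the README. */
    static final class AuditEvent {
        final String domain;
        final String action;

        AuditEvent(String domain, String action) {
            this.domain = domain;
            this.action = action;
        }
    }

    /** Hypothetical stand-in for a DB transaction that defers callbacks until commit. */
    static final class Tx {
        private final List<Runnable> afterCommit = new ArrayList<>();

        void registerAfterCommit(Runnable callback) {
            afterCommit.add(callback);
        }

        void commit() {
            // Simulated commit succeeded: run each deferred callback exactly once.
            List<Runnable> pending = new ArrayList<>(afterCommit);
            afterCommit.clear();
            pending.forEach(Runnable::run);
        }

        void rollback() {
            // Nothing was committed, so the deferred audit writes are dropped.
            afterCommit.clear();
        }
    }

    /** In-memory stand-in for the append-only audit_log table. */
    static final List<AuditEvent> AUDIT_LOG = new ArrayList<>();

    /** Defers the audit write until the surrounding transaction commits. */
    static void logAfterCommit(Tx tx, AuditEvent event) {
        tx.registerAfterCommit(() -> AUDIT_LOG.add(event));
    }

    public static void main(String[] args) {
        Tx committed = new Tx();
        logAfterCommit(committed, new AuditEvent("document", "UPDATED"));
        committed.commit();

        Tx rolledBack = new Tx();
        logAfterCommit(rolledBack, new AuditEvent("person", "DELETED"));
        rolledBack.rollback();

        System.out.println("audit rows: " + AUDIT_LOG.size());
    }
}
```

In this sketch only the committed document update reaches the log; the rolled-back person deletion never does, which is the behaviour the README's "after the DB transaction commits" wording implies.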
diff --git a/backend/src/main/java/org/raddatz/familienarchiv/dashboard/README.md b/backend/src/main/java/org/raddatz/familienarchiv/dashboard/README.md
new file mode 100644
index 00000000..952ceb42
--- /dev/null
+++ b/backend/src/main/java/org/raddatz/familienarchiv/dashboard/README.md
@@ -0,0 +1,39 @@
+# dashboard
+
+Stats aggregation for the admin dashboard and the Family Pulse widget. This is a derived domain — it has no tables of its own; all data is computed on-the-fly from Tier-1 domain data.
+
+## What this domain owns
+
+No entities. Routes: `/api/dashboard/*`, `/api/stats/*`.
+Features: document counts, person counts, publication stats, weekly activity data, incomplete-document list, enrichment queue, Family Pulse widget data, admin statistics.
+
+**Admission criteria (cross-cutting):** aggregates from 3+ domains; no owned entities.
+
+## What this domain does NOT own
+
+None of the underlying data — it reads from `document/`, `person/`, `audit/`, `notification/`, `geschichte/`.
+
+## Public surface
+
+`dashboard/` is a leaf domain — no other domain calls its services. It is the aggregator, not the aggregated.
+
+## Internal layout
+
+- `StatsController` — REST under `/api/stats`
+- `DashboardController` — REST under `/api/dashboard`
+- `StatsService` — aggregated counts (documents, persons, geschichten, incomplete, etc.)
+- `DashboardService` — activity feed composition, Family Pulse data
+
+## Cross-domain dependencies
+
+- `DocumentService.count()` — total document count (StatsService)
+- `DocumentService.getDocumentById(UUID)` / `getDocumentsByIds(List)` — document enrichment for activity feed (DashboardService)
+- `PersonService.count()` — total person count (StatsService)
+- `TranscriptionService.listBlocks(UUID)` — transcription block lookup for resume widget (DashboardService)
+- `UserService.getById(UUID)` — actor name resolution in activity feed (DashboardService)
+- `CommentService.findAnnotationIdsByIds(...)` — annotation context lookup for activity feed (DashboardService)
+- `AuditLogQueryService.findMostRecentDocumentForUser()` / `getPulseStats()` / `findActivityFeed()` — audit-sourced feed rows (DashboardService)
+
+## Frontend counterpart
+
+Activity feed and Pulse widget are assembled in `frontend/src/lib/shared/dashboard/` and in the `aktivitaeten` route; no dedicated `dashboard/` lib folder.
diff --git a/backend/src/main/java/org/raddatz/familienarchiv/document/README.md b/backend/src/main/java/org/raddatz/familienarchiv/document/README.md
new file mode 100644
index 00000000..8a67ec0d
--- /dev/null
+++ b/backend/src/main/java/org/raddatz/familienarchiv/document/README.md
@@ -0,0 +1,50 @@
+# document
+
+The archive's core concept. A `Document` represents one physical artefact (a letter, a postcard, a photo) stored in MinIO and described by metadata.
+
+## What this domain owns
+
+Entities: `Document`, `DocumentVersion`, `TranscriptionBlock`, `DocumentAnnotation`, `DocumentComment`.
+Features: document CRUD, file upload/download, full-text search, bulk editing, transcription workflows, annotation canvas, threaded comments, thumbnail generation (PDFBox).
+
+## What this domain does NOT own
+
+- `Person` (sender / receivers) — referenced by ID, resolved via `PersonService`
+- `Tag` — referenced by ID; the join is on the document side but tags are owned by `tag/`
+- `AppUser` — comments reference `AppUser` IDs, but user management lives in `user/`
+- OCR processing — `ocr/` orchestrates jobs; `ocr-service/` executes them
+
+## Public surface (called from other domains)
+
+| Method | Consumer | Purpose |
+|---|---|---|
+| `getDocumentById(UUID)` | ocr, notification | Fetch a single document |
+| `getDocumentsByIds(List)` | ocr | Bulk fetch for OCR job |
+| `findByOriginalFilename(String)` | importing | Deduplication during mass import |
+| `deleteTagCascading(UUID tagId)` | tag | Remove a tag from all documents before deleting it |
+| `findWeeklyStats()` | dashboard | Activity data for Family Pulse widget |
+| `count()` | dashboard | Total document count for stats |
+| `addTrainingLabel(...)` | ocr | Attach a confirmed sender label to a document |
+| `findSegmentationQueue(int limit)` / `findTranscriptionQueue(int limit)` / `findReadyToReadQueue(int limit)` | ocr | OCR pipeline queues |
+
+## Internal layout
+
+- `DocumentController` — REST under `/api/documents`
+- `DocumentService` — CRUD, search (JPA Specifications), bulk edit
+- `DocumentRepository` — includes bidirectional conversation-thread query
+- `DocumentSpecifications` — composable `Specification` predicates for search
+- `DocumentVersionService` / `DocumentVersionRepository` — append-only version history
+- `ThumbnailService` + `ThumbnailAsyncRunner` — PDFBox thumbnail generation (separate thread pool)
+- Sub-packages: `annotation/`, `comment/`, `transcription/`
+
+## Cross-domain dependencies
+
+- `PersonService.getById()` / `getAllById()` — resolve sender and receivers
+- `TagService.expandTagNamesToDescendantIdSets()` — tag filter expansion
+- `FileService.uploadFile()` / `downloadFile()` / `generatePresignedUrl()` — S3 I/O
+- `NotificationService.notifyMentions()` / `.notifyReply()` — comment mentions
+- `AuditService.logAfterCommit()` — every mutation is audited
+
+## Frontend counterpart
+
+`frontend/src/lib/document/README.md`
diff --git a/backend/src/main/java/org/raddatz/familienarchiv/geschichte/README.md b/backend/src/main/java/org/raddatz/familienarchiv/geschichte/README.md
new file mode 100644
index 00000000..d206a4c7
--- /dev/null
+++ b/backend/src/main/java/org/raddatz/familienarchiv/geschichte/README.md
@@ -0,0 +1,38 @@
+# geschichte
+
+Family stories — curated narrative pieces that weave together persons, documents, and commentary into a publishable article. German: *Geschichte* (story / history).
+
+## What this domain owns
+
+Entity: `Geschichte`.
+Lifecycle: `DRAFT → PUBLISHED` (only published stories are visible to non-authors).
+Features: story CRUD, rich-text editing with person and document cross-references, publish/unpublish toggle, comment thread (shared component from `shared/discussion/`).
+
+## What this domain does NOT own
+
+- `Person` or `Document` records — stories reference them by ID. Deleting a Person or Document does not cascade to Geschichte.
+- Comment storage — shared comment infrastructure is in `document/comment/` (or `shared/discussion/` on the frontend).
+
+## Public surface (called from other domains)
+
+| Method | Consumer | Purpose |
+|---|---|---|
+| `getById(UUID)` | notification | Resolve story context in mention notifications |
+| `list(...)` | dashboard | Recent stories for the activity feed |
+| `count()` | dashboard | Published story count for stats |
+
+## Internal layout
+
+- `GeschichteController` — REST under `/api/geschichten`
+- `GeschichteService` — CRUD, publish lifecycle
+- `GeschichteRepository` — list by status, author
+
+## Cross-domain dependencies
+
+- `PersonService.getById()` / `getAllById()` — resolve person references in story body
+- `DocumentService.getDocumentsByIds()` — resolve document references in story body
+- `AuditService.logAfterCommit()` — story mutations are audited
+
+## Frontend counterpart
+
+`frontend/src/lib/geschichte/README.md`
diff --git a/backend/src/main/java/org/raddatz/familienarchiv/notification/README.md b/backend/src/main/java/org/raddatz/familienarchiv/notification/README.md
new file mode 100644
index 00000000..904cc106
--- /dev/null
+++ b/backend/src/main/java/org/raddatz/familienarchiv/notification/README.md
@@ -0,0 +1,41 @@
+# notification
+
+In-app messages delivered in real time via SSE and persisted in the bell-icon dropdown. Notifications are created by other domains in response to events (comment mentions, replies).
+
+## What this domain owns
+
+Entity: `Notification`.
+Features: create and deliver notifications, unread count, mark-read, SSE real-time push, per-user delivery preferences (stored as fields on `AppUser`, managed by `user/`).
+
+## What this domain does NOT own
+
+- `AppUser` (recipient) — owned by `user/`
+- `Document` or `Geschichte` (notification context) — referenced by ID only
+
+## Public surface (called from other domains)
+
+| Method | Consumer | Purpose |
+|---|---|---|
+| `notifyMentions(mentionedUserIds, comment)` | document (comment) | Push mention notifications when a comment contains @mentions |
+| `notifyReply(reply, participantIds)` | document (comment) | Push reply notification to all thread participants |
+| `countUnread(userId)` | user session | Unread badge count in the nav bar |
+| `getNotifications(userId)` | dashboard / activity | Notification list for bell dropdown |
+| `markRead(id)` / `markAllRead(userId)` | notification controller | User-driven read-state updates |
+| `updatePreferences(userId, dto)` | notification controller | Per-user delivery preferences |
+
+## Internal layout
+
+- `NotificationController` — REST under `/api/notifications`
+- `NotificationService` — create, query, mark-read
+- `SseEmitterRegistry` — runtime-stateful component that keeps one `SseEmitter` per connected user. On `notifyMentions()` / `notifyReply()`, the service writes to `SseEmitterRegistry` to push real-time events. SSE connections go **backend → browser directly**, not via the SvelteKit SSR layer.
+- `NotificationRepository` — persisted notification rows
+- `NotificationPreferenceDTO` — read/write DTO for preference endpoints (prefs stored on `AppUser`)
+
+## Cross-domain dependencies
+
+**Outbound (this domain calls):**
+- `DocumentService.findTitlesByIds(List)` — enriches notification DTOs with document titles for display in the bell dropdown
+
+## Frontend counterpart
+
+`frontend/src/lib/notification/README.md`
diff --git a/backend/src/main/java/org/raddatz/familienarchiv/ocr/README.md b/backend/src/main/java/org/raddatz/familienarchiv/ocr/README.md
new file mode 100644
index 00000000..5bf9a095
--- /dev/null
+++ b/backend/src/main/java/org/raddatz/familienarchiv/ocr/README.md
@@ -0,0 +1,44 @@
+# ocr
+
+OCR/HTR pipeline orchestration. This domain manages job lifecycle and result ingestion — it does **not** perform OCR. Actual text recognition runs in the Python `ocr-service/` container (port 8000, internal network only).
+
+## What this domain owns
+
+Entities: `OcrJob`, `OcrJobDocument`, `SenderModel`.
+Features: start OCR jobs, track job lifecycle (`PENDING → RUNNING → DONE / FAILED`), stream transcription blocks back into `document/transcription/`, sender-model training, segmentation training.
+
+## What this domain does NOT own
+
+- Document content — `Document` and `TranscriptionBlock` are owned by `document/`
+- File storage — presigned MinIO URLs are generated by `filestorage/FileService` and passed to the OCR service
+- OCR processing — the Python `ocr-service/` executes Surya (typewritten) and Kraken (Kurrent/Sütterlin HTR) and streams results back
+
+## Public surface (called from other domains)
+
+| Method | Consumer | Purpose |
+|---|---|---|
+| `startOcr(documentId, ...)` | document | Trigger an OCR job for a document |
+| `getJob(UUID)` | document | Fetch job status |
+| `getDocumentOcrStatus(UUID)` | document | Per-document OCR status summary |
+
+## Internal layout
+
+- `OcrController` — REST under `/api/ocr`
+- `OcrService` — job creation, presigned URL generation, result ingestion
+- `OcrBatchService` — batch job workflows
+- `OcrAsyncRunner` — `@Async` execution of OCR jobs
+- `OcrTrainingService` — calls `/train` and `/segtrain` on the Python service (protected by `X-Training-Token` header)
+- `OcrJobRepository` / `OcrJobDocumentRepository`
+- `SenderModelRepository` — trained sender-recognition models
+- `OcrClient` (interface) / `RestClientOcrClient` — HTTP client for the Python OCR service; mockable for tests
+
+## Cross-domain dependencies
+
+- `DocumentService.getDocumentById()` / `getDocumentsByIds()` — resolve target documents
+- `DocumentService.addTrainingLabel()` — attach confirmed sender labels after training
+- `FileService.generatePresignedUrl()` — generate MinIO presigned URLs passed to the OCR service (PDF bytes never flow through the backend)
+- `AuditService.logAfterCommit()` — OCR job events are audited
+
+## Frontend counterpart
+
+`frontend/src/lib/ocr/README.md`
diff --git a/backend/src/main/java/org/raddatz/familienarchiv/person/README.md b/backend/src/main/java/org/raddatz/familienarchiv/person/README.md
new file mode 100644
index 00000000..f507f540
--- /dev/null
+++ b/backend/src/main/java/org/raddatz/familienarchiv/person/README.md
@@ -0,0 +1,45 @@
+# person
+
+Historical individuals referenced by documents. A `Person` is a family member who appears as a sender or receiver in the archive — they are never login accounts.
+
+## What this domain owns
+
+Entities: `Person`, `PersonNameAlias`, `PersonRelationship`.
+Features: person CRUD, name alias management, person merge (deduplication), family-member designation, relationship graph, person type classification (FAMILY, CORRESPONDENT, INSTITUTION).
+
+## What this domain does NOT own
+
+- `AppUser` — login accounts are in `user/`. A `Person` record has no login credentials. The separation is deliberate: a historical family member from 1905 is never a system user.
+- Document content — `Person` records are referenced by documents (as sender/receiver), not the other way around.
+- Relationship rendering — the Stammbaum view is derived by the frontend from `PersonRelationship` data.
+
+## Public surface (called from other domains)
+
+| Method | Consumer | Purpose |
+|---|---|---|
+| `getById(UUID)` | document, geschichte, ocr | Fetch one person by ID |
+| `getAllById(List)` | document | Bulk fetch for sender/receiver resolution |
+| `findAll(String q)` | document, dashboard | List all persons |
+| `findByName(String firstName, String lastName)` | document | Typeahead search |
+| `findOrCreateByAlias(String rawName)` | importing | Idempotent create during mass import; type classification happens internally |
+| `findAllFamilyMembers()` | dashboard | Family member list for stats |
+| `findCorrespondents()` | document | Correspondent list for conversation filter |
+| `count()` | dashboard | Total person count for stats |
+
+## Internal layout
+
+- `PersonController` — REST under `/api/persons`
+- `PersonService` — CRUD, merge, alias management, family-member designation
+- `PersonRepository` — sorted list, name search
+- `PersonNameAlias` / `PersonNameAliasRepository` — alternative name spellings
+- `PersonNameParser` / `PersonTypeClassifier` — name parsing utilities
+- `PersonSummaryDTO` — lightweight DTO for typeahead / list views
+- Sub-package: `relationship/` — `PersonRelationship`, `RelationshipService`, `RelationshipController`
+
+## Cross-domain dependencies
+
+- `AuditService.logAfterCommit()` — person mutations are audited
+
+## Frontend counterpart
+
+`frontend/src/lib/person/README.md`
diff --git a/backend/src/main/java/org/raddatz/familienarchiv/tag/README.md b/backend/src/main/java/org/raddatz/familienarchiv/tag/README.md
new file mode 100644
index 00000000..4f34833e
--- /dev/null
+++ b/backend/src/main/java/org/raddatz/familienarchiv/tag/README.md
@@ -0,0 +1,35 @@
+# tag
+
+Hierarchical document categories. Tags form a tree via a self-referencing `parent_id` column and are applied to documents for filtering and browse navigation.
+
+## What this domain owns
+
+Entity: `Tag` (self-referencing `parent_id` tree).
+Features: tag CRUD, hierarchical deletion (cascade to descendants), tag typeahead, admin tag management (rename, reparent, merge).
+
+## What this domain does NOT own
+
+- Documents — the `document_tags` join table is on the document side. `Tag` does not hold document references.
+- Tag assignment — adding/removing a tag from a document is handled by `DocumentService`.
+
+## Public surface (called from other domains)
+
+| Method | Consumer | Purpose |
+|---|---|---|
+| `delete(UUID)` | document | Remove the tag record; called by `DocumentService.deleteTagCascading()` after all document references are unlinked |
+| `deleteWithDescendants(UUID)` | admin tag UI | Recursive subtree deletion |
+| `expandTagNamesToDescendantIdSets(List)` | document | Expand tag filter to include descendant tags |
+
+## Internal layout
+
+- `TagController` — REST under `/api/tags`
+- `TagService` — CRUD, hierarchy traversal, cascade-delete coordination
+- `TagRepository` — find-or-create by name (case-insensitive), subtree queries
+
+## Cross-domain dependencies
+
+None. Documents reference tags; tags do not reference documents or other domains.
+
+## Frontend counterpart
+
+`frontend/src/lib/tag/README.md`
diff --git a/backend/src/main/java/org/raddatz/familienarchiv/user/README.md b/backend/src/main/java/org/raddatz/familienarchiv/user/README.md
new file mode 100644
index 00000000..7d79f0da
--- /dev/null
+++ b/backend/src/main/java/org/raddatz/familienarchiv/user/README.md
@@ -0,0 +1,35 @@
+# user
+
+Login accounts and permission groups. An `AppUser` is a system user who can authenticate and act in the application — they are never a historical family member.
+
+## What this domain owns
+
+Entities: `AppUser`, `UserGroup`, password-reset tokens, invite tokens.
+Features: user CRUD, group CRUD, password change, password reset flow, invite links.
+
+## What this domain does NOT own
+
+- `Person` records — historical family members. An `AppUser` is never linked to a `Person`. This separation is intentional: a person who digitized letters in 2024 is not the same entity as their great-grandmother who wrote them in 1912. See `docs/GLOSSARY.md`.
+- Permission enforcement — `security/` owns `@RequirePermission` and `PermissionAspect`. `user/` only manages which permissions are stored on `UserGroup`.
+
+## Public surface
+
+`UserService` methods are consumed primarily by the security infrastructure and the admin UI. No other business-logic domain calls `UserService` directly.
+
+The Spring Security chain (via `CustomUserDetailsService` in `security/`) calls `AppUserRepository.findByUsername()` on every authenticated request.
+
+## Internal layout
+
+- `UserController` — REST under `/api/users` (current user, CRUD)
+- `AuthController` — password reset, invite flow
+- `UserService` — BCrypt-encoded passwords, group assignment
+- `AppUserRepository` — find by username (used by Spring Security)
+- `UserGroupRepository` — group and permission management
+
+## Cross-domain dependencies
+
+- `AuditService.logAfterCommit()` — user-management mutations are audited
+
+## Frontend counterpart
+
+`frontend/src/lib/user/README.md`
diff --git a/frontend/src/lib/document/README.md b/frontend/src/lib/document/README.md
new file mode 100644
index 00000000..16885de7
--- /dev/null
+++ b/frontend/src/lib/document/README.md
@@ -0,0 +1,36 @@
+# document (frontend)
+
+UI for the archive's core concept: viewing, uploading, editing, searching, bulk-selecting, and transcribing documents.
+
+## What this domain owns
+
+Components: `DocumentRow`, `DocumentThumbnail`, `DocumentTopBar`, `DocumentViewer`, `DocumentMetadataDrawer`, `DocumentEditLayout`, `DocumentStatusChip`, `UploadZone`, `BulkSelectionBar`, `BulkDropZone`.
+Utilities: `search.ts` (search-param helpers), `filename.ts` (filename formatting), `documentStatusLabel.ts` (i18n label mapping), `validateFile.ts` (upload validation), `groupDocuments.ts` (list grouping).
+Sub-folders: `annotation/`, `transcription/`, `viewer/`.
+
+## What this domain does NOT own
+
+- Person typeahead — `person/PersonTypeahead.svelte` (cross-domain import, allowed by ESLint rule)
+- Tag input — `tag/TagInput.svelte` (cross-domain import, allowed)
+- Shared discussion — `shared/discussion/` (comment/mention editor)
+
+## Key components
+
+| Component | Route used in | Notes |
+| --------------------------- | ---------------------------------- | ------------------------------------ |
+| `DocumentRow.svelte` | `/` (search results), admin queues | Compact document card with thumbnail |
+| `DocumentViewer.svelte` | `/documents/[id]` | PDF/image inline viewer |
+| `DocumentEditLayout.svelte` | `/documents/[id]/edit` | Full edit form with sticky save bar |
+| `UploadZone.svelte` | `/documents/new`, bulk upload | Drag-and-drop file drop area |
+| `BulkSelectionBar.svelte` | `/documents` bulk mode | Multi-select action bar |
+
+## Cross-domain imports
+
+- `person/PersonTypeahead.svelte` — sender / receiver selection
+- `tag/TagInput.svelte` — tag chip input
+- `ocr/OcrProgress.svelte` — job status indicator in the document header
+- `shared/primitives/BackButton.svelte`, `shared/discussion/` — shared UI
+
+## Backend counterpart
+
+`backend/src/main/java/org/raddatz/familienarchiv/document/README.md`
diff --git a/frontend/src/lib/geschichte/README.md b/frontend/src/lib/geschichte/README.md
new file mode 100644
index 00000000..74baea7b
--- /dev/null
+++ b/frontend/src/lib/geschichte/README.md
@@ -0,0 +1,34 @@
+# geschichte (frontend)
+
+UI for family stories: the rich-text editor, story cards, and story list view.
+
+## What this domain owns
+
+Components: `GeschichteEditor.svelte`, `GeschichtenCard.svelte`.
+
+## What this domain does NOT own
+
+- Comment/discussion UI — shared via `shared/discussion/` (same component used for document comments)
+- Person display — `person/PersonChip.svelte` is used inside story content (cross-domain import)
+- Document display — document references in stories use components from `document/`
+
+## Key components
+
+| Component | Used in | Notes |
+| ------------------------- | -------------------------------------------- | ------------------------------------------------------------------ |
+| `GeschichteEditor.svelte` | `/geschichten/new`, `/geschichten/[id]/edit` | Rich-text editor with person/document @-mentions and inline embeds |
+| `GeschichtenCard.svelte` | `/geschichten` (list), dashboard | Story preview card with cover image and publish status |
+
+## Audience note
+
+The `/geschichten` route primarily serves readers (younger family members on mobile). Cards must have ≥ 44 px touch targets. Status must not rely on color alone.
+
+## Cross-domain imports
+
+- `person/PersonChip.svelte` — inline person references in story content
+- `document/DocumentThumbnail.svelte` — inline document references
+- `shared/discussion/` — comment thread below published stories
+
+## Backend counterpart
+
+`backend/src/main/java/org/raddatz/familienarchiv/geschichte/README.md`
diff --git a/frontend/src/lib/notification/README.md b/frontend/src/lib/notification/README.md
new file mode 100644
index 00000000..36377c46
--- /dev/null
+++ b/frontend/src/lib/notification/README.md
@@ -0,0 +1,36 @@
+# notification (frontend)
+
+Bell-icon dropdown and real-time SSE connection for in-app notifications.
+
+## What this domain owns
+
+Components: `NotificationBell.svelte`, `NotificationDropdown.svelte`.
+Utilities: `notifications.svelte.ts` (Svelte 5 reactive store), `notifications.ts` (API helpers).
+
+## What this domain does NOT own
+
+- SSE infrastructure — the backend's `SseEmitterRegistry` manages the server-side emitter. The frontend establishes one `EventSource` connection per session. Connection management lives in `notifications.svelte.ts`.
+- Notification content rendering — notification payloads contain a `contextUrl`; the frontend navigates there on click.
+
+## Key design: SSE connection
+
+The SSE path is **backend → browser directly** (not proxied through SvelteKit SSR). The `EventSource` connects to `/api/notifications/stream`. On receive, the reactive store updates the unread count and the bell dropdown in real time.
+
+```
+Backend SseEmitterRegistry → /api/notifications/stream → EventSource in browser
+```
+
+## Key components
+
+| Component | Used in | Notes |
+| ----------------------------- | ----------------------------- | --------------------------------------------------------- |
+| `NotificationBell.svelte` | global nav (`+layout.svelte`) | Bell icon with unread badge; opens `NotificationDropdown` |
+| `NotificationDropdown.svelte` | global nav | Scrollable list of recent notifications with mark-read |
+
+## Cross-domain imports
+
+- `shared/primitives/` — icon, button primitives only
+
+## Backend counterpart
+
+`backend/src/main/java/org/raddatz/familienarchiv/notification/README.md`
diff --git a/frontend/src/lib/ocr/README.md b/frontend/src/lib/ocr/README.md
new file mode 100644
index 00000000..261b4e82
--- /dev/null
+++ b/frontend/src/lib/ocr/README.md
@@ -0,0 +1,27 @@
+# ocr (frontend)
+
+UI for OCR job management, progress display, and sender-model training in the admin/enrichment panel.
+
+## What this domain owns
+
+Components: `OcrProgress.svelte`, `OcrTrigger.svelte`, `OcrTrainingCard.svelte`, `SegmentationTrainingCard.svelte`, `TrainingHistory.svelte`.
+Utilities: `translateOcrProgress.ts` (progress-state → display-string mapping), `training.ts` (training API helpers).
+
+## What this domain does NOT own
+
+- OCR processing — all text recognition runs in the Python `ocr-service/` container. The frontend shows job state; it does not run OCR.
+- Transcription block display — rendered by `document/transcription/` components.
+
+## Key components
+
+| Component | Used in | Notes |
+| --------------------------------- | ----------------------------- | -------------------------------------------------------- |
+| `OcrProgress.svelte` | document header, enrich panel | Progress bar and status label for an active OCR job |
+| `OcrTrigger.svelte` | enrich panel, document detail | Button to start an OCR job; disabled when one is running |
+| `OcrTrainingCard.svelte` | `/admin/ocr` | Trigger sender-model training; shows training history |
+| `SegmentationTrainingCard.svelte` | `/admin/ocr` | Trigger segmentation training |
+| `TrainingHistory.svelte` | `/admin/ocr` | List of past training runs with status |
+
+## Backend counterpart
+
+`backend/src/main/java/org/raddatz/familienarchiv/ocr/README.md`
diff --git a/frontend/src/lib/person/README.md b/frontend/src/lib/person/README.md
new file mode 100644
index 00000000..59ca3b3d
--- /dev/null
+++ b/frontend/src/lib/person/README.md
@@ -0,0 +1,37 @@
+# person (frontend)
+
+UI for historical family members: typeahead selection, chip display, hover cards, genealogy graph, relationship management.
+
+## What this domain owns
+
+Components: `PersonTypeahead.svelte`, `PersonMultiSelect.svelte`, `PersonChip.svelte`, `PersonChipRow.svelte`, `PersonHoverCard.svelte`, `PersonTypeBadge.svelte`, `PersonTypeSelector.svelte`.
+Utilities: `personFormat.ts` (full-name formatting), `personLifeDates.ts` (birth/death display), `person-validation.ts` (form validation), `personHoverCard.ts` (hover-card controller).
+Sub-folders: `genealogy/` (Stammbaum view components), `relationship/` (relationship graph components).
+
+## What this domain does NOT own
+
+- Document content — displayed in `document/`
+- AppUser accounts — managed in `user/`
+
+## Key components
+
+| Component | Used in | Notes |
+| -------------------------- | ----------------------------------------- | ----------------------------------------------------------------------------------- |
+| `PersonTypeahead.svelte` | document edit, geschichte, search filters | Single-person selector with debounced typeahead. Exported for use by other domains. |
+| `PersonMultiSelect.svelte` | document edit (receivers) | Chip-based multi-person selector |
+| `PersonChip.svelte` | document rows, conversation view | Compact display chip with link and hover card |
+| `PersonHoverCard.svelte` | person chips | Floating card with person summary on hover |
+
+## Cross-domain imports
+
+- `shared/primitives/` — generic UI primitives
+- `shared/hooks/useTypeahead.svelte.ts` — typeahead keyboard/focus logic
+
+## Accessibility notes
+
+- `PersonChip` focus ring: `focus-visible:ring-2 focus-visible:ring-brand-navy`
+- `PersonTypeahead` dropdown navigable via keyboard (↑↓ Enter Escape)
+
+## Backend counterpart
+
+`backend/src/main/java/org/raddatz/familienarchiv/person/README.md`
diff --git a/frontend/src/lib/shared/README.md b/frontend/src/lib/shared/README.md
new file mode 100644
index 00000000..5968ed4a
--- /dev/null
+++ b/frontend/src/lib/shared/README.md
@@ -0,0 +1,40 @@
+# shared (frontend)
+
+Cross-domain utilities and UI primitives. Any file here is consumed by two or more domain folders and has no domain identity of its own.
+
+## Admission criteria (what belongs here)
+
+A file belongs in `shared/` if it meets **all three** conditions:
+
+1. No domain identity — it does not represent a `Document`, `Person`, `Tag`, etc.
+2. Consumed by ≥ 2 domain folders — or is framework infrastructure that every domain depends on.
+3. Generic — could work in a different SvelteKit project with zero business-logic changes.
+
+If any condition fails, the file belongs in the domain folder of its primary consumer.
+
+## What this folder owns
+
+| Sub-folder / file | Purpose |
+| ----------------- | ----------------------------------------------------------------------------------------------------------------------------------- |
+| `api.server.ts` | Typed `openapi-fetch` client factory — the standard entry point for all backend API calls in server-side load functions and actions |
+| `errors.ts` | Mirror of the backend `ErrorCode` enum + `getErrorMessage()` → Paraglide i18n key mapping |
+| `types.ts` | Cross-domain TypeScript interfaces |
+| `utils.ts` | Pure utility functions (date formatting, sorting, debounce) |
+| `relativeTime.ts` | Human-relative time formatting (`"2 days ago"`) |
+| `primitives/` | Generic UI components: `BackButton.svelte`, form inputs, pagination, layout shells |
+| `discussion/` | Comment/mention editor shared by `document/` and `geschichte/` |
+| `dashboard/` | Family Pulse widget and recent-activity components assembled in the `/` route |
+| `hooks/` | Svelte 5 reactive hooks: `useTypeahead`, `useUnsavedWarning` |
+| `services/` | Generic client-side service helpers |
+| `actions/` | Shared SvelteKit form action utilities |
+| `server/` | Server-only shared utilities (load function helpers) |
+| `help/` | Coach marks and empty-state components used across multiple domains |
+
+## What does NOT belong here
+
+- Components owned by one domain — move to that domain's folder.
+- Domain-specific business logic — even if shared, it belongs in the owning domain's public surface.
+
+## Adding to shared/
+
+If you need to add a file here, confirm it meets all three admission criteria. If it's domain-adjacent, check whether the owning domain should export it as part of its public surface instead.
diff --git a/frontend/src/lib/tag/README.md b/frontend/src/lib/tag/README.md new file mode 100644 index 00000000..5a30f14b --- /dev/null +++ b/frontend/src/lib/tag/README.md @@ -0,0 +1,28 @@ +# tag (frontend) + +UI for hierarchical document categories: tag chip lists, tag input with typeahead, and the admin tag-tree editor. + +## What this domain owns + +Components: `TagInput.svelte`, `TagChipList.svelte`, `TagParentPicker.svelte`. + +## What this domain does NOT own + +- Tag data management — CRUD is handled via the backend `tag/` domain +- Document association — adding/removing tags from documents is in `document/` + +## Key components + +| Component | Used in | Notes | +| ------------------------ | --------------------------- | -------------------------------------------------------------------------------------------------------------------------------- | +| `TagInput.svelte` | document edit form | Multi-tag chip input with typeahead. Supports free-text creation and selecting existing tags. Exported for use by other domains. | +| `TagChipList.svelte` | document rows, detail pages | Read-only display of a tag set | +| `TagParentPicker.svelte` | admin tag editor | Tree-aware parent selection | + +## Cross-domain imports + +- `shared/hooks/useTypeahead.svelte.ts` — shared typeahead logic for `TagInput` + +## Backend counterpart + +`backend/src/main/java/org/raddatz/familienarchiv/tag/README.md` diff --git a/frontend/src/lib/user/README.md b/frontend/src/lib/user/README.md new file mode 100644 index 00000000..3302aac0 --- /dev/null +++ b/frontend/src/lib/user/README.md @@ -0,0 +1,28 @@ +# user (frontend) + +UI for account management: profile editing, password change, and permission group management in the admin panel. + +## What this domain owns + +Components: `UserProfileSection.svelte`, `UserPasswordSection.svelte`, `UserGroupsSection.svelte`. 
+ +## What this domain does NOT own + +- `Person` records — historical family members are entirely separate from login accounts. A user editing their profile is an `AppUser`; the historical persons in documents are `Person` entities. They are never linked. +- User list or admin creation UI — those live in the `/admin` route, which assembles views from multiple domains. + +## Key components + +| Component | Used in | Notes | +| ---------------------------- | --------------------------- | ------------------------------------ | +| `UserProfileSection.svelte` | `/settings` or profile page | Display name, email editing | +| `UserPasswordSection.svelte` | `/settings` | Password change form | +| `UserGroupsSection.svelte` | `/admin` | Per-user permission group assignment | + +## Cross-domain imports + +- `shared/primitives/` — generic UI primitives only + +## Backend counterpart + +`backend/src/main/java/org/raddatz/familienarchiv/user/README.md` diff --git a/ocr-service/README.md b/ocr-service/README.md new file mode 100644 index 00000000..976db06b --- /dev/null +++ b/ocr-service/README.md @@ -0,0 +1,51 @@ +# ocr-service + +Python FastAPI microservice that performs the actual handwritten text recognition (HTR) and OCR. The Spring Boot backend orchestrates jobs; this service executes them. 
+ +## What this service owns + +- Text recognition: Surya (typewritten text) and Kraken (Kurrent/Sütterlin historical handwriting) +- Baseline layout analysis: Kraken BLLA model +- Sender recognition: trained per-archive sender models +- HTTP API at port 8000 (internal Docker network — no external port) + +## What this service does NOT own + +- Job lifecycle — tracked in the backend's `ocr/` domain +- MinIO storage — the service fetches PDFs via presigned URLs generated by the backend; it does not hold credentials +- Transcription block storage — results are streamed back to the backend, which writes them to PostgreSQL + +## API endpoints + +| Endpoint | Auth | Purpose | +|---|---|---| +| `POST /ocr` | None (internal network only) | Run OCR on a PDF (presigned MinIO URL in request body) | +| `POST /train` | `X-Training-Token` header | Trigger sender-model training | +| `POST /segtrain` | `X-Training-Token` header | Trigger segmentation training | +| `GET /health` | None | Health check | + +## Environment variables + +| Variable | Default | Required? | Sensitive? | Purpose | +|---|---|---|---|---| +| `TRAINING_TOKEN` | — | YES (prod) | YES | Guards `/train` and `/segtrain`. Do not leave empty in production. | +| `ALLOWED_PDF_HOSTS` | `minio,localhost,127.0.0.1` | YES | — | SSRF protection — comma-separated allowed PDF source hosts. Never set to `*`. | +| `KRAKEN_MODEL_PATH` | `/app/models/` | — | — | Directory where Kraken HTR models are stored (populated by `download-kraken-models.sh`) | +| `BLLA_MODEL_PATH` | `/app/models/blla.mlmodel` | — | — | Kraken baseline layout analysis model. Auto-downloaded via `ensure_blla_model.py` on startup if missing. 
| + +## Key files + +| File | Purpose | +|---|---| +| `main.py` | FastAPI app, endpoint definitions, SSRF validation | +| `engines/` | Surya and Kraken engine wrappers | +| `models.py` | Pydantic request/response models | +| `preprocessing.py` | PDF-to-image conversion before OCR | +| `confidence.py` | Per-block confidence scoring | +| `spell_check.py` | Post-OCR spell correction using historical dictionaries | +| `ensure_blla_model.py` | Startup script that downloads the BLLA model if missing | +| `entrypoint.sh` | Docker entrypoint — runs `ensure_blla_model.py` then starts the server | + +## Backend counterpart + +`backend/src/main/java/org/raddatz/familienarchiv/ocr/README.md`
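
## Example: host allow-list and training-token checks (sketch)

The `ALLOWED_PDF_HOSTS` and `TRAINING_TOKEN` rows in the environment-variable table describe two guards without showing their shape. The following is a minimal sketch of how such checks could look, not the code in `main.py`: the function names `parse_allowed_hosts`, `is_allowed_pdf_url`, and `training_token_matches` are hypothetical, and the real service may validate differently (for example, it may also handle redirects or resolved IP addresses, which this sketch does not).

```python
import hmac
import os
from typing import FrozenSet, Optional
from urllib.parse import urlsplit

# Default taken from the environment-variable table above.
DEFAULT_ALLOWED_PDF_HOSTS = "minio,localhost,127.0.0.1"


def parse_allowed_hosts(raw: str) -> FrozenSet[str]:
    """Split a comma-separated host list, dropping blanks and normalising case."""
    return frozenset(h.strip().lower() for h in raw.split(",") if h.strip())


def is_allowed_pdf_url(url: str, allowed_hosts: FrozenSet[str]) -> bool:
    """Return True only for http(s) URLs whose host is on the allow-list.

    Hosts are matched literally, so a value of "*" never acts as a wildcard.
    """
    try:
        parts = urlsplit(url)
        host = parts.hostname  # lower-cased, without port or userinfo
    except ValueError:
        # Malformed URLs (e.g. unbalanced IPv6 brackets) are rejected.
        return False
    if parts.scheme not in ("http", "https"):
        return False
    return host is not None and host in allowed_hosts


def training_token_matches(presented: Optional[str], expected: str) -> bool:
    """Constant-time comparison of the X-Training-Token header value.

    An empty expected token rejects every request instead of accepting all.
    """
    if not expected or presented is None:
        return False
    return hmac.compare_digest(presented.encode(), expected.encode())


# Hypothetical wiring: read both settings once at startup.
ALLOWED_HOSTS = parse_allowed_hosts(
    os.environ.get("ALLOWED_PDF_HOSTS", DEFAULT_ALLOWED_PDF_HOSTS)
)
EXPECTED_TOKEN = os.environ.get("TRAINING_TOKEN", "")
```

Matching on `urlsplit(...).hostname` rather than on a substring of the URL means that a URL such as `http://minio@evil.example/a.pdf` is judged by its real host (`evil.example`) and rejected. Treating an empty `TRAINING_TOKEN` as "reject everything" is one way to make the "do not leave empty in production" rule fail closed.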