docs(legibility): write docs/GLOSSARY.md — DOC-3
Disambiguates all overloaded terms in the codebase: Person vs AppUser, Chronik (internal) vs Aktivität (user-facing), TranscriptionBlock polygon vs bounding box, DocumentVersion append-only convention, OcrJob lifecycle, SenderModel as persistent entity, Audit log DB-layer caveat, and more. Includes Pending Terms section for audit follow-ups (#388–#392). Refs #397 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
113
docs/GLOSSARY.md
Normal file
113
docs/GLOSSARY.md
Normal file
@@ -0,0 +1,113 @@
|
||||
# Familienarchiv — Glossary
|
||||
|
||||
Domain-specific and overloaded terms used in this codebase.
|
||||
Each entry: **Term** — definition (≤ 2 sentences). Where two terms are easily confused, a _Not to be confused with_ note follows.
|
||||
|
||||
For architecture context see [`docs/architecture/c4-diagrams.md`](architecture/c4-diagrams.md).
|
||||
For domain package structure see [`docs/ARCHITECTURE.md`](ARCHITECTURE.md) _(coming: DOC-2)_.
|
||||
|
||||
---
|
||||
|
||||
## Identity Terms
|
||||
|
||||
**AppUser** (`AppUser`) — a real person who can log into the system (a family member or administrator). `AppUser` records carry login credentials, group memberships, and notification history.
|
||||
_Not to be confused with [Person](#person-person)_ — an AppUser is never recorded as a document sender, receiver, or historical individual.
|
||||
|
||||
**Permission** — a discrete capability string assigned to a `UserGroup` (e.g. `READ_ALL`, `WRITE_ALL`, `ADMIN`, `ADMIN_USER`, `ADMIN_TAG`, `ADMIN_PERMISSION`). Enforced via the `@RequirePermission` AOP annotation on controller methods, checked at runtime by `PermissionAspect`; not via Spring Security's `@PreAuthorize`.
|
||||
|
||||
**Person** (`Person`) — a historical individual in the family archive (sender, receiver of letters, person mentioned in transcriptions). NEVER has a login account and NEVER appears as an `AppUser`.
|
||||
_Not to be confused with [AppUser](#appuser-appuser)_ — `Person` is a historical record; `AppUser` is someone who can log in today.
|
||||
|
||||
**PersonNameAlias** (`PersonNameAlias`) — an alternate or historical name form associated with a `Person` (e.g. maiden name, nickname, abbreviated form). Used to locate `Person` records during mass import via `PersonNameAliasType`.
|
||||
|
||||
**UserGroup** (`UserGroup`) — a named permission bundle assigned to one or more `AppUser`s. A user's effective permissions are the union of all permissions across all groups they belong to.
|
||||
|
||||
---
|
||||
|
||||
## Document-Related Terms
|
||||
|
||||
**Annotation** (`DocumentAnnotation`) — a free-form polygon or shape drawn over a document page image to highlight a region of interest. Always scoped to a specific page of a `Document`; stored as a polygon (JSONB).
|
||||
_See also [TranscriptionBlock](#transcriptionblock-transcriptionblock)._
|
||||
|
||||
**Comment** (`DocumentComment`, table `document_comments`) — a threaded discussion message attached to a `Document`. Always scoped to a `Document`; optionally further contextualized by a specific `DocumentAnnotation` or `TranscriptionBlock`.
|
||||
|
||||
**Document** (`Document`) — a single archival item (letter, postcard, photograph) with a file stored in MinIO/S3 and associated metadata (sender, receivers, date, tags, transcription blocks).
|
||||
|
||||
**DocumentVersion** (`DocumentVersion`) — an append-only snapshot of a `Document`'s metadata at a point in time. Append-only by convention; no consumer-facing create or update endpoint exists. The entity uses Lombok `@Data` (which generates setters), so immutability is enforced by application convention, not at the Java level.
|
||||
|
||||
**Tag** (`Tag`) — a hierarchical category that can be applied to `Document`s. Tags are self-referencing via a `parent_id` foreign key, forming a tree structure.
|
||||
|
||||
**TranscriptionBlock** (`TranscriptionBlock`) — a paragraph-level segment of a `Document`'s transcribed text, with a polygon region (stored as JSONB) identifying its position on the page. One document can have many blocks across multiple pages.
|
||||
_See also [Annotation](#annotation-documentannotation)._
|
||||
|
||||
---
|
||||
|
||||
## Workflow Terms
|
||||
|
||||
**DocumentStatus lifecycle** — the ordered states a `Document` moves through:
|
||||
`PLACEHOLDER → UPLOADED → TRANSCRIBED → REVIEWED → ARCHIVED`
|
||||
- `PLACEHOLDER`: created during mass import; no file attached yet.
|
||||
- `UPLOADED`: a file has been stored in MinIO/S3.
|
||||
- `TRANSCRIBED`: all transcription blocks have been marked done.
|
||||
- `REVIEWED`: a reviewer has approved the transcription.
|
||||
- `ARCHIVED`: the document is finalized and read-only.
|
||||
|
||||
**Mass import** — an asynchronous batch process (`MassImportService`) that reads an Excel or ODS file and creates `Person`s, `Tag`s, and `PLACEHOLDER` `Document`s in one shot. Only one import can run at a time (`IMPORT_ALREADY_RUNNING` error if attempted concurrently).
|
||||
|
||||
**Transcription queue** — the set of `Document`s and `TranscriptionBlock`s awaiting work, computed on-the-fly from `Document`/`Block` status. Three views: segmentation queue, transcription queue, ready-to-read queue. NOT a persistent entity — no `transcription_queues` table exists.
|
||||
_See also [DocumentStatus lifecycle](#documentstatus-lifecycle)._
|
||||
|
||||
---
|
||||
|
||||
## OCR-Specific Terms
|
||||
|
||||
**HTR** — Handwritten Text Recognition. Recognizes cursive and historical handwriting (contrasted with OCR for printed/typewritten text). The primary mode used for letters in this archive.
|
||||
|
||||
**Kurrent** — Old German cursive handwriting style, the primary historical script appearing in letters from the 1899–1950 period covered by this archive.
|
||||
|
||||
**OCR** — Optical Character Recognition. Recognizes printed or typewritten text. Used for typed documents; HTR is used for handwritten ones.
|
||||
|
||||
**OcrJob** (`OcrJob`, table `ocr_jobs`) — a first-class persistent entity tracking a batch OCR run across one or more documents (`OcrJobDocument`, table `ocr_job_documents`). Distinct from the concept of "running OCR on a single document." Lifecycle: `PENDING → RUNNING → DONE / FAILED` (see `OcrJobStatus`).
|
||||
|
||||
**SenderModel** (`SenderModel`, table `sender_models`) — a fine-tuned Kraken HTR model trained on a specific historical correspondent's handwriting. Both an OCR-service concept (the model weights) and a persistent entity linking a `Person` to the path of their trained model file.
|
||||
|
||||
**Sütterlin** — A specific standardized style of Kurrent taught in German schools from 1915 to 1941.
|
||||
|
||||
---
|
||||
|
||||
## Other Domain Terms
|
||||
|
||||
**Aktivität / Aktivitäten** `[user-facing]` — the family activity feed accessible at `/aktivitaeten`. Shows recent documents, transcriptions, comments, and Geschichten as a chronological timeline.
|
||||
_See also [Chronik](#chronik-internal)._
|
||||
|
||||
**Briefwechsel** `[user-facing]` — the bilateral conversation timeline between two `Person`s, derived from `Document` sender/receiver relationships. Accessible at `/briefwechsel`. Not a persistent entity — data is computed from existing `Document` records.
|
||||
_See also [Derived domain](#derived-domain)._
|
||||
|
||||
**Chronik** `[internal]` — the conceptual and code-level name for the unified activity feed (per ADR-003 `003-chronik-unified-activity-feed.md`). Used in code, architecture documents, and ADRs. The user-facing label for the same concept is [Aktivität](#aktivitat--aktivitaten-user-facing).
|
||||
|
||||
**Geschichte** (`Geschichte`) `[user-facing]` — a narrative story or article published in the archive, linking `Person`s and `Document`s. Lifecycle: `DRAFT → PUBLISHED` (see `GeschichteStatus`). DRAFT stories are hidden from users without the `BLOG_WRITE` permission.
|
||||
|
||||
**Notification** (`Notification`) — an in-app message delivered to an `AppUser`. No email or SMS delivery exists today. Delivered via Server-Sent Events (`SseEmitterRegistry`) and persisted in the `notifications` table.
|
||||
|
||||
**Audit log** (`AuditLog`, table `audit_log`) — an append-only event store recording domain-level activity (document edits, user actions, etc.). Append-only by application convention; a `REVOKE UPDATE, DELETE` is attempted at the DB layer (see migrations V46, V47) but is a no-op if the application role is the table owner in PostgreSQL. Do not rely on DB-enforced immutability — the constraint is application-layer only.
|
||||
|
||||
---
|
||||
|
||||
## Architectural Terms
|
||||
|
||||
**Cross-cutting** — code that lives in `lib/shared/` (frontend) or cross-domain packages (backend) because it has no entity of its own, no user-facing CRUD, AND is used by two or more domains OR is framework infrastructure (error handling, API client, i18n utilities).
|
||||
|
||||
**Derived domain** — a Tier-2 frontend domain that has its own UI but no backend entities of its own. Data is computed from Tier-1 domain records. Current derived domains: `conversation` (from `Document` sender/receivers) and `activity` (from audit, notifications, document events).
|
||||
_See also [Briefwechsel](#briefwechsel-user-facing)._
|
||||
|
||||
**Domain** — a Tier-1 bounded context with its own entities, controller, service, repository, and DTOs. Backend domains: `document`, `person`, `tag`, `user`, `geschichte`, `notification`, `ocr`, `audit`, `dashboard`. Frontend domains mirror this structure under `src/lib/`.
|
||||
|
||||
---
|
||||
|
||||
## Pending Terms
|
||||
|
||||
_Terms flagged as potentially ambiguous that have not yet been formally defined here. Add an entry above and remove it from this list when resolved._
|
||||
|
||||
- Terms surfaced by Epic 1 audit findings (#388–#392) — review audit reports under `docs/audits/` when available and add any term flagged as ambiguous.
|
||||
- `OcrBatchService` vs `OcrAsyncRunner` — both handle async OCR orchestration; their division of responsibility should be clarified here.
|
||||
- `Stammbaum` — the genealogy tree view; relationship to `PersonRelationship` entity.
|
||||
Reference in New Issue
Block a user