docs(db): add ER and ORM diagrams (PlantUML) #452
Closes #451.
Summary
- docs/architecture/db/db-relationships.puml: crow's-foot ER diagram showing all 30 tables and their FK connections, grouped into 7 domain packages (Auth, Documents, Persons, Tags, Transcription, OCR, Supporting). Left-to-right layout for readability.
- docs/architecture/db/db-orm.puml: full schema reference with every column, PostgreSQL type, and <<PK>>/<<FK>> markers for all 30 tables. Use this when mapping Java entities to database columns.
- docs/architecture/c4-diagrams.md: updated with a new Database section linking both diagrams and describing their intended use.
Schema snapshot
Both diagrams are derived from Flyway migrations V1–V60 (excluding V37 and V43, which were intentionally removed). Snapshot date: 2026-05-06.
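As a rough illustration of what the snapshot range covers, the sketch below enumerates the expected migration versions and reports any that are missing from a list of filenames. The filename pattern (`V<n>__description.sql`, Flyway's default naming) and the helper name `missing_versions` are assumptions; the PR does not show the actual migration files.

```python
import re

# Range stated in the PR description: V1-V60, with V37 and V43 removed.
REMOVED = {37, 43}
EXPECTED = [v for v in range(1, 61) if v not in REMOVED]

# Assumes Flyway's default versioned-migration naming, e.g. "V12__add_tags.sql".
VERSION_RE = re.compile(r"^V(\d+)__.+\.sql$")


def missing_versions(filenames):
    """Return the expected versions that have no matching migration file."""
    found = set()
    for name in filenames:
        match = VERSION_RE.match(name)
        if match:
            found.add(int(match.group(1)))
    return [v for v in EXPECTED if v not in found]
```

Calling `missing_versions(os.listdir(migration_dir))` with the project's migration directory would list any version the snapshot claims to cover but that is absent on disk.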
Open in VS Code with the PlantUML extension pointed at http://heim-nas:8500.
Test plan
- db-relationships.puml in VS Code with PlantUML extension: diagram renders without errors
- db-orm.puml in VS Code with PlantUML extension: diagram renders without errors
- c4-diagrams.md Database section links resolve correctly to the .puml files
🏗️ Markus Keller (@mkeller) — Senior Application Architect
Verdict: ✅ Approved
Solid documentation work. I checked the diagrams against the actual Flyway migration trail (V1–V60) and the structural choices are correct. A few observations.
What I Checked
- transcription_block_mentioned_persons.person_id is correctly modelled: no FK arrow in the relationship diagram, no <<FK>> in the ORM. The migration comment explains exactly why (lazy degradation on person delete). The diagram correctly reflects the deliberate design choice.
- ocr_jobs.created_by: also correctly shown as a plain UUID with no FK (matches V25, which uses no REFERENCES).
- sender_models ||--|| persons relationship uses ||--|| (one-to-one mandatory), which is correct given the UNIQUE NOT NULL REFERENCES in V40.
- tag }o--o| tag self-reference correctly shows the nullable parent hierarchy (V39).
Advisory
- geschichten table references author_id as nullable (}o--o|). This is correct per V58 (REFERENCES users(id) ON DELETE SET NULL). No change needed.
- left to right direction: good for overview readability given 30 tables.
- Consider a note in the db-relationships.puml header explaining that transcription_block_mentioned_persons.person_id intentionally has no FK (the ORM comment block is there, but the relationships file is silent on this). This is advisory, not a blocker.
- c4-diagrams.md "Schema as of Flyway V60" notation is the right way to communicate this is a snapshot. The warning comment in both files (⚠ This is a versioned snapshot) is appropriate.
No structural accuracy issues found. Architecture LGTM.
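For readers less familiar with the crow's-foot connectors discussed in this review, here is a minimal sketch that decodes a connector into the cardinality at each end. The symbol table follows PlantUML's entity-relationship notation; the function name `describe_connector` is hypothetical and not part of the PR.

```python
# Crow's-foot endpoint symbols as used in PlantUML entity-relationship diagrams.
LEFT_ENDS = {
    "|o": "zero or one",
    "||": "exactly one",
    "}o": "zero or many",
    "}|": "one or many",
}
RIGHT_ENDS = {
    "o|": "zero or one",
    "||": "exactly one",
    "o{": "zero or many",
    "|{": "one or many",
}


def describe_connector(connector):
    """Split a connector such as '}o--o|' into its two cardinalities."""
    for line in ("--", ".."):  # solid or dotted line between the two ends
        left, sep, right = connector.partition(line)
        if sep and left in LEFT_ENDS and right in RIGHT_ENDS:
            return LEFT_ENDS[left], RIGHT_ENDS[right]
    raise ValueError(f"unrecognised connector: {connector!r}")
```

Under this reading, `}o--o|` (used for the nullable references above) is "zero or many" on the left and "zero or one" on the right, while `||--||` is "exactly one" on both sides.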
👨💻 Felix Brandt (@felixbrandt) — Senior Fullstack Developer
Verdict: ✅ Approved
This is a documentation-only PR. I reviewed it from a code accuracy perspective — whether the diagram content reflects what the actual Java entities and Flyway migrations say.
What I Verified Against Migrations
I traced through V1–V60 (minus V37 and V43) and the schema in the diagrams is accurate:
- app_users: username correctly absent (dropped in V44). email NOT NULL UNIQUE correct. color NOT NULL correct (V47). notify_on_reply, notify_on_mention correct (V16). first_name, last_name VARCHAR(100) correct (V7).
- persons.first_name: shown without NOT NULL, which is correct; V22 dropped the NOT NULL constraint.
- transcription_blocks: source, reviewed correctly present (V26). text shown as plain TEXT (V31 made it nullable). The ORM diagram shows text : TEXT without NOT NULL, which is correct after V31.
- document_comments.block_id: correctly present (V20 added it).
- ocr_training_runs: person_id, cer, loss, accuracy, epochs all correctly present (V41, V32).
- document_annotations.polygon: correctly present (V23).
- document_annotations.file_hash VARCHAR(64): correctly present (V13 adds it to both documents and document_annotations).
Suggestions (not blockers)
- db-orm.puml comment header says text : TEXT for transcription_blocks but doesn't call out it's nullable post-V31. A "-- nullable since V31" inline comment would help the next developer who wonders why there's no NOT NULL. Advisory only.
Schema content is accurate. LGTM.
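The convention this review relies on (a column listed without NOT NULL is nullable) can be sketched as a small parser for attribute lines. The line format `name : TYPE [NOT NULL] [<<marker>>]` is inferred from the examples quoted in these reviews, not from the .puml files themselves, and `parse_column` is a hypothetical helper.

```python
from typing import NamedTuple


class Column(NamedTuple):
    name: str
    sql_type: str
    nullable: bool


def parse_column(line):
    """Parse one attribute line, e.g. 'text : TEXT' or 'revoked : BOOLEAN NOT NULL'."""
    name, _, rest = line.partition(":")
    # Drop stereotype markers such as <<FK>> or <<computed>> before reading the type.
    words = [w for w in rest.split() if not w.startswith("<<")]
    declaration = " ".join(words)
    # Assumed convention: no explicit NOT NULL means the column is nullable.
    nullable = "NOT NULL" not in declaration
    sql_type = declaration.replace("NOT NULL", "").strip()
    return Column(name.strip(), sql_type, nullable)
```

For example, `parse_column("text : TEXT")` reports the column as nullable, while `parse_column("password : VARCHAR(255) NOT NULL")` does not. Primary-key columns would need separate handling if the diagram does not spell out NOT NULL for them.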
🔒 Nora "NullX" Steiner — Application Security Engineer
Verdict: ✅ Approved
Documentation-only PR — no production code changes. My focus here is whether the diagrams expose any sensitive structure that could be useful to an attacker, and whether any security-relevant schema details are misrepresented.
What I Checked
Sensitive column visibility: The app_users entity in db-orm.puml lists password : VARCHAR(255) NOT NULL. This is documenting a schema fact, not exposing anything that isn't already in the codebase. For an internal family archive with private repository hosting (Gitea self-hosted), this is appropriate. The value is that new developers understand a BCrypt hash is stored here, not plaintext.
Auth-relevant tables accurately represented:
- invite_tokens.created_by: correctly shown as UUID <<FK>> with NOT NULL (per V45: NOT NULL REFERENCES users(id)). Important: revoked invites are blocked at the application layer, but the diagram shows the revoked : BOOLEAN NOT NULL field, which is correct.
- password_reset_tokens: token, expires_at, used all correctly present.
- audit_log: actor_id shown as nullable (}o--o|), correct per V46 (ON DELETE SET NULL for GDPR right-to-erasure). This is a security-relevant design choice that the diagram models accurately.
No misleading FK arrows on security-sensitive soft references:
- ocr_jobs.created_by is shown as a plain UUID without FK. Correct: this field has no referential constraint in V25.
Advisory
- app_users.password is listed as VARCHAR(255) NOT NULL. If this repo were ever made public, consider whether including column names of sensitive fields in documentation is desirable. For a private self-hosted family archive, this is not a concern today.
- audit_log append-only enforcement (REVOKE UPDATE/DELETE per V46/V47) is a schema-layer security control not visible in the diagram. This is advisory: diagrams don't need to document REVOKE grants, but the control exists and works.
No security concerns with the diagrams themselves. LGTM.
🧪 Sara Holt (@saraholt) — Senior QA Engineer
Verdict: ✅ Approved
Documentation-only PR — no test coverage required for PlantUML files. My review focuses on whether the diagrams can be used to drive test coverage decisions and whether the schema is accurate enough to trust for test data setup.
What I Checked
Schema accuracy for test fixtures: The ORM diagram is detailed enough to write accurate test factories. Key facts verified:
- documents.status: the CHECK constraint limits values to PLACEHOLDER | UPLOADED | TRANSCRIBED | REVIEWED | ARCHIVED. The ORM just shows status : VARCHAR(255) NOT NULL, which is sufficient for documentation purposes.
- Nullability is shown by convention (absence of NOT NULL means nullable), matching V31's nullable transcription_blocks.text.
- FKs with ON DELETE SET NULL are correctly shown as optional (}o--o|) rather than mandatory (}o--||).
Test plan in PR body: The test plan is minimal but appropriate for a docs-only PR.
Advisory
- Consider validating the diagrams against the live schema (\d table_name in psql). This would be the gold standard for accuracy validation. Not a blocker for a snapshot diagram, but worth considering for future updates.
LGTM for documentation purposes.
🚀 Tobias Wendt (@tobiwendt) — DevOps & Platform Engineer
Verdict: ✅ Approved
Documentation-only PR. No infrastructure changes. My check here is whether the docs are maintainable and whether the tooling assumptions are valid for our setup.
What I Checked
PlantUML server reference: Both files and the c4-diagrams.md entry point to http://heim-nas:8500 as the PlantUML rendering server. This matches the existing VS Code config already in the repo from PR #448 (the C4 diagram series). Consistent. Good.
No new services, volumes, or CI changes: Pure doc addition. Zero operational impact.
Snapshot model: The "⚠ This is a versioned snapshot. Update when the schema changes significantly." comment is pragmatic. Better than a stale auto-generated diagram that claims to be live. The snapshot date in the header (2026-05-06) gives readers a reference point.
Advisory
- docs/architecture/db/ directory is a new path. Confirm the VS Code workspace settings (.vscode/settings.json) PlantUML root paths cover this directory. Based on the C4 diagram PR, they should, since the server config is global, not path-specific. No action needed.
Infrastructure is unaffected. LGTM.
📋 Elicit — Requirements Engineer
Verdict: ✅ Approved
Documentation PR. My review checks whether the diagrams serve the information needs of future developers and whether the scope matches what was promised in issue #451.
Traceability to Issue #451
The PR closes #451. Both deliverables promised in the PR description are present:
- db-relationships.puml: ER overview diagram ✓
- db-orm.puml: full schema reference diagram ✓
- c4-diagrams.md updated with a Database section linking both ✓
Fitness for Purpose
Two-diagram strategy is well-chosen: Separating "overview for navigation" (db-relationships.puml) from "column-level reference" (db-orm.puml) matches two distinct use cases. These are genuinely different jobs-to-be-done and warrant separate artifacts.
Intended use is documented: The c4-diagrams.md section clearly distinguishes the two diagrams' purposes with "Start here for an overview" vs "Use this when mapping Java entities to database columns." Good discoverability.
Snapshot framing is honest: Calling out "Schema as of Flyway V60 (2026-05-06)" sets appropriate expectations. The "⚠ versioned snapshot" warning in the PUML headers is correct.
Advisory
- c4-diagrams.md now has a Database section at the bottom. If the document grows, a table of contents would help. Not needed now.
Requirements fulfilled. LGTM.
🎨 Leonie Voss (@leonievoss) — UI/UX Design Lead
Verdict: ✅ Approved
Documentation-only PR — PlantUML diagrams, no UI changes. My review is scoped to whether the documentation is usable by developers as a reference for UI/UX work.
What I Checked
Relevance to frontend development: The ORM diagram's column types and nullability information directly supports frontend work — knowing which fields are nullable helps developers decide when to show placeholder states vs. guaranteed content. For example:
- persons.first_name: nullable (correct, V22). The person display component needs a fallback when first_name is absent.
- documents.file_path: nullable (shown without NOT NULL). UI must handle the "no file yet" state (PLACEHOLDER status).
- documents.thumbnail_key, thumbnail_generated_at, thumbnail_aspect: all nullable. The document grid needs a fallback thumbnail state.
These are correctly represented and useful for frontend development decisions.
The search_vector : tsvector <<computed>> field in the documents ORM is correctly tagged as <<computed>>. Frontend developers should know this is not a field they set.
Advisory
LGTM.