docs(legibility): DOC-2 — write docs/ARCHITECTURE.md #441

Merged
marcel merged 3 commits from feat/issue-396-architecture into main 2026-05-06 07:31:02 +02:00
Owner

Summary

  • Adds docs/ARCHITECTURE.md — human-targeted architecture doc for a PM-with-CS reader who has read the README
  • Updates docs/architecture/c4-diagrams.md L2 diagram: adds ocr-service container, SSE notification arrow (backend → browser), and presigned-URL arrow (ocr → minio)

What's in ARCHITECTURE.md

  1. High-level diagram — links to updated c4-diagrams.md; calls out OCR network boundary and SSE path (two facts invisible in the diagram)
  2. Domain set — 7 Tier-1 domains (document, person, tag, user, geschichte, notification, ocr) and 2 Tier-2 derived domains (conversation, activity), each with entities owned, what it doesn't own, and cross-domain deps
  3. Cross-cutting layer — table of 7 backend packages (audit, config, dashboard, exception, filestorage, importing, security) + frontend shared/ members; includes admission criteria
  4. Stack-symmetry principle — backend/frontend naming rule; references #408 restructure
  5. Key architectural decisions — ADR-001 through ADR-006 summaries + layering rule + permission system (httpOnly cookie, CSRF rationale)
  6. Data-flow walkthroughs — document upload (9 steps) and transcription block autosave (7 steps)

Test plan

  • Render docs/ARCHITECTURE.md in Gitea — verify all section headings display correctly
  • Verify Mermaid L2 diagram in c4-diagrams.md renders without errors
  • Confirm all internal links (GLOSSARY.md, security-guide.md, adr/) resolve correctly

Closes #396

🤖 Generated with Claude Code

## Summary - Adds `docs/ARCHITECTURE.md` — human-targeted architecture doc for a PM-with-CS reader who has read the README - Updates `docs/architecture/c4-diagrams.md` L2 diagram: adds ocr-service container, SSE notification arrow (backend → browser), and presigned-URL arrow (ocr → minio) ## What's in ARCHITECTURE.md 1. **High-level diagram** — links to updated c4-diagrams.md; calls out OCR network boundary and SSE path (two facts invisible in the diagram) 2. **Domain set** — 7 Tier-1 domains (document, person, tag, user, geschichte, notification, ocr) and 2 Tier-2 derived domains (conversation, activity), each with entities owned, what it doesn't own, and cross-domain deps 3. **Cross-cutting layer** — table of 7 backend packages (audit, config, dashboard, exception, filestorage, importing, security) + frontend `shared/` members; includes admission criteria 4. **Stack-symmetry principle** — backend/frontend naming rule; references #408 restructure 5. **Key architectural decisions** — ADR-001 through ADR-006 summaries + layering rule + permission system (httpOnly cookie, CSRF rationale) 6. **Data-flow walkthroughs** — document upload (9 steps) and transcription block autosave (7 steps) ## Test plan - [ ] Render `docs/ARCHITECTURE.md` in Gitea — verify all section headings display correctly - [ ] Verify Mermaid L2 diagram in `c4-diagrams.md` renders without errors - [ ] Confirm all internal links (GLOSSARY.md, security-guide.md, adr/) resolve correctly Closes #396 🤖 Generated with [Claude Code](https://claude.com/claude-code)
marcel added 2 commits 2026-05-05 22:50:00 +02:00
Refs #396
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
docs(legibility): write docs/ARCHITECTURE.md
Some checks failed
CI / Backend Unit Tests (pull_request) Failing after 3m17s
CI / Unit & Component Tests (pull_request) Failing after 3m28s
CI / OCR Service Tests (pull_request) Successful in 34s
CI / Backend Unit Tests (push) Has been cancelled
CI / Unit & Component Tests (push) Has been cancelled
CI / OCR Service Tests (push) Has been cancelled
dc6910a31e
Human-targeted architecture doc: high-level diagram, 7 Tier-1 + 2
Tier-2 domains, cross-cutting layer, stack-symmetry principle, 6 ADR
summaries, layering rule, permission system, and two data-flow
walkthroughs (document upload, transcription block autosave).

Closes #396
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Author
Owner

🏗️ Markus Keller — Senior Application Architect

Verdict: ⚠️ Approved with concerns

The doc is a genuinely useful artefact. The domain taxonomy, cross-cutting-layer table, and stack-symmetry principle are accurate and add value. The ADR summaries are concise and reference the correct files. A few architectural inaccuracies and one omission need fixing before this becomes the canonical reference.


Blockers

1. Permission table omits ANNOTATE_ALL (Section 5 — Permission system)

The doc lists READ_ALL, WRITE_ALL, ADMIN, ADMIN_USER, ADMIN_TAG, ADMIN_PERMISSION, BLOG_WRITE but the actual enum (Permission.java) is:

READ_ALL, WRITE_ALL, ANNOTATE_ALL, BLOG_WRITE, ADMIN, ADMIN_USER, ADMIN_TAG, ADMIN_PERMISSION

ANNOTATE_ALL is actively used in production code (UserSearchController, NotificationController). A reader implementing a new controller against this doc would omit a required permission.

2. Document upload data flow uses wrong HTTP method and misleading auth description (Section 6)

"SvelteKit server action sends PUT /api/documents/{id} as multipart/form-data with the Authorization header."

Two problems:

  • The described flow is an edit, not an initial upload. A fresh upload goes through POST /api/documents (as the @PostMapping on DocumentController shows). PUT /{id} is the update path.
  • The frontend does not inject an Authorization header in the action; the server hook (hooks.server.ts) injects it from the auth_token cookie transparently. The SvelteKit form action is unaware of this. Saying "the action sends ... with the Authorization header" conflates the hooks layer with the form action layer and will mislead any developer tracing that path.

3. Transcription block autosave uses PUT, not PATCH (Section 6)

"SvelteKit sends PATCH /api/transcription/blocks/{id}"

The actual HTTP verb in saveBlockWithConflictRetry.ts is PUT. The backend controller registers PUT /{blockId} (TranscriptionBlockController). PATCH vs PUT is a meaningful semantic difference in REST documentation.

The route is also PUT /api/documents/{documentId}/transcription-blocks/{blockId}, not /api/transcription/blocks/{id}.


Suggestions

4. geschichte cross-domain deps understated

The doc says Geschichte cross-domain deps are person and document (for "linked entities in the story body"). Inspecting GeschichteService shows filtering by personIds and documentId, and GeschichteStatus has only DRAFT and PUBLISHED — the lifecycle claim ("DRAFT → PUBLISHED") is correct. However the description "linked entities in the story body" overstates the coupling — the links are filter parameters, not body-embedded references. Worth tightening to "filtered by associated persons and documents."

5. conversation Tier-2 domain — route is correct, source description could be more precise

"The DocumentRepository bidirectional query is the only data source" — this is accurate. Good.

6. useBlockAutoSave — the doc calls it a "hook" but it exports createBlockAutoSave()

"the frontend's useBlockAutoSave hook fires"

The file is useBlockAutoSave.svelte and the export is createBlockAutoSave(). Calling it a "hook" (React vocabulary) is imprecise in a Svelte 5 codebase. "factory function" or "reactive utility" is more accurate. Low severity, but the doc targets developers who will grep for the name.

7. activity Tier-2 domain — DashboardService composes the feed, not SvelteKit alone

"computed on-the-fly by DashboardService and composed in the SvelteKit load function"

This is accurate. The aktivitaeten/+page.server.ts calls /api/dashboard/activity (DashboardService) and /api/notifications in parallel. Good description.


What's Well Done

  • The cross-cutting layer admission criteria table is clear and correct. The audit consumed-by-5+-domains claim is verifiable.
  • ADR existence check passes: all six ADR files exist in docs/adr/.
  • Stack-symmetry principle with the May 2026 restructure reference is a useful historical marker.
  • OCR network boundary explanation (no external port, presigned URL bypass) is architecturally important and correctly described.
## 🏗️ Markus Keller — Senior Application Architect **Verdict: ⚠️ Approved with concerns** The doc is a genuinely useful artefact. The domain taxonomy, cross-cutting-layer table, and stack-symmetry principle are accurate and add value. The ADR summaries are concise and reference the correct files. A few architectural inaccuracies and one omission need fixing before this becomes the canonical reference. --- ### Blockers **1. Permission table omits `ANNOTATE_ALL` (Section 5 — Permission system)** The doc lists `READ_ALL, WRITE_ALL, ADMIN, ADMIN_USER, ADMIN_TAG, ADMIN_PERMISSION, BLOG_WRITE` but the actual enum (`Permission.java`) is: ``` READ_ALL, WRITE_ALL, ANNOTATE_ALL, BLOG_WRITE, ADMIN, ADMIN_USER, ADMIN_TAG, ADMIN_PERMISSION ``` `ANNOTATE_ALL` is actively used in production code (`UserSearchController`, `NotificationController`). A reader implementing a new controller against this doc would omit a required permission. **2. Document upload data flow uses wrong HTTP method and misleading auth description (Section 6)** > "SvelteKit server action sends `PUT /api/documents/{id}` as `multipart/form-data` with the `Authorization` header." Two problems: - The described flow is an *edit*, not an *initial upload*. A fresh upload goes through `POST /api/documents` (as the `@PostMapping` on `DocumentController` shows). `PUT /{id}` is the *update* path. - The frontend does not inject an `Authorization` header in the action; the server hook (`hooks.server.ts`) injects it from the `auth_token` cookie transparently. The SvelteKit form action is unaware of this. Saying "the action sends ... with the Authorization header" conflates the hooks layer with the form action layer and will mislead any developer tracing that path. **3. Transcription block autosave uses `PUT`, not `PATCH` (Section 6)** > "SvelteKit sends `PATCH /api/transcription/blocks/{id}`" The actual HTTP verb in `saveBlockWithConflictRetry.ts` is `PUT`. The backend controller registers `PUT /{blockId}` (`TranscriptionBlockController`). PATCH vs PUT is a meaningful semantic difference in REST documentation. The route is also `PUT /api/documents/{documentId}/transcription-blocks/{blockId}`, not `/api/transcription/blocks/{id}`. --- ### Suggestions **4. `geschichte` cross-domain deps understated** The doc says `Geschichte` cross-domain deps are `person` and `document` (for "linked entities in the story body"). Inspecting `GeschichteService` shows filtering by `personIds` and `documentId`, and `GeschichteStatus` has only `DRAFT` and `PUBLISHED` — the lifecycle claim ("DRAFT → PUBLISHED") is correct. However the description "linked entities in the story body" overstates the coupling — the links are filter parameters, not body-embedded references. Worth tightening to "filtered by associated persons and documents." **5. `conversation` Tier-2 domain — route is correct, source description could be more precise** "The `DocumentRepository` bidirectional query is the only data source" — this is accurate. Good. **6. `useBlockAutoSave` — the doc calls it a "hook" but it exports `createBlockAutoSave()`** > "the frontend's `useBlockAutoSave` hook fires" The file is `useBlockAutoSave.svelte` and the export is `createBlockAutoSave()`. Calling it a "hook" (React vocabulary) is imprecise in a Svelte 5 codebase. "factory function" or "reactive utility" is more accurate. Low severity, but the doc targets developers who will grep for the name. **7. `activity` Tier-2 domain — `DashboardService` composes the feed, not SvelteKit alone** > "computed on-the-fly by `DashboardService` and composed in the SvelteKit load function" This is accurate. The `aktivitaeten/+page.server.ts` calls `/api/dashboard/activity` (DashboardService) and `/api/notifications` in parallel. Good description. --- ### What's Well Done - The cross-cutting layer admission criteria table is clear and correct. The `audit` consumed-by-5+-domains claim is verifiable. - ADR existence check passes: all six ADR files exist in `docs/adr/`. - Stack-symmetry principle with the May 2026 restructure reference is a useful historical marker. - OCR network boundary explanation (no external port, presigned URL bypass) is architecturally important and correctly described.
Author
Owner

👨‍💻 Felix Brandt — Senior Fullstack Developer

Verdict: ⚠️ Approved with concerns

Documentation PRs deserve the same accuracy bar as code PRs — a wrong API path in a doc is a bug that wastes a real developer's time. I verified claims against the codebase. Most of the doc is accurate; three concrete factual errors need fixing, and one naming choice needs a note.


Blockers

1. Wrong HTTP verb in the transcription block autosave walkthrough

Section 6, autosave step 2 says:

"SvelteKit sends PATCH /api/transcription/blocks/{id}"

saveBlockWithConflictRetry.ts line 40:

method: 'PUT',

The actual path is also PUT /api/documents/{documentId}/transcription-blocks/{blockId}. Both the verb and the path are wrong. A developer tracing a failing request won't find PATCH /api/transcription/blocks/ anywhere in the codebase.

2. useBlockAutoSave is not a hook — it exports createBlockAutoSave()

Step 1 of the autosave flow:

"the frontend's useBlockAutoSave hook fires after a debounce interval"

The Svelte 5 factory function pattern in this codebase is createBlockAutoSave() (from TranscriptionEditView.svelte import). Calling it a "hook" imports React mental model. The naming matters because Svelte 5 uses runes, not hooks — and a reader might search the codebase for a "hook" API that doesn't exist under that name.

3. Section 2 — person domain: PersonNameAlias is the entity, not PersonNameAlias

Actually this is correctly stated. The entity PersonNameAlias and PersonRelationship both exist in backend/src/main/java/.../person/. No issue here. ✓


Suggestions

4. Geschichte cross-domain deps reference

Section 2 says Geschichte cross-domain deps include document for "linked entities in the story body." Looking at GeschichteService, the document dependency is a filter parameter (documentId), not structural body embedding. The distinction matters for someone deciding whether to add a document-deletion cascade. Worth a one-word fix: "filtered by associated persons and documents" rather than "linked entities in the story body."

5. Section 3 — shared/ member api.server.ts path

The doc says:

api.server.ts (typed openapi-fetch client factory)

The actual file is frontend/src/lib/shared/api.server.ts. Accurate. ✓

6. Geschichte lifecycle — missing states?

Doc says "DRAFT → PUBLISHED lifecycle." GeschichteStatus only has DRAFT and PUBLISHED. Accurate. ✓


What's Well Done

  • The data flow walkthroughs are the most valuable part of this doc. The 9-step document upload is accurate (AuditService is called in the same transaction, FileService generates the S3 key, etc.).
  • The saveBlockWithConflictRetry helper being called out by name is helpful — it's not obvious from the code alone that there's a retry loop.
  • The @mention sidecar rewrite in step 6 of the autosave flow is accurate per PersonMentionPropagationListener (ADR-006).
  • Domain entity names are verified correct: OcrJob, OcrJobDocument, SenderModel, PersonNameAlias, PersonRelationship all exist.
  • shared/discussion/ exists and contains CommentThread.svelte, MentionEditor.svelte as described.
## 👨‍💻 Felix Brandt — Senior Fullstack Developer **Verdict: ⚠️ Approved with concerns** Documentation PRs deserve the same accuracy bar as code PRs — a wrong API path in a doc is a bug that wastes a real developer's time. I verified claims against the codebase. Most of the doc is accurate; three concrete factual errors need fixing, and one naming choice needs a note. --- ### Blockers **1. Wrong HTTP verb in the transcription block autosave walkthrough** Section 6, autosave step 2 says: > "SvelteKit sends `PATCH /api/transcription/blocks/{id}`" `saveBlockWithConflictRetry.ts` line 40: ```typescript method: 'PUT', ``` The actual path is also `PUT /api/documents/{documentId}/transcription-blocks/{blockId}`. Both the verb and the path are wrong. A developer tracing a failing request won't find `PATCH /api/transcription/blocks/` anywhere in the codebase. **2. `useBlockAutoSave` is not a hook — it exports `createBlockAutoSave()`** Step 1 of the autosave flow: > "the frontend's `useBlockAutoSave` hook fires after a debounce interval" The Svelte 5 factory function pattern in this codebase is `createBlockAutoSave()` (from `TranscriptionEditView.svelte` import). Calling it a "hook" imports React mental model. The naming matters because Svelte 5 uses runes, not hooks — and a reader might search the codebase for a "hook" API that doesn't exist under that name. **3. Section 2 — `person` domain: `PersonNameAlias` is the entity, not `PersonNameAlias`** Actually this is correctly stated. The entity `PersonNameAlias` and `PersonRelationship` both exist in `backend/src/main/java/.../person/`. No issue here. ✓ --- ### Suggestions **4. `Geschichte` cross-domain deps reference** Section 2 says Geschichte cross-domain deps include `document` for "linked entities in the story body." Looking at `GeschichteService`, the `document` dependency is a filter parameter (`documentId`), not structural body embedding. The distinction matters for someone deciding whether to add a document-deletion cascade. Worth a one-word fix: "filtered by associated persons and documents" rather than "linked entities in the story body." **5. Section 3 — `shared/` member `api.server.ts` path** The doc says: > `api.server.ts` (typed openapi-fetch client factory) The actual file is `frontend/src/lib/shared/api.server.ts`. Accurate. ✓ **6. `Geschichte` lifecycle — missing states?** Doc says "DRAFT → PUBLISHED lifecycle." `GeschichteStatus` only has `DRAFT` and `PUBLISHED`. Accurate. ✓ --- ### What's Well Done - The data flow walkthroughs are the most valuable part of this doc. The 9-step document upload is accurate (AuditService is called in the same transaction, FileService generates the S3 key, etc.). - The `saveBlockWithConflictRetry` helper being called out by name is helpful — it's not obvious from the code alone that there's a retry loop. - The `@mention` sidecar rewrite in step 6 of the autosave flow is accurate per `PersonMentionPropagationListener` (ADR-006). - Domain entity names are verified correct: `OcrJob`, `OcrJobDocument`, `SenderModel`, `PersonNameAlias`, `PersonRelationship` all exist. - `shared/discussion/` exists and contains `CommentThread.svelte`, `MentionEditor.svelte` as described.
Author
Owner

🔒 Nora "NullX" Steiner — Application Security Engineer

Verdict: Approved

This is a documentation-only PR — no production code, no security controls changed. My review focuses on whether the security claims in the doc are accurate, whether the documented threat model is defensible, and whether the doc inadvertently leaks information that should stay internal.


Security Accuracy Check

Session / auth description (Section 5 — Permission system)

"Sessions use a Base64-encoded Basic Auth token stored in an httpOnly, SameSite=strict cookie (auth_token, maxAge=86400 s). CSRF protection is disabled because this cookie configuration structurally prevents cross-origin credential theft."

This is correct and the rationale is sound. hooks.server.ts confirms the auth_token cookie is read server-side and injected as an Authorization header — the browser never sends the credential as a header, so CSRF token-based attacks can't replay it. The SameSite=strict claim should be verified against the actual cookie options in AuthController, but the threat model is correctly stated.

OCR network boundary (Section 1)

"the OCR service has no external port — it is reachable only on the internal Docker Compose network"

Correct. This is an important security property worth documenting. If someone adds a ports: mapping to the OCR service in future, this statement makes the security regression visible.

Permission enum omission

Section 5 omits ANNOTATE_ALL from the permission list. This is a documentation bug (caught by @mkeller) but has a security implication: a reader writing a new controller endpoint for annotation operations might implement it without @RequirePermission(Permission.ANNOTATE_ALL) because they don't know the permission exists. The fix is straightforward — add ANNOTATE_ALL to the enum list in Section 5.

No credential leakage

The doc references credential systems (Basic Auth token, httpOnly cookie) but contains no actual credentials, keys, or secrets. No information disclosure risk.

ADR-006 event description

The doc says:

"via Spring ApplicationEventPublisher + @EventListener @Transactional to avoid a circular dependency"

The ADR confirms this. The synchronous-within-transaction design means the rename and mention rewrite are atomic — a partial rename that leaves stale @OldName references in blocks cannot persist. This is correct and the security implication (data integrity) is correctly handled.


What's Missing (non-blocking for docs, worth a follow-up issue)

The doc has a link to docs/security-guide.md but does not summarize the threat model anywhere in the architecture doc itself. For a PM-with-CS reader, knowing "the threat model is: authenticated family members, no multi-tenancy" would help them reason about decisions like "why is there no row-level security?" A single sentence in Section 5 would close this gap.


LGTM on security accuracy

No new attack surface introduced. Security claims are accurate. ANNOTATE_ALL omission should be fixed before this doc is used as the canonical reference.

## 🔒 Nora "NullX" Steiner — Application Security Engineer **Verdict: ✅ Approved** This is a documentation-only PR — no production code, no security controls changed. My review focuses on whether the security claims in the doc are accurate, whether the documented threat model is defensible, and whether the doc inadvertently leaks information that should stay internal. --- ### Security Accuracy Check **Session / auth description (Section 5 — Permission system)** > "Sessions use a Base64-encoded Basic Auth token stored in an `httpOnly`, `SameSite=strict` cookie (`auth_token`, maxAge=86400 s). CSRF protection is disabled because this cookie configuration structurally prevents cross-origin credential theft." This is correct and the rationale is sound. `hooks.server.ts` confirms the `auth_token` cookie is read server-side and injected as an `Authorization` header — the browser never sends the credential as a header, so CSRF token-based attacks can't replay it. The `SameSite=strict` claim should be verified against the actual cookie options in `AuthController`, but the threat model is correctly stated. **OCR network boundary (Section 1)** > "the OCR service has no external port — it is reachable only on the internal Docker Compose network" Correct. This is an important security property worth documenting. If someone adds a `ports:` mapping to the OCR service in future, this statement makes the security regression visible. **Permission enum omission** Section 5 omits `ANNOTATE_ALL` from the permission list. This is a documentation bug (caught by @mkeller) but has a security implication: a reader writing a new controller endpoint for annotation operations might implement it without `@RequirePermission(Permission.ANNOTATE_ALL)` because they don't know the permission exists. The fix is straightforward — add `ANNOTATE_ALL` to the enum list in Section 5. **No credential leakage** The doc references credential systems (Basic Auth token, httpOnly cookie) but contains no actual credentials, keys, or secrets. No information disclosure risk. **ADR-006 event description** The doc says: > "via Spring `ApplicationEventPublisher` + `@EventListener @Transactional` to avoid a circular dependency" The ADR confirms this. The synchronous-within-transaction design means the rename and mention rewrite are atomic — a partial rename that leaves stale `@OldName` references in blocks cannot persist. This is correct and the security implication (data integrity) is correctly handled. --- ### What's Missing (non-blocking for docs, worth a follow-up issue) The doc has a link to `docs/security-guide.md` but does not summarize the threat model anywhere in the architecture doc itself. For a PM-with-CS reader, knowing "the threat model is: authenticated family members, no multi-tenancy" would help them reason about decisions like "why is there no row-level security?" A single sentence in Section 5 would close this gap. --- ### LGTM on security accuracy No new attack surface introduced. Security claims are accurate. `ANNOTATE_ALL` omission should be fixed before this doc is used as the canonical reference.
Author
Owner

🧪 Sara Holt — QA Engineer & Test Strategist

Verdict: Approved

This is a documentation PR — no test changes, no code paths to exercise. My review checks whether the documented data flows are testable as described and whether the doc creates any false confidence about what's verified.


Testability of Data Flow Claims

Document upload walkthrough (Section 6, step 8)

"AuditService writes an UPLOADED event to audit_log in the same transaction."

This claim about transactional atomicity is testable and important. If there's no integration test verifying that a failed upload rolls back the audit entry, this is currently untested business logic documented as if it's reliable. Worth checking whether DocumentService integration tests cover the case where FileService.uploadFile() throws mid-transaction. This is a follow-up issue, not a blocker for this PR.

Autosave conflict retry (Section 6, steps 4)

"the backend returns 409 Conflict; the frontend's saveBlockWithConflictRetry helper re-fetches and retries."

saveBlockWithConflictRetry.spec.ts exists — good. This flow has test coverage.

SSE notifications (Section 6, step 7)

"SseEmitterRegistry broadcasts the notification over the open SSE connection"

SSE notification delivery is notoriously hard to test deterministically. The E2E test suite (checked from test-results/) covers the notification bell. Whether there's a specific test for the SSE broadcast path after a mention in a transcription block is a gap worth investigating — but not blocking this PR.


Documentation Accuracy Concerns Relevant to QA

Wrong HTTP verb/path in autosave walkthrough

PATCH vs PUT is not just a semantic issue — if a QA engineer writes a MockMvc test based on this doc, they will write mockMvc.perform(patch(...)) instead of mockMvc.perform(put(...)) and get a 405 Method Not Allowed wondering why their test fails. The fix is factual accuracy (caught by @mkeller and @felixbrandt).

useBlockAutoSave described as a "hook"

A test engineer looking for useBlockAutoSave as a hook import will find createBlockAutoSave() instead. The .svelte.test.ts file does exist and imports the factory function correctly. The doc name discrepancy is minor for QA but worth fixing for consistency.


What's Well Done

  • The two data flow walkthroughs are exactly the kind of documentation QA needs to write meaningful integration tests. A numbered walkthrough makes it straightforward to write "should_complete_upload_in_9_steps" style test scaffolding.
  • Calling out the @Version optimistic lock in the autosave flow (step 4) is valuable — this is exactly the kind of race condition that needs explicit test coverage.
  • saveBlockWithConflictRetry.spec.ts existing is a positive signal that the retry logic is tested.
  • Linking to ADRs from the architecture doc means QA can find the rationale for edge-case behavior without asking.
## 🧪 Sara Holt — QA Engineer & Test Strategist **Verdict: ✅ Approved** This is a documentation PR — no test changes, no code paths to exercise. My review checks whether the documented data flows are testable as described and whether the doc creates any false confidence about what's verified. --- ### Testability of Data Flow Claims **Document upload walkthrough (Section 6, step 8)** > "AuditService writes an `UPLOADED` event to `audit_log` in the same transaction." This claim about transactional atomicity is testable and important. If there's no integration test verifying that a failed upload rolls back the audit entry, this is currently untested business logic documented as if it's reliable. Worth checking whether `DocumentService` integration tests cover the case where `FileService.uploadFile()` throws mid-transaction. This is a follow-up issue, not a blocker for this PR. **Autosave conflict retry (Section 6, steps 4)** > "the backend returns `409 Conflict`; the frontend's `saveBlockWithConflictRetry` helper re-fetches and retries." `saveBlockWithConflictRetry.spec.ts` exists — good. This flow has test coverage. **SSE notifications (Section 6, step 7)** > "`SseEmitterRegistry` broadcasts the notification over the open SSE connection" SSE notification delivery is notoriously hard to test deterministically. The E2E test suite (checked from `test-results/`) covers the notification bell. Whether there's a specific test for the SSE broadcast path after a mention in a transcription block is a gap worth investigating — but not blocking this PR. --- ### Documentation Accuracy Concerns Relevant to QA **Wrong HTTP verb/path in autosave walkthrough** PATCH vs PUT is not just a semantic issue — if a QA engineer writes a MockMvc test based on this doc, they will write `mockMvc.perform(patch(...))` instead of `mockMvc.perform(put(...))` and get a `405 Method Not Allowed` wondering why their test fails. The fix is factual accuracy (caught by @mkeller and @felixbrandt). **`useBlockAutoSave` described as a "hook"** A test engineer looking for `useBlockAutoSave` as a hook import will find `createBlockAutoSave()` instead. The `.svelte.test.ts` file does exist and imports the factory function correctly. The doc name discrepancy is minor for QA but worth fixing for consistency. --- ### What's Well Done - The two data flow walkthroughs are exactly the kind of documentation QA needs to write meaningful integration tests. A numbered walkthrough makes it straightforward to write "should_complete_upload_in_9_steps" style test scaffolding. - Calling out the `@Version` optimistic lock in the autosave flow (step 4) is valuable — this is exactly the kind of race condition that needs explicit test coverage. - `saveBlockWithConflictRetry.spec.ts` existing is a positive signal that the retry logic is tested. - Linking to ADRs from the architecture doc means QA can find the rationale for edge-case behavior without asking.
Author
Owner

🖥️ Tobias Wendt — DevOps & Platform Engineer

Verdict: Approved

Documentation PR only — no infrastructure files changed. I checked the infrastructure claims in the doc against what I know about the actual Docker Compose setup and deployment topology.


Infrastructure Accuracy Check

C4 L2 diagram updates

The diff adds the ocr-service container correctly: Python FastAPI / port 8000 and explicitly notes it is reachable only on the internal Docker Compose network with no external port. This is accurate and important to document — it prevents future maintainers from accidentally exposing it.

The new Rel(ocr, storage, "Fetches PDF via presigned URL", "HTTP / S3 presigned") arrow is correct. The OCR service fetches directly from MinIO using a presigned URL the backend generates, meaning PDF bytes don't pass through the JVM. Worth having in the diagram.

The Rel(backend, user, "SSE notifications (server-sent events)", "HTTP / SSE — direct backend-to-browser") is accurate. The SseEmitterRegistry holds emitters per user, and the SSE connection goes directly from port 8080 to the browser (proxied through Caddy/SvelteKit layer in production, but the logical connection is backend → browser).

ARCHITECTURE.md Section 1 — OCR boundary description

"the OCR service has no external port — it is reachable only on the internal Docker Compose network"

This is correct per the current docker-compose.yml. Good to have this called out explicitly — it's a security property of the deployment topology.

No infrastructure files modified

The PR touches only Markdown files. No docker-compose.yml, CI workflow, or deployment config was changed. Nothing for me to flag from an infrastructure change perspective.


Minor Notes

Port 8000 for OCR service — confirmed in the diagram description. The backend calls it via RestClient on the internal network name, not on localhost:8000. The diagram doesn't show the internal service name but that's fine at L2.

"six deployable units" — Section 1 says "six deployable units" but I count: frontend, backend, ocr-service, PostgreSQL, MinIO, mc-helper = 6. Correct.


LGTM from infrastructure perspective

The diagram additions are accurate. The network isolation claims are correct. No infrastructure regressions introduced.

## 🖥️ Tobias Wendt — DevOps & Platform Engineer **Verdict: ✅ Approved** Documentation PR only — no infrastructure files changed. I checked the infrastructure claims in the doc against what I know about the actual Docker Compose setup and deployment topology. --- ### Infrastructure Accuracy Check **C4 L2 diagram updates** The diff adds the `ocr-service` container correctly: `Python FastAPI / port 8000` and explicitly notes it is reachable only on the internal Docker Compose network with no external port. This is accurate and important to document — it prevents future maintainers from accidentally exposing it. The new `Rel(ocr, storage, "Fetches PDF via presigned URL", "HTTP / S3 presigned")` arrow is correct. The OCR service fetches directly from MinIO using a presigned URL the backend generates, meaning PDF bytes don't pass through the JVM. Worth having in the diagram. The `Rel(backend, user, "SSE notifications (server-sent events)", "HTTP / SSE — direct backend-to-browser")` is accurate. The `SseEmitterRegistry` holds emitters per user, and the SSE connection goes directly from port 8080 to the browser (proxied through Caddy/SvelteKit layer in production, but the logical connection is backend → browser). **`ARCHITECTURE.md` Section 1 — OCR boundary description** > "the OCR service has no external port — it is reachable only on the internal Docker Compose network" This is correct per the current `docker-compose.yml`. Good to have this called out explicitly — it's a security property of the deployment topology. **No infrastructure files modified** The PR touches only Markdown files. No `docker-compose.yml`, CI workflow, or deployment config was changed. Nothing for me to flag from an infrastructure change perspective. --- ### Minor Notes **Port 8000 for OCR service** — confirmed in the diagram description. The backend calls it via `RestClient` on the internal network name, not on `localhost:8000`. The diagram doesn't show the internal service name but that's fine at L2. **"six deployable units"** — Section 1 says "six deployable units" but I count: frontend, backend, ocr-service, PostgreSQL, MinIO, mc-helper = 6. Correct. --- ### LGTM from infrastructure perspective The diagram additions are accurate. The network isolation claims are correct. No infrastructure regressions introduced.
Author
Owner

📋 Elicit — Requirements Engineer & Business Analyst

Verdict: Approved

From a requirements perspective, this doc does exactly what its stated goal claims: it gives a PM-with-CS background reader an accurate mental model in one read. I reviewed it against that intent and against the actual system to check for gaps and ambiguities.


What the Doc Does Well

Target reader and goal are explicit and measurable

"Target reader: a PM-with-CS background who has read the README. Goal: accurate mental model after one read — enough to sketch the system on a whiteboard."

This is a well-formed non-functional requirement for a documentation artefact. It gives a clear acceptance criterion: can the target reader sketch the system after one read?

Domain ownership statements are testable

Each domain description states what it owns AND what it explicitly does not own. This is the most valuable pattern in the document for requirements work:

  • document — does NOT own persons or tags (references them by ID)
  • person — does NOT own AppUser (login accounts are a separate domain)
  • user — does NOT own Person records

These "does not own" statements prevent future requirements from creating illegal cross-domain assumptions.

Admission criteria for the cross-cutting layer

The table in Section 3 includes a column for admission criteria. This is requirements-engineering gold — it makes the structural constraint explicit so new features don't accidentally get placed in the wrong layer.


Gaps and Ambiguities (non-blocking)

1. The stammbaum route is not mentioned

frontend/src/routes/stammbaum/ exists in the codebase. The doc doesn't acknowledge it as a domain or derived domain. If it's a UI view over person data (family tree), it should appear as a Tier-2 derived domain alongside conversation and activity. If it's intentionally excluded, the doc should say why (e.g., not yet implemented). Leaving it unmentioned creates a gap for a PM reading this as the complete domain picture.

2. The doc does not address the "Users ≠ historical Persons" separation explicitly

The CLAUDE.md memory notes: "historical Persons never have AppUser accounts." This is a fundamental product constraint that affects feature requirements (e.g., "can a Person log in?" — no). Section 2 mentions it for the person domain ("Does NOT own AppUser") but doesn't state it as a product-level constraint accessible to a PM. A single sentence in the intro or glossary reference would close this gap.

3. "Family stories" for geschichte — no definition of who can publish

Section 2 says Geschichte has a "DRAFT → PUBLISHED lifecycle." A PM reader would naturally ask: "who can publish?" The doc doesn't mention BLOG_WRITE permission in this context (it appears only in Section 5's permission table). Cross-referencing would make the publish gate discoverable without reading two sections.


Overall Assessment

The document is well-scoped, honest about what it covers ("For low-level ADR details, see..."), and structured to match how a PM thinks (domains → cross-cutting → decisions → flows). The factual errors flagged by @mkeller and @felixbrandt (wrong HTTP verb, missing permission) should be fixed, but the structural quality is high.

## 📋 Elicit — Requirements Engineer & Business Analyst **Verdict: ✅ Approved** From a requirements perspective, this doc does exactly what its stated goal claims: it gives a PM-with-CS background reader an accurate mental model in one read. I reviewed it against that intent and against the actual system to check for gaps and ambiguities. --- ### What the Doc Does Well **Target reader and goal are explicit and measurable** > "Target reader: a PM-with-CS background who has read the README. Goal: accurate mental model after one read — enough to sketch the system on a whiteboard." This is a well-formed non-functional requirement for a documentation artefact. It gives a clear acceptance criterion: can the target reader sketch the system after one read? **Domain ownership statements are testable** Each domain description states what it owns AND what it explicitly does not own. This is the most valuable pattern in the document for requirements work: - `document` — does NOT own persons or tags (references them by ID) - `person` — does NOT own `AppUser` (login accounts are a separate domain) - `user` — does NOT own `Person` records These "does not own" statements prevent future requirements from creating illegal cross-domain assumptions. **Admission criteria for the cross-cutting layer** The table in Section 3 includes a column for admission criteria. This is requirements-engineering gold — it makes the structural constraint explicit so new features don't accidentally get placed in the wrong layer. --- ### Gaps and Ambiguities (non-blocking) **1. The `stammbaum` route is not mentioned** `frontend/src/routes/stammbaum/` exists in the codebase. The doc doesn't acknowledge it as a domain or derived domain. If it's a UI view over `person` data (family tree), it should appear as a Tier-2 derived domain alongside `conversation` and `activity`. If it's intentionally excluded, the doc should say why (e.g., not yet implemented). Leaving it unmentioned creates a gap for a PM reading this as the complete domain picture. **2. The doc does not address the "Users ≠ historical Persons" separation explicitly** The CLAUDE.md memory notes: "historical Persons never have AppUser accounts." This is a fundamental product constraint that affects feature requirements (e.g., "can a Person log in?" — no). Section 2 mentions it for the `person` domain ("Does NOT own `AppUser`") but doesn't state it as a product-level constraint accessible to a PM. A single sentence in the intro or glossary reference would close this gap. **3. "Family stories" for `geschichte` — no definition of who can publish** Section 2 says `Geschichte` has a "DRAFT → PUBLISHED lifecycle." A PM reader would naturally ask: "who can publish?" The doc doesn't mention `BLOG_WRITE` permission in this context (it appears only in Section 5's permission table). Cross-referencing would make the publish gate discoverable without reading two sections. --- ### Overall Assessment The document is well-scoped, honest about what it covers ("For low-level ADR details, see..."), and structured to match how a PM thinks (domains → cross-cutting → decisions → flows). The factual errors flagged by @mkeller and @felixbrandt (wrong HTTP verb, missing permission) should be fixed, but the structural quality is high.
Author
Owner

🎨 Leonie Voss — UX Designer & Accessibility Strategist

Verdict: Approved

This is a technical architecture document — not a UI component, not a user-facing page. My usual accessibility and brand-compliance lens doesn't apply to the implementation here. What I can review is whether the document serves its stated audience well, and whether it accurately describes any UI-relevant architectural decisions.


Document Usability for the Target Reader

The stated target is "a PM-with-CS background who has read the README." From a communication design perspective:

Structure is clear. Six numbered sections with consistent heading levels. A reader can navigate to any section by heading without reading sequentially. This is the right structure for a reference document.

The Tier-1 / Tier-2 domain split is a good pedagogical choice. "Derived domain" is defined before being used, which prevents a PM from misunderstanding why conversation doesn't appear in the database schema.

The data flow walkthroughs use numbered steps. This is the right format for a PM who needs to understand a feature's complexity for planning. Numbered steps communicate sequence clearly.


UI-Relevant Architecture Claims

Section 1 — SSE path description

"server-sent event notifications go directly from the backend to the user's browser (not via the SvelteKit SSR layer) over a long-lived HTTP connection"

This is architecturally significant for UX: it means the notification bell and live updates work without a page reload, and the SvelteKit SSR layer is not involved in delivery latency. Accurate and relevant for a PM understanding responsiveness.

Section 2 — conversation Tier-2 domain

route: /briefwechsel

The route is correct — frontend/src/routes/briefwechsel/ exists. This is the kind of route claim that can go stale as routes evolve; the doc's "Last reviewed: 2026-05-05" header at the top is a good practice for tracking this.

Section 2 — activity Tier-2 domain

route: /aktivitaeten

Correct — frontend/src/routes/aktivitaeten/ exists. ✓


One Observation

The doc is written in English but the application's primary language is German (routes like /briefwechsel, /aktivitaeten, /geschichten). For a German-speaking PM reader, mixing English domain names with German route names could cause momentary friction. This is a style choice, not a bug — but worth being intentional about if the doc is shared with German-language stakeholders.


LGTM from UX perspective

No UI regressions introduced. Architecture claims relevant to UX are accurate. Document structure is appropriate for the target reader.

## 🎨 Leonie Voss — UX Designer & Accessibility Strategist **Verdict: ✅ Approved** This is a technical architecture document — not a UI component, not a user-facing page. My usual accessibility and brand-compliance lens doesn't apply to the implementation here. What I can review is whether the document serves its stated audience well, and whether it accurately describes any UI-relevant architectural decisions. --- ### Document Usability for the Target Reader The stated target is "a PM-with-CS background who has read the README." From a communication design perspective: **Structure is clear.** Six numbered sections with consistent heading levels. A reader can navigate to any section by heading without reading sequentially. This is the right structure for a reference document. **The Tier-1 / Tier-2 domain split is a good pedagogical choice.** "Derived domain" is defined before being used, which prevents a PM from misunderstanding why `conversation` doesn't appear in the database schema. **The data flow walkthroughs use numbered steps.** This is the right format for a PM who needs to understand a feature's complexity for planning. Numbered steps communicate sequence clearly. --- ### UI-Relevant Architecture Claims **Section 1 — SSE path description** > "server-sent event notifications go directly from the backend to the user's browser (not via the SvelteKit SSR layer) over a long-lived HTTP connection" This is architecturally significant for UX: it means the notification bell and live updates work without a page reload, and the SvelteKit SSR layer is not involved in delivery latency. Accurate and relevant for a PM understanding responsiveness. **Section 2 — `conversation` Tier-2 domain** > route: `/briefwechsel` The route is correct — `frontend/src/routes/briefwechsel/` exists. This is the kind of route claim that can go stale as routes evolve; the doc's "Last reviewed: 2026-05-05" header at the top is a good practice for tracking this. **Section 2 — `activity` Tier-2 domain** > route: `/aktivitaeten` Correct — `frontend/src/routes/aktivitaeten/` exists. ✓ --- ### One Observation The doc is written in English but the application's primary language is German (routes like `/briefwechsel`, `/aktivitaeten`, `/geschichten`). For a German-speaking PM reader, mixing English domain names with German route names could cause momentary friction. This is a style choice, not a bug — but worth being intentional about if the doc is shared with German-language stakeholders. --- ### LGTM from UX perspective No UI regressions introduced. Architecture claims relevant to UX are accurate. Document structure is appropriate for the target reader.
marcel added 1 commit 2026-05-05 22:58:22 +02:00
docs(legibility): fix three factual errors in ARCHITECTURE.md
Some checks failed
CI / OCR Service Tests (pull_request) Successful in 32s
CI / Backend Unit Tests (pull_request) Failing after 3m21s
CI / Unit & Component Tests (push) Failing after 3m6s
CI / Backend Unit Tests (push) Failing after 3m21s
CI / OCR Service Tests (push) Successful in 30s
CI / Unit & Component Tests (pull_request) Failing after 3m24s
6960100628
- Add ANNOTATE_ALL to the Permission enum listing (was missing)
- Fix transcription block autosave endpoint: PUT not PATCH,
  correct path /api/documents/{documentId}/transcription-blocks/{blockId}
- Clarify auth injection: hooks.server.ts handleFetch injects the
  Authorization header, not the SvelteKit action directly

Refs #396
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Author
Owner

Follow-up commit 69601006 addresses the three blockers from the review:

  1. ANNOTATE_ALL added to the Permission enum listing in Section 5 — the enum now lists all 8 values correctly
  2. Transcription block autosave endpoint correctedPATCH /api/transcription/blocks/{id}PUT /api/documents/{documentId}/transcription-blocks/{blockId} (verified against useBlockAutoSave.svelte.ts:119-120)
  3. Auth header injection clarified — step 2 of the document upload walkthrough now attributes the Authorization header to hooks.server.ts (handleFetch), not to the SvelteKit action itself; same fix applied to the transcription block walkthrough step 2
Follow-up commit `69601006` addresses the three blockers from the review: 1. **`ANNOTATE_ALL` added** to the Permission enum listing in Section 5 — the enum now lists all 8 values correctly 2. **Transcription block autosave endpoint corrected** — `PATCH /api/transcription/blocks/{id}` → `PUT /api/documents/{documentId}/transcription-blocks/{blockId}` (verified against `useBlockAutoSave.svelte.ts:119-120`) 3. **Auth header injection clarified** — step 2 of the document upload walkthrough now attributes the `Authorization` header to `hooks.server.ts` (`handleFetch`), not to the SvelteKit action itself; same fix applied to the transcription block walkthrough step 2
Author
Owner

👨‍💻 Felix Brandt — Senior Fullstack Developer

Verdict: Approved

This is a documentation-only PR — no production code, no tests involved. My review lens is: accuracy of technical claims against what I know of the codebase, and clarity for a developer who might rely on this doc.

What looks right

  • Document upload walkthrough (Section 6, step 2): Correctly identifies hooks.server.ts (handleFetch) as the auth injection point — this is the precise mechanism, not some vague "the server handles auth." Good.
  • Transcription block autosave endpoint (step 2): PUT /api/documents/{documentId}/transcription-blocks/{blockId} — the corrected endpoint matches the actual controller mapping. The earlier blocker fix was necessary.
  • Optimistic locking description (step 3): @Version field + saveAndFlush for conflict detection is accurate Java/JPA terminology.
  • ANNOTATE_ALL in the Permission enum table (Section 5): The full enum listing READ_ALL, WRITE_ALL, ADMIN, ADMIN_USER, ADMIN_TAG, ADMIN_PERMISSION, ANNOTATE_ALL, BLOG_WRITE matches CLAUDE.md's canonical list. Good fix from the blocker phase.
  • Tier-2 derived domains: The distinction between "has entities" (Tier-1) and "assembled from Tier-1 data" (Tier-2) is a clean concept that matches how conversation and activity actually work in the codebase.

Suggestions (nice to have)

  • Section 6, upload walkthrough, step 8: "AuditService writes an UPLOADED event ... in the same transaction." This is accurate when @Transactional propagation is REQUIRED (the default). Worth a parenthetical clarification: (same transaction via @Transactional(propagation = REQUIRED)) — a junior dev might wonder why the audit write is atomic with the save.
  • Section 6, transcription walkthrough, step 3: "calls saveAndFlush" — in practice saveAndFlush forces an immediate flush to detect optimistic lock conflicts before the transaction commits. The doc says it but a one-word annotation (forces immediate flush to surface version conflicts) would make it more instructive.
  • useBlockAutoSave factory: This is a good internal name to expose in the architecture doc — it signals a pattern (factory function returning a composable) to future frontend developers. If the name ever changes, this doc will be the first sign of drift. Consider a note that the exact API name is implementation-specific.

No blockers from a developer perspective. The technical content is accurate and would meaningfully reduce onboarding time for someone picking up this codebase.

## 👨‍💻 Felix Brandt — Senior Fullstack Developer **Verdict: ✅ Approved** This is a documentation-only PR — no production code, no tests involved. My review lens is: accuracy of technical claims against what I know of the codebase, and clarity for a developer who might rely on this doc. ### What looks right - **Document upload walkthrough (Section 6, step 2):** Correctly identifies `hooks.server.ts` (`handleFetch`) as the auth injection point — this is the precise mechanism, not some vague "the server handles auth." Good. - **Transcription block autosave endpoint (step 2):** `PUT /api/documents/{documentId}/transcription-blocks/{blockId}` — the corrected endpoint matches the actual controller mapping. The earlier blocker fix was necessary. - **Optimistic locking description (step 3):** `@Version` field + `saveAndFlush` for conflict detection is accurate Java/JPA terminology. - **`ANNOTATE_ALL` in the Permission enum table (Section 5):** The full enum listing `READ_ALL, WRITE_ALL, ADMIN, ADMIN_USER, ADMIN_TAG, ADMIN_PERMISSION, ANNOTATE_ALL, BLOG_WRITE` matches `CLAUDE.md`'s canonical list. Good fix from the blocker phase. - **Tier-2 derived domains:** The distinction between "has entities" (Tier-1) and "assembled from Tier-1 data" (Tier-2) is a clean concept that matches how `conversation` and `activity` actually work in the codebase. ### Suggestions (nice to have) - **Section 6, upload walkthrough, step 8:** "AuditService writes an `UPLOADED` event ... in the same transaction." This is accurate when `@Transactional` propagation is REQUIRED (the default). Worth a parenthetical clarification: *(same transaction via `@Transactional(propagation = REQUIRED)`)* — a junior dev might wonder why the audit write is atomic with the save. - **Section 6, transcription walkthrough, step 3:** "calls `saveAndFlush`" — in practice `saveAndFlush` forces an immediate flush to detect optimistic lock conflicts before the transaction commits. The doc says it but a one-word annotation *(forces immediate flush to surface version conflicts)* would make it more instructive. - **`useBlockAutoSave` factory:** This is a good internal name to expose in the architecture doc — it signals a pattern (factory function returning a composable) to future frontend developers. If the name ever changes, this doc will be the first sign of drift. Consider a note that the exact API name is implementation-specific. No blockers from a developer perspective. The technical content is accurate and would meaningfully reduce onboarding time for someone picking up this codebase.
Author
Owner

🏛️ Markus Keller — Senior Application Architect

Verdict: Approved

This PR formalises the architecture narrative I've been carrying in my head. The domain decomposition, layering rule, and ADR summaries are all accurate. A few observations from an architectural eye.

What's well-done

  • Tier-1 / Tier-2 domain split is a useful concept. The admission criterion — "has entities and user-facing CRUD" vs "assembled from Tier-1 data" — is clean and testable. Good.
  • Cross-cutting layer table (Section 3) with the admission criteria column is exactly what I'd want on day one. The criteria ("consumed by 2+ domains, no user-facing CRUD") make future boundary calls unambiguous.
  • Stack-symmetry principle (Section 4) naming the #408 frontend restructure as the inflection point is accurate history. This anchors the rule in a real decision, not a wish.
  • ADR summaries are accurate and appropriately terse. ADR-006 (synchronous domain events to avoid circular dependency) is the hardest one to explain in prose — the current two-sentence version is good.
  • OCR network boundary and SSE path (Section 1 call-outs) — these are precisely the two things the C4 diagram can't express at L2. Good choice to surface them in prose.

Blocker — None

Suggestions (architectural weight)

  • Section 2, document domain: "Does NOT own persons or tags (references them by ID)" — this is the right constraint, but "by ID" is slightly misleading. In JPA terms, Document has a ManyToOne Person sender and ManyToMany Tag tags — these are entity references, not raw UUID fields in the Java layer (though the FK is a UUID in the DB). A developer reading "references by ID" might think the Java code uses UUID senderId instead of Person sender. Suggest: "references them via JPA associations — does not own their lifecycle."

  • Section 3, importing package: Listed as cross-cutting with the description "Orchestrates across person, tag, document." True, but importing is also a write path — it creates Person and Document entities. The admission criteria column says "Orchestrates across 3+ domains" which is correct, but worth noting it's the only cross-cutting member that performs writes (not just reads/aggregations). No change needed, just flagging for awareness.

  • Section 5, Layering rule: The code block is:

    Controller → Service → Repository → DB
    

    This is accurate but omits the cross-domain service call pattern. Consider adding one line:

    Cross-domain: ServiceA → ServiceB (not ServiceA → RepositoryB)
    

    This is the constraint that catches the most real violations in review.

  • C4 diff — SSE arrow direction: The new Rel(backend, user, "SSE notifications ...", "HTTP / SSE — direct backend-to-browser") is architecturally correct. The clarification in ARCHITECTURE.md that SSE goes directly to the browser (not via the SvelteKit SSR layer) is a genuinely non-obvious fact. Good call to make it explicit.

Overall this is a solid architectural reference document. It reflects the actual system accurately and should survive several releases without needing updates to the foundational sections.

## 🏛️ Markus Keller — Senior Application Architect **Verdict: ✅ Approved** This PR formalises the architecture narrative I've been carrying in my head. The domain decomposition, layering rule, and ADR summaries are all accurate. A few observations from an architectural eye. ### What's well-done - **Tier-1 / Tier-2 domain split** is a useful concept. The admission criterion — "has entities and user-facing CRUD" vs "assembled from Tier-1 data" — is clean and testable. Good. - **Cross-cutting layer table** (Section 3) with the admission criteria column is exactly what I'd want on day one. The criteria ("consumed by 2+ domains, no user-facing CRUD") make future boundary calls unambiguous. - **Stack-symmetry principle** (Section 4) naming the `#408` frontend restructure as the inflection point is accurate history. This anchors the rule in a real decision, not a wish. - **ADR summaries** are accurate and appropriately terse. ADR-006 (synchronous domain events to avoid circular dependency) is the hardest one to explain in prose — the current two-sentence version is good. - **OCR network boundary and SSE path** (Section 1 call-outs) — these are precisely the two things the C4 diagram can't express at L2. Good choice to surface them in prose. ### Blocker — None ### Suggestions (architectural weight) - **Section 2, `document` domain:** "Does NOT own persons or tags (references them by ID)" — this is the right constraint, but "by ID" is slightly misleading. In JPA terms, `Document` has a `ManyToOne Person sender` and `ManyToMany Tag tags` — these are entity references, not raw UUID fields in the Java layer (though the FK is a UUID in the DB). A developer reading "references by ID" might think the Java code uses `UUID senderId` instead of `Person sender`. Suggest: *"references them via JPA associations — does not own their lifecycle."* - **Section 3, `importing` package:** Listed as cross-cutting with the description "Orchestrates across `person`, `tag`, `document`." True, but `importing` is also a write path — it creates `Person` and `Document` entities. The admission criteria column says "Orchestrates across 3+ domains" which is correct, but worth noting it's the only cross-cutting member that performs writes (not just reads/aggregations). No change needed, just flagging for awareness. - **Section 5, Layering rule:** The code block is: ``` Controller → Service → Repository → DB ``` This is accurate but omits the cross-domain service call pattern. Consider adding one line: ``` Cross-domain: ServiceA → ServiceB (not ServiceA → RepositoryB) ``` This is the constraint that catches the most real violations in review. - **C4 diff — SSE arrow direction:** The new `Rel(backend, user, "SSE notifications ...", "HTTP / SSE — direct backend-to-browser")` is architecturally correct. The clarification in `ARCHITECTURE.md` that SSE goes *directly* to the browser (not via the SvelteKit SSR layer) is a genuinely non-obvious fact. Good call to make it explicit. Overall this is a solid architectural reference document. It reflects the actual system accurately and should survive several releases without needing updates to the foundational sections.
Author
Owner

🔒 Nora "NullX" Steiner — Application Security Engineer

Verdict: Approved

Documentation PR — no attack surface changes. My review scope: does the security content in this doc accurately describe the actual controls, and does it teach the right mental model to future developers?

Security content accuracy

Session / auth model (Section 5, Permission system):

"Sessions use a Base64-encoded Basic Auth token stored in an httpOnly, SameSite=strict cookie (auth_token, maxAge=86400 s). CSRF protection is disabled because this cookie configuration structurally prevents cross-origin credential theft."

This is accurate. SameSite=strict means the cookie is never sent on cross-origin requests — a CSRF payload from another origin literally cannot include the credential. The doc correctly explains the why behind the CSRF disable, not just the what. This is exactly the threat-model explanation I want to see. Good.

@RequirePermission description (Section 5):

"This is not Spring Security's @PreAuthorize — do not mix the two mechanisms."

Critical warning, correctly placed. Mixing the two mechanisms creates precedence ambiguity in the AOP advice chain. Good.

Permission enum completeness:
ANNOTATE_ALL is now present. The pre-fix absence of ANNOTATE_ALL from the Permission enum table was a real documentation security risk — a developer implementing an annotation endpoint might have skipped the permission entirely because the doc didn't list it. This blocker fix was warranted.

Observations (not blockers)

  • hooks.server.ts handleFetch auth injection (Section 6, step 2): The description — "transparently injects the Authorization header from the auth_token cookie — the action itself is unaware of auth" — is accurate and demonstrates correct security architecture (credential handling centralized, not scattered). Worth a mental note: if handleFetch is ever refactored, this centralization guarantee must be preserved.

  • maxAge=86400 s (24-hour session): Not a finding for this PR, but the architecture doc is a good place to call out that this is a fixed expiry with no sliding window. If a user closes the browser and returns within 24h, the session is still valid. This may or may not be the intended UX — just flagging it as a documented assumption. Could be a future NFR issue.

  • OCR service "no external port" claim: docs/ARCHITECTURE.md states the OCR service "has no external port — it is reachable only on the internal Docker Compose network." The C4 diff confirms no ports: mapping for the OCR container. This is the correct zero-trust approach for an internal microservice. Good.

No security regressions introduced. The document accurately represents the system's security model and teaches the right mental model.

## 🔒 Nora "NullX" Steiner — Application Security Engineer **Verdict: ✅ Approved** Documentation PR — no attack surface changes. My review scope: does the security content in this doc accurately describe the actual controls, and does it teach the right mental model to future developers? ### Security content accuracy **Session / auth model (Section 5, Permission system):** > "Sessions use a Base64-encoded Basic Auth token stored in an `httpOnly`, `SameSite=strict` cookie (`auth_token`, maxAge=86400 s). CSRF protection is disabled because this cookie configuration structurally prevents cross-origin credential theft." This is accurate. `SameSite=strict` means the cookie is never sent on cross-origin requests — a CSRF payload from another origin literally cannot include the credential. The doc correctly explains the *why* behind the CSRF disable, not just the *what*. This is exactly the threat-model explanation I want to see. Good. **`@RequirePermission` description (Section 5):** > "This is not Spring Security's `@PreAuthorize` — do not mix the two mechanisms." Critical warning, correctly placed. Mixing the two mechanisms creates precedence ambiguity in the AOP advice chain. Good. **Permission enum completeness:** `ANNOTATE_ALL` is now present. The pre-fix absence of `ANNOTATE_ALL` from the Permission enum table was a real documentation security risk — a developer implementing an annotation endpoint might have skipped the permission entirely because the doc didn't list it. This blocker fix was warranted. ### Observations (not blockers) - **`hooks.server.ts` `handleFetch` auth injection (Section 6, step 2):** The description — "transparently injects the `Authorization` header from the `auth_token` cookie — the action itself is unaware of auth" — is accurate and demonstrates correct security architecture (credential handling centralized, not scattered). Worth a mental note: if `handleFetch` is ever refactored, this centralization guarantee must be preserved. - **`maxAge=86400 s` (24-hour session):** Not a finding for this PR, but the architecture doc is a good place to call out that this is a fixed expiry with no sliding window. If a user closes the browser and returns within 24h, the session is still valid. This may or may not be the intended UX — just flagging it as a documented assumption. Could be a future NFR issue. - **OCR service "no external port" claim:** `docs/ARCHITECTURE.md` states the OCR service "has no external port — it is reachable only on the internal Docker Compose network." The C4 diff confirms no `ports:` mapping for the OCR container. This is the correct zero-trust approach for an internal microservice. Good. No security regressions introduced. The document accurately represents the system's security model and teaches the right mental model.
Author
Owner

🧪 Sara Holt — Senior QA Engineer

Verdict: ⚠️ Approved with concerns

This is a pure documentation PR with no production code changes, so there's no test coverage to evaluate in the traditional sense. My concerns are about testability of the documentation itself — i.e., whether the claims in the doc are verifiable and maintainable as the system evolves.

What's good

  • Test plan in the PR description includes three concrete checkboxes:

    • Render docs/ARCHITECTURE.md in Gitea — verify headings
    • Verify Mermaid L2 diagram renders without errors
    • Confirm internal links resolve correctly

    These are appropriate smoke tests for a documentation PR. They're manual but reasonable given the artifact type.

  • Data flow walkthroughs (Section 6) are numbered step-by-step sequences. This format is inherently testable — each numbered step corresponds to a system behaviour that could be covered by an integration or E2E test. Future test authors have a roadmap.

Concerns (not hard blockers, but worth tracking)

C1 — No doc freshness mechanism
docs/ARCHITECTURE.md opens with <!-- Last reviewed: 2026-05-05 -->. There is no automated check that this date stays current, and no test that verifies the doc's claims remain accurate as the codebase evolves. Specific risk areas:

  • The Permission enum table in Section 5 will drift the next time a new Permission value is added to Permission.java.
  • The cross-cutting layer table in Section 3 will drift if a new package is added.
  • The data flow walkthroughs will drift if the endpoint paths change.

Suggestion: Create a Gitea issue to track "ARCHITECTURE.md review" as part of the checklist for any PR that adds a domain, adds a Permission value, or changes an endpoint path referenced in the doc. Not required for this PR to merge, but should be tracked.

C2 — Internal links untested in CI
The PR description's test plan item "Confirm all internal links resolve correctly" is a manual check. The links reference GLOSSARY.md, security-guide.md, and the adr/ directory. If any of these files are renamed, the links silently break. A simple find docs/ -name "*.md" | xargs grep -oP '\[.*?\]\(.*?\)' link-check step in CI would catch this automatically.

C3 — Mermaid diagram correctness not validated in CI
The C4 diagram update in docs/architecture/c4-diagrams.md adds new Rel() and Container() entries. Mermaid syntax errors only surface when Gitea renders the file. A npx @mermaid-js/mermaid-cli lint step in CI would catch diagram parse errors before they reach the repo.

Summary

The documentation content is accurate and well-structured. The three concerns above are about long-term maintainability, not correctness today. I'd suggest creating Gitea issues for C1 and C2 as follow-up tasks rather than blocking this PR.

## 🧪 Sara Holt — Senior QA Engineer **Verdict: ⚠️ Approved with concerns** This is a pure documentation PR with no production code changes, so there's no test coverage to evaluate in the traditional sense. My concerns are about testability of the documentation itself — i.e., whether the claims in the doc are verifiable and maintainable as the system evolves. ### What's good - **Test plan in the PR description** includes three concrete checkboxes: - Render `docs/ARCHITECTURE.md` in Gitea — verify headings - Verify Mermaid L2 diagram renders without errors - Confirm internal links resolve correctly These are appropriate smoke tests for a documentation PR. They're manual but reasonable given the artifact type. - **Data flow walkthroughs (Section 6)** are numbered step-by-step sequences. This format is inherently testable — each numbered step corresponds to a system behaviour that could be covered by an integration or E2E test. Future test authors have a roadmap. ### Concerns (not hard blockers, but worth tracking) **C1 — No doc freshness mechanism** `docs/ARCHITECTURE.md` opens with `<!-- Last reviewed: 2026-05-05 -->`. There is no automated check that this date stays current, and no test that verifies the doc's claims remain accurate as the codebase evolves. Specific risk areas: - The Permission enum table in Section 5 will drift the next time a new `Permission` value is added to `Permission.java`. - The cross-cutting layer table in Section 3 will drift if a new package is added. - The data flow walkthroughs will drift if the endpoint paths change. Suggestion: Create a Gitea issue to track "ARCHITECTURE.md review" as part of the checklist for any PR that adds a domain, adds a Permission value, or changes an endpoint path referenced in the doc. Not required for this PR to merge, but should be tracked. **C2 — Internal links untested in CI** The PR description's test plan item "Confirm all internal links resolve correctly" is a manual check. The links reference `GLOSSARY.md`, `security-guide.md`, and the `adr/` directory. If any of these files are renamed, the links silently break. A simple `find docs/ -name "*.md" | xargs grep -oP '\[.*?\]\(.*?\)'` link-check step in CI would catch this automatically. **C3 — Mermaid diagram correctness not validated in CI** The C4 diagram update in `docs/architecture/c4-diagrams.md` adds new `Rel()` and `Container()` entries. Mermaid syntax errors only surface when Gitea renders the file. A `npx @mermaid-js/mermaid-cli` lint step in CI would catch diagram parse errors before they reach the repo. ### Summary The documentation content is accurate and well-structured. The three concerns above are about long-term maintainability, not correctness today. I'd suggest creating Gitea issues for C1 and C2 as follow-up tasks rather than blocking this PR.
Author
Owner

🎨 Leonie Voss — UX Designer & Accessibility Strategist

Verdict: Approved

This PR touches docs/ARCHITECTURE.md and docs/architecture/c4-diagrams.md — infrastructure documentation with no user-facing UI changes. My review is limited to the documentation's readability and structure as a communication artefact, since there is nothing visual or interactive to evaluate.

Documentation UX observations

Information architecture of ARCHITECTURE.md:
The document follows a clear narrative arc: big picture (diagram) → domain breakdown → cross-cutting concerns → structural rules → decisions → data flows. This is a well-designed reading path for the stated target reader ("PM-with-CS background who has read the README"). The section numbering (1–6) makes navigation predictable.

Target reader clarity:
The opening lines:

Target reader: a PM-with-CS background who has read the README.
Goal: accurate mental model after one read — enough to sketch the system on a whiteboard.

This is excellent documentation UX — it sets expectations before the reader commits time. The document delivers on the goal: a reader who finishes it genuinely could sketch the system.

Table formatting (Section 3):
The cross-cutting layer table uses three columns: Member, Purpose, Admission criteria. The admission criteria column is the most valuable — it answers "why is this here and not a domain?" which is the question a new team member will ask. Good cognitive design.

One structural note:
Section 5's ADR summaries end with See [ADR-00X](adr/00X-...) links. These links point to files in docs/adr/ which are referenced but not part of this PR. If those ADR files don't exist yet, the links will 404. This is a documentation UX issue (broken "learn more" paths), though it's a pre-existing condition rather than something introduced by this PR.

No UI/UX concerns with this PR. The documentation is well-structured, readable, and serves its stated audience.

## 🎨 Leonie Voss — UX Designer & Accessibility Strategist **Verdict: ✅ Approved** This PR touches `docs/ARCHITECTURE.md` and `docs/architecture/c4-diagrams.md` — infrastructure documentation with no user-facing UI changes. My review is limited to the documentation's readability and structure as a communication artefact, since there is nothing visual or interactive to evaluate. ### Documentation UX observations **Information architecture of `ARCHITECTURE.md`:** The document follows a clear narrative arc: big picture (diagram) → domain breakdown → cross-cutting concerns → structural rules → decisions → data flows. This is a well-designed reading path for the stated target reader ("PM-with-CS background who has read the README"). The section numbering (1–6) makes navigation predictable. **Target reader clarity:** The opening lines: > **Target reader:** a PM-with-CS background who has read the README. > **Goal:** accurate mental model after one read — enough to sketch the system on a whiteboard. This is excellent documentation UX — it sets expectations before the reader commits time. The document delivers on the goal: a reader who finishes it genuinely could sketch the system. **Table formatting (Section 3):** The cross-cutting layer table uses three columns: Member, Purpose, Admission criteria. The admission criteria column is the most valuable — it answers "why is this here and not a domain?" which is the question a new team member will ask. Good cognitive design. **One structural note:** Section 5's ADR summaries end with `See [ADR-00X](adr/00X-...)` links. These links point to files in `docs/adr/` which are referenced but not part of this PR. If those ADR files don't exist yet, the links will 404. This is a documentation UX issue (broken "learn more" paths), though it's a pre-existing condition rather than something introduced by this PR. No UI/UX concerns with this PR. The documentation is well-structured, readable, and serves its stated audience.
Author
Owner

⚙️ Tobias Wendt — DevOps & Platform Engineer

Verdict: Approved

Pure documentation PR. The only infrastructure-relevant content is in the C4 diagram update and the OCR/SSE architectural descriptions. I'll focus there.

C4 diagram changes (docs/architecture/c4-diagrams.md)

OCR container addition:

Container(ocr, "OCR Service", "Python FastAPI / port 8000", "... Single-node by design — see ADR-001. Reachable only on the internal Docker network; no external port exposed.")

This is accurate — the OCR service runs on port 8000 internally and has no ports: mapping in the Compose file. The "no external port" claim is correct and matches the actual deployment config. Good.

New relations:

  • Rel(backend, ocr, "OCR job requests with presigned MinIO URL", "HTTP / REST / JSON") — accurate
  • Rel(ocr, storage, "Fetches PDF via presigned URL", "HTTP / S3 presigned") — accurate; the OCR service uses a presigned URL from MinIO, not a direct credential. This is the right pattern.
  • Rel(backend, user, "SSE notifications (server-sent events)", "HTTP / SSE — direct backend-to-browser") — accurate. The SSE connection goes directly from the Spring Boot backend to the browser, not proxied through SvelteKit.

Updated backend description:
The old description was missing transcription, OCR orchestration, and SSE. The new one includes them. More accurate.

Updated DB description:
Added "transcription blocks, audit log" to the database container description. These are real tables — accurate.

Infrastructure claim in ARCHITECTURE.md

Section 1:

"The OCR service has no external port — it is reachable only on the internal Docker Compose network."

Confirmed correct against the actual docker-compose.yml configuration. The service uses expose not ports. Good enforcement of the least-exposure principle.

One observation

The ARCHITECTURE.md references docs/infrastructure/ files (via the persona file context) but no infrastructure docs are part of this PR. The architecture doc is self-contained and accurate for what it covers. The docs/ directory structure would benefit from a docs/infrastructure/ section eventually, but that's out of scope here.

Nothing to block. The C4 diagram is now more complete and accurate than before.

## ⚙️ Tobias Wendt — DevOps & Platform Engineer **Verdict: ✅ Approved** Pure documentation PR. The only infrastructure-relevant content is in the C4 diagram update and the OCR/SSE architectural descriptions. I'll focus there. ### C4 diagram changes (`docs/architecture/c4-diagrams.md`) **OCR container addition:** ``` Container(ocr, "OCR Service", "Python FastAPI / port 8000", "... Single-node by design — see ADR-001. Reachable only on the internal Docker network; no external port exposed.") ``` This is accurate — the OCR service runs on `port 8000` internally and has no `ports:` mapping in the Compose file. The "no external port" claim is correct and matches the actual deployment config. Good. **New relations:** - `Rel(backend, ocr, "OCR job requests with presigned MinIO URL", "HTTP / REST / JSON")` — accurate - `Rel(ocr, storage, "Fetches PDF via presigned URL", "HTTP / S3 presigned")` — accurate; the OCR service uses a presigned URL from MinIO, not a direct credential. This is the right pattern. - `Rel(backend, user, "SSE notifications (server-sent events)", "HTTP / SSE — direct backend-to-browser")` — accurate. The SSE connection goes directly from the Spring Boot backend to the browser, not proxied through SvelteKit. **Updated backend description:** The old description was missing transcription, OCR orchestration, and SSE. The new one includes them. More accurate. **Updated DB description:** Added "transcription blocks, audit log" to the database container description. These are real tables — accurate. ### Infrastructure claim in `ARCHITECTURE.md` Section 1: > "The OCR service has no external port — it is reachable only on the internal Docker Compose network." Confirmed correct against the actual `docker-compose.yml` configuration. The service uses `expose` not `ports`. Good enforcement of the least-exposure principle. ### One observation The `ARCHITECTURE.md` references `docs/infrastructure/` files (via the persona file context) but no infrastructure docs are part of this PR. The architecture doc is self-contained and accurate for what it covers. The `docs/` directory structure would benefit from a `docs/infrastructure/` section eventually, but that's out of scope here. Nothing to block. The C4 diagram is now more complete and accurate than before.
Author
Owner

📋 Elicit — Requirements Engineer

Verdict: ⚠️ Approved with concerns

My lens: does this documentation accurately and completely specify architectural constraints for implementers, and does it leave any requirements ambiguous, missing, or untestable?

What's well-specified

  • Admission criteria for cross-cutting layer (Section 3): The criteria "consumed by 2+ domains; no user-facing CRUD of its own" are concrete and testable. A developer can apply this rule to decide whether a new package belongs in the cross-cutting layer or as a domain. This is a well-formed architectural constraint.

  • Domain ownership boundaries: Each Tier-1 domain entry states what it owns, what it does NOT own, and its cross-domain dependencies. This three-part structure maps directly to the kind of EARS-style constraint I'd write:

    • "The person domain shall not own AppUser records."
    • "Cross-domain data access shall go through the owning domain's service."
      These are implicitly present in the prose. Good.
  • DocumentStatus lifecycle is referenced in CLAUDE.md (PLACEHOLDER → UPLOADED → TRANSCRIBED → REVIEWED → ARCHIVED) but is not repeated or expanded in ARCHITECTURE.md. This is intentional — the doc delegates detail to code. Acceptable.

Concerns (requirements gaps)

C1 — Tier-2 domain admission criteria not stated
Section 2 defines Tier-1 admission criteria implicitly ("have entities and user-facing CRUD") and defines Tier-2 ("derived domain has its own routes and UI but no database tables"). But there is no explicit admission criterion for when something should be a Tier-2 domain vs. simply a page within an existing Tier-1 domain's route tree.

This matters for future development: if someone wants to add a new derived view, they have no documented rule to decide whether it warrants Tier-2 status. Suggestion: add one sentence: "A view becomes a Tier-2 domain when it represents a distinct user-facing concept that composes data from two or more Tier-1 domains."

C2 — Stack-symmetry rule lacks an exception clause
Section 4 states: "Adding a new Tier-1 domain means creating a package on both sides under the same name." This is a good rule, but it has no documented exception. The cross-cutting layer (Section 3) lists backend packages (audit, config, dashboard, etc.) with no frontend equivalent. A new developer could reasonably ask: "Does audit need a frontend/src/lib/audit/ folder?" The answer is no — but the document doesn't say so. The stack-symmetry rule should explicitly scope itself: "applies to Tier-1 domains; cross-cutting layer packages are backend-only by definition."

C3 — geschichte domain cross-domain deps are incomplete
Section 2 states geschichte has cross-domain deps: person, document (linked entities in the story body). But CLAUDE.md mentions geschichte also uses the notification domain (comment mentions). If a developer reads only ARCHITECTURE.md, they'll miss this dependency. Minor, but documentation accuracy is the only deliverable here.

C4 — No versioning policy for the document itself
The <!-- Last reviewed: 2026-05-05 --> comment signals intent to keep this current, but there is no stated policy for when it must be updated (e.g., "must be updated on any PR that adds a domain, adds a Permission value, or changes a referenced endpoint"). Without a policy, the review date will drift. This is a process gap, not a content gap — but worth a follow-up Gitea issue.

Summary

The document is a strong first version. Concerns C1 and C2 are the most important: they are rules a developer will need to apply, and without exception clauses they will be misapplied. C3 and C4 are lower priority. I'd recommend addressing C1 and C2 in a follow-up commit before this PR sits in the repo long enough for someone to act on the incomplete rules.

## 📋 Elicit — Requirements Engineer **Verdict: ⚠️ Approved with concerns** My lens: does this documentation accurately and completely specify architectural constraints for implementers, and does it leave any requirements ambiguous, missing, or untestable? ### What's well-specified - **Admission criteria for cross-cutting layer (Section 3):** The criteria "consumed by 2+ domains; no user-facing CRUD of its own" are concrete and testable. A developer can apply this rule to decide whether a new package belongs in the cross-cutting layer or as a domain. This is a well-formed architectural constraint. - **Domain ownership boundaries:** Each Tier-1 domain entry states what it owns, what it does NOT own, and its cross-domain dependencies. This three-part structure maps directly to the kind of EARS-style constraint I'd write: - *"The `person` domain shall not own `AppUser` records."* - *"Cross-domain data access shall go through the owning domain's service."* These are implicitly present in the prose. Good. - **`DocumentStatus` lifecycle** is referenced in `CLAUDE.md` (`PLACEHOLDER → UPLOADED → TRANSCRIBED → REVIEWED → ARCHIVED`) but is not repeated or expanded in `ARCHITECTURE.md`. This is intentional — the doc delegates detail to code. Acceptable. ### Concerns (requirements gaps) **C1 — Tier-2 domain admission criteria not stated** Section 2 defines Tier-1 admission criteria implicitly ("have entities and user-facing CRUD") and defines Tier-2 ("derived domain has its own routes and UI but no database tables"). But there is no explicit admission criterion for *when* something should be a Tier-2 domain vs. simply a page within an existing Tier-1 domain's route tree. This matters for future development: if someone wants to add a new derived view, they have no documented rule to decide whether it warrants Tier-2 status. Suggestion: add one sentence: *"A view becomes a Tier-2 domain when it represents a distinct user-facing concept that composes data from two or more Tier-1 domains."* **C2 — Stack-symmetry rule lacks an exception clause** Section 4 states: "Adding a new Tier-1 domain means creating a package on **both** sides under the same name." This is a good rule, but it has no documented exception. The cross-cutting layer (Section 3) lists backend packages (`audit`, `config`, `dashboard`, etc.) with no frontend equivalent. A new developer could reasonably ask: "Does `audit` need a `frontend/src/lib/audit/` folder?" The answer is no — but the document doesn't say so. The stack-symmetry rule should explicitly scope itself: *"applies to Tier-1 domains; cross-cutting layer packages are backend-only by definition."* **C3 — `geschichte` domain cross-domain deps are incomplete** Section 2 states `geschichte` has cross-domain deps: `person`, `document` (linked entities in the story body). But `CLAUDE.md` mentions `geschichte` also uses the `notification` domain (comment mentions). If a developer reads only `ARCHITECTURE.md`, they'll miss this dependency. Minor, but documentation accuracy is the only deliverable here. **C4 — No versioning policy for the document itself** The `<!-- Last reviewed: 2026-05-05 -->` comment signals intent to keep this current, but there is no stated policy for *when* it must be updated (e.g., "must be updated on any PR that adds a domain, adds a Permission value, or changes a referenced endpoint"). Without a policy, the review date will drift. This is a process gap, not a content gap — but worth a follow-up Gitea issue. ### Summary The document is a strong first version. Concerns C1 and C2 are the most important: they are rules a developer will need to apply, and without exception clauses they will be misapplied. C3 and C4 are lower priority. I'd recommend addressing C1 and C2 in a follow-up commit before this PR sits in the repo long enough for someone to act on the incomplete rules.
marcel merged commit 69b564b34b into main 2026-05-06 07:31:02 +02:00
marcel deleted branch feat/issue-396-architecture 2026-05-06 07:31:04 +02:00
Sign in to join this conversation.
No Reviewers
No Label
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: marcel/familienarchiv#441