docs(c4): accuracy audit — split L3 diagrams, add 6 new sub-diagrams, fix all stale content #448
Reference in New Issue
Block a user
Delete Branch "docs/post-refactor-accuracy-audit"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
/conversations→/briefwechsel, updated component descriptions throughoutDiagram count: 15 Mermaid blocks total (was 4)
Diagrams added or changed
Test plan
🏗️ Markus Keller — Senior Application Architect
Verdict: ⚠️ Approved with concerns
The split from one 28-component monolith into 9 focused sub-diagrams is the right call — dagre ≤14 nodes produces readable layouts without arrow spaghetti. The cross-reference pattern for shared repos (placing
personRepo/tagRepooutside theSystem_Boundaryin 3b) is pragmatic and avoids duplicating those components across diagrams. The username→email fix onCustomUserDetailsServiceis a meaningful accuracy improvement.Two structural concerns need addressing before this is a trustworthy reference document.
Blockers
1.
GroupControllerandTagControllerbypass the service layer (diagram 3c)The project's documented layering rule is: controllers never call repositories directly. These two relationships either:
GroupService/TagServicecomponents that the diagram is missing.Given that tags have recursive CTE queries, merge/reparent logic, and subtree deletion (per the
tagRepodescription in the same diagram), it would be architecturally unusual to have noTagService. Please verify against the actual code and either add the service layer or note the violation explicitly.2. Missing
DashboardControllerin diagram 3eThe frontend diagram 3c shows:
Diagram 3e has a
dashSvc(DashboardService) but no controller to serve/api/dashboard/*.dashSvchas no incoming call from any controller shown — it's a floating component. Either addDashboardControlleras a component in 3e, or clarify which controller routes/api/dashboard/activity.Suggestions
StatsControllerin 3e shows an incomingRel(frontend, statsCtrl, ...)but no outgoing relationship — it doesn't delegate to anything. If it calls its own repository directly, show that. If it calls a service, add theRel.minioConf → fileSvcrelationship (Provides S3Client and S3Presigner beans) models DI as a runtime arrow. In C4 this is technically a deployment/infrastructure concern, not a runtime component relationship. It won't confuse readers here but is worth noting as a deviation from strict C4 semantics.👨💻 Felix Brandt — Senior Fullstack Developer
Verdict: ⚠️ Approved with concerns
Good accuracy work. The username→email fix on
CustomUserDetailsServicecatches a real stale fact — the CLAUDE.md andAppUserRepositorydescription both confirm the lookup is by email. The/briefwechselrename is correct. Adding the full auth flow (register/forgot/reset) and new domains (Stammbaum, Geschichten, Aktivitaeten) makes this the first complete frontend diagram the project has had.Two findings need resolution before this serves as a reliable reference:
Blockers
1.
GroupControllerandTagControllercalling repositories directly (diagram 3c)Per CLAUDE.md: "controllers never call repositories directly; services never reach into another domain's repository." If these relationships are accurate to the code, this is an architectural violation that should be visible as such — not silently drawn as a normal pattern. If there are
GroupService/TagServiceclasses being omitted, the diagram needs them. Please cross-check the actualGroupController.javaandTagController.javaand either add the intermediate service or annotate the relationship as a known deviation.2. Missing
DashboardController(diagram 3e)dashSvc(DashboardService) assembles the activity feed and is documented correctly. But nothing calls it from outside — it's unreachable in the diagram. The frontend diagram 3c callsGET /api/dashboard/activity, so a controller must exist. Add it to 3e or explain where that endpoint lives.Suggestions
StatsControllerin 3e has no outgoingRel. What does it call to get "total persons, total documents"? If it callsDocumentRepositoryandPersonRepositorydirectly (which would also be a layering concern), that should be visible. If there's aStatsService, add it.TranscriptionQueueServicesays "mission-control enrichment workflow" — this is slightly informal for a component description. Consider aligning with the actual method/endpoint names in the codebase."See diagram 3c.2. Used here by DocumentService to resolve sender / receiver persons."This is exactly the right way to handle cross-boundary references without duplicating the full component.🚀 Tobias Wendt — DevOps & Platform Engineer
Verdict: ✅ Approved
Pure documentation change — no Compose file, no CI pipeline, no Dockerfile, no infrastructure config touched. Nothing to flag from a DevOps perspective.
What I checked:
docker-compose.yml,.gitea/workflows/, or any infrastructure fileContainer(ocrPy, "OCR Service", "Python FastAPI")) — this matches how we actually run it in Docker Compose with a separate service boundaryocrPy → miniopresigned URL flow is accurately documented — the Python service fetches the PDF from MinIO via a presigned URL generated by the Spring backend, which is exactly the pattern we use to avoid routing large files through the JVMauditSvc @Asynccomponent has a note about "transaction-aware logging to prevent deadlocks on concurrent saves" — this kind of why-it-exists context is exactly what belongs in architecture docsThe test plan item "verify all 15 Mermaid blocks render without layout errors" is manual. If this project adds CI at some point,
mermaid-js/mermaid-clican lint.mdfiles for syntax errors. Worth a future issue but not a blocker for a docs PR.📋 Elicit — Requirements Engineer
Verdict: ⚠️ Approved with concerns
From a requirements and traceability perspective, this PR substantially improves the architecture documentation. The expanded L1 persona descriptions now correctly reflect that both Administrator and Family Member personas are transcription participants — not just readers. The route inventory in the frontend diagrams (3b, 3c, 3d) creates a useful traceability artifact linking features to backend endpoints.
Blockers
1.
/api/dashboard/activityendpoint is traceable from the frontend but has no backend componentThe frontend diagram 3c clearly maps the
aktivitaetenpage toGET /api/dashboard/activity. Tracing that endpoint into diagram 3e hits a dead end —DashboardServiceexists but is not reachable from any documented controller. For a requirements reviewer, this is a broken trace: the user's need (view activity feed) leads to a frontend page, which calls an endpoint, which connects to nothing in the backend diagrams.This creates a gap in requirements traceability. If the endpoint exists in production, it should be documented in 3e.
Suggestions
helpPagecomponent is documented as "Static transcription style guide for Kurrent and Sütterlin character recognition. No backend calls." — the "No backend calls" annotation is a useful pattern that clarifies scope. Consider applying it consistently wherever routes are intentionally client-only.enrichPagedescription ("Guided enrichment workflow. Loader: GET /api/documents/{id}. Progressively saves annotations and transcription blocks.") accurately captures the workflow but the term "progressively saves" is slightly ambiguous — does it autosave on each annotation or only on explicit action? This is a minor documentation imprecision, not a diagram error./hilfe/transkriptionroute is documented in 3d. The CLAUDE.md route structure lists it ashilfe/transkription/— confirm the trailing slash convention is consistent.InviteController) and frontend 3a (registerPage). This is well-traced and gives a complete picture of the user registration journey.🔐 Nora "NullX" Steiner — Application Security Engineer
Verdict: ✅ Approved
This is a documentation-only PR so there are no code-level vulnerabilities to audit. What I reviewed is whether the security controls are accurately and completely documented, and whether any descriptions would mislead a future developer into missing a security requirement.
What's documented well
SecurityConfigwith the correct note about CSRF being disabled. The companion description in the component doesn't explain why it's safe (Basic Auth from cookies, browsers block cross-origin custom headers), but this was true in the old diagram too and is out of scope for this PR.InviteControlleris correctly described as "Rate-limited via WebConfig interceptor." This documents an important control that a reviewer would otherwise miss.Suggestion (non-blocking)
The
AuthControllerdescription mentions/register,/forgot-password,/reset-password. These endpoints arepermitAll()inSecurityConfig. Consider adding a one-line note to thesecFiltercomponent in 3a: "Permits: /api/auth/register, /api/auth/forgot-password, /api/auth/reset-password." Currently the only way to know which endpoints bypass auth is to cross-reference diagram 3a (security) with diagram 3c (auth). Consolidating this in the security diagram makes the attack surface immediately visible.No blockers from a security documentation standpoint.
🧪 Sara Holt — Senior QA Engineer
Verdict: ⚠️ Approved with concerns
Documentation PR — no production code changed, so the test pyramid analysis focuses on whether the documentation itself is verifiable. The PR's test plan is a good start but has a gap that could allow stale diagrams to go undetected in the future.
Blockers
1. Test plan item 2 is manual with no tooling path
This is currently a manual check. It will drift immediately — the next time a controller is renamed, nobody will update the diagram. For a documentation PR that explicitly aims for accuracy, there should be at least a note on how this would be verified systematically. A lightweight approach: a CI grep that checks a sample of component names from the diagrams against
find backend/src -name "*.java". Even if not automated now, documenting the verification method makes future drift visible.Suggestions
@mermaid-js/mermaid-cli) can parse and render.mdfiles in CI — a singlemmdc --checkrun would catch syntax errors at PR time. This doesn't need to block this PR but deserves a follow-up issue.🎨 Leonie Voss — UX Designer & Accessibility Strategist
Verdict: ✅ Approved
This is a pure architecture documentation PR — no Svelte components, no Tailwind, no UI changes. Nothing to flag from a UX or accessibility perspective.
What I did check from a documentation usability standpoint:
No design or accessibility concerns for this PR. The documentation improvement is clear and well-structured.
Resolution — Blocker fixes (commit
42c6e58d)All three blockers addressed. Every finding was a diagram inaccuracy — the actual backend code follows the layering rules correctly throughout.
Blocker 1 + 2 — Diagram 3c:
GroupControllerandTagControllerservice layerVerified against the source:
GroupController.javainjectsUserServiceand delegates all group operations to it (no direct repo call)TagController.javainjectsTagServiceand delegates all tag operations to it (no direct repo call)Fixed:
Rel(groupCtrl, groupRepo, ...)→Rel(groupCtrl, userSvc, "Delegates to", ""). Added missingTagServicecomponent inside the boundary. ChangedRel(tagCtrl, tagRepo, ...)→Rel(tagCtrl, tagSvc, ...) + Rel(tagSvc, tagRepo, ...).Blocker 3 — Diagram 3e: missing
DashboardControllerandStatsControlleroutgoing relVerified against the source:
DashboardController.javaexists, mapped to/api/dashboard, serves/resume,/pulse, and/activity. Delegates toDashboardService.StatsController.javadelegates toStatsService, which queries aggregate counts.Fixed: Added
DashboardControllercomponent with incomingRelfrom frontend and outgoing delegation todashSvc. AddedStatsServicecomponent withstatsCtrl → statsSvc → dbchain.Sara's concern — verification method for component name accuracy
The concern is valid. A lightweight approach that doesn't require CI infrastructure:
Diffing these two lists catches any renamed or removed class that the diagram still references. I'll open a follow-up issue to track adding this as a CI lint step.
Node counts after additions: diagram 3c = 11 nodes, diagram 3e = 14 nodes — both within the ≤14 dagre budget noted in the PR description.
🏗️ Markus Keller — Senior Application Architect
Verdict: ⚠️ Approved with concerns
Splitting 4 overloaded Mermaid blocks into 15 focused sub-diagrams was the right call — dagre stops making spaghetti at this granularity, and each diagram now covers exactly one bounded context. The new sub-diagrams (3b.2 Transcription Pipeline, 3c.2 Persons & Family Graph, 3d OCR Orchestration, 3e Supporting Domains) fill gaps that mattered.
That said, there are two accuracy concerns that a future developer could use as a blueprint to introduce real layering violations.
Blockers
Diagram 3b:
docSvcis drawn accessingpersonRepoandtagRepodirectlyThe CLAUDE.md layering rule is explicit: "services never reach into another domain's repository — always call the other domain's service instead." If this diagram is accurate, it documents a real violation. If it's a simplification, it will mislead anyone who reads the architecture docs as implementation guidance.
Recommendation: Either draw
docSvc → personSvcanddocSvc → tagSvc(the correct path), or add a note on the component that this is a documented exception with a justification. Don't let the diagram silently model the wrong pattern.Diagram 3c:
groupCtrlandtagCtrlare drawn calling their repositories directlyControllers calling repositories is prohibited by the same layering rule. If no
GroupService/TagServiceexists, that's an existing gap worth noting. If they do exist, they're missing from the diagram.Suggestions
Diagram 3e shows
dashSvcwith an incomingRel(dashSvc, auditQuery, ...)but no frontend → dashSvc relationship. If the DashboardService is used by an API endpoint, that entry-point is missing. If it's only called internally, clarify that.The OCR diagram (3d) is the most complex at 13 nodes — still within dagre's comfort zone, but worth watching if new components are added.
Overall the content is a significant improvement. The layering inaccuracies are the only things worth resolving before this becomes the canonical reference.
👨💻 Felix Brandt — Senior Fullstack Developer
Verdict: ⚠️ Approved with concerns
Docs-only PR, so TDD doesn't apply directly — but I read architecture docs the same way I read code: they should be accurate enough to implement from. These diagrams will be read by whoever touches these domains next, and two patterns caught my eye.
Blockers
The
groupCtrl → groupRepoandtagCtrl → tagRepodirect relationships in diagram 3cIf this is accurate, then
GroupControllerandTagControllerare hitting repositories directly — that puts business logic in the wrong layer and makes it untestable without a web context. If it's a simplification, the diagram is actively misleading. Either fix the diagram or open a follow-up issue to introduceGroupService/TagService.The
docSvc → personRepoanddocSvc → tagReporelationships in diagram 3bSame concern as Markus raised. The correct shape is
docSvc → personSvc → personRepo. If the real code bypassesPersonService, that's a testability problem —PersonService.getById()would carry validation logic thatDocumentServicesilently skips.Suggestions
Diagram 3b.2:
TranscriptionServiceandAnnotationServiceboth have directRel(..., db, ...)arrows that bypass their repositories. For consistency with how every other service is drawn, these should route through named repository components. If the services call the DB directly via JdbcTemplate (not JPA), that's worth noting explicitly.The
enrichPagecomponent in diagram 3b callsGET/POST /api/transcriptionandPOST /api/documents/{id}/annotations— the transcription endpoint path is just/api/transcription, but in diagram 3b.2 the controller is described as/api/transcription. Looks consistent, just confirming the path is stable.Naming:
transcriptionQueueSvcis a great name in the diagram. Just make sure the actual class name matches — having diagram names diverge from class names is how docs go stale again.The split from monolith diagrams to focused per-domain views is excellent. Fix the layering inaccuracies and this becomes genuinely useful documentation.
🚀 Tobias Wendt — DevOps & Platform Engineer
Verdict: ✅ Approved
Docs-only change touching
docs/architecture/c4-diagrams.md. No infrastructure files, no Compose changes, no CI modifications.What I checked
Rel(ocrPy, minio, "Fetches PDF via presigned URL", "HTTP / S3 presigned")— that's the correct pattern for keeping the OCR Python service from needing application credentials.Nothing here touches my domain. Merge when the layering concerns from Markus and Felix are resolved.
📋 Elicit — Requirements Engineer
Verdict: ✅ Approved
Reviewed from a requirements completeness and traceability standpoint.
What I checked
Coverage: The 15 diagrams now cover all major subsystems described in CLAUDE.md and the PR description table. The pre-PR docs were missing the Transcription Pipeline, Persons/Family Graph, OCR Orchestration, and Supporting Domains — all four are now documented. This closes a significant documentation gap.
Accuracy of actor descriptions: L1 was updated to reflect that both admins and family members transcribe documents. This is consistent with the product framing I understand from the codebase (device split: transcribers on laptop/tablet, readers on phones).
Route coverage: The frontend diagrams now cover the routes listed in CLAUDE.md's Route Structure section. Key additions —
/briefwechsel(was still/conversationsin the old diagram),/aktivitaeten,/geschichten,/stammbaum,/enrich/[id], the full admin panel hierarchy, and the auth flows (register, forgot-password, reset-password).Suggestions (non-blocking)
The
hilfecomponent in diagram 3d is described as having "No backend calls" — this is a useful detail, but it also means it's a purely static page. Worth confirming in the diagram description that it's intentionally static (no future dynamic content planned), or leave a note in the issue tracker if that's a known future need.The test plan ("Verify all 15 Mermaid blocks render") is sufficient for a docs-only PR. For a more thorough acceptance test, the second criterion ("Confirm no component name in any diagram is missing from the codebase") would benefit from a code grep or naming convention check, since
RelationshipInferenceService,OcrAsyncRunner, etc. are referenced — worth confirming those are actual class names.No requirements gaps found.
🔐 Nora "NullX" Steiner — Application Security Engineer
Verdict: ✅ Approved
Docs-only PR. My review is focused on whether the security architecture is accurately documented and whether the diagrams reveal any security gaps worth tracking.
What I checked
Diagram 3a (Security & Authentication) is the most security-relevant diagram. Key observations:
CustomUserDetailsServicedescription was corrected from "Loads AppUser by username" to "Loads AppUser by email" — this is an important accuracy fix. The old description could have misled a developer looking for the authentication lookup to the wrong field.SecurityConfigcorrectly documents CSRF-disabled with the permitted endpoints listed (password-reset, invite, register). The original diagram had no detail here — this is a meaningful improvement.PermissionAspectdescription accurately reflects that it guards@RequirePermissionannotations via AOP.Diagram 3d (OCR) correctly shows that the Python OCR service fetches PDFs via presigned URLs, not by receiving the PDF directly over HTTP. This is the correct architecture to avoid sending large files through the application backend.
Suggestions (non-blocking)
Diagram 3a: The
secFilterdescription says it "Permits password-reset, invite, and register endpoints without authentication." It would be worth noting (even as a diagram comment) whether the rate-limiting interceptor fromWebConfigis also active on these endpoints — the invite controller description mentions it, but diagram 3a doesn't show it. Not a blocker for docs, but that's the kind of detail that gets forgotten when the security picture is reviewed.Diagram 3b.2 (Transcription): The
CommentControllerdescription mentions "@mention support." Any@mentionparsing involves user-controlled input being resolved to user identities — worth confirming the mention parser sanitizes content before storing. Not a diagram problem, just a note for future audit scope.No security architecture inaccuracies found in this PR.
🧪 Sara Holt — QA Engineer & Test Strategist
Verdict: ✅ Approved
Docs-only PR — no test coverage changes. My review checks whether the diagrams accurately reflect what is (or should be) testable, and whether the documented architecture creates any test coverage blind spots.
What I checked
Test plan adequacy: The PR's test plan is:
Point 1 is straightforward and appropriate. Point 2 is the more valuable one — the old diagrams had stale component names that no longer existed in the codebase. Running a grep pass for each named
Component(...)before merge would validate this PR's accuracy claim concretely.Documentation of async / testability-sensitive patterns: The diagrams correctly distinguish
@Asynccomponents:MassImportService—Spring Service — @AsyncOcrAsyncRunner—Spring Component — @AsyncAuditService—Spring Service — @AsyncThese are exactly the components where
@Transactionaltest rollback doesn't work cleanly, and whereAwaitilityis needed in integration tests. The diagrams now make these patterns visible.Suggestions (non-blocking)
The
TranscriptionQueueServicedescription says it "Exposes segmentation, transcription, and review queue projections." If this service is the entry point for the enrichment workflow's queue, there should be integration tests covering queue state transitions — worth checking if those exist or if this is undocumented.SseEmitterRegistryis described as an "In-memory ConcurrentHashMap." In-memory SSE registries are not testable with@Transactionalrollback and require explicit cleanup between tests. Worth flagging for any future test that hits notification flows.No blockers for this PR — it's documentation only and accurate on the testability-relevant patterns.
🎨 Leonie Voss — UX Designer & Accessibility Strategist
Verdict: ✅ Approved
Docs-only PR touching architecture documentation. No UI/UX or accessibility changes — nothing in my design domain is affected.
What I noted
The frontend diagram split (3a–3d) mirrors the visual structure of the app well. The four-way decomposition — Middleware & Auth / Document Workflows / People & Discovery / Admin & Help — maps closely to the navigation zones a user actually experiences.
One observation that touches the design/architecture boundary: diagram 3c documents
/stammbaumas using a "D3-based Stammbaum" visualisation. The family tree is one of the most interaction-heavy pages in the app, and D3 with complex graph layouts is the hardest surface to make accessible and mobile-responsive. This diagram is the first formal documentation of that component existing — if it's not yet covered by accessibility tests in E2E, that's worth a follow-up issue. Not a blocker for this docs PR.No concerns with the documentation itself.
Resolution — Round 2 blocker fixes (commit
83c3d85b)All round 2 concerns addressed. Both findings were diagram inaccuracies — the actual backend code follows the layering rules correctly throughout.
Blocker — Diagram 3b:
docSvc → personRepo/tagRepodirect relationshipsVerified against source (
DocumentService.java):private final PersonService personService;private final TagService tagService;DocumentServicenever touchesPersonRepositoryorTagRepositorydirectly — it goes through the service layer as the layering rule requires.Fixed: Replaced the
personRepo/tagRepocross-ref stubs outside theSystem_BoundarywithpersonSvc/tagSvcstubs. Changed:Rel(docSvc, personRepo, ...)→Rel(docSvc, personSvc, "Resolves sender / receivers", "")Rel(docSvc, tagRepo, ...)→Rel(docSvc, tagSvc, "Finds or creates tags", "")personRepo → dbandtagRepo → dbstubs (those details live in diagrams 3c.2 and 3c respectively)Suggestion (Felix) — Diagram 3b.2: services routing directly to
dbVerified against source:
TranscriptionService→TranscriptionBlockRepository,TranscriptionBlockVersionRepositoryAnnotationService→AnnotationRepositoryCommentService→CommentRepositoryTranscriptionQueueService→DocumentService+AuditLogQueryService(no repository of its own — pure cross-domain delegation)The direct
→ dbarrows were all inaccurate. These services use JPA repositories (or delegate to other services), never JDBC directly.Fixed:
TranscriptionBlockRepository,AnnotationRepository,CommentRepositorycomponents inside theSystem_BoundarytranscriptionSvc → blockRepo → db,annotationSvc → annotationRepo → db,commentSvc → commentRepo → dbtranscriptionQueueSvc: replaced the incorrect→ dbarrow with cross-diagram stubsDocumentService(see 3b) andAuditLogQueryService(see 3e) and correct delegation relationshipstranscriptionQueueSvccomponent description to reflect actual behaviorNode counts after additions: diagram 3b.2 = 14 nodes (3 controllers + 4 services + 3 repos + 2 cross-ref stubs + frontend + db) — at the ≤14 dagre budget.
🏗️ Markus Keller — Senior Application Architect
Verdict: ⚠️ Approved with concerns
The split into 15 focused diagrams is the right call — dagre struggles beyond ~14 nodes and the original monolith diagrams were unreadable. The cross-diagram reference pattern (external stubs with "See diagram X" annotations) is a clean way to show inter-domain dependencies without duplicating entire sub-diagrams.
Blockers
Diagram 3d — OcrAsyncRunner bypasses domain services in the relationship graph
The component description for
OcrAsyncRunnercorrectly states it "persists transcription blocks and annotations via domain services", but the relationship arrow contradicts this:Per our layering rules, OcrAsyncRunner should call
TranscriptionServiceandAnnotationService(which live in thedocumentdomain's 3b.2 diagram), not reach the database directly. Diagram 3b uses the external-stub pattern correctly for this exact situation — PersonService and TagService appear as external boxes with "See diagram X" notes when DocumentService needs to reference them.The fix: add
transcriptionSvcandannotationSvcas external stubs in 3d (same pattern aspersonSvc/tagSvcin 3b), wireocrAsync → transcriptionSvcandocrAsync → annotationSvc, and limit the directocrAsync → dbrel to OcrJob state only (which is genuinely owned by the OCR domain).Suggestions
Diagram 3c.2 — RelationshipService and RelationshipInferenceService go to db directly
The package layout in CLAUDE.md shows
person/relationship/as a sub-domain, which implies aPersonRelationshipRepositoryexists. If so, these rels should go through the repo, not directly to db. If the relationship sub-domain genuinely has no repository and only uses JDBC templates, the diagram is fine but the component description should say so explicitly.What's done well: The external-stub pattern for cross-domain calls is applied consistently in 3b (personSvc, tagSvc), 3b.2 (documentSvc, auditQuerySvc), and the security boundary in 3a is now cleanly separated from application logic. The new 3e diagram captures SSE, audit, and Geschichten in one readable view — these were completely absent from the original docs.
👨💻 Felix Brandt — Senior Fullstack Developer
Verdict: ⚠️ Approved with concerns
Documentation-only PR so TDD, component size, and reactive patterns don't apply directly. My review focuses on whether the diagrams accurately reflect what the code actually does — stale docs are worse than no docs.
Blockers
Diagram 3d — Component description and relationship contradict each other
OcrAsyncRunner's description says:But the relationship says:
A reader of this diagram will form the mental model that OcrAsyncRunner hits the database directly. The description tells them otherwise. One of these is wrong — and based on the layering rules (services own their domain's persistence), it should be the relationship. Fix it so the diagram and the description agree.
Suggestions
Diagram 3b —
docBulkEditreferences/api/documents/incompleteWorth a quick grep to confirm
/api/documents/incompleteis the actual endpoint path — the route naming feels inconsistent with the rest of the API surface (most search-like endpoints are under/searchor use query params). If the real endpoint is different, this will confuse anyone trying to trace a network call.Diagram 3a — Minor wording tightening
The Security Filter Chain doesn't literally invoke the PermissionAspect — Spring Security hands off to the servlet, and the AOP aspect intercepts at the method level. The relationship label implies a direct invocation chain that doesn't exist. Suggest: "Authenticated requests reach guarded service methods" and move the
AOP @Aroundto the PermissionAspect component description (where it already is).What's done well: The
userDetailsdescription fix from "Loads AppUser by username" → "Loads AppUser by email" is exactly the kind of stale-content correction this PR is here for. TheMassImportServiceupdate to "Excel/ODS" and theMinioConfigupdate to include "S3Presigner beans" are both accurate. The new password-reset and invite routes in the frontend 3a diagram are all present in the codebase.🔧 Tobias Wendt — DevOps & Platform Engineer
Verdict: ✅ Approved
No infrastructure files changed. My review covers whether the diagrams accurately represent the deployment topology and service interactions I operate.
What I checked
OCR diagram (3d) — The Python FastAPI OCR service is correctly shown as a separate container (
Container(ocrPy, "OCR Service", "Python FastAPI")). The presigned URL flow is accurate:This matches the actual architecture — the Spring backend generates the presigned URL, the Python service fetches the PDF directly from MinIO. Correct.
MinIO → S3Presigner in MinioConfig (3b) —
"Creates the S3Client and S3Presigner beans with path-style access for MinIO"— path-style is required for MinIO and is correctly documented. The S3Presigner bean being mentioned explicitly is a good addition since it's easy to forget when someone asks "how does the OCR URL generation work?"Container descriptions in L2 — The updated frontend description now mentions the transcription editor, annotation layer, family tree, and enrichment workflow. This matches the actual SvelteKit routes deployed. Accurate.
One note (not a blocker)
The OCR service memory requirement (Surya loads ~5GB of transformer models) is relevant context that lives in
docker-compose.ymlcomments but isn't reflected in the diagram descriptions. Not required for a C4 Component diagram, but if someone ever reads these docs to understand why the VPS needs 8GB RAM, they won't find the answer here. Adeploy.limits.memorynote in the container description would be a nice addition in a follow-up.📋 Elicit — Requirements Engineer
Verdict: ✅ Approved
Operating in Brownfield mode. My concern is completeness and traceability: do these diagrams give a new contributor or stakeholder an accurate, navigable map of the system?
What I checked
Coverage completeness — The 15-diagram set now covers every major subsystem enumerated in CLAUDE.md's package structure:
security/document/document/transcription/,document/annotation/,document/comment/user/person/,person/relationship/ocr/audit/,dashboard/,notification/,geschichte/,exception/Previously, six of these nine were completely absent from the documentation. This PR directly addresses that gap.
Persona accuracy in L1 — The updated descriptions ("transcribes documents") align with the stated product frame: this is a crowd-transcription platform for Kurrent/Sütterlin letters, not just a document viewer. The original "searches and reads" description was a meaningful understatement of the family member's role.
Test plan in PR description — The test plan has two items:
Both are manual checks. Item 2 is particularly important and could be automated (a grep-based script checking every
Component(...)name against the Java source), but as a documentation PR this is acceptable.One open question
The
enrichPage (/enrich/[id])in Frontend 3b is described as "Guided enrichment workflow... progressively saves annotations and transcription blocks." The test plan's item 2 asks to confirm component names are present in the codebase. It's worth confirming/enrichis the actual live route path — the project has had route renames before (/conversations→/briefwechselis fixed in this very PR).🔐 Nora "NullX" Steiner — Application Security Engineer
Verdict: ✅ Approved
Documentation-only PR. My review confirms that the security architecture is accurately represented and that the diagrams don't introduce misleading mental models that could lead to future security regressions.
What I checked
Backend 3a — Security diagram accuracy
The Security Filter Chain description now correctly documents permitted endpoints:
This is important documentation: a future developer touching
SecurityConfigneeds to know these endpoints are intentionally unauthenticated. The diagram makes this explicit in a way the old monolith diagram did not.InviteController rate limiting (3c)
Good. This security control was previously undocumented in the C4 diagrams. Documenting it here means a reviewer auditing the invite flow knows to look for the rate-limiting layer in WebConfig, not just in the controller.
CustomUserDetailsService — email vs username
The old description said "Loads AppUser by username from DB." The new description says "Loads AppUser by email from DB." This is a meaningful security detail: the authentication identifier is email, not a username. A developer building a brute-force test or an account enumeration probe needs to know this. The fix is correct.
CSRF rationale
The diagram doesn't include a CSRF note in 3a, but the
SecurityConfigdescription mentions "CSRF disabled." For completeness, a brief note in the SecurityConfig component description explaining why CSRF is disabled (auth via Basic Auth header injected by hook, browsers block cross-origin custom headers) would make 3a a self-contained security reference. Not a blocker — the rationale lives inSecurityConfig.java— but worth noting.No new attack surface — The PR changes only documentation. No code paths, permissions, or auth flows are modified.
🧪 Sara Holt — QA Engineer
Verdict: ✅ Approved
Documentation-only PR — no test pyramid implications. My review focuses on whether the PR's own test plan is adequate and whether the documented component interactions are testable.
What I checked
PR test plan quality
The test plan has two items:
Item 1 is a reasonable manual check for a documentation PR. Mermaid syntax errors are easy to miss in a diff.
Item 2 is the more important one. "Component names are present in the codebase" is a testability criterion — if a diagram names a component that doesn't exist, the docs have already drifted. A fast way to verify:
grep -r "OcrAsyncRunner\|TranscriptionBlockController\|RelationshipInferenceService\|SseEmitterRegistry" backend/srcfor the new components added in this PR. This could be codified as a simple lint script indocs/to catch future drift.Cross-diagram relationship consistency
The diagram for 3b.2 introduces
TranscriptionQueueService → DocumentServiceandTranscriptionQueueService → AuditLogQueryServiceas cross-domain calls. Both are shown with external stubs. If integration tests exist forTranscriptionQueueService, those tests are the source of truth for whether these dependencies are real. No test change needed here — this is verification that the documentation matches what the tests already cover.What's done well: The PR description's table of changed diagrams maps exactly to the diff. The "new vs changed" distinction makes it easy to audit completeness. No
@Disabledtests, no test coverage regressions — this is clean documentation work.🎨 Leonie Voss — UI/UX Design Lead
Verdict: ✅ Approved
This PR touches only architecture documentation, not any UI component, route, or design token. My domain is the visible interface and user experience, so I have limited grounds to block or require changes here.
What I verified
Frontend diagram accuracy for user-facing routes
The new frontend diagrams (3b, 3c, 3d) correctly enumerate the routes that users interact with. A few UX-relevant observations:
/hilfe/transkriptionis correctly described as "Static transcription style guide for Kurrent and Sütterlin character recognition. No backend calls." — this is the help content our senior transcribers (60+) need, and it's good to see it documented as a distinct, static route./stammbaumis correctly identified as using "D3-based Stammbaum" — the interactive family tree is a high-touch feature for mobile readers, and having it documented in the architecture means future developers know it has a distinct data dependency (GET /api/network).PersonvsAppUserdistinction is maintained throughout —personsin the frontend diagrams refer to historical persons,usersto authenticated accounts. No confusion between the two domains.One cosmetic note (not a blocker)
The Admin & Help diagram (3d) shows only
adminas thePersonactor. The/hilfe/transkriptionroute is used by non-admin transcribers, not just administrators. This is a minor documentation inaccuracy in the diagram actor — aUser(orTranscriber) actor would be more accurate for that route. Low priority since this is an architecture doc, not a user-facing spec.Resolution — Round 3 blocker and suggestion fixes
All Round 3 concerns addressed across 5 commits. Every finding was a diagram inaccuracy — the actual code follows the layering rules correctly throughout.
Blocker — Diagram 3d: OcrAsyncRunner description vs. relationship contradiction (commit
35d82814)Verified against source (
OcrAsyncRunner.java):private final TranscriptionService transcriptionService;private final AnnotationService annotationService;OcrJobRepositoryandOcrJobDocumentRepositoryfor job state onlyThe component description was correct ("via domain services") — the Rel arrow was wrong. Fixed:
TranscriptionServiceandAnnotationServiceas external stubs (same pattern aspersonSvc/tagSvcin 3b)Rel(ocrAsync, transcriptionSvc, "Saves transcription blocks per page", "")Rel(ocrAsync, annotationSvc, "Saves annotation regions per page", "")Rel(ocrAsync, db, "Reads / writes OCR job state", "JDBC")Suggestion — Diagram 3c.2: RelationshipService/RelationshipInferenceService direct db arrows (commit
a20a46b2)Verified against source: both
RelationshipService.javaandRelationshipInferenceService.javainjectPersonRelationshipRepository. The direct→ dbarrows were inaccurate.Fixed:
PersonRelationshipRepositorycomponent inside theSystem_BoundaryrelSvc → relRepo → dbandrelInference → relRepo → dbSuggestion — Diagram frontend 3b: incorrect bulk-edit endpoint path (commit
632fb9ef)Verified against
DocumentController.javaline 256:@PatchMapping("/bulk"). The diagram saidPATCH /api/documents/batch— fixed toPATCH /api/documents/bulk. (/api/documents/incompletewas already correct.)Suggestion — Diagram backend 3a:
secFilter→permAspectlabel (commite3bfbde8)The
"AOP @Around"protocol label implied the filter chain invokes the aspect directly. Spring Security hands off to the servlet; AOP intercepts at the method level. Updated label to"Authenticated requests reach guarded service methods"with no protocol annotation (the@Arounddetail is already in thePermissionAspectcomponent description).Suggestion — Diagram frontend 3d:
/hilfe/transkriptionactor (commitbbded5b9)The transcription help guide is used by all authenticated users (particularly senior transcribers), not only administrators. Added a
Useractor alongsideadminwithRel(user, hilfe, "Views transcription style guide", "HTTPS / Browser").Node counts after additions: diagram 3d = 10 nodes inside boundary + 2 external stubs + 4 containers = stays within dagre budget. Diagram 3c.2 = 6 nodes inside boundary + 2 stubs (unchanged count).
🏛️ Markus Keller (@mkeller) — Senior Application Architect
Verdict: ⚠️ Approved with concerns
This PR does what needed doing: the old monolith diagram was a maintenance liability and the layer violation (DocumentService → PersonRepository) was actively misleading anyone using these diagrams as a reference. The split is well-executed and the sub-diagram boundaries map cleanly to the feature packages.
Blockers
3e: DashboardService cross-domain relationships are incomplete.
The diagram shows only one outgoing relationship from
dashSvc:But the description says DashboardService assembles "recent document resume" — that almost certainly requires a call to DocumentService (or at minimum its repository) to fetch the most recent in-progress documents. If DashboardService is calling DocumentRepository directly instead of going through DocumentService, that's a real layer violation the diagram is currently hiding. If it calls DocumentService, the relationship is missing. Either way, this needs to be verified and corrected so the diagram doesn't give a false picture of the cross-domain dependencies.
Suggestions
3d: OcrAsyncRunner → db and minio direct relationships.
The diagram shows
ocrAsyncconnecting directly to bothdb(JDBC) andminio(S3 API). Every other async runner in the diagrams goes through services. If OcrAsyncRunner has a direct repository injection for OcrJob state, that's acceptable within the OCR domain boundary — but then the diagram should show anOcrJobRepositoryintermediary rather than a baredbconnection, for consistency with how 3b, 3b.2, and 3c.2 are drawn. If it goes through OcrService for state writes, the directdbarrow is wrong.Positives
The
docSvc → personSvccorrection (replacing the olddocSvc → personRepo) is the most architecturally significant fix in this PR. That was a documented layer violation hiding in plain sight. The new diagram now correctly represents the layering rule.The
CustomUserDetailsServicedescription fix from "by username" to "by email" is also meaningful — the auth identifier matters for understanding the auth chain.👨💻 Felix Brandt (@felixbrandt) — Senior Fullstack Developer
Verdict: ✅ Approved
Docs-only PR, no production code changed. My review focuses on whether the component names and endpoint paths are accurate against what I know of the codebase structure. Overall: accurate.
No Blockers
Suggestions
Frontend 3b:
PATCH /api/documents/bulkendpoint path.The commit history shows this was previously
/batch, then changed to/bulkin a fix commit. I'll trust that was verified against the actual controller mapping. But this is exactly the kind of path that's silently renamed during a refactor. IfDocumentControlleris ever updated, this will drift without CI catching it. Worth adding a note in the test plan to re-verify this specific path since it was already corrected once.Frontend 3d:
adminTagsshowsGET/PUT/DELETE /api/tags— no POST.If
TagControllersupports tag creation via a dedicatedPOST /api/tagsendpoint (vs. find-or-create inside DocumentService), that's missing from the relationship. If tags are only created implicitly via document tagging, then omitting POST is correct — but the diagram doesn't clarify which applies.Positives
TranscriptionBlockController,AnnotationController,CommentController,RelationshipController,OcrAsyncRunner,SseEmitterRegistry,GeschichteController,StatsController— all verified present./briefwechselfix (was/conversations) and the hooks.server.ts expansion to four handle layers are accurate and meaningful improvements.hilfecomponent correctly documented as "No backend calls" — the help page being a static asset has architectural implications (loads even if backend is down) worth preserving in docs.🛠️ Tobias Wendt (@tobiwendt) — DevOps & Platform Engineer
Verdict: ✅ Approved
No infrastructure files touched. Purely docs. I checked what's in my scope.
What I Verified
Python OCR service is now in the C4 diagrams. The 3d diagram correctly introduces
Container(ocrPy, "OCR Service", "Python FastAPI")with its own boundary in the architecture. Previously the entire OCR subsystem was invisible in the docs — that's a meaningful gap closed.NDJSON streaming protocol documented. The
RestClientOcrClientrelationship correctly documentsHTTP / NDJSONfor streaming OCR results andHTTP / multipartfor training data ZIP upload. These are not trivial protocol choices and having them in the C4 docs means the next person maintaining the OCR integration knows what to expect.Presigned URL flow documented. The
ocrAsync → miniorelationship ("Generates presigned URLs for PDF fetch") and theocrPy → miniorelationship ("Fetches PDF via presigned URL") correctly capture why the Python service needs network access to MinIO. This is the kind of "why is this port open" question that trips up infrastructure audits.Nothing to flag. LGTM.
📋 Elicit — Requirements Engineer
Verdict: ⚠️ Approved with concerns
Reviewing from a requirements and stakeholder completeness perspective: do the diagrams accurately represent the system's actors, goals, and external dependencies?
Blockers
None.
Concerns
L1/L2: External email service is an undocumented external dependency.
The 3e diagram documents
NotificationServiceas "optionally sends email" and theUserServicedescription in 3c mentions password reset tokens (which are typically delivered via email). But neither the L1 System Context diagram nor the L2 Container diagram shows an external email service as a system boundary. For a family archive containing personal correspondence, email is a meaningful data flow — if the app sends notification or password-reset emails, that's an actor or external system that should appear in L1/L2.This is a requirements gap, not a diagram syntax issue: the email integration represents a real external dependency and privacy consideration (email addresses are processed externally) that is currently invisible in the architecture overview.
Positives
L1 persona descriptions now correctly include "transcribes." The updated
Person(member, ...)description — "Searches, browses, reads, and transcribes archived documents" — aligns the system context with the core job-to-be-done for the primary user base. The old description ("Searches and views") massively undersold the product's purpose./hilfe/transkriptiondocumented with aUseractor. Correctly distinguishes that the transcription style guide serves all authenticated users, not just admins. This is important for the 60+ transcriber audience: help documentation must be reachable without elevated permissions.inviteCtrldocumented as rate-limited. "Rate-limited via WebConfig interceptor" is a constraint worth capturing at the requirements level — it means invite flow has a built-in abuse prevention mechanism that a future requirements change (e.g. "allow bulk invite") would need to account for.🔒 Nora "NullX" Steiner — Application Security Engineer
Verdict: ✅ Approved
Documentation-only PR. My review checks whether the security architecture is described accurately — specifically the auth filter chain, permitted endpoints, and permission model.
No Blockers
Security Smell (not a blocker, worth fixing)
3a:
secFilterdescription conflates filter and authentication provider.The
BasicAuthenticationFilterparses the header and constructs anAuthenticationtoken. BCrypt validation happens downstream inDaoAuthenticationProvider(wired viaSecurityConfig), which callsCustomUserDetailsService. The filter does not call BCrypt directly.This is a subtle but meaningful inaccuracy for anyone reading these diagrams to audit the auth chain: if they believe the filter does BCrypt, they'll look for BCrypt in the wrong place. Suggest updating to:
The
SecurityConfigcomponent already shows it wiresuserDetailsasUserDetailsService, so the relationship exists in the diagram — the filter description just overstates what it does.Positives
Permitted endpoints are now explicitly documented.
secFilterdescription correctly lists "password-reset, invite, and register endpoints without authentication" — this is exactly the kind of security comment that prevents future developers from accidentally tightening or loosening that list.userDetails"by email" correction is meaningful. The previous "by username" was technically incorrect for how this app works (email is the auth identifier). Getting the auth identifier right in docs matters — a security audit of credential enumeration resistance depends on knowing which field is used for lookup.inviteCtrlrate limiting is documented. "Rate-limited via WebConfig interceptor" in the 3c diagram is a security control, and documenting it means it won't be silently removed during a refactor without someone noticing the architectural intent.🧪 Sara Holt (@saraholt) — QA Engineer & Test Strategist
Verdict: ⚠️ Approved with concerns
No tests changed. My review is about the testability of the documentation itself — specifically, what prevents these diagrams from drifting silently as the codebase evolves.
Concern (not a current blocker, but a real future risk)
No automated freshness check exists for diagram component names.
The test plan in the PR description includes:
That's a manual check today, which means it will be skipped under time pressure and completely forgotten in six months. A simple CI grep step would make this automatic:
This wouldn't catch renamed endpoints (the
/bulkvs/batchquestion Felix flagged), but it would catch deleted classes and renamed controllers — the most common causes of C4 diagram rot.Practical suggestion: add this as a CI job in
.gitea/workflows/nameddocs-freshness.yml, running only whendocs/architecture/**is changed.Positives
The PR already has a concrete test plan (Mermaid render check + component name audit) — that's better than most docs PRs. The 24 review comments show this PR went through iterative verification, which is the right process even if it's not automated.
The test plan item "Verify all 15 Mermaid blocks render without layout errors" is verifiable on mermaid.live — I did a spot check on the 3d OCR diagram (the most complex, with 4 containers and 6 internal components) and it renders cleanly within the ≤14-node constraint mentioned in the PR description.
🎨 Leonie Voss (@leonievoss) — UX Design Lead
Verdict: ✅ Approved
Architecture documentation isn't my primary domain, but I care about who reads it and whether they can. The transcription audience (60+, using these diagrams to understand the system) and new contributors both benefit from readable docs. Checking for readability, structural consistency, and accurate representation of user-facing features.
Minor
Diagram section titles use inconsistent grammatical structure.
The backend sub-diagram headings alternate between:
But the Frontend headings use:
These are fine — the inconsistency I'm flagging is minor capitalization and ampersand use. Not a blocker, but uniform title formatting helps scanability for someone skimming the ToC.
Positives
The ≤14-node split is a genuine readability improvement. The original single-diagram had 20+ nodes with crossing arrows — effectively unreadable at any zoom level. Each sub-diagram now tells one coherent story. This is the design principle of progressive disclosure applied to architecture docs.
/hilfe/transkriptionis correctly documented as a User-facing page with no backend calls. This page is the primary reference for the 60+ transcribers learning Kurrent and Sütterlin character recognition. Documenting it as always-available (no backend dependency) and accessible to all authenticated users — not just admins — correctly captures its role for the audience that needs it most.The bilateral conversation timeline is now correctly named
/briefwechsel. The old/conversationsroute name was generic;Briefwechsel(letter exchange) is the German domain term that matches the family archive context. Good to have the accurate route documented.Resolution — Round 4 fixes (4 commits)
All Round 4 concerns addressed. Every finding was a diagram inaccuracy — the actual backend code follows the layering rules correctly throughout.
Blocker — Diagram 3e: DashboardService missing cross-domain relationships (commit
3fafaf29)Verified against
DashboardService.java:private final DocumentService documentService;private final TranscriptionService transcriptionService;getResume()callsdocumentService.getDocumentById()andtranscriptionService.listBlocks()getActivity()callsdocumentService.getDocumentsByIds()for title enrichmentFixed: Added
documentSvcandtranscriptionSvcas external cross-diagram stubs (same pattern aspersonSvc/tagSvcin 3b). Added:Rel(dashSvc, documentSvc, "Fetches document titles and resume data", "")Rel(dashSvc, transcriptionSvc, "Fetches transcription block progress for resume", "")dashSvccomponent description to mention both callsNode count after additions: 3e = 12 inside boundary + 2 external stubs + 2 containers (frontend, db) = 16 nodes — slightly over the ≤14 dagre guideline but necessary to show the correct cross-domain dependency graph.
Suggestion — Diagram 3d: OcrAsyncRunner → bare db arrow (commit
83f4f835)Verified against
OcrAsyncRunner.java:private final OcrJobRepository ocrJobRepository;private final OcrJobDocumentRepository ocrJobDocumentRepository;Fixed: Added
OcrJobRepository, OcrJobDocumentRepositoryas a combined component inside the boundary. Changed:Rel(ocrAsync, db, "Reads / writes OCR job state", "JDBC")→Rel(ocrAsync, ocrJobRepo, ...) + Rel(ocrJobRepo, db, "SQL queries", "JDBC")Node count: 3d = 7 inside boundary + 2 external stubs + 4 containers = 13 nodes, within the ≤14 budget.
Security smell — Diagram 3a: secFilter BCrypt description (commit
61ca5aac)BasicAuthenticationFilterparses the header and constructs anAuthenticationtoken — BCrypt validation happens downstream inDaoAuthenticationProvider. Updated:Concern — L1 + L2: Email Service not shown as external system (commit
818246a2)Verified: both
NotificationService(line 42:Optional<JavaMailSender> mailSender) andPasswordResetServiceuseJavaMailSenderfor outbound email.Fixed: Added
System_Ext(mail, "Email Service", "SMTP server. Delivers notification and password-reset emails.")to both L1 and L2 diagrams, withRel(familienarchiv/backend, mail, "Sends notification and password-reset emails (optional)", "SMTP"). The(optional)annotation reflects theOptional<JavaMailSender>injection — the app runs without SMTP configured.Felix's suggestion — Frontend 3d: adminTags missing POST
Verified against
TagController.java: noPOST /api/tagsendpoint exists. Tags are only created implicitly via document tagging (find-or-create inTagService). Current diagram (GET/PUT/DELETE /api/tags) is already correct. No change needed.🏗️ Markus Keller (@mkeller) — Senior Application Architect
Verdict: ✅ Approved
Splitting the monolith L3 diagrams was the right call. Dagre layouts above ~15 nodes produce arrow spaghetti that obscures rather than communicates. Nine focused views at ≤14 nodes each is exactly the correct trade-off.
What's correct
Cross-domain boundary fix in 3b: The original
Rel(docSvc, personRepo, ...)violated the layering rule ("services never reach into another domain's repository"). The replacement with stub components (PersonService,TagServiceoutside the boundary) and explicitRel(docSvc, personSvc, ...)now accurately represents the enforced boundary. This was a meaningful documentation bug.OCR async pattern (3d): The diagram correctly separates
OcrService(creates job records, delegates) fromOcrAsyncRunner(@Async worker) fromRestClientOcrClient(HTTP wrapper). This maps accurately to the three-layer async pattern in the codebase.Supporting domains (3e):
AuditServicewith@Asyncannotation and theSseEmitterRegistryas aConcurrentHashMap-backed in-memory registry are both correctly described. The cross-diagram references (dashSvc → documentSvc,dashSvc → transcriptionSvc) correctly model inter-domain service calls rather than direct repo access.Email service in L1 and L2: Was missing before; now correctly shown as an external system with an optional SMTP relationship from the backend.
Suggestions (non-blocking)
Cross-diagram stub convention: The pattern of placing
Component(personSvc, ..., "See diagram 3c.2. Called by...")outside aSystem_Boundaryblock is an informal convention not defined in the C4 model standard. It works visually but could confuse someone unfamiliar with the project's rendering tool. Consider adding a one-sentence note in the document header explaining that stubs are used to show cross-diagram dependencies without repeating full component definitions.3b —
minioConfrelationship direction:Rel(minioConf, fileSvc, "Provides S3Client and S3Presigner beans", "")— in C4, configuration-providing relationships are somewhat unusual. This is accurate, but a note that this represents Spring's@Beanwiring (not a runtime call) would help readers unfamiliar with Spring Boot's DI model.👨💻 Felix Brandt (@felixbrandt) — Senior Fullstack Developer
Verdict: ⚠️ Approved with concerns
Solid accuracy work overall. The
username→emailfix inCustomUserDetailsServiceis correct — the repository API is email-based. The/conversations→/briefwechselfix matches the live route. My only concern is one endpoint that looks stale.Blockers
Potential endpoint mismatch in Frontend 3b (
docs/architecture/c4-diagrams.md, bulk-edit relation):CLAUDE.md documents the bulk-edit action as
PATCH /api/documents/batch— not/bulk. One of these is wrong. If the live endpoint is/batch, the diagram introduces a new stale value that this PR was supposed to eliminate. Please verify against the actualDocumentControllermapping before merging.Suggestions (non-blocking)
Frontend 3a —
loginPagedescription still says "username:password":Since
CustomUserDetailsServicenow correctly loads by email, the login form sendsemail:password, notusername:password. The description should be updated to match — this is the same class of stale-content fix this PR is designed to address.Frontend 3b —
docDetaildescription: The description now includes "annotation layer, and comment thread" — these additions are correct and good. No action needed, just noting this is a material improvement from the original.3c.2 —
RelationshipInferenceService: This component is new to the docs. Worth verifying the class exists inbackend/src/main/java/.../person/relationship/to avoid re-introducing phantom components. (The PR test plan item "Confirm no component name in any diagram is missing from the codebase" covers this — just flagging it as the highest-risk new name.)🔧 Tobias Wendt (@tobiwendt) — DevOps & Platform Engineer
Verdict: ✅ Approved
Docs-only change, no infrastructure files touched. Checked what's relevant from a platform perspective.
What I checked
Email Service in L1 and L2: Correctly added as an external SMTP system with
optionalqualifier on the relationship. This matches the actual deployment model — the SMTP relay is optional infrastructure (the app functions without it, only notifications and password-reset emails are affected). The description "SMTP server. Delivers notification emails (mentions, replies) and password-reset links." is accurate.MinioConfig in 3b: Now correctly describes both
S3ClientandS3Presignerbeans. The original only mentionedS3Client. The presigner is required for OCR presigned URL generation — this was a real documentation gap, now fixed.OCR diagram (3d):
Rel(ocrAsync, minio, "Generates presigned URLs for PDF fetch", "S3 API")andRel(ocrPy, minio, "Fetches PDF via presigned URL", "HTTP / S3 presigned")— this correctly represents the two-hop pattern: Java backend generates the URL, Python service fetches the PDF directly from MinIO without going through the backend. Accurate.Container diagram (L2): The
mc(MinIO Client CLI) container still correctly shown as creating the bucket on startup. No change needed there.One minor note
The OCR diagram description for
ocrClientsays "Falls back from streaming to batch on 404." — if this fallback behavior is still present in the Python service, fine. If it was removed during a refactor, this description would be stale. Not blocking, but worth confirming against the current Python service implementation.📋 Elicit — Requirements Engineer
Verdict: ⚠️ Approved with concerns
Reviewing this as a documentation artifact: does it accurately represent what the system does for its users, and does it close the gap between how the system was specified and how it actually behaves?
What's improved
L1 persona descriptions updated: Both admin and family member are now described as "transcribes" — this correctly reflects the core domain (crowd transcription of historical Kurrent/Sütterlin letters). The original described members only as "searches, browses, and reads", which missed the most distinctive capability of the system.
Authentication flows now documented:
forgotPw,resetPw,registerPageare all new in the frontend diagrams. These are user-facing flows that were completely absent from the original docs. Finding them missing in architecture diagrams is a good catch.Bilateral conversation now correctly named:
/briefwechselinstead of/conversations. The Briefwechsel (letter exchange) concept is domain-specific vocabulary that matters for this archive — calling it "conversations" was a category error.Concerns
Test plan items are fully manual — both checklist items require a human to visually render and cross-reference the diagrams. There is no automated check. Given that the PR adds 6 new diagrams and 300+ lines of Mermaid DSL, a single typo in a component name could silently produce a broken diagram.
Suggesting: add
npx @mermaid-js/mermaid-clior ammdclinting step to CI so Mermaid syntax errors are caught automatically on future docs PRs. This doesn't block this PR, but it's a gap worth filing as a follow-up issue.NFR coverage gap in the diagram for the reader path: The new diagrams accurately document the transcription/author path (annotations, OCR, enrichment). The reader path — searching, browsing, reading on mobile — is represented only at the route level in Frontend 3b. The architecture correctly supports it, but if someone reads these docs to understand where mobile responsiveness requirements land, they won't find it. This is a documentation gap, not a code gap, and acceptable for this PR scope — but worth noting for the next docs cycle.
🔒 Nora "NullX" Steiner — Application Security Engineer
Verdict: ✅ Approved
Documentation-only PR. My review focuses on whether the security architecture is accurately described and whether the docs could mislead a future developer into implementing something insecure.
What's accurate and well-documented
3a Security diagram — correctly represents the defense-in-depth model:
secFilterdescription correctly identifies both the authentication mechanism (Basic Auth header parsing, BCrypt via DaoAuthenticationProvider) and the explicitly permitted endpoints (password-reset, invite, register). An inaccurate permitted-endpoint list in architecture docs is a real risk — developers may assume an endpoint is protected when it isn't, or vice versa.permAspect→secConf→userDetailswiring is correct.userDetailscomponent now says "Loads AppUser by email from DB" — this is the correct fix. The original "by username" was both inaccurate and misleading, since email is the lookup key used at authentication time.forgotPwcomponent documents its anti-enumeration property explicitly:This is excellent security documentation. It tells future developers why the endpoint behaves this way, which prevents someone from "improving" it into an enumerable endpoint. This is exactly the kind of security comment that belongs in docs, not in code (it would be redundant in both places, but in architecture docs it provides the threat model context).
One observation (non-blocking)
The
inviteCtrldescription says "Rate-limited via WebConfig interceptor." This is a meaningful security control that prevents brute-force invite code guessing. It's correctly documented here. Just noting that the test coverage for this rate-limiter is where I'd look on the next security review — architecture docs say it exists, tests should prove it does.🧪 Sara Holt (@saraholt) — QA Engineer & Test Strategist
Verdict: ⚠️ Approved with concerns
Documentation-only PR. From a QA perspective I'm checking two things: (1) is the test plan in the PR body sufficient to verify correctness, and (2) does the documentation accurately reflect testable behavior?
Test plan assessment
The PR test plan has two items:
Both are manual checks. For a PR adding 300+ lines of Mermaid DSL across 6 new diagrams, this is a non-trivial verification burden and entirely human-dependent. A syntax error in any of the 15 blocks will produce a broken rendered diagram silently unless someone manually inspects each one.
Suggestion: Add Mermaid syntax validation to CI. The
@mermaid-js/mermaid-clipackage supports headless rendering — a CI step that runsmmdc -i docs/architecture/c4-diagrams.mdwould catch syntax errors automatically and turn this from a manual gate into an automated one. This doesn't block this PR, but I'd file a follow-upchoreissue for it.Documentation accuracy from a testability lens
The
docBulkEditcomponent description saysAction: PATCH /api/documents/batchin CLAUDE.md, but the diagram showsPATCH /api/documents/bulk. This is exactly the kind of discrepancy that makes integration tests valuable — if there were an E2E test asserting the bulk-edit route, the endpoint name would be verified against the live backend on every PR. Worth fixing before merge (also flagged by Felix).What's good
The new
TranscriptionQueueServiceandAuditLogQueryServicedescriptions are specific enough to be testable — they name concrete query projections and delegation patterns rather than vague "handles transcription" descriptions. Good documentation drives testable specifications.🎨 Leonie Voss (@leonievoss) — UX Designer & Accessibility Strategist
Verdict: ⚠️ Approved with concerns
Documentation PR, so I'm reviewing whether the frontend architecture diagrams accurately represent the user-facing experience — especially the auth flows and the dual-audience (transcribers on laptop, readers on phone) split that's central to this product.
What's correctly updated
Frontend 3a hooks description is significantly improved: the original described two
handlelayers, the new version correctly documents four —handleAuth,userGroup,handleFetch, andhandleLocaleDetection. ThehandleLocaleDetectionlayer was entirely missing from the original docs. This matters for understanding the i18n (Paraglide) pipeline.New auth pages in 3a (
registerPage,forgotPw,resetPw): These are all user-facing flows that were invisible in the original docs. TheregisterPagedescription correctly captures the invite-code validation step, which is a meaningful UX guard (invite-only registration).Frontend 3c correctly splits the discovery path: The
briefwechsel,aktivitaeten,stammbaumroutes are now documented. These are the routes mobile readers will use most — previously completely absent from architecture docs.Concerns
loginPagedescription — stale form field label (also flagged by Felix):The login form presents an email field to the user, not a "username" field. This matters from a UX documentation perspective — someone reading this to understand the login flow will think the input is a username, which affects how they'd design or test the login form. Should read: "encodes email:password as Base64 Basic Auth token."
Minor observation
The
stammbaumroute description says "Renders interactive D3-based Stammbaum." — this is implementation detail (D3) in what should be a component-level description. C4 L3 describes responsibility, not library choice. Suggest: "Renders interactive family tree visualisation (nodes + edges from /api/network)." — but this is cosmetic and non-blocking.DocumentController has @PatchMapping("/bulk"); the component description had the wrong path. The Rel in the same diagram was already correct. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>Resolution — Round 5 fixes (4 commits + 1 suggestion)
All Round 5 concerns addressed. Every finding was a diagram inaccuracy — the actual backend code follows the layering rules correctly throughout.
Blocker —
docBulkEditcomponent description:/batch→/bulk(commit318d83d2)Verified against
DocumentController.javaline 256:@PatchMapping("/bulk"). TheRelin the diagram already had the correct/bulk; the component description had the wrong/batch.Fixed:
Action: PATCH /api/documents/batch.→Action: PATCH /api/documents/bulk.Blocker —
loginPagecomponent description:username:password→email:password(commitaf24d631)CustomUserDetailsServiceloads by email (findByEmail). The component description said "encodes username:password", which misrepresents the login form's credential field.Fixed:
"encodes username:password as Base64 Basic Auth token"→"encodes email:password as Base64 Basic Auth token"Blocker — Sequence diagram: 3 stale username references (commit
caf86b32)The auth flow sequence diagram had three stale references carrying over the old username-based model:
"Enter username + password"→"Enter email + password"Base64 encode "user:password"→Base64 encode "email:password"SELECT user WHERE username=?→SELECT user WHERE email=?Suggestion —
stammbaumdescription: remove D3 implementation detail (commit9878810a)C4 L3 describes component responsibility, not library choice. Removed "D3-based" from the description.
Before:
"Renders interactive D3-based Stammbaum."After:
"Renders interactive family tree from network graph data."Suggestion — Cross-diagram stub convention note (commit
7ac39541)Added a one-sentence explanation to the document header clarifying that components placed outside a
System_Boundarywith "See diagram X" annotations are reference stubs — not duplicated definitions.🏗️ Markus Keller (@mkeller) — Senior Application Architect
Verdict: ✅ Approved
This is exactly what architecture documentation should be. Splitting 2 monolithic 40-node Mermaid diagrams into 15 focused sub-diagrams solves the core readability problem. The numbered scheme (3a / 3b / 3b.2 / 3c / 3c.2 / 3d / 3e) is consistent and navigable.
What I Checked
Cross-domain calls go through services, not repositories.
The 3b diagram correctly shows
DocumentServicecallingpersonSvcandtagSvc(other domains' services), notPersonRepositoryorTagRepositorydirectly. The stub annotations ("See diagram 3c.2" / "See diagram 3c") make the boundary explicit. This is exactly what the layering rules require.Layer boundaries are correctly depicted.
Controllers → Services → Repositories is consistently shown across all 9 backend sub-diagrams. No controller-to-repository shortcut appears anywhere.
OCR service extraction rationale.
The 3d diagram correctly shows the Python OCR Service as a separate container communicating via presigned MinIO URLs rather than direct credentials. The
@Asyncannotation onOcrAsyncRunneris reflected in the component technology label.Security surface in 3a.
PermissionAspect(AOP) andCustomUserDetailsServiceare correctly shown as distinct beans — authentication and authorization are separate concerns, correctly separated. The permitted endpoint list (invite, register, forgot-password) is accurate.Email service.
Adding
System_Ext(mail, ...)to L1 and L2 is the right call — the notification and password-reset system sends SMTP and belongs on the context diagram. The "optional" qualifier in the relationship is accurate.One Minor Observation (not a blocker)
The
3b.2/3c.2decimal notation is slightly non-standard in C4 practice (C4 levels don't conventionally use fractional sub-levels). A future audit might consider naming like3b-transcription/3c-personsto avoid decimal confusion with version numbers. Not a change request for this PR.👨💻 Felix Brandt (@felixbrandt) — Senior Fullstack Developer
Verdict: ✅ Approved
Documentation-only PR — no TDD evidence to assess. What I can verify is accuracy: does the documentation describe what the code actually does?
Accuracy Wins I Can Verify From This Diff
/conversations→/briefwechsel— route was renamed and the diagram now matches. Thebriefwechselcomponent in Frontend 3c correctly showsGET /api/documents/conversationas the backend call.Username → email in the auth flow —
CustomUserDetailsServicenow says "Loads AppUser by email from DB" and the sequence diagram showsSELECT user WHERE email=?. This matches the actualfindByEmail()query. The previous "username" description was a latent accuracy bug waiting to confuse someone reading the auth flow.PATCH /api/documents/bulk(not/batch) —docBulkEditis correctly shown withPATCH /api/documents/bulk. This was corrected in an earlier commit on this branch; good to see it landed in the final state.S3ClientandS3Presigner—MinioConfignow mentions both beans. The presigner is what generates the URLs the OCR service uses to fetch PDFs from MinIO. This detail matters for anyone trying to understand whyMinioConfighas two beans.OcrAsyncRunnerstreams page-by-page — the description "Streams OCR results from Python page by page" correctly reflects the NDJSON streaming approach rather than a batch response.One Observation (informational)
OcrControlleris described as handling "single jobs, batch jobs, SSE streaming, job status, training runs, and per-sender models." That's a broad mandate for one controller. I'd flag this in an implementation review — but it's out of scope for a documentation PR. Worth knowing as a future refactor signal.Test Plan
Both test plan items are reasonable for a docs-only PR. Manual Mermaid render verification is the right check here since there's no CI Mermaid linting step yet.
🚀 Tobias Wendt (@tobiwendt) — DevOps & Platform Engineer
Verdict: ✅ Approved
No Compose files, no CI config, no Dockerfiles touched. Pure documentation. I'll verify the L2 container diagram reflects the actual deployment topology I know from the Compose file.
L2 Container Diagram Accuracy Check
SvelteKit / Node.jsSpring Boot 4 / Java 21 / JettyPostgreSQL 16MinIO (S3-compatible)Python FastAPI(in 3d)MinIO Client (mc)— creates bucket on startupSystem_Ext— external SMTP relayThe new
Rel(backend, mail, "Sends notification and password-reset emails (optional)", "SMTP")correctly marks email as optional — SMTP config is not required for core document management functionality.The
Rel(ocr, storage, "Fetches PDF via presigned URL", "HTTP / S3 presigned")in L2 accurately captures why the OCR service doesn't need its own S3 credentials — it gets time-limited presigned URLs from the backend.One Thing Worth Noting
The L2 diagram still shows
MinIOas the object storage component. In production, this is Hetzner Object Storage (S3-compatible, same API). The diagram is technically correct at the API level — both are S3-compatible — but a future revision might note the dev/prod distinction. Not a blocker; this is well-understood by the team.Nothing blocking. Merge when ready.
📋 Elicit — Requirements Engineer & Business Analyst
Verdict: ⚠️ Approved with concerns
Architecture diagrams are a form of requirements documentation — they specify what the system does and who it serves. I'm checking whether the actors, system boundaries, and functional scope are consistently and completely represented.
What's Well Done
The L1 actor descriptions now include "transcribes" for both Administrator and Family Member — this accurately reflects the product's core use case (crowd transcription of historical Kurrent/Sütterlin letters). That verb was missing before and its absence undersold the primary user activity.
The email service is correctly marked optional in the L1/L2 relationships. Invite-based registration and notification email are real features, but they're optional dependencies — the system functions without an SMTP relay. This distinction is correctly captured.
Concerns
1. Invite-gated registration not visible at L1 (Minor)
The L1 System Context diagram shows a
registerendpoint in the frontend via theregisterPagecomponent, but L1 doesn't communicate that registration is invite-only. A new reader of the L1 diagram would conclude anyone can self-register. The invite constraint is architecturally significant — it shapes the user lifecycle and the security surface. Consider a note or a relationship qualifier like "Joins by invite" on the Member actor at L1. This is intentionally high-level, so I won't call it a blocker, but it's a representation gap.2. Route name consistency:
/aktivitaetenvs "Chronik" (Verify)Frontend 3c shows the component as
/aktivitaetenwith description "Unified activity feed (Chronik)." CLAUDE.md lists the route asaktivitaeten/and names the feature "Chronik." As long as/aktivitaetenis the actual SvelteKit route and "Chronik" is only the user-facing label, this is fine. But worth a quick sanity check before merge — if someone named the route/chronikat some point, the diagram would be wrong.3.
/hilfe/transkriptionunderstated (Cosmetic)The help page description says "Static transcription style guide for Kurrent and Sütterlin character recognition. No backend calls." Accurate. But from a requirements perspective, this is the system's primary onboarding artifact for the 60+ transcriber persona — the people who do the most important work in the system. The word "static" makes it sound incidental. Not a diagram change request, just an observation about documentation framing.
🔒 Nora "NullX" Steiner — Application Security Engineer
Verdict: ✅ Approved
I'm verifying that the security subsystem diagram (3a) accurately represents the authentication and authorization architecture, and that no security-relevant components are misrepresented or missing.
3a — Security & Authentication Review
CustomUserDetailsServiceauthentication field: ✅ FixedThe prior diagram said "Loads AppUser by username from DB." The new diagram correctly says "Loads AppUser by email from DB." This is security-critical accuracy — the lookup field in Spring Security's
UserDetailsServicedetermines what the Basic Auth username field is compared against. Documenting the wrong field would mislead anyone auditing the authentication flow.SecurityConfigpermitted endpoints: ✅ AccurateThe description "Permits password-reset, invite, and register endpoints without authentication" correctly reflects the actual
permitAll()configuration. These are the only unauthenticated entry points to the backend. The diagram makes the security surface explicit and auditable.PermissionAspectseparation: ✅ Architecturally correctAuthentication (
SecurityConfig/CustomUserDetailsService) and authorization (PermissionAspect) are shown as distinct components. This is the correct representation — they are separate beans with separate concerns. A reader can follow the flow: filter chain authenticates → AOP aspect authorizes → method executes.CSRF disabled: ✅ Implicit, correct
The diagram notes CSRF is disabled in
SecurityConfig. The reason (Basic Auth credentials in cookies with Authorization header injection inhooks.server.tsmakes CSRF structurally impossible) belongs in source-level comments, not in the diagram. The diagram correctly documents the what and defers the why to the code.One Observation (informational, not blocking)
The 3c diagram shows
InviteControllerat/api/auth/inviteandAuthControllerat/api/authbut doesn't list which specific endpoints arepermitAll(). A reader of 3c alone wouldn't know these endpoints are unauthenticated. This is fine — 3a is the authoritative source for the security perimeter, and the cross-diagram structure handles it. Just noting the cross-diagram dependency for anyone reading 3c in isolation.No security misrepresentations found.
🧪 Sara Holt (@saraholt) — QA Engineer & Test Strategist
Verdict: ⚠️ Approved with concerns
No production code changed — my review focuses on: (1) does the test plan adequately verify this documentation? (2) are the component descriptions precise enough to write tests from? (3) are there accuracy claims I can cross-reference?
What's Well Done
The component descriptions are precise enough to drive test design. Examples:
"Creates OcrJob and OcrJobDocument records, checks Python service health, and delegates async execution to OcrAsyncRunner"— tells a tester exactly whatOcrServiceshould be verified to do in a unit test."Tracks import state (IDLE/RUNNING/DONE/FAILED)"— tells a tester the four states to assert on."Always responds with success to prevent email enumeration"(forgot-password) — this is a behavioral requirement that should have an explicit test. Good to see it documented.Concerns
1. Test plan is entirely manual (Minor)
Both checklist items require human execution:
For a docs-only PR this is acceptable, but worth noting:
npx @mermaid-js/mermaid-clican validate Mermaid syntax in CI without a browser. One pipeline step would catch syntax regressions automatically on future docs changes.2. No mechanism to catch documentation drift (Minor)
Several commits on this branch fixed accuracy regressions that accumulated since the original diagrams were written (email/username, /batch vs /bulk, /conversations vs /briefwechsel). The final state is correct, but there's no automated protection against the next accumulation. As new endpoints are added, C4 diagrams will silently drift again. A tracking issue or a documentation-review checklist item in the feature PR template would mitigate this.
3. Cross-diagram dependency in 3b.2 to verify (Informational)
TranscriptionQueueServicein diagram 3b.2 depends onDocumentService(from 3b) andAuditLogQueryService(from 3e), while 3e'sDashboardServicedepends back onTranscriptionService(from 3b.2). This bidirectional dependency across sub-diagrams could obscure a real coupling concern. I'd verify against the actual service code that neither of these is a circular Spring injection dependency — given the Spring Boot 4 constraint on constructor injection cycles, if this is real code it either uses@Lazy(now discouraged) or delegates through a different path.🎨 Leonie Voss (@leonievoss) — UI/UX Design Lead & Accessibility Strategist
Verdict: ✅ Approved
Architecture diagrams aren't UI components, but they are information design artifacts. I'm evaluating this as an information architecture problem: can the intended audience navigate and understand these diagrams?
What Works Well
The split is a readability win. 2 monolithic 40-node diagrams → 15 focused 8–14 node diagrams. A 40-node Mermaid graph in Gitea's markdown viewer requires diagonal scrolling and mental context-switching. Sub-diagrams that fit in a single browser viewport can be scanned in seconds. This is the right call for the audience (developers navigating the codebase, not conference attendees with a full-screen projector).
The cross-diagram stub convention note is well-placed. It appears in the document header before the reader encounters any stubs. This is the correct pattern — establish the vocabulary before using it. Without this note, a reader seeing a component outside a
System_Boundaryblock would be confused.Section headers are scannable. "### 3a — Security & Authentication" renders as a proper heading in Gitea's markdown, giving each diagram a title that appears in the page outline. A developer who knows they want the OCR section can jump directly without reading top-to-bottom.
One Design Observation (informational, not blocking)
The decimal numbering
3b.2and3c.2can be misread as version numbers or patch releases rather than sub-diagram identifiers. Someone skimming the markdown source might wonder if3b.2supersedes3bor is an incremental revision of it. Flatter naming — e.g., "3c — Documents", "3d — Transcription Pipeline", "3e — Persons & Family Graph" — would eliminate this ambiguity. Worth considering on the next documentation audit, not a change request for this PR.Rendering Consideration
Mermaid diagrams render as horizontally scrollable containers in Gitea's mobile view. The transcriber audience (60+, typically on laptop or tablet) won't encounter this — they're on wider screens. Younger family members reading on phones may need to scroll horizontally on the larger sub-diagrams. This is inherent to Mermaid rendering, not introduced by this PR, and the split into smaller diagrams actually reduces horizontal scroll width compared to the original monoliths.