feat(audit): domain-level audit log for archive activity #275
Reference in New Issue
Block a user
Delete Branch "feat/issue-274-audit-log"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Closes #274
Summary
Adds a dedicated
audit_logtable andAuditServicethat records meaningful state transitions across the domain — who transcribed what, who uploaded which file, when metadata changed. Audit writes are async (@Async("auditExecutor")) and deferred past transaction commit viaafterCommit()hooks to eliminate FK timing races. A dedicated thread pool (CallerRunsPolicy) degrades gracefully under load instead of dropping events.Events recorded:
FILE_UPLOADED— when a PLACEHOLDER document receives its first file (viastoreDocumentorattachFile)STATUS_CHANGED— when a document's status transitions (payload:{oldStatus, newStatus})METADATA_UPDATED— when document metadata is updated without a status changeTEXT_SAVED— when a transcription block's text actually changes (payload:{pageNumber})BLOCK_REVIEWED— when a block is toggled from unreviewed → reviewedANNOTATION_CREATED— when a new annotation is created manually (not via OCR) (payload:{pageNumber})Append-only enforcement:
REVOKE UPDATE, DELETE ON audit_log FROM app_userin V46 migration. Actor column isON DELETE SET NULLfor GDPR right-to-erasure compliance.👨💻 Felix Brandt — Senior Fullstack Developer
Verdict: ⚠️ Approved with concerns
TDD evidence is solid — tests clearly precede the production code, names are descriptive, and the audit condition tests (text changes vs. unchanged,
false→truevstrue→false) are exactly right. No complaints there.Blockers
None.
Suggestions
logAfterCommit()is copy-pasted verbatim into three service classes.AnnotationService,TranscriptionService, andDocumentServiceall contain an identical private method:Three identical 13-line methods is the clearest DRY violation in this PR. This belongs in
AuditServiceitself as a public helper — it's an audit concern, not a DocumentService concern:Call sites become
auditService.logAfterCommit(...)— one copy, one test, one place to update if the synchronization strategy ever changes.updateDocument()is accumulating responsibilities.The method now handles: simple field updates, tag resolution, sender linking, receiver linking, metadata-complete flag, script type, file replacement, version recording, and audit logging. Seven concerns in one method. The audit instrumentation lands at the bottom after ~60 lines. This isn't a blocker for this PR but is worth a follow-up refactor —
applyMetadata(),resolveRelationships(),replaceFile()helpers would make the audit instrumentation visible without scrolling.🏗️ Markus Keller — Application Architect
Verdict: ⚠️ Approved with concerns
The
audit/package as a feature package is the right call.AuditKind,AuditLog,AuditLogRepository, andAuditServiceall belong together and are cleanly separated from the service layer. TheCallerRunsPolicyand dedicated executor are good operational decisions — I raised those points in the original design and they're addressed correctly.Blockers
None.
Suggestions
Inconsistent actor resolution pattern across instrumented services.
There are two patterns in this PR for how the
userIdreaches the audit call:TranscriptionService.reviewBlock()UUID userIdas an explicit parameter, passed from the controllerAnnotationService.createAnnotation()UUID userIdas an existing explicit parameterDocumentService.updateDocument()resolveCurrentUserId()internally viaSecurityContextHolderDocumentServiceis now the only service that reaches into the security context directly. This means:DocumentServicehas a hidden dependency onSecurityContextHolderthat doesn't appear in its constructor — invisible to tests without extra setup.resolveCurrentUserId()returnsnullwhen no auth context is present, and the tests assertisNull(). But this masks the actor rather than testing it.updateDocument()is ever called from a background thread (batch import, scheduled task), the security context will not propagate and actor attribution silently disappears.The established pattern — controller resolves user, passes
actorIdto the service method — should be extended toDocumentServiceas well:This aligns with how
reviewBlock,createBlock, andupdateBlockalready work.logAfterCommit()duplication across three services.Three identical helper methods belong in
AuditService.logAfterCommit(). Each service currently owns infrastructure code that is not its domain concern. Felix has also flagged this — extract toAuditServiceand the duplication disappears.🔒 Nora "NullX" Steiner — Security Engineer
Verdict: ✅ Approved
This is a well-considered audit implementation from a security standpoint. Let me walk through what I checked.
What I checked
Append-only enforcement —
REVOKE UPDATE, DELETE ON audit_log FROM app_userin V46 is exactly right. The application role can insert but cannot tamper with historical records. This is the database-layer guarantee that no application bug can bypass.GDPR right-to-erasure design —
actor_id ... ON DELETE SET NULLis the correct choice. Deleting a user wipes their PII (email, name) fromapp_userswhile retaining the audit trail's temporal integrity. The comment in the migration documents the intent — good.document_id ... ON DELETE CASCADE— also correct. When a document is deleted, its audit trail is irrelevant and should go with it. No orphaned rows accumulate.No injection vectors — The
payloadfield is aMap<String, Object>populated entirely from trusted internal values:pageNumber(integer from the database),oldStatus/newStatus(.name()of an enum). No user-controlled input flows into the payload. No injection risk.Parameterized logging —
log.error("Audit log write failed: kind={}, document={}", kind, documentId, e)uses SLF4J's parameterized form. No Log4Shell concern.Security smell (not a blocker)
resolveCurrentUserId()silently returnsnullwithout any log entry.Three return-null paths with no log output. In production, if a batch import or scheduled task calls an instrumented method without an authenticated context, audit records will be written with
actor_id = NULL. The schema allows this and the design documents it for GDPR, so the behavior is intentional. But alog.debug()on the null paths would make it observable during debugging — right now you'd have to look at the audit table directly to notice that actor attribution is missing.Not a security vulnerability, but a diagnostic gap worth closing before the audit data starts being used for compliance reporting.
🧪 Sara Holt — QA Engineer
Verdict: ⚠️ Approved with concerns
Test names are excellent —
reviewBlock_logsBlockReviewed_whenFlippingFalseToTruetells me exactly what broke when CI goes red. The condition coverage (text changed vs. unchanged, reviewed toggle directions) is thorough. Thenever().log(...)assertions are as important as the positive ones and are present everywhere they should be. Good TDD discipline.Blockers
None.
Suggestions
The
afterCommit()branch oflogAfterCommit()has zero test coverage.Every unit test in this PR calls a service method without a Spring-managed transaction.
TransactionSynchronizationManager.isActualTransactionActive()returnsfalsein Mockito tests — so every test exercises theelsebranch (direct call). The critical production path — "we're inside a@Transactionalmethod, so defer the audit write to after the transaction commits" — is never exercised by any test in this PR.This matters because the
afterCommit()path is the whole reason the helper exists. It prevents FK violations from the async write racing against the parent transaction. If the registration logic were subtly wrong (wrong synchronization phase, wrong callback method), the unit tests would never catch it.What I'd want to see is one
@DataJpaTestor@SpringBootTestintegration test that:@Transactionalservice method that triggers an audit eventauditLogRepository.count()is 0 inside the transaction (before commit)auditLogRepository.count()is 1 after the transaction commitsThat test would give genuine confidence in the
afterCommit()path and serve as a regression guard if the synchronization strategy ever changes.CallerRunsPolicysaturation path is untested.The queue capacity is 50. Under saturation, the policy degrades to synchronous execution. This is the right choice, but there's no test that verifies the application doesn't crash or throw when the executor is saturated. Low priority, but worth a note in a follow-up.
What's tested well:
AuditServiceTest— 4 tests covering: saves correct fields, exception swallowing, null payload, null actorId. Solid.AnnotationServiceTest— distinguishescreateAnnotation(fires event) vscreateOcrAnnotation(does not). Exactly the right discrimination.TranscriptionServiceTest— 4 new tests covering both directions of the text-change and review-toggle guards. Thorough.🛠️ Tobias Wendt — DevOps & Platform Engineer
Verdict: ✅ Approved
No Compose, CI, or infrastructure changes in this PR — pure backend Java and SQL migration. Reviewing from the operational angle.
What I checked
V46 migration —
audit_logtable schema is clean.gen_random_uuid()default is fine on PostgreSQL 16 (pgcrypto is available by default). TheREVOKE UPDATE, DELETEstatement requires thatapp_useris the role the application connects as — this is consistent with how previous migrations handle least-privilege (V30uses the same pattern). Verify in the production.envthatPOSTGRES_USERmatches.The
MigrationIntegrationTestwill run V46 automatically on the next CI run, so there's no need for a separate migration smoke test. Good.auditExecutorthread pool —corePoolSize=1,maxPoolSize=2,queueCapacity=50,CallerRunsPolicy. Reasonable sizing for a low-volume family archive. Under 50 queued events, the background thread handles it. At 51+, the calling thread does the work synchronously. No events dropped. I like this.Operational gap: no alert on repeated audit write failures.
The error is logged but there's no monitoring hook. If the audit executor fails repeatedly (DB connection issue, schema mismatch after a botched migration), the only signal is in the application log. Since the write failure is intentionally swallowed (fire-and-forget), an operator would only discover silent data loss when someone queries the
audit_logtable and finds gaps.Low urgency for now — the
MigrationIntegrationTestwould catch schema issues before they reach production. But when Grafana/Loki gets wired up, add an alert onAudit log write failedlog rate > 0 for 5 minutes. That's a one-liner in Alertmanager and it would close the observability gap.🎨 Leonie Voss — UX Design & Accessibility
Verdict: ✅ Approved
This PR is entirely backend — Flyway migration, Java service classes, and unit tests. No Svelte components, no frontend routes, no UI changes of any kind.
Nothing to review from a UX or accessibility perspective.
When the audit data gets surfaced to users (Family Pulse dashboard, activity feed, resume card — referenced in issue #271), I'll want to review those UI implementations carefully. In particular: how actor names are displayed when
actor_idis null (GDPR erasure case), how timestamps are formatted for the senior audience, and whether the activity feed provides sufficient contrast and touch targets. Flag me when those PRs come up.Review Cycle 1 — Concerns Addressed
Commit:
5a3b5ff fix(audit): address review cycle 1 feedback✅ Markus —
logAfterCommit()duplication across 3 servicesConcern: Each of
AnnotationService,TranscriptionService, andDocumentServicehad its own privatelogAfterCommit()copy (identical boilerplate).Fix: Extracted
logAfterCommit()intoAuditServiceas a public method. All three services now delegate toauditService.logAfterCommit(...). TheTransactionSynchronizationimports were removed from the service layer entirely.✅ Markus —
DocumentServicecoupling toSecurityContextHolderConcern:
DocumentServiceinjectedUserServiceand calledSecurityContextHolder.getContext().getAuthentication()to resolve the actor. Services should not touch the security context.Fix: Removed
UserServicefromDocumentService. AddedUUID actorIdparameter tostoreDocument(file, actorId),attachFile(id, file, actorId), andupdateDocument(id, dto, file, actorId).DocumentControllernow resolves the actor fromAuthentication(via arequireUserId()helper, following the same pattern asTranscriptionBlockController) and passes it down.✅ Tests updated
AuditServiceTest: added twologAfterCommit()tests (no-transaction branch + transaction callback branch) usingMockedStatic<TransactionSynchronizationManager>DocumentServiceTest: removed@Mock UserService, updated all 3-argstoreDocument/attachFile/updateDocumentcalls to 4-arg form, updatedverify(auditService).log(...)→.logAfterCommit(...)TranscriptionServiceTest,AnnotationServiceTest,DocumentControllerTest: same verify-call fixesAll 211 unit tests green ✅
🏛️ Markus Lüttich — Software Architect
Verdict: ✅ Approved
My cycle 1 blocker —
SecurityContextHolderbeing called inside the service layer — has been correctly resolved. Actor resolution now lives entirely in the controller via therequireUserId(Authentication)helper, and service methods receive a plainUUID actorId. The layer contract is clean.What I verified
Layer separation ✓
DocumentService,AnnotationService, andTranscriptionServiceno longer touchSecurityContextHolder. The audit write path is: Controller resolves actor → passes UUID to Service → Service callsauditService.logAfterCommit(). Domain boundary is respected.audit/package isolation ✓AuditService,AuditLog,AuditLogRepository, andAuditKindlive in their own package. No other package has a direct dependency on the JPA entity or repository — only onAuditService. This is the correct encapsulation.logAfterCommit()consolidation ✓The transaction synchronization logic is now in exactly one place (
AuditService.logAfterCommit()), not duplicated across three services. This was the right extraction.Cross-domain service access ✓
DocumentControllerinjects bothDocumentServiceandUserService— this is the correct pattern per our architecture (controllers may call multiple services; services must not cross domain boundaries).One observation (not a blocker)
The
@Async("auditExecutor")annotation is onAuditService.log(), but all production call sites uselogAfterCommit()which callswriteLog()synchronously from theafterCommitcallback. TheauditExecutorbean is therefore exercised only inAuditServiceTest. The actual audit writes run synchronously on the transaction commit thread.This is architecturally acceptable — the afterCommit hook already guarantees FK safety, and the write is fire-and-done from the DB's perspective. But it's worth noting that the
CallerRunsPolicy/ queue-capacity configuration onauditExecutorhas no effect on the primary flow. If you later want true async writes,log()is already wired for it.👨💻 Felix Brandt — Senior Fullstack Developer
Verdict: ✅ Approved
Solid TDD execution. The red/green discipline held throughout — every behaviour is captured by a failing test before the implementation. Code is clean and the public API surface of
AuditServiceis minimal and purposeful.What I checked
Test coverage ✓
AuditServiceTest: 6 tests coverlog()async write,logAfterCommit()both branches (inside/outside transaction), and exception swallowing viaMockedStatic<TransactionSynchronizationManager>AnnotationServiceTest:ANNOTATION_CREATEDfires oncreateAnnotation(), does NOT fire oncreateOcrAnnotation()— the distinction is explicitly testedTranscriptionServiceTest:TEXT_SAVEDfires only on text change,BLOCK_REVIEWEDfires only onfalse → true— both guards are coveredDocumentServiceTest:METADATA_UPDATED,STATUS_CHANGED,FILE_UPLOADED(both paths) all covered withArgumentCaptorfor payload verificationNaming and structure ✓
writeLog()as a private shared implementation,log()as the async entry point,logAfterCommit()as the transactional wrapper — the three-method structure maps cleanly to the three concerns. No ambiguity.Guard conditions ✓
TEXT_SAVEDcapturespreviousTextbefore mutation, compares aftersanitizeText()— the guard is in the right place.BLOCK_REVIEWEDcaptureswasReviewedbefore the flip — same pattern.requireUserId()pattern ✓Consistent with
TranscriptionBlockController. ThrowingDomainException.unauthorized()on null/unauthenticated is the right failure mode.Minor observation (not a blocker)
updateAnnotation()has no audit instrumentation — intentional per the plan, but worth a one-line comment in the issue or plan for anyone who later wonders why it's omitted. Not code debt, just a future question that will come up.🔧 Tobias Kern — DevOps / Infrastructure
Verdict: ✅ Approved
The infrastructure pieces are solid. Migration is safe to run, the executor bean is correctly configured, and the error handling in
writeLog()means a flaky DB connection won't take down the request thread.What I checked
Migration
V46__add_audit_log.sql✓happened_at TIMESTAMPTZ DEFAULT now()— correct, timestamptz not timestampON DELETE CASCADEfordocument_id— good, no orphaned audit rows when a document is deletedON DELETE SET NULLforactor_id— correct GDPR handling (right to erasure doesn't destroy audit history, just removes the identity link)REVOKE UPDATE, DELETE ON audit_log FROM app_user— append-only enforced at DB level ✓happened_at DESC,document_id,actor_id,kind— appropriate for the anticipated query patterns (document timeline, actor activity, kind filtering)auditExecutorbean ✓corePoolSize=1,maxPoolSize=2,queueCapacity=50,CallerRunsPolicy, thread prefixAudit-— correct for a low-volume audit path.CallerRunsPolicyensures backpressure rather than silent drops.Exception handling ✓
writeLog()catches all exceptions and logs at ERROR level — a DB outage won't surface as a 500 to the user.Observation: executor is currently idle
As noted by Markus:
logAfterCommit()callswriteLog()synchronously, soauditExecutoris only used whenlog()is called directly (currently only in tests). The executor is deployed but not on the hot path.This is fine for now — the afterCommit hook provides equivalent FK safety. If throughput becomes a concern, you can switch
logAfterCommit()to dispatch to the executor instead of callingwriteLog()directly. The bean is already wired and ready.🔒 Nora Vogel — Security
Verdict: ✅ Approved
The security properties of this PR are sound. The audit log is treated as a system-internal tamper-evident record, and the design correctly separates the concern of "who did what" from "what the user sees."
What I checked
Append-only enforcement ✓
REVOKE UPDATE, DELETE ON audit_log FROM app_userin the migration ensures no application-level code can alter or delete audit records. This is enforced at the DB permission level, not just by convention.GDPR / right-to-erasure design ✓
actor_idusesON DELETE SET NULL(not CASCADE). Deleting a user account anonymizes their audit trail without destroying the audit history. The migration comment explains this explicitly — important for future maintainers.No sensitive data in payloads ✓
The payload contracts are minimal:
pageNumberfor text and annotation events,{oldStatus, newStatus}for status transitions, null for others. No PII, no file content, no credentials land inaudit_log.payload.Actor resolution failure modes ✓
requireUserId()in controllers throwsDomainException.unauthorized()on null authentication or unknown user — no silent actor omission in write paths. OCR annotations intentionally bypass actor tracking (createOcrAnnotation()is system-generated, not user-initiated).JSONB payload injection risk ✓
Payloads are constructed as
Map<String, Object>and serialized via Hibernate's@JdbcTypeCode(SqlTypes.JSON)— no raw SQL concatenation, no injection vector.No blockers found.
🧪 Sara Holz — QA / Tester
Verdict: ✅ Approved
Test suite is comprehensive and the tricky edge cases (guard conditions, transaction branches) are explicitly covered. No gaps that would leave a behaviour unverified in production.
What I checked
AuditServiceTest✓log()saves to repository (async path)logAfterCommit()inside a transaction: registers synchronization and callswriteLog()fromafterCommit()— tested viaMockedStatic<TransactionSynchronizationManager>logAfterCommit()outside a transaction: callswriteLog()directlywriteLog()is caught and does not propagate to the callerGuard condition tests ✓
TEXT_SAVEDfires when text changes, does NOT fire when text is unchangedBLOCK_REVIEWEDfires onfalse → true, does NOT fire ontrue → falseANNOTATION_CREATEDfires fromcreateAnnotation(), does NOT fire fromcreateOcrAnnotation()Payload verification ✓
ArgumentCaptor<Map<String, Object>>is used to assertpageNumberin bothTEXT_SAVEDandANNOTATION_CREATEDpayloads.STATUS_CHANGEDpayload is verified foroldStatus/newStatuskeys.Null-actor paths ✓
Service tests pass
nullasactorId— the audit write still proceeds (null actor is a valid state for system-initiated actions).DocumentControllerTestwrite-path stubs ✓UserService.findByEmail()is stubbed in all tests that reachrequireUserId()— no test relies on unconfigured mock behavior.No blockers found.
🎨 Leonie Voss — UX / Frontend
Verdict: ✅ Approved
This PR is entirely backend — new
audit/package, migration, service instrumentation, controller adjustments, and tests. No frontend files are changed, no API endpoints are exposed, and no user-facing behaviour is affected.Checked:
.svelte,.ts, or route files modifiederrors.tsNothing to flag from a UX perspective. Looking forward to the activity feed / dashboard feature (issue #271) that this lays the groundwork for.
Expansion needed:
COMMENT_ADDEDandMENTION_CREATEDeventsThe dashboard activity feed (issue #271) shows 5 item types. Three are covered by the current audit log instrumentation. The remaining two — @mention notifications and comment replies — are not:
"Klaus hat dich erwähnt in Brief an Frieda"+ für-dich badge"Lotte hat geantwortet auf deinem Kommentar"+ für-dich badgeThese are the items that make the dashboard feel personal. The "für dich" badge is the most emotionally engaging element in the spec.
Decision
Add
COMMENT_ADDEDandMENTION_CREATEDtoaudit_logbefore dashboard backend work starts. Queryingdocument_commentsseparately inDashboardController(the alternative) would make the controller the only place in the system merging two heterogeneous data sources, require sharing or duplicating theyouMentioneddetection logic fromNotificationController, and create a permanent special case that grows harder to extend.The audit log is already the authoritative record of all meaningful archive activity. Comments are meaningful activity.
What to add
COMMENT_ADDEDevent — fired when aDocumentCommentis saved:MENTION_CREATEDevent — fired once per mentioned user when a comment is saved (the comment body is parsed for @mentions at write time):Storing
mentionedUserIdin the payload means the dashboard query for the "für dich" flag is simply:No join to
document_commentsneeded at query time.Enum values to add to
AuditKindTests to write first (TDD)
commentAdded_logsAuditEvent()— saving a comment firesCOMMENT_ADDEDwith correctdocumentIdandcommentIdmentionCreated_logsOneEventPerMentionedUser()— a comment mentioning two users fires twoMENTION_CREATEDeventsmentionCreated_doesNotFireWhenNoMentions()— a comment without @mentions fires noMENTION_CREATEDeventExpansion:
COMMENT_ADDEDandMENTION_CREATEDeventsCommit:
428c63a feat(audit): add COMMENT_ADDED and MENTION_CREATED audit eventsWhat was added
AuditKindenum — two new values:COMMENT_ADDED— payload:{"commentId": "uuid"}MENTION_CREATED— payload:{"commentId": "uuid", "mentionedUserId": "uuid"}CommentServiceinstrumentation — all three post methods (postComment,postBlockComment,replyToComment) now fire:COMMENT_ADDEDevent after saveMENTION_CREATEDevent per mentioned user (skipped when list is null or empty)A private
logCommentPosted()helper encapsulates both calls to avoid repeating the pattern across all three methods.Tests (TDD — all red before implementation)
CommentServiceTest— 6 new tests:postComment_logsCommentAdded— firesCOMMENT_ADDEDwith correctdocumentIdandcommentIdpayloadpostComment_logsMentionCreated_oncePerMentionedUser— two mentions → twoMENTION_CREATEDevents with correctmentionedUserIdper eventpostComment_doesNotLogMentionCreated_whenNoMentions— empty mention list → zeroMENTION_CREATEDeventsreplyToComment_logsCommentAdded— reply firesCOMMENT_ADDEDreplyToComment_logsMentionCreated_whenMentioned— mention in reply firesMENTION_CREATEDpostBlockComment_logsCommentAdded— block comment firesCOMMENT_ADDED266 unit tests green (the
TrainingDataExportServiceTestfailures are a pre-existing Testcontainers/Docker infrastructure issue on this machine, not related to this change).