fix(dashboard): audit_log never written + hero blind to annotation-only work #280
Reference in New Issue
Block a user
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Problem
Two bugs found while testing the dashboard after adding annotation segments:
Bug 1 —
audit_logis silently never writtenSymptom: The
audit_logtable is completely empty despite annotations and transcription blocks being saved successfully. The activity feed and pulse stats therefore show nothing.Root cause:
AuditService.logAfterCommit()registers aTransactionSynchronization.afterCommit()callback. Inside that callback,writeLog()callsauditLogRepository.save(), which is@Transactional(fromSimpleJpaRepository). Spring tries to start a new transaction, which callsTransactionSynchronizationManager.initSynchronization(). But the outer transaction's synchronizations are still active atafterCommit()time — they are only cleared inafterCompletion(), which fires later. This throws:The
catch (Exception e)inwriteLog()silently swallows this and callslog.error(), so no row is ever written and no visible stack trace appears unless you watch stdout.Fix: In the
afterCommit()callback, submit the work to theauditExecutorthread pool directly instead of callingwriteLog()inline. A fresh thread has no existing synchronization context andSimpleJpaRepository.save()can start a clean transaction normally.Note: calling
this.log()(the@Asyncmethod) from within the same class won't work due to Spring AOP self-invocation. Injecting the executor directly avoids that limitation.Bug 2 — Dashboard hero is blind to annotation-only work
Symptom: After adding annotation segments without saving transcription text, the hero ("resume" card) stays in empty state.
Root cause:
AuditLogQueryRepository.findMostRecentDocumentIdByActorqueries:Annotations write
ANNOTATION_CREATEDevents, notTEXT_SAVED. A document where the user only drew bounding boxes (no text typed yet) never appears in the hero.Fix: Extend the
INclause to includeANNOTATION_CREATED:The hero's progress calculation and excerpt already come from transcription blocks, so documents with only annotations will correctly show 0 % and no excerpt — that's the right behaviour for documents not yet transcribed.
Side note — REVOKE on
audit_loghad no effectV46__add_audit_log.sqlrunsREVOKE UPDATE, DELETE ON audit_log FROM CURRENT_USERto enforce append-only at the DB layer. Butarchive_useris the table owner, and PostgreSQL owners always retain all privileges regardless of REVOKE. The statement ran without error but had no effect. A separate superuser role would be needed to enforce this properly. Not blocking, but the append-only guarantee is currently not enforced at the DB level.Acceptance criteria
ANNOTATION_CREATEDrow toaudit_log🏛️ Markus Keller — Senior Application Architect
Observations
afterCommit()fires beforeafterCompletion(), so Spring's transaction synchronizations are still registered on the current thread whenwriteLog()tries to start a new@Transactionalcontext viaSimpleJpaRepository.save(). The silentcatch (Exception e)buries the resultingIllegalStateExceptionentirely.auditExecutor. A fresh thread has no registered synchronizations, sosave()can start a clean transaction normally.CallerRunsPolicyis a latent risk with the proposed fix. TheauditExecutoris configured with core=1, max=2, queue=50, andCallerRunsPolicy. If the queue fills under load,auditExecutor.execute(() -> writeLog(...))runs the lambda synchronously on the calling thread — which is theafterCommit()callback thread, and still has live synchronizations. Bug 1 would silently resurface under sustained write load at exactly the moment of highest audit activity.@Autowired @Qualifier("auditExecutor") private TaskExecutor auditExecutor— field injection. Every service in this codebase uses@RequiredArgsConstructorwithfinalfields. This would be the first field-injected dependency in the service layer, inconsistent with the established pattern.REVOKEunconditionally. Non-blocking for this issue, but the migration comment should reflect reality.Recommendations
private final TaskExecutor auditExecutor;with@Qualifier("auditExecutor")on the field.@RequiredArgsConstructorgenerates the constructor. No@Autowiredneeded.auditExecutorrejection policy fromCallerRunsPolicytoAbortPolicy. CatchRejectedExecutionExceptionin theafterCommit()callback and emit a distinctlog.warn(). This prevents the fix from regressing under load and makes capacity exhaustion visible instead of silently running on the wrong thread.V46__add_audit_log.sqlwith a comment noting that the REVOKE is a no-op for table owners, so future developers don't rely on this guarantee.auditanddashboardpackages are correctly feature-packaged and boundaries are clean.👨💻 Felix Brandt — Senior Fullstack Developer
Observations
AuditServiceTest.javamocksauditLogRepositoryvia@ExtendWith(MockitoExtension.class)— the mock'ssave()never callsSimpleJpaRepository.save(), so theTransactionSynchronizationManager.initSynchronization()conflict is never triggered. All existing tests pass while the bug is live in production. This is the expected failure mode of mocking at the wrong boundary.@Autowired @Qualifier("auditExecutor") private TaskExecutor auditExecutor). Every service in this codebase uses@RequiredArgsConstructor— this would be an inconsistency. The correct pattern exists right next to it.WHERE a.kind = 'TEXT_SAVED'toWHERE a.kind IN ('TEXT_SAVED', 'ANNOTATION_CREATED')inAuditLogQueryRepository.findMostRecentDocumentIdByActor. Straightforward.AnnotationService.createOcrAnnotation()does not fire anANNOTATION_CREATEDevent. The AC says "Adding an annotation writes anANNOTATION_CREATEDrow." OCR annotations are system-generated, not user-initiated — scope should be explicitly resolved before implementation starts.AuditService.log()is annotated@Async("auditExecutor"). After injecting the executor directly, there will be two ways to submit work toauditExecutor—@Asyncproxy and directexecute(). Rationalise one path.Recommendations
private final TaskExecutor auditExecutor;with@Qualifier("auditExecutor")on the field. Lombok handles the rest. The@Async("auditExecutor")annotation onlog()can stay if that method has separate callers, but document the two patterns explicitly.@SpringBootTestintegration test that callsannotationService.createAnnotation()inside a@Transactionalwrapper, waits with Awaitility, then asserts anaudit_logrow exists. The test must be red on the current code, green after the fix.🔐 Nora "NullX" Steiner — Application Security Engineer
Observations
catch (Exception e) { log.error(...); }inwriteLog()means dropped events produce only a log line. There is no counter, no alert, and no structured signal. An attacker who triggers load conditions that exhaust the executor queue produces no audit trail and no alarm. Audit logs only have value if their failure mode is itself observable.CallerRunsPolicyis a regression vector for the proposed fix. TheauditExecutorusesCallerRunsPolicywith a queue of 50. When the queue fills,auditExecutor.execute(() -> writeLog(...))runs the lambda synchronously on theafterCommit()thread — which still hasTransactionSynchronizationManagersynchronizations registered. The originalIllegalStateExceptionreturns, and is silently swallowed again. Under sustained write load, audit logging fails exactly when activity is highest.V46__add_audit_log.sqldoes not exist at the database layer.archive_userowns the table and PostgreSQL owners cannot be stripped of privileges byREVOKE. Any application bug or compromised credential that can write to the database can also delete audit rows.ON DELETE SET NULLonactor_idmeans user deletion does not erase audit history, only attribution. The JSONB payload structure is flexible without schema migration cost.Recommendations
auditExecutorrejection policy toAbortPolicy. CatchRejectedExecutionExceptionin theafterCommit()callback with a distinctlog.warn("Audit event dropped — executor queue full: kind={}, documentId={}"). This makes capacity exhaustion an explicit observable signal rather than a silent bug.writeLog()'s catch block:auditWriteFailures.increment(). Wire it to Prometheus. Dropped audit events then show up on the Grafana dashboard without log scraping.V46__add_audit_log.sqldocumenting the REVOKE limitation:-- NOTE: REVOKE has no effect for table owner. Append-only is enforced at application layer only.Future developers should not rely on a guarantee that doesn't exist.🧪 Sara Holt — Senior QA Engineer
Observations
AuditServiceTest.javauses@ExtendWith(MockitoExtension.class)and mocksauditLogRepository. The mock'ssave()never invokesSimpleJpaRepository.save(), so Spring's transaction synchronization conflict is never triggered. All tests pass while the bug is live. This is a test pyramid gap: transaction lifecycle behavior requires a real Spring context and a real database — mocks cannot reproduce it.@SpringBootTestwith Testcontainers onpostgres:16-alpine. H2 would not reproduce this bug either — the issue is in Spring'sTransactionSynchronizationManager, which is framework-level and independent of the underlying database. But the SQL fix for Bug 2 uses native PostgreSQL (DISTINCT ON,date_trunc) and must be tested on real Postgres.writeLog()execute on the executor pool. Integration tests must use Awaitility — notThread.sleep()— to wait for the async write to complete before asserting.afterCommit()pattern should guarantee this, but it's an explicit contract that deserves a permanent regression test.Recommendations
Required test names:
logAfterCommit_writes_ANNOTATION_CREATED_row_after_transaction_commits()logAfterCommit_writes_no_row_when_transaction_rolls_back()findMostRecentDocumentIdByActor_returns_document_with_annotation_only_events()dashboard_hero_resume_shows_annotation_only_document_with_zero_progress()Integration test structure for Bug 1 (must be red before the fix, green after):
Consider adding
DashboardQueryIntegrationTestfor the SQL query changes inAuditLogQueryRepository. The existingDashboardControllerTestuses@WebMvcTestwith a mocked service — it cannot verify that the native SQL query returns correct results for annotation-only documents.🎨 Leonie Voss — UX Designer & Accessibility Strategist
Observations
DashboardService.getResume()andDashboardResumeDTO: the hero card's progress is calculated from transcription blocks (reviewed / total). For an annotation-only document,total = 0. The issue correctly anticipates this and calls 0% the right behaviour — but a zero denominator must be explicitly handled in the frontend progress calculation or the hero will display either a division-by-zero result or a blank progress bar.Recommendations
const progress = $derived(totalBlocks > 0 ? Math.round((reviewedBlocks / totalBlocks) * 100) : 0). This is a$derivedexpression — no$effectneeded.No accessibility, brand, or layout concerns with the backend changes themselves. The annotation-only edge case just needs a visual smoke-test after the fix ships.
🔧 Tobias Wendt — DevOps & Platform Engineer
Observations
CallerRunsPolicyis a code-level risk worth calling out from an operational perspective. TheauditExecutor(core=1, max=2, queue=50,CallerRunsPolicy) means that under sustained write load — 50+ queued audit events —auditExecutor.execute(...)runs its lambda synchronously on the calling thread. If that thread is theafterCommit()callback thread (which still has transaction synchronizations registered), Bug 1 silently returns. This is not a DevOps fix, but it belongs in theAsyncConfig.javachange accompanying this PR.log.error()line. If the application is running under Loki log aggregation, a saved search on"Failed to write audit log"would surface this. Currently, persistent audit log failures produce no alert and no dashboard signal. Under the current setup, this bug could have been silently failing for weeks with no visibility.Recommendations
level=error msg=~"Failed to write audit log"so persistent failures trigger an alert. This is a 5-minute rule addition and gives the team early warning if the fix regresses.auditExecutortoAbortPolicyinAsyncConfig.java. Document in the config whyCallerRunsPolicyis unsafe for this specific executor (it would reintroduce the synchronization conflict under load). A brief inline comment is justified here.🗳️ Decision Queue — Action Required
1 decision needs your input before implementation starts.
Scope
createOcrAnnotation()fireANNOTATION_CREATEDevents? —AnnotationService.createOcrAnnotation()(the OCR-pipeline path) does not callauditService.logAfterCommit(), unlike the user-initiatedcreateAnnotation(). The AC says "Adding an annotation writes anANNOTATION_CREATEDrow" — this is ambiguous about system-generated annotations. If OCR-created annotations count as activity: include them and they'll appear in the activity feed and hero. If they don't count (they are system output, not user work): exclude them and the current gap is intentional. The right answer depends on how you want the activity feed to represent OCR work. (Raised by: Felix)Decision closed:
createOcrAnnotation()will NOT fireANNOTATION_CREATEDevents. OCR-generated annotations are system output, not user activity. The current gap is intentional — exclude from scope.