perf(document): EAGER→LAZY migration with @EntityGraph + @BatchSize (#467) #622

Merged
marcel merged 19 commits from worktree-feat+issue-467-lazy-fetch-entity-graph into main 2026-05-19 09:23:32 +02:00

19 Commits

Author SHA1 Message Date
Marcel
3d458f6e03 chore(document): address non-blocking review feedback on lazy-fetch PR
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m13s
CI / OCR Service Tests (pull_request) Successful in 20s
CI / Backend Unit Tests (pull_request) Successful in 3m24s
CI / fail2ban Regex (pull_request) Successful in 44s
CI / Semgrep Security Scan (pull_request) Successful in 19s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m1s
- Add @BatchSize(50) fallback comments on findBySenderId / findByReceiversId
- Replace silent size() discard in getRecentActivity test with assertThat isNotEmpty()
- Add ADR-022 reference comment above @JsonIgnoreProperties on Person and Tag
- Document within-open-transaction limitation in DocumentLazyLoadingTest Javadoc

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 08:44:53 +02:00
Marcel
be15de6cc6 fix(document): fix test assertion structure + add entity graph decision comments
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m15s
CI / OCR Service Tests (pull_request) Successful in 19s
CI / Backend Unit Tests (pull_request) Successful in 3m25s
CI / fail2ban Regex (pull_request) Successful in 42s
CI / Semgrep Security Scan (pull_request) Successful in 20s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m0s
- Refactor DocumentLazyLoadingTest: pull value assertions (assertThat) out
  of assertThatCode lambdas so failures surface as AssertionError rather
  than "unexpected exception: AssertionError" (review item 1)
- Add @EntityGraph("Document.full") to findBySenderId, findByReceiversId,
  findConversation, and findSinglePersonCorrespondence — all return full
  Documents to the controller for JSON serialization (review item 2)
- Add "// Callers access only ..." comments to un-graphed methods where no
  lazy associations are touched: findByTags_Id, findByStatus,
  findByMetadataCompleteFalse(Sort), findByMetadataCompleteFalse(Pageable)
- Remove "what" inline comments from @Transactional(readOnly=true)
  on getRecentActivity and getDocumentById — the why is in ADR-022 (item 4)
- Add named-graph coupling consequence to ADR-022: Document.java and
  DocumentRepository.java graph name strings must stay in sync (item 5)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 07:42:59 +02:00
Marcel
82e81e159a docs(backend): document @Transactional(readOnly=true) exception in CLAUDE.md
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m42s
CI / OCR Service Tests (pull_request) Successful in 21s
CI / Backend Unit Tests (pull_request) Successful in 3m17s
CI / fail2ban Regex (pull_request) Successful in 42s
CI / Semgrep Security Scan (pull_request) Successful in 20s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m1s
The convention 'read methods are not annotated' has one exception: methods
that return lazily-initialized entities to callers require readOnly=true to
keep the session open. Documents the rule and links to ADR-022.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 22:26:24 +02:00
Marcel
cabcf6a6ca docs(adr): add ADR-022 for EAGER→LAZY fetch strategy with @EntityGraph
Records context (2733 queries/24 requests), the two-graph decision,
@BatchSize fallback, @Transactional(readOnly=true) session-lifetime
requirement, and alternatives considered.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 22:25:38 +02:00
Marcel
befecc9864 refactor(document): extract factory helpers in DocumentLazyLoadingTest
Replace repeated personRepository.save/tagRepository.save/documentRepository.save
boilerplate with savedPerson(), savedTag(), savedDocument() helpers.
Each test body is now 2-3 lines of relevant setup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 22:24:17 +02:00
Marcel
57259e4195 test(document): add query-count assertion for findAll(Spec) non-paginated path
List<Document> findAll(Specification) is called in DocumentService for
receiver-sort, sender-sort, and conversation queries but had no query-count
coverage. Asserts ≤5 statements for 5 docs with @EntityGraph(Document.list).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 22:22:50 +02:00
Marcel
3356e27273 test(document): assert non-empty result in receiverSort lazy-loading test
assertThatCode(() -> service.searchDocuments(...)) passed vacuously on an
empty page; capture the result, assert totalElements > 0, then assert
getSender().getLastName() is accessible post-return.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 22:21:31 +02:00
Marcel
050cd0adf8 test(document): enable statistics before findById query-count assertion
Without setStatisticsEnabled(true) the counter stays 0 and ≤2 passes
vacuously when the test runs in isolation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 22:19:32 +02:00
Marcel
29eaa253a6 fix(document): remove @BatchSize from @ManyToOne sender — not supported
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m10s
CI / OCR Service Tests (pull_request) Successful in 21s
CI / Backend Unit Tests (pull_request) Successful in 3m18s
CI / fail2ban Regex (pull_request) Successful in 41s
CI / Semgrep Security Scan (pull_request) Successful in 19s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m2s
Hibernate throws AnnotationException at startup when @BatchSize is placed
on a @ManyToOne field. @BatchSize is only valid on collections (@OneToMany,
@ManyToMany, @ElementCollection). The N+1 for sender is already covered by
the @EntityGraph overrides on DocumentRepository.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 21:11:12 +02:00
Marcel
3358f0509f test(document): strengthen getRecentActivity smoke test for post-return access
Some checks failed
CI / Unit & Component Tests (pull_request) Successful in 3m9s
CI / OCR Service Tests (pull_request) Successful in 19s
CI / Backend Unit Tests (pull_request) Failing after 2m38s
CI / fail2ban Regex (pull_request) Successful in 42s
CI / Semgrep Security Scan (pull_request) Successful in 19s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m0s
Previous version only asserted the method call didn't throw. Now the test
captures the returned list and asserts that sender.getLastName() and
tags.size() are accessible outside the transaction, which is the scenario
that would have failed with a LazyInitializationException.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 19:05:33 +02:00
Marcel
667b237c33 docs(document): add WHY comments to @Transactional(readOnly=true) methods
These annotations deviate from the project convention (read methods are
normally unannotated). The comment explains that the session must stay
open for callers to access lazy-loaded collections post-return, preventing
future developers from removing the annotation as a cleanup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 19:04:22 +02:00
Marcel
e677756139 perf(document): add @BatchSize(50) to sender and trainingLabels
Consistent with the @BatchSize already on receivers and tags. Any lazy
code path not covered by an entity graph will batch-load these associations
instead of issuing one query per document.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 19:03:09 +02:00
Marcel
108ab1a288 perf(document): add @EntityGraph(Document.list) for findAll(Pageable)
getRecentActivity calls findAll(Pageable) — the JpaRepository overload
not covered by the existing Specification variants. Without this override,
sender is loaded N+1 per document. Now applies Document.list graph so
sender and tags are fetched eagerly for every findAll(Pageable) call.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 19:02:07 +02:00
Marcel
5f83fa1bc2 test(document): add query-count assertion for findAll(Pageable) path
Adds failing test: findAll(Pageable) must not N+1 sender for 5 docs.
Without @EntityGraph override for this overload, each document triggers
a separate SELECT for its lazy sender.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 19:01:14 +02:00
Marcel
0dd7d8a5e7 test(document): remove redundant global generate_statistics from test config
Stats tracking is already enabled per-test via setStatisticsEnabled(true);
enabling it globally added unnecessary overhead to every test in the suite.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 19:00:16 +02:00
Marcel
e13b37d585 test(document): add @SpringBootTest smoke tests for lazy-loading correctness
All checks were successful
CI / Unit & Component Tests (pull_request) Successful in 3m2s
CI / OCR Service Tests (pull_request) Successful in 19s
CI / Backend Unit Tests (pull_request) Successful in 3m6s
CI / fail2ban Regex (pull_request) Successful in 42s
CI / Semgrep Security Scan (pull_request) Successful in 19s
CI / Compose Bucket Idempotency (pull_request) Successful in 1m0s
Five integration tests verify that DocumentService and DashboardService
do not throw LazyInitializationException after the EAGER→LAZY migration:
getDocumentById, getRecentActivity, searchDocuments (receiver/sender sort),
and dashboardService.getResume.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 16:55:23 +02:00
Marcel
b8505e0de5 fix(document): add @Transactional to read methods that access lazy collections
- getDocumentById: add @Transactional(readOnly=true) — calls
  tagService.resolveEffectiveColors(doc.getTags()) which requires an open
  session after the LAZY switch
- getRecentActivity: add @Transactional(readOnly=true) — callers may access
  tags/receivers on the returned list; keeps session open for @BatchSize fetches
- updateDocumentTags: add @Transactional — write method was missing annotation

Also adds @JsonIgnoreProperties({"hibernateLazyInitializer","handler"}) to
Person and Tag to prevent Jackson serialization errors on uninitialized
lazy proxies.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 16:49:56 +02:00
Marcel
b88573c432 refactor(document): switch collections to LAZY + add @EntityGraph + @BatchSize
- receivers, tags, trainingLabels: FetchType.EAGER → FetchType.LAZY
- sender: add explicit FetchType.LAZY (was implicitly lazy, now explicit)
- @NamedEntityGraph("Document.full"): sender + receivers + tags
- @NamedEntityGraph("Document.list"): sender + tags
- DocumentRepository.findById overridden with @EntityGraph("Document.full")
- DocumentRepository.findAll(Specification, Pageable) overridden with
  @EntityGraph("Document.list")
- DocumentRepository.findAll(Specification) overridden with
  @EntityGraph("Document.list") for RECEIVER/SENDER sort paths
- @BatchSize(50) on receivers and tags as fallback for any list path
  that does not go through an @EntityGraph method

Fixes issue #467.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 16:47:33 +02:00
Marcel
ff792e6625 test(document): add query-count assertions for findAll + findById entity graphs
Adds Hibernate statistics to the test config and two new tests in
DocumentRepositoryTest:
- findAll_withSpecAndPageable asserts ≤5 statements for 10 documents
  (currently RED: EAGER @ManyToMany generates 31 secondary SELECTs)
- findById regression guard verifies collections load in ≤2 statements

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-18 16:43:36 +02:00