feat(documents): paginate /documents search so first paint isn't 1500 rows #316
Reference in New Issue
Block a user
Delete Branch "feat/issue-315-paginate-documents-search"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Closes #315
Summary
Classic offset pagination for
/documentssearch. Default page size 50, max 100.?page=Nis shareable; changing any filter resets to page 0. Backend enrichment (FTS snippets, completion stats, contributor stacks) now runs only on the requested slice — 30× cheaper at 50 items vs 1500.Shape
DocumentSearchResultextended withpageNumber,pageSize,totalPages.totalrenamed tototalElementsfor Spring-Page parity. Newpaged(slice, pageable, totalElements)factory alongside the existingof(list)shortcut used for unpaged callsites.DocumentController.searchacceptspage(@Min(0) @Max(100000)) +size(@Min(1) @Max(100)).@Validatedadded at the controller class level — without it, these annotations silently no-op, and?size=999999would re-open the DoS window this PR is meant to close.DocumentService.searchDocumentstakes aPageable. Fast path (DATE/TITLE/UPLOAD_DATE) pushes sort + paging tofindAll(Specification, PageRequest); in-memory sort paths (SENDER/RECEIVER/RELEVANCE) still fetch all match rows (LEFT-JOIN nullability on sender/receiver) but now slice in memory via a smallpageSlice(list, pageable)helper so enrichment still only runs on the slice.Pagination.svelte(new) renders prev / "Seite X von Y" / next whentotalPages > 1. Decorative chevrons wrapped inaria-hidden, 44px touch targets, focus-visible ring, stacks vertically below 640px,aria-current="page"on the current-page label. Plain<a href>— SvelteKit's default scroll restoration brings the user to the top of the new slice (senior-UX expectation, vs.noScroll: trueon filter-debounce nav).+page.server.tsreadspagefrom URL, forwardspage+size=50to the backend.+page.sveltepassestotalElementsthrough toDocumentListand renders<Pagination>.triggerSearch()intentionally does NOT carrypage— filter change resets to 0.DocumentThumbnail.svelteunchanged;DocumentList.svelteunchanged (group headers never had per-group counts — scope correction flagged by Felix during review).Persona-review corrections addressed
@ValidatedonDocumentControllerwould make@Min/@Maxsilent. Fixed + locked bysearch_returns_400_when_size_exceeds_max. Also boundedpagewith@Max(100_000)to blockInteger.MAX_VALUE * sizeoverflow into negative SQL offsets.DocumentSearchResultinstead of creating a separate paged DTO. Renamedtotal→totalElementsfor Page-parity. ExtractedpageSlicehelper. TDD commit chain visible in log.Pageable(matchesNotificationService.getNotifications). Added scaling-ceiling comment on the in-memory sort branches.<a href>for pagination → scroll-to-top on page change (vs. noScroll on filter debounce);aria-hiddenchevrons; 44px targets; responsive stacking; focus ring.Verification
./mvnw test— 1289 passing, 0 failures../mvnw clean package -DskipTests— BUILD SUCCESS.Pagination.svelte.spec.ts9/9,documents/**27/27, filter-reset case green). Full-suite chromium flakiness exists on main (12 failures on baseline without this PR) and isn't related.npm run check: 240 errors, matching the pre-issue-309 baseline of 242 — zero new type errors introduced.Manual edit to
api.tsThe dev backend running in docker-compose is built from main, not this worktree, so
npm run generate:apiagainst it would produce main's schema. The diff tofrontend/src/lib/generated/api.ts(renametotal→totalElements, addpageNumber/pageSize/totalPages, addpage/sizequery params on search operation) was applied manually to match what a real regen will produce post-merge. Same pragmatic approach used in #309; confirm with a real regen after merge.Post-merge follow-ups (filed as signals, not in scope)
RateLimitInterceptorcould cover/api/documents/search— separate security-policy decision, worth its own issue.👨💻 Felix Brandt — Senior Fullstack Developer
Verdict: ⚠️ Approved with concerns
What I checked
pageSlice,enrichItems,buildPageHref,buildResultPagedall say what they do. ✓pageSlice(sorted, pageable)usesMath.mintwice; edge math lives once. ✓$derivednot$effect,$props(), keyed{#each}N/A (Pagination doesn't iterate). ✓Pagination.svelteis one visual region in ~55 lines. Good.page.server.tsstill returns the shape the component expects (now including paging fields).Concerns (non-blocking)
buildPageHrefandtriggerSearchduplicate URL-building logic. Both iterate over the same filter set (q,from,to,senderId,receiverId,tags,sort,dir,tagQ,tagOp) and emit the sameSvelteURLSearchParams. The only difference is the source (local$statevs serverdata) and whetherpageis appended. If a filter gets added later, it has to be added in two places. Extract abuildSearchParams({ q, from, to, ..., page? })helper — probably 15 lines — and call it from both.PageRequest.of(0, 10_000)in tests is a magic sentinel. I introduced it. It works, but seeing it scattered across ~15 call sites would make me pull it into aDocumentServiceTest.UNPAGED = PageRequest.of(0, 10_000)static constant. Tiny readability win, prevents accidental copy-paste of the wrong number.DashboardServiceand any other caller ofsearchDocumentsis unchanged becauseDashboardServicedoesn't callsearchDocuments. Confirmed via grep — no other service or controller invokes it. Good, just noting I checked.TDD gaps worth flagging
total→totalElementson the DTO was NOT red-before-green in isolation. I rewroteDocumentSearchResultTestin bulk to add the new tests + rename, so the rename traveled under the umbrella of "new field coverage." Minor process violation; the safety net still caught it because compile errors cascaded across every caller. Won't claim strict TDD for the rename.LGTM pending the two extract-a-helper suggestions
Neither is a blocker. The PR ships a measurable perf improvement with zero behavior regression. The URL-builder dedup is the one I'd actually do if we had 10 more minutes.
🏛️ Markus Keller — Senior Application Architect
Verdict: ✅ Approved
Structure review
DocumentService.searchDocuments(..., Pageable)matches the Spring idiom used byNotificationService.getNotifications. Convention established. ✓DocumentSearchResultis still a domain DTO, not a SpringPage<T>— the controller wraps the pageable result at the boundary, so the openapi-typescript client sees a clean shape (nopageable.sort.sortednoise leaking to the frontend). ✓DocumentServicestill doesn't reach intoPersonRepository.DashboardServiceis untouched. Feature package boundaries preserved. ✓JpaSpecificationExecutor.findAll(Spec, Pageable). In-memory paths load everything + slice — documented with the scaling ceiling comment as I suggested. ✓One watchpoint (not a blocker)
The fallback to in-memory sort for SENDER/RECEIVER/RELEVANCE still loads
documentRepository.findAll(spec)without a pageable. At 1500 docs, that's 1500 Document entities (each with eager receivers/tags/trainingLabels sets via the existingFetchType.EAGER+@JoinTable). The 30× enrichment win is real, but the full entity fetch is the same as before the PR — it's just no longer followed by 30× the enrichment cost. If SENDER sort becomes the perf hot path in production, the win from pagination is smaller than DATE sort would imply. That's the trade the issue chose and I signed off on; just keeping the observation on the record for future maintainers looking at Grafana and wondering why SENDER p95 doesn't track DATE p95.LGTM
Narrow, localised change within the existing monolith-first posture. Merge when the suite is green.
🔒 Nora Steiner — Application Security Engineer
Verdict: ✅ Approved
Surface review
@Validatedadded at theDocumentControllerclass level. My primary concern from the issue review is addressed. The rejection test (search_returns_400_when_size_exceeds_max) locks it — if someone removes@Validatedlater, that test fails loudly.pagebounded to@Max(100_000)blocks theInteger.MAX_VALUE × pageSizeoverflow into negative SQL offsets. Covered bysearch_returns_400_when_pageAboveMax.pageandsizeare integers, not strings; they flow intoPageRequest.of(int, int), never into a query fragment.totalElements— same information the oldtotaldid. No new information disclosure.Defense-in-depth note
@Validon the@RequestBodyannotations elsewhere inDocumentControllerwhile we're here —DocumentUpdateDTO-type payloads won't get validated without@Valideven with@Validatedat the class. That's out of scope for this PR but worth a follow-up grep; putting the annotation at the class opens the door, but each@RequestBodystill needs@Validto walk through it. Specifically not blocking #315.Defensive test coverage
search_returns_400_when_size_exceeds_max(size=101)search_returns_400_when_size_below_min(size=0)search_returns_400_when_page_negative(page=-1)search_returns_400_when_page_above_max(page=200000)All four are small, fast, and stay in the regression suite forever. This is the right shape.
LGTM
The security delta is entirely positive: closes a DoS vector that was already latent in the pre-PR
?size=…-free endpoint (which returned all 1500 rows). Ship it.🧪 Sara Holt — QA Engineer
Verdict: ⚠️ Approved with concerns
Coverage landed
size, size-below-min, negative page, page-above-max, ArgumentCaptor on forwarded Pageable. ✓findAll(Spec, Pageable)is used on fast path, Pageable propagates to DB, paging fields flow back on the result, SENDER-sort slices + reports full total, page-beyond-last returns empty + correct total. ✓of()semantics preserved,paged()populates fields, totalPages rounds up,@Schema(requiredMode=REQUIRED)asserted on every new field. ✓Concerns
The baseline frontend suite is failing 12 tests even without this PR. I saw the run output — chromium browser-mode flakes in
ThumbnailRow,TagParentPicker,TranscriptionBlock, admin panels. That's infrastructure debt, not this PR's bug, but it does mean we can't confidently say "frontend full suite green" on merge. Recommendation: file a separate "stabilise chromium browser tests" issue. Not blocking this PR.No integration test runs the full stack —
MinIO → Spring Boot → browser— for the paged endpoint.DocumentSearchIntegrationTest(if it exists, and I'd have to verify) would catch shape regressions on real Postgres. At 1500 docs, a one-shot test that loads a fixture, calls/api/documents/search?page=5&size=50, and assertsitems.size()==50 && totalElements==1500costs 10s and buys genuine confidence. Worth a follow-up.No perf assertion. The whole point of the PR is speed. Neither a backend nor a frontend test asserts "this is measurably faster than before." Hard to bake in — perf tests are flaky on shared runners — but a rough upper-bound assertion on enrichment method invocation count exists (
findEnrichmentDatacalled with N = pageSize, not totalElements) which is the best we can do at unit level. Good enough.PageRequest.of(0, 10_000)sentinel in 15+ existing tests is a readability smell. Felix flagged it too. Extract to a test-scoped constant.What I'd want before merge (none are blockers)
PageRequest.of(0, 10_000)→UNPAGEDconstant in test code.LGTM
Test pyramid shape is right: lots of fast unit/controller cases, a few captors for behavioural invariants, no misuse of
@SpringBootTest. Ship it.⚙️ Tobias Wendt — DevOps & Platform Engineer
Verdict: ✅ Approved
What I checked
/api/documents/searchcall because enrichment runs on the slice. At 1500 rows and typical dashboard usage this is the biggest single-request perf improvement shipped this quarter. ✓/api/documents/searchwas no Cache-Control header and still is. Appropriate for per-user-authenticated content that varies by permissions. ✓application/json).Observability opportunity (future work, not blocking)
The controller now takes a
page+sizequery param pair. Worth adding a Prometheus labelsize_bucket(e.g.default|custom-small|custom-large) to the/api/documents/searchhistogram so we can spot "someone's client is sending size=100 repeatedly" vs "default 50." Low cost, catches client bugs early. Separate PR.Manual api.ts edit
npm run generate:apiagainst a rebuilt dev backend will converge. Same approach as #309. Worth running and confirming empty diff after merge.Rate limiting
/api/documents/search, as I noted in the issue review. Recommendation stands: separate PR for a security-policy decision, e.g. 60/min/user. Not a prerequisite for shipping #315.LGTM
Pure positive change to operational posture. Merge.
🎨 Leonie Voss — UX Designer & Accessibility Advocate
Verdict: ⚠️ Approved with concerns
What landed as expected
<a href>for prev/next. NonoScroll: true. SvelteKit's default scroll-restore brings the user to the top of the new slice. ✓min-h-[44px] min-w-[44px]). ✓aria-current="page"on the current-page label — screen readers announce it as the current location. ✓<span aria-hidden="true">. ✓focus-visible:ring-2 focus-visible:ring-brand-navy). ✓flex-colunder 640 px,sm:flex-row sm:justify-betweenabove. ✓aria-disabled="true"on bounds, href omitted so the link isn't activatable. Proper semantics for a disabled link (we don't render<button>because it breaks shareable-URL semantics).Two concerns
aria-disabled="true"+hrefomission disables mouse/keyboard activation but a screen reader still reads the link label. Users may hear "Previous, link, disabled" and wonder. Consider: if!hasPrev, render a<span>instead of<a>with the same visuals (or addaria-hidden="true"on the disabled element so assistive tech skips it entirely). Current implementation isn't wrong — it's the documentedaria-disabledpattern — but I'd want a real-user test with NVDA/VoiceOver before calling it great.The pagination control only renders when
totalPages > 1— good default. But it means on a small corpus (< 50 results) the user sees NO affordance that pagination exists, which is right. However, on a 100-result corpus (totalPages === 2), the user seesSeite 1 von 2 [« Zurück (disabled)] [Weiter »]. The disabled prev can be visually confusing in that small window. Not a blocker, just noting the edge feels odd.Not blocking, worth filing as follow-ups (Markus mentioned scaling, mine are UX)
focus()on a header element after navigation. Needs some SvelteKitafterNavigateplumbing; out of scope.page=X of Yinput for jump-to-page — flagged earlier, still a future-work item.3 → 47pagination is tedious with prev/next only.Visual regression snapshots
LGTM
Baseline a11y is in good shape. The two concerns are polish, not blockers. Approve.
Review Cycle 1 — Changes
Addressed
PageRequest.of(0, 10_000)magic sentinel — extracted to a namedUNPAGEDconstant onDocumentServiceTestandDocumentServiceSortTest. Commit6808d9fb.buildPageHrefandtriggerSearchURL-builder duplication — extractedbuildSearchParams(filters, targetPage?)helper in+page.svelte. Any new filter param now lands in exactly one place. Commitbaf7ae34.<a aria-disabled>at bounds still announced as a disabled link by screen readers — switched to<span aria-hidden="true">at the bounds with a matchingdisabledBaseTailwind class for visual parity. AT skips the element entirely now. Tests updated. Commita8edd0d9.DocumentSearchPagedIntegrationTest(5 cases,@SpringBootTest+ Testcontainers Postgres, 120-doc fixture). Asserts first page / last partial page / beyond-last / SENDER-sort slice / disjoint slices. Commit4f741a87.Not addressed (explicitly out of scope per persona + plan)
/api/documents/search— separate security-policy issue.@Validaudit across@RequestBodyhandlers — needs a broader DTO-constraint audit first (@Validalone without@NotNull/@NotBlankis a no-op). Separate issue.size_bucketlabel — observability improvement, separate PR.Final status
Pagination.svelte.spec.ts9/9 green.documents/specs 27/27 green.feat/issue-315-paginate-documents-search.