feat(search): show search result snippets with match highlighting (#219) #242

Merged

marcel merged 20 commits from feat/issue-219-search-snippets into main

2026-04-16 09:10:11 +02:00

Author	SHA1	Message	Date
Marcel	bca7822ab7	feat(search): surface summary snippet when summary matched the query Add a summary_snippet column to findEnrichmentData using ts_headline on documents.summary, only when the summary's tsvector matches the query. Expose it via SearchMatchData.summarySnippet / summaryOffsets and render a "Zusammenfassung" / "Summary" / "Resumen" labelled row in the document list — identical treatment to the transcription snippet row. Fixes the case where a document appeared in search results with no visible match explanation (e.g. searching "frucht" found a document whose summary mentioned "Früchte"). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 21:34:10 +02:00
Marcel	b87036dde0	feat(search): restyle highlights to navy underline and add snippet labels Switch search match highlights from bordered mint chips to a plain navy underline (decoration-brand-navy). Add visible "Inhalt" / "Content" / "Contenido" label before the transcription snippet, matching the style of the Von/An sender-receiver labels. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 21:33:23 +02:00
Marcel	ca7dbfeb99	feat(search): partial-word matching via to_tsquery prefix queries Replace websearch_to_tsquery with a CROSS JOIN LATERAL subquery that appends :* to each lexeme so prefix matches work (e.g. "furchtb" finds "furchtbar"). websearch_to_tsquery still handles the safe tokenisation of user input (stop words, special chars, operators); regexp_replace then adds :* before to_tsquery re-parses the result. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 21:32:37 +02:00
Marcel	f2f55b05c5	feat(search): add snippetOffsets to SearchMatchData and use ts_headline for highlighted snippets - SearchMatchData gains a 6th field snippetOffsets: List<MatchOffset> so the frontend can render highlighted terms inside the transcription snippet without {#html}. - DocumentRepository.findEnrichmentData now calls ts_headline() with chr(1)/chr(2) sentinels instead of returning raw block text; parseHighlight() strips the sentinels and produces clean text + MatchOffset list in one pass. - DocumentService exposes ParsedHighlight and parseHighlight() as public so they can be called from cross-package integration tests. - All related tests updated to the new 6-argument SearchMatchData constructor and to call parseHighlight() for asserting the snippet clean text and offsets. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 20:14:20 +02:00
Marcel	25f1402dd9	feat(search): highlight snippet terms and mark sender/receiver/tag matches in document list Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 20:03:04 +02:00
Marcel	c8cd236568	fix(search): make ParsedHighlight and parseHighlight public for cross-package test access Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 19:47:23 +02:00
Marcel	d14dd795a4	fix(pdf): merge setElements and render effects so canvas remount triggers re-render The refactor made pdfDoc a plain variable so renderer.isLoaded was not reactive. Svelte only tracked currentPage and scale — but when the canvas reappeared after loading, neither changed, so the PDF stayed blank. Fix: merge the two effects into one that reads canvasEl synchronously. Svelte now tracks canvasEl as a dependency; when the canvas remounts (loading spinner → false), the effect re-fires and renders the already-loaded PDF document. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 19:26:54 +02:00
Marcel	211f531513	perf(search): add index on transcription_blocks.document_id for lateral join Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 19:12:46 +02:00
Marcel	141511e973	style(search): improve mark hover contrast, remove no-op class, italicize snippet Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 19:11:54 +02:00
Marcel	89d2c47f8f	test(search): add applyOffsets coverage for negative start offsets Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 19:08:30 +02:00
Marcel	2b8663415b	test(search): assert matchData key and snippet in controller search response Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 19:06:47 +02:00
Marcel	a7c839aa94	fix(search): mark documents and total as required in OpenAPI schema Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 19:04:59 +02:00
Marcel	ddb87db6b7	feat(search): pass matchData from server load to DocumentList Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 18:34:54 +02:00
Marcel	93c78433cf	feat(search): render title highlights and transcription snippets in DocumentList Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 18:34:14 +02:00
Marcel	9673cefe44	feat(search): add applyOffsets utility and regenerate API types with MatchOffset/SearchMatchData Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 18:05:07 +02:00
Marcel	8c7ce147a0	feat(search): enrich searchDocuments with per-document match data DocumentService.searchDocuments now returns DocumentSearchResult with matchData populated from findEnrichmentData. Title highlights are parsed from chr(1)/chr(2) delimiters into MatchOffset lists; transcription snippet and sender/receiver/tag match flags are extracted from the same native SQL row. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 17:53:57 +02:00
Marcel	8526e6c0a1	test(search): add DocumentSearchEnrichmentTest for findEnrichmentData native query Tests lateral join best-block selection, chr(1)/chr(2) headline delimiters, sender/receiver/tag match flags, and null cases for missing relations. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 17:40:47 +02:00
Marcel	003d68ed21	feat(search): add DocumentSearchResult.withMatchData() factory with match overlay map Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 15:34:00 +02:00
Marcel	8cbecd452b	feat(search): add SearchMatchData record for per-document match signals Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 15:32:26 +02:00
Marcel	47da0fa216	feat(search): add MatchOffset record for character-level highlight positions Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 15:30:53 +02:00
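
The sentinel approach described in f2f55b05c5 and 9673cefe44 can be sketched roughly as follows. This is an illustration only, not the code from this PR: the real parseHighlight() lives in the Java DocumentService, and the field names `start` / `length`, the `Segment` type, and the clamping behaviour for negative offsets are all assumptions made for the example.

```typescript
// Hypothetical shapes; the actual MatchOffset record fields are not shown in this PR.
interface MatchOffset { start: number; length: number; }
interface ParsedHighlight { text: string; offsets: MatchOffset[]; }
interface Segment { text: string; highlighted: boolean; }

// chr(1) / chr(2) sentinels, as passed to ts_headline() per the commit message.
const START = "\u0001";
const END = "\u0002";

// Strip the sentinels and record where each highlighted run sits in the clean text.
function parseHighlight(raw: string): ParsedHighlight {
  let text = "";
  const offsets: MatchOffset[] = [];
  let openAt = -1; // index in the clean text where the current run began, or -1

  for (let i = 0; i < raw.length; i++) {
    const ch = raw[i];
    if (ch === START) {
      openAt = text.length;
    } else if (ch === END) {
      if (openAt >= 0 && text.length > openAt) {
        offsets.push({ start: openAt, length: text.length - openAt });
      }
      openAt = -1;
    } else {
      text += ch;
    }
  }
  return { text, offsets };
}

// Split clean text into plain and highlighted segments so a template can wrap
// the highlighted ones in <mark> without injecting raw HTML.
function applyOffsets(text: string, offsets: MatchOffset[]): Segment[] {
  const segments: Segment[] = [];
  const sorted = offsets.slice().sort((a, b) => a.start - b.start);
  let cursor = 0;

  for (const { start, length } of sorted) {
    const from = Math.max(start, cursor);             // clamps negative or overlapping starts (assumed behaviour)
    const to = Math.min(start + length, text.length); // never read past the end of the text
    if (to <= from) continue;                         // nothing left to highlight
    if (from > cursor) {
      segments.push({ text: text.slice(cursor, from), highlighted: false });
    }
    segments.push({ text: text.slice(from, to), highlighted: true });
    cursor = to;
  }
  if (cursor < text.length) {
    segments.push({ text: text.slice(cursor), highlighted: false });
  }
  return segments;
}

const parsed = parseHighlight("sehr \u0001furchtbar\u0002 kalt");
console.log(applyOffsets(parsed.text, parsed.offsets));
```

Returning segments rather than an HTML string keeps the rendering side free of {#html}, which is the motivation stated in f2f55b05c5.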
