test(e2e): follow-up gaps from legibility pre-flight (#402) #431
Reference in New Issue
Block a user
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Context
PR #430 (legibility pre-flight — make tests trustworthy) fixed the most critical E2E accuracy issues. A number of smaller gaps were intentionally deferred to keep the PR focused. This issue tracks everything that was left behind.
Deferred items
1. J10 — Full notification bell flow not tested
The current bell test only verifies open/close (Escape). It does not cover:
Action: Extend
notification-deep-link.spec.tsbell describe with these three steps.2. J6 — Sender filter is untested
The J6 search test covers text + tag AND combination. The sender (
?sender=<uuid>) filter has no E2E coverage.Action: Add a test that creates a document with a seeded sender via API, then verifies that searching with
?sender={personId}returns that document and filters out others.3. Person accumulation — no
DELETE /api/personsendpointThe J5 test seeds two persons on every run (using a timestamp-based
lastNameto avoid collisions). Those persons are never cleaned up becausePersonControllerhas noDELETE /{id}endpoint.Action (two options — pick one):
DELETE /api/persons/{id}endpoint (admin-only) and wire upafterAllin J5.4. afterAll cleanup missing in older spec sections
Several older describe blocks (written before the pre-flight PR) seed documents via API but have no
afterAllcleanup. Notable examples:documents.spec.ts— "Document creation" and "PDF viewer" describe blocksadmin.spec.ts— The backfill hash test creates a document but does not delete itAction: Audit all
beforeAllblocks that callPOST /api/documentsand add matchingafterAlldeletions.5. E2E suite not running in CI
The CI pipeline (
infra/gitea/workflows/ci.yml) stops at unit/component tests. The Playwright suite requires the full Docker Compose stack and currently must be run manually.Action: Add a Playwright job to the Gitea CI pipeline that:
docker-compose up -d/actuator/health)npm run test:e2etest-results/as an artifact on failure6. Negative registration tests
The J1 registration flow is covered (happy path). Negative cases are not:
Action: Add negative test cases to
auth.spec.tsregistration describe.7. Mobile viewport — read path
Issue #402 noted that the read path is critical on phones (320 px). The deep-link test runs at two viewports but most other tests run at desktop only.
Action: Identify 3–5 read-path tests (document detail, person detail, search results) and run them at
{ width: 320, height: 700 }as well.Acceptance criteria
beforeAllAPI-seeded blocks have matchingafterAllteardownmain👨💻 Felix Brandt — Senior Fullstack Developer
Observations
notification-deep-link.spec.tswithbeforeAll/afterAllcleanup in place — that scaffolding is reusable. The current test only asserts open + Escape close. The three missing assertions (unread badge, mark-as-read, navigate to target document) are well-scoped additions inside the existing describe block.GET /api/notifications/unread-countexists and returns{ count: N }.NotificationControlleralso has mark-as-read endpoints. E2E can hit these viarequestfixtures to assert state transitions without depending on polling timing.DocumentControlleraccepts?senderId=<uuid>(not?sender=<uuid>as the issue says — confirm the exact query param name before writing the test). The test needs a seeded document with a known sender;POST /api/persons+POST /api/documents(withsenderId) inbeforeAllis the pattern used by J5 inpersons.spec.ts.PersonControllerhas noDELETE /{id}endpoint. J5 inpersons.spec.tsalready creates two persons with no cleanup — the issue correctly identifies this as accumulation debt.documents.spec.ts"Document creation" and "New document" describe blocks have noafterAll. Theadmin.spec.tsbackfill hash test also seeds without cleanup. This is confirmed — the issue's list is accurate.auth.spec.tscovers happy-path registration. The currentRegistration via invite codedescribe has no negative cases. The mismatched-password case can be tested entirely in afreshCtxwith no backend state — it's a cheap add.notification-deep-link.spec.tsalready runs at{ width: 320, height: 700 }as a positive example to follow. The other specs (documents, persons, search) run desktop-only.Recommendations
'bell opens dropdown, shows notifications list'test OR add two newit()cases inside the same describe — whichever keeps them under 25 lines each. Check unread count viarequest.get('/api/notifications/unread-count')before and after clicking a notification. Usepage.locator('[data-unread-count]')(or equivalent ARIA badge) for the badge visibility assertion.beforeAllof a new describe insidedocuments.spec.ts. Navigate to/?senderId={personId}(verify exact param from the backend) and assert only the seeded document appears. UseafterAllto delete both the document and the person.DELETE /api/persons/{id}with@RequirePermission(Permission.WRITE_ALL)— it's a one-method addition toPersonControllerand removes the only cleanup blocker. Option (b) defers tech debt to an indefinite admin tool that adds UI complexity without solving the underlying E2E problem. Option (a) is the clear winner.afterAllblocks to every describe that callsPOST /api/documentswithout a matching delete. This is mechanical work — scan forrequest.post('/api/documents')in all spec files.Registration via invite codedescribe: (1) used invite code returns an error, (2) expired code returns an error, (3) mismatched passwords stays on/registerwith an error message. Each should run infreshCtxwithstorageState: undefined.runAtMobile(page)that sets viewport to{ width: 320, height: 700 }and use it in a parametrized loop — same pattern used innotification-deep-link.spec.ts. Apply to: document detail, person detail, document search results, home page, andbriefwechsel.🏗️ Markus Keller — Application Architect
Observations
DELETE /api/persons/{id}gap is a domain boundary hole, not just a test convenience problem. The Person domain exposes create, read, update, and merge — but no delete. Any feature that seeds data (E2E tests, future admin tooling, data quality workflows) hits this ceiling. The gap is structural, not cosmetic..gitea/workflows/ci.ymlruns three jobs:unit-tests,ocr-tests, andbackend-unit-tests. No Playwright job exists. The E2E suite comments infrontend/e2e/CLAUDE.mdexplicitly document this: "E2E tests are not currently run in CI." This is a known architectural hole in the delivery pipeline.docker-compose up -din CI, but the existing pipeline never starts the full Compose stack. There is nodocker-compose.ci.ymloverlay in the repo. A CI job that spins up the full stack must account for startup ordering (backend needs DB + MinIO healthy before accepting requests), port exposure, and ephemeral data volumes.Date.now().toString(36)) as a collision-avoidance hack. This works but produces unbounded database growth per test run — not a problem today, but a maintenance liability as the E2E suite grows.POST /api/personsandPOST /api/documents— two domain boundaries. This is expected and correct E2E behavior, but theafterAllmust delete both in reverse dependency order (document first, then person).Recommendations
DELETE /api/persons/{id}as a first-class endpoint (admin-scoped:@RequirePermission(Permission.ADMIN)rather thanWRITE_ALL, since deleting a historical person has irreversible archival consequences). UpdatePersonControllerandPersonService. Write the failing unit test first.docker-compose.ci.ymloverlay before writing the Playwright CI job. The overlay should: use named (not bind-mounted) volumes, not expose postgres port externally, setPOSTGRES_PASSWORDandMINIO_ROOT_PASSWORDfrom Gitea secrets. The existingdocker-compose.ymluses bind mounts for postgres data — that needs an override for CI.wait-for-healthstep in the CI Playwright job (pollhttp://localhost:8080/actuator/health) with a timeout of 120 seconds before running any tests. Use a simpleuntilloop, notsleep. The backend boots faster than OCR startup — no need to wait for OCR in E2E.E2E-in their name/title. A periodic admin purge of records matching this prefix is then trivially safe. This doesn't replaceafterAllcleanup but adds a safety net for interrupted test runs.Open Decisions
DELETE /api/persons/{id}:WRITE_ALL(standard writer) vs.ADMIN(admin-only). Deleting a historical person cascades to document sender/receiver links and is irreversible — this feels closer toADMINscope. ButWRITE_ALLis what all other mutation endpoints use. Which principal should be able to delete a person?🛡️ Nora "NullX" Steiner — Security Engineer
Observations
DELETE /api/persons/{id}— cascade risk: If this endpoint is added (option a), it will cascade to document sender/receiver relationships. AWRITE_ALLuser who can delete persons could silently sever document–person links for the entire family archive. This is data destruction, not just record deletion. The permission level and cascade behavior need explicit design.auth.setup.tslogs in with test credentials. The CI workflow must source these from Gitea secrets, not hardcoded values in workflow YAML. Looking atdevops.md's persona notes:${{ secrets.E2E_ADMIN_PASSWORD }}is the required pattern. If someone hardcodesadmin123in the CI job YAML, that secret is committed to git history.auth.spec.tstakes a screenshot attest-results/e2e/admin-invite-created.pngwhile the invite table is visible. Iftest-results/is uploaded as a CI artifact on every run (not just failure), invite codes could be visible in artifact storage. This is low risk in a self-hosted private Gitea, but worth noting.NotificationControllerhas mark-as-read endpoints — the test should verify that after clicking a notification, the unread count drops to zero viaGET /api/notifications/unread-count. This proves the server-side state changed, not just that the UI updated optimistically.Recommendations
DELETE /api/persons/{id}toPermission.ADMIN— notWRITE_ALL. Document the cascade behavior in a comment on the endpoint. Add a soft-delete flag (deletedAt) rather than hard delete so document–person links survive as orphaned references rather than null pointers. This is an archival system: irreversible data deletion needs a higher permission bar.${{ secrets.E2E_ADMIN_PASSWORD }}and inject it via an environment variable, not a hardcoded value. Add a comment in the workflow YAML explaining this is a test-only account with no production data.'rejects already-used invite code — prevents replay attack') so their security intent is visible in CI logs.test-results/artifacts onlyif: failure(), notif: always(). The issue's spec suggests "on failure" — stick to that. Reduces artifact storage and avoids exposing any seeded test data.request.get('/api/notifications/unread-count')that the count decremented, not just that the badge disappeared from the DOM.🧪 Sara Holt — QA Engineer & Test Strategist
Observations
notification-deep-link.spec.tsseeds a document + comment and has properbeforeAll/afterAll. The scaffolding for a richer bell test is already there — only the assertions are missing. This is an extension task, not a rewrite.afterAllin "Document creation" and "PDF viewer" indocuments.spec.ts: Confirmed. These describe blocks callPOST /api/documentsin setup steps but have no matching delete. Every test run leaks documents. For a small archive this accumulates quickly.Person creationdescribe inpersons.spec.tscreatesE2E Testpersonon every run with noafterAll. Combined with the J5 relationship test creating two more persons per run, this produces 3+ persons per full E2E run. Without a DELETE endpoint, there is no mechanical fix possible today.?sender=<uuid>butDocumentControlleraccepts?senderId=<uuid>. The test will fail to filter anything if the wrong param name is used. Verify withcurl '/?senderId={id}'against a running backend before writing the test.freshCtxcorrectly. The negative tests for expired/used codes need the same pattern — fresh context with no admin session. The "mismatched passwords" case is entirely client-side validation (HTML5 or SvelteKit form action) — test by fillingpasswordConfirmwith a different value and asserting the page stays on/register.workers: 1runs sequentially. Adding 7 deferred items will extend runtime. The target from Sara's pyramid model is <8 minutes. Before adding more tests, measure current runtime and confirm headroom.Recommendations
'bell opens dropdown'test OR split into three focusedit()cases. Name them clearly:'bell badge shows unread count when notifications exist','clicking a notification marks it as read and decrements badge','clicking a notification navigates to the target document'. One assertion per test.// Backend param: ?senderId (not ?sender)at the top of the test. Seed a person with a unique last name (timestamp), create a document with that person as sender, search with?senderId={id}, and assert the specific document title appears. Seed a second unrelated document and assert it does NOT appear.afterAllcleanup, the E2E suite's test data grows unbounded. Option (b) (a dev admin tool) is manual and fragile. AddDELETE /api/persons/{id}now and addafterAllto J5 and the person creation describe. This is the only durable solution.afterAllaudit: Usegrep -rn "request.post('/api/documents')" frontend/e2e/to find all seeding calls, then check each describe for a matchingrequest.delete. Add the delete inafterAllfor every describe that's missing it. Do this as a single atomic commit so the audit is complete.notification-deep-link.spec.ts— afor (const viewport of [...])loop around the five read-path tests. Use 320×700 only (the issue specifies 320px).time npm run test:e2elocally and note the current duration. Add a comment inplaywright.config.tswith the baseline so future additions can be checked against the <8 min target.⚙️ Tobias Wendt — DevOps & Platform Engineer
Observations
docker-compose.ci.ymloverlay exists: The existing pipeline never starts the full stack. Adding a Playwright job that callsdocker-compose up -dagainst the developmentdocker-compose.ymlwill bring in bind mounts for PostgreSQL data (./data/postgres:/var/lib/postgresql/data) — which is not safe for CI. Those paths don't exist on a fresh runner.actions/cache@v4for both Maven and node_modules,actions/upload-artifact@v4(not the deprecated v3). The build hygiene is good. The Playwright job can follow the same structure./actuator/health)". The backend only exposes/actuator/health— good. But the PlaywrightwebServerconfig inplaywright.config.tsalready handles frontend dev server startup vianpm run dev -- --port 3000. In CI with a real Docker Compose stack, the E2E tests should point at the built frontend (port 3000 from the SvelteKit Node adapter), not the dev server.DOCKER_API_VERSION: "1.43"already set in backend job). The Playwright job needs the sameDOCKER_API_VERSIONenv var if it uses Testcontainers — but for the E2E job, it's spawning Compose services directly, not via Testcontainers. Confirm the NAS runner has Docker Compose v2 (docker composenotdocker-compose).screenshot: 'on'inplaywright.config.ts, screenshots are always captured — that means even passing tests produce screenshot files. Considerscreenshot: 'only-on-failure'in the CI profile to reduce artifact size.workers: 1): The E2E suite runs sequentially by design (shared auth state). This is unavoidable unless auth is refactored to per-test isolation. In CI this means the Playwright job runtime is the sum of all tests, not parallelized. Plan accordingly.Recommendations
docker-compose.ci.ymlas a proper overlay before writing the CI job. It should override:volumes(named instead of bind mounts), removeportsexposure for postgres and minio (useexposeinstead), and set credentials via env vars sourced from Gitea secrets. Base:docker-compose up -d db minio backend frontend.screenshotto'only-on-failure'inplaywright.config.tsfor the CI profile (or add a separateplaywright.config.ci.ts). Keep'on'for local runs where screenshots aid debugging. This halves artifact size on green runs.docker compose(v2 syntax), notdocker-compose(v1). Verify the NAS runner has Compose v2 (docker compose version) before writing the workflow. If v1 only, use the hyphen form — but plan to migrate.📋 Elicit — Requirements Engineer
Observations
Recommendations
docker-compose.ci.ymloverlay, (5b) AddE2E_ADMIN_PASSWORDGitea secret, (5c) Verify NAS runner supports Docker Compose v2, (5d) Add Playwright CI job toci.yml. The current single acceptance criterion ("E2E suite runs in CI on every push to main") hides four independent prerequisites.InviteControlleror equivalent) for how expiry is implemented, then write acceptance criteria that can be tested deterministically. If expiry requires time manipulation, consider using an invite code from a prior test run (mark-as-used) for the replay test, and skip the expiry test if the backend doesn't expose a way to create pre-expired codes.🎨 Leonie Voss — UI/UX & Accessibility
Observations
aria-live="polite"is missing from the badge, the count update is invisible to assistive technology. The bell test is a good place to verify this — but a Playwright/axe assertion is needed, not just a DOM check.notification-deep-link.spec.tsincludes anAxeBuildercheck. The new bell test assertions should include an axe scan of the bell dropdown open state — dropdown menus are a common accessibility failure point (focus trapping, role=menu vs role=dialog).Recommendations
AxeBuilderscan in the J10 bell dropdown test: After opening the dropdown, runawait new AxeBuilder({ page }).analyze()and assertviolations.toHaveLength(0). This catchesrole,aria-label, and focus-management violations in the dropdown in one assertion.aria-liveon the unread badge: After the mark-as-read step in the bell test, assertpage.locator('[aria-live]').filter(...)includes the badge region, or check that the badge element hasaria-live="polite". If it doesn't, that's a separate accessibility bug to file.page.setViewportSize({ width: 320, height: 700 }), navigate to a document detail page, and runcheckA11y(page)in addition to visual assertions. A screenshot at this size is low effort and will catch overflow issues.const box = await bell.first().boundingBox(); expect(box.height).toBeGreaterThanOrEqual(44)— this is the WCAG 2.2 minimum. Do the same for primary navigation links in the mobile header.🗳️ Decision Queue — Action Required
1 decision needs your input before implementation starts.
Architecture / Security
Permission scope for
DELETE /api/persons/{id}— should the delete endpoint requireWRITE_ALL(same as all other mutations, simpler, consistent) orADMIN(higher bar, reflects irreversibility of deleting a historical person that cascades to document–sender/receiver links)? Nora also recommends considering a soft-delete (deletedAtflag) rather than hard delete, so document links survive as traceable orphans rather than silent nulls. Three options:WRITE_ALL+ hard delete — simple, consistent, dangerous for archival dataADMIN+ hard delete — higher permission bar, consistent with "destructive = admin-only" semanticsADMIN+ soft delete — safest for an archival system, adds adeleted_atcolumn and Flyway migration(Raised by: Markus, Nora)