test(ci): production image smoke-test job — boot frontend + backend images, curl /login #501

Open
opened 2026-05-11 13:19:38 +02:00 by marcel · 0 comments
Owner

Background

PR #499 wires the production deploy pipeline (Caddy + nightly staging + release tag → production). Today, cross-image regressions are caught only at deploy time: if a frontend Dockerfile change breaks the production stage, or the backend image fails to bind port 8080, the workflow's up -d --wait step is the first to notice, at 02:00 during the nightly run.

A pre-merge CI job that boots the actual production images and curls /login would catch this at PR time.

This was Sara's "deferred follow-up" in the pre-merge review summary (comment #8333) and tracked as OQ-2 in Elicit's PR #499 review (comment #8356).

Scope

Add a CI job production-images-smoke to .gitea/workflows/ci.yml:

  1. Build the frontend production stage and the backend image using docker compose -f docker-compose.prod.yml build against a stub .env.
  2. Bring up minio + db + backend + frontend (--profile staging to skip the OCR container, which carries 5 GB+ of model weights and adds nothing to a smoke test).
  3. Wait for healthchecks (--wait).
  4. Curl /login on the frontend, assert HTTP 200 and a Content-Type of text/html.
  5. Curl /actuator/health on the backend through the internal docker network (not via Caddy), assert status UP.
  6. Tear down with down -v.
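
The six steps above could be sketched as a small POSIX shell script that the job calls. This is a rough illustration rather than the final job. The stub env file name (.env.ci-stub), the frontend URL (http://localhost:3000), the compose service names (frontend, backend), and the presence of curl inside the frontend container are all assumptions. So is the expectation that the health JSON starts with its top-level "status" field.

```shell
#!/bin/sh
# Sketch of the production-images-smoke steps. Values marked "assumed" are
# illustrative and are not taken from the repository.
set -eu

COMPOSE_FILE="docker-compose.prod.yml"
ENV_FILE="${ENV_FILE:-.env.ci-stub}"                    # assumed stub env file name
FRONTEND_URL="${FRONTEND_URL:-http://localhost:3000}"   # assumed published port

compose() {
  docker compose --env-file "$ENV_FILE" -f "$COMPOSE_FILE" --profile staging "$@"
}

# Step 4 check: succeed only for HTTP 200 with a text/html content type.
# Input is the "<status> <content-type>" string produced by curl -w below.
is_html_ok() {
  status="${1%% *}"
  ctype="${1#* }"
  [ "$status" = "200" ] || return 1
  case "$ctype" in
    text/html*) return 0 ;;
    *) return 1 ;;
  esac
}

# Step 5 check: succeed only if the top-level health status is UP.
# Assumes the JSON body begins with the "status" field.
is_health_up() {
  compact="$(printf '%s' "$1" | tr -d '[:space:]')"
  case "$compact" in
    '{"status":"UP"'*) return 0 ;;
    *) return 1 ;;
  esac
}

smoke() {
  trap 'compose down -v' EXIT      # step 6: tear down even when a check fails
  compose build                    # step 1
  compose up -d --wait             # steps 2-3
  # Step 4: frontend /login from the runner.
  is_html_ok "$(curl -sS -o /dev/null -w '%{http_code} %{content_type}' "$FRONTEND_URL/login")"
  # Step 5: backend health over the compose network, bypassing Caddy.
  # Service names and curl in the frontend image are assumed.
  is_health_up "$(compose exec -T frontend curl -fsS http://backend:8080/actuator/health)"
}

# Only touch docker when invoked with the "run" argument.
if [ "${1:-}" = "run" ]; then smoke; fi
```

Keeping the two checks as plain functions lets them be exercised without docker, and the EXIT trap means a failed assertion still leaves the runner clean.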

Total CI time target: < 5 minutes on the existing self-hosted runner. If a more complete smoke test turns out to need the OCR container, that belongs in a separate job (it would dominate runtime).
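
One possible way to make the 5-minute target visible in job logs is to wrap the smoke command in a wall-clock budget. The helper below is a hypothetical sketch: run_with_budget is not an existing script in the repository, and it relies on the GNU coreutils timeout command, which exits with status 124 when the budget expires.

```shell
#!/bin/sh
# Hypothetical helper: run a command under a wall-clock budget (in seconds)
# and report the elapsed time on stderr.
run_with_budget() {
  budget_s="$1"; shift
  start="$(date +%s)"
  rc=0
  timeout "$budget_s" "$@" || rc=$?
  end="$(date +%s)"
  echo "elapsed: $((end - start))s (budget: ${budget_s}s, exit: ${rc})" >&2
  return "$rc"
}
```

A call such as `run_with_budget 300 ./smoke.sh run` (the script name is assumed) would cap a single run at five minutes. It only enforces a hard ceiling per run, so the p90 figure would still have to be read from run history.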

Acceptance criteria

  • CI job exists and passes on the PR that introduces it
  • Job fails when a real regression is introduced (verified by intentionally breaking the frontend production stage on a throwaway branch)
  • Job runtime p90 is under 5 minutes on the existing self-hosted runner

Out of scope

  • OCR container smoke (separate follow-up — heavy)
  • Anything requiring secrets (stub env file only)
  • E2E flows (covered by existing Playwright job)

References

  • PR #499 — QA review (Sara, #8354)
  • PR #499 — Requirements review (Elicit, #8356, OQ-2)
  • Issue #461 — security scan gates in CI (parallel scope; image smoke is the functional counterpart)
marcel added the P2-medium, devops, and test labels 2026-05-11 13:19:43 +02:00