familienarchiv/docs/infrastructure/ci-gitea.md
Marcel 02fb16a0bd
docs(ci): document composite actions in ci-gitea.md
Adds a Composite actions section covering the checkout-first ordering rule, the
secrets-via-inputs + unquoted-heredoc constraint (with the five-key guard and
shell: bash requirement), and a step-by-step for adding an input. Notes that the
inline Reload Caddy example now lives in the reload-caddy action.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-02 19:25:32 +02:00

CI with Gitea Actions

This document covers the Gitea Actions CI workflow for Familienarchiv, including the full workflow YAML, differences from GitHub Actions, and self-hosted runner provisioning.


Runner Architecture

Familienarchiv runs two containers on the same Hetzner VPS: Gitea itself and a single runner.

| Container | Purpose | Config |
| --- | --- | --- |
| gitea (Docker container) | Hosts Gitea itself | infra/gitea/docker-compose.yml |
| gitea-runner (Docker container) | Runs all CI and deploy jobs | infra/gitea/docker-compose.yml + /root/docker/gitea/runner-config.yaml |

Both containers live in the gitea_gitea Docker network on the VPS. The runner connects to Gitea via the LAN IP so job containers (which don't share the gitea_gitea network) can also reach it.

Docker-out-of-Docker (DooD)

The gitea-runner container mounts the host Docker socket (/var/run/docker.sock). When a workflow job runs, act_runner spawns a sibling container for each job. That job container also gets the Docker socket mounted (via valid_volumes in runner-config.yaml), enabling docker compose calls in workflow steps.

Workspace bind-mount setup (DooD path resolution)

When a workflow step calls docker compose up with relative bind-mount sources (e.g. ./infra/observability/prometheus/prometheus.yml), Compose resolves them against $(pwd) inside the job container and passes the resulting absolute path to the host Docker daemon. The host daemon then tries to bind-mount that path from the host filesystem.

In the default DooD setup the job container's workspace lives in the act_runner overlay2 layer — the host has no directory at that path, auto-creates an empty one, and the container fails with:

error mounting "…/prometheus/prometheus.yml" to rootfs at "/etc/prometheus/prometheus.yml": not a directory

Solution (ADR-015): store job workspaces on a real host path and mount it at the same absolute path inside the runner and every job container. runner-config.yaml configures this via workdir_parent, valid_volumes, and options.

One-time host setup (required on any fresh VPS):

mkdir -p /srv/gitea-workspace
# Then add to the runner service in ~/docker/gitea/compose.yaml:
#   volumes:
#     - /srv/gitea-workspace:/srv/gitea-workspace
# Restart the runner container for the change to take effect.

The path /srv/gitea-workspace is the canonical workspace root. It must be identical on the host and inside job containers — if the paths differ, Compose still resolves to the container-internal path, which the host daemon cannot find (the original bug).
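
A job step could fail fast when its working directory is not under the shared root. The check below is a minimal sketch and is not part of the documented workflows; the function name is_under_workspace_root and the WORKSPACE_ROOT variable are assumptions made for illustration.

```shell
# Hypothetical pre-flight check (not part of the documented workflows).
# WORKSPACE_ROOT defaults to the canonical path named above.
WORKSPACE_ROOT="${WORKSPACE_ROOT:-/srv/gitea-workspace}"

# Succeeds when $1 is the workspace root itself or a directory below it.
# The trailing slash keeps /srv/gitea-workspace-old from matching.
is_under_workspace_root() {
  case "$1/" in
    "$WORKSPACE_ROOT"/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: warn before calling docker compose with relative bind mounts.
if ! is_under_workspace_root "$PWD"; then
  echo "warning: $PWD is outside $WORKSPACE_ROOT; relative bind mounts will not resolve on the host" >&2
fi
```

The check is purely textual: it does not prove that the host directory is actually mounted, only that the path has the expected prefix.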

Disk management: act_runner cleans per-run subdirectories on completion. Orphaned directories from interrupted runs accumulate under /srv/gitea-workspace and should be pruned manually if disk space becomes a concern:

# List workspace directories older than 7 days
find /srv/gitea-workspace -mindepth 3 -maxdepth 3 -type d -mtime +7
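
The same listing can be wrapped in a small function so the root and the age threshold are parameters. This is a sketch only: list_stale_workspaces is a hypothetical name, and the depth of 3 is copied from the command above rather than verified against act_runner's directory layout.

```shell
# Hypothetical wrapper around the find call above.
# $1 = workspace root (default /srv/gitea-workspace)
# $2 = minimum age in days (default 7)
list_stale_workspaces() {
  local root="${1:-/srv/gitea-workspace}"
  local days="${2:-7}"
  # Depth 3 mirrors the documented command; adjust if the layout differs.
  find "$root" -mindepth 3 -maxdepth 3 -type d -mtime +"$days"
}

# Usage (lists only, deletes nothing):
#   list_stale_workspaces /srv/gitea-workspace 14
```

Keeping the function list-only means the operator reviews the output before removing anything.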

Running host-level commands from CI (nsenter pattern)

Job containers are unprivileged and do not share the host's PID/mount/network namespaces. Commands like systemctl that target the host daemon are therefore unavailable by default. When a workflow step needs to manage a host service (e.g. systemctl reload caddy), it uses the Docker socket to spin up a privileged sibling container in the host PID namespace:

- name: Reload Caddy
  run: |
    docker run --rm --privileged --pid=host \
      alpine:3.21@sha256:48b0309ca019d89d40f670aa1bc06e426dc0931948452e8491e3d65087abc07d \
      sh -c 'apk add --no-cache util-linux -q && nsenter -t 1 -m -u -n -p -i -- /bin/systemctl reload caddy'

nsenter -t 1 -m -u -n -p -i enters the init process's mount, UTS, network, PID, and IPC namespaces, giving systemctl a view of the real host systemd. No sudoers entry is required because the Docker socket already grants root-equivalent host access.

Alpine is used instead of Ubuntu: ~5 MB vs ~70 MB, and the digest is pinned to a specific sha256 so any upstream change requires an explicit Renovate bump PR. util-linux (which ships nsenter) is not part of the Alpine base image but is installed at run time in ~1 s from the warm VPS cache.

This exact step now lives in the reload-caddy composite action (see Composite actions below); both deploy workflows call it via uses: ./.gitea/actions/reload-caddy. The pinned digest moved with it, so Renovate's privileged-digest watch covers .gitea/actions/** as well as .gitea/workflows/**.

Why not sudo systemctl in the job container?

Job containers run as root inside an unprivileged Docker namespace. There is no systemd PID 1 inside the container, so systemctl would attempt to reach a socket that does not exist. sudo is usually absent from the job image and would not help even if it were present, because the container still has no systemd to talk to.

Why not Caddy's admin API?

Caddy ships a localhost admin API at :2019 by default. Job containers do not share the host network namespace, so they cannot reach localhost:2019 on the host. Exposing :2019 on a host-bound port to make it reachable would add a network attack surface with no benefit over the current approach.

The deploy workflows reload Caddy to pick up committed Caddyfile changes. This relies on a symlink that must exist on the VPS:

/etc/caddy/Caddyfile → /opt/familienarchiv/infra/caddy/Caddyfile

The symlink is created once during server bootstrap (see docs/DEPLOYMENT.md §3.1). Verify with:

ls -la /etc/caddy/Caddyfile
# Expected: lrwxrwxrwx ... /etc/caddy/Caddyfile -> /opt/familienarchiv/infra/caddy/Caddyfile

Troubleshooting: Reload Caddy step fails

Failure mode 1 — Caddy is stopped

Symptom in CI log:

Failed to reload caddy.service: Unit caddy.service is not active.

Recovery:

ssh root@<vps>
systemctl start caddy
systemctl status caddy   # confirm Active: active (running)

Re-run the workflow via Gitea Actions → "Re-run workflow".

Failure mode 2 — Caddyfile symlink is missing or mis-pointed

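This failure is silent at the reload step: systemctl reload caddy exits 0, but Caddy reloads whatever /etc/caddy/Caddyfile currently resolves to. If the stale config happens to satisfy the smoke test, the deploy passes against it; otherwise the smoke test is the first step to fail.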

Symptom: smoke test fails on the HSTS value or the /actuator/health → 404 check despite the Reload Caddy step succeeding.

Diagnosis:

ssh root@<vps>
ls -la /etc/caddy/Caddyfile
# Should be: lrwxrwxrwx ... /etc/caddy/Caddyfile -> /opt/familienarchiv/infra/caddy/Caddyfile

Recovery if symlink is wrong or missing:

ln -sf /opt/familienarchiv/infra/caddy/Caddyfile /etc/caddy/Caddyfile
systemctl reload caddy
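
Because a wrong symlink does not make the reload itself fail, the diagnosis above can be scripted and run on the VPS. The function below is a minimal sketch; caddyfile_symlink_ok is a hypothetical name, and the defaults are the two paths documented in this section.

```shell
# Hypothetical check: succeeds only when $1 is a symlink whose target is
# exactly $2. A regular file, a missing path, or a different target all fail.
caddyfile_symlink_ok() {
  local link="${1:-/etc/caddy/Caddyfile}"
  local expected="${2:-/opt/familienarchiv/infra/caddy/Caddyfile}"
  [ -L "$link" ] && [ "$(readlink "$link")" = "$expected" ]
}

# Usage on the host:
#   caddyfile_symlink_ok || echo "Caddyfile symlink is missing or mis-pointed" >&2
```

It has to run in the host's mount namespace (over SSH, or through the nsenter pattern); inside a job container /etc/caddy is not the host's directory.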

Failure mode 3 — nsenter / Docker socket unavailable

Symptom in CI log:

docker: Cannot connect to the Docker daemon at unix:///var/run/docker.sock.

or

nsenter: failed to execute /bin/systemctl: No such file or directory

The first error means the Docker socket is not mounted into the job container; check valid_volumes in /root/docker/gitea/runner-config.yaml on the VPS. The second means the Alpine container started but nsenter did not land in the host mount namespace, so it most likely looked for /bin/systemctl inside Alpine, where it does not exist. Verify that --privileged and --pid=host are both present in the workflow step.

Failure mode 4 — workspace bind-mount not configured (observability stack or any compose-with-file-mounts job)

Symptom in CI log:

Error response from daemon: error while creating mount source path "…/prometheus/prometheus.yml": mkdir …: not a directory

Or the service starts but immediately crashes because a config file was mounted as an empty directory.

Cause: /srv/gitea-workspace does not exist on the host, or the runner container's compose.yaml is missing the - /srv/gitea-workspace:/srv/gitea-workspace volume line.

Diagnosis:

ssh root@<vps>
ls -la /srv/gitea-workspace          # must exist and be a directory
docker inspect gitea-runner | grep -A5 Mounts   # must show /srv/gitea-workspace

Recovery:

mkdir -p /srv/gitea-workspace
# Add volume line to runner compose.yaml, then:
docker compose -f ~/docker/gitea/compose.yaml up -d gitea-runner

See docs/DEPLOYMENT.md §3.1 and ADR-015 for the full setup rationale.
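
The symptoms listed in this troubleshooting section can be matched mechanically against a CI log. The helper below is a sketch for triage only; classify_reload_failure is a hypothetical name, and the match strings are the log excerpts quoted above.

```shell
# Hypothetical triage helper: reads a CI log on stdin and prints the first
# failure mode whose quoted symptom appears in it.
classify_reload_failure() {
  local log
  log=$(cat)
  case "$log" in
    *"Unit caddy.service is not active"*)
      echo "mode 1: Caddy is stopped" ;;
    *"Cannot connect to the Docker daemon"*)
      echo "mode 3: Docker socket not mounted" ;;
    *"nsenter: failed to execute"*)
      echo "mode 3: nsenter not in host namespaces" ;;
    *"error while creating mount source path"*|*"not a directory"*)
      echo "mode 4: workspace bind-mount not configured" ;;
    *)
      # Mode 2 leaves no trace in the reload step's log.
      echo "no known symptom (mode 2 is silent; check the symlink)" ;;
  esac
}

# Usage:
#   classify_reload_failure < job.log
```

Failure mode 2 deliberately maps to the fallback branch, since the reload step logs nothing unusual when the symlink is wrong.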


Composite actions

The nightly.yml (staging) and release.yml (production) deploy workflows share their observability-stack deploy, Caddy reload, and smoke-test logic through three single-responsibility composite actions under .gitea/actions/ (ADR-029). Before this, the shared logic was duplicated in both workflows and held together by # Keep in sync with nightly.yml comments — an unenforced honour-system invariant.

| Action | Inputs | Purpose |
| --- | --- | --- |
| deploy-obs | grafana_admin_password, grafana_db_password, glitchtip_secret_key, postgres_password, postgres_host | Deploy obs configs + secrets to /opt/familienarchiv, validate the compose config, start the stack, assert the five healthchecked services |
| reload-caddy | (none) | Reload host Caddy via the privileged-sibling + nsenter pattern |
| smoke-test | host | Verify the public surface (login reachable, HSTS pinned, Permissions-Policy present, /actuator → 404) |

A workflow calls them by relative path, passing per-environment values as with: inputs:

- uses: ./.gitea/actions/deploy-obs
  with:
    grafana_admin_password: ${{ secrets.GRAFANA_ADMIN_PASSWORD }}
    grafana_db_password: ${{ secrets.GRAFANA_DB_PASSWORD }}
    glitchtip_secret_key: ${{ secrets.GLITCHTIP_SECRET_KEY }}
    postgres_password: ${{ secrets.STAGING_POSTGRES_PASSWORD }}
    postgres_host: archiv-staging-db-1
- uses: ./.gitea/actions/reload-caddy
- uses: ./.gitea/actions/smoke-test
  with:
    host: staging.raddatz.cloud

Checkout-first ordering rule

A local composite action (uses: ./…) only exists on disk after the repo is checked out. actions/checkout@v4 MUST therefore be the first step of any job that calls one — if a future reorder moves checkout later, every uses: ./.gitea/actions/… call fails because the action file is not yet on disk. Both deploy workflows pin checkout as step 1 for exactly this reason.
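
The ordering rule could be enforced with a small lint step instead of relying on review. The function below is a rough, text-based sketch: checkout_before_local_actions is a hypothetical name, and it looks at the whole file rather than at each job, so it fits workflows where the first checkout and the first local action belong to the same job.

```shell
# Hypothetical lint: succeeds when the first `uses: actions/checkout@...` line
# appears before the first `uses: ./...` line in the given workflow file.
checkout_before_local_actions() {
  local file="$1" checkout_line local_line
  checkout_line=$(awk '/uses:[ \t]*actions\/checkout@/ {print NR; exit}' "$file")
  local_line=$(awk '/uses:[ \t]*\.\// {print NR; exit}' "$file")
  # No local action in the file: nothing to enforce.
  [ -z "$local_line" ] && return 0
  # A local action without any checkout can never work.
  [ -z "$checkout_line" ] && return 1
  [ "$checkout_line" -lt "$local_line" ]
}

# Usage:
#   checkout_before_local_actions .gitea/workflows/nightly.yml || exit 1
```

A YAML-aware check per job would be stricter; this version only compares line numbers.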

Secrets inside composite actions

The secrets.* context is not available inside a composite action. Secrets are passed in as inputs, mapped to an env: block, and referenced as $VAR:

inputs:
  grafana_admin_password:
    required: true        # no default — a missing secret must fail loudly, never fall back to empty
runs:
  using: composite
  steps:
    - shell: bash         # composite steps do NOT default the shell — always declare it
      env:
        GRAFANA_ADMIN_PASSWORD: ${{ inputs.grafana_admin_password }}
      run: |
        cat > obs-secrets.env <<EOF   # unquoted EOF — $VAR expands at the shell layer
        GRAFANA_ADMIN_PASSWORD=$GRAFANA_ADMIN_PASSWORD
        EOF

Two load-bearing details:

  • Unquoted heredoc delimiter (<<EOF, not <<'EOF'). With a quoted delimiter the shell writes the literal string $GRAFANA_ADMIN_PASSWORD, and docker compose config --quiet still passes (the variable is present, just wrong). The deploy-obs action guards against this with a five-key non-empty check (grep -Eq "^KEY=.+") immediately after writing obs-secrets.env. chmod 600 is the action's final operation, so the file is not left world-readable once the action finishes.
  • Every run: step declares shell: bash. Composite actions do not inherit the workflow's default shell; a step without it fails to run.
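
The difference between the two heredoc forms is easy to reproduce outside CI. The script below is a self-contained sketch with an assumed demo value; env_keys_nonempty mirrors the grep pattern quoted above, while env_has_unexpanded is an additional, hypothetical check that this document does not say the real action performs.

```shell
# Assumed demo value; in the action it comes from the step's env: block.
GRAFANA_ADMIN_PASSWORD='s3cret'
demo_dir=$(mktemp -d)

# Unquoted delimiter: the shell expands $GRAFANA_ADMIN_PASSWORD.
cat > "$demo_dir/unquoted.env" <<EOF
GRAFANA_ADMIN_PASSWORD=$GRAFANA_ADMIN_PASSWORD
EOF

# Quoted delimiter: the body is written verbatim, dollar sign included.
cat > "$demo_dir/quoted.env" <<'EOF'
GRAFANA_ADMIN_PASSWORD=$GRAFANA_ADMIN_PASSWORD
EOF

# Every listed key must be present with a non-empty value.
env_keys_nonempty() {
  local file="$1" key
  shift
  for key in "$@"; do
    grep -Eq "^${key}=.+" "$file" || return 1
  done
}

# Assumed extra check: succeeds when some value is still the literal "$KEY".
env_has_unexpanded() {
  local file="$1" key
  shift
  for key in "$@"; do
    grep -Fxq "${key}=\$${key}" "$file" && return 0
  done
  return 1
}
```

Note that a plain non-empty match accepts both files, because the literal string is itself non-empty; only the second check tells them apart. Whether the real guard needs such a check depends on details of deploy-obs not shown here.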

Adding an input to an action

To thread a new per-environment value (e.g. a new secret) through deploy-obs:

  1. Add it under inputs: in .gitea/actions/deploy-obs/action.yml with required: true and no default:.
  2. Map it in the relevant step's env: block: NEW_KEY: ${{ inputs.new_key }}.
  3. Reference it as $NEW_KEY in the run: script — add a NEW_KEY=$NEW_KEY line to the heredoc and a matching entry to the five-key guard loop.
  4. Pass it from both workflows' with: blocks. That is the whole point of the action: the contract lives in one place, so neither environment can silently drift.
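
Step 4 is the one most easily forgotten, so a crude text check can confirm that a new key is passed from every workflow. This is a sketch under stated assumptions: input_passed_in_all is a hypothetical name, the workflow paths in the usage line are inferred from the .gitea/workflows/ directory mentioned earlier, and the match accepts any indented "key:" line rather than only those under the deploy-obs with: block.

```shell
# Hypothetical drift check: succeeds when every given file contains an
# indented "<key>:" line. $1 = input name, remaining args = workflow files.
input_passed_in_all() {
  local key="$1" file
  shift
  for file in "$@"; do
    grep -Eq "^[[:space:]]+${key}:" "$file" || return 1
  done
}

# Usage (paths assumed):
#   input_passed_in_all new_key .gitea/workflows/nightly.yml .gitea/workflows/release.yml
```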

Gitea vs GitHub Actions Differences

Context Variable Names

| GitHub Actions | Gitea Actions |
| --- | --- |
| github.sha | gitea.sha |
| github.actor | gitea.actor |
| github.repository | gitea.repository |
| github.ref_name | gitea.ref_name |
| secrets.GITHUB_TOKEN | secrets.GITEA_TOKEN (must be created manually) |

Token Name Difference

# GitHub Actions
password: ${{ secrets.GITHUB_TOKEN }}

# Gitea Actions — use a Gitea access token stored as a secret
password: ${{ secrets.GITEA_TOKEN }}

Container Registry

# GitHub Actions — GHCR
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
tags: ghcr.io/${{ github.repository }}/app:${{ github.sha }}

# Gitea Actions — Gitea Package Registry
registry: gitea.example.com
username: ${{ gitea.actor }}
password: ${{ secrets.GITEA_TOKEN }}
tags: gitea.example.com/${{ gitea.repository }}/app:${{ gitea.sha }}
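
When porting an existing workflow, the renames listed in this section are mechanical. The filter below is a minimal sketch covering only the names shown above; to_gitea_context is a hypothetical helper, it works by plain text substitution, and it leaves registry hostnames such as ghcr.io untouched, so those still need a manual edit.

```shell
# Hypothetical migration filter: rewrites the documented context names and
# the token secret on stdin, writing the result to stdout.
to_gitea_context() {
  sed -E \
    -e 's/github\.(sha|actor|repository|ref_name)/gitea.\1/g' \
    -e 's/secrets\.GITHUB_TOKEN/secrets.GITEA_TOKEN/g'
}

# Usage:
#   to_gitea_context < old-workflow.yml > new-workflow.yml
```

Review the resulting diff before committing, since the substitution has no knowledge of YAML structure.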

What Works Identically Between GitHub and Gitea Actions

  • uses: actions/checkout@v4 -- works unchanged
  • uses: actions/setup-java@v4 -- works unchanged
  • uses: actions/setup-node@v4 -- works unchanged
  • uses: actions/cache@v4 -- works unchanged
  • uses: docker/build-push-action@v5 -- works unchanged
  • container: key for running jobs inside a Docker image -- works unchanged
  • Secrets syntax ${{ secrets.MY_SECRET }} -- works unchanged

Full CI Workflow YAML

This is the complete ci.yml workflow, updated for Gitea with key changes highlighted.

# Updated for Gitea — key changes highlighted

name: CI

on:
  push:
  pull_request:

jobs:
  unit-tests:
    name: Unit & Component Tests
    runs-on: ubuntu-latest          # matches runner label registered above
    container:
      image: mcr.microsoft.com/playwright:v1.58.2-noble
    steps:
      - uses: actions/checkout@v4
      - name: Cache node_modules
        id: node-modules-cache   # referenced by the "Install dependencies" condition below
        uses: actions/cache@v4
        with:
          path: frontend/node_modules
          key: node-modules-${{ hashFiles('frontend/package-lock.json') }}
      - name: Install dependencies
        if: steps.node-modules-cache.outputs.cache-hit != 'true'
        run: npm ci
        working-directory: frontend
      - name: Lint
        run: npm run lint
        working-directory: frontend
      - name: Run unit and component tests
        run: npm test
        working-directory: frontend
      - name: Upload screenshots
        if: always()
        uses: actions/upload-artifact@v3  # pinned per ADR-014 — Gitea Actions does not implement v4 protocol. Do NOT upgrade.
        with:
          name: unit-test-screenshots
          path: frontend/test-results/screenshots/

  backend-unit-tests:
    name: Backend Unit Tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with:
          java-version: '21'
          distribution: temurin
      - name: Cache Maven repository
        uses: actions/cache@v4
        with:
          path: ~/.m2/repository
          key: maven-${{ hashFiles('backend/pom.xml') }}
          restore-keys: maven-
      - name: Run backend tests
        run: |
          chmod +x mvnw
          ./mvnw clean test
        working-directory: backend
      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v3  # pinned per ADR-014 — Gitea Actions does not implement v4 protocol. Do NOT upgrade.
        with:
          name: backend-test-results
          path: backend/target/surefire-reports/

  e2e-tests:
    name: E2E Tests
    runs-on: ubuntu-latest
    env:
      DOCKER_API_VERSION: "1.43"
      POSTGRES_USER: archive_user
      POSTGRES_PASSWORD: ci_db_password
      POSTGRES_DB: family_archive_db
      MINIO_ROOT_USER: minio_admin
      MINIO_ROOT_PASSWORD: ci_minio_password
      MINIO_DEFAULT_BUCKETS: archive-documents
      PORT_DB: 5433
      PORT_MINIO_API: 9100
      PORT_MINIO_CONSOLE: 9101
      PORT_BACKEND: 8080
      PORT_FRONTEND: 3000
    steps:
      - uses: actions/checkout@v4
      - name: Cleanup leftover containers
        run: docker compose -f docker-compose.yml -f docker-compose.ci.yml down --volumes --remove-orphans || true
      - name: Start DB and MinIO
        run: docker compose -f docker-compose.yml -f docker-compose.ci.yml up -d db minio create-buckets
      - name: Wait for DB
        run: |
          timeout 30 bash -c \
            'until docker compose -f docker-compose.yml -f docker-compose.ci.yml exec -T db pg_isready -U archive_user; do sleep 2; done'
      - name: Connect job container to compose network
        run: docker network connect familienarchiv_archiv-net $(cat /etc/hostname)
      - uses: actions/setup-java@v4
        with:
          java-version: '21'
          distribution: temurin
      - name: Cache Maven repository
        uses: actions/cache@v4
        with:
          path: ~/.m2/repository
          key: maven-${{ hashFiles('backend/pom.xml') }}
          restore-keys: maven-
      - name: Build backend
        run: |
          chmod +x mvnw
          ./mvnw clean package -DskipTests
        working-directory: backend
      - name: Start backend
        run: |
          java -jar backend/target/*.jar \
            --spring.profiles.active=e2e \
            --SPRING_DATASOURCE_URL=jdbc:postgresql://db:5432/family_archive_db \
            --SPRING_DATASOURCE_USERNAME=archive_user \
            --SPRING_DATASOURCE_PASSWORD=ci_db_password \
            --S3_ENDPOINT=http://minio:9000 \
            --S3_ACCESS_KEY=minio_admin \
            --S3_SECRET_KEY=ci_minio_password \
            --S3_BUCKET_NAME=archive-documents \
            --S3_REGION=us-east-1 \
            --APP_ADMIN_USERNAME=admin \
            --APP_ADMIN_PASSWORD=${{ secrets.E2E_ADMIN_PASSWORD }} \
            &
          timeout 90 bash -c \
            'until curl -sf http://localhost:8080/actuator/health | grep -q "UP"; do sleep 3; done'
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - name: Cache node_modules
        id: node-modules-cache
        uses: actions/cache@v4
        with:
          path: frontend/node_modules
          key: node-modules-${{ hashFiles('frontend/package-lock.json') }}
      - name: Install frontend dependencies
        if: steps.node-modules-cache.outputs.cache-hit != 'true'
        run: npm ci
        working-directory: frontend
      - name: Cache Playwright browsers
        id: playwright-cache
        uses: actions/cache@v4
        with:
          path: ~/.cache/ms-playwright
          key: playwright-chromium-${{ hashFiles('frontend/package-lock.json') }}
      - name: Install Playwright Chromium + system deps
        if: steps.playwright-cache.outputs.cache-hit != 'true'
        run: npx playwright install chromium --with-deps
        working-directory: frontend
      - name: Install Playwright system deps only
        if: steps.playwright-cache.outputs.cache-hit == 'true'
        run: npx playwright install-deps chromium
        working-directory: frontend
      - name: Run E2E tests
        run: npm run test:e2e
        working-directory: frontend
        env:
          E2E_BASE_URL: http://localhost:3000
          E2E_USERNAME: admin
          E2E_PASSWORD: ${{ secrets.E2E_ADMIN_PASSWORD }}   # ← secret, not hardcoded
          E2E_BACKEND_URL: http://localhost:8080
      - name: Upload E2E results
        if: always()
        uses: actions/upload-artifact@v3  # pinned per ADR-014 — Gitea Actions does not implement v4 protocol. Do NOT upgrade.
        with:
          name: e2e-results
          path: frontend/test-results/e2e/
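
The "Wait for DB" and "Start backend" steps above share one polling pattern: retry a probe until it succeeds or a deadline passes. The function below sketches that pattern as a reusable helper. wait_for is a hypothetical name, and unlike the workflow's timeout-wrapped loop it counts only the sleep intervals and does not kill a probe that hangs.

```shell
# Hypothetical polling helper.
# $1 = timeout in seconds, $2 = interval in seconds, rest = probe command.
# Returns 0 as soon as the probe succeeds, 1 once the timeout is used up.
wait_for() {
  local timeout_s="$1" interval_s="$2" elapsed=0
  shift 2
  until "$@"; do
    [ "$elapsed" -ge "$timeout_s" ] && return 1
    sleep "$interval_s"
    elapsed=$((elapsed + interval_s))
  done
}

# Usage, mirroring the backend health wait:
#   wait_for 90 3 curl -sf http://localhost:8080/actuator/health
```

For probes that can block indefinitely, wrapping the probe itself in timeout keeps the overall bound honest.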