diff --git a/docs/adr/015-dood-workspace-bind-mount.md b/docs/adr/015-dood-workspace-bind-mount.md
new file mode 100644
index 00000000..53bf73d1
--- /dev/null
+++ b/docs/adr/015-dood-workspace-bind-mount.md
@@ -0,0 +1,69 @@
+# ADR-015: DooD workspace bind mount for Compose file bind-mount resolution
+
+## Status
+
+Accepted
+
+## Context
+
+The deploy workflows (`.gitea/workflows/nightly.yml`, `release.yml`) run job steps inside Docker containers via Docker-out-of-Docker (DooD): the Gitea runner mounts the host Docker socket, and act_runner spawns sibling containers for each job.
+
+When a job step calls `docker compose -f docker-compose.observability.yml up`, Docker Compose resolves relative bind-mount sources against `$(pwd)` inside the job container and passes the resulting absolute paths to the **host** daemon. For example, `./infra/observability/prometheus/prometheus.yml` becomes `/some/path/infra/observability/prometheus/prometheus.yml`, and the host daemon tries to bind-mount that path from the **host filesystem**.
+
+In the default DooD setup (`runner-config.yaml` with only `valid_volumes: ["/var/run/docker.sock"]`), job container workspaces live in the act_runner overlay2 layer. The host has no corresponding directory at the job container's `$(pwd)` path, so the daemon auto-creates an empty directory in its place. The container then fails to start because the mount target was expected to be a file, not a directory:
+
+```
+error mounting "…/prometheus/prometheus.yml" to rootfs at "/etc/prometheus/prometheus.yml": not a directory
+```
+
+This affected all five config file bind mounts in `docker-compose.observability.yml`.
+
+## Decision
+
+Configure act_runner to store job workspaces on a real host path (`/srv/gitea-workspace`) and mount that path into both the runner container and every job container at the **same absolute path**. The identity of the host path and container path is the key constraint: Compose resolves to an absolute path and hands it to the host daemon, which looks for that exact path on the host filesystem.
+
+**runner-config.yaml changes:**
+
+```yaml
+container:
+  workdir_parent: /srv/gitea-workspace
+  valid_volumes:
+    - "/var/run/docker.sock"
+    - "/srv/gitea-workspace"
+  options: "-v /srv/gitea-workspace:/srv/gitea-workspace"
+```
+
+**Runner compose.yaml change** (host side — not in this repo):
+
+```yaml
+runner:
+  volumes:
+    - /srv/gitea-workspace:/srv/gitea-workspace
+```
+
+With this in place, `$(pwd)` inside a job container resolves to `/srv/gitea-workspace///`, which is a real directory on the host. Compose-managed bind mounts from that directory work without any additional steps.
+
+## Alternatives Considered
+
+| Alternative | Why rejected |
+|---|---|
+| **overlay2 `MergedDir` sync via privileged nsenter** (the previous approach, see PR #599 v1) | Required `--privileged --pid=host` (effective root on the host) plus fragile overlay2 driver assumption. Introduced stale-file risk on the host and a second stable path (`/srv/familienarchiv-*/obs-configs`) to maintain separately from the source tree. Replaced by this ADR. |
+| **Build configs into a dedicated Docker image** (pattern used for MinIO bootstrap, see `infra/minio/Dockerfile`) | Viable for static files that change infrequently. Requires a build step and an image rebuild every time a config changes. Appropriate for bootstrap scripts; too heavy for frequently-tuned observability configs. |
+| **Add workspace directory to runner-config `valid_volumes` only** (without `workdir_parent`) | `valid_volumes` whitelists paths that workflow steps may reference, but does not change where act_runner stores workspaces. Without `workdir_parent`, the workspace would still be in overlay2 and the bind-mount resolution problem would remain. |
+| **Map workspace under a different host path than container path** (e.g. host `/srv/workspace`, container `/workspace`) | Compose resolves to the container-internal path (e.g. `/workspace/…`) and passes that to the host daemon. The host daemon interprets the source as a host path. If host `/workspace` does not exist, the daemon creates an empty directory — the original bug. The paths must be identical. |
+
+## Consequences
+
+- `/srv/gitea-workspace` must exist on the VPS before the runner starts. The directory was created as part of this change; it is not created automatically.
+- The runner container's `compose.yaml` (maintained outside this repo at `~/docker/gitea/compose.yaml` on the VPS) must include the `- /srv/gitea-workspace:/srv/gitea-workspace` volume line. This is an out-of-band operational dependency; the prerequisite is documented in `runner-config.yaml`.
+- `workdir_parent` applies to all jobs on this runner. Any future workflow that calls `docker compose` with relative bind mounts benefits automatically without further configuration.
+- Job workspaces persist across runs under `/srv/gitea-workspace`. act_runner manages per-run subdirectory cleanup. Orphaned directories from interrupted runs should be cleaned up manually if disk space becomes a concern.
+- Workflows that previously relied on `OBS_CONFIG_DIR` env var or the `obs-configs` stable path on the host no longer need those. Both were removed in this PR.
+- This pattern does **not** apply to the `nsenter`-based Caddy reload step (ADR-012), which manages a host systemd service — a different problem class with no bind-mount equivalent.
+
+## References
+
+- ADR-011 — single-tenant runner trust model
+- ADR-012 — nsenter via privileged container for host service management
+- Issue #598 — original observability stack bind-mount failure
+- `runner-config.yaml` — `workdir_parent`, `valid_volumes`, `options`
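
Review note: the same-path constraint from the Decision section could be guarded by a small preflight check before the runner starts. The following is only a sketch under stated assumptions: the function names `same_path_mount` and `workspace_ready` are hypothetical and do not exist in this repo, and the check only inspects a `host:container[:mode]` volume string rather than querying Docker.

```shell
#!/bin/sh
# Hypothetical preflight helpers (not part of this repo).

# same_path_mount SPEC
# Succeeds only when a "host:container[:mode]" volume spec maps a path
# onto the identical absolute path, as this ADR requires.
same_path_mount() {
    spm_spec=$1
    case $spm_spec in
        *:*) ;;            # has a host part and a container part
        *) return 1 ;;     # no colon: not a host bind mount
    esac
    spm_host=${spm_spec%%:*}        # text before the first colon
    spm_rest=${spm_spec#*:}         # text after the first colon
    spm_container=${spm_rest%%:*}   # drop an optional ":ro"/":rw" suffix
    [ "$spm_host" = "$spm_container" ]
}

# workspace_ready DIR
# Succeeds only when DIR already exists as a directory; per the
# Consequences section it is not created automatically.
workspace_ready() {
    [ -d "$1" ]
}
```

A possible use on the VPS, assuming the path from this ADR: `same_path_mount "/srv/gitea-workspace:/srv/gitea-workspace" && workspace_ready /srv/gitea-workspace || echo "runner prerequisites missing"`. The rejected mapping `/srv/workspace:/workspace` fails the first check, which mirrors the last row of the alternatives table.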