# Self-Hosted GitHub Actions Runner (Docker)
Run GitHub Actions CI on your own Linux server instead of GitHub-hosted runners.
This moves CI load off your laptop, avoids GitHub-hosted runner-minute quotas, and can shorten feedback time.
## How It Works
Each runner container:
1. Starts up, generates a short-lived registration token from your GitHub PAT
2. Registers with GitHub in **ephemeral mode** (one job per lifecycle)
3. Picks up a CI job, executes it, and exits
4. Docker's `restart: unless-stopped` brings it back for the next job
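
The lifecycle above can be sketched as a shell entrypoint. This is a minimal illustration, not the repository's actual `entrypoint.sh`: the function names, the `/home/runner` install path, and the `sed`-based parsing of the API response are assumptions. `GITHUB_PAT`, `REPO_URL`, `RUNNER_NAME`, and `RUNNER_LABELS` are the variables described under Configuration.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of an ephemeral-runner entrypoint (not the project's real script).

# Map https://github.com/OWNER/REPO to the REST endpoint that issues registration tokens.
registration_token_url() {
  local repo_path="${1#https://github.com/}"
  repo_path="${repo_path%/}"
  repo_path="${repo_path%.git}"
  printf 'https://api.github.com/repos/%s/actions/runners/registration-token\n' "$repo_path"
}

# Exchange the PAT for a short-lived registration token and print it.
fetch_registration_token() {
  curl -fsSL -X POST \
    -H "Authorization: Bearer ${GITHUB_PAT}" \
    -H "Accept: application/vnd.github+json" \
    "$(registration_token_url "$REPO_URL")" |
    sed -n 's/.*"token": *"\([^"]*\)".*/\1/p'
}

# Register in ephemeral mode, run one job, then exit so Docker restarts the container.
run_once() {
  local token
  token="$(fetch_registration_token)"
  [ -n "$token" ] || { echo "could not obtain a registration token" >&2; return 1; }
  cd /home/runner || return 1   # assumed install location of the runner agent
  ./config.sh --unattended --ephemeral --replace \
    --url "$REPO_URL" --token "$token" \
    --name "$RUNNER_NAME" --labels "${RUNNER_LABELS:-self-hosted,Linux,X64}"
  unset GITHUB_PAT              # keep the PAT out of the job's environment
  exec ./run.sh
}
```

Calling `run_once` as the last command of the container would give the one-job-per-lifecycle behavior: when `run.sh` exits after the job, the container exits and the restart policy starts a fresh one.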
## Prerequisites
- Docker Engine 24+ and Docker Compose v2
- A GitHub Personal Access Token (classic) with **`repo`** and **`read:packages`** scopes
- Network access to `github.com`, `api.github.com`, and `ghcr.io`
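
The Docker prerequisites can be checked with a short preflight script. This is a sketch: `version_at_least` and `preflight` are hypothetical helper names, and the version comparison relies on GNU `sort -V`.

```bash
#!/usr/bin/env bash
# Hypothetical preflight check; function names are illustrative.

# Succeeds when version $1 is greater than or equal to version $2.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

preflight() {
  local engine
  engine="$(docker version --format '{{.Server.Version}}')" || return 1
  version_at_least "$engine" "24.0" || {
    echo "Docker Engine 24+ required, found $engine" >&2
    return 1
  }
  # Compose v2 ships as the `docker compose` plugin.
  docker compose version >/dev/null || {
    echo "Docker Compose v2 not found" >&2
    return 1
  }
  echo "Docker Engine $engine and Compose v2 detected"
}
```

Run `preflight` on the server before deploying; it returns non-zero if either requirement is missing.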
## One-Time GitHub Setup
Before deploying, the repository needs write permissions for the image build workflow.
### Enable GHCR image builds
The `build-runner-image.yml` workflow pushes Docker images to GHCR using the
`GITHUB_TOKEN`. By default, this token is read-only and the workflow will fail
silently (zero steps executed, no runner assigned).
Fix by allowing write permissions for Actions workflows:
```bash
gh api -X PUT repos/OWNER/REPO/actions/permissions/workflow \
  -f default_workflow_permissions=write \
  -F can_approve_pull_request_reviews=false
```
Alternatively, keep read-only defaults and create a dedicated PAT secret with
`write:packages` scope, then reference it in the workflow instead of `GITHUB_TOKEN`.
### Build the runner image
Trigger the GHCR image build (first time and whenever Dockerfile/entrypoint changes):
```bash
gh workflow run build-runner-image.yml
```
Wait for the workflow to complete (~5 min):
```bash
gh run list --workflow=build-runner-image.yml --limit=1
```
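
Rather than polling `gh run list` by hand, the wait can be scripted. The helper below is a sketch with a hypothetical name (`wait_for_workflow`); it looks up the newest run of a workflow and blocks on it with `gh run watch`.

```bash
# Hypothetical helper: block until the newest run of a workflow finishes.
wait_for_workflow() {
  local workflow="$1" run_id
  run_id="$(gh run list --workflow="$workflow" --limit=1 \
    --json databaseId --jq '.[0].databaseId')"
  [ -n "$run_id" ] && [ "$run_id" != "null" ] || {
    echo "no runs found for $workflow" >&2
    return 1
  }
  # --exit-status makes gh return non-zero when the run fails.
  gh run watch "$run_id" --exit-status
}
```

For example, `gh workflow run build-runner-image.yml && sleep 5 && wait_for_workflow build-runner-image.yml` triggers the build and waits for its result; the short sleep is an assumption to give GitHub time to create the run.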
The image is also rebuilt automatically:
- On push to `main` when `infra/runners/Dockerfile` or `entrypoint.sh` changes
- Weekly (Monday 06:00 UTC) to pick up OS patches and runner agent updates
## Deploy on Your Server
### Choose an image source
| Method | Files needed on server | Registry auth? | Best for |
|--------|----------------------|----------------|----------|
| **Self-hosted registry** | `docker-compose.yml`, `.env`, `envs/augur.env` | No (your network) | Production — push once, pull from any machine |
| **GHCR** | `docker-compose.yml`, `.env`, `envs/augur.env` | Yes (`docker login ghcr.io`) | GitHub-native workflow |
| **Build locally** | All 5 files (+ `Dockerfile`, `entrypoint.sh`) | No | Quick start, no registry needed |
### Option A: Self-hosted registry (recommended)
For the full end-to-end workflow (build image on your Mac, push to Unraid registry,
start runner), see the [CI Workflow Guide](../../docs/ci-workflows.md#lifecycle-2-offload-ci-to-a-server-unraid).
The private Docker registry is configured at `infra/registry/`. It listens on port 5000,
accessible from the LAN. Docker treats `localhost` registries as insecure by default —
no `daemon.json` changes needed on the server. To push from another machine, add
`<UNRAID_IP>:5000` to `insecure-registries` in that machine's Docker daemon config.
### Option B: GHCR
Requires the `build-runner-image.yml` workflow to have run successfully
(see [One-Time GitHub Setup](#one-time-github-setup)).
```bash
# 1. Copy environment templates
cp .env.example .env
cp envs/augur.env.example envs/augur.env
# 2. Edit .env — set your GITHUB_PAT
# 3. Edit envs/augur.env — set REPO_URL, RUNNER_NAME, resource limits
# 4. Authenticate Docker with GHCR (one-time, persists to ~/.docker/config.json)
echo "$GITHUB_PAT" | docker login ghcr.io -u YOUR_GITHUB_USERNAME --password-stdin
# 5. Pull and start
docker compose pull
docker compose up -d
# 6. Verify runner is registered
docker compose ps
docker compose logs -f runner-augur
```
### Option C: Build locally
No registry needed — builds the image directly on the target machine.
Requires `Dockerfile` and `entrypoint.sh` alongside the compose file.
```bash
# 1. Copy environment templates
cp .env.example .env
cp envs/augur.env.example envs/augur.env
# 2. Edit .env — set your GITHUB_PAT
# 3. Edit envs/augur.env — set REPO_URL, RUNNER_NAME, resource limits
# 4. Build and start
docker compose up -d --build
# 5. Verify runner is registered
docker compose ps
docker compose logs -f runner-augur
```
### Verify the runner is online in GitHub
```bash
gh api repos/OWNER/REPO/actions/runners \
  --jq '.runners[] | {name, status, labels: [.labels[].name]}'
```
## Activate Self-Hosted CI
Set the repository variable `CI_RUNS_ON` so the CI workflow targets your runner:
```bash
gh variable set CI_RUNS_ON --body '["self-hosted", "Linux", "X64"]'
```
To revert to GitHub-hosted runners:
```bash
gh variable delete CI_RUNS_ON
```
## Configuration
### Shared Config (`.env`)
| Variable | Required | Description |
|----------|----------|-------------|
| `GITHUB_PAT` | Yes | GitHub PAT with `repo` + `read:packages` scopes |
### Per-Repo Config (`envs/<repo>.env`)
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `REPO_URL` | Yes | — | Full GitHub repository URL |
| `RUNNER_NAME` | Yes | — | Unique runner name within the repo |
| `RUNNER_LABELS` | No | `self-hosted,Linux,X64` | Comma-separated runner labels |
| `RUNNER_GROUP` | No | `default` | Runner group |
| `RUNNER_IMAGE` | No | `ghcr.io/aiinfuseds/augur-runner:latest` | Docker image to use |
| `RUNNER_CPUS` | No | `6` | CPU limit for the container |
| `RUNNER_MEMORY` | No | `12G` | Memory limit for the container |
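
Before starting the stack, it can help to confirm that the required variables from the tables above are actually set. The function below is a sketch under the assumption that the env files use plain `NAME=value` lines; `check_env_file` is a hypothetical name.

```bash
# Hypothetical check: confirm an env file defines the given variables.
check_env_file() {
  local file="$1"
  shift
  local var missing=0
  for var in "$@"; do
    # A variable counts as set when a line reads NAME=<non-empty value>.
    if ! grep -Eq "^${var}=.+" "$file"; then
      echo "missing or empty: $var in $file" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Usage: `check_env_file .env GITHUB_PAT && check_env_file envs/augur.env REPO_URL RUNNER_NAME`.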
## Adding More Repos
1. Copy the per-repo env template:
```bash
cp envs/augur.env.example envs/myrepo.env
```
2. Edit `envs/myrepo.env` — set `REPO_URL`, `RUNNER_NAME`, and resource limits.
3. Add a service block to `docker-compose.yml`:
```yaml
  runner-myrepo:
    image: ${RUNNER_IMAGE:-ghcr.io/aiinfuseds/augur-runner:latest}
    build: .
    env_file:
      - .env
      - envs/myrepo.env
    init: true
    read_only: true
    tmpfs:
      - /tmp:size=2G
    security_opt:
      - no-new-privileges:true
    stop_grace_period: 5m
    deploy:
      resources:
        limits:
          cpus: "${RUNNER_CPUS:-6}"
          memory: "${RUNNER_MEMORY:-12G}"
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "pgrep", "-f", "Runner.Listener"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 30s
    logging:
      driver: json-file
      options:
        max-size: "50m"
        max-file: "3"
    volumes:
      - myrepo-work:/home/runner/_work
```
4. Add the volume at the bottom of `docker-compose.yml`:
```yaml
volumes:
  augur-work:
  myrepo-work:
```
5. Start: `docker compose up -d`
## Scaling
Run multiple concurrent runners for the same repo:
```bash
# Scale to 3 runners for augur
docker compose up -d --scale runner-augur=3
```
Each container gets a unique runner name (Docker appends a suffix).
Set `RUNNER_NAME` to a base name like `unraid-augur` — scaled instances become
`unraid-augur-1`, `unraid-augur-2`, etc.
## Resource Tuning
Each repo can have different resource limits in its env file:
```env
# Lightweight repo (linting only)
RUNNER_CPUS=2
RUNNER_MEMORY=4G
# Heavy repo (Go builds + extensive tests)
RUNNER_CPUS=8
RUNNER_MEMORY=16G
```
### tmpfs Sizing
The `/tmp` tmpfs defaults to 2G. If your CI writes large temp files,
increase it in `docker-compose.yml`:
```yaml
tmpfs:
  - /tmp:size=4G
```
## Monitoring
```bash
# Container status and health
docker compose ps
# Live logs
docker compose logs -f runner-augur
# Last 50 log lines
docker compose logs --tail 50 runner-augur
# Resource usage (resolves the Compose-generated container names)
docker stats $(docker compose ps -q runner-augur)
```
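
The healthcheck result can also be read directly. This sketch assumes the service defines a healthcheck (as in the service block under Adding More Repos); `service_health` is a hypothetical helper name.

```bash
# Hypothetical helper: print the health status of every container of a Compose service.
service_health() {
  local service="$1" id
  for id in $(docker compose ps -q "$service"); do
    # .State.Health is only populated when the container has a healthcheck.
    docker inspect --format '{{.Name}} {{.State.Health.Status}}' "$id"
  done
}
```

Running `service_health runner-augur` prints one line per container, which is handy when the service has been scaled.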
## Updating the Runner Image
To pull the latest GHCR image:
```bash
docker compose pull
docker compose up -d
```
To rebuild locally:
```bash
docker compose build
docker compose up -d
```
### Using a Self-Hosted Registry
See the [CI Workflow Guide](../../docs/ci-workflows.md#lifecycle-2-offload-ci-to-a-server-unraid)
for the full build-push-start workflow with a self-hosted registry.
## Troubleshooting
### Image build workflow fails with zero steps
The `build-runner-image.yml` workflow needs `packages: write` permission.
If the repo's default workflow permissions are read-only, the job fails
instantly (0 steps, no runner assigned). See [One-Time GitHub Setup](#one-time-github-setup).
### `docker compose pull` returns "access denied" or 403
The GHCR package inherits the repository's visibility. For private repos,
authenticate Docker first:
```bash
echo "$GITHUB_PAT" | docker login ghcr.io -u USERNAME --password-stdin
```
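Or make the package public in the GitHub web UI (package page → Package settings → Change visibility).
The REST API's package endpoints cover listing, deleting, and restoring packages, but
do not appear to offer a way to change visibility, so this step is not scriptable with `gh api`.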
Or skip GHCR entirely and build locally: `docker compose build`.
### Runner doesn't appear in GitHub
1. Check logs: `docker compose logs runner-augur`
2. Verify `GITHUB_PAT` has `repo` scope
3. Verify `REPO_URL` is correct (full HTTPS URL)
4. Check network: `docker compose exec runner-augur curl -s https://api.github.com`
### Runner appears "offline"
The runner may have exited after a job. Check:
```bash
docker compose ps # Is the container running?
docker compose restart runner-augur # Force restart
```
### OOM (Out of Memory) kills
Increase `RUNNER_MEMORY` in the per-repo env file:
```env
RUNNER_MEMORY=16G
```
Then: `docker compose up -d`
### Stale/ghost runners in GitHub
Ephemeral runners deregister automatically after each job. If a container
was killed ungracefully (power loss, `docker kill`), the runner may appear
stale. GitHub removes an ephemeral runner automatically once it has been
disconnected for more than a day; to clear it sooner, remove it manually:
```bash
# List runners
gh api repos/OWNER/REPO/actions/runners --jq '.runners[] | {id, name, status}'
# Remove stale runner by ID
gh api -X DELETE repos/OWNER/REPO/actions/runners/RUNNER_ID
```
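
The two commands above can be combined into one cleanup pass. This is a sketch, with `remove_offline_runners` as a hypothetical name; it only sees the first page of results (GitHub returns 30 runners per page by default), which is assumed to be enough for a small setup.

```bash
# Hypothetical cleanup: delete every runner GitHub reports as offline.
remove_offline_runners() {
  local repo="$1" id
  gh api "repos/$repo/actions/runners" \
    --jq '.runners[] | select(.status == "offline") | .id' |
    while read -r id; do
      echo "removing runner $id"
      # </dev/null keeps gh from consuming the list of IDs on stdin.
      gh api -X DELETE "repos/$repo/actions/runners/$id" </dev/null
    done
}
```

Usage: `remove_offline_runners OWNER/REPO`. Review the list first, since the deletion cannot be undone.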
### Disk space
Check work directory volume usage:
```bash
docker system df -v
```
Clean up unused volumes:
```bash
docker compose down -v   # Stop runners and remove this project's work volumes
docker volume prune      # Remove unused anonymous volumes (add --all to include named ones)
```
## Unraid Notes
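- **Docker login persistence**: `docker login ghcr.io` writes credentials to
  `/root/.docker/config.json`. Unraid runs its root filesystem from RAM, so `/root`
  is typically rebuilt on every boot; only the USB flash drive (mounted at `/boot`)
  persists. After a reboot, check whether `/root/.docker/config.json` still exists.
  If it does not, rerun `docker login`, or keep a copy on `/boot` and restore it at startup.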
- **Compose file location**: Place the 3 files (`docker-compose.yml`, `.env`,
`envs/augur.env`) in a share directory (e.g., `/mnt/user/appdata/augur-runner/`).
- **Alternative to GHCR**: If you don't want to deal with registry auth on Unraid,
copy the `Dockerfile` and `entrypoint.sh` alongside the compose file and use
`docker compose up -d --build` instead. No registry needed.
## Security
| Measure | Description |
|---------|-------------|
| Ephemeral mode | Fresh runner state per job — no cross-job contamination |
| PAT scope isolation | PAT generates a short-lived registration token; PAT never touches the runner agent |
| Non-root user | Runner process runs as UID 1000, not root |
| no-new-privileges | Prevents privilege escalation via setuid/setgid binaries |
| tini (PID 1) | Proper signal forwarding and zombie process reaping |
| Log rotation | Prevents disk exhaustion from verbose CI output (50MB x 3 files) |
### PAT Scope
Use the minimum scope required:
- **Classic token**: `repo` + `read:packages` scopes
- **Fine-grained token**: Repository access → Only select repositories → Read and Write for Administration
### Network Considerations
The runner container needs outbound access to:
- `github.com` (clone repos, download actions)
- `api.github.com` (registration, status)
- `ghcr.io` (pull runner image — only if using GHCR)
- Package registries (`proxy.golang.org`, `registry.npmjs.org`, etc.)
No inbound ports are required.
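
Outbound access can be probed from inside a running runner container. The sketch below assumes `curl` is available in the image (the troubleshooting steps already rely on it); `check_outbound` is a hypothetical name, and any HTTP response counts as reachable.

```bash
# Hypothetical probe: report which hosts are reachable from inside a Compose service.
check_outbound() {
  local service="$1" host failed=0
  shift
  for host in "$@"; do
    if docker compose exec -T "$service" \
      curl -sS -o /dev/null --max-time 10 "https://$host"; then
      echo "ok   $host"
    else
      echo "FAIL $host"
      failed=1
    fi
  done
  return "$failed"
}
```

Usage: `check_outbound runner-augur github.com api.github.com ghcr.io proxy.golang.org registry.npmjs.org`.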
## Stopping and Removing
```bash
# Stop runners (waits for stop_grace_period)
docker compose down
# Stop and remove work volumes
docker compose down -v
# Stop, remove volumes, and delete the locally built image
docker compose down -v --rmi local
```