# Self-Hosted GitHub Actions Runner (Docker)

Run GitHub Actions CI on your own Linux server instead of GitHub-hosted runners.
This takes CI load off your laptop, avoids runner-minute quotas, and gives faster feedback.

## How It Works

Each runner container:
1. Starts up and uses your GitHub PAT to request a short-lived registration token
2. Registers with GitHub in **ephemeral mode** (one job per lifecycle)
3. Picks up a CI job, executes it, and exits
4. Is restarted by Docker's `restart: unless-stopped` policy, ready for the next job

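The lifecycle above can be sketched as a small entrypoint script. This is an illustrative sketch only, not the repository's actual `entrypoint.sh`: the helper names (`repo_slug`, `fetch_registration_token`, `main`) and the `RUN_ENTRYPOINT` guard are hypothetical, and it assumes `curl` and `jq` are installed in the image and that the script runs from the runner agent's install directory.

```bash
#!/usr/bin/env bash
# Hypothetical entrypoint sketch; helper names are illustrative assumptions.

# Convert https://github.com/OWNER/REPO(.git) into OWNER/REPO for API calls.
repo_slug() {
  local url="${1%.git}"
  url="${url%/}"
  printf '%s\n' "${url#https://github.com/}"
}

# Exchange the PAT for a short-lived runner registration token.
fetch_registration_token() {
  curl -fsSL -X POST \
    -H "Authorization: Bearer ${GITHUB_PAT}" \
    -H "Accept: application/vnd.github+json" \
    "https://api.github.com/repos/$(repo_slug "$REPO_URL")/actions/runners/registration-token" \
    | jq -r '.token'
}

# Register in ephemeral mode, run one job, then exit so Docker restarts the container.
main() {
  local token
  token="$(fetch_registration_token)"
  ./config.sh --unattended --ephemeral \
    --url "$REPO_URL" --token "$token" \
    --name "$RUNNER_NAME" \
    --labels "${RUNNER_LABELS:-self-hosted,Linux,X64}"
  unset GITHUB_PAT   # keep the PAT out of the job environment
  exec ./run.sh
}

# Hypothetical guard: only run when asked, so the helpers can be sourced and tested.
if [ "${RUN_ENTRYPOINT:-0}" = "1" ]; then
  main
fi
```

Because the runner is registered with `--ephemeral`, `run.sh` returns after a single job, which is what lets the restart policy hand the next job a fresh registration.
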
## Prerequisites

- Docker Engine 24+ and Docker Compose v2
- A GitHub Personal Access Token (classic) with **`repo`** and **`read:packages`** scopes
- Network access to `github.com`, `api.github.com`, and `ghcr.io`

## One-Time GitHub Setup

Before deploying, the repository needs write permissions for the image build workflow.

### Enable GHCR image builds

The `build-runner-image.yml` workflow pushes Docker images to GHCR using the
`GITHUB_TOKEN`. By default, this token is read-only and the workflow will fail
silently (zero steps executed, no runner assigned).

Fix this by allowing write permissions for Actions workflows:

```bash
gh api -X PUT repos/OWNER/REPO/actions/permissions/workflow \
  -f default_workflow_permissions=write \
  -F can_approve_pull_request_reviews=false
```

Alternatively, keep the read-only default and create a dedicated PAT secret with
`write:packages` scope, then reference it in the workflow instead of `GITHUB_TOKEN`.

### Build the runner image

Trigger the GHCR image build (the first time, and whenever the Dockerfile or entrypoint changes):

```bash
gh workflow run build-runner-image.yml
```

Wait for the workflow to complete (~5 min):

```bash
gh run list --workflow=build-runner-image.yml --limit=1
```

The image is also rebuilt automatically:
- On push to `main` when `infra/runners/Dockerfile` or `entrypoint.sh` changes
- Weekly (Monday 06:00 UTC) to pick up OS patches and runner agent updates

## Deploy on Your Server

### Choose an image source

| Method | Files needed on server | Registry auth? | Best for |
|--------|----------------------|----------------|----------|
| **Self-hosted registry** | `docker-compose.yml`, `.env`, `envs/augur.env` | No (your network) | Production: push once, pull from any machine |
| **GHCR** | `docker-compose.yml`, `.env`, `envs/augur.env` | Yes (`docker login ghcr.io`) | GitHub-native workflow |
| **Build locally** | All 5 files (+ `Dockerfile`, `entrypoint.sh`) | No | Quick start, no registry needed |

### Option A: Self-hosted registry (recommended)

For the full end-to-end workflow (build image on your Mac, push to Unraid registry,
start runner), see the [CI Workflow Guide](../../docs/ci-workflows.md#lifecycle-2-offload-ci-to-a-server-unraid).

The private Docker registry is configured at `infra/registry/`. It listens on port 5000
and is reachable from the LAN. Docker treats `localhost` registries as insecure by default,
so no `daemon.json` changes are needed on the server itself. To push from another machine, add
`<UNRAID_IP>:5000` to `insecure-registries` in that machine's Docker daemon config.

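As a rough sketch of the push step from another machine, the helpers below build the registry image reference and then tag and push a locally built image. The function names and the local tag `augur-runner:latest` are assumptions for illustration; the CI Workflow Guide remains the authoritative procedure.

```bash
# Build the image reference for the private registry on port 5000.
# The default image name and tag are assumptions for illustration.
registry_ref() {
  local host="$1" name="${2:-augur-runner}" tag="${3:-latest}"
  printf '%s:5000/%s:%s\n' "$host" "$name" "$tag"
}

# Tag a locally built image and push it to the registry.
# Requires Docker, plus "<UNRAID_IP>:5000" listed under "insecure-registries"
# in this machine's Docker daemon config.
push_runner_image() {
  local ref
  ref="$(registry_ref "$1")"
  docker tag augur-runner:latest "$ref"
  docker push "$ref"
}

# Usage (replace the placeholder with your server's address):
#   push_runner_image "<UNRAID_IP>"
```

On the server, setting `RUNNER_IMAGE` to the same reference should make `docker compose pull` fetch from the private registry instead of GHCR.
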
### Option B: GHCR

Requires the `build-runner-image.yml` workflow to have run successfully
(see [One-Time GitHub Setup](#one-time-github-setup)).

```bash
# 1. Copy environment templates
cp .env.example .env
cp envs/augur.env.example envs/augur.env

# 2. Edit .env: set your GITHUB_PAT
# 3. Edit envs/augur.env: set REPO_URL, RUNNER_NAME, resource limits

# 4. Authenticate Docker with GHCR (one-time, persists to ~/.docker/config.json)
echo "$GITHUB_PAT" | docker login ghcr.io -u YOUR_GITHUB_USERNAME --password-stdin

# 5. Pull and start
docker compose pull
docker compose up -d

# 6. Verify runner is registered
docker compose ps
docker compose logs -f runner-augur
```

### Option C: Build locally

No registry needed: this builds the image directly on the target machine.
Requires `Dockerfile` and `entrypoint.sh` alongside the compose file.

```bash
# 1. Copy environment templates
cp .env.example .env
cp envs/augur.env.example envs/augur.env

# 2. Edit .env: set your GITHUB_PAT
# 3. Edit envs/augur.env: set REPO_URL, RUNNER_NAME, resource limits

# 4. Build and start
docker compose up -d --build

# 5. Verify runner is registered
docker compose ps
docker compose logs -f runner-augur
```

### Verify the runner is online in GitHub

```bash
gh api repos/OWNER/REPO/actions/runners \
  --jq '.runners[] | {name, status, labels: [.labels[].name]}'
```

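Registration can take a few seconds after `docker compose up -d`, so a one-off query may run too early. A minimal polling sketch, assuming the `gh` CLI is authenticated; `wait_for` and `runner_online` are hypothetical helper names, not part of this repository:

```bash
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Succeeds when the named runner reports status "online".
runner_online() {
  gh api "repos/$1/actions/runners" \
    --jq ".runners[] | select(.name == \"$2\") | .status" | grep -qx online
}

# Usage: poll every 5 seconds for up to a minute.
#   wait_for 12 5 runner_online OWNER/REPO unraid-augur
```
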
## Activate Self-Hosted CI

Set the repository variable `CI_RUNS_ON` so the CI workflow targets your runner:

```bash
gh variable set CI_RUNS_ON --body '["self-hosted", "Linux", "X64"]'
```

To revert to GitHub-hosted runners:
```bash
gh variable delete CI_RUNS_ON
```

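If you customise `RUNNER_LABELS`, the JSON array in `CI_RUNS_ON` has to list the same labels or jobs will not be routed to the runner. One way to keep the two in sync is to derive the JSON from the comma-separated label string. This is a sketch; `labels_to_json` is a hypothetical helper, and it assumes labels contain no quotes or commas.

```bash
# Turn "self-hosted,Linux,X64" into ["self-hosted","Linux","X64"].
# Whitespace around each label is trimmed.
labels_to_json() {
  printf '%s' "$1" | awk -F',' '{
    out = "["
    for (i = 1; i <= NF; i++) {
      gsub(/^[ \t]+|[ \t]+$/, "", $i)
      out = out (i > 1 ? "," : "") "\"" $i "\""
    }
    print out "]"
  }'
}

# Usage:
#   gh variable set CI_RUNS_ON --body "$(labels_to_json "$RUNNER_LABELS")"
```
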
## Configuration

### Shared Config (`.env`)

| Variable | Required | Description |
|----------|----------|-------------|
| `GITHUB_PAT` | Yes | GitHub PAT with `repo` + `read:packages` scope |

### Per-Repo Config (`envs/<repo>.env`)

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `REPO_URL` | Yes | (none) | Full GitHub repository URL |
| `RUNNER_NAME` | Yes | (none) | Unique runner name within the repo |
| `RUNNER_LABELS` | No | `self-hosted,Linux,X64` | Comma-separated runner labels |
| `RUNNER_GROUP` | No | `default` | Runner group |
| `RUNNER_IMAGE` | No | `ghcr.io/aiinfuseds/augur-runner:latest` | Docker image to use |
| `RUNNER_CPUS` | No | `6` | CPU limit for the container |
| `RUNNER_MEMORY` | No | `12G` | Memory limit for the container |

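A quick preflight check can catch a missing or malformed required value before the container starts. The sketch below assumes unquoted `KEY=value` lines in the env file; the helper names (`valid_repo_url`, `env_value`, `preflight`) are hypothetical.

```bash
# Return 0 when the value looks like a full HTTPS GitHub repository URL.
valid_repo_url() {
  case "$1" in
    https://github.com/*/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the value of KEY from an env file (last assignment wins).
# Assumes plain, unquoted KEY=value lines.
env_value() {
  local file="$1" key="$2"
  sed -n "s/^${key}=//p" "$file" | tail -n 1
}

# Check the two required per-repo settings.
preflight() {
  local file="$1" url name
  url="$(env_value "$file" REPO_URL)"
  name="$(env_value "$file" RUNNER_NAME)"
  if ! valid_repo_url "$url"; then
    echo "REPO_URL must be a full https://github.com/OWNER/REPO URL" >&2
    return 1
  fi
  if [ -z "$name" ]; then
    echo "RUNNER_NAME is required" >&2
    return 1
  fi
  echo "ok: $name -> $url"
}

# Usage:
#   preflight envs/augur.env && docker compose up -d
```
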
## Adding More Repos

1. Copy the per-repo env template:
   ```bash
   cp envs/augur.env.example envs/myrepo.env
   ```

2. Edit `envs/myrepo.env`: set `REPO_URL`, `RUNNER_NAME`, and resource limits.

3. Add a service block to `docker-compose.yml`:
   ```yaml
     runner-myrepo:
       image: ${RUNNER_IMAGE:-ghcr.io/aiinfuseds/augur-runner:latest}
       build: .
       env_file:
         - .env
         - envs/myrepo.env
       init: true
       read_only: true
       tmpfs:
         - /tmp:size=2G
       security_opt:
         - no-new-privileges:true
       stop_grace_period: 5m
       deploy:
         resources:
           limits:
             cpus: "${RUNNER_CPUS:-6}"
             memory: "${RUNNER_MEMORY:-12G}"
       restart: unless-stopped
       healthcheck:
         test: ["CMD", "pgrep", "-f", "Runner.Listener"]
         interval: 30s
         timeout: 5s
         retries: 3
         start_period: 30s
       logging:
         driver: json-file
         options:
           max-size: "50m"
           max-file: "3"
       volumes:
         - myrepo-work:/home/runner/_work
   ```

4. Add the volume at the bottom of `docker-compose.yml`:
   ```yaml
   volumes:
     augur-work:
     myrepo-work:
   ```

5. Start: `docker compose up -d`

## Scaling

Run multiple concurrent runners for the same repo:

```bash
# Scale to 3 runners for augur
docker compose up -d --scale runner-augur=3
```

Each container gets a unique runner name (Docker appends a suffix).
Set `RUNNER_NAME` to a base name like `unraid-augur`; scaled instances become
`unraid-augur-1`, `unraid-augur-2`, etc.

## Resource Tuning

Each repo can have different resource limits in its env file:

```env
# Lightweight repo (linting only)
RUNNER_CPUS=2
RUNNER_MEMORY=4G

# Heavy repo (Go builds + extensive tests)
RUNNER_CPUS=8
RUNNER_MEMORY=16G
```

### tmpfs Sizing

The `/tmp` tmpfs defaults to 2G. If your CI writes large temp files,
increase it in `docker-compose.yml`:

```yaml
tmpfs:
  - /tmp:size=4G
```

## Monitoring

```bash
# Container status and health
docker compose ps

# Live logs
docker compose logs -f runner-augur

# Last 50 log lines
docker compose logs --tail 50 runner-augur

# Resource usage (resolve the Compose-generated container name first)
docker stats $(docker compose ps -q runner-augur)
```

## Updating the Runner Image

To pull the latest GHCR image:
```bash
docker compose pull
docker compose up -d
```

To rebuild locally:
```bash
docker compose build
docker compose up -d
```

### Using a Self-Hosted Registry

See the [CI Workflow Guide](../../docs/ci-workflows.md#lifecycle-2-offload-ci-to-a-server-unraid)
for the full build-push-start workflow with a self-hosted registry.

## Troubleshooting

### Image build workflow fails with zero steps

The `build-runner-image.yml` workflow needs `packages: write` permission.
If the repo's default workflow permissions are read-only, the job fails
instantly (0 steps, no runner assigned). See [One-Time GitHub Setup](#one-time-github-setup).

### `docker compose pull` returns "access denied" or 403

The GHCR package inherits the repository's visibility. For private repos,
authenticate Docker first:

```bash
echo "$GITHUB_PAT" | docker login ghcr.io -u USERNAME --password-stdin
```

Or make the package public from the package's settings page on GitHub
(Package settings, then Change visibility). The REST API for packages does not
appear to offer a visibility endpoint, so this is a web UI step.

Or skip GHCR entirely and build locally: `docker compose build`.

### Runner doesn't appear in GitHub

1. Check logs: `docker compose logs runner-augur`
2. Verify `GITHUB_PAT` has `repo` scope
3. Verify `REPO_URL` is correct (full HTTPS URL)
4. Check network: `docker compose exec runner-augur curl -s https://api.github.com`

### Runner appears "offline"

The runner may have exited after a job. Check:
```bash
docker compose ps                     # Is the container running?
docker compose restart runner-augur   # Force restart
```

### OOM (Out of Memory) kills

Increase `RUNNER_MEMORY` in the per-repo env file:
```env
RUNNER_MEMORY=16G
```

Then: `docker compose up -d`

### Stale/ghost runners in GitHub

Ephemeral runners deregister automatically after each job. If a container
was killed ungracefully (power loss, `docker kill`), the runner may linger in
GitHub as stale. It will auto-expire after a few hours, or you can remove it manually:

```bash
# List runners
gh api repos/OWNER/REPO/actions/runners --jq '.runners[] | {id, name, status}'

# Remove stale runner by ID
gh api -X DELETE repos/OWNER/REPO/actions/runners/RUNNER_ID
```

### Disk space

Check work directory volume usage:
```bash
docker system df -v
```

Clean up unused volumes:
```bash
docker compose down -v   # Stop runners and remove their work volumes
docker volume prune      # Remove unused anonymous volumes (add --all to include named ones)
```

## Unraid Notes

- **Docker login persistence**: `docker login ghcr.io` writes credentials to
  `/root/.docker/config.json`. Unraid normally runs its root filesystem from RAM,
  so `/root` may not survive a reboot; only the USB flash drive (`/boot`) is
  persistent. After a reboot, check with `cat /root/.docker/config.json` and
  re-run `docker login` if the file is gone.
- **Compose file location**: Place the 3 files (`docker-compose.yml`, `.env`,
  `envs/augur.env`) in a share directory (e.g., `/mnt/user/appdata/augur-runner/`).
- **Alternative to GHCR**: If you don't want to deal with registry auth on Unraid,
  copy the `Dockerfile` and `entrypoint.sh` alongside the compose file and use
  `docker compose up -d --build` instead. No registry needed.

## Security

| Measure | Description |
|---------|-------------|
| Ephemeral mode | Fresh runner state per job, so no cross-job contamination |
| PAT scope isolation | The PAT is used only to request a short-lived registration token; it is never passed to the runner agent |
| Non-root user | Runner process runs as UID 1000, not root |
| no-new-privileges | Prevents privilege escalation via setuid/setgid binaries |
| tini (PID 1) | Proper signal forwarding and zombie process reaping |
| Log rotation | Prevents disk exhaustion from verbose CI output (50 MB x 3 files) |

### PAT Scope

Use the minimum scope required:
- **Classic token**: `repo` + `read:packages` scopes
- **Fine-grained token**: Repository access → Only select repositories → Read and Write for Administration

### Network Considerations

The runner container needs outbound access to:
- `github.com` (clone repos, download actions)
- `api.github.com` (registration, status)
- `ghcr.io` (pull runner image, only if using GHCR)
- Package registries (`proxy.golang.org`, `registry.npmjs.org`, etc.)

No inbound ports are required.

## Stopping and Removing

```bash
# Stop runners (waits up to stop_grace_period for running jobs)
docker compose down

# Stop and remove work volumes
docker compose down -v

# Stop, remove volumes, and delete the locally built image
docker compose down -v --rmi local
```