# Gitea Migration Toolkit

Automated migration of GitHub repositories to self-hosted Gitea, with backup mirroring and push-mirror offsite redundancy. 40 shell scripts, 7 config templates, ~6,100 lines of bash.

## What This Does

Moves 3 GitHub repos to a self-hosted Gitea instance on Unraid, sets up a backup Gitea mirror on Fedora, and keeps GitHub as an offsite push mirror. After migration, Gitea is the primary git host — all CI runs on Gitea Actions, GitHub receives automatic push mirrors, and Fedora pulls from Unraid on a schedule.

The entire process is driven from a MacBook over SSH. Nothing is installed on the remote machines beyond what the setup scripts explicitly provision.

## Architecture

```
                    ┌──────────────────────────────────────────────┐
                    │               MacBook (Control Plane)        │
                    │   Runs all scripts locally, SSHs into hosts  │
                    │   Native macOS runner (launchd)              │
                    └──────────┬──────────────────┬────────────────┘
                               │ SSH              │ SSH
                    ┌──────────▼──────────┐  ┌────▼───────────────┐
                    │   Unraid (Primary)   │  │   Fedora (Backup)  │
                    │   Gitea + Nginx      │  │   Gitea (mirror)   │
                    │   Docker runners     │  │   Docker runners   │
                    │   Let's Encrypt      │  │   Backup storage   │
                    └──────────┬──────────┘  └────▲───────────────┘
                               │                  │
                               │  pull mirror     │
                               │  (8h interval)   │
                               └──────────────────┘
                               │
                               │ push mirror (on commit + 8h)
                               ▼
                    ┌──────────────────────┐
                    │   GitHub (Offsite)   │
                    │   Read-only mirror   │
                    │   Actions disabled   │
                    └──────────────────────┘
```

**Data flow after migration:**
- Developers push to Gitea on Unraid (via HTTPS reverse proxy)
- Gitea pushes to GitHub on every commit and on an 8-hour schedule
- Fedora pulls from Unraid on an 8-hour schedule
- Backup dumps are created on Unraid and SCP'd directly to Fedora

## The 9-Phase Pipeline

| Phase | Script | What It Does |
|-------|--------|-------------|
| 1 | `phase1_gitea_unraid.sh` | Deploy Gitea on Unraid via Docker Compose, create admin user, generate API token, create organization |
| 2 | `phase2_gitea_fedora.sh` | Deploy Gitea on Fedora (backup instance), create admin user, generate backup API token |
| 3 | `phase3_runners.sh` | Get runner registration token, deploy all runners from `runners.conf` |
| 4 | `phase4_migrate_repos.sh` | Import repos from GitHub to Unraid, create pull mirrors on Fedora |
| 5 | `phase5_migrate_pipelines.sh` | Copy `.github/workflows/` to `.gitea/workflows/`, apply context variable fixes |
| 6 | `phase6_github_mirrors.sh` | Configure push mirrors from Gitea to GitHub, disable GitHub Actions |
| 7 | `phase7_branch_protection.sh` | Apply branch protection rules to all repos |
| 8 | `phase8_cutover.sh` | Deploy Nginx HTTPS reverse proxy, obtain SSL cert, mark GitHub repos as mirrors |
| 9 | `phase9_security.sh` | Deploy Semgrep + Trivy + Gitleaks security scanning workflows |

Each phase has three scripts: the main script, a `_post_check.sh` that independently verifies success, and a `_teardown.sh` that cleanly reverses the phase.
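
The do/verify sequence for a single phase could be chained roughly as follows. This is a minimal sketch that assumes the file naming shown in this README; `run_phase` is a hypothetical helper for illustration and is not taken from `run_all.sh`.

```bash
# Hypothetical helper: run the main script for phase N, then its post-check.
# Assumes the main script is the only phaseN_*.sh file that is neither the
# post-check nor the teardown script.
run_phase() {
  local n="$1"
  local dir="${2:-.}"
  local main=""
  local f
  for f in "$dir"/phase"${n}"_*.sh; do
    [ -e "$f" ] || continue                 # glob matched nothing
    case "$f" in
      *_post_check.sh | *_teardown.sh) ;;   # skip the verify/undo scripts
      *) main="$f" ;;
    esac
  done
  if [ -z "$main" ]; then
    echo "phase $n: no main script found in $dir" >&2
    return 2
  fi
  if ! bash "$main"; then
    echo "phase $n: main script failed" >&2
    return 1
  fi
  # The post-check runs as a separate process, so it cannot lean on
  # variables left behind by the main script.
  bash "$dir/phase${n}_post_check.sh"
}
```

Calling `run_phase 4` from the toolkit directory would run `phase4_migrate_repos.sh` followed by `phase4_post_check.sh`, and would never touch the teardown script.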

## File Structure

```
gitea-migration/
├── .env.example              # Configuration template (copy to .env)
├── runners.conf.example      # Runner definitions template
├── lib/common.sh             # Shared functions + .env validators
├── setup/
│   ├── configure_env.sh      # Interactive .env wizard (~50 prompts)
│   ├── macbook.sh            # Local prerequisites (brew packages)
│   ├── unraid.sh             # Remote prerequisites (static binaries)
│   ├── fedora.sh             # Remote prerequisites (dnf packages)
│   ├── cross_host_ssh.sh     # SSH key exchange between Unraid and Fedora
│   ├── env_to_bitwarden.sh   # Export .env to Bitwarden JSON import format
│   ├── bitwarden_to_env.sh   # Restore .env from Bitwarden CLI
│   └── cleanup.sh            # Manifest-driven rollback of setup
├── templates/                # Config templates (.tpl + envsubst)
│   ├── app.ini.tpl
│   ├── docker-compose-gitea.yml.tpl
│   ├── docker-compose-runner.yml.tpl
│   ├── nginx-gitea.conf.tpl
│   ├── runner-config.yaml.tpl
│   ├── com.gitea.runner.plist.tpl
│   └── workflows/security-scan.yml.tpl
├── contracts/gitea-api.md    # API contract documentation
├── backup/
│   ├── backup_primary.sh     # Gitea dump, SCP to Fedora
│   └── restore_to_primary.sh # Restore dump to Unraid
├── preflight.sh              # 25 pre-flight validation checks
├── run_all.sh                # Full pipeline orchestration
├── teardown_all.sh           # Reverse teardown (9 to 1)
├── manage_runner.sh          # Dynamic runner add/remove/list
├── phase{1-9}_*.sh           # Main phase scripts
├── phase{1-9}_post_check.sh  # Verification scripts
└── phase{1-9}_teardown.sh    # Reversal scripts
```

## Design Decisions and Rationale

### Why bash scripts instead of Ansible/Terraform/Pulumi?

The migration targets 3 repos across 3 machines with a one-time execution path. Ansible is agentless, but it needs Python on the control node and, for most modules, on the managed hosts; Terraform manages ongoing state that doesn't apply to a one-shot migration; Pulumi requires a language runtime. Bash scripts over SSH need nothing beyond the command-line tools listed under Prerequisites, are readable without framework knowledge, and produce no ongoing state to manage. The downside is more verbose error handling and no built-in parallelism, but for a sequential 9-phase pipeline that's acceptable.

### Why a single MacBook control plane?

All scripts run from the MacBook and SSH into remotes. This means:
- No agents, daemons, or software installed on servers beyond the migration targets
- One place to look at logs, one place to re-run failed phases
- The MacBook doesn't need to stay connected between phases; each phase is self-contained and can be run in a separate session
- Trade-off: the MacBook must be on the same network (or VPN) as both servers

### Why Docker Compose for Gitea but native binary for macOS runner?

Docker Desktop on macOS is heavyweight (~4 GB), requires a paid subscription for larger organizations, and is unreliable for long-running background services (it suspends when the Mac sleeps). A native `act_runner` binary with a launchd plist is 30 MB, survives sleep/wake cycles, and by default starts at login via `~/Library/LaunchAgents/`.

For headless Macs or dedicated CI machines, set `boot = true` in `runners.conf` to install the plist to `/Library/LaunchDaemons/` instead. The runner then starts at boot, before any user logs in. This mode requires `sudo` to install the plist and to load or unload it with `launchctl`.

On Linux, Docker runs containers natively with no VM layer, so Docker Compose is the obvious choice there.

### Why `envsubst` templates instead of Jinja2/Helm/gomplate?

`envsubst` is a single binary from GNU gettext with zero dependencies. Templates are plain config files with `${VAR}` placeholders — anyone can read them without learning a template language. The trade-off is no conditionals or loops in templates. The scripts work around this by generating template variants in bash (e.g., HTTP-only vs HTTPS Nginx configs use marker-block stripping with `sed`).
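
A rough sketch of the marker-block approach, for illustration only. The marker names (`# BEGIN_HTTPS` / `# END_HTTPS`), the function names, and the variables `GITEA_DOMAIN` and `GITEA_PORT` are assumptions; the real templates and scripts may differ.

```bash
# Delete everything between the BEGIN/END marker lines (inclusive) for the
# named block. Reads the template on stdin, writes the result to stdout.
strip_block() {
  sed "/^# BEGIN_$1\$/,/^# END_$1\$/d"
}

# Render an HTTP-only config: drop the HTTPS block, then substitute only
# the variables named in the allow-list. The single quotes are deliberate:
# envsubst receives the literal names, and Nginx's own $host, $remote_addr,
# etc. are left untouched.
render_http_only() {
  strip_block HTTPS < "$1" | envsubst '${GITEA_DOMAIN} ${GITEA_PORT}'
}
```

For example, `render_http_only templates/nginx-gitea.conf.tpl > nginx-gitea.conf` would produce a config without the HTTPS section. Passing an explicit variable list to `envsubst` matters for Nginx templates, since a bare `envsubst` would also blank out every `$variable` that Nginx itself is meant to expand.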

### Why check-before-act idempotency instead of desired-state?

Every operation checks if its target already exists before creating it. This is simpler to implement in bash and easier to debug — you can see exactly which step was skipped vs executed. The trade-off is that it cannot detect drift (e.g., someone manually changed a Gitea setting between runs). For a one-time migration, drift detection adds complexity without value.
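
The pattern can be sketched with a small wrapper. `ensure`, `org_exists`, and `org_create` are hypothetical names, and a temporary directory stands in for a Gitea organization; in the real scripts the check would presumably be an API lookup.

```bash
# Hypothetical helper: run CREATE only when CHECK fails, and log which
# path was taken (to stderr, so stdout stays free for data).
ensure() {
  local label="$1" check="$2" create="$3"
  if "$check"; then
    echo "[skip] $label already exists" >&2
  else
    echo "[run]  creating $label" >&2
    "$create"
  fi
}

# Demo target: a directory standing in for a Gitea organization.
ORG_DIR="$(mktemp -d)/org"
org_exists() { [ -d "$ORG_DIR" ]; }
org_create() { mkdir -p "$ORG_DIR"; }

ensure "organization" org_exists org_create   # first run: creates
ensure "organization" org_exists org_create   # second run: skips
```

The log shows one `[run]` line followed by one `[skip]` line, which is the "skipped vs executed" trace described above. Nothing here compares the existing target against the desired configuration, which is why drift goes unnoticed.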

### Why SQLite instead of PostgreSQL/MySQL?

The target workload is 3 repos with a handful of users. SQLite handles this with zero operational overhead: no separate database container, no connection strings, no backup coordination. The database is a single file that can be backed up by copying it (repository data lives on disk next to it and is included in `gitea dump`). If the workload grows, migrating to PostgreSQL later is a Gitea admin operation, not a re-migration.

### Why Nginx reverse proxy instead of Caddy/Traefik?

An Nginx Docker container was already running on Unraid. Adding a server block and SSL cert to an existing Nginx is simpler than deploying a new reverse proxy. Caddy has simpler cert management but would require replacing the existing proxy stack.

### Why mark GitHub repos as mirrors instead of archiving them?

An earlier version archived GitHub repos during Phase 8. This was changed because archived repos reject all pushes, which breaks the push mirrors configured in Phase 6. Instead, repos are marked with a `[MIRROR]` description prefix, wiki/projects/Pages are disabled, and the original settings are saved to a JSON state file for exact restoration on teardown.

### Why separate Gitea instances instead of built-in replication?

Gitea doesn't have built-in multi-node replication. The Fedora instance is a completely independent Gitea that pulls mirrors from Unraid. This is simpler than database replication, works across different networks, and provides a fully functional standby — if Unraid dies, Fedora has a complete Gitea instance with all repos, not just a database replica.

### Why the three-script-per-phase pattern (do / verify / undo)?

- The main script may partially succeed before failing. The post-check tells you exactly what's working.
- Post-checks can run independently — useful for debugging without re-running the whole phase.
- Teardown scripts reverse only what their phase created, making selective rollback possible.

### Why pipe stderr for logs and stdout for data?

All `log_*` functions write to stderr. API wrappers return JSON on stdout. This means you can do `result=$(gitea_api GET /user)` without log messages contaminating the JSON. Piping through `jq` works cleanly.
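
A minimal illustration of the convention. `log_info` here is a simplified stand-in for the toolkit's logging functions, and `fake_api` is a hypothetical stub that returns a fixed payload instead of calling Gitea.

```bash
# Log helper: everything goes to stderr.
log_info() { printf '[INFO] %s\n' "$*" >&2; }

# Stand-in for an API wrapper: logs to stderr, emits JSON on stdout.
fake_api() {
  log_info "GET $1"
  printf '{"login":"admin"}\n'
}

# Command substitution captures stdout only, so the log line reaches the
# terminal while $result holds nothing but the JSON.
result=$(fake_api /user)
printf '%s\n' "$result"
```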

## Compromises

### Shared admin credentials across instances

Unraid and Fedora use the same `GITEA_ADMIN_USER` and `GITEA_ADMIN_PASSWORD`. This simplifies setup (one set of credentials) and makes the pull mirror authentication straightforward (Fedora authenticates to Unraid using the shared admin password). The trade-off is reduced isolation: a single leaked password compromises both instances. For a personal or small-team setup, this is acceptable.

### Dynamic repo list

The scripts read `REPO_NAMES` from `.env` — a space-delimited list of repo names (e.g., `REPO_NAMES=myapp backend infra`). The `get_repo_list()` helper in `lib/common.sh` splits it into individual names. Phase scripts use `read -ra REPOS <<< "$REPO_NAMES"` to build an array, supporting any number of repos.

### Workflow migration is syntactic, not semantic

Phase 5 copies workflow files and uses `sed` to rewrite `github.*` context variables as `gitea.*` inside `${{ }}` expressions. It does NOT:
- Validate YAML syntax
- Check if referenced GitHub marketplace actions exist in Gitea
- Migrate secrets, OIDC providers, or environment configurations
- Handle composite actions or reusable workflows

Full semantic migration would require parsing YAML, understanding the GitHub Actions schema, and mapping every action to a Gitea equivalent. For 3 repos, manual review after automated migration is faster than building a full converter.
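
The kind of substitution involved can be sketched as below. This is an illustrative expression written for this README, not necessarily the one Phase 5 uses, and `rewrite_contexts` is a hypothetical name.

```bash
# Rewrite "github." to "gitea.", but only between "${{" and the next "}".
# Each pass converts one reference; the loop repeats until none is left,
# so expressions with several references are fully converted.
rewrite_contexts() {
  sed -E -e ':again' \
         -e 's/(\$\{\{[^}]*)github\./\1gitea./' \
         -e 't again'
}

printf '%s\n' 'if: ${{ github.ref == github.event.ref }}  # see github.com' \
  | rewrite_contexts
# prints: if: ${{ gitea.ref == gitea.event.ref }}  # see github.com
```

Text outside `${{ }}` is left alone, which is the purely syntactic behavior described above. The sketch has the matching blind spots: it knows nothing about YAML, it would also rewrite an identifier that merely ends in `github.`, and a `}` inside a string literal cuts the match short.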

### No automatic rollback on failure

If Phase 5 fails halfway through, Phase 4's repos are still migrated and Phase 3's runners are still running. The user must manually run `teardown_all.sh --through=5` to roll back. Automatic rollback was rejected because:
- Determining "what succeeded" in a partially-failed phase is complex
- Some failures are transient (network timeout) and re-running the phase is the correct fix
- Automatic rollback of destructive operations (deleting repos) should always require human confirmation

### Migration polling is timeout-based, not event-driven

Phase 4 polls the Gitea API every N seconds to check if a migration completed, with a configurable timeout. Gitea's migration API doesn't support webhooks or long-polling, so polling is the only option. The defaults (3-second interval, 600-second timeout) work for repos up to ~1 GB. Larger repos need a higher timeout via `MIGRATION_POLL_TIMEOUT_SEC` in `.env`.
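
A timeout-based polling loop of this kind might look like the sketch below. `poll_until` is a hypothetical helper, and the check command is whatever reports that a migration has finished.

```bash
# Poll a check command every INTERVAL seconds until it succeeds or TIMEOUT
# seconds of sleeping have accumulated. Returns 0 on success, 1 on timeout.
# Time spent inside the check itself is not counted.
# Usage: poll_until INTERVAL TIMEOUT command [args...]
poll_until() {
  local interval="$1" timeout="$2"
  shift 2
  local waited=0
  while ! "$@"; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out after ${waited}s" >&2
      return 1
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
}
```

With the defaults mentioned above, a call could read `poll_until 3 "${MIGRATION_POLL_TIMEOUT_SEC:-600}" repo_is_ready myapp`, where `repo_is_ready` is an assumed function that queries the Gitea API and returns 0 once the repo exists.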

### No parallel phase execution

Phases run strictly sequentially, and so does the work inside each phase. Phase 4 could import all 3 repos in parallel, and Phase 3 could deploy runners concurrently. Sequential execution was chosen because:
- Bash parallelism (`&` + `wait`) makes error handling complex
- The total migration time is dominated by network transfers, not script execution
- Sequential execution produces readable, linear logs
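
To give a sense of the extra bookkeeping, here is a sketch of what a parallel import would need just to report failures correctly. `import_repo` is a stand-in that fails for a repo named `bad`; none of this is part of the toolkit.

```bash
# Stand-in for a real import; fails only for the repo named "bad".
import_repo() {
  [ "$1" != "bad" ]
}

# Start one background job per repo and remember each PID with its name.
pids=""
for repo in myapp bad infra; do
  import_repo "$repo" &
  pids="$pids $!:$repo"
done

# A bare `wait` returns 0 regardless of job failures, so each PID has to
# be waited on individually to recover its exit code.
failed=""
for entry in $pids; do
  pid=${entry%%:*}
  repo=${entry#*:}
  wait "$pid" || failed="$failed $repo"
done
echo "failed:${failed:- none}"
# prints: failed: bad
```

Log output from the background jobs would also interleave, which is the readability cost mentioned in the last bullet.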

### Docker socket mounted in runner containers

Runner containers get `/var/run/docker.sock` mounted, giving them root-equivalent access to the host's Docker daemon. This is required for runners to spawn job containers but is a security concern for untrusted code. For a private instance with trusted users, this is the standard Gitea runner deployment.

### Native runner `boot` mode requires sudo

When `boot = true` is set in `runners.conf`, `manage_runner.sh` uses `sudo` for three operations: copying the plist to `/Library/LaunchDaemons/`, loading/unloading the service via `launchctl`, and removing the plist on teardown. The plist includes a `<key>UserName</key>` entry so the daemon process runs as the deploying user, not root. The newsyslog config (log rotation) always requires `sudo` regardless of boot mode, since it installs to `/etc/newsyslog.d/`.

### Backup archives are unencrypted

`gitea dump` produces a zip file containing the database, all repos, and config. This is transferred over SSH (encrypted in transit) and stored on Fedora's filesystem. At-rest encryption is the user's responsibility (e.g., LUKS on the Fedora backup volume).

### Phase 8 state snapshot lives in `.manifests/`

The JSON file that records pre-cutover GitHub repo settings is stored alongside install manifests in `.manifests/`. This directory is gitignored (machine-specific state). If the user deletes `.manifests/` before running Phase 8 teardown, the teardown falls back to parsing the original description from the `[MIRROR] ... — was: ORIGINAL` format, but cannot restore homepage, wiki, projects, or Pages settings.

### SSL renewal cron on Unraid may not survive reboots

The Let's Encrypt renewal cron is added via `crontab` on Unraid. Unraid is not designed for persistent user crontabs — they can be lost on reboot depending on the Unraid version and configuration. A more robust approach would be a dedicated Certbot Docker container with a restart policy, but that adds deployment complexity.

## Security Notes

- **Sensitive files** (`.env`, `runners.conf`, `.manifests/`, `*.pem`, `*.key`, `*.crt`) are in `.gitignore`
- **API tokens** are generated by the scripts and written to `.env` — never hardcoded
- **SSH** uses `BatchMode=yes` (no password prompts) and `StrictHostKeyChecking=accept-new` (unknown host keys are accepted on first connection; changed keys are rejected)
- **Passwords** are only used for initial admin creation and token generation — all subsequent API calls use tokens
- **Runner containers** mount the Docker socket — this is root-equivalent access to the host
- **Cross-host SSH keys** are ed25519 with no passphrase (automation keys)

## Prerequisites

| Machine | Requirements |
|---------|-------------|
| MacBook | macOS, Homebrew, jq >= 1.6, curl >= 7.70, git >= 2.30, shellcheck >= 0.8, gh >= 2.0, bw >= 2.0 |
| Unraid | Linux, Docker >= 20.0, docker-compose >= 2.0, jq >= 1.6, existing Nginx container |
| Fedora | Linux with dnf, Docker CE >= 20.0, docker-compose >= 2.0, jq >= 1.6 |
| Network | MacBook can SSH to both servers, DNS A record pointing to Unraid for HTTPS |

## Quick Start

```bash
cp .env.example .env
cp runners.conf.example runners.conf
# Edit both files, then:
./run_all.sh
```

See [USAGE_GUIDE.md](USAGE_GUIDE.md) for the full walkthrough, edge cases, and rollback procedures.