docs: update README.md — Nginx→Caddy, dynamic repos, macvlan networking

- Architecture diagram: Gitea+Nginx→Gitea+Caddy, Let's Encrypt→macvlan
- Phase 8 description: Nginx→Caddy with Cloudflare DNS-01
- Template listing: nginx-gitea.conf.tpl→Caddyfile.tpl + caddy compose
- Design rationale: replaced "Why Nginx" with "Why Caddy"
- Compromises: replaced SSL cron section with Caddy auto-renewal
- Prerequisites: removed "existing Nginx container", added Cloudflare
- Removed hardcoded "3 repos" references throughout

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
S
2026-03-01 11:04:11 -05:00
parent 89bfc8a70f
commit f87acc5664


@@ -1,10 +1,10 @@
# Gitea Migration Toolkit # Gitea Migration Toolkit
Automated migration of GitHub repositories to self-hosted Gitea, with backup mirroring and push-mirror offsite redundancy. 40 shell scripts, 7 config templates, ~6,100 lines of bash. Automated migration of GitHub repositories to self-hosted Gitea, with backup mirroring and push-mirror offsite redundancy. 40+ shell scripts, 10 config templates, ~6,500 lines of bash.
## What This Does ## What This Does
Moves 3 GitHub repos to a self-hosted Gitea instance on Unraid, sets up a backup Gitea mirror on Fedora, and keeps GitHub as an offsite push mirror. After migration, Gitea is the primary git host — all CI runs on Gitea Actions, GitHub receives automatic push mirrors, and Fedora pulls from Unraid on a schedule. Moves GitHub repos to a self-hosted Gitea instance on Unraid, sets up a backup Gitea mirror on Fedora, and keeps GitHub as an offsite push mirror. After migration, Gitea is the primary git host — all CI runs on Gitea Actions, GitHub receives automatic push mirrors, and Fedora pulls from Unraid on a schedule. Supports any number of repos via space-delimited `REPO_NAMES` in `.env`.
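A minimal sketch of how a script might consume the space-delimited `REPO_NAMES` value. The repo names and the loop body are placeholders for illustration, not taken from the toolkit.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Assumed example value; the README only states that REPO_NAMES is
# space-delimited and lives in .env.
REPO_NAMES="alpha beta gamma"

# Split on whitespace into an array without triggering glob expansion.
read -r -a repos <<< "$REPO_NAMES"

for repo in "${repos[@]}"; do
  echo "would process repo: ${repo}"   # placeholder for per-repo work
done
```

Reading into an array keeps the repo count out of the scripts: adding a repo means editing `.env` only.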
The entire process is driven from a MacBook over SSH. Nothing is installed on the remote machines beyond what the setup scripts explicitly provision. The entire process is driven from a MacBook over SSH. Nothing is installed on the remote machines beyond what the setup scripts explicitly provision.
@@ -19,9 +19,9 @@ The entire process is driven from a MacBook over SSH. Nothing is installed on th
│ SSH │ SSH │ SSH │ SSH
┌──────────▼──────────┐ ┌────▼───────────────┐ ┌──────────▼──────────┐ ┌────▼───────────────┐
│ Unraid (Primary) │ │ Fedora (Backup) │ │ Unraid (Primary) │ │ Fedora (Backup) │
│ Gitea + Nginx │ │ Gitea (mirror) │ │ Gitea + Caddy │ │ Gitea (mirror) │
│ Docker runners │ │ Docker runners │ │ Docker runners │ │ Docker runners │
│ Let's Encrypt │ │ Backup storage │ │ macvlan networking │ │ Backup storage │
└──────────┬──────────┘ └────▲───────────────┘ └──────────┬──────────┘ └────▲───────────────┘
│ │ │ │
│ pull mirror │ │ pull mirror │
@@ -54,7 +54,7 @@ The entire process is driven from a MacBook over SSH. Nothing is installed on th
| 5 | `phase5_migrate_pipelines.sh` | Copy `.github/workflows/` to `.gitea/workflows/`, apply context variable fixes | | 5 | `phase5_migrate_pipelines.sh` | Copy `.github/workflows/` to `.gitea/workflows/`, apply context variable fixes |
| 6 | `phase6_github_mirrors.sh` | Configure push mirrors from Gitea to GitHub, disable GitHub Actions | | 6 | `phase6_github_mirrors.sh` | Configure push mirrors from Gitea to GitHub, disable GitHub Actions |
| 7 | `phase7_branch_protection.sh` | Apply branch protection rules to all repos | | 7 | `phase7_branch_protection.sh` | Apply branch protection rules to all repos |
| 8 | `phase8_cutover.sh` | Deploy Nginx HTTPS reverse proxy, obtain SSL cert, mark GitHub repos as mirrors | | 8 | `phase8_cutover.sh` | Deploy Caddy HTTPS reverse proxy (Cloudflare DNS-01 or existing certs), mark GitHub repos as mirrors |
| 9 | `phase9_security.sh` | Deploy Semgrep + Trivy + Gitleaks security scanning workflows | | 9 | `phase9_security.sh` | Deploy Semgrep + Trivy + Gitleaks security scanning workflows |
Each phase has three scripts: the main script, a `_post_check.sh` that independently verifies success, and a `_teardown.sh` that cleanly reverses the phase. Each phase has three scripts: the main script, a `_post_check.sh` that independently verifies success, and a `_teardown.sh` that cleanly reverses the phase.
@@ -79,9 +79,11 @@ gitea-migration/
│ ├── app.ini.tpl │ ├── app.ini.tpl
│ ├── docker-compose-gitea.yml.tpl │ ├── docker-compose-gitea.yml.tpl
│ ├── docker-compose-runner.yml.tpl │ ├── docker-compose-runner.yml.tpl
│ ├── nginx-gitea.conf.tpl │ ├── Caddyfile.tpl
│ ├── docker-compose-caddy.yml.tpl
│ ├── runner-config.yaml.tpl │ ├── runner-config.yaml.tpl
│ ├── com.gitea.runner.plist.tpl │ ├── com.gitea.runner.plist.tpl
│ ├── com.gitea.runner.newsyslog.conf.tpl
│ └── workflows/security-scan.yml.tpl │ └── workflows/security-scan.yml.tpl
├── contracts/gitea-api.md # API contract documentation ├── contracts/gitea-api.md # API contract documentation
├── backup/ ├── backup/
@@ -100,7 +102,7 @@ gitea-migration/
### Why bash scripts instead of Ansible/Terraform/Pulumi? ### Why bash scripts instead of Ansible/Terraform/Pulumi?
The migration targets 3 repos across 3 machines with a one-time execution path. Ansible requires installing agents or running a control node; Terraform manages ongoing state that doesn't apply to a one-shot migration; Pulumi requires a runtime. Bash scripts with SSH are zero-dependency beyond what's already on a Mac, run anywhere, are readable without framework knowledge, and produce no ongoing state to manage. The downside is more verbose error handling and no built-in parallelism, but for a sequential 9-phase pipeline that's acceptable. The migration targets a handful of repos across 3 machines with a one-time execution path. Ansible requires installing agents or running a control node; Terraform manages ongoing state that doesn't apply to a one-shot migration; Pulumi requires a runtime. Bash scripts with SSH are zero-dependency beyond what's already on a Mac, run anywhere, are readable without framework knowledge, and produce no ongoing state to manage. The downside is more verbose error handling and no built-in parallelism, but for a sequential 9-phase pipeline that's acceptable.
### Why a single MacBook control plane? ### Why a single MacBook control plane?
@@ -116,7 +118,7 @@ Docker Desktop on macOS is heavyweight (~4 GB), requires a commercial license fo
### Why `envsubst` templates instead of Jinja2/Helm/gomplate? ### Why `envsubst` templates instead of Jinja2/Helm/gomplate?
`envsubst` is a single binary from GNU gettext with zero dependencies. Templates are plain config files with `${VAR}` placeholders — anyone can read them without learning a template language. The trade-off is no conditionals or loops in templates. The scripts work around this by generating template variants in bash (e.g., HTTP-only vs HTTPS Nginx configs use marker-block stripping with `sed`). `envsubst` is a single binary from GNU gettext with zero dependencies. Templates are plain config files with `${VAR}` placeholders — anyone can read them without learning a template language. The trade-off is no conditionals or loops in templates. The scripts work around this by using marker-block stripping with `sed` (e.g., sqlite3 vs external DB blocks in the docker-compose template).
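The marker-block workaround can be sketched as follows. The `# BEGIN` / `# END` marker syntax, the `strip_block` helper, and the template contents are assumptions for illustration; the toolkit's actual markers may differ.

```bash
#!/usr/bin/env bash
set -euo pipefail

# strip_block NAME: read a template on stdin and delete every line from
# "# BEGIN NAME" through "# END NAME", inclusive. Hypothetical helper.
strip_block() {
  sed "/^# BEGIN $1\$/,/^# END $1\$/d"
}

# Illustrative template with an optional external-database block.
template='services:
  gitea:
    image: gitea/gitea
# BEGIN EXTERNAL_DB
  db:
    image: postgres:16
# END EXTERNAL_DB
'

# For the sqlite3 case, drop the external DB block before rendering.
rendered="$(printf '%s' "$template" | strip_block EXTERNAL_DB)"
printf '%s\n' "$rendered"
```

The stripped output would then be piped through `envsubst` to fill in the `${VAR}` placeholders, so the template itself needs no conditional syntax.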
### Why check-before-act idempotency instead of desired-state? ### Why check-before-act idempotency instead of desired-state?
@@ -126,9 +128,9 @@ Every operation checks if its target already exists before creating it. This is
All four Gitea-supported database backends are available: `sqlite3`, `mysql`, `postgres`, and `mssql`. Set `GITEA_DB_TYPE` in `.env` — sqlite3 is the default and needs no additional configuration. For external databases, the toolkit deploys a containerized database alongside Gitea (PostgreSQL 16, MySQL 8.0, or MSSQL 2022) with health checks, and the wizard prompts for connection details (host, port, name, user, password) only when needed. Backup/restore handles SQL dump import into the correct database engine. All four Gitea-supported database backends are available: `sqlite3`, `mysql`, `postgres`, and `mssql`. Set `GITEA_DB_TYPE` in `.env` — sqlite3 is the default and needs no additional configuration. For external databases, the toolkit deploys a containerized database alongside Gitea (PostgreSQL 16, MySQL 8.0, or MSSQL 2022) with health checks, and the wizard prompts for connection details (host, port, name, user, password) only when needed. Backup/restore handles SQL dump import into the correct database engine.
### Why Nginx reverse proxy instead of Caddy/Traefik? ### Why Caddy reverse proxy?
An Nginx Docker container was already running on Unraid. Adding a server block and SSL cert to an existing Nginx is simpler than deploying a new reverse proxy. Caddy has simpler cert management but would require replacing the existing proxy stack. Caddy with the Cloudflare DNS plugin handles wildcard TLS certificates automatically via DNS-01 challenge — no port 80 exposure needed, no certbot cron jobs, and zero-touch renewal. The `slothcroissant/caddy-cloudflaredns` Docker image bundles the plugin. For environments without Cloudflare, `TLS_MODE=existing` supports manual cert/key paths. Each host gets its own Caddy container on a dedicated macvlan IP.
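A sketch of how a script might check that the settings required by each `TLS_MODE` are present before deploying Caddy. Only `TLS_MODE` and its two values come from the README; `validate_tls_config`, `CLOUDFLARE_API_TOKEN`, `TLS_CERT_PATH`, and `TLS_KEY_PATH` are hypothetical names.

```bash
#!/usr/bin/env bash
set -euo pipefail

# validate_tls_config: return 0 when the settings for the chosen TLS_MODE
# are present; otherwise print a reason and return 1.
# All variable names except TLS_MODE are hypothetical.
validate_tls_config() {
  case "${TLS_MODE:-}" in
    cloudflare)
      # DNS-01 needs an API token that can edit DNS records.
      [ -n "${CLOUDFLARE_API_TOKEN:-}" ] \
        || { echo "missing Cloudflare API token"; return 1; }
      ;;
    existing)
      # Manual mode needs readable cert and key files.
      { [ -r "${TLS_CERT_PATH:-}" ] && [ -r "${TLS_KEY_PATH:-}" ]; } \
        || { echo "cert or key file not readable"; return 1; }
      ;;
    *)
      echo "unknown TLS_MODE: ${TLS_MODE:-<unset>}"
      return 1
      ;;
  esac
}

TLS_MODE=cloudflare
CLOUDFLARE_API_TOKEN=example-token
validate_tls_config && echo "TLS config looks complete"
```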
### Why mark GitHub repos as mirrors instead of archiving them? ### Why mark GitHub repos as mirrors instead of archiving them?
@@ -166,7 +168,7 @@ Phase 5 copies workflow files and does a `sed` replacement of `github.*` context
- Migrate secrets, OIDC providers, or environment configurations - Migrate secrets, OIDC providers, or environment configurations
- Handle composite actions or reusable workflows - Handle composite actions or reusable workflows
Full semantic migration would require parsing YAML, understanding the GitHub Actions schema, and mapping every action to a Gitea equivalent. For 3 repos, manual review after automated migration is faster than building a full converter. Full semantic migration would require parsing YAML, understanding the GitHub Actions schema, and mapping every action to a Gitea equivalent. For a small number of repos, manual review after automated migration is faster than building a full converter.
### No automatic rollback on failure ### No automatic rollback on failure
@@ -181,7 +183,7 @@ Phase 4 polls the Gitea API every N seconds to check if a migration completed, w
### No parallel phase execution ### No parallel phase execution
Phases run strictly sequentially. Phase 4 could potentially import all 3 repos in parallel, and Phase 3 could deploy runners concurrently. Sequential execution was chosen because: Phases run strictly sequentially. Phase 4 could potentially import repos in parallel, and Phase 3 could deploy runners concurrently. Sequential execution was chosen because:
- Bash parallelism (`&` + `wait`) makes error handling complex - Bash parallelism (`&` + `wait`) makes error handling complex
- The total migration time is dominated by network transfers, not script execution - The total migration time is dominated by network transfers, not script execution
- Sequential execution produces readable, linear logs - Sequential execution produces readable, linear logs
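To illustrate the first point, here is a sketch of what a parallel import with correct error handling might look like. `import_repo` is a hypothetical stand-in that fails for one repo name; it is not a function from the toolkit.

```bash
#!/usr/bin/env bash
set -uo pipefail

# Hypothetical stand-in for a per-repo import; fails for "beta".
import_repo() {
  [ "$1" != "beta" ]
}

pids=()
names=()
for repo in alpha beta gamma; do
  import_repo "$repo" &      # launch in the background
  pids+=("$!")               # remember the PID of this job
  names+=("$repo")
done

# Each PID must be waited on individually to recover its exit status;
# a bare `wait` returns 0 and would hide the failure.
failed=()
for i in "${!pids[@]}"; do
  if ! wait "${pids[$i]}"; then
    failed+=("${names[$i]}")
  fi
done

echo "failed imports: ${failed[*]:-none}"
```

Even this small case needs two parallel arrays and a second loop, and the background jobs' log output would interleave, which is the trade-off the list above describes.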
@@ -202,9 +204,9 @@ When `boot = true` is set in `runners.conf`, `manage_runner.sh` uses `sudo` for
The JSON file that records pre-cutover GitHub repo settings is stored alongside install manifests in `.manifests/`. This directory is gitignored (machine-specific state). If the user deletes `.manifests/` before running Phase 8 teardown, the teardown falls back to parsing the original description from the `[MIRROR] ... — was: ORIGINAL` format, but cannot restore homepage, wiki, projects, or Pages settings. The JSON file that records pre-cutover GitHub repo settings is stored alongside install manifests in `.manifests/`. This directory is gitignored (machine-specific state). If the user deletes `.manifests/` before running Phase 8 teardown, the teardown falls back to parsing the original description from the `[MIRROR] ... — was: ORIGINAL` format, but cannot restore homepage, wiki, projects, or Pages settings.
### SSL renewal cron on Unraid may not survive reboots ### TLS certificate renewal
The Let's Encrypt renewal cron is added via `crontab` on Unraid. Unraid is not designed for persistent user crontabs — they can be lost on reboot depending on the Unraid version and configuration. A more robust approach would be a dedicated Certbot Docker container with a restart policy, but that adds deployment complexity. When `TLS_MODE=cloudflare`, Caddy handles certificate renewal automatically via the Cloudflare DNS-01 challenge — no cron jobs or manual intervention needed. Caddy renews certificates 30 days before expiry and persists them in `$CADDY_DATA_PATH/data`. When `TLS_MODE=existing`, cert renewal is the user's responsibility.
## Security Notes ## Security Notes
@@ -220,9 +222,9 @@ The Let's Encrypt renewal cron is added via `crontab` on Unraid. Unraid is not d
| Machine | Requirements | | Machine | Requirements |
|---------|-------------| |---------|-------------|
| MacBook | macOS, Homebrew, jq >= 1.6, curl >= 7.70, git >= 2.30, shellcheck >= 0.8, gh >= 2.0, bw >= 2.0 | | MacBook | macOS, Homebrew, jq >= 1.6, curl >= 7.70, git >= 2.30, shellcheck >= 0.8, gh >= 2.0, bw >= 2.0 |
| Unraid | Linux, Docker >= 20.0, docker-compose >= 2.0, jq >= 1.6, existing Nginx container | | Unraid | Linux, Docker >= 20.0, docker-compose >= 2.0, jq >= 1.6 |
| Fedora | Linux with dnf, Docker CE >= 20.0, docker-compose >= 2.0, jq >= 1.6 | | Fedora | Linux with dnf, Docker CE >= 20.0, docker-compose >= 2.0, jq >= 1.6 |
| Network | MacBook can SSH to both servers, DNS A record pointing to Unraid for HTTPS | | Network | MacBook can SSH to both servers, DNS A record pointing to Unraid for HTTPS, Cloudflare API token (if using `TLS_MODE=cloudflare`) |
## Quick Start ## Quick Start