feat: add phase 7.5 Nginx to Caddy migration script and update usage guide
TODO.md
# TODO — Phase 7.5 Nginx -> Caddy Consolidation

## Why this exists

This file captures the decisions and migration context for the one-time "phase 7.5"
work so we do not lose reasoning between sessions.

## What happened so far

1. The original `phase8_cutover.sh` was designed for one wildcard zone
   (`*.${CADDY_DOMAIN}`), mainly for Gitea cutover.
2. The homelab currently has two active DNS zones in scope:
   - `sintheus.com` (legacy services behind Nginx)
   - `privacyindesign.com` (new Gitea public endpoint)
3. Decision made: run a one-time migration where a single Caddy instance serves
   both zones, then gradually retire Nginx.
4. Implemented: `phase7_5_nginx_to_caddy.sh` to generate/deploy a multi-domain
   Caddyfile and run canary/full rollout modes.

## Current design decisions

1. Public ingress should be HTTPS-only for all migrated hostnames.
2. Backend scheme is mixed for now:
   - Keep `http://` upstream where the service does not yet have TLS.
   - Keep `https://` where it is already available.
3. End-to-end HTTPS is a target state, not an immediate requirement.
4. A strict toggle exists in phase 7.5:
   - `--strict-backend-https` fails if any upstream is `http://`.
5. Canary-first rollout:
   - the first migration target is `tower.sintheus.com`.
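The strict toggle in item 4 amounts to scanning the host map for `http://` upstreams. A minimal sketch of such a check follows. The function name `check_strict_backend_https` and the one-pair-per-line map format are assumptions made for illustration, and `phase7_5_nginx_to_caddy.sh` may implement the flag differently.

```shell
# Hypothetical host map: "hostname upstream" per line (format assumed for this sketch).
HOST_MAP='tower.sintheus.com https://192.168.1.82:443
ai.sintheus.com http://192.168.1.82:8181'

# Succeed only if every upstream in the map ($1) uses https://.
# Hosts with an http:// upstream are listed on stderr.
check_strict_backend_https() {
  offenders=$(printf '%s\n' "$1" | awk '$2 ~ /^http:\/\// { print $1 }')
  if [ -n "$offenders" ]; then
    printf 'strict mode: plain HTTP upstream for %s\n' $offenders >&2
    return 1
  fi
}

# With the sample map above, ai.sintheus.com is reported and the check fails.
check_strict_backend_https "$HOST_MAP" || echo "strict mode would block this rollout"
```

Reporting every offending host, not just the first, gives a ready-made worklist for the later backend TLS hardening pass.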

## Host map and backend TLS status

### Canary scope (default mode)

- `tower.sintheus.com -> https://192.168.1.82:443` (TLS backend; cert verify skipped)
- `${GITEA_DOMAIN} -> http://${UNRAID_GITEA_IP}:3000` (HTTP backend for now)
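To make the two canary entries concrete, the sketch below prints the kind of Caddyfile site block each one could map to. The helper name `emit_site_block` and its `skip-verify` argument are hypothetical, and the Caddyfile actually generated by `phase7_5_nginx_to_caddy.sh` may be laid out differently. The `reverse_proxy` directive and the `tls_insecure_skip_verify` transport option are standard Caddy v2 syntax.

```shell
# Print one Caddyfile site block that proxies host $1 to upstream $2.
# If $3 is "skip-verify", upstream certificate verification is disabled,
# matching the "cert verify skipped" entries in the host map.
emit_site_block() {
  host=$1
  upstream=$2
  verify=${3:-}
  printf '%s {\n' "$host"
  if [ "$verify" = "skip-verify" ]; then
    printf '\treverse_proxy %s {\n' "$upstream"
    printf '\t\ttransport http {\n\t\t\ttls_insecure_skip_verify\n\t\t}\n\t}\n'
  else
    printf '\treverse_proxy %s\n' "$upstream"
  fi
  printf '}\n'
}

# TLS backend with verification skipped (canary host).
emit_site_block tower.sintheus.com https://192.168.1.82:443 skip-verify

# Plain HTTP backend; the fallback values are placeholders, not the real Gitea settings.
emit_site_block "${GITEA_DOMAIN:-gitea.example.com}" "http://${UNRAID_GITEA_IP:-192.0.2.20}:3000"
```

Keeping the skip-verify choice as an explicit per-host argument makes it easy to find and remove once a backend presents a certificate Caddy can verify.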

### Full migration scope

- `ai.sintheus.com -> http://192.168.1.82:8181`
- `photos.sintheus.com -> http://192.168.1.222:2283`
- `fin.sintheus.com -> http://192.168.1.233:8096`
- `disk.sintheus.com -> http://192.168.1.52:80`
- `pi.sintheus.com -> http://192.168.1.4:80`
- `plex.sintheus.com -> http://192.168.1.111:32400`
- `sync.sintheus.com -> http://192.168.1.119:8384`
- `syno.sintheus.com -> https://100.108.182.16:5001` (cert verify skipped)
- `tower.sintheus.com -> https://192.168.1.82:443` (cert verify skipped)
- `${GITEA_DOMAIN} -> http://${UNRAID_GITEA_IP}:3000`

## Definition of done (phase 7.5)

Phase 7.5 is done only when all of the following are true:

1. Caddy is running on Unraid with the generated multi-domain config.
2. Canary host `tower.sintheus.com` is reachable over HTTPS through Caddy.
3. Canary routing is proven by at least one path:
   - `curl --resolve` tests, or
   - split-DNS/hosts override, or
   - intentional DNS cutover.
4. Legacy Nginx remains available for non-migrated hosts during canary.
5. No critical regressions are observed for at least 24 hours on canary traffic.
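For the `curl --resolve` option in item 3, a minimal sketch follows. It only prints the command so it can be reviewed before anything is sent. `CADDY_IP` and the helper name `canary_probe_cmd` are assumptions (this file does not state the address Caddy listens on), and `192.0.2.10` is a documentation-range placeholder.

```shell
# Assumption: CADDY_IP is the address where Caddy accepts connections on port 443.
CADDY_IP="${CADDY_IP:-192.0.2.10}"

# Print a curl command that requests https://$1/ but connects to CADDY_IP,
# so the request reaches Caddy without any DNS change.
# -I fetches only the response headers; -sS hides progress but still shows errors.
canary_probe_cmd() {
  printf 'curl -sSI --resolve %s:443:%s https://%s/\n' "$1" "$CADDY_IP" "$1"
}

canary_probe_cmd tower.sintheus.com
```

Because `--resolve` pins only the name-to-address mapping, the TLS handshake still uses the real hostname, so the same request also exercises the certificate Caddy serves for the canary host.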

## Definition of done (final state after full migration)

1. All selected domains route to Caddy through the intended ingress path:
   - LAN-only: split-DNS/private resolution to Caddy, or
   - public: DNS to WAN ingress that forwards 443 to Caddy.
2. Caddy serves valid certificates for both zones.
3. Functional checks pass for each service (UI load, API, websocket/streaming where relevant).
4. Nginx is no longer on the request path for migrated domains.
5. Long-term target: all backends upgraded to `https://` and strict mode passes.

## What remains to happen

1. Run canary:
   - `./phase7_5_nginx_to_caddy.sh --mode=canary`
2. Route canary traffic to Caddy using one method:
   - `curl --resolve` for zero-DNS-change testing, or
   - split-DNS/private DNS, or
   - explicit DNS cutover if desired.
3. Observe errors/latency/app behavior for at least 24 hours.
4. If canary is clean, run full:
   - `./phase7_5_nginx_to_caddy.sh --mode=full`
5. Move remaining routes in batches (DNS or split-DNS, depending on ingress model).
6. Validate each app after each batch.
7. After everything is stable, plan Nginx retirement.
8. Later hardening pass:
   - enable TLS on each backend service one by one
   - flip each corresponding upstream to `https://`
   - finally run `--strict-backend-https` and require it to pass.
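Steps 5 and 6 imply a batch plan for the hosts that remain after the canary. The sketch below groups the full-scope hostnames from the host map into fixed-size batches. The batch size of 3 and the helper name `plan_batches` are assumptions, and the ordering is simply the order in which the hosts are listed above.

```shell
# Full-scope hosts still to move after the canary (from the host map in this file).
REMAINING_HOSTS='ai.sintheus.com photos.sintheus.com fin.sintheus.com
disk.sintheus.com pi.sintheus.com plex.sintheus.com
sync.sintheus.com syno.sintheus.com'

# Print the whitespace-separated hosts in $1 as batches of $2 hosts per line.
# $1 is left unquoted on purpose so the shell splits it into one host per word.
plan_batches() {
  printf '%s\n' $1 | xargs -n "$2"
}

# One line per batch: cut over a line, validate each app in it, then continue.
plan_batches "$REMAINING_HOSTS" 3
```

A smaller batch size lowers the blast radius of each DNS change at the cost of more validation rounds.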

## Risks and why mixed backend HTTP is acceptable short-term

1. Risk: backend HTTP is unencrypted on LAN.
   - Mitigation: traffic stays on the trusted local network, and this is a temporary state only.
2. Risk: if strict mode is enabled too early, rollout blocks.
   - Mitigation: keep strict mode off until backend TLS coverage improves.
3. Risk: moving all DNS at once can create a broad outage.
   - Mitigation: canary-first and batch DNS cutover.

## Operational notes

1. If a Caddyfile already exists, phase 7.5 backs it up as:
   - `${CADDY_DATA_PATH}/Caddyfile.pre_phase7_5.<timestamp>`
2. Compose stack path for Caddy:
   - `${UNRAID_COMPOSE_DIR}/caddy/docker-compose.yml`
3. The script does not change Cloudflare DNS records automatically.
   - DNS updates are intentional/manual to keep blast radius controlled.
4. Do not point public, Cloudflare-proxied records at private `192.168.x.x` addresses.
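Notes 1 and 4 can each be sketched as a small helper. Both function names are hypothetical, the timestamp format is an assumption (the file only says `<timestamp>`), and the address check goes slightly beyond note 4 by also treating the `100.64.0.0/10` shared range as non-public, since the `syno.sintheus.com` upstream falls inside it.

```shell
# Note 1: copy an existing Caddyfile aside before it is overwritten.
# Prints the backup path; does nothing if directory $1 has no Caddyfile.
backup_caddyfile() {
  src="$1/Caddyfile"
  [ -f "$src" ] || return 0
  stamp=$(date +%Y%m%d_%H%M%S)   # timestamp format is an assumption
  cp -p "$src" "$src.pre_phase7_5.$stamp" && printf '%s\n' "$src.pre_phase7_5.$stamp"
}

# Note 4: succeed if $1 looks like an IPv4 address that is not publicly routable:
# RFC 1918 private ranges plus the 100.64.0.0/10 shared range (RFC 6598).
# Pattern match only; it does not validate that $1 is a well-formed address.
is_non_public_ipv4() {
  case "$1" in
    10.*|192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    100.6[4-9].*|100.[7-9][0-9].*|100.1[01][0-9].*|100.12[0-7].*) return 0 ;;
  esac
  return 1
}

# Example: refuse to use a backend address as the target of a public record.
if is_non_public_ipv4 192.168.1.82; then
  echo "192.168.1.82 is not publicly routable; do not publish it in a proxied record"
fi
```

In practice `backup_caddyfile "$CADDY_DATA_PATH"` would run before the new Caddyfile is written, and the address check would gate any manual DNS change.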