TODO — Phase 7.5 Nginx -> Caddy Consolidation
Why this exists
This file captures the decisions and migration context for the one-time "phase 7.5" work so we do not lose reasoning between sessions.
What happened so far
- The original `phase8_cutover.sh` was designed for one wildcard zone (`*.${CADDY_DOMAIN}`), mainly for Gitea cutover.
- The homelab currently has two active DNS zones in scope:
  - `sintheus.com` (legacy services behind Nginx)
  - `privacyindesign.com` (new Gitea public endpoint)
- Decision made: run a one-time migration where a single Caddy instance serves both zones, then gradually retire Nginx.
- Implemented: `phase7_5_nginx_to_caddy.sh` to generate/deploy a multi-domain Caddyfile and run canary/full rollout modes.
Current design decisions
- Public ingress should be HTTPS-only for all migrated hostnames.
- Backend scheme is mixed for now:
  - Keep `http://` upstream where service does not yet have TLS.
  - Keep `https://` where already available.
- End-to-end HTTPS is a target state, not an immediate requirement.
- A strict toggle exists in phase 7.5: `--strict-backend-https` fails if any upstream is `http://`.
- Canary-first rollout:
  - first migration target is `tower.sintheus.com`.
- Canary mode is additive:
  - preserves existing Caddy routes
  - updates only a managed canary block for `tower.sintheus.com`.
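The strict toggle described above could be sketched roughly as follows. This is an illustration only, not the code in `phase7_5_nginx_to_caddy.sh`: the function name `check_strict_backends` and its message format are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a strict backend check; names are illustrative,
# not taken from the real phase 7.5 script.

# check_strict_backends UPSTREAM...
# Reports every plain-HTTP upstream on stderr and returns 1 if any was found.
check_strict_backends() {
  local upstream failed=0
  for upstream in "$@"; do
    case "$upstream" in
      http://*)
        echo "strict mode: plain HTTP upstream: $upstream" >&2
        failed=1
        ;;
    esac
  done
  return "$failed"
}
```

With the current host map this check would fail, since most upstreams are still `http://`, which is why strict mode stays off for now.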
Host map and backend TLS status
Canary scope (default mode)
- `tower.sintheus.com -> https://192.168.1.82:443` (TLS backend; cert verify skipped)
- `${GITEA_DOMAIN} -> http://${UNRAID_GITEA_IP}:3000` (HTTP backend for now)
Full migration scope
- `ai.sintheus.com -> http://192.168.1.82:8181`
- `photos.sintheus.com -> http://192.168.1.222:2283`
- `fin.sintheus.com -> http://192.168.1.233:8096`
- `disk.sintheus.com -> http://192.168.1.52:80`
- `pi.sintheus.com -> http://192.168.1.4:80`
- `plex.sintheus.com -> http://192.168.1.111:32400`
- `sync.sintheus.com -> http://192.168.1.119:8384`
- `syno.sintheus.com -> https://100.108.182.16:5001` (verify skipped)
- `tower.sintheus.com -> https://192.168.1.82:443` (verify skipped)
- `${GITEA_DOMAIN} -> http://${UNRAID_GITEA_IP}:3000`
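As a rough illustration of how entries in this host map might turn into Caddyfile site blocks, the sketch below prints one block per host. The helper name `emit_site_block` and its `skip_verify` argument are hypothetical; the actual generator in the phase 7.5 script may differ. The emitted directives (`reverse_proxy`, `transport http`, `tls_insecure_skip_verify`) are standard Caddyfile syntax.

```shell
#!/usr/bin/env bash
# Hypothetical helper: emit_site_block HOST UPSTREAM [skip_verify]
# Prints one Caddyfile site block that reverse-proxies HOST to UPSTREAM.
# Passing "skip_verify" disables backend certificate verification, matching
# the "(verify skipped)" entries in the host map.
emit_site_block() {
  local host="$1" upstream="$2" skip_verify="${3:-}"
  printf '%s {\n' "$host"
  if [ "$skip_verify" = "skip_verify" ]; then
    printf '\treverse_proxy %s {\n' "$upstream"
    printf '\t\ttransport http {\n'
    printf '\t\t\ttls_insecure_skip_verify\n'
    printf '\t\t}\n'
    printf '\t}\n'
  else
    printf '\treverse_proxy %s\n' "$upstream"
  fi
  printf '}\n'
}

# Two entries from the host map above.
emit_site_block ai.sintheus.com http://192.168.1.82:8181
emit_site_block tower.sintheus.com https://192.168.1.82:443 skip_verify
```

Skipping verification keeps self-signed backends working during the migration; it would be removed per host once a backend presents a certificate Caddy can trust.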
Definition of done (phase 7.5)
Phase 7.5 is done only when all are true:
- Caddy is running on Unraid with generated multi-domain config.
- Canary host `tower.sintheus.com` is reachable over HTTPS through Caddy.
- Canary routing is proven by at least one path:
  - `curl --resolve` tests, or
  - split-DNS/hosts override, or
  - intentional DNS cutover.
- Legacy Nginx remains available for non-migrated hosts during canary.
- No critical regressions observed for at least 24 hours on canary traffic.
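For the `curl --resolve` option, a minimal sketch might look like the following. The function names and the Caddy address are placeholders (the address of the Caddy container is not recorded in this file, so it is passed in as an argument); only `--resolve` and the other curl flags are real.

```shell
#!/usr/bin/env bash
# Hypothetical helper: canary_resolve_arg HOST CADDY_IP [PORT]
# Prints the value for curl's --resolve flag, which pins HOST:PORT to
# CADDY_IP for a single request without touching DNS.
canary_resolve_arg() {
  local host="$1" caddy_ip="$2" port="${3:-443}"
  printf '%s:%s:%s\n' "$host" "$port" "$caddy_ip"
}

# Hypothetical helper: canary_check HOST CADDY_IP
# Requests https://HOST/ via the Caddy instance at CADDY_IP and prints the
# HTTP status code. TLS verification stays on, so a bad certificate fails.
canary_check() {
  local host="$1" caddy_ip="$2"
  curl --silent --show-error --output /dev/null \
    --write-out '%{http_code}\n' \
    --resolve "$(canary_resolve_arg "$host" "$caddy_ip")" \
    "https://${host}/"
}
```

A call such as `canary_check tower.sintheus.com "$CADDY_IP"` (with `CADDY_IP` set to wherever Caddy listens) exercises the canary route while public DNS still points at Nginx.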
Definition of done (final state after full migration)
- All selected domains route to Caddy through the intended ingress path:
  - LAN-only: split-DNS/private resolution to Caddy, or
  - public: DNS to WAN ingress that forwards 443 to Caddy.
- Caddy serves valid certificates for both zones.
- Functional checks pass for each service (UI load, API, websocket/streaming where relevant).
- Nginx is no longer on the request path for migrated domains.
- Long-term target: all backends upgraded to `https://` and strict mode passes.
What remains to happen
- Run canary: `./phase7_5_nginx_to_caddy.sh --mode=canary`
- Route canary traffic to Caddy using one method:
  - `curl --resolve` for zero-DNS-change testing, or
  - split-DNS/private DNS, or
  - explicit DNS cutover if desired.
- Observe errors/latency/app behavior for at least 24 hours.
- If canary is clean, run full: `./phase7_5_nginx_to_caddy.sh --mode=full`
- Move remaining routes in batches (DNS or split-DNS, depending on ingress model).
- Validate each app after each batch.
- After everything is stable, plan Nginx retirement.
- Later hardening pass:
  - enable TLS on each backend service one by one
  - flip each corresponding upstream to `https://`
  - finally run `--strict-backend-https` and require it to pass.
Risks and why mixed backend HTTP is acceptable short-term
- Risk: backend HTTP is unencrypted on LAN.
  - Mitigation: traffic stays on trusted local network, temporary state only.
- Risk: if strict mode is enabled too early, rollout blocks.
  - Mitigation: keep strict mode off until backend TLS coverage improves.
- Risk: moving all DNS at once can create broad outage.
  - Mitigation: canary-first and batch DNS cutover.
Operational notes
- If Caddyfile already exists, phase 7.5 backs it up as: `${CADDY_DATA_PATH}/Caddyfile.pre_phase7_5.<timestamp>`
- Compose stack path for Caddy: `${UNRAID_COMPOSE_DIR}/caddy/docker-compose.yml`
- Script does not change Cloudflare DNS records automatically.
- DNS updates are intentional/manual to keep blast radius controlled.
- Do not set public Cloudflare proxied records to private `192.168.x.x` addresses.
- Canary upsert behavior is domain-aware:
  - if site block for the canary domain does not exist, it is added
  - if site block exists, it is replaced in-place
  - previous block content is printed in logs before replacement
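The domain-aware upsert could be approximated as below. This is a sketch under stated assumptions, not the script's actual logic: the function name `upsert_site_block` is hypothetical, the site block is assumed to open with a line of exactly `DOMAIN {`, and braces inside the block are assumed to be balanced and not embedded in quoted strings.

```shell
#!/usr/bin/env bash
# Hypothetical helper: upsert_site_block CADDYFILE DOMAIN NEW_BLOCK_FILE
# Replaces the top-level site block for DOMAIN in place, or appends the new
# block if none exists. The previous block is printed to stderr first.
upsert_site_block() {
  local caddyfile="$1" domain="$2" new_block="$3" tmp status
  tmp="$(mktemp)" || return 1
  awk -v domain="$domain" -v newfile="$new_block" '
    # Print the replacement block read from newfile.
    function emit_new(   line) {
      while ((getline line < newfile) > 0) print line
      close(newfile)
    }
    # Net brace count on one line, used to track nesting depth.
    function delta(s,   opens, closes, t) {
      t = s; opens = gsub(/[{]/, "", t)
      t = s; closes = gsub(/[}]/, "", t)
      return opens - closes
    }
    !inside && !done && $0 == (domain " {") {
      inside = 1; depth = 0
      print "previous block for " domain ":" > "/dev/stderr"
    }
    inside {
      print $0 > "/dev/stderr"          # log the old block line
      depth += delta($0)
      if (depth == 0) { inside = 0; done = 1; emit_new() }
      next                              # drop the old line from the output
    }
    { print }
    END { if (!done) emit_new() }       # no existing block: append
  ' "$caddyfile" > "$tmp" && cat "$tmp" > "$caddyfile"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

Writing back with `cat` rather than `mv` keeps the original file's inode and permissions, which can matter when the Caddyfile is bind-mounted into a container as a single file.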