diff --git a/bundles/letsencrypt/README.md b/bundles/letsencrypt/README.md
index 1b71c2b..a1d4ab2 100644
--- a/bundles/letsencrypt/README.md
+++ b/bundles/letsencrypt/README.md
@@ -1,9 +1,60 @@
-https://github.com/dehydrated-io/dehydrated/wiki/example-dns-01-nsupdate-script
+# letsencrypt
+
+Issues and renews Let's Encrypt certs via [dehydrated][upstream] with
+DNS-01 against the in-house bind-acme server.
+
+[upstream]: https://github.com/dehydrated-io/dehydrated/wiki/example-dns-01-nsupdate-script
+
+## First-apply behaviour
+
+Immediately after `bw apply `, nginx serves a **self-signed
+cert** for each declared domain, generated by
+`/etc/dehydrated/letsencrypt-ensure-some-certificate` so nginx has
+something to start with. The real Let's Encrypt cert arrives at most
+24h later when the systemd timer fires
+(`/usr/bin/dehydrated --cron --accept-terms --challenge dns-01`). To
+shortcut the wait:
+
+```sh
+ssh 'sudo /usr/bin/dehydrated --cron --accept-terms --challenge dns-01'
+ssh 'sudo systemctl reload nginx'
+```
+
+## DNS-01 prerequisites
+
+`hook.sh` does `nsupdate` against the bind-acme server (referenced
+by `letsencrypt/acme_node`). For the challenge to succeed:
+
+1. The acme node must be in the same metadata graph (so
+   `bw metadata -k letsencrypt/acme_node` resolves).
+2. **All NS servers** for the validated domain must serve the
+   `_acme-challenge.` CNAME, because Let's Encrypt validates from
+   primary AND secondary geographic regions; both authoritative
+   servers must agree. If a secondary NS is also a bw-managed node,
+   `bw apply` it after adding the domain (see e.g. `ovh.secondary`).
+3. The bind-acme node must be reachable for TSIG-signed updates. `hook.sh` is
+   rendered with the bind-acme server's `network/internal/ipv4`;
+   for clients outside that LAN, the route must exist (typically via
+   wireguard `s2s` peer membership).
+
+## Negative-cache penalty
+
+If the first DNS-01 attempt fails (e.g. zone not yet applied to the
+secondary NS), Let's Encrypt's resolvers cache NXDOMAIN for the SOA's
+negative TTL (often 900s = 15 min). Subsequent attempts during that
+window also fail and refresh the cache. Combined with LE's rate limit
+of **5 failed authorisations per domain per hour**, recovery requires
+you to **stop retrying** for ~15 minutes after fixing the DNS, then
+make at most one attempt.
+
+## nsupdate sample
+
+For interactive testing of the bind-acme TSIG path:
 
 ```sh
 printf "server 127.0.0.1
 zone acme.resolver.name.
-update add _acme-challenge.ckn.li.acme.resolver.name. 600 IN TXT "hello"
+update add _acme-challenge.ckn.li.acme.resolver.name. 600 IN TXT \"hello\"
 send
 " | nsupdate -y hmac-sha512:acme:XXXXXX
 ```
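
The "often 900s" figure in the negative-cache section depends on the zone's SOA, so it may be worth reading the actual value before deciding how long to pause. The following is a minimal sketch, not part of the bundle: `negative_ttl_from_soa` is a hypothetical helper name, and it assumes its input is a single SOA line in the seven-field form that `dig +short SOA` prints.

```sh
#!/bin/sh
# Hypothetical helper (an illustration, not shipped with this bundle).
# Input: one SOA line as printed by `dig +short SOA <zone>`:
#   mname rname serial refresh retry expire minimum
# Output: the "minimum" field, which bounds the negative-caching TTL.
negative_ttl_from_soa() {
    # Print the 7th field only if the line has exactly 7 fields and the
    # last one is numeric; otherwise print nothing and return non-zero.
    printf '%s\n' "$1" | awk '
        NF == 7 && $7 ~ /^[0-9]+$/ { print $7; ok = 1 }
        END { exit !ok }
    '
}
```

After fixing the DNS, something like `sleep "$(negative_ttl_from_soa "$(dig +short SOA "$zone")")"` before the single retry would apply it, where `$zone` is an assumed variable holding the zone of the validated domain. The effective negative TTL can be shorter when the SOA record's own TTL is lower than its minimum field, so this value errs on the long side.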