letsencrypt/README: first-apply, DNS-01 prereqs, negative-cache

Reshapes the existing scratchpad README into operational sections.
Captures three things that took the left4me-integration session
~30 minutes to figure out:

- After bw apply, nginx serves a self-signed cert until the daily
  systemd timer fires; the dehydrated --cron one-liner shortcuts the
  wait.
- DNS-01 needs all NS servers (primary AND secondary) to serve the
  _acme-challenge CNAME, the acme node reachable, and TSIG-key
  reachability via wireguard for off-LAN clients.
- LE's negative-cache + rate-limit combo: stop retrying for ~15 min
  after fixing DNS, then make at most one attempt.

Existing nsupdate sample preserved at the bottom.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

parent 7a579f27c5
commit 05abe52221

1 changed file with 53 additions and 2 deletions

@@ -1,9 +1,60 @@

# letsencrypt

Issues and renews Let's Encrypt certs via [dehydrated][upstream] with
DNS-01 against the in-house bind-acme server.

[upstream]: https://github.com/dehydrated-io/dehydrated/wiki/example-dns-01-nsupdate-script

## First-apply behaviour

Immediately after `bw apply <node>`, nginx serves a **self-signed
cert** for each declared domain, generated by
`/etc/dehydrated/letsencrypt-ensure-some-certificate` so nginx has
something to start with. The real Let's Encrypt cert arrives at most
24h later when the systemd timer fires
(`/usr/bin/dehydrated --cron --accept-terms --challenge dns-01`). To
shortcut the wait:

```sh
ssh <node> 'sudo /usr/bin/dehydrated --cron --accept-terms --challenge dns-01'
ssh <node> 'sudo systemctl reload nginx'
```

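To confirm which certificate nginx is actually serving before and after the manual run, the issuer can be inspected with `openssl`. This is a minimal sketch, not part of the bundle: the helper names `cert_issuer` and `is_letsencrypt` are hypothetical, and it assumes the host answers TLS on port 443 and that the issuing intermediate carries `Let's Encrypt` in its organisation field.

```sh
# Print the issuer of the certificate a host currently serves.
# Usage: cert_issuer <hostname> [port]   (hypothetical helper)
cert_issuer() {
    host=$1
    port=${2:-443}
    openssl s_client -connect "$host:$port" -servername "$host" \
        </dev/null 2>/dev/null | openssl x509 -noout -issuer
}

# Read an issuer line on stdin; succeed if it names Let's Encrypt.
is_letsencrypt() {
    grep -q "Let's Encrypt"
}
```

For example, `cert_issuer <domain> | is_letsencrypt && echo "real cert"` stays silent while the self-signed placeholder is still in place.
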
## DNS-01 prerequisites

`hook.sh` runs `nsupdate` against the bind-acme server (referenced
by `letsencrypt/acme_node`). For the challenge to succeed:

1. The acme node must be in the same metadata graph (so
   `bw metadata <node> -k letsencrypt/acme_node` resolves).
2. **All NS servers** for the validated domain must serve the
   `_acme-challenge.<domain>` CNAME. Let's Encrypt validates from
   several network vantage points and may query any authoritative
   server, so primary and secondary must agree. If a secondary NS is
   also a bw-managed node, `bw apply` it after adding the domain (see
   e.g. `ovh.secondary`).
3. The bind-acme server must be reachable for the TSIG-signed update.
   `hook.sh` is rendered with the bind-acme server's
   `network/internal/ipv4`, so clients outside that LAN need a route
   to it (typically via wireguard `s2s` peer membership).

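The second prerequisite can be checked ahead of time by asking every authoritative server directly instead of going through a caching resolver. A sketch, assuming `dig` is installed; `check_acme_cname` is a hypothetical helper, and its first argument must be the zone apex that holds the NS records, which may differ from the validated name.

```sh
# check_acme_cname <zone> <fqdn>   (hypothetical helper)
# Ask each authoritative NS of <zone> for the _acme-challenge CNAME of
# <fqdn> and report every server that does not return one.
check_acme_cname() {
    zone=$1
    fqdn=$2
    rc=0
    for ns in $(dig +short NS "$zone"); do
        target=$(dig +short +norecurse "@$ns" "_acme-challenge.$fqdn" CNAME)
        if [ -n "$target" ]; then
            echo "ok      $ns -> $target"
        else
            echo "MISSING $ns"
            rc=1
        fi
    done
    return $rc
}
```

A non-zero exit status means at least one NS is missing the record, which is the situation that leads to the failure mode described in the next section.
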
## Negative-cache penalty

If the first DNS-01 attempt fails (e.g. zone not yet applied to the
secondary NS), Let's Encrypt's resolvers cache the NXDOMAIN for the
SOA's negative TTL (often 900s = 15 min). Subsequent attempts during
that window also fail and refresh the cache. Combined with LE's rate
limit of **5 failed authorisations per hostname per account per hour**,
recovery requires you to **stop retrying** for ~15 minutes after
fixing the DNS, then make at most one attempt.

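Rather than assuming 15 minutes, the wait can be read from the zone itself. The sketch below prints the SOA MINIMUM field, which is the last field of `dig +short SOA` output; strictly, the negative TTL is the smaller of that field and the SOA record's own TTL (RFC 2308), so treat the value as an upper bound. Both helper names are hypothetical, and `dig` is assumed to be installed.

```sh
# negative_ttl <zone>: print the SOA MINIMUM field in seconds.
negative_ttl() {
    dig +short SOA "$1" | awk '{ print $NF }'
}

# wait_out_negative_cache <zone>: sleep for that long, then return so
# that exactly one retry can follow.
wait_out_negative_cache() {
    ttl=$(negative_ttl "$1")
    [ -n "$ttl" ] || { echo "no SOA found for $1" >&2; return 1; }
    echo "waiting ${ttl}s before the single retry"
    sleep "$ttl"
}
```

Chaining it as `wait_out_negative_cache <zone> && ssh <node> 'sudo /usr/bin/dehydrated --cron --accept-terms --challenge dns-01'` keeps the retry to a single attempt.
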
## nsupdate sample

For interactive testing of the bind-acme TSIG path:

```sh
# Run on the bind-acme server itself (server 127.0.0.1);
# XXXXXX is a placeholder for the TSIG secret.
printf "server 127.0.0.1
zone acme.resolver.name.
update add _acme-challenge.ckn.li.acme.resolver.name. 600 IN TXT \"hello\"
send
" | nsupdate -y hmac-sha512:acme:XXXXXX
```
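
After a test update it is worth checking that the record was actually stored, and then removing it so no stray TXT value is left behind. A sketch under the same assumptions as the sample above (run on the bind-acme server, `XXXXXX` standing in for the TSIG secret); the function names are hypothetical.

```sh
test_record=_acme-challenge.ckn.li.acme.resolver.name.

# Print the nsupdate commands that delete all TXT records at the test name.
delete_script() {
    printf 'server 127.0.0.1\nzone acme.resolver.name.\nupdate delete %s TXT\nsend\n' \
        "$test_record"
}

# Show what the local bind currently serves for the test name.
show_test_record() {
    dig +short TXT "$test_record" @127.0.0.1
}

# Send the delete as a TSIG-signed update; XXXXXX is a placeholder secret.
delete_test_record() {
    delete_script | nsupdate -y hmac-sha512:acme:XXXXXX
}
```

Run `show_test_record` after the sample to see the stored value, then `delete_test_record` to clean up.
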