Reshapes the existing scratchpad README into operational sections. Captures three things that took the left4me-integration session ~30 minutes to figure out:

- After bw apply, nginx serves a self-signed cert until the daily systemd timer fires; the dehydrated --cron one-liner shortcuts the wait.
- DNS-01 needs all NS servers (primary AND secondary) to serve the _acme-challenge CNAME, the acme node reachable, and TSIG-key reachability via wireguard for off-LAN clients.
- LE's negative-cache + rate-limit combo: stop retrying for ~15 min after fixing DNS, then make at most one attempt.

Existing nsupdate sample preserved at the bottom.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
letsencrypt
Issues and renews Let's Encrypt certs via dehydrated with DNS-01 against the in-house bind-acme server.
First-apply behaviour
Immediately after bw apply <node>, nginx serves a self-signed
cert for each declared domain — generated by
/etc/dehydrated/letsencrypt-ensure-some-certificate so nginx has
something to start with. The real Let's Encrypt cert arrives at most
24h later when the systemd timer fires
(/usr/bin/dehydrated --cron --accept-terms --challenge dns-01). To
shortcut the wait:
ssh <node> 'sudo /usr/bin/dehydrated --cron --accept-terms --challenge dns-01'
ssh <node> 'sudo systemctl reload nginx'
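To see whether nginx is still on the placeholder cert, one option is to compare issuer and subject of the cert it serves. The sketch below is a heuristic, not part of the bundle: the function names is_self_signed and served_cert are made up for illustration, and it assumes openssl is installed and the domain is reachable on port 443.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not shipped with this bundle.

# Read a PEM certificate on stdin; succeed if issuer and subject are equal,
# which is the case for the self-signed placeholder cert.
is_self_signed() {
    pem=$(cat)
    issuer=$(printf '%s\n' "$pem" | openssl x509 -noout -issuer | sed 's/^issuer= *//')
    subject=$(printf '%s\n' "$pem" | openssl x509 -noout -subject | sed 's/^subject= *//')
    [ -n "$issuer" ] && [ "$issuer" = "$subject" ]
}

# Print the leaf certificate a host currently serves on port 443 (PEM).
served_cert() {
    host=$1
    openssl s_client -connect "$host:443" -servername "$host" </dev/null 2>/dev/null \
        | openssl x509
}

# Usage (replace <domain> with a declared domain):
#   served_cert <domain> | is_self_signed && echo "still self-signed"
```

If the check still reports self-signed after the dehydrated run, the DNS-01 prerequisites are the next place to look.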
DNS-01 prerequisites
hook.sh does nsupdate against the bind-acme server (referenced
by letsencrypt/acme_node). For the challenge to succeed:
- The acme node must be in the same metadata graph (so
  bw metadata <node> -k letsencrypt/acme_node resolves).
- All NS servers for the validated domain must serve the
  _acme-challenge.<domain> CNAME. Let's Encrypt validates from several network vantage points and may query any authoritative server, so primary AND secondary must agree. If a secondary NS is also a bw-managed node, bw apply it after adding the domain (see e.g. ovh.secondary).
- The bind-acme node's TSIG key must be reachable.
  hook.sh is rendered with the bind-acme server's network/internal/ipv4; for clients outside that LAN, the route must exist (typically via wireguard s2s peer membership).
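The NS-agreement prerequisite can be checked before requesting a cert. The following is a rough sketch, assuming dig is available; check_acme_cname is a hypothetical helper name, and the zone and domain are passed in by hand rather than read from bw metadata.

```shell
#!/bin/sh
# Sketch only: check_acme_cname is a hypothetical helper, not part of the bundle.

# Ask every authoritative NS of <zone> for the _acme-challenge.<domain> CNAME.
# Prints one line per NS and returns non-zero if any NS has no answer
# or if no NS records were found at all.
check_acme_cname() {
    zone=$1
    domain=$2
    failed=0
    seen=0
    for ns in $(dig +short "$zone" NS); do
        seen=1
        target=$(dig +short +norecurse @"$ns" "_acme-challenge.$domain" CNAME)
        if [ -n "$target" ]; then
            echo "ok      $ns -> $target"
        else
            echo "MISSING $ns"
            failed=1
        fi
    done
    [ "$seen" -eq 1 ] || { echo "no NS records found for $zone" >&2; failed=1; }
    return "$failed"
}

# Usage (placeholders): check_acme_cname <zone> <domain>
```

A MISSING line for a secondary usually means the zone change has not been applied or transferred there yet.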
Negative-cache penalty
If the first DNS-01 attempt fails (e.g. zone not yet applied to the secondary NS), Let's Encrypt's resolvers cache NXDOMAIN for the SOA's negative TTL (often 900s = 15 min). Subsequent attempts during that window also fail and refresh the cache. Combined with LE's rate limit of 5 failed authorisations per hostname per account per hour, recovery requires stopping all retries for ~15 minutes after fixing the DNS, then making at most one attempt.
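How long to wait can be read from the zone's SOA record instead of guessed. Per RFC 2308 the negative-caching TTL is the smaller of the SOA record's own TTL and its MINIMUM field. A minimal sketch, assuming dig's default single-line SOA output; negative_ttl is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch only: negative_ttl is a hypothetical helper, not part of the bundle.

# Read `dig +noall +answer <zone> SOA` output on stdin and print the
# negative-caching TTL in seconds: min(SOA record TTL, SOA MINIMUM field).
# Fields: name ttl class type mname rname serial refresh retry expire minimum
negative_ttl() {
    awk '$4 == "SOA" { print ($2 < $11 ? $2 : $11) }'
}

# Usage (replace <zone> and <node>):
#   ttl=$(dig +noall +answer <zone> SOA | negative_ttl)
#   sleep "$ttl"
#   ssh <node> 'sudo /usr/bin/dehydrated --cron --accept-terms --challenge dns-01'
```

Sleeping for the full TTL once and then making a single attempt keeps the failed-authorisation count low.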
nsupdate sample
For interactive testing of the bind-acme TSIG path:
printf "server 127.0.0.1
zone acme.resolver.name.
update add _acme-challenge.ckn.li.acme.resolver.name. 600 IN TXT \"hello\"
send
" | nsupdate -y hmac-sha512:acme:XXXXXX
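The sample above leaves the test TXT record in the zone. A small sketch that generates the same nsupdate input for both adding and deleting the record; nsupdate_commands is a hypothetical helper name, and the server and zone are copied from the sample rather than read from metadata.

```shell
#!/bin/sh
# Sketch only: nsupdate_commands is a hypothetical helper, not part of the bundle.

# Print nsupdate input that adds or deletes a TXT record in the acme zone.
#   nsupdate_commands add <name> <value>
#   nsupdate_commands delete <name>
nsupdate_commands() {
    action=${1:-}
    name=${2:-}
    value=${3:-}
    case "$action" in
        add)    update="update add $name 600 IN TXT \"$value\"" ;;
        delete) update="update delete $name TXT" ;;
        *)      echo "usage: nsupdate_commands add|delete <name> [value]" >&2
                return 2 ;;
    esac
    echo "server 127.0.0.1"
    echo "zone acme.resolver.name."
    echo "$update"
    echo "send"
}

# Usage, with the same TSIG placeholder as the sample:
#   nsupdate_commands add _acme-challenge.ckn.li.acme.resolver.name. hello \
#       | nsupdate -y hmac-sha512:acme:XXXXXX
#   dig +short @127.0.0.1 _acme-challenge.ckn.li.acme.resolver.name. TXT
#   nsupdate_commands delete _acme-challenge.ckn.li.acme.resolver.name. \
#       | nsupdate -y hmac-sha512:acme:XXXXXX
```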