Compare commits
No commits in common. "c6caf2a1cf532cc675778840689daaba28891621" and "d4dedde0ad8af5d384df8653fa0e26d25a89bdf8" have entirely different histories.
c6caf2a1cf
...
d4dedde0ad
33 changed files with 23 additions and 2011 deletions
2
.gitignore
vendored
|
|
@@ -5,5 +5,3 @@
|
||||||
.bw_debug_history
|
.bw_debug_history
|
||||||
# CocoIndex Code (ccc)
|
# CocoIndex Code (ccc)
|
||||||
/.cocoindex_code/
|
/.cocoindex_code/
|
||||||
# bundlewrap git_deploy local-mirror map (operator-specific paths)
|
|
||||||
git_deploy_repos
|
|
||||||
|
|
|
||||||
15
AGENTS.md
|
|
@@ -12,12 +12,12 @@ not project documentation. Onboarding lives **here**, in `AGENTS.md`.
|
||||||
|
|
||||||
## Quickstart for agents
|
## Quickstart for agents
|
||||||
|
|
||||||
Six rules; follow these and you won't break things:
|
Five rules; follow these and you won't break things:
|
||||||
|
|
||||||
1. **Read-only by default.** Never run `bw apply`, `bw run`, or
|
1. **Read-only by default.** Never run `bw apply`, `bw run`, or
|
||||||
`bw lock` without explicit user request — even with `-i`. Stick
|
`bw lock` without explicit user request — even with `-i`. Stick
|
||||||
to `bw test`, `bw nodes`, `bw groups`, `bw items`,
|
to `bw test`, `bw nodes`, `bw groups`, `bw bundles`,
|
||||||
`bw metadata`, `bw hash`, `bw verify`, `bw debug`. See
|
`bw items`, `bw metadata`, `bw hash`, `bw debug`. See
|
||||||
[`docs/agents/commands.md`](docs/agents/commands.md) and the
|
[`docs/agents/commands.md`](docs/agents/commands.md) and the
|
||||||
fork's [safety envelope](https://github.com/CroneKorkN/bundlewrap/blob/main/AGENTS.md).
|
fork's [safety envelope](https://github.com/CroneKorkN/bundlewrap/blob/main/AGENTS.md).
|
||||||
2. **Never echo decrypted secrets.** Don't print, paste, or log the
|
2. **Never echo decrypted secrets.** Don't print, paste, or log the
|
||||||
|
|
@@ -38,15 +38,6 @@ Six rules; follow these and you won't break things:
|
||||||
5. **Prefer adding helpers to `libs/`** over duplicating logic across
|
5. **Prefer adding helpers to `libs/`** over duplicating logic across
|
||||||
bundles. Repo-wide helpers go in
|
bundles. Repo-wide helpers go in
|
||||||
[`libs/`](libs/AGENTS.md), reachable as `repo.libs.<x>`.
|
[`libs/`](libs/AGENTS.md), reachable as `repo.libs.<x>`.
|
||||||
6. **`ccc` is available for semantic search.** This repo is indexed
|
|
||||||
with [`ccc`](https://github.com/cocoindex-io/cocoindex-code).
|
|
||||||
Reach for it on conceptual questions ("where is X used / which
|
|
||||||
bundles do Y / what are the contexts of Z"), where a keyword
|
|
||||||
grep would miss indirect usage:
|
|
||||||
`ccc search '<concept>' --path '**'`. Pass `--path '**'` —
|
|
||||||
without it, results are filtered to the current working
|
|
||||||
directory's subtree. `grep`/`rg`/`find` remain fine for
|
|
||||||
exact-string lookups; pick whichever fits the question.
|
|
||||||
|
|
||||||
## Layout
|
## Layout
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@@ -41,16 +41,6 @@ bundles/<name>/
|
||||||
more than one bundle. Don't duplicate logic across bundles.
|
more than one bundle. Don't duplicate logic across bundles.
|
||||||
- **Custom item types** (e.g. `download:`) live in
|
- **Custom item types** (e.g. `download:`) live in
|
||||||
[`items/`](../items/AGENTS.md), not per-bundle.
|
[`items/`](../items/AGENTS.md), not per-bundle.
|
||||||
- **Bundles own application-wide knowledge; nodes carry only the few
|
|
||||||
per-host knobs the bundle actually needs.** When designing a bundle,
|
|
||||||
identify the per-node knobs (e.g. domain, uplink interface, a
|
|
||||||
vault-id suffix) and put everything else in `defaults`, or in a
|
|
||||||
reactor that derives from those knobs. Per-node random secrets
|
|
||||||
belong in `defaults` via `repo.vault.random_bytes_as_base64_for(...)`
|
|
||||||
keyed on the node — not in the node file. See
|
|
||||||
`bundles/left4me/metadata.py:10` (`secret_key` derived in defaults)
|
|
||||||
and `bundles/postgresql/metadata.py:4` (vault-derived `password_for`
|
|
||||||
at module scope).
|
|
||||||
|
|
||||||
## How to add a new bundle
|
## How to add a new bundle
|
||||||
|
|
||||||
|
|
@@ -66,22 +56,12 @@ bundles/<name>/
|
||||||
[`groups/<axis>/<x>.py`](../groups/AGENTS.md) (preferred for shared
|
[`groups/<axis>/<x>.py`](../groups/AGENTS.md) (preferred for shared
|
||||||
bundles) or to the node's `bundles` list directly
|
bundles) or to the node's `bundles` list directly
|
||||||
([`nodes/AGENTS.md`](../nodes/AGENTS.md)).
|
([`nodes/AGENTS.md`](../nodes/AGENTS.md)).
|
||||||
5. **Verify, in this order:**
|
5. Verify, in this order:
|
||||||
- `bw test` — repo-wide parse + cross-cutting hooks. Loads every
|
- `bw test` — sanity (loaders + reactors).
|
||||||
bundle, but reactors don't fire for nodes that haven't opted into
|
- `bw items <node>` — confirm new items appear on a node that opts in.
|
||||||
the bundle yet — bugs in new reactors stay hidden here.
|
- `bw hash <node>` — confirm the change is what you expected. See
|
||||||
- **Attach the bundle to a node** (via the node's `bundles` list, or
|
[`docs/agents/commands.md`](../docs/agents/commands.md) and the
|
||||||
a group it belongs to). Until you do, the next steps don't actually
|
fork's hash-diff workflow.
|
||||||
exercise the bundle.
|
|
||||||
- `bw test <node>` — exercises every reactor and item-graph edge for
|
|
||||||
that node. This is where most new-bundle bugs surface.
|
|
||||||
- `bw items <node> --blame` — confirm items materialise with the
|
|
||||||
right paths, authored by the expected bundle.
|
|
||||||
- `bw metadata <node> -k <a/b>` — spot-check derived metadata.
|
|
||||||
- `bw hash <node>` — preview vs current host state.
|
|
||||||
|
|
||||||
See [`docs/agents/commands.md#bundle-validation-workflow`](../docs/agents/commands.md#bundle-validation-workflow)
|
|
||||||
for the rationale.
|
|
||||||
6. Add a `bundles/<name>/README.md`. See "Per-bundle README" below
|
6. Add a `bundles/<name>/README.md`. See "Per-bundle README" below
|
||||||
for what to cover.
|
for what to cover.
|
||||||
|
|
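The longer version of the verification checklist above can be written down as an ordered list of read-only commands. This is a minimal sketch: the function name `validation_commands`, the node name `example.node`, and the metadata key are hypothetical, and the commands are only printed, not executed. The bundle must already be attached to the node before the per-node steps mean anything.

```python
import shlex


def validation_commands(node: str, metadata_key: str) -> list[list[str]]:
    """Return the checklist's read-only bw commands, in order.

    Hypothetical helper for illustration; it only builds argv lists.
    """
    return [
        ['bw', 'test'],                                # repo-wide parse and hooks
        ['bw', 'test', node],                          # reactors and item graph for one node
        ['bw', 'items', node, '--blame'],              # which bundle authored which item
        ['bw', 'metadata', node, '-k', metadata_key],  # spot-check derived metadata
        ['bw', 'hash', node],                          # preview against current state
    ]


# Print the commands for a hypothetical node instead of running them.
for argv in validation_commands('example.node', 'left4me/domain'):
    print(shlex.join(argv))
```

Keeping the steps as data makes the order explicit and leaves the decision to run each command with the operator.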
||||||
|
|
@@ -102,12 +82,6 @@ bundles/<name>/
|
||||||
unless the matching `file:` item declares `content_type='mako'`
|
unless the matching `file:` item declares `content_type='mako'`
|
||||||
(or a templating extension triggers it). To check, read the matching
|
(or a templating extension triggers it). To check, read the matching
|
||||||
`file:` entry in `items.py`.
|
`file:` entry in `items.py`.
|
||||||
- **`file:` `source` defaults to the destination basename.** For a
|
|
||||||
destination of `/etc/foo/bar.conf` with no `source` key, bw looks
|
|
||||||
for `bundles/<bundle>/files/bar.conf`. Only declare `source`
|
|
||||||
explicitly when the basename you want differs (e.g. shipping a Mako
|
|
||||||
template named `bar.conf.mako` to a destination of
|
|
||||||
`/etc/foo/bar.conf`).
|
|
||||||
- **Reactors writing across namespaces.** Some bundles' reactors write
|
- **Reactors writing across namespaces.** Some bundles' reactors write
|
||||||
into other bundles' metadata namespaces (e.g. `nextcloud` writes
|
into other bundles' metadata namespaces (e.g. `nextcloud` writes
|
||||||
into `apt.packages`, `archive.paths`). When you change such a bundle,
|
into `apt.packages`, `archive.paths`). When you change such a bundle,
|
||||||
|
|
@@ -116,28 +90,6 @@ bundles/<name>/
|
||||||
itself; grep `'<other-bundle>':` in the reactors when in doubt.
|
itself; grep `'<other-bundle>':` in the reactors when in doubt.
|
||||||
- **`bw hash` doesn't accept selectors.** Use `bw hash <node>` per
|
- **`bw hash` doesn't accept selectors.** Use `bw hash <node>` per
|
||||||
literal name; see the fork's runbook.
|
literal name; see the fork's runbook.
|
||||||
- **Reactors must read metadata.** If a reactor body returns a static
|
|
||||||
dict without calling `metadata.get(...)`, bw raises
|
|
||||||
`ValueError: <reactor> on <node> did not request any metadata, you
|
|
||||||
might want to use defaults instead` once a node consumes the bundle.
|
|
||||||
Fix: fold the contribution into `defaults`. The rule applies even
|
|
||||||
when the reactor writes into another bundle's namespace — a static
|
|
||||||
contribution to e.g. `nftables/output` belongs in `defaults`, where
|
|
||||||
bw merges it with other bundles' contributions.
|
|
||||||
- **`triggers` ↔ `triggered: True` invariant.** Any item listed in
|
|
||||||
another's `triggers` list must declare `triggered: True`. bw
|
|
||||||
enforces this at `bw test` time: *"…triggered by …, but missing
|
|
||||||
'triggered' attribute"*. Corollary: an action can't be both in an
|
|
||||||
upstream `triggers` list AND self-healing every apply — pick one.
|
|
||||||
- **Triggered actions don't recover from partial failure.** When an
|
|
||||||
upstream item's apply succeeds but its triggered downstream action
|
|
||||||
fails, subsequent applies can't recover via the trigger chain —
|
|
||||||
upstream is "already in desired state" and never re-triggers. For
|
|
||||||
actions that must self-heal (pip installs, chowns, migrations),
|
|
||||||
drop `triggered: True` and gate the command with `unless: <fast-check>`.
|
|
||||||
`unless` is a shell command on the target host whose exit status
|
|
||||||
decides whether the main command runs (exit 0 = skip); it's checked
|
|
||||||
at fire time, after `triggered:` filtering.
|
|
||||||
|
|
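The last two gotchas above can be sketched as a plain `actions` dict of the kind a bundle's `items.py` defines. Everything specific here is hypothetical: the action names, paths, package name, and the `missing_triggered` helper are chosen for illustration, and the helper uses bare names rather than bw item IDs. Only the `triggers`, `triggered`, and `unless` keys come from the text.

```python
# Hypothetical items.py fragment; names, paths, and commands are illustrative.
VENV = '/opt/example/.venv'

actions = {
    # Fragile variant: runs only when an upstream item lists it in `triggers`.
    # If the command fails once, upstream is already in its desired state
    # and never fires the trigger again.
    'example_pip_install_triggered': {
        'command': f'{VENV}/bin/pip install -e /opt/example/src',
        'triggered': True,
    },
    # Self-healing variant: no `triggered`, so it is considered on every
    # apply; `unless` exits 0 when the work is already done, which skips it.
    'example_pip_install_self_healing': {
        'command': f'{VENV}/bin/pip install -e /opt/example/src',
        'unless': f'{VENV}/bin/pip show example >/dev/null 2>&1',
    },
}


def missing_triggered(items: dict) -> list:
    """Return names listed in some `triggers` list that lack `triggered: True`.

    A simplified stand-in for the invariant bw checks at `bw test` time.
    """
    targets = {t for attrs in items.values() for t in attrs.get('triggers', [])}
    return sorted(t for t in targets if not items.get(t, {}).get('triggered', False))


print(missing_triggered(actions))  # nothing here uses `triggers`, so: []
```

The `unless` check should be cheap, since it runs on the target host on every apply.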
||||||
## Per-bundle README
|
## Per-bundle README
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@@ -33,7 +33,6 @@ def acme_zone(metadata):
|
||||||
str(ip_interface(other_node.metadata.get('network/internal/ipv4')).ip)
|
str(ip_interface(other_node.metadata.get('network/internal/ipv4')).ip)
|
||||||
for other_node in repo.nodes
|
for other_node in repo.nodes
|
||||||
if other_node.metadata.get('letsencrypt/domains', {})
|
if other_node.metadata.get('letsencrypt/domains', {})
|
||||||
and other_node.metadata.get('network/internal/ipv4', None)
|
|
||||||
},
|
},
|
||||||
*{
|
*{
|
||||||
str(ip_interface(other_node.metadata.get('wireguard/my_ip')).ip)
|
str(ip_interface(other_node.metadata.get('wireguard/my_ip')).ip)
|
||||||
|
|
|
||||||
|
|
@@ -1,30 +0,0 @@
|
||||||
# bind
|
|
||||||
|
|
||||||
Authoritative DNS — primary plus optional `bind/master_node` slaves.
|
|
||||||
|
|
||||||
## Applying changes needs both nodes
|
|
||||||
|
|
||||||
The slave's bw-managed zone files are rendered from the master's
|
|
||||||
metadata at slave-apply time (see `bundles/bind/items.py:100`). When
|
|
||||||
you change a record on the master (adding a `letsencrypt/domains`
|
|
||||||
entry, a new vhost, etc.), the change is only published once you
|
|
||||||
apply BOTH:
|
|
||||||
|
|
||||||
```sh
|
|
||||||
bw apply htz.mails # primary (where the source records live)
|
|
||||||
bw apply ovh.secondary # secondary (renders its own zone files)
|
|
||||||
```
|
|
||||||
|
|
||||||
Until both have been applied, `bw verify ovh.secondary` will show
|
|
||||||
stale zones and consumers that hit the secondary (Let's Encrypt's
|
|
||||||
secondary-region validators in particular) will see NXDOMAIN. Even
|
|
||||||
though the slave's named.conf.local declares `type slave;`, don't
|
|
||||||
rely on bind's own AXFR catching up — the bw-rendered file on disk
|
|
||||||
is what `bw verify` measures.
|
|
||||||
|
|
||||||
## See also
|
|
||||||
|
|
||||||
- `bundles/bind-acme/` — the in-house ACME-update receiver.
|
|
||||||
- `bundles/letsencrypt/README.md` — DNS-01 prerequisites and the
|
|
||||||
negative-cache penalty (the most common operational consequence
|
|
||||||
of forgetting to apply the secondary).
|
|
||||||
|
|
@@ -1,114 +0,0 @@
|
||||||
# left4me
|
|
||||||
|
|
||||||
L4D2 game-server management platform: a Flask web UI on gunicorn that
|
|
||||||
provisions per-instance srcds servers via templated systemd units, with
|
|
||||||
kernel-overlayfs layering for shared installations + per-overlay maps,
|
|
||||||
and uid-based DSCP/priority marking on the egress path so CAKE on the
|
|
||||||
external interface prioritizes srcds UDP over bulk traffic.
|
|
||||||
|
|
||||||
## Metadata
|
|
||||||
|
|
||||||
```python
|
|
||||||
'metadata': {
|
|
||||||
'left4me': {
|
|
||||||
'domain': 'whatever.tld', # required — the only per-node knob
|
|
||||||
# Everything below is optional and has a sensible default in the
|
|
||||||
# bundle. Override per-node only if the default is wrong:
|
|
||||||
# 'git_url': 'git@git.sublimity.de:cronekorkn/left4me',
|
|
||||||
# 'git_branch': 'master',
|
|
||||||
# 'gunicorn_workers': 1,
|
|
||||||
# 'gunicorn_threads': 32,
|
|
||||||
# 'job_worker_threads': 4,
|
|
||||||
# 'port_range_start': 27015,
|
|
||||||
# 'port_range_end': 27115,
|
|
||||||
# secret_key is auto-derived per node
|
|
||||||
# (repo.vault.random_bytes_as_base64_for f'{node.name} left4me secret_key').
|
|
||||||
},
|
|
||||||
},
|
|
||||||
```
|
|
||||||
|
|
||||||
The bundle's `derived_from_domain` reactor reads `left4me/domain` and
|
|
||||||
emits the corresponding `nginx/vhosts`, `letsencrypt/domains`,
|
|
||||||
`monitoring/services/left4me-web` (HTTPS health check), and the game-
|
|
||||||
port `nftables/input` accept rules. Backup paths
|
|
||||||
(`/var/lib/left4me`, `/etc/left4me`) are set-merged into `backup/paths`
|
|
||||||
from defaults. None of these need to be declared per-node.
|
|
||||||
|
|
||||||
## What this bundle does
|
|
||||||
|
|
||||||
- Creates system users `left4me` (uid/gid 980, home `/var/lib/left4me`,
|
|
||||||
mode 0711) and `l4d2-sandbox` (uid/gid 981, no home, used by bwrap
|
|
||||||
script-overlay builds).
|
|
||||||
- Drops privileged helpers under `/usr/local/libexec/left4me/`
|
|
||||||
(`left4me-systemctl`, `left4me-journalctl`, `left4me-overlay`,
|
|
||||||
`left4me-script-sandbox`) plus a tight sudoers file (validated with
|
|
||||||
`visudo -cf` before install).
|
|
||||||
- `git_deploy`s the left4me repo to `/opt/left4me/src`, builds a venv at
|
|
||||||
`/opt/left4me/.venv`, `pip install -e`s both `l4d2host` and `l4d2web`,
|
|
||||||
runs `alembic upgrade head` and `flask seed-script-overlays`, then
|
|
||||||
enables `left4me-web.service`.
|
|
||||||
- Emits four systemd units via `systemd/units` metadata (consumed by
|
|
||||||
`bundles/systemd/`):
|
|
||||||
- `left4me-web.service` — gunicorn on `127.0.0.1:8000` (TLS terminates upstream).
|
|
||||||
- `left4me-server@.service` — per-instance srcds template, started on
|
|
||||||
demand by the web app via the `left4me-systemctl` helper.
|
|
||||||
- `l4d2-game.slice` / `l4d2-build.slice` — cgroup slices for the
|
|
||||||
perf-baseline (CPU/IO weights, memory caps).
|
|
||||||
- Contributes uid-based DSCP/priority marks for srcds UDP egress to
|
|
||||||
`nftables/output` (via `defaults`).
|
|
||||||
|
|
||||||
## Gotchas
|
|
||||||
|
|
||||||
- **Requires `bundles/nftables` and `bundles/systemd` on the node.** The
|
|
||||||
bundle asserts membership at `bw test` time. On Debian-13 these ride
|
|
||||||
in via the `debian-13` group, so attaching the bundle to a Debian-13
|
|
||||||
node is enough.
|
|
||||||
- **`left4me-web.service` does not have `NoNewPrivileges=true`.** This is
|
|
||||||
intentional — workers `sudo` the privileged helpers; `NoNewPrivileges`
|
|
||||||
would block setuid escalation. Per-instance `server@.service` units
|
|
||||||
*do* have it.
|
|
||||||
- **CAKE shaping is configured separately**, via
|
|
||||||
`network/<iface>/cake` on the node (consumed by `bundles/network/`),
|
|
||||||
not by this bundle.
|
|
||||||
- **First-run admin user is manual.** After `bw apply`, ssh to the host and
|
|
||||||
bootstrap the admin via the `left4me` wrapper (it sources the env files,
|
|
||||||
drops to the `left4me` user, and runs the flask CLI):
|
|
||||||
`sudo left4me create-user <username> --admin` (prompts for password via
|
|
||||||
the flask CLI, or set `LEFT4ME_ADMIN_PASSWORD` first). The bundle
|
|
||||||
deliberately doesn't seed an admin to keep credentials out of the
|
|
||||||
metadata pipeline. The same `left4me` wrapper accepts any other flask
|
|
||||||
subcommand: `sudo left4me seed-script-overlays <dir>`,
|
|
||||||
`sudo left4me routes`, `sudo left4me shell`, etc.
|
|
||||||
- **CPU isolation is managed by this bundle**, driven by one required
|
|
||||||
per-node knob: `left4me/system_cpus` — a set of int CPU ids that
|
|
||||||
pins `system.slice` / `user.slice` / `l4d2-build.slice`. The
|
|
||||||
complement (`set(range(vm/threads)) - system_cpus`) pins
|
|
||||||
`l4d2-game.slice`. On HT hosts, list both SMT siblings of every
|
|
||||||
physical core you want to reserve for system, otherwise games end
|
|
||||||
up sharing L1/L2 with system. Find pairings via
|
|
||||||
`/sys/devices/system/cpu/cpu<n>/topology/thread_siblings_list`. On
|
|
||||||
the prod node (`ovh.left4me`, 4 physical / 8 threads, pairings
|
|
||||||
(0,4) (1,5) (2,6) (3,7)) the node sets `'system_cpus': {0, 4}` to
|
|
||||||
reserve physical core 0 entirely. `l4d2-game.slice` and
|
|
||||||
`l4d2-build.slice` carry `AllowedCPUs=` inline on their unit
|
|
||||||
definitions; `system.slice` and `user.slice` get drop-ins registered
|
|
||||||
under `systemd/units` with the `'<parent>.d/<basename>.conf'` key
|
|
||||||
convention (same shape nginx and autologin use), landing at
|
|
||||||
`/usr/local/lib/systemd/system/<slice>.d/99-left4me-cpuset.conf`.
|
|
||||||
The reactor raises if `system_cpus` includes CPUs outside
|
|
||||||
`[0, vm/threads)` or leaves no cores for games.
|
|
||||||
- **Kernel feature requirement:** kernel-overlayfs (`CONFIG_OVERLAY_FS`).
|
|
||||||
Standard on debian-13.
|
|
||||||
- **Game ports** open by the web app on demand in the range 27015-27115
|
|
||||||
(UDP+TCP). Add corresponding accept rules to `nftables/input` per
|
|
||||||
node if the host's policy is default-drop on input.
|
|
||||||
- **Pinned UIDs/GIDs (980/981).** Chosen for deterministic ownership
|
|
||||||
across rebuilds and backup restores. If you add another bundle that
|
|
||||||
pins UIDs in this repo, make sure it doesn't collide.
|
|
||||||
|
|
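The CPU split described under "CPU isolation" above comes down to a set difference plus two sanity checks. This is a minimal sketch, assuming a hypothetical helper name `split_cpus`; the bundle's real reactor is not part of this diff and may differ.

```python
def split_cpus(threads: int, system_cpus: set[int]) -> set[int]:
    """Return the CPU ids left over for l4d2-game.slice.

    Hypothetical helper: mirrors set(range(vm/threads)) - system_cpus and
    the two error conditions the README describes.
    """
    all_cpus = set(range(threads))
    outside = system_cpus - all_cpus
    if outside:
        raise ValueError(f'system_cpus outside [0, {threads}): {sorted(outside)}')
    game_cpus = all_cpus - system_cpus
    if not game_cpus:
        raise ValueError('system_cpus leaves no CPUs for games')
    return game_cpus


# Values from the prod example: 8 threads, SMT pairs (0,4) (1,5) (2,6) (3,7).
print(sorted(split_cpus(8, {0, 4})))  # [1, 2, 3, 5, 6, 7]
```

Reserving `{0, 4}` keeps both siblings of physical core 0 for the system, so no game thread shares that core's caches.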
||||||
## Slice support requires `bundles/systemd` ≥ commit cc1c6a5
|
|
||||||
|
|
||||||
This bundle's `l4d2-game.slice` and `l4d2-build.slice` units rely on
|
|
||||||
`bundles/systemd/items.py` accepting the `.slice` extension. Older
|
|
||||||
revisions raised `Exception(f'unknown type slice')` at apply time.
|
|
||||||
The repo-wide `bw test` will catch this if it regresses.
|
|
||||||
|
|
@@ -1,6 +0,0 @@
|
||||||
# Managed by ckn-bw bundles/left4me. Local edits will be reverted.
|
|
||||||
# Deployment units use fixed /var/lib/left4me paths; regenerate units if this changes.
|
|
||||||
LEFT4ME_ROOT=/var/lib/left4me
|
|
||||||
# l4d2host invokes steamcmd by absolute path — bypasses PATH lookup so the
|
|
||||||
# script's `cd "$(dirname "$0")"` resolves next to the real install dir.
|
|
||||||
LEFT4ME_STEAMCMD=/opt/left4me/steam/steamcmd.sh
|
|
||||||
|
|
@@ -1,6 +0,0 @@
|
||||||
# Sandbox-only resolver config — bind-mounted into script-overlay sandboxes
|
|
||||||
# at /etc/resolv.conf. The host's resolver (often a private/LAN DNS server)
|
|
||||||
# is unreachable from inside the sandbox because IPAddressDeny= blocks
|
|
||||||
# egress to RFC1918 / loopback. Public resolvers keep DNS working.
|
|
||||||
nameserver 1.1.1.1
|
|
||||||
nameserver 8.8.8.8
|
|
||||||
|
|
@@ -1,7 +0,0 @@
|
||||||
# Managed by ckn-bw bundles/left4me. Local edits will be reverted.
|
|
||||||
DATABASE_URL=sqlite:////var/lib/left4me/left4me.db
|
|
||||||
SECRET_KEY=${node.metadata.get('left4me/secret_key')}
|
|
||||||
JOB_WORKER_THREADS=${node.metadata.get('left4me/job_worker_threads')}
|
|
||||||
SESSION_COOKIE_SECURE=true
|
|
||||||
LEFT4ME_PORT_RANGE_START=${node.metadata.get('left4me/port_range_start')}
|
|
||||||
LEFT4ME_PORT_RANGE_END=${node.metadata.get('left4me/port_range_end')}
|
|
||||||
|
|
@@ -1,5 +0,0 @@
|
||||||
Defaults:left4me !requiretty
|
|
||||||
left4me ALL=(root) NOPASSWD: /usr/local/libexec/left4me/left4me-systemctl *
|
|
||||||
left4me ALL=(root) NOPASSWD: /usr/local/libexec/left4me/left4me-journalctl *
|
|
||||||
left4me ALL=(root) NOPASSWD: /usr/local/libexec/left4me/left4me-overlay mount *, /usr/local/libexec/left4me/left4me-overlay umount *
|
|
||||||
left4me ALL=(root) NOPASSWD: /usr/local/libexec/left4me/left4me-script-sandbox
|
|
||||||
|
|
@@ -1,36 +0,0 @@
|
||||||
# Host-side perf baseline for left4me — see
|
|
||||||
# docs/superpowers/specs/2026-05-09-l4d2-server-host-perf-baseline-design.md
|
|
||||||
#
|
|
||||||
# UDP socket buffers: distro defaults of ~128 KiB are too small for sustained
|
|
||||||
# Source-engine UDP across multiple instances. 8 MiB matches the standard
|
|
||||||
# 1 Gbit recommendation; rmem_default/wmem_default protect sockets that don't
|
|
||||||
# explicitly enlarge their buffers.
|
|
||||||
net.core.rmem_max = 8388608
|
|
||||||
net.core.wmem_max = 8388608
|
|
||||||
net.core.rmem_default = 524288
|
|
||||||
net.core.wmem_default = 524288
|
|
||||||
|
|
||||||
# Kernel softirq UDP path: the per-CPU backlog queue starts dropping packets
|
|
||||||
# at the default 1000 under multi-instance burst; 5000 absorbs realistic peaks.
|
|
||||||
# netdev_budget = 600 gives softirq more drain headroom per pass.
|
|
||||||
net.core.netdev_max_backlog = 5000
|
|
||||||
net.core.netdev_budget = 600
|
|
||||||
|
|
||||||
# Latency-sensitive default: avoid swap unless the box is really under
|
|
||||||
# pressure. Harmless on swapless hosts.
|
|
||||||
vm.swappiness = 10
|
|
||||||
|
|
||||||
# Per-socket UDP buffer floors: protect game-server sockets that don't bump
|
|
||||||
# their own SO_RCVBUF/SO_SNDBUF when softirq drains lag briefly.
|
|
||||||
net.ipv4.udp_rmem_min = 16384
|
|
||||||
net.ipv4.udp_wmem_min = 16384
|
|
||||||
|
|
||||||
# Default qdisc for ifaces we don't explicitly shape with CAKE. Debian Trixie
|
|
||||||
# already defaults to fq_codel; setting it explicitly is belt-and-suspenders
|
|
||||||
# and survives kernel-default churn.
|
|
||||||
net.core.default_qdisc = fq_codel
|
|
||||||
|
|
||||||
# TCP congestion control: BBR for any bulk TCP egress on the host (admin SSH,
|
|
||||||
# backups, package fetches, web-app responses) so a long flow does not push
|
|
||||||
# the bottleneck queue ahead of game UDP. UDP srcds is unaffected.
|
|
||||||
net.ipv4.tcp_congestion_control = bbr
|
|
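The buffer sizes in the sysctl file above are round binary quantities, which is easy to confirm with a little arithmetic. This is a sketch only: the `settings` mapping is a hypothetical structure that pairs each value copied from the file with the size it is meant to encode.

```python
KIB = 1024
MIB = 1024 * KIB

# Values copied from the sysctl file, paired with the size each one encodes.
settings = {
    'net.core.rmem_max': (8388608, 8 * MIB),
    'net.core.wmem_max': (8388608, 8 * MIB),
    'net.core.rmem_default': (524288, 512 * KIB),
    'net.core.wmem_default': (524288, 512 * KIB),
    'net.ipv4.udp_rmem_min': (16384, 16 * KIB),
    'net.ipv4.udp_wmem_min': (16384, 16 * KIB),
}

for name, (value, expected) in settings.items():
    # Fail loudly if a value in the file ever drifts from its intended size.
    assert value == expected, name
    print(f'{name} = {value} bytes ({value // KIB} KiB)')
```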
||||||
|
|
@@ -1,53 +0,0 @@
|
||||||
#!/bin/sh
|
|
||||||
set -eu
|
|
||||||
|
|
||||||
usage() {
|
|
||||||
printf '%s\n' "usage: left4me-journalctl <server-name> --lines <n> --follow|--no-follow" >&2
|
|
||||||
exit 2
|
|
||||||
}
|
|
||||||
|
|
||||||
validate_name() {
|
|
||||||
name=$1
|
|
||||||
[ -n "$name" ] || usage
|
|
||||||
case "$name" in
|
|
||||||
.*|*..*|*/*|*\\*) usage ;;
|
|
||||||
esac
|
|
||||||
case "$name" in
|
|
||||||
*[!A-Za-z0-9_.-]*) usage ;;
|
|
||||||
esac
|
|
||||||
}
|
|
||||||
|
|
||||||
[ "$#" -eq 4 ] || usage
|
|
||||||
name=$1
|
|
||||||
lines_flag=$2
|
|
||||||
lines=$3
|
|
||||||
follow_flag=$4
|
|
||||||
|
|
||||||
validate_name "$name"
|
|
||||||
[ "$lines_flag" = "--lines" ] || usage
|
|
||||||
case "$lines" in
|
|
||||||
''|*[!0-9]*) usage ;;
|
|
||||||
esac
|
|
||||||
|
|
||||||
follow_arg=
|
|
||||||
case "$follow_flag" in
|
|
||||||
--follow) follow_arg=-f ;;
|
|
||||||
--no-follow) ;;
|
|
||||||
*) usage ;;
|
|
||||||
esac
|
|
||||||
|
|
||||||
unit="left4me-server@${name}.service"
|
|
||||||
if [ -x /bin/journalctl ]; then
|
|
||||||
journalctl=/bin/journalctl
|
|
||||||
elif [ -x /usr/bin/journalctl ]; then
|
|
||||||
journalctl=/usr/bin/journalctl
|
|
||||||
else
|
|
||||||
printf '%s\n' 'journalctl not found at /bin/journalctl or /usr/bin/journalctl' >&2
|
|
||||||
exit 69
|
|
||||||
fi
|
|
||||||
|
|
||||||
if [ -n "$follow_arg" ]; then
|
|
||||||
exec "$journalctl" -u "$unit" -n "$lines" -o cat "$follow_arg"
|
|
||||||
fi
|
|
||||||
|
|
||||||
exec "$journalctl" -u "$unit" -n "$lines" -o cat
|
|
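The name checks in the `left4me-journalctl` helper above can be mirrored in Python to make the accepted set easier to test. This is a sketch under assumptions: the function name `is_valid_server_name` is hypothetical, and the shell `case` patterns are read as "reject an empty name, a leading dot, any `..`, any slash or backslash, and any character outside `A-Za-z0-9_.-`".

```python
import re

# Same character class as the shell pattern *[!A-Za-z0-9_.-]*, inverted.
_ALLOWED = re.compile(r'[A-Za-z0-9_.-]+')


def is_valid_server_name(name: str) -> bool:
    """Return True if the shell helper would accept this server name.

    Hypothetical port of validate_name(); the helper itself exits with
    a usage error instead of returning False.
    """
    if not name:
        return False
    # Mirrors the first case block: .* | *..* | */* | *\\*
    if name.startswith('.') or '..' in name or '/' in name or '\\' in name:
        return False
    return _ALLOWED.fullmatch(name) is not None


for candidate in ('srv-1', 'a.b', '.hidden', 'a..b', 'a/b', 'a b', ''):
    print(repr(candidate), is_valid_server_name(candidate))
```

Rejecting `..`, `/`, and a leading dot before the character check keeps path-like input out of the unit name `left4me-server@<name>.service`.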
||||||
|
|
@@ -1,242 +0,0 @@
|
||||||
#!/usr/bin/python3
|
|
||||||
"""Privileged overlay mount helper for left4me.
|
|
||||||
|
|
||||||
Invoked from the systemd unit's ExecStartPre / ExecStopPost via
|
|
||||||
`+/usr/bin/nsenter --mount=/proc/1/ns/mnt -- …`. The unit-level
|
|
||||||
nsenter is what makes this work: it runs the helper Python interpreter
|
|
||||||
inside PID 1's mount namespace. Without it, the `+` Exec prefix
|
|
||||||
removes the sandbox/credentials but does NOT detach from the unit's
|
|
||||||
per-service mount namespace, and the helper process itself would pin
|
|
||||||
that namespace alive — turning every umount into a multi-second EBUSY
|
|
||||||
race with the kernel's deferred namespace cleanup. With the unit-level
|
|
||||||
nsenter the helper has no such reference and umount succeeds first try.
|
|
||||||
|
|
||||||
Validates inputs strictly, then performs `mount -t overlay` /
|
|
||||||
`umount` directly — no internal nsenter, since the helper is already
|
|
||||||
running where the syscalls need to take effect.
|
|
||||||
|
|
||||||
Verbs:
|
|
||||||
mount <name> Reads ${LEFT4ME_ROOT}/instances/<name>/instance.env
|
|
||||||
for L4D2_LOWERDIRS, validates every lowerdir is
|
|
||||||
under one of installation/overlays/workshop_cache/
|
|
||||||
global_overlay_cache, then mounts the kernel
|
|
||||||
overlay at runtime/<name>/merged.
|
|
||||||
umount <name> Unmounts runtime/<name>/merged and cleans up the
|
|
||||||
kernel-overlayfs `work/work` orphan.
|
|
||||||
|
|
||||||
Set LEFT4ME_OVERLAY_PRINT_ONLY=1 to print the would-be argv (one line,
|
|
||||||
shell-quoted) and exit 0 instead of execv. Used by tests.
|
|
||||||
"""
|
|
||||||
|
|
||||||
import os
|
|
||||||
import re
|
|
||||||
import shlex
|
|
||||||
import shutil
|
|
||||||
import subprocess
|
|
||||||
import sys
|
|
||||||
from pathlib import Path
|
|
||||||
|
|
||||||
NAME_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,63}$")
|
|
||||||
DEFAULT_ROOT = "/var/lib/left4me"
|
|
||||||
LOWERDIR_ALLOWLIST = (
|
|
||||||
"installation",
|
|
||||||
"overlays",
|
|
||||||
"global_overlay_cache",
|
|
||||||
"workshop_cache",
|
|
||||||
)
|
|
||||||
MAX_LOWERDIRS = 500
|
|
||||||
MOUNT_BIN = "/bin/mount"
|
|
||||||
UMOUNT_BIN = "/bin/umount"
|
|
||||||
|
|
||||||
|
|
||||||
def die(msg: str) -> None:
|
|
||||||
sys.stderr.write(f"left4me-overlay: {msg}\n")
|
|
||||||
sys.exit(1)
|
|
||||||
|
|
||||||
|
|
||||||
def root() -> Path:
|
|
||||||
return Path(os.environ.get("LEFT4ME_ROOT") or DEFAULT_ROOT)
|
|
||||||
|
|
||||||
|
|
||||||
def validate_name(name: str) -> str:
|
|
||||||
if not NAME_RE.fullmatch(name):
|
|
||||||
die(f"invalid instance name: {name!r}")
|
|
||||||
return name
|
|
||||||
|
|
||||||
|
|
||||||
def parse_lowerdirs(env_path: Path) -> list[str]:
|
|
||||||
if not env_path.is_file():
|
|
||||||
die(f"instance.env not found: {env_path}")
|
|
||||||
raw = None
|
|
||||||
for line in env_path.read_text().splitlines():
|
|
||||||
if "=" not in line:
|
|
||||||
continue
|
|
||||||
key, value = line.split("=", 1)
|
|
||||||
if key.strip() == "L4D2_LOWERDIRS":
|
|
||||||
raw = value
|
|
||||||
break
|
|
||||||
if raw is None:
|
|
||||||
die(f"L4D2_LOWERDIRS not set in {env_path}")
|
|
||||||
if raw == "":
|
|
||||||
die(f"L4D2_LOWERDIRS is empty in {env_path}")
|
|
||||||
parts = raw.split(":")
|
|
||||||
if any(p == "" for p in parts):
|
|
||||||
die(f"L4D2_LOWERDIRS contains an empty entry: {raw!r}")
|
|
||||||
if len(parts) > MAX_LOWERDIRS:
|
|
||||||
die(f"L4D2_LOWERDIRS has {len(parts)} entries (cap {MAX_LOWERDIRS})")
|
|
||||||
return parts
|
|
||||||
|
|
||||||
|
|
||||||
def canonical_under(allowed_roots: list[Path], path: Path) -> Path:
|
|
||||||
try:
|
|
||||||
canonical = path.resolve(strict=True)
|
|
||||||
except (FileNotFoundError, RuntimeError):
|
|
||||||
die(f"path does not exist or has a symlink loop: {path}")
|
|
||||||
for r in allowed_roots:
|
|
||||||
if canonical == r or r in canonical.parents:
|
|
||||||
return canonical
|
|
||||||
die(f"path is outside the permitted roots: {path} (resolved: {canonical})")
|
|
||||||
|
|
||||||
|
|
||||||
_LISTXATTR = getattr(os, "listxattr", None)
|
|
||||||
|
|
||||||
|
|
||||||
def _entry_has_fuse_xattr(path: str) -> str | None:
|
|
||||||
if _LISTXATTR is None:
|
|
||||||
return None
|
|
||||||
try:
|
|
||||||
attrs = _LISTXATTR(path, follow_symlinks=False)
|
|
||||||
except OSError:
|
|
||||||
return None
|
|
||||||
for a in attrs:
|
|
||||||
if a.startswith("user.fuseoverlayfs."):
|
|
||||||
return a
|
|
||||||
return None
|
|
||||||
|
|
||||||
|
|
||||||
def assert_no_fuse_xattrs(upper: Path) -> None:
|
|
||||||
if not upper.exists() or _LISTXATTR is None:
|
|
||||||
return
|
|
||||||
for dirpath, dirnames, filenames in os.walk(upper):
|
|
||||||
for entry in (dirpath, *(os.path.join(dirpath, n) for n in dirnames),
|
|
||||||
*(os.path.join(dirpath, n) for n in filenames)):
|
|
||||||
tainted = _entry_has_fuse_xattr(entry)
|
|
||||||
if tainted:
|
|
||||||
die(
|
|
||||||
f"upperdir contains fuse-overlayfs xattr {tainted!r} on {entry}; "
|
|
||||||
"wipe upper/ and work/ before mounting"
|
|
||||||
)
|
|
||||||
|
|
||||||
|
|
||||||
def exec_or_print(argv: list[str]) -> None:
|
|
||||||
if os.environ.get("LEFT4ME_OVERLAY_PRINT_ONLY") == "1":
|
|
||||||
print(" ".join(shlex.quote(a) for a in argv))
|
|
||||||
sys.exit(0)
|
|
||||||
os.execv(argv[0], argv)
|
|
||||||
|
|
||||||
|
|
||||||
def cmd_mount(name: str) -> None:
|
|
||||||
name = validate_name(name)
|
|
||||||
r = root()
|
|
||||||
runtime_name_dir = (r / "runtime" / name).resolve(strict=True)
|
|
||||||
merged_for_check = (runtime_name_dir / "merged").resolve(strict=True)
|
|
||||||
|
|
||||||
# Idempotency for unit restart cycles: if a previous start mounted
|
|
||||||
# successfully but ExecStart failed afterwards (and Restart=on-failure
|
|
||||||
# fires another cycle), the second ExecStartPre would otherwise refuse
|
|
||||||
# to mount-on-top. Short-circuit here so the second cycle just gets
|
|
||||||
# straight to ExecStart. PRINT_ONLY (test mode) bypasses this so the
|
|
||||||
# tests can exercise the full nsenter argv regardless of mount state.
|
|
||||||
if (
|
|
||||||
os.environ.get("LEFT4ME_OVERLAY_PRINT_ONLY") != "1"
|
|
||||||
and os.path.ismount(merged_for_check)
|
|
||||||
):
|
|
||||||
return
|
|
||||||
|
|
||||||
instance_env = r / "instances" / name / "instance.env"
|
|
||||||
raw_lowerdirs = parse_lowerdirs(instance_env)
|
|
||||||
|
|
||||||
allowed_roots = [(r / sub).resolve() for sub in LOWERDIR_ALLOWLIST]
|
|
||||||
canonical_lowerdirs = [str(canonical_under(allowed_roots, Path(p))) for p in raw_lowerdirs]
|
|
||||||
|
|
||||||
upper = (runtime_name_dir / "upper").resolve(strict=True)
|
|
||||||
work = (runtime_name_dir / "work").resolve(strict=True)
|
|
||||||
merged = merged_for_check
|
|
||||||
for label, path in (("upper", upper), ("work", work), ("merged", merged)):
|
|
||||||
if path.parent != runtime_name_dir:
|
|
||||||
die(f"{label} resolved outside runtime/{name}: {path}")
|
|
||||||
|
|
||||||
assert_no_fuse_xattrs(upper)
|
|
||||||
|
|
||||||
options = f"lowerdir={':'.join(canonical_lowerdirs)},upperdir={upper},workdir={work}"
|
|
||||||
argv = [
|
|
||||||
MOUNT_BIN,
|
|
||||||
"-t", "overlay",
|
|
||||||
"overlay",
|
|
||||||
"-o", options,
|
|
||||||
str(merged),
|
|
||||||
]
|
|
||||||
exec_or_print(argv)
|
|
||||||
|
|
||||||
|
|
def cmd_umount(name: str) -> None:
    name = validate_name(name)
    r = root()
    runtime_name_dir = (r / "runtime" / name).resolve(strict=True)
    merged_path = runtime_name_dir / "merged"
    work_inner = runtime_name_dir / "work" / "work"

    argv = [
        UMOUNT_BIN,
        # Resolve only if it exists; PRINT_ONLY tests always pre-create it.
        str(merged_path.resolve(strict=True) if merged_path.exists() else merged_path),
    ]

    # PRINT_ONLY: emit the umount argv and exit. Tests assert exact shape
    # of this dry-run; the post-umount cleanup of work_inner is a runtime
    # behaviour exercised on the host, not in unit tests.
    if os.environ.get("LEFT4ME_OVERLAY_PRINT_ONLY") == "1":
        print(" ".join(shlex.quote(a) for a in argv))
        sys.exit(0)

    if merged_path.exists():
        merged = merged_path.resolve(strict=True)
        if merged.parent != runtime_name_dir:
            die(f"merged resolved outside runtime/{name}: {merged}")
        # Idempotency: only umount if currently a mount point. Mirrors
        # cmd_mount's symmetric check; a redundant cleanup pass — or a
        # call after a partial _purge_instance — must be a no-op.
        #
        # No retry loop here: with the helper running in PID 1's mount
        # namespace (via the unit-level `nsenter --mount=/proc/1/ns/mnt`
        # in ExecStopPost), it holds no reference to the unit's
        # per-service mount namespace, so the cgroup-empty → namespace
        # reaped → umount-clears sequence happens without any race
        # window for us to ride out. EBUSY here is a real error.
        if os.path.ismount(merged):
            subprocess.run(argv, check=True)

    # Kernel-overlayfs creates work_inner during mount with root:root mode
    # 0/0. After unmount it's an orphan that the unit's User= (left4me)
    # cannot traverse via shutil.rmtree, so reset/delete in instances.py
    # blows up with EACCES on `runtime/<name>/work/work`. The helper is
    # the only code path with root that knows about this directory, so
    # the cleanup belongs here. Safe to nuke — the kernel re-creates it
    # on the next mount. Run unconditionally — covers both "we just
    # unmounted" and "previous teardown didn't finish" cases.
    if work_inner.exists():
        shutil.rmtree(work_inner)


def main(argv: list[str]) -> None:
    if len(argv) != 3 or argv[1] not in ("mount", "umount"):
        sys.stderr.write("usage: left4me-overlay mount|umount <name>\n")
        sys.exit(2)
    if argv[1] == "mount":
        cmd_mount(argv[2])
    else:
        cmd_umount(argv[2])


if __name__ == "__main__":
    main(sys.argv)

@ -1,82 +0,0 @@
|
#!/bin/bash
# Privileged sandbox launcher for left4me script overlays.
#
# Invoked via sudo by the web user with two arguments:
#   <overlay_id>   numeric overlay id; bind-mounts /var/lib/left4me/overlays/<id>
#                  read-write at /overlay inside the sandbox.
#   <script_path>  absolute path to a bash file already written by the web app;
#                  bind-mounted read-only at /script.sh inside the sandbox.
#
# The script runs as a transient systemd .service with the full hardening
# surface: cgroup limits + walltime kill, NoNewPrivileges, ProtectSystem,
# ProtectHome, kernel-tunable / -module / -log protection, namespace
# restriction, address-family restriction, capability bounding (empty),
# seccomp filter (@system-service @network-io), MemoryDenyWriteExecute,
# LockPersonality, RestrictSUIDSGID. Network namespace is *not* restricted —
# scripts must reach the public internet to download workshop / l4d2center
# / cedapug content. PID namespace is shared with the host (no
# PrivatePID= directive in systemd); host PIDs are visible via /proc but
# not signal-able due to UID mismatch.
set -euo pipefail

[[ $# -eq 2 ]] || { echo "usage: $0 <overlay_id> <script>" >&2; exit 64; }

OVERLAY_ID=$1
SCRIPT=$2

[[ "$OVERLAY_ID" =~ ^[0-9]+$ ]] || { echo "bad overlay id" >&2; exit 64; }
OVERLAY_DIR=/var/lib/left4me/overlays/$OVERLAY_ID
[[ -d $OVERLAY_DIR ]] || { echo "no overlay dir at $OVERLAY_DIR" >&2; exit 65; }
[[ -f $SCRIPT ]] || { echo "no script at $SCRIPT" >&2; exit 65; }

if [[ "${LEFT4ME_SCRIPT_SANDBOX_DRY_RUN:-}" == "1" ]]; then
    echo "DRY RUN: overlay_id=$OVERLAY_ID script=$SCRIPT overlay_dir=$OVERLAY_DIR"
    exit 0
fi

# Make sure the sandbox UID owns the overlay dir so the script can write there.
# Idempotent: a no-op when the dir is already l4d2-sandbox-owned (re-run case),
# and corrects the ownership the first time the dir was created by the web app
# under the left4me UID. World-readable so the gameserver process (left4me)
# can read the overlay contents via the kernel-overlayfs lowerdir at runtime.
chown -R l4d2-sandbox:l4d2-sandbox "$OVERLAY_DIR"
chmod 0755 "$OVERLAY_DIR"

SCRIPT_RC=0
systemd-run --quiet --collect --wait --pipe \
    --unit="left4me-script-${OVERLAY_ID}-$$" \
    --slice=l4d2-build.slice \
    -p OOMScoreAdjust=500 \
    -p User=l4d2-sandbox -p Group=l4d2-sandbox \
    -p UMask=0022 \
    -p NoNewPrivileges=yes \
    -p ProtectSystem=strict -p ProtectHome=yes \
    -p PrivateTmp=yes -p PrivateDevices=yes -p PrivateIPC=yes \
    -p ProtectKernelTunables=yes -p ProtectKernelModules=yes \
    -p ProtectKernelLogs=yes -p ProtectControlGroups=yes \
    -p RestrictNamespaces=yes \
    -p RestrictAddressFamilies="AF_INET AF_INET6 AF_UNIX" \
    -p RestrictSUIDSGID=yes -p LockPersonality=yes \
    -p MemoryDenyWriteExecute=yes \
    -p SystemCallFilter="@system-service @network-io" \
    -p SystemCallArchitectures=native \
    -p CapabilityBoundingSet= -p AmbientCapabilities= \
    -p IPAddressDeny="127.0.0.0/8 ::1/128 169.254.0.0/16 fe80::/10 224.0.0.0/4 ff00::/8 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 100.64.0.0/10 fc00::/7" \
    -p TemporaryFileSystem="/etc /var/lib" \
    -p BindReadOnlyPaths="/etc/left4me/sandbox-resolv.conf:/etc/resolv.conf /etc/ssl /etc/ca-certificates /etc/nsswitch.conf /etc/alternatives ${SCRIPT}:/script.sh" \
    -p BindPaths="${OVERLAY_DIR}:/overlay" \
    -p WorkingDirectory=/overlay \
    -p Environment="HOME=/tmp PATH=/usr/bin:/usr/sbin OVERLAY=/overlay" \
    -p MemoryMax=4G -p MemorySwapMax=0 -p TasksMax=512 \
    -p CPUQuota=200% -p RuntimeMaxSec=3600 \
    -- /bin/bash /script.sh || SCRIPT_RC=$?

# Normalize perms so the web service (left4me uid) can read overlay files
# directly via Python open() — needed by the file tree's download endpoint.
# UMask=0022 above takes care of *new* writes; this catches anything the
# script created with a tighter mode (e.g. cedapug_maps writes its
# .cedapug/manifest.tsv as 0600 by default).
find "$OVERLAY_DIR" -type f ! -perm -o+r -exec chmod o+r {} + 2>/dev/null || true
find "$OVERLAY_DIR" -type d ! -perm -o+rx -exec chmod o+rx {} + 2>/dev/null || true

exit $SCRIPT_RC

@ -1,44 +0,0 @@
|
#!/bin/sh
set -eu

usage() {
    printf '%s\n' "usage: left4me-systemctl enable|disable|show <server-name>" >&2
    exit 2
}

validate_name() {
    name=$1
    [ -n "$name" ] || usage
    case "$name" in
        .*|*..*|*/*|*\\*) usage ;;
    esac
    case "$name" in
        *[!A-Za-z0-9_.-]*) usage ;;
    esac
}

[ "$#" -eq 2 ] || usage
action=$1
name=$2

case "$action" in
    enable|disable|show) ;;
    *) usage ;;
esac

validate_name "$name"
unit="left4me-server@${name}.service"
if [ -x /bin/systemctl ]; then
    systemctl=/bin/systemctl
elif [ -x /usr/bin/systemctl ]; then
    systemctl=/usr/bin/systemctl
else
    printf '%s\n' 'systemctl not found at /bin/systemctl or /usr/bin/systemctl' >&2
    exit 69
fi

case "$action" in
    enable) exec "$systemctl" enable --now "$unit" ;;
    disable) exec "$systemctl" disable --now "$unit" ;;
    show) exec "$systemctl" show --property=ActiveState --property=SubState "$unit" ;;
esac

@ -1,17 +0,0 @@
|
#!/bin/sh
# Run l4d2web flask CLI commands as the left4me user with the deploy env loaded.
# Usage: left4me <flask-subcommand> [args...]
# Examples:
#   left4me create-user alice --admin
#   left4me seed-script-overlays /opt/left4me/src/examples/script-overlays
#   left4me routes
set -eu
exec sudo -u left4me sh -c '
    set -a
    . /etc/left4me/host.env
    . /etc/left4me/web.env
    set +a
    export JOB_WORKER_ENABLED=false
    export PYTHONPATH=/opt/left4me/src
    exec /opt/left4me/.venv/bin/flask --app l4d2web.app:create_app "$@"
' sh "$@"

@ -1,293 +0,0 @@
|
# Items for the left4me bundle.
# Systemd units come from metadata via bundles/systemd/ — there are no
# .service or .slice files in this bundle's files/ tree. Cpuset drop-ins
# for system.slice / user.slice are likewise emitted via systemd/units
# in metadata.py (key: '<parent>.d/<basename>.conf').

directories = {
    '/opt/left4me': {
        'owner': 'left4me',
        'group': 'left4me',
    },
    '/opt/left4me/src': {
        'owner': 'left4me',
        'group': 'left4me',
    },
    '/etc/left4me': {
        'owner': 'root',
        'group': 'root',
        'mode': '0755',
    },
    '/var/lib/left4me': {
        # left4me's home dir — useradd creates with 0700; loosen to 0711 so
        # l4d2-sandbox can traverse (but not list) for bwrap bind-mounts.
        'owner': 'left4me',
        'group': 'left4me',
        'mode': '0711',
    },
    '/var/lib/left4me/installation': {'owner': 'left4me', 'group': 'left4me'},
    '/var/lib/left4me/overlays': {'owner': 'left4me', 'group': 'left4me'},
    '/var/lib/left4me/instances': {'owner': 'left4me', 'group': 'left4me'},
    '/var/lib/left4me/runtime': {'owner': 'left4me', 'group': 'left4me'},
    '/var/lib/left4me/workshop_cache': {'owner': 'left4me', 'group': 'left4me'},
    '/var/lib/left4me/tmp': {'owner': 'left4me', 'group': 'left4me'},
    '/opt/left4me/steam': {'owner': 'left4me', 'group': 'left4me'},
    '/usr/local/libexec/left4me': {
        'owner': 'root',
        'group': 'root',
        'mode': '0755',
    },
}

groups = {
    'left4me': {'gid': 980},
    'l4d2-sandbox': {'gid': 981},
}

users = {
    'left4me': {
        'uid': 980,
        'gid': 980,
        'home': '/var/lib/left4me',
        'shell': '/usr/sbin/nologin',
    },
    'l4d2-sandbox': {
        'uid': 981,
        'gid': 981,
        'shell': '/usr/sbin/nologin',
    },
}
# UIDs/GIDs pinned in the system-package range (100-999, per Debian
# policy) so file ownership is deterministic across rebuilds and
# backup restores. 980/981 are unused elsewhere in this repo.

# Privileged helpers (mode 0755 root:root). Listed by sudoers as the only
# commands left4me can invoke as root NOPASSWD.
HELPERS = (
    'left4me-systemctl',
    'left4me-journalctl',
    'left4me-overlay',
    'left4me-script-sandbox',
)

files = {
    '/usr/local/sbin/left4me': {
        'source': 'usr/local/sbin/left4me', # explicit — basename collides with sudoers
        'mode': '0755',
        'owner': 'root',
        'group': 'root',
    },
    **{
        f'/usr/local/libexec/left4me/{h}': {
            'source': f'usr/local/libexec/left4me/{h}',
            'mode': '0755',
            'owner': 'root',
            'group': 'root',
        }
        for h in HELPERS
    },
    '/etc/left4me/sandbox-resolv.conf': {
        'source': 'etc/left4me/sandbox-resolv.conf',
        'mode': '0644',
        'owner': 'root',
        'group': 'root',
    },
    '/etc/sudoers.d/left4me': {
        'source': 'etc/sudoers.d/left4me',
        'mode': '0440',
        'owner': 'root',
        'group': 'root',
        'test_with': 'visudo -cf {}',
    },
    '/etc/sysctl.d/99-left4me.conf': {
        'source': 'etc/sysctl.d/99-left4me.conf',
        'mode': '0644',
        'owner': 'root',
        'group': 'root',
        'triggers': [
            'action:left4me_sysctl_reload',
        ],
    },
    '/etc/left4me/host.env': {
        'source': 'etc/left4me/host.env.mako',
        'content_type': 'mako',
        'mode': '0644',
        'owner': 'root',
        'group': 'root',
    },
    '/etc/left4me/web.env': {
        'source': 'etc/left4me/web.env.mako',
        'content_type': 'mako',
        'mode': '0640',
        'owner': 'root',
        'group': 'left4me',
        'needs': [
            'group:left4me',
        ],
    },
}

actions = {
    'left4me_sysctl_reload': {
        'command': 'sysctl --system >/dev/null',
        'triggered': True,
    },
    'left4me_dpkg_add_i386_arch': {
        # steamcmd is 32-bit and pulls libc6:i386 + lib32z1 from the i386 arch.
        # apt-get update is part of this action because newly-added foreign
        # archs need a fresh package list before any :i386 package resolves.
        'command': 'dpkg --add-architecture i386 && apt-get update',
        'unless': 'dpkg --print-foreign-architectures | grep -qx i386',
        'cascade_skip': False,
    },
    'left4me_install_steamcmd': {
        # Steam's tarball is rolling with no published checksum, so we can't
        # use download: (which requires a hash). Guard with a presence check
        # on steamcmd.sh — steamcmd self-updates at runtime, so chasing the
        # tarball version from bw isn't useful.
        'command': (
            'sudo -u left4me sh -c "'
            'cd /opt/left4me/steam && '
            'curl -fsSL https://media.steampowered.com/installer/steamcmd_linux.tar.gz | '
            'tar -xz'
            '"'
        ),
        'unless': 'test -x /opt/left4me/steam/steamcmd.sh',
        'cascade_skip': False,
        'needs': [
            'directory:/opt/left4me/steam',
            'pkg_apt:curl',
            'pkg_apt:libc6_i386', # bw pkg_apt convention: _ → :
            'pkg_apt:lib32z1',
            'user:left4me',
        ],
    },
}

# steamcmd is invoked by absolute path (LEFT4ME_STEAMCMD in host.env),
# not via PATH lookup — see l4d2host/cli.py:install. We don't need to put
# anything in /usr/local/bin for it.

git_deploy = {
    '/opt/left4me/src': {
        'repo': node.metadata.get('left4me/git_url'),
        'rev': node.metadata.get('left4me/git_branch'),
        'triggers': [
            # On a code-update apply, refresh the DB schema. pip_install
            # would have triggered alembic in the create_venv path, but on
            # a normal apply pip_install's `unless` skips (packages still
            # importable from the previous editable install), and that
            # would leave alembic_upgrade dormant. Wiring git_deploy →
            # alembic directly ensures new migrations land whenever new
            # code lands. alembic upgrade head is idempotent (no-op when
            # already at head), so this is safe to fire on every code
            # update; the seed_overlays + service:restart cascade off
            # alembic also covers picking up the new code in gunicorn.
            'action:left4me_alembic_upgrade',
        ],
        # chown_src and pip_install are NOT in triggers — they run every
        # apply gated by their own `unless` guards, which makes the chain
        # self-healing after a partial failure. (Items in a triggers list
        # must be triggered:True, which would lose that property.)
    },
}

actions['left4me_chown_src'] = {
    # Runs every apply (cheap — chown -R on a small tree). Self-heals
    # whenever git_deploy extracts a new tarball as root-owned files.
    # Not in any triggers list so doesn't need triggered:True.
    'command': 'chown -R left4me:left4me /opt/left4me/src',
    'unless': 'test -z "$(find /opt/left4me/src \\! -user left4me -print -quit 2>/dev/null)"',
    'cascade_skip': False,
    'needs': [
        'git_deploy:/opt/left4me/src',
        'user:left4me',
        'group:left4me',
    ],
}

actions['left4me_create_venv'] = {
    'command': 'sudo -u left4me /usr/bin/python3 -m venv /opt/left4me/.venv',
    'unless': 'test -x /opt/left4me/.venv/bin/python',
    'cascade_skip': False,
    'needs': [
        'directory:/opt/left4me',
        'pkg_apt:python3-venv',
        'user:left4me',
    ],
    'triggers': [
        'action:left4me_pip_upgrade',
    ],
}

actions['left4me_pip_upgrade'] = {
    'command': 'sudo -u left4me /opt/left4me/.venv/bin/python -m pip install --upgrade pip',
    'triggered': True,
    'cascade_skip': False,
    'needs': [
        'pkg_apt:python3-pip',
    ],
    # No triggers — pip_install runs on every apply (gated by `unless`)
    # rather than being chained from here. Keeps pip_upgrade scoped to
    # exactly its purpose.
}

actions['left4me_pip_install'] = {
    # Single pip invocation installs both editable packages from the same
    # checkout. Runs on every apply: pip install -e is fast on no-op, and
    # any gate weaker than "egg-info matches pyproject.toml" can mask
    # script regeneration — e.g. adding [project.scripts] later wouldn't
    # be picked up if `unless` only checks importability.
    'command': 'sudo -u left4me /opt/left4me/.venv/bin/pip install -e /opt/left4me/src/l4d2host -e /opt/left4me/src/l4d2web',
    'cascade_skip': False,
    'needs': [
        'git_deploy:/opt/left4me/src',
        'action:left4me_create_venv',
        'action:left4me_chown_src',
    ],
    'triggers': [
        'action:left4me_alembic_upgrade',
    ],
}

actions['left4me_alembic_upgrade'] = {
    # Mirrors deploy-test-server.sh:239-242. Runs as left4me with both env
    # files sourced; JOB_WORKER_ENABLED=false so a stray worker doesn't race
    # with the migration.
    'command': (
        'sudo -u left4me sh -c "'
        'cd /opt/left4me/src/l4d2web && '
        'set -a && . /etc/left4me/host.env && . /etc/left4me/web.env && set +a && '
        'env JOB_WORKER_ENABLED=false PYTHONPATH=/opt/left4me/src '
        '/opt/left4me/.venv/bin/alembic -c /opt/left4me/src/l4d2web/alembic.ini upgrade head'
        '"'
    ),
    'triggered': True,
    'cascade_skip': False,
    'needs': [
        'action:left4me_pip_install',
        'file:/etc/left4me/host.env',
        'file:/etc/left4me/web.env',
    ],
    'triggers': [
        'action:left4me_seed_overlays',
        'svc_systemd:left4me-web.service:restart',
    ],
}

actions['left4me_seed_overlays'] = {
    # Idempotent: refreshes script bodies in place; existing overlay rows keep their ids.
    'command': (
        'sudo -u left4me sh -c "'
        'set -a && . /etc/left4me/host.env && . /etc/left4me/web.env && set +a && '
        'env JOB_WORKER_ENABLED=false PYTHONPATH=/opt/left4me/src '
        '/opt/left4me/.venv/bin/flask --app l4d2web.app:create_app '
        'seed-script-overlays /opt/left4me/src/examples/script-overlays'
        '"'
    ),
    'triggered': True,
    'cascade_skip': False,
    'needs': [
        'action:left4me_alembic_upgrade',
    ],
}

@ -1,275 +0,0 @@
|
assert node.has_bundle('nftables')
assert node.has_bundle('systemd')


defaults = {
    'left4me': {
        # Application-wide defaults; node only overrides if it really needs to.
        'git_url': 'https://git.sublimity.de/cronekorkn/left4me.git',
        'git_branch': 'master',
        'secret_key': repo.vault.random_bytes_as_base64_for(f'{node.name} left4me secret_key', length=32).value,
        'gunicorn_workers': 1,
        'gunicorn_threads': 32,
        'job_worker_threads': 4,
        # Whole 27000-block: covers Steam's defaults (27015 game, 27005
        # client/RCON) plus headroom for ad-hoc ports without further
        # nftables changes. Mirrored into LEFT4ME_PORT_RANGE_{START,END}
        # by web.env.mako and into the nftables input rule by the
        # nftables_input reactor below.
        'port_range_start': 27000,
        'port_range_end': 27999,
    },
    'apt': {
        'packages': {
            'p7zip-full': {},
            'nftables': {},
            'iproute2': {},
            'curl': {},
            'ca-certificates': {},
            'python3': {},
            'python3-venv': {},
            'python3-pip': {},
            'python3-dev': {},
            # steamcmd is a 32-bit ELF; needs i386 multiarch + these libs.
            # `_` → `:` is bundlewrap's pkg_apt convention for multiarch
            # names (see pkg_apt.py:48).
            'libc6_i386': { # installs libc6:i386
                'needs': ['action:left4me_dpkg_add_i386_arch'],
            },
            'lib32z1': {
                'needs': ['action:left4me_dpkg_add_i386_arch'],
            },
        },
    },
    'nftables': {
        # Match deploy/files/usr/local/lib/left4me/nft/left4me-mark.nft.
        # Mark srcds UDP egress (uid left4me) with DSCP EF + skb priority 6
        # so CAKE classifies it into the priority tin.
        'output': {
            'meta skuid "left4me" meta l4proto udp ip dscp set ef meta priority set 0006:0000',
            'meta skuid "left4me" meta l4proto udp ip6 dscp set ef meta priority set 0006:0000',
        },
    },
    'systemd': {
        'services': {
            'left4me-web.service': {
                'enabled': True,
                'running': True,
                'needs': [
                    'action:left4me_alembic_upgrade',
                    'file:/etc/left4me/host.env',
                    'file:/etc/left4me/web.env',
                ],
            },
            # Note: left4me-server@.service is a TEMPLATE — instances are
            # started on-demand by the web app via the left4me-systemctl
            # helper. Don't enable/start it from here.
            # The slices are installed (file present) but don't need
            # enable/start — they're activated implicitly when a unit
            # uses Slice=.
        },
    },
    'backup': {
        # Application-owned paths. Set-merged with backup group / node-level paths.
        'paths': {
            '/var/lib/left4me',
            '/etc/left4me',
        },
    },
}


@metadata_reactor.provides(
    'nginx/vhosts',
)
def nginx_vhosts(metadata):
    # letsencrypt/domains and monitoring/services for the vhost are auto-
    # populated by bundles/nginx/metadata.py. We just declare check_path:
    # '/health' so the auto-check hits the Flask health endpoint, not '/'.
    domain = metadata.get('left4me/domain')
    return {
        'nginx': {
            'vhosts': {
                domain: {
                    'content': 'nginx/proxy_pass.conf',
                    'context': {
                        'target': 'http://127.0.0.1:8000',
                    },
                    'check_path': '/health',
                },
            },
        },
    }


@metadata_reactor.provides(
    'nftables/input',
)
def nftables_input(metadata):
    port_start = metadata.get('left4me/port_range_start')
    port_end = metadata.get('left4me/port_range_end')
    return {
        'nftables': {
            'input': {
                f'udp dport {port_start}-{port_end} accept',
                f'tcp dport {port_start}-{port_end} accept',
            },
        },
    }


@metadata_reactor.provides(
    'systemd/units',
)
def systemd_units(metadata):
    workers = metadata.get('left4me/gunicorn_workers')
    threads = metadata.get('left4me/gunicorn_threads')

    # cgroup-v2 cpuset. `system_cpus` (set of int CPU ids, declared per
    # node) pins system/user/build; the complement pins l4d2-game. On HT
    # hosts, list both siblings of a physical core so games don't share
    # L1/L2 with system work — pairings via
    # /sys/devices/system/cpu/cpu<n>/topology/thread_siblings_list.
    vm_threads = metadata.get('vm/threads', metadata.get('vm/cores'))
    all_cpus = set(range(vm_threads))
    system_cpus = metadata.get('left4me/system_cpus')
    if not system_cpus <= all_cpus:
        raise Exception(
            f'left4me/system_cpus={sorted(system_cpus)} on {vm_threads}-thread host '
            f'includes CPUs outside [0, {vm_threads})'
        )
    game_cpus = all_cpus - system_cpus
    if not game_cpus:
        raise Exception(
            f'left4me/system_cpus={sorted(system_cpus)} on {vm_threads}-thread host '
            f'leaves no cores for games'
        )
    system_cpus_string = ','.join(str(t) for t in sorted(system_cpus))
    game_cpus_string = ','.join(str(t) for t in sorted(game_cpus))

    # Drop-in for upstream system.slice / user.slice (units we don't own).
    # Same '<parent>.d/<basename>.conf' convention as nginx and autologin.
    cpuset_dropin = {'Slice': {'AllowedCPUs': system_cpus_string}}

    return {
        'systemd': {
            'units': {
                'left4me-web.service': {
                    'Unit': {
                        'Description': 'left4me web application',
                        'After': 'network-online.target',
                        'Wants': 'network-online.target',
                    },
                    'Service': {
                        'Type': 'simple',
                        'User': 'left4me',
                        'Group': 'left4me',
                        'WorkingDirectory': '/opt/left4me/src',
                        'Environment': {
                            'HOME=/var/lib/left4me',
                            'PATH=/opt/left4me/.venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin',
                        },
                        'EnvironmentFile': (
                            '/etc/left4me/host.env',
                            '/etc/left4me/web.env',
                        ),
                        'ExecStart': (
                            '/opt/left4me/.venv/bin/gunicorn '
                            f'--workers {workers} --threads {threads} '
                            "--bind 127.0.0.1:8000 'l4d2web.app:create_app()'"
                        ),
                        'Restart': 'on-failure',
                        'RestartSec': '3',
                        # NoNewPrivileges intentionally NOT set: workers sudo to the helpers.
                        'ProtectSystem': 'full',
                        'ReadWritePaths': '/var/lib/left4me',
                        'PrivateTmp': 'true',
                    },
                    'Install': {
                        'WantedBy': {'multi-user.target'},
                    },
                },
                'left4me-server@.service': {
                    'Unit': {
                        'Description': 'left4me server instance %i',
                        'After': 'network-online.target',
                        'Wants': 'network-online.target',
                        'StartLimitBurst': '5',
                        'StartLimitIntervalSec': '60s',
                    },
                    'Service': {
                        'Type': 'simple',
                        'User': 'left4me',
                        'Group': 'left4me',
                        'EnvironmentFile': (
                            '/etc/left4me/host.env',
                            '/var/lib/left4me/instances/%i/instance.env',
                        ),
                        'WorkingDirectory': '-/var/lib/left4me/runtime/%i/merged/left4dead2',
                        'ExecStartPre': (
                            '+/usr/bin/nsenter --mount=/proc/1/ns/mnt -- '
                            '/usr/local/libexec/left4me/left4me-overlay mount %i'
                        ),
                        'ExecStart': (
                            '/var/lib/left4me/runtime/%i/merged/srcds_run '
                            '-game left4dead2 +hostport ${L4D2_PORT} $L4D2_ARGS'
                        ),
                        'ExecStopPost': (
                            '+/usr/bin/nsenter --mount=/proc/1/ns/mnt -- '
                            '/usr/local/libexec/left4me/left4me-overlay umount %i'
                        ),
                        'Restart': 'on-failure',
                        'RestartSec': '5',
                        'Slice': 'l4d2-game.slice',
                        'Nice': '-5',
                        'IOSchedulingClass': 'best-effort',
                        'IOSchedulingPriority': '4',
                        'OOMScoreAdjust': '-200',
                        'MemoryHigh': '1.5G',
                        'MemoryMax': '2G',
                        'TasksMax': '256',
                        'LimitNOFILE': '65536',
                        'KillSignal': 'SIGINT',
                        'TimeoutStopSec': '15s',
                        'LogRateLimitIntervalSec': '0',
                        'NoNewPrivileges': 'true',
                        'PrivateTmp': 'true',
                        'PrivateDevices': 'true',
                        'ProtectHome': 'true',
                        'ProtectSystem': 'strict',
                        'ReadOnlyPaths': '/var/lib/left4me/installation /var/lib/left4me/overlays',
                        'ReadWritePaths': '/var/lib/left4me/runtime/%i',
                        'RestrictSUIDSGID': 'true',
                        'LockPersonality': 'true',
                    },
                    'Install': {
                        'WantedBy': {'multi-user.target'},
                    },
                },
                'l4d2-game.slice': {
                    'Unit': {
                        'Description': 'left4me game-server slice',
                        'Before': 'slices.target',
                    },
                    'Slice': {
                        'CPUWeight': '1000',
                        'IOWeight': '1000',
                        'AllowedCPUs': game_cpus_string,
                    },
                },
                'l4d2-build.slice': {
                    'Unit': {
                        'Description': 'left4me script-sandbox build slice',
                        'Before': 'slices.target',
                    },
                    'Slice': {
                        'CPUWeight': '10',
                        'IOWeight': '10',
                        'AllowedCPUs': system_cpus_string,
                    },
                },
                'system.slice.d/99-left4me-cpuset.conf': cpuset_dropin,
                'user.slice.d/99-left4me-cpuset.conf': cpuset_dropin,
            },
        },
    }

@ -1,60 +1,9 @@
|
||||||
# letsencrypt
|
https://github.com/dehydrated-io/dehydrated/wiki/example-dns-01-nsupdate-script
|

Issues and renews Let's Encrypt certs via [dehydrated][upstream] with
DNS-01 against the in-house bind-acme server.

[upstream]: https://github.com/dehydrated-io/dehydrated/wiki/example-dns-01-nsupdate-script

## First-apply behaviour

Immediately after `bw apply <node>`, nginx serves a **self-signed
cert** for each declared domain — generated by
`/etc/dehydrated/letsencrypt-ensure-some-certificate` so nginx has
something to start with. The real Let's Encrypt cert arrives at most
24h later when the systemd timer fires
(`/usr/bin/dehydrated --cron --accept-terms --challenge dns-01`). To
shortcut the wait:

```sh
ssh <node> 'sudo /usr/bin/dehydrated --cron --accept-terms --challenge dns-01'
ssh <node> 'sudo systemctl reload nginx'
```

||||||
## DNS-01 prerequisites
|
|
||||||
|
|
||||||
`hook.sh` does `nsupdate` against the bind-acme server (referenced
|
|
||||||
by `letsencrypt/acme_node`). For the challenge to succeed:
|
|
||||||
|
|
||||||
1. The acme node must be in the same metadata graph (so
|
|
||||||
`bw metadata <node> -k letsencrypt/acme_node` resolves).
|
|
||||||
2. **All NS servers** for the validated domain must serve the
|
|
||||||
`_acme-challenge.<domain>` CNAME — Let's Encrypt validates from
|
|
||||||
primary AND secondary geographic regions; both authoritative
|
|
||||||
servers must agree. If a secondary NS is also a bw-managed node,
|
|
||||||
`bw apply` it after adding the domain (see e.g. `ovh.secondary`).
|
|
||||||
3. The bind-acme node's TSIG key must be reachable. `hook.sh` is
|
|
||||||
rendered with the bind-acme server's `network/internal/ipv4` —
|
|
||||||
for clients outside that LAN, the route must exist (typically via
|
|
||||||
wireguard `s2s` peer membership).
|
|
||||||
|
|
||||||
## Negative-cache penalty
|
|
||||||
|
|
||||||
If the first DNS-01 attempt fails (e.g. zone not yet applied to the
|
|
||||||
secondary NS), Let's Encrypt's resolvers cache NXDOMAIN for the SOA's
|
|
||||||
negative TTL (often 900s = 15 min). Subsequent attempts during that
|
|
||||||
window also fail and refresh the cache. Combined with LE's rate limit
|
|
||||||
of **5 failed authorisations per hostname, per account, per hour**, recovery requires
|
|
||||||
you to **stop retrying** for ~15 minutes after fixing the DNS, then
|
|
||||||
make at most one attempt.
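As a rough aid for the waiting period, the negative TTL can be read from the zone's SOA record. This is a minimal sketch, assuming the SOA rdata is available as one line in the field order that `dig +short SOA <zone>` prints. The function names and the example record are hypothetical and not part of this bundle. Per RFC 2308 the effective negative TTL is the smaller of the SOA record's own TTL and its MINIMUM field, and the short output carries only the latter, so treat the result as an upper bound.

```python
from datetime import datetime, timedelta


def soa_minimum(soa_rdata: str) -> int:
    """Return the MINIMUM field (seconds) from one line of SOA rdata.

    Expected field order: mname rname serial refresh retry expire minimum.
    """
    fields = soa_rdata.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 SOA fields, got {len(fields)}")
    return int(fields[6])


def earliest_retry(dns_fixed_at: datetime, soa_rdata: str) -> datetime:
    """Earliest time at which a cached NXDOMAIN should have expired."""
    return dns_fixed_at + timedelta(seconds=soa_minimum(soa_rdata))


# Hypothetical SOA line carrying the 900 s negative TTL mentioned above.
example_soa = (
    "ns1.example.org. hostmaster.example.org. "
    "2026051001 7200 3600 1209600 900"
)
fixed_at = datetime(2026, 5, 10, 12, 0, 0)
print(earliest_retry(fixed_at, example_soa))
```

The helper only does the arithmetic; whether to retry at that moment is still the operator's call.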
|
|
||||||
|
|
||||||
## nsupdate sample
|
|
||||||
|
|
||||||
For interactive testing of the bind-acme TSIG path:
|
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
printf "server 127.0.0.1
|
printf "server 127.0.0.1
|
||||||
zone acme.resolver.name.
|
zone acme.resolver.name.
|
||||||
update add _acme-challenge.ckn.li.acme.resolver.name. 600 IN TXT \"hello\"
|
update add _acme-challenge.ckn.li.acme.resolver.name. 600 IN TXT "hello"
|
||||||
send
|
send
|
||||||
" | nsupdate -y hmac-sha512:acme:XXXXXX
|
" | nsupdate -y hmac-sha512:acme:XXXXXX
|
||||||
```
|
```
|
||||||
|
|
|
||||||
|
|
@ -2,7 +2,7 @@ defaults = {
|
||||||
'apt': {
|
'apt': {
|
||||||
'packages': {
|
'packages': {
|
||||||
'dehydrated': {},
|
'dehydrated': {},
|
||||||
'bind9-dnsutils': {},
|
'dnsutils': {},
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
'letsencrypt': {
|
'letsencrypt': {
|
||||||
|
|
|
||||||
|
|
@ -1,36 +0,0 @@
|
||||||
# nginx
|
|
||||||
|
|
||||||
Webserver. Per-node vhosts in `nginx/vhosts`; per-vhost templates in
|
|
||||||
`data/nginx/*.conf`.
|
|
||||||
|
|
||||||
## How port 80 is served
|
|
||||||
|
|
||||||
The bundle ships a fixed `80.conf` to
|
|
||||||
`/etc/nginx/sites-available/80.conf` (picked up by the
|
|
||||||
`sites-enabled/` symlink) that handles **all** port-80 traffic
|
|
||||||
across vhosts:
|
|
||||||
|
|
||||||
1. ACME HTTP-01 challenges (`/.well-known/acme-challenge/`) are
|
|
||||||
served from `/var/lib/dehydrated/acme-challenges/`.
|
|
||||||
2. All other port-80 requests are 301-redirected to
|
|
||||||
`https://$host$request_uri`.
|
|
||||||
|
|
||||||
Per-vhost templates only declare `listen 443 ssl http2;`, so they
|
|
||||||
don't need their own port-80 server blocks. If you need vhost-
|
|
||||||
specific port-80 behaviour (e.g. plain-HTTP without redirect),
|
|
||||||
override 80.conf or add a per-vhost block.
|
|
||||||
|
|
||||||
## Required metadata
|
|
||||||
|
|
||||||
- `vm/cores` — read directly by `items.py` for `worker_processes`.
|
|
||||||
No default; `bw items <node>` raises at item-build time if missing.
|
|
||||||
Typically supplied by the `vm` bundle / hetzner-vm group; double-
|
|
||||||
check on bare-metal hosts.
|
|
||||||
- `nginx/vhosts` — dict of vhost-name → vhost-config.
|
|
||||||
- `nginx/modules` — list of dynamic modules to load.
|
|
||||||
|
|
||||||
## Cross-namespace
|
|
||||||
|
|
||||||
`items.py` reads `letsencrypt/domains` to skip emitting a per-vhost
|
|
||||||
HTTPS block when LE hasn't declared the domain yet — keeps the
|
|
||||||
bundle loadable on a node where letsencrypt isn't fully wired up.
|
|
||||||
|
|
@ -32,13 +32,12 @@ http {
|
||||||
|
|
||||||
% endif
|
% endif
|
||||||
|
|
||||||
# Always defined: serves both WS-enabled vhosts (Connection: upgrade for
|
% if has_websockets:
|
||||||
# ws clients) and SSE/keep-alive vhosts (Connection: "" lets nginx manage
|
|
||||||
# the upstream connection for keep-alive, instead of forcing "close").
|
|
||||||
map $http_upgrade $connection_upgrade {
|
map $http_upgrade $connection_upgrade {
|
||||||
default upgrade;
|
default upgrade;
|
||||||
'' '';
|
'' close;
|
||||||
}
|
}
|
||||||
|
% endif
|
||||||
|
|
||||||
include /etc/nginx/sites-enabled/*;
|
include /etc/nginx/sites-enabled/*;
|
||||||
}
|
}
|
||||||
|
|
|
||||||
|
|
@ -64,7 +64,7 @@ files = {
|
||||||
'svc_systemd:nginx:restart',
|
'svc_systemd:nginx:restart',
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
'/etc/nginx/sites-available/80.conf': {
|
'/etc/nginx/sites/80.conf': {
|
||||||
'triggers': {
|
'triggers': {
|
||||||
'svc_systemd:nginx:restart',
|
'svc_systemd:nginx:restart',
|
||||||
},
|
},
|
||||||
|
|
|
||||||
|
|
@ -33,7 +33,7 @@ for name, unit in node.metadata.get('systemd/units').items():
|
||||||
'svc_systemd:systemd-networkd.service:restart',
|
'svc_systemd:systemd-networkd.service:restart',
|
||||||
],
|
],
|
||||||
}
|
}
|
||||||
elif extension in ['timer', 'service', 'mount', 'swap', 'target', 'slice']:
|
elif extension in ['timer', 'service', 'mount', 'swap', 'target']:
|
||||||
path = f'/usr/local/lib/systemd/system/{name}'
|
path = f'/usr/local/lib/systemd/system/{name}'
|
||||||
dependencies = {
|
dependencies = {
|
||||||
'triggers': [
|
'triggers': [
|
||||||
|
|
|
||||||
|
|
@ -8,16 +8,10 @@ server {
|
||||||
|
|
||||||
location / {
|
location / {
|
||||||
proxy_set_header X-Real-IP $remote_addr;
|
proxy_set_header X-Real-IP $remote_addr;
|
||||||
# Always set Upgrade + Connection via the $connection_upgrade map:
|
% if websockets:
|
||||||
# WS client (Upgrade header sent) -> Connection: upgrade
|
|
||||||
# non-WS client (no Upgrade) -> Connection: "" (keep-alive)
|
|
||||||
# Lets every vhost serve both WS and SSE without per-vhost flags.
|
|
||||||
proxy_http_version 1.1;
|
|
||||||
proxy_set_header Upgrade $http_upgrade;
|
proxy_set_header Upgrade $http_upgrade;
|
||||||
proxy_set_header Connection $connection_upgrade;
|
proxy_set_header Connection $connection_upgrade;
|
||||||
# SSE-safe pass-through (also fine for non-SSE traffic):
|
% endif
|
||||||
proxy_buffering off;
|
|
||||||
proxy_read_timeout 1h;
|
|
||||||
proxy_pass ${target};
|
proxy_pass ${target};
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
|
||||||
|
|
@ -48,51 +48,3 @@ instead.
|
||||||
|
|
||||||
See [`conventions.md#secrets`](conventions.md#secrets) for the
|
See [`conventions.md#secrets`](conventions.md#secrets) for the
|
||||||
demagify magic-string list and the rule's full rationale.
|
demagify magic-string list and the rule's full rationale.
|
||||||
|
|
||||||
## Read-only commands — useful flag combinations
|
|
||||||
|
|
||||||
The fork's [`AGENTS.md`][fork] documents the canonical safety envelope.
|
|
||||||
These are the flag combinations agents reach for most often in this repo:
|
|
||||||
|
|
||||||
| Want to … | Run |
|
|
||||||
|---|---|
|
|
||||||
| Sanity-check the whole repo (parse + cross-cutting hooks) | `bw test` (defaults to `-HIJKMSp`) |
|
|
||||||
| Exercise reactors and item-graph for one node | `bw test <node>` (defaults to `-IJKMp`) |
|
|
||||||
| Same, but every node that has a given bundle | `bw test bundle:<name>` |
|
|
||||||
| Print one metadata key for one node | `bw metadata <node> -k <a/b>` (repeat `-k` for more) |
|
|
||||||
| Show where each metadata value comes from | `bw metadata <node> -b` |
|
|
||||||
| Resolve Faults (vault values) into the dump | `bw metadata <node> -f` — **may print secrets, avoid** |
|
|
||||||
| List a node's items, with the bundle that defines each | `bw items <node> --blame` |
|
|
||||||
| Preview a rendered file's content | `bw items <node> file:<path> -f` |
|
|
||||||
| Verify against the live host, scoped to one bundle | `bw verify <node> -o bundle:<name>` |
|
|
||||||
| Hash metadata only (faster than full config hash) | `bw hash <node> -m` |
|
|
||||||
| Inspect the data backing a hash | `bw hash <node> -d` |
|
|
||||||
|
|
||||||
`bw test`, `bw verify`, `bw nodes`, `bw metadata` all share a target-
|
|
||||||
selector grammar: bare node name, group name, `bundle:<name>`,
|
|
||||||
`!bundle:<name>`, or `"lambda:node.metadata_get('foo/bar', 0) < 3"`.
|
|
||||||
|
|
||||||
[fork]: https://github.com/CroneKorkN/bundlewrap/blob/main/AGENTS.md
|
|
||||||
|
|
||||||
## Bundle-validation workflow
|
|
||||||
|
|
||||||
`bw test` (no args) is a *parsing* gate, not a *behaviour* gate. It
|
|
||||||
loads every bundle, but a bundle's reactors only resolve when a node's
|
|
||||||
metadata is actually built — and that happens only for nodes that
|
|
||||||
opt in. Until then, reactor bugs stay dormant. bw rejects reactors
|
|
||||||
that don't read any metadata, but the rejection only fires once *some*
|
|
||||||
node consumes the bundle.
|
|
||||||
|
|
||||||
When developing a new bundle:
|
|
||||||
|
|
||||||
1. Scaffold + `bw test` — confirms parsing.
|
|
||||||
2. **Attach the bundle to one node** (or a stub node) by adding it to
|
|
||||||
`nodes/<n>.py`'s `bundles` list, or to a group the node is in.
|
|
||||||
3. `bw test <node>` — now reactors fire. This is where bundle bugs
|
|
||||||
surface.
|
|
||||||
4. `bw items <node> --blame` and `bw metadata <node> -k <key>` —
|
|
||||||
confirm items materialise and derived metadata looks right.
|
|
||||||
5. `bw hash <node>` — preview against the live host.
|
|
||||||
|
|
||||||
Step 2 is non-optional. A bundle that "passes `bw test`" with no
|
|
||||||
consumer is proven only to parse.
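The workflow above can be sketched as an ordered command list. This is a minimal sketch, not part of the repo: `validation_commands` is a hypothetical helper that only builds the argv lists for the read-only commands named in the steps and runs nothing.

```python
def validation_commands(node: str, metadata_key: str) -> list[list[str]]:
    """Build the read-only bw commands from the workflow, in order.

    Step 2 (attaching the bundle to the node) is a manual edit and is
    therefore not represented here.
    """
    return [
        ["bw", "test"],                                # step 1: repo-wide parse
        ["bw", "test", node],                          # step 3: reactors fire
        ["bw", "items", node, "--blame"],              # step 4: items materialise
        ["bw", "metadata", node, "-k", metadata_key],  # step 4: derived metadata
        ["bw", "hash", node],                          # step 5
    ]


for argv in validation_commands("<node>", "<key>"):
    print(" ".join(argv))
```

Keeping the list free of `bw apply`, `bw run`, and `bw lock` keeps the sketch inside the read-only allowlist.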
|
|
||||||
|
|
|
||||||
|
|
@ -127,12 +127,6 @@ bundle.
|
||||||
|
|
||||||
## 3. Per-bundle `AGENTS.md` template
|
## 3. Per-bundle `AGENTS.md` template
|
||||||
|
|
||||||
> **Status: replaced — pre-pivot intent only.** Per-bundle docs are plain
|
|
||||||
> `README.md` with no fixed structure. See §0 Revisions and the
|
|
||||||
> "Per-bundle README" section in [`bundles/AGENTS.md`](../../../bundles/AGENTS.md)
|
|
||||||
> for the current convention. The template below is kept as a record of
|
|
||||||
> the original design.
|
|
||||||
|
|
||||||
One balanced doc serving both audiences. Prose where prose helps, structure
|
One balanced doc serving both audiences. Prose where prose helps, structure
|
||||||
where structure helps. Sections in order:
|
where structure helps. Sections in order:
|
||||||
|
|
||||||
|
|
@ -345,12 +339,6 @@ in 30–120 lines each; root `AGENTS.md` is ~150 lines.
|
||||||
|
|
||||||
### Phase 2 — seed bundles (10)
|
### Phase 2 — seed bundles (10)
|
||||||
|
|
||||||
> **Status: dropped — pre-pivot intent only.** Phase 2 didn't ship. After
|
|
||||||
> Phase 1 landed, the maintainer pulled the per-bundle `AGENTS.md`
|
|
||||||
> migration: the rigid template proved a poor fit for the heterogeneous
|
|
||||||
> existing READMEs. See §0 Revisions. The seed list and migration plan
|
|
||||||
> below are kept as a record of how the work was scoped.
|
|
||||||
|
|
||||||
Bundles selected empirically (node+group references and recent commit
|
Bundles selected empirically (node+group references and recent commit
|
||||||
activity, validated 2026-05-10):
|
activity, validated 2026-05-10):
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -1,253 +0,0 @@
|
||||||
# Round 1 — agent-doc refactor (gaps 1–6 + cmd cheat sheet)
|
|
||||||
|
|
||||||
## Why
|
|
||||||
|
|
||||||
A previous session integrated `bundles/left4me/` and brought
|
|
||||||
`ovh.left4me` live. The integration produced a handoff (at
|
|
||||||
`~/.claude/plans/2026-05-10-ckn-bw-docs-improvements-handoff.md`)
|
|
||||||
listing 12 documentation gaps surfaced by the work. This spec covers
|
|
||||||
the first six (the cross-cutting ones) plus a useful side-quest:
|
|
||||||
adding a read-only command cheat sheet to `docs/agents/commands.md`.
|
|
||||||
Gaps 7–12 (item-specific, bundle READMEs) are deferred to a follow-up
|
|
||||||
round.
|
|
||||||
|
|
||||||
## Scope
|
|
||||||
|
|
||||||
In:
|
|
||||||
|
|
||||||
- Gap 1 — drop `bw bundles` (doesn't exist), add `bw verify` to the
|
|
||||||
read-only allowlist.
|
|
||||||
- Gap 2 — bundle-validation workflow needs a node attached.
|
|
||||||
- Gap 3 — nodes carry only node-specific metadata (split across
|
|
||||||
`bundles/AGENTS.md` and `nodes/AGENTS.md`).
|
|
||||||
- Gap 4 — reactors must read metadata or be defaults.
|
|
||||||
- Gap 5 — `triggers` ↔ `triggered: True` invariant + self-healing
|
|
||||||
pattern.
|
|
||||||
- Gap 6 — `unless` semantics (folded into Gap 5's second bullet).
|
|
||||||
- Side-quest: read-only command cheat sheet in `commands.md` (`bw
|
|
||||||
test` flag matrix + selectors, `bw metadata -k/-b/-f`, `bw items
|
|
||||||
--blame/-f`, `bw verify -o bundle:`, `bw hash -m/-d`).
|
|
||||||
|
|
||||||
Out:
|
|
||||||
|
|
||||||
- Gaps 7–12 (`source` implicit, `git_deploy` chown, `git_deploy` URL
|
|
||||||
form, letsencrypt/bind/nginx READMEs).
|
|
||||||
- Any change to bundle behaviour. This is pure docs; if a doc claim
|
|
||||||
feels wrong, push back to the maintainer rather than editing
|
|
||||||
`.py`.
|
|
||||||
|
|
||||||
## Verification approach
|
|
||||||
|
|
||||||
For each gap, find current line numbers in the target doc (handoff
|
|
||||||
line numbers are May 2026; some have drifted). Verify code-level
|
|
||||||
claims against the fork source under `.venv/src/bundlewrap/` before
|
|
||||||
quoting them.
|
|
||||||
|
|
||||||
Already verified during brainstorm:
|
|
||||||
|
|
||||||
- Gap 1: `bw bundles` is not a subcommand of the installed fork
|
|
||||||
(`.venv/bin/bw --help` lists only
|
|
||||||
`apply, debug, diff, groups, hash, ipmi, items, lock, metadata,
|
|
||||||
nodes, plot, pw, repo, run, stats, test, verify, zen`). `bw verify`
|
|
||||||
is read-only.
|
|
||||||
- Gap 2: `bw test` default flag set differs by mode. Whole-repo:
|
|
||||||
`-HIJKMSp`. Node-targeted: `-IJKMp`. Repo mode adds `-H`
|
|
||||||
(repo hooks) and `-S` (subgroup loops); both modes include `-J`
|
|
||||||
(node hooks). Reactors only resolve when a node's metadata is
|
|
||||||
built, which only happens when a node opts into the bundle.
|
|
||||||
- Gap 4: exact wording at `metagen.py:428`:
|
|
||||||
`"{reactor_name} on {node_name} did not request any metadata, you
|
|
||||||
might want to use defaults instead"`.
|
|
||||||
- Gap 5: exact wording at `deps.py:340`:
|
|
||||||
`"'{item1}' in bundle '{bundle1}' triggered by '{item2}' in bundle
|
|
||||||
'{bundle2}', but missing 'triggered' attribute"`.
|
|
||||||
- Gap 3 precedent: `bundles/left4me/metadata.py:10` is the canonical
|
|
||||||
random-bytes-in-defaults example. `bundles/postgresql/metadata.py:4`
|
|
||||||
is the password_for-at-module-scope example. (The handoff cites
|
|
||||||
postgresql for the random-bytes pattern; that's a misattribution —
|
|
||||||
postgresql uses `password_for`.)
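The two default flag sets quoted under Gap 2 can be compared directly. This is a small sketch that uses only the two flag strings from this spec; it says nothing about what bw does internally.

```python
# Default `bw test` flag sets as quoted in this spec.
repo_mode = set("HIJKMSp")
node_mode = set("IJKMp")

# Flags present only in the whole-repo default set.
only_repo = sorted(repo_mode - node_mode)
# Flags present only in the node-targeted default set.
only_node = sorted(node_mode - repo_mode)

print(only_repo, only_node)
```

With these two strings, the node-targeted set is a subset of the whole-repo set.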
|
|
||||||
|
|
||||||
After every commit: `.venv/bin/bw test` must pass with the same
|
|
||||||
output as before. Pure-docs edits cannot break it unless a `.py` is
|
|
||||||
touched accidentally.
|
|
||||||
|
|
||||||
## Commits
|
|
||||||
|
|
||||||
Six iterative commits, matching repo style.
|
|
||||||
|
|
||||||
### Commit 1 — drop `bw bundles`, add `bw verify` (Gap 1)
|
|
||||||
|
|
||||||
`AGENTS.md` rule 1 only. The handoff also flagged
|
|
||||||
`bundles/AGENTS.md:60-64`, but that list no longer references
|
|
||||||
`bw bundles` (it currently reads `bw test` / `bw items` / `bw hash`).
|
|
||||||
That section gets rewritten in commit 3, not here.
|
|
||||||
|
|
||||||
```diff
|
|
||||||
- to `bw test`, `bw nodes`, `bw groups`, `bw bundles`,
|
|
||||||
- `bw items`, `bw metadata`, `bw hash`, `bw debug`. See
|
|
||||||
+ to `bw test`, `bw nodes`, `bw groups`, `bw items`,
|
|
||||||
+ `bw metadata`, `bw hash`, `bw verify`, `bw debug`. See
|
|
||||||
```
|
|
||||||
|
|
||||||
### Commit 2 — read-only command cheat sheet
|
|
||||||
|
|
||||||
Append to `docs/agents/commands.md`. New H2 section, table format
|
|
||||||
to match the existing voice.
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
## Read-only commands — useful flag combinations
|
|
||||||
|
|
||||||
The fork's [`AGENTS.md`][fork] documents the canonical safety envelope.
|
|
||||||
These are the flag combinations agents reach for most often in this repo:
|
|
||||||
|
|
||||||
| Want to … | Run |
|
|
||||||
|---|---|
|
|
||||||
| Sanity-check the whole repo (parse + cross-cutting hooks) | `bw test` (defaults to `-HIJKMSp`) |
|
|
||||||
| Exercise reactors and item-graph for one node | `bw test <node>` (defaults to `-IJKMp`) |
|
|
||||||
| Same, but every node that has a given bundle | `bw test bundle:<name>` |
|
|
||||||
| Print one metadata key for one node | `bw metadata <node> -k <a/b>` (repeat `-k` for more) |
|
|
||||||
| Show where each metadata value comes from | `bw metadata <node> -b` |
|
|
||||||
| Resolve Faults (vault values) into the dump | `bw metadata <node> -f` — **may print secrets, avoid** |
|
|
||||||
| List a node's items, with the bundle that defines each | `bw items <node> --blame` |
|
|
||||||
| Preview a rendered file's content | `bw items <node> file:<path> -f` |
|
|
||||||
| Verify against the live host, scoped to one bundle | `bw verify <node> -o bundle:<name>` |
|
|
||||||
| Hash metadata only (faster than full config hash) | `bw hash <node> -m` |
|
|
||||||
| Inspect the data backing a hash | `bw hash <node> -d` |
|
|
||||||
|
|
||||||
`bw test`, `bw verify`, `bw nodes`, `bw metadata` all share a target-
|
|
||||||
selector grammar: bare node name, group name, `bundle:<name>`,
|
|
||||||
`!bundle:<name>`, or `"lambda:node.metadata_get('foo/bar', 0) < 3"`.
|
|
||||||
|
|
||||||
[fork]: https://github.com/CroneKorkN/bundlewrap/blob/main/AGENTS.md
|
|
||||||
```
|
|
||||||
|
|
||||||
### Commit 3 — bundle validation needs a node attached (Gap 2)
|
|
||||||
|
|
||||||
Two file changes.
|
|
||||||
|
|
||||||
**`bundles/AGENTS.md` lines 59-64** — replace the Verify list:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
5. **Verify, in this order:**
|
|
||||||
- `bw test` — repo-wide parse + cross-cutting hooks. Loads every
|
|
||||||
bundle, but reactors don't fire for nodes that haven't opted into
|
|
||||||
the bundle yet — bugs in new reactors stay hidden here.
|
|
||||||
- **Attach the bundle to a node** (via the node's `bundles` list, or
|
|
||||||
a group it belongs to). Until you do, the next steps don't actually
|
|
||||||
exercise the bundle.
|
|
||||||
- `bw test <node>` — exercises every reactor and item-graph edge for
|
|
||||||
that node. This is where most new-bundle bugs surface.
|
|
||||||
- `bw items <node> --blame` — confirm items materialise with the right
|
|
||||||
paths, authored by the expected bundle.
|
|
||||||
- `bw metadata <node> -k <a/b>` — spot-check derived metadata.
|
|
||||||
- `bw hash <node>` — preview vs current host state.
|
|
||||||
|
|
||||||
See [`docs/agents/commands.md#bundle-validation-workflow`](../docs/agents/commands.md#bundle-validation-workflow)
|
|
||||||
for the rationale.
|
|
||||||
```
|
|
||||||
|
|
||||||
**`docs/agents/commands.md`** — new section after the cheat sheet:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
## Bundle-validation workflow
|
|
||||||
|
|
||||||
`bw test` (no args) is a *parsing* gate, not a *behaviour* gate. It
|
|
||||||
loads every bundle, but a bundle's reactors only resolve when a node's
|
|
||||||
metadata is actually built — and that happens only for nodes that
|
|
||||||
opt in. Until then, reactor bugs stay dormant. bw rejects reactors that
|
|
||||||
don't read any metadata, but the rejection only fires once *some* node
|
|
||||||
consumes the bundle.
|
|
||||||
|
|
||||||
When developing a new bundle:
|
|
||||||
|
|
||||||
1. Scaffold + `bw test` — confirms parsing.
|
|
||||||
2. **Attach the bundle to one node** (or a stub node) by adding it to
|
|
||||||
`nodes/<n>.py`'s `bundles` list, or to a group the node is in.
|
|
||||||
3. `bw test <node>` — now reactors fire. This is where bundle bugs
|
|
||||||
surface.
|
|
||||||
4. `bw items <node> --blame` and `bw metadata <node> -k <key>` — confirm
|
|
||||||
items materialise and derived metadata looks right.
|
|
||||||
5. `bw hash <node>` — preview against the live host.
|
|
||||||
|
|
||||||
Step 2 is non-optional. A bundle that "passes `bw test`" with no consumer
|
|
||||||
is proven only to parse.
|
|
||||||
```
|
|
||||||
|
|
||||||
### Commit 4 — nodes carry only node-specific metadata (Gap 3)
|
|
||||||
|
|
||||||
**`bundles/AGENTS.md` Conventions** — new bullet:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
- **Bundles own application-wide knowledge; nodes carry only the few
|
|
||||||
per-host knobs the bundle actually needs.** When designing a bundle,
|
|
||||||
identify the per-node knobs (e.g. domain, uplink interface, a
|
|
||||||
vault-id suffix) and put everything else in `defaults`, or in a
|
|
||||||
reactor that derives from those knobs. Per-node random secrets
|
|
||||||
belong in `defaults` via `repo.vault.random_bytes_as_base64_for(...)`
|
|
||||||
keyed on the node — not in the node file. See
|
|
||||||
`bundles/left4me/metadata.py:10` (`secret_key` derived in defaults)
|
|
||||||
and `bundles/postgresql/metadata.py:4` (vault-derived `password_for`
|
|
||||||
at module scope).
|
|
||||||
```
|
|
||||||
|
|
||||||
**`nodes/AGENTS.md` Pitfalls** — new bullet:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
- **Bloated per-node metadata is usually a bundle smell.** If a
|
|
||||||
bundle's metadata block in the node file has more than 3-5 keys,
|
|
||||||
the bundle is probably under-using `defaults` / reactors. Push the
|
|
||||||
contribution into the bundle (see
|
|
||||||
[`bundles/AGENTS.md`](../bundles/AGENTS.md#conventions)) rather than
|
|
||||||
growing the node file.
|
|
||||||
```
|
|
||||||
|
|
||||||
### Commit 5 — reactors must read metadata or be defaults (Gap 4)
|
|
||||||
|
|
||||||
**`bundles/AGENTS.md` Pitfalls** — new bullet:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
- **Reactors must read metadata.** If a reactor body returns a static
|
|
||||||
dict without calling `metadata.get(...)`, bw raises
|
|
||||||
`ValueError: <reactor> on <node> did not request any metadata, you
|
|
||||||
might want to use defaults instead` once a node consumes the bundle.
|
|
||||||
Fix: fold the contribution into `defaults`. The rule applies even
|
|
||||||
when the reactor writes into another bundle's namespace — a static
|
|
||||||
contribution to e.g. `nftables/output` belongs in `defaults`, where
|
|
||||||
bw merges it with other bundles' contributions.
|
|
||||||
```
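To illustrate the rule in the Commit 5 bullet without a bw checkout, here is a plain-Python stand-in. `TrackingMetadata`, `run_reactor`, the two example reactors, and their values are hypothetical names invented for this sketch and are not bundlewrap APIs. Only the error wording is taken from the `metagen.py` message quoted earlier in this spec.

```python
class TrackingMetadata:
    """Stand-in for a node's metadata that records whether it was read."""

    def __init__(self, data: dict):
        self._data = data
        self.was_read = False

    def get(self, path: str, default=None):
        self.was_read = True
        value = self._data
        for part in path.split("/"):
            if not isinstance(value, dict) or part not in value:
                return default
            value = value[part]
        return value


def run_reactor(reactor, node_name: str, data: dict) -> dict:
    """Call a reactor and reject it if it never read any metadata."""
    metadata = TrackingMetadata(data)
    result = reactor(metadata)
    if not metadata.was_read:
        raise ValueError(
            f"{reactor.__name__} on {node_name} did not request any "
            "metadata, you might want to use defaults instead"
        )
    return result


def static_reactor(metadata):
    # Never reads metadata: this contribution belongs in `defaults`.
    # The rule string is a made-up placeholder.
    return {"nftables": {"output": {"tcp dport 443 accept"}}}


def derived_reactor(metadata):
    # Reads a per-node knob and derives a value from it.
    return {"app": {"url": f"https://{metadata.get('app/domain')}"}}


node_data = {"app": {"domain": "example.org"}}
print(run_reactor(derived_reactor, "demo-node", node_data))
try:
    run_reactor(static_reactor, "demo-node", node_data)
except ValueError as exc:
    print(exc)
```

The stand-in mirrors the observable behaviour described in the bullet, not the way bw tracks metadata access internally.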
|
|
||||||
|
|
||||||
### Commit 6 — `triggers` invariant + self-healing + `unless` (Gaps 5+6)
|
|
||||||
|
|
||||||
**`bundles/AGENTS.md` Pitfalls** — two new bullets (Gap 6's `unless`
|
|
||||||
semantics fold into the second; cleaner than three bullets):
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
- **`triggers` ↔ `triggered: True` invariant.** Any item listed in
|
|
||||||
another's `triggers` list must declare `triggered: True`. bw
|
|
||||||
enforces this at `bw test` time: *"…triggered by …, but missing
|
|
||||||
'triggered' attribute"*. Corollary: an action can't be both in an
|
|
||||||
upstream `triggers` list AND self-healing every apply — pick one.
|
|
||||||
|
|
||||||
- **Triggered actions don't recover from partial failure.** When an
|
|
||||||
upstream item's apply succeeds but its triggered downstream action
|
|
||||||
fails, subsequent applies can't recover via the trigger chain —
|
|
||||||
upstream is "already in desired state" and never re-triggers. For
|
|
||||||
actions that must self-heal (pip installs, chowns, migrations),
|
|
||||||
drop `triggered: True` and gate the command with `unless:
|
|
||||||
<fast-check>`. `unless` is a shell command on the target host whose
|
|
||||||
exit status decides whether the main command runs (exit 0 = skip);
|
|
||||||
it's checked at fire time, after `triggered:` filtering.
|
|
||||||
```
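The two Commit 6 bullets can be illustrated with plain item dicts. This is a sketch under assumptions: the item names, paths, and commands are hypothetical, and `missing_triggered` is a stand-in checker written for this example, not bundlewrap's own check in `deps.py`.

```python
files = {
    "/etc/app/app.conf": {
        # Upstream item: fires the reload action when the file changes.
        "triggers": {"action:app_reload"},
    },
}

actions = {
    "app_reload": {
        "command": "systemctl reload app",
        # Required because the action is listed in a `triggers` set.
        "triggered": True,
    },
    "app_chown": {
        # Self-healing: considered on every apply, skipped when the
        # `unless` check exits 0.
        "command": "chown -R app:app /opt/app",
        "unless": 'test -z "$(find /opt/app ! -user app -print -quit)"',
    },
}


def missing_triggered(files: dict, actions: dict) -> list[str]:
    """Return triggered action ids that lack `triggered: True`."""
    problems = []
    for attrs in files.values():
        for target in sorted(attrs.get("triggers", ())):
            kind, _, name = target.partition(":")
            if kind == "action" and not actions.get(name, {}).get("triggered"):
                problems.append(target)
    return problems


print(missing_triggered(files, actions))
```

Note that `app_chown` appears in no `triggers` set, which is what allows it to omit `triggered` and rely on `unless` instead.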
|
|
||||||
|
|
||||||
## Out of scope
|
|
||||||
|
|
||||||
- Gaps 7–12 — deferred. The maintainer re-engages after this round.
|
|
||||||
- Bundle behaviour changes. Pure docs.
|
|
||||||
- `bw apply` / `bw run` — not authorised this session.
|
|
||||||
|
|
||||||
## Constraints
|
|
||||||
|
|
||||||
- Don't echo decrypted secrets in commit messages or new doc text.
|
|
||||||
- Don't restore `*.py_` parked nodes.
|
|
||||||
- After each commit, `.venv/bin/bw test` must pass.
|
|
||||||
- No push.
|
|
||||||
|
|
@ -1,286 +0,0 @@
|
||||||
# Round 2 — agent-doc refactor (gaps 7–12)
|
|
||||||
|
|
||||||
## Why
|
|
||||||
|
|
||||||
Continuation of round 1 (spec at
|
|
||||||
`2026-05-10-ckn-bw-agents-md-refactor-round-1-design.md`). Round 1
|
|
||||||
landed the cross-cutting lessons (read-only allowlist, bundle
|
|
||||||
validation needs a node, nodes-carry-only-node-specific-metadata,
|
|
||||||
reactors-must-read-metadata, triggers/triggered:True invariant,
|
|
||||||
self-healing pattern). Round 2 covers the remaining six gaps: built-in
|
|
||||||
item-type gotchas and three bundle READMEs.
|
|
||||||
|
|
||||||
## Scope
|
|
||||||
|
|
||||||
In:
|
|
||||||
|
|
||||||
- Gap 7 — `file:`'s `source` defaults to the basename of the destination.
|
|
||||||
- Gap 8 — `git_deploy` extracts as the connecting user (root after
|
|
||||||
sudo); chown action needed for non-root downstream consumers.
|
|
||||||
- Gap 9 — `git_deploy` URL form: `://` triggers per-apply clone, no `://`
|
|
||||||
requires a `git_deploy_repos` map at the repo root.
|
|
||||||
- Gap 10 — `bundles/letsencrypt`: first-apply behaviour, DNS-01
|
|
||||||
prerequisites, negative-cache penalty.
|
|
||||||
- Gap 11 — `bundles/bind`: applying changes to a `master_node`-linked
|
|
||||||
pair needs `bw apply` on both ends.
|
|
||||||
- Gap 12 — `bundles/nginx`: how port 80 is served, `vm/cores`
|
|
||||||
requirement.
|
|
||||||
|
|
||||||
Out:
|
|
||||||
|
|
||||||
- Bundle behaviour changes. Pure docs.
|
|
||||||
- `bw apply` / `bw run` — not authorised this session.
|
|
||||||
|
|
||||||
## Placement decision (diverges from the handoff)
|
|
||||||
|
|
||||||
The handoff suggests `items/AGENTS.md` for gaps 7, 8, 9. But
|
|
||||||
`items/AGENTS.md` is scoped to **custom** item types in the `items/`
|
|
||||||
directory — its first sentence: *"Custom item types — each `*.py` is
|
|
||||||
a `bundlewrap.items.Item` subclass…"*. Built-in gotchas (`file:`,
|
|
||||||
`git_deploy:`) don't fit there.
|
|
||||||
|
|
||||||
Round-1 lessons about built-in mechanics (reactors must read metadata,
|
|
||||||
`triggers` invariant, self-healing pattern) all landed in
|
|
||||||
`bundles/AGENTS.md` Pitfalls. Gaps 7, 8, 9 are the same shape, so
|
|
||||||
they go in the same place.
|
|
||||||
|
|
||||||
## Validation findings
|
|
||||||
|
|
||||||
- Gap 7: well-known bw built-in semantics. Trusting the handoff.
|
|
||||||
- Gap 8: confirmed at `.venv/src/bundlewrap/bundlewrap/items/git_deploy.py`'s
|
|
||||||
`fix()` method — uses `self.node.upload(...)` which writes as the sudo
|
|
||||||
user (root). Files end up root-owned.
|
|
||||||
- Gap 9: confirmed in round 1 (`git_deploy.py:103` —
|
|
||||||
`if "://" in self.attributes['repo']:`).
|
|
||||||
- Gap 10: confirmed `/etc/dehydrated/letsencrypt-ensure-some-certificate`
|
|
||||||
exists in the bundle; runs on every domain with idempotent `unless`.
|
|
||||||
The daily timer runs `/usr/bin/dehydrated --cron --accept-terms --challenge dns-01`.
|
|
||||||
- Gap 11: nuanced. The bundle DOES set `bind/type = 'slave'` and renders
|
|
||||||
different named.conf.local for slaves, so bind itself may AXFR at
|
|
||||||
runtime. But the slave's *bw-managed* zone files are statically
|
|
||||||
rendered from the master's metadata at slave-apply time
|
|
||||||
(`bundles/bind/items.py:100`). The practical workflow rule — "apply
|
|
||||||
both" — is correct regardless. I'll frame the README as the workflow
|
|
||||||
rule, not the absolute "not AXFR slaving" claim from the handoff.
|
|
||||||
- Gap 12: confirmed `nginx.conf:42` includes `/etc/nginx/sites-enabled/*`;
|
|
||||||
`nginx/items.py:35` reads `node.metadata.get('vm/cores')` with no
|
|
||||||
default. README does not exist.
|
|
||||||
|
|
||||||
## Existing README states
|
|
||||||
|
|
||||||
- `bundles/letsencrypt/README.md` — 9 lines: upstream link + nsupdate
|
|
||||||
snippet. Reshape into an operational README; keep the nsupdate snippet.
|
|
||||||
- `bundles/bind/README.md` — does not exist. Create.
|
|
||||||
- `bundles/nginx/README.md` — does not exist. Create.
|
|
||||||
|
|
||||||
## Commits
|
|
||||||
|
|
||||||
### Commit 7 — `file:` source defaults to destination basename (Gap 7)
|
|
||||||
|
|
||||||
`bundles/AGENTS.md` Pitfalls — new bullet:
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
- **`file:` `source` defaults to the destination basename.** For a
|
|
||||||
destination of `/etc/foo/bar.conf` with no `source` key, bw looks for
|
|
||||||
`bundles/<bundle>/files/bar.conf`. Only declare `source` explicitly
|
|
||||||
when the basename you want differs (e.g. shipping a Mako template
|
|
||||||
named `bar.conf.mako` to a destination of `/etc/foo/bar.conf`).
|
|
||||||
```
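The default described in the Commit 7 bullet amounts to taking the basename of the destination path. A minimal sketch follows; `default_source` is a hypothetical helper for illustration, not a bundlewrap function.

```python
import posixpath


def default_source(destination: str, bundle: str) -> str:
    """Path looked up when a file item declares no `source`."""
    return posixpath.join(
        "bundles", bundle, "files", posixpath.basename(destination)
    )


print(default_source("/etc/foo/bar.conf", "<bundle>"))
```

An explicit `source` is only needed when the file under `files/` is named differently from that basename.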
|
|
||||||
|
|
||||||
### Commit 8 — `git_deploy` gotchas (Gaps 8 + 9)
|
|
||||||
|
|
||||||
`bundles/AGENTS.md` Pitfalls — two new bullets.
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
- **`git_deploy` extracts as the connecting (sudo) user — files end up
|
|
||||||
root-owned.** A downstream action that runs as a non-root app user
|
|
||||||
(typical: editable pip install, Rails bundle install) will hit
|
|
||||||
`Permission denied` on `.egg-info` or similar. The fix is a
|
|
||||||
self-healing chown action between `git_deploy` and the downstream
|
|
||||||
action:
|
|
||||||
|
|
||||||
```python
|
|
||||||
actions['<bundle>_chown_src'] = {
|
|
||||||
'command': 'chown -R <user>:<group> <path>',
|
|
||||||
'unless': 'test -z "$(find <path> ! -user <user> -print -quit)"',
|
|
||||||
'cascade_skip': False,
|
|
||||||
'needs': ['git_deploy:<path>', 'user:<user>', 'group:<group>'],
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
See `bundles/left4me/items.py` for an in-tree example.
|
|
||||||
|
|
||||||
- **`git_deploy` URL form matters.** A URL containing `://` (HTTP/HTTPS,
|
|
||||||
`ssh://`) makes bw clone to a temp dir per-apply — no operator-side
|
|
||||||
state needed. Without `://` (SCP-style `git@host:path`), bw expects a
|
|
||||||
`git_deploy_repos` map file at the repo root pointing at a long-lived
|
|
||||||
local clone, and raises `RepositoryError('missing repo map for
|
|
||||||
git_deploy')` if it isn't there. For HTTPS-reachable repos use the
|
|
||||||
HTTPS form; for SSH-only, prefer the explicit `ssh://user@host/path`
|
|
||||||
form so the map isn't needed.
|
|
||||||
```
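
The URL-form rule in the second bullet comes down to a single substring check. The sketch below restates that rule as described in the text; `needs_repo_map` is a hypothetical helper name, and the sample URLs are made up for illustration. It is not a copy of bundlewrap's code.

```python
def needs_repo_map(repo_url):
    """True if, per the rule above, a `git_deploy_repos` map would be required.

    Hypothetical helper: URLs containing "://" are cloned to a temp dir,
    anything else (SCP-style `git@host:path`) relies on the local map.
    """
    return "://" not in repo_url


# Illustrative URLs only; none of these point at real repositories.
for url in (
    "https://git.example.org/team/app.git",
    "ssh://git@git.example.org/team/app.git",
    "git@git.example.org:team/app.git",
):
    print(url, "-> needs map" if needs_repo_map(url) else "-> temp clone")
```

Running a check like this over every `git_deploy` repo URL in a bundle is one way to spot SCP-style URLs before they surface as a `RepositoryError` at apply time.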
|
|
||||||
|
|
||||||
### Commit 9 — letsencrypt README (Gap 10)
|
|
||||||
|
|
||||||
Reshape `bundles/letsencrypt/README.md`. Keep the upstream link and
|
|
||||||
nsupdate snippet at the top; add three structured sections.
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
# letsencrypt
|
|
||||||
|
|
||||||
Issues and renews Let's Encrypt certs via [dehydrated][upstream] with
|
|
||||||
DNS-01 against the in-house bind-acme server.
|
|
||||||
|
|
||||||
[upstream]: https://github.com/dehydrated-io/dehydrated/wiki/example-dns-01-nsupdate-script
|
|
||||||
|
|
||||||
## First-apply behaviour
|
|
||||||
|
|
||||||
Immediately after `bw apply <node>`, nginx serves a **self-signed
|
|
||||||
cert** for each declared domain — generated by
|
|
||||||
`/etc/dehydrated/letsencrypt-ensure-some-certificate` so nginx has
|
|
||||||
something to start with. The real Let's Encrypt cert arrives at most
|
|
||||||
24h later when the systemd timer fires
|
|
||||||
(`/usr/bin/dehydrated --cron --accept-terms --challenge dns-01`). To
|
|
||||||
shortcut the wait:
|
|
||||||
|
|
||||||
```sh
|
|
||||||
ssh <node> 'sudo /usr/bin/dehydrated --cron --accept-terms --challenge dns-01'
|
|
||||||
ssh <node> 'sudo systemctl reload nginx'
|
|
||||||
```
|
|
||||||
|
|
||||||
## DNS-01 prerequisites
|
|
||||||
|
|
||||||
`hook.sh` does `nsupdate` against the bind-acme server (referenced
|
|
||||||
by `letsencrypt/acme_node`). For the challenge to succeed:
|
|
||||||
|
|
||||||
1. The acme node must be in the same metadata graph (so
|
|
||||||
`bw metadata <node> -k letsencrypt/acme_node` resolves).
|
|
||||||
2. **All NS servers** for the validated domain must serve the
|
|
||||||
`_acme-challenge.<domain>` CNAME — Let's Encrypt validates from
|
|
||||||
primary AND secondary geographic regions; both authoritative
|
|
||||||
servers must agree. If a secondary NS is also a bw-managed node,
|
|
||||||
`bw apply` it after adding the domain (see e.g. `ovh.secondary`).
|
|
||||||
3. The bind-acme node's TSIG key must be reachable. `hook.sh` is
|
|
||||||
rendered with the bind-acme server's `network/internal/ipv4` —
|
|
||||||
for clients outside that LAN, the route must exist (typically via
|
|
||||||
wireguard `s2s` peer membership).
|
|
||||||
|
|
||||||
## Negative-cache penalty
|
|
||||||
|
|
||||||
If the first DNS-01 attempt fails (e.g. zone not yet applied to the
|
|
||||||
secondary NS), Let's Encrypt's resolvers cache NXDOMAIN for the SOA's
|
|
||||||
negative TTL (often 900s = 15 min). Subsequent attempts during that
|
|
||||||
window also fail and refresh the cache. Combined with LE's rate limit
|
|
||||||
of **5 failed authorisations per hostname, per account, per hour**, recovery requires
|
|
||||||
you to **stop retrying** for ~15 minutes after fixing the DNS, then
|
|
||||||
make at most one attempt.
|
|
||||||
|
|
||||||
## nsupdate sample
|
|
||||||
|
|
||||||
For interactive testing of the bind-acme TSIG path:
|
|
||||||
|
|
||||||
```sh
|
|
||||||
printf "server 127.0.0.1
|
|
||||||
zone acme.resolver.name.
|
|
||||||
update add _acme-challenge.ckn.li.acme.resolver.name. 600 IN TXT \"hello\"
|
|
||||||
send
|
|
||||||
" | nsupdate -y hmac-sha512:acme:<TSIG_KEY_REDACTED>
|
|
||||||
```
|
|
||||||
```
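
The waiting rule from the "Negative-cache penalty" section is simple arithmetic: the earliest sensible retry is the last failed attempt plus the SOA's negative TTL. The sketch below assumes the 900 s figure quoted in the text as a default; the real value comes from the zone's SOA record, and `earliest_retry` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def earliest_retry(last_failed_attempt, negative_ttl_seconds=900):
    """Earliest time at which the cached NXDOMAIN should have expired.

    Assumption: 900 s is only the example TTL from the text; pass the
    zone's actual SOA negative TTL when it differs.
    """
    return last_failed_attempt + timedelta(seconds=negative_ttl_seconds)


# Example timestamp chosen for illustration.
last_failure = datetime(2024, 1, 1, 12, 0, 0)
print("do not retry before:", earliest_retry(last_failure))
```

Because every failed attempt inside the window refreshes the cache, the timestamp to pass in is the most recent failure, not the first one.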
|
|
||||||
|
|
||||||
### Commit 10 — bind README (Gap 11, reframed)
|
|
||||||
|
|
||||||
Create `bundles/bind/README.md`. Frame as the workflow rule, not the
|
|
||||||
absolute "not AXFR" claim.
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
# bind
|
|
||||||
|
|
||||||
Authoritative DNS — primary plus optional `bind/master_node` slaves.
|
|
||||||
|
|
||||||
## Applying changes needs both nodes
|
|
||||||
|
|
||||||
The slave's bw-managed zone files are rendered from the master's
|
|
||||||
metadata at slave-apply time (see `bundles/bind/items.py:100`). When
|
|
||||||
you change a record on the master (adding a `letsencrypt/domains`
|
|
||||||
entry, a new vhost, etc.), the change is only published once you
|
|
||||||
apply BOTH:
|
|
||||||
|
|
||||||
```sh
|
|
||||||
bw apply htz.mails # primary (where the source records live)
|
|
||||||
bw apply ovh.secondary # secondary (renders its own zone files)
|
|
||||||
```
|
|
||||||
|
|
||||||
Until both have been applied, `bw verify ovh.secondary` will show
|
|
||||||
stale zones and consumers that hit the secondary (Let's Encrypt's
|
|
||||||
secondary-region validators in particular) will see NXDOMAIN. Even
|
|
||||||
though the slave's named.conf.local declares `type slave;`, don't
|
|
||||||
rely on bind's own AXFR catching up — the bw-rendered file on disk
|
|
||||||
is what `bw verify` measures.
|
|
||||||
|
|
||||||
## See also
|
|
||||||
|
|
||||||
- `bundles/bind-acme/` — the in-house ACME-update receiver.
|
|
||||||
- `bundles/letsencrypt/README.md` — DNS-01 prerequisites and the
|
|
||||||
negative-cache penalty (the most common operational consequence of
|
|
||||||
forgetting to apply the secondary).
|
|
||||||
```
|
|
||||||
|
|
||||||
### Commit 11 — nginx README (Gap 12)
|
|
||||||
|
|
||||||
Create `bundles/nginx/README.md`.
|
|
||||||
|
|
||||||
```markdown
|
|
||||||
# nginx
|
|
||||||
|
|
||||||
Webserver. Per-node vhosts in `nginx/vhosts`; per-vhost templates in
|
|
||||||
`data/nginx/*.conf`.
|
|
||||||
|
|
||||||
## How port 80 is served
|
|
||||||
|
|
||||||
The bundle ships a fixed `80.conf` to
|
|
||||||
`/etc/nginx/sites-available/80.conf` (picked up by the
|
|
||||||
`sites-enabled/` symlink) that handles **all** port-80 traffic
|
|
||||||
across vhosts:
|
|
||||||
|
|
||||||
1. ACME HTTP-01 challenges (`/.well-known/acme-challenge/`) are
|
|
||||||
served from `/var/lib/dehydrated/acme-challenges/`.
|
|
||||||
2. All other port-80 requests are 301-redirected to
|
|
||||||
`https://$host$request_uri`.
|
|
||||||
|
|
||||||
Per-vhost templates only declare `listen 443 ssl http2;`, so they
|
|
||||||
don't need their own port-80 server blocks. If you need vhost-
|
|
||||||
specific port-80 behaviour (e.g. plain-HTTP without redirect), you'll
|
|
||||||
need to override 80.conf or add a per-vhost block.
|
|
||||||
|
|
||||||
## Required metadata
|
|
||||||
|
|
||||||
- `vm/cores` — read directly by `items.py` for `worker_processes`.
|
|
||||||
No default; `bw items <node>` raises at item-build time if missing.
|
|
||||||
Typically supplied by the `vm` bundle / hetzner-vm group; double-
|
|
||||||
check on bare-metal hosts.
|
|
||||||
- `nginx/vhosts` — dict of vhost-name → vhost-config.
|
|
||||||
- `nginx/modules` — list of dynamic modules to load.
|
|
||||||
|
|
||||||
## Cross-namespace
|
|
||||||
|
|
||||||
`items.py` reads `letsencrypt/domains` to skip emitting a per-vhost
|
|
||||||
HTTPS block when LE hasn't declared the domain yet — keeps the bundle
|
|
||||||
loadable on a node where letsencrypt isn't fully wired up.
|
|
||||||
```
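
The two port-80 rules described under "How port 80 is served" can be modelled as a small decision function. This is a Python sketch of the described behaviour, not the contents of `80.conf`; `route_port_80` is a hypothetical name, and mapping the challenge token directly onto a file under the challenge directory is an assumption about how the location is configured.

```python
ACME_PREFIX = "/.well-known/acme-challenge/"
ACME_ROOT = "/var/lib/dehydrated/acme-challenges/"


def route_port_80(host, request_uri):
    """Return ("serve", file_path) or ("redirect", location).

    Hypothetical model of the two rules in the README text.
    """
    if request_uri.startswith(ACME_PREFIX):
        # Assumption: the token maps to a file directly under ACME_ROOT.
        token = request_uri[len(ACME_PREFIX):]
        return ("serve", ACME_ROOT + token)
    # Everything else is redirected to HTTPS (a 301 in the real config).
    return ("redirect", f"https://{host}{request_uri}")


print(route_port_80("example.org", "/.well-known/acme-challenge/abc"))
print(route_port_80("example.org", "/index.html?x=1"))
```

A vhost that needs plain HTTP without the redirect falls outside both branches of this model, which is why the README says it needs its own block or an override.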
|
|
||||||
|
|
||||||
## Out of scope
|
|
||||||
|
|
||||||
- Bundle behaviour changes. Pure docs.
|
|
||||||
- `bw apply` / `bw run`.
|
|
||||||
- Reformatting the existing two-line bundle READMEs into the new
|
|
||||||
shape — bundles/AGENTS.md explicitly says don't do that
|
|
||||||
("uneven quality is part of what we accept in exchange for not
|
|
||||||
blocking other work").
|
|
||||||
|
|
||||||
## Constraints
|
|
||||||
|
|
||||||
- Don't echo decrypted secrets. The TSIG-key example in the
|
|
||||||
letsencrypt nsupdate snippet uses `<TSIG_KEY_REDACTED>`.
|
|
||||||
- After each commit, `.venv/bin/bw test` must pass.
|
|
||||||
- No push.
|
|
||||||
|
|
@ -1,5 +0,0 @@
|
||||||
{
|
|
||||||
'bundles': {
|
|
||||||
'left4me',
|
|
||||||
},
|
|
||||||
}
|
|
||||||
|
|
@ -81,12 +81,6 @@ This loader shape has consequences:
|
||||||
These are intentional parks/buffers, not bugs.
|
These are intentional parks/buffers, not bugs.
|
||||||
- **`id` must be unique.** A pre-apply hook (`hooks/unique_node_ids.py`)
|
- **`id` must be unique.** A pre-apply hook (`hooks/unique_node_ids.py`)
|
||||||
enforces this; duplicate IDs fail `bw test` and `bw apply`.
|
enforces this; duplicate IDs fail `bw test` and `bw apply`.
|
||||||
- **Bloated per-node metadata is usually a bundle smell.** If a
|
|
||||||
bundle's metadata block in the node file has more than 3-5 keys,
|
|
||||||
the bundle is probably under-using `defaults` / reactors. Push the
|
|
||||||
contribution into the bundle (see
|
|
||||||
[`bundles/AGENTS.md`](../bundles/AGENTS.md#conventions)) rather than
|
|
||||||
growing the node file.
|
|
||||||
|
|
||||||
## See also
|
## See also
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -233,7 +233,6 @@
|
||||||
'10.0.229.0/24',
|
'10.0.229.0/24',
|
||||||
],
|
],
|
||||||
},
|
},
|
||||||
'ovh.left4me': {},
|
|
||||||
},
|
},
|
||||||
'clients': {
|
'clients': {
|
||||||
'macbook': {
|
'macbook': {
|
||||||
|
|
|
||||||
|
|
@ -1,21 +1,15 @@
|
||||||
{
|
{
|
||||||
'hostname': '141.95.32.8',
|
'hostname': '141.95.32.8',
|
||||||
|
'username': 'debian',
|
||||||
'groups': [
|
'groups': [
|
||||||
'backup',
|
|
||||||
'debian-13',
|
'debian-13',
|
||||||
'left4me',
|
|
||||||
'monitored',
|
'monitored',
|
||||||
'webserver',
|
|
||||||
],
|
],
|
||||||
'bundles': [
|
'bundles': [
|
||||||
'wireguard',
|
#'wireguard',
|
||||||
],
|
],
|
||||||
'metadata': {
|
'metadata': {
|
||||||
'id': '14d2abc-3855-4bb7-99e2-d4e3eb0344dd',
|
'id': '14d2abc-3855-4bb7-99e2-d4e3eb0344dd',
|
||||||
'vm': {
|
|
||||||
'cores': 4, # 4 physical, 8 with HT
|
|
||||||
'threads': 8,
|
|
||||||
},
|
|
||||||
'network': {
|
'network': {
|
||||||
'external': {
|
'external': {
|
||||||
'interface': 'enp3s0f0',
|
'interface': 'enp3s0f0',
|
||||||
|
|
@ -41,12 +35,5 @@
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
'left4me': {
|
|
||||||
'domain': 'left4.me',
|
|
||||||
# Both HT siblings of physical core 0 (cpu0+cpu4 per
|
|
||||||
# /sys/devices/system/cpu/cpu0/topology/thread_siblings_list).
|
|
||||||
# Keeps system work off the physical cores running game ticks.
|
|
||||||
'system_cpus': {0, 4},
|
|
||||||
},
|
|
||||||
},
|
},
|
||||||
}
|
}
|
||||||
|
|
|
||||||