Compare commits: d4dedde0ad...c6caf2a1cf (57 commits)
| SHA1 | Author | Date | |
|---|---|---|---|
| c6caf2a1cf | |||
| 1b3f3ecf97 | |||
| 1d30830824 | |||
| 524ad6e89b | |||
| 99d68a5135 | |||
| 852a65a6f6 | |||
| 09d236ded5 | |||
| 7265c4aab1 | |||
| b5662f7ea7 | |||
| b8648cb53f | |||
| 6f2073847d | |||
| 6cc823613a | |||
| 05abe52221 | |||
| 7a579f27c5 | |||
| 0e88c4967e | |||
| 69bcac421a | |||
| 59788f315a | |||
| d3068ba8f6 | |||
| b5e72a3ac3 | |||
| 0a9f3dae88 | |||
| 422a275d97 | |||
| 3ed0264be6 | |||
| d49259ff07 | |||
| ed141a9300 | |||
| 9d17c69b22 | |||
| 5bf95cb065 | |||
| cac04a456b | |||
| c2cc3866f3 | |||
| d548235dfe | |||
| 149ce6c870 | |||
| 0479c96ae9 | |||
| 5d69180466 | |||
| 7d3554f8a5 | |||
| fc66267656 | |||
| 758660b131 | |||
| 7b291acca1 | |||
| 90f14b69e4 | |||
| 3bffd7b8f5 | |||
| 43f0c57438 | |||
| d425afad02 | |||
| f9bf289ef0 | |||
| a8fc3f2298 | |||
| c82737b162 | |||
| b1edcac3c7 | |||
| 72da6c0a8d | |||
| 6965441e9a | |||
| 6bf46ce9a4 | |||
| def010c976 | |||
| 433c403ddc | |||
| 80d2a79b97 | |||
| e842e7caa6 | |||
| 3afd4d60cc | |||
| 6db792ce6a | |||
| 7547d041a2 | |||
| cc1c6a5767 | |||
| af78e40fda | |||
| c6bf2e0fc8 |
33 changed files with 2011 additions and 23 deletions
2
.gitignore
vendored
|
|
@@ -5,3 +5,5 @@
|
||||||
.bw_debug_history
|
.bw_debug_history
|
||||||
# CocoIndex Code (ccc)
|
# CocoIndex Code (ccc)
|
||||||
/.cocoindex_code/
|
/.cocoindex_code/
|
||||||
|
# bundlewrap git_deploy local-mirror map (operator-specific paths)
|
||||||
|
git_deploy_repos
|
||||||
|
|
|
||||||
15
AGENTS.md
|
|
@@ -12,12 +12,12 @@ not project documentation. Onboarding lives **here**, in `AGENTS.md`.
|
||||||
|
|
||||||
## Quickstart for agents
|
## Quickstart for agents
|
||||||
|
|
||||||
Five rules; follow these and you won't break things:
|
Six rules; follow these and you won't break things:
|
||||||
|
|
||||||
1. **Read-only by default.** Never run `bw apply`, `bw run`, or
|
1. **Read-only by default.** Never run `bw apply`, `bw run`, or
|
||||||
`bw lock` without explicit user request — even with `-i`. Stick
|
`bw lock` without explicit user request — even with `-i`. Stick
|
||||||
to `bw test`, `bw nodes`, `bw groups`, `bw bundles`,
|
to `bw test`, `bw nodes`, `bw groups`, `bw items`,
|
||||||
`bw items`, `bw metadata`, `bw hash`, `bw debug`. See
|
`bw metadata`, `bw hash`, `bw verify`, `bw debug`. See
|
||||||
[`docs/agents/commands.md`](docs/agents/commands.md) and the
|
[`docs/agents/commands.md`](docs/agents/commands.md) and the
|
||||||
fork's [safety envelope](https://github.com/CroneKorkN/bundlewrap/blob/main/AGENTS.md).
|
fork's [safety envelope](https://github.com/CroneKorkN/bundlewrap/blob/main/AGENTS.md).
|
||||||
2. **Never echo decrypted secrets.** Don't print, paste, or log the
|
2. **Never echo decrypted secrets.** Don't print, paste, or log the
|
||||||
|
|
@@ -38,6 +38,15 @@ Five rules; follow these and you won't break things:
|
||||||
5. **Prefer adding helpers to `libs/`** over duplicating logic across
|
5. **Prefer adding helpers to `libs/`** over duplicating logic across
|
||||||
bundles. Repo-wide helpers go in
|
bundles. Repo-wide helpers go in
|
||||||
[`libs/`](libs/AGENTS.md), reachable as `repo.libs.<x>`.
|
[`libs/`](libs/AGENTS.md), reachable as `repo.libs.<x>`.
|
||||||
|
6. **`ccc` is available for semantic search.** This repo is indexed
|
||||||
|
with [`ccc`](https://github.com/cocoindex-io/cocoindex-code).
|
||||||
|
Reach for it on conceptual questions ("where is X used / which
|
||||||
|
bundles do Y / what are the contexts of Z"), where a keyword
|
||||||
|
grep would miss indirect usage:
|
||||||
|
`ccc search '<concept>' --path '**'`. Pass `--path '**'` —
|
||||||
|
without it, results are filtered to the current working
|
||||||
|
directory's subtree. `grep`/`rg`/`find` remain fine for
|
||||||
|
exact-string lookups; pick whichever fits the question.
|
||||||
|
|
||||||
## Layout
|
## Layout
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@@ -41,6 +41,16 @@ bundles/<name>/
|
||||||
more than one bundle. Don't duplicate logic across bundles.
|
more than one bundle. Don't duplicate logic across bundles.
|
||||||
- **Custom item types** (e.g. `download:`) live in
|
- **Custom item types** (e.g. `download:`) live in
|
||||||
[`items/`](../items/AGENTS.md), not per-bundle.
|
[`items/`](../items/AGENTS.md), not per-bundle.
|
||||||
|
- **Bundles own application-wide knowledge; nodes carry only the few
|
||||||
|
per-host knobs the bundle actually needs.** When designing a bundle,
|
||||||
|
identify the per-node knobs (e.g. domain, uplink interface, a
|
||||||
|
vault-id suffix) and put everything else in `defaults`, or in a
|
||||||
|
reactor that derives from those knobs. Per-node random secrets
|
||||||
|
belong in `defaults` via `repo.vault.random_bytes_as_base64_for(...)`
|
||||||
|
keyed on the node — not in the node file. See
|
||||||
|
`bundles/left4me/metadata.py:10` (`secret_key` derived in defaults)
|
||||||
|
and `bundles/postgresql/metadata.py:4` (vault-derived `password_for`
|
||||||
|
at module scope).
|
||||||
|
|
||||||
## How to add a new bundle
|
## How to add a new bundle
|
||||||
|
|
||||||
|
|
@@ -56,12 +66,22 @@ bundles/<name>/
|
||||||
[`groups/<axis>/<x>.py`](../groups/AGENTS.md) (preferred for shared
|
[`groups/<axis>/<x>.py`](../groups/AGENTS.md) (preferred for shared
|
||||||
bundles) or to the node's `bundles` list directly
|
bundles) or to the node's `bundles` list directly
|
||||||
([`nodes/AGENTS.md`](../nodes/AGENTS.md)).
|
([`nodes/AGENTS.md`](../nodes/AGENTS.md)).
|
||||||
5. Verify, in this order:
|
5. **Verify, in this order:**
|
||||||
- `bw test` — sanity (loaders + reactors).
|
- `bw test` — repo-wide parse + cross-cutting hooks. Loads every
|
||||||
- `bw items <node>` — confirm new items appear on a node that opts in.
|
bundle, but reactors don't fire for nodes that haven't opted into
|
||||||
- `bw hash <node>` — confirm the change is what you expected. See
|
the bundle yet — bugs in new reactors stay hidden here.
|
||||||
[`docs/agents/commands.md`](../docs/agents/commands.md) and the
|
- **Attach the bundle to a node** (via the node's `bundles` list, or
|
||||||
fork's hash-diff workflow.
|
a group it belongs to). Until you do, the next steps don't actually
|
||||||
|
exercise the bundle.
|
||||||
|
- `bw test <node>` — exercises every reactor and item-graph edge for
|
||||||
|
that node. This is where most new-bundle bugs surface.
|
||||||
|
- `bw items <node> --blame` — confirm items materialise with the
|
||||||
|
right paths, authored by the expected bundle.
|
||||||
|
- `bw metadata <node> -k <a/b>` — spot-check derived metadata.
|
||||||
|
- `bw hash <node>` — preview vs current host state.
|
||||||
|
|
||||||
|
See [`docs/agents/commands.md#bundle-validation-workflow`](../docs/agents/commands.md#bundle-validation-workflow)
|
||||||
|
for the rationale.
|
||||||
6. Add a `bundles/<name>/README.md`. See "Per-bundle README" below
|
6. Add a `bundles/<name>/README.md`. See "Per-bundle README" below
|
||||||
for what to cover.
|
for what to cover.
|
||||||
|
|
||||||
|
|
@@ -82,6 +102,12 @@ bundles/<name>/
|
||||||
unless the matching `file:` item declares `content_type='mako'`
|
unless the matching `file:` item declares `content_type='mako'`
|
||||||
(or a templating extension triggers it). To check, read the matching
|
(or a templating extension triggers it). To check, read the matching
|
||||||
`file:` entry in `items.py`.
|
`file:` entry in `items.py`.
|
||||||
|
- **`file:` `source` defaults to the destination basename.** For a
|
||||||
|
destination of `/etc/foo/bar.conf` with no `source` key, bw looks
|
||||||
|
for `bundles/<bundle>/files/bar.conf`. Only declare `source`
|
||||||
|
explicitly when the basename you want differs (e.g. shipping a Mako
|
||||||
|
template named `bar.conf.mako` to a destination of
|
||||||
|
`/etc/foo/bar.conf`).
|
||||||
- **Reactors writing across namespaces.** Some bundles' reactors write
|
- **Reactors writing across namespaces.** Some bundles' reactors write
|
||||||
into other bundles' metadata namespaces (e.g. `nextcloud` writes
|
into other bundles' metadata namespaces (e.g. `nextcloud` writes
|
||||||
into `apt.packages`, `archive.paths`). When you change such a bundle,
|
into `apt.packages`, `archive.paths`). When you change such a bundle,
|
||||||
|
|
@@ -90,6 +116,28 @@ bundles/<name>/
|
||||||
itself; grep `'<other-bundle>':` in the reactors when in doubt.
|
itself; grep `'<other-bundle>':` in the reactors when in doubt.
|
||||||
- **`bw hash` doesn't accept selectors.** Use `bw hash <node>` per
|
- **`bw hash` doesn't accept selectors.** Use `bw hash <node>` per
|
||||||
literal name; see the fork's runbook.
|
literal name; see the fork's runbook.
|
||||||
|
- **Reactors must read metadata.** If a reactor body returns a static
|
||||||
|
dict without calling `metadata.get(...)`, bw raises
|
||||||
|
`ValueError: <reactor> on <node> did not request any metadata, you
|
||||||
|
might want to use defaults instead` once a node consumes the bundle.
|
||||||
|
Fix: fold the contribution into `defaults`. The rule applies even
|
||||||
|
when the reactor writes into another bundle's namespace — a static
|
||||||
|
contribution to e.g. `nftables/output` belongs in `defaults`, where
|
||||||
|
bw merges it with other bundles' contributions.
|
||||||
|
- **`triggers` ↔ `triggered: True` invariant.** Any item listed in
|
||||||
|
another's `triggers` list must declare `triggered: True`. bw
|
||||||
|
enforces this at `bw test` time: *"…triggered by …, but missing
|
||||||
|
'triggered' attribute"*. Corollary: an action can't be both in an
|
||||||
|
upstream `triggers` list and self-healing on every apply; pick one.
|
||||||
|
- **Triggered actions don't recover from partial failure.** When an
|
||||||
|
upstream item's apply succeeds but its triggered downstream action
|
||||||
|
fails, subsequent applies can't recover via the trigger chain —
|
||||||
|
upstream is "already in desired state" and never re-triggers. For
|
||||||
|
actions that must self-heal (pip installs, chowns, migrations),
|
||||||
|
drop `triggered: True` and gate the command with `unless: <fast-check>`.
|
||||||
|
`unless` is a shell command on the target host whose exit status
|
||||||
|
decides whether the main command runs (exit 0 = skip); it's checked
|
||||||
|
at fire time, after `triggered:` filtering.
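
The self-healing pattern from the last bullet can be sketched as a bundlewrap `actions` dict. This is a minimal illustration under assumptions, not code from this repo: the action name, the venv path, and the package name are hypothetical. Only the `command` / `unless` attributes and the "exit 0 = skip" semantics are taken from the text above.

```python
# Sketch of a self-healing action as it might appear in a bundle's items.py.
# Hypothetical: the action key, the venv path and the package name.
VENV = '/opt/example/.venv'

actions = {
    'example_pip_install': {
        # Main command; runs on every apply unless the check below exits 0.
        'command': f'{VENV}/bin/pip install -e /opt/example/src',
        # Fast check on the target host: exit 0 means "already done, skip".
        'unless': f'{VENV}/bin/pip show example-package >/dev/null 2>&1',
        # Deliberately no 'triggered': True, so a failed run is retried on
        # the next apply instead of waiting for an upstream re-trigger.
    },
}
```

Because the action is no longer `triggered`, it must not appear in any other item's `triggers` list, which is the invariant described two bullets earlier.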
|
||||||
|
|
||||||
## Per-bundle README
|
## Per-bundle README
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@@ -33,6 +33,7 @@ def acme_zone(metadata):
|
||||||
str(ip_interface(other_node.metadata.get('network/internal/ipv4')).ip)
|
str(ip_interface(other_node.metadata.get('network/internal/ipv4')).ip)
|
||||||
for other_node in repo.nodes
|
for other_node in repo.nodes
|
||||||
if other_node.metadata.get('letsencrypt/domains', {})
|
if other_node.metadata.get('letsencrypt/domains', {})
|
||||||
|
and other_node.metadata.get('network/internal/ipv4', None)
|
||||||
},
|
},
|
||||||
*{
|
*{
|
||||||
str(ip_interface(other_node.metadata.get('wireguard/my_ip')).ip)
|
str(ip_interface(other_node.metadata.get('wireguard/my_ip')).ip)
|
||||||
|
|
|
||||||
30
bundles/bind/README.md
Normal file
|
|
@@ -0,0 +1,30 @@
|
||||||
|
# bind
|
||||||
|
|
||||||
|
Authoritative DNS — primary plus optional `bind/master_node` slaves.
|
||||||
|
|
||||||
|
## Applying changes needs both nodes
|
||||||
|
|
||||||
|
The slave's bw-managed zone files are rendered from the master's
|
||||||
|
metadata at slave-apply time (see `bundles/bind/items.py:100`). When
|
||||||
|
you change a record on the master (adding a `letsencrypt/domains`
|
||||||
|
entry, a new vhost, etc.), the change is only published once you
|
||||||
|
apply BOTH:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
bw apply htz.mails # primary (where the source records live)
|
||||||
|
bw apply ovh.secondary # secondary (renders its own zone files)
|
||||||
|
```
|
||||||
|
|
||||||
|
Until both have been applied, `bw verify ovh.secondary` will show
|
||||||
|
stale zones and consumers that hit the secondary (Let's Encrypt's
|
||||||
|
secondary-region validators in particular) will see NXDOMAIN. Even
|
||||||
|
though the slave's named.conf.local declares `type slave;`, don't
|
||||||
|
rely on bind's own AXFR catching up — the bw-rendered file on disk
|
||||||
|
is what `bw verify` measures.
|
||||||
|
|
||||||
|
## See also
|
||||||
|
|
||||||
|
- `bundles/bind-acme/` — the in-house ACME-update receiver.
|
||||||
|
- `bundles/letsencrypt/README.md` — DNS-01 prerequisites and the
|
||||||
|
negative-cache penalty (the most common operational consequence
|
||||||
|
of forgetting to apply the secondary).
|
||||||
114
bundles/left4me/README.md
Normal file
|
|
@@ -0,0 +1,114 @@
|
||||||
|
# left4me
|
||||||
|
|
||||||
|
L4D2 game-server management platform: a Flask web UI on gunicorn that
|
||||||
|
provisions per-instance srcds servers via templated systemd units, with
|
||||||
|
kernel-overlayfs layering for shared installations + per-overlay maps,
|
||||||
|
and uid-based DSCP/priority marking on the egress path so CAKE on the
|
||||||
|
external interface prioritizes srcds UDP over bulk traffic.
|
||||||
|
|
||||||
|
## Metadata
|
||||||
|
|
||||||
|
```python
|
||||||
|
'metadata': {
|
||||||
|
'left4me': {
|
||||||
|
'domain': 'whatever.tld', # required — the only per-node knob
|
||||||
|
# Everything below is optional and has a sensible default in the
|
||||||
|
# bundle. Override per-node only if the default is wrong:
|
||||||
|
# 'git_url': 'git@git.sublimity.de:cronekorkn/left4me',
|
||||||
|
# 'git_branch': 'master',
|
||||||
|
# 'gunicorn_workers': 1,
|
||||||
|
# 'gunicorn_threads': 32,
|
||||||
|
# 'job_worker_threads': 4,
|
||||||
|
# 'port_range_start': 27015,
|
||||||
|
# 'port_range_end': 27115,
|
||||||
|
# secret_key is auto-derived per node
|
||||||
|
# (repo.vault.random_bytes_as_base64_for f'{node.name} left4me secret_key').
|
||||||
|
},
|
||||||
|
},
|
||||||
|
```
|
||||||
|
|
||||||
|
The bundle's `derived_from_domain` reactor reads `left4me/domain` and
|
||||||
|
emits the corresponding `nginx/vhosts`, `letsencrypt/domains`,
|
||||||
|
`monitoring/services/left4me-web` (HTTPS health check), and the game-
|
||||||
|
port `nftables/input` accept rules. Backup paths
|
||||||
|
(`/var/lib/left4me`, `/etc/left4me`) are set-merged into `backup/paths`
|
||||||
|
from defaults. None of these need to be declared per-node.
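
As a rough illustration of the "one knob, several namespaces" idea, the sketch below maps a domain to the metadata keys named above. It is a plain function under stated assumptions, not the bundle's actual reactor: the function name and the empty placeholder values are hypothetical, and the real vhost, health-check, and nftables rule bodies are left out because their shapes are not shown here.

```python
def derive_from_domain(domain: str) -> dict:
    """Map the single per-node knob to the namespaces the README lists.

    Hypothetical sketch: the empty dicts stand in for the real vhost,
    certificate and health-check definitions.
    """
    return {
        'nginx': {'vhosts': {domain: {}}},
        'letsencrypt': {'domains': {domain: {}}},
        'monitoring': {'services': {'left4me-web': {}}},
    }


derived = derive_from_domain('whatever.tld')
print(sorted(derived))
```

In the bundle itself this logic would live in a reactor that reads `left4me/domain` from metadata, which is what keeps the node file down to a single key.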
|
||||||
|
|
||||||
|
## What this bundle does
|
||||||
|
|
||||||
|
- Creates system users `left4me` (uid/gid 980, home `/var/lib/left4me`,
|
||||||
|
mode 0711) and `l4d2-sandbox` (uid/gid 981, no home, used by bwrap
|
||||||
|
script-overlay builds).
|
||||||
|
- Drops privileged helpers under `/usr/local/libexec/left4me/`
|
||||||
|
(`left4me-systemctl`, `left4me-journalctl`, `left4me-overlay`,
|
||||||
|
`left4me-script-sandbox`) plus a tight sudoers file (validated with
|
||||||
|
`visudo -cf` before install).
|
||||||
|
- `git_deploy`s the left4me repo to `/opt/left4me/src`, builds a venv at
|
||||||
|
`/opt/left4me/.venv`, `pip install -e`s both `l4d2host` and `l4d2web`,
|
||||||
|
runs `alembic upgrade head` and `flask seed-script-overlays`, then
|
||||||
|
enables `left4me-web.service`.
|
||||||
|
- Emits four systemd units via `systemd/units` metadata (consumed by
|
||||||
|
`bundles/systemd/`):
|
||||||
|
- `left4me-web.service` — gunicorn on `127.0.0.1:8000` (TLS terminates upstream).
|
||||||
|
- `left4me-server@.service` — per-instance srcds template, started on
|
||||||
|
demand by the web app via the `left4me-systemctl` helper.
|
||||||
|
- `l4d2-game.slice` / `l4d2-build.slice` — cgroup slices for the
|
||||||
|
perf-baseline (CPU/IO weights, memory caps).
|
||||||
|
- Contributes uid-based DSCP/priority marks for srcds UDP egress to
|
||||||
|
`nftables/output` (via `defaults`).
|
||||||
|
|
||||||
|
## Gotchas
|
||||||
|
|
||||||
|
- **Requires `bundles/nftables` and `bundles/systemd` on the node.** The
|
||||||
|
bundle asserts membership at `bw test` time. On Debian-13 these ride
|
||||||
|
in via the `debian-13` group, so attaching the bundle to a Debian-13
|
||||||
|
node is enough.
|
||||||
|
- **`left4me-web.service` does not have `NoNewPrivileges=true`.** This is
|
||||||
|
intentional — workers `sudo` the privileged helpers; `NoNewPrivileges`
|
||||||
|
would block setuid escalation. Per-instance `server@.service` units
|
||||||
|
*do* have it.
|
||||||
|
- **CAKE shaping is configured separately**, via
|
||||||
|
`network/<iface>/cake` on the node (consumed by `bundles/network/`),
|
||||||
|
not by this bundle.
|
||||||
|
- **First-run admin user is manual.** After `bw apply`, ssh to the host and
|
||||||
|
bootstrap the admin via the `left4me` wrapper (it sources the env files,
|
||||||
|
drops to the `left4me` user, and runs the flask CLI):
|
||||||
|
`sudo left4me create-user <username> --admin` (prompts for password via
|
||||||
|
the flask CLI, or set `LEFT4ME_ADMIN_PASSWORD` first). The bundle
|
||||||
|
deliberately doesn't seed an admin to keep credentials out of the
|
||||||
|
metadata pipeline. The same `left4me` wrapper accepts any other flask
|
||||||
|
subcommand: `sudo left4me seed-script-overlays <dir>`,
|
||||||
|
`sudo left4me routes`, `sudo left4me shell`, etc.
|
||||||
|
- **CPU isolation is managed by this bundle**, driven by one required
|
||||||
|
per-node knob: `left4me/system_cpus` — a set of int CPU ids that
|
||||||
|
pins `system.slice` / `user.slice` / `l4d2-build.slice`. The
|
||||||
|
complement (`set(range(vm/threads)) - system_cpus`) pins
|
||||||
|
`l4d2-game.slice`. On HT hosts, list both SMT siblings of every
|
||||||
|
physical core you want to reserve for system, otherwise games end
|
||||||
|
up sharing L1/L2 with system. Find pairings via
|
||||||
|
`/sys/devices/system/cpu/cpu<n>/topology/thread_siblings_list`. On
|
||||||
|
the prod node (`ovh.left4me`, 4 physical / 8 threads, pairings
|
||||||
|
(0,4) (1,5) (2,6) (3,7)) the node sets `'system_cpus': {0, 4}` to
|
||||||
|
reserve physical core 0 entirely. `l4d2-game.slice` and
|
||||||
|
`l4d2-build.slice` carry `AllowedCPUs=` inline on their unit
|
||||||
|
definitions; `system.slice` and `user.slice` get drop-ins registered
|
||||||
|
under `systemd/units` with the `'<parent>.d/<basename>.conf'` key
|
||||||
|
convention (same shape nginx and autologin use), landing at
|
||||||
|
`/usr/local/lib/systemd/system/<slice>.d/99-left4me-cpuset.conf`.
|
||||||
|
The reactor raises if `system_cpus` includes CPUs outside
|
||||||
|
`[0, vm/threads)` or leaves no cores for games.
|
||||||
|
- **Kernel feature requirement:** kernel-overlayfs (`CONFIG_OVERLAY_FS`).
|
||||||
|
Standard on debian-13.
|
||||||
|
- **Game ports** are opened by the web app on demand in the range 27015-27115
|
||||||
|
(UDP+TCP). Add corresponding accept rules to `nftables/input` per
|
||||||
|
node if the host's policy is default-drop on input.
|
||||||
|
- **Pinned UIDs/GIDs (980/981).** Chosen for deterministic ownership
|
||||||
|
across rebuilds and backup restores. If you add another bundle that
|
||||||
|
pins UIDs in this repo, make sure it doesn't collide.
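
The CPU split described in the isolation bullet can be sketched in a few lines. This is an illustration under assumptions rather than the bundle's reactor: the function names are hypothetical, and `threads` stands in for the `vm/threads` metadata value. The two error cases mirror the checks the README says the reactor performs.

```python
def split_cpus(threads: int, system_cpus: set[int]) -> tuple[set[int], set[int]]:
    """Return (system_cpus, game_cpus) for a host with `threads` logical CPUs."""
    all_cpus = set(range(threads))
    stray = system_cpus - all_cpus
    if stray:
        raise ValueError(f'system_cpus outside [0, {threads}): {sorted(stray)}')
    # Games get the complement of the system set.
    game_cpus = all_cpus - system_cpus
    if not game_cpus:
        raise ValueError('system_cpus leaves no CPUs for games')
    return system_cpus, game_cpus


def allowed_cpus(cpus: set[int]) -> str:
    """Format a CPU set as a comma-separated list, e.g. for AllowedCPUs=."""
    return ','.join(str(c) for c in sorted(cpus))


# Values from the prod example: 8 threads, physical core 0 (CPUs 0 and 4) reserved.
system, game = split_cpus(8, {0, 4})
print(allowed_cpus(system))  # 0,4
print(allowed_cpus(game))    # 1,2,3,5,6,7
```

Listing both SMT siblings in `system_cpus` is what makes the complement fall on whole physical cores.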
|
||||||
|
|
||||||
|
## Slice support requires `bundles/systemd` ≥ commit cc1c6a5
|
||||||
|
|
||||||
|
This bundle's `l4d2-game.slice` and `l4d2-build.slice` units rely on
|
||||||
|
`bundles/systemd/items.py` accepting the `.slice` extension. Older
|
||||||
|
revisions raised `Exception(f'unknown type slice')` at apply time.
|
||||||
|
The repo-wide `bw test` will catch this if it regresses.
|
||||||
6
bundles/left4me/files/etc/left4me/host.env.mako
Normal file
|
|
@@ -0,0 +1,6 @@
|
||||||
|
# Managed by ckn-bw bundles/left4me. Local edits will be reverted.
|
||||||
|
# Deployment units use fixed /var/lib/left4me paths; regenerate units if this changes.
|
||||||
|
LEFT4ME_ROOT=/var/lib/left4me
|
||||||
|
# l4d2host invokes steamcmd by absolute path — bypasses PATH lookup so the
|
||||||
|
# script's `cd "$(dirname "$0")"` resolves next to the real install dir.
|
||||||
|
LEFT4ME_STEAMCMD=/opt/left4me/steam/steamcmd.sh
|
||||||
6
bundles/left4me/files/etc/left4me/sandbox-resolv.conf
Normal file
|
|
@@ -0,0 +1,6 @@
|
||||||
|
# Sandbox-only resolver config — bind-mounted into script-overlay sandboxes
|
||||||
|
# at /etc/resolv.conf. The host's resolver (often a private/LAN DNS server)
|
||||||
|
# is unreachable from inside the sandbox because IPAddressDeny= blocks
|
||||||
|
# egress to RFC1918 / loopback. Public resolvers keep DNS working.
|
||||||
|
nameserver 1.1.1.1
|
||||||
|
nameserver 8.8.8.8
|
||||||
7
bundles/left4me/files/etc/left4me/web.env.mako
Normal file
|
|
@@ -0,0 +1,7 @@
|
||||||
|
# Managed by ckn-bw bundles/left4me. Local edits will be reverted.
|
||||||
|
DATABASE_URL=sqlite:////var/lib/left4me/left4me.db
|
||||||
|
SECRET_KEY=${node.metadata.get('left4me/secret_key')}
|
||||||
|
JOB_WORKER_THREADS=${node.metadata.get('left4me/job_worker_threads')}
|
||||||
|
SESSION_COOKIE_SECURE=true
|
||||||
|
LEFT4ME_PORT_RANGE_START=${node.metadata.get('left4me/port_range_start')}
|
||||||
|
LEFT4ME_PORT_RANGE_END=${node.metadata.get('left4me/port_range_end')}
|
||||||
5
bundles/left4me/files/etc/sudoers.d/left4me
Normal file
|
|
@@ -0,0 +1,5 @@
|
||||||
|
Defaults:left4me !requiretty
|
||||||
|
left4me ALL=(root) NOPASSWD: /usr/local/libexec/left4me/left4me-systemctl *
|
||||||
|
left4me ALL=(root) NOPASSWD: /usr/local/libexec/left4me/left4me-journalctl *
|
||||||
|
left4me ALL=(root) NOPASSWD: /usr/local/libexec/left4me/left4me-overlay mount *, /usr/local/libexec/left4me/left4me-overlay umount *
|
||||||
|
left4me ALL=(root) NOPASSWD: /usr/local/libexec/left4me/left4me-script-sandbox
|
||||||
36
bundles/left4me/files/etc/sysctl.d/99-left4me.conf
Normal file
|
|
@@ -0,0 +1,36 @@
|
||||||
|
# Host-side perf baseline for left4me — see
|
||||||
|
# docs/superpowers/specs/2026-05-09-l4d2-server-host-perf-baseline-design.md
|
||||||
|
#
|
||||||
|
# UDP socket buffers: distro defaults of ~128 KiB are too small for sustained
|
||||||
|
# Source-engine UDP across multiple instances. 8 MiB matches the standard
|
||||||
|
# 1 Gbit recommendation; rmem_default/wmem_default protect sockets that don't
|
||||||
|
# explicitly enlarge their buffers.
|
||||||
|
net.core.rmem_max = 8388608
|
||||||
|
net.core.wmem_max = 8388608
|
||||||
|
net.core.rmem_default = 524288
|
||||||
|
net.core.wmem_default = 524288
|
||||||
|
|
||||||
|
# Kernel softirq UDP path: the per-CPU backlog queue starts dropping packets
|
||||||
|
# at the default 1000 under multi-instance burst; 5000 absorbs realistic peaks.
|
||||||
|
# netdev_budget = 600 gives softirq more drain headroom per pass.
|
||||||
|
net.core.netdev_max_backlog = 5000
|
||||||
|
net.core.netdev_budget = 600
|
||||||
|
|
||||||
|
# Latency-sensitive default: avoid swap unless the box is really under
|
||||||
|
# pressure. Harmless on swapless hosts.
|
||||||
|
vm.swappiness = 10
|
||||||
|
|
||||||
|
# Per-socket UDP buffer floors: protect game-server sockets that don't bump
|
||||||
|
# their own SO_RCVBUF/SO_SNDBUF when softirq drains lag briefly.
|
||||||
|
net.ipv4.udp_rmem_min = 16384
|
||||||
|
net.ipv4.udp_wmem_min = 16384
|
||||||
|
|
||||||
|
# Default qdisc for ifaces we don't explicitly shape with CAKE. Debian Trixie
|
||||||
|
# already defaults to fq_codel; setting it explicitly is belt-and-suspenders
|
||||||
|
# and survives kernel-default churn.
|
||||||
|
net.core.default_qdisc = fq_codel
|
||||||
|
|
||||||
|
# TCP congestion control: BBR for any bulk TCP egress on the host (admin SSH,
|
||||||
|
# backups, package fetches, web-app responses) so a long flow does not push
|
||||||
|
# the bottleneck queue ahead of game UDP. UDP srcds is unaffected.
|
||||||
|
net.ipv4.tcp_congestion_control = bbr
|
||||||
53
bundles/left4me/files/usr/local/libexec/left4me/left4me-journalctl
Executable file
|
|
@@ -0,0 +1,53 @@
|
||||||
|
#!/bin/sh
|
||||||
|
set -eu
|
||||||
|
|
||||||
|
usage() {
|
||||||
|
printf '%s\n' "usage: left4me-journalctl <server-name> --lines <n> --follow|--no-follow" >&2
|
||||||
|
exit 2
|
||||||
|
}
|
||||||
|
|
||||||
|
validate_name() {
|
||||||
|
name=$1
|
||||||
|
[ -n "$name" ] || usage
|
||||||
|
case "$name" in
|
||||||
|
.*|*..*|*/*|*\\*) usage ;;
|
||||||
|
esac
|
||||||
|
case "$name" in
|
||||||
|
*[!A-Za-z0-9_.-]*) usage ;;
|
||||||
|
esac
|
||||||
|
}
|
||||||
|
|
||||||
|
[ "$#" -eq 4 ] || usage
|
||||||
|
name=$1
|
||||||
|
lines_flag=$2
|
||||||
|
lines=$3
|
||||||
|
follow_flag=$4
|
||||||
|
|
||||||
|
validate_name "$name"
|
||||||
|
[ "$lines_flag" = "--lines" ] || usage
|
||||||
|
case "$lines" in
|
||||||
|
''|*[!0-9]*) usage ;;
|
||||||
|
esac
|
||||||
|
|
||||||
|
follow_arg=
|
||||||
|
case "$follow_flag" in
|
||||||
|
--follow) follow_arg=-f ;;
|
||||||
|
--no-follow) ;;
|
||||||
|
*) usage ;;
|
||||||
|
esac
|
||||||
|
|
||||||
|
unit="left4me-server@${name}.service"
|
||||||
|
if [ -x /bin/journalctl ]; then
|
||||||
|
journalctl=/bin/journalctl
|
||||||
|
elif [ -x /usr/bin/journalctl ]; then
|
||||||
|
journalctl=/usr/bin/journalctl
|
||||||
|
else
|
||||||
|
printf '%s\n' 'journalctl not found at /bin/journalctl or /usr/bin/journalctl' >&2
|
||||||
|
exit 69
|
||||||
|
fi
|
||||||
|
|
||||||
|
if [ -n "$follow_arg" ]; then
|
||||||
|
exec "$journalctl" -u "$unit" -n "$lines" -o cat "$follow_arg"
|
||||||
|
fi
|
||||||
|
|
||||||
|
exec "$journalctl" -u "$unit" -n "$lines" -o cat
|
||||||
242
bundles/left4me/files/usr/local/libexec/left4me/left4me-overlay
Normal file
|
|
@@ -0,0 +1,242 @@
|
||||||
|
#!/usr/bin/python3
|
||||||
|
"""Privileged overlay mount helper for left4me.
|
||||||
|
|
||||||
|
Invoked from the systemd unit's ExecStartPre / ExecStopPost via
|
||||||
|
`+/usr/bin/nsenter --mount=/proc/1/ns/mnt -- …`. The unit-level
|
||||||
|
nsenter is what makes this work: it runs the helper Python interpreter
|
||||||
|
inside PID 1's mount namespace. Without it, the `+` Exec prefix
|
||||||
|
removes the sandbox/credentials but does NOT detach from the unit's
|
||||||
|
per-service mount namespace, and the helper process itself would pin
|
||||||
|
that namespace alive — turning every umount into a multi-second EBUSY
|
||||||
|
race with the kernel's deferred namespace cleanup. With the unit-level
|
||||||
|
nsenter the helper has no such reference and umount succeeds first try.
|
||||||
|
|
||||||
|
Validates inputs strictly, then performs `mount -t overlay` /
|
||||||
|
`umount` directly — no internal nsenter, since the helper is already
|
||||||
|
running where the syscalls need to take effect.
|
||||||
|
|
||||||
|
Verbs:
|
||||||
|
mount <name> Reads ${LEFT4ME_ROOT}/instances/<name>/instance.env
|
||||||
|
for L4D2_LOWERDIRS, validates every lowerdir is
|
||||||
|
under one of installation/overlays/workshop_cache/
|
||||||
|
global_overlay_cache, then mounts the kernel
|
||||||
|
overlay at runtime/<name>/merged.
|
||||||
|
umount <name> Unmounts runtime/<name>/merged and cleans up the
|
||||||
|
kernel-overlayfs `work/work` orphan.
|
||||||
|
|
||||||
|
Set LEFT4ME_OVERLAY_PRINT_ONLY=1 to print the would-be argv (one line,
|
||||||
|
shell-quoted) and exit 0 instead of execv. Used by tests.
|
||||||
|
"""
|
||||||
|
|
||||||
|
import os
|
||||||
|
import re
|
||||||
|
import shlex
|
||||||
|
import shutil
|
||||||
|
import subprocess
|
||||||
|
import sys
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
NAME_RE = re.compile(r"^[a-z0-9][a-z0-9_-]{0,63}$")
|
||||||
|
DEFAULT_ROOT = "/var/lib/left4me"
|
||||||
|
LOWERDIR_ALLOWLIST = (
|
||||||
|
"installation",
|
||||||
|
"overlays",
|
||||||
|
"global_overlay_cache",
|
||||||
|
"workshop_cache",
|
||||||
|
)
|
||||||
|
MAX_LOWERDIRS = 500
|
||||||
|
MOUNT_BIN = "/bin/mount"
|
||||||
|
UMOUNT_BIN = "/bin/umount"
|
||||||
|
|
||||||
|
|
||||||
|
def die(msg: str) -> None:
|
||||||
|
sys.stderr.write(f"left4me-overlay: {msg}\n")
|
||||||
|
sys.exit(1)
|
||||||
|
|
||||||
|
|
||||||
|
def root() -> Path:
|
||||||
|
return Path(os.environ.get("LEFT4ME_ROOT") or DEFAULT_ROOT)
|
||||||
|
|
||||||
|
|
||||||
|
def validate_name(name: str) -> str:
|
||||||
|
if not NAME_RE.fullmatch(name):
|
||||||
|
die(f"invalid instance name: {name!r}")
|
||||||
|
return name
|
||||||
|
|
||||||
|
|
||||||
|
def parse_lowerdirs(env_path: Path) -> list[str]:
|
||||||
|
if not env_path.is_file():
|
||||||
|
die(f"instance.env not found: {env_path}")
|
||||||
|
raw = None
|
||||||
|
for line in env_path.read_text().splitlines():
|
||||||
|
if "=" not in line:
|
||||||
|
continue
|
||||||
|
key, value = line.split("=", 1)
|
||||||
|
if key.strip() == "L4D2_LOWERDIRS":
|
||||||
|
raw = value
|
||||||
|
break
|
||||||
|
if raw is None:
|
||||||
|
die(f"L4D2_LOWERDIRS not set in {env_path}")
|
||||||
|
if raw == "":
|
||||||
|
die(f"L4D2_LOWERDIRS is empty in {env_path}")
|
||||||
|
parts = raw.split(":")
|
||||||
|
if any(p == "" for p in parts):
|
||||||
|
die(f"L4D2_LOWERDIRS contains an empty entry: {raw!r}")
|
||||||
|
if len(parts) > MAX_LOWERDIRS:
|
||||||
|
die(f"L4D2_LOWERDIRS has {len(parts)} entries (cap {MAX_LOWERDIRS})")
|
||||||
|
return parts
|
||||||
|
|
||||||
|
|
||||||
|
def canonical_under(allowed_roots: list[Path], path: Path) -> Path:
|
||||||
|
try:
|
||||||
|
canonical = path.resolve(strict=True)
|
||||||
|
except (FileNotFoundError, RuntimeError):
|
||||||
|
die(f"path does not exist or has a symlink loop: {path}")
|
||||||
|
for r in allowed_roots:
|
||||||
|
if canonical == r or r in canonical.parents:
|
||||||
|
return canonical
|
||||||
|
die(f"path is outside the permitted roots: {path} (resolved: {canonical})")
|
||||||
|
|
||||||
|
|
||||||
|
_LISTXATTR = getattr(os, "listxattr", None)
|
||||||
|
|
||||||
|
|
||||||
|
def _entry_has_fuse_xattr(path: str) -> str | None:
|
||||||
|
if _LISTXATTR is None:
|
||||||
|
return None
|
||||||
|
try:
|
||||||
|
attrs = _LISTXATTR(path, follow_symlinks=False)
|
||||||
|
except OSError:
|
||||||
|
return None
|
||||||
|
for a in attrs:
|
||||||
|
if a.startswith("user.fuseoverlayfs."):
|
||||||
|
return a
|
||||||
|
return None
|
||||||
|
|
||||||
|
|
||||||
|
def assert_no_fuse_xattrs(upper: Path) -> None:
|
||||||
|
if not upper.exists() or _LISTXATTR is None:
|
||||||
|
return
|
||||||
|
for dirpath, dirnames, filenames in os.walk(upper):
|
||||||
|
for entry in (dirpath, *(os.path.join(dirpath, n) for n in dirnames),
|
||||||
|
*(os.path.join(dirpath, n) for n in filenames)):
|
||||||
|
tainted = _entry_has_fuse_xattr(entry)
|
||||||
|
if tainted:
|
||||||
|
die(
|
||||||
|
f"upperdir contains fuse-overlayfs xattr {tainted!r} on {entry}; "
|
||||||
|
"wipe upper/ and work/ before mounting"
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def exec_or_print(argv: list[str]) -> None:
|
||||||
|
if os.environ.get("LEFT4ME_OVERLAY_PRINT_ONLY") == "1":
|
||||||
|
print(" ".join(shlex.quote(a) for a in argv))
|
||||||
|
sys.exit(0)
|
||||||
|
os.execv(argv[0], argv)
|
||||||
|
|
||||||
|
|
||||||
|
def cmd_mount(name: str) -> None:
|
||||||
|
name = validate_name(name)
|
||||||
|
r = root()
|
||||||
|
runtime_name_dir = (r / "runtime" / name).resolve(strict=True)
|
||||||
|
merged_for_check = (runtime_name_dir / "merged").resolve(strict=True)
|
||||||
|
|
||||||
|
# Idempotency for unit restart cycles: if a previous start mounted
|
||||||
|
# successfully but ExecStart failed afterwards (and Restart=on-failure
|
||||||
|
# fires another cycle), the second ExecStartPre would otherwise refuse
|
||||||
|
# to mount-on-top. Short-circuit here so the second cycle just gets
|
||||||
|
# straight to ExecStart. PRINT_ONLY (test mode) bypasses this so the
|
||||||
|
# tests can exercise the full nsenter argv regardless of mount state.
|
||||||
|
if (
|
||||||
|
os.environ.get("LEFT4ME_OVERLAY_PRINT_ONLY") != "1"
|
||||||
|
and os.path.ismount(merged_for_check)
|
||||||
|
):
|
||||||
|
return
|
||||||
|
|
||||||
|
instance_env = r / "instances" / name / "instance.env"
|
||||||
|
raw_lowerdirs = parse_lowerdirs(instance_env)
|
||||||
|
|
||||||
|
allowed_roots = [(r / sub).resolve() for sub in LOWERDIR_ALLOWLIST]
|
||||||
|
canonical_lowerdirs = [str(canonical_under(allowed_roots, Path(p))) for p in raw_lowerdirs]
|
||||||
|
|
||||||
|
upper = (runtime_name_dir / "upper").resolve(strict=True)
|
||||||
|
work = (runtime_name_dir / "work").resolve(strict=True)
|
||||||
|
merged = merged_for_check
|
||||||
|
for label, path in (("upper", upper), ("work", work), ("merged", merged)):
|
||||||
|
if path.parent != runtime_name_dir:
|
||||||
|
die(f"{label} resolved outside runtime/{name}: {path}")
|
||||||
|
|
||||||
|
assert_no_fuse_xattrs(upper)
|
||||||
|
|
||||||
|
options = f"lowerdir={':'.join(canonical_lowerdirs)},upperdir={upper},workdir={work}"
|
||||||
|
argv = [
|
||||||
|
MOUNT_BIN,
|
||||||
|
"-t", "overlay",
|
||||||
|
"overlay",
|
||||||
|
"-o", options,
|
||||||
|
str(merged),
|
||||||
|
]
|
||||||
|
exec_or_print(argv)
|
||||||
|
|
||||||
|
|
||||||
|
def cmd_umount(name: str) -> None:
|
||||||
|
name = validate_name(name)
|
||||||
|
r = root()
|
||||||
|
runtime_name_dir = (r / "runtime" / name).resolve(strict=True)
|
||||||
|
merged_path = runtime_name_dir / "merged"
|
||||||
|
work_inner = runtime_name_dir / "work" / "work"
|
||||||
|
|
||||||
|
argv = [
|
||||||
|
UMOUNT_BIN,
|
||||||
|
# Resolve only if it exists; PRINT_ONLY tests always pre-create it.
|
||||||
|
str(merged_path.resolve(strict=True) if merged_path.exists() else merged_path),
|
||||||
|
]
|
||||||
|
|
||||||
|
# PRINT_ONLY: emit the umount argv and exit. Tests assert exact shape
|
||||||
|
# of this dry-run; the post-umount cleanup of work_inner is a runtime
|
||||||
|
# behaviour exercised on the host, not in unit tests.
|
||||||
|
if os.environ.get("LEFT4ME_OVERLAY_PRINT_ONLY") == "1":
|
||||||
|
print(" ".join(shlex.quote(a) for a in argv))
|
||||||
|
sys.exit(0)
|
||||||
|
|
||||||
|
if merged_path.exists():
|
||||||
|
merged = merged_path.resolve(strict=True)
|
||||||
|
if merged.parent != runtime_name_dir:
|
||||||
|
die(f"merged resolved outside runtime/{name}: {merged}")
|
||||||
|
# Idempotency: only umount if currently a mount point. Mirrors
|
||||||
|
# cmd_mount's symmetric check; a redundant cleanup pass — or a
|
||||||
|
# call after a partial _purge_instance — must be a no-op.
|
||||||
|
#
|
||||||
|
# No retry loop here: with the helper running in PID 1's mount
|
||||||
|
# namespace (via the unit-level `nsenter --mount=/proc/1/ns/mnt`
|
||||||
|
# in ExecStopPost), it holds no reference to the unit's
|
||||||
|
# per-service mount namespace, so the cgroup-empty → namespace
|
||||||
|
# reaped → umount-clears sequence happens without any race
|
||||||
|
# window for us to ride out. EBUSY here is a real error.
|
||||||
|
if os.path.ismount(merged):
|
||||||
|
subprocess.run(argv, check=True)
|
||||||
|
|
||||||
|
# Kernel-overlayfs creates work_inner during mount, owned by root:root with
|
||||||
|
# mode 0000. After unmount it's an orphan that the unit's User= (left4me)
|
||||||
|
# cannot traverse via shutil.rmtree, so reset/delete in instances.py
|
||||||
|
# blows up with EACCES on `runtime/<name>/work/work`. The helper is
|
||||||
|
# the only code path with root that knows about this directory, so
|
||||||
|
# the cleanup belongs here. Safe to nuke — the kernel re-creates it
|
||||||
|
# on the next mount. Run unconditionally — covers both "we just
|
||||||
|
# unmounted" and "previous teardown didn't finish" cases.
|
||||||
|
if work_inner.exists():
|
||||||
|
shutil.rmtree(work_inner)
|
||||||
|
|
||||||
|
|
||||||
|
def main(argv: list[str]) -> None:
|
||||||
|
if len(argv) != 3 or argv[1] not in ("mount", "umount"):
|
||||||
|
sys.stderr.write("usage: left4me-overlay mount|umount <name>\n")
|
||||||
|
sys.exit(2)
|
||||||
|
if argv[1] == "mount":
|
||||||
|
cmd_mount(argv[2])
|
||||||
|
else:
|
||||||
|
cmd_umount(argv[2])
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main(sys.argv)
|
||||||
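The allowlist check in `canonical_under` above relies on resolving the path first and only then comparing it against the permitted roots. A minimal sketch of that idea, assuming a hypothetical helper name `is_under` and throwaway temporary directories (neither is part of the helper itself):

```python
import tempfile
from pathlib import Path


def is_under(allowed_roots: list[Path], candidate: Path) -> bool:
    """Return True if candidate resolves to a root or to a descendant of one."""
    # resolve() collapses ".." segments and follows symlinks, so the
    # comparison below sees the real location, not the spelling of the path.
    canonical = candidate.resolve()
    return any(canonical == r or r in canonical.parents for r in allowed_roots)


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()
    root = base / "overlays"
    (root / "42").mkdir(parents=True)
    (base / "elsewhere").mkdir()
    # A symlink inside the root that points outside of it.
    (root / "escape").symlink_to(base / "elsewhere")

    inside = is_under([root], root / "42")
    via_dotdot = is_under([root], root / "42" / ".." / ".." / "elsewhere")
    via_symlink = is_under([root], root / "escape")
```

Comparing resolved paths rather than string prefixes is what keeps `..` segments and symlinks from escaping the allowlist; a plain `startswith` check would accept both escape attempts shown here.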
82
bundles/left4me/files/usr/local/libexec/left4me/left4me-script-sandbox
Executable file
|
|
@ -0,0 +1,82 @@
|
||||||
|
#!/bin/bash
|
||||||
|
# Privileged sandbox launcher for left4me script overlays.
|
||||||
|
#
|
||||||
|
# Invoked via sudo by the web user with two arguments:
|
||||||
|
# <overlay_id> numeric overlay id; bind-mounts /var/lib/left4me/overlays/<id>
|
||||||
|
# read-write at /overlay inside the sandbox.
|
||||||
|
# <script_path> absolute path to a bash file already written by the web app;
|
||||||
|
# bind-mounted read-only at /script.sh inside the sandbox.
|
||||||
|
#
|
||||||
|
# The script runs as a transient systemd .service with the full hardening
|
||||||
|
# surface: cgroup limits + walltime kill, NoNewPrivileges, ProtectSystem,
|
||||||
|
# ProtectHome, kernel-tunable / -module / -log protection, namespace
|
||||||
|
# restriction, address-family restriction, capability bounding (empty),
|
||||||
|
# seccomp filter (@system-service @network-io), MemoryDenyWriteExecute,
|
||||||
|
# LockPersonality, RestrictSUIDSGID. Network namespace is *not* restricted —
|
||||||
|
# scripts must reach the public internet to download workshop / l4d2center
|
||||||
|
# / cedapug content. PID namespace is shared with the host (no
|
||||||
|
# PrivatePID= directive in systemd); host PIDs are visible via /proc but
|
||||||
|
# not signal-able due to UID mismatch.
|
||||||
|
set -euo pipefail
|
||||||
|
|
||||||
|
[[ $# -eq 2 ]] || { echo "usage: $0 <overlay_id> <script>" >&2; exit 64; }
|
||||||
|
|
||||||
|
OVERLAY_ID=$1
|
||||||
|
SCRIPT=$2
|
||||||
|
|
||||||
|
[[ "$OVERLAY_ID" =~ ^[0-9]+$ ]] || { echo "bad overlay id" >&2; exit 64; }
|
||||||
|
OVERLAY_DIR=/var/lib/left4me/overlays/$OVERLAY_ID
|
||||||
|
[[ -d $OVERLAY_DIR ]] || { echo "no overlay dir at $OVERLAY_DIR" >&2; exit 65; }
|
||||||
|
[[ -f $SCRIPT ]] || { echo "no script at $SCRIPT" >&2; exit 65; }
|
||||||
|
|
||||||
|
if [[ "${LEFT4ME_SCRIPT_SANDBOX_DRY_RUN:-}" == "1" ]]; then
|
||||||
|
echo "DRY RUN: overlay_id=$OVERLAY_ID script=$SCRIPT overlay_dir=$OVERLAY_DIR"
|
||||||
|
exit 0
|
||||||
|
fi
|
||||||
|
|
||||||
|
# Make sure the sandbox UID owns the overlay dir so the script can write there.
|
||||||
|
# Idempotent: a no-op when the dir is already l4d2-sandbox-owned (re-run case),
|
||||||
|
# and corrects the ownership the first time the dir was created by the web app
|
||||||
|
# under the left4me UID. World-readable so the gameserver process (left4me)
|
||||||
|
# can read the overlay contents via the kernel-overlayfs lowerdir at runtime.
|
||||||
|
chown -R l4d2-sandbox:l4d2-sandbox "$OVERLAY_DIR"
|
||||||
|
chmod 0755 "$OVERLAY_DIR"
|
||||||
|
|
||||||
|
SCRIPT_RC=0
|
||||||
|
systemd-run --quiet --collect --wait --pipe \
|
||||||
|
--unit="left4me-script-${OVERLAY_ID}-$$" \
|
||||||
|
--slice=l4d2-build.slice \
|
||||||
|
-p OOMScoreAdjust=500 \
|
||||||
|
-p User=l4d2-sandbox -p Group=l4d2-sandbox \
|
||||||
|
-p UMask=0022 \
|
||||||
|
-p NoNewPrivileges=yes \
|
||||||
|
-p ProtectSystem=strict -p ProtectHome=yes \
|
||||||
|
-p PrivateTmp=yes -p PrivateDevices=yes -p PrivateIPC=yes \
|
||||||
|
-p ProtectKernelTunables=yes -p ProtectKernelModules=yes \
|
||||||
|
-p ProtectKernelLogs=yes -p ProtectControlGroups=yes \
|
||||||
|
-p RestrictNamespaces=yes \
|
||||||
|
-p RestrictAddressFamilies="AF_INET AF_INET6 AF_UNIX" \
|
||||||
|
-p RestrictSUIDSGID=yes -p LockPersonality=yes \
|
||||||
|
-p MemoryDenyWriteExecute=yes \
|
||||||
|
-p SystemCallFilter="@system-service @network-io" \
|
||||||
|
-p SystemCallArchitectures=native \
|
||||||
|
-p CapabilityBoundingSet= -p AmbientCapabilities= \
|
||||||
|
-p IPAddressDeny="127.0.0.0/8 ::1/128 169.254.0.0/16 fe80::/10 224.0.0.0/4 ff00::/8 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 100.64.0.0/10 fc00::/7" \
|
||||||
|
-p TemporaryFileSystem="/etc /var/lib" \
|
||||||
|
-p BindReadOnlyPaths="/etc/left4me/sandbox-resolv.conf:/etc/resolv.conf /etc/ssl /etc/ca-certificates /etc/nsswitch.conf /etc/alternatives ${SCRIPT}:/script.sh" \
|
||||||
|
-p BindPaths="${OVERLAY_DIR}:/overlay" \
|
||||||
|
-p WorkingDirectory=/overlay \
|
||||||
|
-p Environment="HOME=/tmp PATH=/usr/bin:/usr/sbin OVERLAY=/overlay" \
|
||||||
|
-p MemoryMax=4G -p MemorySwapMax=0 -p TasksMax=512 \
|
||||||
|
-p CPUQuota=200% -p RuntimeMaxSec=3600 \
|
||||||
|
-- /bin/bash /script.sh || SCRIPT_RC=$?
|
||||||
|
|
||||||
|
# Normalize perms so the web service (left4me uid) can read overlay files
|
||||||
|
# directly via Python open() — needed by the file tree's download endpoint.
|
||||||
|
# UMask=0022 above takes care of *new* writes; this catches anything the
|
||||||
|
# script created with a tighter mode (e.g. cedapug_maps writes its
|
||||||
|
# .cedapug/manifest.tsv as 0600 by default).
|
||||||
|
find "$OVERLAY_DIR" -type f ! -perm -o+r -exec chmod o+r {} + 2>/dev/null || true
|
||||||
|
find "$OVERLAY_DIR" -type d ! -perm -o+rx -exec chmod o+rx {} + 2>/dev/null || true
|
||||||
|
|
||||||
|
exit $SCRIPT_RC
|
||||||
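The two `find ... -exec chmod` passes at the end of the launcher add world-read to regular files and world-read/execute to directories that lack them. A rough Python equivalent, as a sketch only: the function name `normalize_perms` and the demo tree are hypothetical, and the real helper stays in bash.

```python
import os
import stat
import tempfile
from pathlib import Path


def normalize_perms(root: Path) -> list[Path]:
    """Add o+r to regular files and o+rx to directories under root."""
    candidates = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        candidates.extend(Path(dirpath) / n for n in (*dirnames, *filenames))

    changed: list[Path] = []
    for path in candidates:
        mode = path.lstat().st_mode  # lstat: symlinks are skipped, like find -type f/d
        if stat.S_ISREG(mode):
            wanted = stat.S_IROTH
        elif stat.S_ISDIR(mode):
            wanted = stat.S_IROTH | stat.S_IXOTH
        else:
            continue
        if mode & wanted != wanted:
            os.chmod(path, stat.S_IMODE(mode) | wanted)
            changed.append(path)
    return changed


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    secret = root / ".cedapug" / "manifest.tsv"
    secret.parent.mkdir()
    secret.write_text("demo\n")
    os.chmod(secret.parent, 0o700)
    os.chmod(secret, 0o600)

    changed = normalize_perms(root)
    file_mode = stat.S_IMODE(secret.stat().st_mode)
    dir_mode = stat.S_IMODE(secret.parent.stat().st_mode)
```

Only missing bits are added; existing owner and group permissions are left as they were, which matches the additive `chmod o+r` / `chmod o+rx` in the shell version.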
44
bundles/left4me/files/usr/local/libexec/left4me/left4me-systemctl
Executable file
|
|
@ -0,0 +1,44 @@
|
||||||
|
#!/bin/sh
|
||||||
|
set -eu
|
||||||
|
|
||||||
|
usage() {
|
||||||
|
printf '%s\n' "usage: left4me-systemctl enable|disable|show <server-name>" >&2
|
||||||
|
exit 2
|
||||||
|
}
|
||||||
|
|
||||||
|
validate_name() {
|
||||||
|
name=$1
|
||||||
|
[ -n "$name" ] || usage
|
||||||
|
case "$name" in
|
||||||
|
.*|*..*|*/*|*\\*) usage ;;
|
||||||
|
esac
|
||||||
|
case "$name" in
|
||||||
|
*[!A-Za-z0-9_.-]*) usage ;;
|
||||||
|
esac
|
||||||
|
}
|
||||||
|
|
||||||
|
[ "$#" -eq 2 ] || usage
|
||||||
|
action=$1
|
||||||
|
name=$2
|
||||||
|
|
||||||
|
case "$action" in
|
||||||
|
enable|disable|show) ;;
|
||||||
|
*) usage ;;
|
||||||
|
esac
|
||||||
|
|
||||||
|
validate_name "$name"
|
||||||
|
unit="left4me-server@${name}.service"
|
||||||
|
if [ -x /bin/systemctl ]; then
|
||||||
|
systemctl=/bin/systemctl
|
||||||
|
elif [ -x /usr/bin/systemctl ]; then
|
||||||
|
systemctl=/usr/bin/systemctl
|
||||||
|
else
|
||||||
|
printf '%s\n' 'systemctl not found at /bin/systemctl or /usr/bin/systemctl' >&2
|
||||||
|
exit 69
|
||||||
|
fi
|
||||||
|
|
||||||
|
case "$action" in
|
||||||
|
enable) exec "$systemctl" enable --now "$unit" ;;
|
||||||
|
disable) exec "$systemctl" disable --now "$unit" ;;
|
||||||
|
show) exec "$systemctl" show --property=ActiveState --property=SubState "$unit" ;;
|
||||||
|
esac
|
||||||
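The two `case` blocks in `validate_name` above first reject path-like names and then anything outside a small character set. The same rules can be sketched in Python for illustration; `is_valid_server_name` is a hypothetical name and is not used by the helper.

```python
import re

# Same character class as the shell pattern *[!A-Za-z0-9_.-]*
_ALLOWED = re.compile(r"[A-Za-z0-9_.-]+")


def is_valid_server_name(name: str) -> bool:
    """Mirror the shell checks: non-empty, no leading dot, no '..', no slashes."""
    if not name or name.startswith("."):
        return False
    if ".." in name or "/" in name or "\\" in name:
        return False
    # fullmatch avoids the trailing-newline pitfall of '$' in re.match.
    return _ALLOWED.fullmatch(name) is not None


results = {
    name: is_valid_server_name(name)
    for name in ("alpha-1", "srv_01.eu", "", ".hidden", "a..b", "a/b", "a b")
}
```

Validating before the name is interpolated into `left4me-server@${name}.service` keeps the sudo-invoked helper from being pointed at arbitrary units or paths.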
17
bundles/left4me/files/usr/local/sbin/left4me
Normal file
|
|
@ -0,0 +1,17 @@
|
||||||
|
#!/bin/sh
|
||||||
|
# Run l4d2web flask CLI commands as the left4me user with the deploy env loaded.
|
||||||
|
# Usage: left4me <flask-subcommand> [args...]
|
||||||
|
# Examples:
|
||||||
|
# left4me create-user alice --admin
|
||||||
|
# left4me seed-script-overlays /opt/left4me/src/examples/script-overlays
|
||||||
|
# left4me routes
|
||||||
|
set -eu
|
||||||
|
exec sudo -u left4me sh -c '
|
||||||
|
set -a
|
||||||
|
. /etc/left4me/host.env
|
||||||
|
. /etc/left4me/web.env
|
||||||
|
set +a
|
||||||
|
export JOB_WORKER_ENABLED=false
|
||||||
|
export PYTHONPATH=/opt/left4me/src
|
||||||
|
exec /opt/left4me/.venv/bin/flask --app l4d2web.app:create_app "$@"
|
||||||
|
' sh "$@"
|
||||||
293
bundles/left4me/items.py
Normal file
|
|
@ -0,0 +1,293 @@
|
||||||
|
# Items for the left4me bundle.
|
||||||
|
# Systemd units come from metadata via bundles/systemd/ — there are no
|
||||||
|
# .service or .slice files in this bundle's files/ tree. Cpuset drop-ins
|
||||||
|
# for system.slice / user.slice are likewise emitted via systemd/units
|
||||||
|
# in metadata.py (key: '<parent>.d/<basename>.conf').
|
||||||
|
|
||||||
|
directories = {
|
||||||
|
'/opt/left4me': {
|
||||||
|
'owner': 'left4me',
|
||||||
|
'group': 'left4me',
|
||||||
|
},
|
||||||
|
'/opt/left4me/src': {
|
||||||
|
'owner': 'left4me',
|
||||||
|
'group': 'left4me',
|
||||||
|
},
|
||||||
|
'/etc/left4me': {
|
||||||
|
'owner': 'root',
|
||||||
|
'group': 'root',
|
||||||
|
'mode': '0755',
|
||||||
|
},
|
||||||
|
'/var/lib/left4me': {
|
||||||
|
# left4me's home dir — useradd creates with 0700; loosen to 0711 so
|
||||||
|
# l4d2-sandbox can traverse (but not list) for bwrap bind-mounts.
|
||||||
|
'owner': 'left4me',
|
||||||
|
'group': 'left4me',
|
||||||
|
'mode': '0711',
|
||||||
|
},
|
||||||
|
'/var/lib/left4me/installation': {'owner': 'left4me', 'group': 'left4me'},
|
||||||
|
'/var/lib/left4me/overlays': {'owner': 'left4me', 'group': 'left4me'},
|
||||||
|
'/var/lib/left4me/instances': {'owner': 'left4me', 'group': 'left4me'},
|
||||||
|
'/var/lib/left4me/runtime': {'owner': 'left4me', 'group': 'left4me'},
|
||||||
|
'/var/lib/left4me/workshop_cache': {'owner': 'left4me', 'group': 'left4me'},
|
||||||
|
'/var/lib/left4me/tmp': {'owner': 'left4me', 'group': 'left4me'},
|
||||||
|
'/opt/left4me/steam': {'owner': 'left4me', 'group': 'left4me'},
|
||||||
|
'/usr/local/libexec/left4me': {
|
||||||
|
'owner': 'root',
|
||||||
|
'group': 'root',
|
||||||
|
'mode': '0755',
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
groups = {
|
||||||
|
'left4me': {'gid': 980},
|
||||||
|
'l4d2-sandbox': {'gid': 981},
|
||||||
|
}
|
||||||
|
|
||||||
|
users = {
|
||||||
|
'left4me': {
|
||||||
|
'uid': 980,
|
||||||
|
'gid': 980,
|
||||||
|
'home': '/var/lib/left4me',
|
||||||
|
'shell': '/usr/sbin/nologin',
|
||||||
|
},
|
||||||
|
'l4d2-sandbox': {
|
||||||
|
'uid': 981,
|
||||||
|
'gid': 981,
|
||||||
|
'shell': '/usr/sbin/nologin',
|
||||||
|
},
|
||||||
|
}
|
||||||
|
# UIDs/GIDs pinned in the system-package range (100-999, per Debian
|
||||||
|
# policy) so file ownership is deterministic across rebuilds and
|
||||||
|
# backup restores. 980/981 are unused elsewhere in this repo.
|
||||||
|
|
||||||
|
# Privileged helpers (mode 0755 root:root). Listed by sudoers as the only
|
||||||
|
# commands left4me can invoke as root NOPASSWD.
|
||||||
|
HELPERS = (
|
||||||
|
'left4me-systemctl',
|
||||||
|
'left4me-journalctl',
|
||||||
|
'left4me-overlay',
|
||||||
|
'left4me-script-sandbox',
|
||||||
|
)
|
||||||
|
|
||||||
|
files = {
|
||||||
|
'/usr/local/sbin/left4me': {
|
||||||
|
'source': 'usr/local/sbin/left4me', # explicit — basename collides with sudoers
|
||||||
|
'mode': '0755',
|
||||||
|
'owner': 'root',
|
||||||
|
'group': 'root',
|
||||||
|
},
|
||||||
|
**{
|
||||||
|
f'/usr/local/libexec/left4me/{h}': {
|
||||||
|
'source': f'usr/local/libexec/left4me/{h}',
|
||||||
|
'mode': '0755',
|
||||||
|
'owner': 'root',
|
||||||
|
'group': 'root',
|
||||||
|
}
|
||||||
|
for h in HELPERS
|
||||||
|
},
|
||||||
|
'/etc/left4me/sandbox-resolv.conf': {
|
||||||
|
'source': 'etc/left4me/sandbox-resolv.conf',
|
||||||
|
'mode': '0644',
|
||||||
|
'owner': 'root',
|
||||||
|
'group': 'root',
|
||||||
|
},
|
||||||
|
'/etc/sudoers.d/left4me': {
|
||||||
|
'source': 'etc/sudoers.d/left4me',
|
||||||
|
'mode': '0440',
|
||||||
|
'owner': 'root',
|
||||||
|
'group': 'root',
|
||||||
|
'test_with': 'visudo -cf {}',
|
||||||
|
},
|
||||||
|
'/etc/sysctl.d/99-left4me.conf': {
|
||||||
|
'source': 'etc/sysctl.d/99-left4me.conf',
|
||||||
|
'mode': '0644',
|
||||||
|
'owner': 'root',
|
||||||
|
'group': 'root',
|
||||||
|
'triggers': [
|
||||||
|
'action:left4me_sysctl_reload',
|
||||||
|
],
|
||||||
|
},
|
||||||
|
'/etc/left4me/host.env': {
|
||||||
|
'source': 'etc/left4me/host.env.mako',
|
||||||
|
'content_type': 'mako',
|
||||||
|
'mode': '0644',
|
||||||
|
'owner': 'root',
|
||||||
|
'group': 'root',
|
||||||
|
},
|
||||||
|
'/etc/left4me/web.env': {
|
||||||
|
'source': 'etc/left4me/web.env.mako',
|
||||||
|
'content_type': 'mako',
|
||||||
|
'mode': '0640',
|
||||||
|
'owner': 'root',
|
||||||
|
'group': 'left4me',
|
||||||
|
'needs': [
|
||||||
|
'group:left4me',
|
||||||
|
],
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
actions = {
|
||||||
|
'left4me_sysctl_reload': {
|
||||||
|
'command': 'sysctl --system >/dev/null',
|
||||||
|
'triggered': True,
|
||||||
|
},
|
||||||
|
'left4me_dpkg_add_i386_arch': {
|
||||||
|
# steamcmd is 32-bit and pulls libc6:i386 + lib32z1 from the i386 arch.
|
||||||
|
# apt-get update is part of this action because newly-added foreign
|
||||||
|
# archs need a fresh package list before any :i386 package resolves.
|
||||||
|
'command': 'dpkg --add-architecture i386 && apt-get update',
|
||||||
|
'unless': 'dpkg --print-foreign-architectures | grep -qx i386',
|
||||||
|
'cascade_skip': False,
|
||||||
|
},
|
||||||
|
'left4me_install_steamcmd': {
|
||||||
|
# Steam's tarball is rolling with no published checksum, so we can't
|
||||||
|
# use download: (which requires a hash). Guard with a presence check
|
||||||
|
# on steamcmd.sh — steamcmd self-updates at runtime, so chasing the
|
||||||
|
# tarball version from bw isn't useful.
|
||||||
|
'command': (
|
||||||
|
'sudo -u left4me sh -c "'
|
||||||
|
'cd /opt/left4me/steam && '
|
||||||
|
'curl -fsSL https://media.steampowered.com/installer/steamcmd_linux.tar.gz | '
|
||||||
|
'tar -xz'
|
||||||
|
'"'
|
||||||
|
),
|
||||||
|
'unless': 'test -x /opt/left4me/steam/steamcmd.sh',
|
||||||
|
'cascade_skip': False,
|
||||||
|
'needs': [
|
||||||
|
'directory:/opt/left4me/steam',
|
||||||
|
'pkg_apt:curl',
|
||||||
|
'pkg_apt:libc6_i386', # bw pkg_apt convention: _ → :
|
||||||
|
'pkg_apt:lib32z1',
|
||||||
|
'user:left4me',
|
||||||
|
],
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
# steamcmd is invoked by absolute path (LEFT4ME_STEAMCMD in host.env),
|
||||||
|
# not via PATH lookup — see l4d2host/cli.py:install. We don't need to put
|
||||||
|
# anything in /usr/local/bin for it.
|
||||||
|
|
||||||
|
git_deploy = {
|
||||||
|
'/opt/left4me/src': {
|
||||||
|
'repo': node.metadata.get('left4me/git_url'),
|
||||||
|
'rev': node.metadata.get('left4me/git_branch'),
|
||||||
|
'triggers': [
|
||||||
|
# On a code-update apply, refresh the DB schema. pip_install
|
||||||
|
# would have triggered alembic in the create_venv path, but on
|
||||||
|
# a normal apply pip_install's `unless` skips (packages still
|
||||||
|
# importable from the previous editable install), and that
|
||||||
|
# would leave alembic_upgrade dormant. Wiring git_deploy →
|
||||||
|
# alembic directly ensures new migrations land whenever new
|
||||||
|
# code lands. alembic upgrade head is idempotent (no-op when
|
||||||
|
# already at head), so this is safe to fire on every code
|
||||||
|
# update; the seed_overlays + service:restart cascade off
|
||||||
|
# alembic also covers picking up the new code in gunicorn.
|
||||||
|
'action:left4me_alembic_upgrade',
|
||||||
|
],
|
||||||
|
# chown_src and pip_install are NOT in triggers; they run on every
|
||||||
|
# apply (chown_src gated by its own `unless` guard), which makes the chain
|
||||||
|
# self-healing after a partial failure. (Items in a triggers list
|
||||||
|
# must be triggered:True, which would lose that property.)
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
actions['left4me_chown_src'] = {
|
||||||
|
# Runs every apply (cheap — chown -R on a small tree). Self-heals
|
||||||
|
# whenever git_deploy extracts a new tarball as root-owned files.
|
||||||
|
# Not in any triggers list so doesn't need triggered:True.
|
||||||
|
'command': 'chown -R left4me:left4me /opt/left4me/src',
|
||||||
|
'unless': 'test -z "$(find /opt/left4me/src \\! -user left4me -print -quit 2>/dev/null)"',
|
||||||
|
'cascade_skip': False,
|
||||||
|
'needs': [
|
||||||
|
'git_deploy:/opt/left4me/src',
|
||||||
|
'user:left4me',
|
||||||
|
'group:left4me',
|
||||||
|
],
|
||||||
|
}
|
||||||
|
|
||||||
|
actions['left4me_create_venv'] = {
|
||||||
|
'command': 'sudo -u left4me /usr/bin/python3 -m venv /opt/left4me/.venv',
|
||||||
|
'unless': 'test -x /opt/left4me/.venv/bin/python',
|
||||||
|
'cascade_skip': False,
|
||||||
|
'needs': [
|
||||||
|
'directory:/opt/left4me',
|
||||||
|
'pkg_apt:python3-venv',
|
||||||
|
'user:left4me',
|
||||||
|
],
|
||||||
|
'triggers': [
|
||||||
|
'action:left4me_pip_upgrade',
|
||||||
|
],
|
||||||
|
}
|
||||||
|
|
||||||
|
actions['left4me_pip_upgrade'] = {
|
||||||
|
'command': 'sudo -u left4me /opt/left4me/.venv/bin/python -m pip install --upgrade pip',
|
||||||
|
'triggered': True,
|
||||||
|
'cascade_skip': False,
|
||||||
|
'needs': [
|
||||||
|
'pkg_apt:python3-pip',
|
||||||
|
],
|
||||||
|
# No triggers: pip_install runs on every apply (it has no `unless` guard)
|
||||||
|
# rather than being chained from here. Keeps pip_upgrade scoped to
|
||||||
|
# exactly its purpose.
|
||||||
|
}
|
||||||
|
|
||||||
|
actions['left4me_pip_install'] = {
|
||||||
|
# Single pip invocation installs both editable packages from the same
|
||||||
|
# checkout. Runs on every apply: pip install -e is fast on no-op, and
|
||||||
|
# any gate weaker than "egg-info matches pyproject.toml" can mask
|
||||||
|
# script regeneration — e.g. adding [project.scripts] later wouldn't
|
||||||
|
# be picked up if `unless` only checks importability.
|
||||||
|
'command': 'sudo -u left4me /opt/left4me/.venv/bin/pip install -e /opt/left4me/src/l4d2host -e /opt/left4me/src/l4d2web',
|
||||||
|
'cascade_skip': False,
|
||||||
|
'needs': [
|
||||||
|
'git_deploy:/opt/left4me/src',
|
||||||
|
'action:left4me_create_venv',
|
||||||
|
'action:left4me_chown_src',
|
||||||
|
],
|
||||||
|
'triggers': [
|
||||||
|
'action:left4me_alembic_upgrade',
|
||||||
|
],
|
||||||
|
}
|
||||||
|
|
||||||
|
actions['left4me_alembic_upgrade'] = {
|
||||||
|
# Mirrors deploy-test-server.sh:239-242. Runs as left4me with both env
|
||||||
|
# files sourced; JOB_WORKER_ENABLED=false so a stray worker doesn't race
|
||||||
|
# with the migration.
|
||||||
|
'command': (
|
||||||
|
'sudo -u left4me sh -c "'
|
||||||
|
'cd /opt/left4me/src/l4d2web && '
|
||||||
|
'set -a && . /etc/left4me/host.env && . /etc/left4me/web.env && set +a && '
|
||||||
|
'env JOB_WORKER_ENABLED=false PYTHONPATH=/opt/left4me/src '
|
||||||
|
'/opt/left4me/.venv/bin/alembic -c /opt/left4me/src/l4d2web/alembic.ini upgrade head'
|
||||||
|
'"'
|
||||||
|
),
|
||||||
|
'triggered': True,
|
||||||
|
'cascade_skip': False,
|
||||||
|
'needs': [
|
||||||
|
'action:left4me_pip_install',
|
||||||
|
'file:/etc/left4me/host.env',
|
||||||
|
'file:/etc/left4me/web.env',
|
||||||
|
],
|
||||||
|
'triggers': [
|
||||||
|
'action:left4me_seed_overlays',
|
||||||
|
'svc_systemd:left4me-web.service:restart',
|
||||||
|
],
|
||||||
|
}
|
||||||
|
|
||||||
|
actions['left4me_seed_overlays'] = {
|
||||||
|
# Idempotent: refreshes script bodies in place; existing overlay rows keep their ids.
|
||||||
|
'command': (
|
||||||
|
'sudo -u left4me sh -c "'
|
||||||
|
'set -a && . /etc/left4me/host.env && . /etc/left4me/web.env && set +a && '
|
||||||
|
'env JOB_WORKER_ENABLED=false PYTHONPATH=/opt/left4me/src '
|
||||||
|
'/opt/left4me/.venv/bin/flask --app l4d2web.app:create_app '
|
||||||
|
'seed-script-overlays /opt/left4me/src/examples/script-overlays'
|
||||||
|
'"'
|
||||||
|
),
|
||||||
|
'triggered': True,
|
||||||
|
'cascade_skip': False,
|
||||||
|
'needs': [
|
||||||
|
'action:left4me_alembic_upgrade',
|
||||||
|
],
|
||||||
|
}
|
||||||
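The `files` dict in items.py above builds one entry per privileged helper by unpacking a dict comprehension over `HELPERS`. A standalone sketch of that pattern, runnable outside bundlewrap (the attribute values are copied from the bundle; nothing here talks to a node):

```python
HELPERS = (
    'left4me-systemctl',
    'left4me-journalctl',
    'left4me-overlay',
    'left4me-script-sandbox',
)

files = {
    '/usr/local/sbin/left4me': {
        'source': 'usr/local/sbin/left4me',
        'mode': '0755',
        'owner': 'root',
        'group': 'root',
    },
    # One entry per helper; ** merges the generated dict into the literal.
    **{
        f'/usr/local/libexec/left4me/{h}': {
            'source': f'usr/local/libexec/left4me/{h}',
            'mode': '0755',
            'owner': 'root',
            'group': 'root',
        }
        for h in HELPERS
    },
}

helper_paths = sorted(p for p in files if p.startswith('/usr/local/libexec/'))
```

Adding a helper then means appending one name to `HELPERS`; the source path and the root-owned 0755 attributes follow automatically.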
275
bundles/left4me/metadata.py
Normal file
|
|
@ -0,0 +1,275 @@
|
||||||
|
assert node.has_bundle('nftables')
|
||||||
|
assert node.has_bundle('systemd')
|
||||||
|
|
||||||
|
|
||||||
|
defaults = {
|
||||||
|
'left4me': {
|
||||||
|
# Application-wide defaults; node only overrides if it really needs to.
|
||||||
|
'git_url': 'https://git.sublimity.de/cronekorkn/left4me.git',
|
||||||
|
'git_branch': 'master',
|
||||||
|
'secret_key': repo.vault.random_bytes_as_base64_for(f'{node.name} left4me secret_key', length=32).value,
|
||||||
|
'gunicorn_workers': 1,
|
||||||
|
'gunicorn_threads': 32,
|
||||||
|
'job_worker_threads': 4,
|
||||||
|
# Whole 27000-block: covers Steam's defaults (27015 game, 27005
|
||||||
|
# client/RCON) plus headroom for ad-hoc ports without further
|
||||||
|
# nftables changes. Mirrored into LEFT4ME_PORT_RANGE_{START,END}
|
||||||
|
# by web.env.mako and into the nftables input rule by the
|
||||||
|
# nftables_input reactor below.
|
||||||
|
'port_range_start': 27000,
|
||||||
|
'port_range_end': 27999,
|
||||||
|
},
|
||||||
|
'apt': {
|
||||||
|
'packages': {
|
||||||
|
'p7zip-full': {},
|
||||||
|
'nftables': {},
|
||||||
|
'iproute2': {},
|
||||||
|
'curl': {},
|
||||||
|
'ca-certificates': {},
|
||||||
|
'python3': {},
|
||||||
|
'python3-venv': {},
|
||||||
|
'python3-pip': {},
|
||||||
|
'python3-dev': {},
|
||||||
|
# steamcmd is a 32-bit ELF; needs i386 multiarch + these libs.
|
||||||
|
# `_` → `:` is bundlewrap's pkg_apt convention for multiarch
|
||||||
|
# names (see pkg_apt.py:48).
|
||||||
|
'libc6_i386': { # installs libc6:i386
|
||||||
|
'needs': ['action:left4me_dpkg_add_i386_arch'],
|
||||||
|
},
|
||||||
|
'lib32z1': {
|
||||||
|
'needs': ['action:left4me_dpkg_add_i386_arch'],
|
||||||
|
},
|
||||||
|
},
|
||||||
|
},
|
||||||
|
'nftables': {
|
||||||
|
# Match deploy/files/usr/local/lib/left4me/nft/left4me-mark.nft.
|
||||||
|
# Mark srcds UDP egress (uid left4me) with DSCP EF + skb priority 6
|
||||||
|
# so CAKE classifies it into the priority tin.
|
||||||
|
'output': {
|
||||||
|
'meta skuid "left4me" meta l4proto udp ip dscp set ef meta priority set 0006:0000',
|
||||||
|
'meta skuid "left4me" meta l4proto udp ip6 dscp set ef meta priority set 0006:0000',
|
||||||
|
},
|
||||||
|
},
|
||||||
|
'systemd': {
|
||||||
|
'services': {
|
||||||
|
'left4me-web.service': {
|
||||||
|
'enabled': True,
|
||||||
|
'running': True,
|
||||||
|
'needs': [
|
||||||
|
'action:left4me_alembic_upgrade',
|
||||||
|
'file:/etc/left4me/host.env',
|
||||||
|
'file:/etc/left4me/web.env',
|
||||||
|
],
|
||||||
|
},
|
||||||
|
# Note: left4me-server@.service is a TEMPLATE — instances are
|
||||||
|
# started on-demand by the web app via the left4me-systemctl
|
||||||
|
# helper. Don't enable/start it from here.
|
||||||
|
# The slices are installed (file present) but don't need
|
||||||
|
# enable/start — they're activated implicitly when a unit
|
||||||
|
# uses Slice=.
|
||||||
|
},
|
||||||
|
},
|
||||||
|
'backup': {
|
||||||
|
# Application-owned paths. Set-merged with backup group / node-level paths.
|
||||||
|
'paths': {
|
||||||
|
'/var/lib/left4me',
|
||||||
|
'/etc/left4me',
|
||||||
|
},
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
@metadata_reactor.provides(
|
||||||
|
'nginx/vhosts',
|
||||||
|
)
|
||||||
|
def nginx_vhosts(metadata):
|
||||||
|
# letsencrypt/domains and monitoring/services for the vhost are auto-
|
||||||
|
# populated by bundles/nginx/metadata.py. We just declare check_path:
|
||||||
|
# '/health' so the auto-check hits the Flask health endpoint, not '/'.
|
||||||
|
domain = metadata.get('left4me/domain')
|
||||||
|
return {
|
||||||
|
'nginx': {
|
||||||
|
'vhosts': {
|
||||||
|
domain: {
|
||||||
|
'content': 'nginx/proxy_pass.conf',
|
||||||
|
'context': {
|
||||||
|
'target': 'http://127.0.0.1:8000',
|
||||||
|
},
|
||||||
|
'check_path': '/health',
|
||||||
|
},
|
||||||
|
},
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
@metadata_reactor.provides(
|
||||||
|
'nftables/input',
|
||||||
|
)
|
||||||
|
def nftables_input(metadata):
|
||||||
|
port_start = metadata.get('left4me/port_range_start')
|
||||||
|
port_end = metadata.get('left4me/port_range_end')
|
||||||
|
return {
|
||||||
|
'nftables': {
|
||||||
|
'input': {
|
||||||
|
f'udp dport {port_start}-{port_end} accept',
|
||||||
|
f'tcp dport {port_start}-{port_end} accept',
|
||||||
|
},
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
@metadata_reactor.provides(
|
||||||
|
'systemd/units',
|
||||||
|
)
|
||||||
|
def systemd_units(metadata):
|
||||||
|
workers = metadata.get('left4me/gunicorn_workers')
|
||||||
|
threads = metadata.get('left4me/gunicorn_threads')
|
||||||
|
|
||||||
|
# cgroup-v2 cpuset. `system_cpus` (set of int CPU ids, declared per
|
||||||
|
# node) pins system/user/build; the complement pins l4d2-game. On HT
|
||||||
|
# hosts, list both siblings of a physical core so games don't share
|
||||||
|
# L1/L2 with system work — pairings via
|
||||||
|
# /sys/devices/system/cpu/cpu<n>/topology/thread_siblings_list.
|
||||||
|
vm_threads = metadata.get('vm/threads', metadata.get('vm/cores'))
|
||||||
|
all_cpus = set(range(vm_threads))
|
||||||
|
system_cpus = metadata.get('left4me/system_cpus')
|
||||||
|
if not system_cpus <= all_cpus:
|
||||||
|
raise Exception(
|
||||||
|
f'left4me/system_cpus={sorted(system_cpus)} on {vm_threads}-thread host '
|
||||||
|
f'includes CPUs outside [0, {vm_threads})'
|
||||||
|
)
|
||||||
|
game_cpus = all_cpus - system_cpus
|
||||||
|
if not game_cpus:
|
||||||
|
raise Exception(
|
||||||
|
f'left4me/system_cpus={sorted(system_cpus)} on {vm_threads}-thread host '
|
||||||
|
f'leaves no cores for games'
|
||||||
|
)
|
||||||
|
system_cpus_string = ','.join(str(t) for t in sorted(system_cpus))
|
||||||
|
game_cpus_string = ','.join(str(t) for t in sorted(game_cpus))
|
||||||
|
|
||||||
|
# Drop-in for upstream system.slice / user.slice (units we don't own).
|
||||||
|
    # Same '<parent>.d/<basename>.conf' convention as nginx and autologin.
    cpuset_dropin = {'Slice': {'AllowedCPUs': system_cpus_string}}

    return {
        'systemd': {
            'units': {
                'left4me-web.service': {
                    'Unit': {
                        'Description': 'left4me web application',
                        'After': 'network-online.target',
                        'Wants': 'network-online.target',
                    },
                    'Service': {
                        'Type': 'simple',
                        'User': 'left4me',
                        'Group': 'left4me',
                        'WorkingDirectory': '/opt/left4me/src',
                        'Environment': {
                            'HOME=/var/lib/left4me',
                            'PATH=/opt/left4me/.venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin',
                        },
                        'EnvironmentFile': (
                            '/etc/left4me/host.env',
                            '/etc/left4me/web.env',
                        ),
                        'ExecStart': (
                            '/opt/left4me/.venv/bin/gunicorn '
                            f'--workers {workers} --threads {threads} '
                            "--bind 127.0.0.1:8000 'l4d2web.app:create_app()'"
                        ),
                        'Restart': 'on-failure',
                        'RestartSec': '3',
                        # NoNewPrivileges intentionally NOT set: workers sudo to the helpers.
                        'ProtectSystem': 'full',
                        'ReadWritePaths': '/var/lib/left4me',
                        'PrivateTmp': 'true',
                    },
                    'Install': {
                        'WantedBy': {'multi-user.target'},
                    },
                },
                'left4me-server@.service': {
                    'Unit': {
                        'Description': 'left4me server instance %i',
                        'After': 'network-online.target',
                        'Wants': 'network-online.target',
                        'StartLimitBurst': '5',
                        'StartLimitIntervalSec': '60s',
                    },
                    'Service': {
                        'Type': 'simple',
                        'User': 'left4me',
                        'Group': 'left4me',
                        'EnvironmentFile': (
                            '/etc/left4me/host.env',
                            '/var/lib/left4me/instances/%i/instance.env',
                        ),
                        'WorkingDirectory': '-/var/lib/left4me/runtime/%i/merged/left4dead2',
                        'ExecStartPre': (
                            '+/usr/bin/nsenter --mount=/proc/1/ns/mnt -- '
                            '/usr/local/libexec/left4me/left4me-overlay mount %i'
                        ),
                        'ExecStart': (
                            '/var/lib/left4me/runtime/%i/merged/srcds_run '
                            '-game left4dead2 +hostport ${L4D2_PORT} $L4D2_ARGS'
                        ),
                        'ExecStopPost': (
                            '+/usr/bin/nsenter --mount=/proc/1/ns/mnt -- '
                            '/usr/local/libexec/left4me/left4me-overlay umount %i'
                        ),
                        'Restart': 'on-failure',
                        'RestartSec': '5',
                        'Slice': 'l4d2-game.slice',
                        'Nice': '-5',
                        'IOSchedulingClass': 'best-effort',
                        'IOSchedulingPriority': '4',
                        'OOMScoreAdjust': '-200',
                        'MemoryHigh': '1.5G',
                        'MemoryMax': '2G',
                        'TasksMax': '256',
                        'LimitNOFILE': '65536',
                        'KillSignal': 'SIGINT',
                        'TimeoutStopSec': '15s',
                        'LogRateLimitIntervalSec': '0',
                        'NoNewPrivileges': 'true',
                        'PrivateTmp': 'true',
                        'PrivateDevices': 'true',
                        'ProtectHome': 'true',
                        'ProtectSystem': 'strict',
                        'ReadOnlyPaths': '/var/lib/left4me/installation /var/lib/left4me/overlays',
                        'ReadWritePaths': '/var/lib/left4me/runtime/%i',
                        'RestrictSUIDSGID': 'true',
                        'LockPersonality': 'true',
                    },
                    'Install': {
                        'WantedBy': {'multi-user.target'},
                    },
                },
                'l4d2-game.slice': {
                    'Unit': {
                        'Description': 'left4me game-server slice',
                        'Before': 'slices.target',
                    },
                    'Slice': {
                        'CPUWeight': '1000',
                        'IOWeight': '1000',
                        'AllowedCPUs': game_cpus_string,
                    },
                },
                'l4d2-build.slice': {
                    'Unit': {
                        'Description': 'left4me script-sandbox build slice',
                        'Before': 'slices.target',
                    },
                    'Slice': {
                        'CPUWeight': '10',
                        'IOWeight': '10',
                        'AllowedCPUs': system_cpus_string,
                    },
                },
                'system.slice.d/99-left4me-cpuset.conf': cpuset_dropin,
                'user.slice.d/99-left4me-cpuset.conf': cpuset_dropin,
            },
        },
    }
|
@ -1,9 +1,60 @@
|
||||||
https://github.com/dehydrated-io/dehydrated/wiki/example-dns-01-nsupdate-script
|
# letsencrypt
|
||||||
|
|
||||||
|
Issues and renews Let's Encrypt certs via [dehydrated][upstream] with
|
||||||
|
DNS-01 against the in-house bind-acme server.
|
||||||
|
|
||||||
|
[upstream]: https://github.com/dehydrated-io/dehydrated/wiki/example-dns-01-nsupdate-script
|
||||||
|
|
||||||
|
## First-apply behaviour
|
||||||
|
|
||||||
|
Immediately after `bw apply <node>`, nginx serves a **self-signed
|
||||||
|
cert** for each declared domain — generated by
|
||||||
|
`/etc/dehydrated/letsencrypt-ensure-some-certificate` so nginx has
|
||||||
|
something to start with. The real Let's Encrypt cert arrives at most
|
||||||
|
24h later when the systemd timer fires
|
||||||
|
(`/usr/bin/dehydrated --cron --accept-terms --challenge dns-01`). To
|
||||||
|
shortcut the wait:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
ssh <node> 'sudo /usr/bin/dehydrated --cron --accept-terms --challenge dns-01'
|
||||||
|
ssh <node> 'sudo systemctl reload nginx'
|
||||||
|
```
|
||||||
|
|
||||||
|
## DNS-01 prerequisites
|
||||||
|
|
||||||
|
`hook.sh` does `nsupdate` against the bind-acme server (referenced
|
||||||
|
by `letsencrypt/acme_node`). For the challenge to succeed:
|
||||||
|
|
||||||
|
1. The acme node must be in the same metadata graph (so
|
||||||
|
`bw metadata <node> -k letsencrypt/acme_node` resolves).
|
||||||
|
2. **All NS servers** for the validated domain must serve the
|
||||||
|
`_acme-challenge.<domain>` CNAME — Let's Encrypt validates from
|
||||||
|
primary AND secondary geographic regions; both authoritative
|
||||||
|
servers must agree. If a secondary NS is also a bw-managed node,
|
||||||
|
`bw apply` it after adding the domain (see e.g. `ovh.secondary`).
|
||||||
|
3. The bind-acme server must be reachable for the TSIG-signed update. `hook.sh` is
|
||||||
|
rendered with the bind-acme server's `network/internal/ipv4` —
|
||||||
|
for clients outside that LAN, the route must exist (typically via
|
||||||
|
wireguard `s2s` peer membership).
|
||||||
|
|
||||||
|
## Negative-cache penalty
|
||||||
|
|
||||||
|
If the first DNS-01 attempt fails (e.g. zone not yet applied to the
|
||||||
|
secondary NS), Let's Encrypt's resolvers cache NXDOMAIN for the SOA's
|
||||||
|
negative TTL (often 900s = 15 min). Subsequent attempts during that
|
||||||
|
window also fail and refresh the cache. Combined with LE's rate limit
|
||||||
|
of **5 failed authorisations per hostname, per account, per hour**, recovery requires
|
||||||
|
you to **stop retrying** for ~15 minutes after fixing the DNS, then
|
||||||
|
make at most one attempt.
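
The waiting rule above is plain arithmetic. A minimal sketch, assuming the 900 s negative TTL quoted above (check the zone's SOA for the real value); `earliest_retry` is a hypothetical helper, not part of the bundle:

```python
from datetime import datetime, timedelta, timezone


def earliest_retry(last_failed_attempt, negative_ttl_seconds=900):
    """Return the earliest time at which the cached NXDOMAIN should have expired."""
    # Every failed attempt refreshes the cache, so count from the LAST failure.
    return last_failed_attempt + timedelta(seconds=negative_ttl_seconds)


last_failure = datetime(2026, 5, 10, 12, 0, tzinfo=timezone.utc)
print(earliest_retry(last_failure).isoformat())
```

Counting from the last failed attempt rather than from the DNS fix is the point: a retry inside the window restarts the clock.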
|
||||||
|
|
||||||
|
## nsupdate sample
|
||||||
|
|
||||||
|
For interactive testing of the bind-acme TSIG path:
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
printf "server 127.0.0.1
|
printf "server 127.0.0.1
|
||||||
zone acme.resolver.name.
|
zone acme.resolver.name.
|
||||||
update add _acme-challenge.ckn.li.acme.resolver.name. 600 IN TXT "hello"
|
update add _acme-challenge.ckn.li.acme.resolver.name. 600 IN TXT \"hello\"
|
||||||
send
|
send
|
||||||
" | nsupdate -y hmac-sha512:acme:XXXXXX
|
" | nsupdate -y hmac-sha512:acme:XXXXXX
|
||||||
```
|
```
|
||||||
|
|
|
||||||
|
|
@ -2,7 +2,7 @@ defaults = {
|
||||||
'apt': {
|
'apt': {
|
||||||
'packages': {
|
'packages': {
|
||||||
'dehydrated': {},
|
'dehydrated': {},
|
||||||
'dnsutils': {},
|
'bind9-dnsutils': {},
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
'letsencrypt': {
|
'letsencrypt': {
|
||||||
|
|
|
||||||
36
bundles/nginx/README.md
Normal file
|
|
@ -0,0 +1,36 @@
# nginx

Webserver. Per-node vhosts in `nginx/vhosts`; per-vhost templates in
`data/nginx/*.conf`.

## How port 80 is served

The bundle ships a fixed `80.conf` to
`/etc/nginx/sites-available/80.conf` (picked up by the
`sites-enabled/` symlink) that handles **all** port-80 traffic
across vhosts:

1. ACME HTTP-01 challenges (`/.well-known/acme-challenge/`) are
   served from `/var/lib/dehydrated/acme-challenges/`.
2. All other port-80 requests are 301-redirected to
   `https://$host$request_uri`.

Per-vhost templates only declare `listen 443 ssl http2;`, so they
don't need their own port-80 server blocks. If you need
vhost-specific port-80 behaviour (e.g. plain-HTTP without redirect),
override `80.conf` or add a per-vhost block.

## Required metadata

- `vm/cores`: read directly by `items.py` for `worker_processes`.
  No default; `bw items <node>` raises at item-build time if missing.
  Typically supplied by the `vm` bundle / hetzner-vm group;
  double-check on bare-metal hosts.
- `nginx/vhosts`: dict of vhost-name → vhost-config.
- `nginx/modules`: list of dynamic modules to load.

## Cross-namespace

`items.py` reads `letsencrypt/domains` to skip emitting a per-vhost
HTTPS block when LE hasn't declared the domain yet, which keeps the
bundle loadable on a node where letsencrypt isn't fully wired up.
@ -32,12 +32,13 @@ http {
|
||||||
|
|
||||||
% endif
|
% endif
|
||||||
|
|
||||||
% if has_websockets:
|
# Always defined: serves both WS-enabled vhosts (Connection: upgrade for
|
||||||
|
# ws clients) and SSE/keep-alive vhosts (Connection: "" lets nginx manage
|
||||||
|
# the upstream connection for keep-alive, instead of forcing "close").
|
||||||
map $http_upgrade $connection_upgrade {
|
map $http_upgrade $connection_upgrade {
|
||||||
default upgrade;
|
default upgrade;
|
||||||
'' close;
|
'' '';
|
||||||
}
|
}
|
||||||
% endif
|
|
||||||
|
|
||||||
include /etc/nginx/sites-enabled/*;
|
include /etc/nginx/sites-enabled/*;
|
||||||
}
|
}
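
The behaviour of the new `map` block can be restated as a two-line function. This is only a model of the mapping for readers, not code the bundle runs; the function name is made up:

```python
def connection_upgrade(http_upgrade):
    """Mirror the map block: an empty Upgrade header maps to '', anything else to 'upgrade'."""
    return '' if http_upgrade == '' else 'upgrade'


for header in ('websocket', ''):
    print(repr(header), '->', repr(connection_upgrade(header)))
```

With the old mapping the empty case returned `'close'`, which forced the upstream connection closed; an empty value means nginx does not pass a `Connection` header upstream at all, so upstream keep-alive can work.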
|
||||||
|
|
|
||||||
|
|
@ -64,7 +64,7 @@ files = {
|
||||||
'svc_systemd:nginx:restart',
|
'svc_systemd:nginx:restart',
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
'/etc/nginx/sites/80.conf': {
|
'/etc/nginx/sites-available/80.conf': {
|
||||||
'triggers': {
|
'triggers': {
|
||||||
'svc_systemd:nginx:restart',
|
'svc_systemd:nginx:restart',
|
||||||
},
|
},
|
||||||
|
|
|
||||||
|
|
@ -33,7 +33,7 @@ for name, unit in node.metadata.get('systemd/units').items():
|
||||||
'svc_systemd:systemd-networkd.service:restart',
|
'svc_systemd:systemd-networkd.service:restart',
|
||||||
],
|
],
|
||||||
}
|
}
|
||||||
elif extension in ['timer', 'service', 'mount', 'swap', 'target']:
|
elif extension in ['timer', 'service', 'mount', 'swap', 'target', 'slice']:
|
||||||
path = f'/usr/local/lib/systemd/system/{name}'
|
path = f'/usr/local/lib/systemd/system/{name}'
|
||||||
dependencies = {
|
dependencies = {
|
||||||
'triggers': [
|
'triggers': [
|
||||||
|
|
|
||||||
|
|
@ -8,10 +8,16 @@ server {
|
||||||
|
|
||||||
location / {
|
location / {
|
||||||
proxy_set_header X-Real-IP $remote_addr;
|
proxy_set_header X-Real-IP $remote_addr;
|
||||||
% if websockets:
|
# Always set Upgrade + Connection via the $connection_upgrade map:
|
||||||
|
# WS client (Upgrade header sent) -> Connection: upgrade
|
||||||
|
# non-WS client (no Upgrade) -> Connection: "" (keep-alive)
|
||||||
|
# Lets every vhost serve both WS and SSE without per-vhost flags.
|
||||||
|
proxy_http_version 1.1;
|
||||||
proxy_set_header Upgrade $http_upgrade;
|
proxy_set_header Upgrade $http_upgrade;
|
||||||
proxy_set_header Connection $connection_upgrade;
|
proxy_set_header Connection $connection_upgrade;
|
||||||
% endif
|
# SSE-safe pass-through (also fine for non-SSE traffic):
|
||||||
|
proxy_buffering off;
|
||||||
|
proxy_read_timeout 1h;
|
||||||
proxy_pass ${target};
|
proxy_pass ${target};
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
|
||||||
|
|
@ -48,3 +48,51 @@ instead.
|
||||||
|
|
||||||
See [`conventions.md#secrets`](conventions.md#secrets) for the
|
See [`conventions.md#secrets`](conventions.md#secrets) for the
|
||||||
demagify magic-string list and the rule's full rationale.
|
demagify magic-string list and the rule's full rationale.
|
||||||
|
|
||||||
|
## Read-only commands — useful flag combinations
|
||||||
|
|
||||||
|
The fork's [`AGENTS.md`][fork] documents the canonical safety envelope.
|
||||||
|
These are the flag combinations agents reach for most often in this repo:
|
||||||
|
|
||||||
|
| Want to … | Run |
|
||||||
|
|---|---|
|
||||||
|
| Sanity-check the whole repo (parse + cross-cutting hooks) | `bw test` (defaults to `-HIJKMSp`) |
|
||||||
|
| Exercise reactors and item-graph for one node | `bw test <node>` (defaults to `-IJKMp`) |
|
||||||
|
| Same, but every node that has a given bundle | `bw test bundle:<name>` |
|
||||||
|
| Print one metadata key for one node | `bw metadata <node> -k <a/b>` (repeat `-k` for more) |
|
||||||
|
| Show where each metadata value comes from | `bw metadata <node> -b` |
|
||||||
|
| Resolve Faults (vault values) into the dump | `bw metadata <node> -f` — **may print secrets, avoid** |
|
||||||
|
| List a node's items, with the bundle that defines each | `bw items <node> --blame` |
|
||||||
|
| Preview a rendered file's content | `bw items <node> file:<path> -f` |
|
||||||
|
| Verify against the live host, scoped to one bundle | `bw verify <node> -o bundle:<name>` |
|
||||||
|
| Hash metadata only (faster than full config hash) | `bw hash <node> -m` |
|
||||||
|
| Inspect the data backing a hash | `bw hash <node> -d` |
|
||||||
|
|
||||||
|
`bw test`, `bw verify`, `bw nodes`, `bw metadata` all share a target-
|
||||||
|
selector grammar: bare node name, group name, `bundle:<name>`,
|
||||||
|
`!bundle:<name>`, or `"lambda:node.metadata_get('foo/bar', 0) < 3"`.
|
||||||
|
|
||||||
|
[fork]: https://github.com/CroneKorkN/bundlewrap/blob/main/AGENTS.md
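
To make the selector grammar concrete, here is a minimal sketch that evaluates selector strings against stand-in nodes. `FakeNode` and `select` are hypothetical helpers written for this illustration; they are not bundlewrap's implementation, and group-name selectors are left out.

```python
class FakeNode:
    """Hypothetical stand-in for a bundlewrap node, just enough for selectors."""

    def __init__(self, name, bundles, metadata):
        self.name = name
        self.bundles = bundles
        self.metadata = metadata

    def metadata_get(self, path, default=None):
        # Walk an 'a/b/c' path through nested dicts, falling back to default.
        value = self.metadata
        for key in path.split('/'):
            if not isinstance(value, dict) or key not in value:
                return default
            value = value[key]
        return value


def select(nodes, selector):
    """Return the names of nodes matching one selector string."""
    if selector.startswith('lambda:'):
        expression = selector[len('lambda:'):]
        # eval is tolerable here only because the operator types the selector.
        return [n.name for n in nodes if eval(expression, {'node': n})]
    if selector.startswith('!bundle:'):
        return [n.name for n in nodes if selector[len('!bundle:'):] not in n.bundles]
    if selector.startswith('bundle:'):
        return [n.name for n in nodes if selector[len('bundle:'):] in n.bundles]
    return [n.name for n in nodes if n.name == selector]


nodes = [
    FakeNode('web1', ['nginx'], {'foo': {'bar': 1}}),
    FakeNode('db1', ['postgresql'], {'foo': {'bar': 5}}),
    FakeNode('stub', [], {}),
]
print(select(nodes, "lambda:node.metadata_get('foo/bar', 0) < 3"))
```

The `stub` node matches the lambda selector because the default of `0` is used when `foo/bar` is absent, which is why the default argument in the selector matters.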
|
||||||
|
|
||||||
|
## Bundle-validation workflow
|
||||||
|
|
||||||
|
`bw test` (no args) is a *parsing* gate, not a *behaviour* gate. It
|
||||||
|
loads every bundle, but a bundle's reactors only resolve when a node's
|
||||||
|
metadata is actually built — and that happens only for nodes that
|
||||||
|
opt in. Until then, reactor bugs stay dormant. bw rejects reactors
|
||||||
|
that don't read any metadata, but the rejection only fires once *some*
|
||||||
|
node consumes the bundle.
|
||||||
|
|
||||||
|
When developing a new bundle:
|
||||||
|
|
||||||
|
1. Scaffold + `bw test` — confirms parsing.
|
||||||
|
2. **Attach the bundle to one node** (or a stub node) by adding it to
|
||||||
|
`nodes/<n>.py`'s `bundles` list, or to a group the node is in.
|
||||||
|
3. `bw test <node>` — now reactors fire. This is where bundle bugs
|
||||||
|
surface.
|
||||||
|
4. `bw items <node> --blame` and `bw metadata <node> -k <key>` —
|
||||||
|
confirm items materialise and derived metadata looks right.
|
||||||
|
5. `bw hash <node>` — preview against the live host.
|
||||||
|
|
||||||
|
Step 2 is non-optional. A bundle that "passes `bw test`" with no
|
||||||
|
consumer is proven only to parse.
|
||||||
|
|
|
||||||
|
|
@ -127,6 +127,12 @@ bundle.
|
||||||
|
|
||||||
## 3. Per-bundle `AGENTS.md` template
|
## 3. Per-bundle `AGENTS.md` template
|
||||||
|
|
||||||
|
> **Status: replaced — pre-pivot intent only.** Per-bundle docs are plain
|
||||||
|
> `README.md` with no fixed structure. See §0 Revisions and the
|
||||||
|
> "Per-bundle README" section in [`bundles/AGENTS.md`](../../../bundles/AGENTS.md)
|
||||||
|
> for the current convention. The template below is kept as a record of
|
||||||
|
> the original design.
|
||||||
|
|
||||||
One balanced doc serving both audiences. Prose where prose helps, structure
|
One balanced doc serving both audiences. Prose where prose helps, structure
|
||||||
where structure helps. Sections in order:
|
where structure helps. Sections in order:
|
||||||
|
|
||||||
|
|
@ -339,6 +345,12 @@ in 30–120 lines each; root `AGENTS.md` is ~150 lines.
|
||||||
|
|
||||||
### Phase 2 — seed bundles (10)
|
### Phase 2 — seed bundles (10)
|
||||||
|
|
||||||
|
> **Status: dropped — pre-pivot intent only.** Phase 2 didn't ship. After
|
||||||
|
> Phase 1 landed, the maintainer pulled the per-bundle `AGENTS.md`
|
||||||
|
> migration: the rigid template proved a poor fit for the heterogeneous
|
||||||
|
> existing READMEs. See §0 Revisions. The seed list and migration plan
|
||||||
|
> below are kept as a record of how the work was scoped.
|
||||||
|
|
||||||
Bundles selected empirically (node+group references and recent commit
|
Bundles selected empirically (node+group references and recent commit
|
||||||
activity, validated 2026-05-10):
|
activity, validated 2026-05-10):
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -0,0 +1,253 @@
|
||||||
|
# Round 1 — agent-doc refactor (gaps 1–6 + cmd cheat sheet)
|
||||||
|
|
||||||
|
## Why
|
||||||
|
|
||||||
|
A previous session integrated `bundles/left4me/` and brought
|
||||||
|
`ovh.left4me` live. The integration produced a handoff (at
|
||||||
|
`~/.claude/plans/2026-05-10-ckn-bw-docs-improvements-handoff.md`)
|
||||||
|
listing 12 documentation gaps surfaced by the work. This spec covers
|
||||||
|
the first six (the cross-cutting ones) plus a useful side-quest:
|
||||||
|
adding a read-only command cheat sheet to `docs/agents/commands.md`.
|
||||||
|
Gaps 7–12 (item-specific, bundle READMEs) are deferred to a follow-up
|
||||||
|
round.
|
||||||
|
|
||||||
|
## Scope
|
||||||
|
|
||||||
|
In:
|
||||||
|
|
||||||
|
- Gap 1 — drop `bw bundles` (doesn't exist), add `bw verify` to the
|
||||||
|
read-only allowlist.
|
||||||
|
- Gap 2 — bundle-validation workflow needs a node attached.
|
||||||
|
- Gap 3 — nodes carry only node-specific metadata (split across
|
||||||
|
`bundles/AGENTS.md` and `nodes/AGENTS.md`).
|
||||||
|
- Gap 4 — reactors must read metadata or be defaults.
|
||||||
|
- Gap 5 — `triggers` ↔ `triggered: True` invariant + self-healing
|
||||||
|
pattern.
|
||||||
|
- Gap 6 — `unless` semantics (folded into Gap 5's second bullet).
|
||||||
|
- Side-quest: read-only command cheat sheet in `commands.md` (`bw
|
||||||
|
test` flag matrix + selectors, `bw metadata -k/-b/-f`, `bw items
|
||||||
|
--blame/-f`, `bw verify -o bundle:`, `bw hash -m/-d`).
|
||||||
|
|
||||||
|
Out:
|
||||||
|
|
||||||
|
- Gaps 7–12 (`source` implicit, `git_deploy` chown, `git_deploy` URL
|
||||||
|
form, letsencrypt/bind/nginx READMEs).
|
||||||
|
- Any change to bundle behaviour. This is pure docs; if a doc claim
|
||||||
|
feels wrong, push back to the maintainer rather than editing
|
||||||
|
`.py`.
|
||||||
|
|
||||||
|
## Verification approach
|
||||||
|
|
||||||
|
For each gap, find current line numbers in the target doc (handoff
|
||||||
|
line numbers are May 2026; some have drifted). Verify code-level
|
||||||
|
claims against the fork source under `.venv/src/bundlewrap/` before
|
||||||
|
quoting them.
|
||||||
|
|
||||||
|
Already verified during brainstorm:
|
||||||
|
|
||||||
|
- Gap 1: `bw bundles` is not a subcommand of the installed fork
|
||||||
|
(`.venv/bin/bw --help` lists only
|
||||||
|
`apply, debug, diff, groups, hash, ipmi, items, lock, metadata,
|
||||||
|
nodes, plot, pw, repo, run, stats, test, verify, zen`). `bw verify`
|
||||||
|
is read-only.
|
||||||
|
- Gap 2: `bw test` default flag set differs by mode. Whole-repo:
|
||||||
|
`-HIJKMSp`. Node-targeted: `-IJKMp`. The repo-mode adds `-H`
|
||||||
|
(repo hooks) and `-S` (subgroup-loops); both modes include `-J`
|
||||||
|
(node hooks). Reactors only resolve when a node's metadata is
|
||||||
|
built, which only happens when a node opts into the bundle.
|
||||||
|
- Gap 4: exact wording at `metagen.py:428`:
|
||||||
|
`"{reactor_name} on {node_name} did not request any metadata, you
|
||||||
|
might want to use defaults instead"`.
|
||||||
|
- Gap 5: exact wording at `deps.py:340`:
|
||||||
|
`"'{item1}' in bundle '{bundle1}' triggered by '{item2}' in bundle
|
||||||
|
'{bundle2}', but missing 'triggered' attribute"`.
|
||||||
|
- Gap 3 precedent: `bundles/left4me/metadata.py:10` is the canonical
|
||||||
|
random-bytes-in-defaults example. `bundles/postgresql/metadata.py:4`
|
||||||
|
is the password_for-at-module-scope example. (The handoff cites
|
||||||
|
postgresql for the random-bytes pattern; that's a misattribution —
|
||||||
|
postgresql uses `password_for`.)
|
||||||
|
|
||||||
|
After every commit: `.venv/bin/bw test` must pass with the same
|
||||||
|
output as before. Pure-docs edits cannot break it unless a `.py` is
|
||||||
|
touched accidentally.
|
||||||
|
|
||||||
|
## Commits
|
||||||
|
|
||||||
|
Six iterative commits, matching repo style.
|
||||||
|
|
||||||
|
### Commit 1 — drop `bw bundles`, add `bw verify` (Gap 1)
|
||||||
|
|
||||||
|
`AGENTS.md` rule 1 only. The handoff also flagged
|
||||||
|
`bundles/AGENTS.md:60-64`, but that list no longer references
|
||||||
|
`bw bundles` (it currently reads `bw test` / `bw items` / `bw hash`).
|
||||||
|
That section gets rewritten in commit 3, not here.
|
||||||
|
|
||||||
|
```diff
|
||||||
|
- to `bw test`, `bw nodes`, `bw groups`, `bw bundles`,
|
||||||
|
- `bw items`, `bw metadata`, `bw hash`, `bw debug`. See
|
||||||
|
+ to `bw test`, `bw nodes`, `bw groups`, `bw items`,
|
||||||
|
+ `bw metadata`, `bw hash`, `bw verify`, `bw debug`. See
|
||||||
|
```
|
||||||
|
|
||||||
|
### Commit 2 — read-only command cheat sheet
|
||||||
|
|
||||||
|
Append to `docs/agents/commands.md`. New H2 section, table format
|
||||||
|
to match the existing voice.
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
## Read-only commands — useful flag combinations
|
||||||
|
|
||||||
|
The fork's [`AGENTS.md`][fork] documents the canonical safety envelope.
|
||||||
|
These are the flag combinations agents reach for most often in this repo:
|
||||||
|
|
||||||
|
| Want to … | Run |
|
||||||
|
|---|---|
|
||||||
|
| Sanity-check the whole repo (parse + cross-cutting hooks) | `bw test` (defaults to `-HIJKMSp`) |
|
||||||
|
| Exercise reactors and item-graph for one node | `bw test <node>` (defaults to `-IJKMp`) |
|
||||||
|
| Same, but every node that has a given bundle | `bw test bundle:<name>` |
|
||||||
|
| Print one metadata key for one node | `bw metadata <node> -k <a/b>` (repeat `-k` for more) |
|
||||||
|
| Show where each metadata value comes from | `bw metadata <node> -b` |
|
||||||
|
| Resolve Faults (vault values) into the dump | `bw metadata <node> -f` — **may print secrets, avoid** |
|
||||||
|
| List a node's items, with the bundle that defines each | `bw items <node> --blame` |
|
||||||
|
| Preview a rendered file's content | `bw items <node> file:<path> -f` |
|
||||||
|
| Verify against the live host, scoped to one bundle | `bw verify <node> -o bundle:<name>` |
|
||||||
|
| Hash metadata only (faster than full config hash) | `bw hash <node> -m` |
|
||||||
|
| Inspect the data backing a hash | `bw hash <node> -d` |
|
||||||
|
|
||||||
|
`bw test`, `bw verify`, `bw nodes`, `bw metadata` all share a target-
|
||||||
|
selector grammar: bare node name, group name, `bundle:<name>`,
|
||||||
|
`!bundle:<name>`, or `"lambda:node.metadata_get('foo/bar', 0) < 3"`.
|
||||||
|
|
||||||
|
[fork]: https://github.com/CroneKorkN/bundlewrap/blob/main/AGENTS.md
|
||||||
|
```
|
||||||
|
|
||||||
|
### Commit 3 — bundle validation needs a node attached (Gap 2)
|
||||||
|
|
||||||
|
Two file changes.
|
||||||
|
|
||||||
|
**`bundles/AGENTS.md` lines 59-64** — replace the Verify list:
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
5. **Verify, in this order:**
|
||||||
|
- `bw test` — repo-wide parse + cross-cutting hooks. Loads every
|
||||||
|
bundle, but reactors don't fire for nodes that haven't opted into
|
||||||
|
the bundle yet — bugs in new reactors stay hidden here.
|
||||||
|
- **Attach the bundle to a node** (via the node's `bundles` list, or
|
||||||
|
a group it belongs to). Until you do, the next steps don't actually
|
||||||
|
exercise the bundle.
|
||||||
|
- `bw test <node>` — exercises every reactor and item-graph edge for
|
||||||
|
that node. This is where most new-bundle bugs surface.
|
||||||
|
- `bw items <node> --blame` — confirm items materialise with the right
|
||||||
|
paths, authored by the expected bundle.
|
||||||
|
- `bw metadata <node> -k <a/b>` — spot-check derived metadata.
|
||||||
|
- `bw hash <node>` — preview vs current host state.
|
||||||
|
|
||||||
|
See [`docs/agents/commands.md#bundle-validation-workflow`](../docs/agents/commands.md#bundle-validation-workflow)
|
||||||
|
for the rationale.
|
||||||
|
```
|
||||||
|
|
||||||
|
**`docs/agents/commands.md`** — new section after the cheat sheet:
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
## Bundle-validation workflow
|
||||||
|
|
||||||
|
`bw test` (no args) is a *parsing* gate, not a *behaviour* gate. It
|
||||||
|
loads every bundle, but a bundle's reactors only resolve when a node's
|
||||||
|
metadata is actually built — and that happens only for nodes that
|
||||||
|
opt in. Until then, reactor bugs stay dormant. bw rejects reactors that
|
||||||
|
don't read any metadata, but the rejection only fires once *some* node
|
||||||
|
consumes the bundle.
|
||||||
|
|
||||||
|
When developing a new bundle:
|
||||||
|
|
||||||
|
1. Scaffold + `bw test` — confirms parsing.
|
||||||
|
2. **Attach the bundle to one node** (or a stub node) by adding it to
|
||||||
|
`nodes/<n>.py`'s `bundles` list, or to a group the node is in.
|
||||||
|
3. `bw test <node>` — now reactors fire. This is where bundle bugs
|
||||||
|
surface.
|
||||||
|
4. `bw items <node> --blame` and `bw metadata <node> -k <key>` — confirm
|
||||||
|
items materialise and derived metadata looks right.
|
||||||
|
5. `bw hash <node>` — preview against the live host.
|
||||||
|
|
||||||
|
Step 2 is non-optional. A bundle that "passes `bw test`" with no consumer
|
||||||
|
is proven only to parse.
|
||||||
|
```
|
||||||
|
|
||||||
|
### Commit 4 — nodes carry only node-specific metadata (Gap 3)
|
||||||
|
|
||||||
|
**`bundles/AGENTS.md` Conventions** — new bullet:
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
- **Bundles own application-wide knowledge; nodes carry only the few
|
||||||
|
per-host knobs the bundle actually needs.** When designing a bundle,
|
||||||
|
identify the per-node knobs (e.g. domain, uplink interface, a
|
||||||
|
vault-id suffix) and put everything else in `defaults`, or in a
|
||||||
|
reactor that derives from those knobs. Per-node random secrets
|
||||||
|
belong in `defaults` via `repo.vault.random_bytes_as_base64_for(...)`
|
||||||
|
keyed on the node — not in the node file. See
|
||||||
|
`bundles/left4me/metadata.py:10` (`secret_key` derived in defaults)
|
||||||
|
and `bundles/postgresql/metadata.py:4` (vault-derived `password_for`
|
||||||
|
at module scope).
|
||||||
|
```
|
||||||
|
|
||||||
|
**`nodes/AGENTS.md` Pitfalls** — new bullet:
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
- **Bloated per-node metadata is usually a bundle smell.** If a
|
||||||
|
bundle's metadata block in the node file has more than 3-5 keys,
|
||||||
|
the bundle is probably under-using `defaults` / reactors. Push the
|
||||||
|
contribution into the bundle (see
|
||||||
|
[`bundles/AGENTS.md`](../bundles/AGENTS.md#conventions)) rather than
|
||||||
|
growing the node file.
|
||||||
|
```
|
||||||
|
|
||||||
|
### Commit 5 — reactors must read metadata or be defaults (Gap 4)
|
||||||
|
|
||||||
|
**`bundles/AGENTS.md` Pitfalls** — new bullet:
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
- **Reactors must read metadata.** If a reactor body returns a static
|
||||||
|
dict without calling `metadata.get(...)`, bw raises
|
||||||
|
`ValueError: <reactor> on <node> did not request any metadata, you
|
||||||
|
might want to use defaults instead` once a node consumes the bundle.
|
||||||
|
Fix: fold the contribution into `defaults`. The rule applies even
|
||||||
|
when the reactor writes into another bundle's namespace — a static
|
||||||
|
contribution to e.g. `nftables/output` belongs in `defaults`, where
|
||||||
|
bw merges it with other bundles' contributions.
|
||||||
|
```
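
For reviewers, the distinction in the bullet above can be sketched as follows. The `metadata_reactor` shim and `FakeMetadata` class exist only so the snippet runs on its own (in a real `metadata.py` bundlewrap supplies the decorator and the metadata object), and the `left4me/workers` key and the nftables rule are made-up examples.

```python
def metadata_reactor(func):
    """Stand-in for the decorator bundlewrap provides inside metadata.py."""
    return func


class FakeMetadata:
    """Hypothetical stand-in for the metadata object passed to reactors."""

    def __init__(self, data):
        self.data = data

    def get(self, path, default=None):
        value = self.data
        for key in path.split('/'):
            if not isinstance(value, dict) or key not in value:
                return default
            value = value[key]
        return value


# Static contribution: nothing is read, so it belongs in defaults.
defaults = {
    'nftables': {
        'output': {'udp dport 27015 accept'},
    },
}


# Derived contribution: reads vm/cores, so a reactor is the right tool.
@metadata_reactor
def web_workers(metadata):
    cores = metadata.get('vm/cores')
    return {'left4me': {'workers': cores * 2}}


print(web_workers(FakeMetadata({'vm': {'cores': 4}})))
```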
|
||||||
|
|
||||||
|
### Commit 6 — `triggers` invariant + self-healing + `unless` (Gaps 5+6)
|
||||||
|
|
||||||
|
**`bundles/AGENTS.md` Pitfalls** — two new bullets (Gap 6's `unless`
|
||||||
|
semantics fold into the second; cleaner than three bullets):
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
- **`triggers` ↔ `triggered: True` invariant.** Any item listed in
|
||||||
|
another's `triggers` list must declare `triggered: True`. bw
|
||||||
|
enforces this at `bw test` time: *"…triggered by …, but missing
|
||||||
|
'triggered' attribute"*. Corollary: an action can't be both in an
|
||||||
|
upstream `triggers` list AND self-healing every apply — pick one.
|
||||||
|
|
||||||
|
- **Triggered actions don't recover from partial failure.** When an
|
||||||
|
upstream item's apply succeeds but its triggered downstream action
|
||||||
|
fails, subsequent applies can't recover via the trigger chain —
|
||||||
|
upstream is "already in desired state" and never re-triggers. For
|
||||||
|
actions that must self-heal (pip installs, chowns, migrations),
|
||||||
|
drop `triggered: True` and gate the command with `unless:
|
||||||
|
<fast-check>`. `unless` is a shell command on the target host whose
|
||||||
|
exit status decides whether the main command runs (exit 0 = skip);
|
||||||
|
it's checked at fire time, after `triggered:` filtering.
|
||||||
|
```
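
The two bullets above can be illustrated with a small check over item dicts. This is a sketch of the invariant, not bundlewrap's own check; the paths, action names and the `unless` command are hypothetical.

```python
files = {
    '/opt/app/requirements.txt': {
        'triggers': {'action:app_restart'},
    },
}
actions = {
    # Listed in an upstream `triggers`, so it must carry triggered: True.
    'app_restart': {
        'command': 'systemctl restart app',
        'triggered': True,
    },
    # Self-healing: runs on every apply unless the fast check succeeds.
    'app_pip_install': {
        'command': '/opt/app/.venv/bin/pip install -r /opt/app/requirements.txt',
        'unless': '/opt/app/.venv/bin/python -c "import app"',
    },
}


def missing_triggered(items):
    """Return IDs of items that another item triggers but that lack 'triggered': True."""
    problems = set()
    for attrs in items.values():
        for target in attrs.get('triggers', ()):
            if not items.get(target, {}).get('triggered', False):
                problems.add(target)
    return sorted(problems)


items = {f'file:{path}': attrs for path, attrs in files.items()}
items.update({f'action:{name}': attrs for name, attrs in actions.items()})
print(missing_triggered(items))
```

`app_pip_install` is deliberately absent from every `triggers` list: it recovers from a failed earlier run because the `unless` check, not an upstream change, decides whether it runs.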
|
||||||
|
|
||||||
|
## Out of scope
|
||||||
|
|
||||||
|
- Gaps 7–12 — deferred. The maintainer re-engages after this round.
|
||||||
|
- Bundle behaviour changes. Pure docs.
|
||||||
|
- `bw apply` / `bw run` — not authorised this session.
|
||||||
|
|
||||||
|
## Constraints
|
||||||
|
|
||||||
|
- Don't echo decrypted secrets in commit messages or new doc text.
|
||||||
|
- Don't restore `*.py_` parked nodes.
|
||||||
|
- After each commit, `.venv/bin/bw test` must pass.
|
||||||
|
- No push.
|
||||||
|
|
@ -0,0 +1,286 @@
|
||||||
|
# Round 2 — agent-doc refactor (gaps 7–12)
|
||||||
|
|
||||||
|
## Why
|
||||||
|
|
||||||
|
Continuation of round 1 (spec at
|
||||||
|
`2026-05-10-ckn-bw-agents-md-refactor-round-1-design.md`). Round 1
|
||||||
|
landed the cross-cutting lessons (read-only allowlist, bundle
|
||||||
|
validation needs a node, nodes-carry-only-node-specific-metadata,
|
||||||
|
reactors-must-read-metadata, triggers/triggered:True invariant,
|
||||||
|
self-healing pattern). Round 2 covers the remaining six gaps: built-in
|
||||||
|
item-type gotchas and three bundle READMEs.
|
||||||
|
|
||||||
|
## Scope
|
||||||
|
|
||||||
|
In:
|
||||||
|
|
||||||
|
- Gap 7 — `file:`'s `source` defaults to the basename of the destination.
|
||||||
|
- Gap 8 — `git_deploy` extracts as the connecting user (root after
|
||||||
|
sudo); chown action needed for non-root downstream consumers.
|
||||||
|
- Gap 9 — `git_deploy` URL form: `://` triggers per-apply clone, no `://`
|
||||||
|
requires a `git_deploy_repos` map at the repo root.
|
||||||
|
- Gap 10 — `bundles/letsencrypt`: first-apply behaviour, DNS-01
|
||||||
|
prerequisites, negative-cache penalty.
|
||||||
|
- Gap 11 — `bundles/bind`: applying changes to a `master_node`-linked
|
||||||
|
pair needs `bw apply` on both ends.
|
||||||
|
- Gap 12 — `bundles/nginx`: how port 80 is served, `vm/cores`
|
||||||
|
requirement.
|
||||||
|
|
||||||
|
Out:
|
||||||
|
|
||||||
|
- Bundle behaviour changes. Pure docs.
|
||||||
|
- `bw apply` / `bw run` — not authorised this session.
|
||||||
|
|
||||||
|
## Placement decision (diverges from the handoff)
|
||||||
|
|
||||||
|
The handoff suggests `items/AGENTS.md` for gaps 7, 8, 9. But
|
||||||
|
`items/AGENTS.md` is scoped to **custom** item types in the `items/`
|
||||||
|
directory — its first sentence: *"Custom item types — each `*.py` is
|
||||||
|
a `bundlewrap.items.Item` subclass…"*. Built-in gotchas (`file:`,
|
||||||
|
`git_deploy:`) don't fit there.
|
||||||
|
|
||||||
|
Round-1 lessons about built-in mechanics (reactors must read metadata,
|
||||||
|
`triggers` invariant, self-healing pattern) all landed in
|
||||||
|
`bundles/AGENTS.md` Pitfalls. Gaps 7, 8, 9 are the same shape, so
|
||||||
|
they go in the same place.
|
||||||
|
|
||||||
|
## Validation findings
|
||||||
|
|
||||||
|
- Gap 7: well-known bw built-in semantics. Trusting the handoff.
|
||||||
|
- Gap 8: confirmed at `.venv/src/bundlewrap/bundlewrap/items/git_deploy.py`'s
|
||||||
|
`fix()` method — uses `self.node.upload(...)` which writes as the sudo
|
||||||
|
user (root). Files end up root-owned.
|
||||||
|
- Gap 9: confirmed in round 1 (`git_deploy.py:103` —
|
||||||
|
`if "://" in self.attributes['repo']:`).
|
||||||
|
- Gap 10: confirmed `/etc/dehydrated/letsencrypt-ensure-some-certificate`
|
||||||
|
exists in the bundle; runs on every domain with idempotent `unless`.
|
||||||
|
Daily timer at `/usr/bin/dehydrated --cron --accept-terms --challenge dns-01`.
|
||||||
|
- Gap 11: nuanced. The bundle DOES set `bind/type = 'slave'` and renders
|
||||||
|
different named.conf.local for slaves, so bind itself may AXFR at
|
||||||
|
runtime. But the slave's *bw-managed* zone files are statically
|
||||||
|
rendered from the master's metadata at slave-apply time
|
||||||
|
(`bundles/bind/items.py:100`). The practical workflow rule — "apply
|
||||||
|
both" — is correct regardless. I'll frame the README as the workflow
|
||||||
|
rule, not the absolute "not AXFR slaving" claim from the handoff.
|
||||||
|
- Gap 12: confirmed `nginx.conf:42` includes `/etc/nginx/sites-enabled/*`;
|
||||||
|
`nginx/items.py:35` reads `node.metadata.get('vm/cores')` with no
|
||||||
|
default. README does not exist.
|
||||||
|
|
||||||
|
## Existing README states
|
||||||
|
|
||||||
|
- `bundles/letsencrypt/README.md` — 9 lines: upstream link + nsupdate
|
||||||
|
snippet. Reshape into an operational README; keep the nsupdate snippet.
|
||||||
|
- `bundles/bind/README.md` — does not exist. Create.
|
||||||
|
- `bundles/nginx/README.md` — does not exist. Create.
|
||||||
|
|
||||||
|
## Commits
|
||||||
|
|
||||||
|
### Commit 7 — `file:` source defaults to destination basename (Gap 7)
|
||||||
|
|
||||||
|
`bundles/AGENTS.md` Pitfalls — new bullet:
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
- **`file:` `source` defaults to the destination basename.** For a
|
||||||
|
destination of `/etc/foo/bar.conf` with no `source` key, bw looks for
|
||||||
|
`bundles/<bundle>/files/bar.conf`. Only declare `source` explicitly
|
||||||
|
when the basename you want differs (e.g. shipping a Mako template
|
||||||
|
named `bar.conf.mako` to a destination of `/etc/foo/bar.conf`).
|
||||||
|
```
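
The basename rule in the bullet above can be sketched in a few lines. This is a minimal illustration, not bundlewrap's own code: `default_source` is a hypothetical helper that only mirrors the documented behaviour (no `source` key means "use the destination's basename").

```python
import posixpath


def default_source(destination, source=None):
    """Return the file name looked up under bundles/<bundle>/files/.

    Hypothetical helper: it restates the rule from the pitfall bullet and
    is not part of bundlewrap's API.
    """
    if source is not None:
        # An explicit `source` always wins, e.g. a Mako template name.
        return source
    # No `source` key: fall back to the basename of the destination path.
    return posixpath.basename(destination)


print(default_source('/etc/foo/bar.conf'))
print(default_source('/etc/foo/bar.conf', source='bar.conf.mako'))
```

The second call shows the only case where declaring `source` is needed: the file shipped in the bundle has a different name from the destination.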
|
||||||
|
|
||||||
|
### Commit 8 — `git_deploy` gotchas (Gaps 8 + 9)
|
||||||
|
|
||||||
|
`bundles/AGENTS.md` Pitfalls — two new bullets.
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
- **`git_deploy` extracts as the connecting (sudo) user — files end up
|
||||||
|
root-owned.** A downstream action that runs as a non-root app user
|
||||||
|
(typical: editable pip install, Rails bundle install) will hit
|
||||||
|
`Permission denied` on `.egg-info` or similar. The fix is a
|
||||||
|
self-healing chown action between `git_deploy` and the downstream
|
||||||
|
action:
|
||||||
|
|
||||||
|
```python
|
||||||
|
actions['<bundle>_chown_src'] = {
|
||||||
|
'command': 'chown -R <user>:<group> <path>',
|
||||||
|
'unless': 'test -z "$(find <path> ! -user <user> -print -quit)"',
|
||||||
|
'cascade_skip': False,
|
||||||
|
'needs': ['git_deploy:<path>', 'user:<user>', 'group:<group>'],
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
See `bundles/left4me/items.py` for an in-tree example.
|
||||||
|
|
||||||
|
- **`git_deploy` URL form matters.** A URL containing `://` (HTTP/HTTPS,
|
||||||
|
`ssh://`) makes bw clone to a temp dir per-apply — no operator-side
|
||||||
|
state needed. Without `://` (SCP-style `git@host:path`), bw expects a
|
||||||
|
`git_deploy_repos` map file at the repo root pointing at a long-lived
|
||||||
|
local clone, and raises `RepositoryError('missing repo map for
|
||||||
|
git_deploy')` if it isn't there. For HTTPS-reachable repos use the
|
||||||
|
HTTPS form; for SSH-only, prefer the explicit `ssh://user@host/path`
|
||||||
|
form so the map isn't needed.
|
||||||
|
```
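
The URL-form rule above comes down to a substring test on the `repo` attribute (the validation notes quote it as `if "://" in self.attributes['repo']:`). A minimal sketch of that decision, assuming nothing beyond the quoted check; `needs_repo_map` is a hypothetical name used for illustration only:

```python
def needs_repo_map(repo):
    """Return True when a `git_deploy_repos` map entry would be required.

    Hypothetical helper: it mirrors the `"://" in repo` test quoted in the
    validation findings and is not a bundlewrap function.
    """
    # URLs with a scheme (https://, ssh://) are cloned to a temp dir per apply.
    # Anything else (SCP-style git@host:path) is looked up in the local map.
    return "://" not in repo


for repo in (
    'https://example.org/team/app.git',   # placeholder URL
    'ssh://git@example.org/team/app.git',  # placeholder URL
    'git@example.org:team/app.git',        # placeholder SCP-style address
):
    print(repo, '->', 'repo map needed' if needs_repo_map(repo) else 'temp clone')
```

A check like this could be run over bundle metadata before an apply to spot SCP-style addresses that would hit the missing-map error.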
|
||||||
|
|
||||||
|
### Commit 9 — letsencrypt README (Gap 10)
|
||||||
|
|
||||||
|
Reshape `bundles/letsencrypt/README.md`. Keep the upstream link and
|
||||||
|
nsupdate snippet at the top; add three structured sections.
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
# letsencrypt
|
||||||
|
|
||||||
|
Issues and renews Let's Encrypt certs via [dehydrated][upstream] with
|
||||||
|
DNS-01 against the in-house bind-acme server.
|
||||||
|
|
||||||
|
[upstream]: https://github.com/dehydrated-io/dehydrated/wiki/example-dns-01-nsupdate-script
|
||||||
|
|
||||||
|
## First-apply behaviour
|
||||||
|
|
||||||
|
Immediately after `bw apply <node>`, nginx serves a **self-signed
|
||||||
|
cert** for each declared domain — generated by
|
||||||
|
`/etc/dehydrated/letsencrypt-ensure-some-certificate` so nginx has
|
||||||
|
something to start with. The real Let's Encrypt cert arrives at most
|
||||||
|
24h later when the systemd timer fires
|
||||||
|
(`/usr/bin/dehydrated --cron --accept-terms --challenge dns-01`). To
|
||||||
|
shortcut the wait:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
ssh <node> 'sudo /usr/bin/dehydrated --cron --accept-terms --challenge dns-01'
|
||||||
|
ssh <node> 'sudo systemctl reload nginx'
|
||||||
|
```
|
||||||
|
|
||||||
|
## DNS-01 prerequisites
|
||||||
|
|
||||||
|
`hook.sh` does `nsupdate` against the bind-acme server (referenced
|
||||||
|
by `letsencrypt/acme_node`). For the challenge to succeed:
|
||||||
|
|
||||||
|
1. The acme node must be in the same metadata graph (so
|
||||||
|
`bw metadata <node> -k letsencrypt/acme_node` resolves).
|
||||||
|
2. **All NS servers** for the validated domain must serve the
|
||||||
|
`_acme-challenge.<domain>` CNAME — Let's Encrypt validates from
|
||||||
|
primary AND secondary geographic regions; both authoritative
|
||||||
|
servers must agree. If a secondary NS is also a bw-managed node,
|
||||||
|
`bw apply` it after adding the domain (see e.g. `ovh.secondary`).
|
||||||
|
3. The bind-acme node's TSIG key must be reachable. `hook.sh` is
|
||||||
|
rendered with the bind-acme server's `network/internal/ipv4` —
|
||||||
|
for clients outside that LAN, the route must exist (typically via
|
||||||
|
wireguard `s2s` peer membership).
|
||||||
|
|
||||||
|
## Negative-cache penalty
|
||||||
|
|
||||||
|
If the first DNS-01 attempt fails (e.g. zone not yet applied to the
|
||||||
|
secondary NS), Let's Encrypt's resolvers cache NXDOMAIN for the SOA's
|
||||||
|
negative TTL (often 900s = 15 min). Subsequent attempts during that
|
||||||
|
window also fail and refresh the cache. Combined with LE's rate limit
|
||||||
|
of **5 failed authorisations per domain per hour**, recovery requires
|
||||||
|
you to **stop retrying** for ~15 minutes after fixing the DNS, then
|
||||||
|
make at most one attempt.
|
||||||
|
|
||||||
|
## nsupdate sample
|
||||||
|
|
||||||
|
For interactive testing of the bind-acme TSIG path:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
printf "server 127.0.0.1
|
||||||
|
zone acme.resolver.name.
|
||||||
|
update add _acme-challenge.ckn.li.acme.resolver.name. 600 IN TXT \"hello\"
|
||||||
|
send
|
||||||
|
" | nsupdate -y hmac-sha512:acme:<TSIG_KEY_REDACTED>
|
||||||
|
```
|
||||||
|
```
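
The wait described under "Negative-cache penalty" in the README draft above is simple arithmetic: the last failed attempt plus the SOA negative TTL. A rough sketch, assuming the 900 s figure from the draft as the default; `earliest_retry` is a hypothetical helper, and the real TTL should be read from the zone's SOA record rather than taken from this default.

```python
from datetime import datetime, timedelta


def earliest_retry(last_failed_attempt, negative_ttl_seconds=900):
    """Return the earliest time a new DNS-01 attempt is worth making.

    Hypothetical helper: assumes every failed attempt refreshes the
    negative cache, as the README draft describes, so the clock starts
    at the most recent failure.
    """
    return last_failed_attempt + timedelta(seconds=negative_ttl_seconds)


last_failure = datetime(2024, 1, 1, 12, 0, 0)  # example timestamp
print(earliest_retry(last_failure).isoformat())
```

Passing the zone's actual negative TTL as `negative_ttl_seconds` keeps the estimate honest when the SOA differs from the 15-minute example.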
|
||||||
|
|
||||||
|
### Commit 10 — bind README (Gap 11, reframed)
|
||||||
|
|
||||||
|
Create `bundles/bind/README.md`. Frame as the workflow rule, not the
|
||||||
|
absolute "not AXFR" claim.
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
# bind
|
||||||
|
|
||||||
|
Authoritative DNS — primary plus optional `bind/master_node` slaves.
|
||||||
|
|
||||||
|
## Applying changes needs both nodes
|
||||||
|
|
||||||
|
The slave's bw-managed zone files are rendered from the master's
|
||||||
|
metadata at slave-apply time (see `bundles/bind/items.py:100`). When
|
||||||
|
you change a record on the master (adding a `letsencrypt/domains`
|
||||||
|
entry, a new vhost, etc.), the change is only published once you
|
||||||
|
apply BOTH:
|
||||||
|
|
||||||
|
```sh
|
||||||
|
bw apply htz.mails # primary (where the source records live)
|
||||||
|
bw apply ovh.secondary # secondary (renders its own zone files)
|
||||||
|
```
|
||||||
|
|
||||||
|
Until both have been applied, `bw verify ovh.secondary` will show
|
||||||
|
stale zones and consumers that hit the secondary (Let's Encrypt's
|
||||||
|
secondary-region validators in particular) will see NXDOMAIN. Even
|
||||||
|
though the slave's named.conf.local declares `type slave;`, don't
|
||||||
|
rely on bind's own AXFR catching up — the bw-rendered file on disk
|
||||||
|
is what `bw verify` measures.
|
||||||
|
|
||||||
|
## See also
|
||||||
|
|
||||||
|
- `bundles/bind-acme/` — the in-house ACME-update receiver.
|
||||||
|
- `bundles/letsencrypt/README.md` — DNS-01 prerequisites and the
|
||||||
|
negative-cache penalty (the most common operational consequence of
|
||||||
|
forgetting to apply the secondary).
|
||||||
|
```
|
||||||
|
|
||||||
|
### Commit 11 — nginx README (Gap 12)
|
||||||
|
|
||||||
|
Create `bundles/nginx/README.md`.
|
||||||
|
|
||||||
|
```markdown
|
||||||
|
# nginx
|
||||||
|
|
||||||
|
Webserver. Per-node vhosts in `nginx/vhosts`; per-vhost templates in
|
||||||
|
`data/nginx/*.conf`.
|
||||||
|
|
||||||
|
## How port 80 is served
|
||||||
|
|
||||||
|
The bundle ships a fixed `80.conf` to
|
||||||
|
`/etc/nginx/sites-available/80.conf` (picked up by the
|
||||||
|
`sites-enabled/` symlink) that handles **all** port-80 traffic
|
||||||
|
across vhosts:
|
||||||
|
|
||||||
|
1. ACME HTTP-01 challenges (`/.well-known/acme-challenge/`) are
|
||||||
|
served from `/var/lib/dehydrated/acme-challenges/`.
|
||||||
|
2. All other port-80 requests are 301-redirected to
|
||||||
|
`https://$host$request_uri`.
|
||||||
|
|
||||||
|
Per-vhost templates only declare `listen 443 ssl http2;`, so they
|
||||||
|
don't need their own port-80 server blocks. If you need vhost-
|
||||||
|
specific port-80 behaviour (e.g. plain-HTTP without redirect), you'll
|
||||||
|
need to override 80.conf or add a per-vhost block.
|
||||||
|
|
||||||
|
## Required metadata
|
||||||
|
|
||||||
|
- `vm/cores` — read directly by `items.py` for `worker_processes`.
|
||||||
|
No default; `bw items <node>` raises at item-build time if missing.
|
||||||
|
Typically supplied by the `vm` bundle / hetzner-vm group; double-
|
||||||
|
check on bare-metal hosts.
|
||||||
|
- `nginx/vhosts` — dict of vhost-name → vhost-config.
|
||||||
|
- `nginx/modules` — list of dynamic modules to load.
|
||||||
|
|
||||||
|
## Cross-namespace
|
||||||
|
|
||||||
|
`items.py` reads `letsencrypt/domains` to skip emitting a per-vhost
|
||||||
|
HTTPS block when LE hasn't declared the domain yet — keeps the bundle
|
||||||
|
loadable on a node where letsencrypt isn't fully wired up.
|
||||||
|
```
|
||||||
|
|
||||||
|
## Out of scope
|
||||||
|
|
||||||
|
- Bundle behaviour changes. Pure docs.
|
||||||
|
- `bw apply` / `bw run`.
|
||||||
|
- Reformatting the existing two-line bundle READMEs into the new
|
||||||
|
shape — bundles/AGENTS.md explicitly says don't do that
|
||||||
|
("uneven quality is part of what we accept in exchange for not
|
||||||
|
blocking other work").
|
||||||
|
|
||||||
|
## Constraints
|
||||||
|
|
||||||
|
- Don't echo decrypted secrets. The TSIG-key example in the
|
||||||
|
letsencrypt nsupdate snippet uses `<TSIG_KEY_REDACTED>`.
|
||||||
|
- After each commit, `.venv/bin/bw test` must pass.
|
||||||
|
- No push.
|
||||||
5
groups/applications/left4me.py
Normal file
|
|
@ -0,0 +1,5 @@
|
||||||
|
{
|
||||||
|
'bundles': {
|
||||||
|
'left4me',
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
@ -81,6 +81,12 @@ This loader shape has consequences:
|
||||||
These are intentional parks/buffers, not bugs.
|
These are intentional parks/buffers, not bugs.
|
||||||
- **`id` must be unique.** A pre-apply hook (`hooks/unique_node_ids.py`)
|
- **`id` must be unique.** A pre-apply hook (`hooks/unique_node_ids.py`)
|
||||||
enforces this; duplicate IDs fail `bw test` and `bw apply`.
|
enforces this; duplicate IDs fail `bw test` and `bw apply`.
|
||||||
|
- **Bloated per-node metadata is usually a bundle smell.** If a
|
||||||
|
bundle's metadata block in the node file has more than 3-5 keys,
|
||||||
|
the bundle is probably under-using `defaults` / reactors. Push the
|
||||||
|
contribution into the bundle (see
|
||||||
|
[`bundles/AGENTS.md`](../bundles/AGENTS.md#conventions)) rather than
|
||||||
|
growing the node file.
|
||||||
|
|
||||||
## See also
|
## See also
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -233,6 +233,7 @@
|
||||||
'10.0.229.0/24',
|
'10.0.229.0/24',
|
||||||
],
|
],
|
||||||
},
|
},
|
||||||
|
'ovh.left4me': {},
|
||||||
},
|
},
|
||||||
'clients': {
|
'clients': {
|
||||||
'macbook': {
|
'macbook': {
|
||||||
|
|
|
||||||
|
|
@ -1,15 +1,21 @@
|
||||||
{
|
{
|
||||||
'hostname': '141.95.32.8',
|
'hostname': '141.95.32.8',
|
||||||
'username': 'debian',
|
|
||||||
'groups': [
|
'groups': [
|
||||||
|
'backup',
|
||||||
'debian-13',
|
'debian-13',
|
||||||
|
'left4me',
|
||||||
'monitored',
|
'monitored',
|
||||||
|
'webserver',
|
||||||
],
|
],
|
||||||
'bundles': [
|
'bundles': [
|
||||||
#'wireguard',
|
'wireguard',
|
||||||
],
|
],
|
||||||
'metadata': {
|
'metadata': {
|
||||||
'id': '14d2abc-3855-4bb7-99e2-d4e3eb0344dd',
|
'id': '14d2abc-3855-4bb7-99e2-d4e3eb0344dd',
|
||||||
|
'vm': {
|
||||||
|
'cores': 4, # 4 physical, 8 with HT
|
||||||
|
'threads': 8,
|
||||||
|
},
|
||||||
'network': {
|
'network': {
|
||||||
'external': {
|
'external': {
|
||||||
'interface': 'enp3s0f0',
|
'interface': 'enp3s0f0',
|
||||||
|
|
@ -35,5 +41,12 @@
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
|
'left4me': {
|
||||||
|
'domain': 'left4.me',
|
||||||
|
# Both HT siblings of physical core 0 (cpu0+cpu4 per
|
||||||
|
# /sys/devices/system/cpu/cpu0/topology/thread_siblings_list).
|
||||||
|
# Keeps system work off the physical cores running game ticks.
|
||||||
|
'system_cpus': {0, 4},
|
||||||
|
},
|
||||||
},
|
},
|
||||||
}
|
}
|
||||||
|
|
|
||||||