diff --git a/docs/superpowers/plans/2026-05-15-deploy-dir-rethink.md b/docs/superpowers/plans/2026-05-15-deploy-dir-rethink.md
new file mode 100644
index 0000000..35e3558
--- /dev/null
+++ b/docs/superpowers/plans/2026-05-15-deploy-dir-rethink.md
@@ -0,0 +1,198 @@
+# Deploy-dir architecture rethink — implementation plan
+
+## Context
+
+Resolves the open questions in `docs/superpowers/specs/2026-05-15-deploy-dir-rethink-design.md`. After the 2026-05-15 script-consolidation work, `deploy/` ended up half-canonical / half-historical: the privileged scripts were treated as load-bearing source-of-truth there, while sudoers/sysctl/env-templates stayed duplicated against ckn-bw, and the obsolete `deploy-test-server.sh` plus a pile of dead static unit files lingered. The shape worked but couldn't be described in two sentences.
+
+This plan commits to the framing the user picked: **`deploy/` is a reference exemplar** — readable enough that a fresh consumer (ckn-bw today, hypothetical docker/ansible/manual tomorrow) could build a deployment from it, but not the live source of truth for installed binaries. The privileged scripts are **application-inherent code** and move out of `deploy/` to top-level `scripts/{libexec,sbin}/`. Dead code is deleted in the same pass. ckn-bw is updated to read scripts from the new location. The intended outcome: `deploy/` shrinks to README + example configs + a couple of curated example units, the rules for "what goes here" fit in two sentences, and the cross-repo install path becomes self-explanatory.
+
+## End state
+
+```
+left4me/
+  scripts/
+    libexec/
+      left4me-overlay              # 244-line Python helper (mount/umount)
+      left4me-script-sandbox       # 109-line bash (systemd-run sandbox)
+      left4me-systemctl            # 44-line sh wrapper
+      left4me-journalctl           # 53-line sh wrapper
+    sbin/
+      left4me                      # 17-line admin CLI wrapper
+    tests/
+      test_overlay.py
+      test_script_sandbox.py
+      test_systemctl_helper.py
+      test_journalctl_helper.py
+      test_sudoers_grants.py       # tests the contract between scripts and sudoers
+  deploy/                          # REFERENCE ONLY — see deploy/README.md
+    README.md                      # rewritten: explains target layout, points at scripts/
+    files/
+      etc/
+        sudoers.d/left4me          # example, ckn-bw ships its own verbatim copy
+        sysctl.d/99-left4me.conf   # example
+        left4me/sandbox-resolv.conf   # example
+      usr/local/lib/systemd/system/
+        left4me-server@.service    # curated example of what ckn-bw's reactor emits
+        left4me-web.service        # curated example
+        left4me-workshop-refresh.service   # curated example
+        left4me-workshop-refresh.timer     # curated example
+        l4d2-game.slice            # curated example
+        l4d2-build.slice           # curated example
+    templates/etc/left4me/
+      host.env                     # example, ckn-bw renders its own mako version
+      web.env.template
+    tests/
+      test_example_units.py        # slimmed: just locks down the curated examples
+  l4d2host/                        # unchanged
+  l4d2web/                         # unchanged
+  docs/
+```
+
+## Step-by-step
+
+### 1. Create `scripts/` and move helpers
+
+- `mkdir -p scripts/libexec scripts/sbin scripts/tests`
+- `git mv` the four live helpers and the admin CLI:
+  - `deploy/files/usr/local/libexec/left4me/left4me-overlay` → `scripts/libexec/left4me-overlay`
+  - `deploy/files/usr/local/libexec/left4me/left4me-script-sandbox` → `scripts/libexec/left4me-script-sandbox`
+  - `deploy/files/usr/local/libexec/left4me/left4me-systemctl` → `scripts/libexec/left4me-systemctl`
+  - `deploy/files/usr/local/libexec/left4me/left4me-journalctl` → `scripts/libexec/left4me-journalctl`
+  - `deploy/files/usr/local/sbin/left4me` → `scripts/sbin/left4me`
+- The scripts' contents are unchanged. Every install-target path inside them (`/usr/local/libexec/left4me/...`, `/etc/left4me/...`, `/var/lib/left4me/...`) stays exactly as is — those are runtime paths, not source-tree paths.
+
+### 2. Delete dead code
+
+- `git rm` (truly obsolete; replacements live elsewhere or feature was retired):
+  - `deploy/files/usr/local/libexec/left4me/left4me-apply-cake` — CAKE migrated to systemd-networkd via `network//cake` node metadata in ckn-bw.
+  - `deploy/files/usr/local/lib/systemd/system/left4me-cake.service` — same reason.
+  - `deploy/files/etc/left4me/cake.env` — bandwidth lives in node metadata, not an env file.
+  - `deploy/files/usr/local/lib/systemd/system/left4me-nft-mark.service` — central `bundles/nftables/` consumes the rules now.
+  - `deploy/files/usr/local/lib/left4me/nft/left4me-mark.nft` — same. After the delete, the now-empty `deploy/files/usr/local/lib/left4me/` and its `nft/` child disappear (git doesn't track empty dirs).
+  - `deploy/deploy-test-server.sh` — superseded by `bw apply`; content survives in git history.
+- **Do NOT delete** `deploy/files/usr/local/lib/systemd/system/left4me-workshop-refresh.{service,timer}`. The workshop-refresh job is live (invokes `flask workshop-refresh`, defined in `l4d2web/cli.py`); ckn-bw's reactor emits these on production. They stay as curated examples, same category as `left4me-server@.service` / `left4me-web.service` / the slices. (This corrects the framing in `docs/superpowers/specs/2026-05-15-deploy-dir-rethink-design.md` and item 2 of `docs/superpowers/specs/2026-05-15-janitorial-cleanup.md`, both of which lumped workshop-refresh together with truly-dead units.)
+- Stale `__pycache__` dirs under `deploy/files/usr/local/libexec/left4me/` are deleted by the moves in step 1.
+
+### 3. Split and relocate `deploy/tests/test_deploy_artifacts.py`
+
+The current file (~880 lines) is doing four jobs. Split as follows; do not duplicate tests across files.
+
+**Concrete sequence to preserve git history where it counts**:
+
+1. `git mv deploy/tests/test_deploy_artifacts.py deploy/tests/test_example_units.py` — single rename, history follows via `git log --follow`.
+2. In the renamed file, delete every test except the "Keep in `deploy/tests/test_example_units.py`" list below. The kept tests track the unit/sysctl/env-template examples, which is what `deploy/tests/` will mean afterwards.
+3. Create new `scripts/tests/*.py` files (and `conftest.py`) by writing them fresh — pasting the relevant test functions across. The extracted tests lose direct rename history, but blame against the new files still resolves to the originals one git ref back; acceptable tradeoff.
+
+**Move to `scripts/tests/`** (tests of script behavior + the sudoers contract that gates the scripts):
+
+- `scripts/tests/test_overlay.py` — `test_overlay_helper_is_python_with_strict_validation`, `test_overlay_helper_mount_is_idempotent_when_already_mounted`
+- `scripts/tests/test_script_sandbox.py` — `test_script_sandbox_helper_present`, `test_script_sandbox_helper_passes_shell_syntax_check`, `test_script_sandbox_helper_invokes_systemd_run_with_hardening`, `test_script_sandbox_uses_idmap_staging`, `test_script_sandbox_in_build_slice_with_oom_adjust`, `test_script_sandbox_helper_validates_overlay_id`, `test_script_sandbox_helper_dry_run_mode`
+- `scripts/tests/test_systemctl_helper.py` — `test_systemctl_helper_passes_shell_syntax_check_and_rejects_bad_args`
+- `scripts/tests/test_journalctl_helper.py` — `test_journalctl_helper_passes_shell_syntax_check_and_rejects_bad_args`
+- `scripts/tests/test_helpers_use_fixed_paths.py` — `test_helpers_use_fixed_system_tool_paths_not_sudo_path`
+- `scripts/tests/test_sudoers_grants.py` — `test_sudoers_allows_only_left4me_helpers_not_raw_system_tools` (still reads `deploy/files/etc/sudoers.d/left4me` as the canonical example; comment why)
+
+The `ROOT/DEPLOY` path-prefix constants in each file get rewritten so `SCRIPTS = Path(__file__).resolve().parents[2] / "scripts"` and helpers resolve to `SCRIPTS / "libexec/left4me-overlay"` etc. Shared helpers (`_fake_command`, `_env_with_fake_commands`) move into `scripts/tests/conftest.py`.
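The path-constant rewrite described above could look roughly like the following. This is a minimal sketch, assuming the test modules live at `<repo>/scripts/tests/<name>.py`; the function names (`repo_root`, `scripts_dir`, `helper_path`, `sudoers_example`) are illustrative assumptions and do not come from the existing test file.

```python
# Hypothetical sketch of the path constants for scripts/tests/.
# Function names here are illustrative assumptions, not taken from the repo.
from pathlib import Path


def repo_root(test_file: Path) -> Path:
    """Repo root for a file located at <root>/scripts/tests/<name>.py."""
    # parents[0] = scripts/tests, parents[1] = scripts, parents[2] = <root>
    return test_file.resolve().parents[2]


def scripts_dir(test_file: Path) -> Path:
    """The directory the plan assigns to the SCRIPTS constant."""
    return repo_root(test_file) / "scripts"


def helper_path(test_file: Path, name: str) -> Path:
    """Source-tree path of a privileged helper, e.g. left4me-overlay."""
    return scripts_dir(test_file) / "libexec" / name


def sudoers_example(test_file: Path) -> Path:
    """The sudoers example stays under deploy/, across the directory boundary."""
    return repo_root(test_file) / "deploy" / "files" / "etc" / "sudoers.d" / "left4me"
```

In a real test module the constant would presumably be computed once at import time, for example `SCRIPTS = scripts_dir(Path(__file__))`, so that individual tests only join helper names onto it.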
+
+**Keep in `deploy/tests/test_example_units.py`** (locks down the curated examples; renamed from the current file):
+
+- `test_global_unit_files_exist_at_product_level_paths`
+- `test_web_unit_contains_required_runtime_contract`
+- `test_server_unit_contains_required_runtime_contract`
+- `test_server_unit_mounts_overlay_via_exec_start_pre`
+- `test_server_unit_unmounts_overlay_via_exec_stop_post`
+- `test_server_unit_contains_perf_baseline_directives`
+- `test_l4d2_game_slice_exists_with_high_weights`
+- `test_l4d2_build_slice_exists_with_low_weights`
+- `test_sysctl_conf_present_with_perf_settings`
+- `test_env_templates_contain_required_defaults`
+- `test_sandbox_resolv_conf_exists`
+
+Add a top-of-file docstring: *"These tests lock down the curated examples kept in `deploy/files/` for reference. The production units are emitted by ckn-bw's reactor in `bundles/left4me/metadata.py`; when reactor output drifts intentionally, update the examples here too."*
+
+**Delete entirely** (target removed or no longer load-bearing):
+
+- All `test_deploy_script_*` tests (12 tests; `deploy-test-server.sh` is gone)
+- `test_globals_refresh_units_removed` — file already deleted; nothing to lock down
+- `test_nft_mark_file_marks_left4me_udp_with_dscp_ef_and_priority`, `test_nft_mark_unit_loads_and_clears_left4me_table` — nft-mark moved to central nftables bundle
+- `test_cake_env_template_documents_required_knobs`, `test_apply_cake_helper_supports_apply_and_clear_modes`, `test_apply_cake_helper_passes_shell_syntax_check`, `test_cake_unit_runs_helper_in_apply_and_clear_modes` — CAKE moved to systemd-networkd
+- `test_deploy_script_installs_overlay_helper_with_executable_mode`, `test_deploy_script_installs_script_sandbox_helper` — install responsibility now lives in ckn-bw's bundle, not in any left4me-side script
+
+Final file count: `scripts/tests/` gets 6 test files plus `conftest.py`, `deploy/tests/test_example_units.py` is one file, `deploy/tests/test_deploy_artifacts.py` is gone (renamed).
+
+### 4. Rewrite `deploy/README.md`
+
+Reframe the top of the file as: *"This directory is a reference exemplar. The canonical deploy is [ckn-bw](https://git.sublimity.de/cronekorkn/ckn-bw)'s `bundles/left4me/` (run `bw apply ovh.left4me`). Files under `deploy/files/` and `deploy/templates/` are readable examples — not the binaries / configs ckn-bw actually installs. Read them to understand the target layout if you're building a fresh deployment by other means."*
+
+Update the file/status table:
+
+- Drop rows for files that no longer exist (apply-cake, cake.service, cake.env, nft-mark.*). Keep the workshop-refresh.* rows; those units stay as curated examples (see step 2).
+- Drop the `deploy-test-server.sh` row.
+- For the privileged-scripts rows, change `files/usr/local/libexec/left4me/...` → `(moved to scripts/libexec/, installed by ckn-bw's install_left4me_scripts action)`; same for the sbin row.
+- Mark the remaining `files/etc/...` and `files/usr/local/lib/systemd/system/...` entries explicitly as **example**: ckn-bw ships its own verbatim copies of the configs, its reactor emits the units.
+
+Keep the "Target Layout" / "Runtime User" / "Overlay References" / "Performance Tuning" sections — they're useful reference prose. Strip the "Running A Test Deployment" / "Admin Bootstrap" sections that refer to the deleted shell installer; replace with a one-paragraph pointer to ckn-bw.
+
+### 5. ckn-bw cross-repo update
+
+The `install_left4me_scripts` action in `bundles/left4me/items.py` currently reads from `/opt/left4me/src/deploy/files/usr/local/{libexec,sbin}/`. Update it to read from `/opt/left4me/src/scripts/{libexec,sbin}/`. The install target is unchanged (`/usr/local/libexec/left4me/`, `/usr/local/sbin/left4me`), so nothing on the deployed host moves.
+
+This is a separate PR in the ckn-bw repo. It must land **at the same time** as the left4me move — the install action depends on the source paths existing. Coordination:
+
+1. Open both PRs simultaneously.
+2. Merge order: left4me first (scripts exist at the new path in `/opt/left4me/src/` only after a fresh `git_deploy`), then ckn-bw, then `bw apply ovh.left4me`.
+3. Alternative: have the ckn-bw PR fall back to the old path if the new path doesn't exist (one extra glob); decide during ckn-bw review whether the complexity is worth the looser coupling. Default: no fallback, coordinate the merges.
+
+Verification on the deploy target: after `bw apply`, the files under `/usr/local/libexec/left4me/` and `/usr/local/sbin/left4me` should be byte-identical to before. Sudoers, services, the web app: all unchanged.
+
+### 6. Mark adjacent specs / docs as resolved
+
+- `docs/superpowers/specs/2026-05-15-deploy-dir-rethink-design.md`: prepend a `**Resolved 2026-05-15 by docs/superpowers/plans/….md.**` line at the top. Leave the body intact for archaeology.
+- `docs/superpowers/specs/2026-05-15-janitorial-cleanup.md`: cross out items 1, 5, 6 (now handled here). Item 2 needs a rewrite — the framing "all static unit files are obsolete drift" was wrong; the live reactor-emitted set (`server@`, `web`, `workshop-refresh.{service,timer}`, `l4d2-{game,build}.slice`) stays in `deploy/files/` as curated examples. The truly-dead two (`left4me-cake.service`, `left4me-nft-mark.service`) are already deleted by this plan, so item 2 collapses to "no remaining work."
+- No memory file changes needed; the project state captured here is structural and re-derivable from `deploy/README.md` after the rewrite lands.
+
+### 7. Rollback notes
+
+If `bw apply ovh.left4me` against the test server breaks something after the cross-repo merge:
+
+1. Revert the ckn-bw `install_left4me_scripts` action change to the old source path (`/opt/left4me/src/deploy/files/usr/local/{libexec,sbin}/`). Re-apply.
+2. The left4me side never needs reverting in isolation — the scripts at the new path are byte-identical to the old ones, so a stale ckn-bw install action against a *new* left4me checkout would fail at `install -t` (source path missing). That failure is loud and safe: nothing on the deployed system gets modified.
+3. The only foot-gun is **partial rollout**: ckn-bw updated but left4me not yet checked out at the right revision. The `git_deploy` step pins the revision, so as long as the two PRs reference compatible commits, the deployed `/opt/left4me/src/` always matches the action's expectation.
+
+## What does NOT change
+
+- Runtime install-target paths (`/usr/local/libexec/left4me/...`, `/usr/local/sbin/left4me`) — every reference inside `l4d2host/service_control.py:7-8`, `l4d2web/services/overlay_builders.py:34`, the sudoers file, and the systemd units stays the same.
+- The Python packages `l4d2host/` and `l4d2web/`.
+- ckn-bw's bundles for sudoers / sysctl / sandbox-resolv.conf — those keep their own verbatim copies (the user picked "deploy/ keeps configs as examples; duplication-with-ckn-bw is OK because deploy/ is explicitly reference"). Janitoring the duplication is *not* in scope for this plan.
+- The Mako env templates in ckn-bw — they stay where they are, since they need bw's metadata access for rendering.
+- The recent overlay-idmap / script-sandbox idmap-staging work — untouched.
+
+## Critical files (jump points for the implementor)
+
+- `deploy/tests/test_deploy_artifacts.py` — the source for the test split (lines 20-32 are the path constants; tests grouped roughly by helper from line 138 onward)
+- `deploy/README.md` — full rewrite of the top section, partial rewrite of the table
+- `l4d2host/service_control.py:7-8` — verify install-target paths unchanged (sanity)
+- `l4d2web/services/overlay_builders.py:34` — same
+- `deploy/files/etc/sudoers.d/left4me` — sanity-check that no path inside changed
+- `deploy/files/usr/local/lib/systemd/system/{left4me-server@.service,left4me-web.service,l4d2-{game,build}.slice}` — survive as curated examples
+- ckn-bw repo: `bundles/left4me/items.py` — the `install_left4me_scripts` action (separate PR)
+
+## Verification
+
+End-to-end:
+
+1. **Source-tree consistency.** `find scripts deploy -type f | sort` matches the layout in "End state" above (modulo `__pycache__`).
+2. **All tests pass locally.** From the repo root: `pytest scripts/tests/ deploy/tests/ l4d2host/tests/ l4d2web/tests/` — every test passes. Specifically verify `scripts/tests/test_sudoers_grants.py` still reads `deploy/files/etc/sudoers.d/left4me` correctly (path constant points across the dir boundary).
+3. **Shell syntax checks.** The split tests should still run `sh -n` / `bash -n` against the moved scripts; no script edits means no syntax regressions, but the test paths must resolve.
+4. **No accidental application breakage.** `grep -rn '/usr/local/libexec/left4me\|/usr/local/sbin/left4me' l4d2host l4d2web` returns the same hits as before (paths are install-target, source moves don't affect them).
+5. **ckn-bw dry-run.** Once the ckn-bw PR is up, `bw apply --dry-run ovh.left4me` from the ckn-bw repo: the diff should show **no changes** to files under `/usr/local/libexec/left4me/` or `/usr/local/sbin/left4me` (byte-identical content via the new path).
+6. **Production apply.** `bw apply ovh.left4me` against the real test server. After apply: `systemctl status left4me-web.service` is green, starting a game server via the web UI still works (overlay mount → srcds_run → unmount on stop), running an overlay build script through the sandbox still works.
+
+## Out of scope (handled elsewhere or deferred)
+
+- The Mako template duplication in ckn-bw — separate cleanup; the templates legitimately need bw's metadata access.
+- The 1/2/3-user uid-split decision — `docs/superpowers/specs/2026-05-15-user-uid-split-design.md`.
+- The script-sandbox → systemd template unit refactor — `docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md`.
+- Remaining janitorial items: item 3 (bubblewrap→systemd-run doc drift), item 4 (stale gameserver-side idmap binds), calendar reminder for SM 1.13 stable. Items 1, 2 (partial — see step 6), 5, 6 are subsumed here.
+- Rewriting the shell helpers in Python / packaging them as console_scripts — explicitly rejected in the recent script-consolidation plan (egg-info + TOCTOU privilege concerns).
+- Historical references inside `docs/superpowers/plans/*` and `docs/superpowers/specs/*` to `deploy/files/...` or `deploy-test-server.sh` paths. Those are time-stamped snapshots of past sessions; they don't get rewritten when the underlying tree moves.
diff --git a/docs/superpowers/specs/2026-05-15-deploy-dir-rethink-design.md b/docs/superpowers/specs/2026-05-15-deploy-dir-rethink-design.md
index 1b56866..b433cf5 100644
--- a/docs/superpowers/specs/2026-05-15-deploy-dir-rethink-design.md
+++ b/docs/superpowers/specs/2026-05-15-deploy-dir-rethink-design.md
@@ -1,5 +1,15 @@
 # Deploy directory architecture — open questions
 
+**Resolved 2026-05-15 by [`docs/superpowers/plans/2026-05-15-deploy-dir-rethink.md`](../plans/2026-05-15-deploy-dir-rethink.md).**
+Decision summary: `deploy/` is reference material; privileged scripts moved
+to top-level `scripts/{libexec,sbin}/`; `deploy-test-server.sh` deleted;
+dead static units (cake.service, nft-mark.service) deleted; reactor-emitted
+units (server@, web, workshop-refresh.{service,timer}, slices) retained as
+curated examples; ckn-bw `install_left4me_scripts` action repointed to the
+new source paths. Body below preserved for archaeology.
+
+---
+
 **Status: open questions, not a settled design.** This is a thinking-aloud
 handoff prompted by the script-consolidation change on 2026-05-15. Decisions
 deferred; a future session should pick this up, talk through the options,
diff --git a/docs/superpowers/specs/2026-05-15-janitorial-cleanup.md b/docs/superpowers/specs/2026-05-15-janitorial-cleanup.md
index 33a81dc..f04d826 100644
--- a/docs/superpowers/specs/2026-05-15-janitorial-cleanup.md
+++ b/docs/superpowers/specs/2026-05-15-janitorial-cleanup.md
@@ -7,9 +7,18 @@ self-contained. Knock them out individually or batch them into a single
 janitorial PR. None are urgent — the project works fine with all of these
 still present.
 
+> **2026-05-15 update**: items 1, 3, 4, and 5 resolved by
+> [`docs/superpowers/plans/2026-05-15-deploy-dir-rethink.md`](../plans/2026-05-15-deploy-dir-rethink.md).
+> Item 2 partially resolved by the same plan with a third option the
+> original enumeration didn't list: the truly-dead units (cake.service,
+> nft-mark.service) are deleted, the reactor-emitted set (server@, web,
+> workshop-refresh.{service,timer}, slices) stays as curated examples
+> under `deploy/files/`. Resolved items left in place below, marked
+> RESOLVED, for archaeology. Remaining live items: 6, 7, 8, 9, 10.
+
 ## Items
 
-### 1. `left4me-apply-cake` — dead code
+### 1. `left4me-apply-cake` — dead code [RESOLVED]
 
 **What**: `deploy/files/usr/local/libexec/left4me/left4me-apply-cake`
 (POSIX sh, ~47 lines) that applies/clears CAKE egress traffic
@@ -34,7 +43,19 @@ sudo find /var/lib/left4me /opt/left4me /usr/local -name 'left4me-apply-cake'
 # expect: empty after the rm
 ```
 
-### 2. Obsolete systemd unit files in `deploy/files/`
+### 2. Obsolete systemd unit files in `deploy/files/` [PARTIALLY RESOLVED]
+
+**Resolution path chosen**: third option not in the original enumeration —
+*only the truly-dead two* (`left4me-cake.service`, `left4me-nft-mark.service`)
+were deleted. The reactor-emitted set (`left4me-server@.service`,
+`left4me-web.service`, `left4me-workshop-refresh.{service,timer}`,
+`l4d2-game.slice`, `l4d2-build.slice`) is retained as **curated examples**
+under `deploy/files/`, locked down by `deploy/tests/test_example_units.py`.
+The framing in this item — "all six are equally drift" — was wrong: the
+reactor-emitted units carry useful signal as readable examples of what
+ckn-bw's `systemd_units` reactor emits at apply time. Original body below.
+
+
 
 **What**:
 - `deploy/files/usr/local/lib/systemd/system/left4me-cake.service`
@@ -65,7 +86,7 @@ they matter).
 **Verification**: `find deploy/files/usr/local/lib/systemd/system -type f`
 should match the README's "what's canonical" list.
 
-### 3. `deploy/files/etc/left4me/cake.env`
+### 3. `deploy/files/etc/left4me/cake.env` [RESOLVED]
 
 **What**: env file referenced by the obsolete `left4me-cake.service`.
 
@@ -75,7 +96,7 @@ read by anything live.
 
 **Action**: delete `deploy/files/etc/left4me/cake.env`.
 
-### 4. `deploy/files/usr/local/lib/left4me/nft/`
+### 4. `deploy/files/usr/local/lib/left4me/nft/` [RESOLVED]
 
 **What**: nftables fragment for `left4me-nft-mark.service`.
 
@@ -86,7 +107,11 @@ fragment isn't read.
 
 **Action**: delete `deploy/files/usr/local/lib/left4me/` recursively.
 
-### 5. `deploy-test-server.sh`'s fate
+### 5. `deploy-test-server.sh`'s fate [RESOLVED]
+
+**Resolution**: deleted entirely. Content survives in git history.
+
+
 
 **What**: `deploy/deploy-test-server.sh`, the historical one-shot bash
 deploy.