diff --git a/docs/superpowers/specs/2026-05-15-handoff-deployment-responsibility.md b/docs/superpowers/specs/2026-05-15-handoff-deployment-responsibility.md
new file mode 100644
index 0000000..d9b6c43
--- /dev/null
+++ b/docs/superpowers/specs/2026-05-15-handoff-deployment-responsibility.md
@@ -0,0 +1,238 @@
+# Handoff — brainstorm deployment responsibility (left4me vs. ckn-bw)
+
+## Status
+
+Queued for a future session, **after the uid-collapse refactor lands**
+(`docs/superpowers/plans/2026-05-15-uid-collapse.md`). This is a
+framing doc for a brainstorming session, not an implementation plan.
+The brainstorming session should use `superpowers:brainstorming` and
+exit with a design doc; implementation follows separately.
+
+## The question
+
+How should left4me and ckn-bw split responsibility for the host's
+deployment?
+
+**Not a fresh question.** The original deployment design at
+`docs/superpowers/specs/2026-05-06-left4me-deployment-design.md`
+already laid out the canonical shape: `deploy/files/` in the left4me
+repo mirrors target filesystem paths for root-owned deployment
+artifacts (systemd units, sudoers, helpers, env templates);
+"production config management can own both env files directly"
+(line 91). The implicit model: **left4me defines the deployment
+artifacts; ckn-bw integrates them onto the host.** That spec also
+defined a self-contained `deploy/deploy-test-server.sh` so the
+deployment could be exercised without ckn-bw at all.
+
+Over time, more and more of those artifacts migrated *into* ckn-bw's
+`bundles/left4me/` — specifically:
+- systemd unit definitions are now emitted by the
+  `systemd/units` reactor in `~/Projekte/ckn-bw/bundles/left4me/metadata.py`
+  (the hardening refactor we just landed reinforced this).
+- sysctl options ended up in ckn-bw `bundles/left4me/metadata.py`
+  `defaults` (just landed too).
+- sudoers exists in *both* repos (left4me `deploy/files/.../sudoers.d/left4me`
+  plus a verbatim ckn-bw mirror).
+- Privileged helpers moved BACK to left4me as part of deploy-dir-rethink
+  (commit `5284e28`) — `scripts/{libexec,sbin}/`. Pattern works:
+  left4me defines, ckn-bw deploys via `install_left4me_scripts`.
+
+So the trajectory has been mixed: helpers re-converged on left4me
+(good, matches 2026-05-06); systemd units and sysctl drifted into
+ckn-bw (away from 2026-05-06). The brainstorm reconciles this.
+
+**The question**: should we return to the 2026-05-06 model
+end-to-end — every deployment artifact lives in left4me's
+`deploy/files/`, ckn-bw becomes a thin integrator — or is the
+current mixed shape the right answer for some artifact classes?
+
+## Operator's leaning
+
+Security-related artifacts belong **in the left4me repo**, owned by
+the project; ckn-bw is responsible for **integrating** them into the
+host (deploying them to the right paths, restarting affected units,
+etc.) but doesn't *author* them.
+
+Concretely, the operator's preference (from session
+2026-05-15): "security-related stuff should be bundled in this repo
+and ckn-bw is responsible for integrating it into the server."
+
+## Why we're doing this
+
+Background from the hardening-refactor session
+(`docs/superpowers/specs/2026-05-15-hardening-refactor-design.md`,
+"Approach" section). We considered two shapes for the hardening
+landing:
+
+- **A** — hardening directives inline in ckn-bw's `systemd/units`
+  reactor (the path we took)
+- **B** — hardening as drop-in `.conf` files living in left4me's
+  `deploy/files/etc/systemd/system/.d/`, ckn-bw deploys them
+  (consistent with 2026-05-06's `deploy/files/` model)
+
+We picked A for the hardening refactor because B implied a broader
+configmgmt responsibility reshape that deserved its own session.
+That session is this one.
+
+The motivating arguments for B (this brainstorming session evaluates
+them seriously):
+
+1. **Hardening is application knowledge.** Knowing srcds is i386,
+   that `MemoryDenyWriteExecute=true` breaks Source's text
+   relocations, that web's sudo path is incompatible with
+   `PrivateUsers=true` — all of this is left4me's domain, not
+   ckn-bw's. ckn-bw shouldn't need to understand the threat model.
+2. **Test-artifact = production-artifact.** The Test 7 drop-in from
+   the hardening test plan literally is the file we'd want
+   deployed. With B, there's no translation step.
+3. **Repo self-containment for security review.** A reviewer of
+   left4me sees the threat model in code form without needing to
+   read the configmgmt repo.
+4. **Easier coordination with the `build-overlay-unit` refactor**
+   (queued). That unit's hardening profile can ship in its own
+   drop-in inline with the unit template.
+
+The counter-argument:
+
+- **Coupling cost.** A change to a directive may require redeploying
+  via ckn-bw, which means a cross-repo coordination cycle (edit
+  left4me → commit → push → ckn-bw `bw apply`). Today the same is
+  true (edit ckn-bw → push → apply); only *which* repo changes.
+
+## What "security-related" likely means
+
+Enumerate during the brainstorm. Initial candidates:
+
+- **systemd unit hardening directives** — currently in
+  ckn-bw `bundles/left4me/metadata.py` `HARDENING_COMMON` /
+  `HARDENING_SERVER` / `HARDENING_WEB`. Strong candidate for left4me.
+- **sysctl drop-ins** — currently `kernel.yama.ptrace_scope=2` in
+  ckn-bw's left4me bundle `defaults` (`sysctl/kernel/yama/ptrace_scope`).
+  Strong candidate for left4me.
+- **sudoers** — already in `left4me/deploy/files/etc/sudoers.d/left4me`,
+  plus a verbatim mirror in `ckn-bw/bundles/left4me/files/etc/sudoers.d/left4me`.
+  Already mostly left4me-owned; redundancy worth resolving.
+- **Privileged helper scripts** — already in `left4me/scripts/{libexec,sbin}/`,
+  ckn-bw deploys them via `install_left4me_scripts`. Already
+  left4me-owned. The pattern works.
+- **systemd unit BASE definitions** (`User=`, `ExecStart=`, `Restart=`,
+  resource limits) — currently in ckn-bw's reactor. **Open question:**
+  is this application knowledge or infrastructure knowledge? They
+  depend on the application's binary paths, env files, restart
+  semantics — all application knowledge. Probably also belongs to
+  left4me.
+- **AppArmor profiles** (if we add them later — deferred from the
+  defenses survey). Application knowledge.
+- **`/etc/left4me/host.env` / `web.env` templating** — ckn-bw owns
+  these today because they're templated via mako from node metadata
+  (per-host overrides). Probably stays in ckn-bw.
+- **User/group creation** — kernel-side infrastructure, no
+  application knowledge needed. Stays in ckn-bw.
+- **Package installation** (apt). Stays in ckn-bw.
+- **Firewall rules** — depend on per-instance port ranges
+  (`LEFT4ME_PORT_RANGE_*`); could be either. Worth discussing.
+- **Nginx vhost** — same: depends on app-specific routes.
+
+## Mechanism: how does ckn-bw "integrate"?
+
+Brainstorm the deploy mechanism. Candidates (already partially
+sketched in the hardening-refactor design doc's earlier draft, before
+it was reverted to the inline-in-reactor approach):
+
+- **Symlinks.** ckn-bw creates symlinks like
+  `/etc/systemd/system/left4me-server@.service.d/10-hardening.conf`
+  → `/opt/left4me/src/deploy/files/etc/systemd/system/.../10-hardening.conf`.
+  Editing the file in the repo and running `systemctl daemon-reload`
+  picks it up. Cleanest for "ckn-bw doesn't author."
+- **File copy via `files` entries.** ckn-bw `files = {...}` reads
+  from `/opt/left4me/src/deploy/files/...` (post-git_deploy) and
+  copies to the target. Standard idiom. Two-place state.
+- **Glob-walker action.** A small ckn-bw action walks the `deploy/files/`
+  tree and mirrors its paths onto the target root filesystem.
+- **Bundle inclusion / left4me-as-bundle.** Left4me's `deploy/`
+  becomes its own bundlewrap bundle that ckn-bw imports. Strongest
+  decoupling; requires bundlewrap bundle conventions.
+
+Each has different implications for: triggers (which units restart
+when which files change), drift detection, rollback semantics.
+
+## Migration / coexistence path
+
+Brainstorm: how do we get from the current state to the new state
+without breaking things?
+
+- Inventory: every artifact ckn-bw currently emits/ships for left4me
+  (the `systemd/units` reactor entries, sysctl defaults, sudoers
+  mirror, file deploy actions, etc.).
+- For each: stays, moves, or splits (some in each).
+- Mechanism rollout: pick one (symlinks vs. file copy vs. ...) and
+  apply it consistently.
+- Test-driven: pick one artifact as the canary (probably the sysctl
+  drop-in — smallest), validate the mechanism end-to-end, then
+  migrate the others.
+
+## Key sub-questions for the brainstorm
+
+1. **Is the unit's BASE definition application knowledge?** If yes,
+   ckn-bw's `systemd/units` reactor shrinks dramatically — to maybe
+   one line per unit ("ckn-bw, deploy this file as a unit"). If no,
+   we have a more delicate split.
+2. **What about the user/group definitions?** Infrastructure-side
+   today. But the application defines that `left4me` (uid 980)
+   exists; ckn-bw just creates it. Could move.
+3. **Per-host configuration** (gunicorn worker count, port ranges,
+   CPU pinning): these are per-host overrides ckn-bw computes from
+   node metadata. Stays in ckn-bw (or whatever owns deployment-time
+   parameterization).
+4. **Test infrastructure**: `deploy/tests/test_deploy_artifacts.py`
+   asserts left4me's reference units match the deployed form. If
+   left4me starts owning the deployed form, those tests get
+   stronger (no longer "reference vs. live" drift; the file in
+   `deploy/files/` *is* the live form).
+5. **Drift / observability**: how do we know the deployed state
+   matches the repo? Today `bw apply` plus `git diff` is the source
+   of truth. Same applies; mechanism details vary.
+6. **Rollback semantics**: removing a drop-in is one `rm` away; the
+   base unit is preserved. Same applies to reverting the
+   left4me-side commit and re-applying.
+
+## Prereqs (must land before this brainstorming session)
+
+- **uid-collapse refactor** — queued in
+  `docs/superpowers/plans/2026-05-15-uid-collapse.md`. Settles the
+  user model first so the deployment-responsibility brainstorm
+  doesn't have to juggle a moving user definition.
+
+## Out of scope for the brainstorm
+
+- The hardening composition itself (already settled, deployed,
+  verified).
+- The `build-overlay-unit` template unit refactor
+  (`docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md`)
+  — both this brainstorm *and* the build-overlay-unit refactor
+  benefit from settling responsibility first. Sequencing TBD; the
+  brainstorm should consider whether to land before or after
+  build-overlay-unit.
+- The application code itself (`l4d2web`, `l4d2host`) — that's
+  always been left4me-owned.
+
+## Pointers
+
+- **Original deployment design (the model to revisit):**
+  `docs/superpowers/specs/2026-05-06-left4me-deployment-design.md`
+- Hardening refactor design (motivation; the deferred reshape):
+  `docs/superpowers/specs/2026-05-15-hardening-refactor-design.md`
+- Hardening refactor plan (what got landed):
+  `docs/superpowers/plans/2026-05-15-hardening-refactor.md`
+- Defenses survey (mentions AppArmor, deferred):
+  `docs/superpowers/specs/2026-05-15-hardening-defenses-survey.md`
+- Test plan and executed results:
+  `docs/superpowers/specs/2026-05-15-hardening-test-plan.md`
+- uid-collapse plan (prereq):
+  `docs/superpowers/plans/2026-05-15-uid-collapse.md`
+- deploy-dir-rethink (recent reshape that moved scripts into left4me;
+  background on the current `deploy/` tree):
+  `docs/superpowers/plans/2026-05-15-deploy-dir-rethink.md` (or
+  `2026-05-15-deploy-dir-rethink-design.md`)
+- Live ckn-bw bundle (the thing being rethought):
+  `~/Projekte/ckn-bw/bundles/left4me/`
diff --git a/docs/superpowers/specs/2026-05-15-session-handoff.md b/docs/superpowers/specs/2026-05-15-session-handoff.md
index a891fa0..bec647a 100644
--- a/docs/superpowers/specs/2026-05-15-session-handoff.md
+++ b/docs/superpowers/specs/2026-05-15-session-handoff.md
@@ -100,7 +100,9 @@ Operator picked C.
   (`docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md`).
   Sequenced after this; will inherit `User=left4me` cleanly.
 - **Broader configmgmt responsibility reshape** (drop-ins owned by
-  left4me, ckn-bw as thin file-shipper). Deliberately deferred.
+  left4me, ckn-bw as thin file-shipper). Framed as a brainstorming
+  session at `docs/superpowers/specs/2026-05-15-handoff-deployment-responsibility.md`;
+  sequenced after uid-collapse lands.
 - **Stale RCON port app bug** flagged in the earlier executor's
   handoff. Separate scope.
 - Renaming `left4me` to anything else. Cosmetic.