diff --git a/docs/superpowers/specs/2026-05-15-session-handoff.md b/docs/superpowers/specs/2026-05-15-session-handoff.md
index 812883d..0043f64 100644
--- a/docs/superpowers/specs/2026-05-15-session-handoff.md
+++ b/docs/superpowers/specs/2026-05-15-session-handoff.md
@@ -1,114 +1,178 @@
-# Session handoff — next: execute hardening test plan
+# Session handoff — next: write the hardening-refactor implementation plan

-Short handoff. Three new hardening specs landed today; the next session
-takes the test plan to `left4.me` and runs it. Decision on
-`2026-05-15-user-uid-split-design.md` is **deferred** until the test
-plan reports back.
+Short handoff. The hardening test plan was executed end-to-end on
+`left4.me` this session. Results are recorded inline in the spec at
+`docs/superpowers/specs/2026-05-15-hardening-test-plan.md` (commit
+`461b8d0`). The next session writes the implementation plan that lands
+the proven composition in ckn-bw.

-## What just landed
+## What just happened

-Three coordinated specs at `docs/superpowers/specs/`:
+Ran all 11 tests from the hardening test plan on
+`left4me-server@1` (canary) and `left4me-web` against the live host
+at `left4.me` / `left4me.ovh.ckn.li` (Debian 13, systemd 257). All
+drop-ins cleaned up at session end; the Test 9 sysctl
+(`kernel.yama.ptrace_scope=2`) is the one persistent host change.
+`gdb` + `seccomp` packages left installed.

-- `2026-05-15-hardening-threat-model.md` — assets, attackers (A1-A6),
-  trust boundaries (TB1-TB8), attack scenarios (S1-S6), what we
-  defend (D1-D7), what we accept losing.
-- `2026-05-15-hardening-defenses-survey.md` — full Linux + systemd
-  defense menu, per-defense primitive mapping, candidate composition
-  for `left4me-server@.service` + `left4me-web.service`.
-- `2026-05-15-hardening-test-plan.md` — 11 tests runnable cold on
-  `left4.me`; drop-in style so they never modify persistent units.
+Headline numbers:
+- `left4me-server@1.service`: **7.5 EXPOSED → 1.3 OK** (systemd-analyze)
+- `left4me-web.service`: **8.7 EXPOSED → 4.1 OK**
+- Test 8 attack matrix: all 8 vectors (D1.a/b/c, D2.a/b/c, D3, D5) blocked.

-## Why the shape changed (from uid-split → hardening)
+Three things the test surfaced that change what the refactor must look like:
+- **`SystemCallArchitectures=native x86`**, not bare `native`.
+  `srcds_linux` is 32-bit i386; with `native=AUDIT_ARCH_X86_64` only,
+  every i386 syscall is killed and srcds_run respawns every 10 s.
+- **Add `PrivatePIDs=true`** to the composition. `ProtectProc=invisible`
+  alone cannot hide gunicorn from srcds because they share uid 980;
+  PrivatePIDs gives each instance its own PID namespace and closes
+  D2.b without needing the uid split.
+- **Exclude `MemoryDenyWriteExecute=true`.** Source engine i386 `.so`
+  files have text relocations; MDW returns EPERM on the relocation
+  `mprotect`, dlopen aborts, srcds enters the respawn loop. Permanent
+  exclusion — not fixable without rebuilding Valve's closed-source
+  binary.

-The prior handoff pointed this session at the 1/2/3-user decision in
-`2026-05-15-user-uid-split-design.md`. Audit during this session
-established that the same-uid attack surface (DB readable from srcds,
-ptrace of gunicorn allowed, RCON passwords stored plaintext in DB,
-no `/proc` isolation) is closable by *either* a uid split *or*
-systemd directive composition (`TemporaryFileSystem=` +
-`SystemCallFilter=~@debug` + `PrivateUsers=true` + `ProcSubset=pid`
-+ empty `CapabilityBoundingSet=`). Operator chose to step back: do
-threat-model + research + test before committing to either approach.
-The three new specs are the output of that step-back.
+Full per-test detail is in the spec's "Results" section.
-## What's next: run the test plan
+## What's next: write the refactor plan

-The test plan is **self-contained** — drop a fresh Claude session on
-`left4.me` (141.95.32.8) with the spec in hand and it can execute end
-to end. System units only; no user units, no lingering.
+Target file: `docs/superpowers/plans/2026-05-16-hardening-refactor.md`
+(or whatever date the next session opens).

-Per the test plan's structure:
-1. Capture baseline (`systemd-analyze security`, current unit state,
-   sysctl).
-2. Tests 1-6 isolate individual directives against srcds on
-   `left4me-server@1` (canary; server@2 stays baseline as a fallback).
-3. Test 7 composes everything that passed.
-4. Test 8 verifies the threat-model defenses (D1-D5) actually work.
-5. Test 9 applies `kernel.yama.ptrace_scope=2` system-wide.
-6. Test 10 applies the sudo-compatible subset to `left4me-web.service`.
-7. Test 11 is a 24-48h soak.
+Scope:

-Results template at the bottom of the test plan; fill in as you go.
+1. **Land the proven composition in ckn-bw.** Live source for the
+   unit emission is `~/Projekte/ckn-bw/bundles/left4me/metadata.py:150+`.
+   The reactor emits `left4me-server@.service` and `left4me-web.service`
+   — both need the new directives. Copy the Test 7 drop-in (from the
+   spec) into the reactor's unit body, with the two amendments above.

-After execution: write the implementation plan at
-`docs/superpowers/plans/2026-MM-DD-hardening-refactor.md` against the
-proven composition. The plan touches `~/Projekte/ckn-bw/bundles/left4me/metadata.py`
-(live source for unit emission per `items.py:2-5`) and the reference
-copies in `deploy/files/usr/local/lib/systemd/system/`.
+2. **Land the web composition** (sudo-compatible subset from Test 10)
+   in the same reactor.
+
+3. **Land the sysctl drop-in in ckn-bw.** Currently
+   `/etc/sysctl.d/99-left4me-ptrace.conf` is host-only — if ckn-bw
+   later enforces unmanaged-file removal, this would disappear. Add
+   `pkg_files:` entry (or whatever the bundle convention is) for
+   `kernel.yama.ptrace_scope=2`.
+
+4. **Update reference units** in
+   `deploy/files/usr/local/lib/systemd/system/{left4me-server@,left4me-web}.service`
+   to mirror the new emission (these are reference-only post the
+   deploy-dir-rethink, but should not drift from the live source).
+
+5. **Decide on `SocketBindAllow=`** for game port range (27000–27999
+   per `LEFT4ME_PORT_RANGE_*`). Worth adding to lock srcds's bindable
+   sockets; not tested in this session.
+
+6. **Resolve the deferred specs:**
+   - `docs/superpowers/specs/2026-05-15-user-uid-split-design.md` —
+     **mark as superseded.** PrivatePIDs + PrivateUsers close the
+     same-uid /proc gap that motivated it. Note the residual app-level
+     same-uid surface (DB ACLs, web.env mode) is a separate concern
+     not addressed by uid split anyway.
+   - AppArmor follow-up — defer further; defenses survey lists it.
+     Revisit if directive-only hardening leaves observable gaps.
+
+7. **Fix the four spec bugs documented at the bottom of the test plan**
+   (PID-lookup races, gdb-from-outside-NS verification flaw, D5
+   pgrep pattern, scmp_sys_resolver package name).
+
+### Recommendation on sequencing
+
+Before touching ckn-bw, run **superpowers:brainstorming** on the
+refactor — there's a real design choice on emission shape. The
+test-plan drop-in is ~50 lines of new directives; the existing
+reactor emits a smaller unit. Options:
+
+- **A. Inline.** All directives land directly in the
+  `[Service]` block emitted by the reactor. Simple, ckn-bw-idiomatic.
+- **B. Profile-as-reusable-fragment.** Put the directive block in a
+  shared bundle (so the future build-overlay-unit refactor can reuse
+  it). Better factoring, more upfront design.
+- **C. Drop-in pattern preserved.** Reactor emits the base unit
+  unchanged, plus a separate `*.service.d/hardening.conf` drop-in.
+  Mirrors the test methodology; easier to revert by removing the
+  drop-in.
+
+My weak preference is **A** for the first pass — get the production
+state hardened, then refactor into shared shape (B) when the
+build-overlay-unit work needs it. **C** is operationally clean but
+introduces a new emission pattern just for this. Worth 10 minutes of
+brainstorming before committing.

 ## Decision-relevant context

-- **Source of truth for unit files is ckn-bw**, not left4me's
-  `deploy/files/`. The `deploy/files/usr/local/lib/systemd/system/*.service`
-  copies are reference-only post-deploy-dir-rethink; the
-  `systemd/units` reactor in `~/Projekte/ckn-bw/bundles/left4me/metadata.py:150+`
-  is the live emission. Audit confirmed (commit `5284e28` + `items.py:2-5`
-  comment).
-- **Sandbox is already strong.** `l4d2-sandbox` unit is not in scope
-  for this refactor — its hardening profile was verified during 2026-05-15
-  build-time-idmap work. Document as load-bearing; do not weaken.
-- **Sudo on the web app blocks deep hardening there.** `NoNewPrivileges=true`
-  and `PrivateUsers=true` are incompatible with the helper-invocation
-  pattern. Sudo-compatible subset only on web. Full hardening blocked
-  on a future "replace sudo with systemctl-managed unit triggering"
-  refactor (build-overlay-unit spec is a step in that direction).
-- **uid-split spec is deferred, not closed.** After Phase A test
-  results come back, decide: residual risk small enough → close
-  `2026-05-15-user-uid-split-design.md` as superseded. Residual risk
-  significant → write the split as a follow-up.
+- **Source of truth is ckn-bw.** `deploy/files/.../*.service` copies
+  are reference-only post-deploy-dir-rethink. Don't edit them as the
+  primary change — emit-then-mirror.
+- **Sandbox `l4d2-sandbox` unit is out of scope.** Verified during
+  prior build-time-idmap work; do not weaken.
+- **Web sudo helpers must keep working.** `NoNewPrivileges` and
+  `PrivateUsers` are NOT in the web composition (Test 10 confirmed
+  the sudo-compat subset). The "replace sudo with systemctl-managed
+  unit triggering" refactor (build-overlay-unit spec is a step
+  toward it) would unlock deeper web hardening later.
+- **App-level stale RCON port bug** surfaced during testing: each
+  srcds restart picks a new ephemeral RCON port; the web app
+  caches the previous one and logs `Connection refused`. Pre-exists
+  hardening (repro'd before any drop-in was applied). Separate issue,
+  not for this refactor. Mention to operator; track in whatever
+  issue-tracking the project uses.
+- **gdb + seccomp packages on left4.me** are installed but not in
+  ckn-bw. Either add them to the bundle (so they're reproducible)
+  or `apt remove` them after the refactor lands — operator
+  preference.

-## Open questions to clarify with operator before/during execution
+## Host state at end of session

-(Captured in the threat model's "Open questions" section.)
-
-1. Is gunicorn directly internet-reachable, or only via nginx?
-2. Admin-auth strength on the web app (defines S2 realism).
-3. Workshop content curation policy (defines A3 realism).
-4. Is `ckn@10.0.4.128` usable as a test bench, or is `left4.me` the
-   only deployment target? (Test plan currently assumes `left4.me`.)
-5. Current `kernel.yama.ptrace_scope` setting on the host.
-6. AppArmor enabled on host? (Default Debian: not enabled.)
+- `left4me-server@1`, `@2`, `left4me-web`: all `active`, baseline
+  (no drop-ins).
+- `/etc/sysctl.d/99-left4me-ptrace.conf` present, `ptrace_scope=2`
+  effective.
+- `gdb`, `seccomp` (provides `scmp_sys_resolver`), `libseccomp-dev`
+  installed.
+- `/tmp/sec-{baseline,after}-{server,web}.txt`, `/tmp/unit-baseline-*.conf`,
+  `/tmp/sysctl-baseline.txt` retained (next session can pull diffs from
+  these if needed).

 ## What's NOT next

-- **build-overlay-unit refactor**
-  (`docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md`).
-  Still queued; sequenced behind this. The hardening profile from
-  this work becomes the template for the build-overlay unit.
-- **Pushing the ckn-bw `91b7265` commit.** Still unpushed; still safe.
-  Mentioned in the previous handoff; not a blocker.
-- **uid-split implementation.** Deferred pending test results.
-- **AppArmor profiles.** Listed in the defenses survey; deferred.
-  Revisit after Phase A if directive-only hardening leaves gaps.
+- **Don't re-run the test plan.** Already done; results committed.
+- **Don't push to origin yet.** Repo is 3 ahead of
+  `origin/master` (the three hardening specs + this commit). User
+  said "commit locally" this session; they'll push when ready.
+- **Don't fix the stale-RCON-port app bug as part of the refactor.**
+  Different system, different scope.
+- **Don't do AppArmor.** Still deferred.
+- **build-overlay-unit refactor** (`docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md`)
+  remains sequenced behind this; not next.
+
+## Open questions for the next session
+
+- Should the refactor be a single PR/commit, or split into
+  "ckn-bw emission" + "reference unit mirror" + "sysctl drop-in"?
+  Operator preference. Recommend single bundle if the changes are
+  small.
+- Should we land Test 7's composition on `@2` first as a longer
+  canary before rolling to all instances? Or trust the symmetric
+  emission and roll everywhere at once? Currently both are running
+  baseline; @1 was the only canary.
+- `SocketBindAllow=` for the 27000–27999 game port range — include
+  in the first pass, or defer to a follow-up commit? Survey lists
+  it, test plan didn't exercise it.
 ## Pointers

-- Test plan (the thing to execute): `docs/superpowers/specs/2026-05-15-hardening-test-plan.md`
+- Test plan (executed; **read the Results section first**):
+  `docs/superpowers/specs/2026-05-15-hardening-test-plan.md`
 - Threat model: `docs/superpowers/specs/2026-05-15-hardening-threat-model.md`
 - Defenses survey: `docs/superpowers/specs/2026-05-15-hardening-defenses-survey.md`
-- Original uid-split spec (deferred): `docs/superpowers/specs/2026-05-15-user-uid-split-design.md`
+- Original uid-split (to be marked superseded):
+  `docs/superpowers/specs/2026-05-15-user-uid-split-design.md`
 - Live unit emission: `~/Projekte/ckn-bw/bundles/left4me/metadata.py:150+`
 - Reference units: `deploy/files/usr/local/lib/systemd/system/`
-- Scratch plan from earlier this session
-  (`~/.claude/plans/docs-superpowers-specs-2026-05-15-sessio-cosmic-codd.md`)
-  is superseded by the three specs; safe to discard.
+- Recent commit on this work: `461b8d0`
+- Host SSH: `ssh left4.me` (config at `~/.ssh/config`, 1Password agent)
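
Review note on the "Three things the test surfaced" list and options A and C: whichever emission shape is chosen, the permanent `MemoryDenyWriteExecute` exclusion could be enforced in code rather than by convention. The following is a minimal sketch under stated assumptions, not the ckn-bw reactor: `render_dropin`, `SERVER_HARDENING`, and `EXCLUDED` are hypothetical names, and only directives named in the handoff are shown (the full Test 7 composition lives in the spec).

```python
# Hypothetical sketch: none of these names come from ckn-bw.

# Directives the handoff names for left4me-server@.service.
SERVER_HARDENING = {
    "SystemCallArchitectures": "native x86",  # srcds_linux is 32-bit i386
    "PrivatePIDs": "true",                    # own PID namespace per instance
    "ProtectProc": "invisible",
}

# Directives that must never be emitted for srcds: the i386 .so files
# have text relocations, so the relocation mprotect must be allowed.
EXCLUDED = {"MemoryDenyWriteExecute"}


def render_dropin(directives: dict[str, str]) -> str:
    """Render a [Service] drop-in body, refusing excluded directives."""
    banned = EXCLUDED.intersection(directives)
    if banned:
        raise ValueError(f"excluded directive(s): {sorted(banned)}")
    lines = ["[Service]"]
    lines += [f"{key}={value}" for key, value in directives.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dropin(SERVER_HARDENING), end="")
```

The same string would serve as an inline `[Service]` body (option A) or as the content of a `*.service.d/hardening.conf` drop-in (option C); the guard only makes a later accidental re-add of the excluded directive fail loudly.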