spec(session-handoff): point next session at hardening-refactor plan

The prior handoff pointed this session at running the test plan; that's done (commit 461b8d0). Update the handoff to point the next session at writing docs/superpowers/plans/2026-MM-DD-hardening-refactor.md against the proven composition, including the two amendments (x86 arch, PrivatePIDs) and the MDW permanent exclusion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 13:43:37 +02:00
commit 152c313315
parent 461b8d028f
1 changed file with 152 additions and 88 deletions
--- a/docs/superpowers/specs/2026-05-15-session-handoff.md
+++ b/docs/superpowers/specs/2026-05-15-session-handoff.md
@@ -1,114 +1,178 @@
-# Session handoff — next: execute hardening test plan
+# Session handoff — next: write the hardening-refactor implementation plan

-Short handoff. Three new hardening specs landed today; the next session
-takes the test plan to `left4.me` and runs it. Decision on
-`2026-05-15-user-uid-split-design.md` is **deferred** until the test
-plan reports back.
+Short handoff. The hardening test plan was executed end-to-end on
+`left4.me` this session. Results are recorded inline in the spec at
+`docs/superpowers/specs/2026-05-15-hardening-test-plan.md` (commit
+`461b8d0`). The next session writes the implementation plan that lands
+the proven composition in ckn-bw.

-## What just landed
+## What just happened

-Three coordinated specs at `docs/superpowers/specs/`:
+Ran all 11 tests from the hardening test plan on
+`left4me-server@1` (canary) and `left4me-web` against the live host
+at `left4.me` / `left4me.ovh.ckn.li` (Debian 13, systemd 257). All
+drop-ins cleaned up at session end; the Test 9 sysctl
+(`kernel.yama.ptrace_scope=2`) is the one persistent host change.
+`gdb` + `seccomp` packages left installed.

- `2026-05-15-hardening-threat-model.md` — assets, attackers (A1-A6),
-  trust boundaries (TB1-TB8), attack scenarios (S1-S6), what we
-  defend (D1-D7), what we accept losing.
- `2026-05-15-hardening-defenses-survey.md` — full Linux + systemd
-  defense menu, per-defense primitive mapping, candidate composition
-  for `left4me-server@.service` + `left4me-web.service`.
- `2026-05-15-hardening-test-plan.md` — 11 tests runnable cold on
-  `left4.me`; drop-in style so they never modify persistent units.
+Headline numbers:
+- `left4me-server@1.service`: **7.5 EXPOSED → 1.3 OK** (systemd-analyze)
+- `left4me-web.service`: **8.7 EXPOSED → 4.1 OK**
+- Test 8 attack matrix: all 8 vectors (D1.a/b/c, D2.a/b/c, D3, D5) blocked.

-## Why the shape changed (from uid-split → hardening)
+Three things the test surfaced that change what the refactor must look like:
+- **`SystemCallArchitectures=native x86`**, not bare `native`.
+  `srcds_linux` is 32-bit i386; bare `native` allows only `AUDIT_ARCH_X86_64`,
+  so every i386 syscall is killed and srcds_run respawns every 10 s.
+- **Add `PrivatePIDs=true`** to the composition. `ProtectProc=invisible`
+  alone cannot hide gunicorn from srcds because they share uid 980;
+  PrivatePIDs gives each instance its own PID namespace and closes
+  D2.b without needing the uid split.
+- **Exclude `MemoryDenyWriteExecute=true`.** Source engine i386 `.so`
+  files have text relocations; MDW returns EPERM on the relocation
+  `mprotect`, dlopen aborts, srcds enters the respawn loop. Permanent
+  exclusion — not fixable without rebuilding Valve's closed-source
+  binary.

-The prior handoff pointed this session at the 1/2/3-user decision in
-`2026-05-15-user-uid-split-design.md`. Audit during this session
-established that the same-uid attack surface (DB readable from srcds,
-ptrace of gunicorn allowed, RCON passwords stored plaintext in DB,
-no `/proc` isolation) is closable by *either* a uid split *or*
-systemd directive composition (`TemporaryFileSystem=` +
-`SystemCallFilter=~@debug` + `PrivateUsers=true` + `ProcSubset=pid`
-+ empty `CapabilityBoundingSet=`). Operator chose to step back: do
-threat-model + research + test before committing to either approach.
-The three new specs are the output of that step-back.
+Full per-test detail is in the spec's "Results" section.

-## What's next: run the test plan
+## What's next: write the refactor plan

-The test plan is **self-contained** — drop a fresh Claude session on
-`left4.me` (141.95.32.8) with the spec in hand and it can execute end
-to end. System units only; no user units, no lingering.
+Target file: `docs/superpowers/plans/2026-05-16-hardening-refactor.md`
+(or whatever date the next session opens).

-Per the test plan's structure:
-1. Capture baseline (`systemd-analyze security`, current unit state,
-   sysctl).
-2. Tests 1-6 isolate individual directives against srcds on
-   `left4me-server@1` (canary; server@2 stays baseline as a fallback).
-3. Test 7 composes everything that passed.
-4. Test 8 verifies the threat-model defenses (D1-D5) actually work.
-5. Test 9 applies `kernel.yama.ptrace_scope=2` system-wide.
-6. Test 10 applies the sudo-compatible subset to `left4me-web.service`.
-7. Test 11 is a 24-48h soak.
+Scope:

-Results template at the bottom of the test plan; fill in as you go.
+1. **Land the proven composition in ckn-bw.** Live source for the
+   unit emission is `~/Projekte/ckn-bw/bundles/left4me/metadata.py:150+`.
+   The reactor emits `left4me-server@.service` and `left4me-web.service`
+   — both need the new directives. Copy the Test 7 drop-in (from the
+   spec) into the reactor's unit body, with the two amendments above.

-After execution: write the implementation plan at
-`docs/superpowers/plans/2026-MM-DD-hardening-refactor.md` against the
-proven composition. The plan touches `~/Projekte/ckn-bw/bundles/left4me/metadata.py`
-(live source for unit emission per `items.py:2-5`) and the reference
-copies in `deploy/files/usr/local/lib/systemd/system/`.
+2. **Land the web composition** (sudo-compatible subset from Test 10)
+   in the same reactor.
+
+3. **Land the sysctl drop-in in ckn-bw.** Currently
+   `/etc/sysctl.d/99-left4me-ptrace.conf` is host-only — if ckn-bw
+   later enforces unmanaged-file removal, this would disappear. Add a
+   `pkg_files:` entry (or whatever the bundle convention is) for
+   `kernel.yama.ptrace_scope=2`.
+
+4. **Update reference units** in
+   `deploy/files/usr/local/lib/systemd/system/{left4me-server@,left4me-web}.service`
+   to mirror the new emission (these are reference-only since the
+   deploy-dir-rethink, but should not drift from the live source).
+
+5. **Decide on `SocketBindAllow=`** for game port range (27000–27999
+   per `LEFT4ME_PORT_RANGE_*`). Worth adding to lock srcds's bindable
+   sockets; not tested in this session.
+
+6. **Resolve the deferred specs:**
+   - `docs/superpowers/specs/2026-05-15-user-uid-split-design.md` —
+     **mark as superseded.** PrivatePIDs + PrivateUsers close the
+     same-uid /proc gap that motivated it. Note the residual app-level
+     same-uid surface (DB ACLs, web.env mode) is a separate concern
+     not addressed by uid split anyway.
+   - AppArmor follow-up — defer further; defenses survey lists it.
+     Revisit if directive-only hardening leaves observable gaps.
+
+7. **Fix the four spec bugs documented at the bottom of the test plan**
+   (PID-lookup races, gdb-from-outside-NS verification flaw, D5
+   pgrep pattern, scmp_sys_resolver package name).
+
+### Recommendation on sequencing
+
+Before touching ckn-bw, run **superpowers:brainstorming** on the
+refactor — there's a real design choice on emission shape. The
+test-plan drop-in is ~50 lines of new directives; the existing
+reactor emits a smaller unit. Options:
+
+- **A. Inline.** All directives land directly in the
+  `[Service]` block emitted by the reactor. Simple, ckn-bw-idiomatic.
+- **B. Profile-as-reusable-fragment.** Put the directive block in a
+  shared bundle (so the future build-overlay-unit refactor can reuse
+  it). Better factoring, more upfront design.
+- **C. Drop-in pattern preserved.** Reactor emits the base unit
+  unchanged, plus a separate `*.service.d/hardening.conf` drop-in.
+  Mirrors the test methodology; easier to revert by removing the
+  drop-in.
+
+My weak preference is **A** for the first pass — get the production
+state hardened, then refactor into shared shape (B) when the
+build-overlay-unit work needs it. **C** is operationally clean but
+introduces a new emission pattern just for this. Worth 10 minutes of
+brainstorming before committing.

 ## Decision-relevant context

- **Source of truth for unit files is ckn-bw**, not left4me's
-  `deploy/files/`. The `deploy/files/usr/local/lib/systemd/system/*.service`
-  copies are reference-only post-deploy-dir-rethink; the
-  `systemd/units` reactor in `~/Projekte/ckn-bw/bundles/left4me/metadata.py:150+`
-  is the live emission. Audit confirmed (commit `5284e28` + `items.py:2-5`
-  comment).
- **Sandbox is already strong.** `l4d2-sandbox` unit is not in scope
-  for this refactor — its hardening profile was verified during 2026-05-15
-  build-time-idmap work. Document as load-bearing; do not weaken.
- **Sudo on the web app blocks deep hardening there.** `NoNewPrivileges=true`
-  and `PrivateUsers=true` are incompatible with the helper-invocation
-  pattern. Sudo-compatible subset only on web. Full hardening blocked
-  on a future "replace sudo with systemctl-managed unit triggering"
-  refactor (build-overlay-unit spec is a step in that direction).
- **uid-split spec is deferred, not closed.** After Phase A test
-  results come back, decide: residual risk small enough → close
-  `2026-05-15-user-uid-split-design.md` as superseded. Residual risk
-  significant → write the split as a follow-up.
+- **Source of truth is ckn-bw.** `deploy/files/.../*.service` copies
+  are reference-only post-deploy-dir-rethink. Don't edit them as the
+  primary change — emit-then-mirror.
+- **Sandbox `l4d2-sandbox` unit is out of scope.** Verified during
+  prior build-time-idmap work; do not weaken.
+- **Web sudo helpers must keep working.** `NoNewPrivileges` and
+  `PrivateUsers` are NOT in the web composition (Test 10 confirmed
+  the sudo-compat subset). The "replace sudo with systemctl-managed
+  unit triggering" refactor (build-overlay-unit spec is a step
+  toward it) would unlock deeper web hardening later.
+- **App-level stale RCON port bug** surfaced during testing: each
+  srcds restart picks a new ephemeral RCON port; the web app
+  caches the previous one and logs `Connection refused`. Predates the
+  hardening (reproduced before any drop-in was applied). Separate issue,
+  not for this refactor. Mention to operator; track in whatever
+  issue-tracking the project uses.
+- **gdb + seccomp packages on left4.me** are installed but not in
+  ckn-bw. Either add them to the bundle (so they're reproducible)
+  or `apt remove` them after the refactor lands — operator
+  preference.

-## Open questions to clarify with operator before/during execution
+## Host state at end of session

-(Captured in the threat model's "Open questions" section.)
-
-1. Is gunicorn directly internet-reachable, or only via nginx?
-2. Admin-auth strength on the web app (defines S2 realism).
-3. Workshop content curation policy (defines A3 realism).
-4. Is `ckn@10.0.4.128` usable as a test bench, or is `left4.me` the
-   only deployment target? (Test plan currently assumes `left4.me`.)
-5. Current `kernel.yama.ptrace_scope` setting on the host.
-6. AppArmor enabled on host? (Default Debian: not enabled.)
+- `left4me-server@1`, `@2`, `left4me-web`: all `active`, baseline
+  (no drop-ins).
+- `/etc/sysctl.d/99-left4me-ptrace.conf` present, `ptrace_scope=2`
+  effective.
+- `gdb`, `seccomp` (provides `scmp_sys_resolver`), `libseccomp-dev`
+  installed.
+- `/tmp/sec-{baseline,after}-{server,web}.txt`, `/tmp/unit-baseline-*.conf`,
+  `/tmp/sysctl-baseline.txt` retained (next session can pull diffs from
+  these if needed).

 ## What's NOT next

- **build-overlay-unit refactor**
-  (`docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md`).
-  Still queued; sequenced behind this. The hardening profile from
-  this work becomes the template for the build-overlay unit.
- **Pushing the ckn-bw `91b7265` commit.** Still unpushed; still safe.
-  Mentioned in the previous handoff; not a blocker.
- **uid-split implementation.** Deferred pending test results.
- **AppArmor profiles.** Listed in the defenses survey; deferred.
-  Revisit after Phase A if directive-only hardening leaves gaps.
+- **Don't re-run the test plan.** Already done; results committed.
+- **Don't push to origin yet.** Repo is 3 ahead of
+  `origin/master` (the three hardening specs + this commit). User
+  said "commit locally" this session; they'll push when ready.
+- **Don't fix the stale-RCON-port app bug as part of the refactor.**
+  Different system, different scope.
+- **Don't do AppArmor.** Still deferred.
+- **build-overlay-unit refactor** (`docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md`)
+  remains sequenced behind this; not next.
+
+## Open questions for the next session
+
+- Should the refactor be a single PR/commit, or split into
+  "ckn-bw emission" + "reference unit mirror" + "sysctl drop-in"?
+  Operator preference. Recommend single bundle if the changes are
+  small.
+- Should we land Test 7's composition on `@2` first as a longer
+  canary before rolling to all instances? Or trust the symmetric
+  emission and roll everywhere at once? Currently both are running
+  baseline; @1 was the only canary.
+- `SocketBindAllow=` for the 27000–27999 game port range — include
+  in the first pass, or defer to a follow-up commit? Survey lists
+  it, test plan didn't exercise it.

 ## Pointers

- Test plan (the thing to execute): `docs/superpowers/specs/2026-05-15-hardening-test-plan.md`
+- Test plan (executed; **read the Results section first**):
+  `docs/superpowers/specs/2026-05-15-hardening-test-plan.md`
 - Threat model: `docs/superpowers/specs/2026-05-15-hardening-threat-model.md`
 - Defenses survey: `docs/superpowers/specs/2026-05-15-hardening-defenses-survey.md`
- Original uid-split spec (deferred): `docs/superpowers/specs/2026-05-15-user-uid-split-design.md`
+- Original uid-split (to be marked superseded):
+  `docs/superpowers/specs/2026-05-15-user-uid-split-design.md`
 - Live unit emission: `~/Projekte/ckn-bw/bundles/left4me/metadata.py:150+`
 - Reference units: `deploy/files/usr/local/lib/systemd/system/`
- Scratch plan from earlier this session
-  (`~/.claude/plans/docs-superpowers-specs-2026-05-15-sessio-cosmic-codd.md`)
-  is superseded by the three specs; safe to discard.
+- Recent commit on this work: `461b8d0`
+- Host SSH: `ssh left4.me` (config at `~/.ssh/config`, 1Password agent)
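
Reviewer note, not part of the commit: the two amendments and the permanent exclusion described under "What just happened" can be sketched as a small guard around drop-in rendering. This is a minimal sketch under stated assumptions. `render_dropin`, `AMENDED_DIRECTIVES`, and `EXCLUDED_DIRECTIVES` are hypothetical names invented for illustration; they are not the ckn-bw reactor's API, and the directive set shown is only the subset named in the handoff, not the full Test 7 composition.

```python
# Hypothetical sketch: these names are illustrative, not ckn-bw's API.

# Directives the handoff names as required by the test results.
AMENDED_DIRECTIVES = {
    "SystemCallArchitectures": "native x86",  # srcds_linux is 32-bit i386
    "ProtectProc": "invisible",
    "PrivatePIDs": "true",  # separate PID namespace per instance
}

# Directives the handoff permanently excludes for srcds.
EXCLUDED_DIRECTIVES = frozenset({"MemoryDenyWriteExecute"})


def render_dropin(directives, excluded=EXCLUDED_DIRECTIVES):
    """Render a [Service] drop-in body, refusing excluded directives."""
    clash = sorted(set(directives) & set(excluded))
    if clash:
        raise ValueError(f"excluded directive(s) present: {', '.join(clash)}")
    lines = ["[Service]"]
    lines.extend(f"{key}={value}" for key, value in directives.items())
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dropin(AMENDED_DIRECTIVES), end="")
```

Failing loudly on `MemoryDenyWriteExecute` is one way to keep the exclusion from being reintroduced by a later copy-paste of a generic hardening profile; whether such a check belongs in the reactor or in review is a design choice for the plan.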