spec(session-handoff): hardening refactor landed and verified on left4.me

12-task subagent-driven refactor complete. left4me-server@1: 7.5 → 1.3
systemd-analyze. left4me-web: 8.7 → 4.1. All 6 Test 8 attack vectors
blocked post-deploy. One acceptable SECCOMP audit line per gameserver
restart (Breakpad's ptrace fork, blocked by design). Test tooling
(gdb, seccomp, libseccomp-dev) apt-removed from left4.me. uid-split
spec marked superseded.

No queued follow-up. Adjacent work: build-overlay-unit refactor and
the deferred drop-in / configmgmt-responsibility reshape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mwiegand 2026-05-15 15:17:06 +02:00
parent f615d0de75
commit f5f8db84ef

@@ -1,178 +1,126 @@
# Session handoff — next: write the hardening-refactor implementation plan # Session handoff — hardening refactor landed
Short handoff. The hardening test plan was executed end-to-end on The hardening refactor planned at
`left4.me` this session. Results are recorded inline in the spec at `docs/superpowers/plans/2026-05-15-hardening-refactor.md` is deployed
`docs/superpowers/specs/2026-05-15-hardening-test-plan.md` (commit to `left4.me` and verified. This session executed all 12 tasks
`461b8d0`). The next session writes the implementation plan that lands subagent-driven; no follow-up implementation work is queued.
the proven composition in ckn-bw.
## What just happened ## What landed
Ran all 11 tests from the hardening test plan on **left4me commits** (this session, in order; all on `master`, pushed):
`left4me-server@1` (canary) and `left4me-web` against the live host - `7c64910`: `spec(hardening-refactor): resolve emitter open items`
at `left4.me` / `left4me.ovh.ckn.li` (Debian 13, systemd 257). All (verified ckn-bw systemd-bundle emitter handles tuples + empty values)
drop-ins cleaned up at session end; the Test 9 sysctl - `8e678b6`: `deploy/files: annotate reference units with per-directive hardening comments`
(`kernel.yama.ptrace_scope=2`) is the one persistent host change. - `37309ba`: `spec(hardening-test-plan): fix four bugs surfaced by executor`
`gdb` + `seccomp` packages left installed. - `f615d0d`: `spec(user-uid-split): mark superseded by the hardening refactor`
Headline numbers: **ckn-bw commits** (this session, in order; all on `master`, pushed):
- `left4me-server@1.service`: **7.5 EXPOSED → 1.3 OK** (systemd-analyze) - `85b9af0`: `bundles/left4me: add HARDENING_{COMMON,SERVER,WEB} constants`
- `left4me-web.service`: **8.7 EXPOSED → 4.1 OK** - `640461c`: `bundles/left4me: spread HARDENING_SERVER into left4me-server@.service`
- Test 8 attack matrix: all 8 vectors (D1.a/b/c, D2.a/b/c, D3, D5) blocked. - `c6721e7`: `bundles/left4me: spread HARDENING_WEB into left4me-web.service`
- `130b0b1`: `bundles/left4me: ship kernel.yama.ptrace_scope=2 sysctl drop-in`
Three things the test surfaced that change what the refactor must look like: **Deploy:** `bw apply ovh.left4me` ran clean in 10 s (194 OK, 4 fixed,
- **`SystemCallArchitectures=native x86`**, not bare `native`. 0 failed). `left4me-web.service` restarted automatically by `bw`;
`srcds_linux` is 32-bit i386; with `native=AUDIT_ARCH_X86_64` only, `left4me-server@1` and `@2` restarted manually post-apply.
every i386 syscall is killed and srcds_run respawns every 10 s.
- **Add `PrivatePIDs=true`** to the composition. `ProtectProc=invisible`
alone cannot hide gunicorn from srcds because they share uid 980;
PrivatePIDs gives each instance its own PID namespace and closes
D2.b without needing the uid split.
- **Exclude `MemoryDenyWriteExecute=true`.** Source engine i386 `.so`
files have text relocations; MDW returns EPERM on the relocation
`mprotect`, dlopen aborts, srcds enters the respawn loop. Permanent
exclusion — not fixable without rebuilding Valve's closed-source
binary.
Full per-test detail is in the spec's "Results" section. ## What's live on `left4.me`
## What's next: write the refactor plan | Unit | systemd-analyze score | State |
|---|---|---|
| `left4me-server@1.service` | **1.3 OK** (was 7.5 baseline) | active since 13:13:39 UTC |
| `left4me-server@2.service` | 1.3 OK | active since 13:14:40 UTC |
| `left4me-web.service` | **4.1 OK** (was 8.7 baseline) | active since 13:01:06 UTC |
Target file: `docs/superpowers/plans/2026-05-16-hardening-refactor.md` Sysctl: `kernel.yama.ptrace_scope = 2` (managed by ckn-bw bundle now,
(or whatever date the next session opens). not hand-applied).
Scope: Composition matches Test 7 of the test plan with two amendments
(`SystemCallArchitectures=native x86`, `PrivatePIDs=true`) and one
addition (`SocketBindAllow=udp:27000-27999 tcp:27000-27999`).
`MemoryDenyWriteExecute=true` permanently excluded.
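
A minimal sketch of how the `HARDENING_*` constants might be layered and rendered into `[Service]` lines. This is not the actual content of `metadata.py`: which directive sits in which constant, the `EXCLUDED` set, and the `render_service_lines` helper are all assumptions made for illustration. Only the directive values themselves come from the composition described above.

```python
# Hypothetical layout; the real constants live in ckn-bw's
# bundles/left4me/metadata.py and are not reproduced here.

# Directives assumed to be shared by both units (placement is illustrative).
HARDENING_COMMON = {
    "ProtectProc": "invisible",
}

# Server composition: common base plus the amendments named in the handoff.
HARDENING_SERVER = {
    **HARDENING_COMMON,
    "SystemCallArchitectures": "native x86",  # srcds_linux is 32-bit i386
    "PrivatePIDs": "true",
    # A tuple stands for a directive that carries several values.
    "SocketBindAllow": ("udp:27000-27999", "tcp:27000-27999"),
}

# Web composition: sudo-compatible subset, so NoNewPrivileges and
# PrivateUsers are deliberately absent.
HARDENING_WEB = {**HARDENING_COMMON}

# Directives that must never be emitted for these units.
EXCLUDED = {"MemoryDenyWriteExecute"}


def render_service_lines(directives):
    """Render a directive dict as KEY=VALUE lines, one line per tuple element."""
    lines = []
    for key, value in directives.items():
        if key in EXCLUDED:
            raise ValueError(f"{key} is permanently excluded")
        values = value if isinstance(value, tuple) else (value,)
        for item in values:
            lines.append(f"{key}={item}")
    return lines
```

Emitting one line per tuple element is an assumption about how the emitter treats tuples; the real emitter may join them differently.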
1. **Land the proven composition in ckn-bw.** Live source for the ## Attack vectors blocked (Test 8 subset rerun post-deploy)
unit emission is `~/Projekte/ckn-bw/bundles/left4me/metadata.py:150+`.
The reactor emits `left4me-server@.service` and `left4me-web.service`
— both need the new directives. Copy the Test 7 drop-in (from the
spec) into the reactor's unit body, with the two amendments above.
2. **Land the web composition** (sudo-compatible subset from Test 10) - **D1.a — srcds reads DB**: `cat /var/lib/left4me/left4me.db` from
in the same reactor. inside the unit's mount namespace → `No such file or directory`
- **D1.b — srcds reads web.env**: `cat /etc/left4me/web.env`
`No such file or directory`
- **D1.c — srcds sees /opt**: empty listing
- **D2.b — srcds sees gunicorn PID via /proc**: `cannot access /proc/<pid>`
(PrivatePIDs in effect; PID doesn't exist in the namespace)
- **D5 — cross-instance ptrace**: `cannot access /proc/<peer-srcds-pid>`
(cross-instance PID isolation)
- **Syscall filter compiled correctly**: `ptrace` and `process_vm_*`
not in the compiled allow list (verified via
`systemd-analyze syscall-filter`)
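
The syscall-filter check in the last bullet can be sketched as a small helper. It assumes the allowed syscall names have already been extracted into a list by whatever means the operator prefers; the function name and the pattern tuple are hypothetical and not part of the test plan.

```python
from fnmatch import fnmatchcase

# Syscalls that must not appear in the compiled allow list.
FORBIDDEN_PATTERNS = ("ptrace", "process_vm_*")


def forbidden_in_allow_list(allowed_syscalls):
    """Return the sorted syscall names that match a forbidden pattern."""
    return sorted(
        name
        for name in set(allowed_syscalls)
        if any(fnmatchcase(name, pattern) for pattern in FORBIDDEN_PATTERNS)
    )
```

An empty result means the allow list passes the check; any returned name is a syscall that should have been filtered out.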
3. **Land the sysctl drop-in in ckn-bw.** Currently ## Known acceptable noise
`/etc/sysctl.d/99-left4me-ptrace.conf` is host-only — if ckn-bw
later enforces unmanaged-file removal, this would disappear. Add
`pkg_files:` entry (or whatever the bundle convention is) for
`kernel.yama.ptrace_scope=2`.
4. **Update reference units** in - **One SECCOMP audit line per gameserver restart** (`type=1326`,
`deploy/files/usr/local/lib/systemd/system/{left4me-server@,left4me-web}.service` i386 syscall 26 = `ptrace`, sig=31 SIGSYS, code=0x80000000
to mirror the new emission (these are reference-only post the SECCOMP_RET_KILL_PROCESS). Source: srcds's Breakpad crash-reporter
deploy-dir-rethink, but should not drift from the live source). init forks a child that attempts `ptrace`; we block it by design.
The child gets killed; the main srcds process is unaffected. Net
effect: Valve doesn't get crash minidumps from this host.
Acceptable trade-off given the threat model. If the audit-log noise
becomes a problem, switch the SECCOMP filter's action from
`KILL_PROCESS` to `EPERM` via `SystemCallErrorNumber=EPERM` (would
let breakpad fail cleanly instead of getting killed; same security
outcome).
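
To tell the expected Breakpad line apart from a real filter violation, a log check along these lines could help. It is a sketch only: `SAMPLE_LINE` is synthetic (pid, timestamp, and exe path are made up), the helper names are hypothetical, and `arch=40000003` is the i386 audit architecture value as it normally appears in kernel SECCOMP records rather than something quoted from this host's logs.

```python
import re

# Field values that identify the expected Breakpad ptrace kill.
EXPECTED_FIELDS = {
    "type": "1326",        # SECCOMP audit record
    "sig": "31",           # SIGSYS
    "arch": "40000003",    # i386 (assumed rendering in the audit record)
    "syscall": "26",       # ptrace on i386
    "code": "0x80000000",  # SECCOMP_RET_KILL_PROCESS
}

# Synthetic example line, not copied from the host.
SAMPLE_LINE = (
    'type=1326 audit(1715778819.000:42): auid=4294967295 uid=980 gid=980 '
    'ses=4294967295 pid=1234 comm="srcds_linux" exe="/path/to/srcds_linux" '
    'sig=31 arch=40000003 syscall=26 compat=1 ip=0xf7f00000 code=0x80000000'
)


def parse_audit_fields(line):
    """Split an audit record into a dict of key=value fields."""
    return dict(re.findall(r'(\w+)=("[^"]*"|\S+)', line))


def is_expected_breakpad_noise(line):
    """True if the record matches the known, accepted ptrace kill."""
    fields = parse_audit_fields(line)
    return all(fields.get(key) == value for key, value in EXPECTED_FIELDS.items())
```

Any SECCOMP record for which `is_expected_breakpad_noise` returns `False` would deserve a closer look, since it points at a syscall other than the Breakpad `ptrace` attempt.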
5. **Decide on `SocketBindAllow=`** for game port range (27000-27999 ## Host cleanup done
per `LEFT4ME_PORT_RANGE_*`). Worth adding to lock srcds's bindable
sockets; not tested in this session.
6. **Resolve the deferred specs:** `gdb`, `libseccomp-dev`, `seccomp` removed via `apt remove --purge`.
- `docs/superpowers/specs/2026-05-15-user-uid-split-design.md` Test tooling was installed during the test-plan execution session
**mark as superseded.** PrivatePIDs + PrivateUsers close the (commit `461b8d0`); not needed in steady state. ~13 MB freed.
same-uid /proc gap that motivated it. Note the residual app-level
same-uid surface (DB ACLs, web.env mode) is a separate concern
not addressed by uid split anyway.
- AppArmor follow-up — defer further; defenses survey lists it.
Revisit if directive-only hardening leaves observable gaps.
7. **Fix the four spec bugs documented at the bottom of the test plan** ## What's next
(PID-lookup races, gdb-from-outside-NS verification flaw, D5
pgrep pattern, scmp_sys_resolver package name).
### Recommendation on sequencing No queued follow-up from this work. Adjacent open work:
Before touching ckn-bw, run **superpowers:brainstorming** on the - **`build-overlay-unit` refactor**
refactor — there's a real design choice on emission shape. The (`docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md`).
test-plan drop-in is ~50 lines of new directives; the existing Will reuse `HARDENING_COMMON` (or a sandbox-class variant) when it
reactor emits a smaller unit. Options: lands. Sequenced after this; not blocked.
- **Broader configmgmt-responsibility reshape** — hardening as drop-in
files living in left4me with ckn-bw as a thin file-shipper. Real
direction, deliberately deferred to a dedicated session in this
refactor's design doc.
- **Stale RCON port app bug** flagged in the prior executor's handoff.
Not a hardening issue; separate scope.
- **A. Inline.** All directives land directly in the ## Open items the operator should sanity-check manually
`[Service]` block emitted by the reactor. Simple, ckn-bw-idiomatic.
- **B. Profile-as-reusable-fragment.** Put the directive block in a
shared bundle (so the future build-overlay-unit refactor can reuse
it). Better factoring, more upfront design.
- **C. Drop-in pattern preserved.** Reactor emits the base unit
unchanged, plus a separate `*.service.d/hardening.conf` drop-in.
Mirrors the test methodology; easier to revert by removing the
drop-in.
My weak preference is **A** for the first pass — get the production I executed everything programmatically that I could. The following
state hardened, then refactor into shared shape (B) when the need an eyeballed check via the web UI from your laptop:
build-overlay-unit work needs it. **C** is operationally clean but
introduces a new emission pattern just for this. Worth 10 minutes of
brainstorming before committing.
## Decision-relevant context 1. Log in to the web UI; confirm session works (would catch a SECRET_KEY
regression or session-cookie issue).
2. Start/stop a server from the UI (exercises the sudo path on the web
unit; if the SystemCallFilter or any other web hardening broke
sudo, this would fail).
3. View live logs for a server (uses `sudo left4me-journalctl`).
4. Trigger an overlay rebuild for a script overlay (exercises the
sandbox; unchanged by this refactor, but a smoke against the
full chain).
- **Source of truth is ckn-bw.** `deploy/files/.../*.service` copies If any of those break, the most likely cause is the web unit's
are reference-only post-deploy-dir-rethink. Don't edit them as the `SystemCallFilter`. Drop-in override at
primary change — emit-then-mirror. `/etc/systemd/system/left4me-web.service.d/00-debug.conf` with
- **Sandbox `l4d2-sandbox` unit is out of scope.** Verified during `SystemCallLog=...` instead of `SystemCallFilter` to identify the
prior build-time-idmap work; do not weaken. offending syscall, then narrow the filter.
- **Web sudo helpers must keep working.** `NoNewPrivileges` and
`PrivateUsers` are NOT in the web composition (Test 10 confirmed
the sudo-compat subset). The "replace sudo with systemctl-managed
unit triggering" refactor (build-overlay-unit spec is a step
toward it) would unlock deeper web hardening later.
- **App-level stale RCON port bug** surfaced during testing: each
srcds restart picks a new ephemeral RCON port; the web app
caches the previous one and logs `Connection refused`. Pre-exists
hardening (repro'd before any drop-in was applied). Separate issue,
not for this refactor. Mention to operator; track in whatever
issue-tracking the project uses.
- **gdb + seccomp packages on left4.me** are installed but not in
ckn-bw. Either add them to the bundle (so they're reproducible)
or `apt remove` them after the refactor lands — operator
preference.
## Host state at end of session
- `left4me-server@1`, `@2`, `left4me-web`: all `active`, baseline
(no drop-ins).
- `/etc/sysctl.d/99-left4me-ptrace.conf` present, `ptrace_scope=2`
effective.
- `gdb`, `seccomp` (provides `scmp_sys_resolver`), `libseccomp-dev`
installed.
- `/tmp/sec-{baseline,after}-{server,web}.txt`, `/tmp/unit-baseline-*.conf`,
`/tmp/sysctl-baseline.txt` retained (next session can pull diffs from
these if needed).
## What's NOT next
- **Don't re-run the test plan.** Already done; results committed.
- **Don't push to origin yet.** Repo is 3 ahead of
`origin/master` (the three hardening specs + this commit). User
said "commit locally" this session; they'll push when ready.
- **Don't fix the stale-RCON-port app bug as part of the refactor.**
Different system, different scope.
- **Don't do AppArmor.** Still deferred.
- **build-overlay-unit refactor** (`docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md`)
remains sequenced behind this; not next.
## Open questions for the next session
- Should the refactor be a single PR/commit, or split into
"ckn-bw emission" + "reference unit mirror" + "sysctl drop-in"?
Operator preference. Recommend single bundle if the changes are
small.
- Should we land Test 7's composition on `@2` first as a longer
canary before rolling to all instances? Or trust the symmetric
emission and roll everywhere at once? Currently both are running
baseline; @1 was the only canary.
- `SocketBindAllow=` for the 27000-27999 game port range: include
in the first pass, or defer to a follow-up commit? Survey lists
it, test plan didn't exercise it.
## Pointers ## Pointers
- Test plan (executed; **read the Results section first**):
`docs/superpowers/specs/2026-05-15-hardening-test-plan.md`
- Threat model: `docs/superpowers/specs/2026-05-15-hardening-threat-model.md` - Threat model: `docs/superpowers/specs/2026-05-15-hardening-threat-model.md`
- Defenses survey: `docs/superpowers/specs/2026-05-15-hardening-defenses-survey.md` - Defenses survey: `docs/superpowers/specs/2026-05-15-hardening-defenses-survey.md`
- Original uid-split (to be marked superseded): - Test plan (with executor results + this session's bug fixes):
`docs/superpowers/specs/2026-05-15-user-uid-split-design.md` `docs/superpowers/specs/2026-05-15-hardening-test-plan.md`
- Live unit emission: `~/Projekte/ckn-bw/bundles/left4me/metadata.py:150+` - Design doc: `docs/superpowers/specs/2026-05-15-hardening-refactor-design.md`
- Reference units: `deploy/files/usr/local/lib/systemd/system/` - Implementation plan: `docs/superpowers/plans/2026-05-15-hardening-refactor.md`
- Recent commit on this work: `461b8d0` - uid-split spec (marked superseded): `docs/superpowers/specs/2026-05-15-user-uid-split-design.md`
- Host SSH: `ssh left4.me` (config at `~/.ssh/config`, 1Password agent) - Live unit emission: `~/Projekte/ckn-bw/bundles/left4me/metadata.py`
(`HARDENING_COMMON` etc. near top; spreads at the
`left4me-server@.service` and `left4me-web.service` entries)
- Reference units (annotated): `deploy/files/usr/local/lib/systemd/system/`