left4me/docs/superpowers/specs/2026-05-15-hardening-test-plan.md
mwiegand 37309ba399
spec(hardening-test-plan): fix four bugs surfaced by executor
Four corrections noted by the test plan's executor in commit 461b8d0:

- PID-lookup race: pgrep+head can pick the wrong instance. Replace
  with systemctl show -p MainPID --value left4me-server@N.service.
- gdb-from-host ptrace check: nsenter into only the mount namespace
  with root caps bypasses the SECCOMP filter, so the test is a false
  positive. Replace with systemd-run-with-same-directives probe, or
  syscall-filter inspection.
- D5 pgrep pattern: 'srcds_linux.*\@2' doesn't match because @N is
  in the unit name, not argv. Use systemctl show -p MainPID.
- scmp_sys_resolver is in the seccomp package on Debian 13, not
  libseccomp-dev.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 14:58:46 +02:00


# left4me application hardening — test plan
**Status:** **tested 2026-05-15** on `left4.me` / `left4me.ovh.ckn.li`
(Debian 13 trixie, systemd 257). See "Results" section near the
bottom for the per-test outcomes. Companion to
`2026-05-15-hardening-threat-model.md` and
`2026-05-15-hardening-defenses-survey.md`.
This document is intentionally self-contained: a session that lands cold
with shell on `left4.me` can execute it end-to-end without re-reading
the threat model or survey. Decisions made in this plan are based on the
candidate composition in the defenses survey (Section 5).
## Test architecture
### Where we test
- **Host:** `left4.me` / `ovh.left4me` (141.95.32.8). Production host;
no separate test bench. (Reference: memory entry
`feedback_test_server_hangs.md` mentions a separate test server at
`ckn@10.0.4.128`; verify whether that host is suitable for this work
*before* using prod.)
- **Canary unit:** `left4me-server@1.service`. Use this as the test
instance. Leave `left4me-server@2.service` running baseline so at
least one server stays up if the canary breaks.
- **Web unit:** `left4me-web.service` is shared. Test web-side
hardening only after server@ tests prove the composition; web is
more disruptive to roll back.
### Operating constraints
- **System units only.** No `systemctl --user`, no lingering, no
per-user systemd instance. All units under `/etc/systemd/system/` or
`/usr/local/lib/systemd/system/`. Drop-ins go to
`/etc/systemd/system/<unit>.d/`.
- **Drop-in style.** Tests apply via `/etc/systemd/system/left4me-server@1.service.d/test-NN-<name>.conf`
(note: `@1` for instance-specific). This leaves the template
unmodified — other instances unaffected. `systemctl daemon-reload`
picks up drop-ins; `systemctl restart left4me-server@1` applies.
- **Cleanup required.** Each test removes its drop-in before the next
starts. Baseline must be restorable at any point.
- **Recording.** Each test produces a one-paragraph result in this
document's "Results" section at the bottom. Append, don't replace.
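The drop-in workflow above (write the file, reload, restart, and later remove it) can be wrapped in two small helpers. This is a minimal sketch, not part of the plan itself: `DROPIN_ROOT` and `SYSTEMCTL` are hypothetical knobs introduced here so the helpers can be dry-run against a scratch directory before touching `/etc/systemd/system`.

```bash
# Sketch: apply/remove a per-instance test drop-in.
# DROPIN_ROOT and SYSTEMCTL are illustrative assumptions, not existing settings.
DROPIN_ROOT="${DROPIN_ROOT:-/etc/systemd/system}"
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# dropin_path UNIT NN NAME -> prints the drop-in file path
dropin_path() {
  printf '%s/%s.d/test-%s-%s.conf\n' "$DROPIN_ROOT" "$1" "$2" "$3"
}

# apply_dropin UNIT NN NAME   (drop-in body is read from stdin)
apply_dropin() {
  local path
  path="$(dropin_path "$1" "$2" "$3")"
  install -d -m0755 "$(dirname "$path")"
  cat > "$path"
  "$SYSTEMCTL" daemon-reload
  "$SYSTEMCTL" restart "$1"
}

# remove_dropin UNIT NN NAME   (restores the previous unit definition)
remove_dropin() {
  rm -f "$(dropin_path "$1" "$2" "$3")"
  "$SYSTEMCTL" daemon-reload
  "$SYSTEMCTL" restart "$1"
}
```

Run as root, `printf '[Service]\nPrivateUsers=true\n' | apply_dropin left4me-server@1.service 01 privateusers` would produce the same file as Test 1's drop-in step, and `remove_dropin` with the same arguments mirrors its cleanup.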
### Failure modes to watch for
- **SECCOMP audit:** `journalctl -k --since '1 minute ago' | grep -i seccomp`
shows `type=1326` lines. Each is a syscall denied; the syscall number
identifies the call. Use `scmp_sys_resolver` to translate.
- **Unit start failure:** `systemctl is-active left4me-server@1` returns `inactive` or `failed`.
- **srcds crash mid-game:** `journalctl -u left4me-server@1 -f` shows
unexpected exit; `systemctl show left4me-server@1 -p Result` is
not `success`.
- **sourcemod/metamod plugin failures:** in-game `sm plugins list` or
RCON `sm plugins list` shows plugins as failed-to-load.
- **Permission denied where unexpected:** `journalctl -u left4me-server@1`
shows `Permission denied` or `Operation not permitted`.
## Before any test: baseline capture
Capture these so we can compare after each test, and so we have a
known-good snapshot to revert to.
```bash
# 1. Baseline systemd-analyze score
sudo systemd-analyze security left4me-server@1.service \
| tee /tmp/sec-baseline-server.txt
sudo systemd-analyze security left4me-web.service \
| tee /tmp/sec-baseline-web.txt
# 2. Full current unit (cat'd, post-merge with any existing drop-ins)
sudo systemctl cat left4me-server@1.service \
| tee /tmp/unit-baseline-server.conf
sudo systemctl cat left4me-web.service \
| tee /tmp/unit-baseline-web.conf
# 3. Current sysctl
sysctl kernel.yama.ptrace_scope | tee /tmp/sysctl-baseline.txt
# Spec assumption: kernel.yama.ptrace_scope = 1. Record the actual value
# (this host reported 0, see Results).
# 4. Functional baseline — confirm both servers + web healthy now
sudo systemctl is-active left4me-server@1 left4me-server@2 left4me-web
# Expect: active active active
# 5. Confirm srcds_linux running, gunicorn running
sudo systemctl status left4me-server@1 left4me-server@2 left4me-web \
--no-pager | head -40
# 6. RCON sanity (optional — needs an RCON password)
# (Use the web UI to fire `status` against server@1; expect a reply.)
# 7. Capture baseline syscalls (to compare what's blocked after filter)
# This is heavy; only run if you suspect a filter is too tight:
# sudo systemctl edit --runtime left4me-server@1
# Add: SystemCallLog=@privileged
# Reload, restart, observe journalctl -u for ~5 minutes, then revert.
```
Record `/tmp/sec-baseline-server.txt` score (a value like "5.4 EXPOSED"
is typical). Goal: lower (more secure) after refactor.
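To compare scores mechanically rather than by eye, the overall exposure level can be pulled out of the saved `systemd-analyze security` output. The sketch below assumes the summary line has the form `Overall exposure level for <unit>: <score> <rating>`; the function names are illustrative.

```bash
# exposure_score FILE -> prints the numeric overall exposure level
exposure_score() {
  grep -o 'Overall exposure level for [^:]*: [0-9.]*' "$1" \
    | awk '{ print $NF }' | tail -n 1
}

# score_improved BEFORE_FILE AFTER_FILE -> exit 0 if the score went down
score_improved() {
  awk -v b="$(exposure_score "$1")" -v a="$(exposure_score "$2")" \
    'BEGIN { exit !(a + 0 < b + 0) }'
}
```

For example, `score_improved /tmp/sec-baseline-server.txt /tmp/sec-after-server.txt && echo lower` prints `lower` only when the second file reports a smaller score than the first.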
## Test 1 — `PrivateUsers=true` compatibility
**Goal:** Confirm `PrivateUsers=true` works on `left4me-server@.service`
with the `+`-prefixed `ExecStartPre` overlay-mount helper.
**Pre-condition:** server@1 active, baseline captured.
**Drop-in:**
```bash
sudo install -d -m0755 /etc/systemd/system/left4me-server@1.service.d/
sudo tee /etc/systemd/system/left4me-server@1.service.d/test-01-privateusers.conf <<'EOF'
[Service]
PrivateUsers=true
EOF
sudo systemctl daemon-reload
sudo systemctl restart left4me-server@1
```
**Verify:**
```bash
# 1. Unit started cleanly
sudo systemctl is-active left4me-server@1
# Expect: active
# 2. ExecStartPre's nsenter+overlay-mount succeeded (the mount exists)
sudo findmnt /var/lib/left4me/runtime/1/merged
# Expect: a row showing overlay mounted
# 3. Process is running
pgrep -af srcds_linux
# Expect: at least one PID matching left4dead2
# 4. From inside the unit's namespace: process appears as configured uid
PID=$(systemctl show -p MainPID --value left4me-server@1.service)
sudo cat /proc/$PID/status | grep -E '^Uid|^Gid'
# Expect: uid 980 (left4me) — outside the namespace, the kernel reports
# the unit's User=. Inside the namespace it's also 980 (identity map).
# 5. Userns confirmed
sudo readlink /proc/$PID/ns/user
sudo readlink /proc/1/ns/user
# Expect: different — different user namespaces
```
**Pass criteria:** all five checks pass.
**Failure handling:** if unit fails to start, check
`journalctl -u left4me-server@1 -n 100` for the failure reason. Most
likely cause if it fails: the overlay-mount helper itself depends on
the unit's mount namespace in a way that PrivateUsers breaks. (The `+`
prefix should bypass — verifying that assumption is the test's whole
point.)
**Cleanup:**
```bash
sudo rm /etc/systemd/system/left4me-server@1.service.d/test-01-privateusers.conf
sudo systemctl daemon-reload
sudo systemctl restart left4me-server@1
sudo systemctl is-active left4me-server@1 # active again
```
---
## Test 2 — `TemporaryFileSystem` + minimal bind set
**Goal:** Confirm srcds runs with `/var/lib`, `/etc`, `/opt`, `/home`,
`/root` virtualized to empty tmpfs, with only the listed paths bound back.
**Drop-in:**
```bash
sudo tee /etc/systemd/system/left4me-server@1.service.d/test-02-tmpfs.conf <<'EOF'
[Service]
# Remove the legacy paths so they don't collide with the new bind setup
ReadOnlyPaths=
ReadWritePaths=
# Virtual filesystem
TemporaryFileSystem=/var/lib /etc /opt /home /root /srv /mnt /media
BindReadOnlyPaths=/var/lib/left4me/installation
BindReadOnlyPaths=/var/lib/left4me/overlays
BindReadOnlyPaths=/etc/left4me/host.env
BindReadOnlyPaths=/etc/ssl /etc/ca-certificates
BindReadOnlyPaths=/etc/resolv.conf /etc/nsswitch.conf /etc/alternatives
BindPaths=/var/lib/left4me/runtime/%i
EOF
sudo systemctl daemon-reload
sudo systemctl restart left4me-server@1
```
**Verify:**
```bash
# 1. Unit started
sudo systemctl is-active left4me-server@1
# 2. From inside the unit's namespace: invisible files
PID=$(systemctl show -p MainPID --value left4me-server@1.service)
sudo nsenter --target $PID --mount -- ls -la /var/lib/left4me/left4me.db 2>&1
# Expect: No such file or directory
sudo nsenter --target $PID --mount -- ls -la /etc/left4me/web.env 2>&1
# Expect: No such file or directory
sudo nsenter --target $PID --mount -- ls /opt 2>&1
# Expect: empty or "No such file or directory"
sudo nsenter --target $PID --mount -- ls /var/lib/left4me/
# Expect: only installation, overlays, runtime (the bound paths)
# 3. Bound paths visible and right mode
sudo nsenter --target $PID --mount -- ls -la /var/lib/left4me/runtime/1/
# Expect: upper, work, merged dirs visible, RW
sudo nsenter --target $PID --mount -- ls /etc/left4me/
# Expect: only host.env
# 4. DNS works (workshop downloads, master server)
sudo nsenter --target $PID --mount --net -- getent hosts steamcommunity.com
# Expect: an IP
# 5. Game running normally
sudo systemctl status left4me-server@1 --no-pager | head -15
# Expect: active (running)
# 6. No SECCOMP/EACCES errors
sudo journalctl -u left4me-server@1 --since '2 minutes ago' \
| grep -iE 'permission|denied|seccomp|EACCES|ENOENT' | head -20
# Expect: nothing alarming. Some ENOENT may be normal (srcds probes
# files); the question is whether anything is failing fatally.
```
**Pass criteria:** unit active, DB/web.env/src invisible, runtime
visible+writable, DNS works, no fatal errors in journal.
**Failure handling:** if a bind path is missing on disk, the unit
fails to start with a clear error. Add the missing path or remove the
bind reference.
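A missing bind source can be caught before restarting the unit. The following is a rough pre-flight sketch under simplifying assumptions: paths are whitespace-separated and unquoted, only the `%i` specifier is expanded, and entries with a leading `-` (ignore-if-missing) are skipped. The function name is hypothetical.

```bash
# missing_bind_sources DROPIN INSTANCE
# Prints every BindPaths=/BindReadOnlyPaths= source that does not exist on disk.
missing_bind_sources() {
  local dropin="$1" instance="$2" line path
  grep -E '^Bind(ReadOnly)?Paths=' "$dropin" | while IFS= read -r line; do
    for path in ${line#*=}; do
      case "$path" in -*) continue ;; esac    # "-" prefix: allowed to be missing
      path="${path%%:*}"                      # drop ":destination[:options]"
      path="$(printf '%s' "$path" | sed "s|%i|$instance|g")"
      [ -e "$path" ] || printf '%s\n' "$path"
    done
  done
}
```

Calling `missing_bind_sources /etc/systemd/system/left4me-server@1.service.d/test-02-tmpfs.conf 1` should print nothing when every source path is present.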
**Cleanup:**
```bash
sudo rm /etc/systemd/system/left4me-server@1.service.d/test-02-tmpfs.conf
sudo systemctl daemon-reload
sudo systemctl restart left4me-server@1
```
---
## Test 3 — `SystemCallFilter` (logging mode)
**Goal:** Discover what srcds calls under load before committing to a
filter. Run with `SystemCallLog=` (audit only, doesn't block) for 5-10
minutes of live play.
**Drop-in:**
```bash
sudo tee /etc/systemd/system/left4me-server@1.service.d/test-03-syslog.conf <<'EOF'
[Service]
# srcds_linux is a 32-bit i386 binary; "native" alone kills it (see Results, Test 3)
SystemCallArchitectures=native x86
# Log every syscall in @privileged + @debug + @mount + @raw-io
SystemCallLog=@privileged @debug @mount @raw-io
SystemCallFilter=
EOF
sudo systemctl daemon-reload
sudo systemctl restart left4me-server@1
```
**Verify (and produce data):**
```bash
# 1. Unit active
sudo systemctl is-active left4me-server@1
# 2. Capture logs for 5 minutes during normal play
# (manually connect a Steam client to the server, walk around, then disconnect)
sudo journalctl -u left4me-server@1 --since '5 minutes ago' \
| grep -iE 'audit|syscall|SCMP' \
| tee /tmp/syscall-log-test3.txt
# 3. Analyze
sort -u /tmp/syscall-log-test3.txt > /tmp/syscall-log-test3-uniq.txt
wc -l /tmp/syscall-log-test3-uniq.txt
# Read through; identify whether @debug or @mount or @privileged
# contains any syscall srcds calls during normal operation.
```
**Pass criteria:** capture is complete. Decision feeds Test 4.
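Audit records identify calls by number, so a frequency table of `syscall=<num>` fields is often easier to read than the sorted raw lines. A small sketch, assuming the capture file contains kernel audit lines with a `syscall=<num>` field (the function name is illustrative):

```bash
# syscall_tally FILE -> "<count> syscall=<num>" lines, most frequent first
syscall_tally() {
  grep -o 'syscall=[0-9]*' "$1" | sort | uniq -c | sort -rn \
    | awk '{ print $1, $2 }'
}
```

`syscall_tally /tmp/syscall-log-test3.txt` then gives one row per distinct syscall number, which can be resolved to names with `scmp_sys_resolver`.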
**Cleanup:**
```bash
sudo rm /etc/systemd/system/left4me-server@1.service.d/test-03-syslog.conf
sudo systemctl daemon-reload
sudo systemctl restart left4me-server@1
```
---
## Test 4 — `SystemCallFilter` (enforcement mode)
**Goal:** Apply the candidate `SystemCallFilter=` and confirm srcds
runs without any SECCOMP-killed calls. Tightness driven by Test 3
results.
**Drop-in:**
```bash
sudo tee /etc/systemd/system/left4me-server@1.service.d/test-04-syscall.conf <<'EOF'
[Service]
# srcds_linux is a 32-bit i386 binary; "native" alone kills it (see Results, Test 3)
SystemCallArchitectures=native x86
SystemCallFilter=@system-service
SystemCallFilter=~@debug @mount @raw-io @reboot @swap @cpu-emulation @obsolete @privileged
EOF
sudo systemctl daemon-reload
sudo systemctl restart left4me-server@1
```
**Verify:**
```bash
# 1. Unit active
sudo systemctl is-active left4me-server@1
# 2. Watch for SECCOMP kills for ~10 minutes during play
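sudo journalctl -k -f | grep --line-buffered 'type=1326'
# (-k is not combined with -u here: kernel audit records are not tagged
# with the unit, so the two filters together match nothing)
# Press Ctrl-C after 10 min if no SECCOMP audit lines (type=1326) appear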
# 3. Functional: server accepts connections, plugins load
# (use Steam client; verify in-game)
# Optional RCON check:
# sudo rcon -p $PW -a left4.me:27015 "sm plugins list"
# Expect: list of plugins, all loaded.
# 4. Verify ptrace is blocked
GUNICORN_PID=$(pgrep -f 'gunicorn.*l4d2web' | head -1)
PID=$(systemctl show -p MainPID --value left4me-server@1.service)
# NOTE: A naive `sudo nsenter --target $PID --mount -- gdb -p $TARGET`
# runs gdb as root with full caps in only the mount namespace; the
# unit's SECCOMP filter doesn't apply, so the result is not meaningful.
# Use one of these instead:
#
# Option A: probe inside the same hardening profile.
sudo systemd-run --pty --uid=left4me --gid=left4me \
-p NoNewPrivileges=true \
-p PrivateUsers=true \
-p CapabilityBoundingSet= \
-p AmbientCapabilities= \
-p SystemCallArchitectures='native x86' \
-p SystemCallFilter='@system-service' \
-p SystemCallFilter='~@debug @mount @raw-io @reboot @swap @cpu-emulation @obsolete @privileged' \
-- /usr/bin/gdb --batch -p $GUNICORN_PID 2>&1 | tail -3
# Expect: ptrace: Operation not permitted (or seccomp-related kill)
#
# Option B: inspect the unit's resolved SystemCallFilter allow list directly.
# (`systemd-analyze syscall-filter` takes @set names, not unit names.)
systemctl show -p SystemCallFilter --value left4me-server@1.service \
| tr ' ' '\n' | grep -E '^(ptrace|process_vm)' || echo "blocked (not in allow list)"
# Expect: "blocked (not in allow list)"
```
**Pass criteria:** unit active for ≥10 min, no SECCOMP kills, plugins
load, ptrace blocked.
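**Failure handling:** if SECCOMP kills appear:
- Identify the syscall from the audit line (`arch=<hex> syscall=<num> compat=<0|1>`)
and resolve it via `scmp_sys_resolver -a <arch> <num>` (`seccomp` package on Debian 13).
Use `-a x86` for the 32-bit srcds binary (`arch=40000003`) and `-a x86_64`
for 64-bit processes (`arch=c000003e`); `$(uname -m)` only covers the latter.
- Relax the filter: remove the offending group from the deny list, OR
switch from kill (the default) to returning an error. `SystemCallErrorNumber=EPERM`
does this for the whole filter; a `:EPERM` suffix on a syscall name in
`SystemCallFilter=` does it for that call only.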
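The resolution step can be scripted so that each audit record yields a ready-to-run resolver command with the right architecture. This is a sketch that assumes the kernel's usual `arch=<hex> syscall=<num>` field order in `type=1326` records and covers only the two architectures relevant on this host; the function names are illustrative.

```bash
# audit_arch_name HEX -> libseccomp arch name for `scmp_sys_resolver -a`
audit_arch_name() {
  case "$1" in
    c000003e) echo x86_64 ;;   # AUDIT_ARCH_X86_64
    40000003) echo x86 ;;      # AUDIT_ARCH_I386 (32-bit srcds)
    *)        echo unknown ;;
  esac
}

# seccomp_resolve_cmds: reads journal/audit lines on stdin and prints one
# scmp_sys_resolver command per type=1326 record.
seccomp_resolve_cmds() {
  grep 'type=1326' \
    | sed -n 's/.*arch=\([0-9a-f]*\) syscall=\([0-9]*\).*/\1 \2/p' \
    | while read -r arch nr; do
        printf 'scmp_sys_resolver -a %s %s\n' "$(audit_arch_name "$arch")" "$nr"
      done
}
```

Typical use would be `sudo journalctl -k --since '10 minutes ago' | seccomp_resolve_cmds | sort -u`, then running the printed commands to get syscall names.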
**Cleanup:**
```bash
sudo rm /etc/systemd/system/left4me-server@1.service.d/test-04-syscall.conf
sudo systemctl daemon-reload
sudo systemctl restart left4me-server@1
```
---
## Test 5 — `ProcSubset=pid` + `ProtectProc=invisible`
**Goal:** Confirm /proc is narrowed to the unit's own PIDs and
hidden from external readers.
**Drop-in:**
```bash
sudo tee /etc/systemd/system/left4me-server@1.service.d/test-05-proc.conf <<'EOF'
[Service]
ProtectProc=invisible
ProcSubset=pid
EOF
sudo systemctl daemon-reload
sudo systemctl restart left4me-server@1
```
**Verify:**
```bash
# 1. Unit active
sudo systemctl is-active left4me-server@1
# 2. /proc visibility narrowed
PID=$(systemctl show -p MainPID --value left4me-server@1.service)
sudo nsenter --target $PID --mount --pid -- ls /proc | head -20
# Expect: only the unit's own PIDs (srcds_run, srcds_linux,
# child threads). NOT gunicorn or other PIDs.
# 3. Can't read other procs' environ
GUNICORN_PID=$(pgrep -f 'gunicorn.*l4d2web' | head -1)
sudo nsenter --target $PID --mount -- cat /proc/$GUNICORN_PID/environ 2>&1
# Expect: No such file or directory (invisible) — not Permission denied
```
**Pass criteria:** all of the above; no gunicorn PIDs visible.
**Cleanup:**
```bash
sudo rm /etc/systemd/system/left4me-server@1.service.d/test-05-proc.conf
sudo systemctl daemon-reload
sudo systemctl restart left4me-server@1
```
---
## Test 6 — `MemoryDenyWriteExecute=true`
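**Goal:** Test whether the Source engine and sourcemod work under
`MemoryDenyWriteExecute=true`. **Likely to fail**, because plugin runtimes
that JIT-compile code need memory that is both writable and executable.
Skip if uncertain.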
**Drop-in:**
```bash
sudo tee /etc/systemd/system/left4me-server@1.service.d/test-06-mdw.conf <<'EOF'
[Service]
MemoryDenyWriteExecute=true
EOF
sudo systemctl daemon-reload
sudo systemctl restart left4me-server@1
```
**Verify:**
```bash
# 1. Unit active
sudo systemctl is-active left4me-server@1
# 2. Run for 10+ minutes during normal play, including:
# - Connect a Steam client
# - Walk around a map
# - Trigger a plugin (rcon: sm_admin)
# - Map change
# - Disconnect
# 3. Watch for crashes
sudo journalctl -u left4me-server@1 --since '15 minutes ago' \
| grep -iE 'segfault|SIGSEGV|coredump|abort|EPERM.*mprotect'
# Expect: empty
# 4. SECCOMP audit lines (records carry syscall numbers, not names, so
# match on the record type and resolve numbers with scmp_sys_resolver)
sudo journalctl -k --since '15 minutes ago' \
| grep 'type=1326'
# Expect: empty
```
**Pass criteria:** no crashes, no relevant SECCOMP audit lines.
**Cleanup:**
```bash
sudo rm /etc/systemd/system/left4me-server@1.service.d/test-06-mdw.conf
sudo systemctl daemon-reload
sudo systemctl restart left4me-server@1
```
**Decision:** if pass → include `MemoryDenyWriteExecute=true` in the
final composition. If fail → exclude (and document the reason in the
result).
---
## Test 7 — Full proposed composition (everything that passed)
**Goal:** Compose tests 1, 2, 4, 5, (6 if it passed) into a single
drop-in and verify nothing interacts badly.
**Drop-in:** (Adjust to skip Test 6's directives if Test 6 failed.)
```bash
sudo tee /etc/systemd/system/left4me-server@1.service.d/test-07-full.conf <<'EOF'
[Service]
# Identity / privilege
NoNewPrivileges=true
RestrictSUIDSGID=true
CapabilityBoundingSet=
AmbientCapabilities=
UMask=0027
# Namespaces
PrivateUsers=true
PrivateTmp=true
PrivateDevices=true
PrivateIPC=true
ProtectHome=true
# Filesystem view (clean slate)
ReadOnlyPaths=
ReadWritePaths=
TemporaryFileSystem=/var/lib /etc /opt /home /root /srv /mnt /media
BindReadOnlyPaths=/var/lib/left4me/installation
BindReadOnlyPaths=/var/lib/left4me/overlays
BindReadOnlyPaths=/etc/left4me/host.env
BindReadOnlyPaths=/etc/ssl /etc/ca-certificates
BindReadOnlyPaths=/etc/resolv.conf /etc/nsswitch.conf /etc/alternatives
BindPaths=/var/lib/left4me/runtime/%i
ProtectSystem=strict
# /proc + kernel
ProtectProc=invisible
ProcSubset=pid
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectKernelLogs=true
ProtectClock=true
ProtectControlGroups=true
ProtectHostname=true
LockPersonality=true
# Syscall
# srcds_linux is a 32-bit i386 binary; "native" alone kills it (see Results, Test 3)
SystemCallArchitectures=native x86
SystemCallFilter=@system-service
SystemCallFilter=~@debug @mount @raw-io @reboot @swap @cpu-emulation @obsolete @privileged
# Network
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
# IPC + realtime + namespaces
RestrictNamespaces=true
RestrictRealtime=true
RemoveIPC=true
KeyringMode=private
# (Include only if Test 6 passed:)
# MemoryDenyWriteExecute=true
EOF
sudo systemctl daemon-reload
sudo systemctl restart left4me-server@1
```
**Verify:**
```bash
# 1. Unit active
sudo systemctl is-active left4me-server@1
sleep 30
sudo systemctl is-active left4me-server@1 # still active
# 2. systemd-analyze: score should drop significantly
sudo systemd-analyze security left4me-server@1.service \
| tee /tmp/sec-after-server.txt
diff /tmp/sec-baseline-server.txt /tmp/sec-after-server.txt \
| head -40
# Expect: many ✓ lines that were ✗, score dropped
# 3. Run smoke matrix (next section)
```
**Smoke matrix (run after Test 7 settles):**
```bash
# S1: server is responsive
sudo systemctl status left4me-server@1 --no-pager | head -10
# Active (running), recent green
# S2: srcds is in-game
PID=$(systemctl show -p MainPID --value left4me-server@1.service)
[ -n "$PID" ] && echo "OK: srcds PID $PID" || echo "FAIL"
# S3: from outside, RCON responds
# (do this from the operator's laptop or via the web UI)
# S4: workshop / overlay refresh path
# (trigger from web UI; verify the overlay rebuild succeeds — the
# script-sandbox is a SEPARATE unit, not affected by these changes,
# so any failure is in the web app's invocation path, not the
# sandbox itself.)
# S5: web app can still sudo helpers
# (trigger a server start/stop from the web UI; if the sudo path
# fails, the web app's hardening is too tight — but we haven't
# changed the web unit yet, so this should still work.)
# S6: log streaming works
# (open the web UI's log view for server@1; verify lines flow.)
# S7: file upload to overlay
# (upload a small file via the file-tree endpoint; verify it
# appears on disk in /var/lib/left4me/overlays/<id>/.)
# S8: peer server unaffected
sudo systemctl is-active left4me-server@2
# active (we didn't touch it)
```
**Pass criteria:** all smoke items pass. systemd-analyze score
dropped significantly.
**Failure handling:** if anything in the smoke fails, identify which
directive caused it by removing them one at a time until smoke
passes. Document the offender.
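One-at-a-time removal is easier to do consistently if the candidate drop-ins are generated up front. The sketch below writes one variant per directive line, each with exactly that line removed; it is an illustration with a hypothetical function name, and it treats every `Key=value` line independently (so removing one of the two `SystemCallFilter=` lines changes the meaning of the other and should be judged by hand).

```bash
# dropin_variants FILE OUTDIR
# Writes OUTDIR/without-<lineno>.conf for each directive line in FILE and
# prints "<lineno> <directive>" so the offender can be named afterwards.
dropin_variants() {
  local file="$1" outdir="$2" n directive
  mkdir -p "$outdir"
  grep -nE '^[A-Za-z]+=' "$file" | while IFS=: read -r n directive; do
    sed "${n}d" "$file" > "$outdir/without-$n.conf"
    printf '%s %s\n' "$n" "$directive"
  done
}
```

Each variant can then be installed in place of `test-07-full.conf` in turn, re-running the smoke matrix until it passes.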
**DO NOT cleanup yet** — leave Test 7 in place for Test 8.
---
## Test 8 — Attack verification (the audit gaps)
**Goal:** Confirm the threat-model defenses (D1, D2, D3, D5) actually
work end-to-end.
**Pre-condition:** Test 7's drop-in still in place.
**Verify:**
```bash
PID=$(systemctl show -p MainPID --value left4me-server@1.service)
GUNICORN_PID=$(pgrep -f 'gunicorn.*l4d2web' | head -1)
# D1.a — srcds cannot read DB
sudo nsenter --target $PID --mount -- cat /var/lib/left4me/left4me.db 2>&1 | head -1
# Expect: cat: /var/lib/left4me/left4me.db: No such file or directory
# D1.b — srcds cannot read web.env
sudo nsenter --target $PID --mount -- cat /etc/left4me/web.env 2>&1 | head -1
# Expect: cat: /etc/left4me/web.env: No such file or directory
# D1.c — srcds cannot read its own past
sudo nsenter --target $PID --mount -- ls /opt 2>&1 | head -5
# Expect: empty listing or No such file or directory
# D2.a — srcds cannot ptrace gunicorn (syscall filter)
# NOTE: A naive `sudo nsenter --target $PID --mount -- gdb -p $TARGET`
# runs gdb as root with full caps in only the mount namespace; the
# unit's SECCOMP filter doesn't apply, so the result is not meaningful.
# Use one of these instead:
#
# Option A: probe inside the same hardening profile.
sudo systemd-run --pty --uid=left4me --gid=left4me \
-p NoNewPrivileges=true \
-p PrivateUsers=true \
-p CapabilityBoundingSet= \
-p AmbientCapabilities= \
-p SystemCallArchitectures='native x86' \
-p SystemCallFilter='@system-service' \
-p SystemCallFilter='~@debug @mount @raw-io @reboot @swap @cpu-emulation @obsolete @privileged' \
-- /usr/bin/gdb --batch -p $GUNICORN_PID 2>&1 | tail -3
# Expect: ptrace: Operation not permitted (or seccomp-related kill)
#
# Option B: inspect the unit's resolved SystemCallFilter allow list directly.
# (`systemd-analyze syscall-filter` takes @set names, not unit names.)
systemctl show -p SystemCallFilter --value left4me-server@1.service \
| tr ' ' '\n' | grep -E '^(ptrace|process_vm)' || echo "blocked (not in allow list)"
# Expect: "blocked (not in allow list)"
# D2.b — srcds cannot read /proc/<gunicorn>/environ
sudo nsenter --target $PID --mount -- cat /proc/$GUNICORN_PID/environ 2>&1 | head -1
# Expect: No such file or directory (ProtectProc=invisible)
# D2.c — srcds cannot read /proc/<gunicorn>/mem
sudo nsenter --target $PID --mount -- cat /proc/$GUNICORN_PID/mem 2>&1 | head -1
# Expect: No such file or directory
# D3 — srcds cannot use sudo helpers (NoNewPrivileges blocks setuid)
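# NOTE: `sudo nsenter --target $PID --mount -- sudo -n ...` would run as root
# outside the unit's NoNewPrivileges context, so probe via systemd-run instead.
sudo systemd-run --pty --uid=left4me --gid=left4me \
-p NoNewPrivileges=true \
-- /usr/bin/sudo -n /usr/local/libexec/left4me/left4me-systemctl show server@2 2>&1 | head -3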
# Expect: a sudo error about no new privileges, or operation not permitted
# D5 — server@1 cannot ptrace server@2's srcds
PID2=$(systemctl show -p MainPID --value left4me-server@2.service)
# NOTE: A naive `sudo nsenter --target $PID --mount -- gdb -p $TARGET`
# runs gdb as root with full caps in only the mount namespace; the
# unit's SECCOMP filter doesn't apply, so the result is not meaningful.
# Use one of these instead:
#
# Option A: probe inside the same hardening profile.
[ -n "$PID2" ] && sudo systemd-run --pty --uid=left4me --gid=left4me \
-p NoNewPrivileges=true \
-p PrivateUsers=true \
-p CapabilityBoundingSet= \
-p AmbientCapabilities= \
-p SystemCallArchitectures='native x86' \
-p SystemCallFilter='@system-service' \
-p SystemCallFilter='~@debug @mount @raw-io @reboot @swap @cpu-emulation @obsolete @privileged' \
-- /usr/bin/gdb --batch -p $PID2 2>&1 | tail -3
# Expect: ptrace: Operation not permitted (or seccomp-related kill)
#
# Option B: inspect the unit's resolved SystemCallFilter allow list directly.
# (`systemd-analyze syscall-filter` takes @set names, not unit names.)
systemctl show -p SystemCallFilter --value left4me-server@1.service \
| tr ' ' '\n' | grep -E '^(ptrace|process_vm)' || echo "blocked (not in allow list)"
# Expect: "blocked (not in allow list)"
# Bonus — confirm PrivateUsers is in effect
sudo readlink /proc/$PID/ns/user
sudo readlink /proc/1/ns/user
# Expect: different
```
**Pass criteria:** every attack vector returns an error.
**Cleanup:** **Do not remove the drop-in yet** — leave it for Test 9.
---
## Test 9 — System-wide sysctl: `kernel.yama.ptrace_scope=2`
**Goal:** Add belt-and-braces system-wide.
**Apply:**
```bash
sudo tee /etc/sysctl.d/99-left4me-ptrace.conf <<'EOF'
# Block ptrace except from root (CAP_SYS_PTRACE).
# Combined with SystemCallFilter=~@debug + PrivateUsers=true in the
# unit, this gives defense-in-depth at three levels.
kernel.yama.ptrace_scope=2
EOF
sudo sysctl --system | grep yama
# Expect: kernel.yama.ptrace_scope = 2
sysctl kernel.yama.ptrace_scope
# Expect: 2
```
**Verify:**
```bash
# As left4me (no caps), gdb attach to gunicorn from OUTSIDE the unit's
# namespace
GUNICORN_PID=$(pgrep -f 'gunicorn.*l4d2web' | head -1)
sudo -u left4me /usr/bin/gdb --batch -p $GUNICORN_PID 2>&1 | tail -3
# Expect: Operation not permitted
# Operator gdb (as root) still works:
sudo /usr/bin/gdb --batch -ex "info threads" -p $GUNICORN_PID 2>&1 | tail -10
# Expect: gdb output (debugging is admin-only now)
```
**Pass criteria:** non-root can't ptrace anything; root still can.
**No cleanup** — this is permanent (commit to /etc/sysctl.d/).
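For the result write-up it helps to state what each Yama level means, especially when the starting value differs from the assumed one. The helpers below are a small sketch with illustrative names: one reads the configured value back from a sysctl.d fragment, the other maps a level to a one-line description based on the kernel's Yama documentation.

```bash
# ptrace_scope_meaning N -> one-line description of a Yama ptrace_scope level
ptrace_scope_meaning() {
  case "$1" in
    0) echo "classic: may attach to any process with the same uid" ;;
    1) echo "restricted: may attach only to descendants or declared tracees" ;;
    2) echo "admin-only: attach requires CAP_SYS_PTRACE" ;;
    3) echo "no attach: ptrace attach disabled until reboot" ;;
    *) echo "unknown"; return 1 ;;
  esac
}

# ptrace_scope_from_conf FILE -> value configured in a sysctl.d fragment
ptrace_scope_from_conf() {
  sed -n 's/^[[:space:]]*kernel\.yama\.ptrace_scope[[:space:]]*=[[:space:]]*\([0-9]\).*/\1/p' "$1" \
    | tail -n 1
}
```

For instance, `ptrace_scope_meaning "$(ptrace_scope_from_conf /etc/sysctl.d/99-left4me-ptrace.conf)"` describes the level that the fragment from this test configures.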
---
## Test 10 — Web unit hardening (carefully)
**Goal:** Apply non-sudo-breaking directives to `left4me-web.service`.
**Pre-condition:** Test 7's server drop-in still in place. Web is at
baseline.
**Drop-in:**
```bash
sudo install -d -m0755 /etc/systemd/system/left4me-web.service.d/
sudo tee /etc/systemd/system/left4me-web.service.d/test-10-web.conf <<'EOF'
[Service]
# (NoNewPrivileges intentionally NOT set — web sudoes to helpers.)
# (PrivateUsers intentionally NOT set — would break sudo's setuid.)
# (CapabilityBoundingSet not set — sudo + PAM need caps.)
ProtectSystem=strict
ProtectHome=true
LockPersonality=true
UMask=0027
# /proc + kernel
ProtectProc=invisible
ProcSubset=pid
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectKernelLogs=true
ProtectClock=true
ProtectControlGroups=true
ProtectHostname=true
# Syscall (no ~@privileged — sudo needs setuid/etc.)
SystemCallArchitectures=native
SystemCallFilter=@system-service
SystemCallFilter=~@debug @mount @raw-io @reboot @swap @cpu-emulation @obsolete
# Network
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
# Misc
RestrictNamespaces=true
RestrictRealtime=true
RemoveIPC=true
EOF
sudo systemctl daemon-reload
sudo systemctl restart left4me-web
```
**Verify:**
```bash
# 1. Web up
sudo systemctl is-active left4me-web
# 2. Web responds (curl from the host)
curl -sI http://127.0.0.1:8000/ | head -5
# Expect: HTTP/1.1 200 or similar (whatever the default route is)
# 3. Web sudo path works — trigger from operator's laptop, watching the
# web UI. Start/stop a server; observe success.
# 4. systemd-analyze score
sudo systemd-analyze security left4me-web.service \
| tee /tmp/sec-after-web.txt
diff /tmp/sec-baseline-web.txt /tmp/sec-after-web.txt | head -20
# 5. Web cannot ptrace srcds (D4)
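WEB_PID=$(pgrep -f 'gunicorn.*l4d2web' | head -1)
PID=$(systemctl show -p MainPID --value left4me-server@1.service)
sudo -u left4me /usr/bin/gdb --batch -p $PID 2>&1 | tail -3
# (this runs outside the web unit, so it only exercises ptrace_scope and
# would still succeed as root; what matters is the result from inside
# the web unit's hardening profile)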
# NOTE: A naive `sudo nsenter --target $WEB_PID --mount -- gdb -p $TARGET`
# runs gdb as root with full caps in only the mount namespace; the
# unit's SECCOMP filter doesn't apply, so the result is not meaningful.
# Use one of these instead:
#
# Option A: probe inside the same hardening profile.
sudo systemd-run --pty --uid=left4me --gid=left4me \
-p NoNewPrivileges=true \
-p CapabilityBoundingSet= \
-p AmbientCapabilities= \
-p SystemCallArchitectures='native x86' \
-p SystemCallFilter='@system-service' \
-p SystemCallFilter='~@debug @mount @raw-io @reboot @swap @cpu-emulation @obsolete' \
-- /usr/bin/gdb --batch -p $PID 2>&1 | tail -3
# Expect: ptrace: Operation not permitted (or seccomp-related kill)
#
# Option B: inspect the unit's resolved SystemCallFilter allow list directly.
# (`systemd-analyze syscall-filter` takes @set names, not unit names.)
systemctl show -p SystemCallFilter --value left4me-web.service \
| tr ' ' '\n' | grep -E '^(ptrace|process_vm)' || echo "blocked (not in allow list)"
# Expect: "blocked (not in allow list)"
```
**Pass criteria:** all of above.
**Failure handling:** if sudo from web breaks, remove directives one at a
time, starting with the `SystemCallFilter` lines (the most likely place
for the filter to be too tight). Neither deny entry is an obvious culprit,
though: `~@debug` could block `process_vm_readv`, which sudo doesn't use,
and `~@privileged` is deliberately absent from the web filter, so sudo's
setuid calls are allowed.
**Cleanup:**
```bash
sudo rm /etc/systemd/system/left4me-web.service.d/test-10-web.conf
sudo systemctl daemon-reload
sudo systemctl restart left4me-web
```
(Web reverts to baseline. Server drop-in stays for the report.)
---
## Test 11 — Soak test
**Goal:** Run the composition for an extended period to surface
race-condition or workload-dependent issues.
**Pre-condition:** Test 7 drop-in on server@1; Test 9 sysctl in place.
**Procedure:**
```bash
# Run for 24-48 hours; observe:
sudo journalctl -u left4me-server@1 --since '24 hours ago' \
| grep -iE 'seccomp|denied|EACCES|EPERM' | wc -l
# Expect: 0 or a very small number (some EACCES on benign probes
# are normal)
# (kernel audit records are not tagged with the unit, so -u is omitted;
# -k covers the current boot only)
sudo journalctl -k --since '24 hours ago' \
| grep 'type=1326' | wc -l
# Expect: 0
sudo systemctl status left4me-server@1
# Expect: active, no restarts since start
```
**Pass criteria:** no SECCOMP kills over the soak period, no
unexpected restarts.
---
## Cleanup (after all tests pass)
```bash
# Remove all test drop-ins
sudo rm -rf /etc/systemd/system/left4me-server@1.service.d/test-*.conf
sudo rm -rf /etc/systemd/system/left4me-web.service.d/test-*.conf
sudo systemctl daemon-reload
sudo systemctl restart left4me-server@1 left4me-web
sudo systemctl is-active left4me-server@1 left4me-web # both active
# Sysctl from Test 9 STAYS in place.
# Remove temp files
rm /tmp/sec-baseline-*.txt /tmp/sec-after-*.txt
rm /tmp/unit-baseline-*.conf
rm /tmp/syscall-log-*.txt
```
---
## Results
**Status:** tested. Executed 2026-05-15 on left4.me / left4me.ovh.ckn.li
(141.95.32.8). Debian 13 trixie, systemd 257.9.
### Open-question answers captured before execution
- **Gunicorn exposure:** only via nginx (binds `127.0.0.1:8000`,
confirmed via `ss -tlnp`). Nginx fronts on `0.0.0.0:{80,443}`.
- **Admin auth:** password / cookie only — S2 (compromised operator
session) is a real, phishable path.
- **Workshop curation:** player-driven / open — A3 (malicious
workshop content) realism is **high**.
- **Test bench:** `left4.me` per operator instruction; `ckn@10.0.4.128`
not used this session.
- **Baseline `kernel.yama.ptrace_scope`:** **0** (not the Debian-default
`1` the spec assumed) — Test 9 is a two-step tightening.
- **AppArmor:** loaded with 106 profiles, only 7 in enforce mode (vendor
defaults); no left4me-specific profile. Out of scope for this session.
### Baseline systemd-analyze
- `left4me-server@1.service`: **7.5 EXPOSED 🙁**
- `left4me-web.service`: **8.7 EXPOSED 🙁**
### Test 1 — PrivateUsers
- **PASS.** Drop-in applied, unit active. Overlay mount succeeded via
the `+`-prefixed `ExecStartPre`. srcds_linux PID's `/proc/<pid>/ns/user`
was `4026532514` vs init's `4026531837` — separate user namespace
confirmed. Uid/Gid both 980 (identity map). No journal errors.
- Spec nit: `pgrep -f 'srcds_linux.*left4dead2' | head -1` picks the
lowest-PID instance, which can belong to a different `@N` — use
port (e.g. `27016` for @1) or `systemctl show -p MainPID
--value left4me-server@1` for instance-specific PID lookup.
### Test 2 — TemporaryFileSystem + binds
- **PASS.** From inside the namespace: `left4me.db` and `web.env`
invisible ("No such file or directory"); `/opt` empty; only
`installation`, `overlays`, `runtime` visible under
`/var/lib/left4me/`; only `host.env` under `/etc/left4me/`;
`getent hosts steamcommunity.com` resolves. No SECCOMP / permission
errors in journal. One benign side effect: srcds probes
`/var/lib/left4me/.steam/sdk32/steamclient.so` (now hidden) — falls
back to the local steamclient.so without issue.
- **Today's D1 gap closure:** baseline file modes confirm srcds
(uid 980) currently CAN read `web.env` (mode 0640 root:left4me)
and `left4me.db` (mode 0644 left4me:left4me). Test 2 makes both
invisible. D1 closes here.
### Test 3 — SystemCallLog discovery
- **PASS (data captured) — but the spec's filter shape blocks 32-bit srcds.**
Two findings:
1. **No srcds calls in `@privileged`/`@debug`/`@mount`/`@raw-io`:**
the only `sig=0` (log) line was systemd-executor calling `capset`
during exec setup, before the unit's process started. Filter
shape from Test 4 is safe.
2. **`SystemCallArchitectures=native` is incompatible with srcds_linux**
— the binary is `ELF 32-bit LSB executable, Intel i386`
(`file /var/lib/left4me/runtime/1/merged/srcds_linux`). With
`native=AUDIT_ARCH_X86_64`, every i386 syscall (first one is
`brk`/45) is killed with SIGSYS; srcds_run's restart-on-crash
loop respawns every 10s ("Bad system call → Server restart in
10 seconds"). **Required fix: `SystemCallArchitectures=native x86`.**
Applied to Test 4 and Test 7 below.
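
The architecture finding can be turned into a pre-flight check. This is a sketch under the assumption of an x86_64 host, where `native` alone excludes i386; `syscall_archs_for` is a hypothetical helper that maps one line of `file` output to a `SystemCallArchitectures=` value.

```shell
# Choose the SystemCallArchitectures= value from one line of `file` output.
# Assumes an x86_64 host: a 32-bit ELF binary needs "x86" added to "native".
syscall_archs_for() {
    case "$1" in
        *"ELF 32-bit"*) echo "native x86" ;;
        *)              echo "native" ;;
    esac
}

# Usage on the host (not executed here):
#   syscall_archs_for "$(file /var/lib/left4me/runtime/1/merged/srcds_linux)"
```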
### Test 4 — SystemCallFilter enforcement (with x86 added)
- **PASS.** With `SystemCallArchitectures=native x86` and the spec's
deny groups (`~@debug @mount @raw-io @reboot @swap @cpu-emulation
@obsolete @privileged`), the resulting allow set is 369 syscalls.
`ptrace`, `process_vm_readv`/`writev`/`kcmp` are excluded from
`@debug`. Server stable for ≥90 s of observation, zero SECCOMP
audit lines, srcds_linux PID unchanged (no respawn).
- **Spec verification flaw:** `sudo nsenter --target $SRCDS --mount --
gdb --batch -p $GUNICORN` runs gdb in the host's process context
(gdb itself has no SECCOMP filter) and only enters srcds's mount NS,
not its userns or PID NS — gdb attaches successfully, but **this
does not prove srcds can't ptrace**. To actually test, run inside
the unit's full namespace set or via `systemd-run` with the same
hardening directives. Filter correctness was verified by inspecting
the compiled `SystemCallFilter` from `systemctl show -p
SystemCallFilter` (ptrace absent from allow list).
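
The filter-inspection check can be scripted roughly as follows. The sketch assumes the `SystemCallFilter` property value is a whitespace-separated list of allowed syscall names, as observed in this session; `syscall_in_list` is a hypothetical helper.

```shell
# Succeed if syscall name $1 appears in the whitespace-separated list
# given by the remaining arguments.
syscall_in_list() {
    name=$1
    shift
    for s in $*; do
        [ "$s" = "$name" ] && return 0
    done
    return 1
}

# Usage on the host (not executed here):
#   allow=$(systemctl show -p SystemCallFilter --value left4me-server@1.service)
#   syscall_in_list ptrace "$allow" && echo "FAIL: ptrace is allowed"
```

Inspecting the compiled filter checks the configuration rather than runtime behaviour, so it complements a probe run under the same directives but does not replace one.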
### Test 5 — ProcSubset + ProtectProc
- **PARTIAL.** `/proc` mount inside the namespace shows
`hidepid=invisible,subset=pid` (both directives applied).
Non-PID entries (`/proc/kallsyms`, `/proc/cmdline`) are
**invisible** as the left4me uid. **However:** `gunicorn`'s
`/proc/<pid>/environ` is still readable from srcds because **both
processes share uid 980** — hidepid=invisible hides foreign-uid
PIDs but doesn't help against same-uid. This matches the threat
model's same-uid finding exactly.
- **Fix for Test 7:** add `PrivatePIDs=true` (systemd 257 supports it;
service-only, fine for `Type=simple`). With PrivatePIDs, srcds_run
becomes PID 1 in a private PID NS and gunicorn isn't visible at
all in srcds's `/proc`.
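
A small sketch for checking the `/proc` mount options reported above. It takes the comma-separated option string (for example from `findmnt -no OPTIONS /proc` run inside the unit's mount namespace); `proc_opts_ok` is a hypothetical name.

```shell
# Succeed only if both hardening options are present in a comma-separated
# mount-option string.
proc_opts_ok() {
    case ",$1," in *,hidepid=invisible,*) ;; *) return 1 ;; esac
    case ",$1," in *,subset=pid,*) ;; *) return 1 ;; esac
    return 0
}

# Usage on the host (not executed here):
#   proc_opts_ok "$(findmnt -no OPTIONS /proc)" || echo "ProtectProc/ProcSubset missing"
```

As the result above shows, passing this check says nothing about same-uid visibility; that gap is what `PrivatePIDs=true` addresses.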
### Test 6 — MemoryDenyWriteExecute
- **FAIL as predicted.** First srcds_linux spawn under `MemoryDenyWriteExecute=true` (MDW) logs
`Failed to open dedicated_srv.so (bin/libtier0_srv.so: cannot make
segment writable for relocation: Permission denied)`. Source engine's
32-bit `.so` files have text relocations (TEXTREL — common in
pre-2010 binaries); the dynamic linker needs to remap pages
PROT_READ|PROT_WRITE|PROT_EXEC during relocation. MDW returns EPERM
(silently, not via SIGSYS — no audit lines), dlopen aborts, srcds_run
enters a 10-second respawn loop. **Excluded from Test 7's
composition.** Not fixable without rebuilding srcds without textrels
(i.e., never — Valve closed-source binary).
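
Text relocations can be confirmed without starting the server by looking for a `TEXTREL` entry in the dynamic section. The sketch below reads `readelf -d` output on stdin; `has_textrel` is a hypothetical helper, and the full library path in the usage comment is an assumption built from the paths quoted in this section.

```shell
# Succeed if the `readelf -d` output on stdin contains a TEXTREL entry.
has_textrel() {
    grep -q 'TEXTREL'
}

# Usage on the host (not executed here; path is assumed):
#   readelf -d /var/lib/left4me/runtime/1/merged/bin/libtier0_srv.so | has_textrel \
#       && echo "library has text relocations; MDW will break dlopen"
```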
### Test 7 — Full composition
- **PASS.** Score **7.5 EXPOSED → 1.3 OK 🙂** for `left4me-server@1`
(a 6.2-point drop). Composition: all directives from the spec's
Test 7 minus `MemoryDenyWriteExecute=true`, plus
`SystemCallArchitectures=native x86` (the i386 fix from Test 3) and
`PrivatePIDs=true` (closes the same-uid /proc gap from Test 5).
- Smoke matrix: S1 active ✓, S2 srcds_linux running ✓, S8 server@2
unaffected ✓, web responds with HTTP/1.1 302 ✓. One SECCOMP kill
at startup (i386 syscall 26 = `ptrace`, comm=srcds_linux): this is
Breakpad's crash-reporter trying to attach to its own process for
minidump generation — correctly blocked, srcds carries on. No further
audit lines over 150 s of observation; srcds_linux PID stable for
the full window.
- Application-layer side effect: gunicorn's `rcon` service logs
`Connection refused` against server@1 — root cause is **stale RCON
port cached in the web app** from before the test-cycle restarts
(each srcds spawn picks a new ephemeral RCON port); not caused by
hardening. Repro'd before web hardening was applied. Tracking for
the web app, not for this refactor.
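
To triage SECCOMP kills like the startup one above, the `arch=` and `syscall=` fields of the audit record are enough. This sketch assumes the usual kernel audit layout in which `arch=` directly precedes `syscall=`; `seccomp_fields` and `audit_arch_name` are hypothetical helpers.

```shell
# Print "<arch-hex> <syscall-number>" for each audit line on stdin that
# carries both fields.
seccomp_fields() {
    sed -n 's/.* arch=\([0-9a-f]*\) syscall=\([0-9]*\).*/\1 \2/p'
}

# Map the audit arch constant to the name scmp_sys_resolver expects.
audit_arch_name() {
    case "$1" in
        40000003) echo x86 ;;      # AUDIT_ARCH_I386
        c000003e) echo x86_64 ;;   # AUDIT_ARCH_X86_64
        *)        echo unknown ;;
    esac
}

# Usage on the host (not executed here):
#   journalctl -k | seccomp_fields
#   scmp_sys_resolver -a x86 26
```

Resolving the number against the right architecture matters here because i386 and x86_64 number their syscalls differently.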
### Test 8 — Attack verification (all from inside srcds NS as uid 980)
- **PASS — all 8 vectors blocked.**
- **D1.a** (read `/var/lib/left4me/left4me.db`): `No such file or directory`
- **D1.b** (read `/etc/left4me/web.env`): `No such file or directory`
- **D1.c** (`ls /opt`): empty
- **D2.a** (ptrace): defense at three layers — SECCOMP filter denies
ptrace, PrivateUsers blocks cross-userns capability check,
PrivatePIDs hides foreign PIDs from /proc. Filter content verified
via `systemctl show -p SystemCallFilter`.
- **D2.b** (read `/proc/<gunicorn>/environ`): `No such file or directory`
(gunicorn's PID isn't visible at all under PrivatePIDs)
- **D2.c** (read `/proc/<gunicorn>/mem`): `No such file or directory`
- **D3** (`sudo -n`): `sudo: effective uid is not 0, is /usr/bin/sudo
on a file system with the 'nosuid' option set` — NoNewPrivileges
blocks the setuid bit on sudo, so srcds can't escalate via the
web's helpers.
- **D5** (read `/proc/<srcds@2>/environ`): `No such file or directory`
— server@2 isn't in srcds@1's PID NS. Note: this protection between
instances will hold for any pair once the composition lands in
ckn-bw and applies to every `left4me-server@N`. Today, even with
server@2 unhardened, server@1's private PID NS already hides server@2 from it.
- Spec nit at D5: `pgrep -f 'srcds_linux.*\@2'` won't work because the
`@N` is in the systemd unit name, not the process cmdline. Use
`systemctl show -p MainPID --value left4me-server@2` or grep the
game port (27021 for @2).
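
The file-visibility vectors (D1.a, D1.b) reduce to "this path must not exist from inside the unit". A minimal sketch, with `expect_missing` as a hypothetical helper; it has to run inside the unit's namespaces as uid 980 to be meaningful, and `[ -e ]` also fails on permission errors, so a pass means "not reachable" rather than strictly "not present".

```shell
# Succeed when the path is not visible to the calling process.
expect_missing() {
    if [ -e "$1" ]; then
        echo "FAIL: $1 is visible" >&2
        return 1
    fi
    echo "ok: $1 not visible"
}

# Usage inside the unit's namespaces (not executed here):
#   expect_missing /var/lib/left4me/left4me.db
#   expect_missing /etc/left4me/web.env
```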
### Test 9 — Yama `ptrace_scope=2`
- **APPLIED.** Written to `/etc/sysctl.d/99-left4me-ptrace.conf`,
persists across reboot. `sudo /sbin/sysctl --system` confirms
`kernel.yama.ptrace_scope = 2`. `sudo -u left4me gdb --batch -p
<gunicorn>` → `ptrace: Operation not permitted`. Root gdb still
works (admin debugging unimpeded).
- **Baseline correction:** observed starting value was `0`, not the
`1` the spec assumed. Two-step tightening (0 → 2).
- Operator-workflow impact: non-root processes that previously could
ptrace their own children can no longer do so. Coredump-on-crash
via the kernel core-pattern path is unaffected. The unit-side
effect for srcds is purely additive on top of Test 7's defenses.
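
The persistent setting amounts to a one-line sysctl drop-in. The sketch below writes it into a directory passed as an argument so it can be tried outside `/etc`; `write_ptrace_dropin` is a hypothetical helper, and the file name is the one recorded above.

```shell
# Write the Yama ptrace_scope drop-in.
# $1 = target directory (/etc/sysctl.d on the host, needs root there)
# $2 = scope value (2 in this session)
write_ptrace_dropin() {
    printf 'kernel.yama.ptrace_scope = %s\n' "$2" > "$1/99-left4me-ptrace.conf"
}

# Usage on the host (not executed here):
#   sudo sh -c '. ./helpers.sh; write_ptrace_dropin /etc/sysctl.d 2'   # helpers.sh is illustrative
#   sudo /sbin/sysctl --system
```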
### Test 10 — Web hardening
- **PASS.** Score **8.7 EXPOSED → 4.1 OK 🙂** for `left4me-web` (a
4.6-point drop, ceiling capped by the sudo-compat exclusions:
no `NoNewPrivileges`, no `PrivateUsers`, no `CapabilityBoundingSet=`,
no `~@privileged` in the syscall deny list).
- Web functional checks: HTTP/1.1 302 returned by curl; gunicorn
workers up; **sudo path works:** `sudo -n -l` lists the configured
helpers (`left4me-systemctl *`, `left4me-journalctl *`, `left4me-overlay
mount|umount *`, `left4me-script-sandbox`), and a direct
invocation of `sudo -n /usr/local/libexec/left4me/left4me-systemctl
is-active left4me-server@2` ran the helper successfully (helper
emitted its usage message — i.e. sudo+setuid succeeded). Zero SECCOMP
audit lines over 60 s of observation. Drop-in cleaned up; web
reverted to baseline.
- Pre-existing rcon `Connection refused` errors in the web journal
predate Test 10's apply — same stale-RCON-port issue noted in Test 7.
### Test 11 — Soak
- **SKIPPED** per operator decision. Rely on functional verification
during Tests 7/8/10. Recommend that the hardening-refactor implementation
plan re-evaluate whether to run a soak once the ckn-bw rollout
has reached a non-prod canary.
### End-of-session state
- All test drop-ins removed; drop-in dirs removed.
- `kernel.yama.ptrace_scope=2` persists at `/etc/sysctl.d/99-left4me-ptrace.conf`.
- `gdb` + `seccomp` (provides `scmp_sys_resolver`) + `libseccomp-dev`
left installed (operator can `apt remove` if desired).
- `left4me-server@1`, `@2`, `left4me-web` all `active`; back to
baseline (verified: `srcds@1` `/proc/<pid>/ns/user` matches init).
- Baseline and after-state capture files under `/tmp` retained on host
for reference by the follow-up implementation plan.
---
## Output of this test plan
When all tests complete:
1. [x] Mark this document with **status: tested** and record the dates.
2. [ ] Open a new implementation plan
(`docs/superpowers/plans/2026-MM-DD-hardening-refactor.md`) that
commits the proven composition to the ckn-bw reactor + reference
units + test suite. **Proven composition is in Test 7's drop-in
above with two amendments: `SystemCallArchitectures=native x86`
(not `native`) and `PrivatePIDs=true`.**
3. [ ] Decide on the deferred questions:
- 3-user uid split — same-uid /proc gap was closed by
`PrivatePIDs=true`. With that + `PrivateUsers=true` in the
composition, the residual same-uid attack surface is
application-level (DB ACLs, web.env). Recommend **closing the
uid-split spec as superseded** unless application-level concerns
surface later.
- AppArmor profile follow-up — host has 7 vendor profiles in
enforce; no left4me-specific profile. Defenses survey lists it as
deferred; revisit after the refactor lands if directive-only
hardening leaves residual concerns.
- `MemoryDenyWriteExecute=true`: **NO.** Source engine 32-bit
`.so` files have text relocations; MDW prevents the relocation
`mprotect` → dlopen fails → respawn loop. Document permanently.
- `SocketBindAllow=` — not tested. Game UDP ports are 27000-27999
per `LEFT4ME_PORT_RANGE_*` in instance env. Worth adding to lock
the bindable port range; not in the test plan, defer to the
refactor.
4. [ ] Mark `2026-05-15-user-uid-split-design.md` as superseded per #3.
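
For the deferred `SocketBindAllow=` question in item 3, a drop-in could look roughly like the output of the sketch below. This is untested on `left4.me`: it only covers the UDP game-port range named above, and `SocketBindDeny=any` would also block the ephemeral TCP RCON port unless a further allow rule is added. `socket_bind_dropin` is a hypothetical helper.

```shell
# Print a [Service] drop-in that restricts bind() to one UDP port range.
# $1 = first UDP port, $2 = last UDP port
socket_bind_dropin() {
    printf '[Service]\nSocketBindAllow=udp:%s-%s\nSocketBindDeny=any\n' "$1" "$2"
}

# Usage (not executed here): review the output before installing it as a
# drop-in for left4me-server@N.
#   socket_bind_dropin 27000 27999
```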
### Spec bugs surfaced during execution (fix in refactor or follow-up commit)
- **Test 1 / Test 8 PID lookup**: `pgrep -f 'srcds_linux.*left4dead2'
| head -1` picks the lowest-PID instance — likely the wrong one if
another `@N` started earlier. Use port (e.g. `27016` for @1) or
`systemctl show -p MainPID --value left4me-server@N`.
- **Test 4 / Test 10 ptrace verification**: `sudo nsenter --target
$PID --mount -- gdb -p $TARGET` only enters mount NS and runs gdb
with the host's (root, no SECCOMP) context — gdb attaches
regardless of the unit's filter. To meaningfully test, enter
user+pid+net+ipc namespaces with `--setuid`/`--setgid` OR spawn a
probe via `systemd-run` with the same hardening directives.
- **Test 8 D5**: `pgrep -f 'srcds_linux.*\@2'` won't match because `@N`
comes from the systemd unit name, not argv. Use port or `MainPID`.
- **Test 9 baseline**: spec says ptrace_scope default is `1`; observed
`0` on Debian 13. Update the "Expect" line.
**Resolved 2026-05-15 via the hardening-refactor plan.** The four bugs are fixed in-place in the test commands above. See `docs/superpowers/plans/2026-05-15-hardening-refactor.md` Task 8.
## Pointers
- Threat model: `docs/superpowers/specs/2026-05-15-hardening-threat-model.md`
- Defenses survey: `docs/superpowers/specs/2026-05-15-hardening-defenses-survey.md`
- Live unit source: `~/Projekte/ckn-bw/bundles/left4me/metadata.py:150+`
- Reference units: `deploy/files/usr/local/lib/systemd/system/`
- Tools needed on `left4.me`:
- `systemd-analyze` (in `systemd` package, already installed)
- `scmp_sys_resolver` (in **`seccomp`** package — NOT `libseccomp-dev`
on Debian 13; `apt install seccomp`)
- `gdb` (for ptrace tests; `apt install gdb`)
- `nsenter` (in `util-linux`, already installed)
- `findmnt`, `pgrep`, standard userspace