left4me/docs/superpowers/specs/2026-05-15-hardening-refactor-design.md
mwiegand 8971b23617
refactor(sandbox): collapse l4d2-sandbox user into left4me
The hardening refactor that just landed closes the same-uid attack
surface (FS view, ptrace, /proc visibility, signals) for the web +
gameserver units via systemd directives plus system-wide
kernel.yama.ptrace_scope=2. Keeping the script-sandbox on a separate
uid was the inconsistent half-step — defense-in-depth only, with
build-time-idmap complexity attached. One principle wins: harden
once, share the uid.

scripts/libexec/left4me-script-sandbox: drop the idmap block (uid
lookups, STAGING setup, cleanup_staging trap, mount --bind
--map-users), switch User=/Group= to left4me, point BindPaths at
$OVERLAY_DIR directly. Header comment updated to reflect
hardening-not-uid as the same-uid defense. nsenter self-wrap kept —
it's about mount-namespace escape, not uid.

Tests + comments + companion docs updated. Build-time-idmap and
overlay-idmap plans marked SUPERSEDED; user-uid-split spec revised
to "1 user is correct"; one-line update notes on the hardening
specs and the build-overlay-unit-design.

Companion ckn-bw commit removes the l4d2-sandbox user + group and
relaxes /var/lib/left4me from 0711 → 0755 (the traverse-only mode
was specifically for the sandbox uid).
2026-05-15 15:50:57 +02:00

# Hardening refactor — design
**Status:** approved design; implementation plan to follow at
`docs/superpowers/plans/2026-05-15-hardening-refactor.md`.

Companion: `2026-05-15-hardening-threat-model.md`,
`2026-05-15-hardening-defenses-survey.md`,
`2026-05-15-hardening-test-plan.md` (executed 2026-05-15, results inline).

> **Updated 2026-05-15:** `l4d2-sandbox` was collapsed into `left4me`
> after this refactor landed; see
> `docs/superpowers/plans/2026-05-15-uid-collapse.md`. References below
> to the sandbox running as a separate uid describe the pre-collapse
> state; the directive composition this doc establishes is unchanged.

This doc records the *shape* of the refactor: where the artifacts live,
how they're factored, what's in scope. The implementation plan lays out
the steps.
## Context
The hardening test plan ran end-to-end on `left4.me` on 2026-05-15
(commit `461b8d0`). Outcome: the `systemd-analyze security` exposure
score dropped from 7.5 to 1.3 for `left4me-server@1` and from 8.7 to 4.1
for `left4me-web`, and all 8 Test 8 attack vectors were blocked. Two
amendments to the spec's proposed composition were required:
`SystemCallArchitectures=native x86` (srcds_linux is i386) and
`PrivatePIDs=true` (same-uid `ProtectProc=invisible` can't hide gunicorn
from srcds; a PID namespace fixes it at the kernel level).
`MemoryDenyWriteExecute=true` is permanently excluded (Source engine
i386 `.so` files have text relocations).

The composition is *not currently deployed*: Test 7's drop-in was cleaned
up at session end, and only the Test 9 sysctl (`kernel.yama.ptrace_scope=2`)
persists. This refactor lands the proven composition permanently via
the ckn-bw bundle.
## Approach
Keep the current responsibility split for now: ckn-bw owns systemd unit
emission (base + hardening), left4me owns the educational reference
copies and the threat-model/test docs. Hardening directives land in
ckn-bw's `systemd/units` reactor at
`~/Projekte/ckn-bw/bundles/left4me/metadata.py:150+`, factored via
shared Python dicts so the two units (and the future
build-overlay-unit refactor) reuse the common base.
The broader responsibility reshape — hardening as drop-in files
*living* in left4me with ckn-bw as a thin file-shipper — is a real
direction worth pursuing, but deserves its own session. Deferred.
## Factoring
Three dict constants at the top of `metadata.py` (or in a sibling
`hardening.py` module if `metadata.py` grows past a comfortable read):
### `HARDENING_COMMON`
Directives both units take verbatim. 18 keys:
```python
HARDENING_COMMON = {
    'ProtectProc': 'invisible',
    'ProcSubset': 'pid',
    'ProtectKernelTunables': 'true',
    'ProtectKernelModules': 'true',
    'ProtectKernelLogs': 'true',
    'ProtectClock': 'true',
    'ProtectControlGroups': 'true',
    'ProtectHostname': 'true',
    'LockPersonality': 'true',
    'ProtectSystem': 'strict',
    'ProtectHome': 'true',
    'PrivateTmp': 'true',
    'RestrictNamespaces': 'true',
    'RestrictRealtime': 'true',
    'RemoveIPC': 'true',
    'KeyringMode': 'private',
    'UMask': '0027',
    'RestrictAddressFamilies': 'AF_INET AF_INET6 AF_UNIX',
}
```
### `HARDENING_SERVER`
`{**HARDENING_COMMON, ...server-specific}`. Adds the sudo-incompatible
flags, filesystem virtualization, the i386 amendment, a per-instance PID
namespace, and socket bind restrictions:
- `NoNewPrivileges=true`
- `RestrictSUIDSGID=true`
- `PrivateUsers=true`
- **`PrivatePIDs=true`** *(Test amendment — D2.b / D5)*
- `PrivateIPC=true`
- `PrivateDevices=true`
- `CapabilityBoundingSet=` *(empty value → drop all)*
- `AmbientCapabilities=`
- `SystemCallArchitectures='native x86'` *(Test amendment — i386 srcds)*
- `SystemCallFilter=('@system-service', '~@debug @mount @raw-io @reboot @swap @cpu-emulation @obsolete @privileged')` *(tuple → repeated key)*
- `TemporaryFileSystem='/var/lib /etc /opt /home /root /srv /mnt /media'`
- `BindReadOnlyPaths=('/var/lib/left4me/installation', '/var/lib/left4me/overlays', '/etc/left4me/host.env', '/etc/ssl', '/etc/ca-certificates', '/etc/resolv.conf', '/etc/nsswitch.conf', '/etc/alternatives')`
- `BindPaths='/var/lib/left4me/runtime/%i'`
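- `SocketBindAllow=('udp:27000-27999', 'tcp:27000-27999')` *(NEW: restricts the sockets srcds can bind to the game port range; not tested in Test 7 but cheap defense-in-depth. Per `systemd.resource-control(5)`, a bind that matches neither the allow nor the deny list is permitted, so this only restricts anything when paired with `SocketBindDeny=any`; confirm on deploy. Concrete range pending verification of `LEFT4ME_PORT_RANGE_*` substitution support in systemd directives; hard-coded range as fallback.)*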
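
As a minimal sketch, the list above could be written as a dict spread over the common base. The `HARDENING_COMMON` stub here is deliberately abbreviated to two keys so the block runs on its own (the real constant is the full dict defined earlier); the values mirror the list, but this is not confirmed to be the reactor's actual code.

```python
# Abbreviated stand-in for the full HARDENING_COMMON dict (assumption:
# two keys only, so this sketch is self-contained).
HARDENING_COMMON = {
    'ProtectSystem': 'strict',
    'PrivateTmp': 'true',
}

HARDENING_SERVER = {
    **HARDENING_COMMON,
    # sudo-incompatible flags (the web unit leaves these out)
    'NoNewPrivileges': 'true',
    'RestrictSUIDSGID': 'true',
    'PrivateUsers': 'true',
    'PrivatePIDs': 'true',       # test amendment (D2.b / D5)
    'PrivateIPC': 'true',
    'PrivateDevices': 'true',
    # empty string is emitted as "Key=", which drops all capabilities
    'CapabilityBoundingSet': '',
    'AmbientCapabilities': '',
    'SystemCallArchitectures': 'native x86',  # test amendment (i386 srcds)
    # tuple is emitted as one "SystemCallFilter=" line per element
    'SystemCallFilter': (
        '@system-service',
        '~@debug @mount @raw-io @reboot @swap @cpu-emulation @obsolete @privileged',
    ),
    'TemporaryFileSystem': '/var/lib /etc /opt /home /root /srv /mnt /media',
    'BindReadOnlyPaths': (
        '/var/lib/left4me/installation',
        '/var/lib/left4me/overlays',
        '/etc/left4me/host.env',
        '/etc/ssl',
        '/etc/ca-certificates',
        '/etc/resolv.conf',
        '/etc/nsswitch.conf',
        '/etc/alternatives',
    ),
    'BindPaths': '/var/lib/left4me/runtime/%i',
    'SocketBindAllow': ('udp:27000-27999', 'tcp:27000-27999'),
}
```

Spreading `HARDENING_COMMON` first means any server-specific key with the same name would override the common value, which keeps per-unit deviations visible in one place.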
### `HARDENING_WEB`
`{**HARDENING_COMMON, ...web-specific}`. Web inherits `ProtectSystem=strict`
from COMMON (was `=full` in the current base unit; this tightens). Adds
a syscall filter *without* `~@privileged` (sudo needs setuid).
**Excludes** `NoNewPrivileges`, `PrivateUsers`, `RestrictSUIDSGID`,
empty `CapabilityBoundingSet` — all sudo-incompatible.
- `SystemCallArchitectures='native'`
- `SystemCallFilter=('@system-service', '~@debug @mount @raw-io @reboot @swap @cpu-emulation @obsolete')` *(no `~@privileged`)*

Web's existing `ReadWritePaths=/var/lib/left4me` stays in its unit's
inline `Service` dict (web-specific, not common).
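
A minimal sketch of the web variant and of how it might be spread into the unit's inline `Service` dict. `HARDENING_COMMON` is again an abbreviated stub, and `web_service` and `SUDO_INCOMPATIBLE` are hypothetical names used only for illustration; the real unit dict in `metadata.py` carries more keys.

```python
# Abbreviated stand-in for the full HARDENING_COMMON dict (assumption).
HARDENING_COMMON = {
    'ProtectSystem': 'strict',
    'PrivateTmp': 'true',
}

HARDENING_WEB = {
    **HARDENING_COMMON,
    'SystemCallArchitectures': 'native',
    # no "~@privileged" here: sudo needs setuid
    'SystemCallFilter': (
        '@system-service',
        '~@debug @mount @raw-io @reboot @swap @cpu-emulation @obsolete',
    ),
}

# Directives the web unit must not carry because they break sudo.
SUDO_INCOMPATIBLE = (
    'NoNewPrivileges',
    'PrivateUsers',
    'RestrictSUIDSGID',
    'CapabilityBoundingSet',
)

# Hypothetical shape of the web unit's inline Service dict: the
# web-specific key stays inline, the hardening block is spread in.
web_service = {
    'ReadWritePaths': '/var/lib/left4me',
    **HARDENING_WEB,
}

# Guard against a sudo-incompatible key leaking into the web unit.
leaked = [key for key in SUDO_INCOMPATIBLE if key in web_service]
```

A check like `leaked` is cheap to keep next to the constants, since a sudo-incompatible directive in the web unit would only show up at runtime otherwise.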
### Multi-value directives and empty values
Tuples-of-strings → emitted as repeated `Key=Value` lines by ckn-bw's
systemd-bundle emitter. Existing precedent: `EnvironmentFile` at
`metadata.py:201-204`. Empty values (`CapabilityBoundingSet=`,
`AmbientCapabilities=`) need to emit as `Key=` with nothing after `=`.
Both behaviors are to be verified as the first step of the implementation
plan. Fallbacks if the emitter doesn't handle them: inline-joined
strings where systemd accepts them, or extending the emitter.
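
The behaviour expected of the emitter can be sketched in plain Python. This is a hypothetical stand-in, not ckn-bw's actual Mako template, and `render_options` is an illustrative name; the handling of `None` and of sets follows the implementation notes at the end of this doc.

```python
def render_options(options):
    """Render a dict of systemd options as unit-file lines (sketch only)."""
    lines = []
    for option, value in options.items():
        if value is None:
            # None suppresses the key entirely (distinct from empty string).
            continue
        if isinstance(value, (list, set, tuple)):
            # Sets are sorted; lists and tuples keep their order.
            items = sorted(value) if isinstance(value, set) else value
            lines.extend(f'{option}={item}' for item in items)
        else:
            # An empty string renders as "Key=" with nothing after "=".
            lines.append(f'{option}={value}')
    return lines


lines = render_options({
    'CapabilityBoundingSet': '',
    'SystemCallFilter': ('@system-service', '~@debug'),
    'Unset': None,
})
# lines == ['CapabilityBoundingSet=',
#           'SystemCallFilter=@system-service',
#           'SystemCallFilter=~@debug']
```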
## Reference units
Keep `deploy/files/usr/local/lib/systemd/system/left4me-server@.service`
and `deploy/files/usr/local/lib/systemd/system/left4me-web.service` as
**deliberately educational** copies of the deployed units. Each new
hardening directive in the reference gets a one-line comment
explaining the threat it addresses. A cold reader of the repo can open
the reference unit and read the threat model in code form, without
needing to read the ckn-bw bundle or systemd man pages.
Source of truth: the ckn-bw reactor is what's deployed. Reference units
in left4me are hand-synced. There is no CI drift test (it would be
brittle against comment placement and the human-oriented layout of the
reference files); operator discipline at edit time keeps them aligned.
A top-of-file note in each reference unit points readers at the reactor.
## Scope of the refactor
1. **ckn-bw reactor edits.** Three constants + spread into the two
units. Verify tuple-multi-value emission. `metadata.py`.
2. **Sysctl drop-in via ckn-bw.** `kernel.yama.ptrace_scope=2`. Move
from host-only `/etc/sysctl.d/99-left4me-ptrace.conf` (applied by
hand in Test 9) into the bundle's file management. Find the existing
sysctl pattern in ckn-bw and follow it.
3. **Reference unit mirror with educational comments.** Update
`deploy/files/usr/local/lib/systemd/system/{left4me-server@,left4me-web}.service`
to match the reactor's emission, with per-directive comments
explaining each hardening directive's purpose. Top-of-file note
pointing to the reactor.
4. **Spec bug fixes in the test plan.** Four bugs flagged in
`2026-05-15-hardening-test-plan.md`'s output section: the PID-lookup
race (use `systemctl show -p MainPID --value`), the gdb-from-host
verification flaw (probe via `systemd-run` inside the same
hardening profile, not via `nsenter`, which bypasses it), the D5 pgrep
pattern, and the package name for `scmp_sys_resolver` (`seccomp`, not
`libseccomp-dev`). Doc-only.
5. **Mark `2026-05-15-user-uid-split-design.md` superseded.** Front-matter
status note + brief explanation that `PrivateUsers` + `PrivatePIDs`
+ `TemporaryFileSystem` close D1, D2, D3, D5 at the kernel level.
Reference this design + the refactor plan as the replacement.
6. **`SocketBindAllow=` for srcds** (in `HARDENING_SERVER`). Not tested
in Test 7; verify on deploy. Encoding pending — likely hard-coded
port range, since systemd directive variable substitution support
is uneven.
7. **Cleanup unmanaged packages on left4.me.** `apt remove gdb seccomp
libseccomp-dev` after the refactor lands. Test-only tooling;
reinstall on demand for future test sessions.
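
Scope item 4 replaces the racy PID lookup with `systemctl show -p MainPID --value`. A minimal sketch of that lookup from Python follows; `main_pid` and its injectable `run` parameter are hypothetical, added here so the function can be exercised without a live systemd.

```python
import subprocess


def main_pid(unit, run=subprocess.run):
    """Return the unit's MainPID as reported by systemd, or None."""
    result = run(
        ['systemctl', 'show', '-p', 'MainPID', '--value', unit],
        capture_output=True,
        text=True,
        check=True,
    )
    pid = int(result.stdout.strip() or 0)
    # systemd reports MainPID=0 when the unit has no main process.
    return pid or None
```

On the host this would be called as `main_pid('left4me-server@1')`; the returned PID is the one systemd itself tracks for the unit.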
## Sequencing the deploy
1. Land ckn-bw commit (reactor changes, sysctl drop-in entry).
2. Land left4me commit (reference units, spec bug fixes, uid-split
spec status update, this design doc, the refactor plan).
3. Push both repos.
4. `bw apply ovh.left4me` — applies reactor changes; systemd restarts
affected units automatically.
5. Verify on the host:
   - `systemctl cat left4me-server@1` shows the new directives.
   - Re-run a Test 8 subset (D1.a, D1.b, D2.b via PrivatePIDs, D5 with
     the corrected pgrep) using the *corrected* probe pattern (per
     spec bug fix in scope item 4). A full Test 8 rerun is unnecessary:
     the composition is proven; only the *deployment* needs verifying.
   - `sysctl kernel.yama.ptrace_scope` = 2.
   - Smoke: server@1 + server@2 + web all active and stable for 10
     minutes. Web UI: login, server start/stop, log view, overlay
     rebuild.
6. Rollback if needed: `git revert` the ckn-bw commit + `bw apply`.
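
The "`systemctl cat` shows the new directives" check in step 5 could be partly mechanised. The sketch below is a hypothetical operator aid, not part of the plan: `parse_unit_directives` and `missing_directives` are illustrative names, and the parser ignores line continuations and systemd's reset-on-empty-assignment semantics.

```python
def parse_unit_directives(unit_text):
    """Collect Key=Value lines from unit text into {key: [values]}."""
    found = {}
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and section headers such as [Service].
        if not line or line.startswith(('#', ';', '[')):
            continue
        key, sep, value = line.partition('=')
        if sep:
            found.setdefault(key.strip(), []).append(value.strip())
    return found


def missing_directives(unit_text, expected):
    """Return the expected Key=Value pairs that the unit text lacks."""
    found = parse_unit_directives(unit_text)
    missing = []
    for key, value in expected.items():
        wanted = list(value) if isinstance(value, (list, tuple)) else [value]
        for item in wanted:
            if item not in found.get(key, []):
                missing.append(f'{key}={item}')
    return missing
```

Feeding it the captured output of `systemctl cat left4me-server@1` together with the server hardening dict would list any directive that did not make it into the deployed unit.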
## What's out of scope
- **`MemoryDenyWriteExecute=true`** — permanently excluded.
- **AppArmor profile** — deferred per defenses-survey.
- **`build-overlay-unit` refactor**
(`2026-05-15-build-overlay-unit-design.md`) — sequenced after this.
Will reuse `HARDENING_COMMON` (or a variant) when it lands.
- **3-user uid split** — `2026-05-15-user-uid-split-design.md`
superseded by this refactor (scope item 5).
- **Broader configmgmt-responsibility reshape** — hardening as
drop-ins living in left4me, ckn-bw becoming a thin file-shipper.
Real direction worth pursuing; deserves a dedicated session.
Out of scope here.
- **Stale RCON port app bug** — flagged in executor's handoff. Separate
scope.
- **Pushing the branch** — operator decides when.
## Implementation notes (resolved during plan execution)
- The ckn-bw systemd-bundle emitter renders Python tuples as repeated
`Key=Value` lines and renders empty strings as `Key=` with no value.
Both behaviors confirmed by reading the Mako template in
`libs/systemd.py:17-23`. Tuple branch: `isinstance(value,
(list, set, tuple))` iterates and emits `${option}=${item}` per
element, preserving insertion order (sets are sorted; lists and
tuples are not). Empty-string branch: falls through to `else:
${option}=${str(value)}`, which emits `Key=` with nothing after `=`.
`None` suppresses the key entirely (distinct from empty string —
important). The `protection()` helper at `libs/systemd.py:94` already
uses `'CapabilityBoundingSet': ''` as a live in-repo example. Tuple
precedent in the left4me bundle: `EnvironmentFile` at
`bundles/left4me/metadata.py:201-204`. Verified 2026-05-15.
- `SocketBindAllow=` value: hard-coded port range `27000-27999` for
both `udp:` and `tcp:` lines (matches the `LEFT4ME_PORT_RANGE_*`
metadata values). Variable substitution in systemd directives is not
universally supported; hard-coded range avoids the hazard.
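
One way to keep the hard-coded range in a single place is to build both `SocketBindAllow` lines from two integers. This is a sketch under the assumption that the reactor wants the range as Python values; `socket_bind_allow` is a hypothetical helper, and the reactor may simply hard-code the tuple instead.

```python
def socket_bind_allow(first_port, last_port, protocols=('udp', 'tcp')):
    """Build SocketBindAllow values such as 'udp:27000-27999'."""
    if not 0 < first_port <= last_port <= 65535:
        raise ValueError(f'invalid port range: {first_port}-{last_port}')
    return tuple(f'{proto}:{first_port}-{last_port}' for proto in protocols)


SOCKET_BIND_ALLOW = socket_bind_allow(27000, 27999)
# SOCKET_BIND_ALLOW == ('udp:27000-27999', 'tcp:27000-27999')
```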
## Pointers
- Threat model: `2026-05-15-hardening-threat-model.md`
- Defenses survey: `2026-05-15-hardening-defenses-survey.md` (§ 5
candidate composition is the basis for the factoring above)
- Test plan + results: `2026-05-15-hardening-test-plan.md`
(commit `461b8d0`)
- Executor's handoff: `2026-05-15-session-handoff.md`
(commit `152c313`)
- Live reactor: `~/Projekte/ckn-bw/bundles/left4me/metadata.py:150+`
- Reference units: `deploy/files/usr/local/lib/systemd/system/`
- Superseded uid-split spec: `2026-05-15-user-uid-split-design.md`
- Adjacent (sequenced after): `2026-05-15-build-overlay-unit-design.md`