diff --git a/deploy/README.md b/deploy/README.md
index 35778ba..deb04ad 100644
--- a/deploy/README.md
+++ b/deploy/README.md
@@ -77,21 +77,20 @@ The deployment uses these on-host paths (FHS-aligned):
 
 ## Runtime users
 
-Two system users are involved:
+One system user does everything:
 
 - **`left4me`** (home `/var/lib/left4me`, shell `/usr/sbin/nologin`):
-  web app, host library, and gameserver runtime.
-- **`l4d2-sandbox`** (no home, shell `/usr/sbin/nologin`): unprivileged
-  uid the script-overlay sandbox drops into via `systemd-run`. The
-  `left4me-script-sandbox` helper sets up an idmapped bind from the
-  sandbox uid back to `left4me` on a staging path so overlay writes
-  land on disk as `left4me`-owned. The split is load-bearing: a
-  sandbox escape would otherwise see `web.env`, the SQLite DB, and
-  running gameservers.
+  web app, host library, gameserver runtime, and script-overlay
+  sandbox. The sandbox unit drops privileges via `systemd-run` and
+  runs the user-authored bash inside a fully hardened transient
+  service (see `scripts/libexec/left4me-script-sandbox`). Same-uid
+  attack surface — sandbox escape reaching `web.env`, the SQLite DB,
+  or running gameservers — is closed by that hardening profile plus
+  system-wide `kernel.yama.ptrace_scope=2`, rather than by a uid
+  boundary.
 
-(Whether the gameserver runtime should be split off into a third uid is
-an open design question — see
-`docs/superpowers/specs/2026-05-15-user-uid-split-design.md`.)
+The user-count decision and its history live in
+`docs/superpowers/specs/2026-05-15-user-uid-split-design.md`.
 
 ## Deployment
 
@@ -137,10 +136,10 @@ The web app currently supports two overlay surfaces:
   symlinks under
   `${LEFT4ME_ROOT}/overlays/{overlay_id}/left4dead2/addons/{steam_id}.vpk`.
 - **`script` overlays** — populated by an arbitrary user-authored bash
-  script that runs inside `systemd-run` as the unprivileged
-  `l4d2-sandbox` UID, with the overlay directory bind-mounted RW at
-  `/overlay`. Resource caps: 1h walltime, 4 GB RAM, 512 tasks, 200% CPU,
-  20 GB post-build disk cap.
+  script that runs inside `systemd-run` as `left4me` (under a fully
+  hardened transient service unit), with the overlay directory
+  bind-mounted RW at `/overlay`. Resource caps: 1h walltime, 4 GB RAM,
+  512 tasks, 200% CPU, 20 GB post-build disk cap.
 
 Both caches and overlay directories are owned by `left4me`. If the web
 service ever runs as a different uid, ensure it shares a group with the
diff --git a/docs/superpowers/plans/2026-05-14-overlay-idmap.md b/docs/superpowers/plans/2026-05-14-overlay-idmap.md
index c3735e0..49d7e87 100644
--- a/docs/superpowers/plans/2026-05-14-overlay-idmap.md
+++ b/docs/superpowers/plans/2026-05-14-overlay-idmap.md
@@ -1,5 +1,11 @@
 # Idmapped lowerdirs for left4me kernel-overlayfs
 
+> **SUPERSEDED 2026-05-15** by the uid-collapse refactor
+> ([`2026-05-15-uid-collapse.md`](2026-05-15-uid-collapse.md)). With
+> `l4d2-sandbox` collapsed into `left4me`, all overlay content is
+> uniformly `left4me`-owned end-to-end and no idmap is needed at
+> mount time either. Kept for design-evolution context.
+
 ## Context
 
 Kernel-overlayfs copy-up preserves the lower-layer file's owner and mode in the
diff --git a/docs/superpowers/plans/2026-05-15-build-time-idmap.md b/docs/superpowers/plans/2026-05-15-build-time-idmap.md
index aa9d2df..ab3498e 100644
--- a/docs/superpowers/plans/2026-05-15-build-time-idmap.md
+++ b/docs/superpowers/plans/2026-05-15-build-time-idmap.md
@@ -1,6 +1,12 @@
 # Build-time idmap: move the uid translation from the gameserver mount
 into the script sandbox
 
+> **SUPERSEDED 2026-05-15** by the uid-collapse refactor
+> ([`2026-05-15-uid-collapse.md`](2026-05-15-uid-collapse.md)). The
+> idmap pattern this plan introduced is removed because source uid
+> (`left4me`) now equals target uid (`left4me`) — the translation is
+> a no-op. Kept for design-evolution context.
+
 ## Context
 
 The current idmap implementation translates uids at **gameserver mount
diff --git a/docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md b/docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md
index 669d69a..ee6e19a 100644
--- a/docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md
+++ b/docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md
@@ -9,6 +9,13 @@ The same pattern is already established in the codebase for gameservers
 (`left4me-server@.service`). A future session should evaluate whether to
 refactor and, if so, follow the steps below.
 
+> **Updated 2026-05-15:** `l4d2-sandbox` was collapsed into `left4me`
+> — see `docs/superpowers/plans/2026-05-15-uid-collapse.md`. The
+> idmap bind setup + trap cleanup are gone, so the remaining
+> complexity in the helper is just the nsenter self-wrap. References
+> below to `User=l4d2-sandbox` should read as `User=left4me`; the
+> template refactor will inherit that cleanly.
+
 ## Why this came up
 
 While verifying the build-time idmap refactor, the first 5 build jobs
diff --git a/docs/superpowers/specs/2026-05-15-hardening-refactor-design.md b/docs/superpowers/specs/2026-05-15-hardening-refactor-design.md
index 619cf5d..8274e6b 100644
--- a/docs/superpowers/specs/2026-05-15-hardening-refactor-design.md
+++ b/docs/superpowers/specs/2026-05-15-hardening-refactor-design.md
@@ -6,6 +6,12 @@ Companion: `2026-05-15-hardening-threat-model.md`,
 `2026-05-15-hardening-defenses-survey.md`, `2026-05-15-hardening-test-plan.md`
 (executed 2026-05-15, results inline).
 
+> **Updated 2026-05-15:** `l4d2-sandbox` was collapsed into `left4me`
+> after this refactor landed — see
+> `docs/superpowers/plans/2026-05-15-uid-collapse.md`. References below
+> to the sandbox running as a separate uid describe the pre-collapse
+> state; the directive composition this doc establishes is unchanged.
+
 This doc records the *shape* of the refactor — where the artifacts live,
 how they're factored, what's in scope. The implementation plan lays out
 the steps.
diff --git a/docs/superpowers/specs/2026-05-15-hardening-threat-model.md b/docs/superpowers/specs/2026-05-15-hardening-threat-model.md
index 3d4fb81..784aef1 100644
--- a/docs/superpowers/specs/2026-05-15-hardening-threat-model.md
+++ b/docs/superpowers/specs/2026-05-15-hardening-threat-model.md
@@ -4,6 +4,13 @@
 Paired with `2026-05-15-hardening-defenses-survey.md` and
 `2026-05-15-hardening-test-plan.md`.
 
+> **Updated 2026-05-15:** `l4d2-sandbox` was collapsed into `left4me`
+> after the hardening refactor landed — see
+> `docs/superpowers/plans/2026-05-15-uid-collapse.md`. The same-uid
+> threat surface that doc accepts is the same surface this model
+> already documents for server/web; the sandbox is now in scope of
+> the same hardening profile.
+
 This document establishes *what we defend against and what we accept
 losing*. The defenses survey and test plan operationalize this against
 the codebase.
diff --git a/docs/superpowers/specs/2026-05-15-user-uid-split-design.md b/docs/superpowers/specs/2026-05-15-user-uid-split-design.md
index 1bf6c5e..353fc34 100644
--- a/docs/superpowers/specs/2026-05-15-user-uid-split-design.md
+++ b/docs/superpowers/specs/2026-05-15-user-uid-split-design.md
@@ -1,10 +1,10 @@
 # How many system users should left4me have? — 1, 2, or 3
 
-**Status: SUPERSEDED 2026-05-15 by the hardening refactor.**
+**Status: SUPERSEDED 2026-05-15 by the hardening refactor + uid-collapse.**
 
 The original question — should left4me have 1, 2, or 3 system users — is
-now answered: **2 users (current state) is correct.** The
-defenses that motivated a 3-user split (DB readability from srcds,
+now answered: **1 user (after the uid-collapse refactor) is correct.**
+The defenses that motivated a multi-user split (DB readability from srcds,
 cross-server ptrace, same-uid /proc visibility, web-side reach into
 gameserver state) are closed by the systemd hardening composition
 landed in the hardening-refactor plan (`docs/superpowers/plans/2026-05-15-hardening-refactor.md`):
@@ -14,12 +14,22 @@ landed in the hardening-refactor plan (`docs/superpowers/plans/2026-05-15-harden
   srcds entirely.
 - `SystemCallFilter=~@debug` + empty `CapabilityBoundingSet=` block
   ptrace at the syscall layer.
+- System-wide `kernel.yama.ptrace_scope=2` blocks same-uid ptrace.
+
+The interim state (`left4me` + `l4d2-sandbox`) recorded earlier in this
+doc was the principled middle ground — script-sandbox builds keeping a
+separate uid for kernel-enforced isolation. After the hardening
+refactor closed the same-uid attack surface for server/web, leaving
+the sandbox on a separate uid was the inconsistent half-step. The
+uid-collapse refactor (`docs/superpowers/plans/2026-05-15-uid-collapse.md`)
+removed `l4d2-sandbox` so the sandbox now runs as `left4me`, defended
+by the same hardening profile. One principle: hardening covers it.
 
 The residual filesystem-ACL surface (DB at `0640 root:left4me`, web.env
 same) is a separate concern: a uid split would close it via kernel ACLs,
 but for the current deployment shape it's covered by the systemd-imposed
 FS view. If the deployment shape changes (multi-tenant
-host, shell logins as the service uids, additional services running
+host, shell logins as the service uid, additional services running
 as `left4me` outside these units) the uid split should be revisited.
 
 The original content of this spec is preserved below for context.
diff --git a/l4d2host/instances.py b/l4d2host/instances.py
index e88d96f..5c3cac9 100644
--- a/l4d2host/instances.py
+++ b/l4d2host/instances.py
@@ -73,13 +73,12 @@ def start_instance(
     runtime_dir = root / "runtime" / name
 
     # Stage cfg files in the upper layer. Writing here goes straight to the
-    # upper dir on the host filesystem with the worker's uid; the unit's
-    # ExecStartPre then mounts the overlay (single source of truth for the
-    # mount), and the kernel surfaces these files at the top of the merged
-    # stack. A script-sandbox-built lower-layer `server.cfg` is owned by
-    # `l4d2-sandbox`, not the worker — staging in upper sidesteps the
-    # ownership-preserving copy-up that would happen if we wrote through
-    # merged post-mount.
+    # upper dir on the host filesystem; the unit's ExecStartPre then mounts
+    # the overlay (single source of truth for the mount), and the kernel
+    # surfaces these files at the top of the merged stack. All overlay
+    # content (script-built lowers + this upper stage) is left4me-owned
+    # end-to-end, so the kernel's overlay copy-up is uniform — no
+    # ownership crossings to reason about.
     emit_step("staging server.cfg + per-overlay aliases in upper layer...", on_stdout, passthrough)
     upper_cfg_dir = runtime_dir / "upper" / "left4dead2" / "cfg"
     upper_cfg_dir.mkdir(parents=True, exist_ok=True)
diff --git a/l4d2web/services/overlay_builders.py b/l4d2web/services/overlay_builders.py
index 7acda04..adbcc1d 100644
--- a/l4d2web/services/overlay_builders.py
+++ b/l4d2web/services/overlay_builders.py
@@ -339,7 +339,7 @@ def run_sandboxed_script(
         f.write(script_text or "")
         script_path = f.name
     # NamedTemporaryFile creates 0600 owned by the web user; the sandbox runs
-    # as l4d2-sandbox and needs to read it (bind-mounted at /script.sh inside
+    # as left4me and needs to read it (bind-mounted at /script.sh inside
     # the sandbox). Script content is not a secret — it's plain bash stored
     # in the DB and editable by the user — so 0644 is appropriate.
     os.chmod(script_path, 0o644)
diff --git a/scripts/libexec/left4me-script-sandbox b/scripts/libexec/left4me-script-sandbox
index d3ce299..748d260 100755
--- a/scripts/libexec/left4me-script-sandbox
+++ b/scripts/libexec/left4me-script-sandbox
@@ -15,8 +15,11 @@
 # LockPersonality, RestrictSUIDSGID. Network namespace is *not* restricted —
 # scripts must reach the public internet to download workshop / l4d2center
 # / cedapug content. PID namespace is shared with the host (no
-# PrivatePID= directive in systemd); host PIDs are visible via /proc but
-# not signal-able due to UID mismatch.
+# PrivatePID= directive in systemd); host PIDs are visible via /proc.
+# Same-uid attack surface (the sandbox runs as left4me, so do the
+# gameservers and the web app) is covered by the hardening profile plus
+# system-wide kernel.yama.ptrace_scope=2 — see
+# docs/superpowers/specs/2026-05-15-hardening-threat-model.md.
 set -euo pipefail
 
 # Self-wrap into PID 1's mount namespace before doing anything mount-related.
@@ -46,43 +49,12 @@ if [[ "${LEFT4ME_SCRIPT_SANDBOX_DRY_RUN:-}" == "1" ]]; then
     exit 0
 fi
 
-# Pre-create an idmapped bind of the overlay dir, then point the sandbox's
-# BindPaths at that staging path. The bind translates the sandbox's writing
-# uid (l4d2-sandbox) back to left4me on disk, so all overlay content
-# (script-built and workshop) is uniformly left4me-owned. Map direction:
-# `--map-users=::1` with disk=left4me, mount=sandbox —
-# a process inside the bind with uid sandbox sees its uid as itself, and
-# writes get translated to disk-uid left4me. Verified on kernel 6.12 that
-# idmap propagates through systemd-run's plain re-bind of the staging path.
-LEFT4ME_UID=$(id -u left4me)
-LEFT4ME_GID=$(id -g left4me)
-SANDBOX_UID=$(id -u l4d2-sandbox)
-SANDBOX_GID=$(id -g l4d2-sandbox)
-STAGING=/var/lib/left4me/tmp/sandbox-idmap-${OVERLAY_ID}
-
-# trap fires even on errors / signals so the staging bind doesn't outlive
-# this invocation. Idempotent if the staging is already gone.
-cleanup_staging() {
-    umount "$STAGING" 2>/dev/null || true
-    rmdir "$STAGING" 2>/dev/null || true
-}
-trap cleanup_staging EXIT
-
-# A leftover staging mount from a SIGKILLed prior run can be reset by
-# umounting first, then re-binding fresh on the same path.
-umount "$STAGING" 2>/dev/null || true
-mkdir -p "$STAGING"
-mount --bind \
-    --map-users="${LEFT4ME_UID}:${SANDBOX_UID}:1" \
-    --map-groups="${LEFT4ME_GID}:${SANDBOX_GID}:1" \
-    "$OVERLAY_DIR" "$STAGING"
-
 SCRIPT_RC=0
 systemd-run --quiet --collect --wait --pipe \
     --unit="left4me-script-${OVERLAY_ID}-$$" \
     --slice=l4d2-build.slice \
     -p OOMScoreAdjust=500 \
-    -p User=l4d2-sandbox -p Group=l4d2-sandbox \
+    -p User=left4me -p Group=left4me \
     -p UMask=0022 \
     -p NoNewPrivileges=yes \
     -p ProtectSystem=strict -p ProtectHome=yes \
@@ -99,7 +71,7 @@ systemd-run --quiet --collect --wait --pipe \
     -p IPAddressDeny="127.0.0.0/8 ::1/128 169.254.0.0/16 fe80::/10 224.0.0.0/4 ff00::/8 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 100.64.0.0/10 fc00::/7" \
     -p TemporaryFileSystem="/etc /var/lib" \
     -p BindReadOnlyPaths="/etc/left4me/sandbox-resolv.conf:/etc/resolv.conf /etc/ssl /etc/ca-certificates /etc/nsswitch.conf /etc/alternatives ${SCRIPT}:/script.sh" \
-    -p BindPaths="${STAGING}:/overlay" \
+    -p BindPaths="${OVERLAY_DIR}:/overlay" \
     -p WorkingDirectory=/overlay \
     -p Environment="HOME=/tmp PATH=/usr/bin:/usr/sbin OVERLAY=/overlay" \
     -p MemoryMax=4G -p MemorySwapMax=0 -p TasksMax=512 \
diff --git a/scripts/tests/test_script_sandbox.py b/scripts/tests/test_script_sandbox.py
index 2bf2570..16e626d 100644
--- a/scripts/tests/test_script_sandbox.py
+++ b/scripts/tests/test_script_sandbox.py
@@ -33,8 +33,8 @@ def test_script_sandbox_helper_invokes_systemd_run_with_hardening():
     assert "bubblewrap" not in text
 
     # UID drop via systemd directives.
-    assert "User=l4d2-sandbox" in text
-    assert "Group=l4d2-sandbox" in text
+    assert "User=left4me" in text
+    assert "Group=left4me" in text
 
     # Cgroup limits unchanged from v1.
     assert "MemoryMax=4G" in text
@@ -80,7 +80,7 @@ def test_script_sandbox_helper_invokes_systemd_run_with_hardening():
     assert "/etc/nsswitch.conf" in text
     assert "/etc/alternatives" in text
     assert "${SCRIPT}:/script.sh" in text
-    assert 'BindPaths="${STAGING}:/overlay"' in text
+    assert 'BindPaths="${OVERLAY_DIR}:/overlay"' in text
 
     # IP egress filter: allow public, deny localhost / RFC1918 / link-local /
     # multicast / CGNAT / ULA. systemd's "more specific rule wins" semantics
@@ -110,29 +110,6 @@ def test_script_sandbox_helper_invokes_systemd_run_with_hardening():
         assert token in text, f"missing {token!r} in IPAddressDeny set"
 
 
-def test_script_sandbox_uses_idmap_staging():
-    """The sandbox runs as l4d2-sandbox but writes need to land on disk as
-    left4me, so all overlay content (workshop + script-built) is uniformly
-    left4me-owned. The helper pre-creates an idmapped bind on a staging
-    path and points the sandbox's BindPaths at the staging, not at the raw
-    overlay dir. trap cleans up the staging bind on exit.
-    """
-    text = SCRIPT_SANDBOX_HELPER.read_text()
-    # Idmap mount setup uses --map-users / --map-groups.
-    assert "--map-users=" in text
-    assert "--map-groups=" in text
-    # Staging path lives under /var/lib/left4me/tmp/sandbox-idmap-.
-    assert "/var/lib/left4me/tmp/sandbox-idmap-" in text
-    # BindPaths into the sandbox points at the staging path, not the
-    # raw overlay dir.
-    assert 'BindPaths="${STAGING}:/overlay"' in text
-    # trap registers cleanup so the staging bind doesn't outlive the helper.
-    assert "trap " in text and "cleanup_staging" in text
-    # The previous chown-to-l4d2-sandbox approach is gone; overlay dirs
-    # stay left4me-owned end-to-end.
-    assert "chown -R l4d2-sandbox" not in text
-
-
 def test_script_sandbox_in_build_slice_with_oom_adjust():
     text = SCRIPT_SANDBOX_HELPER.read_text()
 
@@ -162,10 +139,8 @@ def test_script_sandbox_helper_dry_run_mode(tmp_path):
     fake_script = tmp_path / "fake.sh"
     fake_script.write_text("echo hi")
 
-    # Run in DRY_RUN mode against a fake l4d2-sandbox UID via a tiny shim that
-    # simulates `id -u l4d2-sandbox` resolving to a valid number.
     helper_text = SCRIPT_SANDBOX_HELPER.read_text()
-    # We can't actually exec this without root + a real sandbox user; just
-    # verify the dry-run guard short-circuits before systemd-run runs.
+    # We can't actually exec this without root; just verify the dry-run
+    # guard short-circuits before systemd-run runs.
     assert 'LEFT4ME_SCRIPT_SANDBOX_DRY_RUN' in helper_text
     assert 'exit 0' in helper_text