left4me

Author	SHA1	Message	Date
mwiegand	7a25c2453c	fix(left4me-script-sandbox): self-wrap into PID 1's mount namespace The web service runs with PrivateTmp=true, which puts it in its own mount namespace. Worker invokes the sandbox helper via sudo from there; the helper's pre-systemd-run `mount --bind --map-users=...` lands in the web service's namespace. systemd-run then spawns transient units in PID 1's namespace where the bind is invisible — the BindPaths lookup finds an empty staging dir owned by root, and the sandbox uid hits permission-denied on every write. Mirror the pattern from left4me-overlay's ExecStartPre wrapper: enter PID 1's mount namespace at the start of the helper via `nsenter --mount=/proc/1/ns/mnt`. Sentinel env var avoids exec recursion. The gameserver helper handles this at the unit level; the script helper doesn't have a unit so we self-wrap. Diagnosis: 5 failed builds all hit the same EACCES on the first `mkdir`/`tar mkdir`. Direct SSH-sudo invocations of the same helper succeeded because SSH-sudo doesn't inherit a private namespace; only the worker-invoked path is affected. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-15 01:33:13 +02:00
mwiegand	48381089d3	refactor(left4me-overlay): move uid translation to script-sandbox build left4me-script-sandbox now pre-creates an idmapped bind staging path (--map-users=<left4me_uid>:<sandbox_uid>:1) and points the sandbox's BindPaths at that staging instead of the raw overlay dir. Writes from inside the sandbox (uid l4d2-sandbox) land on disk as left4me, so all overlay content is uniformly left4me-owned end-to-end. left4me-overlay loses ~165 lines of idmap-on-mount logic: the per-lowerdir stat + idmap-bind setup, the bind-umount loop in teardown, the uid lookup helpers, the _is_mountpoint /proc/self/mountinfo parser, and the LEFT4ME_TEST_* env-var stubs. It's back to a simple "validate lowerdirs, mount overlay" shape; gameserver mount path no longer needs to know about producer-side ownership decisions. Verified on kernel 6.12 that the kernel idmap propagates through systemd-run's plain re-bind of the staging path. Tests dropped 4 idmap-on-mount specs and one deploy-artifact regression check; added test_script_sandbox_uses_idmap_staging to pin the new staging path + map flags + trap cleanup. The post-build world-read chmod kludge in the sandbox is also dropped: the web app reads overlay files via its primary uid (left4me). Existing overlays on the test server are sandbox-owned from prior runs and need a one-shot `chown -R left4me:left4me /var/lib/left4me/overlays` during deploy. New overlays produced by the refactored sandbox are left4me-owned from creation. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-15 01:20:39 +02:00
mwiegand	dd918aca4b	fix(left4me-overlay): use /proc/self/mountinfo to detect bind mounts os.path.ismount() compares st_dev against the parent dir, which silently returns False for same-fs bind mounts. The idmap binds at runtime/<n>/idmap/<basename> are exactly that case, so: - cmd_umount skipped the bind-umount step every stop, leaving orphan binds in PID 1's mount namespace. - cmd_mount's idempotency check then "didn't see" the orphan and re-bound on top, accumulating one mount per start/stop cycle. Findmnt nesting like /var/lib/left4me/runtime/2/idmap/overlays_9 └─/var/lib/left4me/runtime/2/idmap/overlays_9 is the visible symptom. Reboot wipes everything so the bug is invisible on a fresh boot; only stop/start cycles accumulate. Replace both ismount sites with a _is_mountpoint() helper that reads /proc/self/mountinfo (column 5 is the mount point). Keep os.path.ismount for the overlay merged check, where it's reliable (distinct fs type). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-15 01:02:18 +02:00
mwiegand	90531864b3	harden(left4me-overlay): fix idmap collision risk, gate test stubs on PRINT_ONLY, wrap os.stat Issue #1: idmap target now uses parent+name (overlays_workshop instead of workshop) to prevent basename collisions across allowlist roots; explicit die() on collision detected in the loop. Issue #2: env-var uid stubs (renamed to LEFT4ME_TEST_SANDBOX_UID etc.) are only honoured when LEFT4ME_OVERLAY_PRINT_ONLY=1, so a misconfigured systemd unit override cannot influence real uid mapping. Issue #3: os.stat(lowerdir) is wrapped in try/except OSError with a die() that shell-quotes the path and includes the exception, matching the helper's existing error style. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-14 23:53:32 +02:00
mwiegand	2f6a9cfba0	feat(left4me-overlay): idmap bind mounts for l4d2-sandbox-owned lowerdirs Insert an idmapped bind mount in front of each lowerdir whose top-level uid matches l4d2-sandbox at overlay-mount time, so that overlayfs copy-up produces left4me-owned upperdir entries instead of EACCES. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-14 23:48:07 +02:00
mwiegand	878639147a	feat(deploy): left4me-apply-cake helper with apply/clear modes Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 00:52:16 +02:00
mwiegand	5eac51a93e	fix(deploy): wrap overlay helper with nsenter so it doesn't pin the unit's mount namespace systemd's `+` Exec prefix removes sandbox/credentials but does NOT detach from the unit's per-service mount namespace (created by PrivateTmp/Protect). The Python interpreter for the helper was launched inside that namespace, and even though the helper internally nsenter'd into PID 1 for the umount syscall, the calling Python process itself never left the unit's namespace. Its existence pinned the namespace alive, which kept the slave mount tree alive, which made PID 1's umount return EBUSY for the entire duration of the helper's run. The mount became unmountable the moment the helper exited — empirically verified by polling /proc//ns/mnt during stop: the only PID holding the dying namespace was the helper itself. Wrap both ExecStartPre and ExecStopPost with `/usr/bin/nsenter --mount=/proc/1/ns/mnt --` so the helper Python interpreter runs in PID 1's mount namespace from the start. With the helper out of the unit's namespace, umount succeeds first try once the cgroup empties. Reset went from ~25 s with retry/lazy-fallback workarounds to ~0.5 s clean. Knock-on cleanups: - Helper drops internal nsenter for the syscalls (already in PID 1's namespace), and drops the eager-retry loop + lazy-umount fallback + inner work_inner retry (no race left to ride out). - Revert TimeoutStopSec=60s back to 15s. - Tests updated to expect the new argv shapes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 15:13:59 +02:00
mwiegand	ff6ce7b091	refactor(l4d2-host): unmount via ExecStopPost — single code path mirroring mount Symmetric with the earlier mount cleanup (commits 519567e..a982995). Until now, the unit's ExecStartPre handled mount but the Python side still drove unmount: stop_instance and _purge_instance both called _mounter.unmount, which wrapped sudo + the helper. Two code paths for two halves of the same lifecycle. Move unmount into the unit: - ExecStopPost=+/usr/local/libexec/left4me/left4me-overlay umount %i (ExecStopPost, not ExecStop, so it runs after the cgroup is cleared; ExecStop runs while srcds is alive and would EBUSY the umount syscall.) - Helper's umount verb is now idempotent (mirrors mount): if merged isn't a mount point, return early. PRINT_ONLY mode bypasses both short-circuits so the unit tests still exercise the full nsenter argv. Drop the dead Python machinery: - _mounter.unmount(...) calls in stop_instance and _purge_instance - _mounter global + KernelOverlayFSMounter import - The whole l4d2host/fs/ package (OverlayMounter ABC + KernelOverlayFSMounter class) — no production callers, just self-tests - l4d2host/tests/test_kernel_overlayfs.py - test_stop_succeeds_when_unmount_fails / test_delete_succeeds_when_unmount_fails (tested Python-side unmount-failure tolerance that no longer exists) - The l4d2host.fs.kernel_overlayfs.run_command monkeypatches in lifecycle tests After this, the only thing start_instance does beyond cfg-staging is ask systemd to enable+start the unit. stop/delete/reset only ask systemd to disable; the overlay lifecycle lives entirely in the unit file. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 13:09:52 +02:00
mwiegand	519567e156	fix(l4d2-host): mount overlay via ExecStartPre so enabled units boot cleanly The lifecycle change to systemctl enable --now (commit `8552c55`) made units auto-start at boot. But the kernel-overlayfs mount is volatile (reboot kills it), and the web app's start_instance only re-mounts in response to a UI click. Result: at boot, systemd starts the unit, finds empty merged/, CHDIR fails, Restart=on-failure spins forever (counter hit 65 on ckn before this fix landed). Fix: - Unit gets `ExecStartPre=/usr/bin/sudo -n .../left4me-overlay mount %i` so the overlay is established before the main process starts. - Helper is now idempotent: if merged is already a mount point, exit 0. Required because Restart=on-failure re-runs ExecStartPre on each cycle, and the web-app's start_instance also calls the helper, so both paths would otherwise collide on "already mounted". - StartLimitBurst=5 + StartLimitIntervalSec=60s caps the restart loop instead of letting it spin indefinitely on a fundamental failure. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 12:47:20 +02:00
mwiegand	8552c559d3	feat(l4d2-host): server lifecycle uses systemctl enable --now / disable --now Servers started via the web UI now create a WantedBy= symlink under multi-user.target.wants/, so they auto-start on the next host reboot. Helper verbs renamed start/stop -> enable/disable; service_control.py renamed start_service/stop_service -> enable_service/disable_service. The user-facing l4d2ctl start/stop commands keep their names per the AGENTS.md contract -- only the implementation changes. Spec: docs/superpowers/specs/2026-05-09-l4d2-server-lifecycle-reboot-and-drift-design.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 12:28:44 +02:00
mwiegand	7e4a5691ed	feat(deploy): script-sandbox runs in l4d2-build.slice + OOMScoreAdjust=500 Builds yield CPU/IO to game-server instances under contention via the slice's weight=10, and are killed first under memory pressure (servers have OOMScoreAdjust=-200).	2026-05-09 10:01:38 +02:00
mwiegand	965b67e6fc	fix(l4d2-host): script-sandbox normalizes file perms so web user can read Cedapug's build script writes .cedapug/manifest.tsv with mode 0600 owned by l4d2-sandbox; the web service (left4me uid) then 500s when streaming that file via the download route — PermissionError on open(). Two fixes: - UMask=0022 on the systemd-run unit so new file writes default to 0644 / dirs to 0755. - Post-script chmod o+r/o+rx walk over the overlay dir to backfill any stricter modes the script left behind (e.g. shells/tools that ignore umask and explicitly create with 0600). The helper no longer execs systemd-run; it captures the rc, runs the post-step, and exits with the original rc. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 01:44:26 +02:00
mwiegand	7e66936d03	feat(deploy): restrict script-sandbox egress to public internet only Adds IPAddressDeny= to the sandbox unit covering loopback (127/8 + ::1), link-local (169.254/16 + fe80::/10), multicast (224/4 + ff00::/8), all RFC1918 v4 (10/8, 172.16/12, 192.168/16), CGNAT (100.64/10), and ULA v6 (fc00::/7). The kernel attaches systemd's sd_fw_egress BPF program to the unit's cgroup; egress packets matching any of the deny prefixes are silently dropped at the cgroup boundary. Important: do NOT pair this with `IPAddressAllow=any`. Documentation claims "more specific rule wins" but on this systemd 257 + kernel 6.12 combo, having both set causes the allow to win unconditionally — the deny gets ignored. Empty IPAddressAllow + populated IPAddressDeny is the correct shape: kernel default "allow all" applies to non-listed addresses, and the listed prefixes are blocked. Because the host's resolv.conf typically points at a private-IP DNS server (10.0.0.1 in the test deploy), blocking RFC1918 also kills DNS. Adds a static /etc/left4me/sandbox-resolv.conf with public resolvers (Cloudflare 1.1.1.1, Google 8.8.8.8) and bind-mounts that into the sandbox at /etc/resolv.conf, replacing the host's resolver inside the sandbox only. Smoke-tested on ckn@10.0.4.128: - public 1.1.1.1:443: CONNECTED - public HTTPS via DNS (steamcommunity.com): 200 - localhost web app 127.0.0.1:8000: blocked (TimeoutError) - localhost sshd 127.0.0.1:22: blocked - private LAN ssh 10.0.4.128:22: blocked - private DNS 10.0.0.1:53: blocked AF_UNIX stays in RestrictAddressFamilies — dropping it would risk breaking NSS / syslog for marginal gain, and the IP-level filter addresses the primary threat (reaching the host's HTTP/SSH services). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 17:04:57 +02:00
mwiegand	4ee8f6af44	refactor(deploy): rewrite left4me-script-sandbox to systemd-only — drop bwrap Replaces the systemd-run --scope + bwrap composition with systemd-run in service-unit mode (--pipe --wait, transient .service unit). Same cgroup limits and walltime kill, plus the hardening directives that --scope units cannot carry: NoNewPrivileges, ProtectSystem=strict, ProtectHome, ProtectKernel{Tunables,Modules,Logs,ControlGroups}, RestrictNamespaces, RestrictAddressFamilies, RestrictSUIDSGID, LockPersonality, MemoryDenyWriteExecute, SystemCallFilter (seccomp), and an empty CapabilityBoundingSet (drops all caps). UID drop via User=/Group=. The TemporaryFileSystem="/etc /var/lib" pair is the gotcha: ProtectSystem=strict makes /var/lib read-only but visible, so the host DB at /var/lib/left4me/left4me.db (mode 0644) was readable from inside. Masking /var/lib with tmpfs hides the entire subtree; the BindPaths bind to /overlay is at a different path and unaffected. The Python side (ScriptBuilder, run_sandboxed_script, routes) is unchanged — same sudo-helper invocation, same argv shape. Loses PID-namespace isolation (no PrivatePID= directive in systemd). Host PIDs are visible via /proc and ps -ef but not signal-able due to UID mismatch — information disclosure only, not a privilege boundary. Smoke-tested on ckn@10.0.4.128 prior to this commit; all isolation invariants reproduced and the hardening directives provably blocked unshare(2), mount(2), personality(2), bpf(2), and sysctl writes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 16:47:30 +02:00
mwiegand	06ae84fbe4	fix(deploy): script-sandbox helper — UID drop via systemd-run, --unshare-user-try, /etc/alternatives Smoke testing on the test host revealed three issues with the helper as shipped: 1. bwrap 0.11+ rejects --uid without --unshare-user. Switching the UID drop from inside bwrap to systemd-run (--uid=l4d2-sandbox --gid=l4d2-sandbox) sidesteps the userns UID-mapping headaches and keeps file ownership on the bind-mounted /overlay matching l4d2-sandbox on the host (which the wipe path relies on). 2. bwrap running as an unprivileged uid still needs a user namespace to set up its mount-namespace bind-mounts. Adding --unshare-user-try gives it the userns context when needed and is a no-op otherwise. 3. /etc/alternatives wasn't bind-mounted, so symlinked tools like /usr/bin/awk -> /etc/alternatives/awk fell over inside the sandbox. Adds the ro-bind. Also: the helper now chowns the overlay dir to l4d2-sandbox before bwrap (idempotent — needed because the web app creates the dir as left4me), and the deploy script chmods /var/lib/left4me to 0711 so l4d2-sandbox can traverse to the bind-mount source. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 16:12:46 +02:00
mwiegand	75e703e1a4	feat(deploy): left4me-script-sandbox helper + sudoers fragment Privileged bash helper that wraps user-authored scripts in systemd-run --scope (cgroup limits + RuntimeMaxSec=3600) inside a bubblewrap sandbox dropped to the l4d2-sandbox uid. Network is shared with the host so scripts can fetch from Steam / l4d2center / etc.; filesystem is RO except for /overlay (rw bind from /var/lib/left4me/overlays/{id}) and tmpfs /tmp + /run. Adds a sudoers rule allowing the left4me user to invoke this helper without restrictions on its arguments. Strict argument validation is in the helper itself. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 15:53:21 +02:00
mwiegand	d5b321b557	feat(l4d2-host): KernelOverlayFSMounter + left4me-overlay helper New privileged helper at /usr/local/libexec/left4me/left4me-overlay (Python, system /usr/bin/python3, stdlib only) takes only the instance name, parses instance.env for L4D2_LOWERDIRS, validates each lowerdir against an allowlist (installation/, overlays/, global_overlay_cache/, workshop_cache/), refuses upperdirs tainted with user.fuseoverlayfs.* xattrs from the prior fuse era, and execs `nsenter --mount=/proc/1/ns/mnt -- mount -t overlay ...` so the resulting mount lives in the host namespace. Mirrors the existing left4me-systemctl / left4me-journalctl pattern; sudoers entry is verb-constrained. KernelOverlayFSMounter implements the existing OverlayMounter ABC, deriving the instance name from the merged path. No call sites use it yet — that's the next commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 12:23:58 +02:00
mwiegand	bbfc528354	feat(deploy): add production-like test deployment	2026-05-06 19:30:10 +02:00

18 commits
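
Commit 7e66936d03 lists the prefixes placed in IPAddressDeny= and the smoke-test targets that were reachable or blocked. As a rough illustration of the prefix matching only, the Python sketch below checks an address against that same list using the standard ipaddress module. It does not model systemd's BPF egress filter or the IPAddressAllow= interaction noted in the commit; DENY_PREFIXES and is_denied are hypothetical names introduced here.

```python
import ipaddress

# Prefixes as enumerated in the commit message: loopback, link-local,
# multicast, RFC1918, CGNAT and IPv6 ULA.
DENY_PREFIXES = [
    "127.0.0.0/8", "::1/128",
    "169.254.0.0/16", "fe80::/10",
    "224.0.0.0/4", "ff00::/8",
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
    "100.64.0.0/10",
    "fc00::/7",
]

_DENY_NETWORKS = [ipaddress.ip_network(prefix) for prefix in DENY_PREFIXES]


def is_denied(address):
    """Return True if `address` falls inside any of the deny prefixes."""
    ip = ipaddress.ip_address(address)
    # Only compare against networks of the same IP version.
    return any(
        ip.version == net.version and ip in net for net in _DENY_NETWORKS
    )
```

Under this model the commit's smoke-test targets come out as described: 127.0.0.1, 10.0.4.128 and 10.0.0.1 match a deny prefix, while the public resolvers 1.1.1.1 and 8.8.8.8 do not.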