Hardening directives leave the base unit body and live in:
deploy/files/etc/systemd/system/left4me-web.service.d/10-hardening.conf
deploy/files/etc/systemd/system/left4me-server@.service.d/10-hardening.conf
Reference units now describe just the base operational shape (exec,
env, restart, resources). Tests split: base-unit content and hardening
profile are asserted separately.
Part of 2026-05-15-deployment-responsibility-design.md migration
step 2. ckn-bw lands the matching reactor surgery + symlink delivery.
Single source of truth for left4me sysctl tuning. The metadata entry
in ckn-bw (sysctl/kernel/yama/ptrace_scope) is removed in lockstep;
the live value is unchanged.
Part of 2026-05-15-deployment-responsibility-design.md migration step 1
(canary).
Sync deployment references for the runtime state relocation
shipped via ckn-bw (commit 6fae2fd). /opt/left4me/ is now a
root-owned deploy-artifact root (just src/); .venv and steamcmd
live at /var/lib/left4me/{.venv,steam}.
Touches:
- deploy/files/.../left4me-web.service: PATH + ExecStart
- deploy/files/.../left4me-workshop-refresh.service: WorkingDirectory
(was /opt/left4me, now /opt/left4me/src to match the web unit),
PATH, ExecStart
- scripts/sbin/left4me wrapper: flask path
- deploy/tests/test_example_units.py: PATH + ExecStart assertions
for the web unit; also fix a pre-existing broken assertion that
read "Environment=PATH=..." (the unit has Environment=HOME=...
PATH=... on one line, so "Environment=PATH=" was never present)
- now reads just "PATH=..."
- deploy/README.md: paths
- l4d2host/tests/test_cli.py: LEFT4ME_STEAMCMD fixture path
Design + as-shipped record:
docs/superpowers/specs/2026-05-15-runtime-state-relocation-design.md.
The original (narrower) prereq spec at
docs/superpowers/specs/2026-05-15-handoff-noneditable-install.md
is marked superseded with a pointer to what shipped + why the
scope grew (setuptools writes egg-info to source during PEP 517
build prep).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pulls the 5 privileged helpers out of deploy/files/usr/local/{libexec,sbin}/
into top-level scripts/{libexec,sbin}/. They are application-inherent code
(invoked at runtime via sudo from l4d2host/l4d2web), not deploy artifacts —
the previous nesting under deploy/files/ confused source-of-truth with
install-target FHS layout.
deploy/ now means "reference exemplar": README explaining the target
layout, plus example sudoers / sysctl / sandbox-resolv.conf / env
templates / curated systemd units (the ones ckn-bw's reactor emits).
Anyone building a fresh deployment (other than ckn-bw) reads this tree.
Dead static artifacts deleted: left4me-apply-cake helper, left4me-cake
+ left4me-nft-mark service units, cake.env, left4me-mark.nft, and the
superseded deploy-test-server.sh installer.
Tests split to match the new shape:
- scripts/tests/{test_overlay,test_script_sandbox,test_systemctl_helper,
test_journalctl_helper,test_helpers_use_fixed_paths,test_sudoers_grants}.py
with shared fixtures in conftest.py
- deploy/tests/test_example_units.py (renamed from test_deploy_artifacts.py)
— slimmed to lock down the curated example units, sysctl, env templates
l4d2host/tests/test_overlay_helper.py: helper-source path updated to
scripts/libexec/left4me-overlay (was building the path segment-by-segment
under deploy/files/, missed by the path-prefix grep during pre-flight).
Runtime install-target paths (/usr/local/{libexec,sbin}/) unchanged, so
l4d2host/service_control.py, l4d2web/services/overlay_builders.py, the
sudoers grants, and the systemd units all keep their existing path
references.
Requires the matching ckn-bw change to bundles/left4me/items.py
(install_left4me_scripts repointed from /opt/left4me/src/deploy/files/...
to /opt/left4me/src/scripts/...). Left4me lands first so a fresh
git_deploy exposes the new source path before the bundle apply runs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
left4me-script-sandbox now pre-creates an idmapped bind staging path
(--map-users=<left4me_uid>:<sandbox_uid>:1) and points the sandbox's
BindPaths at that staging instead of the raw overlay dir. Writes from
inside the sandbox (uid l4d2-sandbox) land on disk as left4me, so all
overlay content is uniformly left4me-owned end-to-end.
left4me-overlay loses ~165 lines of idmap-on-mount logic: the per-
lowerdir stat + idmap-bind setup, the bind-umount loop in teardown,
the uid lookup helpers, the _is_mountpoint /proc/self/mountinfo parser,
and the LEFT4ME_TEST_* env-var stubs. It's back to a simple "validate
lowerdirs, mount overlay" shape; gameserver mount path no longer needs
to know about producer-side ownership decisions.
Verified on kernel 6.12 that the kernel idmap propagates through
systemd-run's plain re-bind of the staging path. Tests dropped 4
idmap-on-mount specs and one deploy-artifact regression check; added
test_script_sandbox_uses_idmap_staging to pin the new staging path
+ map flags + trap cleanup.
The post-build world-read chmod kludge in the sandbox is also dropped:
the web app reads overlay files via its primary uid (left4me).
Existing overlays on the test server are sandbox-owned from prior runs
and need a one-shot `chown -R left4me:left4me /var/lib/left4me/overlays`
during deploy. New overlays produced by the refactored sandbox are
left4me-owned from creation.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Guards against silent regression of the idmap bind-mount step in the
privileged kernel-overlayfs helper. Asserts --map-users / --map-groups
argv, the runtime/<name>/idmap/ target path, the LEFT4ME_TEST_* stub-
env-var names, and the collision-detection table.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
For the live-state panel's Steam profile enrichment (persona names +
avatars). Optional: empty value disables enrichment and the panel falls
back to in-game names + placeholder avatars.
The actual web.env is materialized by the ckn-bw bundle's Mako; the
template here documents the operator-facing shape.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Installs nftables via apt/dnf, copies left4me-mark.nft and left4me-apply-cake
helper into system paths, conditionally seeds cake.env (preserving operator
edits), and enables left4me-nft-mark.service + left4me-cake.service on deploy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The CAKE egress shaper now has a systemd unit that wraps the
left4me-apply-cake helper in apply and clear modes. The unit is a
oneshot that starts after network-online and survives service restarts,
allowing the shaper to persist across reboots and be managed by systemd.
The environment file is marked non-fatal (EnvironmentFile=-) to handle
missing or incomplete configurations gracefully.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
systemd's `+` Exec prefix removes sandbox/credentials but does NOT
detach from the unit's per-service mount namespace (created by
PrivateTmp/Protect*). The Python interpreter for the helper was
launched inside that namespace, and even though the helper internally
nsenter'd into PID 1 for the umount syscall, the calling Python
process itself never left the unit's namespace. Its existence pinned
the namespace alive, which kept the slave mount tree alive, which
made PID 1's umount return EBUSY for the entire duration of the
helper's run. The mount became unmountable the moment the helper
exited — empirically verified by polling /proc/*/ns/mnt during stop:
the only PID holding the dying namespace was the helper itself.
Wrap both ExecStartPre and ExecStopPost with `/usr/bin/nsenter
--mount=/proc/1/ns/mnt --` so the helper Python interpreter runs in
PID 1's mount namespace from the start. With the helper out of the
unit's namespace, umount succeeds first try once the cgroup empties.
Reset went from ~25 s with retry/lazy-fallback workarounds to ~0.5 s
clean.
Knock-on cleanups:
- Helper drops internal nsenter for the syscalls (already in PID 1's
namespace), and drops the eager-retry loop + lazy-umount fallback +
inner work_inner retry (no race left to ride out).
- Revert TimeoutStopSec=60s back to 15s.
- Tests updated to expect the new argv shapes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
srcds_run is a shell script that cd's to its own dirname before exec'ing
srcds_linux, so WorkingDirectory has no effect — the binary's path is what
determines where the engine reads gameinfo.txt and addons from. Pointing
at installation/srcds_run resolved everything against the lower layer, so
overlay-provided Metamod/SourceMod plugins and cfgs (zonemod, confogl)
never loaded. Switch to runtime/%i/merged/srcds_run so the engine sees
the merged tree.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`test_deploy_script_has_safe_defaults_and_preserves_state` had been red
since commit caa8b83 ("rewrite web.env every deploy with machine-id-
derived SECRET_KEY"). Two assertions encoded the prior model:
- `if [ ! -f /etc/left4me/web.env ]` — the create-only-if-missing guard
caa8b83 removed in favor of unconditional `install -m 0640 ...`.
- `. /etc/left4me/web.env not in script` — masked by the first failing
but also stale: the deploy intentionally sources web.env in the
alembic and seed-script-overlays helper subprocesses so they get
DATABASE_URL.
Removed both. The full suite now runs 0 failed. The note left in place
points future readers at the live coverage path (install + SECRET_KEY
rewrite + run_left4me_with_env plumbing already asserted nearby).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Symmetric with the earlier mount cleanup (commits 519567e..a982995). Until
now, the unit's ExecStartPre handled mount but the Python side still drove
unmount: stop_instance and _purge_instance both called _mounter.unmount,
which wrapped sudo + the helper. Two code paths for two halves of the
same lifecycle.
Move unmount into the unit:
- ExecStopPost=+/usr/local/libexec/left4me/left4me-overlay umount %i
(ExecStopPost, not ExecStop, so it runs after the cgroup is cleared;
ExecStop runs while srcds is alive and would EBUSY the umount syscall.)
- Helper's umount verb is now idempotent (mirrors mount): if merged
isn't a mount point, return early. PRINT_ONLY mode bypasses both
short-circuits so the unit tests still exercise the full nsenter argv.
Drop the dead Python machinery:
- _mounter.unmount(...) calls in stop_instance and _purge_instance
- _mounter global + KernelOverlayFSMounter import
- The whole l4d2host/fs/ package (OverlayMounter ABC + KernelOverlayFSMounter
class) — no production callers, just self-tests
- l4d2host/tests/test_kernel_overlayfs.py
- test_stop_succeeds_when_unmount_fails / test_delete_succeeds_when_unmount_fails
(tested Python-side unmount-failure tolerance that no longer exists)
- The l4d2host.fs.kernel_overlayfs.run_command monkeypatches in lifecycle tests
After this, the only thing start_instance does beyond cfg-staging is ask
systemd to enable+start the unit. stop/delete/reset only ask systemd to
disable; the overlay lifecycle lives entirely in the unit file.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The unit has NoNewPrivileges=true (security hardening for srcds), which
blocks sudo's setuid escalation. The previous sudo'd ExecStartPre failed
on every start with "sudo: the 'no new privileges' switch is set, which
prevents sudo from running as root" -> Restart=on-failure loop.
systemd's `+` prefix runs the Exec command as PID 1 (root, no sandbox),
bypassing User=/Group=/NoNewPrivileges=. Equivalent privilege scope to
the sudoers rule the web app already uses for the same helper, just
without the sudo middleman.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
systemd applies WorkingDirectory= to every Exec line including ExecStartPre.
With the merged dir not yet existing at boot time (the volatile overlay
mount has been wiped), the chdir into runtime/%i/merged/left4dead2 fails
with status=200/CHDIR before ExecStartPre can run the mount helper.
The `-` prefix makes chdir failure non-fatal: ExecStartPre runs in the
unit's home (cwd doesn't matter for the mount helper); ExecStart re-applies
WorkingDirectory once the mount has landed and chdirs successfully.
Companion to commit 519567e (which added the ExecStartPre mount + helper
idempotency but didn't account for the WorkingDirectory ordering).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The lifecycle change to systemctl enable --now (commit 8552c55) made
units auto-start at boot. But the kernel-overlayfs mount is volatile
(reboot kills it), and the web app's start_instance only re-mounts in
response to a UI click. Result: at boot, systemd starts the unit, finds
empty merged/, CHDIR fails, Restart=on-failure spins forever (counter
hit 65 on ckn before this fix landed).
Fix:
- Unit gets `ExecStartPre=/usr/bin/sudo -n .../left4me-overlay mount %i`
so the overlay is established before the main process starts.
- Helper is now idempotent: if merged is already a mount point, exit 0.
Required because Restart=on-failure re-runs ExecStartPre on each
cycle, and the web-app's start_instance also calls the helper, so
both paths would otherwise collide on "already mounted".
- StartLimitBurst=5 + StartLimitIntervalSec=60s caps the restart loop
instead of letting it spin indefinitely on a fundamental failure.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Servers started via the web UI now create a WantedBy= symlink under
multi-user.target.wants/, so they auto-start on the next host reboot.
Helper verbs renamed start/stop -> enable/disable; service_control.py
renamed start_service/stop_service -> enable_service/disable_service.
The user-facing l4d2ctl start/stop commands keep their names per the
AGENTS.md contract -- only the implementation changes. Spec:
docs/superpowers/specs/2026-05-09-l4d2-server-lifecycle-reboot-and-drift-design.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Computes NPROC at deploy time. Defaults LEFT4ME_SYSTEM_CPUS=0 and
LEFT4ME_GAME_CPUS=1-(NPROC-1). Single-core hosts skip cpuset writes
with a stderr warning unless an env var override is set. Spec:
docs/superpowers/specs/2026-05-09-l4d2-cpu-isolation-design.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copies l4d2-game.slice and l4d2-build.slice into
/usr/local/lib/systemd/system/, installs 99-left4me.conf into
/etc/sysctl.d/, and runs sysctl --system so the perf baseline is
live this deploy, not on next reboot.
Builds yield CPU/IO to game-server instances under contention via the
slice's weight=10, and are killed first under memory pressure
(servers have OOMScoreAdjust=-200).
Flat top-level slices. Game wins under contention; build still gets
the box when uncontended. Referenced by left4me-server@.service and
the script-sandbox systemd-run invocation.
Script overlays commonly need 7z and md5sum (e.g. the l4d2center map
sync recipe). Add p7zip-full to the apt install line, p7zip + p7zip-plugins
to dnf, and coreutils explicitly so md5sum is guaranteed even on slim base
images. Lock both in with a regression test.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SQLite in WAL mode (the default for this app) maintains left4me.db-wal
and left4me.db-shm sidecar files alongside the main DB. All three must
be writable by the web service uid; if any one is root-owned, SQLite
reports "attempt to write a readonly database" on the next INSERT —
which surfaced as a 500 on POST /overlays/{id}/script after I'd done
ad-hoc root-side sqlite3.connect() inspection earlier and the resulting
root-owned WAL/SHM persisted.
Loop over all three paths in the deploy chmod step so root-owned
sidecars are corrected on every deploy. Idempotent.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The v2 hardening tightened the DB to mode 0640 owned by root:left4me,
intending to block reads from the sandbox uid (l4d2-sandbox, not in the
left4me group). It did — but it also took away write access from the web
service itself, which runs as user left4me. With root owning the file,
left4me only had group-read; INSERTs into the jobs table failed with
"attempt to write a readonly database" and surfaced as a 500 on POST
/overlays/{id}/script.
Owner left4me + group left4me + mode 0640 keeps the same external
posture (l4d2-sandbox gets nothing via "other") while restoring the
web service's read+write access via "owner".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds IPAddressDeny= to the sandbox unit covering loopback (127/8 + ::1),
link-local (169.254/16 + fe80::/10), multicast (224/4 + ff00::/8), all
RFC1918 v4 (10/8, 172.16/12, 192.168/16), CGNAT (100.64/10), and ULA v6
(fc00::/7). The kernel attaches systemd's sd_fw_egress BPF program to
the unit's cgroup; egress packets matching any of the deny prefixes are
silently dropped at the cgroup boundary.
Important: do NOT pair this with `IPAddressAllow=any`. Documentation
claims "more specific rule wins" but on this systemd 257 + kernel 6.12
combo, having both set causes the allow to win unconditionally — the
deny gets ignored. Empty IPAddressAllow + populated IPAddressDeny is the
correct shape: kernel default "allow all" applies to non-listed
addresses, and the listed prefixes are blocked.
Because the host's resolv.conf typically points at a private-IP DNS
server (10.0.0.1 in the test deploy), blocking RFC1918 also kills DNS.
Adds a static /etc/left4me/sandbox-resolv.conf with public resolvers
(Cloudflare 1.1.1.1, Google 8.8.8.8) and bind-mounts that into the
sandbox at /etc/resolv.conf, replacing the host's resolver inside the
sandbox only.
Smoke-tested on ckn@10.0.4.128:
- public 1.1.1.1:443: CONNECTED
- public HTTPS via DNS (steamcommunity.com): 200
- localhost web app 127.0.0.1:8000: blocked (TimeoutError)
- localhost sshd 127.0.0.1:22: blocked
- private LAN ssh 10.0.4.128:22: blocked
- private DNS 10.0.0.1:53: blocked
AF_UNIX stays in RestrictAddressFamilies — dropping it would risk
breaking NSS / syslog for marginal gain, and the IP-level filter
addresses the primary threat (reaching the host's HTTP/SSH services).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bubblewrap is no longer used now that left4me-script-sandbox runs as a
systemd service unit. Remove it from the apt-get and dnf install lines.
Also tighten the application database file mode after the alembic
upgrade step: chown root:left4me, chmod 0640. The DB had been created
at default 0644 by SQLite's open() call inside the web service, which
made it world-readable on the host — i.e. readable by any uid that can
traverse /var/lib/left4me, including the sandbox's l4d2-sandbox uid.
Smoke-testing the v2 sandbox prototype on ckn@10.0.4.128 surfaced this:
the sandbox could read "SQLite format 3" from the DB until the parent
dir was masked with TemporaryFileSystem=. Tightening the file mode is
the host-level fix; the sandbox-level mask is defense in depth.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the systemd-run --scope + bwrap composition with systemd-run in
service-unit mode (--pipe --wait, transient .service unit). Same cgroup
limits and walltime kill, plus the hardening directives that --scope
units cannot carry: NoNewPrivileges, ProtectSystem=strict, ProtectHome,
ProtectKernel{Tunables,Modules,Logs,ControlGroups}, RestrictNamespaces,
RestrictAddressFamilies, RestrictSUIDSGID, LockPersonality,
MemoryDenyWriteExecute, SystemCallFilter (seccomp), and an empty
CapabilityBoundingSet (drops all caps). UID drop via User=/Group=.
The TemporaryFileSystem="/etc /var/lib" pair is the gotcha:
ProtectSystem=strict makes /var/lib *read-only* but visible, so the host
DB at /var/lib/left4me/left4me.db (mode 0644) was readable from inside.
Masking /var/lib with tmpfs hides the entire subtree; the BindPaths bind
to /overlay is at a different path and unaffected.
The Python side (ScriptBuilder, run_sandboxed_script, routes) is
unchanged — same sudo-helper invocation, same argv shape.
Loses PID-namespace isolation (no PrivatePID= directive in systemd).
Host PIDs are visible via /proc and ps -ef but not signal-able due to
UID mismatch — information disclosure only, not a privilege boundary.
Smoke-tested on ckn@10.0.4.128 prior to this commit; all isolation
invariants reproduced and the hardening directives provably blocked
unshare(2), mount(2), personality(2), bpf(2), and sysctl writes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Migration 0005_script_overlays drops the legacy l4d2center_maps /
cedapug_maps overlay rows but leaves their /var/lib/left4me/overlays/{id}
directories on disk. When the web app subsequently creates a new overlay
and AUTOINCREMENT issues an id matching one of those orphans,
create_overlay_directory(exist_ok=False) crashes with FileExistsError —
which surfaced as a 500 on POST /overlays the first time a script
overlay was created on a deployed test box.
Adds a sentinel-gated sweep in deploy-test-server.sh that lists overlay
ids in the DB, removes any directory under overlays/ whose id has no
matching row, and drops the now-unused global_overlay_cache. Mirrors the
.kernel-overlay-migrated sentinel pattern so reruns are no-ops.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Smoke testing on the test host revealed three issues with the helper as
shipped:
1. bwrap 0.11+ rejects --uid without --unshare-user. Switching the UID
drop from inside bwrap to systemd-run (--uid=l4d2-sandbox
--gid=l4d2-sandbox) sidesteps the userns UID-mapping headaches and
keeps file ownership on the bind-mounted /overlay matching
l4d2-sandbox on the host (which the wipe path relies on).
2. bwrap running as an unprivileged uid still needs a user namespace to
set up its mount-namespace bind-mounts. Adding --unshare-user-try
gives it the userns context when needed and is a no-op otherwise.
3. /etc/alternatives wasn't bind-mounted, so symlinked tools like
/usr/bin/awk -> /etc/alternatives/awk fell over inside the sandbox.
Adds the ro-bind.
Also: the helper now chowns the overlay dir to l4d2-sandbox before bwrap
(idempotent — needed because the web app creates the dir as left4me),
and the deploy script chmods /var/lib/left4me to 0711 so l4d2-sandbox
can traverse to the bind-mount source.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
deploy-test-server.sh: provisions the l4d2-sandbox system user (no home,
nologin shell) and installs the bubblewrap apt/dnf package; copies the
left4me-script-sandbox helper into /usr/local/libexec/left4me with mode
0755. Drops the global_overlay_cache directory provisioning, the
refresh-global-overlays unit installation, and the timer enable.
Deletes the orphaned left4me-refresh-global-overlays.{service,timer}
files. Trims the matching paragraph from deploy/README.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Privileged bash helper that wraps user-authored scripts in
systemd-run --scope (cgroup limits + RuntimeMaxSec=3600) inside a
bubblewrap sandbox dropped to the l4d2-sandbox uid. Network is shared
with the host so scripts can fetch from Steam / l4d2center / etc.;
filesystem is RO except for /overlay (rw bind from
/var/lib/left4me/overlays/{id}) and tmpfs /tmp + /run.
Adds a sudoers rule allowing the left4me user to invoke this helper
without restrictions on its arguments. Strict argument validation is
in the helper itself.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop MountFlags=shared (the assumption that it propagated fuse mounts
to host was incorrect on systemd 257 with ProtectSystem+ReadWritePaths).
Restore PrivateTmp=true (was dropped in 593611e for fuse propagation
that did not work). Rewrite the comment block to describe the new
model: mounts go through the left4me-overlay helper which nsenters
into PID 1's mount namespace, so the unit's mount-ns layout is no
longer load-bearing.
Update the three user-facing READMEs (root, l4d2host, deploy) to drop
fuse-overlayfs / fusermount3 prereqs and call out the kernel overlayfs
mount path through the privileged helper.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop fuse-overlayfs / fuse3 from the apt/dnf install line — the new
mount path is kernel overlayfs via the left4me-overlay helper, no
fuse userspace needed.
Add a one-shot migration block gated by /var/lib/left4me/.kernel-overlay-migrated
that runs before daemon-reload: stop gameservers + web service, force-
unmount any leftover fuse or overlay mounts under runtime/, then wipe
and recreate empty upper/ and work/ for every instance. fuse-overlayfs
running as a non-root user used user.fuseoverlayfs.* xattrs that kernel
overlayfs ignores, so a pre-existing upper/ from the fuse era would
resurrect "deleted" files.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New privileged helper at /usr/local/libexec/left4me/left4me-overlay
(Python, system /usr/bin/python3, stdlib only) takes only the instance
name, parses instance.env for L4D2_LOWERDIRS, validates each lowerdir
against an allowlist (installation/, overlays/, global_overlay_cache/,
workshop_cache/), refuses upperdirs tainted with user.fuseoverlayfs.*
xattrs from the prior fuse era, and execs `nsenter --mount=/proc/1/ns/mnt
-- mount -t overlay ...` so the resulting mount lives in the host
namespace. Mirrors the existing left4me-systemctl / left4me-journalctl
pattern; sudoers entry is verb-constrained.
KernelOverlayFSMounter implements the existing OverlayMounter ABC,
deriving the instance name from the merged path. No call sites use it
yet — that's the next commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each SSE log-viewer or job-log stream holds a thread for its full
lifetime. With --threads 8, a handful of open browser tabs could
exhaust the pool. 32 keeps the same single-process scheduler invariant
(_claim_lock in job_worker is process-local) while giving SSE plenty
of headroom on the test box's user count.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds two managed system overlays (l4d2center-maps, cedapug-maps) that
fetch curated map archives from upstream sources and reconcile addons
symlinks for non-Steam maps. A daily systemd timer enqueues a coalesced
refresh_global_overlays worker job; downloads, extraction, and rebuilds
run in the existing job worker and surface in the job log UI.
Schema: GlobalOverlaySource / GlobalOverlayItem / GlobalOverlayItemFile
plus nullable Job.user_id so system jobs render as "system" in the UI.
The new builder reconciles symlinks against the per-source vpk cache
and leaves foreign symlinks untouched. Initialize-time guard refuses
to mount a partial overlay if any expected vpk is missing from cache.
Refresh service uses shutil.move to handle EXDEV when /tmp and the
cache live on different filesystems.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Exclude local agent state from deploy archives, avoid recursive ownership over active runtime mounts, and let Alembic own schema upgrades before app startup.