NamedTemporaryFile creates the script file at mode 0600 owned by the
left4me web user. The sandbox runs as l4d2-sandbox and bwrap bind-mounts
the file read-only at /script.sh, but the kernel still enforces the
underlying file's permissions — l4d2-sandbox can't read 0600 left4me
files, so /bin/bash /script.sh fails with "Permission denied".
Script content is not a secret (it's stored in the DB and editable by
the user), so 0644 is appropriate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Migration 0005_script_overlays drops the legacy l4d2center_maps /
cedapug_maps overlay rows but leaves their /var/lib/left4me/overlays/{id}
directories on disk. When the web app subsequently creates a new overlay
and AUTOINCREMENT issues an id matching one of those orphans,
create_overlay_directory(exist_ok=False) crashes with FileExistsError —
which surfaced as a 500 on POST /overlays the first time a script
overlay was created on a deployed test box.
Adds a sentinel-gated sweep in deploy-test-server.sh that lists overlay
ids in the DB, removes any directory under overlays/ whose id has no
matching row, and drops the now-unused global_overlay_cache. Mirrors the
.kernel-overlay-migrated sentinel pattern so reruns are no-ops.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Smoke testing on the test host revealed three issues with the helper as
shipped:
1. bwrap 0.11+ rejects --uid without --unshare-user. Switching the UID
drop from inside bwrap to systemd-run (--uid=l4d2-sandbox
--gid=l4d2-sandbox) sidesteps the userns UID-mapping headaches and
keeps file ownership on the bind-mounted /overlay matching
l4d2-sandbox on the host (which the wipe path relies on).
2. bwrap running as an unprivileged uid still needs a user namespace to
set up its mount-namespace bind-mounts. Adding --unshare-user-try
gives it the userns context when needed and is a no-op otherwise.
3. /etc/alternatives wasn't bind-mounted, so symlinked tools like
/usr/bin/awk -> /etc/alternatives/awk fell over inside the sandbox.
Adds the ro-bind.
Also: the helper now chowns the overlay dir to l4d2-sandbox before bwrap
(idempotent — needed because the web app creates the dir as left4me),
and the deploy script chmods /var/lib/left4me to 0711 so l4d2-sandbox
can traverse to the bind-mount source.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
deploy/README.md still described the deleted managed-global overlays as
the second overlay surface. Replace with a description of script
overlays (bubblewrap + systemd-run sandbox, resource caps).
Full test sweep: 367 passing, 2 skipped across l4d2web, l4d2host, deploy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
deploy-test-server.sh: provisions the l4d2-sandbox system user (no home,
nologin shell) and installs the bubblewrap apt/dnf package; copies the
left4me-script-sandbox helper into /usr/local/libexec/left4me with mode
0755. Drops the global_overlay_cache directory provisioning, the
refresh-global-overlays unit installation, and the timer enable.
Deletes the orphaned left4me-refresh-global-overlays.{service,timer}
files. Trims the matching paragraph from deploy/README.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Privileged bash helper that wraps user-authored scripts in
systemd-run --scope (cgroup limits + RuntimeMaxSec=3600) inside a
bubblewrap sandbox dropped to the l4d2-sandbox uid. Network is shared
with the host so scripts can fetch from Steam / l4d2center / etc.;
filesystem is RO except for /overlay (rw bind from
/var/lib/left4me/overlays/{id}) and tmpfs /tmp + /run.
Adds a sudoers rule allowing the left4me user to invoke this helper
without restrictions on its arguments. Strict argument validation is
in the helper itself.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the script type to the create-overlay modal (with an admin-only
system-wide checkbox) and a script-section to the detail page: textarea
for the bash body, Save / Rebuild / Wipe buttons, last_build_status
badge, latest-build-job link, and a Wipe confirm modal. Removes the
GlobalOverlaySource block.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds POST /overlays/{id}/script, /wipe, /build under the overlay blueprint.
Generalizes /build to handle any owner/admin-editable overlay (deletes the
duplicate workshop-specific manual_build). Wipe runs the literal script
"find /overlay -mindepth 1 -delete" through run_sandboxed_script and
refuses with 409 while a build_overlay job is running. Adds an
admin-only system_wide=1 flag to POST /overlays for system-wide creation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GLOBAL_OPERATIONS becomes {"install", "refresh_workshop_items"}.
Removes refresh_global_overlays_running from SchedulerState and the
_run_refresh_global_overlays dispatch. Drops dead test cases and pins
GLOBAL_OPERATIONS contents.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Deletes the global_map_sources, global_overlay_refresh, global_map_cache,
and global_overlays service modules and their tests. Removes the
refresh-global-overlays CLI command, the /admin/global-overlays/refresh
route, and the GlobalOverlaySource view in overlay_detail rendering.
Drops py7zr from dependencies — was only used by the deleted subsystem.
The job_worker scheduler still tracks refresh_global_overlays; that
cleanup is Task 4. Deploy/README references are Task 8.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds ScriptBuilder that runs user-authored bash inside the
left4me-script-sandbox helper via run_command, with a 20 GB post-build
disk cap. Registry now {"workshop", "script"}.
finish_job writes Overlay.last_build_status on build_overlay completion.
Drops GlobalMapOverlayBuilder and the now-unreachable
_check_global_overlay_caches in l4d2_facade.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop MountFlags=shared (the assumption that it propagated fuse mounts
to host was incorrect on systemd 257 with ProtectSystem+ReadWritePaths).
Restore PrivateTmp=true (was dropped in 593611e for fuse propagation
that did not work). Rewrite the comment block to describe the new
model: mounts go through the left4me-overlay helper which nsenters
into PID 1's mount namespace, so the unit's mount-ns layout is no
longer load-bearing.
Update the three user-facing READMEs (root, l4d2host, deploy) to drop
fuse-overlayfs / fusermount3 prereqs and call out the kernel overlayfs
mount path through the privileged helper.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop fuse-overlayfs / fuse3 from the apt/dnf install line — the new
mount path is kernel overlayfs via the left4me-overlay helper, no
fuse userspace needed.
Add a one-shot migration block gated by /var/lib/left4me/.kernel-overlay-migrated
that runs before daemon-reload: stop gameservers + web service, force-
unmount any leftover fuse or overlay mounts under runtime/, then wipe
and recreate empty upper/ and work/ for every instance. fuse-overlayfs
running as a non-root user used user.fuseoverlayfs.* xattrs that kernel
overlayfs ignores, so a pre-existing upper/ from the fuse era would
resurrect "deleted" files.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace direct fuse-overlayfs / fusermount3 subprocess calls in
start_instance / stop_instance / delete_instance with the existing
OverlayMounter abstraction, now backed by KernelOverlayFSMounter.
Adds an os.path.ismount guard at the top of start_instance so a
kernel-level overlay that survived a web-worker crash isn't double-
mounted (kernel mounts persist when the cgroup dies, unlike fuse
daemons).
Delete the unused FuseOverlayFSMounter module.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New privileged helper at /usr/local/libexec/left4me/left4me-overlay
(Python, system /usr/bin/python3, stdlib only) takes only the instance
name, parses instance.env for L4D2_LOWERDIRS, validates each lowerdir
against an allowlist (installation/, overlays/, global_overlay_cache/,
workshop_cache/), refuses upperdirs tainted with user.fuseoverlayfs.*
xattrs from the prior fuse era, and execs `nsenter --mount=/proc/1/ns/mnt
-- mount -t overlay ...` so the resulting mount lives in the host
namespace. Mirrors the existing left4me-systemctl / left4me-journalctl
pattern; sudoers entry is verb-constrained.
KernelOverlayFSMounter implements the existing OverlayMounter ABC,
deriving the instance name from the merged path. No call sites use it
yet — that's the next commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures the architectural fix for the mount-propagation bug: replace
fuse-overlayfs (rootless mount inside the web service's namespace, never
visible to host or to gameserver units) with kernel-native overlayfs
mounted via a privileged sudo helper that nsenters into PID 1's mount
namespace. Companion plan numbers the migration as five tasks ending in
end-to-end verification on the test box.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
systemctl stop is already a no-op on a stopped unit, but stop_instance
was unconditionally running fusermount3 -u and bubbling up the EINVAL
when the overlay wasn't currently mounted (e.g. server already stopped).
Mirror the established delete_instance pattern: always attempt the
unmount, swallow CalledProcessError, and label the step "(if mounted)".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each SSE log-viewer or job-log stream holds a thread for its full
lifetime. With --threads 8, a handful of open browser tabs could
exhaust the pool. 32 keeps the same single-process scheduler invariant
(_claim_lock in job_worker is process-local) while giving SSE plenty
of headroom on the test box's user count.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
stream_command used a blocking proc.stdout.readline() that never woke
when the underlying journalctl was silent, so Flask never delivered
GeneratorExit on client disconnect — the worker thread and the journalctl
child both leaked permanently and pinned the gunicorn thread pool.
Switch to a select-based read loop with a 15s heartbeat tick (yielded as
""), and translate the tick to an SSE keepalive comment in the log route.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The workshop + managed-global overlay surface fully covers the
admin-SFTP flow that 'external' was a placeholder for. Drop the type
from the model defaults, builder registry, routes, template, and
tests, and add migration 0004 that deletes any leftover external
rows along with their blueprint and job references.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds two managed system overlays (l4d2center-maps, cedapug-maps) that
fetch curated map archives from upstream sources and reconcile addons
symlinks for non-Steam maps. A daily systemd timer enqueues a coalesced
refresh_global_overlays worker job; downloads, extraction, and rebuilds
run in the existing job worker and surface in the job log UI.
Schema: GlobalOverlaySource / GlobalOverlayItem / GlobalOverlayItemFile
plus nullable Job.user_id so system jobs render as "system" in the UI.
The new builder reconciles symlinks against the per-source vpk cache
and leaves foreign symlinks untouched. Initialize-time guard refuses
to mount a partial overlay if any expected vpk is missing from cache.
Refresh service uses shutil.move to handle EXDEV when /tmp and the
cache live on different filesystems.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Limit workshop refresh downloads to one worker and commit build-overlay enqueue work before writing the final job log so SQLite locks do not wedge the web process.
Exclude local agent state from deploy archives, avoid recursive ownership over active runtime mounts, and let Alembic own schema upgrades before app startup.
Adds /var/lib/left4me/workshop_cache to the deploy mkdir list (owned by
the left4me runtime user). Updates deploy/README.md to document the new
directory and the workshop overlay layout: web app downloads VPKs into
the cache and symlinks them into overlays/{overlay_id}/left4dead2/addons/.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before invoking l4d2ctl initialize, run each blueprint overlay's builder
synchronously and then verify that every workshop item attached to the
blueprint has a cache file on disk. If any are missing, raise a clear
error naming the overlay and the missing steam_ids — server start can't
silently mount a partial overlay where some maps are mysteriously absent
in-game.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds workshop_routes blueprint with add-items / remove-item / manual-
build endpoints plus admin /admin/workshop/refresh. Add-items handles
single ID, single URL, multi-line batch, or a collection ID; auto-
enqueues a coalesced build_overlay job per call. Reject non-L4D2 items
with 400, duplicate associations with friendly toast, intruders with
403.
Generalizes overlay_routes: type+name only on create (no path field);
external is admin-only and system-wide, workshop is per-user and
auto-pathed. Update is name-only. Delete recursively removes the
on-disk dir only for managed paths (path == str(id)); legacy externals
are left in place. The pre-existing in-use guard is preserved.
Page routes filter the overlay listing by user permissions and load
workshop items + the latest related job for the detail view.
Templates: unified Create modal with type radio (no path field).
Type-aware overlay detail: workshop overlays show a multi-line input
+ items/collection radio + item table partial with thumbnails, manual
Rebuild button, and a small status indicator pulled from the latest
related job. Admin page gets a "Refresh all workshop items" button.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends SchedulerState with running_overlays / refresh_running /
blocked_servers_by_overlay, and updates can_start with the truth table:
install and refresh_workshop_items are global mutexes; build_overlay
serializes per-overlay; server jobs block on builds for any overlay
their blueprint references.
Adds enqueue_build_overlay coalescing helper that returns an existing
queued job for the same overlay rather than inserting a duplicate.
Adds run_job dispatch for build_overlay (BUILDERS[overlay.type].build)
and refresh_workshop_items (re-fetches metadata, re-downloads on
time_updated/filename change, enqueues coalesced rebuilds for affected
overlays).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds l4d2web/services/overlay_builders.py with a BUILDERS dict mapping
Overlay.type to a builder class. ExternalBuilder is a no-op that just
ensures the overlay directory exists. WorkshopBuilder diff-applies
absolute symlinks under left4dead2/addons/ against the overlay's current
WorkshopItem associations: creates new ones, removes obsolete, leaves
unrelated files alone, and skips uncached items with a warning rather
than producing dangling symlinks.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds workshop_paths.cache_path(steam_id) returning
$LEFT4ME_ROOT/workshop_cache/{steam_id}.vpk with digit-only validation.
Adds overlay_creation.generate_overlay_path(id) and
create_overlay_directory(overlay) with exist_ok=False so a stray dir from a
prior failed delete surfaces loudly instead of shadowing fresh content.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds l4d2web/services/steam_workshop.py: parse_workshop_input (single ID,
URL, or multi-line batch), resolve_collection (HTTPS POST to
GetCollectionDetails), fetch_metadata_batch (HTTPS POST to
GetPublishedFileDetails with consumer_app_id == 550 enforcement that
raises WorkshopValidationError in add-mode and silently skips in
refresh-mode), download_to_cache (atomic + idempotent on mtime+size),
and refresh_all (ThreadPoolExecutor with per-item error collection).
Adds requests as an explicit dependency.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds Overlay.type and Overlay.user_id with two partial unique indexes
(externals globally unique by name; user overlays unique per user).
Adds WorkshopItem registry keyed on steam_id and a pure many-to-many
overlay_workshop_items association. Adds Job.overlay_id for build_overlay
job tracking. Switches overlays.id to AUTOINCREMENT so deleted IDs are
never reused.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a typed-overlay model with workshop as the first non-external type:
deduplicated WorkshopItem registry, symlink-based overlay directories,
auto-rebuild after item changes, admin global refresh, and a unified
Create-overlay UI with web-managed paths.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
L4D2 dedicated server expects to dlopen steamclient.so via
~/.steam/sdk32 (and sdk64). Without those symlinks, srcds_run logs
'cannot open shared object file' and SteamAPI_Init fails, which means
the server is invisible to the public Steam master server, Workshop
addon downloads break, and Steam 'Join Game' / lobby joins do not
reach the server (only direct-IP connect works).
SteamInstaller.install_or_update now ensures the symlinks exist after
SteamCMD finishes. Targets prefer SteamCMD's linux32/linux64 sibling
dirs; falls back to <install_dir>/bin/ if the siblings cannot be
located. Idempotent: re-running the install repairs or leaves the
symlinks alone.
Path.home() respects HOME, which the systemd web unit sets to
/var/lib/left4me, so the symlinks land in the left4me user's home.
Existing deploys can apply the fix by re-running 'Install' from /admin
without a full redeploy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ProtectSystem=full + ReadWritePaths implicitly give the unit a private
mount namespace (systemd needs to remount /usr read-only). The default
namespace propagation is slave, so mounts the worker creates inside
never reach the host. The gameserver units (started via systemctl,
each with their own namespace) then inherit a host that lacks the
overlay, and their CHDIR into /var/lib/left4me/runtime/<name>/merged
fails.
Set MountFlags=shared so mount events propagate from the worker's
namespace back to the host, then onward to gameserver units at their
unshare time.
Verified on test box: nsenter -t <gunicorn-pid> -m mount showed the
fuse-overlayfs mount inside the worker but plain mount on the host
did not, while web unit had ProtectSystem=full + ReadWritePaths.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PrivateTmp=true gives the unit a private mount namespace. The worker's
fuse-overlayfs mount lives only inside that namespace, so the host
cannot see it and the gameserver unit (started via systemctl, with its
own namespace inherited from the host) also cannot see it. The
gameserver unit then fails CHDIR on
/var/lib/left4me/runtime/<name>/merged/left4dead2.
The mount must land in the host namespace so the gameserver unit
inherits it at unshare time. Remaining hardening: dedicated user,
ProtectSystem=full, ReadWritePaths, sudoers allowlist limited to two
helper scripts.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The job worker calls fusermount3 (setuid-root) to mount per-instance
FUSE overlays and sudo to invoke the privileged systemctl wrapper.
NoNewPrivileges=true blocks both, surfacing as
"fusermount3: mount failed: Operation not permitted" the first time a
server is started. Hardening is still enforced via dedicated user,
PrivateTmp, ProtectSystem=full, ReadWritePaths, and the narrow sudoers
allowlist limited to two helper scripts.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Per-row "Create server" link on /blueprints navigates to
/servers?blueprint_id=<id>; that page validates the param against
the user's owned blueprints, pre-selects the option, and auto-opens
the create modal.
- /servers empty-blueprint state now shows an actionable
"Create a blueprint first ->" link (styled like the primary button)
pointing at /blueprints, replacing the silent disabled "+ Create"
button + muted hint.
- Drop the "Reassign blueprint" form on the server detail page
along with the unused POST /servers/<id> form route. The JSON
PATCH /servers/<id> endpoint is retained.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Native <dialog> modal infra (CSS + ~30 LOC JS, no framework) used for
create forms and delete confirmations.
- Index pages become listing-only: + Create button opens a modal; the
broken blueprint Actions column and inline overlay edit cells are gone.
- Server detail gains a blueprint reassignment form; existing Delete
button now opens a confirmation modal before tearing down the runtime.
- Blueprint detail gains a Delete button + confirmation modal (was
unreachable from the UI before).
- New overlay detail page at /overlays/<id> with edit form, "Used by"
blueprints list, and delete (admin only).
- Server create: port field is now optional; backend auto-assigns the
next free port from LEFT4ME_PORT_RANGE_START/_END (default
27015-27115). 409 on range exhaustion.
- New routes: POST /blueprints/<id>/delete (form sentinel matching
overlays pattern), POST /servers/<id> (form-friendly blueprint
reassign), GET /overlays/<id>.
- Server delete operation now redirects to /servers; overlay update
redirects to /overlays/<id>.
Server rename remains unsupported pending an id-vs-name design pass for
l4d2host (the runtime directory is name-keyed; renaming would orphan
files).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When tar runs on macOS it embeds ._* resource-fork sidecars next to each
file. These ended up under l4d2web/alembic/versions/ on the target and
alembic tried to import them as migration modules, failing with
"source code string cannot contain null bytes". Set COPYFILE_DISABLE=1
and add an --exclude '._*' so the archive is portable.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The HTTP-only test deployment binds gunicorn to 0.0.0.0:8000 with no TLS
terminator, so a hardcoded SESSION_COOKIE_SECURE=True breaks browser
login. Make it opt-out via env (default True outside TESTING) and set
SESSION_COOKIE_SECURE=false in the generated web.env so the test box
keeps working over HTTP.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- validate instance names at the host lib and web boundary against
[a-z0-9][a-z0-9_-]{0,63} to prevent path traversal via Server.name
- fail-closed on SECRET_KEY: load_config returns None when env unset,
create_app raises if missing or "dev" outside TESTING
- close login timing oracle by hashing a dummy digest when the user
is not found, equalizing response time
- set SESSION_COOKIE_SECURE outside TESTING
- delete_instance tolerates stop_service and fusermount3 failures so
partially-initialized instances clean up without contract breaks;
drops the is_mount() preflight that violated AGENTS.md
- document claim_next_job's single-process assumption
- clarify emit_step contract via docstring
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>