Commit graph

21 commits

Author SHA1 Message Date
mwiegand
1e62a44c16
docs(deploy): replace globals overlay description with script overlays
deploy/README.md still described the deleted managed-global overlays as
the second overlay surface. Replace with a description of script
overlays (bubblewrap + systemd-run sandbox, resource caps).

Full test sweep: 367 passing, 2 skipped across l4d2web, l4d2host, deploy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 15:56:24 +02:00
mwiegand
e51a4d58a4
chore(deploy): provision l4d2-sandbox + bubblewrap; drop globals refresh timer
deploy-test-server.sh: provisions the l4d2-sandbox system user (no home,
nologin shell) and installs the bubblewrap apt/dnf package; copies the
left4me-script-sandbox helper into /usr/local/libexec/left4me with mode
0755. Drops the global_overlay_cache directory provisioning, the
refresh-global-overlays unit installation, and the timer enable.

Deletes the orphaned left4me-refresh-global-overlays.{service,timer}
files. Trims the matching paragraph from deploy/README.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 15:54:57 +02:00
mwiegand
75e703e1a4
feat(deploy): left4me-script-sandbox helper + sudoers fragment
Privileged bash helper that wraps user-authored scripts in
systemd-run --scope (cgroup limits + RuntimeMaxSec=3600) inside a
bubblewrap sandbox dropped to the l4d2-sandbox uid. Network is shared
with the host so scripts can fetch from Steam / l4d2center / etc.;
filesystem is RO except for /overlay (rw bind from
/var/lib/left4me/overlays/{id}) and tmpfs /tmp + /run.

Adds a sudoers rule allowing the left4me user to invoke this helper
without restrictions on its arguments. Strict argument validation is
in the helper itself.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 15:53:21 +02:00
mwiegand
9985ecc56c
chore(deploy): cleanup left4me-web hardening + docs for kernel overlayfs
Drop MountFlags=shared (the assumption that it propagated fuse mounts
to host was incorrect on systemd 257 with ProtectSystem+ReadWritePaths).
Restore PrivateTmp=true (was dropped in 593611e for fuse propagation
that did not work). Rewrite the comment block to describe the new
model: mounts go through the left4me-overlay helper which nsenters
into PID 1's mount namespace, so the unit's mount-ns layout is no
longer load-bearing.

Update the three user-facing READMEs (root, l4d2host, deploy) to drop
fuse-overlayfs / fusermount3 prereqs and call out the kernel overlayfs
mount path through the privileged helper.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 12:29:49 +02:00
mwiegand
172e574a00
chore(deploy): drop fuse-overlayfs apt dep + one-shot migrate upper/work
Drop fuse-overlayfs / fuse3 from the apt/dnf install line — the new
mount path is kernel overlayfs via the left4me-overlay helper, no
fuse userspace needed.

Add a one-shot migration block gated by /var/lib/left4me/.kernel-overlay-migrated
that runs before daemon-reload: stop gameservers + web service, force-
unmount any leftover fuse or overlay mounts under runtime/, then wipe
and recreate empty upper/ and work/ for every instance. fuse-overlayfs
running as a non-root user used user.fuseoverlayfs.* xattrs that kernel
overlayfs ignores, so a pre-existing upper/ from the fuse era would
resurrect "deleted" files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 12:28:00 +02:00
mwiegand
d5b321b557
feat(l4d2-host): KernelOverlayFSMounter + left4me-overlay helper
New privileged helper at /usr/local/libexec/left4me/left4me-overlay
(Python, system /usr/bin/python3, stdlib only) takes only the instance
name, parses instance.env for L4D2_LOWERDIRS, validates each lowerdir
against an allowlist (installation/, overlays/, global_overlay_cache/,
workshop_cache/), refuses upperdirs tainted with user.fuseoverlayfs.*
xattrs from the prior fuse era, and execs `nsenter --mount=/proc/1/ns/mnt
-- mount -t overlay ...` so the resulting mount lives in the host
namespace. Mirrors the existing left4me-systemctl / left4me-journalctl
pattern; sudoers entry is verb-constrained.

KernelOverlayFSMounter implements the existing OverlayMounter ABC,
deriving the instance name from the merged path. No call sites use it
yet — that's the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 12:23:58 +02:00
mwiegand
38548ab0d7
chore(deploy): raise gunicorn thread pool to 32 for SSE headroom
Each SSE log-viewer or job-log stream holds a thread for its full
lifetime. With --threads 8, a handful of open browser tabs could
exhaust the pool. 32 keeps the same single-process scheduler invariant
(_claim_lock in job_worker is process-local) while giving SSE plenty
of headroom on the test box's user count.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 11:19:03 +02:00
mwiegand
ffc4cdbd7d
refactor(l4d2-web): remove legacy external overlay type
The workshop + managed-global overlay surface fully covers the
admin-SFTP flow that 'external' was a placeholder for. Drop the type
from the model defaults, builder registry, routes, template, and
tests, and add migration 0004 that deletes any leftover external
rows along with their blueprint and job references.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 09:31:04 +02:00
mwiegand
92d6ebbe82
feat(l4d2-web): managed global map overlays with daily refresh
Adds two managed system overlays (l4d2center-maps, cedapug-maps) that
fetch curated map archives from upstream sources and reconcile addons
symlinks for non-Steam maps. A daily systemd timer enqueues a coalesced
refresh_global_overlays worker job; downloads, extraction, and rebuilds
run in the existing job worker and surface in the job log UI.

Schema: GlobalOverlaySource / GlobalOverlayItem / GlobalOverlayItemFile
plus nullable Job.user_id so system jobs render as "system" in the UI.
The new builder reconciles symlinks against the per-source vpk cache
and leaves foreign symlinks untouched. Initialize-time guard refuses
to mount a partial overlay if any expected vpk is missing from cache.

Refresh service uses shutil.move to handle EXDEV when /tmp and the
cache live on different filesystems.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 08:05:14 +02:00
mwiegand
0e83ee07d7
fix(deploy): make test deployments safe to rerun
Exclude local agent state from deploy archives, avoid recursive ownership over active runtime mounts, and let Alembic own schema upgrades before app startup.
2026-05-07 17:16:58 +02:00
mwiegand
b2a8d3d5e0
feat(deploy): workshop_cache provisioning
Adds /var/lib/left4me/workshop_cache to the deploy mkdir list (owned by
the left4me runtime user). Updates deploy/README.md to document the new
directory and the workshop overlay layout: web app downloads VPKs into
the cache and symlinks them into overlays/{overlay_id}/left4dead2/addons/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 16:53:49 +02:00
mwiegand
1968684c03
fix(deploy): MountFlags=shared on web service for fuse mount propagation
ProtectSystem=full + ReadWritePaths implicitly give the unit a private
mount namespace (systemd needs to remount /usr read-only). The default
namespace propagation is slave, so mounts the worker creates inside
never reach the host. The gameserver units (started via systemctl,
each with their own namespace) then inherit a host that lacks the
overlay, and their CHDIR into /var/lib/left4me/runtime/<name>/merged
fails.

Set MountFlags=shared so mount events propagate from the worker's
namespace back to the host, then onward to gameserver units at their
unshare time.

Verified on test box: nsenter -t <gunicorn-pid> -m mount showed the
fuse-overlayfs mount inside the worker but plain mount on the host
did not, while web unit had ProtectSystem=full + ReadWritePaths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 02:01:24 +02:00
mwiegand
593611e194
fix(deploy): drop PrivateTmp on web service so fuse mounts propagate
PrivateTmp=true gives the unit a private mount namespace. The worker's
fuse-overlayfs mount lives only inside that namespace, so the host
cannot see it and the gameserver unit (started via systemctl, with its
own namespace inherited from the host) also cannot see it. The
gameserver unit then fails CHDIR on
/var/lib/left4me/runtime/<name>/merged/left4dead2.

The mount must land in the host namespace so the gameserver unit
inherits it at unshare time. Remaining hardening: dedicated user,
ProtectSystem=full, ReadWritePaths, sudoers allowlist limited to two
helper scripts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 01:57:43 +02:00
mwiegand
56b9523d88
fix(deploy): drop NoNewPrivileges on web service so FUSE mounts work
The job worker calls fusermount3 (setuid-root) to mount per-instance
FUSE overlays and sudo to invoke the privileged systemctl wrapper.
NoNewPrivileges=true blocks both, surfacing as
"fusermount3: mount failed: Operation not permitted" the first time a
server is started. Hardening is still enforced via dedicated user,
PrivateTmp, ProtectSystem=full, ReadWritePaths, and the narrow sudoers
allowlist limited to two helper scripts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 01:51:39 +02:00
mwiegand
7d9939c71d
fix(deploy): exclude macOS AppleDouble files from deploy archive
When tar runs on macOS it embeds ._* resource-fork sidecars next to each
file. These ended up under l4d2web/alembic/versions/ on the target and
alembic tried to import them as migration modules, failing with
"source code string cannot contain null bytes". Set COPYFILE_DISABLE=1
and add an --exclude '._*' so the archive is portable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 00:58:29 +02:00
mwiegand
0210ecd301
config: allow SESSION_COOKIE_SECURE override and disable on test deploy
The HTTP-only test deployment binds gunicorn to 0.0.0.0:8000 with no TLS
terminator, so a hardcoded SESSION_COOKIE_SECURE=True breaks browser
login. Make it opt-out via env (default True outside TESTING) and set
SESSION_COOKIE_SECURE=false in the generated web.env so the test box
keeps working over HTTP.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 00:56:48 +02:00
mwiegand
3809f85795
fix: load environment variables for alembic upgrade in deploy script to ensure database url is set properly 2026-05-06 21:01:35 +02:00
mwiegand
441c1db79b
fix: change directory before running alembic upgrade in deploy script to avoid pyproject.toml permission issues 2026-05-06 21:00:59 +02:00
mwiegand
fa566db820
fix: apply alembic migrations automatically on deployment 2026-05-06 21:00:25 +02:00
mwiegand
833ae318cf
fix(deploy): add venv to PATH in left4me-web systemd service 2026-05-06 20:45:37 +02:00
mwiegand
bbfc528354
feat(deploy): add production-like test deployment 2026-05-06 19:30:10 +02:00