Commit graph

6 commits

Author SHA1 Message Date
mwiegand
a982995d5b
fix(deploy): ExecStartPre runs overlay helper with + prefix, not sudo
The unit has NoNewPrivileges=true (security hardening for srcds), which
blocks sudo's setuid escalation. The previous sudo'd ExecStartPre failed
on every start with "sudo: the 'no new privileges' switch is set, which
prevents sudo from running as root" -> Restart=on-failure loop.

systemd's `+` prefix runs the Exec command as PID 1 (root, no sandbox),
bypassing User=/Group=/NoNewPrivileges=. Equivalent privilege scope to
the sudoers rule the web app already uses for the same helper, just
without the sudo middleman.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:55:16 +02:00
mwiegand
56f5c30296
refactor(l4d2-host): unit's ExecStartPre is the sole code path to the mount
Before this change there were two callers of left4me-overlay mount:
the web app's start_instance (Python, in-process) and the unit's
ExecStartPre (shell, via sudo). The duplication invited divergence; the
helper's recently-added idempotency made both paths technically work
but at the cost of a "first wins" race and dead-code retry logic in
start_instance.

Drop the in-process _mounter.mount() call from start_instance. The web
app now only stages cfg files (which still must happen on the host
filesystem before mount, to avoid overlayfs copy-up changing ownership),
then asks systemd to enable+start the unit; the unit's ExecStartPre
does the mount.

Removed:
- os.path.ismount(merged) refusal in start_instance and its test
  (test_start_refuses_to_double_mount). The race the check guarded
  against is now handled by the helper's idempotency.
- _load_instance_env helper and the `os` import (both became dead).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:54:05 +02:00
mwiegand
3d9b7ef771
fix(deploy): WorkingDirectory= prefix - so ExecStartPre can mount the overlay
systemd applies WorkingDirectory= to every Exec line including ExecStartPre.
With the merged dir not yet existing at boot time (the volatile overlay
mount has been wiped), the chdir into runtime/%i/merged/left4dead2 fails
with status=200/CHDIR before ExecStartPre can run the mount helper.

The `-` prefix makes chdir failure non-fatal: ExecStartPre runs in the
unit's home (cwd doesn't matter for the mount helper); ExecStart re-applies
WorkingDirectory once the mount has landed and chdirs successfully.

Companion to commit 519567e (which added the ExecStartPre mount + helper
idempotency but didn't account for the WorkingDirectory ordering).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:51:58 +02:00
mwiegand
519567e156
fix(l4d2-host): mount overlay via ExecStartPre so enabled units boot cleanly
The lifecycle change to systemctl enable --now (commit 8552c55) made
units auto-start at boot. But the kernel-overlayfs mount is volatile
(reboot kills it), and the web app's start_instance only re-mounts in
response to a UI click. Result: at boot, systemd starts the unit, finds
empty merged/, CHDIR fails, Restart=on-failure spins forever (counter
hit 65 on ckn before this fix landed).

Fix:
- Unit gets `ExecStartPre=/usr/bin/sudo -n .../left4me-overlay mount %i`
  so the overlay is established before the main process starts.
- Helper is now idempotent: if merged is already a mount point, exit 0.
  Required because Restart=on-failure re-runs ExecStartPre on each
  cycle, and the web-app's start_instance also calls the helper, so
  both paths would otherwise collide on "already mounted".
- StartLimitBurst=5 + StartLimitIntervalSec=60s caps the restart loop
  instead of letting it spin indefinitely on a fundamental failure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:47:20 +02:00
mwiegand
7193163488
feat(deploy): perf-baseline directives on left4me-server@.service
Slice=l4d2-game.slice, Nice=-5, IOSchedulingClass=best-effort,
OOMScoreAdjust=-200, MemoryHigh=1.5G, MemoryMax=2G, TasksMax=256,
LimitNOFILE=65536, KillSignal=SIGINT, TimeoutStopSec=15s,
LogRateLimitIntervalSec=0. Spec:
docs/superpowers/specs/2026-05-09-l4d2-server-host-perf-baseline-design.md
2026-05-09 09:44:12 +02:00
mwiegand
bbfc528354
feat(deploy): add production-like test deployment 2026-05-06 19:30:10 +02:00