left4me/deploy
mwiegand dd918aca4b
fix(left4me-overlay): use /proc/self/mountinfo to detect bind mounts
os.path.ismount() compares st_dev against the parent dir, which silently
returns False for same-fs bind mounts. The idmap binds at runtime/<n>/
idmap/<basename> are exactly that case, so:

- cmd_umount skipped the bind-umount step every stop, leaving orphan
  binds in PID 1's mount namespace.
- cmd_mount's idempotency check then "didn't see" the orphan and
  re-bound on top, accumulating one mount per start/stop cycle.

Findmnt nesting like
    /var/lib/left4me/runtime/2/idmap/overlays_9
    └─/var/lib/left4me/runtime/2/idmap/overlays_9
is the visible symptom. Reboot wipes everything so the bug is invisible
on a fresh boot — only stop/start cycles accumulate.

Replace both ismount sites with a _is_mountpoint() helper that reads
/proc/self/mountinfo (column 5 is the mount point). Keep os.path.ismount
for the overlay merged check, where it's reliable (distinct fs type).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 01:02:18 +02:00
files fix(left4me-overlay): use /proc/self/mountinfo to detect bind mounts 2026-05-15 01:02:18 +02:00
templates/etc/left4me deploy: add STEAM_WEB_API_KEY to web.env template 2026-05-12 22:25:03 +02:00
tests test(deploy): assert left4me-overlay idmaps sandbox-owned lowerdirs 2026-05-14 23:56:36 +02:00
deploy-test-server.sh deploy: claim /usr/local/sbin/left4me admin CLI in deploy/files 2026-05-15 00:41:06 +02:00
README.md deploy: claim /usr/local/sbin/left4me admin CLI in deploy/files 2026-05-15 00:41:06 +02:00

left4me Deployment

Production provisioning of left4me on ovh.left4me is driven by ckn-bw (bundles/left4me/, attached via groups/applications/left4me.py). Run bw apply ovh.left4me from the ckn-bw repo to deploy.

What's canonical in this directory (deploy/files/, deploy/templates/, deploy/tests/): the actual file payload ckn-bw deploys. ckn-bw fetches the left4me repo via git_deploy to /opt/left4me/src/ and installs the privileged scripts from deploy/files/usr/local/{libexec,sbin}/ directly onto the target. Sudoers, sysctl, and env-template content ships from deploy/files/etc/ and deploy/templates/etc/. Edit these files here; ckn-bw picks them up on the next apply. No duplicate copy of the file content lives in ckn-bw.

What's superseded: the deploy-test-server.sh script — an older one-shot bash deploy that ckn-bw replaced. It's kept as a readable description of the install steps the bundle now performs declaratively. Don't run it against an ovh.left4me node managed by ckn-bw; the two would fight over file ownership.

What's obsolete (kept for greppability, not currently used): CAKE traffic shaping (now in systemd-networkd via network/<iface>/cake metadata in ckn-bw), nft marking (now in the central nftables/output set), and the systemd unit files under files/usr/local/lib/systemd/system/ (emitted by the bundle's systemd_units reactor instead of being shipped as static files). The obsolete bits stay here intact so the original choices and tradeoffs remain greppable.

What lives here (and what corresponds to it in ckn-bw)

Path here Status under ckn-bw
deploy-test-server.sh replaced by bw apply
files/etc/sudoers.d/left4me shipped verbatim by bundles/left4me/files/etc/sudoers.d/left4me (validated with visudo -cf via test_with)
files/etc/sysctl.d/99-left4me.conf shipped verbatim by the bundle
files/etc/left4me/sandbox-resolv.conf shipped verbatim by the bundle
files/usr/local/libexec/left4me/{left4me-systemctl,journalctl,overlay,script-sandbox} installed onto the target by the install_left4me_scripts action in bundles/left4me/items.py, reading directly from /opt/left4me/src/deploy/files/usr/local/libexec/left4me/ after git_deploy. The bundle does not carry a duplicate copy.
files/usr/local/sbin/left4me same install action; admin CLI wrapper (sudo left4me <flask-subcommand>)
files/usr/local/lib/systemd/system/left4me-web.service emitted by systemd_units reactor in bundles/left4me/metadata.py (intentional change: --bind 0.0.0.0:8000 → 127.0.0.1:8000 because nginx now terminates TLS)
files/usr/local/lib/systemd/system/left4me-server@.service emitted by the same reactor
files/usr/local/lib/systemd/system/{l4d2-game,l4d2-build}.slice emitted by the same reactor
files/usr/local/lib/systemd/system/left4me-cake.service obsolete — CAKE applied via systemd-networkd (network/<iface>/cake metadata in bundles/network/)
files/usr/local/libexec/left4me/left4me-apply-cake obsolete — same as above
files/etc/left4me/cake.env obsolete — bandwidth lives in node metadata under network/external/cake/Bandwidth
files/usr/local/lib/systemd/system/left4me-nft-mark.service obsolete — central bundles/nftables/ consumes the rules from bundles/left4me/'s defaults
files/usr/local/lib/left4me/nft/left4me-mark.nft obsolete — same as above
templates/etc/left4me/host.env rendered as Mako by bundles/left4me/files/etc/left4me/host.env.mako
templates/etc/left4me/web.env.template rendered as Mako by bundles/left4me/files/etc/left4me/web.env.mako (intentional change: SESSION_COOKIE_SECURE=false → true, plus LEFT4ME_PORT_RANGE_* are now wired through)
First-run admin bootstrap (flask create-user … --admin near the end of deploy-test-server.sh) manual one-time step after bw apply; the bundle deliberately doesn't seed an admin to keep credentials out of the metadata pipeline
CPU isolation drop-ins (/etc/systemd/system/{system,user,l4d2-game,l4d2-build}.slice.d/99-left4me-cpuset.conf) not managed by the bundle — generated dynamically based on nproc --all in the script; that logic doesn't fit static bundle metadata, apply manually post-deploy if needed

Original notes (still accurate as a description of the install steps)

This directory contains the production-like test deployment for a Linux server. It installs the repository into a fixed host layout, configures a dedicated runtime user, installs systemd units, and wires the web app to host operations through privileged helper commands.

Target Layout

The deployment uses these paths:

  • /etc/left4me/host.env: host library environment configuration.
  • /etc/left4me/web.env: web app environment configuration.
  • /opt/left4me/.venv: Python virtual environment for deployed commands.
  • /opt/left4me: deployed repository contents.
  • /var/lib/left4me/left4me.db: SQLite database used by the web app.
  • /var/lib/left4me/installation: shared L4D2 installation.
  • /var/lib/left4me/overlays: overlay directories. Each overlay lives at ${overlay_id} under here.
  • /var/lib/left4me/workshop_cache: deduplicated cache of .vpk files downloaded for workshop overlays. One file per Steam item, named {steam_id}.vpk. Workshop overlays symlink into this tree.
  • /var/lib/left4me/global_overlay_cache: cache of non-Steam map archives and extracted .vpk files used by managed global map overlays.
  • /var/lib/left4me/instances: rendered instance specifications and per-instance state.
  • /var/lib/left4me/runtime: per-instance runtime mount directories.
  • /var/lib/left4me/tmp: temporary files used by deployment/runtime operations.
  • /usr/local/lib/systemd/system: global systemd unit files, including left4me-server@.service.
  • /usr/local/libexec/left4me: privileged helper commands, including left4me-systemctl, left4me-journalctl, and left4me-overlay (the latter mounts the per-instance kernel overlay in PID 1's mount namespace via nsenter).
  • /etc/sudoers.d/left4me: sudoers rules allowing the web/runtime commands to call the helpers non-interactively.

Static units are generated for /var/lib/left4me. If LEFT4ME_ROOT changes, regenerate and reinstall the unit files instead of reusing the existing static units.

Runtime User

The deployment creates and runs host operations as the dedicated runtime user:

  • Username: left4me
  • Home: /var/lib/left4me
  • Shell: /usr/sbin/nologin

Running A Test Deployment

Run the deployment from the repository root:

deploy/deploy-test-server.sh deploy-user@example-host

The SSH user must be able to run sudo on the target host. The deployment configures system packages, directories, environment files, helper scripts, sudoers rules, Python dependencies, and systemd units.

Admin Bootstrap

Set the bootstrap credentials in the environment when creating the first admin user:

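# Export first: with inline VAR=value prefixes, "$LEFT4ME_ADMIN_USERNAME" would be
# expanded by the shell before the assignment takes effect and come out empty.
export LEFT4ME_ADMIN_USERNAME=admin
export LEFT4ME_ADMIN_PASSWORD='change-me'
flask create-user "$LEFT4ME_ADMIN_USERNAME" --admin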

Use a strong one-time password and rotate it after first login if needed.

Overlay References

Overlay references are relative paths below ${LEFT4ME_ROOT}/overlays. With the default deployment root, they resolve under /var/lib/left4me/overlays. New overlays use ${overlay_id} as their path; the digit-only form is the only one created by the web app.

Invalid references are rejected:

  • Absolute paths such as /srv/overlay.
  • Parent traversal such as ../other or competitive/../../base.
  • Empty path components such as competitive//base.
  • Symlink escapes that resolve outside ${LEFT4ME_ROOT}/overlays.
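
The rejection rules above can be sketched as a small resolver. This is an illustration under assumptions, not the web app's actual validation code: the function name `resolve_overlay_ref` and the choice of `ValueError` are hypothetical.

```python
import os


def resolve_overlay_ref(root: str, ref: str) -> str:
    """Resolve `ref` below <root>/overlays or raise ValueError."""
    overlays = os.path.realpath(os.path.join(root, "overlays"))
    if os.path.isabs(ref):
        raise ValueError("absolute path")
    parts = ref.split("/")
    if "" in parts:
        # Covers the empty string as well as "a//b" and trailing slashes.
        raise ValueError("empty path component")
    if ".." in parts:
        raise ValueError("parent traversal")
    # realpath follows symlinks, so a link pointing outside the tree is caught here.
    resolved = os.path.realpath(os.path.join(overlays, ref))
    if os.path.commonpath([overlays, resolved]) != overlays:
        raise ValueError("symlink escape")
    return resolved
```

The lexical checks run before any filesystem access, so `competitive/../../base` is rejected even if the intermediate directories do not exist.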

The web app currently supports two overlay surfaces:

  • workshop overlays (user-owned) — populated by downloading .vpk files from the public Steam Web API into ${LEFT4ME_ROOT}/workshop_cache/{steam_id}.vpk and creating absolute symlinks under ${LEFT4ME_ROOT}/overlays/{overlay_id}/left4dead2/addons/{steam_id}.vpk.
  • script overlays — populated by an arbitrary user-authored bash script that runs inside bubblewrap + systemd-run --scope as the unprivileged l4d2-sandbox UID, with the overlay directory bind-mounted RW at /overlay. Resource caps: 1h walltime, 4 GB RAM, 512 tasks, 200% CPU, 20 GB post-build disk cap.
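
The workshop layout from the first bullet (one cached file per Steam item, linked into the overlay by absolute symlink) might look like this in code. The helper name `link_workshop_item` and the replace-stale-link behaviour are assumptions for illustration; only the two path shapes come from the text above.

```python
import os


def link_workshop_item(root: str, overlay_id: int, steam_id: int) -> str:
    """Symlink <root>/workshop_cache/<steam_id>.vpk into the overlay's addons dir."""
    root = os.path.abspath(root)
    cache_file = os.path.join(root, "workshop_cache", f"{steam_id}.vpk")
    addons = os.path.join(root, "overlays", str(overlay_id), "left4dead2", "addons")
    os.makedirs(addons, exist_ok=True)
    link = os.path.join(addons, f"{steam_id}.vpk")
    if os.path.islink(link):
        os.unlink(link)  # assumption: an existing link is simply replaced
    os.symlink(cache_file, link)  # absolute target, as described above
    return link
```

Because the link target is absolute, several overlays can reference the same cached `.vpk` without copying it.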

Both the caches and the overlay directories are owned by the left4me runtime user; if the web service ever runs as a different uid, ensure it shares a group with the host process and that both trees are group-readable.

Performance Tuning

The deployment ships a host-side perf baseline (slices, unit directives, sysctls). See docs/superpowers/specs/2026-05-09-l4d2-server-host-perf-baseline-design.md for design rationale.

The following knobs are documented escape hatches — they are not auto-applied. Apply only if you have measured a need and understand the failure modes.

Network shaping

The deploy ships three things that affect player-experience network behaviour:

  1. Per-flow marking. left4me-nft-mark.service loads a small nftables table (inet left4me_mark) that marks every UDP packet from uid left4me with DSCP EF and skb->priority 6. srcds doesn't set these itself, so without this rule its UDP is indistinguishable from any other flow.

  2. Sysctl baseline. 99-left4me.conf sets udp_rmem_min=16384, udp_wmem_min=16384, default_qdisc=fq_codel, and tcp_congestion_control=bbr. Reduces head-of-line blocking when bulk TCP egress (backups, package fetches, web responses) coexists with game UDP.

  3. CAKE egress shaping. left4me-cake.service runs tc qdisc replace dev <iface> root cake bandwidth Xmbit internet diffserv4 dual-dsthost from /etc/left4me/cake.env. CAKE only shapes if its declared bandwidth is below the real bottleneck, so set LEFT4ME_UPLINK_MBIT to ≈95% of measured uplink:

    sudoedit /etc/left4me/cake.env
    # set LEFT4ME_UPLINK_MBIT=480 (or whatever ~95% of your uplink is)
    sudo systemctl restart left4me-cake.service
    

    LEFT4ME_UPLINK_IFACE is auto-detected from the IPv4 default route; override only on hosts with multi-homed setups.

    On an idle 500 Mbit uplink with no competing egress, CAKE shapes nothing. That is expected, not a bug. The win materialises when bulk traffic on the same uplink would otherwise bufferbloat the link the players share.

Production hosts running systemd-networkd should NOT use the left4me-cake.service oneshot. Instead, configure the equivalent in the matching .network file, which systemd-networkd reapplies across iface lifecycle events:

# /etc/systemd/network/<your-uplink>.network
# tc's "internet" keyword is CAKE's default RTT preset, so it needs no directive here.
[CAKE]
Bandwidth=480M
PriorityQueueingPreset=diffserv4
FlowIsolationMode=dual-dst-host

The nftables marking from (1) is qdisc-installer-agnostic and ships unchanged on production.

Disabling network shaping. To turn the whole feature off on a deployed host:

sudo systemctl stop left4me-cake.service left4me-nft-mark.service
sudo systemctl disable left4me-cake.service left4me-nft-mark.service

The sysctl baseline (99-left4me.conf) and the BBR/fq_codel defaults stay applied; revert those by removing the file and running sysctl --system if needed.

CPU governor

The performance governor squeezes a few percent off jitter under bursty load. schedutil is acceptable for sustained UDP workloads.

sudo cpupower frequency-set -g performance

Install via sudo apt install linux-cpupower if the binary isn't present.

Persist via your distro's CPU-frequency tooling (e.g. /etc/default/cpufrequtils).

CPU isolation (cores)

The deploy script writes four AllowedCPUs= drop-ins so that, by default, only l4d2-game.slice is allowed to run on cores 1..N-1; system.slice, user.slice, and l4d2-build.slice are pinned to core 0. Game servers thus get the host minus core 0 exclusively, the build sandbox and the web app stay on core 0, and a logged-in admin running CPU-heavy work in their shell can't steal cycles from a live match.

Override the split by setting either env var when running the deploy:

LEFT4ME_SYSTEM_CPUS="0,1" LEFT4ME_GAME_CPUS="2-7" deploy/deploy-test-server.sh deploy-user@host

On single-core hosts the deploy skips the cpuset drop-ins entirely and prints a warning to stderr; the rest of the perf baseline (cgroup weights, sysctls, OOM scores) still applies. To force isolation on a single-core host anyway (rarely useful), set either env var explicitly.
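
The default split described above (core 0 for everything except game servers, no drop-ins on single-core hosts) can be sketched as below. This mirrors the prose only; the deploy script itself is bash, and the function names here are hypothetical.

```python
from typing import Optional, Tuple


def default_cpu_split(nproc: int) -> Optional[Tuple[str, str]]:
    """Return (system_cpus, game_cpus), or None when isolation is skipped."""
    if nproc < 1:
        raise ValueError("nproc must be at least 1")
    if nproc == 1:
        return None  # single-core host: no cpuset drop-ins are written
    game = "1" if nproc == 2 else f"1-{nproc - 1}"
    return "0", game


def slice_dropin(cpus: str) -> str:
    """Body of a 99-left4me-cpuset.conf drop-in for a slice unit."""
    return f"[Slice]\nAllowedCPUs={cpus}\n"


print(default_cpu_split(8))  # ('0', '1-7')
```

With this split, `slice_dropin("0")` would go to system.slice, user.slice, and l4d2-build.slice, and `slice_dropin("1-7")` to l4d2-game.slice.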

Per-instance CPUAffinity= (next subsection) composes on top of this — the per-instance value must be a subset of l4d2-game.slice's AllowedCPUs=, which the kernel enforces.

Per-instance CPU affinity

srcds is single-threaded per instance. On a multi-core host, pinning each instance to its own core can cut jitter under contention. Drop in /etc/systemd/system/left4me-server@<name>.service.d/affinity.conf:

[Service]
CPUAffinity=2

This pins the instance to CPU 2 specifically; per-instance values would typically be 1, 2, 3, ... so each server has its own core.

A reasonable strategy on an N-core host: leave core 0 for the kernel + IRQs + system services, then pin one instance per remaining core.
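
A sketch of that strategy: generate one affinity drop-in per instance, skipping core 0. The function name is hypothetical, and wrapping around when there are more instances than free cores is an assumption that the text above does not address.

```python
def affinity_dropins(instances: list[str], nproc: int) -> dict[str, str]:
    """Map each drop-in path to its contents, one game core per instance."""
    if nproc < 2:
        raise ValueError("need at least two cores to keep core 0 free")
    game_cores = range(1, nproc)  # core 0 stays with kernel, IRQs, system services
    dropins = {}
    for index, name in enumerate(instances):
        core = game_cores[index % len(game_cores)]  # assumption: wrap around
        path = f"/etc/systemd/system/left4me-server@{name}.service.d/affinity.conf"
        dropins[path] = f"[Service]\nCPUAffinity={core}\n"
    return dropins


for path, body in affinity_dropins(["alpha", "beta"], nproc=4).items():
    print(path, body, sep="\n")
```

The instance names `alpha` and `beta` are placeholders. Each chosen core must still fall inside l4d2-game.slice's AllowedCPUs=.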

NIC tuning

Hardware-specific (install via sudo apt install ethtool if not present). On a host with a single primary interface (replace eth0):

sudo ethtool -G eth0 rx 4096 tx 4096
sudo ethtool -K eth0 gro on lro off

If you run a high instance count, also pin the NIC's interrupts off the cores that game servers occupy (see /proc/interrupts and /proc/irq/<n>/smp_affinity).
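
/proc/irq/<n>/smp_affinity takes a hexadecimal CPU bitmask, which is easy to get wrong by hand. A small helper, assuming fewer than 32 CPUs so the mask fits one group (the function name is hypothetical):

```python
from typing import Iterable


def smp_affinity_mask(cpus: Iterable[int]) -> str:
    """Hex bitmask with one bit set per allowed CPU (bit 0 = CPU 0)."""
    mask = 0
    for cpu in cpus:
        if not 0 <= cpu < 32:
            raise ValueError("this sketch only handles CPUs 0..31")
        mask |= 1 << cpu
    return format(mask, "x")


print(smp_affinity_mask([0]))  # 1
```

To keep NIC interrupts on core 0 only, the value `1` would be written to the relevant smp_affinity files; cores 2 and 3 together would be `c`.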

Real-time scheduling (advanced, opt-in)

Source-engine servers do not need real-time scheduling, and a misbehaving srcds at any RT priority can starve kernel threads. The default kernel.sched_rt_runtime_us=950000 only reserves 5% of each one-second period for non-RT tasks, which limits the damage but does not prevent it. Use only if you have a measured jitter problem that the baseline does not solve.

/etc/systemd/system/left4me-server@.service.d/realtime.conf:

[Service]
CPUSchedulingPolicy=fifo
CPUSchedulingPriority=10
LimitRTPRIO=10
AmbientCapabilities=CAP_SYS_NICE

The AmbientCapabilities=CAP_SYS_NICE line is needed because the service runs as User=left4me with NoNewPrivileges=true; without it some kernels/systemd combinations refuse to apply the RT policy.

Additional opt-in network knobs

  • Ingress shaping via IFB. Egress CAKE alone does not protect srcds receive against ingress saturation (large workshop downloads, package fetches arriving at line rate). Setup:

    sudo modprobe ifb && sudo ip link set ifb0 up
    sudo tc qdisc add dev <uplink> handle ffff: ingress
    sudo tc filter add dev <uplink> parent ffff: protocol ip u32 \
        match u32 0 0 action mirred egress redirect dev ifb0
    sudo tc qdisc add dev ifb0 root cake bandwidth Xmbit ingress \
        diffserv4 dual-srchost
    

    Worth flipping only when measurement shows ingress hurting receive.

  • net.core.busy_poll = 50 / net.core.busy_read = 50. Reduces UDP receive median latency by polling for incoming packets briefly at syscall boundaries. Cost: measurable CPU per syscall under load. Worth flipping if a host is dedicated to game serving and CPU headroom is plentiful.

  • ethtool -K <iface> gro off. Some Source-engine operators disable generic receive offload to avoid receive-side coalescing latency (the NIC tuning section above leaves GRO on). The effect is hardware- and driver-dependent; it is documented here for reference only.

Applying changes to running servers

Unit-file changes do not apply to already-running services. After any change:

sudo systemctl daemon-reload
# Restart each game server via the web UI's stop + start, or:
sudo systemctl restart 'left4me-server@*.service'