# left4me Deployment

This directory contains the production-like test deployment for a Linux server. It installs the repository into a fixed host layout, configures a dedicated runtime user, installs systemd units, and wires the web app to host operations through privileged helper commands.

## Target Layout

The deployment uses these paths:

- `/etc/left4me/host.env`: host library environment configuration.
- `/etc/left4me/web.env`: web app environment configuration.
- `/opt/left4me/.venv`: Python virtual environment for deployed commands.
- `/opt/left4me`: deployed repository contents.
- `/var/lib/left4me/left4me.db`: SQLite database used by the web app.
- `/var/lib/left4me/installation`: shared L4D2 installation.
- `/var/lib/left4me/overlays`: overlay directories. Each overlay lives at `${overlay_id}` under this directory.
- `/var/lib/left4me/workshop_cache`: deduplicated cache of `.vpk` files downloaded for workshop overlays. One file per Steam item, named `{steam_id}.vpk`. Workshop overlays symlink into this tree.
- `/var/lib/left4me/global_overlay_cache`: cache of non-Steam map archives and extracted `.vpk` files used by managed global map overlays.
- `/var/lib/left4me/instances`: rendered instance specifications and per-instance state.
- `/var/lib/left4me/runtime`: per-instance runtime mount directories.
- `/var/lib/left4me/tmp`: temporary files used by deployment/runtime operations.
- `/usr/local/lib/systemd/system`: global systemd unit files, including `left4me-server@.service`.
- `/usr/local/libexec/left4me`: privileged helper commands, including `left4me-systemctl`, `left4me-journalctl`, and `left4me-overlay`. The last of these mounts the per-instance kernel overlay in PID 1's mount namespace via `nsenter`.
- `/etc/sudoers.d/left4me`: sudoers rules allowing the web/runtime commands to call the helpers non-interactively.

Static units are generated for `/var/lib/left4me`. If `LEFT4ME_ROOT` changes, regenerate and reinstall the unit files instead of reusing the existing static units.

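
To confirm that a deployment produced the layout listed above, a check along the following lines can help. This is only a sketch: `check_layout` and its optional prefix argument are hypothetical and not part of the repository. The path list is copied from the section above, minus the SQLite database, which is assumed not to exist before the web app first runs.

```sh
# Sketch: report which expected deployment paths are missing.
# check_layout and its optional prefix argument are hypothetical helpers.
check_layout() {
  prefix="${1:-}"   # optional staging prefix; empty on a real host
  missing=0
  for path in \
    /etc/left4me/host.env \
    /etc/left4me/web.env \
    /opt/left4me/.venv \
    /var/lib/left4me/installation \
    /var/lib/left4me/overlays \
    /var/lib/left4me/workshop_cache \
    /var/lib/left4me/global_overlay_cache \
    /var/lib/left4me/instances \
    /var/lib/left4me/runtime \
    /var/lib/left4me/tmp \
    /usr/local/libexec/left4me \
    /etc/sudoers.d/left4me
  do
    if [ ! -e "${prefix}${path}" ]; then
      echo "missing: ${path}"
      missing=1
    fi
  done
  # Exit status 0 means every path exists, 1 means at least one is missing.
  return "$missing"
}
```

Run it as root on the target host (`/etc/sudoers.d` is usually not readable by other users), for example `check_layout && echo "layout looks complete"`.
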
## Runtime User

The deployment creates and runs host operations as the dedicated runtime user:

- Username: `left4me`
- Home: `/var/lib/left4me`
- Shell: `/usr/sbin/nologin`

## Running A Test Deployment

Run the deployment from the repository root:

```bash
deploy/deploy-test-server.sh deploy-user@example-host
```

The SSH user must be able to run `sudo` on the target host. The deployment configures system packages, directories, environment files, helper scripts, sudoers rules, Python dependencies, and systemd units.

## Admin Bootstrap

Set the bootstrap credentials in the environment when creating the first admin user:

```bash
# Export first: a prefix assignment on the same command line is not visible
# to the "$LEFT4ME_ADMIN_USERNAME" expansion in that command's arguments.
export LEFT4ME_ADMIN_USERNAME=admin
export LEFT4ME_ADMIN_PASSWORD='change-me'
flask create-user "$LEFT4ME_ADMIN_USERNAME" --admin
unset LEFT4ME_ADMIN_PASSWORD
```

Use a strong one-time password and rotate it after first login if needed.

## Overlay References

Overlay references are relative paths below `${LEFT4ME_ROOT}/overlays`. With the default deployment root, they resolve under `/var/lib/left4me/overlays`. New overlays use `${overlay_id}` as their path; this digit-only form is the only one the web app creates.

Invalid references are rejected:

- Absolute paths such as `/srv/overlay`.
- Parent traversal such as `../other` or `competitive/../../base`.
- Empty path components such as `competitive//base`.
- Symlink escapes that resolve outside `${LEFT4ME_ROOT}/overlays`.

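
The rejection rules above can be sketched as a small shell check. This is an illustration and not the project's actual validator: `valid_overlay_ref` is a hypothetical name, and the symlink check assumes GNU coreutils `realpath` (for its `-m` option, which tolerates path components that do not exist yet).

```sh
# Sketch: validate an overlay reference against the rules listed above.
# valid_overlay_ref is a hypothetical helper.
# Usage: valid_overlay_ref <overlays-root> <reference>
valid_overlay_ref() {
  root="$1"
  ref="$2"
  case "$ref" in
    ''|/*) return 1 ;;                  # empty or absolute path
    ..|../*|*/..|*/../*) return 1 ;;    # parent traversal
    *//*|*/) return 1 ;;                # empty path component
  esac
  # Symlink escape: resolve both paths and require the reference to stay
  # below the resolved root. Assumes GNU realpath (-m allows missing parts).
  real_root="$(realpath -- "$root")" || return 1
  real_ref="$(realpath -m -- "$root/$ref")" || return 1
  case "$real_ref" in
    "$real_root"/*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `valid_overlay_ref /var/lib/left4me/overlays 42` succeeds, while the same call with `../other` or `competitive//base` returns a non-zero status.
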
The web app currently supports two overlay surfaces:

- `workshop` overlays (user-owned): populated by downloading `.vpk` files from the public Steam Web API into `${LEFT4ME_ROOT}/workshop_cache/{steam_id}.vpk` and creating absolute symlinks under `${LEFT4ME_ROOT}/overlays/{overlay_id}/left4dead2/addons/{steam_id}.vpk`.
- `script` overlays: populated by an arbitrary user-authored bash script that runs inside `bubblewrap` + `systemd-run --scope` as the unprivileged `l4d2-sandbox` UID, with the overlay directory bind-mounted RW at `/overlay`. Resource caps: 1h walltime, 4 GB RAM, 512 tasks, 200% CPU, 20 GB post-build disk cap.

Both the caches and the overlay directories are owned by the `left4me` runtime user; if the web service ever runs as a different uid, ensure it shares a group with the host process and that both trees are group-readable.

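
To make the workshop layout concrete, the linking step might look roughly like the sketch below. `link_workshop_item` is a hypothetical helper, not code from the repository. The download itself is out of scope, so the cache file is assumed to exist already, and the root is assumed to be passed as an absolute path so that the resulting symlink target is absolute.

```sh
# Sketch: link a cached workshop item into an overlay's addons directory.
# link_workshop_item is a hypothetical helper.
# Usage: link_workshop_item <LEFT4ME_ROOT> <overlay_id> <steam_id>
link_workshop_item() {
  root="$1"
  overlay_id="$2"
  steam_id="$3"
  cached="$root/workshop_cache/$steam_id.vpk"
  addons="$root/overlays/$overlay_id/left4dead2/addons"
  if [ ! -f "$cached" ]; then
    echo "not cached: $cached" >&2
    return 1
  fi
  mkdir -p "$addons"
  # -f replaces a stale link, so re-running the helper is harmless.
  ln -sf "$cached" "$addons/$steam_id.vpk"
}
```

Because every overlay links to the same cache file, an item used by several overlays is stored on disk only once.
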
## Performance Tuning

The deployment ships a host-side perf baseline (slices, unit directives, sysctls). See `docs/superpowers/specs/2026-05-09-l4d2-server-host-perf-baseline-design.md` for design rationale.

The following knobs are documented escape hatches; they are **not** auto-applied. Apply only if you have measured a need and understand the failure modes.

### Network shaping

The deploy ships three things that affect player-experience network behaviour:

1. **Per-flow marking.** `left4me-nft-mark.service` loads a small nftables
   table (`inet left4me_mark`) that marks every UDP packet from uid `left4me`
   with DSCP EF and `skb->priority` 6. srcds doesn't set these itself, so
   without this rule its UDP is indistinguishable from any other flow.
2. **Sysctl baseline.** `99-left4me.conf` sets `udp_rmem_min=16384`,
   `udp_wmem_min=16384`, `default_qdisc=fq_codel`, and
   `tcp_congestion_control=bbr`. This reduces head-of-line blocking when bulk
   TCP egress (backups, package fetches, web responses) coexists with
   game UDP.
3. **CAKE egress shaping.** `left4me-cake.service` runs
   `tc qdisc replace dev <iface> root cake bandwidth Xmbit internet
   diffserv4 dual-dsthost` with values taken from `/etc/left4me/cake.env`.
   CAKE only shapes if its declared bandwidth is **below** the real
   bottleneck, so set `LEFT4ME_UPLINK_MBIT` to ≈95% of measured uplink:

       sudoedit /etc/left4me/cake.env
       # set LEFT4ME_UPLINK_MBIT=480 (or whatever ~95% of your uplink is)
       sudo systemctl restart left4me-cake.service

   `LEFT4ME_UPLINK_IFACE` is auto-detected from the IPv4 default route;
   override it only on multi-homed hosts.

   On an idle 500 Mbit uplink with no competing egress, CAKE shapes
   nothing; that's expected, not a bug. The win materialises when bulk
   traffic on the same uplink would otherwise bufferbloat the link the
   players share.

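
The ≈95% rule above is simple integer arithmetic. As a sketch, with `cake_budget_mbit` being a hypothetical helper name and the percentage exposed as an optional second argument:

```sh
# Sketch: derive a CAKE bandwidth (Mbit/s) from a measured uplink.
# cake_budget_mbit is a hypothetical helper; 95 is the percentage
# suggested above and can be overridden with a second argument.
cake_budget_mbit() {
  measured="$1"
  percent="${2:-95}"
  # Integer division truncates, which errs on the safe (lower) side.
  echo $(( measured * percent / 100 ))
}
```

`cake_budget_mbit 500` prints `475`, which is the kind of value to put in `LEFT4ME_UPLINK_MBIT`. Measure the uplink under realistic conditions first; a budget above the real bottleneck leaves CAKE with nothing to shape.
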
**Production hosts running `systemd-networkd`** should NOT use the
`left4me-cake.service` oneshot. Instead, configure the equivalent in the
matching `.network` file, which systemd-networkd reapplies across iface
lifecycle events:

    # /etc/systemd/network/<your-uplink>.network
    [CAKE]
    Bandwidth=480M
    # tc's "internet" keyword is the 100 ms RTT preset
    RTTSec=100ms
    PriorityQueueingPreset=diffserv4
    # equivalent of tc's dual-dsthost
    FlowIsolationMode=dual-dst-host

The nftables marking from (1) is qdisc-installer-agnostic and ships
unchanged on production.

**Disabling network shaping.** To turn the whole feature off on a deployed
host:

    sudo systemctl stop left4me-cake.service left4me-nft-mark.service
    sudo systemctl disable left4me-cake.service left4me-nft-mark.service

The sysctl baseline (`99-left4me.conf`), including the BBR/fq_codel
defaults, stays applied. To revert it, remove the file and then either
reboot or reset the affected keys by hand with `sysctl -w`. Running
`sysctl --system` alone only re-reads the remaining configuration files;
it does not restore the previous values.

### CPU governor

The performance governor squeezes a few percent off jitter under bursty load. `schedutil` is acceptable for sustained UDP workloads.

```sh
sudo cpupower frequency-set -g performance
```

Install via `sudo apt install linux-cpupower` if the binary isn't present.

Persist via your distro's CPU-frequency tooling (e.g. `/etc/default/cpufrequtils`).

### CPU isolation (cores)

The deploy script writes four `AllowedCPUs=` drop-ins so that, by default, only `l4d2-game.slice` is allowed to run on cores 1..N-1; `system.slice`, `user.slice`, and `l4d2-build.slice` are pinned to core 0. Game servers thus get the host minus core 0 exclusively, the build sandbox and the web app stay on core 0, and a logged-in admin running CPU-heavy work in their shell can't steal cycles from a live match.

Override the split by setting either env var when running the deploy:

```sh
LEFT4ME_SYSTEM_CPUS="0,1" LEFT4ME_GAME_CPUS="2-7" deploy/deploy-test-server.sh deploy-user@host
```

On single-core hosts the deploy skips the cpuset drop-ins entirely and prints a warning to stderr; the rest of the perf baseline (cgroup weights, sysctls, OOM scores) still applies. To force isolation on a single-core host anyway (rarely useful), set either env var explicitly.

Per-instance `CPUAffinity=` (next subsection) composes on top of this: the per-instance value must be a subset of `l4d2-game.slice`'s `AllowedCPUs=`, which the kernel enforces.

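
For reference, the default split described in this subsection can be reproduced with a few lines of shell. This is a sketch of the rule as documented here, not the deploy script's own code; `default_cpu_split` is a hypothetical helper that prints values in the form the two env vars accept.

```sh
# Sketch: compute the documented default split for an N-core host.
# default_cpu_split is a hypothetical helper, not the deploy script's code.
default_cpu_split() {
  ncpu="$1"
  if [ "$ncpu" -lt 2 ]; then
    # Mirrors the documented behaviour: no cpuset split on one core.
    echo "single-core host: no cpuset split" >&2
    return 1
  fi
  echo "LEFT4ME_SYSTEM_CPUS=0"
  if [ "$ncpu" -eq 2 ]; then
    echo "LEFT4ME_GAME_CPUS=1"
  else
    echo "LEFT4ME_GAME_CPUS=1-$((ncpu - 1))"
  fi
}
```

Calling `default_cpu_split "$(nproc)"` on an 8-core host prints `LEFT4ME_SYSTEM_CPUS=0` and `LEFT4ME_GAME_CPUS=1-7`, which is a convenient starting point when working out a custom override.
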
### Per-instance CPU affinity

`srcds` is single-threaded per instance. On a multi-core host, pinning each instance to its own core can cut jitter under contention. Drop in `/etc/systemd/system/left4me-server@<name>.service.d/affinity.conf`:

```ini
[Service]
CPUAffinity=2
```

This pins the instance to CPU 2 specifically; per-instance values would typically be 1, 2, 3, ... so each server has its own core.

A reasonable strategy on an N-core host: leave core 0 for the kernel + IRQs + system services, then pin one instance per remaining core.

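
The one-instance-per-core strategy can be scripted. The sketch below writes one `affinity.conf` per instance, assigning cores 1, 2, 3, ... in argument order. `write_affinity_dropins` is a hypothetical helper, and it does not check that the host actually has enough cores or that they fall inside the game slice's `AllowedCPUs=`.

```sh
# Sketch: write one affinity.conf drop-in per instance name.
# write_affinity_dropins is a hypothetical helper. The first argument is
# the unit directory (/etc/systemd/system on a real host); the remaining
# arguments are instance names.
write_affinity_dropins() {
  unit_dir="$1"
  shift
  core=1   # core 0 stays with the kernel, IRQs, and system services
  for name in "$@"; do
    dropin_dir="$unit_dir/left4me-server@$name.service.d"
    mkdir -p "$dropin_dir"
    printf '[Service]\nCPUAffinity=%s\n' "$core" > "$dropin_dir/affinity.conf"
    core=$((core + 1))
  done
}
```

On a real host this needs root, and it must be followed by `systemctl daemon-reload` and a restart of the affected instances before the new affinity takes effect.
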
### NIC tuning

Hardware-specific (install via `sudo apt install ethtool` if not present). On a host with a single primary interface (replace `eth0`):

```sh
sudo ethtool -G eth0 rx 4096 tx 4096
sudo ethtool -K eth0 gro on lro off
```

If you run a high instance count, also pin the NIC's interrupts off the cores that game servers occupy (see `/proc/interrupts` and `/proc/irq/<n>/smp_affinity`).

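
`smp_affinity` takes a hexadecimal CPU bitmask rather than a core list, which is easy to get wrong by hand. The sketch below builds the mask from CPU numbers. `cpus_to_mask` is a hypothetical helper and only handles CPUs 0 to 31; larger hosts use comma-separated 32-bit groups, which it does not produce.

```sh
# Sketch: build the hex bitmask that /proc/irq/<n>/smp_affinity expects.
# cpus_to_mask is a hypothetical helper limited to CPUs 0-31.
cpus_to_mask() {
  mask=0
  for cpu in "$@"; do
    mask=$(( mask | (1 << cpu) ))   # set the bit for this CPU
  done
  printf '%x\n' "$mask"
}
```

To keep an interrupt on core 0 only, write the output of `cpus_to_mask 0` (which is `1`) to `/proc/irq/<n>/smp_affinity` as root. If `irqbalance` is running on the host, it may rewrite the value later.
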
### Real-time scheduling (advanced, opt-in)

Source-engine servers do not need real-time scheduling, and a misbehaving `srcds` at any RT priority can starve kernel threads, even though the default `kernel.sched_rt_runtime_us=950000` reserves 5% of each scheduling period for non-RT tasks. Use only if you have a measured jitter problem that the baseline does not solve.

`/etc/systemd/system/left4me-server@.service.d/realtime.conf`:

```ini
[Service]
CPUSchedulingPolicy=fifo
CPUSchedulingPriority=10
LimitRTPRIO=10
AmbientCapabilities=CAP_SYS_NICE
```

The `AmbientCapabilities=CAP_SYS_NICE` line is needed because the service runs as `User=left4me` with `NoNewPrivileges=true`; without it some kernels/systemd combinations refuse to apply the RT policy.

### Additional opt-in network knobs

- **Ingress shaping via IFB.** Egress CAKE alone does not protect srcds
  receive against ingress saturation (large workshop downloads, package
  fetches arriving at line rate). Minimal setup (the filter below matches
  IPv4 only):

      sudo modprobe ifb && sudo ip link set ifb0 up
      sudo tc qdisc add dev <uplink> handle ffff: ingress
      sudo tc filter add dev <uplink> parent ffff: protocol ip u32 \
        match u32 0 0 action mirred egress redirect dev ifb0
      sudo tc qdisc add dev ifb0 root cake bandwidth Xmbit ingress \
        diffserv4 dual-srchost

  Worth flipping only when measurement shows ingress hurting receive.

- **`net.core.busy_poll = 50` / `net.core.busy_read = 50`.** Reduces UDP
  receive median latency by polling for incoming packets briefly at
  syscall boundaries. Cost: measurable CPU per syscall under load. Worth
  flipping if a host is dedicated to game serving and CPU headroom is
  plentiful.

- **`ethtool -K <iface> gro off`.** Some Source-engine operators disable
  generic receive offload to avoid receive-side coalescing latency.
  Hardware/driver dependent; documented here for reference only.

### Applying changes to running servers

Unit-file changes do not apply to already-running services. After any change:

```sh
sudo systemctl daemon-reload
# Restart each game server via the web UI's stop + start, or:
sudo systemctl restart 'left4me-server@*.service'
```