From 9f0b51b455d5183570c08e999f49d0655c503acd Mon Sep 17 00:00:00 2001
From: mwiegand
Date: Sun, 10 May 2026 01:09:28 +0200
Subject: [PATCH] docs(deploy): document network-shaping defaults + opt-in network knobs

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 deploy/README.md | 70 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/deploy/README.md b/deploy/README.md
index 7807bcf..22c17ee 100644
--- a/deploy/README.md
+++ b/deploy/README.md
@@ -78,6 +78,51 @@ The deployment ships a host-side perf baseline (slices, unit directives, sysctls
 
 The following knobs are documented escape hatches — they are **not** auto-applied. Apply only if you have measured a need and understand the failure modes.
 
+### Network shaping
+
+The deploy ships three things that affect player-experience network behaviour:
+
+1. **Per-flow marking.** `left4me-nft-mark.service` loads a small nftables
+   table (`inet left4me_mark`) that marks every UDP packet from uid `left4me`
+   with DSCP EF and `skb->priority` 6. srcds doesn't set these itself, so
+   without this rule its UDP is indistinguishable from any other flow.
+2. **Sysctl baseline.** `99-left4me.conf` sets `udp_rmem_min=16384`,
+   `udp_wmem_min=16384`, `default_qdisc=fq_codel`, and
+   `tcp_congestion_control=bbr`. This reduces head-of-line blocking when bulk
+   TCP egress (backups, package fetches, web responses) coexists with
+   game UDP.
+3. **CAKE egress shaping.** `left4me-cake.service` runs
+   `tc qdisc replace dev root cake bandwidth Xmbit internet
+   diffserv4 dual-dsthost` from `/etc/left4me/cake.env`. CAKE only shapes
+   if its declared bandwidth is **below** the real bottleneck, so set
+   `LEFT4ME_UPLINK_MBIT` to ≈95% of measured uplink:
+
+       sudoedit /etc/left4me/cake.env
+       # set LEFT4ME_UPLINK_MBIT=480 (or whatever ~95% of your uplink is)
+       sudo systemctl restart left4me-cake.service
+
+   `LEFT4ME_UPLINK_IFACE` is auto-detected from the IPv4 default route;
+   override it only on multi-homed hosts.
+
+   On an idle 500 Mbit uplink with no competing egress, CAKE shapes nothing;
+   that's expected, not a bug. The win materialises when bulk traffic on the
+   same uplink would otherwise build a standing queue that the players share.
+
+**Production hosts running `systemd-networkd`** should NOT use the
+`left4me-cake.service` oneshot. Instead, configure the equivalent in the
+matching `.network` file, which systemd-networkd reapplies across interface
+lifecycle events:
+
+    # /etc/systemd/network/.network
+    [CAKE]
+    Bandwidth=480M
+    RTTSec=100ms
+    PriorityQueueingPreset=diffserv4
+    FlowIsolationMode=dual-dst-host
+
+The nftables marking from (1) is qdisc-installer-agnostic and ships
+unchanged on production.
+
 ### CPU governor
 
 The performance governor squeezes a few percent off jitter under bursty load. `schedutil` is acceptable for sustained UDP workloads.
@@ -144,6 +189,31 @@ AmbientCapabilities=CAP_SYS_NICE
 
 The `AmbientCapabilities=CAP_SYS_NICE` line is needed because the service runs as `User=left4me` with `NoNewPrivileges=true`; without it some kernels/systemd combinations refuse to apply the RT policy.
 
+### Additional opt-in network knobs
+
+- **Ingress shaping via IFB.** Egress CAKE alone does not protect srcds
+  receive against ingress saturation (large workshop downloads, package
+  fetches arriving at line rate). Setup:
+
+      sudo modprobe ifb && sudo ip link set ifb0 up
+      sudo tc qdisc add dev handle ffff: ingress
+      sudo tc filter add dev parent ffff: protocol ip u32 \
+          match u32 0 0 action mirred egress redirect dev ifb0
+      sudo tc qdisc add dev ifb0 root cake bandwidth Xmbit ingress \
+          diffserv4 dual-srchost
+
+  Worth flipping only when measurement shows ingress hurting receive.
+
+- **`net.core.busy_poll = 50` / `net.core.busy_read = 50`.** Reduces median
+  UDP receive latency by polling for incoming packets briefly at
+  syscall boundaries. Cost: measurable CPU per syscall under load. Worth
+  flipping if a host is dedicated to game serving and CPU headroom is
+  plentiful.
+
+- **`ethtool -K gro off`.** Some Source-engine operators disable
+  generic receive offload to avoid receive-side coalescing latency.
+  Hardware- and driver-dependent; documented for reference, not applied.
+
 ### Applying changes to running servers
 
 Unit-file changes do not apply to already-running services. After any change:
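
Reviewer note: the "≈95% of measured uplink" rule and the default-route interface detection described in the patch can be sketched in shell as below. This is a minimal illustration, not the deploy's actual implementation: the function names `cake_uplink_mbit` and `default_route_iface` are hypothetical, and the sample route line is made up for the example.

```shell
#!/bin/sh
# Illustrative helpers only; these names are hypothetical and not part of the deploy.

# Print ~95% of a measured uplink rate (integer Mbit/s, rounded down),
# so the declared CAKE bandwidth stays below the real bottleneck.
cake_uplink_mbit() {
    echo $(( $1 * 95 / 100 ))
}

# Print the interface named after "dev" in one line of
# `ip -4 route show default` output supplied on stdin.
default_route_iface() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Example with assumed inputs: a 500 Mbit measurement and a sample route line.
cake_uplink_mbit 500
echo "default via 192.0.2.1 dev eth0 proto dhcp metric 100" | default_route_iface
```

On a live host, the second helper would typically be fed from `ip -4 route show default`, and `tc -s qdisc show dev` followed by the detected interface can be used to confirm that the CAKE qdisc is installed with the intended bandwidth.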