docs(specs): l4d2 network shaping & marking — design
CAKE egress shaping (test-deploy oneshot + systemd-networkd [CAKE] block on prod), nftables uid-based DSCP-EF + skb-priority marking for srcds UDP, plus rounding sysctls (udp_rmem_min/wmem_min, default_qdisc=fq_codel, tcp_congestion_control=bbr). Hardware-specific knobs stay documented escape hatches matching the perf-baseline boundary. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent 62d6d4cbcd
commit 0cc92f2c17
1 changed file with 487 additions and 0 deletions
docs/superpowers/specs/2026-05-10-l4d2-network-shaping-design.md (normal file)
@@ -0,0 +1,487 @@
# l4d2 network shaping & marking — design

Date: 2026-05-10
Status: design

## Summary

Add a network-side player-experience baseline alongside the existing host
perf baseline. Three concerns ship together:

1. **Mark srcds outbound packets** with DSCP `EF` and skb priority `6:0` so
   any qdisc (host CAKE, ISP gear that honours DSCP, future systems)
   recognises L4D2 game traffic as latency-sensitive. Marking happens by uid
   match on the `left4me` user.
2. **Round out the UDP-socket sysctl baseline** (`udp_rmem_min`,
   `udp_wmem_min`), set the default qdisc explicitly to `fq_codel`, and
   switch TCP to `bbr` so coexisting TCP egress (admin, backups, web app,
   apt) cannot bufferbloat the link the players share.
3. **Shape egress with CAKE.** On the test deploy, install a systemd oneshot
   that applies `tc qdisc replace … cake …` from an operator-edited env
   file. On production hosts running `systemd-networkd`, document the
   equivalent `[CAKE]` section in the matching `.network` file as the
   long-term path.

The intent is "all reasonable measures that do not depend on host-specific
hardware." Hardware-specific tuning (NIC ring buffers, IRQ pinning, CPU
governor, real-time scheduling, CPU affinity) remains a documented escape
hatch, the same boundary the existing perf-baseline spec drew. The pieces
that *are* universally safe ship as defaults.

## Goals

- Game-server UDP packets carry an unambiguous priority signal in DSCP and
  in `skb->priority`, set on the host before any qdisc inspects them.
- A coexisting bulk TCP flow on the same host (backup upload, package
  fetch, web-app response) cannot push the bottleneck queue ahead of game
  UDP under saturation.
- An operator who declares uplink bandwidth gets fair-queueing egress
  shaping with diffserv-aware tin selection: EF-marked srcds traffic
  drops into the highest-priority CAKE tin, and per-destination-host
  fairness keeps every connected player on equal footing.
- A production deployment using `systemd-networkd` has a one-block
  configuration recipe, no helper script needed.
- Operators have a documented set of additional knobs (ingress shaping via
  IFB, `busy_poll`, GRO toggling) for cases the default baseline does not
  cover. None of these auto-apply.

## Non-goals

- NIC ring-buffer / IRQ pinning / RPS / RFS / hardware timestamping:
  already declared host-specific in the perf-baseline spec; not
  re-litigated here.
- `busy_poll` / `busy_read` as defaults: non-trivial CPU cost; documented
  as opt-in.
- Ingress shaping via IFB as a default: only matters if egress CAKE turns
  out load-bearing and ingress is also saturated; documented as opt-in.
- Real-time scheduling, governor changes: already declined by the
  perf-baseline spec.
- Blueprint-side game settings (`sv_minrate`, `sv_maxrate`, tickrate,
  `fps_max`): owned by the server maintainer.
- Auto-detection or measurement of uplink bandwidth. CAKE only shapes
  correctly when its declared bandwidth sits below the real bottleneck;
  the operator must measure once and configure.
- Iface-flap watchdog. `tc qdisc replace` is idempotent; on prod,
  `systemd-networkd` reapplies CAKE across iface lifecycle events. On
  test, `systemctl restart left4me-cake.service` is the documented
  recovery.

## Background

Current state (commit `62d6d4c` or thereabouts):

- The perf-baseline spec ships `/etc/sysctl.d/99-left4me.conf` with
  `rmem_max`, `wmem_max`, `rmem_default`, `wmem_default`,
  `netdev_max_backlog`, `netdev_budget`, `vm.swappiness`. No per-socket
  UDP minimums, no default-qdisc directive, no TCP congestion-control
  setting.
- `srcds_run` runs as system user `left4me`. srcds itself does not set
  `IP_TOS` or `SO_PRIORITY`, so its UDP packets leave the host with
  DSCP 0 and priority 0, indistinguishable from any other UDP traffic to
  any qdisc.
- The deploy ships nftables-relevant infrastructure only via package
  defaults (Debian Trixie ships `nftables` in base, but no `left4me`
  table is created).
- No qdisc is explicitly configured. The kernel's per-iface default
  applies: `fq_codel` on Trixie, but only because Debian's default has
  been `fq_codel` since Buster.
- The deploy script already copies sysctl drop-ins and runs
  `sysctl --system` (`deploy/deploy-test-server.sh:196`).

## Design

### Sysctl additions to `99-left4me.conf`

Append to `deploy/files/etc/sysctl.d/99-left4me.conf`:

```
# Per-socket UDP buffer floors: protect game-server sockets that don't bump
# their own SO_RCVBUF/SO_SNDBUF when softirq drains lag briefly.
net.ipv4.udp_rmem_min = 16384
net.ipv4.udp_wmem_min = 16384

# Default qdisc for ifaces we don't explicitly shape with CAKE. Debian
# Trixie already defaults to fq_codel; setting it explicitly is
# belt-and-suspenders and survives kernel-default churn.
net.core.default_qdisc = fq_codel

# TCP congestion control: BBR for any bulk TCP egress on the host (admin
# SSH, backups, package fetches, web-app responses) so a long flow does
# not push the bottleneck queue ahead of game UDP. UDP srcds is
# unaffected.
net.ipv4.tcp_congestion_control = bbr
```

The deploy already runs `sysctl --system` after copying the conf
(`deploy/deploy-test-server.sh:198`); no script change required for this
block.
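
To confirm on a deployed host that the block took effect, the drop-in can be compared against the live values under `/proc/sys`. The following is a minimal sketch, not part of the deploy: the helper names `parse_sysctl_conf` and `proc_path` are hypothetical, and `BLOCK` simply repeats the four settings above.

```python
from typing import Dict

# The four settings added by this spec, repeated here for illustration.
BLOCK = """\
net.ipv4.udp_rmem_min = 16384
net.ipv4.udp_wmem_min = 16384
net.core.default_qdisc = fq_codel
net.ipv4.tcp_congestion_control = bbr
"""


def parse_sysctl_conf(text: str) -> Dict[str, str]:
    """Parse sysctl.d-style 'key = value' lines, skipping comments and blanks."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def proc_path(key: str) -> str:
    """Map a dotted sysctl key to its file under /proc/sys."""
    return "/proc/sys/" + key.replace(".", "/")


if __name__ == "__main__":
    for key, expected in parse_sysctl_conf(BLOCK).items():
        print(f"{proc_path(key)} should read {expected}")
```

On the host itself, reading each printed path and comparing it with the expected value shows whether `sysctl --system` applied the drop-in.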

### nftables packet marking

New file `deploy/files/usr/local/lib/left4me/nft/left4me-mark.nft`:

```nft
table inet left4me_mark {
    chain mangle_output {
        type filter hook output priority mangle; policy accept;
        meta skuid "left4me" meta l4proto udp ip dscp set ef meta priority set 0006:0000
        meta skuid "left4me" meta l4proto udp ip6 dscp set ef meta priority set 0006:0000
    }
}
```

Per-element rationale:

- `meta skuid "left4me"`: every srcds instance runs as that user, and the
  match is on that exact uid. The web app also runs as `left4me`, but it
  speaks TCP and is excluded by the `meta l4proto udp` match on the same
  rule; the build sandbox runs under a different uid and never matches.
- `meta l4proto udp`: skips anything that is not UDP, including the future
  RCON/HTTP TCP traffic from the web app.
- `ip dscp set ef` / `ip6 dscp set ef`: DSCP `EF` (Expedited Forwarding,
  decimal 46) is the standard low-latency marking. CAKE's `diffserv4`
  preset routes EF into its highest-priority "Voice" tin. Two rules,
  one per L3 family, because in an `inet` table the `ip` matcher only
  fires on v4 and `ip6` only on v6.
- `meta priority set 0006:0000`: sets `skb->priority` to class `6:0`.
  Read by qdiscs that classify on skb priority ahead of any DSCP table
  lookup. Per `tc-cake(8)`, CAKE treats `skb->priority` as a tin override
  only when the major number matches the CAKE qdisc's handle and the minor
  number is a valid tin index, so with the CAKE configuration in this spec
  the EF mark is what selects the tin. Set inline with the DSCP rule so a
  single rule-match runs both statements.
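
The two marks are plain bit fields, and a short calculation shows what to expect when inspecting packets. This is an illustrative sketch only; the function names are hypothetical. It assumes the standard layouts: DSCP occupies the upper six bits of the IPv4 TOS / IPv6 Traffic Class byte, and a `major:minor` class ID is two 16-bit hexadecimal halves of the 32-bit priority value.

```python
DSCP_EF = 46  # Expedited Forwarding code point (RFC 3246)


def dscp_to_tos_byte(dscp: int, ecn: int = 0) -> int:
    """Place a 6-bit DSCP value above the 2 ECN bits of the TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    if not 0 <= ecn <= 3:
        raise ValueError("ECN must fit in 2 bits")
    return (dscp << 2) | ecn


def classid_to_priority(classid: str) -> int:
    """Convert a 'major:minor' class ID (hex halves) to a 32-bit priority."""
    major, minor = classid.split(":")
    return (int(major, 16) << 16) | int(minor, 16)


if __name__ == "__main__":
    print(hex(dscp_to_tos_byte(DSCP_EF)))          # 0xb8
    print(hex(classid_to_priority("0006:0000")))   # 0x60000
```

An EF-marked packet with no ECN bits set therefore carries a TOS byte of `0xb8`, which is the value to look for in a packet capture when checking that the rule fires.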

The table is named `left4me_mark` and lives in its own `inet` namespace.
It does not touch, depend on, or conflict with any nftables config the
operator may run independently. `nft -f` loads the file; `nft delete
table inet left4me_mark` cleanly removes it.

New unit `deploy/files/usr/local/lib/systemd/system/left4me-nft-mark.service`:

```ini
[Unit]
Description=left4me nftables packet marking (DSCP EF + priority for srcds)
After=network-pre.target
Before=network.target
Wants=network-pre.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/sbin/nft -f /usr/local/lib/left4me/nft/left4me-mark.nft
ExecStop=/usr/sbin/nft delete table inet left4me_mark

[Install]
WantedBy=multi-user.target
```

`After=network-pre.target` / `Before=network.target` loads the rules
early in network bring-up, before `network.target` is reached and so
before any srcds instance starts; the very first packet srcds emits
post-boot is already marked. This ordering does not strictly guarantee
that the rules load before an iface is configured (per
`systemd.special(7)`, that would need `Before=network-pre.target`), which
is acceptable here because only srcds traffic is marked.

Deploy script changes:

- Ensure `nftables` is installed (`apt-get install -y nftables`;
  idempotent, since the package is in Trixie base).
- Create `/usr/local/lib/left4me/nft/` and copy `left4me-mark.nft` into
  it.
- Copy the unit, `daemon-reload`, `systemctl enable --now
  left4me-nft-mark.service`.

### CAKE egress shaper — test deploy mechanism

Three files plus deploy-script changes. All operator-tunable knobs go in
the env file; the helper and unit are static.

**`deploy/files/etc/left4me/cake.env`** (template; deploy installs only
if absent so operator edits survive re-runs):

```
# Uplink bandwidth in Mbit/s. Set to ~95% of the smaller of measured
# upload and measured download. CAKE only shapes correctly when its
# declared bandwidth sits below the real bottleneck. If unset, the
# left4me-cake.service unit logs a warning and exits 0 (no shaping).
LEFT4ME_UPLINK_MBIT=

# Egress interface. If unset, auto-detected from the IPv4 default route.
LEFT4ME_UPLINK_IFACE=
```

**`deploy/files/usr/local/libexec/left4me/left4me-apply-cake`** (mode
`0755`, owner `root:root`). The helper takes a single argument, `apply`
or `clear`, so the unit's `ExecStart` and `ExecStop` both call the same
script and the unit file stays free of shell escaping:

```sh
#!/bin/sh
# Apply or clear the CAKE root qdisc on the uplink iface.
set -eu

mode=${1:-apply}

# Operator-tunable knobs; the unit also passes them via EnvironmentFile=.
if [ -r /etc/left4me/cake.env ]; then
    . /etc/left4me/cake.env
fi

# Print the egress iface: explicit override first, otherwise the device
# named after "dev" on the IPv4 default route (works with or without "via").
resolve_iface() {
    if [ -n "${LEFT4ME_UPLINK_IFACE:-}" ]; then
        printf '%s' "$LEFT4ME_UPLINK_IFACE"
        return
    fi
    ip -4 route show default |
        awk '/^default/ {for (i = 1; i < NF; i++) if ($i == "dev") {print $(i + 1); exit}}'
}

case "$mode" in
    apply)
        # Fail soft: missing config is a warning, not a boot failure.
        if [ -z "${LEFT4ME_UPLINK_MBIT:-}" ]; then
            echo "left4me-cake: LEFT4ME_UPLINK_MBIT unset; skipping shaper" >&2
            exit 0
        fi
        iface=$(resolve_iface)
        if [ -z "$iface" ]; then
            echo "left4me-cake: cannot determine egress iface; skipping" >&2
            exit 0
        fi
        exec tc qdisc replace dev "$iface" root cake \
            bandwidth "${LEFT4ME_UPLINK_MBIT}mbit" \
            internet diffserv4 dual-dsthost
        ;;
    clear)
        iface=$(resolve_iface)
        if [ -z "$iface" ]; then
            exit 0
        fi
        # Ignore "no such qdisc" so stopping twice is harmless.
        tc qdisc del dev "$iface" root 2>/dev/null || true
        ;;
    *)
        echo "usage: $0 [apply|clear]" >&2
        exit 2
        ;;
esac
```

`tc qdisc replace` is idempotent: it replaces an existing root qdisc on
the iface, or adds one if absent. Re-running the unit at any time is safe.
`clear` swallows the "no such qdisc" error so stop is also idempotent.
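
The iface auto-detection can be reasoned about offline with a small parser over the text that `ip -4 route show default` prints. The sketch below is illustrative and not part of the deploy; `parse_default_iface` is a hypothetical name, and the sample route lines use documentation addresses and made-up iface names.

```python
from typing import Optional


def parse_default_iface(route_output: str) -> Optional[str]:
    """Return the device of the first default route, or None if absent."""
    for line in route_output.splitlines():
        fields = line.split()
        if not fields or fields[0] != "default":
            continue
        # The iface name is the token that follows the literal "dev".
        if "dev" in fields:
            idx = fields.index("dev")
            if idx + 1 < len(fields):
                return fields[idx + 1]
    return None


if __name__ == "__main__":
    # Illustrative outputs only; real hosts will differ.
    print(parse_default_iface("default via 192.0.2.1 dev eth0 proto dhcp metric 100"))
    print(parse_default_iface("default dev ppp0 scope link"))
    print(parse_default_iface(""))
```

The empty case corresponds to the helper's "cannot determine egress iface" branch, where it logs and exits 0 instead of failing the unit.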

Fail-soft on missing config matches the perf-baseline philosophy: the
deploy does not refuse to boot servers because the operator has not yet
filled in `LEFT4ME_UPLINK_MBIT`. The journal warning surfaces the gap.

**`deploy/files/usr/local/lib/systemd/system/left4me-cake.service`**:

```ini
[Unit]
Description=left4me CAKE egress shaper
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
EnvironmentFile=-/etc/left4me/cake.env
ExecStart=/usr/local/libexec/left4me/left4me-apply-cake apply
ExecStop=/usr/local/libexec/left4me/left4me-apply-cake clear

[Install]
WantedBy=multi-user.target
```

Per-flag rationale for the `cake` invocation:

- `bandwidth ${LEFT4ME_UPLINK_MBIT}mbit`: operator-declared, ≈95% of
  measured uplink. CAKE only shapes if its declared bandwidth is below
  the real bottleneck; setting it slightly low moves the queue into a
  place the host controls.
- `internet`: RTT preset, not an overhead keyword. Per `tc-cake(8)` it
  tunes CAKE's AQM for a typical internet path (100 ms round trip) and is
  CAKE's default, stated here explicitly. Link-layer overhead compensation
  for DOCSIS / GPON / PPPoE encapsulation is a separate family of keywords
  (for example `docsis`, `pppoe-ptm`, `conservative`) that v1 leaves at
  the default.
- `diffserv4`: four-tier DSCP-aware tin selection. Reads the EF marks
  set by the nftables rule and routes srcds packets into the
  highest-priority "Voice" tin. CAKE's default preset, `diffserv3`, also
  honours EF; naming `diffserv4` pins the four-tin layout explicitly.
  Only `besteffort` would ignore the marks.
- `dual-dsthost`: egress fairness keyed on destination host. With ≥2
  players connected, each player gets a fair share regardless of how
  chatty the server is to any single client.
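
As a worked example of the bandwidth rule, the sketch below derives a value for `LEFT4ME_UPLINK_MBIT` from a measured uplink and assembles the same `tc` arguments the helper uses. It is illustrative only: `shaped_mbit` and `cake_argv` are hypothetical names, and the measured figure and iface name are made up.

```python
import math
from typing import List


def shaped_mbit(measured_uplink_mbit: float, headroom_percent: int = 95) -> int:
    """Round the measured uplink down to the given percentage, in whole Mbit/s."""
    if measured_uplink_mbit <= 0:
        raise ValueError("measured uplink must be positive")
    if not 0 < headroom_percent <= 100:
        raise ValueError("headroom_percent must be in (0, 100]")
    return max(1, math.floor(measured_uplink_mbit * headroom_percent / 100))


def cake_argv(iface: str, mbit: int) -> List[str]:
    """Build the tc command line used by the helper's apply mode."""
    return [
        "tc", "qdisc", "replace", "dev", iface, "root", "cake",
        "bandwidth", f"{mbit}mbit",
        "internet", "diffserv4", "dual-dsthost",
    ]


if __name__ == "__main__":
    mbit = shaped_mbit(505.3)  # hypothetical speed-test result in Mbit/s
    print(mbit)
    print(" ".join(cake_argv("eth0", mbit)))
```

Rounding down keeps the declared bandwidth below the real bottleneck, which is the condition under which CAKE owns the queue.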

Iface-flap behaviour: the kernel keeps the qdisc on an iface across
link-down/link-up while the iface itself exists. If the iface is
recreated (e.g., NetworkManager reconfiguration), `systemctl restart
left4me-cake.service` reapplies. Documented; no auto-watchdog in v1.

Deploy script changes (in `deploy/deploy-test-server.sh`):

- Copy `cake.env` to `/etc/left4me/cake.env` only if absent (do not
  clobber operator edits).
- Copy `left4me-apply-cake` to `/usr/local/libexec/left4me/`, mode
  `0755`, owner `root:root`.
- Copy `left4me-cake.service` to `/usr/local/lib/systemd/system/`.
- `systemctl daemon-reload` (already done in the existing flow).
- `systemctl enable --now left4me-cake.service`.

### CAKE egress shaper — production deployment (systemd-networkd)

On hosts running `systemd-networkd`, the CAKE configuration belongs in
the matching `.network` file. systemd-networkd reapplies it across iface
lifecycle events, addressing the only fragility of the test-deploy
oneshot.

Document in `deploy/README.md` Performance section:

```ini
# /etc/systemd/network/<your-uplink>.network
[CAKE]
Bandwidth=480M
RTTSec=100ms
PriorityQueueingPreset=diffserv4
FlowIsolationMode=dual-dst-host
```

Directive names follow `systemd.network(5)`. Values mirror the test
deploy's `tc` invocation:

- `Bandwidth=480M`: placeholder; the operator sets it to ≈95% of measured
  uplink in their actual `.network`.
- `RTTSec=100ms`: equivalent of the `internet` keyword, CAKE's 100 ms RTT
  preset. `systemd.network(5)` has no overhead-keyword directive; overhead
  is given numerically via `OverheadBytes=` when needed.
- `PriorityQueueingPreset=diffserv4`: equivalent of `diffserv4`.
- `FlowIsolationMode=dual-dst-host`: equivalent of `dual-dsthost` on
  egress.

The nftables marking from the previous section ships unchanged on prod;
it is qdisc-installer-agnostic.

The test-deploy oneshot must not be installed on a host running
`systemd-networkd`, where it would compete with the `[CAKE]` section for
the root qdisc. v1 does not implement that gate, because production hosts
do not run the test-deploy script. If the boundary blurs in the future,
add a check in `left4me-apply-cake` for `systemctl is-active
systemd-networkd` and skip cleanly.

### Documented escape hatches

Append to `deploy/README.md` Performance section, alongside the existing
governor / CPU-affinity / NIC entries:

- **Ingress shaping via IFB.** Egress CAKE alone does not protect srcds
  receive against ingress saturation (large workshop downloads, package
  fetches arriving at line rate). Short template using `modprobe
  ifb`, `ip link set ifb0 up`, `tc qdisc add dev ifb0 root cake bandwidth
  Xmbit ingress diffserv4 dual-srchost`, and a `tc filter` redirect from
  the uplink iface. Worth flipping only when measurement shows ingress
  hurting receive; in v1 we have no such measurement, so it stays
  documented.
- **`net.core.busy_poll = 50` / `net.core.busy_read = 50`.** Reduces UDP
  receive median latency by polling for incoming packets briefly at
  syscall boundaries. Cost: measurable CPU per syscall under load. Worth
  flipping if a host is dedicated to game serving and CPU headroom is
  plentiful.
- **`ethtool -K <iface> gro off`.** Some Source-engine operators disable
  generic receive offload to avoid receive-side coalescing latency.
  Hardware/driver dependent. Document, do not ship.

These three entries follow the existing escape-hatch style: a one-liner
or short config block, plus one sentence on when it matters.
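
For the IFB entry, the command sequence can be sketched as data before it is written up as a README template. This is a hedged illustration, not a shipped helper: `ifb_ingress_commands` is a hypothetical name, `eth0` and `450` are placeholders, and the ingress qdisc plus `matchall` / `mirred` redirect is one common way to realise the "`tc filter` redirect" the entry mentions.

```python
from typing import List


def ifb_ingress_commands(uplink: str, mbit: int, ifb: str = "ifb0") -> List[List[str]]:
    """Return the commands, in order, that redirect ingress through an IFB with CAKE."""
    return [
        ["modprobe", "ifb"],
        ["ip", "link", "set", ifb, "up"],
        # Shape on the IFB device, where ingress traffic appears as egress.
        ["tc", "qdisc", "add", "dev", ifb, "root", "cake",
         "bandwidth", f"{mbit}mbit", "ingress", "diffserv4", "dual-srchost"],
        # Attach an ingress qdisc to the uplink so a filter can hang off it.
        ["tc", "qdisc", "add", "dev", uplink, "handle", "ffff:", "ingress"],
        # Redirect every ingress packet on the uplink to the IFB device.
        ["tc", "filter", "add", "dev", uplink, "parent", "ffff:", "matchall",
         "action", "mirred", "egress", "redirect", "dev", ifb],
    ]


if __name__ == "__main__":
    for argv in ifb_ingress_commands("eth0", 450):
        print(" ".join(argv))
```

Printing the list gives an operator a copy-paste template while keeping the ordering explicit: the ingress qdisc must exist before the redirect filter is attached.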

### Files changed / added

```
deploy/files/etc/sysctl.d/99-left4me.conf (modified: block added)
deploy/files/usr/local/lib/left4me/nft/left4me-mark.nft (new)
deploy/files/usr/local/lib/systemd/system/left4me-nft-mark.service (new)
deploy/files/etc/left4me/cake.env (new: template, deploy preserves operator edits)
deploy/files/usr/local/libexec/left4me/left4me-apply-cake (new)
deploy/files/usr/local/lib/systemd/system/left4me-cake.service (new)
deploy/deploy-test-server.sh (modified: install+enable nft and cake units, conditional copy of cake.env)
deploy/README.md (modified: Network shaping subsection + 3 new escape hatches)
deploy/tests/test_deploy_artifacts.py (modified: assertions for all artifacts above)
```

## Tests

Following the existing `assert "key=value" in text` pattern in
`deploy/tests/test_deploy_artifacts.py`:

**Sysctl block** (extension of the existing perf-baseline assertions):

- Each of `net.ipv4.udp_rmem_min = 16384`,
  `net.ipv4.udp_wmem_min = 16384`, `net.core.default_qdisc = fq_codel`,
  `net.ipv4.tcp_congestion_control = bbr` is asserted as a separate line.

**nftables marking artifacts:**

- `left4me-mark.nft` ships with `table inet left4me_mark`,
  `chain mangle_output`, `meta skuid "left4me"`, `ip dscp set ef`,
  `ip6 dscp set ef`, and `meta priority set 0006:0000` each asserted as
  separate substring matches. (DSCP and priority statements appear inline
  on the same rule per L3 family; substring assertions don't depend on
  rule layout.)
- `left4me-nft-mark.service` has
  `ExecStart=/usr/sbin/nft -f /usr/local/lib/left4me/nft/left4me-mark.nft`,
  `ExecStop=/usr/sbin/nft delete table inet left4me_mark`, `Type=oneshot`,
  `RemainAfterExit=yes`, `WantedBy=multi-user.target`.
- `deploy-test-server.sh` invokes
  `systemctl enable --now left4me-nft-mark.service` (or equivalent
  at-deploy enabling step).

**CAKE artifacts:**

- `cake.env` template contains the literal lines `LEFT4ME_UPLINK_MBIT=`
  and `LEFT4ME_UPLINK_IFACE=` (commented or uncommented; matched as
  substring).
- `left4me-apply-cake` contains the literals `tc qdisc replace`, `cake`,
  `bandwidth`, `internet`, `diffserv4`, `dual-dsthost`,
  `LEFT4ME_UPLINK_MBIT`, `LEFT4ME_UPLINK_IFACE`.
- `left4me-apply-cake` is mode `0755` after deploy (asserted via the
  same mechanism the existing helper-script tests use).
- `left4me-cake.service` contains
  `EnvironmentFile=-/etc/left4me/cake.env`,
  `ExecStart=/usr/local/libexec/left4me/left4me-apply-cake apply`,
  `ExecStop=/usr/local/libexec/left4me/left4me-apply-cake clear`,
  `Wants=network-online.target`, `Type=oneshot`,
  `WantedBy=multi-user.target`.
- `deploy-test-server.sh` invokes
  `systemctl enable --now left4me-cake.service`.
- `deploy-test-server.sh` copies `cake.env` only when target absent
  (asserted by literal substring of the guarding
  `[ -e /etc/left4me/cake.env ]` test or equivalent).

No runtime networking tests in v1. The artifacts are static; their
runtime behaviour requires a real iface and a real bandwidth load,
which the operator measures.
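
The substring-assertion pattern for the nftables artifact might look like the sketch below. It is an assumption about shape rather than the existing test code: `missing_literals` is a hypothetical helper, and `SAMPLE_NFT` stands in for the file contents that the real test would read from `deploy/files/`.

```python
from typing import List

# Literals the spec requires in left4me-mark.nft.
NFT_MARK_LITERALS = [
    "table inet left4me_mark",
    "chain mangle_output",
    'meta skuid "left4me"',
    "ip dscp set ef",
    "ip6 dscp set ef",
    "meta priority set 0006:0000",
]

# Stand-in for the artifact text; a real test would read the shipped file.
SAMPLE_NFT = """\
table inet left4me_mark {
    chain mangle_output {
        type filter hook output priority mangle; policy accept;
        meta skuid "left4me" meta l4proto udp ip dscp set ef meta priority set 0006:0000
        meta skuid "left4me" meta l4proto udp ip6 dscp set ef meta priority set 0006:0000
    }
}
"""


def missing_literals(text: str, literals: List[str]) -> List[str]:
    """Return the required literals that do not occur in text."""
    return [lit for lit in literals if lit not in text]


def test_nft_mark_artifact() -> None:
    # An empty list means every required substring is present.
    assert missing_literals(SAMPLE_NFT, NFT_MARK_LITERALS) == []


if __name__ == "__main__":
    test_nft_mark_artifact()
    print("nft artifact literals present")
```

Returning the list of missing literals, instead of asserting one substring at a time, makes a failing test name every gap in one run.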

## Rollout

Single deploy. After the new sysctl block lands, `sysctl --system`
applies it immediately (already in the deploy flow). The two new
systemd units start on `systemctl enable --now`; CAKE without a
configured `LEFT4ME_UPLINK_MBIT` logs a warning and no-ops, which is
the expected fresh-deploy state. The operator measures their uplink,
edits `/etc/left4me/cake.env`, and runs `systemctl restart
left4me-cake.service`.
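
The env-file edit in that last step is a one-line change. For operators who script it, a minimal sketch follows; `set_env_value` is a hypothetical helper, not something the deploy ships, and it works on the file's text so it can be tried without touching `/etc`.

```python
import re


def set_env_value(env_text: str, key: str, value: str) -> str:
    """Set KEY=value in env-file text, replacing the first uncommented KEY= line."""
    pattern = re.compile(rf"^{re.escape(key)}=.*$", re.MULTILINE)
    line = f"{key}={value}"
    if pattern.search(env_text):
        # A function replacement avoids backslash interpretation in the value.
        return pattern.sub(lambda _match: line, env_text, count=1)
    separator = "" if (not env_text or env_text.endswith("\n")) else "\n"
    return f"{env_text}{separator}{line}\n"


if __name__ == "__main__":
    template = (
        "# Uplink bandwidth in Mbit/s.\n"
        "LEFT4ME_UPLINK_MBIT=\n"
        "\n"
        "# Egress interface.\n"
        "LEFT4ME_UPLINK_IFACE=\n"
    )
    print(set_env_value(template, "LEFT4ME_UPLINK_MBIT", "480"))
```

Comment lines are left alone because the pattern only matches lines that begin with the key itself.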

Already-running game servers need no restart. The marking applies to
every packet emitted from the moment the nft rule loads, so running
srcds instances pick up DSCP+priority immediately.

## Open questions

None blocking. v2 candidates if measurement justifies them:

- A `LEFT4ME_INGRESS_MBIT` knob that flips on the IFB ingress shaper as
  a default, conditional on the env value being set.
- A `left4me-net-doctor` helper that reports current qdisc, applied
  marks, and a one-shot saturation+ping measurement against a local
  endpoint.
- A small Python wrapper in `l4d2host` that reads `cake.env` for
  display in the web UI, so the operator sees in one place whether
  shaping is active.

## References

- `tc-cake(8)`: keyword semantics for `bandwidth`, `internet`,
  `diffserv4`, `dual-dsthost`, tin priority mapping.
- `systemd.network(5)`: `[CAKE]` section directives
  `Bandwidth=`, `RTTSec=`, `PriorityQueueingPreset=`,
  `FlowIsolationMode=`.
- `nft(8)`: `meta skuid`, `meta priority`, `ip dscp set`, table
  isolation semantics.
- RFC 3246: Expedited Forwarding (EF) PHB.
- Linux kernel `Documentation/networking/tcp_bbr.txt`: BBR pairs with
  `fq` / `fq_codel` for correct pacing.
- `docs/superpowers/specs/2026-05-09-l4d2-server-host-perf-baseline-design.md`:
  sibling spec; this spec extends `99-left4me.conf` and reuses the
  same deploy-test-artifact pattern.