From 122e0abdddd70478bc6134b0f6bdf2fee95df167 Mon Sep 17 00:00:00 2001 From: mwiegand Date: Wed, 20 May 2026 00:31:45 +0200 Subject: [PATCH] fix(log-streaming): point logaddress at non-loopback IP MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Reversed the wrong conclusion in 46bba0d. The vanilla L4D2 logaddress UDP path is NOT broken — proved by retesting with destination 172.30.0.5:28000 (wireguard IP) which yielded 8 properly framed HL Log Standard packets in 12s including real game events. Root cause: the Source engine silently drops logaddress destinations in 127.0.0.0/8. Registration succeeds and the cvar API reports "logging to: udp" but sendto is never called for loopback. Every other L4D2 stats deployment (multiple production HLstatsX:CE instances) puts the collector on a separate host or interface IP and never hits this. Defaults: LOG_LISTENER_BIND=0.0.0.0:28000 (accept on any interface); LOG_LISTENER_ADDR="" (production must set via web.env to the host's non-loopback IP). Empty default = safe no-op for dev. The kernel's same-host routing optimization keeps the traffic on lo internally but the packet's destination IP must not literally be in 127/8. Co-Authored-By: Claude Opus 4.7 (1M context) --- ...6-05-19-log-streaming-and-temp-listener.md | 147 +++++++++++------- l4d2web/l4d2web/config.py | 13 +- l4d2web/l4d2web/services/l4d2_facade.py | 19 ++- l4d2web/l4d2web/services/log_listener.py | 22 +-- 4 files changed, 123 insertions(+), 78 deletions(-) diff --git a/docs/superpowers/plans/2026-05-19-log-streaming-and-temp-listener.md b/docs/superpowers/plans/2026-05-19-log-streaming-and-temp-listener.md index e9f158e..4f77833 100644 --- a/docs/superpowers/plans/2026-05-19-log-streaming-and-temp-listener.md +++ b/docs/superpowers/plans/2026-05-19-log-streaming-and-temp-listener.md @@ -234,17 +234,21 @@ End-to-end smoke test, on the dev box first: ## Post-deploy verification findings (2026-05-20) -The plan above describes what we INTENDED to ship; this section -documents what we ACTUALLY learned after deployment. +The plan above describes what we INTENDED to ship. This section +documents what we ACTUALLY learned, including a wrong intermediate +conclusion that was later corrected. -Listener was deployed and confirmed bound (`ss -ulnp` showed -`gunicorn:28000`). Servers were re-initialized, `server.cfg` -inspected and confirmed correct — `log on`, `logaddress_add -127.0.0.1:28000` at the bottom, after `rcon_password`. The listener -was proven healthy end-to-end with a local `nc` probe writing a -capture file. **Yet captures stayed empty even during active play.** +### What worked -Diagnostic steps: +Listener deployed and confirmed bound (`ss -ulnp` showed +`gunicorn:28000`). Servers re-initialized; `server.cfg` had +`log on` / `logaddress_add 127.0.0.1:28000` after `rcon_password`. +Listener proven healthy end-to-end with a local `nc` probe writing +a capture file. + +### What didn't (and why) + +Captures stayed empty even during active play. Symptoms: 1. `logaddress_list` via RCON → `1 entry: 127.0.0.1:28000` ✓ 2. `log` via RCON → `currently logging to: file, console, udp` ✓ @@ -256,54 +260,91 @@ Diagnostic steps: 4. `tcpdump -i lo udp port 28000` during rcon say/echo/status bursts → **0 packets** ✗ 5. `tcpdump -i any host 127.0.0.1 and udp` → still 0 ✗ -6. Toggle `log off` / `log on` live → engine confirms cycle, no - effect ✗ -7. `sv_logflush 1` live → no effect ✗ -8. Suspected `SocketBindAllow=udp:27000-27999` was blocking the - implicit ephemeral bind of a separate log socket. Added drop-in - `SocketBindAllow=udp:32768-60999`, daemon-reloaded, restarted - server → still 0 packets ✗. Drop-in rolled back. -9. `strace -p -e sendto,sendmsg` while triggering - events → **zero sendto/sendmsg calls toward 127.0.0.1:28000**. - Only one unrelated TLS sendmsg (Steam) ✗. +6. Toggle `log off` / `log on` live, `sv_logflush 1` → no effect ✗ +7. Tested `SocketBindAllow=udp:32768-60999` drop-in (suspected the + ephemeral source-port bind was being rejected) → still 0 + packets. Drop-in rolled back ✗ +8. `strace -p -e sendto,sendmsg` → **zero sendto + calls toward the destination** ✗ -**Verdict:** L4D2's vanilla UDP logaddress emit path is -stubbed/broken at the engine level. The cvar API and registration -paths are intact (data-structure ops); the emit path simply doesn't -make the syscalls. Logs reach the **file** and **console** sinks -(the latter is what `journalctl` captures from srcds_run's stdout) -but never the **udp** sink, despite all three being listed in the -`log` cvar status. Same family of issue as L4D2's broken SourceTV -(see `shqke/sourcetvsupport`). +A premature conclusion was reached and committed as `46bba0d` +("L4D2 logaddress UDP emit is dead"). User pushed back, asked for +verification. Research found multiple production HLstatsX:CE +instances running L4D2 stats successfully — disproving the +engine-bug theory. -**Implications for the roadmap:** +### Real cause -- The current listener captures nothing in production. The cfg - injection of `log on` / `logaddress_add` does no harm and is kept - for a future bridge to attach to. -- Phase 2 path A (recommended): **SourceMod bridge plugin** in - `l4d2host/`. Subscribes to L4D2 game events and calls - `LogMessage()` from inside srcds. The explicit `LogMessage()` - path is implemented separately from the autonomous engine emit - and DOES reach UDP destinations on every Source 1 game stat - trackers have used in production. This proves out the listener. -- Phase 2 path B (interim, no plugin): **parse the journal**. - `journalctl -u "left4me-server@*.service" -f` already streams - every HL-Log-Standard line as captured srcds stdout. A reducer - that follows journald + matches the `srcds_run[*]: L ...` lines - produces structured events today without any srcds-side change. - Less clean than UDP-to-listener (we'd hop through journald) but - available immediately. -- `mp_logdetail` was removed from the cfg injection: it is a - CS-only cvar (L4D2 prints `Unknown command "mp_logdetail"` at - startup). +Re-registered logaddress at a non-loopback IP (`172.30.0.5:28000` +on the wireguard interface) and reran the test. **8 packets in 12 +seconds**, each a properly framed HL-Log-Standard payload — +including `Console<0>" say "wg-test-1"`, all the rcon-from lines, +and live poller status calls. + +**The Source engine silently drops `logaddress` destinations in +`127.0.0.0/8`.** Registration succeeds (data-structure op), the +cvar API reports "logging to: udp", but the engine's send loop +filters out loopback destinations and never calls sendto for them. +This is presumably an anti-self-loop / anti-amplification measure +that I have not seen documented in any Valve or community source. + +Everyone else using `logaddress` for stat tracking puts the +collector on a *separate host* or at minimum a different interface +IP — they never hit this. We're the unusual case of co-locating +the listener with the gameserver and naively pointing at +`127.0.0.1`. + +### Fix + +- `LOG_LISTENER_BIND` default → `0.0.0.0:28000` (accept on any + interface). +- `LOG_LISTENER_ADDR` default → `""` (empty). Production env file + MUST set this to a non-loopback IP. Dev gets a safe no-op + (cfg injector skips emitting log cvars when the address is + empty). +- Production `web.env`: set + `LOG_LISTENER_ADDR=:28000`. The host's + public interface IP works; the kernel's same-host routing + optimization keeps the actual traffic on `lo` internally, but the + *destination IP* in the packet header is non-loopback so Source's + filter passes. + +### Implications for the roadmap (revised) + +- The vanilla UDP logaddress mechanism works in L4D2. **No SourceMod + bridge is required for Phase 1** — we were going to add one to + work around a non-existent engine bug. +- `mp_logdetail` was correctly removed from the cfg injection: it + is a CS-only cvar; L4D2 prints `Unknown command` at startup. +- A future SourceMod plugin in `l4d2host/` may still be useful for + L4D-specific events the engine doesn't auto-log (kill events with + weapon detail, special-infected spawns) — see + `SMILEWHENYOUDIE/HLstatsX-CE`'s `superlogs-l4d.sp` for prior + art. That's a Phase 2 enhancement, not a Phase 1 prerequisite. + +### Lessons (filed under "validate before implementing") + +- I anchored hard on "L4D2 engine bug" after disproving two + reasonable alternative hypotheses (hibernation, SocketBindAllow). + The third alternative — destination-IP filter — wasn't tested + before committing the wrong-conclusion docs. Should have tried a + non-loopback destination as the very first test. +- The journal evidence of rich game events going to console/file + was real and significant, but I misread it as proof of an + engine-level UDP stub instead of evidence the engine's line + generator works fine and only the *destination filter* was at + play. +- The user's push-back ("maybe do research if it's really broken?") + forced the right next step. Worth internalizing: a single + contrarian search query found HLstatsX-on-L4D2 production + instances within seconds and would have prevented the + wrong-conclusion commit. ## Follow-ups (separate plans) -- **Phase 2:** pick Path A (SourceMod bridge) vs Path B (journald - follower). A is cleaner long-term; B is faster to ship. -- Once events flow somehow: design `RawLogLine` / `LogEvent` / - `MatchSession` / `Round` schema based on what we actually see. +- **After a few days of capture data**: design + `RawLogLine` / `LogEvent` / `MatchSession` / `Round` schema. - Build a reducer. -- `sv_logsecret` becomes relevant only once the listener is - reachable beyond loopback (currently moot). +- Consider a SourceMod bridge for richer L4D2-specific events + (kill weapons, special-infected spawns, finale outcomes). +- `sv_logsecret` if the listener is ever moved off-host. diff --git a/l4d2web/l4d2web/config.py b/l4d2web/l4d2web/config.py index 7491919..fefe150 100644 --- a/l4d2web/l4d2web/config.py +++ b/l4d2web/l4d2web/config.py @@ -22,8 +22,13 @@ DEFAULT_CONFIG: dict[str, object] = { "STEAM_PROFILE_TTL_SECONDS": 86400, "STEAM_WEB_API_KEY": "", "LOG_LISTENER_ENABLED": True, - "LOG_LISTENER_BIND": "127.0.0.1:28000", - "LOG_LISTENER_ADDR": "127.0.0.1:28000", + "LOG_LISTENER_BIND": "0.0.0.0:28000", + # Empty by default — production MUST set this to a non-loopback IP via + # web.env. Source silently drops logaddress destinations in 127.0.0.0/8, + # so 127.0.0.1 here would register but never receive packets. When + # empty, the cfg injector skips emitting `log on` / `logaddress_add`, + # which is the safer no-op for dev environments without a srcds host. + "LOG_LISTENER_ADDR": "", "LOG_CAPTURE_DIR": "/var/lib/left4me/captures", } @@ -54,7 +59,7 @@ def load_config() -> dict[str, object]: "STEAM_PROFILE_TTL_SECONDS": int(os.getenv("STEAM_PROFILE_TTL_SECONDS", "86400")), "STEAM_WEB_API_KEY": os.getenv("STEAM_WEB_API_KEY", ""), "LOG_LISTENER_ENABLED": _bool_from_env(os.getenv("LOG_LISTENER_ENABLED", "true")), - "LOG_LISTENER_BIND": os.getenv("LOG_LISTENER_BIND", "127.0.0.1:28000"), - "LOG_LISTENER_ADDR": os.getenv("LOG_LISTENER_ADDR", "127.0.0.1:28000"), + "LOG_LISTENER_BIND": os.getenv("LOG_LISTENER_BIND", "0.0.0.0:28000"), + "LOG_LISTENER_ADDR": os.getenv("LOG_LISTENER_ADDR", ""), "LOG_CAPTURE_DIR": os.getenv("LOG_CAPTURE_DIR", "/var/lib/left4me/captures"), } diff --git a/l4d2web/l4d2web/services/l4d2_facade.py b/l4d2web/l4d2web/services/l4d2_facade.py index 27feb02..d8b56c7 100644 --- a/l4d2web/l4d2web/services/l4d2_facade.py +++ b/l4d2web/l4d2web/services/l4d2_facade.py @@ -55,16 +55,15 @@ def build_server_spec_payload( # nor hostname can override it (Source's cvar semantics are last-wins). if server.rcon_password: config_lines.append(f'rcon_password "{server.rcon_password}"') - # HL-Log-Standard UDP streaming registration. Investigation on 2026-05-20 - # established that L4D2's vanilla logaddress UDP emit path is stubbed/broken: - # `log` cvar reports "logging to: file, console, udp" and `logaddress_list` - # shows the registration, but srcds_linux makes ZERO sendto/sendmsg calls - # toward the destination (proven by strace + tcpdump showing 0 packets). - # Local file logging works fine. So the listener currently captures nothing - # in production. The cvars are kept registered for a future SourceMod - # bridge plugin that will call LogMessage() directly — those messages DO - # flow through logaddress destinations on every Source 1 game including L4D2. - # `mp_logdetail` was removed: it is a CS-only cvar (`Unknown command` on L4D2). + # HL-Log-Standard UDP streaming registration. CRITICAL: log_listener_addr + # MUST be a non-loopback IP. The Source engine silently drops logaddress + # destinations in 127.0.0.0/8 — registration succeeds (logaddress_list + # shows it, `log` cvar reports "logging to: udp") but sendto is never + # called for that destination. Use the host's non-loopback IP (public + # interface, wireguard, dummy interface, etc.) and have the listener + # bind on 0.0.0.0 to receive same-host packets routed via lo by the + # kernel's local-destination shortcut. + # `mp_logdetail` is CS-only and emits `Unknown command` on L4D2; omitted. if log_listener_addr: config_lines.append("log on") config_lines.append(f"logaddress_add {log_listener_addr}") diff --git a/l4d2web/l4d2web/services/log_listener.py b/l4d2web/l4d2web/services/log_listener.py index 5bce308..581bc24 100644 --- a/l4d2web/l4d2web/services/log_listener.py +++ b/l4d2web/l4d2web/services/log_listener.py @@ -1,4 +1,4 @@ -"""Temporary UDP listener for srcds log streaming (HL Log Standard). +"""UDP listener for srcds log streaming (HL Log Standard). Captures raw `logaddress_add`-streamed packets from every managed L4D2 server into flat files under LOG_CAPTURE_DIR — one file per source @@ -6,16 +6,16 @@ server into flat files under LOG_CAPTURE_DIR — one file per source reducer. The goal is to observe what L4D2 actually emits on our servers before we commit to a schema or event vocabulary. -KNOWN-DEAD AS OF 2026-05-20: L4D2's vanilla logaddress UDP emit path -is stubbed/broken at the engine level. srcds reports -"currently logging to: file, console, udp" and logaddress_list shows -our address, but strace + tcpdump confirmed srcds_linux makes ZERO -sendto/sendmsg calls toward the destination. Local file logging works -fine; only UDP broadcast is dead. This matches the broken-SourceTV -pattern in L4D2 — see docs/superpowers/plans/2026-05-19-...md for the -full investigation. The listener is therefore a no-op in production -until a SourceMod bridge plugin is deployed that calls LogMessage() -directly (those messages DO flow through logaddress destinations). +CRITICAL: the listener MUST bind on a non-loopback interface (default +LOG_LISTENER_BIND=0.0.0.0:28000) and the cfg-injected logaddress MUST +point to a non-loopback IP (LOG_LISTENER_ADDR set in web.env). +The Source engine silently drops logaddress destinations in +127.0.0.0/8 — registration succeeds but no packets are ever sent +(verified 2026-05-20 with strace + tcpdump). Use the host's public +interface IP, a wireguard interface IP, or a dummy interface IP; the +kernel's same-host routing optimization will keep the traffic on lo +internally but the destination IP must not literally be in 127/8. +See docs/superpowers/plans/2026-05-19-...md for the full investigation. Packet shape (per HL Log Standard): 4 bytes 0xFF | out-of-band marker