Server live-state display (counts, map, roster, avatars, history)
Context
The l4d2web UI currently shows systemd lifecycle state per game server (running/stopped/unknown) but nothing about what's happening inside the game: player count, current map, whether the server is hibernating, who is connected. To know any of that, users have to context-switch (open the game, query externally).
The goal is a read-side live-state display: counts + map + hibernating on the server list, plus a server-detail panel showing the current player roster (avatars + names) and a "recent players" section for who's been on lately. Backed by a persistent history table so we get count-over-time graphs and player-presence history (foundation for future ban UX) for free.
Source: RCON exclusively. A2S_INFO (UDP, anonymous) was investigated and discarded — it can't deliver Steam IDs, hibernating flag, or interactive commands, so anything beyond raw counts re-routes through RCON anyway. Both transports were verified working against prod left4.me. Going RCON-only means one transport, one set of tests, no throwaway scaffolding.
Avatars: Steam Web API. RCON gives Steam IDs; ISteamUser/GetPlayerSummaries resolves them to persona names + avatar URLs hot-linked from Steam's CDN. API key already obtained.
Commands are deferred to a separate plan. This plan is read-only.
Architecture
                       ┌─────────────────────────────┐
                       │  left4me-web (Flask)        │
┌──────────────┐ RCON  │  ┌───────────────────────┐  │
│ srcds 27016  │◄──────┼──┤ live-state poller     │  │
└──────────────┘ TCP   │  │ (daemon thread)       │  │
                       │  └───────┬───────────────┘  │
┌──────────────┐ RCON  │          │ writes           │
│ srcds 27021  │◄──────┤          ▼                  │
└──────────────┘       │  ┌───────────────────────┐  │
                       │  │ server_live_state     │  │
       Steam Web API   │  │ server_player_session │  │
       ┌────────────┐  │  │ steam_user_profile    │  │
       │ Steam CDN  │◄─┼──┤                       │  │
       │ avatars... │  │  └───────┬───────────────┘  │
       └────────────┘  │          │ reads            │
              ▲        │          ▼                  │
              │        │  ┌───────────────────────┐  │
              └────────┼──┤ /servers, /servers/N  │  │
        <img src=...>  │  │ (HTMX 5s refresh)     │  │
                       │  └───────────────────────┘  │
                       └─────────────────────────────┘
Single daemon thread (modeled on the existing start_state_poller in l4d2web/services/job_worker.py:617-647), inside the Flask process, polls every LIVE_STATE_POLL_SECONDS (default 5). Per poll, per running server with a configured RCON password:
- TCP connect to 127.0.0.1:<port>, auth, send status, parse response.
- Compare server-level state (players/map/hibernating/etc.) to the latest server_live_state row for this server. If unchanged, bump last_seen_at. If changed, insert a new row.
- Reconcile open sessions (server_player_session rows where left_at IS NULL) with the current status roster: open new sessions for new players (backfilling joined_at from RCON's connected field), close sessions for players no longer present, update min_ping/max_ping for continuing sessions.
- Collect Steam IDs that are missing from steam_user_profile or have fetched_at older than 24h; batch them into a single GetPlayerSummaries call; upsert results.
- Trim server_live_state and closed sessions older than retention.
Schema (one new alembic migration)
New column: servers.rcon_password
rcon_password: Mapped[str] = mapped_column(
String(64), nullable=False, default="", server_default=""
)
Empty string = "no password configured yet" (poller skips). Migration backfills every existing row with secrets.token_urlsafe(32) (~43 chars, URL-safe character set so the literal "..." cfg-quoting needs no escaping).
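The backfill step could look roughly like the sketch below. It uses the stdlib sqlite3 module against a throwaway in-memory table rather than the real alembic migration, and backfill_rcon_passwords is a hypothetical helper name, not something that exists in the codebase.

```python
import secrets
import sqlite3


def backfill_rcon_passwords(conn: sqlite3.Connection) -> int:
    """Give every server row with an empty rcon_password a fresh token.

    Returns the number of rows updated.
    """
    rows = conn.execute(
        "SELECT id FROM servers WHERE rcon_password = ''"
    ).fetchall()
    for (server_id,) in rows:
        # token_urlsafe(32) encodes 32 random bytes as 43 URL-safe characters.
        conn.execute(
            "UPDATE servers SET rcon_password = ? WHERE id = ?",
            (secrets.token_urlsafe(32), server_id),
        )
    conn.commit()
    return len(rows)


# Throwaway in-memory table standing in for the real servers table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE servers (id INTEGER PRIMARY KEY, name TEXT, "
    "rcon_password VARCHAR(64) NOT NULL DEFAULT '')"
)
conn.executemany("INSERT INTO servers (name) VALUES (?)", [("alpha",), ("bravo",)])
updated = backfill_rcon_passwords(conn)
lengths = [n for (n,) in conn.execute("SELECT length(rcon_password) FROM servers")]
```

Running the helper a second time updates nothing, since it only touches rows whose password is still empty.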
server_live_state — run-length-encoded snapshots
CREATE TABLE server_live_state (
id INTEGER PRIMARY KEY AUTOINCREMENT,
server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE,
started_at DATETIME NOT NULL, -- when this exact state first appeared
last_seen_at DATETIME NOT NULL, -- most recent poll where it still held
players INTEGER NOT NULL,
max_players INTEGER NOT NULL,
bots INTEGER NOT NULL,
map VARCHAR(64) NOT NULL,
hibernating BOOLEAN NOT NULL
);
CREATE INDEX ix_sls_server_started ON server_live_state(server_id, started_at DESC);
- "State" = the tuple (players, max_players, bots, map, hibernating). Ping/loss are deliberately not stored at server-level, so they don't churn rows.
- Idle hibernating server collapses from one-row-per-poll to one-row-per-state-change (≈17,280× compression for a 24h-idle server).
- Latest snapshot for a server: ORDER BY started_at DESC LIMIT 1. UI staleness check: last_seen_at > now - LIVE_STATE_STALE_SECONDS (default 30).
- Retention: trim rows where last_seen_at < now - LIVE_STATE_HISTORY_DAYS (default 30).
- Failed polls produce no DB write; the staleness check on last_seen_at handles UI degradation cleanly.
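The run-length-encoded write described above can be sketched with the stdlib sqlite3 module. This is an illustration under assumptions, not the planned implementation: record_state is a hypothetical helper, and timestamps are stored as ISO strings for simplicity.

```python
import sqlite3
from datetime import datetime, timedelta


def record_state(conn, server_id: int, state: tuple, now: datetime) -> bool:
    """Return True if a new row was inserted, False if last_seen_at was bumped.

    state is (players, max_players, bots, map, hibernating).
    """
    latest = conn.execute(
        "SELECT id, players, max_players, bots, map, hibernating "
        "FROM server_live_state WHERE server_id = ? "
        "ORDER BY started_at DESC LIMIT 1",
        (server_id,),
    ).fetchone()
    stamp = now.isoformat(sep=" ")
    if latest is not None and tuple(latest[1:]) == state:
        # Same state as the newest row: extend its interval, no new row.
        conn.execute(
            "UPDATE server_live_state SET last_seen_at = ? WHERE id = ?",
            (stamp, latest[0]),
        )
        return False
    conn.execute(
        "INSERT INTO server_live_state (server_id, started_at, last_seen_at, "
        "players, max_players, bots, map, hibernating) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (server_id, stamp, stamp, *state),
    )
    return True


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE server_live_state (id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "server_id INTEGER NOT NULL, started_at DATETIME NOT NULL, "
    "last_seen_at DATETIME NOT NULL, players INTEGER NOT NULL, "
    "max_players INTEGER NOT NULL, bots INTEGER NOT NULL, "
    "map VARCHAR(64) NOT NULL, hibernating BOOLEAN NOT NULL)"
)
t0 = datetime(2024, 1, 1, 12, 0, 0)
idle = (0, 4, 0, "c1m1_hotel", True)
results = [
    record_state(conn, 1, idle, t0),                         # first sighting
    record_state(conn, 1, idle, t0 + timedelta(seconds=5)),  # unchanged
    record_state(conn, 1, (1, 4, 0, "c1m1_hotel", False), t0 + timedelta(seconds=10)),
]
```

Three polls produce two rows here: the two idle polls share one row whose last_seen_at advances, and the player joining starts a new one.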
server_player_session — interval per connection
CREATE TABLE server_player_session (
id INTEGER PRIMARY KEY AUTOINCREMENT,
server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE,
steam_id_64 VARCHAR(20) NOT NULL,
joined_at DATETIME NOT NULL,
left_at DATETIME NULL, -- NULL = currently in-game
name_at_join VARCHAR(64) NOT NULL,
min_ping INTEGER NOT NULL,
max_ping INTEGER NOT NULL
);
CREATE INDEX ix_sps_server_open ON server_player_session(server_id, left_at);
CREATE INDEX ix_sps_steam_history ON server_player_session(steam_id_64, joined_at);
- joined_at is backfilled from RCON's connected duration on first sighting (joined_at = now - connected_seconds). This heals brief polling gaps and survives web restarts: even if we just started polling, we know when the still-connected players actually joined.
- A player who disconnects and rejoins gets two rows, not one merged interval.
- Bots are excluded: rows with a non-STEAM_X:Y:Z uniqueid are skipped.
- min_ping/max_ping updated only when a new poll pushes the range, to avoid noise writes.
- On poller startup, close any sessions whose server isn't in current RCON output. Plus: close sessions after N consecutive failed polls of their server (TBD constant during implementation, e.g. 6 polls = ~30s).
- Retention: trim closed sessions where left_at < now - LIVE_STATE_HISTORY_DAYS (default 30; the same key that governs snapshot retention). Open sessions never trimmed.
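The session rules above amount to a three-way diff between the open sessions and the current roster. A minimal sketch as a pure planning function follows; RosterEntry and reconcile are hypothetical names (RosterEntry stands in for the roster rows the RCON parser would return), and the actual DB writes are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RosterEntry:  # hypothetical stand-in for a parsed status roster row
    steam_id_64: str
    name: str
    connected_seconds: int
    ping: int


def reconcile(open_sessions: dict[str, tuple[int, int]],
              roster: list[RosterEntry],
              now: datetime):
    """Plan session writes; open_sessions maps steam_id_64 -> (min_ping, max_ping)."""
    present = {p.steam_id_64: p for p in roster}
    # New players: joined_at is backfilled from the connected duration.
    to_open = [
        (p.steam_id_64, now - timedelta(seconds=p.connected_seconds), p.name, p.ping)
        for p in roster
        if p.steam_id_64 not in open_sessions
    ]
    # Players no longer in the roster: their sessions get left_at = now.
    to_close = sorted(sid for sid in open_sessions if sid not in present)
    # Continuing players: only write when the ping leaves the stored range.
    to_update = {}
    for sid, (lo, hi) in open_sessions.items():
        p = present.get(sid)
        if p is not None and (p.ping < lo or p.ping > hi):
            to_update[sid] = (min(lo, p.ping), max(hi, p.ping))
    return to_open, to_close, to_update


now = datetime(2024, 1, 1, 12, 0, 0)
open_sessions = {"A": (40, 60), "B": (30, 30)}
roster = [RosterEntry("A", "alice", 600, 75), RosterEntry("C", "carol", 90, 20)]
to_open, to_close, to_update = reconcile(open_sessions, roster, now)
```

In this example "C" gets a new session backdated by 90 seconds, "B" is closed, and "A" has its max_ping widened to 75.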
steam_user_profile — cached profile data (24h TTL)
CREATE TABLE steam_user_profile (
steam_id_64 VARCHAR(20) PRIMARY KEY,
persona_name VARCHAR(64) NOT NULL,
avatar_url TEXT NOT NULL, -- avatarmedium from Steam Web API
fetched_at DATETIME NOT NULL
);
- Cache is global, not per-server (one profile per Steam ID).
- Refreshed when fetched_at < now - 24h or when entry is missing.
- Soft-fail: if the Steam API key is unset, the API is down, or a profile is private, we just leave the cache as-is and the UI falls back to name_at_join + placeholder avatar.
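The refresh rule (missing entry, or fetched_at older than the TTL) can be expressed as a small selection function. The sketch below is illustrative only; ids_needing_refresh is a hypothetical name and the cache is modeled as a plain dict of steam_id_64 to fetched_at.

```python
from datetime import datetime, timedelta


def ids_needing_refresh(roster_ids, cached_fetched_at: dict[str, datetime],
                        now: datetime, ttl_seconds: int = 86400) -> list[str]:
    """Return roster Steam IDs whose profile is missing or older than the TTL."""
    cutoff = now - timedelta(seconds=ttl_seconds)
    needed = []
    for sid in dict.fromkeys(roster_ids):  # de-duplicate, keep roster order
        fetched = cached_fetched_at.get(sid)
        if fetched is None or fetched < cutoff:
            needed.append(sid)
    return needed


now = datetime(2024, 1, 2, 12, 0, 0)
cache = {
    "fresh": now - timedelta(hours=1),
    "old": now - timedelta(hours=25),
}
needed = ids_needing_refresh(["fresh", "old", "new", "new"], cache, now)
```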
Bind-rendered queries
Current players on server X:
SELECT sp.steam_id_64, sp.joined_at, sp.name_at_join,
sp.min_ping, sp.max_ping,
p.persona_name, p.avatar_url
FROM server_player_session sp
LEFT JOIN steam_user_profile p USING (steam_id_64)
WHERE sp.server_id = ? AND sp.left_at IS NULL
ORDER BY sp.joined_at;
Recent players on server X (last 30 days, excluding currently in-game):
SELECT sp.steam_id_64, MAX(sp.left_at) AS last_seen,
p.persona_name, p.avatar_url
FROM server_player_session sp
LEFT JOIN steam_user_profile p USING (steam_id_64)
WHERE sp.server_id = ?
AND sp.left_at IS NOT NULL
AND sp.left_at > datetime('now', '-30 days')
AND sp.steam_id_64 NOT IN (
SELECT steam_id_64 FROM server_player_session
WHERE server_id = ? AND left_at IS NULL
)
GROUP BY sp.steam_id_64, p.persona_name, p.avatar_url
ORDER BY last_seen DESC
LIMIT 20;
Modules
l4d2web/services/rcon.py (new)
Pure stdlib (socket, struct), no new dependency. Source RCON protocol:
from dataclasses import dataclass


@dataclass(slots=True, frozen=True)
class PlayerRow:
    steam_id_64: str  # converted from STEAM_X:Y:Z
    name: str
    connected_seconds: int
    ping: int


@dataclass(slots=True, frozen=True)
class StatusResponse:
    map: str
    players: int  # humans
    max_players: int
    bots: int
    hibernating: bool
    roster: list[PlayerRow]


class RconError(Exception): ...
class RconAuthError(RconError): ...


def query_status(host: str, port: int, password: str, *, timeout: float = 2.0) -> StatusResponse: ...
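For orientation, the packet framing that query_status would sit on top of can be sketched as follows. The layout (little-endian int32 size, request id and type, then a NUL-terminated body and a trailing NUL) is the documented Source RCON format; the helper names encode_packet and decode_packet are assumptions for illustration.

```python
import struct

# Packet type constants from the Source RCON protocol.
SERVERDATA_AUTH = 3
SERVERDATA_AUTH_RESPONSE = 2
SERVERDATA_EXECCOMMAND = 2
SERVERDATA_RESPONSE_VALUE = 0


def encode_packet(req_id: int, ptype: int, body: str) -> bytes:
    """Frame one packet: size, id, type, body, NUL, NUL (size excludes itself)."""
    payload = struct.pack("<ii", req_id, ptype) + body.encode("utf-8") + b"\x00\x00"
    return struct.pack("<i", len(payload)) + payload


def decode_packet(data: bytes) -> tuple[int, int, str, bytes]:
    """Decode one packet from the front of data.

    Returns (req_id, type, body, remaining_bytes).
    """
    (size,) = struct.unpack_from("<i", data, 0)
    req_id, ptype = struct.unpack_from("<ii", data, 4)
    # Body sits between the 12-byte header and the two trailing NULs.
    body = data[12 : 4 + size - 2].decode("utf-8", errors="replace")
    return req_id, ptype, body, data[4 + size :]


# What a failed auth looks like on the wire: an empty type=0 packet first,
# then the type=2 auth response carrying req_id == -1.
stream = (
    encode_packet(7, SERVERDATA_RESPONSE_VALUE, "")
    + encode_packet(-1, SERVERDATA_AUTH_RESPONSE, "")
)
first_id, first_type, _, rest = decode_packet(stream)
auth_id, auth_type, _, rest = decode_packet(rest)
bad_password = auth_type == SERVERDATA_AUTH_RESPONSE and auth_id == -1
```

A real client would read from the socket until a full packet is buffered before decoding; that loop is omitted here.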
Implementation notes:
- Auth handshake quirk verified live: server sends a type=0 empty-body packet before the type=2 auth response. Consume both. req_id == -1 on the auth response = bad password.
- Single TCP connection per query (loopback, ~10-20ms total round-trip; pooling not worth it at this scale).
- Header regex on map : and players : lines (the (hibernating|not hibernating) token is in players :).
- Roster regex: split lines starting with #, skip the column-header line, robustly extract the quoted name + the STEAM_X:Y:Z token + MM:SS or HH:MM:SS connected duration + ping. Tolerate the two-numeric-prefix L4D2 variant (# 2 1 "Crone" STEAM_1:0:...).
- Steam ID conversion: STEAM_X:Y:Z → 76561197960265728 + (Z * 2) + Y (Y is the low bit; returned as string).
l4d2web/services/steam_users.py (new)
Modeled directly on l4d2web/services/steam_workshop.py:17-43 (single requests.Session, 30s timeout). The workshop client uses an anonymous POST with a form-encoded body; GetPlayerSummaries is instead a GET that carries key= and steamids= in the query string.
from dataclasses import dataclass
from typing import Iterable


@dataclass(slots=True, frozen=True)
class SteamProfile:
    steam_id_64: str
    persona_name: str
    avatar_url: str  # avatarmedium


def fetch_profiles_batch(steam_ids: Iterable[str], *, api_key: str) -> list[SteamProfile]: ...
- Endpoint: GET https://api.steampowered.com/ISteamUser/GetPlayerSummaries/v0002/?key=<key>&steamids=<csv>.
- Up to 100 IDs per call; caller batches.
- Returns only successful resolutions (private/deleted accounts simply absent from the response; fine, they stay uncached and the UI falls back).
- Raises on transport errors; caller decides whether to surface.
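Since the caller is responsible for batching, the 100-per-call ceiling could be handled with a small chunking helper like the one below. The names batched_ids and steamids_param are hypothetical, and no network call is made here.

```python
from itertools import islice
from typing import Iterable, Iterator

MAX_IDS_PER_CALL = 100  # GetPlayerSummaries ceiling noted above


def batched_ids(steam_ids: Iterable[str],
                size: int = MAX_IDS_PER_CALL) -> Iterator[list[str]]:
    """Yield de-duplicated Steam IDs in order, at most size per chunk."""
    unique = iter(dict.fromkeys(steam_ids))
    while chunk := list(islice(unique, size)):
        yield chunk


def steamids_param(chunk: list[str]) -> str:
    """Build the comma-separated steamids= query value for one call."""
    return ",".join(chunk)


ids = [str(76561197960265728 + n) for n in range(250)]
chunks = list(batched_ids(ids + ids[:10]))  # duplicates are dropped
sizes = [len(c) for c in chunks]
```

Each chunk would then become one fetch_profiles_batch call.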
l4d2web/services/live_state_poller.py (new)
Modeled on start_state_poller / state_poller_loop in l4d2web/services/job_worker.py:617-647.
def start_live_state_poller(app) -> None: ...  # spawns daemon thread, skipped under TESTING
def live_state_poller_loop(app, interval: float) -> None: ...
def poll_once() -> None:  # one full pass over running servers
    ...
Per-server algorithm:
- RCON status → StatusResponse (or skip on auth/timeout, logged via app.logger).
- Server-level RLE upsert: load newest server_live_state row for this server. If (players, max_players, bots, map, hibernating) matches → UPDATE last_seen_at = now(). Else → INSERT new row.
- Session reconciliation in a single transaction:
  - Load open sessions for this server.
  - For each player in response.roster not in open sessions: INSERT new session with joined_at = now - connected_seconds, name_at_join = roster.name, min_ping = max_ping = roster.ping.
  - For each open session whose player is in the roster: if roster.ping < min_ping or > max_ping, UPDATE the range. Otherwise skip the write.
  - For each open session whose player is not in the roster: UPDATE left_at = now().
- Profile enrichment: collect Steam IDs from the roster where the cached profile is missing or fetched_at < now - 24h. Skip if STEAM_WEB_API_KEY unset. Batch into one Steam API call. Upsert results.
Periodic (every Nth cycle, e.g. once a minute):
- Trim server_live_state and closed sessions past retention.
- Close any open sessions whose server_id hasn't had a successful RCON response in the last STUCK_SESSION_SECONDS (default 60).
Modify: l4d2web/services/l4d2_facade.py:28-52
build_server_spec_payload appends f'rcon_password "{server.rcon_password}"' as the last entry in the returned config list, only if the password is non-empty. Appending (not prepending) matters: Source's cfg semantics are last-wins, so putting our line after both the overlay exec lines and the user's blueprint config guarantees no overlay or blueprint can silently clobber the password and break the poller. l4d2host/instances.py:40-58 already writes spec.config lines verbatim to server.cfg — no host-side change needed.
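The append-last rule can be illustrated in isolation. The helper below is a hypothetical sketch, not the actual build_server_spec_payload; it only shows the ordering and the empty-password case, and the sample cfg lines are invented.

```python
def with_rcon_password(config_lines: list[str], rcon_password: str) -> list[str]:
    """Return config lines with the rcon_password line appended last, if set."""
    if not rcon_password:
        # Empty string means no password configured: emit nothing.
        return list(config_lines)
    return [*config_lines, f'rcon_password "{rcon_password}"']


# Invented example lines standing in for overlay exec + blueprint config.
lines = ["exec overlay.cfg", 'rcon_password "from-blueprint"', "sv_cheats 0"]
final = with_rcon_password(lines, "s3cr3t-token")
```

Because cfg evaluation is last-wins, the appended line overrides the blueprint's own rcon_password entry in this example.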
Modify: server-create route
Wherever the server-create form handler lives (l4d2web/routes/server_routes.py or similar — confirm during implementation): before commit, generate rcon_password = secrets.token_urlsafe(32).
Web UI
Server list (template TBD: ls l4d2web/templates/ during implementation)
Add an inline live-state cell per server row:
- Stopped server: —
- Stale (no row newer than LIVE_STATE_STALE_SECONDS): dim ? with tooltip "no data"
- Hibernating: 0/4 · idle · c1m1_hotel
- Active: 2/4 · c1m2_streets
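The four cell states map onto a small pure function, sketched here under assumptions: live_state_cell is a hypothetical name, the snapshot is modeled as a dict with the server_live_state columns, and styling (dimming, tooltip) is left to the template.

```python
from datetime import datetime, timedelta
from typing import Optional

STOPPED_GLYPH = "\u2014"  # the dash shown for stopped servers


def live_state_cell(running: bool, snapshot: Optional[dict], now: datetime,
                    stale_seconds: int = 30) -> str:
    """Render the list-page cell text for one server."""
    if not running:
        return STOPPED_GLYPH
    cutoff = now - timedelta(seconds=stale_seconds)
    # Fresh means last_seen_at > now - stale_seconds; anything else is stale.
    if snapshot is None or snapshot["last_seen_at"] <= cutoff:
        return "?"
    count = f'{snapshot["players"]}/{snapshot["max_players"]}'
    if snapshot["hibernating"]:
        return f'{count} · idle · {snapshot["map"]}'
    return f'{count} · {snapshot["map"]}'


now = datetime(2024, 1, 1, 12, 0, 0)
idle = {"players": 0, "max_players": 4, "map": "c1m1_hotel",
        "hibernating": True, "last_seen_at": now - timedelta(seconds=3)}
active = {"players": 2, "max_players": 4, "map": "c1m2_streets",
          "hibernating": False, "last_seen_at": now - timedelta(seconds=3)}
cells = [live_state_cell(True, idle, now), live_state_cell(True, active, now)]
```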
No HTMX on the list page; page reload picks up the latest snapshot.
Server detail (l4d2web/templates/server_detail.html)
New section, HTMX-refreshed every LIVE_STATE_POLL_SECONDS (default 5):
<section class="panel"
hx-get="/servers/{{ server.id }}/live-state"
hx-trigger="every 5s"
hx-swap="outerHTML">
<!-- rendered from l4d2web/templates/_live_state.html -->
</section>
The partial renders three blocks:
- Summary: players/max_players · map · idle? plus a small "polled Ns ago" caption.
- Current players (only if non-empty): grid of cards, each <img src="{{ profile.avatar_url or placeholder }}" /> {{ profile.persona_name or session.name_at_join }} · {{ joined_relative }} · ping {{ min }}-{{ max }}ms.
- Recent players (last 30 days, excluding current; only if non-empty): smaller cards, {{ avatar }} {{ persona_name or name_at_join }} · last seen {{ last_seen_relative }}.
New route: GET /servers/<id>/live-state returns the partial. Composition mirrors the existing build-status pattern at l4d2web/templates/_overlay_build_status.html:1-5.
Avatar <img> tags point straight at Steam CDN URLs (avatars.cloudflare.steamstatic.com / avatars.akamai.steamstatic.com). No proxying. Same approach as WorkshopItem.preview_url. Note: confirm the existing CSP allows these hosts; if not, extend it.
No JS framework added — HTMX only.
Config keys
In l4d2web/config.py, plus documented defaults in deploy/templates/etc/left4me/web.env where applicable:
| key | default | purpose |
|---|---|---|
| LIVE_STATE_POLL_SECONDS | 5 | poll interval |
| LIVE_STATE_QUERY_TIMEOUT_SECONDS | 2.0 | per-RCON-query timeout |
| LIVE_STATE_POLL_WORKERS | 4 | thread-pool size for parallel per-server polls |
| LIVE_STATE_STALE_SECONDS | 30 | UI staleness threshold |
| LIVE_STATE_HISTORY_DAYS | 30 | retention for snapshots + closed sessions |
| STUCK_SESSION_SECONDS | 60 | close open sessions whose server has been unreachable for this long |
| STEAM_PROFILE_TTL_SECONDS | 86400 | profile cache TTL |
| STEAM_WEB_API_KEY | "" | from web.env; empty disables enrichment |
Tests
- l4d2web/tests/test_rcon.py: protocol handshake against an in-process TCP fixture: auth-success, auth-failure (req_id == -1), header parse (incl. (hibernating) and (reserved <token>) variants), roster parse (incl. the two-numeric-prefix L4D2 variant), Steam ID conversion.
- l4d2web/tests/test_steam_users.py: request shape (key in querystring, batched ids, 100-per-call ceiling), response parsing, partial response (some IDs missing).
- l4d2web/tests/test_live_state_poller.py: mirror test_state_poller_* at l4d2web/tests/test_job_worker.py:882-952. Cover: iterates only running servers with non-empty rcon_password, RLE upsert (matching state → last_seen_at bump only; differing state → new row), session open with backfilled joined_at, session close on disappearance, ping range expansion, stuck-session close after N failures, drops auth failures silently, respects retention.
- l4d2web/tests/test_server_routes.py (extend): /servers/<id>/live-state fragment route renders summary/current/recent blocks correctly; stale rendering when latest snapshot is old; soft-fail rendering when no profile cached.
- l4d2web/tests/test_l4d2_facade.py (extend): build_server_spec_payload appends rcon_password "..." as the last config line when password is set; omits the line when empty; appears after both the overlay exec lines and the blueprint config lines.
- Migration test: existing rows backfilled with non-empty 43-char passwords; tables created with correct indexes.
Critical files
New:
- l4d2web/services/rcon.py: Source RCON client + status parser
- l4d2web/services/steam_users.py: Steam Web API client (mirrors steam_workshop.py)
- l4d2web/services/live_state_poller.py: background thread + poll loop + session reconciler
- l4d2web/alembic/versions/00XX_server_live_state.py: migration: new column, three new tables, password backfill
- l4d2web/templates/_live_state.html: HTMX-refreshed fragment (summary + current + recent)
- l4d2web/tests/test_rcon.py, l4d2web/tests/test_steam_users.py, l4d2web/tests/test_live_state_poller.py
Modify:
- l4d2web/models.py: add ServerLiveState, ServerPlayerSession, SteamUserProfile; add rcon_password to Server (after line 137)
- l4d2web/services/l4d2_facade.py:28-52: build_server_spec_payload appends rcon_password "..." as the last config line when set
- l4d2web/app.py: call start_live_state_poller(app) next to existing start_state_poller
- l4d2web/routes/server_routes.py (or equivalent; confirm): generate rcon_password in create handler; add GET /servers/<id>/live-state
- l4d2web/templates/server_detail.html: include _live_state.html
- l4d2web/templates/<server-list>.html: confirm filename; add inline badge column
- l4d2web/config.py: register the eight new config keys
- deploy/templates/etc/left4me/web.env: add STEAM_WEB_API_KEY= and any tunables we expose
Reused without changes:
- l4d2web/services/job_worker.py:617-647: daemon-thread / poll-loop pattern reference
- l4d2web/services/steam_workshop.py:17-43: requests.Session + form-POST pattern for Steam Web API
- l4d2host/instances.py:40-58: already writes spec.config verbatim, so no host-side change for password injection
- l4d2web/templates/_overlay_build_status.html: HTMX polling pattern reference
Verification
- Unit tests:
  pytest l4d2web/tests/test_rcon.py l4d2web/tests/test_steam_users.py l4d2web/tests/test_live_state_poller.py -v
  pytest l4d2web/tests -q # full regression
- Migration check:
  alembic upgrade head
  sqlite3 l4d2web.db "SELECT id, name, length(rcon_password) FROM servers;" # every row ~43
  sqlite3 l4d2web.db ".schema server_live_state server_player_session steam_user_profile"
- End-to-end against prod (left4.me):
  - Deploy. Confirm systemctl status left4me-web.service shows no crash-loop and the journal logs start_live_state_poller once.
  - Restart both existing game servers so they pick up the injected password.
  - SQL sanity (web-host shell):
    sqlite3 l4d2web.db "SELECT server_id, started_at, last_seen_at, players, map, hibernating FROM server_live_state ORDER BY server_id, started_at DESC LIMIT 10;"
    Expect a single recent row per server while idle; new rows when players come/go.
  - Connect to one server from the L4D2 client; within 5s, /servers/<id> shows a card with your avatar + persona name + ping range. Disconnect; within 5s the card moves to "recent."
  - sqlite3 l4d2web.db "SELECT * FROM server_player_session WHERE left_at IS NULL;": empty when nobody's connected; one row per current player when someone is.
  - sqlite3 l4d2web.db "SELECT count(*), MIN(fetched_at), MAX(fetched_at) FROM steam_user_profile;": at least one row after a player has been resolved.
- Failure-path checks:
  - Manually corrupt servers.rcon_password for one server; confirm the journal logs auth failure and the row's badge goes stale within LIVE_STATE_STALE_SECONDS; other servers unaffected.
  - Unset STEAM_WEB_API_KEY in web.env, restart web; confirm display still works (in-game names + placeholder avatars), no errors in journal.
  - nft drop the loopback TCP on one server's port; confirm rows stop appearing, open sessions close after STUCK_SESSION_SECONDS, badge goes stale.
Open implementation questions
- Server-list template filename: confirm with ls l4d2web/templates/ once implementation starts.
- Server-create route location: confirm path (likely l4d2web/routes/server_routes.py).
- CSP allowlist for Steam avatar CDNs: check l4d2web/app.py (or wherever security headers live) and extend img-src to include avatars.cloudflare.steamstatic.com, avatars.akamai.steamstatic.com, avatars.steamstatic.com if a CSP is enforced.
- Adaptive backoff for hibernating servers: defer; start with fixed 5s and revisit only if load becomes a concern (which it won't at current server count).
- Migration data step: SQLite alembic batch operation with a Python data step that iterates rows and generates secrets.token_urlsafe(32) per row. Confirm pattern against existing migrations under l4d2web/alembic/versions/.
Deferred to a separate plan
- Generic RCON command execution (changelevel, kick, say, sm_ban, ...)
- Web UI buttons mapped to those commands with CSRF + admin authz
- Audit log table for issued commands
- Player-count history graphs (data already accumulating from this plan)
- Ban UX (lookup by Steam ID, search across server_player_session)