Commit graph

338 commits

Author SHA1 Message Date
mwiegand
62cf6cdd56
spec: handoff for revisiting 1/2/3-user split for left4me
The 2-user split (left4me + l4d2-sandbox) has been inherited as a
constraint across multiple recent plans (idmap-on-mount, build-time-
idmap, helper consolidation) without ever being designed
end-to-end. Three plausible configurations: collapse to 1 user
(rejected for security), keep at 2 users (status quo), or split web
from game into 3 users for blast-radius limiting on either side.

Doc captures the threat-model heuristics, cross-uid file-access
plumbing options (shared group vs. world-read), idmap implications,
a step-by-step migration sketch for the 3-user variant, and explicit
out-of-scope items (per-instance gameserver uids, etc.). Detailed
enough that a future session can pick a configuration and execute
without re-deriving the design space.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 01:58:09 +02:00
mwiegand
28b0ff951b
spec(build-overlay-unit): flag DB-fetch-in-ExecStartPre as an option
The script content lives in the overlays.script DB column and the
unit's %i is the row id, so the worker-writes-script-to-fs step in
the original sketch is duplication. Document three options (worker
writes / unit fetches via helper / pipe to stdin) and recommend the
unit-fetches variant with RuntimeDirectory= auto-cleanup. Promote
this to the top of the open-decisions list since it shapes the
worker, the unit, and whether a fetch-script helper is added.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 01:54:41 +02:00
mwiegand
a9bbc209ae
spec: handoff for replacing script-sandbox helper with template unit
The build-time idmap landing today required a nsenter self-wrap in
left4me-script-sandbox to escape the web app's PrivateTmp namespace
before pre-creating the idmapped staging bind. Working but band-aid:
the helper is reinventing what a systemd template unit would do
declaratively. Mirror the left4me-server@.service pattern with a
build-overlay@.service template — ExecStartPre does the idmap bind in
PID 1's namespace by default, the hardening flags live in the unit
file, ExecStopPost tears down. Worker switches to sudo systemctl start.

Doc captures full proposed unit, worker rewrite sketch, sudoers
update, migration order, verification steps, and the ~5h estimate
so a future session can pick this up cold and execute.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 01:52:57 +02:00
mwiegand
7a25c2453c
fix(left4me-script-sandbox): self-wrap into PID 1's mount namespace
The web service runs with PrivateTmp=true, which puts it in its own
mount namespace. Worker invokes the sandbox helper via sudo from there;
the helper's pre-systemd-run `mount --bind --map-users=...` lands in
the web service's namespace. systemd-run then spawns transient units
in PID 1's namespace where the bind is invisible — the BindPaths lookup
finds an empty staging dir owned by root, and the sandbox uid hits
permission-denied on every write.

Mirror the pattern from left4me-overlay's ExecStartPre wrapper: enter
PID 1's mount namespace at the start of the helper via `nsenter
--mount=/proc/1/ns/mnt`. Sentinel env var avoids exec recursion. The
gameserver helper handles this at the unit level; the script helper
doesn't have a unit so we self-wrap.

Diagnosis: 5 failed builds all hit the same EACCES on the first
`mkdir`/`tar mkdir`. Direct SSH-sudo invocations of the same helper
succeeded because SSH-sudo doesn't inherit a private namespace; only
the worker-invoked path is affected.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 01:33:13 +02:00
mwiegand
48381089d3
refactor(left4me-overlay): move uid translation to script-sandbox build
left4me-script-sandbox now pre-creates an idmapped bind staging path
(--map-users=<left4me_uid>:<sandbox_uid>:1) and points the sandbox's
BindPaths at that staging instead of the raw overlay dir. Writes from
inside the sandbox (uid l4d2-sandbox) land on disk as left4me, so all
overlay content is uniformly left4me-owned end-to-end.

left4me-overlay loses ~165 lines of idmap-on-mount logic: the per-
lowerdir stat + idmap-bind setup, the bind-umount loop in teardown,
the uid lookup helpers, the _is_mountpoint /proc/self/mountinfo parser,
and the LEFT4ME_TEST_* env-var stubs. It's back to a simple "validate
lowerdirs, mount overlay" shape; gameserver mount path no longer needs
to know about producer-side ownership decisions.

Verified on kernel 6.12 that the kernel idmap propagates through
systemd-run's plain re-bind of the staging path. Tests dropped 4
idmap-on-mount specs and one deploy-artifact regression check; added
test_script_sandbox_uses_idmap_staging to pin the new staging path
+ map flags + trap cleanup.

The post-build world-read chmod kludge in the sandbox is also dropped:
the web app reads overlay files via its primary uid (left4me).

Existing overlays on the test server are sandbox-owned from prior runs
and need a one-shot `chown -R left4me:left4me /var/lib/left4me/overlays`
during deploy. New overlays produced by the refactored sandbox are
left4me-owned from creation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 01:20:39 +02:00
mwiegand
bc25d423aa
plan(left4me): move idmap from gameserver mount to script-sandbox build
Architectural cleanup: the uid translation is a build-time concern
(the sandbox produces sandbox-uid files); having the gameserver path
unwind that producer-side decision on every mount means the mount
helper carries idmap lifecycle code it shouldn't need. Moving the
idmap into the script-sandbox bind makes files land left4me-owned on
disk, drops ~140 lines from left4me-overlay, and makes all overlay
content (workshop + script-built) consistent on-disk.

Verified on left4.me kernel 6.12.86 that the kernel idmap propagates
through plain re-bind, so systemd-run's BindPaths can wrap a
pre-created idmapped staging path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 01:15:46 +02:00
mwiegand
dd918aca4b
fix(left4me-overlay): use /proc/self/mountinfo to detect bind mounts
os.path.ismount() compares st_dev against the parent dir, which silently
returns False for same-fs bind mounts. The idmap binds at runtime/<n>/
idmap/<basename> are exactly that case, so:

- cmd_umount skipped the bind-umount step every stop, leaving orphan
  binds in PID 1's mount namespace.
- cmd_mount's idempotency check then "didn't see" the orphan and
  re-bound on top, accumulating one mount per start/stop cycle.

Findmnt nesting like
    /var/lib/left4me/runtime/2/idmap/overlays_9
    └─/var/lib/left4me/runtime/2/idmap/overlays_9
is the visible symptom. Reboot wipes everything so the bug is invisible
on a fresh boot — only stop/start cycles accumulate.

Replace both ismount sites with a _is_mountpoint() helper that reads
/proc/self/mountinfo (column 5 is the mount point). Keep os.path.ismount
for the overlay merged check, where it's reliable (distinct fs type).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 01:02:18 +02:00
mwiegand
2b20bffeb8
spec: handoff doc for rethinking deploy/ dir architecture
The 2026-05-15 script-consolidation pass landed a working but
half-finished mental model: deploy/files/ was retroactively promoted
from "historical reference" to "canonical source," but only for the
script files. Several adjacent things (sudoers/sysctl duplication
across both repos, the systemd unit files that ckn-bw's reactor
ignores, deploy-test-server.sh's role, dead-code apply-cake) didn't
get resolved. Capture the open questions and pointers so a future
session can pick this up and commit to a coherent shape.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 00:53:55 +02:00
mwiegand
f5e36eef79
deploy: claim /usr/local/sbin/left4me admin CLI in deploy/files
ckn-bw was shipping the admin CLI wrapper (sudo left4me <flask
subcommand>) verbatim from its own bundle copy. Move ownership of the
file into left4me so ckn-bw's upcoming install-action approach can
deploy it from deploy/files/usr/local/sbin/left4me on the deployed
git checkout, eliminating the cross-repo duplication that masked the
idmap helper update earlier.

Also re-frame deploy/README.md: deploy/files/, deploy/templates/, and
deploy/tests/ are now genuinely canonical (read by ckn-bw via
git_deploy). Only deploy-test-server.sh remains a superseded artifact.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 00:41:06 +02:00
mwiegand
f231ebcb0d
doc(deploy): clarify ckn-bw verbatim-sync workflow for shipped files
Spell out that the deploy step for changes to verbatim-shipped files
(privileged helpers, sudoers, sysctl, …) is just re-syncing the bundle's
copy + bw apply. Removes ambiguity for the idmap helper change and any
future edit within the same set.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-14 23:57:31 +02:00
mwiegand
e4101de7a5
test(deploy): assert left4me-overlay idmaps sandbox-owned lowerdirs
Guards against silent regression of the idmap bind-mount step in the
privileged kernel-overlayfs helper. Asserts --map-users / --map-groups
argv, the runtime/<name>/idmap/ target path, the LEFT4ME_TEST_* stub-
env-var names, and the collision-detection table.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-14 23:56:36 +02:00
mwiegand
90531864b3
harden(left4me-overlay): fix idmap collision risk, gate test stubs on PRINT_ONLY, wrap os.stat
Issue #1: idmap target now uses parent+name (overlays_workshop instead of
workshop) to prevent basename collisions across allowlist roots; explicit
die() on collision detected in the loop.

Issue #2: env-var uid stubs (renamed to LEFT4ME_TEST_SANDBOX_UID etc.) are
only honoured when LEFT4ME_OVERLAY_PRINT_ONLY=1, so a misconfigured systemd
unit override cannot influence real uid mapping.

Issue #3: os.stat(lowerdir) is wrapped in try/except OSError with a die()
that shell-quotes the path and includes the exception, matching the helper's
existing error style.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-14 23:53:32 +02:00
mwiegand
2f6a9cfba0
feat(left4me-overlay): idmap bind mounts for l4d2-sandbox-owned lowerdirs
Insert an idmapped bind mount in front of each lowerdir whose top-level
uid matches l4d2-sandbox at overlay-mount time, so that overlayfs copy-up
produces left4me-owned upperdir entries instead of EACCES.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-14 23:48:07 +02:00
mwiegand
3a2c379b71
plan(left4me-overlay): idmap lowerdir bind mounts for cross-uid copy-up
Persist the implementation plan for adding idmapped bind mounts to
left4me-overlay so that overlay copy-up from l4d2-sandbox-owned lower
layers (script-built overlays) produces left4me-owned upperdir entries
the gameserver can write. Mechanism verified end-to-end on ovh.left4me
in a temp dir on 2026-05-14.
2026-05-14 23:42:36 +02:00
mwiegand
bbb2b983bc
harden(l4d2web): per-username login rate limit alongside per-IP
A 20-attempts-per-60s budget keyed by IP doesn't slow a distributed brute force that rotates source IPs. Add a parallel per-username bucket with the same threshold so a single account can't burn through more than 20 failed logins/min regardless of where they come from. Empty usernames aren't bucketed (would DoS the anonymous 401 path). Successful login clears both buckets.
2026-05-14 22:26:20 +02:00
mwiegand
0e2a78e065
secure(l4d2web): block non-admin writes on system overlays; last-admin guard on deactivate
_load_files_overlay docs already promised "owner or admin" for mutations, but the check only filtered by overlay.type — system overlays (user_id IS NULL) were writable by any logged-in user. Add the explicit 403 for non-admins; read-only routes remain open across all overlay types.

Mirror the delete-route last-admin guard on /admin/users/<id>/deactivate so a future auth-model change (service accounts bypassing require_admin, etc.) can't accidentally lock out the system.
2026-05-14 22:24:19 +02:00
mwiegand
74b7f61437
harden(l4d2web): default security response headers and generic error handlers
- after_request hook sets X-Content-Type-Options=nosniff, X-Frame-Options=DENY, Referrer-Policy=strict-origin-when-cross-origin, and a strict CSP (default-src 'self', script-src self+nonce, frame-ancestors 'none', form-action 'self'); HSTS added on secure non-test responses
- per-request CSP nonce minted in g.csp_nonce; servers.html's inline showModal script picks it up
- 404 and 500 handlers return short plain-text responses so a misbehaving deployment can't leak tracebacks via Werkzeug's debug page
2026-05-14 22:21:36 +02:00
mwiegand
2902c9cc82
harden(l4d2web): auth/session — clear on login+logout, constant-time CSRF, role-change invalidation
- login_user clears any pre-login session state before stamping user_id/pw_changed_at/admin so a fixated cookie value cannot smuggle data past the login boundary
- logout_user now session.clear()s instead of only popping user_id, removing leftover pw_changed_at/admin markers
- CSRF token comparison uses hmac.compare_digest
- load_current_user rejects sessions where the stamped admin flag no longer matches the user row, preventing a demoted admin from retaining elevated access until next password change (backward-compatible: sessions issued pre-upgrade lack the marker and pass through until next login)
2026-05-14 22:18:46 +02:00
mwiegand
66d14feca5
refactor(l4d2-web): harden console-history.js against HTMX version drift and races
- pendingCommand captured in htmx:beforeRequest (not requestConfig).
- ensureLoaded shares a single inflight Promise across concurrent calls.
- Document why synthetic null-id entries are safe in the cache.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:42:05 +02:00
mwiegand
6f49efd44a
feat(l4d2-web): console panel UI on server detail page
- _console_line.html: command + reply, error variant, "(no reply)" placeholder.
- server_detail.html: console section between Live State and Files, replays
  last 50 history rows server-side; HTMX form appends new lines via hx-swap.
- console-history.js: ArrowUp/Down recall against /console/history JSON;
  scroll-to-bottom on load and after each new line.
- CSS: fixed-height scrolling transcript, terminal-ish styling, spinner via
  HTMX in-flight class.
- test_console_routes.py: update 4 assertions from legacy [ERROR] literal
  to console-error CSS class (matches new semantic markup).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:39:21 +02:00
mwiegand
ecc4aa28c6
refactor(l4d2-web): tighten console route limit test and dedupe is_error
- ?limit clamp test now actually verifies the clamp instead of just
  passing through 5 rows.
- Single is_error assignment per branch, single db.add path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:35:22 +02:00
mwiegand
553b280e40
feat(l4d2-web): backend for RCON console with persisted transcript
- POST /servers/<id>/console runs a command via rcon.execute_command and
  persists every outcome (success / empty / error) to command_history.
- GET /servers/<id>/console/history returns paginated newest-first JSON
  for client-side up-arrow recall.
- server_detail() now passes the last 50 history rows as console_history
  for server-side replay on page load.
- 404 on ownership mismatch — no admin override.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:32:13 +02:00
mwiegand
c4dffd471b
feat(l4d2-web): add command_history table for RCON console transcript
A row per RCON command execution: (user, server, command, reply, is_error,
created_at). Composite index on (user_id, server_id, id) supports the only
query shape — "latest N for this user+server", id DESC.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:26:56 +02:00
mwiegand
9ef9ffdbde
chore(l4d2-web): clarify rcon req_id constants and helper docstring
Add comment noting _EXEC_REQ_ID/_MARKER_REQ_ID are arbitrary client-chosen
values unrelated to SERVERDATA_* packet-type constants. Update _connect_and_auth
docstring to accurately reflect that OSError/socket.timeout propagate raw from
post-connect send/recv, while only connect failure is wrapped in RconError.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 21:24:41 +02:00
mwiegand
085fd714a5
feat(l4d2-web): add execute_command to rcon service with full test coverage
Extracts _connect_and_auth helper from query_status, adds execute_command
using the trailing-marker pattern for multi-packet reassembly, and covers
all paths (happy path, multi-packet, empty reply, auth failure, timeout,
input validation, marker drain) with 10 new tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:21:41 +02:00
mwiegand
1d3eb51871
docs(plan): RCON console on server detail page
Plan for adding a per-server RCON console: HTMX append-swap input form,
fixed-height scrolling transcript replayed from CommandHistory on load,
multi-packet response handling, owner-only access, 30s timeout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:14:06 +02:00
mwiegand
6cc1736f17
feat(l4d2-web): add hostname edit form to server detail page 2026-05-13 15:42:46 +02:00
mwiegand
963851c0e1
feat(l4d2-web): emit hostname in spec config with ephemeral fallback 2026-05-13 15:31:12 +02:00
mwiegand
d42383dc37
chore: add dev.db and opencode.json to gitignore 2026-05-13 14:29:57 +02:00
mwiegand
69d93dda4f
feat(l4d2-web): accept hostname on server update, default empty on create 2026-05-13 14:29:53 +02:00
mwiegand
0a7f48f174
feat(l4d2-web): add hostname column to Server model 2026-05-13 14:26:14 +02:00
mwiegand
f3f0a8927a
docs: add server hostname implementation plan 2026-05-13 14:21:30 +02:00
mwiegand
fcf3143b39
docs: add server hostname cvar design spec 2026-05-13 14:19:57 +02:00
mwiegand
fe43f67b51
feat: include password-reveal.js in base template 2026-05-13 11:37:47 +02:00
mwiegand
ab83f5fd2b
feat: add RCON password row to server detail page 2026-05-13 11:37:28 +02:00
mwiegand
d9aa6bd395
feat: add password reveal toggle JS 2026-05-13 11:36:40 +02:00
mwiegand
e75feb0649
docs: add rcon password display implementation plan 2026-05-13 11:36:08 +02:00
mwiegand
358a835d65
docs: add rcon password display design spec 2026-05-13 11:35:46 +02:00
mwiegand
d113b7821c
fix(live-state): remove loading=lazy from avatars to fix Firefox/Safari flash
Firefox and Safari defer lazy images by one paint cycle even when cached,
causing a blank frame on each innerHTML swap. These avatars are always
in-viewport and cached after the first poll, so lazy loading has no benefit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 23:34:53 +02:00
mwiegand
175e4e653c
fix(live-state): eliminate flash on poll by switching to innerHTML swap
outerHTML removes and re-inserts the section on each tick, causing a
blank frame. Keeping the <section> as a stable DOM container and
swapping only innerHTML means avatars and text update in-place without
any teardown/reconstruct cycle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 23:26:54 +02:00
mwiegand
096d18ac64
feat(live-state): use Steam avatarfull (184x184), downscale in CSS
64x64 avatarmedium looked soft on high-DPI screens. Switching the
GetPlayerSummaries field to avatarfull (184x184) and constraining
display size to 64px via .live-state .avatar gives sharp rendering on
retina/4k panels at the cost of a slightly larger CDN fetch (still
hot-linked, so no proxying cost).

Also adds the previously-missing CSS for the live-state player grid:
avatar+name+meta arranged in a tight 2-column grid per card, link
spans the avatar+name so the meta stays non-interactive.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 23:17:51 +02:00
mwiegand
6cbe7dc9f2
feat(live-state): link player cards to their Steam profile
Wraps avatar + persona name in an a[href=steamcommunity.com/profiles/<id>]
in both the Current and Recent blocks. Steam auto-redirects to the user's
vanity URL on follow, so we don't need to store profileurl separately.

target=_blank + rel=noopener noreferrer to keep the dashboard page in
place when a link is followed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 22:51:50 +02:00
mwiegand
674c4df360
deploy: add STEAM_WEB_API_KEY to web.env template
For the live-state panel's Steam profile enrichment (persona names +
avatars). Optional: empty value disables enrichment and the panel falls
back to in-game names + placeholder avatars.

The actual web.env is materialized by the ckn-bw bundle's Mako; the
template here documents the operator-facing shape.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 22:25:03 +02:00
mwiegand
37a9ad68a2
fix(live-state): cast poll_seconds to int for HTMX hx-trigger
HTMX's hx-trigger="every Ns" syntax does not accept fractional seconds —
a config override like 7.5 would render every 7.5s and silently break
auto-refresh. Floor to int with a 1s minimum.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 22:23:15 +02:00
mwiegand
9aaa26d9a9
feat(servers): add live-state panel with current and recent players
HTMX-refreshed /servers/<id>/live-state fragment renders snapshot
summary, current players with avatars/ping, and recent-player history;
server_detail.html bootstraps it via hx-trigger="load".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 22:20:01 +02:00
mwiegand
b00a3cceea
test(live-state): assert stale server's map is not rendered in the badge
Closes the negative-assertion gap from the Task 10 review: without this
check, a regression that drops the freshness guard would still pass the
positive 2/4 + c1m2_streets assertions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 22:17:02 +02:00
mwiegand
072d9f78e7
feat(servers): show live counts + map badge in server list
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 22:14:57 +02:00
mwiegand
0dc61d5de4
feat(live-state): start daemon poller, prune history, close stuck sessions
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 22:10:55 +02:00
mwiegand
be476112ee
feat(live-state): enrich roster with cached Steam profiles
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 22:02:58 +02:00
mwiegand
33899f8c17
feat(live-state): reconcile player sessions on each poll
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-12 21:58:30 +02:00