left4me

Author	SHA1	Message	Date
mwiegand	15c620f95c	spec(deployment-responsibility): handoff for brainstorming the deploy split The hardening refactor + uid-collapse make the "what does left4me own vs. ckn-bw own" question more pointed. The 2026-05-06 deployment design already framed this: deploy/files/ in left4me mirrors target paths, configmgmt integrates. Some artifacts have drifted into the ckn-bw reactor since (systemd unit emissions, sysctl defaults); the brainstorming session reconciles. Sequenced after uid-collapse. Self-contained for a fresh Claude session to read cold via superpowers:brainstorming. Session-handoff updated to point at this as the next-next queued work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 15:56:38 +02:00
mwiegand	8971b23617	refactor(sandbox): collapse l4d2-sandbox user into left4me The hardening refactor that just landed closes the same-uid attack surface (FS view, ptrace, /proc visibility, signals) for the web + gameserver units via systemd directives plus system-wide kernel.yama.ptrace_scope=2. Keeping the script-sandbox on a separate uid was the inconsistent half-step: defense-in-depth only, with build-time-idmap complexity attached. One principle wins: harden once, share the uid. scripts/libexec/left4me-script-sandbox: drop the idmap block (uid lookups, STAGING setup, cleanup_staging trap, mount --bind --map-users), switch User=/Group= to left4me, point BindPaths at $OVERLAY_DIR directly. Header comment updated to reflect hardening-not-uid as the same-uid defense. nsenter self-wrap kept; it's about mount-namespace escape, not uid. Tests + comments + companion docs updated. Build-time-idmap and overlay-idmap plans marked SUPERSEDED; user-uid-split spec revised to "1 user is correct"; one-line update notes on the hardening specs and the build-overlay-unit-design. Companion ckn-bw commit removes the l4d2-sandbox user + group and changes /var/lib/left4me from 0711 → 0755 (the traverse-only mode was specifically for the sandbox uid).	2026-05-15 15:50:57 +02:00
mwiegand	146cb01450	plan(uid-collapse): drop l4d2-sandbox user; handoff to next session Approved-but-not-executed plan to collapse the two-user model (left4me + l4d2-sandbox) into one. The build-time-idmap that translates sandbox writes back to left4me uid becomes a no-op when source uid == target uid, so it's removed along with ~30 lines of helper plumbing. Hardening already covers the same-uid attack surface the sandbox uid was defending against, so collapsing makes the architecture consistent with the web/server hardening-only decision. Plan: docs/superpowers/plans/2026-05-15-uid-collapse.md Handoff: docs/superpowers/specs/2026-05-15-session-handoff.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 15:39:51 +02:00
mwiegand	f5f8db84ef	spec(session-handoff): hardening refactor landed and verified on left4.me 12-task subagent-driven refactor complete. left4me-server@1: 7.5 → 1.3 systemd-analyze. left4me-web: 8.7 → 4.1. All 6 Test 8 attack vectors blocked post-deploy. One acceptable SECCOMP audit line per gameserver restart (Breakpad's ptrace fork, blocked by design). Test tooling (gdb, seccomp, libseccomp-dev) apt-removed from left4.me. uid-split spec marked superseded. No queued follow-up. Adjacent work: build-overlay-unit refactor and the deferred drop-in / configmgmt-responsibility reshape. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 15:17:06 +02:00
mwiegand	f615d0de75	spec(user-uid-split): mark superseded by the hardening refactor The 1/2/3-user question is answered: stay at 2 (left4me + l4d2-sandbox). The defenses that motivated a 3-user split (cross-uid ptrace, cross-server contamination, web-side reach into gameserver state, DB/env exposure to srcds) are closed by the systemd hardening composition: PrivateUsers + PrivatePIDs + TemporaryFileSystem + SystemCallFilter=~@debug + empty CapabilityBoundingSet. The residual filesystem-ACL surface (mode 0640 root:left4me on DB and web.env) is noted as a separate concern — covered for the current deployment shape, revisit if shape changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 14:59:13 +02:00
mwiegand	37309ba399	spec(hardening-test-plan): fix four bugs surfaced by executor Four corrections noted by the test plan's executor in commit `461b8d0`: - PID-lookup race: pgrep+head can pick the wrong instance. Replace with systemctl show -p MainPID --value left4me-server@N.service. - gdb-from-host ptrace check: nsenter into only the mount namespace with root caps bypasses the SECCOMP filter, so the test is a false positive. Replace with systemd-run-with-same-directives probe, or syscall-filter inspection. - D5 pgrep pattern: 'srcds_linux.*\@2' doesn't match because @N is in the unit name, not argv. Use systemctl show -p MainPID. - scmp_sys_resolver is in the seccomp package on Debian 13, not libseccomp-dev. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 14:58:46 +02:00
mwiegand	7c64910c90	spec(hardening-refactor): resolve emitter open items Verified during plan execution that the ckn-bw systemd-bundle emitter handles tuples and empty values as expected. SocketBindAllow port range hard-coded since systemd directive variable substitution is not universal. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 14:39:11 +02:00
mwiegand	b1293f9952	plan(hardening-refactor): implementation plan against the proven composition 12 tasks across left4me + ckn-bw: emitter verification, three Python constants in the systemd_units reactor, spread into both managed units, sysctl drop-in, annotated reference units, four spec bug fixes, mark uid-split spec superseded, cross-repo push, bw apply + verify on host, apt-remove test tooling. Each task has bite-sized steps with exact commands and expected output. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 14:25:25 +02:00
mwiegand	81dc29a9c3	spec(hardening-refactor): revise design — inline-in-reactor, defer drop-in reshape Going back to the inline-in-reactor shape: hardening directives land in ckn-bw's systemd_units reactor as shared Python dicts (HARDENING_COMMON + HARDENING_SERVER + HARDENING_WEB), spread into each unit's Service block. Educational reference units in deploy/files/.../*.service stay and get per-directive comments. Operator discipline hand-syncs the reference to the reactor; no CI drift test. The broader responsibility reshape — hardening drop-ins living in left4me with ckn-bw as a thin file-shipper — is worth pursuing as a separate dedicated session, not bundled into this refactor. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 14:16:02 +02:00
mwiegand	3256ed2ab1	spec(hardening-refactor): design — drop-ins owned by left4me, ckn-bw deploys Hardening composition is application knowledge (which paths to bind, that srcds is i386, what breaks sudo). It belongs in the left4me repo as drop-in .conf files under deploy/files/etc/systemd/system/<unit>.d/. ckn-bw shrinks: keeps the base units in its reactor, removes the hardening keys, ships the drop-ins to /etc/systemd/system/. Existing educational reference units in deploy/files/.../*.service are deleted in favor of the drop-ins, which carry per-directive comments. Broader configmgmt-responsibility reshape (base units leaving the reactor) deliberately deferred to a future session. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 14:05:38 +02:00
mwiegand	152c313315	spec(session-handoff): point next session at hardening-refactor plan The prior handoff pointed this session at running the test plan; that's done (commit `461b8d0`). Update the handoff to point the next session at writing docs/superpowers/plans/2026-MM-DD-hardening-refactor.md against the proven composition, including the two amendments (x86 arch, PrivatePIDs) and the MDW permanent exclusion. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 13:43:37 +02:00
mwiegand	461b8d028f	spec(hardening): test plan executed on left4.me — results recorded Ran the 11-test plan against left4me-server@1 (canary) and left4me-web on left4.me / Debian 13 / systemd 257. Cleaned up all unit drop-ins; kept the Test 9 sysctl (kernel.yama.ptrace_scope=2) per spec. Outcomes: - server@1 systemd-analyze: 7.5 EXPOSED → 1.3 OK - left4me-web systemd-analyze: 8.7 EXPOSED → 4.1 OK - All 8 attack vectors in Test 8 (D1.a-c, D2.a-c, D3, D5) blocked - Test 6 (MemoryDenyWriteExecute) fails as predicted — Source engine i386 .so files have text relocations; exclude from final composition. - Test 11 (24-48h soak) skipped per operator decision. Two amendments to the spec's proposed composition required for the refactor: - SystemCallArchitectures=native x86 (not bare 'native') — srcds_linux is i386, the kernel kills every native-only call. - PrivatePIDs=true added — ProtectProc=invisible alone cannot hide gunicorn from srcds because both run as uid 980; PrivatePIDs gives each instance its own PID namespace and closes D2.b. Spec bugs surfaced and documented in the "Output" section: PID lookup via pgrep (race vs. instance), Test 4/10 gdb-from-host doesn't actually exercise the unit's SECCOMP filter, Test 8 D5 pgrep pattern won't match. Tooling note corrected: scmp_sys_resolver is in 'seccomp' package, not 'libseccomp-dev'. Next session: write docs/superpowers/plans/2026-MM-DD-hardening-refactor.md against the proven composition; supersede the uid-split spec. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 13:39:50 +02:00
mwiegand	1df811e62a	spec(hardening): threat model + defenses survey + test plan; pivot handoff Reframe the queued uid-split decision into a broader hardening analysis. Audit found the same-uid attack surface (DB readable from srcds, ptrace allowed, RCON stored plaintext) is closable by either uid split or systemd directive composition; the three specs ground that choice in a threat model, survey the defenses, and lay out a self-contained test plan to run on left4.me next. uid-split spec deferred pending results. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 13:07:40 +02:00
mwiegand	9a2ab974e6	spec: session handoff pointing next session at uid-split Short companion to the existing topic-specific handoff docs. Captures the situationally-fresh state at the end of the 2026-05-15 deploy-dir-rethink + janitorial sweep so a fresh session can pick up cold: what just landed, what's next (uid-split), what's NOT next (build-overlay-unit, until uid-split decides), and the decision-relevant signals that emerged during this session — mostly that the 2-uid model was freshly load-bearing in the build-time-idmap work and that srcds hardening already covers most of what a gameserver-uid split would add. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 12:17:55 +02:00
mwiegand	4aa69c2461	spec(janitorial): mark items 8, 9 resolved after on-host verification Both items were operational verifications (not code changes) against the deployed test host ovh.left4me (141.95.32.8). Item 8: orphan idmap binds in PID 1's mount namespace. `sudo findmnt --task 1 -o TARGET | grep /var/lib/left4me/runtime/.*/idmap/` returned zero matches with left4me-server@{1,2}.service both active. Either swept earlier or never appeared on this host; nothing to umount. Item 9: Optimized Settings (overlay 8) files-overlay sanity. Dir is left4me:left4me end-to-end; `sudo find /var/lib/left4me/overlays/8 -type f -uid 981` returned empty. The invariant "files-overlays are populated by the web app as left4me, never through the sandbox helper" holds. Remaining live janitorial items: 7 (conditional on the build-overlay-unit refactor) and 10 (SourceMod 1.13 calendar reminder, ~late 2026). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 12:14:34 +02:00
mwiegand	8f30dd7754	docs: correct stale bubblewrap references in v1 spec + live docstring Janitorial item 6 in 2026-05-15-janitorial-cleanup.md. The v1 sandbox design (2026-05-08-l4d2-script-overlays-design.md) was approved 2026-05-08 and superseded the same day by the v2 systemd-only design (2026-05-08-l4d2-script-sandbox-v2-systemd.md). The current left4me-script-sandbox helper uses systemd-run in service-unit mode; no bwrap binary is invoked. The v1 spec still described bubblewrap as the engine. - v1 spec gets a top-of-file banner pointing at v2 as the supersede. Body preserved; the rest of the v1 design (overlay-type unification, resource caps, helper auth) is still valid — only the sandbox engine changed. - l4d2web/services/overlay_builders.py: ScriptBuilder docstring "bubblewrap + systemd-run" → "hardened systemd-run transient service" (the as-built reality). - scripts/tests/test_script_sandbox.py: stray "/bwrap" in a comment cleaned up. Negative regression assertions (`assert "bwrap" not in text`) intentionally retained as the guard against accidental re-introduction. - Plan docs left untouched (historical action snapshots). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 12:12:31 +02:00
mwiegand	160911fbca	spec(deploy-dir-rethink): plan + mark adjacent specs resolved Adds the implementation plan that landed in the preceding commit (2026-05-15-deploy-dir-rethink.md) under docs/superpowers/plans/, and marks the two related specs: - 2026-05-15-deploy-dir-rethink-design.md (the source handoff) gets a "Resolved by …" banner at the top with a one-paragraph summary of the decisions taken. Body preserved for archaeology. - 2026-05-15-janitorial-cleanup.md gets a status banner noting that items 1, 3, 4, 5 are fully resolved by the deploy-dir-rethink plan and item 2 is partially resolved with a third option the original enumeration didn't list: only the truly-dead two static units (cake.service, nft-mark.service) deleted, the reactor-emitted set (server@, web, workshop-refresh.{service,timer}, slices) retained as curated examples. Resolved items left in place but flagged. Remaining live janitorial items: 6 (bubblewrap doc drift), 7 (conditional on build-overlay-unit refactor), 8 (operational idmap bind cleanup), 9 (Optimized Settings overlay verification), 10 (SM 1.13 calendar reminder). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-15 12:05:53 +02:00
mwiegand	e38b844978	docs: janitorial cleanup checklist + L4D2 server cvar reference Two follow-ups bundled into a single commit: - docs/superpowers/specs/2026-05-15-janitorial-cleanup.md collects the "do later" small TODOs that surfaced across the recent idmap + consolidation work: dead cake-related artifacts, obsolete static systemd units in deploy/files/, the bubblewrap→systemd-run doc drift, stale gameserver-side idmap binds on un-checked instances, calendar reminder for SM 1.13 stable. Each item is small and self-contained. - docs/l4d2-server-cvar-reference.md captures the research from the early-session L4D2 cvar deep-dive: tickrate sweet spots, nb_update_frequency cheat-protection + sm_cvar workaround, cvars that don't exist in L4D2 (net_maxcleartime, z_resolve_zombie_collision_multiplier per RCON probe), recommended plugins, MetaMod/SourceMod branch tracking, and the empirically-verified idmap-propagation-through-rebind kernel-6.12 quirk. Reference material, not a spec; lives at docs/ root. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-15 02:05:12 +02:00
mwiegand	a450491a90	spec(uid-split): note these are system units, not user units Explicit clarification so the next agent doesn't go looking for user-unit friction. left4me-server@.service and left4me-web.service are system units that drop to User=left4me; the 3-user split is a literal one-line edit per unit. No lingering, no pam_systemd, no per-user systemd instance bootstrap. The privileged ExecStartPre/ExecStopPost steps stay root via the + prefix. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-15 01:59:56 +02:00
mwiegand	62cf6cdd56	spec: handoff for revisiting 1/2/3-user split for left4me The 2-user split (left4me + l4d2-sandbox) has been inherited as a constraint across multiple recent plans (idmap-on-mount, build-time-idmap, helper consolidation) without ever being designed end-to-end. Three plausible configurations: collapse to 1 user (rejected for security), keep at 2 users (status quo), or split web from game into 3 users for blast-radius limiting on either side. Doc captures the threat-model heuristics, cross-uid file-access plumbing options (shared group vs. world-read), idmap implications, a step-by-step migration sketch for the 3-user variant, and explicit out-of-scope items (per-instance gameserver uids, etc.). Detailed enough that a future session can pick a configuration and execute without re-deriving the design space. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-15 01:58:09 +02:00
mwiegand	28b0ff951b	spec(build-overlay-unit): flag DB-fetch-in-ExecStartPre as an option The script content lives in the overlays.script DB column and the unit's %i is the row id, so the worker-writes-script-to-fs step in the original sketch is duplication. Document three options (worker writes / unit fetches via helper / pipe to stdin) and recommend the unit-fetches variant with RuntimeDirectory= auto-cleanup. Promote this to the top of the open-decisions list since it shapes the worker, the unit, and whether a fetch-script helper is added. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-15 01:54:41 +02:00
mwiegand	a9bbc209ae	spec: handoff for replacing script-sandbox helper with template unit The build-time idmap landing today required a nsenter self-wrap in left4me-script-sandbox to escape the web app's PrivateTmp namespace before pre-creating the idmapped staging bind. Working but band-aid: the helper is reinventing what a systemd template unit would do declaratively. Mirror the left4me-server@.service pattern with a build-overlay@.service template — ExecStartPre does the idmap bind in PID 1's namespace by default, the hardening flags live in the unit file, ExecStopPost tears down. Worker switches to sudo systemctl start. Doc captures full proposed unit, worker rewrite sketch, sudoers update, migration order, verification steps, and the ~5h estimate so a future session can pick this up cold and execute. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-15 01:52:57 +02:00
mwiegand	bc25d423aa	plan(left4me): move idmap from gameserver mount to script-sandbox build Architectural cleanup: the uid translation is a build-time concern (the sandbox produces sandbox-uid files); having the gameserver path unwind that producer-side decision on every mount means the mount helper carries idmap lifecycle code it shouldn't need. Moving the idmap into the script-sandbox bind makes files land left4me-owned on disk, drops ~140 lines from left4me-overlay, and makes all overlay content (workshop + script-built) consistent on-disk. Verified on left4.me kernel 6.12.86 that the kernel idmap propagates through plain re-bind, so systemd-run's BindPaths can wrap a pre-created idmapped staging path. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-15 01:15:46 +02:00
mwiegand	2b20bffeb8	spec: handoff doc for rethinking deploy/ dir architecture The 2026-05-15 script-consolidation pass landed a working but half-finished mental model: deploy/files/ was retroactively promoted from "historical reference" to "canonical source," but only for the script files. Several adjacent things (sudoers/sysctl duplication across both repos, the systemd unit files that ckn-bw's reactor ignores, deploy-test-server.sh's role, dead-code apply-cake) didn't get resolved. Capture the open questions and pointers so a future session can pick this up and commit to a coherent shape. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-15 00:53:55 +02:00
mwiegand	3a2c379b71	plan(left4me-overlay): idmap lowerdir bind mounts for cross-uid copy-up Persist the implementation plan for adding idmapped bind mounts to left4me-overlay so that overlay copy-up from l4d2-sandbox-owned lower layers (script-built overlays) produces left4me-owned upperdir entries the gameserver can write. Mechanism verified end-to-end on ovh.left4me in a temp dir on 2026-05-14.	2026-05-14 23:42:36 +02:00
mwiegand	1d3eb51871	docs(plan): RCON console on server detail page Plan for adding a per-server RCON console: HTMX append-swap input form, fixed-height scrolling transcript replayed from CommandHistory on load, multi-packet response handling, owner-only access, 30s timeout. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 21:14:06 +02:00
mwiegand	f3f0a8927a	docs: add server hostname implementation plan	2026-05-13 14:21:30 +02:00
mwiegand	fcf3143b39	docs: add server hostname cvar design spec	2026-05-13 14:19:57 +02:00
mwiegand	e75feb0649	docs: add rcon password display implementation plan	2026-05-13 11:36:08 +02:00
mwiegand	358a835d65	docs: add rcon password display design spec	2026-05-13 11:35:46 +02:00
mwiegand	83d2a9932c	refactor(rcon): harden _parse_duration; surface fixture handler errors - _parse_duration wraps int() in try/except so malformed connected durations raise RconError (not ValueError leaking past the poller's except RconError). - fake_rcon_server captures handler exceptions and re-raises at context exit, so a buggy test handler surfaces as a real failure instead of silently degrading into a client-side timeout. - Two new parser tests: HH:MM:SS duration parsing and malformed input coverage. - Fix Steam ID formula typo in the spec doc (Z*2 + Y, not Y*2 + Z; Y is the low bit). Code was already correct. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 21:39:32 +02:00
mwiegand	e25e7098f6	refactor(live-state): drop redundant ix_sps_server_recent index The two indexes ix_sps_server_open and ix_sps_server_recent were byte-identical because SQLAlchemy's Index(name, *cols) form drops the DESC ordering the spec intended. Rather than reach for text("left_at DESC"), drop the second index entirely — SQLite scans the ASC index backwards at no measurable cost. Spec and plan updated to match. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 21:27:01 +02:00
mwiegand	a5f7b736a2	docs/plan: server live-state display implementation plan Thirteen TDD-structured tasks covering schema migration, RCON client, spec injection, password generation, Steam Web API client, live-state poller (RLE snapshots + session reconciliation + profile enrichment + retention + thread startup), server list badge, server detail fragment, deploy env, and end-to-end smoke. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 21:10:33 +02:00
mwiegand	202026e11a	docs/spec: add server live-state display design RCON-based polling with run-length-encoded snapshots, session intervals with min/max ping, Steam profile cache, and a server-detail roster of current + recent players hot-linked from Steam CDN avatars. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-12 21:03:26 +02:00
mwiegand	429fee3868	docs/plan: trim retry-backoff tuple to match attempts-1	2026-05-11 23:21:10 +02:00
mwiegand	532b4c4469	docs: implementation plan for workshop auto-download 7 tasks: retry helper, builder download phase, per-overlay refresh route, template button, CLI subcommand, systemd timer, smoke-test.	2026-05-11 22:34:31 +02:00
mwiegand	fef8cc4ea6	docs: design for workshop auto-download Closes the gap where added workshop items never reach disk until an admin presses the global refresh button. Downloads piggyback on the per-overlay build_overlay job; daily updates come from a systemd timer + CLI subcommand that enqueues the existing refresh job.	2026-05-11 22:28:20 +02:00
mwiegand	ccd3b36319	docs: design for profile page with self-service password change The matching design doc for the implementation plan committed in `6eb9bd0`. Captures the session-invalidation reasoning (Django-style "keep current session, kill others") and the open questions resolved during brainstorming. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-11 22:21:40 +02:00
mwiegand	6eb9bd0ab3	docs: plan for profile page with self-service password change Adds /profile reachable via header username, with change-password form as its first section. Industry-standard session semantics: other sessions invalidated on password change, current session kept, via new users.password_changed_at column + session marker. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-11 21:41:25 +02:00
mwiegand	e1add4fffa	docs(plans): l4d2 network shaping & marking — implementation plan Eight TDD tasks: sysctl extension, nftables marking (file + unit), CAKE shaper (env + helper + unit), deploy-script wiring, README. Each task adds one artifact with its assertion in test_deploy_artifacts.py and ends in its own commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 00:10:40 +02:00
mwiegand	0cc92f2c17	docs(specs): l4d2 network shaping & marking — design CAKE egress shaping (test-deploy oneshot + systemd-networkd [CAKE] block on prod), nftables uid-based DSCP-EF + skb-priority marking for srcds UDP, plus rounding sysctls (udp_rmem_min/wmem_min, default_qdisc=fq_codel, tcp_congestion_control=bbr). Hardware-specific knobs stay documented escape hatches matching the perf-baseline boundary. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 00:05:44 +02:00
mwiegand	2d3c98866a	feat(files-overlay): user-managed file content as a third overlay type Adds Overlay.type='files' whose source-of-truth IS the overlay directory itself. Users can: * upload arbitrary files / whole folders by dragging from the OS onto a folder row in the file tree (one POST per file, queue with concurrency 3, per-file progress in a floating Uploads panel) * move via drag-and-drop inside the tree (same gesture, source distinguishes; refuses cycles) * create / edit / rename / replace through a single editor modal (text flavor for editable files, binary flavor with replace-upload for everything else; filename input is the rename surface) * mkdir empty folders (slashes allowed for nested intermediates) * stream a folder as a zip download * delete files and empty folders Backend is type-agnostic past the new files_routes endpoints, so the existing mount / spec / overlayfs / expose_server_cfg pipeline is reused unchanged. is_editable gates the row's edit affordance and the /save content rules. Three new safe-resolve helpers (write/delete/move) cover the new operations with the same anchor-and-resolve pattern as listing and download. FilesBuilder is a no-op so the build subsystem can dispatch uniformly. Spec: docs/superpowers/specs/2026-05-09-files-overlay-design.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 18:59:32 +02:00
mwiegand	36d3d83de6	docs: postmortem for the overlay-umount EBUSY rabbit hole Captures the symptom (Reset blew up on `umount target busy`), the false starts (eager retry, lazy fallback, TimeoutStopSec bump — all shipped briefly and reverted), the actual root cause (the helper's own Python interpreter inheriting and pinning the unit's mount namespace), and the fix (nsenter at the systemd Exec line). The lessons section is the part future-me reads first: a retry loop is a hint that something we own is the blocker; probe `/proc/*/ns/mnt` before assuming kernel async; `+` Exec prefix doesn't escape the unit's mount namespace. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 15:50:41 +02:00
mwiegand	b62fc08127	docs(specs): l4d2 cpu pinning — decision record (deferred) Investigated whether to hard-pin each srcds instance to a single core within the existing AllowedCPUs=1-7 set. Modern kernels (5.13+) no longer expose kernel.sched_migration_cost_ns or the other classic CFS "laziness" tunables, so a global cheap-fix is unavailable. Decision for now: trust CFS + Nice=-5 + AllowedCPUs=1-7. Per-instance CPUAffinity= remains an opt-in escape hatch in deploy/README.md. Documents the revisit triggers and the preferred implementation path when the time comes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 12:41:40 +02:00
mwiegand	1dd674714a	docs(specs): perf baseline lifecycle — premise check on system vs user units Make explicit that the project uses system units (root systemctl, unit under /usr/local/lib/systemd/system/, WantedBy=multi-user.target), so `systemctl enable --now` is the correct verb to make instances survive a host reboot. User units have different lifecycle rules and would not auto-start at boot without enable-linger. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 12:25:34 +02:00
mwiegand	3b0bde9b50	docs(plans): l4d2 server lifecycle reboot-and-drift — implementation plan Two TDD tasks: helper+service_control verb rename, then poller code + wiring + tests. Operator-side smoke test in F.3. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 12:21:59 +02:00
mwiegand	72cd7ca1ef	docs(specs): l4d2 server lifecycle reboot-and-drift — design Switch lifecycle verbs from systemctl start/stop to enable --now / disable --now (servers survive host reboot via WantedBy= symlinks), plus a periodic state poller for runtime drift (OOM kills, manual systemctl ops, exhausted Restart=on-failure). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 12:21:59 +02:00
mwiegand	c91c029c38	docs(plans): l4d2 cpu isolation — implementation plan Two TDD tasks: deploy-script cpuset block + tests, README "CPU isolation" subsection. Operator-side smoke test in F.3. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 11:03:37 +02:00
mwiegand	17b7c2ff10	docs(specs): l4d2 cpu isolation — design cgroup-v2 AllowedCPUs= drop-ins for system/user/build/game slices. Defaults: core 0 for everything-not-game, cores 1..N-1 for game, computed from nproc. LEFT4ME_SYSTEM_CPUS / LEFT4ME_GAME_CPUS overrides; single-core hosts skip with a warning. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 11:03:37 +02:00
mwiegand	851e6629aa	docs(plans): l4d2 server host perf baseline — implementation plan Six tasks (TDD, one commit each): unit directives, slice files, sysctl conf, sandbox slice + OOMScoreAdjust, deploy-script wiring, README escape-hatch section. Final verification step with full deploy + host + web pytest sweep. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 09:39:12 +02:00
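
The CPU isolation defaults described in commit 17b7c2ff10 (core 0 for everything that is not the game, cores 1..N-1 for the game, computed from nproc, with LEFT4ME_SYSTEM_CPUS / LEFT4ME_GAME_CPUS overrides) can be sketched roughly as below. This is an illustration only, not the deploy script's actual code: the function name `split_cpus` is hypothetical, and checking the single-core case before the overrides is an assumption.

```python
import os


def split_cpus(nproc, env=None):
    """Return (system_cpus, game_cpus) as AllowedCPUs= strings.

    Returns None on single-core hosts, where the caller is expected to
    skip the cpuset drop-ins and print a warning.
    Hypothetical helper; the ordering of checks is an assumption.
    """
    env = os.environ if env is None else env
    if nproc < 2:
        return None  # nothing to isolate on a single core

    # Defaults from the commit message: core 0 for non-game slices,
    # cores 1..N-1 for the game slice.
    default_game = "1" if nproc == 2 else f"1-{nproc - 1}"

    # Environment overrides win over the computed defaults.
    system = env.get("LEFT4ME_SYSTEM_CPUS") or "0"
    game = env.get("LEFT4ME_GAME_CPUS") or default_game
    return system, game
```

On an 8-core host with no overrides set, the game range comes out as `1-7`, which matches the `AllowedCPUs=1-7` set mentioned in commit b62fc08127.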

80 commits