
269 commits

Author SHA1 Message Date
mwiegand
59771f91c4
fix(deploy): drop deleted l4d2host.fs from pyproject + use nproc --all
Two bugs surfaced by the previous deploy attempt:

1. l4d2host/pyproject.toml still listed `l4d2host.fs` in the explicit
   packages= list. After deleting the fs/ package, pip install -e fails
   with "package directory './fs' does not exist".

2. The CPU-isolation deploy step uses `nproc` to detect host core count,
   but `nproc` honors Cpus_allowed of the calling shell. On a host that
   already has the cpuset drop-ins applied (system.slice/user.slice →
   AllowedCPUs=0), the SSH login lands constrained to one core and
   `nproc` returns 1 — making subsequent deploys think they're on a
   single-core box and skip the cpuset writes entirely. `nproc --all`
   reports installed processors regardless of affinity, which is what
   the deploy actually wants.
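
For readers more at home in Python than coreutils, the same distinction can
be sketched with the standard library. This is an illustration only, not the
deploy script: `os.sched_getaffinity` stands in for plain `nproc`, and
`os.cpu_count` is only a rough analogue of `nproc --all` (it counts online
CPUs rather than installed ones). The function names are made up here.

```python
import os


def usable_cpus() -> int:
    """CPUs the calling process may run on (affinity-aware, like plain `nproc`)."""
    return len(os.sched_getaffinity(0))  # Linux-only API


def total_cpus() -> int:
    """CPUs the OS reports overall (rough analogue of `nproc --all`)."""
    return os.cpu_count() or 1


if __name__ == "__main__":
    # In a shell pinned by AllowedCPUs=0, usable would be 1 while total stays N.
    print(f"usable={usable_cpus()} total={total_cpus()}")
```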

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 13:11:19 +02:00
mwiegand
ff6ce7b091
refactor(l4d2-host): unmount via ExecStopPost — single code path mirroring mount
Symmetric with the earlier mount cleanup (commits 519567e..a982995). Until
now, the unit's ExecStartPre handled mount but the Python side still drove
unmount: stop_instance and _purge_instance both called _mounter.unmount,
which wrapped sudo + the helper. Two code paths for two halves of the
same lifecycle.

Move unmount into the unit:

- ExecStopPost=+/usr/local/libexec/left4me/left4me-overlay umount %i
  (ExecStopPost, not ExecStop, so it runs after the cgroup is cleared;
  ExecStop runs while srcds is alive and would EBUSY the umount syscall.)
- Helper's umount verb is now idempotent (mirrors mount): if merged
  isn't a mount point, return early. PRINT_ONLY mode bypasses both
  short-circuits so the unit tests still exercise the full nsenter argv.
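
The idempotency rule for the umount verb can be sketched in Python as
follows. The real helper is a separate program that goes through nsenter;
this sketch leaves that out, and `umount_if_mounted` and its injectable
`run` parameter are hypothetical names used only for illustration.

```python
import os
import subprocess


def umount_if_mounted(merged: str, run=subprocess.run) -> bool:
    """Unmount `merged` only if it is currently a mount point.

    Returns True when an umount was issued, False when there was nothing
    to do. `run` is injectable so the sketch can be exercised without root.
    """
    if not os.path.ismount(merged):
        return False  # already unmounted: succeed without touching anything
    run(["umount", merged], check=True)
    return True
```

Calling it twice in a row is safe under this model: the second call sees no
mount point and returns early.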

Drop the dead Python machinery:

- _mounter.unmount(...) calls in stop_instance and _purge_instance
- _mounter global + KernelOverlayFSMounter import
- The whole l4d2host/fs/ package (OverlayMounter ABC + KernelOverlayFSMounter
  class) — no production callers, just self-tests
- l4d2host/tests/test_kernel_overlayfs.py
- test_stop_succeeds_when_unmount_fails / test_delete_succeeds_when_unmount_fails
  (tested Python-side unmount-failure tolerance that no longer exists)
- The l4d2host.fs.kernel_overlayfs.run_command monkeypatches in lifecycle tests

After this, the only thing start_instance does beyond cfg-staging is ask
systemd to enable+start the unit. stop/delete/reset only ask systemd to
disable; the overlay lifecycle lives entirely in the unit file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 13:09:52 +02:00
mwiegand
fc371711ec
fix(deploy): StartLimit* directives belong in [Unit], not [Service]
systemd 230+ moved StartLimitBurst= and StartLimitIntervalSec= from
[Service] into [Unit] (with the rename from StartLimitInterval=). Putting
them in [Service] makes systemd ignore them, logging only a warning to
the journal: "Unknown key 'StartLimitIntervalSec' in section [Service],
ignoring." — meaning the restart-loop cap I claimed in commit 519567e
wasn't actually applied.
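
A small lint along these lines could catch the misplacement before deploy.
This is a hedged sketch, not something in the repo: `misplaced_start_limits`
is a hypothetical name, and the parser only understands plain `[Section]`
headers and `Key=Value` lines.

```python
def misplaced_start_limits(unit_text: str) -> list:
    """Return (section, key) pairs for StartLimit* keys found outside [Unit]."""
    section = None
    hits = []
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blanks and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        key = line.split("=", 1)[0].strip()
        if key.startswith("StartLimit") and section != "Unit":
            hits.append((section, key))
    return hits
```

An empty result means every StartLimit* directive sits in [Unit].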

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:56:54 +02:00
mwiegand
a982995d5b
fix(deploy): ExecStartPre runs overlay helper with + prefix, not sudo
The unit has NoNewPrivileges=true (security hardening for srcds), which
blocks sudo's setuid escalation. The previous sudo'd ExecStartPre failed
on every start with "sudo: the 'no new privileges' switch is set, which
prevents sudo from running as root" -> Restart=on-failure loop.

systemd's `+` prefix runs the Exec command with full privileges (root, no
sandbox), bypassing User=/Group=/NoNewPrivileges=. Equivalent privilege scope to
the sudoers rule the web app already uses for the same helper, just
without the sudo middleman.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:55:16 +02:00
mwiegand
56f5c30296
refactor(l4d2-host): unit's ExecStartPre is the sole code path to the mount
Before this change there were two callers of left4me-overlay mount:
the web app's start_instance (Python, in-process) and the unit's
ExecStartPre (shell, via sudo). The duplication invited divergence; the
helper's recently-added idempotency made both paths technically work
but at the cost of a "first wins" race and dead-code retry logic in
start_instance.

Drop the in-process _mounter.mount() call from start_instance. The web
app now only stages cfg files (which still must happen on the host
filesystem before mount, to avoid overlayfs copy-up changing ownership),
then asks systemd to enable+start the unit; the unit's ExecStartPre
does the mount.

Removed:
- os.path.ismount(merged) refusal in start_instance and its test
  (test_start_refuses_to_double_mount). The race the check guarded
  against is now handled by the helper's idempotency.
- _load_instance_env helper and the `os` import (both became dead).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:54:05 +02:00
mwiegand
3d9b7ef771
fix(deploy): WorkingDirectory= prefix - so ExecStartPre can mount the overlay
systemd applies WorkingDirectory= to every Exec line including ExecStartPre.
With the merged dir not yet existing at boot time (the volatile overlay
mount has been wiped), the chdir into runtime/%i/merged/left4dead2 fails
with status=200/CHDIR before ExecStartPre can run the mount helper.

The `-` prefix makes chdir failure non-fatal: ExecStartPre runs in the
unit's home (cwd doesn't matter for the mount helper); ExecStart re-applies
WorkingDirectory once the mount has landed and chdirs successfully.

Companion to commit 519567e (which added the ExecStartPre mount + helper
idempotency but didn't account for the WorkingDirectory ordering).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:51:58 +02:00
mwiegand
519567e156
fix(l4d2-host): mount overlay via ExecStartPre so enabled units boot cleanly
The lifecycle change to systemctl enable --now (commit 8552c55) made
units auto-start at boot. But the kernel-overlayfs mount is volatile
(reboot kills it), and the web app's start_instance only re-mounts in
response to a UI click. Result: at boot, systemd starts the unit, finds
empty merged/, CHDIR fails, Restart=on-failure spins forever (counter
hit 65 on ckn before this fix landed).

Fix:
- Unit gets `ExecStartPre=/usr/bin/sudo -n .../left4me-overlay mount %i`
  so the overlay is established before the main process starts.
- Helper is now idempotent: if merged is already a mount point, exit 0.
  Required because Restart=on-failure re-runs ExecStartPre on each
  cycle, and the web-app's start_instance also calls the helper, so
  both paths would otherwise collide on "already mounted".
- StartLimitBurst=5 + StartLimitIntervalSec=60s caps the restart loop
  instead of letting it spin indefinitely on a fundamental failure.
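
What a burst/interval cap does to a restart loop can be modelled in a few
lines. This is a simplified model written for illustration, not systemd's
actual implementation; `StartLimiter` is a hypothetical class and the
defaults simply mirror the values named above.

```python
class StartLimiter:
    """Allow at most `burst` starts per `interval` seconds (simplified model)."""

    def __init__(self, burst: int = 5, interval: float = 60.0):
        self.burst = burst
        self.interval = interval
        self.window_start = None
        self.count = 0

    def allow(self, now: float) -> bool:
        # Open a fresh window on the first start or once the old one expired.
        if self.window_start is None or now - self.window_start > self.interval:
            self.window_start = now
            self.count = 0
        if self.count < self.burst:
            self.count += 1
            return True
        return False  # cap reached: the loop stops instead of spinning
```

With these values a unit that fails once per second gets five attempts and
is then refused until the window has passed.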

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:47:20 +02:00
mwiegand
b62fc08127
docs(specs): l4d2 cpu pinning — decision record (deferred)
Investigated whether to hard-pin each srcds instance to a single core
within the existing AllowedCPUs=1-7 set. Modern kernels (5.13+) no
longer expose kernel.sched_migration_cost_ns or the other classic CFS
"laziness" tunables, so a global cheap-fix is unavailable. Decision
for now: trust CFS + Nice=-5 + AllowedCPUs=1-7. Per-instance
CPUAffinity= remains an opt-in escape hatch in deploy/README.md.
Documents the revisit triggers and the preferred implementation path
when the time comes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:41:40 +02:00
mwiegand
67b5521eb6
feat(l4d2-web): periodic state poller refreshes Server.actual_state
A background thread spawned alongside the job workers polls every
server's status every STATE_POLLER_INTERVAL_SECONDS (default 30) and
writes the result via the existing refresh_server_actual_state path.
Servers with in-flight jobs (queued/running/cancelling) are skipped to
avoid racing the post-job refresh. Catches reboot drift, OOM kills,
manual systemctl operations, and any other out-of-band state change.
Spec: docs/superpowers/specs/2026-05-09-l4d2-server-lifecycle-reboot-and-drift-design.md
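
The poller's shape can be sketched as below, assuming callables for
listing servers, looking up a server's job state, and refreshing it. All
names here (`poll_once`, `start_poller`, `job_state_of`) are hypothetical
stand-ins, not the web app's real functions; only the skipped job states
and the 30-second default come from the description above.

```python
import threading

IN_FLIGHT = {"queued", "running", "cancelling"}


def poll_once(server_ids, job_state_of, refresh) -> list:
    """Refresh every server that has no in-flight job; return the ids refreshed."""
    refreshed = []
    for server_id in server_ids:
        if job_state_of(server_id) in IN_FLIGHT:
            continue  # the post-job refresh owns this server for now
        refresh(server_id)
        refreshed.append(server_id)
    return refreshed


def start_poller(list_servers, job_state_of, refresh, interval: float = 30.0):
    """Run poll_once every `interval` seconds; set the returned Event to stop."""
    stop = threading.Event()

    def loop():
        # Event.wait doubles as an interruptible sleep.
        while not stop.wait(interval):
            poll_once(list_servers(), job_state_of, refresh)

    threading.Thread(target=loop, daemon=True, name="state-poller").start()
    return stop
```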

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:31:28 +02:00
mwiegand
8552c559d3
feat(l4d2-host): server lifecycle uses systemctl enable --now / disable --now
Servers started via the web UI now create a WantedBy= symlink under
multi-user.target.wants/, so they auto-start on the next host reboot.
Helper verbs renamed start/stop -> enable/disable; service_control.py
renamed start_service/stop_service -> enable_service/disable_service.
The user-facing l4d2ctl start/stop commands keep their names per the
AGENTS.md contract -- only the implementation changes. Spec:
docs/superpowers/specs/2026-05-09-l4d2-server-lifecycle-reboot-and-drift-design.md
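
As a rough sketch of the verb change, the systemctl invocation might be
built like this. `unit_name` and `lifecycle_argv` are hypothetical helpers
for illustration; the real helper's privilege handling is not shown, and
the unit naming assumes the left4me-server@.service template.

```python
def unit_name(instance: str) -> str:
    """Instance name for the left4me-server@.service template."""
    return f"left4me-server@{instance}.service"


def lifecycle_argv(verb: str, instance: str) -> list:
    """Build the systemctl argv for the enable/disable verbs."""
    if verb not in ("enable", "disable"):
        raise ValueError(f"unsupported verb: {verb!r}")
    # --now starts (or stops) the unit in the same call that adds (or
    # removes) the WantedBy= symlink.
    return ["systemctl", verb, "--now", unit_name(instance)]
```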

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:28:44 +02:00
mwiegand
1dd674714a
docs(specs): perf baseline lifecycle — premise check on system vs user units
Make explicit that the project uses system units (root systemctl, unit
under /usr/local/lib/systemd/system/, WantedBy=multi-user.target), so
`systemctl enable --now` is the correct verb to make instances survive
a host reboot. User units have different lifecycle rules and would not
auto-start at boot without enable-linger.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:25:34 +02:00
mwiegand
3b0bde9b50
docs(plans): l4d2 server lifecycle reboot-and-drift — implementation plan
Two TDD tasks: helper+service_control verb rename, then poller code
+ wiring + tests. Operator-side smoke test in F.3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:21:59 +02:00
mwiegand
72cd7ca1ef
docs(specs): l4d2 server lifecycle reboot-and-drift — design
Switch lifecycle verbs from systemctl start/stop to enable --now /
disable --now (servers survive host reboot via WantedBy= symlinks),
plus a periodic state poller for runtime drift (OOM kills, manual
systemctl ops, exhausted Restart=on-failure).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:21:59 +02:00
mwiegand
20604dd79c
docs(deploy): document CPU isolation in performance-tuning section
Explains the core-0-vs-game-cores split, the LEFT4ME_SYSTEM_CPUS /
LEFT4ME_GAME_CPUS overrides, the single-core skip, and the
subset-of relationship with per-instance CPUAffinity=.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 11:06:59 +02:00
mwiegand
af3171102a
feat(deploy): cgroup-v2 cpuset drop-ins pin system to core 0, game to rest
Computes NPROC at deploy time. Defaults LEFT4ME_SYSTEM_CPUS=0 and
LEFT4ME_GAME_CPUS=1-(NPROC-1). Single-core hosts skip cpuset writes
with a stderr warning unless an env var override is set. Spec:
docs/superpowers/specs/2026-05-09-l4d2-cpu-isolation-design.md
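
The default computation can be sketched in Python (the deploy script itself
is not Python, and `cpuset_defaults` is a hypothetical name). The handling
of a single-core host with only one override set is an assumption of this
sketch, not something the commit specifies.

```python
from typing import Optional


def cpuset_defaults(nproc: int, env: dict) -> Optional[dict]:
    """Return the system/game CPU lists, or None when cpuset writes are skipped."""
    system = env.get("LEFT4ME_SYSTEM_CPUS")
    game = env.get("LEFT4ME_GAME_CPUS")
    if nproc < 2 and not (system or game):
        return None  # single-core host, no override: skip with a warning
    return {
        "system": system or "0",
        # Assumption: with one core and no game override there is no sane default.
        "game": game or (f"1-{nproc - 1}" if nproc > 1 else None),
    }
```

On an 8-core host with no overrides this yields system `0` and game `1-7`.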

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 11:06:34 +02:00
mwiegand
c91c029c38
docs(plans): l4d2 cpu isolation — implementation plan
Two TDD tasks: deploy-script cpuset block + tests, README
"CPU isolation" subsection. Operator-side smoke test in F.3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 11:03:37 +02:00
mwiegand
17b7c2ff10
docs(specs): l4d2 cpu isolation — design
cgroup-v2 AllowedCPUs= drop-ins for system/user/build/game slices.
Defaults: core 0 for everything-not-game, cores 1..N-1 for game,
computed from nproc. LEFT4ME_SYSTEM_CPUS / LEFT4ME_GAME_CPUS
overrides; single-core hosts skip with a warning.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 11:03:37 +02:00
mwiegand
e5126c8c0b
docs(deploy): tighten perf-tuning escape hatches
- RT example: add AmbientCapabilities=CAP_SYS_NICE so the User=left4me
  service can actually enter SCHED_FIFO on Trixie.
- CPU governor: note that linux-cpupower may need apt install.
- CPUAffinity=2: clarify that per-instance values typically increment.
- NIC tuning: note that ethtool may need apt install.
2026-05-09 10:15:45 +02:00
mwiegand
9e0f6f17ef
docs(deploy): performance-tuning escape-hatch section in README
Documents CPU governor, per-instance CPUAffinity, NIC tuning, and
SCHED_FIFO opt-in patterns. None of these are auto-applied; they're
ops-side knobs for measured problems the perf baseline doesn't solve.
2026-05-09 10:09:40 +02:00
mwiegand
928519fa34
feat(deploy): install slice + sysctl artifacts and apply via sysctl --system
Copies l4d2-game.slice and l4d2-build.slice into
/usr/local/lib/systemd/system/, installs 99-left4me.conf into
/etc/sysctl.d/, and runs sysctl --system so the perf baseline is
live this deploy, not on next reboot.
2026-05-09 10:05:41 +02:00
mwiegand
7e4a5691ed
feat(deploy): script-sandbox runs in l4d2-build.slice + OOMScoreAdjust=500
Builds yield CPU/IO to game-server instances under contention via the
slice's weight=10, and are killed first under memory pressure
(servers have OOMScoreAdjust=-200).
2026-05-09 10:01:38 +02:00
mwiegand
b3fca4772c
feat(deploy): host sysctls for UDP buffers + netdev backlog/budget
99-left4me.conf: rmem_max/wmem_max=8M (with 512K defaults),
netdev_max_backlog=5000, netdev_budget=600, vm.swappiness=10.
2026-05-09 09:53:07 +02:00
mwiegand
66d83a0282
docs(deploy): point slice files at perf baseline spec
Matches the spec-pointer comment Task 1 added to
left4me-server@.service. A future operator running
`systemctl cat l4d2-game.slice` now finds the rationale.
2026-05-09 09:51:48 +02:00
mwiegand
ad7d73608e
feat(deploy): l4d2-game.slice + l4d2-build.slice with 100:1 weight ratio
Flat top-level slices. Game wins under contention; build still gets
the box when uncontended. Referenced by left4me-server@.service and
the script-sandbox systemd-run invocation.
2026-05-09 09:48:41 +02:00
mwiegand
7193163488
feat(deploy): perf-baseline directives on left4me-server@.service
Slice=l4d2-game.slice, Nice=-5, IOSchedulingClass=best-effort,
OOMScoreAdjust=-200, MemoryHigh=1.5G, MemoryMax=2G, TasksMax=256,
LimitNOFILE=65536, KillSignal=SIGINT, TimeoutStopSec=15s,
LogRateLimitIntervalSec=0. Spec:
docs/superpowers/specs/2026-05-09-l4d2-server-host-perf-baseline-design.md
2026-05-09 09:44:12 +02:00
mwiegand
851e6629aa
docs(plans): l4d2 server host perf baseline — implementation plan
Six tasks (TDD, one commit each): unit directives, slice files,
sysctl conf, sandbox slice + OOMScoreAdjust, deploy-script wiring,
README escape-hatch section. Final verification step with full
deploy + host + web pytest sweep.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 09:39:12 +02:00
mwiegand
b6574e308b
docs(specs): perf baseline — fix transient-service phrasing
The existing left4me-script-sandbox helper uses systemd-run in
transient service mode (--unit=, no --scope). Spec wrongly said
'--scope'. No semantic change — the design's --slice= and
-p OOMScoreAdjust= guidance is identical for service vs scope mode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 09:39:12 +02:00
mwiegand
db3b149045
docs(specs): l4d2 server host perf baseline — design
Approach A: per-instance unit directives (Nice, OOM, Memory caps,
KillSignal=SIGINT, log-rate disable), flat l4d2-game/l4d2-build slice
hierarchy with 100:1 CPU/IO weight ratio, sandbox into build slice with
OOMScoreAdjust=500, host sysctls for UDP buffers + netdev backlog/budget
+ vm.swappiness. SCHED_FIFO, CPU governor, CPUAffinity, NIC tuning are
documented escape hatches, not auto-applied.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 09:31:05 +02:00
mwiegand
965b67e6fc
fix(l4d2-host): script-sandbox normalizes file perms so web user can read
Cedapug's build script writes .cedapug/manifest.tsv with mode 0600 owned
by l4d2-sandbox; the web service (left4me uid) then 500s when streaming
that file via the download route — PermissionError on open().

Two fixes:
- UMask=0022 on the systemd-run unit so new file writes default to
  0644 / dirs to 0755.
- Post-script chmod o+r/o+rx walk over the overlay dir to backfill any
  stricter modes the script left behind (e.g. shells/tools that ignore
  umask and explicitly create with 0600).

The helper no longer execs systemd-run; it captures the rc, runs the
post-step, and exits with the original rc.
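
The chmod backfill amounts to a tree walk that adds "other" read bits. A
minimal Python sketch, assuming symlinks should be left alone (the helper
itself is not Python, and `backfill_other_read` is a hypothetical name):

```python
import os
import stat


def _add_bits(path: str, bits: int) -> None:
    """OR `bits` into the mode of `path`, leaving symlinks untouched."""
    if os.path.islink(path):
        return
    mode = stat.S_IMODE(os.lstat(path).st_mode)
    if mode & bits != bits:
        os.chmod(path, mode | bits)


def backfill_other_read(root: str) -> None:
    """Equivalent of chmod o+r on files and o+rx on directories under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        _add_bits(dirpath, stat.S_IROTH | stat.S_IXOTH)
        for name in filenames:
            _add_bits(os.path.join(dirpath, name), stat.S_IROTH)
```

Only bits are added, so a file that is already 0644 is not modified.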

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 01:44:26 +02:00
mwiegand
c16e780283
feat(l4d2-web): server file tree — enable download symmetric with overlay tree
Adds a /servers/<id>/files/download route mirroring the overlay download
endpoint. Same safety rules: real-path must resolve under LEFT4ME_ROOT
(merged view threads through `installation/` and overlay layers, all
already inside the root). The server file-tree partial now renders
download links.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 01:40:04 +02:00
mwiegand
aacd95012e
feat(l4d2-web): blueprint rename moves to footer modal — matches overlay/server pattern
Drops the inline Name input from the blueprint edit form. A Rename link
sits next to Delete in the page footer; clicking opens a one-line modal
that posts to a new POST /blueprints/<id>/rename route. The main edit
form keeps the current name as a hidden input so its full Save still
works unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 01:37:29 +02:00
mwiegand
ed12280cf0
feat(l4d2-web): server detail — directory tree of the runtime merged view
Adds a Files section at the bottom of the server detail page that lists
the kernel-overlayfs merged view at runtime/<server_id>/merged/. Reuses
the overlay file-tree partial via two new template variables:

- files_base_url: parent passes "/overlays/<id>" or "/servers/<id>"
- download_supported: false for servers (runtime holds large game
  binaries; no download endpoint), true for overlays (existing behavior)

New service helper safe_resolve_for_server_listing() rejects path
traversal beyond the merged root and returns None when the overlayfs
mount doesn't exist (server never started or just reset).
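
A containment check of this kind might look like the sketch below. It is
not the code of safe_resolve_for_server_listing(): `resolve_under_root` is
a hypothetical name, a missing directory stands in for "mount doesn't
exist", and raising ValueError on escape is an assumption.

```python
from pathlib import Path
from typing import Optional


def resolve_under_root(root: Path, rel: str) -> Optional[Path]:
    """Resolve `rel` below `root`; None if root is absent, ValueError on escape."""
    if not root.is_dir():
        return None  # nothing to list yet
    root_real = root.resolve()
    # resolve() collapses ".." and follows symlinks before the check.
    candidate = (root_real / rel).resolve()
    if candidate != root_real and root_real not in candidate.parents:
        raise ValueError("path escapes the merged root")
    return candidate
```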

New route GET /servers/<id>/files?path=<rel> returns the lazy-load
file-tree fragment, gated to the server owner. No download counterpart.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 01:35:09 +02:00
mwiegand
fa686f11e3
feat(l4d2-web): server + overlay detail — live-refresh via HTMX, restructured
Vendors HTMX 2.0.4 (the prior file was a 1-line stub) and uses it to poll
two new partials on a 2s tick while a job is in flight:

- /servers/<id>/actions → state badge, filtered action buttons,
  last-job sentence, live job log (SSE) while a Start/Stop/Reset job
  is running. When the job is terminal the partial re-renders without
  hx-trigger and polling stops.
- /overlays/<id>/build-status → build state badge, last-build
  sentence, live job log while a build_overlay job is running. Same
  terminal-state stop behavior.

Server detail restructure:
- Editable name moves out of the page body into a Rename modal
  triggered from a link next to Delete in the page footer.
- Compact dl with Port (linked as steam://run/550//+connect <host>:<port>)
  and Blueprint.
- Actions row: state badge + state-filtered buttons (start/stop, reset)
  + last-job sentence. Drift warning when desired ≠ actual.
- Recent Jobs table removed.

Overlay detail restructure:
- Single panel, dl Type/Scope, no separate Last build row, no Builds
  section.
- Script form gets two compound submits: "Save and build" and
  "Save, reset and rebuild". Standalone Rebuild/Wipe gone.
- Build status state badge + last-build sentence under the editor;
  action buttons hide while a build is in flight.
- Rename modal in the page footer next to Delete.

sse.js binds on htmx:load (covers initial document and post-swap inserts)
and closes EventSources on htmx:beforeCleanupElement to avoid leaking
streams across swaps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 01:27:30 +02:00
mwiegand
3c4bd6880a
refactor(l4d2-web): detail-page UI — single panel, soft border, footer Delete
- Detail panels: softer (color-mix --line-soft) border. h2 sub-section
  spacing inside a single outer panel. admin and job_detail collapse to
  one panel each.
- Color tokens: --color-button-primary / --color-button-danger stay
  saturated in dark mode so white text on filled buttons stays readable.
- Site header: transparent, no full-width bar; aligned with panel-content
  width. No more sticky.
- Page-level Delete: low-contrast outline button at the page footer
  (left side, justify-content flex-start). Save buttons no longer
  full-width (.stack > button { justify-self: end }).
- form-actions-inline helper for right-aligned button rows.
- New service: l4d2web.services.timeago.humanize_delta — used by the
  upcoming server / overlay live-status partials.
- Server route: POST /servers/<id> renames the server (mirrors the
  overlay update pattern, returns 409 on per-user duplicate).
- Overlay route: POST /overlays/<id>/script handles `action` form value
  — `save_build` (default) or `save_reset_build` (wipes overlay dir
  before queuing build). Redirect lands on /overlays/<id> instead of
  the job page so users see the live status.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 01:26:57 +02:00
mwiegand
985df970f8
feat(l4d2-web): per-overlay server.cfg aliases — expose checkbox + auto-exec
Each linked overlay gets a checkbox on the blueprint detail page that opts
its server.cfg in as exec server_overlay_<id>. The web app builds the
spec with {path, alias} per overlay and prepends exec server_overlay_<id>
lines to the blueprint config in lowest-overlay-first order. The host
stages those copies in the overlayfs upper layer before mounting (avoids
copy-up writes against a sandbox-uid file). A live preview block above the
Config textarea shows what gets auto-executed.
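
The prepend step can be sketched as follows, assuming the caller already
passes the exposed overlay ids in the intended order. `with_overlay_execs`
is a hypothetical name, not the web app's function.

```python
def with_overlay_execs(config: str, exposed_overlay_ids: list) -> str:
    """Prepend one `exec server_overlay_<id>` line per exposed overlay."""
    execs = [f"exec server_overlay_{oid}" for oid in exposed_overlay_ids]
    if not execs:
        return config  # nothing exposed: config is passed through untouched
    return "\n".join(execs + [config])
```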

Schema:
- alembic 0007: BlueprintOverlay.expose_server_cfg BOOLEAN

Spec contract:
- l4d2host OverlayRef(path, alias?). load_spec accepts both bare-string
  and {path, alias} entries.
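
  A minimal sketch of accepting both entry shapes, assuming a frozen
  dataclass; `parse_overlay_entry` is a hypothetical name and the real
  load_spec may validate differently:

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class OverlayRef:
    path: str
    alias: Optional[str] = None


def parse_overlay_entry(entry: Union[str, dict]) -> OverlayRef:
    """Accept a bare path string or a {path, alias} mapping."""
    if isinstance(entry, str):
        return OverlayRef(path=entry)
    if isinstance(entry, dict) and "path" in entry:
        return OverlayRef(path=entry["path"], alias=entry.get("alias"))
    raise ValueError(f"unsupported overlay entry: {entry!r}")
```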

Side effects folded in (same file in l4d2_facade):
- start_server auto-initializes; the manual Initialize step is no longer
  needed before Start.
- initialize_server no longer runs blueprint builders — builds happen on
  overlay save, not on every server Start.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 01:26:31 +02:00
mwiegand
c2cf723911
docs(agents): require specs and plans to live in this repo
Make explicit that design specs go in docs/superpowers/specs/ and
implementation plans go in docs/superpowers/plans/, both committed
to git, with the YYYY-MM-DD-<topic>[-design].md naming already used
elsewhere in the tree. The plan-mode scratch file under
~/.claude/plans/ is fine while plan mode is open, but the persisted
artifact must end up inside the repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 21:37:17 +02:00
mwiegand
a4e9f6cd26
feat(l4d2-web): blueprint overlay picker — drag-list + add-dropdown
Replace the per-row checkbox + numeric Order table on the blueprint
detail page with a drag-to-reorder list of selected overlays plus a
native <select> for adding more. Removing uses an × button per row;
the option sorted-inserts back into the dropdown alphabetically.

Native HTML5 drag-and-drop, no library, no JS-disabled fallback.
Server contract is unchanged: each list row owns one hidden
<input name="overlay_ids">, DOM order = submission order, and the
existing fallback_position branch in ordered_overlay_ids_from_form
absorbs the now-omitted overlay_position_<id> fields.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 21:37:11 +02:00
mwiegand
dec4fed809
docs(specs): blueprint overlay picker — drag-list + add-dropdown
Replace per-row checkbox + numeric Order inputs with a drag-to-reorder
list of selected overlays plus a native <select> for adding more.
Native HTML5 DnD; no library, no JS-disabled fallback. Server contract
unchanged (overlay_ids in DOM order; existing fallback_position branch
absorbs the omitted overlay_position_<id> fields).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 21:32:45 +02:00
mwiegand
01760a31f5
fix(l4d2-web): textareas — monospace font, consistent rows on blueprint forms
Bash script, Arguments and Config are all structured text — render them
in a monospace font with tab-size: 4 and resize: vertical via a base
'textarea' rule in components.css. Add rows="8" + spellcheck="false"
to the blueprint Arguments/Config textareas (both edit and create
forms) so they're a sensible size and consistent with each other.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:52:12 +02:00
mwiegand
7b31390b4c
fix(l4d2-web): file tree — uniform vertical spacing across all rows
The flex 'gap' shorthand on .file-tree-row was setting row-gap as well
as column-gap, so when the .file-tree-children div wrapped to a new
line the row-gap (--space-s) added on top of the nested ul's
margin-top (--space-xs) — making the button-to-first-child gap visibly
bigger than the sibling-row gap. Switch to 'gap: 0 var(--space-s)' so
only column-gap applies; vertical rhythm is now owned exclusively by
the outer grid gap (--space-xs) and the nested ul margin-top
(--space-xs), both equal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:49:05 +02:00
mwiegand
4619a91f45
fix(l4d2-web): file tree layout — wrap children to next line, align names
Two CSS fixes that together turn the rendered file tree from
'everything on one line' into an actual tree:

- .file-tree-children: flex-basis: 100% so an expanded folder's children
  wrap to the next line of the parent <li> flex container instead of
  flowing inline next to the toggle button.
- .file-tree-row-file: padding-left = chevron width, so file rows align
  visually with sibling folder names (folder names are offset by their
  chevron; files have no chevron, so without padding they'd start at
  the chevron column instead of the name column). Chevron itself
  pinned to width: 1ch so rotated/un-rotated states have identical
  layout.
2026-05-08 20:44:41 +02:00
mwiegand
caa8b83cf0
chore(deploy): rewrite web.env every deploy with machine-id-derived SECRET_KEY
Drops the 'only on first creation' guard so newly added env vars reach
existing boxes (today's SESSION_COOKIE_SECURE=false rake). SECRET_KEY
is now sha256(/etc/machine-id) — stable per host, no session
invalidation across redeploys, no state persisted in /etc that the
deploy has to tiptoe around. Single-operator test deployment; the
secret being machine-id-derivable is acceptable per deploy/README.md.
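
The derivation can be sketched in Python, assuming the file's bytes are
hashed exactly as read (whether the deploy strips the trailing newline is
not stated here). Both function names are hypothetical.

```python
import hashlib


def derive_secret_key(machine_id: bytes) -> str:
    """Hex SHA-256 of the machine id: same input, same key, on every deploy."""
    return hashlib.sha256(machine_id).hexdigest()


def secret_key_from_host(path: str = "/etc/machine-id") -> str:
    """Read the machine id from disk and derive the key from it."""
    with open(path, "rb") as fh:
        return derive_secret_key(fh.read())
```

Because the input never changes on a given host, rewriting web.env on every
deploy yields the same key and existing sessions stay valid.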

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:39:02 +02:00
mwiegand
c958d0352a
fix(l4d2-web): show empty-state when overlay dir is empty, not just missing
Tickrate and other seeded examples whose overlay directory exists but
hasn't been built yet rendered a visually blank Files panel — entries
was [] (not None), so the template fell through to an empty <ul>. Use
'not file_tree_root_entries' so both None (dir missing) and []
(dir empty) trigger the 'No files yet' message.
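
The same truthiness rule, shown in plain Python for clarity (the actual
check lives in the template; `show_empty_state` is a hypothetical name):

```python
def show_empty_state(file_tree_root_entries) -> bool:
    """True for both None (dir missing) and [] (dir empty)."""
    return not file_tree_root_entries


def show_empty_state_old(file_tree_root_entries) -> bool:
    """The narrower check: only a missing dir counted as empty."""
    return file_tree_root_entries is None
```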

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:32:09 +02:00
mwiegand
2ab54a3800
fix(l4d2-web): file tree fetches in plain JS — vendored htmx is a stub
The vendored static/vendor/htmx.min.js turned out to be a 33-byte
placeholder, so the hx-get/hx-target/hx-trigger attributes on the
overlay file tree's folder buttons were inert: clicks rotated the
chevron (own JS) but never fetched. Switch the lazy-load to a
~30-line plain-JS handler in static/js/file-tree.js that fetches
button.dataset.filesUrl on first expand and dedupes via dataset.loaded.
Update the spec/plan to match. Route + partial contracts unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:23:04 +02:00
mwiegand
a11d030edd
feat(l4d2-web): overlay detail Files section with HTMX file tree + downloads
Adds a server-rendered collapsible file tree section to the overlay
detail page so users can verify what their script/workshop overlays
produced and pull individual artifacts (VPKs, configs) without SSH.
HTMX-driven lazy folder expansion with click-to-download via send_file;
symlinks land anywhere under LEFT4ME_ROOT (so workshop addons stream
from the shared cache) but escapes are refused. Same access rule as the
rest of the page (admin or owner). 39 new tests; full web suite green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:16:25 +02:00
mwiegand
76bd6e8d4d
docs(specs): overlay file tree — design + implementation plan
Captures the design rationale for the new overlay-detail Files section
(verify build output, click-to-download for individual files via Flask
send_file, HTMX-driven lazy folder expansion) and the paired
implementation plan that produced it. Adds .superpowers/ to .gitignore
so brainstorm session artifacts never sneak into a future commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:16:10 +02:00
mwiegand
1166e13e44
feat(l4d2-web): server identity by id, name as display label
Host-side identifier (systemd unit name and /var/lib/left4me dirs) is now
str(server.id), centralized in services/server_identity.server_unit_name.
Server.name becomes a free-form display label, required and unique per
user (was [a-z0-9_-]{1,64} and globally unique).

Migration 0006 swaps the old global UNIQUE(name) for UNIQUE(user_id, name).
Web routes already keyed on id; templates only used name for display.
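
The effect of the new constraint can be demonstrated with an in-memory
SQLite table. The schema below is a stripped-down assumption for
illustration, not the project's actual model or migration.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a toy servers table with a per-user unique name."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE servers ("
        " id INTEGER PRIMARY KEY,"
        " user_id INTEGER NOT NULL,"
        " name TEXT NOT NULL,"
        " UNIQUE (user_id, name))"
    )
    return db


def add_server(db: sqlite3.Connection, user_id: int, name: str) -> bool:
    """Insert a server; return False if this user already has that name."""
    try:
        db.execute("INSERT INTO servers (user_id, name) VALUES (?, ?)", (user_id, name))
        return True
    except sqlite3.IntegrityError:
        return False
```

Two users can each own a "My Server"; the same user cannot own two.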

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 19:22:09 +02:00
mwiegand
0d906605e9
chore: add direnv .envrc for local Python 3.13 venv
Pins to python3.13 to match the Debian Trixie production target.
Documents the dev setup in README and AGENTS.md so a fresh checkout
gets a working `python` via `direnv allow` + editable installs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 18:56:51 +02:00
mwiegand
196d2db33e
feat(l4d2-web): seed example script overlays from examples/script-overlays/
Bundles four reference script overlays (cedapug_maps, l4d2center_maps,
competitive_rework, tickrate) and adds a `flask seed-script-overlays`
CLI that upserts each *.sh as a system-wide overlay. Test deploy
invokes it after the orphan-cleanup migration so fresh test servers
come up with the same overlays the user has been maintaining by hand.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 18:41:08 +02:00
mwiegand
6b4eef22c2
feat: server Reset action — wipe runtime, keep DB row
Reset stops the systemd service, unmounts the overlay, and rm -rf's both
runtime/<name> and instances/<name>, but keeps the Server row, blueprint,
and (shared) systemd template. Next Start re-initializes from the current
blueprint, so users can clean up logs/caches/accumulated game state without
losing the server.

Implementation factors a shared _purge_instance helper out of
delete_instance; reset_instance reuses it without the existence guard. New
"reset" lifecycle op flows through the same route + worker + facade plumbing
as the other server ops.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 18:10:32 +02:00