Commit graph

38 commits

Author SHA1 Message Date
mwiegand
81c6863cca
overlay_builders: restore symlink overwrite guards + nits
Commit 16adc5c silently dropped two defensive guards from the
symlink-creation loop in WorkshopBuilder.build. Restore them:
- refuse to overwrite a non-symlink file that collides with a workshop
  name (logs a message, skips creation)
- refuse to overwrite a foreign symlink (target outside the cache root)

Also: change `skipped` from list to set (O(1) membership test, no
duplicates possible), and add a brief comment above WorkshopMetadata
construction explaining which fields download_to_cache actually uses.

Two regression tests added to pin the guard behaviour.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 23:01:38 +02:00
mwiegand
16adc5c1fe
overlay_builders: download missing/stale workshop items inline
Replace the old skip-uncached-with-warning logic in WorkshopBuilder.build
with an inline download phase that calls _download_with_retry for each item
whose cache file is absent or stale (mtime/size mismatch). Stamps
last_downloaded_at / last_error after each download, and skips items with
no file_url. Update test fixture to utime cache files so mtime matches
time_updated, delete the now-superseded skip-warning test, and add six
new builder-level behavior tests covering the new download path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 22:56:09 +02:00
mwiegand
6fc7f87943
overlay_builders: address code-review nits on retry helpers
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 22:48:28 +02:00
mwiegand
13bd2e48f6
overlay_builders: add _download_with_retry + _sleep_with_cancel helpers
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-11 22:43:22 +02:00
mwiegand
26a6a9d7b0
rate-limit: extract generic helper, reuse from login
Pulled the per-IP sliding-window check out of auth_routes so the
upcoming /profile/password endpoint can share it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 21:45:51 +02:00
mwiegand
e1149704c8
job_worker: don't duplicate streamed stderr on HostCommandError
services/host_commands.run_command pumps each stderr line into JobLog
via on_stderr (job_worker.py:215) before it raises HostCommandError —
appending exc.stderr again as a single row produced a second copy of
the entire traceback truncated at JOB_LOG_LINE_MAX_CHARS (4096), which
was visible as the awkward duplicated/cut-off second block at the end
of failed install logs.

Split the existing `except subprocess.CalledProcessError` into two:

  except HostCommandError: stderr already streamed — just record exit
    code + last error summary on the job/server row. No log append.

  except subprocess.CalledProcessError: catches raw CalledProcessErrors
    raised outside host_commands (no pump ran), so still append stderr
    to the log. Preserves the path test_called_process_error_fails_job
    exercises.

New regression test asserts a HostCommandError with multi-line stderr
doesn't land as a single concatenated JobLog row.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 22:52:54 +02:00
mwiegand
2d3c98866a
feat(files-overlay): user-managed file content as a third overlay type
Adds Overlay.type='files' whose source-of-truth IS the overlay directory
itself. Users can:

  * upload arbitrary files / whole folders by dragging from the OS onto a
    folder row in the file tree (one POST per file, queue with
    concurrency 3, per-file progress in a floating Uploads panel)
  * move via drag-and-drop inside the tree (same gesture, source
    distinguishes; refuses cycles)
  * create / edit / rename / replace through a single editor modal
    (text flavor for editable files, binary flavor with replace-upload
    for everything else; filename input is the rename surface)
  * mkdir empty folders (slashes allowed for nested intermediates)
  * stream a folder as a zip download
  * delete files and empty folders

Backend is type-agnostic past the new files_routes endpoints, so the
existing mount / spec / overlayfs / expose_server_cfg pipeline is reused
unchanged. is_editable gates the row's edit affordance and the /save
content rules. Three new safe-resolve helpers (write/delete/move) cover
the new operations with the same anchor-and-resolve pattern as listing
and download. FilesBuilder is a no-op so the build subsystem can
dispatch uniformly.

Spec: docs/superpowers/specs/2026-05-09-files-overlay-design.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 18:59:32 +02:00
mwiegand
67b5521eb6
feat(l4d2-web): periodic state poller refreshes Server.actual_state
A background thread spawned alongside the job workers polls every
server's status every STATE_POLLER_INTERVAL_SECONDS (default 30) and
writes the result via the existing refresh_server_actual_state path.
Servers with in-flight jobs (queued/running/cancelling) are skipped to
avoid racing the post-job refresh. Catches reboot drift, OOM kills,
manual systemctl operations, and any other out-of-band state change.
Spec: docs/superpowers/specs/2026-05-09-l4d2-server-lifecycle-reboot-and-drift-design.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 12:31:28 +02:00
mwiegand
c16e780283
feat(l4d2-web): server file tree — enable download symmetric with overlay tree
Adds a /servers/<id>/files/download route mirroring the overlay download
endpoint. Same safety rules: real-path must resolve under LEFT4ME_ROOT
(merged view threads through `installation/` and overlay layers, all
already inside the root). The server file-tree partial now renders
download links.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 01:40:04 +02:00
mwiegand
ed12280cf0
feat(l4d2-web): server detail — directory tree of the runtime merged view
Adds a Files section at the bottom of the server detail page that lists
the kernel-overlayfs merged view at runtime/<server_id>/merged/. Reuses
the overlay file-tree partial via two new template variables:

- files_base_url: parent passes "/overlays/<id>" or "/servers/<id>"
- download_supported: false for servers (runtime holds large game
  binaries; no download endpoint), true for overlays (existing behavior)

New service helper safe_resolve_for_server_listing() rejects path
traversal beyond the merged root and returns None when the overlayfs
mount doesn't exist (server never started or just reset).

New route GET /servers/<id>/files?path=<rel> returns the lazy-load
file-tree fragment, gated to the server owner. No download counterpart.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 01:35:09 +02:00
mwiegand
3c4bd6880a
refactor(l4d2-web): detail-page UI — single panel, soft border, footer Delete
- Detail panels: softer (color-mix --line-soft) border. h2 sub-section
  spacing inside a single outer panel. admin and job_detail collapse to
  one panel each.
- Color tokens: --color-button-primary / --color-button-danger stay
  saturated in dark mode so white text on filled buttons stays readable.
- Site header: transparent, no full-width bar; aligned with panel-content
  width. No more sticky.
- Page-level Delete: low-contrast outline button at the page footer
  (left side, justify-content flex-start). Save buttons no longer
  full-width (.stack > button { justify-self: end }).
- form-actions-inline helper for right-aligned button rows.
- New service: l4d2web.services.timeago.humanize_delta — used by the
  upcoming server / overlay live-status partials.
- Server route: POST /servers/<id> renames the server (mirrors the
  overlay update pattern, returns 409 on per-user duplicate).
- Overlay route: POST /overlays/<id>/script handles `action` form value
  — `save_build` (default) or `save_reset_build` (wipes overlay dir
  before queuing build). Redirect lands on /overlays/<id> instead of
  the job page so users see the live status.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 01:26:57 +02:00
mwiegand
985df970f8
feat(l4d2-web): per-overlay server.cfg aliases — expose checkbox + auto-exec
Each linked overlay gets a checkbox on the blueprint detail page that opts
its server.cfg in as exec server_overlay_<id>. The web app builds the
spec with {path, alias} per overlay and prepends exec server_overlay_<id>
lines to the blueprint config in lowest-overlay-first order. The host
stages those copies in the overlayfs upper layer before mounting (avoids
copy-up writes against a sandbox-uid file). A live preview block above the
Config textarea shows what gets auto-executed.

Schema:
- alembic 0007: BlueprintOverlay.expose_server_cfg BOOLEAN

Spec contract:
- l4d2host OverlayRef(path, alias?). load_spec accepts both bare-string
  and {path, alias} entries.

Side effects folded in (same file in l4d2_facade):
- start_server auto-initializes; the manual Initialize step is no longer
  needed before Start.
- initialize_server no longer runs blueprint builders — builds happen on
  overlay save, not on every server Start.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 01:26:31 +02:00
mwiegand
a11d030edd
feat(l4d2-web): overlay detail Files section with HTMX file tree + downloads
Adds a server-rendered collapsible file tree section to the overlay
detail page so users can verify what their script/workshop overlays
produced and pull individual artifacts (VPKs, configs) without SSH.
HTMX-driven lazy folder expansion with click-to-download via send_file;
symlinks land anywhere under LEFT4ME_ROOT (so workshop addons stream
from the shared cache) but escapes are refused. Same access rule as the
rest of the page (admin or owner). 39 new tests; full web suite green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:16:25 +02:00
mwiegand
1166e13e44
feat(l4d2-web): server identity by id, name as display label
Host-side identifier (systemd unit name and /var/lib/left4me dirs) is now
str(server.id), centralized in services/server_identity.server_unit_name.
Server.name becomes a free-form display label, required and unique per
user (was [a-z0-9_-]{1,64} and globally unique).

Migration 0006 swaps the old global UNIQUE(name) for UNIQUE(user_id, name).
Web routes already keyed on id; templates only used name for display.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 19:22:09 +02:00
mwiegand
6b4eef22c2
feat: server Reset action — wipe runtime, keep DB row
Reset stops the systemd service, unmounts the overlay, and rm -rf's both
runtime/<name> and instances/<name>, but keeps the Server row, blueprint,
and (shared) systemd template. Next Start re-initializes from the current
blueprint, so users can clean up logs/caches/accumulated game state without
losing the server.

Implementation factors a shared _purge_instance helper out of
delete_instance; reset_instance reuses it without the existence guard. New
"reset" lifecycle op flows through the same route + worker + facade plumbing
as the other server ops.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 18:10:32 +02:00
mwiegand
c8a2d563ce
fix(l4d2-web): server delete job now removes the DB row
The delete job ran l4d2ctl delete (host-side cleanup) but never removed the
Server row, so deleted servers kept appearing on /servers. Hard-delete the
row in the worker's success path and skip the post-op status refresh, since
the systemd unit is gone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 18:09:45 +02:00
mwiegand
406f2196f8
fix(l4d2-web): write sandbox script tmpfile under LEFT4ME_ROOT, not /tmp
The web service unit has PrivateTmp=yes: its /tmp is a per-instance
namespace at /tmp/systemd-private-X-left4me-web.service-Y/tmp/ from
PID 1's perspective. When ScriptBuilder writes /tmp/tmpXXX.sh and
passes that path to the sandbox helper, systemd-run asks PID 1 to set
up BindReadOnlyPaths=${SCRIPT}:/script.sh — but PID 1 lives in the host
namespace and can't resolve the web service's PrivateTmp path. The
unit fails to start with status=226/NAMESPACE and "Failed to set up
mount namespacing: /script.sh: No such file or directory".

Move the tmpfile to ${LEFT4ME_ROOT}/sandbox-scripts/. /var/lib is not
affected by PrivateTmp (only /tmp and /var/tmp are), so PID 1 can
resolve the path. The web service has ReadWritePaths=/var/lib/left4me
already, and the directory is created on demand by Python.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 17:14:21 +02:00
mwiegand
908bca3687
fix(l4d2-web): ScriptBuilder — chmod script tmpfile to 0644 for sandbox read
NamedTemporaryFile creates the script file at mode 0600 owned by the
left4me web user. The sandbox runs as l4d2-sandbox and bwrap bind-mounts
the file read-only at /script.sh, but the kernel still enforces the
underlying file's permissions — l4d2-sandbox can't read 0600 left4me
files, so /bin/bash /script.sh fails with "Permission denied".

Script content is not a secret (it's stored in the DB and editable by
the user), so 0644 is appropriate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 16:18:00 +02:00
mwiegand
879c54cbda
refactor(l4d2-web): drop refresh_global_overlays from scheduler
GLOBAL_OPERATIONS becomes {"install", "refresh_workshop_items"}.
Removes refresh_global_overlays_running from SchedulerState and the
_run_refresh_global_overlays dispatch. Drops dead test cases and pins
GLOBAL_OPERATIONS contents.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 15:45:34 +02:00
mwiegand
9f476e3456
refactor(l4d2-web): drop global-overlays subsystem in favor of script type
Deletes the global_map_sources, global_overlay_refresh, global_map_cache,
and global_overlays service modules and their tests. Removes the
refresh-global-overlays CLI command, the /admin/global-overlays/refresh
route, and the GlobalOverlaySource view in overlay_detail rendering.
Drops py7zr from dependencies — was only used by the deleted subsystem.

The job_worker scheduler still tracks refresh_global_overlays; that
cleanup is Task 4. Deploy/README references are Task 8.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 15:43:41 +02:00
mwiegand
d29afa41fa
feat(l4d2-web): ScriptBuilder + BUILDERS registry update
Adds ScriptBuilder that runs user-authored bash inside the
left4me-script-sandbox helper via run_command, with a 20 GB post-build
disk cap. Registry now {"workshop", "script"}.
finish_job writes Overlay.last_build_status on build_overlay completion.
Drops GlobalMapOverlayBuilder and the now-unreachable
_check_global_overlay_caches in l4d2_facade.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 15:39:13 +02:00
mwiegand
4552af6544
fix(l4d2-web): keep SSE log stream from pinning gunicorn threads
stream_command used a blocking proc.stdout.readline() that never woke
when the underlying journalctl was silent, so Flask never delivered
GeneratorExit on client disconnect — the worker thread and the journalctl
child both leaked permanently and pinned the gunicorn thread pool.

Switch to a select-based read loop with a 15s heartbeat tick (yielded as
""), and translate the tick to an SSE keepalive comment in the log route.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 11:18:56 +02:00
mwiegand
ffc4cdbd7d
refactor(l4d2-web): remove legacy external overlay type
The workshop + managed-global overlay surface fully covers the
admin-SFTP flow that 'external' was a placeholder for. Drop the type
from the model defaults, builder registry, routes, template, and
tests, and add migration 0004 that deletes any leftover external
rows along with their blueprint and job references.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 09:31:04 +02:00
mwiegand
92d6ebbe82
feat(l4d2-web): managed global map overlays with daily refresh
Adds two managed system overlays (l4d2center-maps, cedapug-maps) that
fetch curated map archives from upstream sources and reconcile addons
symlinks for non-Steam maps. A daily systemd timer enqueues a coalesced
refresh_global_overlays worker job; downloads, extraction, and rebuilds
run in the existing job worker and surface in the job log UI.

Schema: GlobalOverlaySource / GlobalOverlayItem / GlobalOverlayItemFile
plus nullable Job.user_id so system jobs render as "system" in the UI.
The new builder reconciles symlinks against the per-source vpk cache
and leaves foreign symlinks untouched. Initialize-time guard refuses
to mount a partial overlay if any expected vpk is missing from cache.

Refresh service uses shutil.move to handle EXDEV when /tmp and the
cache live on different filesystems.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 08:05:14 +02:00
mwiegand
4f78574edd
fix(l4d2-web): keep workshop refresh responsive
Limit workshop refresh downloads to one worker and commit build-overlay enqueue work before writing the final job log so SQLite locks do not wedge the web process.
2026-05-07 17:31:12 +02:00
mwiegand
ac020d1e77
feat(l4d2-web): initialize-time guard for uncached workshop items
Before invoking l4d2ctl initialize, run each blueprint overlay's builder
synchronously and then verify that every workshop item attached to the
blueprint has a cache file on disk. If any are missing, raise a clear
error naming the overlay and the missing steam_ids — server start can't
silently mount a partial overlay where some maps are mysteriously absent
in-game.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 16:53:04 +02:00
mwiegand
38a6fbbe1e
feat(l4d2-web): worker support for build_overlay and refresh_workshop_items
Extends SchedulerState with running_overlays / refresh_running /
blocked_servers_by_overlay, and updates can_start with the truth table:
install and refresh_workshop_items are global mutexes; build_overlay
serializes per-overlay; server jobs block on builds for any overlay
their blueprint references.

Adds enqueue_build_overlay coalescing helper that returns an existing
queued job for the same overlay rather than inserting a duplicate.

Adds run_job dispatch for build_overlay (BUILDERS[overlay.type].build)
and refresh_workshop_items (re-fetches metadata, re-downloads on
time_updated/filename change, enqueues coalesced rebuilds for affected
overlays).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 16:44:10 +02:00
mwiegand
700940d578
feat(l4d2-web): overlay builder registry with workshop builder
Adds l4d2web/services/overlay_builders.py with a BUILDERS dict mapping
Overlay.type to a builder class. ExternalBuilder is a no-op that just
ensures the overlay directory exists. WorkshopBuilder diff-applies
absolute symlinks under left4dead2/addons/ against the overlay's current
WorkshopItem associations: creates new ones, removes obsolete, leaves
unrelated files alone, and skips uncached items with a warning rather
than producing dangling symlinks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 16:40:30 +02:00
mwiegand
f0230e17d3
feat(l4d2-web): overlay path helpers and creation
Adds workshop_paths.cache_path(steam_id) returning
$LEFT4ME_ROOT/workshop_cache/{steam_id}.vpk with digit-only validation.

Adds overlay_creation.generate_overlay_path(id) and
create_overlay_directory(overlay) with exist_ok=False so a stray dir from a
prior failed delete surfaces loudly instead of shadowing fresh content.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 16:38:39 +02:00
mwiegand
c6b41429ee
feat(l4d2-web): steam workshop API client and downloader
Adds l4d2web/services/steam_workshop.py: parse_workshop_input (single ID,
URL, or multi-line batch), resolve_collection (HTTPS POST to
GetCollectionDetails), fetch_metadata_batch (HTTPS POST to
GetPublishedFileDetails with consumer_app_id == 550 enforcement that
raises WorkshopValidationError in add-mode and silently skips in
refresh-mode), download_to_cache (atomic + idempotent on mtime+size),
and refresh_all (ThreadPoolExecutor with per-item error collection).

Adds requests as an explicit dependency.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 16:37:39 +02:00
mwiegand
f81e839ba2
security: harden boundary inputs and production defaults
- validate instance names at the host lib and web boundary against
  [a-z0-9][a-z0-9_-]{0,63} to prevent path traversal via Server.name
- fail-closed on SECRET_KEY: load_config returns None when env unset,
  create_app raises if missing or "dev" outside TESTING
- close login timing oracle by hashing a dummy digest when the user
  is not found, equalizing response time
- set SESSION_COOKIE_SECURE outside TESTING
- delete_instance tolerates stop_service and fusermount3 failures so
  partially-initialized instances clean up without contract breaks;
  drops the is_mount() preflight that violated AGENTS.md
- document claim_next_job's single-process assumption
- clarify emit_step contract via docstring

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 00:53:33 +02:00
mwiegand
005d2d8458
fix(host): enforce flush=True to prevent pipeline block buffering 2026-05-06 20:34:41 +02:00
mwiegand
27a905c22b
feat(web): add boundary log lines to job worker execution 2026-05-06 20:18:23 +02:00
mwiegand
bbfc528354
feat(deploy): add production-like test deployment 2026-05-06 19:30:10 +02:00
mwiegand
de86139323
feat(l4d2): add l4d2ctl host command boundary 2026-05-06 16:35:20 +02:00
mwiegand
a347829608
feat(l4d2-web): add job pages and cancellation 2026-05-06 15:05:13 +02:00
mwiegand
91d042cf33
feat(l4d2-web): execute queued lifecycle jobs 2026-05-06 14:08:18 +02:00
mwiegand
288eda7c37
chore(l4d2): flatten component layout 2026-05-05 23:47:06 +02:00