docs(workshop): spec and plan for steam workshop overlays

Add a typed-overlay model with workshop as the first non-external type:
deduplicated WorkshopItem registry, symlink-based overlay directories,
auto-rebuild after item changes, admin global refresh, and a unified
Create-overlay UI with web-managed paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mwiegand, 2026-05-07 16:25:13 +02:00
parent d18b397330, commit b46f52258d
2 changed files with 783 additions and 0 deletions


# L4D2 Workshop Overlays Implementation Plan
> **Approval gate:** This plan may be written and refined without further approval. Do not implement code changes from this plan until the user explicitly approves implementation.
**Goal:** Implement the workshop overlay feature per `docs/superpowers/specs/2026-05-07-l4d2-workshop-overlays-design.md`. Add a `WorkshopItem` registry, a typed `Overlay.type` column with a builder registry, a workshop builder that downloads from the Steam Web API and manages symlinks into a deduplicated cache, and the supporting routes, templates, jobs, and tests.
**Architecture:** Keep the v1 single-process Flask architecture. New code is additive: a `WorkshopBuilder` class registered in a builder dispatcher, a `steam_workshop` service module for the Steam Web API and downloader, two new database tables and one extended one, and two new job operations on the existing in-process worker. fuse-overlayfs mount handling in `l4d2host` is unchanged — workshop content arrives at overlay paths the same way externals do today.
---
## Locked Decisions
See `docs/superpowers/specs/2026-05-07-l4d2-workshop-overlays-design.md` for the design rationale. Implementation-relevant decisions:
- Typed overlays: `external` (existing rows; no-op builder) and `workshop` (new); future types deferred.
- No JSON `source_config` blob; per-type structured data in proper tables.
- `WorkshopItem` is a global deduplicated registry keyed on `steam_id`. Cache at `/var/lib/left4me/workshop_cache/{steam_id}.vpk`.
- Overlay symlinks are absolute, named `{steam_id}.vpk`; no Steam filename in any on-disk path.
- `overlay_workshop_items` is a pure association; toggle = remove/re-add.
- Collections are atomic UI bulk-imports; DB never tracks collection attribution.
- Single global admin "Refresh all workshop items" button.
- No cache GC in v1.
- `Overlay.user_id` is the scope (NULL = system, set = private); independent of `type`.
- Workshop overlays default to private; existing externals stay system-wide.
- One unified Create-overlay button with type radio; no path field — paths are always `str(overlay_id)`.
- `consumer_app_id == 550` validated at fetch/add; not stored.
- Input field accepts numeric ID, full Workshop URL, or multi-line batch.
- Auto-rebuild after add/remove with build coalescing.
- HTTPS for all Steam Web API calls.
- `Overlay.id` uses `AUTOINCREMENT`; `create_overlay_directory` uses `exist_ok=False`.
- Two partial unique indexes for overlay names: `(name) WHERE user_id IS NULL` and `(name, user_id) WHERE user_id IS NOT NULL`.
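The two partial unique indexes from the last decision can be exercised directly in SQLite; a minimal sketch with the table reduced to the relevant columns:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE overlays ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "name TEXT NOT NULL, user_id INTEGER)")
# System overlays (user_id IS NULL) share one global name namespace...
con.execute("CREATE UNIQUE INDEX uq_overlay_name_system "
            "ON overlays(name) WHERE user_id IS NULL")
# ...while private overlays are unique per owner only.
con.execute("CREATE UNIQUE INDEX uq_overlay_name_per_user "
            "ON overlays(name, user_id) WHERE user_id IS NOT NULL")

con.execute("INSERT INTO overlays (name, user_id) VALUES ('maps', NULL)")
try:
    con.execute("INSERT INTO overlays (name, user_id) VALUES ('maps', NULL)")
except sqlite3.IntegrityError:
    pass  # duplicate system name rejected by the partial index
con.execute("INSERT INTO overlays (name, user_id) VALUES ('my-maps', 1)")
con.execute("INSERT INTO overlays (name, user_id) VALUES ('my-maps', 2)")  # ok: different owners
```

A plain composite `UNIQUE(name, user_id)` would have let the second `('maps', NULL)` insert through, since SQLite treats NULLs as distinct.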
---
## Current Gap
- `Overlay` rows have `id`, `name`, `path`, no type, no scope.
- The web app cannot download anything from Steam; users must SFTP `.vpk` files into prepared overlay directories.
- The job worker has no operations for overlay builds or workshop refreshes.
- The mount/build pipeline assumes overlay directories are externally populated.
- There is no UI affordance to add or list workshop content.
---
## Task 1: Extend Tests First — Schema Migration And Models
**Files:**
- Create: `l4d2web/tests/test_workshop_overlay_models.py`
- Modify: `l4d2web/tests/test_models.py` (extend) — partial unique index behavior
Write tests against fresh SQLite schemas asserting:
- An `Overlay` migration round-trip: existing rows acquire `type='external'` and `user_id=NULL`; their `name` values remain unique by partial index.
- After migration, two externals (both `user_id=NULL`) with the same name are rejected by the system partial unique index.
- After migration, two users may both own a workshop overlay named `"my-maps"` (per-user partial unique index).
- `WorkshopItem.steam_id` is unique; concurrent inserts of the same `steam_id` raise integrity errors.
- `overlay_workshop_items` enforces `UNIQUE(overlay_id, workshop_item_id)`.
- `Overlay` deletion cascades `overlay_workshop_items` rows but does not delete `WorkshopItem` rows (`ON DELETE RESTRICT`).
- `Job.overlay_id` is nullable and references `overlays(id)`.
- `Overlay.id` does not reuse a deleted ID after the migration (AUTOINCREMENT).
Verification command:
```bash
pytest l4d2web/tests/test_workshop_overlay_models.py l4d2web/tests/test_models.py -q
```
Expected before implementation: FAIL.
---
## Task 2: Schema Migration And ORM Mappings
**Files:**
- Create: `l4d2web/alembic/versions/0002_workshop_overlays.py`
- Modify: `l4d2web/models.py`
Migration `0002_workshop_overlays` (`down_revision = "b2c684fddbd3"`):
1. `op.batch_alter_table("overlays")`:
- Add `type VARCHAR(16) NOT NULL DEFAULT 'external'` (server_default during migration; remove after backfill).
- Add `user_id INTEGER NULL REFERENCES users(id)`.
- Drop the existing `unique=True` on `name`.
- Add index `ix_overlays_type_user_id` on `(type, user_id)`.
- Switch `id` to `AUTOINCREMENT`.
2. After batch alter, create the two partial unique indexes via raw `op.create_index(..., postgresql_where=..., sqlite_where=...)`:
- `uq_overlay_name_system` on `(name)` `WHERE user_id IS NULL`.
- `uq_overlay_name_per_user` on `(name, user_id)` `WHERE user_id IS NOT NULL`.
3. `op.create_table("workshop_items", ...)` per spec data-model section.
4. `op.create_table("overlay_workshop_items", ...)` with the unique constraint and the reverse-lookup index.
5. `op.batch_alter_table("jobs")`: add `overlay_id INTEGER NULL REFERENCES overlays(id)`.
ORM (`models.py`):
- Extend `Overlay`: add `type`, `user_id`. Drop `unique=True` on `name`. Set `__table_args__` with the two partial indexes and `ix_overlays_type_user_id`.
- Extend `Job`: add `overlay_id` mapped column with FK.
- New `WorkshopItem` and `OverlayWorkshopItem` classes per spec. Set up `Overlay.workshop_items` relationship through the association.
Verification command:
```bash
pytest l4d2web/tests/test_workshop_overlay_models.py l4d2web/tests/test_models.py -q
```
Expected after implementation: PASS.
Run alembic against a fresh test DB to verify upgrade and downgrade succeed.
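The `AUTOINCREMENT` requirement is observable on a plain SQLite connection; a minimal demonstration of the no-reuse behavior the Task 1 tests assert:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# AUTOINCREMENT tracks the high-water mark in sqlite_sequence, so a
# deleted overlay id is never handed out again.
con.execute("CREATE TABLE overlays ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL)")
con.execute("INSERT INTO overlays (name) VALUES ('a')")  # id 1
con.execute("INSERT INTO overlays (name) VALUES ('b')")  # id 2
con.execute("DELETE FROM overlays WHERE id = 2")
cur = con.execute("INSERT INTO overlays (name) VALUES ('c')")
assert cur.lastrowid == 3  # a bare INTEGER PRIMARY KEY could reuse 2 here
```

This matters because `path = str(overlay_id)`: reusing a deleted id would silently collide with a stale on-disk overlay directory.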
---
## Task 3: Tests First — Steam Web API And Downloader
**Files:**
- Create: `l4d2web/tests/test_steam_workshop.py`
Mock HTTP with `responses` or `pytest-httpserver`. Cover:
- `parse_workshop_input` accepts a single numeric ID, a single Workshop URL (`steamcommunity.com/sharedfiles/filedetails/?id=N`), and a multi-line whitespace-separated batch of either; returns deduplicated ordered list of digit-only IDs.
- `parse_workshop_input` rejects garbage, paths outside `?id=`, non-digit IDs.
- `resolve_collection` POSTs to the HTTPS endpoint with the form-encoded payload and returns `publishedfileid` children.
- `fetch_metadata_batch` POSTs once with `itemcount=N`; returns parsed `WorkshopMetadata` per item; captures `result != 1` into `last_error`; raises `WorkshopValidationError` when any `consumer_app_id != 550` during user-add; logs and skips during refresh-mode.
- `WorkshopMetadata.preview_url` is captured.
- `download_to_cache` writes `cache_root/{steam_id}.vpk.partial`, then `os.replace` to the final name; sets `os.utime(file, (time_updated, time_updated))`.
- `download_to_cache` is idempotent: a second call where on-disk `(mtime, size)` matches `(time_updated, file_size)` is a no-op (no HTTP request issued).
- `refresh_all` runs downloads via `ThreadPoolExecutor(max_workers=8)` and reports per-item errors without aborting the batch.
- All Steam API URLs use `https://`.
Verification command:
```bash
pytest l4d2web/tests/test_steam_workshop.py -q
```
Expected before implementation: FAIL.
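The form-encoded payload shape those tests expect can be sketched as follows (the helper name is hypothetical; `GetPublishedFileDetails` takes `itemcount` plus indexed `publishedfileids[i]` fields):

```python
def metadata_payload(steam_ids: list[str]) -> dict[str, str]:
    """Build the GetPublishedFileDetails form body: itemcount plus indexed ids."""
    payload = {"itemcount": str(len(steam_ids))}
    for i, sid in enumerate(steam_ids):
        payload[f"publishedfileids[{i}]"] = sid
    return payload
```

`resolve_collection` would build the analogous body with `collectioncount` and `publishedfileids[i]` against the collection endpoint.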
---
## Task 4: Steam Workshop Service Module
**Files:**
- Create: `l4d2web/services/steam_workshop.py`
Public surface:
```python
def parse_workshop_input(raw: str) -> list[str]: ...
def resolve_collection(collection_id: str) -> list[str]: ...
def fetch_metadata_batch(steam_ids: list[str], *, mode: Literal["add","refresh"]) -> list[WorkshopMetadata]: ...
def download_to_cache(meta: WorkshopMetadata, cache_root: Path, *, on_progress=None, should_cancel=None) -> Path: ...
def refresh_all(items: list[WorkshopItem], cache_root: Path, executor_workers: int = 8) -> RefreshReport: ...
```
Implementation rules:
- Endpoints are HTTPS:
- `https://api.steampowered.com/ISteamRemoteStorage/GetCollectionDetails/v1/`
- `https://api.steampowered.com/ISteamRemoteStorage/GetPublishedFileDetails/v1/`
- Form-encoded POSTs with `itemcount=N` / `collectioncount=N` and `publishedfileids[i]=…` per index.
- Per-request timeout 30s; per-item ceiling 5min. No retry or backoff in v1.
- `consumer_app_id != 550`:
- In `mode="add"`: raise `WorkshopValidationError` with the offending `steam_id`.
- In `mode="refresh"`: log and skip; do not abort other items.
- `result != 1`: capture Steam's result code in the item's `last_error`; do not download; do not abort siblings.
- Cooperative cancellation: `download_to_cache` checks `should_cancel()` between chunked reads; `refresh_all`'s executor checks before each task.
- `WorkshopMetadata` is a dataclass with `steam_id, title, filename, file_url, file_size, time_updated, preview_url, consumer_app_id, result`.
- `RefreshReport` aggregates per-item outcomes for the caller's job log.
- Use a single `requests.Session` per call site for connection reuse.
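A minimal sketch of `parse_workshop_input` under the rules above (the regex and error type are assumptions, not the final implementation):

```python
import re

_URL_ID = re.compile(r"steamcommunity\.com/sharedfiles/filedetails/\?id=(\d+)")

def parse_workshop_input(raw: str) -> list[str]:
    """Accept bare numeric ids or Workshop URLs, one or many; dedupe, keep order."""
    seen: set[str] = set()
    out: list[str] = []
    for token in raw.split():
        m = _URL_ID.search(token)
        if m:
            steam_id = m.group(1)
        elif token.isascii() and token.isdigit():
            steam_id = token
        else:
            raise ValueError(f"not a workshop id or URL: {token!r}")
        if steam_id not in seen:
            seen.add(steam_id)
            out.append(steam_id)
    return out
```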
Verification command:
```bash
pytest l4d2web/tests/test_steam_workshop.py -q
```
Expected after implementation: PASS.
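The idempotent download contract can be sketched as below; `meta` carries the `WorkshopMetadata` fields and `session` is a `requests.Session`-style object (unused on a cache hit, which is what makes the no-op testable without HTTP):

```python
import os
from pathlib import Path

def download_to_cache(meta, cache_root: Path, session) -> Path:
    """Sketch: atomic, idempotent download of one workshop item into the cache."""
    final = cache_root / f"{meta.steam_id}.vpk"
    if final.exists():
        st = final.stat()
        if (int(st.st_mtime), st.st_size) == (meta.time_updated, meta.file_size):
            return final  # (mtime, size) matches Steam: no HTTP request at all
    partial = final.parent / (final.name + ".partial")
    with session.get(meta.file_url, stream=True, timeout=30) as resp:
        resp.raise_for_status()
        with open(partial, "wb") as fh:
            for chunk in resp.iter_content(1 << 16):
                fh.write(chunk)
    os.replace(partial, final)  # atomic publish under the final name
    os.utime(final, (meta.time_updated, meta.time_updated))  # stamp for the check above
    return final
```

Stamping the file's mtime with Steam's `time_updated` is what lets the next call detect currency from the filesystem alone.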
---
## Task 5: Tests First — Path Helpers And Overlay Creation
**Files:**
- Create: `l4d2web/tests/test_workshop_paths.py`
- Create: `l4d2web/tests/test_overlay_creation.py`
Cover:
- `workshop_cache_root()` returns `LEFT4ME_ROOT/workshop_cache`.
- `cache_path(steam_id)` returns `cache_root / f"{steam_id}.vpk"` for valid digit strings; rejects non-digits, slashes, dot-dot.
- `generate_overlay_path(overlay_id)` returns `str(overlay_id)`; passes `validate_overlay_ref` from `l4d2host.paths`.
- `create_overlay_directory(overlay)` creates `LEFT4ME_ROOT/overlays/{path}/` with `exist_ok=False`. Calling twice raises (DB/disk drift surfaced loudly).
Verification command:
```bash
pytest l4d2web/tests/test_workshop_paths.py l4d2web/tests/test_overlay_creation.py -q
```
Expected before implementation: FAIL.
---
## Task 6: Path Helpers And Overlay Creation
**Files:**
- Create: `l4d2web/services/workshop_paths.py`
- Create: `l4d2web/services/overlay_creation.py`
`workshop_paths`:
```python
def workshop_cache_root() -> Path: ... # LEFT4ME_ROOT/workshop_cache
def cache_path(steam_id: str) -> Path: ... # validates digits-only; returns cache_root/{steam_id}.vpk
```
`overlay_creation`:
```python
def generate_overlay_path(overlay_id: int) -> str: ... # str(overlay_id) + validate_overlay_ref
def create_overlay_directory(overlay: Overlay) -> None:  # makedirs(..., exist_ok=False)
    ...
```
Verification command:
```bash
pytest l4d2web/tests/test_workshop_paths.py l4d2web/tests/test_overlay_creation.py -q
```
Expected after implementation: PASS.
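The traversal guard in `cache_path` reduces to a strict digits check; a sketch with a standalone signature for illustration:

```python
import re
from pathlib import Path

def cache_path(cache_root: Path, steam_id: str) -> Path:
    """Reject anything but ASCII digits so '..' or 'a/b' can never reach the path."""
    if not re.fullmatch(r"[0-9]+", steam_id):
        raise ValueError(f"invalid steam_id: {steam_id!r}")
    return cache_root / f"{steam_id}.vpk"
```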
---
## Task 7: Tests First — Overlay Builders
**Files:**
- Create: `l4d2web/tests/test_overlay_builders.py`
Cover with `tmp_path`:
- `BUILDERS` dict resolves `"external"` and `"workshop"` to instances; unknown types raise `KeyError` (caller's error).
- `ExternalBuilder.build()` is a no-op: makes the overlay directory if missing, writes one log line, returns. Existing files in the directory are untouched.
- `WorkshopBuilder.build()` against a fixture overlay with three associated `WorkshopItem` rows (two with cache files present, one without):
- Creates `left4dead2/addons/` if missing.
- Creates symlinks `addons/{steam_id_a}.vpk → cache_root/{steam_id_a}.vpk` for items with cache files. Symlinks are absolute.
- Skips the uncached item; emits a warning log line. Does not create a dangling symlink.
- On a re-run with the same associations: no FS changes; logs report `unchanged=2 skipped(uncached)=1`.
- On a re-run after one association is removed: removes the obsolete symlink only; leaves cache files alone.
- On a re-run after one item is added: adds only the new symlink.
- Files in `addons/` that aren't symlinks into the cache are left untouched.
- `should_cancel` mid-build: stops between filesystem ops; partial state is consistent and a re-run heals.
Verification command:
```bash
pytest l4d2web/tests/test_overlay_builders.py -q
```
Expected before implementation: FAIL.
---
## Task 8: Overlay Builders And Dispatcher
**Files:**
- Create: `l4d2web/services/overlay_builders.py`
```python
class OverlayBuilder(Protocol):
    def build(self, overlay: Overlay, *, on_stdout, on_stderr, should_cancel) -> None: ...

class ExternalBuilder: ...
class WorkshopBuilder: ...

BUILDERS: dict[str, OverlayBuilder] = {
    "external": ExternalBuilder(),
    "workshop": WorkshopBuilder(),
}
```
`WorkshopBuilder.build()`:
1. Load the overlay's `WorkshopItem` rows.
2. `os.makedirs(overlay_root / "left4dead2/addons", exist_ok=True)`.
3. Compute `desired = {f"{steam_id}.vpk": cache_path(steam_id)}` for items where `last_downloaded_at IS NOT NULL` and the cache file exists. Skip and warn for items missing a cache file.
4. Inspect existing entries in `addons/` via `os.scandir`: keep entries that are not symlinks into `workshop_cache`; otherwise diff against `desired` and apply changes via `os.unlink` and `os.symlink(absolute_target, link_path)`.
5. Emit a `created N, removed M, unchanged K, skipped (uncached) S` log line.
6. Check `should_cancel()` between filesystem ops.
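Steps 3–5 amount to a set diff between existing cache-pointing symlinks and the desired mapping; a sketch (helper name and signature are assumptions):

```python
import os
from pathlib import Path

def sync_workshop_symlinks(addons: Path, desired: dict[str, Path],
                           cache_root: Path, log) -> None:
    """Diff-apply: manage only symlinks into cache_root; leave other files alone."""
    addons.mkdir(parents=True, exist_ok=True)
    created = removed = unchanged = 0
    for entry in list(os.scandir(addons)):
        link = Path(entry.path)
        if not link.is_symlink():
            continue  # hand-placed real files are not ours to touch
        target = Path(os.readlink(link))
        if target.parent != cache_root:
            continue  # symlink into somewhere else: also not ours
        if desired.get(link.name) == target:
            unchanged += 1
        else:
            link.unlink()
            removed += 1
    for name, target in desired.items():
        link = addons / name
        if link.is_symlink() or link.exists():
            continue  # already correct (counted above) or shadowed by a real file
        os.symlink(target, link)  # absolute target into the shared cache
        created += 1
    log(f"created {created}, removed {removed}, unchanged {unchanged}")
```

Because the diff is recomputed from disk on every run, a cancelled or crashed build heals on the next invocation.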
Verification command:
```bash
pytest l4d2web/tests/test_overlay_builders.py -q
```
Expected after implementation: PASS.
---
## Task 9: Tests First — Worker Scheduler Truth Table And Coalescing
**Files:**
- Modify: `l4d2web/tests/test_job_worker.py`
Add coverage:
- Truth table for `can_start`:
- `install` not claimed while `refresh_workshop_items`, any `build_overlay`, or any server job is running.
- `refresh_workshop_items` not claimed while `install`, any `build_overlay`, or any server job is running.
- `build_overlay(N)` not claimed while `install`, `refresh_workshop_items`, or another `build_overlay(N)` is running. Two `build_overlay` jobs for **different** overlay IDs claim concurrently.
- Server start/init blocks if `refresh_workshop_items` runs or if any `build_overlay(N)` runs where N ∈ overlays of the server's blueprint.
- `enqueue_build_overlay(overlay_id)`:
- Inserts a new queued job when no pending job exists.
- Returns the existing pending job when one is already queued (coalescing).
- Does not coalesce against running jobs (a new add after build start gets a fresh queued job).
- `refresh_workshop_items` post-completion enqueues `build_overlay` only for overlays whose items had `time_updated` advance or `filename` change; each such enqueue uses the coalescing helper.
Verification command:
```bash
pytest l4d2web/tests/test_job_worker.py -q
```
Expected before implementation: FAIL.
---
## Task 10: Worker Scheduler And New Operations
**Files:**
- Modify: `l4d2web/services/job_worker.py`
Changes:
- Define `OVERLAY_OPERATIONS = {"build_overlay"}` and `GLOBAL_OPERATIONS = {"install", "refresh_workshop_items"}`. Update `malformed_server_job` to allow `server_id IS NULL` for these.
- Extend `SchedulerState` with `running_overlays: set[int]` and `refresh_running: bool`.
- Update `claim_next_job()`:
- Compute `running_overlays` from queries against `running` jobs of operation `build_overlay`.
- Apply the truth-table rules above.
- Continue using `created_at, id` ordering for deterministic claim.
- Add `enqueue_build_overlay(overlay_id: int) -> Job` helper:
- Look for `queued` `build_overlay` job with same `overlay_id`. Return it if present.
- Otherwise insert a new queued job with `overlay_id` set, `server_id=None`, `operation="build_overlay"`.
- Update `run_job` dispatch:
- `build_overlay` → load `Overlay`, dispatch to `BUILDERS[overlay.type].build(overlay, on_stdout, on_stderr, should_cancel)`.
- `refresh_workshop_items` → call `steam_workshop.refresh_all(...)`. After completion, for each affected overlay, call `enqueue_build_overlay(overlay_id)`.
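The coalescing helper can be sketched against a reduced `jobs` table (columns trimmed to the relevant ones):

```python
import sqlite3

def enqueue_build_overlay(con: sqlite3.Connection, overlay_id: int) -> int:
    """Coalesce: reuse a queued build for this overlay, never a running one."""
    row = con.execute(
        "SELECT id FROM jobs WHERE operation = 'build_overlay' "
        "AND overlay_id = ? AND status = 'queued'",
        (overlay_id,)).fetchone()
    if row:
        return row[0]  # the pending build will pick up the new state anyway
    cur = con.execute(
        "INSERT INTO jobs (operation, overlay_id, status) "
        "VALUES ('build_overlay', ?, 'queued')",
        (overlay_id,))
    return cur.lastrowid
```

Not coalescing against running jobs is deliberate: a build that already started may have read stale associations, so a fresh queued job is required.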
Verification command:
```bash
pytest l4d2web/tests/test_job_worker.py -q
```
Expected after implementation: PASS.
---
## Task 11: Tests First — Routes, Permissions, And Auto-Rebuild
**Files:**
- Modify: `l4d2web/tests/test_overlays.py`
- Create: `l4d2web/tests/test_workshop_routes.py`
Cover:
- `POST /overlays` with `type='workshop'` and `name` succeeds for any logged-in user; `path` is auto-generated; `user_id` is set; the directory exists at `LEFT4ME_ROOT/overlays/{id}`.
- `POST /overlays` with `type='external'` succeeds only for admins; `user_id` is NULL.
- Duplicate workshop name within the same user is rejected; duplicate names across users are accepted.
- Duplicate external name is rejected.
- Listing overlays as a non-admin returns only rows matching `type='external' OR user_id=current_user.id`.
- `POST /overlays/{id}/items` with one numeric ID adds an association and enqueues a coalesced `build_overlay`. The response is an HTMX fragment of the updated item table.
- `POST /overlays/{id}/items` with a multi-line batch (mix of IDs and URLs) adds all and enqueues one coalesced job for the batch.
- `POST /overlays/{id}/items` with a collection ID resolves members and adds N associations.
- Adding a non-L4D2 item (`consumer_app_id != 550`) returns HTTP 400 with a useful message; no association is created.
- Adding an item already in the overlay returns "already in overlay" (no 500).
- `POST /overlays/{id}/items/{item_id}/delete` removes the association and enqueues a coalesced build.
- `POST /overlays/{id}/build` enqueues the manual rebuild and redirects to the job page.
- `POST /admin/workshop/refresh` is admin-only; non-admins receive 403.
Mock `steam_workshop` HTTP layer for these tests.
Verification command:
```bash
pytest l4d2web/tests/test_overlays.py l4d2web/tests/test_workshop_routes.py -q
```
Expected before implementation: FAIL.
---
## Task 12: Routes And Templates
**Files:**
- Modify: `l4d2web/routes/overlay_routes.py`
- Create: `l4d2web/routes/workshop_routes.py`
- Modify: `l4d2web/routes/page_routes.py`
- Modify: `l4d2web/templates/overlays.html`
- Modify: `l4d2web/templates/overlay_detail.html`
- Create: `l4d2web/templates/_overlay_item_table.html`
- Modify: `l4d2web/templates/admin.html`
- Modify: `l4d2web/app.py` (register the workshop blueprint)
`overlay_routes.py`:
- `create_overlay`: read `type` and `name` from form. No `path` field accepted.
- `type='external'`: admin-only; `user_id=NULL`. After insert, set `path = generate_overlay_path(id)`; call `create_overlay_directory(overlay)`.
- `type='workshop'`: any logged-in user; `user_id=current_user.id`. After insert, set `path = generate_overlay_path(id)`; call `create_overlay_directory(overlay)`.
- `update_overlay`: forbid changing `type` and `path`. Workshop: owner or admin can edit `name`. External: admin-only `name` edits.
- `delete_overlay`: after the row deletes, `shutil.rmtree(LEFT4ME_ROOT/overlays/{path})` only if `overlay.path == str(overlay.id)` (legacy externals are left alone). Cache untouched.
`workshop_routes.py`:
- `POST /overlays/{id}/items`: parse input via `parse_workshop_input`; if a collection ID, resolve members; batch-fetch metadata in `mode="add"`; reject non-550 with HTTP 400; upsert `WorkshopItem` via SQLite `INSERT ... ON CONFLICT DO UPDATE` on `steam_id`; bulk-add associations catching `(overlay_id, workshop_item_id)` unique violations; call `enqueue_build_overlay(overlay_id)`; return rendered `_overlay_item_table.html` fragment.
- `POST /overlays/{id}/items/{item_id}/delete`: ownership check; remove association; call `enqueue_build_overlay(overlay_id)`; return updated fragment.
- `POST /overlays/{id}/build`: ownership check; enqueue (coalesced); redirect to `/jobs/{job_id}`.
- `POST /admin/workshop/refresh`: `@require_admin`; insert a `refresh_workshop_items` queued job; redirect to `/admin/jobs`.
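The `steam_id` upsert in the add-items route maps onto SQLite's `ON CONFLICT DO UPDATE`; a reduced sketch (columns trimmed for illustration):

```python
import sqlite3

def upsert_workshop_item(con: sqlite3.Connection, steam_id: str,
                         title: str, time_updated: int) -> int:
    """Insert or refresh a global registry row; returns the WorkshopItem id."""
    con.execute(
        "INSERT INTO workshop_items (steam_id, title, time_updated) "
        "VALUES (?, ?, ?) "
        "ON CONFLICT(steam_id) DO UPDATE SET "
        "title = excluded.title, time_updated = excluded.time_updated",
        (steam_id, title, time_updated))
    return con.execute(
        "SELECT id FROM workshop_items WHERE steam_id = ?",
        (steam_id,)).fetchone()[0]
```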
`page_routes.py`:
- `overlays()`: admins see all; non-admins see `type='external' OR user_id=current_user.id`.
- `overlay_detail()`: load `WorkshopItem` rows for workshop-type overlays.
Templates:
- `overlays.html`: add Type column. Modal has type radio (External | Workshop) and name field. No path field.
- `overlay_detail.html`: branch on `overlay.type`.
- External view: read-only path display, name edit (admin only).
- Workshop view: a `<textarea>` accepting one or many IDs/URLs plus a radio (Items | Collection); item table with thumbnail (`preview_url`), `steam_id` linked to Steam, title, filename, time_updated, file_size, last_error, Remove; Rebuild button; small status indicator showing the latest related job.
- `_overlay_item_table.html`: renderable standalone for HTMX swaps.
- `admin.html`: add a CSRF-protected "Refresh all workshop items" button.
Verification command:
```bash
pytest l4d2web/tests/test_overlays.py l4d2web/tests/test_workshop_routes.py -q
```
Expected after implementation: PASS.
---
## Task 13: Tests First — Initialize-Time Guard
**Files:**
- Modify: `l4d2web/tests/test_l4d2_facade.py` (or create if missing)
Cover:
- `initialize_server(server_id)` calls `BUILDERS[overlay.type].build()` for each overlay in the blueprint before writing the spec.
- For workshop overlays, when an associated `WorkshopItem` lacks a cache file (`workshop_cache/{steam_id}.vpk` missing), `initialize_server` raises a clear error containing the missing `steam_id`s and the overlay name; the spec is not written; `l4d2ctl initialize` is not invoked.
- For workshop overlays where all items have cache files, the symlinks are present and `l4d2ctl initialize` runs.
Verification command:
```bash
pytest l4d2web/tests/test_l4d2_facade.py -q
```
Expected before implementation: FAIL.
---
## Task 14: Initialize-Time Guard
**Files:**
- Modify: `l4d2web/services/l4d2_facade.py`
Implementation:
- Before writing the temp spec, iterate over the blueprint's overlays and call `BUILDERS[overlay.type].build(...)`.
- For workshop overlays, the builder logs and skips uncached items rather than failing. After all builders run, perform a second pass: query the blueprint's workshop overlays for any associated `WorkshopItem` with no cache file. If any are found, raise an exception whose message names the missing `steam_id`s and points at the overlay page (`Open overlay {name} ({id}) and click Build`).
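The second pass reduces to collecting uncached ids per workshop overlay; a sketch over plain objects (attribute names follow the data model):

```python
from pathlib import Path

def missing_cache_items(overlays, cache_root: Path) -> dict[str, list[str]]:
    """Map overlay name -> steam_ids whose cache file is absent (workshop only)."""
    missing: dict[str, list[str]] = {}
    for overlay in overlays:
        if overlay.type != "workshop":
            continue
        absent = [item.steam_id for item in overlay.workshop_items
                  if not (cache_root / f"{item.steam_id}.vpk").exists()]
        if absent:
            missing[overlay.name] = absent
    return missing
```

`initialize_server` would raise when the returned dict is non-empty, naming each overlay and its missing ids in the message.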
Verification command:
```bash
pytest l4d2web/tests/test_l4d2_facade.py -q
```
Expected after implementation: PASS.
---
## Task 15: Deploy Provisioning
**Files:**
- Modify: `deploy/install.sh` (or whichever provisioning script creates `/var/lib/left4me/`)
- Modify: `deploy/README.md`
Behavior:
- Provisioning creates `/var/lib/left4me/workshop_cache/` (mode 0755), owned by the web user.
- `deploy/README.md` documents:
- The new directory and its purpose.
- Permission requirement: web user owns; host user reads (shared group with `g+r` if uids differ).
- `LEFT4ME_ROOT` layout updated with the new subtree.
No tests; verify via test deploy.
---
## Task 16: Full Verification And Manual Test Plan
Run focused suites first:
```bash
pytest l4d2web/tests/test_workshop_overlay_models.py -q
pytest l4d2web/tests/test_models.py -q
pytest l4d2web/tests/test_steam_workshop.py -q
pytest l4d2web/tests/test_workshop_paths.py l4d2web/tests/test_overlay_creation.py -q
pytest l4d2web/tests/test_overlay_builders.py -q
pytest l4d2web/tests/test_job_worker.py -q
pytest l4d2web/tests/test_overlays.py l4d2web/tests/test_workshop_routes.py -q
pytest l4d2web/tests/test_l4d2_facade.py -q
```
Then run the full web suite:
```bash
pytest l4d2web/tests -q
```
Manual test plan on the test deploy:
1. Apply migration on a copy of the prod DB; verify all existing overlays read as `type='external'`, `user_id=NULL`; names still unique by partial index; two externals with the same name are rejected.
2. As non-admin, create a workshop overlay. Add a known popular L4D2 addon by URL. Verify the build job auto-enqueues. Verify symlink + cache file. Confirm web UI shows metadata and thumbnail.
3. Paste a multi-line block of item IDs and URLs. Verify all are parsed and added; verify coalescing (only one `build_overlay` job runs).
4. Add a 50-item collection. Verify all 50 metadata rows appear and no UI mention of "from collection". Verify single coalesced build job.
5. Remove an item. Verify auto-rebuild removes the symlink while the cache file remains.
6. As admin, click Refresh All. Verify only items with newer `time_updated` re-download. Verify affected overlays get coalesced `build_overlay` jobs enqueued.
7. Boot an L4D2 server with a workshop overlay attached. Connect locally and confirm the maps appear in the map vote and load.
8. Concurrency probe: enqueue Refresh All while a `build_overlay` is queued; verify scheduler waits per truth table.
9. Initialize-time guard: manually delete a cache file for an item that's in an overlay attached to a server's blueprint. Try to start the server; verify clear error mentioning the missing `steam_id`.
10. Negative: paste a non-L4D2 workshop ID (e.g., a Skyrim mod). Expect HTTP 400 with a clear message; no row inserted.
11. Negative: simulate Steam API down (block egress). Verify add fails with clean error, not 500. Verify refresh job logs the failure.
---
## Commit Strategy
Use small commits after passing relevant tests:
1. `feat(l4d2-web): typed overlays + workshop schema migration`
2. `feat(l4d2-web): steam workshop API client and downloader`
3. `feat(l4d2-web): overlay path helpers and creation`
4. `feat(l4d2-web): overlay builder registry with workshop builder`
5. `feat(l4d2-web): worker support for build_overlay and refresh_workshop_items`
6. `feat(l4d2-web): workshop overlay UI (routes + templates)`
7. `feat(l4d2-web): initialize-time guard for uncached workshop items`
8. `feat(deploy): workshop_cache provisioning`
Do not commit unless the user explicitly asks for commits.
---
## Open Approval Gate
Before modifying implementation files, ask the user for explicit approval to proceed with the workshop-overlays implementation.


# L4D2 Workshop Overlays Design
**Goal:** Let users add Steam Workshop content (.vpk addons and maps) to L4D2 servers from the web UI. Workshop downloads run as a new typed overlay that fits the existing `Overlay` + `BlueprintOverlay` model, downloaded via the public Steam Web API and exposed through the existing fuse-overlayfs mount layer.
**Approval status:** User-approved design direction. Implementation proceeds in lockstep with the companion plan at `docs/superpowers/plans/2026-05-07-l4d2-workshop-overlays.md`.
## Context
`left4me` users today add `.vpk` content to a server only by SFTP-ing files into a manually-prepared overlay directory or by maintaining shell scripts (`competitive_rework`, `workshop_maps`, `tickrate`, etc.) that wrap `curl`/`steamcmd`. The web app exposes overlay rows but offers no way for users to populate them.
This spec adds **workshop overlays**: a user-private overlay type that downloads `.vpk` files via the public `ISteamRemoteStorage` API and surfaces them through the existing mount layer. Users keep composing blueprints by stacking overlays — workshop overlays become another row alongside today's externally-managed ones.
This is the first *typed* overlay. The design adds a `type` column and a builder-registry so future overlay types (tarball, inline, manual upload) plug in without schema churn or workflow changes.
Steam Workshop content for L4D2 (consumer_app_id 550) is downloadable via two anonymous-POST endpoints with no Steam Web API key required: `GetCollectionDetails` resolves a collection ID to its child item IDs, and `GetPublishedFileDetails` returns per-item metadata including a public `file_url` for the `.vpk`. This is the same API the user's existing `steam-workshop-download` script uses.
L4D2-specific player-side pain points (sv_consistency / RestrictAddons configuration gotchas, the inability to push workshop content via `sv_downloadurl`) are documented in **Out of scope** and tracked as separate follow-ups. This spec stays strictly on workshop content acquisition.
## Locked Decisions
1. **Typed overlays.** `Overlay.type` joins `external` (existing rows; admin-managed; no-op builder) and `workshop` (new). Future types — tarball, inline, manual upload — slot in via the same builder registry without schema churn.
2. **No JSON `source_config` blob.** Per-type structured data lives in proper relational tables. JSON is reserved for genuinely opaque diagnostic payloads.
3. **Central deduplicated `WorkshopItem` registry** keyed on `steam_id`. Cache lives at `/var/lib/left4me/workshop_cache/{steam_id}.vpk`. Multiple overlays referencing the same Steam item share the same cache file.
4. **Symlinks, not copies.** Overlay directories contain `left4dead2/addons/{steam_id}.vpk` symlinks pointing into the cache. Both the cache file and the symlink are named by `{steam_id}` only — no Steam filename in any on-disk path, so Steam can rename the upstream `.vpk` without breaking lookup.
5. **Many-to-many association is pure** (no `enabled` flag). Toggle a workshop item by removing or re-adding the association. The shared cache makes this cheap.
6. **Collections are atomic UI bulk-imports.** Pasting a collection URL/ID resolves member items and creates N item associations. The DB never tracks "this came from a collection." Re-importing a collection is idempotent on existing items and additive for new ones.
7. **Single global admin "Refresh all workshop items" button.** One Steam metadata batch call, then re-download items whose `time_updated` advanced. No per-item, per-overlay, or scheduled refresh in v1.
8. **No cache GC in v1.** Cache grows monotonically. Reference-counted cleanup is a follow-up.
9. **Globality is independent of overlay type.** `Overlay.user_id` is the scope (NULL = system-wide, set = private to that user). v1 defaults newly-created workshop overlays to private and leaves existing external overlays as system-wide. A future "publish/share" button will let owners toggle `user_id` without changing type.
10. **One unified "Create overlay" UI button.** Modal has a type radio (External | Workshop). No path field — the web app generates the path for every new overlay.
11. **Strict scope.** v1 ships only the workshop type. L4D2 server-config gotchas, client-subscription helpers, other recipe types — all deferred to follow-up specs.
12. **`consumer_app_id == 550` validation** at every Steam API response at fetch/add time; non-L4D2 items are rejected and never reach the row. The value is a fixed precondition, not data.
13. **Input field accepts numeric ID, full Workshop URL, or a multi-line batch** of either. Pasting `123456` and pasting `steamcommunity.com/sharedfiles/filedetails/?id=123456` produce the same result; pasting many of either at once works too.
14. **Web-managed overlay paths.** All new overlays (any type) get `path = str(overlay_id)` at insert time. The user never picks a path. Existing legacy external overlay rows keep their current path values; migrating them to the ID-based scheme is a follow-up. `Overlay.id` uses SQLite `AUTOINCREMENT` so deleted IDs are never reused.
15. **Auto-rebuild on item change.** Adding or removing items from a workshop overlay automatically enqueues a `build_overlay` job. The "Rebuild" button on the detail page is for manual recovery only. New build jobs for an overlay coalesce with any pending one for the same overlay (don't queue duplicates).
16. **HTTPS** for all Steam Web API calls. The reference downloader uses HTTP; we don't.
## Architecture
```
Overlay row (type=workshop)
└─refs─▶ overlay_workshop_items
└─▶ WorkshopItem (global, by steam_id)
▼ download (Steam GetPublishedFileDetails + HTTP GET)
workshop_cache/{steam_id}.vpk
overlay_dir/left4dead2/addons/{steam_id}.vpk ─symlink─┘
```
Build dispatch via a registry:
```python
BUILDERS = {"external": ExternalBuilder(), "workshop": WorkshopBuilder()}

def build_overlay(overlay_id):
    overlay = db.get(Overlay, overlay_id)
    BUILDERS[overlay.type].build(overlay, on_stdout, on_stderr, should_cancel)
```
`ExternalBuilder` is a no-op for legacy admin-managed dirs. `WorkshopBuilder` performs an idempotent diff-apply of `addons/` symlinks against the current associations. Future types add their own builders without changing the dispatcher, the mount layer, or the blueprint editor.
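The diff-apply step can be sketched as follows; `sync_addon_symlinks` is a hypothetical stand-in for the core of `WorkshopBuilder.build`, assuming the cache and overlay layout described in this plan:

```python
import os

def sync_addon_symlinks(addons_dir, cache_dir, steam_ids):
    """Idempotent diff-apply sketch (name assumed): make addons_dir
    contain exactly one absolute symlink per associated steam_id that
    has a cached file, and nothing else."""
    os.makedirs(addons_dir, exist_ok=True)
    wanted = {}
    for sid in steam_ids:
        target = os.path.join(cache_dir, f"{sid}.vpk")
        if os.path.isfile(target):          # never create a dangling link
            wanted[f"{sid}.vpk"] = target
        # else: the real builder logs a warning and skips the item
    for name in os.listdir(addons_dir):
        path = os.path.join(addons_dir, name)
        target = wanted.pop(name, None)
        if target is None:
            os.remove(path)                 # stale link from a removed item
        elif os.readlink(path) != target:
            os.remove(path)                 # retarget if the cache moved
            os.symlink(target, path)
    for name, target in wanted.items():     # newly added items
        os.symlink(target, os.path.join(addons_dir, name))
```

Running it twice with the same associations is a no-op, which is what makes coalesced rebuilds cheap.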
## Data Model
### `Overlay` (extended)
```
id INTEGER PK AUTOINCREMENT
name VARCHAR(255) NOT NULL
path VARCHAR(255) NOT NULL -- new overlays: str(id); legacy externals: existing values
type VARCHAR(16) NOT NULL -- 'external' | 'workshop' (extensible)
user_id INTEGER NULL REFERENCES users(id) -- NULL = system-wide
created_at, updated_at
UNIQUE INDEX on (name) WHERE user_id IS NULL -- system overlays globally unique by name
UNIQUE INDEX on (name, user_id) WHERE user_id IS NOT NULL -- per-user namespace
INDEX on (type, user_id)
```
Two partial unique indexes are required because a naive composite `UNIQUE(name, user_id)` doesn't constrain externals — SQLite treats NULL as distinct in unique constraints, so two externals could share a name. Partial indexes preserve the prior global-uniqueness invariant for system rows.
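The behavior of the two partial indexes can be demonstrated against a stripped-down stand-in for the real table (column set reduced for illustration):

```python
import sqlite3

# Minimal demonstration of the two partial unique indexes on a
# stripped-down overlays table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE overlays (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL,
    user_id INTEGER
);
CREATE UNIQUE INDEX uq_overlay_system ON overlays(name)
    WHERE user_id IS NULL;
CREATE UNIQUE INDEX uq_overlay_user ON overlays(name, user_id)
    WHERE user_id IS NOT NULL;
""")
conn.execute("INSERT INTO overlays (name, user_id) VALUES ('standard', NULL)")
# Same name in a user's namespace is fine: different scope.
conn.execute("INSERT INTO overlays (name, user_id) VALUES ('standard', 1)")
try:
    conn.execute("INSERT INTO overlays (name, user_id) VALUES ('standard', NULL)")
except sqlite3.IntegrityError:
    pass  # a plain UNIQUE(name, user_id) would have allowed this row
```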
### `WorkshopItem` (new)
```
id INTEGER PK
steam_id VARCHAR(20) NOT NULL UNIQUE -- 64-bit, store as text
title VARCHAR(255) NOT NULL DEFAULT ''
filename VARCHAR(255) NOT NULL DEFAULT '' -- upstream Steam filename, display only
file_url TEXT NOT NULL DEFAULT ''
file_size BIGINT NOT NULL DEFAULT 0
time_updated INTEGER NOT NULL DEFAULT 0 -- Steam epoch
preview_url TEXT NOT NULL DEFAULT '' -- thumbnail URL hot-linked from Steam
last_downloaded_at DATETIME NULL
last_error TEXT NOT NULL DEFAULT ''
created_at, updated_at
```
`consumer_app_id` is **not** stored. It's validated at fetch time and the row never exists for non-L4D2 items.
### `overlay_workshop_items` (new, pure association)
```
id INTEGER PK
overlay_id INTEGER NOT NULL REFERENCES overlays(id) ON DELETE CASCADE
workshop_item_id INTEGER NOT NULL REFERENCES workshop_items(id) ON DELETE RESTRICT
UNIQUE (overlay_id, workshop_item_id)
INDEX (workshop_item_id) -- reverse lookup for refresh
```
No `enabled` column — toggle is remove/add, which is cheap because the cache survives.
### `Job` (extended)
Add `overlay_id INTEGER NULL REFERENCES overlays(id)` for `build_overlay` jobs.
## Filesystem Layout
```
/var/lib/left4me/
overlays/
{overlay_id}/ # flat — same shape for every type
left4dead2/addons/
{steam_id}.vpk -> /var/lib/left4me/workshop_cache/{steam_id}.vpk
workshop_cache/
{steam_id}.vpk # one file per Steam item
```
- Every new overlay (workshop, future tarball/inline/manual) lives at `overlays/{overlay_id}/`. Legacy external overlays keep their pre-migration paths (e.g. `overlays/standard/`).
- `workshop_cache/` is created during deploy provisioning, not lazily — avoids races between concurrent first downloads.
- Web user owns both trees (mode 0755). Host user (`l4d2ctl`) needs read on both. If web and host are different users, they share a group.
- Symlink targets are absolute. Relative targets resolve in the merged-mount namespace and break across the host/web boundary.
- The builder never creates a dangling symlink. If a `WorkshopItem` lacks a cache file at build time, the builder logs a warning and skips it — fuse-overlayfs surfaces broken links to L4D2 as opaque addon-scan failures.
## UI
A single "Create overlay" button on `/overlays` opens a modal with type radio (External | Workshop) and a name field. No path field. The web app generates `path = str(overlay_id)` after insert.
Workshop overlay detail page (`/overlays/{id}` when `type='workshop'`) shows:
- A multi-line input plus a radio (Items | Collection). Pasting one or many IDs/URLs adds them in order; pasting a collection ID resolves its members.
- An item table with: thumbnail (`preview_url`), `steam_id` linking to Steam, title, filename, last-updated, size, last-error if any, Remove.
- A manual "Rebuild" button (for recovery only — every add/remove auto-enqueues a coalesced `build_overlay` job).
- Status indicator pulled from the latest related `Job` row.
External overlay detail page is unchanged in shape: read-only path display, name edit (admin only). The "External" type retains the existing admin-only SFTP-to-disk workflow until a future "manual upload" type replaces it.
The blueprint editor is unchanged in structure. Workshop overlays appear alongside externals in the user's overlay picker; ordering and stacking semantics are identical.
Admin section gets one new control: "Refresh all workshop items" button on the admin landing or workshop subsection. Pressing it enqueues a single `refresh_workshop_items` job.
### Routes
| Method | Path | Purpose |
|---|---|---|
| GET | `/overlays` | List with Type column, filtered by user permissions |
| POST | `/overlays` | Create; reads `type` and `name` only |
| GET | `/overlays/{id}` | Type-aware detail page |
| POST | `/overlays/{id}/items` | Add items or collection; auto-enqueues coalesced `build_overlay` |
| POST | `/overlays/{id}/items/{item_id}/delete` | Remove association; auto-enqueues coalesced `build_overlay` |
| POST | `/overlays/{id}/build` | Manual rebuild (recovery) |
| POST | `/admin/workshop/refresh` | Admin only; enqueue `refresh_workshop_items` |
HTMX usage stays minimal: only the add-item form and per-row delete swap a fragment. Everything else is full-page POST/redirect/GET.
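Because `path = str(overlay_id)` needs the row's ID, the create route is a two-step insert-then-update; a sketch of that storage step (helper name and column set assumed), done in one transaction so no row is ever visible without its final path:

```python
import sqlite3

def create_overlay(conn, name, type_, user_id=None):
    """Sketch of the POST /overlays storage step (name assumed):
    insert the row, then set path = str(id) per the web-managed-path
    decision, all in one transaction."""
    with conn:
        cur = conn.execute(
            "INSERT INTO overlays (name, type, user_id, path) "
            "VALUES (?, ?, ?, '')",
            (name, type_, user_id),
        )
        overlay_id = cur.lastrowid
        conn.execute("UPDATE overlays SET path = ? WHERE id = ?",
                     (str(overlay_id), overlay_id))
    return overlay_id
```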
## Job Operations
Two new operations join the existing job worker:
- **`build_overlay(overlay_id)`** — `Job.overlay_id` is set; `server_id` is NULL. Dispatches to `BUILDERS[overlay.type].build(...)`. Cancellation between filesystem operations.
- **`refresh_workshop_items()`** — admin-only. Both `server_id` and `overlay_id` are NULL. Phases: fetch all metadata in one batched call, download items where `time_updated` advanced, enqueue (coalesced) `build_overlay` for affected overlays. v1 doesn't wait on child builds; the admin sees them in the jobs list.
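The compare phase of the refresh can be sketched as a pure function over the parsed `publishedfiledetails` entries from the batched metadata call; the function name and the `known_time_updated` shape are assumptions for illustration:

```python
def items_needing_update(details, known_time_updated):
    """Given parsed publishedfiledetails entries from one batched
    GetPublishedFileDetails response and a {steam_id: time_updated}
    map from the WorkshopItem rows, return the IDs to re-download."""
    stale = []
    for entry in details:
        sid = str(entry["publishedfileid"])
        if entry.get("result") != 1:
            continue  # removed/hidden on Steam: keep last good cache
        if entry.get("time_updated", 0) > known_time_updated.get(sid, 0):
            stale.append(sid)
    return stale
```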
### Scheduler rules
- `install` and `refresh_workshop_items` are mutually exclusive with each other, with all `build_overlay`s, and with all server jobs.
- `build_overlay(overlay_id=N)` blocks if `install_running`, `refresh_running`, or another build for the same `overlay_id` is running. Builds for *different* overlays may run concurrently.
- Server start/init blocks if `refresh_running` or any `build_overlay` for an overlay referenced by the server's blueprint is running.
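The rules above can be collapsed into one predicate; this is a sketch with assumed names (`JobView`, `can_start`, the op strings for server jobs), not the worker's actual interface:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class JobView:
    op: str
    overlay_id: Optional[int] = None
    server_id: Optional[int] = None

def can_start(job, running, blueprint_overlays):
    """Sketch of the scheduler rules (names assumed). `running` holds
    in-flight jobs; blueprint_overlays maps server_id to the overlay
    IDs its blueprint references."""
    def any_running(pred):
        return any(pred(r) for r in running)
    exclusive = ("install", "refresh_workshop_items")
    if job.op in exclusive:
        return not running                       # exclusive with everything
    if any_running(lambda r: r.op in exclusive):
        return False
    if job.op == "build_overlay":                # serialize per overlay only
        return not any_running(lambda r: r.op == "build_overlay"
                               and r.overlay_id == job.overlay_id)
    if job.op in ("start_server", "initialize_server"):
        refs = blueprint_overlays.get(job.server_id, set())
        return not any_running(lambda r: r.op == "build_overlay"
                               and r.overlay_id in refs)
    return True
```

The truth-table test in `test_job_worker.py` would exercise exactly these branches.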
Coalescing: a new `build_overlay` for an overlay that already has a queued (not-yet-running) build returns the existing job instead of inserting a new row.
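A minimal sketch of the coalescing helper, assuming a `jobs` table with `op`, `overlay_id`, and `status` columns (the helper name is from the Risks section below; the column names are assumptions):

```python
import sqlite3

def enqueue_build_overlay(conn, overlay_id):
    """Return an existing queued build for this overlay instead of
    inserting a duplicate. A build that is already *running* does not
    coalesce: its snapshot of the associations may be stale, so a
    fresh queued job is still needed."""
    with conn:  # one transaction so two callers can't both insert
        row = conn.execute(
            "SELECT id FROM jobs WHERE op='build_overlay' "
            "AND overlay_id=? AND status='queued'",
            (overlay_id,)).fetchone()
        if row:
            return row[0]
        cur = conn.execute(
            "INSERT INTO jobs (op, overlay_id, status) "
            "VALUES ('build_overlay', ?, 'queued')", (overlay_id,))
        return cur.lastrowid
```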
`initialize_server` synchronously calls each overlay's builder before writing the spec for `l4d2ctl initialize`. If a workshop overlay references uncached items (no file in `workshop_cache/`), `initialize_server` fails fast with a clear error naming the missing IDs and pointing the user at the overlay page. It never silently mounts a partial overlay.
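The fail-fast check can be sketched as a pre-flight over the cache directory; the helper name and error wording are illustrative:

```python
import os

def check_workshop_cache(cache_dir, steam_ids):
    """Fail-fast sketch for initialize_server (name assumed): name
    every missing ID instead of mounting a partial overlay."""
    missing = [sid for sid in steam_ids
               if not os.path.isfile(os.path.join(cache_dir, f"{sid}.vpk"))]
    if missing:
        raise RuntimeError(
            "workshop overlay references uncached items: "
            + ", ".join(sorted(missing))
            + "; rebuild the overlay from its detail page")
```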
## Permissions
- **External overlays**: admin-only create/edit. Visible to all authenticated users (system-wide).
- **Workshop overlays**: any logged-in user can create. Owner or admin can edit and delete. Visible to the owner and admins.
- **Admin refresh**: admin-only.
The `Overlay` listing query for non-admins becomes: `type='external' OR user_id=current_user.id`.
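As a sketch, with a hypothetical helper name and a reduced column set:

```python
import sqlite3

def visible_overlays(conn, user_id, is_admin):
    """Listing-query sketch: admins see everything; everyone else sees
    system-wide externals plus their own rows."""
    if is_admin:
        return conn.execute(
            "SELECT id, name, type FROM overlays ORDER BY name").fetchall()
    return conn.execute(
        "SELECT id, name, type FROM overlays "
        "WHERE type='external' OR user_id=? ORDER BY name",
        (user_id,)).fetchall()
```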
## Risks
- **Broken symlinks across host/web boundary** — mitigated by absolute targets, build-time pre-check skipping uncached items, and `deploy/` documenting permission requirements.
- **Initialize against uncached items** — would silently mount overlays missing maps. Mitigated by `initialize_server`'s fail-fast check; tested.
- **Steam API rate limits** — refresh of 100 items is one metadata POST plus 100 downloads at 8-way parallelism. No retry/backoff in v1; 429s surface verbatim in the job log.
- **Partial failure during refresh** — each item is independent; per-item errors land on the row. Re-running refresh retries failures.
- **Concurrent same-ID adds** — `WorkshopItem.steam_id` unique handles cache dedup. `(overlay_id, workshop_item_id)` unique catches double-association; the route returns "already in overlay" rather than 500.
- **Build coalescing missed** — would enqueue dozens of redundant builds during multi-item adds. Mitigated by the `enqueue_build_overlay` helper; tested.
- **Worker concurrency rule miss** — the truth-table test in `test_job_worker.py` is the only way to trust the new scheduler logic; written before dispatch.
- **DB/disk drift** — a stray directory left by a prior failed delete could shadow a fresh overlay. Mitigated by `AUTOINCREMENT` (no ID reuse) and `os.makedirs(exist_ok=False)` (loud failure on collision).
- **Partial unique gap on SQLite** — naive composite `UNIQUE(name, user_id)` doesn't constrain externals because NULL is distinct. Mitigated by two partial unique indexes; tested explicitly.
- **Cache growth without GC** — accepted v1 trade-off.
- **Item removed from Steam** — refresh marks `result != 1`; row keeps last good cache file; UI surfaces error string. Operator decides removal.
- **L4D2 containerized run** — symlink absolute targets break if the server runs in a different mount namespace. Re-evaluate when containerization comes up.
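The 8-way download fan-out noted in the rate-limit risk above can be sketched with a thread pool; `fetch_one` is an injected stand-in for the real HTTPS download so the error-isolation shape is visible without network access:

```python
from concurrent.futures import ThreadPoolExecutor

def download_stale_items(items, fetch_one, max_workers=8):
    """Parallel-download sketch (names assumed): fetch_one(item) does
    one HTTPS GET into the cache. Per-item errors are captured rather
    than aborting the batch, matching the independent-failure rule."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_one, it): it for it in items}
        for fut, it in futures.items():
            try:
                fut.result()
                results[it] = None          # success
            except Exception as exc:        # lands in WorkshopItem.last_error
                results[it] = str(exc)
    return results
```

A 429 from Steam surfaces verbatim as the item's error string, as the risk entry describes.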
## Out Of Scope
These came up in research and dialog but stay out of v1:
- **Publish / share button on overlays.** Lets owners flip `Overlay.user_id` between their own ID and NULL without changing type. The schema already supports it; only the UI is deferred.
- **Migrate legacy external overlay paths to the ID-based scheme.** Existing external rows keep their pre-migration paths in v1; a follow-up migration moves the directories on disk and updates the rows.
- **Switch from fuse-overlayfs to kernel overlayfs via a privileged helper.** Matches the existing systemd / steam-install sudoers helper pattern under `/usr/local/libexec/left4me/`. Workshop overlays would work identically under either mount engine — symlinks resolve through normal VFS in both.
- **`sv_consistency` / `addonconfig.cfg RestrictAddons` auto-handling.** When a workshop overlay attaches to a blueprint, surface a banner with a one-click fix. Most-cited L4D2 player pain.
- **Shareable Steam Workshop collection link for clients.** Server cannot push workshop content via `sv_downloadurl`; clients must subscribe themselves. A panel-generated collection makes that one click for players. Requires Steam OAuth.
- **Other overlay types.** `tarball` (covers the old `competitive_rework` GitHub-tarball recipe), `inline` (covers `tickrate`'s inline `server.cfg`), `manual` (file manager / upload, replaces the admin-SFTP external workflow). All slot in via the builder registry without schema churn.
- **Cache GC.** Reference-counted delete or admin "Clear unreferenced" page.
- **Per-item / per-overlay / scheduled refresh.** v1 has one global admin button; revisit if users want finer control.
- **Update-aware server restart UX.** Notify users when a running server's overlay content has been refreshed underneath it.
## Implementation Boundaries
- The host library contract is unchanged. Workshop content arrives in overlay directories the same way externals do today; `l4d2host` doesn't know overlays have types.
- The job-execution model is preserved: same workers, same logs, same cancel callbacks. Only the operations table grows.
- The blueprint privacy model and desired-vs-actual server state model are unchanged.
- No new frontend dependencies. Vendored HTMX + custom CSS + small inline JS.
- No new Steam Web API key required; both endpoints used accept anonymous POSTs.
- The companion implementation plan governs task ordering and verification commands. Implementation must not start without explicit user approval per that plan's gate.