# L4D2 Host Smoke Test Design

**Goal:** Validate the implemented `l4d2host` library and `l4d2ctl` CLI on the disposable Linux server `ckn@10.0.4.128` before continuing web-app lifecycle job wiring.

**Target Host:** `ckn@10.0.4.128`

**Access Assumption:** SSH access as `ckn` with sudo privileges.

**Primary Constraint:** Ask for explicit user approval before every server-touching step.

## Context

The repository now contains both planned components:

- `l4d2host`: Python host library and `l4d2ctl` CLI.
- `l4d2web`: Flask app for users, blueprints, servers, jobs, and logs.

The web app depends on the host library for real lifecycle behavior. Before wiring web lifecycle jobs end-to-end, the host contract should be proven on an actual Linux machine with `steamcmd`, `fuse-overlayfs`, systemd user services, and journald available.

## Scope

The smoke test verifies these host-lib behaviors:

- SSH connectivity and sudo access to `ckn@10.0.4.128`.
- Required runtime tools are present or can be installed: `steamcmd`, `fuse-overlayfs`, `fusermount3`, `systemctl --user`, `journalctl --user`, and Python packaging tooling.
- `/opt/l4d2` exists with permissions that allow the `ckn` user to run the v1 host workflow.
- `l4d2ctl install` downloads or updates the L4D2 dedicated server into `/opt/l4d2/installation`.
- `l4d2ctl initialize smoke -f spec.yaml` writes instance and runtime state under `/opt/l4d2`.
- `l4d2ctl start smoke` mounts the runtime overlay, copies `server.cfg`, and starts the systemd user service.
- `get_instance_status("smoke")` reports an interpretable status.
- `stream_instance_logs("smoke")` can read journald output.
- `l4d2ctl stop smoke` stops the user service and unmounts the runtime overlay.
- `l4d2ctl delete smoke` removes the instance/runtime directories.
- Re-running `l4d2ctl delete smoke` succeeds as a no-op.

## Out Of Scope

- Web-app job execution or UI changes.
- Long-running game-server operations beyond a short start/status/log/stop check.
- Workshop mod management or web-managed overlay file content.
- Production hardening for the disposable test server.

## Execution Strategy

The smoke test is intentionally gated. Each step must stop after reporting evidence and wait for user approval before moving to the next step.

### Step 1: Read-Only Server Inspection

Purpose: understand the target host without changing it.

Allowed actions:

- SSH into `ckn@10.0.4.128`.
- Inspect OS, package manager, current user, sudo availability, Python version, systemd user availability, lingering status, existing `/opt/l4d2` state, and relevant runtime tools.

Not allowed in this step:

- Installing packages.
- Creating or modifying files.
- Starting or stopping services.
- Mounting or unmounting filesystems.

Checkpoint: report findings and ask before any setup changes.

### Step 2: Server Preparation

Purpose: make the disposable server capable of running the host-lib workflow.

Allowed actions after approval:

- Install missing packages needed for the host workflow.
- Create `/opt/l4d2` if missing.
- Set ownership/permissions so `ckn` can run the smoke workflow.
- Configure systemd user prerequisites if required for `systemctl --user`.

Checkpoint: report exact changes and ask before deploying code.

### Step 3: Deploy Current Host Lib

Purpose: install the current repository implementation on the target host without inventing new packaging.

Allowed actions after approval:

- Copy or archive the current `l4d2host` source to the server.
- Install it using its existing `pyproject.toml`, preferably into an isolated virtual environment.
- Verify that `l4d2ctl --help` exposes the fixed v1 command surface.

Checkpoint: report command evidence and ask before downloading server files.

### Step 4: Run `l4d2ctl install`

Purpose: validate the install/update command against real `steamcmd` behavior.

Allowed actions after approval:

- Run `l4d2ctl install` on the target host.
- Capture stdout, stderr, and exit code.
- Inspect `/opt/l4d2/installation` enough to confirm expected installation output.

Checkpoint: report evidence and ask before creating a smoke instance.

### Step 5: Run Instance Lifecycle Smoke Test

Purpose: validate initialize/start/status/logs/stop/delete against the real runtime.

Allowed actions after approval:

- Create a minimal spec file for instance name `smoke`.
- Run `l4d2ctl initialize smoke -f spec.yaml`.
- Run `l4d2ctl start smoke`.
- Check `systemctl --user status l4d2@smoke.service`.
- Check mount state for `/opt/l4d2/runtime/smoke/merged`.
- Call `get_instance_status("smoke")` from Python.
- Call `stream_instance_logs("smoke", lines=50, follow=False)` from Python.
- Run `l4d2ctl stop smoke`.
- Run `l4d2ctl delete smoke`.
- Run `l4d2ctl delete smoke` again to verify no-op success.

Checkpoint: report command evidence and ask what to do with remaining artifacts.

### Step 6: Cleanup Decision

Purpose: preserve useful diagnostics or remove smoke-test state based on user preference.

Allowed actions after approval:

- Remove copied source archives or virtual environments.
- Remove smoke spec files.
- Leave `/opt/l4d2/installation` intact if useful for later web-app testing, or remove it if requested.

Checkpoint: report final target-host state.

## Failure Handling

Any failure stops the smoke-test flow immediately. The report must include:

- command that failed
- exit code if available
- relevant stdout and stderr
- likely category: environment issue, host-lib bug, packaging/deploy issue, or unclear
- recommended next action

No automatic destructive cleanup should happen after a failure. If a failure leaves `/opt/l4d2`, a mounted overlay, copied files, or a systemd service behind, inspectable state should be preserved until the user approves cleanup.

## Evidence Requirements

Each completed step should report fresh command evidence.
Suitable evidence includes:

- exact commands run
- exit code or clear command success/failure status
- key stdout/stderr lines
- relevant filesystem paths
- service status summaries
- mount state
- journal/log snippets

No step should be called successful without current evidence from that step.

## Next Phase After Smoke Test

If the host-lib smoke test succeeds, continue with web-app lifecycle job wiring:

- enqueue lifecycle jobs from routes/UI
- run jobs through worker threads
- call `l4d2web.services.l4d2_facade`
- persist callback output to `job_logs`
- live-follow job logs through SSE
- update server desired and actual state

If the smoke test fails due to host-lib behavior, fix the host library before continuing web-app lifecycle work.