L4D2 Host Smoke Test Design
Goal: Validate the implemented l4d2host library and l4d2ctl CLI on the disposable Linux server ckn@10.0.4.128 before continuing web-app lifecycle job wiring.
Target Host: ckn@10.0.4.128
Access Assumption: SSH access as ckn with sudo privileges.
Primary Constraint: Ask for explicit user approval before every server-touching step.
Context
The repository now contains both planned components:
- l4d2host: Python host library and l4d2ctl CLI.
- l4d2web: Flask app for users, blueprints, servers, jobs, and logs.
The web app depends on the host library for real lifecycle behavior. Before wiring web lifecycle jobs end-to-end, the host contract should be proven on an actual Linux machine with steamcmd, fuse-overlayfs, systemd user services, and journald available.
Scope
The smoke test verifies these host-lib behaviors:
- SSH connectivity and sudo access to ckn@10.0.4.128.
- Required runtime tools are present or can be installed: steamcmd, fuse-overlayfs, fusermount3, systemctl --user, journalctl --user, and Python packaging tooling.
- /opt/l4d2 exists with permissions that allow the ckn user to run the v1 host workflow.
- l4d2ctl install downloads or updates the L4D2 dedicated server into /opt/l4d2/installation.
- l4d2ctl initialize smoke -f spec.yaml writes instance and runtime state under /opt/l4d2.
- l4d2ctl start smoke mounts the runtime overlay, copies server.cfg, and starts the systemd user service.
- get_instance_status("smoke") reports an interpretable status.
- stream_instance_logs("smoke") can read journald output.
- l4d2ctl stop smoke stops the user service and unmounts the runtime overlay.
- l4d2ctl delete smoke removes the instance/runtime directories.
- Re-running l4d2ctl delete smoke succeeds as a no-op.
Out Of Scope
- Web-app job execution or UI changes.
- Long-running game-server operations beyond a short start/status/log/stop check.
- Workshop mod management or web-managed overlay file content.
- Production hardening for the disposable test server.
Execution Strategy
The smoke test is intentionally gated. Each step must stop after reporting evidence and wait for user approval before moving to the next step.
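The gating rule can be pictured as a small loop that refuses to run a step without an explicit "yes". The following is a minimal sketch, not part of l4d2host: the names require_approval and run_gated are hypothetical, and the prompt wording is an assumption.

```python
from typing import Callable, Iterable, Tuple


def require_approval(step: str, ask: Callable[[str], str] = input) -> bool:
    """Return True only if the user explicitly answers 'yes' for this step."""
    answer = ask(f"Approve '{step}'? Type 'yes' to continue: ")
    return answer.strip().lower() == "yes"


def run_gated(
    steps: Iterable[Tuple[str, Callable[[], None]]],
    ask: Callable[[str], str] = input,
) -> list:
    """Run (name, action) pairs in order; stop at the first unapproved step."""
    completed = []
    for name, action in steps:
        if not require_approval(name, ask):
            break  # nothing after an unapproved step is executed
        action()
        completed.append(name)
    return completed
```

Passing the prompt function as a parameter keeps the gate testable without a terminal. Anything other than a literal "yes" counts as a refusal.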
Step 1: Read-Only Server Inspection
Purpose: understand the target host without changing it.
Allowed actions:
- SSH into ckn@10.0.4.128.
- Inspect OS, package manager, current user, sudo availability, Python version, systemd user availability, lingering status, existing /opt/l4d2 state, and relevant runtime tools.
Not allowed in this step:
- Installing packages.
- Creating or modifying files.
- Starting or stopping services.
- Mounting or unmounting filesystems.
Checkpoint: report findings and ask before any setup changes.
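One read-only check from this step, whether the runtime tools are on PATH, could be sketched as follows and run on the target host. The helper name find_missing_tools is hypothetical, and using python3 as a stand-in for "Python packaging tooling" is an assumption.

```python
import shutil
from typing import Callable, Iterable, Optional

# Executables named in the Scope section. python3 stands in for
# "Python packaging tooling" and is an assumption.
REQUIRED_TOOLS = (
    "steamcmd",
    "fuse-overlayfs",
    "fusermount3",
    "systemctl",
    "journalctl",
    "python3",
)


def find_missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list:
    """Return the tools not found on PATH. Only looks up PATH; changes nothing."""
    return [tool for tool in tools if which(tool) is None]
```

Calling find_missing_tools() with no arguments returns the list to report at the checkpoint. An empty list means every named executable was found, but it does not prove that systemctl --user or journalctl --user actually work for ckn.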
Step 2: Server Preparation
Purpose: make the disposable server capable of running the host-lib workflow.
Allowed actions after approval:
- Install missing packages needed for the host workflow.
- Create /opt/l4d2 if missing.
- Set ownership/permissions so ckn can run the smoke workflow.
- Configure systemd user prerequisites if required for systemctl --user.
Checkpoint: report exact changes and ask before deploying code.
Step 3: Deploy Current Host Lib
Purpose: install the current repository implementation on the target host without inventing new packaging.
Allowed actions after approval:
- Copy or archive the current l4d2host source to the server.
- Install it using its existing pyproject.toml, preferably into an isolated virtual environment.
- Verify that l4d2ctl --help exposes the fixed v1 command surface.
Checkpoint: report command evidence and ask before downloading server files.
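The help-output check could be done by looking for each expected subcommand in the captured text. This is a rough sketch: the function name missing_commands is hypothetical, and treating the five subcommands mentioned in this design as the complete v1 surface is an assumption.

```python
import re

# Subcommands that appear in this design. Treating them as the whole
# v1 surface is an assumption.
EXPECTED_COMMANDS = ("install", "initialize", "start", "stop", "delete")


def missing_commands(help_text: str, expected=EXPECTED_COMMANDS) -> list:
    """Return expected subcommands that never appear as a whole word in help_text."""
    words = set(re.findall(r"[A-Za-z0-9_-]+", help_text))
    return [name for name in expected if name not in words]
```

An empty result means every expected name shows up somewhere in the help text. It is a weak check: it would also pass if a name appeared only in a description, so the raw help output should still be included in the checkpoint report.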
Step 4: Run l4d2ctl install
Purpose: validate the install/update command against real steamcmd behavior.
Allowed actions after approval:
- Run l4d2ctl install on the target host.
- Capture stdout, stderr, and exit code.
- Inspect /opt/l4d2/installation enough to confirm expected installation output.
Checkpoint: report evidence and ask before creating a smoke instance.
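Capturing stdout, stderr, and the exit code in one place could look like the sketch below. CommandResult and run_captured are hypothetical names, not part of l4d2host. The demo at the bottom runs a harmless Python command rather than l4d2ctl install.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CommandResult:
    argv: List[str]
    exit_code: int
    stdout: str
    stderr: str


def run_captured(argv: List[str], timeout: Optional[float] = None) -> CommandResult:
    """Run argv without a shell and keep stdout, stderr, and the exit code."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return CommandResult(list(argv), proc.returncode, proc.stdout, proc.stderr)


if __name__ == "__main__":
    # Harmless demo; on the target host argv would be ["l4d2ctl", "install"].
    result = run_captured([sys.executable, "--version"])
    print(result.exit_code, result.stdout.strip())
```

A non-zero exit code is returned rather than raised, so the caller can report it. A steamcmd download can take a long time, so any timeout should be chosen generously or left unset.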
Step 5: Run Instance Lifecycle Smoke Test
Purpose: validate initialize/start/status/logs/stop/delete against the real runtime.
Allowed actions after approval:
- Create a minimal spec file for instance name smoke.
- Run l4d2ctl initialize smoke -f spec.yaml.
- Run l4d2ctl start smoke.
- Check systemctl --user status l4d2@smoke.service.
- Check mount state for /opt/l4d2/runtime/smoke/merged.
- Call get_instance_status("smoke") from Python.
- Call stream_instance_logs("smoke", lines=50, follow=False) from Python.
- Run l4d2ctl stop smoke.
- Run l4d2ctl delete smoke.
- Run l4d2ctl delete smoke again to verify no-op success.
Checkpoint: report command evidence and ask what to do with remaining artifacts.
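The command-line part of this step can be sketched as an ordered sequence that halts at the first non-zero exit code, so a failed start is never followed by an automatic stop or delete. The name run_sequence and the injected run callable are hypothetical. Using mountpoint -q for the mount check is one possible choice, not something the design prescribes. The two Python calls are left out because their import path is not stated here.

```python
from typing import Callable, Optional, Sequence, Tuple

# CLI steps from the list above, in order.
LIFECYCLE_COMMANDS = (
    ("l4d2ctl", "initialize", "smoke", "-f", "spec.yaml"),
    ("l4d2ctl", "start", "smoke"),
    ("systemctl", "--user", "status", "l4d2@smoke.service"),
    ("mountpoint", "-q", "/opt/l4d2/runtime/smoke/merged"),  # assumed check
    ("l4d2ctl", "stop", "smoke"),
    ("l4d2ctl", "delete", "smoke"),
    ("l4d2ctl", "delete", "smoke"),  # second delete must succeed as a no-op
)


def run_sequence(
    commands: Sequence[Sequence[str]],
    run: Callable[[Sequence[str]], int],
) -> Tuple[list, Optional[tuple]]:
    """Run commands in order and stop at the first non-zero exit code.

    Returns (completed, failed). failed is None when every command passed,
    otherwise (argv, exit_code) for the command that stopped the sequence.
    """
    completed = []
    for argv in commands:
        code = run(argv)
        if code != 0:
            return completed, (argv, code)
        completed.append(argv)
    return completed, None
```

The run callable would execute one command and return its exit code, for example through subprocess.run(argv).returncode. Injecting it keeps the stop-on-failure logic testable without touching a server.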
Step 6: Cleanup Decision
Purpose: preserve useful diagnostics or remove smoke-test state based on user preference.
Allowed actions after approval:
- Remove copied source archives or virtual environments.
- Remove smoke spec files.
- Leave /opt/l4d2/installation intact if useful for later web-app testing, or remove it if requested.
Checkpoint: report final target-host state.
Failure Handling
Any failure stops the smoke-test flow immediately. The report must include:
- command that failed
- exit code if available
- relevant stdout and stderr
- likely category: environment issue, host-lib bug, packaging/deploy issue, or unclear
- recommended next action
No automatic destructive cleanup should happen after a failure. If a failure leaves /opt/l4d2, a mounted overlay, copied files, or a systemd service behind, that state should be preserved for inspection until the user approves cleanup.
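The required report fields map naturally onto a small record. The following sketch is illustrative only: FailureCategory and FailureReport are hypothetical names, and the rendered layout is an assumption rather than a format this design prescribes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FailureCategory(Enum):
    ENVIRONMENT = "environment issue"
    HOST_LIB_BUG = "host-lib bug"
    PACKAGING_DEPLOY = "packaging/deploy issue"
    UNCLEAR = "unclear"


@dataclass
class FailureReport:
    command: str
    exit_code: Optional[int]  # None when no exit code is available
    stdout: str
    stderr: str
    category: FailureCategory
    next_action: str

    def render(self) -> str:
        """Format the report as one 'field: value' line per required item."""
        code = "unavailable" if self.exit_code is None else str(self.exit_code)
        return "\n".join(
            [
                f"command: {self.command}",
                f"exit code: {code}",
                f"stdout: {self.stdout.strip()}",
                f"stderr: {self.stderr.strip()}",
                f"category: {self.category.value}",
                f"next action: {self.next_action}",
            ]
        )
```

Making the category an enum forces the reporter to pick one of the four listed values, with UNCLEAR as the honest fallback when the cause is not yet known.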
Evidence Requirements
Each completed step should report fresh command evidence. Suitable evidence includes:
- exact commands run
- exit code or clear command success/failure status
- key stdout/stderr lines
- relevant filesystem paths
- service status summaries
- mount state
- journal/log snippets
No step should be called successful without current evidence from that step.
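Mount state is one evidence item that can be checked mechanically. A minimal sketch, assuming the Linux /proc/self/mounts format (whitespace-separated fields, mount target in the second field); the function names is_mounted and read_mounts are hypothetical.

```python
def is_mounted(mountpoint: str, mounts_text: str) -> bool:
    """Return True if mountpoint is a mount target in /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # The second field is the mount target. Octal escapes such as \040
        # (space) are not decoded, which is fine for paths without spaces.
        if len(fields) >= 2 and fields[1] == mountpoint:
            return True
    return False


def read_mounts(path: str = "/proc/self/mounts") -> str:
    """Read the current mount table on a Linux host."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()
```

On the target host, is_mounted("/opt/l4d2/runtime/smoke/merged", read_mounts()) would be expected to return True after start and False after stop. The matching line from the mount table is itself useful evidence to include in the report.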
Next Phase After Smoke Test
If the host-lib smoke test succeeds, continue with web-app lifecycle job wiring:
- enqueue lifecycle jobs from routes/UI
- run jobs through worker threads
- call l4d2web.services.l4d2_facade
- persist callback output to job_logs
job_logs - live-follow job logs through SSE
- update server desired and actual state
If the smoke test fails due to host-lib behavior, fix the host library before continuing web-app lifecycle work.