# L4D2 Host Smoke Test Design

**Goal:** Validate the implemented `l4d2host` library and `l4d2ctl` CLI on the disposable Linux server `ckn@10.0.4.128` before continuing web-app lifecycle job wiring.

**Target Host:** `ckn@10.0.4.128`

**Access Assumption:** SSH access as `ckn` with sudo privileges.

**Primary Constraint:** Ask for explicit user approval before every server-touching step.

## Context

The repository now contains both planned components:

- `components/l4d2-host-lib`: Python host library and `l4d2ctl` CLI.
- `components/l4d2-web-app`: Flask app for users, blueprints, servers, jobs, and logs.

The web app depends on the host library for real lifecycle behavior. Before wiring web lifecycle jobs end-to-end, the host contract should be proven on an actual Linux machine with `steamcmd`, `fuse-overlayfs`, systemd user services, and journald available.

## Scope

The smoke test verifies these host-lib behaviors:

- SSH connectivity and sudo access to `ckn@10.0.4.128`.
- Required runtime tools are present or can be installed: `steamcmd`, `fuse-overlayfs`, `fusermount3`, `systemctl --user`, `journalctl --user`, and Python packaging tooling.
- `/opt/l4d2` exists with permissions that allow the `ckn` user to run the v1 host workflow.
- `l4d2ctl install` downloads or updates the L4D2 dedicated server into `/opt/l4d2/installation`.
- `l4d2ctl initialize smoke -f spec.yaml` writes instance and runtime state under `/opt/l4d2`.
- `l4d2ctl start smoke` mounts the runtime overlay, copies `server.cfg`, and starts the systemd user service.
- `get_instance_status("smoke")` reports an interpretable status.
- `stream_instance_logs("smoke")` can read journald output.
- `l4d2ctl stop smoke` stops the user service and unmounts the runtime overlay.
- `l4d2ctl delete smoke` removes the instance/runtime directories.
- Re-running `l4d2ctl delete smoke` succeeds as a no-op.

## Out Of Scope

- Web-app job execution or UI changes.
- Long-running game-server operations beyond a short start/status/log/stop check.
- Workshop mod management or web-managed overlay file content.
- Production hardening for the disposable test server.

## Execution Strategy

The smoke test is intentionally gated. Each step must stop after reporting evidence and wait for user approval before moving to the next step.

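The gating rule can be sketched as a small runner that asks for approval before each step and stops at the first refusal. This is only an illustration: `Step`, `run_gated`, and the `approve` callback are hypothetical names, not part of `l4d2host` or `l4d2ctl`.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    action: Callable[[], str]  # performs the step and returns its evidence text


def run_gated(steps: List[Step], approve: Callable[[str], bool]) -> List[str]:
    """Run steps in order, asking for approval before each one.

    Stops at the first step that is not approved and returns the names of
    the steps that actually ran.
    """
    completed = []
    for step in steps:
        if not approve(f"Run '{step.name}'?"):
            break  # no approval: leave the server untouched
        evidence = step.action()
        print(f"[{step.name}] evidence:\n{evidence}")
        completed.append(step.name)
    return completed
```

In an interactive session, `approve` could be something like `lambda q: input(q + " [y/N] ").strip().lower() == "y"`, so that anything other than an explicit "y" halts the flow.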
### Step 1: Read-Only Server Inspection

Purpose: understand the target host without changing it.

Allowed actions:

- SSH into `ckn@10.0.4.128`.
- Inspect OS, package manager, current user, sudo availability, Python version, systemd user availability, lingering status, existing `/opt/l4d2` state, and relevant runtime tools.

Not allowed in this step:

- Installing packages.
- Creating or modifying files.
- Starting or stopping services.
- Mounting or unmounting filesystems.

Checkpoint: report findings and ask before any setup changes.

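One possible set of read-only probes for this step is sketched below. The probe list is an assumption chosen to match the inspection items above; it is not a fixed part of the design, and `READ_ONLY_PROBES` and `ssh_argv` are hypothetical names.

```python
from typing import List

TARGET = "ckn@10.0.4.128"

# Probes that only read state: nothing here installs packages, writes files,
# changes services, or touches mounts.
READ_ONLY_PROBES = [
    "uname -a",
    "cat /etc/os-release",
    "id",
    "sudo -n -l",  # list sudo rules non-interactively; fails if a password is needed
    "python3 --version",
    "loginctl show-user ckn --property=Linger",
    "systemctl --user is-system-running",
    "ls -ld /opt/l4d2",
    'for t in apt-get dnf pacman; do command -v "$t"; done',
    'for t in steamcmd fuse-overlayfs fusermount3; do command -v "$t" || echo "missing: $t"; done',
]


def ssh_argv(remote_command: str, target: str = TARGET) -> List[str]:
    """Build the argv that runs one probe on the target over SSH."""
    return ["ssh", target, remote_command]
```

Each argv can be passed to `subprocess.run(..., capture_output=True, text=True)` so that the output of every probe is kept as evidence for the checkpoint.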
### Step 2: Server Preparation

Purpose: make the disposable server capable of running the host-lib workflow.

Allowed actions after approval:

- Install missing packages needed for the host workflow.
- Create `/opt/l4d2` if missing.
- Set ownership/permissions so `ckn` can run the smoke workflow.
- Configure systemd user prerequisites if required for `systemctl --user`.

Checkpoint: report exact changes and ask before deploying code.

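The preparation commands can be derived from the Step 1 findings so that only the needed changes are proposed for approval. The sketch below assumes that lingering is the systemd user prerequisite in question and that owning `/opt/l4d2` is sufficient for `ckn`; package installation is left out because the package manager is not known until inspection. `preparation_commands` is a hypothetical helper.

```python
from typing import List


def preparation_commands(opt_dir_exists: bool, linger_enabled: bool,
                         user: str = "ckn", root: str = "/opt/l4d2") -> List[str]:
    """Return the shell commands to propose for approval, in order."""
    commands = []
    if not opt_dir_exists:
        commands.append(f"sudo mkdir -p {root}")
    # "user:" with an empty group sets the group to the user's login group.
    commands.append(f"sudo chown {user}: {root}")
    if not linger_enabled:
        # Assumption: lingering lets the user's systemd instance run
        # without an open login session.
        commands.append(f"sudo loginctl enable-linger {user}")
    return commands
```

Returning the commands as plain strings keeps the "report exact changes" checkpoint simple: the list that was approved is the list that gets reported.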
### Step 3: Deploy Current Host Lib

Purpose: install the current repository implementation on the target host without inventing new packaging.

Allowed actions after approval:

- Copy or archive the current `components/l4d2-host-lib` source to the server.
- Install it using its existing `pyproject.toml`, preferably into an isolated virtual environment.
- Verify that `l4d2ctl --help` exposes the fixed v1 command surface.

Checkpoint: report command evidence and ask before downloading server files.

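The `l4d2ctl --help` check can be automated with a simple word match against the help text. The expected list below contains only the subcommands named in this document; the real v1 surface may include more, and `missing_subcommands` is a hypothetical helper.

```python
import re
from typing import Iterable, List

# Subcommands named in this document; the real CLI may expose more.
EXPECTED_SUBCOMMANDS = ("install", "initialize", "start", "stop", "delete")


def missing_subcommands(help_text: str,
                        expected: Iterable[str] = EXPECTED_SUBCOMMANDS) -> List[str]:
    """Return expected subcommands that do not appear as whole words in help_text."""
    words = set(re.findall(r"[A-Za-z0-9_-]+", help_text))
    return [name for name in expected if name not in words]
```

On the target host, `help_text` would come from `subprocess.run(["l4d2ctl", "--help"], capture_output=True, text=True).stdout`; an empty result means every expected subcommand was listed.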
### Step 4: Run `l4d2ctl install`

Purpose: validate the install/update command against real `steamcmd` behavior.

Allowed actions after approval:

- Run `l4d2ctl install` on the target host.
- Capture stdout, stderr, and exit code.
- Inspect `/opt/l4d2/installation` enough to confirm expected installation output.

Checkpoint: report evidence and ask before creating a smoke instance.

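Capturing stdout, stderr, and the exit code together can be done with a thin wrapper around `subprocess.run`. This is a minimal sketch; `CommandResult` and `run_captured` are hypothetical names, and the demo uses the Python interpreter as a stand-in command so that it runs anywhere.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List


@dataclass
class CommandResult:
    argv: List[str]
    exit_code: int
    stdout: str
    stderr: str


def run_captured(argv: List[str]) -> CommandResult:
    """Run a command without raising on failure and keep all three outputs."""
    proc = subprocess.run(argv, capture_output=True, text=True, check=False)
    return CommandResult(argv, proc.returncode, proc.stdout, proc.stderr)


if __name__ == "__main__":
    # Stand-in command; on the target host the argv would be
    # ["l4d2ctl", "install"].
    result = run_captured([sys.executable, "-c", "print('hello')"])
    print(result.exit_code, result.stdout.strip())
```

Using `check=False` matters here: a non-zero exit from `l4d2ctl install` is evidence to report, not an exception to swallow.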
### Step 5: Run Instance Lifecycle Smoke Test

Purpose: validate initialize/start/status/logs/stop/delete against the real runtime.

Allowed actions after approval:

- Create a minimal spec file for instance name `smoke`.
- Run `l4d2ctl initialize smoke -f spec.yaml`.
- Run `l4d2ctl start smoke`.
- Check `systemctl --user status l4d2@smoke.service`.
- Check mount state for `/opt/l4d2/runtime/smoke/merged`.
- Call `get_instance_status("smoke")` from Python.
- Call `stream_instance_logs("smoke", lines=50, follow=False)` from Python.
- Run `l4d2ctl stop smoke`.
- Run `l4d2ctl delete smoke`.
- Run `l4d2ctl delete smoke` again to verify no-op success.

Checkpoint: report command evidence and ask what to do with remaining artifacts.

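The mount-state check in the list above can be done by reading `/proc/mounts` rather than parsing `mount` output. The sketch below is one way to do it, independent of the host library; `is_mounted` is a hypothetical helper, and it assumes the mount path contains no spaces (which `/proc/mounts` would escape).

```python
def is_mounted(mount_point: str, proc_mounts_text: str) -> bool:
    """Return True if mount_point is listed as a mount target.

    proc_mounts_text is the content of /proc/mounts, where each line has the
    form: device mountpoint fstype options dump pass
    """
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == mount_point:
            return True
    return False
```

On the target host this would be called as `is_mounted("/opt/l4d2/runtime/smoke/merged", open("/proc/mounts").read())`. The expectation under this design is `True` after `l4d2ctl start smoke` and `False` after `l4d2ctl stop smoke`.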
### Step 6: Cleanup Decision

Purpose: preserve useful diagnostics or remove smoke-test state based on user preference.

Allowed actions after approval:

- Remove copied source archives or virtual environments.
- Remove smoke spec files.
- Leave `/opt/l4d2/installation` intact if useful for later web-app testing, or remove it if requested.

Checkpoint: report final target-host state.

## Failure Handling

Any failure stops the smoke-test flow immediately. The report must include:

- command that failed
- exit code if available
- relevant stdout and stderr
- likely category: environment issue, host-lib bug, packaging/deploy issue, or unclear
- recommended next action

No automatic destructive cleanup should happen after a failure. If a failure leaves `/opt/l4d2` contents, a mounted overlay, copied files, or a systemd service behind, leave that state in place for inspection until the user approves cleanup.

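The required report fields map naturally onto a small record type. The sketch below is illustrative only; `FailureCategory` and `FailureReport` are hypothetical names, and the category values are taken from the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FailureCategory(Enum):
    ENVIRONMENT = "environment issue"
    HOST_LIB_BUG = "host-lib bug"
    PACKAGING = "packaging/deploy issue"
    UNCLEAR = "unclear"


@dataclass
class FailureReport:
    command: str
    exit_code: Optional[int]  # None when no exit code is available
    stdout: str
    stderr: str
    category: FailureCategory
    next_action: str

    def render(self) -> str:
        """Format the report as plain text for the user."""
        code = "n/a" if self.exit_code is None else str(self.exit_code)
        return "\n".join([
            f"command: {self.command}",
            f"exit code: {code}",
            "stdout:",
            self.stdout,
            "stderr:",
            self.stderr,
            f"category: {self.category.value}",
            f"next action: {self.next_action}",
        ])
```

Making every field mandatory in the constructor means a failure cannot be reported without a category and a recommended next action, even if the category is `UNCLEAR`.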
## Evidence Requirements

Each completed step should report fresh command evidence. Suitable evidence includes:

- exact commands run
- exit code or clear command success/failure status
- key stdout/stderr lines
- relevant filesystem paths
- service status summaries
- mount state
- journal/log snippets

No step should be called successful without current evidence from that step.

## Next Phase After Smoke Test

If the host-lib smoke test succeeds, continue with web-app lifecycle job wiring:

- enqueue lifecycle jobs from routes/UI
- run jobs through worker threads
- call `l4d2web.services.l4d2_facade`
- persist callback output to `job_logs`
- live-follow job logs through SSE
- update server desired and actual state

If the smoke test fails due to host-lib behavior, fix the host library before continuing web-app lifecycle work.