left4me/docs/superpowers/plans/2026-05-06-l4d2host-step-logging.md

# Host Lifecycle Step Logging Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Provide granular, live progress feedback in the web GUI during lifecycle operations without exposing sensitive internal paths or raw subprocess mechanics.

**Architecture:** Modifies `l4d2host/instances.py` to emit safe, high-level step markers to `stdout` via an explicit helper function. These are automatically captured by `l4d2web/services/job_worker.py` and forwarded to the UI via existing SSE endpoints. The job worker will also add boundary markers for top-level operations.

**Tech Stack:** Python, subprocess, pytest

---

### Task 1: Add step logging helper to host library

**Files:**
- Modify: `l4d2host/instances.py`
- Modify: `l4d2host/tests/test_initialize.py`

- [ ] **Step 1: Write the failing test for the helper behavior**

In `l4d2host/tests/test_initialize.py`, add imports and a test verifying `_emit_step` correctly uses the callback or `sys.stdout` depending on arguments:

```python
import sys
from io import StringIO
from l4d2host.instances import _emit_step

def test_emit_step_uses_callback() -> None:
    calls: list[str] = []
    _emit_step("test step", on_stdout=calls.append, passthrough=False)
    assert calls == ["Step: test step"]

def test_emit_step_uses_passthrough_stdout(monkeypatch) -> None:
    fake_out = StringIO()
    monkeypatch.setattr(sys, "stdout", fake_out)
    _emit_step("passthrough step", on_stdout=None, passthrough=True)
    assert fake_out.getvalue() == "Step: passthrough step\n"

def test_emit_step_does_nothing_if_no_target() -> None:
    _emit_step("silent step", on_stdout=None, passthrough=False)
```

- [ ] **Step 2: Run test to verify it fails**

Run: `pytest l4d2host/tests/test_initialize.py -k test_emit_step -q`
Expected: FAIL due to `_emit_step` not defined.

- [ ] **Step 3: Implement minimal code**

In `l4d2host/instances.py`, add imports (if needed) and the `_emit_step` function at the top of the file:

```python
import sys

def _emit_step(msg: str, on_stdout: Callable[[str], None] | None, passthrough: bool) -> None:
    formatted = f"Step: {msg}"
    if on_stdout is not None:
        on_stdout(formatted)
    elif passthrough:
        print(formatted, flush=True)
```

- [ ] **Step 4: Run test to verify it passes**

Run: `pytest l4d2host/tests/test_initialize.py -k test_emit_step -q`
Expected: PASS

- [ ] **Step 5: Commit**

```bash
git add l4d2host/instances.py l4d2host/tests/test_initialize.py
git commit -m "feat(host): add _emit_step helper for lifecycle logging"
```

---

### Task 2: Add step logging to initialize_instance

**Files:**
- Modify: `l4d2host/instances.py`
- Modify: `l4d2host/tests/test_initialize.py`

- [ ] **Step 1: Write the failing test**

In `l4d2host/tests/test_initialize.py`, add a test to verify `initialize_instance` logs the correct steps:

```python
def test_initialize_instance_emits_steps(tmp_path: Path) -> None:
    spec = tmp_path / "spec.yaml"
    spec.write_text("port: 27015\noverlays: [standard]\n")

    steps: list[str] = []
    initialize_instance("alpha", spec, root=tmp_path, on_stdout=steps.append)

    assert steps == [
        "Step: creating instance directories...",
        "Step: writing instance.env...",
        "Step: writing server.cfg...",
        "Step: initialization complete.",
    ]
```

- [ ] **Step 2: Run test to verify it fails**

Run: `pytest l4d2host/tests/test_initialize.py -k test_initialize_instance_emits_steps -q`
Expected: FAIL because `steps` is empty.

- [ ] **Step 3: Add step logs to initialize_instance**

Modify `initialize_instance` in `l4d2host/instances.py` to include `_emit_step` calls. Use the parameters `on_stdout` and `passthrough`:

```python
    _emit_step("creating instance directories...", on_stdout, passthrough)
    instance_dir = root / "instances" / name
    runtime_dir = root / "runtime" / name
    (runtime_dir / "upper").mkdir(parents=True, exist_ok=True)
    (runtime_dir / "work").mkdir(parents=True, exist_ok=True)
    (runtime_dir / "merged").mkdir(parents=True, exist_ok=True)
    instance_dir.mkdir(parents=True, exist_ok=True)

    lowerdirs = [str(overlay_path(overlay, root=root)) for overlay in spec.overlays]
    lowerdirs.append(str(root / "installation"))

    _emit_step("writing instance.env...", on_stdout, passthrough)
    instance_env = "\n".join(
        [
            f"L4D2_PORT={spec.port}",
            f"L4D2_ARGS={' '.join(spec.arguments)}",
            f"L4D2_LOWERDIRS={':'.join(lowerdirs)}",
        ]
    ) + "\n"
    (instance_dir / "instance.env").write_text(instance_env)

    _emit_step("writing server.cfg...", on_stdout, passthrough)
    server_cfg = "\n".join(spec.config) if spec.config else ""
    (instance_dir / "server.cfg").write_text(server_cfg)
    _emit_step("initialization complete.", on_stdout, passthrough)
```

- [ ] **Step 4: Run test to verify it passes**

Run: `pytest l4d2host/tests/test_initialize.py -k test_initialize_instance_emits_steps -q`
Expected: PASS

- [ ] **Step 5: Commit**

```bash
git add l4d2host/instances.py l4d2host/tests/test_initialize.py
git commit -m "feat(host): emit steps during initialize_instance"
```

---

### Task 3: Add step logging to start_instance

**Files:**
- Modify: `l4d2host/instances.py`
- Modify: `l4d2web/tests/test_job_worker.py` (adjusting mocks that expect `on_stdout` logic)

- [ ] **Step 1: Check test suite safety net for start/stop**

We don't have dedicated host unit tests for `start`/`stop`/`delete` due to their systemd/fuse dependency, but they are mocked in web tests (`l4d2web/tests/test_job_worker.py` and `test_l4d2_facade.py`).
We must ensure our `_emit_step` calls inside `start_instance` don't break existing tests, and we'll add the logs to the implementation.

- [ ] **Step 2: Add step logs to start_instance**

In `l4d2host/instances.py`, modify `start_instance`:

```python
    env = _load_instance_env(instance_dir / "instance.env")

    _emit_step("mounting runtime overlay...", on_stdout, passthrough)
    run_command(
        [
            "fuse-overlayfs",
            "-o",
            (
                f"lowerdir={env['L4D2_LOWERDIRS']},"
                f"upperdir={runtime_dir / 'upper'},"
                f"workdir={runtime_dir / 'work'}"
            ),
            str(runtime_dir / "merged"),
        ],
        on_stdout=on_stdout,
        on_stderr=on_stderr,
        passthrough=passthrough,
        should_cancel=should_cancel,
    )

    _emit_step("copying server.cfg to runtime...", on_stdout, passthrough)
    target_cfg = runtime_dir / "merged" / "left4dead2" / "cfg" / "server.cfg"
    target_cfg.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(instance_dir / "server.cfg", target_cfg)

    _emit_step("starting systemd service...", on_stdout, passthrough)
    start_service(
        name,
        on_stdout=on_stdout,
        on_stderr=on_stderr,
        passthrough=passthrough,
        should_cancel=should_cancel,
    )
    _emit_step("start complete.", on_stdout, passthrough)
```

- [ ] **Step 3: Add step logs to stop_instance**

In `l4d2host/instances.py`, modify `stop_instance`:

```python
    _emit_step("stopping systemd service...", on_stdout, passthrough)
    stop_service(
        name,
        on_stdout=on_stdout,
        on_stderr=on_stderr,
        passthrough=passthrough,
        should_cancel=should_cancel,
    )
    _emit_step("unmounting runtime overlay...", on_stdout, passthrough)
    run_command(
        ["fusermount3", "-u", str(root / "runtime" / name / "merged")],
        on_stdout=on_stdout,
        on_stderr=on_stderr,
        passthrough=passthrough,
        should_cancel=should_cancel,
    )
    _emit_step("stop complete.", on_stdout, passthrough)
```

- [ ] **Step 4: Add step logs to delete_instance**

In `l4d2host/instances.py`, modify `delete_instance`:

```python
    if not instance_dir.exists() and not runtime_dir.exists():
        return

    _emit_step("stopping systemd service (if running)...", on_stdout, passthrough)
    stop_service(
        name,
        on_stdout=on_stdout,
        on_stderr=on_stderr,
        passthrough=passthrough,
        should_cancel=should_cancel,
    )

    merged = runtime_dir / "merged"
    if merged.is_mount():
        _emit_step("unmounting runtime overlay (if mounted)...", on_stdout, passthrough)
        run_command(
            ["fusermount3", "-u", str(merged)],
            on_stdout=on_stdout,
            on_stderr=on_stderr,
            passthrough=passthrough,
            should_cancel=should_cancel,
        )

    _emit_step("removing instance files...", on_stdout, passthrough)
    if instance_dir.exists():
        shutil.rmtree(instance_dir)
    if runtime_dir.exists():
        shutil.rmtree(runtime_dir)
    _emit_step("delete complete.", on_stdout, passthrough)
```

- [ ] **Step 5: Run full test suite**

Run: `pytest l4d2host/tests l4d2web/tests -q`
Expected: PASS. If tests fail because of expected outputs in mock fake `run_command` handlers, update the mock assertions in the tests to allow the newly emitted step lines.

- [ ] **Step 6: Commit**

```bash
git add l4d2host/instances.py
git commit -m "feat(host): emit steps during start, stop, and delete operations"
```

---

### Task 4: Add boundary markers to job worker

**Files:**
- Modify: `l4d2web/services/job_worker.py`
- Modify: `l4d2web/tests/test_job_worker.py`

- [ ] **Step 1: Write failing tests for job worker boundaries**

In `l4d2web/tests/test_job_worker.py`, update `test_successful_start_job_logs_and_refreshes_server_state` to expect the synthetic boundary lines. Also, modify the mock to simulate actual stdout lines instead of arbitrary "initialized" text:

```python
    def fake_initialize(server_id, *, on_stdout=None, on_stderr=None, should_cancel=None):
        del should_cancel
        calls.append(("initialize", server_id))
        on_stdout("Step: creating instance directories...")

    def fake_start(server_id, *, on_stdout=None, on_stderr=None, should_cancel=None):
        del should_cancel
        calls.append(("start", server_id))
        on_stdout("Step: mounting runtime overlay...")
```

Change the `assert lines == ` section at the bottom of the test:

```python
    assert lines == [
        (1, "stdout", "starting initialize for alpha"),
        (2, "stdout", "Step: creating instance directories..."),
        (3, "stdout", "finished initialize successfully"),
        (4, "stdout", "starting start for alpha"),
        (5, "stdout", "Step: mounting runtime overlay..."),
        (6, "stdout", "finished start successfully"),
    ]
```

- [ ] **Step 2: Run test to verify it fails**

Run: `pytest l4d2web/tests/test_job_worker.py::test_successful_start_job_logs_and_refreshes_server_state -v`
Expected: FAIL because boundary markers are missing.

- [ ] **Step 3: Implement boundary logging in job worker**

In `l4d2web/services/job_worker.py`, modify the `run_job` execution block:

First, add a helper to fetch the server name if `server_id` is present:
```python
    server_name = None
    if server_id is not None:
        with session_scope() as db:
            s = db.scalar(select(Server).where(Server.id == server_id))
            server_name = s.name if s else "unknown"
```

Then, wrap the facade calls:
```python
    def _run_with_boundaries(action: str, target: str, func, *args, **kwargs):
        append_job_log_line(job_id, "stdout", f"starting {action} for {target}", max_chars=max_chars)
        func(*args, **kwargs)
        append_job_log_line(job_id, "stdout", f"finished {action} successfully", max_chars=max_chars)

    try:
        if operation == "install":
            _run_with_boundaries("install", "server", l4d2_facade.install_runtime, on_stdout=on_stdout, on_stderr=on_stderr, should_cancel=should_cancel)
        elif operation in SERVER_OPERATIONS and server_id is None:
            raise ValueError(f"{operation} job has no server_id")
        elif operation == "initialize":
            _run_with_boundaries("initialize", server_name, l4d2_facade.initialize_server, server_id, on_stdout=on_stdout, on_stderr=on_stderr, should_cancel=should_cancel)
        elif operation == "start":
            _run_with_boundaries("initialize", server_name, l4d2_facade.initialize_server, server_id, on_stdout=on_stdout, on_stderr=on_stderr, should_cancel=should_cancel)
            raise_if_cancelled()
            _run_with_boundaries("start", server_name, l4d2_facade.start_server, server_id, on_stdout=on_stdout, on_stderr=on_stderr, should_cancel=should_cancel)
        elif operation == "stop":
            _run_with_boundaries("stop", server_name, l4d2_facade.stop_server, server_id, on_stdout=on_stdout, on_stderr=on_stderr, should_cancel=should_cancel)
        elif operation == "delete":
            _run_with_boundaries("delete", server_name, l4d2_facade.delete_server, server_id, on_stdout=on_stdout, on_stderr=on_stderr, should_cancel=should_cancel)
        else:
            raise ValueError(f"unknown job operation: {operation}")
```

- [ ] **Step 4: Run test to verify it passes**

Run: `pytest l4d2web/tests/test_job_worker.py::test_successful_start_job_logs_and_refreshes_server_state -v`
Expected: PASS

- [ ] **Step 5: Verify remaining job worker tests**

Run: `pytest l4d2web/tests/test_job_worker.py -q`
Some tests might fail if they explicitly check log content. Fix them to accommodate the boundary lines. For instance, `test_called_process_error_fails_job_and_sets_server_error` might need the new boundary lines in its assertions.

Example fix in `test_called_process_error_fails_job_and_sets_server_error`:
```python
    assert "starting stop for alpha" in lines
    assert "stop failed" in lines
```

Example fix in `test_cancelled_process_finishes_job_as_cancelled`:
```python
    assert "starting stop for alpha" in lines
    assert "terminating" in lines
```

- [ ] **Step 6: Commit**

```bash
git add l4d2web/services/job_worker.py l4d2web/tests/test_job_worker.py
git commit -m "feat(web): add boundary log lines to job worker execution"
```

---

### Task 5: Verify full test suite and clean up

- [ ] **Step 1: Run all tests**

Run: `pytest l4d2host/tests l4d2web/tests deploy/tests -q`
Expected: ALL PASS.

- [ ] **Step 2: Push changes**

```bash
git push
```