left4me/docs/superpowers/plans/2026-05-14-rcon-console.md

# Add an RCON console to the server detail page

## Context

The server detail page (`/servers/<id>`) already exposes the RCON password,
live state polling, log streaming, and start/stop actions, but to send any
arbitrary command (`changelevel`, `sm_kick`, `mp_*`, `say`, etc.) the user
has to open a separate RCON client and reconnect. Adding an inline console
turns the web UI into a complete operator tool for the owner of a server:
type a command, see the reply, recall earlier commands via persisted
history.

Scope is intentionally narrow:
- One server, one user (the owner). Multi-user shared console = not now.
- Per-user history persisted across reloads.
- No blocklist — owner already has the RCON password and can run anything
  via any RCON client; the UI is a thin wrapper.

## Design decisions (already settled)

| Topic | Choice |
|---|---|
| UI placement | Panel on `server_detail.html`, between **Live State** and **Files**. |
| Output transport | **HTMX append swap**, not SSE. RCON is request/response — SSE adds no value. Matches existing inline-form / `hx-swap` patterns in the codebase. |
| Safety | Owner-of-server check only (`Server.user_id == current_user.id`). No command blocklist. **No admin override** — admins can already SSH if needed; an unaudited UI backdoor isn't worth the asymmetry. |
| History | New `command_history` table, scoped per (user, server). Stores **command + reply + error flag** so the full transcript can be replayed on page reload. |
| Transcript on page load | **Replays the last 50 rows** for this user+server, rendered server-side into the transcript via the same `_console_line.html` partial used for live additions. Visually identical to live lines (no "old vs new" distinction — the whole point is page-reload continuity). |
| Transcript height | Fixed max-height ~400 px, internal vertical scroll. New lines auto-scroll to the bottom on add AND on initial load. Page layout below stays stable. |
| Clear button | None. Reload doesn't help (it replays). If anyone wants to drop history, that's a separate concern handled later. |
| RCON timeout | **30s per command.** Comfortably covers a cold map load with custom add-ons (community-observed worst case ~25s on modest hardware). 3× the python-valve default. Far below `director_transition_timeout` (120s) so no aliasing. If a command exceeds 30s, the RCON exec packet was already sent — the server still did the work; the user just doesn't see the textual reply but sees the effect in the Server Log SSE panel above. |
| Worker model | Rely on `gunicorn --threads N` (or whatever the existing deployment uses for the long-lived SSE log streams). Threads share memory; one stuck `changelevel` holds a thread, not a process. Don't scale processes: hundreds of workers would waste RAM (~100 MB each), whereas an extra thread costs little more than its stack. |

## Server-side changes

### 1. Extend `l4d2web/services/rcon.py`

The wire-protocol layer already exists (`l4d2web/services/rcon.py:64`).
Add a generic command executor with **multi-packet response handling**:

```python
def execute_command(
    host: str, port: int, password: str, command: str, *, timeout: float = 30.0
) -> str:
    """Authenticate, send a single command, return the joined reply body.

    Implements the trailing-marker pattern: after the exec packet we
    immediately send an empty SERVERDATA_RESPONSE_VALUE packet with a
    sentinel req_id. We then read response packets, concatenating bodies,
    until we see the sentinel echo back. This is the only reliable way
    to detect end-of-output, because Source RCON splits replies >4096 B
    across multiple packets and no packet says whether more will follow.
    """
```
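The trailing-marker read loop described in the docstring can be sketched as follows. This is a minimal illustration under stated assumptions, not the existing code: `pack_packet`, `read_packet`, `read_until_sentinel`, `EXEC_ID`, and `SENTINEL_ID` are hypothetical names, and the stream is any object with a blocking `read(n)` (for a real socket, something like `sock.makefile("rb")`).

```python
import io
import struct

SERVERDATA_RESPONSE_VALUE = 0  # packet type defined by the Source RCON protocol
EXEC_ID = 1                    # hypothetical req_id used for the exec packet
SENTINEL_ID = 0x7FFFFFFE       # hypothetical; fits a signed 32-bit int, differs from EXEC_ID


def pack_packet(req_id: int, ptype: int, body: bytes) -> bytes:
    """Frame one packet: size, id, type, body, then two null terminators."""
    payload = struct.pack("<ii", req_id, ptype) + body + b"\x00\x00"
    return struct.pack("<i", len(payload)) + payload


def read_exact(stream, n: int) -> bytes:
    """Read exactly n bytes or raise if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise ConnectionError("stream closed mid-packet")
        buf += chunk
    return buf


def read_packet(stream) -> tuple[int, int, bytes]:
    """Read one framed packet and return (req_id, type, body bytes)."""
    (size,) = struct.unpack("<i", read_exact(stream, 4))
    payload = read_exact(stream, size)
    req_id, ptype = struct.unpack("<ii", payload[:8])
    return req_id, ptype, payload[8:-2]  # drop the two trailing nulls


def read_until_sentinel(stream) -> str:
    """Concatenate exec reply bodies until the sentinel req_id echoes back."""
    parts = []
    while True:
        req_id, _ptype, body = read_packet(stream)
        if req_id == SENTINEL_ID:
            break
        if req_id == EXEC_ID:
            parts.append(body)
    # Decode once after joining so a multi-byte character split across
    # two packets is not mangled.
    return b"".join(parts).decode("utf-8", errors="replace").rstrip()


# Simulate a reply split across two packets, followed by the sentinel echo.
wire = (
    pack_packet(EXEC_ID, SERVERDATA_RESPONSE_VALUE, b"a" * 4000)
    + pack_packet(EXEC_ID, SERVERDATA_RESPONSE_VALUE, b"b" * 100 + b"\n")
    + pack_packet(SENTINEL_ID, SERVERDATA_RESPONSE_VALUE, b"")
)
reply = read_until_sentinel(io.BytesIO(wire))
```

Feeding the loop from an in-memory `BytesIO` is also one plausible way to unit-test reassembly without a socket, alongside the `FakeRconServer` tests.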

Implementation notes:
- Factor `_connect_and_auth(sock, password)` out of `query_status` so
  both functions share the auth dance.
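- Pick a constant sentinel req_id that differs from the exec req_id and
  fits the protocol's signed 32-bit ID field. `0xDEADBEEF` does not fit
  (it exceeds 2^31 − 1), and `-1` is what the server returns on auth
  failure, so avoid both. Read packets until one comes back with the
  sentinel req_id.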
- Input validation **inside this function** (not just at the route):
  - Reject empty / whitespace-only `command` → `ValueError`.
  - Reject embedded `\x00` bytes (would corrupt the null-terminated
    wire format) → `ValueError`.
  - Cap length at 1000 chars (RCON packet limit is 4096 incl. headers;
    no real command needs more). Longer → `ValueError`.
- Trim trailing whitespace from the joined body. Otherwise return verbatim.
- Existing `RconError` / `RconAuthError` exception types are reused.
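The validation rules listed above could look roughly like this. It is a sketch only: `validate_command` and `MAX_COMMAND_CHARS` are hypothetical names, and the real checks may live inline in `execute_command`.

```python
MAX_COMMAND_CHARS = 1000  # mirrors the cap described in the notes above


def validate_command(command: str) -> str:
    """Return the command unchanged, or raise ValueError if it must not hit the wire."""
    if not command or not command.strip():
        raise ValueError("command is empty")
    if "\x00" in command:
        # A null byte would terminate the packet body early.
        raise ValueError("command contains a null byte")
    if len(command) > MAX_COMMAND_CHARS:
        raise ValueError(f"command exceeds {MAX_COMMAND_CHARS} characters")
    return command
```

Calling the same helper from both the route and `execute_command` would keep the two validation layers from drifting apart.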

Tests in `l4d2web/tests/test_rcon.py` (extend the `FakeRconServer` to
support multi-packet replies):
- happy path: single-packet response
- multi-packet response (synthesize a >4096 B reply)
- empty reply (server replies only with the sentinel — case for `say`)
- bad password → `RconAuthError`
- timeout (fake server sleeps longer than the test timeout)
- input validation: empty / null byte / oversized → `ValueError`

### 2. New `CommandHistory` model (`l4d2web/models.py`)

Append at the bottom of `models.py`:

```python
class CommandHistory(Base):
    __tablename__ = "command_history"
    __table_args__ = (
        Index("ix_cmdhist_user_server_id", "user_id", "server_id", "id"),
    )
    id:         Mapped[int]      = mapped_column(Integer, primary_key=True)
    user_id:    Mapped[int]      = mapped_column(ForeignKey("users.id"), nullable=False)
    server_id:  Mapped[int]      = mapped_column(ForeignKey("servers.id", ondelete="CASCADE"), nullable=False)
    command:    Mapped[str]      = mapped_column(Text, nullable=False)
    reply:      Mapped[str]      = mapped_column(Text, nullable=False, default="", server_default="")
    is_error:   Mapped[bool]     = mapped_column(Boolean, nullable=False, default=False, server_default=text("0"))
    created_at: Mapped[datetime] = mapped_column(DateTime, default=now_utc, nullable=False)
```

Index `(user_id, server_id, id)` because every lookup is "latest N for
this user+server", ordered by `id DESC`.

A row is persisted on **every** RCON outcome — successful reply,
empty reply, and error (auth fail, connect refused, `RconError`). The
`is_error` flag drives the red styling on replay, so the transcript
looks identical after a page reload.

**Storage cost**: most replies are <500 B; `status` ~1 KB;
`sm plugins list` a few KB; `cvarlist` can be 50 KB+. A power user
running 100 commands/day at an average ~2 KB → ~73 MB/year. SQLite
handles that without complaint; a trim job (cap N per user/server,
e.g. last 5000) can be added if anyone ever notices.
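The storage estimate is plain arithmetic; spelled out with the figures from the paragraph above (decimal units, 1 MB = 1,000,000 B):

```python
COMMANDS_PER_DAY = 100
AVG_REPLY_BYTES = 2_000  # "~2 KB" average reply
DAYS_PER_YEAR = 365

bytes_per_year = COMMANDS_PER_DAY * AVG_REPLY_BYTES * DAYS_PER_YEAR
mb_per_year = bytes_per_year / 1_000_000

# For comparison: a 5000-row cap at the same average reply size.
capped_mb = 5000 * AVG_REPLY_BYTES / 1_000_000

print(f"{mb_per_year:.0f} MB/year uncapped, {capped_mb:.0f} MB with a 5000-row cap")
```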

**Privacy note for the implementer**: replies from `status` include
player names (user-controlled strings from random Steam users) and
SteamIDs. Treat them as untrusted text on output (handled by Jinja
auto-escaping; see §5) and don't surface them outside this user's
session.

### 3. New alembic migration `0012_command_history.py`

Mirror `l4d2web/alembic/versions/0011_server_hostname.py`:
- `revision = "0012_command_history"`
- `down_revision = "0011_server_hostname"`
- `upgrade()`: `op.create_table("command_history", …)` with columns
  `id`, `user_id`, `server_id`, `command (Text)`, `reply (Text, server_default="")`,
  `is_error (Boolean, server_default=sa.text("0"))` (matching the model), `created_at`; plus
  `op.create_index("ix_cmdhist_user_server_id", ...)`.
- `downgrade()`: drop index then table.
- `test_alembic_migrations.py` auto-discovers revisions (skim once to
  confirm; no edit if so).

### 4. New route module `l4d2web/routes/console_routes.py`

Two endpoints, both `@require_login`, both verify ownership with
**404** on miss (matches the existing pattern at
`page_routes.py:303` — no admin backdoor).

**`POST /servers/<id>/console`** — submit a command.
- CSRF-checked (form field `csrf_token`).
- Form field `command`. Validation happens twice: at the route (return a
  user-facing error fragment for empty / oversized) and inside
  `execute_command` (defence in depth — never trust a single layer).
- Calls
  `rcon.execute_command("127.0.0.1", server.port, server.rcon_password, command)`.
- **Every outcome persists a `CommandHistory` row** (so the transcript
  fully reconstructs on page reload):
  - Success with reply → `command`, `reply`, `is_error=False`.
  - Success with empty reply (e.g. `say`) → `command`, `reply=""`,
    `is_error=False`. Template renders `(no reply)` in muted text.
  - `RconAuthError` / `RconError` / connect-failed → `command`,
    `reply=<exception message>`, `is_error=True`. Red styling on render.
- On `ValueError` from input validation (empty / null byte / oversized):
  render an error fragment, **do not** insert history (the command
  never reached the wire — nothing happened to remember).
- Returns 200 in all cases (errors are rendered, not raised) so HTMX
  appends them to the transcript like any other line.
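The outcome rules above reduce to a small decision function. The sketch below is illustrative only: `Outcome` and `run_command` are hypothetical names, the two exception classes are stand-ins for the existing ones in `services/rcon.py`, and catching `OSError` for connect failures and socket timeouts is an assumption about how those surface.

```python
from dataclasses import dataclass
from typing import Callable


class RconError(Exception):
    """Stand-in for the existing exception type."""


class RconAuthError(Exception):
    """Stand-in for the existing exception type."""


@dataclass
class Outcome:
    reply: str
    is_error: bool
    persist: bool  # whether a CommandHistory row should be written


def run_command(execute: Callable[[str], str], command: str) -> Outcome:
    """Map every result of `execute` onto what the route renders and stores."""
    try:
        reply = execute(command)
    except ValueError as exc:
        # Validation failure: the command never reached the wire, so no row.
        return Outcome(str(exc), is_error=True, persist=False)
    except (RconAuthError, RconError, OSError) as exc:
        # The attempt happened; record it so the error replays on reload.
        return Outcome(str(exc), is_error=True, persist=True)
    # Success, including the empty-reply case (rendered as "(no reply)").
    return Outcome(reply, is_error=False, persist=True)
```

In the route, `execute` would be a closure over `rcon.execute_command` with the server's port and password; the handler then renders `_console_line.html` from the `Outcome` and returns 200 either way.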

**`GET /servers/<id>/console/history?before=<id>&limit=50`** — paged
history for up-arrow navigation.
- Returns JSON `[{"id": …, "command": …}, …]` ordered newest-first.
- The client owns the input state; this stays JSON, not HTML.
- `limit` clamped to ≤200.
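The paging contract (newest-first, `before` as an exclusive upper bound on `id`, `limit` clamped) can be modelled without a database. `clamp_limit` and `page_history` are hypothetical helpers operating on in-memory `(id, command)` tuples; the real endpoint would express the same thing as a query.

```python
from typing import Optional

DEFAULT_LIMIT = 50
MAX_LIMIT = 200


def clamp_limit(raw: Optional[str]) -> int:
    """Parse the ?limit= value, falling back to the default and capping at MAX_LIMIT."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return DEFAULT_LIMIT
    return max(1, min(value, MAX_LIMIT))


def page_history(rows, before: Optional[int] = None, limit: int = DEFAULT_LIMIT):
    """Return up to `limit` rows with id < before, newest first, as JSON-ready dicts."""
    older = [row for row in rows if before is None or row[0] < before]
    older.sort(key=lambda row: row[0], reverse=True)
    return [{"id": row_id, "command": command} for row_id, command in older[:limit]]


rows = [(1, "status"), (2, "say hello"), (3, "changelevel c1m1_hotel")]
first_page = page_history(rows, limit=2)
# The client asks for the next page using the oldest id it already holds.
next_page = page_history(rows, before=first_page[-1]["id"], limit=2)
```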

Register the blueprint in `l4d2web/app.py` alongside the other
`*_routes` modules.

**Also extend `server_detail()` in `page_routes.py`** to fetch the last
50 `CommandHistory` rows for this `(user, server)` and pass them as
`console_history` in the render context. Query newest-first with a
limit of 50, then reverse the result so the rows iterate oldest-first in
the template. Use the same `session_scope` block that already loads
`server` and `blueprint` (`page_routes.py:301`): one extra
`db.scalars(select(CommandHistory)…)` call in the existing session, so
no new session or connection is opened.
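A dependency-free sketch of the "last 50, oldest-first" lookup, using the standard-library `sqlite3` module instead of the project's SQLAlchemy session so it runs on its own. The table is a cut-down version of `command_history`, and `last_n_oldest_first` is a hypothetical helper name.

```python
import sqlite3


def last_n_oldest_first(conn, user_id: int, server_id: int, n: int = 50):
    """Fetch the newest n rows for (user, server), returned oldest-first."""
    rows = conn.execute(
        "SELECT id, command, reply, is_error FROM command_history "
        "WHERE user_id = ? AND server_id = ? "
        "ORDER BY id DESC LIMIT ?",
        (user_id, server_id, n),
    ).fetchall()
    rows.reverse()  # the template iterates top-to-bottom, newest line last
    return rows


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE command_history ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL,"
    " server_id INTEGER NOT NULL,"
    " command TEXT NOT NULL,"
    " reply TEXT NOT NULL DEFAULT '',"
    " is_error BOOLEAN NOT NULL DEFAULT 0)"
)
conn.execute(
    "CREATE INDEX ix_cmdhist_user_server_id "
    "ON command_history (user_id, server_id, id)"
)
# 60 rows for user 1 on server 7, plus one row from another user on the same server.
conn.executemany(
    "INSERT INTO command_history (user_id, server_id, command) VALUES (?, ?, ?)",
    [(1, 7, f"cmd{i}") for i in range(60)] + [(2, 7, "other user")],
)
history = last_n_oldest_first(conn, user_id=1, server_id=7)
```

Filtering on both `user_id` and `server_id` is what keeps another user's rows for the same server out of the transcript.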

### 5. Template fragment `templates/_console_line.html`

```jinja2
<div class="console-line{% if error %} console-error{% endif %}">
  <div class="console-prompt">&gt; {{ command }}</div>
  {% if reply %}
    <pre class="console-reply">{{ reply }}</pre>
  {% else %}
    <div class="console-reply muted">(no reply)</div>
  {% endif %}
</div>
```

**XSS reminder for the implementer:** `reply` originates from the game
server's RCON output — we do not trust it. **Never use `|safe`**, never
`{{ reply|markdown }}`, never anything that bypasses Jinja's default
HTML escaping. The existing `{{ reply }}` is the right call.
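To make the risk concrete, here is what escaping does to a hostile player name in a reply. The standard-library `html.escape` stands in for Jinja's auto-escaping (Jinja itself uses MarkupSafe, which emits slightly different entities for quotes), and the reply text is made up for illustration.

```python
import html

# A made-up reply line containing a player name chosen by an attacker.
player_name = "<script>alert(1)</script>"
reply = f"players : 1 humans ({player_name})"

escaped = html.escape(reply)
print(escaped)
```

The escaped string renders as literal text inside the `<pre>`; marking `reply` as `|safe` would instead hand the `<script>` tag to the browser.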

### 6. Console panel in `templates/server_detail.html`

Insert between the existing live-state section (line 33–37) and the
Files section (line 39):

```jinja2
<h2 class="section-title">Console</h2>
<section class="panel console-panel">
  <div id="console-transcript-{{ server.id }}"
       class="console-transcript"
       data-autoscroll>
    {% for h in console_history %}
      {# The partial reads `command`, `reply`, and `error`; map the row onto those names. #}
      {% with command=h.command, reply=h.reply, error=h.is_error %}
        {% include "_console_line.html" %}
      {% endwith %}
    {% endfor %}
  </div>
  <form hx-post="/servers/{{ server.id }}/console"
        hx-target="#console-transcript-{{ server.id }}"
        hx-swap="beforeend"
        hx-indicator=".console-spinner"
        hx-on::after-request="this.command.value=''; this.command.focus(); this.closest('section').querySelector('[data-autoscroll]').scrollTop = 1e9"
        class="console-input-form"
        data-console-form data-server-id="{{ server.id }}">
    <input type="hidden" name="csrf_token" value="{{ session.get('csrf_token', '') }}">
    <span class="console-prompt-glyph">&gt;</span>
    <input name="command" autocomplete="off" spellcheck="false" maxlength="1000"
           placeholder="status, changelevel c1m1_hotel, sm_kick …">
    <span class="console-spinner" aria-hidden="true">…</span>
    <button type="submit">Send</button>
  </form>
</section>
```

- Transcript is server-side rendered with the last 50 history rows on
  page load. `_console_line.html` is the single source of truth for
  line layout — same template, same look, whether the line came from
  this session or last week.
- `hx-indicator` gives visible feedback during slow commands (a
  `changelevel` can sit at ~10s+).
- `maxlength="1000"` on the input mirrors the server-side cap.
- The `hx-on::after-request` inline scrolls the transcript to the
  bottom after each new line. On initial page load, the JS module
  scrolls to the bottom once after the DOM is ready (so the most
  recent history is visible, not the oldest).

**Cross-feature interaction (do not "fix"):** Silent or slow commands
(`say`, `kick`, `changelevel`) will produce empty or terse RCON replies
in this transcript. The actual game-side effect is already visible in
the **Server Log** SSE panel right above. A future implementer should
NOT try to mirror server-log lines back into the console transcript —
that's a redundancy, not a feature.

### 7. New `static/js/console-history.js`

Tiny module bound to `[data-console-form]`:
- **On DOM ready**: scroll each `[data-autoscroll]` transcript to the
  bottom so the most recent replayed lines are visible. This is the
  initial-load equivalent of the `hx-on::after-request` scroll.
- **On first focus** of the input: lazy-fetch
  `/servers/<id>/console/history?limit=50` and cache the array in
  memory. (Distinct from the rendered-on-load transcript: this cache
  is *just commands* for up/down recall — replies don't matter for
  navigation, so the JSON endpoint stays narrow.)
- **ArrowUp / ArrowDown**: walk the cached array, set `input.value`.
  - ArrowUp from a non-history state: snapshot the current value so
    ArrowDown can restore it.
- **ArrowUp past the end**: fetch next page using
  `?before=<oldest_cached_id>`. If empty, stop.
- **After a successful submit** (`htmx:afterRequest` with 2xx):
  prepend the just-sent command to the in-memory cache so it's
  instantly recallable.
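The recall behaviour above is a small state machine. The module itself will be JavaScript; the Python model below only pins down the intended semantics (draft snapshot, walking, restore). `HistoryCursor` and its method names are hypothetical, and the on-demand page fetch is left out.

```python
class HistoryCursor:
    """Model of up/down recall over a newest-first list of commands."""

    def __init__(self, commands):
        self.commands = list(commands)  # newest first, as the JSON endpoint returns them
        self.index = -1                 # -1 means the user is editing a fresh draft
        self.draft = ""

    def up(self, current_value: str) -> str:
        if self.index == -1:
            self.draft = current_value  # snapshot so ArrowDown can restore it
        if self.index + 1 < len(self.commands):
            self.index += 1
        # At the oldest cached entry the real module would fetch the next page.
        return self.commands[self.index] if self.index >= 0 else current_value

    def down(self, current_value: str) -> str:
        if self.index == -1:
            return current_value        # already on the draft; nothing to do
        self.index -= 1
        return self.draft if self.index == -1 else self.commands[self.index]

    def push(self, command: str) -> None:
        """Called after a successful submit so the command is instantly recallable."""
        self.commands.insert(0, command)
        self.index = -1
        self.draft = ""


cursor = HistoryCursor(["say hello", "status"])
walk = [
    cursor.up("chan"),         # snapshot draft "chan", show newest command
    cursor.up("say hello"),    # one step older
    cursor.up("status"),       # at the end of the cache, stays put
    cursor.down("status"),     # back towards newer
    cursor.down("say hello"),  # restores the draft
]
```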

Loaded via a `<script defer>` line in `base.html` next to the other
small static JS modules (same pattern as `sse.js`).

### 8. Concurrency sanity (no code, just verifying the design)

`live_state_poller.py` already opens fresh RCON connections every 5s
against the same port. SrcDS handles concurrent RCON sessions cleanly
(each is independently auth'd, no shared state). The console adds at
most one more concurrent connection per active user — well within
limits. No locking needed.

### 9. Minimal CSS in `static/css/`

Monospace transcript, dark background, `console-error` styled like the
existing error pills. Match the visual weight of the existing log-stream
`<pre>` block on the detail page — no new design system.

## Files to touch

| File | Change |
|---|---|
| `l4d2web/services/rcon.py` | Add `execute_command()` with multi-packet handling + input validation; extract `_connect_and_auth()` |
| `l4d2web/tests/test_rcon.py` | Extend `FakeRconServer` for multi-packet; add success / multi-packet / empty / bad-pw / timeout / validation tests |
| `l4d2web/models.py` | Add `CommandHistory` (with `reply`, `is_error`) |
| `l4d2web/alembic/versions/0012_command_history.py` | New migration |
| `l4d2web/routes/console_routes.py` | **NEW** — POST + GET endpoints |
| `l4d2web/routes/page_routes.py` | Extend `server_detail()` to fetch last 50 history rows and pass `console_history` |
| `l4d2web/app.py` | Register the new blueprint |
| `l4d2web/templates/_console_line.html` | **NEW** fragment |
| `l4d2web/templates/server_detail.html` | Insert console panel section (with server-rendered replay loop) |
| `l4d2web/static/js/console-history.js` | **NEW** up/down history nav + initial scroll-to-bottom |
| `l4d2web/templates/base.html` | `<script defer src="…/console-history.js">` |
| `l4d2web/static/css/*.css` | Console panel styling (fixed-height scroll transcript, error variant) |
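| `l4d2web/tests/test_console_routes.py` | **NEW** route tests |
| `l4d2web/tests/test_page_routes.py` | Extend if present, otherwise add: `server_detail` history replay and per-user scoping tests |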

## Tests to write explicitly

**`test_rcon.py`** (extending existing file):
- `execute_command` happy path, single-packet reply
- `execute_command` multi-packet reply (>4096 B) reassembled in order
- `execute_command` empty reply (server returns only the sentinel)
- `execute_command` bad password → `RconAuthError`
- `execute_command` socket timeout → `RconError`
- Input validation: empty / whitespace-only / null-byte / oversized → `ValueError`

**`test_console_routes.py`** (new):
- not logged in → 302 to login
- logged in but not server owner → **404** (not 403 — match
  `page_routes.py:303`)
- valid command → 200, fragment HTML rendered, `CommandHistory` row
  inserted with `reply` populated and `is_error=False`
- empty RCON reply → 200, fragment renders `(no reply)`, history row
  inserted with `reply=""`, `is_error=False`
- RCON error (mock `execute_command` to raise) → 200, error fragment,
  history row inserted with `is_error=True` and the exception message
  in `reply`
- empty/oversized command (validation error before wire) → 200, error
  fragment, **no** history row
- CSRF token missing → rejected
- `GET /console/history` returns newest-first
- `GET /console/history?before=<id>` paginates correctly
- `GET /console/history?limit=10000` is clamped to ≤200

**`test_page_routes.py`** (extend existing if present, otherwise add):
- `server_detail` returns the last 50 `CommandHistory` rows for the
  viewing user only, oldest-first in the rendered page (newest at the
  bottom of the transcript)
- a history row belonging to another user for the same server is **not**
  visible (ownership scoping is by `user_id`, not just `server_id`)

## What we are deliberately NOT doing

- No command blocklist or admin gate — owner already has the password.
- **No admin override** to console other users' servers (admins can SSH if
  they truly need to; UI backdoor would be unaudited and asymmetric).
- No shared multi-user view of the same console.
- No streaming output (RCON doesn't stream; replies are one-shot).
- No autocomplete of cvars — out of scope; up-arrow history is enough.
- No "Clear transcript" button — the transcript replays on every page
  load by design. Discarding history is a different concern (delete
  rows from the DB) and is out of scope for v1.
- No history-trim job — file an issue if anyone hits >100k rows; not
  worth pre-empting at this scale.
- No mirroring of server-log lines into the console transcript — the
  Server Log panel above already serves that purpose.

## Verification

1. `pytest l4d2web/tests/test_rcon.py l4d2web/tests/test_console_routes.py l4d2web/tests/test_page_routes.py l4d2web/tests/test_alembic_migrations.py`: unit, route, page, and migration tests pass.
2. Boot the web app locally, log in, open a server detail page for a
   running server, send `status` — multi-line reply renders in the
   transcript; the input clears and refocuses; spinner shows during
   the request; the transcript scrolls to the new line at the bottom.
3. Send `cvarlist` — a large multi-packet response — and confirm the
   full output reassembles, not truncated.
4. Send `say hello` — transcript shows `> say hello` followed by
   `(no reply)` in muted text; the line appears in the Server Log
   panel above.
5. Send `changelevel c1m1_hotel` — request takes ~10–20s, spinner
   visible the whole time, then a (likely empty) reply appears, and
   the live-state panel updates to the new map within 5s.
6. Send an invalid command (e.g. `nonsense_cvar`) — reply renders
   normally (RCON tolerates unknown commands).
7. Send a command with embedded null bytes (via curl, since the
   browser strips them) — returns 200 with an error fragment, no
   history row.
8. Send a 2000-char command — rejected with an error fragment, no
   history row.
9. **Reload the page** — the transcript reappears identical to before,
    showing the same `> status`, `> say hello`, `> nonsense_cvar` lines
    with their replies, scrolled to the bottom. Errors are still red.
10. Focus the input, press ArrowUp — the previous command reappears.
    ArrowDown restores the empty state.
11. Send 60+ commands, then ArrowUp past the in-memory page boundary —
    older commands load on demand.
12. Stop the server, try to send a command — surfaces as a styled
    `console-error` line ("connect failed") rather than a 500; **a
    history row IS inserted** with `is_error=True`, so the error
    replays on next page load.
13. Log in as a different user and visit `/servers/<id>` for a server
    the first user owns: 404, no console rendered. A POST to
    `/servers/<id>/console` also returns 404. The first user's
    transcript is not visible.
14. Confirm that a `cvarlist`-class large reply persists fully in the
    DB (`SELECT length(reply) FROM command_history ORDER BY id DESC LIMIT 1;`)
    and replays in full on page reload.