left4me/docs/superpowers/plans/2026-05-15-uid-collapse.md
mwiegand 146cb01450
plan(uid-collapse): drop l4d2-sandbox user; handoff to next session
Approved-but-not-executed plan to collapse the two-user model
(left4me + l4d2-sandbox) into one. The build-time-idmap that
translates sandbox writes back to left4me uid becomes a no-op when
source uid == target uid, so it's removed along with ~30 lines of
helper plumbing. Hardening already covers the same-uid attack
surface the sandbox uid was defending against, so collapsing makes
the architecture consistent with the web/server hardening-only
decision.

Plan: docs/superpowers/plans/2026-05-15-uid-collapse.md
Handoff: docs/superpowers/specs/2026-05-15-session-handoff.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 15:39:51 +02:00


UID collapse — remove l4d2-sandbox user

Context

The hardening refactor that landed earlier today (docs/superpowers/plans/2026-05-15-hardening-refactor.md) deployed the systemd-directive composition that covers all same-uid attack vectors for the gameserver and web units running as left4me.

The script-sandbox unit still runs as a separate uid l4d2-sandbox (981) with a build-time idmap (mount --bind --map-users=980:981:1) translating sandbox-side writes to land on disk as left4me. After the hardening refactor, the same-uid attack vectors the sandbox uid defends against (FS-view access, ptrace, /proc, signals) are already closed by the sandbox's own systemd-run hardening profile. The separate uid is now defense-in-depth only — and it's inconsistent with the decision not to split the web/server uid.

Pick one principle. Option C from the discussion: one user. Delete l4d2-sandbox, simplify the sandbox helper, remove the idmap. The architecture gets smaller (one fewer uid, no idmap binds, ~30 lines deleted from the helper). Trade-off: if the sandbox hardening regresses, the kernel uid boundary no longer helps, which is consistent with what we already accepted for server/web.
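
The claim that the idmap becomes a no-op can be illustrated with a toy model of a single-range uid mapping. This is only a sketch of the translation this plan describes (writes made as uid 981 inside the mount landing on disk as uid 980), not a model of how the kernel implements idmapped mounts. The function `on_disk_uid` and its parameter names are hypothetical, and reading 980:981:1 as disk_start:mount_start:count is an assumption taken from the behaviour described above.

```python
def on_disk_uid(writer_uid: int, disk_start: int, mount_start: int, count: int = 1) -> int:
    """Toy model: translate the uid seen inside the mount to the uid stored on disk.

    Assumption: uids outside the mapped range are returned unchanged.
    That is a simplification for this sketch, not real kernel behaviour.
    """
    offset = writer_uid - mount_start
    if 0 <= offset < count:
        return disk_start + offset
    return writer_uid


# Two-user model: the sandbox (981) writes, the file lands as left4me (980).
print(on_disk_uid(981, disk_start=980, mount_start=981))  # 980

# One-user model: left4me (980) writes; a 980:980:1 mapping changes nothing.
print(on_disk_uid(980, disk_start=980, mount_start=980))  # 980
```

With source and target uid equal, the mapping is the identity, which is why the staging mount and its cleanup can be deleted without changing what ends up on disk.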

Approach

  1. Edit scripts/libexec/left4me-script-sandbox (left4me repo): delete the idmap block (lines 49-78 per Phase 1 exploration — the LEFT4ME_UID/SANDBOX_UID lookups, STAGING setup, cleanup_staging trap, mount --bind --map-users=… call). Change User=l4d2-sandbox -p Group=l4d2-sandbox (line 85) to User=left4me -p Group=left4me. Change BindPaths="${STAGING}:/overlay" (line 102) to BindPaths="${OVERLAY_DIR}:/overlay". Keep the nsenter --mount=/proc/1/ns/mnt self-wrap at the top — it's about namespace escape, not uid.

  2. Update scripts/tests/test_script_sandbox.py (left4me repo):

    • Lines 36-37: change User=l4d2-sandbox/Group=l4d2-sandbox assertions → User=left4me/Group=left4me.
    • Delete test_script_sandbox_uses_idmap_staging (lines 114-133) entirely — it asserts the idmap and staging exist; after refactor neither does.
    • Update the comments on lines 165-166 to drop the sandbox-uid reference.
  3. Update inline comments referencing the sandbox uid:

    • l4d2web/services/overlay_builders.py:342 (or near 100 — agents reported different lines; locate via grep) — "as l4d2-sandbox" → "as left4me".
    • l4d2host/instances.py:80 — comment about l4d2-sandbox-owned lower-layer files → reflect that all overlay content is now left4me-owned end-to-end.
  4. Mark the build-time-idmap plan superseded: docs/superpowers/plans/2026-05-15-build-time-idmap.md — add a top-line status note: "SUPERSEDED 2026-05-15 by the uid-collapse refactor (this plan). The idmap pattern this plan introduced is removed because source uid (left4me) now equals target uid (left4me) — translation is a no-op." Same one-line treatment for docs/superpowers/plans/2026-05-14-overlay-idmap.md.

  5. Update the user-uid-split spec's existing superseded header: docs/superpowers/specs/2026-05-15-user-uid-split-design.md — currently says "2 users (current state) is correct"; revise to say "1 user (after uid-collapse refactor) is correct" and update the reasoning paragraph.

  6. Light-touch updates to other docs that reference l4d2-sandbox for accuracy. Pragmatic scope — add a top-line note instead of rewriting body content:

    • deploy/README.md — drop the l4d2-sandbox bullet (line 84), fix the paragraph at line 141 to reflect no-idmap state.
    • docs/superpowers/specs/2026-05-15-hardening-refactor-design.md and 2026-05-15-hardening-threat-model.md — add a one-line "Updated 2026-05-15: l4d2-sandbox collapsed into left4me; see plans/2026-05-15-uid-collapse.md" note in the relevant context section.
    • docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md — same one-line note (the spec's hardening profile sketch references the old User=l4d2-sandbox; the new build-overlay-unit refactor when it lands will inherit User=left4me from this change).
    • Leave the 2026-05-08-* design specs alone. They describe the historical design at the time; rewriting them obscures the evolution. Anyone reading them sees the date, and the superseded-note chain leads forward.
  7. Remove l4d2-sandbox from the ckn-bw bundle (~/Projekte/ckn-bw/bundles/left4me/items.py):

    • Delete the l4d2-sandbox entry from the users dict (lines 54-58 per Phase 1).
    • Delete the l4d2-sandbox entry from the groups dict (line 44).
    • Update the /var/lib/left4me mode comment + decide whether to change 0711 → 0755. The 0711 was specifically to let l4d2-sandbox traverse (not list) the dir; with sandbox gone, 0755 is the natural choice. Pick 0755.
  8. On-host pre-flight: before bw apply, chown any remaining uid-981 files to left4me:

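    # List up to 50 files still owned by the old sandbox uid.
    # The trailing backslash keeps the remote command on one logical line.
    ssh left4.me 'sudo find /var/lib/left4me /opt/left4me -uid 981 -print \
                  | head -50'
    # If any results, chown them:
    ssh left4.me 'sudo find /var/lib/left4me /opt/left4me -uid 981 \
                  -exec chown left4me:left4me {} +'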
    

    Per the build-time-idmap plan that landed earlier, new sandbox writes already land as left4me, so the result should be small or empty. The chown catches any stragglers.

  9. Cross-repo push + bw apply:

    • Commit left4me changes (helper, tests, doc updates) on master.
    • Commit ckn-bw changes (users/groups deletion, mode change) on master.
    • Push both.
    • bw apply ovh.left4me.
  10. Verify:

    • getent passwd l4d2-sandbox on the host → no result (user removed).
    • sudo find /var/lib/left4me /opt/left4me -uid 981 -print → empty.
    • Trigger a sandbox build via the web UI; observe in journalctl -u 'left4me-script-*' that the transient unit runs as left4me, completes successfully, and the resulting overlay files in /var/lib/left4me/overlays/<id>/ are left4me:left4me.
    • pytest scripts/tests/test_script_sandbox.py locally passes with updated assertions.
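
The updated assertions from step 2 might look roughly like the following. The real harness in scripts/tests/test_script_sandbox.py is not shown in this plan, so `check_collapsed_argv` and the sample argv (including the overlay id 42) are hypothetical stand-ins; only the User=/Group= property strings come from the steps above.

```python
def check_collapsed_argv(argv: list[str]) -> list[str]:
    """Return a list of problems found in a systemd-run argv (empty if none)."""
    problems = []
    # After the collapse the transient unit must run as left4me.
    for wanted in ("User=left4me", "Group=left4me"):
        if wanted not in argv:
            problems.append(f"missing property {wanted}")
    # No argument may still mention the removed user or group.
    if any("l4d2-sandbox" in arg for arg in argv):
        problems.append("argv still references l4d2-sandbox")
    return problems


# Hypothetical argv, shaped like `systemd-run -p KEY=VALUE ...`.
sample = [
    "systemd-run",
    "-p", "User=left4me",
    "-p", "Group=left4me",
    "-p", "BindPaths=/var/lib/left4me/overlays/42:/overlay",
]
print(check_collapsed_argv(sample))  # []
```

Returning a list of problems instead of asserting inside the helper keeps the failure message readable when more than one property is wrong.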

Files to modify

Left4me repo (~/Projekte/left4me):

  • scripts/libexec/left4me-script-sandbox — helper changes (step 1)
  • scripts/tests/test_script_sandbox.py — test updates (step 2)
  • l4d2web/services/overlay_builders.py — comment update (step 3)
  • l4d2host/instances.py — comment update (step 3)
  • docs/superpowers/plans/2026-05-15-build-time-idmap.md — SUPERSEDED header (step 4)
  • docs/superpowers/plans/2026-05-14-overlay-idmap.md — SUPERSEDED header (step 4)
  • docs/superpowers/specs/2026-05-15-user-uid-split-design.md — update existing superseded header (step 5)
  • docs/superpowers/specs/2026-05-15-hardening-refactor-design.md — one-line note (step 6)
  • docs/superpowers/specs/2026-05-15-hardening-threat-model.md — one-line note (step 6)
  • docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md — one-line note (step 6)
  • deploy/README.md — drop sandbox bullet, update idmap paragraph (step 6)
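
To check that the list above is complete, a small scan for leftover l4d2-sandbox mentions can help. This is a sketch and not part of the plan itself: `find_stale_references` is a hypothetical helper, and skipping files whose name starts with 2026-05-08- mirrors the decision in step 6 to leave those specs alone.

```python
import os

NEEDLE = "l4d2-sandbox"
SKIP_PREFIX = "2026-05-08-"  # historical design specs stay untouched (step 6)


def find_stale_references(root: str) -> list[tuple[str, int]]:
    """Return sorted (path, line_number) pairs for every line mentioning NEEDLE."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into VCS metadata.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            if name.startswith(SKIP_PREFIX):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    for number, line in enumerate(handle, start=1):
                        if NEEDLE in line:
                            hits.append((path, number))
            except OSError:
                continue  # unreadable file: skip rather than abort the scan
    return sorted(hits)
```

Calling `find_stale_references(".")` from the repo root lists candidate lines. Some hits are expected to remain on purpose, such as this plan and the superseded notes that name the old user.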

Ckn-bw repo (~/Projekte/ckn-bw):

  • bundles/left4me/items.py — drop l4d2-sandbox user + group; change /var/lib/left4me mode 0711 → 0755 (step 7)

Host actions (no commits):

  • pre-flight chown of orphan-981 files (step 8)
  • bw apply ovh.left4me (step 9)

Verification

End-to-end on left4.me:

# User removed
ssh left4.me 'getent passwd l4d2-sandbox; getent group l4d2-sandbox'
# Expect: empty (both)

# No orphan-uid files
ssh left4.me 'sudo find /var/lib/left4me /opt/left4me -uid 981 -print 2>/dev/null'
# Expect: empty

# Sandbox build runs as left4me end-to-end
# (Trigger via web UI; then check)
ssh left4.me 'sudo journalctl --since "5 minutes ago" -u "left4me-script-*" | head -30'
# Expect: clean run, no permission errors

ssh left4.me 'sudo ls -ln /var/lib/left4me/overlays/<id>/ | head -5'
# Expect: uid 980 (left4me), not 981

# Local tests
cd ~/Projekte/left4me && pytest scripts/tests/test_script_sandbox.py -q
# Expect: all green (one fewer test — the idmap test was deleted)
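
The find -uid 981 check can also be expressed as a short script, which is handy when the same check is wanted in a test or a dry run. A sketch only: `files_owned_by` is a hypothetical name, and the uid and paths in the usage note are the ones quoted in this plan.

```python
import os


def files_owned_by(uid: int, roots: list[str]) -> list[str]:
    """List every path at or under the given roots whose owner is `uid`.

    Uses lstat so a symlink is judged by its own owner, not its target's.
    Paths that vanish or cannot be stat'ed are skipped.
    """
    matches = []

    def check(path: str) -> None:
        try:
            if os.lstat(path).st_uid == uid:
                matches.append(path)
        except OSError:
            pass

    for root in roots:
        check(root)
        for dirpath, dirnames, filenames in os.walk(root):
            for name in dirnames + filenames:
                check(os.path.join(dirpath, name))
    return matches
```

On the host, `files_owned_by(981, ["/var/lib/left4me", "/opt/left4me"])` should return an empty list once the pre-flight chown has run. It needs enough privileges to read every directory, in the same way the find commands above use sudo.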

Rollback

If the deploy goes wrong:

  • git revert the left4me commits + the ckn-bw commit, push, bw apply again.
  • ckn-bw will recreate the l4d2-sandbox user on the host.
  • The old helper script comes back via git_deploy.
  • Any files chown'd from 981→980 in the pre-flight stay at 980; that's fine because sandbox writes land on disk as 980 with either helper (the old one via the idmap, the new one directly).

Risks

  • Sandbox build running during bw apply: ckn-bw's user-removal step might fail if an l4d2-sandbox-uid process is alive. Mitigation: don't apply during a build. Quick check before apply: ssh left4.me 'sudo systemctl list-units --type=service "left4me-script-*"' → expect "0 loaded units".
  • Orphan files not caught by the pre-flight find: if any uid-981 file exists outside /var/lib/left4me or /opt/left4me, the user removal succeeds but the file is left owned by an orphaned uid. In practice these two paths are exhaustive; if paranoid, expand the find to /.
  • The nsenter self-wrap exists only because the web unit runs with PrivateTmp=true. If the web unit's PrivateTmp ever goes away, the wrap becomes unnecessary. Not affected by this refactor; flag for future cleanup.

Out of scope

  • Renaming left4me to something else (e.g., l4d2-app). Cosmetic only; not worth the migration cost.
  • The broader configmgmt responsibility reshape (drop-ins owned by left4me, ckn-bw as thin file-shipper). Deferred per the hardening-refactor design.
  • build-overlay-unit template refactor (docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md) — still queued; will inherit User=left4me cleanly from this work.
  • Rewriting historical 2026-05-08-* design specs.