left4me/docs/superpowers/plans/2026-05-15-uid-collapse.md
mwiegand 146cb01450
plan(uid-collapse): drop l4d2-sandbox user; handoff to next session
Approved-but-not-executed plan to collapse the two-user model
(left4me + l4d2-sandbox) into one. The build-time-idmap that
translates sandbox writes back to left4me uid becomes a no-op when
source uid == target uid, so it's removed along with ~30 lines of
helper plumbing. Hardening already covers the same-uid attack
surface the sandbox uid was defending against, so collapsing makes
the architecture consistent with the web/server hardening-only
decision.

Plan: docs/superpowers/plans/2026-05-15-uid-collapse.md
Handoff: docs/superpowers/specs/2026-05-15-session-handoff.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 15:39:51 +02:00


# UID collapse — remove `l4d2-sandbox` user
## Context
The hardening refactor that landed earlier today
(`docs/superpowers/plans/2026-05-15-hardening-refactor.md`) deployed
the systemd-directive composition that covers all same-uid attack
vectors for the gameserver + web units running as `left4me`.
The script-sandbox unit still runs as a separate uid `l4d2-sandbox`
(981) with a build-time idmap (`mount --bind --map-users=980:981:1`)
translating sandbox-side writes to land on disk as `left4me`. After
the hardening refactor, the same-uid attack vectors the sandbox uid
defends against (FS-view access, ptrace, /proc, signals) are
already closed by the sandbox's own systemd-run hardening profile.
The separate uid is now defense-in-depth only — and it's
inconsistent with the decision *not* to split the web/server uid.
Pick one principle. Option C from the discussion: **one user**.
Delete `l4d2-sandbox`, simplify the sandbox helper, remove the
idmap. Architecture gets smaller (one fewer uid, no idmap binds,
~30 lines deleted from the helper). The trade-off: if sandbox hardening
regresses, the kernel uid boundary no longer helps, which is consistent
with what we already accepted for server/web.
## Approach
1. **Edit `scripts/libexec/left4me-script-sandbox`** (left4me repo):
delete the idmap block (lines 49-78 per Phase 1 exploration —
the `LEFT4ME_UID`/`SANDBOX_UID` lookups, `STAGING` setup,
`cleanup_staging` trap, `mount --bind --map-users=…` call).
Change `User=l4d2-sandbox -p Group=l4d2-sandbox` (line 85)
to `User=left4me -p Group=left4me`. Change
`BindPaths="${STAGING}:/overlay"` (line 102) to
`BindPaths="${OVERLAY_DIR}:/overlay"`. Keep the
`nsenter --mount=/proc/1/ns/mnt` self-wrap at the top — it's
about namespace escape, not uid.
2. **Update `scripts/tests/test_script_sandbox.py`** (left4me repo):
- Lines 36-37: change `User=l4d2-sandbox`/`Group=l4d2-sandbox`
assertions → `User=left4me`/`Group=left4me`.
- Delete `test_script_sandbox_uses_idmap_staging` (lines 114-133)
entirely — it asserts the idmap and staging exist; after
refactor neither does.
- Update the comments on lines 165-166 to drop the sandbox-uid reference.
3. **Update inline comments** referencing the sandbox uid:
- `l4d2web/services/overlay_builders.py:342` (or near 100 — agents
reported different lines; locate via grep) — "as l4d2-sandbox"
→ "as left4me".
- `l4d2host/instances.py:80` — comment about l4d2-sandbox-owned
lower-layer files → reflect that all overlay content is now
left4me-owned end-to-end.
4. **Mark the build-time-idmap plan superseded**:
`docs/superpowers/plans/2026-05-15-build-time-idmap.md` — add a
top-line status note: "SUPERSEDED 2026-05-15 by the uid-collapse
refactor (this plan). The idmap pattern this plan introduced is
removed because source uid (`left4me`) now equals target uid
(`left4me`) — translation is a no-op." Same one-line treatment
for `docs/superpowers/plans/2026-05-14-overlay-idmap.md`.
5. **Update the user-uid-split spec's existing superseded header**:
`docs/superpowers/specs/2026-05-15-user-uid-split-design.md`
currently says "2 users (current state) is correct"; revise to
say "1 user (after uid-collapse refactor) is correct" and update
the reasoning paragraph.
6. **Light-touch updates to other docs** that reference
`l4d2-sandbox` for accuracy. Pragmatic scope — add a top-line
note instead of rewriting body content:
- `deploy/README.md` — drop the `l4d2-sandbox` bullet (line 84),
fix the paragraph at line 141 to reflect no-idmap state.
- `docs/superpowers/specs/2026-05-15-hardening-refactor-design.md`
and `2026-05-15-hardening-threat-model.md` — add a one-line
"Updated 2026-05-15: l4d2-sandbox collapsed into left4me; see
plans/2026-05-15-uid-collapse.md" note in the relevant context
section.
- `docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md`
— same one-line note (the spec's hardening profile sketch
references the old `User=l4d2-sandbox`; the new build-overlay-unit
refactor when it lands will inherit `User=left4me` from this
change).
- **Leave the 2026-05-08-* design specs alone.** They describe
historical design at the time; rewriting them obscures the
evolution. Anyone reading them sees the date and the
superseded-note chain leads forward.
7. **Remove `l4d2-sandbox` from the ckn-bw bundle**
(`~/Projekte/ckn-bw/bundles/left4me/items.py`):
- Delete the `l4d2-sandbox` entry from the `users` dict
(lines 54-58 per Phase 1).
- Delete the `l4d2-sandbox` entry from the `groups` dict
(line 44).
- Update the `/var/lib/left4me` mode comment + decide whether to
change `0711` → `0755`. The `0711` was specifically to let
`l4d2-sandbox` traverse (not list) the dir; with sandbox gone,
`0755` is the natural choice. Pick `0755`.
8. **On-host pre-flight**: before `bw apply`, chown any remaining
uid-981 files to `left4me`:
```bash
# List up to 50 files still owned by the old sandbox uid:
ssh left4.me 'sudo find /var/lib/left4me /opt/left4me -uid 981 -print | head -50'
# If any results, chown them:
ssh left4.me 'sudo find /var/lib/left4me /opt/left4me -uid 981 -exec chown left4me:left4me {} +'
```
Per the build-time-idmap plan that landed earlier, new sandbox
writes already land as `left4me`, so the result should be small
or empty. The chown catches any stragglers.
9. **Cross-repo push + bw apply**:
- Commit left4me changes (helper, tests, doc updates) on master.
- Commit ckn-bw changes (users/groups deletion, mode change) on
master.
- Push both.
- `bw apply ovh.left4me`.
10. **Verify**:
- `getent passwd l4d2-sandbox` on the host → no result (user
removed).
- `sudo find /var/lib/left4me /opt/left4me -uid 981 -print`
→ empty.
- Trigger a sandbox build via the web UI; observe in
`journalctl -u 'left4me-script-*'` that the transient unit
runs as `left4me`, completes successfully, and the resulting
overlay files in `/var/lib/left4me/overlays/<id>/` are
`left4me:left4me`.
- `pytest scripts/tests/test_script_sandbox.py` locally passes
with updated assertions.
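
Before committing step 1, the edited helper can be checked against the strings the plan says must disappear or remain. The function below is a minimal sketch of such a check; the name `check_helper_collapsed` is hypothetical and not part of the repo, and the patterns are taken only from the step 1 description, so the real helper may need a different list.

```bash
# Hypothetical post-edit check for step 1 (not an existing repo script).
# Fails if idmap remnants are still present or the expected settings are missing.
check_helper_collapsed() {
    local helper="$1" pattern status=0

    # Strings that should be gone once the idmap block is deleted.
    for pattern in 'l4d2-sandbox' 'SANDBOX_UID' 'STAGING' '--map-users'; do
        if grep -qF -e "$pattern" "$helper"; then
            echo "leftover: $pattern" >&2
            status=1
        fi
    done

    # Strings that should be present after the edit.
    for pattern in 'User=left4me' 'Group=left4me' \
                   'BindPaths="${OVERLAY_DIR}:/overlay"' \
                   'nsenter --mount=/proc/1/ns/mnt'; do
        if ! grep -qF -e "$pattern" "$helper"; then
            echo "missing: $pattern" >&2
            status=1
        fi
    done

    return "$status"
}
```

A possible invocation from the left4me checkout is `check_helper_collapsed scripts/libexec/left4me-script-sandbox`. It complements the pytest assertions from step 2 rather than replacing them.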
## Files to modify
**Left4me repo (`~/Projekte/left4me`):**
- `scripts/libexec/left4me-script-sandbox`: helper changes (step 1)
- `scripts/tests/test_script_sandbox.py`: test updates (step 2)
- `l4d2web/services/overlay_builders.py`: comment update (step 3)
- `l4d2host/instances.py`: comment update (step 3)
- `docs/superpowers/plans/2026-05-15-build-time-idmap.md`:
SUPERSEDED header (step 4)
- `docs/superpowers/plans/2026-05-14-overlay-idmap.md`:
SUPERSEDED header (step 4)
- `docs/superpowers/specs/2026-05-15-user-uid-split-design.md`:
update existing superseded header (step 5)
- `docs/superpowers/specs/2026-05-15-hardening-refactor-design.md`:
one-line note (step 6)
- `docs/superpowers/specs/2026-05-15-hardening-threat-model.md`:
one-line note (step 6)
- `docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md`:
one-line note (step 6)
- `deploy/README.md`: drop sandbox bullet, update idmap paragraph
(step 6)
**Ckn-bw repo (`~/Projekte/ckn-bw`):**
- `bundles/left4me/items.py`: drop `l4d2-sandbox` user + group;
relax `/var/lib/left4me` mode from `0711` to `0755` (step 7)
**Host actions (no commits):**
- pre-flight chown of orphan-981 files (step 8)
- `bw apply ovh.left4me` (step 9)
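
Step 3 notes that the reported line numbers disagree and should be located via grep. A minimal sketch of that search follows. The function name `list_sandbox_refs` is hypothetical; the `2026-05-08-*` exclusion mirrors the decision in step 6 to leave those historical specs alone.

```bash
# Hypothetical helper: list remaining mentions of the sandbox user in a tree.
# Skips .git and the 2026-05-08-* specs that the plan deliberately leaves untouched.
# Exit status follows grep: 0 if anything matched, 1 if nothing did.
list_sandbox_refs() {
    local root="${1:-.}"
    grep -rnF --exclude-dir=.git --exclude='2026-05-08-*' \
        -e 'l4d2-sandbox' "$root"
}
```

Run as `list_sandbox_refs ~/Projekte/left4me` before and after the edits. The output is a review list, not a must-be-empty check, because the SUPERSEDED and "Updated 2026-05-15" notes intentionally keep mentioning `l4d2-sandbox`.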
## Verification
End-to-end on `left4.me`:
```bash
# User removed
ssh left4.me 'getent passwd l4d2-sandbox; getent group l4d2-sandbox'
# Expect: empty (both)
# No orphan-uid files
ssh left4.me 'sudo find /var/lib/left4me /opt/left4me -uid 981 -print 2>/dev/null'
# Expect: empty
# Sandbox build runs as left4me end-to-end
# (Trigger via web UI; then check)
ssh left4.me 'sudo journalctl --since "5 minutes ago" -u "left4me-script-*" | head -30'
# Expect: clean run, no permission errors
ssh left4.me 'sudo ls -ln /var/lib/left4me/overlays/<id>/ | head -5'
# Expect: uid 980 (left4me), not 981
# Local tests
cd ~/Projekte/left4me && pytest scripts/tests/test_script_sandbox.py -q
# Expect: all green (one fewer test — the idmap test was deleted)
```
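
The `ls -ln ... | head -5` check above samples only the first few entries. If a full check is wanted, something like the following sketch could run on the host instead. The function name `all_owned_by` is hypothetical, and uid 980 for `left4me` is taken from this plan.

```bash
# Hypothetical check: succeed only if every entry under DIR is owned by UID.
# Prints the first offending path to stderr and returns 1 otherwise.
all_owned_by() {
    local dir="$1" uid="$2" stray
    stray="$(find "$dir" ! -uid "$uid" -print | head -n 1)"
    if [ -n "$stray" ]; then
        echo "not owned by uid $uid: $stray" >&2
        return 1
    fi
}
```

For example, `all_owned_by /var/lib/left4me/overlays/<id> 980` (run with sudo on `left4.me`) should return 0 after a successful build.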
## Rollback
If the deploy goes wrong:
- `git revert` the left4me commits + the ckn-bw commit, push,
`bw apply` again.
- ckn-bw will recreate the `l4d2-sandbox` user on the host.
- The old helper script comes back via `git_deploy`.
- Any files chown'd from 981→980 in the pre-flight stay at 980.
That's fine because the restored idmap helper also lands its writes
on disk as 980.
## Risks
- **Sandbox build running during `bw apply`**: ckn-bw's user-removal
step might fail if a `l4d2-sandbox`-uid process is alive.
Mitigation: don't apply during a build. Quick check before apply:
`ssh left4.me 'sudo systemctl list-units --type=service "left4me-script-*"'`
→ expect "0 loaded units".
- **Orphan files not caught by the pre-flight find**: if any uid-981
file exists outside `/var/lib/left4me` or `/opt/left4me`, the user
removal succeeds but the file becomes orphan-uid. Practically these
paths are exhaustive; if paranoid, expand the find to `/`.
- **The `nsenter` self-wrap depends on the web unit's
`PrivateTmp=true`**: that setting is the reason the wrap exists. If
the web unit's PrivateTmp ever goes away, the wrap becomes unnecessary.
Not affected by this refactor; flag for future cleanup.
## Out of scope
- Renaming `left4me` to something else (e.g., `l4d2-app`). Cosmetic
only; not worth the migration cost.
- The broader configmgmt responsibility reshape (drop-ins owned by
left4me, ckn-bw as thin file-shipper). Deferred per the
hardening-refactor design.
- `build-overlay-unit` template refactor
(`docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md`)
— still queued; will inherit `User=left4me` cleanly from this work.
- Rewriting historical 2026-05-08-* design specs.