The script content lives in the overlays.script DB column and the unit's %i is the row id, so the worker-writes-script-to-fs step in the original sketch is duplication. Document three options (worker writes / unit fetches via helper / pipe to stdin) and recommend the unit-fetches variant with RuntimeDirectory= auto-cleanup. Promote this to the top of the open-decisions list since it shapes the worker, the unit, and whether a fetch-script helper is added. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
572 lines
23 KiB
Markdown
572 lines
23 KiB
Markdown
# Build-overlay template unit — refactor the script-sandbox helper
|
|
|
|
**Status: open question, not settled design.** This is a handoff
|
|
document prompted by the build-time idmap landing on 2026-05-15. The
|
|
current `left4me-script-sandbox` shell helper works but has accumulated
|
|
several layers of complexity (idmap bind setup, trap cleanup, nsenter
|
|
self-wrap) that a systemd template unit would handle declaratively.
|
|
The same pattern is already established in the codebase for
|
|
gameservers (`left4me-server@.service`). A future session should
|
|
evaluate whether to refactor and, if so, follow the steps below.
|
|
|
|
## Why this came up
|
|
|
|
While verifying the build-time idmap refactor, the first 5 build jobs
|
|
failed with `mkdir: Permission denied` on `/overlay/...`. Root cause:
|
|
|
|
- `left4me-web.service` runs with `PrivateTmp=true`, which puts the
|
|
web app (and anything it sudoes into) in a private mount namespace.
|
|
- The script-sandbox helper, invoked via `sudo` from the web app,
|
|
inherits that namespace.
|
|
- The helper's `mount --bind --map-users=...` pre-creates the idmap
|
|
staging path *in the web app's namespace*.
|
|
- `systemd-run` (called by the helper) spawns a transient unit in
|
|
PID 1's mount namespace.
|
|
- The transient unit's `BindPaths=...:/overlay` resolves the staging
|
|
path in PID 1's namespace — where the bind doesn't exist. It sees
|
|
an empty root-owned dir at the staging path (mkdir'd by the helper
|
|
before the bind) and binds *that* to `/overlay`.
|
|
- Sandbox uid hits EACCES on every write.
|
|
|
|
We fixed it (commit `f1aa05d`) by self-wrapping the helper into
|
|
PID 1's mount namespace at the top of the script:
|
|
|
|
```bash
|
|
if [[ "${L4D2_SANDBOX_IN_PID1_MNT_NS:-}" != "1" ]]; then
|
|
exec env L4D2_SANDBOX_IN_PID1_MNT_NS=1 \
|
|
/usr/bin/nsenter --mount=/proc/1/ns/mnt -- "$0" "$@"
|
|
fi
|
|
```
|
|
|
|
That works. But it's a band-aid for an architectural friction:
|
|
**helper invocation via `sudo` from a hardened service forces us to
|
|
manually escape the caller's namespace before any mount syscall**.
|
|
If the helper were *itself* a systemd unit started by PID 1, the
|
|
namespace would be correct by default.
|
|
|
|
The gameserver helper handles this at the unit level. Its
|
|
ExecStartPre is:
|
|
|
|
```
|
|
ExecStartPre=+/usr/bin/nsenter --mount=/proc/1/ns/mnt -- /usr/local/libexec/left4me/left4me-overlay mount %i
|
|
```
|
|
|
|
i.e. wrapped in nsenter *at the unit*. The unit is started by PID 1,
|
|
so it has PID 1's namespace, then nsenter is a belt-and-braces.
|
|
Mirror that pattern for builds: introduce `build-overlay@.service` as
|
|
a template unit, have the worker activate it instead of forking a
|
|
helper.
|
|
|
|
## Current state (the thing being replaced)
|
|
|
|
Files:
|
|
|
|
- `deploy/files/usr/local/libexec/left4me/left4me-script-sandbox` —
|
|
the bash helper. ~100 lines. Self-wraps in nsenter, does pre-bind
|
|
with `--map-users`, invokes `systemd-run --quiet --collect --wait
|
|
--pipe -p ... -- /bin/bash /script.sh`, cleans up via trap.
|
|
- `l4d2web/services/overlay_builders.py:run_sandboxed_script` — the
|
|
worker entry point. Writes script content to
|
|
`/var/lib/left4me/sandbox-scripts/<uniqued>.sh`, invokes
|
|
`sudo -n /usr/local/libexec/left4me/left4me-script-sandbox <id>
|
|
<path>`, streams stdout/stderr via `subprocess.Popen` + the existing
|
|
`run_command` plumbing.
|
|
- `deploy/files/etc/sudoers.d/left4me` — grants `left4me` NOPASSWD to
|
|
the helper path.
|
|
|
|
What the helper actually does:
|
|
1. nsenter into PID 1's mount ns (the band-aid)
|
|
2. validate args + overlay dir exists
|
|
3. compute `STAGING=/var/lib/left4me/tmp/sandbox-idmap-${OVERLAY_ID}`
|
|
4. `trap` cleanup; pre-emptive `umount` of stale staging; `mkdir -p`
|
|
the staging
|
|
5. `mount --bind --map-users=$(id -u left4me):$(id -u l4d2-sandbox):1
|
|
--map-groups=... $OVERLAY_DIR $STAGING`
|
|
6. `systemd-run` with the full hardening profile, `BindPaths=$STAGING:/overlay`
|
|
7. Wait for completion, propagate exit code
|
|
8. trap fires: `umount $STAGING; rmdir $STAGING`
|
|
|
|
## Proposed design
|
|
|
|
Replace the bash helper with two systemd units (template + a slice)
|
|
emitted from ckn-bw's existing `systemd_units` reactor, plus a small
|
|
worker rewrite.
|
|
|
|
### `build-overlay@.service` (template unit)
|
|
|
|
```ini
|
|
[Unit]
|
|
Description=Sandboxed overlay build for instance %i
|
|
DefaultDependencies=no
|
|
After=local-fs.target
|
|
RequiresMountsFor=/var/lib/left4me/overlays/%i
|
|
ConditionPathIsDirectory=/var/lib/left4me/overlays/%i
|
|
ConditionPathExists=/var/lib/left4me/sandbox-scripts/%i.sh
|
|
|
|
[Service]
|
|
Type=oneshot
|
|
User=l4d2-sandbox
|
|
Group=l4d2-sandbox
|
|
Slice=l4d2-build.slice
|
|
|
|
# Idmap bind: disk uid 980 (left4me) ↔ mount uid 981 (sandbox), so writes
|
|
# from the sandbox land on disk as left4me. + prefix runs as root before
|
|
# the User= drop (mount syscall requires CAP_SYS_ADMIN).
|
|
ExecStartPre=+/usr/bin/mkdir -p /run/left4me/idmap/%i
|
|
ExecStartPre=+/usr/bin/mount --bind \
|
|
--map-users=980:981:1 --map-groups=980:981:1 \
|
|
/var/lib/left4me/overlays/%i /run/left4me/idmap/%i
|
|
|
|
ExecStart=/bin/bash /script.sh
|
|
|
|
ExecStopPost=+-/usr/bin/umount /run/left4me/idmap/%i
|
|
ExecStopPost=+-/usr/bin/rmdir /run/left4me/idmap/%i
|
|
|
|
# Hardening — all the -p flags from the current bash helper, declared
|
|
# declaratively here instead of as systemd-run -p arguments.
|
|
NoNewPrivileges=yes
|
|
ProtectSystem=strict
|
|
ProtectHome=yes
|
|
PrivateTmp=yes
|
|
PrivateDevices=yes
|
|
PrivateIPC=yes
|
|
ProtectKernelTunables=yes
|
|
ProtectKernelModules=yes
|
|
ProtectKernelLogs=yes
|
|
ProtectControlGroups=yes
|
|
RestrictNamespaces=yes
|
|
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
|
|
RestrictSUIDSGID=yes
|
|
LockPersonality=yes
|
|
MemoryDenyWriteExecute=yes
|
|
SystemCallFilter=@system-service @network-io
|
|
SystemCallArchitectures=native
|
|
CapabilityBoundingSet=
|
|
AmbientCapabilities=
|
|
IPAddressDeny=127.0.0.0/8 ::1/128 169.254.0.0/16 fe80::/10 224.0.0.0/4 ff00::/8 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 100.64.0.0/10 fc00::/7
|
|
TemporaryFileSystem=/etc /var/lib
|
|
BindReadOnlyPaths=/etc/left4me/sandbox-resolv.conf:/etc/resolv.conf /etc/ssl /etc/ca-certificates /etc/nsswitch.conf /etc/alternatives /var/lib/left4me/sandbox-scripts/%i.sh:/script.sh
|
|
BindPaths=/run/left4me/idmap/%i:/overlay
|
|
WorkingDirectory=/overlay
|
|
Environment=HOME=/tmp PATH=/usr/bin:/usr/sbin OVERLAY=/overlay
|
|
UMask=0022
|
|
OOMScoreAdjust=500
|
|
MemoryMax=4G
|
|
MemorySwapMax=0
|
|
TasksMax=512
|
|
CPUQuota=200%
|
|
RuntimeMaxSec=3600
|
|
TimeoutStartSec=1h
|
|
TimeoutStopSec=30s
|
|
```
|
|
|
|
Notes:
|
|
- `Type=oneshot` makes `systemctl start` block until ExecStart exits.
|
|
- `ConditionPath*` provides early failure if the overlay dir or script
|
|
doesn't exist (avoids running the unit at all in those cases).
|
|
- `RequiresMountsFor=/var/lib/left4me/overlays/%i` ensures the parent
|
|
fs is mounted before this unit runs (`/` and `/var/lib` if it's a
|
|
separate mount point).
|
|
- `ExecStopPost` uses `+-` (root, ignore failures) — the bind might
|
|
already be torn down if the unit is restarting.
|
|
- `BindReadOnlyPaths=...:/script.sh` makes the per-overlay script
|
|
available at `/script.sh` inside the sandbox, picked from the
|
|
predictable path `/var/lib/left4me/sandbox-scripts/%i.sh`.
|
|
|
|
### Script source: filesystem vs. DB
|
|
|
|
**Critical design decision the future session must make.** The current
|
|
plan in the unit sketch above assumes the worker writes the script
|
|
content to `/var/lib/left4me/sandbox-scripts/<id>.sh` before calling
|
|
`systemctl start`. But the script *already lives in the DB* (the
|
|
`overlays.script` column), and the unit instance name `%i` is the
|
|
overlay row id. The filesystem copy is redundant unless we want it.
|
|
|
|
Three options:
|
|
|
|
**Option A — worker writes the script (the unit sketch above).**
|
|
Worker queries DB, writes `<id>.sh` to a known path, then
|
|
`systemctl start`. Unit reads via `BindReadOnlyPaths`. Simple, no DB
|
|
access from the unit, the existing `_sandbox_script_dir()` plumbing
|
|
mostly works. Cost: redundant on-disk copy; stale files between
|
|
builds if you don't clean them.
|
|
|
|
**Option B — unit fetches the script from the DB itself.** A small
|
|
root-side helper installed as
|
|
`/usr/local/libexec/left4me/left4me-fetch-script` does:
|
|
|
|
```python
|
|
#!/usr/bin/python3
|
|
import sqlite3, sys
|
|
overlay_id = int(sys.argv[1])
|
|
conn = sqlite3.connect("/var/lib/left4me/left4me.db")
|
|
row = conn.execute(
|
|
"SELECT script FROM overlays WHERE id = ?", (overlay_id,)
|
|
).fetchone()
|
|
sys.stdout.write((row[0] if row else "") or "")
|
|
```
|
|
|
|
Unit's ExecStartPre runs it as root (the `+` prefix), pipes the
|
|
output to a runtime path that ExecStart reads:
|
|
|
|
```ini
|
|
RuntimeDirectory=left4me/sandbox-scripts
|
|
RuntimeDirectoryMode=0700
|
|
ExecStartPre=+/bin/sh -c '/usr/local/libexec/left4me/left4me-fetch-script %i \
|
|
> /run/left4me/sandbox-scripts/%i.sh && chmod 0644 /run/left4me/sandbox-scripts/%i.sh'
|
|
BindReadOnlyPaths=/run/left4me/sandbox-scripts/%i.sh:/script.sh
|
|
```
|
|
|
|
(`RuntimeDirectory=` auto-creates `/run/left4me/sandbox-scripts/` on
|
|
start and removes it on stop, including the file inside.)
|
|
|
|
The fetch script doesn't need sudoers — it runs from ExecStartPre with
|
|
root privileges already. It only reads the DB; no writes. The DB is
|
|
`root:left4me 0640` so root can read it.
|
|
|
|
Worker becomes a one-liner: `sudo systemctl start build-overlay@<id>`.
|
|
No FS prep, no tmpfile cleanup.
|
|
|
|
**Option C — pipe the script content directly into bash stdin.** The
|
|
unit's ExecStart is something like
|
|
`/bin/sh -c "fetch-script %i | /bin/bash"`. Pros: no on-disk file at
|
|
all. Cons: `/bin/bash` runs without a file path, so `$0` is `bash` and
|
|
error messages look weird; harder to debug a failing script when there's
|
|
no file to inspect.
|
|
|
|
**Recommendation**: Option B. Decouples script storage (DB) from
|
|
sandbox transport (a /run/ runtime file). RuntimeDirectory= handles
|
|
cleanup. Worker becomes trivially small. The fetch-script helper is
|
|
~10 lines and stays in deploy/files/usr/local/libexec/left4me/.
|
|
|
|
If Option A is chosen instead, plan to track the script tmpfiles
|
|
explicitly so they don't accumulate. With Option B, RuntimeDirectory
|
|
auto-cleans on stop.
|
|
|
|
### Worker invocation
|
|
|
|
Replace `run_sandboxed_script` in
|
|
`l4d2web/services/overlay_builders.py`. The code below is the **Option
|
|
A** shape (worker writes the script). For **Option B** (recommended),
|
|
drop the `script_dir`/`script_path`/`write_text`/`chmod` lines — the
|
|
unit's ExecStartPre fetches from the DB. The signature can also drop
|
|
`script_text` since the worker doesn't need to pass content anymore.
|
|
|
|
```python
|
|
def run_sandboxed_script(
|
|
overlay_id: int,
|
|
script_text: str, # remove this param if Option B
|
|
*,
|
|
on_stdout: LogSink,
|
|
on_stderr: LogSink,
|
|
should_cancel: CancelCheck,
|
|
) -> None:
|
|
# The four lines below are Option A only — delete for Option B.
|
|
script_dir = _sandbox_script_dir()
|
|
script_dir.mkdir(parents=True, exist_ok=True)
|
|
script_path = script_dir / f"{overlay_id}.sh"
|
|
script_path.write_text(script_text or "")
|
|
os.chmod(script_path, 0o644)
|
|
|
|
unit = f"build-overlay@{overlay_id}.service"
|
|
|
|
# Tail the unit's journal as a sidecar so output streams into job-logs
|
|
# while the unit runs. --follow exits when the unit reaches "inactive".
|
|
journal = subprocess.Popen(
|
|
["journalctl", "--unit", unit, "--output=cat", "--follow",
|
|
"--since=now", "--no-pager"],
|
|
stdout=subprocess.PIPE,
|
|
stderr=subprocess.STDOUT,
|
|
text=True,
|
|
)
|
|
|
|
try:
|
|
# Start the unit (sudoers permits this exact verb pattern).
|
|
# Type=oneshot makes this block until ExecStart returns.
|
|
rc = subprocess.run(
|
|
["sudo", "-n", "/bin/systemctl", "start", unit],
|
|
check=False,
|
|
).returncode
|
|
finally:
|
|
# Drain remaining journal lines (journalctl --follow may not have
|
|
# printed everything yet by the time systemctl returns).
|
|
journal.terminate()
|
|
try:
|
|
for line in journal.stdout or []:
|
|
on_stdout(line.rstrip("\n"))
|
|
finally:
|
|
journal.wait(timeout=5)
|
|
|
|
# Read exit code from the unit. ExecMainStatus is the script's rc;
|
|
# Result is "success" / "failed" / "timeout" etc.
|
|
show = subprocess.check_output(
|
|
["systemctl", "show", unit,
|
|
"-p", "ExecMainStatus", "-p", "Result", "--value"],
|
|
text=True,
|
|
).split()
|
|
exec_main_status = int(show[0])
|
|
result = show[1]
|
|
|
|
if rc != 0 or result != "success":
|
|
raise BuildError(
|
|
f"build-overlay@{overlay_id} failed: "
|
|
f"systemctl rc={rc} unit result={result} script exit={exec_main_status}"
|
|
)
|
|
```
|
|
|
|
That's ~30 lines vs. ~50 today, and the helper script disappears
|
|
entirely.
|
|
|
|
**Two refinements to consider:**
|
|
|
|
1. **Cancel semantics**: today the worker's `should_cancel` callback
|
|
triggers a SIGTERM via the existing `run_command` plumbing. With
|
|
systemctl-start, you'd issue `systemctl stop build-overlay@<id>`
|
|
in a parallel thread when `should_cancel()` returns True. Wire
|
|
that up.
|
|
2. **Journal streaming race**: `journalctl --follow --since=now`
|
|
started *after* `systemctl start` may miss the first few lines.
|
|
Two fixes:
|
|
- Start the journal tail before systemctl-start (the unit doesn't
|
|
exist yet, so journalctl waits silently — verify this behaviour
|
|
on Trixie).
|
|
- Or use `journalctl --cursor` machinery: snapshot the cursor
|
|
before start, then read with `--cursor=` after.
|
|
|
|
Start-before is simpler and likely sufficient for L4D2 build
|
|
verbosity, where the first second of output isn't critical.
|
|
|
|
### Sudoers
|
|
|
|
Replace:
|
|
```
|
|
left4me ALL=(root) NOPASSWD: /usr/local/libexec/left4me/left4me-script-sandbox
|
|
```
|
|
|
|
with:
|
|
```
|
|
left4me ALL=(root) NOPASSWD: /bin/systemctl start build-overlay@*.service
|
|
left4me ALL=(root) NOPASSWD: /bin/systemctl stop build-overlay@*.service
|
|
```
|
|
|
|
(Tighter — verb-prefixed and instance-globbed. No script path passed.)
|
|
|
|
### Slice
|
|
|
|
`l4d2-build.slice` already exists (per the gameserver/sandbox today's
|
|
configuration). Reuse it — no change needed.
|
|
|
|
### Sandbox script tmpfile cleanup
|
|
|
|
Currently `run_sandboxed_script` writes a per-invocation
|
|
`tempfile.NamedTemporaryFile` with a random suffix and unlinks it in a
|
|
`finally`. With template-unit lookup, the script path is **predictable
|
|
per overlay id** (`/var/lib/left4me/sandbox-scripts/<id>.sh`).
|
|
Implications:
|
|
|
|
- Two concurrent builds for the *same* overlay id would clobber the
|
|
script file. The job queue already serializes per-overlay (per
|
|
`l4d2web/services/job_worker.py:OVERLAY_OPERATIONS`), so this
|
|
is OK.
|
|
- Scripts persist between builds (no auto-cleanup). Either accept
|
|
that (the next build overwrites) or delete after the unit goes
|
|
inactive. Recommend: leave them — small, useful for debugging.
|
|
|
|
## Migration
|
|
|
|
In order:
|
|
|
|
1. **Add the unit emission to ckn-bw's `bundles/left4me/metadata.py`
|
|
systemd_units reactor.** Mirror the pattern used for
|
|
`left4me-server@.service`. Drop in the template-unit content as
|
|
another reactor entry.
|
|
2. **Update sudoers** (`bundles/left4me/files/etc/sudoers.d/left4me`)
|
|
to permit `systemctl start/stop build-overlay@*.service` and
|
|
remove the script-sandbox grant.
|
|
3. **Replace `run_sandboxed_script` in left4me.** Add the new
|
|
journalctl-based output streaming, exit-code reading, and cancel
|
|
handling. Keep the function signature stable so callers
|
|
(`ScriptBuilder.build`, the wipe route) are unchanged.
|
|
4. **Delete `deploy/files/usr/local/libexec/left4me/left4me-script-sandbox`.**
|
|
5. **Update tests:**
|
|
- `deploy/tests/test_deploy_artifacts.py`:
|
|
- Drop `test_script_sandbox_uses_idmap_staging` and any other
|
|
tests that read SCRIPT_SANDBOX_HELPER.
|
|
- Add tests that assert the new unit emission in ckn-bw's
|
|
reactor output. (But that's in the other repo — left4me's
|
|
deploy tests can't directly cover it.)
|
|
- Add a test that asserts the worker invokes
|
|
`sudo systemctl start build-overlay@*` (grep
|
|
`overlay_builders.py`).
|
|
- `l4d2web/tests/test_overlay_builders.py` (if it exists):
|
|
update mocks for `run_sandboxed_script` to expect the new
|
|
subprocess shape.
|
|
6. **Test on `left4.me`:**
|
|
- Push left4me, `bw apply ovh.left4me`. Apply also picks up the
|
|
new unit emission and the sudoers change.
|
|
- Trigger a script-overlay rebuild via the web UI or the
|
|
enqueue API path used in this session (see test history in
|
|
git log around 2026-05-15).
|
|
- Inspect: `journalctl -u build-overlay@9.service`,
|
|
`systemctl status build-overlay@9.service`.
|
|
- Verify on-disk state: overlay files end up `left4me`-owned;
|
|
idmap bind cleanly torn down (`findmnt | grep idmap` empty).
|
|
|
|
## Open decisions for the future session
|
|
|
|
0. **Script source: filesystem (Option A) vs. DB-fetched in ExecStartPre
|
|
(Option B) vs. piped to stdin (Option C).** See the "Script source"
|
|
section above. This is the highest-impact decision because it
|
|
shapes the worker, the unit's ExecStartPre, and whether you need
|
|
a fetch-script helper binary at all. Recommendation: Option B.
|
|
|
|
1. **`/run/left4me/idmap/%i` vs. `/var/lib/left4me/tmp/sandbox-idmap-%i`** —
|
|
`/run` is tmpfs and wiped on reboot, more correct for transient
|
|
mount paths. But it requires the dir to exist (created by
|
|
ExecStartPre). Either works.
|
|
2. **What to do with the existing `left4me-apply-cake` dead code** —
|
|
irrelevant to this refactor; flagged in the other handoff doc.
|
|
3. **Whether to drop the post-build `chmod o+r` in the sandbox helper** —
|
|
already gone in the build-time-idmap commit. (Verify in the new
|
|
unit nothing equivalent is needed; files are left4me-owned, web
|
|
reads via primary uid.)
|
|
4. **`Type=oneshot` vs. `Type=exec`** — oneshot blocks `systemctl
|
|
start`. exec doesn't. With oneshot we don't need the
|
|
`journalctl --follow` workaround if we read journal *after*
|
|
completion. But for live progress (which the existing builds
|
|
stream), `--follow` is still needed. Stick with oneshot.
|
|
5. **Should the unit set `KillMode=mixed`** to ensure children die on
|
|
stop? Worth checking — the existing systemd-run line doesn't set
|
|
it explicitly; defaults usually suffice.
|
|
6. **`StateDirectory=` vs. explicit `mkdir -p`** — systemd has
|
|
StateDirectory and RuntimeDirectory directives that auto-create
|
|
per-unit directories. Could replace the `mkdir -p /run/left4me/idmap/%i`
|
|
ExecStartPre with `RuntimeDirectory=left4me/idmap/%i`. Cleaner;
|
|
gets auto-cleanup on stop too. Recommend doing this — both the
|
|
mkdir and the rmdir ExecStopPost would go away.
|
|
|
|
## Verification
|
|
|
|
End-to-end smoke test on `left4.me` after the deploy:
|
|
|
|
```bash
|
|
# unit is installed and template-parseable
|
|
systemctl status build-overlay@.service # should show "loaded; static"
|
|
sudo systemd-analyze verify build-overlay@1.service
|
|
|
|
# enqueue a build via the web app's worker path (mimic the
|
|
# enqueue_build_overlay pattern from this session's job 64 onwards)
|
|
# then watch:
|
|
sudo journalctl -u build-overlay@9.service -f
|
|
|
|
# on completion:
|
|
systemctl show build-overlay@9.service -p Result -p ExecMainStatus
|
|
# expect: Result=success, ExecMainStatus=0
|
|
|
|
# disk state
|
|
sudo find /var/lib/left4me/overlays/9 -uid 981 # should be empty
|
|
sudo find /run/left4me/idmap # should not exist or be empty
|
|
|
|
# pid 1 mount table — no orphan idmap binds
|
|
sudo findmnt --task 1 -o TARGET | grep idmap # empty
|
|
```
|
|
|
|
## Risks
|
|
|
|
- **Worker cancel-during-build**: today's `should_cancel` callback
|
|
signals via `run_command`'s child process. With the unit, the
|
|
worker needs a separate path: spawn a thread that polls
|
|
`should_cancel()` and calls `sudo systemctl stop build-overlay@<id>`
|
|
when triggered. Without this, builds that exceed `RuntimeMaxSec` or
|
|
hit user-cancel won't terminate promptly.
|
|
- **Journal lag at unit start**: `journalctl --follow` started before
|
|
`systemctl start` should pick up all output. If not, may need
|
|
cursor-based streaming. Test with a script that prints immediately
|
|
(`echo hello; exit 0`) — if "hello" appears in the job log, race
|
|
is handled.
|
|
- **Sudoers globbing**: `systemctl start build-overlay@*.service`
|
|
permits any instance id including weird strings like `../etc-passwd`.
|
|
Use a tighter glob if possible (e.g.,
|
|
`build-overlay@[0-9]*.service`). Test that sudoers rejects
|
|
unexpected instance names.
|
|
- **Type=oneshot return semantics**: confirm that `systemctl start
|
|
build-overlay@<id>` on a Type=oneshot unit returns rc=3 (or
|
|
similar) when the unit's ExecStart fails, so the worker can detect
|
|
failure without re-querying `systemctl show`.
|
|
- **Idle running over reboot**: a build that's running across a reboot
|
|
is killed when the system goes down. That's identical to today's
|
|
behavior with systemd-run. Acceptable.
|
|
- **The journalctl sidecar process accumulates as a zombie if not
|
|
reaped properly.** The proposed code does `journal.wait(timeout=5)`
|
|
— handle the timeout case (force-kill).
|
|
|
|
## Pointers
|
|
|
|
Reference files (with line numbers if applicable):
|
|
|
|
- **Current helper to be removed**:
|
|
`deploy/files/usr/local/libexec/left4me/left4me-script-sandbox`
|
|
- **Current worker invoker**:
|
|
`l4d2web/services/overlay_builders.py:run_sandboxed_script` (~ln 324)
|
|
- **Current job-worker dispatch**:
|
|
`l4d2web/services/job_worker.py` (build_overlay operation)
|
|
- **Sudoers**:
|
|
`deploy/files/etc/sudoers.d/left4me` (matched verbatim in
|
|
`ckn-bw/bundles/left4me/files/etc/sudoers.d/left4me`)
|
|
- **Sample template unit pattern** (the model to copy):
|
|
`left4me-server@.service` emission in ckn-bw's
|
|
`bundles/left4me/metadata.py` systemd_units reactor.
|
|
- **Existing slice declaration** (already correct):
|
|
`l4d2-build.slice` in ckn-bw's reactor.
|
|
|
|
Recent commits that touched this surface:
|
|
- `4838108` — moved idmap to build time (the refactor that surfaced
|
|
the namespace bug)
|
|
- `f1aa05d` — added nsenter self-wrap (the band-aid this refactor
|
|
removes)
|
|
- `2f6a9cf`, `9053186`, `dd918ac` — earlier idmap-on-mount approach
|
|
that was reverted
|
|
|
|
Related design docs:
|
|
- `docs/superpowers/plans/2026-05-15-build-time-idmap.md` — the
|
|
plan whose architecture this refactor builds on
|
|
- `docs/superpowers/specs/2026-05-15-deploy-dir-rethink-design.md` —
|
|
unrelated open questions about deploy/ layout
|
|
|
|
## What's NOT in scope
|
|
|
|
- Rewriting the sandbox in Python / packaging differently.
|
|
- Changing the security hardening profile (the unit duplicates the
|
|
current set verbatim — adjust later if needed).
|
|
- Splitting the gameserver uid from the web app uid (noted in earlier
|
|
handoff doc).
|
|
- Re-evaluating whether `l4d2-sandbox` should exist as a separate
|
|
uid (kept; defense in depth).
|
|
- Touching the `left4me-overlay` gameserver helper (it already uses
|
|
the pattern; only the sandbox helper is being refactored to match).
|
|
|
|
## Estimate
|
|
|
|
Rough breakdown for the future session:
|
|
- Unit file design + ckn-bw reactor change: 1-2 hours
|
|
- Worker rewrite (run_sandboxed_script): 1-2 hours
|
|
- Tests: 1 hour
|
|
- Deploy + verify on test server: 30 min
|
|
- Bug-fix and iteration buffer: 1 hour
|
|
|
|
~5 hours of focused work, assuming no surprises with journalctl
|
|
streaming or sudoers semantics.
|
|
|
|
## Decision criteria for whether to do this
|
|
|
|
Do it if:
|
|
- You're about to make any other change to the sandbox hardening,
|
|
build lifecycle, or sandbox uid story.
|
|
- You're frustrated by debugging the existing helper.
|
|
- You want to remove the nsenter band-aid for hygiene.
|
|
|
|
Skip if:
|
|
- The sandbox is stable and you're not planning related changes.
|
|
- You'd rather invest the time in higher-value work elsewhere.
|
|
|
|
The current solution is fine; this refactor is upgrade-not-fix.
|