spec(build-overlay-unit): flag DB-fetch-in-ExecStartPre as an option

The script content lives in the overlays.script DB column and the
unit's %i is the row id, so the worker-writes-script-to-fs step in
the original sketch is duplication. Document three options (worker
writes / unit fetches via helper / pipe to stdin) and recommend the
unit-fetches variant with RuntimeDirectory= auto-cleanup. Promote
this to the top of the open-decisions list since it shapes the
worker, the unit, and whether a fetch-script helper is added.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
mwiegand 2026-05-15 01:54:41 +02:00
parent a9bbc209ae
commit 28b0ff951b
No known key found for this signature in database

View file

@ -173,20 +173,95 @@ Notes:
available at `/script.sh` inside the sandbox, picked from the available at `/script.sh` inside the sandbox, picked from the
predictable path `/var/lib/left4me/sandbox-scripts/%i.sh`. predictable path `/var/lib/left4me/sandbox-scripts/%i.sh`.
### Script source: filesystem vs. DB
**Critical design decision the future session must make.** The current
plan in the unit sketch above assumes the worker writes the script
content to `/var/lib/left4me/sandbox-scripts/<id>.sh` before calling
`systemctl start`. But the script *already lives in the DB* (the
`overlays.script` column), and the unit instance name `%i` is the
overlay row id. The filesystem copy is redundant unless we want it.
Three options:
**Option A — worker writes the script (the unit sketch above).**
Worker queries DB, writes `<id>.sh` to a known path, then
`systemctl start`. Unit reads via `BindReadOnlyPaths`. Simple, no DB
access from the unit, the existing `_sandbox_script_dir()` plumbing
mostly works. Cost: redundant on-disk copy; stale files between
builds if you don't clean them.
**Option B — unit fetches the script from the DB itself.** A small
root-side helper installed as
`/usr/local/libexec/left4me/left4me-fetch-script` does:
```python
#!/usr/bin/python3
import sqlite3, sys
overlay_id = int(sys.argv[1])
conn = sqlite3.connect("/var/lib/left4me/left4me.db")
row = conn.execute(
"SELECT script FROM overlays WHERE id = ?", (overlay_id,)
).fetchone()
sys.stdout.write((row[0] if row else "") or "")
```
Unit's ExecStartPre runs it as root (the `+` prefix), pipes the
output to a runtime path that ExecStart reads:
```ini
RuntimeDirectory=left4me/sandbox-scripts
RuntimeDirectoryMode=0700
ExecStartPre=+/bin/sh -c '/usr/local/libexec/left4me/left4me-fetch-script %i \
> /run/left4me/sandbox-scripts/%i.sh && chmod 0644 /run/left4me/sandbox-scripts/%i.sh'
BindReadOnlyPaths=/run/left4me/sandbox-scripts/%i.sh:/script.sh
```
(`RuntimeDirectory=` auto-creates `/run/left4me/sandbox-scripts/` on
start and removes it on stop, including the file inside.)
The fetch script doesn't need sudoers — it runs from ExecStartPre with
root privileges already. It only reads the DB; no writes. The DB is
`root:left4me 0640` so root can read it.
Worker becomes a one-liner: `sudo systemctl start build-overlay@<id>`.
No FS prep, no tmpfile cleanup.
**Option C — pipe the script content directly into bash stdin.** The
unit's ExecStart is something like
`/bin/sh -c "fetch-script %i | /bin/bash"`. Pros: no on-disk file at
all. Cons: `/bin/bash` runs without a file path, so `$0` is `bash` and
error messages look weird; harder to debug a failing script when there's
no file to inspect.
**Recommendation**: Option B. Decouples script storage (DB) from
sandbox transport (a /run/ runtime file). RuntimeDirectory= handles
cleanup. Worker becomes trivially small. The fetch-script helper is
~10 lines and stays in deploy/files/usr/local/libexec/left4me/.
If Option A is chosen instead, plan to track the script tmpfiles
explicitly so they don't accumulate. With Option B, RuntimeDirectory
auto-cleans on stop.
### Worker invocation ### Worker invocation
Replace `run_sandboxed_script` in Replace `run_sandboxed_script` in
`l4d2web/services/overlay_builders.py`: `l4d2web/services/overlay_builders.py`. The code below is the **Option
A** shape (worker writes the script). For **Option B** (recommended),
drop the `script_dir`/`script_path`/`write_text`/`chmod` lines — the
unit's ExecStartPre fetches from the DB. The signature can also drop
`script_text` since the worker doesn't need to pass content anymore.
```python ```python
def run_sandboxed_script( def run_sandboxed_script(
overlay_id: int, overlay_id: int,
script_text: str, script_text: str, # remove this param if Option B
*, *,
on_stdout: LogSink, on_stdout: LogSink,
on_stderr: LogSink, on_stderr: LogSink,
should_cancel: CancelCheck, should_cancel: CancelCheck,
) -> None: ) -> None:
# The four lines below are Option A only — delete for Option B.
script_dir = _sandbox_script_dir() script_dir = _sandbox_script_dir()
script_dir.mkdir(parents=True, exist_ok=True) script_dir.mkdir(parents=True, exist_ok=True)
script_path = script_dir / f"{overlay_id}.sh" script_path = script_dir / f"{overlay_id}.sh"
@ -339,6 +414,12 @@ In order:
## Open decisions for the future session ## Open decisions for the future session
0. **Script source: filesystem (Option A) vs. DB-fetched in ExecStartPre
(Option B) vs. piped to stdin (Option C).** See the "Script source"
section above. This is the highest-impact decision because it
shapes the worker, the unit's ExecStartPre, and whether you need
a fetch-script helper binary at all. Recommendation: Option B.
1. **`/run/left4me/idmap/%i` vs. `/var/lib/left4me/tmp/sandbox-idmap-%i`** — 1. **`/run/left4me/idmap/%i` vs. `/var/lib/left4me/tmp/sandbox-idmap-%i`** —
`/run` is tmpfs and wiped on reboot, more correct for transient `/run` is tmpfs and wiped on reboot, more correct for transient
mount paths. But it requires the dir to exist (created by mount paths. But it requires the dir to exist (created by