spec(build-overlay-unit): flag DB-fetch-in-ExecStartPre as an option

The script content lives in the overlays.script DB column and the
unit's %i is the row id, so the worker-writes-script-to-fs step in
the original sketch is duplication. Document three options (worker
writes / unit fetches via helper / pipe to stdin) and recommend the
unit-fetches variant with RuntimeDirectory= auto-cleanup. Promote
this to the top of the open-decisions list since it shapes the
worker, the unit, and whether a fetch-script helper is added.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
mwiegand 2026-05-15 01:54:41 +02:00
parent a9bbc209ae
commit 28b0ff951b
No known key found for this signature in database

View file

@ -173,20 +173,95 @@ Notes:
available at `/script.sh` inside the sandbox, picked from the
predictable path `/var/lib/left4me/sandbox-scripts/%i.sh`.
### Script source: filesystem vs. DB
**Critical design decision the future session must make.** The current
plan in the unit sketch above assumes the worker writes the script
content to `/var/lib/left4me/sandbox-scripts/<id>.sh` before calling
`systemctl start`. But the script *already lives in the DB* (the
`overlays.script` column), and the unit instance name `%i` is the
overlay row id. The filesystem copy is redundant unless we want it.
Three options:
**Option A — worker writes the script (the unit sketch above).**
Worker queries DB, writes `<id>.sh` to a known path, then
`systemctl start`. Unit reads via `BindReadOnlyPaths`. Simple, no DB
access from the unit, the existing `_sandbox_script_dir()` plumbing
mostly works. Cost: redundant on-disk copy; stale files between
builds if you don't clean them.
**Option B — unit fetches the script from the DB itself.** A small
root-side helper installed as
`/usr/local/libexec/left4me/left4me-fetch-script` does:
```python
#!/usr/bin/python3
import sqlite3, sys
overlay_id = int(sys.argv[1])
conn = sqlite3.connect("/var/lib/left4me/left4me.db")
row = conn.execute(
"SELECT script FROM overlays WHERE id = ?", (overlay_id,)
).fetchone()
sys.stdout.write((row[0] if row else "") or "")
```
Unit's ExecStartPre runs it as root (the `+` prefix), pipes the
output to a runtime path that ExecStart reads:
```ini
RuntimeDirectory=left4me/sandbox-scripts
RuntimeDirectoryMode=0700
ExecStartPre=+/bin/sh -c '/usr/local/libexec/left4me/left4me-fetch-script %i \
> /run/left4me/sandbox-scripts/%i.sh && chmod 0644 /run/left4me/sandbox-scripts/%i.sh'
BindReadOnlyPaths=/run/left4me/sandbox-scripts/%i.sh:/script.sh
```
(`RuntimeDirectory=` auto-creates `/run/left4me/sandbox-scripts/` on
start and removes it on stop, including the file inside.)
The fetch script doesn't need sudoers — it runs from ExecStartPre with
root privileges already. It only reads the DB; no writes. The DB is
`root:left4me 0640` so root can read it.
Worker becomes a one-liner: `sudo systemctl start build-overlay@<id>`.
No FS prep, no tmpfile cleanup.
**Option C — pipe the script content directly into bash stdin.** The
unit's ExecStart is something like
`/bin/sh -c "fetch-script %i | /bin/bash"`. Pros: no on-disk file at
all. Cons: `/bin/bash` runs without a file path, so `$0` is `bash` and
error messages look weird; harder to debug a failing script when there's
no file to inspect.
**Recommendation**: Option B. Decouples script storage (DB) from
sandbox transport (a /run/ runtime file). RuntimeDirectory= handles
cleanup. Worker becomes trivially small. The fetch-script helper is
~10 lines and stays in deploy/files/usr/local/libexec/left4me/.
If Option A is chosen instead, plan to track the script tmpfiles
explicitly so they don't accumulate. With Option B, RuntimeDirectory
auto-cleans on stop.
### Worker invocation
Replace `run_sandboxed_script` in
`l4d2web/services/overlay_builders.py`:
`l4d2web/services/overlay_builders.py`. The code below is the **Option
A** shape (worker writes the script). For **Option B** (recommended),
drop the `script_dir`/`script_path`/`write_text`/`chmod` lines — the
unit's ExecStartPre fetches from the DB. The signature can also drop
`script_text` since the worker doesn't need to pass content anymore.
```python
def run_sandboxed_script(
overlay_id: int,
script_text: str,
script_text: str, # remove this param if Option B
*,
on_stdout: LogSink,
on_stderr: LogSink,
should_cancel: CancelCheck,
) -> None:
# The four lines below are Option A only — delete for Option B.
script_dir = _sandbox_script_dir()
script_dir.mkdir(parents=True, exist_ok=True)
script_path = script_dir / f"{overlay_id}.sh"
@ -339,6 +414,12 @@ In order:
## Open decisions for the future session
0. **Script source: filesystem (Option A) vs. DB-fetched in ExecStartPre
(Option B) vs. piped to stdin (Option C).** See the "Script source"
section above. This is the highest-impact decision because it
shapes the worker, the unit's ExecStartPre, and whether you need
a fetch-script helper binary at all. Recommendation: Option B.
1. **`/run/left4me/idmap/%i` vs. `/var/lib/left4me/tmp/sandbox-idmap-%i`** —
`/run` is tmpfs and wiped on reboot, more correct for transient
mount paths. But it requires the dir to exist (created by