spec(build-overlay-unit): flag DB-fetch-in-ExecStartPre as an option

The script content lives in the overlays.script DB column and the unit's %i is the row id, so the worker-writes-script-to-fs step in the original sketch is duplication. Document three options (worker writes / unit fetches via helper / pipe to stdin) and recommend the unit-fetches variant with RuntimeDirectory= auto-cleanup. Promote this to the top of the open-decisions list since it shapes the worker, the unit, and whether a fetch-script helper is added.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 01:54:41 +02:00 · 2026-05-15 01:54:41 +02:00 · 28b0ff951b
commit 28b0ff951b
parent a9bbc209ae
1 changed files with 83 additions and 2 deletions
--- a/docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md
+++ b/docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md
@@ -173,20 +173,95 @@ Notes:
  available at `/script.sh` inside the sandbox, picked from the
  predictable path `/var/lib/left4me/sandbox-scripts/%i.sh`.
 ### Script source: filesystem vs. DB
 **Critical design decision the future session must make.** The current
 plan in the unit sketch above assumes the worker writes the script
 content to `/var/lib/left4me/sandbox-scripts/<id>.sh` before calling
 `systemctl start`. But the script *already lives in the DB* (the
 `overlays.script` column), and the unit instance name `%i` is the
 overlay row id. The filesystem copy is redundant unless we want it.
 Three options:
 **Option A — worker writes the script (the unit sketch above).**
 Worker queries DB, writes `<id>.sh` to a known path, then
 `systemctl start`. Unit reads via `BindReadOnlyPaths`. Simple, no DB
 access from the unit, the existing `_sandbox_script_dir()` plumbing
 mostly works. Cost: redundant on-disk copy; stale files between
 builds if you don't clean them.
 **Option B — unit fetches the script from the DB itself.** A small
 root-side helper installed as
 `/usr/local/libexec/left4me/left4me-fetch-script` does:
 ```python
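 #!/usr/bin/python3
 # Print the script stored for one overlay row; the row id is argv[1] (%i).
 import sqlite3, sys
 overlay_id = int(sys.argv[1])  # a non-integer instance name raises ValueError
 conn = sqlite3.connect("/var/lib/left4me/left4me.db")
 row = conn.execute(
    "SELECT script FROM overlays WHERE id = ?", (overlay_id,)
 ).fetchone()
 # A missing row or a NULL script yields empty output, not an error.
 sys.stdout.write((row[0] if row else "") or "")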
 ```
 The unit's ExecStartPre runs it as root (the `+` prefix) and redirects
 the output to a runtime path that ExecStart reads:
 ```ini
 RuntimeDirectory=left4me/sandbox-scripts
 RuntimeDirectoryMode=0700
 ExecStartPre=+/bin/sh -c '/usr/local/libexec/left4me/left4me-fetch-script %i \
    > /run/left4me/sandbox-scripts/%i.sh && chmod 0644 /run/left4me/sandbox-scripts/%i.sh'
 BindReadOnlyPaths=/run/left4me/sandbox-scripts/%i.sh:/script.sh
 ```
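 (`RuntimeDirectory=` auto-creates `/run/left4me/sandbox-scripts/` on
 start and removes it on stop, including the file inside. Every
 `build-overlay@` instance names the same directory, so this assumes
 builds do not overlap; if they can, one instance stopping would remove
 the directory out from under another, and a per-instance directory
 name containing `%i` would be safer.)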
 The fetch script doesn't need sudoers — it runs from ExecStartPre with
 root privileges already. It only reads the DB; no writes. The DB is
 `root:left4me 0640` so root can read it.
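One caveat with the helper as sketched: a missing row produces an empty script, which the unit would then run without complaint. Below is a minimal sketch of a stricter variant, assuming a missing or empty script should fail the unit. The `fetch_script` and `main` names and the read-only `mode=ro` URI open are illustrative choices, not part of the spec.

```python
import sqlite3
import sys

DB_PATH = "/var/lib/left4me/left4me.db"  # path taken from the helper above


def fetch_script(conn: sqlite3.Connection, overlay_id: int) -> str:
    """Return the script for overlay_id; raise LookupError if absent or empty."""
    row = conn.execute(
        "SELECT script FROM overlays WHERE id = ?", (overlay_id,)
    ).fetchone()
    if row is None or not row[0]:
        raise LookupError(f"no script stored for overlay {overlay_id}")
    return row[0]


def main(argv: list[str]) -> int:
    # mode=ro opens the database read-only, so the helper cannot modify it.
    conn = sqlite3.connect(f"file:{DB_PATH}?mode=ro", uri=True)
    try:
        sys.stdout.write(fetch_script(conn, int(argv[1])))
    except LookupError as exc:
        print(exc, file=sys.stderr)
        return 1  # a non-zero exit fails ExecStartPre, and with it the unit
    finally:
        conn.close()
    return 0

# Installed as a script, the file would end with: sys.exit(main(sys.argv))
```

Because the unit line chains the redirect and `chmod` with `&&`, a non-zero exit from the helper stops ExecStartPre before the script is ever bind-mounted.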
 Worker becomes a one-liner: `sudo systemctl start build-overlay@<id>`.
 No FS prep, no tmpfile cleanup.
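Under Option B the worker-side code reduces to building and running one command. Here is a rough sketch, assuming the worker shells out via `subprocess`. `unit_name` and `start_build` are hypothetical names, and log streaming and cancellation are left out.

```python
import subprocess


def unit_name(overlay_id: int) -> str:
    """Instance name for the templated unit; %i in the unit expands to the id."""
    return f"build-overlay@{int(overlay_id)}"


def start_build(overlay_id: int, run=subprocess.run) -> None:
    # No script text is passed: the unit's ExecStartPre fetches it from the DB.
    # `run` is injectable so the call can be exercised without systemd.
    run(["sudo", "systemctl", "start", unit_name(overlay_id)], check=True)
```

Coercing the id through `int()` keeps anything but a plain row id out of the instance name.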
 **Option C — pipe the script content directly into bash stdin.** The
 unit's ExecStart is something like
 `/bin/sh -c "fetch-script %i | /bin/bash"`. Pros: no on-disk file at
 all. Cons: `/bin/bash` runs without a file path, so `$0` is `bash` and
 error messages look weird; harder to debug a failing script when there's
 no file to inspect.
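The `$0` difference is easy to see in isolation. This small sketch, assuming `bash` is on `PATH`, runs the same one-line script once via stdin and once from a file:

```python
import os
import subprocess
import tempfile

SCRIPT = 'echo "$0"\n'

# Script fed on stdin (the Option C shape): $0 is the shell's own name.
piped = subprocess.run(
    ["bash"], input=SCRIPT, capture_output=True, text=True, check=True
).stdout.strip()

# Script read from a file (the Option A/B shape): $0 is the file path.
with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as fh:
    fh.write(SCRIPT)
try:
    from_file = subprocess.run(
        ["bash", fh.name], capture_output=True, text=True, check=True
    ).stdout.strip()
finally:
    os.unlink(fh.name)

print(piped)      # the shell name, not a script path
print(from_file)  # the temporary file's path
```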
 **Recommendation**: Option B. Decouples script storage (DB) from
 sandbox transport (a /run/ runtime file). RuntimeDirectory= handles
 cleanup. Worker becomes trivially small. The fetch-script helper is
 ~10 lines and stays in deploy/files/usr/local/libexec/left4me/.
 If Option A is chosen instead, plan to track the script tmpfiles
 explicitly so they don't accumulate. With Option B, RuntimeDirectory
 auto-cleans on stop.
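If Option A is kept, one way to stop files accumulating is to tie each script file's lifetime to the build that needs it. The following is a minimal sketch, assuming the worker owns the directory. `staged_script` is a hypothetical helper, and the 0644 mode is borrowed from the Option B unit snippet rather than from the existing worker code.

```python
import contextlib
from pathlib import Path
from typing import Iterator


@contextlib.contextmanager
def staged_script(
    script_dir: Path, overlay_id: int, script_text: str
) -> Iterator[Path]:
    """Write <id>.sh for the duration of a build, then remove it."""
    script_dir.mkdir(parents=True, exist_ok=True)
    script_path = script_dir / f"{overlay_id}.sh"
    script_path.write_text(script_text)
    script_path.chmod(0o644)
    try:
        yield script_path
    finally:
        # Runs on success, failure, and cancellation alike.
        script_path.unlink(missing_ok=True)
```

The worker would wrap its `systemctl start` call in `with staged_script(...)`. A crash of the worker process itself would still leave a file behind, which is the case Option B's `RuntimeDirectory=` covers.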
 ### Worker invocation
 Replace `run_sandboxed_script` in
-`l4d2web/services/overlay_builders.py`:
+`l4d2web/services/overlay_builders.py`. The code below is the **Option
 A** shape (worker writes the script). For **Option B** (recommended),
 drop the `script_dir`/`script_path`/`write_text`/`chmod` lines — the
 unit's ExecStartPre fetches from the DB. The signature can also drop
 `script_text` since the worker doesn't need to pass content anymore.
 ```python
 def run_sandboxed_script(
    overlay_id: int,
-    script_text: str,
+    script_text: str,  # remove this param if Option B
    *,
    on_stdout: LogSink,
    on_stderr: LogSink,
    should_cancel: CancelCheck,
 ) -> None:
    # The four lines below are Option A only — delete for Option B.
    script_dir = _sandbox_script_dir()
    script_dir.mkdir(parents=True, exist_ok=True)
    script_path = script_dir / f"{overlay_id}.sh"
@@ -339,6 +414,12 @@ In order:
 ## Open decisions for the future session
 0. **Script source: filesystem (Option A) vs. DB-fetched in ExecStartPre
   (Option B) vs. piped to stdin (Option C).** See the "Script source"
   section above. This is the highest-impact decision because it
   shapes the worker, the unit's ExecStartPre, and whether you need
   a fetch-script helper at all. Recommendation: Option B.
 1. **`/run/left4me/idmap/%i` vs. `/var/lib/left4me/tmp/sandbox-idmap-%i`** —
   `/run` is tmpfs and wiped on reboot, more correct for transient
   mount paths. But it requires the dir to exist (created by