spec(build-overlay-unit): flag DB-fetch-in-ExecStartPre as an option

The script content lives in the overlays.script DB column and the unit's %i is the row id, so the worker-writes-script-to-fs step in the original sketch is duplication. Document three options (worker writes / unit fetches via helper / pipe to stdin) and recommend the unit-fetches variant with RuntimeDirectory= auto-cleanup. Promote this to the top of the open-decisions list since it shapes the worker, the unit, and whether a fetch-script helper is added. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 01:54:41 +02:00 · 2026-05-15 01:54:41 +02:00 · 28b0ff951b
commit 28b0ff951b
parent a9bbc209ae
1 changed files with 83 additions and 2 deletions
--- a/docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md
+++ b/docs/superpowers/specs/2026-05-15-build-overlay-unit-design.md
@ -173,20 +173,95 @@ Notes:
  available at `/script.sh` inside the sandbox, picked from the
  predictable path `/var/lib/left4me/sandbox-scripts/%i.sh`.

+### Script source: filesystem vs. DB
+
+**Critical design decision the future session must make.** The current
+plan in the unit sketch above assumes the worker writes the script
+content to `/var/lib/left4me/sandbox-scripts/<id>.sh` before calling
+`systemctl start`. But the script *already lives in the DB* (the
+`overlays.script` column), and the unit instance name `%i` is the
+overlay row id. The filesystem copy is redundant unless we want it.
+
+Three options:
+
+**Option A — worker writes the script (the unit sketch above).**
+Worker queries DB, writes `<id>.sh` to a known path, then
+`systemctl start`. Unit reads via `BindReadOnlyPaths`. Simple, no DB
+access from the unit, the existing `_sandbox_script_dir()` plumbing
+mostly works. Cost: redundant on-disk copy; stale files between
+builds if you don't clean them.
+
+**Option B — unit fetches the script from the DB itself.** A small
+root-side helper installed as
+`/usr/local/libexec/left4me/left4me-fetch-script` does:
+
+```python
+#!/usr/bin/python3
+import sqlite3, sys
+overlay_id = int(sys.argv[1])
+conn = sqlite3.connect("/var/lib/left4me/left4me.db")
+row = conn.execute(
+    "SELECT script FROM overlays WHERE id = ?", (overlay_id,)
+).fetchone()
+sys.stdout.write((row[0] if row else "") or "")
+```
+
+Unit's ExecStartPre runs it as root (the `+` prefix), pipes the
+output to a runtime path that ExecStart reads:
+
+```ini
+RuntimeDirectory=left4me/sandbox-scripts
+RuntimeDirectoryMode=0700
+ExecStartPre=+/bin/sh -c '/usr/local/libexec/left4me/left4me-fetch-script %i \
+    > /run/left4me/sandbox-scripts/%i.sh && chmod 0644 /run/left4me/sandbox-scripts/%i.sh'
+BindReadOnlyPaths=/run/left4me/sandbox-scripts/%i.sh:/script.sh
+```
+
+(`RuntimeDirectory=` auto-creates `/run/left4me/sandbox-scripts/` on
+start and removes it on stop, including the file inside.)
+
+The fetch script doesn't need sudoers — it runs from ExecStartPre with
+root privileges already. It only reads the DB; no writes. The DB is
+`root:left4me 0640` so root can read it.
+
+Worker becomes a one-liner: `sudo systemctl start build-overlay@<id>`.
+No FS prep, no tmpfile cleanup.
+
+**Option C — pipe the script content directly into bash stdin.** The
+unit's ExecStart is something like
+`/bin/sh -c "fetch-script %i | /bin/bash"`. Pros: no on-disk file at
+all. Cons: `/bin/bash` runs without a file path, so `$0` is `bash` and
+error messages look weird; harder to debug a failing script when there's
+no file to inspect.
+
+**Recommendation**: Option B. Decouples script storage (DB) from
+sandbox transport (a /run/ runtime file). RuntimeDirectory= handles
+cleanup. Worker becomes trivially small. The fetch-script helper is
+~10 lines and stays in deploy/files/usr/local/libexec/left4me/.
+
+If Option A is chosen instead, plan to track the script tmpfiles
+explicitly so they don't accumulate. With Option B, RuntimeDirectory
+auto-cleans on stop.
+
 ### Worker invocation

 Replace `run_sandboxed_script` in
-`l4d2web/services/overlay_builders.py`:
+`l4d2web/services/overlay_builders.py`. The code below is the **Option
+A** shape (worker writes the script). For **Option B** (recommended),
+drop the `script_dir`/`script_path`/`write_text`/`chmod` lines — the
+unit's ExecStartPre fetches from the DB. The signature can also drop
+`script_text` since the worker doesn't need to pass content anymore.

 ```python
 def run_sandboxed_script(
    overlay_id: int,
-    script_text: str,
+    script_text: str,  # remove this param if Option B
    *,
    on_stdout: LogSink,
    on_stderr: LogSink,
    should_cancel: CancelCheck,
 ) -> None:
+    # The four lines below are Option A only — delete for Option B.
    script_dir = _sandbox_script_dir()
    script_dir.mkdir(parents=True, exist_ok=True)
    script_path = script_dir / f"{overlay_id}.sh"
@ -339,6 +414,12 @@ In order:

 ## Open decisions for the future session

+0. **Script source: filesystem (Option A) vs. DB-fetched in ExecStartPre
+   (Option B) vs. piped to stdin (Option C).** See the "Script source"
+   section above. This is the highest-impact decision because it
+   shapes the worker, the unit's ExecStartPre, and whether you need
+   a fetch-script helper binary at all. Recommendation: Option B.
+
 1. **`/run/left4me/idmap/%i` vs. `/var/lib/left4me/tmp/sandbox-idmap-%i`** —
   `/run` is tmpfs and wiped on reboot, more correct for transient
   mount paths. But it requires the dir to exist (created by