fix(deploy): one-shot cleanup of orphan overlay dirs after globals removal

Migration 0005_script_overlays drops the legacy l4d2center_maps /
cedapug_maps overlay rows but leaves their /var/lib/left4me/overlays/{id}
directories on disk. When the web app subsequently creates a new overlay
and SQLite re-issues an id matching one of those orphans (with
AUTOINCREMENT, or without it after the 0002 batch_alter_table recreate),
create_overlay_directory(exist_ok=False) crashes with FileExistsError.
This surfaced as a 500 on POST /overlays the first time a script
overlay was created on a deployed test box.
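
The collision described above can be reproduced in isolation. This is a minimal sketch, assuming create_overlay_directory wraps os.makedirs; its real body is not part of this commit, so the helper below is a hypothetical stand-in.

```python
import os
import tempfile


def create_overlay_directory(root, overlay_id, exist_ok=False):
    """Hypothetical stand-in for the web app helper; assumed to wrap os.makedirs."""
    path = os.path.join(root, str(overlay_id))
    os.makedirs(path, exist_ok=exist_ok)
    return path


def reproduce_orphan_collision():
    """Return True if a leftover directory makes the create call raise."""
    with tempfile.TemporaryDirectory() as root:
        # Orphan left behind by a dropped row whose id is later re-issued.
        os.makedirs(os.path.join(root, "1"))
        try:
            create_overlay_directory(root, 1)
        except FileExistsError:
            return True
        return False
```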

Adds a sentinel-gated sweep in deploy-test-server.sh that lists overlay
ids in the DB, removes any directory under overlays/ whose id has no
matching row, and drops the now-unused global_overlay_cache. Mirrors the
.kernel-overlay-migrated sentinel pattern so reruns are no-ops.
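
For illustration, the same sweep logic can be sketched in Python. This is not the code that ships in deploy-test-server.sh (that is the shell block in the diff); the function names and the explicit path parameters are assumptions made so the logic can run against a scratch directory.

```python
import os
import shutil
import sqlite3


def find_orphan_overlay_dirs(overlays_root, db_path):
    """Return sorted names of subdirectories with no matching overlays.id row."""
    conn = sqlite3.connect(db_path)
    try:
        ids_in_db = {str(row[0]) for row in conn.execute("SELECT id FROM overlays")}
    finally:
        conn.close()
    return sorted(
        entry.name
        for entry in os.scandir(overlays_root)
        if entry.is_dir() and entry.name not in ids_in_db
    )


def sweep_once(overlays_root, db_path, sentinel_path):
    """Remove orphan dirs unless the sentinel exists; return what was removed."""
    if os.path.exists(sentinel_path):
        return []  # already swept on an earlier run
    removed = find_orphan_overlay_dirs(overlays_root, db_path)
    for name in removed:
        shutil.rmtree(os.path.join(overlays_root, name))
    # Written only after the sweep succeeds, so a failed run is retried.
    with open(sentinel_path, "w"):
        pass
    return removed
```

In this sketch a failed query raises before any directory is removed, so an unreadable database cannot be mistaken for an empty id list.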

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mwiegand 2026-05-08 16:16:33 +02:00
parent 06ae84fbe4
commit cf865d4915
2 changed files with 37 additions and 2 deletions

@@ -203,6 +203,36 @@ if [ ! -e "$overlay_sentinel" ]; then
   $sudo_cmd chown left4me:left4me "$overlay_sentinel"
 fi
 
+# One-shot migration: 0005_script_overlays drops the legacy
+# l4d2center_maps / cedapug_maps overlay rows but doesn't touch their
+# directories under /var/lib/left4me/overlays/{id}. Without cleanup, when
+# AUTOINCREMENT (or its absence after the 0002 batch_alter_table recreate)
+# re-issues an id matching one of those orphan dirs, the web app's
+# create_overlay_directory(exist_ok=False) fails with FileExistsError.
+# Sweep any overlay dir whose id has no matching DB row, plus the
+# now-unused global_overlay_cache.
+overlay_orphan_sentinel=/var/lib/left4me/.script-overlays-orphans-cleaned
+if [ ! -e "$overlay_orphan_sentinel" ]; then
+  $sudo_cmd rm -rf /var/lib/left4me/global_overlay_cache
+  $sudo_cmd sh -c '
+    cd /var/lib/left4me/overlays || exit 0
+    ids_in_db=$(/opt/left4me/.venv/bin/python -c "
+import sqlite3
+c = sqlite3.connect(\"/var/lib/left4me/left4me.db\")
+print(\" \".join(str(r[0]) for r in c.execute(\"SELECT id FROM overlays\")))
+")
+    for d in */; do
+      id=${d%/}
+      case " $ids_in_db " in
+        *" $id "*) ;;
+        *) echo "removing orphan overlay dir: $id"; rm -rf "$id" ;;
+      esac
+    done
+  '
+  $sudo_cmd touch "$overlay_orphan_sentinel"
+  $sudo_cmd chown left4me:left4me "$overlay_orphan_sentinel"
+fi
+
 $sudo_cmd systemctl daemon-reload
 $sudo_cmd systemctl enable --now left4me-web.service
 $sudo_cmd systemctl restart left4me-web.service

@@ -277,10 +277,15 @@ def test_globals_refresh_units_removed():
     assert not GLOBAL_REFRESH_TIMER.exists()
 
 
-def test_deploy_script_does_not_reference_globals_subsystem():
+def test_deploy_script_does_not_provision_globals_subsystem():
     script = DEPLOY_SCRIPT.read_text()
-    assert "/var/lib/left4me/global_overlay_cache" not in script
+    # No mkdir/install of the deleted cache dir; mention in a one-shot
+    # `rm -rf` cleanup is fine.
+    for line in script.splitlines():
+        if "/var/lib/left4me/global_overlay_cache" not in line:
+            continue
+        assert "rm -rf" in line, line
     assert "left4me-refresh-global-overlays" not in script
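
The relaxed assertion is a per-line check: any line that mentions the cache path must also contain `rm -rf`. A small standalone sketch of that rule follows; the helper name `offending_lines` is hypothetical and does not appear in the test suite.

```python
CACHE_PATH = "/var/lib/left4me/global_overlay_cache"


def offending_lines(script_text, path=CACHE_PATH):
    """Return lines that mention `path` without an `rm -rf` on the same line."""
    return [
        line
        for line in script_text.splitlines()
        if path in line and "rm -rf" not in line
    ]


# A cleanup line passes; a provisioning line is reported.
cleanup = "$sudo_cmd rm -rf /var/lib/left4me/global_overlay_cache"
provision = "$sudo_cmd mkdir -p /var/lib/left4me/global_overlay_cache"
```

Because the check is a substring match per line, a line that both removes and recreates the directory would still pass; the rule only guards against provisioning lines that contain no `rm -rf` at all.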