Adds IPAddressDeny= to the sandbox unit covering loopback (127/8 + ::1),
link-local (169.254/16 + fe80::/10), multicast (224/4 + ff00::/8), all
RFC1918 v4 (10/8, 172.16/12, 192.168/16), CGNAT (100.64/10), and ULA v6
(fc00::/7). The kernel attaches systemd's sd_fw_egress BPF program to the
unit's cgroup; egress packets matching any of the deny prefixes are
silently dropped at the cgroup boundary.

Important: do NOT pair this with `IPAddressAllow=any`. Documentation
claims "more specific rule wins" but on this systemd 257 + kernel 6.12
combo, having both set causes the allow to win unconditionally: the deny
gets ignored. Empty IPAddressAllow + populated IPAddressDeny is the
correct shape: kernel default "allow all" applies to non-listed
addresses, and the listed prefixes are blocked.

Because the host's resolv.conf typically points at a private-IP DNS
server (10.0.0.1 in the test deploy), blocking RFC1918 also kills DNS.
Adds a static /etc/left4me/sandbox-resolv.conf with public resolvers
(Cloudflare 1.1.1.1, Google 8.8.8.8) and bind-mounts that into the
sandbox at /etc/resolv.conf, replacing the host's resolver inside the
sandbox only.

Smoke-tested on ckn@10.0.4.128:
- public 1.1.1.1:443: CONNECTED
- public HTTPS via DNS (steamcommunity.com): 200
- localhost web app 127.0.0.1:8000: blocked (TimeoutError)
- localhost sshd 127.0.0.1:22: blocked
- private LAN ssh 10.0.4.128:22: blocked
- private DNS 10.0.0.1:53: blocked

AF_UNIX stays in RestrictAddressFamilies; dropping it would risk
breaking NSS / syslog for marginal gain, and the IP-level filter
addresses the primary threat (reaching the host's HTTP/SSH services).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
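To make the deny list above easier to reason about, here is a minimal bash sketch that checks which of the smoke-tested IPv4 addresses fall inside the IPv4 prefixes passed to IPAddressDeny=. The helper names (`ip_to_int`, `in_cidr`, `verdict`) are hypothetical and not part of this commit; the sketch only mirrors prefix matching in userspace and says nothing about how the BPF filter is implemented. IPv6 prefixes are left out for brevity.

```shell
#!/bin/bash
# Hypothetical helpers (not part of the commit): test whether an IPv4
# address falls inside any IPv4 prefix from the IPAddressDeny= list.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Return 0 if address $1 lies inside CIDR prefix $2 (e.g. 10.0.0.0/8).
in_cidr() {
    local addr net bits mask
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( bits == 0 ? 0 : (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    (( (addr & mask) == (net & mask) ))
}

# IPv4 half of the deny list, copied from the launcher.
DENY_V4="127.0.0.0/8 169.254.0.0/16 224.0.0.0/4 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 100.64.0.0/10"

# Print "deny" if the address matches any listed prefix, else "allow".
verdict() {
    local prefix
    for prefix in $DENY_V4; do
        if in_cidr "$1" "$prefix"; then
            echo deny
            return 0
        fi
    done
    echo allow
}

# Addresses taken from the smoke test in the commit message.
for ip in 1.1.1.1 8.8.8.8 127.0.0.1 10.0.4.128 10.0.0.1; do
    printf '%-12s %s\n' "$ip" "$(verdict "$ip")"
done
```

The two public resolvers come out as `allow` and the loopback and 10/8 addresses as `deny`, which is consistent with the smoke-test results listed above.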
68 lines
3.5 KiB
Bash
Executable file
#!/bin/bash
# Privileged sandbox launcher for left4me script overlays.
#
# Invoked via sudo by the web user with two arguments:
#   <overlay_id>   numeric overlay id; bind-mounts /var/lib/left4me/overlays/<id>
#                  read-write at /overlay inside the sandbox.
#   <script_path>  absolute path to a bash file already written by the web app;
#                  bind-mounted read-only at /script.sh inside the sandbox.
#
# The script runs as a transient systemd .service with the full hardening
# surface: cgroup limits + walltime kill, NoNewPrivileges, ProtectSystem,
# ProtectHome, kernel-tunable / -module / -log protection, namespace
# restriction, address-family restriction, capability bounding (empty),
# seccomp filter (@system-service @network-io), MemoryDenyWriteExecute,
# LockPersonality, RestrictSUIDSGID. Network namespace is *not* restricted;
# scripts must reach the public internet to download workshop / l4d2center
# / cedapug content. PID namespace is shared with the host (no
# PrivatePID= directive in systemd); host PIDs are visible via /proc but
# not signal-able due to UID mismatch.
set -euo pipefail

[[ $# -eq 2 ]] || { echo "usage: $0 <overlay_id> <script>" >&2; exit 64; }

OVERLAY_ID=$1
SCRIPT=$2

[[ "$OVERLAY_ID" =~ ^[0-9]+$ ]] || { echo "bad overlay id" >&2; exit 64; }
OVERLAY_DIR=/var/lib/left4me/overlays/$OVERLAY_ID
[[ -d $OVERLAY_DIR ]] || { echo "no overlay dir at $OVERLAY_DIR" >&2; exit 65; }
[[ -f $SCRIPT ]] || { echo "no script at $SCRIPT" >&2; exit 65; }

if [[ "${LEFT4ME_SCRIPT_SANDBOX_DRY_RUN:-}" == "1" ]]; then
    echo "DRY RUN: overlay_id=$OVERLAY_ID script=$SCRIPT overlay_dir=$OVERLAY_DIR"
    exit 0
fi

# Make sure the sandbox UID owns the overlay dir so the script can write there.
# Idempotent: a no-op when the dir is already l4d2-sandbox-owned (re-run case),
# and corrects the ownership the first time the dir was created by the web app
# under the left4me UID. World-readable so the gameserver process (left4me)
# can read the overlay contents via the kernel-overlayfs lowerdir at runtime.
chown -R l4d2-sandbox:l4d2-sandbox "$OVERLAY_DIR"
chmod 0755 "$OVERLAY_DIR"

exec systemd-run --quiet --collect --wait --pipe \
    --unit="left4me-script-${OVERLAY_ID}-$$" \
    -p User=l4d2-sandbox -p Group=l4d2-sandbox \
    -p NoNewPrivileges=yes \
    -p ProtectSystem=strict -p ProtectHome=yes \
    -p PrivateTmp=yes -p PrivateDevices=yes -p PrivateIPC=yes \
    -p ProtectKernelTunables=yes -p ProtectKernelModules=yes \
    -p ProtectKernelLogs=yes -p ProtectControlGroups=yes \
    -p RestrictNamespaces=yes \
    -p RestrictAddressFamilies="AF_INET AF_INET6 AF_UNIX" \
    -p RestrictSUIDSGID=yes -p LockPersonality=yes \
    -p MemoryDenyWriteExecute=yes \
    -p SystemCallFilter="@system-service @network-io" \
    -p SystemCallArchitectures=native \
    -p CapabilityBoundingSet= -p AmbientCapabilities= \
    -p IPAddressDeny="127.0.0.0/8 ::1/128 169.254.0.0/16 fe80::/10 224.0.0.0/4 ff00::/8 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 100.64.0.0/10 fc00::/7" \
    -p TemporaryFileSystem="/etc /var/lib" \
    -p BindReadOnlyPaths="/etc/left4me/sandbox-resolv.conf:/etc/resolv.conf /etc/ssl /etc/ca-certificates /etc/nsswitch.conf /etc/alternatives ${SCRIPT}:/script.sh" \
    -p BindPaths="${OVERLAY_DIR}:/overlay" \
    -p WorkingDirectory=/overlay \
    -p Environment="HOME=/tmp PATH=/usr/bin:/usr/sbin OVERLAY=/overlay" \
    -p MemoryMax=4G -p MemorySwapMax=0 -p TasksMax=512 \
    -p CPUQuota=200% -p RuntimeMaxSec=3600 \
    -- /bin/bash /script.sh
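The smoke test described in the commit message could be reproduced with a small probe script along the following lines. This is a sketch only: the `probe` function is hypothetical, it relies on bash's `/dev/tcp` redirection (available when bash is built with network redirections, as on most distributions) and on coreutils `timeout`, and it cannot tell a packet dropped by the IP filter apart from a port that is simply closed or unreachable.

```shell
#!/bin/bash
# Hypothetical probe (not part of the repository): report whether a TCP
# connection to host:port can be opened within a timeout.
probe() {
    local host=$1 port=$2 secs=${3:-3}
    # Inside the child shell, $0 is the host and $1 is the port.
    if timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
        echo "$host:$port reachable"
    else
        # Covers refused connections, timeouts, and filtered packets alike.
        echo "$host:$port blocked"
    fi
}

# Probe every host:port target given on the command line (IPv4 or hostnames).
for target in "$@"; do
    probe "${target%:*}" "${target##*:}"
done
```

Because the launcher runs `/bin/bash /script.sh` without arguments, a copy submitted as an overlay script would need its targets hard-coded in place of `"$@"`, for example `1.1.1.1:443 127.0.0.1:8000 127.0.0.1:22 10.0.4.128:22 10.0.0.1:53` from the smoke test above.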