left4me/docs/superpowers/specs/2026-05-16-tz-aware-datetime-design.md
mwiegand 99b528e563
spec(tz-aware-datetime): design for UtcDateTime migration
Validated design for the migration framed by the 2026-05-16 handoff doc.
Two findings shaped this design: (1) DateTime(timezone=True) is a no-op
on SQLite per the round-trip spike, so the fix must live in app code;
(2) every byte on disk is provably UTC (no datetime.now() / utcnow() /
CURRENT_TIMESTAMP / func.now() anywhere), so a result-side tzinfo stamp
is correct, not optimistic.

The chosen approach: a UtcDateTime TypeDecorator that raises on naive
bind and stamps tzinfo=UTC on read. Single PR, two commits (test-first
for clean bisect). No DDL change, no Alembic migration, no on-disk
data transform.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 11:53:22 +02:00

273 lines
11 KiB
Markdown

# Timezone-aware datetime migration — design
Replaces the `.replace(tzinfo=None)` convention in `l4d2web` with a
`UtcDateTime` SQLAlchemy `TypeDecorator` that enforces aware-UTC on every
column boundary. The handoff at
`docs/superpowers/specs/2026-05-16-tz-aware-datetime-handoff.md` framed
the problem; this doc is the validated design.
## Context
`l4d2web`'s `l4d2web/models.py` declares 26 `DateTime` columns. SQLAlchemy
talking to pysqlite strips `tzinfo` on write and returns naive datetimes on
read, so a documented convention emerged: *every datetime in this codebase
is UTC; we just don't say so in the type system.* The convention is correct
in practice — every write path goes through `now_utc()` (`models.py:22-23`)
or `datetime.now(UTC)` directly — but it is held together by author
discipline, not by any compiler or runtime check.
The cost shows up as ~16 defensive `.replace(tzinfo=None)` sites scattered
across routes, services, and tests, plus a permanent mental tax: a
`datetime` parameter could be aware or naive depending on where it came
from, and a wrong guess raises `TypeError` at runtime — or, worse, silently
treats a non-UTC value as UTC.
## Validation work that shaped this design
Two findings reshaped the originally-imagined fix.
### The `timezone=True` kwarg is a no-op on SQLite
The handoff's implicit plan ("flip `timezone=True` on every column and the
strip sites can come down") is dead. A spike script
(`.tmp/tz_spike.py`, to be deleted with this PR) demonstrated:
| Column type | Write input | On-disk text | Read output |
|---|---|---|---|
| `DateTime` | aware UTC | `'2026-05-16 09:23:47.561010'` | naive |
| `DateTime(timezone=True)` | aware UTC | `'2026-05-16 09:23:47.564949'` | **naive** |
| `DateTime(timezone=True)` | legacy naive on disk | `'2026-05-16 12:34:56.789012'` | naive → `TypeError` on comparison with `datetime.now(UTC)` |
`pysqlite` has no native timestamp-with-zone type; the kwarg only affects
DDL emission, not pysqlite's serialization. The fix must live in app code.
### Every byte on disk is UTC
Greps for every conceivable non-UTC write site returned empty:
```
$ grep -rn 'datetime.now()' l4d2web/ → no matches
$ grep -rn 'utcnow' l4d2web/ → no matches
$ grep -rn 'CURRENT_TIMESTAMP' l4d2web/ → no matches
$ grep -rn 'func.now\|func.current' l4d2web/ → no matches
```
So `process_result_value` stamping `tzinfo=UTC` on read is *correct*, not
optimistic — no data audit, no backfill, no migration to rewrite rows.
## The contract — `UtcDateTime`
```python
from sqlalchemy.types import TypeDecorator
class UtcDateTime(TypeDecorator):
"""Always store and surface UTC-aware datetimes.
SQLite has no native tz-aware type and pysqlite strips tzinfo on
round-trip. We convert inputs to UTC, persist them as naive UTC under
the hood, and re-stamp tzinfo=UTC on read. The result: every datetime
on a model attribute is aware UTC, always.
"""
impl = DateTime
cache_ok = True
def process_bind_param(self, value, dialect):
if value is None:
return None
if value.tzinfo is None:
raise TypeError(
"naive datetime passed to UtcDateTime column; "
"all writes must be UTC-aware"
)
return value.astimezone(UTC).replace(tzinfo=None)
def process_result_value(self, value, dialect):
if value is None:
return None
return value.replace(tzinfo=UTC)
```
The class lives in `l4d2web/models.py` (single consumer; a separate module
for ~25 lines is over-structuring for this project).
### Why raise on naive input rather than silently coerce
Silent coercion is exactly how this codebase grew the reverse-strip line in
`auth.py:60`. Once you tolerate ambiguity at the boundary, the ambiguity
leaks into every comparison site. A loud failure teaches the next author
that the column is aware-only; silent coercion enables future drift.
The runtime check is also the *only* enforcement available — there is no
mypy plugin or static check for "naive datetime crossing into SA bind."
### Why `impl = DateTime` (not `DateTime(timezone=True)`)
`UtcDateTime.impl = DateTime` produces SQL identical to the current
schema. `alembic revision --autogenerate` sees no difference. Historical
migrations remain accurate descriptions of past DDL. No new migration
file is needed; the change is purely Python-side.
(Aside: on PostgreSQL, `impl = DateTime(timezone=True)` would use native
`timestamptz`. Switching dialects would be a one-line change. Not in
scope here.)
## Migration shape
**One PR, two commits.**
1. `test(datetime): pin tz-aware contract for fixtures (red until UtcDateTime lands)`
— drop `.replace(tzinfo=None)` from 10 test sites. Suite goes red.
These failing tests are an executable spec for commit 2.
2. `refactor(datetime): introduce UtcDateTime, remove naive-strip workarounds`
— add the class, flip 26 column declarations, drop 5 production strip
sites + their explanatory comment, flip one line in `auth.py` from
strip-live-marker to stamp-legacy-cookie, add 2 unit tests pinning the
decorator contract. Tests go green.
The test-first ordering gives a clean bisect target: commit 2 is the
single causal point where the type-system invariant takes hold. The
project's recent commits show a strong preference for atomic
single-purpose commits, which this ordering matches.
## What stays the same
- `now_utc()` (`models.py:22-23`) — already returns aware UTC.
- On-disk format — identical bytes; pysqlite still writes `'YYYY-MM-DD
HH:MM:SS.ffffff'`.
- Schema (DDL) — `UtcDateTime.impl = DateTime` produces identical CREATE
TABLE statements.
- SQL emitted at query time — bind hook still hands the driver naive
datetimes.
- `services/timeago.py::_ensure_utc` — kept as a public-filter boundary
assertion. Still exercised by `tests/test_timeago.py:113`, which
deliberately passes a naive `datetime(2026, 5, 16, 14, 32, 11)` to
test the normalize-up path. Load-bearing-ness for DB values drains
away; defense-in-depth at the rendering boundary remains.
- The Alembic chain in `l4d2web/alembic/versions/` — no new migration
file. (See [decision #4](#decisions-locked).)
## What changes
### Call sites that lose `.replace(tzinfo=None)`
5 production sites + 1 `_now()` helper:
- `routes/profile_routes.py:88`
- `routes/server_routes.py:210, :234`
- `routes/page_routes.py:174`
- `services/live_state_poller.py:37` (the `_now()` helper just
returns `datetime.now(UTC)`)
10 test sites: `tests/test_auth.py:162`, `tests/test_timeago.py:67, :73`
(NOT `:113`), `tests/test_servers.py:459, :521`,
`tests/test_live_state_poller.py:159, :278, :303, :356`,
`tests/test_models.py:43`.
The explanatory comment at `routes/profile_routes.py:86-87` is deleted —
it documented a workaround that no longer exists.
### `auth.py` legacy-cookie compat
`auth.py:59-60` today strips tz from a freshly-stamped aware marker so it
compares cleanly with naive DB. After migration the inverse is needed:
legacy session cookies (minted before the deploy) carry naive ISO; the DB
now returns aware.
```python
# Before
if marker_dt.tzinfo is not None:
marker_dt = marker_dt.replace(tzinfo=None)
# After
# Legacy sessions carry naive ISO markers; stamp UTC so comparison works.
if marker_dt.tzinfo is None:
marker_dt = marker_dt.replace(tzinfo=UTC)
```
The `UTC` symbol must be added to the `from datetime import datetime`
import on `auth.py:1` — easy to miss in review.
This line stays permanently. There is no reliable signal for "all naive
cookies have expired"; the line is cheap defense and mirrors the
`_ensure_utc` philosophy at the rendering boundary.
### Unit tests pinning the decorator contract
Two tests in `tests/test_models.py`, surviving any future column rename:
```python
def test_utc_datetime_rejects_naive_bind():
with pytest.raises(TypeError, match="naive"):
UtcDateTime().process_bind_param(datetime(2026, 5, 16, 12, 0), None)
def test_utc_datetime_stamps_utc_on_read():
naive = datetime(2026, 5, 16, 12, 0)
result = UtcDateTime().process_result_value(naive, None)
assert result == datetime(2026, 5, 16, 12, 0, tzinfo=UTC)
assert result.tzinfo == UTC
```
### Already-aware writes that "just keep working"
`services/job_worker.py` lines 131, 486, 522, 550, 579 and
`services/overlay_builders.py:233` already write aware
`datetime.now(UTC)` directly to columns. Today they "work" because
pysqlite implicitly strips. After migration they pass `process_bind_param`
explicitly. No code change needed at these sites — but worth noting they
exist, because they're evidence that the codebase has been writing aware
datetimes all along, just losing the tzinfo on the wire.
## Decisions locked
| # | Decision | Rationale |
|---|---|---|
| 1 | `TypeDecorator`, not `DateTime(timezone=True)` | Spike proved the kwarg is a no-op on SQLite. |
| 2 | Raise on naive bind, not silent coerce | Loud failures teach; silent coercion is how `auth.py:60` happened. |
| 3 | Test-first two-commit ordering | Clean bisect target; commit 1 is an executable spec for commit 2. |
| 4 | No new Alembic migration | `impl = DateTime` produces identical SQL; an `alter_column` migration would be a no-op on SQLite and generate autogenerate noise; an empty marker migration is documentation in the wrong place. |
| 5 | Legacy-cookie compat in `auth.py` is permanent | No reliable signal for cookie rollover; one line of defense matches `_ensure_utc`. |
| 6 | `services/timeago.py::_ensure_utc` stays | Public filter boundary + tested at `test_timeago.py:113`. |
| 7 | `UtcDateTime` lives in `models.py` | Single consumer; separate module for 25 lines is over-structuring. |
## Out of scope
- Postgres portability. `UtcDateTime` would adapt with `impl =
DateTime(timezone=True)` if the dialect supports it, but switching
dialects is its own project.
- Cleanup of any tz patterns in `l4d2host` (the dedicated-server side of
the repo). This work touches only `l4d2web`.
- Static enforcement (mypy plugin, ruff rule) for "naive datetime crossing
into a `UtcDateTime` column." The runtime `TypeError` is the gate; a
static check would be a nice-to-have.
## Verification
1. **Unit:** the two new tests in `tests/test_models.py` pin the
decorator's contract independently of the model declarations.
2. **Suite:** `cd l4d2web && uv run pytest`. Expected: all-green after
commit 2.
3. **Smoke (manual):** log in, change password, log out, log back in.
Exercises the full session-cookie round-trip and the
`password_changed_at` comparison at `auth.py:61`.
4. **Legacy-cookie path (manual):** craft a session with a naive ISO
marker and hit any `@require_login` route. Expected: clean
comparison, not a 500.
## Pointers
- `docs/superpowers/specs/2026-05-16-tz-aware-datetime-handoff.md` — the
handoff that framed this work.
- `docs/superpowers/specs/2026-05-16-timeago-shared-display-design.md`
the parent project that surfaced the convention's cost.
- `l4d2web/l4d2web/models.py:22-23``now_utc()` factory (no change).
- `l4d2web/l4d2web/services/timeago.py:12-15``_ensure_utc` boundary
(no change).
- `l4d2web/l4d2web/auth.py:51-60` — the compat block whose direction
flips.