SpSyDiD — Spatial Synthetic-DiD (Serenini & Masek 2024)

SpSyDiD — Spatial Synthetic-DiD (Serenini & Masek 2024)#

Estimator:

Spatial Synthetic Difference-in-Differences (SpSyDiD)mlsynth.SpSyDiD

Source:

Serenini, R., & Masek, F. (2024), “Spatial Synthetic Difference-in-Differences,” SSRN Working Paper 4736857.

Replication type:

Path B — the authors’ State-Level Monte Carlo, as a per-replication cross-validation against the authors’ own reference code.

Status:

Fully verified — matched per-rep to the authors’ algorithm and recovers the paper’s headline unbiasedness.

Validation strategy#

Serenini & Masek’s empirical example (the Arizona 2007 LAWA effect) runs on a CPS panel that the authors do not release. Their public replication repository, serenini/spatial_SDID, instead ships the simulation code plus a BLS unemployment panel for two Monte Carlo exercises. We therefore satisfy the contract along Path B — and, because the authors’ full code is available, we make it a cross-validation: on every replication we run both mlsynth.SpSyDiD and the authors’ own estimator on the same panel and compare the direct ATT per-rep.

Because the reference repository carries no licence, its code is not vendored into mlsynth. The benchmark clones it on demand at a pinned commit (e43427d) and imports the authors’ functions_ssdid weight machinery from the clone; the spatial WLS step is reproduced from State_Level_Simulations.ipynb (cell 16). It skips gracefully when git, the network, or libpysal is unavailable.

The DGP (State-Level Monte Carlo)#

  • Panel: 49 contiguous US states, monthly unemployment 1976-2014 (basedata/state_unemployment.csv); queen-contiguity \(W\) (basedata/US_no_islands_matrix.gal), row-standardised.

  • Windows: for each 3-year rolling window the third year is the post-period (\(T_0 = 24\), \(T_1 = 12\)). Panels are deterministic.

  • Treatment: Arkansas (FIPS 5) is the directly-treated state. The outcome is \(\text{UR2} = \text{perc\_unem} + \text{interaction}\cdot\text{ATT} + \text{spillover}\cdot\text{ATT}\cdot\rho\), with ATT set to 25% of the window’s mean unemployment and \(\rho = 0.8\).

Per-replication cross-validation#

Over the 20 deterministic windows from 1975, mlsynth and the authors’ reference algorithm agree on the direct ATT to within \(\approx 0.09\) pp per rep, with a per-rep correlation of 0.996:

Quantity (20 windows)

mlsynth

reference

mean ATT bias

+0.023

+0.022

max per-rep \(|\Delta\widehat{\tau}|\) vs ref

0.094

per-rep correlation vs ref

0.996

Both estimators recover the paper’s headline finding: the mean ATT bias is essentially zero (~0.02 against an ATT magnitude of ~1.7 pp), i.e. the spatial correction cleanly removes the spillover-induced bias that contaminates plain SDID. The small per-rep residual is the affected-unit weight convention (mlsynth: \(1/N_{sp}\); reference: mean treated-unit SDID weight) — both valid downstream of the SDID weight QPs.

Durable check#

The benchmark lives in benchmarks/cases/spsydid_state_mc.py:

pip install libpysal
python benchmarks/run_benchmarks.py --case spsydid_state_mc

It asserts per-rep agreement (max \(|\Delta| < 0.2\), correlation > 0.98) and near-zero mean ATT bias for both implementations.

References#

Arkhangelsky, D., Athey, S., Hirshberg, D. A., Imbens, G. W., & Wager, S. (2021). “Synthetic Difference-in-Differences.” American Economic Review 111(12):4088-4118.

Serenini, R., & Masek, F. (2024). “Spatial Synthetic Difference-in-Differences.” SSRN Working Paper 4736857.