Spatial Synthetic Difference-in-Differences (SpSyDiD)#

Overview#

SpSyDiD (Serenini, R., & Masek, F. (2024). “Spatial Synthetic Difference-in-Differences,” SSRN 4736857) extends the Synthetic Difference-in-Differences (SDID) estimator of Arkhangelsky-Athey-Hirshberg-Imbens-Wager (2021) with a spatial spillover term. The estimator separates two estimands that standard SDID confounds when SUTVA is violated by geographic spillovers:

\(\widehat{\tau}\) – the direct ATT on the directly-treated units (identical in form to standard SDID).
\(\widehat{\tau}_s\) – the indirect / spillover coefficient per unit of neighbour-treatment exposure \(e_{it} = \sum_{j} w_{ij}\, d_{jt}\).

The implied population ATE follows from Serenini & Masek’s eq. 14,

\[\widehat{\mathrm{ATE}} = \widehat{\tau} \cdot \bigl(1 + \overline{WD}\bigr),\]

where \(\overline{WD}\) is the average exposure across the directly + indirectly treated units in the post-period.

The user supplies a row-standardised \(N \times N\) spatial weight matrix \(\mathbf{W}\). Helpers in mlsynth.utils.spsydid_helpers.spatial cover the standard constructions:

knn_weights() – \(k\)-nearest neighbours from coordinates.
inverse_distance_weights() – \(w_{ij} \propto 1/d_{ij}^p\) with optional cutoff.
contiguity_weights() – queen / rook contiguity from an adjacency dictionary.

When \(\mathbf{W} = \mathbf{0}\) (no spatial structure) or no donor has any treated neighbour, SpSyDiD numerically reduces to plain SDID with \(\widehat{\tau}_s = 0\).

When to Use This Method#

Every difference-based estimator – DiD, synthetic control, and plain Synthetic Difference-in-Differences (SDID) – rests on SUTVA: a control unit’s outcome is unaffected by anyone else’s treatment. Geography routinely breaks this. When a policy in the treated region leaks to its neighbours, those neighbours are exactly the units a synthetic control wants to lean on, and the leakage corrupts the comparison. Serenini & Masek (2024) make the bias explicit:

Spillovers onto units inside the donor pool bias and render inconsistent the standard SDID ATT – the synthetic control is built from partially-treated donors, so the “untreated” benchmark is itself moving with the treatment.
Spillovers outside the donor pool leave the ATT identifiable but make the population ATE unidentified, because the indirect effect on exposed-but-excluded units is never measured.

SpSyDiD targets this regime directly. It adds a single spatial exposure term \(e_{it} = \sum_{j} w_{ij}\, d_{jt}\) to the doubly-weighted SDID regression, so the estimator returns two numbers: the direct ATT \(\widehat{\tau}\) (same form as SDID) and the per-exposure indirect coefficient \(\widehat{\tau}_s\). The population ATE then follows from \(\widehat{\mathrm{ATE}} = \widehat{\tau}\,(1 + \overline{WD})\) (eq. 14). Relative to the older Spatial DiD of Delgado & Florax (2015), the synthetic weighting sharpens identification of the indirect effect while keeping SDID’s robustness for the direct effect.

Reach for SpSyDiD whenever there is a plausible mechanism for the treatment to leak from the directly-treated units to a subset of the donor pool through spatial or structural proximity, and you can supply a credible row-standardised weight matrix \(\mathbf{W}\) encoding that proximity. Typical examples:

Immigration policy with cross-border relocation. Arizona’s 2007 LAWA legislation directly affected Arizona’s noncitizen Hispanic population but also displaced workers to neighbouring states. SDID alone would either bias the ATT (if you include the spillover-affected states as controls) or be unable to estimate the spillover at all.
State tax changes with cross-border shopping. A state sales-tax increase affects that state’s revenue directly and leaks via cross-border shopping into neighbouring states.
Local advertising campaigns with geographic spillovers across DMA boundaries.
Vaccine mandates with cross-state mobility effects.

Do not use SpSyDiD when#

SUTVA holds / there is no spillover concern. With \(\mathbf{W} = \mathbf{0}\) or no treated neighbours, SpSyDiD reduces numerically to plain Synthetic Difference-in-Differences (SDID) with \(\widehat{\tau}_s = 0\); the extra exposure column just adds noise. Use Synthetic Difference-in-Differences (SDID) – it is faster and more parsimonious.
You cannot defend a spatial weight matrix. The whole identification of \(\widehat{\tau}_s\) runs through \(\mathbf{W}\). If proximity is not the spillover channel (e.g., interference flows through an unobserved social or supply-chain network you cannot encode), a misspecified \(\mathbf{W}\) buys biased indirect effects; consider Spillover-Aware Synthetic Control (SPILLSYNTH), which models spillover through donor membership rather than a fixed geographic kernel.
Interference is global or non-local. SpSyDiD assumes exposure is a local, distance-decaying function of neighbours’ treatment. General equilibrium effects that hit every unit equally are absorbed into the time effects and cannot be separated.
You only need the direct ATT and the donor pool is clean. If the spillover-affected units can simply be dropped from the donor pool and the indirect effect is not of interest, plain Synthetic Difference-in-Differences (SDID) on the pruned pool is the simpler honest choice.
Your question is distributional (quantiles, tails), or you have a single treated unit with no spatial structure. Use Distributional Synthetic Control (DSC) for the former and Two-Step Synthetic Control / Forward Difference-in-Differences (FDID) for the latter.

Mathematical Formulation#

Setup#

Let \(\mathcal{N} \coloneqq \{1, \dots, N\}\) index the units and \(t \in \mathcal{T} \coloneqq \{1, \dots, T\}\) the periods (1-indexed); the intervention takes effect after period \(T_0\), so the pre-period is \(\mathcal{T}_1 \coloneqq \{t \in \mathcal{T} : t \le T_0\}\) and the post-period is \(\mathcal{T}_2 \coloneqq \{t \in \mathcal{T} : t > T_0\}\), with \(T_{\mathrm{post}} \coloneqq T - T_0\). Let \(y_{it}\) be the outcome, \(d_{it} \in \{0, 1\}\) the direct-treatment indicator, and \(\mathbf{W} \in \mathbb{R}_{\ge 0}^{N \times N}\) a row-standardised spatial weight matrix with zero diagonal. The spillover exposure of unit \(i\) at time \(t\) is

\[e_{it} \coloneqq (\mathbf{W}\mathbf{d}_t)_i = \sum_{j \in \mathcal{N}} w_{ij}\, d_{jt} \in [0, 1], \qquad \mathbf{d}_t \coloneqq (d_{1t}, \dots, d_{Nt})^\top .\]
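As a minimal sketch of this definition (toy numbers only, not tied to any mlsynth internals), the whole exposure panel is one matrix product of a row-standardised \(\mathbf{W}\) with the stacked treatment indicators:

```python
import numpy as np

# Toy chain of 4 units: 0 - 1 - 2 - 3 (each unit neighbours the adjacent ones).
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
W = A / A.sum(axis=1, keepdims=True)  # row-standardise: every row sums to 1

# Treatment indicators d_it, shape (N, T): unit 0 is treated in the last 2 of 4 periods.
D = np.zeros((4, 4))
D[0, 2:] = 1.0

# Exposure e_it = sum_j w_ij d_jt; column t of E equals W @ d_t.
E = W @ D
print(E)
```

Only unit 1 has a treated neighbour here, and half of its neighbour weight sits on unit 0, so its post-period exposure is 0.5 while every other entry is zero.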

The estimator auto-partitions \(\mathcal{N}\) into

\(\mathcal{I}_{\mathrm{tr}}\) – directly treated units (\(d_{it} = 1\) for some \(t\)), with \(N_{\mathrm{tr}} \coloneqq |\mathcal{I}_{\mathrm{tr}}|\);
\(\mathcal{I}_{\mathrm{sp}}\) – indirectly treated units (\(d_{it} = 0\) for all \(t\) but \(e_{it} > 0\) for some \(t\)), with \(N_{\mathrm{sp}} \coloneqq |\mathcal{I}_{\mathrm{sp}}|\);
\(\mathcal{C}\) – pure controls (\(d_{it} = 0\) and \(e_{it} = 0\) for all \(t\)).

Only \(\mathcal{C}\) is used to fit the SDID unit / time weights.
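The three-way partition can be sketched directly from the treatment and exposure panels. The function name `partition_units` and the tolerance `tol` below are hypothetical, introduced only for illustration; the library's own partitioning code may differ.

```python
import numpy as np

def partition_units(D, E, tol=1e-12):
    """Split unit indices into (direct, spillover, pure control).

    D, E: arrays of shape (N, T) holding d_it and e_it.
    tol is an assumed numerical threshold for "positive exposure".
    """
    ever_treated = (D > 0).any(axis=1)
    ever_exposed = (E > tol).any(axis=1)
    direct = np.flatnonzero(ever_treated)
    spillover = np.flatnonzero(~ever_treated & ever_exposed)
    pure = np.flatnonzero(~ever_treated & ~ever_exposed)
    return direct, spillover, pure

# Toy chain of 4 units with unit 0 treated in the last two periods.
A = np.diag(np.ones(3), 1) + np.diag(np.ones(3), -1)
W = A / A.sum(axis=1, keepdims=True)
D = np.zeros((4, 4))
D[0, 2:] = 1.0
direct, spillover, pure = partition_units(D, W @ D)
print(direct, spillover, pure)
```

In this toy chain unit 0 is directly treated, unit 1 is spillover-exposed, and units 2 and 3 are the pure controls that would enter the weight fit.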

Algorithm#

Step 1 – SDID weights from pure controls. Following Arkhangelsky et al. (2021), fit the unit weights \(\widehat{\boldsymbol{\omega}} \in \Delta^{|\mathcal{C}|}\) and time weights \(\widehat{\boldsymbol{\lambda}} \in \Delta^{T_0}\) (each on the unit simplex \(\Delta^{m} \coloneqq \{\mathbf{x} \in \mathbb{R}_{\ge 0}^{m} : \|\mathbf{x}\|_1 = 1\}\)) on \(\mathcal{C}\) only. The regularisation parameter is \(\zeta \coloneqq (N_{\mathrm{tr}}\, T_{\mathrm{post}})^{1/4} \cdot \mathrm{sd}(\Delta \mathbf{y})\), where \(\mathrm{sd}(\Delta \mathbf{y})\) is the standard deviation of the first-differenced pre-period donor outcomes; with a single directly-treated unit this reduces to \(T_{\mathrm{post}}^{1/4} \cdot \mathrm{sd}(\Delta \mathbf{y})\).
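A small sketch of the regularisation rule, written in the \((N_{\mathrm{tr}} T_{\mathrm{post}})^{1/4}\) form documented for compute_regularization (which equals \(T_{\mathrm{post}}^{1/4}\) for a single treated unit). The name `zeta_rule`, the `(n_donors, T0)` array layout, and the use of the sample standard deviation (`ddof=1`) are assumptions made for illustration.

```python
import numpy as np

def zeta_rule(donor_outcomes_pre, num_post_periods, num_treated_units=1):
    """(N_tr * T_post)^(1/4) times the sd of first-differenced donor outcomes.

    donor_outcomes_pre is assumed to have shape (n_donors, T0), time on axis 1.
    """
    first_diffs = np.diff(donor_outcomes_pre, axis=1)
    scale = (num_treated_units * num_post_periods) ** 0.25
    return scale * first_diffs.std(ddof=1)  # ddof=1 is an assumption

# Two donors, three pre-periods; first differences are 1, 2, 2, 0.
Y_pre = np.array([[0.0, 1.0, 3.0],
                  [0.0, 2.0, 2.0]])
zeta = zeta_rule(Y_pre, num_post_periods=16)  # 16 ** 0.25 == 2
print(zeta)
```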

Step 2 – assemble the full weight vector. Set

\[\begin{split}\widehat{\omega}_i = \begin{cases} 1 / N_{\mathrm{tr}} & i \in \mathcal{I}_{\mathrm{tr}}, \\ 1 / N_{\mathrm{sp}} & i \in \mathcal{I}_{\mathrm{sp}}, \\ \widehat{\omega}_i^{\mathrm{SDID}} & i \in \mathcal{C}. \end{cases}\end{split}\]

Time weights are SDID-fit for the pre-period and uniform \(1 / T_{\mathrm{post}}\) for the post-period.
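Step 2 amounts to filling two vectors. The sketch below assumes the SDID weights for the pure controls and the pre-period time weights have already been fitted; `assemble_weights` is a hypothetical helper name, not part of the documented API.

```python
import numpy as np

def assemble_weights(n_units, direct, spillover, pure, omega_sdid, lambda_pre, T_post):
    """Build the full unit-weight and time-weight vectors used in the final WLS."""
    omega = np.zeros(n_units)
    omega[direct] = 1.0 / len(direct)            # uniform over directly treated
    if len(spillover) > 0:
        omega[spillover] = 1.0 / len(spillover)  # uniform over indirectly treated
    omega[pure] = omega_sdid                     # SDID-fit weights on pure controls
    lam = np.concatenate([lambda_pre, np.full(T_post, 1.0 / T_post)])
    return omega, lam

omega, lam = assemble_weights(
    n_units=6,
    direct=np.array([0, 1]),
    spillover=np.array([2]),
    pure=np.array([3, 4, 5]),
    omega_sdid=np.array([0.5, 0.3, 0.2]),  # made-up SDID weights
    lambda_pre=np.array([0.1, 0.9]),       # made-up pre-period time weights
    T_post=2,
)
print(omega, lam)
```

Each of the three unit groups carries total weight one, as do the pre-period and post-period blocks of the time weights.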

Step 3 – augmented two-way FE WLS regression. Solve

\[(\widehat{\tau}, \widehat{\tau}_s, \widehat{\mu}, \widehat{\boldsymbol{\alpha}}, \widehat{\boldsymbol{\beta}}) = \arg\min_{\tau, \tau_s, \mu, \boldsymbol{\alpha}, \boldsymbol{\beta}} \sum_{i \in \mathcal{N}} \sum_{t \in \mathcal{T}} \bigl[ y_{it} - \mu - \alpha_i - \beta_t - \tau\, d_{it} - \tau_s\, e_{it} \bigr]^2 \widehat{\omega}_i\, \widehat{\lambda}_t .\]

The augmented design jointly recovers the direct effect \(\widehat{\tau}\) (the ATT) and the spillover coefficient \(\widehat{\tau}_s\).
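The regression in Step 3 can be sketched with plain NumPy least squares. This is an illustrative implementation under stated assumptions, not the library's solver: `weighted_twfe` is a hypothetical name, the intercept \(\mu\) is absorbed into the (collinear) unit and time dummies, and the minimum-norm solution returned by `np.linalg.lstsq` is relied on to handle that collinearity.

```python
import numpy as np

def weighted_twfe(Y, D, E, omega, lam):
    """Weighted two-way FE regression of Y on D and E; returns (tau_hat, tau_s_hat)."""
    N, T = Y.shape
    unit_dummies = np.kron(np.eye(N), np.ones((T, 1)))  # row i*T + t -> unit i
    time_dummies = np.kron(np.ones((N, 1)), np.eye(T))  # row i*T + t -> period t
    X = np.column_stack([D.ravel(), E.ravel(), unit_dummies, time_dummies])
    sqrt_w = np.sqrt(np.outer(omega, lam).ravel())      # sqrt of omega_i * lambda_t
    coef, *_ = np.linalg.lstsq(X * sqrt_w[:, None], Y.ravel() * sqrt_w, rcond=None)
    return coef[0], coef[1]

# Noise-free toy panel: 5 units on a chain, unit 0 treated in the last 2 of 4 periods.
N, T, T0 = 5, 4, 2
A = np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)
W = A / A.sum(axis=1, keepdims=True)
D = np.zeros((N, T))
D[0, T0:] = 1.0
E = W @ D
alpha = np.arange(N, dtype=float)
beta = np.array([0.0, 0.3, 0.1, 0.8])
Y = alpha[:, None] + beta[None, :] + 2.0 * D + 1.0 * E  # planted tau = 2, tau_s = 1

omega = np.array([1.0, 1.0, 1 / 3, 1 / 3, 1 / 3])  # illustrative weights
lam = np.array([0.5, 0.5, 0.5, 0.5])
tau_hat, tau_s_hat = weighted_twfe(Y, D, E, omega, lam)
print(tau_hat, tau_s_hat)
```

Because the toy outcomes contain no noise and the pure controls pin down the time effects, the two planted coefficients are recovered up to floating-point error.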

Step 4 – combine. With \(\overline{WD}\) the average exposure over \(\mathcal{I}_{\mathrm{tr}} \cup \mathcal{I}_{\mathrm{sp}}\) in the post-period, the indirect and total effects are

\[\widehat{\mathrm{AITE}} = \widehat{\tau}_s\, \overline{WD}, \qquad \widehat{\mathrm{ATE}} = \widehat{\tau}\,(1 + \overline{WD}).\]
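As a worked example of the two formulas above, with made-up coefficient values and a hypothetical helper name `combine_effects`:

```python
import numpy as np

def combine_effects(tau_hat, tau_s_hat, E, treated_union, T0):
    """Return (mean post-period exposure, AITE, ATE) from the Step 4 formulas."""
    wd_bar = E[treated_union, T0:].mean()  # average over units and post periods
    aite = tau_s_hat * wd_bar
    ate = tau_hat * (1.0 + wd_bar)
    return wd_bar, aite, ate

# Exposure panel for 4 units and 4 periods (T0 = 2): only unit 1 is exposed.
E = np.zeros((4, 4))
E[1, 2:] = 0.5
treated_union = np.array([0, 1])  # directly treated unit 0 plus exposed unit 1

wd_bar, aite, ate = combine_effects(2.0, 1.0, E, treated_union, T0=2)
print(wd_bar, aite, ate)
```

With \(\overline{WD} = 0.25\), \(\widehat{\tau} = 2\) and \(\widehat{\tau}_s = 1\), the formulas give an AITE of 0.25 and an ATE of 2.5.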

Identification assumptions#

A1. No anticipation – units do not adjust outcomes in advance of the treatment.

Remark. The pre-period outcomes \(y_{it}\) for \(t \in \mathcal{T}_1\) must reflect the no-treatment world, since they are what the SDID weights are fit on. If units respond before \(T_0\) – forward-looking firms or households re-optimising ahead of a known policy – the pre-period fit is contaminated and both \(\widehat{\tau}\) and \(\widehat{\tau}_s\) absorb part of the anticipation, as in standard SDID.

A2. Parallel trends – in the absence of treatment, treated, spillover, and control units would have followed similar trends, conditional on unit and time fixed effects.

Remark. This is the SDID identifying restriction carried over to the three-way partition: net of the unit effect \(\alpha_i\) and the time effect \(\beta_t\), the pure controls \(\mathcal{C}\) trace out the common trajectory that the directly- and indirectly-treated units would have followed absent the policy. The doubly-weighted design relaxes the raw parallel-trends requirement to hold only for the SDID-reweighted controls, which is why only \(\mathcal{C}\) enters the weight fit.

A3. Additivity and linearity of spillovers – the potential outcome of a unit depends linearly and additively on its own treatment status and the treatment exposure of its neighbours, captured by \(e_{it}\).

Remark. This is what lets a single coefficient \(\tau_s\) summarise the indirect effect: the exposure \(e_{it} = \sum_j w_{ij} d_{jt}\) enters the regression as one extra additive column, scaling linearly with how much treated mass a unit’s neighbours carry. If the spillover were non-linear in exposure (e.g. saturating once a few neighbours are treated), \(\widehat{\tau}_s\) would recover only a local linear approximation.

A4. Limited interference – spillovers operate exclusively through the structure defined by the exogenous \(\mathbf{W}\). No other local or global interference mechanisms are assumed.

Remark. The whole identification of \(\widehat{\tau}_s\) runs through \(\mathbf{W}\), so the weight matrix must encode the true spillover channel and be exogenous to the treatment. Interference that flows through an unmodelled network, or general-equilibrium effects that hit every unit equally, are not captured by \(e_{it}\) – the latter are absorbed into the time effects \(\beta_t\) and cannot be separated from the direct effect.

A5. Synthetic-control transferability – the SDID synthetic control built on the pure controls also approximates the counterfactual trajectory for the indirectly-treated units. This holds when spillover-affected units are spatially / structurally similar to directly-treated units, which is typically the case in geographic spillover settings (neighbours of treated states tend to resemble treated states).

Remark. Because the unit weights are fit on \(\mathcal{C}\) alone (the indirectly-treated units in \(\mathcal{I}_{\mathrm{sp}}\) are held out, then re-entered with uniform weight \(1/N_{\mathrm{sp}}\)), the same synthetic must stand in for the counterfactual of units it was not fit to. Serenini & Masek (2024) argue this transfer is credible in geographic settings, where a treated unit’s exposed neighbours tend to resemble the treated unit itself; where spillover-affected units look unlike the treated, this is the assumption most at risk.

Connection to existing methods#

When \(\mathbf{W} = \mathbf{0}\) (no spatial structure), the spillover column vanishes and SpSyDiD reduces to plain SDID with \(\widehat{\tau}_s = 0\).
When \(\widehat{\omega}_i = 1 / |\mathcal{C}|\) for all controls (uniform weights), SpSyDiD reduces to the Spatial Difference-in-Differences estimator of Delgado & Florax (2015).
When the panel is balanced and there is no true spillover, SpSyDiD’s \(\widehat{\tau}\) matches SDID’s ATT even if \(\mathbf{W}\) is non-trivial.

Core API#

Spatial Synthetic Difference-in-Differences (SpSyDiD) estimator.

Serenini, R., & Masek, F. (2024). “Spatial Synthetic Difference-in-Differences.” SSRN 4736857.

Extends Arkhangelsky-Athey-Hirshberg-Imbens-Wager (2021) SDID with a spatial spillover term so the estimator can disentangle two estimands that the standard SDID confounds when SUTVA is violated by geographic spillovers:

\(\widehat \tau\) – direct effect on the directly-treated units (the ATT, identical in form to standard SDID).
\(\widehat \tau_s\) – spillover coefficient per unit of neighbour-treatment exposure \((WD)_{it} = \sum_j w_{ij} D_{jt}\).

The user supplies a row-standardised \(N \times N\) spatial weight matrix \(W\) (helpers in mlsynth.utils.spsydid_helpers.spatial cover the standard constructions: k-NN from coordinates, inverse distance, queen / rook contiguity from adjacency). Donors are auto-partitioned into directly treated, spillover-exposed, and pure controls based on \(D\) and \(W\). The SDID unit / time weights are computed on the pure controls; the final WLS regression jointly estimates \(\tau\) and \(\tau_s\).

When \(W = 0\) (no spatial structure) or no donor has any treated neighbour, SpSyDiD numerically reduces to plain SDID with \(\widehat \tau_s = 0\).

class mlsynth.estimators.spsydid.SpSyDiD(config: SpSyDiDConfig | dict)#

Bases: object

Spatial Synthetic Difference-in-Differences estimator.

Parameters: config (SpSyDiDConfig or dict) – Configuration object. See mlsynth.config_models.SpSyDiDConfig.

fit() → SpSyDiDResults#: Run Algorithm 1 of Serenini & Masek (2024).

Configuration#

class mlsynth.config_models.SpSyDiDConfig(*, df: DataFrame, outcome: str, treat: str, unitid: str, time: str, display_graphs: bool = True, save: bool | str = False, counterfactual_color: List[str] = <factory>, treated_color: str = 'black', plot: PlotConfig = <factory>, spatial_matrix: Any, unit_order: List[Any] | None = None, row_standardize_spatial: bool = True)#

Configuration for the Spatial Synthetic Difference-in-Differences estimator.

Serenini & Masek (2024). “Spatial Synthetic Difference-in-Differences,” SSRN 4736857. Extends SDID (Arkhangelsky et al. 2021) with a spatial spillover term so the estimator separates the direct ATT from the indirect (spillover) effect on units exposed via the spatial weight matrix \(W\).

Parameters:

spatial_matrix (np.ndarray) – Square \(N \times N\) spatial weight matrix. Rows / columns must align with unit_order (or sorted(df[unitid].unique()) if unit_order is None). Use the helpers in mlsynth.utils.spsydid_helpers.spatial to build W from coordinates (k-NN, inverse distance) or from an adjacency list (queen / rook contiguity).
unit_order (list, optional) – Canonical ordering of unit ids matching the rows / columns of spatial_matrix. If None (default), units are ordered by sorted(df[unitid].unique()).
row_standardize_spatial (bool) – Row-standardise W internally before computing exposure. Default True. Skip when the caller has already standardised.
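Misaligned rows and columns are an easy mistake to make, so here is a short sketch of the two ways to keep spatial_matrix consistent with the unit ids. The state labels and weights are made up for illustration.

```python
import numpy as np

# W was built with rows / columns in this (non-sorted) order.
built_order = ["TX", "AZ", "NM"]
W_built = np.array([
    [0.0, 0.0, 1.0],  # TX neighbours NM
    [0.0, 0.0, 1.0],  # AZ neighbours NM
    [0.5, 0.5, 0.0],  # NM neighbours TX and AZ
])

# Option 1: keep W_built as is and pass unit_order=built_order in the config.

# Option 2: permute W into sorted order, the default used when unit_order is None.
sorted_order = sorted(built_order)                   # ['AZ', 'NM', 'TX']
perm = [built_order.index(u) for u in sorted_order]  # positions in the built order
W_sorted = W_built[np.ix_(perm, perm)]               # reorder rows and columns together
print(sorted_order)
print(W_sorted)
```

Rows and columns must be permuted together; reordering only the rows would silently attach each unit's weights to the wrong neighbours.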

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid'}#: Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

row_standardize_spatial: bool#

spatial_matrix: Any#

unit_order: List[Any] | None#

Helper Modules#

Spatial weight matrix utilities for SpSyDiD.

The estimator accepts a row-standardised \(N \times N\) spatial weight matrix \(W\) directly. These helpers cover the common ways one builds \(W\) in practice, so users can either plug in their own matrix or construct one from coordinates / adjacency information without an external dependency on libpysal.

mlsynth.utils.spsydid_helpers.spatial.contiguity_weights(adjacency: Dict[int, Iterable[int]], unit_order: Sequence, row_standardized: bool = True) → ndarray#

Build a contiguity (queen / rook) spatial weight matrix.

Parameters:

adjacency (dict) – {unit_id: iterable of neighbour unit_ids}.
unit_order (sequence) – Length-N canonical ordering of unit ids matching the panel.
row_standardized (bool) – Divide each row by its row sum (i.e., uniform weight 1/k_i across neighbours).
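To make the construction concrete, the following NumPy sketch shows what a contiguity matrix built from an adjacency dictionary could look like. It is an illustration under assumptions (the name `contiguity_sketch` is hypothetical, and units with no neighbours are assumed to keep an all-zero row), not the helper's actual source.

```python
import numpy as np

def contiguity_sketch(adjacency, unit_order, row_standardized=True):
    """Binary contiguity matrix from {unit: neighbours}, optionally row-standardised."""
    index = {u: k for k, u in enumerate(unit_order)}
    W = np.zeros((len(unit_order), len(unit_order)))
    for u, neighbours in adjacency.items():
        for v in neighbours:
            if u != v:  # keep the diagonal at zero
                W[index[u], index[v]] = 1.0
    if row_standardized:
        row_sums = W.sum(axis=1, keepdims=True)
        # Rows with no neighbours stay at zero instead of dividing by zero.
        W = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)
    return W

adjacency = {1: [2], 2: [1, 3], 3: [2], 4: []}  # unit 4 is an island
W_contig = contiguity_sketch(adjacency, unit_order=[1, 2, 3, 4])
print(W_contig)
```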

mlsynth.utils.spsydid_helpers.spatial.inverse_distance_weights(coords: ndarray, cutoff: float | None = None, power: float = 1.0, row_standardized: bool = True) → ndarray#

Build an inverse-distance spatial weight matrix.

\(w_{ij} = 1 / d(i, j)^{\text{power}}\) for i != j, zero elsewhere. Entries beyond cutoff (Euclidean distance) are set to zero.
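The formula can be sketched in a few lines of NumPy. The function name `inverse_distance_sketch` is hypothetical, and giving coincident points (distance zero) a weight of zero is an assumption, since the source does not say how the helper handles them.

```python
import numpy as np

def inverse_distance_sketch(coords, cutoff=None, power=1.0, row_standardized=True):
    """w_ij = 1 / d(i, j)^power for i != j, zeroed beyond an optional cutoff."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))  # pairwise Euclidean distances
    W = np.zeros_like(dist)
    keep = dist > 0  # drops the diagonal (and coincident points, by assumption)
    if cutoff is not None:
        keep &= dist <= cutoff
    W[keep] = 1.0 / dist[keep] ** power
    if row_standardized:
        row_sums = W.sum(axis=1, keepdims=True)
        W = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)
    return W

coords = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
W_full = inverse_distance_sketch(coords)             # no cutoff
W_cut = inverse_distance_sketch(coords, cutoff=1.5)  # only the 0-1 pair survives
print(W_full)
print(W_cut)
```

With a cutoff of 1.5 the third point has no neighbour within range, so its row is all zeros and it can never be classified as spillover-exposed.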

mlsynth.utils.spsydid_helpers.spatial.knn_weights(coords: ndarray, k: int, row_standardized: bool = True) → ndarray#

Build a \(k\)-nearest-neighbour spatial weight matrix from coords.

Parameters:

coords (np.ndarray) – Shape (N, d) of unit coordinates in some metric space (e.g. (lat, lon) or projected (x, y)). Euclidean distance is used; project to a metric CRS for geographic data.
k (int) – Number of neighbours per unit (excluding self).
row_standardized (bool) – If True (default), divide each row by k so weights sum to 1.
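A minimal sketch of a \(k\)-nearest-neighbour construction follows, assuming Euclidean distance and index-order tie-breaking (the tie-breaking rule of the actual helper is not stated in the source). The name `knn_sketch` is hypothetical.

```python
import numpy as np

def knn_sketch(coords, k, row_standardized=True):
    """Put weight on each unit's k nearest neighbours (self excluded)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a unit is never its own neighbour
    nearest = np.argsort(dist, axis=1, kind="stable")[:, :k]
    W = np.zeros_like(dist)
    np.put_along_axis(W, nearest, 1.0, axis=1)
    return W / k if row_standardized else W

# Four units on a line at positions 0, 1, 3 and 7.
coords = np.array([[0.0], [1.0], [3.0], [7.0]])
W_knn = knn_sketch(coords, k=2)
print(W_knn)
```

Note that \(k\)-NN weights are generally asymmetric: the unit at position 7 counts the unit at position 1 among its two nearest neighbours, but not the other way round.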

mlsynth.utils.spsydid_helpers.spatial.row_standardize(W: ndarray, warn_isolated: bool = False) → ndarray#

Divide each row of W by its row sum.

Rows with zero sum (units with no neighbours) are left as zero. The paper’s algorithm assumes row-standardised \(W\) so the spillover term \((WD)_{it} = \sum_j w_{ij} D_{jt}\) lies in \([0, 1]\).

Parameters:

W (np.ndarray) – Non-negative weight matrix.
warn_isolated (bool) – If True, emit a RuntimeWarning when one or more rows sum to zero (units with no spatial neighbours). Such units can never be classified as spillover-exposed and contribute a constant-zero exposure column, which is easy to miss. Off by default to keep the low-level helper quiet; enabled at the mlsynth.utils.spsydid_helpers.setup.prepare_spsydid_inputs() boundary.
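The behaviour described above, including the zero-row case, can be sketched as follows. This is an illustrative re-implementation with a hypothetical name and warning message, not the helper's source.

```python
import warnings

import numpy as np

def row_standardize_sketch(W, warn_isolated=False):
    """Divide each row by its sum; rows that sum to zero are left as zero."""
    W = np.asarray(W, dtype=float)
    row_sums = W.sum(axis=1, keepdims=True)
    isolated = np.flatnonzero(row_sums.ravel() == 0)
    if warn_isolated and isolated.size:
        warnings.warn(
            f"{isolated.size} unit(s) have no spatial neighbours",  # made-up message
            RuntimeWarning,
        )
    return np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)

W_raw = np.array([
    [0.0, 2.0, 2.0],
    [1.0, 0.0, 3.0],
    [0.0, 0.0, 0.0],  # isolated unit: no neighbours
])
W_std = row_standardize_sketch(W_raw)
print(W_std)
```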

mlsynth.utils.spsydid_helpers.spatial.validate_spatial_matrix(W: ndarray, n_units: int) → ndarray#

Sanity-check W and return a float-array copy.

Checks:

shape is (n_units, n_units);
entries are finite and non-negative;
diagonal is zero (a unit is not its own spatial neighbour).

The matrix is not automatically row-standardised here – pass it through row_standardize() first if needed.

Micro-panel data preparation for SpSyDiD.

Converts a long-format panel + a spatial weight matrix into the SpSyDiDInputs container expected by run_spsydid(). The donor pool is auto-partitioned into three classes following Serenini & Masek (2024):

Directly treated – units with \(D_{it} = 1\) for some t.
Indirectly treated (spillover-exposed) – units with \(D_{it} = 0\) for all t but \((WD)_{it} > 0\) for some t, i.e. they have at least one spatial neighbour who is treated.
Pure controls – units with \(D = 0\) and \((WD) = 0\) for all t. Only these are used to fit the SDID unit / time weights.

mlsynth.utils.spsydid_helpers.setup.prepare_spsydid_inputs(df: DataFrame, outcome: str, treat: str, unitid: str, time: str, spatial_matrix: ndarray, unit_order: Sequence | None = None, row_standardize_spatial: bool = True) → SpSyDiDInputs#

Pivot a long-format panel into SpSyDiDInputs.

Parameters:

df (pd.DataFrame) – Balanced long panel with columns unitid, time, outcome, treat.
outcome, treat, unitid, time (str) – Column names.
spatial_matrix (np.ndarray) – Square (N, N) spatial weight matrix. Rows / columns must be ordered consistently with the units in the panel (use unit_order to fix the ordering; otherwise sorted unique unitid values are used).
unit_order (sequence, optional) – Canonical ordering of unit ids that matches the rows / columns of spatial_matrix. If None (default), units are ordered by sorted(df[unitid].unique()).
row_standardize_spatial (bool) – If True (default), row-standardise W before storing it. Skip when the caller has already standardised.
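The pivot step can be illustrated with pandas alone. The sketch below uses made-up column names and data, and derives the number of pre-treatment periods as the position of the first period in which any unit is treated; how the library performs these steps internally is not shown in the source.

```python
import numpy as np
import pandas as pd

# Small balanced long panel (rows deliberately out of order).
df = pd.DataFrame({
    "unit": ["b", "a", "b", "a", "a", "b"],
    "time": [2, 1, 1, 3, 2, 3],
    "y":    [1.2, 0.1, 1.0, 0.4, 0.3, 3.5],
    "D":    [0, 0, 0, 0, 0, 1],
})

unit_order = sorted(df["unit"].unique())  # default ordering when unit_order is None
Y = df.pivot(index="unit", columns="time", values="y").loc[unit_order].to_numpy()
D = df.pivot(index="unit", columns="time", values="D").loc[unit_order].to_numpy()

# Number of pre-treatment periods: periods before anyone is treated.
treated_any = (D > 0).any(axis=0)
T0 = int(np.argmax(treated_any)) if treated_any.any() else D.shape[1]
print(Y.shape, T0)
```

The rows of `Y` and `D` follow `unit_order`, which is the ordering the spatial matrix must share.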

SDID weight-computation primitives duplicated for SpSyDiD.

These functions are intentionally duplicated from mlsynth/utils/sdid_helpers/weights.py rather than imported. The duplication isolates SpSyDiD from future changes to the SDID pipeline so silent behavioural drift cannot occur. If the upstream SDID formulas change, this module should be updated deliberately.

Wraps the Arkhangelsky-Athey-Hirshberg-Imbens-Wager (2021) unit-weight QP, time-weight QP, and the \(\zeta = (N_{\text{tr}} T_{\text{post}})^{1/4} \cdot \mathrm{std}(\Delta Y)\) regularisation rule (matching the authors’ functions_ssdid.calculate_regularization).

mlsynth.utils.spsydid_helpers.weights.compute_regularization(donor_outcomes_pre: ndarray, num_post_periods: int, num_treated_units: int = 1) → float#

SDID \(\zeta = (N_{\text{tr}} T_{\text{post}})^{1/4} \cdot \mathrm{sd}(\Delta Y)\).

The standard deviation is of the first-differenced pre-period donor outcomes (Arkhangelsky et al. 2021 Section 3). The tuning count is the number of directly-treated-unit post-period observations \(N_{\text{tr}} T_{\text{post}}\), matching the authors’ reference functions_ssdid.calculate_regularization (serenini/spatial_SDID), whose n_treated_post counts the treated-and-post rows. num_treated_units defaults to 1, so a single directly-treated unit reduces to the \(T_{\text{post}}^{1/4}\) form and leaves such designs unchanged.

mlsynth.utils.spsydid_helpers.weights.fit_time_weights(donor_outcomes_pre: ndarray, mean_donor_outcomes_post: ndarray) → Tuple[float | None, ndarray | None]#

SDID time-weight QP.

Solve for (beta_0, lambda) minimising \(\| \beta_0 \mathbf 1 + \lambda^\top \mathrm{Y}_{0,\mathrm{pre}} - \bar y_{0,\mathrm{post}} \|_2^2\) subject to sum(lambda) == 1 and lambda >= 0.

mlsynth.utils.spsydid_helpers.weights.fit_unit_weights(donor_outcomes_pre: ndarray, mean_treated_outcome_pre: ndarray, zeta: float) → Tuple[float | None, ndarray | None]#

SDID unit-weight QP.

Solve for (omega_0, omega) minimising \(\| \omega_0 \mathbf 1 + \mathrm Y_{0,\mathrm{pre}} \omega - \bar y_{1,\mathrm{pre}} \|_2^2 + T_0 \zeta^2 \|\omega\|_2^2\) subject to sum(omega) == 1 and omega >= 0.
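For intuition, the unit-weight problem can be solved with projected gradient descent on the simplex. This is a swapped-in technique chosen to keep the sketch dependency-free, not necessarily the solver the library uses. The names `project_to_simplex` and `unit_weights_sketch` and the `(T0, n_donors)` layout are assumptions. The intercept \(\omega_0\) is profiled out by centring over time.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {x >= 0, sum(x) = 1} (sort-based method)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def unit_weights_sketch(Y_pre, y_target, zeta, n_iter=2000):
    """Y_pre: (T0, n_donors); y_target: (T0,). Returns (omega_0, omega)."""
    T0, n = Y_pre.shape
    Yc = Y_pre - Y_pre.mean(axis=0)  # centring over time removes omega_0
    yc = y_target - y_target.mean()
    ridge = T0 * zeta ** 2
    # Step size 1 / L, with L the Lipschitz constant of the gradient.
    step = 1.0 / (2.0 * (np.linalg.norm(Yc, 2) ** 2 + ridge) + 1e-12)
    omega = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        grad = 2.0 * (Yc.T @ (Yc @ omega - yc) + ridge * omega)
        omega = project_to_simplex(omega - step * grad)
    omega_0 = float((y_target - Y_pre @ omega).mean())
    return omega_0, omega

# Three donors with orthogonal pre-period paths; the target mixes the first two.
c1 = np.array([1.0, -1.0, 0.0, 0.0, 0.0, 0.0])
c2 = np.array([0.0, 0.0, 1.0, -1.0, 0.0, 0.0])
c3 = np.array([0.0, 0.0, 0.0, 0.0, 1.0, -1.0])
Y_pre = np.column_stack([c1, c2, c3]) + 5.0
y_target = 3.0 + 0.7 * Y_pre[:, 0] + 0.3 * Y_pre[:, 1]

omega_0, omega = unit_weights_sketch(Y_pre, y_target, zeta=0.0)
_, omega_ridge = unit_weights_sketch(Y_pre, y_target, zeta=10.0)
print(omega_0, omega, omega_ridge)
```

Without regularisation the sketch recovers the planted mixture; a large \(\zeta\) pulls the weights toward the uniform vector, which is the role the ridge term plays in the objective.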

Orchestration pipeline for Spatial Synthetic Difference-in-Differences.

Implements Algorithm 1 of Serenini & Masek (2024):

Compute SDID unit / time weights using only the pure controls as donors (Arkhangelsky et al. 2021 QPs duplicated in weights).
Fix the per-unit weight as \(\omega_i = 1 / N_{\mathrm{tr}}\) for directly-treated units, \(\omega_i = 1 / N_{\mathrm{sp}}\) for indirectly-treated units, and SDID-fit \(\omega_i\) for pure controls.
Run the weighted two-way FE regression

\[(\widehat \tau, \widehat \tau_s, \widehat \mu, \widehat \alpha, \widehat \beta) = \arg \min \sum_{i, t} \bigl[ Y_{it} - \mu - \alpha_i - \beta_t - \tau D_{it} - \tau_s (WD)_{it} \bigr]^2 \widehat \omega_i\, \widehat \lambda_t.\]

The augmented design recovers the direct effect \(\widehat \tau\) and the spillover coefficient \(\widehat \tau_s\) jointly.
The implied population ATE is \(\widehat \tau (1 + \overline{WD})\) with \(\overline{WD}\) the average exposure across directly + indirectly treated units (paper eq. 14).

mlsynth.utils.spsydid_helpers.pipeline.run_spsydid(inputs: SpSyDiDInputs) → SpSyDiDResults#: Run Algorithm 1 of Serenini & Masek (2024).

Frozen dataclasses for the Spatial Synthetic Difference-in-Differences estimator.

Serenini & Masek (2024). “Spatial Synthetic Difference-in-Differences.” SSRN 4736857. Extends Arkhangelsky-Athey-Hirshberg-Imbens-Wager (2021) SDID with a spatial spillover term \(\tau_s\) so the estimator can disentangle the direct ATT on the directly-treated units from the indirect (spillover) effect on units exposed via a spatial weight matrix \(W\).

Two estimands fall out of one regression:

\(\widehat \tau\) – direct effect on the directly-treated units (the ATT, identical in form to standard SDID).
\(\widehat \tau_s\) – spillover effect per unit of neighbour-treatment exposure \((WD)_{it} = \sum_j w_{ij} D_{jt}\).

class mlsynth.utils.spsydid_helpers.structures.SpSyDiDInputs(outcome_matrix: ndarray, treatment_matrix: ndarray, spatial_matrix: ndarray, exposure_matrix: ndarray, unit_names: List[Any], time_labels: ndarray, T: int, T0: int, direct_indices: ndarray, spillover_indices: ndarray, pure_control_indices: ndarray)#

Preprocessed panel + spatial weights for SpSyDiD.

outcome_matrix#

(N, T) panel of outcomes, ordered by unit_names.

Type:: np.ndarray

treatment_matrix#

(N, T) panel of 0/1 treatment indicators.

Type:: np.ndarray

spatial_matrix#

Row-standardised spatial weight matrix, shape (N, N), ordered consistently with unit_names.

Type:: np.ndarray

exposure_matrix#

Pre-computed spillover exposure \((WD)_{it} = \sum_j w_{ij} D_{jt}\), shape (N, T).

Type:: np.ndarray

unit_names#

Length-N ordering of unit ids.

Type:: list

time_labels#

Length-T ordering of time-period labels.

Type:: np.ndarray

T#

Total number of panel periods.

Type:: int

T0#

Number of pre-treatment periods (largest t such that no unit is treated for t' <= t).

Type:: int

direct_indices#

Indices of directly treated units (those with D=1 at some t).

Type:: np.ndarray

spillover_indices#

Indices of indirectly treated units (D=0 always but (WD)_it > 0 at some t).

Type:: np.ndarray

pure_control_indices#

Indices of pure controls (D=0 and (WD)=0 for all t).

Type:: np.ndarray

property N: int#

property N_direct: int#

property N_pure: int#

property N_spillover: int#

T: int#

T0: int#

direct_indices: ndarray#

exposure_matrix: ndarray#

outcome_matrix: ndarray#

pure_control_indices: ndarray#

spatial_matrix: ndarray#

spillover_indices: ndarray#

time_labels: ndarray#

treatment_matrix: ndarray#

unit_names: List[Any]#

class mlsynth.utils.spsydid_helpers.structures.SpSyDiDResults(*, effects: EffectsResults | None = None, fit_diagnostics: FitDiagnosticsResults | None = None, time_series: TimeSeriesResults | None = None, weights: WeightsResults | None = None, inference: InferenceResults | None = None, method_details: MethodDetailsResults | None = None, sub_method_results: Dict[str, Any] | None = None, additional_outputs: Dict[str, Any] | None = None, raw_results: Dict[str, Any] | None = None, execution_summary: Dict[str, Any] | None = None, plot_config: PlotConfig | None = None, inputs: SpSyDiDInputs, aite: float, ate: float, unit_weights: Dict[Any, float], time_weights: ndarray, zeta: float, metadata: Dict[str, Any] = <factory>)#

Top-level container returned by mlsynth.SpSyDiD.fit().

The container is an EffectResult (the observational report). SpSyDiD is a spillover decomposition: the direct effect \(\widehat{\tau}\) (ATT) drives the standardized surface, so the flat accessors att / counterfactual / gap / pre_rmse describe the directly-treated group (observed mean vs the pure-control SDID synthetic). The indirect (aite) and total (ate) effects, which have no single counterfactual path, are kept as typed fields below. The pure-control SDID unit weights live in the standardized weights slot (with the time weights in summary_stats).

Parameters:

inputs (SpSyDiDInputs) – Preprocessed panel + W matrix + auto-detected partition.
aite (float) – Indirect (spillover) coefficient per unit of exposure, \(\widehat{\tau}_s\). Multiply by the average exposure \(\overline{WD}\) to recover the population-level spillover \(\widehat{\tau}_s\, \overline{WD}\).
ate (float) – Implied population-level ATE \(\widehat{\tau} \cdot (1 + \bar{WD})\) per the paper’s eq. 14, with \(\bar{WD}\) the average exposure across the directly + indirectly treated units.
unit_weights (dict) – Mapping {unit_name: omega} – the per-unit weights used in the final WLS regression (SDID-style for pure controls, uniform \(1/N_{tr}\) for directly treated and \(1/N_{sp}\) for indirectly treated).
time_weights (np.ndarray) – Length-T0 SDID time weights for the pre-period (post-period weights are uniform \(1/T_{\text{post}}\) and not stored).
zeta (float) – SDID regularisation parameter from Arkhangelsky et al. 2021 (used in the unit-weight QP for pure controls).
metadata (dict) – Free-form diagnostics (mean exposure, partition sizes, etc.).

aite: float#

ate: float#

inputs: SpSyDiDInputs#

metadata: Dict[str, Any]#

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'frozen': True, 'json_encoders': {<class 'numpy.ndarray'>: <function BaseEstimatorResults.Config.<lambda>>}}#: Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

property tau: float#

Alias matching paper notation: direct effect.

property tau_s: float#

Alias matching paper notation: spillover coefficient.

time_weights: np.ndarray#

unit_weights: Dict[Any, float]#

zeta: float#

Example#

A self-contained one-draw Monte Carlo on an \(8 \times 8\) spatial grid. Six well-spaced units receive treatment of magnitude \(\tau = 2.0\); units that count a treated unit among their \(k = 4\) nearest neighbours absorb a spillover of \(\tau_s = 1.0\) per unit of exposure. SpSyDiD, given the same \(\mathbf{W}\), should recover both planted effects up to simulation noise.

"""One draw of a spatial spillover simulation."""

import numpy as np
import pandas as pd

from mlsynth import SpSyDiD
from mlsynth.utils.spsydid_helpers.spatial import knn_weights


# ---------------------------------------------------------------------
# 1. Lay out an 8x8 grid of units
# ---------------------------------------------------------------------

rng = np.random.default_rng(0)
xs, ys = np.meshgrid(np.arange(8), np.arange(8))
coords = np.column_stack([xs.flatten(), ys.flatten()])
N = coords.shape[0]
T_pre, T_post = 16, 8
T = T_pre + T_post
W = knn_weights(coords, k=4, row_standardized=True)


# ---------------------------------------------------------------------
# 2. Two-way FE DGP with planted direct + spillover effects
# ---------------------------------------------------------------------

tau_true = 2.0
tau_s_true = 1.0

unit_fe = rng.standard_normal(N) * 0.5
time_fe = np.linspace(0.0, 1.0, T)
Y0 = (
    unit_fe[:, None]
    + time_fe[None, :]
    + rng.standard_normal((N, T)) * 0.2
)
D = np.zeros((N, T), dtype=float)
for u in (0, 7, 24, 39, 56, 63):
    D[u, T_pre:] = 1.0
Y = Y0 + tau_true * D + tau_s_true * (W @ D)


# ---------------------------------------------------------------------
# 3. Long DataFrame
# ---------------------------------------------------------------------

rows = [
    {"unit": i, "time": t, "y": float(Y[i, t]), "D": float(D[i, t])}
    for i in range(N)
    for t in range(T)
]
df = pd.DataFrame(rows)


# ---------------------------------------------------------------------
# 4. Fit SpSyDiD
# ---------------------------------------------------------------------

res = SpSyDiD({
    "df": df,
    "outcome": "y",
    "treat": "D",
    "unitid": "unit",
    "time": "time",
    "spatial_matrix": W,
}).fit()


# ---------------------------------------------------------------------
# 5. Inspect the output
# ---------------------------------------------------------------------

print(f"true tau   = {tau_true:+.3f}    tau_hat   = {res.att:+.3f}")
print(f"true tau_s = {tau_s_true:+.3f}    tau_s_hat = {res.aite:+.3f}")
print(f"ATE        = {res.ate:+.3f}")
print(f"partition  : {res.inputs.N_direct} direct, "
      f"{res.inputs.N_spillover} spillover, "
      f"{res.inputs.N_pure} pure controls")
print(f"mean post-period exposure on treated union = "
      f"{res.metadata['mean_exposure_post_treated']:.3f}")
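
The partition printed in step 5 follows from the exposure \(e_{it} = \sum_j w_{ij} d_{jt}\). As a rough illustration, the NumPy-only sketch below recomputes exposure and the direct / spillover / pure-control split for the same grid and treated units. The function knn_row_standardized is a hypothetical stand-in written for this sketch, not the library's knn_weights; it may break distance ties differently, so its counts need not match res.inputs.

```python
import numpy as np


def knn_row_standardized(coords: np.ndarray, k: int) -> np.ndarray:
    """Hypothetical stand-in: row-standardised k-nearest-neighbour weights."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a unit is never its own neighbour
    nearest = np.argsort(dist, axis=1, kind="stable")[:, :k]
    W = np.zeros_like(dist)
    np.put_along_axis(W, nearest, 1.0 / k, axis=1)  # each row sums to 1
    return W


# Same layout as the example: 8x8 grid, 16 pre and 8 post periods.
xs, ys = np.meshgrid(np.arange(8), np.arange(8))
coords = np.column_stack([xs.ravel(), ys.ravel()])
N, T_pre, T_post = coords.shape[0], 16, 8
W = knn_row_standardized(coords, k=4)

D = np.zeros((N, T_pre + T_post))
D[[0, 7, 24, 39, 56, 63], T_pre:] = 1.0

# Exposure is the weighted share of treated neighbours in each period.
exposure = W @ D

# Partition units by their status in the last post-treatment period.
direct = D[:, -1] > 0
spillover = (exposure[:, -1] > 0) & ~direct
pure = ~direct & ~spillover

print(f"{direct.sum()} direct, {spillover.sum()} spillover, {pure.sum()} pure")
```

Because W is row-standardised, exposure lies in [0, 1] and equals the fraction of a unit's neighbours that are treated.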

Verification (Path-B Monte Carlo)#

Serenini & Masek (2024) include an empirical example (the Arizona 2007 LAWA effect on noncitizen Hispanic share, Tables 8-11) but do not release the CPS panel used to construct it – their public replication repo (serenini/spatial_SDID) ships only the simulation code and a BLS unemployment panel for two Monte Carlo exercises. We therefore satisfy the Path-B contract by reproducing those two simulation findings against the authors’ own driver (functions_ssdid.py in their repo), invoking SpSyDiD(config).fit() end-to-end on every replication.

The state-level finding is institutionalised as a per-replication cross-validation benchmark – benchmarks/cases/spsydid_state_mc.py runs both SpSyDiD and the authors’ own algorithm on each panel and asserts per-rep agreement; see the dedicated page SpSyDiD — Spatial Synthetic-DiD (Serenini & Masek 2024).

A real-data differential cross-check complements the two simulations. The paper’s Section-4 outcome (the noncitizen-Hispanic share) is not reconstructible from public in-repo data, but the SpSyDiD machinery can still be exercised on the genuine LAWA setting: benchmarks/cases/spsydid_lawa_diff.py aggregates the Arizona LAWA CPS extract (basedata/cps_lawa_arizona.parquet, shared with Synthetic Control with Differencing (SCD)) to state-mean log weekly earnings, treats Arizona from period 55 with the queen-contiguity W restricted to the 45 common states, and compares SpSyDiD(config).fit() against the authors’ own fit_unit_weights / fit_time_weights. The SDID quadratic programs are shared exactly – the unit weights agree to \(\sim 10^{-9}\) – so once the reference is placed on mlsynth’s canonical row-weight convention the direct ATT and the spillover coefficient agree to \(\sim 10^{-8}\).

The only substantive difference between the two codebases is a convention the paper leaves underspecified. The authors’ join_weights fills the post-period time weight with \(T_{post}/T\) and the affected-unit weight with the mean treated weight, whereas mlsynth uses the canonical synthetic-DiD choices (\(1/T_{post}\) for the post-period time weights, \(1/N_{sp}\) for the affected units). With the large injected effect of the simulations the two conventions are indistinguishable; on a small real effect they can differ at the third decimal. mlsynth follows the canonical convention in both cases.
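
To make the two fills concrete, the arithmetic can be sketched as follows. The panel shape is borrowed from the grid example above; the number of affected units and the treated-unit weights are made-up values, and the dictionary layout is purely illustrative (it is not how mlsynth or join_weights store weights).

```python
import numpy as np

T_pre, T_post = 16, 8                   # panel shape from the grid example
N_sp = 10                               # hypothetical number of affected units
treated_weights = np.full(6, 1.0 / 6)   # hypothetical treated-unit weights

# Canonical synthetic-DiD fill, as described for mlsynth.
canonical = {
    "post_time": 1.0 / T_post,
    "affected_unit": 1.0 / N_sp,
}

# Fill described for the authors' join_weights.
reference = {
    "post_time": T_post / (T_pre + T_post),
    "affected_unit": float(treated_weights.mean()),
}

for name in ("post_time", "affected_unit"):
    print(f"{name:14s} canonical={canonical[name]:.4f} "
          f"reference={reference[name]:.4f}")
```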

The reference panels and adjacency matrices ship with mlsynth in basedata/:

state_unemployment.parquet – BLS monthly state unemployment 1976-2014.
US_no_islands_matrix.gal – queen-contiguity W for the 49 contiguous states.
spsydid_bls_county_subset.csv – the BLS county-employment slice (2002-2004, states WY/OR/PA/AL) used in the county-level MC.
spsydid_county_matrices.pkl – per-state county adjacency matrices.

State-level Monte Carlo (40 rolling-window replications)#

Reproduces State_Level_Simulations.ipynb: at each 3-year window starting in 1975..2014, treat Arkansas (FIPS 5) only and inject \(\text{ATT} = 25\%\) of mean unemployment plus \(\rho = 0.8\) spillover via the queen-contiguity W. We compare the authors’ reference algorithm against SpSyDiD(config).fit() on the same 40 panels to test for per-rep agreement.

                       ref-mean    ref-sd    mlsynth-mean    mlsynth-sd
ATT bias               +0.0187    0.3204         +0.0189        0.3229
rho bias (tau_s/ATT)   +0.0596    0.9228         +0.0669        0.9965

per-rep correlation:  ATT 0.9917      rho 0.9948

Both estimators recover the paper’s headline finding: the mean ATT bias is essentially zero (~0.019 against an ATT magnitude of ~1.5 percentage points). Per replication, the two implementations agree to ~0.02 on every panel realisation; the small residual comes from the unit weight assigned to affected rows (mlsynth: \(1/N_{sp}\); reference: mean of the treated-unit SDID weights). Both choices are applied after the SDID weight QPs have been solved, so neither alters the fitted weights themselves.
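
The summary rows in the table (bias mean, bias sd, per-rep correlation) are simple statistics over per-replication estimates. A minimal sketch of how such numbers could be computed is given below; the helper names are hypothetical, the estimates are synthetic draws rather than benchmark output, and the sample standard deviation (ddof=1) is an assumption about the convention used.

```python
import numpy as np


def bias_summary(estimates, truth):
    """Mean and sample sd of (estimate - truth) across replications."""
    bias = np.asarray(estimates, dtype=float) - np.asarray(truth, dtype=float)
    return float(bias.mean()), float(bias.std(ddof=1))


def per_rep_agreement(a, b):
    """Pearson correlation and largest absolute gap between two estimators."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    corr = float(np.corrcoef(a, b)[0, 1])
    max_gap = float(np.max(np.abs(a - b)))
    return corr, max_gap


# Synthetic stand-ins for 40 replications (NOT the benchmark's numbers).
rng = np.random.default_rng(0)
truth = np.full(40, 1.5)
ref_hat = truth + rng.normal(0.0, 0.3, size=40)
mls_hat = ref_hat + rng.normal(0.0, 0.01, size=40)

print("reference bias (mean, sd):", bias_summary(ref_hat, truth))
print("mlsynth   bias (mean, sd):", bias_summary(mls_hat, truth))
print("agreement (corr, max gap):", per_rep_agreement(ref_hat, mls_hat))
```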

The driver is examples/spsydid/replicate_state_level_mc.py; run with python -m examples.spsydid.replicate_state_level_mc --reps 40.
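
The injection step of the state-level design can be sketched on a toy panel. The sketch assumes the spillover coefficient is \(\rho\) times the ATT (consistent with the table's "tau_s/ATT" label) and that the ATT is 25% of the panel-wide mean outcome; both are assumptions about the drivers, and inject_effects is a hypothetical helper, not part of mlsynth or the authors' code.

```python
import numpy as np


def inject_effects(Y0, W, treated_idx, T_pre, att_share=0.25, rho=0.8):
    """Add a direct effect and a W-weighted spillover to an untreated panel.

    Assumption: att = att_share * mean(Y0) and tau_s = rho * att.
    """
    Y0 = np.asarray(Y0, dtype=float)
    N, T = Y0.shape
    D = np.zeros((N, T))
    D[treated_idx, T_pre:] = 1.0
    att = att_share * Y0.mean()
    tau_s = rho * att
    Y = Y0 + att * D + tau_s * (W @ D)
    return Y, D, att, tau_s


# Toy setting: 5 units on a line, each adjacent to its immediate neighbours.
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0
W = A / A.sum(axis=1, keepdims=True)   # row-standardised contiguity

Y0 = np.full((5, 6), 6.0)              # flat 6% "unemployment" everywhere
Y, D, att, tau_s = inject_effects(Y0, W, treated_idx=2, T_pre=4)

print("att =", att, " tau_s =", tau_s)
print("post-period outcomes:", Y[:, -1])
```

In this toy panel only the treated unit's immediate neighbours receive a spillover, each scaled by the share of its neighbours that are treated.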

County-level Monte Carlo (4 states x 200 reps)#

Reproduces Monte_Carlo_Simulations.ipynb: for each of WY, OR, PA, AL, randomly draw 10% of counties as directly treated (multiple treated units per rep), inject \(\text{ATT} = -25\%\) of mean unemployment plus \(\rho = 0.5\) spillover, fit SpSyDiD(config).fit(), repeat. The four states span 23-67 counties, so the test is whether the SUTVA correction works across panel sizes.

state  #counties  #treated   ATT bias mean   ATT bias sd   AITE bias mean   AITE bias sd
WY        23         2          -0.003          0.260          -0.018           0.143
OR        36         4          -0.023          0.225          -0.022           0.183
PA        67         7          +0.034          0.246          +0.062           0.127
AL        67         7          +0.028          0.228          -0.000           0.139

In every row the absolute mean ATT bias is below 0.04 against an injected ATT of roughly -1.5: the SpSyDiD correction cleanly removes the spatial-DGP-induced bias of plain SDID at the county scale.

The driver is examples/spsydid/replicate_county_level_mc.py; run with python -m examples.spsydid.replicate_county_level_mc --reps 200.
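
The random assignment step of the county-level design can be sketched as below. Rounding 10% of the county count to the nearest integer reproduces the #treated column of the table, but the exact rule used by the drivers is an assumption, and draw_treated is a hypothetical helper.

```python
import numpy as np

# County counts as reported in the table above.
COUNTIES = {"WY": 23, "OR": 36, "PA": 67, "AL": 67}


def draw_treated(n_units: int, share: float, rng: np.random.Generator):
    """Draw a random subset of units as directly treated (no replacement).

    Assumption: the subset size is share * n_units rounded to the nearest
    integer, with at least one treated unit.
    """
    n_treated = max(1, round(share * n_units))
    return np.sort(rng.choice(n_units, size=n_treated, replace=False))


rng = np.random.default_rng(0)
draws = {state: draw_treated(n, 0.10, rng) for state, n in COUNTIES.items()}

for state, idx in draws.items():
    print(f"{state}: {COUNTIES[state]} counties, {len(idx)} treated")
```

Each replication would redraw the treated set, inject the effects, and refit, so the reported bias averages over both the noise and the treatment assignment.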

References#

Arkhangelsky, D., Athey, S., Hirshberg, D. A., Imbens, G. W., & Wager, S. (2021). “Synthetic Difference-in-Differences.” American Economic Review 111(12):4088-4118.

Delgado, M. S., & Florax, R. J. G. M. (2015). “Difference-in-Differences Techniques for Spatial Data: Local Autocorrelation and Spatial Interaction.” Economics Letters 137:123-126.

Serenini, R., & Masek, F. (2024). “Spatial Synthetic Difference-in-Differences.” SSRN Working Paper 4736857.
