Staggered Synthetic Control (SSC)#
When to Use This Estimator#
SSC implements the staggered-adoption synthetic-control estimator of Cao, Lu
and Wu [SSC]. It is for the setting where many units adopt a policy at
different times and you have a long pre-treatment history relative to the
number of units and post-periods (large \(T\), moderate \(N\), small
\(S\) – e.g. monthly or weekly outcomes for a few dozen jurisdictions).
Two features distinguish it from the alternatives.
First, it uses every unit – including not-yet-treated units – as a donor. Each unit’s untreated outcome is modelled as an intercept plus a simplex synthetic control on all the other units. It therefore does not require a pool of never-treated units (existing staggered SC methods lean heavily on them, and degrade when treated units are the majority), and it does not rely on parallel trends (unlike staggered difference-in-differences).
Second, it delivers valid inference for policy-relevant aggregates. All individual unit-by-time effects are estimated jointly; the target is any linear functional \(\gamma = L\tau\) – the event-time ATT, the overall ATT, or a contrast between two policies. Inference is Andrews’ (2003) end-of-sample stability test, whose reference distribution is built from pre-treatment residual windows, and which can test both sharp and non-sharp nulls.
Reach for SSC when adoption is staggered, the pre-period is long,
never-treated units are scarce or absent, and you want an event-study
of dynamic effects with confidence bands. It is well suited to high-frequency
aggregate outcomes (crime rates, prices, bond yields) for a moderate number of
units.
Do not use SSC when#
The pre-period is short. SSC’s guarantees and its end-of-sample inference are large-\(T\) (they need \(T_0 > S\) clean pre-periods, and more in practice). With a short pre-period use Synthetic Difference-in-Differences (SDID), Matrix Completion with Nuclear Norm Minimization (MCNNM), or Partially Pooled SCM (PPSCM).
There is a single treated unit or a single adoption date. SSC’s leverage comes from pooling many staggered adopters. For one treated unit start at Forward Difference-in-Differences (FDID)/Two-Step Synthetic Control; for a block of simultaneous adopters use Multivariate Square-root Lasso Synthetic Control (MSQRT) or Synthetic Difference-in-Differences (SDID).
No unit is well approximated by a convex combination of the others (the treated units sit outside the donors’ convex hull). The simplex fit will be poor; consider Matrix Completion with Nuclear Norm Minimization (MCNNM) (which regularises a latent factor model instead).
Anticipation is a concern. SSC puts not-yet-treated units in the donor pool; if units change behaviour before adoption this can bias the fit (plot the pre-trends to check).
Spillovers across units violate SUTVA – use Spatial Synthetic Difference-in-Differences (SpSyDiD) or Spillover-Aware Synthetic Control (SPILLSYNTH).
Notation#
A balanced panel of \(N\) units over \(T_0 + S\) periods, where \(T_0\) is the number of clean pre-treatment periods (before any unit adopts) and \(S\) is the number of post periods. Adoption times \((t_1, \ldots, t_N)\) are observed (\(t_i = \infty\) for never-treated units); treatment is absorbing. The observed outcome is the never-treated potential outcome before adoption and the treated one after. The individual effect is \(\tau_{i,t} = y_{i,t}(t_i) - y_{i,t}(\infty)\), and the target is a linear functional \(\gamma = L\tau\) of the stacked effect vector \(\tau \in \mathbb{R}^K\) (\(K\) = number of treated cells).
The estimator#
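Step 1 – synthetic-control weights. For each unit \(i\), fit a demeaned simplex synthetic control on all other units over the clean pre-period (paper eq. 2.1):
\[(\widehat a_i, \widehat b_i) = \arg\min_{a,\, b}\ \sum_{t=1}^{T_0} \bigl(y_{i,t} - a - Y_t' b\bigr)^2 \quad \text{s.t.}\quad b \ge 0,\ \textstyle\sum_j b_{j} = 1,\ b_{i} = 0 .\]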
Collect the intercepts \(\widehat a\) and the weight matrix \(\widehat B\) (row \(i\) is \(\widehat b_i\)), and let \(\widehat M = (I - \widehat B)'(I - \widehat B)\). The prediction error is \(u_{i,t} = y_{i,t}(\infty) - (\widehat a_i + Y_t(\infty)'\widehat b_i)\).
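The Step 1 fit for a single unit can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the package's solver: it swaps in projected gradient descent with a sort-based simplex projection, and `project_simplex` and `fit_simplex_sc` are hypothetical names introduced here.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {b >= 0, sum(b) = 1} (sort-based)."""
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index kept positive
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def fit_simplex_sc(y, X, n_iter=2000):
    """Demeaned simplex fit of y (T0,) on donor columns X (T0, J)."""
    yc = y - y.mean()
    Xc = X - X.mean(axis=0)
    # Step 1/L, where L = sigma_max(Xc)^2 bounds the gradient's Lipschitz constant.
    step = 1.0 / (np.linalg.norm(Xc, 2) ** 2 + 1e-12)
    b = np.full(X.shape[1], 1.0 / X.shape[1])  # start at uniform weights
    for _ in range(n_iter):
        grad = Xc.T @ (Xc @ b - yc)            # gradient of 0.5 * ||Xc b - yc||^2
        b = project_simplex(b - step * grad)
    a = y.mean() - X.mean(axis=0) @ b          # recover the intercept
    return a, b

# Toy check: a noise-free series that is an exact convex combination plus an intercept.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
b_true = np.array([0.2, 0.5, 0.3])
y = 2.0 + X @ b_true
a_hat, b_hat = fit_simplex_sc(y, X)
```

Repeating the fit with each unit in turn as the target, and writing the weights into row \(i\) with a zero on the diagonal, gives the matrix \(\widehat B\) used in Step 2.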
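Step 2 – joint effect estimation. With selector matrices \(A_s\) mapping \(\tau\) to the period-\((T_0+s)\) effect vector, the GLS estimator (paper eq. 2.4) is
\[\widehat\tau = \Bigl(\sum_{s=1}^{S} A_s' \widehat M A_s\Bigr)^{-1} \sum_{s=1}^{S} A_s' (I - \widehat B)' \bigl((I - \widehat B)\, Y_{T_0+s} - \widehat a\bigr) .\]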
The invertibility of \(\sum_s A_s' M A_s\) (Assumption 2.1) is the key identifying condition; its smallest eigenvalue is a useful diagnostic. The event-time ATT at horizon \(s\) is the average of \(\widehat\tau\) over cells with event time \(s\), and the overall ATT is the grand mean.
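A small NumPy sketch of the joint estimation step, assuming the least-squares form implied by the prediction-error definition above (minimise \(\sum_s \lVert (I-\widehat B)(Y_{T_0+s} - A_s\tau) - \widehat a \rVert^2\) over \(\tau\)). The function name `gls_effects` and the `cells` list are hypothetical conventions for this illustration, not the package API.

```python
import numpy as np

def gls_effects(Y_post, a_hat, B_hat, cells):
    """Solve the stacked least-squares problem for the individual effects.

    Y_post : (N, S) observed post-period outcomes.
    cells  : list of (s, i) pairs, 0-based post period and unit of each treated cell.
    """
    N, S = Y_post.shape
    K = len(cells)
    R = np.eye(N) - B_hat            # I - B
    M = R.T @ R
    gram = np.zeros((K, K))
    rhs = np.zeros(K)
    for s in range(S):
        A_s = np.zeros((N, K))       # selector: tau -> period-s effect vector
        for k, (s_k, i_k) in enumerate(cells):
            if s_k == s:
                A_s[i_k, k] = 1.0
        gram += A_s.T @ M @ A_s
        rhs += A_s.T @ R.T @ (R @ Y_post[:, s] - a_hat)
    return np.linalg.solve(gram, rhs), gram

# Toy panel: 4 units with equal weights on the others, zero intercepts, and
# untreated outcomes identical across units, so the prediction error is zero.
N = 4
B = (np.ones((N, N)) - np.eye(N)) / (N - 1)
a = np.zeros(N)
Y_post = np.outer(np.ones(N), [10.0, 12.0])      # untreated outcomes, S = 2
cells = [(0, 0), (1, 0), (1, 1)]                 # unit 0 adopts first, unit 1 second
tau_true = np.array([1.0, 2.0, 1.5])
for (s, i), effect in zip(cells, tau_true):
    Y_post[i, s] += effect
tau_hat, gram = gls_effects(Y_post, a, B, cells)
min_eig = np.linalg.eigvalsh(gram).min()          # invertibility diagnostic
```

In this toy case the effects are recovered exactly because the prediction error is zero; with noisy data the estimate carries the error term that the inference step quantifies.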
Inference#
SSC tests \(H_0: C\tau = d\) (e.g. event-time ATT \(= 0\), or two
policies equal) with Andrews’ (2003) end-of-sample stability test. The test
statistic is \(\widehat P = (C\widehat\tau - d)'(C\widehat\tau - d)\); its
critical value comes from sliding a length-\(S\) window across the
\(T_0\) pre-treatment residuals to form \(T_0 - S\) placebo realisations
of the estimator under the null. Under a stationarity/ergodicity assumption on
the prediction error the test has asymptotically correct size as
\(T \to \infty\) – crucially without point-identifying \(\tau\).
mlsynth reports, for the overall ATT and each event-time ATT, a band (the
point estimate shifted by the placebo distribution’s quantiles) and a two-sided
p-value; both are stored on an SSCBand.
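The window construction can be sketched as follows. This is an illustration under stated assumptions, not mlsynth's implementation: `end_of_sample_test` is a hypothetical name, and `placebo_map` is a hypothetical stand-in for the estimator's linear map from an \((N, S)\) block of prediction errors to \(C\widehat\tau - d\) (here simply a cross-unit mean per period).

```python
import numpy as np

def end_of_sample_test(stat_hat, residuals, S, placebo_map, alpha=0.1):
    """Compare an observed statistic with placebo statistics from pre-period windows.

    residuals   : (N, T0) pre-treatment prediction errors.
    placebo_map : hypothetical callable, (N, S) window -> 1-D vector.
    """
    N, T0 = residuals.shape
    if T0 <= S:
        raise ValueError("need T0 > S to form at least one placebo window")
    # T0 - S windows of length S, following the count quoted in the text.
    stats = np.array([
        float(np.sum(placebo_map(residuals[:, j:j + S]) ** 2))
        for j in range(T0 - S)
    ])
    crit = np.quantile(stats, 1.0 - alpha)           # critical value
    p_value = float(np.mean(stats >= stat_hat))      # share of placebos at least as large
    return crit, p_value, stats

rng = np.random.default_rng(0)
u = rng.normal(size=(5, 40))             # toy stationary residuals: N = 5, T0 = 40
S = 4
mean_map = lambda w: w.mean(axis=0)      # stand-in for the estimator's linear map
crit, p, stats = end_of_sample_test(50.0, u, S, mean_map)
reject = 50.0 > crit
```

With \(T_0 \le S\) no window can be formed, which is why the bands are unavailable in that case.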
Assumptions and econometric theory#
SSC is a large-\(T\), fixed-\(N\)-and-\(S\) method. The individual effects \(\tau_{i,t}\) are not point-identified (there are more unknowns than the data can pin down); the payoff is that any aggregate \(\gamma = L\tau\) is asymptotically unbiased and admits valid inference as the pre-period lengthens.
Setup (SUTVA, no anticipation). Potential outcomes follow a Rubin model in which (i) a unit stays treated once treated (absorbing), (ii) a unit’s outcome depends only on its own treatment status and timing – no interference / spillovers across units – and (iii) pre-adoption outcomes equal the never-treated potential outcome (no anticipation).
Assumption 2.1 (invertibility). \(\sum_{s=1}^{S} A_s' M A_s\) is
invertible, with \(M = (I-B)'(I-B)\). Remark. This is the key identifying
condition: it makes the linear map from the post-treatment prediction errors to
\(\tau\) full rank, so the estimator (eq. 2.4) is well defined. It fails
only in degenerate cases – a “disconnected treated cohort” whose units lie in
one another’s convex hull – and staggered timing typically bridges cohorts
and restores it. The smallest eigenvalue of the sample
\(\sum_s A_s'\widehat M A_s\) is a practical diagnostic (the paper’s
Table 1); mlsynth reports it as results.metadata["gram_min_eigenvalue"].
Assumption 2.2 (stationary prediction error; consistent weights). The prediction error \(u_{i,t} = y_{i,t}(\infty) - (a_i + Y_t(\infty)'b_i)\) is strictly stationary with mean zero, and the synthetic-control weights converge (\(\widehat a \to a\), \(\widehat B \to B\)). Remark. The authors show this holds when the untreated outcomes share stationary or cointegrated common factors – the cointegrating relationship is exactly what lets a stable cross-sectional synthetic control exist with a stationary remainder, which is why a long, well-behaved pre-period matters.
Assumption 2.3 (ergodicity; regularity for inference). \(\{u_t\}\) is ergodic with finite second moment, a normalising sequence controls the regressors, the weight estimates converge uniformly across the placebo windows, and the test statistic’s distribution is continuous and increasing at its \((1-\alpha)\) quantile. Remark. These are the conditions under which the pre-treatment placebo windows are a valid stand-in for the post-treatment sampling distribution of the estimator.
Theorem 2.1 (asymptotic unbiasedness). Under Assumptions 2.1–2.2, as \(T \to \infty\),
\[\widehat\gamma - \gamma = L V_T + o_p(1), \qquad \mathbb{E}[V_T] = 0,\]
so \(\widehat\gamma\) – and, by Corollary 2.1, the event-time ATT \(\widehat{\mathrm{ATT}}^e_s = l_s'\widehat\tau\) – is an asymptotically unbiased estimator of its target without point-identifying the individual effects. (The remaining \(L V_T\) term is mean-zero estimation noise that the inference procedure quantifies.)
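Theorem 2.2 (valid end-of-sample inference). Under Assumptions 2.1–2.3 and the null \(H_0: C\tau = d\), the Andrews test has asymptotically correct size,
\[\Pr\bigl(\widehat P > \widehat c_{1-\alpha}\bigr) \to \alpha \quad \text{as } T \to \infty,\]
where \(\widehat c_{1-\alpha}\) is the \((1-\alpha)\) quantile of the placebo statistics,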
and confidence regions are obtained by inverting the test. The result holds for both sharp nulls (e.g. a single \(\mathrm{ATT}^e_s = 0\)) and non-sharp nulls (restrictions on aggregates), which is what makes it suited to policy-relevant hypotheses under staggered adoption.
Why large-\(T\). The leverage comes entirely from the long pre-period: it identifies the synthetic-control weights and supplies the placebo windows that calibrate inference. This is why SSC fits high-frequency aggregate outcomes (monthly, weekly) with a moderate number of units – and why it is not for short panels.
Example#
A staggered panel of twenty units (four never treated) following a three-factor
model, adopting across a six-period window, with a dynamic effect that grows
with event time (\(\tau = 1 + e\)). SSC recovers the event-study path
with end-of-sample bands and reports the overall ATT.
from mlsynth import SSC
from mlsynth.utils.ssc_helpers.simulation import simulate_ssc_panel

df = simulate_ssc_panel(
    n_units=20, n_never=4, T0=50, S=6, base_effect=1.0, seed=1,
)
res = SSC({
    "df": df, "outcome": "Y", "treat": "treated",
    "unitid": "unit", "time": "time",
    "inference": True,       # Andrews end-of-sample bands + p-values
    "display_graphs": True,  # event-study plot
}).fit()
print(f"overall ATT = {res.att:+.3f} (p = {res.att_band.p_value:.3f})")
for e in sorted(res.event_att):
    b = res.event_bands[e]
    print(f" event time {e}: {b.point:+.3f} [{b.lower:+.3f}, {b.upper:+.3f}]"
          f" (true {1.0 + e:.0f})")
Empirical replication (Guanajuato police reform)#
The package ships the paper’s Section 4 data (Alcocer 2024, Harvard Dataverse)
and the authors’ reference estimates in basedata/. The block below is
copy-paste runnable after a fresh install – it pulls the panels straight
from the basedata/ raw URL, fits SSC through the public API, and checks
every estimate against the authors’ published table:
import pandas as pd
from mlsynth import SSC

BASE = "https://raw.githubusercontent.com/jgreathouse9/mlsynth/main/basedata/"

# --- One outcome, directly through the public API ----------------------
# Homicide rate: monthly panel, the paper's sample window (time < 253).
crime = pd.read_csv(BASE + "guanajuato_crime_ssc.csv").query("time < 253")
res = SSC({"df": crime[["idunico", "time", "Policial", "hom_all_rate"]],
           "outcome": "hom_all_rate", "treat": "Policial",
           "unitid": "idunico", "time": "time",
           "inference": True, "alpha": 0.05, "display_graphs": False}).fit()
# event_att is keyed by 0-based event time, so key 0 is the paper's ATT^e_1.
print("homicide ATT^e_1 =", round(res.event_att[0], 4), " (paper: 0.0743)")

# --- All seven outcomes vs the authors' reference table ----------------
from mlsynth.utils.ssc_helpers import replicate_guanajuato

est = replicate_guanajuato(verbose=False)  # downloads both panels from basedata/
ref = pd.read_csv(BASE + "guanajuato_ssc_reference.csv").rename(
    columns={"event time": "event_time", "att estimate": "ref_att"})
m = est.merge(ref[["outcome", "event_time", "ref_att"]],
              on=["outcome", "event_time"])
m["abs_diff"] = (m["att"] - m["ref_att"]).abs()
print("\nmax |mlsynth - paper| ATT, per outcome (", len(m), "cells):")
print(m.groupby("outcome")["abs_diff"].max().round(6).to_string())
prints:
homicide ATT^e_1 = 0.0743 (paper: 0.0743)
max |mlsynth - paper| ATT, per outcome ( 357 cells):
outcome
co_num 0.001015
hom_all_rate 0.000187
hom_ym_rate 0.000097
presence_strength 0.000046
theft_nonviolent_rate 0.000016
theft_violent_rate 0.000149
war 0.000081
Every one of the 357 reference cells (seven outcomes x their event-time paths)
is reproduced: the homicide and theft rates match the authors’ table to about
\(10^{-4}\), and the short annual cartel outcomes to \(10^{-3}\) (the
residual is the simplex-weight solver – cvxpy here vs. the reference’s
fmincon). The confidence bands match where the reference has them (present
for homicide and the cartel outcomes; NaN for theft, where \(T_0 < S\)
leaves no pre-treatment placebo window). The reference table itself is shipped
at basedata/guanajuato_ssc_reference.csv, and the two panels at
guanajuato_crime_ssc.csv and guanajuato_cartel_ssc.csv.
Simulation study (Path B)#
The paper’s Section 3 Monte Carlo is reproduced through the same public API.
run_ssc_simulation simulates the staggered factor DGP and returns SSC’s
event-time RMSE per (r, T0) cell (the paper’s Figure 1):
from mlsynth.utils.ssc_helpers.replication import (
    run_ssc_simulation, SSCSimConfig, PAPER,
)

# fast, reduced-count preset (use PAPER for the exact N=33, 1000-rep study)
rmse = run_ssc_simulation(SSCSimConfig(n_units=20, n_never=4, S=6,
                                       n_factors=2, T0_grid=[42], n_reps=20))
for cell, by_event in rmse.items():
    print(cell, {e: round(v, 3) for e, v in sorted(by_event.items())})
prints (Monte-Carlo values vary by seed/preset, but the pattern – event-time RMSE rising with the horizon, as in the paper’s Figure 1 – is stable):
(2, 42) {0: 0.37, 1: 0.416, 2: 0.547, 3: 0.552, 4: 0.907, 5: 0.991}
Verification#
Note
Path B replication of the paper’s simulation study (Section 3).
mlsynth.utils.ssc_helpers.replication reproduces the authors’
synthetic Monte-Carlo study – a Path B replication, since we replicate
their simulation-section results rather than an empirical data set – through
the public mlsynth.SSC.fit() API. The DGP is the paper’s factor model
(simulate_ssc_panel()):
N = 33 units (30 treated, staggered over an S = 7 window),
r in {3, 6} AR(1) factors, T in {15, 42, 157} pre-periods, and a
dynamic effect \(\tau = 1 + e\). The reported quantity is the
event-time RMSE of the ATT estimates (the paper’s Figure 1). SSC
recovers the increasing effect path, and its event-time RMSE is lowest in the
early post-periods – below GSC (Xu 2017) and partially-pooled SC
(Ben-Michael et al. 2022) there – because it builds the synthetic controls
from all units rather than only the scarce never-treated ones, which
inflate those methods’ variance. The PAPER preset runs the authors’ exact
1,000-replication configuration; the DEMO preset is a faster,
reduced-count version that reproduces the qualitative pattern.
Path A replication of the empirical application (Section 4). Running
SSC on the paper’s Guanajuato police-reform data (Alcocer 2024;
\(N = 33\) municipalities, \(10\) staggered adopters) reproduces the
authors’ reference event-time ATT estimates for all seven outcomes – the
long-pre-period homicide rates (\(T_0 = 174\), \(S = 78\)) and theft
rates (\(T_0 = 42\)) to about \(10^{-4}\), and the short annual cartel
outcomes (\(T_0 = 15\)) to about \(10^{-3}\) (the residual is the
simplex-weight solver, cvxpy here vs. the reference’s fmincon). The bands
are reported exactly where the reference has them: present for homicide and
the cartel outcomes, and NaN for theft, where \(T_0 < S\) leaves no
pre-treatment placebo window.
Inference. The end-of-sample band is calibrated on pre-treatment residual windows, so coverage does not require point-identification of the individual effects – only stationarity of the prediction error.
Core API#
SSC: Staggered Synthetic Control (Cao, Lu & Wu 2026).
Cao, J., Lu, S. & Wu, H. (2026). “Synthetic Control Inference for Staggered Adoption.” The Econometrics Journal.
SSC estimates heterogeneous, dynamic treatment effects when many units adopt a policy at different times (staggered adoption) and the pre-treatment history is long relative to the number of units and post-periods (large \(T\), moderate \(N\), small \(S\)). Two features set it apart from difference-in-differences and other staggered synthetic-control methods:
It uses every unit – including not-yet-treated units – as a donor. Each unit’s untreated outcome is modelled as an intercept plus a simplex synthetic control on all other units,
\[y_{i,t}(\infty) = a_i + Y_t(\infty)' b_i + u_{i,t}, \qquad b_i \ge 0,\ \textstyle\sum_j b_{ij} = 1,\ b_{ii} = 0 ,\]so it does not require a pool of never-treated units and does not rely on parallel trends.
It delivers valid inference for policy-relevant aggregates. All individual unit \(\times\) time effects \(\tau\) are estimated jointly by GLS; the target is any linear map \(\gamma = L\tau\) (event-time ATT, overall ATT, or a contrast between policies). Inference uses Andrews’ (2003) end-of-sample stability test, whose reference distribution is built from pre-treatment residual windows – valid for both sharp and non-sharp nulls under a large-\(T\) stationarity assumption.
This estimator targets the staggered causal setting; it returns the event-study path of effects with confidence bands and the overall ATT.
- class mlsynth.estimators.ssc.SSC(config: SSCConfig | dict)#
Bases: object
Staggered Synthetic Control estimator.
- Parameters:
config (SSCConfig or dict) – Configuration object. See mlsynth.config_models.SSCConfig.
- fit() → SSCResults#
Run SSC and return SSCResults.
Configuration#
- class mlsynth.config_models.SSCConfig(*, df: ~pandas.DataFrame, outcome: str, treat: str, unitid: str, time: str, display_graphs: bool = True, save: bool | str = False, counterfactual_color: ~typing.List[str] = <factory>, treated_color: str = 'black', inference: bool = True, alpha: ~typing.Annotated[float, ~annotated_types.Gt(gt=0.0), ~annotated_types.Lt(lt=1.0)] = 0.1)#
Configuration for the SSC (Staggered Synthetic Control) estimator.
Cao, Lu & Wu (2026), “Synthetic Control Inference for Staggered Adoption” (The Econometrics Journal). Models each unit’s untreated outcome as an intercept plus a simplex synthetic control on all other units (not-yet-treated units are valid donors), jointly estimates every unit x time effect by GLS, and reports event-time / overall ATT with Andrews (2003) end-of-sample stability inference. Targets staggered adoption with a long pre-period (large T, moderate N, small S). Inherits the standard df / outcome / treat / unitid / time interface.
- Parameters:
inference (bool) – Attach Andrews end-of-sample bands and p-values to the event-time and overall ATT. Default True.
alpha (float) – Two-sided level for the bands (default 0.1 -> 90% band).
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid'}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
Result Containers#
SSC.fit() returns a
SSCResults: the per-cell effects
tau with their index (post-period, unit, event time), the overall
att and its SSCBand, the
event_att path and per-event event_bands, the per-cell effects grid,
the synthetic-control intercepts a_hat and weight matrix B_hat (plus a
WeightsResults), the pre-treatment
residuals, and the
SSCInference summary.
Frozen dataclasses for the SSC (Staggered Synthetic Control) estimator.
Cao, Lu & Wu (2026), Synthetic Control Inference for Staggered Adoption (The Econometrics Journal). SSC estimates heterogeneous, dynamic treatment effects under staggered adoption by modelling each unit’s untreated outcome as an intercept plus a simplex synthetic control on all other units (so not-yet-treated units are valid donors), then jointly estimating every unit x time effect by GLS and aggregating to event-time / overall ATT. Inference is Andrews’ (2003) end-of-sample stability test, calibrated on pre-treatment residual windows.
- class mlsynth.utils.ssc_helpers.structures.SSCBand(label: Any, point: float, lower: float, upper: float, p_value: float, n_cells: int)#
Bases: object
Point estimate, prediction band and p-value for one aggregate effect.
- label#
Event time (int) or None for the overall ATT.
- Type:
Any
- lower, upper
End-of-sample band endpoints.
- Type:
float
- property ci#
- class mlsynth.utils.ssc_helpers.structures.SSCInference(method: str, alpha: float, n_placebo: int)#
Bases: object
Andrews end-of-sample stability inference for an SSC aggregate.
The reference distribution is formed from T0 - S pre-treatment residual windows (the placebo “effects” under the null); a band is the point estimate plus the lower/upper quantiles of that mean-zero distribution.
- class mlsynth.utils.ssc_helpers.structures.SSCInputs(Y: ndarray, D: ndarray, T0: int, unit_names: List[Any], time_labels: ndarray, treated_idx: ndarray, adoption: ndarray)#
Bases: object
Preprocessed staggered panel for SSC.
- Y#
Outcomes, shape (N, T1) (T1 = T0 + S).
- Type:
np.ndarray
- D#
Treatment indicators, shape (N, T1) (1 where treated; absorbing: once 1, stays 1).
- Type:
np.ndarray
- T0#
Number of “clean” pre-treatment periods (before any unit is treated); treatment first appears at column T0.
- Type:
int
- time_labels#
- Type:
np.ndarray
- treated_idx#
Indices of ever-treated units.
- Type:
np.ndarray
- adoption#
Per-unit first-treated column index (-1 for never-treated).
- Type:
np.ndarray
- D: ndarray#
- Y: ndarray#
- adoption: ndarray#
- time_labels: ndarray#
- treated_idx: ndarray#
- class mlsynth.utils.ssc_helpers.structures.SSCResults(inputs: ~mlsynth.utils.ssc_helpers.structures.SSCInputs, tau: ~numpy.ndarray, index: ~numpy.ndarray, att: float, att_band: ~mlsynth.utils.ssc_helpers.structures.SSCBand | None, event_att: ~typing.Dict[int, float], event_bands: ~typing.Dict[int, ~mlsynth.utils.ssc_helpers.structures.SSCBand], effects: ~numpy.ndarray, a_hat: ~numpy.ndarray, B_hat: ~numpy.ndarray, weights: ~typing.Any, residuals: ~numpy.ndarray, inference: ~mlsynth.utils.ssc_helpers.structures.SSCInference | None = None, metadata: ~typing.Dict[str, ~typing.Any] = <factory>)#
Bases: object
Top-level container returned by mlsynth.SSC.fit().
- tau#
Per-treated-cell individual treatment effects, length K.
- Type:
np.ndarray
- index#
(K, 3) rows [post_period s (1-based), unit_index, event_time e (0-based)] aligning with tau.
- Type:
np.ndarray
- effects#
(N, S) per-cell effects placed on the post-period grid (NaN where a unit is untreated at that post period).
- Type:
np.ndarray
- a_hat#
Per-unit synthetic-control intercepts, length N.
- Type:
np.ndarray
- B_hat#
(N, N) synthetic-control weight matrix (row i = donor weights for unit i; zero diagonal).
- Type:
np.ndarray
- weights#
mlsynth.config_models.WeightsResults– per-treated-unit donor weights plus a summary.- Type:
- residuals#
(N, T0) pre-treatment prediction errors.
- Type:
np.ndarray
- inference#
- Type:
SSCInference, optional
- B_hat: ndarray#
- a_hat: ndarray#
- effects: ndarray#
- index: ndarray#
- inference: SSCInference | None = None#
- residuals: ndarray#
- tau: ndarray#
Helper Modules#
Staggered-panel ingestion: pivots the long panel, locates the clean pre-period, and checks the absorbing-treatment and pre-period conditions.
Panel ingestion for the SSC estimator (staggered adoption).
- mlsynth.utils.ssc_helpers.setup.prepare_ssc_inputs(df: DataFrame, outcome: str, treat: str, unitid: str, time: str) → SSCInputs#
Pivot a long panel into SSCInputs.
SSC (Cao, Lu & Wu 2026) targets staggered adoption with a long pre-period: T0 is the number of clean periods before any unit is treated, and all units – including not-yet-treated ones – serve as donors. The treatment indicator must be absorbing (once 1, stays 1).
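The ingestion step can be sketched with pandas. This is a minimal illustration assuming a balanced long panel; `pivot_panel` is a hypothetical name and its checks are a guess at what such a routine needs, not the package's own validation.

```python
import numpy as np
import pandas as pd

def pivot_panel(df, outcome, treat, unitid, time):
    """Pivot a long panel to (N, T1) arrays and locate the clean pre-period."""
    Y = df.pivot(index=unitid, columns=time, values=outcome).sort_index(axis=1)
    D = df.pivot(index=unitid, columns=time, values=treat).sort_index(axis=1)
    if Y.isna().any().any():
        raise ValueError("panel must be balanced")
    Dv = D.to_numpy().astype(int)
    # Absorbing treatment: the indicator never switches back from 1 to 0.
    if (np.diff(Dv, axis=1) < 0).any():
        raise ValueError("treatment must be absorbing")
    # T0 = number of columns before the first period in which any unit is treated.
    treated_cols = np.flatnonzero(Dv.any(axis=0))
    T0 = int(treated_cols[0]) if treated_cols.size else Dv.shape[1]
    return Y.to_numpy(), Dv, T0

toy = pd.DataFrame({
    "unit": ["A"] * 4 + ["B"] * 4,
    "time": [1, 2, 3, 4] * 2,
    "Y": [1.0, 1.1, 2.0, 2.2, 0.9, 1.0, 1.1, 1.2],
    "treated": [0, 0, 1, 1, 0, 0, 0, 0],
})
Y, D, T0 = pivot_panel(toy, "Y", "treated", "unit", "time")
```

Here unit A adopts in the third period, so the clean pre-period has two columns and treatment first appears at column index 2.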
Per-unit simplex synthetic-control weights (each unit on all others).
Synthetic-control weights for SSC (intercept + simplex, batch over units).
Each unit’s untreated outcome is modelled as a_i + Y_t' b_i where b_i
lies on the simplex (non-negative, sums to one) with b_ii = 0 – i.e. a
demeaned synthetic control of unit i on all other units (Cao, Lu & Wu
2026, eq. 2.1). Fitting every unit in turn yields the intercept vector a and
the weight matrix B used throughout the estimator and its inference.
- mlsynth.utils.ssc_helpers.weights.sc_weights_one(y: ndarray, X: ndarray) → Tuple[float, ndarray]#
Demeaned simplex synthetic control of one unit on the others.
Solves min_b || (y - mean y) - (X - mean X) b ||^2 subject to b >= 0 and sum(b) = 1, then recovers the intercept a = mean(y) - mean(X) b.
- Parameters:
y (np.ndarray, shape (T0,)) – Treated unit’s pre-treatment series.
X (np.ndarray, shape (T0, N-1)) – Donor units’ pre-treatment series (columns).
- Returns:
a (float) – Intercept.
b (np.ndarray, shape (N-1,)) – Simplex weights on the donors.
- mlsynth.utils.ssc_helpers.weights.synthetic_control_batch(Y_pre: ndarray) → Tuple[ndarray, ndarray]#
Fit sc_weights_one() for every unit (each treated, others donors).
- Parameters:
Y_pre (np.ndarray, shape (N, T0)) – Pre-treatment outcomes (rows are units, columns are periods).
- Returns:
a_hat (np.ndarray, shape (N,)) – Per-unit intercepts.
B_hat (np.ndarray, shape (N, N)) – Weight matrix; row i holds unit i’s donor weights with a zero on the diagonal.
The selector tensor, the GLS effect estimator, linear aggregation, and the Andrews end-of-sample inference.
SSC effect estimation and Andrews end-of-sample inference.
Given the synthetic-control weights (a_hat, B_hat) fitted on the clean
pre-period, this module (i) stacks the treated unit-period cells into the
selector tensor A_s, (ii) solves the GLS estimator for the full vector of
individual effects tau (Cao, Lu & Wu 2026, eq. 2.4), (iii) aggregates to any
linear target gamma = L tau (event-time / overall ATT), and (iv) calibrates
an end-of-sample stability band (Andrews 2003) from pre-treatment residual
windows.
- mlsynth.utils.ssc_helpers.estimation.aggregate(L: ndarray, tau: ndarray, V: ndarray, alpha: float, label, n_cells: int) → SSCBand#
Aggregate tau by L and attach the end-of-sample band + p-value.
The placebo draws L V are (asymptotically) mean-zero replicates of the estimator’s error L tau_hat - L tau, so inverting gives the band [point - q_{1-alpha/2}, point - q_{alpha/2}] (Cao, Lu & Wu 2026; the reference implementation’s att - ub / att - lb). The two-sided p-value for H0: L tau = 0 is the share of placebo draws at least as large in magnitude as the point estimate.
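The band and p-value arithmetic described above fits in a few lines of NumPy. A minimal sketch, with `band_from_placebos` as a hypothetical name and an evenly spaced grid standing in for real placebo draws:

```python
import numpy as np

def band_from_placebos(point, draws, alpha=0.1):
    """Invert mean-zero placebo draws of the estimation error into a band."""
    q_lo, q_hi = np.quantile(draws, [alpha / 2.0, 1.0 - alpha / 2.0])
    lower = point - q_hi    # upper quantile of the error gives the lower endpoint
    upper = point - q_lo    # lower quantile of the error gives the upper endpoint
    # Two-sided p-value for H0: L tau = 0.
    p_value = float(np.mean(np.abs(draws) >= abs(point)))
    return lower, upper, p_value

draws = np.linspace(-1.0, 1.0, 201)     # stand-in for the placebo draws L V
lower, upper, p = band_from_placebos(2.0, draws, alpha=0.1)
```

Note the sign flip: the upper quantile of the error distribution sets the lower endpoint, which matters whenever the placebo distribution is asymmetric.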
- mlsynth.utils.ssc_helpers.estimation.build_treatment_structure(D: ndarray, T0: int) → Tuple[ndarray, ndarray]#
Index the treated post-period cells and build the selector tensor.
- Parameters:
D (np.ndarray, shape (N, T1)) – Absorbing treatment indicators.
T0 (int) – Number of clean pre-treatment periods.
- Returns:
index (np.ndarray, shape (K, 3)) – Rows [post_period s (1-based), unit_index, event_time e (0-based)].
A (np.ndarray, shape (N, K, S)) – A[i, k, s-1] = 1 iff treated cell k is unit i at post period s.
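A sketch of how such an index and selector tensor could be built from the treatment matrix. The name `treatment_structure` is hypothetical, and the period-major ordering of the cells is an assumption made for this illustration; the package may order them differently.

```python
import numpy as np

def treatment_structure(D, T0):
    """Enumerate treated post-period cells and build the (N, K, S) selector."""
    D = np.asarray(D)
    N, T1 = D.shape
    S = T1 - T0
    # First treated column per unit, -1 for never-treated units.
    adoption = np.where(D.any(axis=1), D.argmax(axis=1), -1)
    rows = []
    for s in range(1, S + 1):            # post period, 1-based
        col = T0 + s - 1
        for i in range(N):
            if D[i, col] == 1:
                rows.append((s, i, col - adoption[i]))   # event time, 0-based
    index = np.array(rows, dtype=int).reshape(-1, 3)
    K = index.shape[0]
    A = np.zeros((N, K, S))
    for k, (s, i, _) in enumerate(index):
        A[i, k, s - 1] = 1.0
    return index, A

# Three units, T0 = 2, S = 2: unit 0 adopts first, unit 1 one period later.
D = np.array([[0, 0, 1, 1],
              [0, 0, 0, 1],
              [0, 0, 0, 0]])
index, A = treatment_structure(D, T0=2)
```

Each slice `A[:, :, s-1]` is the selector matrix \(A_s\), so `A[:, :, s-1] @ tau` places the period-\(s\) effects on the unit axis.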
- mlsynth.utils.ssc_helpers.estimation.estimate_tau(Y: ndarray, T0: int, A: ndarray, a_hat: ndarray, B_hat: ndarray)#
Solve the GLS estimator (eq. 2.4) for the individual-effect vector.
- Parameters:
Y (np.ndarray, shape (N, T1)) – Full outcome panel.
T0 (int) – Clean pre-period count.
A (np.ndarray, shape (N, K, S)) – Selector tensor from build_treatment_structure().
a_hat (np.ndarray, shape (N,)) – Synthetic-control intercepts.
B_hat (np.ndarray, shape (N, N)) – Synthetic-control weight matrix.
- Returns:
tau (np.ndarray, shape (K,)) – Estimated individual treatment effects.
gram (np.ndarray, shape (K, K)) – sum_s A_s' M A_s (the design Gram; invertible under Assumption 2.1).
residuals (np.ndarray, shape (N, T0)) – Pre-treatment prediction errors Y_T - (a + B Y_T).
- mlsynth.utils.ssc_helpers.estimation.event_time_maps(index: ndarray) → Dict[int, ndarray]#
For each event time e, the averaging row L_e (1/n_e on its cells).
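The averaging rows are simple to construct from the cell index. A minimal sketch, with `event_time_rows` as a hypothetical name and a three-cell toy index in the `[post_period, unit_index, event_time]` layout described above:

```python
import numpy as np

def event_time_rows(index):
    """Map each event time e to a length-K row with 1/n_e on its cells."""
    index = np.asarray(index)
    K = index.shape[0]
    maps = {}
    for e in np.unique(index[:, 2]):
        mask = index[:, 2] == e
        row = np.zeros(K)
        row[mask] = 1.0 / mask.sum()
        maps[int(e)] = row
    return maps

index = np.array([[1, 0, 0], [2, 0, 1], [2, 1, 0]])
tau = np.array([1.0, 2.0, 1.5])
maps = event_time_rows(index)
event_att = {e: float(L_e @ tau) for e, L_e in maps.items()}
overall_att = float(tau.mean())     # the overall ATT is the grand mean
```

Cells 0 and 2 share event time 0, so the event-time-0 ATT is their average; the single cell at event time 1 is returned unchanged.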
- mlsynth.utils.ssc_helpers.estimation.placebo_windows(gram: ndarray, A: ndarray, B_hat: ndarray, residuals: ndarray, T0: int) → ndarray#
Pre-treatment “placebo effect” estimates for end-of-sample inference.
Slides a length-S window across the pre-treatment residuals and, for each, applies the same GLS map used for tau – yielding T0 - S draws of the estimator under the null of no effect (Andrews 2003).
- Returns:
V (np.ndarray, shape (K, T0 - S)) – Placebo individual-effect vectors (columns).
Run loop: weights, effect estimation, event-time / overall aggregation, and the optional end-of-sample bands.
Orchestration for the SSC estimator (Cao, Lu & Wu 2026).
- mlsynth.utils.ssc_helpers.pipeline.run_ssc(inputs: SSCInputs, *, inference: bool = True, alpha: float = 0.1) → SSCResults#
Run SSC end to end and assemble SSCResults.
- Parameters:
inputs (SSCInputs)
inference (bool) – Attach Andrews end-of-sample bands/p-values to the event-time and overall ATT (default True).
alpha (float) – Two-sided level for the bands.
Staggered-adoption factor-model DGP for examples and tests.
Staggered-adoption factor-model DGP for SSC examples and tests.
Reproduces the simulation design of Cao, Lu & Wu (2026, Section 3):
y_{i,t} = tau_{i,t} d_{i,t} + lambda_i' f_t + alpha_i + xi_t + c + eps_{i,t}
with r AR(1) common factors f_t and time effect xi_t (each
g_t = 0.5 g_{t-1} + N(0,1)), unit factor loadings and fixed effects drawn
from U[-sqrt(3), sqrt(3)], intercept c, and N(0,1) idiosyncratic
noise. Treatment is staggered; the dynamic effect grows with event time,
tau_{i,t} = base + max(e_{i,t}, 0) where e_{i,t} = t - t_i.
- mlsynth.utils.ssc_helpers.simulation.simulate_ssc_panel(n_units: int = 33, n_never: int = 3, T0: int = 42, S: int = 7, n_factors: int = 3, base_effect: float = 1.0, intercept: float = 5.0, seed: int = 0) → DataFrame#
Simulate a staggered-adoption panel in the Cao-Lu-Wu regime.
- Parameters:
n_units (int) – Total number of units.
n_never (int) – Number of never-treated units (the rest adopt at staggered times within the post window).
T0 (int) – Clean pre-treatment periods (before any adoption).
S (int) – Post-treatment periods (the adoption window).
n_factors (int) – Number of AR(1) common factors r.
base_effect (float) – Constant part of the dynamic effect; the full effect at event time e is base_effect + e.
intercept (float) – Common intercept c.
seed (int) – RNG seed.
- Returns:
pandas.DataFrame – Long panel with columns unit, time, Y, treated.
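The factor-model design can be sketched directly in NumPy. This is an illustration of the equation above, not the package's generator: `ar1` and `simulate_panel` are hypothetical names, the AR(1) processes start at zero, and adoption dates are drawn uniformly over the post window with the first treated unit pinned to the first post period. All three choices are assumptions.

```python
import numpy as np

def ar1(T, rng, rho=0.5):
    """AR(1) path g_t = rho * g_{t-1} + N(0, 1), started at zero (assumption)."""
    g = np.zeros(T)
    for t in range(1, T):
        g[t] = rho * g[t - 1] + rng.normal()
    return g

def simulate_panel(n_units=6, n_never=2, T0=20, S=4, n_factors=2,
                   base=1.0, c=5.0, seed=0):
    rng = np.random.default_rng(seed)
    T1 = T0 + S
    F = np.stack([ar1(T1, rng) for _ in range(n_factors)])   # (r, T1) factors
    xi = ar1(T1, rng)                                         # time effect
    lam = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(n_units, n_factors))
    alpha = rng.uniform(-np.sqrt(3), np.sqrt(3), size=n_units)
    # Adoption column per unit; T1 means never treated within the sample.
    adopt = np.full(n_units, T1)
    n_treated = n_units - n_never
    adopt[:n_treated] = rng.integers(T0, T1, size=n_treated)
    adopt[0] = T0                        # keeps exactly T0 clean pre-periods
    e = np.arange(T1)[None, :] - adopt[:, None]               # event time
    D = (e >= 0).astype(int)
    tau = base + np.maximum(e, 0)                             # dynamic effect
    Y0 = lam @ F + alpha[:, None] + xi[None, :] + c + rng.normal(size=(n_units, T1))
    return Y0 + tau * D, D, tau

Y, D, tau = simulate_panel()
```

The function returns wide arrays rather than a long DataFrame to keep the sketch short.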
Path-B replication of the paper’s Section 3 Monte-Carlo study (event-time RMSE)
through the public SSC.fit API, with the PAPER / DEMO presets.
Path-B replication of the Cao, Lu & Wu (2026) simulation study (Section 3).
This reproduces the authors’ synthetic (Monte-Carlo) study for the SSC estimator – a Path B replication (reproducing a paper’s simulation-section results, as opposed to a Path A empirical-data replication).
Design (paper Section 3)#
The data are generated from a factor model
y_{i,t} = tau_{i,t} d_{i,t} + lambda_i' f_t + alpha_i + xi_t + c + eps_{i,t},
with r AR(1) common factors and a time effect (each g_t = 0.5 g_{t-1} +
N(0,1)), loadings/unit effects U[-sqrt(3), sqrt(3)], intercept c = 5,
N(0,1) noise, and a dynamic effect tau_{i,t} = 1 + max(e_{i,t}, 0) where
e_{i,t} is event time. The authors fix N = 33 units (30 eventually
treated, staggered across an S = 7 post-window), vary the number of factors
r in {3, 6} and the pre-period length T in {15, 42, 157}, and run 1,000
replications. Figure 1 reports the event-time RMSE of each method’s ATT
estimates.
This module computes SSC’s event-time RMSE – the share of Figure 1 that
concerns our estimator – through the public mlsynth.SSC.fit() API:
RMSE_e = sqrt( mean over replications of ( ATT^e_hat - (1 + e) )^2 ),
since the true event-time ATT at horizon e is 1 + e under this DGP.
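The RMSE formula amounts to a column-wise reduction over replications. A minimal sketch with a hypothetical helper `event_time_rmse` and two made-up replications, for arithmetic only:

```python
import numpy as np

def event_time_rmse(att_hat, base_effect=1.0):
    """RMSE per event time; att_hat has shape (n_reps, n_event_times)."""
    att_hat = np.asarray(att_hat, dtype=float)
    e = np.arange(att_hat.shape[1])
    truth = base_effect + e               # true event-time ATT under the DGP
    return np.sqrt(np.mean((att_hat - truth) ** 2, axis=0))

# Two illustrative replications, three event times (not real output).
att_hat = np.array([[1.1, 2.0, 3.2],
                    [0.9, 2.0, 2.8]])
rmse = event_time_rmse(att_hat)
```

The errors at event times 0, 1 and 2 are ±0.1, 0 and ±0.2, so the RMSE vector is approximately (0.1, 0.0, 0.2).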
The PAPER preset is the authors’ exact configuration; DEMO is a faster,
reduced-count version that reproduces the qualitative pattern.
- mlsynth.utils.ssc_helpers.replication.GUANAJUATO_SPEC = {'co_num': ('cartel', 'idunico', 'Year', 'policial', None), 'hom_all_rate': ('crime', 'idunico', 'time', 'Policial', 'time < 253'), 'hom_ym_rate': ('crime', 'idunico', 'time', 'Policial', 'time < 253'), 'presence_strength': ('cartel', 'idunico', 'Year', 'policial', None), 'theft_nonviolent_rate': ('crime', 'idunico', 'time', 'Policial', 'time >= 133'), 'theft_violent_rate': ('crime', 'idunico', 'time', 'Policial', 'time >= 133'), 'war': ('cartel', 'idunico', 'Year', 'policial', None)}#
For each outcome: which file, the panel columns, and the sample window the paper uses, as a (file, unit, time, treat, window_query) tuple; window_query is a DataFrame.query string, or None for the full panel.
- mlsynth.utils.ssc_helpers.replication.GUANAJUATO_URL = 'https://raw.githubusercontent.com/jgreathouse9/mlsynth/main/basedata/'#
Raw-data URL prefix for the datasets shipped in basedata/.
- class mlsynth.utils.ssc_helpers.replication.SSCSimConfig(n_units: int = 33, n_never: int = 3, S: int = 7, n_factors: int = 3, base_effect: float = 1.0, intercept: float = 5.0, T0_grid: ~typing.List[int] = <factory>, n_reps: int = 1000)#
Parameters for the SSC Monte-Carlo study.
- mlsynth.utils.ssc_helpers.replication.replicate_guanajuato(crime: str | DataFrame | None = None, cartel: str | DataFrame | None = None, *, outcomes: List[str] | None = None, alpha: float = 0.05, verbose: bool = True) → DataFrame#
Replicate the Guanajuato police-reform application (Path A).
Runs mlsynth.SSC on each of the seven outcomes – homicide and theft rates (monthly) and cartel measures (annual) – using the paper’s sample windows, and returns the event-time ATTs with their end-of-sample bands. Reproduces the authors’ reference estimates to ~1e-4 (homicide, theft) / ~1e-3 (cartel).
- Parameters:
crime, cartel (str or pandas.DataFrame, optional) – The monthly crime panel and annual cartel panel, or paths/URLs to them. Default downloads guanajuato_crime_ssc.csv / guanajuato_cartel_ssc.csv from the mlsynth basedata/ directory.
outcomes (list of str, optional) – Which outcomes to estimate (default all seven).
alpha (float) – Two-sided level for the end-of-sample bands.
verbose (bool) – Print a per-outcome summary line.
- Returns:
pandas.DataFrame – Tidy long table with columns outcome, event_time (1-based, as in the paper), att, ci_lower, ci_upper, T0, S.
- mlsynth.utils.ssc_helpers.replication.run_ssc_simulation(cfg: SSCSimConfig = SSCSimConfig(n_units=20, n_never=4, S=6, n_factors=2, base_effect=1.0, intercept=5.0, T0_grid=[42], n_reps=20), *, n_factors=None, seed: int = 0, verbose: bool = True) → Dict#
Run the SSC Monte-Carlo and return event-time RMSE per (r, T0) cell.
For each replication the panel is simulated from the paper’s factor DGP and the event-time ATT is estimated via mlsynth.SSC.fit() (point estimates only; inference is off for speed). The RMSE of each event-time estimate against the truth 1 + e is accumulated across replications.
- Parameters:
cfg (SSCSimConfig) – Study configuration (preset PAPER or DEMO).
n_factors (int, optional) – Override the factor count r (default uses cfg.n_factors).
seed (int) – Base RNG seed.
verbose (bool) – Print a small per-cell table.
- Returns:
dict – {(r, T0): {event_time e: rmse_e}}.