Ridge ASCM — Augmented Synthetic Control (Ben-Michael, Feller & Rothstein 2021)#
- Estimator: Vanilla Synthetic Control (VanillaSC), specifically the ridge-augmentation layer (mlsynth.utils.bilevel.ridge_augment.ridge_augment_weights()).
- Source: Ben-Michael, Feller & Rothstein (2021), “The Augmented Synthetic Control Method,” JASA 116(536); reference implementation: the augsynth R package (ebenmichael/augsynth).
- Replication type: Cross-validation (mlsynth matched value-for-value to augsynth on its canonical Kansas study) and Path B (the paper’s Section-7 coverage / bias-reduction simulation).
- Status: Fully verified; empirical ladder and simulation reproduced.
Validation strategy#
The Augmented SCM is a bias-correction layer, so it is validated against the
authors’ own R package, augsynth, on its flagship empirical example: the
effect of Kansas’s 2012 tax cuts on quarterly log GDP per capita. augsynth
walks up a ladder of estimators of increasing de-biasing — plain SCM,
ridge ASCM, ridge ASCM with auxiliary covariates (balanced directly), and the
residualized covariate variant — and the measured effect grows while the
pre-treatment imbalance falls. We reproduce that ladder cell by cell, then
reproduce the paper’s Section-7 Monte Carlo (Path B).
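To make the "bias-correction layer" concrete, here is a minimal NumPy sketch of the ridge augmentation step as we read it from Ben-Michael, Feller & Rothstein (2021): the SCM weights are shifted by a ridge regression of the remaining pre-period gap on the donor outcomes. The function name `ridge_augment` is hypothetical (it is not mlsynth's `ridge_augment_weights()`), the data are random, the starting weights are uniform placeholders rather than a fitted SCM solution, and the usual centering step is omitted.

```python
import numpy as np

def ridge_augment(w_scm, X0, x1, lam):
    """Shift SCM weights by a ridge correction of the pre-period gap.

    X0  : (J, T0) donor pre-treatment outcomes
    x1  : (T0,)   treated-unit pre-treatment outcomes
    lam : ridge penalty (> 0)
    """
    T0 = X0.shape[1]
    gap = x1 - X0.T @ w_scm                      # imbalance left by the SCM weights
    adj = X0 @ np.linalg.solve(X0.T @ X0 + lam * np.eye(T0), gap)
    return w_scm + adj                           # may leave the simplex (negative weights)

rng = np.random.default_rng(0)
J, T0 = 10, 6
X0 = rng.normal(size=(J, T0))
x1 = rng.normal(size=T0) + 2.0                   # treated unit far from the donors
w_scm = np.full(J, 1.0 / J)                      # placeholder weights, NOT a fitted SCM

w_aug = ridge_augment(w_scm, X0, x1, lam=1.0)
l2_scm = np.linalg.norm(x1 - X0.T @ w_scm)       # pre-fit L2 before augmentation
l2_aug = np.linalg.norm(x1 - X0.T @ w_aug)       # pre-fit L2 after augmentation
```

In this sketch the augmented gap equals `lam * (X0'X0 + lam*I)^{-1} @ gap`, so the pre-fit L2 can only shrink; a smaller `lam` buys better balance at the cost of more extrapolation.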
Cross-validation — the Kansas ladder#
The treated unit is Kansas (FIPS 20); treatment begins in 2012 Q2, leaving
\(T_0 = 89\) pre-period quarters and \(J = 49\) donor states. The
covariate model is augsynth’s documented spec,

    covsyn <- augsynth(lngdpcapita ~ treated | lngdpcapita + log(revstatecapita) +
                         log(revlocalcapita) + log(avgwklywagecapita) +
                         estabscapita + emplvlcapita,
                       fips, year_qtr, kansas, progfunc = "ridge", scm = TRUE)

with each covariate transformed per row and then averaged over the pre-period
within each unit. Rows with a missing revenue value (revenue is sparsely
reported) are dropped before averaging, which is the na.omit default of R’s
model.frame.
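The aggregation described above can be sketched in pandas on a toy panel. This is an illustration under our own assumptions, not augsynth's or mlsynth's code: the toy values and the `log_rev` column name are made up, and only the column names `fips`, `year_qtr`, `revstatecapita`, and `estabscapita` come from the Kansas spec.

```python
import numpy as np
import pandas as pd

# Toy long panel: two units, three pre-period quarters, one missing revenue value.
toy = pd.DataFrame({
    "fips": [1, 1, 1, 2, 2, 2],
    "year_qtr": [2011.0, 2011.25, 2011.5] * 2,
    "revstatecapita": [100.0, np.nan, 400.0, 50.0, 50.0, 50.0],
    "estabscapita": [1.0, 5.0, 3.0, 4.0, 5.0, 6.0],
})

# Step 1: per-row transform (the log is taken before any averaging).
toy["log_rev"] = np.log(toy["revstatecapita"])

# Step 2: drop every row with a missing covariate (analogue of na.omit).
covs = ["log_rev", "estabscapita"]
complete = toy.dropna(subset=covs)

# Step 3: average the remaining pre-period rows within each unit.
unit_means = complete.groupby("fips")[covs].mean()
```

Because the whole row is dropped, unit 1's `estabscapita` mean is taken over two quarters (1.0 and 3.0), not three, even though `estabscapita` itself is observed in the dropped quarter.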
The whole ladder is reproduced through mlsynth’s public API: Augmented SCM
is a mode of VanillaSC (augment="ridge").
Covariates are passed by column name; the user first applies augsynth’s per-row
log transforms to the DataFrame (mlsynth’s covariate convention). The
shared-window aggregation drops a pre-period whenever any covariate is missing
there, matching augsynth’s na.omit. Setting residualize=True selects the
residualized variant:

    import numpy as np
    import pandas as pd
    from mlsynth import VanillaSC

    df = pd.read_csv("basedata/kansas_ascm.csv")    # long fips x quarter panel
    for c in ("revstatecapita", "revlocalcapita", "avgwklywagecapita"):
        df[c] = np.log(df[c])                       # augsynth's log transforms

    covs = ["lngdpcapita", "revstatecapita", "revlocalcapita",
            "avgwklywagecapita", "estabscapita", "emplvlcapita"]
    base = dict(df=df, outcome="lngdpcapita", treat="treated",
                unitid="fips", time="year_qtr")

    def att(cfg):
        # Fit one rung of the ladder and return its ATT.
        return VanillaSC({**base, **cfg}).fit().effects.att

    att({})                                          # classic SCM     -0.029
    att({"augment": "ridge"})                        # ridge ASCM      -0.040
    att({"augment": "ridge", "covariates": covs})    # covariate ASCM  -0.063
    att({"augment": "ridge", "covariates": covs,
         "residualize": True})                       # residualized    -0.057

The reproduced ladder (mlsynth vs augsynth):
| Specification | ATT (mlsynth) | Pre-fit L2 (mlsynth) | augsynth (ATT / L2) |
|---|---|---|---|
| Classic SCM | -0.0294 | 0.083 | -0.029 / 0.083 |
| Ridge ASCM | -0.0401 | 0.062 | -0.040 / 0.062 |
| Covariate ASCM | -0.0629 | 0.055 | -0.061 / 0.054 |
| Residualized | -0.0572 | 0.067 | -0.055 / 0.067 |
The two no-covariate rows match augsynth to the precision it reports; the
covariate rows agree to within roughly 0.002 in ATT and 0.001 in pre-fit L2
and reproduce augsynth’s ordering of the ladder, with un-augmented SCM at the
conservative end. The joint-null conformal \(p\)-value for ridge ASCM
(\(0.071\)) is also reproduced to Monte-Carlo precision.
A note on the residualized penalty. After residualizing out \(K\)
covariates the residual Gram is rank-deficient (\(T_0\) rows, rank
\(\le J - K\)), so cross-validating the penalty on the residuals is ill-posed
and drifts to the grid floor. mlsynth instead tunes the penalty on the outcome
scale, which is where augsynth’s residual CV lands anyway, and this robustly
reproduces the published \(-0.055\) / \(0.067\) to within the tolerance shown
in the table.
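The rank argument can be checked numerically. The sketch below uses random data with the Kansas dimensions (\(J = 49\), \(T_0 = 89\), \(K = 6\)); it illustrates the linear algebra only and does not touch mlsynth or augsynth. Residualizing the donor matrix on \(K\) covariates projects out a \(K\)-dimensional subspace across donors, so at most \(J - K\) independent directions remain, fewer than the \(T_0\) pre-periods a ridge fit would need to resolve.

```python
import numpy as np

rng = np.random.default_rng(1)
J, T0, K = 49, 89, 6

X0 = rng.normal(size=(J, T0))    # donors x pre-periods (random stand-in data)
Z = rng.normal(size=(J, K))      # donors x covariates (random stand-in data)

# Residualize every pre-period column of X0 on Z by least squares across donors.
coef, *_ = np.linalg.lstsq(Z, X0, rcond=None)
R = X0 - Z @ coef                # residual donor matrix

rank_before = np.linalg.matrix_rank(X0)   # generically J
rank_after = np.linalg.matrix_rank(R)     # generically J - K
```

With these dimensions the residual matrix has rank 43 against 89 pre-periods, so the \(T_0 \times T_0\) Gram built from it has at least 46 zero eigenvalues before any penalty is added.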
Path B — coverage and bias reduction (Section 7)#
Four data-generating processes are calibrated to the Kansas panel: a 3-factor
interactive-fixed-effects model, the same model at \(4\times\) noise, additive
two-way fixed effects, and a fitted AR(3). The interactive-fixed-effects model
is calibrated exactly as gsynth/fect’s interFE does it with no covariates:
two-way demean, then a rank-3 SVD of the residual. Treatment is assigned to an
extreme unit, so plain SCM struggles and the augmentation matters. Across all
four DGPs ridge ASCM reduces \(|\text{bias}|\) versus plain SCM and gives
near-nominal coverage (\(\approx 0.90\)–\(0.96\)), with the gain limited under
high noise, which is the paper’s thesis.
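The "two-way demean, then rank-3 SVD" calibration can be sketched as follows. This is our reading of that one-line description, not gsynth/fect's or mlsynth's actual code: the function name `calibrate_ife`, the random stand-in panel, and the use of the leftover standard deviation as the noise scale are all assumptions.

```python
import numpy as np

def calibrate_ife(Y, rank=3):
    """Split a units x periods panel into two-way effects, a low-rank part, and noise."""
    unit_fe = Y.mean(axis=1, keepdims=True)       # unit means
    time_fe = Y.mean(axis=0, keepdims=True)       # period means
    grand = Y.mean()                              # grand mean
    resid = Y - unit_fe - time_fe + grand         # two-way demeaned panel

    # Truncated SVD: keep the leading `rank` factors of the demeaned panel.
    U, s, Vt = np.linalg.svd(resid, full_matrices=False)
    low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]

    noise_sd = (resid - low_rank).std()           # assumed idiosyncratic noise scale
    return resid, low_rank, noise_sd

rng = np.random.default_rng(2)
Y = rng.normal(size=(50, 40))                     # stand-in panel, not the Kansas data
resid, low_rank, noise_sd = calibrate_ife(Y)
```

A simulated panel would then add fresh noise with standard deviation `noise_sd` to the fitted effects and `low_rank`; under this sketch, multiplying that standard deviation by 4 would give the high-noise variant.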
Durable cases & tests#
- ascm_kansas: the four-spec Kansas ladder cross-validated against augsynth (benchmarks/cases/ascm_kansas.py).
- augsynth_calibrated: the Section-7 coverage / bias-reduction simulation (benchmarks/cases/augsynth_calibrated.py).
- Regression tests: mlsynth/tests/test_bilevel_ridge.py (test_augsynth_kansas_replication, test_augsynth_kansas_conformal_pvalue, test_augsynth_kansas_covariate_ladder).