MicroSynth — panel method vs the R microsynth (Seattle DMI)

Cross-validation of MicroSynth’s panel method (weight_method="panel") against the R microsynth package (Robbins, Saunders & Kilmer 2017, JASA; Robbins & Davenport 2021, JSS v97i02) on the package’s canonical Seattle Drug Market Intervention example.

What the panel method does

Per microsynth/R/weights.r, when match.out (lagged outcomes) is supplied, the package’s my.qp (a LowRankQP solve) chooses the control weights by a non-negative quadratic program: it balances the covariate totals exactly (an intercept makes the weights sum to the treated count) and fits the lagged outcomes by least squares, subject to \(w \ge 0\). Raking (survey::calibrate) serves only as the covariate-only initialization and fallback; it does not produce the panel weights.
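In symbols, the program described above can be sketched as follows. The notation is introduced here for illustration and is not taken from the package source: \(J\) is the number of controls, \(X\) the control-by-covariate matrix (including an intercept column), \(Y\) the control-by-period matrix of lagged outcomes, and \(x_T\), \(y_T\) the corresponding totals over the treated units.

\[
\min_{w \in \mathbb{R}^{J}} \; \lVert Y^{\top} w - y_T \rVert_2^2
\quad \text{subject to} \quad X^{\top} w = x_T, \qquad w \ge 0 .
\]

Under this notation the intercept column of \(X\) contributes the constraint \(\sum_j w_j = n_T\), where \(n_T\) is the treated count.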

That objective is rank-deficient over a large control pool: the weights are constrained only through a handful of totals (here 10 covariate and 12 lagged-outcome constraints across ~9600 controls), so the constraints alone do not identify the counterfactual. On this data the feasible period-13 effect ranges over roughly \([-392, +153]\), and LowRankQP simply returns its interior-point iterate. mlsynth adds a strictly convex ridge (panel_ridge) that selects the unique minimum-norm, maximum-ESS optimum: the most diffuse synthetic control consistent with exact covariate balance and the best lagged-outcome fit. Because LowRankQP’s interior-point solution itself lies close to that point, the two agree to 3–4 significant figures, which makes this a genuine cross-validation rather than a comparison of solver artifacts.
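The non-identification and the role of a minimum-norm selection can be illustrated on a toy problem. The sketch below is not mlsynth or microsynth code: it uses six hypothetical controls, one covariate, and two balance constraints, omits the lagged-outcome fit and the non-negativity constraint, and replaces the ridge with the textbook closed form for the minimum-norm solution of a linear system. The effective sample size is taken to be Kish's \((\sum w)^2 / \sum w^2\), which is an assumption about how ESS is defined here.

```python
# Toy illustration only: this is not mlsynth or microsynth code.
# Six controls, one covariate, two balance constraints:
#   sum(w) = n_treated   and   sum(w * x) = x_target.

def min_norm_weights(x, n_treated, x_target):
    """Minimum-L2-norm w satisfying both constraints.

    Closed form w = A^T (A A^T)^{-1} b with A = [ones; x], b = [n_treated, x_target].
    Non-negativity is not enforced; the toy data keep every weight positive.
    """
    n = len(x)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    det = n * sxx - sx * sx
    lam0 = (sxx * n_treated - sx * x_target) / det
    lam1 = (n * x_target - sx * n_treated) / det
    return [lam0 + lam1 * v for v in x]

def ess(w):
    """Kish effective sample size (assumed definition)."""
    return sum(w) ** 2 / sum(v * v for v in w)

def counterfactual(w, y):
    """Weighted control total for one post period."""
    return sum(wi * yi for wi, yi in zip(w, y))

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]           # hypothetical covariate values
y_post = [10.0, 0.0, 5.0, 20.0, 1.0, 9.0]    # hypothetical post-period outcomes
n_treated, x_target = 2.0, 8.0

# Two sparse weight vectors that satisfy both constraints exactly.
w_a = [0.0, 0.0, 0.0, 2.0, 0.0, 0.0]
w_b = [0.0, 1.0, 0.0, 0.0, 0.0, 1.0]
# The most diffuse feasible weights.
w_ridge = min_norm_weights(x, n_treated, x_target)

for name, w in [("sparse A", w_a), ("sparse B", w_b), ("min-norm", w_ridge)]:
    print(name, round(counterfactual(w, y_post), 3), round(ess(w), 3))
```

All three weight vectors balance the constraints exactly yet imply different counterfactuals. Because every feasible vector has the same sum, the one with the smallest norm is also the one with the largest ESS.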

Data

basedata/seattledmi.parquet is the R microsynth package’s seattledmi dataset (data(seattledmi)), trimmed to the columns this case uses: ID, time, Intervention, the four Table 2 outcomes, and the 9 census covariates. The full panel has 9642 census blocks × 16 periods, of which 39 blocks are treated; the intervention is in effect from time >= 13.

The R configuration mirrors mlsynth’s one-outcome MicroSynth run exactly: match.out = c("any_crime"), match.covar = the 9 census covariates, start.pre = 1, end.pre = 12, end.post = 16.

Result

Per-period total treatment effect on any_crime (Treated − synthetic Control):

| Period | mlsynth | R microsynth |
|---|---|---|
| 13 | −33.06 | −33.06 |
| 14 | −74.43 | −74.35 |
| 15 | −45.35 | −45.45 |
| 16 | −64.86 | −64.89 |
| ATT | −54.43 | −54.44 |

Per-period effects agree to within ~0.1 crimes and the ATT to ~0.01. The quantities that the constraints do identify agree exactly in both packages: the weights sum to the treated count (39), and covariate and lagged-outcome balance is exact (max |SMD| ≈ 1e-10). The ridge-selected optimum has an effective sample size of ≈ 378, making it the most diffuse control consistent with the fit.
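The agreement figures can be checked directly from the rounded values in the table above. The sketch below assumes the ATT is the simple mean of the four per-period effects, which is consistent with the reported numbers up to rounding but is not confirmed against either package's source.

```python
# Per-period effects copied from the Result table (rounded to 2 decimals).
periods = [13, 14, 15, 16]
mlsynth_effect = [-33.06, -74.43, -45.35, -64.86]
r_effect = [-33.06, -74.35, -45.45, -64.89]

# Assumption: ATT = unweighted mean of the post-period effects.
att_mlsynth = sum(mlsynth_effect) / len(mlsynth_effect)
att_r = sum(r_effect) / len(r_effect)

# Largest per-period disagreement between the two packages.
max_gap = max(abs(a - b) for a, b in zip(mlsynth_effect, r_effect))

print(f"ATT mlsynth: {att_mlsynth:.3f}")
print(f"ATT R:       {att_r:.3f}")
print(f"largest per-period gap: {max_gap:.2f}")
```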

The placebo-permutation test (mlsynth run_inference=True, permutation_test="lower") finds the crime reduction significant: the observed ATT lies well below every placebo group (≈ 7 placebo SDs out), so the one-sided p-value sits at the 1/(1 + n_permutations) floor — matching the R package’s reported significant reductions on this example.
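The p-value floor follows from the usual add-one permutation formula. The sketch below is a minimal illustration with hypothetical placebo values. The function name and the formula are assumptions consistent with the stated 1/(1 + n_permutations) floor, not a copy of mlsynth's implementation.

```python
def lower_tail_pvalue(observed, placebos):
    """One-sided (lower) permutation p-value with the add-one correction.

    Counts placebo statistics at least as negative as the observed one.
    """
    n_as_extreme = sum(1 for p in placebos if p <= observed)
    return (1 + n_as_extreme) / (1 + len(placebos))

# Hypothetical placebo ATTs, for illustration only.
placebo_atts = [-6.0, 3.5, -1.2, 8.4, 0.7, -4.9, 2.2, 5.1, -2.8]
observed_att = -54.43  # mlsynth ATT from the Result table

p = lower_tail_pvalue(observed_att, placebo_atts)
print(p)
```

When the observed ATT lies below every placebo value, the count of more extreme placebos is zero and the p-value equals 1/(1 + n_permutations). It can only shrink further by running more permutations.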

JSS Table 2 (multi-outcome joint match)

The package’s headline analysis (sea1 in the JSS vignette) matches all four outcomes jointly — one synthetic control balancing every outcome’s pre-period trajectory plus the covariates — and reports a cumulative effect per outcome (Table 2, top panel). mlsynth reproduces it with match_outcomes set to the four outcomes (run once per primary outcome; the shared weight vector is identical across runs). Cumulative Pct.Chng over the post window 13–16:

| Outcome | mlsynth | R microsynth |
|---|---|---|
| i_felony | −32.6% | −32.6% |
| i_misdemea | −37.3% | −37.3% |
| i_drugs | −15.9% | −15.8% |
| any_crime | −20.1% | −20.1% |

All four match to within 0.5 percentage points (the rounded values above differ by at most 0.1), and Trt/Con levels agree to ~0.1 crimes. As in the paper, the permutation test flags felonies, misdemeanors and total crime as significant reductions but not drug crimes, consistent with the JSS Perm.pVal column under test="lower" (felony 0.020, misdemea 0.008, drugs 0.304, any_crime 0.020).

Note

What is and isn’t reproduced. The top panel’s point estimates (Trt/Con/Pct.Chng) are reproduced to within the tolerances stated above. The top panel’s Linear p-values and CIs come from a survey Taylor-linearization variance estimate, which mlsynth does not implement. mlsynth’s inference is the placebo permutation, so the package’s Perm column is the comparison target.

Note

R reference is baked in, not run in CI. microsynth does not install from CRAN in the CI/sandbox network (CRAN-over-HTTPS is firewalled). The reference numbers above come from benchmarks/R/microsynth_seattle.R (R microsynth 2.0.51), which documents the apt + GitHub-mirror install route and regenerates them. The benchmark asserts mlsynth’s output matches the baked reference within tolerance.

Reproduce

python benchmarks/run_benchmarks.py --case microsynth_seattle
# regenerate the R references (needs R microsynth installed):
Rscript benchmarks/R/microsynth_seattle.R    # single-outcome any_crime
Rscript benchmarks/R/microsynth_table2.R     # JSS Table 2 joint match