Partially Pooled SCM (PPSCM)#
When to Use This Estimator#
PPSCM is a faithful port of augsynth::multisynth – the partially
pooled synthetic control of Ben-Michael, Feller and Rothstein [PPSCM] for
staggered adoption. Use it when several units are treated but at different
times, with a pool of never-treated (or late-treated) comparison units, and you
want a single estimate of the average treatment effect on the treated (ATT) over
relative time (time-since-treatment), pooling information across cohorts.
The central idea is a pooling dial \(\nu\). Fitting a separate synthetic
control for each treated unit gives the best per-unit pre-treatment fit but high
variance; a fully pooled control (one synthetic match for the average treated
unit) is stable but may fit any individual unit poorly. PPSCM interpolates
between the two, choosing \(\nu\) to balance overall and unit-level
imbalance. time_cohort=True collapses units sharing an adoption time into a
single fully-pooled cohort (one synthetic control per cohort).
PPSCM exists because the two obvious extensions of SCM to staggered adoption are each flawed. Separate SCM (fit a synthetic control per treated unit, then average, which is common practice) requires a good synthetic control for every treated unit, which often fails; even when the per-unit fits are strong, the average can remain poorly matched, biasing the ATT. Pooled SCM (match the average treated unit) achieves a good average fit but can fit individual units badly, biasing unit-level effects, and also the average when the data-generating process drifts over time. Ben-Michael, Feller and Rothstein bound the estimation error by both the average imbalance and the per-unit imbalances; partially pooled SCM minimises a weighted combination of the two, which is the appropriate target in the regime where neither extreme is trustworthy.
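A toy calculation makes the tension between the two imbalances concrete. The sketch below is illustrative only: the numbers are invented, and q_sep and q_pool are computed as the root-mean-square of the per-unit gap norms and the norm of the average gap, which is an assumed convention rather than mlsynth's exact scaling.

```python
import numpy as np

# Hypothetical pre-treatment paths (d = 2 periods) for two treated units.
treated = np.array([[2.0, 0.0],
                    [0.0, 2.0]])
# With a single donor, each unit's synthetic control is that donor's path.
synthetic = np.array([[1.0, 1.0],
                      [1.0, 1.0]])

imbalance = treated - synthetic                          # per-unit gaps q_j
q_sep = np.sqrt(np.mean(np.sum(imbalance**2, axis=1)))   # unit-level imbalance
q_pool = np.linalg.norm(imbalance.mean(axis=0))          # imbalance of the average

print(f"q_sep  = {q_sep:.3f}")   # 1.414: every unit is fit poorly
print(f"q_pool = {q_pool:.3f}")  # 0.000: the average is matched exactly
```

Here the average treated unit is matched perfectly while each unit is matched badly, which is the situation in which a fully pooled fit looks better than it is.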
Reach for PPSCM when#
Several units are treated at different adoption times, with a pool of never-treated (or not-yet-treated) comparison units.
You want an ATT over relative time (an event-study path), pooling information across cohorts rather than trusting each cohort’s own fit.
No single donor mix matches every treated unit, so separate SCM leaves you with unreliable per-unit fits – the partial-pooling dial lets the average fit borrow strength without abandoning unit-level fit.
You want an estimator that nests the familiar special cases (separate and fully pooled SCM) and a principled way to choose between them.
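To make "ATT over relative time" concrete, here is a minimal sketch of how per-unit gaps (treated minus synthetic) can be re-indexed by time since treatment and averaged. The function name relative_time_att and the toy numbers are hypothetical, and the equal-weight average over units is an assumption, not necessarily the weighting mlsynth uses.

```python
import numpy as np

def relative_time_att(gaps, adopt, n_leads):
    """Average treated-minus-synthetic gaps by time since treatment.

    gaps  : (J, T) array, one row per treated unit, columns are calendar periods.
    adopt : length-J sequence, column index of each unit's first treated period.
    Returns a length-n_leads array; horizon k averages the units that are
    observed k periods after their own adoption.
    """
    gaps = np.asarray(gaps, dtype=float)
    aligned = np.full((gaps.shape[0], n_leads), np.nan)
    for j, tj in enumerate(adopt):
        post = gaps[j, tj:tj + n_leads]      # shorter for late adopters
        aligned[j, :post.size] = post
    return np.nanmean(aligned, axis=0)       # ignore horizons a unit never reaches

# Two treated units adopting at columns 2 and 3 of a 5-period panel.
gaps = np.array([[0.0, 0.0, 1.0, 2.0, 3.0],
                 [0.0, 0.0, 0.0, 4.0, 6.0]])
att_path = relative_time_att(gaps, adopt=[2, 3], n_leads=3)
print(att_path)
```

The last horizon is informed only by the early adopter, which is why later horizons in a staggered event study typically rest on fewer units.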
Do not use PPSCM when#
All treated units adopt at the same time (a single cohort). The staggered machinery is unnecessary; use classic SC (Two-Step Synthetic Control, Forward Difference-in-Differences (FDID)) or, for many treated units at one time, Synthetic Difference-in-Differences (SDID).
You are willing to assume parallel trends after weighting and want the DiD-flavoured double weighting / time weights. Synthetic Difference-in-Differences (SDID) (and, for efficiency under interactive fixed effects, Sequential Synthetic Difference-in-Differences (Sequential SDiD)) is the more natural home; PPSCM is a synthetic-control estimator, not a difference-in-differences one.
Spillovers violate SUTVA across the donor pool – use Spatial Synthetic Difference-in-Differences (SpSyDiD).
The treated paths lie outside the donor convex hull / the donor pool is large and noisy. Partial pooling cannot manufacture a hull that does not contain the treated units; a factor-model (Factor Model Approach (FMA)) or low-rank (Cluster Synthetic Controls (CLUSTERSC), Matrix Completion with Nuclear Norm Minimization (MCNNM)) approach is better.
Distributional effects (quantiles, tails) – use Distributional Synthetic Control (DSC).
Notation#
Units \(i = 1, \ldots, n\) are observed over periods \(t = 1, \ldots, T\). Treated unit (or cohort) \(j\) adopts at period \(T_j\); never-treated units have \(T_j = \infty\) and form the donor pool. The panel is split at the last adoption time into a pre-period of length \(d\) and the post-period. For cohort \(j\), donor weights \(\boldsymbol{\omega}_j\) live on the simplex; the synthetic control matches the cohort’s pre-treatment residuals.
Method#
PPSCM follows multisynth in three stages.
1. Two-way fixed effects (fixedeff=True, the default). A time effect is the never-treated units’ per-period mean; a unit effect is each unit’s mean over its own pre-adoption window. Both are removed and the synthetic control balances the residuals – the “intercept-shifted” estimator of the paper.
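2. Partially pooled QP. Let \(\mathbf{q}_j = \mathbf{x}_j - \mathbf{X}_{0,j}\boldsymbol{\omega}_j\) be the pre-treatment imbalance of cohort \(j\), computed on the residuals, and let \(\bar{\mathbf{q}}\) be the pooled imbalance: the average of the \(\mathbf{q}_j\) over the \(J\) cohorts after aligning them by relative time. Schematically, up to implementation-specific scaling constants, the simplex-constrained weights solve
\[\min_{\boldsymbol{\omega}_1, \ldots, \boldsymbol{\omega}_J} \; \frac{\nu}{\text{norm}_{\text{pool}}}\,\lVert\bar{\mathbf{q}}\rVert_2^2 \;+\; \frac{1-\nu}{\text{norm}_{\text{sep}}}\,\frac{1}{J}\sum_{j=1}^{J}\lVert\mathbf{q}_j\rVert_2^2,\]
where \(\text{norm}_{\text{pool}}\) and \(\text{norm}_{\text{sep}}\) are
the global and individual imbalance norms from the separate fit (nu=0). Small
\(\nu\) approaches a separate SCM per cohort; large \(\nu\) a fully
pooled SCM.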
3. Choosing \(\nu\). With nu="auto" (the default) PPSCM uses augsynth’s
triangle-inequality ratio \(\nu = \text{global\_l2}\cdot\sqrt{d}/\text{avg\_l2}\),
computed from the separate fit; passing a float fixes \(\nu\) at that value.
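The auto-\(\nu\) rule in stage 3 is a one-line ratio. The sketch below simply evaluates the formula as written above; the function name auto_nu and the input values are hypothetical, and how global_l2 and avg_l2 are scaled inside mlsynth is not shown here.

```python
import math

def auto_nu(global_l2: float, avg_l2: float, d: int) -> float:
    """Triangle-inequality ratio nu = global_l2 * sqrt(d) / avg_l2.

    global_l2 and avg_l2 are taken from the separate (nu = 0) fit;
    d is the length of the pre-period.
    """
    if avg_l2 <= 0:
        raise ValueError("avg_l2 must be positive")
    return global_l2 * math.sqrt(d) / avg_l2

# Invented inputs: a small pooled imbalance relative to the unit-level one.
nu = auto_nu(global_l2=0.1, avg_l2=0.4, d=16)
print(f"nu = {nu:.2f}")  # nu = 1.00
```

A pooled imbalance that is already small relative to the per-unit imbalance yields a small ratio, so little weight needs to be spent on the pooled term.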
Assumptions / Remarks.
Assumption 1 (no anticipation, parallel residual trends). After removing the two-way fixed effects, the treated cohorts’ residual paths would have matched a convex combination of donor residual paths absent treatment. Remark. This is the staggered-adoption analogue of the SCM identifying assumption; the fixed effects absorb level and common-time shifts so the weights only need to match the residual dynamics.
Assumption 2 (overlap / donor availability). Each cohort has eligible donors
– never-treated units, or units treated more than n_leads periods later.
Remark. Late-treated units can serve as “clean” controls for earlier cohorts
until they themselves are treated, which the donor-eligibility rule enforces.
Remark (pooling). \(\nu\) is a bias–variance dial, not an identification parameter: the estimand (the wATET over the treated cohorts) is the same; \(\nu\) only trades per-cohort fit against stability of the pooled average.
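The donor-eligibility rule in Assumption 2 can be sketched as a boolean mask over adoption indices. This is a literal reading of "treated more than n_leads periods later"; the function name eligible_donors is hypothetical and whether mlsynth uses a strict or weak inequality is an assumption here.

```python
import numpy as np

def eligible_donors(trt, tj, n_leads):
    """Mask of units usable as donors for a cohort adopting at index tj.

    trt holds each unit's adoption index, with np.inf for never-treated units.
    A unit qualifies if it is never treated or adopts more than n_leads
    periods after tj (strict inequality assumed).
    """
    trt = np.asarray(trt, dtype=float)
    return (trt - tj) > n_leads

# Units: never-treated, adopts at 5, adopts at 10, and the cohort itself (3).
trt = np.array([np.inf, 5, 10, 3])
mask = eligible_donors(trt, tj=3, n_leads=4)
print(mask)
```

The unit adopting at index 5 is excluded because it becomes treated inside the cohort's evaluation window, whereas the unit adopting at index 10 stays clean throughout it.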
Inference#
PPSCM reports the paper’s delete-one jackknife: drop each unit, refit the
full estimator (holding \(\nu\) fixed), and form
\(\widehat{\text{se}}^2 = \tfrac{n-1}{n}\sum_i(\hat\theta_i - \bar\theta)^2\)
for the overall ATT and each relative-time horizon, with Wald intervals.
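The jackknife formula above can be sketched for a generic estimator. In PPSCM the estimator would be the full staggered fit with \(\nu\) held at its full-sample value; here np.mean stands in for it, the function name jackknife is hypothetical, and the use of a normal quantile for the Wald interval is an assumption.

```python
import numpy as np
from statistics import NormalDist

def jackknife(estimator, data, alpha=0.05):
    """Delete-one jackknife SE and Wald interval for estimator(data)."""
    data = np.asarray(data)
    n = data.shape[0]
    theta_full = estimator(data)
    # Refit on each leave-one-out sample (rows are units).
    loo = np.array([estimator(np.delete(data, i, axis=0)) for i in range(n)])
    se = np.sqrt((n - 1) / n * np.sum((loo - loo.mean()) ** 2))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # The interval is centred on the full-sample estimate, not the LOO mean.
    return theta_full, se, (theta_full - z * se, theta_full + z * se)

x = np.array([1.0, 2.0, 4.0, 7.0])
theta, se, ci = jackknife(np.mean, x)
print(f"theta = {theta:.3f}, se = {se:.3f}")
```

For the sample mean this reduces to the usual standard error s / sqrt(n), which is a convenient sanity check on the (n - 1) / n scaling.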
Empirical Illustration: mandatory collective bargaining#
The multisynth vignette studies the effect of state mandatory
collective-bargaining laws on log per-pupil education expenditure
(Paglayan 2018), a staggered design. basedata/Teachingaugsynth.scv ships the
panel; the analysis restricts to 1959–1997, drops DC and Wisconsin, and treats
a state from the year it required bargaining.
import numpy as np
import pandas as pd
from mlsynth import PPSCM

# Load the collective-bargaining panel shipped in basedata/.
url = "https://raw.githubusercontent.com/jgreathouse9/mlsynth/refs/heads/main/basedata/Teachingaugsynth.scv"
df = pd.read_csv(url)

# Sample restrictions: drop DC and Wisconsin, keep 1959-1997.
df = df[~df["State"].isin(["DC", "WI"])]
df = df[(df["year"] >= 1959) & (df["year"] <= 1997)].copy()

# Treatment indicator: 1 from the year bargaining was required.
# A missing YearCBrequired is filled with inf, so that state is never treated.
df["cbr"] = (df["year"] >= df["YearCBrequired"].fillna(np.inf)).astype(int)

res = PPSCM({"df": df, "outcome": "lnppexpend", "treat": "cbr",
             "unitid": "State", "time": "year", "display_graphs": True}).fit()

print(f"nu (auto) : {res.design.nu_used:.4f}")
print(f"Average ATT : {res.att:.3f} (SE {res.inference.se:.3f})")
This prints:
nu (auto) : 0.2607
Average ATT : -0.011 (SE 0.020)
reproducing the augsynth vignette (nu = 0.2607, Average ATT -0.011).
Setting time_cohort=True collapses to adoption-time cohorts and gives
nu = 0.3939, Average ATT -0.017 (augsynth: -0.018).
Verification#
Note
Exact replication of augsynth. On the Paglayan data PPSCM matches
augsynth::multisynth to high precision: the auto-\(\nu\) agrees to
four decimals (0.2607 default, 0.3939 time-cohort), the Average ATT matches
(\(-0.011\) default; \(-0.017\) vs \(-0.018\) time-cohort), and
the raw global/individual L2 imbalances agree (0.003 / 0.028). The full
relative-time event study matches the vignette’s per-horizon averages to
3–4 decimals. The decisive fidelity detail is aligning the pooled
imbalance by relative time on top of two-way fixed effects. The jackknife SE
(0.020) is close to augsynth’s default wild-bootstrap SE (0.022); they differ
only by inference procedure. This is locked in by
test_matches_augsynth_vignette in mlsynth/tests/test_ppscm.py.
Core API#
Partially Pooled Synthetic Control (PPSCM) estimator.
A thin orchestration over mlsynth.utils.ppscm_helpers, faithfully
porting augsynth::multisynth:
Ben-Michael, E., Feller, A., & Rothstein, J. (2022). “Synthetic Controls with Staggered Adoption.” JRSS-B 84(2):351-381.
PPSCM removes two-way fixed effects, balances the residuals with a
partially-pooled QP (nu interpolating between separate and fully pooled
SCM), and reports a relative-time event study and overall ATT with the paper’s
delete-one jackknife. time_cohort=True collapses units sharing an adoption
time into one fully-pooled cohort.
- class mlsynth.estimators.ppscm.PPSCM(config: PPSCMConfig | dict)#
Bases: object
Partially Pooled SCM estimator (augsynth::multisynth port).
- Parameters:
config (PPSCMConfig or dict) – Validated configuration. Reads nu (pooling, or "auto"), fixedeff, n_leads, n_lags, time_cohort, lam, run_inference and alpha beyond the common panel fields.
- Returns:
PPSCMResults – Design (pooling level + balance diagnostics), relative-time event study, overall ATT with jackknife inference, and donor weights.
- fit() PPSCMResults#
Fit PPSCM and return the typed result container.
Configuration#
- class mlsynth.config_models.PPSCMConfig(*, df: DataFrame, outcome: str, treat: str, unitid: str, time: str, display_graphs: bool = True, save: bool | str = False, counterfactual_color: List[str] = <factory>, treated_color: str = 'black', nu: float | Literal['auto'] = 'auto', fixedeff: bool = True, n_leads: Annotated[int | None, Ge(ge=1)] = None, n_lags: Annotated[int | None, Ge(ge=1)] = None, time_cohort: bool = False, lam: Annotated[float, Ge(ge=0)] = 0.0, solver: Any = None, run_inference: bool = True, alpha: Annotated[float, Gt(gt=0.0), Lt(lt=1.0)] = 0.05)#
Configuration for the Partially Pooled SCM (PPSCM) estimator.
Implements Ben-Michael, Feller & Rothstein (2022, JRSS-B 84(2):351-381). Targets staggered-adoption designs by minimizing a weighted average of the per-treated-unit imbalance q_sep and the average-treated imbalance q_pool, with weighting hyperparameter nu.
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid'}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
Result Containers#
PPSCM.fit() returns a
PPSCMResults: the
PPSCMDesign (pooling level and
balance diagnostics), the relative-time
PPSCMEventStudy, the overall
PPSCMInference, and the
per-cohort donor weights.
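As a rough illustration of consuming the event-study container, the sketch below flattens arrays with the documented field names (horizons, tau, se, ci) into one row per horizon. EventStudyStandIn is a stand-in defined here, not the real PPSCMEventStudy class; the numbers are invented, and the (H, 2) lower/upper layout of ci is an assumption.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class EventStudyStandIn:
    """Stand-in with the same field names as PPSCMEventStudy (not the real class)."""
    horizons: np.ndarray
    tau: np.ndarray
    se: np.ndarray
    ci: np.ndarray  # assumed shape (H, 2): lower and upper bound per horizon

def to_rows(es):
    """Flatten an event study into one dict per relative-time horizon."""
    return [
        {"horizon": int(h), "att": float(t), "se": float(s),
         "lower": float(lo), "upper": float(hi)}
        for h, t, s, (lo, hi) in zip(es.horizons, es.tau, es.se, es.ci)
    ]

# Invented values, purely to exercise the function.
es = EventStudyStandIn(
    horizons=np.array([0, 1, 2]),
    tau=np.array([-0.01, -0.02, 0.00]),
    se=np.array([0.02, 0.02, 0.03]),
    ci=np.array([[-0.05, 0.03], [-0.06, 0.02], [-0.06, 0.06]]),
)
rows = to_rows(es)
for row in rows:
    print(row)
```

With a real fit, the same function could presumably be applied to res.event_study, provided ci has the layout assumed above.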
Typed, NumPy-first result containers for Partially Pooled SCM (staggered).
PPSCM ports augsynth::multisynth (Ben-Michael, Feller & Rothstein 2022): a
partially-pooled synthetic control for staggered adoption that interpolates,
via nu, between a separate SCM per treated unit (nu small) and a fully
pooled SCM (nu large), on top of two-way fixed effects.
- class mlsynth.utils.ppscm_helpers.structures.PPSCMDesign(nu_used: float, lam: float, fixedeff: bool, time_cohort: bool, n_leads: int, n_lags: int, global_l2: float, ind_l2: float, scaled_global_l2: float, scaled_ind_l2: float)#
Bases: object
The fitted design: pooling level and balance diagnostics.
- class mlsynth.utils.ppscm_helpers.structures.PPSCMEventStudy(horizons: ndarray, tau: ndarray, se: ndarray, ci: ndarray)#
Bases: object
Relative-time (time-since-treatment) average ATT path.
- ci: ndarray#
- horizons: ndarray#
- se: ndarray#
- tau: ndarray#
- class mlsynth.utils.ppscm_helpers.structures.PPSCMInference(att: float, se: float, ci: Tuple[float, float], method: str)#
Bases: object
Overall (post-period average) ATT and its inference.
- class mlsynth.utils.ppscm_helpers.structures.PPSCMInputs(Xy: ndarray, trt: ndarray, n_pre: int, time_labels: ndarray, units: ndarray, outcome: str, intervention_time: Any)#
Bases: object
Preprocessed staggered panel (the only pandas touchpoint is setup).
- Parameters:
Xy (np.ndarray) – Full outcome matrix, shape (n, T) (units x all periods).
trt (np.ndarray) – Adoption index per unit (position in time_labels); inf for never-treated controls.
n_pre (int) – Number of pre-treatment periods (columns before the last adoption).
time_labels (np.ndarray) – Sorted time labels, length T.
units (np.ndarray) – Unit labels, length n.
outcome (str) – Outcome column name.
intervention_time (Any) – The last adoption time (pre/post split point).
- Xy: ndarray#
- property control_units: ndarray#
- time_labels: ndarray#
- property treated_units: ndarray#
- trt: ndarray#
- units: ndarray#
- class mlsynth.utils.ppscm_helpers.structures.PPSCMResults(inputs: PPSCMInputs, design: PPSCMDesign, event_study: PPSCMEventStudy, inference: PPSCMInference, donor_weights: Dict[Any, Dict[Any, float]], metadata: Dict[str, Any] = <factory>)#
Bases: object
Top-level container returned by mlsynth.PPSCM.fit().
- design: PPSCMDesign#
- event_study: PPSCMEventStudy#
- inference: PPSCMInference#
- inputs: PPSCMInputs#
Helper Modules#
Staggered long-to-wide formatting (the only DataFrame touchpoint): derive adoption times, split pre/post at the last adoption.
Long-DataFrame -> NumPy boundary for PPSCM (staggered adoption).
Mirrors augsynth::format_data_stag: derive each unit’s first treated period,
split the panel at the last adoption time into pre (X) and post (y),
and index adoption by position in the sorted time vector (Inf for never-treated).
- mlsynth.utils.ppscm_helpers.setup.prepare_ppscm_inputs(df: DataFrame, *, outcome: str, treat: str, unitid: str, time: str) PPSCMInputs#
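The long-to-wide step described above can be sketched with pandas. This is not the source of prepare_ppscm_inputs: the function to_staggered_arrays, the toy panel, and its column names are hypothetical, and the sketch only mirrors the documented behaviour (first treated period per unit, adoption indexed by position in the sorted time vector, inf for never-treated units).

```python
import numpy as np
import pandas as pd

def to_staggered_arrays(df, outcome, treat, unitid, time):
    """Pivot a long panel to (n, T) and derive each unit's adoption index."""
    wide = df.pivot(index=unitid, columns=time, values=outcome).sort_index(axis=1)
    time_labels = wide.columns.to_numpy()
    pos = {t: k for k, t in enumerate(time_labels)}
    # First period in which each ever-treated unit has treat == 1.
    first = df.loc[df[treat] == 1].groupby(unitid)[time].min()
    trt = np.array(
        [pos[first[u]] if u in first.index else np.inf for u in wide.index],
        dtype=float,
    )
    return wide.to_numpy(dtype=float), trt, time_labels, wide.index.to_numpy()

# Toy panel: A adopts in 2002, B in 2003, C is never treated.
rows = [(u, y, float(y - 2000), int(y >= start))
        for u, start in [("A", 2002), ("B", 2003), ("C", np.inf)]
        for y in range(2000, 2005)]
df = pd.DataFrame(rows, columns=["unit", "year", "y", "d"])

Xy, trt, time_labels, units = to_staggered_arrays(df, "y", "d", "unit", "year")
n_pre = int(trt[np.isfinite(trt)].max())  # columns before the last adoption
print(trt, n_pre)
```

Indexing adoption by position rather than by calendar label keeps everything downstream in plain NumPy slicing.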
The engine: two-way fixed effects (fit_feff), the partially-pooled QP,
auto-\(\nu\), and the relative-time event study / ATT.
Core staggered-adoption engine for PPSCM, ported faithfully from augsynth::multisynth (Ben-Michael, Feller & Rothstein 2022).
- Pipeline (one call = one fit):
fit_feff removes fixed effects (force=3 two-way: time effect from never-treated column means + per-cohort unit pre-mean) and balances the residuals.
solve_cohort_qp solves the partially-pooled QP over donor weights, with the pooled imbalance aligned by relative time (front-padded) and the pooled/separate terms normalized by the separate fit’s norms.
run_multisynth chooses nu (triangle-inequality ratio when “auto”), refits, and produces the relative-time event study and ATT.
Validated against the multisynth vignette: the default fit reproduces it (nu=0.2607, ATT=-0.011), and time_cohort gives nu=0.3939, ATT=-0.017 (augsynth reports -0.018).
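The "aligned by relative time (front-padded)" step in the pipeline can be pictured as right-aligning each cohort's pre-treatment imbalance vector so that the last pre-period of every cohort shares a column. The sketch below is one reading of that description: the function name is hypothetical, and padding with zeros (rather than, say, masking) is an assumption.

```python
import numpy as np

def align_by_relative_time(q_list, width):
    """Right-align imbalance vectors so column -1 is relative time -1.

    q_list : per-cohort pre-treatment imbalances of varying length.
    width  : number of lags kept; shorter vectors are front-padded with zeros,
             longer ones keep only their last `width` entries.
    """
    out = np.zeros((len(q_list), width))
    for j, q in enumerate(q_list):
        q = np.asarray(q, dtype=float)[-width:]
        out[j, width - q.size:] = q
    return out

# A cohort with three pre-periods and a cohort with only two.
aligned = align_by_relative_time([[1.0, 2.0, 3.0], [4.0, 5.0]], width=3)
pooled = aligned.mean(axis=0)  # pooled imbalance by relative time
print(aligned)
print(pooled)
```

Without this alignment the pooled term would average imbalances at the same calendar period, mixing different distances from treatment across cohorts.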
- mlsynth.utils.ppscm_helpers.engine.fit_feff(Xy: ndarray, trt: ndarray, adopt_indices, fixedeff: bool) Dict[int, ndarray]#
Residualize Xy per cohort.
Returns {adoption_index: residual_matrix (n, T)}. With fixedeff the time effect is the never-treated column mean and the unit effect is each unit’s mean residual over its pre-adoption window [:tj]; without it, only the time effect (control averages) is removed.
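The residualization described for fit_feff can be sketched in a few lines of NumPy. This is an independent illustration of the fixedeff=True description, not the library's code: the function name residualize and the toy panel are hypothetical.

```python
import numpy as np

def residualize(Xy, trt, adopt_indices):
    """Remove a time effect and a per-cohort unit effect from Xy.

    Time effect : never-treated column means.
    Unit effect : each unit's mean residual over the window [:tj] of the
                  cohort adopting at index tj.
    Returns {tj: residual matrix of shape (n, T)}.
    """
    Xy = np.asarray(Xy, dtype=float)
    trt = np.asarray(trt, dtype=float)
    time_effect = Xy[np.isinf(trt)].mean(axis=0)
    demeaned = Xy - time_effect
    out = {}
    for tj in adopt_indices:
        unit_effect = demeaned[:, :tj].mean(axis=1, keepdims=True)
        out[tj] = demeaned - unit_effect
    return out

# Additive toy panel: unit level + common time trend, plus a +1 effect
# on the treated unit (row 0) from period 2 onward.
levels = np.array([5.0, 1.0, 2.0])
trend = np.arange(4.0)
Xy = levels[:, None] + trend[None, :]
Xy[0, 2:] += 1.0
trt = np.array([2, np.inf, np.inf])

res = residualize(Xy, trt, adopt_indices=[2])
print(res[2])
```

In this purely additive panel the fixed effects absorb everything except the treatment effect, so the treated row's residuals are zero before adoption and one afterwards, and the donor rows are zero throughout.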
- mlsynth.utils.ppscm_helpers.engine.run_multisynth(Xy: ndarray, trt: ndarray, d: int, n_leads: int, n_lags: int, *, fixedeff: bool = True, time_cohort: bool = False, nu: float | None = None, lam: float = 0.0, solver: Any = None) Dict[str, Any]#
Run one multisynth fit; returns weights, event study, ATT, diagnostics.
- mlsynth.utils.ppscm_helpers.engine.solve_cohort_qp(res, groups, adopt_of, members, donors, n1, d, n, n_lags, nu, norm_pool, norm_sep, lam, solver) Dict[Any, ndarray]#
Partially-pooled QP: per-cohort simplex weights (summing to cohort size).
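To show how the pooling dial moves the solution, here is a small sketch that minimises a weighted combination of pooled and separate imbalance by projected gradient descent. It is a simplification and not the solver used by solve_cohort_qp: it assumes every cohort shares one donor matrix and pre-window (so no relative-time alignment is needed), sets both normalising constants to 1, omits lam, and constrains each weight vector to sum to 1 rather than to the cohort size. All names and numbers are hypothetical.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the unit simplex (sort-based)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def partially_pooled_weights(X, X0, nu, n_iter=5000):
    """Minimise nu*||mean_j q_j||^2 + (1-nu)*mean_j ||q_j||^2 over simplex rows.

    X : (J, d) treated paths; X0 : (d, m) donor paths; q_j = x_j - X0 w_j.
    """
    J, m = X.shape[0], X0.shape[1]
    W = np.full((J, m), 1.0 / m)
    step = J / (2.0 * np.linalg.norm(X0, 2) ** 2)   # 1 / Lipschitz bound
    for _ in range(n_iter):
        Q = X - W @ X0.T
        G = -(2.0 / J) * (nu * Q.mean(axis=0) + (1.0 - nu) * Q) @ X0
        W = np.apply_along_axis(project_simplex, 1, W - step * G)
    return W

def imbalances(X, X0, W):
    """Return (pooled, separate) imbalance for weights W."""
    Q = X - W @ X0.T
    return np.linalg.norm(Q.mean(axis=0)), np.sqrt(np.mean(np.sum(Q**2, axis=1)))

X = np.array([[2.0, 0.0], [0.0, 2.0]])              # two treated cohorts
X0 = np.array([[3.0, 0.0, 1.0], [0.0, 1.0, 1.0]])   # three donors, d = 2

W_sep = partially_pooled_weights(X, X0, nu=0.0)
W_pool = partially_pooled_weights(X, X0, nu=1.0)
pool_0, sep_0 = imbalances(X, X0, W_sep)
pool_1, sep_1 = imbalances(X, X0, W_pool)
print(f"nu=0: pooled {pool_0:.3f}, separate {sep_0:.3f}")
print(f"nu=1: pooled {pool_1:.3f}, separate {sep_1:.3f}")
```

Intermediate values of nu trace out the frontier between these two endpoints. Projected gradient is used only to keep the example dependency-free; a dedicated QP solver would be the more usual choice.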
The paper’s delete-one jackknife inference.
Delete-one jackknife inference for PPSCM (Ben-Michael et al. 2022).
The paper’s jackknife drops each unit i (treated or control), refits the
full staggered estimator on the remaining n - 1 units (holding nu
fixed), and forms
se^2 = (n - 1) / n * sum_i (theta_i - mean_i theta_i)^2
separately for the overall ATT and each relative-time horizon. Wald intervals are built from these SEs around the full-sample point estimates.
- mlsynth.utils.ppscm_helpers.inference.jackknife_inference(Xy: ndarray, trt: ndarray, d: int, n_leads: int, n_lags: int, *, fixedeff: bool, time_cohort: bool, nu_used: float, lam: float, solver: Any, alpha: float, per_time_full: ndarray, att_full: float) Tuple[float, float, Tuple[float, float], ndarray, ndarray]#
Return (att, se, ci, per_time_se, per_time_ci).