Two-Step Synthetic Control#
When to Use This Estimator#
The synthetic control (SC) method of Abadie and Gardeazabal rests on an identifying assumption that can only be partially checked: after treatment, the synthetic control should track the treated unit’s untreated outcome. The testable part is the SC pre-trends assumption – that the synthetic control tracks the treated unit before treatment. In practice this is usually verified by eyeballing a plot, which is informal and easy to get wrong; imposing SC’s restrictions when they do not hold can bias the ATT, sometimes with the wrong sign.
Use TSSC, due to Li and Shankar [TSSC], when you have a single treated unit observed over a panel whose outcomes may follow nonstationary, nonlinear trends (sales, market share, macro series), and you want to decide – formally, not visually – whether SC’s restrictions are appropriate or whether you should relax them. TSSC:
- runs a formal hypothesis test of the SC pre-trends assumption, and
- recommends the member of the SC class that best balances the dual goal of reducing bias (relax restrictions that are violated) and increasing efficiency (keep restrictions that hold, since each correct restriction shrinks the estimator’s variance).
If the SC restrictions hold, TSSC keeps the efficient SC estimator. If they are violated, TSSC backs off to the least-restrictive modified variant needed – no more, no less.
Note
TSSC is the only estimator in mlsynth that does not rely on a
machine-learning step; it is a sequence of constrained least-squares
fits plus a subsampling test.
Notation#
We index units by \(j\), with \(j = 1\) the sole treated unit and \(j = 2, \ldots, N\) the control (donor) units. Time runs over \(t\), partitioned by the intervention – which takes effect after period \(T_0\) – into a pre-treatment window \(\mathcal{T}_1 \coloneqq \{t : t \le T_0\}\) of length \(T_1 = T_0\) and a post-treatment window \(\mathcal{T}_2 \coloneqq \{t : t > T_0\}\) of length \(T_2 = T - T_0\), with \(T = T_1 + T_2\). Let \(y_{jt}^0\) and \(y_{jt}^1\) denote potential outcomes without and with treatment; we observe
\[y_{jt} = \begin{cases} y_{jt}^1, & j = 1 \text{ and } t \in \mathcal{T}_2, \\ y_{jt}^0, & \text{otherwise.} \end{cases}\]
The regression design vector is \(\mathbf{x}_t = (1, y_{2t}, \ldots, y_{Nt})'\), so the coefficient vector \(\boldsymbol{\beta} = (\beta_1, \beta_2, \ldots, \beta_N)'\) has \(\beta_1\) as the intercept and \(\beta_2, \ldots, \beta_N\) as the donor slopes. We write \(\mathbf{1}_L\) and \(\mathbf{0}_L\) for the \(L\)-vectors of ones and zeros, and \(\widehat{\boldsymbol{\beta}}_{\mathrm{MSC},T_1}\) for the benchmark MSC(c) estimate on the pre-treatment sample. The estimand is the average treatment effect on the treated,
\[\mathrm{ATT} = \frac{1}{T_2} \sum_{t \in \mathcal{T}_2} \bigl(y_{1t} - \widehat{y}_{1t}^0\bigr),\]
where \(\widehat{y}_{1t}^0\) is the counterfactual untreated outcome.
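As a minimal sketch of this definition (not mlsynth's internal code), the ATT is the average post-period gap between the observed treated series and a counterfactual path. The function name `att_from_gap` and the toy numbers are made up for illustration.

```python
import numpy as np

def att_from_gap(y_treated, y_counterfactual, T0):
    """Mean post-period gap: average of y_1t - yhat0_1t over t > T0."""
    gap = np.asarray(y_treated, dtype=float) - np.asarray(y_counterfactual, dtype=float)
    return float(gap[T0:].mean())  # rows T0, T0+1, ... are the post-treatment periods

# Toy series with T0 = 3 pre-periods and T2 = 2 post-periods (illustrative values).
y_obs = np.array([1.0, 2.0, 3.0, 6.0, 7.0])
y_hat0 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
att = att_from_gap(y_obs, y_hat0, T0=3)
print(att)  # 2.0
```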
The Class of Synthetic Control Methods#
Each method fits \(y_{1t} = \mathbf{x}_t' \boldsymbol{\beta} + e_{1t}\) on the pre-treatment window by minimizing \(\sum_{t \in \mathcal{T}_1} (y_{1t} - \mathbf{x}_t'\boldsymbol{\beta})^2\) subject to a subset of three restrictions: (1) a zero intercept \(\beta_1 = 0\); (2) the donor weights sum to one \(\sum_{j=2}^N \beta_j = 1\); (3) the donor weights are non-negative \(\beta_j \ge 0\). The four members differ only in which restrictions they impose:
| Method | Restrictions | Intercept |
|---|---|---|
| SC | (1), (2), (3) – the canonical Abadie estimator | none |
| MSCa | (2), (3) – weights sum to one | free |
| MSCb | (1), (3) – no adding-up | none |
| MSCc | (3) only – most flexible benchmark | free |
Geometrically, SC projects the treated unit onto the convex hull of the donors; MSCb and MSCc project onto a convex cone (non-negative weights that need not sum to one), with MSCc additionally allowing a vertical shift via the intercept. The flexible variants are appropriate when the treated unit sits on a steeper trend than its donors, but that flexibility costs efficiency – which is precisely the trade-off Step 1 adjudicates.
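The four constrained least-squares problems can be sketched with a general-purpose solver. This is an illustration under stated assumptions, not mlsynth's implementation (which, per the API notes, delegates to a cvxpy wrapper): it uses SciPy's SLSQP routine, and the names `RESTRICTIONS` and `fit_sc_variant` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

# (zero intercept?, weights sum to one?) per variant; non-negativity always holds.
RESTRICTIONS = {
    "SC": (True, True),
    "MSCa": (False, True),
    "MSCb": (True, False),
    "MSCc": (False, False),
}

def fit_sc_variant(y_pre, donors_pre, method):
    """Constrained least squares for one SC-class variant (illustrative sketch)."""
    zero_intercept, sum_to_one = RESTRICTIONS[method]
    T1, n_donors = donors_pre.shape
    n_free = 0 if zero_intercept else 1  # leading unbounded coefficient (intercept)
    X = donors_pre if zero_intercept else np.column_stack([np.ones(T1), donors_pre])

    def sse(b):
        r = y_pre - X @ b
        return r @ r

    def grad(b):
        return -2.0 * X.T @ (y_pre - X @ b)

    bounds = [(None, None)] * n_free + [(0.0, None)] * n_donors
    constraints = []
    if sum_to_one:
        constraints.append({"type": "eq", "fun": lambda b: b[n_free:].sum() - 1.0})
    start = np.r_[np.zeros(n_free), np.full(n_donors, 1.0 / n_donors)]
    res = minimize(sse, start, jac=grad, bounds=bounds, constraints=constraints,
                   method="SLSQP", options={"ftol": 1e-12, "maxiter": 500})
    # Always return (intercept, donor weights), with intercept 0 when it is imposed.
    return np.r_[0.0, res.x] if zero_intercept else res.x

# Toy data: the treated unit is the donor average shifted up by 3.
rng = np.random.default_rng(0)
donors = 1.0 + 0.05 * np.arange(20)[:, None] + 0.3 * rng.standard_normal((20, 4))
treated = donors.mean(axis=1) + 3.0
fits = {m: fit_sc_variant(treated, donors, m) for m in RESTRICTIONS}
for m, beta in fits.items():
    print(f"{m:5} intercept={beta[0]:+.2f} weight sum={beta[1:].sum():.2f}")
```

In this toy setup only the variants with a free intercept can reproduce the shift; SC is forced to stay inside the donors' range.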
Failure modes of the convex hull: when each restriction matters#
The three SC restrictions – zero intercept, weights sum to one, weights non-negative – together force the synthetic counterfactual to lie within the convex hull of the donors’ pre-period paths. Geometrically that’s a useful prior when the treated unit does look like a weighted average of the donors. It is a catastrophic prior when the treated unit doesn’t. Two simple violations cover almost every empirical case where vanilla SC goes wrong:
Level shift – treated above (or below) every donor’s level. With a zero intercept, the synthetic is a pure convex combination of the donors, which can never escape the donors’ range. If a national chain’s flagship store outsells every donor store by a roughly constant amount, SC’s synthetic has to lie between the donors and cannot reach the treated level. The pre-period RMSE blows up; the post-period ATT inherits the same gap as a spurious treatment effect.
Fix: add a free intercept. That is MSCa (keeps sum-to-one and non-negativity, drops the zero-intercept). An intercept can absorb an arbitrary vertical shift without inflating the donor weights.
Steeper trend – treated growing faster than the fastest donor. The sum-to-one restriction forces the synthetic to track a weighted average of donor trajectories, whose slope is bounded above by the maximum donor slope. If the treated unit’s pre-trend is steeper than any donor’s, the SC fit is uniformly behind in the pre-period – and again the post-period gap is mostly miscalibration, not treatment effect.
Fix: drop sum-to-one. That is MSCb (keeps the zero intercept and non-negativity, allows the weights to sum above one). Weights summing above one act as a slope amplifier: \(2 \cdot \text{donor with slope 1}\) reproduces slope 2.
Both at once. Common in practice – the treated unit is bigger in level and steeper in trend than every donor. Neither MSCa (slope still bounded) nor MSCb (no intercept) is enough.
Fix: relax both. That is MSCc, the most flexible variant in the family, retaining only non-negativity.
Side-by-side example#
The script below builds four panels with the same donor pool. Only the treated unit changes between panels, isolating the failure mode in question. The true ATT is zero in every panel (no treatment effect was injected); the pre-RMSE and post-period ATT estimates under each of the four variants tell the user which restriction is hurting the fit, and the TSSC recommendation picks the right variant automatically.
import numpy as np
import pandas as pd
from mlsynth import TSSC


def panel(treated, donors, T1=20):
    # Build the long panel: one treated unit "T" plus donors "d0", "d1", ...
    T = treated.size
    rows = [
        {"unit": "T", "t": t, "y": float(treated[t]),
         "treat": int(t >= T1)}
        for t in range(T)
    ]
    for i, d in enumerate(donors):
        rows.extend(
            {"unit": f"d{i}", "t": t, "y": float(d[t]), "treat": 0}
            for t in range(T)
        )
    return pd.DataFrame(rows)


def report(label, df):
    # Fit TSSC and print the recommendation plus all four variants.
    res = TSSC({
        "df": df, "outcome": "y", "treat": "treat",
        "unitid": "unit", "time": "t",
        "display_graphs": False, "seed": 0,
    }).fit()
    print(f"\n{label}")
    print(f" TSSC recommended: {res.recommended_method}")
    for m in ("SC", "MSCa", "MSCb", "MSCc"):
        v = res.variants[m]
        ic = "" if v.intercept is None else f" intercept={v.intercept:+.2f}"
        print(f" {m:5} ATT = {v.att:+7.3f} pre-RMSE = {v.rmse_pre:5.3f}{ic}")


rng = np.random.default_rng(0)
T, T1 = 30, 20
t = np.arange(T)

# Eight donors on a shallow common trend.
donors = np.array([1.0 + 0.05 * t + 0.3 * rng.standard_normal(T)
                   for _ in range(8)])

# (A) Treated lies inside the donor hull.
treated_A = donors.mean(axis=0) + 0.10 * rng.standard_normal(T)
report("(A) Inside the hull -> SC", panel(treated_A, donors))

# (B) Treated is uniformly +8 above every donor.
report("(B) Level shift -> MSCa", panel(treated_A + 8.0, donors))

# (C) Treated trends 4x faster than any donor.
treated_C = 1.0 + 0.20 * t + 0.3 * rng.standard_normal(T)
report("(C) Steeper slope -> MSCb", panel(treated_C, donors))

# (D) Treated is both shifted and steeper.
treated_D = 5.0 + 0.20 * t + 0.3 * rng.standard_normal(T)
report("(D) Shifted AND steeper -> MSCc", panel(treated_D, donors))
prints (deterministic with the seed above):
(A) Inside the hull -> SC
TSSC recommended: SC
SC ATT = -0.059 pre-RMSE = 0.079
MSCa ATT = -0.147 pre-RMSE = 0.063 intercept=+0.06
MSCb ATT = -0.189 pre-RMSE = 0.062
MSCc ATT = -0.184 pre-RMSE = 0.062 intercept=+0.01
(B) Level shift -> MSCa
TSSC recommended: MSCa
SC ATT = +7.973 pre-RMSE = 7.897
MSCa ATT = -0.147 pre-RMSE = 0.063 intercept=+8.06
MSCb ATT = -3.761 pre-RMSE = 1.415
MSCc ATT = -0.184 pre-RMSE = 0.062 intercept=+8.01
(C) Steeper slope -> MSCb
TSSC recommended: MSCb
SC ATT = +3.669 pre-RMSE = 1.396
MSCa ATT = +2.430 pre-RMSE = 0.721 intercept=+1.23
MSCb ATT = +1.720 pre-RMSE = 0.493
MSCc ATT = +1.720 pre-RMSE = 0.493 intercept=-0.00
(D) Shifted AND steeper -> MSCc
TSSC recommended: MSCc
SC ATT = +7.719 pre-RMSE = 5.303
MSCa ATT = +2.408 pre-RMSE = 0.804 intercept=+5.30
MSCb ATT = +0.102 pre-RMSE = 0.434
MSCc ATT = +0.750 pre-RMSE = 0.332 intercept=+1.71
How to read the output:
In (A), no restriction is binding. All four variants attain similar pre-RMSE (~0.06-0.08), and TSSC’s first-step test fails to reject the joint null – so the recommendation is the most restrictive, most efficient member: SC.
In (B), dropping the zero intercept is all you need. MSCa (intercept ~+8) and MSCc (intercept ~+8) both hit a pre-RMSE of ~0.06. SC’s pre-RMSE balloons to 7.9; MSCb tries to fake the level shift by inflating the weights past 1 and only partially succeeds (pre-RMSE 1.4). TSSC’s test rejects the joint null but fails to reject sum-to-one, so it recommends MSCa.
In (C), dropping sum-to-one is what’s needed. The treated’s slope (0.20) is 4x the donors’ (0.05); only MSCb / MSCc can amplify weights to chase the steeper trajectory. The intercept in MSCc is essentially zero – the level shift wasn’t the problem. TSSC tests through to MSCb.
In (D) both restrictions bite. MSCa misses on slope, MSCb misses on level; only MSCc (free intercept and weights free to sum above one) fixes both. The decision path runs all the way to the leaf of the flowchart.
What TSSC does is automate this diagnosis: instead of eyeballing the pre-period fit and guessing which restriction is the problem, Step 1 runs a formal sub-sampling test that rejects exactly the restriction(s) the data refuse, and Step 2 then applies the most restrictive variant the test couldn’t reject – buying back efficiency wherever the data say SC’s restrictions are correct.
Assumptions#
The theory is developed for a nonstationary, nonlinear-trend factor model \(y_{jt}^0 = c_j + d_j f_t + u_{jt}\), where \(f_t\) is a common trend of unknown functional form, \(c_j\) an intercept, \(d_j\) a factor loading, and \(u_{jt}\) an idiosyncratic error.
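To make the factor model concrete, the sketch below draws one panel from it. The trend shape, parameter ranges, and the name `simulate_factor_panel` are illustrative choices, not part of mlsynth or the paper's simulation design.

```python
import numpy as np

def simulate_factor_panel(n_units=6, n_periods=40, noise_sd=0.3, seed=0):
    """Draw y_jt = c_j + d_j * f_t + u_jt with one common nonlinear trend."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, n_periods + 1, dtype=float)
    f = np.sqrt(t) + 0.05 * t                    # common trend f_t (made-up form)
    c = rng.uniform(0.0, 2.0, size=n_units)      # unit intercepts c_j
    d = rng.uniform(0.5, 1.5, size=n_units)      # factor loadings d_j
    u = noise_sd * rng.standard_normal((n_periods, n_units))  # idiosyncratic u_jt
    return c + np.outer(f, d) + u                # shape (n_periods, n_units)

y_panel = simulate_factor_panel()
treated, donors = y_panel[:, 0], y_panel[:, 1:]  # column 0 plays the treated unit
print(y_panel.shape, donors.shape)
```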
Assumption 1 (data-generating process). The idiosyncratic errors \(u_{jt}\) are zero-mean, serially uncorrelated, stationary with a finite fourth moment and uncorrelated with the common factor; the projection error \(e_{1t} = y_{1t}^0 - \mathbf{x}_t'\boldsymbol{\beta}_0\) is a zero-mean, finite-variance stationary process obeying a central limit theorem; and \(T_2/T_1 \to \eta\) for a finite \(\eta \ge 0\).
Remark. This says the only nonstationarity in the panel comes through the shared trend \(f_t\) – the unit-specific noise is well behaved. It is what lets a linear combination of donors soak up the treated unit’s trend and leave a stationary residual, which is exactly the parallel-trends condition the test targets.
Assumption 2 (trend and design regularity). The common trend grows no faster than a leading term \(g(t)\) (e.g. a polynomial or \(\log t\), but not \(e^t\)), and the pre-treatment second-moment matrix of the donor outcomes converges to a positive-definite limit.
Remark. The growth bound rules out trends so explosive that no fixed linear combination of donors can track them; the positive-definite Gram condition rules out perfectly collinear donors so the weights are well-defined. Both are mild for typical marketing and macro panels.
Parallel trends. Two nonlinear series have parallel trends if their difference is a zero-mean stationary process. The SC pre-trends assumption is that \(y_{1t}\) and \(\mathbf{x}_t'\widehat{\boldsymbol{\beta}}_{\mathrm{SC}}\) are parallel for \(t \in \mathcal{T}_1\).
Remark. Under Assumptions 1-2, Li and Shankar show (Proposition 3.1) that the MSC(c) fitted curve is almost always parallel to the treated series, provided at least one donor’s loading has the same sign as the treated unit’s. MSC(c) is therefore the natural benchmark against which the SC restrictions are tested.
Step 1: Testing the SC Pre-Trends Assumption#
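The key equivalence (Proposition 3.1) is that, with MSC(c) as benchmark, the SC pre-trends assumption holds if and only if the two SC restrictions hold – the donor weights sum to one and the intercept is zero. So testing pre-trends reduces to a joint linear restriction, tested with \(\widehat{\boldsymbol{\beta}}_{\mathrm{MSC},T_1}\):
\[H_0:\ \mathbf{R}\boldsymbol{\beta}_0 = \mathbf{q}, \qquad \mathbf{R} = \begin{pmatrix} 0 & \mathbf{1}_{N-1}' \\ 1 & \mathbf{0}_{N-1}' \end{pmatrix}, \qquad \mathbf{q} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.\]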
The first row tests adding-up; the second tests the zero intercept. With \(\widehat{\mathbf{d}} = \mathbf{R}\widehat{\boldsymbol{\beta}}_{\mathrm{MSC},T_1} - \mathbf{q}\), the feasible statistic is the quadratic form
\[\widehat{S}_{T_1} = T_1\, \widehat{\mathbf{d}}'\, \widehat{\mathbf{V}}^{-1}\, \widehat{\mathbf{d}}.\]
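Because the constrained estimator can sit on the boundary of its parameter space (a weight pinned at zero), its limit is the projection of a normal onto a convex cone – non-standard, so the ordinary bootstrap fails. Li and Shankar instead use subsampling:
Step i. For \(t = 1, \ldots, m\), draw \((\mathbf{x}_t^{*}, y_{1t}^{*})\) with replacement from the \(T_1\) pre-treatment observations.
Step ii. Refit MSC(c) on the subsample to obtain \(\widehat{\boldsymbol{\beta}}^{*}_{\mathrm{MSC},m,b}\).
Step iii. Repeat \(B\) times, \(b = 1, \ldots, B\).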
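The subsample fits give a consistent variance estimate \(\widehat{\mathbf{V}} = \mathbf{R}\,\widehat{\mathrm{Var}}^{*}\!\bigl(\sqrt{T_1} \widehat{\boldsymbol{\beta}}_{\mathrm{MSC},T_1}\bigr) \mathbf{R}'\) with
\[\widehat{\mathrm{Var}}^{*}\!\bigl(\sqrt{T_1} \widehat{\boldsymbol{\beta}}_{\mathrm{MSC},T_1}\bigr) = \frac{m}{B} \sum_{b=1}^{B} \bigl(\widehat{\boldsymbol{\beta}}^{*}_{\mathrm{MSC},m,b} - \widehat{\boldsymbol{\beta}}_{\mathrm{MSC},T_1}\bigr) \bigl(\widehat{\boldsymbol{\beta}}^{*}_{\mathrm{MSC},m,b} - \widehat{\boldsymbol{\beta}}_{\mathrm{MSC},T_1}\bigr)',\]
and the subsampling distribution \(S^{*}_{m,b} = \bigl(\sqrt{m} \mathbf{R}(\widehat{\boldsymbol{\beta}}^{*}_{\mathrm{MSC},m,b} - \widehat{\boldsymbol{\beta}}_{\mathrm{MSC},T_1}) \bigr)' \widehat{\mathbf{V}}^{-1} \bigl(\sqrt{m} \mathbf{R}(\widehat{\boldsymbol{\beta}}^{*}_{\mathrm{MSC},m,b} - \widehat{\boldsymbol{\beta}}_{\mathrm{MSC},T_1}) \bigr)\). Sorting the \(S^{*}_{m,b}\) gives the \((1-\alpha)\) acceptance region \([S^{*}_{m,(\alpha B/2)},\, S^{*}_{m,((1-\alpha/2)B)}]\); we reject \(H_0\) when \(\widehat{S}_{T_1}\) falls outside it.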
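The joint test can be sketched end to end in a few lines. This is a simplified illustration, not the library's `select_method`: it fits MSC(c) with SciPy's `lsq_linear`, and it assumes the variance estimate is the \(m\)-scaled second moment of the subsample deviations around the full-sample fit. The names `fit_mscc` and `joint_subsampling_test` are hypothetical.

```python
import numpy as np
from scipy.optimize import lsq_linear

def fit_mscc(X, y):
    """MSC(c): free intercept in column 0, non-negative donor slopes."""
    lower = np.r_[-np.inf, np.zeros(X.shape[1] - 1)]
    return lsq_linear(X, y, bounds=(lower, np.inf)).x

def joint_subsampling_test(y_pre, donors_pre, m=None, B=500, alpha=0.05, seed=0):
    rng = np.random.default_rng(seed)
    T1, n_donors = donors_pre.shape
    m = T1 if m is None else m                    # m = T1 is the bootstrap special case
    X = np.column_stack([np.ones(T1), donors_pre])
    beta_hat = fit_mscc(X, y_pre)

    # R beta = q encodes: donor weights sum to one, intercept is zero.
    R = np.vstack([np.r_[0.0, np.ones(n_donors)], np.r_[1.0, np.zeros(n_donors)]])
    q = np.array([1.0, 0.0])

    # Steps i-iii: refit MSC(c) on B size-m draws taken with replacement.
    dev = np.empty((B, X.shape[1]))
    for b in range(B):
        idx = rng.integers(0, T1, size=m)
        dev[b] = fit_mscc(X[idx], y_pre[idx]) - beta_hat

    V_hat = R @ (m * dev.T @ dev / B) @ R.T       # assumed form of the variance estimate
    V_inv = np.linalg.pinv(V_hat)
    d_hat = R @ beta_hat - q
    stat = float(T1 * d_hat @ V_inv @ d_hat)      # full-sample quadratic form
    Rd = np.sqrt(m) * dev @ R.T                   # shape (B, 2)
    s_star = np.einsum("bi,ij,bj->b", Rd, V_inv, Rd)
    lo, hi = np.quantile(s_star, [alpha / 2, 1 - alpha / 2])
    return {"statistic": stat, "lower": float(lo), "upper": float(hi),
            "rejected": bool(stat < lo or stat > hi)}

# Toy pre-period: the treated unit is a noisy average of four donors.
rng = np.random.default_rng(1)
T1 = 30
donors = 1.0 + 0.05 * np.arange(T1)[:, None] + 0.3 * rng.standard_normal((T1, 4))
treated = donors.mean(axis=1) + 0.1 * rng.standard_normal(T1)
result = joint_subsampling_test(treated, donors, B=200)
print(result)
```

The single-restriction tests follow the same pattern with one row of `R` at a time and the variance replaced by one.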
If the joint \(H_0\) is rejected, the source of the violation is unclear, so we test the two restrictions singly. With \(\mathbf{R}_a = (0, \mathbf{1}_{N-1}')\), \(q_a = 1\) for adding-up and \(\mathbf{R}_b = (1, \mathbf{0}_{N-1}')\), \(q_b = 0\) for the intercept, the single statistic is simply the squared scaled deviation (here \(\widehat{\mathbf{V}}\) is replaced by one),
\[\widehat{S}_{T_1,s} = \bigl(\sqrt{T_1}\,\widehat{d}_s\bigr)^2, \qquad \widehat{d}_s = \mathbf{R}_s\widehat{\boldsymbol{\beta}}_{\mathrm{MSC},T_1} - q_s, \quad s \in \{a, b\},\]
with acceptance regions read off the corresponding subsampling distributions. TSSC then walks a decision tree: keep all SC restrictions if the joint test is not rejected (use SC); otherwise test adding-up – not rejected gives MSCa; if rejected, test the zero intercept – not rejected gives MSCb, rejected gives MSCc. In words, relax exactly the restriction(s) the data reject, stopping at the least-flexible variant consistent with the evidence.
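The decision tree itself is small enough to write out. The function below is a hypothetical helper mirroring the order of tests described above; it takes the three reject/fail-to-reject outcomes as booleans and ignores later ones once an earlier test is not rejected.

```python
def recommend_variant(joint_rejected, sum_to_one_rejected, zero_intercept_rejected):
    """Walk the SC -> MSCa -> MSCb -> MSCc tree from three test outcomes."""
    if not joint_rejected:
        return "SC"    # both restrictions survive the joint test
    if not sum_to_one_rejected:
        return "MSCa"  # keep adding-up, free the intercept
    if not zero_intercept_rejected:
        return "MSCb"  # keep the zero intercept, drop adding-up
    return "MSCc"      # relax both restrictions

for outcome in [(False, False, False), (True, False, True),
                (True, True, False), (True, True, True)]:
    print(outcome, "->", recommend_variant(*outcome))
```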
Note
The subsample size \(m\) is a tuning parameter
(subsample_size). The paper’s rule of thumb is \(m\) between
\(T_1/2\) and \(T_1\) for moderate \(T_1\); the bootstrap
special case \(m = T_1\) (the default here, used when
subsample_size is None) performs well in their simulations. If
different \(m\) give similar decisions, the test is reliable.
Step 2: Estimating the ATT and Its Confidence Interval#
With the variant chosen, the ATT is the mean post-period gap between the observed treated series and the recommended counterfactual. Each variant also carries its own confidence interval via the subsampling procedure of Li (2020): refit the variant on permuted size-\(m\) pre-treatment subsamples (whose treated outcome is regenerated from the fitted weights plus pre-period noise) to capture donor-weight estimation error, and add post-period idiosyncratic prediction noise. The interval is \([\mathrm{ATT} - q_{1-\alpha/2},\ \mathrm{ATT} - q_{\alpha/2}]\), with \(q\) the quantiles of the normalized statistic.
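The interval formula can be illustrated on its own. In the sketch below `stat_draws` is a placeholder for the subsampling distribution of the statistic, assumed to be already on the ATT's scale; how mlsynth builds and normalizes those draws is not reproduced here, and `att_interval` is a hypothetical name.

```python
import numpy as np

def att_interval(att, stat_draws, alpha=0.05):
    """Return [ATT - q_{1-alpha/2}, ATT - q_{alpha/2}] from draws of the statistic."""
    q_lo, q_hi = np.quantile(stat_draws, [alpha / 2, 1 - alpha / 2])
    return float(att - q_hi), float(att - q_lo)

rng = np.random.default_rng(0)
stat_draws = rng.normal(loc=0.0, scale=0.5, size=2000)  # stand-in draws, not real output
lower, upper = att_interval(1.2, stat_draws)
print(round(lower, 2), round(upper, 2))
```

Note that the upper quantile is subtracted to form the lower endpoint, so a skewed distribution of draws produces an interval that is asymmetric around the point estimate.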
Remark. Because each correct restriction removes estimation variance,
the recommended variant typically has a tighter interval than the
fully flexible MSC(c) – the efficiency half of TSSC’s dual goal made
visible. mlsynth reports the CI for all four variants
(att_ci_by_method()) so this trade-off is inspectable.
Verification#
TSSC is validated against the authors’ published numbers and their Figure-2 Monte Carlo. On Li & Shankar’s Brooklyn-showroom panel the recommended variant’s ATT (\(+1{,}131.975\)) and pre-RMSE (\(434.448\)) match the paper to three decimals (and Step 1 selects MSC(b), as the paper reports); the Figure-2 MSE-ratio grid reproduces, with all 16 cells below 1. See the dedicated replication page, TSSC — Two-Step Synthetic Control (Li & Shankar 2024), for the full code, tables and discussion.
Core API#
Two-Step Synthetic Control (TSSC) estimator.
Implements:
Li, K. T., & Shankar, V. (2023). “A Two-Step Synthetic Control Approach for Estimating Causal Effects of Marketing Events.” Management Science. https://doi.org/10.1287/mnsc.2023.4878
TSSC addresses a gap in synthetic-control practice: the SC pretrends assumption is usually checked only by visual inspection. TSSC instead
Step 1 (model selection). Formally tests the SC pretrends assumption – equivalent to the joint restriction that the donor weights sum to one and the intercept is zero (Proposition 3.1) – using a subsampling procedure (Proposition 3.2), then walks a decision tree to recommend the SC-class variant that balances bias and efficiency: SC, MSCa, MSCb, or MSCc.
Step 2 (estimation). Fits the recommended variant and reports the ATT as the mean post-period gap.
See mlsynth.utils.tssc_helpers for the algorithmic pieces.
- class mlsynth.estimators.tssc.TSSC(config: TSSCConfig | dict)#
Bases: object
Two-Step Synthetic Control (TSSC) estimator.
- Parameters:
config (TSSCConfig or dict) – Configuration object. See mlsynth.utils.tssc_helpers.config.TSSCConfig.
- Returns:
TSSCResults – Container with all four SC-class variant fits, the Step-1 selection record, and a standardized summary for the recommended variant.
- fit() TSSCResults#
Run the two-step pipeline and return the results.
Configuration#
- class mlsynth.utils.tssc_helpers.config.TSSCConfig(*, df: ~pandas.DataFrame, outcome: str, treat: str, unitid: str, time: str, display_graphs: bool = True, save: bool | str = False, counterfactual_color: ~typing.List[str] = <factory>, treated_color: str = 'black', plot: ~mlsynth.config_models.PlotConfig = <factory>, alpha: ~typing.Annotated[float, ~annotated_types.Gt(gt=0.0), ~annotated_types.Lt(lt=1.0)] = 0.05, subsample_size: ~typing.Annotated[int | None, ~annotated_types.Ge(ge=2)] = None, draws: ~typing.Annotated[int, ~annotated_types.Ge(ge=1)] = 500, ci: ~typing.Annotated[float, ~annotated_types.Gt(gt=0.0), ~annotated_types.Lt(lt=1.0)] = 0.95, seed: int | None = None)#
Configuration for the Two-Step Synthetic Control (TSSC) estimator.
- Parameters:
alpha (float) – Two-sided significance level for the Step-1 restriction tests (the SC-pretrends test and the two single-restriction tests). Default 0.05.
subsample_size (int or None) – Subsample size m for the Step-1 subsampling procedure. When None (default) it is set to T_1 (the bootstrap special case the paper’s simulations validate). For genuine subsampling, the paper’s rule of thumb is m between T_1/2 and T_1 for moderate T_1 (and smaller for large T_1).
draws (int) – Number of subsampling replications B for the Step-1 tests and bootstrap replications for the per-variant ATT confidence intervals. Default 500.
ci (float) – Confidence level for the per-variant ATT confidence interval. Default 0.95.
seed (int or None) – Seed for the subsampling RNG (reproducibility). Default None.
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid'}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
Result Containers#
The four frozen dataclasses returned by the pipeline. TSSC.fit() returns a TSSCResults, whose variants map holds one TSSCVariantFit per SC-class method and whose selection records the Step-1 TSSCRestrictionTest outcomes inside a TSSCSelection.
Structured containers for the Two-Step Synthetic Control (TSSC) estimator.
Notation (following Li & Shankar, 2023):
Treated unit j = 1; control units j = 2, ..., N. The regression is y_{1t} = x_t' beta + e_{1t} with x_t = (1, y_{2t}, ..., y_{Nt})', so beta_1 is the intercept and beta_2, ..., beta_N are the donor slope coefficients. Pre-period length T_1, post-period length T_2.
The class of SC methods (Table 1) is indexed by which restrictions are imposed on beta:
SC : (1) beta_1 = 0, (2) sum_{j>=2} beta_j = 1, (3) beta_j >= 0
MSCa : (2) and (3) – weights sum to one, with intercept
MSCb : (1) and (3) – no adding-up, zero intercept
MSCc : (3) only – most flexible benchmark
- class mlsynth.utils.tssc_helpers.structures.TSSCInputs(y: ndarray, donor_matrix: ndarray, donor_names: Sequence, T0: int, T2: int, T: int, time_labels: ndarray, treated_unit_name: str)#
Bases: object
Preprocessed panel data for TSSC.
- Parameters:
y (np.ndarray) – Treated-unit outcome over all T periods, shape (T,).
donor_matrix (np.ndarray) – Control-unit outcomes, shape (T, N-1) (the N-1 donors; the intercept column is added internally by the variants that use it).
donor_names (Sequence) – Length N-1 donor labels.
T0 (int) – Number of pre-treatment periods (T_1).
T2 (int) – Number of post-treatment periods.
T (int) – Total number of periods (T_1 + T_2).
time_labels (np.ndarray) – Length-T time labels.
treated_unit_name (str)
- donor_matrix: ndarray#
- time_labels: ndarray#
- y: ndarray#
- class mlsynth.utils.tssc_helpers.structures.TSSCRestrictionTest(name: str, statistic: float, ci_lower: float, ci_upper: float, rejected: bool)#
Bases: object
Outcome of one Step-1 subsampling restriction test.
- Parameters:
name (str) – "joint" (H0: weights sum to one AND zero intercept), "sum_to_one" (H0a), or "zero_intercept" (H0b).
statistic (float) – Feasible test statistic on the full pre-treatment sample (S_hat_{T1} for the joint test; (sqrt(T1) d_hat_s)^2 for a single restriction).
ci_lower, ci_upper (float) – The alpha/2 and 1 - alpha/2 quantiles of the subsampling distribution – the estimated (1 - alpha) acceptance region under H0 (Proposition 3.2).
rejected (bool) – True if statistic falls outside [ci_lower, ci_upper].
- class mlsynth.utils.tssc_helpers.structures.TSSCResults(*, effects: EffectsResults | None = None, fit_diagnostics: FitDiagnosticsResults | None = None, time_series: TimeSeriesResults | None = None, weights: WeightsResults | None = None, inference: InferenceResults | None = None, method_details: MethodDetailsResults | None = None, sub_method_results: Dict[str, Any] | None = None, additional_outputs: Dict[str, Any] | None = None, raw_results: Dict[str, Any] | None = None, execution_summary: Dict[str, Any] | None = None, plot_config: PlotConfig | None = None, inputs: TSSCInputs, variants: Dict[str, TSSCVariantFit], selection: TSSCSelection, summary: BaseEstimatorResults | None = None)#
Bases: BaseEstimatorResults
Public TSSC.fit() return container.
An EffectResult (the observational report): besides the TSSC-specific fields below, it exposes the standardized sub-models (effects, time_series, weights, inference, fit_diagnostics, method_details) – lifted from the recommended variant’s summary – and the flat accessors att / att_ci / counterfactual / gap / donor_weights / pre_rmse.
- Parameters:
inputs (TSSCInputs) – Preprocessed panel data.
variants (dict) – method_name -> TSSCVariantFit for all four SC-class methods.
selection (TSSCSelection) – The Step-1 recommendation and its underlying tests.
summary (BaseEstimatorResults, optional) – Standardized result bundle for the recommended variant.
- att_ci_by_method() Dict[str, Tuple[float, float]]#
{method: (ci_lower, ci_upper)}across all four variants.
- inputs: TSSCInputs#
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'frozen': True, 'json_encoders': {<class 'numpy.ndarray'>: <function BaseEstimatorResults.Config.<lambda>>}}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- property recommended: TSSCVariantFit#
The TSSCVariantFit for the recommended method.
- selection: TSSCSelection#
- variants: Dict[str, TSSCVariantFit]#
- class mlsynth.utils.tssc_helpers.structures.TSSCSelection(recommended: str, tests: Dict[str, TSSCRestrictionTest], alpha: float, subsample_size: int, n_subsamples: int, mscc_beta: ndarray, decision_path: Tuple[str, ...])#
Bases: object
Record of the Step-1 model-selection procedure.
- Parameters:
recommended (str) – Selected method: "SC", "MSCa", "MSCb", or "MSCc".
tests (dict) – test_name -> TSSCRestrictionTest for each test actually run (the decision tree short-circuits, so fewer than three may appear).
alpha (float) – Two-sided significance level used for the acceptance regions.
subsample_size (int) – Subsample size m (Step i of the subsampling procedure).
n_subsamples (int) – Number of subsampling replications B.
mscc_beta (np.ndarray) – Full-sample MSC(c) coefficient vector beta_hat_{MSC,T1} of length N (intercept first), the benchmark all tests build on.
decision_path (tuple of str) – Human-readable trace of the decision tree.
- mscc_beta: ndarray#
- tests: Dict[str, TSSCRestrictionTest]#
- class mlsynth.utils.tssc_helpers.structures.TSSCVariantFit(method: str, weights: ndarray, intercept: float | None, donor_weights: dict, counterfactual: ndarray, gap: ndarray, att: float, att_ci: Tuple[float, float], rmse_pre: float, rmse_post: float, r2_pre: float)#
Bases: object
Fitted result for one SC-class variant.
- Parameters:
method (str) – "SC", "MSCa", "MSCb", or "MSCc".
weights (np.ndarray) – Raw coefficient vector returned by the solver. Length N for the intercept variants (MSCa/MSCc; intercept first) and N-1 for the no-intercept variants (SC/MSCb).
intercept (float or None) – The fitted intercept (None for SC and MSCb, which impose beta_1 = 0).
donor_weights (dict) – donor_label -> weight for donors with non-negligible weight.
counterfactual (np.ndarray) – Length-T synthetic control path.
gap (np.ndarray) – Length-T gap y - counterfactual.
att (float) – Mean post-period gap.
att_ci (tuple of float) – (lower, upper) bootstrap confidence interval for the ATT.
rmse_pre, rmse_post (float) – Pre/post RMSE of the gap.
r2_pre (float) – Pre-treatment R-squared of the fit.
- counterfactual: ndarray#
- gap: ndarray#
- weights: ndarray#
Helper Modules#
Data preparation – pivots the long panel into the typed
TSSCInputs.
Data preparation for TSSC.
Wraps mlsynth.utils.datautils.dataprep() into the typed
TSSCInputs container. No demeaning or scaling is applied – the
SC-class regressions operate on the raw outcome levels (the intercept
variants absorb level differences directly).
- mlsynth.utils.tssc_helpers.setup.prepare_tssc_inputs(df: DataFrame, outcome: str, unitid: str, time: str, treat: str) TSSCInputs#
Pivot a long panel into the typed inputs the TSSC pipeline expects.
Constrained-LS estimation of the four SC-class variants and the per-variant subsampling ATT confidence interval (Step 2).
Constrained-LS estimation of the four SC-class variants for TSSC.
Each variant minimizes the pre-treatment sum of squares
sum_{t<=T1} (y_{1t} - x_t' beta)^2 subject to its constraint set
(see structures). The solve is delegated to the project’s shared
Opt.SCopt cvxpy wrapper. The post-period counterfactual is
y_hat^0_{1t} = x_t' beta_hat and the ATT is the mean post-period gap
(Li & Shankar, 2023, Eqs. (2.1)/(2.5)).
ATT confidence intervals (per variant) use the subsampling procedure of
Li (2020), which combines two sources of uncertainty: (i) donor-weight
estimation error – captured by refitting the variant on size-m
permuted pre-treatment subsamples whose treated outcome is regenerated
from the fitted weights plus pre-period noise – and (ii) post-period
idiosyncratic prediction error. The interval is
[ATT - q_{1-a/2}, ATT - q_{a/2}] where the q are quantiles of the
normalized statistic’s subsampling distribution.
- mlsynth.utils.tssc_helpers.estimation.bootstrap_att_ci(inputs: TSSCInputs, method: str, weights: ndarray, counterfactual: ndarray, att: float, n_bootstrap: int, confidence_level: float, rng: Generator) Tuple[float, float]#
Subsampling confidence interval for a variant’s ATT (Li, 2020).
- Parameters:
inputs (TSSCInputs)
method (str) – SC-class variant name.
weights (np.ndarray) – The variant’s fitted coefficient vector.
counterfactual (np.ndarray) – Length-T fitted/counterfactual path for the variant.
att (float) – The variant’s point ATT.
n_bootstrap (int) – Number of subsampling replications.
confidence_level (float) – E.g. 0.95 for a 95% interval.
rng (numpy.random.Generator) – RNG for reproducible subsampling (no global-seed side effects).
- mlsynth.utils.tssc_helpers.estimation.fit_mscc_beta(donor_pre: ndarray, y_pre: ndarray, n_pre: int, n_donors: int) ndarray | None#
MSC(c) coefficient vector beta (length N, intercept first).
This is the benchmark estimator the Step-1 tests build on, and the workhorse re-fit inside the subsampling loop.
- mlsynth.utils.tssc_helpers.estimation.fit_variant(inputs: TSSCInputs, method: str, n_bootstrap: int, confidence_level: float = 0.95, rng: Generator | None = None) TSSCVariantFit#
Fit one SC-class variant and assemble its TSSCVariantFit.
The Step-1 subsampling test of the SC pre-trends assumption and the SC -> MSCa -> MSCb -> MSCc decision tree.
Step 1 of TSSC: select the SC-class method by subsampling tests.
Implements Section 3.2 of Li & Shankar (2023). The benchmark MSC(c)
coefficient beta_hat_{MSC,T1} is computed on the full pre-treatment
sample, then a subsampling-with-replacement procedure approximates the
null distribution of the restriction test statistics:
Step i. For t = 1, ..., m draw (x_t*, y_{1t}*) with replacement from the T1 pre-treatment observations.
Step ii. Refit MSC(c) on the subsample -> beta_hat*_{MSC,m,b}.
Step iii. Repeat B times.
The full-sample statistics are (Eqs. 3.6/3.9, 3.13)
joint: S_hat_{T1} = T1 * d_hat' V_hat^{-1} d_hat, d_hat = R beta_hat - q
single: S_hat_{T1,s} = (sqrt(T1) d_hat_s)^2, V replaced by 1
with V_hat = R Var*(sqrt(T1) beta_hat) R' (Eqs. 3.7-3.8). The
subsampling distribution {S*_{m,b}} (built from sqrt(m) R(beta* -
beta_hat)) gives the (1 - alpha) acceptance region
[S*_{(alpha B/2)}, S*_{((1 - alpha/2) B)}] (Proposition 3.2). H0 is
rejected when the full-sample statistic falls outside it.
Decision tree (Figure 1): joint not rejected -> SC; else sum-to-one not rejected -> MSCa; else zero-intercept not rejected -> MSCb; else MSCc.
- mlsynth.utils.tssc_helpers.selection.select_method(inputs: TSSCInputs, alpha: float = 0.05, subsample_size: int | None = None, n_subsamples: int = 500, seed: int | None = None) TSSCSelection#
Run the Step-1 decision tree and return the recommended method.
- Parameters:
inputs (TSSCInputs)
alpha (float) – Two-sided significance level for each restriction test.
subsample_size (int, optional) – Subsample size m. Defaults to T1 (the bootstrap special case the paper’s simulations validate; choose m smaller – rule of thumb T1/2 to T1 – for genuine subsampling).
n_subsamples (int) – Number of subsampling replications B.
seed (int, optional) – RNG seed for reproducible subsampling.
Assembly of the standardized BaseEstimatorResults summary for the
recommended variant.
Assemble the standardized BaseEstimatorResults for TSSC.
Packages the recommended variant’s fit plus the Step-1 selection record
into the project’s standardized result models, so TSSC’s public
summary matches the rest of the mlsynth suite.
- mlsynth.utils.tssc_helpers.results_assembly.build_summary(inputs: TSSCInputs, variant: TSSCVariantFit, selection: TSSCSelection) BaseEstimatorResults#
Standardized bundle for the Step-1-recommended TSSC variant.
The observed-vs-recommended-counterfactual plot.
Observed vs recommended-counterfactual plot for TSSC.
- mlsynth.utils.tssc_helpers.plotter.plot_tssc(results: TSSCResults, title: str | None = None) None#
Plot the treated series against the recommended variant’s synthetic.
The Li & Shankar Figure 2 DGP, packaged as simulate_tssc_sample so
the Path-B replication in Verification runs as a one-liner.
Li & Shankar (2023) Figure 2 simulation helper for TSSC.
Implements the Monte Carlo design used in
TSSC_Figure2_MSE_Ratio.m from the paper’s replication package:
three latent common factors driving every unit with homogeneous
loadings \(b = [1, 1, 1]'\), plus an additive intercept and iid
\(\mathcal{N}(0, 1)\) idiosyncratic noise. The treated unit’s index
is unit 0; donors are units \(1, \ldots, N - 1\).
with \(b_k = (1, 1, 1)'\) for every unit (the “SC restrictions hold” regime, where plain SC dominates MSCc in MSE), \(\alpha = 1\), and the three factors:
with \(u_{kt} \sim \mathcal{N}(0, 1)\) and initial values zero. \(f_1\) is a nonlinear AR(1) trend; \(f_2\) is ARMA(1, 1); \(f_3\) is MA(2). True ATT is zero — the MATLAB code adds a treatment-effect path \(\Delta_t\) but its size \(C_{TE} = 0\) in the published Figure 2 setup.
- class mlsynth.utils.tssc_helpers.simulation.TSSCSample(df: DataFrame, y_treated: ndarray, donors: ndarray, factors: ndarray, T1: int, T2: int, N_co: int)#
One draw from the Figure 2 DGP.
- df#
Long panel with columns unit / time / y / treat ready for mlsynth.TSSC.
- Type: pd.DataFrame
- y_treated#
Treated outcome over the full timeline, shape (T,).
- Type: np.ndarray
- donors#
Donor outcomes, shape (T, N_co).
- Type: np.ndarray
- factors#
Common factor matrix, shape (T, 3).
- Type: np.ndarray
- T1, T2, N_co
Pre-treatment periods, post-treatment periods, and donor count.
- Type: int
- df: DataFrame#
- donors: ndarray#
- factors: ndarray#
- y_treated: ndarray#
- mlsynth.utils.tssc_helpers.simulation.simulate_tssc_sample(T1: int = 76, T2: int = 34, N_co: int = 10, alpha: float = 1.0, rng: Generator | None = None) TSSCSample#
Draw one sample from the Li & Shankar Figure 2 DGP.
Defaults match the MATLAB Mock_data_code dimensions (T = 110, T_1 = 76, N_{co} = 10); pass smaller T1 / T2 for the Figure 2 sweep.
- Parameters:
T1, T2 (int) – Pre- and post-treatment period counts.
N_co (int) – Number of donor units (10 in the paper’s left-panel exercise, 30 in the right-panel exercise).
alpha (float, default 1.0) – Constant added to every unit’s outcome.
rng (np.random.Generator, optional) – NumPy RNG. Defaults to np.random.default_rng().
- Returns:
TSSCSample