Synthetic Historical Control (SHC)#

Overview#

The Synthetic Historical Control (SHC) method estimates the time-varying intervention effect on a single treated unit using only that unit’s own time series; no cross-sectional control units are required. It targets settings where every unit is treated (a nationwide policy, a global shock such as COVID-19), so the synthetic control method has no donor pool to draw on.

SHC is due to Chen, Yang & Yang (2024). It builds on a semi-parametric time-series regression in which a smooth latent trend \(\ell_t\) is a time-varying confounder. Rather than extrapolate an assumed parametric trend (as the interrupted-time-series, Prophet, and CausalImpact approaches do), SHC carries the synthetic-control idea over to a single series: it replaces the treated unit with a treated block and the cross-sectional donors with overlapping historical blocks from the same series, then matches the treated block’s pre-intervention segment with a simplex combination of its historical counterparts. Matching quality can therefore be checked against the pre-period fit, whereas a misspecified parametric trend cannot be diagnosed once the intervention has taken place.

When to use this estimator#

Reach for SHC when there is one treated unit and no credible untreated controls, but a reasonably long pre-intervention series with recurring local structure (cycles, seasonal-like swings — not strict periodicity):

Nationwide / global interventions. A country-level minimum-wage hike, a national pension reform, or the macroeconomic impact of a pandemic, where no other unit is plausibly untreated. The paper’s applications are Brexit’s effect on UK GDP growth and COVID-19’s effect on US GDP growth.
Cross-sectional controls exist but fail the SC matching condition. Even when donors are available, SHC can match the treated pre-period better than SC if the donors track the treated series poorly (the paper’s Brexit case: SHC pre-period MSE 0.029 vs SC 0.256).

If you have a clean panel with valid untreated donors, the cross-sectional estimators (mlsynth.CLUSTERSC, mlsynth.PDA, mlsynth.SBC) are the appropriate tools.

Notation#

Let \(j = 1\) denote the single treated unit — here the only unit, since SHC needs no cross-section. Time runs over \(t \in \mathcal{T} \coloneqq \{1, \dots, T\}\), 1-indexed; the intervention takes effect after period \(T_0\), splitting \(\mathcal{T}\) into the pre-period \(\mathcal{T}_1 \coloneqq \{t \in \mathcal{T} : t \le T_0\}\) (of length \(T_0\)) and the post-period \(\mathcal{T}_2 \coloneqq \{t \in \mathcal{T} : t > T_0\}\) (of length \(T - T_0\)). The observed series is \(\mathbf{y}_1 = (y_{11}, \dots, y_{1T})^\top \in \mathbb{R}^{T}\) with scalar outcomes \(y_{1t}\) and treatment dummy \(d_{1t}\) (\(0\) for \(t \in \mathcal{T}_1\), \(1\) for \(t \in \mathcal{T}_2\)).

The per-period treatment effect is \(\tau_t\) (the paper’s \(\delta_t\)); SHC reconstructs the counterfactual \(\widehat{\ell}_t\) and estimates the effect as \(\widehat{\tau}_t \coloneqq y_{1t} - \widehat{\ell}_t\), with ATT \(\widehat{\tau} \coloneqq |\mathcal{T}_2|^{-1} \sum_{t \in \mathcal{T}_2} \widehat{\tau}_t\). Block weights are \(\mathbf{w}\), one per historical block, constrained to the unit simplex \(\Delta \coloneqq \{\mathbf{w} \in \mathbb{R}^{N}_{\ge 0} : \|\mathbf{w}\|_1 = 1\}\), where \(N\) is the number of historical blocks, with optimiser \(\mathbf{w}^\ast\). The significance level is \(\alpha\). The docstrings reproduced under Helper Modules largely keep the paper’s notation, writing \(\delta_t\) for \(\tau_t\) and \(T_o\) for \(T_0\).

Mathematical formulation#

Setup#

For the treated unit with outcome \(\mathbf{y}_1\) and intervention indicator \(d_{1t}\) (\(0\) for \(t \in \mathcal{T}_1\), \(1\) afterwards), the semi-parametric model (Eq. 2; the implementation uses the simplified \(x_t\)-free form) is

\[y_{1t} = \ell_t + \tau_t\, d_{1t} + \varepsilon_t,\]

where \(\ell_t = \ell(t)\) is a non-stochastic, smooth latent trend, \(\tau_t\) is the time-varying intervention effect, and \(\varepsilon_t\) is a zero-mean error. Because both \(\ell_t\) and \(\tau_t\) are unobserved post-intervention, naive pre/post or semi-parametric (Robinson 1988) methods cannot separate the two; SHC identifies \(\tau_t\) by reconstructing the post-intervention \(\ell_t\).

Treated and historical blocks#

Fix a pre-intervention block length \(m\) and a post horizon \(n\). The treated block spans \([T_0 - (m-1),\, T_0 + n]\), with pre-segment \(\boldsymbol\ell_{pre}\) and post-segment \(\boldsymbol\ell_{post}\). The pre-period is sliced into \(N = T_0 - n - (m-1)\) overlapping historical blocks, each with the same pre/post split (Eq. 7). The SHC weights solve a simplex-matching problem on the latent pre-segments,

\[\mathbf{w}^\ast = \operatorname*{argmin}_{\mathbf{w} \in \Delta} \bigl\| \widehat{\boldsymbol\ell}_{pre} - \widehat{\mathbf{L}}_{pre} \mathbf{w} \bigr\|^2, \qquad \Delta = \{\mathbf{w} \ge 0 : \mathbf{1}^\top \mathbf{w} = 1\},\]

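and the post-intervention counterfactual is the same combination applied to the historical forward segments, \(\widehat\ell_t(\mathbf{w}^\ast) = \sum_{i=1}^{N} w_i^\ast \, \widehat\ell_{t(i)}\), where \(t(i)\) denotes the period in historical block \(i\) that occupies the same within-block position as \(t\) does in the treated block.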
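The block layout and the simplex-matching step can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the library's code: `build_blocks` and `simplex_match` are hypothetical names, the 0-based block offsets are one reading of the definitions above, and a Frank-Wolfe iteration with exact line search stands in for the QP solver that mlsynth actually uses.

```python
import numpy as np


def build_blocks(ell_pre, m, n):
    """Slice a pre-period trend of length T0 into SHC blocks (hypothetical helper).

    Returns the treated pre-segment (m,), the historical pre-segments (m, N)
    and the historical forward segments (n, N), with N = T0 - n - (m - 1).
    """
    T0 = ell_pre.size
    N = T0 - n - (m - 1)
    if N < 1:
        raise ValueError("need T0 > m + n - 1 to form a historical block")
    treated_pre = ell_pre[T0 - m:]
    # Historical block i uses [i, i + m) as its pre-segment and
    # [i + m, i + m + n) as its forward segment; the last block ends at T0.
    L_pre = np.stack([ell_pre[i:i + m] for i in range(N)], axis=1)
    L_post = np.stack([ell_pre[i + m:i + m + n] for i in range(N)], axis=1)
    return treated_pre, L_pre, L_post


def simplex_match(target, L_pre, max_iter=5000, tol=1e-12):
    """Minimise ||target - L_pre @ w||^2 over the unit simplex by Frank-Wolfe."""
    N = L_pre.shape[1]
    w = np.full(N, 1.0 / N)              # start from uniform weights
    for _ in range(max_iter):
        resid = L_pre @ w - target
        grad = 2.0 * (L_pre.T @ resid)
        vertex = np.zeros(N)
        vertex[np.argmin(grad)] = 1.0    # best single block for this gradient
        direction = vertex - w
        gap = -(grad @ direction)        # Frank-Wolfe duality gap
        if gap <= tol:
            break
        step = L_pre @ direction
        gamma = np.clip(-(resid @ step) / (step @ step), 0.0, 1.0)
        w = w + gamma * direction        # convex update stays on the simplex
    return w


# Toy latent trend with exact period 20, so history can reproduce the treated block.
T0, m, n = 100, 10, 5
ell_pre = np.sin(2 * np.pi * np.arange(T0) / 20)

treated_pre, L_pre, L_post = build_blocks(ell_pre, m, n)
w_star = simplex_match(treated_pre, L_pre)
cf_post = L_post @ w_star                # counterfactual over the post horizon

print("N =", L_pre.shape[1])
print("pre-segment matching error:", np.linalg.norm(L_pre @ w_star - treated_pre))
print("counterfactual:", np.round(cf_post, 3))
```

Because the weights are a convex combination, the counterfactual can never leave the range spanned by the historical forward segments, which is why a pure growth trend cannot be matched without differencing first.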

Identifying assumptions#

Assumption 1 (regularity). \(\{\varepsilon_t\}\) is i.i.d. with zero mean and finite variance. Remark. This is what identifies the (semi-parametric) nuisance components in the pre-period.

Assumption 2(a) (smoothness). \(\ell(\cdot)\) has a bounded \((H+1)\)-th derivative, with \(m - 1 \ge H \ge 3\). Remark. The degree of smoothness \(H\) controls the bias bound \(b_\epsilon(H, k) = 2\epsilon |k|^{H+1}/(H+1)!\) (Proposition 1): the estimator is approximately unbiased, with bias vanishing as the latent component gets smoother or the post horizon \(k\) shrinks. This is why SHC favors a small post horizon and why larger-horizon estimates should be read cautiously.

Assumption 2(b) (matching). The treated pre-segment is reproducible as a convex combination of its historical counterparts, \(\boldsymbol\ell_{pre} = \boldsymbol\ell_{pre}(\mathbf{w}_o)\) for some \(\mathbf{w}_o \in \Delta\). Remark. This is the distributional analogue of the SC matching condition, transplanted from cross-sectional donors to historical blocks. It is checkable from the pre-period fit. It also precludes a pure growth trend (which cannot be reproduced by its own history), so differencing/detrending the series first is recommended.
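As a quick numerical reading of the bias bound in Assumption 2(a), the snippet below evaluates \(b_\epsilon(H, k)\) at a few horizons. It is only a worked illustration: `bias_bound` is a hypothetical helper, and the value of \(\epsilon\) is an arbitrary placeholder, not a number taken from the paper.

```python
from math import factorial


def bias_bound(eps, H, k):
    """Evaluate b_eps(H, k) = 2 * eps * |k|**(H + 1) / (H + 1)!."""
    return 2.0 * eps * abs(k) ** (H + 1) / factorial(H + 1)


eps = 1e-6  # placeholder value, chosen only for illustration
H = 3       # smallest smoothness degree allowed by Assumption 2(a)
bounds = {k: bias_bound(eps, H, k) for k in (1, 5, 10, 25)}
for k, b in bounds.items():
    print(f"k = {k:>2}: bound = {b:.3e}")
```

With \(H\) held fixed the bound grows like \(|k|^{H+1}\), which is the sense in which longer horizons call for more caution.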

Algorithm (implementation)#

The two-stage estimator (Section 2.3) is orchestrated by mlsynth.utils.shc_helpers.orchestration.solve_shc():

Latent trend. Estimate \(\widehat\ell_t\) over the pre-period by local-linear kernel regression, with the bandwidth chosen by leave-one-out cross-validation (bandwidth_grid).
Blocks. Build the treated block and the \(N\) historical blocks.
Matching. Weight all \(N\) historical blocks by solving the simplex-constrained matching QP (Eq. 23), with its quadratic term replaced by the nearest-PD approximation \(\widehat{\mathbf{L}}_{\mathrm{pre}}^{\top} \widehat{\mathbf{L}}_{\mathrm{pre}} + \varsigma C_2 C_2^{\top}\); the simplex constraint itself zeroes out the irrelevant blocks.
Augmentation (optional). use_augmented=True adds an ASHC ridge refinement on top of the simplex weights.
Counterfactual. Apply the weights to the historical forward segments to obtain \(\widehat\ell_t(\mathbf{w}^\ast)\) over the post horizon; the gap \(y_{1t} - \widehat\ell_t(\mathbf{w}^\ast)\) estimates \(\tau_t\).
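The latent-trend step can be illustrated with a plain NumPy local-linear smoother. The sketch below is not the library's implementation: the Gaussian kernel, the rescaling of time to (0, 1], and the names `local_linear` and `loocv_bandwidth` are assumptions made for the example; only the 50-point grid on [0.05, 1.0] follows the documented default.

```python
import numpy as np


def local_linear(x, y, h, leave_one_out=False):
    """Local-linear kernel fit of y on x, evaluated at every x[i]."""
    D = x[None, :] - x[:, None]              # D[i, j] = x[j] - x[i]
    K = np.exp(-0.5 * (D / h) ** 2)          # Gaussian kernel weights (assumed)
    if leave_one_out:
        np.fill_diagonal(K, 0.0)             # drop each point from its own fit
    S0 = K.sum(axis=1)
    S1 = (K * D).sum(axis=1)
    S2 = (K * D ** 2).sum(axis=1)
    # Closed-form intercept of the weighted least-squares line at x[i].
    num = (K * (S2[:, None] - S1[:, None] * D)) @ y
    return num / (S0 * S2 - S1 ** 2)


def loocv_bandwidth(x, y, grid):
    """Pick the bandwidth with the smallest leave-one-out squared error."""
    scores = [np.mean((y - local_linear(x, y, h, leave_one_out=True)) ** 2)
              for h in grid]
    return float(grid[int(np.argmin(scores))])


rng = np.random.default_rng(0)
T0 = 200
x = np.arange(1, T0 + 1) / T0                # time rescaled to (0, 1]
truth = np.sin(2 * np.pi * x)                # smooth latent trend
y = truth + rng.normal(scale=0.3, size=T0)   # noisy pre-period outcome

grid = np.linspace(0.05, 1.0, 50)
best_h = loocv_bandwidth(x, y, grid)
ell_hat = local_linear(x, y, best_h)

print("selected bandwidth:", round(best_h, 3))
print("MSE of raw series :", np.mean((y - truth) ** 2))
print("MSE of smoothed   :", np.mean((ell_hat - truth) ** 2))
```

The smoothed series `ell_hat` is what the later steps would slice into blocks; the leave-one-out flag is only used while scoring candidate bandwidths.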

Inference#

SHC reports the conformal permutation test of Chen, Yang & Yang (2024, footnote 21) — their application of Chernozhukov, Wüthrich & Zhu (2021) — for the sharp null \(H_0: \tau_t = 0\) over the post period. The test statistic is

\[S = n^{-1/2} \sum_{t=T_0+1}^{T_0+n} \bigl| \widehat\varepsilon_t^0 \bigr|, \qquad \widehat\varepsilon_t^0 = y_{1t} - \widehat\ell_t,\]

and the null distribution is built by sampling \(n\) residuals with replacement from the \(T_0\) pre-intervention residuals, 1,000 times. results.inference exposes p_value, test_statistic, the 1/5/10% critical_values and reject decisions, the resampled null_distribution, and Andrews-Genton conformal bands for the plot.

Note

The test is designed for the empirical setting where a genuine effect is present (in the paper’s Brexit application it rejects at the 1% level: \(S = 2.492 > 2.190\)). Because the reference residuals are the in-sample kernel-smoother residuals, which are mildly under-dispersed relative to the true noise, the test can over-reject under an exact null; the paper does not run it in the (effect-free) simulation.
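The resampling scheme described in this section fits in a few lines of NumPy. This is a sketch written from that description rather than a copy of the library routine; `bootstrap_sharp_null_test` is a hypothetical name and the residuals are synthetic.

```python
import numpy as np


def bootstrap_sharp_null_test(pre_resid, post_resid, num_resamples=1000,
                              levels=(0.01, 0.05, 0.10), seed=0):
    """Test H0: tau_t = 0 with S = n**-0.5 * sum(|post residuals|)."""
    pre = np.asarray(pre_resid, dtype=float)
    post = np.asarray(post_resid, dtype=float)
    if pre.size == 0 or post.size == 0:
        raise ValueError("residual arrays cannot be empty")
    n = post.size
    stat = np.abs(post).sum() / np.sqrt(n)
    rng = np.random.default_rng(seed)
    # Each row is one draw of n pre-period residuals, sampled with replacement.
    draws = rng.choice(pre, size=(num_resamples, n), replace=True)
    null = np.abs(draws).sum(axis=1) / np.sqrt(n)
    critical = {a: float(np.quantile(null, 1.0 - a)) for a in levels}
    return {
        "test_statistic": float(stat),
        "p_value": float(np.mean(null >= stat)),
        "critical_values": critical,
        "reject": {a: bool(stat > cv) for a, cv in critical.items()},
        "null_distribution": null,
    }


rng = np.random.default_rng(1)
pre_resid = rng.normal(scale=0.1, size=90)   # synthetic pre-period residuals
post_resid = np.full(8, 0.5)                 # a constant gap of 0.5 over n = 8

out = bootstrap_sharp_null_test(pre_resid, post_resid)
print("S =", round(out["test_statistic"], 3), " p =", out["p_value"])
print("reject:", out["reject"])
```

Since the null is drawn from the pre-period residual pool, any under-dispersion in that pool shrinks the critical values, which is the over-rejection risk the note above describes.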

Core API#

class mlsynth.SHC(config: SHCConfig | dict)#

Bases: object

Synthetic Historical Control (SHC) estimator.

Estimates a single treated unit’s untreated counterfactual from its own time series alone, by matching the latent pre-intervention trend with a simplex combination of overlapping historical blocks (Chen, Yang & Yang 2024). The augmented variant (ASHC) adds a ridge refinement.

Parameters: config (SHCConfig or dict) – Configuration object. See mlsynth.config_models.SHCConfig. Key fields: m (pre-intervention block length), use_augmented (ASHC), bandwidth_grid (LOOCV candidates).
Returns: SHCResults – Bandwidth, latent trend, historical-block weights, the post-intervention counterfactual, the ATT, fit diagnostics, and the conformal permutation inference.

References

Chen, Yi-Ting, Jui-Chung Yang, and Tzu-Ting Yang (2024). “Synthetic Historical Control for Policy Evaluation.” SSRN 4995085.

__init__(config: SHCConfig | dict) → None#

fit() → SHCResults#: Run the SHC pipeline and return structured results.

Configuration#

class mlsynth.config_models.SHCConfig(*, df: pandas.DataFrame, outcome: str, treat: str, unitid: str, time: str, display_graphs: bool = True, save: bool | str = False, counterfactual_color: typing.List[str] = <factory>, treated_color: str = 'black', plot: mlsynth.config_models.PlotConfig = <factory>, m: int = 1, bandwidth_grid: typing.List[float] | None = None, use_augmented: bool = False, inference_method: str = 'bootstrap', permutation_scheme: str = 'moving_block', num_permutations: int | None = None)#

bandwidth_grid: List[float] | None#

check_shc_params() → SHCConfig#

inference_method: str#

m: int#

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid'}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

num_permutations: int | None#

permutation_scheme: str#

use_augmented: bool#

Helper Modules#

Data preparation for the Synthetic Historical Control (SHC) estimator.

Wraps datautils.dataprep (in single-series mode, since SHC needs no cross-sectional donors), builds a time IndexSet, and validates that the pre-treatment window is long enough to form at least one historical block: T0 > m + n - 1 (Section 2.2).

mlsynth.utils.shc_helpers.setup.prepare_shc_inputs(df: DataFrame, outcome: str, treat: str, unitid: str, time: str, m: int) → SHCInputs#

Pivot a single-unit panel into SHCInputs.

Parameters:

df (pd.DataFrame) – Long balanced panel for one treated unit (donors optional and ignored).
outcome, treat, unitid, time (str) – Column names.
m (int) – Pre-intervention block length. Must satisfy T0 > m + n - 1 so at least one historical block exists.

Returns:

SHCInputs
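The length check is simple arithmetic and can be reproduced directly. The helper below is a hypothetical stand-in for that validation, not the function mlsynth calls. The first two input triples reproduce the block counts quoted in the Monte Carlo section (with n = 25), and the third matches the dimensions of the worked example.

```python
def count_historical_blocks(T0, m, n):
    """Return N = T0 - n - (m - 1), or raise if no historical block fits."""
    if m < 1 or n < 1:
        raise ValueError("m and n must be positive integers")
    N = T0 - n - (m - 1)
    if N < 1:
        raise ValueError(
            f"pre-period too short: need T0 > m + n - 1 = {m + n - 1}, got T0 = {T0}"
        )
    return N


for T0, m, n in [(425, 25, 25), (850, 50, 25), (90, 10, 8)]:
    print(f"T0={T0}, m={m}, n={n} -> N={count_historical_blocks(T0, m, n)}")
```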

End-to-end Synthetic Historical Control procedure (Chen-Yang-Yang 2024).

Composes the estimator’s stages (Section 2.3) using the shared numerical kernels in mlsynth.utils.shc_helpers.kernels and mlsynth.utils.datautils:

LOOCV-select a kernel bandwidth and smooth the pre-period into the latent trend ell_hat (Eq. 21).

Build the treated block and the overlapping historical blocks (Eq. 7).

Match the treated pre-segment with a simplex combination of all historical pre-segments by solving the nearest-PD QP (Eq. 23).

(Optional) augment with a ridge refinement (ASHC).

Apply the weights to the historical forward segments to obtain the post-intervention counterfactual.

mlsynth.utils.shc_helpers.orchestration.solve_shc(inputs: SHCInputs, *, use_augmented: bool = False, bandwidth_grid: Sequence[float] | None = None) → SHCDesign#

Run the SHC pipeline and return the fitted design.

Parameters:

inputs (SHCInputs) – From prepare_shc_inputs().
use_augmented (bool) – Apply the augmented (ASHC) ridge refinement to the simplex weights.
bandwidth_grid (sequence of float, optional) – Candidate bandwidths for LOOCV. Defaults to 50 points on [0.05, 1.0].

mlsynth.utils.shc_helpers.orchestration.summarize_effects(inputs: SHCInputs, design: SHCDesign) → Tuple[float, float, ndarray, ndarray, ndarray, ndarray, dict]#

Compute ATT, gap, and fit diagnostics over the treated-block window.

Returns:

att, att_percent (float)
observed, counterfactual, gap (np.ndarray) – Series over the m + n block window.
window_time (np.ndarray) – Period labels for the window.
fit_diagnostics (dict) – {"rmse_pre", "rmse_post", "r_squared_pre"}.
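The quantities returned here are ordinary summary statistics of the observed and counterfactual windows. The sketch below shows one plausible way to compute them. `summarize_window` is a hypothetical function, and two details are assumptions rather than documented behaviour: the percentage ATT is taken relative to the mean post-period counterfactual, and the pre-period R-squared is the usual 1 - SS_res / SS_tot.

```python
import numpy as np


def summarize_window(observed, counterfactual, m):
    """ATT, percentage ATT, gap and fit diagnostics over an m + n window."""
    observed = np.asarray(observed, dtype=float)
    counterfactual = np.asarray(counterfactual, dtype=float)
    gap = observed - counterfactual
    pre, post = slice(0, m), slice(m, None)   # first m points are pre-period
    att = float(gap[post].mean())
    att_percent = 100.0 * att / float(counterfactual[post].mean())
    ss_res = float(np.sum(gap[pre] ** 2))
    ss_tot = float(np.sum((observed[pre] - observed[pre].mean()) ** 2))
    diagnostics = {
        "rmse_pre": float(np.sqrt(np.mean(gap[pre] ** 2))),
        "rmse_post": float(np.sqrt(np.mean(gap[post] ** 2))),
        "r_squared_pre": 1.0 - ss_res / ss_tot,
    }
    return att, att_percent, gap, diagnostics


# Toy window: perfect pre-period fit (m = 4), gaps of 1 and 2 afterwards (n = 2).
observed = np.array([1.0, 2.0, 3.0, 4.0, 6.0, 7.0])
counterfactual = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 5.0])
att, att_percent, gap, diagnostics = summarize_window(observed, counterfactual, m=4)
print(att, att_percent, diagnostics)
```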

Inference for the Synthetic Historical Control estimator.

Primary inference is the conformal permutation test of Chen, Yang & Yang (2024, footnote 21) – their application of Chernozhukov, Wuthrich & Zhu (2021) to SHC – computed by mlsynth.utils.inferutils.shc_conformal_test(). Andrews-Genton conformal prediction bands are computed alongside for the plot.

mlsynth.utils.shc_helpers.inference.ag_conformal(actual_outcomes_pre_treatment: ndarray, predicted_outcomes_pre_treatment: ndarray, predicted_outcomes_post_treatment: ndarray, miscoverage_rate: float = 0.1, pad_value: Any = nan) → Tuple[ndarray, ndarray]#

Construct agnostic conformal prediction intervals.

Generates prediction intervals for post-treatment predictions from the pre-treatment residuals, assuming the residuals follow a distribution for which sub-Gaussian concentration bounds apply. The interval width is determined by the variability of the pre-treatment residuals and the desired miscoverage level miscoverage_rate.

Parameters:

actual_outcomes_pre_treatment (np.ndarray) – Actual pre-treatment outcomes. Shape (T_pre,), where T_pre is the number of pre-treatment periods.
predicted_outcomes_pre_treatment (np.ndarray) – Predicted pre-treatment outcomes, corresponding to actual_outcomes_pre_treatment. Shape (T_pre,). Must have the same length as actual_outcomes_pre_treatment.
predicted_outcomes_post_treatment (np.ndarray) – Predicted post-treatment outcomes for which intervals are desired. Shape (T_post,), where T_post is the number of post-treatment periods.
miscoverage_rate (float, optional) – Desired miscoverage level (e.g., 0.1 for 90% prediction intervals, meaning (1-miscoverage_rate) coverage). Must be between 0 and 1. Default is 0.1.
pad_value (Any, optional) – Value used to pad the pre-treatment portion of the returned interval arrays. This makes the output arrays align with a full time series (pre- and post-treatment). Default is np.nan.

Returns:

Tuple[np.ndarray, np.ndarray] – A tuple containing:

lower_bounds_full_series : np.ndarray Lower bounds of the prediction intervals. Shape (T_pre + T_post,). The first T_pre elements are filled with pad_value.
upper_bounds_full_series : np.ndarray Upper bounds of the prediction intervals. Shape (T_pre + T_post,). The first T_pre elements are filled with pad_value.

Raises:

MlsynthDataError – If actual_outcomes_pre_treatment and predicted_outcomes_pre_treatment have different lengths. If actual_outcomes_pre_treatment is empty.
MlsynthConfigError – If miscoverage_rate is not between 0 and 1.

Examples

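>>> import numpy as np
>>> from mlsynth.utils.shc_helpers.inference import ag_conformal
>>> from mlsynth.exceptions import MlsynthConfigError, MlsynthDataError  # assumed import path
>>> actual_outcomes_pre_treatment_ex = np.array([10, 12, 11, 13, 12])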
>>> predicted_outcomes_pre_treatment_ex = np.array([10.5, 11.5, 10.5, 12.5, 11.5])
>>> predicted_outcomes_post_treatment_ex = np.array([14, 15, 14.5])
>>> miscoverage_rate_ex = 0.1 # For 90% prediction intervals
>>> lower_b, upper_b = ag_conformal(
...     actual_outcomes_pre_treatment_ex, predicted_outcomes_pre_treatment_ex,
...     predicted_outcomes_post_treatment_ex, miscoverage_rate=miscoverage_rate_ex
... )
>>> print("Lower bounds:", np.round(lower_b, 2))
Lower bounds: [  nan   nan   nan   nan   nan 12.01 13.01 12.51]
>>> print("Upper bounds:", np.round(upper_b, 2))
Upper bounds: [  nan   nan   nan   nan   nan 15.99 16.99 16.49]

>>> # Example with empty pre-treatment data (raises MlsynthDataError)
>>> try:
...     ag_conformal(np.array([]), np.array([]), predicted_outcomes_post_treatment_ex)
... except MlsynthDataError as e:
...     print(e)
Pre-treatment arrays cannot be empty.

>>> # Example with invalid miscoverage_rate (raises MlsynthConfigError)
>>> try:
...     ag_conformal(actual_outcomes_pre_treatment_ex, predicted_outcomes_pre_treatment_ex,
...                  predicted_outcomes_post_treatment_ex, miscoverage_rate=1.1)
... except MlsynthConfigError as e:
...     print(e)
miscoverage_rate must be between 0 and 1.

mlsynth.utils.shc_helpers.inference.cwz_conformal_test(pre_intervention_residuals: ndarray, post_intervention_residuals: ndarray, q: float = 1.0, scheme: str = 'moving_block', num_permutations: int | None = None, levels: Tuple[float, ...] = (0.01, 0.05, 0.1), random_state: int = 0) → dict#

Exact permutation conformal test of Chernozhukov, Wuthrich & Zhu (2021).

This is the exact conformal inference of CWZ (2021), as opposed to the Chen-Yang-Yang (2024, footnote 21) with-replacement residual bootstrap in shc_conformal_test(). It tests the sharp null \(H_0: \delta_t = 0\) over the post window using the CWZ statistic (their Definition 1)

\[S_q(u) = \Bigl( \tfrac{1}{\sqrt{n}} \sum_{t = T_o + 1}^{T_o + n} |u_t|^q \Bigr)^{1/q},\]

evaluated on the trailing n positions of the full residual vector \(u = (\hat\varepsilon_1^0, \dots, \hat\varepsilon_{T_o}^0, \hat\delta_1, \dots, \hat\delta_n)\) of length \(T = T_o + n\). The reference distribution is obtained by permuting u (CWZ Definition 2, Figure 2), not by resampling from the pre-period pool:

scheme="moving_block" – the \(T\) cyclic shifts \(\Pi_{\rightarrow}\) (\(\pi_j(i) = i + j \bmod T\)), valid under stationary weak dependence. The set is fully enumerated, so the test is deterministic and its p-values lie on the \(1/T\) grid.
scheme="iid" – random permutations from \(\Pi_{\mathrm{all}}\), exact under exchangeability; num_permutations are drawn (the identity is always included so \(\hat p \ge 1/|\Pi|\)).

Parameters:

pre_intervention_residuals (np.ndarray) – Pre-period residuals \(\hat\varepsilon_t^0\), shape (T_o,).
post_intervention_residuals (np.ndarray) – Post-period residuals (estimated gaps), shape (n,).
q (float, optional) – Norm exponent of the test statistic. Default 1.0 (CWZ’s \(S_1\), which matches the bootstrap statistic in shc_conformal_test()).
scheme ({“moving_block”, “iid”}, optional) – Permutation family. Default "moving_block".
num_permutations (int or None, optional) – Number of permutations for scheme="iid" (>= 2). Ignored for "moving_block" (always T). Defaults to 1000 for "iid".
levels (tuple of float, optional) – Significance levels for critical values / reject decisions.
random_state (int, optional) – Seed for the "iid" permutation RNG (unused for "moving_block").

Returns:

dict – Keys test_statistic, p_value (\(\Pr(S^* \ge S)\)), critical_values, reject, null_distribution, num_permutations, scheme, q, levels.

Raises:

MlsynthDataError – If either residual array is empty.
MlsynthConfigError – If q <= 0, scheme is unknown, or num_permutations < 2.
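The moving-block scheme is easy to picture once written out. The sketch below enumerates the \(T\) cyclic shifts with `np.roll` and evaluates \(S_q\) on the trailing n positions. It is an illustration of the description above, not the library function: `moving_block_test` is a hypothetical name, and the shift direction is an assumption that does not affect the set of shifts enumerated.

```python
import numpy as np


def moving_block_test(pre_resid, post_resid, q=1.0):
    """Permutation p-value for the sharp null using all T cyclic shifts."""
    pre = np.asarray(pre_resid, dtype=float)
    post = np.asarray(post_resid, dtype=float)
    if pre.size == 0 or post.size == 0:
        raise ValueError("residual arrays cannot be empty")
    u = np.concatenate([pre, post])          # full residual vector, length T
    T, n = u.size, post.size

    def s_q(v):
        # Statistic on the trailing n positions of a (shifted) residual vector.
        return (np.sum(np.abs(v[-n:]) ** q) / np.sqrt(n)) ** (1.0 / q)

    null = np.array([s_q(np.roll(u, j)) for j in range(T)])
    observed = null[0]                       # shift j = 0 is the identity
    p_value = float(np.mean(null >= observed))
    return float(observed), p_value, null


pre_resid = 0.1 * (-1.0) ** np.arange(20)    # small alternating residuals
post_resid = np.ones(3)                      # clearly larger post-period gaps
stat, p_value, null = moving_block_test(pre_resid, post_resid)
print("S =", stat, " p =", p_value, " shifts =", null.size)
```

In this toy case only the identity shift places all three large gaps in the trailing window, so the p-value sits at the smallest point of the \(1/T\) grid.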

mlsynth.utils.shc_helpers.inference.run_conformal_inference(inputs: SHCInputs, design: SHCDesign, observed: ndarray, counterfactual: ndarray, *, method: str = 'bootstrap', permutation_scheme: str = 'moving_block', num_permutations: int | None = None, q: float = 1.0, miscoverage_rate: float = 0.1, num_resamples: int = 1000, levels: Sequence[float] = (0.01, 0.05, 0.1), random_state: int = 0) → SHCInference#

Assemble the SHC conformal permutation test and conformal bands.

Parameters:

inputs (SHCInputs) – Preprocessed series (supplies the pre-period and latent trend pool).
design (SHCDesign) – Fitted design (supplies latent_pre for the pre-period residuals).
observed, counterfactual (np.ndarray) – Observed and SHC series over the m + n block window.
method ({“bootstrap”, “exact”}) – "bootstrap" (default) is the Chen-Yang-Yang (2024) with-replacement residual bootstrap (shc_conformal_test()); "exact" is the Chernozhukov-Wuthrich-Zhu (2021) permutation test (cwz_conformal_test()).
permutation_scheme ({“moving_block”, “iid”}) – Permutation family for method="exact".
num_permutations (int or None) – Permutation count for method="exact" with permutation_scheme="iid".
q (float) – Norm exponent of the exact-test statistic.
miscoverage_rate (float) – 1 - coverage for the Andrews-Genton bands (0.10 -> 90%).
num_resamples, levels, random_state – Forwarded to the selected test.

Raises:

MlsynthConfigError – If method is not "bootstrap" or "exact".

mlsynth.utils.shc_helpers.inference.shc_conformal_test(pre_intervention_residuals: ndarray, post_intervention_residuals: ndarray, num_resamples: int = 1000, levels: Tuple[float, ...] = (0.01, 0.05, 0.1), random_state: int = 0) → dict#

Conformal permutation test for the SHC intervention effect.

Implements the inference procedure of Chen, Yang & Yang (Synthetic Historical Control for Policy Evaluation, 2024), which applies the conformal inference of Chernozhukov, Wuthrich & Zhu (2021) to the SHC estimator. The procedure tests the sharp null of no intervention effect,

\[H_0: \delta_t = 0 \quad \text{for } t = T_o + 1, \dots, T_o + n,\]

using the test statistic (their footnote 21)

\[S = n^{-1/2} \sum_{t=T_o+1}^{T_o+n} \bigl| \hat\varepsilon_t^0 \bigr|, \qquad \hat\varepsilon_t^0 = y_t - \hat\ell_t,\]

where \(\hat\varepsilon_t^0\) are the post-intervention residuals (the estimated gaps \(\hat\delta_t\)). The null distribution of S is constructed by randomly sampling n observations with replacement from the \(T_o\) pre-intervention residuals \(\{\hat\varepsilon_t^0\}_{t=1}^{T_o}\), repeated num_resamples (default 1,000) times, exactly as described in the paper.

Parameters:

pre_intervention_residuals (np.ndarray) – The \(T_o\) pre-intervention residuals \(\hat\varepsilon_t^0 = y_t - \hat\ell_t\), shape (T_o,). These form the resampling pool for the null distribution.
post_intervention_residuals (np.ndarray) – The n post-intervention residuals (estimated gaps), shape (n,). The sum of their absolute values, scaled by \(n^{-1/2}\), is the observed statistic.
num_resamples (int, optional) – Number of resamples used to build the null distribution. Default 1000, matching the paper.
levels (tuple of float, optional) – Significance levels at which to report upper-tail critical values and reject/retain decisions. Default (0.01, 0.05, 0.10).
random_state (int, optional) – Seed for the resampling RNG. Default 0.

Returns:

dict – Keys: test_statistic (S), p_value (\(\Pr(S^* \ge S)\)), critical_values (mapping level -> upper-tail quantile of the null), reject (mapping level -> bool), null_distribution (the resampled S^* array), num_resamples, and levels.

Raises:

MlsynthDataError – If either residual array is empty.

Frozen, NumPy-first containers for the Synthetic Historical Control (SHC).

Implements the containers for

Chen, Y.-T., Yang, J.-C., & Yang, T.-T. (2024). “Synthetic Historical Control for Policy Evaluation.” SSRN 4995085.

SHC reconstructs a single treated unit’s untreated counterfactual without cross-sectional controls. It estimates a smooth latent trend \(\ell_t\) by kernel regression on the pre-period, slices the series into a “treated block” and a set of overlapping “historical blocks”, and matches the treated block’s pre-segment with a simplex combination of the historical blocks (Section 2.2). The same combination, applied to the historical blocks’ forward segments, yields the post-intervention counterfactual.

Everything below is pure NumPy; time periods are addressed through the repository’s IndexSet. The only DataFrame touchpoint is setup.

class mlsynth.utils.shc_helpers.structures.SHCDesign(bandwidth: float, latent_pre: ndarray, weights: ndarray, selected_blocks: List[int], block_weights: Dict[Any, float], counterfactual_window: ndarray, use_augmented: bool, best_lambda: float | None = None)#

SHC fitted design.

Parameters:

bandwidth (float) – LOOCV-selected kernel bandwidth for the latent-trend smoother.
latent_pre (np.ndarray) – Kernel-smoothed latent trend over the pre-period, shape (T0,) (the first-stage \(\hat\ell_t\)).
weights (np.ndarray) – Full length-N historical-block weights (mostly zero).
selected_blocks (list of int) – Indices of the historical blocks with non-zero weight.
block_weights (dict) – Mapping block_label -> weight for the selected blocks.
counterfactual_window (np.ndarray) – SHC counterfactual over the m + n treated-block window (pre-segment reconstruction followed by the post-intervention prediction).
use_augmented (bool) – Whether the augmented (ASHC) ridge refinement was applied.
best_lambda (float or None) – ASHC ridge penalty chosen by tuning; None for plain SHC.

bandwidth: float#

best_lambda: float | None = None#

block_weights: Dict[Any, float]#

counterfactual_window: ndarray#

latent_pre: ndarray#

selected_blocks: List[int]#

use_augmented: bool#

weights: ndarray#

class mlsynth.utils.shc_helpers.structures.SHCInference(method: str, test_statistic: float, p_value: float, critical_values: Dict[float, float], reject: Dict[float, bool], num_resamples: int, null_distribution: ndarray, conformal_lower: ndarray, conformal_upper: ndarray, confidence_level: float)#

Conformal permutation inference (Chen-Yang-Yang 2024, footnote 21).

Parameters:

method (str) – Always "conformal_permutation".
test_statistic (float) – \(S = n^{-1/2} \sum_t |\hat\varepsilon_t^0|\) over the post period.
p_value (float) – \(\Pr(S^* \ge S)\) under the resampled null.
critical_values (dict) – Mapping significance level -> upper-tail critical value of S^*.
reject (dict) – Mapping significance level -> reject decision (S > cv).
num_resamples (int) – Number of null resamples (1000 in the paper).
null_distribution (np.ndarray) – The resampled S^* values.
conformal_lower, conformal_upper (np.ndarray) – Post-period Andrews-Genton conformal bands, retained for plotting.
confidence_level (float) – Coverage of the conformal bands (e.g. 0.90).

confidence_level: float#

conformal_lower: ndarray#

conformal_upper: ndarray#

critical_values: Dict[float, float]#

method: str#

null_distribution: ndarray#

num_resamples: int#

p_value: float#

reject: Dict[float, bool]#

test_statistic: float#

class mlsynth.utils.shc_helpers.structures.SHCInputs(time_index: mlsynth.utils.helperutils.IndexSet, y: numpy.ndarray, T0: int, m: int, treated_label: typing.Any, metadata: typing.Dict[str, typing.Any] = <factory>)#

Preprocessed, NumPy-only inputs for the SHC engine.

Parameters:

time_index (IndexSet) – All T period labels (row order of y).
y (np.ndarray) – Treated-unit outcome over all periods, shape (T,).
T0 (int) – Number of pre-treatment periods; post is n = T - T0.
m (int) – Pre-intervention window length of the treated/historical blocks.
treated_label (Any) – Identifier of the treated unit.
metadata (dict) – Free-form provenance (e.g. the wide frame from dataprep).

property N: int#

Number of historical blocks (Eq. 7): T0 - n - (m - 1).

property T: int#

T0: int#

m: int#

metadata: Dict[str, Any]#

property n: int#: Post-intervention horizon.

time_index: IndexSet#

treated_label: Any#

y: ndarray#

class mlsynth.utils.shc_helpers.structures.SHCResults(*, effects: mlsynth.config_models.EffectsResults | None = None, fit_diagnostics: mlsynth.config_models.FitDiagnosticsResults | None = None, time_series: mlsynth.config_models.TimeSeriesResults | None = None, weights: mlsynth.config_models.WeightsResults | None = None, inference: mlsynth.config_models.InferenceResults | None = None, method_details: mlsynth.config_models.MethodDetailsResults | None = None, sub_method_results: typing.Dict[str, typing.Any] | None = None, additional_outputs: typing.Dict[str, typing.Any] | None = None, raw_results: typing.Dict[str, typing.Any] | None = None, execution_summary: typing.Dict[str, typing.Any] | None = None, plot_config: mlsynth.config_models.PlotConfig | None = None, inputs: mlsynth.utils.shc_helpers.structures.SHCInputs, design: mlsynth.utils.shc_helpers.structures.SHCDesign, att_percent: float, observed: numpy.ndarray, cf_window: numpy.ndarray, gap_window: numpy.ndarray, time_labels: numpy.ndarray, fit_diagnostics_detail: typing.Dict[str, typing.Any], att_value: float, inference_detail: mlsynth.utils.shc_helpers.structures.SHCInference | None = None, metadata: typing.Dict[str, typing.Any] = <factory>)#

Public container returned by mlsynth.SHC.fit().

Parameters:

inputs (SHCInputs) – Preprocessed series.
design (SHCDesign) – Bandwidth, latent trend, block weights, and counterfactual.
att (float) – Mean post-intervention gap mean(observed - counterfactual).
att_percent (float) – ATT as a percentage of the mean counterfactual.
observed (np.ndarray) – Observed treated series over the m + n block window.
counterfactual (np.ndarray) – SHC counterfactual over the same window (= design.counterfactual_window).
gap (np.ndarray) – observed - counterfactual over the window.
time_labels (np.ndarray) – Period labels for the m + n window.
fit_diagnostics (dict) – Pre/post RMSE and pre-period R-squared.
inference (SHCInference or None) – Conformal permutation test output.
metadata (dict) – Free-form diagnostics (m, n, N, bandwidth, augmentation).

att_percent: float#

att_value: float#

cf_window: np.ndarray#

design: SHCDesign#

fit_diagnostics_detail: Dict[str, Any]#

gap_window: np.ndarray#

inference_detail: SHCInference | None#

inputs: SHCInputs#

metadata: Dict[str, Any]#

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'frozen': True, 'json_encoders': {<class 'numpy.ndarray'>: <function BaseEstimatorResults.Config.<lambda>>}}#: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

observed: np.ndarray#

time_labels: np.ndarray#

property weights_by_block: Dict[Any, float]#

Plotting helper for SHC results.

mlsynth.utils.shc_helpers.plotter.plot_shc(results: SHCResults, *, treated_color: str = 'black', counterfactual_color: str = 'red', title: str | None = None) → None#

Render the SHC counterfactual against the observed treated series.

Plots the observed series and the SHC counterfactual over the m + n block window, the post-intervention conformal band (if inference was run), and a treatment-start indicator.

Example#

A self-contained one-draw example using the paper’s own data-generating process (a smooth, recurring latent trend plus noise, with no intervention effect, so the counterfactual should track the latent component). Paste it into a fresh interpreter:

import numpy as np
from mlsynth import SHC
from mlsynth.utils.shc_helpers import simulate_shc_panel

# One panel from the Chen-Yang-Yang (2024) DGP (Section 3.1):
# T_o = m(4h+1) = 90 pre-periods, n = 8 post-periods, delta_t = 0.
df, info = simulate_shc_panel(
    m=10, h=2, n=8, P=10, sigma=0.1, w_f=(1, 0), regular=True, seed=0,
)

res = SHC({
    "df": df, "outcome": "y", "treat": "treated",
    "unitid": "unit", "time": "time", "m": 10, "display_graphs": False,
}).fit()

# SHC reconstructs the latent confounder over the post window.
cf_post = res.counterfactual[10:]
mse_post = np.mean((cf_post - info["latent_post"]) ** 2)
print(f"ATT                = {res.att:+.4f}   (true effect = 0)")
print(f"pre-period R^2     = {res.fit_diagnostics['r_squared_pre']:.3f}")
print(f"MSE_post vs latent = {mse_post:.5f}")
print(f"historical blocks  = {res.inputs.N}, "
      f"selected = {len(res.weights_by_block)}")
print(f"conformal p-value  = {res.inference.p_value:.3f}")

Monte Carlo Validation#

The estimator is validated against the paper’s simulation design (Section 3.1), re-implemented in mlsynth.utils.shc_helpers.simulation. The latent confounder \(\ell_t\) is a globally \(C^1\) curve of alternating cosine “local trends” and cubic-Hermite connectors; the treated block’s shape is a convex combination of \(h\) historical shapes (Assumption 2(b)). The construction reproduces the paper’s exact dimensions: with \(h = 4\), \(T_0 = m(4h+1)\) (425 for \(m=25\), 850 for \(m=50\)) and \(N = T_0 - n - (m-1)\) historical blocks (376 and 776).

With no intervention effect (\(\delta_t = 0\)), the exercise measures how well SHC recovers \(\ell_t\), via the mean squared matching error (MSE_pre, Eq. 31) and the mean squared prediction error against the true latent (MSE_post(k), Eq. 38).

from mlsynth.utils.shc_helpers import monte_carlo_shc

out = monte_carlo_shc(
    n_reps=8, m=25, h=4, n=25, P=10, sigma=0.1,
    w_f=(1, 0, 0, 0), regular=True, k_grid=(1, 5, 10, 15, 25),
)
print(out["mse_pre"], out["mse_post"])

Representative output (Regular-\(\ell\), \(\sigma = 0.1\), \(m = 25\)):

MSE_pre      = 0.0011
MSE_post(1)  = 0.0010
MSE_post(5)  = 0.0017
MSE_post(10) = 0.0016
MSE_post(15) = 0.0017
MSE_post(25) = 0.0016

Both measures are near zero. MSE_post(k) rises from 0.0010 at \(k = 1\) to 0.0017 at \(k = 5\) and then plateaus at about 0.0016 to 0.0017. This is consistent with the paper’s finding that the bias bound (Proposition 1) grows with the horizon but stays mild for a smooth, regularly recurring latent component at low noise.
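For readers who want to compute the two measures by hand, the following is a minimal sketch. It assumes the counterfactual and the true latent are both available as arrays over the m + n treated block window, pre-segment first. `shc_mse_measures` is a hypothetical helper, not an mlsynth function, and the toy arrays are made up for illustration.

```python
import numpy as np


def shc_mse_measures(counterfactual, latent, m, k_grid):
    """Mean squared errors of a counterfactual against the latent, split at index m."""
    cf = np.asarray(counterfactual, dtype=float)
    ell = np.asarray(latent, dtype=float)
    if cf.shape != ell.shape:
        raise ValueError("counterfactual and latent must have the same shape")
    sq_err = (cf - ell) ** 2
    mse_pre = float(sq_err[:m].mean())   # pre-segment: first m entries
    post = sq_err[m:]                    # post window: remaining n entries
    # MSE over the first k post periods, for each k that fits in the window
    mse_post = {k: float(post[:k].mean()) for k in k_grid if 1 <= k <= post.size}
    return mse_pre, mse_post


# Toy check: a counterfactual that is off by a constant 0.1 everywhere.
latent = np.cos(np.linspace(0.0, 3.0, 12))
mse_pre, mse_post = shc_mse_measures(latent + 0.1, latent, m=8, k_grid=(1, 2, 4))
print(mse_pre, mse_post)
```

Because the toy error is constant, every measure should come out at roughly 0.1 squared, which makes the helper easy to sanity-check before pointing it at real output.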

Data-generating process and Monte Carlo harness for the SHC method.

Faithful re-implementation of the simulation design in Chen, Yang & Yang, Synthetic Historical Control for Policy Evaluation (2024), Section 3.1, used here to validate mlsynth.SHC.

The DGP (their Eqs. 35-37) is a single time series

\[y_t = \ell_t + \delta_t d_t + \varepsilon_t,\]

with no intervention effect (\(\delta_t = 0\)) so that the exercise measures how well SHC recovers the latent time-varying confounder \(\ell_t\) in the post-intervention period. The latent component is a globally \(C^1\) curve built from alternating pieces: on each macro-segment of width \(4m\) the first half is a cosine “local trend” \(f_i\) and the second half is a cubic Hermite connector \(g_i\) chosen to match \(f_i\) and \(f_{i+1}\) in level and slope at the knots (the “spline restriction”, Eq. 36).
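To illustrate the spline restriction, here is a minimal sketch of a cubic Hermite connector that matches prescribed levels and slopes at its two endpoints. It is not the mlsynth implementation: `hermite_connector` is a hypothetical helper, and the endpoint values are arbitrary numbers rather than the paper's cosine trends.

```python
def hermite_connector(t0, t1, y0, y1, d0, d1):
    """Return (p, dp): the cubic on [t0, t1] with
    p(t0) = y0, p(t1) = y1, p'(t0) = d0, p'(t1) = d1."""
    width = t1 - t0

    def p(t):
        s = (t - t0) / width                 # rescale to [0, 1]
        h00 = 2 * s**3 - 3 * s**2 + 1        # standard Hermite basis
        h10 = s**3 - 2 * s**2 + s
        h01 = -2 * s**3 + 3 * s**2
        h11 = s**3 - s**2
        return h00 * y0 + h10 * width * d0 + h01 * y1 + h11 * width * d1

    def dp(t):
        s = (t - t0) / width
        dh00 = 6 * s**2 - 6 * s              # derivatives of the basis in s
        dh10 = 3 * s**2 - 4 * s + 1
        dh01 = -6 * s**2 + 6 * s
        dh11 = 3 * s**2 - 2 * s
        # chain rule: ds/dt = 1 / width
        return (dh00 * y0 + dh01 * y1) / width + dh10 * d0 + dh11 * d1

    return p, dp


# Arbitrary illustrative endpoints (not values from the paper).
p, dp = hermite_connector(t0=50.0, t1=100.0, y0=1.0, y1=-0.5, d0=0.0, d1=0.2)
print(p(50.0), p(100.0), dp(50.0), dp(100.0))
```

Matching both level and slope at each knot is what makes the stitched curve \(C^1\): neighbouring pieces agree in value and in first derivative where they meet.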

There are \(h\) historical cosine shapes; the treated block’s shape is their convex combination \(f_{h+1} = \sum_{i=1}^h w_{f,i} f_i\), which encodes Assumption 2(b) (the treated pre-segment is reproducible from its historical counterparts). This construction reproduces the paper’s exact dimensions: with \(h = 4\) it gives \(T_o = m(4h+1)\) (425 for \(m = 25\), 850 for \(m = 50\)) and \(N = T_o - n - (m-1)\) historical blocks (376 and 776, respectively).

Regular-\(\ell\): \((\alpha_i, P_i) = (0, P)\) for every shape, so the local trends recur identically.
Irregular-\(\ell\): \((\alpha_i, P_i) = (0, P) + (U_\alpha, U_P)\) with \(U_\alpha \sim U(-1, 1)\), \(U_P \sim U(0, 50)\), so the shapes differ in amplitude and periodicity.
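The two designs differ only in how the per-shape parameters are drawn. A minimal sketch of that draw, using the standard library's random module, is given here. `draw_shape_params` is a hypothetical name, and the library may organize its randomness and seeding differently.

```python
import random


def draw_shape_params(h, P, regular, seed=0):
    """Return a list of h pairs (alpha_i, P_i) for the regular or irregular design."""
    if regular:
        # Regular design: every shape shares (alpha_i, P_i) = (0, P).
        return [(0.0, float(P)) for _ in range(h)]
    rng = random.Random(seed)
    params = []
    for _ in range(h):
        u_alpha = rng.uniform(-1.0, 1.0)   # U_alpha ~ U(-1, 1)
        u_p = rng.uniform(0.0, 50.0)       # U_P ~ U(0, 50)
        params.append((0.0 + u_alpha, float(P) + u_p))
    return params


print(draw_shape_params(h=4, P=10.0, regular=True))
print(draw_shape_params(h=4, P=10.0, regular=False))
```

With P = 10 the irregular periods land anywhere in [10, 60], so the historical shapes can recur at visibly different rates.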

mlsynth.utils.shc_helpers.simulation.simulate_shc_latent(*, m: int = 25, h: int = 4, n: int = 25, P: float = 10.0, w_f: Sequence[float] = (1.0, 0.0, 0.0, 0.0), regular: bool = True, seed: int = 0) → Tuple[ndarray, int, int]#

Construct the latent component ell_t and return (ell, T_o, N).

The series spans t = 1, ..., T_o + n with T_o = m * (4h + 1) and N = T_o - n - (m - 1) historical blocks.

mlsynth.utils.shc_helpers.simulation.simulate_shc_panel(*, m: int = 25, h: int = 4, n: int = 25, P: float = 10.0, sigma: float = 0.1, w_f: Sequence[float] = (1.0, 0.0, 0.0, 0.0), regular: bool = True, seed: int = 0) → Tuple[DataFrame, Dict[str, Any]]#

Generate one SHC simulation panel as a long DataFrame.

Returns:

df (pandas.DataFrame) – Long panel for a single treated unit with columns unit, time (1..T_o+n), y (= ell_t + noise), and treated (0 for t <= T_o, 1 afterwards).
info (dict) – latent (the true ell_t, shape (T_o+n,)), latent_post (ell over the post window), T_o, N, m, n, time (the integer time index).

Monte Carlo harness validating SHC against the Chen-Yang-Yang (2024) DGP.

Runs mlsynth.SHC on repeated draws of simulate_shc_panel() and reports the paper’s two performance measures:

MSE_pre – Eq. 31: mean squared matching error of the SHC reconstruction against the latent over the treated block’s pre-segment.
MSE_post(k) – Eq. 38: mean squared prediction error of the SHC counterfactual against the true latent over the first k post-intervention periods, for k in k_grid.

The headline finding to reproduce: both measures are small, and MSE_post(k) grows with k (consistent with the bias bound of Proposition 2 increasing in the horizon).

mlsynth.utils.shc_helpers.monte_carlo.monte_carlo_shc(*, n_reps: int = 50, m: int = 25, h: int = 4, n: int = 25, P: float = 10.0, sigma: float = 0.1, w_f: Sequence[float] = (1.0, 0.0, 0.0, 0.0), regular: bool = True, use_augmented: bool = False, k_grid: Sequence[int] = (1, 5, 10, 15, 25), seed: int = 0) → Dict[str, Any]#

Replicate the SHC simulation study and aggregate over n_reps draws.

Returns:

dict – mse_pre (float, averaged over reps), mse_post (mapping k -> averaged MSE_post(k)), n_reps (completed reps), and the configuration echoed back.

Developer API#

Internal optimization and tuning routines for SHC and ASHC. When use_augmented is true, the simplex SHC weights from the matching QP are passed to the ASHC ridge refinement for bias correction.
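As a rough illustration of the simplex matching step only, the sketch below fits simplex-constrained least-squares weights by projected gradient descent. This is a stand-in technique chosen for brevity, not the solver mlsynth uses. `project_to_simplex` and `simplex_match` are hypothetical names, the data are synthetic, and the ASHC ridge refinement is not shown.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1} (sort-based)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    ranks = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / ranks > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def simplex_match(X, y, n_iter=5000):
    """Minimize ||y - X w||^2 over the simplex by projected gradient descent."""
    # Step size 1/L, where L = 2 * sigma_max(X)^2 bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)
    w = np.full(X.shape[1], 1.0 / X.shape[1])   # start from uniform weights
    for _ in range(n_iter):
        grad = 2.0 * X.T @ (X @ w - y)
        w = project_to_simplex(w - step * grad)
    return w


rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))          # columns: pre-segments of 3 historical blocks
w_true = np.array([0.7, 0.3, 0.0])
y = X @ w_true                        # treated pre-segment, exactly reproducible
w_hat = simplex_match(X, y)
print(np.round(w_hat, 4))
```

Because the synthetic treated segment is an exact convex combination of the columns, the fitted weights should land close to `w_true`. On real data the pre-period fit will be imperfect, and the size of that residual is the diagnostic to inspect.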

References#

Chen, Y.-T., Yang, J.-C., & Yang, T.-T. (2024). “Synthetic Historical Control for Policy Evaluation.” SSRN 4995085.

Chernozhukov, V., Wüthrich, K., & Zhu, Y. (2021). “An Exact and Robust Conformal Inference Method for Counterfactual and Synthetic Controls.” Journal of the American Statistical Association 116(536):1849-1864.

Hamilton, J. D. (2018). “Why You Should Never Use the Hodrick-Prescott Filter.” Review of Economics and Statistics 100(5):831-843.

Robinson, P. M. (1988). “Root-N-Consistent Semiparametric Regression.” Econometrica 56(4):931-954.
