Synthetic Controls for Experimental Design (MAREX)


When to Use This Estimator#

The estimators elsewhere in mlsynth are retrospective: a treatment has already happened and you reweight donors to reconstruct the treated unit’s counterfactual. MAREX, due to Abadie and Zhao (2026) [ABADIE2024], is prospective — it designs an experiment. Before any treatment is assigned, and using only pre-experimental data, it chooses which aggregate units to treat and which to hold out as controls, so that the experiment you are about to run yields a credible estimate.

The motivating setting is a firm (say a ride-sharing company) that wants to test a new policy but can only deploy it in a few whole markets. A within-market A/B test is contaminated by interference (treated and control drivers compete); randomizing whole markets to treatment is unbiased ex ante but, with a handful of large units, routinely produces treated and control groups with very different baselines, so any single realisation is badly off. MAREX instead picks the treated and control markets so their pre-experiment predictors match the population — a non-randomized design that, the paper shows, substantially reduces estimation bias relative to randomization.

Reach for MAREX when:

  • Units are large aggregates (markets, regions, stores) and only one or a few can be treated.

  • You control the assignment and want to choose it well, rather than estimate after the fact.

  • Interference or equity rules out within-unit randomization, forcing whole-unit treatment.

Notation#

There are \(J\) units and \(T\) periods, with \(T_0\) pre-experiment periods; the experiment runs over \(t = T_0 + 1, \dots, T\). Each unit has a pre-intervention predictor vector \(X_j\) (pre-period outcomes and optional covariates); \(\bar X = \sum_j f_j X_j\) is the population predictor mean for known weights \(f_j\) (e.g. market shares, or \(1/J\)). The experimenter chooses treated weights \(w\) and control weights \(v\), both on the simplex, and disjoint:

\[\sum_j w_j = 1,\quad \sum_j v_j = 1,\quad w_j, v_j \ge 0,\quad w_j v_j = 0 \;\;\forall j.\]

Units with \(w_j > 0\) are treated; among the rest, units with \(v_j > 0\) form the synthetic control. Writing \(Y_{jt}\) for the observed outcome (treated units realise \(Y^I_{jt}\) post-treatment, everyone else \(Y^N_{jt}\)), the design estimator of the average effect is

\[\hat\tau_t(w, v) = \sum_j w_j Y_{jt} - \sum_j v_j Y_{jt}, \qquad t > T_0.\]
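As a minimal sketch of the estimator above, assuming a toy outcome matrix and hand-picked weights (none of the numbers or helper names come from the paper or from mlsynth), the contrast is simply the difference of two weighted averages in each experimental period:

```python
# Toy illustration of tau_hat_t(w, v); all data and helper names are hypothetical.
Y = [  # Y[j][t]: observed outcome of unit j in period t
    [10.0, 11.0, 15.0, 16.0],   # unit 0, treated from t = 2 onwards
    [9.0, 10.0, 10.5, 11.0],    # unit 1, control
    [11.0, 12.0, 12.5, 13.0],   # unit 2, control
    [10.0, 11.0, 11.5, 12.0],   # unit 3, unused (zero weight in both groups)
]
T0 = 2                          # periods 0..1 are pre-experiment
w = [1.0, 0.0, 0.0, 0.0]        # treated weights
v = [0.0, 0.5, 0.5, 0.0]        # control weights

def check_design(w, v, tol=1e-9):
    """Simplex and disjointness constraints: sums to 1, non-negative, w_j * v_j = 0."""
    on_simplex = all(abs(sum(x) - 1.0) < tol and min(x) >= 0.0 for x in (w, v))
    disjoint = all(wj * vj == 0.0 for wj, vj in zip(w, v))
    return on_simplex and disjoint

def tau_hat(Y, w, v, t):
    """Synthetic treated minus synthetic control in period t."""
    treated = sum(wj * Y[j][t] for j, wj in enumerate(w))
    control = sum(vj * Y[j][t] for j, vj in enumerate(v))
    return treated - control

assert check_design(w, v)
effects = [tau_hat(Y, w, v, t) for t in range(T0, len(Y[0]))]
print(effects)
```

In this toy panel the two synthetic units coincide in the pre-period, so the post-period contrast isolates the jump in the treated unit.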

Assumptions#

Assumption 1 (linear factor model). Potential outcomes follow

\[Y^N_{jt} = \delta_t + \theta_t' Z_j + \lambda_t' \mu_j + \varepsilon_{jt}, \qquad Y^I_{jt} = \upsilon_t + \gamma_t' Z_j + \eta_t' \mu_j + \xi_{jt},\]

with observed covariates \(Z_j\), unobserved factors \(\mu_j\), and mean-zero idiosyncratic noise.
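To make Assumption 1 concrete, the sketch below simulates both potential outcomes from a toy linear factor model. The distributions, dimensions, and the function name simulate_factor_model are illustrative assumptions; this is not the paper's DGP nor mlsynth's generate_marex_sample.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def simulate_factor_model(J=6, T=8, n_cov=2, n_fac=1, noise_sd=1.0, seed=0):
    """Draw Y^N and Y^I (each J x T) from a toy linear factor model."""
    rng = random.Random(seed)
    Z = [[rng.random() for _ in range(n_cov)] for _ in range(J)]    # observed Z_j
    mu = [[rng.random() for _ in range(n_fac)] for _ in range(J)]   # unobserved mu_j
    Y_N = [[0.0] * T for _ in range(J)]
    Y_I = [[0.0] * T for _ in range(J)]
    for t in range(T):
        # Period-specific intercepts and coefficients (arbitrary distributions).
        delta, upsilon = rng.gauss(0, 1), rng.gauss(5, 1)
        theta = [rng.gauss(0, 1) for _ in range(n_cov)]
        gamma = [rng.gauss(0, 1) for _ in range(n_cov)]
        lam = [rng.gauss(0, 1) for _ in range(n_fac)]
        eta = [rng.gauss(0, 1) for _ in range(n_fac)]
        for j in range(J):
            eps, xi = rng.gauss(0, noise_sd), rng.gauss(0, noise_sd)
            Y_N[j][t] = delta + dot(theta, Z[j]) + dot(lam, mu[j]) + eps
            Y_I[j][t] = upsilon + dot(gamma, Z[j]) + dot(eta, mu[j]) + xi
    return Y_N, Y_I

Y_N, Y_I = simulate_factor_model()
# Average effect per period, assuming uniform population weights f_j = 1/J.
tau = [sum(Y_I[j][t] - Y_N[j][t] for j in range(len(Y_N))) / len(Y_N)
       for t in range(len(Y_N[0]))]
```

Because the treated potential outcome has its own coefficients, the effect varies by unit and period, which is why the choice of treated units matters for the design.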

Remark. This is the interactive-fixed-effects model of the SC literature (Abadie-Diamond-Hainmueller 2010), extended with a separate factor structure for the treated potential outcome — necessary because a design must choose a treatment group, not just a comparison group.

Assumption 2 (regularity). The factor loadings are non-degenerate (\(F \le T_E\), where \(F\) is the number of unobserved factors and \(T_E\) the number of fitting periods, with the smallest eigenvalue bounded below) and the noise is i.i.d. sub-Gaussian with common variance, independent across the two potential outcomes; dependence across units is allowed.

Assumption 3 / 4 (fit quality). A weight vector reproducing the population predictor means exists exactly (Assumption 3), or approximately within a tolerance \(d\) (Assumption 4). This is the design-time analogue of “the treated unit lies in the convex hull of the donors.”

Remark. Under these conditions Abadie & Zhao bound the bias of \(\hat\tau_t(w, v)\) and develop the permutation test below; the better the pre-experiment match, the smaller the bias.

Mathematical Formulation#

The Design Optimization#

MAREX chooses \(w, v\) (and a binary selection mask \(z\), with \(w_j \le z_j\), \(v_j \le 1 - z_j\), so a unit is treated or a control, never both) to match the population predictor mean. The base design minimises

\[\min_{w, v, z}\; \Bigl\| \bar X - \sum_j w_j X_j \Bigr\|_2^2 + \Bigl\| \bar X - \sum_j v_j X_j \Bigr\|_2^2 \quad \text{s.t. the simplex / disjointness / cardinality constraints,}\]

with the number of treated units pinned by m_eq (exactly) or bounded by m_min/m_max. This is a mixed-integer quadratic program (the binary z); mlsynth solves it with SCIP by default, or — via relaxed=True — relaxes z to \([0, 1]\), solves the QP, and discretises post hoc.
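For intuition about what the optimizer searches over, here is a deliberately simplified brute-force sketch. It assumes uniform population weights, uniform weights within each group, and that every non-treated unit is a control, so only the binary split is chosen; the actual MIQP also optimizes the continuous weights \(w\) and \(v\). The function names are hypothetical and this is not mlsynth's solver.

```python
from itertools import combinations

def mean_vec(rows):
    """Column-wise mean of a list of equal-length predictor vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def brute_force_design(X, m):
    """Enumerate treated sets of size m and score each split by the base objective.

    Simplification: weights are uniform inside each group, so this only
    explores the combinatorial part of the design problem.
    """
    J = len(X)
    x_bar = mean_vec(X)                       # population mean with f_j = 1/J
    best = None
    for treated in combinations(range(J), m):
        control = tuple(j for j in range(J) if j not in treated)
        loss = (sq_dist(x_bar, mean_vec([X[j] for j in treated]))
                + sq_dist(x_bar, mean_vec([X[j] for j in control])))
        if best is None or loss < best[0]:
            best = (loss, treated, control)
    return best

# Five units, two predictors each (made-up values).
X = [[1.0, 2.0], [3.0, 4.0], [2.0, 3.0], [0.0, 1.0], [4.0, 5.0]]
loss, treated, control = brute_force_design(X, m=1)
print(treated, control, loss)
```

Enumeration grows combinatorially in the number of units, which is one reason a mixed-integer solver (or the relaxation) is used in practice.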

mlsynth exposes four objective variants through design (clear names that map to the paper’s formulations):

  • "standard": match each predictor mean with both synthetic units (formulation 5);

  • "weakly_targeted": match the treated synthetic to the mean and softly tie the control synthetic to it (weight beta);

  • "penalized": "standard" plus a distance penalty that down-weights units far from the population mean (lambda1 / lambda2);

  • "unit_penalized": "standard" plus unit-level penalties (lambda1_unit / lambda2_unit).

Covariates#

By default the design matches on pre-period outcomes. Passing covariates (time-invariant column names) appends them to the predictor vector, \(X_j = [Y^E_j ; Z_j]\), exactly as in the paper — the synthetic treated and control are then balanced on both pre-period outcomes and covariates (with an optional covariate_weight scale). When the pre-period is long, the outcomes already encode the covariates’ contribution, so covariates matter most when few pre-periods are available.

Clustering, Costs, and Budgets#

Passing a cluster column solves the design within each cluster (one or a few treated units per cluster), which better approximates the population predictor distribution and limits interpolation bias (paper OA.1). Per-unit costs and a budget (scalar or per-cluster) add a knapsack constraint \(\sum_j c_j w_j \le B\), so the chosen treatment group respects a spend cap.

Inference#

When inference=True with blank_periods > 0, the last few pre-experiment periods are held out as blanks: there the synthetic treated minus synthetic control is pure noise, so its distribution calibrates inference for the post-period effect. MAREX reports a permutation p-value for the global null of no effect, per-period p-values, and a split-conformal confidence band (Chernozhukov-Wuthrich-Zhu 2021), all on MAREXInference.
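The logic of the global permutation test can be sketched as follows, assuming a mean-absolute-gap statistic and exhaustive enumeration of which periods are labelled "experimental". The function name and toy numbers are hypothetical, and mlsynth's own implementation may differ in its statistic and in how permutations are drawn.

```python
from itertools import combinations

def permutation_p_value(blank_gaps, post_gaps):
    """Share of relabelings whose mean |gap| is at least the observed one.

    Under the null of no effect, blank and experimental gaps are exchangeable,
    so any subset of the pooled periods could have been the experimental window.
    """
    pooled = list(blank_gaps) + list(post_gaps)
    k = len(post_gaps)
    s_obs = sum(abs(g) for g in post_gaps) / k
    hits = total = 0
    for idx in combinations(range(len(pooled)), k):
        s = sum(abs(pooled[i]) for i in idx) / k
        total += 1
        if s >= s_obs - 1e-12:          # tolerance guards against float ties
            hits += 1
    return s_obs, hits / total

blank = [0.1, -0.2, 0.15, -0.05]   # synthetic treated minus control, blank periods
post = [1.0, 1.2]                  # the same contrast in the experimental periods
s_obs, p_value = permutation_p_value(blank, post)
print(round(s_obs, 3), round(p_value, 4))
```

With four blank and two experimental periods there are only 15 relabelings, so the smallest attainable p-value is 1/15; a longer blank window gives the test finer resolution.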

Standardized Post-Fit and Power Analysis#

Every call to MAREX.fit() attaches a SyntheticControlPostFit to res.post_fit. This is the single, estimator-agnostic surface for the diagnostic numbers a consumer of the design typically needs: effects, fit RMSEs, conformal / permutation inference, covariate balance (when covariates were used), and power analysis. It is computed by compute_post_fit() from MAREX’s own synthetic_treated / synthetic_control trajectories and weight vectors, so by construction it agrees with what the underlying optimization produced.

pf = res.post_fit                          # SyntheticControlPostFit
pf.ate, pf.ate_percent, pf.total_effect    # treatment-effect scalars
pf.rmse_fit, pf.rmse_blank, pf.rmse_post   # fit quality, per phase
pf.p_value, pf.ci_lower, pf.ci_upper       # inference (when computed)
pf.covariate_smd                           # treated-vs-control SMD dict
pf.covariate_smd_treated_vs_pop            # treated-vs-population
pf.covariate_smd_control_vs_pop            # control-vs-population
pf.power                                   # PowerAnalysis (see below)

Three Standardized Mean Differences#

When covariates=[...] is set, the post-fit reports the three covariate balance diagnostics that match the structure of Abadie & Zhao’s objective. Each is a per-covariate signed dict (covariate_smd_*) plus two summary scalars (max absolute SMD, sum of squared SMDs). With \(\bar X\) the population covariate aggregate, \(X_w := \sum_j w_j X_j\), \(X_v := \sum_j v_j X_j\), and \(s_m\) the cross-unit standard deviation of covariate \(m\), each comparison is the unit-free vector

\[\mathrm{SMD}_m^{(a,b)} = \frac{X_a[m] - X_b[m]}{s_m}.\]

The three pairs (a, b) reported are:

  • covariate_smd, pair \((X_w, X_v)\): synthetic treated vs synthetic control. The internal-validity check (“is the experiment apples-to-apples?”).

  • covariate_smd_treated_vs_pop, pair \((X_w, \bar X)\): synthetic treated vs population aggregate. Tracks the first term of MAREX’s objective, \(\|\bar X - \sum_j w_j X_j\|^2\). Tells you whether the chosen treated group represents the population.

  • covariate_smd_control_vs_pop, pair \((X_v, \bar X)\): synthetic control vs population aggregate. Tracks the second term of the objective. Tells you whether the control set represents the population.

A rule-of-thumb threshold of \(|\mathrm{SMD}| < 0.1\) is conventionally “well balanced”; below \(0.25\) is acceptable; above is a red flag.
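A small sketch of the three comparisons for a single covariate, assuming the sample (ddof = 1) standard deviation for \(s_m\) and uniform population weights; both choices, the helper names, and the data are assumptions for illustration rather than mlsynth's exact conventions.

```python
from statistics import stdev

def weighted_mean(values, weights):
    return sum(wt * x for wt, x in zip(weights, values))

def smd(values, a_weights, b_weights):
    """(X_a - X_b) / s for one covariate, with s the cross-unit sample SD."""
    diff = weighted_mean(values, a_weights) - weighted_mean(values, b_weights)
    return diff / stdev(values)

def balance_label(x):
    """Rule-of-thumb reading of an SMD."""
    if abs(x) < 0.1:
        return "well balanced"
    if abs(x) < 0.25:
        return "acceptable"
    return "red flag"

income = [50.0, 60.0, 55.0, 45.0, 65.0]   # one covariate over five units
w = [0.0, 0.0, 1.0, 0.0, 0.0]             # treated weights
v = [0.6, 0.4, 0.0, 0.0, 0.0]             # control weights
f = [0.2] * 5                             # population weights f_j = 1/J

report = {
    "treated_vs_control": smd(income, w, v),
    "treated_vs_pop": smd(income, w, f),
    "control_vs_pop": smd(income, v, f),
}
for name, value in report.items():
    print(name, round(value, 3), balance_label(value))
```

Reading the three numbers together separates "the two arms differ from each other" from "one arm is unrepresentative of the population".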

Power Analysis and Minimum Detectable Effect#

Power analysis answers the pre-experiment planning question: given the design I’ve chosen, how large a treatment effect can I detect with high probability? This is the dual of inference: inference asks “is the observed effect distinguishable from noise?”, power asks “what effect sizes would be?”

The paper develops permutation inference for MAREX but does not provide a matching MDE. mlsynth fills this gap with an analytical, AR(1)-inflated Gaussian MDE computed from the same residual series the permutation test draws on. Set inference=True and the result auto-populates res.post_fit.power; blank_periods defaults to max(1, floor(0.3 * T0)), so you do not need to pick a value yourself (matching the LEXSCM / SYNDES / PANGEO convention).
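The default blank-window length max(1, floor(0.3 * T0)) is easy to reproduce by hand. The helper below is a hypothetical stand-in (not an mlsynth function) that uses integer arithmetic to avoid floating-point rounding in the floor:

```python
def default_blank_periods(T0: int) -> int:
    """max(1, floor(0.3 * T0)), computed exactly with integers."""
    return max(1, (3 * T0) // 10)

for T0 in (2, 40, 128):
    print(T0, default_blank_periods(T0))
```

The floor of 1 guarantees that at least one held-out period exists even for very short pre-periods, although a single blank period is not enough to estimate a standard deviation.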

Where the noise standard deviation comes from#

Under the linear factor model of Assumption 1, the per-period contrast \(g_t := \sum_j w_j Y_{jt} - \sum_j v_j Y_{jt}\) has expectation zero under the no-effect null. Its sample SD on the blank window \(\mathcal{B}\) (the held-out tail of the pre-period) is the natural estimator of the noise scale:

\[\hat\sigma_{\text{placebo}} = \sqrt{\frac{1}{|\mathcal{B}| - 1} \sum_{t \in \mathcal{B}} \bigl(g_t - \bar g\bigr)^2}.\]

When no blank window is carved out (inference=False) the pre-period gap serves as the placebo proxy. The blank-window estimator is preferred because it uses periods that played no role in fitting the weights — it is honest in exactly the same sense Chernozhukov-Wuthrich-Zhu’s conformal residuals are.

Serial correlation matters#

Synthetic-control gap residuals are virtually always serially correlated: the donor weighting absorbs the level but the persistent components of the factor structure (business cycles, seasonality, slow trends) leak through. Ignoring this systematically under-states the SE at long horizons. We model it as an AR(1) process with lag-1 autocorrelation

\[\hat\rho = \frac{\sum_t g_t g_{t-1}}{\sum_t g_t^2},\]

clipped to \((-0.99, 0.99)\) for numerical safety. The variance of the mean of \(T\) consecutive AR(1) periods, expressed as a multiple of \(\sigma^2\), is the variance inflation factor

\[\mathrm{VIF}(T, \rho) = \frac{1}{T}\!\left(1 + 2 \sum_{k=1}^{T-1}\!\Bigl(1 - \frac{k}{T}\Bigr)\rho^k\right),\]

which collapses to the textbook \(1/T\) when \(\rho = 0\) and grows substantially for \(\rho > 0.3\). The same formula is used by PANGEO’s power module.
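The three ingredients so far (placebo SD, lag-1 autocorrelation, and the variance inflation factor) can be sketched in a few lines. The function names are hypothetical, and summing the denominator of \(\hat\rho\) over all periods is an assumption about the formula's index range.

```python
from statistics import stdev

def sigma_placebo(gaps):
    """Sample SD (ddof = 1) of the blank-window contrasts g_t."""
    return stdev(gaps)

def ar1_rho(gaps, clip=0.99):
    """Lag-1 autocorrelation sum(g_t * g_{t-1}) / sum(g_t^2), clipped for safety."""
    num = sum(gaps[t] * gaps[t - 1] for t in range(1, len(gaps)))
    den = sum(g * g for g in gaps)
    rho = num / den if den > 0 else 0.0
    return max(-clip, min(clip, rho))

def vif(T, rho):
    """Variance of the mean of T AR(1) periods, as a multiple of sigma^2."""
    tail = sum((1 - k / T) * rho ** k for k in range(1, T))
    return (1.0 + 2.0 * tail) / T

gaps = [0.4, 0.3, 0.1, -0.2, -0.3, -0.1, 0.2, 0.3]   # made-up blank-window gaps
print(sigma_placebo(gaps), ar1_rho(gaps))
print(vif(6, 0.0), vif(6, 0.5))
```

For a six-period window, moving from \(\rho = 0\) to \(\rho = 0.5\) raises the factor from about 0.167 to about 0.391, so the standard error of the mean grows by roughly 50%.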

The MDE formula#

Combining: the standard error of the mean of \(T\) post-period contrasts under \(H_0\) is \(\mathrm{SE}(T) = \hat\sigma_{\text{placebo}} \, \sqrt{\mathrm{VIF}(T, \hat\rho)}\). For a two-sided test at level \(\alpha\) with target power \(1 - \beta\), the minimum detectable effect is

\[\mathrm{MDE}(T) = \bigl(z_{1-\alpha/2} + z_{1-\beta}\bigr) \cdot \hat\sigma_{\text{placebo}} \cdot \sqrt{\mathrm{VIF}(T, \hat\rho)}.\]

The corresponding power to detect a given true effect \(\tau\) at horizon \(T\) is

\[\pi(\tau, T) = \Phi\!\Bigl(\frac{|\tau|}{\mathrm{SE}(T)} - z_{1-\alpha/2}\Bigr) + \Phi\!\Bigl(-\frac{|\tau|}{\mathrm{SE}(T)} - z_{1-\alpha/2}\Bigr),\]

which is reported as power_at_observed for each horizon point using the realised \(\hat\tau\).
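The MDE and power formulas translate directly into code using the standard library's normal distribution. This is a sketch of the formulas as written above, with hypothetical function names, not a copy of mlsynth's compute_power_analysis.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def vif(T, rho):
    """AR(1) variance inflation factor for the mean of T periods."""
    return (1.0 + 2.0 * sum((1 - k / T) * rho ** k for k in range(1, T))) / T

def se_mean(sigma, T, rho):
    """SE(T) = sigma * sqrt(VIF(T, rho))."""
    return sigma * vif(T, rho) ** 0.5

def mde(sigma, T, rho, alpha=0.05, power_target=0.80):
    """Minimum detectable effect for a two-sided level-alpha test."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power_target)
    return (z_alpha + z_power) * se_mean(sigma, T, rho)

def power(tau, sigma, T, rho, alpha=0.05):
    """Probability of rejecting H0 when the true mean effect is tau."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = abs(tau) / se_mean(sigma, T, rho)
    return _STD_NORMAL.cdf(shift - z_alpha) + _STD_NORMAL.cdf(-shift - z_alpha)

sigma, rho = 1.0, 0.5                      # made-up noise scale and autocorrelation
for T in (1, 2, 4, 6, 8, 12):
    m = mde(sigma, T, rho)
    print(T, round(m, 3), round(power(m, sigma, T, rho), 3))
```

By construction the power evaluated at the MDE is the target (0.80 here, up to a negligible contribution from the opposite tail), which is a useful sanity check on any implementation.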

What the surface looks like#

p = res.post_fit.power                  # PowerAnalysis dataclass

p.headline.mde_absolute                 # MDE at the realised T_post
p.headline.mde_pct                      # ... as % of post-period baseline
p.headline.se                           # implied SE of mean(g_t) over T_post
p.headline.power_at_observed            # power to detect res.post_fit.ate

p.curve                                 # tuple of MDEPoint, one per horizon
for pt in p.curve:
    print(pt.post_periods, pt.mde_absolute, pt.mde_pct, pt.power_at_observed)

p.sigma_placebo                         # σ̂ used (from blank or pre window)
p.serial_correlation                    # ρ̂ AR(1) of the placebo gaps
p.baseline                              # mean(synthetic_control) on post window
p.alpha, p.power_target                 # 0.05 / 0.80 by default
p.method                                # "analytical_ar1"

The default horizon grid covers \(T \in \{1, 2, 4, 6, 8, 12\}\) plus the realised n_post, so the table also doubles as a “how long do I need to run?” answer — pick the smallest \(T\) whose MDE drops below your target effect size.
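Picking the run length from the curve is a one-line filter. The sketch below uses a namedtuple as a stand-in for MDEPoint (only the post_periods and mde_absolute fields are assumed) and made-up MDE values; the helper name is hypothetical.

```python
from collections import namedtuple

# Stand-in for MDEPoint with just the two fields used here.
Point = namedtuple("Point", ["post_periods", "mde_absolute"])

def shortest_adequate_horizon(curve, target_effect):
    """Smallest horizon whose MDE is below the target effect, or None."""
    adequate = [pt.post_periods for pt in curve if pt.mde_absolute < target_effect]
    return min(adequate) if adequate else None

curve = (
    Point(1, 5.0), Point(2, 3.8), Point(4, 2.9),
    Point(6, 2.5), Point(8, 2.2), Point(12, 1.9),
)
print(shortest_adequate_horizon(curve, target_effect=2.6))
print(shortest_adequate_horizon(curve, target_effect=1.0))
```

A None result means no horizon on the grid is long enough, so the design itself (more treated units, better fit) needs revisiting rather than the duration.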

Practical reading#

A typical MAREX run with T_post = 6, blank_periods = 4 and modest serial correlation (\(\hat\rho \approx 0.5\)) on a Walmart-style sales panel produces an MDE on the order of 0.05–0.15% of mean sales, well below the 1–3% effect sizes typical marketing interventions aim for; this is the quantitative substance of “good designs are well-powered”. Conversely, an MDE much above the expected effect is a signal that the design needs more treated units (lower m_eq / m_max values are typically worse for power) or more post-periods (a longer experiment).

Opting out#

The power computation is wrapped in a try/except in solve_marex(), so a power-analysis failure (e.g. degenerate residual variance) never breaks the fit; res.post_fit.power is simply left as None. To compute power on a non-default horizon grid or significance level, call the free function directly:

from mlsynth.utils.post_fit import compute_power_analysis

alt = compute_power_analysis(
    res.post_fit, alpha=0.10, power_target=0.90,
    post_grid=[2, 4, 8, 16, 32, 52],     # weekly horizons out to a year
)

Monte Carlo: Recovering the Treatment Effect#

The block below replicates the qualitative finding of the paper’s simulation study (Section 5) using mlsynth’s own reimplementation of the linear-factor DGP. A sample is drawn, the design is fit on the pre-period, the treated units realise \(Y^I\) in the experiment, and the estimate is compared to the true average effect.

import numpy as np
import pandas as pd
from mlsynth import MAREX
from mlsynth.utils.marex_helpers.simulation import generate_marex_sample

rng = np.random.default_rng(0)

def design_mae(sample, **card):
    J, T = sample.Y_N.shape
    T0 = sample.T0
    df = pd.DataFrame(
        [{"unit": f"u{j}", "time": t, "y": float(sample.Y_N[j, t])}
         for j in range(J) for t in range(T)]
    )
    res = MAREX({"df": df, "outcome": "y", "unitid": "unit",
                 "time": "time", "T0": T0, **card}).fit()
    w = res.globres.treated_weights_agg
    v = res.globres.control_weights_agg
    treated = np.where(w > 1e-8)[0]
    Y_obs = sample.Y_N.copy()
    Y_obs[treated, T0:] = sample.Y_I[treated, T0:]   # experiment realises Y^I
    tau_hat = w @ Y_obs[:, T0:] - v @ Y_obs[:, T0:]
    return np.mean(np.abs(tau_hat - sample.tau[T0:]))

maes, scales = [], []
for _ in range(5):
    s = generate_marex_sample(J=12, T=30, T0=25, rng=rng)
    maes.append(design_mae(s, m_min=1, m_max=11))     # Unconstrained
    scales.append(np.mean(np.abs(s.tau[s.T0:])))
print(f"MAE {np.mean(maes):.2f}  vs  effect scale {np.mean(scales):.2f}")

The synthetic-control design recovers the average treatment effect with mean absolute error far below the effect’s own scale (≈ 4.4 vs. ≈ 14, i.e. under a third), the central message of the paper’s Table 2. Over the paper’s full 1000 simulations the error also decreases as more units are allowed into the treated group (the Unconstrained design is best), with the largest gains moving from one to two or three treated units.

Note

This is a Path-B replication: it reproduces the simulation study’s conclusions from public DGPs and mlsynth code, with no dependency on the authors’ replication package. It is locked in as mlsynth.tests.test_marex_replication.

Empirical Application: Walmart (Placebo Experiment)#

We replicate the paper’s empirical illustration (Section 4) on the Walmart store-sales panel (basedata/walmart_weekly_sales.csv): weekly sales for 45 stores over 143 weeks (Feb 2010 – Oct 2012). Following the paper, we design a placebo experiment with a fictitious intervention at week 129: \(T_0 = 128\) pre-experiment weeks, of which the first \(T_E = 100\) are the fitting period and the last 28 are blank, leaving 15 experimental weeks. The design uses the constrained formulation with \(m = 2\) treated stores, uniform weights, and predictors normalised to unit variance (standardize).

import pandas as pd
from mlsynth import MAREX

df = pd.read_csv(
    "https://raw.githubusercontent.com/jgreathouse9/mlsynth/"
    "refs/heads/main/basedata/walmart_weekly_sales.csv"
)

res = MAREX({
    "df": df, "outcome": "sales", "unitid": "store", "time": "week",
    "T0": 128, "blank_periods": 28, "T_post": 15,   # TE=100, 28 blank, 15 post
    "m_eq": 2,                  # constrained design, two treated stores
    "design": "standard",
    "standardize": True,        # unit-variance predictors (paper's normalisation)
    "inference": True,
    "display_graph": True,
}).fit()

print("treated stores:", res.treated_units)              # [1, 15]
print("placebo p-value:", round(res.globres.inference.global_p_value, 3))

Because the intervention is a placebo (no real effect), a correct design should produce synthetic treated and control units that track closely and an estimated effect near zero. mlsynth reproduces exactly that — and the paper’s headline number:

Walmart placebo design (m = 2)#

| Quantity | mlsynth | Paper (Section 4) |
|---|---|---|
| Pre-fit RMSE / mean sales | 2.2% | small (close tracking) |
| Experimental ATT / mean sales | -1.0% | near zero |
| Placebo permutation p-value | 0.937 | 0.933 |
| Confidence band covers zero | yes (all post weeks) | yes |

The synthetic treated and control units track to within ~2% of mean sales over the fitting and blank periods, the estimated placebo effect is ~1% of sales in magnitude, and the permutation test fails to reject the null of no effect (\(p = 0.937\), close to the paper’s \(0.933\)). This is exactly the “no spurious effect” result a good design should deliver on a placebo.

Note

This uses the exact MIQP (relaxed=False, the default) with standardize=True; the unit-variance normalisation is essential here because Walmart stores differ enormously in sales level, and without it the level differences dominate the match. The solve takes roughly a minute with the open-source SCIP solver (the paper used commercial Gurobi).

Correspondence with the Authors’ Code#

The authors’ R replication code (Random_Data_Generator.R, Synthetic_Experiments.R, Different_optimization_methods.R) maps directly onto mlsynth’s implementation, which was checked against it:

Authors’ R ↔ mlsynth MAREX#

| Authors’ R | mlsynth |
|---|---|
| DGP (Random_Data_Generator.R) | generate_marex_sample() |
| Formulation (5), Gurobi non-convex QCQP | design="standard" (MIQP with binary z; same optimum) |
| Penalization formulation | design="penalized" (identical \(\lambda\) distance penalty) |
| Cardinality formulation | m_eq / m_min / m_max |
| Predictors \(X = [Y^E ; Z]\) | covariates=[...] (matched on pre-outcomes + covariates) |
| “treated = smaller set” swap | applied in solve_marex() |
| Exact permutation test (sum statistic) | permutation inference (mlsynth defaults to a mean statistic / sampled permutations) |

Driving mlsynth’s solve_design on the authors’ exact DGP and predictor matrix recovers the average treatment effect to within its scale, and the effect estimate degrades gracefully as the noise SD rises from 1 to 5 to 10 (the figures 2-7 settings) — matching the paper’s qualitative findings.

Note

Two faithfulness details from their R code: rnorm(N, 0, noise.variance) passes the value as a standard deviation, so the figures’ “variance” 1/5/10 are SDs; and the R code uses random population weights \(f_j\) whereas the 2026 paper (and mlsynth) use \(f_j = 1/J\). Also note that the unconstrained standard design (formulation 5) is degenerate — many disjoint splits match \(\bar X\) equally well, so the realised design (and hence a single ATE estimate) is solver-dependent; the cardinality-constrained design is the stable, recommended choice.

Example#

import pandas as pd
from mlsynth import MAREX

# df is assumed to be an already-loaded long panel: one row per (market, period)
res = MAREX({
    "df": df, "outcome": "revenue", "unitid": "market", "time": "week",
    "T0": 40,            # 40 pre-experiment weeks
    # Equivalently: pass a 0/1 column marking the experiment window
    # "post_col": "in_experiment",
    "m_eq": 2,           # treat exactly two markets
    "design": "standard",
    "inference": True,   # blank_periods defaults to floor(0.3 * T0) = 12
    "display_graph": True,
}).fit()

print("treated markets:", res.treated_units)
print("global p-value:", res.globres.inference.global_p_value)
for label, c in res.clusters.items():
    print(label, c.unit_weight_map["Treated"])

Core API#

MAREX: Synthetic Controls for Experimental Design (Abadie & Zhao 2026).

MAREX designs an experiment on aggregate units (e.g. markets): using only pre-experimental data it selects which units to treat (treated weights w) and which untreated units form the synthetic control (control weights v), on the simplex and disjoint (a unit is treated or a control, never both). The synthetic treated and synthetic control units are built to reproduce population predictor means, so their post-period difference estimates the average treatment effect. Optional clustering treats one (or a few) units per cluster; optional blank-period placebo inference yields p-values and confidence bands.

class mlsynth.estimators.scexp.MAREX(config: MAREXConfig | dict)#

Bases: object

Synthetic-control experimental design estimator (Abadie & Zhao 2026).

Parameters:

config (MAREXConfig or dict) – Configuration object. See mlsynth.config_models.MAREXConfig.

Returns:

MAREXResults – Per-cluster and aggregate treated/control weights, synthetic series, the selected treated units, and (optionally) placebo inference.

fit() → MAREXResults#

Run the MAREX design and return MAREXResults.

Configuration#

class mlsynth.config_models.MAREXConfig(*, df: DataFrame, outcome: str, unitid: str, time: str, T0: int | None = None, post_col: str | None = None, cluster: str | None = None, design: str = 'standard', covariates: List[str] | None = None, covariate_weight: float = 1.0, standardize: bool = False, program_type: str = 'MIQP', display_graph: bool = False, beta: float = 1e-06, lambda1: float = 0.0, lambda2: float = 0.0, xi: float = 0.0, lambda1_unit: float = 0.0, lambda2_unit: float = 0.0, costs: List[float] | None = None, budget: int | Dict[int, int] | None = None, blank_periods: int | None = None, m_eq: int | None = None, m_min: int | None = None, m_max: int | None = None, exclusive: bool = True, relaxed: bool = False, solver: Any = None, verbose: bool = False, inference: bool = False, T_post: int | None = None)#

Configuration for the Synthetic Experiment Design estimator (MAREX) in mlsynth.

T0: int | None#
T_post: int | None#
beta: float#
blank_periods: int | None#
budget: int | Dict[int, int] | None#
cluster: str | None#
costs: List[float] | None#
covariate_weight: float#
covariates: List[str] | None#
design: str#
display_graph: bool#
exclusive: bool#
inference: bool#
lambda1: float#
lambda1_unit: float#
lambda2: float#
lambda2_unit: float#
m_eq: int | None#
m_max: int | None#
m_min: int | None#
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid'}#

Configuration for the model; should be a dictionary conforming to pydantic’s ConfigDict (pydantic.config.ConfigDict).

post_col: str | None#
program_type: str#
relaxed: bool#
solver: Any#
standardize: bool#
classmethod validate_design_params(values: Any) → Any#
verbose: bool#
xi: float#

Result Containers#

MAREX.fit() returns a MAREXResults: a dict of per-cluster MAREXClusterDesign objects, the aggregate MAREXGlobalDesign, the MAREXStudy hyperparameters, and (optionally) MAREXInference.

Frozen dataclass containers for the MAREX (synthetic experimental design) pipeline.

Implements the containers for:

Abadie, A., & Zhao, J. (2026). “Synthetic Controls for Experimental Design.”

MAREX designs an experiment on aggregate units: it chooses treated weights w and control weights v (on the simplex, disjoint via w_j v_j = 0) so the synthetic treated and synthetic control units reproduce population predictor means. All containers are frozen (immutable) per the repository convention; inference, when requested, is computed up front and embedded.

class mlsynth.utils.marex_helpers.structures.MAREXClusterDesign(label: str, members: List[Any], cardinality: int, treated_weights: ndarray, control_weights: ndarray, selection_indicators: ndarray, synthetic_treated: ndarray, synthetic_control: ndarray, pre_treatment_means: ndarray, rmse: float, unit_weight_map: Dict[str, Dict[Any, float]], inference: MAREXInference | None = None)#

Bases: object

Design and synthetics for a single cluster.

Parameters:
  • label (str) – Cluster label.

  • members (list) – Unit labels in this cluster.

  • cardinality (int) – Number of units in the cluster.

  • treated_weights (np.ndarray) – Treated weights w for this cluster’s column, shape (N,).

  • control_weights (np.ndarray) – Control weights v for this cluster’s column, shape (N,).

  • selection_indicators (np.ndarray) – Binary selection mask z over the cluster’s members.

  • synthetic_treated (np.ndarray) – Synthetic treated outcome over the full timeline, shape (T,).

  • synthetic_control (np.ndarray) – Synthetic control outcome over the full timeline, shape (T,).

  • pre_treatment_means (np.ndarray) – Cluster predictor means used as the matching target.

  • rmse (float) – Pre-treatment fit RMSE (synthetic treated vs control).

  • unit_weight_map (dict) – {"Treated": {unit: w}, "Control": {unit: v}} for non-zero weights.

  • inference (MAREXInference, optional) – Inference for this cluster (None unless requested).

cardinality: int#
control_weights: ndarray#
inference: MAREXInference | None = None#
label: str#
members: List[Any]#
pre_treatment_means: ndarray#
rmse: float#
selection_indicators: ndarray#
synthetic_control: ndarray#
synthetic_treated: ndarray#
treated_weights: ndarray#
unit_weight_map: Dict[str, Dict[Any, float]]#
class mlsynth.utils.marex_helpers.structures.MAREXGlobalDesign(Y_full: ndarray, Y_fit: ndarray, Y_blank: ndarray | None, treated_weights_agg: ndarray, control_weights_agg: ndarray, synthetic_treated: ndarray, synthetic_control: ndarray, inference: MAREXInference | None = None)#

Bases: object

Aggregated (population-level) design and synthetics.

Parameters:
  • Y_full (np.ndarray) – Observed outcome matrix, shape (N, T).

  • Y_fit (np.ndarray) – Fitting slice, shape (N, T_fit).

  • Y_blank (np.ndarray, optional) – Held-out blank pre-periods, shape (N, Tb) (None if none).

  • treated_weights_agg (np.ndarray) – Cluster-size-weighted aggregate treated weights, shape (N,).

  • control_weights_agg (np.ndarray) – Cluster-size-weighted aggregate control weights, shape (N,).

  • synthetic_treated (np.ndarray) – Aggregate synthetic treated outcome, shape (T,).

  • synthetic_control (np.ndarray) – Aggregate synthetic control outcome, shape (T,).

  • inference (MAREXInference, optional) – Aggregate inference (None unless requested).

Y_blank: ndarray | None#
Y_fit: ndarray#
Y_full: ndarray#
control_weights_agg: ndarray#
inference: MAREXInference | None = None#
synthetic_control: ndarray#
synthetic_treated: ndarray#
treated_weights_agg: ndarray#
class mlsynth.utils.marex_helpers.structures.MAREXInference(treated_effects: ndarray, placebo_effects: ndarray, fulltreated_effects: ndarray, s_obs: float, global_p_value: float, per_period_pvals: ndarray, ci: ndarray, alpha: float = 0.05)#

Bases: object

Placebo/permutation inference for one synthetic treated-vs-control pair.

Parameters:
  • treated_effects (np.ndarray) – Post-period synthetic treated minus synthetic control, shape (T1,).

  • placebo_effects (np.ndarray) – The same contrast on the blank (held-out pre) periods, shape (Tb,).

  • fulltreated_effects (np.ndarray) – The contrast over the whole timeline, shape (T,).

  • s_obs (float) – Observed test statistic (mean absolute post-period effect).

  • global_p_value (float) – Permutation p-value for the global null of no effect.

  • per_period_pvals (np.ndarray) – Per-post-period p-values, shape (T1,).

  • ci (np.ndarray) – Split-conformal confidence band over the full timeline, shape (T, 2) (pre-period rows are NaN).

  • alpha (float) – Two-sided significance level.

alpha: float = 0.05#
ci: ndarray#
fulltreated_effects: ndarray#
global_p_value: float#
per_period_pvals: ndarray#
placebo_effects: ndarray#
s_obs: float#
treated_effects: ndarray#
class mlsynth.utils.marex_helpers.structures.MAREXResults(clusters: Dict[str, MAREXClusterDesign], study: MAREXStudy, globres: MAREXGlobalDesign, post_fit: Any | None = None)#

Bases: object

User-facing output of the MAREX estimator.

Parameters:
  • clusters (dict of {str: MAREXClusterDesign}) – Per-cluster design (a single "0" entry when no cluster column).

  • study (MAREXStudy) – Design hyperparameters.

  • globres (MAREXGlobalDesign) – Aggregate design and synthetics.

  • post_fit (SyntheticControlPostFit, optional) – Standardized post-fit diagnostics (ATE / total effect / percentage lift / fit RMSEs / inference / covariate SMDs). Computed at the end of mlsynth.estimators.MAREX.fit() via mlsynth.utils.post_fit.compute_post_fit_marex(). None only when an estimator failure leaves the result partially constructed.

clusters: Dict[str, MAREXClusterDesign]#
globres: MAREXGlobalDesign#
property mode: str#

Solver mode reported to downstream consumers.

post_fit: Any | None = None#
study: MAREXStudy#
property synthetic_control: ndarray#

Aggregate synthetic control outcome, shape (T,).

property synthetic_treated: ndarray#

Aggregate synthetic treated outcome, shape (T,).

property treated_units: List[Any]#

Units assigned to treatment (non-zero aggregate treated weight).

class mlsynth.utils.marex_helpers.structures.MAREXStudy(design: str, T0: int, blank_periods: int, beta: float = 1e-06, lambda1: float = 0.0, lambda2: float = 0.0, xi: float = 0.0)#

Bases: object

Design hyperparameters of a MAREX study (was StudyConfig).

T0: int#
beta: float = 1e-06#
blank_periods: int#
design: str#
lambda1: float = 0.0#
lambda2: float = 0.0#
xi: float = 0.0#

In addition, MAREX.fit() attaches a SyntheticControlPostFit as results.post_fit: the standardized diagnostics container shared across the MAREX family (LEXSCM, MAREX, SYNDES, PANGEO). It carries the ATE / total / lift / per-period / cumulative effect summaries, the inference triple (\(p\), CI), the pre-/blank-/post-period RMSEs, the three standardized-mean-difference blocks (treated-vs-control, treated-vs-population, control-vs-population), and — when a valid noise window exists — a PowerAnalysis block with the headline MDE and the MDE-versus-horizon curve.

class mlsynth.utils.post_fit.SyntheticControlPostFit(treated_series: ndarray, control_series: ndarray, gap_series: ndarray, n_fit: int, n_blank: int, n_post: int, ate: float | None = None, total_effect: float | None = None, ate_percent: float | None = None, ate_per_period: ndarray | None = None, cumulative_effect: ndarray | None = None, p_value: float | None = None, ci_lower: float | None = None, ci_upper: float | None = None, inference_method: str | None = None, rmse_fit: float | None = None, rmse_blank: float | None = None, rmse_post: float | None = None, covariate_names: Tuple[str, ...] = (), covariate_smd: Dict[str, float] | None = None, covariate_smd_abs_max: float | None = None, covariate_smd_squared_sum: float | None = None, covariate_smd_treated_vs_pop: Dict[str, float] | None = None, covariate_smd_treated_vs_pop_abs_max: float | None = None, covariate_smd_treated_vs_pop_squared_sum: float | None = None, covariate_smd_control_vs_pop: Dict[str, float] | None = None, covariate_smd_control_vs_pop_abs_max: float | None = None, covariate_smd_control_vs_pop_squared_sum: float | None = None, power: PowerAnalysis | None = None)#

Bases: object

Standardized post-fit diagnostics for a single synthetic control design.

Field semantics are estimator-agnostic; every MAREX-family adapter populates the same shape. Any field that isn’t naturally computable for the producing estimator is left None.

ate: float | None = None#
ate_per_period: ndarray | None = None#
ate_percent: float | None = None#
ci_lower: float | None = None#
ci_upper: float | None = None#
control_series: ndarray#
covariate_names: Tuple[str, ...] = ()#
covariate_smd: Dict[str, float] | None = None#
covariate_smd_abs_max: float | None = None#
covariate_smd_control_vs_pop: Dict[str, float] | None = None#
covariate_smd_control_vs_pop_abs_max: float | None = None#
covariate_smd_control_vs_pop_squared_sum: float | None = None#
covariate_smd_squared_sum: float | None = None#
covariate_smd_treated_vs_pop: Dict[str, float] | None = None#
covariate_smd_treated_vs_pop_abs_max: float | None = None#
covariate_smd_treated_vs_pop_squared_sum: float | None = None#
cumulative_effect: ndarray | None = None#
gap_series: ndarray#
inference_method: str | None = None#
n_blank: int#
n_fit: int#
n_post: int#
p_value: float | None = None#
power: PowerAnalysis | None = None#
rmse_blank: float | None = None#
rmse_fit: float | None = None#
rmse_post: float | None = None#
total_effect: float | None = None#
treated_series: ndarray#
class mlsynth.utils.post_fit.PowerAnalysis(headline: MDEPoint, curve: Tuple[MDEPoint, ...], alpha: float, power_target: float, sigma_placebo: float, serial_correlation: float, baseline: float, method: str = 'analytical_ar1')#

Bases: object

Standardized power-analysis output attached to SyntheticControlPostFit.

Built from the placebo / blank-period gap variance and an analytical Gaussian approximation, with AR(1) variance inflation to handle serial correlation in the gap residuals. The intent matches the per-estimator power modules already in the library (PangeoPower, SPCDPowerAnalysis, SYNDESPower) but consumes the same SyntheticControlPostFit shape so every covariate-aware SCM-family estimator gets the surface for free.

headline#

MDE for the actual n_post horizon of the realised design.

Type:

MDEPoint

curve#

MDE / power values across the requested post_grid horizons (so callers can read a detectability curve).

Type:

tuple of MDEPoint

alpha#

Two-sided significance level assumed.

Type:

float

power_target#

Target power the MDEs are computed at (default 0.80).

Type:

float

sigma_placebo#

Standard deviation of the placebo gap series used as the noise scale.

Type:

float

serial_correlation#

Lag-1 (AR(1)) autocorrelation of the placebo gap residuals used to inflate the variance for serial dependence.

Type:

float

baseline#

Mean of the control trajectory on the post window (denominator for mde_pct). NaN when no post window exists.

Type:

float

method#

"analytical_ar1" for the closed-form Gaussian + AR(1) MDE used here. Reserved for future "monte_carlo" extensions.

Type:

str

alpha: float#
baseline: float#
curve: Tuple[MDEPoint, ...]#
headline: MDEPoint#
mde_by_horizon() → Dict[int, float]#

{post_periods: mde_pct} for quick lookup.

method: str = 'analytical_ar1'#
power_target: float#
serial_correlation: float#
sigma_placebo: float#
class mlsynth.utils.post_fit.MDEPoint(post_periods: int, mde_absolute: float, mde_pct: float, se: float, power_at_observed: float | None = None)#

Bases: object

Minimum detectable effect at a single post-treatment horizon.

mde_absolute: float#
mde_pct: float#
post_periods: int#
power_at_observed: float | None = None#
se: float#

Helper Modules#

Input preparation for MAREX: long panel -> design-ready arrays.

class mlsynth.utils.marex_helpers.setup.MAREXPanel(Y_full: DataFrame, clusters: ndarray, T0: int, blank_periods: int, covariates: ndarray | None = None, covariate_names: Tuple[str, ...] = ())#

Prepared MAREX inputs.

T0: int#
Y_full: DataFrame#
blank_periods: int#
clusters: ndarray#
covariate_names: Tuple[str, ...] = ()#
covariates: ndarray | None = None#
mlsynth.utils.marex_helpers.setup.prepare_marex_panel(df: DataFrame, outcome: str, unitid: str, time: str, cluster: str | None, T0: int | None, inference: bool, blank_periods: int, T_post: int | None, covariates: List[str] | None = None) MAREXPanel#

Pivot the long panel to units x time and forward the resolved T0 / blank_periods.

The MAREX config validator is the single source of truth for resolving T0 (from either an explicit scalar or a post_col 0/1 column) and the default blank window (the trailing 30% of the pre-period), so by the time this helper runs both are concrete integers. The covariates columns are aggregated to per-unit pre-period means via mlsynth.utils.datautils.build_covariate_matrix() and returned as an (N, R) matrix aligned to the unit order. The matrix is left un-normalised here so that MAREX’s existing standardize=True flag (applied to the combined [Y_fit; covariate_weight * Z] predictor matrix in marex_helpers.optimization) keeps its previous behaviour.

Design-formulation primitives for MAREX (Abadie & Zhao 2026).

The experimenter chooses treated weights w and control weights v per cluster on the simplex, with a binary selection mask z linking them (w_j <= z_j, v_j <= 1 - z_j) so a unit is either treated or a control, never both (the disjointness w_j v_j = 0). These helpers build the cvxpy variables, constraints, and the design-specific objective; the objective form is selected by design:

  • "base" – match each cluster mean with both synthetic units;

  • "weak" – match the treated synthetic to the mean and softly tie the control synthetic to it (weight beta);

  • "eq11" – base plus cluster-level distance penalties (lambda1 / lambda2);

  • "unit" – base plus unit-level penalties (xi / lambda1_unit / lambda2_unit).
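The simplex and linking constraints described above can be checked numerically for a candidate design. The sketch below is a plain NumPy feasibility check for a single cluster, not the library's cvxpy formulation; the function name check_design and the tolerance are hypothetical.

```python
import numpy as np

def check_design(w, v, z, tol=1e-9):
    """Check simplex and linking constraints for one cluster (hypothetical helper).

    w, v: treated / control weights, shape (n,)
    z:    0/1 selection mask, shape (n,); z_j = 1 marks unit j as treated
    """
    w, v, z = (np.asarray(a, dtype=float) for a in (w, v, z))
    return {
        "nonneg": bool((w >= -tol).all() and (v >= -tol).all()),
        "w_sums_to_1": bool(abs(w.sum() - 1.0) <= tol),
        "v_sums_to_1": bool(abs(v.sum() - 1.0) <= tol),
        "w_le_z": bool((w <= z + tol).all()),              # w_j <= z_j
        "v_le_1_minus_z": bool((v <= 1.0 - z + tol).all()),  # v_j <= 1 - z_j
        "disjoint": bool(np.abs(w * v).max() <= tol),      # w_j * v_j = 0
    }

# Five units: units 0 and 3 are treated, the rest are controls.
z = np.array([1, 0, 0, 1, 0])
w = np.array([0.6, 0.0, 0.0, 0.4, 0.0])
v = np.array([0.0, 0.5, 0.2, 0.0, 0.3])
report = check_design(w, v, z)

# Giving control weight to a treated unit violates the linking constraint.
v_bad = np.array([0.1, 0.4, 0.2, 0.0, 0.3])
bad_report = check_design(w, v_bad, z)
```

Because w_j <= z_j and v_j <= 1 - z_j with binary z, at most one of w_j and v_j can be positive, which is why the disjointness check follows from the two linking checks.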

mlsynth.utils.marex_helpers.formulation.build_constraints(w, v, z, M, cluster_members, cluster_labels, m_eq, m_min, m_max, costs, budget_dict, exclusive)#

Simplex, disjointness, cardinality, cost, and exclusivity constraints.

mlsynth.utils.marex_helpers.formulation.build_membership_mask(clusters, label_to_k, N, K)#

Boolean (N, K) mask of unit-to-cluster membership.
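As an illustration of what such a mask looks like, here is a minimal NumPy sketch that mirrors the documented argument names. It is an assumption about the behaviour, not the library source, and the function name membership_mask_sketch is hypothetical.

```python
import numpy as np

def membership_mask_sketch(clusters, label_to_k, N, K):
    """Return an (N, K) boolean mask with M[i, k] = True iff unit i is in cluster k."""
    M = np.zeros((N, K), dtype=bool)
    # Map each unit's cluster label to its column index.
    cols = np.array([label_to_k[c] for c in clusters])
    M[np.arange(N), cols] = True
    return M

clusters = np.array(["east", "west", "east", "north"])
label_to_k = {"east": 0, "north": 1, "west": 2}
M = membership_mask_sketch(clusters, label_to_k, N=4, K=3)
```

Each row contains exactly one True entry, so per-cluster sums of any unit-level vector x can be written as M.T @ x.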

mlsynth.utils.marex_helpers.formulation.build_objective(Y_fit, Xbar_clusters, cluster_members, w, v, z, design, beta=1e-06, lambda1=0.0, lambda2=0.0, xi=0.0, lambda1_unit=0.0, lambda2_unit=0.0, D1=None, D2_list=None, zeta=0.0)#

Design-specific cvxpy objective (see module docstring).

zeta adds an optional integrality penalty z (1 - z) used by the relaxed (continuous-z) solve; it is 0 for the exact MIQP.

mlsynth.utils.marex_helpers.formulation.compute_cluster_means_members(Y_fit, M, cluster_labels)#

Per-cluster predictor means and member index arrays.

mlsynth.utils.marex_helpers.formulation.get_per_cluster_param(param, klabel, default=None)#

Resolve a possibly-per-cluster parameter to its value for klabel.

mlsynth.utils.marex_helpers.formulation.init_cvxpy_variables(N, K, boolean=True)#

Treated (w), control (v) weights and selection (z).

z is binary for the exact MIQP (boolean=True) or continuous in [0, 1] for the relaxed QP (boolean=False).

mlsynth.utils.marex_helpers.formulation.precompute_distances(Y_fit, Xbar_clusters, cluster_members)#

Unit-to-cluster-mean distances D1 and within-cluster pairwise D2.
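A rough sketch of the two distance objects, under stated assumptions: Y_fit is taken to be units x fitting periods, Xbar_clusters to be clusters x fitting periods, and distances to be Euclidean. The library may use a different metric (for example squared distances) or layout; distances_sketch is a hypothetical name.

```python
import numpy as np

def distances_sketch(Y_fit, Xbar_clusters, cluster_members):
    """D1: distance of each unit to its own cluster mean; D2: pairwise within cluster."""
    N = Y_fit.shape[0]
    D1 = np.zeros(N)
    D2_list = []
    for k, members in enumerate(cluster_members):
        Yk = Y_fit[members]                          # (n_k, T_fit)
        D1[members] = np.linalg.norm(Yk - Xbar_clusters[k], axis=1)
        diff = Yk[:, None, :] - Yk[None, :, :]       # (n_k, n_k, T_fit)
        D2_list.append(np.linalg.norm(diff, axis=2))
    return D1, D2_list

Y_fit = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [10.0, 14.0]])
cluster_members = [np.array([0, 1]), np.array([2, 3])]
Xbar_clusters = np.array([[1.0, 0.0], [10.0, 12.0]])
D1, D2_list = distances_sketch(Y_fit, Xbar_clusters, cluster_members)
```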

mlsynth.utils.marex_helpers.formulation.prepare_clusters(Y_full, clusters)#

Coerce Y_full/clusters to arrays and return cluster bookkeeping.

mlsynth.utils.marex_helpers.formulation.prepare_fit_slices(Y_full_np, T0, blank_periods)#

Split the pre-period into a fitting slice and a held-out blank slice.
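A minimal sketch of the split, assuming the blank window is the tail of the pre-period and that the array is laid out as units x time. The function name fit_blank_slices_sketch is hypothetical and the real helper may return additional bookkeeping.

```python
import numpy as np

def fit_blank_slices_sketch(Y_full_np, T0, blank_periods):
    """Split the first T0 columns into a fitting slice and a held-out blank slice."""
    if not 0 <= blank_periods < T0:
        raise ValueError("blank_periods must satisfy 0 <= blank_periods < T0")
    n_fit = T0 - blank_periods
    Y_fit = Y_full_np[:, :n_fit]        # used to choose the design weights
    Y_blank = Y_full_np[:, n_fit:T0]    # held out for placebo checks
    return Y_fit, Y_blank

Y = np.arange(24).reshape(2, 12)        # 2 units, 12 periods
Y_fit, Y_blank = fit_blank_slices_sketch(Y, T0=10, blank_periods=3)
```

With T0 = 10 and blank_periods = 3, the design is fit on the first 7 periods and periods 8 through 10 are reserved as placebos.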

mlsynth.utils.marex_helpers.formulation.validate_costs_budget(costs, budget, N, cluster_labels, K)#

Validate cost/budget inputs; return (costs_np, budget_dict).

mlsynth.utils.marex_helpers.formulation.validate_scm_inputs(Y_full, T0, blank_periods, design, beta=1e-06, lambda1=0.0, lambda2=0.0, xi=0.0, lambda1_unit=0.0, lambda2_unit=0.0)#

Validate shapes and design/parameter compatibility (raises ValueError).

MAREX design optimizers (Abadie & Zhao 2026).

solve_design solves the exact mixed-integer design (binary selection z); solve_design_relaxed relaxes z to [0, 1], solves the QP, then discretizes post hoc. Both return a raw result dict consumed by the orchestrator.

mlsynth.utils.marex_helpers.optimization.post_hoc_discretize(w_opt, v_opt, cluster_members, cluster_labels, m_eq=None, m_min=None, m_max=None, trim_threshold=0.01, Y_fit=None, Y_blank=None)#

Round relaxed weights to a feasible integer design (was internal).

mlsynth.utils.marex_helpers.optimization.solve_design(Y_full, T0, clusters, blank_periods=0, m_eq=None, m_min=None, m_max=None, exclusive=True, design='standard', beta=1e-06, lambda1=0.0, lambda2=0.0, xi=0.0, lambda1_unit=0.0, lambda2_unit=0.0, costs=None, budget=None, covariates=None, covariate_weight=1.0, standardize=False, solver='SCIP', verbose=False)#

Exact mixed-integer MAREX design (was SCMEXP).

mlsynth.utils.marex_helpers.optimization.solve_design_relaxed(Y_full, T0, clusters, blank_periods=0, m_eq=None, m_min=None, m_max=None, exclusive=True, design='standard', beta=1e-06, lambda1=0.0, lambda2=0.0, xi=0.0, lambda1_unit=0.0, lambda2_unit=0.0, costs=None, budget=None, covariates=None, covariate_weight=1.0, standardize=False, solver=None, verbose=False, zeta=0.0, trim_threshold=0.01)#

Relaxed (continuous-z) design with post-hoc discretization (was SCMEXP_REL).

Placebo/permutation inference for a MAREX synthetic treated-vs-control pair.

The held-out blank pre-periods act as placebos: the synthetic treated minus synthetic control there should be noise, so its distribution calibrates a permutation p-value and a split-conformal confidence band for the post-period effect (Abadie & Zhao 2026, OA; Chernozhukov-Wuthrich-Zhu 2021).

mlsynth.utils.marex_helpers.inference.compute_inference(Y_treated: ndarray, Y_control: ndarray, T0: int, TcE: int, Tb: int, alpha: float = 0.05, max_combinations: int = 1000, random_state: int | None = None) MAREXInference#

Compute permutation inference for one synthetic contrast.

Parameters:
  • Y_treated, Y_control (np.ndarray) – Synthetic treated / control outcomes over the full timeline, shape (T,).

  • T0 (int) – Number of pre-treatment periods.

  • TcE (int) – Start index of the blank (held-out) window.

  • Tb (int) – Number of blank periods.

  • alpha (float) – Two-sided significance level.

  • max_combinations (int) – Number of permutation draws for the global test.

  • random_state (int, optional) – Seed for reproducibility.

Returns:

MAREXInference
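To make the placebo idea concrete, the following sketch computes a permutation p-value from the treated-minus-control gap. It is a simplified illustration and not the logic of compute_inference: the test statistic (mean absolute gap), the pooling of blank and post periods, and the name placebo_pvalue_sketch are all assumptions.

```python
import numpy as np

def placebo_pvalue_sketch(gap_blank, gap_post, n_draws=1000, seed=0):
    """Permutation p-value: is the post-period gap large relative to blank-period noise?"""
    rng = np.random.default_rng(seed)
    gap_blank = np.asarray(gap_blank, dtype=float)
    gap_post = np.asarray(gap_post, dtype=float)
    pooled = np.concatenate([gap_blank, gap_post])
    n_post = gap_post.size
    observed = np.abs(gap_post).mean()
    exceed = 0
    for _ in range(n_draws):
        perm = rng.permutation(pooled)
        # Treat the last n_post entries of the shuffled series as a pseudo post window.
        if np.abs(perm[-n_post:]).mean() >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (1 + exceed) / (1 + n_draws)

gap_blank = [0.1, -0.2, 0.05, -0.1, 0.15, -0.05, 0.2, -0.15, 0.0, 0.1]
p_effect = placebo_pvalue_sketch(gap_blank, [2.0, 2.5, 1.8])
p_null = placebo_pvalue_sketch(gap_blank, [0.1, -0.1, 0.05])
```

A post-period gap that dwarfs the blank-period gaps yields a small p-value, while a gap of the same size as the placebo noise does not.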

Top-level MAREX solve: run the design optimizer and assemble frozen results.

mlsynth.utils.marex_helpers.orchestration.solve_marex(Y_full, T0, clusters, design='standard', blank_periods=0, m_eq=None, m_min=None, m_max=None, exclusive=True, beta=1e-06, lambda1=0.0, lambda2=0.0, xi=0.0, lambda1_unit=0.0, lambda2_unit=0.0, costs=None, budget=None, covariates=None, covariate_names=(), covariate_weight=1.0, standardize=False, solver=None, verbose=False, relaxed=False, inference=False, alpha=0.05, max_combinations=1000, random_state=42) MAREXResults#

Solve the MAREX design and return a frozen MAREXResults.

With relaxed=True the continuous-z QP with post-hoc discretization is used; otherwise the exact MIQP. With inference=True, blank-period placebo inference is computed for every cluster and the aggregate.

Linear-factor DGP for the MAREX simulation study (Abadie & Zhao 2026, Sec. 5).

Reimplements the paper’s baseline data-generating process (Assumption 1, equations 12a/12b) so the simulation study can be replicated without the authors’ code. Potential outcomes are

\[Y^N_{jt} = \delta_t + \theta_t' Z_j + \lambda_t' \mu_j + \epsilon_{jt},\]

\[Y^I_{jt} = \upsilon_t + \gamma_t' Z_j + \eta_t' \mu_j + \xi_{jt},\]

with Z_j (R observed) and mu_j (F unobserved) covariates, sorted time effects, and i.i.d. Normal(0, sigma^2) noise.

class mlsynth.utils.marex_helpers.simulation.MAREXSample(Y_N: ndarray, Y_I: ndarray, tau: ndarray, T0: int)#

One simulated sample.

Y_N#

Potential outcomes under no treatment, shape (J, T).

Type:

np.ndarray

Y_I#

Potential outcomes under treatment, shape (J, T).

Type:

np.ndarray

tau#

True average treatment effect tau_t per period, shape (T,) (zero in the pre-period).

Type:

np.ndarray

T0#

Number of pre-treatment periods.

Type:

int

T0: int#
Y_I: ndarray#
Y_N: ndarray#
tau: ndarray#
mlsynth.utils.marex_helpers.simulation.generate_marex_sample(J: int = 15, R: int = 7, F: int = 11, T: int = 30, T0: int = 25, sigma: float = 1.0, rng: Generator | None = None) MAREXSample#

Draw one sample from the paper’s baseline linear-factor DGP (Sec. 5).

Returns:

MAREXSample
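The structure of the linear-factor DGP can be sketched in a few lines of NumPy. This is an illustration only, not generate_marex_sample: the distributions of the covariates and coefficients, the choice to sort both intercept sequences, and the equal-weight definition of tau are assumptions, and linear_factor_sample_sketch is a hypothetical name.

```python
import numpy as np

def linear_factor_sample_sketch(J=15, R=7, F=11, T=30, T0=25, sigma=1.0, rng=None):
    """Draw potential outcomes from a linear factor model (illustrative distributions)."""
    rng = np.random.default_rng() if rng is None else rng
    Z = rng.uniform(0.0, 1.0, size=(J, R))       # observed covariates (assumed Uniform)
    mu = rng.uniform(0.0, 1.0, size=(J, F))      # unobserved covariates (assumed Uniform)
    delta = np.sort(rng.normal(size=T))          # sorted time effects, untreated
    upsilon = np.sort(rng.normal(size=T))        # sorted time effects, treated
    theta, gamma = rng.normal(size=(T, R)), rng.normal(size=(T, R))
    lam, eta = rng.normal(size=(T, F)), rng.normal(size=(T, F))
    eps = rng.normal(0.0, sigma, size=(J, T))
    xi = rng.normal(0.0, sigma, size=(J, T))
    Y_N = delta[None, :] + Z @ theta.T + mu @ lam.T + eps
    Y_I = upsilon[None, :] + Z @ gamma.T + mu @ eta.T + xi
    # No unit is treated before T0, so the two potential outcomes coincide there.
    Y_I[:, :T0] = Y_N[:, :T0]
    tau = (Y_I - Y_N).mean(axis=0)               # equal-weight average effect per period
    return Y_N, Y_I, tau

Y_N, Y_I, tau = linear_factor_sample_sketch(rng=np.random.default_rng(0))
```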

Plotting for MAREX: synthetic treated vs control (or the treatment effect).

mlsynth.utils.marex_helpers.plotter.plot_marex(results: MAREXResults, clusters: List[str] | None = None, plot_type: str = 'treatment', global_result: bool = True, figsize: tuple = (12, 6)) None#

Plot MAREX treatment effects (or predictions), one panel per cluster + global.

Parameters:
  • results (MAREXResults) – Output of mlsynth.estimators.MAREX.

  • clusters (list of str, optional) – Cluster labels to plot (default: all; a lone "0" cluster is skipped).

  • plot_type ({“treatment”, “prediction”}) – Plot the treated-minus-control effect, or both synthetic series.

  • global_result (bool) – Include the aggregate (global) panel.

The shared post-fit module — compute_smd(), compute_post_fit(), and compute_power_analysis() — lives outside the marex_helpers package so the other MAREX-family estimators (LEXSCM, SYNDES, PANGEO) can call into the same one-source-of-truth diagnostics:

Standardized post-fit diagnostics for synthetic control designs and the matching power-analysis surface that consumes them.

After any MAREX-family estimator (LEXSCM, MAREX, SYNDES, PANGEO, …) solves its design problem, downstream consumers (the SAGE dashboard, paper-style reports, comparison tables) all need the same numbers:

  • the post-treatment ATT, total effect, percentage lift, per-period gap;

  • pre / blank / post root-mean-squared-error of the synthetic gap;

  • inference scalars (p-value, CI bounds) when computed;

  • covariate-balance standardized mean differences (SMDs) when covariates were used in the design.

This module exposes one frozen dataclass (SyntheticControlPostFit) and three free functions:

  • compute_smd() – standalone, panel-independent SMD from any (cov_matrix, treated_w, control_w);

  • compute_post_fit() – the full diagnostic bundle from trajectories + boundaries + (optional) covariate matrix + (optional) inference;

  • compute_post_fit_marex() – adapter that builds the bundle from a MAREXResults + MAREXPanel pair.

The free-function entry points are deliberately small and reusable, so the LEXSCM / SYNDES / PANGEO equivalents can be added one-at-a-time without touching this module: they just compose the same primitives.

mlsynth.utils.post_fit.compute_post_fit(treated_series: ndarray, control_series: ndarray, *, n_fit: int, n_blank: int = 0, n_post: int | None = None, cov_matrix: ndarray | None = None, cov_names: Sequence[str] | None = None, cov_scales: ndarray | None = None, treated_weights: ndarray | None = None, control_weights: ndarray | None = None, population_weights: ndarray | None = None, inference: Any | None = None, n_treated_units: int | None = None) SyntheticControlPostFit#

Compute a SyntheticControlPostFit from trajectories + boundaries.

The trajectories treated_series and control_series are the estimator’s own synthetic constructs (Σⱼ wⱼ Yⱼ and Σⱼ vⱼ Yⱼ in Abadie-Zhao notation). n_post defaults to len(treated_series) - n_fit - n_blank.

Covariate balance fields are populated when cov_matrix + treated_weights + control_weights are all supplied (the natural inputs for any MAREX-family design). The compute_smd() helper does the work, so the SMD numbers are exactly consistent with a standalone call to compute_smd().

Inference scalars are pulled from the estimator’s inference object via _extract_inference(), which knows about the four common shapes (LEXSCM Inference, MAREX MAREXInference, SYNDES SYNDESInference, or a plain dict). All inference fields are optional.
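The window-based summaries can be illustrated with a small NumPy sketch that splits the gap series at the n_fit and n_blank boundaries. It covers only a subset of the fields and is an assumption about their definitions rather than the library code; post_fit_sketch is a hypothetical name.

```python
import numpy as np

def _rmse(x):
    """Root-mean-squared value of a window, or None for an empty window."""
    return float(np.sqrt(np.mean(x ** 2))) if x.size else None

def post_fit_sketch(treated_series, control_series, n_fit, n_blank=0):
    """Gap-based summaries over the fit, blank, and post windows."""
    treated = np.asarray(treated_series, dtype=float)
    control = np.asarray(control_series, dtype=float)
    gap = treated - control
    fit = gap[:n_fit]
    blank = gap[n_fit:n_fit + n_blank]
    post = gap[n_fit + n_blank:]
    return {
        "n_post": int(post.size),
        "ate": float(post.mean()) if post.size else None,
        "total_effect": float(post.sum()) if post.size else None,
        "cumulative_effect": np.cumsum(post),
        "rmse_fit": _rmse(fit),
        "rmse_blank": _rmse(blank),
        "rmse_post": _rmse(post),
    }

summary = post_fit_sketch(
    treated_series=[1, 2, 3, 4, 6, 7],
    control_series=[1, 2, 3, 4, 4, 5],
    n_fit=3,
    n_blank=1,
)
```

A small rmse_blank relative to the estimated effect is the practical sign that the design tracks well out of sample before treatment begins.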

mlsynth.utils.post_fit.compute_post_fit_marex(raw, panel, *, cov_scales: ndarray | None = None) SyntheticControlPostFit#

Adapt a MAREXResults + MAREXPanel pair into a SyntheticControlPostFit.

Pulls the aggregate synthetic-treated / synthetic-control trajectories from raw.globres, the (T0, blank_periods) split from panel.T0 and panel.blank_periods, the inference object from raw.globres.inference, and the covariate matrix from panel.covariates (when present).

mlsynth.utils.post_fit.compute_power_analysis(post_fit: SyntheticControlPostFit, *, alpha: float = 0.05, power_target: float = 0.8, post_grid: Sequence[int] | None = None) PowerAnalysis#

Analytical MDE + power curve for a design’s SyntheticControlPostFit.

Uses the placebo / blank-period gap residuals (or the pre-period gap when no blank window was carved out) to estimate the noise standard deviation sigma_placebo and the AR(1) autocorrelation rho, then computes the minimum detectable effect for each horizon T in post_grid via the Gaussian formula

MDE(T) = (z_{1-alpha/2} + z_{power}) * sigma_placebo * sqrt(VIF(T, rho)),

where VIF(T, rho) = Var(mean of T AR(1) periods) / sigma_placebo^2. The headline MDE uses T = post_fit.n_post (the realised post window).

Parameters:
  • post_fit (SyntheticControlPostFit) – The standardized post-fit from any MAREX-family estimator.

  • alpha (float, default 0.05) – Two-sided significance level.

  • power_target (float, default 0.80) – Target power for the MDE.

  • post_grid (sequence of int, optional) – Post-treatment horizons at which to compute MDE. Defaults to a small geometric grid centered on post_fit.n_post so users see the detectability tradeoff vs. running the experiment longer.

Returns:

PowerAnalysis – Headline MDE + a curve over the requested horizons.
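The Gaussian MDE formula above can be worked through with the standard library alone. The sketch assumes a stationary AR(1) gap with marginal standard deviation sigma_placebo, for which \(\mathrm{VIF}(T, \rho) = \frac{1}{T}\bigl[1 + 2\sum_{k=1}^{T-1}(1 - k/T)\rho^k\bigr]\). The function names vif_ar1 and mde_sketch are hypothetical and the library's estimator of rho and sigma_placebo is not reproduced here.

```python
from math import sqrt
from statistics import NormalDist

def vif_ar1(T, rho):
    """Var(mean of T AR(1) periods) / sigma^2 for a stationary AR(1) process."""
    s = sum((1 - k / T) * rho ** k for k in range(1, T))
    return (1 + 2 * s) / T

def mde_sketch(T, sigma_placebo, rho, alpha=0.05, power_target=0.80):
    """MDE(T) = (z_{1-alpha/2} + z_{power}) * sigma_placebo * sqrt(VIF(T, rho))."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power_target)
    return (z_alpha + z_power) * sigma_placebo * sqrt(vif_ar1(T, rho))

# Detectability curve over a few horizons, with and without serial correlation.
curve_iid = {T: mde_sketch(T, sigma_placebo=1.0, rho=0.0) for T in (2, 4, 8)}
curve_ar1 = {T: mde_sketch(T, sigma_placebo=1.0, rho=0.5) for T in (2, 4, 8)}
```

With rho = 0 the inflation factor reduces to 1/T, so the MDE shrinks at the usual square-root rate. Positive serial correlation slows that shrinkage, which is why longer experiments buy less power when the gap is persistent.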

mlsynth.utils.post_fit.compute_smd(cov_matrix: ndarray, treated_weights: ndarray, control_weights: ndarray, *, cov_names: Sequence[str] | None = None, cov_scales: ndarray | None = None) Dict[str, Any]#

Standardized mean differences between weighted treated and control means.

Parameters:
  • cov_matrix (ndarray, shape (N, M)) – Per-unit covariate values; rows align to treated_weights and control_weights.

  • treated_weights, control_weights (ndarray, shape (N,)) – Non-negative weights with disjoint supports. They are renormalised to sum to 1 internally (so callers may pass raw sums-to-K weights).

  • cov_names (sequence of str, optional) – Names for the M covariates. Defaults to ("cov_0", "cov_1", ...).

  • cov_scales (ndarray, shape (M,), optional) – Pre-computed per-covariate standardization scales (cross-unit std). Defaults to the std of cov_matrix columns. Passing the value already cached by build_covariate_matrix is the right move.

Returns:

dict with keys smd (the per-covariate dict), smd_abs_max, and smd_squared_sum. Returns empty / NaN summaries if either weight vector is all-zero.
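A minimal sketch of the SMD computation, assuming the default scale is the population (ddof=0) standard deviation of each covariate column and that zero-variance columns are scaled by 1. Both choices, and the name smd_sketch, are assumptions rather than the library's documented behaviour.

```python
import numpy as np

def smd_sketch(cov_matrix, treated_weights, control_weights, cov_names=None):
    """Standardized mean difference between weighted treated and control means."""
    X = np.asarray(cov_matrix, dtype=float)
    w = np.asarray(treated_weights, dtype=float)
    v = np.asarray(control_weights, dtype=float)
    if w.sum() == 0 or v.sum() == 0:
        return {"smd": {}, "smd_abs_max": float("nan"), "smd_squared_sum": float("nan")}
    w, v = w / w.sum(), v / v.sum()              # renormalise to sum to 1
    scales = X.std(axis=0)                       # cross-unit std, ddof=0 (assumed)
    scales = np.where(scales > 0, scales, 1.0)   # guard against constant columns
    smd = (w @ X - v @ X) / scales
    names = list(cov_names) if cov_names is not None else [f"cov_{i}" for i in range(X.shape[1])]
    return {
        "smd": dict(zip(names, smd.tolist())),
        "smd_abs_max": float(np.abs(smd).max()),
        "smd_squared_sum": float((smd ** 2).sum()),
    }

X = np.array([[1.0, 0.0], [3.0, 2.0], [5.0, 2.0], [7.0, 4.0]])
balanced = smd_sketch(X, [1, 0, 0, 1], [0, 1, 1, 0])      # raw weights, renormalised inside
unbalanced = smd_sketch(X, [1, 0, 0, 0], [0, 0, 0, 1])
```

In the balanced design the treated and control weighted means coincide, so every SMD is zero. Treating only the smallest unit and using only the largest as control produces SMDs well above the common 0.1 rule of thumb.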

References#

Abadie, A., & Zhao, J. (2026). “Synthetic Controls for Experimental Design.” See [ABADIE2024].

Abadie, A., Diamond, A., & Hainmueller, J. (2010). “Synthetic Control Methods for Comparative Case Studies.” Journal of the American Statistical Association 105(490):493-505.

Chernozhukov, V., Wuthrich, K., & Zhu, Y. (2021). “An Exact and Robust Conformal Inference Method for Counterfactual and Synthetic Controls.” Journal of the American Statistical Association 116(536):1849-1864.