Factor Model Approach (FMA)#
When to Use This Method#
Quasi-experiments are the bread and butter of marketing causal inference: a policy or firm decision switches on for one unit (a state legalises recreational marijuana, a digitally native retailer opens a physical showroom in Brooklyn) and you want the sales effect against a pool of untreated cities, stores, or categories. The two reflexive choices each strain in this setting:
Difference-in-differences (DiD) assumes the time effect is the same for the treated unit and every control – the parallel-trends assumption. Marketing panels (sales, share, macro series) rarely oblige: different markets ride different seasonal, regional, and business-cycle waves. When that homogeneity fails, DiD carries a large estimation bias that does not shrink as the panel grows – in Li and Sonnier’s own MSE comparison it is so biased they drop it from the table.
Synthetic control (SC) relaxes that by weighting controls on the simplex (nonnegative, sum-to-one, no intercept), which pins the counterfactual inside the convex hull of the controls. That is a feature when the treated unit is “surrounded” by donors and you want an interpretable convex-combination weight story, but a bug when the treated unit sits outside the donor range, and SC has no inference theory when controls are many and the data are non-stationary of unknown form.
The factor model approach (FMA) of Li and Sonnier ([FMA]) targets exactly the regime SC and DiD struggle with: many control units, time effects that differ across units, a treated path that may lie outside the donor range, and a need for honest confidence intervals. Instead of weighting the controls directly, it first projects the control panel onto a small set of latent factors (interactive fixed effects: \(y^0_{it} = \lambda_i' F_t + e_{it}\)), then regresses the treated unit’s pre-period outcome on those factors with no constraint on the loadings. Three properties follow, and they are the reasons to reach for it:
Heterogeneous time effects. The interactive \(\lambda_i' F_t\) structure nests two-way fixed effects (DiD) as the special case \(\lambda_i = (\xi_i, 1)'\), \(F_t = (1, \delta_t)'\), and also absorbs unit-specific trends and cross-sectional dependence. You are not betting on a common shock.
The counterfactual can leave the donor hull. Because the loadings are unrestricted (no nonnegativity, no sum-to-one, no zero intercept), FMA fits treated units whose level or trajectory is outside the range of every control – the case SC structurally cannot represent.
Robust to a large, growing donor pool. Unconstrained regression on the raw controls (the Hsiao-Ching-Wan approach) overfits and breaks down once \(N_{co}\) approaches or exceeds \(T_1\). FMA’s dimension reduction sidesteps this: it benefits from more controls (they sharpen the factor estimates) rather than being destabilised by them. The same single factor extraction also scales cheaply to many treated units and staggered timing, since it is done once.
The paper’s headline contribution is the fourth reason: valid, computationally cheap inference. FMA delivers a closed-form normal confidence interval for the ATT (Theorems 3.1 / 3.3) that is valid for both stationary and non-stationary data and – critically – accommodates treated and control units with different error variances. The previously standard Xu (2017) bootstrap assumes \(\sigma^2_{\text{tr}} = \sigma^2_{\text{co}}\); when that fails (the norm for non-randomised quasi-experiments) it over- or under-covers badly, while permutation/placebo tests need an analogous symmetry. In the California beer application the estimated variance ratio is \(\hat\sigma^2_{\text{tr}} / \hat\sigma^2_{\text{co}} \approx 37\), so the bootstrap interval comes out less than half its honest width. FMA’s normal interval needs no such assumption and no resampling loop.
Factor projection vs. donor weighting#
Every synthetic-control-family estimator answers what reproduces the treated unit’s untreated path? FMA’s structural bet is distinctive:
The controls’ untreated outcomes share a low-dimensional latent factor structure; estimate those factors, then load the treated unit onto them without constraint.
DiD |
Synthetic control |
Factor model (FMA) |
|
|---|---|---|---|
Comparison built from |
all controls, equal weight |
all controls, simplex weights |
latent factors of the controls |
Time effects |
homogeneous (common shock) |
implicit via weights |
heterogeneous \(\lambda_i' F_t\) |
Counterfactual outside donor hull |
allowed |
no – convex hull |
yes – unrestricted loadings |
Many controls (\(N_{co} \gtrsim T_1\)) |
bias does not shrink |
overfits if restrictions relaxed |
benefits – sharper factors |
Inference, non-stationary data |
standard |
none available |
normal CI (Thm 3.1 / 3.3) |
Unequal treated/control variance |
n/a |
placebo needs symmetry |
handled directly |
Reach for FMA when#
You have a single treated unit (or a handful, or many – the factor step generalises) and a large control pool, especially \(N_{co}\) large relative to the pre-period \(T_1\), where unrestricted regression on raw donors would overfit.
The outcome is plausibly driven by a few common latent factors – regional sales, market share, macro-linked categories – so the control matrix is approximately low-rank. This is what the factor projection exploits.
You expect heterogeneous time dynamics across units (different seasonality, trends, cross-sectional dependence) that violate DiD’s parallel-trends assumption.
The treated unit’s level or trajectory may sit outside the range of the controls, so SC’s convex-hull restriction would distort the fit.
You need formal inference – a hypothesis test or confidence interval for the ATT – and you cannot defend the equal-variance assumption behind the Xu (2017) bootstrap or the symmetry behind permutation/placebo tests. FMA’s normal CI is valid under unequal variances and under non-stationarity, and it is far cheaper than resampling.
You have enough data for the asymptotics: the paper’s simulations show the normal CI is reliable for \(N_{co} \ge 30\) and \(T_1 \ge 30\) (at \(T_2 = 20\)). For smaller panels the paper suggests a \(t_{T_1 - (N_{co} + 1)}\) reference distribution, which
mlsynthdoes not yet expose; treat the normal CI cautiously there.
Do not use FMA when#
The control matrix has no low-rank / factor structure. FMA, like the PCA-based estimators in Cluster Synthetic Controls (CLUSTERSC), leans entirely on a few factors explaining the controls. If the spectrum decays slowly, the factors are noise; prefer a balancing estimator (MicroSynth (User-Level Balancing SC) for user-level data) or a selection estimator (Forward Difference-in-Differences (FDID)).
A sparse, interpretable convex-combination weight is the deliverable. FMA’s loadings are unconstrained and not a “California = 0.4 Utah + 0.3 Montana” story. If policy storytelling needs nonnegative donor weights that sum to one, use Two-Step Synthetic Control, Forward Difference-in-Differences (FDID), or classic SC.
The donor pool is tiny (\(N_{co} \le 10\)) with a clean canonical SC fit. Factor extraction adds variance without identification gain; Two-Step Synthetic Control or Forward Difference-in-Differences (FDID) are more honest, and Two-Step Synthetic Control will even test whether SC’s restrictions are needed.
The sample is very small (\(N_{co}, T_1 \approx 20\)). The normal approximation degrades and the recommended \(t\)-correction is not wired into the public API.
You want efficiency from the treated unit’s own pre-history with many treated units. Factors are estimated from controls only; the efficiency loss is negligible for a single treated unit but grows with \(N_{tr}\) (paper, footnote 3).
Distributional questions (quantile effects, Lorenz/tail changes). FMA targets the mean ATT; use Distributional Synthetic Control (DSC).
Continuous or multi-valued treatment, or spillovers/interference across units. FMA encodes a single binary intervention and assumes the controls are untreated; use Continuous-Treatment Synthetic Control (CTSC) for dose response and Spillover-Aware Synthetic Control (SPILLSYNTH) / Spatial Synthetic Difference-in-Differences (SpSyDiD) under interference.
Overview#
FMA implements Li, K. T., & Sonnier, G. P. (2023). “Statistical Inference for the Factor Model Approach to Estimate Causal Effects in Quasi-Experimental Settings.” Journal of Marketing Research, 60(3):449-472. The estimator constructs a counterfactual for a single treated unit by
extracting principal-component factors from the control panel (with the number of factors chosen by the paper’s modified Bai-Ng criterion for stationary outcomes or Bai (2004) IPC1 for non- stationary outcomes);
projecting the treated unit’s pre-treatment outcomes onto a constant plus the factors via OLS to recover the loading \(\hat\lambda_1\);
forming the counterfactual \(\hat y^0_{1, t} = \tilde F_t' \hat\lambda_1\) for every period; the ATT is the mean post-treatment gap.
The paper’s distinctive contribution is valid statistical inference
for the ATT. FMA in mlsynth exposes three procedures in
parallel; the user picks any subset via
FMAConfig.inference_methods:
"asymptotic"(default) – Theorem 3.1 (stationary) / Theorem 3.3 (non-stationary) normal CI for the ATT, built from the variance decomposition \(\widehat \Omega = \widehat \Omega_1 + \widehat \Omega_2\)."bootstrap"– Web Appendix F residual bootstrap for per-period \(\widehat{ATT}_t\) CIs."placebo"– Web Appendix G control-as-pseudo-treated band.
Mathematical Formulation#
Setup#
We observe one treated unit (indexed 1) and N_co control units
over T periods. Treatment begins at \(T_0 + 1\). Under the
factor model
the goal is to estimate
Factor extraction#
Factors \(\hat F_t\) are extracted from the control panel via PCA after demeaning (or standardising) each control series. The number of factors \(r\) is chosen by one of two criteria:
Modified Bai-Ng (MBN) – stationary data. Choose \(r \in \{0, 1, \dots, r_{\max}\}\) minimising
\[PC_{MBN}(r) = \frac{1}{N_{co} T} \sum_{i=2}^{N} \sum_{t=1}^{T} (y_{i, t} - \hat\lambda_i' \hat F_t)^2 + c_{N, T}\, r\, \hat\sigma^2 \frac{N_{co} + T}{N_{co} T} \log \frac{N_{co} + T}{N_{co} T},\]with the small-sample adjustment
\[c_{N, T} = \frac{(N_{co} + \max(70 - N_{co}, 0)) (T + \max(70 - T, 0))}{N_{co} T}.\]When
N_co, T >= 70the adjustment collapses to 1 and the criterion is identical to Bai-Ng (2002) \(PC_{p1}\).Bai (2004) IPC1 – non-stationary data. Replaces the small-sample factor with a log-log adjustment suited to non- stationary factors.
Override the data-driven selection by passing
FMAConfig.n_factors directly.
Loading and counterfactual#
With \(\tilde F_t = [1, \hat F_t']'\) and pre-period OLS,
The ATT is \(\widehat{ATT} = (T - T_0)^{-1} \sum_{t > T_0} (y_{1, t} - \hat y^0_{1, t})\).
Asymptotic inference (Theorem 3.1 / 3.3)#
Write \(\bar{\tilde F}_2 = (T - T_0)^{-1} \sum_{t > T_0} \tilde F_t\) and \(\widehat \Psi = (\sum_{t \le T_0} \tilde F_t \tilde F_t')^{-1}\). The paper shows
with \(\widehat \sigma_e^2\) the variance of the pre-treatment residuals. The \((1 - \alpha)\) CI for the ATT is
A two-sided z-test of \(H_0: ATT = 0\) reports the p-value.
Bootstrap inference (Web Appendix F)#
The paper notes that the per-period \(ATT_t\) CI cannot shrink to zero as \(T_0, T - T_0, N_{co} \to \infty\) because its leading term is the idiosyncratic shock \(e_{1, t}\) itself. A residual bootstrap therefore drives the per-period CI:
Compute pre-period residuals \(\hat e_{1, t} = y_{1, t} - \tilde F_t' \hat\lambda_1\), \(t = 1, \dots, T_0\).
For each bootstrap draw \(b = 1, \dots, B\):
Sample \(e^*_{1, t}\) from \(\{\hat e_{1, t}\}\) with replacement for every \(t \in \{1, \dots, T\}\).
Form \(y^*_{1, t} = \tilde F_t' \hat\lambda_1 + e^*_{1, t}\).
Re-estimate \(\hat\lambda^{*}_1\) from the bootstrap pre-period.
Compute \(\Delta^*_{1, t} = y^*_{1, t} - \tilde F_t' \hat\lambda^{*}_1\) for \(t > T_0\).
The \((1 - \alpha)\) CI for \(\Delta_{1, t}\) is
\[\bigl[\widehat \Delta_{1, t} - \Delta^*_{1, t, ((1 - \alpha/2) B)}, \widehat \Delta_{1, t} - \Delta^*_{1, t, (\alpha B / 2)}\bigr].\]
Placebo inference (Web Appendix G)#
Set each control unit in turn as a pseudo-treated unit and re-fit the factor model on the remaining controls; collect the \(N_{co}\) pseudo-ATT curves and report their pointwise \((\alpha/2, 1 - \alpha/2)\) quantile band. The paper notes the placebo test is sensitive to error-variance heterogeneity between the treated unit and the controls – when that assumption is violated the asymptotic CI from the previous paragraph is preferred.
Core API#
Factor Model Approach (FMA) estimator.
Li, K. T., & Sonnier, G. P. (2023). “Statistical Inference for the Factor Model Approach to Estimate Causal Effects in Quasi-Experimental Settings.” Journal of Marketing Research, 60(3), 449-472.
FMA estimates the ATT for a single treated unit by:
Extracting principal-component factors from the control panel (paper Section 3.1; factor count chosen by the modified Bai-Ng (MBN) criterion for stationary data or Bai (2004) IPC1 for non-stationary data – see Web Appendix D).
Projecting the treated unit’s pre-treatment outcomes onto a constant + the factors via OLS to recover the loading.
Forming the counterfactual
hat y^0_{1, t} = F_aug_t' lambda_hatfor every period; the ATT is the mean post-treatment gap.
Inference is the paper’s main contribution. Three procedures live side by side:
asymptotic (default) – Theorem 3.1 (stationary) / Theorem 3.3 (non-stationary) normal CI for the ATT, built from the variance decomposition Omega_hat = Omega1 + Omega2.
bootstrap – Web Appendix F residual bootstrap for per-period ATT_t CIs.
placebo – Web Appendix G control-as-pseudo-treated band.
Activate any combination via FMAConfig.inference_methods.
- class mlsynth.estimators.fma.FMA(config: FMAConfig | dict)#
Bases:
objectFactor Model Approach (Li & Sonnier 2023) estimator.
- Parameters:
config (FMAConfig or dict) – Configuration object. See
mlsynth.config_models.FMAConfig.- Returns:
FMAResults – Frozen container with the optimised design, asymptotic + bootstrap + placebo inference, counterfactual, ATT, and per-period gap.
- fit() FMAResults#
Run the FMA pipeline end to end.
Configuration#
- class mlsynth.config_models.FMAConfig(*, df: ~pandas.DataFrame, outcome: str, treat: str, unitid: str, time: str, display_graphs: bool = True, save: bool | str = False, counterfactual_color: ~typing.List[str] = <factory>, treated_color: str = 'black', stationarity: ~typing.Literal['stationary', 'nonstationary'] = 'nonstationary', preprocessing: ~typing.Literal['demean', 'standardize'] = 'demean', n_factors: ~typing.Annotated[int | None, ~annotated_types.Ge(ge=1)] = None, max_factors: ~typing.Annotated[int, ~annotated_types.Ge(ge=1)] = 10, alpha: ~typing.Annotated[float, ~annotated_types.Gt(gt=0.0), ~annotated_types.Lt(lt=1.0)] = 0.05, inference_methods: ~typing.List[str] = <factory>, n_bootstrap: ~typing.Annotated[int, ~annotated_types.Ge(ge=100)] = 1000, bootstrap_seed: int = 0)#
Configuration for the Factor Model Approach (FMA) estimator.
Implements Li & Sonnier (2023), “Statistical Inference for the Factor Model Approach to Estimate Causal Effects in Quasi-Experimental Settings”, JMR 60(3):449-472. FMA estimates the ATT for a single treated unit by extracting principal- component factors from the control panel, projecting the treated unit’s pre-period outcomes onto those factors, and using the fitted loading to predict the untreated potential outcome in the post-period.
- Parameters:
stationarity ({“stationary”, “nonstationary”}) – Selects the factor-selection criterion:
"stationary"uses the paper’s modified Bai-Ng (MBN) criterion (Web Appendix D.1);"nonstationary"uses Bai (2004) IPC1 with a log-log adjustment for non-stationary factors. Default:"nonstationary"(the paper’s recommendation for general applied settings).preprocessing ({“demean”, “standardize”}) – Preprocessing applied to the control panel before PCA.
n_factors (int or None) – Override the data-driven factor count.
Nonetriggers the criterion instationarity.max_factors (int) – Upper bound passed to the factor-selection routine.
alpha (float) – Two-sided significance level for CIs.
inference_methods (list of {“asymptotic”, “bootstrap”, “placebo”}) – Inference procedures to run. Defaults to
["asymptotic"], which gives the paper’s Theorem 3.1 normal CI for the ATT. Add"bootstrap"to get per-period ATT_t CIs via the Web Appendix F residual bootstrap, and"placebo"to get the Web Appendix G control-as-pseudo-treated band.n_bootstrap (int) – Number of bootstrap replicates (Web Appendix F). Ignored when
"bootstrap"is not ininference_methods.bootstrap_seed (int) – Seed for the bootstrap RNG.
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid'}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
Helper Modules#
Data preparation for the FMA estimator.
- mlsynth.utils.fma_helpers.setup.prepare_fma_inputs(df: DataFrame, outcome: str, treat: str, unitid: str, time: str, preprocessing: str = 'demean', stationarity: str = 'nonstationary') FMAInputs#
Pivot the panel and assemble FMA inputs.
- Parameters:
df (pd.DataFrame) – Long balanced panel with one row per
(unit, time).outcome, treat, unitid, time (str) – Column names.
preprocessing ({“demean”, “standardize”}) – Preprocessing applied to the control panel before factor extraction. Demeaning (default) follows Bai (2003); standardizing is preferred when the control series have very heterogeneous variances.
stationarity ({“stationary”, “nonstationary”}) – Selects the factor-selection criterion downstream:
"stationary"uses the modified Bai-Ng (MBN) criterion (Li & Sonnier 2023, Web Appendix D),"nonstationary"uses Bai (2004) IPC1.
- Returns:
FMAInputs – Preprocessed panel.
- Raises:
MlsynthDataError, MlsynthConfigError –
Factor extraction + selection for FMA.
Wraps the bai_ng.nbpiid() helper to apply the appropriate
Li & Sonnier (2023) selection criterion:
stationarity="stationary"-> modified Bai-Ng (MBN) – nbpiid criterion code 10 withN_co + max(70 - N_co, 0)/T + max(70 - T, 0)adjustments (Web Appendix D.1).stationarity="nonstationary"-> Bai (2004) IPC1 with the log-log adjustment – nbpiid criterion code 11.
A user may also bypass the data-driven selection by passing
n_factors directly to extract_factors().
- mlsynth.utils.fma_helpers.factors.extract_factors(control_panel: ndarray, stationarity: str = 'nonstationary', preprocessing: str = 'demean', n_factors: int | None = None, max_factors: int = 10) Tuple[int, ndarray, ndarray, str]#
Estimate the factor structure from the control panel.
- Parameters:
control_panel (np.ndarray) –
(T, N_co)matrix of control-unit outcomes.stationarity ({“stationary”, “nonstationary”}) – Drives the factor-selection criterion (MBN vs IPC1).
preprocessing ({“demean”, “standardize”}) – Mapped to
nbpiid’s preprocessing_method_code: 1 for demean, 2 for standardize.n_factors (int, optional) – Override the data-driven selection.
Nonetriggers MBN/IPC1.max_factors (int) – Upper bound passed to the selection routine; ignored when
n_factorsis supplied.
- Returns:
n_factors_selected (int) – Final number of factors used.
common_component (np.ndarray) –
(T, N_co)reconstruction of the control panel.factors (np.ndarray) –
(T, r)matrix of estimated factors.source (str) –
"MBN","IPC1", or"user".
Treated-unit projection and counterfactual construction for FMA.
Given the factor estimates F_hat from the control panel, the
treated unit’s factor loading is recovered by OLS of its pre-period
outcomes on the factors (with a constant):
with \(\tilde F_t = [1, F_{ht}']'\). The full-period counterfactual is \(\hat y^0_{1, t} = \tilde F_t' \hat\lambda_1\).
- mlsynth.utils.fma_helpers.fit.estimate_loading_and_counterfactual(treated_outcome: ndarray, factors: ndarray, T0: int) Tuple[ndarray, ndarray, ndarray, float]#
OLS projection of the treated pre-period onto the factors.
- Parameters:
treated_outcome (np.ndarray) – Treated outcome series, shape
(T,).factors (np.ndarray) – Estimated factor matrix, shape
(T, r).T0 (int) – Number of pre-treatment periods.
- Returns:
lambda_hat (np.ndarray) – Loading vector
(r + 1,)(constant + factor loadings).counterfactual (np.ndarray) – Predicted untreated outcome for every period, shape
(T,).factors_with_const (np.ndarray) – Factor matrix with the constant column prepended, shape
(T, r + 1). Returned for downstream variance plug-ins.residual_variance (float) – Variance of the pre-treatment residuals.
Inference procedures for FMA (Li & Sonnier 2023).
Three procedures live here and can run in any combination:
asymptotic_inference()– Theorem 3.1 (stationary) / Theorem 3.3 (non-stationary) normal CI for the ATT.bootstrap_inference()– Web Appendix F residual bootstrap for per-periodATT_tCIs. Uses the pre-period residuals as the bootstrap distribution and refits the loading on each draw, so the CIs reflect the joint variability of the loading estimate and the idiosyncratic shock at timet.placebo_inference()– Web Appendix G placebo test where every control is treated as a pseudo-treated unit in turn. Returns the pointwise quantile band across the placebo ATT curves.
- mlsynth.utils.fma_helpers.inference.asymptotic_inference(treated_outcome: ndarray, counterfactual: ndarray, factors_with_const: ndarray, residual_variance: float, T0: int, alpha: float = 0.05) Tuple[float, float, float, float]#
Theorem 3.1 normal CI for the ATT.
- Returns:
se_att, lower, upper, p_value
- mlsynth.utils.fma_helpers.inference.bootstrap_inference(treated_outcome: ndarray, factors: ndarray, counterfactual: ndarray, T0: int, alpha: float = 0.05, n_replicates: int = 1000, seed: int = 0) dict#
Web Appendix F residual bootstrap for per-period ATT_t CIs.
Procedure#
Compute pre-period residuals
e_hat_1t = y_1t - F_aug_t' lambda_hatfort = 1, ..., T0.For each bootstrap draw b = 1, …, B: a. Sample
e*_1tfrom {e_hat} with replacement for every t. b. Formy*_1t = F_aug_t' lambda_hat + e*_1t. c. Re-fit the loading on the bootstrap pre-period. d. ComputeDelta*_1t = y*_1t - F_aug_t' lambda*_hatfor t > T0.The (1 - alpha) CI for Delta_1t is
[Delta_hat_1t - q_{1 - alpha/2}, Delta_hat_1t - q_{alpha/2}], with quantiles taken across bootstrap replicates of Delta*.
- mlsynth.utils.fma_helpers.inference.placebo_inference(control_outcomes: ndarray, treated_outcome: ndarray, T0: int, n_factors: int | None, stationarity: str, preprocessing: str, alpha: float = 0.05, max_factors: int = 10) dict#
Web Appendix G: pseudo-ATT curves with each control as the treated unit.
For each control
k, swap it into the treated slot, refit the factor model on the remainingN_co - 1controls, project the pseudo-treated pre-period onto those factors, compute the pseudo- ATT curve. The output band is the pointwise alpha/2 / (1 - alpha/2) quantile across the placebo curves at each period.- Parameters:
control_outcomes (np.ndarray) –
(T, N_co)control panel.treated_outcome (np.ndarray) – Real treated outcome series, shape
(T,). Used for the leading row of the curves matrix (so the caller can compare).T0 (int) – Pre-treatment periods.
n_factors (int or None) – Number of factors to fit on each placebo iteration.
Nonemeans re-select per iteration via the criterion instationarity.stationarity ({“stationary”, “nonstationary”})
preprocessing ({“demean”, “standardize”})
alpha (float) – Significance level for the quantile band.
max_factors (int) – Upper bound on the factor-selection routine.
- Returns:
dict – Keys
curves((N_co + 1, T); first row is the real treated unit),q_lower,q_upper((T,)bands).
The Web Appendix E.1 Monte Carlo DGP1, packaged as
simulate_fma_sample() so the Verification replication below runs
as a one-liner.
Li & Sonnier (2023) DGP1 / DGP2 simulation helpers for the FMA estimator.
Implements the data-generating processes from the paper’s Web Appendix E.
DGP1 (Section 4.1, stationary factors, Hsiao-Ching-Wan 2012):
DGP2 (Web Appendix E.1, non-stationary factors):
with \(\xi_t \sim \mathrm{Uniform}[0,1]\) and
\(\varepsilon_{kt} \sim \mathcal{N}(0, 1)\). (The paper’s TeX prints
F_{2t} = F_{4,t-1} + ε_{4t}, which is a typo; the surrounding text
describes \(F_{2t}\) as a drift-less unit root, so F_{4,t-1} reads
as F_{2,t-1}.)
Either DGP feeds the same factor model
with loadings \(\lambda_{il} \stackrel{\text{iid}}{\sim} \mathcal{N}(1, 1)\) and idiosyncratic shocks
The paper studies three variance regimes via the keyword variance_case:
"equal"— \(\sigma_{\text{tr}}^2 = \sigma_{\text{co}}^2 = 1\) (Figures 2, W.5; both the new CI and Xu’s bootstrap nominal);"treated_smaller"— \(\sigma_{\text{tr}} = 0.5 \sigma_{\text{co}}\) (Figures 3, W.6; Xu’s bootstrap overcovers);"treated_larger"— \(\sigma_{\text{tr}} = 2 \sigma_{\text{co}}\) (Figures 4, W.7; Xu’s bootstrap undercovers).
True ATT = 0 in every draw; the paper’s centred statistic \(\sqrt{T_2}(\hat\Delta_1 - \Delta_1)\) is invariant to a constant treatment effect (see the equation following Equation 4.2), so coverage is unaffected by setting \(\Delta_{1t} = 0\).
- class mlsynth.utils.fma_helpers.simulation.FMASample(df: DataFrame, Y_treated: ndarray, Y_controls: ndarray, factors: ndarray, T1: int, T2: int, sigma_tr: float, sigma_co: float, dgp: str = 'dgp1')#
One draw from a Li & Sonnier DGP.
- df#
Long panel with columns
unit/time/y/Dready formlsynth.FMA.- Type:
pd.DataFrame
- Y_treated#
Treated-unit outcome path, shape
(T,).- Type:
np.ndarray
- Y_controls#
Control outcomes, shape
(N_co, T).- Type:
np.ndarray
- factors#
Common factor matrix, shape
(T, 3).- Type:
np.ndarray
- T1, T2
Pre- / post-treatment period counts.
- Type:
- sigma_tr, sigma_co
Idiosyncratic error standard deviations actually drawn.
- Type:
- Y_controls: ndarray#
- Y_treated: ndarray#
- df: DataFrame#
- factors: ndarray#
- mlsynth.utils.fma_helpers.simulation.simulate_fma_sample(dgp: Literal['dgp1', 'dgp2'] = 'dgp1', N_co: int = 30, T1: int = 30, T2: int = 20, variance_case: Literal['equal', 'treated_smaller', 'treated_larger'] = 'equal', alpha: float = 0.0, rng: Generator | None = None) FMASample#
Draw one sample from a Li & Sonnier (2023) DGP.
- Parameters:
dgp ({“dgp1”, “dgp2”}, default
"dgp1") – Stationary (DGP1, Section 4.1) or non-stationary (DGP2, Web Appendix E.1) factor process.N_co (int, default 30) – Number of control units (the paper’s coverage exercise uses 30).
T1, T2 (int, default 30, 20) – Pre- and post-treatment period counts.
variance_case ({“equal”, “treated_smaller”, “treated_larger”}) – Which of the paper’s three variance regimes to simulate.
alpha (float, default 0.0) – Intercept shared by every unit (the paper’s \(\alpha\); coverage is invariant to its value).
rng (np.random.Generator, optional) – NumPy RNG. Defaults to
np.random.default_rng().
- Returns:
FMASample
Diagnostic plot for FMA results.
- mlsynth.utils.fma_helpers.plotter.plot_fma(results: FMAResults) None#
Two-panel plot: trajectories + per-period gap with CI bands.
Frozen dataclasses for the Factor Model Approach (FMA) estimator.
FMA implements Li and Sonnier (2023), “Statistical Inference for the Factor Model Approach to Estimate Causal Effects in Quasi-Experimental Settings”, JMR 60(3):449-472. The estimator constructs a counterfactual for a single treated unit by (i) extracting principal-component factors from the control panel, (ii) projecting the treated unit’s pre-period outcomes onto those factors, and (iii) using the resulting loadings to predict the treated unit’s untreated potential outcomes in the post-period.
Inference is the paper’s main contribution. Three procedures live
side-by-side in FMAInference:
asymptotic – Theorem 3.1 (stationary) / Theorem 3.3 (non-stationary) normal CIs for the ATT.
bootstrap – Web Appendix F residual bootstrap for per-period ATT_t CIs.
placebo – Web Appendix G control-as-pseudo-treated bands.
The four layers below (inputs, design, inference, results) keep that pipeline pluggable.
- class mlsynth.utils.fma_helpers.structures.FMADesign(n_factors: int, n_factors_source: str, factors: ndarray, lambda_hat: ndarray, counterfactual: ndarray, gap: ndarray, common_component: ndarray, residual_variance: float)#
Factor-model design produced by the FMA fit.
- Parameters:
n_factors (int) – Number of common factors selected (or user-supplied).
n_factors_source (str) –
"MBN"(modified Bai-Ng),"IPC1"(Bai 2004),"user"(user override).factors (np.ndarray) – Estimated factors
F_hat, shape(T, r).lambda_hat (np.ndarray) – Estimated factor loading for the treated unit, shape
(r + 1,)(intercept + loadings).counterfactual (np.ndarray) – Predicted treated potential outcome under no treatment, shape
(T,).gap (np.ndarray) – Observed treated minus counterfactual, shape
(T,).common_component (np.ndarray) – Reconstructed control panel via factor projection, shape
(T, N_co).residual_variance (float) – Variance of the pre-treatment residuals
y_1t - F_hat_t' lambda_hat.
- common_component: ndarray#
- counterfactual: ndarray#
- factors: ndarray#
- gap: ndarray#
- lambda_hat: ndarray#
- class mlsynth.utils.fma_helpers.structures.FMAInference(method: str, alpha: float, att: float, asymptotic_att_se: float = nan, asymptotic_att_lower: float = nan, asymptotic_att_upper: float = nan, asymptotic_att_p_value: float = nan, bootstrap_att_t_lower: ~numpy.ndarray = <factory>, bootstrap_att_t_upper: ~numpy.ndarray = <factory>, bootstrap_replicates: ~numpy.ndarray = <factory>, bootstrap_n_replicates: int = 0, placebo_att_curves: ~numpy.ndarray = <factory>, placebo_quantile_lower: ~numpy.ndarray = <factory>, placebo_quantile_upper: ~numpy.ndarray = <factory>)#
Combined asymptotic + bootstrap + placebo inference output.
- Parameters:
method (str) – Primary inference method tag (
"asymptotic","bootstrap","placebo", or"none").alpha (float) – Two-sided significance level.
att (float) – Mean post-treatment gap.
asymptotic_att_se (float) – Standard error of the ATT from the paper’s Theorem 3.1.
asymptotic_att_lower (float)
asymptotic_att_upper (float) –
(1 - alpha)asymptotic CI bounds.asymptotic_att_p_value (float) – Two-sided z-test of
H_0: ATT = 0.bootstrap_att_t_lower (np.ndarray)
bootstrap_att_t_upper (np.ndarray) – Per-period bootstrap CI bounds for ATT_t, shape
(n_post,). Empty whenmethod != "bootstrap".bootstrap_replicates (np.ndarray) –
(B, n_post)matrix of bootstrap replicate ATT_t.bootstrap_n_replicates (int) – Number of bootstrap draws actually completed.
placebo_att_curves (np.ndarray) –
(N_co, T)matrix of pseudo-ATT curves (one per control used as a placebo treated unit). Empty whenmethod != "placebo".placebo_quantile_lower (np.ndarray)
placebo_quantile_upper (np.ndarray) – Pointwise (alpha/2, 1 - alpha/2) quantile bands across the placebo curves at each period. Shape
(T,).
- bootstrap_att_t_lower: ndarray#
- bootstrap_att_t_upper: ndarray#
- bootstrap_replicates: ndarray#
- placebo_att_curves: ndarray#
- placebo_quantile_lower: ndarray#
- placebo_quantile_upper: ndarray#
- class mlsynth.utils.fma_helpers.structures.FMAInputs(treated_outcome: ndarray, control_outcomes: ndarray, donor_names: ndarray, treated_unit_name: Any, T: int, T0: int, time_labels: ndarray, preprocessing: str, stationarity: str)#
Preprocessed panel data for FMA estimation.
- Parameters:
treated_outcome (np.ndarray) – Outcome series for the treated unit, shape
(T,).control_outcomes (np.ndarray) – Outcome matrix for the
N_cocontrol units, shape(T, N_co).donor_names (np.ndarray) – Labels of the control units, length
N_co.treated_unit_name (Any) – Label of the treated unit.
T (int) – Total number of panel periods.
T0 (int) – Number of pre-treatment periods.
time_labels (np.ndarray) – Labels of the time periods, length
T.preprocessing (str) –
"demean"(default) or"standardize"– preprocessing applied to the control panel before PCA.stationarity (str) –
"stationary"or"nonstationary"– selects the appropriate factor-selection criterion (MBN vs Bai-IPC1).
- control_outcomes: ndarray#
- donor_names: ndarray#
- time_labels: ndarray#
- treated_outcome: ndarray#
- class mlsynth.utils.fma_helpers.structures.FMAResults(inputs: ~mlsynth.utils.fma_helpers.structures.FMAInputs, design: ~mlsynth.utils.fma_helpers.structures.FMADesign, inference: ~mlsynth.utils.fma_helpers.structures.FMAInference, counterfactual: ~numpy.ndarray, gap: ~numpy.ndarray, att: float, pre_rmse: float, metadata: ~typing.Dict[str, ~typing.Any] = <factory>)#
Top-level container returned by
mlsynth.FMA.fit().- Parameters:
inputs (FMAInputs) – Preprocessed panel.
design (FMADesign) – Factor-model design.
inference (FMAInference) – Inference output.
counterfactual (np.ndarray) – Convenience alias of
design.counterfactual.gap (np.ndarray) – Convenience alias of
design.gap.att (float) – Mean post-treatment gap.
pre_rmse (float) – Root mean squared pre-treatment fit error.
metadata (dict) – Free-form pipeline diagnostics.
- counterfactual: ndarray#
- gap: ndarray#
- inference: FMAInference#
Example#
A self-contained one-draw Monte Carlo at a non-stationary two-factor DGP – the regime the paper’s empirical California / Brooklyn applications fall into. The example fits FMA in all three inference modes and prints the headline output.
"""One draw of a Li & Sonnier (2023) factor-model simulation."""
import numpy as np
import pandas as pd
from mlsynth import FMA
# ---------------------------------------------------------------------
# 1. Simulate one panel from a non-stationary two-factor DGP
# ---------------------------------------------------------------------
rng = np.random.default_rng(0)
J = 20 # control units
T_pre = 30
T_post = 10
T = T_pre + T_post
r_true = 2
tau_true = 1.0 # additive treatment effect on the treated unit
F = rng.standard_normal((T, r_true)).cumsum(axis=0) # non-stationary
lam = rng.standard_normal((J + 1, r_true))
eps = rng.standard_normal((T, J + 1)) * 0.5
Y0 = F @ lam.T + eps
Y = Y0.copy()
Y[T_pre:, 0] += tau_true # unit 0 treated
rows = [
{
"unit": j,
"time": t,
"y": float(Y[t, j]),
"D": int(j == 0 and t >= T_pre),
}
for j in range(J + 1)
for t in range(T)
]
df = pd.DataFrame(rows)
# ---------------------------------------------------------------------
# 2. Fit FMA with all three inference modes
# ---------------------------------------------------------------------
results = FMA({
"df": df,
"outcome": "y",
"treat": "D",
"unitid": "unit",
"time": "time",
"stationarity": "nonstationary", # IPC1 factor selection
"preprocessing": "demean",
"inference_methods": ["asymptotic", "bootstrap", "placebo"],
"n_bootstrap": 500,
"alpha": 0.05,
"display_graphs": False,
}).fit()
# ---------------------------------------------------------------------
# 3. Inspect the output
# ---------------------------------------------------------------------
print(f"true tau = {tau_true:+.3f}")
print(f"ATT_hat = {results.att:+.3f}")
print(f"r selected = {results.design.n_factors} "
f"({results.design.n_factors_source})")
print(f"pre-RMSE = {results.pre_rmse:.4f}")
inf = results.inference
print(f"asymptotic 95% CI ATT = "
f"[{inf.asymptotic_att_lower:+.3f}, {inf.asymptotic_att_upper:+.3f}]")
print(f"asymptotic p-value = {inf.asymptotic_att_p_value:.3f}")
import pandas as pd
print("\nPer-period ATT_t with bootstrap CI:")
print(pd.DataFrame({
"t": np.arange(T_pre + 1, T + 1),
"att_t": results.gap[T_pre:],
"boot_lower": inf.bootstrap_att_t_lower,
"boot_upper": inf.bootstrap_att_t_upper,
}).round(3).to_string(index=False))
Verification#
Monte Carlo replication (Path B). Li & Sonnier’s empirical applications – California beer sales and a Brooklyn eyeglass retailer’s showroom – run on Nielsen retail scanner data and a proprietary retailer panel, neither of which is redistributable, so the estimator is validated through the paper’s own Monte Carlo. The methodological contribution of the paper is that the new asymptotic CI from Theorem 3.1 attains nominal coverage regardless of whether \(\sigma_{\text{tr}} = \sigma_{\text{co}}\), while the Xu (2017) bootstrap miscovers badly when the two variances differ (Figures 2-4 Panel A, plus the non-stationary mirror in Web Appendix E.1’s Figures W.5-W.7 and the sample-size sweeps in W.8-W.11).
Both DGPs – the stationary Section 4 DGP1 and the non-stationary
Appendix E.1 DGP2 – are packaged in
mlsynth.utils.fma_helpers.simulation.simulate_fma_sample(),
together with the three variance regimes
("equal" / "treated_smaller" / "treated_larger"). True ATT
is zero in every draw; the paper’s centred statistic
\(\sqrt{T_2}(\hat\Delta_1 - \Delta_1)\) is invariant to
\(\Delta\) (see the equation following 4.2), so coverage doesn’t
depend on its value.
Replicating the headline coverage findings is a 15-line script:
import numpy as np
from mlsynth import FMA
from mlsynth.utils.fma_helpers.simulation import simulate_fma_sample
def coverage_cell(dgp, variance_case, N_co, T1, T2, M, alpha=0.05):
stationarity = "stationary" if dgp == "dgp1" else "nonstationary"
covers = []
for j in range(M):
s = simulate_fma_sample(dgp=dgp, N_co=N_co, T1=T1, T2=T2,
variance_case=variance_case,
rng=np.random.default_rng(j))
res = FMA({"df": s.df, "outcome": "y", "treat": "D",
"unitid": "unit", "time": "time",
"stationarity": stationarity,
"inference_methods": ["asymptotic"],
"alpha": alpha, "display_graphs": False}).fit()
covers.append(res.inference.asymptotic_att_lower <= 0.0
<= res.inference.asymptotic_att_upper)
return float(np.mean(covers))
for dgp in ("dgp1", "dgp2"):
for case in ("equal", "treated_smaller", "treated_larger"):
print(dgp, case,
coverage_cell(dgp, case, N_co=30, T1=30, T2=20, M=1000))
At \(M = 1{,}000\) (Li & Sonnier use \(M = 100{,}000\); runtime
difference is the only material change) mlsynth reproduces all four
appendix-figure families. The MC standard error at \(M = 1{,}000\)
is \(\sqrt{0.95 \cdot 0.05 / 1000} \approx 0.7\) pp.
Section 4 + Web Appendix E.1: three variance regimes#
\(N_{co} = 30\), \(T_1 = 30\), \(T_2 = 20\). Reproduces Figures 2-4 Panel A (DGP1, stationary) and Figures W.5-W.7 Panel A (DGP2, non-stationary).
DGP |
Variance case |
\(\sigma_{\text{tr}}\) |
\(\sigma_{\text{co}}\) |
mlsynth |
paper |
|---|---|---|---|---|---|
DGP1 |
|
1.0 |
1.0 |
0.947 |
0.95 |
DGP1 |
|
0.5 |
1.0 |
0.946 |
0.95 |
DGP1 |
|
2.0 |
1.0 |
0.951 |
0.95 |
DGP2 |
|
1.0 |
1.0 |
0.935 |
0.95 |
DGP2 |
|
0.5 |
1.0 |
0.944 |
0.95 |
DGP2 |
|
2.0 |
1.0 |
0.929 |
0.95 |
Every cell lands within \(\pm 2.1\) pp of nominal – well inside three MC standard errors. The headline takeaway is the equality of the three numbers within each DGP: coverage does not deteriorate when \(\sigma_{\text{tr}} \ne \sigma_{\text{co}}\), which is precisely the regime where the Xu (2017) bootstrap mistakes the treated-error variance for the control-error variance and either over- or under-covers (Panel B of Figures 3-4 and W.6-W.7 in the paper).
Web Appendix E.2.1: other \((T_1, N_{co})\) combinations#
Equal variance (\(\sigma_{\text{tr}} = \sigma_{\text{co}} = 1\)), \(T_2 = 20\), sample-size sweeps over both DGPs reproducing Figures W.8-W.9 (stationary) and W.10-W.11 (non-stationary).
DGP |
\(T_1\) |
\(N_{co}\) |
mlsynth |
paper |
|---|---|---|---|---|
DGP1 |
30 |
60 |
0.936 |
0.95 (Fig W.8A) |
DGP1 |
60 |
30 |
0.939 |
0.95 (Fig W.8B) |
DGP1 |
60 |
60 |
0.944 |
0.95 (Fig W.9A) |
DGP1 |
120 |
120 |
0.949 |
0.95 (Fig W.9B) |
DGP2 |
30 |
60 |
0.939 |
0.95 (Fig W.10A) |
DGP2 |
60 |
30 |
0.936 |
0.95 (Fig W.10B) |
DGP2 |
60 |
60 |
0.935 |
0.95 (Fig W.11A) |
DGP2 |
120 |
120 |
0.955 |
0.95 (Fig W.11B) |
All eight cells within \(\pm 1.5\) pp of nominal, and the largest cell (\(T_1 = N_{co} = 120\)) hits the asymptotic regime cleanly – the underlying theory is consistent in both panel dimensions, as the paper asserts.
Not replicated here#
Web Appendix E.2.2 (small-N/T \(t\)-distribution) – the paper recommends switching to \(t_{T_1 - (N_{co} + 1)}\) for very small samples, but the suggested degrees-of-freedom value is non-positive in the figure’s specific \((T_1, N_{co}) = (20, 20)\) configuration, and
FMA’s public API exposes only the normal CI; the small-sample refinement is left as a future addition.Web Appendix E.3 / W.15-W.17 (DGP3) – the paper bootstraps DGP3’s first factor from the proprietary Brooklyn sales panel, so the precise DGP cannot be reconstructed from public data; the corresponding empirical-application coverage check is therefore out of scope.
For reference, the corresponding empirical applications report:
California recreational marijuana legalization on beer sales – \(\widehat{\text{ATT}} = -\$88{,}400\) weekly (\(-0.464\%\)), 95% CI \([-\$407{,}400,\ \$230{,}600]\) (Table 2; not significant). The Xu bootstrap’s interval is less than half as wide (\([-\$229{,}000,\ \$82{,}400]\)), invalidly so given the variance ratio \(\hat\sigma_{\text{tr}}^2 / \hat\sigma_{\text{co}}^2 \approx 37.4\).
Brooklyn showroom opening on weekly sales – \(\widehat{\text{ATT}} = +\$2{,}446\) weekly (\(+27.2\%\)).
References#
Bai, J. (2004). “Estimating Cross-Section Common Stochastic Trends in Nonstationary Panel Data.” Journal of Econometrics 122(1):137-183.
Bai, J., & Ng, S. (2002). “Determining the Number of Factors in Approximate Factor Models.” Econometrica 70(1):191-221.
Li, K. T., & Sonnier, G. P. (2023). “Statistical Inference for the Factor Model Approach to Estimate Causal Effects in Quasi-Experimental Settings.” Journal of Marketing Research 60(3):449-472.