Factor Model Approach (FMA)

Contents

Factor Model Approach (FMA)#

When to Use This Method#

Quasi-experiments are the bread and butter of marketing causal inference: a policy or firm decision switches on for one unit (a state legalises recreational marijuana, a digitally native retailer opens a physical showroom in Brooklyn) and you want the sales effect against a pool of untreated cities, stores, or categories. The two reflexive choices each strain in this setting:

  • Difference-in-differences (DiD) assumes the time effect is the same for the treated unit and every control – the parallel-trends assumption. Marketing panels (sales, share, macro series) rarely oblige: different markets ride different seasonal, regional, and business-cycle waves. When that homogeneity fails, DiD carries a large estimation bias that does not shrink as the panel grows – in Li and Sonnier’s own MSE comparison it is so biased they drop it from the table.

  • Synthetic control (SC) relaxes that by weighting controls on the simplex (nonnegative, sum-to-one, no intercept), which pins the counterfactual inside the convex hull of the controls. That is a feature when the treated unit is “surrounded” by donors and you want an interpretable convex-combination weight story, but a bug when the treated unit sits outside the donor range, and SC has no inference theory when controls are many and the data are non-stationary of unknown form.

The factor model approach (FMA) of Li and Sonnier ([FMA]) targets exactly the regime SC and DiD struggle with: many control units, time effects that differ across units, a treated path that may lie outside the donor range, and a need for honest confidence intervals. Instead of weighting the controls directly, it first projects the control panel onto a small set of latent factors (interactive fixed effects: \(y^0_{it} = \lambda_i' F_t + e_{it}\)), then regresses the treated unit’s pre-period outcome on those factors with no constraint on the loadings. Three properties follow, and they are the reasons to reach for it:

  1. Heterogeneous time effects. The interactive \(\lambda_i' F_t\) structure nests two-way fixed effects (DiD) as the special case \(\lambda_i = (\xi_i, 1)'\), \(F_t = (1, \delta_t)'\), and also absorbs unit-specific trends and cross-sectional dependence. You are not betting on a common shock.

  2. The counterfactual can leave the donor hull. Because the loadings are unrestricted (no nonnegativity, no sum-to-one, no zero intercept), FMA fits treated units whose level or trajectory is outside the range of every control – the case SC structurally cannot represent.

  3. Robust to a large, growing donor pool. Unconstrained regression on the raw controls (the Hsiao-Ching-Wan approach) overfits and breaks down once \(N_{co}\) approaches or exceeds \(T_1\). FMA’s dimension reduction sidesteps this: it benefits from more controls (they sharpen the factor estimates) rather than being destabilised by them. The same single factor extraction also scales cheaply to many treated units and staggered timing, since it is done once.

The paper’s headline contribution is the fourth reason: valid, computationally cheap inference. FMA delivers a closed-form normal confidence interval for the ATT (Theorems 3.1 / 3.3) that is valid for both stationary and non-stationary data and – critically – accommodates treated and control units with different error variances. The previously standard Xu (2017) bootstrap assumes \(\sigma^2_{\text{tr}} = \sigma^2_{\text{co}}\); when that fails (the norm for non-randomised quasi-experiments) it over- or under-covers badly, while permutation/placebo tests need an analogous symmetry. In the California beer application the estimated variance ratio is \(\hat\sigma^2_{\text{tr}} / \hat\sigma^2_{\text{co}} \approx 37\), so the bootstrap interval comes out less than half its honest width. FMA’s normal interval needs no such assumption and no resampling loop.

Factor projection vs. donor weighting#

Every synthetic-control-family estimator answers what reproduces the treated unit’s untreated path? FMA’s structural bet is distinctive:

The controls’ untreated outcomes share a low-dimensional latent factor structure; estimate those factors, then load the treated unit onto them without constraint.

DiD

Synthetic control

Factor model (FMA)

Comparison built from

all controls, equal weight

all controls, simplex weights

latent factors of the controls

Time effects

homogeneous (common shock)

implicit via weights

heterogeneous \(\lambda_i' F_t\)

Counterfactual outside donor hull

allowed

no – convex hull

yes – unrestricted loadings

Many controls (\(N_{co} \gtrsim T_1\))

bias does not shrink

overfits if restrictions relaxed

benefits – sharper factors

Inference, non-stationary data

standard

none available

normal CI (Thm 3.1 / 3.3)

Unequal treated/control variance

n/a

placebo needs symmetry

handled directly

Reach for FMA when#

  • You have a single treated unit (or a handful, or many – the factor step generalises) and a large control pool, especially \(N_{co}\) large relative to the pre-period \(T_1\), where unrestricted regression on raw donors would overfit.

  • The outcome is plausibly driven by a few common latent factors – regional sales, market share, macro-linked categories – so the control matrix is approximately low-rank. This is what the factor projection exploits.

  • You expect heterogeneous time dynamics across units (different seasonality, trends, cross-sectional dependence) that violate DiD’s parallel-trends assumption.

  • The treated unit’s level or trajectory may sit outside the range of the controls, so SC’s convex-hull restriction would distort the fit.

  • You need formal inference – a hypothesis test or confidence interval for the ATT – and you cannot defend the equal-variance assumption behind the Xu (2017) bootstrap or the symmetry behind permutation/placebo tests. FMA’s normal CI is valid under unequal variances and under non-stationarity, and it is far cheaper than resampling.

  • You have enough data for the asymptotics: the paper’s simulations show the normal CI is reliable for \(N_{co} \ge 30\) and \(T_1 \ge 30\) (at \(T_2 = 20\)). For smaller panels the paper suggests a \(t_{T_1 - (N_{co} + 1)}\) reference distribution, which mlsynth does not yet expose; treat the normal CI cautiously there.

Do not use FMA when#

Overview#

FMA implements Li, K. T., & Sonnier, G. P. (2023). “Statistical Inference for the Factor Model Approach to Estimate Causal Effects in Quasi-Experimental Settings.” Journal of Marketing Research, 60(3):449-472. The estimator constructs a counterfactual for a single treated unit by

  1. extracting principal-component factors from the control panel (with the number of factors chosen by the paper’s modified Bai-Ng criterion for stationary outcomes or Bai (2004) IPC1 for non- stationary outcomes);

  2. projecting the treated unit’s pre-treatment outcomes onto a constant plus the factors via OLS to recover the loading \(\hat\lambda_1\);

  3. forming the counterfactual \(\hat y^0_{1, t} = \tilde F_t' \hat\lambda_1\) for every period; the ATT is the mean post-treatment gap.

The paper’s distinctive contribution is valid statistical inference for the ATT. FMA in mlsynth exposes three procedures in parallel; the user picks any subset via FMAConfig.inference_methods:

  • "asymptotic" (default) – Theorem 3.1 (stationary) / Theorem 3.3 (non-stationary) normal CI for the ATT, built from the variance decomposition \(\widehat \Omega = \widehat \Omega_1 + \widehat \Omega_2\).

  • "bootstrap" – Web Appendix F residual bootstrap for per-period \(\widehat{ATT}_t\) CIs.

  • "placebo" – Web Appendix G control-as-pseudo-treated band.

Mathematical Formulation#

Setup#

We observe one treated unit (indexed 1) and N_co control units over T periods. Treatment begins at \(T_0 + 1\). Under the factor model

\[y_{i, t}^0 = \lambda_i' F_t + e_{i, t}, \qquad y_{i, t} = y_{i, t}^0 + D_{i, t} \tau_{i, t},\]

the goal is to estimate

\[ATT = \frac{1}{T - T_0} \sum_{t > T_0} \tau_{1, t} = \frac{1}{T - T_0} \sum_{t > T_0} (y_{1, t} - y_{1, t}^0).\]

Factor extraction#

Factors \(\hat F_t\) are extracted from the control panel via PCA after demeaning (or standardising) each control series. The number of factors \(r\) is chosen by one of two criteria:

  • Modified Bai-Ng (MBN) – stationary data. Choose \(r \in \{0, 1, \dots, r_{\max}\}\) minimising

    \[PC_{MBN}(r) = \frac{1}{N_{co} T} \sum_{i=2}^{N} \sum_{t=1}^{T} (y_{i, t} - \hat\lambda_i' \hat F_t)^2 + c_{N, T}\, r\, \hat\sigma^2 \frac{N_{co} + T}{N_{co} T} \log \frac{N_{co} + T}{N_{co} T},\]

    with the small-sample adjustment

    \[c_{N, T} = \frac{(N_{co} + \max(70 - N_{co}, 0)) (T + \max(70 - T, 0))}{N_{co} T}.\]

    When N_co, T >= 70 the adjustment collapses to 1 and the criterion is identical to Bai-Ng (2002) \(PC_{p1}\).

  • Bai (2004) IPC1 – non-stationary data. Replaces the small-sample factor with a log-log adjustment suited to non- stationary factors.

Override the data-driven selection by passing FMAConfig.n_factors directly.

Loading and counterfactual#

With \(\tilde F_t = [1, \hat F_t']'\) and pre-period OLS,

\[\hat\lambda_1 = \biggl(\sum_{t = 1}^{T_0} \tilde F_t \tilde F_t'\biggr)^{-1} \sum_{t = 1}^{T_0} \tilde F_t\, y_{1, t}, \qquad \hat y^0_{1, t} = \tilde F_t' \hat\lambda_1.\]

The ATT is \(\widehat{ATT} = (T - T_0)^{-1} \sum_{t > T_0} (y_{1, t} - \hat y^0_{1, t})\).

Asymptotic inference (Theorem 3.1 / 3.3)#

Write \(\bar{\tilde F}_2 = (T - T_0)^{-1} \sum_{t > T_0} \tilde F_t\) and \(\widehat \Psi = (\sum_{t \le T_0} \tilde F_t \tilde F_t')^{-1}\). The paper shows

\[\widehat \Omega = \widehat \Omega_1 + \widehat \Omega_2, \quad \widehat \Omega_1 = \frac{T - T_0}{T_0}\, \bar{\tilde F}_2'\, \widehat \Psi\, \bar{\tilde F}_2, \quad \widehat \Omega_2 = \widehat \sigma_e^2,\]

with \(\widehat \sigma_e^2\) the variance of the pre-treatment residuals. The \((1 - \alpha)\) CI for the ATT is

\[\widehat{ATT} \pm z_{1 - \alpha / 2}\, \frac{\sqrt{\widehat \Omega}}{\sqrt{T - T_0}}.\]

A two-sided z-test of \(H_0: ATT = 0\) reports the p-value.

Bootstrap inference (Web Appendix F)#

The paper notes that the per-period \(ATT_t\) CI cannot shrink to zero as \(T_0, T - T_0, N_{co} \to \infty\) because its leading term is the idiosyncratic shock \(e_{1, t}\) itself. A residual bootstrap therefore drives the per-period CI:

  1. Compute pre-period residuals \(\hat e_{1, t} = y_{1, t} - \tilde F_t' \hat\lambda_1\), \(t = 1, \dots, T_0\).

  2. For each bootstrap draw \(b = 1, \dots, B\):

    • Sample \(e^*_{1, t}\) from \(\{\hat e_{1, t}\}\) with replacement for every \(t \in \{1, \dots, T\}\).

    • Form \(y^*_{1, t} = \tilde F_t' \hat\lambda_1 + e^*_{1, t}\).

    • Re-estimate \(\hat\lambda^{*}_1\) from the bootstrap pre-period.

    • Compute \(\Delta^*_{1, t} = y^*_{1, t} - \tilde F_t' \hat\lambda^{*}_1\) for \(t > T_0\).

  3. The \((1 - \alpha)\) CI for \(\Delta_{1, t}\) is

    \[\bigl[\widehat \Delta_{1, t} - \Delta^*_{1, t, ((1 - \alpha/2) B)}, \widehat \Delta_{1, t} - \Delta^*_{1, t, (\alpha B / 2)}\bigr].\]

Placebo inference (Web Appendix G)#

Set each control unit in turn as a pseudo-treated unit and re-fit the factor model on the remaining controls; collect the \(N_{co}\) pseudo-ATT curves and report their pointwise \((\alpha/2, 1 - \alpha/2)\) quantile band. The paper notes the placebo test is sensitive to error-variance heterogeneity between the treated unit and the controls – when that assumption is violated the asymptotic CI from the previous paragraph is preferred.

Core API#

Factor Model Approach (FMA) estimator.

Li, K. T., & Sonnier, G. P. (2023). “Statistical Inference for the Factor Model Approach to Estimate Causal Effects in Quasi-Experimental Settings.” Journal of Marketing Research, 60(3), 449-472.

FMA estimates the ATT for a single treated unit by:

  1. Extracting principal-component factors from the control panel (paper Section 3.1; factor count chosen by the modified Bai-Ng (MBN) criterion for stationary data or Bai (2004) IPC1 for non-stationary data – see Web Appendix D).

  2. Projecting the treated unit’s pre-treatment outcomes onto a constant + the factors via OLS to recover the loading.

  3. Forming the counterfactual hat y^0_{1, t} = F_aug_t' lambda_hat for every period; the ATT is the mean post-treatment gap.

Inference is the paper’s main contribution. Three procedures live side by side:

  • asymptotic (default) – Theorem 3.1 (stationary) / Theorem 3.3 (non-stationary) normal CI for the ATT, built from the variance decomposition Omega_hat = Omega1 + Omega2.

  • bootstrap – Web Appendix F residual bootstrap for per-period ATT_t CIs.

  • placebo – Web Appendix G control-as-pseudo-treated band.

Activate any combination via FMAConfig.inference_methods.

class mlsynth.estimators.fma.FMA(config: FMAConfig | dict)#

Bases: object

Factor Model Approach (Li & Sonnier 2023) estimator.

Parameters:

config (FMAConfig or dict) – Configuration object. See mlsynth.config_models.FMAConfig.

Returns:

FMAResults – Frozen container with the optimised design, asymptotic + bootstrap + placebo inference, counterfactual, ATT, and per-period gap.

fit() FMAResults#

Run the FMA pipeline end to end.

Configuration#

class mlsynth.config_models.FMAConfig(*, df: ~pandas.DataFrame, outcome: str, treat: str, unitid: str, time: str, display_graphs: bool = True, save: bool | str = False, counterfactual_color: ~typing.List[str] = <factory>, treated_color: str = 'black', stationarity: ~typing.Literal['stationary', 'nonstationary'] = 'nonstationary', preprocessing: ~typing.Literal['demean', 'standardize'] = 'demean', n_factors: ~typing.Annotated[int | None, ~annotated_types.Ge(ge=1)] = None, max_factors: ~typing.Annotated[int, ~annotated_types.Ge(ge=1)] = 10, alpha: ~typing.Annotated[float, ~annotated_types.Gt(gt=0.0), ~annotated_types.Lt(lt=1.0)] = 0.05, inference_methods: ~typing.List[str] = <factory>, n_bootstrap: ~typing.Annotated[int, ~annotated_types.Ge(ge=100)] = 1000, bootstrap_seed: int = 0)#

Configuration for the Factor Model Approach (FMA) estimator.

Implements Li & Sonnier (2023), “Statistical Inference for the Factor Model Approach to Estimate Causal Effects in Quasi-Experimental Settings”, JMR 60(3):449-472. FMA estimates the ATT for a single treated unit by extracting principal- component factors from the control panel, projecting the treated unit’s pre-period outcomes onto those factors, and using the fitted loading to predict the untreated potential outcome in the post-period.

Parameters:
  • stationarity ({“stationary”, “nonstationary”}) – Selects the factor-selection criterion: "stationary" uses the paper’s modified Bai-Ng (MBN) criterion (Web Appendix D.1); "nonstationary" uses Bai (2004) IPC1 with a log-log adjustment for non-stationary factors. Default: "nonstationary" (the paper’s recommendation for general applied settings).

  • preprocessing ({“demean”, “standardize”}) – Preprocessing applied to the control panel before PCA.

  • n_factors (int or None) – Override the data-driven factor count. None triggers the criterion in stationarity.

  • max_factors (int) – Upper bound passed to the factor-selection routine.

  • alpha (float) – Two-sided significance level for CIs.

  • inference_methods (list of {“asymptotic”, “bootstrap”, “placebo”}) – Inference procedures to run. Defaults to ["asymptotic"], which gives the paper’s Theorem 3.1 normal CI for the ATT. Add "bootstrap" to get per-period ATT_t CIs via the Web Appendix F residual bootstrap, and "placebo" to get the Web Appendix G control-as-pseudo-treated band.

  • n_bootstrap (int) – Number of bootstrap replicates (Web Appendix F). Ignored when "bootstrap" is not in inference_methods.

  • bootstrap_seed (int) – Seed for the bootstrap RNG.

alpha: float#
bootstrap_seed: int#
classmethod check_fma_params(values: Any) Any#
inference_methods: List[str]#
max_factors: int#
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid'}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

n_bootstrap: int#
n_factors: int | None#
preprocessing: Literal['demean', 'standardize']#
stationarity: Literal['stationary', 'nonstationary']#

Helper Modules#

Data preparation for the FMA estimator.

mlsynth.utils.fma_helpers.setup.prepare_fma_inputs(df: DataFrame, outcome: str, treat: str, unitid: str, time: str, preprocessing: str = 'demean', stationarity: str = 'nonstationary') FMAInputs#

Pivot the panel and assemble FMA inputs.

Parameters:
  • df (pd.DataFrame) – Long balanced panel with one row per (unit, time).

  • outcome, treat, unitid, time (str) – Column names.

  • preprocessing ({“demean”, “standardize”}) – Preprocessing applied to the control panel before factor extraction. Demeaning (default) follows Bai (2003); standardizing is preferred when the control series have very heterogeneous variances.

  • stationarity ({“stationary”, “nonstationary”}) – Selects the factor-selection criterion downstream: "stationary" uses the modified Bai-Ng (MBN) criterion (Li & Sonnier 2023, Web Appendix D), "nonstationary" uses Bai (2004) IPC1.

Returns:

FMAInputs – Preprocessed panel.

Raises:

MlsynthDataError, MlsynthConfigError

Factor extraction + selection for FMA.

Wraps the bai_ng.nbpiid() helper to apply the appropriate Li & Sonnier (2023) selection criterion:

  • stationarity="stationary" -> modified Bai-Ng (MBN) – nbpiid criterion code 10 with N_co + max(70 - N_co, 0) / T + max(70 - T, 0) adjustments (Web Appendix D.1).

  • stationarity="nonstationary" -> Bai (2004) IPC1 with the log-log adjustment – nbpiid criterion code 11.

A user may also bypass the data-driven selection by passing n_factors directly to extract_factors().

mlsynth.utils.fma_helpers.factors.extract_factors(control_panel: ndarray, stationarity: str = 'nonstationary', preprocessing: str = 'demean', n_factors: int | None = None, max_factors: int = 10) Tuple[int, ndarray, ndarray, str]#

Estimate the factor structure from the control panel.

Parameters:
  • control_panel (np.ndarray) – (T, N_co) matrix of control-unit outcomes.

  • stationarity ({“stationary”, “nonstationary”}) – Drives the factor-selection criterion (MBN vs IPC1).

  • preprocessing ({“demean”, “standardize”}) – Mapped to nbpiid’s preprocessing_method_code: 1 for demean, 2 for standardize.

  • n_factors (int, optional) – Override the data-driven selection. None triggers MBN/IPC1.

  • max_factors (int) – Upper bound passed to the selection routine; ignored when n_factors is supplied.

Returns:

  • n_factors_selected (int) – Final number of factors used.

  • common_component (np.ndarray) – (T, N_co) reconstruction of the control panel.

  • factors (np.ndarray) – (T, r) matrix of estimated factors.

  • source (str) – "MBN", "IPC1", or "user".

Treated-unit projection and counterfactual construction for FMA.

Given the factor estimates F_hat from the control panel, the treated unit’s factor loading is recovered by OLS of its pre-period outcomes on the factors (with a constant):

\[\hat \lambda_1 = (\sum_{t=1}^{T_0} \tilde F_t \tilde F_t')^{-1} \sum_{t=1}^{T_0} \tilde F_t \, y_{1, t},\]

with \(\tilde F_t = [1, F_{ht}']'\). The full-period counterfactual is \(\hat y^0_{1, t} = \tilde F_t' \hat\lambda_1\).

mlsynth.utils.fma_helpers.fit.estimate_loading_and_counterfactual(treated_outcome: ndarray, factors: ndarray, T0: int) Tuple[ndarray, ndarray, ndarray, float]#

OLS projection of the treated pre-period onto the factors.

Parameters:
  • treated_outcome (np.ndarray) – Treated outcome series, shape (T,).

  • factors (np.ndarray) – Estimated factor matrix, shape (T, r).

  • T0 (int) – Number of pre-treatment periods.

Returns:

  • lambda_hat (np.ndarray) – Loading vector (r + 1,) (constant + factor loadings).

  • counterfactual (np.ndarray) – Predicted untreated outcome for every period, shape (T,).

  • factors_with_const (np.ndarray) – Factor matrix with the constant column prepended, shape (T, r + 1). Returned for downstream variance plug-ins.

  • residual_variance (float) – Variance of the pre-treatment residuals.

Inference procedures for FMA (Li & Sonnier 2023).

Three procedures live here and can run in any combination:

  • asymptotic_inference() – Theorem 3.1 (stationary) / Theorem 3.3 (non-stationary) normal CI for the ATT.

  • bootstrap_inference() – Web Appendix F residual bootstrap for per-period ATT_t CIs. Uses the pre-period residuals as the bootstrap distribution and refits the loading on each draw, so the CIs reflect the joint variability of the loading estimate and the idiosyncratic shock at time t.

  • placebo_inference() – Web Appendix G placebo test where every control is treated as a pseudo-treated unit in turn. Returns the pointwise quantile band across the placebo ATT curves.

mlsynth.utils.fma_helpers.inference.asymptotic_inference(treated_outcome: ndarray, counterfactual: ndarray, factors_with_const: ndarray, residual_variance: float, T0: int, alpha: float = 0.05) Tuple[float, float, float, float]#

Theorem 3.1 normal CI for the ATT.

Returns:

se_att, lower, upper, p_value

mlsynth.utils.fma_helpers.inference.bootstrap_inference(treated_outcome: ndarray, factors: ndarray, counterfactual: ndarray, T0: int, alpha: float = 0.05, n_replicates: int = 1000, seed: int = 0) dict#

Web Appendix F residual bootstrap for per-period ATT_t CIs.

Procedure#

  1. Compute pre-period residuals e_hat_1t = y_1t - F_aug_t' lambda_hat for t = 1, ..., T0.

  2. For each bootstrap draw b = 1, …, B: a. Sample e*_1t from {e_hat} with replacement for every t. b. Form y*_1t = F_aug_t' lambda_hat + e*_1t. c. Re-fit the loading on the bootstrap pre-period. d. Compute Delta*_1t = y*_1t - F_aug_t' lambda*_hat for t > T0.

  3. The (1 - alpha) CI for Delta_1t is [Delta_hat_1t - q_{1 - alpha/2}, Delta_hat_1t - q_{alpha/2}], with quantiles taken across bootstrap replicates of Delta*.

mlsynth.utils.fma_helpers.inference.placebo_inference(control_outcomes: ndarray, treated_outcome: ndarray, T0: int, n_factors: int | None, stationarity: str, preprocessing: str, alpha: float = 0.05, max_factors: int = 10) dict#

Web Appendix G: pseudo-ATT curves with each control as the treated unit.

For each control k, swap it into the treated slot, refit the factor model on the remaining N_co - 1 controls, project the pseudo-treated pre-period onto those factors, compute the pseudo- ATT curve. The output band is the pointwise alpha/2 / (1 - alpha/2) quantile across the placebo curves at each period.

Parameters:
  • control_outcomes (np.ndarray) – (T, N_co) control panel.

  • treated_outcome (np.ndarray) – Real treated outcome series, shape (T,). Used for the leading row of the curves matrix (so the caller can compare).

  • T0 (int) – Pre-treatment periods.

  • n_factors (int or None) – Number of factors to fit on each placebo iteration. None means re-select per iteration via the criterion in stationarity.

  • stationarity ({“stationary”, “nonstationary”})

  • preprocessing ({“demean”, “standardize”})

  • alpha (float) – Significance level for the quantile band.

  • max_factors (int) – Upper bound on the factor-selection routine.

Returns:

dict – Keys curves ((N_co + 1, T); first row is the real treated unit), q_lower, q_upper ((T,) bands).

The Web Appendix E.1 Monte Carlo DGP1, packaged as simulate_fma_sample() so the Verification replication below runs as a one-liner.

Li & Sonnier (2023) DGP1 / DGP2 simulation helpers for the FMA estimator.

Implements the data-generating processes from the paper’s Web Appendix E.

DGP1 (Section 4.1, stationary factors, Hsiao-Ching-Wan 2012):

\[\begin{split}f_{1t} &= 0.8\, f_{1,t-1} + v_{1t}, \\ f_{2t} &= -0.68\, f_{2,t-1} + v_{2t} + 0.8\, v_{2,t-1}, \\ f_{3t} &= v_{3t} + 0.9\, v_{3,t-1} + 0.4\, v_{3,t-2}, \qquad v_{kt} \sim \mathcal{N}(0, 1).\end{split}\]

DGP2 (Web Appendix E.1, non-stationary factors):

\[\begin{split}F_{1t} &= (0.2 + \xi_t)\, t + \varepsilon_{5t} \qquad \text{(linear trend, random slope)} \\ F_{2t} &= F_{2,t-1} + \varepsilon_{4t} \qquad \text{(drift-less unit root)} \\ F_{3t} &= \sqrt{t} + \varepsilon_{6t} + 0.9\, \varepsilon_{6,t-1} + 0.4\, \varepsilon_{6,t-2} \qquad \text{(sqrt-}t\text{ trend, MA(2) errors)}\end{split}\]

with \(\xi_t \sim \mathrm{Uniform}[0,1]\) and \(\varepsilon_{kt} \sim \mathcal{N}(0, 1)\). (The paper’s TeX prints F_{2t} = F_{4,t-1} + ε_{4t}, which is a typo; the surrounding text describes \(F_{2t}\) as a drift-less unit root, so F_{4,t-1} reads as F_{2,t-1}.)

Either DGP feeds the same factor model

\[y_{it}^0 = \alpha + F_t' \lambda_i + e_{it}, \qquad i = 1, \ldots, N, \quad t = 1, \ldots, T,\]

with loadings \(\lambda_{il} \stackrel{\text{iid}}{\sim} \mathcal{N}(1, 1)\) and idiosyncratic shocks

\[e_{1t} \sim \mathcal{N}(0, \sigma_{\text{tr}}^2), \qquad e_{it} \sim \mathcal{N}(0, \sigma_{\text{co}}^2) \quad (i \ge 2).\]

The paper studies three variance regimes via the keyword variance_case:

  • "equal"\(\sigma_{\text{tr}}^2 = \sigma_{\text{co}}^2 = 1\) (Figures 2, W.5; both the new CI and Xu’s bootstrap nominal);

  • "treated_smaller"\(\sigma_{\text{tr}} = 0.5 \sigma_{\text{co}}\) (Figures 3, W.6; Xu’s bootstrap overcovers);

  • "treated_larger"\(\sigma_{\text{tr}} = 2 \sigma_{\text{co}}\) (Figures 4, W.7; Xu’s bootstrap undercovers).

True ATT = 0 in every draw; the paper’s centred statistic \(\sqrt{T_2}(\hat\Delta_1 - \Delta_1)\) is invariant to a constant treatment effect (see the equation following Equation 4.2), so coverage is unaffected by setting \(\Delta_{1t} = 0\).

class mlsynth.utils.fma_helpers.simulation.FMASample(df: DataFrame, Y_treated: ndarray, Y_controls: ndarray, factors: ndarray, T1: int, T2: int, sigma_tr: float, sigma_co: float, dgp: str = 'dgp1')#

One draw from a Li & Sonnier DGP.

df#

Long panel with columns unit / time / y / D ready for mlsynth.FMA.

Type:

pd.DataFrame

Y_treated#

Treated-unit outcome path, shape (T,).

Type:

np.ndarray

Y_controls#

Control outcomes, shape (N_co, T).

Type:

np.ndarray

factors#

Common factor matrix, shape (T, 3).

Type:

np.ndarray

T1, T2

Pre- / post-treatment period counts.

Type:

int

sigma_tr, sigma_co

Idiosyncratic error standard deviations actually drawn.

Type:

float

dgp#

Which DGP was drawn ("dgp1" or "dgp2").

Type:

str

T1: int#
T2: int#
Y_controls: ndarray#
Y_treated: ndarray#
df: DataFrame#
dgp: str = 'dgp1'#
factors: ndarray#
sigma_co: float#
sigma_tr: float#
mlsynth.utils.fma_helpers.simulation.simulate_fma_sample(dgp: Literal['dgp1', 'dgp2'] = 'dgp1', N_co: int = 30, T1: int = 30, T2: int = 20, variance_case: Literal['equal', 'treated_smaller', 'treated_larger'] = 'equal', alpha: float = 0.0, rng: Generator | None = None) FMASample#

Draw one sample from a Li & Sonnier (2023) DGP.

Parameters:
  • dgp ({“dgp1”, “dgp2”}, default "dgp1") – Stationary (DGP1, Section 4.1) or non-stationary (DGP2, Web Appendix E.1) factor process.

  • N_co (int, default 30) – Number of control units (the paper’s coverage exercise uses 30).

  • T1, T2 (int, default 30, 20) – Pre- and post-treatment period counts.

  • variance_case ({“equal”, “treated_smaller”, “treated_larger”}) – Which of the paper’s three variance regimes to simulate.

  • alpha (float, default 0.0) – Intercept shared by every unit (the paper’s \(\alpha\); coverage is invariant to its value).

  • rng (np.random.Generator, optional) – NumPy RNG. Defaults to np.random.default_rng().

Returns:

FMASample

Diagnostic plot for FMA results.

mlsynth.utils.fma_helpers.plotter.plot_fma(results: FMAResults) None#

Two-panel plot: trajectories + per-period gap with CI bands.

Frozen dataclasses for the Factor Model Approach (FMA) estimator.

FMA implements Li and Sonnier (2023), “Statistical Inference for the Factor Model Approach to Estimate Causal Effects in Quasi-Experimental Settings”, JMR 60(3):449-472. The estimator constructs a counterfactual for a single treated unit by (i) extracting principal-component factors from the control panel, (ii) projecting the treated unit’s pre-period outcomes onto those factors, and (iii) using the resulting loadings to predict the treated unit’s untreated potential outcomes in the post-period.

Inference is the paper’s main contribution. Three procedures live side-by-side in FMAInference:

  • asymptotic – Theorem 3.1 (stationary) / Theorem 3.3 (non-stationary) normal CIs for the ATT.

  • bootstrap – Web Appendix F residual bootstrap for per-period ATT_t CIs.

  • placebo – Web Appendix G control-as-pseudo-treated bands.

The four layers below (inputs, design, inference, results) keep that pipeline pluggable.

class mlsynth.utils.fma_helpers.structures.FMADesign(n_factors: int, n_factors_source: str, factors: ndarray, lambda_hat: ndarray, counterfactual: ndarray, gap: ndarray, common_component: ndarray, residual_variance: float)#

Factor-model design produced by the FMA fit.

Parameters:
  • n_factors (int) – Number of common factors selected (or user-supplied).

  • n_factors_source (str) – "MBN" (modified Bai-Ng), "IPC1" (Bai 2004), "user" (user override).

  • factors (np.ndarray) – Estimated factors F_hat, shape (T, r).

  • lambda_hat (np.ndarray) – Estimated factor loading for the treated unit, shape (r + 1,) (intercept + loadings).

  • counterfactual (np.ndarray) – Predicted treated potential outcome under no treatment, shape (T,).

  • gap (np.ndarray) – Observed treated minus counterfactual, shape (T,).

  • common_component (np.ndarray) – Reconstructed control panel via factor projection, shape (T, N_co).

  • residual_variance (float) – Variance of the pre-treatment residuals y_1t - F_hat_t' lambda_hat.

common_component: ndarray#
counterfactual: ndarray#
factors: ndarray#
gap: ndarray#
lambda_hat: ndarray#
n_factors: int#
n_factors_source: str#
residual_variance: float#
class mlsynth.utils.fma_helpers.structures.FMAInference(method: str, alpha: float, att: float, asymptotic_att_se: float = nan, asymptotic_att_lower: float = nan, asymptotic_att_upper: float = nan, asymptotic_att_p_value: float = nan, bootstrap_att_t_lower: ~numpy.ndarray = <factory>, bootstrap_att_t_upper: ~numpy.ndarray = <factory>, bootstrap_replicates: ~numpy.ndarray = <factory>, bootstrap_n_replicates: int = 0, placebo_att_curves: ~numpy.ndarray = <factory>, placebo_quantile_lower: ~numpy.ndarray = <factory>, placebo_quantile_upper: ~numpy.ndarray = <factory>)#

Combined asymptotic + bootstrap + placebo inference output.

Parameters:
  • method (str) – Primary inference method tag ("asymptotic", "bootstrap", "placebo", or "none").

  • alpha (float) – Two-sided significance level.

  • att (float) – Mean post-treatment gap.

  • asymptotic_att_se (float) – Standard error of the ATT from the paper’s Theorem 3.1.

  • asymptotic_att_lower (float)

  • asymptotic_att_upper (float) – (1 - alpha) asymptotic CI bounds.

  • asymptotic_att_p_value (float) – Two-sided z-test of H_0: ATT = 0.

  • bootstrap_att_t_lower (np.ndarray)

  • bootstrap_att_t_upper (np.ndarray) – Per-period bootstrap CI bounds for ATT_t, shape (n_post,). Empty when method != "bootstrap".

  • bootstrap_replicates (np.ndarray) – (B, n_post) matrix of bootstrap replicate ATT_t.

  • bootstrap_n_replicates (int) – Number of bootstrap draws actually completed.

  • placebo_att_curves (np.ndarray) – (N_co, T) matrix of pseudo-ATT curves (one per control used as a placebo treated unit). Empty when method != "placebo".

  • placebo_quantile_lower (np.ndarray)

  • placebo_quantile_upper (np.ndarray) – Pointwise (alpha/2, 1 - alpha/2) quantile bands across the placebo curves at each period. Shape (T,).

alpha: float#
asymptotic_att_lower: float = nan#
asymptotic_att_p_value: float = nan#
asymptotic_att_se: float = nan#
asymptotic_att_upper: float = nan#
att: float#
bootstrap_att_t_lower: ndarray#
bootstrap_att_t_upper: ndarray#
bootstrap_n_replicates: int = 0#
bootstrap_replicates: ndarray#
method: str#
placebo_att_curves: ndarray#
placebo_quantile_lower: ndarray#
placebo_quantile_upper: ndarray#
class mlsynth.utils.fma_helpers.structures.FMAInputs(treated_outcome: ndarray, control_outcomes: ndarray, donor_names: ndarray, treated_unit_name: Any, T: int, T0: int, time_labels: ndarray, preprocessing: str, stationarity: str)#

Preprocessed panel data for FMA estimation.

Parameters:
  • treated_outcome (np.ndarray) – Outcome series for the treated unit, shape (T,).

  • control_outcomes (np.ndarray) – Outcome matrix for the N_co control units, shape (T, N_co).

  • donor_names (np.ndarray) – Labels of the control units, length N_co.

  • treated_unit_name (Any) – Label of the treated unit.

  • T (int) – Total number of panel periods.

  • T0 (int) – Number of pre-treatment periods.

  • time_labels (np.ndarray) – Labels of the time periods, length T.

  • preprocessing (str) – "demean" (default) or "standardize" – preprocessing applied to the control panel before PCA.

  • stationarity (str) – "stationary" or "nonstationary" – selects the appropriate factor-selection criterion (MBN vs Bai-IPC1).

property N_co: int#

Number of control units.

T: int#
T0: int#
control_outcomes: ndarray#
donor_names: ndarray#
property n_post: int#

Number of post-treatment periods.

preprocessing: str#
stationarity: str#
time_labels: ndarray#
treated_outcome: ndarray#
treated_unit_name: Any#
class mlsynth.utils.fma_helpers.structures.FMAResults(inputs: ~mlsynth.utils.fma_helpers.structures.FMAInputs, design: ~mlsynth.utils.fma_helpers.structures.FMADesign, inference: ~mlsynth.utils.fma_helpers.structures.FMAInference, counterfactual: ~numpy.ndarray, gap: ~numpy.ndarray, att: float, pre_rmse: float, metadata: ~typing.Dict[str, ~typing.Any] = <factory>)#

Top-level container returned by mlsynth.FMA.fit().

Parameters:
  • inputs (FMAInputs) – Preprocessed panel.

  • design (FMADesign) – Factor-model design.

  • inference (FMAInference) – Inference output.

  • counterfactual (np.ndarray) – Convenience alias of design.counterfactual.

  • gap (np.ndarray) – Convenience alias of design.gap.

  • att (float) – Mean post-treatment gap.

  • pre_rmse (float) – Root mean squared pre-treatment fit error.

  • metadata (dict) – Free-form pipeline diagnostics.

att: float#
counterfactual: ndarray#
design: FMADesign#
gap: ndarray#
inference: FMAInference#
inputs: FMAInputs#
metadata: Dict[str, Any]#
property n_factors: int#

Selected number of factors.

pre_rmse: float#

Example#

A self-contained one-draw Monte Carlo at a non-stationary two-factor DGP – the regime the paper’s empirical California / Brooklyn applications fall into. The example fits FMA in all three inference modes and prints the headline output.

"""One draw of a Li & Sonnier (2023) factor-model simulation."""

import numpy as np
import pandas as pd

from mlsynth import FMA


# ---------------------------------------------------------------------
# 1. Simulate one panel from a non-stationary two-factor DGP
# ---------------------------------------------------------------------

rng = np.random.default_rng(0)
J = 20           # control units
T_pre = 30
T_post = 10
T = T_pre + T_post
r_true = 2
tau_true = 1.0   # additive treatment effect on the treated unit

F = rng.standard_normal((T, r_true)).cumsum(axis=0)        # non-stationary
lam = rng.standard_normal((J + 1, r_true))
eps = rng.standard_normal((T, J + 1)) * 0.5
Y0 = F @ lam.T + eps
Y = Y0.copy()
Y[T_pre:, 0] += tau_true                                    # unit 0 treated

rows = [
    {
        "unit": j,
        "time": t,
        "y": float(Y[t, j]),
        "D": int(j == 0 and t >= T_pre),
    }
    for j in range(J + 1)
    for t in range(T)
]
df = pd.DataFrame(rows)


# ---------------------------------------------------------------------
# 2. Fit FMA with all three inference modes
# ---------------------------------------------------------------------

results = FMA({
    "df": df,
    "outcome": "y",
    "treat": "D",
    "unitid": "unit",
    "time": "time",
    "stationarity": "nonstationary",         # IPC1 factor selection
    "preprocessing": "demean",
    "inference_methods": ["asymptotic", "bootstrap", "placebo"],
    "n_bootstrap": 500,
    "alpha": 0.05,
    "display_graphs": False,
}).fit()


# ---------------------------------------------------------------------
# 3. Inspect the output
# ---------------------------------------------------------------------

print(f"true tau              = {tau_true:+.3f}")
print(f"ATT_hat               = {results.att:+.3f}")
print(f"r selected            = {results.design.n_factors} "
      f"({results.design.n_factors_source})")
print(f"pre-RMSE              = {results.pre_rmse:.4f}")
inf = results.inference
print(f"asymptotic 95% CI ATT = "
      f"[{inf.asymptotic_att_lower:+.3f}, {inf.asymptotic_att_upper:+.3f}]")
print(f"asymptotic p-value    = {inf.asymptotic_att_p_value:.3f}")

import pandas as pd
print("\nPer-period ATT_t with bootstrap CI:")
print(pd.DataFrame({
    "t": np.arange(T_pre + 1, T + 1),
    "att_t": results.gap[T_pre:],
    "boot_lower": inf.bootstrap_att_t_lower,
    "boot_upper": inf.bootstrap_att_t_upper,
}).round(3).to_string(index=False))

Verification#

Monte Carlo replication (Path B). Li & Sonnier’s empirical applications – California beer sales and a Brooklyn eyeglass retailer’s showroom – run on Nielsen retail scanner data and a proprietary retailer panel, neither of which is redistributable, so the estimator is validated through the paper’s own Monte Carlo. The methodological contribution of the paper is that the new asymptotic CI from Theorem 3.1 attains nominal coverage regardless of whether \(\sigma_{\text{tr}} = \sigma_{\text{co}}\), while the Xu (2017) bootstrap miscovers badly when the two variances differ (Figures 2-4 Panel A, plus the non-stationary mirror in Web Appendix E.1’s Figures W.5-W.7 and the sample-size sweeps in W.8-W.11).

Both DGPs – the stationary Section 4 DGP1 and the non-stationary Appendix E.1 DGP2 – are packaged in mlsynth.utils.fma_helpers.simulation.simulate_fma_sample(), together with the three variance regimes ("equal" / "treated_smaller" / "treated_larger"). True ATT is zero in every draw; the paper’s centred statistic \(\sqrt{T_2}(\hat\Delta_1 - \Delta_1)\) is invariant to \(\Delta\) (see the equation following 4.2), so coverage doesn’t depend on its value.

Replicating the headline coverage findings is a 15-line script:

import numpy as np
from mlsynth import FMA
from mlsynth.utils.fma_helpers.simulation import simulate_fma_sample

def coverage_cell(dgp, variance_case, N_co, T1, T2, M, alpha=0.05):
    stationarity = "stationary" if dgp == "dgp1" else "nonstationary"
    covers = []
    for j in range(M):
        s = simulate_fma_sample(dgp=dgp, N_co=N_co, T1=T1, T2=T2,
                                  variance_case=variance_case,
                                  rng=np.random.default_rng(j))
        res = FMA({"df": s.df, "outcome": "y", "treat": "D",
                   "unitid": "unit", "time": "time",
                   "stationarity": stationarity,
                   "inference_methods": ["asymptotic"],
                   "alpha": alpha, "display_graphs": False}).fit()
        covers.append(res.inference.asymptotic_att_lower <= 0.0
                       <= res.inference.asymptotic_att_upper)
    return float(np.mean(covers))

for dgp in ("dgp1", "dgp2"):
    for case in ("equal", "treated_smaller", "treated_larger"):
        print(dgp, case,
              coverage_cell(dgp, case, N_co=30, T1=30, T2=20, M=1000))

At \(M = 1{,}000\) (Li & Sonnier use \(M = 100{,}000\); runtime difference is the only material change) mlsynth reproduces all four appendix-figure families. The MC standard error at \(M = 1{,}000\) is \(\sqrt{0.95 \cdot 0.05 / 1000} \approx 0.7\) pp.

Section 4 + Web Appendix E.1: three variance regimes#

\(N_{co} = 30\), \(T_1 = 30\), \(T_2 = 20\). Reproduces Figures 2-4 Panel A (DGP1, stationary) and Figures W.5-W.7 Panel A (DGP2, non-stationary).

DGP

Variance case

\(\sigma_{\text{tr}}\)

\(\sigma_{\text{co}}\)

mlsynth

paper

DGP1

"equal" (Fig 2A)

1.0

1.0

0.947

0.95

DGP1

"treated_smaller" (Fig 3A)

0.5

1.0

0.946

0.95

DGP1

"treated_larger" (Fig 4A)

2.0

1.0

0.951

0.95

DGP2

"equal" (Fig W.5A)

1.0

1.0

0.935

0.95

DGP2

"treated_smaller" (Fig W.6A)

0.5

1.0

0.944

0.95

DGP2

"treated_larger" (Fig W.7A)

2.0

1.0

0.929

0.95

Every cell lands within \(\pm 2.1\) pp of nominal – well inside three MC standard errors. The headline takeaway is the equality of the three numbers within each DGP: coverage does not deteriorate when \(\sigma_{\text{tr}} \ne \sigma_{\text{co}}\), which is precisely the regime where the Xu (2017) bootstrap mistakes the treated-error variance for the control-error variance and either over- or under-covers (Panel B of Figures 3-4 and W.6-W.7 in the paper).

Web Appendix E.2.1: other \((T_1, N_{co})\) combinations#

Equal variance (\(\sigma_{\text{tr}} = \sigma_{\text{co}} = 1\)), \(T_2 = 20\), sample-size sweeps over both DGPs reproducing Figures W.8-W.9 (stationary) and W.10-W.11 (non-stationary).

DGP

\(T_1\)

\(N_{co}\)

mlsynth

paper

DGP1

30

60

0.936

0.95 (Fig W.8A)

DGP1

60

30

0.939

0.95 (Fig W.8B)

DGP1

60

60

0.944

0.95 (Fig W.9A)

DGP1

120

120

0.949

0.95 (Fig W.9B)

DGP2

30

60

0.939

0.95 (Fig W.10A)

DGP2

60

30

0.936

0.95 (Fig W.10B)

DGP2

60

60

0.935

0.95 (Fig W.11A)

DGP2

120

120

0.955

0.95 (Fig W.11B)

All eight cells within \(\pm 1.5\) pp of nominal, and the largest cell (\(T_1 = N_{co} = 120\)) hits the asymptotic regime cleanly – the underlying theory is consistent in both panel dimensions, as the paper asserts.

Not replicated here#

  • Web Appendix E.2.2 (small-N/T \(t\)-distribution) – the paper recommends switching to \(t_{T_1 - (N_{co} + 1)}\) for very small samples, but the suggested degrees-of-freedom value is non-positive in the figure’s specific \((T_1, N_{co}) = (20, 20)\) configuration, and FMA’s public API exposes only the normal CI; the small-sample refinement is left as a future addition.

  • Web Appendix E.3 / W.15-W.17 (DGP3) – the paper bootstraps DGP3’s first factor from the proprietary Brooklyn sales panel, so the precise DGP cannot be reconstructed from public data; the corresponding empirical-application coverage check is therefore out of scope.

For reference, the corresponding empirical applications report:

  • California recreational marijuana legalization on beer sales\(\widehat{\text{ATT}} = -\$88{,}400\) weekly (\(-0.464\%\)), 95% CI \([-\$407{,}400,\ \$230{,}600]\) (Table 2; not significant). The Xu bootstrap’s interval is less than half as wide (\([-\$229{,}000,\ \$82{,}400]\)), invalidly so given the variance ratio \(\hat\sigma_{\text{tr}}^2 / \hat\sigma_{\text{co}}^2 \approx 37.4\).

  • Brooklyn showroom opening on weekly sales\(\widehat{\text{ATT}} = +\$2{,}446\) weekly (\(+27.2\%\)).

References#

Bai, J. (2004). “Estimating Cross-Section Common Stochastic Trends in Nonstationary Panel Data.” Journal of Econometrics 122(1):137-183.

Bai, J., & Ng, S. (2002). “Determining the Number of Factors in Approximate Factor Models.” Econometrica 70(1):191-221.

Li, K. T., & Sonnier, G. P. (2023). “Statistical Inference for the Factor Model Approach to Estimate Causal Effects in Quasi-Experimental Settings.” Journal of Marketing Research 60(3):449-472.