A practitioner’s decision tree#
mlsynth ships dozens of estimators because each one is a named answer to a specific complication that breaks the method before it. The trick to not drowning is to start with the simplest credible method and escalate only when a concrete complication forces you to. That is exactly how this page is organised: a few identification gates first, then – within each branch – a ladder that runs from the easy, canonical case to the specialised, harder ones.
Answer each question yes or no. A yes sends you to a method (or a short list); a no moves you to the next question.
Reason forward: data, then estimand, then assumptions#
Before you open this tree, adopt the discipline that Baker, Callaway, Cunningham, Goodman-Bacon and Sant’Anna call forward engineering (Difference-in-Differences Designs: A Practitioner’s Guide, 2025, arXiv:2503.13323). The temptation many analysts face is to reverse-engineer: reach for a method because it sounds powerful or modern, run it, and only then back out later after they discover their assumptions were incorrect. Forward engineering goes the other way – and it is the right way:
What state are your data in? Take honest stock first: panel or a single series, how many treated units and whether they adopt at the same time, whether assignment was randomized, the length of the pre-period versus the number of donors (\(N\) vs \(T_0\)), missing cells, stationarity, plausible spillovers, the presence of covariates or an instrument.
Given a data structure, what estimand can those data actually support? Fix the target parameter before the estimator – a mean ATT, a population ATE, a per-arm contrast, a quantile / distributional effect. Do not aim at a parameter your design cannot identify just because a method will return a number for it.
Which identifying assumptions are most defensible for that estimand, given that data state? Parallel trends? The convex-hull / no-extrapolation condition? No interference (SUTVA)? Exogeneity conditional on the latent factors? Proxy or instrument validity? Then, and only then, choose the method whose assumptions you can actually defend.
The questions below operationalize exactly that order as best as possible – data state, then estimand, then assumptions. But one caveat matters more than any single gate: this tree routes you to a method whose assumptions *match* your answers; it does not *verify* those assumptions for you.
Each estimator encodes technical conditions a one-line summary cannot capture. Follow the link at every leaf and read the original paper for the precise assumptions, the inference theory, and the documented failure modes. Treat no method as infallible: every estimate here is conditional on assumptions that are your job to defend, not the software’s.
Note
The gates are a guide, not a strict partition: several methods answer more than one question, and a real problem can trip two at once.
At a glance#
GATE 0 — Identification pre-screen (answer these first)
────────────────────────────────────────────────────────
Are you DESIGNING the experiment (treatment not yet assigned)? ── yes ─► PART 3
Is assignment RANDOMIZED? ── yes, many small units ─► difference-in-means
└─ yes, few large units ─► MUSC
Do CONTROL UNITS exist at all? ── no (everyone treated) ─► SHC
Is treatment ENDOGENOUS (SC can't absorb)? ── have an instrument ─► SIV
└─ have proxies / NCs ─► PROXIMAL
Do parallel trends hold AND fixed-T/large-N? ── yes ─► plain Difference-in-Differences
(you may not need SC)
…otherwise: HOW MANY TREATED UNITS? ── one ─► PART 1 └─ more than one ─► PART 2
PART 1 — ONE treated unit (easy ───────────────────────► hard)
─────────────────────────────────────────────────────────────────
Start: FDID (or TSSC for a formal pre-trends test)
↓ then escalate ONLY if one of these is true:
Spillovers onto donors (SUTVA)? ─► SPILLSYNTH · SpSyDiD (spatial) · SPOTSYNTH (unknown which) · ISCM (outside hull)
Nonstationary / spurious trend? ─► SBC · HSC
Time-varying dynamics / heavy noise? ─► TASC · DSCAR · FMA
Nonlinear outcome surface? ─► NSC
Donor pool N ≳ T0 (overfitting)? ─► CLUSTERSC · SparseSC · PDA · RESCM · FSCM · BVSS
Missing cells in the panel? ─► SNN · MCNNM
Different ESTIMAND / treatment type? ─► DSC (dist.) · CTSC (dose) · SCMO (multi-outcome) · SI (arms)
PART 2 — MANY treated units (easy ─────────────────────► hard)
─────────────────────────────────────────────────────────────────
Same adoption time? ─► SDID (micro units ─► MicroSynth; two-level ─► MLSC)
Staggered (different times)? ─► SDID
+ want pooling / oracle efficiency ─► PPSCM · SequentialSDID
+ long pre-period, few never-treated, event study ─► SSC
+ spillovers ─► SpSyDiD
+ missing cells / gaps ─► MCNNM
PART 3 — DESIGNING an experiment (by what you care about)
─────────────────────────────────────────────────────────────────
Care only about the ATT (effect on the treated)? ─► SYNDES · SPCD · weakly-targeted MAREX
Care about the ATE (population effect)? ─► MAREX · LEXSCM
Must EVERY unit be treated or control (no pure donors), geo? ─► PANGEO (supergeo)
Gate 0 — Identification pre-screen#
These come first because they decide whether synthetic control is even the right family. Get one wrong and no amount of donor weighting saves you.
Q0.1 · Are you designing the experiment? Has the treatment not yet been assigned, and you are choosing whom to treat?
Yes – jump to Part 3 (experimental design).
No – the treatment already happened; continue.
Q0.2 · Is assignment randomized (or as-good-as-random)?
Yes, and you have many small exchangeable units – you do not need synthetic control; a difference-in-means (or a regression with controls) is unbiased.
Yes, but only one or a few large aggregate units (markets, states) – a single random draw can still leave baselines far apart. Modified Unbiased Synthetic Control makes the effect finite-sample unbiased under random assignment and is the only estimator here with an unbiased finite-sample variance and exact randomization intervals.
No – continue.
Q0.3 · Do control units exist at all?
No – every unit is treated (a nationwide policy, a global shock like COVID-19), so there is no donor pool – Synthetic Historical Control (SHC) rebuilds the comparison from overlapping historical blocks of the treated unit’s own series.
Yes – continue.
Q0.4 · Is the treatment endogenous in a way SC cannot absorb? This is the home of the proximal methods: they are fundamentally tools for unmeasured confounding / endogeneity, not for any particular outcome shape. The danger is selection on time-varying unobservables – the pre-fit can look perfect and the ATT still be biased.
You have a (partially valid) instrument – a shift-share, a tariff schedule, a supply shock – Synthetic IV SC-debiases the (outcome, treatment, instrument) triple, then runs 2SLS; the instrument need only be valid conditional on the factors.
You have valid proxies / negative controls – extra controls associated with the latent confounder but with no direct path to the outcome – Proximal Inference Synthetic Control (PROXIMAL) instruments the confounder via GMM (and also covers the single-proxy, doubly-robust, and surrogate variants).
Neither – selection is on the latent factors only (SC’s standard premise) – continue.
Q0.5 · Do parallel trends hold, and are you in a fixed-T / large-N regime?
Yes – you may not need synthetic control at all: plain difference-in-differences is simpler and more powerful. The nearest in-library options are Synthetic Difference-in-Differences (SDID) or Sequential Synthetic Difference-in-Differences (Sequential SDiD).
No – you are in SCM territory. Ask the master question:
Q0.6 · How many treated units?
One – go to Part 1.
More than one – go to Part 2.
Part 1 — A single treated unit#
Begin with the simplest method that could work and escalate only when a named complication applies.
Start here#
With one treated unit, a sharp intervention, and a scalar ATT, start with :doc:`fdid` – Forward DiD greedily selects the donors that share the treated trend, needs no convex-hull assumption, and gives valid inference even under nonstationarity, all with one estimated parameter. If the Forward Parallel Trends Assumption does not hold, use :doc:`tssc`. If neither of the escalations below applies, you are done.
Now walk the escalations, easy to hard:
Q1.1 · Are your donors contaminated by the treatment (SUTVA / spillovers)?
No – next question.
Yes, spatial with a known weight matrix W – Spatial Synthetic Difference-in-Differences (SpSyDiD).
Yes, an enumerable per-unit spillover set – Spillover-Aware Synthetic Control (SPILLSYNTH).
You don’t know which donors are contaminated (a large pool, no a-priori validity knowledge, or a suspiciously-good-match donor) – Spillover-Detecting Synthetic Control (SPOTSYNTH), which screens each donor by a pre-intervention forecast test and excludes the ones that fail it.
Yes, and the treated unit sits outside the donor hull (but is itself a useful donor for others) – Imperfect Synthetic Controls (ISCM).
Q1.2 · Is the treated unit outside the donors’ convex hull even without spillovers (a true outlier)?
No – next question.
Yes – relax the simplex: Nonlinear Synthetic Control (NSC) (affine weights), Relaxed / Penalized Synthetic Control (RESCM) (penalised / \(L_\infty\)), or the unconstrained Panel Data Approach (PDA).
Q1.3 · Is the outcome nonstationary, so a tight pre-fit might be a spurious match?
No – next question.
Yes – decompose first: Synthetic Business Cycle (SBC) (Hamilton trend/cycle split, match the cycle) or Harmonic Synthetic Control (HSC) (soft levels-vs-differences allocation).
Q1.4 · Are there persistent latent factors / time-varying dynamics / heavy observation noise?
No – next question.
Yes – Time-Aware Synthetic Control (TASC) (time-aware state-space model), Factor Model Approach (FMA) (PC factors with a residual-bootstrap test), or Dynamic Synthetic Control for Auto-Regressive processes (DSCAR) (time-varying weights for strongly autocorrelated panels with time-varying confounders).
Q1.5 · Is the untreated outcome a nonlinear function of the predictors?
No – next question.
Q1.6 · Is the donor pool large relative to the pre-period (N >> T0)? This is the most common reason to leave the standard workhorses: unrestricted fits overfit the pre-period and predict the post-period worse.
No – next question.
Yes – escalate, roughly easiest first: Forward-Selected Synthetic Control (FSCM) (tune the donor count), Sparse Synthetic Control (SparseSC) (L1 predictor/covariate selection), Panel Data Approach (PDA) (L2-relaxation / Lasso / forward), Relaxed / Penalized Synthetic Control (RESCM) (one program from simplex SC to \(L_\infty\) to DiD), Cluster Synthetic Controls (CLUSTERSC) (denoise + cluster donors), or Bayesian Synthetic Control with a Soft Simplex Constraint (BVS-SS) (Bayesian spike-and-slab with a soft simplex).
Q1.7 · Are there missing cells in the panel?
No – next question.
Yes, informative (MNAR) with a fully observed anchor block – Synthetic Nearest Neighbors / Causal Matrix Completion (SNN).
Yes, arbitrary sparse / low-rank – Matrix Completion with Nuclear Norm Minimization (MCNNM).
Block-missing (treated cells), and you have unit/time covariates – Robust Matrix estimation with Side Information (RMSI) exploits side information on both margins (a four-component sieve + nuclear-norm completion) to impute the treated counterfactual; it reduces to a low-rank completion when the covariates are uninformative.
Q1.8 · Is your estimand or treatment effect non-standard (not a scalar mean ATT for one binary treatment)?
No – you are done; use the Start here method.
The whole distribution (quantiles, Lorenz, tails) – Distributional Synthetic Control (DSC).
A continuous or multi-valued dose with no clean control – Continuous-Treatment Synthetic Control (CTSC).
Several related outcomes (helps most with a short pre-period) – Synthetic Control with Multiple Outcomes (SCMO).
Several distinct intervention arms to compare – Synthetic Interventions (SI).
Q1.9 · Are you worried about interpolation bias – the synthetic control having to interpolate across donors that are individually far from the treated unit, so the fit blends dissimilar units?
No – you are done; use the Start here method.
Yes – Matching and Synthetic Control (MASC) blends extrapolation-free nearest-neighbour matching with the SC simplex and chooses the mix that minimises estimated bias, directly targeting interpolation bias.
Part 2 — Many treated units#
The base case for multiple treated units is :doc:`sdid` – Synthetic Difference-in-Differences, doubly weighted by unit and time weights – which works whether adoption is simultaneous or staggered and degrades gracefully when parallel trends or exact matching fail. Escalate from there.
Q2.1 · Do all treated units adopt at the same time?
Yes – Synthetic Difference-in-Differences (SDID) is the base case. If the units are individual/micro-level (thousands–millions), use MicroSynth (User-Level Balancing SC); if treatment is assigned at a coarse level but outcomes are observed at a finer level (state policy, county outcomes), use Multi-Level Synthetic Control (mlSC).
No – adoption is staggered – continue.
Q2.2 · Staggered: do you just want the overall / event-study ATT?
Yes, just staggered – Synthetic Difference-in-Differences (SDID) (per-cohort + aggregate) is the simplest. If you want partial pooling across cohorts or oracle efficiency under interactive fixed effects, escalate to Partially Pooled SCM (PPSCM) or Sequential Synthetic Difference-in-Differences (Sequential SDiD).
Staggered, long pre-period, few/no never-treated units, and you want an event study without parallel trends – Staggered Synthetic Control (SSC) (Staggered Synthetic Control). It builds each unit’s synthetic control from all other units (not-yet-treated included), so it needs no never-treated pool, and gives event-time ATTs with Andrews end-of-sample inference. Best with a long pre-period (large \(T\), moderate \(N\)).
Staggered *and* spillovers onto donors – Spatial Synthetic Difference-in-Differences (SpSyDiD).
Staggered *and* missing cells / gaps – Matrix Completion with Nuclear Norm Minimization (MCNNM) (matrix completion handles staggered missingness natively).
Part 3 — Designing an experiment#
You are choosing whom to treat, not estimating an effect; these return assignments and power/MDE curves, not ATTs. Order by what estimand you care about, easiest target first.
Q3.1 · Do you only care about the ATT (the effect on the treated units)?
Yes – Synthetic Design (SYNDES) (a MIP that minimises the ATT estimator’s MSE, exactly \(K\) treated) or Synthetic Principal Component Design (SPCD) (a fast spectral phase-synchronisation design). A weakly-targeted Synthetic Controls for Experimental Design (MAREX) design can also be pointed at the treated set if you want a convex design that leans ATT-ward.
Q3.2 · Do you care about the population ATE (a population-level contrast, not just the treated)?
Yes – base Synthetic Controls for Experimental Design (MAREX) matches synthetic-treated and synthetic-control units to population predictor means; Lexicographic Synthetic Control (LEXSCM) adds an explicit power / minimum-detectable-effect report (validity then power) and budget/cost constraints when only a subset is eligible.
Q3.3 · Must every unit end up either treated or control – no pure-donor pool left over – as in a geo roll-out?
Yes – Parallel-Trends Supergeo Design (PANGEO) groups geos into balanced supergeos, trims no unit, and matches on the full pre-period trajectory for a downstream difference-in-differences read.
Failure-mode index#
A reverse lookup: the symptom, and the method named for it.
When in doubt, fit two or three of the candidate methods and compare the counterfactuals and ATTs. Disagreement is itself diagnostic: it usually means one of the gates above is binding harder than you thought.