A practitioner’s decision tree#

mlsynth ships dozens of estimators because each one is a named answer to a specific complication that breaks the method before it. The trick to not drowning is to start with the simplest credible method and escalate only when a concrete complication forces you to. That is exactly how this page is organised: a few identification gates first, then – within each branch – a ladder that runs from the easy, canonical case to the specialised, harder ones.

Answer each question yes or no. A yes sends you to a method (or a short list); a no moves you to the next question.

Reason forward: data, then estimand, then assumptions#

Before you open this tree, adopt the discipline that Baker, Callaway, Cunningham, Goodman-Bacon and Sant’Anna call forward engineering (Difference-in-Differences Designs: A Practitioner’s Guide, 2025, arXiv:2503.13323). The temptation many analysts face is to reverse-engineer: reach for a method because it sounds powerful or modern, run it, and only afterwards discover that its assumptions do not hold for their data. Forward engineering goes the other way – and it is the right way:

  1. What state are your data in? Take honest stock first: panel or a single series, how many treated units and whether they adopt at the same time, whether assignment was randomized, the number of donors versus the length of the pre-period (\(N\) vs \(T_0\)), missing cells, stationarity, plausible spillovers, the presence of covariates or an instrument.

  2. Given that data structure, what estimand can those data actually support? Fix the target parameter before the estimator – a mean ATT, a population ATE, a per-arm contrast, a quantile / distributional effect. Do not aim at a parameter your design cannot identify just because a method will return a number for it.

  3. Which identifying assumptions are most defensible for that estimand, given that data state? Parallel trends? The convex-hull / no-extrapolation condition? No interference (SUTVA)? Exogeneity conditional on the latent factors? Proxy or instrument validity? Then, and only then, choose the method whose assumptions you can actually defend.
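Step 1 above can be made mechanical before any estimator is chosen. The following is a minimal sketch, assuming a long-format panel stored as a list of `(unit, period, outcome, treated)` tuples; the function name `take_stock` and the tuple layout are illustrative assumptions and are not part of mlsynth.

```python
def take_stock(rows):
    """Summarise the state of a long-format panel.

    rows: list of (unit, period, outcome, treated) tuples (hypothetical
    layout). outcome is None for a missing cell; treated is 0 or 1.
    """
    units = sorted({u for u, _, _, _ in rows})
    periods = sorted({t for _, t, _, _ in rows})
    first_treated = {}   # unit -> first period in which it is treated
    observed = set()     # (unit, period) cells with a non-missing outcome
    for unit, period, outcome, treated in rows:
        if outcome is not None:
            observed.add((unit, period))
        if treated:
            first_treated[unit] = min(period, first_treated.get(unit, period))
    adoption_times = sorted(set(first_treated.values()))
    # T0: periods strictly before the earliest adoption (all periods if none).
    t0 = sum(1 for t in periods if not adoption_times or t < adoption_times[0])
    return {
        "n_units": len(units),
        "n_treated": len(first_treated),
        "n_donors": len(units) - len(first_treated),
        "T0": t0,
        "staggered": len(adoption_times) > 1,
        # Cells absent from rows count as missing, as do None outcomes.
        "missing_cells": len(units) * len(periods) - len(observed),
    }


rows = [
    ("A", 1, 1.0, 0), ("A", 2, 1.1, 0), ("A", 3, 2.0, 1),
    ("B", 1, 0.9, 0), ("B", 2, None, 0), ("B", 3, 1.0, 0),
    ("C", 1, 1.2, 0), ("C", 2, 1.3, 0), ("C", 3, 1.2, 0),
]
stock = take_stock(rows)
```

The summary covers only what can be counted; stationarity, spillovers, and instrument validity still need subject-matter judgment.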

The questions below follow that order as closely as a tree can – data state, then estimand, then assumptions. But one caveat matters more than any single gate: this tree routes you to a method whose assumptions *match* your answers; it does not *verify* those assumptions for you.

Each estimator encodes technical conditions a one-line summary cannot capture. Follow the link at every leaf and read the original paper for the precise assumptions, the inference theory, and the documented failure modes. Treat no method as infallible: every estimate here is conditional on assumptions that are your job to defend, not the software’s.

Note

The gates are a guide, not a strict partition: several methods answer more than one question, and a real problem can trip two at once.

At a glance#

GATE 0 — Identification pre-screen (answer these first)
────────────────────────────────────────────────────────
Are you DESIGNING the experiment (treatment not yet assigned)?  ── yes ─► PART 3
Is assignment RANDOMIZED?                  ── yes, many small units ─► difference-in-means
                                           └─ yes, few large units  ─► MUSC
Do CONTROL UNITS exist at all?             ── no (everyone treated) ─► SHC
Is treatment ENDOGENOUS (SC can't absorb)? ── have an instrument    ─► SIV
                                           └─ have proxies / NCs     ─► PROXIMAL
Do parallel trends hold AND fixed-T/large-N? ── yes ─► plain Difference-in-Differences
                                                           (you may not need SC)
…otherwise:  HOW MANY TREATED UNITS?  ── one ─► PART 1   └─ more than one ─► PART 2

PART 1 — ONE treated unit   (easy ───────────────────────► hard)
─────────────────────────────────────────────────────────────────
Start:  FDID   (or TSSC for a formal pre-trends test)
  ↓ then escalate ONLY if one of these is true:
Spillovers onto donors (SUTVA)?      ─► SPILLSYNTH · SpSyDiD (spatial) · SPOTSYNTH (unknown which) · ISCM (outside hull)
Nonstationary / spurious trend?      ─► SBC · HSC
Time-varying dynamics / heavy noise? ─► TASC · DSCAR · FMA
Nonlinear outcome surface?           ─► NSC
Donor pool N ≳ T0 (overfitting)?     ─► CLUSTERSC · SparseSC · PDA · RESCM · FSCM · BVSS
Missing cells in the panel?          ─► SNN · MCNNM
Different ESTIMAND / treatment type? ─► DSC (dist.) · CTSC (dose) · SCMO (multi-outcome) · SI (arms)

PART 2 — MANY treated units   (easy ─────────────────────► hard)
─────────────────────────────────────────────────────────────────
Same adoption time?  ─► SDID            (micro units ─► MicroSynth; two-level ─► MLSC)
Staggered (different times)?  ─► SDID
  + want pooling / oracle efficiency  ─► PPSCM · SequentialSDID
  + long pre-period, few never-treated, event study ─► SSC
  + spillovers                        ─► SpSyDiD
  + missing cells / gaps              ─► MCNNM

PART 3 — DESIGNING an experiment   (by what you care about)
─────────────────────────────────────────────────────────────────
Care only about the ATT (effect on the treated)? ─► SYNDES · SPCD · weakly-targeted MAREX
Care about the ATE (population effect)?           ─► MAREX · LEXSCM
Must EVERY unit be treated or control (no pure donors), geo? ─► PANGEO (supergeo)

Gate 0 — Identification pre-screen#

These come first because they decide whether synthetic control is even the right family. Get one wrong and no amount of donor weighting saves you.

Q0.1 · Are you designing the experiment? Has the treatment not yet been assigned, and you are choosing whom to treat?

  • Yes – jump to Part 3 (experimental design).

  • No – the treatment already happened; continue.

Q0.2 · Is assignment randomized (or as-good-as-random)?

  • Yes, and you have many small exchangeable units – you do not need synthetic control; a difference-in-means (or a regression with controls) is unbiased.

  • Yes, but only one or a few large aggregate units (markets, states) – a single random draw can still leave baselines far apart. Modified Unbiased Synthetic Control makes the effect estimate finite-sample unbiased under random assignment and is the only estimator here with an unbiased finite-sample variance and exact randomization intervals.

  • No – continue.
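For the first branch of Q0.2, the difference-in-means needs nothing beyond the standard library. This is a minimal sketch, assuming outcomes are held in two plain lists; the standard error uses the usual Neyman variance estimate, which is a choice made here for illustration and not something this page prescribes.

```python
from math import sqrt
from statistics import mean, variance


def difference_in_means(treated, control):
    """Return (estimate, standard error) for a randomized comparison."""
    if len(treated) < 2 or len(control) < 2:
        raise ValueError("need at least two units per arm to estimate a variance")
    estimate = mean(treated) - mean(control)
    # Neyman variance estimate: sum of per-arm sample variances over arm sizes.
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return estimate, se


estimate, se = difference_in_means([5, 7, 9], [4, 4, 7])
```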

Q0.3 · Do control units exist at all?

  • No – every unit is treated (a nationwide policy, a global shock like COVID-19), so there is no donor pool. Synthetic Historical Control (SHC) rebuilds the comparison from overlapping historical blocks of the treated unit’s own series.

  • Yes – continue.

Q0.4 · Is the treatment endogenous in a way SC cannot absorb? This is the home of the proximal methods: they are fundamentally tools for unmeasured confounding / endogeneity, not for any particular outcome shape. The danger is selection on time-varying unobservables – the pre-fit can look perfect and the ATT still be biased.

  • You have a (partially valid) instrument, such as a shift-share, a tariff schedule, or a supply shock. Synthetic IV SC-debiases the (outcome, treatment, instrument) triple, then runs 2SLS; the instrument need only be valid conditional on the factors.

  • You have valid proxies / negative controls, that is, extra controls associated with the latent confounder but with no direct path to the outcome. Proximal Inference Synthetic Control (PROXIMAL) instruments the confounder via GMM (and also covers the single-proxy, doubly-robust, and surrogate variants).

  • Neither – selection is on the latent factors only (SC’s standard premise) – continue.

Q0.5 · Do parallel trends hold, and are you in a fixed-T / large-N regime?

Q0.6 · How many treated units?

  • One – go to Part 1.

  • More than one – go to Part 2.
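The order of the Gate 0 questions can be written down as a small routing function. This is only a sketch of the questions as stated on this page: the function name, its arguments, and the returned labels are illustrative assumptions, and the answers you pass in are claims you must defend yourself.

```python
def gate0(*, designing=False, randomized=False, many_small_units=False,
          controls_exist=True, endogenous=False, has_instrument=False,
          has_proxies=False, parallel_trends_large_n=False, n_treated=1):
    """Route through Q0.1 to Q0.6 in order and return the first match."""
    if designing:                       # Q0.1
        return "PART 3"
    if randomized:                      # Q0.2
        return "difference-in-means" if many_small_units else "MUSC"
    if not controls_exist:              # Q0.3
        return "SHC"
    if endogenous and has_instrument:   # Q0.4, instrument branch
        return "SIV"
    if endogenous and has_proxies:      # Q0.4, proxy branch
        return "PROXIMAL"
    if parallel_trends_large_n:         # Q0.5
        return "plain Difference-in-Differences"
    return "PART 1" if n_treated == 1 else "PART 2"   # Q0.6


route = gate0(endogenous=True, has_proxies=True)
```

Because the gates are a guide and not a strict partition, a real problem may deserve a second pass with a different answer at one gate.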

Part 1 — A single treated unit#

Begin with the simplest method that could work and escalate only when a named complication applies.

Start here#

With one treated unit, a sharp intervention, and a scalar ATT, start with :doc:`fdid` – Forward DiD greedily selects the donors that share the treated trend, needs no convex-hull assumption, and gives valid inference even under nonstationarity, all with one estimated parameter. If the Forward Parallel Trends Assumption does not hold, use :doc:`tssc`. If none of the escalations below applies, you are done.

Now walk the escalations, easy to hard:

Q1.1 · Are your donors contaminated by the treatment (SUTVA / spillovers)?

Q1.2 · Is the treated unit outside the donors’ convex hull even without spillovers (a true outlier)?
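A quick screen for Q1.2 is to check whether the treated unit leaves the donors' range in any pre-period. This is a necessary condition only: a treated value outside the donor minimum and maximum in some period rules out membership in the convex hull, but passing the check does not prove membership. The function below is a hypothetical helper, not an mlsynth call.

```python
def outside_donor_range(treated_pre, donors_pre):
    """Return pre-period indices where the treated value lies outside
    the [min, max] range of the donors in that period.

    treated_pre: list of pre-period outcomes for the treated unit.
    donors_pre: list of donor series, each the same length as treated_pre.
    """
    flagged = []
    for t, y in enumerate(treated_pre):
        column = [series[t] for series in donors_pre]
        if y < min(column) or y > max(column):
            flagged.append(t)
    return flagged


flagged = outside_donor_range(
    [10, 12, 20],
    [[8, 11, 13], [11, 13, 15]],
)
```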

Q1.3 · Is the outcome nonstationary, so a tight pre-fit might be a spurious match?

Q1.4 · Are there persistent latent factors / time-varying dynamics / heavy observation noise?

Q1.5 · Is the untreated outcome a nonlinear function of the predictors?

Q1.6 · Is the donor pool large relative to the pre-period (N ≳ T0)? This is the most common reason to leave the standard workhorses: unrestricted fits overfit the pre-period and predict the post-period worse.

Q1.7 · Are there missing cells in the panel?

Q1.8 · Is your estimand or treatment effect non-standard (not a scalar mean ATT for one binary treatment)?

Q1.9 · Are you worried about interpolation bias – the synthetic control having to interpolate across donors that are individually far from the treated unit, so the fit blends dissimilar units?

  • No – you are done; use the Start here method.

  • Yes – Matching and Synthetic Control (MASC) blends extrapolation-free nearest-neighbour matching with the SC simplex and chooses the mix that minimises estimated bias, directly targeting interpolation bias.
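The Part 1 ladder amounts to a lookup from named complication to candidate methods. The sketch below copies the method abbreviations from the At a glance diagram and the failure-mode index on this page; the dictionary keys and the function name are illustrative labels invented here.

```python
PART1_ESCALATIONS = {
    "spillovers": ["SPILLSYNTH", "SpSyDiD", "SPOTSYNTH", "ISCM"],       # Q1.1
    "outside_hull": ["ISCM", "NSC", "RESCM", "PDA"],                    # Q1.2
    "nonstationary": ["SBC", "HSC"],                                    # Q1.3
    "dynamics_or_noise": ["TASC", "DSCAR", "FMA"],                      # Q1.4
    "nonlinear": ["NSC"],                                               # Q1.5
    "large_donor_pool": ["CLUSTERSC", "SparseSC", "PDA", "RESCM",
                         "FSCM", "BVSS"],                               # Q1.6
    "missing_cells": ["SNN", "MCNNM"],                                  # Q1.7
    "nonstandard_estimand": ["DSC", "CTSC", "SCMO", "SI"],              # Q1.8
    "interpolation_bias": ["MASC"],                                     # Q1.9
}


def part1_candidates(complications):
    """Map the complications you can name to the methods listed for them."""
    unknown = set(complications) - set(PART1_ESCALATIONS)
    if unknown:
        raise ValueError(f"unknown complication(s): {sorted(unknown)}")
    if not complications:
        # No escalation applies: stay with the Start here methods.
        return {"start": ["FDID", "TSSC"]}
    return {name: methods for name, methods in PART1_ESCALATIONS.items()
            if name in complications}


candidates = part1_candidates(["missing_cells", "nonstationary"])
```

A method that appears under two flagged complications (for example PDA) is a natural first candidate, though the original papers remain the authority on whether it fits.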

Part 2 — Many treated units#

The base case for multiple treated units is :doc:`sdid` – Synthetic Difference-in-Differences, doubly weighted by unit and time weights – which works whether adoption is simultaneous or staggered and degrades gracefully when parallel trends or exact matching fail. Escalate from there.
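As a reference point for what "doubly weighted" means, SDID is commonly written as a weighted two-way fixed-effects regression (following Arkhangelsky et al., 2021). The notation below is introduced here for illustration and is not taken from the mlsynth documentation: \(Y_{it}\) is the outcome, \(W_{it}\) the treatment indicator, \(\hat\omega_i\) the unit weights, and \(\hat\lambda_t\) the time weights.

\[
(\hat\tau, \hat\mu, \hat\alpha, \hat\beta)
  = \arg\min_{\tau,\mu,\alpha,\beta}
    \sum_{i=1}^{N} \sum_{t=1}^{T}
    \left( Y_{it} - \mu - \alpha_i - \beta_t - W_{it}\,\tau \right)^2
    \hat\omega_i \, \hat\lambda_t
\]

Setting all unit and time weights equal recovers the ordinary difference-in-differences regression, which is one way to see why SDID sits naturally between DiD and synthetic control.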

Q2.1 · Do all treated units adopt at the same time?

Q2.2 · Staggered: do you just want the overall / event-study ATT?
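Whether adoption is simultaneous or staggered can be read directly off the adoption dates, and the add-ons from the At a glance diagram then follow. The sketch below assumes a dictionary mapping each treated unit to its adoption period; the function name and flag names are illustrative, and the method abbreviations are copied from this page.

```python
def part2_candidates(adoption_times, *, micro_units=False, two_level=False,
                     want_pooling=False, long_pre_few_never_treated=False,
                     spillovers=False, missing_cells=False):
    """Classify adoption timing and list the Part 2 candidates.

    adoption_times: dict of treated unit -> adoption period (hypothetical).
    """
    if len(adoption_times) < 2:
        raise ValueError("Part 2 is for more than one treated unit")
    staggered = len(set(adoption_times.values())) > 1
    candidates = ["SDID"]               # base case in both branches
    if not staggered:
        if micro_units:
            candidates.append("MicroSynth")
        if two_level:
            candidates.append("MLSC")
        return "simultaneous", candidates
    if want_pooling:
        candidates += ["PPSCM", "SequentialSDID"]
    if long_pre_few_never_treated:
        candidates.append("SSC")
    if spillovers:
        candidates.append("SpSyDiD")
    if missing_cells:
        candidates.append("MCNNM")
    return "staggered", candidates


timing, methods = part2_candidates({"CA": 2010, "NY": 2012}, spillovers=True)
```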

Part 3 — Designing an experiment#

You are choosing whom to treat, not estimating an effect; the methods in this part return assignments and power/MDE curves, not ATTs. The questions are ordered by the estimand you care about, easiest target first.

Q3.1 · Do you only care about the ATT (the effect on the treated units)?

Q3.2 · Do you care about the population ATE (a population-level contrast, not just the treated)?

Q3.3 · Must every unit end up either treated or control – no pure-donor pool left over – as in a geo roll-out?

Failure-mode index#

A reverse lookup: the symptom, and the method named for it.

| Complication | Reach for |
|---|---|
| No control group (everyone treated) | Synthetic Historical Control (SHC) |
| Randomized, few large units | Modified Unbiased Synthetic Control |
| Endogenous treatment, have an instrument | Synthetic IV |
| Endogenous treatment, have proxies / negative controls | Proximal Inference Synthetic Control (PROXIMAL) |
| Parallel trends holds (fixed-T, large-N) | difference-in-differences (off-ramp); Synthetic Difference-in-Differences (SDID), Forward Difference-in-Differences (FDID) |
| Single treated unit, no complications | Forward Difference-in-Differences (FDID), Two-Step Synthetic Control |
| Spillovers onto donors (SUTVA), spatial | Spatial Synthetic Difference-in-Differences (SpSyDiD) |
| Spillovers onto donors, enumerable per-unit | Spillover-Aware Synthetic Control (SPILLSYNTH) |
| Contaminated donors unknown / large pool to screen | Spillover-Detecting Synthetic Control (SPOTSYNTH) |
| Treated unit outside the donor convex hull | Imperfect Synthetic Controls (ISCM), Nonlinear Synthetic Control (NSC), Relaxed / Penalized Synthetic Control (RESCM), Panel Data Approach (PDA) |
| Nonstationary / spurious-trend matching | Synthetic Business Cycle (SBC), Harmonic Synthetic Control (HSC) |
| Time-varying dynamics / persistent factors / noise | Time-Aware Synthetic Control (TASC), Factor Model Approach (FMA), Dynamic Synthetic Control for Auto-Regressive processes (DSCAR) |
| Nonlinear outcome surface | Nonlinear Synthetic Control (NSC) |
| Donor pool large vs pre-period (N ≳ T0) | Forward-Selected Synthetic Control (FSCM), Sparse Synthetic Control (SparseSC), Panel Data Approach (PDA), Relaxed / Penalized Synthetic Control (RESCM), Cluster Synthetic Controls (CLUSTERSC), Bayesian Synthetic Control with a Soft Simplex Constraint (BVS-SS) |
| Missing cells, MNAR | Synthetic Nearest Neighbors / Causal Matrix Completion (SNN), Matrix Completion with Nuclear Norm Minimization (MCNNM) |
| Block-missing with unit/time covariates (side information) | Robust Matrix estimation with Side Information (RMSI) |
| Distributional estimand (QTE, Lorenz, tails) | Distributional Synthetic Control (DSC) |
| Continuous / multi-valued treatment | Continuous-Treatment Synthetic Control (CTSC) |
| Several related outcomes / short pre-period | Synthetic Control with Multiple Outcomes (SCMO) |
| Several distinct intervention arms | Synthetic Interventions (SI) |
| Interpolation bias (interpolating across dissimilar donors) | Matching and Synthetic Control (MASC) |
| Many treated, same time | Synthetic Difference-in-Differences (SDID), MicroSynth (User-Level Balancing SC), Multi-Level Synthetic Control (mlSC) |
| Many treated, staggered adoption | Synthetic Difference-in-Differences (SDID), Partially Pooled SCM (PPSCM), Sequential Synthetic Difference-in-Differences (Sequential SDiD), Matrix Completion with Nuclear Norm Minimization (MCNNM) |
| Staggered, long pre-period, few never-treated (event study) | Staggered Synthetic Control (SSC) |
| Designing for the ATT | Synthetic Design (SYNDES), Synthetic Principal Component Design (SPCD), Synthetic Controls for Experimental Design (MAREX) |
| Designing for the ATE | Synthetic Controls for Experimental Design (MAREX), Lexicographic Synthetic Control (LEXSCM) |
| Designing a geo roll-out (no pure donors) | Parallel-Trends Supergeo Design (PANGEO) |

When in doubt, fit two or three of the candidate methods and compare the counterfactuals and ATTs. Disagreement is itself diagnostic: it usually means one of the gates above is binding harder than you thought.
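The comparison suggested above can be sketched without reference to any particular estimator. The example assumes you have already obtained one post-period counterfactual series per method by whatever means; the function name and the input layout are illustrative, and the numbers are made up purely to show the arithmetic.

```python
from statistics import mean


def compare_atts(observed_post, counterfactuals):
    """Return per-method ATTs and the spread between the largest and smallest.

    observed_post: list of post-period outcomes for the treated unit.
    counterfactuals: dict of method name -> list of predicted untreated
    outcomes over the same post-periods (hypothetical inputs).
    """
    atts = {}
    for name, predicted in counterfactuals.items():
        if len(predicted) != len(observed_post):
            raise ValueError(f"{name}: counterfactual length does not match")
        # ATT: average gap between observed and predicted untreated outcome.
        atts[name] = mean(y - y0 for y, y0 in zip(observed_post, predicted))
    spread = max(atts.values()) - min(atts.values())
    return atts, spread


atts, spread = compare_atts(
    [10, 12, 14],
    {"method_a": [8, 9, 10], "method_b": [9, 11, 13]},
)
```

A spread that is large relative to the ATTs themselves is the signal to revisit the gates; what counts as large is a judgment that depends on the outcome scale and the inference each method provides.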