Harmonic Synthetic Control (HSC)#
When to Use This Estimator#
Reach for HSC, due to Liu and Xu (2026) [HSC], when your outcome is nonstationary with unit-specific stochastic trends and you do not know, ex ante, whether that trend variation is shared across units or idiosyncratic to the treated unit. This is the regime where standard synthetic control is most fragile, and where the usual fixes force an awkward all-or-nothing choice:
Matching on raw levels (plain SC, SC-with-intercept, SDID) uses all the trend variation for donor weighting. That is ideal when the trend is shared (a convex combination of donors reproduces it), but when the treated unit has its own stochastic trend it produces spurious matching — a tight pre-period fit that does not persist out of sample (Granger-Newbold; Phillips, 1986), with bias that does not vanish as the pre-period grows.
Hard pre-filtering (differencing, or the Hamilton trend-cycle split of Synthetic Business Cycle (SBC)) removes the stochastic trend before matching. That neutralizes the idiosyncratic-trend risk, but it also discards genuinely shared trend variation that would have helped pin down the donor weights.
HSC replaces this binary choice with a soft, data-driven allocation: a single parameter \(\rho \in [0, 1]\), chosen by rolling-origin cross-validation, continuously interpolates between SC on differences (\(\rho \to 0\)) and SC on levels with an intercept/trend (\(\rho \to 1\)). The method does not need to know which regime it is in — cross-validation finds the allocation that predicts best out of sample.
Concretely, use HSC for trending, nonstationary outcomes where the shared-vs-idiosyncratic split is unknown:
Marketing / business science. Brand or category sales, market share, or traffic after an event — where some markets share a common demand trend but the focal market may also drift on its own.
Macro / policy. GDP per capita, unemployment, emissions — the canonical nonstationary outcomes, where whether the treated economy’s trend is common or idiosyncratic is rarely known in advance.
The flip side: if your outcome is plausibly stationary, the spurious-matching concern is muted and a conventional SCM is simpler. And if you are confident the stochastic trend is purely idiosyncratic, hard differencing (or Synthetic Business Cycle (SBC)) is a defensible fixed choice; HSC buys robustness precisely when you are unsure.
Notation#
Let \(Y_{1,t}\) be the treated outcome and \(X_{\text{pre}} \in \mathbb{R}^{T_0 \times N}\) the donor matrix over the \(T_0\) pre-treatment periods, with post-treatment donors \(X_{\text{post}}\). Donor weights \(\omega\) live on the simplex \(\Delta_N = \{\omega \ge 0,\, \mathbf{1}^\top\omega = 1\}\). \(D_q\) is the \(q\)-th order difference operator (\(q \in \{1, 2\}\)) and \(K_q = D_q^\top D_q\) the roughness matrix. The pre-treatment discrepancy is \(r(\omega) = Y_{\text{pre}} - X_{\text{pre}}\,\omega\).
The Outcome Decomposition#
HSC decomposes untreated potential outcomes into three pieces (Liu-Xu Eq. (1)):
where \(L_{i,t} = \Lambda_i^\top F_t\) is a shared low-rank component (a convex combination of donors that matches the treated loadings reproduces it), \(R_{i,t}\) is an idiosyncratic stochastic trend (long-run variance grows without bound), and \(\varepsilon_{i,t}\) is idiosyncratic short-run noise. The difficulty is \(R\): independent stochastic trends produce realized correlations that do not vanish as \(T_0\) grows, so longer pre-periods do not help. Whether the treated unit’s trend variation is mostly shared (in \(L\)) or idiosyncratic (in \(R\)) is generally unknown — which is exactly what HSC’s allocation parameter is designed to handle.
Assumptions#
HSC inherits the standard synthetic-control identification conditions for the shared component and adds two structural conditions on the idiosyncratic trend that license the spectral metric and the forecast step. Each is stated formally with a plain-language remark.
Assumption 1 (three-way outcome model). Untreated potential outcomes follow the decomposition above, \(Y_{i,t}(0) = \Lambda_i^\top F_t + R_{i,t} + \varepsilon_{i,t}\), where \(\Lambda_i^\top F_t\) is a shared low-rank factor structure, \(R_{i,t}\) is an integrated (unit-root) idiosyncratic process, and \(\varepsilon_{i,t}\) is mean-zero short-run noise with \(\mathbb{E}[\varepsilon_{i,t} \mid F_t, \Lambda_i, R_{i,t}] = 0\).
Remark. This is the interactive-fixed-effects model behind synthetic-control consistency theory (Abadie et al., 2010; Bai, 2009), augmented with a unit-specific stochastic trend \(R\). The augmentation is the whole point: classical SC assumes the donor pool spans all of the treated unit’s systematic variation, whereas HSC explicitly allows a piece (\(R\)) that no donor combination can reproduce.
Assumption 2 (the shared component is in the donor hull). The treated unit’s factor loading lies in the convex hull of the donor loadings — there exists \(\omega \in \Delta_N\) with \(\sum_i \omega_i \Lambda_i = \Lambda_1\), hence \(\sum_i \omega_i L_{i,t} = L_{1,t}\) for all \(t\).
Remark. This is the usual “treated unit in the convex hull of donors” requirement of Abadie et al. (2010), but imposed on the shared component only. HSC does not ask the donors to reproduce the treated unit’s idiosyncratic trend — that is what the smooth component \(E\) is for — so the hull condition is materially weaker here than in level-matching SC.
Assumption 3 (spectral separation of the idiosyncratic trend). The idiosyncratic trend \(R_{1,t}\) is low-frequency: it is integrated of order \(q\) so that \(D_q R_1\) is stationary and short-memory, and its energy is concentrated at low frequencies, separated from the broadband short-run noise \(\varepsilon\).
Remark. This is the assumption that gives the metric \(W_{\rho,q}\) and the smoother \(S_{\rho,q}\) their meaning. The smoother extracts the low-frequency part of the pre-period residual into \(E\); the metric down-weights exactly that part when matching donors. If \(R\) were itself high-frequency, no smoother could separate it from noise and the spectral allocation would have nothing to exploit. Under independent stochastic trends, level-matching \(R^2\) does not vanish as \(T_0 \to \infty\) (Granger–Newbold, 1974; Phillips, 1986) — the pre-period fit looks excellent but is spurious — which is precisely the failure mode Assumption 3 lets HSC avoid.
Assumption 4 (the smooth component is forecastable). The smooth component \(E\) absorbing \(R_1\) (plus any unmatched low-frequency residual) follows a model the post-period forecaster extrapolates consistently — by default an integrated AR(1), which is correctly specified when \(R_1\) is an ARIMA(1,1,0) trend.
Remark. The post-period counterfactual error decomposes into the donor-matching
error (controlled by Assumption 2) and the forecast error of \(E\). The
latter dominates at long horizons, so the relevant model is the one for
\(R_1\)’s differences. forecaster="last" is the conservative fallback (a
driftless random walk) when even an AR(1) on the differences is suspect.
Assumption 5 (no anticipation, no interference, clean donors). Donor units are untreated throughout the sample; the treated unit’s pre-period outcomes are untreated potential outcomes (no anticipation); and treatment of the focal unit does not spill over onto donors (SUTVA).
Remark. These are the standard SCM design conditions and are not specific to HSC; they are what make \(X_{\text{pre}}\) and \(X_{\text{post}}\) valid controls and the pre-period a clean training window for both \(\omega\) and \(E\).
When to prefer SBC over HSC#
HSC and Synthetic Business Cycle (SBC) are both targeted at the spurious-matching failure mode of classical SC, but they take opposite engineering bets. HSC dials a soft \(\rho \in [0, 1]\) between levels and differences; SBC commits to a hard Hamilton-filter trend-cycle split and forecasts the treated unit’s trend from its own history. The cases where SBC is the better choice:
The trend / cycle decomposition is part of the substantive story. Macro applications where you specifically want a trend-vs-cycle interpretation (German reunification’s reunification dividend on trend versus the catch-up boom on the cycle) get more out of SBC’s explicit decomposition than out of HSC’s single \(\rho^*\).
The treated unit has a long, self-forecastable trend. SBC’s Step 2 forecasts the treated trend univariately. If the treated unit’s own history is rich enough that you’d happily fit an AR(p) on it without donors, SBC’s modularity is an asset, not a liability.
You prefer a fixed structural recipe to a CV-tuned knob. HSC’s \(\rho^*\) is chosen by rolling-origin CV, which is itself noisy in finite samples. SBC’s Hamilton filter has no hyperparameter to tune at the SC step.
You want explicit asymptotic unbiasedness theory. SBC has a clean Theorem 1; HSC’s CV-based guarantee is less crisp.
The Hamilton filter’s stationary-cycle assumption is defensible. Macro outcomes (GDP, unemployment, prices) where business-cycle properties are well documented are exactly SBC’s comfort zone.
Conversely, prefer HSC when the trend-vs-cycle distinction is not substantively interesting (sales, market share, daily prices), when you have reason to think the shared-trend regime is plausible and don’t want SBC to discard that variation by construction, or when your outcome is outside the macroeconomic comfort zone of the Hamilton filter.
Graphical comparison#
The block below builds two contrasting panels and fits both
HSC and mlsynth.SBC on each, then overlays the
observed treated trajectory with both counterfactuals. Both panels
have a true ATT of zero by construction; the closer a method’s
dashed line tracks the solid observed line in the post-period
(right of the dotted intervention line), the better that method
estimates the (zero) treatment effect on this panel.
Panel A – sales-like data: donors and the treated unit share a strong deterministic linear trend, short pre-period (\(T_0 = 16\)). HSC’s cross-validation picks \(\rho^* \approx 0.97\) (essentially level-matching), exploits the shared trend, and hugs the observed series. SBC strips the trend by Hamilton-filtering and then has to forecast it from a short pre-period of treated observations, which extrapolates poorly – the SBC counterfactual visibly undershoots.
Panel B – macro-like data: the treated unit has its own deterministic linear trend that none of the donors share, but donors and treated share a strong stationary cycle. Long pre-period (\(T_0 = 120\)). SBC’s univariate AR-trend forecast of the treated trend is essentially exact, and the cycle-only SC step uses the donors where they actually help. HSC has no donor combination that can reproduce the treated trend and is forced into a compromise \(\rho^* \approx 0.5\), which drifts off in the post-period.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mlsynth import HSC, SBC
def panel_HSC_wins(*, n_donors=10, T0=16, T1=12, seed=0):
"""Sales-like: shared deterministic growth + short pre-period."""
rng = np.random.default_rng(seed)
T = T0 + T1
t = np.arange(T)
shared = 100 + 2.0 * t
donors = shared[None, :] + 1.5 * rng.standard_normal((n_donors, T))
treated = shared + 1.5 * rng.standard_normal(T)
rows = [{"unit": "T", "t": k, "y": float(treated[k]),
"treat": int(k >= T0)} for k in range(T)]
for i, d in enumerate(donors):
rows.extend({"unit": f"d{i}", "t": k, "y": float(d[k]),
"treat": 0} for k in range(T))
return pd.DataFrame(rows), T0, T1
def panel_SBC_wins(*, n_donors=10, T0=120, T1=20, seed=1):
"""Macro-like: treated has own linear trend, donors share a cycle only."""
rng = np.random.default_rng(seed)
T = T0 + T1
t = np.arange(T)
treated_trend = 100 + 0.8 * t
cycle = np.zeros(T)
for i in range(1, T):
cycle[i] = 0.6 * cycle[i - 1] + rng.standard_normal()
donors = np.zeros((n_donors, T))
for i in range(n_donors):
base = 100 + 4 * rng.standard_normal()
donors[i] = base + 5.0 * cycle + 0.3 * rng.standard_normal(T)
treated = treated_trend + 5.0 * cycle + 0.3 * rng.standard_normal(T)
rows = [{"unit": "T", "t": k, "y": float(treated[k]),
"treat": int(k >= T0)} for k in range(T)]
for i, d in enumerate(donors):
rows.extend({"unit": f"d{i}", "t": k, "y": float(d[k]),
"treat": 0} for k in range(T))
return pd.DataFrame(rows), T0, T1
fig, axes = plt.subplots(1, 2, figsize=(13, 4.8))
for ax, (label, builder) in zip(axes, [
("Panel A: shared trend, short pre (HSC favoured)", panel_HSC_wins),
("Panel B: own trend on treated, shared cycle (SBC favoured)", panel_SBC_wins),
]):
df, T0, T1 = builder()
h = HSC({"df": df, "outcome": "y", "treat": "treat",
"unitid": "unit", "time": "t",
"display_graphs": False}).fit()
s = SBC({"df": df, "outcome": "y", "treat": "treat",
"unitid": "unit", "time": "t", "h": T1,
"display_graphs": False}).fit()
obs = df.loc[df["unit"] == "T", "y"].to_numpy()
t = np.arange(obs.size)
ax.plot(t, obs, "k-", lw=2.2, label="Observed (true ATT=0)")
ax.plot(t, h.counterfactual_full, "--", color="tab:blue", lw=1.8,
label=f"HSC counterfactual (ATT={h.att:+.2f})")
ax.plot(t, s.counterfactual_full, "--", color="tab:red", lw=1.8,
label=f"SBC counterfactual (ATT={s.att:+.2f})")
ax.axvline(T0 - 0.5, color="grey", ls=":", alpha=0.7)
ax.set_title(label, fontsize=11)
ax.set_xlabel("t"); ax.set_ylabel("y")
ax.legend(loc="best", fontsize=8.5)
ax.grid(alpha=0.2)
plt.tight_layout()
plt.show()
prints (deterministic with the seeds above):
Panel A HSC: ATT = -0.24 rho* = 0.97
Panel A SBC: ATT = -6.62
Panel B HSC: ATT = +6.22 rho* = 0.50
Panel B SBC: ATT = -1.63
Read the plot the same way you’d read a Figure-3-style synthetic control plot: the post-period vertical gap between solid (observed) and dashed (counterfactual) is the estimated ATT. The closer that gap is to zero, the closer the method is to recovering the true (zero) effect on this synthetic panel.
Mathematical Formulation#
The Profiled Metric (Proposition 1)#
HSC jointly estimates donor weights \(\omega\) and a treated-unit-specific smooth component \(E\) that absorbs low-frequency residual variation, with the roughness of \(E\) penalized by \(\|D_q E\|_2^2\). Profiling out \(E\) reduces the problem to a donor-weight QP under a \(\rho\)-dependent metric. With \(\lambda_\rho = \rho/(1-\rho)\),
the donor weights solve a ridge-regularized simplex QP,
and the fitted smooth component is \(\hat E = S_{\rho,q}\,(Y_{\text{pre}} - X_{\text{pre}}\hat\omega)\). The smoother \(S_{\rho,q}\) extracts the low-frequency (trend-like) part of the residual; the metric \(W_{\rho,q}\) down-weights exactly that part in donor matching, which is what alleviates the spurious-matching risk.
Note
Donor ridge. The penalty coefficient \(\zeta\) is set by ridge.
A float gives a relative ridge ridge * trace(X'WX)/N (default
1e-6). The string ridge="sdid" uses the data-driven SDID-style
penalty \(\zeta^2 T_0\) with \(\zeta = T_{\text{post}}^{1/4}
\hat\sigma_{\Delta X}\) (Arkhangelsky et al., 2021; Liu-Xu §7), where
\(\hat\sigma_{\Delta X}\) is the standard deviation of the donors’
first differences. The SDID ridge diversifies the donor weights (no
corner solutions) and is the configuration the paper uses for its empirical
application. On the 1997 Hong Kong handover it spreads weight broadly across
all 11 donors (largest Korea ≈ 0.18, vs ≈ 0.41 under the near-unregularized
relative ridge) while leaving the counterfactual essentially unchanged
— see Empirical Application: the 1997 Hong Kong Handover below.
The Two Endpoints#
The metric extends continuously to the boundary, giving the two classical estimators as special cases:
\(\rho \to 0\): \(S = I\), \(W = K_q\) — donor matching on q-th differences, and \(E\) absorbs the entire level discrepancy.
\(\rho \to 1\): \(S = P_0\) (the projector onto \(\mathrm{Null}(K_q)\)), \(W = I - P_0\) — donor matching on levels with an intercept (\(q=1\)) or intercept plus linear trend (\(q=2\)).
Interior \(\rho\) interpolates smoothly between them — a soft spectral transformation of the data rather than a hard filter.
Counterfactual and Forecasting#
In the post-period the smooth component is extrapolated by a time-series forecaster and added to the donor-matched component:
The default forecaster is a closed-form ARIMA(1, 1, 0) (an integrated AR(1) on
the differences of \(\hat E\)), the correctly specified model for an
idiosyncratic ARIMA(1,1,0) stochastic trend; forecaster="last" carries the
last fitted value forward.
Selecting the Allocation#
\(\rho\) is chosen by rolling-origin cross-validation (scikit-learn’s
TimeSeriesSplit): for each candidate
\(\rho\) and each expanding-window fold, HSC is fit on the training block and
scored on the held-out block by \(X_{\text{val}}\hat\omega + \mathrm{forecast}(\hat E)\).
The \(\rho\) with the lowest average out-of-sample error is selected, with no
ex-ante assumption about whether the stochastic trend is shared or idiosyncratic.
Monte Carlo Study: Regime Adaptation#
The block below is self-contained. It draws panels from the paper’s
data-generating process (Appendix C.1) under two regimes — a shared stochastic
trend (rho_u=1) and an idiosyncratic one (rho_u=0) — and compares HSC
against the two fixed strategies it interpolates: SC with an intercept (levels) and
SC on first differences. The treated unit gets no effect, so the post-period RMSE
is pure counterfactual error.
import numpy as np
import pandas as pd
import cvxpy as cp
from mlsynth import HSC
def integ_ar1(T, phi, innov):
d = np.zeros(T)
for t in range(1, T):
d[t] = phi * d[t - 1] + innov[t]
return np.cumsum(d)
def simulate(rng, Lam, lam0, alpha_d, N0, T0, Tpost, kappa, rho_u, phe=0.25):
T = T0 + Tpost
F = np.column_stack([
np.cumsum(rng.normal(0, 2, T)), # random walk
integ_ar1(T, 0.5, rng.normal(0, 2, T)), # ARIMA(1,1,0)
np.r_[0.0, np.cumsum(rng.normal(0, 1, T - 1))], # extra trend factor
])
L = np.vstack([lam0, Lam]) @ F.T
uc = rng.normal(0, np.sqrt(1 - phe**2), T)
E = np.zeros((N0 + 1, T))
for j in range(N0 + 1):
ui = rng.normal(0, np.sqrt(1 - phe**2), T)
E[j] = integ_ar1(T, phe, np.sqrt(rho_u) * uc + np.sqrt(1 - rho_u) * ui)
alpha = np.concatenate([[0.0], alpha_d])
Y = L + kappa * E + rng.normal(0, 1, (N0 + 1, T)) + alpha[:, None] + rng.normal(0, 1, T)
return Y # (N0+1, T); row 0 = treated
def to_df(Y, T0):
return pd.DataFrame(
[{"unit": f"u{j:02d}", "time": t, "y": float(Y[j, t]),
"treat": int(j == 0 and t >= T0)}
for j in range(Y.shape[0]) for t in range(Y.shape[1])]
)
def simplex_fit(X, y):
w = cp.Variable(X.shape[1])
cp.Problem(cp.Minimize(cp.sum_squares(y - X @ w)),
[w >= 0, cp.sum(w) == 1]).solve(solver=cp.CLARABEL)
return np.clip(w.value, 0, None)
def sc_int(Xpre, Ypre, Xpost): # SC with intercept (levels)
w = simplex_fit(Xpre - Xpre.mean(0), Ypre - Ypre.mean())
return Xpost @ w + (Ypre.mean() - Xpre.mean(0) @ w)
def sc_diff(Xpre, Ypre, Xpost): # SC on first differences
w = simplex_fit(np.diff(Xpre, axis=0), np.diff(Ypre))
return Ypre[-1] + (Xpost - Xpre[-1]) @ w
N0, T0, Tpost, R = 20, 100, 10, 60
m = np.random.default_rng(0)
Lam = np.clip(m.normal(0, 0.5, (N0, 3)), -2, 2) # donor loadings (fixed)
S8 = m.choice(N0, 8, replace=False)
lam0 = m.dirichlet(np.ones(8) * 0.5) @ Lam[S8] # treated loading, in hull
alpha_d = m.uniform(5, 15, N0) # donor fixed effects
rmse = lambda e: float(np.sqrt(np.mean(np.asarray(e) ** 2)))
for label, rho_u in [("common drift", 1.0), ("idiosyncratic", 0.0)]:
ei, ed, eh, rhos = [], [], [], []
for r in range(R):
rng = np.random.default_rng(1000 + r)
Y = simulate(rng, Lam, lam0, alpha_d, N0, T0, Tpost, kappa=2.0, rho_u=rho_u)
Ypre, Xpre = Y[0, :T0], Y[1:, :T0].T
Xpost, Y0post = Y[1:, T0:].T, Y[0, T0:]
ei.append(sc_int(Xpre, Ypre, Xpost) - Y0post)
ed.append(sc_diff(Xpre, Ypre, Xpost) - Y0post)
res = HSC({"df": to_df(Y, T0), "outcome": "y", "unitid": "unit",
"time": "time", "treat": "treat"}).fit()
eh.append(res.design.counterfactual_post - Y0post)
rhos.append(res.selected_rho)
print(f"{label:>14}: SC-INT {rmse(ei):5.2f} | SC-diff {rmse(ed):5.2f} | "
f"HSC {rmse(eh):5.2f} | mean rho_hat {np.mean(rhos):.2f}")
Running it reproduces the paper’s headline (and matches the standalone skeleton bit-for-bit):
common drift: SC-INT 1.15 | SC-diff 1.48 | HSC 1.21 | mean rho_hat 0.86
idiosyncratic: SC-INT 10.60 | SC-diff 6.11 | HSC 6.46 | mean rho_hat 0.48
HSC is good in both regimes, tracking whichever fixed estimator is best, while each fixed estimator fails in one: SC-on-levels is excellent under a shared trend (1.15) but catastrophic when the trend is idiosyncratic (10.60); SC-on-differences is the reverse. And the cross-validated allocation moves the right way on its own — \(\hat\rho \approx 0.86\) (lean to levels) under a shared trend, dropping to \(\approx 0.48\) (lean to differencing) when the trend is idiosyncratic.
Note
The small gap to the oracle-best fixed method in each regime (1.21 vs 1.15; 6.46 vs 6.11) is the expected price of adaptation — cross-validation is not an oracle — and is what buys robustness across regimes.
Running the same harness at the paper’s actual sample sizes
(\(N_0 = 50\) donors, \(T_0 = 200\), \(T_{\text{post}} = 20\),
q=1, arima110 forecaster) over the full Liu-Xu (2026) grid —
amplitude \(\kappa \in \{0, 0.5, 1, 2\}\) of the unit-specific stochastic
trend and cross-unit sharing \(\rho_u \in \{0, 0.5, 1\}\), 80 reps/cell —
reproduces both of the paper’s Monte-Carlo figures, all through the packaged
HSC estimator.
Figure 7 — the cross-validated \(\hat\rho\) adapts to the regime. When the stochastic trend is shared across units (\(\rho_u = 1\)) or absent (\(\kappa = 0\)), level matching identifies the donor weights and \(\hat\rho\) sits near 1; when the trend is idiosyncratic (\(\kappa \ge 0.5\), \(\rho_u \le 0.5\)), the donor pool cannot reproduce it, so cross-validation pushes \(\hat\rho\) down toward differencing:
\(\kappa\) |
\(\rho_u = 0\) |
\(\rho_u = 0.5\) |
\(\rho_u = 1\) |
|---|---|---|---|
0.0 |
0.89 |
0.92 |
0.92 |
0.5 |
0.51 |
0.55 |
0.94 |
1.0 |
0.55 |
0.59 |
0.89 |
2.0 |
0.49 |
0.43 |
0.90 |
Figure 6 — HSC ties or beats the baselines in every cell. The post-period RMSE ratio of HSC to plain level-matching SC is below 1 throughout (0.34-0.77): HSC ties SC tightly when the trend is shared/absent (ratio \(\approx\) 0.34-0.39, both near the noise floor) and the advantage widens with \(\kappa\) as idiosyncratic drift grows — exactly the §6.2 pattern.
\(\kappa\) |
\(\rho_u = 0\) |
\(\rho_u = 0.5\) |
\(\rho_u = 1\) |
|---|---|---|---|
0.0 |
0.39 |
0.34 |
0.34 |
0.5 |
0.62 |
0.49 |
0.36 |
1.0 |
0.68 |
0.64 |
0.37 |
2.0 |
0.77 |
0.57 |
0.38 |
To reproduce, set N0, T0, Tpost = 50, 200, 20 in the block above and sweep
kappa and rho_u over the grid (the paper uses \(R = 500\) reps; 80
already gives the pattern above).
Inference#
The authors are explicit that uncertainty quantification “rests on additional assumptions beyond those required for the point estimator” and leave it to future work, pointing toward a prediction interval that combines donor-weight estimation uncertainty with out-of-sample forecast-error calibration for the smooth component.
Empirical Application: the 1997 Hong Kong Handover#
The paper’s headline application (Liu-Xu §7) revisits the economic-integration
question of Hsiao, Ching and Wan (2012): what happened to Hong Kong’s real GDP
per capita after the 1997 handover to mainland China? The panel
(basedata/hong_kong_handover.csv) is annual real GDP per capita for Hong Kong
and 11 OECD donors, with treatment switching on in 1997
(\(T_0 = 36\) pre-treatment years, 1961–1996). Following the authors, the
post-treatment window runs 1997–2003 (\(T_{\text{post}} = 7\)) — the
counterfactual of an integrated process is only trustworthy a few years out (see
the note below), so the analysis is not extended to the end of the raw series.
GDP per capita is the canonical nonstationary, trending outcome for which HSC is
built, and it is unknown a priori whether Hong Kong’s growth trend is shared with
the donor pool or idiosyncratic — exactly the regime HSC’s cross-validated
\(\rho\) resolves.
import numpy as np
import pandas as pd
from mlsynth import HSC
df = pd.read_csv(
"https://raw.githubusercontent.com/jgreathouse9/mlsynth/"
"refs/heads/main/basedata/hong_kong_handover.csv"
)
df = df[df["year"] <= 2003] # authors' post-window: 1997-2003
res = HSC({
"df": df, "outcome": "gdp", "unitid": "country",
"time": "year", "treat": "Handover",
"ridge": "sdid", # SDID-style ridge -> diversified weights
# refine the rho grid: the CV optimum (~0.09) lies between the default
# grid's 0.0 and 0.2 points, so the coarse default would miss it.
"rho_grid": list(np.round(np.arange(0.0, 0.98, 0.01), 2)),
"display_graphs": True,
}).fit()
print("selected rho:", round(res.selected_rho, 3)) # 0.09
print("2003 effect:", round(res.treatment_effect[-1])) # -1902
print("ATT 1997-2003:", round(res.att)) # -1734
for k, v in sorted(res.weights_by_donor.items(), key=lambda kv: -kv[1]):
print(f" {k:12s} {v:.3f}")
Donor weights. With the SDID ridge the design leans on a broad mix of donors rather than a handful — the largest weight is Korea ≈ 0.18, followed by Germany ≈ 0.14, the US ≈ 0.13 and Italy ≈ 0.11, with the remaining seven donors each contributing 0.04–0.09. This matches the paper’s reported weights almost exactly (Korea 0.18, Germany 0.14, US 0.13, Italy 0.11, max < 0.19). Under the near-unregularized relative ridge the same fit instead collapses most of the weight onto Korea (≈ 0.41) — the SDID ridge is what buys the diversification.
The counterfactual path. Cross-validation selects \(\hat\rho = 0.09\) (paper: 0.11) — a genuine interior allocation, leaning toward differencing but keeping some level information. Hong Kong sits below its synthetic counterfactual throughout the post-handover window, a persistent shortfall spanning the Asian Financial Crisis (1998) and the 2003 SARS epidemic:
Year |
Observed |
Counterfactual |
Effect |
|---|---|---|---|
1997 |
26,632 |
26,093 |
+538 |
1998 |
24,857 |
26,528 |
-1,671 |
1999 |
25,238 |
27,651 |
-2,413 |
2000 |
26,933 |
28,809 |
-1,876 |
2001 |
26,885 |
29,225 |
-2,340 |
2002 |
27,210 |
29,688 |
-2,478 |
2003 |
28,097 |
29,999 |
-1,902 |
The 2003 effect is -$1,902 against a counterfactual of $29,999 — matching the paper’s headline (≈ -$1,900 against ≈ $30,000) to the dollar. The average post-treatment effect is \(\widehat{\text{ATT}} =\) -$1,734 per capita.
Note
Refine the \(\rho\) grid for empirical work. The default
rho_grid [0, 0.2, 0.5, 0.8, 0.97] is coarse — fine for the validated
Monte Carlo, but here the true CV optimum (\(\rho \approx 0.09\)) falls in
the gap between the grid’s 0.0 and 0.2 points. Both neighbours are ~8-10%
worse on the CV criterion than the interior optimum, so the coarse grid
selects an unrelated point (0.5) and inflates the largest weight to 0.21.
Passing a fine grid (np.arange(0, 0.98, 0.01)) recovers
\(\hat\rho = 0.09\) and the paper’s weights. This was the dominant
source of the gap to the paper — not the data.
Robustness Checks#
The 2003 effect (≈ -$1,900) and counterfactual (≈ $30,000) are stable across the estimator’s tuning choices. Re-fitting (fine \(\rho\) grid throughout) under the relative ridge, second-order differencing (\(q = 2\)), and the last-value forecaster:
Configuration |
\(\hat\rho\) |
max weight |
cf 2003 |
effect 2003 |
ATT |
|---|---|---|---|---|---|
|
0.09 |
0.18 |
29,999 |
-1,902 |
-1,734 |
|
0.56 |
0.41 |
29,857 |
-1,760 |
-1,744 |
|
0.00 |
0.17 |
30,200 |
-2,103 |
-1,818 |
|
0.00 |
0.18 |
30,708 |
-2,611 |
-2,319 |
Across every configuration the 2003 effect stays in a tight -$1,760 to -$2,610 band and the 2003 counterfactual in $29,800–$30,700. The ridge choice changes how the weight is spread (max donor weight 0.18 under the SDID ridge vs. 0.41 under the relative ridge) far more than it changes the counterfactual; the differencing order and forecaster move it only modestly. The rolling-origin CV curve over \(\rho\) confirms a real interior optimum near 0.09 that the default grid straddles — the criterion rises ~8-10% at the default grid’s 0.0/0.2/0.5 points and explodes toward \(\rho = 1\), the signature of a partly idiosyncratic trend:
\(\rho\) |
CV mean squared error |
vs. optimum |
|---|---|---|
0.00 (default grid) |
1.178e6 |
+8.3% |
0.05 |
1.103e6 |
+1.3% |
0.09 (selected) |
1.088e6 |
— |
0.11 (paper) |
1.094e6 |
+0.6% |
0.20 (default grid) |
1.193e6 |
+9.6% |
0.50 (default grid) |
1.177e6 |
+8.2% |
0.97 (default grid) |
4.69e6 |
+331% |
Note
This is a close Path-A empirical replication: on the same data, the authors’ 1997–2003 window, and a fine \(\rho\) grid, HSC matches the paper’s headline numbers essentially to the dollar — \(\hat\rho = 0.09\) (paper 0.11), a -$1,902 2003 effect against a $29,999 counterfactual (paper ≈ -$1,900 / ≈ $30,000), and donor weights Korea 0.18 / Germany 0.14 / US 0.13 / Italy 0.11 (paper 0.18 / 0.14 / 0.13 / 0.11). The remaining \(\rho\)-gap (0.09 vs 0.11) is within the CV’s grid resolution. The validated bit-for-bit check for HSC is the Monte Carlo above, which matches the standalone estimation skeleton exactly.
Core API#
Harmonic Synthetic Control (HSC) estimator.
Implements:
Liu, Z., & Xu, Y. (2026). “The Harmonic Synthetic Control Method.”
HSC addresses the spurious-matching risk of synthetic control on nonstationary outcomes with unit-specific stochastic trends. Instead of the binary choice between matching on raw levels (vulnerable to spurious matching from idiosyncratic trends) and matching on differences (which discards useful shared trend variation), HSC uses a soft, data-driven allocation:
donor weights are estimated under a frequency-dependent metric
W_{rho,q}that downweights low-frequency (trend-like) residual variation, anda treated-unit-specific smooth component
Eabsorbs the low-frequency residual and is forecast forward to complete the counterfactual.
A single allocation parameter rho in [0, 1] – selected by rolling-origin
cross-validation – continuously interpolates between SC on q-th
differences (rho -> 0) and SC on levels with an intercept/trend
(rho -> 1). See mlsynth.utils.hsc_helpers for the algorithmic pieces.
- class mlsynth.estimators.hsc.HSC(config: HSCConfig | dict)#
Bases:
objectHarmonic Synthetic Control (HSC) estimator.
- Parameters:
config (HSCConfig or dict) – Configuration object. See
mlsynth.config_models.HSCConfig.- Returns:
HSCResults – Donor weights, the fitted smooth component, the counterfactual, and the cross-validated allocation
rho.
Notes
Following Liu & Xu (2026), this ships the HSC point estimator only. Uncertainty quantification rests on additional assumptions beyond those needed for the point estimate (the authors point to a Cattaneo-Feng-Titiunik (2021) style prediction interval) and is deliberately not implemented here.
- fit() HSCResults#
Run the HSC pipeline.
Configuration#
- class mlsynth.config_models.HSCConfig(*, df: ~pandas.DataFrame, outcome: str, treat: str, unitid: str, time: str, display_graphs: bool = False, save: bool | str = False, counterfactual_color: ~typing.List[str] = <factory>, treated_color: str = 'black', q: ~typing.Literal[1, 2] = 1, rho_grid: ~typing.List[float] = <factory>, cv_splits: ~typing.Annotated[int, ~annotated_types.Ge(ge=2)] = 3, ridge: float | ~typing.Literal['sdid'] = 1e-06, forecaster: ~typing.Literal['arima110', 'last'] = 'arima110')#
Configuration for the Harmonic Synthetic Control (HSC) estimator.
Implements:
Liu, Z., & Xu, Y. (2026). “The Harmonic Synthetic Control Method.”
HSC matches donors under a frequency-dependent metric and absorbs the treated unit’s low-frequency residual into a smooth component that is forecast forward. A single allocation parameter
rho in [0, 1], selected by rolling-origin cross-validation, interpolates between synthetic control onq-th differences (rho -> 0) and synthetic control on levels with an intercept/trend (rho -> 1).- Parameters:
q (int) – Smoothness order of the difference operator (1 or 2).
q=1controls a stochastic trend / random-walk drift;q=2controls a local-linear trend. Default 1.rho_grid (list of float) – Candidate allocation values in
[0, 1]searched by CV. Default[0.0, 0.2, 0.5, 0.8, 0.97].cv_splits (int) – Number of rolling-origin folds (sklearn
TimeSeriesSplit). Default 3.ridge (float) – Relative ridge coefficient on the donor weights for QP conditioning. Default
1e-6.forecaster ({“arima110”, “last”}) – Forecaster for the smooth residual.
"arima110"(default) is a closed-form ARIMA(1, 1, 0);"last"carries the last value forward.display_graphs (bool) – Show the observed-vs-counterfactual plot after fitting.
Notes
Following Liu & Xu (2026), HSC ships the point estimator only; uncertainty quantification is deliberately out of scope, so there are no inference options here.
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid'}#
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
Result Containers#
HSC.fit() returns an
HSCResults, bundling the fitted
HSCDesign (donor weights, the smooth
component, the cross-validated rho and its CV curve, the counterfactual) and
the prepared HSCInputs.
Structured containers for the Harmonic Synthetic Control (HSC) pipeline.
Implements the dataclasses used throughout HSC, which itself implements:
Liu, Z., & Xu, Y. (2026). “The Harmonic Synthetic Control Method.”
All containers are frozen (immutable) per the repository convention.
- class mlsynth.utils.hsc_helpers.structures.HSCDesign(selected_rho: float, q: int, omega: ndarray, smooth_pre: ndarray, donor_match_pre: ndarray, counterfactual_post: ndarray, smooth_forecast: ndarray, donor_match_post: ndarray, cv_curve: Dict[float, float], forecaster: str)#
Bases:
objectFitted HSC design.
- Parameters:
selected_rho (float) – Cross-validated allocation parameter in
[0, 1]. Near 1 the design matches on levels (with an intercept/trend); near 0 it matches onq-th differences and the smooth component absorbs the trend.q (int) – Smoothness order used.
omega (np.ndarray) – Donor weights on the simplex, shape
(N,).smooth_pre (np.ndarray) – Fitted treated-unit smooth residual
Eover the pre-period, shape(T0,)(S_{rho,q} (Y_pre - X_pre omega)).donor_match_pre (np.ndarray) – Donor-matched component over the pre-period,
X_pre @ omega.counterfactual_post (np.ndarray) – Post-treatment counterfactual, shape
(n_post,)(X_post @ omega + forecast(E)).smooth_forecast (np.ndarray) – Forecast of the smooth residual over the post-period.
donor_match_post (np.ndarray) – Donor-matched component over the post-period,
X_post @ omega.cv_curve (dict) –
{rho: rolling-origin CV error}used to selectselected_rho.forecaster (str) – Name of the residual forecaster used (e.g.
"arima110").
- counterfactual_post: ndarray#
- donor_match_post: ndarray#
- donor_match_pre: ndarray#
- omega: ndarray#
- smooth_forecast: ndarray#
- smooth_pre: ndarray#
- class mlsynth.utils.hsc_helpers.structures.HSCInputs(y_target: ndarray, donor_matrix: ndarray, T: int, T0: int, treated_unit_name: Any, donor_names: Sequence, time_labels: ndarray, q: int)#
Bases:
objectPreprocessed panel data for HSC.
- Parameters:
y_target (np.ndarray) – Treated unit’s outcome over the full timeline, shape
(T,).donor_matrix (np.ndarray) – Donor outcomes, shape
(T, N)(rows = periods, cols = donors).T (int) – Total number of periods.
T0 (int) – Number of pre-treatment periods.
treated_unit_name (Any) – Label of the treated unit.
donor_names (Sequence) – Labels of the donor units (length
N).time_labels (np.ndarray) – Time-period labels (length
T).q (int) – Smoothness order of the difference operator (1 or 2).
- property X_post: ndarray#
Donor post-treatment matrix, shape
(T - T0, N).
- property X_pre: ndarray#
Donor pre-treatment matrix, shape
(T0, N).
- property Y_post: ndarray#
Treated post-treatment outcomes, shape
(T - T0,).
- property Y_pre: ndarray#
Treated pre-treatment outcomes, shape
(T0,).
- donor_matrix: ndarray#
- time_labels: ndarray#
- y_target: ndarray#
- class mlsynth.utils.hsc_helpers.structures.HSCResults(inputs: HSCInputs, design: HSCDesign, att: float, counterfactual_full: ndarray, treatment_effect: ndarray, weights_by_donor: Dict[Any, float])#
Bases:
objectUser-facing output of the HSC estimator.
- Parameters:
inputs (HSCInputs) – Preprocessed panel data.
design (HSCDesign) – Fitted design (weights, smooth component, counterfactual).
att (float) – Average post-treatment effect (mean of
treatment_effect).counterfactual_full (np.ndarray) – Counterfactual over the full timeline, shape
(T,)(pre-period in-sample fit followed by the post-period counterfactual).treatment_effect (np.ndarray) – Post-period treated minus counterfactual, shape
(n_post,).weights_by_donor (dict) –
{donor_label: weight}for donors with non-trivial weight.
- counterfactual_full: ndarray#
- treatment_effect: ndarray#
Helper Modules#
Data preparation for HSC.
Wraps datautils.dataprep to pivot the long panel into the treated
outcome vector and donor matrix HSC consumes, and validates that the
pre-treatment window is long enough for the difference operator and the
rolling-origin cross-validation.
- mlsynth.utils.hsc_helpers.setup.prepare_hsc_inputs(df: DataFrame, outcome: str, unitid: str, time: str, treat: str, q: int, n_splits: int = 3) HSCInputs#
Pivot panel data into the
(T,)/(T, N)layout HSC consumes.- Parameters:
df (pd.DataFrame) – Long balanced panel data.
outcome, unitid, time, treat (str) – Column names for outcome, units, time, and the binary treatment indicator (1 for the treated unit in post-treatment periods).
q (int) – Smoothness order (1 or 2); determines the minimum pre-period length.
n_splits (int) – Rolling-origin CV folds; used only to validate the pre-period is long enough to cross-validate.
- Returns:
HSCInputs
The profiled HSC metric and the donor-weight quadratic program.
Implements the profiled representation (Proposition 1 of Liu & Xu, 2026).
For a fixed allocation rho in (0, 1) and smoothness order q in {1, 2}:
lambda_rho = rho / (1 - rho) K_q = D_q’ D_q (roughness matrix) S_{rho,q} = (I + lambda_rho K_q)^{-1} (smoother) W_{rho,q} = (1 / rho) (I - S_{rho,q}) (donor-matching metric)
with the boundary cases
rho = 0: S = I, W = K_q (SC on q-th differences) rho = 1: S = P_0, W = I - P_0 (SC on levels + intercept/trend)
where P_0 projects onto Null(K_q) (constants for q=1, linear trends
for q=2). Donor weights then solve a ridge-regularized simplex QP under the
metric W_{rho,q}.
- mlsynth.utils.hsc_helpers.formulation.difference_operator(T: int, q: int) ndarray#
q-th order difference operatorD_qof shape(T - q, T).
- mlsynth.utils.hsc_helpers.formulation.fit_donor_weights(X: ndarray, Y: ndarray, W: ndarray, ridge: float = 1e-06, ridge_abs: float | None = None, solver: object | None = None) ndarray#
Solve the simplex-constrained, metric-weighted donor QP.
argmin_{omega in simplex} (Y - X omega)' W (Y - X omega) + c ||omega||^2, where the ridge coefficientcis eitherridge_abs(absolute, e.g. the SDID-stylezeta^2 T0fromsdid_ridge_coefficient()) when supplied, or the relativeridge * trace(X'WX) / Notherwise.- Parameters:
X (np.ndarray) – Donor matrix, shape
(T, N).Y (np.ndarray) – Treated outcome, shape
(T,).W (np.ndarray) – Donor-matching metric, shape
(T, T).ridge (float) – Relative ridge coefficient (used only when
ridge_absisNone).ridge_abs (float, optional) – Absolute ridge coefficient on
||omega||^2. Overridesridge.solver (optional) – CVXPY solver; defaults to Clarabel.
- Returns:
np.ndarray – Non-negative donor weights summing to one, shape
(N,).
- mlsynth.utils.hsc_helpers.formulation.roughness_matrix(T: int, q: int) ndarray#
Roughness matrix
K_q = D_q' D_qof shape(T, T).
- mlsynth.utils.hsc_helpers.formulation.sdid_ridge_coefficient(X_pre: ndarray, n_post: int) float#
SDID-style absolute ridge coefficient on
||omega||^2.Following Liu & Xu (2026) §7 and Arkhangelsky et al. (2021), the donor penalty is
zeta^2 * T0withzeta = T_post^{1/4} * sigma_dX, wheresigma_dXis the standard deviation of the first differences of the donor outcomes over the pre-period. This scales the ridge to the data and yields diversified (non-corner) donor weights.- Parameters:
X_pre (np.ndarray) – Donor pre-treatment matrix, shape
(T0, N).n_post (int) – Number of post-treatment periods.
- Returns:
float – Absolute coefficient on
||omega||^2(zeta^2 * T0).
- mlsynth.utils.hsc_helpers.formulation.smoother_and_metric(T: int, q: int, rho: float) Tuple[ndarray, ndarray]#
Return
(S_{rho,q}, W_{rho,q})for the profiled HSC problem.- Parameters:
T (int) – Length of the (pre-treatment) series.
q (int) – Smoothness order (1 or 2).
rho (float) – Allocation parameter in
[0, 1].
- Returns:
S (np.ndarray) – Smoother that extracts the smooth residual
Efrom the discrepancy.W (np.ndarray) – Symmetric PSD donor-matching metric (a seminorm).
Forecasters for the HSC smooth residual E.
In post-treatment periods the treated-unit smooth component E is
extrapolated by a time-series forecaster and added to the donor-matched
component to form the counterfactual. The default is an ARIMA(1, 1, 0)
forecaster – the correctly specified model for the paper’s idiosyncratic
ARIMA(1, 1, 0) stochastic-trend DGP.
The ARIMA(1, 1, 0) forecast is computed in closed form (no iterative MLE): the first differences follow a zero-mean AR(1), so future increments decay geometrically and the level path integrates them forward. This is fully deterministic, which keeps HSC results exactly reproducible.
- mlsynth.utils.hsc_helpers.forecast.arima110_forecast(x: ndarray, h: int) ndarray#
Closed-form ARIMA(1, 1, 0) forecast of a level series
x.Fits a zero-mean AR(1) to
diff(x)by least squares, then forecastshfuture incrementsd_{t+k} = phi^k d_tand integrates to levels.- Parameters:
x (np.ndarray) – Observed level series.
h (int) – Forecast horizon.
- Returns:
np.ndarray – Length-
hforecast of the level path.
- mlsynth.utils.hsc_helpers.forecast.forecast_smooth(E: ndarray, h: int, method: str = 'arima110') ndarray#
Dispatch a forecaster by name over the smooth residual
E.
- mlsynth.utils.hsc_helpers.forecast.last_value_forecast(x: ndarray, h: int) ndarray#
Carry the last observed value forward as a constant.
HSC fit at a fixed allocation, and rolling-origin CV over the allocation.
fit_at_rho solves the profiled donor QP and returns the smooth residual;
select_rho_by_cv chooses rho by sklearn’s rolling-origin
TimeSeriesSplit, scoring each candidate by
its out-of-sample prediction error (donor match + forecast of the smooth
component) on held-out pre-period blocks.
- mlsynth.utils.hsc_helpers.optimization.fit_at_rho(X_pre: ndarray, Y_pre: ndarray, rho: float, q: int, ridge: float = 1e-06, solver: object | None = None, ridge_abs: float | None = None) Tuple[ndarray, ndarray]#
Fit the profiled HSC problem at a fixed
rho.ridge_abs(e.g. the SDID-style coefficient) overrides the relativeridgewhen supplied.- Returns:
omega (np.ndarray) – Donor weights on the simplex, shape
(N,).E_pre (np.ndarray) – Fitted smooth residual
S_{rho,q} (Y_pre - X_pre omega),(T0,).
- mlsynth.utils.hsc_helpers.optimization.select_rho_by_cv(X_pre: ndarray, Y_pre: ndarray, q: int, rho_grid: Sequence[float], n_splits: int = 3, ridge: float = 1e-06, forecaster: str = 'arima110', solver: object | None = None, ridge_abs: float | None = None) Tuple[float, Dict[float, float]]#
Select
rhoby rolling-origin cross-validation.For each candidate
rhoand each expanding-windowTimeSeriesSplitfold, HSC is fit on the training block; the held-out block is predicted byX_val @ omega + forecast(E)and scored by mean squared error. Therhowith the lowest average fold error wins.- Returns:
best_rho (float) – Selected allocation parameter.
cv_curve (dict) –
{rho: mean CV error}.
Top-level HSC solve and effect summary.
solve_hsc runs the full pipeline – select rho by rolling-origin CV,
fit the profiled donor QP on the whole pre-period, forecast the smooth
residual into the post-period, and assemble the counterfactual. The
post-treatment counterfactual is
Y_hat_post = X_post @ omega + forecast(E),
the donor-matched component plus the extrapolated smooth component.
- mlsynth.utils.hsc_helpers.orchestration.solve_hsc(inputs: HSCInputs, q: int, rho_grid: Sequence[float], n_splits: int = 3, ridge: object = 1e-06, forecaster: str = 'arima110', solver: object | None = None) HSCDesign#
Run HSC end-to-end and return the fitted design.
ridgeis either a float (relative ridgeridge * trace(X'WX)/N) or the string"sdid", which uses the data-driven SDID-style coefficientzeta^2 T0(computed once from the full pre-period donors and the post-treatment horizon) for diversified donor weights.
- mlsynth.utils.hsc_helpers.orchestration.summarize_effects(inputs: HSCInputs, design: HSCDesign) Tuple[float, ndarray, ndarray]#
Assemble ATT, full-timeline counterfactual, and per-period effects.
- Returns:
att (float) – Mean post-period treatment effect.
counterfactual_full (np.ndarray) – Pre-period in-sample fit (
X_pre @ omega + E) stacked with the post-period counterfactual, shape(T,).treatment_effect (np.ndarray) –
Y_post - counterfactual_post, shape(n_post,).
Plotting for HSC: observed treated series vs the HSC counterfactual.
- mlsynth.utils.hsc_helpers.plotter.plot_hsc(results: HSCResults) None#
Plot the treated outcome against the HSC counterfactual.
- Parameters:
results (HSCResults) – Output of
mlsynth.estimators.HSC.- Raises:
MlsynthPlottingError – If required arrays are missing from the result.
References#
Liu, Z., & Xu, Y. (2026). “The Harmonic Synthetic Control Method.” See [HSC].
Abadie, A., Diamond, A., & Hainmueller, J. (2010). “Synthetic Control Methods for Comparative Case Studies.” Journal of the American Statistical Association 105(490):493-505.
Arkhangelsky, D., Athey, S., Hirshberg, D. A., Imbens, G. W., & Wager, S. (2021). “Synthetic Difference-in-Differences.” American Economic Review 111(12):4088-4118.
Bai, J. (2009). “Panel Data Models with Interactive Fixed Effects.” Econometrica 77(4):1229-1279.
Cattaneo, M. D., Feng, Y., & Titiunik, R. (2021). “Prediction Intervals for Synthetic Control Methods.” Journal of the American Statistical Association 116(536):1865-1880.
Granger, C. W. J., & Newbold, P. (1974). “Spurious Regressions in Econometrics.” Journal of Econometrics 2(2):111-120.
Hsiao, C., Ching, H. S., & Wan, S. K. (2012). “A Panel Data Approach for Program Evaluation: Measuring the Benefits of Political and Economic Integration of Hong Kong with Mainland China.” Journal of Applied Econometrics 27(5):705-740.
Phillips, P. C. B. (1986). “Understanding Spurious Regressions in Econometrics.” Journal of Econometrics 33(3):311-340.