GeoLift Market Selection (GEOLIFT)#

Overview#

Most estimators in mlsynth are retrospective: a treatment has happened and we want its effect. GEOLIFT is prospective — a tool for synthetic experimental design in geo-experiments. Before any ad spend, it answers:

Which markets should be treated, for how long, so that a real lift would be detectable?

It is a faithful port of Meta’s GeoLift market-selection routine onto the mlsynth Augmented-SCM machinery (Ben-Michael, Feller & Rothstein, 2021 [BMFR2021]), with conformal inference (Chernozhukov, Wüthrich & Zhu, 2021 [CWZ2021]) and the standardized design/effect result contract. Reach for it when test markets must be chosen up front, you want a minimum detectable effect (MDE) and power per candidate region plus the deployable synthetic control, you may need to force markets in or out, and — once the experiment runs — you want to realize the chosen design into an effect report.

Mathematical Formulation#

Setup and notation#

There are \(N\) markets \(\mathcal{N} \coloneqq \{1, \dots, N\}\) and \(T\) periods \(t \in \mathcal{T} \coloneqq \{1, \dots, T\}\). The design uses only the pre-treatment window \(\mathcal{T}_1 \coloneqq \{t \in \mathcal{T} : t \le T_0\}\) of length \(T_0\); if post-treatment data exist they are sliced off (see Pre/post split). The outcome of market \(j\) at time \(t\) is \(y_{jt}\), with market series \(\mathbf{y}_j = (y_{j1}, \dots, y_{jT})^\top \in \mathbb{R}^{T}\).

A candidate test region is a set \(\mathcal{S} \subseteq \mathcal{N}\) of \(k \coloneqq |\mathcal{S}|\) markets (the treatment_size). It plays the role of the canonical treated unit through its aggregate series

\[\mathbf{y}^{\mathcal{S}}, \qquad y^{\mathcal{S}}_t \coloneqq \operatorname{agg}_{j \in \mathcal{S}} y_{jt}, \qquad \operatorname{agg} \in \Bigl\{\textstyle\sum,\ \operatorname{mean}\Bigr\}.\]

The donor pool is every other market, \(\mathcal{N}_0(\mathcal{S}) \coloneqq \mathcal{N} \setminus \mathcal{S}\) with \(N_0 \coloneqq N - k\), giving the donor matrix \(\mathbf{Y}_0^{\mathcal{S}} \coloneqq [\mathbf{y}_j]_{j \in \mathcal{N}_0(\mathcal{S})} \in \mathbb{R}^{T \times N_0}\). The sum aggregate is GeoLift’s default (the right object for total spend/lift); the mean keeps \(\mathbf{y}^{\mathcal{S}}\) at donor scale — inside the donor convex hull — for a better-posed fit.
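The aggregate series and donor matrix defined above can be sketched in a few lines of NumPy. This is an illustrative helper, not mlsynth's internal code: the function name `split_region` and the toy panel are assumptions made for the example.

```python
import numpy as np

def split_region(Y, markets, region, how="sum"):
    """Split a T x N outcome matrix into the aggregate y^S and the donor matrix Y_0.

    Hypothetical helper for illustration only.
    """
    region = set(region)
    treated_idx = [j for j, m in enumerate(markets) if m in region]
    donor_idx = [j for j, m in enumerate(markets) if m not in region]
    if how == "sum":
        y_S = Y[:, treated_idx].sum(axis=1)    # total-scale target
    elif how == "mean":
        y_S = Y[:, treated_idx].mean(axis=1)   # donor-scale target
    else:
        raise ValueError("how must be 'sum' or 'mean'")
    return y_S, Y[:, donor_idx], [markets[j] for j in donor_idx]

# Toy panel: T = 12 periods, N = 5 markets (synthetic numbers).
rng = np.random.default_rng(0)
markets = ["a", "b", "c", "d", "e"]
Y = rng.normal(100.0, 5.0, size=(12, 5))

y_sum, Y0, donors = split_region(Y, markets, {"a", "c"}, how="sum")
y_mean, _, _ = split_region(Y, markets, {"a", "c"}, how="mean")
print(Y0.shape, donors)
```

With \(k = 2\) the sum target is exactly twice the mean target, which is why the mean keeps the region at donor scale.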

Stage 1 — Candidate nomination#

Enumerating all \(\binom{N}{k}\) regions is intractable, so GeoLift nominates a tractable shortlist by correlation similarity. On the pre-period, form the Pearson correlation matrix \(\mathbf{P} = [\rho_{ij}] \in \mathbb{R}^{N \times N}\),

\[\rho_{ij} \coloneqq \frac{\sum_{t \in \mathcal{T}_1}(y_{it} - \bar{y}_i)(y_{jt} - \bar{y}_j)} {\sqrt{\sum_{t \in \mathcal{T}_1}(y_{it} - \bar{y}_i)^2}\; \sqrt{\sum_{t \in \mathcal{T}_1}(y_{jt} - \bar{y}_j)^2}}, \qquad \bar{y}_i \coloneqq T_0^{-1}\!\!\sum_{t \in \mathcal{T}_1} y_{it}.\]

For each anchor \(i\), let \(\pi_i\) order the other markets by descending correlation, \(\rho_{i,\pi_i(1)} \ge \rho_{i,\pi_i(2)} \ge \dots \ge \rho_{i,\pi_i(N-1)}\). The deterministic nominee anchored at \(i\) is that market plus its \(k-1\) nearest neighbours,

\[\mathcal{S}_i \coloneqq \{i\} \cup \bigl\{\pi_i(1), \dots, \pi_i(k-1)\bigr\},\]

and the shortlist is \(\{\mathcal{S}_i\}_{i \in \mathcal{N}}\) after deduplication: at most \(N\) candidates instead of \(\binom{N}{k}\). The stochastic (“paired-jitter”) variant replaces ranks \(1, \dots, k-1\) by one draw from each adjacent pair \(\{1,2\}, \{3,4\}, \dots\), exploring near-rank neighbours (run_stochastic; stochastic_mode="global" is faithful to GeoLift, "per_anchor" draws independently per anchor).
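A minimal sketch of the deterministic nomination, assuming a pre-period matrix with one column per market. The function name `nominate` and the toy data are hypothetical; only the deterministic variant is shown, not the paired-jitter draw.

```python
import numpy as np

def nominate(Y_pre, k):
    """Deterministic nominees: each anchor plus its k-1 most correlated markets.

    Y_pre is T0 x N; returns a deduplicated list of frozensets of column indices.
    Hypothetical helper for illustration only.
    """
    P = np.corrcoef(Y_pre, rowvar=False)          # N x N Pearson matrix
    N = P.shape[0]
    seen, shortlist = set(), []
    for i in range(N):
        others = [j for j in range(N) if j != i]
        others.sort(key=lambda j: -P[i, j])       # descending correlation with anchor i
        S = frozenset([i] + others[: k - 1])
        if S not in seen:                         # drop duplicate regions
            seen.add(S)
            shortlist.append(S)
    return shortlist

# Toy pre-period: markets 0/1 share one pattern, markets 2/3 share another.
rng = np.random.default_rng(0)
t = np.arange(20)
base1, base2 = np.sin(t), np.cos(1.7 * t)
Y_pre = np.column_stack([
    base1, base1 + 0.01 * rng.normal(size=20),
    base2, base2 + 0.01 * rng.normal(size=20),
])
shortlist = nominate(Y_pre, k=2)
print(shortlist)
```

Four anchors collapse to two distinct regions here, which illustrates why the shortlist can be shorter than \(N\).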

Forcing constraints. Given a forced-in set \(\mathcal{S}_{\mathrm{in}}\) (to_be_treated) and a forbidden set \(\mathcal{S}_{\mathrm{out}}\) (not_to_be_treated, \(\mathcal{S}_{\mathrm{in}} \cap \mathcal{S}_{\mathrm{out}} = \varnothing\)), nominees are drawn from the free pool \(\mathcal{F} \coloneqq \mathcal{N} \setminus (\mathcal{S}_{\mathrm{in}} \cup \mathcal{S}_{\mathrm{out}})\) at size \(k - |\mathcal{S}_{\mathrm{in}}|\) and unioned with the forced-in set, so every candidate satisfies \(\mathcal{S}_{\mathrm{in}} \subseteq \mathcal{S}\) and \(\mathcal{S} \cap \mathcal{S}_{\mathrm{out}} = \varnothing\).
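The forcing logic can be illustrated with plain sets. In this sketch the free-pool nominees are simply every subset of the required size (feasible only because the toy pool is tiny), standing in for the correlation-based nomination; the function name and market list are assumptions for the example.

```python
from itertools import combinations

def free_pool(markets, forced_in, forced_out):
    """Markets that are neither forced in nor forbidden (hypothetical helper)."""
    forced_in, forced_out = set(forced_in), set(forced_out)
    if forced_in & forced_out:
        raise ValueError("forced-in and forbidden sets must be disjoint")
    return [m for m in markets if m not in forced_in | forced_out]

markets = ["chicago", "honolulu", "portland", "atlanta", "boston"]
S_in, S_out, k = {"chicago"}, {"honolulu"}, 3

pool = free_pool(markets, S_in, S_out)
free_size = k - len(S_in)                      # nominees are drawn at this size

# Stand-in for the nomination step: all subsets of the free pool.
candidates = [frozenset(c) | frozenset(S_in) for c in combinations(pool, free_size)]
for c in candidates:
    print(sorted(c))
```

Every candidate has size \(k\), contains the forced-in set, and avoids the forbidden set by construction.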

Stage 2 — The synthetic control#

For a candidate \(\mathcal{S}\), the counterfactual is a weighted donor combination. The default is the Augmented SCM [BMFR2021]. Each period is first centred by the donor mean \(\mu_t \coloneqq N_0^{-1}\sum_{j \in \mathcal{N}_0}\! y_{jt}\) (the augsynth intercept), giving \(\widetilde{\mathbf{y}}^{\mathcal{S}} = \mathbf{y}^{\mathcal{S}} - \boldsymbol{\mu}\) and \(\widetilde{\mathbf{Y}}_0 = \mathbf{Y}_0 - \boldsymbol{\mu}\mathbf{1}^\top\). A base simplex SCM is solved on the pre-period,

\[\mathbf{w}^{\mathrm{scm}} \in \operatorname*{argmin}_{\mathbf{w} \in \Delta^{N_0}} \bigl\| \widetilde{\mathbf{y}}^{\mathcal{S}}_{\mathcal{T}_1} - \widetilde{\mathbf{Y}}_{0,\mathcal{T}_1}\mathbf{w} \bigr\|_2^2, \quad \Delta^{N_0} \coloneqq \{\mathbf{w} \in \mathbb{R}_{\ge 0}^{N_0} : \|\mathbf{w}\|_1 = 1\},\]

then ridge-augmented to close the residual pre-period imbalance,

\[\mathbf{w}^\ast = \mathbf{w}^{\mathrm{scm}} + \widetilde{\mathbf{Y}}_{0,\mathcal{T}_1}^{\top} \bigl(\widetilde{\mathbf{Y}}_{0,\mathcal{T}_1}\widetilde{\mathbf{Y}}_{0,\mathcal{T}_1}^{\top} + \lambda \mathbf{I}\bigr)^{+} \bigl(\widetilde{\mathbf{y}}^{\mathcal{S}}_{\mathcal{T}_1} - \widetilde{\mathbf{Y}}_{0,\mathcal{T}_1}\mathbf{w}^{\mathrm{scm}}\bigr),\]

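with the penalty \(\lambda\) chosen by leave-one-period-out cross-validation. Writing \(\mathcal{T}_2\) for the post-treatment window (the periods after \(T_0\), or the pseudo-treatment block of Stage 3), the counterfactual and gap follow the canon,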

\[\widehat{y}^{\mathcal{S}}_t \coloneqq \bigl(\mathbf{Y}_0\mathbf{w}^\ast\bigr)_t, \qquad \tau_t \coloneqq y^{\mathcal{S}}_t - \widehat{y}^{\mathcal{S}}_t, \qquad \widehat{\tau} \coloneqq |\mathcal{T}_2|^{-1}\!\!\sum_{t \in \mathcal{T}_2}\! \tau_t .\]

The augment=None variant is the plain simplex SCM with an explicit intercept \(\alpha = \operatorname{mean}_{\mathcal{T}_1}(\mathbf{y}^{\mathcal{S}} - \mathbf{Y}_0\mathbf{w}^\ast)\), predicting \(\widehat{y}^{\mathcal{S}}_t = \alpha + (\mathbf{Y}_0\mathbf{w}^\ast)_t\).
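The centring, base simplex fit, and ridge augmentation above can be sketched as follows. This is a sketch under stated assumptions, not the mlsynth or augsynth implementation: the simplex problem is solved by plain projected gradient descent, \(\lambda\) is fixed at 1.0 rather than cross-validated, and all function names and data are hypothetical.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def simplex_scm(y, Y0, iters=5000):
    """Base SCM: minimise ||y - Y0 w||^2 over the simplex by projected gradient."""
    w = np.full(Y0.shape[1], 1.0 / Y0.shape[1])
    step = 1.0 / np.linalg.norm(Y0, 2) ** 2       # 1 / largest squared singular value
    for _ in range(iters):
        w = project_simplex(w - step * (Y0.T @ (Y0 @ w - y)))
    return w

def ridge_augment(y, Y0, w_scm, lam):
    """w* = w_scm + Y0' (Y0 Y0' + lam I)^+ (y - Y0 w_scm), in period space."""
    resid = y - Y0 @ w_scm
    gram = Y0 @ Y0.T + lam * np.eye(Y0.shape[0])
    return w_scm + Y0.T @ (np.linalg.pinv(gram) @ resid)

# Toy pre-period panel: T0 = 30 periods, N0 = 8 donors (synthetic numbers).
rng = np.random.default_rng(0)
Y0_raw = rng.normal(100.0, 10.0, size=(30, 8))
y_raw = Y0_raw[:, :3].mean(axis=1) + rng.normal(0.0, 1.0, size=30)

mu = Y0_raw.mean(axis=1)                          # per-period donor mean
y_c, Y0_c = y_raw - mu, Y0_raw - mu[:, None]      # centred target and donors

w_scm = simplex_scm(y_c, Y0_c)
w_star = ridge_augment(y_c, Y0_c, w_scm, lam=1.0) # lam fixed here, not cross-validated
print(np.round(w_star, 3), w_star.sum())
```

Because each row of the centred donor matrix sums to zero, the ridge correction adds weights that sum to zero, so \(\mathbf{w}^\ast\) still sums to one even though individual entries may leave the simplex. For any \(\lambda > 0\) the augmented pre-period residual is no larger than the base SCM residual.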

Pre-fit quality is the scaled L2 imbalance — the fitted pre-period imbalance relative to the imbalance of uniform donor weights \(\mathbf{w}^{\mathrm{unif}} \coloneqq N_0^{-1}\mathbf{1}\),

\[\kappa(\mathcal{S}) \coloneqq \frac{\bigl\|\mathbf{Y}_{0,\mathcal{T}_1}\mathbf{w}^\ast - \mathbf{y}^{\mathcal{S}}_{\mathcal{T}_1}\bigr\|_2} {\bigl\|\mathbf{Y}_{0,\mathcal{T}_1}\mathbf{w}^{\mathrm{unif}} - \mathbf{y}^{\mathcal{S}}_{\mathcal{T}_1}\bigr\|_2} \;\in\; [0, \infty),\]

so \(\kappa = 0\) is a perfect fit and \(\kappa = 1\) is no better than the donor average. It is unitless (hence comparable across regions of different magnitudes) and is reported per candidate.
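As a quick check of the two reference points, the ratio can be computed directly. The helper name `scaled_l2` and the toy data are assumptions for this sketch.

```python
import numpy as np

def scaled_l2(y_pre, Y0_pre, w):
    """Pre-period imbalance of weights w relative to uniform donor weights."""
    n0 = Y0_pre.shape[1]
    w_unif = np.full(n0, 1.0 / n0)
    fitted = np.linalg.norm(Y0_pre @ w - y_pre)
    baseline = np.linalg.norm(Y0_pre @ w_unif - y_pre)
    return fitted / baseline

# Toy pre-period: the target is an exact donor combination (synthetic numbers).
rng = np.random.default_rng(1)
Y0_pre = rng.normal(50.0, 5.0, size=(25, 3))
w_true = np.array([0.7, 0.3, 0.0])
y_pre = Y0_pre @ w_true

kappa_perfect = scaled_l2(y_pre, Y0_pre, w_true)            # exact fit
kappa_uniform = scaled_l2(y_pre, Y0_pre, np.full(3, 1 / 3)) # donor average
print(kappa_perfect, kappa_uniform)
```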

Stage 3 — Power simulation#

Power is estimated by placebo-in-time experiments carved from the end of the pre-period. For a treatment duration \(\ell\) (a durations entry) and a lookback placement \(s \in \{1, \dots, L\}\) (with \(L =\) lookback_window), the pseudo-treatment window is the \(\ell\) periods ending \(s - 1\) before \(T_0\),

\[\mathcal{T}^{(s,\ell)}_{2} \coloneqq \{\,T_0 - \ell - s + 2,\ \dots,\ T_0 - s + 1\,\}, \qquad \mathcal{T}^{(s,\ell)}_{1} \coloneqq \{1, \dots, T_0 - \ell - s + 1\},\]

faithful to GeoLift’s max_time - tp - sim + 2 with \(T_0\) the pre-period end. The SCM is fit on \(\mathcal{T}^{(s,\ell)}_{1}\). For an effect size \(\delta\) (an effect_sizes entry) a known multiplicative lift is injected on the pseudo-post block, \(y^{\mathcal{S},(\delta)}_t = (1+\delta)\,y^{\mathcal{S}}_t\) for \(t \in \mathcal{T}^{(s,\ell)}_2\), which shifts the gap by \(\delta\,y^{\mathcal{S}}_t\) there.
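The window arithmetic and the lift injection are easy to get off by one, so a small sketch with 1-indexed periods may help. The function names are hypothetical, and the flat outcome series is a toy input.

```python
def placebo_windows(T0, duration, s):
    """Return (pre, post) period lists, 1-indexed, for lookback placement s."""
    post_start = T0 - duration - s + 2
    post_end = T0 - s + 1
    if post_start < 2:
        raise ValueError("not enough pre-period for this duration and placement")
    pre = list(range(1, post_start))              # 1 .. T0 - duration - s + 1
    post = list(range(post_start, post_end + 1))  # duration periods
    return pre, post

def inject_lift(y, post, delta):
    """Multiply the series by (1 + delta) on the pseudo-post block only."""
    post = set(post)
    return [(1 + delta) * v if t in post else v for t, v in enumerate(y, start=1)]

pre1, post1 = placebo_windows(T0=80, duration=14, s=1)   # post = 67..80
pre3, post3 = placebo_windows(T0=80, duration=14, s=3)   # post = 65..78

y = [100.0] * 80                                         # toy flat series
lifted = inject_lift(y, post1, delta=0.10)
print(post1[0], post1[-1], post3[0], post3[-1])
```

Note that the pre-period values are untouched by the injection, which is the property the fit-once optimization relies on.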

Detection uses the conformal test [CWZ2021]. With the post-block statistic

\[S_q(\boldsymbol{\tau}) \coloneqq \Bigl( |\mathcal{T}_2|^{-1/2} \textstyle\sum_{t \in \mathcal{T}_2} |\tau_t|^{q} \Bigr)^{1/q} \qquad (q = 1 \text{ by default}),\]

the joint-null p-value compares the observed statistic to \(n_s\) i.i.d. permutations \(\Pi\) of the residual path,

\[p \coloneqq \frac{1}{n_s}\sum_{\Pi} \mathbf{1}\!\bigl\{\, S_q(\boldsymbol{\tau}_{\mathcal{T}_2}) \le S_q\bigl((\Pi\boldsymbol{\tau})_{\mathcal{T}_2}\bigr) \,\bigr\},\]

and an effect is detected when \(p < \alpha\). The permutation set \(\Pi\) follows conformal_type: "iid" (the augsynth/GeoLift default — \(n_s\) independent draws) or "block" (the \(T\) moving-block cyclic shifts \(\Pi_k(t) = ((t + k) \bmod T)\), which preserve serial dependence and are deterministic, ignoring \(n_s\)). Power is the detection rate across the \(L\) lookback placements,

\[\beta(\mathcal{S}, \ell, \delta) \coloneqq \frac{1}{L}\sum_{s=1}^{L} \mathbf{1}\!\bigl\{\, p^{(s)}(\mathcal{S}, \ell, \delta) < \alpha \,\bigr\}.\]
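A compact sketch of the statistic, the permutation p-value, and the power average defined above. It takes the residual path as given, which is a simplification: it does not perform the refit under the null that produces those residuals. Function names are hypothetical.

```python
import numpy as np

def S_q(u, q=1):
    """Post-block statistic: (|T2|^(-1/2) * sum |u_t|^q)^(1/q)."""
    u = np.asarray(u, dtype=float)
    return (np.sum(np.abs(u) ** q) / np.sqrt(len(u))) ** (1.0 / q)

def conformal_p(resid, n_post, q=1, conformal_type="iid", ns=1000, seed=0):
    """Share of permuted paths whose post-block statistic is >= the observed one."""
    resid = np.asarray(resid, dtype=float)
    observed = S_q(resid[-n_post:], q)
    if conformal_type == "block":
        # All T cyclic shifts: deterministic, ns is ignored.
        perms = [np.roll(resid, k) for k in range(len(resid))]
    elif conformal_type == "iid":
        rng = np.random.default_rng(seed)
        perms = [rng.permutation(resid) for _ in range(ns)]
    else:
        raise ValueError("conformal_type must be 'iid' or 'block'")
    stats = np.array([S_q(p[-n_post:], q) for p in perms])
    return float(np.mean(stats >= observed))

def power(p_values, alpha=0.1):
    """Detection rate across lookback placements."""
    return float(np.mean(np.asarray(p_values) < alpha))

# Toy residual path: 20 small pre-period residuals, 3 large post-period ones.
resid = np.array([0.1, -0.1] * 10 + [5.0, 5.0, 5.0])
p_block = conformal_p(resid, n_post=3, conformal_type="block")
p_iid = conformal_p(resid, n_post=3, conformal_type="iid", ns=1000, seed=0)
beta = power([0.01, 0.2, 0.04], alpha=0.1)
print(p_block, p_iid, beta)
```

With cyclic shifts the identity is always among the permutations, so the block p-value cannot fall below \(1/T\); in this toy path only the identity shift matches the observed statistic.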

Fit-once, sweep-\(\delta\) (an exact optimization)

The injection touches only the post block, so the pre-period the cross-validation sees is identical across effect sizes; the CV-selected \(\lambda\) is therefore the same for every \(\delta\). mlsynth cross-validates once per \((\mathcal{S}, \ell, s)\) and reuses \(\lambda\) across \(\delta\) (augsynth’s own behaviour). This is provably identical to GeoLift’s per-\(\delta\) refit — pinned by test_simulate_lookback_cv_once_equals_per_es_refit — at \(1/|\{\delta\}|\) the cross-validation cost.

Stage 4 — MDE and the composite rank#

The minimum detectable effect for a region/duration is the smallest-magnitude effect whose power clears the threshold \(\beta_0\) (power_threshold, default \(0.8\)),

\[\delta^\ast(\mathcal{S}, \ell) \coloneqq \operatorname*{arg\,min}_{\delta \,:\, \beta(\mathcal{S},\ell,\delta) \ge \beta_0} |\delta|,\]

with GeoLift’s signed positive/negative tie rule. Writing \(\widehat{\delta}\) for the recovered lift at \(\delta^\ast\) and the recovery error \(\eta(\mathcal{S},\ell) \coloneqq |\widehat{\delta} - \delta^\ast|\), the composite rank is the mean of three dense ranks (\(\operatorname{dr}\) over the surviving candidates), faithful to GeoLift,

\[r(\mathcal{S}, \ell) = \operatorname{rank}\!\left(\tfrac13\Bigl[ \operatorname{dr}\!\bigl(|\delta^\ast|\bigr) + \operatorname{dr}\!\bigl(\beta\bigr) + \operatorname{dr}\!\bigl(\eta\bigr) \Bigr]\right), \qquad \text{(lower is better).}\]

Note

Two GeoLift-fidelity quirks, replicated as-is: \(\operatorname{dr}(\beta)\) is ascending (an MDE whose power sits just above \(\beta_0\) is a tighter estimate of the threshold, so it ranks better), and the scaled L2 imbalance \(\kappa\) is not a ranking term; only \(\delta^\ast\), \(\beta\), and \(\eta\) enter. Both behaviours are documented, and each takes a one-line change to alter.
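The MDE selection and the composite rank can be sketched in pure Python. Two assumptions are made here and should be read as such: GeoLift's signed tie rule for the MDE is not reproduced (ties in \(|\delta|\) resolve to whichever entry comes first), and ties in the outer rank are resolved with the minimum rank, consistent with tied ranks such as 1, 1, 3. All names and numbers are hypothetical.

```python
def mde(power_by_delta, beta0=0.8):
    """Smallest-magnitude effect size whose power clears beta0, or None."""
    eligible = [d for d, b in power_by_delta.items() if b >= beta0]
    return min(eligible, key=abs) if eligible else None

def dense_rank(values):
    """Ascending dense rank: equal values share a rank, no gaps."""
    order = {v: r for r, v in enumerate(sorted(set(values)), start=1)}
    return [order[v] for v in values]

def min_rank(scores):
    """Rank with ties sharing the lowest position (assumed outer tie rule)."""
    return [1 + sum(other < s for other in scores) for s in scores]

def composite_rank(mdes, powers, errors):
    """Mean of three dense ranks, then ranked; lower is better."""
    dr_mde = dense_rank([abs(m) for m in mdes])
    dr_pow = dense_rank(powers)                  # ascending, as in the note above
    dr_err = dense_rank(errors)
    scores = [(a + b + c) / 3 for a, b, c in zip(dr_mde, dr_pow, dr_err)]
    return min_rank(scores)

delta_star = mde({0.0: 0.10, 0.05: 0.60, 0.10: 0.85, 0.15: 0.95})
ranks = composite_rank(mdes=[0.10, 0.10, 0.20],
                       powers=[0.85, 0.90, 0.85],
                       errors=[0.01, 0.01, 0.05])
tied = composite_rank(mdes=[0.10, 0.10, 0.20],
                      powers=[0.90, 0.90, 0.95],
                      errors=[0.0, 0.0, 0.02])
print(delta_star, ranks, tied)
```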

Identifying assumptions#

  1. Pre-period synthesizability. The aggregate \(\mathbf{y}^{\mathcal{S}}\) lies in (or near) the span/convex hull of the donor pool over \(\mathcal{T}_1\). Quantified by \(\kappa(\mathcal{S})\): a low value certifies that the synthetic tracks the region, the prerequisite for a credible counterfactual.

    Remark. With how="sum" the target is \(k\times\) donor scale and can sit outside the convex hull, inflating \(\kappa\) toward 1; how="mean" restores synthesizability. The choice is the user’s.

  2. Exchangeability under the null. The conformal test treats the residual path as exchangeable under \(H_0\): no effect, which the all-period refit underlying \(p\) is designed to deliver [CWZ2021].

  3. Stationarity of the placebo windows. Power from the lookback placements transports to the real experiment only if the pre-period dynamics resemble the experiment window — the usual SC stability assumption.

Budget planning (CPIC)#

Setting cpic (cost per incremental conversion) turns each candidate’s MDE into a spend, faithful to GeoLiftMarketSelection:

\[\mathrm{investment} \;=\; \mathrm{cpic} \times \delta \times \sum_{i \in \mathcal{S}} \sum_{t \in \mathcal{W}} y_{it},\]

i.e. cost-per-incremental \(\times\) effect size \(\times\) the summed treated volume over the (lookback) window — the baseline outcome, on the total scale, independent of the mean-of-units fit. The shortlist carries an investment column; supplying budget drops candidates whose detectable investment exceeds it (GeoLift’s abs(budget) > abs(Investment) gate). The realized report adds the post-test cost \(= \mathrm{cpic} \times\) incremental outcome. The investment is a deterministic data transform, so it matches GeoLiftMarketSelection to the cent (durable case geolift_cpic); ROI (a value/margin per conversion) is a planned extension beyond GeoLift’s cost-only cpic.
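A worked instance of the investment formula and the budget gate, using made-up volumes for a two-market region over a 15-day window. The function names are hypothetical; the gate follows the abs(budget) > abs(Investment) comparison quoted above.

```python
def investment(cpic, delta, treated_volume):
    """cpic x effect size x summed treated outcome over the window."""
    return cpic * delta * treated_volume

def within_budget(inv, budget):
    """Keep a candidate only if its detectable investment fits the budget."""
    return abs(budget) > abs(inv)

# Toy volumes: daily outcomes per treated market over a 15-day window.
window = {
    "market_a": [2000.0] * 15,
    "market_b": [1500.0] * 15,
}
volume = sum(sum(series) for series in window.values())   # 52,500 on the total scale

inv = investment(cpic=7.50, delta=0.10, treated_volume=volume)
keep = within_budget(inv, budget=100000.0)
print(volume, inv, keep)
```

Here the region needs about 39,375 in spend to produce a 10% lift at a cost of 7.50 per incremental conversion, so it survives a 100,000 budget.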

Inference and the realized design#

The design phase reports \(\delta^\ast\) and \(\beta\) per candidate. When a post_col leaves a post window, GEOLIFT.fit() realizes the winning design under the hood — applying the winner’s pre-period weights \(\mathbf{w}^\ast\) to the full panel and running conformal inference: the per-period effect \(\tau_t\) for \(t \in \mathcal{T}_2\), prediction intervals by test inversion (a grid of nulls \(\tau_0\), the interval being the non-rejected range at level \(\alpha\)), and the joint-null p-value \(p\) — exposed on result.report (the DesignResult resolving to its EffectResult). A design over a no-effect post window returns \(p\) non-significant and intervals covering zero.

Pipeline and Options#

The estimator is a thin front door over the helper pipeline; each stage above is a tested leaf in mlsynth/utils/geolift_helpers/. The public surface is a single GEOLIFT.fit() — realization and plotting are handled inside it, driven by the data and config (a post_col triggers realization; display_graphs triggers the plot).

  • treatment_size \(k\), durations \(\{\ell\}\), effect_sizes \(\{\delta\}\), lookback_window \(L\).

  • to_be_treated / not_to_be_treated: \(\mathcal{S}_{\mathrm{in}}\) / \(\mathcal{S}_{\mathrm{out}}\).

  • post_col — a 0/1 column marking post-treatment periods; the design slices to \(\mathcal{T}_1\), so it is identical whether you pass the full post-treatment panel or a pre-only one (the “rerun after treatment” invariance). Different post lengths simply change \(T_0\).

  • how (\(\operatorname{sum}\) / \(\operatorname{mean}\)), augment ("ridge" / None), alpha \(\alpha\), power_threshold \(\beta_0\), ns \(n_s\), run_stochastic / stochastic_mode.

  • conformal_type — the conformal permutation scheme, "iid" (default, matching GeoLift) or "block" (moving-block cyclic shifts for serially-dependent residuals; GeoLift’s conformal_type = "block" option).

Scanning several durations yields an MDE per duration (“\(\ell = 7\) detects 10%, but \(\ell = 14\) is needed for 5%”):

res = GEOLIFT({..., "durations": [7, 14, 21]}).fit()
res.power[["candidate", "duration", "mde", "power"]]   # one row per (S, l)

Plotting#

With display_graphs (default True), GEOLIFT.fit() plots the recommended design in the mlsynth house style (mlsynth.utils.plotting.mlsynth_style()): the design phase shows \(\mathbf{y}^{\mathcal{S}}\) vs \(\widehat{\mathbf{y}}^{\mathcal{S}}\) over \(\mathcal{T}_1\); the post phase (when the design was realized) adds the conformal band and the per-period gap \(\tau_t\) over \(\mathcal{T}_2\), with the intervention line at \(T_0\). The standalone helper mlsynth.utils.geolift_helpers.marketselect.plotter.plot_design() re-draws from a result on demand.

Example: GeoLift’s 40-Market Panel#

The package ships GeoLift’s example panel (basedata/geolift_market_data.csv): \(N = 40\) markets over \(T = 90\) days. We design a \(k = 3\) test region, then realize it over a 10-day no-effect post window (so the realized effect should be null).

import pandas as pd
from mlsynth import GEOLIFT

df = pd.read_csv(
    "https://raw.githubusercontent.com/jgreathouse9/mlsynth/"
    "refs/heads/main/basedata/geolift_market_data.csv"
)
df["post"] = df["date"].isin(sorted(df["date"].unique())[-10:]).astype(int)

geo = GEOLIFT({
    "df": df, "outcome": "Y", "unitid": "location", "time": "date",
    "post_col": "post",                 # design on the pre-80, post-10 reserved
    "treatment_size": 3, "durations": [14], "effect_sizes": [0.0, 0.1, 0.2],
    "lookback_window": 3, "how": "mean", "augment": "ridge", "ns": 100,
})
res = geo.fit()              # designs, auto-realizes (post window) and plots
print(res.selected_units, res.search.winner.mde, res.search.winner.power)
print("joint conformal p:", round(res.report.inference.p_value, 3))   # ~0.66 -> null

Without a post_col the same call returns a design-only result (res.report is None); set display_graphs=False to suppress the plot.

The synthetic tracks the test markets, \(\widehat{\tau} \approx 0\), and the joint conformal p-value is far from significant — the correct null over a placebo post period.

Every option at once — the full market-selection call#

GeoLift’s GeoLiftMarketSelection exposes the whole design surface in one call:

MarketSelections <- GeoLiftMarketSelection(data = GeoTestData_PreTest,
    treatment_periods = c(10, 15),
    N = c(2, 3, 4, 5),
    Y_id = "Y", location_id = "location", time_id = "time",
    effect_size = seq(0, 0.2, 0.05),
    lookback_window = 1,
    include_markets = c("chicago"),
    exclude_markets = c("honolulu"),
    cpic = 7.50, budget = 100000,
    alpha = 0.1, Correlations = TRUE,
    fixed_effects = TRUE, side_of_test = "two_sided")

Every argument maps onto a GEOLIFT config field:

import pandas as pd
from mlsynth import GEOLIFT

df = pd.read_csv(                                      # GeoLift_PreTest, 40 mkts x 90d
    "https://raw.githubusercontent.com/jgreathouse9/mlsynth/"
    "refs/heads/main/basedata/geolift_market_data.csv"
)

cfg = {
    "df": df, "outcome": "Y", "unitid": "location", "time": "date",
    "durations": [10, 15],                       # treatment_periods
    "effect_sizes": [0.0, 0.05, 0.10, 0.15, 0.20],
    "lookback_window": 1,
    "to_be_treated": ["chicago"],                # include_markets
    "not_to_be_treated": ["honolulu"],           # exclude_markets
    "cpic": 7.50, "budget": 100000.0,            # budget planning
    "alpha": 0.1,
    "fixed_effects": True,                       # GeoLift default
    "how": "mean", "augment": "ridge",
    "conformal_type": "iid",                     # two-sided |stat| permutation
    "display_graphs": False,
}
# N = c(2,3,4,5): GEOLIFT takes one treatment_size; scan by looping it.
shortlist = pd.concat(
    GEOLIFT({**cfg, "treatment_size": k}).fit().search.shortlist for k in (2, 3, 4, 5)
).sort_values("rank")
shortlist[["candidate", "duration", "mde", "power", "investment", "scaled_l2", "rank"]]

A few argument notes: Correlations=TRUE is GeoLift’s correlation ranking, which mlsynth always uses to nominate candidates (Stage 1); side_of_test = "two_sided" is the default conformal statistic (the symmetric \(|x|^q\) norm); cpic + budget drop candidates whose detectable investment busts the budget (see Budget planning above).

The recommended designs (GeoLift’s BestMarkets table; mlsynth’s shortlist carries the same candidates and reproduces Investment to the cent, see GEOLIFT — Meta’s GeoLift walkthrough (augsynth cross-validation)):

| Test markets | Dur | Investment | AvgATT | L2 | Rank |
|---|---|---|---|---|---|
| chicago, cincinnati, houston, portland | 15 | 74,118.38 | 159.36 | 0.197 | 1 |
| chicago, portland | 15 | 64,563.75 | 290.01 | 0.174 | 1 |
| chicago, cincinnati, houston, portland | 10 | 99,027.75 | 316.62 | 0.197 | 3 |
| chicago, portland | 10 | 43,646.25 | 300.94 | 0.168 | 3 |
| chicago, houston, portland | 10 | 75,389.25 | 350.31 | 0.231 | 5 |
| chicago, cincinnati, houston, nashville, san d. | 15 | 95,755.50 | 146.80 | 0.270 | 6 |
| atlanta, chicago | 15 | 81,348.75 | 336.78 | 0.446 | 7 |
| atlanta, chicago, cleveland, las vegas | 15 | 86,661.75 | 220.82 | 0.532 | 7 |

(28 designs total; the maximum surviving investment is $99,321.75 < the $100k budget, so the budget gate held.) Lower Rank is better; the smallest, lowest-imbalance region that clears power wins, here chicago, portland.

Reading the results — plots, tables, and weights#

Everything the design and the realized report produce lives on the result object. The complete, runnable example below loads the data, realizes a design with a budget (cpic), and then draws every view — the power / MDE / budget table, observed vs synthetic, the effect with its conformal band, the donor weights, the realized cost, and the built-in plot:

import pandas as pd
import matplotlib.pyplot as plt
from mlsynth import GEOLIFT
from mlsynth.utils.geolift_helpers.marketselect.plotter import plot_design

url = ("https://raw.githubusercontent.com/jgreathouse9/mlsynth/"
       "refs/heads/main/basedata/geolift_test_data.csv")
df = pd.read_csv(url)                                   # GeoLift_Test: 40 mkts x 105 days
dates = sorted(df["date"].unique())
df["post"] = df["date"].isin(dates[90:]).astype(int)   # last 15 days = treatment window

res = GEOLIFT({
    "df": df, "outcome": "Y", "unitid": "location", "time": "date",
    "treatment_size": 2, "to_be_treated": ["chicago", "portland"],
    "durations": [15], "effect_sizes": [0.0, 0.10], "post_col": "post",
    "cpic": 7.50, "how": "mean", "fixed_effects": True, "display_graphs": False,
}).fit()

# headline + the power / MDE / budget shortlist (one row per candidate x duration)
print(res.selected_units, res.report.effects.att, res.report.inference.p_value)
print(res.power[["candidate", "duration", "mde", "power", "investment"]])
print(res.report.weights.summary_stats["cost"])        # realized spend = cpic x incremental

# observed vs synthetic
ts = res.report.time_series
plt.plot(ts.time_periods, ts.observed_outcome, "k", label="observed")
plt.plot(ts.time_periods, ts.counterfactual_outcome, "r--", label="synthetic")
plt.axvline(ts.intervention_time, color="grey", ls=":"); plt.legend(); plt.show()

# the effect (gap) with its conformal prediction band
d = res.report.inference.details
plt.plot(ts.time_periods, ts.estimated_gap, "k")
plt.fill_between(d["periods"], d["lower"], d["upper"], alpha=0.3)
plt.axhline(0, color="grey", ls=":"); plt.show()

# donor weights, biggest contributors
w = res.report.weights.donor_weights
top = dict(sorted(w.items(), key=lambda kv: -abs(kv[1]))[:10])
plt.bar(top.keys(), top.values()); plt.xticks(rotation=45, ha="right"); plt.show()

# the built-in plot (design + realized phases: conformal band + gap)
plot_design(res, report=res.report, show=True)

For the full power-vs-effect-size curve of a region (GeoLift’s GeoLiftPower plot) — power rising through the threshold marks the MDE — run the scoring helpers directly:

import pandas as pd
import matplotlib.pyplot as plt
from mlsynth.utils.datautils import geoex_dataprep
from mlsynth.utils.geolift_helpers.marketselect.helpers.batch import run_simulations
from mlsynth.utils.geolift_helpers.marketselect.helpers.aggregate import compute_power

url = ("https://raw.githubusercontent.com/jgreathouse9/mlsynth/"
       "refs/heads/main/basedata/geolift_market_data.csv")
Ywide = geoex_dataprep(pd.read_csv(url), "location", "date", "Y")["Ywide"]
cube = run_simulations(
    Ywide, [frozenset({"chicago", "portland"})], durations=[15],
    lookback_window=1, effect_sizes=[0.0, 0.05, 0.10, 0.15, 0.20],
    fixed_effects=True, ns=500)
pw = compute_power(cube, alpha=0.10)
plt.plot(pw["effect_size"], pw["power"], marker="o")
plt.axhline(0.8, ls="--", color="grey")                # power threshold -> MDE crossing
plt.xlabel("injected lift"); plt.ylabel("power"); plt.show()

Multi-cell designs#

A multi-cell experiment runs several treatments at once — different channels, budgets, or creatives — each on its own group of geos (“cells” \(A, B, \dots\)), all measured against a shared control pool over the same window (GeoLift’s GeoLiftMultiCell). The dedicated estimator is mlsynth.MULTICELLGEOLIFT; its data model is a unit-level cell-membership column ("A" / "B" / … for treated geos; blank or a control_label for controls) plus a post_col window:

import pandas as pd
from mlsynth import MULTICELLGEOLIFT

url = ("https://raw.githubusercontent.com/jgreathouse9/mlsynth/"
       "refs/heads/main/basedata/geolift_test_data.csv")
df = pd.read_csv(url)                                   # GeoLift_Test: 40 mkts x 105 days
dates = sorted(df["date"].unique())
df["post"] = df["date"].isin(dates[90:]).astype(int)   # last 15 days = treatment window

#   cell:  "A" -> social-media markets, "B" -> paid-search markets, "" -> control
cell = {"chicago": "A", "portland": "A", "atlanta": "B", "boston": "B"}
df["cell"] = df["location"].map(cell).fillna("")        # blank = shared control pool

res = MULTICELLGEOLIFT({
    "df": df, "outcome": "Y", "unitid": "location", "time": "date",
    "cell_column_name": "cell", "post_col": "post", "fixed_effects": True,
}).fit()

res.cells["A"].effects.att          # cell A's per-unit ATT (a full EffectResult)
res.cells["A"].inference.p_value    # cell A's conformal p
res.comparison                      # pairwise [{cell_a, cell_b, att_diff, winner}, ...]
res.winner                          # cell that wins every comparison, or None

Each cell is measured with the same fixed-effect ASCM + conformal inference as a single cell, excluding the other cells’ markets from its donor pool (they are treated, hence contaminated). The cross-cell winner uses GeoLift’s non-overlapping-CI rule; with one cell it is identical to single-cell GEOLIFT. See MULTICELLGEOLIFT — multi-cell GeoLift analysis for the full treatment, the per-cell plots (plot_multicell), and the augsynth cross-validation.
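The donor-pool rule for multi-cell designs (each cell is fit against the shared controls only, never against another cell's treated markets) can be illustrated with a small helper. This is a sketch of the bookkeeping alone, not of MULTICELLGEOLIFT's internals; the function name and the extra control markets are assumptions.

```python
def cell_donor_pools(cell_of, control_label=""):
    """Map each cell to its treated units and the shared control donors.

    cell_of: dict of unit -> cell label (control_label marks a control unit).
    Hypothetical helper for illustration only.
    """
    controls = [u for u, c in cell_of.items() if c == control_label]
    cells = {}
    for unit, label in cell_of.items():
        if label != control_label:
            cells.setdefault(label, []).append(unit)
    # Every cell shares the same control pool; other cells' units never enter it.
    return {label: {"treated": units, "donors": list(controls)}
            for label, units in cells.items()}

cell_of = {
    "chicago": "A", "portland": "A",
    "atlanta": "B", "boston": "B",
    "denver": "", "miami": "", "austin": "",
}
pools = cell_donor_pools(cell_of)
print(pools["A"])
print(pools["B"])
```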

Verification#

The realized effect report is cross-validated against GeoLift/augsynth value-for-value on the package’s own GeoLift_Walkthrough example: with fixed_effects=True (the default), GEOLIFT reproduces the walkthrough’s per-unit ATT (155.6), percent lift (5.4%), summed incremental (4667), and conformal p-value (0.01). See GEOLIFT — Meta’s GeoLift walkthrough (augsynth cross-validation) for the four ingredients required to match (unit fixed effects, mean-of-units fit target, the all-period conformal refit, and augsynth’s period-space ridge ASCM) and the calibration/placebo evidence behind them. The market-selection stages (no published table) remain a faithful port validated end-to-end on GeoLift’s own data, with each documented divergence (the CV-once optimization proven exact, the corrected per-anchor RNG) available as an opt-in, tested swap.

[BMFR2021]

Ben-Michael, E., Feller, A., & Rothstein, J. (2021). The Augmented Synthetic Control Method. Journal of the American Statistical Association.

[CWZ2021]

Chernozhukov, V., Wüthrich, K., & Zhu, Y. (2021). An Exact and Robust Conformal Inference Method for Counterfactual and Synthetic Controls. Journal of the American Statistical Association.

Core API#

class mlsynth.GEOLIFT(config: GeoLiftConfig | dict)#

GeoLift market-selection design.

Chooses which markets to treat before an experiment by simulating power over the historical (pre-treatment) panel, then – once outcomes are observed (a post_col leaving a pre/post split) – realizes the chosen design into a standardized effect report with conformal inference.

Parameters:

config (GeoLiftConfig or dict) – Configuration. See mlsynth.utils.geolift_helpers.config.GeoLiftConfig.

Examples

>>> from mlsynth import GEOLIFT
>>> res = GEOLIFT({"df": panel, "outcome": "Y", "unitid": "location",
...                "time": "date", "treatment_size": 3, "durations": [14],
...                "effect_sizes": [0.0, 0.1, 0.2]}).fit()
>>> res.selected_units

fit() → GEOLIFTResults#

Run the market-selection design and return the result.

Behaviour is driven by the data and config, not by manual sequencing:

  • always runs the design (candidate nomination -> power -> MDE -> rank -> per-candidate synthetic controls);

  • when a post_col leaves a post-treatment window, realizes the winning design on the full panel under the hood, populating result.report (conformal effect report) – the DesignResult resolving to its EffectResult;

  • when display_graphs is set, plots the recommended design (design phase, or the realized post phase).

class mlsynth.utils.geolift_helpers.config.GeoLiftConfig(*, df: DataFrame, outcome: str, unitid: str, time: str, treatment_size: int, to_be_treated: List | None = None, not_to_be_treated: List | None = None, durations: List[int], effect_sizes: List[float], lookback_window: int = 1, post_col: str | None = None, how: str = 'sum', augment: str | None = 'ridge', fixed_effects: bool = True, alpha: float = 0.1, power_threshold: float = 0.8, cpic: float | None = None, budget: float | None = None, ns: int = 1000, conformal_type: str = 'iid', run_stochastic: bool = False, stochastic_mode: str = 'global', seed: int = 0, display_graphs: bool = True)#

Configuration for the GeoLift market-selection design (GEOLIFT).

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid'}#

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.