Bootstrapping & confidence intervals

Resample your data with replacement to build a bootstrap distribution, then read a confidence interval off it.

A bootstrap distribution for a mean

import moderndive as md
from moderndive import specify, get_confidence_interval, visualize, shade_confidence_interval

almonds = md.load_almonds_sample_100()

boot = (
    almonds.specify(response="weight")
    .generate(reps=1000, type="bootstrap", seed=1)
    .calculate(stat="mean")
)
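
For intuition, the resampling step can be sketched with the standard library alone. This is a minimal illustration of the general technique, not moderndive's implementation: the `bootstrap_means` helper and the `weights` values are made up for this example.

```python
import random
import statistics

def bootstrap_means(sample, reps=1000, seed=1):
    """Return `reps` means, each computed on a resample drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the replicates are reproducible
    n = len(sample)
    # Each replicate has the same size as the original sample
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(reps)]

# Hypothetical weights in grams (not the almonds data)
weights = [3.4, 3.9, 3.6, 3.7, 3.5, 3.8, 3.6, 4.0, 3.3, 3.7]
boot_means = bootstrap_means(weights)

print(len(boot_means))               # one mean per replicate
print(statistics.fmean(boot_means))  # centre of the bootstrap distribution
```

The list of replicate means plays the role of the bootstrap distribution: its spread approximates how much the sample mean would vary from sample to sample.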

Confidence intervals — three methods

# Percentile method (default)
get_confidence_interval(boot, level=0.95, type="percentile")
# shape: (1, 2) → lower_ci ≈ 3.61, upper_ci ≈ 3.75

# Standard-error method (needs the point estimate)
point = float(almonds.specify(response="weight").calculate(stat="mean"))
get_confidence_interval(boot, level=0.95, type="se", point_estimate=point)

# Bias-corrected
get_confidence_interval(boot, level=0.95, type="bias-corrected", point_estimate=point)
shape: (1, 2)
lower_ci  upper_ci
f64       f64
3.613883  3.755272
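
The three methods differ only in how they turn the bootstrap statistics into two endpoints. The sketch below shows the textbook form of each using the standard library. Whether moderndive uses exactly these formulas (in particular the interpolation rule and the `<=` in the bias term) is an assumption, and the helper names and `weights` data are hypothetical.

```python
import random
import statistics
from statistics import NormalDist

def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already-sorted list, 0 <= q <= 1."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

def percentile_ci(boot_stats, level=0.95):
    """Middle `level` share of the bootstrap statistics."""
    s = sorted(boot_stats)
    tail = (1 - level) / 2
    return quantile(s, tail), quantile(s, 1 - tail)

def se_ci(boot_stats, point_estimate, level=0.95):
    """Point estimate plus or minus z times the bootstrap standard error."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = statistics.stdev(boot_stats)  # spread of the bootstrap distribution
    return point_estimate - z * se, point_estimate + z * se

def bias_corrected_ci(boot_stats, point_estimate, level=0.95):
    """Percentile interval whose quantile levels are shifted by the bias term z0."""
    s = sorted(boot_stats)
    norm = NormalDist()
    # Share of bootstrap statistics at or below the point estimate;
    # inv_cdf requires this to be strictly between 0 and 1.
    p = sum(x <= point_estimate for x in s) / len(s)
    z0 = norm.inv_cdf(p)
    z = norm.inv_cdf(1 - (1 - level) / 2)
    return quantile(s, norm.cdf(2 * z0 - z)), quantile(s, norm.cdf(2 * z0 + z))

# Hypothetical data (not the almonds sample)
rng = random.Random(1)
weights = [3.4, 3.9, 3.6, 3.7, 3.5, 3.8, 3.6, 4.0, 3.3, 3.7]
point = statistics.fmean(weights)
boot_means = [statistics.fmean(rng.choices(weights, k=len(weights))) for _ in range(1000)]

pct = percentile_ci(boot_means)
se = se_ci(boot_means, point)
bc = bias_corrected_ci(boot_means, point)
for name, ci in [("percentile", pct), ("se", se), ("bias-corrected", bc)]:
    print(name, ci)
```

The standard-error interval is symmetric around the point estimate by construction, while the percentile and bias-corrected intervals follow the shape of the bootstrap distribution. This is why only the latter two can come out asymmetric.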

Distribution objects also expose the getter as a method:

boot.get_confidence_interval(level=0.95, type="percentile")
shape: (1, 2)
lower_ci  upper_ci
f64       f64
3.606975  3.75

Visualize the interval

ci = get_confidence_interval(boot, level=0.95, type="percentile")

# plotly (default) — interactive
visualize(boot) + shade_confidence_interval(ci)

# plotnine
visualize(boot, engine="plotnine") + shade_confidence_interval(ci)

# one-call keyword form
visualize(boot, shade_ci=ci)

A confidence interval for a proportion

import polars as pl

mythbusters = md.load_mythbusters_yawn()

boot_prop = (
    mythbusters.specify(response="yawn", success="yes")
    .generate(reps=1000, type="bootstrap", seed=1)
    .calculate(stat="prop")
)
get_confidence_interval(boot_prop, level=0.95, type="percentile")
shape: (1, 2)
lower_ci  upper_ci
f64       f64
0.1595    0.4
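
The same idea applies to a proportion: resample the responses with replacement and record the share of successes each time. The sketch below is a standard-library illustration with made-up responses and a hypothetical `bootstrap_props` helper. It picks the interval endpoints as plain order statistics, whereas libraries typically interpolate between neighbouring values.

```python
import random

def bootstrap_props(labels, success, reps=1000, seed=1):
    """Return `reps` success proportions, each from a resample with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    return [
        sum(label == success for label in rng.choices(labels, k=n)) / n
        for _ in range(reps)
    ]

# Hypothetical responses (not the mythbusters data): 12 "yes" out of 40
responses = ["yes"] * 12 + ["no"] * 28
props = sorted(bootstrap_props(responses, success="yes"))

# Simple 95% percentile interval from order statistics (no interpolation)
lower = props[round(0.025 * len(props))]
upper = props[round(0.975 * len(props)) - 1]
print(lower, upper)
```

Each bootstrap proportion is a count divided by the sample size, so the bootstrap distribution only takes values on a grid of step 1/n. An interpolating percentile rule can still return an endpoint that falls between two grid values.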

A confidence interval for a difference

Add an explanatory variable to the formula and pass order= to calculate() to compare two groups:

boot_diff = (
    mythbusters.specify(formula="yawn ~ group", success="yes")
    .generate(reps=1000, type="bootstrap", seed=1)
    .calculate(stat="diff in props", order=("seed", "control"))
)
boot_diff.get_confidence_interval(level=0.95)
shape: (1, 2)
lower_ci   upper_ci
f64        f64
-0.224934  0.316189
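
To see what the two-group bootstrap is doing, the sketch below resamples whole (response, group) rows with replacement and subtracts the two group proportions in each replicate. It assumes that `order` means "first group minus second group", as in the R infer package. The data and the `bootstrap_diff_in_props` helper are made up for illustration and are not moderndive's implementation.

```python
import random

def bootstrap_diff_in_props(rows, success, order, reps=1000, seed=1):
    """Resample rows with replacement; return prop(order[0]) - prop(order[1]) per replicate."""
    rng = random.Random(seed)
    first, second = order
    diffs = []
    for _ in range(reps):
        resample = rng.choices(rows, k=len(rows))
        props = {}
        for group in (first, second):
            in_group = [resp for resp, grp in resample if grp == group]
            # Assumes every resample contains both groups, which holds for these sizes
            props[group] = sum(resp == success for resp in in_group) / len(in_group)
        diffs.append(props[first] - props[second])
    return diffs

# Hypothetical rows (not the mythbusters data)
rows = (
    [("yes", "seed")] * 9 + [("no", "seed")] * 21
    + [("yes", "control")] * 5 + [("no", "control")] * 15
)

diffs = bootstrap_diff_in_props(rows, success="yes", order=("seed", "control"))
swapped = bootstrap_diff_in_props(rows, success="yes", order=("control", "seed"))
print(min(diffs), max(diffs))
```

Swapping the two names in `order` flips the sign of every replicate, so the resulting interval is mirrored around zero rather than changed in width.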

See also

For theory-based intervals (the formula-based t interval), see Theory-based inference.