---
jupytext:
  text_representation:
    extension: .md
    format_name: myst
kernelspec:
  display_name: Python 3
  name: python3
---

```{code-cell} python
:tags: [remove-input]
import matplotlib
matplotlib.use("Agg")
import plotly.io as pio
pio.renderers.default = "png"
```

# Bootstrapping & confidence intervals

Resample your data with replacement to build a **bootstrap distribution**, then read a confidence interval off it.

## A bootstrap distribution for a mean

```{code-cell} python
import moderndive as md
from moderndive import specify, get_confidence_interval, visualize, shade_confidence_interval

almonds = md.load_almonds_sample_100()

boot = (
    almonds.specify(response="weight")
    .generate(reps=1000, type="bootstrap", seed=1)
    .calculate(stat="mean")
)
```

## Confidence intervals: three methods

```{code-cell} python
# Percentile method (default)
get_confidence_interval(boot, level=0.95, type="percentile")
# shape: (1, 2) → lower_ci ≈ 3.61, upper_ci ≈ 3.75

# Standard-error method (needs the point estimate)
point = float(almonds.specify(response="weight").calculate(stat="mean"))
get_confidence_interval(boot, level=0.95, type="se", point_estimate=point)

# Bias-corrected
get_confidence_interval(boot, level=0.95, type="bias-corrected", point_estimate=point)
```

Distribution objects also expose the getter as a method:

```{code-cell} python
boot.get_confidence_interval(level=0.95, type="percentile")
```

## Visualize the interval

```{code-cell} python
ci = get_confidence_interval(boot, level=0.95, type="percentile")

# plotly (default), interactive
visualize(boot) + shade_confidence_interval(ci)

# plotnine
visualize(boot, engine="plotnine") + shade_confidence_interval(ci)

# one-call keyword form
visualize(boot, shade_ci=ci)
```

## A confidence interval for a proportion

```{code-cell} python
import polars as pl

mythbusters = md.load_mythbusters_yawn()

boot_prop = (
    mythbusters.specify(response="yawn", success="yes")
    .generate(reps=1000, type="bootstrap", seed=1)
    .calculate(stat="prop")
)

get_confidence_interval(boot_prop, level=0.95, type="percentile")
```

## A confidence interval for a difference

Add an explanatory variable and an `order=` to compare two groups:

```{code-cell} python
boot_diff = (
    mythbusters.specify(formula="yawn ~ group", success="yes")
    .generate(reps=1000, type="bootstrap", seed=1)
    .calculate(stat="diff in props", order=("seed", "control"))
)

boot_diff.get_confidence_interval(level=0.95)
```

```{seealso}
For *theory-based* intervals (the formula-based `t` interval) see {doc}`theory-based`.
```
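
## What the percentile and SE methods compute

To make the percentile and standard-error methods used on this page concrete, the sketch below reproduces the same ideas with plain NumPy. It is an illustration under stated assumptions, not moderndive's implementation: the helper names `bootstrap_means`, `percentile_ci`, and `se_ci` are hypothetical, the `weights` array is synthetic stand-in data rather than the almonds sample, and the SE interval assumes a normal multiplier.

```python
from statistics import NormalDist

import numpy as np


def bootstrap_means(sample, reps=1000, seed=1):
    """Return `reps` means, each from one resample drawn with replacement."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    # Each row of `idx` indexes one resample of the same size as the original.
    idx = rng.integers(0, sample.size, size=(reps, sample.size))
    return sample[idx].mean(axis=1)


def percentile_ci(boot_stats, level=0.95):
    """Percentile method: take the middle `level` share of the bootstrap stats."""
    alpha = 1.0 - level
    lower, upper = np.quantile(boot_stats, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)


def se_ci(boot_stats, point_estimate, level=0.95):
    """SE method: point estimate +/- z * (std. dev. of the bootstrap stats).

    Assumption: a normal (z) multiplier is used for the critical value.
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = float(np.std(boot_stats, ddof=1))
    return point_estimate - z * se, point_estimate + z * se


# Synthetic stand-in data (hypothetical values, not the almonds sample).
weights = np.random.default_rng(0).normal(loc=3.7, scale=0.4, size=100)

boot_stats = bootstrap_means(weights, reps=1000, seed=1)
print(percentile_ci(boot_stats, level=0.95))
print(se_ci(boot_stats, point_estimate=float(weights.mean()), level=0.95))
```

The percentile interval depends only on the bootstrap distribution, while the SE interval is centered on the point estimate, which is why the `type="se"` call needs `point_estimate`.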