---
jupytext:
  text_representation:
    extension: .md
    format_name: myst
kernelspec:
  display_name: Python 3
  name: python3
---

```{code-cell} python
:tags: [remove-input]
import matplotlib
matplotlib.use("Agg")
import plotly.io as pio
pio.renderers.default = "png"
```

# Theory-based inference

Alongside the simulation grammar, `moderndive` ships tidy wrappers for the
classical (formula-based) tests and a way to draw the theoretical distributions.
These mirror `t_test`, `prop_test`, `chisq_test`, and `assume()` from the R
package `infer`.

## One-line tests

```{code-cell} python
import moderndive as md
from moderndive import t_test, prop_test, chisq_test

# One-sample t-test
age = md.load_age_at_marriage()
t_test(age, response="age", mu=23)
# → statistic, t_df, p_value, alternative, estimate, lower_ci, upper_ci

# Two-sample (Welch) t-test
movies = md.load_movies_sample()
t_test(movies, formula="rating ~ genre", order=("Action", "Romance"))

# Two-proportion z-test
yawn = md.load_mythbusters_yawn()
prop_test(yawn, formula="yawn ~ group", success="yes", order=("seed", "control"))

# Chi-squared test of independence
chisq_test(yawn, formula="yawn ~ group")
# → statistic, chisq_df, p_value
```
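To make the two-sample call less of a black box, here is a minimal standard-library sketch of what a Welch t-test computes: the t statistic and the Welch-Satterthwaite degrees of freedom. The helper `welch_t` and the ratings are made up for illustration; whether `t_test` follows exactly these steps internally is an assumption.

```python
import math
import statistics


def welch_t(x, y):
    """Return (t statistic, Welch-Satterthwaite df) for two independent samples."""
    nx, ny = len(x), len(y)
    # Sample variances (dividing by n - 1), one per group; no pooling
    vx, vy = statistics.variance(x), statistics.variance(y)
    se2 = vx / nx + vy / ny  # squared standard error of the mean difference
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(se2)
    df = se2**2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df


# Hypothetical ratings, not the movies sample
action = [5.2, 6.1, 4.8, 7.0, 5.5]
romance = [6.8, 7.2, 6.5, 7.9, 6.0, 7.4]

t, df = welch_t(action, romance)
print(f"t = {t:.3f}, df = {df:.2f}")
```

The sign of `t` follows the order of subtraction (first group minus second), which is the role the `order=` argument plays in the calls above.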

`t_stat` and `chisq_stat` return just the test statistic if that's all you need.
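As a rough guide to what the chi-squared statistic measures, the sketch below computes Pearson's statistic by hand from a contingency table, without a continuity correction. The helper `chisq_statistic` and the counts are hypothetical; whether `chisq_stat` applies a correction for 2×2 tables is not stated here.

```python
def chisq_statistic(table):
    """Pearson chi-squared statistic and df for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Hypothetical 2x2 counts (rows: response levels, columns: groups)
table = [[10, 20], [20, 10]]

stat, df = chisq_statistic(table)
print(f"chi-squared = {stat:.3f}, df = {df}")
```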

## Theoretical distributions with `assume()`

`assume()` defines a theoretical sampling distribution you can visualize and use
for p-values without simulating:

```{code-cell} python
from moderndive import assume

# t-distribution with 10 degrees of freedom
t_dist = assume("t", df=10)
t_dist.get_p_value(2.0, direction="right")     # one-sided p-value

# plotly by default; engine="plotnine" for ggplot output
t_dist.visualize()
t_dist.visualize(engine="plotnine")
```

Supported distributions: `"t"`, `"z"`, `"F"` (pass `df=(df1, df2)`), and
`"Chisq"`.
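For intuition about what a theoretical p-value is, the sketch below computes tail areas under the standard normal (the `"z"` case) using only the standard library. The helper `z_p_value` is hypothetical, and the direction names other than `"right"` are assumptions borrowed from `infer`; this is not how `assume()` is implemented.

```python
from statistics import NormalDist


def z_p_value(obs_stat, direction="right"):
    """Tail area under the standard normal curve for an observed statistic."""
    z = NormalDist()  # mean 0, standard deviation 1
    if direction == "right":
        return 1 - z.cdf(obs_stat)
    if direction == "left":
        return z.cdf(obs_stat)
    if direction == "two-sided":
        return 2 * (1 - z.cdf(abs(obs_stat)))
    raise ValueError(f"unknown direction: {direction!r}")


p_right = z_p_value(2.0, "right")      # about 0.0228
p_two = z_p_value(2.0, "two-sided")    # twice the one-sided area
print(round(p_right, 4), round(p_two, 4))
```

A t-distribution with 10 degrees of freedom has heavier tails than the normal, so the same observed statistic gives a somewhat larger p-value there.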

## Overlaying theory on a simulation

`visualize(..., method=...)` can show the simulation histogram, the theoretical
curve, or both:

```{code-cell} python
from moderndive import visualize

boot = (
    age.specify(response="age")
    .generate(reps=1000, type="bootstrap", seed=1)
    .calculate(stat="mean")
)

visualize(boot, method="both")          # histogram + normal-approximation curve
```

```{code-cell} python
visualize(boot, method="theoretical")   # just the curve
```
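The reason the overlay is informative is that, for a mean, the bootstrap distribution and the normal approximation should roughly agree. The standard-library sketch below checks this on made-up data, assuming the theoretical curve is a normal centered at the sample mean with standard error `s / sqrt(n)`; the exact curve `visualize` draws is not specified here.

```python
import math
import random
import statistics
from statistics import NormalDist

rng = random.Random(1)

# Made-up sample standing in for a real data column
sample = [rng.gauss(23, 4) for _ in range(50)]
n = len(sample)

# Bootstrap: resample with replacement, record the mean each time
boot_means = [statistics.mean(rng.choices(sample, k=n)) for _ in range(1000)]

# Normal approximation to the sampling distribution of the mean
se = statistics.stdev(sample) / math.sqrt(n)
theory = NormalDist(mu=statistics.mean(sample), sigma=se)

boot_se = statistics.stdev(boot_means)
print(f"theoretical SE = {theory.stdev:.3f}, bootstrap SE = {boot_se:.3f}")
```

When the two standard errors differ noticeably, for example with small or heavily skewed samples, the histogram and the curve in the `method="both"` plot will visibly disagree.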

## Population standard deviation

`pop_sd` is a small helper that computes the population standard deviation, dividing by `n` rather than `n − 1`:

```{code-cell} python
from moderndive import pop_sd

pop_sd([1, 2, 3, 4, 5])   # → 1.414…
```
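The difference between the two divisors is easy to see with the standard library. The function `pop_sd_sketch` below is a hypothetical re-implementation for illustration, not the source of `pop_sd`.

```python
import math
import statistics


def pop_sd_sketch(values):
    """Population standard deviation: squared deviations averaged over n."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


data = [1, 2, 3, 4, 5]

pop = pop_sd_sketch(data)        # divides by n
samp = statistics.stdev(data)    # divides by n - 1, so it is slightly larger
print(round(pop, 3), round(samp, 3))
```

The standard library's `statistics.pstdev` computes the same population quantity, which makes it a convenient cross-check.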