---
title: "Getting started with baselinr"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting started with baselinr}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(baselinr)
```

## The problem

Before anyone looks at outcomes, every quasi-experimental impact study has to
answer the same question: *were the treatment and comparison groups similar
enough at baseline?* In education research, the What Works Clearinghouse (WWC)
sets the de facto standard for answering it:

- a covariate whose baseline effect size (Hedges' g, or the Cox index for a
  binary covariate) is **0.05 or less** in absolute value satisfies baseline
  equivalence on its own;
- **above 0.05 and up to 0.25**, equivalence holds only if the covariate is
  statistically adjusted for in the impact model;
- **above 0.25**, the covariate cannot establish equivalence.

`baselinr` computes those effect sizes and categories so the baseline table is
not something you assemble by hand for every report.
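
The three bands above amount to a pair of cut-offs on the absolute effect size.
A minimal base R sketch of that rule follows. `wwc_band_sketch()` and its labels
are illustrative names made up for this vignette, not part of `baselinr`, and
the package's own category labels may differ.

```{r}
# Illustrative only: band absolute effect sizes using the 0.05 / 0.25 cut-offs.
# The function name and the labels are hypothetical, not baselinr's API.
wwc_band_sketch <- function(es) {
  cut(
    abs(es),
    breaks = c(0, 0.05, 0.25, Inf),
    labels = c("satisfied", "adjustment required", "not satisfied"),
    include.lowest = TRUE, # an effect size of exactly 0 lands in the first band
    right = TRUE           # upper bounds are inclusive: (0.05, 0.25]
  )
}

wwc_band_sketch(c(0.03, -0.12, 0.80))
```

The upper bound of each band is inclusive, so an effect size of exactly 0.25
still falls in the adjustment band, and the sign of the difference is ignored.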

## A worked example

```{r}
study <- data.frame(
  treat = c(1, 1, 1, 0, 0, 0),
  pretest = c(5, 6, 7, 4, 5, 6), # continuous: summarized with Hedges' g
  female = c(1, 0, 1, 0, 0, 1)   # binary: summarized with the Cox index
)

baseline_equivalence(study, treatment = "treat")
```

By default, every numeric, logical, and factor column other than the treatment
indicator is used as a covariate. A covariate with exactly two unique values
counts as binary and is summarized with the Cox index; other numeric covariates
are summarized with Hedges' g. Pass `covariates =` to control the set
explicitly.
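
The two-unique-values rule described above can be sketched in base R.
`covariate_type_sketch()` is a hypothetical helper written for illustration,
and dropping missing values before counting is an assumption; the package's
internal check may differ.

```{r}
# Hypothetical helper, not part of baselinr: guess how a numeric covariate
# would be summarized under a "exactly two unique values means binary" rule.
covariate_type_sketch <- function(x) {
  n_values <- length(unique(x[!is.na(x)])) # assumption: NAs are ignored
  if (n_values == 2) "binary (Cox index)" else "continuous (Hedges' g)"
}

toy <- data.frame(
  pretest = c(5, 6, 7, 4, 5, 6),
  female  = c(1, 0, 1, 0, 0, 1)
)
vapply(toy, covariate_type_sketch, character(1))
```

Under a rule like this, a numeric covariate that happens to take only two
distinct values in a small sample would be summarized as binary, so it is worth
checking which statistic each row of the table reports.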

## The building blocks

`baseline_equivalence()` is built from exported helpers you can also call
directly.

```{r}
# Standardized mean difference (Hedges' g) for a continuous covariate
hedges_g(study$pretest, study$treat)

# Cox index for a binary covariate
cox_index(study$female, study$treat)

# Classify any effect size(s) into the WWC categories
wwc_classify(c(0.03, 0.12, 0.80))
```
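
To make the two effect sizes concrete, here is a hand computation in base R
using the formulas as they are commonly stated for the WWC: Hedges' g is the
difference in means divided by the pooled standard deviation, multiplied by the
small-sample correction 1 - 3 / (4N - 9); the Cox index is the difference in
log odds divided by 1.65. `manual_g()` and `manual_cox()` are illustrative
names, not package functions. The sketch assumes a 0/1 treatment indicator, no
missing values, and no small-sample correction on the Cox index, so
`cox_index()` may return a different value if the package applies one.

```{r}
# Toy data matching the worked example: three treated, three comparison cases.
x_cont <- c(5, 6, 7, 4, 5, 6)
x_bin  <- c(1, 0, 1, 0, 0, 1)
grp    <- c(1, 1, 1, 0, 0, 0)

# Hedges' g with the small-sample correction omega = 1 - 3 / (4N - 9).
manual_g <- function(x, treat) {
  xt <- x[treat == 1]
  xc <- x[treat == 0]
  nt <- length(xt)
  nc <- length(xc)
  s_pooled <- sqrt(((nt - 1) * var(xt) + (nc - 1) * var(xc)) / (nt + nc - 2))
  omega <- 1 - 3 / (4 * (nt + nc) - 9)
  omega * (mean(xt) - mean(xc)) / s_pooled
}

# Cox index without a small-sample correction (an assumption of this sketch):
# difference in log odds between groups, divided by 1.65.
manual_cox <- function(x, treat) {
  pt <- mean(x[treat == 1])
  pc <- mean(x[treat == 0])
  (qlogis(pt) - qlogis(pc)) / 1.65
}

manual_g(x_cont, grp)
manual_cox(x_bin, grp)
```

With three cases per group and a pooled standard deviation of 1, the correction
factor is 0.8, so the one-point pretest difference gives g = 0.8. Note that the
log odds are undefined when a group proportion is exactly 0 or 1, a case this
sketch does not handle.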

## Visualise and format

A Love plot shows the standardized effect size of each covariate against the WWC
thresholds (0.05 and 0.25), coloured by category:

```{r loveplot, eval = requireNamespace("ggplot2", quietly = TRUE), fig.width = 7, fig.height = 2.6}
love_plot(baseline_equivalence(study, treatment = "treat"))
```

For a report-ready table, `gt_baseline()` returns a formatted `gt` table:

```{r, eval = FALSE}
gt_baseline(baseline_equivalence(study, treatment = "treat"))
```

## Scope

Continuous covariates use Hedges' g (with the WWC small-sample correction);
binary covariates use the WWC Cox index. Beyond the baseline table itself, you
can collapse it into an overall verdict with `wwc_summary()`, assess sample loss
with `attrition()`, visualise it with `love_plot()`, and format it with
`gt_baseline()`. See `NEWS.md` for the roadmap.