Introduction to statAPA

Overview

statAPA produces publication-ready statistical tables formatted according to the 7th edition of the American Psychological Association (APA) style guidelines. All functions return an invisible list so results can be stored, piped into apa_to_flextable() for Word export, or left to print to the console via message().

library(statAPA)

1. Descriptive statistics

result <- apa_descriptives(
  mtcars,
  vars  = c("mpg", "wt", "hp"),
  group = "cyl"
)
#> Descriptive Statistics by cyl
#> 
#>  Variable     6: M (SD)      4: M (SD)      8: M (SD)  Total: M (SD)  N
#>       mpg  26.66 (4.51)   19.74 (1.45)   15.10 (2.56)   20.09 (6.03) 32
#>        wt   2.29 (0.57)    3.12 (0.36)    4.00 (0.76)    3.22 (0.98) 32
#>        hp 82.64 (20.93) 122.29 (24.26) 209.21 (50.98) 146.69 (68.56) 32
#> 
#> Note. M = Mean; SD = Standard Deviation; N = number of observations.

The returned list has $descriptives_df (the formatted data frame) and $note (the APA table note).


2. t-Test

# Two-sample Welch t-test: mpg by transmission type
auto   <- mtcars$mpg[mtcars$am == 0]
manual <- mtcars$mpg[mtcars$am == 1]

res <- apa_t_test(auto, manual, output = "console")

3. One-way Analysis of Variance (ANOVA)

mtcars2       <- mtcars
mtcars2$cyl   <- factor(mtcars2$cyl)

res <- apa_anova(lm(mpg ~ cyl, data = mtcars2), es = "partial_eta2")
#> ANOVA Results
#> 
#>     Source     SS df     MS     F      p partial_eta2
#>        cyl 824.78  2 412.39 39.70 < .001        0.732
#>  Residuals 301.26 29  10.39  <NA>   <NA>         <NA>
#> 
#> SS = Sum of Squares; MS = Mean Square; partial_eta2 = partial eta squared. Type II sums of squares.

4. Two-Way Analysis of Variance (ANOVA) with simple effects

mtcars2$gear <- factor(mtcars2$gear)

res <- apa_twoway_anova(
  mpg ~ cyl * gear,
  data    = mtcars2,
  factorA = "cyl",
  factorB = "gear",
  simple_effects = TRUE
)
#> Two-Way ANOVA Results
#> 
#>     Source     SS df     MS     F      p partial_eta2
#>        cyl 349.79  2 174.90 15.60 < .001        0.565
#>       gear   8.25  2   4.13  0.37   .696        0.030
#>   cyl:gear  23.89  3   7.96  0.71   .555        0.082
#>  Residuals 269.12 24  11.21  <NA>   <NA>         <NA>
#> 
#> Note. Two-Way Analysis of Variance (ANOVA).  SS = Sum of Squares; MS = Mean Square;  partial_eta2  =  partial eta squared .  Type  II  sums of squares.  Simple effects of cyl within levels of gear computed via emmeans::joint_tests(). 
#> Simple Effects of cyl Within gear
#> 
#>        Source df1 df2     F    p
#>  cyl | gear=3   2  24  3.08 .065
#>  cyl | gear=4   1  24 12.24 .002
#>  cyl | gear=5   2  24  7.46 .003

5. Analysis of Covariance (ANCOVA) with adjusted means

res <- apa_ancova(
  formula   = mpg ~ cyl + wt,
  data      = mtcars2,
  covariate = "wt",
  focal     = "cyl",
  es        = "partial_eta2"
)
#> ANCOVA Results
#> 
#>     Source     SS df     MS     F      p partial_eta2
#>        cyl  95.26  2  47.63  7.29   .003        0.342
#>         wt 118.20  1 118.20 18.08 < .001        0.392
#>  Residuals 183.06 28   6.54  <NA>   <NA>         <NA>
#> 
#> Note. Analysis of Covariance (ANCOVA) with covariate(s):  wt .  SS = Sum of Squares; MS = Mean Square;  partial_eta2  =  partial eta squared .  Type  II  sums of squares.  Adjusted means estimated at covariate mean(s) via emmeans. 
#> Covariate-Adjusted Means for cyl
#> 
#>  cyl Adjusted M    SE         95% CI
#>    4     23.678 1.043 [21.54, 25.81]
#>    6     19.422 0.969 [17.44, 21.41]
#>    8     17.607 0.903 [15.76, 19.45]

The second table shows covariate-adjusted marginal means for each level of cyl, evaluated at the mean of wt.


6. Multivariate Analysis of Variance (MANOVA)

res <- apa_manova(
  cbind(Sepal.Length, Petal.Length) ~ Species,
  data = iris
)
#> MANOVA Results
#> 
#>   Effect             Test    Stat approx F num df den df      p  eta2
#>  Species           Pillai  0.9885    71.83      4 294.00 < .001 0.494
#>  Species            Wilks  0.0399   292.56      4 292.00 < .001 0.800
#>  Species Hotelling-Lawley 23.3647   846.97      4 290.00 < .001 0.921
#>  Species              Roy 23.3342  1715.06      2 147.00 < .001 0.959
#> 
#> Note. Pillai = Pillai's trace; Wilks = Wilks' lambda;  Hotelling-Lawley = Hotelling-Lawley trace; Roy = Roy's largest root.  approx F = approximate F-statistic; num df = numerator degrees of freedom;  den df = denominator degrees of freedom.  eta2 = eta-squared, approximated from F-ratio and degrees of freedom.  Type II sums of squares.

All four multivariate test statistics are reported: Pillai’s trace, Wilks’ lambda, Hotelling-Lawley trace, and Roy’s largest root.


7. Post-hoc pairwise comparisons

fit <- aov(mpg ~ cyl, data = mtcars2)
res <- apa_posthoc(fit, by = "cyl", adjust = "tukey")
#> Post-hoc Pairwise Comparisons
#> 
#>     Contrast Estimate   SE    t      p        95% CI
#>  cyl4 - cyl6     6.92 1.56 4.44 < .001 [3.07, 10.77]
#>  cyl4 - cyl8    11.56 1.30 8.90 < .001 [8.36, 14.77]
#>  cyl6 - cyl8     4.64 1.49 3.11   .011  [0.96, 8.33]
#> 
#> Note. Estimates are pairwise differences of marginal means. Contrast method = pairwise; adjusted using tukey; confidence level = 0.95.

8. Chi-square test

# Independence test
m <- matrix(c(30, 10, 20, 40), nrow = 2,
            dimnames = list(c("Group A", "Group B"),
                            c("Yes",    "No")))
res <- apa_chisq(m)

9. Proportion test with risk difference, risk ratio, and odds ratio

res <- apa_prop_test(x = c(30, 20), n = c(50, 50), output = "console")

10. Regression table

fit <- lm(mpg ~ wt + hp + factor(cyl), data = mtcars)
res <- apa_table(fit)
#> Linear Model Predicting mpg
#> 
#>     Predictor     b   SE     t      p         95% CI
#>   (Intercept) 35.85 2.04 17.56 < .001 [31.66, 40.03]
#>            wt -3.18 0.72 -4.42 < .001 [-4.66, -1.70]
#>            hp -0.02 0.01 -1.93   .064  [-0.05, 0.00]
#>  factor(cyl)6 -3.36 1.40 -2.40   .024 [-6.24, -0.48]
#>  factor(cyl)8 -3.19 2.17 -1.47   .154  [-7.64, 1.27]
#> 
#> Note. Unstandardized coefficients reported. SE = Standard Error; CI = Confidence Interval. p-values based on t tests. N (Level 1) = 32.

11. Robust regression (Heteroscedasticity-Consistent standard errors)

res <- apa_robust(fit, type = "HC3")
#> Linear Model (HC3 robust SEs)
#> 
#>     Predictor     b   SE     t      p         95% CI
#>   (Intercept) 35.85 2.71 13.22 < .001 [30.53, 41.16]
#>            wt -3.18 0.81 -3.93 < .001 [-4.77, -1.60]
#>            hp -0.02 0.01 -1.83   .078  [-0.05, 0.00]
#>  factor(cyl)6 -3.36 1.38 -2.43   .022 [-6.07, -0.65]
#>  factor(cyl)8 -3.19 2.52 -1.26   .217  [-8.13, 1.76]
#> 
#> Note. Unstandardized coefficients reported. Standard errors are HC3 robust via sandwich/lmtest; 95% CIs use normal approximation.

12. Heteroscedasticity diagnostics

res <- apa_hetero(fit)
#> Heteroscedasticity Diagnostics
#> 
#>                         Test Stat df    p
#>                Breusch-Pagan 6.17  4 .187
#>  Non-Constant Variance (NCV) 3.37  1 .066
#> 
#> Note. Breusch-Pagan uses studentized residuals.
res <- apa_homoskedasticity(fit)
#> Homoskedasticity Check
#> 
#>           Test Stat df    p
#>  Breusch-Pagan 6.17  4 .187
#> 
#> Note. Breusch-Pagan and White tests are sensitive to functional form. Levene/Brown-Forsythe not run (no 'group' provided).  At alpha = 0.05, no test detected heteroskedasticity (results are consistent with homoskedastic errors).

13. Multilevel model reporting

library(lme4)
data(ECLS_demo)

m0 <- lmer(math ~ 1 + (1 | schid),        data = ECLS_demo, REML = FALSE)
m1 <- lmer(math ~ SES + (1 | schid),      data = ECLS_demo, REML = FALSE)
m2 <- lmer(math ~ SES + gender + (1 | schid), data = ECLS_demo, REML = FALSE)

res <- apa_multilevel(
  m0, m1, m2,
  model_names = c("Null", "+ SES", "+ SES + Gender")
)
#> Multilevel Model Results: Null - Fixed Effects
#> 
#>         Term      b   SE      t      p           95% CI
#>  (Intercept) 501.27 1.02 491.13 < .001 [499.27, 503.27]
#> Null - Random Effects
#> 
#>     Group         Var Variance    SD
#>     schid (Intercept)    18.24  4.27
#>  Residual        <NA>   103.58 10.18
#> Multilevel Model Results: + SES - Fixed Effects
#> 
#>         Term      b   SE      t      p           95% CI
#>  (Intercept) 501.41 1.02 490.99 < .001 [499.41, 503.41]
#>          SES   2.73 0.36   7.60 < .001     [2.03, 3.44]
#> + SES - Random Effects
#> 
#>     Group         Var Variance   SD
#>     schid (Intercept)    18.44 4.29
#>  Residual        <NA>    96.44 9.82
#> Multilevel Model Results: + SES + Gender - Fixed Effects
#> 
#>         Term      b   SE      t      p           95% CI
#>  (Intercept) 500.86 1.08 465.32 < .001 [498.75, 502.97]
#>          SES   2.76 0.36   7.69 < .001     [2.06, 3.47]
#>   genderMale   1.17 0.70   1.67   .095    [-0.21, 2.55]
#> + SES + Gender - Random Effects
#> 
#>     Group         Var Variance   SD
#>     schid (Intercept)    18.54 4.31
#>  Residual        <NA>    96.09 9.80
#> Model Comparison (Likelihood Ratio Tests)
#> 
#>           Model df    AIC    BIC   logLik Chi-sq Chi-sq df      p
#>            Null NA 6030.3 6044.4 -3012.15   <NA>        NA   <NA>
#>           + SES  1 5976.6 5995.3 -2984.29  55.72        NA < .001
#>  + SES + Gender  1 5975.8 5999.2 -2982.90   2.77        NA   .096
#> 
#> Note. Multilevel model (lme4).  b = unstandardized fixed-effect coefficient; SE = Standard Error;  CI = Confidence Interval (Wald).  p-values use Satterthwaite degrees of freedom (lmerTest).   ICC = intraclass correlation coefficient.  R2m/R2c via MuMIn::r.squaredGLMM().   R2m = marginal R-squared (fixed effects only);  R2c = conditional R-squared (fixed + random effects).

The function reports fixed effects, random effects, the intraclass correlation coefficient (ICC), marginal and conditional R-squared, and a likelihood ratio model-comparison table.


14. Exporting to Word

Any result can be passed to apa_to_flextable() and then saved:

ft <- apa_to_flextable(res)

doc <- officer::read_docx()
doc <- flextable::body_add_flextable(doc, ft)
print(doc, target = "my_table.docx")

Or use the built-in Word route directly:

apa_table(fit, output = "word", file = "regression_table.docx")

15. APA figures

apa_plot_descriptives(mtcars2, y = "mpg", group = "cyl", show_points = TRUE)

apa_plot_anova(aov(mpg ~ cyl, data = mtcars2), by = "cyl")