partialling.out is an R package that generates residualised variables from existing linear or fixed effects models. So far it works with lm, felm (lfe package) and feols (fixest package) objects, for applications of the Frisch-Waugh-Lovell theorem as explained in Lovell (2008). Although this algorithm has already been implemented in fwlplot, this package offers three new characteristics.
It uses an already existing model instead of a formula.
It works with lm and felm objects alongside feols.
It returns a data.frame with residualised variables instead of a plot, giving more freedom over what to do with the results.
The Frisch-Waugh-Lovell theorem states that for a linear model
Y = X_1 \beta_1 + X_2 \beta_2 + u
the coefficient \beta_1 is equivalent to the coefficient on M_{X_2} X_1 from the regression
M_{X_2} Y = M_{X_2} X_1 \beta_1 + M_{X_2} u
where M_{X_2} Y are the residuals of the regression of Y on X_2, and M_{X_2} X_1 are the residuals of the regression of X_1 on X_2.
This theorem helps simplify linear and, particularly, fixed effects models for easier visualisation and interpretation: it transforms a multiple regression into a simple one that can be visualised with a scatterplot or used in further models.
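The theorem can be checked numerically without any package. The following is a minimal sketch in base R, assuming the built-in mtcars data as a stand-in example (it is not part of this package's documentation); the variable names y, X1, X2 and M_X2 are chosen here only to mirror the notation above.

```r
# Base R only; mtcars ships with R.
y  <- mtcars$mpg
X1 <- cbind(wt = mtcars$wt)                 # regressor of interest
X2 <- cbind(intercept = 1, hp = mtcars$hp)  # controls, including the constant

# Residual-maker matrix for X2: M = I - X2 (X2'X2)^{-1} X2'
M_X2 <- diag(nrow(X2)) - X2 %*% solve(crossprod(X2)) %*% t(X2)

y_res  <- drop(M_X2 %*% y)   # residuals of y on X2
x1_res <- drop(M_X2 %*% X1)  # residuals of X1 on X2

# Coefficient on wt in the full model vs. in the residualised regression.
# No intercept is needed in the second fit: the constant is already in X2.
beta_full <- coef(lm(mpg ~ wt + hp, data = mtcars))[["wt"]]
beta_fwl  <- coef(lm(y_res ~ x1_res - 1))[["x1_res"]]

all.equal(beta_full, beta_fwl)
```

Building M_X2 explicitly is only sensible for small data; in practice the same residuals are obtained by regressing each variable on the controls.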
You can install the development version of partialling.out from GitHub with:
# install.packages("pak")
pak::pak("marcboschmatas/partialling.out")
Or from R-Universe with:
install.packages("partialling.out", repos = c('https://ropensci.r-universe.dev', 'https://cloud.r-project.org'))
The workflow for partialling.out is rather simple. First, create a linear or fixed effects model.
library(partialling.out)
library(tinytable)
library(palmerpenguins)
model <- lm(bill_length_mm ~ bill_depth_mm + species, data = penguins)
summary(model)
#>
#> Call:
#> lm(formula = bill_length_mm ~ bill_depth_mm + species, data = penguins)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -8.0300 -1.5828 0.0733 1.6925 10.0313
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 13.2164 2.2475 5.88 9.83e-09 ***
#> bill_depth_mm 1.3940 0.1220 11.43 < 2e-16 ***
#> speciesChinstrap 9.9390 0.3678 27.02 < 2e-16 ***
#> speciesGentoo 13.4033 0.5118 26.19 < 2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 2.518 on 338 degrees of freedom
#> (2 observations deleted due to missingness)
#> Multiple R-squared: 0.7892, Adjusted R-squared: 0.7874
#> F-statistic: 421.9 on 3 and 338 DF, p-value: < 2.2e-16
Using the partialling_out function, you can get the residualised variable of interest (bill_length_mm) and the residualised first explanatory variable (bill_depth_mm), i.e. it returns the residuals of the following two regressions.
modely <- lm(bill_length_mm ~ species, data = penguins)
modelx <- lm(bill_depth_mm ~ species, data = penguins)

res <- partialling_out(model, data = penguins)
tt(head(res)) |>
format_tt(digits = 2) |>
style_tt(align = "c")
res_bill_length_mm | res_bill_depth_mm |
---|---|
0.31 | 0.35 |
0.71 | -0.95 |
1.51 | -0.35 |
-2.09 | 0.95 |
0.51 | 2.25 |
0.11 | -0.55 |
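For comparison, the two auxiliary regressions (modely and modelx) can be run by hand. This is a sketch of the equivalent manual computation, not the package's internal code; the data frame name manual is hypothetical, and the block assumes palmerpenguins is installed.

```r
library(palmerpenguins)

# Auxiliary regressions: each variable on the remaining covariate (species).
modely <- lm(bill_length_mm ~ species, data = penguins)
modelx <- lm(bill_depth_mm ~ species, data = penguins)

# Guard: both fits must have dropped the same incomplete rows,
# otherwise the two residual vectors would not line up.
stopifnot(identical(names(resid(modely)), names(resid(modelx))))

manual <- data.frame(
  res_bill_length_mm = unname(resid(modely)),
  res_bill_depth_mm  = unname(resid(modelx))
)

head(round(manual, 2))
```

The first rows should agree with the table above up to rounding.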
As stated above, following the Frisch-Waugh-Lovell theorem, the coefficient of res_bill_depth_mm in the model lm(res_bill_length_mm ~ res_bill_depth_mm) will be the same as the coefficient of bill_depth_mm in the original model.
resmodel <- lm(res_bill_length_mm ~ res_bill_depth_mm, data = res)
print(c(model$coefficients[2], resmodel$coefficients[2]))
#> bill_depth_mm res_bill_depth_mm
#> 1.394011 1.394011
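Because the result is a plain data.frame, it can be passed straight to a plotting function. The sketch below draws the scatterplot with base graphics; it assumes partialling.out and palmerpenguins are installed and uses only the partialling_out call and column names shown above. The axis labels are illustrative.

```r
library(partialling.out)
library(palmerpenguins)

model <- lm(bill_length_mm ~ bill_depth_mm + species, data = penguins)
res <- partialling_out(model, data = penguins)

# Scatterplot of the residualised outcome against the residualised regressor.
plot(
  res$res_bill_depth_mm, res$res_bill_length_mm,
  xlab = "Bill depth (mm), residualised on species",
  ylab = "Bill length (mm), residualised on species"
)

# Fitted line; its slope equals the bill_depth_mm coefficient of `model`.
abline(lm(res_bill_length_mm ~ res_bill_depth_mm, data = res))
```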
Contributing instructions can be found here
Please note that this package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
To the authors of the fwlplot package, Kyle Butts and Grant McDermott, which has provided inspiration and ideas for this project.
To my colleague Andreu Arenas-Jal for his insight and guidance.
To the rOpenSci editors Mark Padgham and Nima Hejazi and to the reviewers Christian Testa, Kyle Butts, and Adam Loy.