This vignette documents the analysis of actual R source code that
informed the formatting choices in rformat. We analyzed every
.R file from the 30 packages that ship with R: 15 base
packages and 15 recommended packages.
We downloaded the R 4.5.2 source tarball and extracted the R/
directories from all base packages. For recommended packages (boot,
class, cluster, codetools, foreign, KernSmooth, lattice, MASS, Matrix,
mgcv, nlme, nnet, rpart, spatial, survival) we used the CRAN source
tarballs. We parsed each file with getParseData() and
tallied formatting conventions across all function definitions.
An earlier version of this analysis used deparse()
output from 5 packages. That approach shows reconstructed
formatting, not the original source. The numbers below come from the
actual source files.
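The difference between the two approaches shows up even on a toy input. The following is a minimal sketch, not the analysis script itself; the one-line definition of `f` is a made-up example.

```r
# Hypothetical one-line definition, written without optional spaces.
src <- "f <- function(x){x+1}"

# Without keep.source = TRUE no parse data is recorded.
exprs <- parse(text = src, keep.source = TRUE)
pd <- getParseData(exprs)

# The opening brace is reported at the position where it was typed.
brace <- pd[pd$token == "'{'", c("line1", "col1")]
print(brace)

# deparse() regenerates the text from the parsed call, so the original
# layout (here: everything on one line) is lost.
rebuilt <- deparse(exprs[[1]])
print(rebuilt)
```

The parse data keeps the source coordinates of every token, while `deparse()` returns a re-laid-out version of the same code.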
The 15 base packages contain 612 files, 172,810 lines, and 8,426 function definitions.
| Brace style | Count | % |
|---|---|---|
| K&R (same line) | 2,985 | 52.9 |
| Allman (own line) | 2,655 | 47.1 |
| No braces (one-liner) | 2,786 | — |
Base R is genuinely mixed. Neither style dominates.
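One way to separate these three categories from parse data is to compare the line of the signature's closing `)` with the line of the body's opening `{`. The helper `classify_braces()` below is a hypothetical sketch of that idea, not the code that produced the counts above.

```r
# Hypothetical helper: label each function definition in `src` by brace style.
classify_braces <- function(src) {
    pd <- getParseData(parse(text = src, keep.source = TRUE))
    pd <- pd[order(pd$line1, pd$col1), ]
    fun_parents <- pd$parent[pd$token == "FUNCTION"]
    vapply(fun_parents, function(p) {
        kids <- pd[pd$parent == p, ]
        # The signature's closing paren is a direct child of the function expr.
        close_line <- kids$line1[kids$token == "')'"]
        # The body is the last child expression of the function expr.
        body_id <- tail(kids$id[kids$token == "expr"], 1)
        brace_line <- pd$line1[pd$parent == body_id & pd$token == "'{'"]
        if (length(brace_line) == 0) "no braces"
        else if (brace_line[1] == close_line[1]) "K&R"
        else "Allman"
    }, character(1))
}

# Made-up input with one definition of each kind.
src <- c("f <- function(x) {", "    x", "}",
         "g <- function(x)", "{", "    x", "}",
         "h <- function(x) x")
styles <- classify_braces(src)
print(table(styles))
```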
| Indentation | Lines | % |
|---|---|---|
| Space-indented | 116,187 | 89.0 |
| Tab-indented | 14,328 | 11.0 |
The most common leading-space widths are 4 (40,412 lines), 8 (24,324), 12 (14,257), and 16 (7,054). All are multiples of 4, so 4-space indentation is the dominant unit.
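A tally of this kind needs only the leading whitespace of each line. The sketch below runs on made-up sample lines and assumes that a line counts as tab-indented when its first character is a tab; how the original analysis treated mixed indentation is not stated.

```r
# Made-up sample lines; "\t" is a literal tab character.
lines <- c("x <- 1", "    y <- 2", "\tz <- 3", "        w <- 4", "    v <- 5")

# Keep only the leading whitespace of lines that are indented at all.
lead <- regmatches(lines, regexpr("^[ \t]+", lines))

# Assumption: the first whitespace character decides the category.
kind <- ifelse(startsWith(lead, "\t"), "tab", "space")
print(table(kind))

# Width distribution for the space-indented lines only.
widths <- nchar(lead[kind == "space"])
print(table(widths))
```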
`function` spacing:

| Style | Count | % |
|---|---|---|
| `function(` | 8,076 | 95.8 |
| `function (` | 350 | 4.2 |
The no-space form is overwhelmingly dominant. The space form appears
in a few well-known functions (lm, lapply,
glm) but is the exception, not the rule.
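The two forms can be told apart from token columns alone. This is a small sketch on invented definitions `f` and `g`; it assumes the opening paren sits on the same line as the `function` keyword.

```r
src <- c(
    "f <- function(x) x + 1",
    "g <- function (x, y)",
    "{",
    "    x * y",
    "}"
)
pd <- getParseData(parse(text = src, keep.source = TRUE))
fun <- pd[pd$token == "FUNCTION", ]

# Each function expression has exactly one '(' as a direct child: the
# signature's opening paren. Parens in defaults or the body are nested deeper.
paren <- pd[pd$token == "'('" & pd$parent %in% fun$parent, ]
paren <- paren[match(fun$parent, paren$parent), ]

# Columns are 1-based and inclusive, so adjacent tokens differ by exactly 1.
gap <- paren$col1 - fun$col2 - 1
spacing <- ifelse(gap == 0, "function(", "function (")
print(table(spacing))
```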
| Signature layout | Count | % |
|---|---|---|
| Single-line | 7,553 | 89.6 |
| Multi-line | 873 | 10.4 |
Of the 873 multi-line signatures:
| Continuation style | Count |
|---|---|
| Paren-aligned | 699 |
| Tab-indented | 144 |
| Other | 30 |
Paren alignment is the clear convention for multi-line signatures (80%).
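Checking for paren alignment means comparing the visual indent of each continuation line with the column just after the opening paren. The helpers `leading_width()` and `is_paren_aligned()` below are hypothetical, and the 8-column tab width is an assumption about how the source is displayed.

```r
# Visual width of a line's leading whitespace, expanding each tab to the
# next multiple of `tab_width` columns.
leading_width <- function(line, tab_width = 8L) {
    lead <- sub("^([ \t]*).*$", "\\1", line)
    col <- 0L
    for (code in utf8ToInt(lead)) {
        # Code point 9 is the tab character; anything else here is a space.
        col <- if (code == 9L) (col %/% tab_width + 1L) * tab_width else col + 1L
    }
    col
}

is_paren_aligned <- function(sig_lines) {
    # The 1-based column of '(' equals the number of columns that must
    # precede the first argument on a continuation line.
    target <- as.integer(regexpr("(", sig_lines[1], fixed = TRUE))
    widths <- vapply(sig_lines[-1], leading_width, integer(1), USE.NAMES = FALSE)
    all(widths == target)
}

# Two tabs after a 16-character prefix line up with the first argument.
tabbed <- c("lm <- function (formula, data, subset, weights, na.action,",
            "\t\tmethod = \"qr\", model = TRUE, x = FALSE, y = FALSE,")
fixed_indent <- c("f <- function(a,", "    b)")
print(is_paren_aligned(tabbed))
print(is_paren_aligned(fixed_indent))
```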
All 30 packages together contain 920 files, 263,807 lines, and 12,306 function definitions.
Adding the 15 recommended packages barely shifts the percentages:
| Metric | Base 15 | All 30 |
|---|---|---|
| K&R braces | 52.9% | 53.6% |
| Allman braces | 47.1% | 46.4% |
| `function(` | 95.8% | 96.5% |
| Space-indented | 89.0% | 89.9% |
| Paren-aligned cont. | 80.1% | 78.6% |
The recommended packages follow the same conventions as base.
| Control-flow body | Count | % |
|---|---|---|
| Bare (no braces) | 24,839 | 58.7 |
| Braced | 17,505 | 41.3 |
R Core slightly prefers bare one-liner bodies. rformat leaves them
bare by default (control_braces = FALSE). Use
control_braces = TRUE to add braces.
`} else` placement:

| Style | Count | % |
|---|---|---|
| `} else` (same line) | 4,576 | 69.7 |
| `}\nelse` (new line) | 1,987 | 30.3 |
Same-line } else is the majority practice. rformat
enforces this by default (join_else = TRUE). Use
join_else = FALSE to preserve }\nelse.
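The two placements can be distinguished by looking at the token immediately before each `else`. This is a sketch on an invented function `f`, not the original tally code; both `if` statements sit inside a function body because `}` followed by `else` on a new line is only valid inside braces.

```r
src <- c(
    "f <- function(x) {",
    "    if (x) {",
    "        1",
    "    } else {",
    "        2",
    "    }",
    "    if (x) {",
    "        3",
    "    }",
    "    else {",
    "        4",
    "    }",
    "}"
)
pd <- getParseData(parse(text = src, keep.source = TRUE))

# Terminal tokens in source order.
term <- pd[pd$terminal, ]
term <- term[order(term$line1, term$col1), ]

# For every `else`, inspect the token that immediately precedes it.
i <- which(term$token == "ELSE")
prev <- term[i - 1, ]
after_brace <- prev$token == "'}'"
same_line <- prev$line1[after_brace] == term$line1[i][after_brace]
print(table(ifelse(same_line, "same line", "new line")))
```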
Example from `stats/lm.R`, one of the minority that uses `function (`:

```r
lm <- function (formula, data, subset, weights, na.action,
                method = "qr", model = TRUE, x = FALSE, y = FALSE,
                qr = TRUE, singular.ok = TRUE, contrasts = NULL,
                offset, ...)
{
    ret.x <- x
    ret.y <- y
```
Running `cat -A` on the same lines reveals tabs (`^I`) versus spaces:

```
lm <- function (formula, data, subset, weights, na.action,$
^I^Imethod = "qr", model = TRUE, x = FALSE, y = FALSE,$
^I^Iqr = TRUE, singular.ok = TRUE, contrasts = NULL,$
^I^Ioffset, ...)$
{$
    ret.x <- x$
```
- `function(` without a space is the norm (96%). The `function (` form that appears in textbooks and style guides is actually rare in the source.
- The first argument stays on the same line as `function(`: R Core never puts the first argument on a new line.
- The closing `)` stays on the same line as the last argument: no dangling-paren style.

The R Internals manual, Chapter 9: R coding standards recommends 4-space indentation for R and C code.
The R Dev Guide PR #256 codifies this as an EditorConfig setting (4-space indent, 8-column tab display width, spaces not tabs).
The R source code uses `<-` exclusively for assignment; `=` does not appear as an assignment operator in any base or recommended package.
rformat adopts the conventions where R Core is consistent and makes opinionated choices where R Core is mixed:
| Convention | R Core source | rformat default | Option to change |
|---|---|---|---|
| Assignment | 100% `<-` | `<-` | (none) |
| Control flow braces | 59% bare, 41% braced | bare | `control_braces = TRUE` |
| `} else` placement | 70% same line | same line | `join_else = FALSE` |
| Brace style | ~53% K&R, ~47% Allman | K&R | `brace_style = "allman"` |
| `function` spacing | 96% no space | no space | `function_space = TRUE` |
| Indentation | 89% spaces, 11% tabs | 4 spaces | `indent = "\t"` |
| Continuation | 80% paren-aligned | paren-aligned | `wrap = "fixed"` |