Bounded Outcome Risk Guard for Model Evaluation
BORG catches data leakage that inflates your model’s performance — before you report the wrong number.
```r
library(BORG)

# You scaled the data, then split it. Looks fine?
data_scaled <- scale(iris[, 1:4])
train_idx <- 1:100
test_idx <- 101:150

borg_inspect(data_scaled, train_idx = train_idx, test_idx = test_idx)
#> INVALID — Hard violation: preprocessing_leak
#> "Normalization parameters were computed on data beyond training set"
```

The test set means leaked into the scaler, so your reported accuracy is wrong. BORG finds this automatically for scaling, PCA, recipes, caret pipelines, and more.
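The leak-free version of that pipeline fits the scaler on the training rows only and then reuses those parameters for the test rows. A minimal base-R sketch (illustrative, not BORG's API):

```r
train_idx <- 1:100
test_idx  <- 101:150

# Fit scaling parameters on the training rows only
train_scaled <- scale(iris[train_idx, 1:4])
centers <- attr(train_scaled, "scaled:center")
sds     <- attr(train_scaled, "scaled:scale")

# Apply the *training* parameters to the test rows
test_scaled <- scale(iris[test_idx, 1:4], center = centers, scale = sds)

# The test set never influenced the parameters
stopifnot(isTRUE(all.equal(centers, colMeans(iris[train_idx, 1:4]))))
```

The same discipline applies to PCA and recipes: fit the transformation on the training set, then apply it unchanged to the test set.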
A model shows 95% accuracy on test data, then drops to 60% in production. The usual cause: data leakage. Information from the test set contaminated training, and the reported metrics were wrong.
A Princeton meta-analysis found leakage errors in 648 published papers across 30 fields. In civil war prediction research, correcting leakage revealed that “complex ML models do not perform substantively better than decades-old Logistic Regression.” The reported gains were artifacts.
BORG addresses this problem by automatically detecting six categories of leakage — index overlap, duplicate rows, preprocessing leakage, target leakage, group leakage, and temporal violations — across common R frameworks (base R, caret, tidymodels, mlr3). Beyond detection, BORG diagnoses data dependencies (spatial, temporal, clustered), generates appropriate cross-validation schemes, and produces publication-ready methods paragraphs with test statistics.
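The two simplest of these checks can be sketched in a few lines of base R; `check_split()` below is an illustrative stand-in, not BORG's implementation:

```r
# Illustrative sketch of the index_overlap and duplicate_rows checks
check_split <- function(data, train_idx, test_idx) {
  violations <- character(0)

  # index_overlap: the same row number appears in both sets
  if (length(intersect(train_idx, test_idx)) > 0)
    violations <- c(violations, "index_overlap")

  # duplicate_rows: identical observations on both sides of the split
  if (nrow(merge(data[train_idx, , drop = FALSE],
                 data[test_idx, , drop = FALSE])) > 0)
    violations <- c(violations, "duplicate_rows")

  violations
}

check_split(iris, train_idx = 1:100, test_idx = 51:150)
```

BORG performs the same kind of checks across all six categories, plus framework-specific introspection of fitted preprocessing objects.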
These features make the package useful in domains where observations are not independent, such as spatial ecology, clinical studies with repeated patient measurements, and time-series analysis.

- `borg()`: Main entry point for all validation
- `borg_inspect()`: Detailed inspection of specific objects (`caret::preProcess`, `recipes::recipe`, `prcomp`, `rsample` resampling objects, fitted models such as `lm`, `glm`, `ranger`, etc.)
- `borg_diagnose()`: Analyze data for dependency structure
- `borg_compare_cv()`: Run random and blocked CV side by side on the same data, with `plot()` for visual comparison
- `borg_power()`: Estimate power loss from switching to blocked CV
- `summary()`: Generate publication-ready methods paragraphs, including `borg_compare_cv()` inflation estimates when available
- `borg_certificate()` / `borg_export()`: Machine-readable validation certificates in YAML/JSON for audit trails

| Category | Impact | Response |
|---|---|---|
| Hard Violation | Results invalid | Blocks evaluation |
| Soft Inflation | Results biased | Warns, allows with caution |
Hard Violations:

- `index_overlap`: Same row in train and test
- `duplicate_rows`: Identical observations across sets
- `preprocessing_leak`: Scaler/PCA fitted on full data
- `target_leakage`: Feature with |r| > 0.99 with target
- `group_leakage`: Same group in train and test
- `temporal_leak`: Test data predates training

Soft Inflation:

- `proxy_leakage`: Feature with |r| 0.95–0.99 with target
- `spatial_proximity`: Test points close to training
- `spatial_overlap`: Test points inside training convex hull
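The two correlation-based checks reduce to a threshold rule on |r|. A base-R sketch using the cut-offs above (the helper name is hypothetical, not BORG's API):

```r
# Classify each feature by its absolute correlation with the target,
# using the hard (> 0.99) and soft (0.95-0.99) thresholds above
screen_correlations <- function(data, target) {
  y <- data[[target]]
  feats <- setdiff(names(data), target)
  vapply(feats, function(f) {
    r <- abs(cor(data[[f]], y))
    if (r > 0.99) "target_leakage"
    else if (r >= 0.95) "proxy_leakage"
    else "ok"
  }, character(1))
}

set.seed(42)
d <- data.frame(x = rnorm(100), outcome = rnorm(100))
d$leaked <- d$outcome + rnorm(100, sd = 0.001)
screen_correlations(d, "outcome")
```

BORG's actual checks add context (e.g. distinguishing direct from proxy leakage in its reports), but the threshold logic is the core idea.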
```r
# Install from GitHub
# install.packages("pak")
pak::pak("gcol33/BORG")

# Or using devtools
# install.packages("devtools")
devtools::install_github("gcol33/BORG")
```

```r
library(BORG)

# Clean split — passes validation
result <- borg(iris, train_idx = 1:100, test_idx = 101:150)
result
#> Status: VALID
#> Hard violations: 0
#> Soft inflations: 0

# Overlapping indices — caught immediately
borg(iris, train_idx = 1:100, test_idx = 51:150)
#> INVALID — index_overlap: Train and test indices overlap (50 shared indices)
```

```r
# caret preProcess fitted on ALL data (common mistake)
library(caret)
pp <- preProcess(mtcars, method = c("center", "scale"))

borg_inspect(pp, train_idx = 1:25, test_idx = 26:32, data = mtcars)
#> Hard violation: preprocessing_leak
#> "preProcess centering parameters were computed on data beyond training set"
```

```r
# Feature highly correlated with outcome
leaky_data <- data.frame(
  x = rnorm(100),
  outcome = rnorm(100)
)
leaky_data$leaked <- leaky_data$outcome + rnorm(100, sd = 0.01)

borg_inspect(leaky_data, train_idx = 1:70, test_idx = 71:100, target = "outcome")
#> Hard violation: target_leakage_direct
```

```r
# Clinical data with patient IDs
clinical <- data.frame(
  patient_id = rep(1:10, each = 10),
  measurement = rnorm(100)
)

# Random split ignoring patients
set.seed(123)
idx <- sample(100)
train_idx <- idx[1:70]
test_idx <- idx[71:100]

borg_inspect(clinical, train_idx, test_idx, groups = "patient_id")
#> Hard violation: group_leakage
```

```r
spatial_data <- data.frame(
  lon = runif(200, -10, 10),
  lat = runif(200, -10, 10),
  response = rnorm(200)
)

# Let BORG diagnose and generate appropriate CV folds
result <- borg(spatial_data, coords = c("lon", "lat"), target = "response", v = 5)
result$diagnosis@recommended_cv
#> "spatial_block"
```

```r
# Prove to reviewers that random CV inflates metrics
comparison <- borg_compare_cv(
  spatial_data,
  formula = response ~ lon + lat,
  coords = c("lon", "lat"),
  repeats = 10
)

print(comparison)
plot(comparison)
```

```r
# summary() writes a publication-ready methods paragraph
result <- borg(spatial_data, coords = c("lon", "lat"), target = "response")
summary(result)
#> Model performance was evaluated using spatial block cross-validation
#> (k = 5 folds). Spatial autocorrelation was detected in the data
#> (Moran's I = 0.12, p < 0.001)...

# Three citation styles
summary(result, style = "nature")
summary(result, style = "ecology")
```

BORG works with common ML frameworks:

```r
# caret
library(caret)
pp <- preProcess(mtcars[, -1], method = c("center", "scale"))
borg_inspect(pp, train_idx = 1:25, test_idx = 26:32, data = mtcars)

# tidymodels
library(recipes)
rec <- recipe(mpg ~ ., data = mtcars) |>
  step_normalize(all_numeric_predictors()) |>
  prep()
borg_inspect(rec, train_idx = 1:25, test_idx = 26:32, data = mtcars)
```

| Function | Purpose |
|---|---|
| `borg()` | Main entry point: diagnose data or validate splits |
| `borg_inspect()` | Detailed inspection of objects |
| `borg_diagnose()` | Analyze data dependencies |
| `borg_validate()` | Validate complete workflow |
| `borg_assimilate()` | Assimilate leaky pipelines into compliance |
| `borg_compare_cv()` | Empirical random vs blocked CV comparison |
| `borg_power()` | Power analysis after blocking |
| `plot()` | Visualize results |
| `summary()` | Generate methods text for papers |
| `borg_certificate()` | Create validation certificate |
| `borg_export()` | Export certificate to YAML/JSON |
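For the `group_leakage` case shown earlier (the same patient landing in both train and test), the standard remedy is to split at the group level rather than the row level. A base-R sketch of a leak-free split:

```r
# Split at the patient level so no patient appears in both sets
clinical <- data.frame(
  patient_id  = rep(1:10, each = 10),
  measurement = rnorm(100)
)

set.seed(123)
train_patients <- sample(unique(clinical$patient_id), 7)
train_idx <- which(clinical$patient_id %in% train_patients)
test_idx  <- which(!clinical$patient_id %in% train_patients)

# No patient straddles the split
stopifnot(length(intersect(clinical$patient_id[train_idx],
                           clinical$patient_id[test_idx])) == 0)
```

This split passes BORG's group check by construction; `borg_diagnose()` can recommend the equivalent grouped CV scheme automatically.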
“Software is like sex: it’s better when it’s free.” — Linus Torvalds
I’m a PhD student who builds R packages in my free time because I believe good tools should be free and open. I started these projects for my own work and figured others might find them useful too.
If this package saved you some time, buying me a coffee is a nice way to say thanks. It helps with my coffee addiction.
MIT (see the LICENSE.md file)
```bibtex
@software{BORG,
  author = {Colling, Gilles},
  title  = {BORG: Bounded Outcome Risk Guard for Model Evaluation},
  year   = {2026},
  url    = {https://github.com/gcol33/BORG}
}
```