IMPORTANT: Development Version. This package is under heavy development and has not yet been fully validated. Results should be independently verified before use in any consequential context (e.g., peer review, retractions, editorial decisions). Use of this package is at the sole responsibility of the user. We welcome contributions, verification reports, and bug reports at https://github.com/giladfeldman/escicheck/issues or by contacting Gilad Feldman (giladfel@gmail.com).
EffectCheck is a conservative, assumption-aware statistical consistency checker for published research results. It parses statistical results across multiple citation styles (APA, Harvard, Frontiers, PLOS ONE, Scientific Reports, Nature Human Behaviour, PeerJ, eLife, PNAS, and more), uses type-matched comparison to verify reported effect sizes against computed variants of the same type, explicitly flags all assumptions and uncertainty, and validates internal consistency between test statistics, effect sizes, and confidence intervals.
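As a rough illustration of what such a consistency check involves, the base-R sketch below recomputes the p-value and Cohen's d from a reported independent-samples t-test and compares them with the reported values. It does not use EffectCheck's own code: the variable names, the equal group sizes, and the tolerance of 0.02 are assumptions made for the example.

```r
# Reported: t(28) = 2.21, p = .035, d = 0.80, with n1 = n2 = 15 (assumed)
t_val <- 2.21
df    <- 28
n1    <- 15
n2    <- 15
p_reported <- 0.035
d_reported <- 0.80

# Two-sided p-value recomputed from the t statistic
p_computed <- 2 * pt(-abs(t_val), df)

# Cohen's d for two independent groups, recovered from t
d_computed <- t_val * sqrt(1 / n1 + 1 / n2)

# Absolute differences between reported and recomputed values
delta_p <- abs(p_computed - p_reported)
delta_d <- abs(d_computed - d_reported)

# Illustrative tolerance (an assumption, not necessarily a package default)
tol <- 0.02
consistent <- delta_d <= tol

print(round(c(p_computed = p_computed, d_computed = d_computed), 3))
print(consistent)
```

A real checker also has to decide which variant of d was reported and how the values were rounded, which is where the assumption flags come in.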
This package implements effect size computations following the Guide to Effect Sizes and
Confidence Intervals by Jané et al. (2024). The
effectsize R package is used as the primary computation
engine when available.
Citation for the Guide:

> Jané, M.B., Xiao, Q., Yeung, S., et al. (2024). Guide to Effect Sizes and Confidence Intervals. http://dx.doi.org/10.17605/OSF.IO/D8C4G
# Install dependencies
install.packages(c("devtools", "stringr", "stringi", "dplyr", "purrr",
"tibble", "tidyr", "xml2", "rvest",
"DT", "shiny", "shinythemes", "testthat", "jsonlite"))
# Install effectcheck
devtools::install_local("effectcheck")

For enhanced CI computation:
- MBESS: Noncentral t-distribution CIs for Cohen’s d
- effectsize: Additional effect size computations
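For context, a noncentral-t confidence interval for Cohen's d can be sketched in base R by inverting the noncentral t distribution. This illustrates the general technique only and is not the MBESS or EffectCheck implementation. The helper name `ci_d_noncentral` and the two-independent-groups design are assumptions.

```r
# Hypothetical helper: noncentral-t CI for Cohen's d (two independent groups)
ci_d_noncentral <- function(t_val, n1, n2, level = 0.95) {
  df    <- n1 + n2 - 2
  alpha <- 1 - level

  # Find the noncentrality parameter at which the observed t
  # sits at the requested cumulative probability
  solve_ncp <- function(prob) {
    uniroot(function(ncp) pt(t_val, df, ncp = ncp) - prob,
            interval = c(t_val - 10, t_val + 10),
            tol = 1e-8)$root
  }
  ncp_lower <- solve_ncp(1 - alpha / 2)
  ncp_upper <- solve_ncp(alpha / 2)

  # Convert the noncentrality bounds to the d scale
  scale <- sqrt(1 / n1 + 1 / n2)
  c(lower = ncp_lower * scale, upper = ncp_upper * scale)
}

ci <- ci_d_noncentral(t_val = 2.21, n1 = 15, n2 = 15)
print(round(ci, 2))
```

The interval is asymmetric around the point estimate, which is the main practical difference from a normal-approximation CI.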
library(effectcheck)
# Check text directly
text <- "t(28) = 2.21, p = .035, d = 0.80, 95% CI [0.12, 1.48]"
result <- check_text(text)
print(result)
summary(result)
# Check a single file
result <- check_file("paper.pdf")
# Check multiple files
results <- check_files(c("paper1.docx", "paper2.html"))
# Check a directory (recursively)
results <- check_dir("manuscripts/")
# Export results
render_report(result, "report.html")
export_csv(result, "results.csv")
export_json(result, "results.json")

# Check PDF files (like statcheck's checkPDF)
results <- checkPDF(c("paper1.pdf", "paper2.pdf"))
# Check HTML files (like statcheck's checkHTML)
results <- checkHTML(c("paper1.html", "paper2.html"))
# Check directory of PDFs (like statcheck's checkPDFdir)
results <- checkPDFdir("manuscripts/")
# Check directory of HTML files (like statcheck's checkHTMLdir)
results <- checkHTMLdir("manuscripts/")

# Get summary statistics
summary(result)
# Plot results
plot(result, type = "status") # Status distribution
plot(result, type = "uncertainty") # Uncertainty levels
plot(result, type = "all") # All plots
# Filter problematic results
errors <- ec_identify(result, "errors")
warnings <- ec_identify(result, "warnings")
decision_errors <- get_decision_errors(result)
high_uncertainty <- filter_by_uncertainty(result, "high")
# Filter by test type
t_tests <- filter_by_test_type(result, "t")
# Count by category
count_by(result, "status")
count_by(result, "test_type")

# Get all variants for a specific row
variants <- get_variants(result, row_index = 1)
# Get same-type variants only
same_type <- get_same_type_variants(result, row_index = 1)
# Get alternative suggestions
alternatives <- get_alternatives(result, row_index = 1)
# Format variants for display
cat(format_variants(result, row_index = 1))
# Compare reported value to all variants
comparison <- compare_to_variants(result, row_index = 1)
print(comparison)
# Get metadata for a specific variant type
metadata <- get_variant_metadata("dz")
print(metadata$when_to_use)

# Run the interactive Shiny app
shiny::runApp("effectcheck-demo")

The Shiny app provides:
- Upload/Input tab: File upload, text paste, encoding selection
- Parsing Diagnostics: Highlighted statistics, context windows
- Results tab: Interactive table with status summary and export options
- Uncertainty Review: Filtered view of WARN/ERROR cases
- Settings tab: Configurable tolerances, CI levels, r-grid
- Documentation tab: Complete user guide and effect size definitions
Key columns in results:

Reported Values:
- reported_type: The type of effect size reported (d, g, r, eta2, etc.)
- effect_reported: Reported effect size value
- effect_reported_name: Original name as parsed from text

Type-Matched Comparison (New in v2.0):
- matched_variant: Which same-type variant matched best
- matched_value: The computed value of that variant
- delta_effect: Absolute difference between reported and matched
- ambiguity_level: “clear”, “ambiguous”, or “highly_ambiguous”
- ambiguity_reason: Why comparison was ambiguous (if applicable)
- all_variants: JSON structure with all computed variants and alternatives
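
To make the idea of type-matched comparison concrete, the base-R sketch below converts a reported F statistic into several effect sizes. It then compares the reported η² only against variants of the same type and lists the others as alternatives. It assumes a one-way between-subjects design, where eta² and partial eta² coincide. The list layout is illustrative and is not the package's actual `all_variants` structure.

```r
# Reported: F(2, 27) = 4.56, eta² = 0.25
f_val <- 4.56
df1   <- 2
df2   <- 27
effect_reported <- 0.25

# Effect sizes recoverable from F and its degrees of freedom
# (one-way between-subjects design assumed)
eta2    <- (f_val * df1) / (f_val * df1 + df2)
omega2  <- (df1 * (f_val - 1)) / (df1 * f_val + df2 + 1)
cohen_f <- sqrt(eta2 / (1 - eta2))

variants <- list(
  same_type    = c(eta2 = eta2, partial_eta2 = eta2),
  alternatives = c(omega2 = omega2, f = cohen_f)
)

# Compare the reported value only against same-type variants
deltas          <- abs(variants$same_type - effect_reported)
matched_variant <- names(deltas)[which.min(deltas)]
matched_value   <- unname(variants$same_type[matched_variant])
delta_effect    <- unname(min(deltas))

print(matched_variant)
print(round(c(matched_value = matched_value, delta_effect = delta_effect), 4))
print(round(variants$alternatives, 3))
```

Keeping omega² and Cohen's f out of the matching step avoids declaring a match against a different kind of effect size that merely happens to be numerically close.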

Legacy Columns (backward compatible):
- test_type: Type of test (t, F, r, chisq, z)
- closest_method: Alias for matched_variant
- delta_effect_abs: Alias for delta_effect

P-value and Decision Errors:
- p_reported: Reported p-value
- p_computed: Computed p-value from test statistic
- decision_error: TRUE if significance reversal detected
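
A significance reversal can be sketched as the reported and recomputed p-values falling on opposite sides of the significance threshold. The helper name `is_decision_error` and the fixed alpha of .05 are assumptions for illustration. The package's actual rule (for example, how it treats inequalities such as `p < .05` or one-tailed tests) may differ.

```r
# Hypothetical helper: TRUE when reported and recomputed p-values fall on
# opposite sides of the significance threshold
is_decision_error <- function(p_reported, p_computed, alpha = 0.05) {
  (p_reported < alpha) != (p_computed < alpha)
}

# Reported as significant, but t(28) = 1.90 gives a two-sided p above .05
p_computed <- 2 * pt(-abs(1.90), df = 28)
print(is_decision_error(p_reported = 0.04, p_computed = p_computed))

# Reported and recomputed p-values agree on significance
print(is_decision_error(p_reported = 0.035,
                        p_computed = 2 * pt(-abs(2.21), df = 28)))
```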

Status and Uncertainty:
- status: PASS/WARN/ERROR/INSUFFICIENT_DATA
- uncertainty_level: low/medium/high
- uncertainty_reasons: Specific sources of uncertainty
- assumptions_used: All assumptions made during computation
- design_inferred: Best guess at the study design (paired, independent, etc.)
- variants_tested: All effect size variants computed
- source: Source file name (or “text” for direct input)
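
One source of uncertainty is a paired design in which the correlation between measurements is not reported. The sketch below shows how an effect size can then be expressed as a range over a grid of assumed correlations, using the same grid as the `paired_r_grid` example in this README. It assumes equal standard deviations in both conditions, in which case d_av = d_z × sqrt(2(1 − r)). The input values are illustrative, and this is not the package's own computation.

```r
# Paired design: t(28) = 2.21 from n = 29 pairs (illustrative values)
t_val <- 2.21
n     <- 29

# d_z is recoverable from t without knowing the correlation
d_z <- t_val / sqrt(n)

# Grid of assumed correlations between the paired measurements
r_grid <- seq(0.1, 0.9, by = 0.1)

# With equal SDs, SD_diff = SD * sqrt(2 * (1 - r)),
# so d_av = d_z * sqrt(2 * (1 - r))
d_av <- d_z * sqrt(2 * (1 - r_grid))

print(round(data.frame(r = r_grid, d_av = d_av), 3))
print(round(range(d_av), 3))
```

The wider the resulting range, the less a single reported value can be confirmed or refuted without knowing the correlation.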
# t-test with sample sizes
text <- "t(28) = 2.21, p = .035, d = 0.80, 95% CI [0.12, 1.48], N = 30, n1 = 15, n2 = 15"
result <- check_text(text)

# F-test with eta-squared
text <- "F(2, 27) = 4.56, p < .05, η² = 0.25"
result <- check_text(text)
# Computes: eta², partial eta², omega², Cohen's f

# Correlation with confidence interval
text <- "r(198) = .34, p < .001, 95% CI [.21, .45]"
result <- check_text(text)

# Several statistics in one text
text <- "t(28) = 2.21, d = 0.80. Another test: F(2, 27) = 4.56, η² = 0.25."
result <- check_text(text)
# Returns one row per statistic detected

# Custom tolerances
result <- check_text(
  text,
  tol_effect = list(d = 0.02, r = 0.005, phi = 0.02, V = 0.02, eta2 = 0.01),
  tol_ci = 0.02
)

# Correlation grid for paired designs
result <- check_text(
  text,
  paired_r_grid = seq(0.1, 0.9, by = 0.1) # Default
)
# Shows dav range across correlation assumptions

EffectCheck automatically detects CI level from text:
- 95% CI [0.12, 0.45] → uses 95%
- CI [0.12, 0.45] → defaults to 95% (flagged as assumption)
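
The detection rule above can be sketched with a regular expression in base R. The helper name `detect_ci_level` and the exact pattern are assumptions for illustration, not the package's parser.

```r
# Hypothetical helper: extract the CI level from text, defaulting to 95%
detect_ci_level <- function(text, default = 0.95) {
  # Match a percentage immediately before "CI", e.g. "95% CI" or "90 % CI"
  m <- regmatches(text, regexec("([0-9]+\\.?[0-9]*)\\s*%\\s*CI", text))[[1]]
  if (length(m) == 2) {
    # m[1] is the full match, m[2] the captured percentage
    list(level = as.numeric(m[2]) / 100, assumed = FALSE)
  } else {
    # No level stated: fall back to the default and record the assumption
    list(level = default, assumed = TRUE)
  }
}

str(detect_ci_level("d = 0.80, 95% CI [0.12, 0.45]"))
str(detect_ci_level("d = 0.80, CI [0.12, 0.45]"))
```

Returning the `assumed` flag alongside the level keeps the fallback visible to whoever reviews the result.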

Supported input formats:
- PDF (pdftotext from poppler-utils, with optional table extraction and OCR)
- DOCX (pandoc; requires pandoc on PATH)
- HTML (rvest package)

# Enable table extraction from PDFs
results <- checkPDF("paper.pdf", try_tables = TRUE)
# Enable OCR for scanned PDFs
results <- checkPDF("scanned_paper.pdf", try_ocr = TRUE)

EffectCheck follows these principles: it is conservative and assumption-aware, and it explicitly flags all assumptions and uncertainty.

OCR requires the tesseract and magick packages.

| Function | Description |
|---|---|
| check_text(text, ...) | Check plain text for statistical consistency |
| check_file(path, ...) | Check a single file |
| check_files(paths, ...) | Check multiple files |
| check_dir(dir, ...) | Check all files in a directory |

| Function | Description |
|---|---|
| checkPDF(files, ...) | Check PDF files |
| checkHTML(files, ...) | Check HTML files |
| checkPDFdir(dir, ...) | Check directory of PDFs |
| checkHTMLdir(dir, ...) | Check directory of HTML files |
| checkDOCXdir(dir, ...) | Check directory of Word documents |

| Method | Description |
|---|---|
| print(x) | Print formatted summary |
| summary(x) | Comprehensive summary statistics |
| plot(x, type) | Visualize results |
| ec_identify(x, what) | Filter problematic results |

| Function | Description |
|---|---|
| get_errors(x) | Get ERROR status results |
| get_warnings(x) | Get WARN status results |
| get_decision_errors(x) | Get significance reversals |
| filter_by_test_type(x, types) | Filter by test type |
| filter_by_uncertainty(x, levels) | Filter by uncertainty level |
| filter_by_source(x, files) | Filter by source file |
| filter_by_delta(x, min, max) | Filter by effect size delta |
| count_by(x, by) | Count results by category |

| Function | Description |
|---|---|
| render_report(x, out) | Generate HTML report |
| export_csv(x, out) | Export to CSV |
| export_json(x, out) | Export to JSON |

If you use EffectCheck in your research, please cite:
EffectCheck: Statistical Consistency Checker for Published Research Results
Version 0.2.0
MIT License - see LICENSE file for details
Contributions welcome! Please open an issue or pull request at https://github.com/giladfeldman/escicheck.
For issues, questions, or feature requests, please open an issue on the project repository.