| Title: | Bayesian Regression Robustness Assessment Test |
| Version: | 0.0.2 |
| URL: | https://github.com/csiro/hydro_BRRAT_Package |
| BugReports: | https://github.com/csiro/hydro_BRRAT_Package/issues |
| Description: | Tests for a linear relationship in the log ratio between an observed and simulated series and an independent variable. Typically this the error in modelled streamflow at an annual time scale, and a rainfall input. The approach allows for multiple sites as random factors and for multiple replicates of the simulated values. The approach is outlined in Gibbs et al. (2026) in review. |
| License: | GPL (≥ 3) |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.2 |
| Biarch: | true |
| Depends: | R (≥ 4.1.0) |
| Imports: | methods, Rcpp (≥ 1.1.0), RcppParallel (≥ 5.0.1), rstan (≥ 2.18.1), rstantools (≥ 2.5.0), MASS, ggplot2, dplyr |
| Suggests: | covr, testthat (≥ 3.0.0) |
| Config/testthat/edition: | 3 |
| LinkingTo: | BH (≥ 1.66.0), Rcpp (≥ 1.1.0), RcppEigen (≥ 0.3.3.3.0), RcppParallel (≥ 5.0.1), rstan (≥ 2.18.1), StanHeaders (≥ 2.18.0) |
| SystemRequirements: | GNU make |
| NeedsCompilation: | yes |
| Packaged: | 2026-04-23 04:45:59 UTC; gib417 |
| Author: | Matt Gibbs [aut, cre] |
| Maintainer: | Matt Gibbs <matt.gibbs@csiro.au> |
| Repository: | CRAN |
| Date/Publication: | 2026-04-24 20:30:11 UTC |
The 'BRRAT' package.
Description
Function to apply the Bayesian Regression Robustness Assessment Test, to evaluate if model simulations are independent of the climate input used, suggesting a robust model.
This package makes use of Stan to fit the linear regression model, and the precompiled model created for the package with rstantools.
Author(s)
Maintainer: Matt Gibbs matt.gibbs@csiro.au
References
Stan Development Team (NA). RStan: the R interface to Stan. R package version 2.32.6. https://mc-stan.org
Gibbs et al. (2026) in review
See Also
Useful links:
Report bugs at https://github.com/csiro/hydro_BRRAT_Package/issues
Bayesian Regression Robustness Assessment Test
Description
Function to apply the BRRAT. A linear regression is fit between the error between the transformed Sim
and Obs data and the corresponding Clim data, to test for a dependent
relationship between the error and climate, indicating a model that is not
robust to the input climate data. Stan is used to fit the linear regression,
and hence a distribution of slope values are returned.
Usage
BRRAT(Sim, Obs, ...)
Arguments
Sim |
Data frame type containing time series of simulated values (typically annual streamflow). |
Obs |
Data frame type containing time series of observed values (typically annual streamflow and rainfall). |
... |
Arguments passed to
An optional |
Details
Multiple catchments are supported through random factors in the hierarchical
linear regression model,indicated by different values in an ID column.
Multiple replicates of the Simulated time series are supported, matched based
on the year column.
Value
a list with members:
- fit
rstan model object for testing of convergence etc
- data
list of data in stan format used to fit the model, after transformation
- id_levels
order of random factors if used
- beta_draws
samples of slope values from the posterior distribution
- statistics
statistics indicating the probability that the slope is: greater than 0
prob_beta_gt0, less than 0prob_beta_lt0, and non-zerop_nonzero
Examples
#generate 2 models of annual streamflow
samples <- 50
#generate some annual rainfall totals
Clim <- sort(pmax(0,rnorm(samples,600,100)))
#calculate annual runoff from tanh curve
observed <- pmax(0,(Clim - 150) - 450 * tanh((Clim - 150)/450))
Obs <- data.frame(P=Clim,Q=observed,year=1:length(Clim))
#non-robust model, underestimate the wetter years
Sim1 <- observed*seq(1,0.7,length.out=length(observed))*rnorm(samples,1,0.05)
Sim1 <- data.frame(Q=Sim1,year=1:length(Clim))
#robust model, only random error
Sim2 <- observed*rnorm(samples,1,0.05)
Sim2 <- data.frame(Q=Sim2,year=1:length(Clim))
#apply BRRAT test to both, reduce chains and iters to speed up example.
fit1 <- BRRAT(Sim1,Obs,chains=1,iter=500)
fit2 <- BRRAT(Sim2,Obs,chains=1,iter=500)
dat <- data.frame(b = c(fit1$beta_draws,fit2$beta_draws),
id = c(rep("fit1",length(fit1$beta_draws)),
rep("fit2",length(fit2$beta_draws))))
#fit2 is centered on 0, fit1 has a negative slope
ggplot2::ggplot(dat)+
ggplot2::geom_density(ggplot2::aes(b,fill=id))
#can also be viewed using the plotting function
plot_BRRAT_population(fit1)
plot_BRRAT_population(fit2)
#Summarise statistics of the mean can be viewed using
summarise_BRRAT(fit1)
summarise_BRRAT(fit2)
Export global and per-site summaries to CSV
Description
Writes the output of summarise_BRRAT() to two CSV files: one for the
population slope (file_global) and one for per-site coefficients (file_per_id).
Usage
export_BRRAT_csv(fit_obj, file_global, file_per_id, level = 0.95)
Arguments
fit_obj |
A fitted object returned by |
file_global |
Path/filename for the global table CSV. No default is
provided; the user must supply an explicit path (e.g. via |
file_per_id |
If applicable to the model, path/filename for the per-site table CSV. No default is provided; the user must supply an explicit path. |
level |
Credible interval level passed to |
Details
Export summary tables to CSV files
Value
(Invisibly) a list with the two data frames written:
list(global = <data.frame>, per_id = <data.frame>).
Examples
set.seed(1)
samples <- 30
Clim <- sort(pmax(0, rnorm(samples, 600, 100)))
observed <- pmax(0, (Clim - 150) - 450 * tanh((Clim - 150) / 450))
Obs <- data.frame(P = Clim, Q = observed, year = 1:samples)
Sim <- data.frame(Q = observed * rnorm(samples, 1, 0.05), year = 1:samples)
res <- BRRAT(Sim, Obs, chains = 1, iter = 500)
export_BRRAT_csv(res,
file_global = tempfile(fileext = ".csv"),
file_per_id = tempfile(fileext = ".csv"),
level = 0.95
)
Per-site partial-pooling lines and credible bands
Description
Plots site-specific regression lines using draws of intercept_id[j] and
slope_id[j] for each site ID. Requires more than 1 ID in the initial
regression by BRRAT(). The output overlays lines and credible bands
across sites; optionally facetted into separate panels for readability.
Usage
plot_BRRAT_by_id(fit_obj, n_grid = 150, level = 0.8, facet = TRUE)
Arguments
fit_obj |
A fitted object returned by |
n_grid |
Integer; number of x points per site for the prediction grids
(default |
level |
Credible interval level (default |
facet |
Logical; if |
Details
Plot per-site lines (random intercepts & slopes) with credible intervals
For site j, the line is computed as:
\mu_j(x) = \mathrm{intercept\_id}_j + \mathrm{slope\_id}_j \, x_{\mathrm{std}}
evaluated on a grid of x values transformed to the centered/scaled space used
during fitting. The function reconstructs equal-tail credible bands by propagating
posterior draws of intercept_id[j] and slope_id[j].
Value
A ggplot2 object showing per-site lines and credible intervals.
Examples
set.seed(1)
samples <- 30
make_site <- function(id) {
Clim <- sort(pmax(0, rnorm(samples, 600, 100)))
observed <- pmax(0, (Clim - 150) - 450 * tanh((Clim - 150) / 450))
list(
Obs = data.frame(P = Clim, Q = observed, year = 1:samples, ID = id),
Sim = data.frame(Q = observed * rnorm(samples, 1, 0.05), year = 1:samples, ID = id)
)
}
s1 <- make_site("A")
s2 <- make_site("B")
Obs <- rbind(s1$Obs, s2$Obs)
Sim <- rbind(s1$Sim, s2$Sim)
res <- BRRAT(Sim, Obs, chains = 1, iter = 500)
p_sites <- plot_BRRAT_by_id(res, n_grid = 150, level = 0.80, facet = TRUE)
print(p_sites)
Population-level line and credible band
Description
Plots the population-level regression line \alpha + \beta x and its
equal-tail credible band by transforming the x-grid to the centered/scaled space
used during fitting. The plot overlays a (potentially downsampled) scatter of the
original observations for context.
Usage
plot_BRRAT_population(fit_obj, n_grid = 200, level = 0.95)
Arguments
fit_obj |
A fitted object returned by |
n_grid |
Integer; number of x points for the prediction grid (default |
level |
Credible interval level for the band (default |
Details
Plot population-level regression Qe ~ P with uncertainty band
The function extracts posterior draws of alpha and beta from the Stan fit and
evaluates \mu(x) = \alpha + \beta x along a grid of x values on the original
scale, internally converting to the centered/scaled space via
x_{\mathrm{std}} = (x - x_{\mathrm{center}}) / x_{\mathrm{scale}}.
Value
A ggplot2 object showing the population regression line and credible band.
Examples
set.seed(1)
samples <- 30
Clim <- sort(pmax(0, rnorm(samples, 600, 100)))
observed <- pmax(0, (Clim - 150) - 450 * tanh((Clim - 150) / 450))
Obs <- data.frame(P = Clim, Q = observed, year = 1:samples)
Sim <- data.frame(Q = observed * rnorm(samples, 1, 0.05), year = 1:samples)
res <- BRRAT(Sim, Obs, chains = 1, iter = 500)
p_pop <- plot_BRRAT_population(res, n_grid = 200, level = 0.95)
print(p_pop)
Summarise population and site-level posterior summaries
Description
Computes tidy posterior summaries for the population slope (beta) and
per-site intercepts and slopes (from intercept_id and slope_id generated
quantities) returned by the Stan fit. The function reports posterior means,
standard deviations, equal-tail credible intervals at a chosen level, and a
two-sided tail probability for non-zero slope at the population level.
Usage
summarise_BRRAT(fit_obj, level = 0.95)
Arguments
fit_obj |
A fitted object returned by |
level |
Credible interval level (default |
Details
Summarise global slope and per-site coefficients into tidy tables
The two-sided tail probability is computed as:
p_{\mathrm{nonzero}} = 2\min\{ \Pr(\beta > 0 \mid \mathrm{data}), \Pr(\beta < 0 \mid \mathrm{data}) \}
which is analogous to a two-sided p-value in spirit, but is a Bayesian posterior tail probability. It shrinks toward 0 as the posterior mass concentrates away from zero.
Value
A list with one or two data frames:
-
global: one row summarising the population slopebeta(mean, sd, lower/upper credible interval,prob_gt0,prob_lt0, andp_nonzero). -
per_id: if applicable (more than one siteIDin the data) one row per siteIDwith posterior summaries forintercept_idandslope_id(mean, sd, lower/upper credible intervals).
Notes
Assumes the Stan model emits intercept_id and slope_id in generated quantities,
i.e., per-site fully-composed coefficients:
\mathrm{intercept\_id}_j = \alpha + b_{0j},\quad \mathrm{slope\_id}_j = \beta + b_{1j}
Examples
set.seed(1)
samples <- 30
Clim <- sort(pmax(0, rnorm(samples, 600, 100)))
observed <- pmax(0, (Clim - 150) - 450 * tanh((Clim - 150) / 450))
Obs <- data.frame(P = Clim, Q = observed, year = 1:samples)
Sim <- data.frame(Q = observed * rnorm(samples, 1, 0.05), year = 1:samples)
res <- BRRAT(Sim, Obs, chains = 1, iter = 500)
tabs <- summarise_BRRAT(res, level = 0.95)
tabs$global
head(tabs$per_id)