Title: Bayesian Regression Robustness Assessment Test
Version: 0.0.2
URL: https://github.com/csiro/hydro_BRRAT_Package
BugReports: https://github.com/csiro/hydro_BRRAT_Package/issues
Description: Tests for a linear relationship in the log ratio between an observed and simulated series and an independent variable. Typically this the error in modelled streamflow at an annual time scale, and a rainfall input. The approach allows for multiple sites as random factors and for multiple replicates of the simulated values. The approach is outlined in Gibbs et al. (2026) in review.
License: GPL (≥ 3)
Encoding: UTF-8
RoxygenNote: 7.3.2
Biarch: true
Depends: R (≥ 4.1.0)
Imports: methods, Rcpp (≥ 1.1.0), RcppParallel (≥ 5.0.1), rstan (≥ 2.18.1), rstantools (≥ 2.5.0), MASS, ggplot2, dplyr
Suggests: covr, testthat (≥ 3.0.0)
Config/testthat/edition: 3
LinkingTo: BH (≥ 1.66.0), Rcpp (≥ 1.1.0), RcppEigen (≥ 0.3.3.3.0), RcppParallel (≥ 5.0.1), rstan (≥ 2.18.1), StanHeaders (≥ 2.18.0)
SystemRequirements: GNU make
NeedsCompilation: yes
Packaged: 2026-04-23 04:45:59 UTC; gib417
Author: Matt Gibbs [aut, cre]
Maintainer: Matt Gibbs <matt.gibbs@csiro.au>
Repository: CRAN
Date/Publication: 2026-04-24 20:30:11 UTC

The 'BRRAT' package.

Description

Function to apply the Bayesian Regression Robustness Assessment Test, to evaluate if model simulations are independent of the climate input used, suggesting a robust model.

This package makes use of Stan to fit the linear regression model, and the precompiled model created for the package with rstantools.

Author(s)

Maintainer: Matt Gibbs matt.gibbs@csiro.au

References

Stan Development Team (NA). RStan: the R interface to Stan. R package version 2.32.6. https://mc-stan.org

Gibbs et al. (2026) in review

See Also

Useful links:


Bayesian Regression Robustness Assessment Test

Description

Function to apply the BRRAT. A linear regression is fit between the error between the transformed Sim and Obs data and the corresponding Clim data, to test for a dependent relationship between the error and climate, indicating a model that is not robust to the input climate data. Stan is used to fit the linear regression, and hence a distribution of slope values are returned.

Usage

BRRAT(Sim, Obs, ...)

Arguments

Sim

Data frame type containing time series of simulated values (typically annual streamflow).

Obs

Data frame type containing time series of observed values (typically annual streamflow and rainfall).

...

Arguments passed to rstan::sampling (e.g. iter, chains, cores).

Sim requires column names of year and Q. year is used to match the simulated an observed values, and Q are the simulated values to test, typically streamflow volume/depth #' Obs requires column names of year, P and Q. year is the year of observation, matched to year in Sim to calculate errors Q is the observed values to compare to the Q column from Sim, using a log ratio P is the independent climate variable, to test if there is a relationship in the errors

An optional ID column can be included to indicate multiple sites to be used as random factors in the test. If an ID column is used, it is required in both Sim and Obs.

Details

Multiple catchments are supported through random factors in the hierarchical linear regression model,indicated by different values in an ID column.

Multiple replicates of the Simulated time series are supported, matched based on the year column.

Value

a list with members:

fit

rstan model object for testing of convergence etc

data

list of data in stan format used to fit the model, after transformation

id_levels

order of random factors if used

beta_draws

samples of slope values from the posterior distribution

statistics

statistics indicating the probability that the slope is: greater than 0 prob_beta_gt0, less than 0 prob_beta_lt0, and non-zero p_nonzero

Examples



#generate 2 models of annual streamflow
samples <- 50
#generate some annual rainfall totals
Clim <- sort(pmax(0,rnorm(samples,600,100)))
#calculate annual runoff from tanh curve
observed <- pmax(0,(Clim - 150) - 450 * tanh((Clim - 150)/450))
Obs <- data.frame(P=Clim,Q=observed,year=1:length(Clim))

#non-robust model, underestimate the wetter years
Sim1 <- observed*seq(1,0.7,length.out=length(observed))*rnorm(samples,1,0.05)
Sim1 <- data.frame(Q=Sim1,year=1:length(Clim))

#robust model, only random error
Sim2 <- observed*rnorm(samples,1,0.05)
Sim2 <- data.frame(Q=Sim2,year=1:length(Clim))

#apply BRRAT test to both, reduce chains and iters to speed up example.
fit1 <- BRRAT(Sim1,Obs,chains=1,iter=500)
fit2 <- BRRAT(Sim2,Obs,chains=1,iter=500)

dat <- data.frame(b = c(fit1$beta_draws,fit2$beta_draws),
id = c(rep("fit1",length(fit1$beta_draws)),
       rep("fit2",length(fit2$beta_draws))))

#fit2 is centered on 0, fit1 has a negative slope
ggplot2::ggplot(dat)+
 ggplot2::geom_density(ggplot2::aes(b,fill=id))

#can also be viewed using the plotting function
plot_BRRAT_population(fit1)
plot_BRRAT_population(fit2)

#Summarise statistics of the mean can be viewed using
summarise_BRRAT(fit1)
summarise_BRRAT(fit2)



Export global and per-site summaries to CSV

Description

Writes the output of summarise_BRRAT() to two CSV files: one for the population slope (file_global) and one for per-site coefficients (file_per_id).

Usage

export_BRRAT_csv(fit_obj, file_global, file_per_id, level = 0.95)

Arguments

fit_obj

A fitted object returned by BRRAT(), of class BRRAT.

file_global

Path/filename for the global table CSV. No default is provided; the user must supply an explicit path (e.g. via tempfile() or a user-chosen location).

file_per_id

If applicable to the model, path/filename for the per-site table CSV. No default is provided; the user must supply an explicit path.

level

Credible interval level passed to summarise_BRRAT() (default 0.95).

Details

Export summary tables to CSV files

Value

(Invisibly) a list with the two data frames written: ⁠list(global = <data.frame>, per_id = <data.frame>)⁠.

Examples


set.seed(1)
samples <- 30
Clim <- sort(pmax(0, rnorm(samples, 600, 100)))
observed <- pmax(0, (Clim - 150) - 450 * tanh((Clim - 150) / 450))
Obs <- data.frame(P = Clim, Q = observed, year = 1:samples)
Sim <- data.frame(Q = observed * rnorm(samples, 1, 0.05), year = 1:samples)
res <- BRRAT(Sim, Obs, chains = 1, iter = 500)
export_BRRAT_csv(res,
  file_global = tempfile(fileext = ".csv"),
  file_per_id = tempfile(fileext = ".csv"),
  level = 0.95
)



Per-site partial-pooling lines and credible bands

Description

Plots site-specific regression lines using draws of intercept_id[j] and slope_id[j] for each site ID. Requires more than 1 ID in the initial regression by BRRAT(). The output overlays lines and credible bands across sites; optionally facetted into separate panels for readability.

Usage

plot_BRRAT_by_id(fit_obj, n_grid = 150, level = 0.8, facet = TRUE)

Arguments

fit_obj

A fitted object returned by BRRAT(), of class BRRAT.

n_grid

Integer; number of x points per site for the prediction grids (default 150).

level

Credible interval level (default 0.80) used for site-specific bands.

facet

Logical; if TRUE, facet the plot by ID; otherwise overlay all sites with colour and fill aesthetics (default TRUE).

Details

Plot per-site lines (random intercepts & slopes) with credible intervals

For site j, the line is computed as:

\mu_j(x) = \mathrm{intercept\_id}_j + \mathrm{slope\_id}_j \, x_{\mathrm{std}}

evaluated on a grid of x values transformed to the centered/scaled space used during fitting. The function reconstructs equal-tail credible bands by propagating posterior draws of intercept_id[j] and slope_id[j].

Value

A ggplot2 object showing per-site lines and credible intervals.

Examples


set.seed(1)
samples <- 30
make_site <- function(id) {
  Clim <- sort(pmax(0, rnorm(samples, 600, 100)))
  observed <- pmax(0, (Clim - 150) - 450 * tanh((Clim - 150) / 450))
  list(
    Obs = data.frame(P = Clim, Q = observed, year = 1:samples, ID = id),
    Sim = data.frame(Q = observed * rnorm(samples, 1, 0.05), year = 1:samples, ID = id)
  )
}
s1 <- make_site("A")
s2 <- make_site("B")
Obs <- rbind(s1$Obs, s2$Obs)
Sim <- rbind(s1$Sim, s2$Sim)
res <- BRRAT(Sim, Obs, chains = 1, iter = 500)
p_sites <- plot_BRRAT_by_id(res, n_grid = 150, level = 0.80, facet = TRUE)
print(p_sites)



Population-level line and credible band

Description

Plots the population-level regression line \alpha + \beta x and its equal-tail credible band by transforming the x-grid to the centered/scaled space used during fitting. The plot overlays a (potentially downsampled) scatter of the original observations for context.

Usage

plot_BRRAT_population(fit_obj, n_grid = 200, level = 0.95)

Arguments

fit_obj

A fitted object returned by BRRAT(), of class BRRAT.

n_grid

Integer; number of x points for the prediction grid (default 200).

level

Credible interval level for the band (default 0.95).

Details

Plot population-level regression Qe ~ P with uncertainty band

The function extracts posterior draws of alpha and beta from the Stan fit and evaluates \mu(x) = \alpha + \beta x along a grid of x values on the original scale, internally converting to the centered/scaled space via x_{\mathrm{std}} = (x - x_{\mathrm{center}}) / x_{\mathrm{scale}}.

Value

A ggplot2 object showing the population regression line and credible band.

Examples


set.seed(1)
samples <- 30
Clim <- sort(pmax(0, rnorm(samples, 600, 100)))
observed <- pmax(0, (Clim - 150) - 450 * tanh((Clim - 150) / 450))
Obs <- data.frame(P = Clim, Q = observed, year = 1:samples)
Sim <- data.frame(Q = observed * rnorm(samples, 1, 0.05), year = 1:samples)
res <- BRRAT(Sim, Obs, chains = 1, iter = 500)
p_pop <- plot_BRRAT_population(res, n_grid = 200, level = 0.95)
print(p_pop)



Summarise population and site-level posterior summaries

Description

Computes tidy posterior summaries for the population slope (beta) and per-site intercepts and slopes (from intercept_id and slope_id generated quantities) returned by the Stan fit. The function reports posterior means, standard deviations, equal-tail credible intervals at a chosen level, and a two-sided tail probability for non-zero slope at the population level.

Usage

summarise_BRRAT(fit_obj, level = 0.95)

Arguments

fit_obj

A fitted object returned by BRRAT(), of class BRRAT.

level

Credible interval level (default 0.95). The lower and upper intervals are computed as (1 - level) / 2 and 1 - (1 - level) / 2.

Details

Summarise global slope and per-site coefficients into tidy tables

The two-sided tail probability is computed as:

p_{\mathrm{nonzero}} = 2\min\{ \Pr(\beta > 0 \mid \mathrm{data}), \Pr(\beta < 0 \mid \mathrm{data}) \}

which is analogous to a two-sided p-value in spirit, but is a Bayesian posterior tail probability. It shrinks toward 0 as the posterior mass concentrates away from zero.

Value

A list with one or two data frames:

Notes

Assumes the Stan model emits intercept_id and slope_id in ⁠generated quantities⁠, i.e., per-site fully-composed coefficients:

\mathrm{intercept\_id}_j = \alpha + b_{0j},\quad \mathrm{slope\_id}_j = \beta + b_{1j}

Examples


set.seed(1)
samples <- 30
Clim <- sort(pmax(0, rnorm(samples, 600, 100)))
observed <- pmax(0, (Clim - 150) - 450 * tanh((Clim - 150) / 450))
Obs <- data.frame(P = Clim, Q = observed, year = 1:samples)
Sim <- data.frame(Q = observed * rnorm(samples, 1, 0.05), year = 1:samples)
res <- BRRAT(Sim, Obs, chains = 1, iter = 500)
tabs <- summarise_BRRAT(res, level = 0.95)
tabs$global
head(tabs$per_id)