BRAID Confidence Intervals

The Reason for Confidence Intervals

At its heart, the BRAID package is nothing more than a response surface equation and a set of tools for picking the version of that equation that is as close as possible to measured data. But in scientific research, it is often just as important to know how much to trust a particular fit as it is to know what that fit is. Though many techniques for modeling this trust are available, we have long relied on bootstrapped confidence intervals, mostly because they are mathematically simple, applicable to a wide range of models, and easy to implement on top of existing fitting code.

Formally, the confidence intervals in the BRAID package are estimated by bootstrapping with re-sampled residuals. This means that once a response surface is fit, the residuals of that model (the deviations between the actual measurements and the model’s predictions) are re-sampled and then added to the underlying prediction to produce a simulated noisy surface with both the same underlying model and the same distribution of errors. This simulated surface is then fit with the same method as the original, and after performing this re-sampling and fitting many times, we can estimate the distribution of parameter values that would theoretically arise if our model were spot-on. Taking the 2.5% and 97.5% percentiles of these bootstrapped distributions gives our estimated confidence interval for each parameter. If the confidence interval is narrow, then nearly all surfaces that look like ours will be fit with nearly the same value of the parameter, so we can feel confident that we are close to the correct value. If the confidence interval is wide, then surfaces similar to ours produce a wide variety of values for this parameter, and the true parameter may be quite different from the one we have fit.
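To make the procedure concrete, here is a minimal, self-contained sketch of the same re-sampled-residual loop applied to an ordinary linear model; the BRAID package performs the analogous loop on full response surface fits via calcBraidBootstrap(), discussed below:

# Fit a toy model, then repeatedly add re-sampled residuals
# back onto its own predictions and re-fit
toyData <- data.frame(x = 1:20)
toyData$y <- 2*toyData$x + rnorm(20)
fit <- lm(y ~ x, toyData)
pred <- fitted(fit)
resids <- residuals(fit)
boots <- replicate(500, {
    simulated <- toyData
    simulated$y <- pred + sample(resids, replace=TRUE)
    coef(lm(y ~ x, simulated))[["x"]]
})
# 2.5% and 97.5% percentiles of the bootstrapped slopes
# give the estimated 95% confidence interval
quantile(boots, c(0.025, 0.975))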

One of the most common uses for these confidence intervals is determining whether there is evidence that a response surface deviates significantly from additivity:

summary(braidrm(measure ~ concA + concB, additiveExample, getCIs=TRUE))
#> Call:
#> braidrm.formula(formula = measure ~ concA + concB, data = additiveExample, 
#>     getCIs = TRUE)
#> 
#>            Lo     Est     Hi
#> IDMA   0.6892  0.8036 0.9146
#> IDMB   0.8303  0.9536 1.0949
#> na     1.8645  2.4200 3.1225
#> nb     2.0822  2.6070 3.2912
#> kappa -0.4672 -0.2071 0.0327
#> E0    -0.0567  0.0029 0.0500
#> EfA    0.9471  0.9948 1.0269
#> EfB    0.9540  1.0020 1.0348
#> Ef         NA  1.0020     NA

summary(braidrm(measure ~ concA + concB, synergisticExample, getCIs=TRUE))
#> Call:
#> braidrm.formula(formula = measure ~ concA + concB, data = synergisticExample, 
#>     getCIs = TRUE)
#> 
#>            Lo     Est     Hi
#> IDMA   0.8923  1.0398 1.1670
#> IDMB   0.8533  1.0259 1.1826
#> na     2.4488  2.9116 3.7619
#> nb     2.0568  2.4990 3.1287
#> kappa  1.6776  2.1258 2.6156
#> E0    -0.0826 -0.0300 0.0187
#> EfA    0.9375  1.0080 1.0256
#> EfB    0.9115  0.9848 1.0253
#> Ef         NA  1.0080     NA

In the first (additive) example, though the fitted value of \(\kappa\) is slightly negative, the confidence interval encloses 0, indicating no clear deviation from additivity. In the second (synergistic) example, the confidence interval on \(\kappa\) lies well above zero, giving clear evidence of synergy.

Note that bootstrapped confidence intervals are constructed through random sampling; this means that estimating confidence intervals, even on identical data, will generally produce slightly different values. It is therefore important to remember that confidence intervals are useful guidelines, not fundamental truths. Those who wish to generate reproducible confidence intervals can do so by explicitly fixing the random seed using set.seed().
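For example, fixing the seed immediately before fitting guarantees that repeated runs produce identical intervals (the seed value here is arbitrary):

set.seed(20231201)
summary(braidrm(measure ~ concA + concB, additiveExample, getCIs=TRUE))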

By default, confidence intervals are estimated whenever a surface is fit using braidrm() or findBestBraid(); this can be disabled by setting the parameter getCIs to FALSE. Though we recommend always estimating and examining confidence intervals in a first-pass analysis or report, including confidence intervals is not always required. And because estimating them requires re-fitting a response surface hundreds of times, including them can add significant time to an analysis involving large numbers of surfaces.

If a fit is performed without confidence intervals, they can always be added after the fact using calcBraidBootstrap():

bfit <- braidrm(measure ~ concA + concB, antagonisticExample, getCIs = FALSE)
summary(bfit)
#> Call:
#> braidrm.formula(formula = measure ~ concA + concB, data = antagonisticExample, 
#>     getCIs = FALSE)
#> 
#>           Est
#> IDMA   0.9963
#> IDMB   0.9926
#> na     2.8385
#> nb     2.8984
#> kappa -1.0137
#> E0    -0.0102
#> EfA    0.9908
#> EfB    0.9768
#> Ef     0.9908

bfit_ci <- calcBraidBootstrap(bfit)
summary(bfit_ci)
#> Call:
#> braidrm.formula(formula = measure ~ concA + concB, data = antagonisticExample, 
#>     getCIs = FALSE)
#> 
#>            Lo     Est      Hi
#> IDMA   0.9019  0.9963  1.0836
#> IDMB   0.8981  0.9926  1.0829
#> na     2.3428  2.8385  3.6209
#> nb     2.3095  2.8984  3.5318
#> kappa -1.1204 -1.0137 -0.8873
#> E0    -0.0538 -0.0102  0.0223
#> EfA    0.9610  0.9908  1.0237
#> EfB    0.9533  0.9768  1.0050
#> Ef         NA  0.9908      NA

Confidence Intervals on Derived Quantities

Though evaluating parameter fits (particularly \(\kappa\)) is the most common use of confidence intervals, there are many cases in which it is useful to place a confidence interval on some other property or value derived from a response surface. This may be as simple as the predicted effect for a particular dose pair, or a more complex aggregate measure of a response surface such as the index of achievable efficacy (IAE). These derived confidence intervals can be estimated from the existing bootstrapped coefficients of a BRAID fit object using the function calcBraidConfInt(), which takes a BRAID fit (with confidence intervals) and a function that computes the derived value from a BRAID parameter vector:

# CI on the predicted effect at dose pair (10, 10)
calcBraidConfInt(bfit_ci, function(p) evalBraidModel(10, 10, p))
#>         Lo       Est       Hi
#>  0.9619248 0.9877667 1.017107

# CI on the IAE at 50% and 90% efficacy
calcBraidConfInt(bfit_ci, function(p) estimateIAE(p, c(0.5, 0.9), c(10, 10)))
#>            Lo      Est       Hi
#> [1,] 7.587669 8.031226 8.431710
#> [2,] 3.048281 3.480926 3.848797

If the function returns a single value, the confidence interval is returned as a vector of three values (the lower bound, the best-fit estimate, and the upper bound). If it returns a vector of values, the confidence interval is returned as a three-column array, with one row per value.
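As another illustration, the additivity check from the earlier summaries can itself be phrased as a derived quantity. The following sketch assumes that \(\kappa\) is the fifth element of the BRAID parameter vector, matching the order of the parameters in the summaries above:

# Assumes p[5] is kappa, per the parameter order in the summaries
kappa_ci <- calcBraidConfInt(bfit_ci, function(p) p[5])
# Third element is the upper bound; TRUE suggests antagonism
kappa_ci[3] < 0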

Of course, this function can only be run on BRAID fit objects for which confidence intervals (and their underlying bootstrapped coefficients) have already been estimated; if they are missing, the function will produce an error:

# `bfit` does not have confidence intervals
calcBraidConfInt(bfit, function(p) evalBraidModel(10, 10, p))
#> Error in calcBraidConfInt(bfit, function(p) evalBraidModel(10, 10, p)): Input 'bfit' must have bootstrapped coefficients.

A Caveat

One word of warning: much like its ideological cousin, the p-value, the confidence interval is widely used and even more widely misinterpreted. Saying that the 95% confidence interval on \(\kappa\) is from 2 to 4 is not, not, NOT saying that there is a 95% chance that \(\kappa\) lies in that range. Indeed, the confidence interval barely says anything about the probability of \(\kappa\) values in the real world at all. All it really says is that if our model were truly correct, we would expect to fit \(\kappa\) values in this range 95% of the time. That first supposition is a very, very big one (especially given that BRAID itself is a general-purpose, descriptive, empirical model), so confidence intervals should always be viewed as suggestive indicators of stability, rather than clear indicators of the boundaries of truth.