‘Stress-Testing’ using foreSIGHT: Stochastic simulation

David McInerney, Seth Westra, Anjana Devanand, Michael Leonard, Sam Culley, Bree Bennett

14 Sep 2025

1. Overview

This vignette demonstrates the use of stochastic weather generators for producing perturbed climates for ‘stress-testing’ a system using foreSIGHT. The inverse approach (Guo et al. 2018) is used to optimize the parameters of one or more stochastic weather generators. The examples in this vignette both collate—and provide context to—information scattered in the function help files to discuss the considerations for application of foreSIGHT to more complex case studies and systems.

It is assumed that the reader has already read the  Introduction to climate stress testing using foreSIGHT vignette, and is familiar with the basic foreSIGHT work flow demonstrated below:

Workflow of climate 'stress-testing' using *fore*SIGHT

Workflow of climate ‘stress-testing’ using foreSIGHT

In this vignette, you’ll learn how to use foreSIGHT to generate and evaluate stochastic weather simulations that perturb a wide range of climate attributes. This is relevant to Steps A and B in the foreSIGHT workflow, with topics including:

We begin by introducing the key concepts and showing how they are implemented in the foreSIGHT package. These concepts are then illustrated through a case study, which stress-tests how streamflow in Scott Creek, South Australia, responds to changes in climate attributes across multiple metrics.

2. Attribute naming convention

foreSIGHT offers a uniquely flexible capability to perturb any number of climate attributes—either individually or in combination. To support this flexibility, attribute names follow a standardized naming convention in the format: “var_agg_strat_funcPars_op”, where

For example, “P_year_all_avg” is the average yearly calculated using all the data rainfall (i.e. mean annual rainfall) and “P_day_all_P99_m” is annual mean of the 99th percentile of rainfall calculated over all days.

2.1. Variable names

The variable name “var” can be any string (that does not contain the separating character ‘_’), and could include:

2.2. Temporal stratification

The aggregation period “agg” can be

2.3. Function names (and optional parameters)

Examples of function names and parameters include:

A full list of attribute functions supported in foreSIGHT can be viewed using the helper functions viewAttributeFuncs(). The table below provides a summary of these functions.

List of attribute functions available in foreSIGHT.
Function name Description

Options:

Syntax

Options:

Description

Options:

Default

Typical

variable

avg Average (mean) - - - Any
avgDSD Average dryspell duration (below threshold) avgDSDTx Threshold, x x=0 P (rainfall)
avgDwellTime Average dwell time, i.e. average of periods with below median value - - - Any
avgWSD Average wetspell duration avgWSDTx Threshold, x x=0 P (rainfall)
cor Lag-1 autocorrelation - - - Any
cv Coefficient of variation (sdev/mean) - - - Any
dyWet Average rainfall on wet days (above threshold) dyWetTx Threshold, x x=0 P (rainfall)
F0 Number of frost days (below zero) - - - Temperature
maxDSD Maximum dry spell duration (below threshold) maxDSDTx Threshold, x x=0 P (rainfall)
maxWSD Maximum wet spell duration (above threshold) maxWSDTx Threshold, x x=0 P (rainfall)
normP Normalised percentile (percentile divided by mean) normPx Probability (%), x None (must be specified) Any
nWet Number of wet days (above threshold) nWetTx Threshold, x x=0 P (rainfall)
P Percentile Px Probability (%), x None (must be specified) Any
R Number of days above a threshold (often used for temperature) Rx Threshold, x None (must be specified) Temperature
rng Inter-quantile range rngx Probability (%), x x=90 Temperature
seasRatio Seasonality ratio (ratio of seasonal to total rainfall) seasRatioMon1Mon2 Start and end months of season, Mon1, Mon2

Mon1=Mar

Mon2=Aug

P (rainfall)
tot Total - - - P (rainfall)
WDcor Lag-1 autocorrelation for wet days - - - P (rainfall)
wettest6monPeakDay Day of year corresponding to the wettest 6 months - - - P (rainfall)
wettest6monSeasRatio Ratio of wet season to dry season rainfall, based on wettest6monPeakDay - - - P (rainfall)

foreSIGHT also allows users to define their own custom functions. The names of these functions must have the format “func_customName”, where “customName” is the custom attribute function. These functions must have the argument “data”, which represents the time series of climate data. For example,

func_happyDays <- function(data){
  return(length(which((data>25)&(data<30))))
}

can be used in the attribute “Temp_day_all_happyDays_m” for calculating the average number of days each year with temperatures between 25oC and 30oC.

2.4. Operator name

The operator name is optional, and currently is limited to “m”: mean value of metric calculated over all years. Often “op” will be left empty.

The use of the operator “m” is a subtle, but important, choice. It determines whether the metric is calculated once using all the data (when operator is not specified), or whether the metric is calculated for each year, and then averaged over all years (when operator specified as “m”). For example, “P_day_all_P99” refers to the 99th percentile of daily rainfall calculated over all days, while “P_day_all_P99_m” would be the mean value of the 99th percentile of daily rainfall from each year.

This distinction becomes particularly important when comparing attributes across datasets of different lengths. For example, calculating the number of wet days without using "m" can be misleading, as longer time series will naturally accumulate more wet days. Including "m" ensures comparability by standardising the metric to an annual average.

The flexibility of attribute names can lead to some equivalence in attributes. E.g. "P_day_all_tot_m" is mean annual value of the total of daily rainfall. This is equivalent to "P_year_all_avg" - the average of the yearly rainfall.

It is noted that attributes are specified either as a fractional change relative to historical levels (and thus do not have units), or they are specified using the metric system.

2.5. Viewing attribute definitions and calculating values

The definition of climate attributes used in foreSIGHT can be viewed using the helper function viewAttributeDef() available in the package.

viewAttributeDef("P_day_all_tot_m")
#> [1] "Mean annual rainfall"

Can calculate attributes using calculateAttributes()

data("tankDat")
calculateAttributes(climateData=tank_obs,attSel=c('P_day_all_tot_m','P_day_all_P99'))
#> P_day_all_tot_m   P_day_all_P99 
#>         449.930          17.148

2.6. Multivariable attributes

foreSIGHT also supports the calculation of multivariable climate attributes, which are based on the interaction between two time series (e.g., precipitation, P, and potential evapotranspiration, PET ).

Multivariable attributes follow the naming convention “mv.var1.var2_agg_strat_func_op”, where

Examples include

Currently, a limited number of built-in multivariable attributes are available in foreSIGHT, as listed in the table below.

List of multivariable attribute functions
Function name Description Typical variable 1 Typical variable 2
avgDryDay Average value of variable 2 on dry days (assuming P is variable 1) P (rainfall) Temp, PET
avgWetDay Average value of variable 2 on dry days (assuming P is variable 1) P (rainfall) Temp, PET
cor Correlation between variables P (rainfall) Temp, PET
cvDryDay Coefficient of variation of variable 2 on dry days (assuming P is variable 1) P (rainfall) Temp, PET
cvWetDay Coefficient of variation of variable 2 on wet days (assuming P is variable 1) P (rainfall) Temp, PET

Users can also define their own custom multivariable functions. For an example of the input and output requirements of these functions, see the built-in function mvFunc_cor().

mvFunc_cor
#> function (data.1, data.2) 
#> {
#>     return(stats::cor(data.1, data.2, use = "pairwise.complete.obs"))
#> }
#> <bytecode: 0x0000026ae737bce0>
#> <environment: namespace:foreSIGHT>

3. Creating the exposure space (Step A)

The createExpSpace() function is used to define the exposure space—a set of perturbations applied to selected climate attributes. The general process for specifying perturbed attributes, along with the sampling strategy and bounds, follows the same principles outlined in the Introduction to climate stress testing using foreSIGHT vignette

To generate realistic climate time series, it is often necessary not only to perturb key attributes but also to specify held and tied attributes. As a rule of thumb, the total number of target attributes (including perturbed, held, and tied attributes) should be at least as large as the number of fitted parameters in the stochastic weather generator (SWG).

Selecting held and tied attributes is often a challenging task. It requires expert judgment to determine which climate changes are plausible, and typically involves an iterative process of evaluation and refinement to ensure the generated scenarios are both realistic and relevant.

3.1. Perturbed attributes

As shown in the Quick Start Guide, the scaling approach for generating perturbed climates allows only a limited set of attributes to be explicitly modified—typically:

By contrast, stochastic simulation enables a much broader set of attributes to be perturbed. The actual attributes that can be modified depend on the capabilities of the specific stochastic generator being used (see Sections 4.3.1 and 4.3.2 for more details).

3.2. Held attributes

To maintain realistic climate behavior, certain attributes can be held constant, typically at their historical values. This is especially useful to prevent perturbations from producing unrealistic extremes or seasonal patterns.

Held attributes are specified using the attHold argument. For example:

attHold <- c("P_day_all_P99", "P_day_all_maxWSD_m", "P_day_all_seasRatioJunAug")

In this case, these three attributes are held at reference levels. The attributes 99th percentile rainfall, maximum length of the wet spells are held at reference levels so that the perturbations do not result in unrealistic extreme rainfall events. In addition, the attribute “P_day_all_seasRatioJunAug” is held at reference levels so that the seasonal cycle of the generated data is realistic.

There are also mechanisms to ensure to preferentially matching the desired values of some attributes during time series generation (as described in Section 4.3.4).

3.3. Tied attributes

Attributes can also be tied together, meaning they share the same perturbation. This is useful when certain attributes are logically or physically linked.

For example, to ensure that JJA rainfall totals (P_day_JJA_tot_m) change in line with annual rainfall totals (P_day_all_tot_m), use:

attTied <- list(P_day_all_tot_m = c("P_day_JJA_tot_m"))

In some situations, it may be beneficial to tie attributes for each season to their equivalent derived from the full time series. For example, tying seasonal 99th percentiles to the overall 99th percentile:

attTied <- setSeasonalTiedAttributes(attSel = "P_day_all_P99") 
attTied 
#> $P_day_all_P99
#> [1] "P_day_DJF_P99" "P_day_MAM_P99" "P_day_JJA_P99" "P_day_SON_P99"

This will tie P_day_DJF_P99 , P_day_MAM_P9 , P_day_JJA_P99 and P_day_SON_P99 to the attribute P_day_all_P99, ensuring consistent changes across seasons.

3.4 Example: Creating an exposure space

This example demonstrates the use of createExpSpace() to define a climate exposure space, including perturbed, held, and tied attributes. As noted earlier, selecting held and tied attributes requires expert judgment, particularly in anticipating how different attributes are likely to co-vary with those being explicitly perturbed. The process is often iterative.

Perturbed Attributes

In this example, we are interested in perturbing two climate attributes

These two attributes are perturbed over a 5×5 regular grid.

Held Attributes

To maintain realism and constrain the stochastic generation process, the following attributes are held constant (at historical levels):

Tied Attributes

To ensure consistent changes across related attributes, the following tied relationships are defined:

############################################################################
# create exposure space

attPerturbType <- "regGrid"
# perturb seasonality and 99% rainfall
attPerturb <- c('P_day_all_seasRatioMarMay','P_day_all_P99')
# consider a 5x5 grid
attPerturbSamp <- c(5,5)
attPerturbMin <- c(0.7,1.)
attPerturbMax <- c(1.3,1.3)
# will hold a number of attributes to historical values
attHold <- c('P_day_all_tot_m',
            'P_day_DJF_tot_m','P_day_JJA_tot_m','P_day_SON_tot_m',
            'P_day_all_nWet_m',
            'P_day_DJF_nWet_m','P_day_MAM_nWet_m',
            'P_day_JJA_nWet_m','P_day_SON_nWet_m',
            'P_day_all_avgDSD',
            'P_day_DJF_avgDSD','P_day_MAM_avgDSD',
            'P_day_JJA_avgDSD','P_day_SON_avgDSD'
            )
# tied attributes
# note: normP = P99/avg rainfall. normP in each season is tied to P99 
attTied <- list(P_day_all_P99=c('P_day_DJF_normP99', 
                               'P_day_MAM_normP99',
                               'P_day_JJA_normP99',
                               'P_day_SON_normP99'),
              P_day_all_seasRatioMarMay=c('P_day_MAM_tot_m')) # MAM rain tied to seasRatioMarMay

expSpace <- createExpSpace(attPerturb = attPerturb,
                          attPerturbSamp = attPerturbSamp,
                          attPerturbMin = attPerturbMin,
                          attPerturbMax = attPerturbMax,
                          attPerturbType = attPerturbType,
                          attHold = attHold,
                          attTied = attTied)

Note: The selection of perturbed, held, and tied attributes in this example creates some inherent conflicts, meaning that not all target values can be perfectly matched.

For instance, perturbing spring rainfall via "P_day_all_seasRatioMarMay" and "P_day_MAM_tot" while simultaneously holding total annual rainfall ("P_day_all_tot") and non-winter seasonal rainfall ("P_day_DJF_tot", "P_day_MAM_tot", "P_day_SON_tot") at historical values introduces inconsistencies. It is not possible to satisfy all these constraints simultaneously.

In such cases, the optimization process will aim to find the best compromise across the specified targets. The relative importance of each attribute can be adjusted using attribute weights, as described in Section 4.3.4.

4. Generating perturbed climate scenarios (Step B)

4.1. The inverse optimization approach

The ‘inverse’ approach to producing perturbed climates uses a numerical optimization to calibrate parameters in a stochastic weather generator that produces time series with attributes that are as close as possible to the target attributes.

To understand what is going on, we need to get into a bit more theory. Let’s imagine we have i=1,…,n different attributes, such as mean annual rainfall and 99th percentile of daily rainfall (in which case n=2). The optimization approach seeks to generate time series that minimize the difference in target values between the attribute values of a simulated time series \(a_{i,j}\) and target values \(t_{i,j}\), as given by the following equation:

\(O(\mathbf{a}_j, \mathbf{t}_j) = \sqrt{\sum_{i=1}^{n} [\lambda_i(a_{i,j} - t_{i,j})]^2}\)

Here \(\lambda_i\) represents a user-specified penalty parameter for each attribute \(i\). The use of larger values for \(\lambda_i\) emphasizes those attributes in the calibration.

4.2. Generating scenarios with generateScenarios()

Generating stochastic perturbed climates in foreSIGHT is carried out using the generateScenarios(). function. The foreSIGHT help file for this functions provides details on the use of generateScenarios().Somekey inputs to this function are as follows.

Control file (controlFile)

The controlFile argument specifies:

The format and available options for the control file are described in Section 4.3.

(Note that if the controlFile argument is not specified or set to NULL, the default stochastic model and associated settings will be used to generate the scenarios. If controlFile='scaling' , then scaling approaches are used instead of stochastic simulation - see the Introduction to climate stress testing using foreSIGHT vignette.)

Length of perturbed series (simLengthNyrs)

The argument simLengthNyrs can be used to specify the desired length of the stochastically generated perturbed time series in years using generateScenarios

simLengthNyrs <- 100

Number of replicates (numReplicates)

By default, generateScenarios will generate a single replicate (or stochastic realization) of the perturbed time series. More replicates can be generated by specifying the numReplicates argument of generateScenarios.

numReplicates <- 5

The random seed (seedID)

The random seed used for stochastic generation of the first replicate is specified by seedID. The random seeds for the subsequent replicates are incremented by 1. Thus, the perturbed stochastic data generated using generateScenarios() for the same function arguments would typically be different.

Numbers of core in parallel processing (cores)

The inverse approach is a computationally intense approach to SWG calibration, as it requires many evaluations of the objective function, with each objective function calculation requiring simulation of the SWG and calculating attributes. Furthermore, objective function surface can be complex, making optimization challenging.

Combined with large numbers of replicates and targets and or longer time series, can result in the generation of scenarios becoming infeasible using a single processor. generateScenarios() supports parallel execution using multiple cores using the cores argument.

4.3. Stochastic simulation options (the control file)

4.3.1. Control file format

Options for stochastic simulation can be modified using the controlFile argument in generateScenarios(), defining the location of a JSON file.

For example, stochastic models are defined using the modelType and modelParameterVariation fields in the controlFile (as described in Section 4.3.2). The user can create a JSON file for input to generateScenarios. As an example, the following text may be used in the JSON file to select alternate models for precipitation and temperature.

# Example text to be copied to a text JSON file
{
  "modelType": {
    "P": "latent",
    "Temp": "wgenLM"
  },
  "modelParameterVariation": {
    "P": "seas",
    "Temp": "har"
  }
}

The helper function writeControlFile() available in foreSIGHT can be used to create a sample JSON file that provides a template to create control files to specify alternate models that the user needs. The writeControlFile() function can be used without arguments as shown below. Note that the following function call will write a JSON file named sample_controlFile.json into your working directory.

writeControlFile()

Alternatively, the toJSON function from the jsonlite package can be used to create a JSON file for the controlFile from an R list as shown below.

# create a list containing the specifications of the selected models
controlFileList <- list()
controlFileList[["modelType"]] <- list()
controlFileList[["modelType"]][["P"]] <- "latent"
controlFileList[["modelType"]][["Temp"]] <- "wgenO"
controlFileList[["modelParameterVariation"]] <- list()
controlFileList[["modelParameterVariation"]][["P"]] <- "seas"
controlFileList[["modelParameterVariation"]][["Temp"]] <- "har"
utils::str(controlFileList)

# write the list into a JSON file
controlFileListJSON <- jsonlite::toJSON(controlFileList, pretty = TRUE, auto_unbox = TRUE)
write(controlFileListJSON, file = paste0(tempdir(), "\\eg_controlFile.json"))
# input a the JSON file
controlFile <- paste0(tempdir(), "\\eg_controlFile.json")  

4.3.2. Stochastic weather generators

There is a plethora of options for stochastic weather generators described in the scientific literature, each with different features and assumptions. This variety provides a high degree of flexibility for perturbing weather time series in a variety of ways to comprehensively ‘stress test’ a climate-sensitive system. The foreSIGHT software currently supports a number of weather generators, and is developed such that developers and advanced users can add other weather generators over time.

The choice depends:

  • Data timestep. Most weather generators run at a daily timestep, which is a common timestep for reporting of key weather variables. However there are also weather generators available at various sub-daily timesteps, as well as longer aggregated timesteps such as monthly or annual. The key consideration here is to ensure the time step is consistent with the likely timescales of system performance sensitivity, which in turn need to correspond to the relevant time scales of weather inputs that are require for the system model.

  • The key timescales of system sensitivity. In addition to data timestep, it is important that the stochastic generator is able to simulate variability in Climate Attributes representing key timescales of system sensitivity. For example some systems may respond at sub-seasonal and seasonal timescales, and others at interannual timescales. It is noted that a stochastic generator may be able to simulate variability at timescales that are equal to or longer than its timestep, but not shorter.

  • Relevant weather variables. Weather generators have the capability of simulating a range of surface variables, including precipitation, temperature, wind, solar radiation, humidity and so forth—as well as derived variables such as potential evapotranspiration that are calculated through various recognised formulations. In many cases weather generators simulate precipitation first, followed by the other variables that are then conditioned to the precipitation time series; however each weather generator is different and it is necessary to review the documentation to understand the basis for generating the weather time series.

  • Other key structural features that drive the weather generator’s capacity to simulate individual or combined changes in attributes. For example, for some systems, it may be important to capture dependence between climate variables or across spatial locations.

A pragmatic approach to assess the appropriateness of weather generation choice is through evaluating the relevant diagnostics in achieving specified target attributes. Poor performance in diagnostics may be due to several issues, including weather generator suitability. The use of weather generator diagnostics is discussed later in this section.

foreSIGHT includes a few stochastic models that the user can select to generate the scenarios. The options differ in the time step, model formulation and the temporal variation of the model parameters. The models available in foreSIGHT can be viewed using the function viewModels as shown below. The defaultModel column indicates the default stochastic model that will be used if the controlFile argument is NULL. The usage of this function is demonstrated below.

viewModels()  
#> [1] "Please select a valid variable. The valid variable names are:"
#> [1] "P"    "Temp" "PET"
# View the models available for a specific climate variable
viewModels("P")  
#>      modelType modelParameterVariation timeStep default
#> 1        BLRPM                     ann   1 hour   FALSE
#> 2       latent                     ann    1 day   FALSE
#> 3       latent                    seas    1 day    TRUE
#> 4       latent                     har    1 day   FALSE
#> 5         wgen                     ann    1 day   FALSE
#> 6         wgen                    seas    1 day   FALSE
#> 7         wgen                     har    1 day   FALSE
#> 8  distScaling                     ann      any   FALSE
#> 9       monAR1                     ann  1 month   FALSE
#> 10      monAR1                    seas  1 month   FALSE
#> 11      monAR1                   seas1  1 month   FALSE
#> 12      monAR1                   seas2  1 month   FALSE
#> 13      monAR1                     har  1 month   FALSE
# View models available for temperature 
viewModels("Temp")
#>   modelType modelParameterVariation timeStep default
#> 1     wgenO                     ann    1 day   FALSE
#> 2     wgenO                    seas    1 day   FALSE
#> 3     wgenO                     har    1 day    TRUE
#> 4     wgenO                  seasWD    1 day   FALSE
#> 5     wgenO                   harWD    1 day   FALSE

A summary of these models is provided in the table below.

Description of models available in foreSIGHT.
Variable Model type Time step Parameter variation Description
P latent 1 day ann 4 parameter latent variable (LV) rainfall model
seas 16 parameter LV rainfall model, with seasonal variations in parameters
har 12 parameter LV rainfall model, with harmonic variations in parameters
wgen ann 4 parameter Richardson WGEN rainfall model
seas 16 parameter WGEN rainfall model, with seasonal variations in parameters
har 12 parameter WGEN rainfall model, with harmonic variations in parameters
BLRPM 1 hour ann 5 parameter Bartlett-Lewis Rectangular Pulse sub-daily rainfall model
monAR1 1 month 4 parameter transformed AR(1) monthly rainfall model
Temp wgenO 1 day ann 3 parameter WGEN AR(1) model for temperature
seas 12 parameter WGEN AR(1) model for temperature, with seasonal variations in parameters
har 7 parameter WGEN AR(1) model for temperature, with harmonic variations in all parameters except autocorrelation parameter
seasWD 20 parameter WGEN AR(1) model for temperature, with separate parameters for wet and dry days, and seasonal variations in parameters.
harWD 13 parameter WGEN AR(1) model for temperature, with separate parameters for wet and dry days, and harmonic variations in parameters. A single autocorrelation parameter is used for all seasons and wet-dry days.
PET seas 12 parameter WGEN AR(1) model for PET, with seasonal variations in parameters
PET har 7 parameter WGEN AR(1) model for PET, with harmonic variations in all parameters except autocorrelation parameter
PET seasWD 20 parameter WGEN AR(1) model for PET, with separate parameters for wet and dry days, and seasonal variations in parameters.
PET harWD 13 parameter WGEN AR(1) model for PET, with separate parameters for wet and dry days, and harmonic variations in parameters. A single autocorrelation parameter is used for all seasons and wet-dry days.

The selection of model type and parameter variation can specified in the controlFile, as seen in the following example:

controlFileList <- list()
controlFileList$modelType <- list()
controlFileList$modelType$P <- "latent"
controlFileList$modelType$PET <- "wgenO"

controlFileList$modelParameterVariation <- list()
controlFileList$modelParameterVariation$P <- "seas"
controlFileList$modelParameterVariation$PET <- "harWD"

Multisite rainfall

foreSIGHT supports the generation of multisite precipitation using the latent variable model with the approach introduced by McInerney et al. (2023). This method allows the simulation of rainfall at multiple locations simultaneously, capturing key spatial dependencies. It also allows the exploration of changes in spatial correlation between sites.

To implement this, the reference data must contain precipitation data at multiple sites (see Introduction to climate stress testing using foreSIGHT vignette, Section 4.1), and the latent variable model is specified in the control file. An example is provided in the help documentation for generateScenarios().

4.3.3. Post-processing of stochastic weather generator output

In addition to generating climate time series using stochastic weather generators (SWGs), foreSIGHT provides the ability to apply post-processing adjustments to the simulated output. This step helps to address known limitations of SWGs and ensures that certain climate features are better represented or that unrealistic behavior is constrained.

These post-processing steps are applied after the SWG has generated the time series and are independent of the SWG model itself—meaning they can be used flexibly across different SWGs in foreSIGHT.

Available post-processing options are:

  • Annual variability adjustment (annVar)
    Adjusts the interannual variability of the generated time series to better reflect historical or target levels. This is especially important because many SWGs tend to underestimate annual variability, which can significantly affect systems that are sensitive to year-to-year climate fluctuations.

  • Extreme value scaling (scaleExtremes, scaleExtremesSeas)
    Applies constraints to the upper tail of the distribution to ensure that changes in very high percentiles (e.g. >99%) scale consistently with the 99th percentile. This helps prevent disproportionate increases in extreme values during perturbation.

    • scaleExtremes: Applies the adjustment across the entire time series.

    • scaleExtremesSeas: Applies the adjustment separately for each season, preserving seasonal differences in extremes.

    These options are especially useful for avoiding unrealistic extremes, such as excessively high rainfall events, which can result from large shifts in the distribution tails. Without this control, such anomalies may lead to distorted or misleading outcomes in downstream impact modelling.

An example of the specification of post-processing options is:

controlFileList[['postProcessing']] <- list(P=list())
controlFileList[['postProcessing']]$P$types <- c('scaleExtremes','annVar')

4.3.4. Optimization

Objective function weights

The objective function shown in Section 4.1 allows different attributes to be prioritized in the calibration through the use of the weights \(\lambda\) . The precise value of these weights depends on the overall objectives of the problem and will often be adjusted after assessing the diagnostics of the stochastically generated time series (discussed later in this section). The likely usage of this is to provide a means of instructing the optimizer to: ‘generate stochastic realizations that reflect the perturbed attributes, while keeping other relevant features of the time series described by the held attributes as close to the reference time series as possible’— by placing greater priority on the perturbed attributes.

The weights are specified by specifying “penaltyAttributes” and “penaltyWeights” in the control file. For example, if we want to apply \(\lambda=3\) for "P_day_all_tot_m", and \(\lambda=2\) for "P_day_all_P99", this is achieved as follows:

controlFileList[["penaltyAttributes"]] <- c("P_day_all_tot_m","P_day_all_P99")
controlFileList[["penaltyWeights"]] <- c(3,2)

Weights for other attributes have the default value of 1.

Parameter Bounds

When using numerical optimization to estimate parameters of a stochastic weather generator (SWG), parameter bounds must be specified. These bounds define the allowable range for each model parameter during the calibration process.

The default bounds for each supported model can be viewed using the viewModelParameters() function by specifying the variable name, model type, and type of parameter variation. For example, to view the bounds for rainfall ("P") using the latent variable model with seasonal parameter variation:

viewModelParameters(variable = 'P',modelType='latent',modelParameterVariation='seas')
#>     parameter min_bound max_bound
#> 1   alpha.SON       0.1       0.9
#> 2   alpha.DJF       0.1       0.9
#> 3   alpha.MAM       0.1       0.9
#> 4   alpha.JJA       0.1       0.9
#> 5   sigma.SON       0.5       8.0
#> 6   sigma.DJF       0.5       8.0
#> 7   sigma.MAM       0.5       8.0
#> 8   sigma.JJA       0.5       8.0
#> 9      mu.SON      -8.0       2.0
#> 10     mu.DJF      -8.0       2.0
#> 11     mu.MAM      -8.0       2.0
#> 12     mu.JJA      -8.0       2.0
#> 13 lambda.SON       1.0       5.0
#> 14 lambda.DJF       1.0       5.0
#> 15 lambda.MAM       1.0       5.0
#> 16 lambda.JJA       1.0       5.0

Users can override default parameter bounds by specifying them in the controlFile using the "modelParameterBounds" argument.

For example, to modify the bounds of the "sigma.SON" parameter for rainfall so that it ranges from 0.001 to 20 (instead of the default range of 0.5 to 8), use the following:

controlFileList[["modelParameterBounds"]] <- list()
controlFileList[["modelParameterBounds"]][["P"]] <- list()
controlFileList[["modelParameterBounds"]][["P"]][["sigma.SON"]] <- c(0.001, 20)

Optimization arguments

The choice of numerical optimization routine and settings plays a large role in how well the simulated attributes match the desired targets. The default optimization routine used in **foreSIGHT is the Robust Gauss Newton (RGN) algorithm (Qin et al, 2018). Default values for optimization arguments in foreSIGHT can be viewed using the viewDefaultOptimArgs() helper function in the package.

viewDefaultOptimArgs()
#> $optimizer
#> [1] "RGN"
#> 
#> $obj.func
#> [1] "WSS"
#> 
#> $seed
#> NULL
#> 
#> $nMultiStart
#> [1] 5
#> 
#> $OFtol
#> [1] 0
#> 
#> $seed
#> NULL
#> 
#> $RGN.control
#> $RGN.control$iterMax
#> [1] 100
#> 
#> $RGN.control$distMin
#> [1] 0

This shows that RGN is the default optimizer, and that optimization will be performed using 5 different initial parameter sets (multi-starts) to improve optimization. It also shows control settings used in the RGN algorithm.

**foreSIGHT has the capability to use alternate optimization routines, including a genetic algorithm (‘GA’), the shuffled complex evolution algorithm (‘SCE’), and the Nelder-Mead method (‘NM’). Default settings for ‘GA’ are shown using

viewDefaultOptimArgs('GA')
#> $optimizer
#> [1] "GA"
#> 
#> $obj.func
#> [1] "WSS"
#> 
#> $seed
#> NULL
#> 
#> $nMultiStart
#> [1] 5
#> 
#> $OFtol
#> [1] 0
#> 
#> $seed
#> NULL
#> 
#> $GA.args
#> $GA.args$pcrossover
#> [1] 0.8
#> 
#> $GA.args$pmutation
#> [1] 0.1
#> 
#> $GA.args$maxiter
#> [1] 50
#> 
#> $GA.args$maxFitness
#> [1] -0.001
#> 
#> $GA.args$popSize
#> [1] 500
#> 
#> $GA.args$run
#> [1] 20
#> 
#> $GA.args$parallel
#> [1] FALSE
#> 
#> $GA.args$keepBest
#> [1] TRUE

The code below demonstrates how user-specified optimisation arguments can be used in the controlFile input to generateScenarios.

This first example shows how the number of multi-starts can be changed to 1 (from the default value of 5)

# add user-specified values for optimisation arguments
controlFileList[["optimisationArguments"]] <- list()
controlFileList[["optimisationArguments"]][["nMultiStart"]] <- 1

This second example shows how we can change the optimizer to ‘GA’, and specify GA::ga() arguments for maximum iterations (‘maxiter’) and stopping criteria (‘run’)

# add user-specified values for optimisation arguments
controlFileList[["optimisationArguments"]] <- list()
controlFileList[["optimisationArguments"]][["optimizer"]] <- 'GA'
controlFileList[["optimisationArguments"]][["nMultiStart"]] <- 1
controlFileList[["optimisationArguments"]][["GA.args"]] <- list(maxiter=100,run=40)

4.3.5. Advice for managing the number of parameters and target attributes

When using the inverse approach to estimate stochastic weather generator (SWGs) parameters in foreSIGHT, it is recommended that the number of target attributes be at least equal to the number of fitted model parameters. Otherwise, the optimization process may be under-constrained, leading to non-unique solutions and potentially large uncertainty in climates.

Several strategies can help ensure this balance:

  • Use of tied and held attributes: Increasing the number of tied and held parameters

  • Fixing SWG parameters: Some model parameters can be fixed based on prior knowledge or separate calibration exercises, reducing the number of parameters that need to be fitted during optimization.

    • Parameters can be estimated using methods such as the Method of Moments or Maximum Likelihood Estimation. The function modCalibrator() is available in foreSIGHT for this purpose and supports a subset of SWG models.

    • As an alternative, parameters can be estimated using generateScenarios() by specifying the exposure space with attributes held to their historical (i.e. unperturbed) values, effectively calibrating the SWG to reproduce observed conditions before introducing perturbations.

    Parameters can then be fixed by setting the parameter bounds to these values.

4.3.6. Example: Generating perturbed stochastic climates

This example demonstrates how the generateScenarios() function can be used to produce perturbed stochastic climates. We focus on daily rainfall perturbations for the Scott Creek catchment in South Australia (see Appendix for case study details). The exposure space defined in Section 3.4 provides the basis for the perturbations applied in this example.

The controlFile is configured to specify the following:

  • Stochastic Weather Generator (SWG) model type and parameterisation:
    We use the latent-variable SWG with seasonally varying parameters.

  • Optimization settings:
    A tolerance value is set to define the stopping criteria for the multistart optimization procedure.

  • Weights for the objective function:
    Highest weights are assigned to the perturbed attributes 'P_day_all_seasRatioJunAug' and 'P_day_all_P99', along with 'P_day_all_tot_m', a known key driver of system response.

The call to generateScenarios() specifies the generation of 50 stochastic weather replicates. Given the combination of a large number of replicates (50) and multiple target locations in the exposure space (25), parallel processing on high-performance computing (HPC) is recommended to reduce runtime.

The example below demonstrates the use of 25 processing cores, and completes in approximately 20 minutes under these settings.


# load dates, precip, PET and streamflow data for Scott Creek
data('data_A5030502')

############################################################################

# create reference data - won't include PET here since not modelled 
clim_ref <- list(times = data_A5030502$times,
                 P = data_A5030502$P)  

############################################################################
# setup model settings

controlFileList <- list()

controlFileList$modelType <- list()
controlFileList$modelType$P <- "latent" # latent variable model

controlFileList$modelParameterVariation <- list()
controlFileList$modelParameterVariation$P <- "seas" # parameters vary with season

controlFileList[["optimisationArguments"]] <- list()
controlFileList[["optimisationArguments"]][["OFtol"]] <- 0.1 # stop multi-start optimization when OF < OFtol

# set penalty weights. More weights to perturbed attributes and total rainfall (which is known to have large impact)
controlFileList[["penaltyAttributes"]]<-c('P_day_all_seasRatioMarMay',
                                        'P_day_all_P99','P_day_all_tot_m',
                                           'P_day_all_avgDSD','P_day_all_nWet_m')
controlFileList[["penaltyWeights"]] = c(3,3,3,1.5,1.5)

# write to JSON file
controlFileListJSON <- jsonlite::toJSON(controlFileList, pretty = TRUE, auto_unbox = TRUE)
controlFile <- paste0(tempdir(), "\\eg_controlFile.json")
write(controlFileListJSON, file = controlFile)

###############
# generate scenarios

sim.stoch <- generateScenarios(reference = clim_ref,        # reference climate data
                               expSpace = expSpace,         # exposure space (defined in Section 3.4)
                               controlFile = controlFile,   # control file constructed above
                               seedID = 1,                  # random seed
                               numReplicates = 50,          # stochastic replicates
                               cores = 25)                  # cores for parallel processing

5. Evaluation of perturbed climates

The generation of perturbed climates through stochastic simulation requires careful evaluation on the part of the user to ensure that:

Failure to perform proper evaluation can lead to misleading conclusions about the impacts of changes in climate attributes.

The following sections introduce diagnostic tools available in foreSIGHT for evaluating the performance of stochastic climates and help identify common issues. We also offer strategies for resolving problems. Diagnostics are best used as part of an iterative workflow—producing, evaluating, and refining stochastic climates until performance is satisfactory.

5.1. Evaluation of target attributes

foreSIGHT contains a function named plotScenarios() which can be used to create plots of the biases in attributes of the simulated data relative to the specified target values, for perturbed, held and target attributes. The function uses a simulation performed using generateScenarios() as input and plots the mean and standard deviation of the absolute biases of each attribute and target, across all the replicates in heatmap-like plots. It is implemented as follows:

p <- plotScenarios(sim) # sim the output from generateScenarios

The figures can be used to assess how well the simulations capture the desired target values of the attributes.

As a rough estimate, biases around or less than 5% are acceptable. If there are larger mean biases, we recommend that you evaluate the following:

If the standard deviation of the absolute biases across the replicates are high, this indicates that the attribute value is highly variable across the replicates in the generated data. This could be due to

The use of plotScenarios() is demonstrated for the simulated climates generated in Section 4.3.6. (for the full exposure space, and with 50 replicates) as follows:

plotScenarios(sim.stoch)
Mean biases in simulated attributes

Mean biases in simulated attributes

SD of biases in simulated attributes

SD of biases in simulated attributes

These figures show the following:

5.2. Evaluating changes in attributes

This section explores how a wide range of climate attributes respond to imposed perturbations—including those that are explicitly perturbed, held, or tied, as well as independent (unconstrained) attributes.

Key aspects to examine include:

If such issues are observed, consider the following actions:

The example below evaluates changes across a wide set of attributes for the simulated climates generated in Section 4.3.6 (which includes additional targets and replicates compared with the example code).

In particular, we examine perturbations to the attribute "P_day_all_seasRatioJunAug", and assess responses across:


# select target attributes (perturbed/held/tied)
attSel <- colnames(sim.stoch$expSpace$targetMat)
# other attributes
attSel <- c(attSel,'P_year_all_avgDwellTime','P_day_all_avgWSD',
           'P_day_all_P99.9','P_day_JJA_P99.9','P_day_SON_P99.9',
           'P_day_DJF_P99.9','P_day_MAM_P99.9')

# plot changes in attributes for perturbations in seasonality ratio
att <- "P_day_all_seasRatioMarMay"
par(mfrow=c(10,3),mar=c(4,7,2,1))
# plot changes in a single attribute with respect to perturbed attrbiutes
plotPerformanceAttributesOAT(clim=clim_ref,        # reference climate
                             sim=sim.stoch,        # simulated perturebed climates (see Section 4.3.6)
                             attPerturb=att,       # perturbed attribute
                             attEval=attSel,       # seelected attributes for evaluation
                             cex.main = 1.5,cex.xaxis = 1,cex.yaxis = 1)    # formatting options
Evaluation of changes in climate attributes for perturbations in P_day_all_seasRatioMarMay

Evaluation of changes in climate attributes for perturbations in P_day_all_seasRatioMarMay

In these plots:

These results can be interpreted as follows:

Next we evaluate the response to perturbations in "P_day_all_P99":

# plot changes in attrtibutes for perturbations in P99
att <- "P_day_all_P99"
par(mfrow=c(10,3),mar=c(4,7,2,1))
# plot changes in a single attribute with respect to perturbed attrbiutes
plotPerformanceAttributesOAT(clim=clim_ref,        # reference climate
                             sim=sim.stoch,        # simulated perturebed climates (see Section 4.3.6)
                             attPerturb=att,       # perturbed attribute
                             attEval=attSel,       # seelected attributes for evaluation
                             cex.main = 1.5,cex.xaxis = 1,cex.yaxis = 1)    # formatting options   
Evaluation of changes in climate attributes for perturbations in P_day_all_P99

Evaluation of changes in climate attributes for perturbations in P_day_all_P99

In these plots, red lines highlight attributes that change substantially more than intended—specifically, where the slope of the attribute response relative to the perturbed attribute exceeds 1.5.

These results can be interpreted as follows:

5.3. Evaluating the ability of SWGs to capture climate features relevant to the system model

This section addresses the critical connection between stochastic climates and system model responses.

To ensure meaningful analysis, the stochastic weather generator (SWG) must be capable of reproducing the climate features that are most influential for the system model’s behavior. One way to assess this is to compare how the system responds under different climate inputs.

In particular, it is recommended to compare system responses under:

This allows users to assess whether the SWG adequately captures the key climate drivers that influence the system’s performance. See Nguyen et al (2023) for further explanation of this approach.

The function evaluate_system_metrics() in foreSIGHT supports this evaluation by comparing system metrics across different climate inputs. We demonstrate this approach using the rainfall–runoff case study for the Scott Creek catchment in South Australia (see Appendix for details).

First, we setup the system model for this application:


# put dates in format required for airGR
dates <- as.Date(clim_ref$times)

# observed flow
Qobs <- data_A5030502$Qobs
# observed PET 
PET <- data_A5030502$PET 

# calibrate GR4J parameters
Param <- calGR4J(dates = dates,P=clim_ref$P,PET=PET,Qobs=Qobs)

# setup systemArgs and metrics
systemArgs <- list(dates=dates,Param=Param,PET=PET)
metrics <- c('meanQ','P99','P25','min3yr')

Next, evaluate_system_metrics() is used to calculate streamflow metrics for the observed and baseline stochastic climates, with boxplots used to compare these values.

eval <- evaluate_system_metrics(sim=sim.stoch,clim=clim_ref,
                               systemModel=GR4J_wrapper,systemArgs=systemArgs,
                               metrics=metrics)

par(mfrow=c(2,2),mar=c(3,4,1,1))
for (m in 1:length(metrics)){
  metric <- metrics[m]
  yAll <- c(eval$systemPerf_base[[metric]],eval$systemPerf_obsClim[metric])
  ylim <- c(min(yAll),max(yAll))
  boxplot_prob(as.vector(eval$systemPerf_base[[metric]]),ylim=ylim)
  points(eval$systemPerf_obsClim[metric],col='red',pch=19)
  title(metric)
}
Evaluation of stochastic climates in terms of capturing hydrological response to observed climates

Evaluation of stochastic climates in terms of capturing hydrological response to observed climates

This figure shows that, for each system metric, the values obtained using the observed climate (red points) fall within the range produced by the baseline stochastic climate simulations (boxplots). This indicates that the SWG is successfully capturing the climate features relevant to the system model—at least under historical (unperturbed) conditions.

6. Calculate and evaluate performance (Steps C and D)

Once we are satisfied with the performance of the generated stochastic climates, we can proceed to calculate and evaluate system responses to climate perturbations—corresponding to Steps C and D in the foreSIGHT framework.

Using the rainfall–runoff case study for the Scott Creek catchment in South Australia (see Appendix for details), system performance metrics for the perturbed climates can be calculated as follows:

# calculate system performance metrics for simulated climates
sysOutSim <- runSystemModel(sim=sim.stoch,systemModel = GR4J_wrapper,systemArgs = systemArgs,metrics = metrics)

Plotting system performance was introduced in the Introduction to Climate Stress Testing using foreSIGHT vignette. The use of multiple stochastic replicates adds further flexibility and insights.

6.1. Plotting OAT (One-At-a-Time) performance

For each attribute perturbation, performance can be visualized using probability bounds across multiple stochastic replicates.

First, the response to perturbations in the seasonality ratio is explored

# performance for changes in P_day_all_seasRatioMarMay (1st attribute)
plotPerformanceOAT(performance = sysOutSim, sim=sim.stoch, metric = 'meanQ',attSel=attPerturb[1],topReps=40,col='orange')
plotPerformanceOAT(performance = sysOutSim, sim=sim.stoch, metric = 'P99',attSel=attPerturb[1],topReps=40,col='orange')
plotPerformanceOAT(performance = sysOutSim, sim=sim.stoch, metric = 'P25',attSel=attPerturb[1],topReps=40,col='orange')
Performance for OAT changes in seasonality ratioPerformance for OAT changes in seasonality ratioPerformance for OAT changes in seasonality ratio

Performance for OAT changes in seasonality ratio

Next, the response to perturbations in extreme rainfall is shown:

# performance for changes in P_day_all_P99 (2nd attribute)
plotPerformanceOAT(performance = sysOutSim, sim=sim.stoch, metric = 'meanQ',attSel=attPerturb[2],topReps=40,col='orange')
plotPerformanceOAT(performance = sysOutSim, sim=sim.stoch, metric = 'P99',attSel=attPerturb[2],topReps=40,col='orange')
plotPerformanceOAT(performance = sysOutSim, sim=sim.stoch, metric = 'P25',attSel=attPerturb[2],topReps=40,col='orange')
Performance for OAT changes in extreme rainfallPerformance for OAT changes in extreme rainfallPerformance for OAT changes in extreme rainfall

Performance for OAT changes in extreme rainfall

6.2. Plot 2D performance spaces:

To explore joint responses to multiple attribute changes, we can generate 2D performance spaces. These plots show the mean system response across the top 40 replicates.


attX <- "P_day_all_seasRatioMarMay"
attY <- "P_day_all_P99"

# plot 2D performance spaces
plotPerformanceSpace(performance = sysOutSim, sim=sim.stoch, metric = 'meanQ',type='filled.contour',nContour=5,topReps = 40,attX = attX,attY = attY)
plotPerformanceSpace(performance = sysOutSim, sim=sim.stoch, metric = 'P99',type='filled.contour',nContour=5,topReps = 40,attX = attX,attY = attY)
plotPerformanceSpace(performance = sysOutSim, sim=sim.stoch, metric = 'P25',type='filled.contour',nContour=5,topReps = 40,attX = attX,attY = attY)
Plots of 2D performance spacesPlots of 2D performance spacesPlots of 2D performance spaces

Plots of 2D performance spaces

7. Conclusion

This vignette has demonstrated how stochastic weather generators can be used within foreSIGHT to generate perturbed climates for stress testing water systems. Unlike simple scaling approaches introduced in the Introduction to climate stress testing using foreSIGHT vignette, stochastic generation enables exploration of a much broader range of climate attributes, offering a more comprehensive assessment of system vulnerabilities.

We have shown how to generate and evaluate stochastic climates and integrate them into system models. Key challenges—such as selecting appropriate weather generator models, defining held and tied attributes, and managing numerical optimization—have also been discussed. While these challenges persist, this vignette has illustrated how they can be addressed within foreSIGHT.

Ongoing work continues to refine these methods and address remaining limitations. Users are encouraged to contact the foreSIGHT development team for further guidance, to share feedback, or to contribute suggestions for future improvements.

Appendix: Scott Creek case study

This case study performs stress-testing on streamflow in the Scott Creek catchment, South Australia, to changes in climate variables—specifically winter rainfall and extreme rainfall events. The analysis uses data from the CAMEL-AUS dataset (Fowler et al., 2021) for the period 1976–1985.

Streamflow is simulated using the GR4J rainfall-runoff model, implemented via the airGR package in R (Coron et al., 2017). Model parameters are calibrated using the calGR4J() function, and system response metrics are calculated with GR4J_wrapper(). For further details, refer to ?calGR4J and ?GR4J_wrapper in the R documentation.

Rainfall inputs are generated using a single-site, latent-variable stochastic weather generator (SWG). Hydrological indicators evaluated include mean flow, low flow (25th percentile), and high flow (99th percentile).

References