2. Attribute naming convention

foreSIGHT offers a uniquely flexible capability to perturb any number of climate attributes—either individually or in combination. To support this flexibility, attribute names follow a standardized naming convention in the format: “var_agg_strat_funcPars_op”, where

“var” is the variable name,
“agg” is the aggregation period,
“strat” is temporal stratification,
“func” is function name (with “Pars” denoting additional optional parameters), and
“op” is an optional operation.

For example, “P_year_all_avg” is the average yearly calculated using all the data rainfall (i.e. mean annual rainfall) and “P_day_all_P99_m” is annual mean of the 99th percentile of rainfall calculated over all days.

2.1. Variable names

The variable name “var” can be any string (that does not contain the separating character ‘_’), and could include:

“P”: rainfall
“Temp”: temperature
“PET”: Potential evapotranspiration

2.2. Temporal stratification

The aggregation period “agg” can be

“ann”: data from entire year
“JJA”, “SON”, etc: seasons
“Jan”, “Feb”, etc: months

2.3. Function names (and optional parameters)

Examples of function names and parameters include:

“tot”: total
“P99”: 99th percentile
“maxDSDT1”: maximum dry-spell duration (with threshold above 1mm)

A full list of attribute functions supported in foreSIGHT can be viewed using the helper functions viewAttributeFuncs(). The table below provides a summary of these functions.

*List of attribute functions available in* foreSIGHT.
Function name	Description	Options: Syntax	Options: Description	Options: Default	Typical variable
avg	Average (mean)	-	-	-	Any
avgDSD	Average dryspell duration (below threshold)	avgDSDTx	Threshold, x	x=0	P (rainfall)
avgDwellTime	Average dwell time, i.e. average of periods with below median value	-	-	-	Any
avgWSD	Average wetspell duration	avgWSDTx	Threshold, x	x=0	P (rainfall)
cor	Lag-1 autocorrelation	-	-	-	Any
cv	Coefficient of variation (sdev/mean)	-	-	-	Any
dyWet	Average rainfall on wet days (above threshold)	dyWetTx	Threshold, x	x=0	P (rainfall)
F0	Number of frost days (below zero)	-	-	-	Temperature
maxDSD	Maximum dry spell duration (below threshold)	maxDSDTx	Threshold, x	x=0	P (rainfall)
maxWSD	Maximum wet spell duration (above threshold)	maxWSDTx	Threshold, x	x=0	P (rainfall)
normP	Normalised percentile (percentile divided by mean)	normPx	Probability (%), x	None (must be specified)	Any
nWet	Number of wet days (above threshold)	nWetTx	Threshold, x	x=0	P (rainfall)
P	Percentile	Px	Probability (%), x	None (must be specified)	Any
R	Number of days above a threshold (often used for temperature)	Rx	Threshold, x	None (must be specified)	Temperature
rng	Inter-quantile range	rngx	Probability (%), x	x=90	Temperature
seasRatio	Seasonality ratio (ratio of seasonal to total rainfall)	seasRatioMon1Mon2	Start and end months of season, Mon1, Mon2	Mon1=Mar Mon2=Aug	P (rainfall)
tot	Total	-	-	-	P (rainfall)
WDcor	Lag-1 autocorrelation for wet days	-	-	-	P (rainfall)
wettest6monPeakDay	Day of year corresponding to the wettest 6 months	-	-	-	P (rainfall)
wettest6monSeasRatio	Ratio of wet season to dry season rainfall, based on wettest6monPeakDay	-	-	-	P (rainfall)

foreSIGHT also allows users to define their own custom functions. The names of these functions must have the format “func_customName”, where “customName” is the custom attribute function. These functions must have the argument “data”, which represents the time series of climate data. For example,

func_happyDays <- function(data){
  return(length(which((data>25)&(data<30))))
}

can be used in the attribute “Temp_day_all_happyDays_m” for calculating the average number of days each year with temperatures between 25^oC and 30^oC.

2.4. Operator name

The operator name is optional, and currently is limited to “m”: mean value of metric calculated over all years. Often “op” will be left empty.

The use of the operator “m” is a subtle, but important, choice. It determines whether the metric is calculated once using all the data (when operator is not specified), or whether the metric is calculated for each year, and then averaged over all years (when operator specified as “m”). For example, “P_day_all_P99” refers to the 99th percentile of daily rainfall calculated over all days, while “P_day_all_P99_m” would be the mean value of the 99th percentile of daily rainfall from each year.

This distinction becomes particularly important when comparing attributes across datasets of different lengths. For example, calculating the number of wet days without using "m" can be misleading, as longer time series will naturally accumulate more wet days. Including "m" ensures comparability by standardising the metric to an annual average.

The flexibility of attribute names can lead to some equivalence in attributes. E.g. "P_day_all_tot_m" is mean annual value of the total of daily rainfall. This is equivalent to "P_year_all_avg" - the average of the yearly rainfall.

It is noted that attributes are specified either as a fractional change relative to historical levels (and thus do not have units), or they are specified using the metric system.

2.5. Viewing attribute definitions and calculating values

The definition of climate attributes used in foreSIGHT can be viewed using the helper function viewAttributeDef() available in the package.

viewAttributeDef("P_day_all_tot_m")
#> [1] "Mean annual rainfall"

Can calculate attributes using calculateAttributes()

data("tankDat")
calculateAttributes(climateData=tank_obs,attSel=c('P_day_all_tot_m','P_day_all_P99'))
#> P_day_all_tot_m   P_day_all_P99 
#>         449.930          17.148

2.6. Multivariable attributes

foreSIGHT also supports the calculation of multivariable climate attributes, which are based on the interaction between two time series (e.g., precipitation, P, and potential evapotranspiration, PET ).

Multivariable attributes follow the naming convention “mv.var1.var2_agg_strat_func_op”, where

“var1” and “var2” are time series of climate variables
“func” is the multivariable function name
Other components (“agg”, “strat”, “op”) are defined as per single-variable attributes

Examples include

"mv.P.Temp_day_all_cor" which calculates correlation between daily precipitation and temperature
"mv.P.PET_day_MAM_avgWetDay" which is average value of PET on wet days (when P>0) between March and May.

Currently, a limited number of built-in multivariable attributes are available in foreSIGHT, as listed in the table below.

*List of multivariable attribute functions*
Function name	Description	Typical variable 1	Typical variable 2
avgDryDay	Average value of variable 2 on dry days (assuming P is variable 1)	P (rainfall)	Temp, PET
avgWetDay	Average value of variable 2 on dry days (assuming P is variable 1)	P (rainfall)	Temp, PET
cor	Correlation between variables	P (rainfall)	Temp, PET
cvDryDay	Coefficient of variation of variable 2 on dry days (assuming P is variable 1)	P (rainfall)	Temp, PET
cvWetDay	Coefficient of variation of variable 2 on wet days (assuming P is variable 1)	P (rainfall)	Temp, PET

Users can also define their own custom multivariable functions. For an example of the input and output requirements of these functions, see the built-in function mvFunc_cor().

mvFunc_cor
#> function (data.1, data.2) 
#> {
#>     return(stats::cor(data.1, data.2, use = "pairwise.complete.obs"))
#> }
#> <bytecode: 0x0000026ae737bce0>
#> <environment: namespace:foreSIGHT>

3. Creating the exposure space (Step A)

The createExpSpace() function is used to define the exposure space—a set of perturbations applied to selected climate attributes. The general process for specifying perturbed attributes, along with the sampling strategy and bounds, follows the same principles outlined in the Introduction to climate stress testing using foreSIGHT vignette

To generate realistic climate time series, it is often necessary not only to perturb key attributes but also to specify held and tied attributes. As a rule of thumb, the total number of target attributes (including perturbed, held, and tied attributes) should be at least as large as the number of fitted parameters in the stochastic weather generator (SWG).

Selecting held and tied attributes is often a challenging task. It requires expert judgment to determine which climate changes are plausible, and typically involves an iterative process of evaluation and refinement to ensure the generated scenarios are both realistic and relevant.

3.1. Perturbed attributes

As shown in the Quick Start Guide, the scaling approach for generating perturbed climates allows only a limited set of attributes to be explicitly modified—typically:

Annual averages and totals (e.g., P_day_all_tot_m, Temp_day_all_avg)
Seasonality ratios (e.g., P_day_all_seasRatioMarAug)

By contrast, stochastic simulation enables a much broader set of attributes to be perturbed. The actual attributes that can be modified depend on the capabilities of the specific stochastic generator being used (see Sections 4.3.1 and 4.3.2 for more details).

3.2. Held attributes

To maintain realistic climate behavior, certain attributes can be held constant, typically at their historical values. This is especially useful to prevent perturbations from producing unrealistic extremes or seasonal patterns.

Held attributes are specified using the attHold argument. For example:

attHold <- c("P_day_all_P99", "P_day_all_maxWSD_m", "P_day_all_seasRatioJunAug")

In this case, these three attributes are held at reference levels. The attributes 99th percentile rainfall, maximum length of the wet spells are held at reference levels so that the perturbations do not result in unrealistic extreme rainfall events. In addition, the attribute “P_day_all_seasRatioJunAug” is held at reference levels so that the seasonal cycle of the generated data is realistic.

There are also mechanisms to ensure to preferentially matching the desired values of some attributes during time series generation (as described in Section 4.3.4).

3.3. Tied attributes

Attributes can also be tied together, meaning they share the same perturbation. This is useful when certain attributes are logically or physically linked.

For example, to ensure that JJA rainfall totals (P_day_JJA_tot_m) change in line with annual rainfall totals (P_day_all_tot_m), use:

attTied <- list(P_day_all_tot_m = c("P_day_JJA_tot_m"))

In some situations, it may be beneficial to tie attributes for each season to their equivalent derived from the full time series. For example, tying seasonal 99th percentiles to the overall 99th percentile:

attTied <- setSeasonalTiedAttributes(attSel = "P_day_all_P99") 
attTied 
#> $P_day_all_P99
#> [1] "P_day_DJF_P99" "P_day_MAM_P99" "P_day_JJA_P99" "P_day_SON_P99"

This will tie P_day_DJF_P99 , P_day_MAM_P9 , P_day_JJA_P99 and P_day_SON_P99 to the attribute P_day_all_P99, ensuring consistent changes across seasons.

3.4 Example: Creating an exposure space

This example demonstrates the use of createExpSpace() to define a climate exposure space, including perturbed, held, and tied attributes. As noted earlier, selecting held and tied attributes requires expert judgment, particularly in anticipating how different attributes are likely to co-vary with those being explicitly perturbed. The process is often iterative.

Perturbed Attributes

In this example, we are interested in perturbing two climate attributes

Seasonality ratio from June–August ("P_day_all_seasRatioMarMay") which will be perturbed using factors from 0.7 to 1.3.
99th percentile of daily rainfall ( "P_day_all_P99") which will perturbed using factors from 1.0 to 1.3.

These two attributes are perturbed over a 5×5 regular grid.

Held Attributes

To maintain realism and constrain the stochastic generation process, the following attributes are held constant (at historical levels):

Mean annual rainfall "P_day_all_tot_m"
Seasonal rainfall: "P_day_DJF_tot_m", "P_day_JJA_tot_m", "P_day_SON_tot_m"
Number of wet days (annual and seasonal): "P_day_all_nWet_m", "P_day_DJF_nWet_m", "P_day_MAM_nWet_m", "P_day_JJA_nWet_m", "P_day_SON_nWet_m"
Average dry-spell duration (annual and seasonal): e.g. "P_day_all_avgDSD", "P_day_DJF_avgDSD", etc.

Tied Attributes

To ensure consistent changes across related attributes, the following tied relationships are defined:

Normalised 99th percentile rainfall for each season ('P_day_DJF_normP99', 'P_day_MAM_normP99', 'P_day_JJA_normP99', 'P_day_SON_normP99') is tied to changes in 99th percentile rainfall ("P_day_all_P99").
Spring rainfall ("P_day_MAM_tot") is tied to changes in the seasonality ratio ("P_day_all_seasRatioMarMay")

############################################################################
# create exposure space

attPerturbType <- "regGrid"
# perturb seasonality and 99% rainfall
attPerturb <- c('P_day_all_seasRatioMarMay','P_day_all_P99')
# consider a 5x5 grid
attPerturbSamp <- c(5,5)
attPerturbMin <- c(0.7,1.)
attPerturbMax <- c(1.3,1.3)
# will hold a number of attributes to historical values
attHold <- c('P_day_all_tot_m',
            'P_day_DJF_tot_m','P_day_JJA_tot_m','P_day_SON_tot_m',
            'P_day_all_nWet_m',
            'P_day_DJF_nWet_m','P_day_MAM_nWet_m',
            'P_day_JJA_nWet_m','P_day_SON_nWet_m',
            'P_day_all_avgDSD',
            'P_day_DJF_avgDSD','P_day_MAM_avgDSD',
            'P_day_JJA_avgDSD','P_day_SON_avgDSD'
            )
# tied attributes
# note: normP = P99/avg rainfall. normP in each season is tied to P99 
attTied <- list(P_day_all_P99=c('P_day_DJF_normP99', 
                               'P_day_MAM_normP99',
                               'P_day_JJA_normP99',
                               'P_day_SON_normP99'),
              P_day_all_seasRatioMarMay=c('P_day_MAM_tot_m')) # MAM rain tied to seasRatioMarMay

expSpace <- createExpSpace(attPerturb = attPerturb,
                          attPerturbSamp = attPerturbSamp,
                          attPerturbMin = attPerturbMin,
                          attPerturbMax = attPerturbMax,
                          attPerturbType = attPerturbType,
                          attHold = attHold,
                          attTied = attTied)

Note: The selection of perturbed, held, and tied attributes in this example creates some inherent conflicts, meaning that not all target values can be perfectly matched.

For instance, perturbing spring rainfall via "P_day_all_seasRatioMarMay" and "P_day_MAM_tot" while simultaneously holding total annual rainfall ("P_day_all_tot") and non-winter seasonal rainfall ("P_day_DJF_tot", "P_day_MAM_tot", "P_day_SON_tot") at historical values introduces inconsistencies. It is not possible to satisfy all these constraints simultaneously.

In such cases, the optimization process will aim to find the best compromise across the specified targets. The relative importance of each attribute can be adjusted using attribute weights, as described in Section 4.3.4.

4. Generating perturbed climate scenarios (Step B)

4.1. The inverse optimization approach

The ‘inverse’ approach to producing perturbed climates uses a numerical optimization to calibrate parameters in a stochastic weather generator that produces time series with attributes that are as close as possible to the target attributes.

To understand what is going on, we need to get into a bit more theory. Let’s imagine we have i=1,…,n different attributes, such as mean annual rainfall and 99th percentile of daily rainfall (in which case n=2). The optimization approach seeks to generate time series that minimize the difference in target values between the attribute values of a simulated time series \(a_{i,j}\) and target values \(t_{i,j}\), as given by the following equation:

\(O(\mathbf{a}_j, \mathbf{t}_j) = \sqrt{\sum_{i=1}^{n} [\lambda_i(a_{i,j} - t_{i,j})]^2}\)

Here \(\lambda_i\) represents a user-specified penalty parameter for each attribute \(i\). The use of larger values for \(\lambda_i\) emphasizes those attributes in the calibration.

4.2. Generating scenarios with generateScenarios()

Generating stochastic perturbed climates in foreSIGHT is carried out using the generateScenarios(). function. The foreSIGHT help file for this functions provides details on the use of generateScenarios().Somekey inputs to this function are as follows.

Control file (controlFile)

The controlFile argument specifies:

The stochastic weather generator (SWG) and any post-processing of SWG outputs
The optimization settings (weights, numerical optimizer)
Parameter bounds

The format and available options for the control file are described in Section 4.3.

(Note that if the controlFile argument is not specified or set to NULL, the default stochastic model and associated settings will be used to generate the scenarios. If controlFile='scaling' , then scaling approaches are used instead of stochastic simulation - see the Introduction to climate stress testing using foreSIGHT vignette.)

Length of perturbed series (simLengthNyrs)

The argument simLengthNyrs can be used to specify the desired length of the stochastically generated perturbed time series in years using generateScenarios

simLengthNyrs <- 100

Number of replicates (numReplicates)

By default, generateScenarios will generate a single replicate (or stochastic realization) of the perturbed time series. More replicates can be generated by specifying the numReplicates argument of generateScenarios.

numReplicates <- 5

The random seed (seedID)

The random seed used for stochastic generation of the first replicate is specified by seedID. The random seeds for the subsequent replicates are incremented by 1. Thus, the perturbed stochastic data generated using generateScenarios() for the same function arguments would typically be different.

Numbers of core in parallel processing (cores)

The inverse approach is a computationally intense approach to SWG calibration, as it requires many evaluations of the objective function, with each objective function calculation requiring simulation of the SWG and calculating attributes. Furthermore, objective function surface can be complex, making optimization challenging.

Combined with large numbers of replicates and targets and or longer time series, can result in the generation of scenarios becoming infeasible using a single processor. generateScenarios() supports parallel execution using multiple cores using the cores argument.

4.3. Stochastic simulation options (the control file)

4.3.1. Control file format

Options for stochastic simulation can be modified using the controlFile argument in generateScenarios(), defining the location of a JSON file.

For example, stochastic models are defined using the modelType and modelParameterVariation fields in the controlFile (as described in Section 4.3.2). The user can create a JSON file for input to generateScenarios. As an example, the following text may be used in the JSON file to select alternate models for precipitation and temperature.

# Example text to be copied to a text JSON file
{
  "modelType": {
    "P": "latent",
    "Temp": "wgenLM"
  },
  "modelParameterVariation": {
    "P": "seas",
    "Temp": "har"
  }
}

The helper function writeControlFile() available in foreSIGHT can be used to create a sample JSON file that provides a template to create control files to specify alternate models that the user needs. The writeControlFile() function can be used without arguments as shown below. Note that the following function call will write a JSON file named sample_controlFile.json into your working directory.

writeControlFile()

Alternatively, the toJSON function from the jsonlite package can be used to create a JSON file for the controlFile from an R list as shown below.

# create a list containing the specifications of the selected models
controlFileList <- list()
controlFileList[["modelType"]] <- list()
controlFileList[["modelType"]][["P"]] <- "latent"
controlFileList[["modelType"]][["Temp"]] <- "wgenO"
controlFileList[["modelParameterVariation"]] <- list()
controlFileList[["modelParameterVariation"]][["P"]] <- "seas"
controlFileList[["modelParameterVariation"]][["Temp"]] <- "har"
utils::str(controlFileList)

# write the list into a JSON file
controlFileListJSON <- jsonlite::toJSON(controlFileList, pretty = TRUE, auto_unbox = TRUE)
write(controlFileListJSON, file = paste0(tempdir(), "\\eg_controlFile.json"))

# input a the JSON file
controlFile <- paste0(tempdir(), "\\eg_controlFile.json")

4.3.2. Stochastic weather generators

There is a plethora of options for stochastic weather generators described in the scientific literature, each with different features and assumptions. This variety provides a high degree of flexibility for perturbing weather time series in a variety of ways to comprehensively ‘stress test’ a climate-sensitive system. The foreSIGHT software currently supports a number of weather generators, and is developed such that developers and advanced users can add other weather generators over time.

The choice depends:

Data timestep. Most weather generators run at a daily timestep, which is a common timestep for reporting of key weather variables. However there are also weather generators available at various sub-daily timesteps, as well as longer aggregated timesteps such as monthly or annual. The key consideration here is to ensure the time step is consistent with the likely timescales of system performance sensitivity, which in turn need to correspond to the relevant time scales of weather inputs that are require for the system model.
The key timescales of system sensitivity. In addition to data timestep, it is important that the stochastic generator is able to simulate variability in Climate Attributes representing key timescales of system sensitivity. For example some systems may respond at sub-seasonal and seasonal timescales, and others at interannual timescales. It is noted that a stochastic generator may be able to simulate variability at timescales that are equal to or longer than its timestep, but not shorter.
Relevant weather variables. Weather generators have the capability of simulating a range of surface variables, including precipitation, temperature, wind, solar radiation, humidity and so forth—as well as derived variables such as potential evapotranspiration that are calculated through various recognised formulations. In many cases weather generators simulate precipitation first, followed by the other variables that are then conditioned to the precipitation time series; however each weather generator is different and it is necessary to review the documentation to understand the basis for generating the weather time series.
Other key structural features that drive the weather generator’s capacity to simulate individual or combined changes in attributes. For example, for some systems, it may be important to capture dependence between climate variables or across spatial locations.

A pragmatic approach to assess the appropriateness of weather generation choice is through evaluating the relevant diagnostics in achieving specified target attributes. Poor performance in diagnostics may be due to several issues, including weather generator suitability. The use of weather generator diagnostics is discussed later in this section.

foreSIGHT includes a few stochastic models that the user can select to generate the scenarios. The options differ in the time step, model formulation and the temporal variation of the model parameters. The models available in foreSIGHT can be viewed using the function viewModels as shown below. The defaultModel column indicates the default stochastic model that will be used if the controlFile argument is NULL. The usage of this function is demonstrated below.

viewModels()  
#> [1] "Please select a valid variable. The valid variable names are:"
#> [1] "P"    "Temp" "PET"
# View the models available for a specific climate variable
viewModels("P")  
#>      modelType modelParameterVariation timeStep default
#> 1        BLRPM                     ann   1 hour   FALSE
#> 2       latent                     ann    1 day   FALSE
#> 3       latent                    seas    1 day    TRUE
#> 4       latent                     har    1 day   FALSE
#> 5         wgen                     ann    1 day   FALSE
#> 6         wgen                    seas    1 day   FALSE
#> 7         wgen                     har    1 day   FALSE
#> 8  distScaling                     ann      any   FALSE
#> 9       monAR1                     ann  1 month   FALSE
#> 10      monAR1                    seas  1 month   FALSE
#> 11      monAR1                   seas1  1 month   FALSE
#> 12      monAR1                   seas2  1 month   FALSE
#> 13      monAR1                     har  1 month   FALSE
# View models available for temperature 
viewModels("Temp")
#>   modelType modelParameterVariation timeStep default
#> 1     wgenO                     ann    1 day   FALSE
#> 2     wgenO                    seas    1 day   FALSE
#> 3     wgenO                     har    1 day    TRUE
#> 4     wgenO                  seasWD    1 day   FALSE
#> 5     wgenO                   harWD    1 day   FALSE

A summary of these models is provided in the table below.

*Description of models available in* foreSIGHT.
Variable	Model type	Time step	Parameter variation	Description
P	latent	1 day	ann	4 parameter latent variable (LV) rainfall model
”	”	”	seas	16 parameter LV rainfall model, with seasonal variations in parameters
”	”	”	har	12 parameter LV rainfall model, with harmonic variations in parameters
”	wgen		ann	4 parameter Richardson WGEN rainfall model
”	”	”	seas	16 parameter WGEN rainfall model, with seasonal variations in parameters
”	”		har	12 parameter WGEN rainfall model, with harmonic variations in parameters
”	BLRPM	1 hour	ann	5 parameter Bartlett-Lewis Rectangular Pulse sub-daily rainfall model
”	monAR1	1 month		4 parameter transformed AR(1) monthly rainfall model
Temp	wgenO	1 day	ann	3 parameter WGEN AR(1) model for temperature
”	”	”	seas	12 parameter WGEN AR(1) model for temperature, with seasonal variations in parameters
”	”	”	har	7 parameter WGEN AR(1) model for temperature, with harmonic variations in all parameters except autocorrelation parameter
”	”	”	seasWD	20 parameter WGEN AR(1) model for temperature, with separate parameters for wet and dry days, and seasonal variations in parameters.
”	”	”	harWD	13 parameter WGEN AR(1) model for temperature, with separate parameters for wet and dry days, and harmonic variations in parameters. A single autocorrelation parameter is used for all seasons and wet-dry days.
PET	”	”	seas	12 parameter WGEN AR(1) model for PET, with seasonal variations in parameters
PET	”	”	har	7 parameter WGEN AR(1) model for PET, with harmonic variations in all parameters except autocorrelation parameter
PET	”	”	seasWD	20 parameter WGEN AR(1) model for PET, with separate parameters for wet and dry days, and seasonal variations in parameters.
PET	”	”	harWD	13 parameter WGEN AR(1) model for PET, with separate parameters for wet and dry days, and harmonic variations in parameters. A single autocorrelation parameter is used for all seasons and wet-dry days.

The selection of model type and parameter variation can specified in the controlFile, as seen in the following example:

controlFileList <- list()
controlFileList$modelType <- list()
controlFileList$modelType$P <- "latent"
controlFileList$modelType$PET <- "wgenO"

controlFileList$modelParameterVariation <- list()
controlFileList$modelParameterVariation$P <- "seas"
controlFileList$modelParameterVariation$PET <- "harWD"

Multisite rainfall

foreSIGHT supports the generation of multisite precipitation using the latent variable model with the approach introduced by McInerney et al. (2023). This method allows the simulation of rainfall at multiple locations simultaneously, capturing key spatial dependencies. It also allows the exploration of changes in spatial correlation between sites.

To implement this, the reference data must contain precipitation data at multiple sites (see Introduction to climate stress testing using foreSIGHT vignette, Section 4.1), and the latent variable model is specified in the control file. An example is provided in the help documentation for generateScenarios().

4.3.3. Post-processing of stochastic weather generator output

In addition to generating climate time series using stochastic weather generators (SWGs), foreSIGHT provides the ability to apply post-processing adjustments to the simulated output. This step helps to address known limitations of SWGs and ensures that certain climate features are better represented or that unrealistic behavior is constrained.

These post-processing steps are applied after the SWG has generated the time series and are independent of the SWG model itself—meaning they can be used flexibly across different SWGs in foreSIGHT.

Available post-processing options are:

Annual variability adjustment (annVar)
Adjusts the interannual variability of the generated time series to better reflect historical or target levels. This is especially important because many SWGs tend to underestimate annual variability, which can significantly affect systems that are sensitive to year-to-year climate fluctuations.
Extreme value scaling (scaleExtremes, scaleExtremesSeas)
Applies constraints to the upper tail of the distribution to ensure that changes in very high percentiles (e.g. >99%) scale consistently with the 99th percentile. This helps prevent disproportionate increases in extreme values during perturbation.
- scaleExtremes: Applies the adjustment across the entire time series.
- scaleExtremesSeas: Applies the adjustment separately for each season, preserving seasonal differences in extremes.
These options are especially useful for avoiding unrealistic extremes, such as excessively high rainfall events, which can result from large shifts in the distribution tails. Without this control, such anomalies may lead to distorted or misleading outcomes in downstream impact modelling.

An example of the specification of post-processing options is:

controlFileList[['postProcessing']] <- list(P=list())
controlFileList[['postProcessing']]$P$types <- c('scaleExtremes','annVar')

4.3.4. Optimization

Objective function weights

The objective function shown in Section 4.1 allows different attributes to be prioritized in the calibration through the use of the weights \(\lambda\) . The precise value of these weights depends on the overall objectives of the problem and will often be adjusted after assessing the diagnostics of the stochastically generated time series (discussed later in this section). The likely usage of this is to provide a means of instructing the optimizer to: ‘generate stochastic realizations that reflect the perturbed attributes, while keeping other relevant features of the time series described by the held attributes as close to the reference time series as possible’— by placing greater priority on the perturbed attributes.

The weights are specified by specifying “penaltyAttributes” and “penaltyWeights” in the control file. For example, if we want to apply \(\lambda=3\) for "P_day_all_tot_m", and \(\lambda=2\) for "P_day_all_P99", this is achieved as follows:

controlFileList[["penaltyAttributes"]] <- c("P_day_all_tot_m","P_day_all_P99")
controlFileList[["penaltyWeights"]] <- c(3,2)

Weights for other attributes have the default value of 1.

Parameter Bounds

When using numerical optimization to estimate parameters of a stochastic weather generator (SWG), parameter bounds must be specified. These bounds define the allowable range for each model parameter during the calibration process.

The default bounds for each supported model can be viewed using the viewModelParameters() function by specifying the variable name, model type, and type of parameter variation. For example, to view the bounds for rainfall ("P") using the latent variable model with seasonal parameter variation:

viewModelParameters(variable = 'P',modelType='latent',modelParameterVariation='seas')
#>     parameter min_bound max_bound
#> 1   alpha.SON       0.1       0.9
#> 2   alpha.DJF       0.1       0.9
#> 3   alpha.MAM       0.1       0.9
#> 4   alpha.JJA       0.1       0.9
#> 5   sigma.SON       0.5       8.0
#> 6   sigma.DJF       0.5       8.0
#> 7   sigma.MAM       0.5       8.0
#> 8   sigma.JJA       0.5       8.0
#> 9      mu.SON      -8.0       2.0
#> 10     mu.DJF      -8.0       2.0
#> 11     mu.MAM      -8.0       2.0
#> 12     mu.JJA      -8.0       2.0
#> 13 lambda.SON       1.0       5.0
#> 14 lambda.DJF       1.0       5.0
#> 15 lambda.MAM       1.0       5.0
#> 16 lambda.JJA       1.0       5.0

Users can override default parameter bounds by specifying them in the controlFile using the "modelParameterBounds" argument.

For example, to modify the bounds of the "sigma.SON" parameter for rainfall so that it ranges from 0.001 to 20 (instead of the default range of 0.5 to 8), use the following:

controlFileList[["modelParameterBounds"]] <- list()
controlFileList[["modelParameterBounds"]][["P"]] <- list()
controlFileList[["modelParameterBounds"]][["P"]][["sigma.SON"]] <- c(0.001, 20)

Optimization arguments

The choice of numerical optimization routine and settings plays a large role in how well the simulated attributes match the desired targets. The default optimization routine used in **foreSIGHT is the Robust Gauss Newton (RGN) algorithm (Qin et al, 2018). Default values for optimization arguments in foreSIGHT can be viewed using the viewDefaultOptimArgs() helper function in the package.

viewDefaultOptimArgs()
#> $optimizer
#> [1] "RGN"
#> 
#> $obj.func
#> [1] "WSS"
#> 
#> $seed
#> NULL
#> 
#> $nMultiStart
#> [1] 5
#> 
#> $OFtol
#> [1] 0
#> 
#> $seed
#> NULL
#> 
#> $RGN.control
#> $RGN.control$iterMax
#> [1] 100
#> 
#> $RGN.control$distMin
#> [1] 0

This shows that RGN is the default optimizer, and that optimization will be performed using 5 different initial parameter sets (multi-starts) to improve optimization. It also shows control settings used in the RGN algorithm.

**foreSIGHT has the capability to use alternate optimization routines, including a genetic algorithm (‘GA’), the shuffled complex evolution algorithm (‘SCE’), and the Nelder-Mead method (‘NM’). Default settings for ‘GA’ are shown using

viewDefaultOptimArgs('GA')
#> $optimizer
#> [1] "GA"
#> 
#> $obj.func
#> [1] "WSS"
#> 
#> $seed
#> NULL
#> 
#> $nMultiStart
#> [1] 5
#> 
#> $OFtol
#> [1] 0
#> 
#> $seed
#> NULL
#> 
#> $GA.args
#> $GA.args$pcrossover
#> [1] 0.8
#> 
#> $GA.args$pmutation
#> [1] 0.1
#> 
#> $GA.args$maxiter
#> [1] 50
#> 
#> $GA.args$maxFitness
#> [1] -0.001
#> 
#> $GA.args$popSize
#> [1] 500
#> 
#> $GA.args$run
#> [1] 20
#> 
#> $GA.args$parallel
#> [1] FALSE
#> 
#> $GA.args$keepBest
#> [1] TRUE

The code below demonstrates how user-specified optimisation arguments can be used in the controlFile input to generateScenarios.

This first example shows how the number of multi-starts can be changed to 1 (from the default value of 5)

# add user-specified values for optimisation arguments
controlFileList[["optimisationArguments"]] <- list()
controlFileList[["optimisationArguments"]][["nMultiStart"]] <- 1

This second example shows how we can change the optimizer to ‘GA’, and specify GA::ga() arguments for maximum iterations (‘maxiter’) and stopping criteria (‘run’)

# add user-specified values for optimisation arguments
controlFileList[["optimisationArguments"]] <- list()
controlFileList[["optimisationArguments"]][["optimizer"]] <- 'GA'
controlFileList[["optimisationArguments"]][["nMultiStart"]] <- 1
controlFileList[["optimisationArguments"]][["GA.args"]] <- list(maxiter=100,run=40)

4.3.5. Advice for managing the number of parameters and target attributes

When using the inverse approach to estimate stochastic weather generator (SWGs) parameters in foreSIGHT, it is recommended that the number of target attributes be at least equal to the number of fitted model parameters. Otherwise, the optimization process may be under-constrained, leading to non-unique solutions and potentially large uncertainty in climates.

Several strategies can help ensure this balance:

Use of tied and held attributes: Increasing the number of tied and held parameters
Fixing SWG parameters: Some model parameters can be fixed based on prior knowledge or separate calibration exercises, reducing the number of parameters that need to be fitted during optimization.
- Parameters can be estimated using methods such as the Method of Moments or Maximum Likelihood Estimation. The function modCalibrator() is available in foreSIGHT for this purpose and supports a subset of SWG models.
- As an alternative, parameters can be estimated using generateScenarios() by specifying the exposure space with attributes held to their historical (i.e. unperturbed) values, effectively calibrating the SWG to reproduce observed conditions before introducing perturbations.
Parameters can then be fixed by setting the parameter bounds to these values.

4.3.6. Example: Generating perturbed stochastic climates

This example demonstrates how the generateScenarios() function can be used to produce perturbed stochastic climates. We focus on daily rainfall perturbations for the Scott Creek catchment in South Australia (see Appendix for case study details). The exposure space defined in Section 3.4 provides the basis for the perturbations applied in this example.

The controlFile is configured to specify the following:

Stochastic Weather Generator (SWG) model type and parameterisation:
We use the latent-variable SWG with seasonally varying parameters.
Optimization settings:
A tolerance value is set to define the stopping criteria for the multistart optimization procedure.
Weights for the objective function:
Highest weights are assigned to the perturbed attributes 'P_day_all_seasRatioJunAug' and 'P_day_all_P99', along with 'P_day_all_tot_m', a known key driver of system response.

The call to generateScenarios() specifies the generation of 50 stochastic weather replicates. Given the combination of a large number of replicates (50) and multiple target locations in the exposure space (25), parallel processing on high-performance computing (HPC) is recommended to reduce runtime.

The example below demonstrates the use of 25 processing cores, and completes in approximately 20 minutes under these settings.


# load dates, precip, PET and streamflow data for Scott Creek
data('data_A5030502')

############################################################################

# create reference data - won't include PET here since not modelled 
clim_ref <- list(times = data_A5030502$times,
                 P = data_A5030502$P)  

############################################################################
# setup model settings

controlFileList <- list()

controlFileList$modelType <- list()
controlFileList$modelType$P <- "latent" # latent variable model

controlFileList$modelParameterVariation <- list()
controlFileList$modelParameterVariation$P <- "seas" # parameters vary with season

controlFileList[["optimisationArguments"]] <- list()
controlFileList[["optimisationArguments"]][["OFtol"]] <- 0.1 # stop multi-start optimization when OF < OFtol

# set penalty weights. More weights to perturbed attributes and total rainfall (which is known to have large impact)
controlFileList[["penaltyAttributes"]]<-c('P_day_all_seasRatioMarMay',
                                        'P_day_all_P99','P_day_all_tot_m',
                                           'P_day_all_avgDSD','P_day_all_nWet_m')
controlFileList[["penaltyWeights"]] = c(3,3,3,1.5,1.5)

# write to JSON file
controlFileListJSON <- jsonlite::toJSON(controlFileList, pretty = TRUE, auto_unbox = TRUE)
controlFile <- paste0(tempdir(), "\\eg_controlFile.json")
write(controlFileListJSON, file = controlFile)

###############
# generate scenarios

sim.stoch <- generateScenarios(reference = clim_ref,        # reference climate data
                               expSpace = expSpace,         # exposure space (defined in Section 3.4)
                               controlFile = controlFile,   # control file constructed above
                               seedID = 1,                  # random seed
                               numReplicates = 50,          # stochastic replicates
                               cores = 25)                  # cores for parallel processing

5. Evaluation of perturbed climates

The generation of perturbed climates through stochastic simulation requires careful evaluation on the part of the user to ensure that:

Target attributes are accurately achieved through optimization,
The simulated climates realistically represent key features of the historical climate, and
Other attributes respond to perturbations in a reasonable and interpretable manner.

Failure to perform proper evaluation can lead to misleading conclusions about the impacts of changes in climate attributes.

The following sections introduce diagnostic tools available in foreSIGHT for evaluating the performance of stochastic climates and help identify common issues. We also offer strategies for resolving problems. Diagnostics are best used as part of an iterative workflow—producing, evaluating, and refining stochastic climates until performance is satisfactory.

5.1. Evaluation of target attributes

foreSIGHT contains a function named plotScenarios() which can be used to create plots of the biases in attributes of the simulated data relative to the specified target values, for perturbed, held and target attributes. The function uses a simulation performed using generateScenarios() as input and plots the mean and standard deviation of the absolute biases of each attribute and target, across all the replicates in heatmap-like plots. It is implemented as follows:

p <- plotScenarios(sim) # sim the output from generateScenarios

The figures can be used to assess how well the simulations capture the desired target values of the attributes.

As a rough estimate, biases around or less than 5% are acceptable. If there are larger mean biases, we recommend that you evaluate the following:

Optimizer configuration: Has it had enough scope to find a good solution? If not, adjust its settings.
Model structure: Can the model simulate the desired attribute combinations? If not, a different model structure may be required.
Over-constraint: Are the target combinations unrealistic or conflicting? If so, potentially revise targets or adjust penalties to prioritize key attributes. Note that it may be OK, or even expected, that some held or tied attributes have large attributes based on the combinations of target attributes.

If the standard deviation of the absolute biases across the replicates are high, this indicates that the attribute value is highly variable across the replicates in the generated data. This could be due to

Under-constrained calibration: Increase the number of target attributes (held/tied) or reduce the number of fitted parameters. Properly constrained calibration requires at least the same number of target attributes as fitted parameters.
Optimization performance: If the optmiizer is unable to consistently find parameters that give good performance, you may need to adjust the optimization settings.

The use of plotScenarios() is demonstrated for the simulated climates generated in Section 4.3.6. (for the full exposure space, and with 50 replicates) as follows:

plotScenarios(sim.stoch)

Mean biases in simulated attributes

SD of biases in simulated attributes

These figures show the following:

Mean values: Most attributes exhibit small mean biases, particularly the perturbed attributes—such as the 99th percentile rainfall and the seasonality ratio—which are closely aligned with their target values. The largest bias is observed for JJA rainfall, an issue that will be examined further in Section 5.2.
Standard deviation: Variability is generally low for the perturbed attributes and those assigned higher weights during optimization. The greatest variability is seen in the number of wet days.

5.2. Evaluating changes in attributes

This section explores how a wide range of climate attributes respond to imposed perturbations—including those that are explicitly perturbed, held, or tied, as well as independent (unconstrained) attributes.

Key aspects to examine include:

Large deviations from target values for target attributes, especially perturbed attributes.
Significant biases in attributes not used in the optimization process when compared to the historical baseline climate.
Unexpected changes in attributes that were not intended to be perturbed.

If such issues are observed, consider the following actions:

Reassess the selection of held and tied attributes.
Explore alternative SWG model structures, or apply post-processing techniques to improve attribute alignment or unrealistic responses.

The example below evaluates changes across a wide set of attributes for the simulated climates generated in Section 4.3.6 (which includes additional targets and replicates compared with the example code).

In particular, we examine perturbations to the attribute "P_day_all_seasRatioJunAug", and assess responses across:

Attributes included in the exposure space
Measures of temporal persistence, such as 'P_year_all_avgDwellTime'
Wet spell characteristics, e.g., 'P_day_all_avgWSD'
Indicators of extreme events, including the 99.9th percentiles for all and seasonal data, i.e. 'P_day_all_P99.9','P_day_JJA_P99.9', 'P_day_SON_P99.9', 'P_day_DJF_P99.9', 'P_day_MAM_P99.9'


# select target attributes (perturbed/held/tied)
attSel <- colnames(sim.stoch$expSpace$targetMat)
# other attributes
attSel <- c(attSel,'P_year_all_avgDwellTime','P_day_all_avgWSD',
           'P_day_all_P99.9','P_day_JJA_P99.9','P_day_SON_P99.9',
           'P_day_DJF_P99.9','P_day_MAM_P99.9')

# plot changes in attributes for perturbations in seasonality ratio
att <- "P_day_all_seasRatioMarMay"
par(mfrow=c(10,3),mar=c(4,7,2,1))
# plot changes in a single attribute with respect to perturbed attrbiutes
plotPerformanceAttributesOAT(clim=clim_ref,        # reference climate
                             sim=sim.stoch,        # simulated perturebed climates (see Section 4.3.6)
                             attPerturb=att,       # perturbed attribute
                             attEval=attSel,       # seelected attributes for evaluation
                             cex.main = 1.5,cex.xaxis = 1,cex.yaxis = 1)    # formatting options

Evaluation of changes in climate attributes for perturbations in P_day_all_seasRatioMarMay

In these plots:

Dashed blue lines indicate the target values for the perturbed attributes.
Green lines represent the target values for held and tied attributes.
The black line shows the median of the simulated attribute values, while the grey band represents the 90% range (5th to 95th percentiles).
Red crosses highlight attributes with large biases—greater than 20%—relative to their historical (unperturbed) values.

These results can be interpreted as follows:

The perturbed attribute is well captured, aligning closely with the target value and showing low uncertainty across replicates.
Most held and tied attributes are also reproduced reasonably well, although with slightly greater uncertainty. Notable exceptions include mean seasonal rainfall in JJA, SON, and DJF (P_day_tot_JJA_m, P_day_tot_SON_m, P_day_tot_DJF_m), which deviate more substantially from their targets.

This result is expected and reflects a known limitation discussed in Section 3.4. Specifically, perturbing spring rainfall using the attribute "P_day_all_seasRatioMarMay" while simultaneously holding Total annual rainfall (P_day_all_tot), and Seasonal rainfall for DJF, MAM, and SON (P_day_DJF_tot, P_day_MAM_tot, P_day_SON_tot) creates internal inconsistencies. These constraints cannot all be satisfied at once.

The optimization resolves this by allowing "P_day_DJF_tot", "P_day_MAM_tot", and "P_day_SON_tot" to shift in the opposite direction of the spring rainfall perturbation. This behavior is consistent with expectations and confirms that the model is responding appropriately to conflicting constraints.
We also see that attributes not used in the calibration (e.g. dwell time 'P_year_all_avgDwellTime' and higher extremes such as 'P_day_all_P99.9' have larger biases.

Next we evaluate the response to perturbations in "P_day_all_P99":

# plot changes in attrtibutes for perturbations in P99
att <- "P_day_all_P99"
par(mfrow=c(10,3),mar=c(4,7,2,1))
# plot changes in a single attribute with respect to perturbed attrbiutes
plotPerformanceAttributesOAT(clim=clim_ref,        # reference climate
                             sim=sim.stoch,        # simulated perturebed climates (see Section 4.3.6)
                             attPerturb=att,       # perturbed attribute
                             attEval=attSel,       # seelected attributes for evaluation
                             cex.main = 1.5,cex.xaxis = 1,cex.yaxis = 1)    # formatting options

Evaluation of changes in climate attributes for perturbations in P_day_all_P99

In these plots, red lines highlight attributes that change substantially more than intended—specifically, where the slope of the attribute response relative to the perturbed attribute exceeds 1.5.

These results can be interpreted as follows:

The perturbed, held, and tied attributes are all well reproduced, indicating that the optimization is working as expected for these constraints.
However, there are notably large changes in 99.9th percentile rainfall, which are much greater than the changes in the 99th percentile.

This discrepancy may be problematic. It introduces uncertainty in interpreting system responses—specifically, whether observed changes are due to the intended perturbation (e.g., 99th percentile rainfall) or to unintended increases in more extreme events (e.g., 99.9th percentile rainfall). In such cases, it may be advisable to limit the magnitude of changes in rainfall extremes beyond the 99th percentile. As discussed in Section 4.3.3, post-processing techniques can be applied to constrain these extremes and prevent the generation of unrealistically large rainfall events.

5.3. Evaluating the ability of SWGs to capture climate features relevant to the system model

This section addresses the critical connection between stochastic climates and system model responses.

To ensure meaningful analysis, the stochastic weather generator (SWG) must be capable of reproducing the climate features that are most influential for the system model’s behavior. One way to assess this is to compare how the system responds under different climate inputs.

In particular, it is recommended to compare system responses under:

Observed historical climate, and
Unperturbed stochastic climates (i.e., generated climates with no imposed perturbations)

This allows users to assess whether the SWG adequately captures the key climate drivers that influence the system’s performance. See Nguyen et al (2023) for further explanation of this approach.

The function evaluate_system_metrics() in foreSIGHT supports this evaluation by comparing system metrics across different climate inputs. We demonstrate this approach using the rainfall–runoff case study for the Scott Creek catchment in South Australia (see Appendix for details).

First, we setup the system model for this application:


# put dates in format required for airGR
dates <- as.Date(clim_ref$times)

# observed flow
Qobs <- data_A5030502$Qobs
# observed PET 
PET <- data_A5030502$PET 

# calibrate GR4J parameters
Param <- calGR4J(dates = dates,P=clim_ref$P,PET=PET,Qobs=Qobs)

# setup systemArgs and metrics
systemArgs <- list(dates=dates,Param=Param,PET=PET)
metrics <- c('meanQ','P99','P25','min3yr')

Next, evaluate_system_metrics() is used to calculate streamflow metrics for the observed and baseline stochastic climates, with boxplots used to compare these values.

eval <- evaluate_system_metrics(sim=sim.stoch,clim=clim_ref,
                               systemModel=GR4J_wrapper,systemArgs=systemArgs,
                               metrics=metrics)

par(mfrow=c(2,2),mar=c(3,4,1,1))
for (m in 1:length(metrics)){
  metric <- metrics[m]
  yAll <- c(eval$systemPerf_base[[metric]],eval$systemPerf_obsClim[metric])
  ylim <- c(min(yAll),max(yAll))
  boxplot_prob(as.vector(eval$systemPerf_base[[metric]]),ylim=ylim)
  points(eval$systemPerf_obsClim[metric],col='red',pch=19)
  title(metric)
}

Evaluation of stochastic climates in terms of capturing hydrological response to observed climates

This figure shows that, for each system metric, the values obtained using the observed climate (red points) fall within the range produced by the baseline stochastic climate simulations (boxplots). This indicates that the SWG is successfully capturing the climate features relevant to the system model—at least under historical (unperturbed) conditions.

‘Stress-Testing’ using foreSIGHT: Stochastic simulation

David McInerney, Seth Westra, Anjana Devanand, Michael Leonard, Sam Culley, Bree Bennett

14 Sep 2025

1. Overview