es() - Exponential Smoothing

Ivan Svetunkov

2026-02-05

es() is part of the smooth package and is a wrapper for the adam() function with distribution="dnorm". It implements Exponential Smoothing in the ETS form, selecting the most appropriate model among 30 possible ones.

We will use some functions from the greybox package in this vignette for demonstration purposes.

Let’s load the necessary packages:

require(smooth)
require(greybox)

The simplest call for the es() function is:

ourModel <- es(BJsales, h=12, holdout=TRUE, silent=FALSE)

## Forming the pool of models based on... ANN , AAN , Estimation progress:    60 %80 %100 %... Done!

ourModel

## Time elapsed: 0.07 seconds
## Model estimated using es() function: ETS(AAdN)
## With backcasting initialisation
## Distribution assumed in the model: Normal
## Loss function type: likelihood; Loss function value: 237.6128
## Persistence vector g:
##  alpha   beta 
## 0.9448 0.2979 
## Damping parameter: 0.8789
## Sample size: 138
## Number of estimated parameters: 4
## Number of degrees of freedom: 134
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 483.2257 483.5264 494.9347 495.6756 
## 
## Forecast errors:
## ME: 2.817; MAE: 2.967; RMSE: 3.654
## sCE: 14.869%; Asymmetry: 88%; sMAE: 1.305%; sMSE: 0.026%
## MASE: 2.491; RMSSE: 2.382; rMAE: 0.957; rRMSE: 0.954

In this case the function uses a branch-and-bound algorithm to form a pool of models to check, and then constructs the model with the lowest information criterion. As we can see, it also prints a brief summary of the model, which contains:

How much time elapsed during model construction;
What type of ETS was selected;
Values of persistence vector (smoothing parameters);
What type of initialisation was used;
How many parameters were estimated (standard deviation is included);
Loss function type and the value of that loss function;
Information criteria for this model;
Forecast errors (because we have set holdout=TRUE).

The function has also produced a graph with actual values, fitted values and point forecasts.
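The first three forecast error measures in the output are simple summaries of the holdout errors. As a minimal sketch in base R, the block below computes them for a naive benchmark (the last in-sample value repeated over the holdout). The benchmark is an assumption chosen for illustration; it is not the forecast produced by es().

```r
# Split BJsales the way holdout=TRUE with h=12 does: the last 12 observations are held out
y <- as.numeric(BJsales)
h <- 12
inSample <- y[1:(length(y) - h)]
holdout  <- y[(length(y) - h + 1):length(y)]

# Naive benchmark (assumption for illustration): repeat the last in-sample value
naiveForecast <- rep(inSample[length(inSample)], h)
errors <- holdout - naiveForecast

# Mean Error, Mean Absolute Error and Root Mean Squared Error
c(ME = mean(errors), MAE = mean(abs(errors)), RMSE = sqrt(mean(errors^2)))
```

Replacing naiveForecast with the point forecasts of a fitted model gives the same measures for that model. Relative measures such as rMAE compare the model's error with that of a benchmark of this kind.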

If we need a prediction interval, we can use the forecast() method:

plot(forecast(ourModel, h=12, interval="prediction"))
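If the numbers are needed rather than the plot, the forecast object can be stored and inspected. The sketch below assumes that the object returned by forecast() exposes the elements mean, lower and upper, and that the level argument sets the interval width; the data frame layout is only one possible way to present them.

```r
library(smooth)

# Re-fit the model from above quietly so that the block is self-contained
ourModel <- es(BJsales, h=12, holdout=TRUE, silent=TRUE)

# Store the forecast instead of plotting it; level=0.95 requests a 95% interval
ourForecast <- forecast(ourModel, h=12, interval="prediction", level=0.95)

# Point forecasts and interval bounds side by side
data.frame(lower = as.numeric(ourForecast$lower),
           mean  = as.numeric(ourForecast$mean),
           upper = as.numeric(ourForecast$upper))
```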

The same model can be reused for different purposes, for example to produce forecasts based on newly available data:

es(BJsales, model=ourModel, h=12, holdout=FALSE)

## Time elapsed: 0 seconds
## Model estimated using es() function: ETS(AAdN)
## With provided initialisation
## Distribution assumed in the model: Normal
## Loss function type: likelihood; Loss function value: 258.8401
## Persistence vector g:
##  alpha   beta 
## 0.9448 0.2979 
## Damping parameter: 0.8789
## Sample size: 150
## Number of estimated parameters: 1
## Number of degrees of freedom: 149
## Number of provided parameters: 5
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 519.6801 519.7071 522.6907 522.7585
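The information criteria in this output can be reproduced by hand, which also shows why the number of estimated parameters matters. The sketch below assumes that the reported loss function value is minus the log-likelihood (as the "likelihood" loss type suggests) and uses the standard formulae for AIC, AICc and BIC with the values printed above.

```r
# Values copied from the printed output above
negLogLik <- 258.8401  # "Loss function value", assumed to be minus the log-likelihood
k <- 1                 # number of estimated parameters
n <- 150               # sample size

aic  <- 2 * k + 2 * negLogLik
aicc <- aic + 2 * k * (k + 1) / (n - k - 1)
bic  <- k * log(n) + 2 * negLogLik

round(c(AIC = aic, AICc = aicc, BIC = bic), 4)
```

The results agree with the printed AIC, AICc and BIC up to rounding of the loss value. Because the other parameters were provided rather than estimated, only one parameter is penalised here.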

We can also extract the type of model in order to reuse it later:

modelType(ourModel)

## [1] "AAdN"

This handy function also works with ets() from the forecast package.
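To make the "AAdN" label more concrete (additive error, additive damped trend, no seasonality), here is a minimal base R sketch of the corresponding recursion. etsAAdN() is a hypothetical helper written for this illustration, not part of smooth. It assumes a parametrisation in which alpha and beta multiply the one-step error directly, and it starts from the first observation with zero trend instead of using backcasting, so its numbers will not match the es() output exactly.

```r
# Hypothetical helper: a bare-bones ETS(A,Ad,N) filter, not the smooth implementation
etsAAdN <- function(y, alpha, beta, phi, level0 = y[1], trend0 = 0, h = 12) {
  level <- level0
  trend <- trend0
  fitted <- numeric(length(y))
  for (t in seq_along(y)) {
    fitted[t] <- level + phi * trend     # one-step-ahead forecast
    e <- y[t] - fitted[t]                # one-step-ahead error
    level <- fitted[t] + alpha * e       # level update
    trend <- phi * trend + beta * e      # damped trend update
  }
  # Point forecasts: the trend contribution is damped by phi at every step ahead
  forecasts <- level + cumsum(phi^(1:h)) * trend
  list(fitted = fitted, forecast = forecasts)
}

# Smoothing and damping parameters copied from the output of ourModel
y <- as.numeric(BJsales)[1:138]
sketch <- etsAAdN(y, alpha = 0.9448, beta = 0.2979, phi = 0.8789)
round(sketch$forecast, 2)
```

With phi below one, the increments between consecutive forecasts shrink geometrically, which is what distinguishes "Ad" from a plain additive trend.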

If we need the actual values from the model, we can use the actuals() method from the greybox package:

actuals(ourModel)

## Time Series:
## Start = 1 
## End = 138 
## Frequency = 1 
##   [1] 200.1 199.5 199.4 198.9 199.0 200.2 198.6 200.0 200.3 201.2 201.6 201.5
##  [13] 201.5 203.5 204.9 207.1 210.5 210.5 209.8 208.8 209.5 213.2 213.7 215.1
##  [25] 218.7 219.8 220.5 223.8 222.8 223.8 221.7 222.3 220.8 219.4 220.1 220.6
##  [37] 218.9 217.8 217.7 215.0 215.3 215.9 216.7 216.7 217.7 218.7 222.9 224.9
##  [49] 222.2 220.7 220.0 218.7 217.0 215.9 215.8 214.1 212.3 213.9 214.6 213.6
##  [61] 212.1 211.4 213.1 212.9 213.3 211.5 212.3 213.0 211.0 210.7 210.1 211.4
##  [73] 210.0 209.7 208.8 208.8 208.8 210.6 211.9 212.8 212.5 214.8 215.3 217.5
##  [85] 218.8 220.7 222.2 226.7 228.4 233.2 235.7 237.1 240.6 243.8 245.3 246.0
##  [97] 246.3 247.7 247.6 247.8 249.4 249.0 249.9 250.5 251.5 249.0 247.6 248.8
## [109] 250.4 250.7 253.0 253.7 255.0 256.2 256.0 257.4 260.4 260.0 261.3 260.4
## [121] 261.6 260.8 259.8 259.0 258.9 257.4 257.7 257.9 257.4 257.3 257.6 258.9
## [133] 257.8 257.7 257.2 257.5 256.8 257.5

We can also take only the persistence or only the initial values from the model to construct another one:

# Provided initials
es(BJsales, model=modelType(ourModel),
   h=12, holdout=FALSE,
   initial=ourModel$initial)

## Time elapsed: 0.02 seconds
## Model estimated using es() function: ETS(AAdN)
## With provided initialisation
## Distribution assumed in the model: Normal
## Loss function type: likelihood; Loss function value: 258.782
## Persistence vector g:
##  alpha   beta 
## 0.9702 0.2943 
## Damping parameter: 0.8703
## Sample size: 150
## Number of estimated parameters: 4
## Number of degrees of freedom: 146
## Number of provided parameters: 2
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 525.5639 525.8398 537.6065 538.2976

# Provided persistence
es(BJsales, model=modelType(ourModel),
   h=12, holdout=FALSE,
   persistence=ourModel$persistence)

## Time elapsed: 0.01 seconds
## Model estimated using es() function: ETS(AAdN)
## With backcasting initialisation
## Distribution assumed in the model: Normal
## Loss function type: likelihood; Loss function value: 255.3469
## Persistence vector g:
##  alpha   beta 
## 0.9448 0.2979 
## Damping parameter: 0.8737
## Sample size: 150
## Number of estimated parameters: 2
## Number of degrees of freedom: 148
## Number of provided parameters: 2
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 514.6938 514.7754 520.7150 520.9195

or provide some arbitrary values:

es(BJsales, model=modelType(ourModel),
   h=12, holdout=FALSE,
   initial=200)

## Time elapsed: 0.02 seconds
## Model estimated using es() function: ETS(AAdN)
## With provided initialisation
## Distribution assumed in the model: Normal
## Loss function type: likelihood; Loss function value: 255.3593
## Persistence vector g:
##  alpha   beta 
## 0.9812 0.2865 
## Damping parameter: 0.8785
## Sample size: 150
## Number of estimated parameters: 5
## Number of degrees of freedom: 145
## Number of provided parameters: 1
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 520.7186 521.1352 535.7717 536.8156

Using some other parameters may lead to a completely different model and forecasts (see the discussion of the additional parameters in the online textbook about ADAM):

es(BJsales, h=12, holdout=TRUE, loss="MSEh", bounds="a", ic="BIC")

## Time elapsed: 0.15 seconds
## Model estimated using es() function: ETS(AAN)
## With backcasting initialisation
## Distribution assumed in the model: Normal
## Loss function type: MSEh; Loss function value: 78.7645
## Persistence vector g:
##  alpha   beta 
## 1.5663 0.0000 
## 
## Sample size: 138
## Number of estimated parameters: 2
## Number of degrees of freedom: 136
## Information criteria:
##       AIC      AICc       BIC      BICc 
##  998.1989  998.2878 1004.0534 1004.2724 
## 
## Forecast errors:
## ME: -0.689; MAE: 1.276; RMSE: 1.425
## sCE: -3.635%; Asymmetry: -55.4%; sMAE: 0.561%; sMSE: 0.004%
## MASE: 1.072; RMSSE: 0.929; rMAE: 0.412; rRMSE: 0.372

You can play around with all the available parameters to see what effect they have on the final model.

To combine forecasts, we need to use the letter “C”:

es(BJsales, model="CCN", h=12, holdout=TRUE)

## Time elapsed: 0.19 seconds
## Model estimated: ETS(CCN)
## Loss function type: likelihood
## 
## Number of models combined: 10
## Sample size: 138
## Average number of estimated parameters: 4.1656
## Average number of degrees of freedom: 133.8344
## 
## Forecast errors:
## ME: 2.833; MAE: 2.98; RMSE: 3.671
## sCE: 14.954%; sMAE: 1.311%; sMSE: 0.026%
## MASE: 2.502; RMSSE: 2.393; rMAE: 0.961; rRMSE: 0.958
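Combinations of this kind are typically weighted by information criteria, so that models with a lower criterion contribute more. The sketch below shows one common weighting scheme (Akaike-type weights). icWeights() is a hypothetical helper, and the two AICc values are copied from individual model outputs printed elsewhere in this vignette purely for illustration; the actual call above combined 10 models, and its exact weighting is assumed rather than taken from the source.

```r
# Hypothetical helper: turn information criteria into combination weights
icWeights <- function(ic) {
  delta <- ic - min(ic)     # distance from the best (lowest) criterion
  w <- exp(-0.5 * delta)    # relative likelihood of each model
  w / sum(w)                # normalise so that the weights sum to one
}

# Illustrative AICc values taken from two outputs in this vignette
aiccValues <- c(AAdN = 483.5264, AMdN = 483.4106)
weights <- icWeights(aiccValues)
round(weights, 3)
```

A combined point forecast is then the weighted sum of the individual forecasts, using these weights.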

Model selection from a specified pool and forecast combination over that pool are invoked, respectively, as follows:

# Select the best model in the pool
es(BJsales, model=c("ANN","AAN","AAdN","MNN","MAN","MAdN"),
   h=12, holdout=TRUE)

## Time elapsed: 0.07 seconds
## Model estimated using es() function: ETS(AAdN)
## With backcasting initialisation
## Distribution assumed in the model: Normal
## Loss function type: likelihood; Loss function value: 237.6128
## Persistence vector g:
##  alpha   beta 
## 0.9448 0.2979 
## Damping parameter: 0.8789
## Sample size: 138
## Number of estimated parameters: 4
## Number of degrees of freedom: 134
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 483.2257 483.5264 494.9347 495.6756 
## 
## Forecast errors:
## ME: 2.817; MAE: 2.967; RMSE: 3.654
## sCE: 14.869%; Asymmetry: 88%; sMAE: 1.305%; sMSE: 0.026%
## MASE: 2.491; RMSSE: 2.382; rMAE: 0.957; rRMSE: 0.954

# Combine the pool of models
es(BJsales, model=c("CCC","ANN","AAN","AAdN","MNN","MAN","MAdN"),
   h=12, holdout=TRUE)

## Time elapsed: 0.09 seconds
## Model estimated: ETS(CCN)
## Loss function type: likelihood
## 
## Number of models combined: 6
## Sample size: 138
## Average number of estimated parameters: 4.3125
## Average number of degrees of freedom: 133.6875
## 
## Forecast errors:
## ME: 2.837; MAE: 2.983; RMSE: 3.675
## sCE: 14.974%; sMAE: 1.312%; sMSE: 0.026%
## MASE: 2.504; RMSSE: 2.396; rMAE: 0.962; rRMSE: 0.959

Now we introduce an explanatory variable in ETS:

x <- BJsales.lead

and fit an ETSX model with the exogenous variable first:

es(BJsales, model="ZZZ", h=12, holdout=TRUE,
   xreg=x)

## Time elapsed: 0.48 seconds
## Model estimated using es() function: ETSX(AMdN)
## With backcasting initialisation
## Distribution assumed in the model: Normal
## Loss function type: likelihood; Loss function value: 237.5066
## Persistence vector g (excluding xreg):
##  alpha   beta 
## 0.9505 0.2902 
## Damping parameter: 0.8773
## Sample size: 138
## Number of estimated parameters: 5
## Number of degrees of freedom: 133
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 485.0132 485.4677 499.6494 500.7693 
## 
## Forecast errors:
## ME: 2.876; MAE: 3; RMSE: 3.702
## sCE: 15.183%; Asymmetry: 90%; sMAE: 1.319%; sMSE: 0.027%
## MASE: 2.518; RMSSE: 2.413; rMAE: 0.968; rRMSE: 0.966

If we want to check whether lagged values of x can be used for forecasting purposes, we can use the xregExpander() function from the greybox package:

es(BJsales, model="ZZZ", h=12, holdout=TRUE,
   xreg=xregExpander(x), regressors="use")

## Time elapsed: 1.49 seconds
## Model estimated using es() function: ETSX(AMdN)
## With backcasting initialisation
## Distribution assumed in the model: Normal
## Loss function type: likelihood; Loss function value: 236.4488
## Persistence vector g (excluding xreg):
##  alpha   beta 
## 1.0000 0.3129 
## Damping parameter: 0.8385
## Sample size: 138
## Number of estimated parameters: 7
## Number of degrees of freedom: 131
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 486.8975 487.7591 507.3883 509.5108 
## 
## Forecast errors:
## ME: 2.349; MAE: 2.849; RMSE: 3.348
## sCE: 12.4%; Asymmetry: 72.5%; sMAE: 1.253%; sMSE: 0.022%
## MASE: 2.391; RMSSE: 2.182; rMAE: 0.919; rRMSE: 0.874

We can also construct a model with selected exogenous variables (based on IC):

es(BJsales, model="ZZZ", h=12, holdout=TRUE,
   xreg=xregExpander(x), regressors="select")

## Time elapsed: 1.04 seconds
## Model estimated using es() function: ETS(AMdN)
## With backcasting initialisation
## Distribution assumed in the model: Normal
## Loss function type: likelihood; Loss function value: 237.5549
## Persistence vector g:
##  alpha   beta 
## 0.9443 0.2959 
## Damping parameter: 0.8733
## Sample size: 138
## Number of estimated parameters: 4
## Number of degrees of freedom: 134
## Information criteria:
##      AIC     AICc      BIC     BICc 
## 483.1098 483.4106 494.8188 495.5598 
## 
## Forecast errors:
## ME: 2.819; MAE: 2.969; RMSE: 3.656
## sCE: 14.879%; Asymmetry: 88%; sMAE: 1.306%; sMSE: 0.026%
## MASE: 2.492; RMSSE: 2.383; rMAE: 0.958; rRMSE: 0.954

Finally, if you work with M or M3 data, and need to test a function on a specific time series, you can use the following simplified call:

es(Mcomp::M3$N2457, silent=FALSE)

This command takes the data, splits it into in-sample and holdout parts, and produces a forecast whose length matches the holdout.