Framework

In this exercise, I was provided with monthly export data for a Swiss industry. The aim was to forecast exports for several regions, including Europe and the United States, as well as total exports. In this short project description, I focus on the procedure used to forecast total exports of this Swiss industry.

The figure below shows the original series of total exports of this Swiss industry. The series exhibits strong seasonality and a positive trend over time. It starts in 1993, and total exports peak in October 2014 at 2.28 billion CHF.

Preliminaries

To match the frequency of additional variables used in the further analysis, I averaged the monthly time series to quarterly frequency. As the forecast horizon of interest was 1.5 years, I then filtered the series to extract its trend and focus on the medium-term evolution of exports. This filtering purges the series of seasonality and abstracts from short-term shocks that have no persistent effect.
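For illustration, a minimal sketch of this aggregation step in Python with pandas (the file name and variable names are placeholders, not the original data or tooling):

```python
import pandas as pd

# Hypothetical monthly export series with a DatetimeIndex (values in CHF)
exports_monthly = pd.read_csv("exports.csv", index_col=0, parse_dates=True).squeeze("columns")

# Average the three months of each quarter to obtain the quarterly series
exports_quarterly = exports_monthly.resample("Q").mean()
```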

To extract the trend, I used a seasonal-trend decomposition procedure based on LOESS (STL). In a nutshell, this approach applies locally weighted regressions as a smoother to the time series (Cleveland et al. 1990). It decomposes a time series into a trend component, a seasonal component, and a remainder. Below, I plot the filtered trend at quarterly frequency together with the original monthly time series.
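As a sketch of this decomposition step, statsmodels provides an STL implementation (the library choice and the series name carried over from the snippet above are assumptions, not necessarily what was used in the original project):

```python
from statsmodels.tsa.seasonal import STL

# Decompose the monthly series into trend, seasonal component and remainder
stl_result = STL(exports_monthly, period=12, robust=True).fit()

# Keep the smooth trend and bring it to quarterly frequency for the later models
trend_quarterly = stl_result.trend.resample("Q").mean()
```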

Before the model evaluation exercise, I selected several variables that correlate well with the export series. Since we are only interested in an accurate forecast and not in the causal relationship between variables, including variables that correlate well with the variable of interest can improve the model fit. However, the correlation should be strong, as including arbitrary series may add noise and reduce the model’s performance. Moreover, including too many variables may improve the in-sample fit but lead to poor out-of-sample fit: an overfitted model captures patterns that are specific to the sample used.
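A simple way to screen such candidates is to compare the correlations of their growth rates with the target series. The sketch below assumes the additional quarterly series (total Swiss exports, world GDP) have already been loaded under illustrative names:

```python
import pandas as pd

# Candidate predictors at quarterly frequency; the two extra series are assumed
# to be loaded elsewhere, and all names are placeholders
candidates = pd.concat(
    {"industry_exports": trend_quarterly,
     "total_exports": total_exports_quarterly,
     "world_gdp": world_gdp_quarterly},
    axis=1,
).pct_change().dropna()

# Keep only series whose growth rates correlate strongly with the target
print(candidates.corr()["industry_exports"])
```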

Displayed below are the growth rates of some of the variables I considered in the forecasting exercise: the series of interest (total exports of the Swiss industry), total exports of all Swiss sectors, and real GDP growth of the region of interest (in this case, the world).

Evaluation

Next, I present the models I considered during the evaluation process and describe how I assessed which were most suitable, i.e. how I compared forecasting accuracy across models.

I considered the following models: Auto-Regressive Moving-Average (ARMA), ARMA with exogenous inputs (ARMAX), Vector Auto-Regressive (VAR), VAR with exogenous inputs (VARX), and VAR with a moving-average component (VARMA). Note that this list is by no means exhaustive as far as models suitable for forecasting exercises are concerned. However, these models fit the specific time series of interest reasonably well in a first evaluation. In what follows, I briefly describe them.

ARMA

As a first model, I considered a simple ARMA(p,q) representation. The model can be written as follows.

\[y_t = c + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}\]

\(y_t\) is the series of interest, and \(p\) and \(q\) are the numbers of lags considered. The \(\phi\) are the coefficients of the auto-regressive (AR) part, the \(\theta\) are the coefficients of the moving-average (MA) part, \(\{\varepsilon_t\}\) is a white noise process, and \(c\) is a constant.
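As an illustration, an ARMA(p,q) can be estimated with statsmodels by setting the differencing order of an ARIMA specification to zero, reusing the candidates frame from the earlier sketch; the lag orders below are purely illustrative, not the ones selected in the project:

```python
from statsmodels.tsa.arima.model import ARIMA

# ARMA(2,1) on the stationary quarterly growth rate of the filtered export series
growth_quarterly = candidates["industry_exports"]
arma_fit = ARIMA(growth_quarterly, order=(2, 0, 1), trend="c").fit()

# Point forecasts for the next six quarters
arma_forecast = arma_fit.forecast(steps=6)
```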

ARMAX

The ARMAX(p,q) is a simple extension of the ARMA model in the sense that it allows additional regressors to be included in the estimation. Formally, it is written as

\[y_t = c + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \beta_0 x_t + \dots + \beta_r x_{t-r} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}\]

where \(x_t\) collects the additional variables considered (GDP growth and growth of total exports, for example) and the terms \(\beta_0 x_t + \dots + \beta_r x_{t-r}\) cover their contemporaneous values and potential lags.
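The same statsmodels interface accepts these additional regressors through its exog argument. In the sketch below, exog_future is a hypothetical DataFrame holding an assumed future path of the regressors:

```python
from statsmodels.tsa.arima.model import ARIMA

# Contemporaneous exogenous regressors; lags could be added as extra columns
exog_quarterly = candidates[["total_exports", "world_gdp"]]

armax_fit = ARIMA(growth_quarterly, exog=exog_quarterly,
                  order=(2, 0, 1), trend="c").fit()

# Forecasting requires an assumed future path of the exogenous regressors
# (exog_future: hypothetical DataFrame with 6 rows, same columns as exog_quarterly)
armax_forecast = armax_fit.forecast(steps=6, exog=exog_future)
```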

VAR

The VAR(p) is the multi-dimensional analogue of the AR model. While the AR model considers only one equation at a time, the VAR estimates multiple equations simultaneously. Formally, we can represent a VAR in the following way

\[y_t = c + \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + \varepsilon_t\]

where \(y_t\) is a vector of length \(k\) and the matrices \(\Phi\) are of dimension \(k \times k\). For example, \(y_t\) would contain observations of our main variable of interest, the exports of a Swiss industry, and of GDP growth (hence \(k = 2\)). Each variable consists of observations from \(t = 1, \dots, T\). \(c\) and \(\varepsilon_t\) are vectors of length \(k\).
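A minimal VAR sketch along these lines, again with illustrative variable names and lag order:

```python
from statsmodels.tsa.api import VAR

# Stack the k = 2 endogenous series column-wise
var_data = candidates[["industry_exports", "world_gdp"]]

# VAR(2); the lag order is illustrative
var_fit = VAR(var_data).fit(2)

# Forecast six quarters ahead from the last k_ar observations
var_forecast = var_fit.forecast(var_data.values[-var_fit.k_ar:], steps=6)
```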

VARX

The VARX(p) extends the VAR by allowing for additional exogenous regressors in the system of equations, i.e. variables whose dynamics are not modelled within the system. Formally, this is

\[y_t = c + \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + B x_{t} + \varepsilon_t\] with \(x_{t}\) being a vector of the exogenous variables.
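In statsmodels this can be sketched with the VARMAX class by switching off the moving-average part; exog_future_total is a hypothetical DataFrame with assumed future values of the exogenous variable:

```python
from statsmodels.tsa.statespace.varmax import VARMAX

# VARX(2): VAR part plus an exogenous regressor, no moving-average terms
varx_fit = VARMAX(var_data, exog=candidates[["total_exports"]],
                  order=(2, 0)).fit(disp=False)

# exog_future_total: hypothetical (6 x 1) DataFrame with future regressor values
varx_forecast = varx_fit.forecast(steps=6, exog=exog_future_total)
```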

VARMA

Finally, the VARMA(p,q) model is the multi-dimensional analogue of the ARMA model. Formally, it can be described as \[y_t = c + \Phi_1 y_{t-1} + \dots + \Phi_p y_{t-p} + \varepsilon_t + \Theta_1 \varepsilon_{t-1} + \dots + \Theta_q \varepsilon_{t-q}\]
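The same class estimates the full VARMA specification when both orders are non-zero (orders again illustrative):

```python
from statsmodels.tsa.statespace.varmax import VARMAX

# VARMA(2,1): vector-autoregressive and moving-average terms, no exogenous inputs
varma_fit = VARMAX(var_data, order=(2, 1)).fit(disp=False)
varma_forecast = varma_fit.forecast(steps=6)
```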

Selection of the Model: In-Sample Forecasts

For each of the models, I carried out several forecasting exercises. In particular, I used different estimation samples ending at \(t < T-6\), where \(T\) is the last observed quarter. With these different samples, I then calculated the in-sample forecasts \[\mathbb{E}\left[ y_{t+h} \mid y_1, \dots, y_t\right]\] for the next six quarters (\(h = 1,\dots ,6\)). Subsequently, I compared the forecasts with the actually observed data. This method makes it possible to check whether the model predicts the main dynamics of the data correctly. For a given model, it is also useful for comparing different parameter settings.
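A compact sketch of this evaluation loop for a single specification, with an expanding estimation window and the root-mean-squared error as the comparison metric (the initial window length and model orders are assumptions; in the project the same loop was run for every candidate model and specification):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

h = 6            # forecast horizon in quarters
errors = []

# Expanding window: estimate on data up to t, forecast t+1 ... t+h, store the errors
for t in range(40, len(growth_quarterly) - h):
    train = growth_quarterly.iloc[:t]
    fit = ARIMA(train, order=(2, 0, 1), trend="c").fit()
    forecast = fit.forecast(steps=h)
    actual = growth_quarterly.iloc[t:t + h]
    errors.append(np.asarray(actual) - np.asarray(forecast))

# One summary number per model/specification, compared across candidates
rmse = np.sqrt(np.mean(np.square(errors)))
```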

As an example, below I compare the in-sample forecasts of an ARMAX model with those of a VARMA model.

I repeated the in-sample forecasts with different time samples and different model specifications. Overall, the ARMAX model proved to be the most reliable and accurate.

After a suitable model was selected, the task was to forecast exports at quarterly frequency and to explain the results. This also involved assessing past forecast errors and continuously adapting and improving the model specification.

I want to underline once again that the list of models considered in this project is not exhaustive. An interesting alternative would be an unobserved-components model, which could deal with the strong seasonality directly. Another well-performing and flexible option is the dynamic factor model, which combines information from many different time series in a single model.

Cleveland, Robert B., William S. Cleveland, Jean E. McRae, and Irma Terpenning. 1990. “STL: A Seasonal-Trend Decomposition Procedure Based on Loess.” Journal of Official Statistics 6 (1): 3–73.