1. Introduction
Nowcasting and short-term forecasting of macroeconomic variables is a key ingredient of policy making, particularly in problematic times. It is well recognised that a good nowcast or short-term forecast for a low-frequency variable, such as GDP growth and its components, requires exploiting the timely information contained in higher-frequency macroeconomic or financial indicators, such as surveys or spreads, or in alternative data, such as internet searches or traffic data.
A growing literature proposes different methods to deal with the mixed-frequency feature. In particular, models cast in state-space form, such as vector autoregressions (VARs) and factor models, can handle mixed-frequency data by exploiting the Kalman filter to interpolate the missing observations of the series only available at low frequency (see, among many others, Mariano and Murasawa, Reference Mariano and Murasawa2010; Giannone et al., Reference Giannone, Reichlin and Small2008 in a classical context, and Eraker et al., Reference Eraker, Chiu, Foerster, Kim and Seoane2015; Schorfheide and Song, Reference Schorfheide and Song2015 in a Bayesian context).
A second approach has been proposed by Ghysels (Reference Ghysels2016). He introduces a different class of mixed-frequency VAR models, in which the vector of endogenous variables includes both high and low frequency variables, with the former stacked according to the timing of the data releases.
A third approach is the mixed-data sampling (MIDAS) regression, introduced by Ghysels et al. (Reference Ghysels, Santa-Clara and Valkanov2006), and its unrestricted version [unrestricted mixed-data sampling (U-MIDAS)] by Foroni et al. (Reference Foroni, Marcellino and Schumacher2015). MIDAS models are tightly parameterised, parsimonious models, which allow for the inclusion of many lags of the explanatory variables. Given their nonlinear form, MIDAS models are typically estimated by nonlinear least squares. U-MIDAS models are the unrestricted counterpart of MIDAS models, which can be estimated by simple ordinary least squares (OLS), but work well only when the frequency mismatch is small.
Hepenstrick and Marcellino (Reference Hepenstrick and Marcellino2019) combine the U-MIDAS approach with the 3PRF method of Kelly and Pruitt (Reference Kelly and Pruitt2015) to extract, from a large high-frequency dataset, information specifically aimed at forecasting the low-frequency target variable of interest. In their analysis, this approach works well for nowcasting GDP growth in a variety of countries relative to other types of mixed-frequency factor models.
The U-MIDAS approach can also be used in the context of machine learning (ML) methods, which have attracted considerable interest in the recent past. The ML methods most commonly used for macroeconomic nowcasting and forecasting include both penalised linear regressions (Ridge, Lasso, Elastic Net, etc.) and nonlinear methods (e.g. neural networks, random trees and forests), see for example, Goulet Coulombe et al. (Reference Goulet Coulombe, Leroux, Stevanovic and Surprenant2020) and Medeiros et al. (Reference Medeiros, Vasconcelos, Veiga and Zilberman2021). Masini et al. (Reference Masini, Medeiros and Mendes2021) provide an overview of these methods and their forecasting performance in macroeconomic and financial applications.
Nowcasting applications for Luxembourg are few. Among them, Nguiffo-Boyom (Reference Nguiffo-Boyom2008) estimates a monthly real activity indicator in a dynamic factor model. She evaluates its performance for predicting real GDP by aggregating it to a quarterly level, showing that it improves the forecasts. Nguiffo-Boyom (Reference Nguiffo-Boyom2014) extends the model to a mixed-frequency framework: the activity indicator is extracted from monthly and quarterly data, and its forecast performance, in a mixed-frequency regression, is again found to be satisfactory. Glocker and Kaniovski (Reference Glocker and Kaniovski2020) estimate a dynamic factor model which can accommodate ragged-edge data and frequency mismatch. They use it to nowcast real goods exports, real private household consumption and employment variables, and show that it improves forecasts in a (pseudo) real-time framework. In the context of the Covid-19 crisis, Beine et al. (Reference Beine, Bertoli, Chen, D’Ambrosio, Docquier, Dupuy, Fusco, Girardi, Haas, Islam, Koulovatianos, Leduc, Lorenz, Machado, Peluso, Peroni, Picard, Pieretti, Rapoport, Sarracino, Sologon, Tenikue, Theloudis, Van Kerm, Verheyden and Vergnat2020) develop an ECO-SIR model, which links an epidemiological Susceptible-Infected-Recovered (SIR) model with economic (ECO) input-output tables and is used to simulate the economic consequences of Covid-19. Their model can also be used to simulate/nowcast real GDP growth.
Given this theoretical and empirical background, to construct and evaluate nowcasts for Luxembourg we have collected a large set of conventional and alternative indicators (see e.g. Carriero et al., Reference Carriero, Clark and Marcellino2020; Lewis et al., Reference Lewis, Mertens and Stock2020; Woloszko, Reference Woloszko2020). We focus on the mixed-frequency 3PRF approach of Hepenstrick and Marcellino (Reference Hepenstrick and Marcellino2019), labelled mixed-frequency (MF)-3PRF, considering a few other methods as benchmarks for comparison and to assess the relevance of model uncertainty. This is because MF-3PRF summarises the relevant information more efficiently than other factor approaches, and its ease of estimation makes it suitable for frequent updating of the nowcasts. In terms of other methods, we consider U-MIDAS models with single indicators, to gauge the effects of specific variables and data releases on the GDP growth nowcasts, and the combination of the resulting nowcasts, as pooling typically produces robust forecasts. We also consider large dynamic factor models in state-space form, as an alternative way to summarise and exploit large mixed-frequency information sets for nowcasting. Moreover, to ensure up-to-date coverage of available methodologies, we also implement ML approaches. In particular, we include random forests and neural networks among our forecasting models, as they are among the best performers for macroeconomic forecasting according to previous related research (see e.g., Goulet Coulombe et al., Reference Goulet Coulombe, Leroux, Stevanovic and Surprenant2020; Medeiros et al., Reference Medeiros, Vasconcelos, Veiga and Zilberman2021).
As a preview of the results, mixed-frequency dynamic factor models and neural networks perform well, both in absolute terms and relative to a benchmark AR model, with the 3PRF a close third best, offering in addition computational simplicity and interpretability of the results. The gains are larger during problematic times, such as the financial crisis and the recent Covid period. Even the best models do not track the Covid period well, but simple models based on surveys would have done a decent job.
The rest of this paper is organised as follows. Section 2 reviews the nowcasting models. Section 3 describes the data and the design of the nowcasting exercise. Section 4 presents the empirical results. Section 5 summarises the main results and concludes.
2. Nowcasting models
In this section, we review the nowcasting models used in the analysis, providing additional details on those less commonly used for nowcasting. We consider, in turn, U-MIDAS, mixed frequency factor models and ML methods.
2.1. Unrestricted MIDAS
The U-MIDAS model was introduced by Foroni et al. (Reference Foroni, Marcellino and Schumacher2015, FMS), on whose work this subsection is based and to whom we refer for additional details. When the frequency mismatch is small, as in our application with monthly and quarterly variables, U-MIDAS can be preferable to MIDAS (e.g. Ghysels et al., Reference Ghysels, Santa-Clara and Valkanov2005, Reference Ghysels, Santa-Clara and Valkanov2006, Reference Ghysels, Sinko and Valkanov2007) as it is more flexible and preserves model linearity. We discuss model specification, forecasting, how to use this approach when many high-frequency indicators are available, and how the method is implemented in the empirical application. See also Clements and Galvão (Reference Clements and Galvão2008) for some alternative approaches.
2.1.1. Model specification
Let us consider a single variable $ y $ and $ N $ variables $ x $ , with $ y $ and $ x $ stationary. The $ x $ variables can be observed for each period $ t $ , while $ y $ can be only observed every $ k $ periods. For example, $ k=3 $ when $ t $ is measured in months and $ y $ is observed quarterly (e.g. $ x $ could contain an interest rate and inflation and $ y $ GDP growth). Let us indicate the aggregate (low) frequency by $ \tau $ , while $ Z $ is the lag operator at $ \tau $ frequency, with $ Z={L}^k $ and $ {Zy}_{\tau }={y}_{\tau -1} $ . In the sequel, HF indicates high frequency ( $ t $ ) and LF low frequency ( $ \tau $ ).
The operator

$ \omega (L)={\omega}_0+{\omega}_1L+\cdots +{\omega}_{k-1}{L}^{k-1} $ (1)

characterises the temporal aggregation scheme. For example, $ \omega (L)=1+L+\cdots +{L}^{k-1} $ in the case of flow variables and $ \omega (L)=1 $ for stock variables.
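As an illustration, the aggregation operator for a flow variable with $ k=3 $ simply sums each quarter's three monthly values. A minimal sketch (with made-up numbers):

```python
import numpy as np

def aggregate_flow(x, k=3):
    """Apply the flow aggregation operator w(L) = 1 + L + ... + L^(k-1):
    each low-frequency observation is the sum of the last k
    high-frequency values (here: monthly to quarterly, k = 3)."""
    x = np.asarray(x, dtype=float)
    T = len(x) - len(x) % k          # drop an incomplete final period
    return x[:T].reshape(-1, k).sum(axis=1)

# 6 months of a flow variable -> 2 quarterly observations
monthly = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
quarterly = aggregate_flow(monthly)   # -> [6.0, 15.0]
```

For a stock variable ( $ \omega (L)=1 $ ), one would instead simply keep the value observed in the last month of each quarter.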
FMS label as unrestricted MIDAS, or simply U-MIDAS, the model where $ \omega (L){y}_t={y}_{\tau } $ is regressed on its own quarterly lags and on lags of $ \omega (L){x}_{it} $ for $ i=1,\dots, N $ . In formulae, the U-MIDAS model (in LF) can be written as

$ c\left({L}^k\right){y}_t={\delta}_1(L){s}_{1t}+\cdots +{\delta}_N(L){s}_{Nt}+{\unicode{x03B5}}_t,\hskip1em t=k,2k,\dots, Tk, $ (2)

where $ {s}_{it}=\omega (L){x}_{it} $ , $ c\left({L}^k\right)=\left(1-{c}_1{L}^k-\cdots -{c}_c{L}^{kc}\right) $ , $ {\delta}_i(L)=\left(1-{\delta}_{i,1}L-\cdots -{\delta}_{i,v}{L}^v\right) $ , $ i=1,\dots, N $ . In general, the error term $ {\unicode{x03B5}}_t $ has an MA structure.
Finally, let us assume that the lag orders $ c $ and $ v $ are large enough to make the error term $ {\unicode{x03B5}}_t $ uncorrelated. Then, all the parameters in the U-MIDAS model (2) can be estimated by simple OLS (the aggregation scheme $ \omega (L) $ is assumed known). Moreover, from a practical point of view, the lag order $ v $ could differ across variables, and $ {v}_i $ and $ c $ could be selected by an information criterion such as the Bayesian information criterion (BIC).
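To fix ideas, estimation of a U-MIDAS equation by OLS can be sketched as follows, on simulated data with one monthly indicator, one quarterly autoregressive lag and $ k=3 $ . All names, lag orders and coefficient values are illustrative, not those of the empirical application:

```python
import numpy as np

rng = np.random.default_rng(0)

k = 3                                  # months per quarter
T_q = 80                               # quarterly sample size
x = rng.standard_normal(T_q * k)       # one monthly indicator

# quarterly target driven by the three monthly values of each quarter
xq = x.reshape(T_q, k)                 # row t: months 1..3 of quarter t
y = 0.5 * xq[:, 0] + 0.3 * xq[:, 1] + 0.1 * xq[:, 2] \
    + 0.1 * rng.standard_normal(T_q)

# U-MIDAS regressors: one quarterly AR lag plus the k monthly values of
# the indicator, each with its own unrestricted coefficient
Y = y[1:]
X = np.column_stack([np.ones(T_q - 1), y[:-1], xq[1:, :]])

beta, *_ = np.linalg.lstsq(X, Y, rcond=None)   # plain OLS
# beta[2:5] should be close to (0.5, 0.3, 0.1); the AR coefficient
# beta[1] should be close to zero, since the DGP has no AR term
```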
2.1.2. Forecasting with U-MIDAS
To start with, let us consider the case where the forecast origin is in period $ t= Tk $ and the forecast horizon measured in $ t $ time is $ h=k $ (namely, one-step ahead in LF). Using standard formulae, the optimal forecast (in the MSE sense and assuming that $ {\unicode{x03B5}}_t $ is uncorrelated) can be expressed as

$ {\hat{y}}_{Tk+k\mid Tk}={c}_1{y}_{Tk}+\cdots +{c}_c{y}_{Tk+k- ck}+\sum_{i=1}^N{\delta}_i(L){\hat{s}}_{iTk+k\mid Tk}, $ (3)

where $ {\hat{s}}_{iTk+j\mid Tk}={s}_{iTk+j} $ for $ j\le 0 $ , that is, values already observed at the forecast origin enter without change.
A problem with the expression in equation (3) is that forecasts of future values of the HF variables $ x $ are also required. These can be obtained from the so-called reverse MIDAS regressions (Ghysels and Valkanov, Reference Ghysels and Valkanov2006), but in practice, this can be fairly complicated. Hence, a simpler approach is to use a form of direct estimation (see e.g., Marcellino et al., Reference Marcellino, Stock and Watson2006), and construct the forecast as

$ {\hat{y}}_{Tk+k\mid Tk}=\left(1-\tilde{c}(Z)\right){y}_{Tk+k}+\sum_{i=1}^N{\tilde{\delta}}_i(L){s}_{iTk}, $ (4)

where the polynomials $ \tilde{c}(Z) $ and $ {\tilde{\delta}}_i(L) $ are obtained by projecting $ {y}_t $ on information dated $ t-k $ or earlier, for $ t=k,2k,\dots, Tk $ .
The main advantage of the U-MIDAS approach is that it allows HF information to be easily incorporated in LF models. In particular, suppose that the value of interest is still $ {y}_{Tk+k} $ , but that now information up to period $ Tk+1 $ is available (e.g. data on the first month of a given quarter become available). Then, the expression in equation (3) can be easily modified to take the new information into account, and the forecast becomes

$ {\hat{y}}_{Tk+k\mid Tk+1}={c}_1{y}_{Tk}+\cdots +{c}_c{y}_{Tk+k- ck}+\sum_{i=1}^N{\delta}_i(L){\hat{s}}_{iTk+k\mid Tk+1}, $ (5)

where $ {\hat{s}}_{iTk+j\mid Tk+1}={s}_{iTk+j} $ for $ j\hskip1.5pt \le \hskip1.5pt 1 $ . Similarly, the coefficients in equation (4) would now be obtained by projecting $ {y}_t $ on information dated $ t-k+1 $ or earlier, and the direct forecast becomes

$ {\hat{y}}_{Tk+k\mid Tk+1}=\left(1-\tilde{c}(Z)\right){y}_{Tk+k}+\sum_{i=1}^N{\tilde{\delta}}_i(L){s}_{iTk+1}. $ (6)
The direct approach of equation (4) can be extended to construct $ h $ -step ahead forecasts in LF. In particular,

$ {\hat{y}}_{Tk+ hk\mid Tk}=\left(1-\overline{c}(Z)\right){y}_{Tk+ hk}+\sum_{i=1}^N{\overline{\delta}}_i(L){s}_{iTk}, $ (7)

where the polynomials $ \overline{c}(Z) $ and $ {\overline{\delta}}_i(L) $ are obtained by projecting $ {y}_t $ on information dated $ t- hk $ or earlier, for $ t=k,2k,\dots, Tk $ . The forecast can be updated to incorporate fresh HF information as in the one-step ahead case.
Finally, the formulae that we have derived so far can be easily adapted to provide nowcasts for the $ y $ variable, that is, $ {\hat{y}}_{Tk\mid Tk} $ , which is the main case of interest in our analysis. For example, timely monthly indicators can be used to nowcast current quarter GDP growth, which is typically published around the middle of the subsequent quarter.
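The direct nowcasting scheme can be sketched as follows: the target is projected on the monthly indicator values available at a given point within the quarter, and the fit improves as more months of the quarter become available. Data and names are simulated and illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)
T_q = 80
xq = rng.standard_normal((T_q, 3))          # months 1..3 of each quarter
y = xq.sum(axis=1) + 0.1 * rng.standard_normal(T_q)

def direct_fit(y, xq, months_in):
    """Project y on the monthly indicator values available after
    `months_in` months of the quarter; return the nowcast for the last
    quarter and the in-sample RMSE on the estimation sample."""
    X = np.column_stack([np.ones(len(y)), xq[:, :months_in]])
    b, *_ = np.linalg.lstsq(X[:-1], y[:-1], rcond=None)  # estimation sample
    rmse = np.sqrt(np.mean((y[:-1] - X[:-1] @ b) ** 2))
    return X[-1] @ b, rmse                               # nowcast for quarter T

nowcast_m1, rmse_m1 = direct_fit(y, xq, 1)  # after month 1 of the quarter
nowcast_m3, rmse_m3 = direct_fit(y, xq, 3)  # after the full quarter
# rmse_m3 < rmse_m1: more within-quarter information, better fit
```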
2.1.3. How to handle many indicators
What happens when $ N $ , the number of available high-frequency indicators, is large? From a theoretical point of view, the U-MIDAS model in equation (2) can include a generic number of variables, as long as $ N $ is lower than the number of observations $ T $ . Yet, in practice, when $ N/T $ is close to one, parameter estimation uncertainty grows substantially, and this is reflected in larger forecast uncertainty. Moreover, when $ N/T>1 $ , OLS estimation is no longer feasible. Hence, alternative solutions are needed, and three main ones are available.
First, one can consider $ N $ U-MIDAS models, each using a single indicator. The resulting nowcasts or forecasts can then be averaged, using either equal weights, which often perform well empirically when $ N $ is large, or weights based on the inverse MSFE or on the values of information criteria over a training sample.
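A minimal sketch of this pooling step, with equal weights and, alternatively, inverse-MSFE weights computed on a training sample (all inputs are made up):

```python
import numpy as np

def pool_forecasts(fcsts, y_train=None, fcsts_train=None):
    """Combine N single-indicator nowcasts. Equal weights by default; if
    a training sample is supplied, weight by inverse MSFE."""
    fcsts = np.asarray(fcsts, dtype=float)
    if y_train is None:
        w = np.full(len(fcsts), 1.0 / len(fcsts))
    else:
        errs = np.asarray(fcsts_train, dtype=float) - np.asarray(y_train)
        msfe = np.mean(errs ** 2, axis=1)          # one MSFE per model
        w = (1.0 / msfe) / np.sum(1.0 / msfe)      # normalised inverse MSFE
    return float(w @ fcsts)

# three models' nowcasts for the current quarter
equal = pool_forecasts([1.0, 2.0, 3.0])            # simple average: 2.0
# model 1 was much more accurate in the training sample, so the
# inverse-MSFE combination is pulled towards its nowcast of 1.0
weighted = pool_forecasts([1.0, 2.0, 3.0],
                          y_train=[0.0, 0.0],
                          fcsts_train=[[0.1, -0.1],   # model 1 errors
                                       [1.0, -1.0],   # model 2 errors
                                       [2.0, -2.0]])  # model 3 errors
```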
Second, one can summarise the information in the N high-frequency predictors, for example using principal components, and then use the components as regressors. This approach is called Factor-MIDAS in Marcellino and Schumacher (Reference Marcellino and Schumacher2010), who study its empirical performance in nowcasting and forecasting.
The first two approaches are compared in nowcasting GDP growth for various countries in Kuzin et al. (Reference Kuzin, Marcellino and Schumacher2013), who find that factor models often are slightly better than pooling single indicator models, though the performance can be country and sample dependent. Due to this result, in the next two sections we will also consider more elaborate mixed frequency factor models, which could further enhance the nowcasting performance.
Finally, the third approach to handle a large number of regressors is the use of a type of classical penalised estimation, such as Ridge or Lasso, or the adoption of Bayesian estimation, where shrinkage is achieved by the use of proper priors on model parameters. The former approach is considered, for example, in Bencivelli et al. (Reference Bencivelli, Marcellino and Moretti2017), the latter in Carriero et al. (Reference Carriero, Clark and Marcellino2015). As the gains from these methods with respect to pooling or factor models are not clear-cut, and the computational costs are higher, we will not assess their empirical performance for Luxembourg.
2.1.4. Practical implementation
As mentioned, we use direct forecasting to overcome the problem of missing observations at the end of the sample, as with iterative procedures we would need to forecast the future values of the (many) explanatory variables.
The lag length was selected according to the BIC information criterion, because it favours smaller models than the Akaike information criterion (AIC) and Hannan-Quinn information criterion (HQ), and parsimony is typically a plus for nowcasting models. When selecting the number of lags, models with number of observations smaller than twice the number of parameters were excluded from the analysis. Similarly, variables for which the number of observations was smaller than 10, after accounting for the number of lags, were excluded from the analysis, to improve model stability and reliability of the estimators.
2.2. Mixed-frequency factor models in state space form
Mixed-frequency factor models have often been employed in the literature to handle data with different frequencies, by treating the low-frequency variable as a high-frequency one with missing observations. These models are also used to extract an unobserved state of the economy, to create indicators (e.g. of coincident activity or financial conditions), or to forecast and nowcast GDP growth, see for example, Mariano and Murasawa (Reference Mariano and Murasawa2003, Reference Mariano and Murasawa2010) and Nunes (Reference Nunes2005) for small-scale applications and Giannone et al. (Reference Giannone, Reichlin and Small2008), Banbura and Modugno (Reference Banbura and Modugno2014), Banbura and Ruenstler (Reference Banbura and Ruenstler2011) for large-scale models. Banbura et al. (Reference Banbura, Giannone, Reichlin, Clements and Hendry2011) and Banbura et al. (Reference Banbura, Giannone, Modugno and Reichlin2013) present overviews with a focus on Kalman filter based factor modelling techniques.
We focus on the large mixed frequency factor model proposed by Giannone et al. (Reference Giannone, Reichlin and Small2008), to whom we refer for additional details. The method exploits a large number of series that are released at different times and with different lags. The methodology the authors propose relies on the two-step estimator by Doz et al. (2011). This framework combines principal components with the Kalman filter. First, the parameters of the model are estimated by OLS regression on the estimated factors, where the latter are obtained through principal components calculated on a balanced version of the dataset. Then, the Kalman smoother is used to update the estimate of the signal variable on the basis of the entire unbalanced panel.
More formally, the dynamic factor model of Doz et al. (2011) is given by

$ {x}_{t_m}=\Lambda {f}_{t_m}+{\xi}_{t_m}, $ (8)

$ {f}_{t_m}={A}_1{f}_{t_m-1}+\cdots +{A}_p{f}_{t_m-p}+{\zeta}_{t_m},\hskip1em {\zeta}_{t_m}=B{\eta}_{t_m}. $ (9)

Equation (8) relates the $ N $ monthly series $ {x}_{t_m} $ to a $ r\times 1 $ vector of latent factors $ {f}_{t_m} $ , through a matrix of factor loadings $ \Lambda, $ plus an idiosyncratic component $ {\xi}_{t_m}, $ assumed to be a multivariate white noise with diagonal covariance matrix $ {\Sigma}_{\xi }. $ Equation (9) describes the law of motion of the latent factors, which are driven by a $ q- $ dimensional standardised white noise $ {\eta}_{t_m} $ , where $ B $ is a $ r\times q $ matrix ( $ q\hskip1.5pt \le \hskip1.5pt r $ ). Hence, $ {\zeta}_{t_m}\sim N\left(0,B{B}^{\prime}\right). $
To deal with missing observations (mainly at the end of the sample in their case but a similar approach can be used for the systematically missing observations generated by the mixed frequency data), the authors use a two-step estimator. In the first step, the parameters of the model are estimated consistently through principal components on a balanced panel, created by dropping the variables with missing observations (or truncating the dataset at the date of the least timely release). In the second step, the Kalman smoother is applied to update the estimates of the factor (and the forecast) on the basis of the entire unbalanced dataset.
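A stylised sketch of the two-step idea on simulated data: principal components are extracted from the balanced part of the panel, and the factor in the ragged-edge period is then updated using only the series observed there. The second step below is a simple static projection used as a stand-in for the Kalman-smoother update, so this is only a rough illustration of the procedure, with all names and dimensions invented:

```python
import numpy as np

rng = np.random.default_rng(2)
T, N, r = 120, 20, 1
f = rng.standard_normal(T)                      # true (single) factor
Lam = rng.standard_normal((N, 1))               # true loadings
X = np.outer(f, Lam[:, 0]) + 0.3 * rng.standard_normal((T, N))
X[-1, 10:] = np.nan                             # ragged edge: 10 late releases

# Step 1: principal components on the balanced panel (drop the ragged row)
Xb = X[:-1]
mu, sd = Xb.mean(0), Xb.std(0)
Xb_std = (Xb - mu) / sd
U, s, Vt = np.linalg.svd(Xb_std, full_matrices=False)
f_hat = U[:, :r] * s[:r]                        # estimated factor path
lam_hat = Vt[:r].T                              # estimated loadings

# Step 2 (simplified): update the factor in the ragged-edge period using
# only the indicators already released for that month
obs = ~np.isnan(X[-1])
x_last = (X[-1, obs] - mu[obs]) / sd[obs]
f_last, *_ = np.linalg.lstsq(lam_hat[obs], x_last, rcond=None)
```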
Finally, we review the model by Banbura and Ruenstler (Reference Banbura and Ruenstler2011), who extend Giannone et al. (Reference Giannone, Reichlin and Small2008) by integrating monthly GDP growth $ {y}_{t_m}^{\ast } $ as a latent variable, related to the common factors by the static equation

$ {y}_{t_m}^{\ast }={\beta}^{\prime }{f}_{t_m}+{\varepsilon}_{t_m}. $ (10)
The quarterly GDP growth, $ {y}_{t_m} $ , is assumed to be the quarterly average of the monthly series:

$ {y}_{t_m}=\frac{1}{3}\left({y}_{t_m}^{\ast }+{y}_{t_m-1}^{\ast }+{y}_{t_m-2}^{\ast}\right),\hskip1em {t}_m=3,6,9,\dots $ (11)
The innovations $ {\varepsilon}_{t_m},{\eta}_{t_m},{\xi}_{t_m} $ are assumed to be mutually independent at all leads and lags.
For estimation, it is convenient to cast equations (8)–(11) in state-space form. $ {y}_{t_m} $ is constructed in such a way that it contains the quarterly GDP growth in the third month of each quarter, while the other observations are treated as missing. Specifically, the state-space representation, when $ p=1 $ , is:
The aggregation rule (11) is implemented in a recursive way, by introducing a latent cumulator variable $ {y}_{t_m}^C={\Xi}_{t_m}{y}_{t_m-1}^C+\frac{1}{3}{y}_{t_m}^{\ast }, $ where $ {\Xi}_{t_m}=0 $ for $ {t}_m $ corresponding to the first month of the quarter and $ {\Xi}_{t_m}=1 $ otherwise. The estimation of the model parameters follows Giannone et al. (Reference Giannone, Reichlin and Small2008).
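The cumulator recursion can be sketched directly; in the third month of each quarter, $ {y}_{t_m}^C $ equals the quarterly average of the monthly growth rates (the numbers below are illustrative):

```python
import numpy as np

def cumulator(y_star):
    """Recursively build y^C_t = Xi_t * y^C_{t-1} + (1/3) y*_t, where
    Xi_t = 0 in the first month of each quarter and 1 otherwise; in the
    third month, y^C equals the quarterly average of monthly growth."""
    yC = np.empty(len(y_star))
    for t, v in enumerate(y_star):
        xi = 0.0 if t % 3 == 0 else 1.0       # t = 0, 3, 6, ...: first months
        yC[t] = xi * (yC[t - 1] if t > 0 else 0.0) + v / 3.0
    return yC

y_star = np.array([3.0, 6.0, 9.0, 3.0, 3.0, 3.0])   # monthly growth rates
yC = cumulator(y_star)
# third-month values equal the quarterly averages: yC[2] = 6.0, yC[5] = 3.0
```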
Marcellino and Sivec (Reference Marcellino and Sivec2016) use a slightly different method to handle the systematically missing observations caused by the mixed-frequency data, and show how to modify the procedure to allow for some observable factors, which is relevant for economic applications. They use the resulting model to study the propagation of various structural economic shocks.
2.2.1. Practical implementation
Consistent with the literature on factor models, data were aligned by date, and missing values at the beginning or end of the sample were projected with an EM algorithm. To ensure model stability, we excluded variables with more than 70 per cent missing values and variables for which the longest uninterrupted spell of missing values exceeded 30 per cent of the sample size. To improve the likelihood of extracting factors that describe rGDP dynamics well, and to match the information content of the data entering the MIDAS models (where lagged values of rGDP are included as regressors), four lagged values of rGDP were added to the dataset.
The number of factors was selected so that the factors jointly explain at least 60 per cent of the variance of the dependent variable. This rule of thumb was used because formal tests tend to select a high number of factors, which can be detrimental for forecasting purposes. The lag length in the VAR for the factors was selected with the BIC criterion.
2.3. The mixed-frequency 3PRF
In this subsection, we consider an alternative approach to estimating mixed-frequency factor models, the 3PRF, which does not require the Kalman filter but is based on the recursive application of (many) OLS regressions. In addition, it permits the construction of factors that are specifically targeted towards the variable of interest, rather than simply summarising the information in the large available information set. Specifically, we review the 3PRF and its mixed-frequency version closely following, respectively, Kelly and Pruitt (Reference Kelly and Pruitt2015, KP) and Hepenstrick and Marcellino (Reference Hepenstrick and Marcellino2019, HM), to whom we refer for additional details.
2.3.1. The 3PRF
Let us consider the following model:

$ {y}_{t+1}={\beta}_0+{\beta}^{\prime }{\mathbf{F}}_t+{\eta}_{t+1}, $ (14)

$ {\mathbf{z}}_t={\lambda}_0+\varLambda {\mathbf{F}}_t+{\boldsymbol{\omega}}_t, $ (15)

$ {\mathbf{x}}_t={\boldsymbol{\varphi}}_0+\varPhi {\mathbf{F}}_t+{\boldsymbol{\varepsilon}}_t, $ (16)
where $ y $ is the target variable of interest; $ {\mathbf{F}}_t={\left({\mathbf{f}}_t^{\prime },{\mathbf{g}}_t^{\prime}\right)}^{\prime } $ are the $ K={K}_f+{K}_g $ common driving forces of all variables, the unobservable factors; $ \beta ={\left({\beta}_f^{\prime },{\mathbf{0}}^{\prime}\right)}^{\prime } $ , such that $ {y}_t $ only depends on $ {\mathbf{f}}_{t-1} $ and not also on $ {\mathbf{g}}_{t-1} $ ; $ {\mathbf{z}}_t $ is a small set of $ L $ proxies that are driven by the same underlying forces as $ {y}_t $ , such that $ \varLambda =\left({\varLambda}_f,\mathbf{0}\right) $ with $ {\mathtt{\varLambda}}_f $ non-singular; $ {\mathbf{x}}_t $ is a large set of $ N $ weakly stationary variables, driven by both $ {\mathbf{f}}_t $ and $ {\mathbf{g}}_t $ ; and $ t=1,\dots, T $ . To achieve identification, when $ N $ and $ T $ diverge, the covariance of the loadings is assumed to be the identity matrix, and the factors are orthogonal to one another. For the sake of space, we refer to KP, Section 2.2, for precise conditions on the factors, loadings, allowed temporal and cross-sectional dependence of the errors, and existence of proper central limit theorems.
With respect to the factor model analysed by, for example, Stock and Watson (Reference Stock and Watson2002), here the large dataset $ {\mathbf{x}}_t $ is possibly driven by more factors than the target variable $ {y}_t $ . Asymptotically and with a strong factor structure, this does not matter for forecasting, as if we include more factors than those strictly needed in equation (14), then their estimated loadings will converge to zero. However, in finite samples, or if the $ {\mathbf{f}}_t $ are weak while $ {\mathbf{g}}_t $ are strong factors (see e.g. Onatski, Reference Onatski2012), estimating and using only the required factors $ {\mathbf{f}}_t $ in equation (14) would be very convenient. This is a well-known problem, see for example, Boivin and Ng (Reference Boivin and Ng2006), who suggest some form of variable pre-selection prior to factor extraction.
KP provide a general, elegant and simple solution to the problem of estimating only $ {\mathbf{f}}_t $ in the model (14)–(16), based on three steps of OLS regressions (which give the name to the procedure). KP show that the 3PRF factor estimator $ {\hat{\mathbf{F}}}_t $ is consistent for the space spanned by the true factors. Moreover, they demonstrate that the 3PRF based forecast $ {\hat{y}}_{t+1}={\hat{\beta}}_0+{\hat{\beta}}^{\prime }{\hat{\mathbf{F}}}_t $ converges to the unfeasible forecast $ {\beta}_0+{\beta}^{\prime }{\mathbf{F}}_t $ when $ N $ and $ T $ diverge. In addition, the forecast is asymptotically normal, with a limiting distribution that depends on a quantity $ {Q}_t $ defined in KP.
For the case in which there is just one $ {\mathbf{f}}_t $ factor, KP suggest directly using the target variable $ y $ as proxy $ \mathbf{z} $ . They refer to this case as target-proxy 3PRF. In the case of more factors, they propose to either use proxies suggested by theory, or a simple automated procedure, which can be implemented in the following steps, indicating a proxy by $ {r}_j $ with $ j=1,\dots, L $ .
• Pass 1: set $ {r}_1=y $ , and obtain the 3PRF forecast $ {\hat{y}}_t^1 $ and the associated residuals $ {e}_t^1={y}_t-{\hat{y}}_t^1 $ .
• Pass 2: set $ {r}_2={e}^1 $ , and obtain the 3PRF forecast $ {\hat{y}}_t^2 $ using $ {r}_1 $ and $ {r}_2 $ as proxies. Obtain the associated residuals $ {e}_t^2={y}_t-{\hat{y}}_t^2 $ .
• Pass L: set $ {r}_L={e}^{L-1} $ , and obtain the 3PRF forecast $ {\hat{y}}_t^L $ using $ {r}_1,{r}_2,\dots {r}_L $ as proxies.
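The three OLS passes are straightforward to code. The sketch below implements the target-proxy variant on simulated data, where the predictors are driven by a relevant factor and an irrelevant one; function and variable names are our own, and the third pass is run in-sample for simplicity:

```python
import numpy as np

def three_prf(X, y, Z):
    """Three OLS passes of the 3PRF (a sketch following KP's procedure).
    X: T x N predictors, y: length-T target, Z: T x L proxies."""
    Xc = X - X.mean(0)                                  # demean; constants
    Zc = Z - Z.mean(0)                                  # are absorbed
    # Pass 1: time-series regression of each predictor on the proxies
    A = np.linalg.lstsq(Zc, Xc, rcond=None)[0].T        # N x L loadings
    # Pass 2: period-by-period cross-sectional regression on the loadings
    F = np.linalg.lstsq(A, Xc.T, rcond=None)[0].T       # T x L factors
    # Pass 3: regression of the target on the estimated factors
    G = np.column_stack([np.ones(len(y)), F])
    b = np.linalg.lstsq(G, y, rcond=None)[0]
    return F, G @ b

# simulated data: x driven by a relevant factor f and an irrelevant one g
rng = np.random.default_rng(3)
T, N = 200, 30
f, g = rng.standard_normal(T), rng.standard_normal(T)
X = np.outer(f, rng.standard_normal(N)) \
    + np.outer(g, rng.standard_normal(N)) \
    + 0.5 * rng.standard_normal((T, N))
y = 0.8 * f + 0.1 * rng.standard_normal(T)

# target-proxy 3PRF: the target itself is the single proxy
F, y_fit = three_prf(X, y, Z=y.reshape(-1, 1))
r2 = 1 - np.sum((y - y_fit) ** 2) / np.sum((y - y.mean()) ** 2)
```

The single estimated factor tracks the relevant factor $ f $ and essentially ignores $ g $ , even though $ g $ also drives the predictors.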
2.3.2. The MF-3PRF
HM consider the case in which the target variable $ {y}_t $ (or the proxies $ {\mathbf{z}}_t $ ) are sampled at a lower frequency than the indicators $ {\mathbf{x}}_t $ . This is an empirically common situation. It arises, for example, when the target variable is GDP growth or GDP deflator inflation, which are available on a quarterly basis, while the indicators are monthly, for example, industrial production and its components, labour market variables, financial indicators or survey variables.
The notation of HM is similar to that used for the U-MIDAS model. In particular, HM assume that the indicators $ {x}_t $ can be observed for each $ t $ , while the target variable $ {y}_t $ and the proxies $ {\mathbf{z}}_t $ can only be observed every $ k $ periods. For example, $ k=3 $ when $ t $ is measured in months and $ y $ is observed quarterly (as for GDP growth). They indicate the aggregate (low) frequency by $ \tau $ , the lag operator at high frequency $ t $ by $ L $ , and the lag operator at low frequency $ \tau $ by $ Z $ , with $ Z={L}^k $ so that $ {Zy}_{\tau }={y}_{\tau -1} $ . As for U-MIDAS, HF indicates high frequency ( $ t $ ), LF low frequency ( $ \tau $ ) and the operator

$ \omega (L)={\omega}_0+{\omega}_1L+\cdots +{\omega}_{k-1}{L}^{k-1} $

characterises the temporal aggregation scheme. In the case of GDP growth, HM can assume that the observable quarter-on-quarter GDP growth is obtained by cumulating three (unobservable) month-on-month GDP growth rates, so that $ \omega (L)=1+L+{L}^2 $ . The same transformation is applied to the proxies $ {\mathbf{z}}_{\tau } $ , so that $ {\mathbf{z}}_{\tau }=\omega (L){\mathbf{z}}_t $ , and to each of the monthly indicators in $ {\mathbf{x}}_t $ , $ {x}_{i,t} $ , obtaining $ {x}_{i,\tau }=\omega (L){x}_{i,t} $ , for $ \tau =1,2,\dots, T/3 $ (where $ \tau $ is measured in quarters, so that $ \tau =1 $ corresponds to $ t=3 $ , $ \tau =2 $ corresponds to $ t=6 $ , …, $ \tau =T/3 $ corresponds to $ t=T $ ) and $ i=1,\dots, N $ .
Using this notation, to cope with the frequency mismatch, HM propose modifying the steps of 3PRF as follows.
• Pass 1: run a (time-series) regression, in low (quarterly) frequency $ \tau $ , of each element of $ {\mathbf{x}}_{\tau } $ , $ {x}_{i,\tau } $ , on the proxy variables $ {\mathbf{z}}_{\tau } $ :

$ {x}_{i,\tau }={\alpha}_{0,i}+{\boldsymbol{\alpha}}_i^{\prime }{\mathbf{z}}_{\tau }+{u}_{i,\tau }, $

for each $ i=1,\dots, N $ , and retain the OLS estimates $ {\hat{\alpha}}_i $ .
• Pass 2: run a (cross-sectional) regression of $ {x}_{i,t} $ on $ {\hat{\alpha}}_i $ :

$ {x}_{i,t}={F}_{0,t}+{\hat{\boldsymbol{\alpha}}}_i^{\prime }{\mathbf{F}}_t+{v}_{i,t}, $

for each (month) $ t=1,\dots, T $ , and retain the OLS estimates $ {\hat{\mathbf{F}}}_t $ .
• Pass 3: split the estimated monthly factors $ {\hat{\mathbf{F}}}_t $ obtained in Pass 2 into three quarterly factors ( $ {\hat{\mathbf{F}}}_{\tau}^1 $ , $ {\hat{\mathbf{F}}}_{\tau}^2 $ and $ {\hat{\mathbf{F}}}_{\tau}^3 $ ), where the first (second/third) new quarterly series contains the values of $ {\hat{\mathbf{F}}}_t $ in the first (second/third) month of each quarter; run a (time-series) regression, in low (quarterly) frequency, of $ {y}_{\tau } $ on $ {\hat{\mathbf{F}}}_{\tau -1}^1 $ , $ {\hat{\mathbf{F}}}_{\tau -1}^2 $ and $ {\hat{\mathbf{F}}}_{\tau -1}^3 $ :

$ {y}_{\tau }={\beta}_0+{\beta}_1^{\prime }{\hat{\mathbf{F}}}_{\tau -1}^1+{\beta}_2^{\prime }{\hat{\mathbf{F}}}_{\tau -1}^2+{\beta}_3^{\prime }{\hat{\mathbf{F}}}_{\tau -1}^3+{e}_{\tau }; $

retain the OLS estimates $ {\hat{\beta}}_0 $ , $ \hat{\beta_1^{\prime }} $ , $ \hat{\beta_2^{\prime }} $ and $ \hat{\beta_3^{\prime }} $ , and use them in combination with $ {\hat{\mathbf{F}}}_{\tau}^1 $ , $ {\hat{\mathbf{F}}}_{\tau}^2 $ and $ {\hat{\mathbf{F}}}_{\tau}^3 $ to construct the forecast $ {\hat{y}}_{\tau +1}={\hat{\beta}}_0+\hat{\beta_1^{\prime }}{\hat{\mathbf{F}}}_{\tau}^1+\hat{\beta_2^{\prime }}{\hat{\mathbf{F}}}_{\tau}^2+\hat{\beta_3^{\prime }}{\hat{\mathbf{F}}}_{\tau}^3 $ .
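The factor-splitting step of Pass 3 can be sketched as follows; for simplicity, the illustrative regression uses contemporaneous rather than lagged quarterly factors and noiseless data, so OLS recovers the postulated coefficients exactly:

```python
import numpy as np

def split_monthly_factor(F_m):
    """Split a monthly factor series into three quarterly series holding
    its first-, second- and third-month-of-quarter values (the split
    used in Pass 3 of the MF-3PRF)."""
    T = len(F_m) - len(F_m) % 3
    Fq = np.asarray(F_m[:T], dtype=float).reshape(-1, 3)
    return Fq[:, 0], Fq[:, 1], Fq[:, 2]

# illustrative Pass 3 regression: the quarterly target loads differently
# on each month of the quarter
rng = np.random.default_rng(4)
F_m = rng.standard_normal(120)                   # 40 quarters, monthly factor
F1, F2, F3 = split_monthly_factor(F_m)
y_q = 0.2 * F1 + 0.3 * F2 + 0.5 * F3             # made-up, noiseless target
X3 = np.column_stack([np.ones(len(y_q)), F1, F2, F3])
beta = np.linalg.lstsq(X3, y_q, rcond=None)[0]   # recovers [0, 0.2, 0.3, 0.5]
```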
HM label the resulting procedure the mixed-frequency 3PRF, MF-3PRF. MF-3PRF inherits the consistency properties of the 3PRF, as the estimators in each step are consistent (and the fact that the regressions in Pass 1 are static is a key element in obtaining this result). They also discuss how to handle other data irregularities, such as ragged edges and missing observations at the start of the sample.
Empirically, lags of $ {\hat{\mathbf{F}}}_t^1 $ , $ {\hat{\mathbf{F}}}_t^2 $ and $ {\hat{\mathbf{F}}}_t^3 $ could also matter for forecasting the target variable, as well as lags of the dependent variable. We will also experiment with this more general mixed frequency dynamic model in the empirical application.
2.3.3. Practical implementation
Series with many missing values were treated in the same fashion as in dynamic factor models. Data were aligned according to date, with missing values at the beginning or end of the sample fitted with an EM algorithm. To improve the likelihood of extracting factors that describe rGDP dynamics well, we added four lags of rGDP to the data.
Our descriptive correlation analysis revealed that several variables in the dataset do not reflect rGDP dynamics well. Therefore, only variables that were statistically significant in Pass 1 of the 3PRF were kept for model estimation. Moreover, if it proved statistically significant (in sample), the forecasting equation included a second factor, constructed using the automated proxy selection procedure described above.
2.4. Nonlinear machine learning methods
In the previous subsections, we have reviewed methods based on the specification of a parametric model, typically a linear regression, which links the target variable $ y $ with a possibly large number of explanatory variables $ x $ . In this section, we consider other methods that do not require an explicit linear or parametric formulation of the relationship between $ y $ and $ x $ , focussing on those cases where $ x $ can be large, as in our empirical analysis for Luxembourg. We consider, in turn, regression trees, random forests and neural networks. This review is based on Buono et al. (Reference Buono, Kapetanios, Marcellino, Mazzi and Papailias2018), to whom we refer for additional details and references; see also Masini et al. (Reference Masini, Medeiros and Mendes2021).
2.4.1. Regression trees
Regression trees are based on a partition of the space of the explanatory variables $ x $ into $ M $ subsets $ {R}_m $ , with each observation allocated to a subset according to a given rule and $ y $ modelled as a different constant $ {c}_m $ in each subset. This is a powerful idea, since it can fit various functional relationships between $ y $ and a set of explanatory variables $ x $ , say $ y=f(x) $ , without imposing the linearity or additivity commonly assumed in standard linear regression models. Let
where $ \mathbf{1} $ denotes the indicator variable taking value $ 1 $ if the condition is satisfied, $ 0 $ otherwise. Then, given a partition, minimising
with respect to $ {c}_m $ yields $ {\hat{c}}_m={\overline{y}}_m, $ where $ {\overline{y}}_m $ denotes the sample mean of $ y $ over each region $ {R}_m. $
A much more difficult problem is to find the best partition in terms of the minimum sum of squares (18). Even in the two-dimensional case, that is, when $ k=2 $ so that $ x=\left[{x}_1,{x}_2\right], $ finding the best binary partition that minimises (18) is computationally infeasible. Instead, greedy algorithms are commonly used. The idea is to perform one split at a time. Consider a splitting variable $ j $ (where $ j=1,\dots, k $ ) and a split point $ s $ such that a region $ {R}_1\left(j,s\right) $ is defined as
Then, equation (18) is minimised with respect to $ j $ and $ s. $ For each splitting variable, the split point $ s $ can be found quickly, so by scanning through all of the variables $ {x}_j $ the best pair $ (j,s) $ can be determined. Having found the best split, the data are partitioned into the two resulting regions and the same splitting exercise is repeated on each of them, then on all subsequent regions, and so on. The number of rounds of the algorithm determines how deep the resulting tree is. On the one hand, shallow trees might fail to capture the structure of the data; on the other hand, deeper trees might overfit the data and hence predict poorly.
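The greedy search just described can be sketched as follows. This is a minimal illustration in Python, not the authors' code: for each candidate variable $ j $ and split point $ s $ , each resulting region is fitted with its sample mean and the split with the smallest residual sum of squares is kept.

```python
def sse(ys):
    """Sum of squared deviations from the region mean (0 for an empty region)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(X, y):
    """X: list of observations, each a list of k regressors; y: list of targets.
    Returns (j, s, loss): splitting variable, split point and resulting SSE."""
    k = len(X[0])
    best = (None, None, float("inf"))
    for j in range(k):                         # scan all splitting variables
        for s in sorted({row[j] for row in X}):  # candidate split points
            left = [yi for row, yi in zip(X, y) if row[j] <= s]
            right = [yi for row, yi in zip(X, y) if row[j] > s]
            if not left or not right:          # skip degenerate splits
                continue
            loss = sse(left) + sse(right)      # each region fitted by its mean
            if loss < best[2]:
                best = (j, s, loss)
    return best
```

Repeating `best_split` on each resulting region grows the tree one level at a time, which is exactly the greedy recursion described above.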
A common way to proceed is to grow a very large tree $ {T}_0 $ , which is then pruned using a penalty function. Define a subtree $ T\subset {T}_0 $ to be any tree that can be obtained by collapsing any number of its non-terminal nodes. Recall that $ {T}_0 $ partitions the space of the explanatory variables into $ M $ regions $ {R}_m, $ $ m=1,\dots, M, $ and hence contains $ M $ terminal nodes; define $ \left|T\right| $ to be the number of terminal nodes of a subtree $ T $ , and $ {N}_m $ to be the cardinality of $ {R}_m $ , that is, the number of observations $ {x}_i $ falling in $ {R}_m $ . Recall that
and denote the function
Then a complexity criterion function can be specified in the following way
The idea is to find (for a given $ \alpha $ ) a subtree $ {T}_{\alpha}\subset T $ such that $ {CC}_{\alpha }(T) $ is minimised. The tuning parameter $ \alpha \ge 0 $ governs how strongly large trees are penalised: when $ \alpha =0, $ the solution is the full tree $ {T}_0, $ while large values of $ \alpha $ result in smaller trees. It turns out that, for a given $ \alpha, $ a unique smallest tree $ {T}_{\alpha}^{\ast}\subset T $ exists that minimises $ {CC}_{\alpha }(T) $ . To find $ {T}_{\alpha}^{\ast } $ an algorithm called ‘weakest link pruning’ is used. The idea is to successively collapse the subnodes that produce the smallest per-node increase in $ {\sum}_{m=1}^M{N}_m{Q}_m(T), $ until a single-root tree is obtained. Breiman et al. (Reference Breiman, Friedman, Olshen and Stone1984) show that this results in a finite sequence of subtrees that contains $ {T}_{\alpha}^{\ast }. $
2.4.2. Random forests
Random forests were introduced by Breiman (Reference Breiman2001); see, for example, Biau and Scornet (Reference Biau and Scornet2016) for a survey. The idea is as in bagging (Breiman, Reference Breiman1996), but applied to regression trees: grow a large collection of de-correlated trees (hence the name forest) and then average them. Each tree is grown on a bootstrap sample of the data and, to induce ‘decorrelation’ across trees, before each split a random subset of the input variables is selected as candidates for splitting. This prevents the ‘strong’ predictors from imposing too much structure on the trunk of every tree.
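The two randomisation devices (bootstrap samples per tree, random feature subsets per split) can be illustrated with a toy forest of single-split "stumps"; this is a didactic sketch, not the estimator used in the paper:

```python
import random

def fit_stump(X, y, feat_subset):
    """Best single split over the candidate features, regions fitted by means."""
    best = None
    for j in feat_subset:
        for s in {row[j] for row in X}:
            left = [yi for r, yi in zip(X, y) if r[j] <= s]
            right = [yi for r, yi in zip(X, y) if r[j] > s]
            if not left or not right:
                continue
            loss = (sum((v - sum(left) / len(left)) ** 2 for v in left)
                    + sum((v - sum(right) / len(right)) ** 2 for v in right))
            if best is None or loss < best[0]:
                best = (loss, j, s,
                        sum(left) / len(left), sum(right) / len(right))
    if best is None:                     # degenerate bootstrap sample
        m = sum(y) / len(y)
        return (feat_subset[0], X[0][feat_subset[0]], m, m)
    return best[1:]                      # (j, s, mean_left, mean_right)

def random_forest(X, y, n_trees=50, m_try=1, seed=0):
    rng = random.Random(seed)
    n, k = len(X), len(X[0])
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        feats = rng.sample(range(k), m_try)          # random feature subset
        stumps.append(fit_stump([X[i] for i in idx],
                                [y[i] for i in idx], feats))
    def predict(x):
        preds = [(ml if x[j] <= s else mr) for j, s, ml, mr in stumps]
        return sum(preds) / len(preds)               # average across trees
    return predict
```

Averaging many such noisy, weakly correlated predictors reduces variance without much extra bias, which is the essential point of the method.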
Although their asymptotic properties are not fully understood yet, in particular for serially correlated variables, random forests can deliver good out-of-sample performance for macroeconomic variables, documented for instance in Medeiros et al. (Reference Medeiros, Vasconcelos, Veiga and Zilberman2021), Goulet Coulombe (Reference Goulet Coulombe2020), and Goulet Coulombe et al. (Reference Goulet Coulombe, Leroux, Stevanovic and Surprenant2020).
2.4.3. Practical implementation
As with the 3PRF, it proved useful to pre-select the explanatory variables, to get rid of those least correlated with rGDP. For consistency of variable selection across models, we utilised Pass 1 of the 3PRF to select them. Matlab’s function for fitting regression ensembles (random forests in our case) handles missing values automatically. Hyperparameters were determined by cross-validation. We also implemented the Bergmeir et al. (Reference Bergmeir, Hyndman and Koo2015) cross-validation procedure for time series but, unfortunately, it proved less successful than ignoring the time-series nature of the data. This finding is likely driven by the short length of our dataset. We utilised Matlab’s default boosting algorithm (LSBoost) to generate regression ensembles.
2.4.4. Deep learning and neural networks
Artificial neural networks (ANNs) are a family of models inspired by biological neural networks, used to approximate unknown functions and recognise patterns that can depend on a large number of inputs. They are generally presented as systems of interconnected components which exchange messages with each other. The connections have weights that can be tuned based on experience, making neural nets adaptive to inputs and capable of learning; see, for example, Blake and Kapetanios (Reference Blake and Kapetanios2010) for more detailed information.
While the application of ANNs to econometric nowcasting has produced mixed results, we note them as they have recently given rise to methods collectively known as deep learning. Deep learning is essentially a multilayered ANN model, which has been shown to have good pattern recognition properties; see Hinton and Salakhutdinov (Reference Hinton and Salakhutdinov2006). Typically, a large temporal dimension is needed, as the multilayered ANN model has a considerable number of parameters that need to be estimated. Let
Neural networks provide an approximation of the unknown function $ \mu (.) $ and their approximation properties have been established formally in the literature (see e.g. Hornik et al., Reference Hornik, Stinchcombe and White1989).
In practice, as with other ML methods, it is common to split the dataset into three subsets: training, validation and testing sets. The training set is used to adjust the weights of the network; the validation set is used to limit overfitting, by choosing hyperparameter values and selecting the appropriate model. Finally, the testing set is used to confirm the actual out-of-sample predictive power of the model. Deep learning has been applied in financial applications: for example, Sirignano et al. (Reference Sirignano, Sadhwani and Giesecke2016) use neural networks to analyse mortgage risk using a dataset of over 120 million prime and subprime U.S. mortgages between 1995 and 2014. Heaton et al. (Reference Heaton, Polson and Witte2016a, Reference Heaton, Polson and Witte2016b) also employ neural networks in the context of portfolio theory. Macroeconomic applications are typically less successful, likely due to the shorter samples available, which do not permit a good training of the network.
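The three-way split can be sketched as follows (the proportions are illustrative, not the paper's); for time-series data the split is chronological, so the test set is genuinely out of sample:

```python
def train_val_test_split(data, train_frac=0.6, val_frac=0.2):
    """Chronological split: first block for training, middle block for
    validation (hyperparameter choice), final block for testing."""
    n = len(data)
    i = round(n * train_frac)
    j = i + round(n * val_frac)
    return data[:i], data[i:j], data[j:]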
2.4.5. Practical implementation
Pass 1 of the 3PRF was used to pre-select the predictors. Due to the high computational burden, neural networks were estimated at a quarterly frequency and missing values were fitted with the EM algorithm. The first part (80 per cent) of each sample was used for parameter estimation, and the remaining part (20 per cent) for validation. In practice, the number of hidden layers is often set to 1 or 2. We report results for networks with one hidden layer as they performed better on the validation set. The optimal number of nodes is often selected by trial and error, with the upper bound set at once or twice the number of explanatory variables. Since our samples are rather short, we estimated networks with 5, 10, 25, 50 and 100 nodes, with the selected number of nodes determined by minimising the root mean square forecast error (RMSE) over the validation sample. We tested plain-vanilla and long short-term memory (LSTM) layers. The latter explicitly take into account the time-series properties of our data. LSTM layers produced superior results on the validation sample. Therefore, the results reported in the next section refer to LSTM-layered neural networks, estimated with the Adam algorithm.
3. Data and design of the nowcasting exercise
In this section, we describe the data and design of the nowcasting exercise, which is based on the models discussed in the previous section and for which results are presented in the next section.
3.1. Data
To nowcast real GDP growth in Luxembourg using mixed frequency models, we collected a large and varied dataset. The collected series refer to different economic areas, are of different length or frequency, and are available with diverse publication lags. We use conventional activity indicators (employment, industrial production, etc.) as well as alternative indicators (electricity consumption, traffic data, etc.).
The conventional series are commonly used at Statec for nowcasting and forecasting real GDP (rGDP). We organised them into 11 thematic groups: Banking: credit, debt securities, cash, …; Employment: employment, hours worked, labour cost, …; Output: industrial production, foreign output, iron and steel production, …; Prices: producer prices, deflators, iron and steel prices, …; Exchange rates: real effective exchange rate, …; Trade: imports, exports, …; Stock prices: stock indices, volatility of indices, …; Interest rates: money market rate, bond rate, …; Income: wages, taxes on wages, …; Housing: house price index, …; Consumption: household consumption, … Conventional series are typically of monthly or quarterly frequency (except most financial variables). Monthly series are released with a 20–60 day delay with respect to the reference period (e.g. unemployment vs. total credit) and start between January 1985 and December 2014 (unemployment vs. credit to households). Quarterly aggregate series are available from 1985Q1 or from the early 1990s, and are released with a 60–115 day delay. We also include business and consumer surveys (European Commission); in particular, monthly survey indicators for the industry, building, retail, services and consumer sectors for Luxembourg, neighbouring countries and the euro area since January 1985. These indicators are released before the end of each month, which makes them the most timely among the conventional series. For the conventional series we use final vintage data, but take into proper account the delays in data releases, as real time data are not easily available. The use of final vintages is immaterial for some of the most promising indicators for nowcasting, as surveys and financial variables are not revised.Footnote 4
Alternative indicators include series that are not traditionally used in the construction of national accounts or for forecasting GDP growth. These series are typically available in real-time or with a publication lag shorter than 1 month. Therefore, they could be particularly useful in nowcasting GDP. Unfortunately, they are often either too short to be included in formal econometric models or not publicly available. After a careful search, we will consider the following alternative seriesFootnote 5:
• Fuel sales data from petrol stations: Consumption in cubic tons. Monthly data available since January 2000 and weekly since January 2019. Monthly data are included in the analysis.
• Google trends: We collected monthly series from January 2004 onward. The data include Google keyword searches (e.g. Adem, part-time working, credit, …), Google categories (finance, real-estate, …) and topics (unemployment, crisis, …). The data are included in the main analysis, with coverage overall comparable to that in Woloszko (Reference Woloszko2020).
• Short-term state aid data: Monthly data on various short-term state aid categories. Availability depends on the category (series start in April 1998, October 2001, January 2009 and March 2020); the longer series are included in the analysis.
• New car registrations data: Monthly data of total new car registrations. Available for Luxembourg and neighbouring countries since January 1990, included in the analysis.
Finally, we target the first release of real GDP. This release is naturally less reliable than the second one, but closely monitored by the media and policy makers, and often relevant for policy decisions.
3.2. Design of the nowcasting exercise
We nowcast real GDP growth several times for each quarter. Specifically, we formulate nowcasting models $ v=5 $ times for each vintage, at the end of the first week of each of the months leading up to the release of the first estimate of real GDP. For example, vintage 2020Q1, which refers to the period up to the 31st of March and is released 85 days later, is nowcast on the 7th of January (Mm2—reference period minus 2 months), 7th of February (Mm1—reference period minus 1 month), 7th of March (M0—reference period minus 0 months), 7th of April (M1—reference period plus 1 month) and 7th of May (M2—reference period plus 2 months). We stop in the second month after the vintage’s reference period (in M2) because at the end of this month the Luxembourg official statistics office produces an unofficial early-release estimate of real GDP.
For each real GDP vintage and its corresponding nowcasting month (Mm2,…,M2), the explanatory variables included in the models are selected in such a way that their single most recent observation is used, taking into account the delays in data releases. For example, if we forecast real GDP vintage 2020Q1 in month M0, we use the value of $ x $ available on the 7th of March. This might be, for example, industrial production for January 2020, Google popularity of the search term ‘unemployment’ for February 2020, or a term spread for (the first week of) March 2020.
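The "most recent observable value" logic can be sketched with a small helper. The function, the delay and the example values are hypothetical illustrations of the mechanism, not Statec's actual code or data:

```python
import datetime as dt

def latest_available(series, nowcast_date, delay_days):
    """series: dict mapping reference-period end date -> value.
    A value becomes observable delay_days after its reference period ends;
    return the most recent value observable on nowcast_date (None if none)."""
    observable = {d: v for d, v in series.items()
                  if d + dt.timedelta(days=delay_days) <= nowcast_date}
    if not observable:
        return None
    return observable[max(observable)]

# Illustrative monthly series with an assumed ~40-day publication delay
ip = {dt.date(2019, 12, 31): 101.3,
      dt.date(2020, 1, 31): 99.8,
      dt.date(2020, 2, 29): 95.1}
```

On a 7 March nowcast date the January figure (released around 11 March under the assumed delay) is not yet observable, so the December value is used; a week later the January value becomes the most recent observable one.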
Finally, we consider nowcasting over the full evaluation sample 2006Q3–2020Q3, but also over several subperiods, as different models can perform differently in crisis and normal periods.Footnote 6 The subperiods include two normal periods (pre-financial crisis from 2006Q3 to 2008Q1 and post-sovereign crisis from 2013Q2 to 2020Q1) and three crisis periods (the 2007 financial crisis from 2008Q2 to 2009Q4, the sovereign crisis from 2010Q1 to 2013Q1 and the Covid pandemic, 2020Q2 and Q3). The periods were selected by visual inspection of Luxembourg’s rGDP (see Figure 1).
4. Empirical results
We now compare first the nowcasting performance for Luxembourg of the models described in Section 2, then assess the performance of specific indicators, grouped by type. As mentioned, results are based on recursive estimation and forecasting, with the forecast evaluation period ranging from 2006Q3 until 2020Q3.
4.1. Model performance
We provide results based on all indicators and on a subset of the five best performing indicators, considering five nowcast horizons [from the first month of the target quarter (Mm2) to the second month of the following quarter (M2)] and also various temporal subperiods, characterised by different GDP growth behaviour. To avoid excess fragmentation of results, when commenting we try to single out an important category (e.g. fin-crisis period) and average over the remaining dimensions (e.g. over nowcast horizons).
We use the RMSE and the mean absolute error (MAE) as the evaluation criteria, with the former more commonly adopted and the latter more robust to the presence of extreme forecast errors.Footnote 7 The forecast error is computed using the first release of GDP growth as the target, as this is the most relevant release from a policy point of view.
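For concreteness, the two criteria follow their standard definitions; RMSE squares the errors and so penalises large misses more heavily, while MAE weights all errors linearly:

```python
import math

def rmse(actual, forecast):
    """Root mean square forecast error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast))
                     / len(actual))

def mae(actual, forecast):
    """Mean absolute forecast error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
```

A single large error (e.g. a crisis quarter such as 2009Q3) inflates RMSE much more than MAE, which is why the two criteria can rank models differently over volatile subperiods.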
Models and variables are compared according to the value of the loss function (e.g. RMSEs for top 5 performing variables) and in relative terms (e.g. RMSEs are expressed relative to benchmark AR model), whichever option better conveys the information. We first focus on model performance (Tables 1 and 2) and then consider the performance of specific variables (Tables 3 and 4).
Note: Table displays RMSE (root mean square forecast error; left panel) and MAE (mean absolute error; right panel) values (upper panel) and ratios (lower panel) for the following nowcasting models: AR (autoregressive model), ARX (autoregressive model with one exogenous regressor), MIDAS (mixed data sampling model with an AR term and one high frequency regressor), DFM (dynamic factor model estimated at a quarterly frequency), MFDFM (mixed frequency dynamic factor model with quarterly and monthly variables) and 3PRF (mixed-frequency three pass regression filter). RMSE and MAE ratios (lower panel) are expressed relative to the benchmark AR model. Higher values reflect higher nowcast errors (upper panel) or worse performance relative to the AR model (lower panel). We also present the (unfeasible) average RMSE and MAE for the (ex-post) top five performing variables for the ARX and MIDAS models. Models are compared for five nowcast horizons (see Section 3.2 Design of the nowcasting exercise).
Abbreviations: NN, neural networks; RANFOR, random forests.
Note: Table displays RMSE (root mean square forecast error; left panel) and MAE (mean absolute error; right panel) values (upper panel) and ratios (lower panel) for the following nowcasting models: AR (autoregressive model), ARX (autoregressive model with one exogenous regressor), MIDAS (mixed data sampling model with an AR term and one high frequency regressor), DFM (dynamic factor model estimated at a quarterly frequency), MFDFM (mixed frequency dynamic factor model with quarterly and monthly variables) and 3PRF (mixed-frequency three pass regression filter). RMSE and MAE ratios (lower panel) are expressed relative to the benchmark AR model. Higher values reflect higher nowcast errors (upper panel) or worse performance relative to the AR model (lower panel). We also present the (unfeasible) average RMSE and MAE for the (ex-post) top five performing variables for the ARX and MIDAS models. Models are compared for five subperiods (see Section 3.2 Design of the nowcasting exercise).
Abbreviations: NN, neural networks; RANFOR, random forests.
Note: This table displays RMSEs of the top 5 performing variables by data group. The statistics are displayed for the ARX (left panel) and MIDAS models (right panel).
Abbreviations: ARX, autoregressive model with one exogenous regressor; MAE, mean absolute error; MIDAS, mixed data sampling model with an AR term and one high frequency regressor; RMSE, root mean square forecast error.
Note: This table displays RMSEs of the top 5 performing variables by nowcast horizon. The statistics are displayed for the ARX (left panel) and MIDAS models (right panel).
Abbreviations: ARX, autoregressive model with one exogenous regressor; MAE, mean absolute error; MIDAS, mixed data sampling model with an AR term and one high frequency regressor; RMSE, root mean square forecast error.
Table 1 displays RMSE (left panel) and MAE (right panel) for seven nowcasting models by nowcast horizon; Table 2 displays them by subperiod. Both tables contain RMSE and MAE values in the upper panel and ratios relative to the benchmark quarterly AR model in the lower panel. We compare the following models: ARX (single frequency autoregressive model with an exogenous predictor), MIDAS (unrestricted mixed data sampling model), DFM (large factor model estimated at a quarterly frequency), MFDFM (large mixed frequency factor model estimated with quarterly and monthly data), 3PRF (mixed-frequency three pass regression filter), RANFOR (random forests) and NN (neural networks). The rows labelled TOP5 ARX and TOP5 MIDAS display the average RMSE and MAE for the five best performing regressors in the ARX and MIDAS models, where the best regressors are selected ex-post (hence, these values are not actually achievable; they are used mainly for comparison with the feasible MIDAS, factor, 3PRF and ML models). When comparing performance relative to the benchmark model, entries larger (smaller) than one indicate that a specific model performs worse (better) than the benchmark.
We observe from Table 1 that nowcasts generally improve over horizons, both relative to the AR model and in absolute values, as more information becomes available and is included in the models.Footnote 8
It turns out that ARX and MIDAS models often perform similarly to the AR model (ratios are close to 1). The reason is twofold. First, we report RMSEs and MAEs averaged over all exogenous predictors. Among those predictors, some or most do not carry significant information for nowcasting rGDP, so the regression coefficients on these regressors tend to be close to zero; such regressors increase parameter uncertainty while contributing little to nowcasting performance. This can be deduced by comparing the RMSEs (or MAEs) of ARX and MIDAS with those of TOP5 ARX and TOP5 MIDAS which, as mentioned, average the RMSEs (or MAEs) of the five best performing regressors only. The latter RMSEs are 10–25 per cent lower than those of the AR model. This indicates that some variables are quite successful in decreasing nowcast errors, and a careful pre-selection of indicators is needed to achieve good performance. Yet, it can be difficult to find the best indicators in real time, and the ranking can change over time. Hence, factor based methods, which implicitly perform variable weighting, can be a good second best. In particular, as we have seen in Section 2, the 3PRF implicitly up-weights successful predictors and down-weights unsuccessful ones. This is likely reflected in the good performance of the dynamic factor models and the 3PRF, which comes close to that of the TOP5 models.
The second reason for the larger relative RMSE and MAE of the ARX and MIDAS models is the rather short available sample period. In short samples, over-fitting is common, even when including potentially valuable predictors. Therefore, it can sometimes be beneficial to exclude marginally significant predictors, trading some bias for lower variance. An AR model is an extreme example of such predictor exclusion.
Among the attainable models, the MFDFM and NN perform better than the other feasible models, with relative gains in the range of 7–14 per cent,Footnote 9 increasing with the information flow. The 3PRF is a close third best. We should mention that, while the better performance of the more sophisticated models is a robust result, the relative ranking of the 3PRF and MFDFM is affected by the evaluation sample, not surprisingly given their close performance. It is also remarkable that the differences with respect to the unfeasible models based on the ex-post best indicators are rather small. The MAE values, reported in the right panel of the table, overall confirm the pattern identified by the RMSEs. It is also worth mentioning that in 2009Q3 all the nowcasting models made a large error relative to the AR. Dropping this quarter from the evaluation substantially lowers the RMSEs and MAEs for all models.
We now move to inspecting the relative performance by period, reported in Table 2. In line with the results discussed so far, most models, except ARX and MIDAS (averages over all regressors), outperform the AR model (RMSE ratios are smaller than 1) over the full period, with the MFDFM, NN and 3PRF performing best and very similarly (they outperform the AR model by about 10 per cent in terms of RMSE and MAE), and also comparably to the unfeasible TOP5 ARX and MIDAS.
The performance of all mixed frequency models improves substantially if we exclude the pre-crisis and post-crisis periods, as in those periods the AR model performs exceptionally well. In the pre-crisis period, the RMSEs of all models, except NN and the unfeasible TOP5 models, are higher than that of the AR model. This is likely related to several facts. First, the pre-crisis period is short (only seven quarters), so the detected pattern could be just a statistical anomaly. Second, and perhaps most important, since the estimation sample is very short when focussing on the pre-crisis period, parsimonious models like the AR tend to perform better because parameter uncertainty dominates potential bias in determining nowcast performance. It is also worth mentioning that the relative statistics over the pre-crisis period appear inflated because the corresponding values for the AR are small by historical standards (see Table 2). This last argument also applies to the post-crisis period.
The models’ ranking is reversed in the crisis and Covid periods, where the MFDFM, NN and 3PRF outperform the AR model in terms of both RMSE and MAE, with gains that reach 24 and 11 per cent, respectively, for the MFDFM during the crises. This finding is consistent with the forecasting literature, which notes that in turbulent periods a wider range of predictors, possibly inserted into nonlinear models, tends to improve forecast accuracy. In relatively calm periods, on the other hand, rGDP’s own dynamics (lagged rGDP) seem to be sufficient to produce decent nowcasts and forecasts. In fact, the performance of most models deteriorates in the post-crisis period relative to the AR model, as the post-crisis period is again characterised by rather stable growth.
To get a visual impression of the results discussed so far, and to assess the absolute nowcasting performance of the models, the topmost panel of Figure 2 reports the actual values of rGDP growth together with the nowcasts from the AR and the MFDFM for M1. The figure highlights how the two nowcasts differ in particular during the financial crisis and the Covid-19 period. In these periods, a simple AR model fails to capture deteriorating economic conditions. By contrast, the MFDFM partly captured the major drops in rGDP growth in the financial crisis and in 2020Q2, though not the strong positive value in 2020Q3, because most of the indicators in the large information set used by the large nowcasting models did not send reliable signals in time to capture the Covid-19 rebound. However, some survey indicators related to the services sector did send the right signal and would have provided rather reliable nowcasts (see the bottom two panels of Figure 2), yet they could hardly be identified ex-ante, as, for example, they did not produce good nowcasts during the financial crisis.
4.2. Indicator performance
We now turn our attention from model performance to an evaluation of the most useful variables in nowcasting. To this purpose, we analyse models that feature one regressor at a time, the ARX and MIDAS models. For each of them we select the top 5 performing models by variable group, nowcast horizon and sub-period.
Table 3 displays the RMSE and MAE for the ARX (left panel) and MIDAS (right panel) models by data group (averaged over periods and nowcast horizons). Lower values indicate better performance. On average, survey data produce the models with the lowest RMSE (1.93 and 2.00 for ARX and MIDAS, respectively). The best performing survey variables are those that convey past demand and current confidence in services sectors abroad. Note, however, that past demand, as defined in business surveys, refers to demand in the last 3 months. In fact, this variable performs particularly well at nowcast horizons M0–M2 (further discussed in the next paragraph), which aligns it with current or future rGDP. In the context of our nowcasting design, and despite its name, this indicator is a coincident or leading indicator for quarterly rGDP. The other best performing survey indicators relate to industry and building. In addition, the best predictors are those that refer to the EA or EU in geographical terms. This is consistent with the analysis of correlation coefficients,Footnote 10 which revealed that the international environment is an important driver of Luxembourg’s economy.
The second best data group includes stock prices, with average RMSEs of 2.07 and 2.10 (ARX and MIDAS, respectively). Luxembourg’s stock index produces the lowest RMSEs, but stock indices for neighbouring countries also perform well. Survey and stock price indicators are followed by employment, interest rate, banking, alternative, trade and output data, with similar RMSEs in the range of 2.10–2.20. The exact ranking is less clear and depends on the type of model considered. The other data groups (exchange rates and prices) are less successful in nowcasting rGDP. Later in this section, we present the most successful series. Unsurprisingly, they belong to the survey, stock price and employment groups.
The comparison of ARX and MIDAS models is somewhat mixed, in line with the previous discussion. In principle, we would expect MIDAS models to outperform traditional ARX models. Yet, when we include a monthly variable in the ARX model, we utilise its most recent available value and use skip-sampling to translate it to a quarterly frequency. In a traditional ARX model, we would instead use values of the monthly variable up to the last observable rGDP value and take quarterly averages. The latter proved less successful, which also explains why our ARX models are competitive with MIDAS models.
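The distinction between the two alignment schemes can be sketched as follows (a simplified illustration of the idea, not the paper's implementation): "skip-sampling" keeps one monthly observation per quarter (here the last month of each complete quarter), while the traditional alternative averages the three months within each quarter.

```python
def skip_sample(monthly):
    """monthly: list of values for consecutive months; returns one value per
    complete quarter, keeping the last month of each quarter."""
    return [monthly[3 * q + 2] for q in range(len(monthly) // 3)]

def quarterly_average(monthly):
    """Within-quarter mean of the monthly values, one value per complete quarter."""
    return [sum(monthly[3 * q:3 * q + 3]) / 3
            for q in range(len(monthly) // 3)]
```

Skip-sampling preserves the most recent within-quarter signal, whereas averaging smooths it away, which is consistent with the finding that the skip-sampled ARX is competitive with MIDAS.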
Overall, the ARX and MIDAS models perform similarly. Among types of variables, survey data are the most successful, followed by stock prices and employment series. Alternative data do not rank highly; nevertheless, they perform comparably to banking, interest rate and output data.
Table 4 displays the RMSE and MAE for the best indicators by nowcast horizon. It is interesting to note that at the earliest nowcast horizon (Mm2) stock prices produce the lowest RMSEs for both models. This might be because of their forward looking character: at horizon Mm2 there is no overlap between predictors and any of the months of the current rGDP quarter and, while this also holds for stock prices, the latter reflect future expectations. We also note that, among alternative data, vehicle registrations often make the list of top 5 performing variables in Mm2 and Mm1. As we move from Mm2 to Mm1 and M0, industrial production indexes and survey data emerge as good predictors. Although industrial production is published with a significant lag, it is likely a good predictor because it is well correlated with rGDP. By contrast, survey data are extremely timely and are released before the end of the reference month. We also note that alternative data such as vehicle registrations perform competitively if one considers the MAE criterion. In the later nowcasting months (M1–M2), survey data outperform all other data groups across all models.
Table 5 repeats the analysis by subperiod (averaging over horizons). As discussed before, the RMSEs are low in the pre- and post-crisis periods, and peak in the financial crisis. For the full period, survey data for services, industry and building are included in the best performing models. In the financial crisis, stock prices and survey data related to building and consumers become particularly useful; out of the 10 best performing series, 8 are survey data. Interestingly, in the post-crisis period, while survey data still rank best (8 of the 10 best performing indicators are survey data), the type of survey indicator changes: if in the crisis business and consumer expectations were the most important for nowcasting, in the post-crisis period 6 out of 8 survey series relate to current conditions. In the Covid period too, the best performing indicators come from the survey data group and proxy for current conditions. This could be because the services sector experienced particularly large losses in the Covid period. In all periods, survey data that geographically refer to neighbouring countries or the EA/EU seem to be equally or more important than domestic series. We also note that alternative series rarely make the list. We conclude that traditional indicators such as employment and output series perform well in nowcasting in normal times, while survey data, especially series that convey expectations and current conditions, seem to carry the most predictive power in normal as well as exceptional times (financial, sovereign and Covid crises).
Note: This table displays RMSEs of the top 5 performing variables by sub-period. The statistics are displayed for the ARX (left panel) and MIDAS models (right panel).
Abbreviations: ARX, autoregressive model with one exogenous regressor; Fin-crisis, financial crisis; MAE, mean absolute error; MIDAS, mixed data sampling model with an AR term and one high frequency regressor; RMSE, root mean square forecast error; Svn-crisis, sovereign crisis.
5. Conclusions
Obtaining reliable nowcasts and short-term forecasts of economic conditions is very relevant for decision making in the public and private sectors. This task is naturally complex, even more so when the economy experiences large fluctuations, as happens during crises, but also more generally for small, very open economies such as Luxembourg. Choosing a proper econometric approach to handle this difficult task is important, and recent advances in modelling possibly very large, mixed-frequency datasets can be helpful. In fact, exploiting the timely information contained in higher frequency macroeconomic or financial indicators, such as surveys or spreads, or in alternative data, such as internet searches or traffic data, can be beneficial for tracking economic conditions.
In this paper, we have first reviewed a number of small and large scale nowcasting models; then we have collected and analysed a large set of potentially useful indicators for the Luxembourg economy and neighbouring countries; finally, we have inserted these indicators, or a carefully selected subset of them, into a range of nowcasting models and evaluated the resulting nowcasting performance for the first release of real GDP growth at different horizons.
Overall, we can conclude that more complex mixed-frequency nowcasting models are particularly useful in turbulent and volatile times, with the MFDFM, NN and 3PRF generally performing best. As the differences among the best models are limited, the 3PRF may be preferable, both for computational reasons and because its nowcasts can be more easily interpreted from an economic point of view. Simpler specifications, such as the AR model, are sufficient in ‘calm’ periods. Among types of variables, surveys related to expectations of future economic conditions, employment indicators and alternative data are particularly useful, often relating to the EU or neighbouring countries. Surveys related to the services sector would have provided reliable nowcasts during the Covid-19 period, but not so much before. Survey data are also preferable because they are released with a short publication lag, are informative and are easy to collect. The absolute performance of the best nowcasting models is overall acceptable, in particular when including information on the later months of the quarter of interest, but the nowcast error can be large during deep recessions and fast recoveries.
Acknowledgements
We are grateful to STATEC for financial support to this project, which is a contribution to its programme “COVID19 - Lessons learned”. Views and opinions expressed in this article are those of the authors and do not reflect those of STATEC and funding partners. The authors also gratefully acknowledge the support of the Observatoire de la Compétitivité, Ministère de l’Économie, DG Compétitivité, Luxembourg, and STATEC, the National Statistical Office of Luxembourg. We are grateful to the Editor Ana Galvao, an anonymous Referee and seminar participants at STATEC for helpful comments on a previous draft.