Last week, Spyros Makridakis asked me a question:

> I have been reading your energy competition and I cannot find any clear statements about the superiority of explanatory/exogenous variables. Am I wrong? Is there a place where you state the difference in forecasting accuracy between time series and explanatory multivariate forecasting as it relates to the short as well as beyond the fist two or three days (not to mention the long term) that accurate temperature forecasting exist?

Today, Rob Hyndman asked me a similar question, which originally came from Spyros.

In fact, this has been quite a debatable topic in load forecasting. The answer is not straightforward. This subject could make a good master's thesis or even a doctoral dissertation. I was going to write a paper about it, but always had something more important or urgent to work on. Recently my research team has done some preliminary work along this direction. While the paper is still under preparation, let me start the discussion with this blog post, as part of the blog series on error analysis in load forecasting.

The literature is not silent on this topic, but various empirical studies have suggested different things.

**Some earlier attempts** were made by James Taylor. James has written many load forecasting papers. His best known work is on exponential smoothing models. James' TPWRS2012 paper claimed that

> Although weather-based modeling is common, univariate models can be useful when the lead time of interest is less than one day.

In Fig. 9 of the paper, which depicts the MAPE values by lead time, the paper stated that

> The exponential smoothing methods outperform the weather-based method up to about 5 hours ahead, but beyond this the weather-based method was better.

Based on this paper, can we conclude that exponential smoothing models are more accurate than the weather-based methods for very short term ex ante load forecasting?

No.

This is my interpretation of the paper:

> A world-class expert in exponential smoothing carefully developed several exponential smoothing models. These models generated more accurate forecasts than a U.K. power company's forecasts.

The "weather-based method" used in that paper was devised by the transmission company in Great Britain using regression models. The paper briefly mentioned how the "weather-based method" worked, but the information was not enough for me to judge how accurate these weather-based models are. I don't know if this U.K. transmission company is using state-of-the-art models.

**Some evidence** came from recent load forecasting competitions, such as the Global Energy Forecasting Competitions, the npower forecasting challenges, and the BigDEAL Forecasting Competition 2018.

In short, time series models, such as exponential smoothing and ARIMA models, never showed up as a major component of a winning entry in these competitions. On the other hand, regression models with temperature variables were always among the winning models.

In fact, ARIMA showed up in a winning method in GEFCom2014, where my former student Jingrui Xie used four techniques (UCM, ESM, ANN, and ARIMA) to model the residuals of a regression model (see our IJF paper).
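To make the idea of modeling regression residuals with a time series technique concrete, here is a minimal sketch. It is not the method from the IJF paper: the data are synthetic, the regression has a single temperature term, and a plain AR(1) model stands in for UCM, ESM, ANN, and ARIMA. All names and numbers below are assumptions chosen for illustration, and the test-period temperatures are treated as known.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

rng = random.Random(2)
n, split = 600, 400

# Synthetic assumption: load is linear in temperature, with autocorrelated
# errors that follow an AR(1) process with coefficient 0.8.
temps = [rng.uniform(70, 100) for _ in range(n)]
noise, e = [], 0.0
for _ in range(n):
    e = 0.8 * e + rng.gauss(0, 10)
    noise.append(e)
loads = [1000 + 20 * t + u for t, u in zip(temps, noise)]

# Stage 1: temperature regression fitted on the training period only.
a, b = fit_line(temps[:split], loads[:split])
resid = [y - (a + b * t) for t, y in zip(temps, loads)]

# Stage 2: AR(1) on the training residuals, phi = sum(r_t * r_{t-1}) / sum(r_{t-1}^2).
train_r = resid[:split]
phi = sum(r0 * r1 for r0, r1 in zip(train_r, train_r[1:])) / sum(r0 ** 2 for r0 in train_r[:-1])

# One-step-ahead forecasts: the previous residual is known when forecasting step i.
base_err, corrected_err = [], []
for i in range(split, n):
    base = a + b * temps[i]
    corrected = base + phi * resid[i - 1]
    base_err.append(abs(loads[i] - base))
    corrected_err.append(abs(loads[i] - corrected))

base_mae = sum(base_err) / len(base_err)
corrected_mae = sum(corrected_err) / len(corrected_err)
print(f"phi = {phi:.2f}, regression MAE = {base_mae:.1f}, with residual model MAE = {corrected_mae:.1f}")
```

In this setup the time series model is a supplement to the regression model, not a competitor: it only picks up whatever autocorrelation the temperature term leaves behind.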

Based on these competition results, can we conclude that time series models are not as accurate as regression models?

No.

In GEFCom2012, we let the contestants predict a few missing periods in the history without restricting the contestants to using only the data prior to each missing period. In my GEFCom2012 paper, I briefly mentioned that

> This setup may mean that regression or some other data mining techniques have an advantage over some time series forecasting techniques such as ARIMA, which may be part of the reason why we did not receive any reports using the Box–Jenkins approach in the hierarchical load forecasting track.

In GEFCom2012, npower forecasting challenges, and the qualifying match of BFCom2018, actual temperature values were provided for the forecast period. In other words, these competitions were on ex post forecasting. Again, the temperature-based models have an advantage since perfect information of temperature is given for the forecast period.
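The gap between ex post and ex ante forecasting can be illustrated with a small sketch. Everything here is an assumption made for illustration: the data are synthetic, the load model is a one-variable linear regression, and the temperature forecast is simulated as the actual temperature plus Gaussian error with a standard deviation of 4 degrees. None of it comes from the competitions themselves.

```python
import random

def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)

rng = random.Random(0)

# Synthetic assumption: load rises linearly with temperature, plus noise.
temps = [rng.uniform(70, 100) for _ in range(500)]
loads = [1000 + 20 * t + rng.gauss(0, 30) for t in temps]

train_t, test_t = temps[:400], temps[400:]
train_y, test_y = loads[:400], loads[400:]
a, b = fit_line(train_t, train_y)

# Ex post: the actual temperatures of the forecast period are given.
ex_post = [a + b * t for t in test_t]

# Ex ante: only a temperature forecast is available (assumed error sd = 4 degrees).
temp_forecast = [t + rng.gauss(0, 4) for t in test_t]
ex_ante = [a + b * t for t in temp_forecast]

ex_post_mape = mape(test_y, ex_post)
ex_ante_mape = mape(test_y, ex_ante)
print(f"ex post MAPE: {ex_post_mape:.2f}%, ex ante MAPE: {ex_ante_mape:.2f}%")
```

The same fitted model looks more accurate in the ex post setting simply because the temperature input carries no forecast error, which is why ex post results alone cannot settle the comparison with univariate models.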

GEFCom2014 and GEFCom2017 were on ex ante probabilistic forecasting. The temperature-based models dominated the leaderboards. This is fair evidence in favor of temperature-based models.

**For benchmarking purposes**, I included two seasonal naive models in my recency effect paper at the request of an anonymous reviewer. Both performed very poorly compared with the temperature-based models. I commented in the paper:

> Seasonal naïve models are used commonly for benchmarking purposes in other industries, such as the retail and manufacturing industries. In load forecasting, the two applications in which seasonal naïve models are most useful are: (1) benchmarking the forecast accuracy for very unpredictable loads, such as household level loads; and (2) comparisons with univariate models. In most other applications, however, the seasonal naïve models and other similar naïve models are not very meaningful, due to the lack of accuracy.

Here is a quick summary based on the evidence so far:

- For ex post point load forecasting, evidence favors temperature-based models.
- For ex ante point load forecasting, there is no solid evidence favoring either approach.
- For ex ante probabilistic load forecasting, evidence favors temperature-based models.
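For readers unfamiliar with the seasonal naive benchmark mentioned earlier, here is a minimal sketch. The two variants shown (same hour yesterday, same hour last week) and the synthetic hourly series are my assumptions for illustration; they are not necessarily the two models used in the recency effect paper.

```python
import math
import random

def seasonal_naive(history, horizon, season):
    """Repeat the last `season` observations to cover `horizon` future steps."""
    last_cycle = history[-season:]
    return [last_cycle[h % season] for h in range(horizon)]

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)

rng = random.Random(1)

# Synthetic assumption: five weeks of hourly load with a daily cycle,
# a weekend dip (days 5 and 6 of each week), and Gaussian noise.
load = [
    1000
    + 200 * math.sin(2 * math.pi * (h % 24) / 24)
    - (150 if (h // 24) % 7 >= 5 else 0)
    + rng.gauss(0, 20)
    for h in range(24 * 7 * 5)
]

# Hold out the final week and forecast it from the first four.
history, actual = load[:-168], load[-168:]
same_hour_yesterday = seasonal_naive(history, 168, season=24)
same_hour_last_week = seasonal_naive(history, 168, season=168)

daily_mape = mape(actual, same_hour_yesterday)
weekly_mape = mape(actual, same_hour_last_week)
print(f"same hour yesterday: {daily_mape:.2f}%, same hour last week: {weekly_mape:.2f}%")
```

Neither variant uses temperature, so on real load data both will miss any weather-driven change between the reference period and the forecast period.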

I'm not a fan of comparing techniques. In my opinion, it's very difficult to make fair comparisons among techniques. If I were good at ANN but bad at regression, I could build far more accurate ANN models than regression models. Using exactly the same technique, two forecasters may build different models with distinct accuracy levels. My fuzzy regression paper offers such an example. In other words, the goodness of a model depends largely on the competency of the forecaster. The best way to compare techniques is through forecasting competitions.

In practice, weather variables are a must-have in most load forecasting situations. I'll elaborate on this in another blog post.
