Thursday, July 7, 2016

GEFCom2012 Load Forecasting Data

The load forecasting track of GEFCom2012 was about hierarchical load forecasting. We asked the contestants to forecast and backcast (check out THIS POST for the definitions of forecasting and backcasting) the electricity demand for 21 zones, of which Zone 21 was the sum of the other 20 zones.

Where to download the data?

You can download an incomplete dataset from Kaggle, which does not include the solution data. The complete data was published as the appendix of our GEFCom2012 paper. If you don't have access to ScienceDirect, you can download it from my Dropbox link HERE. Regardless of where you get the data, you should cite this paper to acknowledge the source:
  • Tao Hong, Pierre Pinson and Shu Fan, "Global energy forecasting competition 2012", International Journal of Forecasting, vol. 30, no. 2, pp. 357-363, April-June 2014. 

What's in the package?

Unzip the file and navigate to the "GEFCOM2012_Data\Load\" folder; you will see 6 files:
  • load_history
  • temperature_history
  • holiday_list
  • load_benchmark
  • load_solution
  • temperature_solution
Our GEFCom2012 paper introduced the first five datasets but not the last one. The "temperature_solution" dataset includes the temperature data from 2008/6/30 7:00 to 2008/7/7 24:00, while the "load_solution" dataset does not include the load data from 2008/6/30 7:00 to 2008/6/30 24:00.
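If you plan to work with the history files programmatically, a common first step is reshaping them from one column per hour to one row per hour. Below is a minimal sketch in Python/pandas; it assumes the files are CSVs with one row per zone and day and hourly columns named h1 through h24, which you should verify against the actual files before relying on it.

```python
import pandas as pd

# Load the wide-format load history (assumed CSV layout: zone_id, year,
# month, day, h1..h24; the thousands separator is an assumption too).
load = pd.read_csv("GEFCOM2012_Data/Load/load_history.csv", thousands=",")

# Reshape from one-column-per-hour to one-row-per-hour.
long = load.melt(
    id_vars=["zone_id", "year", "month", "day"],
    value_vars=[f"h{i}" for i in range(1, 25)],
    var_name="hour",
    value_name="load",
)
long["hour"] = long["hour"].str.lstrip("h").astype(int)

# Build a timestamp; here hour 1 is mapped to 00:00, which is a convention
# you may want to adjust depending on how you interpret the hourly labels.
long["timestamp"] = pd.to_datetime(
    long[["year", "month", "day"]]
) + pd.to_timedelta(long["hour"] - 1, unit="h")
long = long.sort_values(["zone_id", "timestamp"]).reset_index(drop=True)
```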

What's not working?

Before using the data, please understand that there is no way to restore the exact Kaggle setup for a direct comparison of error scores. The main reason is that Kaggle picked a random subset of the solution data to calculate the scores for the public leaderboard and used the rest for the private leaderboard. We do not know which data was used for which leaderboard.

Nevertheless, it was never our intention for you to make comparisons the Kaggle way, because GEFCom2012 was set up more like a data mining competition than a forecasting competition. The contestants could submit their forecasts many times, and Kaggle kept the best score. This is not a realistic forecasting process.

How to use the data?

Instead, we encourage you to use these 4.5 years of hourly data without considering the Kaggle setup. You can even keep the four full calendar years and drop the last half year in your case studies. With four years of data, you can perform one-year-ahead ex post forecasting (see my weather station selection paper). You can also perform short-term ex post forecasting on a rolling basis (see my recency effect paper).
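For the rolling setup, the idea is simply to move the forecast origin forward one day (or one week) at a time, refit or update the model on the data up to the origin, and forecast the next horizon using actual (ex post) temperatures. A rough sketch of such a loop, using the long-format hourly frame built above, is shown below; fit_model and make_forecast are hypothetical placeholders for whatever model you are testing.

```python
import pandas as pd

def rolling_ex_post(df, first_cutoff, horizon_hours=24, n_origins=365):
    """df: hourly frame with 'timestamp', 'load' and ex post temperature columns."""
    errors = []
    cutoff = pd.Timestamp(first_cutoff)
    for _ in range(n_origins):
        train = df[df["timestamp"] <= cutoff]
        test = df[(df["timestamp"] > cutoff)
                  & (df["timestamp"] <= cutoff + pd.Timedelta(hours=horizon_hours))]
        model = fit_model(train)             # hypothetical: fit your model on history
        pred = make_forecast(model, test)    # hypothetical: forecast with actual temps (ex post)
        ape = (pred - test["load"]).abs() / test["load"]
        errors.append(ape.mean())
        cutoff += pd.Timedelta(hours=horizon_hours)   # roll the origin forward
    return 100 * pd.Series(errors).mean()             # MAPE in percent
```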

Then the question is whether the accuracy is "good enough". According to Table 3 of our GEFCom2012 paper, the winning teams improved on the benchmark by about 30% - see the "test" column, which corresponds to the private leaderboard of Kaggle. In other words, if your model achieves about a 30% error reduction compared to the Vanilla benchmark on this dataset, it is a decent model.
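One common way to express that improvement is the relative reduction of your model's error (e.g., MAPE) over the benchmark's error on the same test period; a trivial helper is sketched below with made-up numbers, not figures from the paper.

```python
def error_reduction(mape_benchmark, mape_model):
    """Relative error reduction over the benchmark, in percent."""
    return 100 * (mape_benchmark - mape_model) / mape_benchmark

# Illustrative only: a benchmark MAPE of 8.0% and a model MAPE of 5.6%
# correspond to a 30% error reduction.
print(error_reduction(8.0, 5.6))  # 30.0
```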

Please also understand that this 30% is gained from a forecasting system with many bells and whistles, such as detailed modeling of temperature and special treatment of holidays. If your research focuses on only one component, the error reduction may be much smaller than 30%. You can find a more detailed argument in my response to the second review comment in THIS POST.

It's been over two years since we published the GEFCom2012 data, and many researchers have already used it to test their models. You can also replicate the experiment setups in recently published papers that used this GEFCom2012 data and compare your results with the results in those papers.

Back to Datasets for Energy Forecasting.
