Saturday, November 19, 2016

FAQ for GEFCom2017 Qualifying Match

I have received many questions from GEFCom2017 contestants. Many thanks to those who raised the questions. This is a list of frequently asked questions. I will update it periodically if I get additional ones.

Q1. I can't open the link to the competition data. How can I get access to the data?

A1. If you cannot access the data directly via the provided link, you may need a VPN service. There are many free VPN services available. Use Google to find one, or post the question on the LinkedIn forum to see if your fellow contestants can help.

Q2. Can the competition organizer re-post the data somewhere else?

A2. No. We are not going to re-post the data during the competition, because ISO New England updates the data periodically.

Q3. Are we forecasting the same forecasting period in both Round 2 and Round 3? And another same forecasting period in both Round 4 and Round 5?

A3. For GEFCom2017-D, ISO New England updates the data every month, typically in the first half of the month. In Round 2, you will be using the data as of Nov 30, 2016. In Round 3, the December 2016 data should be available as well. For GEFCom2017-O, the data is being updated in real time. We would like to see if there is any improvement with an extra half month of information. This setup also gives the contestants some flexibility. If a team is busy with other commitments during the competition, it may submit the same forecast for both Round 2 and Round 3.

Q4. Can the same team join both tracks?

A4. Yes. A team may even submit the same forecasts to both tracks. Nevertheless, we are expecting higher accuracy in the forecasts of GEFCom2017-O than those of GEFCom2017-D.

Q5. Can one person join two or more teams?

A5. No.

Q6. I'm with a vendor. I don't know if my company wants to put its name as the team name. Can I join the competition personally? If I win, can I add my company as my affiliation and/or change the team name to my company's name?

A6. You can join the competition with or without linking your team to your company. However, you need to make the decision before registration. Once you are in the game, we cannot change your affiliation or team name.

Q7. Which benchmark method will be used?

A7. The benchmark method forecasts each zone individually. We will use the vanilla model as the underlying model and simulate the temperature by shifting 11 years of temperature data (2005 - 2015) up to 4 days forward and backward. With 9 shifts per year (from 4 days backward to 4 days forward, including no shift), this gives 11 x 9 = 99 scenarios, from which we extract 9 quantiles. See THIS PAPER for more details.
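
The scenario counting in A7 can be illustrated with a minimal Python sketch. It is not the organizer's benchmark code. The temperature data is synthetic, `toy_load_model` is a hypothetical stand-in for the vanilla model, the wrap-around shifting via `np.roll` is a simplification, and the 9 quantiles are assumed to be the 10th through 90th percentiles, which the answer above does not specify.

```python
import numpy as np

HOURS_PER_DAY = 24


def shifted_scenarios(temps_by_year, max_shift_days=4):
    """Stack one temperature scenario per (year, shift) pair.

    temps_by_year maps a year to a 1-D array of hourly temperatures,
    all of the same length. Shifts wrap around the ends of the series,
    which is a simplification made for this sketch.
    """
    rows = []
    for year in sorted(temps_by_year):
        series = np.asarray(temps_by_year[year], dtype=float)
        for shift in range(-max_shift_days, max_shift_days + 1):
            rows.append(np.roll(series, shift * HOURS_PER_DAY))
    return np.vstack(rows)


def toy_load_model(temps):
    """Hypothetical load response, NOT the vanilla model."""
    return 1000.0 + 10.0 * np.abs(temps - 65.0)


# Synthetic hourly temperatures for 2005-2015 (leap days ignored).
rng = np.random.default_rng(0)
temps_by_year = {
    year: 50.0 + 20.0 * rng.standard_normal(365 * HOURS_PER_DAY)
    for year in range(2005, 2016)
}

scenarios = shifted_scenarios(temps_by_year)            # 11 years x 9 shifts
loads = np.vstack([toy_load_model(s) for s in scenarios])

# Assumed quantile levels: 0.1, 0.2, ..., 0.9.
probs = np.linspace(0.1, 0.9, 9)
quantiles = np.quantile(loads, probs, axis=0)           # one row per quantile
```

Each column of `quantiles` holds the 9 quantile values for one hour of the forecast horizon, computed across the 99 simulated load paths.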

Q8. In GEFCom2017-D, are we required to process daylight savings time in a specific way?

A8. No. You can treat the daylight savings time any way you like. THIS POST elaborates my approach, which you don't have to follow.

Q9. In GEFCom2017-D, are we allowed to assume knowledge of federal holidays before 2011? Can we give special treatment to the days before and after the holidays?

A9. Yes, and yes. The website only publishes federal holidays starting from 2011. You can infer the federal holidays before 2011. You can model the days before and after holidays the way you like. I had a holiday effect section in my dissertation, which you don't have to follow. Keep in mind that you should not assume any knowledge about local events or local holidays, such as NBA final games and Saint Patrick's Day.
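
As one possible way to infer holidays for earlier years and flag the surrounding days, here is a minimal Python sketch. It covers only a handful of federal holidays whose dates follow simple calendar rules, it ignores the observed-date shifts for holidays that fall on weekends, and the function names are hypothetical. It is not the approach from my dissertation.

```python
from datetime import date, timedelta


def nth_weekday(year, month, weekday, n):
    """Date of the n-th given weekday (Mon=0 ... Sun=6) in a month."""
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7
    return first + timedelta(days=offset + 7 * (n - 1))


def inferred_holidays(year):
    """A partial, rule-based set of federal holidays for one year."""
    return {
        date(year, 1, 1),              # New Year's Day
        date(year, 7, 4),              # Independence Day
        nth_weekday(year, 9, 0, 1),    # Labor Day: first Monday of September
        nth_weekday(year, 11, 3, 4),   # Thanksgiving: fourth Thursday of November
        date(year, 12, 25),            # Christmas Day
    }


def holiday_flags(day):
    """Indicator features for a holiday and the days around it."""
    # Include neighboring years so that Dec 31 and Jan 2 are handled.
    holidays = set()
    for year in (day.year - 1, day.year, day.year + 1):
        holidays |= inferred_holidays(year)
    one_day = timedelta(days=1)
    return {
        "is_holiday": day in holidays,
        "day_before_holiday": (day + one_day) in holidays,
        "day_after_holiday": (day - one_day) in holidays,
    }
```

For example, `holiday_flags(date(2008, 11, 28))` marks the day after Thanksgiving 2008, a year the published holiday list does not cover.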

Q10. The sum of the 8 zones is slightly different from the total demand published by ISO New England. Which number will you use to evaluate the total demand?

A10. Column D of the "ISO NE CA" worksheet.

Q11. For GEFCom2017-D, are you going to provide weather forecasts that every team should use?

A11. No. It is an ex ante hierarchical probabilistic load forecasting problem. We do not provide weather forecasts. The contestants in the GEFCom2017-D track should not use any weather forecasts from other data sources. Nevertheless, the contestants may generate their own weather forecast if they want to. The weather forecasting methodology should be in the final report if they take this route.

Q12. No wind, solar or price forecasting in GEFCom2017? It's a pity!

A12. GEFCom2017 is a load forecasting competition. Unfortunately, we were not able to identify good datasets for wind, solar or price forecasting tracks that would match the challenge level of this load forecasting problem. Nevertheless, in GEFCom2017-O, you may leverage other data sources to predict wind, solar and prices, which may help your load forecasts.

Q13. I'm a professor. Any advice if I want to leverage this competition in class?

A13. It would be nice to leverage the competition in your course. I did so two years ago in GEFCom2014. There will again be an institute prize in GEFCom2017. To aim for the institute prize, I would recommend that you sign up as many teams as possible to maximize the likelihood of winning. What I did two years ago was to have each student form a single-person team and tie the competition ranking to their grades. In any case, if you are going to join the competition, it's better to have the students look into the data ASAP. The first round submission is due on 12/15/2016.

Q14. Any reference materials we should read before we dive into the competition problem?

A14. For probabilistic load forecasting, you should at least read this recent IJF review paper on probabilistic load forecasting and the relevant references. You can find my recent papers on probabilistic load forecasting HERE. The papers from winning entries of GEFCom2014 are HERE. For hierarchical forecasting, you can check out Hyndman and Athanasopoulos' BOOK and their PAPER.


  1. For the defined-data track, can we use lags around holidays? For example, an indicator that it is the day after Thanksgiving, or generally the days around holidays.

  2. For the defined-data track, can we use sunrise, sunset and noon timestamps to compute, e.g., sunup hours and other features? This computation also relies on the knowledge that ISO New England is situated in New England, USA. :)

    1. No, not for the defined-data track, because you would have to look it up from an outside data source. You can do that in the open track.

