Wednesday, May 17, 2017

Wind Speed for Load Forecasting Models

One way to categorize the load forecasting papers is based on the variables used in those forecasting models. Because many people who wrote load forecasting papers only had access to the load data with time stamps, they had to propose the models based on the load series only. The representative techniques include exponential smoothing and the ARIMA family. Sometimes people also include the calendar information to come up with some regression models with classification variables. Although these are good and powerful techniques, their real-world applications in load forecasting are very limited. I have criticized those "load-only" models in some of my papers, such as the IJF2016 paper on recency effect:
Both seasonal naïve models perform very poorly compared with the other four models. Seasonal naïve models are used commonly for benchmarking purposes in other industries, such as the retail and manufacturing industries. In load forecasting, the two applications in which seasonal naïve models are most useful are: (1) benchmarking the forecast accuracy for very unpredictable loads, such as household level loads; and (2) comparisons with univariate models. In most other applications, however, the seasonal naïve models and other similar naïve models are not very meaningful, due to the lack of accuracy. 
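For readers unfamiliar with the term, a seasonal naïve forecast simply repeats the value observed one season earlier. Here is a minimal sketch in Python; the function name, the 24-hour season and the toy load values are assumptions for illustration, not the exact setup used in the IJF2016 paper.

```python
def seasonal_naive(history, season_length, horizon):
    """Forecast each future step with the value observed one season earlier."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    # Repeat the last observed season as many times as the horizon requires
    return [last_season[h % season_length] for h in range(horizon)]


# Toy hourly loads (made-up numbers): two days, treated as a 24-hour season
loads = [100 + h for h in range(24)] + [110 + h for h in range(24)]
forecast = seasonal_naive(loads, season_length=24, horizon=48)
```

Because the forecast uses no weather information at all, it cannot react to a cold snap or a heat wave, which is one reason such models are mainly useful as benchmarks.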
Weather is a must-have in most real-world load forecasting models. The most frequently used weather variable in the load forecasting literature is temperature. Some system operators, such as ISO New England, publish temperature data along with the load information. The recent load forecasting competitions, such as GEFCom2012 and GEFCom2014, have also released several years of hourly load and temperature data for benchmarking purposes.

Although non-temperature weather variables have some presence in the load forecasting literature, they are rarely studied in the context of variable selection. Recently we published a TSG paper Relative Humidity for Load Forecasting Models, discussing how to use humidity information to improve load forecasting accuracy. As a sister of that humidity paper, this paper discusses how to include wind speed information in load forecasting models.

Another comment I want to make is on open access publication. I personally had no interest in publishing my papers with open access publishers. This was my first try, and it turned out to be a pleasant surprise. The reviews were returned to me rather quickly, within 10 days. There were no nonsense comments, so I didn't need to deal with the personal attacks as I normally had to do. Before the final publication, the copy editor helped clean up some typos we had in the submission. From our first submission to the final paginated version, the whole process took two weeks!

Anyway, hope that you enjoy reading this open access paper!


Jingrui Xie and Tao Hong, "Wind speed for load forecasting models", Sustainability, vol 9, no 5, pp 795, May, 2017 (open access).

Wind Speed for Load Forecasting Models

Jingrui Xie and Tao Hong


Temperature and its variants, such as polynomials and lags, have been the most frequently-used weather variables in load forecasting models. Some of the well-known secondary driving factors of electricity demand include wind speed and cloud cover. Due to the increasing penetration of distributed energy resources, the net load is more and more affected by these non-temperature weather factors. This paper fills a gap and need in the load forecasting literature by presenting a formal study on the role of wind variables in load forecasting models. We propose a systematic approach to include wind variables in a regression analysis framework. In addition to the Wind Chill Index (WCI), which is a predefined function of wind speed and temperature, we also investigate other combinations of wind speed and temperature variables. The case study is conducted for the eight load zones and the total load of ISO New England. The proposed models with the recommended wind speed variables outperform Tao’s Vanilla Benchmark model and three recency effect models on four forecast horizons, namely, day-ahead, week-ahead, month-ahead, and year-ahead. They also outperform two WCI-based models for most cases.
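The abstract describes the Wind Chill Index as a predefined function of wind speed and temperature but does not give the formula. As an illustration only, here is a sketch of one widely used definition, the 2001 US National Weather Service wind chill formula (temperature in °F, wind speed in mph). Whether the paper uses exactly this version is an assumption, and the function name is hypothetical.

```python
def wind_chill_f(temp_f, wind_mph):
    """2001 US National Weather Service wind chill, in degrees Fahrenheit.

    Assumption: the formula is meant for temp_f <= 50 and wind_mph > 3.
    Outside that range this sketch simply returns the air temperature.
    """
    if temp_f > 50 or wind_mph <= 3:
        return temp_f
    v = wind_mph ** 0.16
    return 35.74 + 0.6215 * temp_f - 35.75 * v + 0.4275 * temp_f * v


# A 30 F hour with a 10 mph wind feels like roughly 21 F
feels_like = wind_chill_f(30.0, 10.0)
```

Since the index is a fixed function, it leaves no room for the regression to learn how wind and temperature interact for a particular load zone, which is presumably why the paper also investigates other combinations of the two variables.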

Thursday, May 11, 2017

RTE Day-ahead Load Forecasting Competition 2017

For many years, the Transmission System Operator RTE has been building electricity demand forecasts, ensuring the ability to match supply and demand at all times and, consequently, guaranteeing power system reliability.

The emergence of new factors related to the energy transition is impacting electricity demand and making forecasting a more challenging task: self-consumption, growth of new uses (electric vehicles, heat pumps…), regulation of building insulation, new supply offers, possibility for consumers to monitor and control their consumption…

In this context of increasing flexibility and market rule harmonisation at the European level, RTE wants to conduct a review of current forecasting methods and assess the performance of new dynamic and adaptive approaches brought by Data Science.

A first challenge will focus on the deterministic short-term forecast of national and 12 regional electricity demands, a second one will focus on a forecast with associated uncertainty.

RTE will launch its first international public challenge in Data Science mid-May, running till mid-July. The second challenge will take place during winter 2017-2018.

Registration will be open from the opening date to the 24th of May on the platform :

All the information related to the challenge will be available on the platform mid-May. A discussion forum will allow participants to ask any questions they may have.     
Challenge rules

Participants will be given access to meteorological data provided by Météo France for RTE’s operational forecasting activities, and they will be able to retrieve national and regional demand data on RTE’s Eco2mix platform. Participants are allowed to use any other data, provided that source and nature are specified.

The models will be assessed on their ability to forecast demand of ten days (among which bank holidays) between the 25th of May and the 14th of July 2017.

These days will be announced at the challenge’s opening and will be used for the final ranking. Forecasts for day d must be submitted by 9pm on day d-1 at the latest.

In order to practice, participants will have the opportunity to submit forecasts on three consecutive days, from the 18th to the 20th of May.

The same rules regarding time of submission will apply.

Following the final ranking, the top three participants will have to submit a one page methodology document before being awarded their prize.

This document will describe the main principles of the method and the data used.
1st prize:      €10,000
2nd prize:     €5,000
3rd prize:      €3,000

Thursday, May 4, 2017

7 Reasons to Attend ISEA2017

The International Symposium on Energy Analytics (ISEA2017) is coming in 7 weeks. If you are still wondering whether you should join the event or not, here are 7 reasons for you to attend ISEA2017:

1. Grow your international network

ISEA2017 is truly international. The early registrations came from 16 countries. As a conference attendee, you will hear 20+ presentations describing methodologies and insights gained from various places in the world. You will also share your experience and expertise with this diverse audience and get their critiques and compliments.

2. Check out the winning methods of GEFCom2017

Selected GEFCom2017 teams will be presenting their methodologies at ISEA2017. You will witness the recognition of GEFCom2017 winners and have face-to-face discussions with them. Rather than reading thousands of energy forecasting papers published every year and wondering which ones work well, you can grasp the secret sauce of the most effective methods during ISEA2017.

3. Experience a novel peer review process 

Whether we like today's peer review system or not, we have to live with it, at least for the next few years before a better one is in place. We have tied ISEA2017 to an IJF special section on energy forecasting, where we try to implement a new peer review process. The ISEA2017 attendees will have the opportunity to experience this new process and help improve it.

4. Peek into and shape the future of energy analytics

If you are struggling with the topic for your next paper, ISEA2017 is a must-attend conference for you. We will discuss the emerging topics as well as the research agenda for the future. Rather than guessing where the future goes, you can contribute to the plan!

5. Attend International Symposium on Forecasting

The 37th International Symposium on Forecasting (ISF) will be held two days after ISEA2017, right at the same location. ISF is the only major scientific forecasting conference I know of. I find it very rewarding to attend ISF, where I hear forecasting topics from various industries, as well as the methodological breakthroughs in general. Many of them could be applied to the energy forecasting problems. Extending the trip to include ISF in your travel plan would be a wise choice.

6. Two world heritage sites in one place

The World Heritage Centre has a list of about 1000 world heritage sites around the globe. Two of them (Great Barrier Reef and Daintree Rainforest) are in Cairns, Australia, making Cairns the only place on this planet with two world heritage sites side by side. ISF organizers have planned a social program that includes numerous social events and tour opportunities for delegates, their friends and family.

7. Low registration fees

Our sponsors, the International Institute of Forecasters, Tangent Works and Journal of Modern Power Systems and Clean Energy, have generously contributed to the organization of ISEA2017, helping significantly subsidize the registration fees. If you attend both ISEA and ISF, there is an additional discount. To register for both ISEA and ISF, click HERE. To register for ISEA only, click HERE.

ISEA2017 will be held in Cairns, Australia, June 22-23, 2017. Look forward to seeing you there!

Monday, March 20, 2017

GEFCom2014 Load Forecasting Data

The load forecasting track of GEFCom2014 was about probabilistic load forecasting. We asked the contestants to provide one-month ahead hourly probabilistic forecasts on a rolling basis for 15 rounds. In the first round, we provided 69 months of hourly load data and 117 months of hourly temperature data. Incremental load and temperature data was provided in each of the future rounds.

Where to download the data?

The complete data was published as the appendix of our GEFCom2014 paper. If you don't have access to Science Direct, you can download it from my Dropbox link HERE. Regardless of where you get the data, you should cite this paper to acknowledge the source:

  • Tao Hong, Pierre Pinson, Shu Fan, Hamidreza Zareipour, Alberto Troccoli and Rob J. Hyndman, "Probabilistic energy forecasting: Global Energy Forecasting Competition 2014 and beyond", International Journal of Forecasting, vol.32, no.3, pp 896-913, July-September, 2016.

What's in the package?

Unzip the file and you will see the folder "GEFCom2014 Data", which includes five zip files. The data for the probabilistic load forecasting track of GEFCom2014 is in the file "". Unzip it and you will see the folder "load", which includes an "Instructions.txt" file and 15 subfolders. In each folder named "Task n", there are two files, Ln-train.csv and Ln-benchmark.csv. The train file, together with the train files released in previous rounds, can be used to generate forecasts. The benchmark file includes the forecast generated from the benchmark method.
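Since the training data for round n is the train file of "Task n" plus the train files of all earlier rounds, a small helper can stitch them together. The sketch below uses only the Python standard library and follows the folder and file naming described above. The function name is hypothetical, and it assumes each CSV file starts with a header row; check "Instructions.txt" for the actual column layout.

```python
import csv
from pathlib import Path


def read_train_rows(load_dir, up_to_task):
    """Concatenate the rows of Ln-train.csv for tasks 1..up_to_task."""
    rows = []
    for n in range(1, up_to_task + 1):
        path = Path(load_dir) / f"Task {n}" / f"L{n}-train.csv"
        with path.open(newline="") as f:
            # Assumption: the first line of each file is a header row
            rows.extend(csv.DictReader(f))
    return rows
```

For example, `read_train_rows("load", 3)` would return everything available to a contestant at the start of the third round, as a list of dictionaries keyed by column name.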

How to use the data?

The most straightforward way of using this dataset is to replicate the competition setup and compare results directly with the top entries. Because the data published through GEFCom2014 is quite long (7 years of matching load and temperature data in total), we can also use this dataset to test methods and models for short term load forecasting.

GEFCom2014-E data

After GEFCom2014, I organized an in-class probabilistic load forecasting competition in Fall 2015 that was open to external participants. My in-class competition setup was very similar to that of GEFCom2014, so I denoted the data for this in-class load forecasting competition as GEFCom2014-E, where E is the abbreviation of "extended". In total, this dataset covers 11 years of hourly temperature and 9 years of hourly load. One of the top teams, Florian Ziel, was invited to contribute a paper to IJF (see HERE). Readers may replicate the same competition setup and compare results with Ziel's.


Note that the data I used for GEFCom2014-E was created using ISO New England data. If you want to validate a method using two independent sources, you should not use GEFCom2014-E together with ISO New England data.

Back to Datasets for Energy Forecasting.

Monday, March 6, 2017

Leaderboard for GEFCom2017 Qualifying Match!!!

[Update 5/18/2017]: ISO NE just released the April load data two days ago. Jingrui and I have updated the leaderboard for the qualifying match. Please check the rankings and let us know by 5/26/2017 if there is any issue.

The six rounds of GEFCom2017 qualifying match just ended last week. I'm sure that the contestants are anxiously waiting for the leaderboard. Here is a brief report. I'll update this post as ISO New England releases its recent load data.

Out of 177 registered teams, 73 have submitted entries to the defined track, and 26 to the open track. After six rounds, 53 teams completed the defined track with at least 4 submissions, while 20 completed the open track. 

The due date of report and code is on March 10th, 2017. Please send them to Follow the same protocol as the forecast submissions. Please follow THIS GUIDE to prepare the report.

Jingrui Xie created two benchmarks:
  • Vanilla Benchmark, which has been used to calculate the scores of the teams in each round. See Q7 of THIS FAQ for more information.
  • Rain Benchmark, which will be used to select the teams advancing to the final match.
(As an organizer of GEFCom2017, Jingrui Xie is not eligible for the prize.)

The spreadsheet with detailed scores can be accessed HERE. The higher the score is, the higher the rank is. 

Stay tuned :)

Tuesday, February 14, 2017

Call For Papers: Forecasting in Modern Power Systems | Journal of Modern Power Systems and Clean Energy

Journal of Modern Power Systems and Clean Energy

Special Section on Forecasting in Modern Power Systems 

Power systems have been evolving over the past century. The grid is getting more and more sophisticated due to modern technologies and business requirements, such as implementation of smart grid technologies, deployment of ultra-high voltage transmission systems, and integration of ultra-high levels of renewable resources. All of these factors are challenging today’s energy forecasting practice. This special section of the Journal of Modern Power Systems and Clean Energy is aimed at answering the following question: How to better forecast the supply, demand and prices to accommodate the changes in modern power systems?

The topics of interest include, but are not limited to:

  • Probabilistic energy forecasting
  • Forecasting in multiple energy systems
  • High dimensional wind and solar power forecasting
  • Load forecasting with temporal and/or geographic hierarchies
  • Combination methods for energy forecasting

Submission Guidelines or link via

The article templates can be downloaded from

Important Dates

Paper Submission Deadline:    June 30, 2017
Acceptance Notification:         December 31, 2017
Date of Publication:                 March 2018

Guest Editorial Board

Guest Editors-in-Chief
Wei-Jen Lee, University of Texas at Arlington, USA
Tao Hong, University of North Carolina at Charlotte, USA

Guest Editors
Jing Huang, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia
Duehee Lee, Arizona State University, USA
Franklin Quilumba, National Polytechnic School, Ecuador
Jingrui Xie, SAS Institute, USA
Ning Zhang, Tsinghua University, China
Florian Ziel, University of Duisburg-Essen, Germany

Editor-In-Chief and Deputy

Professor Yusheng Xue (State Grid Electric Power Research Institute, Nanjing, China)
Professor Kit Po Wong (The University of Western Australia)

For more information, please do not hesitate to contact
Ms. Ying ZHENG
Tel: 86 25 8109 3060 Fax: 86 25 8109 3040

About Journal of Modern Power Systems and Clean Energy (MPCE)

MPCE, sponsored by the State Grid Electric Power Research Institute (SGEPRI), is a gold open access, peer-reviewed, bimonthly journal published in English. It has been published by SGEPRI Press and Springer-Verlag GmbH Berlin Heidelberg since June 2013. It is indexed in SCIE, Scopus, Google Scholar, CSAD, DOAJ, CSA, OCLC, SCImago, ProQuest, etc. It is the first international power engineering journal originating in mainland China. MPCE publishes original papers, short letters and review articles in the field of modern power systems, with a focus on smart grid technology and renewable energy integration. MPCE is dedicated to presenting top-level academic achievements in the fields of modern power systems and clean energy by international researchers and engineers, and endeavors to serve as a bridge between Chinese and global researchers in the power industry.

Monday, February 6, 2017

Mark Your 2017 Calendar: Tao's Recommended Conferences for Energy Forecasters

I didn't realize this post was overdue until I hit the road for my first trip of 2017. Here is the 2017 list of my recommended conferences for energy forecasters:

1. International Symposium on Energy Analytics (ISEA2017, Cairns, Australia, June 22-23, 2017)

Even if you missed all the other events down this list, you can still find the year rewarding by attending ISEA2017, the first-ever gathering of world-wide energy forecasters. Our generous sponsors, the International Institute of Forecasters (Super Sponsor), Tangent Works (Gigawatt Sponsor) and the State Grid Electric Power Research Institute (Kilowatt Sponsor), have helped bring the registration fees down. There are many reasons to join the party. You will meet the winners of GEFCom2017. You will hear the presentations from world-class energy forecasting researchers and practitioners. You will network with energy forecasting colleagues from more than a dozen countries. And of course, you will enjoy two World Heritage sites side-by-side.

2. Tao's courses

The next two SAS courses on load forecasting have been scheduled in Charlotte, March 27-29.

In addition, I'm going to teach these three courses through EUCI:

Stay tuned to the training page of Hong Analytics for the latest updates on all training courses.

3. Conferences from other professional organizations

I will attend the following three, as always:

Look forward to seeing you at these fantastic events!

Sunday, January 1, 2017

Energy Forecasting @2016

Happy New Year! As a tradition of this blog, it's time to look at the statistics of Energy Forecasting in 2016.

Where are the readers?

They are from 147 countries and SARs.

They are from 2660 cities.

Comparing with Energy Forecasting @2015.

All-time top 10 most viewed posts (from 4478 views to 2731 views):
Top 10 most-viewed classic posts (from 3914 views to 1525 views):
Thank you very much for your support! Happy Forecasting in 2017!

Wednesday, December 21, 2016

2016 Greetings from IEEE Working Group on Energy Forecasting

Another Christmas is coming in a few days. It's time to look back at 2016 and see what the IEEE Working Group on Energy Forecasting has done:

Next year will be even more exciting:
  • We will hold the International Symposium on Energy Analytics (ISEA2017), the first-ever gathering of world-wide energy forecasters in Cairns, Australia, the only place on earth with two World Heritage sites side-by-side, Great Barrier Reef and the Daintree Rainforest.  
  • We will conclude GEFCom2017 at ISEA2017 with the winner presentations and prizes. 
  • A PESGM2017 panel session on multiple energy systems is being organized by Ning Zhang and myself. 
  • I will be editing a special issue for the Power & Energy Magazine on big data analytics. The papers are by invitation only. If you have any good idea and would like to present it to thousands of PES members through this special issue, please let me know. 
  • We didn't have the bandwidth for JREF this year. We will try to conduct the JREF survey next year. 

Happy Holidays and Happy Forecasting!

Tuesday, December 20, 2016

Winning Methods from npower Forecasting Challenge 2016

RWE npower released the final leaderboard for its forecasting challenge 2016. I took a screenshot of the top teams. Interestingly, the international teams (colored in red) took all of the top 6 places. Unfortunately, some of the top-notch UK load forecasters did not join the competition. I'm hoping that they can show up at the game to defend the country's legacy :)

RWE npower Forecasting Challenge 2016 Final Leaderboard (top 12 places)

In each of the previous two npower competitions, I asked my BigDEAL students to join the competition as a team. In both competitions, they ranked at the top, beating all UK teams (see the blog posts HERE and HERE). We also published our winning methods for electricity demand forecasting and gas demand forecasting.

This year, instead of forming a BigDEAL team, I sent the students in my Energy Analytics class to the competition. The outcome is again very pleasing. The UNCC students took two of the top three places, and four of the top six places. What makes me, a professor, very happy is the fact that the research findings have been fully integrated into the teaching materials and smoothly transferred to the students in the class. (See my research-consulting-teaching circle HERE.)

OK, enough bragging...

I asked the top teams to share their methodologies with the audience of my blog, as we did in BFCom2016s. Here they are:

1st Place: Geert Scholma

My forecast this time consisted of the following elements:
- linear regression models separated per 30 minute period with 78 variables each
- fourth degree yearly shapes per weekday as a base shape
- an intercept, 6 weekdays and 22 holiday, bridgeday and schoolholiday variables
- daylight savings and a linear time trend, each separated for weekdays and weekends
- a shift at September 2014 and a night variable
- conversion of temperature to windchill
- third degree windchill polynomials for cooling and heating with different impacts
- three moving averages with different periods for temperature effects occurring at different timescales
- different radiation variables depending on time of day with up to 6 hourly and moving average radiation variables interacted with a second degree polynomial of the day of year for peak hours
- 1 hourly and 1 moving average rainfall variable
- manual exclusion of outliers and filling of any weather gaps
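One element of the list above, moving averages of temperature over different periods, can be sketched as follows. This is only an illustration: the window lengths, the toy temperatures and the function name are assumptions, since Geert does not state the periods he used.

```python
def trailing_mean(values, window):
    """Mean of the current value and the previous window-1 values.

    Entries are None until a full window of data is available.
    """
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the value that just fell out of the window
            running -= values[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


# Toy temperatures (made-up numbers) and three hypothetical window lengths
temps = [10.0, 12.0, 14.0, 13.0, 11.0, 9.0]
features = {w: trailing_mean(temps, w) for w in (2, 3, 6)}
```

Each window length yields one extra regression variable. A short window tracks quick changes in the weather, while a long one captures the slow thermal response of buildings.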

2nd Place: Devan Patel

Model: A multiple linear regression approach was used during the NPower forecasting competition. The basic model was Tao’s Vanilla Benchmark model. A major change was made to the form of the dependent variable, Energy Consumption. A Box-Cox transformation of Energy Consumption was applied based on the distribution of the training data. Polynomials of Humidity and Wind Speed were added to the base model. With the help of these changes, the performance of the Vanilla Benchmark model was improved. During testing, these changes improved the accuracy of the Vanilla model by around 1.5% in terms of MAPE.
Data: Two different approaches were used to train the model. During winter (Round 1 and Round 3), the model was trained using the whole year’s data. During summer (Round 2), only the summer months’ data was used for model training. Scatter plots across different months were helpful for understanding the distribution of energy consumption.
Exploratory data analysis: The missing values of the hours were replaced by the values of the same hours of the previous day. Scatter plots of temperature, humidity and wind speed were used to identify their relationships with energy consumption.
Error metric: MAPE was used as the base error metric to evaluate the accuracy of the forecast during model validation.
Software: RStudio was the main software for model building, validation and forecasting. MS Excel was used to prepare the data files for use in RStudio.
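Two ingredients of Devan's description, the Box-Cox transformation of the dependent variable and MAPE as the error measure, can be sketched with the standard library alone. Devan worked in RStudio; the Python below is just an illustration. The function names and the value of the parameter `lam` are assumptions, because the text does not say which value was selected from the training data.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive values")
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam


def inv_box_cox(z, lam):
    """Map a transformed value back to the original scale."""
    return math.exp(z) if lam == 0 else (lam * z + 1.0) ** (1.0 / lam)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


# Fit on the transformed scale, then back-transform before scoring
z = box_cox(50.0, 0.5)        # lam = 0.5 is a made-up value
back = inv_box_cox(z, 0.5)    # approximately 50.0 again
score = mape([100.0, 200.0], [110.0, 180.0])
```

The key point is that the regression is fitted to the transformed consumption, so the forecasts must be back-transformed before computing MAPE against the actual values.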

3rd Place: Masoud Sobhani

For the first round, the model was Tao's Vanilla model with recency effects (adding extra lagged temperatures to the original model). The model uses the MLR method, and the predictors are calendar variables, temperature, lagged values of temperature and cross effects between them. The model was implemented in SAS. For the second round, I tried to improve the Vanilla model by adding more predictors beyond temperature. Humidity was added to the model using the method introduced in Xie and Hong 2016. The new model was an improved model having temperature and relative humidity as weather related predictors. Since we didn't know the location of the utility, I tried several variations of the new model and selected the one with the best results. For the third round, the model used in the previous round was improved by adding some lagged values of relative humidity. In each round, the model selection was done by cross validation.
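The "extra lagged temperature" idea can be sketched as follows. Masoud implemented his model in SAS; this Python snippet is only an illustration. The function name, the number of lags and the toy temperatures are assumptions, and the calendar variables and cross effects mentioned above are left out for brevity.

```python
def lagged_temperature_rows(temps, max_lag):
    """Build rows [T_t, T_(t-1), ..., T_(t-max_lag)] for each hour t.

    The first max_lag hours are skipped because their lags are not available.
    """
    rows = []
    for t in range(max_lag, len(temps)):
        rows.append([temps[t - k] for k in range(max_lag + 1)])
    return rows


# Toy hourly temperatures (made-up numbers) with two hypothetical lags
temps = [50.0, 52.0, 55.0, 53.0]
rows = lagged_temperature_rows(temps, max_lag=2)
```

Each row then supplies the temperature predictors for one hour of the regression. The same construction applied to relative humidity gives the lagged humidity terms used in the third round.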