Sunday, January 1, 2017

Energy Forecasting @2016

Happy New Year! As a tradition of this blog, it's time to look at the statistics of Energy Forecasting in 2016.

Where are the readers?

They are from 147 countries and SARs.


They are from 2660 cities.


A comparison with Energy Forecasting @2015:


All-time top 10 most viewed posts (from 4478 views to 2731 views):
Top 10 most-viewed classic posts (from 3914 views to 1525 views):
Thank you very much for your support! Happy Forecasting in 2017!

Wednesday, December 21, 2016

2016 Greetings from IEEE Working Group on Energy Forecasting

Another Christmas is coming in a few days. It's time to look back at 2016 and see what the IEEE Working Group on Energy Forecasting has done:

Next year will be even more exciting:
  • We will hold the International Symposium on Energy Analytics (ISEA2017), the first-ever gathering of worldwide energy forecasters, in Cairns, Australia, the only place on earth with two World Heritage sites side by side, the Great Barrier Reef and the Daintree Rainforest.  
  • We will conclude GEFCom2017 at ISEA2017 with the winner presentations and prizes. 
  • A PESGM2017 panel session on multiple energy systems is being organized by Ning Zhang and myself. 
  • I will be editing a special issue for the Power & Energy Magazine on big data analytics. The papers are by invitation only. If you have any good ideas and would like to present them to thousands of PES members through this special issue, please let me know. 
  • We didn't have the bandwidth for JREF this year. We will try to conduct the JREF survey next year. 

Happy Holidays and Happy Forecasting!

Tuesday, December 20, 2016

Winning Methods from npower Forecasting Challenge 2016

RWE npower released the final leaderboard for its forecasting challenge 2016. I took a screenshot of the top teams. Interestingly, the international teams (colored in red) took all of the top 6 places. Unfortunately, some of those top-notch UK load forecasters did not join the competition. I'm hoping that they can show up at the game to defend the country's legacy:)

RWE npower Forecasting Challenge 2016 Final Leaderboard (top 12 places)


In each of the previous two npower competitions, I asked my BigDEAL students to join the competition as a team. In both competitions, they ranked at the top, beating all the UK teams (see the blog posts HERE and HERE). We also published our winning methods for electricity demand forecasting and gas demand forecasting.

This year, instead of forming a BigDEAL team, I sent the students in my Energy Analytics class to the competition. The outcome is again very pleasing. The UNCC students took two of the top three places, and four of the top six. What makes me, a professor, very happy is that the research findings have been fully integrated into the teaching materials and smoothly transferred to the students in the class. (See my research-consulting-teaching circle HERE.)

OK, enough bragging...

I asked the top teams to share their methodologies with the audience of my blog, as we did in BFCom2016s. Here they are:

1st Place: Geert Scholma

My forecast this time consisted of the following elements:
- linear regression models separated per 30-minute period, with 78 variables each
- fourth-degree yearly shapes per weekday as a base shape
- an intercept, 6 weekday and 22 holiday, bridge-day and school-holiday variables
- daylight savings and a linear time trend, each separated for weekdays and weekends
- a shift at September 2014 and a night variable
- conversion of temperature to windchill
- third-degree windchill polynomials for cooling and heating, with different impacts
- three moving averages with different periods, for temperature effects occurring at different timescales
- different radiation variables depending on time of day, with up to 6 hourly and moving-average radiation variables interacted with a second-degree polynomial of the day of year for peak hours
- 1 hourly and 1 moving-average rainfall variable
- manual exclusion of outliers and filling of any weather gaps
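The core of the approach, one regression per 30-minute period, can be sketched as follows. This is a toy illustration with simulated data and a drastically reduced variable set (an intercept, a weekend dummy, and a third-degree windchill polynomial), not the winning code:

```python
# Minimal sketch of per-period linear regression: fit one OLS model
# per 30-minute period of the day (48 models in total).
import numpy as np

rng = np.random.default_rng(0)
n_days, n_periods = 200, 48           # toy history: 200 days x 48 half-hours

# Toy daily features: intercept, weekend dummy, third-degree windchill
windchill = rng.normal(5, 8, n_days)
weekend = (np.arange(n_days) % 7 >= 5).astype(float)
X = np.column_stack([np.ones(n_days), weekend,
                     windchill, windchill**2, windchill**3])

models = []                           # one coefficient vector per period
for p in range(n_periods):
    # Simulated demand: each period responds differently to the drivers
    y = 100 + 2 * p - 1.5 * windchill + 10 * weekend + rng.normal(0, 3, n_days)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    models.append(beta)

print(len(models))                    # 48 separate models
```

The real model adds yearly shapes, holidays, trends, radiation and rainfall terms in the same per-period framework.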

2nd Place: Devan Patel

Model: A multiple linear regression approach was used during the npower forecasting competition. The base model was Tao's Vanilla Benchmark model. A major change was made to the dependent variable, energy consumption: a Box-Cox transformation was applied, based on the training data distribution. Polynomials of humidity and wind speed were added to the base model. These changes improved the performance of the benchmark vanilla model; during testing, they improved its accuracy by around 1.5% in terms of MAPE.
Data: Two different approaches were used to train the model. During winter (Rounds 1 and 3), the model was trained using the whole year's data. During summer (Round 2), only the summer months' data was used. Scatter plots across different months were helpful for understanding the distribution of energy consumption.
Exploratory data analysis: Missing hourly values were replaced by the previous day's values. Scatter plots of temperature, humidity and wind speed were used to identify their relationships with energy consumption.
Error metric: MAPE was used as the error metric to evaluate forecast accuracy during model validation.
Software: RStudio was used as the main software for model building, validation and forecasting. MS Excel was used to prepare the data files for use in RStudio.
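The Box-Cox step can be sketched as follows. This is a toy illustration in Python with simulated data (the contestant worked in R), using only temperature as a predictor:

```python
# Sketch of the Box-Cox idea: transform the skewed load before regression,
# fit on the transformed scale, then invert the transform for forecasts.
import numpy as np
from scipy.stats import boxcox
from scipy.special import inv_boxcox

rng = np.random.default_rng(1)
temp = rng.uniform(-5, 30, 500)
load = np.exp(4 + 0.03 * temp + rng.normal(0, 0.05, 500))  # skewed toy load

y, lam = boxcox(load)                  # lambda estimated from training data
X = np.column_stack([np.ones_like(temp), temp])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

pred = inv_boxcox(X @ beta, lam)       # forecasts back on the original scale
print(pred.shape)
```

In the actual entry, the regression was Tao's Vanilla Benchmark model augmented with humidity and wind speed polynomials, rather than this single-predictor toy.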

3rd Place: Masoud Sobhani

For the first round, the model was Tao's Vanilla model with recency effects (adding extra lagged temperatures to the original model). The model uses MLR, and the predictors are calendar variables, temperature, lagged values of temperature, and the cross effects between them. The model was implemented in SAS. For the second round, I tried to improve the Vanilla model by adding predictors beyond temperature. Humidity was added using the method introduced in Xie and Hong 2016, giving an improved model with temperature and relative humidity as the weather-related predictors. Since we didn't know the location of the utility, I varied the new model to select the one with the best results. For the third round, the previous round's model was improved by adding some lagged values of relative humidity. In each round, model selection was done by cross-validation.
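The recency-effect idea, adding lagged and moving-average temperatures as extra predictors, can be sketched in a few lines. This is a generic Python/pandas illustration with simulated data, not the SAS implementation used in the competition:

```python
# Sketch of recency effects: lagged hourly temperatures and a 24-hour
# moving average appended as additional regression predictors.
import numpy as np
import pandas as pd

idx = pd.date_range("2016-01-01", periods=24 * 30, freq="h")
df = pd.DataFrame(
    {"temp": 15 + 10 * np.sin(np.arange(len(idx)) * 2 * np.pi / 24)},
    index=idx,
)

for lag in (1, 2, 3):                          # lagged hourly temperatures
    df[f"temp_lag{lag}"] = df["temp"].shift(lag)
df["temp_ma24"] = df["temp"].rolling(24).mean()  # 24-hour moving average

df = df.dropna()                               # drop rows lacking full history
print(df.columns.tolist())
```

The augmented columns then enter the MLR model alongside the calendar variables and current temperature.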

Monday, November 28, 2016

7 Reasons to Send Your Best Papers to IJF

Last week, I was surfing the Web of Science to gather some papers to read during the holidays. Yes, some poor professors like myself work 24x7, including holidays. Suddenly I found that FIVE of my papers are listed by Essential Science Indicators (ESI) as Highly Cited Papers. (Check them out HERE!) What a good surprise for Thanksgiving :)

What's even more surprising is that all five of these papers were published by the International Journal of Forecasting! As an editorial board member of two very prestigious and highly ranked journals, IEEE Transactions on Smart Grid (TSG) and the International Journal of Forecasting (IJF), I send my best papers to these two journals every year, with an even split. So far, I've had six papers in TSG (not counting two editorials) and six in IJF. How come only my IJF papers were recognized by ESI?

Curiosity ate most of my Thanksgiving time. I did some research to answer this question, which eventually led to this blog post. In short,
you should send your best energy forecasting papers to IJF first!
Here is why:
  1. No page limit. IJF does not charge authors for extra pages. You can take as many pages as you like to elaborate your idea. The longest IJF paper I've read is Rafal Weron's 52-page review paper on price forecasting. My IJF review on probabilistic load forecasting is 25 pages long. Both reviews are now ESI Highly Cited Papers. 
  2. Short review time. A manuscript first reaches the Editor-in-Chief, then an editor, and then an associate editor. It may be rejected by any of these three people. In other words, if it is a clear rejection, the decision will come to you rather quickly. If the manuscript is assigned to reviewers, the first decision typically comes back within three to four months. 
  3. Very professional comments. I have seen many IJF review reports so far, as an author, reviewer and editor. Most of them are very professional. Ultimately, these review comments help the authors improve their work. I haven't seen any nonsense reviews in the IJF peer-review system, which is quite remarkable! I guess the editors have done their job well by filtering out the nonsense reviewers before passing the comments to the authors. 
  4. High-quality copy-editing service free of charge. Once the manuscript is accepted, it will be forwarded to a professional copy editor who polishes the English for free, so you don't need to spend too much time wordsmithing. You don't need to worry about formatting either, because another copy editor handles that before the publisher sends you the proof. 
  5. Biennial awards. Every other year, IJF awards a prize for the best paper published in a two-year period. The prize is $1000 plus an engraved plaque. Details of the most recent one can be found HERE. Making some money and getting recognized for your paper: isn't that nice? 
  6. Publicity. Six years ago when I was pursuing my PhD, I was frustrated about the many useless papers in the literature. I brought my frustration to David Dickey. He made a comment that shocked me for a while. Instead of encouraging me to publish, he said that he had lost interest in publishing papers, because "the excellent papers are often buried by so many bad ones". Having been a professor for about three years, I have to agree with him. I believe in the era of "publish or perish", we have to "publish and publicize" to make our papers highly cited. Publishing your energy forecasting papers with IJF means that you get the opportunity of leveraging various channels, such as Hyndsight, Energy Forecasting, and the social media accounts of Elsevier and those renowned IJF editors. 
  7. "Business and economics" category in ESI. This is probably the most important distinction between IEEE Transactions and IJF. Many IEEE Transactions papers (including the ones in TSG) are grouped into engineering, while IJF papers are in the category of business and economics. The business and economics papers get much fewer citations on average than the engineering ones, which makes the ESI thresholds of business and economics lower than those of engineering. For instance, my TSG2014 paper is not an ESI paper, but it would have been if it were published by IJF. 
Unfortunately, IJF's acceptance rate is very low. To increase the chance of having your paper accepted, you should understand how reviewers evaluate manuscripts.

Looking forward to your next submission!

Saturday, November 19, 2016

FAQ for GEFCom2017 Qualifying Match

I have received many questions from GEFCom2017 contestants. Many thanks to those who raised the questions. This is a list of frequently asked questions. I will update it periodically if I get additional ones.

Q1. I can't open the link to the competition data. How to get access to the data?

A1. If you cannot access the data via the provided link directly, you may need a VPN service. There are many free VPN services available. Use Google to find one, or post the question on LinkedIn forum to see if your peer contestants can help.

Q2. Can the competition organizer re-post the data somewhere else?

A2. No. We are not going to re-post the data during the competition, because ISO New England updates the data periodically.

Q3. Are we forecasting the same forecasting period in both Round 2 and Round 3? And another same forecasting period in both Round 4 and Round 5?

A3. For GEFCom2017-D, ISO New England updates the data every month, typically in the first half of the month. In Round 2, you will be using the data as of Nov 30, 2016. In Round 3, the December 2016 data should be available as well. For GEFCom2017-O, the data is being updated in real time. We would like to see if there is any improvement with half a month of additional information. This setup also gives the contestants some flexibility: if a team is busy with other commitments during the competition, it may submit the same forecast for both Round 2 and Round 3.

Q4. Can the same team join both tracks?

A4. Yes. A team may even submit the same forecasts to both tracks. Nevertheless, we are expecting higher accuracy in the forecasts of GEFCom2017-O than those of GEFCom2017-D.

Q5. Can one person join two or more teams?

A5. No.

Q6. I'm with a vendor. I don't know if my company wants to put its name as the team name. Can I join the competition personally? If I win, can I add my company as my affiliation and/or change the team name to my company's name?

A6. You can join the competition with or without linking your team to your company. However, you need to make the decision before registration. Once you are in the game, we cannot change your affiliation or team name.

Q7. Which benchmark method will be used?

A7. The benchmark method forecasts each zone individually. We will use the vanilla model as the underlying model and simulate the temperature by shifting 11 years of temperature data (2005 - 2015) up to 4 days forward and backward to come up with 99 scenarios, from which 9 quantiles will be extracted. See THIS PAPER for more details.
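My reading of that scenario-generation recipe can be sketched as follows. The temperatures are simulated and the point-forecast model is a toy stand-in, not the actual vanilla model; see the referenced paper for the real benchmark:

```python
# Sketch of the benchmark: shift each of 11 historical temperature years
# by -4..+4 days (11 * 9 = 99 scenarios), run each through the point
# forecasting model, and take 9 quantiles of the resulting loads.
import numpy as np

rng = np.random.default_rng(2)
hours_per_year = 8760
years = [rng.normal(12, 9, hours_per_year) for _ in range(11)]  # toy temps

horizon = 24 * 31                      # one month of hourly forecasts
scenarios = []
for temps in years:
    for shift_days in range(-4, 5):    # 4 days backward to 4 days forward
        shifted = np.roll(temps, shift_days * 24)
        scenarios.append(shifted[:horizon])
scenarios = np.array(scenarios)        # shape (99, horizon)

def point_forecast(temp):
    # Stand-in for the vanilla model: load rises with heating/cooling needs
    return 1000 + 20 * np.maximum(temp - 18, 0) + 15 * np.maximum(10 - temp, 0)

loads = point_forecast(scenarios)
quantiles = np.percentile(loads, np.arange(10, 100, 10), axis=0)  # 9 x horizon
print(quantiles.shape)
```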

Q8. In GEFCom2017-D, are we required to process daylight savings time in a specific way?

A8. No. You can treat the daylight savings time any way you like. THIS POST elaborates my approach, which you don't have to follow.

Q9. In GEFCom2017-D, are we allowed to assume the knowledge of federal holidays before 2011? Can we give special treatment to the days before and after the holidays?

A9. Yes, and yes. The opm.gov website only publishes federal holidays starting from 2011. You can infer the federal holidays before 2011. You can model the days before and after holidays the way you like. I had a holiday effect section in my dissertation, which you don't have to follow. Keep in mind that you should not assume any knowledge about local events or local holidays, such as NBA final games and Saint Patrick's Day.
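For illustration, most federal holidays before 2011 can be inferred from their statutory date rules. Here is a sketch covering a few of them (verify the inferred dates against an authoritative source before relying on them):

```python
# Inferring some US federal holidays from their date rules.
from datetime import date, timedelta

def nth_weekday(year, month, weekday, n):
    """n-th given weekday (Mon=0) of a month, e.g. 4th Thursday of November."""
    d = date(year, month, 1)
    offset = (weekday - d.weekday()) % 7
    return d + timedelta(days=offset + 7 * (n - 1))

def last_weekday(year, month, weekday):
    """Last given weekday of a month."""
    d = date(year, month + 1, 1) - timedelta(days=1) if month < 12 else date(year, 12, 31)
    return d - timedelta(days=(d.weekday() - weekday) % 7)

year = 2008
holidays = {
    "New Year's Day": date(year, 1, 1),
    "MLK Day": nth_weekday(year, 1, 0, 3),        # 3rd Monday of January
    "Memorial Day": last_weekday(year, 5, 0),     # last Monday of May
    "Independence Day": date(year, 7, 4),
    "Labor Day": nth_weekday(year, 9, 0, 1),      # 1st Monday of September
    "Thanksgiving": nth_weekday(year, 11, 3, 4),  # 4th Thursday of November
    "Christmas": date(year, 12, 25),
}
print(holidays["Thanksgiving"])   # 2008-11-27
```

Holidays observed on fixed dates shift to the adjacent weekday when they fall on a weekend; that observance rule is not handled in this sketch.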

Q10. The sum of the 8 zones is slightly different from the total demand published by ISO New England. Which number will you use to evaluate the total demand?

A10. Column D of the "ISO NE CA" worksheet.

Q11. For GEFCom2017-D, are you going to provide weather forecasts that every team should use?

A11. No. It is an ex ante hierarchical probabilistic load forecasting problem. We do not provide weather forecasts. The contestants in the GEFCom2017-D track should not use any weather forecasts from other data sources. Nevertheless, the contestants may generate their own weather forecast if they want to. The weather forecasting methodology should be in the final report if they take this route.

Q12. No wind, solar or price forecasting in GEFCom2017? It's a pity!

A12. GEFCom2017 is a load forecasting competition. Unfortunately, we were not able to identify good datasets to set up wind, solar or price forecasting tracks matching the challenge level of this load forecasting problem. Nevertheless, in GEFCom2017-O, you may leverage other data sources to predict wind, solar and prices, which may be good for your load forecasts.

Q13. I'm a professor. Any advice if I want to leverage this competition in class?

A13. It would be nice to leverage the competition in your course. I did so two years ago in GEFCom2014. There will again be an institute prize in GEFCom2017. To aim for the institute prize, I would recommend signing up as many teams as possible to maximize the likelihood of winning. What I did two years ago was to have each student form a single-person team and tie the competition ranking to their grades. Anyway, if you are going to join the competition, it's better to have the students look into the data ASAP. The first round submission is due on 12/15/2016.

Q14. Any reference materials we should read before we dive into the competition problem?

A14. For probabilistic load forecasting, you should at least read the recent IJF review paper on probabilistic load forecasting and the relevant references. You can find my recent papers on probabilistic load forecasting HERE. The papers from winning entries of GEFCom2014 are HERE. For hierarchical forecasting, you can check out Hyndman and Athanasopoulos' BOOK and their PAPER.

Saturday, October 29, 2016

Instructions for GEFCom2017 Qualifying Match

The GEFCom2017 Qualifying Match is meant to attract and educate a large number of contestants with diverse backgrounds, and to prepare them for the final match. It includes two tracks: a defined-data track (GEFCom2017-D) and an open-data track (GEFCom2017-O). In both tracks, the contestants are asked to forecast the same thing: the zonal and total loads of ISO New England. The only difference between the two tracks is the input data.

Data 

The input data a participating team can use in GEFCom2017-D should not go beyond the following:
  1. Columns A, B, D, M and N in the worksheets of the "YYYY SMD Hourly Data" files, where YYYY represents the year. These data files can be downloaded from the ISO New England website via the zonal information page of the energy, load and demand reports. Contestants outside the United States may need a VPN to access the data. 
  2. US Federal Holidays as published via US Office of Personnel Management.
The contestants are assumed to have general knowledge of Daylight Saving Time and to be able to infer the day of week and month of year from a date.
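For illustration, that general knowledge is easy to derive with standard date/time libraries. The time zone name below is my assumption for the ISO New England territory:

```python
# Deriving day of week, month, and DST status from a date.
from datetime import datetime
from zoneinfo import ZoneInfo

tz = ZoneInfo("America/New_York")       # assumed zone for New England
d = datetime(2017, 3, 12, 12, tzinfo=tz)  # US DST started on 2017-03-12

print(d.strftime("%A"), d.month, bool(d.dst()))  # Sunday 3 True
```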

There is no limitation on the input data in GEFCom2017-O.

Forecasts

The forecasts should be in the form of 9 quantiles following the exact format provided in the template file. The quantiles are the 10th, 20th, ..., 90th percentiles. The forecasts should be generated for 10 zones: the 8 ISO New England zones, Massachusetts (the sum of the three Massachusetts zones), and the total (the sum of the 8 zones).

Timeline

GEFCom2017 Qualifying Match includes six rounds.

Round 1 due date: Dec 15, 2016; forecast period: Jan 1-31, 2017.
Round 2 due date: Dec 31, 2016; forecast period: Feb 1-28, 2017.
Round 3 due date: Jan 15, 2017; forecast period: Feb 1-28, 2017.
Round 4 due date: Jan 31, 2017; forecast period: Mar 1-31, 2017.
Round 5 due date: Feb 14, 2017; forecast period: Mar 1-31, 2017.
Round 6 due date: Feb 28, 2017; forecast period: Apr 1-30, 2017.
Report and code due date: Mar 10, 2017.

The deadline for each round is 11:59pm EST of the corresponding due date.

Submission

The submissions will be through email. Within two weeks of registration, the team leader should receive a confirmation email with the track name and team name in the email subject line. If the team registered both tracks, the team leader should receive two separate emails, one for each track.

The team lead should submit the forecast on behalf of the team by replying to the confirmation email.

The submission must be received before the deadline (based on the receipt time of the email system) to be counted in the leaderboard.

Template

The submissions should strictly follow the requirements below:
  1. The file format should be *.xls;
  2. The file name should be "TrackInitialRoundNumber-TeamName". For instance, Team "An Awesome Win" in Round 3 of the defined-data track should name the file "D3-An Awesome Win".
  3. The file should include 10 worksheets, named as CT, ME, NEMASSBOST, NH, RI, SEMASS, VT, WCMASS, MASS, TOTAL. Please arrange the worksheets in the same order as listed above. 
  4. In each worksheet, the first two columns should be date and hour, respectively, in chronological order.
  5. The 3rd to the 11th columns should be Q10, Q20, ..., Q90. 
The template is HERE. The contestants should replace the date column to reflect the forecast period in each round.
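For illustration, here is one way to assemble the required file name and worksheet layout. The quantile values are placeholders, and writing an actual *.xls file would additionally require a spreadsheet library such as xlwt:

```python
# Building the submission structure: file name, 10 worksheets in the
# required order, and date/hour/Q10..Q90 columns per worksheet.
from datetime import date, timedelta

zones = ["CT", "ME", "NEMASSBOST", "NH", "RI", "SEMASS",
         "VT", "WCMASS", "MASS", "TOTAL"]
track, round_no, team = "D", 3, "An Awesome Win"
filename = f"{track}{round_no}-{team}.xls"     # "D3-An Awesome Win.xls"

start, days = date(2017, 2, 1), 28             # Round 3 forecasts February
header = ["date", "hour"] + [f"Q{q}" for q in range(10, 100, 10)]

workbook = {}
for zone in zones:
    rows = []
    for d in range(days):
        for hour in range(1, 25):
            # placeholder quantiles; real entries hold the team's forecasts
            rows.append([str(start + timedelta(days=d)), hour] + [0.0] * 9)
    workbook[zone] = [header] + rows

print(filename, len(workbook), len(workbook["CT"]) - 1)  # 10 sheets, 672 rows
```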

Evaluation

In round i, for a forecast submitted by team j for zone k, the average Pinball Loss of the 9 quantiles will be used as the quantile score of the probabilistic forecast Sijk. A benchmark method will be used to forecast each of the 10 zones. We denote the quantile score of the benchmark method in round i for zone k as Bik.

In round i, we will calculate the relative improvement (1 - Sijk/Bik) for each zone. The average improvement over all zones team j accomplishes will be the rating for team j, denoted as Rij. The rank of team j in round i is RANKij.

The weighted average of the rankings from all 6 rounds will be used to rank the teams in the qualifying match leaderboard. The first 5 rounds will be weighted equally, while the weight for the 6th round is doubled.

A team completing four or more rounds is eligible for the prizes. The ratings for the missing rounds will be imputed before calculating the weighted average of the ratings.
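The scoring above can be sketched as follows, using toy numbers. This reflects my reading of the rules, not the official evaluation code:

```python
# Sketch of the evaluation: average pinball loss over the 9 quantiles,
# relative improvement over the benchmark, and a weighted average of
# round ranks with the 6th round counted twice.
def pinball(y, quantile_forecasts):
    """Average pinball loss of 9 quantile forecasts for one observation."""
    losses = []
    for i, q in enumerate(quantile_forecasts):
        tau = (i + 1) / 10.0                    # 0.1, 0.2, ..., 0.9
        losses.append(tau * (y - q) if y >= q else (1 - tau) * (q - y))
    return sum(losses) / len(losses)

# Relative improvement of team j over the benchmark in one zone:
S_ijk, B_ik = 80.0, 100.0
improvement = 1 - S_ijk / B_ik                  # 0.20

# Weighted average rank over six rounds, round 6 double-weighted:
ranks = [3, 2, 4, 1, 2, 1]
weights = [1, 1, 1, 1, 1, 2]
weighted_rank = sum(r * w for r, w in zip(ranks, weights)) / sum(weights)
print(round(improvement, 3), weighted_rank)
```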

Prizes

Institute Prize (up to 3 universities): $1000
1st place in each track: $2000
2nd place in each track: $1000
3rd place in each track: $500
1st place in each round of each track: $200

For more information about GEFCom2017, please visit www.gefcom.org.

Friday, October 14, 2016

GEFCom2017: Hierarchical Probabilistic Load Forecasting

IEEE Working Group on Energy Forecasting invites you to join the Global Energy Forecasting Competition 2017 (GEFCom2017): Hierarchical Probabilistic Load Forecasting.

Background

Emerging technologies, such as microgrids, electric vehicles, rooftop solar panels and intelligent batteries, are challenging the traditional operational practices of the power industry. While uncertainties on the demand side are pushing the operational excellence toward the edge of the grid, probabilistic load forecasting at various levels of the power system hierarchy is becoming increasingly important.

GEFCom2017 will bring together state-of-the-art techniques and methodologies for hierarchical probabilistic energy forecasting. The competition features a bi-level setup: a three-month qualifying match that includes two tracks, and a one-month final match on a large-scale problem.

Qualifying match

The qualifying match is meant to attract and educate a large number of contestants with diverse backgrounds, and to prepare them for the final match. It includes two tracks, both on forecasting the zonal and total loads of ISO New England (the "DEMAND" column) for the next month, in real time on a rolling basis.

The defined-data track (GEFCom2017-D) restricts the data used by the contestants. The data cannot go beyond the calendar data, the load (the "DEMAND" column) and temperature data (the "DryBulb" and "DewPnt" columns) provided by ISO New England via the zonal information page of the energy, load and demand reports, plus the US Federal Holidays as published by the US Office of Personnel Management. The contestants may infer the day of week and Federal Holidays based on the aforementioned data.

The open-data track (GEFCom2017-O) encourages the contestants to explore various public and private data sources and bring the necessary data into the load forecasting process. The data may include, but is not limited to, the data published by ISO New England, weather forecast data from any weather service provider, local economic information, and the solar PV penetration published on US government websites.

Final match

The final match (GEFCom2017-F) will be open to the top entries from the qualifying match, tackling a more challenging, larger-scale problem than the qualifying match problems. The final match includes only one track: forecasting the load of a few hundred delivery points of a U.S. utility. The data is from the real world, so the contestants should expect many data issues, such as load transfers and anomalies. Details of the final match will be released on March 15, 2017.

Submission method

To save competition platform costs and implement more sophisticated evaluation methods, the submission will be via email. Within two weeks of the registration, the contestants will receive an email with the instructions about how to submit the forecasts.

Evaluation 

The "DEMAND" column published by ISO New England will be used to evaluate the skills of the probabilistic forecasts. Note that the "DEMAND" data may be revised during the settlement process. The version at the time of evaluation will be used to score the forecasts.

The evaluation metric is quantile score. For each forecasted period, the quantile score of a submitted forecast will be compared with the quantile score of the benchmark. The relative improvement over the benchmark will be used to rate and rank the teams.

World Energy Forecaster Rankings (WEFR)

Many contestants who joined GEFCom2012 also participated in GEFCom2014. To encourage the continuous investments in energy forecasting and recognize those who excel in these competitions, we will start building the World Energy Forecaster Rankings.

The contestants of GEFCom2017 will be eligible to participate in WEFR. We hope the rankings can help reward the participants with career opportunities and tickets to future competitions. In addition, editors of relevant journals can also leverage WEFR to enhance the peer review process.

Prize

IEEE Power and Energy Society budgeted $20,000 for this competition. The prize pool is $18,000, to be shared among the winning teams and institutions from the qualifying match and the final match.

Publication

Winning teams will be invited to submit papers to a special issue of the International Journal of Forecasting. 

Registration

The maximum team size is three. The team leader should register on behalf of the team. The registration period is from Oct 14, 2016 to Jan 14, 2017. Please register via THIS LINK if you want to join the competition.

Competition timeline
  • Competition Problems Release  --  Oct 14, 2016
  • Qualifying Match Starts  --  Dec 1, 2016
  • Qualifying Match Ends  --  Feb 28, 2017
  • Final Match Data Release  --  Mar 15, 2017
  • Final Match Submission Due  --  May 15, 2017

Additional rules

For any questions or comments, please put them in the comment field below. Please link your name to your LinkedIn profile. 

Thursday, October 13, 2016

Congratulations, Dr. Jingrui Xie!

Today (October 13, 2016), Jingrui (Rain) Xie defended her doctoral dissertation on probabilistic electric load forecasting, which made her the first BigDEAL PhD.

When I came back to academia three years ago, my mission was to produce the next generation of the finest analysts for the industry. As the first PhD from BigDEAL, Rain sets the standard for BigDEAL products and shows what the finest analyst looks like.

Rain joined UNC Charlotte in August 2013 as my first master's student. She received her M.S. degree in Engineering Management in May 2015, and continued with her PhD in Infrastructure and Environmental Systems.

In just three years, she published 7 journal papers:
  • Temperature scenario generation for probabilistic load forecasting (TSG, in press)
  • Relative humidity for load forecasting models (TSG, in press)
  • On normality assumption in residual simulation for probabilistic load forecasting (TSG, 2016)
  • GEFCom2014 probabilistic electric load forecasting: an integrated solution with forecast combination and residual simulation (IJF, 2016)
  • Improving gas load forecasts with big data (GAS, 2016)
  • Long term retail energy forecasting with consideration of residential customer attrition (TSG, 2015)
  • Long term probabilistic load forecasting and normalization with hourly information (TSG, 2014)
and 3 conference papers:
  • Comparing two model selection frameworks for probabilistic load forecasting (PMAPS, 2016)
  • From high-resolution data to high-resolution probabilistic load forecasts (T&D, 2016)
  • Combining load forecasts from independent experts: experience at NPower forecasting challenge 2015 (NAPS, 2015)
She was among the top contestants in all of the forecasting competitions she participated in:
  • Top 1 in BigDEAL Forecasting Competition 2016
  • Top 3 in NPower Gas Demand Forecasting Challenge 2015
  • Top 3 in NPower Electricity Demand Forecasting Challenge 2015
  • Top 3 in Load Forecasting Track of Global Energy Forecasting Competition 2014
She has also received several prestigious awards:
  • IEEE PES Technical Committee Prize Paper Award 2016
  • International Symposium on Forecasting 2016 Travel Award
  • International Symposium on Forecasting 2015 Travel Award
  • UNCC College of Engineering Outstanding Graduate Research Assistant Award 2015
  • International Institute of Forecasters Student Forecasting Award 2015
Rain has been working full-time at SAS during the past three years. In addition to her academic excellence, Rain received a promotion earlier this year for her outstanding performance at work.

It took her 21 months to get the PhD - she enrolled in the PhD program in January 2015, and defended the dissertation today. In other words, she just proved the reproducibility of my 20-month PhD!

Lastly, but most importantly, she became a mother two years ago - her daughter is now two years old. 

Again, congratulations, Dr. Jingrui Xie!

Wednesday, October 5, 2016

NPower Forecasting Challenge 2016

RWE npower is running its forecasting challenge again this year. The purpose is to recruit summer interns from UK schools. Nevertheless, the competition is open to people outside the UK as well.

In 2015, BigDEAL participated in both competitions, one on electric load forecasting, and the other on gas load forecasting. We summarized our methods into two papers (electricity; gas), which may give you some idea about the previous competitions.

Registration is now open until November 1, 2016. Have fun!

Thursday, September 22, 2016

A Five-minute Introduction to Electric Load Forecasting

I was recently interviewed by Prof. Galit Shmueli for her newly launched free online course, Business Analytics Using Forecasting. In the interview, I gave a five-minute introduction to electric load forecasting, discussing its special characteristics and what is needed for successful solutions.