Wednesday, June 17, 2015

On Normality Assumption in Residual Simulation for Probabilistic Load Forecasting

I have been fortunate to work with many talented people in both industry and academia. Their involvement has added significant value to most, if not all, of my research projects. This paper is one of those great examples.

The motivation of this project was simple. We were interested in improving the forecasts we produced for NCEMC a few years ago. (You can check our TSG2014 paper for the methodology we used to produce the previous forecasts. The same methodology was used in my recent EISPC/NARUC report.)

The initial idea was simple, too. Since all forecasts have errors, we wanted to see if modeling and simulating the residuals could help improve the probabilistic forecasts. We expected a YES answer, because many papers in the literature have reported a very similar approach. We also had a little doubt, because none of those papers verified the approach with a formal error measure for probabilistic forecasting.

The entire project, from the initial idea to today (I submitted the final files right before writing this blog post), lasted about two and a half years, including about two years of experiments and testing, and about half a year of writing the paper and going through the peer review process.

Here is why it took so long.

The first results we got were basically what we expected. We could have wrapped up the results and put them into a paper. However, we had a little doubt and thought they might be too good to be true, so we started another set of experiments. The second set of results was not that "good", which meant that some of the conclusions we drew from the first set were not valid.

We then went on and on, trying multiple datasets and models. We just could not get a consensus from the experiment results to support the hypothesis that simulating residuals with a normal distribution helps improve probabilistic forecasts.

Then it came to a point where I thought we should move on to other probability distributions, such as Gamma and Weibull, and even to ARIMA models for simulating the residual series. After trying some of these, I brought the new results to Tom Laing, Director of Market Research and the project owner at NCEMC.

I still remember that conversation with Tom. I was driving and called him to discuss the findings and recommendations. I basically told Tom that the normal distribution was unlikely to work, and I recommended exploring other, more sophisticated distributions to see if they could improve the forecasts.

His answer was straightforward and insightful, as always:

"Hey, Tao, I appreciate your input, but I can't bring those distributions to my clients. They are too complicated. We need something as simple as normal distribution."

At that moment, I suddenly realized that I had lost myself in this year-long project and had even forgotten my own KISS principle: Keep It Simple, Stupid.

I thanked Tom for the reminder, and then went back to the normal distribution. After another few months of work, we put our key findings, together with two of the many case studies we conducted, into this 8-page paper.

In this paper, we showed how the normal distribution works in residual simulation, including when it works well and when it does not. We also discussed how we might have reached misleading conclusions if the case study had not been comprehensive enough.

We hope that this paper can be useful for those who are practicing probabilistic load forecasting.

Jingrui Xie, Tao Hong, Thomas D. Laing and Chongqing Kang, "On normality assumption in residual simulation for probabilistic load forecasting", IEEE Transactions on Smart Grid, in press, DOI: 10.1109/TSG.2015.2447007. Working paper available from

On Normality Assumption in Residual Simulation for Probabilistic Load Forecasting

Jingrui Xie, Tao Hong, Tom Laing and Chongqing Kang


Grid modernization has brought in various types of active demand, and intermittent and distributed generation resources to challenge the traditional power system planning and operation practices. As a result, more and more decision making processes rely on probabilistic forecasts as an input. While residual simulation has been recognized as one way to generate probabilistic load forecasts, the research on the application side of probabilistic load forecasting has been heavily relying on unverified distributions of load forecasting residuals, such as normal distribution. In this paper, we study the normality assumption from a different angle. Instead of trying to prove or disprove its validity via hypothesis tests, we attempt to understand whether applying the normality assumption helps improve the quality of probabilistic load forecasts. We apply a proper scoring rule, the pinball loss function, to evaluate a set of probabilistic load forecasts developed from different underlying linear and nonlinear models. To ensure the solidity of our conclusions, we conduct two case studies, one based on data from a large generation and transmission cooperative in the U.S., and the other based on data from the probabilistic load forecasting track of the Global Energy Forecasting Competition 2014.  
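
For readers who have not worked with residual simulation or the pinball loss before, here is a minimal sketch of the general idea in Python. It is not the code or the exact procedure from the paper. It assumes residuals are defined as actual minus point forecast, fits a single normal distribution to them, and scores the 1st through 99th percentiles. The function names and all of the numbers are made up for illustration.

```python
import random
from statistics import mean, stdev


def pinball_loss(actual, quantile_forecast, tau):
    """Pinball loss of one quantile forecast at level tau (0 < tau < 1)."""
    if actual >= quantile_forecast:
        return tau * (actual - quantile_forecast)
    return (1.0 - tau) * (quantile_forecast - actual)


def simulate_quantiles(point_forecast, residuals, taus, n_draws=5000, seed=0):
    """Turn a point forecast into quantile forecasts by residual simulation.

    Assumption: residual = actual - point forecast, and the residuals are
    treated as draws from one normal distribution (the normality assumption).
    """
    rng = random.Random(seed)
    mu, sigma = mean(residuals), stdev(residuals)
    # Add simulated residuals to the point forecast, then sort the scenarios.
    draws = sorted(point_forecast + rng.gauss(mu, sigma) for _ in range(n_draws))
    # Read off empirical quantiles from the sorted scenarios.
    return [draws[min(n_draws - 1, int(tau * n_draws))] for tau in taus]


def average_pinball(actuals, quantile_forecasts, taus):
    """Average pinball loss over all periods and all quantile levels."""
    losses = [
        pinball_loss(y, q, tau)
        for y, qs in zip(actuals, quantile_forecasts)
        for tau, q in zip(taus, qs)
    ]
    return sum(losses) / len(losses)


# Illustrative, made-up numbers only.
taus = [i / 100 for i in range(1, 100)]
past_residuals = [-2.0, 1.0, 0.5, -0.5, 1.0]
point_forecasts = [100.0, 105.0, 98.0]
actuals = [101.2, 103.9, 99.0]

forecasts = [simulate_quantiles(p, past_residuals, taus) for p in point_forecasts]
score = average_pinball(actuals, forecasts, taus)
print(f"average pinball loss: {score:.3f}")
```

A lower average pinball loss means a better probabilistic forecast. Computing the same score for quantiles produced with and without the simulated residuals is one way to check whether the normality assumption actually helps on a given dataset.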
