Friday, February 27, 2015

Electric Load Forecasting with Recency Effect: a Big Data Approach

When I first wrote the CFP for the special issue on Analytics for Energy Forecasting with Applications to Smart Grid in 2012, I used the term big data in quotation marks. Nowadays, big data is no longer new to the utility industry. In fact, utilities have been working with big data since it was called just "data"; in this smart grid era we are witnessing the growth of data into big data. To collect the most recent progress and advancements in big data analytics, we just issued another CFP for the special issue on Big Data Analytics for Grid Modernization.
What is big data analytics? Deep learning, high-performance computing, petabyte-size data? 
I have three simple criteria:
  1. The data size is larger than what typical data analysis tools can handle. If you are using MS Excel to do some data analysis, then a data file with 1.1 million rows is big data. 
  2. The computing time is longer than the analysis time. Let's say it takes you a few days to think of a design of an algorithm. If testing the algorithm takes a few weeks, then it is big data. 
  3. The problem requires analysis at a higher level of granularity than usual. If your typical load forecasting process relies on monthly data, moving to daily or hourly data may bring you the big data challenge. 
Although these three criteria do not have to be met at the same time to qualify as big data analytics, they are indeed connected to each other. Analyzing high-resolution data often requires advanced data analysis tools and significant computing time. 

This paper has big data in its title because it covers the latter two criteria. The regression models we developed in this paper contain up to thousands of variables, which require a significant amount of time for parameter estimation, much longer than the time we spent thinking through the design. Moreover, we customized the models for each zone of a geographic hierarchy and each node (hour) of the temporal hierarchy. Of course, the forecasting errors are reduced with our proposed approach, which also underlines the importance of computing power in load forecasting. 

The case study is based on the GEFCom2012 data published in my 2014 IJF paper Global Energy Forecasting Competition 2012. We compared the results with those in my 2015 IJF paper Weather Station Selection for Electric Load Forecasting.

Citation
Pu Wang, Bidong Liu and Tao Hong, "Electric load forecasting with recency effect: a big data approach", International Journal of Forecasting, vol. 32, no. 3, pp. 585-597, July-September, 2016. Working paper available online: http://www.drhongtao.com/articles


Electric Load Forecasting with Recency Effect: a Big Data Approach

Pu Wang, Bidong Liu and Tao Hong

Abstract

Temperature plays a key role in driving electricity demand. We adopt "recency effect", a term originated from psychology, to illustrate the fact that electricity demand is affected by the temperatures of preceding hours. In the load forecasting literature, the temperature variables are often constructed in the form of lagged hourly temperatures and moving average temperatures. Over the past decades, computing power has been limiting the amount of temperature variables that can be used in a load forecasting model. In this paper, we present a comprehensive study on modeling recency effect through a big data approach. We take advantage of the modern computing power to answer a fundamental question: how many lagged hourly temperatures and/or moving average temperatures are needed in a regression model to fully capture recency effect without compromising the forecasting accuracy? Using the case study based on data from the load forecasting track of the Global Energy Forecasting Competition 2012, we first demonstrate that a model with recency effect outperforms its counterpart in forecasting individual load series at aggregated level by 18% to 20%. We then apply recency effect modeling to customize load forecasting models at low level of a geographic hierarchy, again showing the superiority over a benchmark model by 13% to 15% on average. Finally, we discuss four different implementations of the recency effect modeling by hour of a day. 
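To make the two kinds of temperature variables mentioned in the abstract concrete, here is a minimal sketch in Python of how lagged hourly temperatures and moving average temperatures could be built from an hourly series. It is an illustration only, not the model specification from the paper: the function name `recency_features`, its parameters, and the choice to define each moving average as the mean of one preceding 24-hour block are all assumptions made for this example.

```python
import numpy as np


def recency_features(temps, n_lags, n_avg_days):
    """Build recency-effect temperature columns (hypothetical helper).

    temps      : 1-D sequence of hourly temperatures, oldest first.
    n_lags     : number of lagged hourly temperatures T(t-1) ... T(t-n_lags).
    n_avg_days : number of 24-hour moving averages; the d-th average is
                 assumed to cover hours t-24d through t-24(d-1)-1.

    Returns (features, first_row): features[i] describes hour first_row + i.
    """
    temps = np.asarray(temps, dtype=float)
    # Hours of history needed before the first usable observation.
    start = max(n_lags, 24 * n_avg_days)
    if start >= len(temps):
        raise ValueError("not enough history for the requested features")

    hours = np.arange(start, len(temps))
    cols = [temps[hours]]                      # current-hour temperature
    for h in range(1, n_lags + 1):
        cols.append(temps[hours - h])          # temperature h hours earlier

    # Prefix sums: csum[k] is the sum of temps[0:k], so a block sum is a difference.
    csum = np.concatenate(([0.0], np.cumsum(temps)))
    for d in range(1, n_avg_days + 1):
        lo = hours - 24 * d
        hi = hours - 24 * (d - 1)
        cols.append((csum[hi] - csum[lo]) / 24.0)

    return np.column_stack(cols), start
```

Calling `recency_features(temps, n_lags=2, n_avg_days=1)` returns one row per usable hour with four columns: the current temperature, two lagged temperatures, and one daily average. Increasing `n_lags` and `n_avg_days`, and interacting the columns with calendar variables, is how the variable count of such a regression can grow quickly.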
