Tuesday, August 13, 2013

Long Term Probabilistic Load Forecasting and Normalization with Hourly Information

The paper is available on IEEE Xplore.

Tao Hong, Jason Wilson and Jingrui Xie, "Long term probabilistic load forecasting and normalization with hourly information", IEEE Transactions on Smart Grid, vol.5, no.1, pp. 456-462, January, 2014.



The classical approach to long term load forecasting is often limited to the use of load and weather information occurring with monthly or annual frequency. This low resolution, infrequent data can sometimes lead to inaccurate forecasts. Load forecasters often have a hard time explaining the errors based on the limited information available through the low resolution data.  The increasing usage of Smart Grid and Advanced Metering Infrastructure (AMI) technologies provides the utility load forecasters with high resolution, layered information to improve the load forecasting process. In this paper, we propose a modern approach that takes advantage of hourly information to create more accurate and defensible forecasts. The proposed approach has been deployed across many US utilities, including a recent implementation at North Carolina Electric Membership Corporation (NCEMC), which is used as the case study in this paper. Three key elements of long term load forecasting are being modernized: predictive modeling, scenario analysis and weather normalization. We first show the superior accuracy of the predictive models attained from hourly data, over the classical methods of forecasting using monthly or annual peak data. We then develop probabilistic forecasts through cross scenario analysis. Finally, we illustrate the concept of load normalization and normalize the load using the proposed hourly models. 


  1. 1. Weather observations from multiple stations in one region or one state are usually highly correlated, especially hourly temperature or average daily temperature. Putting multiple weather stations in a linear regression model will cause multicollinearity. Could you let me know how you handle this problem?

    2. For model S, you used many cross effects. Are those all significant factors? If some of them are not statistically significant, how do you handle that?

    3. I assume the dependent variable of model S would be hourly load. Therefore, you can use that model to predict daily peaks given the hourly temperatures of each day. You can also add up all hourly loads to get predicted total daily load, monthly load, etc. Is my understanding correct?

  2. 1. Read my recent IJF paper "Weather Station Selection for Electric Load Forecasting": http://www.sciencedirect.com/science/journal/01692070
    We first combine multiple weather stations to get ONE virtual station, and then put the virtual station in the regression model.

    2. I always tell my students to ignore tests of "significance" when doing forecasting. Two primary reasons are: 1) load forecasting errors rarely follow a normal distribution in the real world; 2) those tests are based on in-sample fit, which does not say much about predictive power. Out-of-sample tests are the ones we should go with for forecasting purposes. Read Len Tashman's IJF paper: http://www.sciencedirect.com/science/article/pii/S0169207000000650

    3. Yes. Forecast hourly load first, then take the maximum of each day to get daily peaks. That’s how we got the monthly peaks and energy in the plots shown in this TSG paper.
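
The three answers above can be illustrated with a small Python sketch. It is not the code behind the paper: the equal-weight average used for the virtual station, the single-predictor linear model standing in for model S, the synthetic data, and all function names are assumptions made for illustration. It shows (1) combining stations into one virtual series, (2) scoring a model on a held-out period instead of relying on in-sample significance tests, and (3) rolling hourly loads up to daily peaks, monthly peaks and monthly energy.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def virtual_station(station_temps):
    """Average hourly temperatures across stations, hour by hour.

    Assumption: plain equal-weight average of every station supplied;
    the station selection step from the IJF paper is not reproduced.
    """
    return [mean(hour) for hour in zip(*station_temps)]


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x (toy stand-in for model S)."""
    mx, my = mean(x), mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - b * mx, b


def holdout_mape(x, y, n_test):
    """Fit on all but the last n_test points; return MAPE (%) on the rest."""
    a, b = fit_line(x[:-n_test], y[:-n_test])
    errors = [
        abs(yi - (a + b * xi)) / abs(yi)
        for xi, yi in zip(x[-n_test:], y[-n_test:])
    ]
    return 100 * mean(errors)


def aggregate(timestamps, hourly_load):
    """Roll hourly loads up to daily peaks, monthly peaks and monthly energy."""
    daily_peak, monthly_peak, monthly_energy = {}, {}, defaultdict(float)
    for ts, load in zip(timestamps, hourly_load):
        day, month = ts.date(), (ts.year, ts.month)
        daily_peak[day] = max(daily_peak.get(day, load), load)
        monthly_peak[month] = max(monthly_peak.get(month, load), load)
        # Hourly average demand (e.g. MW) summed over hours gives energy (MWh).
        monthly_energy[month] += load
    return daily_peak, monthly_peak, dict(monthly_energy)


# Synthetic 48-hour example spanning a month boundary (made-up numbers).
start = datetime(2013, 1, 31)
timestamps = [start + timedelta(hours=h) for h in range(48)]
station_a = [30 + h % 24 for h in range(48)]
station_b = [32 + h % 24 for h in range(48)]

virtual = virtual_station([station_a, station_b])
load = [200 - 2 * t for t in virtual]  # toy heating-driven load

print("Holdout MAPE (%):", holdout_mape(virtual, load, n_test=24))
daily, monthly_peaks, monthly_energy = aggregate(timestamps, load)
print("Daily peaks:", daily)
print("Monthly peaks:", monthly_peaks)
print("Monthly energy:", monthly_energy)
```

Because the toy load is an exact linear function of temperature, the holdout error here is essentially zero; with real data the same holdout comparison is what would separate useful cross effects from ones that only improve in-sample fit.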

