Tuesday, August 1, 2017

Variable Selection Methods for Probabilistic Load Forecasting: Empirical Evidence from Seven states of the United States

I am an evidence-based man. This mentality has saved me a tremendous amount of time in recent years. I have been minimizing the time I spend following bluffs in the literature. On the other hand, I have been developing empirical case studies and encouraging the community to contribute to empirical research.

In my GEFCom2014 paper, I raised the following question to the forecasting community:
Can a better point forecasting model lead to a better probabilistic forecast?
To answer this question, we first have to understand the definition of "better", i.e., the forecast evaluation measures and methods. In this paper, we compare two variable selection methods, one based on a point error measure and the other on a probabilistic error measure. The case study covers seven states of the US. Hopefully, future empirical studies can use the results from this paper for comparison purposes.

(This paper is an upgrade to our PMAPS2016 paper.)

Citation

Jingrui Xie and Tao Hong, "Variable Selection Methods for Probabilistic Load Forecasting: Empirical Evidence from Seven states of the United States", IEEE Transactions on Smart Grid, in press.

Variable Selection Methods for Probabilistic Load Forecasting: Empirical Evidence from Seven states of the United States

Jingrui Xie and Tao Hong

Abstract

Variable selection is the process of selecting a subset of relevant variables for use in model construction. It is a critical step in forecasting but has not yet played a major role in the load forecasting literature. In probabilistic load forecasting, many methodologies to date rely on the variable selection mechanisms inherited from the point load forecasting literature. Consequently, the variables of an underlying model for probabilistic load forecasting are selected by minimizing a point error measure. On the other hand, a holistic and seemingly more accurate method would be to select variables using probabilistic error measures. Nevertheless, this holistic approach by nature requires more computational efforts than its counterpart. As the computing technologies are being greatly enhanced over time, a fundamental research question arises: can we significantly improve the forecast skill by taking the holistic yet computationally intensive variable selection method? This paper tackles the variable selection problem in probabilistic load forecasting by proposing a holistic method (HoM) and comparing it with a heuristic method (HeM). HoM uses a probabilistic error measure to select the variables to construct the underlying model for probabilistic forecasting, which is consistent with the error measure used for the final probabilistic forecast evaluation. HeM takes a shortcut by relying on a point error measure for variable selection. The evidence from the empirical study covering seven states of the United States suggests that 1) the two methods indeed return different variable sets for the underlying models, and 2) HoM slightly outperforms but does not dominate HeM w.r.t. the skill of probabilistic load forecasts. Nevertheless, the conclusion might vary on other datasets. Other empirical studies of the same nature would be encouraged as part of the future work.
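
To make the difference between the two methods concrete, here is a minimal sketch in Python. It is not the code from the paper. The abstract does not name the error measures, so the sketch assumes MAPE as the point error measure and the average pinball loss as the probabilistic error measure. The candidate variable sets, their names, and all numbers are hypothetical toy values, and the validation forecasts are assumed to be already available for each candidate.

```python
from statistics import mean


def mape(actual, point):
    """Mean absolute percentage error (assumed point error measure), in percent."""
    return 100 * mean(abs(a - p) / abs(a) for a, p in zip(actual, point))


def pinball_loss(actual, quantile_rows, quantiles):
    """Average pinball loss (assumed probabilistic error measure).

    quantile_rows[t][k] is the forecast of quantile level quantiles[k] at time t.
    """
    losses = []
    for y, row in zip(actual, quantile_rows):
        for q, f in zip(quantiles, row):
            # Under-forecasts are weighted by q, over-forecasts by (1 - q).
            losses.append(q * (y - f) if y >= f else (1 - q) * (f - y))
    return mean(losses)


def select(candidates, score):
    """Return the name of the candidate variable set with the lowest score."""
    return min(candidates, key=lambda name: score(candidates[name]))


# Hypothetical validation data: observed load and, for each candidate
# variable set, the point forecasts and quantile forecasts it produced.
actual = [100.0, 110.0, 120.0]
quantiles = [0.1, 0.5, 0.9]
candidates = {
    "temperature only": {
        "point": [101.0, 109.0, 121.0],
        "quantiles": [(80.0, 101.0, 120.0), (90.0, 109.0, 130.0), (100.0, 121.0, 140.0)],
    },
    "temperature + lagged temperature": {
        "point": [102.0, 108.0, 122.0],
        "quantiles": [(98.0, 102.0, 104.0), (106.0, 108.0, 112.0), (118.0, 122.0, 124.0)],
    },
}

# HeM-style selection: rank candidates by the point error measure.
hem_choice = select(candidates, lambda c: mape(actual, c["point"]))
# HoM-style selection: rank candidates by the probabilistic error measure.
hom_choice = select(candidates, lambda c: pinball_loss(actual, c["quantiles"], quantiles))

print("HeM picks:", hem_choice)
print("HoM picks:", hom_choice)
```

On this toy data the two selections disagree: the first candidate has the lower MAPE, while the second has the lower pinball loss because its quantile forecasts are much tighter around the observations. The extra cost of the holistic route also shows up here, since every candidate needs a full set of quantile forecasts rather than a single point forecast before it can be scored.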
