Tuesday, October 16, 2018

Robust Regression Models for Load Forecasting

One of my doctoral majors is operations research, for which I took many courses in graduate school to build my knowledge in optimization. My dissertation was on load forecasting. Only two of its chapters were related to optimization: one on Artificial Neural Networks, and the other on Fuzzy Regression (or Possibilistic Linear Regression).

In fact, the fuzzy regression chapter was the only one that seriously required some optimization skills, which was published as an FODM paper three years after my graduation. To build a fuzzy regression model, I had to formulate the parameter estimation process as a linear program, and solve it in CPLEX. At that time Gurobi was not even able to provide a feasible solution for my fuzzy regression model with 200+ parameters.

After that, I continued my profession in forecasting. I knew my optimization background was helpful to forecasting, but I didn't really expect to apply many optimization skills in forecasting.

About a year ago, we performed a benchmark study to show that four representative load forecasting models would fail miserably with bad input data. That study was published as an IJF paper early this year. At the end of that IJF paper, we mentioned a future research direction of designing more robust load forecasting models.

In this paper, we propose three robust regression models for load forecasting. While all of them are more robust than the ones compared in the IJF paper, the L1 regression model outperforms the others. In fact, L1 regression is not really new to load forecasting. It has been used for forecast combination, where some people call it Least Absolute Deviation (LAD) regression. Its "general" form, quantile regression, is heavily used in probabilistic load forecasting.
What's new about the L1 regression model in this paper?
We built an L1 regression model with hundreds of parameters. In fact it shares the same variable combination as the Vanilla model used in Global Energy Forecasting Competitions. Building such a model is nontrivial. We didn't find an off-the-shelf package to do what we need, so we formulated it as a linear program and solved it using MATLAB's linprog.
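To make the linear programming formulation concrete, here is a minimal sketch in Python. It is not the code from the paper: it assumes SciPy's linprog in place of MATLAB's linprog, and the toy data, the function name l1_regression, and the split of each residual into two non-negative parts u and v are illustrative choices.

```python
import numpy as np
from scipy.optimize import linprog


def l1_regression(X, y):
    """Fit y ~ X @ beta by minimizing the sum of absolute residuals."""
    n, p = X.shape
    # Decision vector: [beta (p, free), u (n, >= 0), v (n, >= 0)],
    # where the residual y - X @ beta is written as u - v.
    c = np.concatenate([np.zeros(p), np.ones(2 * n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p]


# Toy data (an assumption for illustration): a straight line
# with three corrupted observations.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
X = np.column_stack([np.ones_like(x), x])
y = 2.0 + 3.0 * x + rng.normal(scale=0.1, size=x.size)
y[[5, 20, 40]] += 100.0  # simulated bad data

beta_l1 = l1_regression(X, y)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print("L1 :", beta_l1)
print("OLS:", beta_ols)
```

On this toy data the L1 fit should stay close to the true intercept and slope, while ordinary least squares gets pulled toward the corrupted points. A model with hundreds of parameters uses the same formulation with a wider X.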
Among hundreds of techniques that are applicable to load forecasting, how did I find L1 regression?
The idea didn't come from nowhere. When I was working on my doctoral dissertation at FANGroup (Fuzzy And Neural Group), a few other students were working on another project sponsored by the U.S. Army Research Office. They were investigating some features and applications of the l1 norm. Although I was thinking about applying the l1 norm to load forecasting, I didn't find a good use case at that time.

Well, it's better late than never. The skills I acquired 10 years ago came in handy for this paper.

Citation

Jian Luo, Tao Hong, and Shu-Cherng Fang, "Robust regression models for load forecasting," submitted to IEEE Transactions on Smart Grid, in press.


Robust Regression Models for Load Forecasting

Jian Luo, Tao Hong, and Shu-Cherng Fang

Abstract

Electric load forecasting has been extensively studied during the past century. While many models and their variants have been proposed and tested in the load forecasting literature, most of the existing case studies have been conducted using the data collected under normal operating conditions. A recent case study shows that four representative load forecasting models easily fail under data integrity attacks. To address this challenge, we propose three robust load forecasting models including two variants of the iteratively re-weighted least squares regression models and an L1 regression model. Numerical experiments indicate the dominating performance of the three proposed robust regression models, especially L1 regression, compared to other representative load forecasting models. 
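
For readers unfamiliar with iteratively re-weighted least squares, the sketch below shows the general idea in Python. The Huber weight function, the tuning constant 1.345, the MAD-based scale estimate, and the toy data are all assumptions made for illustration; the two variants in the paper may use different weight functions.

```python
import numpy as np


def irls_huber(X, y, c=1.345, n_iter=50, tol=1e-8):
    """Iteratively re-weighted least squares with Huber weights (illustrative)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from ordinary least squares
    for _ in range(n_iter):
        r = y - X @ beta
        # Robust scale estimate from the median absolute deviation.
        scale = max(np.median(np.abs(r - np.median(r))) / 0.6745, 1e-12)
        u = np.abs(r) / scale
        # Huber weights: 1 for small residuals, c / u for large ones.
        w = c / np.maximum(u, c)
        sw = np.sqrt(w)
        # Weighted least squares step, solved as OLS on rescaled rows.
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


# Toy data (an assumption): a straight line with three corrupted observations.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
X = np.column_stack([np.ones_like(x), x])
y = 2.0 + 3.0 * x + rng.normal(scale=0.1, size=x.size)
y[[5, 20, 40]] += 100.0

beta_irls = irls_huber(X, y)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
print("IRLS:", beta_irls)
print("OLS :", beta_ols)
```

Each pass down-weights the observations with large residuals and refits, so the corrupted points lose most of their influence after the first few iterations.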

Monday, October 8, 2018

BigDEAL Forecasting Competition 2018

This semester I'm teaching Energy Analytics for the fifth time. The course has earned its reputation on the UNC Charlotte campus and even around the utility industry, for its toughness, high withdrawal rate, and challenging nature. Here are some comments from the students in 2015 and 2017. Nowadays, not many students even dare to register for the course.

After the first midterm exam last week, I have five students left in the class. These five "survivors" (out of more than a dozen students at the beginning of the semester) have completed two assignments and one exam. I am impressed by their submissions every time. I must confess that this is by far the academically strongest class I've ever had for this course, even stronger than the group that won several award plaques in GEFCom2014.

Previously, I sent students of this course to the competitions, such as GEFCom2014 and NPower Forecasting Challenge, where they can solve some conventional energy forecasting problems while competing with others around the globe. 

This year, thanks to the outstanding performance of these students, I was spending a lot of time trying to figure out a challenge for them. Finally, I decided to give them a new load forecasting problem to solve. 

I'll keep the problem secret for now, but I can tell that a practical solution to this problem can save power companies a lot of money. To those who are interested in writing academic papers, a winning solution to this problem should greatly increase the likelihood of having the manuscript accepted by the top venues for energy forecasting papers, such as International Journal of Forecasting (IJF) and IEEE Transactions on Smart Grid (TSG). 

The competition is by invitation only. The ones who are interested in joining this competition should first pass the qualifying match. I will use the first homework problem of Energy Analytics for the qualifying match. A contestant has to beat the last-ranked student of my class to receive the invitation to BFCom2018. If nobody beats any of my students, I'll just run the competition with the in-class students. 

For the qualifying match, I'll provide three years of hourly load and temperature, and one year of hourly temperature for the fourth year. The contestants should submit the ex post load forecast for the fourth year. The temperature data is from 28 weather stations. To excel in the qualifying match, the contestants may want to read two of my IJF papers on weather station selection and recency effect.

Important Dates

Oct 8, 2018 - Registration open. 
Oct 21, 2018 - Registration close. 
Oct 22, 2018 - Qualifying match data release.
Nov 4, 2018 - Qualifying match submission due. 
Nov 5, 2018 - Leaderboard published; BFCom2018 invitation sent. 
Dec 3, 2018 - BFCom2018 winners announced. 

Note: There is no monetary prize for this competition. The leaderboard will be published on this blog. I will consider providing research assistantships to the top three contestants if they are interested in joining my lab as PhD students.

If you are interested, please register HERE. See you in the game!

Monday, July 9, 2018

From Club Convergence of Per Capita Industrial Pollutant Emissions to Industrial Transfer Effects: An Empirical Study Across 285 Cities in China

China has grown into the world's second largest economy by nominal GDP. Many factors contribute to such rapid growth, such as globalization and hard-working Chinese people. Nevertheless, we can't ignore the pollution resulting from industrialization. Dr. Chang Liu brought the research problem to me when she visited BigDEAL last year. We spent a year investigating the relationship between industrial transfer effects and per capita industrial pollutant emissions across 285 cities in China. We identified four convergence clubs for SO2 emissions, and three convergence clubs for soot emissions. We also concluded that industrial transfer effects can lead to multiple steady-state equilibria. This presents some evidence to support region-specific environmental policies and execution strategies.

This is the first time I sent a paper to Energy Policy. The original version was submitted on Feb 5, 2018. Within five months, the paper was published after three revisions. The entire publication process was quite pleasant.

Citation
Chang Liu, Tao Hong, Huaifeng Liu, and Lili Wang, "From club convergence of per capita industrial pollutant emissions to industrial transfer effects: an empirical study across 285 cities in China," Energy Policy, vol.121, pp 300-313, October 2018. (ScienceDirect)

From Club Convergence of Per Capita Industrial Pollutant Emissions to Industrial Transfer Effects: An Empirical Study Across 285 Cities in China

Chang Liu, Tao Hong, Huaifeng Liu, and Lili Wang

Abstract

The process of industrialization has led to an increase in air pollutant emissions in China. At the regional level, industrial restructuring and industrial transfer from eastern China to western China have caused a significant difference in pollutant emissions among various cities. This paper analyzes per capita industrial pollutant emissions across 285 prefecture-level cities from 2003 to 2015, aiming to reveal how industrial transfer affects the formation of convergence clubs. Whether industrial pollutant emissions across heterogeneous cities converge to a unique steady-state equilibrium is first identified based on the concept of club convergence. Logit regression analysis is then applied to assess the effects of industrial transfer on the observed clubs. The log t-test highlights four convergence clubs for industrial SO2 emissions and three clubs for industrial soot emissions. The regression analysis results reveal that the effects of industrial transfer can lead to multiple steady-state equilibria, suggesting region-specific environmental policies and execution strategies. In addition, accelerating the development of clean energy technologies in emission-intense regions should be further emphasized. 

Monday, June 25, 2018

Big Data Analytics: Making Smart Grid Smarter

The May 2018 issue of the Power & Energy Magazine is on Big Data Analytics. My guest editorial is on IEEE Xplore with open access. The original articles are in English. The Spanish translation is also available. The links to these articles are listed below.

Citation

Tao Hong, "Big data analytics: making smart grid smarter," IEEE Power and Energy Magazine, vol.16, no.3, pp 12-16, May-June 2018. (IEEE Xplore)

Features in This Issue

Visualizing Big Energy Data
By Rob J. Hyndman, Xueqin (Amy) Liu, and Pierre Pinson

Distribution Synchrophasors
By Hamed Mohsenian-Rad, Emma Stewart, and Ed Cortez

Big Data Analytics for Flexible Energy Sharing
By Furong Li, Ran Li, Zhipeng Zhang, Mark Dale, David Tolley, and Petri Ahokangas

Weather Data for Energy Analytics
By Jonathan Black, Alex Hofmann, Tao Hong, Joseph Roberts, and Pu Wang

Big Data Analytics in China’s Electric Power Industry
By Chongqing Kang, Yi Wang, Yusheng Xue, Gang Mu, and Ruijin Liao

Training Energy Data Scientists
By Tao Hong, David Wenzhong Gao, Tom Laing, Dale Kruchten, and Jorge Calzada


Articulos de Mayo/Junio de 2018

Visualización de "big data" de energía
Por Rob J. Hyndman, Xueqin (Amy) Liu y Pierre Pinson

Sincrofasores en la distribución
Por Hamed Mohsenian-Rad, Emma Stewart y Ed Cortez

Análisis de "big data" para el intercambio flexible de energía
Por Furong Li, Ran Li, Zhipeng Zhang, Mark Dale, David Tolley y Petri Ahokangas

Datos meteorológicos para el análisis de energía
Por Jonathan Black, Alex Hofmann, Tao Hong, Joseph Roberts y Pu Wang

Análisis de "big data" en la industria de la potencia eléctrica de china
Por Chongqing Kang, Yi Wang, Yusheng Xue, Gang Mu y Ruijin Liao

Formación de científicos de datos de energía
Por Tao Hong, David Wenzhong Gao, Tom Laing, Dale Kruchten y Jorge Calzada

Saturday, June 23, 2018

Call For Papers: Food and Agriculture Forecasting | International Journal of Forecasting

International Journal of Forecasting

Special Section on Food and Agriculture Forecasting

The fast growing world population brings a critical challenge to humanity: how to ensure adequate supply and access to safe, healthy food. Accurate forecasts provide valuable information to help in formulating national food and agricultural policies, and to help agriculture companies and farmers adjust their business strategies. Such forecasts cover production, consumption, stocks, trade and prices of major field crops (e.g., corn, sorghum, barley, oats, wheat, rice, soybeans, and cotton) and livestock (e.g., beef, pork, poultry and eggs, and dairy). This special section is to collect high-quality research that involves theoretical and practical aspects of forecasting in food and agriculture. Specifically, it encourages papers that inspire actionable insights and/or make methodological breakthroughs in this area.

Potential topics include but are not limited to:

  • Forecasting methodologies in food and agriculture
  • Major field crops forecasting
  • Livestock forecasting 
  • Agri-food products forecasting 
  • Forecasting in vegetables, fruits and other agriculture commodities
  • Agriculture commodities futures market forecasting
  • Natural resources forecasting in agriculture and food industry 
  • Water and energy forecasting in agriculture 
  • Climate forecasting in agriculture

Submission deadline: 31 December 2018

To submit a paper for consideration for the Special Section, please upload your paper online and include a cover letter clearly indicating that the paper is for the special issue “Food and Agriculture Forecasting”. The webpage for online submission is mc.manuscriptcentral.com/ijf. Instructions for authors are provided at www.forecasters.org/ijf/authors. All papers will follow IJF’s double-blind refereeing process. For further information about the Special Section, please contact the guest editors.

Guest Editors

Jue Wang, Chinese Academy of Sciences, China (wjue@amss.ac.cn)
Tao Hong, University of North Carolina at Charlotte, USA (hong@uncc.edu)

Monday, June 18, 2018

Combining Probabilistic Load Forecasts

We often find simple averaging to be a plausible solution for combining point forecasts. Combining probabilistic forecasts is not that trivial. The literature on combining probabilistic load forecasts is rather limited. Previously, we developed a Quantile Regression Averaging (QRA) method to generate probabilistic load forecasts by combining point forecasts. This work is a follow-up, where we combine probabilistic load forecasts to generate a more accurate probabilistic forecast. The method we proposed here is a Constrained Quantile Regression Averaging (CQRA) method, where the parameters of a quantile regression model are non-negative and sum up to 1. We applied the method to loads at both the high voltage level and the household level, showing better results than the benchmarks.
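
As a rough illustration of the constrained formulation, the Python sketch below estimates combination weights for a single quantile level by minimizing the pinball loss with a linear program. It is only a sketch under stated assumptions: the function names, the toy data, and the use of SciPy's linprog are mine and not taken from the paper.

```python
import numpy as np
from scipy.optimize import linprog


def pinball_loss(y, y_hat, q):
    """Average pinball (quantile) loss of forecasts y_hat at quantile level q."""
    diff = y - y_hat
    return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))


def cqra_weights(F, y, q):
    """Weights w >= 0 with sum(w) == 1 that minimize the pinball loss of F @ w."""
    n, k = F.shape
    # Decision vector: [w (k), u (n), v (n)], all >= 0, with y - F @ w = u - v.
    c = np.concatenate([np.zeros(k), q * np.ones(n), (1.0 - q) * np.ones(n)])
    residual_rows = np.hstack([F, np.eye(n), -np.eye(n)])
    sum_to_one_row = np.concatenate([np.ones(k), np.zeros(2 * n)])[None, :]
    A_eq = np.vstack([residual_rows, sum_to_one_row])
    b_eq = np.concatenate([y, [1.0]])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:k]


# Toy data (an assumption): three individual forecasts of the 90th percentile.
rng = np.random.default_rng(1)
n, q = 200, 0.9
y = 100.0 + 10.0 * rng.normal(size=n)
F = np.column_stack([
    np.full(n, 105.0),                  # too low
    np.full(n, 125.0),                  # too high
    113.0 + 3.0 * rng.normal(size=n),   # roughly right but noisy
])

w = cqra_weights(F, y, q)
combined = pinball_loss(y, F @ w, q)
individual = [pinball_loss(y, F[:, j], q) for j in range(F.shape[1])]
print("weights:", w)
print("combined:", combined, "individual:", individual)
```

Because each individual forecast corresponds to a feasible weight vector, the in-sample pinball loss of the combination cannot be worse than that of the best individual forecast. Out-of-sample performance is a separate question that this sketch does not address.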

Among my papers published so far, this one has the shortest title.

Citation
Yi Wang, Ning Zhang, Yushi Tan, Tao Hong, Daniel Kirschen, and Chongqing Kang, "Combining probabilistic load forecasts," IEEE Transactions on Smart Grid, in press, available online. (arXiv; IEEE Xplore).

Combining Probabilistic Load Forecasts

Yi Wang, Ning Zhang, Yushi Tan, Tao Hong, Daniel Kirschen, and Chongqing Kang

Abstract

Probabilistic load forecasts provide comprehensive information about future load uncertainties. In recent years, many methodologies and techniques have been proposed for probabilistic load forecasting. Forecast combination, a widely recognized best practice in point forecasting literature, has never been formally adopted to combine probabilistic load forecasts. This paper proposes a constrained quantile regression averaging (CQRA) method to create an improved ensemble from several individual probabilistic forecasts. We formulate the CQRA parameter estimation problem as a linear program with the objective of minimizing the pinball loss and the constraints that the parameters are nonnegative and summing up to one. We demonstrate the effectiveness of the proposed method using two publicly available datasets, the ISO New England data and Irish smart meter data. Comparing with the best individual probabilistic forecast, the ensemble can reduce the pinball score by 4.39% on average. The proposed ensemble also demonstrates superior performance over nine other benchmark ensembles.

Thursday, June 14, 2018

A Semi-heterogeneous Approach to Combining Crude Oil Price Forecasts

Forecast combination is an effective method to enhance forecast accuracy. Most combination methods in the literature can be grouped into two categories, heterogeneous combination and homogeneous combination, each having pros and cons. I collaborated with my former visiting scholar Dr. Jue Wang and her colleagues to develop a semi-heterogeneous approach to combining forecasts. We leveraged the decomposition-reconstruction concept, mixing and matching 4 decomposition methods with 4 forecasting techniques. In total this process generates 16 forecasts for combination, which is easier than applying 16 completely different techniques (a.k.a. heterogeneous combination) and more robust than producing 16 different forecasts from one technique (a.k.a. homogeneous combination). Furthermore, the proposed method leads to more accurate forecasts than its counterparts.
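
The mix-and-match idea can be sketched in a few lines of Python. The decomposers and forecasters below are deliberately trivial, hypothetical stand-ins (two of each, giving 4 forecasts instead of 16), not the methods used in the paper, and the simple average at the end is just one possible way to combine.

```python
from itertools import product
from statistics import median


def semi_heterogeneous_forecasts(series, decomposers, forecasters):
    """Return one reconstructed forecast per (decomposer, forecaster) pair."""
    forecasts = {}
    for (d_name, decompose), (f_name, forecast) in product(
        decomposers.items(), forecasters.items()
    ):
        components = decompose(series)  # additive components of the series
        # Forecast each component, then reconstruct by summing.
        forecasts[(d_name, f_name)] = sum(forecast(c) for c in components)
    return forecasts


# Hypothetical stand-ins for real decomposition methods.
def level_plus_remainder(series):
    level = sum(series) / len(series)
    return [[level] * len(series), [x - level for x in series]]


def smooth_plus_remainder(series):
    smooth = [series[0]] + [(a + b) / 2 for a, b in zip(series, series[1:])]
    return [smooth, [x - s for x, s in zip(series, smooth)]]


decomposers = {"level": level_plus_remainder, "smooth": smooth_plus_remainder}
# Hypothetical stand-ins for real forecasting techniques.
forecasters = {"last value": lambda c: c[-1], "median": median}

prices = [10.0, 12.0, 11.0, 13.0, 12.0]  # made-up numbers
forecasts = semi_heterogeneous_forecasts(prices, decomposers, forecasters)
combined = sum(forecasts.values()) / len(forecasts)  # simple average
print(forecasts)
print("combined:", combined)
```

Adding a decomposition method or a forecasting technique means adding one dictionary entry, which is what makes this cheaper than building the same number of unrelated models.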

Citation
Jue Wang, Xiang Li, Tao Hong, and Shouyang Wang, "A semi-heterogeneous approach to combining crude oil price forecasts," Information Sciences, vol.460-461, pp 279-292, September 2018. (ScienceDirect)


A Semi-heterogeneous Approach to Combining Crude Oil Price Forecasts

Jue Wang, Xiang Li, Tao Hong, and Shouyang Wang

Abstract

Crude oil price forecasting has received increased attentions due to its significant role in the global economy. Accurate crude oil price forecasts often lead to a rapid new production development with higher quality and less cost. Making such accurate forecasts, however, is challenging due to the intrinsic complexity of oil market mechanism. Many techniques have been tested in the crude oil price forecasting literature. Although forecast combination is a well-known method to improve forecast accuracy, generating forecasts using various techniques tend to be labor intensive. How to efficiently generate many individual forecasts for combination becomes a research question in crude oil price forecasting. Recently, several signal decomposition methods have been suggested for processing the oil price signals. In this paper, we propose a semi-heterogeneous approach to combining crude oil price forecasts, which interacts a set of decomposition methods with a set of forecasting techniques. We first decompose the original price series using four decomposition methods, such as Wavelet Analysis, Singular Spectral Analysis, Empirical Mode Decomposition, and Variational Mode Decomposition. We then use four different forecasting techniques, such as Autoregressive Models, Autoregressive Integrated Moving Average Models, Artificial Neural Networks, and Support Vector Regression Models, to forecast the components from each decomposition methods. Finally, we reconstruct the price forecasts from the forecasted components. This process generates 16 price forecasts in total for combination. We test the combination based on all individual forecasts, as well as a subset of the individual forecasts selected using Tabu Search. The experimental results demonstrate that the forecasting models with the addition of a decomposition technique can have an error reduction of 30.6% compared to benchmark models on average. The combined forecasts outperform the individual forecasts on average. 
Furthermore, comparing with the heterogeneous combination of 4 individual forecasts, the semi-heterogeneous combinations reduce the errors by 56.6% (w/o Tabu Search) and 61.6% (w/ Tabu Search).

Friday, April 27, 2018

Weather Data for Energy Analytics

Being an energy forecaster, I am genuinely interested in meteorology. I even recruited a master's student who was a practicing meteorologist in Hawaii (see the blog post about Ying Chen). The more energy forecasting projects I conduct, the more I appreciate the value of weather data. In GEFCom2014, first place in the solar track went to a team of meteorologists from Australia, who completely dominated the track. In GEFCom2017, first place in the final match went to a team of meteorologists from Japan. I truly believe that the energy forecasting community can leverage meteorology better than we do today. Here is an article about two use cases of weather data for energy analytics. In fact, we merged two papers into one by removing the sophisticated mathematics and statistics to keep the story readable to a broad audience. The IEEE Power and Energy Society was kind enough to offer open access to this paper, so that people can read it for free.

Citation

Jonathan Black, Alex Hofmann, Tao Hong, Joseph Roberts, and Pu Wang, "Weather data for energy analytics: from modeling outages and reliability indices to simulating distributed photovoltaic fleets," IEEE Power and Energy Magazine, vol.16, no.3, pp 43-53, May-June 2018. (Open Access; IEEE Xplore)


Weather Data for Energy Analytics

From Modeling Outages and Reliability Indices to Simulating Distributed Photovoltaic Fleets

Jonathan Black, Alex Hofmann, Tao Hong, Joseph Roberts, and Pu Wang

Abstract

Weather impacts virtually all facets of our daily life. As a result, many business sectors are affected by weather conditions, and the power industry is no exception. Weather is a major influencer on system reliability and a key driver of both power supply and demand. In this article, we will demonstrate novel uses of weather data for energy analytics via two utility applications. We first use easily accessible weather data together with regression analysis to model distribution outages and construct a probabilistic view of reliability indices that helps reveal a utility’s reliability trend. We then use high-resolution, commercial-grade weather data to develop realistic simulations of anticipated behind-the-meter photovoltaic (PV) fleets.

Friday, April 20, 2018

Training Energy Data Scientists

The traditional power engineering curriculum has focused heavily on the engineering aspects of power systems, such as power flow, state estimation, stability and control. Data science has never been a focus in the past. I saw that gap 5 years ago, predicted the shortage of data scientists in the power industry, and left a great place to work to come back to academia. Nowadays, when other business sectors are offering 6-figure salaries to fresh graduates, utilities are having a hard time competing for analytics talent. Recently I had the opportunity to collaborate with Prof. David Wenzhong Gao from the University of Denver and three utility executives to put our thoughts in a paper.

Citation

Tao Hong, David Wenzhong Gao, Tom Laing, Dale Kruchten, and Jorge Calzada, "Training energy data scientists: universities and industry need to work together to bridge the talent gap," IEEE Power and Energy Magazine, vol.16, no.3, pp 66-73, May-June 2018. (IEEE Xplore)

Training Energy Data Scientists 

Universities and Industry Need to Work Together to Bridge the Talent Gap

Tao Hong, David Gao, Tom Laing, Dale Kruchten, and Jorge Calzada

Abstract

The workforce crisis is nothing new to the U.S. power industry. It has been a growing concern of both governments and industry organizations since the early 2000s. Meanwhile, the growth of data during the past decade has led to a demand surge for data analytics across all business sectors. The shortage of an electricity workforce and the increasing demand for data analytics present an emerging challenge as well as opportunity for university power engineering programs to bridge the data analytics talent gap. After gathering various perspectives from members of academia, industry, and government, we propose an interdisciplinary and entrepreneurial approach to revising the traditional power engineering curriculum for training the next generation of energy data scientists.

Saturday, April 14, 2018

Call For Papers: Forecasting for Social Good | International Journal of Forecasting

International Journal of Forecasting

Special Issue on Forecasting for Social Good

The area of forecasting and its link to decision making has been under research for decades. Whilst there have been many influential contributions seeking to examine the effects of forecasting under financial and economic motives, very little has been contributed (both in regular conferences and journal publications) on forecasting with social impact – that is forecasting for the social good, regardless of the financial implications, or optimizations attempted based on economic terms.

The International Journal of Forecasting (IJF) is excited to announce this Call For Papers for the special issue on “Forecasting for Social Good”. The purpose of the special issue is to attract high quality papers that are concerned with the social impact of forecasting.

Areas of interest include, but are not limited to:
  • Health and healthcare
  • Humanitarian operations
  • Disaster relief
  • Education
  • Social services
  • Environment
  • Sustainability
  • Sharing economy
  • Transportation
  • Urban planning
  • Fraud, collusion, and corruption
  • Government policy
  • Poverty
  • Privacy
  • Cyber security
  • Crime and terrorism

Submission Deadline: 31 October 2018.

Submission Guidelines:

To submit a paper for consideration for the Special Issue, please upload your paper online and include a cover letter clearly indicating that the paper is for the special issue “Forecasting for Social Good”. The webpage for online submission is mc.manuscriptcentral.com/ijf. Instructions for authors are provided at www.forecasters.org/ijf/authors . All papers will follow IJF’s double-blind refereeing process. For further information about the Special Issue, please contact the guest editors.

Guest Editors

Bahman Rostami-Tabar, Cardiff University, UK
Email: rostami-tabarb@cardiff.ac.uk

Michael Porter, University of Alabama, USA
Email: mporter@culverhouse.ua.edu

Tao Hong, University of North Carolina at Charlotte, USA
Email: hong@uncc.edu