The Spring 2016 BigDEAL Forecasting Competition (BFCom2016s) concluded last week. I received 49 registrations from 15 countries, of which 18 teams from 6 countries completed all four rounds of the competition. I want to give my special appreciation to Prof. Chongqing Kang and his teaching assistant Mr. Yi Wang, who organized 8 teams formed by students from Tsinghua University, an institute prize winner of GEFCom2014. Two of the Tsinghua teams ultimately ranked among the Top 6.
The topic of BFCom2016s was ex ante short term load forecasting. I provided 4 years of historical load and temperature data, asking the contestants to forecast the next three months given historical day-ahead temperature forecasts. Three months of incremental data were released in each round.
The benchmark was generated by the Vanilla model, the same one used in GEFCom2012. This time, among the top 6 teams, five beat the benchmark on average ranking, while four beat it on average MAPE. The detailed rankings and MAPEs of all teams are listed HERE.
I invited each of the top 6 teams to send me a guest blog post describing their methodology. Their contributions (with my minor editorial changes) are listed below, together with the Vanilla Benchmark, which ranked No. 7.
No. 1: Jingrui Xie (avg. ranking: 1.25; avg. MAPE: 5.38%)
Team member: Jingrui Xie
Affiliation: University of North Carolina at Charlotte, USA
The same model selection process was used in all four rounds, implemented in SAS. It follows the point forecasting model selection process of Xie and Hong, IJF-2016. In this competition, the forecasting problem was dissected into three sub-problems, each evaluating a slightly different set of candidate models.
The first sub-problem was a very short term load forecasting problem, covering the first day of the forecast period. The model selection process started with the Vanilla model plus the lag-24 load (the load of the same hour on the previous day). It then considered the recency effect, the weekend effect, the holiday effect, the two-stage model, and the combination of forecasts, as introduced in Hong, 2010 and Xie and Hong, IJF-2016.
The second sub-problem was a short term load forecasting problem, covering the second to the seventh day of the month. The model selection process was the same as that for the very short term problem, except that the starting benchmark was the Vanilla model itself.
The third sub-problem can be categorized as a medium term load forecasting problem, covering the rest of the forecast period. The model selection process also started with the Vanilla model, but it only considered the recency effect, the weekend effect, and the holiday effect.
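To make the dissection concrete, here is a minimal MATLAB sketch (my illustration, not her SAS code) of how an ex ante forecast horizon could be split into the three sub-problems:

    % Split the forecast horizon into the three sub-problems; hrs counts
    % hours ahead of the forecast origin (2208 = 92 days, e.g. one round).
    hrs = (1:2208)';
    sub = ones(size(hrs));               % 1 = very short term (first day)
    sub(hrs > 24)  = 2;                  % 2 = short term (days 2 through 7)
    sub(hrs > 168) = 3;                  % 3 = medium term (rest of the period)
    % Each sub-problem then gets its own model selected on validation error,
    % e.g. Vanilla + lag-24 load for sub == 1, Vanilla variants for 2 and 3.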
No. 2: SMHC (avg. ranking: 3.75; avg. MAPE: 5.90%)
Team members: Zejing Wang; Qi Zeng; Weiqian Cai
Affiliation: Tsinghua University, China
We tried support vector machine (SVM) and artificial neural network (ANN) models in the model selection stage and found that the ANN performed better than the SVM. To capture the cumulative effect of temperature, we introduced the aggregated temperatures of several preceding hours as augmented variables; the number of hours was also determined in the model selection process.
In the first round, we used all the provided data for training but did not consider the influence of holidays. In the next three rounds, we divided the data into two seasons, “summer” and “winter”, and forecasted normal days and special holidays separately. These so-called seasons are not the traditional ones; they were roughly defined from a plot of the average load over the given four years. We then used the data from each season for training to forecast the corresponding season in 2014, which ultimately achieved higher accuracy. All the aforementioned algorithms were implemented in MATLAB and C.
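One plausible reading of the aggregated-temperature idea, sketched in MATLAB (toy data and parameter choices are mine, not the team's): average the past k hourly temperatures as an extra input and pick k by trial.

    % Choose the temperature-aggregation window k with a small ANN (fitnet
    % is from the Neural Network Toolbox).
    N = 8760;
    T = 20 + 10*sin(2*pi*(0:N-1)'/24) + randn(N,1);   % toy temperature series
    demand = 100 + 3*abs(T - 18) + 5*randn(N,1);      % toy load series
    bestK = 2; bestErr = inf;
    for k = [2 4 6 12]
        Tagg = movmean(T, [k-1 0]);                   % average of the last k hours
        X = [T Tagg]'; Y = demand';
        net = fitnet(10);                             % one hidden layer, 10 neurons
        net.trainParam.showWindow = false;
        net = train(net, X, Y);                       % default random train/val split
        err = mean(abs(net(X) - Y) ./ Y);             % MAPE over all samples (toy)
        if err < bestErr, bestErr = err; bestK = k; end
    end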
No. 3: eps (avg. ranking: 5.25; avg. MAPE: 6.08%)
Team member: Ilias Dimoulkas
Affiliation: KTH Royal Institute of Technology, Sweden
I used MATLAB’s Neural Network Toolbox for the modeling. The evolution of my model during the four rounds was as follows.
1st round: I used the “Fitting app”, which is suitable for function approximation. The training vector was IN = [Hour Temperature] and the target vector was OUT = [Load].
2nd round: I used the “Time Series app”, which is suitable for time series and dynamical systems. I used the Nonlinear Input-Output model instead of the Nonlinear Autoregressive with External Input (NARX) model because it performs better for long term forecasting. The training vector was still IN = [Hour Temperature] and the target vector OUT = [Load]. I found that the model worked best with 5 delays (= 5 hourly lags).
3rd round: I used the same model but changed the training vector to IN = [Month Weekday Hour Temperature AverageDailyTemperature MaxDailyTemperature], where AverageDailyTemperature and MaxDailyTemperature are the average and maximum temperatures of the day that the specific hour belongs to.
4th round: I used two similar models with different training vectors and took the average of their outputs as the final forecast (see the sketch below). The training vectors were IN1 = [Month Weekday Hour Temperature MovingAverageTemperature24 MovingMaxTemperature24] and IN2 = [Month Weekday Hour Temperature AverageTemperaturePreAfter4Hours MovingAverageTemperature24 MovingAverageTemperature5 MovingMaxTemperature24], where MovingAverageTemperature24 is the average temperature of the last 24 hours, MovingAverageTemperature5 is the average temperature of the last 5 hours, MovingMaxTemperature24 is the maximum temperature of the last 24 hours, and AverageTemperaturePreAfter4Hours is the average temperature of the hours ranging from 4 hours before to 4 hours after the specific hour.
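A MATLAB sketch of the round-4 feature construction (my reconstruction from the description above, with toy data; not the contestant's code):

    % Build the moving-window temperature features for the second input set.
    N = 8760;                                         % one year of hourly data
    t = datetime(2013,1,1) + hours(0:N-1)';           % toy hourly timestamps
    T = 20 + 10*sin(2*pi*(0:N-1)'/24) + randn(N,1);   % toy temperature series
    demand = 100 + 3*abs(T - 18) + 5*randn(N,1);      % toy load series
    MovAvgT24  = movmean(T, [23 0]);                  % average of the last 24 hours
    MovAvgT5   = movmean(T, [4 0]);                   % average of the last 5 hours
    MovMaxT24  = movmax(T,  [23 0]);                  % maximum of the last 24 hours
    AvgPreAft4 = movmean(T, [4 4]);                   % 4 hours before to 4 hours after
    IN2 = [month(t) weekday(t) hour(t) T AvgPreAft4 MovAvgT24 MovAvgT5 MovMaxT24];
    net2 = fitnet(10); net2.trainParam.showWindow = false;
    net2 = train(net2, IN2', demand');                % second of the two models
    % The submitted forecast is the average of this model and the IN1 model.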
No. 4: Fortune Teller (avg. ranking: 6.25; avg. MAPE: 6.45%)
Team members: Guangzheng Xing; Zetian Zheng; Liangzhou Wang
Affiliation: Tsinghua University, China
Round 1. Variables: Hour, Weekday, T_act, TH (the highest temperature of the day), TM (the mean temperature), and TL (the lowest temperature). First we used MLR, fitting the mean load with TM, TM^2, and TM^3. This method did not work well; the MAPE reached about 14%. Then we used a neural network with the six variables above as inputs and Load_MW as the target. The result was better, but because of improper parameters the model was somewhat overfitted, and since we did not do cross-validation, the result was still not good.
Round 2. We changed the parameters and used the maximum, minimum, and mean values of the previous 24 hours rather than those of the calendar day. The result was much better.
Round 3. We tried using SVM to classify the two kinds of daily load curves and then applied the neural network to each class separately, but this method did not seem to be effective. We then used SVM for regression on the same data set as the neural network. On the test set the results of SVM and the neural network were similar, so we submitted the mean of both methods' results (a sketch of this blend appears below).
Round 4. During model selection the MAPEs of both methods exceeded 7%, and the SVM result was worse, so we submitted only the neural network result.
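A minimal MATLAB sketch of the Round 3 blend (toy features and parameter choices are mine, not the team's): train a feedforward net and an SVM regression on the same features, then average the two forecasts.

    % Blend a neural network and SVM regression on identical inputs.
    N = 2000;
    X = rand(N, 4);                                   % toy feature matrix
    y = 100 + 50*X(:,1) + 10*randn(N, 1);             % toy load
    net = fitnet(10); net.trainParam.showWindow = false;
    net = train(net, X', y');
    svm = fitrsvm(X, y, 'KernelFunction', 'gaussian', 'Standardize', true);
    yhat = (net(X')' + predict(svm, X)) / 2;          % submitted = mean of both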
No. 5: Keith Bishop (avg. ranking: 6.50; avg. MAPE: 6.47%)
Team member: Keith Bishop
Affiliation: University of North Carolina at Charlotte, USA; Hepta Control Systems, USA
For my forecast, I utilized SkyFoundry’s SkySpark analytics software. SkySpark is designed for modeling complex building systems and working with time-series data on a wide range of levels. To support my model, I extended the inherent functionality of this software to support polynomial regression. My model went through several iterations. The first was fairly similar to Dr. Hong’s Vanilla model, except that instead of clustering by month, I clustered based on whether the date was a heating or cooling day. The heating or cooling determination was made by fitting a third-degree polynomial curve to each hourly clustered load-temperature scatter plot, solving for the minima, and then calculating the change-over point by averaging these hourly values. If the average temperature for a day was above this point, it was a cooling day, and vice versa. As my model progressed, I incorporated monthly clustering and the recency effect discussed in “Electric load forecasting with recency effect: A big data approach”. With the recency effect, I optimized the number of lag hours for each monthly cluster by creating models for each of the past 24 hours and selecting the one with the lowest error. In the end, I was able to reduce the MAPE of the forecast against the known data from 8.51% down to 5.01%.
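His implementation is in SkySpark; the change-over-point idea can be sketched in MATLAB as follows (toy data and thresholds are mine): fit a cubic to each hour's load-temperature scatter, find its interior minimum, and average the 24 hourly minima.

    % Estimate the heating/cooling change-over temperature.
    N = 8760; hr = mod(0:N-1, 24)';
    T = 15 + 12*sin(2*pi*(0:N-1)'/8760) + 8*sin(2*pi*(0:N-1)'/24) + randn(N,1);
    demand = 100 + 0.2*(T - 17).^2 + 5*randn(N,1);    % toy U-shaped load vs. temperature
    changePts = nan(24, 1);
    for h = 0:23
        idx = (hr == h);
        p = polyfit(T(idx), demand(idx), 3);          % third-degree polynomial fit
        r = roots(polyder(p));                        % stationary points
        r = real(r(abs(imag(r)) < 1e-9));             % keep real roots
        r = r(polyval(polyder(polyder(p)), r) > 0);   % keep minima only
        if ~isempty(r), changePts(h+1) = r(1); end
    end
    changeOver = mean(changePts, 'omitnan');          % above = cooling day, below = heating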
No. 6: DUFEGO (avg. ranking: 7.25; avg. MAPE: 6.39%)
Affiliation: Dongbei University of Finance and Economics, China
During the 4-round competition, we selected MATLAB as our tool. We used multiple linear regression (MLR) models, each of which has 291 variables, including trend, polynomial terms, interaction terms, and recency-effect terms. We used all of the historical data without cleansing it. Considering that the forecasting task is to improve predictive accuracy rather than goodness of fit, we separated the data into a training set and a validation set, and used cross-validation and out-of-sample testing to select variables and give our model more generalization ability (a sketch of this selection appears below).
In Round 1, we trained one MLR model on the entire history. In Round 2, we roughly grouped the historical data by season (such as January-March and April-June) and trained four MLR models, which improved the results significantly. We also found distinct relationships between temperature and load across different temporal groupings; after some work on selecting the best MLR model for each, we found that the seasonal separation worked better. We made a mistake in Round 3 that resulted in a very high MAPE.
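A minimal MATLAB sketch of out-of-sample variable selection (my illustration; the candidate terms and split are assumptions, and tbl is a table like the one built in the Vanilla sketch at the end of this post, including the T2 and T3 columns):

    % Greedy forward selection: keep a candidate term only if it lowers the
    % validation MAPE. Assumes several years of hourly history in tbl.
    mape = @(y, yhat) 100 * mean(abs(y - yhat) ./ y);
    cut = height(tbl) - 8760;                         % hold out the last year
    trainTbl = tbl(1:cut, :); validTbl = tbl(cut+1:end, :);
    formula = 'Load ~ Trend + Month:T';               % starting model (illustrative)
    base = fitlm(trainTbl, formula);
    bestErr = mape(validTbl.Load, predict(base, validTbl));
    for cand = {'Month:T2', 'Month:T3', 'Hour:T', 'Day:Hour'}
        trial = fitlm(trainTbl, [formula ' + ' cand{1}]);
        err = mape(validTbl.Load, predict(trial, validTbl));
        if err < bestErr
            formula = [formula ' + ' cand{1}];        % keep the term
            bestErr = err;
        end
    end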
No. 7: Vanilla Benchmark (avg. ranking: 7.25; avg. MAPE: 6.42%)
The model is the same as the one used in GEFCom2012; see Hong, Pinson and Fan, IJF-2014 for more details. All available historical data in each round was used to estimate the model.
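For readers who want to reproduce the benchmark, here is a minimal MATLAB sketch (toy data and variable names are mine; see the paper for the exact specification):

    % Vanilla benchmark: trend, Day x Hour interaction, and temperature
    % polynomials interacted with Month and Hour.
    N = 8760;
    t = datetime(2013,1,1) + hours(0:N-1)';           % toy hourly timestamps
    T = 20 + 10*sin(2*pi*(0:N-1)'/24) + randn(N,1);   % toy temperature series
    Load = 100 + 0.2*(T - 17).^2 + 5*randn(N,1);      % toy load series
    tbl = table((1:N)', categorical(month(t)), categorical(weekday(t)), ...
                categorical(hour(t)), T, Load, ...
                'VariableNames', {'Trend','Month','Day','Hour','T','Load'});
    tbl.T2 = tbl.T.^2;
    tbl.T3 = tbl.T.^3;
    mdl = fitlm(tbl, ['Load ~ Trend + Day:Hour + Month + Month:T + Month:T2 + ' ...
                      'Month:T3 + Hour:T + Hour:T2 + Hour:T3']);
    fitted = predict(mdl, tbl);    % replace tbl with forecast-period inputs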
Finally, congratulations to the top 6 teams of BFCom2016s, and many thanks to everyone who participated in or followed the competition!