Saturday, March 23, 2013

Three Must-Know Basics of Forecasting

Sometimes you may hear someone say, "ABC is so basic that every forecaster should know it." If you happen to be the "lucky" forecaster who doesn't know ABC or its implications, it can be quite an embarrassing moment. More importantly, not knowing these basics may eventually lead to serious mistakes during the forecasting process. Here I'd like to list three must-know basics of forecasting off the top of my head.
1. Forecasting is a stochastic problem.
Forecasting, by nature, is a stochastic problem rather than a deterministic one. There is no certainty in forecasting. Statements like "the sun will rise tomorrow" are not forecasts.
Since we forecasters are dealing with randomness, the output of a forecasting process is supposed to be in a probabilistic form, such as a forecast under this or that scenario, a probability density function, a prediction interval, or some quantile of interest. In practice, many decision-making processes today cannot yet take probabilistic inputs, so the most commonly used form of forecasting output is still the point forecast, e.g., the expected future value of a random variable.
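As a rough illustration of these output forms, the sketch below summarizes a set of simulated scenarios as a point forecast, a median, and a prediction interval. The scenario values, the helper names (`empirical_quantile`, `summarize_forecast`), and the normal distribution used to generate the scenarios are all assumptions made up for this example, not part of any particular forecasting method.

```python
import random
import statistics


def empirical_quantile(samples, q):
    """Return the q-th empirical quantile (0 <= q <= 1) with linear interpolation."""
    ordered = sorted(samples)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] * (1 - frac) + ordered[upper] * frac


def summarize_forecast(samples, coverage=0.9):
    """Reduce a set of scenarios to a point forecast, a median, and a central interval."""
    tail = (1 - coverage) / 2
    return {
        "point": statistics.fmean(samples),  # expected value as the point forecast
        "median": empirical_quantile(samples, 0.5),
        "interval": (
            empirical_quantile(samples, tail),
            empirical_quantile(samples, 1 - tail),
        ),
    }


# Hypothetical scenarios: 10,000 simulated values of tomorrow's peak load in MW.
rng = random.Random(42)
scenarios = [rng.gauss(1000.0, 50.0) for _ in range(10_000)]

summary = summarize_forecast(scenarios)
print(summary)
```

The point forecast is only one number taken from the full distribution; the interval and quantiles carry the information about uncertainty that a point forecast leaves out.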
2. All forecasts are wrong.
Due to the stochastic nature of forecasting, the response variable we forecast is never 100% predictable. Take guessing heads or tails on a coin toss as an example. We may be right a few times, but the probability that we are right every time shrinks toward zero as the number of tosses grows. A question like "why is your forecast different from the actual?" should never be asked, because we do expect some differences between the forecasts and the actual values. If anything, it is suspicious when the forecasts are exactly the same as the actual values.
On the other hand, it is fair to ask "why does your forecast fail to capture XYZ features of the actual?", though this requires the person who raises the question to identify the missing features first. Of course, there are plenty of other factors that may cause wrong forecasts, such as bad data, inappropriate methodologies, and crappy software.
It's the forecaster's job to apply best practices to avoid these avoidable issues.
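To make the coin-guessing example in this section concrete, here is a minimal sketch assuming a fair coin and independent tosses (both assumptions are mine). Under those assumptions, the chance of guessing every toss correctly is 0.5 raised to the number of tosses, which becomes negligible very quickly.

```python
def prob_all_correct(n_guesses, p_correct=0.5):
    """Probability of guessing every one of n independent tosses correctly."""
    return p_correct ** n_guesses


for n in (1, 10, 50):
    print(n, prob_all_correct(n))
```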
3. Some forecasts are useful.
Most industries require some forecasts in their decision-making processes, but not necessarily perfect forecasts. The retail industry needs SKU (stock keeping unit) forecasts to optimize promotion offerings and manage inventory; the airline industry needs passenger forecasts to schedule flights; our utility industry needs energy forecasts, which in a broader sense include forecasts of load, generation, and price, to operate and plan the system.
What does "useful" mean in the utility industry?
There are at least two aspects of usefulness in the utility industry: accuracy and defensibility. Accuracy can be calculated based on various peaks (e.g., monthly, seasonal, or annual peaks), energy, or a combination of them, while defensibility may include interpretability, traceability, and reproducibility.
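As one possible way to measure accuracy on peaks and on energy separately, the sketch below computes the mean absolute percentage error (MAPE) of monthly peaks and of monthly energy from hourly loads. The choice of MAPE, the function names, and the toy numbers are assumptions for illustration only; the right error measure depends on the business need.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(
        abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)
    ) / len(actuals)


def peak_and_energy_errors(actual_by_month, forecast_by_month):
    """Each argument maps a month label to a list of hourly loads in MW."""
    months = sorted(actual_by_month)
    actual_peaks = [max(actual_by_month[m]) for m in months]
    forecast_peaks = [max(forecast_by_month[m]) for m in months]
    # Summing hourly MW values gives energy in MWh.
    actual_energy = [sum(actual_by_month[m]) for m in months]
    forecast_energy = [sum(forecast_by_month[m]) for m in months]
    return {
        "peak_mape": mape(actual_peaks, forecast_peaks),
        "energy_mape": mape(actual_energy, forecast_energy),
    }


# Made-up toy data: three "hours" per month, just to show the mechanics.
actual = {"2013-01": [90.0, 100.0, 80.0], "2013-02": [70.0, 80.0, 50.0]}
forecast = {"2013-01": [85.0, 110.0, 75.0], "2013-02": [72.0, 76.0, 52.0]}

errors = peak_and_energy_errors(actual, forecast)
print(errors)
```

In this toy data the monthly energy is matched exactly while the monthly peaks are off, which shows why a forecast can look accurate on one measure and not on another.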
While the points above may not be mutually exclusive, they should be prioritized differently depending upon the exact business need. For example, for regulatory compliance purposes, we would emphasize defensibility more than accuracy. As a result, statistical approaches such as multiple linear regression are usually preferred over black-box approaches like Artificial Neural Networks.
It's the forecaster's job to understand the business needs before forecasting.
(Continue reading Three More Must-Know Basics of Forecasting)

1. Since we forecasters are dealing with randomness

Why are we dealing with randomness?

Due to the stochastic nature of forecasting, the response variable we forecast is never 100% predictable

Disagree. If the response variable is how much electricity my house uses, it is 100% predictable if we know what appliances are on and when.

1. Yes, you would be able to calculate how much each appliance uses after the fact. However, if I asked 'How much electricity will your air conditioner use tomorrow and when?' that becomes an issue of forecasting, and there is randomness. In the case of an air conditioner: temperature, cloud cover, humidity, wind speed, and other variables have a level of randomness associated with them. As a consequence, I could not tell you how much my air conditioner will run tomorrow....

2. When you say randomness, do you mean uncertainty?

2. "If the response variable is how much electricity my house uses, it is 100% predictable if we know what appliances are on and when."

This is a calculation, not a forecast.

1. forecasts are calculations

3. ... Predicting (in the future) what appliances are on and when would be a forecast and would be stochastic.
Forecasts have residuals after fitting to history, this is PART of the randomness. Even if the fit was perfect (zero residual), there is still uncertainty in the future.

1. Forecasts have residuals after fitting to history, this is PART of the randomness.

It is not randomness - it is just the unknown - which is not random.