Predicting Using the Past

Clearer Thinking Team
May 11, 2015

By Spencer Greenberg

(Cross-posted at SpencerGreenberg.com)

When we try to predict how long a task will take, we are in danger of falling prey to the planning fallacy: the natural human tendency to underestimate how long our own projects will take and how much they will cost.


To give one of many possible examples, when a group of students were asked to estimate how long their senior theses would take if everything went as poorly as it possibly could, the average estimate was about 49 days. In fact, the average time it took the students to complete these papers was about 56 days, 7 days longer than their worst-case estimates. Only about 30% of the students finished their projects in the amount of time they estimated. Other studies have demonstrated a similar optimistic bias on a variety of project types, from computer programming to tax form completion.

Why might we be bad at making estimates about our own projects? It is likely a combination of:

Our tendency to plan as though the stages of a project will each go smoothly (when, in fact, one or more of these stages may have hitches).
A self-serving bias, where we take credit for our past successes, but treat our past failures as being caused by unpredictable external events. This can lead us to have an inflated sense of our ability to complete projects.
Our tendency to try to impress others by exaggerating how well we can perform (which becomes relevant when we are making our estimates in front of others).
A wishful thinking bias, where our beliefs are influenced by how much we want something to be true. Since it is more pleasing to believe that a project will be completed quickly, in some cases we may be biased towards believing that.

So how can we correct this problem in our forecasting? Well, just knowing about it makes it possible for us to consciously correct for what are likely to be overly optimistic estimates. But even this approach often fails, as we may not adjust enough (being optimistic about the amount of bias that we have), or overreact and adjust too much. Fortunately, there is a prediction method, known as Reference Class Forecasting, that tends to be more reliable. At its core, this technique involves considering past cases that were similar to the project that you are now trying to make predictions about, and applying probabilistic thinking.

Rather than asking “Given what this project’s parts consist of, how long do I expect it to take?”, Reference Class Forecasting involves asking, “How long did similar projects I’ve done in the past take?” If you’ve never done a project similar to the one at hand, you can modify this question to, “Historically, how long have projects like this one taken for people with a level of skill that is similar to mine?” Once you have recalled or collected data on how long similar projects have taken, it is then easy to estimate how likely the project is to take different lengths of time. For instance, to estimate the probability that a project takes more than 30 days, we can just check what percentage of similar past projects took us more than 30 days. The more data you have on similar projects, and the more similar those projects are to the project you are now doing, the less uncertain your estimates will be.
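To make the 30-day example concrete, here is a minimal sketch in Python. The function name and the list of past durations are made up for illustration; the only idea taken from the text above is counting what fraction of similar past projects ran longer than the threshold.

```python
def prob_longer_than(past_durations_days, threshold_days):
    """Fraction of past projects that took strictly more than threshold_days."""
    if not past_durations_days:
        raise ValueError("need at least one past project")
    longer = sum(1 for days in past_durations_days if days > threshold_days)
    return longer / len(past_durations_days)


# Hypothetical durations (in days) of eight similar past projects.
past = [21, 35, 28, 44, 30, 52, 25, 38]

# Four of the eight projects (35, 44, 52, 38) ran longer than 30 days.
print(prob_longer_than(past, 30))  # prints 0.5
```

With only eight past projects this is a rough estimate, so it is best read as "about half the time" rather than as a precise probability.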

Reference Class Forecasting is useful, in part, because it gives us a way of predicting how long a project will take (or how costly it will be) that is less likely to be influenced by our various biases. Wishful thinking, excessive optimism, and self-serving tendencies can be reduced simply by viewing our project as one among many, and thinking in terms of the probability of different outcomes. This process is certainly not perfect. For instance, it is not obvious which projects should count as “similar enough” to include in our analysis. And there will be a fundamental trade-off in this procedure between considering more past projects that are less similar and considering fewer past projects that are more similar, and it isn’t clear what the optimal trade-off is. But, nonetheless, this method often yields substantial improvements in prediction accuracy over other approaches.
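One way to see the trade-off between more, less similar projects and fewer, more similar ones is to try several cutoffs and watch how the sample size and the estimate move. The sketch below is only an illustration: the similarity scores (from 0 to 1), the durations, and the cutoffs are all invented, and scoring similarity with a single number is itself an assumption, not something the method prescribes.

```python
# Hypothetical past projects: (similarity to the current project, duration in days).
past_projects = [
    (0.9, 40), (0.8, 35), (0.7, 50), (0.5, 20),
    (0.4, 25), (0.3, 15), (0.2, 60), (0.2, 10),
]


def reference_class(projects, min_similarity):
    """Durations of the projects whose similarity is at least min_similarity."""
    return [days for similarity, days in projects if similarity >= min_similarity]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# A stricter cutoff keeps fewer, more relevant projects;
# a looser one keeps more projects that are less relevant.
for cutoff in (0.7, 0.4, 0.0):
    durations = reference_class(past_projects, cutoff)
    print(f"cutoff={cutoff}: n={len(durations)}, mean={mean(durations):.1f} days")
```

If the estimate changes a lot as the cutoff moves, that is a sign the choice of reference class matters for this prediction and deserves more thought.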

Reference Class Forecasting can be used for many things besides planning projects. For instance:

If you want to know how likely your friend is to cancel plans with you, consider the frequency with which they canceled in the past. For example, you can check your calendar to look at the last six times you had plans, and try to remember if they canceled on each of those occasions. If they canceled half of those times, it is quite likely they will cancel this time as well. If they didn’t cancel any of those times, then they probably won’t this time either. Of course, if you have extra information, such as that your friend has a cold and it is raining outside, you’ll want to try to make an adjustment to this probability, and conclude that your friend is less likely to show than normal.
Suppose that you are stressing out about a test that is coming up in a couple of weeks, and want to know how likely you are to do poorly. Well, consider your history of taking tests in the past that seemed to be of about this difficulty. If you never got a grade lower than a B on ten such exams, then it is quite unlikely that you will get a C or lower on this next one.
Perhaps you are feeling hopeless about finding a boyfriend, because you haven’t dated anyone for a while. Think back to your dating history, and note how long it took you to meet someone you liked in past cases. Unless something important has changed in your life that would affect your dating outcomes, this information can help you estimate how long it is likely to take you to find someone in the future. If this procedure tells you that you likely will have to wait a depressingly long time to find someone you like, consider the strategies you can use to increase your average romantic happiness.

Using the past to predict how well things will turn out in the future is certainly not an infallible method. Sometimes things change in a fundamental way such that past examples are just not relevant, or we lack knowledge about past examples. There are also fairly arbitrary decisions to be made during this process, like deciding which cases are similar enough to include. But, remarkably, using this simple procedure can often give us reasonable answers to questions that we care about, and produce predictions that may be less biased and more accurate than would typically be achieved by the methods we would have naturally relied on.

In cases where it is very important that our predictions are accurate, we can use both Reference Class Forecasting and other methods, and compare their results. When they agree, this should give us increased credence in our predictions. When they disagree, we can try to figure out why our prediction methods are diverging.
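As a rough sketch of that comparison, the snippet below checks whether two duration estimates are close to each other. The function name, the example numbers, and the 25% tolerance are arbitrary choices made for illustration; what counts as "agreement" depends on how much precision the decision needs.

```python
def compare_forecasts(reference_class_days, other_method_days, tolerance=0.25):
    """Return (agree, relative_gap) for two positive duration estimates.

    relative_gap is the absolute difference divided by the larger estimate;
    the estimates "agree" when that gap is within the tolerance.
    """
    if min(reference_class_days, other_method_days) <= 0:
        raise ValueError("estimates must be positive")
    larger = max(reference_class_days, other_method_days)
    gap = abs(reference_class_days - other_method_days) / larger
    return gap <= tolerance, gap


# Hypothetical estimates: 40 days from past projects, 36 days from a task breakdown.
print(compare_forecasts(40, 36))  # the two methods are close

# Hypothetical estimates that diverge: 40 days versus 20 days.
print(compare_forecasts(40, 20))  # worth investigating why they differ
```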

The next time you want to make a prediction, consider asking yourself, “Are there past examples similar to this case? What were the outcomes in those past cases, and how often did each of those outcomes occur?”