There are a LOT of ways to make inferences (that is, to draw conclusions based on information or evidence). In fact, there are many more than most people realize. All of them have strengths and weaknesses that render them more useful in some situations than in others.

Here's a brief key describing the most popular methods of inference, to help you whenever you're trying to draw a conclusion for yourself. Do you rely more on some of these than you should, given their weaknesses? Are there others in this list that you could benefit from using more in your life, given their strengths? And what does drawing conclusions mean, really? As you'll learn in a moment, it encompasses a wide variety of techniques, so there isn't one single definition.

—

1. Deduction

Common in: philosophy, mathematics

Structure:

If X, then Y, due to the definitions of X and Y.

X applies to this case.

Therefore Y applies to this case.

Example: “Plato is a mortal, and all mortals are, by definition, able to die; therefore Plato is able to die.”

Example: “For any number that is an integer, there exists another integer greater than that number. 1,000,000 is an integer. So there exists an integer greater than 1,000,000.”

Advantages: When you use deduction properly in an appropriate context, it is an airtight form of inference (e.g. in a mathematical proof with no mistakes).

Flaws: To apply deduction to the world, you need to rely on strong assumptions about how the world works, or else apply other methods of inference on top. So its range of applicability is limited.

—

2. Frequencies

Common in: applied statistics, data science

Structure:

95% of the time that X occurred in the past, Y occurred also.

X occurred.

Therefore Y is likely to occur (with high probability).

Example: “95% of the time when we saw a bank transaction identical to this one, it was fraudulent. So this transaction is very likely fraudulent.”

Advantages: This technique allows you to assign probabilities to events. When you have a lot of past data it can be easy to apply.

Flaws: You need to have a moderately large number of examples like the current one to perform calculations on. Also, the method assumes that those past examples were drawn from a process that is (statistically) just like the one that generated this latest example. Moreover, it is unclear sometimes what it means for “X”, the type of event you’re interested in, to have occurred. What if something that’s very similar to but not quite like X occurred? Should that be counted as X occurring? If we broaden our class of what counts as X or change to another class of event that still encompasses all of our prior examples, we’ll potentially get a different answer. Fortunately, there are plenty of opportunities to make inferences from frequencies where the correct class to use is fairly obvious.
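
To make the reference class problem concrete, here is a minimal Python sketch. The `Transaction` fields, the `fraud_frequency` helper, and the data are all invented for illustration; the point is only that the same history yields different frequencies depending on what you count as "X".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    # Hypothetical fields, invented for this illustration.
    amount: float
    country: str
    was_fraud: bool


def fraud_frequency(history, matches):
    """Share of past transactions satisfying `matches` that were fraudulent.

    Returns (frequency, sample_size); frequency is None when nothing matches.
    """
    similar = [t for t in history if matches(t)]
    if not similar:
        return None, 0
    frauds = sum(t.was_fraud for t in similar)
    return frauds / len(similar), len(similar)


# Made-up history, purely for illustration.
history = [
    Transaction(900.0, "A", True),
    Transaction(950.0, "A", True),
    Transaction(920.0, "A", True),
    Transaction(910.0, "A", False),
    Transaction(40.0, "A", False),
    Transaction(35.0, "A", False),
    Transaction(930.0, "B", False),
    Transaction(60.0, "B", False),
]

# Narrow reference class: large transactions from country "A".
narrow = fraud_frequency(history, lambda t: t.country == "A" and t.amount > 500)
# Broad reference class: any large transaction.
broad = fraud_frequency(history, lambda t: t.amount > 500)

print(narrow)  # (0.75, 4)
print(broad)   # (0.6, 5)
```

Returning the sample size alongside the frequency is a deliberate choice here: a frequency based on four cases deserves far less trust than one based on four thousand.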

—

3. Models

Common in: financial engineering, risk modeling, environmental science

Structure:

Given our probabilistic model of this thing, when X occurs, the probability of Y occurring is 0.95.

X occurred.

Therefore Y is likely to occur (with high probability).

Example: “Given our multivariate Gaussian model of loan prices, when this loan defaults there is a 0.95 probability of this other loan defaulting.”

Example: "When we run the weather simulation model many times with randomization of the initial conditions, rain occurs tomorrow in that region 95% of the time."

Advantages: This technique can be used to make predictions in very complex scenarios (e.g. involving more variables than a human mind can take into account at once) as long as the dynamics of the systems underlying those scenarios are sufficiently well understood.

Flaws: This method hinges on the appropriateness of the model chosen; it may require a large amount of past data to estimate free model parameters, and may go haywire if modeling assumptions are unrealistic or suddenly violated by changes in the world. You may have to already understand the system deeply to be able to build the model in the first place (e.g. with weather modeling).
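
The weather example can be sketched in Python as a Monte Carlo loop. Everything below is an assumption made for illustration: `toy_weather_model` is a made-up rule with no meteorological basis, and the noise levels are arbitrary. What the sketch shows is the general pattern of rerunning a model with randomized initial conditions and counting how often Y occurs.

```python
import random


def toy_weather_model(humidity, pressure):
    """Stand-in for a real simulator: returns True if the toy model says rain.

    The rule below is invented for illustration only.
    """
    return humidity - 0.5 * (pressure - 1000.0) > 70.0


def rain_probability(humidity, pressure, runs=10_000, seed=0):
    """Estimate P(rain) by rerunning the model with perturbed initial conditions."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    rainy = 0
    for _ in range(runs):
        # Assumed measurement uncertainty around the observed values.
        h = humidity + rng.gauss(0.0, 5.0)
        p = pressure + rng.gauss(0.0, 4.0)
        if toy_weather_model(h, p):
            rainy += 1
    return rainy / runs


estimate = rain_probability(humidity=80.0, pressure=1002.0)
print(f"Estimated probability of rain: {estimate:.2f}")
```

The estimate is only as good as the model and the assumed noise: change either and the probability changes, which is exactly the weakness described above.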

—

4. Regression

Common in: machine learning, data science

Structure:

In prior data, as X1 and X2 increased, the likelihood of Y increased.

X1 and X2 are at high levels.

Therefore Y is likely to occur.

Example: “Height for children can be approximately predicted as an (increasing) linear function of age (X1) and weight (X2). This child is older and heavier than the others, so we predict he is likely to be tall.”

Example: "We've trained a neural network to predict whether a particular batch of concrete will be strong based on its constituents, mixture proportion, compaction, etc."

Advantages: This method can often produce accurate predictions for systems that you don't have much understanding of, as long as enough data is available to train the regression algorithm and that data contains sufficiently relevant variables.

Flaws: This method is often applied with simple assumptions (e.g. linearity) that may not capture the complexity of the inference problem, but very large amounts of data may be needed to apply much more complex models (e.g. to use neural networks, which are non-linear). Regression may also produce results that are hard to interpret: you may not really understand why it does a good job of making predictions.
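
The height example can be sketched as an ordinary least-squares fit with two predictors. This is a minimal illustration in plain Python, assuming made-up data and a fit via the normal equations (one common approach among several); the function names are hypothetical, not from any particular library.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique fit")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear(rows, targets):
    """Least-squares fit of target = b0 + b1*x1 + b2*x2 via the normal equations."""
    design = [[1.0, x1, x2] for x1, x2 in rows]
    k = 3
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, x1, x2):
    """Evaluate the fitted linear function at (x1, x2)."""
    b0, b1, b2 = coefs
    return b0 + b1 * x1 + b2 * x2


# Made-up data: (age in years, weight in kg) -> height in cm.
children = [(4, 16), (5, 19), (6, 20), (7, 24), (8, 25), (9, 30)]
heights = [102, 109, 114, 122, 127, 135]

coefs = fit_linear(children, heights)
# An older, heavier child than any in the data: predicted to be taller.
print(predict(coefs, 10, 32))
```

Note that the final prediction extrapolates beyond the ages and weights in the data, which is where a linearity assumption is most likely to fail.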

—

5. Bayesianism

Common in: the rationality community

Structure:

Given my prior odds that Y is true...

And given evidence X...

And given my Bayes factor, which is my estimate of how much more likely X is to occur if Y is true than if Y is not true...

I calculate that Y is far more likely to be true than to not be true (by multiplying the prior odds by the Bayes factor to get the posterior odds).

Therefore Y is likely to be true (with high probability).

Example: “My prior odds that my boss is angry at me were 1 to 4, because he’s angry at me about 20% of the time. But then he came into my office shouting and flipped over my desk, which I estimate is 200 times more likely to occur if he’s angry at me compared to if he’s not. So now the odds of him being angry at me are 200 * (1/4) = 50 to 1 in favor of him being angry.”

Example: "Historically, companies in this situation have 2 to 1 odds of defaulting on their loans. But then evidence came out about this specific company, and that evidence is 3 times more likely to show up for companies that end up defaulting than for those that don't. Hence the odds of it defaulting are now 6 to 1, since (2/1) * (3/1) = 6. That means there is roughly an 86% chance that it defaults, since 6/(6+1) ≈ 0.86."

Advantages: If you can do the calculations in a given instance, and have a sensible way to set your prior probabilities, this is probably the mathematically optimal framework to use for probabilistic prediction. For instance, if you have a belief about the probability of something and then gain some new evidence, it can be proven mathematically that Bayes's rule tells you what your new probability should be once that evidence is incorporated. In that sense, we can think of many of the other approaches on this list as (hopefully pragmatic) approximations of Bayesianism (sometimes good approximations, sometimes bad ones).

Flaws: It's sometimes hard to know how to set your prior odds, and it can be very hard in some cases to perform the Bayesian calculation. In practice, carrying out the calculation might end up relying on subjective estimates of the odds, which can be especially tricky to guess when the evidence is not binary (i.e. not of the form “happened” vs. “didn’t happen”), or if you have lots of different pieces of evidence that are partially correlated.

If you’d like to learn more about using Bayesian inference in everyday life, try our mini-course on The Question of Evidence. For a more math-oriented explanation, check out our course on Understanding Bayes’s Theorem.
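
The odds arithmetic used in the two examples of this section can be written out as a short Python sketch. The function names are made up for illustration, and `Fraction` is used only so the results stay exact.

```python
from fractions import Fraction


def posterior_odds(prior_odds, bayes_factor):
    """Odds form of Bayes's rule: posterior odds = prior odds * Bayes factor."""
    return prior_odds * bayes_factor


def odds_to_probability(odds):
    """Convert odds in favor (e.g. 6 means 6 to 1) into a probability."""
    return odds / (1 + odds)


# Boss example: prior odds 1 to 4, evidence 200 times likelier if he is angry.
boss = posterior_odds(Fraction(1, 4), 200)
print(boss)                               # 50, i.e. 50 to 1
print(float(odds_to_probability(boss)))   # 50/51, about 0.98

# Loan example: prior odds 2 to 1, Bayes factor 3.
loan = posterior_odds(Fraction(2, 1), 3)
print(loan)                               # 6, i.e. 6 to 1
print(float(odds_to_probability(loan)))   # 6/7, about 0.857
```

Working in odds rather than probabilities keeps the update to a single multiplication; the conversion back to a probability happens only at the end.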

—

6. Theories

Common in: psychology, economics

Structure:

Given our theory, when X occurs, Y occurs.

X occurred.

Therefore Y will occur.

Example: “One theory is that depressed people are most at risk for suicide when they are beginning to come out of a really bad depression. So as depression is remitting, patients should be carefully screened for potentially increasing suicide risk factors.”

Example: “A common theory is that when inflation rises, unemployment falls. Inflation is rising, so we should predict that unemployment will fall.”

Advantages: Theories can make systems far more understandable to the human mind, and can be taught to others. Sometimes even very complex systems can be pretty well approximated with a simple theory. Theories allow us to make predictions about what will happen while only having to focus on a small amount of relevant information, without being bogged down by thousands of details.

Flaws: It can be very challenging to come up with reliable theories, and often you will not know how accurate such a theory is. Even if it has substantial truth to it and is right often, there may be cases where the opposite of what was predicted actually happens, and for reasons the theory can’t explain. Theories usually only capture part of what is going on in a particular situation, ignoring many variables so as to be more understandable. People often get too attached to particular theories, forgetting that theories are only approximations of reality, and so pretty much always have exceptions.

—

7. Causes

Common in: engineering, biology, physics

Structure:

We know that X causes Y to occur.

X occurred.

Therefore Y will occur.

Example: “Rusting of gears causes increased friction, leading to greater wear and tear. In this case, the gears were heavily rusted, so we expect to find a lot of wear.”

Example: “This gene produces this phenotype, and we see that this gene is present, so we expect to see the phenotype in the offspring.”

Advantages: If you understand the causal structure of a system, you may be able to make many powerful predictions about it, including predicting what would happen in many hypothetical situations that have never occurred before, and predicting what would happen if you were to intervene on the system in a particular way. This contrasts with (probabilistic) models that may be able to accurately predict what happens in common situations, but perform badly at predicting what will happen in novel situations and in situations where you intervene on the system (e.g. what would happen to the system if I purposely changed X).

Flaws: It’s often extremely hard to figure out causality in a highly complex system, especially in “softer” or "messier" subjects like nutrition and the social sciences. Purely statistical information (even an infinite amount of it) is not enough on its own to fully describe the causality of a system; additional assumptions need to be added. Often in practice we can only answer questions about causality by running randomized experiments (e.g. randomized controlled trials), which are typically expensive and sometimes infeasible, or by attempting to carefully control for all the potential confounding variables, a challenging and error-prone process.

—

8. Experts

Common in: politics, economics

Structure:

This expert (or prediction market, or prediction algorithm) X is 90% accurate at predicting things in this general domain of prediction.

X predicts Y.

Therefore Y is likely to occur (with high probability).

Example: “This prediction market has been right 90% of the time when predicting recent baseball outcomes, and in this case predicts the Yankees will win.”

Advantages: If you can find an expert or algorithm that has been proven to make reliable predictions in a particular domain, you can simply use these predictions yourself without even understanding how they are made.

Flaws: We often don’t have access to the predictions of experts (or of prediction markets, or prediction algorithms), and when we do, we usually don’t have reliable measures of their past accuracy. What's more, many experts whose predictions are publicly available have no clear track record of performance, or even purposely avoid accountability for poor performance (e.g. by hiding past prediction failures and touting past successes).
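
If you do have a complete log of a forecaster's resolved predictions, including the failures, measuring the track record is straightforward. The sketch below assumes a hypothetical list of (predicted, actual) pairs; the teams and outcomes are invented.

```python
def track_record(resolved):
    """Return (hit_rate, n) for a list of (predicted, actual) outcome pairs."""
    if not resolved:
        return None, 0
    hits = sum(predicted == actual for predicted, actual in resolved)
    return hits / len(resolved), len(resolved)


# Hypothetical log of a forecaster's resolved baseball predictions.
resolved = [
    ("Yankees", "Yankees"), ("Red Sox", "Red Sox"), ("Mets", "Mets"),
    ("Dodgers", "Dodgers"), ("Cubs", "Cubs"), ("Yankees", "Yankees"),
    ("Astros", "Astros"), ("Giants", "Giants"), ("Braves", "Braves"),
    ("Mets", "Phillies"),
]

hit_rate, n = track_record(resolved)
print(hit_rate, n)  # 0.9 10
```

The number is only meaningful if the log is complete: a record that silently drops past misses will overstate accuracy.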

—

9. Metaphors

Common in: self-help, ancient philosophy, science education

Structure:

X, which is what we are dealing with now, is metaphorically a Z.

For Z, when W is true, then obviously Y is true.

Now W (or its metaphorical equivalent) is true for X.

Therefore Y is true for X.

Example: “Your life is but a boat, and you are riding on the waves of your experiences. When a raging storm hits, a boat can’t be under full sail. It can’t continue at its maximum speed. You are experiencing a storm now, and so you too must learn to slow down.”

Example: "To better understand the nature of gases, imagine tons of ping pong balls all shooting around in straight lines in random directions, and bouncing off of each other whenever they collide. These ping pong balls represent molecules of gas. Assuming the system is not inside a container, ping pong balls at the edges of the system have nothing to collide with, so they just fly outward, expanding the whole system. Similarly, the volume of a gas expands when it is placed in a vacuum."

Advantages: Our brains are good at understanding metaphors, so they can save us mental energy when we try to grasp difficult concepts. If the two items being compared in the metaphor are sufficiently alike in relevant ways, then the metaphor may accurately reveal elements of how its subject works.

Flaws: Z working as a metaphor for X doesn’t mean that all (or even most) predictions that are accurate for situations involving Z are appropriate (or even make any sense) for X. Metaphor-based reasoning can seem profound and persuasive even in cases when it makes little sense.

—

10. Similarities

Common in: the study of history, machine learning

Structure:

X occurred, and X is very similar to Z in properties A, B and C.

When things similar to Z in properties A, B, and C occur, Y usually occurs.

Therefore Y is likely to occur (with high probability).

Example: “This conflict is similar to the Gulf War in various ways, and from what we've learned about wars like the Gulf War, we can expect these sorts of outcomes.”

Example: “This data point (with unknown label) is closest in feature space to this other data point which is labeled ‘cat’, and all the other labeled points around that point are also labeled ‘cat’, so this unlabeled point should also likely get the label ‘cat’.”

Advantages: This approach can be applied at both small scale (with small numbers of examples) and at large scale (with millions of examples, as in machine learning algorithms), though of course large numbers of examples tend to produce more robust results. It can be viewed as a more powerful generalization of "frequencies"-based reasoning.

Flaws: In the history case, it is difficult to know which features are the appropriate ones to use to evaluate the similarity of two cases, and often the conclusions this approach produces are based on a relatively small number of examples. In the machine learning case, a very large amount of data may be needed to train the model (and it still may be unclear how to measure which examples are similar to which other cases, even with a lot of data). The properties you're using to compare cases must be sufficiently relevant to the prediction being made for it to work.
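
The 'cat' example in this section corresponds to nearest-neighbor classification. Here is a minimal Python sketch, assuming Euclidean distance as the similarity measure and made-up two-dimensional features; `knn_label` is a hypothetical helper, not a library function.

```python
import math
from collections import Counter


def knn_label(query, labeled_points, k=3):
    """Label `query` by majority vote among its k nearest labeled neighbors.

    `labeled_points` is a list of (features, label) pairs; distance is Euclidean.
    """
    nearest = sorted(labeled_points, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up 2-D feature vectors, purely for illustration.
labeled = [
    ((1.0, 1.2), "cat"), ((0.8, 1.0), "cat"), ((1.1, 0.9), "cat"),
    ((4.0, 4.2), "dog"), ((4.3, 3.9), "dog"), ((3.8, 4.1), "dog"),
]

print(knn_label((1.0, 1.0), labeled))  # cat
print(knn_label((4.1, 4.0), labeled))  # dog
```

The choice of distance function is where the "which features matter" question hides: if the features are not relevant to the label, closeness in this space tells you little.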

—

11. Anecdotes

Common in: daily life

Structure:

In this handful of examples (or perhaps even just one example) where X occurred, Y occurred.

X occurred.

Therefore Y will occur.

Example: “The last time we took that so-called 'shortcut' home, we got stuck in traffic for an extra 45 minutes. Let's not make that mistake again.”

Example: “My friend Bob tried that supplement and said it gave him more energy. So maybe it will give me more energy too."

Advantages: Anecdotes are simple to use, and a few of them are often all we have to work with for inference.

Flaws: Unless we are in a situation with very little noise/variability, a few examples likely will not be enough to accurately generalize. For instance, a few examples are not enough to make a reliable judgement about how often something occurs.
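
One way to see how little a handful of examples pins down is to compute an interval estimate for the underlying frequency. The sketch below uses the Wilson score interval, a standard statistical tool that is brought in here for illustration and is not part of the method described above; it compares 3 out of 3 anecdotes with 300 out of 300 records.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z=1.96)."""
    if n == 0:
        raise ValueError("need at least one observation")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


# Y followed X in 3 out of 3 anecdotes versus 300 out of 300 records.
print(wilson_interval(3, 3))
print(wilson_interval(300, 300))
```

With three anecdotes the lower end of the interval sits well below one half, while with three hundred records it sits close to 1, even though Y followed X every single time in both cases.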

—

12. Intuition

Common in: daily life

Structure:

X occurred.

My intuition (that I may have trouble explaining) predicts that when X occurs, Y is true.

Therefore Y is true.

Example: “The tone of voice he used when he talked about his family gave me a bad vibe. My feeling is that anyone who talks about their family with that tone of voice probably does not really love them.”

Example: "I can't explain why, but I'm pretty sure he's going to win this election."

Advantages: Our intuitions can be very well honed in situations that we’ve encountered many times and received feedback on (i.e. where we got some sort of answer about how well our intuition performed). For instance, a surgeon who has conducted thousands of heart surgeries may have very good intuitions about what to do during surgery, or about how the patient will fare, and those intuitions may be very accurate even if she can't easily articulate them.

Flaws: In novel situations, or in situations where we receive no feedback on how well our instincts are performing, our intuitions may be highly inaccurate (even though we may not feel any less confident about our correctness).

If you'd like to know more about when intuition is reliable, try our 7-question guide to determining when you can trust your intuition.