Updated: Mar 6
You likely have heard the phrase "replication crisis." It refers to the grim fact that, in a number of fields of science, when researchers attempt to replicate previously published studies, they fairly often don't get the same results. The magnitude of the problem depends on the field, but in psychology, it seems that something like 40% of studies in top journals don't replicate. We've been tackling this crisis with our new Transparent Replications project, and this post explains one of our key ideas.
Replication failures are sometimes simply due to bad luck, but more often, they are caused by p-hacking - the use of fishy statistical techniques that lead to statistically significant (but misleading or erroneous) results. As big a problem as p-hacking is, there is another substantial problem in science that gets talked about much less. Although certain subtypes of this problem have been named previously, to my knowledge, the problem itself has no name, so I'm giving it one: "Importance Hacking."
Academics want to publish in the top journals in their field. To understand Importance Hacking, let's consider a (slightly oversimplified) list of the three most commonly-discussed ways to get a paper published in top psychology journals:
1. Conduct valuable research - make a genuinely interesting or important discovery, or add something valuable to the state of scientific knowledge. This is, of course, what just about everyone wants to do, but it's very, very hard!
2. Commit fraud - for instance, by making up your data. Thankfully, very few people are willing to do this because it's so unethical. So this is by far the least used approach.
3. p-hack - use fishy statistics, HARKing (i.e., hypothesizing after the results are known), selective reporting, hidden researcher degrees of freedom, etc., in order to get a p<0.05 result that is actually just a false positive. This is a major problem and the focus of the replication crisis. Of course, false positives can also come about without fault, due to bad luck.
But here is a fourth way to get a paper published in a top journal: Importance Hacking.
4. Importance Hack - get a result that is actually not interesting, not important, and not valuable, but write about it in such a way that reviewers are convinced it is interesting, important, and/or valuable, so that it gets published.
For research to be valuable to society (and, in an ideal world, publishable in top journals), it must be true AND interesting (or important, useful, etc.). Researchers sometimes p-hack their results to skirt around the "true" criterion (by generating interesting false positives). On the other hand, Importance Hacking is a method for skirting the "interesting" criterion.
Importance Hacking is related to concepts like hype and overselling, though hype and overselling are far more general. Importance Hacking refers specifically to a phenomenon whereby research with little to no value gets published in top journals due to the use of strategies that lead reviewers to misinterpret the work. On the other hand, hype and overselling are used in many ways in many stages of research (including to make valuable research appear even more valuable).
One way to understand importance hacking is by comparing it to p-hacking. P-hacking refers to a set of bad research practices that enable researchers to publish non-existent effects. In other words, p-hacking misleads paper reviewers into thinking that non-existent effects are real. Importance Hacking, on the other hand, encompasses a different set of bad research practices: those that lead paper reviewers to believe that real (i.e., existent) results that have little to no value actually have substantial value.
This diagram illustrates how I think Importance Hacking interferes with the pipeline of producing valuable research:
There are a number of subtypes of Importance Hacking based on the method used to make a result appear interesting/important/valuable when it's not. Here is how I subdivide them:
Types of Importance Hacking
1. Hacking Conclusions: make it seem like you showed some interesting thing X but actually show something else (X′) which sounds similar to X but is much less interesting/important. In these cases, researchers do not truly find what they imply they have found. This phenomenon is also closely connected with validity issues.
Example 1: showing X is true in a simple video game but claiming that X is true in real life.
Example 2: showing A and B are correlated and claiming that A causes B (when really A and B are probably both caused by some third factor C, which makes the finding much less interesting).
Example 3: a researcher claims to be measuring “aggression” and couches all conclusions in these terms, but is actually measuring the milliliters of hot sauce that a person puts in someone else's food. The conclusions about aggression are valid only insofar as the amount of hot sauce is a valid measure of aggression.
Example 4: some types of hacking conclusions would fall under the terms "overclaiming" or "overgeneralizing;" Tal Yarkoni has a relevant paper called The Generalizability Crisis.
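To make Example 2 concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the variable names, the sample size, and the choice to generate A and B as a shared factor C plus independent noise. It shows that A and B can be clearly correlated even though neither causes the other, and that the correlation roughly vanishes once C is accounted for.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(r_ab, r_ac, r_bc):
    """Correlation of A and B after controlling for C (first-order partial correlation)."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


# Hypothetical data: C drives both A and B; A has no effect on B at all.
rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 20_000
c = [rng.gauss(0, 1) for _ in range(n)]
a = [ci + rng.gauss(0, 1) for ci in c]
b = [ci + rng.gauss(0, 1) for ci in c]

r_ab = pearson(a, b)
r_partial = partial_corr(r_ab, pearson(a, c), pearson(b, c))

print(f"raw correlation between A and B: {r_ab:.3f}")
print(f"correlation after controlling for C: {r_partial:.3f}")
```

In this setup the raw correlation comes out around 0.5 while the partial correlation is close to zero. A paper that reported only the first number and described it as "A affects B" would be hacking its conclusion.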
2. Hacking Novelty: refer to something in a way that makes it seem more novel or unintuitive than it is. Perhaps the result is already well known or is merely what just about everyone's common sense would already tell them is true. In these cases, researchers really do find what they claim to have found, but what they found is not novel (despite them making it seem so). Hacking Novelty is also connected to the "Jingle-jangle" fallacy - where people can be led to believe two identical concepts are different because they have different names (or, more subtly, because they are operationalized somewhat differently).
Example 1: showing something that is already well-known but giving it a new name that leads people to think it is something new. The concept of “grit” has received this criticism; some people claim it could turn out to be just another word for conscientiousness (or already known facets of conscientiousness) - though this question does not yet seem to be settled (different sides of this debate can be found in these papers: 1, 2, 3 and 4).
Example 2: showing that A and B are correlated, which seems surprising given how the constructs are named, but if you were to dig into how A and B were measured, it would be obvious they would be correlated.
Example 3: showing a common-sense result that almost everyone already would predict but making it seem like it's not obvious (e.g., by giving it a fancy scientific name).
3. Hacking Usefulness: make a result seem useful or relevant to some important outcome when in fact, it's useless and irrelevant. In these cases, researchers find what they claim to have found, but what they find is not useful (despite them making it sound useful).
Example: focusing on statistical significance when the effect size is so small that the result is useless. Clinicians often distinguish between “statistical significance” and “clinical significance” to highlight the pitfalls of ignoring effect sizes when considering the importance of a finding.
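A small worked calculation may help show how a result can be statistically significant and still useless. The sketch below is illustrative only: the effect size (0.02 standard deviations) and the sample sizes are made-up numbers, and it assumes a simple two-sample z-test with equal group sizes and a known standard deviation of 1.

```python
import math


def two_sample_p_value(d, n_per_group):
    """Two-sided p-value of a z-test for a standardized mean difference d
    between two equally sized groups (assumes known unit variance)."""
    z = d * math.sqrt(n_per_group / 2)
    return math.erfc(abs(z) / math.sqrt(2))


def prob_of_superiority(d):
    """Chance that a random person from the treated group outscores a random
    person from the control group, assuming normal scores with equal variance."""
    return 0.5 * math.erfc(-d / 2)


effect_size = 0.02  # hypothetical: a fiftieth of a standard deviation

p_small = two_sample_p_value(effect_size, n_per_group=100)
p_large = two_sample_p_value(effect_size, n_per_group=100_000)
overlap = prob_of_superiority(effect_size)

print(f"p-value with 100 per group:     {p_small:.3f}")
print(f"p-value with 100,000 per group: {p_large:.2e}")
print(f"probability of superiority:     {overlap:.3f}")
```

The same tiny effect is nowhere near significant with 100 people per group but easily clears p<0.05 with 100,000 per group. The probability of superiority stays just above 0.5 either way, meaning a treated person does better than an untreated person barely more often than a coin flip would predict.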
4. Hacking Beauty: make a result seem clean and beautiful when in fact, it's messy or hard to interpret. In these cases, researchers focus on certain details or results and tell a story around those, but they could have focused on other details or results that would have made the story less pretty, less clear-cut, or harder to make sense of. This is related to Giner-Sorolla’s 2012 paper "Science or art: How aesthetic standards grease the way through the publication bottleneck but undermine science." Hacking Beauty sometimes reduces to selective reporting of some kind (i.e., selective reporting of measures, analyses, or studies), or at least to selective focus on certain findings and not others. This becomes more difficult with pre-registration: if you have to report the results of planned analyses, there’s less room to make them look pretty (you could just say they’re pretty, but that seems like overclaiming).
Example: emphasizing the parts of the result that tell a clean story while not including (or burying somewhere in the paper) the parts that contradict that story.
Science faces multiple challenges. Over the past decade, the replication crisis and subsequent open science movement have greatly increased awareness of p-hacking as a problem. Measures have begun to be put in place to reduce p-hacking. Importance Hacking is another substantial problem, but it has received far less attention.
If a pipe is leaking from two holes and its pressure is kept fixed, then repairing one hole will result in the other one leaking faster. Similarly, as best practices for reducing p-hacking become commonplace, Importance Hacking may become more frequent so long as the career pressures to publish in top journals don't let up.
It's time to start the conversation about how Importance Hacking can be addressed.
If you're interested in learning more about Importance Hacking, you can listen to psychology professor Alexa Tullett and me discussing it on the Clearer Thinking podcast (there, I refer to it as "Importance Laundering," but I now think "Importance Hacking" is a better name) or me talking about it on the Two Psychologists Four Beers podcast. We also discuss my new project, Transparent Replications, which conducts rapid replications of recently published psychology papers in top journals in an effort to shift incentives and create more reliable, replicable research. If you enjoyed this article, you may be interested in checking out our replication reports and learning more about the project.