Study Report: Is Personality 4, 5, or 6-Dimensional?

Markus Over
5 days ago
29 min read

Note: This is a longer and more technical report of our study into personality traits. If you want to see the shorter, more layperson-friendly version, click here.

There's a debate that has raged in academic journals and among personality researchers about the nature of humans: how many dimensions does it take to best represent a person's personality? Or, put another way: how many attributes would you have to score someone on to have a good sense of what they are like? We've collected new data to help answer these questions.

We also made a video on the subject, which you can watch here:

Before we dig into the details, imagine you're introduced to two new coworkers on the same day, let's call them Sarah and Dave.

Sarah walks into the office with enthusiasm and energy. Within five minutes, she knows everyone's name and is volunteering to present the team update. Dave, on the other hand, hangs back, listens closely, takes some time to think, and later gets back to you with a well-structured text message.

Just based on these brief interactions, you can probably make some reasonable predictions about how Sarah and Dave may react to certain situations - perhaps, that Sarah may be energized by a busy open office, while Dave will do better with deep tasks that he can work on independently. And you would have a decent chance (albeit no guarantees) of being right.

This is the core promise of personality psychology: that people differ in stable, measurable ways, and that those differences are informative. If we can describe a person using a handful of traits, we get a compressed representation of who they are. It is imperfect, but useful for understanding them and anticipating, for example, how they might react to stressful situations, how they relate to others, and what they'd enjoy or avoid.

But what is that "handful of traits" that you would need to understand to have a sense of what Sarah and Dave (or anyone else) are really like? And how many such traits are there? You might think that people are so complex that this whole endeavor is futile, and you'd need an unlimited number of traits to understand them. Or you might think that people are simple, and once we know a few things about a person, that's enough to understand them pretty well. How can we tell which is right?

You may have heard of two of the most popular personality models out there. The Big Five model (sometimes called OCEAN) says there are five broad dimensions of personality - once you know these, you know much of what makes our personalities different from each other. The Myers-Briggs Type Indicator (MBTI, sometimes called 16 personalities), on the other hand, is a four-dimensional model. It says that to get a good grasp on a person, it's enough to score them on each of four dichotomies.

Note that if you're interested in either of these, you can use our free personality test tool, where you'll get your own results for your Big Five and MBTI-style traits:

Take Our Personality Test

We previously investigated whether MBTI-style tests have enough dimensions by looking at how accurately one can predict "outcomes" about a person using the four dimensions of an MBTI-style test, compared to the five dimensions of the Big Five. When we attempted to predict 37 different outcomes (e.g., to what extent a person exercises, is satisfied with their life, has many close friends, etc.), the four MBTI-style dimensions just didn't cut it.

Results from our prior research: Big Five achieved the best predictive accuracy of the personality models we compared. Dropping the Neuroticism trait from Big Five (leaving us with the "Big Four", with four dimensions that are somewhat analogous to the four MBTI-style dimensions, whereas neuroticism is not meaningfully correlated with them) performed slightly worse - but still better than our MBTI-style scores-based model, indicating that the four MBTI-style dimensions we used were somewhat less predictive than our Big Four traits. Mapping our MBTI-style scores to binary categories, as is commonly done in MBTI-style tests, led to even worse predictiveness.

See how we lose a substantial amount of accuracy if we try to predict outcomes about people using the three different four-factor models compared to the full Big Five. So, even though this is far from a conclusive test, it lends some support to the view that five factors seem to be a better match for modeling human personality than four. But even if this were true, it raises another question: why stop at five? The field of academic psychology actually has long had a different contender competing with Big Five as the leading personality model, which we hadn't conducted our own research on before: HEXACO.

HEXACO looks very similar to Big Five, but it argues that there should be a sixth dimension (Honesty-Humility) in addition to the five that Big Five proposes (Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism). Here's an overview of the factors in both models, as well as a breakdown into what each factor describes:

Trait	Big Five	HEXACO
1) Openness to Experience	Conformist, Exploratory, Intellectual, Logical, Quick-minded	Aesthetic Appreciation, Inquisitiveness, Creativity, Unconventionality
2) Conscientiousness	Dependable, Hard working, Organized, Perfectionistic, Planning, Pragmatic, Rule-abiding, Spontaneous	Organization, Diligence, Perfectionism, Prudence
3) Extraversion	Attention seeking, Conversationalist, Group-oriented, Leader-like, Socially energized, Spontaneous	Social Self-Esteem, Social Boldness, Sociability, Liveliness
4) Agreeableness	Altruistic, Conflict-avoidant, Emotional, Empathetic, Logical	Forgiveness, Gentleness, Flexibility, Patience
5) Neuroticism (Big Five) / Emotionality (HEXACO)	Anxious, Demanding, Depressive, Emotional	Fearfulness, Anxiety, Dependence, Sentimentality
6) Honesty-Humility (HEXACO only)	(none)	Sincerity, Fairness, Greed Avoidance, Modesty

This table shows a breakdown of how the different factors of Big Five and HEXACO were operationalized in our research. Each trait is associated with a set of more specific lower-level characteristics, which helps clarify what a label such as “Agreeableness” means in this context. Note that while most of these "sub-traits" are aligned with their trait (meaning that scoring high on one would be associated with scoring high on the other), there are a few exceptions here (italicized in the table), where the sub-trait points in the opposite direction. For instance, being conformist would be an indicator of low openness to experience, whereas being exploratory would indicate high openness. More context on the interpretation of this trait breakdown can be found in the appendix.

Over the last few decades, the disagreement over whether there are 5 or 6 factors of personality has intensified. It's shaped how researchers build questionnaires, how they model the data, and how they think about the underlying structure of personality. This article's focus will be not only on comparing Big Five and HEXACO, but also on the essential question that underlies this discussion: is personality inherently five-dimensional, six-dimensional, or perhaps something entirely different? We collected data to directly put these questions to the test.

Big Five's Origin Story

Before we enter that heated debate ourselves, let's start by taking a closer look at the Big Five and how it came about. Where did the idea of the five factors of personality come from in the first place? Was it just some committee that thought five is a fine number and then brainstormed some nice-sounding traits? As we'll see, there's much more to the story than that.

The starting point was the lexical hypothesis, which is the claim that if a personality difference matters, languages will eventually invent words for it. If the concepts of being "reliable," "rude," "bold," or "petty" are useful distinctions to make about people, we'll end up encoding them in vocabulary. This way, dictionaries become large datasets of trait concepts. Early researchers took that idea seriously enough to catalog enormous lists of personality-relevant adjectives.

But of course, you can't just stare at 2,000 adjectives and hope wisdom emerges. So researchers did something more pragmatic: they asked lots of people to rate themselves (or others) in terms of which adjectives applied to them, and then used a statistical method known as factor analysis to compress the chaos. You can think of factor analysis as a way to find hidden, underlying patterns that explain which traits are related. For instance, the data would show that if someone is rated as "talkative", they're also more likely to be rated as "energetic" and "friendly" and less likely to be rated as "reserved". In other words, the adjectives naturally group together. Factor analysis tries to find the few underlying "dials" that explain as many of these correlations as possible with as few numbers as possible.

Factor analysis doesn't "discover" labels like Extraversion or Agreeableness by itself. It just tells you in a more abstract way what adjectives tend to group together. At that point, you still have to do some manual work by asking, "What do these have in common?" If the terms that come together are talkative, energetic, and assertive, you might name that factor "Extraversion." If they're organized, disciplined, and reliable, you might call it "Conscientiousness." This step is partly interpretive and subjective, which is why researchers often compare naming schemes across studies, argue about what a factor "really" captures, and care a lot about whether the same bundle of "high-loading" traits (i.e., those that correlate strongly with the factor) keeps reappearing in new datasets. In other words, the math gives you the groupings based on what correlates, and humans must then assign meaning to those groups and find good, representative labels for the clusters identified by factor analysis. Each of the groupings is actually a spectrum capturing the full range of a trait - for instance, the "extraversion" grouping encompasses the full spectrum from introverted to extroverted. Here's a diagram to give you a more intuitive sense of how this process works:

Schematic visualization of how one might go about creating a personality model like Big Five: You survey a lot of people and ask them to what degree a varied list of personality-related adjectives applies to them, or somebody they know (the black lines on the left). You can then use this survey data to run a factor analysis algorithm to figure out five (or any other number of) "hidden dimensions", which is an abstract representation of five potential traits (here A, B, C, D, E) that do a good job at compressing the data from your survey. This is followed by a manual step where these five abstract factors are analyzed, and you try to find a representative label for each one, based on what types of adjectives load most strongly onto that factor (the colored lines).
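To make the "survey, then factor analysis, then labeling" recipe a bit more tangible, here is a minimal sketch on made-up data. It is not the procedure used in any particular study: it extracts two components from the correlation matrix (a simpler stand-in for a full factor analysis) and applies a textbook varimax rotation. The adjective names, the two hidden "dials", and all loadings are assumptions chosen purely for illustration.

```python
import numpy as np

def varimax(loadings, n_iter=100, tol=1e-8):
    """Textbook varimax rotation: rotate loadings so each item loads mainly on one factor."""
    n_items, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(n_iter):
        rotated = loadings @ rotation
        target = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / n_items
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if criterion != 0.0 and new_criterion / criterion < 1 + tol:
            break
        criterion = new_criterion
    return loadings @ rotation

rng = np.random.default_rng(0)
n_people = 2000

# Two hidden "dials" (hypothetical; in real data these are unknown).
latent = rng.normal(size=(n_people, 2))

adjectives = ["talkative", "energetic", "reserved", "organized", "disciplined", "reliable"]
true_loadings = np.array([
    [0.8, 0.0],
    [0.7, 0.0],
    [-0.7, 0.0],  # "reserved" points in the opposite direction
    [0.0, 0.8],
    [0.0, 0.7],
    [0.0, 0.6],
])
# Simulated self-ratings: hidden dials plus item-specific noise.
ratings = latent @ true_loadings.T + rng.normal(scale=0.6, size=(n_people, 6))

# Extract two components from the correlation matrix and rotate them.
corr = np.corrcoef(ratings, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
top = np.argsort(eigvals)[::-1][:2]
loadings = eigvecs[:, top] * np.sqrt(eigvals[top])
rotated = varimax(loadings)

# Which abstract factor does each adjective load on most strongly?
strongest = np.abs(rotated).argmax(axis=1)
for name, row, factor in zip(adjectives, rotated, strongest):
    print(f"{name:12s} loadings={np.round(row, 2)} -> factor {factor}")
```

The output only says "these adjectives move together"; deciding that one group deserves the label "Extraversion" and the other "Conscientiousness" remains the manual, interpretive step described above.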

But why five factors, and not four, or six, or fifteen? Factor analysis will give you any number of factors that you choose. But you can then investigate those factors and ask follow-up questions to see how useful each is, including:

How much of the variability in people's responses to all the personality questions does each factor account for?
To what extent do you get the same or different factors if you use different wordings of the questions or ask the questions of different populations?

Five wasn't a random number picked out of a hat. The reason that researchers converged on five factors is that in study after study, a solution with about five big clusters kept showing up as particularly stable across many different datasets: Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism. And these five factors accounted for a lot of the variation in how people respond to all the personality questions asked.

More recently, however, some researchers argued that going with a six-factor model may be similarly stable across datasets, while capturing more nuances about people's personality.

Enter HEXACO

Big Five, as described above, came out of the idea that important personality differences leave footprints in language, but its origins are mostly in the English language. So, a natural follow-up question is: what happens if you stop relying mostly on English? Researchers eventually did exactly that - they repeated the same basic recipe (trait words → lots of ratings → factor analysis) across a wider range of languages. And what some researchers found is that a particular solution with six broad clusters often fits the data at least as cleanly as a five-factor solution. Later research suggested that this six-factor solution may actually hold up similarly well in English datasets after all, indicating that prior research may have too quickly settled on a five-factor model.

In the early 2000s, Kibeom Lee and Michael C. Ashton took this recurring sixth factor and integrated it into a full-fledged model: HEXACO (an acronym for the six factors being described). The five dimensions from Big Five are, for the most part, still present, with a new dimension emerging, which they labeled Honesty-Humility.

High Honesty-Humility involves sincerity, fairness, and a low appetite for exploiting others; low Honesty-Humility leans more toward manipulativeness, entitlement, arrogance, and a willingness to bend rules when it pays. Combined with other HEXACO factors, this yields:

Honesty-Humility,
Emotionality (which is similar to what the Big Five calls Neuroticism),
eXtraversion,
Agreeableness,
Conscientiousness, and
Openness.

Lee and Ashton (and colleagues) built questionnaires that measure these six traits in a structured way, including smaller subcomponents sometimes called facets, four of which they assigned to each of the six traits. You can think of these facets as the "sub-dials" underneath each big dial. Conscientiousness, for instance, was broken down into the narrower tendencies Organization, Diligence, Perfectionism, and Prudence. Facets are useful because they let a model stay compact at the top level while still saying something more specific when you zoom in. Big Five, too, is often broken down further into facets - but which exact facets these are depends on the concrete "version" of Big Five, as different researchers have come up with slightly different interpretations.

At first glance, HEXACO can look like "Big Five plus one", and in some ways, it basically is. If you line the two up, you get a very familiar mapping: Openness, Conscientiousness, and Extraversion land in almost the same territory in both models. Agreeableness is also there in both, but it is one of the places where the borders don't quite match. And Big Five's Neuroticism maps only partially onto HEXACO's Emotionality.

Simplified schematic depiction of how Big Five traits relate to HEXACO, based on our understanding of the literature and incorporating feedback from an expert academic. Openness, Conscientiousness, and Extraversion remain relatively unchanged. Agreeableness exists on both sides as a label, but part of it is split off into the new Honesty-Humility dimension. Evidently, there is a lot of movement between the Big Five traits Neuroticism and Agreeableness, and the HEXACO traits Honesty-Humility, Emotionality, and Agreeableness. Note that this is a simplification, and in reality, the relation between the two models isn't quite as clean, and the details even depend on which exact "version" of the two models one uses.

A concrete example: In the NEO-PI-R version of the Big Five model, Neuroticism includes an 'angry hostility' facet, meaning anger is treated as part of what it means to be emotionally unstable. HEXACO, in contrast, tends to push anger and irritability more toward low Agreeableness, while Emotionality focuses more on things like anxiety, fearfulness, and sentimentality. In other words, HEXACO doesn't merely add Honesty-Humility, it also rearranges how some other traits are categorized.

But is the six-factor model genuinely a better "compression" of personality, or is it just another reasonable way of rotating and carving up the same underlying space? Are we seeing a truly new dimension of personality, or are we mostly seeing a relabeling and reshuffling of content that Big Five already captures?

Depending on who you ask, HEXACO is either the obvious next step after Big Five, or just an alternative framing that is not clearly superior in practice, giving up some of Big Five's simplicity for little gain.

We set out to gather our own data to see whether it supports the Big Five or HEXACO in this debate.

But before we tell you what we found, it's useful to consider what one might expect to find.

A reasonable starting assumption before you've collected any data might be something like this: humans are complex, and you can probably find many partially independent personality factors if you go looking. The more factors of personality you include, the more you'll be able to predict about a person - but with diminishing returns. By definition, you'd expect the most predictive factor of personality to be more predictive than the second most useful factor - but that using the first AND second factor would allow for better predictions than just the first factor alone. Carrying this logic forward, the 6th dimension should help a bit less than the 5th, the 7th a bit less than the 6th, and so on, with each new dial adding some additional predictive accuracy, but less and less each time. So we might expect something like this, where the height of each bar shows how much variation in people's responses to a wide range of personality questions would be attributable to each factor:

Hypothetical/idealized chart showing what one might reasonably expect about personality models like Big Five and HEXACO: each additional dimension in your model helps predict people more accurately, but each extra dimension is a little less informative than the previous ones.

If this starting assumption holds, then HEXACO should have a modest but noticeable edge over the Big Five across a wide range of outcomes. Not a dramatic leap, but a smooth, sensible improvement from allowing one extra number in your personality "compression".

But if that does turn out to be true, we run into an awkward question: why stop at five or six dimensions at all? Why not four, or seven, or twenty? Any point you stop at would be arbitrary, since you could always go one factor further to get a bit more accuracy.

In fact, why not skip the whole abstraction step and just use the full raw questionnaire - every single question out of the 100 or more that your personality test may entail - as the basis for your predictions? If personality modeling ended up being a smooth version of "more dimensions = better predictions", then there would be no natural stopping point, only a trade-off between complexity and accuracy.

While this starting assumption seems like a very reasonable one for the way personality works, interestingly, it is not at all what we found!

Our Study

We'll briefly explain where we started and what our research looked like. But if you just want to know the results, feel free to skip ahead to the Results section below.

When we started our Big Five vs HEXACO investigation, we already had a live Big Five test from our Ultimate Personality Test tool (you can take it for free!), and we also had an ongoing side project to find what's missing from the Big Five: a systematic attempt to map out personality-relevant sub-traits that might not be captured well by the standard five-factor framing.

Concretely, for that "missing in Big Five" study, we had identified 36 candidate sub-traits that seemed plausibly underrepresented in Big Five, and wrote five new survey items for each of them. That gave us 180 new questions. Combined with the 116 items already in our live test, this gave us 296 items in total that we considered for our HEXACO test. Some of the new sub-traits were directly inspired by HEXACO's Factor H (Honesty-Humility) - things like Integrity, Non-Stealing (Fairness), Narcissism, Manipulativeness, Ethical Non-Conformity, and Callousness. This first step helped to make sure we gave HEXACO a fair chance: if it really captures something Big Five misses, our item pool needed to actually contain that content.

Next, we constructed HEXACO scales from this pool in a way that was both principled and comparable to our Big Five test. We began by removing 79 items that didn't fit any of the HEXACO traits (spirituality items are a good example). For the remaining 217 items, we manually assigned each item to the HEXACO factor - and sub-facet - it seemed to reflect best.

Then we did a sanity check and ran a factor analysis with six factors (using an oblique rotation method called Oblimin, which allows the resulting factors to correlate with each other) to check whether we could replicate a HEXACO-like structure. And reassuringly, the six factors that emerged lined up well with the six HEXACO traits. After that, we went back item-by-item and verified that our manual assignments made sense statistically - i.e., that an item we labeled as "Agreeableness" actually loaded most strongly on the Agreeableness factor, rather than somewhere else.
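The item-by-item check described above can be sketched roughly as follows. This is an illustration under assumptions, not our actual analysis code: the loadings matrix is invented, the helper name `find_mismatches` is hypothetical, and we assume the six extracted factors have already been matched to the HEXACO trait names (a factor analysis itself returns unlabeled factors).

```python
import numpy as np

# Assumed column order of the (already labeled) factors.
factor_names = ["H", "E", "X", "A", "C", "O"]

def find_mismatches(loadings, manual_labels, names):
    """Return indices of items whose strongest absolute loading is not on the manually assigned factor."""
    strongest = np.abs(loadings).argmax(axis=1)
    return [i for i, (s, label) in enumerate(zip(strongest, manual_labels))
            if names[s] != label]

# Invented loadings for four items (rows) on six factors (columns).
example_loadings = np.array([
    [0.62, 0.05, 0.01, 0.10, 0.03, 0.02],    # clearly Honesty-Humility
    [0.04, 0.08, 0.02, 0.55, 0.01, 0.06],    # clearly Agreeableness
    [0.07, 0.48, 0.03, 0.21, 0.02, 0.01],    # labeled A below, but loads on E
    [0.02, 0.03, -0.58, 0.04, 0.09, 0.05],   # negatively keyed Extraversion item
])
manual_labels = ["H", "A", "A", "X"]

mismatches = find_mismatches(example_loadings, manual_labels, factor_names)
print("Items to re-examine:", mismatches)
```

Using the absolute loading matters here: a reverse-keyed item loads strongly but negatively on its own factor, and should still count as a match.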

Finally, we wanted to make sure the comparison was fair in that both questionnaires should have a similar number of questions - otherwise, one could argue that the difference in how much detail was captured by the survey might give one of the two models an advantage independent of the underlying models. Our live Big Five test uses 102 scored items across five traits (some of the original 116 questions were not assigned to any Big Five traits), so for our final HEXACO test, we also used exactly 102 items. With six traits, that works out to 17 items per trait. We selected those items by prioritizing strong, clean loadings (high correlation with one factor, low cross-loading with others), while also keeping each trait internally diverse. For example, HEXACO Agreeableness has multiple facets (like Forgiveness, Gentleness, Flexibility, and Patience), and we preferred a slightly weaker item that covered a missing facet over a slightly stronger item that would have skewed the scale towards one of the facets.
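As a rough illustration of the selection logic (strong, clean loadings first, but never at the cost of leaving a facet uncovered), here is one possible greedy sketch. The scoring rule, the function name `select_items`, and the example items are all assumptions made for this illustration; the actual selection involved manual judgment and may have weighed things differently.

```python
from collections import defaultdict

def select_items(items, per_trait):
    """Pick `per_trait` items per trait: first the cleanest item of each facet, then the cleanest remaining items."""
    def cleanliness(item):
        # Assumed score: strong primary loading, penalized by the largest cross-loading.
        return abs(item["primary"]) - abs(item["cross"])

    by_trait = defaultdict(list)
    for item in items:
        by_trait[item["trait"]].append(item)

    selected = {}
    for trait, pool in by_trait.items():
        ranked = sorted(pool, key=cleanliness, reverse=True)
        chosen, covered_facets = [], set()
        # Pass 1: make sure every facet is represented if there is room.
        for item in ranked:
            if item["facet"] not in covered_facets and len(chosen) < per_trait:
                chosen.append(item)
                covered_facets.add(item["facet"])
        # Pass 2: fill the remaining slots with the cleanest leftover items.
        for item in ranked:
            if len(chosen) >= per_trait:
                break
            if item not in chosen:
                chosen.append(item)
        selected[trait] = [item["id"] for item in chosen]
    return selected

# Invented example: two strong Forgiveness items and one weaker Patience item.
example_items = [
    {"id": "a1", "trait": "Agreeableness", "facet": "Forgiveness", "primary": 0.80, "cross": 0.10},
    {"id": "a2", "trait": "Agreeableness", "facet": "Forgiveness", "primary": 0.75, "cross": 0.10},
    {"id": "a3", "trait": "Agreeableness", "facet": "Patience", "primary": 0.50, "cross": 0.20},
]
picked = select_items(example_items, per_trait=2)
print(picked)
```

With two slots available, the weaker Patience item is preferred over the second Forgiveness item, which mirrors the trade-off described above.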

Overview of how our different studies relate, and how we ultimately ended up with the 149 items in our study on Big Five and HEXACO.

The 102 Big Five questions and the 102 HEXACO questions had some overlap, with 55 questions being shared, yielding a total of 149 unique questions. And this questionnaire, which would then be answered by our 343 study participants, allowed us to finally test the two models head-to-head, and thereby contribute our own findings to the question: can human personality be reasonably described as "five-dimensional" or "six-dimensional"?
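For readers who want to double-check the item counts, the bookkeeping from the last few paragraphs can be written out as a few lines of arithmetic. All numbers are taken from the text above; only the variable names are ours.

```python
# Item pool construction
candidate_sub_traits = 36
items_per_sub_trait = 5
new_items = candidate_sub_traits * items_per_sub_trait   # new survey items
live_test_items = 116
item_pool = new_items + live_test_items                   # items considered for the HEXACO test

# Items that did not fit any HEXACO trait were removed
removed_items = 79
assigned_items = item_pool - removed_items

# Final scales
hexaco_items = 102
big_five_items = 102
items_per_hexaco_trait = hexaco_items // 6

# Overlap between the two questionnaires
shared_items = 55
unique_items = big_five_items + hexaco_items - shared_items

print(new_items, item_pool, assigned_items, items_per_hexaco_trait, unique_items)
```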

The Results

So, which "won", Big Five or HEXACO? Let's go through our findings one by one.

Results 1: Correlations between the two models

First, given the considerable overlap in dimensions between Big Five and HEXACO, it makes sense to have a look at whether the similarly-named dimensions between the two actually measure similar things. For this, we compared the Big Five scores that our study participants received to the HEXACO scores of these same participants. Here's what we found:

The correlations we measured between Big Five and HEXACO dimensions in our study.

Looking at this table, we see:

(1) Comparability across the two frameworks: Openness, Conscientiousness, Extraversion, and Neuroticism / Emotionality are highly correlated between the two.

(2) Some other small correlations: There are some other non-zero correlations beyond these, e.g., Neuroticism and Extraversion having a moderate correlation of -0.29. This is generally not surprising, for two reasons: First, even within each framework, factors tend to have small-to-moderate but non-zero correlations with each other. You can, for instance, look up on our PersonalityMap platform what different studies have found, and see there that the Big Five traits of Neuroticism and Extraversion had correlations of -0.15, -0.06, and -0.23 in three different studies. Additionally, you can find the within-model correlations from our own data in the appendix of this article. Second, as mentioned before, HEXACO is not just "Big Five plus one dimension", but some parts of the model are shifted, such as the anger example we mentioned before. Hence, it was to be expected that the traits that are present in both frameworks would still not be perfectly clean matches.

(3) Agreeableness has been split up: While the other four Big Five traits have remained comparatively stable, Agreeableness has gone through quite a transformation. Its highest correlation is still with Agreeableness on the HEXACO side, but both Emotionality and Honesty-Humility have medium-sized correlations with it as well. This is in line with what some other research has found: once you go beyond five personality dimensions, you often don't end up with additional dimensions that weren't captured before; instead, existing traits will just be split in two, or multiple traits will be "rotated", meaning they measure similar things when taken together, but distributed differently between traits.
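A cross-model correlation table like the one discussed above can be computed in a few lines. The sketch below uses synthetic scores, not our study data: the sample size matches our study, but the way the HEXACO-style columns are derived from the Big Five-style columns is an assumption made only so that the output has a recognizable pattern.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 343  # same sample size as the study; the scores themselves are synthetic

big_five_names = ["O", "C", "E", "A", "N"]
hexaco_names = ["H", "Em", "X", "A", "C", "O"]

big_five = rng.normal(size=(n, 5))
noise = rng.normal(size=(n, 6))

# Synthetic HEXACO-style scores (assumed relationships, for illustration only).
hexaco = np.empty((n, 6))
hexaco[:, 5] = big_five[:, 0] + 0.5 * noise[:, 5]   # Openness
hexaco[:, 4] = big_five[:, 1] + 0.5 * noise[:, 4]   # Conscientiousness
hexaco[:, 2] = big_five[:, 2] + 0.5 * noise[:, 2]   # eXtraversion
hexaco[:, 3] = big_five[:, 3] + 0.8 * noise[:, 3]   # Agreeableness (looser match)
hexaco[:, 1] = big_five[:, 4] + 0.5 * noise[:, 1]   # Emotionality from Neuroticism
hexaco[:, 0] = 0.5 * big_five[:, 3] + noise[:, 0]   # Honesty-Humility, partly from Agreeableness

def cross_correlations(a, b):
    """Pearson correlations between every column of `a` (rows) and every column of `b` (columns)."""
    full = np.corrcoef(np.hstack([a, b]), rowvar=False)
    return full[: a.shape[1], a.shape[1]:]

table = cross_correlations(big_five, hexaco)
print("      " + "  ".join(f"{name:>5s}" for name in hexaco_names))
for name, row in zip(big_five_names, table):
    print(f"{name:>5s} " + "  ".join(f"{value:5.2f}" for value in row))
```

Each row is a Big Five trait and each column a HEXACO trait, so a "split" trait shows up as a row with several medium-sized entries instead of a single large one.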

Results 2: Compressing personality

As explained before, our study participants answered a large number of personality-related questions, which we then used to compute their personal Big Five and HEXACO scores. One way to measure how powerful these models are is to then use them to predict the same questions that were used to compute the scores, and measure how far off these predictions are. This essentially means we're measuring how well the models can "compress" the underlying data - or, put another way, to what extent the 5 or 6 factors capture variation in people's responses to the underlying personality questions.

The plot below shows how much each additional factor adds to the "compression quality" of the model, quantified as "variance explained" in the original personality questions. Remember our "smooth drop-off" assumption from earlier? This is exactly the chart where one might expect a smooth drop-off - the first dimension would be most predictive (because factor analysis chooses these factors such that the most predictive ones are taken first), and every one beyond that would be a bit less predictive than the one before, with a smooth decrease. However, what we actually found is this:

This chart shows the additional variance explained by each factor when running a factor analysis with that many factors. We observe that there's a step change (which in factor analysis is sometimes called an "elbow") from the 5th factor, with an incremental variance explained of over 5%, to the 6th factor, where the variance explained suddenly drops by about half.

Interestingly, there's quite a gap from dimension 5 to 6! It's not that dimension 6 is useless - the variance explained is still clearly positive. But this substantial gap between using 5 vs 6 factors is pretty remarkable: the 5th factor is only slightly less useful than the 4th, whereas the 6th factor is less than half as useful as the 5th! Additionally, the 6th, 7th, 8th, etc., factors are all pretty close to equally useful. So, given our item pool and data, there is quite a strong justification for stopping after 5 factors (they are substantially more predictive than the rest). But it seems very hard to justify stopping after 6 factors in particular - the 7th and 8th factors are barely worse than the 6th!

Based on this chart alone, it seems that the first five dimensions are each much more informative than each additional factor that comes after. Hence, it would seem reasonable to say that the Big Five is a model where you get the most bang for your buck (given our dataset). Yes, you could keep going to further numbers of factors to get more predictive accuracy, but each additional factor gets you a lot less improvement in accuracy after 5 factors.
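To show what an "elbow" looks like in numbers, here is a small simulation. It is a simplified stand-in, not a reproduction of our analysis: instead of fitting factor analyses with increasing numbers of factors, it uses the sorted eigenvalues of the item correlation matrix (a classic scree computation), and the data are synthetic, deliberately built with five broad traits and ten narrow ones. It therefore illustrates the computation, not the finding.

```python
import numpy as np

rng = np.random.default_rng(42)
n_people, n_items = 1500, 60

strong = rng.normal(size=(n_people, 5))   # five broad hypothetical traits
weak = rng.normal(size=(n_people, 10))    # ten narrow hypothetical traits

# Start from item-specific noise, then add the trait contributions.
ratings = rng.normal(scale=0.7, size=(n_people, n_items))
for item in range(n_items):
    ratings[:, item] += 0.6 * strong[:, item // 12]   # 12 items per broad trait
    ratings[:, item] += 0.35 * weak[:, item // 6]     # 6 items per narrow trait

corr = np.corrcoef(ratings, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]
share = eigenvalues / eigenvalues.sum()   # fraction of total variance per component

for k in range(8):
    print(f"component {k + 1}: {share[k]:.1%} of variance")
```

In this constructed example the first five components each explain a similar, large share, and everything from the sixth onward is much smaller and roughly flat, which is the qualitative shape of a step change rather than a smooth decline.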

Results 3: Predicting Life Outcomes

While the last finding is potentially informative, it also has a limitation: it was all about predicting a participant's answers to the personality survey questions themselves, but it tells us little beyond that. In other words, it reflects compression of information. But what's usually more important than compression is prediction - what can you predict about other things beyond the personality items themselves? To explore this, we looked at how well we could predict a wide variety of life outcomes using people's Big Five scores and HEXACO scores.

What we found is summarized by the chart below. It compares how well the two personality models predict life outcomes that we measured in our survey (separately from the personality-related questions), using the R score - a statistical measure of predictive accuracy that tells us how closely our personality-based predictions match the actual life outcomes. An R score of 1 would mean that we can perfectly predict every single life outcome of all our participants. An R score of 0 means the model did no better than simply predicting the average value for everyone.

Before we compare the Big Five and HEXACO to each other - how well did they both do, overall? As you can see, all these R scores are somewhere just below 0.2, indicating that personality traits give us a modest level of predictive power across these 66 life outcomes, but the predictions are highly uncertain. It also depends a lot on which outcomes we're predicting. The numbers above are summaries (mean and median) over all the outcomes we measured. If we look at individual outcomes rather than averaging over all of them, we find some that can be predicted quite well:

Life Outcome	Big Five R Score	HEXACO R Score
Count of current mental illnesses	0.53	0.48
Overall life satisfaction from birth	0.49	0.47
Has children	0.49	0.47
Is full-time employed	0.41	0.45
Gender*	0.35	0.49

* Gender may not be a typical "life outcome", but it's still a fact about the study participants that both personality models can predict quite well based on personality scores alone. Notably, HEXACO does better on this than Big Five, suggesting that the added Honesty-Humility factor may be systematically different between men and women.

But other outcomes are not well predicted by either of our two models:

Life Outcome	Big Five R Score	HEXACO R Score
Count of sexual partners	0.00	0.01
Hours of sleep per night	0.00	0.00
Meditates each day	0.05	0.00

You can find the full table of all life outcomes and their R scores in the appendix.
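For intuition about a score that is 0 when a model does no better than predicting the average and 1 when it predicts perfectly, here is one plausible way to compute such a number. This is an assumption on our part for illustration, not a specification of the exact R score used in the study: the sketch fits a plain linear regression in 5-fold cross-validation, computes the out-of-sample R-squared, floors it at zero, and takes the square root. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def cross_validated_r(X, y, n_folds=5, seed=0):
    """Square root of out-of-sample R^2 (floored at 0) for a linear regression of y on X."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(y))
    predictions = np.empty(len(y))
    for test_idx in np.array_split(indices, n_folds):
        train_idx = np.setdiff1d(indices, test_idx)
        # Add an intercept column and fit ordinary least squares on the training fold.
        X_train = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
        X_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
        coef, *_ = np.linalg.lstsq(X_train, y[train_idx], rcond=None)
        predictions[test_idx] = X_test @ coef
    ss_res = np.sum((y - predictions) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r_squared = 1 - ss_res / ss_tot
    return float(np.sqrt(max(r_squared, 0.0)))

# Synthetic example: five trait scores, one related outcome, one unrelated outcome.
rng = np.random.default_rng(3)
traits = rng.normal(size=(343, 5))
related_outcome = 0.5 * traits[:, 0] + rng.normal(size=343)
unrelated_outcome = rng.normal(size=343)

r_related = cross_validated_r(traits, related_outcome)
r_unrelated = cross_validated_r(traits, unrelated_outcome)
print(f"related outcome: R = {r_related:.2f}, unrelated outcome: R = {r_unrelated:.2f}")
```

Evaluating on held-out participants is what makes a score of 0 possible in the first place: a model fitted and scored on the same data would almost always look slightly better than predicting the average.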

But what do all these numbers tell us about Big Five and HEXACO? Looking back at the previous bar chart, the mean R score over all life outcomes increased by about 3.5%, from 0.191 (Big Five) to 0.198 (HEXACO). This is a very small increase in accuracy, and in our paired analysis across outcomes, it was not statistically distinguishable from zero (more detail on this in the appendix). And besides the pure predictive performance, simplicity is an important factor with these models. It can sometimes make sense to choose a more complex model over a simpler one, but that should usually come with clear benefits to justify the extra complexity. Here, the added complexity of HEXACO did not produce a clearly detectable overall improvement.

What's more, even though HEXACO did, in absolute terms, perform slightly better than Big Five, there's a different way to look at the data that looks even less favorable to HEXACO: out of the 66 life outcomes we measured, HEXACO outperformed Big Five in 27, but it was outperformed in 31 (the remaining 8 showed no difference between the two). So, if anything, slightly more life outcomes actually benefited from the Big Five model. Here is a visualization of the difference in R Scores for our 66 life outcomes, where positive numbers mean HEXACO outperformed Big Five, and negative numbers mean the opposite:

A bar chart of the relative predictive performance of HEXACO vs Big Five on 66 different life outcomes. Positive values (orange) mean HEXACO performed better on the given life outcome, whereas negative values (blue) mean Big Five performed better. We observe that there are more blue than orange bars, suggesting that Big Five was better at predicting a greater number of life outcomes. However, the orange bars have slightly larger values on average, which explains how HEXACO can still overall achieve an R score slightly better than Big Five. Note that you can find the full table with all life outcomes and R scores in the appendix.
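The win/loss tally and the "larger orange bars" observation can be checked directly from the rounded differences listed in the appendix table. The following is a minimal sketch, not the study's own analysis code: the list is copied from the "HEXACO out-of-sample outperformance" column, and the helper names `tally` and `mean_magnitude` are illustrative.

```python
# Rounded test-set R differences (HEXACO minus Big Five), copied from the
# "HEXACO out-of-sample outperformance" column of the appendix table.
r_diffs = [
    0.14, 0.14, 0.14, 0.13, 0.12, 0.10, 0.10, 0.09, 0.09,
    0.07, 0.07, 0.06, 0.06, 0.05, 0.04, 0.04, 0.04, 0.03,
    0.03, 0.03, 0.02, 0.02, 0.02, 0.01, 0.01, 0.01, 0.01,
    0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00,
    -0.01, -0.01, -0.01, -0.01, -0.01, -0.02, -0.02, -0.02, -0.02,
    -0.02, -0.02, -0.02, -0.03, -0.03, -0.04, -0.04, -0.04, -0.04,
    -0.05, -0.05, -0.05, -0.05, -0.05, -0.05, -0.07, -0.07, -0.07,
    -0.07, -0.09, -0.09, -0.10,
]


def tally(diffs):
    """Count outcomes where HEXACO did better, tied, or did worse."""
    wins = sum(1 for d in diffs if d > 0)
    losses = sum(1 for d in diffs if d < 0)
    ties = len(diffs) - wins - losses
    return wins, ties, losses


def mean_magnitude(values):
    """Average absolute size of a list of differences."""
    return sum(abs(v) for v in values) / len(values)


wins, ties, losses = tally(r_diffs)
avg_win = mean_magnitude([d for d in r_diffs if d > 0])
avg_loss = mean_magnitude([d for d in r_diffs if d < 0])
print(wins, ties, losses)
print(round(avg_win, 3), round(avg_loss, 3))
```

On these rounded values, the tally comes out to 27 wins, 8 ties, and 31 losses for HEXACO, with an average win of about 0.06 against an average loss of about 0.04. Because the inputs are rounded to two decimals, the mean difference computed from this list will not exactly match the unrounded figure reported in the appendix.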

It's unclear whether this finding (that more life outcomes were predicted better by Big Five than by HEXACO) would generalize beyond our dataset, as it depends strongly on the exact outcomes tested. Yet it adds to the view that HEXACO does not look like a clear-cut improvement over Big Five. It seems to add little to no increase in predictive power, at the expense of adding an additional factor. We can go deeper, though, and ask: when does HEXACO help? We find that HEXACO tends to shine on certain types of life outcomes. For example:

Life Outcome	Big Five R Score	HEXACO R Score	Score Difference
Examples where HEXACO did better:
Has used physical force	0.00	0.14	+0.14
Has been arrested in the past 10 years	0.00	0.14	+0.14
Addicted to any substance	0.19	0.29	+0.10
Has cheated on their partner	0.00	0.09	+0.09
Examples where Big Five did better:
Donated to charity in the past year	0.24	0.15	-0.09
Has been fired	0.09	0.00	-0.09
Has received awards in the past year	0.10	0.00	-0.10

It makes sense that HEXACO fairly reliably does a better job at predicting life outcomes that seem particularly related to the Honesty-Humility trait. So, when engaging with questions of this particular type in a research context, it may well be the better-suited tool compared to Big Five. But when it comes to a general tool to assess personality in a broad set of contexts, our research strongly suggests that Big Five is preferable, as you get very similar predictive accuracy with a simpler model.

What This Means

Going into this article, we raised the question of whether personality is, in any meaningful sense, five- or six-dimensional. These two numbers seemed like good candidates because the Big Five and HEXACO models have a good track record of combining compactness with good replicability across different datasets.

As expected, we did find that HEXACO, with its one extra dimension, was able to capture slightly more about our study participants: it allowed for slightly more accurate "compression" of the survey data, and was possibly (albeit without reaching statistical significance) minimally better at predicting life outcomes overall. But it added very little, and arguably less than what would justify adding a 6th dimension to the already well-performing five dimensions of Big Five.

When we looked at how much variance was explained by each factor, there appeared to be a qualitative difference between the first five factors and those that come after. Based on this, combined with our other findings (such as the fact that the 6th factor is not a wholly new one, but instead previous factors tend to "split" at that point), one can make a compelling case that the first five dimensions are "special", whereas, if you were to add a sixth to your model, it becomes very difficult to justify stopping there rather than also taking a factor 7 or 8 on board.

Given all this, our findings further support the Big Five model. Nevertheless, Big Five won't be the ideal personality model in all circumstances, and there are definitely research questions for which HEXACO would be the preferable model (such as when you particularly care about questions of ethics, empathy, or criminality, for which HEXACO is more nuanced), so HEXACO has its place, if only in a more limited context. As a general all-purpose model, it seems that Big Five tends to give you more "bang for your buck".

Does this mean that human personality is five-dimensional after all? First and foremost, it is of course the case that humans are complex, and Big Five overall achieves only a modest degree of predictiveness for life outcomes on average (with an R score of 0.191 across the 66 outcomes in our study, with some being more accurate and some being entirely unpredictable by personality). So, naturally, there is way more about us humans than what five numbers can capture, and it would be a mistake to reduce individuals to their Big Five profile. In fact, studies typically find that the most accurate predictions of outcomes come from using all the personality question responses at once to make predictions, rather than relying on factors at all.

Another important caveat is that our study is based on data from participants in the US. Some prior research (e.g., Laajaj et al.) suggests that Big Five does not hold up well in all cultures. This limitation may apply not only to the concrete traits of Big Five, but also to the number of dimensions itself. In fact, some cross-cultural research suggests that a "Big Two" (namely Prosociality and Industriousness, based on research in one particular culture of indigenous Bolivian foragers, Gurven et al.) or "Big Three" (e.g., De Raad et al. find Affiliation, Order and Dynamism as three factors that are robust across many languages) may hold up better in other cultures.

As complex and nuanced as human personalities are, we find that the Big Five model provides an excellent tradeoff in terms of compression and simplicity, at least for a Western population. And if you were to ask "which dimensionality makes most sense for a general personality model", then our findings suggest that "5" would be as close as we get to an answer.

Appendix

Facets and Distinctive Traits

Early in the article, we showed this table:

Big Five	HEXACO
Openness to Experience
Conformist, Exploratory, Intellectual, Logical, Quick-minded	Aesthetic Appreciation, Inquisitiveness, Creativity, Unconventionality
Conscientiousness
Dependable, Hard working, Organized, Perfectionistic, Planning, Pragmatic, Rule-abiding, Spontaneous	Organization, Diligence, Perfectionism, Prudence
Extraversion
Attention-seeking, Conversationalist, Group-oriented, Leader-like, Socially energized, Spontaneous	Social Self-Esteem, Social Boldness, Sociability, Liveliness
Agreeableness
Altruistic, Conflict-avoidant, Emotional, Empathetic, Logical	Forgiveness, Gentleness, Flexibility, Patience
Neuroticism	Emotionality
Anxious, Demanding, Depressive, Emotional	Fearfulness, Anxiety, Dependence, Sentimentality
	Honesty-Humility
	Sincerity, Fairness, Greed Avoidance, Modesty

The "subtraits" we list can be interpreted as follows:

For Big Five, the subtraits are what we call "distinctive traits". Originally, we treated these as a parallel framework to the Big Five model: we came up with 25 distinctive traits. Each item in the questionnaire then received a defined assignment to the Big Five factors as well as (separately) to the distinctive traits. We then determined empirically how strongly each distinctive trait related to each Big Five factor. In this way, they help describe what each factor captures, even though they were not originally designed as Big Five subtraits.

For HEXACO, on the other hand, the subtraits are actual "facets", as they're often used for personality tests. We first broke up the six factors into facets, four per factor, and then assigned our questionnaire items to these facets directly.

This distinction also explains why you'll find some duplicates of the distinctive traits on the Big Five side: as the relation between distinctive traits and factors was empirically determined, it happened that some traits (namely, Logical, Spontaneous, and Emotional) related strongly to multiple factors. On the HEXACO side, this does not occur in our framework because the facets were defined directly within each factor from the outset.

Big Five Internal Correlations

Correlations between Big Five traits based on our data.

HEXACO Internal Correlations

Correlations between HEXACO traits based on our data.

Life Outcomes Predicted by Big Five and HEXACO

Each of the 66 life outcomes was predicted using two separate models: one with the Big Five personality factors as predictors, and one with the six HEXACO factors. Data from 343 participants were used unnormalized across all analyses, with some analyses using slightly fewer than 343 data points, depending on outcome availability.

Regression approach: Ordinary least squares (linear) regression was used for continuous outcome variables, and logistic regression for binary (categorical) outcomes. Binary outcomes can be identified in the table by the presence of Train Set Value Counts, which show the class distribution in the training data.

Cross-validation: A 3-fold cross-validation procedure was employed. The data were randomly shuffled once per outcome, then split into three non-overlapping blocks of approximately 114 participants each. Each block served as the test set in one fold, with the remaining two-thirds used for training. For binary outcome variables, stratification was applied to the train-test split to preserve the proportion of each class across folds. Final train and test set scores are the averages across the three folds. The three rightmost columns of the table show example sample sizes and (for binary outcomes) value counts taken from one of the three folds.
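The splitting scheme described above could be sketched as follows. This is an illustration under stated assumptions rather than the study's actual code: the function names, the fixed seed, and the round-robin way of stratifying binary outcomes are all hypothetical choices.

```python
import random


def shuffled_blocks(n_participants, k=3, seed=0):
    """Shuffle participant indices once, then cut them into k disjoint blocks."""
    rng = random.Random(seed)
    order = list(range(n_participants))
    rng.shuffle(order)
    return [order[i::k] for i in range(k)]


def stratified_blocks(labels, k=3, seed=0):
    """Assumed stratification: deal each class's shuffled members round-robin
    into the k blocks, so class proportions stay roughly equal across blocks."""
    rng = random.Random(seed)
    blocks = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for pos, idx in enumerate(members):
            blocks[pos % k].append(idx)
    return blocks


def train_test_folds(blocks):
    """Each block is the test set once; the other blocks form the training set."""
    folds = []
    for i, test in enumerate(blocks):
        train = [idx for j, block in enumerate(blocks) if j != i for idx in block]
        folds.append((train, test))
    return folds


folds = train_test_folds(shuffled_blocks(343))
fold_sizes = [(len(train), len(test)) for train, test in folds]
print(fold_sizes)
```

With 343 participants, this yields one fold with 228 training and 115 test participants and two folds with 229 and 114, which is in line with the sample sizes shown in the table for outcomes with complete data.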

Computing R: For both regression types, we computed R² as 1 − (residual sum of squares / total sum of squares). Where R² was negative - indicating the model performed worse than simply predicting the mean - it was set to zero before deriving R (= √R²), which we use here as a simplified accuracy summary. This floor was applied per fold before averaging, meaning an R of 0.00 indicates the model had no predictive power above baseline across folds. The "HEXACO out-of-sample outperformance" column reports the difference in test-set R between the HEXACO and Big Five models. So, a positive number indicates HEXACO outperformed Big Five, whereas a negative value means Big Five did better.
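As a rough sketch of the R computation just described: the helper names below are illustrative, and two details are assumptions not stated above, namely that the total sum of squares is taken around the test-set mean, and that a fold with no variance in the outcome is scored as 0.

```python
import math


def floored_r(y_true, y_pred):
    """R = sqrt(max(R^2, 0)), with R^2 = 1 - RSS / TSS."""
    mean_y = sum(y_true) / len(y_true)  # assumption: test-set mean as baseline
    rss = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    tss = sum((y - mean_y) ** 2 for y in y_true)
    if tss == 0:
        return 0.0  # assumption: no variance to explain counts as R = 0
    r_squared = 1.0 - rss / tss
    return math.sqrt(max(r_squared, 0.0))


def mean_r_across_folds(folds):
    """Apply the floor per fold, then average, as described in the text."""
    return sum(floored_r(y, p) for y, p in folds) / len(folds)


# Toy folds (hypothetical values): one perfect fit, one worse than the mean.
toy_folds = [
    ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),
    ([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]),
]
print(mean_r_across_folds(toy_folds))
```

In the toy example, the second fold has a negative R², so it contributes 0 rather than pulling the average down. This is why flooring per fold before averaging can give a slightly different number than flooring the averaged R² afterwards.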

Outcome	Big Five R Score (Train set)	Big Five R Score (Test set)	HEXACO R Score (Train set)	HEXACO R Score (Test set)	HEXACO out-of-sample outperformance (R difference)	Train Sample Size	Test Sample Size	Train Set Value Counts
Is Female	0.41	0.35	0.52	0.49	0.14	226	113	{1.0: 130, 0.0: 96}
Has Used Physical Force	0.00	0.00	0.10	0.14	0.14	228	115	{0: 216, 1: 12}
Has Been Arrested (Past 10 Years)	0.09	0.00	0.13	0.14	0.14	228	115	{0: 216, 1: 12}
Average Paid Hours per Week	0.30	0.21	0.40	0.35	0.13	229	114
Political Conservatism	0.39	0.35	0.48	0.47	0.12	229	114
Addicted to Any Substance	0.40	0.19	0.46	0.29	0.10	228	115	{0: 114, 1: 114}
Addicted to Any Behavior	0.41	0.13	0.39	0.23	0.10	228	115	{0: 120, 1: 108}
Has Cheated on a Partner	0.00	0.00	0.06	0.09	0.09	228	115	{0: 202, 1: 26}
Age	0.25	0.12	0.34	0.20	0.09	229	114
Weekly Moderate Exercise (Minutes)	0.23	0.11	0.24	0.18	0.07	229	114
Frequency of Non-White Lies (Past 24 Hours)	0.27	0.20	0.42	0.27	0.07	229	114
Books Read (Past 12 Months)	0.17	0.06	0.24	0.12	0.06	229	114
Alcoholic Drinks (Past 7 Days)	0.19	0.05	0.20	0.11	0.06	229	114
Education Score	0.22	0.04	0.28	0.09	0.05	229	114
Diet Healthiness (1–7)	0.34	0.23	0.37	0.27	0.04	229	114
Time Wearing Seatbelt (%)	0.10	0.00	0.15	0.04	0.04	229	114
Is Full-Time Employed	0.48	0.41	0.51	0.45	0.04	228	115	{1: 116, 0: 112}
Number of Deep Emotional Connections	0.37	0.27	0.38	0.30	0.03	229	114
How Urban Home Area Is (1–7)	0.14	0.00	0.14	0.03	0.03	229	114
Satisfied with Social Life	0.48	0.41	0.50	0.43	0.03	229	114
Household Income Score	0.25	0.15	0.25	0.17	0.02	229	114
Has Any Diagnosed Mental Illness	0.39	0.20	0.42	0.23	0.02	228	115	{0: 125, 1: 103}
Satisfied with Physical Health	0.48	0.37	0.49	0.39	0.02	229	114
Parental Strictness as a Child	0.25	0.13	0.27	0.15	0.01	229	114
Number of Sexual Partners	0.33	0.00	0.34	0.01	0.01	229	114
Relationship Seriousness (0–4)	0.19	0.06	0.23	0.07	0.01	229	114
Suicidality (Past 10 Years, 0–4)	0.45	0.37	0.44	0.38	0.01	229	114
Daily Hours Watching Video	0.15	0.00	0.21	0.00	0.00	229	114
Clicks on Online Ads (Past 30 Days)	0.19	0.00	0.22	0.00	0.00	229	114
Hours of Sleep per Night	0.20	0.00	0.16	0.00	0.00	229	114
Parents Were Married	0.00	0.00	0.00	0.00	0.00	228	115	{1: 208, 0: 20}
Parents Were Divorced	0.06	0.00	0.13	0.00	0.00	228	115	{0: 156, 1: 72}
Has Started a Company	0.10	0.14	0.12	0.14	0.00	228	115	{0: 158, 1: 70}
Has Payment Difficulties	0.05	0.00	0.10	0.00	0.00	228	115	{0: 184, 1: 44}
Opposite-Sex Attraction (%)	0.29	0.17	0.29	0.17	0.00	226	114
Considers Self Religious	0.35	0.27	0.37	0.27	-0.01	229	114
Work-Life Satisfaction	0.49	0.45	0.48	0.45	-0.01	229	114
Colds/Flus per Year	0.26	0.15	0.25	0.14	-0.01	229	114
Election Votes (Past 5 Years)	0.12	0.01	0.11	0.00	-0.01	229	114
Romantic Life Satisfaction	0.31	0.24	0.34	0.23	-0.01	229	114
Has Been Promoted	0.25	0.17	0.27	0.15	-0.02	228	115	{0: 132, 1: 96}
Seeks Feedback	0.20	0.17	0.21	0.16	-0.02	228	115	{1: 156, 0: 72}
Has Children	0.51	0.49	0.49	0.47	-0.02	228	115	{1: 116, 0: 112}
Self-Rated Facial Attractiveness (1–9)	0.41	0.28	0.42	0.26	-0.02	229	114
Spirituality (0–5)	0.41	0.36	0.42	0.34	-0.02	229	114
Overall Life Satisfaction (From Birth)	0.55	0.49	0.54	0.47	-0.02	229	114
High School GPA (1–12)	0.15	0.02	0.12	0.00	-0.02	228	114
Daily Hours on Social Media	0.32	0.12	0.29	0.10	-0.03	229	114
Is Currently Married	0.37	0.23	0.33	0.20	-0.03	228	115	{0: 138, 1: 90}
Parental Permissiveness as a Child (Reverse of Strictness)	0.22	0.18	0.23	0.14	-0.04	229	114
Number of Emergency Helpers	0.33	0.26	0.32	0.22	-0.04	229	114
Social Class (0–4)	0.37	0.27	0.37	0.23	-0.04	229	114
Hours Outdoors (Past Week)	0.25	0.15	0.24	0.11	-0.04	229	114
Felt Loved by Parents as a Child	0.41	0.28	0.38	0.23	-0.05	229	114
Self-Rated IQ Percentile	0.34	0.27	0.31	0.22	-0.05	229	114
Is a Homeowner	0.42	0.41	0.49	0.36	-0.05	228	115	{1: 118, 0: 110}
Number of Current Mental Illnesses	0.59	0.53	0.57	0.48	-0.05	229	114
Organizes Events	0.40	0.37	0.37	0.32	-0.05	228	115	{1: 130, 0: 98}
Meditates Daily	0.22	0.05	0.14	0.00	-0.05	228	115	{0: 153, 1: 75}
Satisfied with Life	0.56	0.52	0.53	0.46	-0.07	229	114
Visited a Worship Site (Past 30 Days)	0.12	0.13	0.16	0.07	-0.07	228	115	{0: 180, 1: 48}
IQ (Normalized Score)	0.25	0.14	0.23	0.07	-0.07	229	114
Trauma Impact (0–4)	0.47	0.43	0.44	0.36	-0.07	229	114
Donated to Charity (Past Year)	0.32	0.24	0.08	0.15	-0.09	228	115	{1: 152, 0: 76}
Has Been Fired	0.06	0.09	0.06	0.00	-0.09	228	115	{0: 201, 1: 27}
Received Awards (Past Year)	0.14	0.10	0.00	0.00	-0.10	228	115	{0: 186, 1: 42}
Mean:	0.285	0.191	0.296	0.198
Median:	0.281	0.170	0.289	0.172

Testing the Differences in Life Outcome Predictions

In the table above, we see that Big Five had a mean R score of 0.191, and HEXACO of 0.198. In the article, we mentioned that this difference is not statistically significant. We tested this as follows:

To assess whether HEXACO outperformed Big Five overall, we treated each of the 66 life outcomes as a paired comparison, since both models were evaluated on the same outcome. For each outcome, we computed the difference in test-set R between HEXACO and Big Five, and then tested whether the mean difference differed from zero using a paired t-test. The mean difference was +0.0071 R, which was not statistically significant: t(65) = 0.99, two-sided p = 0.325, with a 95% confidence interval of [-0.007, 0.021].
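For readers who want to retrace this test, here is a minimal sketch of the paired t-statistic and confidence interval using only the Python standard library. The function name is illustrative, and the critical value 1.997 (approximately the two-sided 95% t-value for 65 degrees of freedom) is hard-coded as an assumption, since the standard library has no t-distribution.

```python
import math
import statistics


def paired_t_summary(diffs, t_crit):
    """Mean difference, standard error, t-statistic and CI for paired differences."""
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # sample SD over sqrt(n)
    t_stat = mean_diff / se
    ci = (mean_diff - t_crit * se, mean_diff + t_crit * se)
    return mean_diff, se, t_stat, ci


# Consistency check on the reported summary numbers (not a re-analysis):
reported_mean, reported_t = 0.0071, 0.99
implied_se = reported_mean / reported_t
t_crit_65 = 1.997  # assumed two-sided 95% critical value, 65 degrees of freedom
implied_ci = (reported_mean - t_crit_65 * implied_se,
              reported_mean + t_crit_65 * implied_se)
print(round(implied_ci[0], 3), round(implied_ci[1], 3))
```

The implied interval rounds to [-0.007, 0.021], matching the reported one. To obtain the p-value as well, one could pass the 66 differences to a statistics library, for example SciPy's `scipy.stats.ttest_1samp` with a population mean of 0. Note that running this on the rounded differences from the table will give slightly different numbers than the unrounded data.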

Factor Analysis Scree Plot

Scree plot showing the percentage of variance explained by each factor, derived from the eigenvalues of the correlation matrix of all 149 pooled Big Five and HEXACO items. Data were normalized (mean=0, SD=1) prior to analysis. Percentage variance explained per factor was calculated as the factor's eigenvalue divided by the sum of all eigenvalues. The plot displays the first 12 of 149 possible factors.
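The percentage calculation behind the scree plot is simple enough to sketch in a few lines. The eigenvalues below are made up purely for illustration and the function name is hypothetical. For a correlation matrix, the eigenvalues sum to the number of items, so with 149 items each factor's share is its eigenvalue divided by 149.

```python
def percent_variance_explained(eigenvalues):
    """Each eigenvalue as a percentage of the sum of all eigenvalues."""
    total = sum(eigenvalues)
    return [100.0 * ev / total for ev in eigenvalues]


# Hypothetical eigenvalues for a 4-item correlation matrix (they sum to 4).
toy_eigenvalues = [2.0, 1.0, 0.6, 0.4]
shares = percent_variance_explained(toy_eigenvalues)
print([round(s, 1) for s in shares])
```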

Supplementary Materials

To download the anonymized data from our study, showing each of the outcome values and each of the sub-scale values for each participant (i.e., all five of their Big Five scores and all six of their HEXACO scores), click here.

To download a data dictionary containing the names of all variables in the file above and a plain English description of each one, click here.