I’ve decided to begin a long-term research project aimed at understanding the underlying epidemiology of Dungeons and Dragons. This research project will consist of a series of (hopefully) increasingly complex simulations of battles between D&D PCs and various nemeses, to answer some key questions in character development and perhaps also to investigate some key controversies in the game. Once I have developed my simulations I hope to extend the project to Exalted, and I might diversify beyond that too.

The simple weight of experience in D&D means that most people know, or feel they know, how D&D works and how the roll of the dice determines a PC’s fate. I have noticed that sometimes our intuitive understanding of these things can be wrong, and I’d like to investigate D&D in enough detail to understand how it works. I’ll write a separate post about some of the principles of the research project, but in this post I’ll present the first analysis.

Introduction

In this post a million battles are simulated between a million randomly-generated fighters and a single (unfortunate) Orc, Gruumsh The Bastard, who has 6 hit points and does 2d4+4 damage with his nasty falchion of fighter-crunching. Both Gruumsh and the million fighters were generated using Pathfinder rules as set out in the System Reference Document. These million battles were run in order to identify the effect of the three basic physical ability scores (Strength, Dexterity and Constitution) on survival for a standard fighter.

Methods Summary

Detailed methods are described at the end of the post. In essence, a million Pathfinder fighters were generated randomly and pitted against Gruumsh the Bastard in simulated battles. Fighter survival was analyzed by ability score using multiple logistic regression. Survival probabilities by ability score are plotted in charts and summarized as odds ratios from the logistic regression. No interactions or higher-order effects were considered. The distribution of hit points was summarized using a histogram, but this doesn’t represent the true (practical) distribution of hit points for a fighter, since it includes fighters with unrealistically low constitution scores.

Results

Things didn’t go well for the million fighters. Overall survival was just 26%, with 256,584 lucky fighters making it to the end of their battle. The remaining 743,416 fighters were smashed to ribbons by Gruumsh and, in many cases, eaten. The median length of a battle was 4 rounds where the fighter survived, or 3 rounds if Gruumsh won. Figure 1 shows the probability of survival by ability score, and shows some stark differences in effect between ability scores.

Figure 1: Probability of Survival by Ability Score

It is clear from Figure 1 that strength is the key determinant of survival for a first level fighter. Only 0.4% of the weakest fighters survived, compared to 55% of the strongest. Constitution has barely any effect on survival, and dexterity is only important at the extreme ends of its range.

Table 1 summarizes the results of multiple logistic regression of mortality. In this table, the odds ratio of death for each ability score is adjusted for the other two ability scores, which removes the confounding effect of high or low values in those scores. All odds ratios are given relative to the lowest band of the corresponding ability score, so for example those with strength 18 – 19 have an odds ratio of mortality of 0.003 compared to those with a strength of 2 – 3.

Table 1: Multiple Logistic Regression of Death by Ability Score
Variable        Odds Ratio   95% Confidence Interval   P value

Strength
  2 to 3        1 (Ref.)
  4 to 5        0.21         0.06 – 0.66               0.008
  6 to 7        0.07         0.02 – 0.21               <0.001
  8 to 9        0.03         0.01 – 0.10               <0.001
  10 to 11      0.02         0.01 – 0.06               <0.001
  12 to 13      0.01         0 – 0.03                  <0.001
  14 to 15      0.006        0 – 0.02                  <0.001
  16 to 17      0.004        0 – 0.01                  <0.001
  18 to 19      0.003        0 – 0.01                  <0.001

Dexterity
  2 to 3        1 (Ref.)
  4 to 5        0.87         0.69 – 1.10               0.236
  6 to 7        0.76         0.61 – 0.94               0.012
  8 to 9        0.65         0.53 – 0.81               <0.001
  10 to 11      0.54         0.44 – 0.67               <0.001
  12 to 13      0.45         0.36 – 0.55               <0.001
  14 to 15      0.37         0.3 – 0.45                <0.001
  16 to 17      0.3          0.24 – 0.37               <0.001
  18 to 19      0.23         0.18 – 0.28               <0.001

Constitution
  2 to 3        1 (Ref.)
  4 to 5        0.9          0.73 – 1.11               0.307
  6 to 7        0.86         0.71 – 1.04               0.113
  8 to 9        0.82         0.68 – 0.99               0.044
  10 to 11      0.72         0.6 – 0.87                0.001
  12 to 13      0.63         0.52 – 0.76               <0.001
  14 to 15      0.55         0.45 – 0.66               <0.001
  16 to 17      0.48         0.4 – 0.58                <0.001
  18 to 19      0.41         0.34 – 0.49               <0.001

There is no statistically significant difference between a constitution score of 6-7 and a score of 2-3 (P = 0.113): everyone with a constitution score in this range is purely at the mercy of the dice. In comparison, moving from the 2-3 strength band to the 4-5 band reduces the odds of death by a factor of about five (odds ratio 0.21), and fighters in the 18-19 band have odds of mortality roughly 300 times lower than fighters in the 2-3 band (odds ratio 0.003). Truly, fortune favours the strong.
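For readers who want to see where a figure like "300 times lower" comes from, here is a minimal sketch of the odds ratio arithmetic. It uses only the two survival percentages quoted for Figure 1 (0.4% for the weakest fighters, 55% for the strongest), so it gives a crude, unadjusted odds ratio rather than the adjusted one in Table 1. The function names are mine, and I am assuming "weakest" and "strongest" line up roughly with the lowest and highest strength bands.

```python
def odds(p):
    """Convert a probability into odds, p / (1 - p)."""
    return p / (1.0 - p)


def odds_ratio(p_group, p_reference):
    """Odds of the event in one group divided by the odds in the reference group."""
    return odds(p_group) / odds(p_reference)


# Death probabilities implied by the survival figures quoted for Figure 1.
p_death_weakest = 1.0 - 0.004    # 0.4% of the weakest fighters survived
p_death_strongest = 1.0 - 0.55   # 55% of the strongest fighters survived

# Crude (unadjusted) odds ratio of death, strongest relative to weakest.
crude_or = odds_ratio(p_death_strongest, p_death_weakest)

print(f"crude odds ratio of death: {crude_or:.4f}")
print(f"fold reduction in odds:    {1.0 / crude_or:.0f}")
```

The crude value comes out close to 0.003, which is reassuringly near the adjusted odds ratio for the top strength band. Note that the reduction is in odds, not in probability: the probability of death only falls from 99.6% to 45%.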

Figure 2 shows the odds ratio of mortality for constitution with its 95% confidence intervals, as a graphical alternative to a portion of Table 1 (we promised Gruumsh we would describe his victory in pretty pictures).

Figure 2: Odds Ratio of Mortality by Constitution Score

Figure 2 suggests that hit points are not as important to combat survival as the ability to smash your opponent into the dirt. Once the Toughness feat is incorporated into simulations, constitution is likely to become even less important, and should probably be treated as a dump stat by players. Given that choosing the Toughness feat is equivalent to making a large increase in constitution, but this increase in constitution gives a barely-statistically-significant reduction in mortality, it seems likely that this feat is not a very useful choice. If Gruumsh is willing, this will be investigated in subsequent analyses[1].

The distribution of strength ability scores under the 4d6 choose-the-best-three method is shown in Figure 3. This method shifts the scores significantly to the right: only 754 fighters had a strength of 3, compared to 16,141 who had a strength of 18. The mean strength was 12.24 and the median 12, a shift of about 1.7 from the mean of 10.5 under a standard 3d6 distribution, and a huge change to the extreme values.
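The simulated counts can be checked against the exact distribution, since there are only 1,296 ways to roll 4d6. The following is a small sketch (not the original R code, and the names are my own) that enumerates every roll, drops the lowest die, and reports the exact mean and the expected number of 3s and 18s in a million fighters.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def best_three_of_4d6():
    """Exact distribution of the sum of the best three of four six-sided dice."""
    counts = Counter(
        sum(sorted(roll)[1:])                      # drop the lowest die
        for roll in product(range(1, 7), repeat=4)
    )
    total = 6 ** 4                                 # 1,296 equally likely rolls
    return {score: Fraction(n, total) for score, n in sorted(counts.items())}


dist = best_three_of_4d6()
mean_4d6 = sum(score * p for score, p in dist.items())
mean_3d6 = Fraction(21, 2)                         # 3 dice x 3.5 each

print(f"P(3)  = {dist[3]}, P(18) = {dist[18]}")
print(f"mean under 4d6 best three: {float(mean_4d6):.4f}")
print(f"shift from the 3d6 mean:   {float(mean_4d6 - mean_3d6):.4f}")
print(f"expected 3s per million:   {float(dist[3]) * 1_000_000:.0f}")
print(f"expected 18s per million:  {float(dist[18]) * 1_000_000:.0f}")
```

The exact expectations (about 772 fighters with a 3 and about 16,204 with an 18 per million) sit close to the 754 and 16,141 seen in the sample, which suggests the random generation is behaving itself.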

Figure 3: Distribution of Strength Scores Under 4d6 choose-the-best-three

Nearly 5% of the sample had at least one physical score of 18; but this method is still not perfect, with only 3 of one million fighters having a score of 18 in all three physical attributes (one of these three, who also had an intelligence of 15 and a charisma of 16, was beaten to a bloody pulp by Gruumsh in just three rounds. His liver, apparently, was exquisite when grilled lightly and eaten on rye bread with a dark ale).
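These proportions are easy to sanity-check by hand. A quick sketch, assuming the three physical scores are rolled independently (which follows from rolling in order) and using the exact chance of 21/1296 that a single 4d6 choose-the-best-three roll comes up 18:

```python
from fractions import Fraction

# Chance that one 4d6 choose-the-best-three roll is an 18:
# all four dice are 6 (1 way) or exactly three are 6 (4 * 5 = 20 ways).
p18 = Fraction(21, 6 ** 4)

# Three independent physical scores: strength, dexterity, constitution.
p_at_least_one = 1 - (1 - p18) ** 3
p_all_three = p18 ** 3

print(f"at least one 18: {float(p_at_least_one):.2%}")
print(f"all three 18:    {float(p_all_three) * 1_000_000:.2f} per million fighters")
```

So a little under 5% with at least one 18, and around four triple-18 paragons per million, is about what the dice should deliver.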

Figure 4 shows the distribution of hit points in this sample of 1 million fighters. This is not the distribution one would actually see in a sample of actual Pathfinder fighters, since in a real game most fighters will have non-negative constitution bonuses (unless their player has read this post, I suppose). This histogram shows an interesting effect, however: even when constitution is unrestricted, under a 4d6/choose-the-best-three system there is a heavy concentration of hit points in the range of 4 – 10. Median hit points in this sample were 6, and the average hit point total was 6.2: in fact, the hit point distribution looks remarkably close to a uniform distribution on the range 4 – 10!
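The shape of this histogram can also be derived exactly by combining the constitution distribution with a d10. The sketch below is my own reconstruction, not the original R code: it assumes hit points are a rolled d10 plus the constitution modifier, with a floor of 1 hit point, and the standard (score - 10) / 2 rounded-down modifier. Because the original generation code is not shown, its output need not match the simulated figures exactly.

```python
from collections import Counter, defaultdict
from fractions import Fraction
from itertools import product


def modifier(score):
    """Ability modifier: (score - 10) / 2, rounded down."""
    return (score - 10) // 2


def con_distribution():
    """Exact distribution of a 4d6 choose-the-best-three constitution score."""
    counts = Counter(sum(sorted(r)[1:]) for r in product(range(1, 7), repeat=4))
    return {score: Fraction(n, 6 ** 4) for score, n in counts.items()}


def hit_point_distribution(floor=1):
    """Distribution of d10 + constitution modifier, floored at `floor` (assumed)."""
    hp = defaultdict(Fraction)
    for con, p_con in con_distribution().items():
        for die in range(1, 11):
            hp[max(floor, die + modifier(con))] += p_con * Fraction(1, 10)
    return dict(sorted(hp.items()))


hp_dist = hit_point_distribution()
mean_hp = sum(h * p for h, p in hp_dist.items())
share_4_to_10 = sum(p for h, p in hp_dist.items() if 4 <= h <= 10)

for h, p in hp_dist.items():
    print(f"{h:2d} hp: {float(p):.3f}")
print(f"mean hit points: {float(mean_hp):.2f}")
print(f"share with 4 to 10 hit points: {float(share_4_to_10):.1%}")
```

Printing the table makes it easy to judge by eye how flat the middle of the distribution really is.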

Figure 4: Distribution of Hit Points

Survival was not strongly associated with hit point value: those with 1 hit point survived in 20% of battles, while those with 14 hit points survived in 50% of battles. The greater importance of hps relative to constitution is driven entirely by the extra die roll (the d10 for hps). This suggests that constitution would be of much greater importance if hit points were fixed at first level; equivalently, it may be that the role of constitution is washed out by the random determination of hit points. If so, one can expect that constitution will be more important at later levels, when the law of large numbers cancels out the random effect of dice rolls on survival. For the same reason strength will probably reduce in importance over levels, since its effect is not compounded with level as constitution’s is. This is an issue that will need to be investigated, although if survival probabilities are replicated at second level it’s unlikely we will have much of a sample size of high level PCs[2].
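The law-of-large-numbers argument can be made a little more concrete. In the sketch below I assume a d10 is rolled for hit points at every level (as was done at first level here) and that the constitution modifier is added once per level. The constitution bonus then grows in proportion to level, while the standard deviation of the dice total grows only with the square root of level. The function name and the "signal to noise" framing are mine.

```python
import math

# Variance of a single fair d10: (n^2 - 1) / 12 with n = 10.
D10_VARIANCE = (10 ** 2 - 1) / 12


def con_signal_to_noise(level, con_modifier):
    """Constitution's hit point bonus relative to the dice noise at a given level."""
    bonus = level * con_modifier                 # grows linearly with level
    dice_sd = math.sqrt(level * D10_VARIANCE)    # grows with sqrt(level)
    return bonus / dice_sd


for level in (1, 4, 9, 16):
    ratio = con_signal_to_noise(level, con_modifier=2)
    print(f"level {level:2d}: bonus / dice sd = {ratio:.2f}")
```

The ratio scales with the square root of level, so a +2 modifier that is smaller than one standard deviation of dice noise at first level is worth twice as much, relatively speaking, by fourth level. This only describes hit points, not survival, which is what the later simulations would need to test.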

Conclusion

At first level, strength is far and away the most important ability score for fighters, and constitution is so insignificant as to be almost a dump stat. A fighter with strength of 18 has only 1/300th the odds of death of a fighter with strength 3 when fighting a single Orc. Overall survival rates were low even in the toughest fighters, and in the absence of feats it appears that Pathfinder is an extremely nasty environment for solo adventuring.

Future research will investigate the role of feats in enhancing survival, and their importance relative to ability scores. The results presented here are preliminary, but it appears that in min-maxing fighter PCs the wisest choice is to prioritize strength, then dexterity, then constitution. If one is developing a PC with the intention of long-term survival these findings may be reversed, but the experimental results have not yet been collated.

Finally, the results presented here suggest that the assignment of a 1/3 challenge rating (CR) to Orcs in Pathfinder may be unwarranted. Although data are not shown here, in the testing stage this simulation program was run on Goblins (also CR 1/3) and the fighter survival rate was much higher. It may be the case that Orcs are far more challenging than a CR of 1/3 implies. It’s not clear how Pathfinder assigns its CRs, but it seems natural to suppose that a creature with a more than 50% chance of defeating an average human fighter is more than CR 1. Are Pathfinder’s CRs accurate? In any case, basic advice to fighters in Pathfinder would be: hunt Goblins, not Orcs, because they’re much lower risk for the same xps.

Methods

For this analysis the fighters were generated according to the following rules:

  • All ability scores were generated using 4d6 choose-the-best-three, rolled in order: This is not orthodox Pathfinder but enables simultaneous estimation of the probability distribution of ability scores under this commonly-used rule, and enables analysis of the effect of ability scores across their full range – not just in the high values that one would usually assign to a PC’s prime characteristics
  • No feats were assigned to the fighter: for this first analysis the effect of raw scores was the topic of analysis, so no special abilities were given to the fighters. These million meat-shields were cast into battle with only their raw talents at their disposal
  • All fighters had the same equipment: raising a levy of a million fighters takes only a minute in 64 bit R, but it’s clearly a costly imposition on the citizenry, so all fighters were assigned standard kit consisting of chain mail armour, a standard shield, and a longsword. If we can secure a sufficiently large research grant from Waterdeep, in subsequent battles we will allow random variation in armour types in order to choose the best armour
  • Racial abilities were not tested: no racial ability score adjustments or size bonuses were tested. Only raw scores were used. In future battles, racial ability scores will be incorporated into the PCs. Anyway, who cares if a halfling lives or dies?
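The rules above can be sketched in a few lines. This is an illustrative Python reconstruction, not the R code used for the study, and it leans on several assumptions the post does not spell out: the Orc's AC of 13 and attack bonus of +5, chain mail at +6 AC with a maximum dexterity bonus of +2, a "standard shield" worth +2 AC, a simple d20 initiative roll, no critical hits, no Orc ferocity, and a fighter who dies at 0 hit points. Only the Orc's 6 hit points and 2d4+4 damage come from the post itself.

```python
import random


def modifier(score):
    """Ability modifier: (score - 10) / 2, rounded down."""
    return (score - 10) // 2


def ability_score(rng):
    """4d6 choose-the-best-three."""
    dice = sorted(rng.randint(1, 6) for _ in range(4))
    return sum(dice[1:])


def make_fighter(rng):
    """First-level fighter with no feats, chain mail, shield and longsword."""
    strength, dexterity, constitution = (ability_score(rng) for _ in range(3))
    return {
        "str": strength, "dex": dexterity, "con": constitution,
        "hp": max(1, rng.randint(1, 10) + modifier(constitution)),  # floor assumed
        "ac": 10 + 6 + 2 + min(modifier(dexterity), 2),             # assumed kit values
        "attack": 1 + modifier(strength),                           # base attack bonus +1
    }


def hits(attack_bonus, target_ac, rng):
    """d20 attack roll: natural 1 always misses, natural 20 always hits."""
    d20 = rng.randint(1, 20)
    if d20 == 1:
        return False
    return d20 == 20 or d20 + attack_bonus >= target_ac


def fight(fighter, rng, orc_hp=6, orc_ac=13, orc_attack=5):
    """Return (fighter_survived, rounds). Orc AC and attack bonus are assumptions."""
    hp = fighter["hp"]
    fighter_first = rng.randint(1, 20) + modifier(fighter["dex"]) >= rng.randint(1, 20)
    order = ("fighter", "orc") if fighter_first else ("orc", "fighter")
    rounds = 0
    while True:
        rounds += 1
        for actor in order:
            if actor == "fighter" and hits(fighter["attack"], orc_ac, rng):
                orc_hp -= max(1, rng.randint(1, 8) + modifier(fighter["str"]))  # longsword
                if orc_hp <= 0:
                    return True, rounds
            elif actor == "orc" and hits(orc_attack, fighter["ac"], rng):
                hp -= rng.randint(1, 4) + rng.randint(1, 4) + 4                 # falchion
                if hp <= 0:
                    return False, rounds


rng = random.Random(2012)   # fixed seed so the run is repeatable
results = [fight(make_fighter(rng), rng) for _ in range(10_000)]
survival_rate = sum(survived for survived, _ in results) / len(results)
print(f"fighter survival rate: {survival_rate:.1%}")
```

Because of the simplifications listed, the survival rate from this sketch should not be expected to reproduce the 26% reported above; it is only meant to show the shape of the simulation loop.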

The results of all battles were summarized as two numbers: the length of the combat in rounds, and whether the fighter lived or died (Gruumsh is a bastard, and his survival status is essentially irrelevant). Survival probability was plotted by ability score, and also analyzed using multiple logistic regression to assess the odds ratio of death for each level of each ability after adjusting for the other physical abilities. Histograms of hit points and ability score (strength) were also obtained for reference purposes. The odds ratio of death at different values of one score (constitution) was plotted with 95% confidence intervals.

No ethical approval was obtained for this study, and anyone with concerns about the ethics of the study can raise the issue with Gruumsh. Informed consent was not obtained from any subjects (though Gruumsh seemed pretty eager to participate, and said “smash human!” many times, so could probably be said to have given active consent). No medical care or counselling was offered to survivors of the battles, and no reward was offered. The lucky minority who survived probably went off to start a farm or something, but we don’t know because follow-up to assess general physical health or emotional needs was not offered. Experience points were not distributed to the victors, because if we did Gruumsh would have gained enough levels to take over the world and no one wants that. Gruumsh was allowed to feast on the remains of his vanquished foes, because culturally sensitive research techniques are very highly prized at the Faustusnotes Military Academy. All simulations were conducted in R version 2.15.0, and all analyses were carried out in Stata/MP 12 because R sucks for things like making simple tables. The analyst was not blinded to the participants in the study, but if you think he had any interest in scanning a million records of a .csv file looking for fighters to favour, you’re an over-optimistic fool. This study was also not registered with CONSORT, but it’s unlikely that it would get published in any public health journal, so there was no need, really, was there?

fn1: Actually, Gruumsh is unlikely to get a choice. We’ll just roll up the fighters and send them in his direction.

fn2: Actually, if we run a series of level-by-level simulations we could test whether the probability distributions of levels given in the D&D DMG are correct, and come up with empirical estimates of the true proportion of the population who are higher level!