Search Results for 'pathfinder epidemiology'


Introduction

In previous posts in this series, I showed the differences between fighter builds, and especially that “fast fighters” are a weak decision that is particularly bad for halflings and elves even though they are the more agile races. In this post I will approach the question of fighter builds from a different angle, that of the most effective choice of feats, armour and weapons for given attribute scores. Ultimately, the aim of this work is to develop decision models (expressed as flowcharts) for PC development. We will do this through a generalized version of the simulations run to date, in combination with classification and regression tree (CART) methods.

Methods

For this study a completely random character generation method was developed. This simulation program generated random races, ability scores, weapon and armour types and feats subject to the rules in the online Pathfinder System Reference Document (SRD). Weapons were restricted to three choices: rapier, longsword and two-handed sword. Armour types were studded leather, scale, chain shirt and chain mail. There were eight possible feats: improved initiative, dodge, shield focus, weapon focus, power attack, desperate battler, weapon finesse and toughness. Ability scores were generated uniformly within the range 9 to 18, and racial modifiers then applied: the human +2 bonus was applied randomly to one of the three physical attributes. Feats were assigned randomly, with humans having three feats and non-humans two. All fighters with a one-handed weapon were given a light wooden shield. Halflings were given size benefits and disadvantages as described in the SRD. Initial investigation revealed that ability score values were only important in broad categories: ability scores that gave bonuses greater than 0 were good, and bonuses of 0 or less were bad. For further analysis, therefore, all ability scores were categorized accordingly into those that gave a bonus of +1 or greater vs. those that did not.
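For readers who want to play along at home, the generation step might be sketched in Python roughly as follows. This is not the original program: the function and dictionary names are invented for illustration, the non-human racial modifiers are my reading of the SRD's physical adjustments, and the sketch ignores feat prerequisites and halfling size effects.

```python
import random

RACES = ["human", "dwarf", "elf", "halfling"]
WEAPONS = ["rapier", "longsword", "two-handed sword"]
ARMOURS = ["studded leather", "scale", "chain shirt", "chain mail"]
FEATS = ["improved initiative", "dodge", "shield focus", "weapon focus",
         "power attack", "desperate battler", "weapon finesse", "toughness"]
PHYSICAL = ["str", "dex", "con"]

# Assumption: physical racial adjustments as I read them in the SRD.
RACIAL_MODS = {
    "dwarf": {"con": 2},
    "elf": {"dex": 2, "con": -2},
    "halfling": {"dex": 2, "str": -2},
}

def ability_bonus(score):
    """Ability modifier: (score - 10) / 2, rounded down."""
    return (score - 10) // 2

def random_fighter(rng):
    """Generate one random fighter as a plain dictionary (hypothetical layout)."""
    race = rng.choice(RACES)
    scores = {a: rng.randint(9, 18) for a in PHYSICAL}  # uniform on 9..18
    if race == "human":
        scores[rng.choice(PHYSICAL)] += 2  # +2 to one random physical score
    for ability, mod in RACIAL_MODS.get(race, {}).items():
        scores[ability] += mod
    weapon = rng.choice(WEAPONS)
    feats = rng.sample(FEATS, 3 if race == "human" else 2)
    return {
        "race": race,
        "scores": scores,
        "weapon": weapon,
        "armour": rng.choice(ARMOURS),
        "feats": feats,
        # Light wooden shield for anyone with a one-handed weapon.
        "shield": weapon != "two-handed sword",
        # The good/bad split used in the analysis: bonus of +1 or greater.
        "good": {a: ability_bonus(s) >= 1 for a, s in scores.items()},
    }

print(random_fighter(random.Random(42)))
```

Passing an explicit `random.Random` instance keeps a run reproducible, which helps when a result looks suspicious and needs to be regenerated.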

All fighters were pitted in one-to-one melee combat against an Orc, which had randomly determined hit points and the fully operative ferocity special ability. This happened in a cage deep beneath Waterdeep, so no one could run away. Winners were promised a stash of gold and the chance to buy a farm on the Sword Coast, but were actually subsequently press-ganged into military service in the far south, where most of them died of dysentery. A million fights were simulated.

Once data had been collected it was analyzed using classification and regression tree (CART) models implemented in R. CART models enable data to be divided into groups based on patterns within the predictor variables, which enables complex classification and decision rules to be made. Although it is more complex and less reliable than standard regression, CART enables the data to be divided into classification groups without the formulaic restrictions of classical linear models. Results of CART models can be expressed as a kind of flowchart describing the relationship between variables, with ultimate classification giving an estimate of the probability of observing the outcome. In this case the outcome was a horrible death at the hands of an enraged orc, and the probability of this outcome is expressed as a number between 0 and 1. CART results were presented separately by race, in case different races benefited from different choices of feats.
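The core step of a classification tree is choosing the yes/no question that best separates the dead from the living. A minimal sketch of that single step is given below, assuming Gini impurity as the splitting criterion; the R implementation used for the analysis may use different criteria and pruning rules, and the six toy records are invented purely for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 outcomes (1 = the fighter died)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, outcome="died"):
    """Return (feature, weighted impurity) for the best binary split."""
    features = [key for key in rows[0] if key != outcome]
    best = None
    for feature in features:
        yes = [r[outcome] for r in rows if r[feature]]
        no = [r[outcome] for r in rows if not r[feature]]
        if not yes or not no:
            continue  # a split that leaves one side empty tells us nothing
        score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(rows)
        if best is None or score < best[1]:
            best = (feature, score)
    return best

# Invented toy data: NOT taken from the simulation.
rows = [
    {"str_good": True, "toughness": True, "died": 0},
    {"str_good": True, "toughness": False, "died": 0},
    {"str_good": True, "toughness": False, "died": 0},
    {"str_good": False, "toughness": True, "died": 1},
    {"str_good": False, "toughness": False, "died": 1},
    {"str_good": False, "toughness": True, "died": 0},
]
print(best_split(rows))
```

A full tree repeats this step on each resulting subgroup until the groups are too small or too pure to split further, which is what produces the flowchart shape.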

Some univariate analysis was also conducted to show the basic outline of some of the (complex) relationships between variables in this dataset. Univariate analysis was conducted in Stata, and CART was conducted in R.

Results

Of the million brave souls who “agreed” to participate in this experiment, 498,000 (49.8%) survived. Survival varied by race, with 55% of humans surviving and only 45% of halflings making it out alive. Some initial analysis of proportions suggested quite contradictory results for the different feats, with some feats appearing to increase mortality. For example, 47% of those with improved initiative survived, compared to 51% of those without; and 46% of those with shield focus, compared to 52% of those without. This probably represents the opportunity cost of choosing these feats, or some unexpected confounding effect from some other variable.

The three combinations of ability scores and feats with the highest number of observations and the best survival rate were:

  • Dwarf with +3 strength, +3 dexterity, +3 constitution, chain mail armour, rapier, weapon focus and desperate battler (15 observations, 100% survival)
  • Dwarf with +3 strength, +0 dex, +4 con, scale armour, two-handed sword, toughness and weapon focus (13 observations, 100% survival)
  • Dwarf with +3 strength, +2 dex, +3 con, studded leather armour, longsword, desperate battler and power attack (13 observations, 100% survival)

Despite the apparent success of Dwarves, a total of 55% of all unique combinations of ability scores, feats, weapon and armour types with 100% survival were in humans. The majority of the most frequent survival categories appeared to be in non-humans, however – this bears further investigation.

CART results varied by race. For humans, ability scores were most important; for dwarves, weapon type and armour type were important, while constitution was largely irrelevant. For elves and halflings, the only important feat was toughness; weapon finesse was only important for humans, and sometimes only as a negative choice. The key results from the CART analysis were that strength is the single most important variable, followed by dexterity for elves and halflings, or constitution for dwarves; and then by decisions about armour and weapons. Feats are largely relevant only for those with weak ability scores.

As an example, the CART results for humans are presented as a flowchart in Figure 1 (click to enlarge). It is clear that after strength and dexterity, heavy armour and constitution are important determinants of survival. Weapon finesse is only important as a feat to avoid for those with low dexterity – for those with high dexterity it is largely irrelevant. Toughness primarily acts as a counter-balance to poor constitution in those with high dexterity and strength.

Figure 1: Character creation decision model for humans

Decision models for other races will be uploaded in future posts.

Conclusion

This study once again shows that strength is the single most important ability for determining survival in first level fighters, and that feats are largely used to improve survival chances amongst those who already have good ability scores. In previous posts dexterity appeared to be irrelevant, but analysis with CART shows that the absence of a dexterity bonus makes a large difference to survival – those with no dexterity score bonus do not benefit from feat choices, while those who have a dexterity bonus can benefit further by careful choice of armour and feats. Although previous posts found that “tough” fighters have a very high survival rate, this post finds that constitution is not in itself a priority ability score. By following the decision model identified in this study, players can expect to generate a fighter with the highest average survival chance given their ability scores.

This weekend I continued my work on the epidemiology of Pathfinder, including an expansion of my programs to allow for different types of point buy. In the process I took the advice of some commenters at a related thread on the Pathfinder message boards:

I think for the non human fast fighters dropping weapon finesse makes no sense. Because they can hardly hit if they drop that. I would recommend changing it to dropping improved initiative for the fast non-humans.

In my original simulations I had built non-human fast fighters with improved initiative and dodge, but in this revision I changed this around so that non-human fast fighters drop improved initiative and keep weapon finesse. The results, though still not presenting a stirring defense of the decision to play a fast rather than a strong fighter, do bear out the suspicions of those commenting on that board, that for fast fighters weapon finesse is the most important feat to choose. Table 1 compares the results with weapon finesse that I generated today with the previous set of results that dropped weapon finesse in favour of improved initiative. The results in Table 1 are shown for combat with meek orcs (lacking ferocity) to be consistent with the previous post. Similar effects are observed against ferocious orcs, however.

Table 1: Non-human mortality with and without weapon finesse (revised)

Race             No Weapon Finesse   Weapon Finesse   Odds Ratio
Dwarf            43.6                37.0             1.32
Elven Ponce      52.2                44.2             1.38
Halfling Loser   61.6                49.7             1.62

The odds ratios in Table 1 are provided to show which race suffers the most from lack of weapon finesse, and it is no surprise that it is the halflings. This is because they do the least damage, so the loss of hit chances affects them the most.
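The odds ratios in Table 1 can be reproduced from the two mortality columns with a few lines of Python. The helper names here are my own, and the sketch assumes the mortality figures are percentages.

```python
def odds(percent):
    """Convert a mortality percentage into odds of death."""
    p = percent / 100.0
    return p / (1.0 - p)

def odds_ratio(percent_a, percent_b):
    """Odds of death in group A relative to group B."""
    return odds(percent_a) / odds(percent_b)

# (mortality without weapon finesse, mortality with weapon finesse)
table_1 = {
    "Dwarf": (43.6, 37.0),
    "Elven Ponce": (52.2, 44.2),
    "Halfling Loser": (61.6, 49.7),
}

for race, (without, with_finesse) in table_1.items():
    print(race, round(odds_ratio(without, with_finesse), 2))
# Dwarf 1.32
# Elven Ponce 1.38
# Halfling Loser 1.62
```

Odds ratios are used here rather than simple differences because they stay comparable when the baseline mortality differs between races.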

These results don’t change the fundamental conclusion that fast fighters are a very bad choice, but they do indicate that if one is going to pick this fighter build, weapon finesse is a very important feat to choose.

Continuing my series of posts exploring the epidemiology of Pathfinder, today I will report on the impact of adding ferocity to the orc stat block. Is the orc still a CR 1/3 monster when one accounts for ferocity, and just how tough does a fighter have to be to walk away from a fight with a single ferocious orc?

For this simulation (and all sims from now on) I am going to be using my updated and revised modeling program, which has been subject to some fairly severe stress tests and which I’m now fairly certain perfectly mimics a basic combat exchange between an orc and a fighter. I posted revisions here, showing the basic survival probability for three types of fighter and four races, for an orc with no ferocity. This is the basic program I’ll be working with from now on.

Introduction

Previous analyses of survival in Pathfinder have studied conflict between fighters of the four main races and inferior breeds of orc, but it is likely that serious dungeoneering will bring adventurers into conflict with hardier orcs fighting near their lair. It is well known that orcs who maintain a close cultural connection with their tribe are braver and more determined fighters, and this is usually reflected in their ability to fight even when suffering serious physical injuries. For this analysis, this powerful additional trait of “wild” orcs, ferocity, is included in the analysis. Essentially this analysis compares the survival chance of a lone fighter against a lone orc isolated from its tribe, probably in a city, with a lone fighter in combat with a lone orc near its lair, where it will fight beyond death.

Methods

A set of 200,000 simulated battles between randomly-generated fighters and randomly-generated orcs was analyzed using Poisson regression. Orcs and fighters were generated in the standard way, but orcs had a 50% chance of having the ferocity trait, which enables them to continue fighting until they reach -12 hps. A simple main-effects Poisson regression model of survival was built, and the effect of orc ferocity on survival reported from this model; subsequently, a model with interactions between ferocity and all the main variables of interest (fighter type, race and ability bonuses) was also built. Results from both of these models are reported selectively for simplicity.

Results

Mortality for the 100,000 fighters against meek orcs was unchanged, at 37.2%; but for fighters battling ferocious orcs mortality increased significantly, to 63%. Patterns of mortality differences by race and class type were similar to those seen previously, but mortality rates were higher in all class types and races. Table 1 shows mortality rates by race and ferocity type.

Table 1: Mortality rates by race and orc ferocity

Race             Orc Ferocity
                 Meek    Ferocious
Human            30.6    57.1
Dwarf            32.4    60.1
Elven ponce      40.8    65.8
Halfling loser   44.9    68.2

Note that, although survival patterns are maintained in battles against ferocious orcs, the mortality ratios decrease: from a 50% increase in mortality between humans and halflings against meek orcs, for example, to a 20% increase against ferocious orcs. The increase in mortality due to ferocity also varies, from nearly a two-fold increased mortality rate in humans and dwarves to only a 50% increased mortality amongst halflings.
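The ratios quoted above come straight from Table 1, and can be checked with a short Python sketch. The variable names are illustrative, and these are crude mortality ratios rather than the adjusted relative risks from the regression.

```python
# Mortality (%) by race: (meek orc, ferocious orc), copied from Table 1.
mortality = {
    "Human": (30.6, 57.1),
    "Dwarf": (32.4, 60.1),
    "Elven ponce": (40.8, 65.8),
    "Halfling loser": (44.9, 68.2),
}

# How much does ferocity multiply mortality within each race?
ferocity_effect = {
    race: round(ferocious / meek, 2)
    for race, (meek, ferocious) in mortality.items()
}
print(ferocity_effect)

# How much worse off are halflings than humans, for each kind of orc?
human, halfling = mortality["Human"], mortality["Halfling loser"]
print(round(halfling[0] / human[0], 2))  # against meek orcs: 1.47
print(round(halfling[1] / human[1], 2))  # against ferocious orcs: 1.19
```

The narrowing from 1.47 to 1.19 is partly a ceiling effect: once mortality is already above 50%, there is less room for any one race to do proportionally worse.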

In a simple main-effects Poisson regression model ferocity was associated with an average relative risk of mortality of 1.7, which was highly statistically significant (Z=80.12, p value <0.0001). That is, the average increased mortality from adding ferocity to an orc stat block was about 70%. However, in a model including interaction terms between orc ferocity and all main variables (fighter type, race, and all three stat bonuses) the role of orc ferocity varied significantly across ability scores. For example, after adjusting for other ability scores, class type and race, the increased mortality amongst fighters with minimum strength bonus was only 20%, while it was 85% for fighters with a strength bonus of +5. This effect is shown in Figure 1, which plots the relative risk of mortality by strength score for meek compared to ferocious orcs. All relative risks are relative to a fighter with a strength bonus of -2.

Figure 1: Mortality by Strength Ability Score for Meek and Ferocious Orcs

Essentially, strength induces a lower gradient of mortality improvements when fighting tough orcs, and combinations of high scores become more important. In fact, it seems highly unlikely that decent survival will be obtainable for fighters of any race and class type generated using Pathfinder’s standard point-buy systems. These systems will restrict most PCs to ability scores in the 14-16 range, which will not guarantee survival against even a single ferocious orc.

Conclusion

Adding ferocity to an orc’s stat block significantly increases its lethality, with an average increase in mortality risk for fighters in one-to-one combat of about 70% after adjusting for race, class type and ability scores. Even the strongest and most unusual fighters, with ability scores above 18, have surprisingly poor survival of about 30%. Orc ferocity increases mortality across all races and fighter types, with halflings again copping the pointy end of Gruumsh the Bastard’s falchion and incurring death rates of up to 70%. This is further evidence that orcs are not CR 1/3 opponents, and suggests that GMs who want to field orcs as cannon fodder against their PCs should judge numbers carefully, or consider treating ferocity as a leader-type trait. It also suggests that – just on the numbers – Pathfinder is the most lethal of the D&D incarnations, especially when ability scores are restricted by point buy options. This will be tested in subsequent analyses.

In preparing an analysis of the effect of orc ferocity, I found I wasn’t able to reproduce the results of my previous post on different types of fighter and different races. The overall mortality in that post was 20%, but I kept getting values of 36%. Because I’m such a stunningly good programmer, I’d overwritten the program I used to produce those results, and it has taken me several days (interrupted by moving house) to dig up the original programs from Time Machine[1]. Checking through them I found a tiny error (three letters in one line of code out of 375[2]) which causes character hit points not to update after a round of combat – so that the only way the orc could win was to kill the PC on its first round of combat. That’s an interesting insight right there – 20% of the time the orc wins in the first round of combat!

So the true mortality rate in that analysis should have been 36%. I’m not going to redo the whole analysis (it’s late and I’m tired and I have a new analysis of ferocity to come), but I will put up the corrected table of mortality rates by race and fighter type, in Table 1.

Table 1: Mortality by race and fighter type (revised)

Race       Fighter type
           Strong   Fast   Tough
Human      20.4     36.7   35.1
Dwarf      18.8     43.6   36.2
Elf        30.7     52.2   38.7
Halfling   26.1     61.6   45.9

The general conclusion – that fast fighters are a disaster – is retained, but the effect is even more noticeable in elves and halflings, and high strength is even more important for these races than humans. Mortality rates in fast fighters are 1.8 times higher than in strong fighters amongst humans, compared to about 2.4 times higher in halflings. Also, when the orc is not constrained from delivering a second blow, constitution becomes much less important than strength – being able to kill the orc first remains the most important skill.

Dwarves, who in this simulation have dropped power attack if they are strong fighters, benefit hugely from being strong rather than tough, presumably because they already have a constitution bonus.

So, the order of ability scores is: strength, constitution, dexterity. And I need to improve my programming!

fn1: which is awesome, btw.

fn2: which would probably be about 50, if I was any good at this stuff

After taking account of comments here and on the Paizo messageboards, I have adapted my simulation programs to allow for purposive attribute scores, feats and races, and re-analyzed the survival data for a smaller sample of more carefully designed fighters. In this second round of analyses Gruumsh the Bastard doesn’t acquit himself well, but neither do some of the PCs who went against him. This post reports on the updated analyses.

Update (3rd July 2012): In editing my code to incorporate some minor changes, I noticed that I didn’t actually pit 100,000 fighters against 100,000 randomly-generated orcs – I pitted 100,000 fighters against Gruumsh, who only has 6 hit points. Against a full range of Orcs one gets very different results – I will report on this today (3rd July 2012). This post has been edited to remove references to 100,000 randomly-generated orcs.

Introduction

Previous analyses of survival in Pathfinder have relied on randomly generated ability scores assigned in order, and have not incorporated feats, race, fighting styles or weapon types. In this post the analyses are updated to allow for a range of basic feats, four races, purposive rather than completely random assignment of ability scores, and three types of fighter: strong, fast and tough. Survival is compared against Gruumsh again, and results analyzed for insights into possible character creation decisions.

Methods

A sample of 100,000 randomly generated fighters were pitted in battle against Gruumsh, who is still not ferocious. The fighters were generated so as to fall into three types, defined by ability scores, armour and weapon types, and feat choices:

  • Strong fighters: strength was determined randomly from a uniform distribution between 13 and 18, and the fighters were equipped with scale mail and a two-handed sword. Human fighters had three feats: power attack, weapon focus and desperate battler. Humans placed their +2 ability score bonus in strength. Non-human fighters dropped power attack.
  • Fast fighters: dexterity was determined randomly from a uniform distribution between 13 and 18, and the fighters were equipped with studded leather armour, a heavy wooden shield and a rapier. Human fighters had three feats: improved initiative, dodge and weapon finesse. Non-humans dropped weapon finesse, and humans put their +2 bonus into dexterity.
  • Tough fighters: constitution was determined randomly from a uniform distribution between 13 and 18, and the fighters were equipped with chain shirt, wooden shield and longsword. Human fighters had three feats: toughness, shield focus and weapon focus. Non-humans dropped toughness (because two of the races already had +2 constitution), and humans put their +2 bonus into constitution.

All other physical stats were generated with 3d6, but scores below 9 were reset to 9. Mental stats were generated using 3d6 in order, but nobody cares if their meat shield has read Shakespeare, so the details aren’t reported here. The hapless 100,000 were then thrown against Gruumsh, with the promise that anyone who survived would get to meet Salma Hayek. Needless to say, I lied: for unknown reasons, Hayek only dates bards. All fighters with power attack were assumed to be using it for every strike, and you would too if you met Gruumsh.

Results

After incorporating racial bonuses and feats, and assigning ability scores purposively rather than randomly, overall survival increased significantly: only 20% of the newly trained fighters died. However, variation in survival was significant and depended heavily on race and fighting style. Table 1 shows the mortality rates by race and fighter types.

Table 1: Mortality Rates by Race and Fighter Type

Race             Strong   Fast   Tough
Human            17.1     26.5   0
Dwarf            11.1     21.8   1.1
Elven Ponce      27.3     46.9   16.4
Halfling Loser   21.2     45.3   8.3

From Table 1 it is clear that elves and halflings are not good fighters, and Dwarves are excellent in this particular role. The small difference in mortality between humans and dwarves is probably due to the reduced number of feats that dwarves have relative to humans. In fact, once feats and purposive ability score selection are included in character development, constitution becomes an extremely important score: 0% of fighters with constitution bonuses above 3 died. This is probably because CON bonuses of 3 or more guarantee a fighter cannot be killed in a single blow by an Orc (maximum damage 12) and the increased damage and hit stats of these fighters mean the orc will not survive to deliver a second blow. This is indisputably a good thing.
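The arithmetic behind the one-blow threshold is easy to check. The sketch below assumes a d10 hit die maximised at first level, a +3 hit point bonus from toughness, and an orc falchion doing 2d4+4; like the figure of 12 quoted above, it ignores critical hits. The function names are made up for the example.

```python
FIGHTER_HIT_DIE = 10        # assumption: d10, maximised at first level
ORC_MAX_DAMAGE = 2 * 4 + 4  # 2d4+4 falchion, ignoring critical hits

def first_level_hp(con_bonus, toughness=False):
    """Hit points of a first level fighter under the assumptions above."""
    return FIGHTER_HIT_DIE + con_bonus + (3 if toughness else 0)

def survives_max_hit(con_bonus, toughness=False):
    """True if one maximum-damage blow leaves the fighter above 0 hit points."""
    return first_level_hp(con_bonus, toughness) - ORC_MAX_DAMAGE > 0

for con_bonus in range(0, 5):
    print(con_bonus, first_level_hp(con_bonus), survives_max_hit(con_bonus))
```

A constitution bonus of +3 gives 13 hit points, one more than the orc's best non-critical blow, which is where the threshold sits. Toughness moves the same threshold down to a bonus of +0.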

It is clear from Table 1 that the least successful form of fighter is the fast fighter, and indeed some perverse results obtain. Figure 1 shows the mortality rate by dexterity score: mortality increases with increasing dexterity in this dataset. This is probably because higher dexterity scores are more likely in the “fast fighter” choice, and amongst halflings, both of which deliver less damage than other races and class types.

Figure 1: Mortality by dexterity score

A similar perverse result is visible with armour class. Figure 2 shows the relationship between mortality and armour class, which is positive.

Figure 2: Mortality by Armour Class

Again, it is likely that the highest armour class values are only achieved by halflings (who have size bonuses), and higher AC is associated with lower damage and attack values. Note that fast fighters have very high initiative values (up to +9!) but these don’t seem to sway the battle: for fighters who start with a minimum of 8 hit points, starting the battle first is less important than being able to hit your opponent and do massive amounts of damage.

Conclusion

Dexterity is useless, and a fighting style based on light armour and fast weapons is a waste of time. As a result, weapon finesse is the ultimate wasted feat: it could have been used to get 3 more hit points, which for a first level fighter guarantees that one strike from an Orc will not be fatal. After incorporating feats, the best option for a first level fighter is to choose toughness, shield focus and weapon focus, and pour as many points as possible into constitution. 17 hit points, chain armour and a shield at first level are vastly more useful than a fancy fighting style and a leather skirt!

In yesterday’s analysis I made the mistake of assigning random HPs to the fighters, which is not the way that Pathfinder works: at first level in Pathfinder all PCs receive maximum hit points. Thus yesterday’s post is actually a fairly faithful representation of survival in D&D 3.5 rather than Pathfinder. Today I’ve brushed off a particularly irritable Gruumsh and set him to work against another million random fighters, this time with properly-adjusted hit points, in order to see what effect this rule has on the relative importance of stats.

The result is that the relative importance of the three ability scores doesn’t change, but overall survival probability has increased to 39%. For 15% of our army, that’s good news. The curves depicting overall survival rates don’t change overmuch though (Figure 1), they just start from a higher base.

Figure 1: Survival Rates by Ability Score, Maximum HPs at Level 1

Figure 2 shows how the survival probabilities have changed for constitution when fighters start with maximum HPs compared to random HPs.

Figure 2: Relationship Between Survival and Constitution for Fixed vs. Random HPs

The odds ratios change only a little, showing the same overall pattern (Table 1).

Variable      OR     P value   Confidence Interval
Strength
  2 to 3      1
  4 to 5      0.24   0.001     0.11 to 0.53
  6 to 7      0.08   0         0.04 to 0.19
  8 to 9      0.04   0         0.02 to 0.08
  10 to 11    0.02   0         0.01 to 0.04
  12 to 13    0.01   0         0.01 to 0.03
  14 to 15    0.01   0         0 to 0.01
  16 to 17    0      0         0 to 0.01
  18 to 19    0      0         0 to 0.01
Dexterity
  2 to 3      1
  4 to 5      0.99   0.9       0.82 to 1.19
  6 to 7      0.86   0.1       0.72 to 1.03
  8 to 9      0.72   0         0.61 to 0.86
  10 to 11    0.6    0         0.5 to 0.72
  12 to 13    0.49   0         0.41 to 0.58
  14 to 15    0.39   0         0.33 to 0.47
  16 to 17    0.31   0         0.26 to 0.37
  18 to 19    0.23   0         0.19 to 0.28
Constitution
  2 to 3      1
  4 to 5      0.95   0.56      0.78 to 1.14
  6 to 7      0.75   0.002     0.63 to 0.90
  8 to 9      0.56   0         0.47 to 0.67
  10 to 11    0.4    0         0.33 to 0.47
  12 to 13    0.31   0         0.26 to 0.37
  14 to 15    0.26   0         0.22 to 0.31
  16 to 17    0.24   0         0.2 to 0.29
  18 to 19    0.24   0         0.2 to 0.29

These are quite similar odds ratios to the situation with random Hit Points, except that the effect of higher constitution scores is a little greater (though still not as important as dexterity). Figure 3 shows the revised odds ratios for constitution.

Figure 3: Odds of mortality by constitution score, maximum hit points at first level

Conclusion

In Pathfinder, applying the proper rule at first level in which all fighters receive maximum hit points, overall survival increases from 25 to 38%, but constitution remains the least important ability score. The similarity in effect of dexterity and constitution in this revised simulation suggests that the role of feats will be crucial in determining which ability score to prioritize after strength, but the most important ability score remains strength. Probably over multiple levels, as random hit points begin to take their toll, constitution will be more important than dexterity, but we will test that later. The main finding is that although maximum hit points increase survival overall relative to D&D 3.5, they don’t change the overall importance of strength, and they do narrow the difference between dexterity and constitution.

I’ve decided to begin a long-term research project aimed at understanding the underlying epidemiology of Dungeons and Dragons. This research project will consist of a series of (hopefully) increasingly complex simulations of battles between D&D PCs and various nemeses, to answer some key questions in character development and perhaps also to investigate some key controversies in the game. Once I have developed my simulations I hope to extend the project to Exalted, and I might diversify beyond that too.

The simple weight of experience in D&D means that most people know, or feel they know, how D&D works and how the roll of the dice determines a PC’s fate. I have noticed that sometimes our intuitive understanding of these things can be wrong, and I’d like to investigate D&D in enough detail to understand how it works. I’ll write a separate post about some of the principles of the research project, but in this post I’ll present the first analysis.

Introduction

In this post a million battles are simulated between a million randomly-generated fighters and a single (unfortunate) Orc, Gruumsh The Bastard, who has 6 hit points and does 2d4+4 damage with his nasty falchion of fighter-crunching. Both Gruumsh and the million fighters were generated using Pathfinder rules as set out in the System Reference Document. These million battles were run in order to identify the effect of the three basic physical ability scores (Strength, Dexterity and Constitution) on survival for a standard fighter.

Methods Summary

Detailed methods are described at the end of the post. In essence, a million Pathfinder fighters were generated randomly and pitted against Gruumsh the Bastard in simulated battles. Fighter survival was analyzed using multiple logistic regression analysis by ability score. Survival probabilities by ability score are plotted in charts and summarized as Odds Ratios in the logistic regression analysis. No interactions or complex higher effects were considered. The distribution of hit points was summarized using a histogram, but doesn’t represent the true (practical) distribution of hit points for a fighter, since it includes fighters with unrealistically low constitution scores.
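To give a flavour of what one simulated battle involves, here is a heavily simplified Python sketch. It is not the program used for the analysis. Gruumsh's 6 hit points and 2d4+4 damage come from this post; everything else (orc AC 13 and +5 to hit, a fighter with +4 armour, a d8 weapon and +1 base attack, no critical hits, no feats) is an assumption made for illustration.

```python
import random

def roll(rng, n, sides):
    """Sum of n dice with the given number of sides."""
    return sum(rng.randint(1, sides) for _ in range(n))

def fighter_survives(rng, str_bonus, dex_bonus, con_bonus, max_hp=True):
    """Simulate one duel; return True if the fighter kills the orc."""
    orc_hp = 6                   # from the post
    orc_ac, orc_attack = 13, 5   # assumed orc statistics
    fighter_ac = 10 + 4 + dex_bonus   # assumed +4 armour bonus
    fighter_attack = 1 + str_bonus    # assumed +1 base attack bonus
    hit_die = 10 if max_hp else rng.randint(1, 10)
    fighter_hp = max(1, hit_die + con_bonus)

    # Initiative: d20 + dex bonus against a flat d20 for the orc.
    fighters_turn = rng.randint(1, 20) + dex_bonus >= rng.randint(1, 20)
    # The loop ends because, for ordinary bonuses, each side can still hit.
    while True:
        if fighters_turn:
            if rng.randint(1, 20) + fighter_attack >= orc_ac:
                orc_hp -= max(1, roll(rng, 1, 8) + str_bonus)
                if orc_hp <= 0:
                    return True
        else:
            if rng.randint(1, 20) + orc_attack >= fighter_ac:
                fighter_hp -= roll(rng, 2, 4) + 4
                if fighter_hp <= 0:
                    return False
        fighters_turn = not fighters_turn

def survival_rate(n, str_bonus, dex_bonus, con_bonus, seed=0):
    """Fraction of n simulated fighters who survive."""
    rng = random.Random(seed)
    wins = sum(fighter_survives(rng, str_bonus, dex_bonus, con_bonus)
               for _ in range(n))
    return wins / n

strong = survival_rate(5000, str_bonus=4, dex_bonus=0, con_bonus=0)
weak = survival_rate(5000, str_bonus=-2, dex_bonus=0, con_bonus=0)
print(strong, weak)
```

Setting `max_hp=False` rolls the hit die instead of maximising it, which is the simplest way to compare random and maximum first level hit points within the same sketch.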

Results

Things didn’t go well for the million fighters. Overall survival was just 26%, with 256,584 lucky fighters making it to the end of their battle. The remaining 743,416 fighters were smashed to ribbons by Gruumsh and, in many cases, eaten. The median length of a battle was 4 rounds where the fighter survived, or 3 rounds if Gruumsh won. Figure 1 shows the probability of survival by ability score, and shows some stark differences in effect between ability scores.

Figure 1: Probability of Survival by Ability Score

It is clear from Figure 1 that strength is the key determinant of survival for a first level fighter. Only 0.4% of the weakest fighters survived, compared to 55% of the strongest. Constitution has barely any effect on survival, and dexterity is only important at the extreme ends of its range.

Table 1 summarizes the results of multiple logistic regression of mortality. In this table, the odds ratio of death is given after adjusting for the other two ability scores, which removes the confounding effect of high or low values in other relevant ability scores. All odds ratios are given relative to the lowest value of the corresponding ability score, so for example those with strength 18 – 19 have an odds ratio of mortality of 0.003 compared to those with a strength of 2-3.

Table 1: Multiple Logistic Regression of Death by Ability Score
Variable      Odds Ratio   95% Confidence Interval   P value
Strength
  2 to 3      Ref.
  4 to 5      0.21         0.06 – 0.66               0.008
  6 to 7      0.07         0.02 – 0.21               <0.001
  8 to 9      0.03         0.01 – 0.10               <0.001
  10 to 11    0.02         0.01 – 0.06               <0.001
  12 to 13    0.01         0 – 0.03                  <0.001
  14 to 15    0.006        0 – 0.02                  <0.001
  16 to 17    0.004        0 – 0.01                  <0.001
  18 to 19    0.003        0 – 0.01                  <0.001
Dexterity
  2 to 3      1
  4 to 5      0.87         0.69 – 1.10               0.236
  6 to 7      0.76         0.61 – 0.94               0.012
  8 to 9      0.65         0.53 – 0.81               <0.001
  10 to 11    0.54         0.44 – 0.67               <0.001
  12 to 13    0.45         0.36 – 0.55               <0.001
  14 to 15    0.37         0.3 – 0.45                <0.001
  16 to 17    0.3          0.24 – 0.37               <0.001
  18 to 19    0.23         0.18 – 0.28               <0.001
Constitution
  2 to 3      1
  4 to 5      0.9          0.73 – 1.11               0.307
  6 to 7      0.86         0.71 – 1.04               0.113
  8 to 9      0.82         0.68 – 0.99               0.044
  10 to 11    0.72         0.6 – 0.87                0.001
  12 to 13    0.63         0.52 – 0.76               <0.001
  14 to 15    0.55         0.45 – 0.66               <0.001
  16 to 17    0.48         0.4 – 0.58                <0.001
  18 to 19    0.41         0.34 – 0.49               <0.001

There is no difference statistically between a constitution score of 6-7 and a score of 2-3 – everyone with a constitution score in this range is purely at the mercy of the dice. In comparison, increasing strength from 3 to 4 reduces the odds of death by a factor of five, and fighters with a strength of 18 have an odds of mortality 300 times lower than fighters with a strength of three. Truly, fortune favours the strong.
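The "factor of five" and "300 times" figures are just the reciprocals of the odds ratios in Table 1, rounded for the sake of a good story. The Python sketch below shows the arithmetic, plus a hypothetical helper for turning an odds ratio back into a probability; the 90% baseline used at the end is invented for the example and is not a result from the simulation.

```python
def fold_reduction(odds_ratio):
    """How many times lower the odds of death are than in the reference group."""
    return 1.0 / odds_ratio

def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability of death after scaling the baseline odds by an odds ratio."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)

print(round(fold_reduction(0.21), 1))   # strength 4-5 vs 2-3: 4.8
print(round(fold_reduction(0.003)))     # strength 18-19 vs 2-3: 333

# Hypothetical baseline: if 90% of the reference group died, an odds
# ratio of 0.5 would cut mortality to about 82%, not to 45%.
print(round(apply_odds_ratio(0.9, 0.5), 3))
```

The last line is the reason odds ratios need care when mortality is high: halving the odds is not the same as halving the probability.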

Figure 2 shows the odds ratio of mortality for constitution with its 95% confidence intervals, as a graphical alternative to a portion of Table 1 (we promised Gruumsh we would describe his victory in pretty pictures).

Figure 2: Odds Ratio of Survival by Constitution Score

Figure 2 suggests that hit points are not as important to combat survival as the ability to smash your opponent into the dirt. Once the Toughness feat is incorporated into simulations, constitution is likely to become even less important, and should probably be treated as a dump stat by players. Given that choosing the Toughness feat is equivalent to making a large increase in constitution, but this increase in constitution gives a barely-statistically-significant reduction in mortality, it seems likely that this feat is not a very useful choice. If Gruumsh is willing, this will be investigated in subsequent analyses[1].

The distribution of strength ability scores under the 4d6 choose-the-best-three method is shown in Figure 3. This method shifts the scores significantly to the right: only 754 fighters had a strength of 3, compared to 16,141 who had a strength of 18. The mean strength was 12.24 and the median 12, a shift of a little under two points from the mean of 10.5 under a standard 3d6 distribution, and a huge change to the extreme values.

Figure 3: Distribution of Strength Scores Under 4d6 choose-the-best-three

Nearly 5% of the sample had at least one physical score of 18; but this method is still not perfect, with only 3 of one million fighters having a score of 18 in all three physical attributes (one of these three, who also had an intelligence of 15 and a charisma of 16, was beaten to a bloody pulp by Gruumsh in just three rounds. His liver, apparently, was exquisite when grilled lightly and eaten on rye bread with a dark ale).
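The dice arithmetic behind Figure 3 and the numbers above can be checked exactly by enumerating all 1,296 ways four dice can fall. The sketch below is in Python rather than the R used for the simulations, and the function and variable names are my own; it also assumes the three physical scores are rolled independently, as they are under roll-in-order.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def best_three_of_4d6():
    """Exact distribution of 4d6 choose-the-best-three (drop the lowest die)."""
    counts = Counter()
    for roll in product(range(1, 7), repeat=4):
        counts[sum(roll) - min(roll)] += 1
    total = 6 ** 4
    return {score: Fraction(n, total) for score, n in sorted(counts.items())}


dist = best_three_of_4d6()
mean_score = sum(score * p for score, p in dist.items())

p3, p18 = dist[3], dist[18]
p_any_18 = 1 - (1 - p18) ** 3   # at least one of three physical scores is 18
p_all_18 = p18 ** 3             # all three physical scores are 18

print(f"mean score: {float(mean_score):.2f}")                              # 12.24
print(f"expected 3s per million: {float(p3) * 1e6:.0f}")                   # 772
print(f"expected 18s per million: {float(p18) * 1e6:.0f}")                 # 16204
print(f"at least one 18 in three scores: {float(p_any_18):.3f}")           # 0.048
print(f"all three scores 18, per million: {float(p_all_18) * 1e6:.2f}")    # 4.25
```

These expected values sit close to the simulated counts (754 threes, 16,141 eighteens, 3 triple-18 fighters), which is a useful sanity check on the character generator.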

Figure 4 shows the distribution of hit points in this sample of 1 million fighters. This is not the distribution one would actually see in a sample of actual Pathfinder fighters, since in a real game most fighters will have non-negative constitution bonuses (unless their player has read this post, I suppose). This histogram shows an interesting effect, however: even when constitution is unrestricted, under a 4d6/choose-the-best-three system there is a heavy concentration of hit points in the range of 4 – 10. Median hit points in this sample were 6, and the average hit point total was 6.2: in fact, the hit point distribution looks remarkably close to a uniform distribution on the range 4 – 10!

Figure 4: Distribution of Hit Points

Survival was not strongly associated with hit point value: those with 1 hit point survived in 20% of battles, while those with 14 hit points survived in 50% of battles. This extra importance of hps relative to constitution is driven entirely by the extra die roll (the d10 for hps), which suggests that constitution would be of much greater importance if hit points were fixed at first level. Equivalently, it may be that the role of constitution is washed out by the random determination of hit points, and if so one can expect that constitution will be more important at later levels when the law of large numbers cancels out the random effect of dice rolls on survival. For the same reason strength will probably reduce in importance over levels, since its effect is not compounded with level as constitution is. This is an issue that will need to be investigated, although if survival probabilities are replicated at second level it’s unlikely we will have much of a sample size of high level PCs[2].

Conclusion

At first level, strength is far and away the most important ability score for fighters, and constitution is so insignificant as to be almost a dump stat. A fighter with strength of 18 has only 1/300th the odds of death of a fighter with strength 3 when fighting a single Orc. Overall survival rates were low even in the toughest fighters, and in the absence of feats it appears that Pathfinder is an extremely nasty environment for solo adventuring.

Future research will investigate the role of feats in enhancing survival, and their importance relative to ability scores. The results presented here are preliminary, but it appears that in min-maxing fighter PCs the wisest choice is to prioritize strength, then dexterity, then constitution. If one is developing a PC with the intention of long-term survival these findings may be reversed, but the experimental results have not yet been collated.

Finally, the results presented here suggest that the assignment of a 1/3 challenge rating (CR) to Orcs in Pathfinder may be unwarranted. Although data are not shown here, in the testing stage this simulation program was run on Goblins (also CR 1/3) and the fighter survival rate was much higher. It may be the case that Orcs are far more challenging than a CR of 1/3. It’s not clear how Pathfinder assign their CRs, but it seems natural to suppose that a creature with a more than 50% chance of defeating an average human fighter is more than CR 1. Are Pathfinder’s CRs accurate? In any case, basic advice to fighters in Pathfinder would be: hunt Goblins, not Orcs, they’re much lower risk for the same xps.

Methods

For this analysis the fighters were generated according to the following rules:

  • All ability scores were generated using 4d6 choose-the-best-three, rolled in order: This is not orthodox Pathfinder but enables simultaneous estimation of the probability distribution of ability scores under this commonly-used rule, and enables analysis of the effect of ability scores across their full range – not just in the high values that one would usually assign to a PC’s prime characteristics
  • No feats were assigned to the fighter: for this first analysis the effect of raw scores was the topic of analysis, so no special abilities were given to the fighters. These million meat-shields were cast into battle with only their raw talents at their disposal
  • All fighters had the same equipment: raising a levy of a million fighters takes only a minute in 64 bit R, but it’s clearly a costly imposition on the citizenry, so all fighters were assigned standard kit consisting of chain mail armour, a standard shield, and a longsword. If we can secure a sufficiently large research grant from Waterdeep, in subsequent battles we will allow random variation in armour types in order to choose the best armour
  • Racial abilities were not tested: no racial ability score adjustments or size bonuses were tested. Only raw scores were used. In future battles, racial ability scores will be incorporated into the PCs. Anyway, who cares if a halfling lives or dies?

The results of all battles were summarized as two numbers: length of the combat in rounds, and whether or not the fighter lived or died (Gruumsh is a bastard, and his survival status is essentially irrelevant). Survival probability was plotted by ability score, and also analyzed using multiple logistic regression to assess the odds ratio of survival for any level of any ability after adjusting for all other abilities. Histograms of hit points and ability score (strength) were also obtained for reference purposes. The odds ratio of survival at different values of one score (constitution) was plotted with 95% confidence intervals.

No ethical approval was obtained for this study, and anyone with concerns about the ethics of the study can raise the issue with Gruumsh. Informed consent was not obtained from any subjects (though Gruumsh seemed pretty eager to participate, and said “smash human!” many times, so could probably be said to have given active consent). No medical care or counselling was offered to survivors of the battles, and no reward was offered. The lucky minority who survived probably went off to start a farm or something, but we don’t know because follow-up to assess general physical health or emotional needs was not offered. Experience points were not distributed to the victors, because if we did Gruumsh would have gained enough levels to take over the world and no one wants that. Gruumsh was allowed to feast on the remains of his vanquished foes, because culturally sensitive research techniques are very highly prized at the Faustusnotes Military Academy. All simulations were conducted in R version 2.15.0, and all analyses were carried out in Stata/MP 12 because R sucks for things like making simple tables. The analyst was not blinded to the participants in the study, but if you think he had any interest in scanning a million records of a .csv file looking for fighters to favour, you’re an over-optimistic fool. This study was also not registered with CONSORT, but it’s unlikely that it would get published in any public health journal, so there was no need, really, was there?

fn1: Actually, Gruumsh is unlikely to get a choice. We’ll just roll up the fighters and send them in his direction.

fn2: Actually, if we run a series of level-by-level simulations we could test whether the probability distributions of levels given in the D&D DMG are correct, and come up with empirical estimates of the true proportion of the population who are higher level!

Over the past few years I’ve looked at a lot of the probabilistic and statistical aspects of specific game designs, from the Japanese game Double Cross 3 to Pathfinder, including comparing different systems and providing some general notes on dice pools. I’ve also played various amounts of World of Darkness, Iron Kingdoms, D&D, Warhammer 2 and 3, and some Japanese systems, all of which have quite diverse systems. Given this experience and the analytical background, it seems reasonable to start drawing it all together to ponder what makes for some good basic principles of RPG system design. I don’t mean here the ineffable substance of a good RPG, rather I mean the kind of basic mechanical details that can make or break a system for long term play, regardless of its world-building, background and design. For example, I think Shadowrun might be broken in its basic form, to the extent that people who try playing it for any length of time get exasperated, and this might explain why every gaming company that handles Shadowrun seems to go bust.

So, here is a brief list of what I think might be some important principles to use in the development of games. Of course they’re all just my opinion, which comes with the usual disclaimers. Have at ’em in comments if you think any are egregiously bad!

  • Dice pools are fun: everyone likes rolling handfuls of dice, and the weighty feeling of a big hand of dice before a big attack really makes you feel viscerally there, in comparison to a single d20
  • Big or complex dice pools suck: Big dice pools can really slow down the construction and counting parts of rolling a skill check, but on top of this they are basically constructing a binomial distribution, and with more than a couple of trials (dice) in a binomial distribution, it’s extremely hard to get very low numbers of successes. So large and complex dice pools need to be limited, or reserved for super-special attacks
  • Attacks should use a single roll: Having opposed skill checks in combat means doubling the number of rolls, and really slows things down. Having cast around through a lot of different systems, I have to say that the saving throw mechanism of D&D is really effective, because it reduces the attack to one roll and it makes the PC the agent of their own demise or survival when someone attacks them. On the other hand, rolling to hit and then rolling to damage seems terribly inefficient
  • Where possible, the PC should be the agent of the check: that is, if there is a choice in the rules where the GM could roll to affect the PC, or the PC could roll to avoid being affected by the GM, the latter choice is better. See my note above on saving throws.
  • Efficiency of resolution is important: the fewer rolls, counts and general faffs, the better.
  • Probability distributions should be intuitively understandable: or at least, explainable in the rules – and estimates of the effect of changes to the dice system (bonuses, extra dice, etc.) should be explained so GMs can understand how to handle challenges
  • Skill should affect defense: so many games (D&D and World of Darkness as immediate examples) don’t incorporate the PC’s skills into defense at all, or much. In both games, armour and attributes are the entire determinant of your defense. This is just silly. Attributes alone should not determine how well you survive.
  • Attributes should never be double-counted: In Warhammer 3, Toughness determines your hit points and acts as soak in combat; in D&D strength determines your chance to hit and is then added again to your damage. In both cases this means that your attribute is being given twice the weight in a crucial challenge. This should be avoided.
  • Fatigue and resource-management add risk and fun: Fighting and running and being blown up are exhausting, and so is casting spells; a mechanism for incorporating this into how your PCs decide what to do next is important. Most games have this (even D&D’s spells-per-day mechanism is basically a fatigue mechanism, if a somewhat blunt one), and I would argue that where possible adding elements of randomness to this mechanism really makes the player’s task interesting. But …
  • Resource-management should not be time-consuming: this is a big problem of Warhammer 3, which combined fatigue management with cool-downs and power points. Too much!
  • The PCs should have a game-breaker: we’re heroes after all. Edge, Fate, Feat points, Fortune … many games have this property, and it’s really useful both as a circuit-breaker for times when the GM completely miscalculates adversaries, and as ways for players to escape from disastrous scenarios, and to add heroism to the game
  • Skills should be broad, simple and accessible: The path of Maximum Skill Diversity laid out in Pathfinder is not a good path. The simplification and generalization of skills laid out in Warhammer 3 is the way to go.
  • Wizards should have utility magic: the 13th Age/D&D 4th Edition idea of reducing magic to just another kind of weapon is really a fun-killer. The AD&D list of millions of useless spells that you one day find yourself really needing is a much more fun and enjoyable way of being a wizard. It’s telling that D&D 5th Edition has resurrected this.
  • Character classes and levels are fun: I don’t know why, they just are. Anyone who claims they didn’t like the beautifully drawn and elaborate career section of Warhammer 2 is lying. Sure, diversity should be possible within careers but there should be distinction between careers and clarity in their separate roles (something that, for example, doesn’t seem to actually be a strong point of Iron Kingdoms despite its huge range of careers). At higher levels characters should really rock in the main roles of their class
  • Bards suck: they just do. Social skills should be important in games, but elevating them to a central class trait really should be reserved for very specialized game settings. Bards suck in Rolemaster, they suck in D&D, they suck in 13th Age and they suck in Iron Kingdoms. Don’t play a bard.
  • Magic should be powerful: John Micksen, my current World of Darkness Mage, is awesome, but mainly because he is cleverly combining 4 ranks in life magic and 3 ranks in fate magic with some serious physical prowess and a +5 magic sword (Excalibur, in fact!) to get his 21 dice of awesome. Most of the spells in the Mage book suck, and if you made the mistake of playing a mage who specializes in Prime and Spirit… well, basically you’re doomed, and everyone is going to think you’re a loser. Mages should be powerful and their powers – which in every system seem to come with risk for no apparent justifiable reason – should be something that others are afraid of. You’ll never meet a World of Darkness group who yell “get the mage first!” What’s the point of that?
  • Death spirals are important: PCs should be aware that the longer they are in a battle, the more risky it gets for them. They should be afraid of every wound, and should be willing to consider withdrawal from combat rather than continuing, before the TPK. Death spirals are an excellent way to achieve this combination of caution and ultra-violence. Getting hit hurts, and players should be subjected to a mechanism that reminds them of that.
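The binomial point in the dice pool entry above is easy to see with a few numbers. This is a minimal Python sketch; the per-die success chance of 1/2 (say, a success on 4+ on a d6) is just an assumed example and doesn't correspond to any particular system, and `pool_success_pmf` is a hypothetical name.

```python
from math import comb


def pool_success_pmf(n_dice, p_success):
    """Binomial probabilities of 0..n_dice successes in a pool of n_dice dice."""
    return [
        comb(n_dice, k) * p_success ** k * (1 - p_success) ** (n_dice - k)
        for k in range(n_dice + 1)
    ]


# ASSUMPTION: each die succeeds half the time (e.g. 4+ on a d6).
p = 0.5
for n in (2, 5, 10):
    pmf = pool_success_pmf(n, p)
    print(f"{n:2d} dice: P(no successes) = {pmf[0]:.4f}, "
          f"P(at most one success) = {pmf[0] + pmf[1]:.4f}")
```

With two dice a complete failure turns up a quarter of the time; with ten dice it is roughly a one-in-a-thousand event, which is why large pools make bad outcomes nearly impossible.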

I don’t know if any game can live up to all these principles, though it’s possible a simplified version of Shadowrun might cut it, and some aspects of the simplified Warhammer 3 I used recently came close (though ultimately that system remains irretrievably broken). Is there any system that meets all of these principles?

Recent conflicts in Iron Kingdoms (which culminated in my character’s necessary death) have introduced me to the fascinating problem of feat point budgets, and methods for estimating the optimal use of feat points. Basically in Iron Kingdoms every PC has three feat points (in Warhammer 3rd Edition these would be fortune points; I think many games have this system). Feat points can be used to boost attacks or damage (or for various other tasks), and in the case of trollkin for regeneration. They are regained through rolling criticals or killing enemies or through GM fiat. Thus expending a feat point to kill someone can be cost free. But you only have three, so expending them too early or in an inefficient way can be catastrophic (as my party discovered, to Carlass’s great cost!) So it’s important to decide where to spend them.

The combat system in Iron Kingdoms is very simple:

  • Attack: roll 2d6 + attack value, you hit if you beat the target’s defense
  • Damage: roll 2d6 + weapon power, all points greater than the target’s armour do damage

That is, you have a threshold for success followed by a threshold for damage, with results above the latter threshold being more important if they are higher. Typically an enemy will have between 5 and 15 hps you can knock down, so a good result on the damage roll can be fatal. However, the attack roll is 2d6 so small improvements in bonuses are very important when attacking high-defense enemies.

Feat points can be spent to add 1d6 to either of these rolls. Adding a feat point to the attack roll increases the chance of hitting, but can be wasted if your target has high armour; adding a feat point to the damage roll can do a lot of extra damage but only works if you actually hit.
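When the thresholds are assumed to be known, the trade-off can be worked out exactly. Here is a minimal Python sketch that compares expected damage for no boost, a boosted attack and a boosted damage roll. The attack, defense, power and armour values are made up for illustration, expected damage is only one possible yardstick (the chance of a kill might matter more in practice), and a hit is taken to mean strictly beating the defense, following the wording above.

```python
from fractions import Fraction
from itertools import product


def roll_distribution(n_dice):
    """Exact distribution of the sum of n_dice six-sided dice."""
    dist = {}
    for roll in product(range(1, 7), repeat=n_dice):
        total = sum(roll)
        dist[total] = dist.get(total, 0) + Fraction(1, 6 ** n_dice)
    return dist


def hit_chance(attack, defense, n_dice):
    """Probability that the attack roll plus attack value beats the defense."""
    return sum(p for total, p in roll_distribution(n_dice).items()
               if total + attack > defense)


def mean_damage_on_hit(power, armour, n_dice):
    """Average damage getting past armour, given that the attack hit."""
    return sum(p * max(0, total + power - armour)
               for total, p in roll_distribution(n_dice).items())


def expected_damage(attack, defense, power, armour, boost="none"):
    """Expected damage for boost in {'none', 'attack', 'damage'}."""
    attack_dice = 3 if boost == "attack" else 2
    damage_dice = 3 if boost == "damage" else 2
    return (hit_chance(attack, defense, attack_dice)
            * mean_damage_on_hit(power, armour, damage_dice))


# ASSUMPTION: hypothetical values for a hard-to-hit, well-armoured enemy.
attack, defense, power, armour = 6, 15, 10, 16
for boost in ("none", "attack", "damage"):
    value = expected_damage(attack, defense, power, armour, boost)
    print(f"boost {boost:>6}: expected damage {float(value):.2f}")
```

Changing the four assumed values shows how the better choice flips: against a target that is easy to hit but heavily armoured, the extra die is worth more on the damage roll.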

This scenario has an equivalent in epidemiology: it’s called a double-hurdle model, and is commonly used for estimating models of health-care expenditure in situations without health insurance. The first step (the first “hurdle”) is the decision to spend money on healthcare – this is often voluntary and poor people won’t always make it. The second step is the amount spent, which is inherently random. Amounts spent above a threshold lead to financial catastrophe (this threshold is defined by various means depending on how you spend income) and the intensity of expenditure is determined by the threshold. In the double hurdle model the decision to spend may be assigned a distribution, and the amount spent is often Gamma-distributed with a high probability of low cost and a small probability of extremely high cost.

In both cases (Iron Kingdoms or out-of-pocket expenditure analysis) the problem is made more complex by the fact that we don’t usually know the thresholds. Usually in the double hurdle model we’re interested in identifying risk factors for exceeding the threshold. Typically in Iron Kingdoms we want to know which decision to boost to get over the second threshold – should we boost the consumption (attack) or expenditure (damage) decisions? We’re also often interested in guessing the threshold values – the GM knows them but we don’t, and we may for example roll a 9 and fail to hit, or hit on an 8 but do no damage on a 9, and then someone else boosts and hits on a 9 but does damage on a roll of 15, so the question is – what is the armour threshold?

In my last Iron Kingdoms session this came up in a beautiful way: our opponent was going to finish off the entire group if it lasted another round, and Alyvia had one feat point left. Unboosted, she was guaranteed to achieve nothing. We knew our enemy was hard to hit and hard to damage, but we didn’t know the exact values. What should she spend her last feat point on? Naturally, since I’m a statistician in my day job, all eyes turned to me. What to do? This sparked a new interest for me: I think there are methods that can be used to answer these questions. So, over the next few weeks I aim to do a few analyses to present some answers to the following questions:

  • Under assumed thresholds and attack/damage values, what are the best ways to spend your feat point budget?
  • Are there guidelines for these decisions when you don’t know the thresholds but have a rough idea of what they might be?
  • If you don’t know the thresholds, are there simple formulas you can use to guess what they are, or to assign probabilities to given thresholds, given that you know the results of other players’ rolls?
  • Can these ideas be extended beyond Iron Kingdoms to other games?

The first question can be answered easily using basic probability theory. The second and third problems are actually a slightly challenging problem in estimating boundary values of a distribution using Bayesian statistical analysis, and I’m going to have a crack at it. The fourth question is related to the third, and is most easily explored through d20/Pathfinder: in this case my naive guess is that you can set a uniform distribution on the prior probability of any threshold value, and because the observed values (the likelihood) are also uniform, get a uniformly distributed posterior distribution for the threshold given the observed data (other players’ rolls). I think I will work from this example back to the Iron Kingdoms example (which may require simulation). If the fourth question has an analytical solution it will lead to a formula I can post on the Pathfinder forums that will allow players to second-guess their GMs’ monsters, and my guess is that a party of 3+ PCs can work out the most likely threshold required to hit within a round of combat. That’s a convenient little trick right there!
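The naive d20 version of this guess can be sketched in a few lines of Python. This is only an illustration of the idea, not the promised analysis: it assumes a uniform prior over a range of candidate thresholds that I picked arbitrarily, treats each observed attack total as given, applies the d20 rule that an attack hits when the total meets or exceeds the threshold, and ignores automatic hits and misses on natural 20s and 1s.

```python
from fractions import Fraction


def threshold_posterior(observations, candidates=range(1, 41)):
    """Posterior over the hidden to-hit threshold under a uniform prior.

    observations: list of (attack_total, hit) pairs seen at the table.
    candidates:   ASSUMED range of plausible thresholds (the uniform prior).
    """
    consistent = [
        t for t in candidates
        if all((total >= t) == hit for total, hit in observations)
    ]
    if not consistent:
        return {}  # the observations contradict every candidate threshold
    weight = Fraction(1, len(consistent))
    return {t: weight for t in consistent}


# Hypothetical round of combat: a 14 and a 16 missed, a 19 hit.
rolls_seen = [(14, False), (19, True), (16, False)]
for threshold, prob in threshold_posterior(rolls_seen).items():
    print(f"threshold {threshold}: posterior probability {prob}")
```

Under these assumptions the posterior is uniform over the thresholds lying above the highest miss and at or below the lowest hit (17, 18 and 19 in this example), so every extra roll can only narrow the interval.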

Finally, it’s possible that this information may be actually informative for the out-of-pocket spending problem, which I occasionally study at work. I doubt it, but wouldn’t it be great if random ponderings on gaming helped to improve our understanding of health insurance issues in Bangladesh?!

Stay tuned for some Bayesian nastiness, if I can find the time over the next few weeks …

Figure 1: Dwarven character creation flow chart

Following yesterday’s post, here I present flow charts for the best survival options for Dwarves (Figure 1) and Halflings (Figure 2). Both charts are based on CART analysis of the simulation data generated for yesterday’s post.

Figure 2: Flow chart for halfling fighter creation

For dwarves, weapon and armour choice is crucial, and weapon finesse is a decision so bad that it actually negatively affects survival: with two feats to choose, wasting one on weapon finesse is a very bad idea. For hobbits, like humans, toughness is only important if the PC doesn’t have good constitution, and weapon choice is only important for clumsy fighters. Note that if a halfling has no strength bonus, constitution is irrelevant to survival but dexterity is important. This is also true for humans, though weapon choice is not important in their case – presumably because they don’t suffer the size penalty on damage dice. For weak dwarves constitution bonus is also not important, but both weapon and armour type make a difference to survival.

Elven decision rules will be posted later…