Could you lie to this nice lady?

On 18th May 2019 Australia held a federal election, and the ruling Liberal/National Party (LNP) Coalition scored a victory over the Australian Labor Party (ALP) that most observers billed as an “upset”, because opinion polls had generally predicted a narrow ALP victory. The polls predicted the ALP would win the two-party preferred vote 51.5% to 48.5% and cruise to victory on the back of it; in fact, with 76% of the vote counted, the Coalition is on 50.9% two-party preferred to the ALP’s 49.1%. So it certainly seems like the opinion polls got it wrong. But did they, and why?

Did opinion polls get it wrong?

The best site for detailed data on opinion polls is the Poll Bludger, whose list of polls (scroll to the bottom) shows a persistent estimate of 51-52% two-party preferred in favour of the ALP. But there is a slightly more complicated story here, which needs to be considered before we go too far in saying they got it wrong. First of all, you’ll note that the party-specific estimates put the ALP at between 33% and 37% of the primary vote, with the Greens running between 9% and 14%, while the Coalition is consistently listed at between 36% and 39%. Estimates for Pauline Hanson’s One Nation put it between 4% and 9%. This matters for two reasons: the way pollsters estimate the two-party preferred vote, and the margin of error of each poll.

The first thing to note is that the final estimates of the different primary votes weren’t so wildly off. Wikipedia has the current vote tally at 41% to the Coalition, 34% to the ALP and 10% to the Greens. The LNP vote is higher than any poll put it, but the ALP and Greens’ tallies are well within the range of predicted values. The big outlier is One Nation, which received just 3%, well below its predicted 4-9% – far enough below to suspect that the extra 2% of primary vote to the Coalition reflects this underperformance. This has big implications for the pollsters’ two-party preferred estimates, because the two-party preferred vote is not something that is sampled – it is inferred from past preference distributions, from simple questions about where respondents will put their second choice, or from additional questions in the poll. So uncertainty in the minor parties’ primary votes flows through to larger uncertainty in the two-party preferred tally, since those votes have to flow on somewhere. By way of example, a one-point difference in the Greens’ primary vote estimate (9% vs. 10%) is a roughly 10% difference in the number of Green preference votes flowing on to the major parties; if the assumed split of those preferences between the majors is also wrong, the errors multiply through to the final two-party preferred figure. In the case of One Nation, some polls (e.g. Essential Research) consistently gave them 6-7% of the primary vote when they actually got 3% – roughly half of the predicted One Nation preference votes never existed. This is a problem peculiar to preferential voting systems like Australia’s, and it raises the question: have opinion poll companies learnt to deal with preferencing in the era of minor parties?
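To make this flow-through concrete, here is a minimal sketch of the arithmetic. The primary vote figures and preference-flow fractions below are illustrative assumptions for this post, not the ones any polling company actually uses:

```python
# Sketch: how a misjudged minor-party primary vote shifts the inferred
# two-party preferred (2PP) estimate. All flow fractions are assumptions.

def two_party_preferred(primaries, flow_to_coalition):
    """Allocate each party's primary vote to the Coalition 2PP total
    using an assumed fraction of preferences flowing its way."""
    coalition = sum(primaries[p] * flow_to_coalition[p] for p in primaries)
    return 100 * coalition / sum(primaries.values())

# Assumed fraction of each party's vote ending up with the Coalition
flows = {"LNP": 1.0, "ALP": 0.0, "GRN": 0.18, "ONP": 0.65, "OTH": 0.50}

# Poll-style primaries with One Nation at 6%...
polled = {"LNP": 38, "ALP": 36, "GRN": 10, "ONP": 6, "OTH": 10}
# ...versus One Nation at 3%, the difference reverting to the LNP
actual = {"LNP": 41, "ALP": 36, "GRN": 10, "ONP": 3, "OTH": 10}

print(f"2PP from polled primaries: LNP {two_party_preferred(polled, flows):.1f}%")
print(f"2PP from actual primaries: LNP {two_party_preferred(actual, flows):.1f}%")
```

Even with the preference flows themselves held fixed, a three-point misread of one minor party’s primary vote moves the inferred two-party preferred figure by about a point – roughly the whole gap between the predicted and actual result.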

The second thing to note is the margin of error of these polls. The margin of error shows the range of plausible “true” values for the polled proportion. For example, if a poll estimates 40% of people will vote Liberal with a 2% margin of error, the “real” proportion who will vote Liberal is probably between 38% and 42%. For a binary question there is a standard formula for the margin of error (sketched below), but polls in Australian politics are no longer a binary question: we need a margin of error for four proportions, and the margin of error grows relative to the estimate as the estimate gets smaller. For example, the most recent Ipsos poll lists its margin of error as 2.3%, which suggests that the estimated primary vote for the Coalition (39%) actually lies between 36.7% and 41.3%. The ALP’s smaller primary vote carries a margin of error that is slightly smaller in absolute terms but larger relative to the estimate, and the Greens’ more so. Given this, it’s safe to say that the observed primary vote totals currently recorded lie comfortably within the margins of error of the Ipsos poll. This poll did not get any estimates wrong! But it is being reported as wrong.
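For the curious, the standard formula for a proportion is: margin of error = z × √(p(1−p)/n), with z ≈ 1.96 at 95% confidence. A quick sketch – assuming a sample of about 1,800, which is roughly the sample size implied by a quoted maximum margin of error of 2.3% – shows how the margin shrinks in absolute terms but grows relative to the estimate as the proportion gets smaller:

```python
# Margin of error for a polled proportion at 95% confidence.
# n = 1800 is an assumption: roughly the sample size implied by a
# quoted maximum margin of error of 2.3% (which occurs at p = 0.5).
from math import sqrt

def margin_of_error(p, n, z=1.96):
    return z * sqrt(p * (1 - p) / n)

n = 1800
for party, p in [("Coalition", 0.39), ("ALP", 0.33), ("Greens", 0.10)]:
    moe = margin_of_error(p, n)
    print(f"{party}: {100*p:.0f}% +/- {100*moe:.1f} points "
          f"(about {100*moe/p:.0f}% of the estimate itself)")
```

Under these assumptions the Coalition’s 39% carries about a 2.3-point margin (6% of the estimate), while the Greens’ 10% carries only 1.4 points – but that is 14% of the estimate, which matters enormously once those votes are redistributed as preferences.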

The reason the poll is reported as wrong is the combination of these two problems: the margins of error on all the parties’ primary votes compound when they are pushed through the preference distribution, so the margin of error on the inferred two-party preferred vote ends up larger than 2.3%. The plausible range for the Coalition’s two-party preferred vote inferred from this poll is therefore probably wider than 47-51% – easily wide enough for the Coalition to win the election. But newspapers never report the margin of error or its implications.
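A crude way to see this is to push both the sampling error and the preference-flow uncertainty through a simulation. Everything below – the standard deviations, the flow fractions – is an illustrative assumption, not a reconstruction of any pollster’s actual model:

```python
# Sketch: Monte Carlo propagation of primary-vote sampling error and
# preference-flow uncertainty into a 2PP estimate. All parameters are
# illustrative assumptions.
import random

def simulate_2pp(n_sims=100_000):
    results = []
    for _ in range(n_sims):
        # Primary votes perturbed by sampling error (sd ~ MoE / 1.96)
        lnp = random.gauss(38, 1.2)
        alp = random.gauss(36, 1.2)
        grn = random.gauss(10, 0.8)
        onp = random.gauss(6, 0.6)
        oth = 100 - lnp - alp - grn - onp  # remainder, so totals sum to 100
        # Preference flows to the Coalition, themselves uncertain
        grn_flow = random.gauss(0.18, 0.05)
        onp_flow = random.gauss(0.65, 0.10)
        oth_flow = random.gauss(0.50, 0.10)
        results.append(lnp + grn * grn_flow + onp * onp_flow + oth * oth_flow)
    results.sort()
    return results[int(0.025 * n_sims)], results[int(0.975 * n_sims)]

lo, hi = simulate_2pp()
print(f"95% interval for the Coalition 2PP: {lo:.1f}% to {hi:.1f}%")
```

With even modest uncertainty in the flows, the interval comes out around ±3.5 points – noticeably wider than the ±2.3% quoted for the primary votes, and comfortably containing a Coalition win.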

When you look at the actual data from the polls, take into account the margin of error and consider the uncertainty in preferences, the polls did not get it wrong at all – the media did in their reporting of the polls. But we can ask a second question about these polls: can opinion polls have any meaning in a close race?

What do opinion polls mean in a close race?

In most Australian elections the large majority of seats are safe and never come into play; only a handful of swing seats change hands. This election definitely followed that pattern, with 7 seats changing hands and 5 in doubt – only 12 seats mattered. Amongst those 12 seats it appears (based on the current snapshot of data) that the Coalition gained 8 and lost 4, for a net gain of 4. Of those 12 seats, 9 were held by non-Coalition parties before the election and 3 by the Coalition. Under a purely random outcome – if nothing systematic determined whether these seats changed hands and each was the equivalent of a coin toss – the chance of an outcome like this is not particularly low. Indeed, even if the ALP had a 60% chance of retaining its own seats and only a 40% chance of winning Coalition seats, an outcome like this is still fairly likely, as the sketch below shows. A lot of these seats were on razor-thin margins, so they could literally be tipped by bad weather, a few grumpy voters, or a change in the proportion of donkey votes.
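As a rough check on that intuition, here is a sketch that takes the seat counts above (9 non-Coalition and 3 Coalition seats in play, per the snapshot) and asks how likely a net Coalition gain of at least 4 is – first under pure coin tosses, then with incumbents favoured 60/40. The probabilities are assumptions, not estimates:

```python
# Sketch: probability of a net Coalition gain of >= 4 across the in-play
# seats, treating each seat as an independent Bernoulli trial. Seat
# counts follow the snapshot above; win probabilities are assumptions.
from math import comb

def binom_pmf(n, p, k):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_net_gain_at_least(target, n_opp=9, n_own=3, p_gain=0.4, p_loss=0.4):
    """P(gains - losses >= target) when the Coalition wins each of n_opp
    opposition-held seats with prob p_gain and loses each of its n_own
    seats with prob p_loss."""
    return sum(binom_pmf(n_opp, p_gain, g) * binom_pmf(n_own, p_loss, l)
               for g in range(n_opp + 1)
               for l in range(n_own + 1)
               if g - l >= target)

print(f"Pure coin toss:      {prob_net_gain_at_least(4, p_gain=0.5, p_loss=0.5):.2f}")
print(f"Incumbents favoured: {prob_net_gain_at_least(4):.2f}")
```

Under these assumptions a net gain of 4 or more comes up roughly a quarter to two-fifths of the time – an unremarkable outcome, not a shock that national polls could have been expected to call.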

I don’t think polls conducted at the national level can be expected to tell us much about the results of a series of coin tosses. If those 12 seats were mostly decided by chance rather than by any structural driver of change, how is a poll that predicts a 51% two-party preferred vote, with a 2% margin of error, going to predict which of them will flip? It simply can’t, because you can’t predict random variation with a structural model. Basically, the outcome of this election was well within the bounds you would expect from non-systematic random error at the population level alone.

When a party is heading for a drubbing you can expect the polls to pick it up, but when a minor change to the status quo is going to happen due to either luck or unobserved local factors, you can’t expect polls to offer a better prediction than coin flips.

The importance of minor parties to the result

One thing I did notice in the coverage of this election was that there were a lot of seats where the Coalition garnered the biggest primary vote but the combined ALP and Greens primary vote was almost as large or a little larger, followed by a couple of fairly chunky minor-party or independent candidates. In a lot of these seats, Greens and independents’ preferences were crucial to the outcome. As the Greens’ vote grows, I expect it encompasses more and more disaffected Liberal and National voters, not just ALP voters with a concern about the environment. For example, in Parkes, NSW, both the National Party and the ALP suffered major primary vote swings against them, but the National candidate won with a two-party preferred swing towards him – which suggests that minor-party preferences were doing a lot of work. This may not seem important at the national level, but at the local level it can be crucial. In Herbert, which the Coalition gained, two minor parties got over 10% of the vote. In Bass the combined ALP/Greens primary vote is bigger than the Coalition’s, but the Liberal candidate is ahead on preferences, which suggests that Green preferences are not flowing strongly to the ALP. This variation in flows is highly seat-specific and extremely hard to model or predict – and I don’t think the opinion polling companies have any way of handling it.

Sample and selection bias in modern polling

It can be noted from the Poll Bludger list of surveys that the polls consistently overestimated the ALP’s two-party preferred vote, which shouldn’t happen if they were just randomly getting it wrong – there appears to be some form of systematic bias in the survey results. Surveys like opinion polls are prone to two big sources of bias: sampling bias and selection bias. Sampling bias happens when the companies’ random phone dialing produces a sample that is demographically skewed, for example too many baby boomers or too many men. It is often said that polling companies only call landlines, which would over-represent old people – the sample might be 50% elderly even though the population is only 20% elderly. This problem can be fixed by weighting, in which each respondent’s answer is weighted to reflect the relative rarity of their demographic group in the sample (a minimal sketch follows below). Weighting increases the margin of error but should handle the sampling bias problem. However, there is a deeper problem that weighting cannot fix: selection bias. Selection bias occurs when your sample is not representative of the population even though it appears demographically representative. It doesn’t matter that 10% of your sample is aged 15-24 and 10% of the population is aged 15-24 if the 15-24 year olds you sampled are fundamentally different from the 15-24 year olds in the population. Some people will tell you weighting fixes these kinds of problems, but it doesn’t: there is no statistical solution to sampling the wrong people.
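Here is a minimal sketch of that kind of weighting (post-stratification on a single variable). The population shares and the sample are made up for illustration; real pollsters weight on several variables at once:

```python
# Sketch: post-stratification weighting to correct a demographically
# skewed sample. All numbers are invented for illustration.

population_share = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}

# An over-65-heavy raw sample: (age_group, says_will_vote_alp)
sample = ([("65+", False)] * 350 + [("65+", True)] * 150
          + [("35-64", True)] * 200 + [("35-64", False)] * 200
          + [("18-34", True)] * 70 + [("18-34", False)] * 30)

n = len(sample)
sample_share = {g: sum(1 for grp, _ in sample if grp == g) / n
                for g in population_share}

# Each respondent's weight = population share / sample share of their
# group, so under-sampled groups count for more
weight = {g: population_share[g] / sample_share[g] for g in population_share}

raw = sum(1 for _, alp in sample if alp) / n
weighted = sum(weight[g] for g, alp in sample if alp) / n
print(f"Raw ALP share:      {raw:.1%}")       # 42.0%
print(f"Weighted ALP share: {weighted:.1%}")  # 52.0%
```

Notice that each of the 100 young respondents ends up carrying three times their raw weight – which is exactly why it matters so much which young people end up in the sample. That is the selection-bias problem.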

I often hear that this problem arises because polling companies only call landlines, and people with landlines are weirdos, but I checked and this isn’t the case: Ipsos, for example, draws 40-50% of its sample from mobile phones. The sample is still heavily biased, though, because people who answer their phones to strangers are a bit weird, and people who agree to do surveys are even weirder. The most likely respondent to a phone survey is someone who is very bored and very politically engaged, and as time goes by I think the people who answer polls are getting weirder and weirder. If your sample is a mixture of politically super-engaged young people and the bored elderly, you are likely to get heavy selection bias. One possible consequence could be a pro-ALP bias in the results: the young people who answer their mobiles are super politically engaged, which in that age group means pro-ALP or pro-Green, and their responses are given a high weight because young people are under-sampled. It’s also possible that the weighting has been applied incorrectly, though that seems unlikely to be a problem across the entire range of polling companies.

I don’t think this is the main problem for these polls, though. The 2% overestimate of the ALP two-party preferred vote could easily arise from misapplied preferences. The slight underestimate of the LNP primary vote could come from inaccuracies in the National Party estimate – for example, people saying they’re going to vote One Nation on the phone, then reverting to the Nationals or Liberals in the booth. Although there could be selection bias in the sampling process, I don’t think it has been historically pro-ALP. I think the problem in this election was that the fragmentation of the major party vote on both the left (to the Greens and independents) and the right (to One Nation, the UAP, Hinch and others) made small errors in sampling and small errors in the assignment of preferences snowball into larger errors in the two-party preferred estimate. In any case, this was a close election, and it’s hard for polls to be right when the result comes down to toss-ups in a few local electorates.

What does this mean for political feedback processes in democracies?

Although I think the problem has been exaggerated in this election, I do think it is going to get bigger as the major parties continue to lose support to minor parties. One Nation may come and go, but the Greens have held a roughly 10% national vote share for a decade now and aren’t going anywhere, and as they come closer to winning more lower house seats their influence on election surprises will likely grow – and not necessarily in the ALP’s favour. This means the major parties are not going to be able to rely on opinion polls as a source of feedback from the electorate about the raw political consequences of their actions, and that, I think, is a big problem for the way our democracy works.

Outside of their membership – and in the case of the ALP, the unions – political parties have no particular mechanism for receiving feedback from the general public except elections. Over the last 20 years opinion polls have been a major component of how political leaders learn about the reception their policies get in the general community. Sure, they can ask their membership for an opinion, and they’ll get feedback from other segments of the community (the environmental movement for the Greens, the unions for the ALP), but in the absence of opinion polls they won’t learn much about how the politically disengaged think of their policies. And in Australia, under compulsory voting, the politically disengaged still vote, still get angry about politicians, and still have political ideals. If this broader community withdraws completely so that its opinion can no longer be gauged – or worse, politicians come to believe that the opinions of those who are polled represent community sentiment in general – then politicians will learn about the reception of their policies only through the biased filter of stakeholders, the media, and their own party organs. I don’t see any of the major parties working to make themselves more accessible to community feedback and more amenable to public discussion and engagement, and I don’t think they would be able to find a way to do that even if they tried.

Instead, over the past 20 years politicians have gauged the popularity of their platform from polls, and used the polls to modify – and often to moderate – their policies between elections. Everyone hates the political leader who simply shapes their policies to match the polls, but everyone hates a politician who ignores public opinion just as much. We expect our politicians to pay attention to what we think between elections, and to take it into account when making policy. If it becomes impossible for them to do this, an important mode of communication between those who make the laws and those who don’t will be broken – or, worse still, become deceptive.

It does not seem that this problem is going to go away or get better. This means the major political parties are going to have to start finding new mechanisms for receiving feedback from the general public – and we, the public, are going to have to find new ways to get through to them. Until then, expect more and nastier surprises, and more weird political contortions as the major parties realize they haven’t just lost control of the narrative – they aren’t even sure what the narrative is. And since we the public also learn what the rest of the public thinks from opinion polls, we too will lose our sense of what our own country wants, leaving us dependent on our crazy aunt’s Facebook posts as our only vox populi.

As people retreat from engagement with pollsters, the era of the opinion poll will begin to close. We need to build a new form of participatory democracy to replace it. But how? And until we do, how confused will we become about the democracy we have? The strange dynamics of modern information systems are wreaking havoc on our democratic systems, and it is becoming increasingly urgent that we understand how – and what we can do to secure our democracies in this strange new world of fragmented information.

But as Scott Morrison stands up in the hottest, driest era in the history of the continent and talks about building more coal mines on the back of his mandate, I don’t hold out much hope that there will be any change.