On 18th May 2019 Australia held a federal election, and the ruling Liberal/National Party (LNP) Coalition scored a victory over the Australian Labor Party (ALP) that was billed by most observers as an “upset” because opinion polls had generally been predicting a narrow ALP victory. The opinion polls predicted that the ALP would win the two-party preferred vote 51.5% to 48.5% and cruise to victory on the back of this; in fact, with 76% of the vote counted, the Coalition is on 50.9% two-party preferred and the ALP on 49.1%. So it certainly seems like the opinion polls got it wrong. But did they, and why?
Did opinion polls get it wrong?
The best site for detailed data on opinion polls is the Poll Bludger, whose list of polls (scroll to the bottom) shows a persistent estimate of 51-52% two-party preferred vote in favour of the ALP. But there is a slightly more complicated story here, which needs to be considered before we go too far in saying they got it wrong. First of all you’ll note that the party-specific estimates put the ALP at between 33% and 37% primary vote, with the Greens running between 9% and 14%, while the Coalition is consistently listed as between 36% and 39%. Estimates for Pauline Hanson’s One Nation Party put it between 4% and 9%. This is important for two reasons: the way that pollsters estimate the two-party preferred vote, and the margin of error of each poll.
The first thing to note is that the final estimates of the different primary votes weren’t so wildly off. Wikipedia has the current vote tally at 41% to the Coalition, 34% to the ALP and 10% to the Greens. The LNP vote is higher than any poll put it at, but the other two parties’ tallies are well within the range of predicted values. The big outlier is One Nation, which polled at 3%, well below predictions – and far enough below to think that the extra 2% primary vote to the Coalition could reflect this underperformance. This has big implications for the two-party preferred vote estimates from the opinion poll companies, because the two-party preferred vote is not a thing that is sampled – it is inferred from past preference distributions, from simple questions about where respondents will put their second choice, or from additional questions in the poll. So uncertainty in the primary votes of the minor parties will flow through to larger uncertainty in two-party preferred vote tallies, since those votes have to flow on. By way of example, a 1% error in the primary vote estimate for the Greens (e.g. 9% vs. 10%) is a roughly 10% error in the number of preference votes flowing from that party to the majors. If the assumed proportion of those votes that go to the Liberals is also wrong, you can expect to see this multiplied through in the final two-party preferred vote. In the case of One Nation, some polls (e.g. Essential Research) consistently gave them 6-7% of the primary vote, when they actually got 3% – so roughly half of the preference votes the pollsters expected to flow on from this party never existed. This is a unique problem for opinion polling in a nation like Australia and it raises the question: have opinion poll companies learnt to deal with preferencing in the era of minor parties?
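To make that flow-on concrete, here’s a minimal sketch of how a two-party preferred estimate is built from primary votes plus assumed preference flows. All the primary shares and flow fractions below are invented for illustration; they are not any pollster’s actual model:

```python
# Sketch: inferring two-party preferred (2PP) from primary votes.
# All preference flow fractions are illustrative assumptions, not any
# polling company's actual historical-flow model.

def two_party_preferred(primaries, flows_to_coalition):
    """primaries: party -> primary vote share (fractions summing to ~1).
    flows_to_coalition: minor party -> assumed fraction of its preferences
    that reach the Coalition ahead of the ALP."""
    coalition, alp = primaries["LNP"], primaries["ALP"]
    for party, share in primaries.items():
        if party in ("LNP", "ALP"):
            continue
        coalition += share * flows_to_coalition[party]
        alp += share * (1 - flows_to_coalition[party])
    total = coalition + alp
    return coalition / total, alp / total

primaries = {"LNP": 0.38, "ALP": 0.35, "GRN": 0.10, "ONP": 0.06, "OTH": 0.11}
flows = {"GRN": 0.18, "ONP": 0.65, "OTH": 0.50}   # assumed flows

print(two_party_preferred(primaries, flows))      # ~ (0.492, 0.508)

# Halve the One Nation primary (6% -> 3%, roughly what happened) and give
# the difference to the LNP: the Coalition 2PP moves by about a point.
print(two_party_preferred(dict(primaries, ONP=0.03, LNP=0.41), flows))
```

On these made-up numbers, a single bad One Nation estimate moves the predicted result from a narrow ALP win to a narrow Coalition win.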
The second thing to note is the margin of error of these polls. Margin of error is used to show the range of plausible “true” values for the polled proportion. For example, if a poll estimates 40% of people will vote Liberal with a 2% margin of error, that means the “real” proportion of people who will vote Liberal most likely lies between 38% and 42% (conventionally, with 95% confidence). For a binary question, the method for calculating the margin of error can be found here, but polls in Australian politics are no longer a binary question: we need to know the margin of error for four proportions, and this margin of error grows as a proportion of the estimate when the estimate is smaller. For example the most recent Ipsos poll lists its margin of error as 2.3%, but this suggests that the estimated primary vote for the Coalition (39%) should actually lie anywhere between 36.7% and 41.3%. The estimated primary vote for the ALP has a margin of error that is slightly smaller in absolute terms but larger relative to the estimate itself (since the estimate is smaller), and the Greens’ more so again. Given this, it’s safe to say that the observed primary vote totals currently recorded lie within the margins of error of the Ipsos poll. This poll did not get any estimates wrong! But it is being reported as wrong.
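For concreteness, here’s how those margins work out under the standard binomial formula. The sample size is my own back-calculation from the reported 2.3% margin (roughly n = 1,800 at 95% confidence), not a figure taken from the poll itself:

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for an estimated proportion p from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

n = 1800  # assumed sample size, back-calculated from the reported ~2.3% MoE

for party, p in [("LNP", 0.39), ("ALP", 0.34), ("GRN", 0.12), ("ONP", 0.05)]:
    moe = margin_of_error(p, n)
    print(f"{party}: {100*p:.0f}% +/- {100*moe:.1f} points "
          f"({100*moe/p:.0f}% of the estimate itself)")
```

The absolute margin shrinks slightly as the estimate gets smaller, but as a fraction of the estimate it grows: about 6% of the LNP estimate, but around 20% of a One Nation estimate. That relative error is what matters once those small primary votes are multiplied through preference flows.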
The reason the poll is reported as wrong is the combination of these two problems: the margins of error on the primary votes of all these parties compound in the inferred two-party preferred vote, so that its margin of error ends up larger than 2.3%. We should really be saying that the Coalition two-party preferred vote inferred from this poll lies in a range even wider than 47-51% – easily wide enough for the Coalition to win the election. But newspapers never report the margin of error or its implications.
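You can see the magnification with a quick simulation. This is a hedged sketch, not any pollster’s method: it resamples the primary votes within multinomial sampling error and jitters the assumed preference flows, with the flow means and their spread invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, sims = 1800, 100_000   # assumed poll sample size; number of simulations

# Point estimates for primaries (LNP, ALP, GRN, ONP, OTH) and assumed mean
# preference flows to the Coalition from the three non-major groups.
primaries = np.array([0.38, 0.35, 0.10, 0.06, 0.11])
flow_mean = np.array([0.18, 0.65, 0.50])
flow_sd = 0.05            # invented uncertainty in each assumed flow

# Resample primary votes (multinomial sampling error) and jitter the flows.
samples = rng.multinomial(n, primaries, size=sims) / n
flows = np.clip(rng.normal(flow_mean, flow_sd, size=(sims, 3)), 0, 1)

coalition = samples[:, 0] + (samples[:, 2:] * flows).sum(axis=1)
alp = samples[:, 1] + (samples[:, 2:] * (1 - flows)).sum(axis=1)
tpp = coalition / (coalition + alp)

lo, hi = np.percentile(100 * tpp, [2.5, 97.5])
print(f"Coalition 2PP 95% interval: {lo:.1f}% to {hi:.1f}%")
```

Even with these modest invented uncertainties, the margin on the two-party preferred estimate comes out wider than the 2.3% quoted for a single primary vote proportion.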
When you look at the actual data from the polls, take into account the margin of error and consider the uncertainty in preferences, the polls did not get it wrong at all – the media did in their reporting of the polls. But we can ask a second question about these polls: can opinion polls have any meaning in a close race?
What do opinion polls mean in a close race?
In most elections in Australia most seats don’t come into play, and only a couple of swing seats change hands, because most are safe. This election has definitely followed that pattern, with 7 seats changing hands and 5 in doubt – only 12 seats mattered in this election. Amongst those 12 seats it appears (based on the current snapshot of data) that the Coalition gained 8 and lost 4, for a net gain of 4. Of those 12 seats, 9 were held by non-Coalition parties before the election and 3 by the Coalition. Under a purely random outcome – that is, if nothing determined whether these seats changed hands and each was the equivalent of a coin toss – the chance of this outcome is not particularly low. Indeed, even if the ALP had a 60% chance of retaining its own seats and a 40% chance of winning Coalition seats, it’s still fairly likely that you would observe an outcome like this. A lot of these seats were on razor-thin margins, so they could literally be vulnerable to upset by something like bad weather, a few grumpy people, or a change in the proportion of donkey votes.
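To put a number on that intuition, here’s a back-of-envelope calculation treating each of the 12 seats as an independent biased coin flip. The 60%/40% retention probabilities are the ones floated above, not estimates from any data:

```python
from math import comb

def prob_net_gain_at_least(k, p_alp_holds=0.6, p_alp_takes=0.4,
                           alp_held=9, lnp_held=3):
    """P(Coalition net seat gain >= k) across the marginal seats, treating
    each seat as an independent biased coin flip."""
    total = 0.0
    for x in range(alp_held + 1):        # x = ALP-held seats lost to the Coalition
        for y in range(lnp_held + 1):    # y = Coalition-held seats lost to the ALP
            if x - y >= k:
                total += (comb(alp_held, x)
                          * (1 - p_alp_holds) ** x * p_alp_holds ** (alp_held - x)
                          * comb(lnp_held, y)
                          * p_alp_takes ** y * (1 - p_alp_takes) ** (lnp_held - y))
    return total

print(f"fair coins:     {prob_net_gain_at_least(4, 0.5, 0.5):.2f}")   # ~0.39
print(f"60/40 ALP edge: {prob_net_gain_at_least(4):.2f}")             # ~0.26
```

Under fair coins a net Coalition gain of 4 or more comes up about two times in five, and even with the 60/40 edge to the ALP it comes up about one time in four – hardly an outcome that demands a structural explanation.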
I don’t think polls conducted at the national level can be expected to tell us much about the results of a series of coin tosses. If those 12 seats were mostly determined by chance, not by any structural drivers of change, how is a poll that predicts a 51% two-party preferred vote, with 2% margin of error, going to determine that they’re going to flip? It simply can’t, because you can’t predict random variation with a structural model. Basically, the outcome of this election was well within the boundaries one would expect based purely on the non-systematic random error at the population level.
When a party is heading for a drubbing you can expect the polls to pick it up, but when a minor change to the status quo is going to happen due to either luck or unobserved local factors, you can’t expect polls to offer a better prediction than coin flips.
The importance of minor parties to the result
One thing I did notice in the coverage of this election was that there were a lot of seats where the Coalition garnered the biggest primary vote but the ALP and Greens’ primary votes combined were almost as large or a little larger, followed by two fairly chunky minor-party or independent candidates. I think in a lot of these contests the preferences of the Greens and independents were crucial to the outcome. As the Greens’ vote grows I expect it encompasses more and more disaffected Liberal and National voters, and not just ALP voters with a concern about the environment. For example in Parkes, NSW, the National Party and the ALP both experienced major swings against them, but the National candidate won with a two-party preferred swing towards him. This suggests that preferences from minor parties were super important. This may not seem important at the national level but at the local level it can be crucial. In Herbert, which the Coalition gained, two minor parties got over 10% of the vote. In Bass the combined ALP/Green primary vote is bigger than the Coalition’s, but the Liberal member is ahead on preferences, which suggests that Greens voters are not delivering strong preference flows to the ALP. This variation in flows is highly seat-specific and extremely hard to model or predict – and I don’t think that the opinion polling companies have any way of handling it.
Sample and selection bias in modern polling
It can be noted from the Poll Bludger list of surveys that the polls consistently overestimated the ALP’s two-party preferred vote, which shouldn’t happen if they were just randomly getting it wrong – there appears to be some form of systematic bias in the survey results. Surveys like opinion polls are prone to two big sources of bias: sampling bias and selection bias. Sampling bias happens when the companies’ random phone dialing produces a sample that is demographically incorrect, for example by sampling too many baby boomers or too many men. It is often said that polling companies only call landlines, which should lead to an over-representation of old people – say, a sample that is 50% elderly even though the population is only 20% elderly. This problem can be fixed by weighting, in which the proportions are calculated with weights that reflect the relative rarity of young people in the sample, as in the toy example below. Weighting increases the margin of error but should handle the sampling bias problem. However, there is a deeper problem that weighting cannot fix, which is selection bias. Selection bias occurs when your sample is not representative of the population even if it appears demographically correct. It doesn’t matter if 10% of your sample are aged 15-24 and 10% of the population is aged 15-24, if the 15-24 year olds you sampled are fundamentally different to the 15-24 year olds in the population. Some people will tell you weighting fixes these kinds of problems but it doesn’t: there is no statistical solution to sampling the wrong people.
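As a toy illustration of how weighting works, using the hypothetical 50%-elderly sample and 20%-elderly population from above (the individual responses are of course made up):

```python
# Toy post-stratification: the sample is 50% elderly but the population is
# only 20% elderly, so elderly respondents get weight 0.2/0.5 = 0.4 and
# younger respondents get 0.8/0.5 = 1.6.

sample = [
    # (age_group, says_they_will_vote_alp)
    ("elderly", False), ("elderly", False), ("elderly", True),
    ("elderly", False), ("elderly", False),
    ("young", True), ("young", True), ("young", False),
    ("young", True), ("young", False),
]

pop_share = {"elderly": 0.2, "young": 0.8}
sample_share = {"elderly": 0.5, "young": 0.5}
weight = {g: pop_share[g] / sample_share[g] for g in pop_share}

raw = sum(votes_alp for _, votes_alp in sample) / len(sample)
weighted = (sum(weight[g] * votes_alp for g, votes_alp in sample)
            / sum(weight[g] for g, _ in sample))

print(f"raw ALP estimate:      {raw:.0%}")       # 40%
print(f"weighted ALP estimate: {weighted:.0%}")  # 52%
```

The weighted estimate corrects the demographic imbalance, but notice what it cannot do: if the young people who actually answer the phone vote differently from young people in general, the weighting just amplifies that error.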
I often hear that this problem arises because polling companies only call landlines, and people with landlines are weirdos, but I checked and this isn’t the case: Ipsos, for example, draws 40-50% of its sample from mobile phones. The sample is still heavily biased, though, because people who answer their phones to strangers are a bit weird, and people who agree to do surveys are even weirder. The most likely respondent to a phone survey is someone who is very bored and very politically engaged, and as time goes by I think the people who answer polls are getting weirder and weirder. If your sample is a mixture of politically super-engaged young people and the bored elderly, then you are likely to get a heavy selection bias. One possible consequence could be a pro-ALP bias in the results: the young people who answer their mobiles are super politically engaged, which in that age group means pro-ALP or pro-Green, and their responses are being given a high weight because young people are undersampled. It’s also possible that the weighting has been applied incorrectly, though that seems unlikely to be a problem across the entire range of polling companies.
I don’t think this is the main problem for these polls, though. There is a 2% overestimate of the ALP two-party preferred vote, but this could easily arise from misapplication of preferences. The slight underestimate of the LNP primary vote could come from inaccuracies in the National Party estimate – for example, people saying on the phone that they’re going to vote One Nation but reverting to National or Liberal in the booth. Although there could be a selection bias in the sampling process, I don’t think this selection bias has been historically pro-ALP. I think the problem in this election has been that the fragmentation of the major party votes on both the left (to the Greens and independents) and the right (to One Nation, UAP, Hinch and others) has made small errors in sampling and small errors in the assignment of preferences snowball into larger errors in the two-party preferred estimate. In any case, this was a close election, and it’s hard for polls to be right when the election comes down to toss-ups in a few local electorates.
What does this mean for political feedback processes in democracies?
Although I think the problem is exaggerated in this election, I do think it is going to get bigger in future as the major parties continue to lose support to minor parties. One Nation may come and go but the Greens have been on a 10% national vote share for a decade now and aren’t going anywhere, and as they come closer to winning more lower house seats their influence on election surprises will likely grow – and not necessarily in the ALP’s favour. This means that the major parties are not going to be able to rely on opinion polls as a source of feedback from the electorate about the raw political consequences of their actions, and that, I think, is a big problem for the way our democracy works.
Outside of their membership – and in the case of the ALP, the unions – political parties have no particular mechanism for receiving feedback from the general public except elections. Over the last 20 years opinion polls have formed one major component of the way in which political leaders learn about the reception their policies have in the general community. Sure, they can ask their membership for an opinion, and they’ll get feedback through other segments of the community (such as the environmental movement for the Greens, or the unions for the ALP), but in the absence of opinion polls they won’t learn much about how the politically disengaged think of their policies. In Australia, under compulsory voting, the politically disengaged still vote, still get angry about politicians, and still have political ideals. If this broader community withdraws completely so that its opinion can no longer be gauged – or worse still, if politicians learn to believe that the opinions of those who are polled are representative of community sentiment in general – then politicians will learn about the reception of their policies only through the biased filter of stakeholders, the media, and their own party organs. I don’t see any of the major parties working to make themselves more accessible to community feedback and more amenable to public discussion and engagement, and I don’t think they would be able to find a way to do that even if they tried. Over the past 20 years politicians have instead gauged the popularity of their platform from polls, and used it to modify and often to moderate their policies in between elections. Everyone hates the political leader who simply shapes their policies to match the polls, but everyone hates a politician who ignores public opinion just as much. We do expect our politicians to pay attention to what we think in between elections, and to take it into account when making policy. If it becomes impossible for them to do this, then an important mode of communication between those who make the laws and those who don’t will be broken – or, worse still, become deceptive.
It does not seem that this problem is going to go away or get better. This means that the major political parties are going to have to start finding new mechanisms to receive feedback from the general public – and we the public are going to have to find new ways to get through to them. Until then, expect more and nastier surprises, and more weird political contortions as the major parties realize they haven’t just lost control of the narrative – they aren’t even sure what the narrative is. And since we the public learn what the rest of the public think from opinion polls as well, we too will lose our sense of what our own country wants, leaving us dependent on our crazy aunt’s Facebook posts as our only vox populi.
As people retreat from engagement with pollsters, the era of the opinion poll will begin to close. We need to build a new form of participatory democracy to replace it. But how? And until we do, how confused will we become in the democracy we have? The strange dynamics of modern information systems are wreaking havoc in our democratic systems, and it is becoming increasingly urgent that we understand how, and what we can do to secure our democracies in this strange new world of fragmented information.
But as Scott Morrison stands up in the hottest, driest era in the history of the continent and talks about building more coal mines on the back of his mandate, I don’t hold out much hope that there will be any change.
May 21, 2019 at 11:10 am
I don’t know how useful opinion polls are as a form of feedback to politicians even if they’re accurate. If you’re ahead in accurate polls, that tells you you’re doing something right, and if you’re behind in accurate polls, that tells you you’re doing something wrong, but it doesn’t tell you which things you’re doing right or which things you’re doing wrong.
I wonder whether politicians in the age before opinion polls were able to use any sort of feedback from the public to evaluate their own performance. Did Gladstone and Disraeli try to get public feedback in any way, or Cleveland and Harrison? What sort of difference did it make to politics when scientific polls became available? Was it an improvement, or not?
For those who are able to follow the reasoning, there is evidence from before the results were known (so not simply relying on the benefit of hindsight) of failure on the part of the pollsters. The blogger Mark the Ballot reported that the results of the opinion polls were, statistically speaking, too close together to be believable. For reasons some of which you’ve mentioned, pollsters have to apply statistical corrections to their raw data: although it’s impossible to get accurate results any other way, it does produce a risk that they will correct the results towards their preconceptions. In particular, they may correct to avoid producing results which vary widely from the results of other polls, which can mean that all the pollsters end up shifting their results to be close to each other (and to their own previous poll results). This particular form of groupthink is sometimes called ‘herding’, and one post by Mark the Ballot (again, before the election) was actually titled ‘a herd of new polls’.
Anyway, whether or not herding is the specific correct explanation in this case, the calculated probability (according to Mark the Ballot) of the polls being as close together as they were, if proper statistical techniques were being used, was negligible. The conclusion that the pollsters were doing something wrong is, to repeat myself, one that Mark the Ballot pointed to in a series of posts before the actual results were known.
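For readers who want to see the shape of the argument, here is a minimal sketch of the kind of check this involves, with made-up poll numbers chosen to be suspiciously tight. If the polls were independent samples, their spread should be at least as large as sampling error alone predicts; a spread far below that is the signature of herding:

```python
import numpy as np
from scipy import stats

# Hypothetical final-week ALP 2PP estimates from five polls (invented
# numbers, deliberately very tight) and an assumed sample size per poll.
polls = np.array([0.515, 0.515, 0.51, 0.515, 0.52])
n = 1500

p_bar = polls.mean()
sampling_var = p_bar * (1 - p_bar) / n   # variance expected from sampling alone

# Under independent sampling this statistic is ~ chi-squared with k-1 df;
# a tiny value means the polls agree more closely than chance allows.
chi2_stat = ((polls - p_bar) ** 2).sum() / sampling_var
p_too_tight = stats.chi2.cdf(chi2_stat, df=len(polls) - 1)

print(f"observed spread (sd):           {polls.std(ddof=1):.4f}")
print(f"expected from sampling alone:   {np.sqrt(sampling_var):.4f}")
print(f"P(spread this small by chance): {p_too_tight:.4f}")   # ~0.01
```

With these invented numbers the observed spread is roughly a quarter of what sampling error alone would produce, and the chance of that happening across independent polls is around 1%. This is the same style of reasoning, though not the same calculation, as Mark the Ballot’s posts.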
May 21, 2019 at 11:19 am
I agree that opinion polls aren’t necessarily a great way to gauge opinion even if they’re accurate, but they’re something. Also, many ask specific questions about policy, though those questions are not informative if the polls have a huge selection bias. I think that in the modern era the problems we face are greater and more interconnected (e.g. climate change) than in the pre-poll era, and they require politicians to think outside the interests of their own constituencies; I suspect that in the era of Gladstone they didn’t have to worry about that so much. Gladstone also probably represented a much more homogeneous electorate – most adults couldn’t vote in his time, so he primarily represented the interests of people from his own class and race background, and probably found their opinions easier to judge (this is supposition).
I saw the Poll Bludger reference Mark the Ballot’s position on this and I think it would be a good idea if the media paid more attention to critiques of polling. I would like to see more information about how the companies calculate their weights and estimate margins of error, etc., because I suspect a lot of them are making big assumptions that can be challenged. I think there is a bigger problem here: the media want to believe that every election has a narrative, and therefore that policy and reporting on policy influence the results of elections. But if elections come down to a few random chance outcomes in a small number of seats, this may not be the case. Look at the US presidential elections as an example: there is one every 4 years and the presidency mostly changes parties every 8 years, so in the past 40 years we have seen approximately 5 changes of party in 10 elections. That’s pretty much what you’d expect if you just flipped a coin at each election, especially since presidential elections are decided by only a small number of voters in a few states. I think Mark the Ballot is right and these opinion polls are incapable of distinguishing sides in close elections. But if people realized that, poll companies might not be able to do much business …
May 21, 2019 at 1:51 pm
Yes, I know, and on the basis of my knowledge I’m dubious about how informative those questions can be even in the absence of any selection bias.
According to the information I can find on Wikipedia, several countries had universal manhood suffrage long before the age of opinion polling and a few even had adult suffrage (as Australia did, except for the exclusion of Indigenous Australians, from 1902). Even Gladstone, after the passage of the Reform Act initiated by his government, faced an electorate that included an estimated three-fifths of adult males.
Who knows, perhaps this occasion will prompt them to do that.
From my limited knowledge, one problem there is that at least some aspects of what pollsters do are effectively secrets of their business which they wouldn’t want to make public because they operate in commercial competition with each other. However, I have seen that other knowledgeable commentators have suggested that they could be more transparent and that in this respect Australian pollsters compare unfavourably to those in some other countries. After the 2015 UK election the pollsters actually commissioned an inquiry into how they’d got it wrong, although as far as I recall it was unable to come up with any practical suggestions for future improvements in technique. It did rule out some possible sources of error, though, which was presumably worth doing.
Absolutely, although to be fair that doesn’t come entirely from the media. People like narratives! But in this instance I think the media amplify the effect. Michelle Grattan has a piece at The Conversation where she describes Scott Morrison as having unique qualities as a campaigner. But then she says the next election is winnable for Labor. If she truly believed that Scott Morrison were the ideal campaigner she describes him as being, she would also have to believe that he would continue to be unbeatable; evidently she doesn’t believe that, so she doesn’t believe what she said about his qualities – in which case, why did she have it published?
I think you might do well to be more careful in your estimates of how narrowly decided US Presidential elections are. It’s true that the most recent Republican victories (2000, 2004, 2016) were narrow by any standards, but the most recent Democratic victories (1992, 1996, 2008, 2012) were more decisive, and the Republican victories before that (1980, 1984, 1988) were decided by even wider margins. There have actually been more close Australian elections over the same period, with 1983 and 1996 the only obvious exceptions.
The problem, in my opinion, is not so much that there are no meaningful causes for election results – I think in the majority of cases there must be meaningful causes – but rather that people want to insist that they know specifically what the causes are without having sufficient evidence to warrant the conclusions.
I don’t know that opinion polling has much to do with the last part of that, though.
May 21, 2019 at 3:00 pm
Re: your last point, I’m not convinced that the victories in the US are big. Trump won with 304 Electoral College votes to 227. That’s a margin of 2 states (Florida, which we now know was hacked by the Russians, and one of the states by the Great Lakes). I don’t think that more than 8 or 10 states were in play. If you flip 10 coins and get 6 heads and 4 tails, Trump wins. That’s a pretty narrow win, and I think it’s hard for polls to predict that, especially when turnout issues come into play. That suggests the entertaining possibility that there is no grand arc of political history in the USA, just good luck and bad luck.
May 22, 2019 at 2:04 pm
I am aware that some US Presidential elections are close: I mentioned that in my comment! You use 2016 as an example: I used it as one of my examples of narrow victories! I did not state that the margins of victory are wide in all US Presidential elections, only that they are wide in many of them. An interpretation which may be justified for 2016 (and also 2000 and 2004) is not justified for 1980, 1984, 1988, 1992, 1996, 2008, and 2012.
On the subject of Australian polling, the most recent post on Kevin Bonham’s blog is interesting. He doesn’t refer to the idea that polling, if accurate, is important as a means of feedback from the public to politicians, but he does think accuracy of polling is important and he criticises the behaviour of Australian pollsters in detail, explaining how their lack of transparency about their methods contrasts unfavourably with the behaviour of UK pollsters. He also discusses various explanations that have been offered for the failure of polls, with reasons for rejecting some of them and for treating others as worthy of further investigation. He doesn’t mention the independent review commissioned by UK pollsters after the 2015 election there, but he does suggest that Australian pollsters should now try something of the sort.
Interestingly, he also cites a broadly based academic analysis which argues that there has not been a recent deterioration in the performance of polls; a few dramatic recent failures have attracted attention (again, people like narratives! to quote Kevin Bonham’s post, ‘Discredited tropes created by the famous author and clueless polling crank Bob Ellis still continue to spread on Twitter years after his death’) but apparently, over the long term and in a wide perspective, the recent performance of pollsters has been at approximately the same standard (good or bad) as in the past.
May 22, 2019 at 2:38 pm
I thought an 80-vote difference was big! It’s not really much bigger than 2012, though, and it does look like the vote margins have narrowed over time, possibly because the division between red and blue states has grown, so fewer states are flippable.
To give a specific example of where polling might be useful: today a new poll was released showing that 50% of people who voted for Trump won’t vote for him again. Given that the Democrats are considering impeachment and the obvious option of the 25th Amendment is also on the table, this information is useful for both parties in deciding what to do. If it’s correct then it shows the Republicans the trouble they’re going to be in if they stick with him; it also shows them the damage they might face if the Democrats decide not to impeach and to let this clown run to the election. But if the poll can’t be trusted, then the Republican elite have almost no idea how popular he really is in bugfuck, Idaho. It’s not as if a scumbag like David Brooks is going to pry himself away from his latest research assistant and actually go find out – the Republicans have zero connection to the majority of America and they need these polls to figure out what’s going on out there. Another example might be WorkChoices, where polling before the election showed it had become hugely unpopular, but the Howard government didn’t listen to the polling and got caned on the issue. Obviously you don’t go on polling alone, and maybe Howard thought his workplace reforms were worth sinking the rest of his agenda over (a strange thing to think given the opposition had pledged to – and did – revoke the law), but you can’t even make that decision if you don’t know what the electorate is thinking. He had only 2 years between announcing the law in 2005 and the election in 2007, and in that time the polls gave both parties a strong insight into what was happening in the electorate. If Howard had listened to those polls and backed down on the law he might not have lost his own seat and retired in ignominy in 2007.
I wouldn’t be surprised to learn that pollsters are performing about as well as in the past. But I think their performance may deteriorate if they have to call close elections with complex preference flows from many minor parties. Perhaps they and the media need to reconsider whether this is a good approach to reporting polls. Maybe they should drop the two-party preferred predictions, or be more honest about the margin of error.
Incidentally, I see this issue of the confounding influence of error in an area of public health I have mentioned here before – the Global Burden of Disease studies. These studies attempt to rank causes of death nationally and globally (so, e.g., ischaemic heart disease is the biggest cause of death in Japan while it’s stroke in the USA, things like that). But the only way they can make that call is if they don’t include the error in their mortality calculations, and/or minimize the amount of error they calculate (through a variety of tricks that they haven’t been caught out on yet). They know that a Japanese policy maker, say, puts zero weight on a report that says “the top killer could be heart disease or stroke or cancer”. They know they need ranks in order for their study to have policy impact, so they do everything they can not to draw attention to the amount of uncertainty in their rankings. I think pollsters feel the same way: their value lies in implicitly or explicitly calling an election (which is basically what the two-party preferred estimate does), not in giving ranges of outcomes. If they had said “the Liberal vote will be somewhere between 47% and 51%, and the ALP between 49% and 53%” the media would have stopped reporting them. But by making calls that go wrong, they damage their reputation even more – and also increase the risk that respondents stop taking their questions seriously and give dishonest answers, or refuse to participate, producing even more error. It’s a bind, but I see no evidence that Australian elections are going to become more decisive in the near future, so they need to figure something out.