Wednesday, November 1, 2023

Voice Referendum Polling Accuracy

The 2023 Voice referendum was a triumph for Australian (and in one case UK) opinion polling.  With all votes counted apart from a few dozen that may or may not exist, here is my assessment of the accuracy of the final polls, and of the polling overall.  

Before I start, a few words about the late count.  Firstly, the Yes vote rebounded after the first night of counting, from projecting to the low 39s to finishing at very nearly 40 (39.94 pending any late corrections, which appear unlikely at this stage).  The main causes of this were a strong performance by Yes on absent votes, a relatively strong performance on out-of-division prepolls (which were intermediate between absents and in-division prepolls), and both these forms of votes being substantially more common than in 2022 (29% and 22% more common respectively).  The latter also pushed the turnout up from the initial mid-to-high 80s range to more or less 90%, with it finishing at 89.92% (up 0.1% on the 2022 Reps election, but also up over 400,000 voters because of increased enrolment).  That said, because it is harder to get one's vote rejected over enrolment issues, a better comparison might be the 2022 Senate turnout of 90.47%, on which turnout was slightly down.  

34 divisions voted Yes, including all those in doubt after the night except for No's closest victory in Hotham, and 117 voted No.  Tasmania pipped NSW for the second-highest state Yes vote by 0.02%.

The polls overall

The Voice referendum was among the most heavily polled electoral events in Australian history, and the single most diversely polled, with at least 22 different pollsters releasing some kind of result on the Voice since the 2022 federal election.  The polling was characterised by a refreshing lack of herding and saw a range of approaches taken in terms of headline figures (these could be broken broadly into one-pass, two-pass and forced-choice approaches).  The major national polls towards the end were in general exclusively online, the main variation being that DemosAU used device engagement whereas the others employed panel polling.  This lack of method diversity turned out not to be a problem.  



The polls taken together told a story that was credible and easy to follow despite the high level of divergence towards the end.  As well as the basic voting intention story, many polls contributed useful data about attitudes to the Voice and reasons for voting, including some of the most deluxe polling reports (especially Resolve) seen so far in Australia.  I have posted this simple trajectory graph more than enough times but it bears posting one last time in case someone who has not seen it before sees this article:



Green  - Newspoll, Dark Green - YouGov, Magenta - Resolve, Grey - Essential, Dark blue - JWS, Light blue - Freshwater, Black - Morgan, Red - Redbridge, Orange - DemosAU, Purple - Focaldata, Red star - final result

On a two-answer basis the final polls collectively overestimated Yes by 1.9%, a small and entirely unsurprising difference given the history of referendum/ballot measure polling doing this, and down to a couple of polls with obvious house effects anyway.  My final aggregate reading (a "what the polls say" after adjusting for house effects and dates in field, not a forecast) was just 1.4% over.  It is pleasing that polling has proved so accurate here in spite of the referendum attracting the worst level of poll denying I have ever seen.  
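For anyone curious what that sort of reading involves mechanically, here is a minimal Python sketch of a house-effect-adjusted aggregate.  The pollster labels, house-effect values and weights below are illustrative placeholders only, not the inputs or the exact method used for my aggregate:

# Minimal sketch of a house-effect-adjusted "what the polls say" reading.
# All figures below are illustrative placeholders.
polls = [
    # (pollster, two-answer Yes %, assumed house effect on Yes, weight)
    ("Pollster A", 42.0, +1.5, 1.0),
    ("Pollster B", 40.0, -0.5, 1.0),
    ("Pollster C", 41.5, +1.0, 0.5),  # older dates in field, down-weighted
]
adjusted = [(yes - house, weight) for _, yes, house, weight in polls]
aggregate = sum(y * w for y, w in adjusted) / sum(w for _, w in adjusted)
print(f"Aggregate Yes reading: {aggregate:.1f}%")   # 40.5 on these placeholder numbers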
 

What is a "final poll"?

A lot of emphasis gets placed on final polls, but they are also the polls that can most easily be a target of herding, and therefore it's important to also look at how polls behaved overall.  While attempts are made to "recover" voting intention from pre-final polls, I am not really convinced that any of them are reliable, so there may be little alternative to continuing to use final polls to assess accuracy.  

The approach I have taken for this article is that a poll counts as a pollster's final poll if it was the last one released with a national result by that pollster before 8 am on polling day, and if it included at least some data collected after the issue of the writ (or earlier than that if the pollster designated the poll as final).

There are a number of complications here.  Redbridge released a poll that they subsequently stated was their final release, but there was later a poll described by Kos Samaras as "a joint Redbridge, Accent Research and Octopus Group project" that was in field 6-11 Oct but released after voting on 16 Oct, with no mention of Redbridge in the release.  (It had a 41-59 result that at the time of release looked worse than the final Redbridge but ended up being better!)  DemosAU released a national poll in September, but only Queensland and Western Australian results of its Oct 6-9 polling, apparently part of a national sample, were reported in The Australian.  Resolve released a poll indicated to be final, but also said that it was a midpoint campaign poll that the pollster did not necessarily expect to serve as a forecast.  

There is a view that polls that were taken three or four weeks out from the end of voting should be excluded on the grounds that voting intention was changing and if a poll taken that far out happened to get the right result then that probably means it was wrong at the time taken.  I've avoided this interpretation mainly for consistency with past approaches and to avoid subjecting a poll to a jury-of-its-peers assessment when it is possible the trajectory shown by its peers was incorrect.  (In this case the main issue is whether decline in the Yes vote accelerated more or less all the way to the end or tailed off).   On my models, voting intention probably did not decline by much after the issue of the writs anyway, perhaps 0.8%, so I see no reason not to allow earlier final polls like Freshwater.  

What is a final two-answer figure?

In federal elections, there is a lot of emphasis on correctly estimating the two-party preferred vote for the major parties.  We are used to pollsters releasing a set of primary votes and a two-party preferred vote that may be based on modelling rather than pure polling, as the pollster has to decide what is the most accurate way to forecast how minor party preferences will break.  Even if we know a pollster is using last-election preferences, sometimes their calculation might differ from what would be expected based on the published primaries.  The reason for this is rounding.

With referendums there is a similar challenge, but it relates to forecasting how the undecided voters will break.  The prevailing standard is to assume they will break proportionally, though Morgan at this referendum assumed they would break two-thirds to No.  

The approach I have taken is that if the pollster or their client did not report a specific two-answer figure before the final voting day, I have derived one based on the assumption that undecided voters would split proportionally.  Where the pollster or client did report a two-answer figure I have used it.  This is most consistent with how I have handled federal elections (especially Essential's 2PP+ method).
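As a rough illustration of the two allocation rules mentioned above (proportional split versus a fixed share of undecideds to No), here is a small Python sketch.  The 37-57-6 breakdown used is just an example of published figures of that shape, not a claim about how any pollster actually computed its estimate:

from typing import Optional

def two_answer(yes: float, no: float, no_share_of_undecided: Optional[float] = None) -> float:
    """Two-answer Yes % from published Yes/No figures (out of 100).

    With no_share_of_undecided=None the undecideds are dropped, which is
    equivalent to splitting them proportionally between Yes and No.
    Otherwise the undecided remainder (100 - yes - no) is allocated with
    that share going to No and the rest to Yes.
    """
    undecided = 100.0 - yes - no
    if no_share_of_undecided is None:
        return 100.0 * yes / (yes + no)            # proportional split
    return yes + undecided * (1.0 - no_share_of_undecided)

print(two_answer(37, 57))        # proportional: about 39.4
print(two_answer(37, 57, 2/3))   # two-thirds of undecideds to No: 39.0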

The two-answer figure issue is especially important in the case of Newspoll.  Newspoll released a final 37-57 poll, which with no other information puts Newspoll at 39.36 (still excellent, but not as close as Freshwater at 39.76).  However, The Australian's article reported (and the head of Pyxis publicly confirmed before polls closed) that the two-answer preferred estimate for the final Newspoll was 40-60 and not 39-61, making it clear that the final Newspoll was over 39.5 for Yes.  In theory, it could have been as high as 40.1 (taking into account that the sum was forced to 100).  The issue here is rounding: if 37-57 is really 37.2-56.8, then even that is enough to change the 2AP after rounding from 39-61 to 40-60.  

Freshwater issued a release claiming that they were the most accurate poll on the basis of simple conversions that ignored published information about two-answer estimates.  Their final poll was 33-50 and was also reported by the AFR as 40-60.  Although Freshwater published a 39.8 Yes result after the vote based on simple conversions, the pre-rounding figure could in theory have been anything between 39.5 and 40.4 (ditto to Newspoll re forcing).  I can find nowhere where the 39.8 was published before the vote, although Freshwater did say on 15 Oct that their Yes vote was "just below 40 per cent".
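To illustrate how much underlying precision is lost in whole-number published figures, here is a small sketch of the range of two-answer results consistent with simple ±0.5 rounding of the primaries.  This is my own back-of-envelope check, not either pollster's method, and it sets aside the forcing-to-100 complication discussed above:

def two_answer_range(pub_yes: float, pub_no: float, margin: float = 0.5):
    """Range of two-answer Yes % consistent with rounded published figures.

    100*y/(y+n) increases in y and decreases in n, so the extremes sit at
    opposite corners of the rounding box around the published values.
    """
    lo = 100.0 * (pub_yes - margin) / ((pub_yes - margin) + (pub_no + margin))
    hi = 100.0 * (pub_yes + margin) / ((pub_yes + margin) + (pub_no - margin))
    return lo, hi

print(two_answer_range(37, 57))  # roughly 38.8 to 39.9; forcing to 100 can widen this slightly
print(two_answer_range(33, 50))  # roughly 39.2 to 40.4; the published 40-60 2AP then implies at least 39.5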

Note that I accept two-answer responses that were published by pollsters or their clients, but I do not accept those that were merely sent to media who never published them, or that were available only on request.  I may accept estimates that were sent to media and released by them.  

Final poll table

The following table shows the published results that I can find for the final polls included in my survey (some others reported by media were excluded for inadequate information):



The table gives:

* dates in and out of field (note that the final Newspoll was mostly taken 7-12 Oct)
* N = sample size
* Final? = whether the poll was specified as the pollster's final poll when released or soon after release (NS = not stated, numbers = cases noted above)
* Yes (1) and No (1) = yes/no figures for first-pass questions that did not prod undecided voters to express a view
* Yes (2) and No (2) = yes/no figures for second-pass questions with undecided voters prodded but not forced
* Yes (F) and No (F) = yes/no figures with undecided voters forced to choose
* 2AP = the released or derived two-answer figure for Yes
* Diff = the difference from the actual Yes figure
* Abs = the absolute difference
* Und = the lowest undecided rate released by the pollster (forced choice treated as zero)
* 2AP Source = where I got the two-answer figure for the poll from

Table will be edited in case of minor corrections. 

Overall, polls that used the one-pass method tended to miss some potential Yes voters, though the break to No among the undecideds was still very strong.  Both two-pass methods and forced choice on average got the Yes vote about right, though two-pass underestimated the No vote, which isn't unusual.  (The issues with the two-pass polls result partly from the two polls with the biggest pro-Yes house effect being of this kind.)

Tiebreak Required!

Both Newspoll and Freshwater nailed the result on a two-answer basis, with both reported as 40-60; YouGov would probably also have done so had they released a 2AP figure.  While both released an excellent final poll on a two-answer basis, the question is what order to put these two in the table.  I have considered whether to break the tie by time of release but decided against it - there are arguments either way.  A later poll is more likely to reflect real voting intention rather than getting lucky if voting intention changes, but on the other hand if I use that as a criterion some pollsters might be tempted to go late but copy earlier polls.  I've decided that at least in this case the best tiebreak is the lowest undecided rate: a Newspoll with a result of 37-57-6 contains clear information that No will win heavily but not, say, two to one, while Freshwater's 33-50-17 leaves a lot more up in the air depending on how the undecideds break.  Therefore on tiebreak Newspoll had the most accurate final poll.  If there is another referendum soon I will give some thought to applying some form of accuracy penalty for polls with very high undecided rates, but if I do that I'll declare it in advance.
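A quick way to see the information-content argument is to compute the range of final Yes results each set of published figures is consistent with, depending on how the undecideds break.  This is just the reasoning above restated as a small sketch, using the 37-57-6 and 33-50-17 figures already quoted:

def possible_yes_range(yes: float, no: float, undecided: float):
    """Final Yes % if all undecideds break to No versus all to Yes."""
    return yes, yes + undecided

for label, figures in [("Newspoll 37-57-6", (37, 57, 6)),
                       ("Freshwater 33-50-17", (33, 50, 17))]:
    lo, hi = possible_yes_range(*figures)
    print(f"{label}: final Yes anywhere from {lo}% to {hi}%")
# Newspoll's figures rule out a two-to-one No win (Yes cannot fall below 37);
# Freshwater's are consistent with anything from a two-to-one defeat to a near tie.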

Newspoll therefore lands within 1% two-answer (or two-party preferred) for the fifth electoral event in a row - under trying circumstances, given that the breakaway of Pyxis from YouGov meant that Pyxis had to establish a new panel environment for Newspoll only months out from the poll.  The highly accurate YouGov poll was essentially a Newspoll clone but using the same panel as the former YouGov Newspoll, again underlining that what Newspoll is doing in terms of question wording is working well for the time being.  Freshwater Strategy continues to impress and seems to have fast become Newspoll's most promising challenger.  Focaldata, DemosAU and Redbridge also all did very well by referendum standards.  JWS and Resolve were more noticeably on the high side, but nothing major by referendum standards.  

Not All Polls Succeeded

While much of the polling for this referendum was excellent, there were a couple of big misses, with Morgan and Essential both out in the 6-7% range on a two-answer basis (though Essential did not publish a two-answer estimate; its final poll was 43-49).  We shouldn't be too harsh on polls that miss on exercises like this, because every poll (yes, even Australia Institute!) contributed something to understanding of what was likely to happen, and being too harsh on the outliers can encourage herding.  That said, neither pollster could be at all happy with misses that are way outside the margin of error on a two-answer basis.  Morgan also seems to have performed very poorly in NZ (pending final figures there) and Essential not that well there either.  

A common factor in these failings appears to be oversampling of politically engaged voters, which has also been seen in high Greens primary votes in both polls in Australia (until this week, when Essential introduced education weighting in response to its Voice miss).  This appears to have been a big part of the 2019 federal polling failure, and pollsters who don't weight or target for some proxy for high engagement were at risk of coming a cropper again in this one.  Essential's tracking behaviour also appeared unusual - while other polls were following the accelerating doom spiral downwards, Essential twice plateaued off.  

Essential had been attempting to use past vote to deal with the issue of over-engagement, but this didn't work.  One reason why it would not have worked at this referendum, even assuming honest voter recall, is that the highly engaged end of the Labor voter base (and probably the Coalition's more engaged end as well) would have been more likely to vote Yes than less engaged voters for those parties.  Therefore just knowing that someone is a 2022 Labor voter isn't enough.  


State Final Breakdowns

Not all polls provided final state breakdowns, but of those that did Newspoll was easily the best performer, nailing four (!) states including even Tasmania, getting relatively close in NSW and only being substantially incorrect in WA.  The table does not provide a fully valid basis to compare the various polls, since SA and Tasmania were the most error-prone states but not all pollsters produced breakdowns for them; thus, for example, Focaldata's state breakdowns (estimated by adding up their MRP estimates) were more useful here than Redbridge's.  Resolve would move up two places if Tasmania were excluded; Resolve appeared to have the most consistent panel issue of the pollsters polling the state (Morgan also had a panel issue with Tasmania, but less consistently).  


(ROA = rest of Australia.)  Again, the table may be edited for minor corrections.  Only Newspoll and Focaldata correctly projected that Yes would lose decisively in every state - a result that loomed as a serious prospect from months out from the vote but that the media frequently did not seem able to believe.  

Capping off an excellent campaign for Newspoll, the poll also released an 89% estimate for voters saying they were certain or very likely to vote, within 1% of the actual turnout.  


Focaldata MRP

The Focaldata MRP was a surprise entrant into the polling stakes (the only poll for which seat breakdowns were available, though DemosAU also did an MRP analysis).  Its sample size was very much smaller than that of the YouGov MRP, which was rather successful at the 2022 election, so how did it go?

Pretty well on the whole.  The model predicted the right winner in 133 of 151 divisions.  In three cases it predicted Yes but the result was No (Adelaide, Bennelong and, unsurprisingly, Greenway).  In fifteen cases it predicted No but Yes got up, including six teal seats (Warringah, Goldstein, Curtin, Mackellar, Wentworth and Kooyong) and also Jagajaga, Fraser, Gellibrand, Cunningham, Bradfield, Newcastle, Isaacs, Maribyrnong and Franklin.  One reason the model underestimated the number of Yes wins is that it had a similar standard deviation across divisions to the Republic referendum (+/- 10.2%), but the actual standard deviation of the Voice results was higher (+/- 12.6%); this greater polarisation brought more Yes wins but also more heavy losses.  The average error in the model's seat estimates was 4.56%.  The model's biggest overestimates were in South Australia, while its biggest underestimates were in teal seats and a range of Melbourne-area seats, especially those with high Green votes.  
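For anyone who wants to reproduce these sorts of seat-level checks, here is a minimal sketch of the metrics quoted (correct winners, average absolute error, and spread of estimates versus spread of results).  The two short lists are placeholders only; the real inputs would be Focaldata's 151 seat estimates and the 151 actual divisional Yes percentages:

from statistics import mean, pstdev

predicted_yes = [52.1, 44.3, 38.0]   # placeholder MRP seat estimates (%)
actual_yes    = [55.8, 41.2, 36.5]   # placeholder actual divisional results (%)

correct_winners = sum((p > 50) == (a > 50) for p, a in zip(predicted_yes, actual_yes))
avg_abs_error = mean(abs(p - a) for p, a in zip(predicted_yes, actual_yes))
spread_predicted = pstdev(predicted_yes)   # compare with the spread of actual results
spread_actual = pstdev(actual_yes)

print(correct_winners, round(avg_abs_error, 2), round(spread_predicted, 1), round(spread_actual, 1))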

The poll featured most prominently in debate when Prime Minister Albanese dismissed its forecast that No would win Lingiari.  No did indeed win Lingiari, albeit with 6.8% less than Focaldata's model projected (a difference largely down to not weighting by First Nations population, though doing so might have caused problems elsewhere).  By scoffing at this projection (and showing a lack of understanding of how Focaldata's poll worked in the first place), the PM gave a minor free kick to the ecological-fallacy offerings from some pro-No forces after the vote, who tried to use the Lingiari result as evidence that First Nations voters didn't support the Voice.

Onwards!

Will the polling industry get another test like this any decade soon, or will the appetite for referendums disappear?  The Republic was supposed to be the next cab off the rank, but barring a massive scandal in the royal family or a constitutional crisis that drew in the monarchy, another Republic referendum looks like it would lose similarly heavily.  So maybe we won't see anything like this for a long time, but this one is an encouraging result for the future of polling in Australia.  

After releasing this article I'm intending to have a quick look at comparative turnout that I may publish here, and I also have some general comments about the campaign and result (the intended part 2 of the previous article) that may come out if I am happy with them.

1 comment:

  1. Thanks Kevin - I have come to similar conclusions here: https://marktheballot.blogspot.com/2023/10/voice-referendum-2023.html

