Monday, November 23, 2015

Wonk Central: Reverse Engineering Special Newspolls

Welcome back to Wonk Central, the occasional series of excursions into psephological arcanity that ... well, if you got past those big words and have some sort of head for maths, you'll probably be right at home here.  This one's not as hard as some, probably just a Wonk Factor 4/5.

Today saw the release of extensive results of a special Newspoll on various national security issues. These included:

* support for ground troops to fight the so-called Islamic so-called State
* how many Syrian refugees Australia should be taking
* whether priority should be given to Christian refugees over others
* the chance the so-called Islamic so-called State will carry out a large scale terror attack in Australia
* whether the Muslim community in Australia is doing enough to condemn attacks like the Paris attacks
* whether Muslims living in Australia are doing enough to integrate into something Newspoll calls "the Australian community"

I will comment on the results in the next Poll Roundup.  What I want to discuss now is whether or not we can use the results to guess at the voting intentions in this Newspoll.  My working will likely assist anyone seeking to do so in similar cases in the future.

The question of reverse-engineering Newspoll received sustained attention (also see Pollbludger thread) in 2009 when Newspoll very unusually released a poll on attitudes to asylum seekers without releasing the poll's voting intentions.  The voting intentions were not released because of a pact with Nielsen, which was releasing voting intentions the same week.  Newspoll received much criticism for not releasing the voting intentions (not even weeks later), especially as its previous poll had been a rogue sample. Reverse-engineering of the new poll suggested it would show a Labor lead more similar to the polls before the rogue.

Nothing like this has happened since that I recall, and subsequent claims about "missing Newspolls" have been rubbish, usually based on someone failing to realise that Newspoll does not always poll fortnightly.  What we do often get, however, is advance results that include partial party breakdowns but do not state the poll's voting intentions.

Can the Newspoll be "solved"?

It might be thought that if we have enough information about the breakdown of a poll's results we could "solve" the poll and determine the exact levels of voting intention for each party.  Certainly, 23 breakdowns for six different questions looks rather promising.

The first problem is that there are still a lot of unknowns here.  In particular, we don't have any of the breakdowns for Others voters (those who say they won't vote Coalition, Labor or Green).  This means that even if we had exact information about the vote breakdowns, we wouldn't know enough to find all the unknowns.  For instance, suppose we tried solving off the "Committing ground troops" question:

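Let L=Coalition primary, A=ALP primary, G=Green primary, Z=Others primary, and let x, y and w be the unknown proportions of Others voters giving each of the three responses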

.48L+.41A+.21G+xZ = 42
.40L+.51A+.63G+yZ = 45
.12L+.08A+.16G+wZ = 13
L+A+G+Z = 100
x+y+w = 1

Here we have five simultaneous equations in seven unknowns, which is a no-can-do (sorry Campbell).

It might be thought that when we add in the next question we will get one more new equation than new unknowns (because we are adding one new coefficient of Z for each line, and then one new equation requiring these coefficients to sum to 1 at the end) and therefore that with three of these questions we could solve the puzzle via a massive simultaneous equation.

However, we run into big problems with rounding: none of the 48s, 42s, etc supplied by the pollster are exact numbers.  They are rounded, and in some cases they are not even rounded to the nearest whole number (because of Newspoll's trick of making everything add up to 100).  If we don't account for this we might end up with no solution or a silly one, and accounting for it is extremely messy.

The other apparent (and much more minor) problem here is that there seem to be hidden double-ups in the extra equations added. Some of them can be determined from the others, so are not actually adding new information, meaning that the dream of riding off into the sunset after solving 16 equations in 16 unknowns would be a mirage even if we had unrounded data.
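The double-up can be seen on the ground troops question alone.  What follows is only a rough check, and it rests on a substitution of mine rather than anything from Newspoll: treat the products xZ, yZ and wZ as single unknowns X, Y and W, so that the three response lines plus the two sum constraints (primaries sum to 100, Others proportions sum to 1, i.e. X+Y+W = Z) become a linear system, and then count how many of the five equations are actually independent.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix, by exact Gaussian elimination on Fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0  # number of pivot rows found so far
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

# Unknowns, in order: L, A, G, Z, X, Y, W
# (X = x*Z, Y = y*Z, W = w*Z is my substitution to make the system linear)
equations = [
    ["0.48", "0.41", "0.21", 0, 1, 0, 0],  # first response line  (= 42)
    ["0.40", "0.51", "0.63", 0, 0, 1, 0],  # second response line (= 45)
    ["0.12", "0.08", "0.16", 0, 0, 0, 1],  # third response line  (= 13)
    [1, 1, 1, 1, 0, 0, 0],                 # L + A + G + Z = 100
    [0, 0, 0, -1, 1, 1, 1],                # X + Y + W - Z = 0
]

print(rank(equations), "independent equations in", len(equations[0]), "unknowns")
```

The rank comes out as 4, not 5: adding the three response lines together gives the same thing as adding the two sum constraints together, so one of the five equations is a double-up.  If the same pattern holds for the other questions (I have only checked this one), each extra question would bring as many new unknowns as new independent equations, and the three primaries would stay free however many questions were added.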

A final possible problem is that Newspoll might be including some voters who are undecided on voting intention in the poll.  From the sample size this looks unlikely, however.

We can't "solve" a Newspoll as such, but we can try some assumptions.

What If Nothing Had Changed?

Sorry about the mess, but below is a screenshot of a spreadsheet for testing assumptions about this Newspoll.

What I do here is enter test values for the Coalition, Labor and Greens primary votes in cells H2, I2 and J2.  Thanks to Newspoll's sum-to-100 rule, the Others vote then appears in K2.  In cells K32 through K59 are the spreadsheet's estimates for the Others breakdowns for each question.  (The figures in bold are the Newspoll known knowns and the italicised figures are just working - ignore them.)  For instance, the cell K45 shows that for this set of primaries, the percentage of Others voters who think an attack in Australia is inevitable is about 32%.  Because of rounding issues, it won't be exact - it will just be something in that ballpark.
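For anyone without the spreadsheet, the same calculation can be sketched in a few lines of Python.  This is a minimal version and not the spreadsheet itself: the function name and layout are made up for illustration, the ground troops figures are those from the equations earlier in the post, and the test primaries are hypothetical values chosen only to show the mechanics.

```python
def implied_others(primaries, party_rows, totals):
    """Back out the implied Others breakdown (in %) for one question.

    primaries  -- (Coalition, Labor, Greens) primary votes in %
    party_rows -- for each response, the (Coalition, Labor, Greens) figures in %
    totals     -- for each response, the published whole-sample figure in %
    """
    others = 100 - sum(primaries)  # the sum-to-100 rule
    result = []
    for row, total in zip(party_rows, totals):
        # Percentage points of the whole sample explained by the three parties
        known = sum(p * b for p, b in zip(primaries, row)) / 100
        # Whatever is left over must have come from Others voters
        result.append(100 * (total - known) / others)
    return result

# Ground troops question: one tuple per response line
troops_rows = [(48, 41, 21), (40, 51, 63), (12, 8, 16)]
troops_totals = [42, 45, 13]

# HYPOTHETICAL test primaries (Coalition, Labor, Greens), for illustration only
test_primaries = (46, 34, 10)

for value in implied_others(test_primaries, troops_rows, troops_totals):
    print(round(value, 1))
```

Changing `test_primaries` plays the role of typing new values into H2, I2 and J2; the printed figures are the equivalent of the Others column for that question.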

There are two things I'm looking for that might bust any particular assumed set of primary votes.  The first is a negative value, or 100+% value, that isn't plausible.  The example above suggests that -2% of Others voters think a terror attack would "never" happen, but given that only 1% of the sample overall think it would, this is probably just a rounding issue, and can be ignored.  (Galaxy polls now and then had Abbott's implied preferred PM score among Others voters as a negative for this reason - it just meant his support was practically zero.)  
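To get a feel for how much slack rounding alone allows, here is a rough worst-case bound on a single implied Others figure.  It assumes every published figure (the total and each party breakdown) could be out by up to half a point and that the test primaries are exact; both are simplifications of mine, and the sum-to-100 adjustment means the true error can sometimes be larger.  The primaries are again hypothetical.

```python
def others_bounds(primaries, row, total, half_width=0.5):
    """Worst-case range (in %) for one implied Others figure.

    primaries  -- (Coalition, Labor, Greens) primary votes in %, taken as exact
    row        -- the (Coalition, Labor, Greens) figures for one response, in %
    total      -- the published whole-sample figure for that response, in %
    half_width -- assumed maximum rounding error on each published figure
    """
    others = 100 - sum(primaries)
    known = sum(p * b for p, b in zip(primaries, row)) / 100
    centre = 100 * (total - known) / others
    # The total can be out by half_width, and each party figure by half_width
    # weighted by that party's share of the sample.  In the worst case all of
    # these errors push the implied Others figure in the same direction.
    slack = 100 * half_width * (1 + sum(primaries) / 100) / others
    return centre - slack, centre + slack

# HYPOTHETICAL primaries; third response line of the ground troops question
low, high = others_bounds((46, 34, 10), (12, 8, 16), 13)
print(round(low, 1), round(high, 1))
```

With Others on 10% of the sample, the slack under these assumptions is 9.5 points either way, so an implied -2% sits comfortably within rounding distance of a true figure of a few percent.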

The other thing is results that are possible in theory but just don't make sense.  What I know of Others voters after the Abbott dumping is that they're a rather right-wing bunch.  Sure, there are Wilkie or McGowan supporters, Greens splitters and people who like independence for the sake of it in that mix, but a lot of current-day Others are Christian party voters, right-wing xenophobes or disgruntled Abbott fans who are saying they won't now vote Coalition. This only goes so far, though.  "Others" are not so uniform that we should expect 78% to think Muslims don't integrate enough and virtually none of them to think that they do (I can raise that zero to about 4% by fiddling with the rounding).

Another thing that's standard about Others voters is that they tend to be undecided about everything.  I'd expect their undecided rates to be higher, and the example ticks that box.

Where did I get the above primary votes from?  Well, that's just the previous Newspoll.  So from this I suspect that voting intention as measured by Newspoll has changed, if only slightly.

How much might it have changed by, and in what direction?

Possible Changes

Here is a spreadsheet giving example predicted breakdowns for Others voters based on various assumed primary vote totals.

Column A is the previous Newspoll, with the problems discussed above.  In column B I take 2% from the Coalition and give it to Labor.  This makes the problem of the low Muslim integration result worse.

In columns C and D I take two points from Labor (C) and the Greens (D) and give them to the Coalition.  This improves things, especially in the latter case.

In column E I take two points each from Labor and the Greens and give them to the Coalition and Others.  In column F I give these four points all to the Coalition with none to Others, and in column G I take four points from Labor and two from the Greens and give four to the Coalition and two to Others.
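The columns described above are just point shifts applied to a base set of primaries, which can be sketched as follows.  The base values here are hypothetical placeholders, not the previous Newspoll, and for column E I have assumed the four points are split two each between the Coalition and Others.

```python
# HYPOTHETICAL base primaries, for illustration only
BASE = {"Coalition": 46, "Labor": 34, "Greens": 10, "Others": 10}

# Point shifts for each column, as described in the text
SHIFTS = {
    "A": {},                                     # unchanged
    "B": {"Coalition": -2, "Labor": +2},
    "C": {"Labor": -2, "Coalition": +2},
    "D": {"Greens": -2, "Coalition": +2},
    "E": {"Labor": -2, "Greens": -2, "Coalition": +2, "Others": +2},  # assumed 2/2 split
    "F": {"Labor": -2, "Greens": -2, "Coalition": +4},
    "G": {"Labor": -4, "Greens": -2, "Coalition": +4, "Others": +2},
}

def apply_shift(base, shift):
    """Return a new set of primaries with the given point shifts applied."""
    shifted = {party: base[party] + shift.get(party, 0) for party in base}
    # Every scenario must still obey the sum-to-100 rule
    assert sum(shifted.values()) == 100
    return shifted

scenarios = {name: apply_shift(BASE, shift) for name, shift in SHIFTS.items()}

for name, primaries in scenarios.items():
    print(name, primaries)
```

Each scenario can then be fed through the same Others calculation, question by question, to see which implied breakdowns become more or less believable.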

Columns E, F and G give more convincing results on the Muslim integration question, but at the cost of increasingly negative results on the question about giving Christians priority.  Since a substantial portion of Others voters are either Christian micro-party voters or Christian conservatives annoyed by the dumping of Abbott, this seems a bit unlikely.

So overall what would least surprise me in tonight's Newspoll (if these figures are actually from tonight's Newspoll) would be a moderate (but not huge) further move to the Coalition, or at least away from Labor or the Greens.

I've posted this before the Newspoll, at the risk of my conclusions very quickly looking silly, because I think testing this sort of thing by announcing predictions before the poll comes out is good scientific practice. Let's see how it goes ... if the primaries from this poll are released I'll report further.

Update: Well that was fun ... but if it was the same sample then it didn't really work - Labor were down a point but the point went to the Greens!  If there was movement to the Coalition then it was lost in the rounding.

The most likely reason is the relatively small sample (about 160, possibly less because of scaling) of Others voters.  There might be some figure for which the natural response of Others voters is, say, 12%, but simply through sample noise the actual reading was 4, which then through accidents of rounding (especially if the rounding boosted the left's totals) comes out at an implied -2.  With such a very large number of measurements being taken at once, a rogue reading somewhere becomes more likely than not.  If it just happens to be the one that concerns an already very low reading, then the implied breakdowns for Others voters will look more suspicious than they are.
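Both halves of that argument can be roughed out numerically.  The sketch below assumes a simple random sample of 160 Others voters, treats the 23 published breakdowns as independent readings, and takes a 5% chance of any one reading being a rogue; the independence and the 5% figure are my assumptions for illustration, not properties of the poll.

```python
from math import comb, floor

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def prob_any_rogue(readings, q):
    """Chance that at least one of `readings` independent readings is rogue,
    when each has probability q of being rogue."""
    return 1 - (1 - q) ** readings

n_others = 160      # approximate Others subsample from the text
true_rate = 0.12    # the "natural" response rate in the example
low_count = floor(n_others * 0.04)  # most respondents giving a reading of 4% or less

tail = binom_cdf(low_count, n_others, true_rate)
any_rogue = prob_any_rogue(23, 0.05)  # 23 breakdowns; 5% per reading is assumed

print(f"P(reading <= 4% | true rate 12%, n={n_others}) = {tail:.4f}")
print(f"P(at least one rogue among 23 readings) = {any_rogue:.2f}")
```

The first number is the chance of that particular example occurring by sample noise alone, before any rounding effects are added; the second is the chance that something goes astray somewhere across all the readings.  A smaller effective sample (because of scaling) can be tried by lowering `n_others`.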

If the chance arises I will test this method again in the future.


  1. For what it's worth, Peter Van O is doing his usual "Whoa what an amazing Newspoll tonight!!" thing. The interesting thing is that Chris Kenny has joined in the fun as well.

    Some would argue that this is an indication the poll will be a good 'un for the Coalition.

  2. It seems that they were just going gaga about Shorten's horrible leader ratings.