Dr Kevin Bonham: The Silly Season 2: End-Of-Year Poll Myths

Friday, December 21, 2012

The Silly Season 2: End-Of-Year Poll Myths

Advance Summary

1. There has been some recent debate about the Government's polling position going into an election year.

2. The idea that this position has any special significance in projecting results is baseless.

3. While some pro-Labor sentiment has compared Labor's position now to Howard's end of year positions before elections that he won, in only one of three cases was Howard's position even arguably as bad.

4. Despite this there are earlier precedents for victory from seemingly quite poor end-of-year positions.

5. Analogies with Howard's position at the end of 2006 are fatally flawed because of the Labor leadership handover.

6. "Momentum" is a common concept in opinion poll commentary, but it has no basis in reality. Movements in one direction from poll to poll are most often followed by movements in the other, probably mostly as a result of random bouncing from sample to sample.

7. Attempts to define a poor end-of-year position for Labor are already outdated, having been brought into question by more recent polls. (This article gives reasons for considering Morgan Face 2 Face to be valid data and not ignoring it completely.)

This article also contains an unrelated section examining an argument by Peter Brent that a given level of voting intention is more durable for an unpopular leader than a popular one. There is not enough evidence to apply this to unpopular opposition leaders, and especially not to Tony Abbott at this time.

--------------------------------------------------------------------------------------------------------------------

In a previous article (Is The Silly Season Real?) I examined the idea that opinion polls behave strangely in December as many people swing into Santa mode and found that, at least in the case of Newspoll, there is no evidence to support it at all.

Another form of commentary that I have seen quite a lot of lately involves focus on the government's polling position at the end of the year and how this compares to that of previous governments, and what this might (supposedly) tell us about whether the government can win.

Frankly, I find this sort of comparison dubious, to say the least, even before considering the figures. Even if we just look at elections since the Howard government took office, and thus exclude the early-in-year elections of 1993 and 1996, we are still not comparing like with like. Two elections happened in October, two in November, one in August and the next one may well be in August too (or thereabouts). If the relevant issue is the ability of a party to recover a certain gap in polling in a limited amount of time before the election, then comparing these elections is like saying that horse H won at distance Y with weight W, and then speculating about whether zebra Z or donkey D can win at a distance that isn't actually Y under about the same handicap. Especially if horse H actually spent most of the race it won either munching the grass or running backwards.

No evidence has been presented that polling data for a government about eight months from an election is any more predictive based on it being the data at the end of the pre-election year than if it hadn't been, and until someone presents such evidence there's no reason to take the idea seriously. But some do; I've seen a few Labor fans after the Newspoll last week (54-46 to Coalition) talk up their chances on the grounds of Howard recovering from worse end-of-year figures. And The Australian's Prof Peter van Onselen has apparently heard the same as "the new spin being put out this week by the Prime Minister's office".

van Onselen's recent article "No way back from these figures - just ask Howard" argues that none of the end of year positions Howard won from were as bad as Gillard's current position, and that the one time Howard did find himself ending "the polling year before an election as badly as Gillard has ended this year", he lost. I've picked this article out for scrutiny for three reasons. Firstly, it falls roughly in the middle of the line that runs from cutting-edge psephology to somewhere just past Andrew Bolt; secondly the author is a highly qualified political scientist whose views may be taken seriously on that account; and thirdly, anyone who hosts a program called The Contrarians would have to approve of his own output getting thoroughly contra-d. At least, I hope so! The very worst thing about the article is its title, and I don't know if that was written by the author or a sub-editor.

Were Howard's Winning End-Of-Year Positions All Better than Gillard's?

van Onselen writes "In 1997 and 2000, the two-party results for the main parties were not calculated by Newspoll, so we don't have figures to go off for those elections. But we do for 2003." He is of course correct that the 2003 figures for the Coalition were clearly not as bad as Labor's are now, with the Coalition leading 51:49 in the last two polls. In late 2003 Labor had only just discarded the unpopular Simon Crean and replaced him with Mark Latham. The two December 2003 polls were Latham's first as leader and he was still too new to have rebuilt the party's position. The Latham honeymoon took Labor as high as 53s-54s* in March 2004 before fading (a better position than the Coalition has been in this year since at least August), and we all know where that ended. (* These appear as 55s on Newspoll's site but Newspoll was using respondent-allocated preferences in that year.)

As concerns the 1997 and 2000 situations, just because the 2PPs were not calculated by Newspoll doesn't mean they can't be calculated, at least roughly, and most people familiar with debate about Newspoll know that van Onselen's colleague Peter Brent did exactly that on his original Mumble site. (Note, however, that at least on my browser there is a huge problem with the old Mumble tables, caused by one date reference running onto two lines and putting all the dates below it out of alignment. I have a data set which fixes this.)

With access to these estimated 2PPs, it can be seen that in late 2000, Howard was indeed not so far behind. His government's last estimated Newspoll 2PPs that year were 46 and 51 in November, then 50 and 48 in December. My rolling Newspoll average puts him at 48.6 compared to the current government's 48. But things got much worse for Howard before they got better - in March 2001 his rolling average was below 45, and he still won. You can see the credibility graves of some of those who wrote him off midyear in this John Watson piece published this April.

1997 is a less clear story. Howard finished the year with estimated 2PPs of 47, 45 (Nov) and 47 and 49 (Dec). Well, that sounds like a much better end of the year position than 46, but only if you measure a party's standing by the final poll of the year, ignoring those in previous weeks (the previous week in the case of the final 47) and leave yourself at the mercy of random bouncing. 47-45-47-49 is actually not better as a rolling average than Gillard's 50-49-49-46. In retrospective trend terms, the improvement in the last few weeks of 1997 was real (1998 started 49-51-48 for the Coalition) but we still do not know what Gillard's last 2012 Newspoll will look like through the rear-view window. And, again, in June and July 1998 Howard seemed to be in big trouble, polling a 46 and a 44, but won.
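As a rough check on that comparison, here is a minimal Python sketch that sets the "last poll only" view against a plain unweighted mean of the last four readings. The unweighted mean is an assumption made for illustration; the rolling average used elsewhere in this article may weight polls differently, so its values need not match these.

```python
# Last four government 2PP readings of each year, as quoted in the text.
howard_1997 = [47, 45, 47, 49]   # estimated 2PPs (Nov, Nov, Dec, Dec)
gillard_2012 = [50, 49, 49, 46]  # Newspoll 2PPs

def simple_average(polls):
    """Unweighted mean of a run of polls (an assumed stand-in for a rolling average)."""
    return sum(polls) / len(polls)

def final_poll(polls):
    """The 'last poll of the year' view, which ignores earlier readings."""
    return polls[-1]

print(final_poll(howard_1997), final_poll(gillard_2012))          # 49 46
print(simple_average(howard_1997), simple_average(gillard_2012))  # 47.0 48.5
```

On the final poll alone Howard looks three points better off; on the four-poll mean he is a point and a half worse off.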

So in total, while van Onselen is largely right that Howard's end-of-year positions were not as bad as Gillard's now (debatable in one case, clearcut in the other two), in all three cases Howard reached a worse position than Labor's current position at some stage during the final year of polling.

If there is "no way back" from 46s, 47s and 48s at the end of the pre-election year, then why is there a way back from 44s in May and June of election years, and a 43 and 45s in March? Not just once but three times? It doesn't follow.

What About Leaders Before Howard?

A longer history of polling data (using Morgan Gallup) provides these examples where governments were trailing at or near the end of the year before a federal election and won it:

* In December 1973, Labor polled 43 compared to 50 for the Coalition and 7 for the DLP, possibly equivalent to a 2PP for the government of about 45%. Labor narrowly won a double-dissolution election in May 1974. Admittedly, this is not similar to cases of a government running close to its full term.

* In December 1976, the Coalition trailed Labor 45-50 on primaries, probably equivalent to about 48% 2PP. The Coalition won crushingly in December 1977.

* While I do not have figures for December 1979, in November 1979 the Coalition trailed Labor 44-49 (with most of the remaining votes being Democrats), equivalent to about 47.5% 2PP. The Coalition was re-elected in October 1980.

These examples suggest there is nothing inherently uncompetitive about Labor's 2012 end of year polling position.
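For anyone wanting to see what those rough 2PP conversions imply, the sketch below back-solves the share of non-major-party votes the government would need to reach each estimate. It is only an illustration: the function name is hypothetical, and the "others" figure for 1976 and 1979 is assumed to be whatever is left after the two major-party primaries (100 minus both), which the source does not state directly.

```python
def implied_flow(primary, est_2pp, others):
    """Share of votes outside the two majors that the government needs to reach est_2pp."""
    return (est_2pp - primary) / others

# (label, government primary, estimated government 2PP, votes outside the two majors)
cases = [
    ("Dec 1973 Labor", 43, 45.0, 7),      # DLP on 7, as quoted
    ("Dec 1976 Coalition", 45, 48.0, 5),  # others assumed to be 100 - 45 - 50
    ("Nov 1979 Coalition", 44, 47.5, 7),  # others assumed to be 100 - 44 - 49
]

for label, primary, est, others in cases:
    print(f"{label}: {implied_flow(primary, est, others):.0%} of other votes")
```

The three estimates correspond to the government receiving roughly 29%, 60% and 50% of the remaining votes respectively.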

Was Howard's Losing End Of Year Position As Bad As Gillard's?

No, it was actually much worse than the polling figures indicate. Here van Onselen writes:

"There was one occasion when Howard ended the polling year before an election as badly as Gillard has ended this year: in December 2006, ahead of Rudd's election as PM. On that occasion, the Coalition's primary vote had dipped into the 30s, at 39 per cent (five points higher than Labor's present primary vote). And the two-party vote for Howard's Coalition was down, at 45 per cent, one point lower than where Labor finds itself now.

It may not fit the script for the purposes of the PM's office spin, but if historical comparisons are to be made, comparing the way the Coalition ended 2006 to the way Labor is ending this year fits best. Poor polling to end a year before a summer recess makes it difficult for a prime minister to use the recess to try to build momentum. That was Howard's experience."

Actually, Howard's experience was Kevin Rudd. Rudd had been installed before the last poll of the year in an ultimately election-winning move by Labor (I'm not convinced that Beazley would have won anyway.) As the honeymoon effect for new Opposition Leaders usually takes a few months to build up its maximum head of steam, a 45% 2PP for Howard was only a sign of how bad it was going to get, and Howard would have had similar problems no matter what time of year such a popular pick was installed.

"Momentum", as applied to individual polls, is the phlogiston of pollgazing. Lots of people believe in it and talk about it but there is actually no evidence that it is real. There is an assumption that if a party gains in one poll it is likely to gain in the next as well, but actually exactly the opposite is true. To give an extreme example, in 51 cases where an Opposition has gained 4 or more 2PP points between one Newspoll and the next, that Opposition has gained in the next poll only 3 times, retained the new value 10 times, and gone backwards 38 times. Overall, if there is a poll-to-poll move towards the Opposition, in 24% of cases the next poll will also move towards the Opposition while in 57% of cases the next poll will move towards the Government.   If there is a poll-to-poll move towards the Government, in 27% of cases the next poll will also move towards the Government while in 56% of cases the next poll will move towards the Opposition.

On average, 40.6% of all 2PP Newspoll points gained by one side or other are lost immediately in the next poll. (Oddly, in my 600-poll data set, the percentage of gain that is lost is over 45% if the gain is two points or four, but below 30% if it was three.) And lest anyone think that this is just a two-steps-forward-one-step-back path to success, the points lost stay off. Indeed, if we look at the position four polls after a poll in which a side has gained, on average 45.2% of the points gained in the original poll have been lost.
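The kind of tally described above can be sketched in a few lines of Python. This is not the code behind the figures in this article: the function names are hypothetical, the "share lost" definition (points given back in the next poll, net of any further gains, divided by total points moved) is one assumed reading of the statistic, and the example series is made up rather than real Newspoll data.

```python
def follow_up_tally(series):
    """For each poll-to-poll move, classify the next move as continued, flat or reversed."""
    tally = {"continued": 0, "flat": 0, "reversed": 0}
    for a, b, c in zip(series, series[1:], series[2:]):
        first, second = b - a, c - b
        if first == 0:
            continue  # no initial move to follow up
        if second == 0:
            tally["flat"] += 1
        elif (second > 0) == (first > 0):
            tally["continued"] += 1
        else:
            tally["reversed"] += 1
    return tally

def share_of_gain_lost(series):
    """Net points given back in the next poll as a share of all points moved."""
    moved = lost = 0
    for a, b, c in zip(series, series[1:], series[2:]):
        first, second = b - a, c - b
        if first == 0:
            continue
        moved += abs(first)
        # Counts as a loss when the second move goes against the first.
        lost += -second if first > 0 else second
    return lost / moved

example = [48, 51, 49, 50, 47, 48, 48, 50, 51]  # made-up 2PP series
print(follow_up_tally(example))     # {'continued': 1, 'flat': 1, 'reversed': 4}
print(share_of_gain_lost(example))  # 0.5
```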

Reasons for this include that poll-to-poll gains often represent random bouncing, and sometimes also represent the impact of issues and other influencing factors with a very short shelf life. When issues affect polling, their impact also tends to fade (similar to the way debate and convention bounces in the US Presidential cycle peak after a few to several days and then gradually wash out of the system.)

Furthermore, if anyone wishes to argue "oh, only 40% of gains are lost, so the rest must be real and issue- or trend-driven", that's wrong. To the extent that a gain between Poll 1 and Poll 2 is exaggerated compared to the trend line, that part of the gain would be expected to be lost in the next poll. But if a gain between Poll 1 and Poll 2 actually represents a correction from Poll 1 being below-trend (which is also quite likely - the initial version of this article said "equally probable" but I've reconsidered that) then the points gained would be expected to be retained. It may be that something like 70% of poll-to-poll variation is just the reversal of incorrect gains and losses from the previous poll (or a reflection of very short-term factors) leaving perhaps 30% reflecting trend changes and issue impacts that might have lasting impacts. And that, based on the graphs in Possum's piece here, may even be an overestimate. (I'm thinking that the figure might be higher when looking at a single polling company because of the possibility of short-term anomalies that might skew one pollster's results while not affecting others as much, for example.)
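The "random bouncing" point can be illustrated with a small simulation: polls of a party whose true support never moves at all, with nothing but sampling noise from poll to poll. Everything here is an assumption for illustration only: the flat true 2PP of 50, the normally distributed noise with a standard deviation of 1.5 points, the rounding to whole numbers and the seed. None of it comes from the Newspoll data set.

```python
import random

def simulate_flat_polls(n_polls, true_2pp=50.0, noise_sd=1.5, seed=1):
    """Rounded poll readings around a true 2PP that never changes."""
    rng = random.Random(seed)
    return [round(rng.gauss(true_2pp, noise_sd)) for _ in range(n_polls)]

def reversal_rates(series):
    """Fractions of poll-to-poll moves that are continued or reversed in the next poll."""
    continued = reversed_ = moves = 0
    for a, b, c in zip(series, series[1:], series[2:]):
        first, second = b - a, c - b
        if first == 0:
            continue  # no initial move to follow up
        moves += 1
        if first * second > 0:
            continued += 1
        elif first * second < 0:
            reversed_ += 1
    return continued / moves, reversed_ / moves

polls = simulate_flat_polls(5000)
cont, rev = reversal_rates(polls)
print(f"continued: {cont:.0%}, reversed: {rev:.0%}")
```

Even with no real movement in support, reversals should clearly outnumber continuations in a run like this, which is the same qualitative pattern as in the Newspoll figures quoted earlier.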

I don't rule out "momentum" existing when applied to longer-term movements, for instance in asking if a party has improved its rolling average over several polls, is it likely to make further gains in subsequent polls. But it is exactly this kind of analysis that is frequently avoided by those using the concept.

When we try to apply this back to the idea that "momentum" going into the New Election Year is important, even though the rest of the time momentum clearly does not exist, the biggest problem we encounter is small sample size. In terms of Newspoll data, we have only nine elections. Three elections were held in the first half of the year, and three of the others involved Opposition Leader changes just before the start of the election year. The idea that we are going to get a useful insight into the hypothesis of start-of-election-year "momentum", that we can apply to 2013, from a sample of precisely three elections (1987, 1998, 2001) is something I'm not even going to try to take seriously.

(It is worth noting that in 1992-3, Labor ended the pre-election year in a roaring position, having risen from a 47 and a 46 (modelled 2PP) in October to a 56 and a 54 in December. With an election due early in the year, if "momentum" in the government's favour going into the Christmas season counted for anything, Hewson's Liberals would have been obliterated. Instead, they resumed in January with their key policy revised and immediately recovered the lead, from then on winning almost every poll except the one that counted.)

Don't Speak Too Soon, For The Wheel's Still In Spin

A common fault in both Fairfax and Murdoch punditry is a tendency to use their own house poll as an exclusive source of data for analysis, and ignore polls by other pollsters. (Admittedly I too tend to focus on Newspoll when doing historical work, but that is largely because Morgan's online data set is less useful because of the mix of polling methods, while Nielsen have seemingly removed their historical trend data from the web before I had the presence of mind to take a copy.)

But in any case, while Newspoll is the most-watched opinion poll, it is not the only show in town. Prior to this week there was an upsurge in the Coalition's position on polling aggregators, which was created by three polls giving it a 54:46 lead - Galaxy, Essential and Newspoll. I argued in volume one that the Newspoll 54 is really a 53, and I would also point out that, as Mark the Ballot shows, Galaxy and Essential both skew Coalition by a point relative to the mean of other polls. So the idea that we were really at 54:46 in the wake of the AWU saga (which probably wasn't driving much voting intention anyway) was never correct to begin with.

This week saw a mixed new batch of polls: a 52:48 for the Coalition, a 55:45 for the Coalition from Essential, and a 52.5:47.5 for Labor from Morgan face-to-face. Taking into account the house effects for each pollster, these are really 48, 46 and 50.5 for the Government, with the 48 being from the firm to which the most attention is paid. And that would be much more momentum-inducing than a bunch of 46s, if momentum existed in the first place.
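The house-effect arithmetic in the paragraph above can be written out as a short Python sketch. The one-point Coalition lean for Essential is from the text; the two-point Labor lean for Morgan face-to-face and the zero lean for the first poll are not stated outright and are assumed here because they are what the quoted adjusted figures imply. The first pollster is not named in the text, so it is left unnamed, and the function name is hypothetical.

```python
# (pollster, published Government 2PP, assumed lean to the Coalition in points)
polls = [
    ("unnamed pollster", 48.0, 0.0),      # 52:48 to the Coalition, no adjustment implied
    ("Essential", 45.0, 1.0),             # 55:45 to the Coalition, leans Coalition by 1
    ("Morgan face-to-face", 52.5, -2.0),  # 52.5:47.5 to Labor, leans Labor by an assumed 2
]

def adjust_for_house_effect(government_2pp, coalition_lean):
    """Add back the points a Coalition lean takes from the Government (a Labor lean is negative)."""
    return government_2pp + coalition_lean

adjusted = [adjust_for_house_effect(g, lean) for _, g, lean in polls]
for (name, _, _), value in zip(polls, adjusted):
    print(f"{name}: {value}")
```

This reproduces the 48, 46 and 50.5 quoted for the Government.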

Morgan face-to-face is widely rubbished because of its obvious Labor skew, but I for one completely agree with the following from Mark the Ballot:

"I use the Morgan face to face series because it is fairly consistent in respect of the other polls. It is a bit like comparing a watch that is consistently five minutes slow with a watch that is sometimes a minute or two fast and at other times a minute or two slow, but which moves randomly between theses two states. A watch that is consistently slow is more informative once it has been benchmarked than a watch that might be closer to the actual time, but whose behaviour around the actual time is random. In short, I think the people who ignore or down-play this Morgan series are not taking advantage of really useful information."

Also, while Bludgertrack downweights Morgan Face-to-Face to effectively half the value of a Newspoll on account of its poor predictive record at election time, I wonder to what extent this is because Morgan realises that the pro-Labor skew of the series would embarrass the company if polls using it were done during the final days. In the last two election campaigns, Morgan last used Face-to-Face a week before the election and then switched to phone polling. I wonder to what extent the apparent inaccuracy of F2F, even with its bias considered, could be a result of a lack of readings under this method in the last days of polling before elections. (Nielsen has also suffered in this way, for instance in 2007 when its final poll was way wrong.)

Of course, even at an adjusted 50.5 in Labor's favour, the Morgan is likely to be an outlier. But it is still valid data as evidence against the idea that the Coalition has a big lead, and in favour of the idea that Labor ends the year (for what little it matters, if at all) in a competitive position and with the recent move back to the Coalition at least watered down or perhaps even cancelled out. Even that may not be the end of it since some pollster may choose to spring a surprise late reading to see if anyone cares that there probably won't be a budget surplus.

------------------------------------------------------------------------------------------------------------------------

Mumble's "Beware High Personal Ratings"

Peter Brent has written a piece about leader ratings which is well worth reading, arguing that for a given voting intention, a party's position is least durable when that leader is popular and perhaps most durable when they are on the nose. It may well be that this is true, though as Peter points out it is still better to have a popular leader, with the 2PP advantages that go with that. It makes sense at least to expect that once a leader is very unpopular, dislike for them is already factored into the 2PP, so it should not be expected that it will continue driving it down. However, if something else does, then the party with the unpopular leader may strike trouble.

There are a few other reservations I wish to note:

1. Andrew Peacock in 1984 is used as an example of an unpopular leader in a wretched polling position who ended up doing well, but as I pointed out in As Gillard Recovered, So Can Abbott?, Peacock's was an unusual case in which the very unpopular leader became popular by the time of the election. I don't think we'll be seeing this with Abbott.

2. Of the nine results considered (these are comparisons between polling in the last six months and the election result), there are many other patterns that can explain a lot of the data. The first of these is that the Coalition on average improved on its polling by 1.2 points. Indeed of the six cases where the election result differed from the polling average substantially, the five largest were all in the Coalition's favour. This is consistent with my finding (see point 2 here) that the Coalition historically performs better relative to polling than Labor.

3. The second confounding pattern is that where polling shows one party with a significant lead, election results are typically closer, a common finding historically irrespective of leadership ratings. In the Newspoll era this was the case in all of 1987, 1990, 2007 and 2010 and only not the case in 1996.

4. Peter writes:

"But his low approval ratings, and Gillard’s not-too-bad ones, are no reason to believe he won’t win the election. On the contrary, it makes more sense to interpret these personal ratings as holding down Abbott’s measured voting intentions and holding up Gillard’s."

Peter's data includes only two comparable cases of Opposition Leaders for whom the popularity rot had set in - Howard in 1987 and Peacock in 1990. It is true that both these unpopular leaders performed better in the election than their polling in the previous six months had suggested. But we are probably more than six months out from the election now, and we do not yet know what the Coalition's 2PP benchmark for the last six months will be. I for one am more confident at present that Abbott will still be rating badly in that period, than that the Coalition will still be necessarily ahead.

This is important because if we average polling 6-12 months out for the Coalition leading up to the 1987 election, they were fully competitive (50.2% 2PP, a point better than their election result.) And that, not the last six months, is the correct comparison point for where we are now.

The 1990 case is complicated by Peacock recapturing the leadership less than a year from the election, but again in that case, Peacock's 2PP average more than 6 months out (49.7) was much better than his average in the last six months, and 0.4 points below the election result.

The other relatively unpopular Opposition Leader in Peter's sample is Hewson, but Hewson was not all that unpopular (average -7) at the stage of play that we are supposed to be at now.

So I think Peter's evidence is much more useful as an argument that popular bubbles where a leader soars on netsat and also does well on 2PP are prone to burst. To take the argument to the other end, and argue that unpopular opposition leaders are likely to outperform their current polling even this far in advance, requires a lot more data.

