Dr Kevin Bonham: Why Preferred Prime Minister/Premier Scores Are Rubbish

Monday, December 10, 2012

Why Preferred Prime Minister/Premier Scores Are Rubbish

Note added 2020: This article now has a sequel. See Why Better Prime Minister/Premier Scores Are Still Rubbish

Advance Summary of this Article:
1. All pollsters who currently conduct Preferred Prime Minister/Premier polling but do not poll approval ratings for each leader should poll leader approval ratings, either instead of PPM/PP or as well as it.

2. Preferred Prime Minister scores have long been maligned for repeatedly failing to predict election results. Although frequently reported in horse-race style by mainstream reporters, they are often dismissed as "beauty contest" scores by informed psephologists.

3. When a "house advantage" to the incumbent Prime Minister is taken into account, Preferred Prime Minister scores align better with election results and voting intention.

4. Preferred Prime Minister ratings are driven by, and lag behind, Prime Ministerial approval ratings, and are not especially good forecasters of future poll results.

5. The focus on Preferred Prime Minister/Premier scores not only leads to misleading commentary but also obscures useful data that are reflected in approval ratings but lost in scores that just compare leaders.

Disclaimer: This article only applies to the usefulness (or otherwise) of comparisons between existing party leaders, such as PM vs Opposition Leader. It does not (necessarily) apply to surveys comparing hypothetical leaders of the same party, or comparing an existing leader of a party with a potential leader of another, or to surveys like the recent Galaxy 4-way preferred prime minister poll.

--------------------------------------

Many pollsters ask voters to decide which of the Prime Minister and Opposition Leader they would prefer to be Prime Minister. At state level, the same thing happens with Preferred Premier polling, and some pollsters (this includes you, EMRS) poll Preferred Premier but do not poll approval ratings.

The actual value of Preferred Prime Minister scores has long been a source of difference of opinion, and at times abuse, between some parts of the mainstream political media and the online "psephosphere". In 2007 this got particularly heated when some mainstream journalists kept arguing that the election was in the balance because John Howard was not far behind Kevin Rudd as Preferred Prime Minister. (Howard trailed Rudd consistently from mid-February 2007 onwards, but only by an average of seven points, at times closing to within one.)

PPM bobbed up again soon after the 2007 election, when Brendan Nelson scored some grotesquely bad Preferred Prime Minister scores (with a low of 7%) and these were taken to reflect badly on him rather than just on his party's situation in the polls.

Preferred Prime Minister scores may again become the object of some interest given that Prime Minister Julia Gillard heads Opposition Leader Tony Abbott by a substantial margin (at the time of writing, 13 points) after the lead in that score had been swapped back and forth between the two several times between June 2011 and September 2012. I thought I'd take the opportunity to have a look at PPM scores again, see if some of the earlier suspicions about it still stood up, and see if it is actually any use at all.

The Case For The Prosecution: PPM Fails Three Elections In A Row

Obviously one of the most important things a polling statistic should do if it is to be considered a big deal is to predict the result of an election taken just days later (assuming there is no unusual late swing). But during the 1990s, preferred prime minister polling did itself no favours with the following three consecutive failures:

1993: John Hewson led Paul Keating as Preferred Prime Minister for most of 1992, and took the lead again during the 1993 campaign. He was PPM in six campaign Newspolls in a row (albeit by an average of only 3.3 points) but lost, and in the end it wasn't all that close.

1996: After trailing for most of 1995, Paul Keating took the lead as PPM against John Howard at the end of 1995 and led by an average of 4.4 points from there to the election (one poll was tied). However, Keating was thrashed.

1998: Kim Beazley and John Howard had been extremely close as PPM for a year leading into the election. In election week Beazley led Howard by a point. Howard won with a comfortable majority.

Other PPM duds included 1990 (PPM wasn't close, the election was), 2007 (PPM was close, the election wasn't) and 2010 (Gillard had a double-digit lead but scraped in). As we only have Newspoll PPM figures for nine elections, PPM is hardly setting the world on fire.

However, this is not so bad as it sounds. Apologists for PPM can point out that Beazley wasn't really ahead as a rolling average in 1998, and he actually won the 2PP vote, and only lost the election in seat terms. They can also point to the PM's house advantage (see below) as an explanation for the government performing worse than the PPM score suggested in 1996, 2007, 2010 and partly 1990. And as for 1993, since that election breaks almost every other predictive model in the book (except the one that says that unpopular Opposition Leaders do not win), PPM's in good company in failing there. So maybe, if we take the house advantage into account, PPM might actually still be useful?

The PM's House Advantage

Newspoll polled Preferred Prime Minister in the leadup to the 1987 and 1990 elections and then more or less constantly since July 1991. For the polls for which PPM scores have been available, the average Prime Minister net satisfaction reading (or netsat, %approve - %disapprove) has been just a fraction under zero, the average Opposition Leader net satisfaction reading has been -1, and the government has averaged a trivial 50.17:49.83 two-party lead. So all these indicators have been remarkably neutral.

Yet although PMs and Opposition Leaders tend to have about the same personal ratings, the Preferred Prime Minister score shows an average lead of 16 points to the incumbent. Only six Opposition Leaders (Hewson, Downer (briefly), Howard, Beazley (not that often), Rudd and Abbott) have led on PPM at all, and when this has happened the government has almost always been well behind on two-party preferred.

When the election results are considered again, with the incumbent PM docked that 16-point house advantage, the disparity between PPM and election results becomes a little less glaring:

(The PPM column shows a weighted rolling average lead (or deficit) for the PM at election time, and the next column simply subtracts 16.)

With this adjustment made, the cases where the Prime Minister was well behind on adjusted PPM include two losses, a substantial loss on 2PP and a freak win, while cases where the Prime Minister was well ahead or about even were all wins (some of them close.) It's to be expected that the close cases would be most likely wins, since governments normally don't quite need to be level on 2PP to win.

The house advantage makes sense. An incumbent Prime Minister has actually done the job, while an Opposition Leader generally hasn't. A voter who thinks the incumbent is doing a good job is less likely to prefer they be replaced, even by someone who they also think is doing well as Opposition Leader. A voter who disapproves of both leaders may prefer the devil they know - after all if an Opposition Leader is doing a poor job of opposing a PM who the voter thinks is also doing badly, then it's likely their view of the Opposition Leader is very negative.

Prime Minister Gillard has led on Preferred Prime Minister by 14, 10, 11, 14, 13 and 9 points in the six most recent Newspolls. That puts her slightly behind the house advantage, and also slightly behind where she was at the 2010 election. Historically, such a lead is only a sign of a party being in a vaguely competitive position. It does not signify a real advantage.
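To make the arithmetic concrete, here is a minimal sketch of the house-advantage adjustment applied to the six Gillard leads quoted above. It assumes a plain unweighted mean (the election table uses a weighted rolling average, whose weights are not given here), and the function and variable names are illustrative only.

```python
from statistics import mean

# Gillard's PPM leads over Abbott in the six most recent Newspolls (from the text).
recent_leads = [14, 10, 11, 14, 13, 9]

# Long-run average PPM lead to the incumbent PM, as estimated in this article.
HOUSE_ADVANTAGE = 16


def adjusted_lead(raw_lead, house_advantage=HOUSE_ADVANTAGE):
    """Dock the incumbent's house advantage from a raw PPM lead."""
    return raw_lead - house_advantage


# Assumption: a simple unweighted mean, not the weighted rolling average
# used for the election-time figures.
average_lead = mean(recent_leads)

print(round(average_lead, 1))                 # 11.8
print(round(adjusted_lead(average_lead), 1))  # -4.2
```

On these figures the PM's adjusted lead is a few points negative, which is what "slightly behind the house advantage" means in practice.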

Is Preferred Prime Minister Predictive?

Preferred Prime Minister scores do correlate quite closely with voting intention. However, in looking at how closely, I discarded all polls where a new Opposition Leader's "uncertain" approval rating had yet to fall below 21, since these polls return anomalous figures for Opposition Leader netsat and 2PP comparisons (as shown in The Abbott Factor). In the remainder, the PM's lead as Preferred Prime Minister, the PM's PPM score, and the Opposition Leader's PPM score each individually explain 50-56% of variation in party voting intention, which rises to 70-80% in the Gillard-vs-Abbott era. These are actually generally higher percentages of explained variation than for the Prime Minister's net satisfaction rating, the PM's approval rating and the PM's disapproval rating, and the graphs are steeper too. For instance, for every point an Opposition Leader loses in terms of voters picking them as Preferred Prime Minister, the Opposition loses 0.352 points in voting intention. In the Abbott/Gillard era, that has risen to 0.759 points, which is very high. The mid-September 2012 poll, with a 5-point swing on 2PP and a 6-point drop in Abbott's Preferred PM score, is a good example.

But while that sounds promising, for PPM to be actually any use it needs to be predictive, and not just a close reflection of 2PP with a house edge thrown in to create extra confusion. And the prospects of this from Possum's earlier coverage weren't very promising:

"3.The other thing we know is that the Primary and TPP vote is both covariant with, and a lagging indicator of the Preferred PM rating (meaning that the two measures either move together and/or the Preferred PM rating follows the primary and TPP votes with a slight lag.)

The latter being what started the Poll Wars between The Oz and the better informed blogging community.

The Preferred PM rating is essentially a meaningless beauty contest which has no statistical bearing on the vote. It either moves with changes in voting intention and satisfaction ratings, or lags behind them, and the relationship between the vote and the PPM is pretty tight as far as polling relationships go."

What I found in my sample (bearing in mind that I have kicked out the early stages of most Opposition Leaderships) is slightly different. As with Possum's recent findings for Prime Minister net satisfaction driving voting intention (The Primary Dynamic), Preferred Prime Minister also "Granger causes" voting intention (which is not to say it is the real cause, and indeed it isn't.) If I compare the Opposition 2PP in linear regression models like these:

Modelled Opposition 2PP(1) = 52.97 - 0.172*(Prime Minister's PPM Score - Opposition Leader's PPM Score)
Modelled Opposition 2PP(2) = 64.11 - 0.192*(Prime Minister's PPM Score)
Modelled Opposition 2PP(3) = 39.25 + 0.352*(Opposition Leader's PPM Score)

...with the actual 2PP, I've found that where the Opposition 2PP has been higher than the modelled figure, it has been more likely to go down in the next poll, and where it has been lower, it has been more likely to go up. For instance, the third equation above scored 234 hits to 115 misses in projecting which way the Opposition's 2PP went in those Newspolls in the samples in which it moved.
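For anyone wanting to play with these, the three equations and the "which way will it move next" rule can be sketched as below. The coefficients are those given above. The PPM inputs (PM 46, Opposition Leader 32) are the September 2012 figures listed later in this article, while the "actual" Opposition 2PP of 53.0 is a made-up value purely for illustration. The function names are my own labels, not anything from Newspoll.

```python
def model_lead(pm_ppm, ol_ppm):
    """Model (1): Opposition 2PP from the PM's lead as Preferred PM."""
    return 52.97 - 0.172 * (pm_ppm - ol_ppm)


def model_pm(pm_ppm):
    """Model (2): Opposition 2PP from the PM's PPM score alone."""
    return 64.11 - 0.192 * pm_ppm


def model_ol(ol_ppm):
    """Model (3): Opposition 2PP from the Opposition Leader's PPM score alone."""
    return 39.25 + 0.352 * ol_ppm


def predicted_direction(actual_2pp, modelled_2pp):
    """Actual above the modelled figure suggests a fall next poll, below a rise."""
    if actual_2pp > modelled_2pp:
        return "down"
    if actual_2pp < modelled_2pp:
        return "up"
    return "no call"


pm_ppm, ol_ppm = 46, 32          # Sep 2012 PPM scores from this article
actual_opposition_2pp = 53.0     # HYPOTHETICAL value, for illustration only

print(round(model_lead(pm_ppm, ol_ppm), 2))  # 50.56
print(round(model_pm(pm_ppm), 2))            # 55.28
print(round(model_ol(ol_ppm), 2))            # 50.51
print(predicted_direction(actual_opposition_2pp, model_ol(ol_ppm)))  # down
```

Note that the three models need not agree with each other on a given poll: here model (2) would say "up" for the same hypothetical 2PP, because it only looks at the PM's own score.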

However, while PPM can be used as a leading indicator of 2PP changes (in cases where the Opposition Leader has been in for a while), it isn't actually the best one. The first reason for this is that PPM is itself driven by the Prime Minister's approval score, and lags behind it when change occurs. Not surprisingly, something like this:

Modelled Opposition 2PP(4) = 50.37 - 0.113*(Prime Minister's Netsat)

performs just a little bit better on past data (eg 239 hits to 110 misses in predicting direction of change - not a remotely significant difference alone, but one that was to be expected.)

(For Abbott-Gillard it is Modelled Opposition 2PP(4) = 50.48 - 0.157* (Prime Minister's Netsat)).
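A similar sketch for the netsat model, with both sets of coefficients, plus the hit rates implied by the hits and misses quoted above. The netsat of -16 used as input is Gillard's September 2012 reading listed later in this article; the function names are illustrative.

```python
def model_netsat(pm_netsat, intercept=50.37, slope=-0.113):
    """Model (4): Opposition 2PP from the PM's net satisfaction (whole sample)."""
    return intercept + slope * pm_netsat


def model_netsat_abbott_gillard(pm_netsat):
    """Model (4) with the Abbott-Gillard era coefficients."""
    return model_netsat(pm_netsat, intercept=50.48, slope=-0.157)


def hit_rate(hits, misses):
    """Percentage of polls where the direction of 2PP change was called correctly."""
    return 100 * hits / (hits + misses)


pm_netsat = -16  # Gillard, Sep 2012, from this article

print(round(model_netsat(pm_netsat), 2))                 # 52.18
print(round(model_netsat_abbott_gillard(pm_netsat), 2))  # 52.99

print(round(hit_rate(234, 115), 1))  # 67.0 (Opposition Leader PPM model)
print(round(hit_rate(239, 110), 1))  # 68.5 (PM netsat model)
```

The gap between the two hit rates is about a point and a half on 349 polls, which is why it is not worth reading anything into on its own.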

And the second and bigger reason is that while trying to predict the next Newspoll based on discrepancies between the 2PP in the current one and various leadership indicators sounds neat, it doesn't work nearly as well as trying to predict the next Newspoll based on the recent record of 2PP alone. For instance, using my own current pet rolling average (average of the four most recent 2PPs with the most recent counted twice), the prediction that the next Newspoll will move towards the current rolling average correctly predicts the direction of change in 73% of cases where change occurred, compared to 68% for the best of the simple leadership formulae. And I have no reason to believe my pet rolling average is actually the most reliable available - indeed given that I chose it arbitrarily, it almost certainly isn't.
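The rolling average described above is simple enough to sketch. I am reading "the four most recent 2PPs with the most recent counted twice" as weights of 1, 1, 1 and 2, divided by 5. The 2PP readings in the example are invented for illustration and are not real Newspoll results.

```python
def pet_rolling_average(two_pp_history):
    """Average of the four most recent 2PPs, with the most recent counted twice.

    Assumes the list is in chronological order (oldest first).
    """
    if len(two_pp_history) < 4:
        raise ValueError("need at least four 2PP readings")
    a, b, c, latest = two_pp_history[-4:]
    return (a + b + c + 2 * latest) / 5


def direction_towards_average(two_pp_history):
    """Predict that the next poll moves from the latest 2PP towards the rolling average."""
    latest = two_pp_history[-1]
    average = pet_rolling_average(two_pp_history)
    if latest > average:
        return "down"
    if latest < average:
        return "up"
    return "no call"


# HYPOTHETICAL Opposition 2PP readings, oldest first, for illustration only.
history = [52, 53, 51, 54]

print(pet_rolling_average(history))        # 52.8
print(direction_towards_average(history))  # down
```

Nothing about leaders goes into this at all, which is rather the point: it only uses the recent 2PP record.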

(Update: It is notable that all of these methods, when applied to the 46-54 Newspoll out tonight, predict that if a Newspoll was held next fortnight the Coalition 2PP would go down.)

Furthermore, aggregation of polls across multiple pollsters (as discussed in my previous article) is very likely to give us a better handle on where voting intention is really at, than trying to use either leadership scores or rolling averages to correct any single poll. The reason for this is that the more polls can be included in a sample, the more the rough edges caused by the random ups and downs of particular polls are smoothed out. Changes from poll to poll sometimes indicate rapid shifts that are real, and any poll movements contribute some evidence about the trend, but much more often the underlying change in voting intention from fortnight to fortnight is rather small and swamped by noise.

The whole business of what is the best possible model for using Newspoll data to extract signs of error in the voting intention measurements of the current Newspoll, and hence get a more accurate understanding of where voting intention really is, is probably best left to the sorts of people who win Kaggle competitions and not statistical light-middleweights like me. And it's possible that someone, with a fiendishly complex model based on several different indicators, will find a way to squeeze some actual small use out of Preferred Prime Minister scores. But, given that:

* Preferred Prime Minister scores are deceptive because of house advantage.
* Preferred Prime Minister scores are driven by the Prime Minister's personal ratings and lag behind them.
* There is no evidence of a simple way to use PPM scores to predict anything more effectively than by using other simple methods.

... it is hard to see how focus on Preferred Prime Minister "leads" and so on by mainstream commentators can be considered in any way informed or useful. But the greater share of the blame must go to those pollsters who continue to use "preferred leader" ratings instead of (or more often than) approval ratings, and thus continue to churn out figures that journalists comment on, because they imagine that since they have come from A Pollster, they obviously Must Mean Something Big!

And this brings me finally to my biggest beef about Preferred Leader scores, especially in the Tasmanian context:

Preferred Leader Scores Lose Useful Information

Here are some selected, and only lightly cherrypicked, Preferred Prime Minister scores of the past, with the net satisfaction scores of the two leaders in brackets afterwards:

* Dec 1992 - Keating 43 Hewson 30 Unsure 27 (-24, -30)
* Sep 1994 - Keating 44 Downer 28 Unsure 28 (-12, -31)
* July 1997 - Howard 43 Beazley 29 Unsure 28 (-1, +11)
* November 1999 - Howard 47 Beazley 34 Unsure 19 (+19, +16)
* August 2004 - Howard 48 Latham 34 Unsure 18 (+7, +15)
* Sep 2012 - Gillard 46 Abbott 32 Unsure 22 (-16, -30)

Apart from variation in the Unsure score, and slightly higher or lower figures for the given leaders, there is not a lot between all of these in PPM terms. All have the PM leading by 13-16 points, thus around or a little behind the house edge.

Yet these relatively minor variations in PPM conceal a massive sweep of 43 points from best to worst in Prime Ministerial net satisfaction rating, and 47 points from best to worst for the Opposition Leader. They also include a case where the PM's netsat is 12 points worse than the Opposition Leader's and a case where the PM's netsat is 19 points better. The main reason for this is that the Preferred Prime Minister score loses information about whether the leaders are actually popular, and whether the voters are comparing two leaders they like or two leaders they don't like.
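The figures in that paragraph can be checked directly from the six polls listed above. This is just a quick sketch with the listed numbers typed in; the tuple layout and variable names are my own.

```python
# (poll, PM PPM, Opposition Leader PPM, PM netsat, Opposition Leader netsat)
polls = [
    ("Dec 1992", 43, 30, -24, -30),  # Keating v Hewson
    ("Sep 1994", 44, 28, -12, -31),  # Keating v Downer
    ("Jul 1997", 43, 29, -1, 11),    # Howard v Beazley
    ("Nov 1999", 47, 34, 19, 16),    # Howard v Beazley
    ("Aug 2004", 48, 34, 7, 15),     # Howard v Latham
    ("Sep 2012", 46, 32, -16, -30),  # Gillard v Abbott
]

ppm_leads = [pm - ol for _, pm, ol, _, _ in polls]
pm_netsats = [pm_ns for _, _, _, pm_ns, _ in polls]
ol_netsats = [ol_ns for _, _, _, _, ol_ns in polls]
netsat_gaps = [pm_ns - ol_ns for _, _, _, pm_ns, ol_ns in polls]

print(min(ppm_leads), max(ppm_leads))      # 13 16
print(max(pm_netsats) - min(pm_netsats))   # 43
print(max(ol_netsats) - min(ol_netsats))   # 47
print(min(netsat_gaps), max(netsat_gaps))  # -12 19
```

So a three-point spread in the PPM lead sits on top of netsat spreads of more than forty points for each leader.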

In the Tasmanian context, voters go years without seeing a leader approval score. We know that at present the Liberal Party has a massive lead over the Labor/Green coalition and is on track to victory, but do voters actually think Lara Giddings herself is doing a reasonable job in the circumstances or a bad one? We do not know. Is Will Hodgman very popular, or do voters think he's just an average political plodder who happens to hold a dream hand? We do not know. Nick McKim surely isn't still the most popular Greens leader ever, but do the voters disapprove of his leadership over the current state of the forestry debate or do they think he is doing an OK job in a hung parliament, even if they might not vote for him? We do not know.

This information could all be very useful (especially in trying to pick what, if anything, is really driving poll-to-poll primary vote movements like the massive Green drop in the last poll), but instead all we get are Preferred Premier scores that seem to fairly closely track, and certainly move with (given the three-month poll cycle), the primary votes for the various parties. (I haven't attempted to extract the "house advantage" in the Tasmanian situation - I think that given the problems with undecided rates, such an exercise would not be very accurate.)

Approval rating polling would add a great deal to understanding of the state political picture, while Preferred Premier scores at present add very little. Hopefully EMRS will at some stage start asking approval rating questions - or if they don't, perhaps somebody else will.

----------------------------------------------

PS: An interesting result I discovered by accident while working on this article is that for Opposition Leaders who are very well-established (uncertainty score 15 or less) the relationship between Opposition Leader netsat and 2PP is much weaker than for Opposition Leaders who are only moderately established (16-20), even when Tony Abbott is removed from the first group. (Including Abbott in the first group destroys the relationship for that group more or less entirely.) For the second group, the relationship is about three-quarters as strong as for Prime Ministerial netsats. I am still considering why this is the case but it is possible that the familiarity-breeds-contempt issue for very well-known Opposition Leaders discussed in The Abbott Factor plays a role.
