Sunday, June 16, 2019

Seat Betting As Bad As Anything Else At Predicting The 2019 Federal Election

Advance Summary

1. Seat betting markets, sometimes believed to be highly predictive, did not escape the general failure of poll and betting based predictions at the 2019 federal election.

2. Indeed, seat betting markets were significantly worse predictors of the result than the national polls through the election leadup, and only converged with polling-based models to reach a prediction that was as inaccurate as the national polls at the end.

3. Seat betting predicted fourteen seats incorrectly, but all of its errors in Labor vs Coalition contests, in common with most other predictive methods, were in the same direction.

4. Seat betting markets did vary from a national poll-based outlook in several seats, but their forecasts in such cases were about as often misses as hits.

5. This is the third federal election in a row at which seat betting has failed to show that it is a useful predictor of classic (Labor vs Coalition) seat-by-seat results in comparison with simpler methods.  

-----------------------------------------------------------------------------------------------------

With all House of Representatives seats now declared, it's time for a regular post-election feature on this site, a review of how seat-by-seat betting fared as a predictive method.  I have been interested in this subject over the years mainly to see whether seat betting contained any superior insight that might be useful in predicting elections.  In 2013 the answer was a resounding no, in 2016 it was a resounding meh, and surely if seat betting could show that it knew something that other sources of information didn't, 2019 would be the year! Even if seat betting wasn't a very good predictor, if it was not as bad as polling or headline betting this year, that would be something in its favour.

2019 saw the first failure in the headline betting markets since 1993, but it was a much bigger failure than that.  In 1993 Labor were at least given some sort of realistic chance by the bookies, and ended up somewhere in the $2-$3 range (I don't have the exact numbers).  This year the Coalition were $7.00 to Labor's $1.10 half an hour before polls closed - just an implied 14% chance -  and Sportsbet had already besmirched itself in more ways than one by paying out early (which I think should be banned when it comes to election betting, but that's another story).  The view that "the money never lies" has been remarkably immune to evidence over the years, but surely this will be the end of it for a while.
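For anyone wanting to check the implied chance quoted above, here is a minimal sketch of the arithmetic. It assumes decimal odds and a two-outcome market; the function name is mine for illustration, not anything a bookmaker publishes. The raw figure is simply one over the price, and the normalised figure strips out the bookmaker's margin by rescaling so the two chances sum to one.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds to raw and margin-adjusted implied probabilities.

    decimal_odds: dict mapping outcome name -> decimal price (e.g. 7.00).
    Returns (raw, normalised), both dicts keyed by outcome name.
    """
    # Raw implied chance is the reciprocal of the price.
    raw = {name: 1.0 / price for name, price in decimal_odds.items()}
    # The raw chances sum to more than 1 because of the bookmaker's margin.
    overround = sum(raw.values())
    # Rescale so the chances sum to exactly 1.
    normalised = {name: p / overround for name, p in raw.items()}
    return raw, normalised


# Prices quoted in the post, half an hour before polls closed.
raw, normalised = implied_probabilities({"Coalition": 7.00, "Labor": 1.10})
print(f"Coalition raw: {raw['Coalition']:.1%}")
print(f"Coalition normalised: {normalised['Coalition']:.1%}")
```

The raw figure is the roughly 14% mentioned above; allowing for the margin brings it down slightly further.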

Seat Betting Outcomes

As usual I tracked seat betting predictions at various times over the final few months, with a final check at 1 am on election day.

Five markets were split between different companies and I noted that cross-market averages had the Coalition in Reid, Farrer and Stirling and Labor in Braddon and Capricornia as favourites.  Sportsbet (the biggest market) disagreed with the latter two, and had a tie in Farrer.  In total, the markets had Labor as favourites in 80 seats, the Coalition 65 and others 6.  

The markets, at least on average, expected the following Labor wins in seats won by the Coalition:

Bass
Braddon (split - Sportsbet correct)
Capricornia (split - Sportsbet correct)
Forde
Robertson
Petrie
Dickson
Hasluck
Longman
Chisholm
La Trobe
Swan

The markets wrongly expected the Coalition to win Indi (won by Helen Haines) and lose Cowper. (And Sportsbet fence-sat on Farrer, which the rest on average were right about.)

Overall therefore, the markets were wrong in 14 seats (Sportsbet was wrong in 12.5, counting its Farrer tie as half an error).  This included being wrong in 12 classic-2PP seats (Sportsbet was wrong in 10).  That's not that bad by itself; it's not clear that it's possible to reliably do any better than that (especially not when the national polls are wrong).  The problem, as with most other predictive methods at this election, was that the classic-seat errors all lay in the same direction.

None of Labor's seat failures were massive surprises in and of themselves.  Only Longman was outside $3 on all markets.

Tracking

My final colour-coded graph of seat betting tracking over time looked like this:

[Graph: seat betting favourite in each seat over time, colour-coded as per the key below]



Key to colours:

Red - Labor favourite in all markets
Orange - Labor favourite in some markets, tied in others
Dark blue - Coalition favourite in all markets
Light blue - Coalition favourite in some markets, tied in others
Grey - all markets tied or different favourites in different markets
Purple - IND favourite in all markets
Pink - IND favourite in some markets, tied in others

In some ways this is the reverse of 2013.  In 2013 the markets started with a good prediction and then got worse over time, especially at the end.  In 2019 the markets started out with a worse prediction than would have been obtained from the polling at the time, and over time allowed the polling to move them so that in the end they were only about as bad as the national polls.  When Labor had large leads, a historic view of polling performance suggested those leads would probably narrow by the final weeks, yet seat betting markets early in the election leadup were collectively expecting a major blowout in Labor's favour.  They were slow in moving to a position that was only as bad as the national polls - when betting should (if it is any use at all) be better at competing with polls further out from an election than close to it.

Eleven seats were expected to be Labor gains at all times, and Labor gained three of them (two of those notionally theirs to begin with).  Of Labor's five losses, the markets picked two, were lineball about a third, had at one stage predicted a fourth, and never paid all that much attention to Longman.

Betting vs polls

The test I like to use of whether betting is really worth looking at for predictiveness of classic seats is whether it can beat a very simple pendulum and polling based model - the pendulum modified by the average of the final polls from each company, ignoring bells and whistles like personal vote effects, state factors and so on.
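As a rough illustration of what such a bare-bones pendulum model does, here is a minimal sketch. It assumes a single uniform swing applied to every seat, and the seat names, margins and swing below are hypothetical placeholders rather than real pendulum figures.

```python
def pendulum_prediction(labor_2pp_by_seat, swing_to_labor):
    """Apply one uniform swing to every seat and call each for the 2PP leader.

    labor_2pp_by_seat: dict of seat -> notional Labor 2PP percentage.
    swing_to_labor: national swing in points (negative = swing to Coalition).
    No personal vote, state or seat-specific adjustments are made.
    """
    predictions = {}
    for seat, labor_2pp in labor_2pp_by_seat.items():
        projected = labor_2pp + swing_to_labor
        # An exact 50.0 is treated as a Coalition hold purely for simplicity.
        predictions[seat] = "Labor" if projected > 50.0 else "Coalition"
    return predictions


# Hypothetical notional Labor 2PP figures and swing, for illustration only.
example_seats = {"Seat A": 49.4, "Seat B": 47.0, "Seat C": 51.2}
example_predictions = pendulum_prediction(example_seats, swing_to_labor=1.5)
print(example_predictions)
```

The swing would be the gap between the final poll average and the previous election's 2PP; everything else is deliberately ignored, which is what makes it a fair low bar for betting to clear.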

In this case the final poll average was 51.4 to Labor (the actual result to one decimal will almost certainly be 51.5 to Coalition).  The simple pendulum model would have made eleven 2PP errors (again all in the same direction): wrongly predicting Labor gains in Capricornia, Forde, Flynn, Robertson, Banks and Petrie and missing Labor's five losses in Herbert, Longman, Lindsay, Braddon and Bass.  So it would have been one seat better than the overall market average, and one seat worse than Sportsbet.  In comparison with the simple pendulum model, the markets differed in nine classic seats.  They got Flynn, Banks, Herbert and Lindsay right (and Sportsbet also got Capricornia and Braddon right) but bought into the view that the Coalition was in trouble in Victoria, Brisbane and WA, and were hence led astray in Chisholm, La Trobe, Dickson, Swan and Hasluck.  In the cases of La Trobe, Dickson and Hasluck, the national polling failure wasn't even the culprit: it turns out that those seats wouldn't have fallen anyway.
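The seat tallies in this comparison can be checked with a few set operations. This sketch uses only the seat lists given in this article, with the market list being the cross-market average (so Braddon and Capricornia count as market errors even though Sportsbet had them right).

```python
# Classic seats the cross-market average called wrongly (from the list above).
market_errors = {
    "Bass", "Braddon", "Capricornia", "Forde", "Robertson", "Petrie",
    "Dickson", "Hasluck", "Longman", "Chisholm", "La Trobe", "Swan",
}

# Classic seats the simple pendulum model would have called wrongly.
pendulum_errors = {
    "Capricornia", "Forde", "Flynn", "Robertson", "Banks", "Petrie",
    "Herbert", "Longman", "Lindsay", "Braddon", "Bass",
}

# Seats where the two methods disagreed: one was wrong and the other right.
markets_only = market_errors - pendulum_errors   # markets wrong, pendulum right
pendulum_only = pendulum_errors - market_errors  # pendulum wrong, markets right
shared = market_errors & pendulum_errors         # both wrong

print(len(market_errors), len(pendulum_errors))
print(sorted(markets_only))
print(sorted(pendulum_only))
print(len(markets_only) + len(pendulum_only))
```

The five market-only misses and four pendulum-only misses make up the nine seats of disagreement, and the remaining seven errors were common to both methods.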

Non-classic seats

At the 2019 election there were six crossbench seats being defended (Kennedy, Clark, Melbourne, Indi, Wentworth, Mayo), of which only Indi and Wentworth were ever in serious doubt.  There were clearly significant crossbench challenges in Warringah, Farrer, Cowper and Macnamara.  Some people took crossbench challenges seriously in other seats such as Kooyong, Higgins, Flinders, Brisbane, Mallee and Curtin.

Non-classic seats should give seat betting an opportunity to do well because they cannot be modelled easily by pendulum-based polling methods, and they often don't see that much seat polling.  Where they do see seat polling, it is often internal polling that is even worse than neutral seat polls.

In this election leadup betting always had Wentworth, Macnamara and all the more dubious inclusions in the at-risk list right.  They always had Cowper wrong.  In Warringah they took a lot of convincing to move off the prior that Tony Abbott would retain (presumably based on his past margin, which was irrelevant as he had not faced a similar challenge before).  They were very uncertain about Indi and Farrer and tended to become less predictive about those seats as the campaign went on, before (on average) just getting Farrer right at the end.

Farrer was an interesting one because the feeling that the seat could be lost, based on the precedent of the NSW state election involving the Shooters, Fishers and Farmers, was a strong one.  However, voters may have felt they'd let off enough steam, or had their message heard, or might have had reservations about even a prominent indie that they didn't have about the Shooters.  The feeling that Farrer was in big trouble (it was actually resoundingly retained) also tended to be reinforced by word-on-the-ground reporting in the mainstream press, but this kind of reporting provided no superior insights either.  (Perhaps it's too easy for this kind of reporting to get a wrong feel by spending too much time in major population centres.)

Is there any hope for seat betting being useful at predicting classic seats?

As a general rule, if two sources of predictive information (like poll-based prediction and seat-betting-based prediction) are similar in accuracy but produce different forecasts, then aggregating them somehow should increase predictiveness compared to either method alone.  This applies even if one of the methods is slightly worse than the other.  However, when one predictive method is much worse than another, aggregating them can make a worse prediction than the better method alone.
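A quick way to see why this rule holds is to look at the spread of errors for a simple average of two forecasts. The sketch below rests on strong assumptions that I am adding for illustration: both forecasts are unbiased, their errors are independent, and they are averaged with equal weights. Under those assumptions the averaged error has standard deviation sqrt(a^2 + b^2) / 2, which beats the better method alone only while the worse method's error spread is less than about 1.73 times (the square root of 3) that of the better one.

```python
import math


def averaged_error_sd(sd_a, sd_b):
    """Error SD of an equal-weight average of two forecasts.

    Assumes both forecasts are unbiased and their errors are independent,
    which is an illustrative simplification, not a claim about real markets.
    """
    return math.sqrt(sd_a ** 2 + sd_b ** 2) / 2.0


better = 1.0  # error SD of the better method (arbitrary units)

# A slightly worse second method still helps when averaged in...
print(averaged_error_sd(better, 1.2) < better)
# ...but a much worse one drags the average below the better method alone.
print(averaged_error_sd(better, 2.0) < better)
# Break-even: the second method's error SD is sqrt(3) times the first's.
print(math.isclose(averaged_error_sd(better, math.sqrt(3.0)), better))
```

In practice betting and polling errors are far from independent, since the markets follow the polls, and correlated errors shrink whatever gain averaging could offer.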

In the case of seat betting, at the last three elections we have had one case of it being much worse than polling-based methods, one case of it producing more or less identical forecasts to polling-based methods, and now one case of it producing forecasts that were not much worse and differed in several seats.  This year might be a case in which a combined polling/betting method would have done slightly better than a poll-based method alone, but the very marginal gain that might be possible isn't worth it given the risks shown in 2013.

In general, seat betting has shown that it is strongly influenced by the national polls.  When it does deviate from them, there's been no evidence lately of it correctly second-guessing what national polls are doing wrong.  Yet I suspect people will continue to follow seat betting odds as if they are predictive for no other reason than that the data exist and are easy to look up, talk about and construct elaborate models of.

It's going to be difficult to use models based on polling generally to predict federal elections for a while.  I'm actually intending to stop doing it, because it doesn't seem to be a core part of what I do based on visitor levels to particular articles on this site, so I'd rather focus my predictive efforts in areas where they're less at risk of being wrong.  (I will however continue to offer translations of poll readings into seat tallies on a provisional "if these polls aren't nonsense" type basis.)

The 2019 pollster failure creates two problems.  Firstly, it's very hard to model pollster house effects reliably when polls have suddenly displayed unusually large predictive errors at a given election - was this a freak event down to the character of the election, or a more systematic thing at federal level?  Secondly, if the pollsters adopt unique responses to the polling failure, those responses might cause them to be wrong in the opposite direction (as happened in the UK in 2017).

Ultimately, any form of betting-based prediction has the same problem, because the established behaviour so far is that betting markets are strongly influenced by polls.  Perhaps they will now become less so,  but if so they're at least as likely to be more wrong than right.   

Wednesday, June 12, 2019

Senate 2019: Button Press Thread

Just starting a thread that will cover the button presses in the remaining Senate races including any interesting information from the distributions of preferences as they come to hand.  I haven't been putting myself in the loop concerning when exactly the button presses will occur, save that Tasmania's will be tomorrow at 10:30 am (open to scrutineers, of which I'm not one this year) with the declaration of the poll on Friday at the same time.  The ACT count is also ready to go (to be declared on Friday afternoon) and the remaining counts are getting close to completion with relatively few unapportioned or uncounted votes still showing.  The NT button has already been pressed, which did nothing because both major party #1 candidates had a quota.  William Bowe has some comments on NT preferences.

This thread will be updated as news and analysis comes to hand over the next week or two.  Some time after they are all over I will be posting a Senate reform performance review similar to that which I posted in 2016.  The great bulk of concerns about the new Senate voting system were clearly  falsified in 2016 but this is its first test at a half-Senate election.

In terms of party totals, only one of the final seat races is even remotely close and that is Queensland.  There, the notional primary vote margin between the Greens (currently 6th) and Labor (currently 7th) is .114 quotas (1.63%).  The LNP and One Nation are slightly ahead of the Greens.  Early in the count this looked like it could be close after preferences (see my Queensland thread with links to other coverage) but (i) the primary vote margin has blown out compared to earlier expectations and (ii) Labor might not even gain on preferences at all.

In other states and territories the notional margins between sixth and seventh are very large indeed:

* New South Wales: Greens lead One Nation by .263 Q (3.76%)
* Victoria: Coalition leads One Nation by .317 Q (4.53%) with Derryn Hinch on about the same score as One Nation
* Western Australia: Greens lead One Nation by .410 Q (5.86%)
* South Australia: Liberals lead One Nation by .306 Q (4.37%)
* Tasmania: Jacqui Lambie leads One Nation by .382 Q (5.46%)
* ACT: Liberals lead Greens by .44 Q (14.67%, the ACT quota being one-third of the vote) and will cross quota very early in the preference distribution anyway.
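The conversion between quotas and vote percentages for the state figures above can be sketched as follows. It treats the quota as exactly one share in (seats + 1) of the formal vote, which is an approximation (the actual quota is rounded to a whole number of votes), and the function name is mine for illustration.

```python
def quotas_to_percent(quotas, seats):
    """Convert a margin in quotas to a percentage of the formal vote.

    Approximates the quota as 1 / (seats + 1) of the vote, e.g. one-seventh
    (about 14.29%) for a six-seat half-Senate state contest.
    """
    quota_share = 100.0 / (seats + 1)
    return quotas * quota_share


# State margins in quotas as listed above (six seats per state).
state_margins = {
    "Queensland": 0.114,
    "New South Wales": 0.263,
    "Victoria": 0.317,
    "Western Australia": 0.410,
    "South Australia": 0.306,
    "Tasmania": 0.382,
}

state_percentages = {
    state: round(quotas_to_percent(q, seats=6), 2)
    for state, q in state_margins.items()
}
for state, pct in state_percentages.items():
    print(f"{state}: {pct}%")
```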

Nothing we saw in 2016 suggests that these gaps will be bridged, although One Nation could erase most of the Greens' gap in NSW.  Two candidates did win from notionally behind in 2016 (Malcolm Roberts and Bob Day), but these were across smaller gaps (1.4% in each case) and in situations where their main opponents were sorely blighted by leakage (Labor vs Day) or poor preference-getting ability (LDP vs Roberts).

There is a candidate-vote-based complication in Tasmania, where Lisa Singh (.397Q) is ahead of One Nation and is in the race for seventh (where she would have finished at a half-Senate election based on the 2016 results).  However it is unrealistic for Singh to reach sixth as she cannot receive any above the line preferences until Catryna Bilyk (notionally about .19 quotas ahead of Singh) has been elected, which may not actually happen until Singh herself has been excluded.  While Tasmania has a 27.1% below the line vote, only about 5.1% (.36 Q) of the Tasmanian count consists of effectively free BTLs that will be distributed at full value from excluded parties or within the Senate ticket (the rest will be distributed at greatly reduced value if even at all), and even if Singh got nearly all of these (which won't happen) she would struggle to overtake Bilyk or Lambie who will both be picking up above-the-line preferences.

There is no candidate exclusion complication in New South Wales, where the very weak supply of BTLs means that Jim Molan will be excluded and his preferences (minus those that leak) will flow up the ticket to Perin Davey.  At the time of Molan's exclusion there will not be all that many places left for his BTLs to leak to.  (See here for more on Molan's BTL vote.)

A section will be posted below for each remaining state/territory, in alphabetical order, once its result is known.

ACT

As expected:

1. Gallagher (ALP)
2. Seselja (Lib)

Gallagher was elected on primaries and Seselja was elected on UAP preferences with the following still in the count: Greens, Pesec, ALP#2.  Exhaust was 0.1%.

Similar issues were seen with the Pesec box as with the Garland box in Tasmania (see below).  54% of above the line voters left the Pesec box blank, although there were only seven boxes in the ACT.  Of these 40% had FACN 6th, suggesting that they were putting FACN last but didn't think the Pesec square should be numbered.

The final margin for Seselja over Kyburz (Green) was 15%.  Had all remaining preferences been thrown I have a rough estimate of a margin of 8.5% but I do not claim certainty that this is accurate as I have cut a few corners (which probably favour the Greens - ignoring breaks in sequence and ignoring issues with votes from other parties) and the spreadsheet programming is getting near the edge of my limited abilities.  Maybe 9.5% would be more accurate.  I get the following flows from major remaining preference sources: ALP ATLs 79-19 to Greens, Gallagher BTLs 85-11, Pesec ATLs 51-44, Pesec BTLs 70-26.

More strategic voting would have reduced the margin further but would not have stopped Seselja winning because more than a quota of votes had him higher than Labor or the Greens.  With the optimum organisation of Labor and Greens voters, among those in theory open to shifting their votes between the two, the margin might be brought down to about 4%-5%.

The #2 FACN candidate also garnered the most BTL last places in the ACT, defeating Seselja for the dishonour 4956-4046.  FACN also won this in the NT (overwhelmingly) so are currently 3 from 3.

NSW

The button has been pressed and the result is:

1. Hughes (Lib)
2. Sheldon (ALP)
3. Bragg (Lib)
4. Ayres (ALP)
5. Davey (Nat, on Lib-Nat ticket)
6. Faruqi (Green)

Then came One Nation, Shooters, HEMP, Molan in tenth, Li (ALP), Lib Dems, CDP and the rest.  NSW has five new Senators with only Faruqi re-elected (Spender, Burston and Molan lost and Williams and Cameron retired.)

The Australian is ludicrously reporting Davey and Faruqi "pipping Senator Molan to the last spots" when in fact Molan was excluded three exclusions before that stage.  At this stage he was 331,923 votes behind Perin Davey (7%) but he would have fallen further behind on the exclusions of remaining candidates.

Molan's preferences (including votes he received from other BTL sources) split 71.5% to Davey, a rather large 20% to One Nation, 3.1% Shooters, 1.1% Greens, 0.8% HEMP, 3.5% exhaust.  Molan received only 9266 votes from all other sources so at minimum 14.5% of his own number 1 votes leaked to One Nation.  Neither Davey nor Faruqi reached quota, with the count stopping with Faruqi 188K ahead of McCulloch (One Nation), a gain of 11,198 compared to the original count.

Exhaust in the count was 5.6%, down from 7.3% effective (9.2% official) last time.

Queensland

The result has been reported as 3 LNP, 1 ALP, 1 PHON, 1 Greens.  Malcolm Roberts (PHON) is reported to have finished fourth, but the order is awaiting confirmation.

Tasmania

The Tasmanian result has been confirmed as:

1. Colbeck (Lib)
2. Brown (ALP)
3. Chandler (Lib)
4. McKim (Green)
5. Bilyk (ALP)
6. Lambie (JLN)

The split of votes in the Labor ticket between Bilyk and Singh enabled McKim to cross the line first in fourth place, passing quota on some preferences from the Shooters, Fishers and Farmers, although he was actually the weakest above-the-line performer on those.  At this stage the only remaining candidates in the "race" were Bilyk, Lambie, Lisa Singh (ALP), Tanya Denison (Lib #3), Kevin Morgan (UAP) and Matthew Stephen (PHON).  After McKim's election, Morgan was excluded followed by Denison and Singh.  Singh's votes elected both Bilyk and Lambie with Lambie beating Stephen by more than half a quota (several thousand more than her starting margin over One Nation.)  The exhaust rate (in both official and real terms) was just 1.9%, compared to 2.8% in both cases for 2016.  Only 12% of ballot papers left the count at any value (including greatly reduced), compared to 32% in 2016.

3.7% of all voters voted below the line 1-44.  This is up from 2.2% voting 1-58 in 2016.  The most common last placed candidate was Frank Falzon (FACN) who received 2049 (15.6%) #44s, narrowly shading Steve Mav with 1874.

Surprisingly, despite starting notionally behind Garland, Steve Mav overtook him, mainly on the preferences of fellow ungrouped candidates.  However Mav was next excluded.  Unfortunately the results have revealed serious problems with the current system for grouped independents, who get only a blank box above the line.  Although the Garland/Duncan group was placed last by only 1.7% of below the line voters who voted all the way, it was placed last by 20% of the 6884 voters who numbered all boxes above the line.  (There was remarkably little ideological skew among voters doing this.)  Worse, 6645 voters numbered all boxes bar one above the line, and in 81% of these cases it was the unnamed Garland group that was omitted.  Overall only 10.3% of ATL voters numbered the Garland group box at all, compared to FACN and CEC in the high teens and all other parties over 30% - this despite the Garland group being preferenced by both Labor and the Greens on their how-to-vote cards! There is no political logic to this; it can only be caused by voter confusion, and will have to be addressed in the future if there is to be any prospect for independents to succeed without creating front parties.

See also William Bowe's dissection of the result.

South Australia

1. Ruston (Lib)
2. Gallacher (ALP)
3. Fawcett (Lib)
4. Smith (ALP)
5. Hanson-Young (GRN) 
6. Antic (Lib)

As expected.

Western Australia

As expected:

1. Reynolds (Lib)
2. Dodson (ALP)
3. Brockman (Lib)
4. O'Sullivan (Lib)
5. Pratt (ALP)
6. Steele-John (Green)

The only interesting thing there is that O'Sullivan overtook Pratt from notionally 1% behind, though 1.4% of WA Nationals preferences could have had a bit to do with that.

Jim Molan's Senate Result In Historic Context

There is a lot of discussion surrounding Senator Jim Molan's below the line vote in the NSW Senate race.  Misleading arguments about it are being weaponised by some of those who would like to see Molan appointed to the Sinodinos casual vacancy, but there is also a risk that, amid all this, appreciation of the scale of Molan's result could be lost.

To start with, Molan absolutely is not going to win and has never even looked remotely like being in contention during counting.   But his result is still very significant - in the state in which getting a high below-the-line vote is most difficult (because of historically low below the line rates and also the sheer scale required for an individual campaign), Molan has so far polled just over 130,000 votes (2.8%).  His share should rise slightly based on remaining unapportioned votes but won't be significantly above 3%, if it even reaches that.  

Saturday, June 1, 2019

How Can Australian Polling Disclosure And Reporting Be Improved?

Australian national opinion polling has just suffered its worst failure in result terms since 1980 and its worst failure in margin terms since 1984.  This was not just an "average polling error", at least not by the standards of the last 30+ years.  The questions remain: what caused it, and what can be done (if anything) to stop it happening again?

A major problem with answering these questions is that Australian pollsters have not been telling us nearly enough about what they do.  As Murray Goot has noted, this has been a very long-standing problem.

In general, Australian pollsters have taken an approach that is secretive, poorly documented, and contrary to scientific method.   One notable example of this was Galaxy (it looks like correctly) changing the preference allocation for One Nation in late 2017, and not revealing they had done this for five months (in which time The Australian kept wrongly telling its readers Newspoll preferences were based on the 2016 election.)  But more generally, even very basic details about how pollsters do their work are elusive unless you are on very good terms with the right people.  Some polls also have statistically unlikely properties (such as not bouncing around as much as their sample size suggests they should, either in poll to poll swing terms or in seat-polling swing terms) that they have never explained.
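To give a sense of how much a poll "should" bounce around, here is a minimal sketch of the textbook calculation. It assumes simple random sampling and two independent polls of the same size, which real polls with weighting and panels will not match exactly, and the sample size used is a hypothetical example rather than any pollster's actual figure.

```python
import math


def poll_to_poll_sd(share, sample_size):
    """Expected SD of the change between two independent polls.

    share: the true proportion (e.g. 0.5 for a 50% two-party vote).
    sample_size: respondents per poll (assumed equal for both polls).
    Assumes simple random sampling and no real change in opinion.
    """
    # Standard error of a single poll's estimate of the share.
    standard_error = math.sqrt(share * (1.0 - share) / sample_size)
    # The difference of two independent estimates has sqrt(2) times that SD.
    return math.sqrt(2.0) * standard_error


# Hypothetical example: 2PP near 50% with 1600 respondents per poll.
expected_sd = poll_to_poll_sd(0.5, 1600)
print(f"Expected poll-to-poll SD: {expected_sd * 100:.2f} points")
```

A series whose poll-to-poll movements are consistently much smaller than this benchmark is showing less variation than sampling alone should produce, which is the kind of property that calls for an explanation.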