As Daniel notes, we don’t normally do horse race stuff here. And this is week-old horse race stuff. But I thought there were some interesting things in the SurveyUSA 50 state polls on Clinton vs McCain and Obama vs McCain. The biggest is that they expose an interesting fallacy about probabilistic reasoning that, although pretty obvious when stated baldly, is also pretty hard to avoid in practice.
Those polls suggest that if we just look state by state at which candidate is likely to win, we see Obama and Clinton both narrowly ahead of McCain, with the differences between their performances well within any margin of error. That seems right, though by that measure I’d put Clinton a little ahead, and they put Obama ahead.
But the polls also suggest that if we look at two more important measures, Obama is (according to just this poll) a much stronger candidate. He has a higher expected electoral vote and, more importantly, a much higher win probability. Darryl at Hominid Views produced one model that suggests this, though I suspect his numbers make both Obama and Clinton look more likely to win than they really are. So below I detail a model that I think is a little more realistic. (It’s still a very stylised model, and I’d be interested in knowing from people who do this kind of modelling well what changes might be made to make it better.)
I’m only interested here in modelling what the SurveyUSA poll tells us. So even when it throws up antecedently improbable results (Obama up in North Dakota! Obama losing in New Jersey!) I’m going to take this data at face value.
The model I’m using takes McCain’s percentage lead in a given state to be a normally distributed random variable, with mean equal to his lead in the SurveyUSA poll and a standard deviation of 10 percentage points. That gives the following expected electoral vote totals.
- Obama 299 – McCain 239
- Clinton 279 – McCain 259
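For anyone who wants to see the shape of that calculation, here is a minimal sketch. It is not the code behind the numbers above. It assumes every state is winner-take-all, and the four "states" in `TOY_POLL` are hypothetical placeholders, not SurveyUSA figures. The function names are made up for the illustration as well.

```python
from statistics import NormalDist

SIGMA = 10.0  # standard deviation of McCain's lead, in percentage points

# Hypothetical toy data, NOT the SurveyUSA numbers:
# state -> (electoral votes, McCain's polled lead in points)
TOY_POLL = {
    "A": (10, -12.0),
    "B": (20, -3.0),
    "C": (15, 2.0),
    "D": (5, 15.0),
}


def mccain_win_prob(lead, sigma=SIGMA):
    """P(true lead > 0) when the true lead is Normal(lead, sigma)."""
    return 1.0 - NormalDist(mu=lead, sigma=sigma).cdf(0.0)


def expected_votes(poll, sigma=SIGMA):
    """Return (expected Democratic EV, expected McCain EV).

    Assumes winner-take-all: each state's votes are weighted by the
    probability that McCain carries it.
    """
    total = sum(ev for ev, _ in poll.values())
    mccain = sum(ev * mccain_win_prob(lead, sigma) for ev, lead in poll.values())
    return total - mccain, mccain


if __name__ == "__main__":
    dem, rep = expected_votes(TOY_POLL)
    print(f"Democrat {dem:.1f} - McCain {rep:.1f}")
```

The point to notice is that a state where a candidate trails by a couple of points still contributes a good chunk of its electoral votes to that candidate's expected total, which is why the expected totals can come apart from a simple count of who leads where.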
Obama’s lead is three times Clinton’s. I then ran a Monte Carlo simulation where in each round each state’s McCain lead was calculated independently as a random draw from that distribution. (Possibly it would have made more sense not to have these be completely independent.) In those simulations, Obama beat McCain 78% of the time, and Clinton beat McCain 63% of the time. I ran 10,000 simulations, which is plenty to make sampling error negligible, though obviously it does nothing about modelling error.
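A simulation of this kind can be sketched in a few lines. Again, this is an illustration under stated assumptions rather than the code I ran: the states in `TOY_POLL` are hypothetical, every state is treated as winner-take-all, draws are fully independent across states, and a tie in electoral votes is counted as a non-win.

```python
import random

SIGMA = 10.0  # standard deviation of McCain's lead, in percentage points

# Hypothetical toy data, NOT the SurveyUSA numbers:
# state -> (electoral votes, McCain's polled lead in points)
TOY_POLL = {
    "A": (10, -12.0),
    "B": (20, -3.0),
    "C": (15, 2.0),
    "D": (5, 15.0),
}


def simulate(poll, rounds=10_000, sigma=SIGMA, seed=0):
    """Estimate P(Democrat wins a strict majority of electoral votes)."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    total = sum(ev for ev, _ in poll.values())
    dem_wins = 0
    for _ in range(rounds):
        dem_ev = 0
        for ev, lead in poll.values():
            # Independent draw of McCain's lead in this state.
            if rng.gauss(lead, sigma) < 0:
                dem_ev += ev
        if 2 * dem_ev > total:  # strict majority; a tie is not a win
            dem_wins += 1
    return dem_wins / rounds


def standard_error(p, rounds):
    """Sampling standard error of an estimated win probability."""
    return (p * (1 - p) / rounds) ** 0.5


if __name__ == "__main__":
    p = simulate(TOY_POLL)
    print(f"win probability ~ {p:.3f} +/- {standard_error(p, 10_000):.3f}")
```

On the sampling error point: for an estimated probability near 78% from 10,000 rounds, the standard error is about 0.4 percentage points. Making the draws correlated across states (say, by adding a shared national swing to every state) would be the natural next refinement, but that is not modelled here.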
Obama’s big advantage is that he locks down more Democratic leaning states, and competes in Republican leaning states. So if we think state-by-state, Clinton looks to be as electable as Obama. But it’s not likely that everything that’s likely to happen will happen. It is very likely that there will be surprises. And if Obama’s the candidate, those surprises are more likely to be happy surprises for Democrats. With Clinton, they are more likely to be unpleasant surprises.
Update: I realise I ended this post without saying clearly what the fallacy I was referring to at the top was. It’s the fallacy of inferring from the fact that each of a bunch of things is likely to be the case that it is likely that they’ll all be the case. As I say in the last paragraph, that isn’t generally right. Given enough events, it’s likely that some of them will turn out in unlikely ways. That’s generally important to remember, even if one thinks that the best ways to model this insight are somewhat spuriously precise.
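The arithmetic behind the fallacy is easy to check for the simplest case. The sketch below assumes the events are independent and equally likely, which real state results are not, so treat it as an illustration of the general point rather than a model of the election.

```python
def prob_all_happen(p, n):
    """Probability that n independent events, each with probability p, all occur."""
    return p ** n


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(n, round(prob_all_happen(0.9, n), 3))
```

With each event 90% likely, ten of them all come off only about 35% of the time, and twenty only about 12% of the time. So even when every individual prediction is a good bet, the conjunction of all of them is not.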