While this post may seem to stray into the political, it really is about how we all use and abuse data to support the outcomes we want rather than the outcomes the data supports.
For those of you who don’t spend too much time watching the Colbert Report, there seems to be a war going on between the TV and web-based political pundits and Nate Silver, every geek’s favorite prognosticator of elections. For those of you who don’t know, Nate Silver is a baseball statistician (think “Moneyball,” which was not about Silver) who four years ago decided to apply statistical analysis to election polling. He started www.fivethirtyeight.com, the first well-known blog to take the daily public state and national U.S. presidential polling numbers and build a forecast model from all the data; he subsequently sold it to the NY Times but still runs it. He then used this model to correctly predict 49 of 50 states in the 2008 election, and his forecasts for the 2010 Senate races were also very accurate.
So what is the big deal about this guy? All he does is take publicly available data and plug it into a model, which he does not change. Now, while he admittedly leans Democratic (I believe he once described himself as a left-leaning libertarian), he is also fiercely proud of his statistical analysis skills. In fact, his job depends on his ability to correctly and dispassionately build a model that predicts the outcome of presidential and congressional elections in the United States.
The problem is that Silver’s model did not match the narrative the more subjective TV commentators were pitching: that Mitt Romney was surging. Silver’s model showed that even at the height of the Romney surge, Romney had only a 38.9% chance of winning.
“Nate Silver says this is a 73.6 percent chance that the president is going to win? Nobody in that campaign thinks they have a 73 percent chance — they think they have a 50.1 percent chance of winning. And you talk to the Romney people, it’s the same thing,” Scarborough said. “Both sides understand that it is close, and it could go either way. And anybody that thinks that this race is anything but a tossup right now is such an ideologue, they should be kept away from typewriters, computers, laptops and microphones for the next 10 days, because they’re jokes.”
— Joe Scarborough, MSNBC Pundit
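The gap between Silver’s numbers and the pundits’ tossup narrative exists because Silver’s model is weighted toward state-based polling, since the United States elects its president through the Electoral College rather than the national popular vote.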
“We can debate how much of a favorite Obama is; Romney, clearly, could still win. But this is not wizardry or rocket science,” Silver told POLITICO. “All you have to do is take an average, and count to 270. It’s a pretty simple set of facts. I’m sorry that Joe is math-challenged.”
— Nate Silver’s response
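To make “take an average, and count to 270” concrete, here is a minimal sketch of that arithmetic. It is not Silver’s actual model, which produces probabilities and is far more involved. The state names, electoral vote counts, poll margins, and function names below are all invented for illustration.

```python
from statistics import mean

# Hypothetical data: these states, electoral vote counts, and poll margins
# are made up. A margin is candidate A's lead over B in percentage points.
ELECTORAL_VOTES = {"State 1": 20, "State 2": 15, "State 3": 10}
POLLS = {
    "State 1": [2.0, 3.5, 1.0],
    "State 2": [-1.0, -2.5, 0.5],
    "State 3": [4.0, 0.0, 2.0],
}


def average_margins(polls):
    """Step 1, 'take an average': mean poll margin for each state."""
    return {state: mean(margins) for state, margins in polls.items()}


def count_electoral_votes(avg_margins, electoral_votes):
    """Step 2, 'count': give each state's votes to whoever leads its average."""
    totals = {"A": 0, "B": 0}
    for state, margin in avg_margins.items():
        if margin > 0:
            totals["A"] += electoral_votes[state]
        elif margin < 0:
            totals["B"] += electoral_votes[state]
        # An exact tie (margin == 0) is left unassigned in this sketch.
    return totals


def majority_threshold(electoral_votes):
    """Votes needed to win outright: more than half of the total."""
    return sum(electoral_votes.values()) // 2 + 1


if __name__ == "__main__":
    totals = count_electoral_votes(average_margins(POLLS), ELECTORAL_VOTES)
    needed = majority_threshold(ELECTORAL_VOTES)
    print(totals, "needed to win:", needed)
```

With the real Electoral College total of 538 votes, the same threshold calculation gives the 270 that Silver mentions. The point of the sketch is that the state-by-state tally, not a single national number, decides who gets there.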
The broader point, and the reason I find this little media war interesting and important to those of us in the data world, is that management so often asks for data to inform decision making but then ignores the data that doesn’t support the desired outcome. There are lots of instances where we want to use historical data to inform future decisions. Keep in mind that analyzing the data is not enough: you also need to package it so the people who will use it can understand it, and be ready to explain it.
I don’t know if Joe Scarborough will ever accept Nate Silver’s math. But Nate should also invest a little more effort in getting people like Joe to understand the value of the kind of statistical analysis he does. That is something to keep in mind the next time you find a nugget of data that runs contrary to the standard thinking in your organization. You may need to do a bit more selling and explaining to get your facts across.
That’s all for now, I have to go vote.