It’s been about a month since the US presidential election, when Donald Trump was elected contrary to the prediction of nearly every poll.
Since then, there’s been lots of talk about “the death of Big Data”: is it really that impressive if it failed to predict an election by this much?
While I have many reservations about Big Data, the election did little to change my opinion. A few thoughts — and feel free to leave yours in the comments:
1. The nature of politics
Predicting an election based on polls is quite different from, for example, suggesting new music or determining someone’s ability to repay a loan.
Political goods (i.e. elections) differ greatly from consumer goods. Transactions are few and far between. Choices are bundled with little room for customization.
It’s likely that people answer polls one way, but vote another.
2. It’s not really Big Data
The funny thing about calling polling Big Data is that it’s not really Big Data.
Big Data is a paradigm shift away from the classical statistics of proposing hypotheses and inferring traits of a population based on a sample.
Because cheap, fast computing lets us look at populations as a whole, sampling is often no longer necessary. Many of these classical statistical methods have been losing popularity to modern data mining techniques.
If anything, the inability to get the polls right might point us back to the importance of classical statistics (experimental design, proper sampling, etc.) — and perhaps, in that way, Big Data is overrated. But that’s not quite the accusation.
3. It’s one data point
To quote my friend Kevin: “Nothing like proclaiming the death of big data based on one data point.” Big Data has been remarkable in so many instances that to dismiss it on the basis of an election or two is unfair.
Thoughts? Did this election change your opinion of Big Data?
Charles N. Steele
What does election prediction have to do with big data? Political polling seems mostly a scam designed to make political points with push polls, overweighting, and intentionally sloppy inference. I note Nate Silver now tells us, with great precision, how many extra percentage points the Russians gave Trump, as well as exactly why the election didn’t come out as he predicted, “not a fault of his models.” Needs to keep scoring politically correct points. Hahaha! Scam!
Big data is hyped but it’s a different enterprise from this sort of prostitution.