Big Data: “What We Can Learn From the Epic Failure of Google Flu Trends”
From a Wired.com article by David Lazer and Ryan Kennedy:
Every day, millions of people use Google to dig up information that drives their daily lives, from how long their commute will be to how to treat their child’s illness. This search data reveals a lot about the searchers: their wants, their needs, their concerns—extraordinarily valuable information. If these searches accurately reflect what is happening in people’s lives, analysts could use this information to track diseases, predict sales of new products, or even anticipate the results of elections.
[Clip]
In 2008, researchers from Google explored this potential, claiming that they could “nowcast” the flu based on people’s searches. The essential idea, published in a paper in Nature, was that when people are sick with the flu, many search for flu-related information on Google, providing almost instant signals of overall flu prevalence.
[Clip]
In a paper published in 2014 in Science, our research teams documented and deconstructed the failure of Google to predict flu prevalence. Our team from Northeastern University, the University of Houston, and Harvard University compared the performance of GFT with very simple models based on the CDC’s data, finding that GFT had begun to perform worse. Moreover, we highlighted a persistent pattern of GFT performing well for two to three years and then failing significantly and requiring substantial revision.
The point of our paper was not to bury big data—our own research has demonstrated the value of big data in modeling disease spread, real time identification of emergencies, and identifying macro economic changes ahead of traditional methods. But while Google’s efforts in projecting the flu were well meaning, they were remarkably opaque in terms of method and data—making it dangerous to rely on Google Flu Trends for any decision-making.
Read the Complete Article (1033 Words)
See Also: “Google Flu Trends Website Shuts Down; Will Send Data to Boston Children’s, Columbia, CDC” (August 24, 2015)
About Gary Price
Gary Price (gprice@gmail.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne State University Library and Information Science Program. From 2006 to 2009 he was Director of Online Information Services at Ask.com.