Scott’s post yesterday reminded me that I had never linked to Ethan Zuckerman’s fascinating discussion of how much original reporting there is out there on the Internets, as measured through the admittedly imperfect lens of Google News.
This evening, Google News tells me that I have my choice of 5,053 articles on conflicts between Congressional Republicans and Democrats over healthcare reform. (Oh goody.) How many of those stories contain original reporting? In a world with thousands of professional media outlets at our fingertips – as well as hundreds of thousands of amateurs – how much original material do we really have access to? … What’s interesting in those numbers is the 14,000/24 ratio, implying 583 versions of each story. (That ratio is probably much higher today, with Google News following more news sources.) Jonathan Stray did a very smart analysis for Nieman Journalism Lab, looking at a universe of 800 stories about the alleged involvement of two Chinese universities in hacking attacks on Google. His findings were striking: 800 stories = 121 non-identical stories = 13 stories with original quotes = 7 fully independent stories.
Stray coded the 121 non-identical stories that had been clustered together by Google (the clustering algorithms are good, but not perfect – nine stories were unrelated to the specific case of these two universities) and looked for the appearance of novel quotes, which he considered the “bare minimum” standard for original reporting. (Interesting – it’s the same logic that led Jure Leskovec to track quotes to track media flow in MemeTracker.) Only 13 of the stories contained quotes not taken from another media source’s report. The essence of Stray’s piece is the question, “What were those other 100 reporters doing?” The answer, unfortunately, is that they were rewriting everyone else’s stories.
This hasn’t gotten the kind of pick-up that it deserves in the broader blogosphere. Even if it’s not necessarily a representative sample (stories on other kinds of news might very plausibly do better in English-language media), it’s a rather startling finding.
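For anyone who wants to see the quoted figures as proportions, here is a minimal sketch of the arithmetic. The counts are exactly the ones quoted above; the names `funnel` and `share_of_total` are my own, purely for illustration, and nothing here reproduces Stray's actual method.

```python
# Counts as quoted above from Zuckerman's summary of Stray's analysis.
# The variable and function names are illustrative, not from either author.
funnel = [
    ("stories in the Google News cluster", 800),
    ("non-identical stories", 121),
    ("stories with original quotes", 13),
    ("fully independent stories", 7),
]


def share_of_total(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


total = funnel[0][1]
for label, count in funnel:
    print(f"{label:<36} {count:>4}  {share_of_total(count, total):6.2f}%")

# The 14,000/24 ratio mentioned in the quote, i.e. versions per story.
versions_per_story = 14_000 / 24
print(f"versions per story: {versions_per_story:.0f}")
```

On those numbers, stories with original quotes come to roughly 1.6% of the cluster, and fully independent stories to under 1%.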