All You Zombies …

by Henry Farrell on April 8, 2010

Scott’s post yesterday reminded me that I had never linked to Ethan Zuckerman’s fascinating discussion of how much original reporting there is out there on the Internets, as measured through the admittedly imperfect lens of Google News.

This evening, Google News tells me that I have my choice of 5,053 articles on conflicts between Congressional Republicans and Democrats over healthcare reform. (Oh goody.) How many of those stories contain original reporting? In a world with thousands of professional media outlets at our fingertips – as well as hundreds of thousands of amateurs – how much original material do we really have access to? … What’s interesting in those numbers is the 14,000/24 ratio, implying 583 versions of each story. (That ratio is probably much higher today, with Google News following more news sources.) Jonathan Stray did a very smart analysis for Nieman Journalism Lab, looking at a universe of 800 stories about the alleged involvement of two Chinese universities in hacking attacks on Google. His findings were striking: 800 stories = 121 non-identical stories = 13 stories with original quotes = 7 fully independent stories.

Stray coded the 121 non-identical stories that had been clustered together by Google (the clustering algorithms are good, but not perfect – nine stories were unrelated to the specific case of these two universities) and looked for the appearance of novel quotes, which he considered the “bare minimum” standard for original reporting. (Interesting – it’s the same logic that led Jure Leskovec to track quotes to track media flow in MemeTracker.) Only 13 of the stories contained quotes not taken from another media source’s report. The essence of Stray’s piece is the question, “What were those other 100 reporters doing?” The answer, unfortunately, is that they were rewriting everyone else’s stories.

This hasn’t gotten the kind of pick-up that it deserves in the broader blogosphere – even if it’s not necessarily a representative sampling (other stories on other kinds of news might very plausibly do better in English language media) – it’s a rather startling finding.



P O'Neill 04.08.10 at 7:03 pm

“What were those other 100 reporters doing?”

Category error.


Bill Gardner 04.08.10 at 7:25 pm

Searches of this kind will also reveal how much frank plagiarism is out there.

Publication counts have always been misleading measures of the weight of evidence. I once tracked down multiple citations of ‘empirical studies’ that supported a factual claim in an amicus brief to the Supreme Court. There was just one data set; the others were editorials, reviews, and other repackagings of the data.


The Raven 04.08.10 at 7:27 pm

Why startling, Henry? It’s seemed obvious to me for a long time, though I did not put numbers to it. Even before the internet, most US news came from a handful of major newspapers and wire services.

What I find discouraging is the collapse of local reporting.


eddie 04.08.10 at 7:46 pm

I think it’s more a case of “how do we frame the question in such a twisted way, to make it look like old media is somehow still relevant?”.

How much of the work of a ‘professional journalist’ is actually the projection of his slave-owner’s opinion? Of course, actually reporting events is a part of journalism that people get paid to do, but it’s a tiny proportion of the job. The larger part is the analysis and commentary, and so-called professionals sucking up to their owners provide a negative amount of value to the public.


Ted Lemon 04.08.10 at 8:04 pm

I have to second The Raven’s observation. In no way is this even a little bit surprising. I would have been surprised if the number of original stories had been higher.


chris 04.08.10 at 8:10 pm

“This hasn’t gotten the kind of pick-up that it deserves in the broader blogosphere”

Maybe people are reluctant to repeat it without adding anything original — even if they do that sort of thing all the time, the irony could deter them in this particular case.


tomslee 04.08.10 at 8:12 pm

I would like to repeat what chris said.


Salient 04.08.10 at 8:39 pm

All your base are belong to Gannett.

For great justice!


Salient 04.08.10 at 8:44 pm

Value added on the Internets, by category and proportion:

* 14% humor, meta-referential

* 13% snark

* 23% thoughtful commentary from people who know 10x more about the topic than the reporter ever will

* 11% incendiary exchanges (most value added when the topic seems trivial to oneself but apparently of world-shaking consequence to those participating in the exchange)

* 20% locating stuff that would be hard to locate otherwise

* 9% friendly community

These percentages have been computed by the LHC and are not subject to dispute.


Hidari 04.08.10 at 8:44 pm


mike shupp 04.08.10 at 10:32 pm

There’s a market out there for home-based “journalists” — places like Associated Content, which pay roughly half-a-cent per word for just about any sort of material which generates eyeballs for advertising. Rewriting news stories is perfect for this — take your quotes from the piece you’re rewriting, vary the lede, add a touch of right wing or left wing or libertarian perspective, stick close to the length and layout of the original piece, and you’re done. 800 words, half an hour, $4.00 income — repeat ten times per day, and you almost have an income.

This doesn’t lend itself to lengthy phone calls to independent observers or academics with relevant experience. It lends itself to blogging filled with one-liners with clickable tags. Which the ‘net is full of, you may have noticed. (Heh!)


NickS 04.08.10 at 10:47 pm

I’m just curious whether the post title is meant to imply that the reporters as a group function as a single unit that impregnates itself — like the characters in the Heinlein story by that name?


Witt 04.09.10 at 3:21 am

One thing I really appreciate about Google News is the effortless ability to search and translate international news. There is simply no way, prior to the web and computerized translation, that I could have gotten the kind of quick, finger-on-the-pulse read of whether a particular event is getting covered in, e.g., major Chinese papers that I can get today.

(N.b. sometimes “covered” means “reprinting Reuters”! Still, that itself is useful to know.)

Another question — and probably a bit unfair to criticize the study — is whether firsthand original reports and recording count as “news stories.” If someone goes to a public hearing in his town, films testimony, and puts it on YouTube, is that reporting? Not exactly, perhaps, but it’s “news” and it’s “original” (especially if his town is too small and/or the topic is too minor to get covered by local media).

Or how about blog posts that flesh out and comment on news stories? I don’t think this blog thinks of itself as a news site in the least; it’s a specialist blog dealing with archeology and related topics. But it’s a perfect example of a locally informed blogger providing depth and texture to existing mainstream news reports.

So it’s not so bleak as some might think.


Kaveh 04.09.10 at 4:15 am

@13, Juan Cole’s blog, Informed Comment, is a great example of the last thing you mention. It’s not original reporting, but Cole puts together stuff from media in several different languages, with the help of his own considerable cultural and historical knowledge.


andrew 04.09.10 at 9:43 am

it’s a rather startling finding

For a few months a couple of years ago, part of my work involved finding news reports on various topics and passing links along. I relied a lot on google news (in fact, there were dozens of alerts/RSS feeds I was assigned to check regularly). I was never seized with the urge to quantify the experience, but I can’t say I’m all that surprised – especially since this was not a local/domestic story (for the English language media).

Look at who the seven original reporting outlets were, according to the Nieman post that Zuckerman links to:

These were produced by The New York Times, The Washington Post, the Wall Street Journal, The Guardian, Tech News World, Bloomberg, Xinhua (China), and the Global Times (China).

When you think about who (of those news sources not based in China) still has foreign bureaus, this is even less of a surprise.

As for U.S. based journalism, I actually think the bigger issue, at least for political journalism, is in statewide reporting as opposed to local reporting (acknowledging that sometimes local is meant to include state, and that local reporting is important in its own right). Many local papers are struggling, of course, but they do have a bit of an advantage over more widely-focused papers because they live there, their readers live there, their reporters live there, and many of their advertisers live there. That last fact may be the most important; it could be easier to sell ads when you can pretty much guarantee that a high percentage of your audience is in the market, because few outside the market read your paper. Similar arguments could apply to local online reporting efforts, and might explain why some of them seem to be doing fairly well.[1] If online local reporting can replace print reporting and do it as well or better, that seems like a good thing.

State reporting, on the other hand, might as well be national reporting if you’re a local paper and you’re not in or near the capital. At the very least, you’ve got to deal with all the expenses of having someone in the capital, and a full bureau is going to cost even more. Meanwhile, there’s a good chance your readers, who are not in the capital and may not even know who represents them in the legislature, are going to think of politics as either local (if they pay any attention to non-national politics) or national (because they see it in everyday life through mass media). The state capital might just be that mid-sized town people don’t visit often, although the governor’s race probably still draws interest. Even more difficult is if some of your state’s administrative machinery is not in the capital but also not near your town. It’s too bad, because state governments are still large institutions with significant responsibilities, state economies are non-negligible, and all sorts of stuff happens at the state level (with local and regional effects) that people should know about. The state capital’s local paper probably still covers some of it, and it’s probably free online, but when you think of the need for multiple perspectives that’s just not enough.

[1] Although I guess foundation-funding has helped many of the local online efforts so far; that’s obviously a different model than the advertising one, and another issue. I’m not sure if there have been similar initiatives for state government coverage; it’s been a while since I followed this issue. There has been a movement to get state governments to put at least basic legislative and administrative info online in freely accessible and searchable ways, with bill-tracking features and other stuff like that – a locally-based reporter could combine that with the phone and maybe get somewhere.


Nick L 04.09.10 at 2:17 pm

There’s also the finding that an increasing number of news stories are essentially rewritten PR copy:

Something similar was reported in the British press not so long ago (with the Times being the worst broadsheet for this sort of thing), but I can’t seem to find it.


mpowell 04.09.10 at 5:05 pm

This hasn’t gotten the kind of pick-up that it deserves in the broader blogosphere – even if it’s not necessarily a representative sampling (other stories on other kinds of news might very plausibly do better in English language media) – it’s a rather startling finding.

I hope the humor here is at least somewhat intentional. Because doesn’t pick-up just imply exactly this process of everyone regurgitating the same story over and over again? Seven independent sources actually sounds pretty good to me. At that point it’s all just about distributing those stories and getting commentary on them from people who can add appropriate perspective. And the internet is doing a pretty good job of that, imo.


abell_ia 04.09.10 at 8:42 pm

The problem is the short news cycle, not the media in which the news is reported. It takes time to independently verify facts. If one takes this time, and is able to tell a different (more accurate) story than reported earlier, will anyone care? Probably not. Most readers/watchers/listeners have moved on to the next day’s news.
