The Impact Factor’s Matthew Effect

by Kieran Healy on August 26, 2009

Since the publication of Robert K. Merton’s theory of cumulative advantage in science (Matthew Effect), several empirical studies have tried to measure its presence at the level of papers, individual researchers, institutions or countries. However, these studies seldom control for the intrinsic “quality” of papers or of researchers–“better” (however defined) papers or researchers could receive higher citation rates because they are indeed of better quality. Using an original method for controlling the intrinsic value of papers–identical duplicate papers published in different journals with different impact factors–this paper shows that the journal in which papers are published have a strong influence on their citation rates, as duplicate papers published in high impact journals obtain, on average, twice as much citations as their identical counterparts published in journals with lower impact factors. The intrinsic value of a paper is thus not the only reason a given paper gets cited or not; there is a specific Matthew effect attached to journals and this gives to paper published there an added value over and above their intrinsic quality.

The full paper has some more detail. Duplicates are defined as those papers published in different journals but which nevertheless have the same title, the same first author, and the same number of cited references. With this definition the authors find 4,532 pairs of duplicates in the Web of Science database across the sciences and social sciences. (This is a pretty striking finding in itself.) Remember that the impact factor of a journal is meant to be a (weighted) product of the number of citations to articles in that journal; that is, a journal’s prestige is a function of the quality of the articles appearing in it. But here we see that, for the same papers, the impact factor of the journal affects the citation rate of the paper. The mechanism is straightforward, but it’s neat to see it shown this way.
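For the curious, here is a rough sketch in Python of how a matching rule like the one described (same title, same first author, same number of cited references, different journals) could be applied, followed by the higher-versus-lower impact factor comparison. This is not the authors' code. The `Record` fields, the function names, and the lower-casing of titles and author names are all my own assumptions, and the sample records are invented toy values, not data from the paper.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple, Optional


class Record(NamedTuple):
    """Hypothetical bibliographic record; field names are illustrative."""
    title: str
    first_author: str
    n_refs: int            # number of cited references
    journal: str
    impact_factor: float
    citations: int


def duplicate_key(r: Record) -> tuple:
    # Assumption: titles and author names are compared case-insensitively.
    return (r.title.strip().lower(), r.first_author.strip().lower(), r.n_refs)


def find_duplicate_pairs(records: list) -> list:
    """Pair up records sharing title, first author and reference count,
    keeping only pairs that appeared in different journals."""
    groups = defaultdict(list)
    for r in records:
        groups[duplicate_key(r)].append(r)
    pairs = []
    for group in groups.values():
        for a, b in combinations(group, 2):
            if a.journal != b.journal:
                pairs.append((a, b))
    return pairs


def mean_citations_by_impact(pairs: list) -> Optional[tuple]:
    """Return (mean citations in the higher-IF journal, mean citations in
    the lower-IF journal), skipping pairs with equal impact factors."""
    high_total = low_total = n = 0
    for a, b in pairs:
        if a.impact_factor == b.impact_factor:
            continue
        high, low = (a, b) if a.impact_factor > b.impact_factor else (b, a)
        high_total += high.citations
        low_total += low.citations
        n += 1
    if n == 0:
        return None
    return high_total / n, low_total / n


# Invented toy records, for illustration only.
SAMPLE = [
    Record("P-transitions in lead isotopes", "Smith J", 20, "Journal A", 1.1, 10),
    Record("p-transitions in lead isotopes", "Smith J", 20, "Journal B", 0.5, 4),
    # Same title and author but a different reference count: not a duplicate.
    Record("P-transitions in lead isotopes", "Smith J", 35, "Journal C", 2.0, 50),
    Record("Some title", "Doe A", 12, "Journal D", 3.0, 8),
    Record("Some title", "Doe A", 12, "Journal E", 1.0, 6),
]

if __name__ == "__main__":
    pairs = find_duplicate_pairs(SAMPLE)
    print(len(pairs), mean_citations_by_impact(pairs))
```

Note that a rule this loose says nothing about whether the texts are actually identical; it only flags candidates, which is the worry several commenters raise below.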

(Appropriately enough, I have posted this at both Crooked Timber and OrgTheory. We’ll see which one gets the links and comments.)

{ 40 comments }

1 aaron_m 08.26.09 at 1:17 pm: Imagine the number of duplicate hits they would have received if they had dropped the ‘same author’ criterion :)
2 Stuart 08.26.09 at 1:43 pm: Couldn’t some of the people citing have found both duplicates and decided which one to cite based on the reputation of the journal, but if only the one existed they would have cited that whichever one it was in. Or did they manage to find a control for that effect?
3 kid bitzer 08.26.09 at 1:53 pm: i’m unclear on what “duplicate papers” means.

are these actually word-for-word duplicates, published in multiple journals?

in my discipline, that would be a big no-no. and in any discipline, it seems like a weird phenomenon.

if nature is publishing my piece on “extra legs in cephalopods caused by exposure to sponge-bob square-pants”, why would “the archives of cephalopoddities” be willing to publish it as well?

how many of these are not genuine duplicates, but simply related papers that share title and first author? i mean, many titles are far from unique identifiers of content: “p-transitions in lead isotopes” could be the title of several related papers.

in which case, most researchers would print the better content in the better journal (as those “betters” are antecedently perceived), and this in turn would confound the results of this study.

so i guess i’m just asking for clarification of how we know that the exact same content appeared in both the more and the less prestigious journal.

(and then, subsidiarily, some explanation of how such a thing would be permitted in a discipline).
4 Stuart 08.26.09 at 2:07 pm: how many of these are not genuine duplicates, but simply related papers that share title and first author?

Remember they also check for the same number of citations, which would cut the chance of it being a coincidence down a lot I imagine.
5 LizardBreath 08.26.09 at 2:40 pm: Stuart’s explanation in 2 seems as if it might easily explain the whole effect, and I can’t see how you could control for it. (The underlying hypothesis seems probably true to me, but this paper sounds like bad evidence for it.)
6 Jacob T. Levy 08.26.09 at 3:13 pm: From the paper:

“Although the existence of such duplicate papers raises important ethical questions, they offer a unique occasion to test the effect of the impact factor of the journal on the number of citations received by the papers they publish.”

What a marvelous academic sentence.

I’m still pretty much stunned that there are 4500+ such papers, and can’t imagine how that’s possible.
7 onymous 08.26.09 at 3:17 pm: I would expect papers published in more than one journal to come from researchers of “intrinsic” low quality. People only do that (a) to pad their CV and (b) when they think no one is paying enough attention to notice. Surely that affects the results?
8 Ewout ter Haar 08.26.09 at 3:19 pm: Couldn’t some of the people citing have found both duplicates and decided which one to cite based on the reputation of the journal, but if only the one existed they would have cited that whichever one it was in.

Well, that’s the Matthew Effect, isn’t it? The fact that two papers with the same intrinsic value are cited more or less based on visibility and reputation.
9 onymous 08.26.09 at 3:22 pm: Well, that’s the Matthew Effect, isn’t it?

No, it isn’t. The Matthew Effect would be if the paper were less likely to be cited at all for being in the low-impact journal. This doesn’t show people aren’t finding it in the low-impact journal; it might just show that if they do, they also find it in the high-impact journal and choose to cite it there.
10 onymous 08.26.09 at 3:25 pm: Another interesting thing along these lines was this study by Asif-ul Haque and Paul Ginsparg about how papers appearing at the top of daily arxiv listings are more likely to be cited.
11 Harry 08.26.09 at 3:54 pm: I know of only one case of an identical paper appearing in two journals, but with different titles; nevertheless the editor of the more influential journal caught it, and banned the author from subsequent submissions.

This isn’t a surprising finding, but stuart’s question needs answering…
12 LizardBreath 08.26.09 at 3:57 pm: I’m not an academic, but there’s no such thing as publishing an abstract and bibliography in one journal as a pointer to the full paper in another journal that your intended audience might not read, is there (like, there’s a small tightly focused field with a topical journal, and researchers in that field usually don’t expect to find relevant papers in Nature)? If that were something people did, that would explain apparent duplicates without wrongdoing.
13 Zamfir 08.26.09 at 4:00 pm: Wouldn’t a good test for the Stuart Effect be to look for papers that are identical in text, but with different titles and/or authors?

Sure, those also raise ethical questions, but that doesn’t appear to be a problem.
14 Stuart 08.26.09 at 4:23 pm: Although reducing your sample to only papers sent in by unethical authors might make the sample unrepresentative. Hopefully!
15 Satan Mayo 08.26.09 at 4:29 pm: I would expect papers published in more than one journal to come from researchers of “intrinsic” low quality. People only do that (a) to pad their CV and (b) when they think no one is paying enough attention to notice. Surely that affects the results?

Would a CV padded with two papers, next to each other, with exactly the same title and authors, really be effectively padded?
16 kid bitzer 08.26.09 at 4:49 pm: stuart @4

i don’t see why having the same number of citations should do that much to ensure identity of content otherwise.

if i have a brief, dull paper on p-transitions in lead isotopes and a long, innovative paper on p-transitions in lead isotopes, i will probably cite most of the same standard sources in both papers. the citations reflect the existing literature on a topic more than the contents of this particular paper.

anyhow–is it the consensus of readers here that these duplicates are ethically suspect? it sure seems so to me–i.e., i would expect the kind of reaction harry describes in 11–but from the original post i thought maybe i was being naive about how things are done in the sciences. i’d be relieved to find out it’s a no-no there, too.
17 Ewout ter Haar 08.26.09 at 4:56 pm: Maybe I am misinterpreting what the “Matthew Effect” means, but if two identical papers are found and citations are only based on intrinsic value, one would expect an equal number of citations to both (authors would cite one or the other, with a 50% chance).

If, on the other hand, the higher visibility paper has a higher probability of getting cited, then there is a “rich get richer” or “Matthew” effect. And that is of course exactly what happens. The Matthew effect, in this interpretation, is not all or nothing, just a higher than 50% chance of getting cited. So, in this sense, the Stuart effect = Matthew Effect: authors base their decision of which of the two articles to cite on grounds other than intrinsic value.
18 onymous 08.26.09 at 6:15 pm: kid bitzer: Yes, it is a definite no-no in the sciences. The only cases I know of when people can reasonably publish the same material more than once is when they publish a paper and then give a talk at a conference, and are asked to submit the talk as a conference proceeding. Even in those cases, the content isn’t precisely identical.
19 onymous 08.26.09 at 6:18 pm: Ewout ter Haar: maybe. But I think it’s important to distinguish two effects. The first is this one, that given identical papers published in two different venues, one will choose to cite the one in the better-known venue. The work is getting cited regardless, so there’s no harm here.

The more pernicious effect would be if work of equal quality in lesser-known journals is not getting cited, or even noticed, because it’s in the lesser-known journal. That this study indicates things are getting cited on grounds other than intrinsic value really tells us nothing about whether this more troubling effect happens.
20 leederick 08.26.09 at 6:43 pm: "anyhow–is it the consensus of readers here that these duplicates are ethically suspect? it sure seems so to me…"

Am I missing something here? Maybe it’s a disciplinary thing, it’s very common in some fields for historic or classic papers to be republished to bring them to the notice of researchers who may have missed the original publication, or to use as a launch pad for a discussion of a method or finding, or a commentary. I’m thinking of areas like medicine, where there are high profile general readership journals for practitioners – as well as a huge and obscure literature in related fields these readers may not have the time or ability to seek out.

In that context I don’t find the result surprising in the slightest – if you make an effort to pimp out a difficult to obtain paper that’s obviously going to be influential, and cite it in a couple of other papers while doing so, it’ll get a lot of cites from the reprint. But I did find it pretty striking they only managed to find 4,532 pairs of papers.
21 watson aname 08.26.09 at 6:43 pm: onymous @18. Agreed, and it’s quite common in the other direction, that a conference proceedings paper will describe initial work that later is expanded on in a full paper. It’s not even unusual to have identical names and author lists in this case, but the actual content isn’t identical, and often isn’t even that close. I’d generally expect the citations list on the journal paper to be longer, but wouldn’t be entirely surprised by identical citation lists. The internal details and data are always in theory at least somewhat different, although they come from the same project. Typically there is a lot more of it in the journal paper, too. You sometimes see repeated figures for background work particularly, which may or may not be a copyright violation…
22 Doormat 08.26.09 at 6:48 pm: If I have the usual access to Web Of Science (say, which I do), is it possible to run the sort of search which would find such duplicate papers? Or do you need access to the raw database (perhaps by asking WoS??) I’m just very curious to see some actual examples of such duplicates…
23 Stuart 08.26.09 at 6:57 pm: So, in this sense, the Stuart effect = Matthew Effect: authors base their decision of which of the two articles to cite on grounds other than intrinsic value.

Except we can assume most authors think their article looks better with a cite to a more prestigious journal in it, than to an obscure one ceteris paribus. The question surely is if there was a different article that was more appropriate/more effective in supporting your point, would you still cite the less relevant article in the more prestigious journal. I am not sure the first statement being proved by this study allows us to assume the second case is true.

If we want to be non-scientific about it, it seems reasonable to assume that these two elements will compete against each other, so as the difference in prestige between the two journals in question gets smaller, or the difference in value of the articles being considered for citation grows, then the more likely it is the article in the “lesser” journal will be chosen, and the indifference function will probably vary from author to author to a fair extent, finding an average representative value for such a curve would be complicated at best.

Of course the problem with discussing the Matthew effect is it can be applied to so many different things – if one person in a conversation is assuming you are talking about prestigious scientists being cited more (a common variation in academia), and someone else is discussing the effect as it applies to make certain journals more prestigious, things can get very confused.
24 kid bitzer 08.26.09 at 7:38 pm: leederick @20

oh, right; but that’s a different phenomenon than what i think the authors have in mind.

yes–some papers in my discipline (in fact, some of my own papers) have been reprinted, e.g. in anthologies or later collections. and they are marked as such, both in the second venue and on my c.v.

but i think of that as a different kind of venue from a journal. journals publish original, unpublished-elsewhere work, not reprintings of classics.

and come to think of it, sometimes a journal will publish a special issue that selects the best articles of the last few years written in an area. but there too it would be a special case, and marked as such (e.g., the editors of the second journal would know that it had previously been printed, secure permission from the first journal, etc.)

so i recognize the kinds of cases you describe, but i was assuming this research was about a different phenomenon.
25 Paul Orwin 08.26.09 at 8:12 pm: It seems relevant to me that the average IF for both the higher and lower cases were quite low, in absolute terms (1.1 and 0.5). This suggests that we are looking at an atypical set here, and so the effect makes sense (when looking at two obscure articles, use the IF as a heuristic for reliability). The range was very broad, perhaps accounting for the breadth of disciplines studied as well as the duplication possibilities mentioned in previous comments. I’d also note that in my field (microbiology) duplicate publication will get you tossed out on your butt (aside from things like an invited talk/poster abstract with the same title as a paper).
26 kid bitzer 08.26.09 at 8:19 pm: i’m very reassured by the “tossed out on your butt” comments.

though this then makes satan mayo’s question in 15 more pressing:
why would an author even *want* to do this, if they cannot enjoy the spoils of their dishonesty via a padded cv?

and it raises a further research avenue:

look at the cvs of the authors of the 4532 papers in question, and see which one they cited in their *own* cvs!
27 onymous 08.26.09 at 9:35 pm: I’m not convinced the CV argument holds. For one, they could just list things out of order and hope no one notices identical titles. Also, I’ve been told that in some countries, advancement in academia and awarding of grants is determined almost entirely by number of publications; it could be that they aren’t looking too closely at the details.
28 Salient 08.26.09 at 11:08 pm: I am wondering if they controlled for whether or not the paper was published in the high-impact journal first (and I’d like to know what percentage of the papers were first published in the high-impact journal and then the low-impact journal).
29 Fr. 08.26.09 at 11:44 pm: I want to see the duplicates data; especially for the social sciences.
30 Conrad 08.26.09 at 11:50 pm: Like doormat @ 22, I would like more information about the search method for identifying duplicates along these categories. I found another paper by the authors that breaks the duplicates down by field. Apparently they found a couple hundred social science duplicates over the last couple of decades. I would like some examples of these duplicates.

I can imagine someone might think they could get away with padding their CV by publishing the same article with different titles but I don’t see why they would make their sin obvious by repeating the same title in both instances.

Should the authors of this paper have an obligation to disclose their methodology so their results can be duplicated and evaluated? I think so.
31 lemuel pitkin 08.27.09 at 1:07 am: I have to admit, I don’t understand what is at all surprising or troubling about this. The alternative to the “Matthew effect” would be a world where journals added no value. If the point of journals is precisely to act as a filter, isn’t the fact that papers in better-known journals are more widely read a sign that the system is working exactly as it’s supposed to?
32 eudoxis 08.27.09 at 3:31 am: On the topic of duplicate publications.

http://www.nature.com/nature/journal/v451/n7177/full/451397a.html

There was much discussion about this last year. It appears that duplicate publications comprise about 1% of total, mostly involve low impact publications, and, presently, don’t seem to be on the increase. Open access, new rules for submitting total text to PubMed, for example, and better search algorithms seem to be keeping this phenomenon in check.
33 Zamfir 08.27.09 at 6:56 am: Lemuel, it’s not obviously more harmful than any alternatives, but it is an example of a step in a path dependency problem where being well-regarded makes it easier to become even better regarded.

The general assumption in a lot of academic funding etc. is that having done important work in the past makes you more likely to do so in the future, and experience seems to agree that this is true.

The Matthew effect suggests that part of this is an illusion, because people regard your work as more important if you did important work in the past.
34 Kenny Easwaran 08.27.09 at 8:29 am: I’m pretty sure Michael Dummett has some papers that count as “duplicates” in this sense. That is, I’m pretty sure there are two papers with the same (only) author, the same title (“Truth”) and the same citation list (namely, nothing). Of course, these aren’t actually duplicates – his 1956 (I think that’s right) paper is a pretty important one, while the other one (or maybe two) titled just “Truth” isn’t as much.

Of course, Dummett is a special case, both in the lack of citations, and the repeat of titles.
35 Doormat 08.27.09 at 9:15 am: This is quite fascinating! Following eudoxis’s link at #32, you can find this website: Dejavu Website. It’s an automated text search, but with manual verification. If you click on some of the entries, you’ll find automated matches which don’t, to a human eye, look like duplicates, but also some verified things which are definitely duplicates!

I picked one example where I have access to the journal, and the articles are the same, pretty much. I didn’t want to post the links here, but what do people think? Anyway, these are published in the same journal, a couple of months apart, with the same “received” and “accepted” dates. But one looks a bit more polished than the other. I do wonder, in this case, if the journal just accidentally published the first draft, and then made up by publishing the correct version? Of course, this isn’t explained! Ah, but I’ve just found another example where the same thing has happened, but here the journal has also issued an erratum explaining the mistake: see Behavioural Brain Research, Volume 157, Issue 2, 28 February 2005, Page 379 (You’ll need some sort of Science Direct access).

But other examples are published in different journals (I haven’t found one yet where I could access both text versions).
36 Fr. 08.27.09 at 10:33 am: Some discussion and amazement about the results on Friendfeed.
37 andrewm 08.27.09 at 2:24 pm: Apparently the Web of Science has about 42 million unique records, so it seems that Lariviere & Gingras have found a duplication rate of approximately 1 in 10,000 papers. This accords much more closely with my own experience (i.e. one colleague encountering one case of simultaneous submission in ~25 years) than the numbers up-thread.

The linked article at #32, with a ~1% duplication rate on a tighter criterion, makes you wonder about the folk indexed by PubMed…
38 Paul Orwin 08.27.09 at 4:32 pm: From a quick look at the DejaVu site, it seems there are several plausible scenarios that result in duplicate publications, some conceivably nefarious (CV padding, etc), most basically innocent.
1) Publish in two different languages – Chinese then English, or vice versa, for example. I can see how a person from China might wish to do this. Same goes obviously for any other linguistic pair. I imagine that this might be a bad thing to do in some people’s minds, but I’m not sure I see it. It might be a violation of journal policies in some cases, though.
2) Republish a review article in an additional venue. Presumably to disseminate it to a wider audience. For example, a review on the pharmacokinetics of some drug within an addict population might be useful to people in several disciplines, who might not read the same journals. Others can feel free to come up with their own examples.
3) Joint publication of interdisciplinary work – two journals agree for similar reasons to those in (2) to jointly publish a work. There was a paper on medical ethics in the list that was published in the Journal of Medical Ethics and the Journal of Medical Education. That seems like an appropriate duplication, since readers of both journals might be interested in the article (I didn’t read it, but it was something about euthanasia).

In the end, this seems like a rather benign phenomenon, mostly explained by innocent practices, and revealed by the newfound ability to compare massive numbers of journal articles using search and comparison algorithms.
39 paul 08.27.09 at 5:16 pm: 37, Paul, I think your point (1) is valid. I think it’s not uncommon to republish articles in the physical sciences in English translation when the original article was in Russian, German or French. Perhaps some of these cases would fail the “duplicates” test as the titles would appear to be different, although many non-English language journals now provide a title and abstract in English.

I was once asked to review a paper which was a borderline case of self-plagiarism, and certainly contained no new material of any merit. A very quick search for the authors’ names and the topic returned a list of suspiciously similar articles, one of which was very similar. My hunch is that quite a lot of people try to get away with this sort of thing, but the peer review process is probably fairly good at catching them.
40 Michael Bishop 08.30.09 at 11:39 pm: See key comments made by Pierre at orgtheory.net

Comments on this entry are closed.
