The REF: A Modest (and very Tentative) Proposal

by Miriam Ronzoni on October 5, 2018

As many readers may already know, UK universities will undergo their next round of research evaluation in 2021. This is called the REF (Research Excellence Framework); it has recently been joined by its teaching equivalent, the TEF; and it is seen by many UK academics as part of a general managerialist, bureaucratic trend in UK academia which many deplore, and about which other CT members have already written many interesting things.

This post is neither about that general trend, nor about the problems or virtues one might identify in the rules of the current REF compared to past versions. It is, instead, about throwing out there a very simple idea on how to engage in some minimal-effort, minimal-confrontation resistance to the whole thing.

The REF exercise comprises three parts: research outputs (publications); research impact; and research environment. With regard to outputs, publications are ranked as 4*, 3*, 2*, 1* or unclassified by a panel of experts in the discipline. Research funding (not much, though – the REF is mainly a reputational exercise these days) comes with 4* and 3* publications, but disproportionately more for the former than for the latter.

Now, the Politics Panel of REF 2014 expressed a very clear stand both against bibliometrics, and against associating certain “venues” (journals or publishers) with a certain REF mark in a semi-automatic way – a method whereby, say, an article in a top journal almost automatically gets a 4*. In so doing, they put the emphasis on the importance of actually reading the material and assessing it on that basis. This was largely seen as a signalling exercise, pushing Politics towards the Humanities, as it were – and proudly so.

If, however, one of the problems with the REF is that we spend a lot of time doing it for very little funding, in a way that strikes many as overcompliance with an attempt to discipline academics, and is sometimes used in sinister ways, are there really no alternatives to the time-consuming option set out by the Politics panel?

Here’s an idea – and it is very tentative, but the only way to test it is to throw it out there. How about endorsing the logic of associating certain venues with a REF mark, but doing so in a reasonably generous, expansive way? Disciplines which endorse this logic usually have a very limited “Sancta Sanctorum” of 4* journals which are very difficult to get into. How about coming up with a consensus list of 4* journals for Politics (and for other disciplines too, actually!), but making it a reasonably expansive (though still serious) list – basically a list which includes a good chunk of the best “3* journals,” or something along these lines?

Here are the two advantages I can see – in a nutshell, they both come down to this being a form of resistance-based, rather than strategic, gaming:

1. Quite simply, more institutions would get a decent number of 4*s. This could reduce inequalities among institutions, and more generally hijack the use of the REF for punitive purposes;

2. It would simply save many of us a lot of time, and would enable us to spend more time actually doing our job, namely researching and teaching.

Now, I can see many, many problems with this approach – but let me ask you: would they be knock-down reasons to oppose the idea overall? It’s a genuine question. Let’s just brainstorm together.





Chris Bertram 10.05.18 at 1:01 pm

One thing that we can and should do, in my view, is to refuse offers to work for institutions other than our own in order to help them with their “mock” exercises. When we do so, we are just lending our authority (for money) for use in a process that can be used by managements to target individuals.


Daniel 10.05.18 at 1:07 pm

This could reduce inequalities among institutions

This would be the problem – the purpose of the system is to create inequalities among institutions, so if a solution were found such that it didn’t, you would have to expect the funding body (which wants the inequalities) to take some offsetting action to create them in some other way.


Miriam Ronzoni 10.05.18 at 1:10 pm

Hi Daniel – that’s why I am talking of minimum-level/minimum-confrontation resistance – sure, this might happen, but at least it would delay things? It’s like non-violent sabotage really.


Z 10.05.18 at 1:17 pm

This could reduce inequalities among institutions, and more generally hijack the use of the REF for punitive purposes

But would it really? If you expand the definition of 4* so that now everything that is “serious 3*” is 4*, then institutions with “serious 3*” and 4* research all get 4*, and so get the 4* share of the pie to be split using the REF. So inequalities have been reduced among previously 4* institutions and “serious 3*” institutions. But, it seems to me, an institution formerly ranked as 3* (but not quite “serious 3*”, so “average 3*” or “low 3*”) is now 3*, and since the pie has the same size and there are now more 4* institutions, that institution now has less and so inequalities have increased between “average 3*” and “serious 3*”. And of course if everything formerly 3* is now 4*, then the gap between former 2* and 3* institutions widens etc. (And if everything is 4* (as colleagues in a heavily unionized sector decided a couple of years ago), then next year the REF will require that at most 20% of institutions are ranked 4* etc.).
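Z’s fixed-pie arithmetic can be sketched numerically. This is purely a hypothetical illustration: the three departments, their output counts, and the weighting (a 3* output earning a quarter of a 4* share, as Miriam describes later in the thread) are all assumed figures, not real REF data.

```python
# Toy model of Z's argument: a fixed pot split by weighted output counts.
# Promoting "serious 3*" work to 4* equalizes the top but, because the
# pot is fixed, shrinks the share of departments left at "average 3*".

def allocate(outputs, pot):
    """Split a fixed pot across departments.

    `outputs` maps department -> (n_4star, n_3star); a 3* output is
    weighted at one quarter of a 4* output (assumed ratio).
    """
    weights = {d: n4 + 0.25 * n3 for d, (n4, n3) in outputs.items()}
    total = sum(weights.values())
    return {d: pot * w / total for d, w in weights.items()}

# Before the expansive list: (n_4star, n_3star) per department.
before = {"elite": (10, 0), "serious_3": (0, 10), "average_3": (0, 10)}
# After: "serious 3*" outputs are now graded 4*; "average 3*" unchanged.
after = {"elite": (10, 0), "serious_3": (10, 0), "average_3": (0, 10)}

pot = 100.0
a, b = allocate(before, pot), allocate(after, pot)
for d in before:
    print(f"{d}: {a[d]:.1f} -> {b[d]:.1f}")
```

Under these assumptions the élite and “serious 3*” departments end up with identical shares, while the “average 3*” department’s share falls – which is exactly the redistribution Z describes.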

Is it a good idea to reduce the inequalities between the élite and the very good at the price of increasing the inequality between the very good and the good, or the average? Maybe, depends on the case, I guess. It seems to me that in the end, it is the underlying logic – conflating the criterion of research with monetary rewards and punishment – which is flawed. (To be clear, I applaud the wish to resist and I pity everyone involved in this system, I just wonder if there are effective ways to do it as long as the logic of the system is as it is now.)


Daniel 10.05.18 at 1:24 pm

I am not familiar at all with the system, but would they not just grade on a curve? As long as scores can be placed in rank order, the funding body can make use of that rank order to distribute the fruits; the absolute variation in the scores matters less. I might be totally wrong here but my assumption is that the inequality is built into the system because the decision of “how many ‘top’ departments are there going to be?” is a policy rather than an output.


Miriam Ronzoni 10.05.18 at 1:33 pm

Hi Daniel – no, the system works like this: 4* = X amount of money; 3* = 1/4 of that; everything else = 0.


Miriam Ronzoni 10.05.18 at 1:34 pm

I mean this does not undermine your point, but things would become much, much flatter in terms of both grading and fruits; although of course there would still be a ranking in the end.


Mercurius Londiniensis 10.05.18 at 1:55 pm

How would this work for books? One could, I suppose, deem any book published by OUP, CUP, Wiley-Blackwell, Harvard or Princeton to be 4* (and double-weighted 4* if it’s more than, say, 150 pages). But since, in my field, pretty well any original book comes from those houses, this would result in very little discrimination. Perhaps that’s the goal, but it might create incentives to over-produce books.


Miriam Ronzoni 10.05.18 at 2:34 pm

Z point taken – see my “minimal confrontation” point. You might think that there’s no point in minimal confrontation resistance.

Mercurius: in Politics, after the last REF, the message is already to prioritize books (in light of the Panel’s feedback), so that incentive is already there (and includes a larger range of publishers than the ones you mention).


Daniel 10.05.18 at 2:34 pm

I guess it depends then on whether there is sufficient time for them to change the funding system to weight it even more heavily to 4*s?


Matt 10.06.18 at 1:26 am

What you describe sounds quite a bit like the system that exists in Australia. My impression of it (in the relatively short time I’ve been here) is almost completely negative. It’s very hard for the lists not to end up being arbitrary, and the system tends to punish good work that’s published in less high-“prestige” locations, forces people to try to publish in journals that may not be the best ones for them or the work, and is especially bad for people doing interdisciplinary work. If you can get a really broad set of journals included in the “top” rankings, it is less problematic, but only less so. It also encourages bureaucrats in education and government to conflate “mechanical” with “objective”. (This is a huge problem in higher education in Australia, it seems to me. It should be fought every inch of the way.)

No doubt the REF is a huge time sink, and has many problems. But I think that any movement towards the bureaucratic mechanicalism found in Australia will very soon be regretted.


Andrew Fisher 10.06.18 at 10:24 am

Surely the knock-down objection to this is that it’s fraud?

REF is used to allocate over £1 billion per year in England alone (para. 40ff.). If this is ‘not much’ to Miriam, it will nevertheless seem like a significant sum to others.

The ratings are used not just to distribute funds within disciplines, but also between disciplines (or, strictly, units of assessment). So the impact of a particular panel deciding to lower standards wouldn’t just be to change the distribution of funding within that unit, but also to draw resources away from the honest UoAs and into the dishonest one, especially as many institutions then mirror the funding allocation when they distribute QR internally.

It also seems to be an implication of ‘would enable us to spend more time actually doing our job’ that being accountable for the public funding spent on her salary is not something that is perceived as part of an academic’s job. I don’t see how that can be justified, myself.


Sam Tobin-Hochstadt 10.06.18 at 1:12 pm

As a US academic, I’ve always been a little puzzled by the UK’s dislike of the REF. It seems to have lots of good features — considering a small number of works, so that paper-counting is less relevant; actually reading the papers, so it’s not just about where your work appeared; etc. To the degree that it’s about allocating funding, what would be your preference in a different system? In the US, research funding is almost exclusively driven by competitive grant applications, but there’s also much more funding from student tuition available. That seems like a strictly worse way to get money. Another possibility is to just give everyone the same amount of money, but I’m not convinced that’s the most effective way to support research (Canada is a bit like this, as I understand it, but with more money for more senior people).


Miriam Ronzoni 10.06.18 at 5:43 pm

Matt: yes, the list would be broad. But the point is not to switch from current norms to quasi-Australian norms. The point is rather for Panel members within the existing REF to de facto endorse a system such as this, so as to end up de facto giving out many more 4*s than predicted, and produce a much, much flatter ranking. So the idea is to work within the existing norms of the REF to undermine its logic a bit, and counter its unequalizing logic, rather than to propose something as a good, viable alternative set of norms in the long run.


Miriam Ronzoni 10.06.18 at 5:49 pm

Andrew: right, so our fundamental disagreement is whether the exercise overall is justified, or whether the logic to which it belongs is justified. As I wrote in the post, my proposal *assumes* that the managerialist trend is to be countered, and that the REF might be particularly straightforward to counter, even if I agree with you that it is by far not the worst bit. If I have time, I will try to post later about why I think it is good to resist it (as part of a general trend), although mechanisms of accountability should not, of course, disappear (but I definitely think they have gone mad in UK academia, so yes, I think we have reached the point where justifying what we do *stands in the way* of actually performing our job well). But as I said, we seem to disagree about the assumptions.


Miriam Ronzoni 10.06.18 at 5:55 pm

Sam, again this opens a much bigger can of worms, but one big part of the answer is that funding is largely allocated by other means over here, too. So the REF is more and more a disciplining exercise, and one which involves layers and layers of work (including Departments spending A LOT of time engaging in second-guessing exercises). My hunch is that the picture of the REF from afar is way rosier than what the REF actually is.


Matt 10.06.18 at 7:09 pm

Ah, okay. I thought you were proposing a reform or a new system. Now I see that you’re suggesting that, rather than doing what the reviewers are supposed to/required to do, they just automatically give out marks based on a large list of journal locations. Well, that _would_ be fraud, as Andrew suggests, so probably a bad idea. Even aside from the ethical issues (“this is a boring waste of my time” isn’t typically thought to be a sufficient justification for engaging in civil disobedience, though of course it’s a tempting one), wouldn’t it be caught out fairly easily, and wouldn’t those who did this, rather than what they are supposed to do, be sanctioned? It sounds like the results would be a lot different from what’s expected, and then what was done would be learned. If so, why would those above just shrug afterwards? It doesn’t seem like a plausible plan to me.


harry b 10.06.18 at 7:24 pm

Why not replace the REF with funding agencies which consider applications on a case by case basis (like the NSF, etc)? Does anyone know why the RAE/REF block funding model was chosen in the first place?


Tom Hurka 10.06.18 at 8:32 pm

How many readers did the Politics Panel imagine each research work would have? My memory from philosophers who talked about the REF is that each piece would be read by just two people, one a specialist in that area of the discipline and one not. How arbitrary a procedure is that??? Opinions after reading a piece vary drastically, and with only two readers, how a book or paper is evaluated depends massively on luck, i.e. luck in who the readers were, how they felt at that moment, etc. “Actually reading the material” really means allowing chance – and bias and tiredness and crankiness – to have a big influence on the outcome.


Sam Tobin-Hochstadt 10.07.18 at 12:26 am

If not much funding relies on the REF, what reasons do universities have for caring so much? Is it just a rankings exercise, similar to the US News or the Shanghai rankings? If so, the methodology still seems a lot better than most others, although it does sound like a lot of work. Or is there some more tangible result?

I would also say that we on this side of the pond also spend lots of time in justifying what we do — it’s easier for the grass to look greener both ways.

My baseline assumption is that in the absence of some ranking involving lots of work and evidence, even if flawed in many important ways, decisions will instead be made on the basis of the far more biased ranking in people’s heads, in which Harvard and Cambridge always come out on top.


SusanC 10.07.18 at 10:29 am

The incentives here are “interesting”. I have heard serious suggestions in other fields that the top journals should accept more papers. The argument goes something along the lines of: the papers that just missed the cut for acceptance aren’t wrong (or Sokal-style hoaxes), just not as good as the better ones. If we reject them, this eventually leads to their authors losing their funding/academic jobs, and the money going instead to some other academic discipline – which, because it’s someone else’s field, we find less interesting and worthy of encouragement than the second-rank papers we are considering rejecting. So accept the second-tier stuff – credential inflation.


SusanC 10.07.18 at 10:38 am

@19: the difference between a paper getting into a top tier journal vs a second tier journal also has a lot of randomness (who the referees happened to be, how grumpy they were feeling at the time).

I could see the case for not distinguishing between “top tier” and “second tier” journals on the grounds that getting into one versus the other heavily depends on random factors unrelated to the quality of the work. So giving all papers in both tiers a 4* is defensible.


SusanC 10.07.18 at 4:12 pm

If I’m using some measurement in a psychology experiment – a clinical diagnosis of a patient, for example – I’m interested in the test-retest reliability – how consistent different raters are on the measurement (will different psychiatrists agree that a child has autism, etc.)

If your measure of a paper’s quality is that it got into a 4* journal, the question arises as to its test-retest reliability. Would another sample of referees agree that it was an accept?

Null hypothesis: test-retest reliability between 3* and 4* is poor.

You’d need to actually do the experiment, but if true, treating 3* and 4* as equivalent would be the standard response (assuming you do have good reliability between 3*/4* and 2* or below).
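SusanC’s proposed experiment can be sketched as a toy simulation. Every number here is an invented assumption (the latent quality scale, the 4* threshold, and the size of referee noise); a real test would use actual re-refereeing data. The idea: give each paper a latent quality, have two independent noisy panels grade it, and count how often their 3*/4* verdicts agree.

```python
import random

random.seed(0)

def panel_grade(quality, noise_sd=1.0):
    """A hypothetical noisy panel: observed score = latent quality plus
    referee-specific noise, thresholded at an assumed 4* boundary of 1.0."""
    score = quality + random.gauss(0, noise_sd)
    return 4 if score >= 1.0 else 3

# Papers whose latent quality clusters near the 3*/4* boundary
# (the interesting case for SusanC's argument).
papers = [random.gauss(1.0, 0.5) for _ in range(10_000)]

# Two independent panels grade each paper; count agreements.
agree = sum(panel_grade(q) == panel_grade(q) for q in papers)
print(f"independent panels agree on {agree / len(papers):.0%} of borderline papers")
```

With referee noise comparable to the spread in quality, agreement on borderline papers ends up not far above the 50% one would get by coin-flipping – which is the “poor test-retest reliability” null hypothesis made concrete.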


Miriam Ronzoni 10.08.18 at 9:42 am

Matt, perhaps one bit of background is not clear: many disciplines *already*, and not overtly, use a system like that, i.e. they have a small number of journals, and if you publish within them you get a 4*, and a somewhat bigger list, and if you publish there you get a 3*. Nobody has ever called that fraud, which seems to imply that disciplines are allowed to devise their own methodologies.* That’s the case in, for instance, Economics and Industrial Relations: people talk about having 4* and 3* publications for the next REF in a way that people in Politics would not, or only speculatively, because the Politics line is that there is no strict correlation. And that’s of course a very good line, but only *if* you think that the REF exercise, in the way it is currently conceived, is justified. If you think that an exercise of evaluation is justified, but not the REF as it currently is, and especially not what it does to academic life and work, then I am not sure why going for the correlation approach, while at the same time being much more expansive about the 4* list, is morally wrong (once you exclude the fraud line).


Miriam Ronzoni 10.08.18 at 9:43 am

*I call them disciplines rather than units of assessment to be as little idiosyncratic as possible, given the CT’s international readership.


Miriam Ronzoni 10.08.18 at 9:51 am

Tom and Susan, thanks, I agree with much of what you say – esp. on arbitrariness.

Sam: the why question is interesting. There is a bit of self-enslavement at stake, I suspect. And I am not suggesting the grass is greener elsewhere – what made you think that I was implying that? I think this logic has gone out of control everywhere.
And you are right that it’s either these kinds of exercises or elitism, but these kinds of exercises can be done with different levels of obsession. I suspect that several pre-2014 iterations of the REF (it was called the RAE then), which people were somewhat less fussed about, had some similarities with what I am suggesting, namely: just check that people are publishing in respectable places (with a generous understanding of what counts as respectable), and if they do, they are off the hook. What I am suggesting is that we should see these practices less as aimed at ranking, and more as aimed at checking that departments, roughly, meet a (reasonably demanding) sufficiency threshold of good research activity, without obsessing over ranking quite as much. It’s about checking that people are doing their job and less about assessing who’s best.
