Leiter report

by Chris Bertram on November 19, 2004

Brian Leiter’s “Philosophical Gourmet”:http://www.philosophicalgourmet.com/default.asp report is now out in its latest version.

[UPDATE: I hadn’t noticed that Kieran gets a credit for statistical advice on the front page!]


1

Richard Zach 11.20.04 at 12:30 am

Can someone explain to me the way the scaled mean ranking works? In particular, why is it that according to the raw scores, only the first 5 Canadian departments would make it into the US top 50 at all, whereas according to scaled mean, they would all make it into the top 40? Is the University of Alberta (say) up with Syracuse (both at -0.2 scaled mean) or down with the runners-up (2.2 raw mean)? And why is this difference so pronounced in the Canadian programs but comparing UK and Australasian programs with US programs by either scaled mean or raw mean gives roughly the same peer group?

2

Kieran Healy 11.20.04 at 4:10 am

Scaling and centering a variable makes it have a mean of zero and a standard deviation of one. In this case, what’s being standardized is the scores the individual raters give to programs. The point of scaling is to remove some potential bias in the scores arising from raters using the scale in different ways. For instance, some people might use the full 0-5 scale but most might confine their scores to the 2-4 range. In that case, the raters in the first group will have a disproportionate influence over the mean scores awarded to each department — their score will count for more than it should. So scaling a rater’s scores makes them more comparable to other raters.
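To make that concrete, here is a minimal sketch of per-rater standardization. The rater names, department labels and scores are all invented for illustration, and the use of the population standard deviation is an assumption; nothing here is taken from the report's data or its actual procedure.

```python
from statistics import mean, pstdev

def standardize(scores):
    """Center and scale one rater's scores to mean 0 and standard deviation 1."""
    m = mean(scores.values())
    s = pstdev(scores.values())  # assumption: population SD; the report may differ
    return {dept: (x - m) / s for dept, x in scores.items()}

# Hypothetical raters and departments, not data from the report.
wide_rater = {"A": 0.0, "B": 2.5, "C": 5.0}    # uses the full 0-5 scale
narrow_rater = {"A": 2.0, "B": 3.0, "C": 4.0}  # stays within the 2-4 range

z_wide = standardize(wide_rater)
z_narrow = standardize(narrow_rater)

for dept in wide_rater:
    print(dept, round(z_wide[dept], 3), round(z_narrow[dept], 3))
```

Both raters order the departments the same way, and once each rater's scores are standardized the two sets of numbers coincide, so neither rater pulls the averages around more than the other.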

However, the scaling was done within country-groups. So for the U.S., scaled scores were calculated for each rater across the 66 schools participants might have rated, whereas for Canada they were calculated across the 11 schools participants might have rated. Same for the 20 UK programs and the 8 Australian ones. The upshot is that the scaled scores are _not_ comparable across countries, because each rater's scores are standardized only against the schools within that country.

You _could_ calculate a scaled ranking for the whole dataset, of course. In that case Alberta’s scaled score would change — it would drop quite a bit. So for quick-and-dirty comparisons across countries, use the raw means.
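Here is a rough sketch of why within-country scaled scores don't line up across countries. It uses a single hypothetical rater and made-up school labels and scores (the real figures average over many raters), and again assumes the population standard deviation.

```python
from statistics import mean, pstdev

def standardize(scores):
    """Center and scale a set of scores to mean 0 and standard deviation 1."""
    m = mean(scores.values())
    s = pstdev(scores.values())  # assumption: population SD
    return {name: (x - m) / s for name, x in scores.items()}

# One hypothetical rater's raw scores, grouped by country (invented numbers).
raw = {
    "US": {"us1": 4.5, "us2": 4.0, "us3": 3.5, "us4": 3.0},
    "Canada": {"ca1": 2.6, "ca2": 2.2, "ca3": 1.8},
}

# Scaling within each country-group separately.
within = {}
for group in raw.values():
    within.update(standardize(group))

# Scaling once over the whole pooled dataset.
pooled = standardize({name: x for group in raw.values() for name, x in group.items()})

for name in ("us4", "ca1"):
    print(name, round(within[name], 2), round(pooled[name], 2))
```

In this toy data the rater scored `ca1` below `us4` on the raw scale, yet the within-country scaled score puts `ca1` above `us4`, because each is measured only against its own country's schools. Scaling over the pooled set restores the raw ordering, and the Canadian school's scaled score drops.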
