Two from the FT

by Henry Farrell on December 2, 2005

Two interesting articles in the Financial Times this morning. First, this piece on the use of league tables to assess school performance in the UK does a decent job of talking about the limits of statistical measures.

“while quantitative targets and performance indicators may seem like an advance on vague aspirations, their apparent clarity is an illusion. … But statisticians warned of a more basic flaw. … each figure is based on a limited sample, and thus inherently uncertain. … The sample-size problem has since been found to undermine league tables for other institutions whose performance is calculated from small numbers, such as fertility clinics. In common with school league tables, they often show dramatic changes in rankings. These are often taken to signal dramatic change in performance. In reality, they are merely expected random variation in the quoted performance level – an effect that would be made clear if error bars were included.”
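To make the error-bar point concrete, here is a rough sketch rather than anything taken from the FT piece. It treats each pupil as an independent pass/fail draw and uses the textbook normal approximation for a proportion. The function names and the cohort figures (72 and 82 passes out of 120) are invented for illustration.

```python
import math


def pass_rate_interval(passes, cohort, z=1.96):
    """Approximate 95% error bar for a school's pass rate.

    Assumes pupils are independent pass/fail draws (a simplification)
    and uses the normal approximation to the binomial.
    """
    rate = passes / cohort
    half_width = z * math.sqrt(rate * (1 - rate) / cohort)
    # Clamp to the valid range for a proportion.
    return max(0.0, rate - half_width), min(1.0, rate + half_width)


def intervals_overlap(a, b):
    """True if two (low, high) intervals share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    # Hypothetical schools: same cohort size, pass rates 60% and about 68%.
    school_a = pass_rate_interval(72, 120)
    school_b = pass_rate_interval(82, 120)
    print("A:", school_a, "B:", school_b)
    print("error bars overlap:", intervals_overlap(school_a, school_b))
```

With cohorts of 120 the two bars overlap even though the headline rates differ by eight points. With the same rates and cohorts ten times larger they would not, which is the sample-size problem in miniature.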

Second, the WHO says that it’s going to stop hiring smokers to promote its campaign against tobacco use. This is both idiotic and anti-liberal. There’s a decent case to be made for banning smoking in the workplace, because of the externality costs that smoking imposes on non-smokers. There’s no case whatsoever to be made for discriminating against smokers who don’t impinge on others’ health by refusing to hire them in the first place. The British print version of the article has a quote from ASH, the UK anti-smoking lobby group, criticizing the decision as not being “a very good way of tackling the issue.” Clearly, they’re worried – and rightly so – that this is going to be a public relations disaster.



Bob B 12.02.05 at 6:20 am

Two sets of related questions:

Schools league tables have many flaws, but they do show up a basic issue: some local education authorities remain consistently lowly placed in the national league table over many years, even though there is no close correlation between measures of local affluence and ranking in the national schools league tables.

It happens I live in an outer London borough which has consistently rated at or near the top of the league for secondary – although not primary – schools for the past decade and more, despite rating only a modest ranking in league tables of local affluence. The reason for this sustained achievement is that this particular local education authority is endowed with a cluster of outstanding (selective) schools by national standards.

Without the parental and political pressures created by a national schools league table, what is there to show up the persistent failure of some local education authorities to improve local school standards?

Why is WHO only refusing to employ smokers? To be consistent, why not also those who imbibe alcoholic drinks? After all:

“The number of alcohol-related deaths has increased by nearly a fifth in four years, figures show. The Office for National Statistics data revealed deaths in England and Wales rose from 5,525 in 2000 to 6,544 in 2004 – an 18.4% increase.”

“Alcohol is up there as one of the biggest killers in the country. More than 6,500 people die each year in England and Wales because of alcohol through liver disease, cancer and alcohol poisoning.

“Colin Drummond, professor of addiction psychiatry at St George’s Hospital Medical School in south London, said: ‘Alcohol is a major risk to public health. Smoking causes more deaths, but the number of smokers is on the decrease.'”


Sherman Dorn 12.02.05 at 8:59 am

Thanks for the tip! It quotes one of my favorite educational statisticians, Harvey Goldstein, who has done yeoman’s work for years on the league tables and value-added statistics.

To bob b, I’d point out that there is a difference between transparency (which is good) and the misleading implication that statistics are more accurate than they really are (which is bad). There’s nothing to prevent governments from adding error bars and confidence intervals to league-table-like results, and yet they never do.


soru 12.02.05 at 9:11 am

With only a relatively small number of pupils taking examinations each year, the resulting error bars for their performance are wide – so wide that they overlapped those of every other school, making a mockery of any attempt at ranking them.

That seems a pretty incredible claim on the face of it.

If you look at any league table, you see leafy selective schools at the top, large under-resourced urban schools at the bottom.

He appears to be saying that could be a complete fluke, next year they could easily be reversed, purely as a result of random sampling error.

Maybe he’s just putting a bit too much faith in 95% CI error bars as ‘real’ things. If two schools have error bars that just barely overlap (say a 10% chance of the lower school’s true value being above the low end of the higher school’s error bar, and vice versa), then the chance of sample error alone flipping their positions is pretty slim, something like 100 to 1 against.
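A back-of-envelope version of that calculation, offered as a sketch under loud assumptions: both schools’ scores are treated as independent, normally distributed estimates, the observed values are taken as the true ones, and the question is how often a fresh pair of cohorts would reverse the order. The helper names and the standard error of 0.04 are made up for the example.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gap_when_bars_just_touch(se_a, se_b, z=1.96):
    """Gap between two estimates whose 95% error bars only just meet."""
    return z * (se_a + se_b)


def flip_probability(gap, se_a, se_b):
    """Chance that independent normal noise alone reverses the ordering.

    Assumes the observed gap is the true gap and the two schools'
    sampling errors are independent (both are simplifications).
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return normal_cdf(-gap / se_diff)


if __name__ == "__main__":
    se = 0.04  # hypothetical standard error for each school's score
    gap = gap_when_bars_just_touch(se, se)
    print("flip probability:", flip_probability(gap, se, se))
```

Under those assumptions the chance of a flip comes out at well under 1%, the same ballpark as the rough guess above. If pupils’ results within a school are correlated, the real standard errors are larger and the figure changes accordingly.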



Bob B 12.02.05 at 10:03 am

“If you look at any league table, you see leafy selective schools at the top, large under-resourced urban schools at the bottom.”

That is a convenient myth. Several of the London boroughs placed lowly in the league table of local education authorities (LEAs), based on school leaving exam attainment, are actually big per capita spenders on schooling whereas top rated LEAs are only middling spenders.

Part or much of the basic trouble is that some local politicians running LEAs gained an insight years ago that maintaining poor school standards had reassuring, predictable electoral consequences according to the equation:

Poor school standards = uncertain job prospects for school leavers = vote Labour

It is to Blair’s credit that he is trying to undo this well-recognised recipe, which helps to explain why he is in so much trouble with the many Labour MPs who work on the principle: if it ain’t broke, why fix it?

The league tables are necessary to show up which LEAs and schools are failing. We can then look in depth to see whether failing is due to low spending, to local affluence/poverty, poor teaching, local politics, cultural factors or sheer inertia.

The enduring potency of cultural factors is not to be under-estimated. Unfortunately, the picture painted by George Orwell in chp.7 of The Road to Wigan Pier (1937) is still recognisable in places:

“And again, take the working-class attitude towards ‘education’. How different it is from ours, and how immensely sounder! Working people often have a vague reverence for learning in others, but where ‘education’ touches their own lives they see through it and reject it by a healthy instinct. The time was when I used to lament over quite imaginary pictures of lads of fourteen dragged protesting from their lessons and set to work at dismal jobs. It seemed to me dreadful that the doom of a ‘job’ should descend upon anyone at fourteen. Of course I know now that there is not one working-class boy in a thousand who does not pine for the day when he will leave school. He wants to be doing real work, not wasting his time on ridiculous rubbish like history and geography. To the working class, the notion of staying at school till you are nearly grown-up seems merely contemptible and unmanly. The idea of a great big boy of eighteen, who ought to be bringing a pound a week home to his parents, going to school in a ridiculous uniform and even being caned for not doing his lessons!”

Just to show that I am sadly not mistaken in this, consider this recent BBC news report:

“Almost half of 17 year olds in some parts of England have dropped out of full-time education or training, government statistics reveal. . . . The statistics, which were issued in response to a Parliamentary question from the Liberal Democrats, confirm England’s poor international standing for staying-on rates in education. The Organisation for Economic Co-operation and Development ranks England’s drop-out rates as among the worst among industrialised countries.”

The fact that England shows up poorly on stay-on rates in education after 17 by international comparisons among OECD countries rather proves that commitment to education and staying on in full-time education is not just a simple matter of the deprived inner-urban areas versus the leafy suburbs.

What makes that international ranking especially paradoxical is that by EU standards, the UK is more affluent in terms of GDP per capita than most other EU15 countries according to this:


Harry 12.02.05 at 10:06 am

soru, that’s not true of value-added tables, only of raw score tables. When you look at value-added tables, Goldstein is right. It’s long been known that i) there are factors we don’t know enough about to construct accurate tables and that ii) the tables we can construct don’t distinguish between the 80% of schools in the middle of the distribution because of sample size problems. Goldstein has been dealing with these methodological issues for years, has been bugging the DfES about them, and has not been getting satisfactory responses.

Now, a big caveat to this. British schools are very small. A cohort within a school will typically be no more than 100-150. Contrast with American secondary schools, which routinely have cohorts of 300-500. Sample sizes would be much bigger for non-rural US schools. But those schools only have the kids for 4 years (compared with 7 in the UK), and the prevailing ethos in middle schools is that learning anything is an added bonus (in the UK secondary schools have the middle-school-year kids and have a real incentive to teach them something because they will live with the consequences when they are high school age).

Finally, comparing LEAs gives you bigger sample sizes. But since LEAs vary in how much power they have over schools, and have relatively little power (compared with what they had 30 years ago), it’s not clear what you are finding out about when you compare them.


nik 12.02.05 at 10:09 am

…each figure is based on a limited sample, and thus inherently uncertain.

Bollocks. Position in a league table is not based on a sample, it’s based upon the results from the entire population. There is no uncertainty in the result, every single person is counted. There’s some statistical (or journalistic) sleight of hand going on when people suggest that the “true performance figure” is different to the performance figure you get when you count all of the results.

The main point – that there may be no substantial difference between the results of the school in first place on a league table and the school in last place – is perfectly valid (i.e. that the differences between schools may be the result of chance, or so small as to be inconsequential). It is perfectly true that changes in ranking may be meaningless.

It’s a shame they didn’t focus on regression toward the mean more. The government likes to select underperforming institutions, put special measures in place, and then declare a success when they improve. That’s basic to New Labour. It’s a shame more people don’t realise just how stupid it is.
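For anyone who wants to see regression toward the mean in action, here is a toy simulation, not a model of any real league table. Every number in it (500 schools, cohorts of 120, true pass rates scattered around 60%) is an assumption picked for illustration, and nothing is done to any school between the two years.

```python
import random


def simulate_two_years(n_schools=500, cohort=120, seed=0):
    """Return the bottom decile's mean pass rate in year 1 and in year 2.

    Each hypothetical school has a fixed true pass rate; each year's
    observed rate is that true rate plus sampling noise from one cohort.
    No intervention happens between the two years.
    """
    rng = random.Random(seed)
    true_rates = [min(1.0, max(0.0, rng.gauss(0.6, 0.05)))
                  for _ in range(n_schools)]

    def observed(rate):
        # One cohort of independent pass/fail pupils.
        return sum(rng.random() < rate for _ in range(cohort)) / cohort

    year1 = [observed(r) for r in true_rates]
    year2 = [observed(r) for r in true_rates]

    # Pick the "underperforming" schools purely on year-1 results.
    worst = sorted(range(n_schools), key=lambda i: year1[i])[: n_schools // 10]
    mean1 = sum(year1[i] for i in worst) / len(worst)
    mean2 = sum(year2[i] for i in worst) / len(worst)
    return mean1, mean2


if __name__ == "__main__":
    before, after = simulate_two_years()
    print("bottom decile, year 1:", round(before, 3))
    print("same schools, year 2:", round(after, 3))
```

The schools picked out as worst in year one score better on average in year two, because part of what put them at the bottom was bad luck with that cohort. Any special measures applied in between would get the credit.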


They’re not saying the results are totally random (i.e. that a complete reversal could take place). But they are saying that the random element is so great you can’t be sure that the school in top place is any better than most of the other schools in the table. If that’s the case, that does kind of destroy the purpose of a ranking.


derek 12.02.05 at 10:26 am

Re: the smoking decision by WHO. That’s just wrong. I have a libertarian friend who is a smoker, and militant about it; and he doesn’t get that smoking in my face (i.e. in work or any other shared public space) comes under the heading of the great libertarian principle “the right to swing your fist stops at the end of my nose”.

But he’s going to be all over this daft decision, and he will be right to be, because it’s one thing to ask someone not to smoke in your face, and another to ask them not to smoke *anywhere*.


dsquared 12.02.05 at 11:40 am

Position in a league table is not based on a sample, it’s based upon the results from the entire population

Not quite right; I would say it’s more usefully thought of by saying that the result of each year’s cohort is a sample of the underlying unobservable “performance” of the school, with noise introduced to the process by the stochastic “quality” of the kids they had to start with. The FTSE100 index is based on the population of the entire 100 stocks in the FTSE, but it’s still noisy over short time periods.


Matt McGrattan 12.02.05 at 12:39 pm

re: “Now, a big caveat to this. British schools are very small. A cohort within a school will typically be no more than 100-150. ”

I’m not sure how true this is. My central Scottish high school, on the edge of a large town but with a partly rural intake, had approx 300 – 350 pupils per year and I don’t think that was atypical.

Although that would decrease to about 150 per year once all the kids who weren’t doing Highers left.


harry b 12.02.05 at 1:42 pm

Even 300 would make it a smallish suburban or urban American school. But certainly in England and Wales (sorry, I should not imperialistically have used Britain) 10-15 form entry (which is what you are describing) is considered huge, and quite outside the norm.


Bob B 12.02.05 at 2:02 pm

Even the most ardent espousers of local education authority and schools league tables will admit to flaws. But there is a noticeable lack of proposals for alternative indices that measure how well standards are being achieved, and the fact is that both employers’ organisations and universities have been expressing concerns for many years about the basic literacy and numeracy skills of school leavers.

Critics of the league tables really need to focus on that substantive issue and come up with alternatives. Despite an open invitation, they have conspicuously failed to do so.


soru 12.02.05 at 2:07 pm

They’re not saying the results are totally random (i.e. that a complete reversal could take place).

Well, the article does say that, even if it didn’t mean it:

so wide that they overlapped those of every other school, making a mockery of any attempt at ranking them

The relative ranking of two schools could easily be 99% certain while their 95% error bars overlapped. Not many things in life are 99% certain, it would be pretty foolish to mock those that are.

In any case, the concept of ‘error bars’ based on sampling error seems pretty dubious – is he really assuming the academic success of pupils in the same class, doing exams on the same day, is statistically independent?

It really does seem to me that either the article simplifies or distorts the professor’s views, or someone is attempting to deliberately mislead through misuse of statistics.

That aside, there is clearly some kind of valid underlying point here, that league tables are more precise and cleaner than the underlying data supports. A school bouncing about between, say, #1 and #3 in a league table almost certainly means nothing in terms of actual standards, but probably causes the head no end of phone calls from irate parents.

Sport aside, not many other areas of life evaluate things so brutally. Kitchen salesmen and japanese school pupils may get ranked strictly, everyone else gets a much vaguer grade, more like the school grades ‘A+’ to ‘F’.

Perhaps some qualified statistician could come up with a scheme where schools with a ranking that was genuinely statistically uncertain were ranked equal in the table?



harry b 12.02.05 at 2:29 pm

soru — you’ll find useful stuff on Harvey Goldstein’s site. I don’t think he is an unqualified statistician:

This is very difficult stuff.

Bob b: I disagree. The critics of tables believe that they thoroughly mislead, rather than provide useful information. HG has offered numerous criticisms and suggested amendments. Not all have been ignored, but many have. There is no “alternative”; we realise there will be tables, but we hope that people will look at the information they purport to contain with a good deal of scepticism. We don’t think they are particularly useful levers in school improvement.

Suppose league tables were entirely useless. Why should the person pointing this out be under any more of a burden than the proponents of the table to propose an alternative?


Bob B 12.02.05 at 4:28 pm


We are learning to live with proliferating numbers of league tables for all sorts of things from football teams through “international competitiveness” to the Global 500 and the ranking of hospitals for surgery success and infection rates. Police forces now have to cope with regular performance rankings. The genie is finally out of the bottle and if there aren’t official league tables, the likelihood is that the press will invest resources to build their own league tables for the public services because their readers – the consumers of public services – want to know and won’t be put off.

Hospital league tables generated a parallel critical discussion to education league tables. Challenging issues arise in ranking hospital departments – the major teaching hospitals tend to attract referrals of especially difficult cases with inbuilt risks of higher failure rates, so adjustments need to be made for that factor in making comparisons. But the league tables illuminated some worryingly high failure rates in some hospital departments, which then became the focus of closer investigation and remedial measures where appropriate. It will be virtually impossible to coax that genie back into its bottle now.

Much the same applies to local education authority and schools league tables and for similar reasons. For all the many flaws in the tables, there are substantive reasons for continuing concern about raising and sustaining schooling standards in Britain as I have tried to argue and document above.

We need to recognise situations where particular schools are failing in order to focus investigation and introduce remedies. The fact is that many stereotypical explanations are often actually incorrect – I can recall a paradoxical case of an 11-18 (comp) school with miserable attainment rates at GCSE for 16 year-olds but above national average attainment rates at A-level for its small sixth form. How come? There is no obvious, credible explanation in terms of resourcing or poor teaching – I fear the Orwell factor applied.


Jim Buck 12.02.05 at 6:28 pm

Working for the WHO and being a cigarette smoker is akin to being a feminist and wearing a burka.


Luc 12.02.05 at 7:06 pm

Can’t say I’m in any way qualified to judge whether the WHO no smoking thing is bad or good publicity but it certainly isn’t “both idiotic and anti-liberal”.

You can’t be a smoker and effectively lobby for implementing the FCTC, which is an important part of the WHO business.

It would sound like “Hey, I can’t convince myself to quit, but surely some of those 1 billion Chinese will fall for my great anti smoking PR line!”.


John Quiggin 12.03.05 at 2:03 am

The “complete population” argument is one made by Deirdre McCloskey, and recently criticised by Hoover and Siegler.

Although I have some sympathy with the argument, it’s clearly wrong in many cases including this one. The obvious response is that, for policy purposes, the performance of the school for a cohort that’s already finished doesn’t matter, except as a signal about likely performance in the future. The only way you can assess the usefulness of the signal is to treat the current cohort as a subset of a larger population, and issues of statistical significance immediately arise.


Bob B 12.03.05 at 6:31 am

Presumably, a similar critical analysis must apply to healthcare league tables, possibly with even greater force on sampling issues. No? But then recent news reports on hospital care such as this suggest very persuasive reasons why monitoring surgery success and infection rates is essential for identifying problematic situations:

“LONDON, Dec. 2 (UPI) — One-fifth of post-surgical patients in Scotland were infected by bacteria such as MRSA, a government report said. The Scottish Audit of Surgical Mortality quantified deaths related to hospital superbugs for the first time and said that of 1,854 people who died after an operation, 376 of them were infected by methicillin-resistant Staphylococcus aureus, known as MRSA despite Health Department campaigns to control infections.”


dave heasman 12.04.05 at 7:05 pm

“One-fifth of post-surgical patients in Scotland were infected by bacteria such as MRSA, a government report said”. But it didn’t. It said that one-fifth of post-surgical patients *who died after an operation* in Scotland were infected by bacteria such as MRSA.

And that’s “infected by” not “killed by”.

Ben Goldacre in tehgrauniad is very good on MRSA testing – apparently some places find it where other places don’t. Like hundreds of times more often. Particularly when contacted by the tabloid press.

What about the other 1500 stiffs? Were their deaths not politically useful to some pressure-group or other?


Bob B 12.05.05 at 12:57 am

Hi Dave: Good to see another of your posts again – I won’t mention the lamentable performance of the Eurozone since we exchanged forum messages about that elsewhere in March 1999 after Lafontaine resigned from Schroeder’s government.

Having been through a cardiac [diagnostic] surgical procedure at a major London teaching hospital almost exactly a year ago, I can assure you that some hospitals take precautions against MRSA infection very seriously. In addition to maintaining rigorous cleaning and staff hygiene practices, all surgical patients are routinely subject to swab tests for MRSA infection before surgery.

Fortunately, I had no subsequent MRSA complications – but I was nevertheless much relieved to be advised that as the result of the procedure, the need for further cardiac surgery wasn’t indicated. This contrasts with the previous experience of someone I know well who twice contracted MRSA infections after cardiac surgery at other hospitals. I also know several people who live locally and who lost spouses through MRSA infections following surgery.

The hazards of post-surgery MRSA infection are really not to be under-estimated. I can promise you that it is not just a tabloid scare but a very real issue.


dave heasman 12.05.05 at 4:56 am

Hi Bob, I guess you’re glad to see your 1998 prediction that Blair would be King of Europe by now hasn’t panned out.
Your anecdotes aren’t statistics.
Do these Scottish hospitals test non-stiffs for MRSA? The only way to gather honest, and full-cohort, statistics is to test patients on admission and on discharge. Can’t blame the hospital if the patients brought it in; nor if their filthy visitors infected them. So better test again after each visit. To be on the safe side, better not admit anyone who looks “unhygienic”. That’ll move the place up in the league table.
And pace the Eurozone, I would still welcome 2.5% interest rates. Oh yes.


Bob B 12.05.05 at 9:34 am

“Your anecdotes aren’t statistics.
Do these Scottish hospitals test non-stiffs for MRSA?”

But I was quoting a news agency which claimed to be quoting from, “The Scottish Audit of Surgical Mortality”.

I can’t speak for the practices of all British hospitals but I have testified as to my own, very specific experience of a surgical procedure in a London teaching hospital where MRSA swabs were taken before surgery.

What I can also say from following the news and from personal contacts with (mainly retired) health service professionals is that post-surgery MRSA infection is certainly a challenging – and so far largely intractable – continuing issue for hospitals in Britain.

Googling will retrieve huge repositories of news reports, studies, audits, ministerial announcements and the like. The problem has international dimensions – not least because this malignant strain of Staphylococcus aureus is alleged to be deeply entrenched in a particular province of Canada, and with significantly lower infection rates elsewhere in Canada.

This is arguably important because it suggests that a silver bullet approach – eg “it’s all due to poor hospital cleaning” because of private cleaning staff – is unlikely to be viable as an explanation for the incidence of MRSA or the basis of a successful remedy.

In the present context, I am also saying that hospital league tables were challenged on similar grounds to school and local education authority league tables but they have become a valued tool for illuminating (often life-critical) issues in healthcare and professional competence. Whatever their flaws, these tables won’t be pushed back into the bottle.

Btw on school league tables, I have a CD, issued recently as a freebie by The [London] Times, with an all schools database supposedly showing the examination results of each and every school. The protective cover displays in bold type: Parent Power.


Bob B 12.05.05 at 6:32 pm

Dave, this, from a recent press release by my local hospital trust, rather shows that the threat of MRSA infections is not just a scare story put about by the tabloid press in Britain:

“Epsom and St Helier NHS Trust say the intensification of its infection control programme – now not just with alcohol handwashing units at ward entrances but also at the end of each patient’s bed – has brought down its MRSA rate.

“These show the trust had a rate per 1,000 bed days of 0.22 – comparing favourably with results from the period between April and June 2003 when the rate was 0.43 per 1,000 bed days.

“A trust statement said: ‘This shows the second greatest improvement in MRSA bacteraemia rates in the country.’ . . ”
