From the category archives:

Statistics

It’s time for the Green Human Development Index

by Ingrid Robeyns on November 16, 2020

The United Nations Development Program’s flagship index of wellbeing and social progress, the Human Development Index, no longer captures what humans need, and needs to be replaced by a Green Human Development Index. That’s what I’ll argue in this post.

First, some context for those who do not know the Human Development Index (HDI). The HDI is the main index of the annual Human Development Reports, which, since 1990, have been published by the United Nations Development Program (UNDP). The reports analyse how countries are doing in terms of the wellbeing of their citizens, rather than the size of the economy. In 1990, the Pakistani economist Mahbub ul Haq had the visionary idea that in order to dethrone GDP per capita and economic growth as the yardstick for governmental policies, an alternative index was needed. He asked Amartya Sen to help him construct such an index. The rest is history. The HDI became a powerful alternative to GDP per capita. It consists of three dimensions and several indicators. The first dimension is human life itself, for which the indicators are child mortality and life expectancy. The second dimension is knowledge, captured by school enrollment rates and adult literacy rates. And the last dimension is the standard of living, for which the logarithm of GDP per capita is used.
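As a rough illustration of how a composite index of this kind is built, here is a sketch in code. The goalposts and the geometric-mean aggregation below are assumptions modelled loosely on the post-2010 HDI methodology, not the exact UNDP formula:

```python
import math

# A stylized sketch of an HDI-style composite index (illustrative only:
# the goalposts below are assumptions, and the UNDP's actual methodology
# has changed over the years; since 2010 it uses a geometric mean).

def dimension_index(value, lo, hi):
    """Normalize a raw indicator onto [0, 1] between fixed goalposts."""
    return (value - lo) / (hi - lo)

def hdi_sketch(life_expectancy, mean_schooling, gdp_per_capita):
    health = dimension_index(life_expectancy, 20, 85)
    education = dimension_index(mean_schooling, 0, 15)
    # Income enters through its logarithm, reflecting diminishing returns.
    income = dimension_index(math.log(gdp_per_capita),
                             math.log(100), math.log(75_000))
    return (health * education * income) ** (1 / 3)  # geometric mean

print(round(hdi_sketch(82, 13, 45_000), 3))  # prints 0.914
```

The log transform on income is the key design choice: an extra dollar moves a poor country's score far more than a rich country's.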

It is easy to criticize the HDI for not capturing all dimensions of wellbeing, or for other shortcomings. For whatever those academic arguments are worth, there is no denying how successful the HDI has been at accomplishing its two primary purposes: to dethrone GDP per capita and economic growth as the sole yardsticks for societal progress, and to stimulate policy makers to put human beings at the centre of their institutional design and policy making. And by that yardstick, the HDI has been a great success. Each year, the release of the Human Development Reports captures the attention of media and policy makers worldwide. Many politicians and governments care about their ranking in comparison with other countries. And, most importantly, the political power of the HDI provides an incentive for countries to try to invest more in education and health, combatting child mortality and increasing life expectancy.

Yet, it is now time to abandon the HDI. Paradoxically, this is not despite, but because of its political success. The reason is that we have entered the Anthropocene – the geological epoch in which the human species is changing ecosystems and the geology of the Earth. The most well-known of those changes that humans have caused is climate change. And since these ecosystems and planetary boundaries in turn affect human flourishing, they must be central in any analyses of that human flourishing. [click to continue…]

The great replication crisis

by John Quiggin on September 2, 2015

There’s been a lot of commentary on a recent study by the Replication Project that attempted to replicate 100 published studies in psychology, all of which found statistically significant effects of some kind. The results were pretty dismal. Only about one-third of the replications observed a statistically significant effect, and the average effect size was about half that originally reported.

Unfortunately, most of the discussion of this study I’ve seen, notably in the New York Times, has missed the key point, namely the problem of publication bias. The big problem is that, under standard 20th century procedures, research reports will only be published if the effect observed is “statistically significant”, which, broadly speaking, means that the average value of the observed effect is more than twice as large as the estimated standard error. According to standard classical hypothesis testing theory, the probability that such an effect will be observed by chance, when in reality there is no effect, is less than 5 per cent.

There are two problems here, traditionally called Type I and Type II error. Classical hypothesis testing focuses on reducing Type I error, the possibility of finding an effect when none exists in reality, to 5 per cent. Unfortunately, when you do lots of tests, you get 5 per cent of a large number. If all the original studies were Type I errors, we’d expect only 5 per cent to survive replication.

In fact, the outcome observed in the Replication Study is entirely consistent with the possibility that all the failed replications are subject to Type II error, that is, failure to demonstrate an effect that is there in reality.

I’m going to illustrate this with a numerical example[^1].
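A sketch of the sort of numerical example Quiggin has in mind is easy to run. The sample size, the 0.3 effect size, and the half-null composition of the literature below are all made-up numbers for illustration, not taken from the post:

```python
import random
import statistics

random.seed(1)

def study(true_effect, n=30):
    """Run one study: return the observed mean effect and whether it is
    'statistically significant' (observed mean more than 2 standard errors
    from zero), for n unit-variance observations."""
    xs = [random.gauss(true_effect, 1) for _ in range(n)]
    mean = statistics.fmean(xs)
    se = statistics.stdev(xs) / n ** 0.5
    return mean, abs(mean) > 2 * se

# Half the literature chases a real but modest effect (0.3), half a null one.
published = []          # (originally observed effect, true effect)
for _ in range(2000):
    true = 0.3 if random.random() < 0.5 else 0.0
    mean, significant = study(true)
    if significant:     # publication bias: only significant results appear
        published.append((mean, true))

# Replicate every published study once, with the same design.
replicated = [study(true)[1] for _, true in published]
print("replications significant:", round(sum(replicated) / len(replicated), 2))
print("mean published effect   :", round(statistics.fmean(m for m, _ in published), 2))
```

Under these assumptions most published results reflect a perfectly real effect, yet only about a third of replications reach significance, and the published effect sizes are inflated well above the true 0.3, which is roughly the pattern the replication study reported.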

[click to continue…]

Inequality, migration and economists

by Chris Bertram on November 8, 2014

Tim Harford has [a column in the Financial Times claiming that citizenship matters more than class for inequality](http://www.ft.com/cms/s/0/d9cddd8e-6546-11e4-91b1-00144feabdc0.html). In many ways it isn’t a bad piece. I give him points for criticizing Piketty’s default assumption that the nation-state is the right unit for analysis. The trouble with the piece though is the immediate inference from two sets of inequality stats to a narrative about what matters most, as if the two things Harford is talking about are wholly independent variables. This is a vice to which economists are rather prone.

Following Branko Milanovic, Harford writes:

> Imagine lining up everyone in the world from the poorest to the richest, each standing beside a pile of money that represents his or her annual income. The world is a very unequal place: those in the top 1 per cent have vastly more than those in the bottom 1 per cent – you need about $35,000 after taxes to make that cut-off and be one of the 70 million richest people in the world. If that seems low, it’s $140,000 after taxes for a family of four – and it is also about 100 times more than the world’s poorest people have. What determines who is at the richer end of that curve is, mostly, living in a rich country.

Well indeed, impressive stuff. And as Joseph Carens noticed long ago, and Harford would presumably endorse, nationality can function rather like the feudal privileges of history. People are indeed sorted into categories, as they were in a feudal or class society, that confine them to particular life paths, limit their access to resources and so forth. But there’s an obvious point to make which rather cuts across the “X matters more than Y” narrative, which is that citizenship isn’t a barrier for the rich, or for those with valuable skills. It is the poor who are excluded, who are denied the right to better themselves in the wealthy economies, who drown in the Mediterranean, or who can’t live in the same country as the love of their life. Citizenship, nationality, borders are ways of controlling the mobility of the poor whilst the rich pass effortlessly through. It isn’t simply an alternative or competitor to class, it is also a way in which states enforce class-based inequality.

Severian of Nessus, Amateur Bayesian

by Henry on July 22, 2014

“Noah Smith today”:http://noahpinionblog.blogspot.com/2014/07/bayesian-superman.html

Consider Proposition H: “God is watching out for me, and has a special purpose for me and me alone. Therefore, God will not let me die. No matter how dangerous a threat seems, it cannot possibly kill me, because God is looking out for me – and only me – at all times.” Suppose that you believe that there is a nonzero probability that H is true. And suppose you are a Bayesian – you update your beliefs according to Bayes’ Rule. As you survive longer and longer – as more and more threats fail to kill you – your belief about the probability that H is true must increase and increase. It’s just mechanical application of Bayes’ Rule.
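The mechanical updating Smith describes takes only a few lines to reproduce. The prior and the daily survival probability under not-H below are made-up numbers:

```python
# Bayes' Rule for Proposition H, with illustrative numbers. Under H you
# survive each day with probability 1; under not-H, with probability s < 1.
# Surviving another day therefore always nudges the posterior on H upward.

def update(prior_h, s=0.999):
    """P(H | survived the day), by Bayes' Rule."""
    p_survive = prior_h * 1.0 + (1 - prior_h) * s
    return prior_h * 1.0 / p_survive

p = 1e-6                      # tiny but nonzero prior on H
for _ in range(30_000):       # roughly 82 years of not dying
    p = update(p)
print(p)                      # the posterior has crept up close to 1
```

Each day multiplies the odds on H by 1/s, so over a long enough life even a vanishingly small prior is driven toward certainty, which is exactly the "mechanical application" the quote complains of.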

Gene Wolfe, The Citadel of the Autarch

Often their chants sounded so clearly that I could make out the words, though they were in no language I had ever heard. Once one actually stood on his saddle like a performer in a riding exhibition, lifting a hand to the sun and extending the other toward the Ascians. Each rider seemed to have a personal spell; and it was easy to see, as I watched their numbers shrink under the bombardment, how such primitive minds come to believe in their charms, for the survivors could not but feel their thaumaturgy had saved them, and the rest could not complain of the failure of theirs.

“Pete Warden”:http://petewarden.com/2013/07/18/why-you-should-never-trust-a-data-scientist/

bq. The wonderful thing about being a data scientist is that I get all of the credibility of genuine science, with none of the irritating peer review or reproducibility worries. My first taste of this was my Facebook friends connection map. The underlying data was sound, derived from 220m public profiles. The network visualization of drawing lines between the top ten links for each city had issues, but was defensible. The clustering was produced by me squinting at all the lines, coloring in some areas that seemed more connected in a paint program, and picking silly names for the areas. I thought I was publishing an entertaining view of some data I’d extracted, but it was treated like a scientific study. A New York Times columnist used it as evidence that the US was perilously divided. White supremacists dug into the tool to show that Juan was more popular than John in Texan border towns, and so the country was on the verge of being swamped by Hispanics. …

bq. I’ve enjoyed publishing a lot of data-driven stories since then, but I’ve never ceased to be disturbed at how the inclusion of numbers and the mention of large data sets numbs criticism. The articles live in a strange purgatory between journalism, which most readers have a healthy skepticism towards, and science, where we sub-contract verification to other scientists and so trust the public output far more. … If a sociologist tells you that people in Utah only have friends in Utah, you can follow a web of references and peer review to understand if she’s believable. If I, or somebody at a large tech company, tells you the same, there’s no way to check. The source data is proprietary, and in a lot of cases may not even exist any more in the same exact form as databases turn over, and users delete or update their information. Even other data scientists outside the team won’t be able to verify the results. The data scientists I know are honest people, but there’s no external checks in the system to keep them that way.

[via Cosma – Cross-posted at The Monkey Cage]

[My reflections on Britain since the Seventies](https://crookedtimber.org/2013/04/10/britain-since-the-seventies-impressionistic-thoughts/) the other day partly depended on a narrative about social mobility that has become part of the political culture, repeated by the likes of Tony Blair and Gordon Brown and recycled by journalists and commentators. In brief: it is the conventional wisdom. That story is basically that Britain enjoyed a lot of social mobility between the Second World War and the 1970s, but that this has closed down since. It is an orthodoxy that can be, and has been, put in the service of both left and right. The left can claim that neoliberalism results in a less fluid society than the postwar welfare state did; the right can go on about how the left, by abolishing the grammar schools, have locked the talented poor out of the elite. And New Labour, with its mantra of education, education, education, argued that more spending on schools and wider access to higher education could unfreeze the barriers to mobility. (Senior university administrators, hungry for funds, have also been keen to promote the notion that higher education is a social solvent.)
[click to continue…]

Invisible Men

by Kieran Healy on January 11, 2013

Over the years I’ve [written](http://kieranhealy.org/blog/archives/2004/07/16/a-new-analysis-of-incarceration-and-inequality/) [about](http://kieranhealy.org/blog/archives/2006/05/23/incarceration-rates/) the work of [Bruce Western](http://www.wjh.harvard.edu/soc/faculty/western/), [Becky Pettit](http://faculty.washington.edu/bpettit/), [Chris Uggen](http://chrisuggen.blogspot.com), and other scholars who study mass incarceration in the United States. By now, the basic outlines of the phenomenon are pretty well established and, I hope, widely known. Two features stand out: its [sheer scale](http://kieranhealy.org/blog/archives/2006/05/23/incarceration-rates/), and its [disproportionate concentration](http://kieranhealy.org/blog/archives/2004/07/16/a-new-analysis-of-incarceration-and-inequality/) amongst young, unskilled black men. It should be astonishing to say that more than one percent of all American adults are incarcerated, and that this rate is without equal in the country’s history and without peer internationally. Similarly, it may seem hard to believe that “five percent of white men and 28 percent of black men born between 1975 and 1979 spent at least a year in prison before reaching age thirty five”, or that “28 percent of white and 68 percent of black high-school dropouts had spent at least a year in prison by 2009”.

Those numbers come from the first chapter of Becky Pettit’s new book, [*Invisible Men: Mass Incarceration and the Myth of Black Progress*](http://www.amazon.com/Invisible-Men-Incarceration-Black-Progress/dp/0871546671). You can read [the first chapter](https://www.russellsage.org/sites/all/files/Pettit_Chap1.pdf) for free, but I recommend you [buy the book](http://www.amazon.com/Invisible-Men-Incarceration-Black-Progress/dp/0871546671). Pettit’s argument is that mass incarceration is such a large and intensive phenomenon that it distorts our understanding of many other social processes.

[click to continue…]

The US News College Rankings Scam

by Henry on February 8, 2012

“Stephen Budiansky”:http://budiansky.blogspot.com/2012/02/us-news-root-of-all-evil.html, via Cosma Shalizi’s Pinboard feed.

bq. Back in ancient times when I worked at esteemed weekly newsmagazine U.S. News & World Report, I always loathed the annual college rankings report. Like all cash cows, however, the college guide was a sacred cow, so I just shut up about its obvious statistical absurdities and inherent mendacity. As a lesson in the evils of our times, it is perhaps inevitable that the college guide is now the only thing left of U.S. News.

bq. A story in today’s New York Times reports that Claremont McKenna college has now been caught red handed submitting phony data to the college guide to boost its rankings. But the real scandal, as usual, is not the occasional flagrant instance of outright dishonesty but the routine corruption that is shot through the whole thing. … To increase selectivity (one of the statistics that go into U.S. News’s secret mumbo-jumbo formula to produce an overall ranking), many colleges deliberately encourage applications from students who don’t have a prayer of getting in. To increase average SAT scores, colleges offer huge scholarships to un-needy but high scoring applicants to lure them to attend their institution. (The Times story mentioned that other colleges have been offering payments to admitted students to retake the test to increase the school average.)

bq. … One of my favorite bits of absurdity was what a friend on the faculty at Case Law School told me they were doing a few years ago: because one of the U.S. News data points was the percentage of graduates employed in their field, the law school simply hired any recent graduate who could not get a job at a law firm and put him to work in the library. Their other tactic was pure genius: the law school hired as adjunct professors local alumni who already had lucrative careers (thereby increasing the faculty-student ratio, a key U.S. News statistic used in determining ranking), paid them exorbitant salaries they did not need (thereby increasing average faculty salary, another U.S. News data point), then made it understood that since they did not really need all that money they were expected to donate it all back to the school (thereby increasing the alumni giving rate, another U.S. News data point): three birds with one stone! (I gather the new Case law dean has put an end to these shenanigans.)

Worth reading the whole thing (even though Budiansky’s site has one of those annoying and anti-social ‘if you cut and paste text from my site, you will get unasked-for cruft about how you ought to click on the original link added to your pasted text’ installations).

Fun with Statistics

by Tedra Osell on December 11, 2011

Old friend and former grad student buddy Lawrence White (not sure where he’s teaching these days–Lawrence, are you out there reading?) pointed out that <a href="http://www.businessweek.com/magazine/correlation-or-causation-12012011-gfx.html">this</a> might be a useful teaching tool. Plus it’s just kinda fun.

Suddenly it all makes sense

Speaking of teaching, this is one of two seasons of the year in which I feel quite gleeful that I no longer have that responsibility. You people really shouldn’t be surfing the web unless your grading is done, you know.

U.S. Traffic Accident Fatalities, 2001-2009

by Kieran Healy on November 22, 2011

From ITO comes this very nice—and very sobering—map of road accident fatalities in the United States between 2001 and 2009. As someone who wrote a book about blood and organ donation in Europe and the United States, I’ve spent time analyzing NHTSA data on traffic accidents. I remember that, during Q&As at talks, people were often surprised to learn just how many road deaths there are in the U.S: about forty thousand per annum (though 2009 saw a very sharp drop, interestingly). Of course, people drive a great deal, too. Standardized by miles traveled, the rate is about 1.5 per 100 million vehicle miles. Still, the absolute number is striking: about two full Boeing 747s’ worth every week of the year.
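The back-of-the-envelope figures above are easy to check. The vehicle-miles total and the 747 passenger load below are rough assumptions, not NHTSA numbers:

```python
# Sanity-check the quoted rates, using rough assumed inputs.
deaths_per_year = 40_000
vehicle_miles_per_year = 3.0e12      # roughly 3 trillion miles travelled
boeing_747_capacity = 400            # an assumed typical passenger load

rate = deaths_per_year / vehicle_miles_per_year * 1e8
jets_per_week = deaths_per_year / 52 / boeing_747_capacity
print(f"{rate:.2f} deaths per 100 million vehicle miles")
print(f"{jets_per_week:.1f} 747-loads per week")
```

Both quoted figures come out in the right ballpark: on the order of 1.5 deaths per 100 million vehicle miles, and about two full jumbo jets a week.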

You can zoom in to the precise location of every accident on the map. Each dot is a life. Drive safely this Thanksgiving.

A few years back as part of the attack on climate science (and in particular the famous ‘hockey stick’ graph) Congressman Joe Barton (R-TX) commissioned an assessment of the work of Michael Mann and others from Professor Edward Wegman of George Mason University, along with his former student Yasmin Said and some others. This included not only Wegman’s supposedly independent assessment of the statistical methods used by Mann, but also a ‘social network analysis’ of the relationship between Mann and his co-authors, which purportedly showed that Mann’s network of co-authors dominated the climate science field. As I pointed out at the time, Wegman et al started the analysis with Mann at the centre, so the primary result was that Mann had written a paper with every one of his co-authors! Nevertheless, a version of the paper was published in Computational Statistics and Data Analysis, in which Wegman took this analysis to the startling conclusion that senior academics should not collaborate with each other, but should instead work only with their students. Wegman follows his own advice in this respect, and now we can see why.

It’s just been announced that the paper is to be retracted on the grounds that it contains extensive plagiarism, much but not all of it from Wikipedia. Wegman’s response, showing the wisdom of his research strategy, is to blame his graduate student, who was not, however, credited as an author. USA Today, which has taken the lead in following the Wegman plagiarism story, asked an actual expert to look at the paper and her reaction was about the same as my amateur assessment (Wegman and Said are also newcomers to the field, which may explain their heavy reliance on Wikipedia as a reference source).
[click to continue…]

The Statistical Abstract of the United States

by Kieran Healy on April 15, 2011

I saw this report go by on the Twitter saying that, in the wake of the latest budget deal, the Census Bureau is planning on eliminating the Statistical Abstract of the United States, pretty much the single most useful informational document the Government produces. The report says,

When readying the FY2011 budget, the Census Bureau tapped teams to do thorough, systematic program reviews looking for efficiencies and cost savings. Priorities for programs were set according to mission criticality, and some cuts were made to the economic statistics program. According to Tom Mesenbourg, deputy director of the Census Bureau, “difficult choices had to be made” in order to reduce expenditures on existing programs and move forward with new initiatives in FY2012. Core input data that the Bureau of Economic Analysis relies on to produce the National Income and Product Account tables, for example, would be retained. New data sets needed to be added to the Census of Government regarding state and local government pensions (e.g., cost of post-retirement employee benefits). In addition, FY2012 requires funding for the planning stages of the 2012 Economic Census; data collection begins in 2013. So what’s left to cut? It was felt that the popular Statistical Abstract of the United States—the “go to” reference for those who don’t know whether a statistic is available, let alone which agency/department is responsible for it—could be sacrificed. Staff will be moving to “Communications,” digitizing the data set. It is hoped that the private sector—commercial publishers—will see the benefit of publishing some version of the title in the future.

Bleah. When it comes to the United States, the print and online versions of the SA are a peerless source of information for all your bullshit remediation needs. What’s the median household income? What does the distribution of family debt liability look like? How many people are in prison? How many flights were late, got diverted, or crashed in the past few years? How many women hold public office? What sort of families get food stamps? Who does and doesn’t have health insurance? What percentage of households own a cat, a dog, a bird, or a horse? (The fish lobby seem to have lost out on that one.)

In his early days as a pundit, Paul Krugman got a fair amount of mileage from columns that consisted mostly of taking some claims about the U.S. trade balance or industrial structure, looking up the relevant table in the Abstract, and calling bullshit on the claim-maker. (Of course, that was in those far-off days when all this were nowt but fields, Krugman was still a Real Economist—i.e., he had yet to win the Nobel Prize in Economics, or say rude things about Republican economic and social policy—and he patrolled the boundaries of his profession against the incursions of pop internationalists.) So, properly used, the SA might even make you famous.

In the meantime, maybe this is all a feint or post-budget posturing by the Census Bureau. I have no idea. But I really do hope the abstract doesn’t go away anytime soon, or become the property of some gobdaw publisher looking to sell me tabulations of data the government has already collected using public money.

Odds

by Belle Waring on October 24, 2010

I congratulate journalist Megan McArdle for having the good fortune to encounter such a talkative fellow passenger on the D.C. bus the other day.

Yesterday, I rode the bus for the first time from the stop near my house, and ended up chatting with a lifelong neighborhood resident who has just moved to Arizona, and was back visiting family. We talked about the vagaries of the city bus system, and then after a pause, he said, “You know, you may have heard us talking about you people, how we don’t want you here. A lot of people are saying you all are taking the city from us. Way I feel is, you don’t own a city.” He paused and looked around the admittedly somewhat seedy street corner. “Besides, look what we did with it. We had it for forty years, and look what we did with it!”

He’s a little off, because I think black control of Washington D.C. officially occurred only in 1975 when Parliament’s “Chocolate City” was released.

Defending the NRC Rankings

by Henry on October 19, 2010

There’s a qualified defense of the recent NRC rankings of universities by the rather magnificently named duo of E. William Colglazier and Jeremiah P. Ostriker in the “Chronicle today”:http://chronicle.com/article/Counterpoint-Doctoral-Program/125005/?sid=at&utm_source=at&utm_medium=en

[click to continue…]

Mean and Regressive

by Henry on September 28, 2010

I just finished reading Justin Fox’s “The Myth of the Rational Market”:http://www.amazon.com/gp/product/0060599030?ie=UTF8&tag=henryfarrell-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=0060599030 (yes: two years late – I know), and came across this story about Daniel Kahneman which I didn’t know, and which illustrates one of those points that is _ex post_ obvious, but _ex ante_ rather brilliant.

bq. The only point Daniel Kahneman was trying to get across was that praise works better than punishment. The Israeli Air Force flight instructors to whom the Hebrew University psychologist delivered his speech that day in Jerusalem in the mid-1960’s were dubious. One veteran instructor retorted:

On many occasions I have praised flight cadets for clean execution of some aerobatic maneuver, and in general when they try it again they do worse. On the other hand, I have often screamed at cadets for bad execution, and in general they do better. So please don’t tell us that reinforcement works and punishment does not, because the opposite is the case.

bq. As a man trained in statistics, Kahneman saw that _of course_ a student who had just brilliantly executed a maneuver (and was thus praised for it) was less likely to perform better the next time around than a student who had just screwed up. Abnormally good or bad performance is just that – abnormal, which means it is unlikely to be immediately repeated. But Kahneman could also see how the instructor had come to his conclusion that punishment worked. “Because we tend to reward others when they do well and punish them when they do badly, and because there is regression to the mean,” he later lamented, “it is part of the human condition that we are statistically punished for rewarding others and rewarded for punishing them.”
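Kahneman’s explanation is easy to reproduce by simulation. The unit-normal skill and luck distributions and the 1.5 cutoffs below are arbitrary choices:

```python
import random

# Regression to the mean, simulated: each attempt is stable skill plus
# transient luck, with no feedback (praise or screaming) of any kind.
random.seed(0)

good_followups, bad_followups = [], []
for _ in range(100_000):
    skill = random.gauss(0, 1)
    first = skill + random.gauss(0, 1)
    second = skill + random.gauss(0, 1)
    if first > 1.5:        # a brilliantly executed maneuver
        good_followups.append(second)
    elif first < -1.5:     # a badly botched one
        bad_followups.append(second)

def avg(xs):
    return sum(xs) / len(xs)

# Both follow-up averages sit well inside the +/-1.5 cutoffs: the stars
# got worse and the stragglers improved, with nobody teaching anybody.
print("after a great first attempt :", round(avg(good_followups), 2))
print("after an awful first attempt:", round(avg(bad_followups), 2))
```

An instructor watching this simulation would conclude, exactly as the veteran did, that screaming works and praise backfires, when in fact nothing but luck is being reshuffled.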