Floating the Fraud Balloon

by Kieran Healy on October 18, 2006

“Daniel wrote a piece”:http://commentisfree.guardian.co.uk/daniel_davies/2006/10/how_to_not_lie_with_statistics.html for the Guardian’s blog saying that critics who wanted to reject the findings of Burnham et al.’s “Lancet paper”:http://www.thelancet.com/webfiles/images/journals/lancet/s0140673606694919.pdf and believe the Iraq Body Count estimate (or similar-sized numbers) were going to have to come out and claim that the paper was fraudulent, “and presumably to accept the legal consequences of doing so.” Well, now “David Kane has floated that balloon.”:http://www.iq.harvard.edu/blog/sss/archives/2006/10/a_case_for_frau.shtml

*Update*: Kane’s accusations have been removed from the front page of the SSS blog. In a “follow-up,”:http://www.iq.harvard.edu/blog/sss/archives/2006/10/removed_a_case.shtml Amy Perfors apologises for the error of judgment and says they removed the post because the “tone is unacceptable, the facts are shoddy, and the ideas are not endorsed by myself, the other authors on the sidebar, or the Harvard IQSS.” Good for them.

He doesn’t have any positive evidence. He just begins from the idea that the number is too big, and asks who would be responsible for faking it. His answer is, the survey team:

bq. We know very little about these Iraqi teams. Besides monetary incentives to give the Lancet authors the answers they wanted, the Iraqis may have had political reasons as well. … Were any former members of the Baath Party? … How can anyone know that they are telling the truth? … The interviewers could, at their discretion, change the location of the sample. How many times did they do this?

Not having a bit of evidence on any of these points, he goes on to put a lot of weight on the paper’s extremely high (in fact, near perfect) response rate, and quotes a commenter from “one of our own threads”:https://crookedtimber.org/2006/10/15/air-war-in-iraq/#comment-175824 saying, correctly, that such rates would be seen as unbelievably high if reported by surveyors in the U.S. or Europe. Following the commenter, Kane concludes that the most likely possibility is that “the survey teams provided fraudulent data.”

The immediate problem with this charge is that, as it turns out, phenomenally high response rates are apparently very common in Iraq, and not just in this survey. “UK Polling Report”:http://www.ukpollingreport.co.uk/blog/archives/884 says the following:

bq. The report suggests that over 98% of people contacted agreed to be interviewed. For anyone involved in market research in this country the figure just sounds stupid. Phone polls here tend to get a response rate of something like 1 in 6. However, the truth is that – incredibly – response rates this high are the norm in Iraq. Earlier this year Johnny Heald of ORB gave a paper at the ESOMAR conference about his company’s experience of polling in Iraq – they’ve done over 150 polls since the invasion, and get response rates in the region of 95%. In November 2003 they did a poll that got a response rate of 100%. That isn’t rounding up. They contacted 1067 people, and 1067 agreed to be interviewed.

If this is correct, then the _only_ bit of circumstantial evidence that Kane proffers in support of his insinuation is in fact a misconception based on his own ignorance.

Kane says, “I can not find a single example of a survey with a 99%+ response rates in a large sample for any survey topic in any country ever.” I googled around a bit looking for information on previous Iraqi polls and their response rates. It took about two minutes. “Here is the methodological statement”:http://abcnews.go.com/images/Politics/1000MethodologyNote.pdf for a poll conducted by Oxford Research International for “ABC News”:http://abcnews.go.com/International/PollVault/story?id=1389228 (and others, including Time and the BBC) in November of 2005. The report says, “The survey had a *contact rate of 98 percent* and a cooperation rate of 84 percent for a *total response rate of 82 percent*.” “Here is one”:http://66.102.7.104/search?q=cache:7So5gURYvcwJ:www.brook.edu/fp/saban/iraq/index.pdf+iraq+opinion+poll+response+rate&hl=en&ct=clnk&cd=2 from the “International Republican Institute”:http://www.iri.org/mena/iraq/2006-07-19-IraqPoll.asp, done in July. The “PowerPoint”:http://www.iri.org/mena/iraq/pdfs/2006-07-18-Iraq%20poll%20June%20June.ppt slides for that one say that “A total sample of 2,849 valid interviews were obtained from a total sample of 3,120 rendering *a response rate of 91 percent*.” And “here is a report”:http://www.cpa-iraq.org/government/political_poll.pdf put out in 2003 by the former Coalition Provisional Authority, summarizing surveys conducted by the Office of Research and Gallup. In the former, “The overall response rate was *89 percent*, ranging from *93% in Baghdad to 100% in Suleymania and Erbil*.” In the latter, “Face-to-face interviews were conducted among 1,178 adults who resided in urban areas within the governorate of Baghdad … *The response rate was 97 percent*.” So much for Iraqi surveys with extraordinary response rates being hard to find.
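For anyone who wants to check the arithmetic behind those quoted figures, here is a minimal sketch. It assumes the simplest possible definitions: a response rate is completed interviews divided by the sample drawn, and a total response rate is the contact rate multiplied by the cooperation rate. Survey organisations use more detailed definitions, and the function names are mine rather than anything from the reports.

```python
def response_rate(completed, sampled):
    """Completed interviews as a fraction of the sample drawn (simplified definition)."""
    return completed / sampled


def total_response_rate(contact_rate, cooperation_rate):
    """Assumed relationship: total response = contact rate x cooperation rate."""
    return contact_rate * cooperation_rate


# Figures as quoted in the post; nothing here comes from the underlying data sets.
abc_poll = total_response_rate(0.98, 0.84)   # ABC News / Oxford Research International
iri_poll = response_rate(2849, 3120)         # International Republican Institute
orb_poll = response_rate(1067, 1067)         # ORB, November 2003

for name, rate in [("ABC/ORI", abc_poll), ("IRI", iri_poll), ("ORB", orb_poll)]:
    print(f"{name}: {rate:.0%}")
```

Under those assumptions the rounded results agree with the percentages the reports state.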

Oddly, the comment that Kane picked up from our thread was “quickly”:https://crookedtimber.org/2006/10/15/air-war-in-iraq/#comment-175964 “rebutted”:https://crookedtimber.org/2006/10/15/air-war-in-iraq/#comment-175974 by other commenters, who made the same point as the one above. I guess Kane didn’t wait around to find out.

Accusations or insinuations of fraud are a serious matter, especially in a case like this. I have to say I am surprised — and dismayed — to see this balloon being floated at the “Social Science Statistics Blog”:http://www.iq.harvard.edu/blog/sss. I’m a fairly regular reader of theirs. It’s run under the auspices of the Institute for Quantitative Social Science, an interdisciplinary group at Harvard. Most of the posts are by Harvard grad students, but the sidebar also includes respected heavy-hitters like “Jeff Gill”:http://psblade.ucdavis.edu/ and the Institute’s director, “Gary King”:http://gking.harvard.edu/. The blogosphere being what it is, I expect posts with titles like “Harvard statistics blog says Iraq survey results may be fraudulent” to start popping up pretty soon. I wonder whether Prof. King is aware of Kane’s post, and whether he thinks it’s alright that his Institute is providing a platform to Kane to make his claims of fraud.

{ 101 comments }

1

pidgas 10.18.06 at 1:41 am

The problem with the study, on its face, is that it doesn’t pass the “smell test.” Even the most credulous anti-war activist is shocked by the proposed numbers. Those who look more closely at the methodology have found plenty to criticize. Here, I think, is a particularly stinging critique featuring comments from the authors themselves (telling us to dismiss the appendices b/c they were written by grad students, etc). http://www.opinionjournal.com/editorial/feature.html

No matter what the actual number is, we should take civilian casualties seriously. That is why wild estimates obtained via poorly designed studies potentially do more harm than good. It’s just hard to take them seriously. http://weblogs.swarthmore.edu/burke/?p=288

2

Chris Bertram 10.18.06 at 2:52 am

Marvellous. Pidgas gives us a “stinging critique” in the form of an op-ed from a partner in a Republican-aligned political consultancy, “featuring comments from the authors themselves.” One can only imagine the unedited transcript of the conversation before selection by a Republican-party operative!

3

John Quiggin 10.18.06 at 2:57 am

For those who came in late, David Kane was here in our comments threads defending torture a few weeks ago.

4

Kevin Donoghue 10.18.06 at 3:07 am

Kane’s attack lacks substance but I suppose he deserves credit for having the guts to say what many of the Lancet-bashers are hinting at (with no more evidence than he has).

David, before the corpse of your reputation is consigned to its final resting-place I would like to thank you for getting Les Roberts to release the cluster totals from the 2004 study. That put an end to a lot of arguments about the nature of that sample.

One should always try to find something nice to say about the deceased.

5

thompsaj 10.18.06 at 3:21 am

This editorial doesn’t make it clear how much more uncertain the result is because of its smaller number of clusters. 47 clusters seems roughly comparable to county-level data for California (58 counties, 33.8 million people). Obviously rough, but probably anyone would agree useful. I haven’t seen anyone compare the UNDP study and the Burnham one, but, judging from the reference, they probably measure two different things. Could someone help?

6

thompsaj 10.18.06 at 3:32 am

my gut tells me that this comparison doesn’t pass the smell test and that it’s a technical smokescreen.

7

Chris Williams 10.18.06 at 3:40 am

My gut tells me that David Kane doesn’t pass the smell test.

Hey, it’s really easy to argue when you eschew logic and evidence, isn’t it? Cool. I must do it some more.

8

bad Jim 10.18.06 at 4:20 am

Yes, and how many deaths will it take till he knows
That too many people have died?

Since Emerson isn’t here, I’ll argue his point for him: aren’t we just quibbling over the price?

9

Charlie Whitaker 10.18.06 at 4:22 am

bq. If this is correct, then the only bit of circumstantial evidence that Kane proffers in support of his insinuation is in fact a misconception based on his own ignorance.

Kieran, you could rephrase that leaving out the ‘if this is correct’: I don’t see why the qualification is needed. And a ‘misconception based on ignorance’ – is that really circumstantial evidence or is it just hearsay?

10

John Emerson 10.18.06 at 5:56 am

I think that the two tests which have been most discredited during this debate are the “smell test” and the “gut check”. I think that American intuitions about exotic foreigners are not often reliable sources of information. I personally have trouble believing that the so-called Rwandan massacres ever happened. How could a million people be killed in just a few days, one at a time with machetes and clubs?

The “smell test” and the “gut check” are, of course, the conservatives’ primary data sources. That’s how they figured out that Juanita Broaddrick was raped, and that’s how they figured out that Clinton was a cocaine smuggler. No one else knew those things.

Fox News’s mastery of gutcheck data-collection is what has made them the world’s most trusted news source, except for Weekly World News.

11

Anatoly 10.18.06 at 6:25 am

Chris, in #2, responding to the link given in #1, gives a stunningly authentic example of sticking to the issues. One can always count on Chris to raise the level of the debate!

Meanwhile, if anyone knowledgeable about sampling surveys could actually comment on the charges made, that is, 1) whether 47 cluster points of 40 households each can be considered too few, from the professional point of view, for extrapolating to a population of this size; and 2) whether it truly was weird and unusual for the team to not ask for demographic information from the responders, why, that would be just super.

12

soru 10.18.06 at 6:39 am

bq. aren’t we just quibbling over the price?

Except that if you take one of the higher pre-war death rates shown by other methods, then according to the survey, a majority of governorates have seen negative excess deaths post-invasion, and it becomes a matter of the details of sample weighting, population movement and growth, and survey bias whether the Iraq-wide excess death figure is positive or negative.

Which to me suggests the Iraq-wide net excess death figure is kind of meaningless. Even if the figure was shown to be negative, would it change anyone’s mind about the war, or about what should be done now?

If there had been a rider for the Iraq war bill that tossed a billion onto buying mosquito nets for Africa, preventing a number of deaths clearly bigger than those caused by the war, would that have made it ok?

A more useful, and better-supported, picture is that in Iraq as of now:

Sunnis are much worse off.

Kurds are somewhat better off.

Shi’a are questionably better off.

That’s what they answer when they are asked, and they are kind of in a position to know.

I guess you could add US military and government much worse off, and al Qaeda questionably worse off (their public support has dropped a lot since they got associated with Zarqawi and the slaughter of Muslims). You can continue that assessment of consequences for different groups as long as you like.

The price of all that is the number of people _killed_, something like the IBC count, or the UN ‘war related deaths’ figure, not the difference of the number of people who died in different years, or would have died in unknowable alternate scenarios.

13

John Quiggin 10.18.06 at 6:41 am

Sticking to the issues indeed, Anatoly.

If, as claimed by Kane, the survey was fraudulent, then questions of sample and questionnaire design are irrelevant. There have been several previous posts where attempts to poke holes of this kind have been made with little success, and your comment would seem better placed in one of those.

14

John Emerson 10.18.06 at 6:55 am

I’ll just repeat my #2 idea — this survey doesn’t need to be interpreted in a vacuum. The qualitative first-person reports we have indicate that the Iraqi economy is barely functioning and that random murder is a fact of life everywhere outside the Kurdish areas. Critics allege that things are more peaceful everywhere outside Baghdad, but this can’t be assumed, and there are reports of massacres in the smaller cities too. (Iraq is pretty heavily urbanized.)

From a realistic long-term perspective, an immediate negative outcome wouldn’t rule out an eventual positive outcome, but things seem to be getting worse rather than better, and no one in the world has any confidence or even hope that the occupiers will ever get their trip together. The occupation is being run by incompetent Republican ideologues and legacy children.

Using the gut-check method I deplored above, my feeling (with Kevin Drum) is that the survey overestimates by a factor of 2 or 3. I can easily say this because it doesn’t change my opinion about anything — there’s no threshold for me at 150,000 deaths or 200,000 deaths. The survey is just one more piece in the puzzle, and I don’t really need it. The actual number would have to be much lower still to give me pause.

It is true that by conceding this much, it is possible that I am giving too much credence to a smear job by ignorant hacks, but I don’t understand enough about stats and samples to know whether this is true or not.

15

robert 10.18.06 at 6:57 am

Yikes. Well, you know how David Kane has been harping about Roberts and Burnham not responding to his repeated demands for data? I’m thinking he can probably stop checking his e-mail now for their answers.

16

tim 10.18.06 at 7:04 am

Why would fraud be necessary?
The writings of Daniel Davies may give us a clue to how this survey could be meaningless.

The survey stands on a pre war death rate estimate of 5.5 deaths per thousand.
Yet this would be impossible if infant mortality, as estimated by the WHO and UN, was 10%.
Daniel has argued that a huge dip in infant mortality took place because of the Oil for Food programme somewhere around the turn of the millennium.
He’s not very precise and provides no evidence.
To back up his case he claims that all bodies estimating a 10% infant mortality rate used out of date, projected data.

One of those bodies that estimated 10% infant mortality between 2000 and 2003 was the UNDP 2004.

Daniel argues that no work on the ground was done for any of those estimates.

Yet the Lancet report used the UNDP figures for its cluster sampling.

What if, Daniel, there was no dip in infant mortality, nearly doubling the estimate of the pre war death rate, AND the population data WAS out of date?
(perhaps not representing population movements out of the most war torn governorates – not impossible I’m sure you’d agree.)

This would distort any research to the point of absurdity.

17

Brendan 10.18.06 at 7:30 am

Incidentally, in terms of comparisons, instead of the now completely irrelevant ‘World War 2’ allegory or metaphor, perhaps we should look at other wars in European history for illumination. The Thirty Years War in Germany springs to mind. Like the current Iraqi war, the main motivations (at least in terms of the ongoing civil war) were religious, although, as with this current war, there was much ‘great power’ meddling. Eventually, Germany lost between 15 and 20 percent of its population (up to 60% in some areas): given that the nightmare in Iraq is still young, 2.5% excess mortality hardly strikes me as unlikely. It also seems plausible that, if, as seems likely now, the current Iraqi situation rumbles on for years or decades (even 30 years or more, who knows?), eventually Iraq might lose up to 25% of its population or maybe more. Not that I’m saying this will happen: I’m merely cautioning people not to be surprised if it does.

18

RickD 10.18.06 at 7:34 am

Given that pretty much everybody with a “gut check” feeling somehow feels empowered to throw away a statistically, empirically-derived number and substitute “this is an overestimate by a factor of 2 or 3” as a faux-compromise approach, I think I’ll take the opposite approach, solely to counter-balance this innumerate nonsense.

I think Burnham et al. probably undercounted by at least 35%. So let’s say that the real number should be at least 800k.

I’m feeling very reasonable about this.

19

John Emerson 10.18.06 at 7:35 am

They never quit.

Has anyone here ever been to Iraq? I once knew a guy who claimed to be an Iraqi, and that Iraq was a real place, but he was not a trustworthy guy. I don’t believe that there’s a “war” in “Iraq” any more than I believe that a man walked on the moon. Common sense, people!

20

abb1 10.18.06 at 7:58 am

I’m with RickD (#18); my gut tells me they undercounted.

For one thing, their assertion that non-violent death rate didn’t increase clearly doesn’t pass the smell test – with a million-plus internally displaced, less electricity, less clean water and so on. They tried to be very conservative and they undercounted by 30% – at least.

21

Matthew 10.18.06 at 8:25 am

My gut tells me it doesn’t pass the smell test. And it knows a lot about smells.

22

Kevin Donoghue 10.18.06 at 8:33 am

Re #16: On the effects of oil-for-food on the situation in Iraq.

23

Steve LaBonne 10.18.06 at 8:54 am

My gut sometimes fails to pass the smell test. This generally happens after I have had lunch at Taco Bell.

24

Chris Bertram 10.18.06 at 8:57 am

bq. Chris, in #2, responding to the link given in #1, gives a stunningly authentic example of sticking to the issues. One can always count on Chris to raise the level of the debate!

You mean you believe that we _can_ rely on the accounts Republican political operatives give us of the views that their opponents express to them in telephone conversations. Not only will the accounts such people give be complete, but they will never seek to mislead us, especially when they embed those accounts in op-ed pieces in the WSJ ….

25

Alex 10.18.06 at 9:12 am

Timtroll, young people die less than old people. Iraq has lots of young people, because they have lots of kids. Quite a few kids die, because there is shit in the drinking water. But if the kids grow up into young people, why, they don’t die until they get old.

This is exactly the age distribution every society on earth had before some of us got rich enough to make sure there wasn’t any shit in the water.

Young people die less than old people. There is, however, a well-known phenomenon in which the normal distribution of death is reversed and young people die more than old people, sometimes quite dramatically. It is known as war.

26

Adam Kotsko 10.18.06 at 9:14 am

Maybe Kane would accept the results of the study if he thought that the Iraqi participants confessed under torture. Torture is, after all, the most reliable method for producing truth.

27

Brownie 10.18.06 at 9:19 am

bq. Pidgas gives us a “stinging critique” in the form of an op-ed from a partner in a Republican-aligned political consultancy…

Let me see if I have this straight: you’re not criticizing Moore for being statistically illiterate (because this would be imprudent given his background) but you’ll cast aspersions on his objectivity based on your understanding of his political sympathies?

When it’s pointed out that most (all?) of the JHU study authors are on record as opposing the war in Iraq and that the editor of the publishing journal has spoken (or is that ‘vented’?) at anti-war rallies, up goes the cry of “smear”. Deal with the figures, we are told. Offer a logic/mathematical/statistical critique, comes the call.

Well, Moore has. Fancy picking up on any of his points instead of assassinating his character using innuendo?

Try:

bq. And so, while the gender and the age of the deceased were recorded in the 2006 Johns Hopkins study, nobody, according to Dr. Roberts, recorded demographic information for the living survey respondents. This would be the first survey I have looked at in my 15 years of looking that did not ask demographic questions of its respondents. But don’t take my word for it–try using Google to find a survey that does not ask demographic questions.

for example.

28

Alex 10.18.06 at 9:25 am

Would it make them any less dead?

29

Chris Bertram 10.18.06 at 9:25 am

A bit of googling turns up no end of stuff on Steven E. Moore who is, inter alia, a professional spinner of “good news” on Iraq. An especially fine example of his work is “this piece”:http://www.manhattan-institute.org/html/_latimes-is_iraq_better_off.htm from October 2004, which ridicules John Kerry for saying that Muqtada al-Sadr “holds more sway in suburbs of Baghdad than Prime Minister [Iyad] Allawi.” Doesn’t really pass the smell test, does he?

30

aaron 10.18.06 at 9:26 am

Soru, thanks for posting the most reasonable comment I have seen here in months.

31

Tim Lambert 10.18.06 at 9:26 am

Moore is statistically illiterate.

32

Anatoly 10.18.06 at 9:36 am

Chris Bertram:

bq. You mean you believe that we can rely on the accounts Republican political operatives give us of the views that their opponents express to them in telephone conversations.

No, I mean that the article contained actual statistical arguments which are either correct or not regardless of Moore’s party affiliation. Is it or is it not true that it is standard to ask for responders’ demographic data, and it’s very unusual that this was not done here? Is it or is it not true that 47 cluster points is too few cluster points for extrapolation to a population of this size? Instead, you very reliably focused on your certainty that Moore must have distorted Roberts’ response over the phone.

Tim Lambert:

bq. Moore is statistically illiterate.

Unfortunately, just taking your word for it wouldn’t be a prudent thing to do, because you’ve already shown yourself capable of endorsing shockingly illiterate “rebuttals” of IBC. Do you have any arguments to offer?

33

engels 10.18.06 at 9:37 am

bq. Let me see if I have this straight: you’re not criticizing Moore for being statistically illiterate (because this would be imprudent given his background) but you’ll cast aspersions on his objectivity based on your understanding of his political sympathies?

Brownie, your eagerness to defend Moore is pathetic. Has rallying to the call of American right wingers become second nature to you? Moore is a self-described “political consultant” who worked for the International Republican Institute. He emphatically does not deserve the same respect as a team of respected scientists from a well known university.

And apart from points which have already been rebutted elsewhere, Moore’s “critique” consists of nothing but overblown rhetoric, and his accounts of phone conversations he has with the researchers. It is hardly illegitimate to point out that the latter may not be reliable.

34

engels 10.18.06 at 9:47 am

And BTW I’ve seen quite a lot of posts on Harry’s Place moaning about the clearly contemptible standards of the BBC and the Guardian. Can I take it then, from your enthusiastic defence of this piece, that the news source of choice for the British “decent Left” is now the opinion pages of the Wall Street Journal?

35

John Emerson 10.18.06 at 10:05 am

I am statistically illiterate. That’s why I’ve cautiously granted that the survey might be off by a factor of 2 or 3. It’s likely that if I understood statistics better I’d have more confidence in the survey.

Only about 0.01% of Americans are capable of evaluating a survey like this one. (That’s 30,000 people, and it’s possible that the estimate I pulled out of my butt is a little high.)

But we all still have to form our opinions. The track record of Pajamas Media fact-checkers and statistical analysts is not an excellent one, and that’s a factor I kept in mind in reaching my conclusion.

36

Brownie 10.18.06 at 10:08 am

bq. And BTW I’ve seen quite a lot of posts on Harry’s Place moaning about the clearly contemptible standards of the BBC and the Guardian.

No, I think you are confusing “posts” with “comments”. There has certainly been criticism of the Guardian’s coverage and editorial policy from time to time and rightly so, but I recall at least two posts in the very recent past that defended the BBC against the more absurd charges laid at its door. You can search the archives and find me describing the BBC as still the preeminent news media organisation on the planet.

OT, I’m not defending Moore’s politics for a second. I’m asking why his politics is important if he’s offering a logic-based critique from the perspective of a survey professional. If Moore’s politics are relevant, then why aren’t Horton’s?

Tim Lambert says Moore is “statistically illiterate”. Well, if this is true there are ways of demonstrating it, but calling him a smelly republican isn’t one of them.

37

soru 10.18.06 at 10:16 am

37: unless it can be statistically proven that there are no republicans, or smelly people, who understand statistics.

Anyone seen a study proving that?

38

Alex 10.18.06 at 10:22 am

No, but I feel more research is needed. Consistent Republicanism requires a number of beliefs that are inconsistent with statistics, for example that drug prohibition is both wise and successful, that most people’s incomes really are above average, and much more.

I consider the hypothesis “there are no republicans who understand statistics” extreme, but not outside the bounds of possibility. Possible barriers to its verifiability include the difficulty of distinguishing genuine ignorance from mere dishonesty.

39

Tim Lambert 10.18.06 at 10:54 am

Sure Anatoly, here you go.

40

dsquared 10.18.06 at 10:59 am

My gut tells me that Iraqi doctors are people, with professional reputations to defend, and that accusing them of scientific fraud is quite definitely libellous. Harvard published this piece, and I for one hope that I was not blowing smoke when I used the phrase “and bear the legal consequences”.

41

Tim Lambert 10.18.06 at 11:00 am

Oh, and I’ve emailed Roberts to get his version of what was discussed.

42

tim 10.18.06 at 11:04 am

Re 23.
No stats?

43

abb1 10.18.06 at 11:25 am

I don’t remember much from whatever I knew about statistics, but this whole business about the number of cluster points and households – isn’t it simply reflected in the confidence interval? What’s the problem here?

44

JP 10.18.06 at 11:45 am

bq. When it’s pointed out that most (all?) of the JHU study authors are on record as opposing the war in Iraq and that the editor of the publishing journal has spoken (or is that ‘vented’?) at anti-war rallies, up goes the cry of “smear”.

Without granting the truth of the factual premise above, if I were a statistician and I conducted a statistical study that revealed that my country’s war had killed 655,000 people in the space of three years, then I would probably be against the war too. In fact, if anyone could support the war after conducting such a statistical study, that person would be a sociopath.

45

r4d20 10.18.06 at 11:53 am

bq. Without granting the truth of the factual premise above, if I were a statistician and I conducted a statistical study that revealed that my country’s war had killed 655,000 people in the space of three years, then I would probably be against the war too. In fact, if anyone could support the war after conducting such a statistical study, that person would be a sociopath.

Anyone who made this decision without considering the bodycount of the other alternatives would be naive.

46

thompsaj 10.18.06 at 12:22 pm

#40, thanks, I was wondering why Moore not only didn’t pass my objective observer smell test but also failed the numerical competency smell test. I liked the part where he said he wouldn’t sample a middle school with only 47 clusters: that definitely passed the HILARIOUS! smell test.

47

Lee 10.18.06 at 12:24 pm

“There are lies, damned lies, and statistics.”

Surprised nobody beat me to this one…

48

dearieme 10.18.06 at 12:42 pm

Why do so many people object to “gut test” and “smell test”: after all, it was from those that my opposition to this rash and foolish war began.

49

spartikus 10.18.06 at 12:55 pm

My guess would be there’s a difference b/w saying “my gut tells me there’s way less than 3000 gumballs in that jar” and “my gut tells me it’s going to really hurt if I stick my hand into the woodchipper”.

50

luci 10.18.06 at 12:57 pm

In my Masters paper, I used a household data set, from a survey sponsored by the Indonesian government, performed by RAND and UCLA. It has ~300 clusters, about 30 households each, around 8,000 total households. I’m no expert on cluster-sampling (or stats, for that matter) but it doesn’t seem correct to say that the sample size is overly small in the Lancet paper – the width of the confidence intervals is reflective of the total sample size.

If they greatly enlarged the size of the sample, they could shrink the width of the bands (and assuming no systemic sampling bias, they would shrink around the area of the original point estimate, the 600k number)….

The confidence intervals seem kinda large in the Lancet paper (compared to my limited experience), but it’s relative, and that’s what they’re there for…

Also, the household data I used was a longitudinal survey, which had a response rate of over 90% initially, then with a subsequent wave (three years later) re-interview response rates of 98% of the same households, and then >90% with the third wave three years later still. Response rates can be so high because, after the initial randomized selection of households, the survey workers try hard to interview the selected households – they often keep returning until someone is home. They don’t want to just interview whoever is at home, or whoever is willing/eager to respond, because of the bias that introduces.

I’m sure the Lancet study people are professionals, and they know what they’re doing.

51

Drm 10.18.06 at 1:08 pm

abb1: That depends on how the CI was calculated. Gelman’s main point linked to in the previous post is that clusters are the unit of interest; i.e. each cluster mean is a measurement with its own CI. The individual cluster means may vary widely (and presumably do). The variance of and CI of the 47 cluster means is what’s well determined, not of the 12,000 or whatever individuals sampled.
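A rough sketch of that point, treating each cluster’s death rate as a single observation and building the interval from the spread of the cluster means. The rates below are invented for illustration and are not the Lancet data; `cluster_mean_ci` is a hypothetical helper, and the 1.96 multiplier is a normal approximation (with only a few dozen clusters a t quantile would give a somewhat wider interval). This is not a reconstruction of the authors’ actual analysis.

```python
import math
import statistics


def cluster_mean_ci(cluster_rates, z=1.96):
    """Return (mean, low, high), treating each cluster rate as one observation."""
    k = len(cluster_rates)
    mean = statistics.mean(cluster_rates)
    # The standard error depends on the number of clusters, not on how many
    # individuals were interviewed inside each cluster.
    std_err = statistics.stdev(cluster_rates) / math.sqrt(k)
    return mean, mean - z * std_err, mean + z * std_err


# Hypothetical deaths per 1,000 per year in eight clusters (made-up numbers).
rates = [4.0, 7.5, 9.0, 12.0, 13.5, 18.0, 25.0, 31.0]

mean, low, high = cluster_mean_ci(rates)
print(f"8 clusters:  mean {mean:.1f}, interval {low:.1f} to {high:.1f}")

# Same spread, ten times as many clusters: the interval narrows around the same mean.
more_mean, more_low, more_high = cluster_mean_ci(rates * 10)
print(f"80 clusters: mean {more_mean:.1f}, interval {more_low:.1f} to {more_high:.1f}")
```

The design choice worth noticing is that adding households within a cluster would not change `k`; only adding clusters does, which is why the number of clusters matters so much for the width of the interval.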

More than likely the cluster means will reveal a certain amount (maybe a lot) of non-random structure in the data; e.g. high death rates for the 12 Baghdad clusters compared to rural clusters, etc. I assume that’s why cluster models are used in these kinds of surveys (speaking as a lab scientist, not a social scientist) so that you can estimate geographic effects, etc.

Relating the cluster means to the population as a whole requires some understanding of the population structure and how well your clusters model it. From what I can tell that level of detail is missing from the Lancet paper.

Beyond that, like most people I lack a well developed intuition regarding death rates.

52

Barry 10.18.06 at 1:10 pm

“I don’t remember much from whatever I knew about statistics, but this whole business about the number of cluster points and households – isn’t it simply reflected in the confidence interval? What’s the problem here?”

Posted by abb1

That’s what I think. Warning, I’m a statistician, but haven’t dealt with designing cluster sampling, just analyzing some resulting data. In terms of random effects, it seems quite sound.

53

JP 10.18.06 at 1:20 pm

Anyone who made this decision without considering the bodycount of the other alternatives would be naive.

Seeing as how the Lancet study measured excess deaths, your comment is totally irrelevant.

Oh wait, I suppose Saddam could always have attacked us with his WMDs!

54

Sebastian Holsclaw 10.18.06 at 1:27 pm

What about the time of the interviews? Unless the paper disclosed something improperly, it would appear that they did 40 interviews per day. That is a rather aggressive schedule. Whether that goes to evidence of careless design (improper interview technique) or fraud can’t be determined without releasing the underlying data.

And what is up with not collecting any demographic data about the interviewees? That seems highly unusual (though it would make the interviews faster).

55

John Emerson 10.18.06 at 1:34 pm

Sebastian, where you been?

I think that gutchecks and sniff tests require some baseline of knowledge about the topic sniffed, and I don’t think that many of the sniff-test-type critics have a good baseline on either Iraqi demographics, on statistics, or on conditions in Iraq.

What they were saying was “Anything that inconveniences my prowar opinion is implausible.”

56

jiffy 10.18.06 at 1:39 pm

I don’t agree with the disparagement of “gut checks” or “smell tests” or “reality tests.” Of course, just because the results of an analysis conflict with someone’s intuition, doesn’t mean the analysis is wrong. And just saying “600,000 sounds too high” isn’t a particularly useful reality check. Nevertheless, even though I haven’t found any of the methodological challenges to the paper that I’ve seen so far particularly persuasive, I think the type of specific “reality check” questions raised, for example, by IBC give some pause. If your model says that bumblebees can’t fly, maybe there’s a problem with your model, even if you can’t figure out exactly what that problem is. I’m not saying that the IBC points are a “refutation” of the Burnham paper either. Maybe the most reasonable thing is to conclude that the Burnham paper is one of a series of “data points”–in this case on the high side–and that we just need multiple duplications to feel completely confident of any particular number. I have little question that the number of casualties resulting from the Iraq war is high, but I don’t feel confident relying too much on the results of the Burnham study.

57

Barry 10.18.06 at 1:41 pm

Sebastian, IIRC the timing has been covered previously, either here or on Deltoid (frankly, I’m pretty tired from fielding BS, and won’t look it up). The teams split up into pairs (presumably one man, one woman). That’d give two interview ‘teams’ per cluster. This would result in 20 interviews per team per cluster. Presumably, in households where there had been no deaths in the past few years, things would have gone faster.

For others reading this thread, Sebastian has been one of the die-hard deniers on Obsidian Wings. His major objection was that 600K deaths in a few years was a ‘WWII’ level of killing. Considering the Rwandan genocide alone, this is a very weak objection, IMHO.

58

spartikus 10.18.06 at 1:50 pm

Sebastian’s had his question addressed already, which begs the question why he is asking it again.

59

Barry 10.18.06 at 2:00 pm

What I’d like to know is, who is David Kane? I checked the Harvard website, and couldn’t come up with much (a short bio is at: http://www.iq.harvard.edu/People/people.php?info=166&sub=6). He seems to be a cut above the innumerate deniers, but that isn’t saying much.

I do note that he’s been very careful to avoid statistical attacks on the study, which lends me much confidence that the study is statistically sound. Instead, he snipes at things which are difficult/impossible to disprove.

60

Kevin Donoghue 10.18.06 at 2:13 pm

Now, Now, Spartikus, don’t be mean to Sebastian. He mentioned in that thread that he was busy suing people or something. (Somewhere in all this I spotted a statistician’s joke about the problems of sampling the set of ethical lawyers.)

Demographic info, such as the ages of the people in the household, would certainly be nice to have. Tim Lambert mentions the matter in his post on the WSJ “critique”. But it probably doesn’t greatly affect the results. Incidentally I’m not sure they didn’t gather such data, but supposing they didn’t it was presumably to save time.

In a way it seems to me the critics are clutching at straws here, saying (a) the interviewers should have asked more questions, and (b) they couldn’t possibly have covered all the questions they asked in the limited time they gave themselves.

61

engels 10.18.06 at 2:21 pm

I think that gutchecks and sniff tests require some baseline of knowledge about the topic sniffed

They also require that those doing the sniffing are not up to their eyes in shit.

62

Jim Harrison 10.18.06 at 2:22 pm

The official body-count numbers for Iraq are more likely to be fraudulent than the estimates of the Lancet study. After all, the people who put out the government numbers have a long history of deceit while the academics who performed the Lancet study apparently have spotless reputations.

I’m hardly surprised that the apologists for the invasion are reaching for any available way to minimize the atrocity that is Iraq. If I had so much blood on my hands, I’m sure I’d be stealing in my mind, too.

63

roger 10.18.06 at 2:30 pm

A limited time would seem to go along with the high probability of being killed. I imagine that surveys in the U.S. would go much faster if the kill rate for reporters, scholars, doctors and strangers poking into neighborhoods was a lot higher. We’d get those surveys done ultra-fast!

It seems to me that stage two in the discussion about the survey should begin, which is: if the picture of the war in Iraq is so confused by the Western media that we only catch a glimpse of it – why is that confusion there? For instance, look at the frequency with which, over the past three years, the media has mentioned the names of the varied leaders in Iraq. I would bet anything that leading the pack would be Chalabi. And I would bet anything that those who see that name in a news story have no idea that Chalabi represents maybe less than one percent of Iraqi public opinion – his gains at the polls put him below the level of support enjoyed by Dennis Kucinich in this country.

This isn’t really about reporters not being able to get out of the Green Zone, it is about the Green zone being, apparently, hard to get out of the reporters. Everything about Iraq is filtered through not only how it affects Americans, but through the small contingent of American policymakers in D.C. that shape the news. Distortion piles on distortion. When small bits come through, they come through either years later — for instance, the recent Washington Post story about the police school in Baghdad that a U.S. contractor built that is built so badly that, periodically, feces from the sewage pipes rain down upon the police cadets, but the Washington Post has been writing, for years, stories about the marvels of the way the U.S. is training Iraqi police cadets – or they go through the process of being well known without being reported – the way the Downing Memo, which reported that the Administration was fixing the intelligence for the war, was dismissed as something everybody in D.C. knew when it came out. It was so well known there was no reason to report it.

All of which leads to the question of what the war looks like through Iraqi eyes. That should be more than just an occasional human interest story.

64

Barry 10.18.06 at 2:38 pm

Roger: “All of which leads to the question of what the war looks like through Iraqi eyes. That should be more than just an occasional human interest story.”

I imagine it’d be like a war movie (for us), vs being in a war (for them).

65

Kevin Donoghue 10.18.06 at 2:40 pm

Barry,

What I’d like to know is, who is David Kane?

David Kane comments at Deltoid fairly regularly. He seems to know a reasonable amount about stats and especially stats software (as your link would suggest). Yes, he is way smarter than the average denialist, not that that’s high praise. As I remarked above, he got Roberts to release cluster-level data for the 2004 study. Then he started pressing him for even more detail and was politely told to get lost.

Your link also suggests he has a bit of wealth. He might be wise to ensure none of it is invested in jurisdictions which give the plaintiff the edge in libel actions.

66

Matt Weiner 10.18.06 at 2:51 pm

67

Henry 10.18.06 at 2:52 pm

It would appear that the offending post has been “removed”:http://www.iq.harvard.edu/blog/sss/archives/2006/10/removed_a_case.shtml#more at least from the blog’s main page (they don’t seem to have gotten around to deleting the post itself yet). Explanation from Amy Perfors (who seems to have posted the original thing on Kane’s behalf in the first place).

Amy Perfors

David Kane’s most recent guest post about the Lancet study has been removed. Since this is not a normal practice for us, explanations for why (and why we posted it in the first place) are below the fold.

Why remove it? The tone is unacceptable, the facts are shoddy, and the ideas are not endorsed by myself, the other authors on the sidebar, or the Harvard IQSS.

Why post it in the first place, given this? Here I admit to an error in judgment on my part. I see my job as head of the Author’s Committee as doing the somewhat mundane and boring tasks of coordinating and inspiring our posters, not exercising editorial control. I was uncomfortable with the post even before putting it up, but I also hate censorship, and — since I don’t know this field or this study very well — I couldn’t say with complete confidence that my discomfort was totally justified. I decided to err on the side of expressing something I was uncomfortable with, rather than stifling it. Again, that was probably an error with regards to this post, and I apologize. It was not up to the standards we aspire to here, and does not reflect our views.

68

hilzoy 10.18.06 at 2:57 pm

spartikus: “Sebastian’s had his question addressed already, which begs the question why he is asking it again.”

Kevin D.: “Now, Now, Spartikus, don’t be mean to Sebastian.”

Me: And if you are going to be mean to Seb, at least don’t misuse ‘begs the question’ while you’re doing it. The fact that Seb had his question answered already might make one wonder why he’s raising it again (though personally I think the answer to that is obvious, namely: he wasn’t satisfied with the answer); but it in no way begs the question.

Hmmph.

69

thetruth 10.18.06 at 3:00 pm

“Good heavens, Chomsky is an apologist because he wanted to avoid the fact of Cambodian genocide to make his ‘greater’ point about the fact that all evil comes from the US (only very very mild exaggeration on the word ‘all’). And what is this Hutu, thing? Chomsky didn’t ‘confuse’ two acts of genocide. He denied the existance (sic) of Pol Pots (sic) genocide far beyond the time when it became clear”.
-Sebastian “It’s OK If You Are A Republican” Holsclaw

70

Matt Weiner 10.18.06 at 3:11 pm

though personally I think the answer to that is obvious, namely: he wasn’t satisfied with the answer

Hilzoy, then I think his behavior is unwise. If Sebastian is aware of the previous answer and wants a better one, then he should refer back to the previous answer, say that he finds it unsatisfactory (and hopefully why), and ask again. Otherwise he can only expect to get the original answer again, as he did. Accompanied, perhaps, by a polite inquiry as to why he wasn’t satisfied with it, which I hereby make.

Before the link to the old answer is posted, the effect on readers of this thread is to make it seem as though the question hasn’t been answered, which is misleading. I don’t think this misleading is intentional; looking at that thread Sebastian may not have seen the end of it.

As for ‘begging the question’, I’m not sure descriptivists are entitled to complain. Although when people do or should know our use of the term we are.

71

spartikus 10.18.06 at 3:16 pm

though personally I think the answer to that is obvious, namely: he wasn’t satisfied with the answer

So….he uses his limited time by repeating the question on another blog, rather than following up on the response to his original?

This does not make sense to me. And if that makes me mean, I’m mean.

And Hilzoy, I wasn’t accusing SH of “begging the question”. But I can understand the confusion. What can I say, I’m a child of my times.

72

Anatoly 10.18.06 at 3:22 pm

Tim: thanks for the post, and the nice link to the explanation that sample size need not depend on population size. It’s nice to learn something new, especially when it’s counterintuitive. It seems Moore is either ignorant or deliberately misleading when he speaks of 47 clusters being not enough for the population that large.

The fact that demographics weren’t recorded for respondents remains unexplained (though it’s not, by itself, hugely suspicious). I don’t think your rebuttal here (that it’s just a nitpick at a slight deviation from the High Standards) works. If (and Moore may be false here again, for all I know) it is indeed standard practice (and not just the Right Thing to do) to record that data, and if it’s extremely unusual and weird to not do so, then it remains a puzzling and possibly suspicious fact. However, Moore can’t be trusted at all after the sample size thing, so I’ll just try to get a definite answer from an expert on this and suspend judgement till then.

In my opinion, which doesn’t have to interest anyone of course, Moore is very likely a shill, while IBC’s criticism remains convincing and unrebutted.

73

pidgas 10.18.06 at 3:24 pm

Tim Lambert, I’m sure you wouldn’t mind posting the content of your email to Dr. Roberts and his verbatim reply. Given your clear bias, I don’t trust you to ask non-leading questions and quote his reply accurately.

Also, a couple comments about your withering critique of the “statistically illiterate” Mr. Moore. First, regarding the LA Times article, randomly sampling 20 households from 75 clusters is very different from picking 50 clusters by the population proportionate to size model and then randomly selecting street corners from which to deterministically survey 40 households. Very different.

Second, demographic info isn’t fluff. It is essential to validate your sample. We cannot even begin to validate this sample. We just have to cross our fingers and hope that it is representative of the population at large.

Third, it is true that sample size calculations do not take population size into account. You’re a genius. Of course Moore isn’t talking about sample size, he’s talking about the number of clusters for a given population size. They aren’t the same.

Cluster surveys almost always need a larger sample size than simple random samples to overcome what is called the “design effect” (people within a cluster are often similar, so sampling one more from a cluster doesn’t tell you as much about the population as sampling one more at random). Design effect = 1 + (cluster size – 1) * variability measure. If the design effect is 3, you need a sample size three times that needed for a simple random sampling scheme. Of course the cluster size is directly related to the population size (pop/number of clusters). So in general the larger your clusters, the larger your design effect. This becomes especially important if the intraclass variation (the other factor used to calculate design effect) is uncertain (e.g. say a cluster is a household…it would be low for ethnicity, but potentially high for age and education level). Roberts calculates a single sample size of ~12000 people. They get this number using a measure of variation. They did a lot of fancy stuff to arrive at the estimate…fine. The relevance of the cluster size is that the cluster size MAGNIFIES any errors in that estimate dramatically. Therefore, the number of clusters per population size is potentially quite relevant. It is unfortunate that you seem to think that the number of clusters per population size is the same as sample size. Who, again, is statistically illiterate?

Fourth, you say that it’s easy to see if the sample size was big enough by looking at the CIs. Ok, let’s do that. The 95% CI for their primary outcome (deaths since invasion above expected) is 654 965 (392 979–942 636). Their CI is almost 85% of their estimate. Hardly a tight CI. Moreover, their CI for violent death seems non-sensical. They estimate violent deaths were 601 027 (426 369–793 663). But wait! How can you be 95% confident that ALL deaths were greater than 392 979 and 95% confident that VIOLENT deaths were greater than 426 369 at the same time? If these numbers were consistent the CI for violent deaths would cross below the lower bound for all deaths OR the lower bound for all deaths should be above the lower bound for violent deaths.
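The arithmetic behind those two observations can be sketched as follows. The figures are the ones quoted in this comment; `relative_width` is a hypothetical helper, and the check only restates the published bounds without settling whether they are actually inconsistent.

```python
def relative_width(estimate, low, high):
    """Width of the interval as a fraction of the point estimate."""
    return (high - low) / estimate


# (estimate, lower bound, upper bound) as quoted in the comment above
all_excess = (654_965, 392_979, 942_636)
violent = (601_027, 426_369, 793_663)

print(f"all excess deaths: {relative_width(*all_excess):.1%} of the estimate")
print(f"violent deaths:    {relative_width(*violent):.1%} of the estimate")

# True when the violent-death lower bound sits above the all-cause lower bound
violent_floor_is_higher = violent[1] > all_excess[1]
print(violent_floor_is_higher)
```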

Finally, if it matters that Mr. Moore “supports the war” or is a “republican spinmeister” doesn’t it matter that the lead author on this study was an anti-war Democratic candidate for Congress in New York and that the editor-in-chief of the Lancet (Dr. Richard Horton) has a nice foaming at the mouth anti-war rant on YouTube?

74

robert 10.18.06 at 3:34 pm

anatoly wrote:

The fact that demographics weren’t recorded for respondents remains unexplained […] If (and Moore may be false here again, for all I know) it is indeed standard practice (and not just the Right Thing to do) to record that data, and if it’s extremely unusual and weird to not do so, then it remains a puzzling and possibly suspicious fact.

Depends on what the purpose of the survey is. The purpose in this case was to get estimates of crude death rates, and also some detail on the deceased. For that, all you need to know of the surviving household population is how many and how long. As a general rule it’s better to keep your questionnaire as simple and short as possible, and that means don’t ask questions you’re not going to analyze.

75

Sebastian Holsclaw 10.18.06 at 3:40 pm

I don’t see the link as answering my question, unless the only question was “was there a death in the household”. That would be odd in such a survey (though the apparent lack of surveying demographics could of course make things quicker though more dubious–as I already mentioned). Ten minutes is unreasonably fast per survey of a 6+ person household unless your survey is super-cursory. 15 minutes gets us to some really long work days if you add in non-zero amounts of travel time. 20 minutes (which would be non-shocking) gets us really going.

76

Jim Johnson 10.18.06 at 3:42 pm

I have not read through all the comments (either here or there), so I may be simply repeating what someone else has said. Sorry. But the premise of Kane’s charge of fraud seems to be that the data on which ‘the Lancet study’ was made are being collected by people with an agenda – those on the ground doing the survey are people who were against the war in the first place and so have an axe to grind. OK, fair enough. But how does that not reflect back on US Military/Government estimates. Isn’t their data collection being conducted largely by people who were “for” the war in the first place (or who now have reason not to look like blunderers), with the analogous incentive to “make shit up?” Perhaps I’m missing something. But I guess I am reluctant to think that “official” statistics are more reliable just because they are “official.” Perhaps officials have an agenda? Just a thought.

77

spartikus 10.18.06 at 3:57 pm

Ten minutes is unreasonably fast per survey of a 6+ person household unless your survey is super-cursory.

Possibly. I don’t work in the industry and wouldn’t know if 10 minutes is completely inadequate.

But, as I believe the survey says, there were teams of 4 and each team interviewed 40 households/day. That’s 10 households per surveyor per day, and that would lead one to believe each interview could be potentially longer than 10 minutes. More than 20 minutes, even.
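A rough sketch of that arithmetic, for anyone who wants to try other assumptions. Whether the four interviewers worked singly, in pairs, or as one group is exactly what is disputed in this thread, so `parallel_units` is a hypothetical parameter, and the minutes per household are the figures floated above rather than anything reported in the paper.

```python
def field_hours(households_per_day, parallel_units, minutes_per_household):
    """Hours of door-to-door work per day for each interviewing unit.

    minutes_per_household is door-to-door time, including the walk to the
    next house.
    """
    households_per_unit = households_per_day / parallel_units
    return households_per_unit * minutes_per_household / 60


# 40 households per cluster per day, as discussed in the thread
for units in (1, 2, 4):  # one group, two pairs, four solo interviewers
    for minutes in (10, 15, 20, 30):
        hours = field_hours(40, units, minutes)
        print(f"{units} unit(s), {minutes} min each: {hours:.1f} h/day")
```

The same 40 households can mean a very long day or a short one depending on how the team splits up, which is why the two sides reach such different conclusions from the same number.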

78

pidgas 10.18.06 at 4:01 pm

I just read back my post and want to clarify. The number of clusters for a given population size is very important and NOT the same as sample size. REPEAT, number of clusters for a given population size is not equivalent to sample size.

For a cluster sample survey, people talk about the “design effect.” Mathematically, the design effect is as follows:

Design effect = 1 + intracluster correlation for the statistic in question * (cluster size – 1).

Roughly, if the design effect is three you need a sample size three times larger for the cluster design than you need for a simple random sample design. It has to do with the fact that sampling another person from a cluster does not give you as much information about the population as sampling another person entirely at random.
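As a minimal sketch of the formula, assuming `cluster_size` means the number of units sampled per cluster (the usual textbook reading) and using an illustrative intracluster correlation that is not taken from the paper:

```python
def design_effect(icc, cluster_size):
    """Design effect = 1 + icc * (cluster_size - 1)."""
    return 1 + icc * (cluster_size - 1)


def clustered_sample_size(srs_sample_size, icc, cluster_size):
    """Sample size a cluster design needs to match a simple random sample."""
    return srs_sample_size * design_effect(icc, cluster_size)


# Illustrative values only: icc = 0.05 with 41 units per cluster gives a
# design effect of 3, so a simple random sample of 1,000 corresponds to
# roughly 3,000 under this cluster design.
deff = design_effect(0.05, 41)
needed = clustered_sample_size(1000, 0.05, 41)
print(deff, needed)
```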

Fewer clusters for a given population increases cluster size. This, in turn, magnifies any errors in the estimate of correlation within clusters when calculating the design effect. If your intraclass correlation estimate is too small, your sample size might also be too small. But this “design effect” is particularly pronounced if your cluster size is large.

Moore’s point is that the fewer the clusters used for a given population, the more you magnify any errors in the estimate of correlation. His observation is that most studies use far more clusters for populations the size of Iraq. This doesn’t matter if their estimate of correlation was accurate. It matters a great deal if it was off…even just a little.

79

Anatoly 10.18.06 at 4:04 pm

pidgas, I may be wrong, but I think the study kept the cluster size fixed at 40 households, and calculated how many clusters they would need to sample in order to get the total sample size large enough for their desired CI bounds. The total number of clusters in the country is fixed (number of households/40). 50, later corrected to 47, was the size of their sample for sampling the clusters. Just as with simple random sampling the sample size does not depend on population size, here when sampling the clusters the sample size need not depend on the total number of clusters (and thus ultimately on population size). The sampling error will be greater because of clustering, inherently, but not in proportion to population size. Or am I wrong here?
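The simple-random-sampling half of that claim can be checked numerically. This is a sketch under stated assumptions: it uses the standard error of a proportion with the finite population correction, ignores clustering altogether, and the population sizes and the proportion are arbitrary illustrative values.

```python
import math


def se_proportion(p, n, population):
    """Standard error of a sample proportion under simple random sampling,
    including the finite population correction."""
    fpc = math.sqrt((population - n) / (population - 1))
    return math.sqrt(p * (1 - p) / n) * fpc


n = 1000
p = 0.1  # arbitrary illustrative proportion
errors = {N: se_proportion(p, n, N) for N in (100_000, 1_000_000, 27_000_000)}
for N, se in errors.items():
    print(f"population {N:>10,}: standard error {se:.5f}")
```

With the sample size held fixed, the standard error barely moves as the population grows by a factor of several hundred, which is the counterintuitive point made above.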

(I spent some time with a statistics tutorial because I wanted Tim Lambert to be wrong here, but he isn’t, I think).

(some confusion arises because “number of clusters” can be used to mean both “total large number of relatively small clusters into which we’re dividing the population” and “the small number of clusters we randomly sampled from the former and queried”. 47 is the latter number. The former is some large quantity that’s not specified explicitly in the paper)

80

Kevin Donoghue 10.18.06 at 4:15 pm

I don’t see the link as answering my question, unless the only question was “was there a death in the household”.

Sebastian,

The linked comment (mine) made clear that there are a few unavoidable questions, even if there were no deaths in the household: at a minimum, numbers in 2002, births and other additions, departures etc. Still I can’t see why it should take very long. As Robert says, there is no point in gathering data which you don’t plan to analyse.

Look, you’re a lawyer so let’s put it like this: you could do it in ten minutes and bill for thirty.;)

81

Sebastian Holsclaw 10.18.06 at 4:39 pm

“The linked comment (mine) made clear that there are a few unavoidable questions, even if there were no deaths in the household: at a minimum, numbers in 2002, births and other additions, departures etc.”

It doesn’t make that clear. In any well designed survey you have to at a very minimum describe to the people what you mean by a “household”. You have to describe in general what you are doing. You have to introduce yourself and your partner. You should, but they apparently did not, take demographic information for validation. You have to describe the dates in question. You have to check to be sure that the responding people are certain about the dates when they respond. You have to travel to and from the household. You have to knock on the door and give them time to answer. Every little thing takes time. The time in question isn’t just the time of “the survey”. The time in question is the time from the beginning of one survey to the beginning of the next survey. Except with ridiculously rushed work and amazingly brief questions, the time from knocking on the door to knocking on the door, introducting yourself, actually giving and recording the survey to knocking on the next door should easily be 15 or 20 minutes. And if it goes to 25 (and frankly 30 seems more reasonable) you get into the realm of nearly impossible given the number of interviews per day. I’m not suggesting that is indicative of fraud, I’m suggesting it is indicative of poor survey design (especially in light of the lack of verifying demographic information which is absolutely routine and in light of the strong refusal to release the data).

82

Sebastian Holsclaw 10.18.06 at 4:41 pm

“As Robert says, there is no point in gathering data which you don’t plan to analyse.

Look, you’re a lawyer so let’s put it like this: you could do it in ten minutes and bill for thirty.;)”

You gather the demographic data so you can see if you got a representative sample.

And I’m thankfully in corporate now, so timesheets are a thing of the past. :)

83

Kevin Donoghue 10.18.06 at 4:56 pm

I appreciate that it’s good to have demographic data to check on the quality of the sample. But there is a trade-off involved: more time spent with each household means you do fewer households in a given time; or else you increase the time spent and hence the risk to the team.

It galls me a bit that people criticise these guys for not spending more time in the field, but no reporter has taken up the simple challenge presented by Les Roberts: visit a few graveyards and check the number of burials in each year. That would be a real “reality check”, in contrast to the IBC report with that title.

84

pidgas 10.18.06 at 6:29 pm

Anatoly,

In a classic cluster sample survey (http://www.childrens-mercy.org/stats/weblog2004/cluster.asp or http://en.wikipedia.org/wiki/Cluster_sampling), the clusters cover the whole population in a non-overlapping fashion.

For a cluster sample survey, adjusted sample size = unadjusted sample size * design effect. Moreover, the number of clusters = adjusted sample size / cluster size. Fewer clusters (i.e. using a larger cluster size) cause any errors in the intraclass correlation estimate to be amplified substantially.

This increases the likelihood that the sample will be of the wrong size and the likelihood that your conclusions will also be wrong. Moore is not wrong to bring this up as a critique of the study.

With respect,
Pid

85

Rich Puchalsky 10.18.06 at 6:31 pm

Yes, the denialists here have a point — instead of six million, it might be only 600,000.

Or wait, am I mixing up denialists again? Screw Godwin’s Law.

86

snuh 10.18.06 at 7:01 pm

of course the other answer to sebastian’s question (about how they could interview so many people so quickly) is the subject of kieran’s post. with a response rate like that, they should’ve had no trouble getting the interviews done quickly.

87

anthony 10.18.06 at 7:23 pm

More’s the point, the surveyors were medical professionals – whether it be in a busy practice or doing the rounds in a hospital, these people are professionals at moving quickly and effectively. I don’t find it unlikely at all.

88

John Quiggin 10.18.06 at 8:18 pm

Interestingly, DK himself seems to have disappeared from view – he didn’t respond to comments on his now-vanished post (unless in the interval between my last viewing and its disappearance) and I can’t see anything from him in this thread.

And the denialists above seem eager to back away from his fraud accusation, even though it’s the only coherent basis for challenging the results. Quibbles about cluster sampling and so on (all answered at tedious length last time around in any case) aren’t going to change the finding that the number of excess deaths is huge.

89

thetruth 10.18.06 at 11:55 pm

As Riverbend says:

We literally do not know a single Iraqi family that has not seen the violent death of a first or second-degree relative these last three years. Abductions, militias, sectarian violence, revenge killings, assassinations, car-bombs, suicide bombers, American military strikes, Iraqi military raids, death squads, extremists, armed robberies, executions, detentions, secret prisons, torture, mysterious weapons — with so many different ways to die, is the number so far fetched?

But what weight can be given to that testimony, compared to the hand-washing bleatings of newly forged Republican statistical wizards vomiting up objection after objection to the data that asserts matter-of-factly that they are responsible for the deaths of 600 thousand human beings?

90

robert 10.19.06 at 1:43 am

In #82, Sebastian Holsclaw wrote:

In any well designed survey you have to at a very minimum [long list of duties snipped]. The time in question isn’t just the time of “the survey”. The time in question is the time from the beginning of one survey to the beginning of the next survey. Except with ridiculously rushed work and amazingly brief questions, the time from knocking on the door to knocking on the door, introducing yourself, actually giving and recording the survey to knocking on the next door should easily be 15 or 20 minutes. And if it goes to 25 (and frankly 30 seems more reasonable) you get into the realm of nearly impossible given the number of interviews per day. I’m not suggesting that is indicative of fraud, I’m suggesting it is indicative of poor survey design

To put things in perspective, the 2004 ILCS questionnaire was, in its English translation, 60 pages long and median interview time was 84 minutes. In order to get a full cluster of 40 households done by one team of four interviewers in one day in the Burnham study, the total time per household had to be short. This is exactly the kind of situation where one would want to cut the number of questions down to the bare minimum. Anyway, go take a look at the ILCS questionnaire and imagine doing that in 84 minutes.

91

Brett Bellmore 10.19.06 at 6:18 am

Talk about refusing to see the forest for the trees!

It’s a commonplace observation that sampling error is often the smallest source of error in a poll; it’s merely the easiest to quantify. Even in relatively peaceful countries like the US, it’s understood that poll results will often be warped by people being reluctant to give answers they think might offend the pollster, which is why, for instance, controversial ballot proposals often do surprisingly better on election day than the polling suggests they will.

Anyone care to guess the magnitude of that effect in areas where there are active insurgencies going around murdering whole families who offend them in some way?

Polling just isn’t reliable in war zones or countries with secret police, and there’s no point to pretending otherwise. You can do all the math you like concerning sample sizes and clustering, and it’s all beside the point if your samples’ chief concern is with figuring out what responses have the least risk of getting them murdered.

92

John Emerson 10.19.06 at 6:39 am

Yes, Brett, the people of Iraq figured out that American and European peaceniks would murder them if they gave the wrong answer.

93

Alex 10.19.06 at 7:11 am

…having first issued them with a phony death certificate. Seriously, this is like trollageddon – their first lot of talking points is cannibalising the later ones, like the revolution and its children.

94

Donald Johnson 10.19.06 at 7:23 am

I’m trying to figure out if Brett’s point about the dangers of living in a horrible place like Iraq means that the survey is overestimating the death toll.

But maybe he’s only trying to lower the percentage of deaths attributed to Americans–the problem with that is that there might also be incentives to understate any particular faction’s contribution and attribute the deaths to “unknown”.

95

Uncle Kvetch 10.19.06 at 7:44 am

I can only quote Neel Krishnaswami, in a comment thread on this subject over at High Clearing:

Iraq is too dangerous a place to even merely count the dead, which ought to embarrass any remaining apologists for the invasion.

What else is there to say?

96

Barry 10.19.06 at 8:08 am

“Polling just isn’t reliable in war zones or countries with secret police, and there’s no point to pretending otherwise. You can do all the math you like concerning sample sizes and clustering, and it’s all beside the point if your samples’ chief concern is with figuring out what responses have the least risk of getting them murdered.”

Posted by Brett Bellmore

Brett, stop lying. The IBC figures are 50K civilians killed by violence, whose deaths were covered by English-language media, accessible online. Multiplied by the factors indicated by references 11,22-26, this means *at least* 250K civilians killed by violence. Not ‘all deaths, from all causes’, but ‘civilians killed by violence’. And that’s a low-end estimate, for a sub-set of the deaths covered by the Lancet study.

97

asg 10.19.06 at 11:02 am

A CT comment thread wouldn’t be complete without a post like #86, so we should all be thankful that someone stepped up.

98

MQ 10.19.06 at 4:18 pm

I’m stepping in late, but I wanted to respond to Pidgas above, since he appears to be one of the few genuine stats types (ever) weighing in on the “anti-study” side here for Lancet I or II. Yes, cluster studies depend on assumptions about the across-cluster mortality rate, and when those assumptions are wrong it will bias results. But he does not take into account the robustness checks that were reported in the paper, such as the use of bootstrapped confidence intervals that should have shown higher variance if the across-cluster variance assumptions were far off. This study is good methodologically. Also, Pidgas seems to have a problem with a wide confidence interval. Here he does seem to be simply tendentious — a wide CI is not a problem, it just determines what conclusions one can draw. Here the conclusions are sobering anywhere within the CI reported. Finally, there is no contradiction with having a lower bound to the overall deaths CI that is below the violent deaths CI, since it just means that overall deaths have a much greater variance. Again, this seems tendentious, as it seems like someone with his knowledge would understand this.

Personally, the figures reported in Lancet II do not necessarily seem so far off by my “gut check”. This is a major war that has been going on for over 4 years now; we know modern military technology causes high casualties. We have been getting reports out of Iraq of brutal, constant violence for years. Civilian Iraqis have reported huge death tolls in their own neighborhoods that were so routine that they did not make the press:

http://www.salon.com/opinion/feature/2006/10/19/riverbend/

But I don’t know that the Lancet II numbers are right, no one does. There are lots of opportunities for non-sampling error here. One, pointed out in the paper, is differential migration from high-mortality areas. The sampling design was based on regional population counts from several years ago. If people have migrated from high mortality areas (as seems likely), these counts could now overrepresent populations in dangerous areas. This would bias death counts high.

99

mq 10.19.06 at 6:34 pm

mq, that seems a reasonable post. Stop that.

100

Steven Moore 10.21.06 at 1:16 am

Kane says, “I can not find a single example of a survey with a 99%+ response rates in a large sample for any survey topic in any country ever.” I googled around a bit looking for information on previous Iraqi polls and their response rates. It took about two minutes. Here is the methodological statement for a poll conducted by Oxford Research International for ABC News (and others, including Time and the BBC) in November of 2005. The report says, “The survey had a contact rate of 98 percent and a cooperation rate of 84 percent for a total response rate of 82 percent.”

Average response rates in Iraq are, as of last summer when I was there conducting surveys, between 80% and 85%. This is a long way from 99.2%.

Here is one from the International Republican Institute, done in July. The PowerPoint slides for that one say that “A total sample of 2,849 valid interviews were obtained from a total sample of 3,120 rendering a response rate of 91 percent.”

Above average response rate, but still a long way from 99.2%.

And here is a report put out in 2003 by the former Coalition Provisional Authority, summarizing surveys conducted by the Office of Research and Gallup. In the former, “The overall response rate was 89 percent, ranging from 93% in Baghdad to 100% in Suleymania and Erbil.” In the latter, “Face-to-face interviews were conducted among 1,178 adults who resided in urban areas within the governorate of Baghdad … The response rate was 97 percent.” So much for Iraqi surveys with extraordinary response rates being hard to find.

You might agree that conducting surveys was considerably easier in Iraq in 2003.

The Lancet report study is bizarre in so many ways – ten times the number of deaths reported by anyone else, a spectacularly high response rate, no demographic questions were asked and we are expected to believe that as 2.5% of the population is killed, the bureaucracy is dutifully churning out death certificates for 92% of the deaths.

Bizarre.

101

David Kane 10.21.06 at 3:09 pm

John Quiggin writes:

Interestingly, DK himself seems to have disappeared from view – he didn’t respond to comments on his now-vanished post (unless in the interval between my last viewing and its disappearance) and I can’t see anything from him in this thread.

I was traveling and giving a talk in Florida. Apologies for the absence.

And the denialists above seem eager to back away from his fraud accusation, even though it’s the only coherent basis for challenging the results.

Could someone please define some terms here? What is a “denialist”? Is it anyone with suspicions about the Lancet team? Is it anyone who thinks that 650k is too high an estimate?

For the record, I do not describe myself as a “denialist.” I think that the mortality rate in Iraq is much higher post invasion than it was pre-invasion. Whether this “excess death” measure is 100k or 400k or 800k, I do not know. I have serious doubts about both Lancet I and II, but more on that some other time.

And, to be clear, I am not accusing, and did not accuse, the authors of the Lancet article of fraud. They did not collect the data. They have no first hand evidence of how the interviews were conducted. I think that much (I think a preponderance but others will differ) of the evidence suggests that the data is inaccurate.

As to the response rate, I appreciate all the citations provided here and elsewhere. If I had been aware of them (alas, my googling skills are not what they should be), I would have mentioned them and moderated my language.

But, as Steven Moore points out above, the response rate is still “spectacularly high”. Does Steven Moore know more about conducting surveys in Iraq than anyone at CT? Perhaps, perhaps not. But I think it is too early to declare the fraud balloon to have popped.

And, as always, many of these questions would go away if the Lancet teams would release the data.
