You only have to hang around the world of social science research- or policy-related blogging for a few hours before you come across someone willing to snottily inform you, or some other luckless interlocutor, that although the finding of this or that paper may appeal to you, nevertheless don’t you know that Correlation Is Not Causation. Often this seems to be the only thing they know about statistics.
I grudgingly admit that it’s a plausible-sounding rule, and in the textbooks and stuff. But, to be honest, I read it too many times in various posts and comment threads the other day, and in my raging pique I found myself thinking that the next time it happened I would say, “That’s completely backwards: in fact, causation is just correlation” and fling a copy of Hume’s first Enquiry at their head. Or at the screen, I suppose, but that image is less satisfying, because now who’s the crank on the internet, etc.
This Halloween when we take the kids Trick-or-Treating, I will dress up as Correlation, as befits a social scientist. My wife will of course be Causation.
{ 44 comments }
Lad Litter 06.30.08 at 6:48 am
We don’t celebrate Halloween in a big way here in Australia but I’m sure going trick or treating as Standard Deviation would be fun, if a little dangerous.
matt 06.30.08 at 1:21 pm
This is where I tend to really like Jon Elster’s work, though. I have lots of sympathy for the idea that people ought not talk of causes until they can say something about mechanisms. The search for mechanisms is, if not the heart of science, at least one of the most important and interesting aspects of it. (I am tempted to say that you can’t explain something until you can describe a mechanism though that’s probably too strong.)
Henry 06.30.08 at 1:41 pm
I’m with Matt here (and indeed have just co-written a paper trying to bring recent scholarship on mechanisms and causation to the benighted wilderness of international relations scholarship).
Dan Kervick 06.30.08 at 1:46 pm
There are a number of accounts of causation that fit into the broad category of “Humean” accounts, according to which causal relations supervene entirely on the underlying mosaic of particular facts and events, and do not depend on any metaphysically dubious “necessary connections” among events.
But none of these accounts, so far as I am aware, goes so far as to say that “causation is just correlation”.
Kieran Healy 06.30.08 at 1:48 pm
It’s been a while since I looked at that literature. It’s certainly well-motivated, though in practice one finds that for every mechanism, there is an equally plausible and opposite mechanism, and then it’s either back to the messy data, like a good empiricist; or reject the data as useless and hew to the theory, like a good Chicago School economist or orthodox Marxist.
e julius drivingstorm 06.30.08 at 1:48 pm
Let’s see. Laurie (c) is making you (in sequence) take the kids Trick-or-Treating (e).
Had the intelligently designed stork never brought you these kids, or had you uncaused her major premise by loaning your kids out to someone else beforehand (uc), would she still be making you go?
Kieran Healy 06.30.08 at 1:49 pm
But none of these accounts, so far as I am aware, goes so far as to say that “causation is just correlation”.
The key difference being that, for all their virtues, none of those accounts has the ambition of making a small joke in a blog post on a Sunday night.
Kieran Healy 06.30.08 at 1:49 pm
would she still be making you go?
No, she would be running interference for Billy and Suzy on a bombing mission.
John Emerson 06.30.08 at 1:54 pm
Semi-on-topic, can anyone recommend a good book or two about the misuse of epidemiological methods? Even in non-ideological, non-pop debates I sometimes see dubious conclusions being made.
J.W. Hamner 06.30.08 at 2:03 pm
To use epidemiology for anything other than “hmmm that’s an interesting relationship, now we need to do carefully controlled studies to determine whether it actually exists” is misuse. You can correlate anything you want, but until you have a pre/post demonstrated effect your not going to convince me of anything.
Of course prospective interventions are expensive and time consuming… data mining is not.
Matt Stevens 06.30.08 at 2:04 pm
What I tell my statistics students is “correlation might be evidence of causation, but it isn’t necessarily so.” I then describe various reasons why, with confounding variables being the most likely culprit. They seem to get the idea.
The problem with the maxim, “correlation is not causation,” is that if it were true, there’d be no reason to look for correlations at all.
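The confounding story above can be sketched in a few lines of simulation. This is only an illustration under made-up assumptions: the variable names `x`, `y`, and `z` and the linear, normally distributed data are invented here, not taken from any study. Both `x` and `y` depend on `z` and not on each other, yet they come out correlated until `z` is adjusted for.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def residuals(ys, zs):
    """What is left of ys after a least-squares straight-line fit on zs."""
    n = len(ys)
    my, mz = sum(ys) / n, sum(zs) / n
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    return [(y - my) - slope * (z - mz) for y, z in zip(ys, zs)]

rng = random.Random(0)                      # fixed seed so the run is repeatable
n = 20_000
z = [rng.gauss(0, 1) for _ in range(n)]     # the confounder (hypothetical)
x = [zi + rng.gauss(0, 1) for zi in z]      # driven by z, never by y
y = [zi + rng.gauss(0, 1) for zi in z]      # driven by z, never by x

raw = pearson(x, y)                                    # about 0.5 in theory
adjusted = pearson(residuals(x, z), residuals(y, z))   # about 0 in theory
print(f"raw correlation:       {raw:.3f}")
print(f"after adjusting for z: {adjusted:.3f}")
```

The adjustment only works here because the confounder was measured, which is exactly what one usually cannot count on.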
J.W. Hamner 06.30.08 at 2:04 pm
Damnit. I’m being all snarky and I go and do a your/you’re. Sorta dampens the effect.
matt 06.30.08 at 2:05 pm
Not to mention the fact that there’s some serious debate as to whether Hume really thought that “causation is just correlation” or not. See:
for a pretty good (and to my mind mostly convincing) argument that he did not.
matt 06.30.08 at 2:07 pm
Well, that link didn’t come out, so See Galen Strawson, _The Secret Connexion: Causation, Realism, and David Hume_, Oxford University Press, USA (August 27, 1992)
Chris Dornan 06.30.08 at 3:00 pm
You weren’t being serious! And here I was thinking that someone was doing some cool and witty philosophy.
I really don’t know what Matt means by mechanism above. For my money Feynman had it bang on: it is about correlations all right, as long as some of those correlations lie in the future: it’s about making interesting (i.e., surprising) predictions.
I have written a bit more on this here.
abb1 06.30.08 at 4:01 pm
I think that one can view the world as a set of trajectories (as a movie) or a set of snapshots (individual frames on the film). The former approach leads to the cause/effect scenario, the latter to the ‘correlations’ scenario.
It’s especially clear in chess modeling: the sequence of moves vs. the sequence of positions.
Seth Gordon 06.30.08 at 4:18 pm
According to a rigorous meta-analysis, correlation is correlated with causation. Therefore, correlation implies causation. QED.
Matt Stevens 06.30.08 at 4:55 pm
Chris D, it’s a little more complicated than that. You can predict the likelihood of rain based on how many people are carrying umbrellas. The predictions may be accurate, but you have to control for all the confounding factors (cloud formations, weathermen, etc.) to demonstrate a causal relationship. (Even then you could dismiss it for being flat-out bonkers.)
Matt McIrvin 06.30.08 at 5:01 pm
One problem with demanding mechanisms is that it’s potentially an infinite regress. Undergrad students of fundamental physics often get frustrated because they’re not shown any mechanism beneath the basic rules of electromagnetism, gravity, etc.–that is, the things we use to construct other explanatory mechanisms. Gravity happens because matter and energy curve space-time; but why? Who knows? A kid asking “why” repeatedly can usually drive you to the limit of human knowledge in four or five steps.
People may end up discovering other levels of mechanism in the future (strings? quantum gravity? etc.) but at any given time, at some point, observed correlation and successful prediction are all you’ve got.
Walt 06.30.08 at 5:01 pm
Sad. I was just about to make the “correlation is correlated with causation” joke.
virgil xenophon 06.30.08 at 5:07 pm
seth gordon: I liked Tufte’s comment about correlation being a strong “hint” of causation that was quoted in the article you referenced. It seems to capture the flavor of the whole exercise in one “pithy” (I know, O’Reilly, etc.) statement about as well as anything else that’s been written.
matt 06.30.08 at 5:35 pm
Matt M (and others)- I’d agree that there comes a limit to asking for mechanisms, but I’m strongly tempted to say that when we get to the end of that we’ve reached the end of our understanding. And of course I don’t want to say that gathering the data isn’t a hugely important part of science. But usually there will be several causal mechanisms compatible with the data and then it’s a question of figuring out which is more plausible, better supported, etc. Many of the phenomena of interest to Freud, for example, seem to be real, but the question of some greater interest is whether the mechanisms he postulated were the right ones. Without this my tendency is to think we just don’t well understand what we are talking about.
Neel Krishnaswami 06.30.08 at 6:22 pm
Hume was smart, but this is one of the rare occasions he was seriously wrong.
A rooster’s crowing is certainly correlated with sunrise, but it’s quite evidently absurd to say that the rooster causes the sun to rise. Interventionist accounts of causality (e.g., Menzies & Price’s, or Judea Pearl’s) accord reasonably well with intuitions about causation, and the mathematics of interventions is certainly not the same as the math of correlation.
It’s certainly often true that the data you need to establish causation are unavailable, especially in the social sciences, but really that’s more a problem for social scientists to solve than for statisticians.
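The gap between watching the rooster and interfering with the rooster can be sketched as a toy simulation. Everything here is an assumption made for illustration: the probabilities are invented, and the function name `sunrise_rate_by_crow` is hypothetical. In the observational run the sunrise drives the crowing; in the interventional run the crowing is set by a coin flip, which is a crude stand-in for what the interventionist accounts call setting a variable from outside.

```python
import random

rng = random.Random(1)      # fixed seed so the run is repeatable
P_SUNRISE = 0.3             # assumed chance that a time slot contains a sunrise
P_CROW_IF_SUN = 0.9         # assumed chance of crowing when the sun comes up
P_CROW_IF_DARK = 0.05       # assumed chance of crowing otherwise

def sunrise_rate_by_crow(n, intervene):
    """Return (P(sunrise | crow), P(sunrise | no crow)) over n simulated slots."""
    counts = {True: [0, 0], False: [0, 0]}    # crow -> [sunrises, slots]
    for _ in range(n):
        sunrise = rng.random() < P_SUNRISE
        if intervene:
            crow = rng.random() < 0.5         # we decide, ignoring the sun
        else:
            p = P_CROW_IF_SUN if sunrise else P_CROW_IF_DARK
            crow = rng.random() < p           # the sun decides
        counts[crow][0] += sunrise
        counts[crow][1] += 1
    return tuple(counts[c][0] / counts[c][1] for c in (True, False))

obs_crow, obs_quiet = sunrise_rate_by_crow(20_000, intervene=False)
do_crow, do_quiet = sunrise_rate_by_crow(20_000, intervene=True)
print(f"observed: sunrise rate {obs_crow:.2f} with crowing, {obs_quiet:.2f} without")
print(f"forced:   sunrise rate {do_crow:.2f} with crowing, {do_quiet:.2f} without")
```

Observed crowing predicts the sunrise very well, while forced crowing leaves the sunrise rate where it was, so the two calculations really are different pieces of arithmetic.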
a 06.30.08 at 6:51 pm
“…to snottily inform you, or some other luckless interlocutor, that although the finding of this or that paper may appeal to you, nevertheless don’t you know that Correlation Is Not Causation. Often this seems to be the only thing they know about statistics.”
Well that sounds pretty snotty, doesn’t it? All must bow to Authority, I guess.
When I read the comment, there usually isn’t any indication whether or not the person knows anything more about statistics.
qb 06.30.08 at 6:54 pm
Neel @ 23,
As was remarked above, Hume’s point was not that correlation just is causation, but that correlation is the only thing the data are capable of establishing, ever. Interventionist accounts of causation offer little more than a formalization of Hume’s positive point: that we can use the scientific method to identify false causes by wiggling some variables while holding others steady. But even if we have established that wiggling X invariably leads to changes in Y, we still have nothing more than a very reliable correlation. Social scientists aren’t going to solve that problem any time soon.
Chris Dornan 06.30.08 at 7:15 pm
The rain and umbrellas is good but I prefer the rooster and sunrise example. My point is that it should be a strong correlation that should surprise. If the correlation can be trivially explained by other causal mechanisms then it doesn’t count. You have to take the epistemological argument seriously.
As for Seth Gordon’s point about identifying causation with correlation–that is going too far! If no predictions can be made then by definition I say it is not a causal relationship (I just accuse you of data mining).
Predictions can be made in historical data (as Feynman suggests–see the article), as long as the observer is ignorant of the outcome. Kepler no doubt was doing this as he trialled different hypotheses with Tycho Brahe’s Martian data.
Aaron Swartz 06.30.08 at 7:40 pm
This post makes me want to throw Judea Pearl or Clark Glymour at you. http://ftp.cs.ucla.edu/pub/stat_ser/r284-reprint.pdf is probably a good place to start. Then maybe http://singapore.cs.ucla.edu/LECTURE/lecture_sec1.htm
Thrasymachus 06.30.08 at 8:25 pm
The main issue with the “correlation is not causation” claim is that it is often used selectively, by people who want to ignore evidence that they find ideologically uncongenial. These same individuals will allow far less well supported statements to pass unchallenged if the statements are consistent with what they want to believe.
strategichamlet 06.30.08 at 8:37 pm
“in practice one finds that for every mechanism, there is an equally plausible and opposite mechanism, and then it’s either back to the messy data, like a good empiricist; or reject the data as useless and hew to the theory”
The formation of opposing mechanisms isn’t a bad thing, it helps direct future experiments! For science to work correctly there needs to be continual interplay between theory and experiment.
In the case of the competing mechanisms for sunrise (rooster’s crow vs. spinning earth) one could kill the rooster and see if the sun still rises, but one wouldn’t necessarily know to try if one didn’t have the competing theories. Popper says all this much better than I do.
D.C. 06.30.08 at 9:58 pm
Very interesting point. For example, suppose a man is walking across the street at night to go to the corner store to buy a pack of cigarettes, he isn’t looking where he is walking, and he crosses at a blind corner and gets hit by a car. What was it that killed him? Was it his addiction to cigarettes? His inattention? Was it the driver’s fault? Was it the city planner’s fault for placing a blind corner?
All these things we consider when trying to find causation. In the end WE determine which is the most logical ‘cause’ of the event. Some people may say the driver’s negligence while driving, while others may say his addiction to smoking. It is all about perspective.
I do agree that causation is just correlation because it is such an ambiguous idea that it cannot be pinpointed to an exact incidence. Very good post.
A good article on this topic is by Ardon Lyon called ‘Causality,’ The British Journal for the Philosophy of Science, Vol. 18, No. 1 (May, 1967), pp. 1-20.
abb1 06.30.08 at 10:20 pm
What was it that killed him?
It was the sun, just like with the rooster’s crow. Without the sun nothing would’ve happened. For sure.
ScentOfViolets 07.01.08 at 2:01 am
otoh, lack of correlation definitely implies lack of causation. As I never tire of informing those same snots who also say that cutting taxes increases investment, productivity, patriotism, etc.
As for what’s _really_ happening? Isn’t that not science, more in the purview of epistemology? Maybe it’s all red, blue, green, and yellow demons pushing things around _as_if_ they were being affected by gravity, electromagnetism, etc. So what?
virgil xenophon 07.01.08 at 2:14 am
scentofviolets: Does the possible existence of your demons (outer, not inner) mean that it really IS possible that it’s “turtles all the way down?”
Tracy W 07.01.08 at 8:06 am
The main issue with the “correlation is not causation” claim is that it is often used selectively, by people who want to ignore evidence that they find ideologically uncongenial.
Out of curiosity, is there a claim out there that is not often used selectively by people who want to ignore evidence that they find ideologically uncongenial?
I can think of many claims that are used selectively, eg the laws of thermodynamics and creationists, but I can’t think of any claim that is not used selectively.
Kenny Easwaran 07.01.08 at 8:36 am
Actually, as I understand it, the work of Glymour, Spirtes, and all their collaborators at CMU and elsewhere seems to show that really you can infer causation just from correlation (at least, if causal Bayes nets give the right model of causation). While you can’t infer causation from correlation between merely two variables, even once you get three you can reliably tell the difference between a collision (A->B<-C), a common cause (A<-B->C), and a chain, though you can’t tell apart the two types of chain. Maybe I’m wrong about three variables being sufficient for this, and you really need correlations between five or more, but at some point you can start reading the actual directions of particular arrows off the correlations alone.
Of course, you can’t eliminate the possibility that you’ve ignored some extra variables that play very important roles in your model, or the possibility that you’ve just got very non-representative data. But that’s just what makes science hard. Controlled experiments are nice for giving you variables where you specifically know there are no incoming arrows, but even in observational (is that the same as epidemiological?) studies, you can sometimes get enough correlations to be sure that some of them are causal.
Of course, I’d still prefer a model of causation that involves mechanisms, but I think you can predict the existence of mechanisms even when you don’t know anything about them. (For instance, see the amazing successes of Darwin and Mendel prior to Watson, Crick, and Franklin.)
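The three-variable structures mentioned above can be simulated to see what the correlations give away. This is a minimal sketch under strong assumptions: linear relationships, normally distributed noise, no hidden variables, and hypothetical function names (`collision`, `common_cause`, `chain`). It is not the procedure used by any of the authors named, only the basic check of whether A and C are correlated on their own and whether they stay correlated once B is held fixed.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def partial(xs, ys, zs):
    """Correlation of xs and ys with the linear influence of zs removed."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / ((1 - rxz ** 2) * (1 - ryz ** 2)) ** 0.5

rng = random.Random(2)          # fixed seed so the run is repeatable

def noise():
    return rng.gauss(0, 1)

def collision():                # A -> B <- C
    a, c = noise(), noise()
    return a, a + c + noise(), c

def common_cause():             # A <- B -> C
    b = noise()
    return b + noise(), b, b + noise()

def chain():                    # A -> B -> C
    a = noise()
    b = a + noise()
    return a, b, b + noise()

signature = {}
for name, draw in [("collision", collision),
                   ("common cause", common_cause),
                   ("chain", chain)]:
    a, b, c = zip(*(draw() for _ in range(20_000)))
    signature[name] = (pearson(a, c), partial(a, c, b))
    print(f"{name:12s} corr(A,C) = {signature[name][0]:+.2f}   "
          f"corr(A,C | B) = {signature[name][1]:+.2f}")
```

In this sketch the collision is the odd one out: A and C are uncorrelated until B is held fixed. The common cause and the chain produce the same pattern as each other, so with these three variables alone the correlations separate the collision from the other two but do not separate those two from each other.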
Dave 07.01.08 at 10:28 am
Yeah, if people we disagreed with weren’t such sly, manipulative, lying, ignorant mother-frackers, the world would be a much better place. Is that correlation or causation, y’think?
ScentOfViolets 07.01.08 at 1:41 pm
Virgil, yes, it is possible it is Turtles All The Way Down. Worlds ‘really’ could move in epicycles. You ‘really’ could be in the grip of an infinitely powerful deceiver. And so on and so forth. But . . .
Kenny, yes, _given_ a known set of variables, _then_ you could argue just as you have. But it is almost impossible to definitively say that you’ve ruled out all other forms of interference. Which is why the logical arrow, the correlation =>causation contrapositive is much more better, to quote a snippet of Cap’n Jack, and we speak of the essential trait of any good theory, that of falsifiability.
D.C. 07.01.08 at 3:36 pm
To abb1: That is very true. It all depends on how far you want to push the string of causations. We could take your sun cause all the way back to the beginning of the universe. Just to play devil’s advocate: But is that correct? It’s definitely not wrong, but people would definitely argue that it cannot be taken back that far. But why? It is all about justification, and perspective.
qb 07.01.08 at 5:11 pm
…you can sometimes get enough correlations to be sure that some of them are causal.
Thirty-five posts into the thread, and some of us still fail to grasp Hume’s very simple point.
OF COURSE you can infer causation from correlation on Bayesian network models of causation, because they simply define causation in terms of statistical correlation. Refuting Hume with interventionism is like refuting Berkeley by kicking rocks.
noen 07.01.08 at 8:43 pm
Well it’s all language isn’t it? The idea that we can gaze upon Nature’s true face is a fantasy, it has no face. Reality is that which resists symbolization. It is a mistake to confuse our descriptions of the world for the world.
Within this thread the question of causation vs correlation has wavered from a narrow context to the much broader metaphysical debate. Hence there has been a certain amount of confusion. I was taught in statistics that you cannot simply proclaim causation. You need to provide a mechanism or you do end up with a rooster/sunrise fallacy. My position on the larger question is sketched out above.
Charles Twardy 07.01.08 at 9:38 pm
otoh, lack of correlation definitely implies lack of causation: That must be qualified, because we often balance competing causes to achieve an overall correlation near 0.
For example, it turns out (from a very large study) that applying sunscreen has no effect on skin cancer. A likely explanation is that we adjust for constant exposure — staying in the sun about as long as we can without burning.
Mind you, it’s a good heuristic. In fact, most causal discovery algorithms assume it. But it doesn’t always work.
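The balancing of competing causes can be shown with a toy model. The numbers below are invented and are not taken from the study mentioned; the variable names `sunscreen`, `hours`, and `damage` are hypothetical, and the coefficients are chosen so that the two paths cancel exactly. Sunscreen lowers damage directly but also lengthens time in the sun, which raises it.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def partial(xs, ys, zs):
    """Correlation of xs and ys with the linear influence of zs removed."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / ((1 - rxz ** 2) * (1 - ryz ** 2)) ** 0.5

rng = random.Random(3)          # fixed seed so the run is repeatable
n = 20_000
# Standardized amount of sunscreen applied (hypothetical units).
sunscreen = [rng.gauss(0, 1) for _ in range(n)]
# More sunscreen, more time in the sun: the compensating behaviour.
hours = [s + rng.gauss(0, 1) for s in sunscreen]
# Damage rises with hours (+1) and falls with sunscreen (-1); the two cancel.
damage = [h - s + rng.gauss(0, 1) for h, s in zip(hours, sunscreen)]

overall = pearson(sunscreen, damage)             # about 0 in theory
hours_fixed = partial(sunscreen, damage, hours)  # clearly negative in theory
print(f"sunscreen vs damage overall:          {overall:+.2f}")
print(f"sunscreen vs damage, hours held fixed: {hours_fixed:+.2f}")
```

The overall correlation sits near zero even though sunscreen has a real protective effect in the model, which only shows up once exposure is held fixed.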
Chris Dornan 07.01.08 at 11:28 pm
noen, do you agree that you don’t get the rooster sunrise fallacy, as I explained above, if you take the epistemological aspect seriously? Simply put, the rooster doesn’t cause the sunrise because you have another set of causes to explain that correlation.
noen 07.02.08 at 2:48 am
I’m not sure what you’re getting at Chris. Discovering a correlation is an invitation to further study. One shouldn’t leap to conclusions. The problem is that no matter how fine grained our mechanism is there will always be a leap involved. So we are left with observing that B follows A and concluding that A causes B. We call that deduction but there is a gap in our understanding. There always will be.
Chris Dornan 07.02.08 at 8:00 pm
noen I have had a go at trying to clear up the misunderstanding in a short article on my blog. The misunderstanding between us is, I think, revealing.