Wikipedia confidently explains this in its first sentence for this entry: “Post hoc ergo propter hoc (Latin: “after this, therefore because of this”) is a logical fallacy that states ‘Since event Y followed event X, event Y must have been caused by event X.’” This so-called fallacy is curious for a number of reasons. Taken literally it is a fallacy that is almost never committed, at least relative to the opportunities to commit it. There are (literally, perhaps) uncountably many events succeeding other events where no one does, nor would, invoke causality. Tides followed by stock market changes, cloud formation followed by earthquakes, and so on and so on. People do attribute causality to successive events of course: bumping a glass causing it to spill, slipping on a kitchen floor followed by a bump to the head. In fact, that’s how as infants we learn to get about in the world. Generally speaking, it is not merely temporal proximity that leads us to infer a causal relation. Other factors, including spatial proximity and the ability to recreate the succession under some range of circumstances, figure prominently in our causal attributions.

Of course, people also make mistakes with this kind of inference. In the early 1980s AIDS was attributed by some specifically to homosexual behavior. The two were correlated in some western countries, but the attribution was more a matter of the ignorance of the earlier spread of the disease in Africa than of fallacious reasoning. Or, anti-vaxxers infer a causal relation between vaccines and autism. In that case, there is not even a correlation to be explained, but still the supposed conjunction of the two is meant to confer support on the causal claim. The mistake here is likely due to some array of cognitive problems, including confirmation bias and more generally conspiratorial reasoning (which I will address on another occasion). But mistakes with any type of inductive reasoning, which inference to a causal relation certainly is, are inevitable. If you simply must avoid making mistakes, become a mathematician (where, at least, you likely won’t publish them!). The very idea of fallacies is misbegotten: there are (almost) no kinds of inference which are faulty because of their logical form alone (see my “Bayesian Informal Logic and Fallacy”). What makes these examples of post hoc wrong is particular to the examples themselves, not their form.

The more general complaint hereabouts is that “correlation doesn’t imply causation”, and it is accordingly more commonly abused than the objection to post hoc reasoning. Any number of deniers have appealed to it, as if it named a fallacy, to evade arguments for gun control or for the anthropogenic origins of global warming. It’s well past time for methodologists to put down this kind of cognitive crime.

This supposed disconnect between correlation and causation has been the orthodox statistician’s mantra at least since Sir Ronald Fisher (“If we are studying a phenomenon with no prior knowledge of its causal structure, then calculations of total or partial correlations will not advance us one step” [Fisher, Statistical Methods for Research Workers, 1925], a statement thoroughly debunked by the many decades of causal inference from observational data alone that followed). While there are more circumspect readings of this slogan than one proscribing any causal inference from evidence of correlation, that overly ambitious reading is quite common and does much harm. It is unsupportable by any statistical or methodological considerations.

The key to seeing through the appearance of sobriety in the mantra is Hans Reichenbach’s Principle of the Common Cause (in his The Direction of Time, 1956). Reichenbach argued that any correlation between A and B must be explained in one of three ways: the correlation is spurious and will disappear upon further examination; A and B are causally related, either as direct or indirect causes one of the other or as common effects of a common cause (or ancestor); or the correlation is the result of magic. The last he ruled out as being contrary to science.

Of course, apparent associations are often spurious, the result of noise in measurement or small samples. The “crisis of replicability” widely discussed now in academic psychology is largely based upon tests of low power, i.e., small samples. If a correlation doesn’t exist, it doesn’t need to be explained.
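The point about small samples can be illustrated with a minimal simulation sketch. It is my own illustration, not drawn from any study: the sample sizes, the 0.5 threshold, the seed, and the helper names (`pearson`, `share_of_strong_correlations`) are all assumptions chosen for the example. A and B are generated independently, so any correlation found between them is spurious by construction.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def share_of_strong_correlations(sample_size, trials, threshold, rng):
    """Fraction of trials in which two INDEPENDENT variables show |r| > threshold."""
    hits = 0
    for _ in range(trials):
        a = [rng.gauss(0, 1) for _ in range(sample_size)]
        b = [rng.gauss(0, 1) for _ in range(sample_size)]  # no causal link to a
        if abs(pearson(a, b)) > threshold:
            hits += 1
    return hits / trials


rng = random.Random(0)  # fixed seed, an arbitrary choice for repeatability
small = share_of_strong_correlations(sample_size=10, trials=500, threshold=0.5, rng=rng)
large = share_of_strong_correlations(sample_size=500, trials=500, threshold=0.5, rng=rng)

print(f"n = 10:  share of trials with |r| > 0.5 = {small:.3f}")
print(f"n = 500: share of trials with |r| > 0.5 = {large:.3f}")
```

Under these assumptions, a noticeable share of the ten-observation samples shows a seemingly strong correlation, while the larger samples essentially never do. That is the sense in which such associations disappear upon further examination.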

It’s also true that an enduring correlation between A and B is often the result of some association other than A directly causing B. For example, B may directly cause A, or there may be a convoluted chain of causes between them. Or, again, they may have a common cause, directly or remotely. The latter case is often called “confounding” and dismissed as showing no causal relation between A and B. But it is confounding only if the common cause cannot be located (and held constant, for example) and what we really want to know, say, is how much any causal chain from A to B is explanatory of B’s state. Finding a common cause that explains the correlation between A and B is just as much a causal discovery as any other.
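What locating a common cause and holding it constant amounts to can be shown in a small simulation sketch. Everything in it is an assumption made for illustration: a binary common cause C, effects A and B that each depend on C plus independent noise, the effect size of 2, the sample size, and the seed. Neither A nor B influences the other.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(1)  # fixed seed, an arbitrary choice for repeatability
n = 20000

# Hypothetical causal structure: C -> A and C -> B, with no arrow between A and B.
c = [rng.randint(0, 1) for _ in range(n)]
a = [2 * ci + rng.gauss(0, 1) for ci in c]
b = [2 * ci + rng.gauss(0, 1) for ci in c]

# Correlation between A and B when C is ignored.
overall = pearson(a, b)

# Correlation between A and B with C held constant (computed within each level of C).
within = {}
for level in (0, 1):
    idx = [i for i in range(n) if c[i] == level]
    within[level] = pearson([a[i] for i in idx], [b[i] for i in idx])

print(f"overall r(A, B)      = {overall:.3f}")
print(f"r(A, B) given C = 0  = {within[0]:.3f}")
print(f"r(A, B) given C = 1  = {within[1]:.3f}")
```

In this setup the overall correlation is substantial, and it all but vanishes within each level of C. The correlation has been explained, and the explanation is itself causal: C causes both A and B.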

I do not wish to be taken as suggesting that causal inference is simple. There are many other complications and difficulties to causal inference. For example, selection biases, including self-selection biases, can and do muck up any number of experiments, leading to incorrect conclusions. But nowhere amongst such cases will you find biases operating which are not themselves part of the causal story. Human experimenters are very complex causal stories themselves, and as much subject to bias as anyone else. So, our causal inferences often go wrong. That’s probably one reason why replicability is taken seriously by most scientists; it is no reason at all to dismiss the search for causal understanding.

There is now a science of causal discovery applying these ideas for data analysis in computer programs, one that has become a highly successful subdiscipline of machine learning, at least since Glymour, Scheines, Spirtes and Kelly’s Discovering Causal Structure (1987). (Their Part I, by the way, is a magnificent debunking of the orthodox mantra.)

The general application of “correlation doesn’t imply causation” to dismiss causal attributions is an example of a little learning being a dangerous thing – also known as the Dunning-Kruger effect.