Wednesday, November 25, 2015

Questioning the Credibility Revolution

If you follow the economics blogosphere, you've likely heard a lot about the "credibility revolution." The "credibility" part has to do with the supposed superiority of randomized controlled trials and quasi-experimental designs over traditional models. The "revolution" part has to do with the suggestion that these methods should replace, to some degree, theory in determining causality. One popular book, Angrist & Pischke's "Mostly Harmless Econometrics," discusses these methods quite clearly. I've used it a few times as a reference and found it quite helpful.

Of course, this revolution is very important for agricultural economists. We spend a lot of time using empirical methods to examine microeconomic problems. An open mind and a critical eye are, I think, important for evaluating the usefulness of these newly popular methods.

As the title suggests, I'm skeptical of the "revolution" part and somewhat skeptical of the "credibility" part. Experimental methods are nice as far as they go, but the use of experiments (broadly speaking) is more problematic in the social sciences than in the natural sciences. This is especially important if we are going to rely on the data to determine causality for us, rather than relying on sound theory. Further, some people seem to think these methods are the only way to do empirical microeconomics. In this post I'll draw from a few sources to make the arguments that 1) there are some important limitations to these quasi-experimental methods, 2) we shouldn't replace sound theory with data analysis, and 3) the "old" empirical models are just as useful as they always have been.

1) Angus Deaton's 2008 Keynes Lecture at the British Academy presents some criticisms of quasi-experimental methods in economics:
In response, there is movement in development economics towards the use of randomized controlled trials (RCTs) to accumulate credible knowledge of what works, without over-reliance on questionable theory or statistical methods. When RCTs are not possible, this movement advocates quasi-randomization through instrumental variable (IV) techniques or natural experiments. I argue that many of these applications are unlikely to recover quantities that are useful for policy or understanding: two key issues are the misunderstanding of exogeneity, and the handling of heterogeneity. ... Actual randomization faces similar problems as quasi-randomization, notwithstanding rhetoric to the contrary. I argue that experiments have no special ability to produce more credible knowledge than other methods, and that actual experiments are frequently subject to practical problems that undermine any claims to statistical or epistemic superiority.
That last sentence seems particularly damning. If these methods are no better than standard models at determining causality, then we should be leaning just as heavily on theory now as we did before the credibility revolution.

2) Pete Boettke puts the relationship between theory and empirics this way:
Now I simply use the metaphor of the eyeglasses that I wear and the idea that my eyesight is so poor that without my set of eyeglasses I could not make out the person in the back of the room. Those eyeglasses are like economic theory -- without the aid of economic theory the social world is a blur and an unrecognizable collection of data points, but once I put economic theory to work the social world is brought into sharp relief. 
There are in the social sciences WHAT questions and WHY questions. Answering WHAT questions thoroughly is a necessary component of any empirical social scientist. 
... our job as social scientists is only beginning once we know WHAT, we then must be ready to explain WHY WHAT happened, in fact happened. We must enter the WHY AXIS (as Gneezy and List term it) of explanation in terms of incentives that human actors face in order to make progress in understanding human behavior and the structure of the social world in which we live.
The questions we answer with empirics and theory are different. We also use empirics to fill in the gaps in our theoretical understanding. However, testing theory with data isn't as neat a process as many suggest. We typically hear something like the following in our econometrics courses:
A theory in the empirical sciences can never be proven, but it can be falsified, meaning that it can and should be scrutinized by decisive experiments.

However, I don't think that's a faithful interpretation of Popper, the father of empirical falsifiability. Here’s a quote from his book “The Logic of Scientific Discovery”:
In point of fact, no conclusive disproof of a theory can ever be produced; for it is always possible to say that the experimental results are not reliable, or that the discrepancies which are asserted to exist between the experimental results and the theory are only apparent and that they will disappear with the advance of our understanding. If you insist on strict proof (or strict disproof) in the empirical sciences, you will never benefit from experience, and never learn from it how wrong you are.
The applicability of a theory is easier to identify with data than its validity. Epistemic humility is, in my view, extremely important.

Don Boudreaux recently discussed a very good example of how the direction of causality can be confused when it is inferred solely from empirical analysis. Boudreaux's grasp of sound theory is what allows him to properly interpret the data.
Higher wages do not cause higher worker productivity; instead, higher worker productivity causes higher wages.  When industry X is expanding and its workers are becoming more productive, companies in X bid for more workers by raising the wages paid in X.  Higher wages in industry X attract workers from industry Y, thus prompting companies in industry Y to implement labor-saving technology.
In contrast, if wages are forced up by diktat rather than competed up in response to rising worker productivity, wages for some workers will exceed the value of their productivity.  These workers will become unemployed.  And in addition to losing current income, these workers will be denied on-the-job experience – a denial that thwarts improvements in their productivity (that is, in their “human capital”). The economy and workers as a group will over time become less, not more, productive.
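A small simulation can make the point about direction concrete. What follows is a minimal sketch under numbers I made up for illustration, not anything taken from Boudreaux's post: the mean and spread of productivity, the 0.8 coefficient, and the helper `ols_slope` are all assumptions. By construction, productivity causes wages and never the reverse.

```python
import random


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


rng = random.Random(2015)  # fixed seed so the run is repeatable

# Assumed data-generating process: productivity drives wages, not the reverse.
productivity = [rng.gauss(10.0, 2.0) for _ in range(5000)]
wage = [0.8 * p + rng.gauss(0.0, 1.0) for p in productivity]

# Run the regression in both directions on the same data.
slope_wage_on_productivity = ols_slope(productivity, wage)
slope_productivity_on_wage = ols_slope(wage, productivity)

print(f"wage regressed on productivity: {slope_wage_on_productivity:.2f}")
print(f"productivity regressed on wage: {slope_productivity_on_wage:.2f}")
```

Both regressions return a positive, sizable slope. Nothing in the output says which variable is the cause. In the simulation we know the answer only because we wrote the data-generating process; in real work, that knowledge has to come from theory.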
We also need to keep in mind that quasi-experimental studies and randomized controlled trials are typically done in very narrow contexts. There could be institutional details about a specific scenario that make the results of a given study valid for only that particular scenario. Good deductive theory, where applicable, can help clarify this issue.

3) Quasi-experimental methods, I think, can be useful in the proper context. However, they are only a subset of a wide array of statistical tools we have at our disposal. In his (short) review of Angrist & Pischke's "Mostly Harmless Econometrics," Francis Diebold says:
The problem isn't what it includes, but rather what it excludes. Starting with its title and continuing throughout, MHE [Mostly Harmless Econometrics] promotes its corner of applied econometrics as all of applied econometrics, or at least all of the "mostly harmless" part (whatever that means). Hence it effectively condemns much of the rest as "harmful," and sentences it to death by neglect. It gives the silent treatment, for example, to anything structural -- whether micro-econometric or macro-econometric -- and anything involving time series. And in the rare instances when silence is briefly broken, we're treated to gems like "serial correlation [until recently was] Somebody Else's Problem, specifically the unfortunate souls who make their living out of time series data (macroeconomists, for example)" (pp. 315-316).
All told, Mostly Harmless Econometrics: An Empiricist's Companion is neither "mostly harmless" nor an "empiricist's companion." Rather, it's a companion for a highly-specialized group of applied non-structural micro-econometricians hoping to estimate causal effects using non-experimental data and largely-static, linear, regression-based methods. It's a novel treatment of that sub-sub-sub-area of applied econometrics, but pretending to be anything more is most definitely harmful, particularly to students, who have no way to recognize the charade as a charade.
I'm not sure the tone is entirely appropriate, but Diebold makes a good point: we have other tools at our disposal. In my view, we shouldn't let the rhetoric of the credibility revolution keep us from looking at problems that simply can't be examined with quasi-experiments or randomized controlled trials. If we employ sound theory in our conceptual frameworks and use the most appropriate empirical models, we're likely to produce valuable and interesting research.


  1. Great post!

    I think the debate is eventually about the paradigm. Economists have very different scientific classifications for the very same phenomenon. Hence that reflects in their epistemology and also in the employed scientific methods.

    Without understanding the paradigm, it might be impossible to contextualize the relevance of epistemology and the institutional nature of individual plans. Actually, once we grasp the whole Mises/Hayek paradigm, then automatically that would reflect in the pursued scientific methods also.

    Otherwise, for an econometrician, these very relevant theoretical points might seem like noise.

    1. Mahesh,

      Thanks for your kind words. I think a lot of what you call the Mises/Hayek paradigm is pretty close to the one many of my colleagues use. My worry is that the dominant paradigm in applied micro will shift toward the methods discussed above to the *exclusion* of other valid and time-tested methods.


    2. Levi,

      Thanks, please note that by Mises/Hayek paradigm I meant social science as a causally complex phenomenon. For example, Hayek's 'Theory of Complex Phenomena' seems to be a lot complementary to the literature in 'Human Action'. I guess beyond that there is divergence in their approach with regards to how they engage this complexity.

      Also, the concern you raise seems to be valid. To an outsider like me, such an 'empirical' approach to discovering causal connections seems completely unscientific. Cannot think of any other scientific field with a coherent causal structure where they employ just data to discover connections.


    3. Mahesh,

      Thanks for clarifying. I agree that causal density is underappreciated in economics.

      "To an outsider like me, such an 'empirical' approach to discovering causal connections seems completely unscientific. Cannot think of any other scientific field with a coherent causal structure where they employ just data to discover connections."

      That's something I hadn't heard before. I haven't studied other fields well enough to know how they deal with these issues. Thanks again for commenting and for reading!