Tuesday, January 3, 2017

Torpedoing Econometric Randomized Controlled Trials

I get no pleasure from torpedoing anything, and "torpedoing" is likely exaggerated, but nevertheless take a look at "A Torpedo Aimed Straight at HMS Randomista". It argues that many econometric randomized controlled trials (RCTs) are seriously flawed -- not even internally valid -- because they are not double-blind. At first the non-double-blind critique may sound cheap and obvious, inviting you to roll your eyes and say "get over it". But ultimately it's not.

Note the interesting situation. Everyone these days is worried about external validity (extensibility), under the implicit assumption that internal validity has been achieved (e.g., see this earlier post). But the non-double-blind critique makes clear that even internal validity may be dubious in econometric RCTs as typically implemented.

The underlying research paper, "Behavioural Responses and the Impact of New Agricultural Technologies: Evidence from a Double-Blind Field Experiment in Tanzania", by Bulte et al., was published in 2014 in the American Journal of Agricultural Economics. Quite an eye-opener.

Here's the abstract:

Randomized controlled trials in the social sciences are typically not double-blind, so participants know they are “treated” and will adjust their behavior accordingly. Such effort responses complicate the assessment of impact. To gauge the potential magnitude of effort responses we implement an open RCT and double-blind trial in rural Tanzania, and randomly allocate modern and traditional cowpea seed-varieties to a sample of farmers. Effort responses can be quantitatively important––for our case they explain the entire “treatment effect on the treated” as measured in a conventional economic RCT. Specifically, harvests are the same for people who know they received the modern seeds and for people who did not know what type of seeds they got, but people who knew they received the traditional seeds did much worse. We also find that most of the behavioral response is unobserved by the analyst, or at least not readily captured using coarse, standard controls.
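
To make the logic of the comparison concrete, here is a minimal sketch of how I read the design. It is my own illustration, not the authors' code or estimator. The harvest numbers are invented, chosen only to mimic the qualitative pattern in the abstract, and the names (`harvests`, `arm_gap`) are hypothetical. The idea: the modern-minus-traditional gap in the open RCT mixes the seed effect with the behavioral response, the same gap in the double-blind trial isolates the seed effect, and the difference between the two gaps is the behavioral response.

```python
from statistics import mean

# Hypothetical harvests (arbitrary units), invented purely for illustration.
# These are NOT the Tanzania data from Bulte et al.
harvests = {
    ("open", "modern"): [10.0, 11.0, 9.0],        # knew they got modern seed
    ("open", "traditional"): [7.0, 6.0, 8.0],     # knew they got traditional seed
    ("blind", "modern"): [10.0, 9.0, 11.0],       # did not know seed type
    ("blind", "traditional"): [10.0, 11.0, 9.0],  # did not know seed type
}


def arm_gap(data, design):
    """Mean harvest with modern seed minus mean harvest with traditional seed."""
    return mean(data[(design, "modern")]) - mean(data[(design, "traditional")])


open_effect = arm_gap(harvests, "open")    # seed effect plus behavioral response
blind_effect = arm_gap(harvests, "blind")  # seed effect alone
behavioral_response = open_effect - blind_effect

print(f"open RCT gap:        {open_effect:.1f}")
print(f"double-blind gap:    {blind_effect:.1f}")
print(f"behavioral response: {behavioral_response:.1f}")
```

With these made-up numbers the whole open-RCT gap is behavioral, which is the pattern the abstract describes. A real analysis would of course need standard errors and controls.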