Saturday, September 29, 2018

RCT's vs. RDD's

Art Owen and Hal Varian have an eye-opening new paper, "Optimizing the Tie-Breaker Regression Discontinuity Design".

Randomized controlled trials (RCT's) are clearly the gold standard in terms of statistical efficiency for teasing out causal effects. Assume that you really can do an RCT. Why then would you ever want to do anything else?

Answer: There may be important considerations beyond statistical efficiency. Take the famous "scholarship example". (You want to know whether receipt of an academic scholarship causes enhanced academic performance among strong scholarship test performers.) In an RCT approach you're going to give lots of academic scholarships to lots of randomly-selected people, many of whom are not strong performers. That's wasteful. In a regression discontinuity design (RDD) approach ("give scholarships only to strong performers who score above X in the scholarship exam, and compare the performances of students who scored just above and below X"), you don't give any scholarships to weak performers. So it's not wasteful -- but the resulting inference is statistically inefficient. 

"Tie breakers" implement a middle ground: Definitely don't give scholarships to bottom performers, definitely do give scholarships to top performers, and randomize for a middle group. So you gain some efficiency relative to pure RDD (but you're a little wasteful), and you're less wasteful than a pure RCT (but you lose some efficiency).
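To make the three-way rule concrete, here is a minimal sketch of a tie-breaker assignment. The function name, the `lower`/`upper` cutoffs, and the 50/50 coin flip in the middle group are illustrative assumptions, not details taken from the paper.

```python
import random

def tie_breaker_assign(scores, lower, upper, rng=None):
    """Return 0/1 scholarship indicators for a list of exam scores.

    Hypothetical rule: scores at or above `upper` always get the scholarship,
    scores below `lower` never do, and scores in between are randomized
    with probability 1/2 (an assumed coin flip).
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    assignments = []
    for s in scores:
        if s >= upper:
            assignments.append(1)          # top performers: always treated
        elif s < lower:
            assignments.append(0)          # bottom performers: never treated
        else:
            assignments.append(1 if rng.random() < 0.5 else 0)  # middle group
    return assignments

# Middle group is [40, 60): the first and last students are decided by the rule,
# the two in between by the coin flip.
print(tie_breaker_assign([10, 45, 55, 90], lower=40, upper=60))
```

Setting `lower == upper` collapses the middle group and gives a pure RDD; setting `lower` below every score and `upper` above every score randomizes everyone, which is a pure RCT.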

Hence there's a trade-off, and your location on it depends on the size of your middle group. Owen and Varian characterize the trade-off and show how to optimize the size of the middle group. Really nice, clean, and useful.
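The efficiency side of that trade-off can be illustrated numerically. The sketch below is built on assumptions of my own choosing, not necessarily the paper's setup: scores evenly spaced on (-1, 1), a middle group |x| <= delta randomized 50/50, and a linear model y = b0 + b1*x + b2*z + b3*x*z + noise with z = +1 for treated and -1 for untreated. It computes the variance of the least-squares estimate of b2 (per unit noise variance, scaled by n) from the design alone.

```python
import random

def solve(a, b):
    """Solve the linear system a @ v = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]           # partial pivoting
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [mr - f * mc for mr, mc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def effect_variance(delta, n=2000, seed=0):
    """n * Var(b2_hat) / sigma^2 for a middle group of half-width delta."""
    rng = random.Random(seed)
    xtx = [[0.0] * 4 for _ in range(4)]               # accumulates X'X
    for i in range(n):
        x = -1 + (2 * i + 1) / n                      # evenly spaced scores
        if x > delta:
            z = 1.0                                   # always treated
        elif x < -delta:
            z = -1.0                                  # never treated
        else:
            z = rng.choice([-1.0, 1.0])               # randomized middle group
        row = [1.0, x, z, x * z]
        for j in range(4):
            for k in range(4):
                xtx[j][k] += row[j] * row[k]
    # The (z, z) entry of (X'X)^{-1} is the z-component of the solution of X'X v = e_z.
    v = solve(xtx, [0.0, 0.0, 1.0, 0.0])
    return n * v[2]

for delta in (0.0, 0.5, 1.0):  # pure RDD, a tie-breaker, pure RCT
    print(delta, round(effect_variance(delta), 2))
```

Under these assumptions the variance shrinks as delta grows, while a larger delta also means more scholarships going to weaker scorers; that is the tension to be balanced when choosing the middle group.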

[Sorry but I'm running way behind. I saw Hal present this work a few months ago at a fine ECB meeting on predictive modeling.]
