Alwyn Young has an eye-opening recent paper, "Consistency without Inference: Instrumental Variables in Practical Application". There's a lot going on worth thinking about in his Monte Carlo: OLS vs. IV; robust/clustered s.e.'s vs. not; testing/accounting for weak instruments vs. not; jackknife/bootstrap vs. "conventional" inference; etc. IV as typically implemented comes up looking, well, dubious.

Alwyn's related analysis of published studies is even more striking. He shows that, in a sample of 1359 IV regressions in 31 papers published in the journals of the American Economic Association,

"... statistically significant IV results generally depend upon only one or two observations or clusters, excluded instruments often appear to be irrelevant, there is little statistical evidence that OLS is actually substantively biased, and IV confidence intervals almost always include OLS point estimates."

Wow.

Perhaps the high leverage is Alwyn's most striking result, particularly as many empirical economists seem to have skipped class on the day when leverage assessment was taught. Decades ago, Marjorie Flavin attempted some remedial education in her 1991 paper, "The Joint Consumption/Asset Demand Decision: A Case Study in Robust Estimation". She concluded that

"Compared to the conventional results, the robust instrumental variables estimates are more stable across different subsamples, more consistent with the theoretical specification of the model, and indicate that some of the most striking findings in the conventional results were attributable to a single, highly unusual observation."

Sound familiar? The non-robustness of conventional IV seems disturbingly robust, from Flavin to Young.
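For readers who did skip that class, here is a minimal sketch of the simplest leverage check: re-estimate the IV slope with each observation deleted in turn and see how far the estimate moves. This is not Young's or Flavin's procedure. The simulated data, the planted outlier, and the helper names `iv_slope` and `delete_one_iv` are all assumptions made up for illustration, and the estimator is just the textbook one-instrument, one-regressor case.

```python
import numpy as np

def iv_slope(y, x, z):
    """Just-identified IV slope with an intercept: cov(z, y) / cov(z, x)."""
    zc = z - z.mean()
    return float(zc @ (y - y.mean()) / (zc @ (x - x.mean())))

def delete_one_iv(y, x, z):
    """IV slope re-estimated n times, each time dropping one observation."""
    n = len(y)
    keep = ~np.eye(n, dtype=bool)  # row i is True everywhere except position i
    return np.array([iv_slope(y[k], x[k], z[k]) for k in keep])

# Simulated data (hypothetical): true slope is 1, x is endogenous through u.
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 0.5 * z + u + rng.normal(size=n)
y = 1.0 * x + u

# Plant a single, highly unusual observation (an assumption for illustration).
z[0], x[0], y[0] = 8.0, 4.0, 40.0

full = iv_slope(y, x, z)
loo = delete_one_iv(y, x, z)
influence = np.abs(loo - full)  # how much each deletion moves the estimate
worst = int(np.argmax(influence))

print(f"full-sample IV slope:        {full:.3f}")
print(f"most influential obs:        {worst}")
print(f"IV slope without that obs:   {loo[worst]:.3f}")
```

If one deletion moves the estimate far more than any other, the headline result is resting on that observation. With clustered data the analogous check would drop one cluster at a time.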

Flavin's paper evidently fell on deaf ears and remains unpublished. Hopefully Young's will not meet the same fate.