Sunday, February 26, 2017

Machine Learning and Econometrics V: Similarities to Time Series

[Notice that I changed the title from "Machine Learning vs. Econometrics" to "Machine Learning and Econometrics": the two are complements, not competitors, as this post will begin to emphasize. But I've kept the numbering, so this is number five.  For others click on Machine Learning at right.]

Thanks for the overwhelming response to my last post, on Angrist-Pischke (AP).  I'll have more to say on AP a few posts from now, but first I need to set the stage.

A key observation is that statistical machine learning (ML) and time-series econometrics/statistics (TS) are largely about modeling, and they share much the same foundational perspective. Some of the key ingredients are:

-- George Box got it right: "All models are wrong, but some are useful", so search for good approximating models, not "truth".

-- Be explicit about the loss function, that is, about what defines a "good approximating model" (e.g., 1-step-ahead out-of-sample mean-squared forecast error).

-- Respect and optimize that loss function in model selection (e.g., BIC).

-- Respect and optimize that loss function in estimation (e.g., least squares).

-- Respect and optimize that loss function in forecast construction (e.g., Wiener-Kolmogorov-Kalman).

-- Respect and optimize that loss function in forecast evaluation, comparison, and combination (e.g., Mincer-Zarnowitz evaluations, Diebold-Mariano comparisons, Granger-Ramanathan combinations).
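
To make the list above concrete, here is a minimal Python sketch of that workflow on simulated data. Everything in it is an illustrative assumption rather than anything from a particular paper: the AR(2) data-generating process and its coefficients, the 400/200 train/test split, the maximum lag of 6, and the helper names (simulate_ar2, lag_matrix, fit_ar, bic, dm_stat_one_step) are all hypothetical. It estimates autoregressions by least squares, selects the lag order by BIC, builds 1-step-ahead out-of-sample forecasts with parameters held fixed, and compares squared-error loss against a naive sample-mean benchmark with a Diebold-Mariano-style statistic.

```python
import numpy as np


def simulate_ar2(n, phi1=1.2, phi2=-0.5, seed=0):
    """Simulate a stationary Gaussian AR(2); the coefficients are illustrative."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    y = np.zeros(n)
    for t in range(2, n):
        y[t] = phi1 * y[t - 1] + phi2 * y[t - 2] + eps[t]
    return y


def lag_matrix(y, p, start):
    """Regressors [1, y[t-1], ..., y[t-p]] for t = start, ..., len(y) - 1.

    Requires start >= p so that every lag is available.
    """
    n = len(y)
    cols = [np.ones(n - start)]
    cols += [y[start - k:n - k] for k in range(1, p + 1)]
    return np.column_stack(cols)


def fit_ar(y, p, start):
    """Least-squares estimation of an AR(p) with intercept."""
    X = lag_matrix(y, p, start)
    target = y[start:]
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return beta, target - X @ beta


def bic(resid, n_params):
    """Gaussian BIC up to an additive constant: n*log(sigma^2) + k*log(n)."""
    n = len(resid)
    return n * np.log(np.mean(resid ** 2)) + n_params * np.log(n)


def dm_stat_one_step(e_a, e_b):
    """Diebold-Mariano-style statistic under squared-error loss.

    At horizon 1 the loss differential is treated as serially uncorrelated,
    so its plain sample variance is used. Positive values favor forecast b.
    """
    d = e_a ** 2 - e_b ** 2
    return d.mean() / np.sqrt(d.var(ddof=1) / len(d))


# Assumed setup: 600 observations, first 400 for selection and estimation.
y = simulate_ar2(600)
n_train, max_p = 400, 6
train = y[:n_train]

# Model selection: every candidate order is fit on the same sample (t >= max_p).
scores = {p: bic(fit_ar(train, p, start=max_p)[1], p + 1) for p in range(max_p + 1)}
p_hat = min(scores, key=scores.get)

# Estimation: refit the selected order on all usable training observations.
beta, _ = fit_ar(train, p_hat, start=p_hat)

# Forecast construction: 1-step-ahead forecasts over the hold-out sample,
# using actual lagged values and the fixed training-sample parameters.
forecasts = lag_matrix(y, p_hat, start=n_train) @ beta
e_ar = y[n_train:] - forecasts
e_mean = y[n_train:] - train.mean()  # benchmark: training-sample mean

# Forecast evaluation and comparison under the stated loss function.
mse_ar = np.mean(e_ar ** 2)
mse_mean = np.mean(e_mean ** 2)
dm = dm_stat_one_step(e_mean, e_ar)

print(f"selected order: {p_hat}")
print(f"out-of-sample MSE, AR: {mse_ar:.3f}; mean benchmark: {mse_mean:.3f}")
print(f"DM-style statistic (positive favors AR): {dm:.2f}")
```

The point of the sketch is only that one loss function (squared 1-step-ahead error) runs through every stage, from selection to comparison. A serious application would need more care, for example with multi-step horizons, where the Diebold-Mariano variance needs an autocorrelation correction, and with comparisons of estimated nested models.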

So time-series econometrics should embrace ML -- and it is doing so.  Just look at recent work like this.