Thursday, July 19, 2018

Machine Learning, Volatility, and the Interface

Just got back from the NBER Summer Institute. Lots of good stuff happening in the Forecasting and Empirical Methods group. The program, with links to papers, is here.

Lots of room for extensions too. Here's a great example. Consider the interface of the Gu-Kelly-Xiu and Bollerslev-Patton-Quaedvlieg papers. At first you might think that there is no interface.

Gu-Kelly-Xiu is about using off-the-shelf machine-learning methods to model risk premia in financial markets; that is, to construct portfolios that deliver superior performance. (I had guessed they'd get nothing, but I was massively wrong.) Bollerslev et al. is about predicting realized covariance by exploiting info on past signs (e.g., was yesterday's return cross-product pos-pos, neg-neg, pos-neg, or neg-pos?). (They also get tremendous results.)
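To make the sign-conditioning idea concrete, here's a minimal numpy sketch -- synthetic returns and a one-split rule that are purely illustrative, not the actual Bollerslev et al. estimator. It buckets each day's return pair by its sign quadrant and forecasts tomorrow's cross-product with the historical mean of that bucket:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily returns for two assets (placeholder for real data)
T = 2000
r1 = rng.standard_normal(T)
r2 = 0.5 * r1 + rng.standard_normal(T)

# Realized-covariance proxy: the daily cross-product r1*r2
cp = r1 * r2

# Sign quadrant of each day's return pair
quadrant = 2 * (r1 > 0) + (r2 > 0)  # 0=neg-neg, 1=neg-pos, 2=pos-neg, 3=pos-pos

# Forecast tomorrow's cross-product with the historical mean within
# today's sign quadrant -- a single sign-conditioned split
means = np.array([cp[1:][quadrant[:-1] == q].mean() for q in range(4)])
forecast = means[quadrant[:-1]]  # forecasts for days 1..T-1

# Compare to an unconditional-mean benchmark
mse_cond = np.mean((cp[1:] - forecast) ** 2)
mse_uncond = np.mean((cp[1:] - cp[1:].mean()) ** 2)
```

On real high-frequency data the conditioning would be richer (full covariance matrices, multiple lags), but the mechanics are just this: split on signs, then forecast within the split.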

But there's actually a big interface.

Note that Gu-Kelly-Xiu is about conditional mean dynamics -- uncovering the determinants of expected excess returns. You might expect even better results for derivative assets, as the volatility dynamics that drive options prices may be nonlinear in ways missed by standard volatility models. And that's exactly the flavor of the Bollerslev et al. results -- they find that a tree structure conditioning on sign is massively successful.

But Bollerslev et al. don't do any machine learning. Instead they basically stumble upon their result, guided by their fine intuition. So here's a fascinating issue to explore: Hit the Bollerslev et al. realized covariance data with machine learning (in particular, tree methods like random forests) and see what happens. Does it "discover" the Bollerslev et al. result? If not, why not, and what does it discover? Does it improve upon Bollerslev et al.?