Monday, February 19, 2018

More on Neural Nets and ML

I earlier mentioned Matt Taddy's "The Technological Elements of Artificial Intelligence" (ungated version here).

Among other things, the paper has good perspective on the past and present of neural nets. (Read: his views mostly, if not exactly, match mine...)

Here's my personal take on some of the history vis a vis econometrics:

Econometricians lost interest in NN's in the 1990s. The celebrated Hal White et al. proof of NN non-parametric consistency as NN width (number of neurons) gets large at an appropriate rate was ultimately underwhelming, insofar as it merely established for NN's what had been known for decades for various other non-parametric estimators (kernel, series, nearest-neighbor, tree, spline, etc.). That is, it seemed that there was nothing special about NN's, so why bother?
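To fix ideas, here's a toy sketch (mine, not White's) of the width story: a single-hidden-layer network with many neurons approximates a smooth function about as well as a garden-variety nearest-neighbor estimator, which is exactly why the consistency result felt unspecial. Simulated data; scikit-learn assumed.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(x).ravel() + 0.1 * rng.standard_normal(500)

# Wide, shallow network: one hidden layer, many neurons
nn = MLPRegressor(hidden_layer_sizes=(100,), max_iter=5000, random_state=0).fit(x, y)
# An old-school nonparametric competitor
knn = KNeighborsRegressor(n_neighbors=10).fit(x, y)

grid = np.linspace(-3, 3, 200).reshape(-1, 1)
nn_err = np.mean(np.abs(nn.predict(grid) - np.sin(grid).ravel()))
knn_err = np.mean(np.abs(knn.predict(grid) - np.sin(grid).ravel()))
print(nn_err, knn_err)  # both small -- nothing special about the NN here
```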

But the non-parametric consistency focus was all on NN width; no one thought or cared much about NN depth. Then, more recently, people noticed that adding NN depth (more hidden layers) could be seriously helpful, and the "deep learning" boom took off. 

Here are some questions/observations on the new "deep learning":

1.  Adding NN depth often seems helpful, insofar as deep learning often seems to "work" in various engineering applications, but where/what are the theorems? What can be said rigorously about depth?

2. Taddy emphasizes what might be called two-step deep learning. In the first step, "pre-trained" hidden layer nodes are obtained based on unsupervised learning (e.g., principal components (PC)) from various sets of variables. And then the second step proceeds as usual. That's very similar to the age-old idea of PC regression. Or, in multivariate dynamic environments and econometrics language, "factor-augmented vector autoregression" (FAVAR), as in Bernanke et al. (2005). So, are modern implementations of deep NN's effectively just nonlinear FAVAR's? If so, doesn't that also seem underwhelming, in the sense of -- dare I say it -- there being nothing really new about deep NN's?

3. Moreover, PC regressions and FAVAR's have issues of their own relative to one-step procedures like ridge or LASSO. See this and this.
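For concreteness, here's a small simulated sketch of the two-step idea in point 2 next to a one-step shrinkage alternative from point 3. It's my own toy example (scikit-learn, simulated factor-structure data), not anything from Taddy or Bernanke et al.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression, Lasso
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
n, p, k = 400, 50, 3
factors = rng.standard_normal((n, k))            # latent factors
loadings = rng.standard_normal((k, p))
X = factors @ loadings + 0.5 * rng.standard_normal((n, p))
y = factors @ np.array([1.0, -2.0, 0.5]) + 0.2 * rng.standard_normal(n)

# Two-step: unsupervised PCs first, then regress on them (the PC-regression/FAVAR flavor)
pcr = make_pipeline(PCA(n_components=k), LinearRegression()).fit(X[:300], y[:300])
# One-step shrinkage alternative
lasso = Lasso(alpha=0.05).fit(X[:300], y[:300])

r2_pcr = pcr.score(X[300:], y[300:])
r2_lasso = lasso.score(X[300:], y[300:])
print(r2_pcr, r2_lasso)
```

With a clean factor structure both do well; the interesting comparisons arise when the factors driving X are not the ones driving y, which is where the one-step procedures have their advantage.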

Tuesday, February 13, 2018

Neural Nets, ML and AI

"The Technological Elements of Artificial Intelligence", by Matt Taddy, is packed with insight on the development of neural nets and ML as related to the broader development of AI. I have lots to say, but it will have to wait until next week. For now I just want you to have the paper. Ungated version at


We have seen in the past decade a sharp increase in the extent that companies use data to optimize their businesses. Variously called the "Big Data" or "Data Science" revolution, this has been characterized by massive amounts of data, including unstructured and nontraditional data like text and images, and the use of fast and flexible Machine Learning (ML) algorithms in analysis. With recent improvements in Deep Neural Networks (DNNs) and related methods, application of high-performance ML algorithms has become more automatic and robust to different data scenarios. That has led to the rapid rise of an Artificial Intelligence (AI) that works by combining many ML algorithms together - each targeting a straightforward prediction task - to solve complex problems.

We will define a framework for thinking about the ingredients of this new ML-driven AI.  Having an understanding of the pieces that make up these systems and how they fit together is important for those who will be building businesses around this technology. Those studying the economics of AI can use these definitions to remove ambiguity from the conversation on AI's projected productivity impacts and data requirements.  Finally, this framework should help clarify the role for AI in the practice of modern business analytics and economic measurement.

Monday, February 12, 2018

ML, Forecasting, and Market Design

Nice stuff from Milgrom and Tadelis. Improved forecasting via improved machine learning in turn helps improve our ability to design effective markets -- better anticipating consumer/producer demand/supply movements, more finely segmenting and targeting consumers/producers, more accurately setting auction reserve prices, etc. Presumably full density forecasts, not just the point forecasts on which ML tends to focus, should soon move to center stage.
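On that last point, here's a minimal sketch of one off-the-shelf route from point to density forecasts: quantile-loss gradient boosting delivering a crude predictive density as a set of conditional quantiles. Toy simulated data and scikit-learn assumed; nothing here is from Milgrom-Tadelis.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(2)
X = rng.uniform(0, 10, size=(1000, 1))
y = X.ravel() + (0.5 + 0.2 * X.ravel()) * rng.standard_normal(1000)  # heteroskedastic noise

# One model per quantile: a crude predictive density as a set of quantile forecasts
models = {q: GradientBoostingRegressor(loss="quantile", alpha=q, random_state=0).fit(X, y)
          for q in (0.1, 0.5, 0.9)}

band = lambda x0: models[0.9].predict([[x0]])[0] - models[0.1].predict([[x0]])[0]
band1, band8 = band(1.0), band(8.0)
print(band1, band8)  # the 10%-90% band widens with x, which a point forecast hides
```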

Monday, February 5, 2018

Big Data, Machine Learning, and Economic Statistics

Greetings from a very happy Philadelphia celebrating the Eagles' victory!

The following is adapted from the "background" and "purpose" statements for a planned 2019 NBER/CRIW conference, "Big Data for 21st Century Economic Statistics". Prescient and fascinating reading. (The full call for papers is here.)

Background: The coming decades will witness significant changes in the production of the social and economic statistics on which government officials, business decision makers, and private citizens rely. The statistical information currently produced by the federal statistical agencies rests primarily on “designed data” -- that is, data collected through household and business surveys. The increasing cost of fielding these surveys, the difficulty of obtaining survey responses, and questions about the reliability of some of the information collected, have raised questions about the sustainability of that model. At the same time, the potential for using “big data” -- very large data sets built to meet governments’ and businesses’ administrative and operational needs rather than for statistical purposes -- in the production of official statistics has grown.

These naturally-occurring data include not only administrative data maintained by government agencies but also scanner data, data scraped from the Web, credit card company records, data maintained by payroll providers, medical records, insurance company records, sensor data, and the Internet of Things. If the challenges associated with their use can be satisfactorily resolved, these emerging sorts of data could allow the statistical agencies not only to supplement or replace the survey data on which they currently depend, but also to introduce new statistics that are more granular, more up-to-date, and of higher quality than those currently being produced.

Purpose: The purpose of this conference is to provide a forum where economists, data providers, and data analysts can meet to present research on the use of big data in the production of federal social and economic statistics. Among other things, this involves discussing (1) Methods for combining multiple data sources, whether they be carefully designed surveys or experiments, large government administrative datasets, or private sector big data, to produce economic and social statistics; (2) Case studies illustrating how big data can be used to improve or replace existing statistical data series or create new statistical data series; (3) Best practices for characterizing the quality of big data sources and blended estimates constructed using data from multiple sources.

Monday, January 29, 2018

Structural VAR Analysis

Kilian and Lutkepohl's Structural Vector Autoregressive Analysis is now out. The back-cover blurbs below are not hyperbole. Indeed Harald Uhlig's is an understatement in certain respects -- to his list of important modern topics covered I would certainly add the "external instrument" approach. For more on that, beyond K&L, which went to press some time ago, see Stock and Watson's masterful 2018 external-instrument survey and extension, just now released as an NBER working paper. (Ungated K&L draft here; ungated S&W draft here.)

Sunday, January 21, 2018

Averaging for Prediction in Econometrics and ML

Random thought. At the risk of belaboring the obvious, it's interesting to heighten collective awareness by thinking about the many appearances of averaging in forecasting, particularly in forecast combination. Some averages are weighted, and some are not. Most are linear, some are not.
  • The "equal weights puzzle" in forecast combination 
  • Random forests, and ensemble averaging algorithms more generally
  • Bootstrap aggregation ("bagging") 
  • Boosting 
  • Best subset averaging
  • Survey averages
  • k-nearest-neighbor forecasts
  • Amisano-Geweke equally-weighted prediction pools
  • "1/N" portfolios
  • Bayesian model averaging
  • Bates-Granger-Ramanathan frequentist model averaging
  • Any forecasts extracted from markets (the ultimate information aggregator), ranging from "standard" markets (e.g., volatility forecasts extracted from options prices, interest rate forecasts extracted from the current yield curve, etc.), to explicit so-called "prediction markets" (e.g., sports betting markets).
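A trivial simulated illustration of why averaging keeps showing up (my own toy, with two independently noisy forecasts of the same target):

```python
import numpy as np

rng = np.random.default_rng(3)
target = rng.standard_normal(10_000)
f1 = target + 0.8 * rng.standard_normal(10_000)   # noisy forecast 1
f2 = target + 0.8 * rng.standard_normal(10_000)   # independently noisy forecast 2
combo = 0.5 * (f1 + f2)                           # simple unweighted "1/N" average

mse = lambda f: np.mean((f - target) ** 2)
print(mse(f1), mse(f2), mse(combo))  # averaging roughly halves the error variance
```

When the forecasts' errors are correlated or their variances differ, estimated (Bates-Granger-Ramanathan style) weights can beat 1/N in population -- the "equal weights puzzle" is that in practice they so often don't.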

Sunday, January 14, 2018

Comparing Interval Forecasts

Here's a new one, "On the Comparison of Interval Forecasts".  

You'd think that interval forecast evaluation would be easy. After all, point forecast evaluation is (more or less) well understood and easy, and density forecast evaluation is (more or less) well understood and easy, and interval forecasts seem somewhere in between, so by some sort of continuity argument you'd think that their evaluation would also be well understood and easy. But no. In fact it's quite difficult, maybe impossible...
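The unconditional-coverage part is indeed easy, as in this toy simulated check (mine, not from the paper); the hard part the paper wrestles with is ranking competing intervals against each other, not checking a single one.

```python
import numpy as np

rng = np.random.default_rng(4)
y = rng.standard_normal(5000)                      # realized outcomes, truly N(0,1)

z = 1.645                                          # nominal 90% two-sided interval
cov_good = np.mean((y >= -z) & (y <= z))           # empirical coverage: near 0.90
cov_narrow = np.mean((y >= -1.0) & (y <= 1.0))     # too-narrow interval: near 0.68
print(cov_good, cov_narrow)
```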

Monday, January 8, 2018

Yield-Curve Modeling

Happy New Year to all!

Riccardo Rebonato's Bond Pricing and Yield-Curve Modeling: A Structural Approach will soon appear from Cambridge University Press. It's very well done -- a fine blend of  theory, empirics, market sense, and good prose.  And not least, endearing humility, well-captured by a memorable sentence from the acknowledgements: "My eight-year-old son has forgiven me, I hope, for not playing with him as much as I would have otherwise; perhaps he has been so understanding because he has had a chance to build a few thousand paper planes with the earlier drafts of this book."

TOC below.  Pre-order here


Acknowledgements page ix
Symbols and Abbreviations xi

Part I The Foundations
1 What This Book Is About 3
2 Definitions, Notation and a Few Mathematical Results 24
3 Links among Models, Monetary Policy and the Macroeconomy 49
4 Bonds: Their Risks and Their Compensations 63
5 The Risk Factors in Action 81
6 Principal Components: Theory 98
7 Principal Components: Empirical Results 108

Part II The Building Blocks: A First Look
8 Expectations 137
9 Convexity: A First Look 147
10 A Preview: A First Look at the Vasicek Model 160

Part III The Conditions of No-Arbitrage
11 No-Arbitrage in Discrete Time 185
12 No-Arbitrage in Continuous Time 196
13 No-Arbitrage with State Price Deflators 206
14 No-Arbitrage Conditions for Real Bonds 224
15 The Links with an Economics-Based Description of Rates 241

Part IV Solving the Models
16 Solving Affine Models: The Vasicek Case 263
17 First Extensions 285
18 A General Pricing Framework 299
19 The Shadow Rate: Dealing with a Near-Zero Lower Bound 329

Part V The Value of Convexity
20 The Value of Convexity 351
21 A Model-Independent Approach to Valuing Convexity 371
22 Convexity: Empirical Results 391

Part VI Excess Returns
23 Excess Returns: Setting the Scene 415
24 Risk Premia, the Market Price of Risk and Expected Excess Returns 431
25 Excess Returns: Empirical Results 449
26 Excess Returns: The Recent Literature – I 473
27 Excess Returns: The Recent Literature – II 497
28 Why Is the Slope a Good Predictor? 527
29 The Spanning Problem Revisited 547

Part VII What the Models Tell Us
30 The Doubly Mean-Reverting Vasicek Model 559
31 Real Yields, Nominal Yields and Inflation: The D’Amico–Kim–Wei Model 575
32 From Snapshots to Structural Models: The Diebold–Rudebusch Approach 602
33 Principal Components as State Variables of Affine Models: The PCA Affine Approach 618
34 Generalizations: The Adrian–Crump–Moench Model 663
35 An Affine, Stochastic-Market-Price-of-Risk Model 688

36 Conclusions 714

Bibliography 725

Index 000

Monday, December 18, 2017

Holiday Haze

Your blogger is about to vanish, returning in the new year. Meanwhile, all best wishes for the holidays, and many thanks for your wonderful support. If you're in Philadelphia for the January meetings, please come to the Penn reception (joint Economics, Finance, etc.), Friday, 6-8:30, Center for Architecture and Design, 1218 Arch Street.

[Photo credit: Public domain, by Marcus Quigmire, from Florida, USA (Happy Holidays, uploaded by Princess Mérida) [CC-BY-SA-2.0], via Wikimedia Commons]

Sunday, December 10, 2017

More on the Problem with Bayesian Model Averaging

I blogged earlier on a problem with Bayesian model averaging (BMA) and gave some links to new work that chips away at it. The interesting thing about that new work is that it stays very close to traditional BMA while acknowledging that all models are misspecified.

But there are also other Bayesian approaches to combining density forecasts, such as prediction pools formed to optimize a predictive score. (See, e.g., Amisano and Geweke (2017) and the references therein. Ungated final draft, and code, here.)
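Here's a stripped-down sketch of a log-score-optimal two-model pool in that spirit, with simulated data and scipy's scalar optimizer standing in for the actual Amisano-Geweke setup.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

rng = np.random.default_rng(5)
y = rng.standard_normal(2000)              # realized outcomes

# Two candidate predictive densities: one well calibrated, one badly off
pdf_a = norm.pdf(y, loc=0.0, scale=1.0)
pdf_b = norm.pdf(y, loc=2.0, scale=2.0)

# Choose the weight on model A to maximize the average log score of the pool
neg_score = lambda w: -np.mean(np.log(w * pdf_a + (1.0 - w) * pdf_b))
w_star = minimize_scalar(neg_score, bounds=(0.0, 1.0), method="bounded").x
print(w_star)  # nearly all weight on the well-calibrated model
```

Note that the optimal pool can put positive weight on a model that would lose every pairwise comparison, which is part of what makes pooling different from model selection.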

Another relevant strand of new work, less familiar to econometricians, is "Bayesian predictive synthesis" (BPS), which builds on the expert opinions analysis literature. The framework, which traces to Lindley et al. (1979), concerns a Bayesian faced with multiple priors coming from multiple experts, and explores how to get a posterior distribution utilizing all of the information available. Earlier work by Genest and Schervish (1985) and West and Crosse (1992) develops the basic theory, and new work (McAlinn and West, 2017) extends it to density forecast combination.

Thanks to Ken McAlinn for reminding me about BPS. Mike West gave a nice presentation at the FRBSL forecasting meeting. [Parts of this post are adapted from private correspondence with Ken.]