Wednesday, September 19, 2018

Wonderful Network Connectedness Piece

Very cool NYT graphics summarizing U.S. Facebook network connectedness.  Check it out:
https://www.nytimes.com/interactive/2018/09/19/upshot/facebook-county-friendships.html?action=click&module=In%20Other%20News&pgtype=Homepage&action=click&module=News&pgtype=Homepage

They get the same result that Kamil Yilmaz and I have gotten for years in our analyses of economic and financial network connectedness:  There is a strong "gravity effect" -- that is, even in the electronic age, physical proximity is the key ingredient to network relationships. See for example:

Maybe not as surprising for Facebook friends as for financial institutions (say).  But still... 

Sunday, September 16, 2018

Banque de France’s Open Data Room

See below for announcement of a useful new product from the Bank of France and its Representative Office in New York. 
Banque de France has been for many years at the forefront of disseminating statistical data to academics and other interested parties. Through Banque de France’s dedicated public portal http://webstat.banque-france.fr/en/, we offer a large set of free downloadable series (about 40 000 mainly aggregated series).

Banque de France has expanded the service further, launching an “Open Data Room” in Paris in November 2016, which provides researchers with free access to granular data.
We are glad to announce that the “Open Data Room” service is now also available to US researchers through Banque de France's Representative Office in New York City.

Saturday, September 15, 2018

An Open Letter to Tren Griffin

[I tried quite hard to email this privately. I post it here only because Griffin has, as far as I can tell, been very successful in scrubbing his email address from the web. Please forward it to him if you can figure out how.]

Mr. Griffin:

A colleague forwarded me your post, https://25iq.com/2018/09/08/risk-uncertainty-and-ignorance-in-investing-and-business-lessons-from-richard-zeckhauser/.  I enjoyed it, and Zeckhauser definitely deserves everyone's highest praise. 

However your post misses the bigger picture.  Diebold, Doherty, and Herring conceptualized and promoted the "Known, Unknown, Unknowable" (KuU) framework for financial risk management, which runs blatantly throughout your twelve "Lessons From Richard Zeckhauser".  Indeed the key Zeckhauser article on which you draw appeared in our book, "The Known, the Unknown and the Unknowable in Financial Risk Management", https://press.princeton.edu/titles/9223.html, which we also conceptualized, and for which we solicited the papers and authors, mentored them as regards integrating their thoughts into the KuU framework, etc.  The book was published almost a decade ago by Princeton University Press. 

I say all this not only to reveal my surprise and annoyance at your apparent unawareness, but also, and more constructively, because you and your readers may be interested in our KuU book, which has many other interesting parts (great as the Zeckhauser part may be), and which, moreover, is more than the sum of its parts. A pdf of the first chapter has been available for many years at http://assets.press.princeton.edu/chapters/s9223.pdf.

Sincerely,

Friday, September 14, 2018

Machine Learning for Forecast Combination

How could I have forgotten to announce my latest paper, "Machine Learning for Regularized Survey Forecast Combination: Partially-Egalitarian Lasso and its Derivatives"? (Actually a heavily-revised version of an earlier paper, including a new title.) Came out as an NBER w.p. a week or two ago.

Monday, September 10, 2018

Interesting Papers of the Moment

Missing Events in Event Studies: Identifying the Effects of Partially-Measured News Surprises
by Refet S. Guerkaynak, Burcin Kisacikoglu, Jonathan H. Wright #25016 (AP ME)
http://papers.nber.org/papers/w25016?utm_campaign=ntw&utm_medium=email&utm_source=ntw


Colacito, Ric, Bridget Hoffmann, and Toan Phan (2018), “Temperature and growth: A panel analysis of the United States,”
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2546456

Do You Know That I Know That You Know...? Higher-Order Beliefs in Survey Data
by Olivier Coibion, Yuriy Gorodnichenko, Saten Kumar, Jane Ryngaert #24987 (EFG ME)
http://papers.nber.org/papers/w24987?utm_campaign=ntw&utm_medium=email&utm_source=ntw

Wednesday, September 5, 2018

Google's New Dataset Search Tool

Check out Google's new Dataset Search.

Here's the description they issued today:

Making it easier to discover datasets
Natasha Noy
Research Scientist, Google AI
Published Sep 5, 2018

In today's world, scientists in many disciplines and a growing number of journalists live and breathe data. There are many thousands of data repositories on the web, providing access to millions of datasets; and local and national governments around the world publish their data as well. To enable easy access to this data, we launched Dataset Search, so that scientists, data journalists, data geeks, or anyone else can find the data required for their work and their stories, or simply to satisfy their intellectual curiosity.

Similar to how Google Scholar works, Dataset Search lets you find datasets wherever they’re hosted, whether it’s a publisher's site, a digital library, or an author's personal web page. To create Dataset Search, we developed guidelines for dataset providers to describe their data in a way that Google (and other search engines) can better understand the content of their pages. These guidelines include salient information about datasets: who created the dataset, when it was published, how the data was collected, what the terms are for using the data, etc. We then collect and link this information, analyze where different versions of the same dataset might be, and find publications that may be describing or discussing the dataset. Our approach is based on an open standard for describing this information (schema.org) and anybody who publishes data can describe their dataset this way. We encourage dataset providers, large and small, to adopt this common standard so that all datasets are part of this robust ecosystem.

In this new release, you can find references to most datasets in environmental and social sciences, as well as data from other disciplines including government data and data provided by news organizations, such as ProPublica. As more data repositories use the schema.org standard to describe their datasets, the variety and coverage of datasets that users will find in Dataset Search will continue to grow.

Dataset Search works in multiple languages with support for additional languages coming soon. Simply enter what you are looking for and we will help guide you to the published dataset on the repository provider’s site.

For example, if you wanted to analyze daily weather records, you might try this query in Dataset Search:

[daily weather records]

You’ll see data from NASA and NOAA, as well as from academic repositories such as Harvard's Dataverse and Inter-university Consortium for Political and Social Research (ICPSR). Ed Kearns, Chief Data Officer at NOAA, is a strong supporter of this project and helped NOAA make many of their datasets searchable in this tool. “This type of search has long been the dream for many researchers in the open data and science communities” he said. “And for NOAA, whose mission includes the sharing of our data with others, this tool is key to making our data more accessible to an even wider community of users.”

This launch is one of a series of initiatives to bring datasets more prominently into our products. We recently made it easier to discover tabular data in Search, which uses this same metadata along with the linked tabular data to provide answers to queries directly in search results. While that initiative focused more on news organizations and data journalists, Dataset Search can be useful to a much broader audience, whether you're looking for scientific data, government data, or data provided by news organizations.

A search tool like this one is only as good as the metadata that data publishers are willing to provide. We hope to see many of you use the open standards to describe your data, enabling our users to find the data that they are looking for. If you publish data and don't see it in the results, visit our instructions on our developers site which also includes a link to ask questions and provide feedback.

Monday, September 3, 2018

The Coming Storm

The role of time-series statistics / econometrics in climate analyses is expanding (e.g., here).  Related -- albeit focusing on shorter-term meteorological aspects rather than longer-term climatological aspects -- it's worth listening to Michael Lewis' latest, The Coming Storm.  (You have to listen rather than read, as it's only available as an audiobook, but it's only about two hours.)  It's a fascinating story, well researched and well told by Lewis, just as you'd expect.  There are lots of interesting insights on (1) the collection, use, and abuse of public weather data, including ongoing, ethically-dubious, and potentially life-destroying attempts to privatize public weather data for private gain, (2) the clear and massive improvements in weather forecasting in recent decades, and (3) behavioral aspects of how best to communicate forecasts so people understand them, believe them, and take appropriate action before disaster strikes. 

Monday, August 27, 2018

Long Memory / Scaling Laws in Return Volatility

The 25-year accumulation of evidence for long memory / fractional integration / self-similarity / scaling laws in financial asset return volatility continues unabated.  For the latest see this nice new paper from the Bank of Portugal, in particular its key Table 6. Of course the interval estimates of the fractional integration parameter "d" are massively far from both 0 and 1 -- that's the well-known long memory. But what's new and interesting is the systematic difference in the intervals depending on whether one uses absolute or range-based volatility. The absolute d intervals tend to be completely below 1/2 (0<d<1/2 corresponds to covariance-stationary dynamics), whereas the range-based d intervals tend to include 1/2 (1/2<d<1 corresponds to mean-reverting but not covariance-stationary dynamics, due to infinite unconditional variance). 

Realized vol based on the range is less noisy than realized vol based on absolute returns. But least noisy of all, and not considered in the paper above, is realized vol calculated directly from high-frequency return data (HFD-vol), as done by numerous authors in recent decades. Interestingly, recent work for HFD-vol also reports d intervals that tend to poke above 1/2. See this earlier post.
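
If you want to replicate the flavor of such d estimates yourself, here is a minimal, purely illustrative sketch of the classic GPH (Geweke-Porter-Hudak) log-periodogram regression -- not the estimator used in the Bank of Portugal paper -- applied to a toy volatility proxy. The bandwidth choice (m = T^0.5) and the simulated input are assumptions for illustration only.

```python
import numpy as np

def gph_d(x, bandwidth_power=0.5):
    """GPH log-periodogram estimate of the long-memory parameter d."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    m = int(T ** bandwidth_power)                      # number of low frequencies used
    freqs = 2 * np.pi * np.arange(1, m + 1) / T
    t = np.arange(T)
    # periodogram at the first m Fourier frequencies
    I = np.array([np.abs(np.sum((x - x.mean()) * np.exp(-1j * f * t))) ** 2 / (2 * np.pi * T)
                  for f in freqs])
    # regress the log-periodogram on -2*log(2*sin(freq/2)); the slope is d
    X = np.column_stack([np.ones(m), -2.0 * np.log(2.0 * np.sin(freqs / 2))])
    beta, *_ = np.linalg.lstsq(X, np.log(I), rcond=None)
    d_hat = beta[1]
    se = np.pi / np.sqrt(24 * m)                       # asymptotic standard error
    return d_hat, se

# toy input: a placeholder volatility proxy; in practice feed in daily absolute or range-based vol
rng = np.random.default_rng(0)
vol_proxy = np.abs(rng.standard_normal(5000) * np.exp(0.005 * np.cumsum(rng.standard_normal(5000))))
d_hat, se = gph_d(np.log(vol_proxy + 1e-12))
print(f"d_hat = {d_hat:.3f}, rough 95% interval [{d_hat - 2*se:.3f}, {d_hat + 2*se:.3f}]")
```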

Monday, August 20, 2018

More on the New U.S. GDP Series

BEA's new publishing of NSA GDP is a massive step forward. Now it should take one more step, if it insists on continuing to publish SA GDP.

Publishing only indirect SA GDP ("adjust the components and add them up") lends it an undeserved "official" stamp of credibility, so BEA should also publish a complementary official direct SA GDP ("adjust the aggregate directly"), which is now possible. 

This is a really big deal. Real GDP is undoubtedly the most important data series in all of macroeconomics, and indirect vs. direct SA GDP growth estimates can differ greatly. Their average absolute deviation is about one percent, and average real GDP growth itself is only about two percent! And which series you use has large implications for important issues, such as the widely-discussed puzzle of weak first-quarter growth (see Rudebusch et al.), among other things.

How do we know all this about properties of indirect vs. direct SA GDP growth estimates, since BEA doesn't provide direct SA GDP? You can now take the newly-provided NSA GDP and directly adjust it yourself. See Jonathan Wright's wonderful new paper. (Ungated version here.)
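
To see the indirect-vs-direct distinction in miniature, here is a toy sketch with simulated quarterly "components," using STL as a crude stand-in for X-13 (nothing here is BEA's actual procedure, and all series are made up). The only point is that adjusting the components and adding them up generally differs from adjusting the aggregate directly.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL

rng = np.random.default_rng(1)
idx = pd.period_range("2002Q1", periods=80, freq="Q")

def make_component(seasonal_amp):
    trend = np.linspace(100, 180, 80)
    seasonal = seasonal_amp * np.tile([1.0, -0.3, 0.4, -1.1], 20)
    noise = rng.standard_normal(80)
    return pd.Series(trend + seasonal + noise, index=idx)

# two NSA "components" (think consumption and investment), plus their NSA aggregate
c = make_component(4.0)
i = make_component(9.0)
gdp_nsa = c + i

def sa(x):
    # STL seasonal adjustment, a crude stand-in for the official X-13 procedure
    res = STL(x.to_timestamp(), period=4, robust=True).fit()
    return pd.Series(x.values - res.seasonal.values, index=x.index)

indirect_sa = sa(c) + sa(i)     # adjust the components, then add them up
direct_sa = sa(gdp_nsa)         # adjust the aggregate directly

gap = indirect_sa - direct_sa
print("mean absolute indirect-vs-direct gap:", round(gap.abs().mean(), 2))
```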

Of course direct SA has many issues of its own.  Ultimately significant parts of both direct and indirect SA GDP are likely spurious artifacts of various direct vs. indirect SA assumptions / methods. 

So another, more radical, idea, is simply to stop publishing SA GDP in any form, instead publishing only NSA GDP (and its NSA components). Sound crazy? Why, exactly? Are official government attempts to define and remove "seasonality" any less dubious than, say, official attempts to define and remove "trend"? (The latter is, mercifully, not attempted...)

Tuesday, August 7, 2018

Factor Model w Time-Varying Loadings

Markus Pelger has a nice paper on factor modeling with time-varying loadings in high dimensions. There are many possible applications. He applies it to level-slope-curvature yield-curve models. 

For me another really interesting application would be measuring connectedness in financial markets, as a way of tracking systemic risk. The Diebold-Yilmaz (DY) connectedness framework is based on a high-dimensional VAR with time-varying coefficients, but not factor structure. An obvious alternative in financial markets, which we used to discuss a lot but never pursued, is factor structure with time-varying loadings, exactly in Pelger! 

It would seem, however, that any reasonable connectedness measure in a factor environment would need to be based not only on time-varying loadings but also on time-varying idiosyncratic shock variances, or more precisely a time-varying noise/signal ratio (e.g., in a 1-factor model, the ratio of the idiosyncratic shock variance to the factor innovation variance). That is, connectedness in factor environments is driven by BOTH the size of the loadings on the factor(s) AND the amount of variation in the data explained by the factor(s). Time-varying loadings don't really change anything if the factors are swamped by massive noise. 

Typically one might fix the factor innovation variance for identification, but allow for time-varying idiosyncratic shock variance in addition to time-varying factor loadings. It seems that Pelger's framework does allow for that. Crudely, and continuing the 1-factor example, consider y_t  =  lambda_t  f_t  +  e_t. His methods deliver estimates of the time series of loadings lambda_t and factor f_t, robust to heteroskedasticity in the idiosyncratic shock e_t. Then in a second step one could back out an estimate of the time series of e_t and fit a volatility model to it. Then the entire system would be estimated, and one could calculate connectedness measures based, for example, on variance decompositions as in the DY framework.
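
Here is a stylized sketch of that second step -- not Pelger's estimator, and with the true loadings standing in for first-step estimates -- in which the factor innovation variance is normalized to one for identification, the idiosyncratic residual is backed out, a simple EWMA volatility model is fit to it, and the factor's share of conditional variance is tracked as a crude connectedness-style measure.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 1000

# simulate a one-factor model y_t = lambda_t * f_t + e_t with time-varying pieces
lam = 1.0 + 0.5 * np.sin(np.linspace(0, 6 * np.pi, T))        # time-varying loading
f = rng.standard_normal(T)                                    # factor innovations (unit variance)
sig_e = np.exp(0.5 * np.sin(np.linspace(0, 4 * np.pi, T)))    # time-varying idiosyncratic vol
e = sig_e * rng.standard_normal(T)
y = lam * f + e

# pretend lam_hat and f_hat came from a first-step estimator (here: the truth);
# back out the idiosyncratic residual and fit a simple EWMA volatility model to it
lam_hat, f_hat = lam, f
e_hat = y - lam_hat * f_hat
ewma_var = np.empty(T)
ewma_var[0] = e_hat[:50].var()
for t in range(1, T):
    ewma_var[t] = 0.94 * ewma_var[t - 1] + 0.06 * e_hat[t - 1] ** 2

# connectedness-style measure: share of conditional variance due to the common factor
factor_share = (lam_hat ** 2) / (lam_hat ** 2 + ewma_var)
print("factor variance share: min %.2f, mean %.2f, max %.2f"
      % (factor_share.min(), factor_share.mean(), factor_share.max()))
```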

Monday, July 30, 2018

NSA GDP is Finally Here

Non-seasonally-adjusted U.S. GDP is finally here, as of a few days ago. See this BEA slide deck, pp. 10, 14-15. For background see my earlier post here. Also click on "Seasonality" under "Browse by Topic" on the right. 

The deck also covers lots of other aspects of the BEA's important "2018 Comprehensive Update of the National Income and Product Accounts". The whole deck is fascinating, and surely worth a close examination.

Tuesday, July 24, 2018

Gu-Kelly-Xiu and Neural Nets in Economics

I'm on record as being largely unimpressed by the contributions of neural nets (NN's) in economics thus far. In many economic environments the relevant non-linearities seem too weak and the signal/noise ratios too low for NN's to contribute much. 

The Gu-Kelly-Xiu paper that I mentioned earlier may change that. I mentioned their success in applying machine-learning methods to forecast equity risk premia out of sample. NN's, in particular, really shine. The paper is thoroughly and meticulously done. 

This is potentially a really big deal.

Friday, July 20, 2018

Remembering Peter Christoffersen

This is adapted from remarks read at a memorial service earlier this week:

I'm sad not to be able to be here in person, and I'm grateful to Peter Pauly for kindly agreeing to read these academically-focused remarks. His reading is unusually wonderful and appropriate, as he played a key role in my Ph.D. training, which means that if Peter Christoffersen was my student, he was also Peter Pauly's "grandstudent". For all that they taught me, I am immensely grateful both to Peter Pauly in early years, and to Peter Christoffersen in later years. I am also grateful to Peter Pauly for another reason -- he was the dean who wisely hired the Christoffersens!

I have been fortunate to have had many wonderful students in various cohorts, but Peter's broad cohort was surely the best: Peter of course, plus (in alphabetical order) Sassan Alizadeh, Filippo Altissimo, Jeremy Berkowitz, Michael Binder, Marcelle Chauvet, Lorenzo Giorgiani, Frank Gong, Atsushi Inoue, Lutz Kilian, Jose Lopez, Anthony Tay, and several others.

The Penn econometrics faculty around that time was similarly strong: Valentina Corradi, Jin Hahn, Bobby Mariano, and eventually Frank Schorfheide, with lots of additional macro-econometrics input from Lee Ohanian and financial econometrics input from Michael Brandt. Hashem Pesaran also visited Penn for a year around then. Peter was well known by all the faculty, not just the econometricians. I recall that the macroeconomists were very disappointed to lose him to econometrics!

Everyone knows Peter's classic 1998 "Evaluating Interval Forecasts" paper, which was part of his Penn dissertation. He uncovered the right notion of the "residual" for a (1-a) x 100% interval forecast, and showed that if all is well then it must be iid Bernoulli(1-a). The paper is one of the International Economic Review's ten most cited papers since its founding in 1960.

Peter and I wrote several papers together, which I consider among my very best, thanks to Peter's lifting me to higher-than-usual levels. They most definitely include our Econometric Theory paper on optimal prediction under asymmetric loss, and our Journal of Business and Economic Statistics paper on multivariate forecast evaluation.

Peter's research style was marked by a wonderful blend of intuition, theoretical rigor, and always, empirical relevance, which took him to heights that few others could reach. And his personality, which simply radiated positivity, made him not only a wonderful person to talk soccer or ski with, but the best imaginable person to talk research with.

Peter was also exceedingly generous and effective with his time as regards teaching & executive education, public service, conference organization, and more. We used to talk a lot about dynamic volatility models, and their use and abuse in financial risk management. His eventual and now well-known textbook on the topic trained legions of students. He and I were the inaugural speakers at the annual summer school of the Society for Financial Econometrics (SoFiE), that year at Oxford University, where we had a wonderful week lecturing together. He served effectively on many committees, including the U.S. Federal Reserve System's Model Validation Committee, charged with reviewing the models used for bank stress testing. He generously hosted the large annual SoFiE meeting in Toronto, several legendary "ski conferences" at Mont Tremblant, and more. The list goes on and on.

We lost a fine researcher and a fine person, much too soon. One can't begin to imagine what he might have contributed during the next twenty years. But this much is certain: his legacy lives on, and it shines exceptionally brightly. Rest in peace, my friend.

Thursday, July 19, 2018

Machine Learning, Volatility, and the Interface

Just got back from the NBER Summer Institute. Lots of good stuff happening in the Forecasting and Empirical Methods group. The program, with links to papers, is here.

Lots of room for extensions too. Here's a great example. Consider the interface of the Gu-Kelly-Xiu and Bollerslev-Patton-Quaedvlieg papers. At first you might think that there is no interface. 

Kelly-Xiu is about using off-the-shelf machine-learning methods to model risk premia in financial markets; that is, to construct portfolios that deliver superior performance. (I had guessed they'd get nothing, but I was massively wrong.) Bollerslev et al. is about predicting realized covariance by exploiting info on past signs (e.g., was yesterday's covariance cross-product pos-pos, neg-neg, pos-neg, or neg-pos?). (They also get tremendous results.)

But there's actually a big interface.

Note that Kelly-Xiu is about conditional mean dynamics -- uncovering the determinants of expected excess returns. You might expect even better results for derivative assets, as the volatility dynamics that drive options prices may be nonlinear in ways missed by standard volatility models. And that's exactly the flavor of the Bollerslev et al. results -- they find that a tree structure conditioning on sign is massively successful.

But Bollerslev et al. don't do any machine learning. Instead they basically stumble upon their result, guided by their fine intuition. So here's a fascinating issue to explore: Hit the Bollerslev et al. realized covariance data with machine learning (in particular, tree methods like random forests) and see what happens. Does it "discover" the Bollerslev et al. result? If not, why not, and what does it discover? Does it improve upon Bollerslev et al.?
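
As a toy illustration of what that exercise might look like -- on simulated data, not the Bollerslev et al. realized covariance data, and with purely illustrative settings -- here is a random forest fed lagged cross-products and lagged signs:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
T = 3000

# simulated daily return pair with persistent, sign-dependent covariance (purely illustrative)
rho = np.empty(T)
rho[0] = 0.3
r = np.zeros((T, 2))
for t in range(1, T):
    # covariance persistence that depends on the sign pattern of yesterday's returns
    bump = 0.15 if (r[t - 1, 0] * r[t - 1, 1]) > 0 else -0.15
    rho[t] = np.clip(0.9 * rho[t - 1] + bump + 0.05 * rng.standard_normal(), -0.95, 0.95)
    cov = np.array([[1.0, rho[t]], [rho[t], 1.0]])
    r[t] = rng.multivariate_normal([0.0, 0.0], cov)

cross = r[:, 0] * r[:, 1]                       # crude realized "covariance" proxy
X = np.column_stack([cross[:-1],                # lagged cross-product
                     np.sign(r[:-1, 0]),        # lagged signs, in the spirit of the sign-pattern idea
                     np.sign(r[:-1, 1]),
                     np.abs(r[:-1, 0]),
                     np.abs(r[:-1, 1])])
y = cross[1:]

split = 2000
rf = RandomForestRegressor(n_estimators=300, min_samples_leaf=20, random_state=0)
rf.fit(X[:split], y[:split])
print("out-of-sample R^2:", round(rf.score(X[split:], y[split:]), 3))
print("feature importances:", np.round(rf.feature_importances_, 3))
```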

Thursday, July 5, 2018

Climate Change and NYU Volatility Institute

There is little doubt that climate change -- tracking, assessment, and hopefully its eventual mitigation -- is the burning issue of our times. Perhaps surprisingly, time-series econometric methods have much to offer for weather and climatological modeling (e.g., here), and several econometric groups in the UK, Denmark, and elsewhere have been pushing the agenda forward.

Now the NYU Volatility Institute is firmly on board. A couple months ago I was at their most recent annual conference, "A Financial Approach to Climate Risk", but it somehow fell through the proverbial (blogging) cracks. The program is here, with links to many papers, slides, and videos. Two highlights, among many, were the presentations by Jim Stock (insights on the climate debate gleaned from econometric tools, slides here) and Bob Litterman (an asset-pricing perspective on the social cost of climate change, paper here). A fine initiative!

Monday, June 25, 2018

Peter Christoffersen and Forecast Evaluation

For obvious reasons Peter Christoffersen has been on my mind. Here's an example of how his influence extended in important ways. Hopefully it's also an entertaining and revealing story.

Everyone knows Peter's classic 1998 "Evaluating Interval Forecasts" paper, which was part of his Penn dissertation. The key insight was that correct conditional calibration requires not only that the 0-1 "hit sequence" of course have the right mean (\(1-\alpha\), for a nominal \((1-\alpha) \times 100\%\) interval), but also that it be iid (assuming 1-step-ahead forecasts). More precisely, it must be iid Bernoulli(\(1-\alpha\)).

Around the same time I naturally became interested in going all the way to density forecasts and managed to get some more students interested (Todd Gunther and Anthony Tay). Initially it seemed hopeless, as correct density forecast conditional calibration requires correct conditional calibration of all possible intervals that could be constructed from the density, of which there are uncountably infinitely many.

Then it hit us. Peter had effectively found the right notion of an optimal forecast error for interval forecasts. And just as optimal point forecast errors generally must be independent, so too must optimal interval forecast errors (the Christoffersen hit sequence). Both the point and interval versions are manifestations of "the golden rule of forecast evaluation": Errors from optimal forecasts can't be forecastable. The key to moving to density forecasts, then, would be to uncover the right notion of forecast error for a density forecast. That is, to uncover the function of the density forecast and realization that must be independent under correct conditional calibration. The answer turns out to be the Probability Integral Transform, \(PIT_t=\int_{-\infty}^{y_t} p_t(u)\,du\), as discussed in Diebold, Gunther and Tay (1998), who show that correct density forecast conditional calibration implies \(PIT_t \sim \text{iid } U(0,1)\). 
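
For concreteness, here is a minimal sketch of both diagnostics on simulated data with correctly calibrated Gaussian one-step density forecasts; the particular checks shown (empirical coverage, hit autocorrelation, a KS test of PIT uniformity) are illustrative stand-ins rather than the specific tests in the papers.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
T = 2000

# data: AR(1); the forecaster issues correct one-step-ahead N(0.5*y_{t-1}, 1) density forecasts
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + rng.standard_normal()

mu = 0.5 * y[:-1]                  # conditional mean forecasts for y[1:]
sigma = 1.0
alpha = 0.10                       # nominal 90% interval

# (1) Christoffersen hit sequence for the 90% interval: should be iid Bernoulli(0.90)
lo = mu + sigma * stats.norm.ppf(alpha / 2)
hi = mu + sigma * stats.norm.ppf(1 - alpha / 2)
hits = ((y[1:] >= lo) & (y[1:] <= hi)).astype(int)
print("empirical coverage:", hits.mean().round(3))
print("lag-1 autocorrelation of hits:", np.corrcoef(hits[:-1], hits[1:])[0, 1].round(3))

# (2) Diebold-Gunther-Tay PIT: should be iid U(0,1) under correct conditional calibration
pit = stats.norm.cdf(y[1:], loc=mu, scale=sigma)
ks = stats.kstest(pit, "uniform")
print("KS test of PIT uniformity: stat %.3f, p-value %.3f" % (ks.statistic, ks.pvalue))
```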


The meta-result that emerges is coherent and beautiful: optimality of point, interval, and density forecasts implies, respectively, independence of the forecast error, hit, and \(PIT\) sequences. The overarching point is that a large share of the last two-thirds of the three-part independence result -- not just the middle third -- is due to Peter. He not only cracked the interval forecast evaluation problem, but also supplied key ingredients for cracking the density forecast evaluation problem.

Wonderfully and appropriately, Peter's paper and ours were published together, indeed contiguously, in the International Economic Review. Each is one of the IER's ten most cited since its founding in 1960, but Peter's is clearly in the lead!

Friday, June 22, 2018

In Memoriam Peter Christoffersen

It brings me great sadness to report that Peter Christoffersen passed away this morning after a long and valiant struggle with cancer. (University of Toronto page here, personal page here.) He departed peacefully, surrounded by loving family. I knew Peter and worked closely with him for nearly thirty years. He was the finest husband, father, and friend imaginable. He was also the finest scholar imaginable, certainly among the leading financial economists and financial econometricians of his generation. I will miss him immensely, both personally and professionally.

Monday, June 18, 2018

10th ECB Workshop on Forecasting Techniques, Frankfurt

Starts now; program here. Looks like a great lineup. Most of the papers are posted, and the organizers also plan to post presentation slides following the conference. Presumably in future weeks I'll blog on some of the presentations.

Monday, June 11, 2018

Deep Neural Nets for Volatility Dynamics

There doesn't seem to be much need for nonparametric nonlinear modeling in empirical macro and finance. Not that lots of smart people haven't tried. The two key nonlinearities (volatility dynamics and regime switching) just seem to be remarkably well handled by tightly-parametric customized models (GARCH/SV and Markov-switching, respectively). 

But the popular volatility models are effectively linear (ARMA) in squares. Maybe that's too rigidly constrained. Volatility dynamics seem like something that could be nonlinear in ways much richer than just ARMA in squares. 
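
The "ARMA in squares" point is easy to see by simulation: squared returns from a GARCH(1,1) are well approximated by an ARMA(1,1) whose AR coefficient is alpha + beta. A quick illustrative check (parameter values are made up):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(5)
T, omega, alpha, beta = 10000, 0.05, 0.08, 0.90

# simulate GARCH(1,1): sigma2_t = omega + alpha*r_{t-1}^2 + beta*sigma2_{t-1}
sig2 = np.empty(T)
r = np.empty(T)
sig2[0] = omega / (1 - alpha - beta)
r[0] = np.sqrt(sig2[0]) * rng.standard_normal()
for t in range(1, T):
    sig2[t] = omega + alpha * r[t - 1] ** 2 + beta * sig2[t - 1]
    r[t] = np.sqrt(sig2[t]) * rng.standard_normal()

# GARCH(1,1) implies r_t^2 follows an ARMA(1,1) with AR coefficient alpha + beta
fit = ARIMA(r ** 2, order=(1, 0, 1)).fit()
print("implied AR coefficient (alpha + beta):", alpha + beta)
print("estimated ARMA(1,1) AR coefficient:  ", round(fit.arparams[0], 3))
```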

Here's an attempt using deep neural nets. I'm not convinced by the paper -- much more thorough analysis and results are required than the 22 numbers reported in the "GARCH" and "stocvol" columns of its Table 1 -- but I'm intrigued.

It's quite striking that neural nets, which have been absolutely transformative in other areas of predictive modeling, have thus far contributed so little in economic / financial contexts. Maybe the "deep" versions will change that, at least for volatility modeling. Or maybe not. 

Thursday, June 7, 2018

Machines Learning Finance

FRB Atlanta recently hosted a meeting on "Machines Learning Finance". Kind of an ominous, threatening (Orwellian?) title, but there were lots of (non-threatening...) pieces. I found the surveys by Ryan Adams and John Cunningham particularly entertaining. A clear theme on display throughout the meeting was that "supervised learning" -- the main strand of machine learning -- is just function estimation, and in particular, conditional mean estimation. That is, regression. It may involve high dimensions, non-linearities, binary variables, etc., but at the end of the day it's still just regression. If you're a regular No Hesitations reader, the "insight" that supervised learning = regression will hardly be novel to you, but still it's good to see it disseminating widely.

Monday, May 21, 2018

Top 100 Economics Blogs

Check out the latest "Top 100 Economics Blogs" here. The blurb for No Hesitations (under "Sub-field Economic Blogs") is pretty funny, issuing a stern warning: 
His blog is primarily focused on statistics and econometrics, and is highly technical. Therefore, it is recommended for those with advanced knowledge of economics and mathematics.
In reality, and as I'm sure you'll agree if you're reading this, it's actually simple and intuitive! I guess it's all relative. Anyway the blurb does get this right: "It is especially recommended for those wanting to learn more about dynamic predictive modeling in economics and finance."

Quite apart from pros and cons of its No Hesitations blurb (surely of much more interest to me than to you...), the list provides an informative and timely snapshot of the vibrant economics blogosphere.

Monday, May 14, 2018

Monetary Policy and Global Spillovers

The Bank of Chile's latest Annual Conference volume, Monetary Policy and Global Spillovers: Mechanisms, Effects, and Policy Measures, is now out, here.  In addition to the research presented in the volume, I love the picture on its front cover. So peaceful.

Monday, May 7, 2018

Fourth Penn Quantitative Policy Workshop

Some years ago I blogged on the first Workshop on Quantitative Tools for Macroeconomic Policy Analysis hosted by the Penn Institute for Economic Research (PIER). We just completed the fourth! It was a great group as usual, with approximately 25 participants from around the globe, mostly economists at country central banks, ECB, etc. Some of the happy campers, along with yours truly, appear in the photo. You can find all sorts of information on the workshop site. Information / registration for the next Workshop (May 2019) will presumably be posted in fall. Please consider joining us, and tell your friends!

Monday, April 30, 2018

Pockets of Predictability

Some months ago I blogged on "Pockets of Predictability," here. The Farmer-Schmidt-Timmermann paper that I mentioned is now available, here.

Monday, April 23, 2018

Ghysels and Marcellino on Time-Series Forecasting

If you're teaching a forecasting course and want a good text, or if you're just looking for an informative and modern treatment, see Applied Economic Forecasting Using Time Series Methods, by Eric Ghysels and Massimiliano Marcellino. It will be published this week by Oxford University Press. It has a very nice modern awareness of Big Data, with emphasis on reduced-rank structure, regularization methods -- LASSO appears as early as p. 23! -- structural change, mixed frequencies, etc. It's also very tastefully done in terms of what's included and what's excluded, emphasizing what's most important and de-emphasizing the rest. As regards non-linearity, for example, volatility dynamics and regime-switching are in, and most of the rest is out.

Monday, April 16, 2018

The History of Forecasting Competitions

Check out Rob Hyndman's "Brief History of Time Series Forecasting Competitions". I'm not certain whether the title's parallel to Hawking's Brief History of Time is intentional. At any rate, even if Hyndman's focus is rather narrower than the origin and fate of the universe, his post is still fascinating and informative. Thanks to Ross Askanasi for bringing it to my attention.

Monday, April 9, 2018

An Art Market Return Index

Rare and collectible goods, from fine art to fine wine, have many interesting and special aspects. Some are shared and some are idiosyncratic.

From the vantage point of alternative investments (among other things), it would be useful to have high-frequency indices for those asset markets, just as we do for traditional "financial" asset markets like equities.

Along those lines, in "Monthly Art Market Returns" Bocart, Ghysels, and Hafner develop a high-frequency measurement approach, despite the fact that art sales generally occur very infrequently. Effectively they develop a mixed-frequency repeat-sales model, which captures the correlation between art prices and other liquid asset prices that are observed much more frequently. They use the model to extract a monthly art market return index, as well as sub-indices for contemporary art, impressionist art, etc.

Quite fascinating and refreshingly novel.
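
For intuition about the missing-data part of the problem (only), here is a stylized stand-in -- emphatically not the Bocart-Ghysels-Hafner model -- in which a latent monthly log-index follows a local level and is observed only in months with sales, so the Kalman smoother fills in the gaps. Their model additionally exploits the link to frequently observed liquid-asset prices, which this sketch omits; all numbers are simulated.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.structural import UnobservedComponents

rng = np.random.default_rng(6)
months = pd.date_range("2005-01", periods=180, freq="MS")

# latent monthly log art-price index (random walk with drift), observed only when sales occur
index_true = np.cumsum(0.003 + 0.02 * rng.standard_normal(180))
obs = index_true + 0.05 * rng.standard_normal(180)       # noisy repeat-sale observations
obs[rng.random(180) > 0.3] = np.nan                       # roughly 70% of months have no sale

# local-level state-space model; the Kalman filter/smoother handles the missing months
mod = UnobservedComponents(pd.Series(obs, index=months), level="local level")
res = mod.fit(disp=False)
monthly_index = pd.Series(res.level.smoothed, index=months)
print(monthly_index.tail())
```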

Monday, April 2, 2018

Econometrics, Machine Learning, and Big Data

Here's a useful slide deck by Greg Duncan at Amazon, from a recent seminar at FRB San Francisco (powerpoint, ughhh, sorry...). It's basically a superset of the keynote talk he gave at Penn's summer 2017 conference, Big Data in Predictive Dynamic Econometric Modeling. Greg understands better than most the close connection between "machine learning" and econometrics / statistics, especially between machine learning and the predictive perspective emphasized in time series for a century or so.

Monday, March 26, 2018

Classic Jacod (1994) Paper

J. Financial Econometrics will soon publish Jean Jacod's brilliant and beautiful 1994 paper, "Limit of Random Measures Associated with the Increments of a Brownian Semimartingale", which I just had the pleasure of reading for the first time. (Ungated version here.) Along with several others, I was asked to supply some comments for the issue's introduction. What follows is adapted from those comments, providing some historical background. (Except that it's not really historical background -- keep reading...)

Jacod's paper effectively lays the foundation for the vast subsequent econometric "realized volatility" (empirical quadratic variation) literature of the past twenty years.  Reading it leads me to recall my early realized volatility work with Torben Andersen and Tim Bollerslev in the late 1990's and early 2000's. It started in the mid-1990's at a meeting of the NBER Asset Pricing Program, where I was the discussant for a paper of theirs, eventually published as Andersen and Bollerslev (1998). They were using realized volatility as the "realization" in a study of GARCH volatility forecast accuracy, and my discussion was along the lines of, "That's interesting, but I think you've struck gold without realizing it -- why not skip the GARCH and instead simply characterize, model, and forecast realized volatility directly?".

So we decided to explore realized volatility directly. Things really took off with Andersen et al. (2001) and Andersen et al. (2003). The research program was primarily empirical, but of course we also wanted to advance the theoretical foundations. We knew some relevant stochastic integration theory, and we made progress culminating in Theorem 2 of Andersen et al. (2003). Around the same time, Ole Barndorff-Nielsen and Neil Shephard were also producing penetrating and closely-related results (most notably Barndorff-Nielsen and Shephard, 2002). Very exciting early times.

Now let's return to Jacod's 1994 paper, and consider it against the above historical background of early econometric realized volatility papers. Doing so reveals not only its elegance and generality, but also its prescience: It was written well before the "historical background"!! One wonders how it went unknown and unpublished for so long.

References

Andersen, T. G. and T. Bollerslev (1998), "Answering the Skeptics: Yes, Standard Volatility Models do Provide Accurate Forecasts," International Economic Review, 39, 885-905.

Andersen, T.G., T. Bollerslev, F.X. Diebold, and P. Labys (2001), "The Distribution of Realized Exchange Rate Volatility," Journal of the American Statistical Association, 96, 42-55.

Andersen, T.G., T. Bollerslev, F.X. Diebold, and P. Labys (2003), "Modeling and Forecasting Realized Volatility," Econometrica, 71, 579-625.

Barndorff-Nielsen, O. and N. Shephard (2002), "Econometric Analysis of Realized Volatility and its Use in Estimating Stochastic Volatility Models," Journal of the Royal Statistical Society, 64, 253-280.

Jacod, J. (1994), "Limit of Random Measures Associated with the Increments of a Brownian Semimartingale," Manuscript, Institute de Mathematiques de Jussieu, Universite Pierre et Marie Curie, Paris.

Monday, March 19, 2018

Big Data and Economic Nowcasting

Check out this informative paper from the Federal Reserve Bank of New York: "Macroeconomic Nowcasting and Forecasting with Big Data", by Brandyn Bok, Daniele Caratelli, Domenico Giannone, Argia Sbordone, and Andrea Tambalotti.

Key methods for confronting big data include (1) imposition of restrictions (for example, (a) zero restrictions correspond to "sparsity", (b) reduced-rank restrictions correspond to factor structure, etc.), and (2) shrinkage (whether by formal Bayesian approaches or otherwise).
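
As a toy illustration of (1)(a) versus (1)(b) on simulated indicator data -- with made-up series and sklearn defaults, not the Bok et al. methodology -- compare a sparse LASSO nowcast with a one-factor (principal components) nowcast:

```python
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression
from sklearn.decomposition import PCA

rng = np.random.default_rng(7)
T, N = 200, 100                                   # 200 periods, 100 indicators

# one common factor drives most indicators and the target (think GDP growth)
f = rng.standard_normal(T)
loadings = rng.standard_normal(N)
X = np.outer(f, loadings) + rng.standard_normal((T, N))
y = 2.0 * f + 0.5 * rng.standard_normal(T)

train, test = slice(0, 160), slice(160, T)

# (1)(a) sparsity: LASSO picks a few indicators
lasso = LassoCV(cv=5).fit(X[train], y[train])
print("LASSO: nonzero coefficients =", int((lasso.coef_ != 0).sum()),
      ", out-of-sample R^2 =", round(lasso.score(X[test], y[test]), 2))

# (1)(b) factor structure: extract one principal component, then regress on it
pca = PCA(n_components=1).fit(X[train])
f_hat_train, f_hat_test = pca.transform(X[train]), pca.transform(X[test])
ols = LinearRegression().fit(f_hat_train, y[train])
print("Factor: out-of-sample R^2 =", round(ols.score(f_hat_test, y[test]), 2))
```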

Bok et al. provide historical perspective on use of (1)(b) for macroeconomic nowcasting; that is, for real-time analysis and interpretation of hundreds of business-cycle indicators using dynamic factor models. They also provide a useful description of FRBNY's implementation and use of such models in policy deliberations.

It is important to note that the Bok et al. approach nowcasts current-quarter GDP, which is different from nowcasting "the business cycle" (as done using dynamic factor models at FRB Philadelphia, for example), because GDP alone is not the business cycle. Hence the two approaches are complements, not substitutes, and both are useful.

Monday, March 12, 2018

Sims on Bayes

Here's a complementary and little-known set of slide decks from Chris Sims, deeply insightful as always. Together they address some tensions associated with Bayesian analysis and sketch some resolutions. The titles are nice, and revealing. The first is "Why Econometrics Should Always and Everywhere Be Bayesian". The second is "Limits to Probability Modeling" (with Chris' suggested possible sub-title: "Why are There no Real Bayesians?").

Thursday, March 8, 2018

H-Index for Journals

In an earlier rant, I suggested that journals move from tracking inane citation "impact factors" to citation "H indexes" or similar, just as routinely done when evaluating individual authors. It turns out that RePEc already does it, here. There are literally many thousands of journals ranked. I show the top 25 below. Interestingly, four "field" journals actually make the top 10, effectively making them "super (uber?) field journals" (J. Finance, J. Financial Economics, J. Monetary Economics, and J. Econometrics). For example, J. Econometrics is basically indistinguishable from Review of Economic Studies. 
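
For anyone who wants the definition nailed down, here is a tiny sketch of the H-index computation, applied to a made-up list of per-article citation counts:

```python
def h_index(citations):
    """Largest h such that at least h items have at least h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(counts, start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h

# made-up per-article citation counts for a hypothetical journal
print(h_index([120, 80, 45, 45, 30, 12, 9, 3, 1, 0]))   # -> 7
```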

The rankings

Rank | Journal | Factor | Adjusted citations | Items | All citations
1 | American Economic Review, American Economic Association | 262 | 389753 | 9641 | 395549
2 | Journal of Political Economy, University of Chicago Press | 235 | 227534 | 2978 | 229129
3 | Econometrica, Econometric Society | 233 | 266031 | 3530 | 267883
4 | The Quarterly Journal of Economics, Oxford University Press | 231 | 211638 | 2311 | 212892
5 | Journal of Finance, American Finance Association | 210 | 213076 | 4558 | 215655
6 | Journal of Financial Economics, Elsevier | 165 | 138073 | 2654 | 149337
7 | Journal of Monetary Economics, Elsevier (also covers Carnegie-Rochester Conference Series on Public Policy, Elsevier) | 158 | 123705 | 3334 | 128056
8 | Review of Economic Studies, Oxford University Press | 157 | 114359 | 2317 | 115072
9 | Journal of Econometrics, Elsevier | 149 | 131823 | 4160 | 141376
10 | Journal of Economic Literature, American Economic Association | 145 | 72916 | 891 | 73201
11 | Journal of Economic Perspectives, American Economic Association | 145 | 77073 | 1673 | 77758
12 | The Review of Economics and Statistics, MIT Press | 140 | 109330 | 3953 | 109953
13 | Economic Journal, Royal Economic Society | 137 | 103763 | 3663 | 104399
14 | Journal of International Economics, Elsevier | 122 | 76446 | 2983 | 81228
15 | Review of Financial Studies, Society for Financial Studies | 119 | 66261 | 1657 | 67063
16 | Journal of Public Economics, Elsevier | 117 | 90038 | 3722 | 95659
17 | Journal of Development Economics, Elsevier | 113 | 65314 | 3111 | 68204
18 | European Economic Review, Elsevier | 111 | 74123 | 3870 | 75847
19 | Journal of Economic Theory, Elsevier | 108 | 83540 | 4238 | 90223
20 | Journal of Business & Economic Statistics, Taylor & Francis Journals (also covers Journal of Business & Economic Statistics, American Statistical Association) | 105 | 44499 | 1726 | 44729
21 | Journal of Money, Credit and Banking, Blackwell Publishing | 104 | 51991 | 2955 | 52610
22 | Management Science, INFORMS | 95 | 70575 | 6891 | 76945
23 | Journal of Banking & Finance, Elsevier | 93 | 61760 | 4829 | 72982
24 | International Economic Review, Department of Economics, University of Pennsylvania and Osaka University Institute of Social and Economic Research Association | 91 | 48542 | 2537 | 48823
25 | Journal of Labor Economics, University of Chicago Press | 91 | 37921 | 1084 | 38993

Wednesday, February 28, 2018

The Rate of Return on Everything

Jorda, Knoll, Kuvshinov, Schularick and Taylor deliver more than just a memorable title, "The Rate of Return on Everything, 1870-2015". (Dec 2017 NBER version here; earlier ungated June 2017 version here.) Their paper is a fascinating exercise in data construction and analysis. It goes well beyond, say, the earlier and also-fascinating Dimson et al. (2002) book, by including housing, among other things.

Caveat emptor: In this case two words suffice -- survivorship bias. Jorda et al. are well aware of it, and they work hard to assess and address it. But still.

Monday, February 26, 2018

STILL MORE on NN's and ML

I recently discussed how the nonparametric consistency of wide NN's proved underwhelming, which is partly why econometricians lost interest in NN's in the 1990s.

The other thing was the realization that NN objective surfaces are notoriously bumpy, so that arrival at a local optimum (e.g., by the stochastic gradient descent popular in NN circles) offered little comfort.

So econometricians' interest declined on both counts. But now both issues are being addressed. The new focus on NN depth as opposed to width is bearing much fruit. And recent advances in "reinforcement learning" methods effectively promote global as opposed to just local optimization, by experimenting (injecting randomness) in clever ways. (See, e.g., Taddy section 6, here.)
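
A cartoon version of the "inject randomness to escape local optima" point -- plain random restarts on a deliberately bumpy toy surface, far cruder than anything in the reinforcement-learning literature, and purely for intuition:

```python
import numpy as np
from scipy.optimize import minimize

def bumpy_loss(w):
    # a deliberately multi-modal surface standing in for a neural-net objective
    return np.sum(w ** 2) + 3 * np.sum(np.sin(3 * w) ** 2)

rng = np.random.default_rng(8)

# a single local search from one starting point can get stuck in a local optimum
single = minimize(bumpy_loss, x0=rng.uniform(-4, 4, size=5), method="BFGS")

# injecting randomness: many random restarts, keep the best local optimum found
best = min((minimize(bumpy_loss, x0=rng.uniform(-4, 4, size=5), method="BFGS")
            for _ in range(50)), key=lambda r: r.fun)

print("single start loss:", round(single.fun, 3))
print("best of 50 random restarts:", round(best.fun, 3))
```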

All told, it seems like quite an exciting new time for NN's. I've been away for 15 years. Time to start following again...

Wednesday, February 21, 2018

Larry Brown

Larry Brown has passed away.  Larry was a giant of modern statistics and a towering presence at Penn.  Simultaneously, everyone who knew him liked him, immensely. He will be missed dearly, both professionally and personally.

I received the obituary below from Penn's Statistics Department.

Lawrence David Brown

Lawrence D. Brown died peacefully at 6:30 a.m. on Feb. 21, 2018, at the age of 77. Larry preserved his unfailing fortitude and good humor to his last day.

Larry was born on Dec. 16, 1940, in Los Angeles, California. His parents moved to Alexandria, VA, during World War II, then returned to California. His father, Louis Brown, was a successful tax lawyer and later a professor of law at the University of Southern California, where he worked tirelessly on behalf of client services and conflict prevention, for which he coined the phrase preventive law. His mother, Hermione Kopp Brown, studied law in Virginia and then in Los Angeles and became one of the leading women lawyers in Los Angeles in the field of entertainment law, with emphasis on estate planning. Larry inherited their dedication for service, their mental acuity and resourcefulness, and their selfless good spirits.

Larry graduated from Beverly Hills High School in 1957 and from the California Institute of Technology in 1961 and earned his Ph.D. in mathematics from Cornell University three years later. Initially hired at the University of California, Berkeley, he then taught in the mathematics department at Cornell University from 1966-72 and 1978-94 and in the statistics department at Rutgers University from 1972-78; he moved to the Wharton School at the University of Pennsylvania in 1994 and taught his last course there as the Miers Busch Professor of Statistics in the fall of 2017.

One of the leading statisticians of his generation, he was the recipient of many honors, including devoted service as a member of the National Academy of Sciences, election to the American Academy of Arts and Sciences, the presidency of the Institute of Mathematical Statistics, and an honorary doctorate from Purdue University. He was much loved by his colleagues and his students, many of whom hold leading positions in the United States and abroad.

His passion for his work was matched by his devotion to his family. His wife Linda Zhao survives him, as do their sons Frank and Louie, their daughter Yiwen Zhao, his daughters from his first marriage, Yona Alpers and Sarah Ackman, his brothers Marshall and Harold and their wives Jane and Eileen, and 19 grandchildren.

Monday, February 19, 2018

More on Neural Nets and ML

I earlier mentioned Matt Taddy's "The Technological Elements of Artificial Intelligence" (ungated version here).

Among other things the paper has good perspective on the past and present of neural nets. (Read:  his views mostly, if not exactly, match mine...)  

Here's my personal take on some of the history vis a vis econometrics:

Econometricians lost interest in NN's in the 1990's. The celebrated Hal White et al. proof of NN non-parametric consistency as NN width (number of neurons) gets large at an appropriate rate was ultimately underwhelming, insofar as it merely established for NN's what had been known for decades for various other non-parametric estimators (kernel, series, nearest-neighbor, trees, spline, etc.). That is, it seemed that there was nothing special about NN's, so why bother? 

But the non-parametric consistency focus was all on NN width; no one thought or cared much about NN depth. Then, more recently, people noticed that adding NN depth (more hidden layers) could be seriously helpful, and the "deep learning" boom took off. 

Here are some questions/observations on the new "deep learning":

1.  Adding NN depth often seems helpful, insofar as deep learning often seems to "work" in various engineering applications, but where/what are the theorems? What can be said rigorously about depth?

2. Taddy emphasizes what might be called two-step deep learning. In the first step, "pre-trained" hidden layer nodes are obtained based on unsupervised learning (e.g., principal components (PC)) from various sets of variables. And then the second step proceeds as usual. That's very similar to the age-old idea of PC regression. Or, in multivariate dynamic environments and econometrics language, "factor-augmented vector autoregression" (FAVAR), as in Bernanke et al. (2005). So, are modern implementations of deep NN's effectively just nonlinear FAVAR's? If so, doesn't that also seem underwhelming, in the sense of -- dare I say it -- there being nothing really new about deep NN's? (A small sketch of this two-step idea appears after this list.)

3. Moreover, PC regressions and FAVAR's have issues of their own relative to one-step procedures like ridge or LASSO.  See this and this.
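
Here is the promised sketch of the two-step idea, on simulated data with hypothetical settings throughout: unsupervised PCs first, then a small supervised net on the PC scores, compared with one-step ridge on the raw predictors.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import Ridge

rng = np.random.default_rng(9)
T, N = 500, 60

# data with a low-dimensional nonlinear signal buried in many noisy predictors
f = rng.standard_normal((T, 2))
X = f @ rng.standard_normal((2, N)) + rng.standard_normal((T, N))
y = np.tanh(f[:, 0]) + 0.5 * f[:, 1] ** 2 + 0.3 * rng.standard_normal(T)
train, test = slice(0, 400), slice(400, T)

# two-step "pre-training": unsupervised PCs first, then a small supervised net on the PC scores
pcs = PCA(n_components=2).fit(X[train])
Z_train, Z_test = pcs.transform(X[train]), pcs.transform(X[test])
net = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0).fit(Z_train, y[train])

# one-step shrinkage on the raw predictors, for comparison
ridge = Ridge(alpha=10.0).fit(X[train], y[train])

print("two-step PC + small net, out-of-sample R^2:", round(net.score(Z_test, y[test]), 2))
print("one-step ridge, out-of-sample R^2:", round(ridge.score(X[test], y[test]), 2))
```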

Tuesday, February 13, 2018

Neural Nets, ML and AI

"The Technological Elements of Artificial Intelligence", by Matt Taddy, is packed with insight on the development of neural nets and ML as related to the broader development of AI. I have lots to say, but it will have to wait until next week. For now I just want you to have the paper. Ungated version at http://www.nber.org/chapters/c14021.pdf.

Abstract:

We have seen in the past decade a sharp increase in the extent that companies use data to optimize their businesses.  Variously called the `Big Data' or `Data Science' revolution, this has been characterized by massive amounts of data, including unstructured and nontraditional data like text and images, and the use of fast and flexible Machine Learning (ML) algorithms in analysis.  With recent improvements in Deep Neural Networks (DNNs) and related methods, application of high-performance ML algorithms has become more automatic and robust to different data scenarios.  That has led to the rapid rise of an Artificial Intelligence (AI) that works by combining many ML algorithms together - each targeting a straightforward prediction task - to solve complex problems.  

We will define a framework for thinking about the ingredients of this new ML-driven AI.  Having an understanding of the pieces that make up these systems and how they fit together is important for those who will be building businesses around this technology. Those studying the economics of AI can use these definitions to remove ambiguity from the conversation on AI's projected productivity impacts and data requirements.  Finally, this framework should help clarify the role for AI in the practice of modern business analytics and economic measurement.