## Friday, December 21, 2018

### Holiday Haze

Happy holidays!

Your blogger is about to vanish, returning in the new year. Many thanks for your past, present, and future support.

If you're at ASSA Atlanta, I hope you'll come to the Penn Economics and Finance parties.

## Sunday, December 16, 2018

### Causality as Robust Prediction

I like thinking about causal estimation as a type of prediction (e.g., here). Here's a very nice slide deck from Peter Buhlmann at ETH Zurich detailing his group's recent and ongoing work in that tradition.

## Thursday, December 13, 2018

### More on Google Dataset Search

Some months ago I blogged on Google's new development of a dataset search tool.  Evidently it's coming along.  Check out the beta version here. Also, on dataset supply as opposed to demand, see here for how to maximize visibility of your datasets to the search engine.

## Monday, December 10, 2018

### Greater New York Area Econometrics Colloquium

Last week's 13th annual Greater New York Area Econometrics Colloquium, generously hosted by Princeton, was a great success, with strong papers throughout. The program is below. I found two papers especially interesting. I already blogged on Spady and Stouli's “Simultaneous Mean-Variance Regression”. The other was "Nonparametric Sample Splitting", by Lee and Wang.

Think of a nonlinear classification problem. In general the decision boundary is of course a highly nonlinear surface, but it's a supervised learning situation, so it's "easy" to learn the surface using standard nonlinear regression methods. Lee and Wang, in contrast, study an unsupervised learning situation, effectively a threshold regression model, where the threshold is determined by an unknown nonparametric relation. And they have very cool applications to things like estimating effective economic borders, gerrymandering, etc.

The 13th Greater New York Metropolitan Area Econometrics Colloquium

Princeton University, Saturday, December 1, 2018

9.00am-10.30am: Session 1
“Simple Inference for Projections and Linear Programs” by Hiroaki Kaido (BU), Francesca Molinari (Cornell), and Jörg Stoye (Cornell)
“Clustering for multi-dimensional heterogeneity with application to production function estimation” by Xu Cheng (UPenn), Peng Shao (UPenn), and Frank Schorfheide (UPenn)
“Adaptive Bayesian Estimation of Mixed Discrete-Continuous Distributions Under Smoothness and Sparsity” by Andriy Norets (Brown) and Justinas Pelenis (Vienna IAS)

11.00am-12.30pm: Session 2
“Factor-Driven Two-Regime Regression” by Sokbae Lee (Columbia), Yuan Liao (Rutgers), Myung Hwan Seo (Cowles), and Youngki Shin (McMaster)
“Semiparametric Estimation in Continuous-Time: Asymptotics for Integrated Volatility Functionals with Small and Large Bandwidths” by Xiye Yang (Rutgers)
“Nonparametric Sample Splitting” by Yoonseok Lee (Syracuse) and Yulong Wang (Syracuse)

2.00pm-3.30pm: Session 3
“Counterfactual Sensitivity and Robustness” by Timothy Christensen (NYU) and Benjamin Connault (IEX Group)
“Dynamically Optimal Treatment Allocation Using Reinforcement Learning” by Karun Adusumilli (UPenn), Friedrich Geiecke (LSE), and Claudio Schilter (LSE)
“Simultaneous Mean-Variance Regression” by Richard Spady (Johns Hopkins) and Sami Stouli (Bristol)

4.00pm-5.30pm: Session 4
“Semi-parametric instrument-free demand estimation: relaxing optimality and equilibrium assumptions” by Sungjin Cho (Seoul National), Gong Lee (Georgetown), John Rust (Georgetown), and Mengkai Yu (Georgetown)
“Nonparametric analysis of monotone choice” by Natalia Lazzati (UCSC), John Quah (Johns Hopkins), and Koji Shirai (Kwansei Gakuin)
“Discrete Choice under Risk with Limited Consideration” by Levon Barseghyan (Cornell), Francesca Molinari (Cornell), and Matthew Thirkettle (Cornell)

Organizing Committee
Bo Honoré, Michal Kolesár, Ulrich Müller, and Mikkel Plagborg-Møller

Participants

| Last name | First name | Affiliation |
| --- | --- | --- |
| Adusumilli | Karun | UPenn |
| Althoff | Lukas | Princeton |
| Anderson | Rachel | Princeton |
| Bai | Jushan | Columbia |
| Beresteanu | Arie | Pitt |
| Callaway | Brantly | Temple |
| Chao | John | Maryland |
| Cheng | Xu | UPenn |
| Choi | Jungjun | Rutgers |
| Choi | Sung Hoon | Rutgers |
| Cox | Gregory | Columbia |
| Christensen | Timothy | NYU |
| Diebold | Frank | UPenn |
| Dou | Liyu | Princeton |
| Gao | Wayne | Yale |
| Gaurav | Abhishek | Princeton |
| Henry | Marc | Penn State |
| Ho | Paul | Princeton |
| Honoré | Bo | Princeton |
| Hu | Yingyao | Johns Hopkins |
| Kolesar | Michal | Princeton |
| Lazzati | Natalia | UCSC |
| Lee | Simon | Columbia |
| Li | Dake | Princeton |
| Li | Lixiong | Penn State |
| Liao | Yuan | Rutgers |
| Menzel | | NYU |
| Molinari | Francesca | Cornell |
| Montiel Olea | José Luis | Columbia |
| Müller | Ulrich | Princeton |
| Norets | Andriy | Brown |
| Plagborg-Møller | Mikkel | Princeton |
| Poirier | Alexandre | Georgetown |
| Quah | John | Johns Hopkins |
| Rust | John | Georgetown |
| Schorfheide | Frank | UPenn |
| Seo | Myung | SNU & Cowles |
| Shin | Youngki | McMaster |
| Sims | Christopher | Princeton |
| Spady | Richard | Johns Hopkins |
| Stoye | Jörg | Cornell |
| Taylor | Larry | Lehigh |
| Vinod | Hrishikesh | Fordham |
| Wang | Yulong | Syracuse |
| Yang | Xiye | Rutgers |
| Zeleneev | Andrei | Princeton |

## Monday, December 3, 2018

### Dual Regression and Prediction

Richard Spady and Sami Stouli have an interesting new paper, “Dual Regression”. They change the usual OLS loss function from quadratic to something related but different, as per their equation (2.2), and they get impressive properties for estimation under correct specification. They also have some results under misspecification.

I'd like to understand more regarding dual regression's properties for prediction under misspecification. Generally we're comfortable with quadratic loss, in which case OLS delivers the goods (the conditional mean or linear projection) in large samples under great generality (e.g., see here). The dual regression estimator, in contrast, has a different probability limit under misspecification -- it's not providing a KLIC-optimal approximation.

If the above sounds negative, note well that the issue raised may be an opportunity, not a pitfall! Certainly there is nothing sacred about quadratic loss, even if the conditional mean is usually a natural predictor. We sometimes move to absolute-error loss (conditional median predictor), check-function loss (conditional quantile predictor), or all sorts of other predictive loss functions depending on the situation. But movements away from conditional mean or median prediction generally require some justification and interpretation. Equivalently, movements away from quadratic or absolute predictive loss generally require some justification and interpretation. I look forward to seeing that for the loss function that drives dual regression.
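To make the loss-function / predictor pairing concrete, here is a minimal sketch (my own toy illustration, not anything from the Spady-Stouli paper). It searches a grid for the constant predictor that minimizes average loss on simulated skewed data. The exponential data, the grid, and the helper names are all assumptions made purely for illustration.

```python
import numpy as np

def best_constant_predictor(y, loss, grid):
    """Return the grid point that minimizes average loss over the sample y."""
    avg_loss = [np.mean(loss(y - c)) for c in grid]
    return grid[int(np.argmin(avg_loss))]

def quadratic(e):
    return e ** 2

def absolute(e):
    return np.abs(e)

def check(tau):
    # Check-function (pinball) loss for quantile level tau.
    return lambda e: e * (tau - (e < 0))

rng = np.random.default_rng(0)
y = rng.exponential(scale=1.0, size=5_000)   # skewed, so mean != median
grid = np.linspace(0.0, 5.0, 1001)           # candidate predictors

c_quad = best_constant_predictor(y, quadratic, grid)    # close to the sample mean
c_abs = best_constant_predictor(y, absolute, grid)      # close to the sample median
c_q90 = best_constant_predictor(y, check(0.9), grid)    # close to the 0.9 sample quantile
```

Up to the grid resolution, the three minimizers line up with the sample mean, median, and 0.9 quantile. The open question in the post is what the analogous "target" is for the dual regression loss.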

## Friday, November 16, 2018

### Nearest-Neighbor Prediction

The beautiful idea has been around for ages. Find the N closest H-histories to the current H-history (you choose/tune N and H), for each H-history see what followed, take an average, and use that as your forecast. Of course there are many variations and extensions. Interesting new work by Dendramis, Kapetanios, and Marcellino is in exactly that tradition, except that Dendramis et al. don't show much awareness of the tradition, or attempt to stand on its shoulders, which I find odd. I find myself hungry for tighter connections, for example to my favorite old nearest-neighbor prediction piece, Sid Yakowitz's well-known "Nearest-Neighbor Methods for Time Series Analysis," Journal of Time Series Analysis, 1987.
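The basic recipe can be sketched in a few lines. This is a bare-bones illustration of the generic idea only, not the Dendramis et al. or Yakowitz estimators; the function name `nn_forecast`, the Euclidean distance, and the equal-weight average are my assumptions.

```python
import numpy as np

def nn_forecast(y, H, N):
    """One-step-ahead nearest-neighbor forecast.

    y : 1-D array of observations y_0, ..., y_{T-1}
    H : length of each history (window)
    N : number of nearest past histories to average over
    """
    y = np.asarray(y, dtype=float)
    T = len(y)
    current = y[T - H:]                        # the most recent H-history
    # Past H-histories y[s:s+H] that have an observed successor y[s+H].
    starts = np.arange(T - H)
    histories = np.stack([y[s:s + H] for s in starts])
    successors = y[starts + H]
    # Euclidean distance is an assumption; other metrics are possible.
    dist = np.linalg.norm(histories - current, axis=1)
    nearest = np.argsort(dist)[:N]
    return successors[nearest].mean()          # equal-weight average of what followed

# Toy usage: a noiseless sine wave with period 20.
t = np.arange(200)
y = np.sin(2 * np.pi * t / 20)
forecast = nn_forecast(y, H=5, N=3)            # true next value is sin(2*pi*200/20)
```

In practice N and H would be tuned (e.g., by out-of-sample forecast performance), and distance-weighted averages are an obvious variation.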

## Thursday, November 15, 2018

### JFEC Special Issue for Peter Christoffersen

No, I have not gone into seclusion. Well actually I have, but not intentionally and certainly not for lack of interest in the blog. Just the usual crazy time of year, only worse this year for some reason. Anyway I'll be back very soon, with lots to say! But here's something important and timely, so it can't wait:

Journal of Financial Econometrics

Call for Papers

Special Issue in Honor of Peter Christoffersen

The Journal of Financial Econometrics is organizing a special issue in memory of Professor Peter Christoffersen, our friend and colleague, who passed away in June 2018. Peter held the TMX Chair in Capital Markets and a Bank of Canada Fellowship and was a widely respected member of the Rotman School at the University of Toronto since 2010. Prior to 2010, Peter was a valued member of the Desautels Faculty of Management at McGill University. In addition to his transformative work in econometrics and volatility models, financial risk and financial innovation had been the focus of Peter’s work in recent years.

We invite paper submissions on topics related to Peter’s contributions to Finance and Econometrics. We are particularly interested in papers related to the following topics:

1)   The use of option-implied information for forecasting; Rare disasters and portfolio management; Factor structures in derivatives and futures markets.

2)   Volatility, correlation, extreme events, systemic risk and Value-at-Risk modeling for financial market risk management.

3)   The econometrics of digital assets; Big data and Machine Learning.

To submit a paper, authors should login to the Journal of Financial Econometrics online submission system and follow the submission instructions as per journal policy.  The due date for submissions is June 30, 2019.  It is important to specify in the cover letter that the paper is submitted to the special issue in honor of Peter Christoffersen, otherwise your paper will not be assigned to the guest editors.

Guest Editors

•    Francis X. Diebold, University of Pennsylvania

•    René Garcia, Université de Montréal and Toulouse School of Economics

•    Kris Jacobs, University of Houston

## Monday, October 29, 2018

### Becker Friedman Expectations Conference

I just returned from a great BFI Conference at U Chicago, Developing and Using Business Expectations Data, organized by Nick Bloom and Steve Davis.

Wonderfully, density as opposed to point survey forecasts were featured throughout. There was the latest on central bank surveys (e.g., Binder et al.), but most informative (to me) was the emphasis on surveys that I'm less familiar with, typically soliciting density expectations from hundreds or thousands of C-suite types at major firms. Examples include Germany's important IFO survey (e.g., Bachman et al.), the U.S. Census Management and Organizational Practices Survey (e.g., Bloom et al.), and fascinating work in progress at FRB Atlanta.

The Census survey is especially interesting due to its innovative structuring of histogram bins. There are no fixed bins. Instead respondents give five bins of their own choice, and five corresponding probabilities (which add to 1). This solves the problem in fixed-bin surveys of (lazy? behaviorally-biased?) respondents routinely and repeatedly assigning 0 probability to subsequently-realized events.

## Sunday, October 28, 2018

### Expansions Don't Die of Old Age

As the expansion ages, there's progressively more discussion of whether its advanced age makes it more likely to end. The answer is no. More formally, postwar U.S. expansion hazards are basically flat, in contrast to contraction hazards, which are sharply increasing. Of course the present expansion will eventually end, and it may even end soon, but its age is unrelated to its probability of ending.

All of this is very clear in Diebold, Rudebusch and Sichel (1992). See Figure 6.2 on p. 271. (Sorry for the poor photocopy quality.) The flat expansion hazard result has held up well (e.g., Rudebusch (2016)), and moreover it would only be strengthened by the current long expansion.

[I blogged on flat expansion hazards before, but the message bears repeating as the expansion continues to age.]

## Thursday, October 4, 2018

### In Memoriam Herman Stekler

I am sad to report that Herman Stekler passed away last month. I didn't know until now. He was a very early and important and colorful -- indeed unique -- personage in the forecasting community, making especially noteworthy contributions to forecast evaluation.
https://forecasters.org/herman-stekler_oracle-oct-2018/

## Tuesday, October 2, 2018

### Tyranny of the Top 5 Econ Journals

Check out:

PUBLISHING AND PROMOTION IN ECONOMICS: THE TYRANNY OF THE TOP FIVE
by
James J. Heckman and Sidharth Moktan
NBER Working Paper 25093
http://www.nber.org/papers/w25093

Heckman and Moktan examine a range of data from a variety of perspectives, analyze them thoroughly, and pull no punches in describing their striking results.

It's a great paper. There's a lot I could add, maybe in a future post, but my blood pressure is already high enough for today. So I'll just leave you with a few choice quotes from the paper ["T5" means "top-5 economics journals" ]:

"The results ... support the hypothesis that the T5 influence operates through channels that are independent of article quality."

"Reliance on the T5 to screen talent incentivizes careerism over creativity."

"Economists at highly ranked departments with established reputations are increasingly not publishing in T5 or field journals and more often post papers online in influential working paper series, which are highly cited, but not counted as T5s."

"Many non-T5 articles are better cited than many articles in T5 journals. ...  Indeed, many of the most important papers published in the past 50 years have been too innovative to survive the T5 gauntlet."

"The [list of] most cited non-T5 papers reads like an honor roll of economic analysis."

"The T5 ignores publication of books. Becker’s Human Capital (1964) has more than 4 times the number of citations of any paper listed on RePEc. The exclusion of books from citation warps incentives against broad and integrated research and towards writing bite-sized fragments of ideas."

## Saturday, September 29, 2018

### RCT's vs. RDD's

Art Owen and Hal Varian have an eye-opening new paper, "Optimizing the Tie-Breaker Regression Discontinuity Design".

Randomized controlled trials (RCT's) are clearly the gold standard in terms of statistical efficiency for teasing out causal effects. Assume that you really can do an RCT. Why then would you ever want to do anything else?

Answer: There may be important considerations beyond statistical efficiency. Take the famous "scholarship example". (You want to know whether receipt of an academic scholarship causes enhanced academic performance among strong scholarship test performers.) In an RCT approach you're going to give lots of academic scholarships to lots of randomly-selected people, many of whom are not strong performers. That's wasteful. In a regression discontinuity design (RDD) approach ("give scholarships only to strong performers who score above X in the scholarship exam, and compare the performances of students who scored just above and below X"), you don't give any scholarships to weak performers. So it's not wasteful -- but the resulting inference is statistically inefficient.

"Tie breakers" implement a middle ground: Definitely don't give scholarships to bottom performers, definitely do give scholarships to top performers, and randomize for a middle group. So you gain some efficiency relative to pure RDD (but you're a little wasteful), and you're less wasteful than a pure RCT (but you lose some efficiency).

Hence there's a trade-off, and your location on it depends on the size of your middle group. Owen and Varian characterize the trade-off and show how to optimize the size of the middle group. Really nice, clean, and useful.
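Here is a minimal sketch of the assignment rule only (not Owen and Varian's optimization of the middle-group size). The function name, the cutoffs `lower` and `upper`, and the fair coin in the middle group are my assumptions for illustration.

```python
import numpy as np

def tie_breaker_assign(scores, lower, upper, rng):
    """Assign treatment (1) or control (0) under a tie-breaker design.

    Scores below `lower` are never treated, scores at or above `upper`
    are always treated, and scores in between get a fair coin flip.
    """
    scores = np.asarray(scores, dtype=float)
    coin = rng.integers(0, 2, size=scores.shape)       # 0 or 1, equal odds
    return np.where(scores >= upper, 1,
                    np.where(scores < lower, 0, coin))

rng = np.random.default_rng(0)
exam_scores = [10, 40, 60, 90]                          # hypothetical exam scores
treated = tie_breaker_assign(exam_scores, lower=30, upper=70, rng=rng)
pure_rdd = tie_breaker_assign(exam_scores, lower=50, upper=50, rng=rng)
```

Setting `lower` equal to `upper` shrinks the middle group to nothing and gives a pure RDD; pushing both cutoffs out to the extremes of the score range gives a pure RCT.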

[Sorry but I'm running way behind. I saw Hal present this work a few months ago at a fine ECB meeting on predictive modeling.]

## Sunday, September 23, 2018

### NBER WP's Hit 25,000

A few weeks ago the NBER released WP25000. What a great NBER service -- there have been 7.6 million downloads of NBER WP's in the last year alone.

This milestone is of both current and historical interest. The history is especially interesting. As Jim Poterba notes in a recent communication:
This morning's "New this Week" email included the release of the 25000th NBER working paper, a study of the intergenerational transmission of human capital by David Card, Ciprian Domnisoru, and Lowell Taylor.  The NBER working paper series was launched in 1973, at the inspiration of Robert Michael, who sought a way for NBER-affiliated researchers to share their findings and obtain feedback prior to publication.  The first working paper was "Education, Information, and Efficiency" by Finis Welch.  The design for the working papers -- which many will recall appeared with yellow covers in the pre-digital age -- was created by H. Irving Forman, the NBER's long-serving chart-maker and graphic artist.
Initially there were only a few dozen working papers per year, but as the number of NBER-affiliated researchers grew, particularly after Martin Feldstein became NBER president in 1977, the NBER working paper series also expanded.  In recent years, there have been about 1150 papers per year.  Over the 45 year history of the working paper series, the Economic Fluctuations and Growth Program has accounted for nearly twenty percent (4916) of the papers, closely followed by Labor Studies (4891) and Public Economics (4877).

## Wednesday, September 19, 2018

### Wonderful Network Connectedness Piece

Very cool NYT graphics summarizing U.S. Facebook network connectedness.  Check it out:

They get the same result that Kamil Yilmaz and I have gotten for years in our analyses of economic and financial network connectedness:  There is a strong "gravity effect" -- that is, even in the electronic age, physical proximity is the key ingredient to network relationships. See for example:

Maybe not as surprising for Facebook friends as for financial institutions (say).  But still...

## Sunday, September 16, 2018

### Banque de France’s Open Data Room

See below for announcement of a useful new product from the Bank of France and its Representative Office in New York.
Banque de France has been for many years at the forefront of disseminating statistical data to academics and other interested parties. Through Banque de France’s dedicated public portal http://webstat.banque-france.fr/en/, we offer a large set of free downloadable series (about 40 000 mainly aggregated series).

Banque de France has expanded further the service provided and launched, in Paris, in November 2016 an “Open Data Room”, providing researchers with a free access to granular data.
We are glad to announce that the “Open Data Room” service is now also available to US researchers through Banque de France Representative Office in New York City.

## Saturday, September 15, 2018

### An Open Letter to Tren Griffin

[I tried quite hard to email this privately. I post it here only because Griffin has, as far as I can tell, been very successful in scrubbing his email address from the web. Please forward it to him if you can figure out how.]

Mr. Griffin:

A colleague forwarded me your post, https://25iq.com/2018/09/08/risk-uncertainty-and-ignorance-in-investing-and-business-lessons-from-richard-zeckhauser/.  I enjoyed it, and Zeckhauser definitely deserves everyone's highest praise.

However your post misses the bigger picture.  Diebold, Doherty, and Herring conceptualized and promoted the "Known, Unknown, Unknowable" (KuU) framework for financial risk management, which runs blatantly throughout your twelve "Lessons From Richard Zeckhauser".  Indeed the key Zeckhauser article on which you draw appeared in our book, "The Known, the Unknown and the Unknowable in Financial Risk Management", https://press.princeton.edu/titles/9223.html, which we also conceptualized, and for which we solicited the papers and authors, mentored them as regards integrating their thoughts into the KuU framework, etc.  The book was published almost a decade ago by Princeton University Press.

I say all this not only to reveal my surprise and annoyance at your apparent unawareness, but also, and more constructively, because you and your readers may be interested in our KuU book, which has many other interesting parts (great as the Zeckhauser part may be), and which, moreover, is more than the sum of its parts. A pdf of the first chapter has been available for many years at http://assets.press.princeton.edu/chapters/s9223.pdf.

Sincerely,

## Friday, September 14, 2018

### Machine Learning for Forecast Combination

How could I have forgotten to announce my latest paper, "Machine Learning for Regularized Survey Forecast Combination: Partially-Egalitarian Lasso and its Derivatives"? (Actually a heavily-revised version of an earlier paper, including a new title.) Came out as an NBER w.p. a week or two ago.

## Monday, September 10, 2018

### Interesting Papers of the Moment

Missing Events in Event Studies: Identifying the Effects of Partially-Measured News Surprises
by Refet S. Guerkaynak, Burcin Kisacikoglu, Jonathan H. Wright #25016 (AP ME)
http://papers.nber.org/papers/w25016?utm_campaign=ntw&utm_medium=email&utm_source=ntw

Colacito, Ric, Bridget Hoffmann, and Toan Phan (2018) “Temperature and growth: A panel
analysis of the United States,”
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2546456

Do You Know That I Know That You Know...? Higher-Order Beliefs in Survey Data
by Olivier Coibion, Yuriy Gorodnichenko, Saten Kumar, Jane Ryngaert #24987 (EFG ME)
http://papers.nber.org/papers/w24987?utm_campaign=ntw&utm_medium=email&utm_source=ntw

## Wednesday, September 5, 2018

### Google's New Dataset Search Tool

Check out Google's new Dataset Search.

Here's the description they issued today:

Making it easier to discover datasets
Natasha Noy
Published Sep 5, 2018

In today's world, scientists in many disciplines and a growing number of journalists live and breathe data. There are many thousands of data repositories on the web, providing access to millions of datasets; and local and national governments around the world publish their data as well. To enable easy access to this data, we launched Dataset Search, so that scientists, data journalists, data geeks, or anyone else can find the data required for their work and their stories, or simply to satisfy their intellectual curiosity.

Similar to how Google Scholar works, Dataset Search lets you find datasets wherever they’re hosted, whether it’s a publisher's site, a digital library, or an author's personal web page. To create Dataset search, we developed guidelines for dataset providers to describe their data in a way that Google (and other search engines) can better understand the content of their pages. These guidelines include  salient information about datasets: who created the dataset, when it was published, how the data was collected, what the terms are for using the data, etc. We then collect and link this information, analyze where different versions of the same dataset might be, and find publications that may be describing or discussing the dataset. Our approach is based on an open standard for describing this information (schema.org) and anybody who publishes data can describe their dataset this way. We encourage dataset providers, large and small, to adopt this common standard so that all datasets are part of this robust ecosystem.

In this new release, you can find references to most datasets in environmental and social sciences, as well as data from other disciplines including government data and data provided by news organizations, such as ProPublica. As more data repositories use the schema.org standard to describe their datasets, the variety and coverage of datasets that users will find in Dataset Search, will continue to grow.

Dataset Search works in multiple languages with support for additional languages coming soon. Simply enter what you are looking for and we will help guide you to the published dataset on the repository provider’s site.

For example, if you wanted to analyze daily weather records, you might try this query in Dataset Search:

[daily weather records]

You’ll see data from NASA and NOAA, as well as from academic repositories such as Harvard's Dataverse and Inter-university Consortium for Political and Social Research (ICPSR). Ed Kearns, Chief Data Officer at NOAA, is a strong supporter of this project and helped NOAA make many of their datasets searchable in this tool. “This type of search has long been the dream for many researchers in the open data and science communities” he said. “And for NOAA, whose mission includes the sharing of our data with others, this tool is key to making our data more accessible to an even wider community of users.”

This launch is one of a series of initiatives to bring datasets more prominently into our products. We recently made it easier to discover tabular data in Search, which uses this same metadata along with the linked tabular data to provide answers to queries directly in search results. While that initiative focused more on news organizations and data journalists, Dataset search can be useful to a much broader audience, whether you're looking for scientific data, government data, or data provided by news organizations.

A search tool like this one is only as good as the metadata that data publishers are willing to provide. We hope to see many of you use the open standards to describe your data, enabling our users to find the data that they are looking for. If you publish data and don't see it in the results, visit our instructions on our developers site which also includes a link to ask questions and provide feedback.
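For dataset suppliers, the schema.org description mentioned above amounts to a small block of JSON-LD metadata embedded in the dataset's web page. Here is a rough sketch that builds one. The property names (`name`, `description`, `creator`, `datePublished`, `license`) are standard schema.org `Dataset` properties, but the dataset, the organization, and all the values are hypothetical, and Google's own instructions remain the authority on what is required.

```python
import json

# Hypothetical dataset description using schema.org's Dataset type.
dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example daily weather records",
    "description": "Hypothetical daily temperature and precipitation records.",
    "creator": {"@type": "Organization", "name": "Example Weather Lab"},
    "datePublished": "2018-09-05",
    "license": "https://creativecommons.org/licenses/by/4.0/",
}

json_ld = json.dumps(dataset, indent=2)

# JSON-LD is typically embedded in the page inside a script tag like this.
script_tag = f'<script type="application/ld+json">\n{json_ld}\n</script>'
```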

## Monday, September 3, 2018

### The Coming Storm

The role of time-series statistics / econometrics in climate analyses is expanding (e.g., here).  Related -- albeit focusing on shorter-term meteorological aspects rather than longer-term climatological aspects -- it's worth listening to Michael Lewis' latest, The Coming Storm.  (You have to listen rather than read, as it's only available as an audiobook, but it's only about two hours.)  It's a fascinating story, well researched and well told by Lewis, just as you'd expect.  There are lots of interesting insights on (1) the collection, use, and abuse of public weather data, including ongoing, ethically-dubious, and potentially life-destroying attempts to privatize public weather data for private gain, (2) the clear and massive improvements in weather forecasting in recent decades, and (3) behavioral aspects of how best to communicate forecasts so people understand them, believe them, and take appropriate action before disaster strikes.

## Monday, August 27, 2018

### Long Memory / Scaling Laws in Return Volatility

The 25-year accumulation of evidence for long memory / fractional integration / self-similarity / scaling laws in financial asset return volatility continues unabated.  For the latest see this nice new paper from Bank of Portugal, in particular its key Table 6. Of course the interval estimates of the fractional integration parameter "d" are massively far from both 0 and 1 -- that's the well-known long memory. But what's new and interesting is the systematic difference in the intervals depending on whether one uses absolute or range-based volatility. The absolute d intervals tend to be completely below 1/2 (0<d<1/2 corresponds to covariance-stationary dynamics), whereas the range-based d intervals tend to include 1/2 (1/2<d<1 corresponds to mean-reverting but not covariance-stationary dynamics, due to infinite unconditional variance).
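For readers who want the notation behind "d", here is the textbook statement of fractional integration, in generic notation of my own (not the paper's). Here y_t stands for whichever volatility measure is used, L is the lag operator, and the shock is white noise:

$$
(1-L)^d \, y_t = \varepsilon_t, \qquad (1-L)^d = \sum_{k=0}^{\infty} \frac{\Gamma(k-d)}{\Gamma(-d)\,\Gamma(k+1)} \, L^k .
$$

For 0<d<1/2 the autocorrelations decay hyperbolically rather than geometrically,

$$
\rho(k) \sim c \, k^{2d-1} \quad \text{as } k \to \infty ,
$$

which is the "long memory", and the slow decay is the source of the scaling-law behavior.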

Realized vol based on the range is less noisy than realized vol based on absolute returns. But least noisy of all, and not considered in the paper above, is realized vol calculated directly from high-frequency return data (HFD-vol), as done by numerous authors in recent decades. Interestingly, recent work for HFD-vol also reports d intervals that tend to poke above 1/2. See this earlier post.

## Monday, August 20, 2018

### More on the New U.S. GDP Series

BEA's new publishing of NSA GDP is a massive step forward. Now it should take one more step, if it insists on continuing to publish SA GDP.

Publishing only indirect SA GDP ("adjust the components and add them up") lends it an undeserved "official" stamp of credibility, so BEA should also publish a complementary official direct SA GDP ("adjust the aggregate directly"), which is now possible.

This is a really big deal. Real GDP is undoubtedly the most important data series in all of macroeconomics, and indirect vs. direct SA GDP growth estimates can differ greatly. Their average absolute deviation is about one percent, and average real GDP growth itself is only about two percent! And which series you use has large implications for important issues, such as the widely-discussed puzzle of weak first-quarter growth (see Rudebusch et al.), among other things.
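To see mechanically why the two routes need not agree, here is a toy sketch with made-up data. The crude ratio-to-quarter-mean adjustment below is only a stand-in for the far more elaborate procedures used in practice, and the two components and their seasonal factors are invented for illustration. The point is just that a nonlinear adjustment applied to the sum is not the sum of the adjustments.

```python
import numpy as np

def ratio_to_quarter_mean(x):
    """Toy multiplicative seasonal adjustment for quarterly data.

    Divides each observation by its quarter's seasonal index, where the
    index is (mean of that quarter) / (overall mean). Illustrative only.
    """
    x = np.asarray(x, dtype=float)
    quarter = np.arange(len(x)) % 4
    index = np.array([x[quarter == q].mean() for q in range(4)]) / x.mean()
    return x / index[quarter]

t = np.arange(40)                                  # ten years of quarters
s_a = np.array([1.2, 0.8, 1.0, 1.0])               # made-up seasonal factors
s_b = np.array([0.9, 1.1, 1.0, 1.0])
a = (100 + 2 * t) * s_a[t % 4]                     # trending component
b = 50 * s_b[t % 4]                                # flat component

# Indirect: adjust the components and add them up.
indirect = ratio_to_quarter_mean(a) + ratio_to_quarter_mean(b)
# Direct: adjust the aggregate directly.
direct = ratio_to_quarter_mean(a + b)

gap = np.max(np.abs(direct - indirect))            # nonzero: the two routes disagree
```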

How do we know all this about properties of indirect vs. direct SA GDP growth estimates, since BEA doesn't provide direct SA GDP? You can now take the newly-provided NSA GDP and directly adjust it yourself. See Jonathan Wright's wonderful new paper. (Ungated version here.)

Of course direct SA has many issues of its own.  Ultimately significant parts of both direct and indirect SA GDP are likely spurious artifacts of various direct vs. indirect SA assumptions / methods.

So another, more radical, idea, is simply to stop publishing SA GDP in any form, instead publishing only NSA GDP (and its NSA components). Sound crazy? Why, exactly? Are official government attempts to define and remove "seasonality" any less dubious than, say, official attempts to define and remove "trend"? (The latter is, mercifully, not attempted...)

## Tuesday, August 7, 2018

Markus Pelger has a nice paper on factor modeling with time-varying loadings in high dimensions. There are many possible applications. He applies it to level-slope-curvature yield-curve models.

For me another really interesting application would be measuring connectedness in financial markets, as a way of tracking systemic risk. The Diebold-Yilmaz (DY) connectedness framework is based on a high-dimensional VAR with time-varying coefficients, but not factor structure. An obvious alternative in financial markets, which we used to discuss a lot but never pursued, is factor structure with time-varying loadings, exactly as in Pelger!

It would seem, however, that any reasonable connectedness measure in a factor environment would need to be based not only on time-varying loadings but also on time-varying idiosyncratic shock variances, or more precisely a time-varying noise/signal ratio (e.g., in a 1-factor model, the ratio of the idiosyncratic shock variance to the factor innovation variance). That is, connectedness in factor environments is driven by BOTH the size of the loadings on the factor(s) AND the amount of variation in the data explained by the factor(s). Time-varying loadings don't really change anything if the factors are swamped by massive noise.

Typically one might fix the factor innovation variance for identification, but allow for time-varying idiosyncratic shock variance in addition to time-varying factor loadings. It seems that Pelger's framework does allow for that. Crudely, and continuing the 1-factor example, consider y_t  =  lambda_t  f_t  +  e_t. His methods deliver estimates of the time series of loadings lambda_t and factor f_t, robust to heteroskedasticity in the idiosyncratic shock e_t. Then in a second step one could back out an estimate of the time series of e_t and fit a volatility model to it. Then the entire system would be estimated and one could calculate connectedness measures based, for example, on variance decompositions as in the DY framework.
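To make the noise/signal point explicit in the 1-factor example, here is a back-of-the-envelope calculation in my own notation (not Pelger's). It assumes f_t and e_t are uncorrelated, with the factor variance fixed at sigma_f^2 and the idiosyncratic variance sigma_{e,t}^2 allowed to move over time. The share of the time-t variance of y_t due to the factor is then

$$
\text{share}_t \;=\; \frac{\lambda_t^2 \, \sigma_f^2}{\lambda_t^2 \, \sigma_f^2 + \sigma_{e,t}^2}
\;=\; \frac{1}{1 + \dfrac{\sigma_{e,t}^2}{\lambda_t^2 \, \sigma_f^2}} .
$$

So a rising loading raises the share only insofar as it is not offset by rising idiosyncratic variance, which is why both pieces need to be tracked.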

## Monday, July 30, 2018

### NSA GDP is Finally Here

Non-seasonally-adjusted U.S. GDP is finally here, as of a few days ago. See this BEA slide deck, pp. 10, 14-15. For background see my earlier post here. Also click on "Seasonality" under "Browse by Topic" on the right.

The deck also covers lots of other aspects of the BEA's important "2018 Comprehensive Update of the National Income and Product Accounts". The whole deck is fascinating, and surely worth a close examination.

## Tuesday, July 24, 2018

### Gu-Kelly-Xiu and Neural Nets in Economics

I'm on record as being largely unimpressed by the contributions of neural nets (NN's) in economics thus far. In many economic environments the relevant non-linearities seem too weak and the signal/noise ratios too low for NN's to contribute much.

The Gu-Kelly-Xiu paper that I mentioned earlier may change that. I mentioned their success in applying machine-learning methods to forecast equity risk premia out of sample. NN's, in particular, really shine. The paper is thoroughly and meticulously done.

This is potentially a really big deal.

## Friday, July 20, 2018

### Remembering Peter Christoffersen

This is adapted from remarks read at a memorial service earlier this week:

I'm sad not to be able to be here in person, and I'm grateful to Peter Pauly for kindly agreeing to read these academically-focused remarks. His reading is unusually wonderful and appropriate, as he played a key role in my Ph.D. training, which means that if Peter Christoffersen was my student, he was also Peter Pauly's "grandstudent". For all that they taught me, I am immensely grateful both to Peter Pauly in early years, and to Peter Christoffersen in later years. I am also grateful to Peter Pauly for another reason -- he was the dean who wisely hired the Christoffersens!

I have been fortunate to have had many wonderful students in various cohorts, but Peter's broad cohort was surely the best: Peter of course, plus (in alphabetical order) Sassan Alizadeh, Filippo Altissimo, Jeremy Berkowitz, Michael Binder, Marcelle Chauvet, Lorenzo Giorgiani, Frank Gong, Atsushi Inoue, Lutz Kilian, Jose Lopez, Anthony Tay, and several others.

The Penn econometrics faculty around that time was similarly strong: Valentina Corradi, Jin Hahn, Bobby Mariano, and eventually Frank Schorfheide, with lots of additional macro-econometrics input from Lee Ohanian and financial econometrics input from Michael Brandt. Hashem Pesaran also visited Penn for a year around then. Peter was well known by all the faculty, not just the econometricians. I recall that the macroeconomists were very disappointed to lose him to econometrics!

Everyone knows Peter's classic 1998 "Evaluating Interval Forecasts" paper, which was part of his Penn dissertation. He uncovered the right notion of the "residual" for a (1-a) x 100% interval forecast, and showed that if all is well then it must be iid Bernoulli(1-a). The paper is one of the International Economic Review's ten most cited papers since its founding in 1960.

Peter and I wrote several papers together, which I consider among my very best, thanks to Peter's lifting me to higher-than-usual levels. They most definitely include our Econometric Theory paper on optimal prediction under asymmetric loss, and our Journal of Business and Economic Statistics paper on multivariate forecast evaluation.

Peter's research style was marked by a wonderful blend of intuition, theoretical rigor, and always, empirical relevance, which took him to heights that few others could reach. And his personality, which simply radiated positivity, made him not only a wonderful person to talk soccer or ski with, but the best imaginable person to talk research with.

Peter was also exceedingly generous and effective with his time as regards teaching & executive education, public service, conference organization, and more. We used to talk a lot about dynamic volatility models, and their use and abuse in financial risk management. His eventual and now well-known textbook on the topic trained legions of students. He and I were the inaugural speakers at the annual summer school of the Society for Financial Econometrics (SoFiE), that year at Oxford University, where we had a wonderful week lecturing together. He served effectively on many committees, including the U.S. Federal Reserve System's Model Validation Committee, charged with reviewing the models used for bank stress testing. He generously hosted the large annual SoFiE meeting in Toronto, several legendary "ski conferences" at Mont Tremblant, and more. The list goes on and on.

We lost a fine researcher and a fine person, much too soon. One can't begin to imagine what he might have contributed during the next twenty years. But this much is certain: his legacy lives on, and it shines exceptionally brightly. Rest in peace, my friend.

## Thursday, July 19, 2018

### Machine Learning, Volatility, and the Interface

Just got back from the NBER Summer Institute. Lots of good stuff happening in the Forecasting and Empirical Methods group. The program, with links to papers, is here.

Lots of room for extensions too. Here's a great example. Consider the interface of the Gu-Kelly-Xiu and Bollerslev-Patton-Quaedvlieg papers. At first you might think that there is no interface.

Gu-Kelly-Xiu is about using off-the-shelf machine-learning methods to model risk premia in financial markets; that is, to construct portfolios that deliver superior performance. (I had guessed they'd get nothing, but I was massively wrong.) Bollerslev et al. is about predicting realized covariance by exploiting info on past signs (e.g., was yesterday's covariance cross-product pos-pos, neg-neg, pos-neg, or neg-pos?). (They also get tremendous results.)

But there's actually a big interface.

Note that Gu-Kelly-Xiu is about conditional mean dynamics -- uncovering the determinants of expected excess returns. You might expect even better results for derivative assets, as the volatility dynamics that drive options prices may be nonlinear in ways missed by standard volatility models. And that's exactly the flavor of the Bollerslev et al. results -- they find that a tree structure conditioning on sign is massively successful.

But Bollerslev et al. don't do any machine learning. Instead they basically stumble upon their result, guided by their fine intuition. So here's a fascinating issue to explore: Hit the Bollerslev et al. realized covariance data with machine learning (in particular, tree methods like random forests) and see what happens. Does it "discover" the Bollerslev et al. result? If not, why not, and what does it discover? Does it improve upon Bollerslev et al.?

## Thursday, July 5, 2018

### Climate Change and NYU Volatility Institute

There is little doubt that climate change -- tracking, assessment, and hopefully its eventual mitigation -- is the burning issue of our times. Perhaps surprisingly, time-series econometric methods have much to offer for weather and climatological modeling (e.g., here), and several econometric groups in the UK, Denmark, and elsewhere have been pushing the agenda forward.

Now the NYU Volatility Institute is firmly on board. A couple months ago I was at their most recent annual conference, "A Financial Approach to Climate Risk", but it somehow fell through the proverbial (blogging) cracks. The program is here, with links to many papers, slides, and videos. Two highlights, among many, were the presentations by Jim Stock (insights on the climate debate gleaned from econometric tools, slides here) and Bob Litterman (an asset-pricing perspective on the social cost of climate change, paper here). A fine initiative!

## Monday, June 25, 2018

### Peter Christoffersen and Forecast Evaluation

For obvious reasons Peter Christoffersen has been on my mind. Here's an example of how his influence extended in important ways. Hopefully it's also an entertaining and revealing story.

Everyone knows Peter's classic 1998 "Evaluating Interval Forecasts" paper, which was part of his Penn dissertation. The key insight was that correct conditional calibration requires not only that the 0-1 "hit sequence" of course have the right mean ($$1-\alpha$$ for a nominal $$(1-\alpha) \times 100$$ percent interval), but also that it be iid (assuming 1-step-ahead forecasts). More precisely, it must be iid Bernoulli($$1-\alpha$$).
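
For readers who want to play with the idea, here is a minimal sketch of a hit sequence and two likelihood-ratio checks in the spirit of the paper: one for the right mean, and one for independence against a first-order Markov alternative. The function names, the simulated N(0,1) data, and the fixed 90% interval are all my own illustrative assumptions, not anything taken from the paper.

```python
import math
import random


def hit_sequence(realizations, lowers, uppers):
    """Return 1 where the realization lies inside its interval forecast, else 0."""
    return [1 if lo <= y <= hi else 0
            for y, lo, hi in zip(realizations, lowers, uppers)]


def _max_bernoulli_loglik(n_zero, n_one):
    """Bernoulli log-likelihood at its MLE, treating 0*log(0) as 0."""
    n = n_zero + n_one
    ll = 0.0
    if n_one:
        ll += n_one * math.log(n_one / n)
    if n_zero:
        ll += n_zero * math.log(n_zero / n)
    return ll


def lr_unconditional_coverage(hits, coverage):
    """LR statistic for H0: P(hit) = coverage. Approximately chi-square(1) under H0."""
    n_one = sum(hits)
    n_zero = len(hits) - n_one
    ll_null = n_one * math.log(coverage) + n_zero * math.log(1.0 - coverage)
    return 2.0 * (_max_bernoulli_loglik(n_zero, n_one) - ll_null)


def lr_independence(hits):
    """LR statistic for iid hits against a first-order Markov chain alternative."""
    counts = {(i, j): 0 for i in (0, 1) for j in (0, 1)}
    for prev, curr in zip(hits[:-1], hits[1:]):
        counts[(prev, curr)] += 1
    # Alternative: separate hit probabilities after a miss and after a hit.
    ll_markov = (_max_bernoulli_loglik(counts[0, 0], counts[0, 1])
                 + _max_bernoulli_loglik(counts[1, 0], counts[1, 1]))
    # Null: one hit probability regardless of the previous outcome.
    ll_iid = _max_bernoulli_loglik(counts[0, 0] + counts[1, 0],
                                   counts[0, 1] + counts[1, 1])
    return 2.0 * (ll_markov - ll_iid)


# Illustrative data: iid N(0,1) realizations and a correctly calibrated 90% interval.
random.seed(0)
n = 2000
y = [random.gauss(0.0, 1.0) for _ in range(n)]
z90 = 1.6448536269514722  # 95th percentile of N(0,1)
hits = hit_sequence(y, [-z90] * n, [z90] * n)

print("empirical coverage:", sum(hits) / n)
print("LR, unconditional coverage:", lr_unconditional_coverage(hits, 0.90))
print("LR, independence:", lr_independence(hits))
```

Each statistic can be compared with the chi-square(1) 5% critical value of about 3.84. A hit sequence can easily pass the first check and fail the second, which is exactly the point: the right mean alone is not enough.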

Around the same time I naturally became interested in going all the way to density forecasts and managed to get some more students interested (Todd Gunther and Anthony Tay). Initially it seemed hopeless, as correct density forecast conditional calibration requires correct conditional calibration of all possible intervals that could be constructed from the density, of which there are uncountably infinitely many.

Then it hit us. Peter had effectively found the right notion of an optimal forecast error for interval forecasts. And just as optimal point forecast errors generally must be independent, so too must optimal interval forecast errors (the Christoffersen hit sequence). Both the point and interval versions are manifestations of "the golden rule of forecast evaluation": Errors from optimal forecasts can't be forecastable. The key to moving to density forecasts, then, would be to uncover the right notion of forecast error for a density forecast. That is, to uncover the function of the density forecast and realization that must be independent under correct conditional calibration. The answer turns out to be the Probability Integral Transform, $$PIT_t=\int_{-\infty}^{y_t} p_t(u) \, du$$, as discussed in Diebold, Gunther and Tay (1998), who show that correct density forecast conditional calibration implies $$PIT_t \sim iid \; U(0,1)$$.
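
Here is a small sketch of the PIT idea, under loudly stated assumptions: Gaussian 1-step-ahead density forecasts, simulated data, and a made-up time-varying standard deviation. The PIT is just the forecast CDF evaluated at the realization. The checks used here (a Kolmogorov-Smirnov distance from U(0,1) and a lag-1 autocorrelation) are my own simple stand-ins for illustration, not the procedures of the paper.

```python
import math
import random
from statistics import NormalDist


def pit_series(realizations, means, sds):
    """Evaluate each Gaussian forecast CDF at the corresponding realization."""
    return [NormalDist(mu, sd).cdf(y)
            for y, mu, sd in zip(realizations, means, sds)]


def ks_distance_from_uniform(u):
    """Kolmogorov-Smirnov distance between the empirical CDF of u and U(0,1)."""
    u_sorted = sorted(u)
    n = len(u_sorted)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(u_sorted))


def lag1_autocorrelation(x):
    """Sample first-order autocorrelation."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


# Illustrative data: y_t ~ N(0, sd_t^2) with a slowly varying, hypothetical sd_t.
random.seed(1)
n = 2000
means = [0.0] * n
sds = [1.0 + 0.5 * math.sin(t / 20.0) for t in range(n)]
y = [random.gauss(mu, sd) for mu, sd in zip(means, sds)]

# Correct density forecasts versus forecasts that understate the spread by half.
pit_good = pit_series(y, means, sds)
pit_bad = pit_series(y, means, [0.5 * sd for sd in sds])

print("KS distance, correct forecasts:", ks_distance_from_uniform(pit_good))
print("KS distance, too-narrow forecasts:", ks_distance_from_uniform(pit_bad))
print("lag-1 autocorrelation of correct PITs:", lag1_autocorrelation(pit_good))
```

With correct forecasts the PITs should look uniform and serially uncorrelated. With too-narrow forecasts they pile up near 0 and 1, so the distance from uniformity is much larger.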

The meta-result that emerges is coherent and beautiful: optimality of point, interval, and density forecasts implies, respectively, independence of forecast error, hit, and $$PIT$$ sequences. The overarching point is that a large share of the last two-thirds of the three-part independence result -- not just the middle third -- is due to Peter. He not only cracked the interval forecast evaluation problem, but also supplied key ingredients for cracking the density forecast evaluation problem.

Wonderfully and appropriately, Peter's paper and ours were published together, indeed contiguously, in the International Economic Review. Each is one of the IER's ten most cited since its founding in 1960, but Peter's is clearly in the lead!