Monday, March 31, 2014

Student Advice I: Some Good Reading for Good Writing (and Good Graphics)

Good writing is good thinking, so when you next hear some pretentious moron boast that "I don't like to write, I like to think," rest assured, he's surely a bad writer and a bad thinker. Again, good writing is good thinking. If you like "to do research" but don't like "to write it up," then you're not thinking clearly. Research and writing are inextricably intertwined.

How to get there? Read and absorb McCloskey's Rhetoric of Economics, and Strunk and White's Elements of Style. There's no real need to read or absorb much else (about writing). But do bolt the Chicago Manual of Style to your desk. Then get going. Think about what you want to say, why, and to whom. Think hard and critically about logical structure and flow, at all scales, small and large. Revise and edit, again and again. Make things easy for your readers. Listen to your words; push your prose toward poetry.


Good graphics is also good thinking, and precisely the same advice holds. Read and absorb Tufte's Visual Display of Quantitative Information. Notice, by the way, how well Tufte writes (even if he sometimes goes overboard with the poetry thing). It's no accident. As Tufte says: show the data, and appeal to the viewer. Recognize that your first cut using default software settings will never, ever, be satisfactory. (If that statement doesn't instantly resonate with you, then you're in desperate need of a Tufte infusion.) So revise and edit, again and again. And again. 

Friday, March 28, 2014

Nate Silver and the Krugman Embarrassment

I'm glad that Nate Silver and his FiveThirtyEight are back. Nate generally provides interesting and responsible data-based journalism for the educated layperson. (Of course he sometimes gets in over his head, but don't we all?)

Now Krugman suddenly starts to dislike Silver; see his "Tarnished Silver" post. Funny, he never complained much when Silver worked at the New York Times, the trough where Krugman feeds. But now that Silver has moved elsewhere, Krugman's vitriol erupts. Perhaps Krugman always felt that way but was mum so as not to offend his NYT. Or perhaps he now wants to punish Silver for defecting. Or perhaps it's a little of both. In any event it strikes me as an embarrassment. Let's call it the Krugman Embarrassment.

I'm not the only one who's noticed the Krugman Embarrassment. See the recent post from Big Data, Plainly Spoken, which gets things right in labeling FiveThirtyEight-bashing "premature and immature." Also see the chart at FiveThirtyEight's Data Lab, which speaks for itself.

Monday, March 24, 2014

Sheldon Hackney Memorial Celebration, March 27

If you're in the area:  Sheldon Hackney Celebration, Thursday, March 27. Program 4-5, reception 5-6, Irvine Auditorium, 34th and Spruce, Philadelphia.  See my earlier memorial post.

Sunday, March 23, 2014

GAS and DCS Models: Tasty Stuff, and I'm Hungry for More

Generalized Autoregressive Score (GAS) models, also known as Dynamic Conditional Score (DCS) models, are an important development. They extend significantly the scope of observation-driven models, with their simple closed-form likelihoods, in contrast to parameter-driven models whose estimation and inference require heavy simulation.

Many talented people are pushing things forward. Most notable are the Amsterdam group (Siem Jan Koopman et al.; see the GAS site) and the Cambridge group (Andrew Harvey et al.; see Andrew's interesting new book). The GAS site is very informative, with background description, a catalog of GAS papers, code in Ox and R, conference information, etc. The key paper is Creal, Koopman and Lucas (2008). (It was eventually published in 2012 in Journal of Applied Econometrics, proving once again that the better the paper, the longer it takes to publish.)

The GAS idea is simple. Just use a conditional observation density \(p(y_t |f_t)\) whose time-varying parameter \(f_t\) follows the recursion
\begin{equation}f_{t+1} = \omega + \beta f_t + \alpha S(f_t) \left[ \frac{\partial \log p(y_t | f_t)}{\partial f_t} \right],~~~~~~~(1) \end{equation} where \(S(f_t)\) is a scaling function. Note in particular that the scaled score drives \(f_t\). The resulting GAS models retain observation-driven simplicity yet are quite flexible. In the volatility context, for example, GAS can be significantly more flexible than GARCH, as Harvey emphasizes.
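To make recursion (1) concrete, here is a minimal sketch in Python of the simplest volatility case. The specifics are my assumptions for illustration, not anything prescribed above: a Gaussian observation density \(y_t | f_t \sim N(0, f_t)\), so that \(f_t\) is the conditional variance, and inverse-Fisher-information scaling, \(S(f_t) = 2 f_t^2\). The function names and parameter values are hypothetical. In that case the scaled score is just \(y_t^2 - f_t\), and (1) collapses to a GARCH(1,1) recursion with ARCH coefficient \(\alpha\) and GARCH coefficient \(\beta - \alpha\), which the code checks numerically.

```python
import numpy as np

def gas_gaussian_variance(y, omega, alpha, beta, f0):
    """Filter f_t via recursion (1), assuming y_t | f_t ~ N(0, f_t)
    and scaling S(f_t) = 2 f_t^2 (inverse Fisher information)."""
    f = np.empty(len(y) + 1)
    f[0] = f0
    for t, yt in enumerate(y):
        score = (yt**2 - f[t]) / (2.0 * f[t]**2)  # d log p(y_t | f_t) / d f_t
        scale = 2.0 * f[t]**2                     # assumed scaling S(f_t)
        f[t + 1] = omega + beta * f[t] + alpha * scale * score
    return f

def garch_variance(y, omega, a, b, f0):
    """Standard GARCH(1,1) variance recursion, for comparison."""
    f = np.empty(len(y) + 1)
    f[0] = f0
    for t, yt in enumerate(y):
        f[t + 1] = omega + a * yt**2 + b * f[t]
    return f

# Illustrative (hypothetical) data and parameter values.
rng = np.random.default_rng(0)
y = rng.standard_normal(500)
omega, alpha, beta = 0.05, 0.10, 0.95

f_gas = gas_gaussian_variance(y, omega, alpha, beta, f0=1.0)
f_garch = garch_variance(y, omega, a=alpha, b=beta - alpha, f0=1.0)

# The two filtered variance paths should agree up to floating-point rounding.
print(np.allclose(f_gas, f_garch))
```

Swapping in a fatter-tailed density for \(p(y_t | f_t)\) changes the score and hence the update, which is where GAS parts ways with GARCH.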

Well, the GAS idea seems simple. At least it's simple to implement if taken at face value. But I'm not sure that I understand it fully. In particular, I'm hungry for a theorem that tells me in what sense (1) is the "right" thing to do. That is, I can imagine other ways of updating \(f_t\), so why should I necessarily adopt (1)? It would be great, for example, if (1) were the provably unique solution to an optimal approximation problem for non-linear non-Gaussian state space models. Is it? (It sure looks like a first-order approximation to something.) And if so, might we want to acknowledge that in doing the econometrics, instead of treating (1) as if it were the DGP? And could we somehow improve the approximation?

To the best of my knowledge, the GAS/DCS literature is silent on such fundamental issues. But based on my experience with the fine scholarship of Creal, Harvey, Koopman, Lucas, and their co-authors and students, I predict that answers will arrive soon.

Sunday, March 16, 2014

Is Interdisciplinarity Vastly Over-Rated?

Interdisciplinarity is clearly the flavor of the month (read: two decades) among the academic cognoscenti. Although it makes for entertaining popular press, what's the real intellectual benefit of a top-down interdisciplinary "industrial policy"? Difficult question! That's not necessarily to suggest that there is no benefit; rather, it's simply to suggest that the issues are subtle and deserving of serious thought from all sides.

Hence it's refreshing to see a leading academic throw his hat in the ring with a serious evidence-based defense of the traditional disciplines, as does Penn sociologist Jerry Jacobs in his new book In Defense of Disciplines.

Perhaps the best thing I can do to describe the book and whet your appetite is to reprint some of the book's back-cover blurbs, which get things just right. So here goes:

“Jerry Jacobs’s new book provides the missing counterpoint to the fanfare for interdisciplinary collaboration that has swept over much of academe during the last three decades. Thanks to Jacobs’s creative and painstaking research, we now know that disciplines are not the ‘silos’ they are so often made out to be; instead, they are surprisingly open to good ideas and new methods developed elsewhere. Nor are universities rigidly bound to the disciplines—instead, they routinely foster interdisciplinary work through dozens of organized research centers. This book is more than a necessary corrective. It is a well-crafted piece of social science, equally at home in the worlds of intellectual history, organizational studies, and quantitative methods. It deserves to be read by all who care about the future of universities—defenders and critics of the disciplines alike.” (Steven G. Brint, University of California, Riverside)

“At a time of undue hoopla about interdisciplinarity, this is a sobering, highly readable, and data-driven defense of retaining disciplinary units as the primary mode of organizing research universities. A must read for those concerned with the future of knowledge innovation.” (Myra H. Strober, Stanford University)

“This is a timely, subtle and much needed evaluation of interdisciplinarity as a far reaching goal sweeping around the globe. Jerry Jacobs sets new standards of discussion by documenting with great new data the long term fate of interdisciplinary fields and the centrality of disciplines to higher education and the modern research university.” (Karin Knorr Cetina, University of Chicago)

Sunday, March 9, 2014

Loss‐Efficient Factor Selection

Alexei Onatski has an interesting recent paper, "Asymptotic Analysis of the Squared Estimation Error in Misspecified Factor Models." There's also an Appendix.

Four interesting cases have emerged in the literature, corresponding to two types of data-generating process (exact factor structure, with a diagonal idiosyncratic covariance matrix, vs. approximate factor structure, with a non-diagonal idiosyncratic covariance matrix) and two modes of asymptotic analysis (strong vs. weak factor structure; see Alexei's paper for the technical definitions, but you can imagine).

Much recent work focuses on approximate factor structure and strong factor asymptotics. The classic work of Bai and Ng (2002), for example, is in that tradition. Alexei instead focuses on weak factor asymptotics. Crucially and compellingly, moreover, he focuses on selecting the number of factors \(p\) for best estimation of the common component, since estimation of the common component is typically the goal in factor modeling.

Let's get a bit more precise. The DGP is the usual approximate factor model,
$$X=\Lambda F^{\prime }+e,$$
where \(X\) is an \(n\times T\) matrix of data, \(\Lambda\) is an \(n\times r\) matrix of factor loadings, \(F\) is a \(T\times r\) matrix of factors and \(e\) is an \(n\times T\) matrix of idiosyncratic terms.

We want to select \(p\), the number of factors, to get the best principal-component estimate, \(\hat{\Lambda}_{1:p}\hat{F}_{1:p}^{\prime }\), of the common component \(\Lambda F^{\prime }\) under quadratic loss. That is, the objective is to choose \(p\) to minimize the squared estimation error, averaged over time and space,
$$L_{p}=\mathrm{tr}\left[ (\hat{\Lambda}_{1:p}\hat{F}_{1:p}^{\prime }-\Lambda F^{\prime })(\hat{\Lambda}_{1:p}\hat{F}_{1:p}^{\prime }-\Lambda F^{\prime })^{\prime }\right] /\left( nT\right).$$
Among many other things, Alexei shows that under weak-factor asymptotics the optimal number of factors is not generally the "true" number!
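For readers who want to see the loss \(L_p\) in action, here is a small simulation sketch in Python. It is only an illustration under my own assumptions, not the paper's procedure: the dimensions, the Gaussian draws, the hypothetical `strengths` vector that makes some factors weaker than others, and the function names are all invented here. The principal-component estimate of the common component with \(p\) factors is computed as the rank-\(p\) truncated SVD of \(X\), and \(L_p\) is evaluated against the known simulated \(\Lambda F^{\prime}\).

```python
import numpy as np

def pc_common_component(X, p):
    """Rank-p principal-component estimate of the common component,
    computed as the truncated SVD of the n x T data matrix X."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :p] * s[:p]) @ Vt[:p, :]

def quadratic_loss(X, common, p):
    """L_p = tr[(estimate - common)(estimate - common)'] / (nT)."""
    n, T = X.shape
    diff = pc_common_component(X, p) - common
    return np.sum(diff**2) / (n * T)  # sum of squares equals the trace above

# Hypothetical simulation design: r = 3 factors of decreasing strength.
rng = np.random.default_rng(1)
n, T, r = 50, 100, 3
strengths = np.array([1.0, 0.5, 0.1])          # assumed, for illustration only
Lambda = rng.standard_normal((n, r)) * strengths
F = rng.standard_normal((T, r))
e = rng.standard_normal((n, T))

common = Lambda @ F.T                          # true common component
X = common + e

# Evaluate the loss for p = 0, 1, ..., 6 and compare with the true r.
losses = [quadratic_loss(X, common, p) for p in range(7)]
for p, loss in enumerate(losses):
    print(f"p = {p}: L_p = {loss:.4f}")
```

Of course \(L_p\) is infeasible in practice because \(\Lambda F^{\prime}\) is unobserved; the point of the sketch is only to show what the loss measures, and to let you experiment with how its minimizer moves as the factors get weaker.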

All told, I find highly compelling the move to loss functions explicitly based on divergence between the true and estimated common component. I'm a little less sure how I feel about the move to weak-factor asymptotics, as my gut tells me that the common component in many macroeconomic and financial environments is driven by a few strong factors, and not much else. We'll see. In any event Alexei's contribution is refreshing, original, and valuable.


By the way, I first saw the paper at the SoFiE Lugano conference, "Large-Scale Factor Models in Finance" (with the Swiss Finance Institute (SFI) and Labex Louis Bachelier, generously hosted by the Faculty of Economics of the Università della Svizzera Italiana, Lugano, Switzerland). The title of Alexei's talk was "Loss-Efficient Selection of the Number of Factors," but the actual paper is the one cited above.

Here's the Lugano program FYI, as there were lots of other interesting papers as well.

Invited Session 1 (Chair: E. Renault)

R. Korajczyk (Northwestern University): Small-sample Properties of Factor Mimicking Portfolio Estimates (with Zhuo Chen and Gregory Connor)

Contributed Session 1: Factor Models and Asset Pricing (Chair: F. Trojani)

S. Ahn, A. Horenstein, N. Wang: Beta Matrix and Common Factors in Stock Returns, Paper

T. Chordia, A. Goyal, J. Shanken: Cross-Sectional Asset Pricing with Individual Stocks: Betas vs. Characteristics, Slides

P. Gagliardini, E. Ossola, O. Scaillet: Time-Varying Risk Premium in Large Cross-Sectional Equity Datasets, Paper, Slides

Poster Session 

E. Andreou, E. Ghysels: What Drives the VIX and the Volatility Risk Premium?

T. Berrada, S. Coupy: It Does Pay to Diversify

S. Darolles, S. Dubecq, C. Gouriéroux: Contagion Analysis in the Banking Sector

D. Karstanje, M. van der Wel, D. van Dijk: Common Factors in Commodity Futures Curves

P. Maio, D. Philip: Macro factors and the cross-section of stock returns, Paper

Contributed Session 2: Dynamic Factor Models (Chair: M. Deistler)

G. Fiorentini, E. Sentana: Dynamic Specification Tests for Dynamic Factor Models, Paper, Slides

M. Forni, M. Hallin, M. Lippi, P. Zaffaroni: One-Sided Representations of Generalized Dynamic Factor Models

Invited Session 2 (Chair: E. Ghysels)

C. Gourieroux (CREST and University of Toronto): Positional Portfolio Management (with P. Gagliardini and M. Rubin)

Contributed Session 3: Systemic Risk (Chair: S. Darolles)

J. Boivin, M. P. Giannoni, D. Stevanovic: Dynamic Effects of Credit Shocks in a Data-Rich Environment

S. Giglio, B. Kelly, S. Pruitt, X. Qiao: Systemic Risk and the Macroeconomy: An Empirical Evaluation

B. Schwaab, S. J. Koopman, A. Lucas: Modeling Global Financial Sector Stress and Credit Market Dislocation

Invited Session 3 (Chair: F. Diebold)

Alexei Onatski (University of Cambridge): Loss-Efficient Selection of the Number of Factors

Contributed Session 4: Model Specification (Chair: O. Scaillet)

M. Carrasco, B. Rossi: In-sample Inference and Forecasting in Misspecified Factor Models

F. Pegoraro, A. Siegel, L. Tiozzo Pezzoli: Specification Analysis of International Treasury Yield Curve Factors

F. Kleibergen, Z. Zhan: Unexplained Factors and their Effects on Second Pass R-Squared’s and t-Tests, Paper, Slides

Wednesday, March 5, 2014

Fun with Mike Steele Quotes and Rants


Check out the web page of my Penn Statistics buddy, Mike Steele, probabilist, statistician and mathematician extraordinaire. (And that's just his day job. At night he battles the really hard stuff -- financial markets.) Among other things, you'll like his Favorite Quotes and Semi-Random Rants.

Sunday, March 2, 2014

Double-"Blind" Refereeing is Misguided in Principle and a Charade in Practice

I view the title of this post as almost self-evident. But lots of do-gooders out there disagree, touting double-blind refereeing as somehow promoting "fairness."

(1) Misguided in principle: One never makes more-informed decisions (or predictions, or inferences, or whatever) by shrinking the information on which they're based. It's that simple.

(2) A charade in practice: It now rarely takes more than a few seconds to identify the author of a "blinded" manuscript. And on the rare occasions when Google can't nail it instantly, the author usually helps in various ways, such as by over-citing himself (whether innocently or strategically).

Whenever I'm requested to review a blinded manuscript, I decline immediately. I simply refuse to participate in the charade. You should too.