Tuesday, July 14, 2020

Spurious Factor Analysis

This abstract definitely produced one of those great "ah ha!" moments, at least for me.  So obvious once someone points it out.  Thanks Alexei and Chen.

I'm hungry for more.

With I(1) regression we have:
Q: When will regressions with I(1) variables not produce spurious results?
A: When the variables are not only integrated but also cointegrated.
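As a reminder of what "spurious" means in the regression case, here is a minimal simulation sketch. It is an illustration only, not anything taken from the paper. The function names (`slope_t_stat`, `rejection_rates`), the sample size, the number of replications, and the nominal 5% critical value of 1.96 are all assumptions chosen for the example. It regresses one random walk on another, independent one, and counts how often the usual t-test "finds" a relationship, first in levels and then in differences.

```python
import numpy as np

def slope_t_stat(y, x):
    """OLS t-statistic on the slope in y = a + b*x + e (classical standard errors)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)      # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)      # classical OLS covariance matrix
    return beta[1] / np.sqrt(cov[1, 1])

def rejection_rates(n_sims=500, T=200, seed=0):
    """Share of simulations with |t| > 1.96, in levels and in first differences."""
    rng = np.random.default_rng(seed)
    rej_levels = 0
    rej_diffs = 0
    for _ in range(n_sims):
        ex = rng.standard_normal(T)            # shocks driving x
        ey = rng.standard_normal(T)            # shocks driving y, independent of ex
        x, y = np.cumsum(ex), np.cumsum(ey)    # two unrelated I(1) series
        rej_levels += int(abs(slope_t_stat(y, x)) > 1.96)
        rej_diffs += int(abs(slope_t_stat(ey, ex)) > 1.96)
    return rej_levels / n_sims, rej_diffs / n_sims

if __name__ == "__main__":
    levels_rate, diffs_rate = rejection_rates()
    print(f"rejection rate in levels:      {levels_rate:.2f}")
    print(f"rejection rate in differences: {diffs_rate:.2f}")
```

In levels the test should reject far more often than the nominal 5%, while in differences it should reject at roughly the nominal rate. The series here are not cointegrated by construction, so the levels "relationship" is spurious.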

What is the analog here, with PCA?  That is:
Q: When will PCA with high-dim I(1) variables not produce spurious results?
A: ??? I'm not yet sure.  Maybe it's addressed in the paper, which I look forward to reading.  Cointegration should again be part of the answer (maybe all of it?), since cointegration implies factor structure.

By: Onatski, A.; Wang, C.
Abstract: This paper draws parallels between the Principal Components Analysis of factorless high-dimensional nonstationary data and the classical spurious regression. We show that a few of the principal components of such data absorb nearly all the data variation. The corresponding scree plot suggests that the data contain a few factors, which is corroborated by the standard panel information criteria. Furthermore, the Dickey-Fuller tests of the unit root hypothesis applied to the estimated “idiosyncratic terms” often reject, creating an impression that a few factors are responsible for most of the non-stationarity in the data. We warn empirical researchers of these peculiar effects and suggest to always compare the analysis in levels with that in differences.
Keywords: Spurious regression, principal components, factor models, Karhunen-Loève expansion.
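
The abstract's central claim, and its advice to compare levels with differences, can be illustrated with a small sketch. This is my own toy setup, not the authors' code or design. The panel dimensions (`T`, `N`), the choice of three components, and the helper name `top_k_variance_share` are assumptions made for the example. The data are independent random walks, so there are no common factors at all.

```python
import numpy as np

def top_k_variance_share(data, k=3):
    """Share of total variance captured by the first k principal components.

    data: array of shape (T, N), rows are time periods, columns are series.
    """
    centered = data - data.mean(axis=0)             # demean each series
    s = np.linalg.svd(centered, compute_uv=False)   # singular values, descending
    eigenvalues = s ** 2                            # proportional to PCA eigenvalues
    return eigenvalues[:k].sum() / eigenvalues.sum()

rng = np.random.default_rng(0)
T, N = 200, 100                                     # assumed panel dimensions
shocks = rng.standard_normal((T, N))                # independent across series
levels = np.cumsum(shocks, axis=0)                  # N factorless I(1) series
diffs = np.diff(levels, axis=0)                     # back to stationary shocks

share_levels = top_k_variance_share(levels)
share_diffs = top_k_variance_share(diffs)
print(f"top-3 PC variance share, levels:      {share_levels:.2f}")
print(f"top-3 PC variance share, differences: {share_diffs:.2f}")
```

In levels, the first few components should absorb most of the variation even though no factors were built in. In differences, the same components should explain only a small share, which is the contrast the abstract recommends checking.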
