Sunday, June 14, 2015

A Conjecture Regarding Extracted Dynamic Factors (and Hence GDPplus)

Here's a conjecture that I'd love to see explored. It's well-posed, simple, and really interesting.

Conjecture: GDPplus (obtained by Kalman smoothing) may be very well approximated by taking a simple convex combination of exponentially smoothed GDPe (expenditure side GDP) and exponentially smoothed GDPi (income side GDP). 

That is,

\( GDPplus = \lambda \cdot SMOOTH_{\alpha_e} (GDPe)  + (1 - \lambda) \cdot   SMOOTH_{\alpha_i} (GDPi) , \)

where \(\lambda \in [0,1]\) is a combining weight, \(SMOOTH_{\alpha_x}(GDPx)\) denotes an exponential smooth of \(GDPx\), and the \(\alpha_x\)'s are smoothing parameters.
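To fix ideas, here is a minimal Python sketch of the conjectured approximation. The smoothing recursion \( s_t = \alpha s_{t-1} + (1-\alpha) x_t \) is one standard convention (chosen so that bigger \(\alpha\) means more smoothing, matching the usage later in this post); the function names and any parameter values are purely illustrative.

```python
def exp_smooth(x, alpha):
    """Exponential smooth: s_t = alpha * s_{t-1} + (1 - alpha) * x_t.

    Bigger alpha puts more weight on the past, i.e. more smoothing.
    """
    s = [x[0]]
    for xt in x[1:]:
        s.append(alpha * s[-1] + (1.0 - alpha) * xt)
    return s

def approx_gdpplus(gdpe, gdpi, lam, alpha_e, alpha_i):
    """Convex combination of exponentially smoothed GDPe and GDPi."""
    se = exp_smooth(gdpe, alpha_e)
    si = exp_smooth(gdpi, alpha_i)
    return [lam * e + (1.0 - lam) * i for e, i in zip(se, si)]
```

The conjecture is that, for suitable \(\lambda\), \(\alpha_e\), \(\alpha_i\), the output of `approx_gdpplus` tracks the Kalman-smoothed GDPplus series very closely.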

Or even more simply, forget about GDPplus, whose underlying probability model is a bit complicated, and just examine a simpler canonical case, as follows.

Conjecture: In a stationary bivariate single-factor dynamic factor model with AR(1) factor and all shocks Gaussian and orthogonal to all other shocks, the MSE-optimal factor extraction (obtained by Kalman smoothing) may be very well approximated by taking a simple convex combination of exponentially smoothed observed variables.
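To make the canonical case concrete, here is a self-contained Python sketch that simulates such a bivariate single-factor model and computes the MSE-optimal extraction by Kalman filtering plus Rauch-Tung-Striebel smoothing. The scalar-state implementation and all parameter values are my own illustrative choices, not taken from any of the papers mentioned here.

```python
import random

def simulate_dfm(T, phi, lam, sig_eta, sig_eps, seed=0):
    """Simulate f_t = phi * f_{t-1} + eta_t and y_{j,t} = lam_j * f_t + eps_{j,t},
    with all shocks Gaussian and mutually orthogonal."""
    rng = random.Random(seed)
    f, y1, y2 = [], [], []
    ft = 0.0
    for _ in range(T):
        ft = phi * ft + rng.gauss(0.0, sig_eta)
        f.append(ft)
        y1.append(lam[0] * ft + rng.gauss(0.0, sig_eps[0]))
        y2.append(lam[1] * ft + rng.gauss(0.0, sig_eps[1]))
    return f, y1, y2

def kalman_smooth_factor(y1, y2, phi, lam, sig_eta, sig_eps):
    """Scalar-state Kalman filter plus Rauch-Tung-Striebel smoother."""
    q = sig_eta ** 2
    r1, r2 = sig_eps[0] ** 2, sig_eps[1] ** 2
    l1, l2 = lam
    T = len(y1)
    a_pred, P_pred = [0.0] * T, [0.0] * T
    a_filt, P_filt = [0.0] * T, [0.0] * T
    a, P = 0.0, q / (1.0 - phi ** 2)          # stationary prior
    for t in range(T):
        if t > 0:                              # time update
            a, P = phi * a_filt[t - 1], phi ** 2 * P_filt[t - 1] + q
        a_pred[t], P_pred[t] = a, P
        # measurement update with 2x1 observation vector (2x2 innovation cov)
        v1, v2 = y1[t] - l1 * a, y2[t] - l2 * a
        s11, s22, s12 = l1 * l1 * P + r1, l2 * l2 * P + r2, l1 * l2 * P
        det = s11 * s22 - s12 * s12
        k1 = P * (l1 * s22 - l2 * s12) / det   # Kalman gain, first component
        k2 = P * (l2 * s11 - l1 * s12) / det   # Kalman gain, second component
        a_filt[t] = a + k1 * v1 + k2 * v2
        P_filt[t] = (1.0 - k1 * l1 - k2 * l2) * P
    # backward (RTS) smoothing pass
    a_sm = list(a_filt)
    for t in range(T - 2, -1, -1):
        J = P_filt[t] * phi / P_pred[t + 1]
        a_sm[t] = a_filt[t] + J * (a_sm[t + 1] - a_pred[t + 1])
    return a_sm
```

With this in hand, the conjecture can be checked directly by comparing `kalman_smooth_factor` output to a convex combination of exponential smooths of `y1` and `y2`.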

There are of course many variations and extensions:  N variables, richer dynamics, richer error correlation structures, different smoothers, etc.

Theoretically:

What, precisely, is the relationship between the optimal extraction and the approximation? The answer must be contained in the structure of the Kalman gain derived in ADNSS2.   

Empirically:

-- Check it out in simulated environments for various choices of \(  \lambda\), \(  \alpha_e\) and \(  \alpha_i\).

-- Again in simulated environments, minimize the average squared divergence between the exact and approximate extractions w.r.t. \( \lambda\), \(  \alpha_e\) and \(  \alpha_i\).  How close is the minimized divergence to zero?

-- Now do a serious application: GDPplus vs. a weighted combination of smoothed GDPe and GDPi.  Again minimize w.r.t. \(  \lambda\), \(  \alpha_e\) and \(  \alpha_i\). How close is the minimized divergence to zero? How much closer is it to zero than the divergence between GDPplus and GDPavg (the simple average of GDPe and GDPi now published by BEA -- see this No Hesitations post)?

-- Based on ADNSS1 and ADNSS2, my guess is that the optimal \(\lambda\) will be around .4, and that the optimal \(\alpha_e\) will be much bigger than the optimal \(  \alpha_i\) (where bigger \(  \alpha\) corresponds to more smoothing).
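The minimization step above can be sketched with a simple grid search in Python. Everything here is illustrative: `target` stands in for the exact extraction (GDPplus, or the Kalman-smoothed factor in a simulation), and the grid values are arbitrary.

```python
import itertools
import random

def exp_smooth(x, alpha):
    """s_t = alpha * s_{t-1} + (1 - alpha) * x_t (bigger alpha = more smoothing)."""
    s = [x[0]]
    for xt in x[1:]:
        s.append(alpha * s[-1] + (1.0 - alpha) * xt)
    return s

def fit_combination(target, ye, yi, grid):
    """Grid-search (lambda, alpha_e, alpha_i) minimizing the average squared
    divergence between the target and the convex combination of the two
    exponentially smoothed series."""
    best = None
    for lam, ae, ai in itertools.product(grid, repeat=3):
        se, si = exp_smooth(ye, ae), exp_smooth(yi, ai)
        mse = sum((lam * e + (1.0 - lam) * i - t) ** 2
                  for e, i, t in zip(se, si, target)) / len(target)
        if best is None or mse < best[0]:
            best = (mse, lam, ae, ai)
    return best
```

A finer grid (or a proper numerical optimizer) would sharpen the estimates; the point is only that the minimized divergence, and the minimizing \(\lambda\), \(\alpha_e\), \(\alpha_i\), fall straight out of such a search.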


[Note: In the two or three weeks since the draft of this post was written, we have explored things a bit, and it's looking good. The optimized parameters are \(  \lambda=.14\), \(  \alpha_e =.94 \) and \(  \alpha_i = .18\), and they deliver a predictive \( R^2\) for GDPplus of .94.]