Here's a conjecture that I'd love to see explored. It's well-posed, simple, and really interesting.

Conjecture:

*GDPplus* (obtained by Kalman smoothing) may be very well approximated by taking a simple convex combination of exponentially smoothed *GDPe* (expenditure-side GDP) and exponentially smoothed *GDPi* (income-side GDP).
That is,

\( GDPplus = \lambda \cdot SMOOTH_{\alpha_e} (GDPe) + (1 - \lambda) \cdot SMOOTH_{\alpha_i} (GDPi) , \)

where \(\lambda\) is a combining weight, \(SMOOTH_{\alpha_x}(GDPx)\) denotes an exponential smooth of \(GDPx\), and the \(\alpha_x\)'s are smoothing parameters.
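As a minimal sketch of the approximation, the combination can be coded in a few lines. The recursion \(s_t = \alpha s_{t-1} + (1-\alpha) x_t\) is an assumed form, chosen so that a bigger \(\alpha\) means more smoothing. The one-sided (backward-looking) smoother, the initialization at the first observation, and the function names `exp_smooth` and `approx_gdpplus` are likewise assumptions for illustration, not something taken from ADNSS.

```python
import numpy as np

def exp_smooth(x, alpha):
    """One-sided exponential smooth: s_t = alpha * s_{t-1} + (1 - alpha) * x_t.

    Assumed convention: larger alpha puts more weight on the past,
    i.e. more smoothing. The smooth is started at the first observation.
    """
    x = np.asarray(x, dtype=float)
    s = np.empty_like(x)
    s[0] = x[0]
    for t in range(1, len(x)):
        s[t] = alpha * s[t - 1] + (1.0 - alpha) * x[t]
    return s

def approx_gdpplus(gdpe, gdpi, lam, alpha_e, alpha_i):
    """Convex combination of the two exponentially smoothed series."""
    return lam * exp_smooth(gdpe, alpha_e) + (1.0 - lam) * exp_smooth(gdpi, alpha_i)

# Made-up growth rates, purely to show the call; not actual GDP data.
gdpe = [2.1, 3.4, -0.5, 1.8, 2.9]
gdpi = [2.6, 2.9, 0.3, 2.2, 2.4]
print(approx_gdpplus(gdpe, gdpi, lam=0.4, alpha_e=0.8, alpha_i=0.3))
```

Setting both smoothing parameters to zero and \(\lambda = .5\) recovers the simple average of the two series, so the simple average is a special case of this family.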

Or even more simply, forget about *GDPplus*, whose underlying probability model is a bit complicated, and just examine a simpler canonical case, as follows.

Conjecture: In a stationary bivariate single-factor dynamic factor model with AR(1) factor and all shocks Gaussian and orthogonal to all other shocks, the MSE-optimal factor extraction (obtained by Kalman smoothing) may be very well approximated by taking a simple convex combination of exponentially smoothed observed variables.
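To fix ideas, one way to write down this canonical case is sketched below. The symbols \(y_{1t}\), \(y_{2t}\), \(f_t\), \(\rho\), and the unit factor loadings are notation assumed here for illustration (mirroring the *GDPplus* setting, in which both observed series measure the same latent quantity), not notation taken from ADNSS:

\( y_{1t} = f_t + \varepsilon_{1t}, \quad y_{2t} = f_t + \varepsilon_{2t}, \quad f_t = \rho f_{t-1} + \eta_t, \quad |\rho| < 1 , \)

with \(\varepsilon_{1t}\), \(\varepsilon_{2t}\) and \(\eta_t\) Gaussian white noise, mutually orthogonal at all leads and lags. In this notation the conjecture reads

\( E(f_t \mid y_{1,1:T}, y_{2,1:T}) \approx \lambda \cdot SMOOTH_{\alpha_1} (y_1) + (1 - \lambda) \cdot SMOOTH_{\alpha_2} (y_2) , \)

where the left side is the Kalman-smoothed factor.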

There are of course many variations and extensions: *N* variables, richer dynamics, richer error correlation structures, different smoothers, etc.

Theoretically:

What, precisely, is the relationship between the optimal extraction and the approximation? The answer must be contained in the structure of the Kalman gain derived in ADNSS2.

Empirically:

-- Check it out in simulated environments for various choices of \( \lambda\), \( \alpha_e\) and \( \alpha_i\).

-- Again in simulated environments, minimize the average squared divergence between the exact and approximate extractions w.r.t. \( \lambda\), \( \alpha_e\) and \( \alpha_i\). How close is it to zero?

-- Now do a serious application: *GDPplus* vs. a weighted combination of smoothed *GDPe* and *GDPi*. Again minimize w.r.t. \( \lambda\), \( \alpha_e\) and \( \alpha_i\). How close is it to zero? How much closer is it to zero than the divergence between *GDPplus* and *GDPavg* (the simple average of *GDPe* and *GDPi* now published by BEA -- see this No Hesitations post)?

-- Based on ADNSS1 and ADNSS2, my guess is that the optimal \(\lambda\) will be around .4, and that the optimal \(\alpha_e\) will be much bigger than the optimal \( \alpha_i\) (where bigger \( \alpha\) corresponds to more smoothing).

[Note: In the two or three weeks since the draft of this post was written, we have explored things a bit, and it's looking good. The optimized parameters are \( \lambda=.14\), \( \alpha_e =.94 \) and \( \alpha_i = .18\), and they deliver a predictive \( R^2\) for *GDPplus* of .94.]
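The simulated-environment check described under "Empirically" might be sketched roughly as follows. Everything specific here is an assumption for illustration: unit factor loadings, the parameter values (`rho`, `q`, `r1`, `r2` are made up, not estimates from ADNSS), a one-sided exponential smoother with \(s_t = \alpha s_{t-1} + (1-\alpha) x_t\), and a coarse grid search standing in for a proper optimizer. The exact extraction is computed with a scalar-state Kalman filter followed by a backward smoothing pass, using the true parameters.

```python
import numpy as np

def exp_smooth(x, alpha):
    """One-sided exponential smooth; larger alpha means more smoothing (assumed form)."""
    s = np.empty(len(x))
    s[0] = x[0]
    for t in range(1, len(x)):
        s[t] = alpha * s[t - 1] + (1.0 - alpha) * x[t]
    return s

def simulate_dfm(T, rho, q, r1, r2, rng):
    """Simulate f_t = rho*f_{t-1} + eta_t and y_it = f_t + eps_it (unit loadings)."""
    f = np.empty(T)
    f[0] = rng.normal(0.0, np.sqrt(q / (1.0 - rho**2)))  # stationary start
    for t in range(1, T):
        f[t] = rho * f[t - 1] + rng.normal(0.0, np.sqrt(q))
    y1 = f + rng.normal(0.0, np.sqrt(r1), T)
    y2 = f + rng.normal(0.0, np.sqrt(r2), T)
    return f, y1, y2

def kalman_smooth(y1, y2, rho, q, r1, r2):
    """Smoothed factor E(f_t | all data) for the model above, parameters known."""
    T = len(y1)
    a_pred, p_pred = np.empty(T), np.empty(T)
    a_filt, p_filt = np.empty(T), np.empty(T)
    a, p = 0.0, q / (1.0 - rho**2)  # unconditional mean and variance
    for t in range(T):
        a_pred[t], p_pred[t] = a, p
        # Scalar-state update in precision form (independent measurement errors).
        p_filt[t] = 1.0 / (1.0 / p + 1.0 / r1 + 1.0 / r2)
        a_filt[t] = p_filt[t] * (a / p + y1[t] / r1 + y2[t] / r2)
        a, p = rho * a_filt[t], rho**2 * p_filt[t] + q
    a_sm = a_filt.copy()
    for t in range(T - 2, -1, -1):  # backward (Rauch-Tung-Striebel) pass
        gain = rho * p_filt[t] / p_pred[t + 1]
        a_sm[t] = a_filt[t] + gain * (a_sm[t + 1] - a_pred[t + 1])
    return a_sm

def best_approximation(target, y1, y2, lam_grid, alpha_grid):
    """Grid search for (lambda, alpha_1, alpha_2) minimizing mean squared divergence."""
    s1 = [exp_smooth(y1, a) for a in alpha_grid]
    s2 = [exp_smooth(y2, a) for a in alpha_grid]
    best = (np.inf, None, None, None)
    for i, a1 in enumerate(alpha_grid):
        for j, a2 in enumerate(alpha_grid):
            for lam in lam_grid:
                mse = np.mean((target - (lam * s1[i] + (1.0 - lam) * s2[j])) ** 2)
                if mse < best[0]:
                    best = (mse, lam, a1, a2)
    return best

rng = np.random.default_rng(0)
rho, q, r1, r2 = 0.5, 1.0, 2.0, 0.5  # illustrative values only
f, y1, y2 = simulate_dfm(400, rho, q, r1, r2, rng)
f_smooth = kalman_smooth(y1, y2, rho, q, r1, r2)
best_mse, lam, a1, a2 = best_approximation(
    f_smooth, y1, y2, np.linspace(0.0, 1.0, 21), np.linspace(0.0, 0.95, 20)
)
mse_avg = np.mean((f_smooth - 0.5 * (y1 + y2)) ** 2)  # simple-average benchmark
print(f"lambda={lam:.2f}, alpha_1={a1:.2f}, alpha_2={a2:.2f}")
print(f"divergence: optimized={best_mse:.4f}, simple average={mse_avg:.4f}")
```

Because the grid contains \(\lambda = .5\) with no smoothing, the optimized divergence can never exceed that of the simple average, so the interesting question is how much smaller it gets and how close to zero. Note that the Kalman smoother is two-sided while the exponential smoother here is one-sided; a two-sided exponential smoother would be one of the "different smoothers" variations.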