Monday, June 6, 2016

Fixed Effects Without Panel Data

Consider a pure cross section (CS) of size N.  Generally you'd like to allow for individual effects, but you can't, because OLS with a full set of N individual dummies is infeasible: with N dummies plus the slope parameters, you'd have more parameters than observations. (You'd exhaust degrees of freedom.) That's usually what motivates the appeal of panel data -- there you have NxT observations, so including N individual dummies becomes feasible.

But there's no need to stay with OLS.  You can recover d.f. using regularized estimators like ridge (shrinkage) or LASSO (shrinkage and selection).  So including a full set of individual dummies, even in a pure CS, is completely feasible!  For implementation you just have to select the ridge or LASSO penalty parameter, which is typically done by cross validation (say).
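
To make the idea concrete, here is a minimal sketch in Python (NumPy). It rests on illustrative assumptions that are not part of the argument above: the data are simulated, the penalty parameter lam is simply fixed at 1.0 rather than chosen by cross validation, and only the dummy coefficients are penalized while the slopes are left unpenalized. The function name ridge_with_dummies is hypothetical.

```python
import numpy as np

def ridge_with_dummies(y, X, lam):
    """Ridge fit of y = X b + D a + e, where D = I_N holds one dummy per unit.

    Assumption of this sketch: only the N dummy coefficients a are penalized;
    the K slopes b are left unpenalized.
    """
    n, k = X.shape
    Z = np.hstack([X, np.eye(n)])        # N x (K + N) design: covariates, then dummies
    P = np.zeros((k + n, k + n))
    P[k:, k:] = lam * np.eye(n)          # ridge penalty on the dummy block only
    coef = np.linalg.solve(Z.T @ Z + P, Z.T @ y)   # penalized normal equations
    return coef[:k], coef[k:]            # (slopes, individual effects)

# Simulated pure cross section: one observation per unit (illustrative only).
rng = np.random.default_rng(0)
N, K = 50, 2
X = rng.normal(size=(N, K))
alpha_true = rng.normal(size=N)
y = X @ np.array([1.0, -2.0]) + alpha_true + 0.1 * rng.normal(size=N)

lam = 1.0                                # fixed here; in practice it would be tuned
b_hat, a_hat = ridge_with_dummies(y, X, lam)
print(b_hat.shape, a_hat.shape)          # K slopes and N individual effects
```

With lam > 0 the penalized system has a unique solution even though there are N + K coefficients and only N observations, provided X has full column rank.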

There are two key points.  The first is that you can allow for individual fixed effects even in a pure CS; that is, there's no need for panel data.  That's what I've emphasized so far.

The second is that the proposed method actually gives estimates of the fixed effects.  Sometimes they're just nuisance parameters that can be ignored; indeed standard panel estimation methods "difference them out", so they're not even estimated.  But estimates of the fixed effects are crucial for forecasting:  to forecast y_i, you need not only Mr. i's covariates and estimates of the "slope parameters", but also an estimate of Mr. i's intercept!  That's why forecasting is so conspicuously absent from most of the panel literature -- the fixed effects are not estimated, so forecasting is hopeless.  Regularized estimation, in contrast, delivers estimates of fixed effects, thereby facilitating forecasting, and you don't even need a panel.
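
The forecasting point can be sketched in a few lines of Python. Everything here is hypothetical: the function name forecast_unit is made up, and the numbers are placeholders chosen for illustration, not estimates from any data. The only thing the sketch shows is that the forecast for Mr. i adds his own estimated intercept to the usual slope term.

```python
import numpy as np

def forecast_unit(i, x_new, slopes, effects):
    """Forecast y for unit i: estimated intercept of unit i plus x_new' slopes."""
    return effects[i] + x_new @ slopes

# Placeholder values for illustration only (not estimates from real data).
slopes = np.array([1.0, -2.0])           # estimated slope parameters
effects = np.array([0.5, -0.3, 0.1])     # estimated fixed effects for units 0, 1, 2
x_new = np.array([2.0, 1.0])             # covariates of the unit being forecast

with_effect = forecast_unit(1, x_new, slopes, effects)
without_effect = x_new @ slopes          # what you get if the intercept is ignored
print(with_effect, without_effect)
```

The gap between the two numbers is exactly the estimated fixed effect of the unit in question, which is what a method that differences the effects out cannot supply.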