Sunday, October 25, 2015

Predictive Accuracy Rankings by MSE vs. MAE

We've all ranked forecast accuracy by mean squared error (MSE) and mean absolute error (MAE), the two great workhorses of relative accuracy comparison. MSE-rankings and MAE-rankings often agree, but they certainly don't have to -- they're simply different loss functions -- which is why we typically calculate and examine both.
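To see how the two rankings can part ways, here is a minimal sketch using two made-up error sequences. The numbers and the names `errors_a`, `errors_b`, `mse`, and `mae` are purely illustrative and come from no real forecast. Forecast A is usually perfect but has one large miss; forecast B is always off by a moderate amount.

```python
def mse(errors):
    """Mean squared error of a sequence of forecast errors."""
    return sum(e * e for e in errors) / len(errors)


def mae(errors):
    """Mean absolute error of a sequence of forecast errors."""
    return sum(abs(e) for e in errors) / len(errors)


# Hypothetical forecast errors, chosen only to illustrate a ranking reversal.
errors_a = [0.0, 0.0, 0.0, 4.0]   # rare but large miss
errors_b = [1.5, 1.5, 1.5, 1.5]   # steady moderate miss

print("MSE  A:", mse(errors_a), " B:", mse(errors_b))
print("MAE  A:", mae(errors_a), " B:", mae(errors_b))
```

Quadratic loss punishes A's single large miss heavily, so MSE prefers B, while absolute loss prefers A.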

Here's a trivially simple question: Under what conditions will MSE-rankings and MAE-rankings agree? It turns out that the answer is not at all trivial -- indeed it's unknown. Things get very difficult, very quickly.

With \(N(\mu, \sigma^2)\) forecast errors we have that

\( E(|e|) = \sigma \sqrt{2/\pi} \exp\left( -\frac{\mu^{2}}{2 \sigma^{2}}\right) + \mu \left[1-2 \Phi\left(-\frac{\mu}{\sigma} \right) \right], \)
where \(\Phi(\cdot)\) is the standard normal cdf. This relates MAE to the two components of MSE, bias (\(\mu\)) and variance (\(\sigma^2\)), but the relationship is complex. In the unbiased Gaussian case (\(\mu=0\)), the result collapses to \(MAE = \sigma \sqrt{2/\pi} \propto \sigma \), while \(MSE = \mu^2 + \sigma^2 = \sigma^2\). Both are increasing in \(\sigma\), so MSE-rankings and MAE-rankings must agree. But the unbiased Gaussian case is very, very special, and little else is known.
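The Gaussian formula is easy to evaluate directly. The sketch below codes it with only the standard library and compares two hypothetical forecasts whose parameter values are invented for illustration: A is unbiased with \(\sigma = 1\), and B is biased (\(\mu = 0.9\)) but tight (\(\sigma = 0.3\)). The helper names `norm_cdf`, `gaussian_mae`, and `gaussian_mse` are assumptions of this sketch, not anything from the papers cited in this post.

```python
import math


def norm_cdf(x):
    """Standard normal cdf, Phi(x), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gaussian_mae(mu, sigma):
    """E|e| for e ~ N(mu, sigma^2), using the closed form quoted above."""
    spread_term = sigma * math.sqrt(2.0 / math.pi) * math.exp(-mu**2 / (2.0 * sigma**2))
    bias_term = mu * (1.0 - 2.0 * norm_cdf(-mu / sigma))
    return spread_term + bias_term


def gaussian_mse(mu, sigma):
    """E(e^2) for e ~ N(mu, sigma^2): squared bias plus variance."""
    return mu**2 + sigma**2


# Hypothetical (mu, sigma) pairs, chosen only to produce a ranking reversal.
forecasts = {"A": (0.0, 1.0), "B": (0.9, 0.3)}

for name, (mu, sigma) in forecasts.items():
    print(name, "MSE =", round(gaussian_mse(mu, sigma), 4),
          "MAE =", round(gaussian_mae(mu, sigma), 4))
```

With these values B has the smaller MSE but A has the smaller MAE, so even within the Gaussian family a nonzero bias is enough to break the agreement.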

Some energetic grad student should crack this nut, giving necessary and sufficient conditions for identical MSE and MAE rankings in general environments. Two leads: See section 5 of Diebold-Shin (2014), who give a numerical characterization in the biased Gaussian case, and section 2 of Ardakani et al. (2015), who make analytical progress using the stochastic error distance (SED) representation of expected loss.