"Text as data" is a vibrant and by now well-established field. (Just Google "text as data".)

For an informative overview geared toward econometricians, see the new paper, "Text as Data" by Matthew Gentzkow, Bryan T. Kelly, and Matt Taddy (GKT). (Ungated version here.)

"Text as data" has wide applications in economics. As GKT note:

... in finance, text from financial news, social media, and company filings is used to predict asset price movements and study the causal impact of new information. In macroeconomics, text is used to forecast variation in inflation and unemployment, and estimate the effects of policy uncertainty. In media economics, text from news and social media is used to study the drivers and effects of political slant. In industrial organization and marketing, text from advertisements and product reviews is used to study the drivers of consumer decision making. In political economy, text from politicians’ speeches is used to study the dynamics of political agendas and debate.

There are three key steps:

1. Represent the raw text D as a numerical array x

2. Map x into predicted values yhat of outcomes y

3. Use yhat in subsequent descriptive or causal analysis.
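
To make the three steps concrete, here is a minimal sketch on a made-up four-document corpus. Everything in it is an assumption chosen for brevity rather than anything GKT prescribe: the toy documents and outcomes, the simple bag-of-words counts for step 1, and the ridge penalty (`lam`) for step 2. Note that even here there are more columns in x than documents, which is the dimensionality problem in miniature.

```python
import numpy as np

# Toy corpus and outcomes: entirely made up for illustration.
docs = [
    "profits rise as sales grow",
    "sales grow and profits rise",
    "losses mount as sales fall",
    "sales fall and losses mount",
]
y = np.array([1.0, 1.0, -1.0, -1.0])  # hypothetical outcome attached to each document

# Step 1: represent the raw text D as a numerical array x (document-term counts).
vocab = sorted({w for d in docs for w in d.split()})
col = {w: j for j, w in enumerate(vocab)}

def bag_of_words(doc):
    """Return the vector of word counts for one document."""
    counts = np.zeros(len(vocab))
    for w in doc.split():
        if w in col:  # words outside the vocabulary are ignored
            counts[col[w]] += 1
    return counts

x = np.vstack([bag_of_words(d) for d in docs])

# Step 2: map x into predicted values yhat, here by ridge regression.
lam = 1.0  # penalty strength, an arbitrary choice
beta = np.linalg.solve(x.T @ x + lam * np.eye(len(vocab)), x.T @ y)
yhat = x @ beta

# Step 3: use the fitted map downstream, e.g. score a new document.
yhat_new = bag_of_words("profits rise") @ beta
```

In real applications the vocabulary runs to tens of thousands of terms, so the choice of penalty (or of some other dimension-reduction device) does most of the work.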

GKT emphasize the ultra-high dimensionality inherent in statistical text analyses, with connections to machine learning, etc.

Check out the fascinating and creative new paper, "Myopia and Discounting", by Xavier Gabaix and David Laibson.

From their abstract (slightly edited):
We assume that perfectly patient agents estimate the value of future events by generating noisy, unbiased simulations and combining those signals with priors to form posteriors. These posterior expectations exhibit as-if discounting: agents make choices as if they were maximizing a stream of known utils weighted by a discount function. This as-if discount function reflects the fact that estimated utils are a combination of signals and priors, so average expectations are optimally shaded toward the mean of the prior distribution, generating behavior that partially mimics the properties of classical time preferences. When the simulation noise has variance that is linear in the event's horizon, the as-if discount function is hyperbolic.

Among other things, then, they provide a rational foundation for the "myopia" associated with hyperbolic discounting.

Note that in the Gabaix-Laibson environment everything depends on how forecast error variance behaves as a function of forecast horizon \(h\). But we know a lot about that. For example, in linear covariance-stationary \(I(0)\) environments, optimal forecast error variance grows with \(h\) but levels off, approaching the finite unconditional variance from below. Hence it cannot grow linearly with \(h\) indefinitely, and linear growth is what produces hyperbolic as-if discounting. In contrast, in non-stationary \(I(1)\) environments, optimal forecast error variance *does* eventually grow linearly with \(h\). In a random walk, for example, \(h\)-step-ahead optimal forecast error variance is just \(h \sigma^2\), where \(\sigma^2\) is the innovation variance. It would be fascinating to put people in \(I(1)\) vs. \(I(0)\) laboratory environments and see whether hyperbolic as-if discounting arises in \(I(1)\) cases but not in \(I(0)\) cases.
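
The contrast is easy to see numerically. The sketch below rests on two assumptions of mine: the \(I(0)\) case is a stationary AR(1) with coefficient `phi`, and the simulation noise variance is identified with the optimal forecast error variance, with Gaussian priors and signals, so that the weight on the signal is the usual shrinkage factor `prior_var / (prior_var + noise_var)`. The function names are hypothetical.

```python
import numpy as np

def fev_ar1(h, phi, sigma2=1.0):
    """h-step-ahead optimal forecast error variance of a stationary AR(1), |phi| < 1."""
    h = np.asarray(h, dtype=float)
    return sigma2 * (1.0 - (phi ** 2) ** h) / (1.0 - phi ** 2)

def fev_random_walk(h, sigma2=1.0):
    """h-step-ahead optimal forecast error variance of a random walk: h * sigma2."""
    return np.asarray(h, dtype=float) * sigma2

def as_if_discount(noise_var, prior_var=1.0):
    """Gaussian shrinkage weight on the signal (an assumed normal-normal setup)."""
    return prior_var / (prior_var + np.asarray(noise_var, dtype=float))

horizons = np.arange(1, 41)

# I(0): noise variance levels off, so the as-if discount function flattens out
# at a strictly positive floor instead of decaying toward zero.
d_i0 = as_if_discount(fev_ar1(horizons, phi=0.9))

# I(1): noise variance is linear in h, so the weight is 1 / (1 + h), i.e. hyperbolic.
d_i1 = as_if_discount(fev_random_walk(horizons))
```

With unit prior and innovation variances, `d_i1` is exactly \(1/(1+h)\), while `d_i0` is bounded below by the weight implied by the unconditional variance.
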

[Click on "Machine Learning" at right for earlier "Machine Learning and Econometrics" posts.]

We econometricians need -- and have always had -- cross section and time series ("micro econometrics" and "macro/financial econometrics"), causal estimation and predictive modeling, structural and non-structural. And all continue to thrive.

But there's a new twist, happening now, making this an unusually exciting time in econometrics. Predictive econometric modeling is not only alive and well, but also blossoming anew, this time at the interface of micro-econometrics and machine learning. A fine example is the new Kleinberg, Lakkaraju, Leskovec, Ludwig and Mullainathan paper, “Human Decisions and Machine Predictions”, NBER Working Paper 23180 (February 2017).

Good predictions promote good decisions, and econometrics is ultimately about helping people to make good decisions. Hence the new developments, driven by advances in machine learning, are most welcome contributions to a long and distinguished predictive econometric modeling tradition.

The predictive modeling perspective needs not only to be respected and embraced in econometrics (as it routinely *is*, notwithstanding the Angrist-Pischke revisionist agenda), but also to be *enhanced* by incorporating elements of statistical machine learning (ML). This is particularly true for cross-section econometrics insofar as time-series econometrics is already well ahead in that regard. For example, although flexible non-parametric ML approaches to estimating conditional-mean functions don't add much to time-series econometrics, they may add lots to cross-section econometric regression and classification analyses, where conditional mean functions may be highly nonlinear for a variety of reasons. Of course econometricians are well aware of traditional non-parametric issues/approaches, especially kernel and series methods, and they have made many contributions, but there's still much more to be learned from ML.
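
As a small illustration of why flexibility can matter in cross sections, the sketch below compares a linear regression with a Nadaraya-Watson kernel regression on simulated data. It is a toy under stated assumptions: the data-generating process (`true_mean`), the sample size, and the bandwidth are all made up, and the kernel estimator stands in for the broader class of flexible methods rather than for any particular ML algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_mean(x):
    """Hypothetical nonlinear conditional mean used to simulate the data."""
    return np.sin(2.0 * x)

n = 400
x = rng.uniform(-3.0, 3.0, size=n)
y = true_mean(x) + 0.3 * rng.standard_normal(n)

grid = np.linspace(-2.5, 2.5, 101)  # evaluation points away from the boundary

# Linear conditional mean, fitted by OLS.
X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
fit_linear = b[0] + b[1] * grid

def nadaraya_watson(x0, x, y, bandwidth):
    """Gaussian-kernel weighted average of y around each point in x0."""
    u = (x0[:, None] - x[None, :]) / bandwidth
    w = np.exp(-0.5 * u ** 2)
    return (w @ y) / w.sum(axis=1)

fit_kernel = nadaraya_watson(grid, x, y, bandwidth=0.3)

# Mean squared error of each fit relative to the true conditional mean.
mse_linear = np.mean((fit_linear - true_mean(grid)) ** 2)
mse_kernel = np.mean((fit_kernel - true_mean(grid)) ** 2)
```

Here the linear fit misses almost all of the structure, so `mse_kernel` comes out far below `mse_linear`. In a setting where the conditional mean really is close to linear, the ranking can easily reverse.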

Continuing:

So then, statistical machine learning (ML) and time series econometrics (TS) have lots in common. But there's also an interesting difference: ML's emphasis on flexible nonparametric modeling of conditional-mean nonlinearity doesn't play a big role in TS.

Of course there are the traditional TS conditional-mean nonlinearities: smooth non-linear trends, seasonal shifts, and so on. But there's very little evidence of important conditional-mean nonlinearity in the covariance-stationary (de-trended, de-seasonalized) dynamics of most economic time series. Not that people haven't tried hard -- really hard -- to find it, with nearest neighbors, neural nets, random forests, and lots more.

So it's no accident that things like linear autoregressions remain overwhelmingly dominant in TS. Indeed I can think of only one type of conditional-mean nonlinearity that has emerged as repeatedly important for (at least some) economic time series: Hamilton-style Markov-switching dynamics.

[Of course there's a non-linear elephant in the room: Engle-style GARCH-type dynamics. They're tremendously important in financial econometrics, and sometimes also in macro-econometrics, but they're about conditional variances, not conditional means.]

So there are basically only two important non-linear models in TS, and only one of them speaks to conditional-mean dynamics. And crucially, they're both very tightly parametric, closely tailored to specialized features of economic and financial data.
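
For readers who haven't seen them written down, here is a stripped-down simulation of each. Both are sketches under my own assumptions: the Markov-switching example has only a switching mean (no autoregressive terms), the GARCH example is the plain GARCH(1,1) case, and all parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 2000

# Markov-switching mean: a latent two-state Markov chain s_t selects the
# conditional mean, so the nonlinearity is in the conditional mean.
mu = np.array([1.0, -1.0])       # state-dependent means
p_stay = np.array([0.95, 0.90])  # probability of remaining in state 0 or state 1
s = np.zeros(T, dtype=int)
for t in range(1, T):
    stay = rng.random() < p_stay[s[t - 1]]
    s[t] = s[t - 1] if stay else 1 - s[t - 1]
y_ms = mu[s] + 0.5 * rng.standard_normal(T)

# GARCH(1,1): the conditional mean is zero throughout; the dynamics are
# entirely in the conditional variance.
omega, alpha, beta_g = 0.05, 0.10, 0.85
cond_var = np.empty(T)
e = np.empty(T)
cond_var[0] = omega / (1.0 - alpha - beta_g)  # start at the unconditional variance
e[0] = np.sqrt(cond_var[0]) * rng.standard_normal()
for t in range(1, T):
    cond_var[t] = omega + alpha * e[t - 1] ** 2 + beta_g * cond_var[t - 1]
    e[t] = np.sqrt(cond_var[t]) * rng.standard_normal()
```

Each model is pinned down by a handful of parameters (two means and two staying probabilities in one case, three variance parameters in the other), which is what "tightly parametric" amounts to in practice.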

Now let's step back and assemble things:

ML emphasizes approximating non-linear conditional-mean functions in highly-flexible non-parametric fashion. That turns out to be doubly unnecessary in TS: There's just not much conditional-mean non-linearity to worry about, and when there occasionally is, it's typically of a highly-specialized nature best approximated in highly-specialized (tightly-parametric) fashion.