20  Pooled Regression

Main Takeaway

For pooled regression with panel data:

  • Model structure: \(y_{it} = \bx_{it}' \bbeta + u_{it}\) treats all observations as independent cross-sections
  • Key assumption: Strict mean independence \(\mathbb{E}[u_{it} | \bX_i] = 0\) (errors independent of all regressors across time)
  • Estimation: Standard OLS \(\hat{\bbeta}_{\text{pool}} = (\bX' \bX)^{-1} \bX' \by\)
  • Inference: Use cluster-robust standard errors to account for serial correlation within individuals
  • Limitation: Strict mean independence often violated; excludes lagged dependent variables and unobserved heterogeneity

20.1 Notation

This chapter focuses on panel data regression models whose observations are pairs \((y_{it}, \bx_{it})\) where \(y_{it}\) is the dependent variable and \(\bx_{it}\) is a \(K\)-vector of regressors — observations on individual \(i\) for time period \(t\).

It will be useful to cluster the observations at the level of the individual. We write \(\by_i = (y_{i1}, y_{i2}, \ldots, y_{iT_i})'\) as the \(T_i \times 1\) stacked observations on \(y_{it}\) for \(t \in S_i\), stacked in chronological order. Similarly, \(\bX_i = (\bx_{i1}, \bx_{i2}, \ldots, \bx_{iT_i})'\) is the \(T_i \times k\) matrix of stacked \(\bx_{it}'\).

When we assume a balanced panel, that is, \(T_i = T,\) for \(i=1,\ldots,N.\) We use this assumption for simplicity in notations throughout the chapter.

For the full sample:

  • \(\by = (\by_1', \by_2', \dots, \by_N')'\) is the \(n \times 1\) stacked vector of stacked \(\by_i,\) and
  • \(\bX = (\bX_1', \bX_2', \dots, \bX_N')'\) likewise.

20.2 Model Setup

The simplest model in panel regression is the pooled regression. At the level of the observation, the model is:

\[ \begin{split} y_{it} &= \bx_{it}' \bbeta + u_{it}, \\ \mathbb{E}[\bx_{it} u_{it}] &= 0 \end{split} \]

At the individual level:

\[ \begin{split} \by_i &= \bX_i \bbeta + \bu_i, \\ \mathbb{E}[X_i' \bu_i] &= 0 . \end{split} \]

where

\[ \underset{(T\times 1)}{\by_i} = \begin{bmatrix} y_{i1} \\ y_{i2} \\ \vdots \\ y_{iT} \\ \end{bmatrix} , \quad \underset{(T\times K)}{\bX_i} = \begin{bmatrix} \bx_{i1}' \\ \bx_{i2}' \\ \vdots \\ \bx_{iT}' \\ \end{bmatrix} , \quad \underset{(N\times 1)}{\bu_i} = \begin{bmatrix} u_{i1} \\ u_{i2} \\ \vdots \\ u_{iT} \\ \end{bmatrix} \]

For the full sample:

\[ \by = \bX\beta + \bu \] where

\[ \underset{(NT\times 1)}{\by} = \begin{bmatrix} \by_{1} \\ \by_{2} \\ \vdots \\ \by_{N} \end{bmatrix} , \underset{(NT\times 1)}{\bu} = \begin{bmatrix} \bu_{1} \\ \bu_{2} \\ \vdots \\ \bu_{N} \end{bmatrix} , \underset{(NT\times K)}{\bX} = \begin{bmatrix} \bX_{1} \\ \bX_{2} \\ \vdots \\ \bX_{N} \end{bmatrix} \]

The pooled regression model is appropriate when the errors \(u_{it}\) satisfy strict mean independence:

\[ \mathbb{E}[u_{it} \mid \bX_i] = 0 \tag{20.1}\]

This occurs when the errors \(u_{it}\) are mean independent of all regressors \(\bx_{ij}\) for all time periods \(j=1,\ldots,T.\)

This strict mean independence implies that neither lagged nor future values of \(\bx_{it}\) help predict \(u_{it}\). It excludes lagged dependent variables (such as \(y_{it-1}\)) from \(\bx_{it}\), otherwise \(u_{it}\) would be predictable given \(\bx_{it}.\)

Strict mean independence is stronger than pairwise mean independence

\[ \mathbb{E}[u_{it} \mid \bx_{ij}] = 0 \]

as well as the projection assumption

\[ \mathbb{E}[u_{it} \bx_{ij}] = \bold{0} \]

The standard estimator of \(\beta\) in the pooled regression model is least-squares, given by:

\[ \begin{split} \hat{\beta}_{\text{pool}} &= \left(\sum_{i=1}^N\sum_{t=1}^T \bx_{it}\bx_{it}' \right)^{-1} \left(\sum_{i=1}^N\sum_{t=1}^T \bx_{it}y_{it} \right) \\ &= \left( \sum_{i=1}^N X_i' X_i \right)^{-1} \left( \sum_{i=1}^N X_i' y_i \right) \\ &= (X' X)^{-1} X' y \end{split} \]

\(\hat{\beta}_{\text{pool}}\) is called the pooled regression estimator.

The vector of least-squares residuals for the \(i\)th individual is:

\[ \widehat{\bu}_i = \by_i - \bX_i \hat{\bbeta}_{\text{pool}} \]

20.3 Statistical Properties

The estimator can be rewritten as:

\[ \begin{split} \hat{\bbeta}_{\text{pool}} &= \left( \sum_{i=1}^N \bX_i' \bX_i \right)^{-1} \left( \sum_{i=1}^N \bX_i' \left(\bX_i\bbeta + \bu_i\right) \right) \\ &= \bbeta + \left( \sum_{i=1}^N \bX_i' \bX_i \right)^{-1} \left( \sum_{i=1}^N \bX_i' \bu_i \right) \end{split} \]

Given (24.4), \(\hat{\bbeta}_{\text{pool}}\) is unbiased.

  • If \(u_{it}\) is homoskedastic and serially uncorrelated: use classical variance estimator.
  • If \(u_{it}\) is heteroskedastic: use heteroskedasticity-robust estimator.
  • If \(u_{it}\) is serially correlated: use cluster-robust covariance matrix:

\[ \widehat{\bV}_{\text{pool}} = (\bX' \bX)^{-1} \left( \sum_{i=1}^N \bX_i' \widehat{\bu}_i \widehat{\bu}_i' \bX_i \right) (\bX' \bX)^{-1} \]

With Stata’s degrees-of-freedom adjustment:

\[ \widehat{\bV}_{\text{pool}} = \left( \frac{n - 1}{n - k} \right) \left( \frac{N}{N - 1} \right) (\bX' \bX)^{-1} \left( \sum_{i=1}^N \bX_i' \widehat{\bu}_i \widehat{\bu}_i' \bX_i \right) (\bX' \bX)^{-1} \]

When strict mean independence (24.4) fails, however, the pooled least-squares estimator \(\hat{\bbeta}_{\text{pool}}\) is not necessarily consistent for \(\bbeta\). Since strict mean independence is a strong and typically undesirable restriction, it is typically preferred to adopt one of the alternative estimation approaches such as Fixed Effects or Random Effects models.