19 Pooled Regression
19.1 Notation
This chapter focuses on panel data regression models whose observations are pairs \((y_{it}, \bx_{it})\) where \(y_{it}\) is the dependent variable and \(\bx_{it}\) is a \(K\)-vector of regressors — observations on individual \(i\) for time period \(t\).
It will be useful to cluster the observations at the level of the individual. We write \(\by_i = (y_{i1}, y_{i2}, \ldots, y_{iT_i})'\) as the \(T_i \times 1\) stacked observations on \(y_{it}\) for \(t \in S_i\), stacked in chronological order. Similarly, \(\bX_i = (\bx_{i1}, \bx_{i2}, \ldots, \bx_{iT_i})'\) is the \(T_i \times k\) matrix of stacked \(\bx_{it}'\).
When we assume a balanced panel, that is, \(T_i = T,\) for \(i=1,\ldots,N.\) We use this assumption for simplicity in notations throughout the chapter.
For the full sample:
- \(\by = (\by_1', \by_2', \dots, \by_N')'\) is the \(n \times 1\) stacked vector of stacked \(\by_i,\) and
- \(\bX = (\bX_1', \bX_2', \dots, \bX_N')'\) likewise.
19.2 Model Setup
The simplest model in panel regression is the pooled regression. At the level of the observation, the model is:
\[ \begin{split} y_{it} &= \bx_{it}' \bbeta + u_{it}, \\ \mathbb{E}[\bx_{it} u_{it}] &= 0 \end{split} \]
At the individual level:
\[ \begin{split} \by_i &= \bX_i \bbeta + \bu_i, \\ \mathbb{E}[X_i' \bu_i] &= 0 . \end{split} \]
where
\[ \underset{(T\times 1)}{\by_i} = \begin{bmatrix} y_{i1} \\ y_{i2} \\ \vdots \\ y_{iT} \\ \end{bmatrix} , \quad \underset{(T\times K)}{\bX_i} = \begin{bmatrix} \bx_{i1}' \\ \bx_{i2}' \\ \vdots \\ \bx_{iT}' \\ \end{bmatrix} , \quad \underset{(N\times 1)}{\bu_i} = \begin{bmatrix} u_{i1} \\ u_{i2} \\ \vdots \\ u_{iT} \\ \end{bmatrix} \]
For the full sample:
\[ \by = \bX\beta + \bu \] where
\[ \underset{(NT\times 1)}{\by} = \begin{bmatrix} \by_{1} \\ \by_{2} \\ \vdots \\ \by_{N} \end{bmatrix} , \underset{(NT\times 1)}{\bu} = \begin{bmatrix} \bu_{1} \\ \bu_{2} \\ \vdots \\ \bu_{N} \end{bmatrix} , \underset{(NT\times K)}{\bX} = \begin{bmatrix} \bX_{1} \\ \bX_{2} \\ \vdots \\ \bX_{N} \end{bmatrix} \]
The least-squares estimator of \(\beta\) is given by:
\[ \begin{split} \hat{\beta}_{\text{pool}} &= \left(\sum_{i=1}^N\sum_{t=1}^T \bx_{it}\bx_{it}' \right)^{-1} \left(\sum_{i=1}^N\sum_{t=1}^T \bx_{it}y_{it} \right) \\ &= \left( \sum_{i=1}^N X_i' X_i \right)^{-1} \left( \sum_{i=1}^N X_i' y_i \right) \\ &= (X' X)^{-1} X' y \end{split} \]
\(\hat{\beta}_{\text{pool}}\) is called the pooled regression estimator.
The residuals for individual \(i\):
\[ \widehat{\bu}_i = \by_i - \bX_i \hat{\bbeta}_{\text{pool}} \]
The pooled regression model is appropriate when:
\[ \mathbb{E}[u_{it} \mid \bX_i] = 0 \tag{19.1}\]
This strict mean independence implies that neither lagged nor future values of \(\bx_{it}\) help predict \(u_{it}\). It excludes lagged dependent variables (such as \(y_{it-1}\)) from \(\bx_{it}\), otherwise \(u_{it}\) would be predictable given \(\bx_{it}.\)
19.3 Statistical Properties
The estimator can be rewritten as:
\[ \begin{split} \hat{\bbeta}_{\text{pool}} &= \left( \sum_{i=1}^N \bX_i' \bX_i \right)^{-1} \left( \sum_{i=1}^N \bX_i' \left(\bX_i\bbeta + \bu_i\right) \right) \\ &= \bbeta + \left( \sum_{i=1}^N \bX_i' \bX_i \right)^{-1} \left( \sum_{i=1}^N \bX_i' \bu_i \right) \end{split} \]
Given (23.4), \(\hat{\bbeta}_{\text{pool}}\) is unbiased.
- If \(u_{it}\) is homoskedastic and serially uncorrelated: use classical variance estimator.
- If \(u_{it}\) is heteroskedastic: use heteroskedasticity-robust estimator.
- If \(u_{it}\) is serially correlated: use cluster-robust covariance matrix:
\[ \widehat{\bV}_{\text{pool}} = (\bX' \bX)^{-1} \left( \sum_{i=1}^N \bX_i' \widehat{\bu}_i \widehat{\bu}_i' \bX_i \right) (\bX' \bX)^{-1} \]
With Stata’s degrees-of-freedom adjustment:
\[ \widehat{\bV}_{\text{pool}} = \left( \frac{n - 1}{n - k} \right) \left( \frac{N}{N - 1} \right) (\bX' \bX)^{-1} \left( \sum_{i=1}^N \bX_i' \widehat{\bu}_i \widehat{\bu}_i' \bX_i \right) (\bX' \bX)^{-1} \]
When strict mean independence (23.4) fails, however, the pooled least-squares estimator \(\hat{\bbeta}_{\text{pool}}\) is not necessarily consistent for \(\bbeta\). Since strict mean independence is a strong and typically undesirable restriction, it is typically preferred to adopt one of the alternative estimation approaches such as Fixed Effects or Random Effects models.