library(plm)
library(AER)
data(Grunfeld)
# Estimate Between model
<- plm(invest ~ value + capital,
model_between data = Grunfeld,
index = c("firm", "year"),
model = "between")
# Estimate Within model
<- plm(invest ~ value + capital,
model_within data = Grunfeld,
index = c("firm", "year"),
model = "within")
18 Fixed Effects Model
Consider the following individual fixed effects model (at the level of the observation):
\[ \begin{aligned} y_{it} &= \alpha_i + \bbeta'\bx_{it} + u_{it}, \quad t=1,2,\ldots,T, \\ \text{or}\quad y_{it} &= \alpha_i + \beta_1 x_{it1} + \beta_2 x_{it2} + \cdots + \beta_K x_{itK} + u_{it}. \end{aligned} \tag{18.1}\] There are \(K\) regressors in \(\bx_{it},\) NOT including an intercept.
We assume strict exogeneity:
\[ \E[\bu_i|\bX_i]=0. \]
Throughout, we will assume that the number \(N\) of cross-sectional units is potentially large, while the number \(T\) of data points per unit is fixed. This setup is more reflective of typical micropanels.
Alternative notations:
Stacking the \(T\) equations at the level of the individual, the model is written as:
\[ \by_i = \bX_i \bbeta + \bold{1} \alpha_i + \bu_i , \] where \(\bold{1}\) is a \(T\times 1\) column of ones.
Equation for the full sample:
\[ \begin{bmatrix} \by_{1} \\ \by_{2} \\ \vdots \\ \by_{N} \end{bmatrix} = \begin{bmatrix} \bX_{1} \\ \bX_{2} \\ \vdots \\ \bX_{N} \end{bmatrix} \bbeta + \begin{bmatrix} \bold{1} & 0 & \cdots & 0 \\ 0 & \bold{1} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \bold{1} \\ \end{bmatrix} \begin{bmatrix} \alpha_{1} \\ \alpha_{2} \\ \vdots \\ \alpha_{N} \end{bmatrix} + \begin{bmatrix} \bu_{1} \\ \bu_{2} \\ \vdots \\ \bu_{N} \end{bmatrix} \] or
\[ \begin{split} \by &= \begin{bmatrix} \bX & \bd_1 & \bd_2 & \cdots & \bd_n \end{bmatrix} \begin{bmatrix} \bbeta \\ \balpha \end{bmatrix} + \bu \\ &= \bX\bbeta + \bD\balpha + \bu , \end{split} \] where \(\bd_i\) is a dummy variable indicating the \(i\)-th unit, and
\[ \underset{(NT\times N)}{\bD} = \begin{bmatrix}\bd_1 & \bd_2 & \cdots & \bd_n \end{bmatrix} . \]
18.1 Within Estimator
Within (or “fixed effects”) estimation of the random intercept model has two steps:
- Apply the within transformation to each unit.
- Apply OLS to the resulting pooled data.
18.1.1 Step 1: Within Transformation
To perform the within transformation, we first average the equations for unit \(i\) across \(t\). Label the average of \(y_{it}\) across \(t\) for unit \(i\) as
\[ \overline{y}_{i} = \frac{1}{T} \sum_{t=1}^{T} y_{it}. \]
The averaged outcome \(\overline{y}_{i}\) satisfies the averaged (or “cross-sectional”) equation
\[ \overline{y}_{i} = \alpha_i + \bbeta' \overline{\bx}_{i} + \overline{u}_{i}, \tag{18.2}\] where \(\overline{\bx}_{i}\) and \(\overline{u}_{i}\) are defined analogously to \(\overline{y}_{i}\).
Define the within-transformed equation by subtracting this averaged equation (18.2) from the original equation (18.1) for each \(t\):
\[ y_{it} - \overline{y}_{i} = (\alpha_i - \alpha_i) + \bbeta'(\bx_{it} - \overline{\bx}_{i}) + (u_{it} - \overline{u}_{i}) \] or \[ \widetilde{y}_{it} = \bbeta' \widetilde{\bx}_{it} + \widetilde{u}_{it}. \tag{18.3}\]
where \(\widetilde{y}_{it} = y_{it} - \overline{y}_{i}\) is the deviations from the group means, and similarly for \(\widetilde{\bx}_{it}\) and \(\widetilde{u}_{it}.\)
Under Equation 18.1, the within transformation eliminates the individual random intercepts \(\alpha_i\). All average differences in \(y\)’s or \(\bx\)’s between individuals have been wiped out. Equation 18.3 now looks like a regular homogeneous regression.
18.1.2 Step 2: OLS on the Within-Transformed Equation
The within (fixed effects) estimator is obtained by simply pooling the data across \(i\) and \(t\) in Equation 18.3 and applying OLS to it. Specifically, the estimator is given by
\[ \begin{aligned} \hat{\bbeta}^W &= \left(\sum_{i=1}^{N} \tilde{\bX}_i' \tilde{\bX}_i \right)^{-1} \sum_{i=1}^{N} \left( \tilde{\bX}_i'\, \tilde{\mathbf{y}}_i \right) \\ &= \left(\sum_{i=1}^{N} \sum_{=1}^{T} \tilde{\bx}_{it} \tilde{\bx}_{it}' \right)^{-1} \left(\sum_{i=1}^{N}\sum_{=1}^{T} \tilde{\bx}_{it}\, \tilde{\mathbf{y}}_{it} \right) . \end{aligned} \tag{18.4}\]
It is called the within estimator because it uses the time variation within each individual.
18.1.3 Properties
The within estimator enjoys several desirable properties if the random intercept model reflects the underlying causal model. Most of these properties can be derived from its sampling error representation \[ \hat{\bbeta}^W = \bbeta + \left(\sum_{i=1}^{N} \tilde{\bX}_i' \tilde{\bX}_i \right)^{-1} \sum_{i=1}^{N} \tilde{\bX}_i' \tilde{\bu}_i. \tag{18.5}\]
\(\hat{\bbeta}^W\) is unbiased for \(\bbeta\). To show this, it is sufficient to notice that strict exogeneity of \(\bu_i\) with respect to \(\bX_i\) implies strict exogeneity of \(\tilde{\bu}_i\) with respect to \(\tilde{\bX}_i\): \[ \E[\tilde{\bu}_i|\tilde{\bX}_i] = \E[\E[\tilde{\bu}_i|\bX_i]|\tilde{\bX}_i] = 0. \] It follows that the mean of the second term in Equation 18.5 is 0, and so \[ \E[\hat{\bbeta}^W] = \bbeta. \]
The variance of \(\hat{\bbeta}^W\) is
\[ \bV_{\hat{\bbeta}^W} = \var(\hat{\bbeta}^W\mid \bX) = \left(\sum_{i=1}^{N} \tilde{\bX}_i' \tilde{\bX}_i \right)^{-1} \left( \sum_{i=1}^{N} \tilde{\bX}_i'\, \E[\tilde{\bu}_i \tilde{\bu}_i'\mid \bX_i] \,\tilde{\bX}_i \right) \left(\sum_{i=1}^{N} \tilde{\bX}_i' \tilde{\bX}_i \right)^{-1} . \] Let \[ \bSigma_i = \E[\tilde{\bu}_i \tilde{\bu}_i'\mid \bX_i] \] be the \(T\times T\) conditional matrix of the idiosyncratic errors. We can rewrite \(\bV_{\hat{\bbeta}^W}\) as
\[ \bV_{\hat{\bbeta}^W} = \left(\sum_{i=1}^{N} \tilde{\bX}_i' \tilde{\bX}_i \right)^{-1} \left( \sum_{i=1}^{N} \tilde{\bX}_i'\, \bSigma_i \,\tilde{\bX}_i \right) \left(\sum_{i=1}^{N} \tilde{\bX}_i' \tilde{\bX}_i \right)^{-1} . \]
If we assume the idiosyncratic errors are homoskedastic and serially uncorrelated:
\[ \begin{aligned} \E(u_{it}^2 \mid \bX_i) &= \sigma^2_u \\ \E(u_{is}u_{it} \mid \bX_i) &= 0 \quad (x\ne t) \end{aligned} \]
In this case,
\[ \bSigma_i = \mathbf I_i \sigma^2_u . \]
\(\bV_{\hat{\bbeta}^W}\) simplifies to
\[ \bV_{\hat{\bbeta}^W}^0 = \sigma^2_u \left(\sum_{i=1}^{N} \tilde{\bX}_i' \tilde{\bX}_i \right)^{-1} . \]
\(\hat{\bbeta}^W\) is consistent for \(\bbeta\) and asymptotically normal, provided a standard rank condition holds for \(\tilde{\bX}_i\): \[ \hat{\bbeta}^W \xrightarrow{p} \bbeta, \quad \sqrt{N}(\hat{\bbeta}^W - \bbeta) \Rightarrow N(0, \bV_{\hat{\bbeta}^W}). \tag{18.6}\]
Since \(\bbeta\) is the average coefficient vector in this homogeneous model, we conclude that the within estimator consistently estimates average coefficients under the random intercept model (18.1).
Covariance matrix estimation
An estimator for \(\bV_{\hat{\bbeta}^W}^0\) is
\[ \widehat{\bV}_{\hat{\bbeta}^W}^0 = \hat{\sigma}^2_u \left(\sum_{i=1}^{N} \tilde{\bX}_i' \tilde{\bX}_i \right)^{-1} \] where \(\hat{\sigma}^2_u\) is an unbiased estimator of \(\sigma^2_u\) given by:
\[ \hat{\sigma}^2_u = \frac{\sum_{i=1}^{N}\sum_{t=1}^{T}\widehat{\widetilde{u}}_{it}^2}{NT-N-K} . \]
A covariance matrix estimator which allows \(u_{it}\) to be heteroskedastic and serially correlated across \(t\) is the cluster-robust covariance matrix estimator, clustered by individual
\[ \widehat{\bV}_{\hat{\bbeta}^W}^{\text{cluster}} = \left(\sum_{i=1}^{N} \tilde{\bX}_i' \tilde{\bX}_i \right)^{-1} \left( \sum_{i=1}^{N} \tilde{\bX}_i'\, \widehat{\widetilde{\bu}}_{i}\widehat{\widetilde{\bu}}_{i}' \,\tilde{\bX}_i \right) \left(\sum_{i=1}^{N} \tilde{\bX}_i' \tilde{\bX}_i \right)^{-1} \]
18.2 Between-groups Estimator
The Between-groups estimator (or between estimator) is obtained by regressing the time averaged variables on each other (18.2) (where we include an intercept, \(\beta_0\)) using OLS regression. \[ \overline{y}_{i} =\bbeta' \overline{\bx}_{i} + (\alpha_i + \overline{u}_{i}) , \quad i=1,\ldots,N. \] This is a cross-sectional regression, with a sample size being the number of entities.
The Between Groups estimator is given by
\[ \hat{\bbeta}^{\text{B}} = \left( \sum_{i=1}^N \overline{\bx}_{i}\overline{\bx}_{i}'\right)^{-1} \left( \sum_{i=1}^N \overline{\bx}_{i}\overline{y}_{i}\right) \]
Note that the between estimator is biased when \(\alpha_i\) is correlated with \(\bx_{i\cdot}\) (\(\bx_{i\cdot}\) endogenous).
It can be suitable for research questions that specifically address variation between different entities rather than changes within entities over time. One caveat is that the Between-groups estimator overlooks important information about how variables change over time.
To obtain unbiased estimates, it is crucial to ensure the assumption of zero correlation between the error term and averaged independent variables.
\[ \E[\bx_{i\cdot} \alpha_i] = 0 . \] Between Groups estimator is not efficient. In practice, it is only used to obtain an estimate of \(\sigma_{\alpha}^2\) when implementing feasible GLS.
If we think \(\alpha_i\) is uncorrelated with \(\bx_{i\cdot},\) it is better to use the random effects estimator.
library(stargazer)
stargazer(model_between, model_within,
column.labels = c("Between", "Within"),
type="html", digits=2,
notes = "<span>*</span>: p<0.1; <span>**</span>: <strong>p<0.05</strong>; <span>***</span>: p<0.01 <br> Standard errors in parentheses.",
notes.append = F)
Dependent variable: | ||
invest | ||
Between | Within | |
(1) | (2) | |
value | 0.13*** | 0.11*** |
(0.03) | (0.01) | |
capital | 0.03 | 0.31*** |
(0.17) | (0.02) | |
Constant | -7.38 | |
(40.44) | ||
Observations | 11 | 220 |
R2 | 0.86 | 0.77 |
Adjusted R2 | 0.83 | 0.75 |
F Statistic | 25.50*** (df = 2; 8) | 340.08*** (df = 2; 207) |
Note: |
*: p<0.1; **: p<0.05; ***: p<0.01 Standard errors in parentheses. |
References
- Vladislav Morozov, Econometrics with Unobserved Heterogeneity, Course material, https://vladislav-morozov.github.io/econometrics-heterogeneity/linear/linear-within-estimator.html#recall-within-estimator-in-the-random-intercept-model
- Vladislav Morozov, GitHub course repository, https://github.com/vladislav-morozov/econometrics-heterogeneity/blob/main/src/linear/linear-within-estimator.qmd