11.7 Common Correlated Effects Models

This section gives an overview of how the Stata package xtdcce2 (installed from SSC) estimates Common Correlated Effects (CCE) models, which are useful for panel data with heterogeneous coefficients and unobserved common factors.

Environment setup

ssc install xtdcce2
// dependencies for xtdcce2
ssc install moremata
ssc install xtcd2

Data has to be xtset before using xtdcce2.
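As a minimal sketch, assuming the panel identifier is called id and the time variable is called year (both names are placeholders, not taken from the text), the declaration could look like this:

```stata
* id and year are assumed variable names; substitute those in your dataset
xtset id year
* optional check of the panel structure (balance, gaps) before estimating
xtdescribe
```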

11.7.1 Econometric model

An ARDL(1, 1) model with heterogeneous coefficients and common correlated effects (CCE) is given by:

\[ \begin{equation} \begin{split} y(i,t) &= b0(i) + b1(i) * y(i,t-1) + b2(i) * x(i,t) + b3(i) * x(i,t-1) \\ &\phantom{=}\quad + u(i,t) \end{split} \tag{11.2} \end{equation} \]

where

\[ u(i,t) = g(i) * f(t) + e(i,t) \]

f(t) is an unobserved common factor,
g(i) is a heterogeneous factor loading,
x(i,t) is a (1 x K) vector, and b2(i) and b3(i) are the corresponding coefficient vectors. x(i,t) is assumed to be strictly exogenous.
The error e(i,t) is iid.
The heterogeneous coefficients b1(i), b2(i) and b3(i) are randomly distributed around a common mean.
- In the case of a static panel model, we have b1(i) = 0.
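To make the data-generating process concrete, the following sketch simulates a small panel from Eq. (11.2) with K = 1. The sample sizes, parameter distributions and variable names are arbitrary assumptions chosen for illustration; none of them come from the text.

```stata
* Illustrative simulation of Eq. (11.2); all numeric values are assumptions
clear
set seed 12345
set obs 50                                  // N = 50 cross-sectional units
generate id = _n
generate double b0 = rnormal(1, 0.5)        // heterogeneous intercepts b0(i)
generate double b1 = runiform(0.2, 0.6)     // heterogeneous AR coefficients b1(i), below 1
generate double b2 = rnormal(1, 0.3)        // coefficients on x(i,t)
generate double b3 = rnormal(0.5, 0.3)      // coefficients on x(i,t-1)
generate double g  = rnormal(1, 0.5)        // heterogeneous factor loadings g(i)
expand 40                                   // T = 40 periods per unit
bysort id: generate t = _n
* common factor f(t): one draw per period, shared by all units
bysort t (id): generate double f = rnormal() if _n == 1
bysort t (id): replace f = f[1]
* x also loads on f, so ignoring the factor would bias b2; x is independent of e
generate double x = 0.5*f + rnormal()
generate double e = rnormal()               // iid idiosyncratic error e(i,t)
* build y recursively within each unit; the first period has no lags (initial condition)
bysort id (t): generate double y = b0 + b2*x + g*f + e if _n == 1
bysort id (t): replace y = b0 + b1*y[_n-1] + b2*x + b3*x[_n-1] + g*f + e if _n > 1
xtset id t
```

The recursion relies on replace working through the observations in order, so y[_n-1] already holds the updated value of the previous period.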

11.7.2 Estimation

11.7.2.1 Static

Pesaran (2006) shows that the means of the coefficients b0, b2 and b3 (for example, b2(mg) = 1/N sum(b2(i))) can be consistently estimated by adding cross-sectional means of the dependent variable and of all independent variables to the regression.

The default equation in xtdcce2 is given by:

\[ \begin{equation} y(i,t) = b0(i) + b2(i)*x(i,t) + d(i)*z(i,t) + e(i,t). \tag{11.3} \end{equation} \]

Note that Eq. (11.3) is a static model: the lagged dependent variable does not appear and only contemporaneous cross-sectional averages are used.

Including the dependent and independent variables in crosssectional() and setting cr_lags(0) leads to the same result.

cr_lags(0) means that only contemporaneous cross sectional means are included.
crosssectional() defines the variables to be included in z(i,t).
Note that b1(i) is set to zero.

Example

xtdcce2 d.log_rgdpo log_hc log_ck log_ngd , cr(_all) reportc

cr(_all) means that all variables are included in the cross sectional means.

It is equivalent to crosssectional(log_rgdpo log_hc log_ck log_ngd).
The default number of cross sectional lags is zero (cr_lags(0)), implying only contemporaneous cross sectional averages are used.

cr_lags(3) would include the contemporaneous cross-sectional means and their first three lags.
reportc reports the constant term. If not specified the constant is partialled out.
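The mechanics behind the static estimator can be sketched by hand: compute the cross-sectional means, run one augmented OLS regression per unit, and average the unit-specific coefficients. The sketch below assumes the data are xtset with a panel variable id and a time variable year (hypothetical names). It illustrates the idea and is not the package's internal code, so small differences from the xtdcce2 output (for example in how the estimation sample is handled) are possible.

```stata
* sketch only: id and year are assumed names for the panel and time variables
preserve
generate double dy = d.log_rgdpo            // dependent variable of the example above
* contemporaneous cross-sectional means, one value per period
foreach v in dy log_hc log_ck log_ngd {
    bysort year: egen double `v'_bar = mean(`v')
}
* unit-by-unit OLS augmented with the cross-sectional means
statsby _b, by(id) clear: regress dy log_hc log_ck log_ngd ///
    dy_bar log_hc_bar log_ck_bar log_ngd_bar
* mean group estimates: simple averages of the unit-specific coefficients
summarize _b_log_hc _b_log_ck _b_log_ngd _b_cons
restore
```

The Mean column reported by summarize corresponds to the mean group coefficients under these assumptions.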

11.7.2.2 Dynamic

Chudik and Pesaran (2015) extend the approach to dynamic panel data models (b1(i) != 0); pT lags of the cross-sectional means are added to achieve consistency.

The mean group estimates for b1, b2 and b3 are consistent as long as N, T and pT go to infinity. This implies that the number of cross-sectional units and the number of time periods are assumed to grow at the same rate.

In an empirical setting this can be interpreted as N/T being constant.
A dataset in which one dimension is large relative to the other would lead to inconsistent estimates, even if both dimensions are large in absolute terms.

xtdcce2 estimates the following dynamic CCE model:

\[ \begin{equation} \begin{split} y(i,t) &= b0(i) + b1(i)*y(i,t-1) + b2(i)*x(i,t) \\ &\phantom{=}\quad + \sum_{s=t-pT}^{t} [d(i)*z(i,s)] + e(i,t). \end{split} \tag{11.4} \end{equation} \]

Eq. (11.4) is estimated if the option cr_lags() contains a positive number.

z(i,s) is the cross sectional average of the variables defined in crosssectional().

Example

xtdcce2 d.log_rgdpo L.log_rgdpo log_hc log_ck log_ngd , ///
    reportc cr(log_rgdpo  log_hc log_ck log_ngd) cr_lags(3)

cr_lags(3) sets the number of lags of the cross-sectional means to 3.
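The section does not say how to choose pT. A rule of thumb often attributed to Chudik and Pesaran (2015) is the integer part of T^(1/3). The sketch below applies that rule; the rule itself and the assumption of a regularly spaced, already xtset time variable are additions for illustration and should be checked against the package documentation.

```stata
* assumption: data are already xtset; pT = integer part of T^(1/3) is a rule of thumb
quietly xtset
local T  = (r(tmax) - r(tmin)) / r(tdelta) + 1     // number of periods spanned by the panel
local pT = floor(`T'^(1/3) + 1e-8)                 // small guard against rounding error
display "T = `T', cr_lags = `pT'"
xtdcce2 d.log_rgdpo L.log_rgdpo log_hc log_ck log_ngd , ///
    reportc cr(log_rgdpo log_hc log_ck log_ngd) cr_lags(`pT')
```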

The variance of the mean group coefficient b1(mg) is estimated as:

\[ \operatorname{var}(b1(mg)) = \frac{1}{N} \sum_{i=1}^N \left(b1(i) - b1(mg)\right)^2 \]

For the vector \(pi(mg) = \left(b0(mg), b1(mg)\right)',\) the variance is given by:

\[ \operatorname{var}(pi(mg)) = \frac{1}{N} \sum_{i=1}^N \left(pi(i) - pi(mg)\right) \; \left(pi(i)-pi(mg)\right)' \]
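As a small worked example of the scalar formula, take N = 3 and unit-specific estimates b1(1) = 0.2, b1(2) = 0.4 and b1(3) = 0.6. These numbers are made up purely for illustration.

\[ b1(mg) = \frac{0.2 + 0.4 + 0.6}{3} = 0.4, \qquad \operatorname{var}(b1(mg)) = \frac{1}{3}\left[(-0.2)^2 + 0^2 + (0.2)^2\right] = \frac{0.08}{3} \approx 0.0267 \]

Under the formula as stated above, the corresponding standard error is the square root of this value, roughly 0.163.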

11.7.2.3 Pooled Estimation

Eqs (11.3) and (11.4) can be estimated as a pooled model where the coefficients are assumed to be equal across all cross sectional units.

Hence the equations become:

Pooled Pesaran

\[ \begin{equation} y(i,t) = b0 + b2*x(i,t) + d(i)*z(i,t) + e(i,t). \end{equation} \]

Pooled Chudik and Pesaran

\[ \begin{equation} y(i,t) = b0 + b1*y(i,t-1) + b2*x(i,t) + \sum_{s=t-pT}^{t} [d(i)*z(i,s)] + e(i,t). \end{equation} \]

Variables with pooled (homogeneous) coefficients are specified using the pooled(varlist) option. The constant is pooled by using the option pooledconstant.

For pooled estimation, the standard errors are obtained from a mean group regression.

Example

xtdcce2 d.log_rgdpo L.log_rgdpo log_hc log_ck log_ngd , ///
    reportc cr(log_rgdpo  log_hc log_ck log_ngd) ///
    pooled(L.log_rgdpo  log_hc log_ck log_ngd) cr_lags(3) pooledconstant

pooled(L.log_rgdpo log_hc log_ck log_ngd) means all coefficients should be pooled.
pooledconstant means the constant is pooled.
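Because pooled() takes a variable list, it is also possible to pool only some coefficients. The following sketch, a variation on the example above rather than a specification taken from the text, pools only the coefficient on log_hc and leaves the other slopes and the constant heterogeneous:

```stata
* sketch: only log_hc gets a common coefficient across units
xtdcce2 d.log_rgdpo L.log_rgdpo log_hc log_ck log_ngd , ///
    reportc cr(log_rgdpo log_hc log_ck log_ngd) ///
    pooled(log_hc) cr_lags(3)
```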

11.7.3 Long run effects

11.7.4 Bootstrap

xtdcce2 can bootstrap confidence intervals and standard errors. It supports two types of bootstraps: the wild bootstrap and the cross-section bootstrap.

The cross-section bootstrap is the default method.

The cross-section bootstrap draws with replacement from the cross-sectional dimension. That is, it randomly draws cross-sectional units together with their entire time series and then re-estimates the model using xtdcce2.
The wild bootstrap is a slower form of the wild bootstrap implemented in boottest (Roodman et al. 2019). It reweights the residuals from the initial regression with Rademacher weights, recalculates the dependent variable and then runs xtdcce2.
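The idea of the cross-section bootstrap can be sketched by hand with Stata's bsample command. This is an illustration of the resampling scheme and not the package's built-in routine, whose syntax is documented in help xtdcce2. The variable names id and year, the 100 replications, and the assumption that the coefficient is stored under the variable name in e(b) are all hypothetical choices for this sketch.

```stata
* hand-rolled cross-section bootstrap; id and year are assumed variable names
set seed 2024
tempname sim
tempfile results
postfile `sim' double b_hc using `results'
forvalues r = 1/100 {
    preserve
    * draw whole units (with their entire time series) with replacement
    bsample, cluster(id) idcluster(boot_id)
    xtset boot_id year                      // resampled units need unique identifiers
    capture quietly xtdcce2 d.log_rgdpo L.log_rgdpo log_hc log_ck log_ngd , ///
        cr(log_rgdpo log_hc log_ck log_ngd) cr_lags(3)
    * keep the replication only if estimation succeeded
    if _rc == 0 post `sim' (_b[log_hc])
    restore
}
postclose `sim'
preserve
use `results', clear
summarize b_hc                              // bootstrap standard error = Std. dev.
_pctile b_hc, percentiles(2.5 97.5)         // percentile confidence interval
display "95% CI: " r(r1) " to " r(r2)
restore
```

Using idcluster() matters here: a unit drawn twice must enter as two distinct panels, otherwise xtset would reject the repeated time values.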
