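library(wooldridge)  # provides the fertil3 data; assumes the package is installed
data('fertil3')
# Model (1): first differences only
fertility_diff <- lm(diff(gfr) ~ diff(pe), data = fertil3)
# Model (2): adds two lags of the differenced personal exemption
fertility_lag <- lm(diff(gfr) ~ diff(pe) + diff(pe_1) + diff(pe_2), data = fertil3)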
7 TS Examples
Textbook: Chapter 11, Introductory Econometrics: A Modern Approach, 7e by Jeffrey M. Wooldridge
Summary notes by Marius v. Oordt: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3401712
7.1 Example 11.6: Fertility and Personal Exemption
gfr: general fertility rate
pe: personal exemption
Dependent variable: diff(gfr)
 | (1) | (2) |
---|---|---|
diff(pe) | -0.04268 (0.02837) | -0.03620 (0.02677) |
diff(pe_1) | | -0.01397 (0.02755) |
diff(pe_2) | | 0.10999*** (0.02688) |
Constant | -0.78478 (0.50204) | -0.96368** (0.46776) |
Observations | 71 | 69 |
R2 | 0.03176 | 0.23248 |
Adjusted R2 | 0.01773 | 0.19705 |
Residual Std. Error | 4.22082 (df = 69) | 3.85945 (df = 65) |
F Statistic | 2.26343 (df = 1; 69) | 6.56266*** (df = 3; 65) |
Note: *: p<0.1; **: p<0.05; ***: p<0.01. Standard errors in parentheses.
The first regression uses first differences:
\[ \begin{aligned} \Delta\widehat{gfr} &= -.785 - .043\, \Delta pe \\ &\phantom{=}\;\; (.502)\;\; (.028) \\ n &= 71, R^2=.032, \bar{R^2} = .018. \end{aligned} \tag{7.1}\]
The estimates indicate an increase in \(pe\) lowers \(gfr\) contemporaneously, although the estimate is not statistically different from zero at the 5% level.
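The significance claim can be checked by hand from the reported numbers. The following is a minimal sketch, not part of the original analysis: it copies the coefficient, standard error, and residual degrees of freedom from column (1) of the table and compares the resulting \(t\) statistic with the two-sided 5% critical value. The variable names are illustrative.

```r
# Values copied from column (1) of the regression table
beta_hat <- -0.04268   # coefficient on diff(pe)
se_hat   <-  0.02837   # its standard error
df_resid <- 69         # residual degrees of freedom

t_stat <- beta_hat / se_hat                        # about -1.50
t_crit <- qt(0.975, df = df_resid)                 # two-sided 5% critical value
p_val  <- 2 * pt(-abs(t_stat), df = df_resid)      # two-sided p-value

c(t_stat = t_stat, t_crit = t_crit, p_val = p_val)
```

Because the absolute \(t\) statistic falls below the critical value, the null hypothesis of a zero contemporaneous effect is not rejected at the 5% level.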
If we add two lags of \(\Delta pe,\) the fit improves considerably (\(R^2\) rises from .032 to .232):
\[ \begin{aligned} \Delta\widehat{gfr} &= -.964 - .036\, \Delta pe - .014\, \Delta pe_{-1} + .110\, \Delta pe_{-2} \\ &\phantom{=}\;\; (.468)\quad (.027) \qquad\; (.028) \qquad\quad\; (.027)\\ n &= 69, R^2=.232, \bar{R^2} = .197. \end{aligned} \tag{7.2}\]
We call model (7.2) a finite distributed lag (FDL) model of order two. A more general specification is
\[ y_t = \alpha_0 + \delta_0 z_t + \delta_1 z_{t-1} + \delta_2 z_{t-2} + u_t . \]
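To make the general specification concrete, here is a small sketch that simulates an FDL model of order two and estimates it with `lm()`. Everything in it is an assumption for illustration: the data are simulated, the parameter values (1, 0.5, 0.3, 0.2) are made up, and the names `z0`, `z1`, `z2` are hypothetical. It uses base R's `embed()` to line up \(z_t,\) \(z_{t-1},\) and \(z_{t-2}.\)

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 200
z <- rnorm(n)                    # simulated regressor (hypothetical data)

# embed() puts z_t, z_{t-1}, z_{t-2} side by side and drops the first two periods
Z <- embed(z, 3)
colnames(Z) <- c("z0", "z1", "z2")

# Generate y from an FDL(2) with made-up parameters
y <- 1 + 0.5 * Z[, "z0"] + 0.3 * Z[, "z1"] + 0.2 * Z[, "z2"] +
  rnorm(nrow(Z), sd = 0.5)

fdl <- lm(y ~ z0 + z1 + z2, data = data.frame(y = y, Z))
coef(fdl)                        # estimates of alpha_0, delta_0, delta_1, delta_2
```

Two observations are lost to the lags, which mirrors the drop from 71 to 69 observations between models (7.1) and (7.2).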
Although \(\Delta pe\) and \(\Delta pe_{-1}\) have negative coefficients, both are small and they are jointly insignificant (\(p\text{-value}=.28,\) see the ANOVA test below).
# Compare the restricted model (second lag only) with the full model
fertility_lag2 <- lm(diff(gfr) ~ diff(pe_2), data = fertil3)
anova(fertility_lag2, fertility_lag)
Analysis of Variance Table
Model 1: diff(gfr) ~ diff(pe_2)
Model 2: diff(gfr) ~ diff(pe) + diff(pe_1) + diff(pe_2)
Res.Df RSS Df Sum of Sq F Pr(>F)
1 67 1006.6
2 65 968.2 2 38.413 1.2894 0.2824
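The \(F\) statistic in the table can be reproduced from the two residual sums of squares. This is a sketch using the rounded values printed above, so the result matches the table only up to rounding; the variable names are illustrative.

```r
# Values copied from the ANOVA table (rounded as printed)
rss_r  <- 1006.6   # restricted model: diff(pe_2) only
rss_ur <- 968.2    # unrestricted model: all three regressors
q      <- 2        # number of restrictions tested
df_ur  <- 65       # residual degrees of freedom, unrestricted model

# F = [(RSS_r - RSS_ur) / q] / [RSS_ur / df_ur]
F_stat <- ((rss_r - rss_ur) / q) / (rss_ur / df_ur)
p_val  <- pf(F_stat, q, df_ur, lower.tail = FALSE)

c(F_stat = F_stat, p_val = p_val)   # roughly 1.29 and 0.28
```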
The second lag (\(\Delta pe_{-2}\)) is highly significant (\(t \approx 4.1\)) and indicates a positive relationship between changes in \(pe\) and changes in \(gfr\) two years later. This makes more sense than a contemporaneous effect.
7.1.1 Example 11.8
In this example, we test whether the finite distributed lag model (7.2) relating \(\Delta gfr\) to \(\Delta pe\) is dynamically complete.
Dynamic completeness means that neither lags of \(\Delta gfr\) nor further lags of \(\Delta pe\) belong in the equation. Formally, start from the finite distributed lag model:
\[ \Delta gfr_t = \beta_0 + \beta_1\Delta pe_t + \beta_2\Delta pe_{t-1} + \beta_3 \Delta pe_{t-2} + u_t . \]
Rewrite it as
\[ \begin{aligned} \Delta gfr_t &= \beta_0 + \beta_1x_{t1} + \beta_2x_{t2} + \beta_3 x_{t3} + u_t \\ y_t &= \beta_0 + \bx_t'\bbeta + u_t \end{aligned} \]
where the explanatory variables are \(\bx_t=(x_{t1}, x_{t2}, x_{t3})' = (\Delta pe_t, \Delta pe_{t-1}, \Delta pe_{t-2})',\) the slope vector is \(\bbeta=(\beta_1, \beta_2, \beta_3)',\) and the dependent variable is \(y_t=\Delta gfr_t.\)
A dynamically complete model requires the following condition:
\[ \E(u_t\mid \bx_t, y_{t-1}, \bx_{t-1}, \ldots) = 0. \tag{7.3}\]
Written in terms of \(y_t,\)
\[ \E(y_t\mid \bx_t, y_{t-1}, \bx_{t-1}, \ldots) = \E(y_t\mid \bx_t). \tag{7.4}\]
We can test for dynamic completeness by adding \(\Delta gfr_{t-1}.\)
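library(dplyr)  # lag() below must be dplyr's lag(), which shifts the values by one period
library(broom)  # tidy()
# Add the lagged dependent variable to model (7.2)
fertility_lag_dep <- lm(diff(gfr) ~ lag(diff(gfr)) + diff(pe) + diff(pe_1) + diff(pe_2), data = fertil3)
tidy(fertility_lag_dep) %>%
  knitr::kable(digits = 3)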
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | -0.702 | 0.454 | -1.547 | 0.127 |
lag(diff(gfr)) | 0.300 | 0.106 | 2.835 | 0.006 |
diff(pe) | -0.045 | 0.026 | -1.773 | 0.081 |
diff(pe_1) | 0.002 | 0.027 | 0.077 | 0.939 |
diff(pe_2) | 0.105 | 0.026 | 4.108 | 0.000 |
The coefficient estimate on the lagged dependent variable \(\Delta gfr_{t-1}\) is .300 and its \(t\) statistic is 2.84, so the lag is statistically significant. Thus, model (7.2) is not dynamically complete in the sense of (7.4).
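One practical caveat about the regression above: the `lag()` call has to shift the values themselves, as `dplyr::lag()` does. Base R's `stats::lag()` only changes the time attribute of a series and leaves a plain vector's values in place, so it would not produce a lagged regressor inside `lm()`. The sketch below shows the same kind of check without dplyr, on simulated data; the helper `lag1()` and all parameter values are hypothetical.

```r
set.seed(2)                      # arbitrary seed
n <- 150
x <- rnorm(n)

# y depends on its own past as well as on x, so a static model omits dynamics
y <- numeric(n)
for (t in 2:n) y[t] <- 0.4 * y[t - 1] + 0.5 * x[t] + rnorm(1)

# Shift a vector back by one period, padding with NA (same idea as dplyr::lag)
lag1 <- function(v) c(NA, head(v, -1))

fit <- lm(y ~ lag1(y) + x)       # lm() drops the first row because of the NA
summary(fit)$coefficients["lag1(y)", ]
```

A significant coefficient on `lag1(y)` is the signal that the static specification is not dynamically complete.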
The fact that (7.2) is not dynamically complete suggests that there may be serial correlation in the errors. We will need to test and correct for this.
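As a rough illustration of what such a test can look like, the sketch below regresses OLS residuals on their own first lag and inspects the \(t\) statistic on the lag. This is only one common approach, shown here on simulated data with a made-up AR(1) coefficient of 0.5; it assumes strictly exogenous regressors and is not necessarily the procedure these notes go on to use.

```r
set.seed(3)                      # arbitrary seed
n <- 200
x <- rnorm(n)

# Build AR(1) errors with rho = 0.5 (made-up value)
e <- rnorm(n)
u <- numeric(n)
u[1] <- e[1]
for (t in 2:n) u[t] <- 0.5 * u[t - 1] + e[t]
y <- 1 + 2 * x + u

fit   <- lm(y ~ x)
u_hat <- resid(fit)

# Regress residuals on their first lag; the t statistic on the lag tests rho = 0
ar1     <- lm(u_hat[-1] ~ u_hat[-n])
rho_hat <- unname(coef(ar1)[2])
summary(ar1)$coefficients
```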
7.2 Example 11.7: Wages and Productivity
\[\log(hrwage_t) = \beta_0 + \beta_1\log(outphr_t) + \beta_2t + u_t\]
Data from the Economic Report of the President, 1989, Table B-47. The data are for the non-farm business sector.
data("earns")
# Model (1): levels with a linear time trend
wage_time <- lm(lhrwage ~ loutphr + t, data = earns)
# Model (2): first differences
wage_diff <- lm(diff(lhrwage) ~ diff(loutphr), data = earns)
Dependent variable: lhrwage in column (1), diff(lhrwage) in column (2)
 | (1) lhrwage | (2) diff(lhrwage) |
---|---|---|
loutphr | 1.63964*** (0.09335) | |
t | -0.01823*** (0.00175) | |
diff(loutphr) | | 0.80932*** (0.17345) |
Constant | -5.32845*** (0.37445) | -0.00366 (0.00422) |
Observations | 41 | 40 |
R2 | 0.97122 | 0.36424 |
Adjusted R2 | 0.96971 | 0.34750 |
Residual Std. Error (df = 38) | 0.02854 | 0.01695 |
F Statistic | 641.22430*** (df = 2; 38) | 21.77054*** (df = 1; 38) |
Note: *: p<0.1; **: p<0.05; ***: p<0.01. Standard errors in parentheses.
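The link between the two columns can be seen by differencing the levels equation. The short derivation below is a sketch that takes the trend model stated at the start of this example as given and subtracts the same equation dated \(t-1\):

\[ \begin{aligned} \log(hrwage_t) - \log(hrwage_{t-1}) &= \beta_1\left[\log(outphr_t) - \log(outphr_{t-1})\right] + \beta_2\left[t - (t-1)\right] + (u_t - u_{t-1}) \\ \Delta\log(hrwage_t) &= \beta_2 + \beta_1\,\Delta\log(outphr_t) + \Delta u_t . \end{aligned} \]

Under that model, the intercept \(\beta_0\) drops out and the trend coefficient \(\beta_2\) becomes the intercept of the differenced equation, so the constant in column (2) plays the role of the coefficient on \(t\) in column (1).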