5  AR – Example

This script provides an example of autocorrelated residuals using the expectations-augmented Phillips curve.

5.1 Dataset Description

US Macroeconomics Data Set, Quarterly, 1950I to 2000IV, 204 Quarterly Observations
Source: Department of Commerce, BEA website and www.economagic.com

Field Name  Definition
year        Year
qtr         Quarter
realgdp     Real GDP ($bil)
realcons    Real consumption expenditures
realinvs    Real investment by private sector
realgovt    Real government expenditures
realdpi     Real disposable personal income
cpi_u       Consumer price index
M1          Nominal money stock
tbilrate    Quarterly average of month-end 90-day T-bill rate
unemp       Unemployment rate
pop         Population (mil.), interpolated from year-end figures using a constant growth rate per quarter
infl        Rate of inflation (first observation is missing)
realint     Ex post real interest rate = tbilrate - infl (first observation is missing)
# data preview
library(dplyr)
library(kableExtra)  # for kable_styling() and scroll_box()
data <- read.table("https://raw.githubusercontent.com/my1396/course_dataset/refs/heads/main/TableF5-2.txt", header = TRUE)
data <- data %>% 
    mutate(delta_infl = infl - lag(infl))
data %>% 
    head() %>% 
    knitr::kable(digits = 5) %>%
    kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE, latex_options = "scale_down") %>% 
    scroll_box(width = "100%")
Year  qtr  realgdp  realcons  realinvs  realgovt  realdpi  cpi_u      M1  tbilrate  unemp      pop     infl   realint  delta_infl
1950    1   1610.5    1058.9     198.1     361.0   1186.1   70.6  110.20      1.12    6.4  149.461   0.0000    0.0000          NA
1950    2   1658.8    1075.9     220.4     366.4   1178.1   71.4  111.75      1.17    5.6  150.260   4.5071   -3.3404      4.5071
1950    3   1723.0    1131.0     239.7     359.6   1196.5   73.2  112.95      1.23    4.6  151.064   9.9590   -8.7290      5.4519
1950    4   1753.9    1097.6     271.8     382.5   1210.0   74.9  113.93      1.35    4.2  151.871   9.1834   -7.8301     -0.7756
1951    1   1773.5    1122.8     242.9     421.9   1207.9   77.3  115.08      1.40    3.5  152.393  12.6160  -11.2160      3.4326
1951    2   1803.7    1091.4     249.2     480.1   1225.8   77.6  116.19      1.53    3.1  152.917   1.5494   -0.0161    -11.0666

5.2 Empirical Model

\[ \Delta I_t = \beta_1 + \beta_2 u_t + \varepsilon_t \] where

  • \(I_t\) is the inflation rate; \(\Delta I_t = I_t - I_{t-1}\) is the first difference of the inflation rate;
  • \(u_t\) is the unemployment rate;
  • \(\varepsilon_t\) is the error term.

We remove the first two quarters because the inflation rate is missing for the first observation, so its first difference (delta_infl) is missing for the first two observations.

OLS regression results:

library(stargazer)
lm_phillips <- lm(delta_infl ~ unemp, data = data %>% tail(-2))
stargazer(lm_phillips, 
          type = "html", 
          title = "Phillips Curve Regression",
          notes = "<span>&#42;</span>: p<0.1; <span>&#42;&#42;</span>: <strong>p<0.05</strong>; <span>&#42;&#42;&#42;</span>: p<0.01 <br> Standard errors in parentheses.",
          notes.append = FALSE)
Phillips Curve Regression
------------------------------------------
                     Dependent variable:
                          delta_infl
------------------------------------------
unemp                       -0.090
                            (0.126)
Constant                     0.492
                            (0.740)
------------------------------------------
Observations                  202
R2                           0.003
Adjusted R2                 -0.002
Residual Std. Error    2.822 (df = 200)
F Statistic           0.513 (df = 1; 200)
------------------------------------------
Note: *: p<0.1; **: p<0.05; ***: p<0.01
      Standard errors in parentheses.
The conventional OLS covariance matrix of the coefficient estimates:

vcov(lm_phillips)
            (Intercept)       unemp
(Intercept)  0.54830829 -0.08973175
unemp       -0.08973175  0.01582211

HAC (heteroskedasticity and autocorrelation consistent) covariance matrices are available from the sandwich package via vcovHAC(); vcovHC() gives the heteroskedasticity-only (HC) counterpart for comparison.

library(lmtest)
library(sandwich)
vcovHAC(lm_phillips)
            (Intercept)        unemp
(Intercept)  0.23561076 -0.039847043
unemp       -0.03984704  0.006986272
vcovHC(lm_phillips)
            (Intercept)       unemp
(Intercept)   0.9319139 -0.16120691
unemp        -0.1612069  0.02912351
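These covariance matrices can be plugged into coeftest() to recompute the coefficient table with serial correlation-robust standard errors. A minimal sketch (the Newey-West lag of 4 is an illustrative choice for quarterly data, not one taken from the text):

# lmtest and sandwich are already loaded above
# t table with HAC standard errors in place of the conventional OLS ones
coeftest(lm_phillips, vcov = vcovHAC(lm_phillips))
# Newey-West estimator, a common member of the HAC family
coeftest(lm_phillips, vcov = NeweyWest(lm_phillips, lag = 4))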

Autocorrelated residuals

Plot the residuals.

plot(lm_phillips$residuals, type="l")
Figure 5.1: Phillips Curve Deviations from Expected Inflation

Figure 5.1 shows striking negative autocorrelation. The correlogram tells the same story (Figure 5.2). The blue dotted lines give the values beyond which the autocorrelations are (statistically) significantly different from zero.

acf(lm_phillips$residuals, type='correlation')
Figure 5.2: Correlogram of the residuals

We can extract the autocorrelation coefficients by setting plot = FALSE:

acf(lm_phillips$residuals, type='correlation', plot = FALSE)$acf %>%
    as.vector() %>% 
    head(10)
 [1]  1.000000000 -0.424730192 -0.112169741  0.073423178  0.147639239
 [6] -0.111740533 -0.036632121  0.009911709  0.036784781 -0.020769323
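The significance bounds in Figure 5.2 are drawn at approximately \(\pm 1.96/\sqrt{n}\) (more precisely, qnorm(0.975)/sqrt(n), the default 95% band used by acf()). A quick check:

n <- length(lm_phillips$residuals)  # 202 residuals enter the correlogram
qnorm(0.975) / sqrt(n)              # about 0.138; the first autocorrelation, -0.42, lies far outside this band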

Now we test for serial correlation in the residuals by regressing \(\varepsilon_t\) on \(\varepsilon_{t-1}\).

\[ \varepsilon_t = \phi\varepsilon_{t-1} + e_t \]

res <- tibble(
    res_t = lm_phillips$residuals,
    res_t1 = lag(lm_phillips$residuals))
lm_res <- lm(res_t ~ res_t1, data = res)
summary(lm_res)

Call:
lm(formula = res_t ~ res_t1, data = res)

Residuals:
    Min      1Q  Median      3Q     Max 
-9.8694 -1.4800  0.0718  1.4990  8.3258 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.02155    0.17854  -0.121    0.904    
res_t1      -0.42630    0.06355  -6.708    2e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.531 on 199 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared:  0.1844,    Adjusted R-squared:  0.1803 
F-statistic: 44.99 on 1 and 199 DF,  p-value: 2.002e-10

The regression of the least squares residuals on their past values gives a slope of -0.4263 with a highly significant \(t\) ratio of -6.7078. We thus conclude that the residuals in this model are highly negatively autocorrelated.

5.3 Tests for Serial Correlation

  • Durbin-Watson (DW) test for AR(1)
  • Breusch-Godfrey test for AR(q)
library(lmtest)
dwtest(lm_phillips, alternative = "two.sided") # Durbin Watson test 

    Durbin-Watson test

data:  lm_phillips
DW = 2.8276, p-value = 5.212e-09
alternative hypothesis: true autocorrelation is not 0
bgtest(lm_phillips, order=1) # Breusch-Godfrey test 

    Breusch-Godfrey test for serial correlation of order up to 1

data:  lm_phillips
LM test = 36.601, df = 1, p-value = 1.45e-09
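To see what the Breusch-Godfrey statistic computes, here is a minimal sketch of the order-1 auxiliary regression: regress the residuals on the original regressor and one lag of the residuals, then form \(LM = nR^2\). The missing first lag is set to zero, as bgtest() does by default (fill = 0), so the result should match the statistic above up to rounding.

aux_df <- data %>% 
    tail(-2) %>%
    mutate(e = lm_phillips$residuals,
           e_lag1 = lag(e, default = 0))       # first lag filled with 0, mirroring bgtest(fill = 0)
aux <- lm(e ~ unemp + e_lag1, data = aux_df)   # auxiliary regression
nrow(aux_df) * summary(aux)$r.squared          # LM statistic = n * R^2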

Both tests show strong evidence of AR(1) serial correlation in the errors. As a cross-check, the Durbin-Watson statistic is approximately \(2(1-\hat\rho)\); with \(\hat\rho \approx -0.43\) this gives about 2.85, in line with the reported value of 2.83.

5.4 Consequences of Serial Correlation

The presence of autocorrelation can lead to misleading results because it violates the assumptions of least squares.

The least squares estimator is still a linear unbiased estimator, but it is no longer best (minimum variance).

Another consequence of serially correlated errors is that the usual standard errors and \(t\) statistics are no longer valid. In the case of serial correlation, you can

  • Transform the model to remove the serial correlation, or alternatively,

    FGLS (Feasible Generalized Least Squares): transform the original equation using, e.g., the Cochrane-Orcutt or Prais-Winsten transformation (a hand-rolled sketch is given after this list).

    This approach assumes strictly exogenous regressors, that is, no lagged \(y\) on the RHS of the equation. See Chapter 12.3 in Wooldridge (2013), Introductory Econometrics: A Modern Approach.

  • Use serial correlation-robust standard errors

    HAC (heteroskedasticity and autocorrelation consistent) standard errors or Newey-West standard errors.

  • Use infinite distributed lag models

    Geometric (or Koyck) and rational distributed lag models.
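A minimal sketch of the first option, the Cochrane-Orcutt transformation done by hand. This is a single iteration using the \(\hat\rho\) estimated from the residual regression above; packages such as orcutt or prais automate and iterate the procedure.

rho <- coef(lm_res)["res_t1"]                  # first-order autocorrelation of the residuals, about -0.43
co_data <- data %>% 
    tail(-2) %>%
    mutate(y_star = delta_infl - rho * lag(delta_infl),   # quasi-differenced dependent variable
           x_star = unemp - rho * lag(unemp))              # quasi-differenced regressor
lm_co <- lm(y_star ~ x_star, data = co_data)   # first observation drops out through the NA in lag()
summary(lm_co)

The slope on x_star is an FGLS estimate of \(\beta_2\); the intercept estimates \(\beta_1(1 - \rho)\), so it has to be divided by \(1 - \hat\rho\) to recover \(\beta_1\).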


References

  • Greene, W. H., Econometric Analysis, 5th ed., Chapter 12 "Serial Correlation", Example 12.3, p. 251.