11.4 Forecast
Forecast: out-of-sample
Before we can forecast, we must populate the exogenous variables over the entire forecast horizon (that is, add data for them) and only then solve the model.
Solving the model means obtaining forecasts from it.
Procedure:
1. Estimate the model.
2. Store the estimation results using estimates store.
3. Create a forecast model using forecast create. This initializes a new model; we will call the model mymodel. The name you give the model mainly controls how output from forecast commands is labeled. More importantly, forecast create creates the internal data structures Stata uses to keep track of your model.
4. Add all equations to the model you just created using forecast estimates. The following command adds the stored estimation results in myarima to the current model mymodel.
forecast estimates myarima
5. Compute dynamic forecasts from 2012 to 2024 using forecast solve.
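As a rough sketch, the five steps can be strung together as follows. The data are simulated purely for illustration: the variable names y and x, the seed, and the AR(1) specification are assumptions, not part of the original example. Note that x is filled in through 2024, as the forecast horizon requires.

```stata
* Illustrative data (assumed): annual series, 1985-2024
clear
set seed 12345
set obs 40
generate year = 1984 + _n
tsset year
generate x = rnormal()                               // exogenous: known over the whole horizon
generate y = 1 + 0.5*x + rnormal() if year <= 2011   // endogenous: observed through 2011

* Steps 1-2: estimate the model and store the results
arima y x, ar(1)
estimates store myarima

* Step 3: create the forecast model
forecast create mymodel, replace

* Step 4: add the stored equation
forecast estimates myarima

* Step 5: dynamic forecasts for 2012-2024, saved in f_y by default
forecast solve, begin(2012) end(2024)
list year y f_y if year >= 2010
```

The replace option on forecast create is only there so the sketch can be rerun in a session that already has a model in memory.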
Create a new forecast model
The forecast create command creates a new forecast model in Stata.
You must create a model before you can add equations or solve it. You can have only one model in memory at a time.
You may optionally specify a name for your model. That name will appear in the output produced by the various forecast subcommands.
replace: clear the existing model from memory before creating name. By default, forecast create issues an error message if another model is already in memory.
Note that you can add multiple equations to a forecast model.
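A minimal sketch of the one-model-in-memory rule; the model names mymodel and anothermodel are placeholders chosen for illustration.

```stata
* Start a fresh model named mymodel; replace drops any model already in memory
forecast create mymodel, replace

* Without replace, a second create stops with an error while mymodel is in memory;
* capture suppresses the error and records the return code in _rc
capture forecast create anothermodel
display _rc

* With replace, the old model is cleared and the new one is created
forecast create anothermodel, replace
```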
Add equations/identities
Add estimation results to a forecast model currently in memory.
modelname is the name of a stored estimation result being added; it is generated by estimates store modelname.
Options:
predict(p_options): call predict using p_options.
names(newnamelist[, replace]): use newnamelist for the names of left-hand-side (LHS) variables in the estimation result being added, i.e., modelname. forecast estimates creates a new variable in the dataset for each element of newnamelist. You MUST use this option if any of the LHS variables contains time-series operators, e.g., D. or L. If a variable of the same name already exists in your dataset, forecast estimates exits with an error unless you specify the replace option, in which case existing variables are overwritten.
Typing forecast estimates myestimates adds the estimation results stored in myestimates to the forecast model currently in memory.
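The sketch below illustrates the names() requirement on simulated data. The variables y and x, the model name demo, and the new name dy are all assumptions made up for this example.

```stata
* Illustrative data (assumed): y is a cumulated series, so its first difference is the natural LHS
clear
set seed 1
set obs 50
generate t = _n
tsset t
generate x = rnormal()
generate y = sum(0.2 + 0.5*x + rnormal())

* The dependent variable D.y contains a time-series operator
regress D.y x
estimates store myestimates

forecast create demo, replace
* names() supplies a valid variable name for the LHS variable D.y
forecast estimates myestimates, names(dy)
describe dy
```

To forecast the level y as well, an identity linking y to dy would still have to be added to the model.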
Add an identity to a forecast model
An identity is a nonstochastic equation that expresses an endogenous variable in the model as a function of other variables in the model. Identities often describe the behavior of endogenous variables that are based on accounting identities or adding-up conditions.
// Add an identity to the forecast that states that y3 is the sum of y1 and y2
forecast identity y3=y1+y2
// create new variable newy before adding it to the forecast
forecast identity newy=y1+y2, generate
The difference between the two commands is that if the LHS variable does not yet exist in the dataset, you must specify the generate option.
Ex. We have a model using annual data and want to assume that our population variable pop grows at 0.75% per year. We can then declare pop as an endogenous variable with forecast identity, setting it equal to 1.0075 times its own lag.
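A minimal sketch of the population identity. The toy dataset, the starting value of 100, and the model name popdemo are assumptions for illustration; only the growth factor 1.0075 comes from the example.

```stata
* Illustrative annual data (assumed): pop is observed for the first five years only
clear
set obs 10
generate year = 2014 + _n
tsset year
generate pop = 100*1.0075^(_n - 1) if _n <= 5

forecast create popdemo, replace
* pop grows 0.75% per year: each value is 1.0075 times the previous year's value
forecast identity pop = 1.0075*L.pop
forecast list
```

Because pop already exists in the dataset, the generate option is not needed here.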
Typically, you use forecast identity to define the relationship that determines an endogenous variable that is already in your dataset.
The generate option of forecast identity is useful when you wish to use a transformation of one or more endogenous variables as a right-hand-side variable in a stochastic equation that describes another endogenous variable.
Add equations that you obtained elsewhere to your model
Up until now, we have been adding equations to a forecast model from Stata estimation output, i.e., using forecast estimates.
You use forecast coefvector to add endogenous variables to your model that are defined by linear equations.
Common use scenarios of forecast coefvector:
Sometimes, you might see the estimated coefficients for an equation in an article and want to add that equation to your model. In this case, forecast coefvector allows you to add equations that are stored as coefficient vectors to a forecast model.
User-written estimators that do not implement a predict command can also be included in forecast models via forecast coefvector.
forecast coefvector can also be useful in situations where you want to simulate time-series data.
cname is a Stata matrix with one row. It defines the linear equations, which are stored in a coefficient (parameter) vector.
Options:
variance(vname): specify the variance matrix of the estimated parameters. This option only has an effect if you specify the simulate() option when calling forecast solve and request sim_technique betas or residuals.
errorvariance(ename): specify an additive error term with variance matrix ename, where ename is the name of a Stata matrix. The number of rows and columns in ename must match the number of equations represented by coefficient vector cname. This option only has an effect if you specify the simulate() option when calling forecast solve and request sim_technique errors or residuals.
names(namelist[, replace]): instructs forecast coefvector to use namelist as the names of the left-hand-side variables in the coefficient vector being added. By default, forecast coefvector uses the equation names on the column stripe of cname. You must use this option if any of the equation names stored with cname contains time-series operators.
You use forecast coefvector to add endogenous variables to your model that are defined by linear equations, where the linear equations are stored in a coefficient (parameter) vector.
// Incorporate coefficient vector of the endogenous equation of y to be used by forecast solve
forecast coefvector y
Ex. We want to add the following equations to a forecast model. \[ \begin{split} x_t &= 0.2 + 0.3 x_{t-1} - 0.8 z_t \\ z_t &= 0.1 + 0.7 z_{t-1} + 0.3 x_t - 0.2 x_{t-1} \end{split} \]
We first define the coefficient vector eqvector.
// define a row vector
matrix eqvector = (0.2, 0.3, -0.8, 0.1, 0.7, 0.3, -0.2)
// add equation names and variable names
// equation names are before the colon
// variable names are after the colon
// the second endogenous variable is named z, matching the equations for x and z
matrix coleq eqvector = x:_cons x:L.x x:z z:_cons z:L.z z:x z:L.x
matrix list eqvector
We could then add the coefficient vector to a forecast model.
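A minimal sketch of that step. It assumes eqvector and its column stripe have been defined as above, and that the dataset in memory is tsset and contains the variables named in the stripe; the model name coefmodel is a placeholder.

```stata
* Assumes eqvector is already defined and labeled as above
forecast create coefmodel, replace
* Add both linear equations stored in eqvector to the model
forecast coefvector eqvector
```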
forecast adjust adjusts a variable by add factoring, replacing, etc.
varname is the name of the endogenous variable that has been previously added to the model using forecast estimates or forecast coefvector.
forecast adjust specifies an adjustment to be applied to an endogenous variable in the model. Adjustments are typically used to produce alternative forecast scenarios or to incorporate outside information into a model.
// Adjust the endogenous variable y in forecast to account for the variable shock in 1990
forecast adjust y = y + shock if year==1990
Solve the forecast
forecast solve computes static or dynamic forecasts based on the model currently in memory. Before you can solve a model, you must first create a new model using forecast create and add equations and variables to it using forecast estimates, forecast coefvector, or forecast identity.
Options:
prefix(string) and suffix(string): specify a prefix or suffix for the names of the forecast variables. You may specify prefix() or suffix() but NOT both. By default, forecast values are prefixed by f_.
begin(time_constant) and end(time_constant): specify the period in which to begin or end forecasting.
periods(#): specify the number of periods to forecast.
static: produce static forecasts instead of dynamic forecasts. Actual values of variables are used wherever lagged values of the endogenous variables appear in the model. Static forecasts are also called one-step-ahead forecasts. By default, dynamic forecasts are produced, which use the forecast values of variables wherever lagged values of the endogenous variables appear in the model.
actuals: use actual values, if available, instead of forecasts. actuals specifies how nonmissing values of endogenous variables in the forecast horizon are treated. By default, nonmissing values are ignored, and forecasts are produced for all endogenous variables. When you specify actuals, forecast sets the forecast values equal to the actual values if they are nonmissing. The forecasts for the other endogenous variables are then conditional on the known values of the endogenous variables with nonmissing data.
log(log_level): log_level takes on one of the following values:
on: the default; provides an iteration log showing the current panel and period for which the model is being solved as well as a sequence of dots for each period indicating the number of iterations.
off: suppresses the iteration log.
detail: produces a detailed iteration log including the current values of the convergence criteria for each period in each panel (in the case of panel data) for which the model is being solved.
brief: produces an iteration log showing the current panel being solved but does not show which period within the current panel is being solved.
simulate(sim_technique, sim_statistic sim_options): allows you to simulate your model to obtain measures of uncertainty surrounding the point forecasts produced by the model. Simulating a model involves repeatedly solving the model, each time accounting for the uncertainty associated with the error terms and the estimated coefficient vectors.
sim_technique can be betas, errors, or residuals.
betas: draw multivariate-normal parameter vectors.
errors: draw additive errors from a multivariate normal distribution.
residuals: draw additive residuals based on static forecast errors.
sim_statistic specifies a summary statistic to summarize the forecasts over all the simulations. statistic can be mean, variance, or stddev. You may specify either the prefix or the suffix that will be used to name the variables that will contain the requested statistic.
sim_options include:
saving(filename, …): save results to file.
nodots: suppress replication dots. By default, one dot character is displayed for each successful replication. If during a replication convergence is not achieved, forecast solve exits with an error message.
reps(#): request that forecast solve perform # replications; the default is reps(50).
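A sketch of simulate() on simulated data. The variables y and x, the stored-result name yeq, the model name simdemo, and the sd_ prefix are assumptions for illustration. Here sim_statistic is written as statistic(stddev, prefix(sd_)), which asks for the standard deviation of the forecasts across replications.

```stata
* Illustrative data (assumed): y observed for t <= 50, x known through t = 60
clear
set seed 2024
set obs 60
generate t = _n
tsset t
generate x = rnormal()
generate y = 1 + 0.5*x + rnormal() if t <= 50

regress y L.y x
estimates store yeq

forecast create simdemo, replace
forecast estimates yeq

* Point forecasts go to f_y; standard deviations over 100 coefficient draws go to sd_y
forecast solve, begin(51) simulate(betas, statistic(stddev, prefix(sd_)) reps(100))
list t f_y sd_y if t >= 51
```

With sim_technique betas, only coefficient uncertainty is reflected in sd_y; errors or residuals would be needed to account for the additive error term.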
Use example: forecast a panel
\[ \%\Delta \text{dim}_{it} = \beta_0 + \beta_1 \ln(\text{starts}_{it}) + \beta_2 \text{rgspgrowth}_{it} + \beta_3 \text{unrate}_{it} + u_{i} + \varepsilon_{it} \]
\(u_{i}\) refers to individual fixed effects.
When we make forecasts for any individual panel, we may want to include its fixed effect \(u_{i}\) in our forecasts. This can be achieved by using forecast adjust.
use https://www.stata-press.com/data/r19/statehardware, clear
generate lndim = ln(dim)
generate lnstarts = ln(starts)
quietly xtreg D.lndim lnstarts rgspgrowth unrate if qdate <= tq(2009q4), fe
predict dlndim_u, u /* obtain individual fixed effects */
estimates store dim /* store estimation results */
With enough observations, we can have more confidence in the estimated panel-specific errors. If we are willing to assume that we have decent estimates of the panel-specific errors and that those panel-level effects will remain constant over the forecast horizon, then we can incorporate them into our forecasts.
Because predict only provided us with estimates of the panel-level effects for the estimation sample, we need to extend them into the forecast horizon.
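An easy way to do that is to use egen to create a new variable (dlndim_u2, used in the commands that follow) that carries each panel's estimated effect into every period, including the forecast horizon.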
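One way this step might look, assuming the statehardware commands above have been run (so dlndim_u exists) and that state is the panel identifier in this dataset:

```stata
* dlndim_u is constant within a panel over the estimation sample and missing elsewhere;
* its by-panel mean copies that constant to every observation of the panel
egen dlndim_u2 = mean(dlndim_u), by(state)
```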
We can use forecast adjust to incorporate these terms into our forecasts.
The following commands define our forecast model, including the estimated panel-specific terms:
/* create forecast model */
forecast create statemodel, replace
/* add the stored equation; the dependent variable D.lndim contains a
   time-series operator, so names() must supply a valid name (dlndim),
   otherwise forecast estimates returns an error */
forecast estimates dim, names(dlndim)
/* add state fixed effects */
forecast adjust dlndim = dlndim + dlndim_u2
Because our dependent variable contains a time-series operator, we must use the names(dlndim) option of forecast estimates to specify a valid name for the endogenous variable being added.
dlndim stands for the first difference of the logarithm of dim. We are interested in the level of dim, so we need to back out dim from dlndim.
→ We use forecast identity to obtain the actual dim variable.
// reverse first difference, note that you refer to the endog var using the new name, dlndim, now
forecast identity lndim = L.lndim + dlndim
// reverse natural logarithm
forecast identity dim = exp(lndim)
We used forecast adjust to perform our adjustment to dlndim immediately after we added those estimation results so that we would not forget to do so.
However, we could specify the adjustment at any time.
Regardless of when you specify an adjustment, forecast solve performs those adjustments immediately after the variable being adjusted is computed.
Finally, we can solve the model. Here we obtain dynamic forecasts beginning in the first quarter of 2010.
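A sketch of the solve step, assuming statemodel has been built as above (stored equation, adjustment, and both identities):

```stata
* Dynamic forecasts from 2010q1 to the end of the available exogenous data
forecast solve, begin(tq(2010q1))
```

With the default prefix, the forecasts are stored in new variables beginning with f_, so the forecast level of dim should appear in f_dim.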