11.2 Data Manipulation
11.2.1 Import and Export
Shipped datasets
Stata contains some demonstration datasets in the system directories.
sysuse dir
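: lists the names of the shipped datasets.
sysuse lifeexp
: loads the shipped dataset lifeexp into memory.
Note that use lifeexp
returns an error (file not found) unless lifeexp.dta is in the current working directory, because use does not search the system directories.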
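The difference between the two commands can be sketched as follows. This assumes a standard installation in which lifeexp is among the shipped datasets; capture and _rc are used here only to show the return code without stopping a do-file.

```stata
* List the datasets shipped with Stata
sysuse dir

* Load a shipped dataset; clear discards any data already in memory
sysuse lifeexp, clear
describe

* use looks for lifeexp.dta in the current directory,
* not in the system directories
capture use lifeexp, clear
display _rc   // nonzero if the file was not found
```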
User datasets
.dta
use myauto [, clear]
: Loads myauto.dta
(Stata format) into memory.
clear
: specifies that it is okay to replace the data in memory, even though the current data have not been saved to disk.
save myauto [, replace]
: Saves the data in memory as the Stata-format file myauto.dta
replace
: allows Stata to overwrite an existing dataset, for example the output from a previous run of the do-file.
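A minimal round trip, using the shipped auto dataset as a stand-in for your own data (the file name myauto follows the examples above):

```stata
* Start from a shipped dataset so the example is self-contained
sysuse auto, clear

* Write myauto.dta to the current directory, overwriting any earlier copy
save myauto, replace

* Read it back; clear is needed if unsaved data are in memory
use myauto, clear
describe, short
```

Including replace in a do-file means the file can be rerun without stopping at the save step.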
.csv
import delimited myauto.csv
: Imports myauto.csv
into Stata’s memory.
export delimited myauto.csv
: Exports the data in memory to myauto.csv
import delimited filename
reads text (ASCII) files in which there is one observation
per line and the values are separated by commas, tabs, or some other delimiter.
By default, Stata checks whether the file is delimited by tabs or commas based on the first line of data.
export delimited filename
writes the data in memory to a file in comma-separated (.csv) format by default. You can specify a different delimiter with the delimiter() option.
If filename
is specified without an extension, .csv
is assumed. If filename contains embedded spaces, enclose it in double quotes.
Options
delimiters("chars"[, collapse | asstring])
: "chars" specifies the delimiter.
";" uses a semicolon as the delimiter; "\t" uses a tab; "whitespace" uses whitespace.
collapse
: treats multiple consecutive delimiters as just one delimiter.
asstring
: treats chars as one delimiter. By default, each character in chars is treated as an individual delimiter.
clear
: replaces the data in memory.
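The following sketch exports and re-imports a delimited file, first with the default comma and then with a semicolon. The shipped auto dataset and the file names myauto.csv and myauto_semi.csv are illustrative choices, not part of the text above.

```stata
sysuse auto, clear

* Comma-separated export and re-import (the defaults)
export delimited using "myauto.csv", replace
import delimited "myauto.csv", clear

* Semicolon-separated export; note export uses delimiter(),
* while import uses delimiters()
export delimited using "myauto_semi.csv", delimiter(";") replace
import delimited "myauto_semi.csv", delimiters(";") clear
describe, short
```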
11.2.2 Save Estimation Results
estimates store model_name
stores the current (active) estimation results under the name model_name.
estimates table
organizes estimation results from one or more models in a single formatted table.
If you type estimates table without arguments, a table of the most recent estimation results will be shown.
// Display a table of coefficients for stored estimates m1 and m2
estimates table m1 m2
// with SE
estimates table m1 m2, se
// with sample size, adjusted R-squared, and stars
estimates table m1 m2, stats(N r2_a) star
estimates save filename
saves the current (active) estimation results to filename.ster.
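The commands above can be combined as in the sketch below. The models (price regressed on mpg and weight in the shipped auto dataset) and the file name m2results are assumptions made for illustration.

```stata
sysuse auto, clear

* Fit two models and keep each set of results in memory
quietly regress price mpg
estimates store m1
quietly regress price mpg weight
estimates store m2

* Compare them side by side with standard errors and fit statistics
estimates table m1 m2, se stats(N r2_a)

* Save the active results (m2) to m2results.ster, then reload them
estimates save m2results, replace
estimates use m2results
regress   // replays the reloaded results
```

Note that star cannot be combined with se in estimates table, which is why the two appear in separate calls in the examples.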
etable
etable
allows you to easily create a table of estimation results and export it to a variety of file types, e.g., docx, html, pdf, xlsx, tex, txt, markdown, md.
// Example use of etable
. clear all
. webuse nhanes2l
(Second National Health and Nutrition Examination Survey)
. quietly regress bpsystol age weight i.region
. estimates store model1
. quietly regress bpsystol i.sex weight i.agegrp
. estimates store model2
. quietly regress bpsystol age weight i.agegrp
. estimates store model3
. etable, estimates(model1 model2 model3) showstars showstarsnote title("Table 1. Models for systolic blood pressure") export(mydoc.docx, replace)
Options:
export(filename[, replace])
writes the table to a file; the output format is determined by the file extension, e.g., .docx in the example above.
Alternative to etable
: eststo
, from the community-contributed estout package.
11.2.3 Stored Results
Stata commands that report results also store the results where they can be subsequently used by other commands or programs. This is documented in the Stored results section of the particular command in the reference manuals.
e-class commands, such as regress, store their results in
e()
; e-class commands are Stata’s model estimation commands.
r-class commands, such as summarize, store their results in
r()
; most commands are r-class.
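As a brief illustration, the sketch below lists and reuses stored results after one r-class and one e-class command; the shipped auto dataset and the chosen variables are assumptions for the example.

```stata
sysuse auto, clear

* r-class: summarize leaves its results in r()
summarize price
return list
display "Mean price = " r(mean)

* e-class: regress leaves its results in e()
regress price mpg
ereturn list
display "N = " e(N) ", R-squared = " e(r2)
```

Stored results are overwritten by the next command of the same class, so copy any value you need into a scalar, local macro, or matrix before running another command.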
Most estimation commands leave behind
e(b)
, the coefficient vector, and
e(V)
, the variance–covariance matrix of the estimates (VCE).
// display coef vector
matrix list e(b)
// copy it to a matrix named myb
matrix myb = e(b)
matrix list myb
You can refer to e(b)
and e(V)
in any matrix expression:
invsym(e(V))
returns the inverse of e(V).
Generally, invsym
requires a square, symmetric, and positive-definite matrix.
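A minimal sketch of working with these matrices, assuming a regression of price on mpg and weight in the shipped auto dataset (the matrix names b, V, Vinv, and Id are arbitrary):

```stata
sysuse auto, clear
quietly regress price mpg weight

* Copy the stored matrices so they survive later estimation commands
matrix b = e(b)
matrix V = e(V)
matrix list b
matrix list V

* Invert the VCE and check that V * Vinv is close to the identity
matrix Vinv = invsym(V)
matrix Id = V * Vinv
matrix list Id

* The standard error of the first coefficient (mpg) is the square
* root of the first diagonal element of V
display sqrt(V[1,1])
display _se[mpg]
```

The last two lines should display the same number, since _se[mpg] is computed from the same diagonal element of e(V).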