11.2 Data Manipulation

11.2.1 Import and Export

Shipped datasets

Stata contains some demonstration datasets in the system directories.

sysuse dir: list the names of shipped datasets.

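sysuse lifeexp: load the shipped dataset lifeexp.dta into memory.

Note that use lifeexp returns an error (file lifeexp.dta not found) unless the file happens to be in the current working directory, because use does not search the system directories.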
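The difference between use and sysuse can be seen in a short sketch. It assumes that no file named lifeexp.dta exists in the current working directory; capture noisily is used only so that the expected error does not stop a do-file.

```stata
// List the datasets shipped with Stata
sysuse dir

// use looks in the current working directory, so this is expected to fail
capture noisily use lifeexp, clear
display "return code from use: " _rc

// sysuse searches the system directories and finds the file
sysuse lifeexp, clear
describe, short
```

A nonzero return code from the use line indicates that the file was not found; the sysuse line then loads the data as usual.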

User datasets

.dta

use myauto [, clear]: Load myauto.dta (Stata-format) into memory.

  • clear specifies that it is okay to replace the data in memory, even though the current data have not been saved to disk.

save myauto [, replace]: Save the data in memory to the Stata-format file myauto.dta.

  • replace allows Stata to overwrite an existing file, for example the output of a previous run of the do-file.
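A minimal sketch of a save and reload cycle follows. The shipped auto dataset stands in for your own data, and the variable name price_k is made up for illustration.

```stata
// Start from the shipped auto data (a stand-in for your own data)
sysuse auto, clear

// Save a copy in the current working directory;
// replace lets the do-file be rerun without an "already exists" error
save myauto, replace

// Change the data in memory without saving
generate price_k = price / 1000

// Without clear, use would refuse to discard the unsaved change
use myauto, clear
```

After the last line, memory holds the saved copy again, so price_k no longer exists.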

.csv

import delimited myauto.csv: Import myauto.csv into Stata’s memory.

export delimited myauto.csv: Export the data in memory to myauto.csv.

import delimited filename reads text (ASCII) files in which there is one observation per line and the values are separated by commas, tabs, or some other delimiter.

By default, Stata checks whether the file is delimited by tabs or commas based on the first line of data.

export delimited filename writes the data in memory to a file in comma-separated (.csv) format by default. You can choose a different separator with the delimiter() option.

If filename is specified without an extension, .csv is assumed. If filename contains embedded spaces, enclose it in double quotes.
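The following sketch exports the shipped auto dataset to two text files and reads them back. The file names are hypothetical; the second one contains spaces on purpose to show the quoting rule.

```stata
sysuse auto, clear

// No extension given, so the file is written as myauto.csv
export delimited using myauto, replace

// Semicolon-separated file; the name has spaces, so it is quoted
export delimited using "my auto semicolon.csv", delimiter(";") replace

// Read each file back, replacing the data in memory
import delimited using myauto, clear
import delimited using "my auto semicolon.csv", delimiters(";") clear
describe, short
```

Note that the option is spelled delimiter() for export delimited and delimiters() for import delimited.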

import delimited [using] filename [, import_delimited_options]

Options

  • delimiters("chars"[, collapse | asstring] ):

    • "chars" specifies the delimiter

      ";" uses a semicolon as the delimiter, "\t" uses a tab, and "whitespace" uses whitespace.

    • collapse treats multiple consecutive delimiters as a single delimiter.

    • asstring treats chars as one multi-character delimiter. By default, each character in chars is treated as an individual delimiter.

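    // Example: space-delimited file, consecutive spaces collapsed;
    // read the first three columns, starting from row 8
    import delimited auto, delim(" ", collapse) colrange(:3) rowrange(8)

  • clear replaces the data in memory.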


11.2.2 Save Estimation Results

estimates store model_name stores the current (active) estimation results under the name model_name.

// Store estimation results as m1 for use later in the same session
estimates store m1

estimates table organizes estimation results from one or more models in a single formatted table.

If you type estimates table without arguments, a table of the most recent estimation results will be shown.

// Display a table of coefficients for stored estimates m1 and m2
estimates table m1 m2
// with SE
estimates table m1 m2, se

// with sample size, adjusted R-squared, and stars
estimates table m1 m2, stats(N r2_a) star

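estimates save filename saves the current (active) estimation results to filename.ster. Unlike estimates store, which keeps results in memory for the current session only, estimates save writes them to disk.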
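As a sketch of the round trip, the results can be written to disk and read back with estimates use. The file name mymodel and the regression itself are chosen only for illustration.

```stata
sysuse auto, clear
quietly regress price mpg weight

// Write the active results to mymodel.ster (hypothetical file name)
estimates save mymodel, replace

// Later, possibly in a new session: reload the results and redisplay them
estimates use mymodel
estimates replay
```

Results reloaded this way become the active estimation results, so they can be stored under a name with estimates store and included in estimates table.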

etable

etable allows you to easily create a table of estimation results and export it to a variety of file types, e.g., docx, html, pdf, xlsx, tex, txt, markdown, md.

// use example of etable
. clear all
. webuse nhanes2l
(Second National Health and Nutrition Examination Survey)
. quietly regress bpsystol age weight i.region
. estimates store model1

. quietly regress bpsystol i.sex weight i.agegrp
. estimates store model2

. quietly regress bpsystol age weight i.agegrp
. estimates store model3

. etable, estimates(model1 model2 model3) showstars showstarsnote title("Table 1. Models for systolic blood pressure") export(mydoc.docx, replace)

Options:

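  • export(filename[, replace]) writes the table to filename; the output format is determined by the file extension.

Alternative to etable: eststo, from the community-contributed estout package.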

11.2.3 Stored Results

Stata commands that report results also store the results where they can be subsequently used by other commands or programs. This is documented in the Stored results section of the particular command in the reference manuals.

  • e-class commands, such as regress, store their results in e(); e-class commands are Stata’s model estimation commands.

  • r-class commands, such as summarize, store their results in r(); most commands are r-class.

// for r-class command
return list
// for e-class command
ereturn list
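A short sketch of how stored results are typically used follows. It relies on the shipped auto dataset, and the variable name price_c is made up for illustration.

```stata
sysuse auto, clear

// summarize is r-class: its results are stored in r()
summarize price
return list
display "mean price = " r(mean)

// Use a stored result in a later command: center price at its mean
generate price_c = price - r(mean)

// regress is e-class: its results are stored in e()
regress price mpg
ereturn list
display "N = " e(N) ", R-squared = " e(r2)
```

Because r() is overwritten by the next r-class command, it is usually safest to copy a value you need later into a scalar or local macro right away.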

Most estimation commands leave behind

  • e(b) the coefficient vector, and
  • e(V) the variance–covariance matrix of the estimates (VCE)

// display the coefficient vector
matrix list e(b)
// copy it to a named matrix
matrix myb = e(b)
matrix list myb

You can refer to e(b) and e(V) in any matrix expression:

matrix c = e(b)*invsym(e(V))*e(b)'
matrix list c

invsym(e(V)) returns the inverse of e(V). In general, invsym requires a square, symmetric, positive-definite matrix.
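The quadratic form above is the Wald statistic for the joint hypothesis that all coefficients, including the constant, are zero. The sketch below computes it by hand and cross-checks it against test; the regression on the auto data and the names b, V, W, and wald are chosen only for illustration.

```stata
sysuse auto, clear
quietly regress price mpg weight

// Copy the stored matrices and form b * V^(-1) * b'
matrix b = e(b)
matrix V = e(V)
matrix W = b*invsym(V)*b'

// W is 1 x 1; extract its single element as a scalar
scalar wald = W[1,1]
display "Wald statistic = " wald

// Cross-check: after regress, test reports F = Wald / (number of restrictions)
quietly test (_b[mpg] = 0) (_b[weight] = 0) (_b[_cons] = 0)
display "q * F          = " r(df)*r(F)
```

The two displayed numbers should agree up to rounding, since test divides the same quadratic form by the number of restrictions.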