Stata: Getting Started and Being Productive with VA Data

1
Stata: Getting Started and Being Productive with VA Data
  • Give me six hours to chop down a tree and I will
    spend the first four sharpening the axe.
  • --Abraham Lincoln
  • Todd Wagner
  • June 2007

2
Outline
  • Getting data into Stata
  • Editing in Stata
  • How does Stata handle data
  • Stata notation and help
  • Using Stata and Basic Stata commands

3
Transferring Data
  • Stat/Transfer or DBMS/Copy will work
  • Stat/Transfer often seeks to optimize the Stata
    dataset by default
  • If transferring data with SCRSSN, FORCE
    Stat/Transfer to transfer SCRSSN as double
    precision
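
Why double precision matters for an identifier can be seen in a small sketch. The value 123456789 below is a made-up nine-digit number used only for illustration, not a real SCRSSN, and the variable names are hypothetical. A float holds only about seven significant digits, so a nine-digit ID stored as float is silently rounded.

```stata
clear
set obs 1
* 123456789 is a made-up nine-digit identifier, not a real SCRSSN
gen double id_double = 123456789
gen float  id_float  = 123456789
* show both values without scientific notation
format id_double id_float %12.0f
list id_double id_float
```

The double keeps the value exactly, while the float no longer equals the number that was typed in. Two different patients could end up with the same rounded ID.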

4
Stat/Transfer

5
Editing in Stata
  • Any ASCII text editor will work
  • Stata has a built-in text editor, but it is
    limited.
  • I recommend using another text editor
  • http://fmwww.bc.edu/repec/bocode/t/textEditors.html

6
Handling Data
  • SAS processes one record at a time
  • Stata processes all the records at the same time
  • Loops are commonly used in SAS
  • Loops are very rarely used in Stata
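
A minimal sketch of what "all the records at the same time" looks like in practice. The variable names x and x2 are made up for illustration. A single generate statement fills the new variable for every observation, so no explicit loop is written.

```stata
clear
set obs 5
* x holds the observation number: 1, 2, ..., 5
gen x = _n
* one statement computes x2 for every observation
gen x2 = x^2
list
```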

7
Loading Data into Memory
  • Stata reads the data into memory
  • set mem 100m (before you load the data)
  • You must have enough memory for your dataset
  • With large datasets:
  • drop unnecessary variables
  • use the compress command (but don't compress
    SCRSSN)

8
Stata Abbreviations
  • Many Stata commands can be abbreviated, often to
    their first three letters
  • regress income education female
  • could be written
  • reg income education female
  • Variables can also be abbreviated if the
    abbreviation is unique
  • reg inc educ fem
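
The abbreviation rules can be tried on the auto dataset that ships with Stata. That dataset and its variables (price, mpg, weight) are stand-ins chosen for this sketch and are not from the slides. It assumes variable abbreviation is switched on, which is the default.

```stata
sysuse auto, clear
* full command and full variable names
regress price mpg weight
scalar b_full = _b[mpg]
* abbreviated command; wei is unique, so it resolves to weight
reg price mpg wei
display _b[mpg] - b_full
```

Both commands fit the same model, so the displayed difference should be zero.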

9
Stata Help
  • Stata's built-in help is great
  • help <command>
  • Stata manuals are great because they review
    theory

10
Stata and the Web
  • Stata is web aware
  • Check for updates periodically
  • update all
  • You can search for user-written programs
  • findit output
  • findit outreg (click to install)

11
Stata in Windows
  • Page up scrolls through the previous commands
  • There is a graphical user interface (menus) if
    you forget a command
  • We have Stata on rocky and tasha: no graphical
    capabilities, no menus, and loss of some shortcuts

12
Using Stata
  • Create batch files called .do files
  • I work interactively
  • Run Stata and create do file as I go
  • I can then use the do file as needed
  • Debugging code and exploratory data analysis are
    very fast in Stata

13
Sysdir, ls and cd
  • Stata recognizes some Unix commands, such as ls
    and cd
  • sysdir provides a listing of Stata's system
    directories
  • sysdir
  • STATA: C:\Program Files\Stata9\
  • UPDATES: C:\Program Files\Stata9\ado\updates\
  • BASE: C:\Program Files\Stata9\ado\base\
  • SITE: C:\Program Files\Stata9\ado\site\
  • PLUS: c:\ado\stbplus\
  • PERSONAL: c:\ado\personal\
  • OLDPLACE: c:\ado\

14
Delimiters
  • SAS recognizes ; as a delimiter
  • Stata recognizes the carriage return
  • Always add a carriage return after your last
    command
  • You can change the delimiter to ;
  • #delimit ;
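
A short sketch of switching delimiters inside a do-file. The variable names a, b and c are made up. Note that #delimit works in do-files only, not at the interactive prompt.

```stata
clear
set obs 3
#delimit ;
gen a = 1 ;
gen b = a
    + 1 ;
#delimit cr
* back to one command per line
gen c = a + b
```

With ; as the delimiter a command can span several lines, which helps with long variable lists. #delimit cr restores the default.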

15
Missing Data
  • Stata and SAS both use . as missing
  • Stata implicitly values a missing as a very large
    number
  • SAS implicitly values a missing as a very small
    number
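
Because missing counts as larger than any number, a "greater than" condition quietly picks up missing values. A minimal sketch with made-up ages:

```stata
clear
input age
45
72
.
end
* counts 2: the 72-year-old and the missing value
count if age > 65
* counts 1: missing is excluded explicitly
count if age > 65 & !missing(age)
```

Adding !missing() to the condition is one common way to guard against this.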

16
Generating and Recoding Variables
  • In SAS you type
  • quality=0;
  • if VA=1 then quality=1;
  • In Stata you type
  • gen quality=0
  • recode quality 0=1 if VA==1 or
  • replace quality=1 if VA==1
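
The generate-then-replace pattern can be run on a few made-up rows. The one-line alternative at the end is an addition for illustration, not from the slides. It relies on Stata evaluating a logical expression to 0 or 1.

```stata
clear
input VA
1
0
1
end
gen quality = 0
replace quality = 1 if VA == 1
* one-line alternative: the comparison itself yields 0 or 1
gen quality2 = (VA == 1)
list
```

Note that the one-liner yields 0, not missing, when VA is missing.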

17
Boolean Logic
  • Stata is picky about Boolean logic
  • gen y=x if a==b (must use two =)
  • gen y=x if a>b & b>10 (must use &)
  • gen y=x if a<=b (< or > must be before =)
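
A small sketch of the three operators on made-up values. The variable names eq, both and le are hypothetical.

```stata
clear
input a b
1 1
12 11
3 20
end
* == tests equality; a single = is assignment
gen byte eq   = (a == b)
* & joins conditions; the word "and" is not accepted
gen byte both = (a > b & b > 10)
* <= is written with < first; =< is a syntax error
gen byte le   = (a <= b)
list
```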

18
Creating Dummy Variables
  • Goal: create a dummy variable for each DRG
  • gen drgnum1=drg==1 or
  • tab drg, gen(drgnum)
  • This second command automatically creates dummy
    variables
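
A sketch of the tab, gen() approach on made-up DRG values. One thing to watch: the new variables are numbered by the order of the values, not by the DRG code itself.

```stata
clear
input drg
1
2
2
5
end
* creates drgnum1, drgnum2, drgnum3
tab drg, gen(drgnum)
* drgnum3 flags drg==5, the third distinct value
list
```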

19
Drop
  • drop <varnames> (drops variables)
  • drop if X==1 (drops cases where the value is 1)

20
egen Commands
  • You want to generate total costs for a medical
    center
  • In SAS this is done by proc summary
  • In Stata, you can type
  • collapse (sum) cost, by(sta3n) or
  • sort sta3n
  • by sta3n: egen sumcost=total(cost)
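
The two approaches differ in what they leave in memory. egen adds the station total to every record, while collapse replaces the dataset with one row per station. A sketch with made-up station numbers and costs:

```stata
clear
input sta3n cost
402 100
402 250
518 75
end
sort sta3n
* egen keeps all three records and attaches the station total to each
by sta3n: egen sumcost = total(cost)
list
* collapse replaces the data with one record per station
collapse (sum) cost, by(sta3n)
list
```

collapse discards the record-level data, so save the dataset first if it is still needed.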

21
ICD-9 Codes
  • Stata has capabilities to handle ICD-9 diagnosis
    and procedure codes
  • You can
  • check to see if codes are valid
  • generate identifiers based on codes or ranges of
    codes

22
Dates
  • Date handling is similar to SAS: dates are
    stored as days since January 1, 1960
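
Stata stores a date as the number of days since January 1, 1960, so date arithmetic is plain subtraction. A minimal sketch, borrowing the variable names admitday and disday with made-up dates. The variable los (length of stay) is a hypothetical name.

```stata
clear
set obs 1
gen admitday = mdy(6, 1, 2007)
gen disday   = mdy(6, 15, 2007)
* display the stored day counts as readable dates
format admitday disday %td
* length of stay in days
gen los = disday - admitday
list
```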

23
Combining Data
  • Merge
  • this automatically creates a variable called
    _merge
  • _merge==1: obs. from master data
  • _merge==2: obs. from only one using dataset
  • _merge==3: obs. from at least two datasets, master
    or using
  • merge scrssn admitday disday using data_y
  • Append (stacking data)
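
A sketch of how _merge is filled in, using two tiny made-up datasets. It assumes Stata 11 or later, where the match type (1:1 here) is written explicitly. The slide's merge varlist using form is the older syntax. The names id, cost, age and the temporary file costs are hypothetical.

```stata
* build a small using dataset and save it to a temporary file
clear
input id cost
1 100
2 200
end
tempfile costs
save `costs'

* master dataset
clear
input id age
2 70
3 65
end
merge 1:1 id using `costs'
* id 3 is master only (1), id 1 is using only (2), id 2 matched (3)
tab _merge
```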

24
Explicit Subscripting
  • Identify the most recent encounter in an
    encounter database
  • gsort id -date (ascending sort by ID and reverse
    by date)
  • by id: gen n=_n (record counter from 1 to N per
    person)
  • by id: gen N=_N (total number of records per
    person)
  • gen select=n==1 (flags the most recent encounter)

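
A variant of the same idea, offered as a sketch rather than the slide's exact method. It sorts dates in ascending order within each person with bysort, so the most recent encounter is the last record (n equals N) rather than the first. The ids and day counts are made up.

```stata
clear
input id date
1 17000
1 17100
2 16950
end
* sort by id, then by date within id, and number the records
bysort id (date): gen n = _n
* total records per person
by id: gen N = _N
* the last record within each id is the most recent encounter
gen select = (n == N)
list
```
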
25
Using Stata

26
Stata Interface in Windows

27
Set, Clear and More
  • Set sets system parameters
  • Need to set memory size to open a database
  • set mem 100m
  • Clear erases data from memory
  • When output is longer than one page, you are
    asked to continue (set more off turns this off)

28
Summarizing Data
. sum gender age educ

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      gender |      4085    1.496206    .5000468          1          2
         age |      4085     64.5601    9.451724         50         94
        educ |      4085    4.398286    1.662883          1          9

  • sum <varlist>, d provides more details on each
    variable
  • Tabstat provides summary info, including totals
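
These commands can be tried on the auto dataset that ships with Stata. The dataset and the variables price, mpg and foreign are stand-ins for this sketch, not the data shown on the slide.

```stata
sysuse auto, clear
* basic summary: observations, mean, SD, min, max
summarize price mpg
* the detail option adds percentiles, skewness and kurtosis
summarize price, detail
* results are left behind in r(); here, the median
display r(p50)
* tabstat can report totals alongside other statistics
tabstat price, by(foreign) stat(n mean sum)
```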

29
Tabulating Data
  • . tab gender
         gender |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              1 |      2,058       50.38       50.38
              2 |      2,027       49.62      100.00
    ------------+-----------------------------------
          Total |      4,085      100.00
  • . table gender
    ----------------------
       gender |      Freq.
    ----------+-----------
            1 |      2,058
            2 |      2,027
    ----------------------

30
Tabulating Data
  • . tab gender age
    too many values
    r(134);
  • . tab age gender
               |        gender
           age |         1          2 |     Total
    -----------+----------------------+----------
            50 |        49         69 |       118
            51 |        72         71 |       143
           ...
            94 |         1          0 |         1
    -----------+----------------------+----------
         Total |     2,058      2,027 |     4,085

31
Tabstat
  • . tabstat age, by(gender)
      gender |      mean
    ---------+----------
           1 |  64.77454
           2 |  64.34238
    ---------+----------
       Total |   64.5601
    --------------------
  • . table gender, c(mean age)
    ----------------------
       gender |  mean(age)
    ----------+-----------
            1 |   64.77454
            2 |   64.34238
    ----------------------

32
Graphing
  • Diagnostic graphics
  • Presenting results

33
Basic Analytical Functions
  • OLS (reg)
  • Logistic, probit, count data (e.g., CLAD)
  • Multinomials
  • GLM/HLM
  • Duration models
  • Semi and non-parametric models

34
Output
    Linear regression                              Number of obs =    1306
                                                   F( 21,  1284) =   10.88
                                                   Prob > F      =  0.0000
                                                   R-squared     =  0.1398
                                                   Root MSE      =  90.367

             |               Robust
         wtp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ---------+----------------------------------------------------------------
       ethn1 |   1.990048   8.742036     0.23   0.820    -15.16019    19.14029
       ethn2 |  -25.74654   11.69993    -2.20   0.028    -48.69961   -2.793467
       ethn3 |  -35.59552   11.98309    -2.97   0.003     -59.1041   -12.08694
       ethn4 |  -3.244168   11.16836    -0.29   0.771    -25.15441    18.66607
     english |  -11.44402   9.699576    -1.18   0.238    -30.47277    7.584741
      lifeus |   37.34419   13.86037     2.69   0.007     10.15274    64.53564
     age1999 |  -.6272524   .3097408    -2.03   0.043    -1.234906   -.0195987
      income |   .8068256   .1714309     4.71   0.000     .4705102    1.143141

35
Outreg
  • Outputs data to a delimited file
  • Delimited file can be read into Excel
  • Very flexible
  • Creates publishable tables