1 Brief Introduction to STATA (2 Hours)
Fabian Waldinger (LSE & CEP)
2 Data for this Course
Download Schooling 1.dta, Schooling 2.dta, and Schooling 3.dta from http://personal.lse.ac.uk/waldinge/
and save them on your H: drive
Open the dataset using the Open icon in STATA
3 Opening STATA
Start Menu -> Programmes -> Statistics -> STATA Icon
4 Windows
Command Window
Results Window
Variables Window
Review Window
4 Opening Datasets
Usually data is not in STATA format (.dta)
You can copy and paste from Excel
Use Stat/Transfer program (see handout)
5 Preliminary Commands
Adjusting memory size
If you use large datasets you may have to increase the memory that the computer reserves for STATA
set mem 30m
(allocates 30 megabytes to STATA. Other abbreviations: k = kilobyte, m = megabyte, g = gigabyte)
Be careful: if you allocate too much memory to STATA, processing may become very slow
(STATA 12 and later manage memory automatically, so this command is not needed there)
Changing the way results are reported in the Results window
Prevents STATA from pausing the output when the Results window is full (convenient if you use log files, see below)
set more off
6 Saving Data
Click Save Icon
Saves the dataset as it is at that moment and replaces the old version
(Equivalent to typing
save "H:\Schooling 1.dta", replace )
Preserve and Restore
Convenient for temporary changes to the dataset that you do not want to keep for later use. E.g.
preserve
(photocopies the dataset as it stands)
(then make all the changes to the data and possibly run regressions)
restore
(returns to the dataset as it was when you typed preserve)
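The preserve/restore pattern can be tried on a small toy dataset. This is only a sketch meant to be run from a do-file; the variable names and values below are made up for illustration and are not the course data.

```stata
* Toy data (hypothetical values, not the Schooling files)
clear
input id s
1 10
2 12
3 16
end

preserve              // take a snapshot of the data in memory
drop if s < 12        // temporary change
count                 // 2 observations remain
restore               // go back to the snapshot
count                 // all 3 observations are back
```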
7 Examining the Data
Look at Data in the Data Editor
Be careful to not make arbitrary changes to the data
Summarize
Provides summary statistics
sum
Tabulate
Gives the frequency of each value of a variable
(do not use if the variable takes too many different values)
tab s
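A minimal sketch of both commands on toy data (the values are hypothetical). Adding the detail option to summarize is an optional extra that also reports percentiles.

```stata
* Toy data (hypothetical values)
clear
input s
10
12
12
16
16
end

sum s                 // number of observations, mean, sd, min, max
sum s, detail         // adds percentiles, skewness, kurtosis
tab s                 // frequency of each value of s
```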
8 Organising the Data
Rename
Changing the name of a variable.
rename lwage logwage
Recode
Changes the values of a variable
recode black (0 = .)
(assigns value . (missing value in STATA) if black was 0 before)
recode black (. = 0)
(assigns value 0 if black was missing before)
Assignments of values to variables are made using one = sign (tests for equality use ==)
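The two recode steps can be checked on toy data. This is a sketch with made-up values; it uses the parenthesised rule syntax, which is how current STATA versions write recode rules.

```stata
* Toy data (hypothetical values)
clear
input black
0
1
0
end

recode black (0 = .)        // every 0 becomes missing
count if missing(black)     // 2 observations are now missing
recode black (. = 0)        // missing values become 0 again
tab black
```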
9 Adding Observations (Append)
Adding additional rows to the dataset
Adding additional observations with data on the same (or almost the same) variables
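Appending can be sketched on toy data, run as one do-file. The variable names and values are hypothetical, and a temporary file stands in for the second dataset. The variable age exists only in the appended dataset, so the listing shows how it is filled for the other rows.

```stata
* Second dataset (hypothetical), saved to a temporary file
clear
input id s age
3 16 30
4 11 41
end
tempfile extra
save "`extra'"

* First dataset (hypothetical), without the variable age
clear
input id s
1 10
2 12
end

append using "`extra'"   // adds the rows of the second dataset
list                     // age is missing for id 1 and 2
```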
If one of the datasets has fewer variables, the combined dataset will feature all variables but will have missing values on the extra variables for all observations from the dataset that lacks them
append using "H:\Schooling 2.dta"
save "H:\Schooling 4.dta"
10 Adding Variables to Dataset (Merge)
Adding additional Columns to the dataset
Adding additional variables for the same individuals/firms/countries (or almost the same)
In order to apply the merge command you first have to sort both datasets by the variable that identifies individual observations
1) Sort the Master dataset and save it
sort id
2) Open the Using dataset, sort it by the same variable, and save it
sort id
11 Adding Variables to Dataset II
3) Reopen the Master dataset and merge the Using dataset with it
merge id using "H:\Schooling 3.dta"
The merge command creates a new variable called _merge which records, for each observation, whether it was matched across the two datasets. Investigate this with
tab _merge
The _merge variable can take three values
1 observations from the master dataset did not match with observations from the using dataset
2 observations from the using dataset did not match with observations from the master dataset
3 observations from both datasets matched
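The three _merge values can be seen in a small sketch, run as one do-file. The data are hypothetical and a temporary file stands in for the using dataset. The sketch uses the merge 1:1 syntax of newer STATA versions (11 and later), which does not require sorting first; that is a substitution for the older syntax shown in the slides.

```stata
* Using dataset (hypothetical), saved to a temporary file
clear
input id nearcoll
1 1
2 0
4 1
end
tempfile using_data
save "`using_data'"

* Master dataset (hypothetical)
clear
input id s
1 10
2 12
3 16
end

merge 1:1 id using "`using_data'"
tab _merge    // id 3: master only (1); id 4: using only (2); id 1, 2: matched (3)
```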
12 Dropping and Keeping Variables
Before we do this exercise we preserve the data
preserve
Keep
Keeps only the variables you specify (those you want to use for your analysis) and drops all others.
keep s age e
Drop
Drops the variables which you specify in the variable list.
drop
13 Dropping and Keeping Variables II
Can be combined with relational operators or logical operators
Relational Operators               Logical Operators
==  equal to                       &   and
!=  not equal to                   |   or
>   greater than                   !   not
>=  greater than or equal to
<   less than
<=  less than or equal to
drop if (s>12 | s<8)
restore
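Conditional dropping can be sketched on toy data (hypothetical values). Note that STATA treats a missing value as larger than any number, so a condition such as s > 12 would also be true for observations with missing s; the toy data contain none.

```stata
* Toy data (hypothetical values)
clear
input id s
1 6
2 10
3 12
4 16
end

drop if (s > 12 | s < 8)   // drops id 4 (s = 16) and id 1 (s = 6)
list                       // id 2 and 3 remain
```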
14 Creating New Variables
Generate
Generates a new variable
gen college = 1 if s>12
gen e2 = e*e
Egen
Typically creates variables based on summary statistics
egen meanwage = mean(logwage)
Replace
Modifies existing variables in exactly the same way as gen creates new variables
replace college = 0 if college == .
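The three commands can be tried together on toy data. This is a sketch; the values are invented for illustration.

```stata
* Toy data (hypothetical values)
clear
input s e logwage
10 5 5.8
12 3 6.1
16 1 6.5
end

gen college = 1 if s > 12             // 1 for s > 12, missing otherwise
replace college = 0 if college == .   // fill the remaining cases with 0
gen e2 = e*e                          // experience squared
egen meanwage = mean(logwage)         // same sample mean in every row
list
```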
15 Log-Files and Do-Files
Log-Files
Record all commands and the output of your STATA session.
Do-Files
Combine many commands in one file, which can be saved, edited, and rerun later on
Open Do-File Window and type
log using "H:\schooling.log"
sum
(other commands)
log close
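A complete small do-file might look like the sketch below. The log file name and the toy data are hypothetical; the replace option overwrites an existing log of the same name, and text writes a plain-text log.

```stata
* demo.do (hypothetical file name)
clear
set more off
input id s
1 10
2 12
3 16
end

capture log close                              // close any log left open earlier
log using "schooling_demo.log", replace text   // written to the working directory
sum
log close
```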
16 OLS Regression
Running an OLS regression
first specify the dependent variable and then all the explanatory variables
reg logwage s e e2 black
Can be restricted to a subsample with an if condition
reg logwage s e e2 black if college==1
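Both regressions can be run on simulated data if the course files are not at hand. This is a sketch only: the data-generating process and all coefficients below are invented, and the random-number functions assume a reasonably recent STATA version.

```stata
* Simulated data (hypothetical data-generating process)
clear
set seed 12345
set obs 200
gen s = 8 + floor(10*runiform())      // schooling between 8 and 17
gen e = floor(20*runiform())          // experience between 0 and 19
gen e2 = e*e
gen black = runiform() < 0.3
gen logwage = 5 + 0.08*s + 0.05*e - 0.001*e2 - 0.1*black + rnormal(0, 0.3)
gen college = s > 12

reg logwage s e e2 black                    // full sample
reg logwage s e e2 black if college == 1    // college subsample only
```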
17 IV Regression
Running an IV regression
Instrumenting schooling with a dummy for whether you live near a four-year college
ivreg logwage (s = nearcoll) e e2 black
The first stage is not reported by default. To investigate it, estimate it in the standard way using OLS (regress s on nearcoll and the other explanatory variables)
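The first stage and the IV estimate can be sketched on simulated data. Everything about the data below is invented for illustration. The sketch uses ivregress 2sls, the command that newer STATA versions (10 and later) provide in place of ivreg; this is a substitution, not the command from the slides.

```stata
* Simulated data (hypothetical data-generating process)
clear
set seed 2024
set obs 500
gen nearcoll = runiform() < 0.5
gen ability = rnormal()               // unobserved in practice
gen e = floor(20*runiform())
gen e2 = e*e
gen black = runiform() < 0.3
gen s = 11 + 1.5*nearcoll + ability + rnormal()
gen logwage = 5 + 0.08*s + 0.05*e - 0.001*e2 - 0.1*black + 0.2*ability + rnormal(0, 0.3)

reg s nearcoll e e2 black                          // first stage by OLS
ivregress 2sls logwage e e2 black (s = nearcoll)   // IV estimate
```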
18 Getting Information on STATA
STATA help
just type help followed by the command you want to use. E.g.
help reg
STATA Manuals
Available in the Computer Manual Section of the Library