Research Methods - PowerPoint PPT Presentation

Title: Research Methods Lecture 2 The dummies guide to STATA Author: ecrmg Last modified by: ecrmg Created Date: 10/18/2006 9:28:54 AM Document presentation format – PowerPoint PPT presentation

1
Research Methods
  • Lecture 2
  • The dummies' guide to STATA
  • Wiji Arulampalam
  • 18/10/2006

2
Econometrics Software
  • You can use any software that does what you need
  • See Timberlake for details of what does what well
    www.timberlake.co.uk
  • PC Give is hard to beat for time series analysis
  • Microfit, EViews are good alternatives
  • STATA does (just about) everything.
  • STATA (and everything else) is available as a
    delivered application on the network.

3
WHY STATA
  • Need to know how to use STATA for
  • (i) Econometrics A next term
  • (ii) Econometrics B this term
  • (iii) Panel Data Econometrics next term
  • E-Views demo will be given by the Econometrics
    tutors!
  • The above two should be sufficient

4
STATA
  • Hopefully you will have access by next week
  • So full demo next week
  • Stata command file wages.do and data file
    wages.dta are on the module web page for you to
    practice with

5
STATA
  • Use STATA for
    • large survey datasets (merging them)
    • complex nonlinear models (e.g. LDVs); but see also LimDep
    • nonparametric and evaluation methods
  • Use STATA if
    • you want to continue studying economics
    • you want to be a professional economist
    • you want to learn something new
    • you hate PC Give.

6
Some useful websites
  • Stata's own resources for learning STATA
  • Stata website, Stata journal, Stata library,
    Statalist archive
  • http://www.stata.com/links/resources1.html
  • Michigan's web-based guide to STATA (for SA)
  • UCLA resources to help you learn and use STATA
  • http://www.ats.ucla.edu/stat/stata
  • including movies and web-books

7
Accessing STATA
  • Available from your Delivered Applications
  • Double click on icon!

8
Buttons/Menu

9
Enter commands here

10
OR use the do editor to create a .do file

11
Results window
  • Better to save the output (more on this later)

12
Click for extensive help, OR type help in the command line
  • . help

13
Type help xxx in the command line for help on command xxx
  • . help xxx

14
Exit, clear

15
Point and click in v9
  • Menu/tabs
  • Exit, clear
16
Important features (1)
  • NOTE
  • STATA is case-sensitive, so always use lowercase
  • Otherwise you can get very confused
  • More
  • --more-- in your output window means more output
    is to come.
  • Press spacebar and the next page appears
  • The command set more off turns this off
  • Not enough memory? Then reset it!
  • . set mem XXXm (allocates XXX MB of memory for data)
  • . set matsize XXX (maximum matrix size XXX by XXX)

17
Important features (2)
  • To Break
  • To stop anything hit the break (menu button
    with red cross, or hit Ctrl and C simultaneously)

18
Using data on disk (1)
  • Opening a dataset
  • datasets need to be rectangular
  • variables in columns, observations in rows
  • Stata datasets have a .dta extension
  • Will read Excel or text files
  • Otherwise use Stat/Transfer to convert other
    format files to Stata files

19
Using data on disk (2)
  • There are several ways of getting data into
    STATA, e.g. wages.dta
  • . use wages (or click File/Open on the menu bar)
  • . use lwage ed exp in 1/1000 if fem==1 using wages
  • . insheet using wages.csv (or .txt)
  • (imports a csv file saved from Excel, or a text
    file)
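
A minimal sketch of the loading commands above, written as a Stata do-file. Since wages.dta is not reproduced here, it assumes Stata's built-in auto example dataset as a stand-in; the temporary file and the file name cars_subset.csv are hypothetical.

```stata
* Stand-in for wages.dta: Stata's built-in auto dataset (an assumption)
sysuse auto, clear
tempfile cars
save `cars'

* Load three variables only, and only those of the first 50 rows with foreign == 0
use make price mpg in 1/50 if foreign == 0 using `cars', clear

* Round trip through a comma-separated text file
outsheet using cars_subset.csv, comma replace
insheet using cars_subset.csv, clear
```

Loading a subset with use ... using is mainly helpful when the full file is too large to hold in memory.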

20
Opens the file
List of variables
21
Basic data reporting (1)
  • . describe (or press F3 key)
  • Lists the variable names and labels
  • . describe using wages
  • Lists the variable names etc. WITHOUT loading the
    data into memory (useful if the data is too big
    to fit)
  • . codebook
  • Tells you about the means, labels, missing values
    etc.
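
The reporting commands above can be tried as follows. This is only a sketch: it assumes the built-in auto dataset in place of wages.dta, and it saves a temporary copy so that describe using has a file on disk to inspect.

```stata
* Stand-in data (an assumption): the built-in auto dataset, saved to a temporary file
sysuse auto, clear
tempfile cars
save `cars'
clear

* Inspect the file on disk without loading it into memory
describe using `cars'

* Load it, then report on what is in memory
use `cars', clear
describe
codebook rep78 foreign
```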

23
Basic data reporting (2)
  • sort and count
  • . sort personid
  • sorts data by personid
  • . count if personid != personid[_n-1]
  • counts how many unique personids there are (an
    observation is counted when its personid differs
    from the previous one)
  • [_n-1] refers to the previous observation
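
A small sketch of the sort-then-count idiom, assuming the built-in auto dataset with foreign playing the role of personid (both choices are stand-ins, not the lecture's data).

```stata
sysuse auto, clear
sort foreign

* An observation starts a new group when its value differs from the one before.
* foreign[0] does not exist and evaluates to missing, so observation 1 is counted too.
count if foreign != foreign[_n-1]
```

The count is left in r(N), so it can be reused later in a do-file.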

26
First look at the data (1)
  • . list lwage ed exp in 1/10 if fem>0
  • Lists lwage, ed and exp for those of the first 10
    rows in which fem>0
  • . tab fem union (or tabulate)
  • variables should be integers
  • gives a crosstab of fem vs union
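
The same two commands, sketched on the built-in auto dataset (an assumed stand-in for the wages data; the threshold 4000 is arbitrary).

```stata
sysuse auto, clear

* Of the first 10 rows, list those with price above 4000
list make price mpg in 1/10 if price > 4000

* Cross-tabulate two integer-valued variables
tabulate foreign rep78
```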

28
First look at the data (2)
  • . summ fem union (or summarize or sum)
  • means, std devs etc. for fem and union
  • . corr ed exp in 1/100 if fem<1 (add ,cov for
    covariances)
  • correlation coeffs (or covariances) for selected
    data
  • . pwcorr ed exp lwage does all pairwise corr
    coeffs
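
A sketch of these summary commands on the built-in auto dataset, used here as an assumed stand-in; the variable names are from that dataset, not from wages.dta.

```stata
sysuse auto, clear

* Means, standard deviations, minima and maxima
summarize price mpg

* Correlation, then covariance, for a selected subsample
correlate price mpg in 1/50 if foreign < 1
correlate price mpg in 1/50 if foreign < 1, covariance

* All pairwise correlations, each using every observation available for that pair
pwcorr price mpg weight
```

The practical difference is that correlate drops any observation with a missing value in any listed variable, while pwcorr works pair by pair.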

32
Tabulating (1)
  • . tab x1 x2 if x4==0, sum(x3)
  • gives the means of x3 for each cell of the x1 vs
    x2 crosstabulation for observations where x4==0
  • . tab x1 x2, missing
  • Includes the missing values
  • . tab x1 x2, nolabel
  • Uses numeric codes instead of labels
  • E.g. 1 instead of NorthWest etc.

33
Tabulating (2)
  • . tab x1 x2, col
  • Gives column percentages alongside the counts
  • Can get row percentages by using row instead
  • Or both by using row col
  • . table educ ethnic, c(mean wage) row col
  • Customises the table so it includes the mean (or
    median or max or count or sd ...) of wage by cells
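
The tabulate options from the two slides above can be sketched as follows, assuming the built-in auto dataset with foreign, rep78 and price standing in for x1, x2 and x3.

```stata
sysuse auto, clear

* Mean, standard deviation and frequency of price in each foreign-by-rep78 cell
tabulate foreign rep78, summarize(price)

* Counts together with column and row percentages
tabulate foreign rep78, column row

* Numeric codes instead of value labels
tabulate foreign rep78, nolabel

* Treat missing values of rep78 as a category of their own
tabulate foreign rep78, missing
```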

34
Labelling
  • Always have your data comprehensively labelled
  • . label data "This is pooled GHS 90-99"
  • . label variable reg "region"
  • . lab define reglab 0 "North" 1 "South" 2
    "Middle"
  • . lab values region reglab
  • Tedious to do for lots of variables
  • but then your output will be intelligibly
    labelled
  • other people will be able to understand it in
    future
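
A short labelling sketch along the same lines. It assumes the built-in auto dataset; the variable thirsty, the label name thirstylab and the label texts are all made up for illustration.

```stata
sysuse auto, clear
label data "Built-in auto data, relabelled for illustration"
label variable mpg "Mileage in miles per gallon"

* Build a 0/1 indicator, then attach value labels to it
generate byte thirsty = (mpg < 20)
label define thirstylab 0 "20 mpg or more" 1 "Under 20 mpg"
label values thirsty thirstylab

* The table now shows the label text rather than 0 and 1
tabulate thirsty
```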

35
Data manipulation (1)
  • Data can be renamed, recoded, and transformed
  • Command . generate or gen for short
  • . gen logrw=log((earn/hours)/rpi)
  • . gen agesq=age^2 (squares)
  • . gen region1=(region==1) (1 if true, 0 if
    not)
  • . gen ylagged=y[_n-1]
  • (_n is the observation number in STATA)
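
The generate patterns above, sketched on the built-in auto dataset (an assumed stand-in; the new variable names are hypothetical).

```stata
sysuse auto, clear

* Log transform and square
generate lprice = log(price)
generate mpgsq = mpg^2

* Indicator: 1 if the condition is true, 0 if not
* (price has no missing values here; a missing value would count as larger than 6000)
generate byte expensive = (price > 6000)

* Previous observation's value; missing for the first observation
generate lagprice = price[_n-1]
```

A lag built with [_n-1] depends on the current sort order, so sort the data first whenever the order matters.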

36
Data manipulation (2)
  • Command recode
  • . recode x1 (.=0) (1/5=1) (. is the missing value
    (mv))
  • . replace rate=rate/100
  • . replace age=25 if age==250
  • . egen meaninc=mean(income), by(region)
  • (see help egen for details)
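
A sketch of recode, replace and egen together, again assuming the built-in auto dataset; repgrp, weight_k and meanprice are hypothetical names.

```stata
sysuse auto, clear

* Recode into a new variable: missing -> 0, 1 to 3 -> 1, 4 to 5 -> 2
recode rep78 (.=0) (1/3=1) (4/5=2), generate(repgrp)

* Rescale a copy of weight in place (thousands of lbs)
generate weight_k = weight
replace weight_k = weight_k/1000

* Group mean of price, repeated on every row of the group
egen meanprice = mean(price), by(foreign)
```

Using generate() with recode leaves the original variable untouched, which makes mistakes easier to undo.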

38
Data selection (1)
  • You can also organise your data set with various
    commands
  • . keep if _n<=1000 (_n is the observation
    number)
  • . drop region
  • . drop if ethnic!=1
  • keeps only the first 1000 observations, drops
    region, and drops all the observations where the
    variable ethnic!=1 (!= is not equal to)

39
Data selection (2)
  • Then save the smaller file for subsequent
    analysis
  • . save newfile
  • . save, replace (take care: it overwrites the
    existing file)
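
Selecting and saving can be sketched as below. The example assumes the built-in auto dataset and saves to a temporary file rather than newfile, so nothing on disk is overwritten.

```stata
sysuse auto, clear

* Keep the first 50 observations, drop one variable, drop unwanted rows
keep if _n <= 50
drop rep78
drop if foreign != 0

* Save the smaller dataset under a new (temporary) name
tempfile small
save `small'

* After further changes, save, replace overwrites the file just saved
generate lprice = log(price)
save, replace
```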

41
Functions
  • Lots of functions are possible.
  • See . help functions
  • Obvious ones like
  • log(), abs(), int(), round(), sqrt(), min(),
    max(), sum()
  • And many very specialised ones.
  • Statistical functions
  • distributions
  • String functions
  • Converting strings to numbers and vice versa
  • Date functions
  • Converting dates to numbers and vice versa
  • And lots more
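
A few of these functions can be tried directly with display, which needs no data in memory. The values are arbitrary illustrations.

```stata
* Mathematical functions
display sqrt(16)
display abs(-3.5)
display int(2.7)
display round(2.567, 0.01)
display min(4, 9, 2)

* Converting strings to numbers and vice versa
display real("42") + 1
display string(2006) + "-10-18"

* Dates are stored as days since 01jan1960; %td shows them as dates
display mdy(10, 18, 2006)
display %td mdy(10, 18, 2006)
```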

42
Command files
  • Stata command files have a .do extension
  • It is ALWAYS good practice to use a .do file
  • you will know exactly what you have done.
  • It makes it easy to develop ideas.
  • And correct mistakes.
  • . do wages.do, nostop
  • (echoes to screen, and keeps going after error
    encountered)
  • Or . run wages.do (executes silently)

43
Keeping track of output (1)
  • Can scroll back your screen (up to a point)
  • But better to open a log file at the beginning of
    your session, and close it at the end.
  • Click on File, Log, Begin. Or type
  • . log using myoutput
  • . (your commands)
  • . log close
  • The log command allows the replace and append
    options.

44
Keeping track of output (2)
  • Default is the .smcl file extension (a format
    that STATA can read)
  • A .log extension gives an ASCII file that anything
    can edit
  • ALWAYS LOG your output
  • It is a good way of developing a .do file, since it
    saves the commands as well as the output
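
Putting the last three slides together, a logged do-file might look like the sketch below. It assumes the built-in auto dataset in place of wages.dta, and a hypothetical file name session.do for the do-file itself.

```stata
* Close any log left open by an earlier run, then start a plain-text log
capture log close
log using myoutput.log, replace
set more off

* The analysis itself
sysuse auto, clear
summarize price mpg

* Close the log at the end of the session
log close
```

Saved as session.do, it could be run with do session.do; the replace option overwrites myoutput.log on each run, and append would add to it instead.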

45
Next Lecture
  • Monday 23rd October F107 1100-1200
  • STATA demo