Stata Seminar

Source: Kohler, U. and F. Kreuter (2005). Data Analysis Using Stata.

1
Stata Seminar
  • Session 2: Syntax
  • Francisco Jose Gonzalez Carreras
  • fjg23_at_sussex.ac.uk

2
First: reading an Excel file
  • Download the file and save it.
  • Open the Excel file and save it as Text (tab
    delimited).
  • Open Stata and go to the directory where the new
    text file is. Remember cd f:\ (for example).
  • (In this case with data1) type set memory 5m (we
    are increasing the memory because the file is big
    and otherwise Stata would give us an error).
  • Type insheet using data1.txt, clear
  • Type save data1saved; the data are now stored as
    a .dta file.
  • Note that we lost the labels; it is better to
    download from the original source.
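
The steps above could be collected in a short do-file, roughly as follows. This is only a sketch: it assumes the tab-delimited export is called data1.txt and sits in f:\, as in the slide. In Stata 13 and later, import delimited is the newer alternative to insheet.

```stata
* Assumption: data1.txt is the tab-delimited export, stored in f:\
cd f:\
* Only needed in Stata 11 and earlier; later versions manage memory themselves
set memory 5m
* Read the text file, discarding any data currently in memory
insheet using data1.txt, clear
* replace overwrites data1saved.dta if it already exists
save data1saved, replace
* Check what arrived: names and storage types, but no labels
describe
```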

3
Elements of Stata Commands
  • How to use Stata commands. Elements can be
  • required
  • permitted
  • prohibited
  • Type help summarize
  • summarize [varlist] [if] [in] [weight] [,
    options]
  • Command: summarize
  • varlist means variable list
  • if stands for the if qualifier (if gender==1).
    Qualifiers restrict the command to a particular
    subsample of the data
  • , options modify how the command is carried
    out.
  • Start the session and load data1 (big brother and
    use)
  • log using session2.log
  • use data1, clear
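
To see how the elements combine, here is a small sketch that adds one element at a time. It assumes data1 is in the current directory and uses the variables income and gender that appear elsewhere in this session.

```stata
use data1, clear
* command + varlist
summarize income
* command + varlist + if qualifier
summarize income if gender == 1
* command + varlist + in qualifier (first 100 observations)
summarize income in 1/100
* command + varlist + if qualifier + option
summarize income if gender == 1, detail
```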

4
Syntax: commands
  • Commands can be abbreviated. In the help file
    some letters of the command are underlined; these
    are the shortest possible abbreviation of the
    command.
  • This means that summarize can be written as su,
    but also as sum, summ, summa
  • Type su income and then sum income: the result is
    exactly the same

5
Syntax Commands

6
Syntax: Variable list (varlist)
  • varlist means that you can use a variable list,
    which you enter by writing the names of the
    variables, separated by spaces
  • [varlist] in square brackets: the variable list
    is allowed but not required, as in summarize
  • varlist with no square brackets: it is required;
    if you do not write it the program will report an
    error.
  • Type help drop. Typing just drop gives an error.
  • After that type drop _all. All variables are
    dropped.
  • We cannot reverse this command, so we have to
    load the file again: use data1, clear
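
The difference between an optional and a required varlist can be tried out as below. This is a sketch assuming data1 is in the current directory; capture noisily is used only so that a do-file keeps running after the error.

```stata
use data1, clear
* varlist is optional for summarize: with none, every variable is summarized
summarize
* varlist is required for drop: this line reports an error
capture noisily drop
* _all is a reserved name that stands for every variable
drop _all
* The dataset in memory is now empty
describe
* Reload the data, since drop cannot be undone
use data1, clear
```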

7
Syntax: Variable list (varlist)
  • You can also see varname or depvar with no
    brackets. These stand for variable lists
    that consist of one single variable. Sometimes it
    is one variable within a list of variables where
    the order is important
  • Type help regress: you see depvar (for dependent
    variable) and then [indepvars], which you can
    enter or not
  • Remember session 1: regress income sex fulltime

8
Syntax: Variable list (varlist). Abbreviation rules
  • Single variables
  • A variable name can be shortened to its first
    letter(s) when no other variable begins with
    them: kitchen with k
  • sum k summarizes kitchen
  • Use ~ to skip characters in the middle of a
    variable name
  • sum y~h summarizes ybirth and saves you typing
    birt
  • Multiple variables
  • Use ? for variables whose names differ in exactly
    one character: sum np940?
  • Use an asterisk (*) for variables that share
    characters in their names: sum np* summarizes all
    the variables that begin with np
  • Use a hyphen to specify a range of variables; the
    range follows their order in the dataset.
  • sum kitchen-phone is the same as writing sum
    kitchen shower wc heating cellar balcony garden
    phone
  • Let's mix these: sum rs np* k-ph summarizes
    a lot of variables!
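
The abbreviation rules can be tried one by one, as in this sketch. It assumes data1 is in the current directory and that the variables are named and ordered as described on this slide. Each abbreviation for a single variable must match exactly one name, otherwise Stata reports an error.

```stata
use data1, clear
* Unique prefix: k stands for kitchen if no other variable starts with k
summarize k
* Tilde: y~h matches the one variable that starts with y and ends with h
summarize y~h
* Question mark: exactly one arbitrary character in that position
summarize np940?
* Asterisk: any number of characters, here every variable starting with np
summarize np*
* Hyphen: all variables stored from kitchen through phone
summarize kitchen-phone
```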

9
Syntax: Options (, options)
  • Commands have a default behavior and options
    modify it. They are different for each command
    and are allowed when you see the word options
    after a comma in the syntax diagram.
  • The available options are described further down
    in the help file.
  • We did this in the first session. Type
  • summarize income, detail
  • See also the syntax of tabulate. Type
  • tabulate gender np9506, missing row

10
Qualifiers: in (order)
  • The in qualifier limits the execution of the
    command to a subset of observations, selected by
    their ORDER in the dataset.
  • It is composed of the word in and a range of
    observations separated by a slash (/). Before the
    slash comes the first observation for which the
    command will be executed and after the slash
    the last: FIRST/LAST. If the range is a single
    observation, its number alone is enough.
  • Remember that it can be very important to sort
    the data beforehand, to make sure that you
    execute the commands for the observations that
    you want.
  • Examples
  • list persnr gender ybirth in 10 (only the tenth
    observation)
  • list persnr gender ybirth in 10/14 (tenth to
    fourteenth)
  • list persnr gender ybirth in -5/-1 (fifth from
    the last to the last)
  • list persnr gender ybirth in -5/l (fifth from the
    last to the last; l is the letter l, for last)
  • list persnr gender ybirth in 3330/-5 (3330th to
    5th from the last)
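
Because in refers to positions, the same range picks different people depending on how the data are sorted. A sketch, assuming data1 is in the current directory:

```stata
use data1, clear
* Put the observations in order of year of birth
sort ybirth
* The five smallest values of ybirth
list persnr gender ybirth in 1/5
* The last five observations; missing values of ybirth, if any, sort to the end
list persnr gender ybirth in -5/l
```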

11
Qualifiers: if (condition)
  • Restricts the execution of the command to those
    observations that meet a particular condition,
    which has to follow the qualifier.
  • We did sum income if gender==1. See it here
  • list income gender in 1/5, nolabel
  • list income gender in 1/5 if gender==1, nolabel
    See how if works?
  • Try these others
  • sum income if ybirth < 1979
  • sum income if ybirth <= 1979
  • sum income if ybirth == 1979
  • Careful with the infinity mistake: the missing
    trap! Stata treats missing values as larger than
    any number.
  • tab edu, missing nolabel
  • sum ybirth if edu>=6: see that the 28
    observations with missing edu are included
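
The missing trap is easy to reproduce on a tiny made-up dataset. The variable x and its three values below are purely illustrative and are not part of data1.

```stata
* Three observations: 1, 5 and a missing value
clear
input x
1
5
.
end
* Counts 2: the value 5 and the missing value, which is larger than any number
count if x > 3
* Counts 1: the missing value is excluded explicitly
count if x > 3 & x < .
* Counts 1: the same restriction written with the missing() function
count if x > 3 & !missing(x)
```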

12
Qualifiers: if. Relational operators
  • Practice
  • sum ybirth if edu==6 | edu==7
  • sum ybirth if edu>=6 & edu<=7
  • sum ybirth if edu>=6 & edu<.

13
Expressions, operators
  • Allowed or required when the term exp appears in
    the syntax diagram. Type help generate. This
    command needs an expression (newvar = exp).
  • The Stata calculator: display. Type
  • display 2+2
  • Operators: +, -, *, /. Type
  • display 3*2
  • display 2*3 + 2*2
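
A few more lines for the display calculator, showing that the usual precedence rules apply. The numbers are arbitrary illustrations.

```stata
* Addition: shows 4
display 2 + 2
* Multiplication binds tighter than addition: shows 14
display 2 + 3 * 4
* Parentheses change the order: shows 20
display (2 + 3) * 4
* Powers are written with ^: shows 8
display 2^3
* Functions can be part of an expression: shows 4
display sqrt(16)
```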

14
Lists of numbers

15
Using filenames
  • In Stata some commands read or write a file. In
    the syntax diagram this is expressed as using
    filename
  • Normally a filename consists of a directory, the
    name of the file itself and the extension.
  • F:\data1.dta. If you type only a name, Stata
    looks for it in the current directory (shown in
    the bottom left-hand corner). If it is not there,
    Stata will report an error.
  • If you type a filename without an extension,
    Stata looks for one with the extension that is
    appropriate for the specified command.
  • Extensions and commands in the table.
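
The three ways of naming a file might look like this. The sketch assumes data1.dta is stored in f:\; data1copy is a hypothetical name chosen for the illustration.

```stata
* Full path: works whatever the current directory is
use "f:\data1.dta", clear
* Name only: Stata looks in the current directory, so change to it first
cd f:\
use data1.dta, clear
* No extension: use and save assume .dta
use data1, clear
save data1copy, replace
* pwd shows which directory is the current one
pwd
```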

16
Repeating similar commands: by prefix
  • Sometimes you will need to type similar commands
    over and over again. There are two main ways
    to do this
  • The by prefix, already known: remember that it
    executes the command separately for each group
    defined by the by variable(s). Type
  • sort gender
  • by gender: sum income
  • by edu, sort: summarize income
  • Play with the by prefix
  • bysort edu: summarize income
  • by gender edu, sort: sum income (the same in
    only one command, adding one variable)
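
The by prefix only works on data sorted by the grouping variable(s), which is why the sort option or bysort is convenient. A sketch, assuming data1 is in the current directory; capture noisily is there only so that a do-file continues after the error.

```stata
use data1, clear
sort gender
* Works: the data are sorted by gender
by gender: summarize income
* Reports "not sorted": the data are sorted by gender, not by edu
capture noisily by edu: summarize income
* bysort sorts first and then runs the command for each group
bysort edu: summarize income
* The same idea with two grouping variables, using the sort option
by gender edu, sort: summarize income
```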

17
Repeating similar commands: foreach or forvalues loops
  • Syntax. DO NOT COPY-PASTE THE LOOPS: errors will
    appear because of the formatting.
  • foreach lname listtype list {
  •     commands referring to `lname'
  • }
  • The first line starts the loop and ends with an
    opening brace {
  • Then add the Stata commands
  • Close the loop with }
  • The first line contains the element name (lname),
    the list type (listtype), and the foreach list
    (list)
  • You state the name of the element and the list of
    items you want to execute the commands on,
    then open the brace {, then write the command(s),
    and close with }. This example uses a list of
    variables
  • foreach x of varlist np9501 np9504 {
  •     tabulate `x' gender
  • }
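
A loop over a variable list that can be run without data1 is sketched below. It uses the auto dataset shipped with Stata as a stand-in, so the variables rep78, headroom and foreign are assumptions of this example and not part of the seminar data. Note the two different quote characters around the element name: a backtick on the left and an apostrophe on the right.

```stata
* Example dataset installed with Stata
sysuse auto, clear
* x takes the value rep78 in the first pass and headroom in the second
foreach x of varlist rep78 headroom {
    * `x' is replaced by the current variable name
    tabulate `x' foreign
}
```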

18
Repeating similar commands: foreach
  • Other examples of foreach
  • foreach var of newlist r1-r10 {
  •     gen `var' = uniform()
  • }
  • List of new variables. uniform() creates
    uniformly distributed random numbers.
  • foreach num of numlist 1/10 {
  •     replace r`num' = uniform()
  • }
  • List of numbers. We use replace because the
    variables already exist
  • Practice: let's label all the variables r1-r10
    with a common label: 1st uniform variable, 2nd
    uniform variable, and so on. The first, second
    and third by typing, the rest with a loop.
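
One possible way to approach the practice is sketched here on an empty dataset. The 100 observations are an arbitrary choice, and runiform() is used in place of uniform(), its name in newer Stata versions.

```stata
* Start from an empty dataset with 100 observations
clear
set obs 100
* Create r1 to r10 as uniform random numbers
foreach var of newlist r1-r10 {
    generate `var' = runiform()
}
* The first three labels typed by hand
label variable r1 "1st uniform variable"
label variable r2 "2nd uniform variable"
label variable r3 "3rd uniform variable"
* The remaining labels built inside a loop over the numbers 4 to 10
foreach num of numlist 4/10 {
    label variable r`num' "`num'th uniform variable"
}
* Check the labels
describe
```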

19
Repeating similar commands: forvalues
  • It has a simplified syntax
  • forvalues lname = range {
  •     commands
  • }
  • forvalues num = 1/10 {
  •     replace r`num' = uniform()
  • }
  • Practice: replace the labels of the variables
    with a forvalues loop instead.
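
A compact forvalues example that creates and labels variables in the same pass. It is a sketch on an empty dataset; the five observations, the three variables and the label text are illustrative choices, and runiform() is the newer name of uniform().

```stata
* Empty dataset with five observations
clear
set obs 5
* num runs through 1, 2 and 3
forvalues num = 1/3 {
    generate r`num' = runiform()
    label variable r`num' "uniform variable `num'"
}
* Shows r1, r2 and r3 with their labels
describe
```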

20
Repositories
  • Stata saves the results of statistical commands
    in r() and those of estimation commands in e().
    They are called repositories.
  • Statistical: summarize. Type
  • summarize income and then return list, and you
    will see the contents of the last r-class command
  • Estimation: regress. Type
  • regress income yedu and then ereturn list
  • You can calculate with them. Type
  • sum income
  • display r(mean) + 1.96*sqrt(r(Var)/r(N))
  • display r(mean) - 1.96*sqrt(r(Var)/r(N))
  • Stored results are overwritten by the next
    command that stores results. Some commands store
    results in matrices, but we will not look at them
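
The same steps can be run without data1, as in the sketch below. It uses the auto dataset shipped with Stata, so price and mpg are stand-ins for income and yedu.

```stata
* Example dataset installed with Stata
sysuse auto, clear
* r-class command: results go to r()
summarize price
return list
* Approximate 95% confidence bounds for the mean, built from r()
display r(mean) - 1.96*sqrt(r(Var)/r(N))
display r(mean) + 1.96*sqrt(r(Var)/r(N))
* e-class command: results go to e()
regress price mpg
ereturn list
* A single stored result, here the R-squared
display e(r2)
```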

21
  • Unplug big brother
  • log close
  • doedit
  • Copy the commands from the Review window and
    paste them into the do-file editor. Save as
    session2
  • clear
  • exit