Gereltuya Altankhuyag, Lecturer/Statistician, UNSIAP


1
STATA
Third group training course in application
of information and communication technology to
production and dissemination of official
statistics
10 May – 11 July 2007
Gereltuya Altankhuyag, Lecturer/Statistician,
UNSIAP
gereltuya_at_unsiap.or.jp
2
Getting Started
  • There are three ways of executing commands:
  • Using the menu bar
  • Using a dialog box (db)
  • Using syntax (typed commands)
  • It is preferable to use syntax

3
Getting Started dialog box
  • db is the command-line way to launch the
    dialog box for a Stata command.
  • Syntax
  • db commandname
  • For instance: db sum

4
Getting Started dialog box
5
Basic commands to inspect datasets
  • The following commands are used to inspect
    datasets
  • codebook
  • count
  • describe
  • list
  • summarize
  • table
  • tabstat

6
Basic commands to inspect datasets
  • codebook
  • It examines the variable names, labels, and
    data to produce a codebook describing the dataset
  • It distinguishes and reports the standard missing
    values
  • Syntax
  • codebook [varlist] [if] [in] [, options]
  • Examples
  • codebook
  • codebook region

7
Basic commands to inspect datasets
  • options
  • all - provides a complete report, excluding mv
  • header - adds a header to the top of the output
    (dataset name, date)
  • notes - lists any notes attached to the variables
  • mv - determines the pattern of missing values
  • Examples
  • codebook region hhlandd famsize, all
  • codebook region hhlandd famsize, header
  • codebook region hhlandd famsize, notes
  • codebook region hhlandd famsize, mv
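
The codebook options above can be tried on any dataset. The sketch below is only an illustration: it assumes Stata's bundled auto dataset (loaded with sysuse) instead of the course data, and the variables rep78 and foreign stand in for region, hhlandd, and famsize.

```stata
* Load the example dataset shipped with Stata (an assumption; not the course data)
sysuse auto, clear

* Default codebook entries for two variables
codebook rep78 foreign

* Add the dataset header and any notes attached to the variables
codebook rep78 foreign, header notes

* Search for the pattern of missing values in rep78
codebook rep78, mv
```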

8
Basic commands to inspect datasets
  • count
  • It counts the number of observations that satisfy
    the specified conditions. If no conditions are
    specified, count displays the number of
    observations in the data.
  • Syntax
  • count [if] [in]
  • Examples
  • count
  • count if famsize > 5
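
A minimal sketch of count with and without a condition, assuming the bundled auto dataset rather than the course data (mpg is a stand-in for famsize):

```stata
* Load the example dataset shipped with Stata (an assumption; not the course data)
sysuse auto, clear

* Number of observations in the data
count

* Number of observations that satisfy a condition
count if mpg > 25

* count leaves its result in r(N), so it can be reused
display "cars with mpg > 25: " r(N)
```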

9
Basic commands to inspect datasets
  • describe
  • It produces a summary of the dataset in memory
    or of the data stored in a Stata-format dataset
  • Syntax
  • Data in memory
  • describe [varlist] [, memory_options]
  • Data in file
  • describe [varlist] using filename
    [, file_options]
  • Examples
  • des
  • des region famsize toilet

10
Basic commands to inspect datasets
  • options
  • simple - display only variable names
  • short - display only general information
  • detail - display additional details
  • fullnames - do not abbreviate variable names
  • numbers - display variable number along with name
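
The following sketch shows describe on data in memory and on a file. It assumes the bundled auto dataset, and the temporary file `copy' is a hypothetical stand-in for a saved .dta file; neither comes from the course materials.

```stata
* Load the example dataset shipped with Stata (an assumption; not the course data)
sysuse auto, clear

* General information only (observations, variables, sort order)
describe, short

* Selected variables, with unabbreviated names and variable numbers
describe make price mpg, fullnames numbers

* Variable names only
describe, simple

* Describe a dataset stored on disk without loading it
tempfile copy
save "`copy'"
describe using "`copy'", short
```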

11
Basic commands to inspect datasets
  • list
  • It displays values of variables
  • Syntax
  • list
  • list [varlist] [if] [in] [, options]
  • Examples
  • list
  • list region famsize toilet
  • list region famsize toilet in 1/15
  • list region if famsize > 5 in 1/15
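
As a hedged illustration of combining a variable list with if and in, the sketch below uses the bundled auto dataset (an assumption; make, price, and mpg replace the course variables):

```stata
* Load the example dataset shipped with Stata (an assumption; not the course data)
sysuse auto, clear

* Three variables for the first five observations
list make price mpg in 1/5

* Among the first 40 observations, only those with mpg above 30
list make mpg if mpg > 30 in 1/40

* Compact output without separator lines or observation numbers
list make price in 1/10, clean noobs
```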

12
Basic commands to inspect datasets
  • summarize
  • It calculates and displays a variety of summary
    statistics. If no varlist is specified, summary
    statistics are calculated for all the variables
    in the dataset.
  • Syntax
  • summarize
  • summarize [varlist] [if] [in] [weight]
    [, options]
  • Examples
  • sum
  • sum in 1/15
  • sum region famsize toilet
  • sum region famsize toilet [aw=weight]

13
Basic commands to inspect datasets
  • options
  • detail - produces additional statistics
    including skewness, kurtosis, the four smallest
    and four largest values, and various percentiles.
  • meanonly - allowed only when detail is not
    specified; suppresses the display of results
    and the calculation of the variance.
  • format - requests that the summary statistics
    be displayed using the display formats associated
    with the variables.
  • separator() - specifies how often to insert
    separation lines into the output. The default is
    separator(5), meaning that a line is drawn after
    every 5 variables. separator(10) would draw a
    line after every 10 variables. separator(0)
    suppresses the separation line.
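
The options above can be sketched as follows. The example assumes the bundled auto dataset, and the analytic weight uses its weight variable purely for illustration.

```stata
* Load the example dataset shipped with Stata (an assumption; not the course data)
sysuse auto, clear

* Default output: observations, mean, standard deviation, min, max
summarize price mpg weight

* Additional statistics: percentiles, skewness, kurtosis, extreme values
summarize price, detail
display "median price: " r(p50)

* Compute the mean quietly and pick it up from r(mean)
summarize mpg, meanonly
display "mean mpg: " r(mean)

* Use the variables' display formats and draw no separator lines
summarize price mpg weight length turn headroom trunk, format separator(0)

* Weighted summary with analytic weights
summarize mpg [aweight=weight]
```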

14
Basic commands to inspect datasets
  • NOTE
  • Commands and output are shown in Results window.
  • When the MORE message is shown, press GO to
    continue the display or the X button to stop it.

15
Basic commands to inspect datasets
  • NOTE
  • We may specify a variable list for a range of
    variables
  • des region-toilet
  • sum region-hhlandd
  • list thana-famsize
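
A range written as first-last expands to every variable stored between the two names, in dataset order. The sketch below assumes the bundled auto dataset, where price, mpg, and rep78 are adjacent.

```stata
* Load the example dataset shipped with Stata (an assumption; not the course data)
sysuse auto, clear

* price-rep78 expands to the variables stored from price through rep78
describe price-rep78
summarize price-rep78

* ds shows the expanded names and stores them in r(varlist)
ds price-rep78
display "`r(varlist)'"
```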

16
Basic commands to inspect datasets
  • NOTE
  • We may use the menus
  • for DESCRIBE
  • Data > Describe Data > Describe Variables in
    Memory
  • for SUMMARIZE
  • Statistics > Summaries, Tables & Tests > Summary
    Statistics > Summary Statistics
  • Data > Describe Data > Summary Statistics

17
Basic commands to inspect datasets
  • There are 5 types of table commands
  • table
  • tabstat
  • tabulate one-way
  • tabulate two-way
  • tabulate summarize

18
Basic commands to inspect datasets
  • table
  • It calculates and displays tables of statistics.
  • Syntax
  • table rowvar [colvar [supercolvar]] [if] [in]
    [weight] [, options]
  • Main options
  • contents(clist) - specifies the contents of the
    table's cells; up to 5 statistics may be selected
  • by(superrowvarlist) - superrow variables; up to
    4 variables.

19
Basic commands to inspect datasets
  • Examples
  • table region, c(mean famsize median hhlandd)
  • table region, by(sexhead) c(mean famsize median
    hhlandd)

20
Basic commands to inspect datasets
  • tabstat
  • It displays table of summary statistics
  • Syntax
  • tabstat varlist [if] [in] [weight] [, options]
  • Main options
  • by(varname) - group statistics by variable
  • statistics(statname ...) - report specified
    statistics

21
Basic commands to inspect datasets
  • Examples
  • tabstat region, stats(mean range)
  • tabstat region, by(sexhead) stat(min mean max)
    col(stat)
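
A sketch of the same two tabstat patterns on the bundled auto dataset (an assumption; price, mpg, and foreign replace the course variables):

```stata
* Load the example dataset shipped with Stata (an assumption; not the course data)
sysuse auto, clear

* Mean and range of two variables
tabstat price mpg, statistics(mean range)

* Min, mean, and max of price within each group of foreign,
* with the statistics laid out as columns
tabstat price, by(foreign) statistics(min mean max) columns(statistics)
```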

22
Basic commands to inspect datasets
  • tabulate one-way (tab1)
  • tabulate produces a one-way table of frequency
    counts.
  • Syntax
  • tabulate varname [if] [in] [weight] [, options]
  • tab1 produces a one-way tabulation for each
    variable specified in varlist.
  • Syntax
  • tab1 varlist [if] [in] [weight] [, tab1_options]

23
Basic commands to inspect datasets
  • Examples
  • tabulate toilet
  • tabulate region
  • tabulate hhelec
  • tabulate sexhead
  • tab1 region toilet hhelec sexhead
  • Note: please compare the output of the tabulate
    commands with that of tab1.
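
To see the difference between tabulate and tab1, a minimal sketch on the bundled auto dataset (an assumption; foreign and rep78 replace the course variables):

```stata
* Load the example dataset shipped with Stata (an assumption; not the course data)
sysuse auto, clear

* One command per variable
tabulate foreign
tabulate rep78

* One command, one table per listed variable
tab1 foreign rep78

* Include missing values as their own category
tabulate rep78, missing
```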

24
Basic commands to inspect datasets
  • tabulate two-way (tab2)
  • Syntax
  • tabulate varname1 varname2 [if] [in] [weight]
    [, options]
  • It produces two-way tables of frequency counts,
    along with various measures of association,
    including the common Pearson's chi-squared, the
    likelihood-ratio chi-squared, Cramér's V,
    Fisher's exact test, Goodman and Kruskal's
    gamma, etc.

25
Basic commands to inspect datasets
  • tab2 varlist [if] [in] [weight] [, options]
  • It produces all possible two-way tabulations of
    the variables specified in varlist.
  • Examples
  • tabulate region toilet, row
  • tabulate region sexhead, row col chi2
  • tabulate region toilet, all exact
  • tab2 region sexhead toilet
  • tab2 region sexhead toilet, all exact
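
The two-way examples can be reproduced on the bundled auto dataset as sketched below (an assumption; rep78, foreign, and headroom replace the course variables):

```stata
* Load the example dataset shipped with Stata (an assumption; not the course data)
sysuse auto, clear

* Frequencies with row percentages
tabulate rep78 foreign, row

* Row and column percentages plus Pearson's chi-squared
tabulate rep78 foreign, row column chi2

* All measures of association plus Fisher's exact test
tabulate rep78 foreign, all exact

* Every pairwise two-way table among the listed variables
tab2 rep78 foreign headroom
```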

26
Basic commands to inspect datasets
  • tabulate summarize
  • It produces one- and two-way tables (breakdowns)
    of means and standard deviations.
  • Syntax
  • tabulate varname1 [varname2] [if] [in] [weight]
    [, summarize(varname3)]

27
Basic commands to inspect datasets
  • Examples
  • One-way tables
  • tabulate region, summarize(hhlandd)
  • tabulate region [aweight=weight],
    summarize(toilet)
  • Two-way tables
  • tabulate region sexhead, summarize(hhlandd)
  • tabulate region sexhead [aweight=weight],
    summarize(hhlandd)
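
A sketch of one-way and two-way breakdowns on the bundled auto dataset (an assumption; its weight variable is used as an analytic weight only for illustration):

```stata
* Load the example dataset shipped with Stata (an assumption; not the course data)
sysuse auto, clear

* One-way: mean, standard deviation, and frequency of price by foreign
tabulate foreign, summarize(price)

* One-way with analytic weights
tabulate foreign [aweight=weight], summarize(mpg)

* Two-way: show only the cell means of mpg
tabulate foreign rep78, summarize(mpg) means
```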

28
Basic commands to create and change variables,
labels etc.
  • generate
  • It creates a new variable. The values of the
    variable are specified by exp.
  • Syntax
  • generate [type] newvar[:lblname] = exp [if] [in]
  • Examples
  • gen agehead2 = agehead*agehead
  • gen agehead3 = agehead*agehead if sexhead == 1
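
As a hedged illustration of generate with and without an if condition, the sketch below uses the bundled auto dataset; the new variable names price2 and mpg_dom are hypothetical.

```stata
* Load the example dataset shipped with Stata (an assumption; not the course data)
sysuse auto, clear

* Squared price, stored as double to keep full precision
generate double price2 = price*price

* With an if condition, observations that fail it are set to missing
generate mpg_dom = mpg if foreign == 0

* Count the observations left missing by the condition
count if missing(mpg_dom)
```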

29
Basic commands to create and change variables,
labels etc.
  • replace
  • It changes the contents of an existing variable.
    Because replace alters data, the command cannot
    be abbreviated.
  • Syntax
  • replace oldvar = exp [if] [in] [, nopromote]
  • Examples
  • replace agehead = 30 if region == 2
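
Because replace overwrites data, it can be safer to practice on a copy of the variable. A minimal sketch, assuming the bundled auto dataset and a hypothetical copy named mpg_cap:

```stata
* Load the example dataset shipped with Stata (an assumption; not the course data)
sysuse auto, clear

* Work on a copy so the original variable stays untouched
generate mpg_cap = mpg

* Cap values above 30 (missing values would also satisfy > 30,
* so the condition excludes them explicitly)
replace mpg_cap = 30 if mpg_cap > 30 & !missing(mpg_cap)

* Check the result
summarize mpg_cap
```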

30
Basic commands to create and change variables,
labels etc.
  • egen
  • It creates newvar of the optionally specified
    storage type equal to fcn(arguments). Here fcn()
    is a function specifically written for egen.
  • Syntax
  • egen [type] newvar = fcn(arguments) [if] [in]
    [, options]

31
Basic commands to create and change variables,
labels etc.
  • Examples
  • egen age4 = mean(agehead)
  • egen test = median(weight - d_bank)
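
A few egen functions sketched on the bundled auto dataset (an assumption; the new variable names are hypothetical). Functions such as mean() and median() take an expression and work down the observations, whereas rowmean() takes a variable list and works across variables within each observation.

```stata
* Load the example dataset shipped with Stata (an assumption; not the course data)
sysuse auto, clear

* Overall mean of price, repeated on every observation
egen mean_price = mean(price)

* Median of mpg within each group of foreign
bysort foreign: egen med_mpg = median(mpg)

* Row-wise mean across the variables from price through rep78
egen row_avg = rowmean(price-rep78)

list make mean_price med_mpg row_avg in 1/5
```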

32
  • To be continued.

Please perform EXERCISE 2