SAS Lecture 3 - PowerPoint PPT Presentation

Provided by: allp

Transcript and Presenter's Notes


1
SAS Lecture 3
  • Computing normal probabilities in SAS
  • PROC CORR
  • PROC GPLOT

2
Example: Data on the time between machine
failures were collected during a study on machine
performance that involved 39 similar machines.
From the data we compute the sample mean,
23.35 hours, and the sample standard deviation,
1.67 hours.
  • What is the percentage of machines that failed
    before 20 hours?
  • What is the percentage of machines that failed
    after 24 hours?
  • What is the percentage of machines with failure
    time between 20 and 22 hours?
  • How short should the failure time be for a
    machine to be in the bottom 10%?

3
Computing the normal probabilities using SAS
  • data normal;
  • input x @@;
  • mean=23.35;
  • stdev=1.67;
  • z=(x-mean)/stdev;
  • pn=probnorm(z);
  • invpn=1-probnorm(z);
  • label pn="the normal dist. func. at x";
  • label z="the standardized value";
  • label invpn="the area to the right of x";
  • datalines;
  • 20 24
  • run;
  • proc print label;
  • run;

4
SAS output
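                                  the         the normal     the area
                              standardized    dist. func.     to the
  Obs    x     mean    stdev      value          at x        right of x
   1    20    23.35    1.67     -2.00599       0.02243        0.97757
   2    24    23.35    1.67      0.38922       0.65144        0.34856

Answer to question 1: 0.02243, so about 2.2% of machines failed before 20 hours.
Answer to question 2: 0.34856, so about 34.9% of machines failed after 24 hours.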
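
Question 3 is not answered by the output above. A minimal sketch of one way to get it, reusing the same PROBNORM approach: the area between 20 and 22 hours is the normal distribution function at 22 minus its value at 20. The data set name between and the variables lower, upper and pbetween are illustrative names, not taken from the original slides.

```sas
/* Sketch only: names below are illustrative assumptions */
data between;
  mean=23.35;
  stdev=1.67;
  lower=20;
  upper=22;
  /* P(lower < X < upper) = Phi(z_upper) - Phi(z_lower) */
  pbetween=probnorm((upper-mean)/stdev) - probnorm((lower-mean)/stdev);
  label pbetween="the area between lower and upper";
run;

proc print data=between label;
run;
```

With these inputs the result should come out at roughly 0.19, that is, about 19% of machines.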
5
Computing the normal percentiles using SAS
  • data percentile;
  • input p @@;
  • mean=23.35;
  • stdev=1.67;
  • ip=probit(p);
  • ip=mean+stdev*ip;
  • label ip="the p-th percentile";
  • datalines;
  • 0.1
  • proc print label;
  • run;

6
SAS output
                               the p-th
  Obs     p     mean    stdev   percentile
   1     0.1   23.35    1.67     21.2098

Answer to question 4: a machine must fail before about 21.21 hours to be in the bottom 10%.
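
The percentile can be checked by hand. PROBIT(p) returns the standard normal quantile z_p, and the DATA step rescales it with the sample mean and standard deviation. A worked version, assuming the rounded table value z_{0.1} ≈ -1.28155 (this value is not on the slides):

```latex
x_{p} = \bar{x} + s\,z_{p},
\qquad
x_{0.1} \approx 23.35 + 1.67 \times (-1.28155) \approx 21.2098
```

This agrees with the value printed by PROC PRINT.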
7
SAS procedures for scatter plots and correlation
  • PROC CORR
  • The CORR procedure is a statistical procedure for
    numeric random variables that computes Pearson
    correlation coefficients and some descriptive
    statistics. The correlation statistics include
  • PROC CORR DATA=dataset-name;
  • BY <DESCENDING> variable-1 <... variable-n>;
  • VAR variable(s);
  • WITH variable(s);

8
  • data one;
  • input time line step device;
  • linet=line/1000;
  • datalines;
  • 0.0893 266 2 1
  • 0.0386 120 1 1
  • 0.0988 245 2 1
  • 0.026 102 1 2
  • 0.041 307 2 2
  • 0.0196 143 1 2
  • proc corr;
  • var time line step;
  • run;

9
  The CORR Procedure

  3 Variables:   time   line   step

  Simple Statistics

  Variable   N        Mean     Std Dev       Sum     Minimum     Maximum
  time       6     0.05222     0.03349   0.31330     0.01960     0.09880
  line       6   197.16667    86.06374      1183    102.0000   307.00000
  step       6     1.50000     0.54772       9.0      1.0000     2.00000

  Pearson Correlation Coefficients, N = 6
  Prob > |r| under H0: Rho=0

              time       line       step
  time     1.00000    0.61490    0.78996
                       0.1939     0.0615
  line     0.61490    1.00000    0.96099
            0.1939                0.0023
  step     0.78996    0.96099    1.00000
            0.0615     0.0023
10
  • proc corr;
  • var time;
  • with line step;
  • run;
  • Produces the correlations between time and line,
    and time and step only.

  The CORR Procedure

  2 With Variables:   line   step
  1      Variables:   time

  Simple Statistics

  Variable   N        Mean     Std Dev      Sum     Minimum    Maximum
  line       6   197.16667    86.06374     1183    102.0000    307.000
  step       6     1.50000     0.54772     9.00      1.0000     2.0000
  time       6     0.05222     0.03349    0.313      0.0196     0.0988

  Pearson Correlation Coefficients, N = 6
  Prob > |r| under H0: Rho=0

              time
  line     0.61490
            0.1939
  step     0.78996
            0.0615
11
  • proc sort;
  • by device;
  • proc corr;
  • by device;
  • var time line;
  • run;
  • The BY statement specifies the variable that the
    procedure uses to form BY groups. The data need
    to be sorted first by the BY variable. This
    procedure will compute the correlation between
    time and line for the two groups of data defined
    by the variable device.

12
  ----------------------- device=1 -----------------------

  The CORR Procedure

  2 Variables:   time   line

  Pearson Correlation Coefficients, N = 3
  Prob > |r| under H0: Rho=0

              time       line
  time     1.00000    0.96086
                       0.1787
  line     0.96086    1.00000
            0.1787

  ----------------------- device=2 -----------------------

  The CORR Procedure

  2 Variables:   time   line

  Pearson Correlation Coefficients, N = 3
  Prob > |r| under H0: Rho=0

              time       line

13
PROC GPLOT
  • SYMBOL is a global statement that controls the plot
    display.
  • SYMBOL<1 | 2 | 3 ... 99>
  • <COLOR=symbol-color>: controls the point color
  • <VALUE=special-symbol | text-string | NONE>:
    changes the plotting symbol
  • <INTERPOL=JOIN>: joins the points with a line
  • <INTERPOL=R>: draws the regression line through
    the cloud of points
  • PROC GPLOT creates a scatter plot for two
    variables.
  • PROC GPLOT <DATA=input-data-set>;
  • PLOT yvar*xvar<=zvar>; (the vertical-axis variable
    comes first; zvar is an optional grouping variable)
  • BY variables; (constructs a different plot for
    each group defined by the BY variables)
14
Possible values for symbol
  • You can use any of the following symbols:
  • Asterisk Diamond
  • Plus Square
  • Circle Dot
  • Point Star
  • Club Heart
  • Spade Triangle

15
  • symbol interpol=r value=dot color=red;
  • proc gplot;
  • plot brate*lgnp;
  • run;

16
  • proc sort;
  • by cg;
  • run;
  • proc gplot;
  • plot brate*gnp=cg;
  • run;

Add a categorical variable
17
  • symbol1 interpol=none value=E color=red;
  • symbol2 interpol=none value=S color=black;
  • symbol3 interpol=none value=G color=green;
  • symbol4 interpol=none value=M color=blue;
  • symbol5 interpol=none value=A color=magenta;
  • symbol6 interpol=none value=F color=brown;
  • proc gplot;
  • plot brate*gnp=cg;
  • run;
18
  • To change the scales and tick marks on the axes:
  • data kilowatt;
  • input kwh ac dryer @@;
  • datalines;
  • 35 1.5 1 63 4.5 2 66 5.0 2 17 2.0 0 94 8.5 3 79 6.0 3
  • 93 13.5 1 66 8.0 1 94 12.5 1 82 7.5 2 78 6.5 3 65 8.0 1
  • 77 7.5 2 75 8.0 2 62 7.5 1 85 12.0 1 43 6.0 0 57 2.5 3
  • 33 5.0 0 65 7.5 1 33 6.0 0
  • proc gplot data=kilowatt;
  • axis1 order=(0 to 20 by 5);
  • axis2 order=(0 to 100 by 20);
  • symbol value=plus color=blue;
  • plot kwh*ac / haxis=axis1 vaxis=axis2;
  • title "Plot of KWH against AC";
  • run;