Title: Class 2 Lesson Plan
1Class 2 Lesson Plan
- Syntax Basics
- Statements
- Comments
- Variables and Variable Types
- Constants
- Expressions
- Introduction to Reading Data
- DATA Statement
- INPUT Statement
- SET Statement
2Class 2 Lesson Plan
- Debugging and Editing
- Error messages
- Correcting errors
- Resolving common errors
3Syntax Basics - Statements
- Smallest unit of work in SAS.
- SAS is not case sensitive.
- All statements MUST end with a semicolon (;).
- SAS ignores extra spaces, carriage returns, and blank lines; a statement ends only at its semicolon.
4Syntax Basics
- Always end with a semicolon
- Usually begin with a SAS keyword
- Can begin anywhere on a line
- Can continue over several lines (or several statements can share one line)
- Can skip lines
- Use blanks or special characters to separate words
- Can use more than one blank space
5Syntax Basics - Statements
- Examples of different programming styles
- data one; infile 'c:\new.raw'; input case 1-10 age 11-12;
-
- data one;
- infile 'c:\new.raw';
- input case 1-10 age 11-12;
-
- data one;
-    infile 'c:\new.raw';
-    input case 1-10
-          age 11-12;
6Syntax Basics - Statements
- RUN;
- Tells SAS that a DATA or PROC step is over and needs to be run.
- It's usually needed at the end of each DATA or PROC step, or else you'll confuse poor, stupid SAS.
- Rule of thumb: End every DATA or PROC step with RUN.
7Syntax Basics - Statements
- data one;
- infile 'c:\new.raw';
- input case 1-10
- age 11-12;
- run;
- proc print data=one;
- run;
- proc means data=one;
- var age;
- run;
8Syntax Basics - Statements
- ENDSAS;
- Same as RUN, but also ends the SAS session (i.e. quits SAS and sends you back into the operating system).
- I've never used this statement; it has limited usefulness with SAS for Windows.
- It may be more useful for SAS on mainframes.
9Syntax Basics - Comments
- Comments are not read by SAS, so you can write anything without having SAS tell you that you screwed up.
- It's good practice to use comments extensively to organize your program for yourself and others.
- There are 2 ways to identify comments.
10Syntax Basics - Comments
- Use /* ... */
-
- /* text of comment */
- Use * ... ;
- * text of comment ;
11Syntax Basics - Comments
- Examples of comments: Headers
- /*
- Sample SAS Header
- First Created: August 31, 2003
- By: Han S. Kim
- Located: c:/sasfiles/class2
-
- Modification history
- Person   Date         Changes
- HSK      09/01/2003   added race
- */
12Syntax Basics - Comments
- Describe lines in the DATA step
- DATA temp;
- infile 'c:/sasfiles/class2/exam.txt';
-
- * Read variables sampid, case, sex, age;
- input sampid case sex age;
- run;
13Syntax Basics - Comments
- Or you can
- DATA temp;
- infile 'c:/sasfiles/class2/exam.txt';
-
- input sampid case sex age; * Read variables;
- run;
14Syntax Basics - Comments
- But don't
- DATA temp;
- infile 'c:/sasfiles/class2/exam.txt';
-
- input sampid case   * Read variables
-       sex age;      * sampid, case, sex, age;
- run;
15Syntax Basics - Comments
- Describe PROC steps
- * Read variables sampid, case, sex, age into
-   Datafile temp;
- DATA temp;
- infile 'c:/sasfiles/class2/exam.txt';
- input sampid case sex age;
- run;
- * Conduct univariate analysis of DATA temp on age;
- PROC univariate data=temp;
- var age;
- run;
16Syntax Basics - Comments
- Or comment out statements during program development
- DATA temp;
- infile 'c:/sasfiles/class2/exam.txt';
- input sampid case sex age;
- run;
- /*PROC univariate data=temp;
- var age;
- run;*/
17Syntax Basics - Comments
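- But be aware of the differences in the 2 comment methods
- DATA temp;
- infile 'c:/sasfiles/class2/exam.txt';
- input sampid case sex age;
- run;
- *PROC univariate data=temp;
- var age;
- run;
- A * comment ends at the first semicolon, so only the PROC statement is commented out here; var age; and run; are still executed. A /* ... */ comment disables everything between its delimiters.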
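The difference between the two comment styles can be sketched as follows. This is a minimal illustration, assuming a dataset named temp with a numeric variable age already exists; the PROC step is the one used in the slides.

```sas
/* Block comment: everything between the delimiters is skipped, */
/* so the whole PROC step below is disabled.                    */
/*
proc univariate data=temp;
   var age;
run;
*/

* Statement comment: it ends at the first semicolon, so every
  statement you want to disable needs its own asterisk;
* proc univariate data=temp;
* var age;
* run;
```

If only the first line of the second version carried an asterisk, SAS would still try to execute var age; and run; on their own.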
19Syntax Basics - Variables
- Variables are the names of the columns
- Each observation, or row, has a value for each variable.
- Examples of variables: sample ID, age, sex, smoking status.
- Examples of values for variables: 232112, 34, male, never.
20Syntax Basics - Variables
- Some rules
- Variable names must be 32 or fewer characters in length (SAS v8 or higher)
- Names must start with an underscore or letter, NOT A NUMBER
- Names must contain only underscores, letters and numbers; no special characters such as ! or @.
- Names are case insensitive
21Syntax Basics - Variables
- Variable Series
- Be lazy! Save coding time!
- Shorthand way to identify a large number of variables
- The order that SAS reads the variables is important!
- There are seven List Styles
22Syntax Basics Variables
- Example program:
- input x1 x2 x3 name $ type $ y z x4;
- sum1_3 = x1 + x2 + x3;
- x4 = x4 / 2;
23Syntax Basics - Variables
- input x1 x2 x3 name $ type $ y z x4;
- sum1_3 = x1 + x2 + x3;
- x4 = x4 / 2;
- VARn-VARx
- x1-x3 refers to the variables x1, x2 and x3
- x1-x4 refers to the variables x1, x2, x3 and x4
- x4-x1 gives an error
- x01-x3 gives an error
24Syntax Basics - Variables
- input x1 x2 x3 name $ type $ y z x4;
- sum1_3 = x1 + x2 + x3;
- x4 = x4 / 2;
- VARn--VARx
- x1--x3 refers to x1, x2 and x3
- x1--x4 refers to x1, x2, x3, name, type, y, z and x4
- x3--x4 refers to x3, name, type, y, z and x4
25Syntax Basics - Variables
- input x1 x2 x3 name $ type $ y z x4;
- sum1_3 = x1 + x2 + x3;
- x4 = x4 / 2;
- VARa-numeric-VARb
- x1-numeric-x4 refers to x1, x2, x3, y, z and x4
- name-numeric-sum1_3 refers to y, z, x4 and sum1_3
26Syntax Basics - Variables
- input x1 x2 x3 name $ type $ y z x4;
- sum1_3 = x1 + x2 + x3;
- x4 = x4 / 2;
- VARa-character-VARb
- x3-character-y refers to name and type
- name-character-x4 refers to name and type
- x1-character-x3 gives an error
27Syntax Basics - Variables
- input x1 x2 x3 name $ type $ y z x4;
- sum1_3 = x1 + x2 + x3;
- x4 = x4 / 2;
- _character_
- Refers to all character variables: name and type.
28Syntax Basics - Variables
- input x1 x2 x3 name $ type $ y z x4;
- sum1_3 = x1 + x2 + x3;
- x4 = x4 / 2;
- _numeric_
- Refers to all numeric variables: x1, x2, x3, y, z, x4 and sum1_3.
29Syntax Basics - Variables
- How do we use variable lists?
- Inputting data
- data temp;
- infile 'c:/sasdata/class2/data.txt';
- input sampid multi1-multi3 nsaid1-nsaid3;
- run;
30Syntax Basics - Variables
- How do we use variable lists?
- PROC statements
- proc means data=temp;
- var x1-numeric-x4;
- run;
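A few of the list styles can be tried on a small in-stream dataset. This is only a sketch: the variable names follow the example program from the earlier slides, and the two data rows are made-up sample values.

```sas
data temp;
   input x1 x2 x3 name $ type $ y z x4;
   sum1_3 = x1 + x2 + x3;
   datalines;
1 2 3 Ann A 4 5 6
7 8 9 Bob B 1 2 3
;
run;

proc print data=temp;
   var x1-x3;            /* numbered range: x1 x2 x3 */
run;

proc print data=temp;
   var name--y;          /* positional range: name type y */
run;

proc means data=temp;
   var _numeric_;        /* every numeric variable in the dataset */
run;
```

The positional range depends on the order in which SAS first saw the variables, which is why the order of the INPUT statement matters.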
31Syntax Basics Data Types
- Two types of data: numeric or character (string).
- Numeric variables can only contain numbers.
- Choose numeric if any arithmetic or calculations are required.
- Character variables can have anything.
- Choose character if you have names, addresses, quotations, descriptions, etc.
32Syntax Basics Data Types
- Notes on character variables
- Character variables ARE case-sensitive.
- Character variables must be precise; each of these is a different value:
- 'Keanu Reeves'
- ' Keanu Reeves'
- 'Keanu  Reeves'
- 'keanu reeves'
- 'KEANU REEVES'
33Syntax Basics Data Types
- Why use a character variable for numeric data?
- Character variables use fewer computer resources.
- Encourages reasoned planning of analyses.
34Syntax Basics Missing Data
- Missing data can occur because
- Data is lost.
- Respondent did not answer question.
- Structural issues, e.g. HRT use in men.
- Remember that there is a difference between a missing value and a zero (0) value.
35Syntax Basics Missing Data
- For numeric variables, the missing datapoint is represented by a period (.) or a blank space where SAS expects a value.
- For character variables, the missing datapoint is represented by a blank space where SAS expects a value.
- When referenced within the program, a missing character value is represented by empty quotes ('').
36Syntax Basics Missing Data
- data patients;
- input patient_id 1-5 sex $ 7 age 9-10;
- cards;
- 10021 M 24
- 10022   33
- 10023 F  .
- 10024 M
- 10025 M 43
- ;
- if sex = '' then sex = 'unknown';
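To actually run this example, two details matter: statements that act on the data have to appear before the cards (datalines) block, and sex needs a length long enough to hold 'unknown'. A minimal sketch under those assumptions (the LENGTH statement is an addition, not part of the slide):

```sas
data patients;
   length sex $ 7;                        /* room for 'unknown' (added assumption) */
   input patient_id 1-5 sex $ 7 age 9-10;
   if sex = '' then sex = 'unknown';      /* recode missing character value */
   datalines;
10021 M 24
10022   33
10023 F  .
10024 M
10025 M 43
;
run;

proc print data=patients;
run;
```

Patient 10022 has a blank in column 7, so sex is read as missing and then recoded; patients 10023 and 10024 end up with a missing age.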
37Syntax Basics Missing Data
- How does SAS deal with missing data?
- In calculations, if one of the terms is missing, then the target is also missing.
- In PROC steps, SAS does not use missing values in its calculations, although with some PROCs, you can force SAS to use missing values.
- In DATA steps, variables that you create within the program are missing until calculations are completed.
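The first two points can be checked with a tiny dataset. This is a sketch with made-up values; the names demo, a, b and total are hypothetical.

```sas
data demo;
   input a b;
   total = a + b;        /* missing whenever a or b is missing */
   datalines;
1 2
3 .
;
run;

proc means data=demo n mean;
   var a b total;        /* N counts only the non-missing values */
run;
```

In the second row b is missing, so total is missing as well, and PROC MEANS should report N = 1 for both b and total.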
38Syntax Basics Constants
- You can designate constants in SAS, both numeric
and character.
39Syntax Basics Expressions
- We can now use variables and/or constants to recode, calculate, and create new variables.
- Examples...
40Syntax Basics Expressions
- age = 2003 - year_of_birth;
- if age lt 18 then agecat = 'Under 18';
- bmi = weight_kg/(height_m**2);
- if sex = 'F' and hrt_use = 1 then hrt_dose = dosage * freq;
41Syntax Basics Expressions
- General Form
- (Variable | Constant | Expression) operator (Variable | Constant | Expression)
42Syntax Basics Expressions
43Syntax Basics - Expressions
- Some notes on arithmetic operations
- If any of the values are missing, the result of the operation will also be missing.
- Division by zero will be indicated by the SAS log, but it will not stop the program. It will instead create a missing value.
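A short sketch of the division-by-zero behavior, using made-up values (the names ratio, num, den and result are hypothetical):

```sas
data ratio;
   input num den;
   result = num / den;   /* den = 0: result is set to missing, step keeps running */
   datalines;
10 2
5 0
;
run;

proc print data=ratio;
run;
```

The second observation should show a missing result, and the log should carry a note about the division by zero rather than an error that stops the step.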
44Syntax Basics Expressions
45Syntax Basics - Expressions
- state='PA' coast=. ocean='AT' tidal=89
- Expression           Evaluation
- state='PA'           true
- ocean='at'           false
- state^='MA'          true
- tidal ge 89          true
- tidal lt 89          false
- coast lt tidal       true
46Syntax Basics Expressions
47Syntax Basics - Expressions
- state='PA' coast=. ocean='AT' tidal=89
- Expression                      Evaluation
- ocean='AT' and state='PA'       true
- ocean='AT' & state='PA'         true
- ocean='AT' or state='UT'        true
- tidal gt 100 | coast gt 250     false
- state in ('PA','MA','RI')       true
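These evaluations can be reproduced in a DATA step, since SAS stores the result of a comparison as 1 (true) or 0 (false). A minimal sketch using the values from the slide; the names t1 to t5 are hypothetical.

```sas
data _null_;
   state = 'PA';
   coast = .;
   ocean = 'AT';
   tidal = 89;
   t1 = (ocean = 'AT' and state = 'PA');
   t2 = (ocean = 'AT' or state = 'UT');
   t3 = (tidal gt 100 or coast gt 250);
   t4 = (state in ('PA', 'MA', 'RI'));
   t5 = (coast lt tidal);   /* a missing numeric sorts below any number */
   put t1= t2= t3= t4= t5=; /* writes the five results to the log */
run;
```

Note that character comparisons are case-sensitive, so ocean = 'at' would evaluate to 0 here.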
48Syntax Basics - Expressions
- Complex Operations
- Order of operation (arithmetic)
- **, *, /, +, -
- Order of operation (comparison)
- not (^), and (&), or (|)
- Use parentheses to control order of operation. SAS will calculate operations inside parentheses first, regardless of the arithmetic or comparison order of operation.
49Syntax Basics - Expressions
- a=4 b=-2 c=2 d=1 e=3 f=0
- Expression
- a**2 / b + c**2 * d - 2
- 16 / (-2) + 4 * 1 - 2
- -8 + 4 - 2
- -6
- a**(2/b) + c**(2*d) - 2
- 4**(-1) + 2**(2) - 2
- .25 + 4 - 2
- 2.25
50Syntax Basics - Expressions
- a=4 b=-2 c=2 d=1 e=3 f=0
- Expression
- a**(2 / (b + c**2)) * d - 2
- 4**(2 / (-2 + 4)) * 1 - 2
- 4**(2 / 2) * 1 - 2
- 4**(1) * 1 - 2
- 4 * 1 - 2
- 4 - 2
- 2
51Syntax Basics - Expressions
- Another way to control complex operations: break them down into sub-statements
- F = a**(2 / (b + c**2)) * d - 2;
- OR
- x = b + c**2;
- y = 2/x;
- F = a**y * d - 2;
-
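The two approaches can be compared directly. This sketch assumes the expression reads a**(2 / (b + c**2)) * d - 2 (the operators were lost from the slide text, so that reading is an assumption) and uses the values a=4, b=-2, c=2, d=1; the names f_one_line and f_stepwise are hypothetical.

```sas
data _null_;
   a = 4; b = -2; c = 2; d = 1;

   /* Everything in one statement, grouped with parentheses */
   f_one_line = a**(2 / (b + c**2)) * d - 2;

   /* The same calculation split into sub-statements */
   x = b + c**2;
   y = 2 / x;
   f_stepwise = a**y * d - 2;

   put f_one_line= f_stepwise=;   /* both values are written to the log */
run;
```

The stepwise form is longer, but each intermediate value (x, y) can be printed and checked on its own.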
52Data Step Reading the Data
- SAS can read data from a number of data types and sources.
- Data read can be fixed or list.
- Fixed data is data in which the location is fixed and consistent
- List data is data separated by commas, tabs or spaces
53Data Step Reading the Data
- Fixed data is data in which the location is fixed and consistent
- ----5----0----5----0----5----0----5----0----5----0
- 12210 male   case    8410309061968ma
- 12211 female control 8410803051968mi
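Fixed data like this is read by giving each variable its column range. The sketch below assumes the two records are aligned as shown in the code (the original spacing did not survive), and the variable names and column positions are hypothetical.

```sas
data subjects;
   /* Column input: each variable is tied to fixed column positions */
   input id 1-5 sex $ 7-12 status $ 14-20
         zip 22-26 dob $ 27-34 state $ 35-36;
   datalines;
12210 male   case    8410309061968ma
12211 female control 8410803051968mi
;
run;

proc print data=subjects;
run;
```

Because the positions are fixed, values such as the zip code, birth date and state can sit next to each other without any separator.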
54Data Step Reading the Data
- List data is data separated by commas, tabs or spaces
- 12210, male, case, 84103, 09061968, ma
- 12211, female, control, 84108, 03051968, mi
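List data is read by naming the variables in order and telling SAS which delimiter separates them. A minimal sketch for the comma-separated records above; the variable names are hypothetical, and DLM= is the INFILE option that sets the delimiter.

```sas
data subjects;
   infile datalines dlm=',';   /* fields are separated by commas */
   input id sex $ status $ zip dob $ state $;
   datalines;
12210, male, case, 84103, 09061968, ma
12211, female, control, 84108, 03051968, mi
;
run;

proc print data=subjects;
run;
```

With list input the values do not need to line up in columns, but every record has to supply the fields in the same order.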
55Data Step Reading the Data
- DATA Statement
- Tells SAS the DATA step is starting.
- Names the SAS dataset.
- Sets variables used in the dataset to missing
values.
56Data Step Reading the Data
- Format
- DATA dataset_name (options...);
- Two elements of the DATA statement
- The dataset_name is the name of the dataset. Naming restrictions are the same as variables.
- Options on controlling data.
57Data Step Reading the Data
- Identifying the data source
- The INFILE statement
- INFILE fileref options;
- The fileref is the pathway and filename of the external file that you want to import and convert into a SAS datafile.
58Data Step Reading the Data
- Identifying the data source
- The FILENAME statement
- FILENAME fileref 'pathname\filename';
- The fileref is a name you want to call this external dataset, and filename is the location and name of the external file that you want to import and convert into a SAS datafile.
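FILENAME and INFILE are typically used together: FILENAME assigns the nickname once, and INFILE then refers to it. A minimal sketch, assuming the raw file from the earlier slides lives at c:\new.raw; the fileref rawdata is a hypothetical name.

```sas
filename rawdata 'c:\new.raw';   /* assumed path; point this at your own file */

data one;
   infile rawdata;               /* refer to the file by its fileref */
   input case 1-10 age 11-12;
run;
```

If the file moves, only the FILENAME statement has to change, which is handy when several DATA steps read the same file.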
59Data Step Reading the Data
- Identifying the data source
- The CARDS statement
- cards;
- 123 case female 32
- 234 control male 45
- ;
- Use this to enter small datasets directly into the data step.
60Data Step Reading the Data
- Describing the data - INPUT
- INPUT tells SAS how to read the data from the dataset identified, and what to name it in SAS.
- You can specify the location of the variable within the dataset, the name of the variable, and whether it's character or numeric.
- You can also specify decimal places if necessary.
61Data Step Reading the Data
- Format of INPUT statement
-
- INPUT var start[-end] ...;
- INPUT var [modifier] start[-end] [.dec] ...;
- INPUT var [modifier] ...;
- INPUT var ...;
- Examples of INPUT statements
- input name $ 1-20 grade 22 status 24;
- input city $ state $ zip;
- input month day year low_temp high_temp;
- input loc 1-4 span 8-17 .1 rank;
62Data Step Reading the Data
- Notes on the INPUT Statement
- You can input using both fixed and list formats in the same statement.
- Decimal places can be specified in the input statement using .dec. If the decimal point is already in the data, it supersedes the .dec specification in the INPUT statement. If .dec is not used, the decimal places in the data are kept.
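The decimal rule can be seen with two records, one without a decimal point and one with. This is a sketch with made-up values; the names spans and span are hypothetical.

```sas
data spans;
   input span 1-5 .1;    /* one implied decimal place */
   datalines;
  123
 12.3
;
run;

proc print data=spans;
run;
```

The first record has no decimal point, so the .1 specification applies and 123 is read as 12.3. The second already contains a decimal point, which takes precedence, so it is also read as 12.3.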
63Data Step Reading the Data
- Errors in data
- Bad data happens. O is entered as 0, columns don't line up, caps are not consistent, different formats are used to enter numbers (use of commas for thousands, $ for currency, etc.)
- How do we deal with this?
64Exercise in Reading Data
- Create a SAS program that will
- Input the following data
- Calculate age as of 2004
- Calculate height in inches (1 inch = 2.54 cm)
65Error Types
- Syntax Error
- Statements do not conform to SAS language rules
- Data Error
- Logic errors
66Some Common Errors