SAS Macros are the Cure for Quality Control Pains - PowerPoint PPT Presentation

1 / 31
About This Presentation
Title:

SAS Macros are the Cure for Quality Control Pains

Description:

An ongoing effort for validation, improvement and facilitation ... Forgetful. Busy. Lazy. Don't Care. Why Macros. Minimal Effort. Parameters. Available (FREE) ... – PowerPoint PPT presentation

Number of Views:97
Avg rating:3.0/5.0
Slides: 32
Provided by: DASC8
Category:

less

Transcript and Presenter's Notes

Title: SAS Macros are the Cure for Quality Control Pains


1
SAS Macros are the Cure for Quality Control Pains
  • Gary McQuown
  • Data and Analytic Solutions

2
Rants and Raves of a SAS Programmer
3
Purpose
  • I. Quality Control
  • II. SAS Macros for Quality Control
  • III. Sources of SAS Macros and QC Code

4
I. Quality Control
  • An ongoing effort for validation, improvement
    and facilitation of the data related process to
    insure that data meets the business needs.

5
Quality Control
  • Quality control means you can have what you
    need, how you need it, when you need it. E.
    Demming

6
Why Practice QC?
  • It Saves Time
  • It Saves Money
  • It Makes Money
  • Ignorance is not Bliss

7
How Data Goes Bad
  • Bad Genes .. Poor design and collection
  • Adoption Someone Elses Design
  • Child Abuse ... Poorly Nurtured
  • Terrible Teens ... Growing Pains

8
The QC Process
  • Define Requirements
  • Identify Data Issues
  • Analyze Options
  • Improve Data Quality
  • Document every step and repeat

9
Define Requirements
  • What do you need?
  • Requires an understanding of the business
    process, the data, the operating system and the
    users.
  • Documentation, business specs and experts.

10
Devils Advocate
  • What is correct for one task / group may be
    incorrect for another.
  • What is correct now may be incorrect later.
  • What is correct now ... may not be able to be
    repeated.

11
Identify Data Issues
  • Accuracy
  • Completeness
  • Consistency
  • Timeliness
  • Uniqueness
  • Validity

12
G Good F Fair B Bad
13
Analyze Options
  • What do you need?
  • What do you have?
  • What changes need to be made?
  • Will you break anything along the way?

14
Improve Data Quality
  • Selective Processing
  • Clean Existing Values
  • Correcting Existing Values
  • Delete bad data
  • Add additional data
  • Document original and new values.

15
Documentation
  • Design Process ... business specs
  • As You Go ... in the code, log, email
  • Input and Output files (Freqs Means)
  • Modifications .... as per xxx , email
  • Exceptions (Errors and Issues)
  • Users Manual
  • Elizabeth Axelrod ... Big D

Just Shoot Them
16
General Suggestions
  • Drive Out Fear
  • Early Intervention
  • Obtain Buy In from all parties
  • Keep it Simple ... use macros
  • Be consistent use macros
  • Monitor results
  • Document everything, every time

17
II. SAS Macros
  • Macros allow you to use, re-use and share
    object-oriented code.
  • QC is very redundant .... the same or similar
    process performed on each data set, each variable
    and each process.

18
Reality
  • People are
  • Ignorant
  • Forgetful
  • Busy
  • Lazy
  • Dont Care

19
Why Macros
  • Minimal Effort
  • Parameters
  • Available (FREE)

20
FREQOUT Produces Frequencies for multiple
variables FREQOUT (data / input dataset
name /, out freqout / output data set name
, vars / list of variables /, by /
list of by variables /, fmtassign / var fmt
var fmt /, debugging NO / YES or NO /
Author Ian Whitlock Location
www.lexjansen.com and sconsig.com
21
EAP_RPT
EAP_RPT (DSN, LIBIN , LIBOUT, _VARS ,
_FMTS)
DSN Name of
input SAS data set LIBIN SAS
library of input data set
LIBOUT SAS library of output data
set _VARS list of character variables to
review .. paired with _FMTS _FMTS list of
formats to apply ... paired with _VARS Example

EAP_RPT(_VARS AGE
INCOME EDUCATION , _FMTS AGE
INC EDU , LIBIN
PROJ_IN , LIBOUT PROJ_OUT
, DSN STUDY_1)
22
  • DATA CLEANING
  • TIP00128a - Cleansing Macro, Data Scrubbing
    routine (see tip 00128 for more)
  •  
  • cleanse(schlibwork, schema, strlen50, var,
    targettarget, replacereplace,
  • casenocase)
  • Author Charles Patridge
    Version 2.1 (sug. by Ian Whitlock)
  • Location www.sconsig.com

23
REMOVE OUTLIERS
outlier ( data
_SAS_dataset_name_,
out _SAS_output_dataset_name
var
_variable_to_screen
pass _number_of_passes
except
_exception_report_data_set_,
mult _multiplier_of_standard_dev
iations_)

The OUTLIER macro completes outlier screens
based on statistical values of a numeric
variable in a SAS data set. It is set up to
remove any outlier records that are within a
given number of Standard Deviations from the
mean, and will run that screen a given number of
times. For example, a "3-Pass-2" outlier screen
will remove any values outside 3 standard
deviations from the mean, and will run that
outlier screen twice. The given numbers can be
any integer. Author Unknown Location
www.spikeware.com
24
  • CONT_COMPARE

  • Compares two data sets, list all variables and
  • reports potential issues
  • Fields in Both
  • Type
  • Length
  • cont_compare (dsn1, dsn2)

25
KEEPDBLS Documents Duplicates TIP000367-
KeepDbls
MACRO KeepDbls
(SourceDs _LAST_,
TargetDs ,
Overwrit N,
IdList ,
Where ) Moves duplicate
observations to another file. Author Jim
Groeneveld Location www.sconsig.com
26
CK_MISSING Evaluates variables in regards to
missing and non missing status. Default
_numeric_ missing. _character_
missing. Parms DSN libname and name of data
set. Default is the last read/created. PATH path
to directory where QC info is stored. VAR list
of variables to b evaluated. FMT format
statment. ck_missing( dsnmylib.recentfile,
varUPB FICO1 FICO2 FICO3 CHANNEL,
fmtUPB upb. FICO1 FICO2 FICO3 fico. CHANNEL
chnl. )
27
LOG FILTER Examines and Reports on SAS Log
Log Filter
checks your log for errors, warnings, and
other "interesting" messages. It then displays
what it finds in its summary window.
Double-click on a row and it'll reposition
the log window to display the message in
context (if it's an external log file, it'll
open it in a viewer window and position it
for you). Author Ratcliffe Location
http//ratcliffe.co.uk/rest_logfilt.htm
28
MK_FORMATS Create a format from a SAS data
set. Parms DSN SAS data set START Unique
key value ie. SSN LABEL Value to be
associated with start ie. Full Name with
SSN FMTNAME Name of Format (sans ".") TYPE C
or N for Character or Numeric LIBRARY Libname
of Format Library (default work) OTHER
Value to supply for missing (default OTHER)
29
III. Sources of SAS Macros and QC Code
  • www.sas.com (examples)
  • www.lexjansen.com (proceeding)
  • www.sconsig.com
  • www.ratcliffe.co.uk
  • www.statetechservices.com
  • www.spikeware.com

30
More Sources
www.mcw.edu/pcor/rsparapa/sasmacro.html
www.math.yorku.ca/scs/friendly.html
www.stat.ncsu.edu/sas/samples/index.html www.dasc
onsultants.com SAS-L Books By UsersRon Codys
Data Cleaning Numerous books on Macros .... By
Example
31
Questions ?
  • Gary McQuown
  • mcquown_at_DASconsultants.com
  • www.DASconsultants.com
Write a Comment
User Comments (0)
About PowerShow.com