SAS/INSIGHT Demonstration - Exploratory Data Analysis (EDA) techniques

Office of the Actuary. X7933. David.Lassman_at_cms.hhs.gov. 9/7/09.

1
SAS/INSIGHT Demonstration - Exploratory Data
Analysis (EDA) techniques
  • David Lassman
  • Office of the Actuary
  • X7933
  • David.Lassman_at_cms.hhs.gov

2
Introduction
  • Apply exploratory data analysis techniques (using
    SAS/INSIGHT) to improve outlier detection
  • Demonstration
  • Datasets
    • Census Bureau
      • Service Annual Survey
      • Industry estimates
    • American Hospital Association
      • Annual hospital data
    • Baseball data, 1986 statistics
      • Hitters

3
Exploratory Data Analysis (EDA)
  • A set of tools for finding what we might have
    otherwise missed in a set of data (Tukey, 1977)
  • Determine which cases are unusual with respect to
    the bulk of the cases, and follow up on those
    cases.
  • Limited time frame

4
Data Review using Graphical Data Analysis - Benefits
  • Allows you to see the big picture
  • Research/explain data trends
  • Find patterns in your data

5
Techniques
  • Box plot
    • Location, spread, and shape of a distribution
  • Scatter plot
  • Fitting
    • Ordinary least squares regression
  • Transformations
    • Log
    • Square root
    • Other
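
A box plot summarizes location and spread through the quartiles, and a common convention for marking unusual points is Tukey's 1.5 x IQR fences. The following is a minimal Python sketch of that rule, outside SAS/INSIGHT; the function names and the sample values are hypothetical and chosen only for illustration.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Return the values that fall outside the Tukey fences."""
    lower, upper = tukey_fences(values, k)
    return [v for v in values if v < lower or v > upper]


# Hypothetical values, made up for illustration only.
sample = [12, 14, 15, 15, 16, 17, 18, 19, 21, 95]
print("fences:", tukey_fences(sample))
print("flagged:", flag_outliers(sample))
```

Quartile definitions differ between packages, so the fences from this sketch may not match a given box plot exactly.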

6
Demonstration datasets - Service Annual Survey
  • Collect financial data
    • Revenue
    • Expenses
  • Service Industries include
    • Personal Services
    • Business Services
    • Health Care and Social Assistance
    • Trucking and Warehousing

7
Health Care and Social Assistance (NAICS 62)
  • Physicians - NAICS 6211
  • Dentists - NAICS 6212
  • Other health care practitioners - NAICS 6213
    • Chiropractors - NAICS 62131
    • Optometrists - NAICS 62132
  • Outpatient Care Centers - NAICS 6214
  • Home Health Care - NAICS 6216
  • Other Ambulatory Services - NAICS 6219
    (examples: blood/organ banks, health screening
    services, hearing services)
  • Hospitals - NAICS 622
  • Nursing and Residential Care Facilities - NAICS
    623

8
Health Care and Social Assistance (NAICS 62) (cont.)
  • Data items
    • Total Revenue
    • Source of payment
      • Medicare
      • Medicaid
      • Other Government
      • Patient (out-of-pocket)
      • Patient from family (NAICS 623 only)
      • Patient from social security benefits (NAICS 623
        only)
      • Private insurance
      • All other patient care
      • All other services
    • Total Expenses

9
Service Annual Survey
Alphabetic List of Variables and Attributes
 #  Variable     Type  Len  Pos  Format      Informat  Label
17  Description  Char   47  135
 2  ITEM         Char   24  110  24.         24.       ITEM
 1  NAICS        Char    6  104  6.          6.        NAICS
 3  Tax          Char    1  134  1.          1.        Tax
15  ch_00_99     Num     8   88  PERCENT6.2            ch_00_99
14  ch_01_00     Num     8   80  PERCENT6.2            ch_01_00
13  ch_02_01     Num     8   72  PERCENT6.2            ch_02_01
12  ch_03_02     Num     8   64  PERCENT6.2            ch_03_02
11  ch_04_03     Num     8   56  PERCENT6.2            ch_04_03
16  ch_99_98     Num     8   96  PERCENT6.2            ch_99_98
10  r1998        Num     8   48  DOLLAR12.             r1998
 9  r1999        Num     8   40  DOLLAR12.             r1999
 8  r2000        Num     8   32  DOLLAR12.             r2000
 7  r2001        Num     8   24  DOLLAR12.             r2001
 6  r2002        Num     8   16  DOLLAR12.             r2002
 5  r2003        Num     8    8  DOLLAR12.             r2003
 4  r2004        Num     8    0  DOLLAR12.             r2004
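
The ch_YY_XX variables carry a PERCENT format and sit alongside the yearly r-variables, which suggests they hold year-over-year changes (for example, ch_04_03 as the change from r2003 to r2004). That reading is an assumption; the slides do not define the variables. Under it, a minimal Python sketch of computing the changes and flagging large swings could look like this. The function names, the 25% threshold, and the revenue figures are all hypothetical.

```python
def percent_changes(revenue_by_year):
    """Map (year, previous_year) to the fractional change between them."""
    years = sorted(revenue_by_year)
    changes = {}
    for prev, curr in zip(years, years[1:]):
        base = revenue_by_year[prev]
        value = revenue_by_year[curr]
        if base is None or value is None or base == 0:
            # Change is undefined when a value is missing or the base is zero.
            changes[(curr, prev)] = None
        else:
            changes[(curr, prev)] = (value - base) / base
    return changes


def flag_large_changes(changes, threshold=0.25):
    """Return the year pairs whose absolute change exceeds the threshold."""
    return [pair for pair, change in changes.items()
            if change is not None and abs(change) > threshold]


# Hypothetical revenue series for one industry/item row (made-up numbers).
revenue = {1998: 100.0, 1999: 104.0, 2000: 110.0, 2001: 165.0,
           2002: 170.0, 2003: 175.0, 2004: 180.0}
changes = percent_changes(revenue)
print(flag_large_changes(changes))
```

A fixed threshold is only a starting point; comparing each change against the distribution of changes for similar rows is closer to the EDA approach described earlier.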
10
AHA hospital dataset
Alphabetic List of Variables and Attributes
 #  Variable  Type  Len  Pos  Format      Informat    Label
 6  ADMTOT    Num     8   32                          ADMTOT
 5  BDTOT     Num     8   24                          BDTOT
 7  BIRTHS    Num     8   40                          BIRTHS
 2  EXPTOT    Num     8    0  DOLLAR27.2  DOLLAR27.2  EXPTOT
 1  ID        Char   10   48  10.         10.         ID
 8  MNAME     Char   30   58  30.         30.         MNAME
 4  NPAYBEN   Num     8   16  DOLLAR27.2  DOLLAR27.2  NPAYBEN
 3  PAYTOT    Num     8    8  DOLLAR27.2  DOLLAR27.2  PAYTOT
 9  bdgrp     Char   18   88
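
One way to apply the fitting technique to a dataset like this is to regress a size-dependent variable on a size measure and look at the residuals. The Python sketch below fits an ordinary least squares line by hand and reports the observation with the largest absolute residual. It assumes EXPTOT is total expenses and BDTOT is total beds, which the slides do not state; the helper names and the data values are hypothetical.

```python
def ols_fit(x, y):
    """Return (slope, intercept) of the least squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(x, y, slope, intercept):
    """Return observed minus fitted values for each observation."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical hospitals: beds and expenses (made-up numbers, not AHA data).
beds = [100, 200, 300, 400, 500, 600]
expenses = [20, 41, 59, 80, 160, 121]

slope, intercept = ols_fit(beds, expenses)
res = residuals(beds, expenses, slope, intercept)
most_unusual = max(range(len(res)), key=lambda i: abs(res[i]))
print("index with largest absolute residual:", most_unusual)
```

Because an unusual point also pulls the fitted line toward itself, residuals from a single fit are a screening aid for follow-up rather than a verdict.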
11
Baseball
Alphabetic List of Variables and Attributes
 #  Variable  Type  Len  Pos  Label
19  assists   Num     8  112  Assists
 5  atbat     Num     8    0  Times at Bat
12  atbatc    Num     8   56  Career Times at Bat
22  batavg    Num     8  136  Batting Average
23  batavgc   Num     8  144  Career Batting Average
20  errors    Num     8  120  Errors
 6  hits      Num     8    8  Hits
13  hitsc     Num     8   64  Career Hits
 7  homer     Num     8   16  Home Runs
14  homerc    Num     8   72  Career Home Runs
 2  league    Char    1  166  League
 1  name      Char   14  152  Hitter's name
 4  position  Char    2  170  Position(s)
18  putouts   Num     8  104  Put Outs
 9  rbi       Num     8   32  Runs Batted In
16  rbic      Num     8   88  Career Runs Batted In
 8  runs      Num     8   24  Runs
15  runsc     Num     8   80  Career Runs Scored
21  salary    Num     8  128  Salary (in 1000)
 3  team      Char    3  167  Team
10  walks     Num     8   40  Walks
17  walksc    Num     8   96  Career Walks
11  years     Num     8   48  Years in the Major Leagues
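
Variables such as salary are often strongly right-skewed, which is where the log and square root transformations come in. The Python sketch below compares the moment-based skewness of a series before and after each transformation. The skewness helper and the salary values are hypothetical and made up for illustration; they are not taken from the baseball dataset.

```python
import math


def skewness(values):
    """Moment-based skewness: third central moment over sd cubed."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical salaries in thousands (made-up numbers).
salary = [70, 90, 100, 150, 250, 400, 750, 2400]

raw_skew = skewness(salary)
sqrt_skew = skewness([math.sqrt(s) for s in salary])
log_skew = skewness([math.log(s) for s in salary])

print("raw:", round(raw_skew, 2))
print("sqrt:", round(sqrt_skew, 2))
print("log:", round(log_skew, 2))
```

For these values the log pulls in the long right tail more than the square root does, so a point that dominates the raw scale can look far less extreme after transformation.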
12
SAS/INSIGHT
  • Interactive tool for exploring and analyzing
    data.
  • Graphs include box plots, histograms, and scatter
    plots
  • Dynamic - Click on points to identify
  • All graphs and analyses are linked.
  • Brush observations in one window and they are
    highlighted in all windows.
  • Color a point in one graph and it receives the
    same color in all displays.
  • Exclude an observation from calculations, and all
    analyses recalculate automatically.

13
Demonstration
  • Invoke SAS/INSIGHT in one of three ways
    • Type insight in the command box
    • Pull-down menu: Solutions > Analysis > Interactive
      Data Analysis
    • PROC INSIGHT