Exploratory Data Analysis EDA - PowerPoint PPT Presentation

Slides: 16
Provided by: aful3

1
Section 3-5
  • Exploratory Data Analysis (EDA)

2
EXPLORATORY DATA ANALYSIS
Exploratory data analysis (EDA) is the process of
using statistical tools (such as graphs, measures
of center, and measures of variation) to
investigate data sets in order to understand
their important characteristics.
3
OUTLIERS
  • An outlier is a value that is located very far
    away from almost all of the other values.
  • An outlier is also known as an extreme value.
  • Outliers can have a dramatic effect on the mean,
    the standard deviation, and the scale of the
    histogram, so that the true nature of the
    distribution is totally obscured.
  • To find outliers, examine a sorted list of data
    and look for values that are far from most other
    values.
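
As a small illustration of the last two points, the sketch below sorts a data set, lists the gaps between neighboring values, and compares the mean and median with and without the suspect value. The data values and the helper name `sorted_with_gaps` are hypothetical, chosen only for illustration; they are not from the presentation.

```python
from statistics import mean, median

def sorted_with_gaps(values):
    """Return the sorted values and the gaps between neighboring values."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return ordered, gaps

# Hypothetical data: one value (42) sits far from the rest.
data = [5, 7, 6, 8, 7, 6, 42, 7]

ordered, gaps = sorted_with_gaps(data)
print(ordered)  # the suspect value appears at one end of the sorted list
print(gaps)     # an unusually large gap separates it from the others

# Compare the mean and median with and without the suspect value.
trimmed = [x for x in data if x != 42]
print(mean(data), mean(trimmed))
print(median(data), median(trimmed))
```

In this made-up data set the single value 42 pulls the mean from about 6.6 up to 11, while the median stays at 7.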

4
5-NUMBER SUMMARY
For a set of data, the 5-number summary consists
of
  • the minimum value
  • the first quartile, Q1
  • the median (or second quartile, Q2)
  • the third quartile, Q3, and
  • the maximum value.
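
The five values can be computed with a short function such as the sketch below. The slides do not say which quartile rule to use, so this sketch assumes one common convention: Q1 and Q3 are the medians of the lower and upper halves of the sorted data, leaving out the overall median when the number of values is odd. Other conventions can give slightly different quartiles. The function name and the sample data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumed quartile rule: Q1 and Q3 are the medians of the lower and
    upper halves; the overall median is excluded when n is odd.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two data values")
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])

# Hypothetical data, not from the presentation.
print(five_number_summary([7, 1, 4, 6, 2, 5, 3]))
print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8]))
```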

5
EXAMPLE
Find the 5-number summary for Bank of Providence
waiting times.
6
BOXPLOTS (BOX-AND-WHISKER DIAGRAMS)
Boxplots are good for revealing:
  1. center of the data
  2. spread of the data
  3. distribution of the data
  4. presence of outliers
Boxplots are also excellent for comparing two or more data
sets.
7
CONSTRUCTING A BOXPLOT
  • Find the 5-number summary.
  • Construct a scale with values that include the
    minimum and maximum data values.
  • Construct a box (rectangle) extending from Q1 to
    Q3, and draw a line in the box at the median
    value.
  • Draw lines extending outward from the box to the
    minimum and maximum data values.
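
The steps above can be sketched as a rough text-only drawing. This is an illustration under stated assumptions, not the presenter's method: the quartile rule (median of each half, excluding the overall median when n is odd), the function names, the plotting characters, and the sample data are all hypothetical choices.

```python
from statistics import median

def five_number_summary(values):
    """Step 1: (min, Q1, median, Q3, max) under the assumed quartile rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])

def ascii_boxplot(values, width=41):
    """Draw a one-line boxplot using text characters."""
    lo, q1, med, q3, hi = five_number_summary(values)
    span = (hi - lo) or 1  # avoid dividing by zero if all values are equal

    # Step 2: a scale that maps min..max onto columns 0..width-1.
    def col(x):
        return round((x - lo) / span * (width - 1))

    row = [" "] * width
    # Step 4: lines (whiskers) from the minimum to the maximum.
    for i in range(col(lo), col(hi) + 1):
        row[i] = "-"
    # Step 3: the box from Q1 to Q3, with a mark at the median.
    for i in range(col(q1), col(q3) + 1):
        row[i] = "="
    row[col(lo)] = "|"
    row[col(hi)] = "|"
    row[col(q1)] = "["
    row[col(q3)] = "]"
    row[col(med)] = "M"
    return "".join(row)

# Hypothetical data, not from the presentation.
line = ascii_boxplot([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
print(line)
```

The box is drawn last so that it overwrites the whisker line, and the median mark is placed after the box edges so it stays visible even when it coincides with a quartile.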

8
AN EXAMPLE OF A BOXPLOT
9
DRAWING A BOXPLOT ON THE TI-83/84
  • Press STAT and select 1:Edit.
  • Enter your data values in L1. (Note: You could
    enter them in a different list.)
  • Press 2ND, Y= (for STAT PLOT). Select 1:Plot1.
  • Turn the plot ON. For Type, select the boxplot
    (middle one on second row).
  • For Xlist, put L1 by pressing 2ND, 1.
  • For Freq, enter the number 1.
  • Press ZOOM. Select 9:ZoomStat.

10
EXAMPLE
Use boxplots to compare the waiting times at
Jefferson Valley Bank and the Bank of Providence.
Interpret your results.
11
BOXPLOTS AND DISTRIBUTIONS
12
BOXPLOTS AND DISTRIBUTIONS (CONTINUED)
13
BOXPLOTS AND DISTRIBUTIONS (CONCLUDED)
14
EXPLORING
  • Measures of Center: mean, median, and mode
  • Measures of Variation: standard deviation and
    range
  • Measures of Dispersion: minimum value, maximum
    value, and quartiles
  • Unusual Values: outliers
  • Distribution: histogram, stem-and-leaf plots, and
    boxplots

15
EXAMPLE
Explore the data below, which shows the ages of
most employees at the Vita Needle Company.
  76 45 72 77 63 87 73 84 86
  79 86 75 87 74 39 75 41 82
  34 88 85 79 73 53 65
(Based on data from "Where Retirement Became a Dirty Word"
by Julie Flaherty, New York Times.)
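
One way to begin exploring these ages is sketched below, using the 25 values listed above. Two choices here are assumptions rather than anything stated in the slides: the standard deviation is the sample standard deviation, and the quartiles are the medians of the lower and upper halves with the overall median left out. The helper name `quartiles` is hypothetical.

```python
from statistics import mean, median, multimode, stdev

# Ages from the example above.
ages = [76, 45, 72, 77, 63, 87, 73, 84, 86,
        79, 86, 75, 87, 74, 39, 75, 41, 82,
        34, 88, 85, 79, 73, 53, 65]

def quartiles(values):
    """Return (Q1, Q3) as medians of the lower and upper halves (assumed rule)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(lower), median(upper)

q1, q3 = quartiles(ages)

# Measures of center
print("mean:", mean(ages))
print("median:", median(ages))
print("mode(s):", sorted(multimode(ages)))

# Measures of variation
print("sample standard deviation:", round(stdev(ages), 2))
print("range:", max(ages) - min(ages))

# Minimum, quartiles, and maximum (the 5-number summary)
print("5-number summary:", (min(ages), q1, median(ages), q3, max(ages)))

# Unusual values: inspect both ends of the sorted list.
print("sorted:", sorted(ages))
```

Under these assumptions the mean is 71.12, the median is 75, and the 5-number summary is 34, 64, 75, 84.5, 88. Five ages (73, 75, 79, 86, 87) each occur twice, so there is no single mode. The mean falling below the median, together with the long stretch between the minimum and Q1, suggests that the handful of younger employees pulls the distribution toward the low end.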