Title: LSSG Green Belt Training
1. LSSG Green Belt Training
Overview of Charts and Graphs Mini Case
2. Overview: The Story
- A retail company has 3 regional shipping centers, each with its own computer system.
- Goal
  - To determine which of the 3 systems is most efficient, determine whether it helps meet customer needs regarding shipping, and improve it further. The system can then be used companywide to facilitate integration and enhance efficiency.
- Source: Meet Minitab
  - (http://www.minitab.com/support/docs/rel14/MeetMinitab14.pdf)
3. Shipping Example: Data
- A few records from the dataset (319 records in total) are shown below.

Center   Order           Arrival          Days     Status      Distance
Eastern  3/3/2003 8:34   3/7/2003 15:21   4.28264  On time     255
Eastern  3/3/2003 8:35   3/6/2003 17:05   3.35417  On time     196
Eastern  3/3/2003 8:38                             Back order  299
Central  3/3/2003 8:58   3/6/2003 14:59   3.25069  On time     81
Central  3/3/2003 9:04   3/8/2003 10:12   5.04722  On time     235
Central  3/3/2003 9:06   3/9/2003 16:13   6.29653  Late        259
Western  3/3/2003 9:44   3/6/2003 10:08   3.01667  On time     291
Western  3/3/2003 9:46   3/6/2003 9:50    3.00278  On time     271
Continuous (Numerical) Data: Days, Distance
Categorical Data: Center, Status
4. First Look at Shipping Data: Univariate Analysis
- Univariate analysis simply means looking at data one variable at a time, as a precursor to multivariate analysis.
- Purpose
  - To understand the individual variables before exploring relationships among variables.
  - To check for extraordinary, incorrect, or missing data.
- The basic method is to graph the data to look at the distribution, and to compute measures of central tendency (mean, median) and variation (range, standard deviation).
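The measures named above can be sketched in a few lines of Python. This is only an illustration on the seven Days values visible in the sample records (not the full 319-record dataset), and the variable names are assumptions made for this example.

```python
import statistics

# Days values copied from the sample records on the data slide.
# The back-ordered row has no Days value, so it is left out.
days = [4.28264, 3.35417, 3.25069, 5.04722, 6.29653, 3.01667, 3.00278]

# Measures of central tendency
mean_days = statistics.mean(days)
median_days = statistics.median(days)

# Measures of variation
range_days = max(days) - min(days)
stdev_days = statistics.stdev(days)  # sample standard deviation (divides by n - 1)

print(f"mean={mean_days:.3f}  median={median_days:.3f}")
print(f"range={range_days:.3f}  stdev={stdev_days:.3f}")
```

With so few values the numbers will not match the full-dataset statistics shown later; the point is only how each measure is obtained.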
5. Individual Value Plots
The graphs below show the number of days for
shipping for all data together on the left, and
separated by shipping center on the right. The
graph on the right also shows the mean values for
each center connected by a line. At first
glance, it is evident that the Western center
has the lowest mean number of days for shipping.
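The per-center means marked on the right-hand plot come from grouping the Days values by center and averaging each group. A minimal sketch, assuming only the sample records from the data slide (the names `records`, `by_center`, and `center_means` are hypothetical):

```python
from collections import defaultdict

# (center, days) pairs from the sample records; the back-ordered row is omitted
# because it has no Days value.
records = [
    ("Eastern", 4.28264), ("Eastern", 3.35417),
    ("Central", 3.25069), ("Central", 5.04722), ("Central", 6.29653),
    ("Western", 3.01667), ("Western", 3.00278),
]

# Group the Days values by center.
by_center = defaultdict(list)
for center, days in records:
    by_center[center].append(days)

# Mean number of days per center, and the center with the lowest mean.
center_means = {c: sum(v) / len(v) for c, v in by_center.items()}
fastest = min(center_means, key=center_means.get)

for center, mean in center_means.items():
    print(f"{center}: {mean:.3f} days")
print("Lowest mean:", fastest)
```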
6. Frequency Histograms: A Look at the Distribution
Frequency histograms help us see how the data are distributed. At left, the number of days for shipping across all centers looks approximately normally distributed, with a mean of about 4 and a range from about 1 to 8 days. At right, the distributions for the 3 centers are shown separately.
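A frequency histogram is built by counting how many values fall into each bin. The sketch below uses one-day-wide bins and a text bar per bin; the bin width and the seven sample Days values are assumptions for illustration, not the settings behind the charts on this slide.

```python
import math
from collections import Counter

# Days values from the sample records on the data slide.
days = [4.28264, 3.35417, 3.25069, 5.04722, 6.29653, 3.01667, 3.00278]

# Assign each value to a one-day-wide bin: [3, 4), [4, 5), ...
bins = Counter(math.floor(d) for d in days)

# Print one row per bin, including empty bins between the extremes.
for start in range(min(bins), max(bins) + 1):
    print(f"{start}-{start + 1} days | {'*' * bins[start]}")
```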
7. Box Plots: Comparing Distributions
Box plots are a useful way to compare distributions visually. Box plots show the smallest value, the first quartile, the median, the third quartile, and the largest value of the variable. In the plots below, the mean is also marked.
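The five values a box plot displays can be computed directly. This sketch uses Python's `statistics.quantiles` with its default method on the seven sample Days values; quartile conventions differ between tools, so a given package may report slightly different quartiles for the same data.

```python
import statistics

# Days values from the sample records on the data slide.
days = [4.28264, 3.35417, 3.25069, 5.04722, 6.29653, 3.01667, 3.00278]

# quantiles(n=4) returns the three cut points: Q1, median, Q3.
q1, median, q3 = statistics.quantiles(days, n=4)

# Five-number summary shown by a box plot, plus the mean marked on the slide's plots.
five_number = (min(days), q1, median, q3, max(days))
mean_days = statistics.mean(days)

print("min, Q1, median, Q3, max:", five_number)
print("mean:", round(mean_days, 3))
```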
8. Graphing Categorical Data: Bar/Pie Charts
The bar chart below shows the status of orders in the sample as a percentage. About 7-8% of the orders are on back order, while about 5% are shipped late overall. The pie charts show the percentage of back orders and late shipments for each center.
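Bar and pie charts of categorical data are driven by counts converted to percentages. The sketch below takes the order counts from the Descriptive Statistics slide (the N and missing-value columns) rather than from the charts, so the shares it prints describe that table and need not agree exactly with values read off the bar chart. The zero for Western late orders is an assumption, since that table lists no Late row for Western.

```python
# Order counts by center and status, taken from the Descriptive Statistics slide.
# Western "Late" is assumed to be 0 because no such row is listed there.
counts = {
    "Central": {"Back order": 6, "Late": 6, "On time": 93},
    "Eastern": {"Back order": 8, "Late": 9, "On time": 92},
    "Western": {"Back order": 3, "Late": 0, "On time": 102},
}

total = sum(sum(c.values()) for c in counts.values())

def overall_share(status):
    """Percentage of all orders with the given status (bar chart view)."""
    return 100 * sum(c[status] for c in counts.values()) / total

def center_share(center, status):
    """Percentage of one center's orders with the given status (pie chart view)."""
    return 100 * counts[center][status] / sum(counts[center].values())

for status in ("Back order", "Late", "On time"):
    print(f"{status}: {overall_share(status):.1f}% overall")
for center in counts:
    print(f"{center}: {center_share(center, 'Back order'):.1f}% back order, "
          f"{center_share(center, 'Late'):.1f}% late")
```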
9. Descriptive Statistics
Descriptive statistics give us numerical insight into the data. Compare this information to the graphs on the previous page.

Descriptive Statistics: Days

Results for Center = Central
Variable  Status        N  N*   Mean  SE Mean  StDev
Days      Back order    0   6
          Late          6   0  6.431    0.157  0.385
          On time      93   0  3.826    0.119  1.149

Results for Center = Eastern
Variable  Status        N  N*   Mean  SE Mean  StDev
Days      Back order    0   8
          Late          9   0  6.678    0.180  0.541
          On time      92   0  4.234    0.112  1.077

Results for Center = Western
Variable  Status        N  N*   Mean  SE Mean  StDev
Days      Back order    0   3
          On time     102   0  2.981    0.108  1.090
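The SE Mean column is the standard error of the mean, StDev divided by the square root of N. A short check against the reported rows (the row tuples and the helper name `se_mean` are assumptions made for this sketch):

```python
import math

# (center, status, N, StDev, reported SE Mean) copied from the output above.
rows = [
    ("Central", "Late",      6, 0.385, 0.157),
    ("Central", "On time",  93, 1.149, 0.119),
    ("Eastern", "Late",      9, 0.541, 0.180),
    ("Eastern", "On time",  92, 1.077, 0.112),
    ("Western", "On time", 102, 1.090, 0.108),
]

def se_mean(stdev, n):
    """Standard error of the mean: StDev / sqrt(N)."""
    return stdev / math.sqrt(n)

for center, status, n, stdev, reported in rows:
    print(f"{center:8s} {status:8s} computed={se_mean(stdev, n):.3f} reported={reported:.3f}")
```

Small differences in the third decimal place can appear because the StDev values in the table are already rounded.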
10. Testing Hypotheses - ANOVA
While it looks like the centers differ in their efficiencies, a hypothesis test can confirm that. A one-way ANOVA tests whether the mean numbers of days for the 3 centers are in fact significantly different from each other. The low p-value (almost 0) indicates that one can conclude with great confidence (almost 100%) that at least one of the centers is different from the others.
One-way ANOVA: Days versus Center

Source   DF      SS     MS      F      P
Center    2  114.63  57.32  39.19  0.000
Error   299  437.28   1.46
Total   301  551.92

S = 1.209   R-Sq = 20.77%   R-Sq(adj) = 20.24%

Individual 95% CIs for Mean Based on Pooled StDev

Level      N   Mean  StDev
Central   99  3.984  1.280
Eastern  101  4.452  1.252
Western  102  2.981  1.090

[Interval plot: 95% CI for each center's mean, on an axis from 3.00 to 4.50]
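The sums of squares and the F statistic in a one-way ANOVA table can be rebuilt approximately from each group's N, Mean, and StDev. The sketch below does this with the rounded per-center values reported above, so its results agree with the table only to within rounding. It does not compute the p-value, which would need an F distribution from a statistics library.

```python
# (N, Mean, StDev) per center, copied from the ANOVA output above.
groups = {
    "Central": (99, 3.984, 1.280),
    "Eastern": (101, 4.452, 1.252),
    "Western": (102, 2.981, 1.090),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total

# Between-group (Center) and within-group (Error) sums of squares.
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
ss_within = sum((n - 1) * s ** 2 for n, _, s in groups.values())

# Degrees of freedom: k - 1 for Center, N - k for Error.
df_between = len(groups) - 1
df_within = n_total - len(groups)

# Mean squares, F statistic, and R-Sq.
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within
r_sq = ss_between / (ss_between + ss_within)

print(f"Center: DF={df_between} SS={ss_between:.2f} MS={ms_between:.2f} F={f_stat:.2f}")
print(f"Error:  DF={df_within} SS={ss_within:.2f} MS={ms_within:.2f}")
print(f"R-Sq = {100 * r_sq:.2f}%")
```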
11. Examining relationships - scatterplot
Is the better performance of the Western center
due to smaller shipping distances than the other
regions? First, a scatterplot of Number of Days
for shipping against the Distance across all
centers seems to show no relationship between
the two.
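A correlation coefficient puts a number on what the scatterplot shows: values near 0 indicate little linear relationship, values near +1 or -1 a strong one. The sketch below computes the Pearson correlation by hand for the seven sample records that have both Distance and Days. It illustrates the calculation only; seven points are too few to stand in for the full dataset, and the function name `pearson_r` is an assumption.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Distance and Days from the sample records (back-ordered row omitted).
distance = [255, 196, 81, 235, 259, 291, 271]
days = [4.28264, 3.35417, 3.25069, 5.04722, 6.29653, 3.01667, 3.00278]

r = pearson_r(distance, days)
print(f"Pearson r between Distance and Days (sample rows): {r:.2f}")
```

Running the same function on each center's rows separately gives the per-center view.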
12. Scatterplot separated by Center
When we look at it by center, there still seems
to be no relationship between distance and
number of days.