Analysis

About This Presentation

Title:

Analysis

Description:

Analysis & Evaluation of Data. The collected data should be reliable: no or very little error is committed in the gathering and tabulation of data.


Slides: 15

Provided by: fts1


Transcript and Presenter's Notes

Title: Analysis

1
Analysis & Evaluation of Data

The collected data should be:
Reliable: no or very little error is committed in the gathering and tabulation of the data
Accurate: the desired degree of precision is maintained
Valid: the data is applicable to the issue and attribute of interest

2
Sample Consideration

We have collected error data on Requirements Inspection, Design Inspection and Unit Testing, and want to analyze them for a quality attribute.
Potential reliability problem?
Did we collect and count the data correctly in all three cases?
Potential accuracy problem?
Did we use the same level of precision (e.g. the same level of severity breakdown) in all three cases?
Potential validity problem?
Is the number of defects a valid quality attribute?
Do these data reflect the extent of the defects committed (extent: number, severity, complexity of fix, etc.)?

3
Some Common Analysis Methods of Data

Distribution of Data
Centrality and Dispersion
Moving Averages
Data Correlation
Normalization of Data

4
1. Distribution of Data

We often look at a scatter diagram of the raw data and pick out the outliers.
We count the frequency of occurrences to get a distribution, which gives a view of the shape and the range of the data.

Severity   Defects
1          7
2          24
3          26
4          88
5          92

The range is from 7 defects to 92 defects.
The shape is not that important in this case; the skew is towards the less severe defects.
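The frequency count above can be reproduced with a short sketch. The counts are the ones on the slide; the text-histogram scale is an arbitrary choice, and reading severity 4 and 5 as the "less severe" levels is an assumption taken from the slide's remark about the skew.

```python
# Defect counts per severity level, copied from the slide.
severity_counts = {1: 7, 2: 24, 3: 26, 4: 88, 5: 92}

total = sum(severity_counts.values())
low = min(severity_counts.values())
high = max(severity_counts.values())


def share(levels):
    """Fraction of all defects that fall in the given severity levels."""
    return sum(severity_counts[s] for s in levels) / total


# Crude text histogram: one '#' per 4 defects (scale chosen arbitrarily).
for sev, count in sorted(severity_counts.items()):
    print(f"severity {sev}: {count:3d} {'#' * (count // 4)}")

print(f"range: {low} to {high} defects, total {total}")
# Assumption: severity 4 and 5 are the less severe levels.
print(f"share in severity 4-5: {share([4, 5]):.0%}")
```

Printing the share of defects in the two highest-numbered levels makes the skew visible without needing a plot.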

5
Common Distributions of Data

There are some recognizable distributions

Normal
Linear

Logarithmic
Exponential
Negative Exponential
6
2. Centrality and Dispersion

Use centrality to compare two sets of data distributions:
mean
median

[Figure: distribution curves with their mean and median values marked]
7
Variance & Standard Deviation

A measure of dispersion from the central value (see below).
We measured the number of defects ($x_i$) from $n$ similar-sized functional areas.
The mean or central value is calculated as $X_{mean} = \sum x_i / n$
The variance $= \sum (x_i - X_{mean})^2 / n$
Std Dev. $= \sqrt{\text{variance}}$
For a Normal Distribution, 1 std deviation on either side of the mean captures about 68% of the sample.
Given a new function of similar size, we can measure the number of defects found and compare it against the mean of the earlier group and the 1 std deviation.
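A minimal sketch of these calculations follows, dividing by n as on the slide (the population form, not the n - 1 sample form). The defect counts are hypothetical and only illustrate the arithmetic, and `within_one_std` is an illustrative helper name, not something from the presentation.

```python
import math


def mean(values):
    """Central value: sum of the observations divided by n."""
    return sum(values) / len(values)


def variance(values):
    """Average squared deviation from the mean (divides by n)."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / len(values)


def std_dev(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))


def within_one_std(new_value, values):
    """True if new_value lies within 1 std deviation of the group mean."""
    return abs(new_value - mean(values)) <= std_dev(values)


# Hypothetical defect counts from six similar-sized functional areas.
defects = [4, 6, 5, 7, 3, 5]

print(f"mean = {mean(defects):.2f}")
print(f"variance = {variance(defects):.2f}")
print(f"std dev = {std_dev(defects):.2f}")
print("new area with 6 defects within 1 std dev:", within_one_std(6, defects))
print("new area with 9 defects within 1 std dev:", within_one_std(9, defects))
```

A new area that falls outside the band is not necessarily a problem, but it is a candidate for a closer look.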

8
Control Chart

[Figure: control chart with a center line at Mean = 5.3 and a line at 1 Std Dev. on each side of it]

9
3. Moving Average - a Smoothing Technique
[Figure: data series with its moving average; annotations: "Jump smoothed" (in two places) and "Special jump"]
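The smoothing effect can be sketched with a simple (unweighted) moving average. The window size of 3 and the data series are hypothetical; the slide does not say which window or data were used.

```python
def moving_average(values, window):
    """Simple moving average: mean of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical weekly defect counts with one sudden jump (the 10).
weekly_defects = [2, 3, 10, 3, 2, 4]

smoothed = moving_average(weekly_defects, window=3)
print([round(v, 2) for v in smoothed])
```

In the smoothed series the single jump to 10 is spread over three averages, so it no longer dominates the picture.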
10
4. Correlation

Correlation only addresses whether there is a relationship.
It does not address cause and effect.
Example:
the size of a module may correlate with its number of defects,
but the size of the module may or may not be the cause.
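As an illustration, the strength of a linear relationship can be measured with the Pearson correlation coefficient. The slide does not name a specific coefficient, so the choice of Pearson's r is an assumption; the (size, defects) pairs are the ones used in the regression example later in this transcript.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Module sizes and defect counts from the regression example.
sizes = [150, 230, 500, 730, 1000]
defects = [2, 3, 4, 7, 9]

r = pearson_r(sizes, defects)
print(f"r = {r:.3f}")
```

A value of r close to 1 says that size and defect count move together; it says nothing about whether size causes the defects.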

11
Linear Relationship
Linear equation of the form $Y = a + bX$, where b is the slope and a is the Y intercept.

[Figure: straight line plotted against the X and Y axes]
12
Least Square Linear Regression

A method of estimating the linear relationship between the Y variables and the X variables in the form $Y = a + bX$, by minimizing the sum of the squared vertical distances of the observed Y coordinates from the line.
We can estimate the parameters a, b as follows:
$b = \dfrac{\sum XY - (1/n)(\sum X)(\sum Y)}{\sum X^2 - (1/n)(\sum X)^2}$
(this b estimate gives the same value as the one shown in the book)
$a = Y_{ave} - b X_{ave}$
where X is each of the X observations, $X_{ave}$ is the average of the Xs, and $Y_{ave}$ is the average of the Ys.

13
Least Square Linear Regression - Example

(size, defects): (150, 2), (230, 3), (500, 4), (730, 7), (1000, 9)
Xs: 150, 230, 500, 730, 1000; $\sum X = 2610$
$X^2$: 22,500; 52,900; 250,000; 532,900; 1,000,000; $\sum X^2 = 1{,}858{,}300$
Ys: 2, 3, 4, 7, 9; $\sum Y = 25$
XY: 300, 690, 2000, 5110, 9000; $\sum XY = 17{,}100$
$b = \dfrac{17100 - (1/5)(2610)(25)}{1858300 - (1/5)(2610)^2} = 4050 / 495880 \approx 0.0081$ (0.00817 truncated to four decimal places)
$a = 25/5 - (0.0081)(2610/5) = 5 - 4.23 = 0.77$
The least squares regression line is $Y = 0.77 + 0.0081X$

Let's plug in X = 150 and see what we get: $0.0081(150) + 0.77 = 1.22 + 0.77 = 1.99$ (close to the observed value of 2!). The line is more accurate for interpolation than for extrapolation.
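The worked example can be checked with a short sketch that applies the b and a formulas from the previous slide to the same five (size, defects) pairs. The function name `least_squares` is illustrative only.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line Y = a + b*X."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # b = [sum(XY) - (1/n) sum(X) sum(Y)] / [sum(X^2) - (1/n) (sum(X))^2]
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # a = Yave - b * Xave
    a = sum_y / n - b * (sum_x / n)
    return a, b


sizes = [150, 230, 500, 730, 1000]
defects = [2, 3, 4, 7, 9]

a, b = least_squares(sizes, defects)
print(f"Y = {a:.4f} + {b:.6f} X")
print(f"predicted defects at size 150: {a + b * 150:.2f}")
```

Because the code carries b at full precision instead of cutting it off at 0.0081, its intercept and prediction differ slightly from the hand calculation (a comes out near 0.74 instead of 0.77), but both lines predict about 2 defects at size 150.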
14
5. Normalization