Summarizing Performance Data - PowerPoint PPT Presentation


About This Presentation

Title:

Summarizing Performance Data

Description:

This is simple if we can assume that the data comes from an iid model ... Mean and standard deviation make sense when data sets are not wild ... – PowerPoint PPT presentation


Slides: 56

Provided by: jeanyves4


Transcript and Presenter's Notes

Title: Summarizing Performance Data

1
Summarizing Performance Data

Important
Easy to Difficult
Warning: some mathematical content

2
1 Summarizing Performance Data

How do you quantify
Central value
Dispersion (Variability)

(Figure legend: old, new)
3
Histogram is one answer, but is not summarized
enough
4
ECDFs allow easy comparison
(Figure legend: new, old)
5
Summarized Measures

Median, Quantiles
Median
Quartiles
P-quantiles
Mean and standard deviation
Mean
Standard deviation
What is the interpretation of standard deviation?
A: if data is normally distributed, then with 95%
probability a new data sample lies in the
interval μ ± 1.96 σ

6
Coefficient of Variation

Scale free
Summarizes variability
Second order
For a data set with n samples: CoV = s / m
(standard deviation s divided by mean m)
Exponential distribution: CoV = 1
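
The CoV of a data set is its standard deviation divided by its mean. A minimal sketch in Python follows; the function name is hypothetical, and the choice of the 1/n form of the standard deviation is an assumption, since the slide does not say which form is meant.

```python
from statistics import fmean, pstdev

def coefficient_of_variation(data):
    """Return standard deviation / mean. Assumes the mean is nonzero."""
    m = fmean(data)
    # Assumption: population (1/n) standard deviation, not the n-1 form.
    return pstdev(data, mu=m) / m

samples = [2, 4, 4, 4, 5, 5, 7, 9]
print(coefficient_of_variation(samples))
# Scale free: multiplying every sample by a constant leaves CoV unchanged.
print(coefficient_of_variation([10 * x for x in samples]))
```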

7
Lorenz Curve Gap

Alternative to CoV
For a data set with n samples: gap = (1/(2 n m)) x
sum of |xi - m|, where m is the mean
Scale free, index of unfairness
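
A possible way to compute the Lorenz curve gap is sketched below. It assumes the usual closed form, the mean absolute deviation divided by twice the mean, which equals the largest vertical distance between the diagonal and the Lorenz curve. The function name is hypothetical, and the data is assumed non-negative with a positive mean.

```python
from statistics import fmean

def lorenz_curve_gap(data):
    """Mean absolute deviation divided by (2 * mean)."""
    m = fmean(data)
    mad = fmean(abs(x - m) for x in data)
    return mad / (2 * m)

print(lorenz_curve_gap([1, 1, 1, 1]))  # all samples equal: no unfairness
print(lorenz_curve_gap([0, 0, 0, 4]))  # one sample holds everything
```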

8
Jain's Fairness Index

Quantifies fairness of x
Ranges from
1: all xi equal
to 1/n: maximum unfairness
Fairness and variability are two sides of the
same coin
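
Jain's fairness index is commonly defined as (sum of xi)^2 / (n x sum of xi^2). The sketch below uses that definition; the function name is hypothetical. With the 1/n form of the standard deviation (an assumption here), the index equals 1 / (1 + CoV^2), which is one way to see that fairness and variability carry the same information.

```python
from statistics import fmean, pstdev

def jain_fairness_index(data):
    """(sum x)^2 / (n * sum x^2). Assumes at least one nonzero sample."""
    n = len(data)
    total = sum(data)
    return total * total / (n * sum(x * x for x in data))

samples = [2, 4, 4, 4, 5, 5, 7, 9]
cov = pstdev(samples) / fmean(samples)  # assumption: 1/n standard deviation
print(jain_fairness_index(samples), 1 / (1 + cov ** 2))
```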

9
Lorenz Curve
Perfect equality (fairness)
Lorenz Curve gap

Old code, new code: is JFI larger? Gap?
Gini index is also used. Def: 2 x area between
diagonal and Lorenz curve
More or less equivalent to Lorenz curve gap

10
Which Summarization Should One Use ?

There are (too) many synthetic indices to choose
from
Traditional measures in engineering are standard
deviation, mean and CoV
In Computer Science, JFI and mean
JFI is equivalent to CoV
In economics, gap and Gini index (a variant of
Lorenz curve gap)
Statisticians like medians and quantiles (robust
to statistical assumptions)
We will come back to the issue after discussing
confidence intervals

11
2 Confidence Interval

Do not confuse with prediction interval
Quantifies uncertainty about an estimation

12
mean and standard deviation
quantiles

central value
accuracy of central value
dispersion

13
Confidence Intervals for Mean of Difference

Mean reduction: 0 is outside the confidence
intervals for mean and for median
Confidence interval for median

14
Computing Confidence Intervals

This is simple if we can assume that the data
comes from an iid model (Independent, Identically
Distributed)
How do I know if this is true?
Controlled experiments: draw factors randomly
with replacement
Simulation: independent replications (with random
seeds)
Else we do not know; in some cases we will have
methods for time series

15
What does independence mean ?
16
CI for median

Is the simplest of all
Robust: always true provided iid assumption holds
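
A confidence interval for the median can be read off the order statistics using the Binomial(n, 1/2) distribution. The sketch below is one possible implementation, not necessarily the table lookup used in the lecture: it assumes iid data with a continuous distribution and picks the symmetric pair of ranks (j, n + 1 - j) with the largest j whose coverage P(j <= B <= n - j) is at least the requested level. Function names are hypothetical.

```python
from math import comb

def median_ci_ranks(n, level=0.95):
    """Return (j, k, coverage) with k = n + 1 - j, or None if n is too small."""
    best = None
    tail = 0.0  # running value of P(B <= j - 1) for B ~ Binomial(n, 1/2)
    for j in range(1, n // 2 + 1):
        tail += comb(n, j - 1) / 2 ** n
        coverage = 1.0 - 2.0 * tail  # by symmetry of Binomial(n, 1/2)
        if coverage < level:
            break  # coverage only shrinks as j grows
        best = (j, n + 1 - j, coverage)
    return best

def median_ci(data, level=0.95):
    """Return (x_(j), x_(k)) from the sorted data, or None if n is too small."""
    ranks = median_ci_ranks(len(data), level)
    if ranks is None:
        return None
    j, k, _ = ranks
    ordered = sorted(data)
    return ordered[j - 1], ordered[k - 1]

print(median_ci_ranks(10))  # ranks j = 2, k = 9, coverage about 0.979
```

With n = 5 or fewer samples no such interval exists at level 95%, and the sketch returns None.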

18
Confidence Interval for Median

n = 31
n = 32

19
CI for mean and std dev

Most commonly used method
But requires some assumptions to hold, may be
misleading if they do not hold
There is no exact theorem as for median and
quantiles, but there are asymptotic results and a
heuristic.

20
Normal Case

Assume data comes from an iid normal
distribution
Useful for very small data samples (n < 30)

21
Tables in Weber-Tables
22
Example

n = 100, 95% confidence level
CI for mean
CI for standard deviation
Amplitude of CI decreases in 1/sqrt(n)
Compare to prediction interval

23
Standard Deviation n or n-1 ?
24
CI for mean, asymptotic case

If data is not normal but the central limit
theorem holds (in practice n is large and
distribution is not wild)
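
In the asymptotic case the confidence interval for the mean takes the form m ± η s / sqrt(n), where η is a standard normal quantile (about 1.96 at level 95%). A minimal sketch follows; the function name is hypothetical and the use of the 1/n form of the standard deviation is an assumption.

```python
from math import sqrt
from statistics import NormalDist, fmean, pstdev

def asymptotic_mean_ci(data, level=0.95):
    """Large-n confidence interval for the mean, assuming iid data."""
    n = len(data)
    m = fmean(data)
    s = pstdev(data, mu=m)  # assumption: 1/n standard deviation
    eta = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    half_width = eta * s / sqrt(n)
    return m - half_width, m + half_width

samples = [2, 4, 4, 4, 5, 5, 7, 9] * 5  # n = 40
print(asymptotic_mean_ci(samples))
```

The half width shrinks like 1/sqrt(n), so quadrupling the number of samples halves the interval.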

25
Example

CI for mean: same as before except s instead
of σ, and 1.96 for all n instead of 1.98 for n = 100
In practice both (normal case and large n
asymptotic) are the same if n > 30
But large n asymptotic does not require normal
assumption

26
Bootstrap Percentile Method

A heuristic that is robust (requires only iid
assumption)
But be careful with heavy tail, see next
but tends to underestimate CI
Applies to mean and any other statistic
Idea: use the empirical distribution in place of
the theoretical (unknown) distribution
For example, with confidence level 95%:
the data set is S
Do r = 1 to r = 999
(replay experiment) Draw n bootstrap replicates
with replacement from S
Compute sample mean T_r
Bootstrap percentile estimate is (T(25), T(975)),
where T(k) is the k-th smallest of the T_r
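
The bootstrap percentile steps can be sketched as follows for confidence level 95%, with 999 replicates and the 25th and 975th ordered values as on the slide. The function name and the `seed` parameter are hypothetical additions for reproducibility.

```python
import random
from statistics import fmean

def bootstrap_percentile_ci(data, stat=fmean, seed=None):
    """95% bootstrap percentile interval for `stat`, assuming iid data."""
    rng = random.Random(seed)
    n = len(data)
    replicates = sorted(
        stat(rng.choices(data, k=n))  # n draws with replacement from the data
        for _ in range(999)
    )
    # T(25) and T(975) in 1-based order correspond to indices 24 and 974.
    return replicates[24], replicates[974]

samples = list(range(1, 101))
print(bootstrap_percentile_ci(samples, seed=1))
```

Passing another statistic, for example `stat=statistics.median`, gives an interval for that statistic instead of the mean.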

27
Example: Compiler Options

Does data look normal ?
No
Methods 2.3.1 and 2.3.2 give same result (n > 30)
Method 2.3.3 (Bootstrap) gives same result
=> Asymptotic assumption valid

28
Other Example: File Transfer Times

Normal assumption and Bootstrap do not coincide
for data
Symptom that asymptotic assumption may not hold
Normal assumption does not hold
This is an example of wild distribution
They coincide for log of data

29
Take Home Message

Confidence interval for median (or other
quantiles) is easy to get from the Binomial
distribution
Requires iid
No other assumption
Confidence interval for the mean
Requires iid
And
Either data sample is normal and n is small
Or data sample is not wild and n is large enough
The bootstrap is more robust but more complicated
to use
To apply Student or normal statistic, we need to
verify the assumptions

30
QQplot is a common tool for verifying assumption

Normal QQplot
X-axis: standard normal quantiles
Y-axis: order statistics of sample
If data comes from a normal distribution, qqplot
is close to a straight line (except for end
points)
Visual inspection is often enough
If not possible or doubtful, we will use tests
later
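
The points of a normal QQplot can be computed without any plotting library. The sketch below pairs each order statistic with a standard normal quantile; the plotting positions (i - 0.5) / n are an assumption (several conventions exist), and the function name is hypothetical.

```python
from statistics import NormalDist

def normal_qq_points(data):
    """Return (standard normal quantile, order statistic) pairs."""
    n = len(data)
    std_normal = NormalDist()
    ordered = sorted(data)
    # Assumption: plotting position (i - 0.5) / n for the i-th order statistic.
    quantiles = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))

for x, y in normal_qq_points([3.1, 0.2, 1.7, 2.4, 1.1]):
    print(round(x, 3), y)
```

If the data is close to normal, these points fall near a straight line when plotted.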

31
QQPlots of Compiler Options

Neither data set looks normal

32
Verifying Assumption

If data set looks normal (by inspection of
qqplot) OK
Else, do the test of the asymptotic regime
Compute bootstrap replicates of the estimator of
the mean
If the asymptotic regime holds, they should look
normal

33
QQplots of Compiler Option Bootstrap Replicates

Asymptotic CI is valid

34
QQplots of File Transfer Times Bootstrap
Replicates

Do not appear to be normal

35
Prediction Interval

CI for mean or median summarizes
Central value + uncertainty about it
Prediction interval summarizes variability of
data

36
Prediction Interval based on Order Statistic

Assume data comes from an iid model
Simplest result (not well known, though)

37
Prediction Interval for small n

For n = 39, [xmin, xmax] is a prediction interval
at level 95%
For n < 39 there is no prediction interval at
level 95% with this method
But there is one at level 90% for n > 18
For n = 10 we have a prediction interval [xmin,
xmax] at level 81%
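
These levels follow from the order statistic result that, for iid data with a continuous distribution, a new sample falls between the j-th and k-th smallest values with probability (k - j) / (n + 1). For [xmin, xmax] this gives (n - 1) / (n + 1). A short sketch with hypothetical function names:

```python
from fractions import Fraction

def minmax_prediction_level(n):
    """Level of [xmin, xmax] as a prediction interval for one new sample."""
    return Fraction(n - 1, n + 1)

def minmax_prediction_interval(data):
    """Return (xmin, xmax, level), assuming iid data without ties."""
    return min(data), max(data), minmax_prediction_level(len(data))

for n in (10, 19, 39):
    print(n, float(minmax_prediction_level(n)))
```

The three values of n reproduce the levels quoted above: 9/11 (about 81%), 9/10 and 19/20.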

38
Prediction Interval for Small n and Normal Data
Sample
39
Re-Scaling

Many results are simple if the data is normal, or
close to it (i.e. not wild). An important
question to ask is: can I change the scale of my
data to have it look more normal?
Ex: log of the data instead of the data
A generic transformation used in statistics is
the Box-Cox transformation
Continuous in s
s = 0: log
s = -1: 1/x
s = 1: identity
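
A minimal sketch of the Box-Cox transformation, assuming the standard form (x^s - 1) / s for s ≠ 0 and log x for s = 0, with x > 0. Under this form the cases s = -1 and s = 1 match 1/x and the identity only up to a shift and sign (1 - 1/x and x - 1), which does not change the shape of the re-scaled data. The function name is hypothetical.

```python
from math import log

def box_cox(x, s):
    """Box-Cox transform of a positive value x with exponent s."""
    if s == 0:
        return log(x)
    return (x ** s - 1) / s

print(box_cox(2.0, 0))     # log 2
print(box_cox(2.0, 1e-6))  # close to log 2: continuous in s
print(box_cox(2.0, -1))    # 1 - 1/2
print(box_cox(3.0, 1))     # 3 - 1
```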

40
Prediction Intervals for File Transfer Times
mean and standard deviation
order statistic
mean and standard deviation on rescaled data
41
Take Home Message

The interpretation of σ as a measure of
variability is meaningful if the data is normal
(or close to normal). Else, it is misleading, and
the data is best re-scaled.

42
Non-standard Means

Geometric, etc. means were invented for cases
where the data does not look normal; they
correspond to re-scaling
Compare to prediction interval: the exponential
of a prediction interval for the log of the data
is a prediction interval for the data
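
Both statements can be illustrated with a log re-scaling. The sketch below computes the geometric mean as the exponential of the mean of the logs, and a prediction interval for the data as the exponential of mean ± η x std dev computed on the logs. It assumes positive data whose logs are close to normal and an n large enough to ignore estimation uncertainty; function names are hypothetical.

```python
from math import exp, log
from statistics import NormalDist, fmean, pstdev

def geometric_mean(data):
    """exp of the arithmetic mean of the logs (data must be positive)."""
    return exp(fmean([log(x) for x in data]))

def log_scale_prediction_interval(data, level=0.95):
    """Prediction interval built on the log scale, mapped back with exp."""
    logs = [log(x) for x in data]
    m, s = fmean(logs), pstdev(logs)  # assumption: 1/n standard deviation
    eta = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return exp(m - eta * s), exp(m + eta * s)

samples = [1, 10, 100]
print(geometric_mean(samples))
print(log_scale_prediction_interval(samples))
```

The resulting interval is symmetric around the geometric mean in ratio, not in difference.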

43
Which Summarization Should I Use ?

Two issues
Robustness to outliers
Compactness

44
Outlier in File Transfer Time
45
Robustness of Conf/Prediction Intervals
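(Figure: confidence and prediction intervals, each
shown with outlier removed and with outlier present.
Confidence intervals: mean and std dev, CI for
median, geom mean.
Prediction intervals: based on mean and std dev,
order stat, based on mean and std dev with
re-scaling.)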
46
Fairness Indices

Confidence Intervals obtained by Bootstrap
How ?
JFI is very dependent on one outlier
As expected, since JFI is essentially CoV, i.e.
standard deviation
Gap is sensitive, but less
Does not use squaring; why?

47
Compactness

If normal assumption (or, for CI, asymptotic
regime) holds, μ and σ are more compact
two values give both CIs at all levels,
prediction intervals
Derived indices: CoV, JFI
In contrast, CIs for median do not give
information on variability
Prediction interval based on order statistic is
robust (and, IMHO, best)

48
Take-Home Message