Chapter 2 Descriptive Statistics II: Numerical Methods Part B - PowerPoint PPT Presentation


Provided by: davidr138

1
Chapter 2 Descriptive Statistics II: Numerical
Methods - Part B

Measures of Relative Location and Detecting
Outliers
Exploratory Data Analysis
Measures of Association between Two Variables
The Weighted Mean and Working with Grouped Data

2
Measures of Relative Location and Detecting
Outliers

z-Scores
The Empirical Rule
Detecting Outliers

3
z-Scores

The z-score is often called the standardized
value.
It denotes the number of standard deviations a
data value x_i is from the mean.
A data value less than the sample mean will have
a z-score less than zero.
A data value greater than the sample mean will
have a z-score greater than zero.
A data value equal to the sample mean will have a
z-score of zero.
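
The slide's formula did not survive in this transcript. The standard definition consistent with the statements above is sketched below; the symbols \bar{x} (sample mean) and s (sample standard deviation) are assumed notation, not quoted from the slide.

```latex
z_i = \frac{x_i - \bar{x}}{s}
```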

4
Example: Apartment Rents

z-Score of Smallest Value (425)
Standardized Values for Apartment Rents
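
The worked calculation is missing from the transcript. As a rough sketch, it can be reproduced in Python. The mean and standard deviation used here are not stated on this slide: they are back-calculated from the one-standard-deviation interval (436.06 to 545.54) reported in the Empirical Rule example, so treat them as inferred values. The function name `z_score` is illustrative only.

```python
# Inferred from the interval 436.06 to 545.54 (mean +/- 1s); not stated on this slide.
lower, upper = 436.06, 545.54
mean = (lower + upper) / 2   # midpoint of the interval
s = (upper - lower) / 2      # half-width of the interval


def z_score(x, mean, s):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / s


z_smallest = z_score(425, mean, s)
print(round(z_smallest, 2))
```

Rounded to two decimals, the result agrees with the -1.20 that the deck later reports as the most extreme negative z-score.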

5
The Empirical Rule

For data having a bell-shaped distribution:
Approximately 68% of the data values will be
within one standard deviation of the mean.
Approximately 95% of the data values will be
within two standard deviations of the mean.
Almost all of the items (99.7%) will be
within three standard deviations of the mean.

6
Example: Apartment Rents

The Empirical Rule
Interval         Range               Number in Interval   % in Interval
Within +/- 1s    436.06 to 545.54    48/70                69%
Within +/- 2s    381.32 to 600.28    68/70                97%
Within +/- 3s    326.58 to 655.02    70/70                100%
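
The table's arithmetic can be checked with a short sketch. The mean (490.80) and standard deviation (54.74) are assumptions inferred from the midpoint and half-width of the one-standard-deviation interval; the counts come from the table itself. The helper name `interval` is hypothetical.

```python
# Mean and s are inferred from the +/- 1s interval in the table, not stated directly.
mean, s, n = 490.80, 54.74, 70
counts_within = {1: 48, 2: 68, 3: 70}  # rents within k standard deviations (from the table)


def interval(mean, s, k):
    """Return the (lower, upper) bounds of mean +/- k standard deviations."""
    return (round(mean - k * s, 2), round(mean + k * s, 2))


for k, count in counts_within.items():
    low, high = interval(mean, s, k)
    share = 100 * count / n
    print(f"+/- {k}s: {low:.2f} to {high:.2f}, {count}/{n} = {share:.0f}%")
```

The observed shares (69%, 97%, 100%) sit close to the 68%, 95%, and 99.7% that the empirical rule predicts for bell-shaped data.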

7
Detecting Outliers

An outlier is an unusually small or unusually
large value in a data set.
A data value with a z-score less than -3 or
greater than 3 might be considered an outlier.
It might be an incorrectly recorded data value.
It might be a data value that was incorrectly
included in the data set.
It might be a correctly recorded data value that
belongs in the data set!
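
A minimal sketch of the z-score screen described above, assuming the sample standard deviation is used. The data and the function name `flag_outliers` are hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def flag_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 divisor)
    return [x for x in values if abs((x - m) / s) > threshold]


# Hypothetical data: twenty values near 50 plus one extreme value.
data = [50, 52, 48, 51, 49] * 4 + [200]
print(flag_outliers(data))
```

A flagged value is a candidate for review, not automatic deletion: as the slide notes, it may be a correctly recorded value that belongs in the data set.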

8
Example: Apartment Rents

Detecting Outliers
The most extreme z-scores are -1.20 and 2.27.
Using |z| > 3 as the criterion for an outlier,
there are no outliers in this data set.
Standardized Values for Apartment Rents

9
Exploratory Data Analysis

Five-Number Summary
Smallest Value
First Quartile
Median
Third Quartile
Largest Value
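
As a sketch, the five values can be computed with Python's standard library. Quartile conventions differ between textbooks and software; `method="inclusive"` is an assumption here and may not match the rule used in the deck, so it will not necessarily reproduce the deck's quartiles for the rent data. The sample data and the function name are hypothetical.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest) for a list of numbers."""
    ordered = sorted(values)
    # "inclusive" is one of several quartile conventions; chosen here as an assumption.
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return (ordered[0], q1, median, q3, ordered[-1])


# Hypothetical data, not the apartment rents.
print(five_number_summary([7, 1, 5, 3, 9]))
```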

10
Example: Apartment Rents

Five-Number Summary
Lowest Value = 425
First Quartile = 445
Median = 475
Third Quartile = 525
Largest Value = 615

11
Measures of Association between Two Variables

Covariance
Correlation Coefficient

12
Covariance

The covariance is a measure of the linear
association between two variables.
Positive values indicate a positive relationship.
Negative values indicate a negative relationship.

13
Covariance

If the data sets are samples, the covariance is
denoted by s_xy.
If the data sets are populations, the covariance
is denoted by σ_xy.
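
The formulas themselves are missing from the transcript. The standard definitions are sketched below, with the population covariance conventionally written \sigma_{xy}. The symbols \bar{x}, \bar{y} (sample means), \mu_x, \mu_y (population means), n (sample size), and N (population size) are assumed notation.

```latex
s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}
\qquad
\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}
```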

14
Correlation Coefficient

The coefficient can take on values between -1 and
1.
Values near -1 indicate a strong negative linear
relationship.
Values near 1 indicate a strong positive linear
relationship.
The formula is complex, and we'll use Excel to do
the mathematics for us.
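
For reference, the slide does not show the formula. Assuming it refers to the usual Pearson product-moment correlation for sample data, it can be written in terms of the sample covariance s_{xy} and the sample standard deviations s_x and s_y:

```latex
r_{xy} = \frac{s_{xy}}{s_x \, s_y}
```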

15
Using Excel to Compute the Covariance and
Correlation Coefficient

Formula Worksheet

16
Using Excel to Compute the Covariance and
Correlation Coefficient

Value Worksheet

17
Excel, covariance and correlation

Excel calculates a population covariance
Excel calculates a sample correlation
It is usually necessary to correct the covariance
to be a sample covariance by multiplying by n/(n-1):
COVAR(array1,array2) * n / (n-1), where n / (n-1) is the
correction factor
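
The correction can be illustrated outside Excel with a small Python sketch. The function names and the paired data are hypothetical; `population_covariance` stands in for the n-divisor calculation the slide attributes to COVAR.

```python
from math import sqrt


def population_covariance(x, y):
    """Covariance with an n divisor (the population form)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n


def sample_covariance(x, y):
    """Population covariance scaled by the correction factor n / (n - 1)."""
    n = len(x)
    return population_covariance(x, y) * n / (n - 1)


def correlation(x, y):
    """Pearson correlation; the divisor cancels, so no correction is needed."""
    return population_covariance(x, y) / sqrt(
        population_covariance(x, x) * population_covariance(y, y)
    )


# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(population_covariance(x, y), sample_covariance(x, y), correlation(x, y))
```

Because the same divisor appears in the numerator and denominator of the correlation, it cancels, which is why only the covariance needs the n / (n - 1) adjustment.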

18
The Weighted Mean and Working with Grouped Data

The Weighted Mean
Mean for Grouped Data
Variance for Grouped Data
Standard Deviation for Grouped Data

19
The Weighted Mean

When the mean is computed by giving each data
value a weight that reflects its importance, it
is referred to as a weighted mean.
In the computation of a grade point average
(GPA), the weights are the number of credit hours
earned for each grade.
When data values vary in importance, the analyst
must choose the weight that best reflects the
importance of each value.

20
The Weighted Mean

\bar{x}_{wt} = \frac{\sum w_i x_i}{\sum w_i}
where
x_i = value of observation i
w_i = weight for observation i
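
A minimal sketch of the weighted mean, using the GPA case mentioned earlier in the deck. The grade points, credit hours, and the function name `weighted_mean` are hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Hypothetical record: grade points and the credit hours earned for each grade.
grade_points = [4.0, 3.0, 2.0]
credit_hours = [3, 4, 3]
gpa = weighted_mean(grade_points, credit_hours)
print(gpa)
```

Here the numerator is 3(4.0) + 4(3.0) + 3(2.0) = 30 and the weights sum to 10, giving a GPA of 3.0.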

21
Grouped Data

The weighted mean computation can be used to
obtain approximations of the mean, variance, and
standard deviation for grouped data.
To compute the weighted mean, we treat the
midpoint of each class as though it were the mean
of all items in the class.
We compute a weighted mean of the class midpoints
using the class frequencies as weights.
Similarly, in computing the variance and standard
deviation, the class frequencies are used as
weights.
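
The grouped-data approximation can be sketched as follows. The class midpoints and frequencies are hypothetical, and the n - 1 divisor assumes the grouped data are a sample; a population calculation would divide by the total count instead.

```python
from math import sqrt


def grouped_stats(midpoints, frequencies):
    """Approximate mean, sample variance, and standard deviation from class data."""
    n = sum(frequencies)
    # Each class midpoint stands in for every item in that class.
    mean = sum(f * m for m, f in zip(midpoints, frequencies)) / n
    # n - 1 divisor: assumes the grouped data are a sample.
    variance = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, frequencies)) / (n - 1)
    return mean, variance, sqrt(variance)


# Hypothetical classes: midpoints 10, 20, 30 with frequencies 2, 5, 3.
mean, variance, std_dev = grouped_stats([10, 20, 30], [2, 5, 3])
print(mean, variance, std_dev)
```

The results are approximations because the individual values within each class are replaced by the class midpoint.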

22
Example: Apartment Rents