Data Analysis Quantitative Methods - PowerPoint PPT Presentation

Slides: 36
Provided by: scet8
Transcript and Presenter's Notes

Title: Data Analysis Quantitative Methods


1
Data Analysis (Quantitative Methods)
  • Lecture 1
  • Fundamental Statistics

2
What is statistics?
  • Statistics is the science of data. This involves
    collecting, classifying, summarizing, organizing,
    analyzing, and interpreting numerical information.

3
Dealing with Data
  • Measurement Scales
  • Descriptive Statistics
  • Inferential Statistics

4
Dealing with Data
  • Quantitative data are measurements that are
    recorded on a naturally occurring numerical
    scale.
  • Qualitative data are measurements that cannot be
    measured on a natural numerical scale; they can
    only be classified into one of a group of
    categories.

5
Levels of Measurement.
  • Nominal Scale: qualitative category membership,
    e.g. gender, eye colour, nationality.
  • Ordinal Scale: ranks or assigned positions in
    a group, e.g. 1st, 2nd, 3rd.
  • Interval and Ratio Scales: measured on an
    independent scale with units, e.g. the I.Q. scale.
    A ratio scale also has an absolute zero point,
    e.g. distance, the Kelvin scale.

6
Population
  • Population is a set of units (usually, people,
    objects, transactions, or events) that we are
    interested in studying.

7
Sample and Statistical Inference
  • A sample is a subset of the units of a
    population.
  • A statistical inference is an estimate,
    prediction, or some other generalization about a
    population based on information contained in a
    sample.

8
Variables
  • A variable is a characteristic or property of an
    individual population unit (a member of the set
    of units we are interested in studying).
  • Discrete Variables: there are no possible values
    between adjacent units on the scale. For
    example, the number of children in a family.
    X takes one of the values X1, X2, ..., Xn.
  • Continuous Variables: a variable that
    theoretically can have an infinite number of
    values between adjacent units on the scale. For
    example, time, height, weight. X lies in an
    interval such as [0,100], [0,30), (12,80] or (1,2).

9
Descriptive Statistics
Descriptive statistics utilizes numerical and
graphical methods to look for patterns in a data
set, to summarize the information revealed in a
data set, and to represent that information in a
convenient form.
  • Graphical Representation of Data
  • Measures of Central Tendency
  • Measures of Dispersion

10
Representing Data Graphically
  • Bar Charts
  • Histograms
  • Pie Charts
  • Scattergrams

11
The Bar Chart
  • Used for Discrete variables
  • Bars are separated

12
Histogram
  • Columns can only represent frequencies.
  • All categories represented.
  • Columns are not spaced apart.

13
Pie Chart
  • Used to illustrate percentages

14
Scattergrams - Positive Relationships
15
Negative Correlation
16
No Relationship
17
Measures of Central Tendency
  • The Mean
  • The Median
  • The Mode

18
The Mean
Mean = sum of all values in a group divided by
the number of values in that group. So if 5
people took 135, 109, 95, 121, 140 seconds to
solve an anagram, the mean time taken is
(135 + 109 + 95 + 121 + 140) / 5 = 600 / 5 = 120
seconds.
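As a minimal sketch, the same calculation can be written in Python. The variable names are illustrative and not part of the original slides.

```python
# Times (in seconds) taken by 5 people to solve an anagram, from the slide.
times = [135, 109, 95, 121, 140]

# Mean = sum of all values divided by the number of values.
mean_time = sum(times) / len(times)
print(mean_time)  # 120.0
```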
19
The Mean: Pros and Cons
  • Advantages
  • Very sensitive measure.
  • Forms the basis of most tests used in inferential
    statistics.
  • Disadvantages
  • Can be affected by outlying scores, e.g.
  • 135, 109, 95, 121, 140, 480: Mean = 1080/6 = 180
    seconds.

20
The Median
The median is the central value of a set of
numbers that are placed in numerical order.
For an odd number of values, e.g. 95, 109, 121, 135, 140,
the Median is 121.
For an even number of values, e.g. 95, 109, 121, 135,
140, 480, the Median is the sum of the two central
scores divided by 2: (121 + 135)/2 = 128.
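The procedure can be sketched in Python as follows. The helper name `median` is an illustrative choice, and the last lines compare the result with the mean when the outlying score 480 is included.

```python
def median(values):
    """Return the central value of the sorted data (hypothetical helper)."""
    ordered = sorted(values)           # place the numbers in numerical order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                     # odd count: take the single middle value
        return ordered[mid]
    # even count: average the two central values
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_set = [135, 109, 95, 121, 140]
even_set = odd_set + [480]             # same scores plus one outlier

print(median(odd_set))                 # 121
print(median(even_set))                # 128.0
print(sum(even_set) / len(even_set))   # 180.0 (the mean is pulled up by 480)
```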
21
The Median: Pros and Cons
  • Advantages
  • Easier and quicker to calculate than the mean.
  • Unaffected by extreme values.
  • Disadvantages
  • Doesn't take into account the exact values of
    each item.
  • If values are few it can be unrepresentative,
  • e.g. for 2, 3, 5, 98, 112 the median is 5.

22
The Mode
The Mode = the most frequently occurring value.
1, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 6, 6,
7, 7, 7, 8
The Mode = 5
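One way to find the mode is to count how often each value occurs, sketched here with `collections.Counter` from the Python standard library. The variable names are illustrative.

```python
from collections import Counter

values = [1, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 6, 6, 7, 7, 7, 8]

# Count the occurrences of each value, then take the most frequent one.
counts = Counter(values)
mode_value, mode_count = counts.most_common(1)[0]
print(mode_value, mode_count)  # 5 6
```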
23
The Mode: Pros and Cons
  • Disadvantages
  • Doesn't take into account the value of each item.
  • Not useful for small sets of data.
  • Advantages
  • Shows the most important value of a set.
  • Unaffected by extreme values.

24
Data Types and Central Tendency Measures.
The Mode may also be used on Ordinal and Interval
Data. The median may also be used on Interval
Data.
25
Why look at dispersion?
  • 17, 32, 34, 58, 69, 70, 98, 142
  • Mean = 65
  • 61, 62, 64, 65, 65, 66, 68, 69
  • Mean = 65
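A short sketch of why the mean alone is not enough: both data sets have the same mean, but their spread differs greatly. The example uses `statistics.stdev` (sample standard deviation) from the Python standard library as one possible measure of spread, and the variable names are illustrative.

```python
import statistics

wide = [17, 32, 34, 58, 69, 70, 98, 142]
narrow = [61, 62, 64, 65, 65, 66, 68, 69]

# Both data sets share the same mean ...
mean_wide = statistics.mean(wide)
mean_narrow = statistics.mean(narrow)
print(mean_wide, mean_narrow)

# ... but the first is far more spread out than the second.
sd_wide = statistics.stdev(wide)
sd_narrow = statistics.stdev(narrow)
print(sd_wide, sd_narrow)
```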

26
Measures of Dispersion
  • The Range
  • Variance
  • The Standard Deviation

27
The Range
The Range is the difference between the highest
and the lowest scores.
Range = highest score - lowest score
For 4, 10, 5, 12, 6, 14: Range = 14 - 4 = 10
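In Python the range follows directly from the built-in `max` and `min` functions. The helper name `value_range` is an illustrative choice.

```python
def value_range(values):
    """Difference between the highest and the lowest score (hypothetical helper)."""
    return max(values) - min(values)

scores = [4, 10, 5, 12, 6, 14]
print(value_range(scores))  # 10
```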
28
Variance
Population Variance
Sample Variance
29
Calculating the Standard Deviation
Population standard deviation
Sample standard deviation
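The formulas on these two slides did not survive in the transcript. The sketch below therefore assumes the standard textbook definitions: the population variance divides the sum of squared deviations from the mean by N, the sample variance divides it by n - 1, and each standard deviation is the square root of the corresponding variance. The function names are illustrative, and the data are the anagram times used earlier.

```python
import math

def population_variance(values):
    """Sum of squared deviations from the mean, divided by N (assumed definition)."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1 (assumed definition)."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

times = [135, 109, 95, 121, 140]      # mean is 120

pop_var = population_variance(times)  # 1372 / 5
samp_var = sample_variance(times)     # 1372 / 4

# Standard deviation = square root of the variance.
pop_sd = math.sqrt(pop_var)
samp_sd = math.sqrt(samp_var)

print(pop_var, samp_var)
print(pop_sd, samp_sd)
```

The sample versions are always slightly larger than the population versions for the same data, because they divide by n - 1 rather than N.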
30
Inferential Statistics
  • Inferential statistics utilizes sample data to
    make estimates, decisions, predictions, or other
    generalizations about a larger set of data.
  • Inferential statistics allows us to draw
    conclusions about populations, and to test
    research hypotheses.
  • Inferential statistics involves
  • Probability, Distribution Theory,
  • Tests of Hypothesis, etc.

31
Summary
  • All data are measured on either Nominal, Ordinal,
    Interval or Ratio Scales.
  • Variables can be discrete or continuous.
  • Descriptive Statistics such as measures of
    central tendency and dispersion are used to
    describe or characterize data.
  • Inferential Statistics is used to make inferences
    from sample data about the population at large.

32
References
  • Statistics, 8th Edition
  • McClave and Sincich
  • Prentice Hall, 2000.

33
Exercise
  • Briefly explain what is meant by each of the
    following:
  • 1. Statistics
  • 2. Descriptive statistics
  • 3. Inferential statistics
  • 4. Quantitative data
  • 5. Qualitative data
  • 6. Population variance
  • 7. Sample standard deviation

34
Exercise
  • 1. Calculate the mode, mean, and median of the
    following data:
  • 8, 0, 5, 3, 7, 5, 2, 5, 8, 6, 1
  • 2. Calculate the range, variance, and standard
    deviation of the following sample:
  • 6, 2, 3, 4, 3, 1, 4
  • (Answers are on the next slide)

35
Answers
  • 1. Mean = 4.545455
  • Median = 5
  • Mode = 5
  • 2. Range = 6 - 1 = 5
  • Sample Variance = 2.57142857
  • Sample Standard Deviation = 1.60356745
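These answers can be checked with Python's standard `statistics` module. This is a minimal sketch: the variable names are illustrative, and `statistics.variance` and `statistics.stdev` are the sample (n - 1) versions.

```python
import statistics

data_1 = [8, 0, 5, 3, 7, 5, 2, 5, 8, 6, 1]
data_2 = [6, 2, 3, 4, 3, 1, 4]

# Exercise 1: measures of central tendency.
mean_1 = statistics.mean(data_1)      # 50 / 11
median_1 = statistics.median(data_1)
mode_1 = statistics.mode(data_1)

# Exercise 2: measures of dispersion for the sample.
range_2 = max(data_2) - min(data_2)
variance_2 = statistics.variance(data_2)
stdev_2 = statistics.stdev(data_2)

print(round(mean_1, 6), median_1, mode_1)
print(range_2, round(variance_2, 8), round(stdev_2, 8))
```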