CSEM03 REPLI - PowerPoint PPT Presentation

About This Presentation

Slides: 19
Provided by: lynnehu

Transcript and Presenter's Notes


1
CSEM03 REPLI
  • Research and the use of statistical tools

2
Objectives
  • At the end of this lecture you will be able to:
  • Plot graphs and find your way around SPSS
  • Describe the shape of normal and non-normal
    distributions
  • Describe the characteristics of a normal and
    non-normal distribution
  • Mode, median, mean, standard deviation

3
The SPSS statistical tool
  • Recommended text
  • Field, A. (2005) Discovering Statistics Using
    SPSS, 2nd Edition, Sage Publishing, London
    (ISBN 0-7619-4452-4)
  • You can get SPSS for your own machine for £5 from
    the LRC

4
Simple statistical models: mean, sum of squares,
variance and standard deviation
  • Mean
  • We can consider the mean of a sample as one of
    the simplest statistical models because it
    represents a summary of the data.
  • For example
  • How many CDs does a group of students own?
  • If we take 5 students, the numbers of CDs they
    own are 1, 2, 3, 4 and 6.
  • The mean is the sum of these values ($\sum x_i$)
    divided by the number of students ($N$).
  • Mean: $\bar{x} = \frac{\sum x_i}{N} = \frac{16}{5} = 3.2$
  • This is a hypothetical value: you cannot own 0.2
    of a CD.
  • How well does the mean represent the data?
  • What are the differences between the observed
    values and the mean?
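
As a minimal sketch of the calculation above, the following Python snippet computes the mean of the five CD counts. Python is used only for illustration (the lecture itself works in SPSS), and the variable names are assumptions.

```python
# CD counts for the five students in the example
cds = [1, 2, 3, 4, 6]

total = sum(cds)   # sum of the observations
n = len(cds)       # number of students
mean = total / n   # the mean as a simple model of the data

print(total, n, mean)
```

The printed mean is a summary of the sample, not a value any student actually owns.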

5
Differences between the mean and the observed data
Figure 1: Differences between the observed number of
CDs and the mean
6
Cancelling out the errors in the data (deviation
from the mean)
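  • $x_1 - \bar{x} = 1 - 3.2 = -2.2$
  • where $x_i$ is the $i$-th data point (here the
    first data point, $x_1$) and $\bar{x}$ is the mean
  • so $x_2 - \bar{x} = 2 - 3.2 = -1.2$, etc.
  • The deviances are -2.2, -1.2, -0.2, 0.8 and
    2.8
  • Total error $= \sum (x_i - \bar{x})$ (the sum of
    the deviances) $= 0$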
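
The cancelling-out of the deviances can be checked with a short Python sketch. The variable names are illustrative assumptions, and the values are rounded because floating-point subtraction leaves tiny residues.

```python
cds = [1, 2, 3, 4, 6]
mean = sum(cds) / len(cds)

# deviation of each observation from the mean
deviances = [x - mean for x in cds]

# round to one decimal place to hide floating-point noise
rounded = [round(d, 1) for d in deviances]
total_error = sum(deviances)

print(rounded)
print(abs(total_error) < 1e-9)  # True when the deviances cancel out
```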

7
Sum of Squared errors (SS)
  • So the total error between our model and the
    observed data is zero. Some errors are negative
    and some are positive, and they cancel each other
    out. To avoid the problem of the direction of the
    error (e.g. in a large dataset) we square each
    error (a negative number squared becomes
    positive) and add the squared errors together.
  • This total is called the Sum of Squared errors
    (SS) and is a good measure of the accuracy of our
    model. However, it depends on the amount of data
    collected: the more data points, the higher the
    SS. To overcome this we average the error by
    dividing SS by the number of observations, N.
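
A minimal sketch of the sum of squared errors for the CD data, followed by the average obtained by dividing by N. The names `ss` and `average_error` are assumptions made for this illustration.

```python
cds = [1, 2, 3, 4, 6]
n = len(cds)
mean = sum(cds) / n

# squaring removes the sign, so errors can no longer cancel
squared_errors = [(x - mean) ** 2 for x in cds]
ss = sum(squared_errors)

# dividing by N makes the measure comparable across sample sizes
average_error = ss / n

print(round(ss, 2), round(average_error, 2))
```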

8
  • A more useful statistic uses the error in the
    sample to estimate the error in the population.
    This is done by dividing SS by the number of
    observations minus 1 (N - 1).
  • This measure is known as the variance.

9
Variance and Standard Deviation
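  • Variance: $s^2 = \frac{SS}{N-1} = \frac{\sum (x_i - \bar{x})^2}{N-1}$
  • $\sum (x_i - \bar{x})^2 = 4.84 + 1.44 + 0.04 + 0.64 + 7.84 = 14.8$
  • $s^2 = \frac{\sum (x_i - \bar{x})^2}{N-1} = \frac{14.8}{4} = 3.7$
  • From this statistic we can derive a very useful
    measure called the Standard Deviation (SD)
  • $SD = \sqrt{\text{Variance}} = \sqrt{3.7} = 1.92$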
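
The variance and standard deviation of the CD data can be reproduced with Python's standard `statistics` module, whose `variance` and `stdev` functions divide by N - 1. This is only a sketch for checking the arithmetic. The manual calculation is included for comparison.

```python
import math
import statistics

cds = [1, 2, 3, 4, 6]
n = len(cds)
mean = sum(cds) / n

# manual calculation: SS divided by N - 1
ss = sum((x - mean) ** 2 for x in cds)
variance_manual = ss / (n - 1)
sd_manual = math.sqrt(variance_manual)

# library calculation (sample variance and sample standard deviation)
variance_lib = statistics.variance(cds)
sd_lib = statistics.stdev(cds)

print(round(variance_manual, 2), round(sd_manual, 2))
print(round(variance_lib, 2), round(sd_lib, 2))
```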

10
Levels of data
  • The type of data you collect will depend on the
    design of your study. Data can be measured on
    different scales
  • Interval data
  • These data are measured on a scale along which
    intervals are equal. For example if you record
    ratings of a pop video on a range of 1 to 5 the
    change between each number should be equal.
  • Categorical data
  • These are any variables that are made up of
    objects/entities. For example, the UK degree
    classification system comprises 1, 2:1, 2:2, 3,
    pass or fail. The interval between each class is
    not equal.
  • Nominal data
  • This is where numbers represent names, e.g. the
    numbers on the back of football shirts denote
    the type of player (1 = goalkeeper).
  • Measurement data
  • The objects being studied are measured on a
    quantitative scale. The data can be discrete or
    continuous. With discrete measurement data only
    certain values are possible. Examples of
    continuous measurement data are age, height and
    cholesterol level.
  • Ordinal data
  • A type of categorical data where the order is
    important, e.g. degree classification or
    seriousness of illness.

11
Median
  • The median is the "middle value" of a list: the
    smallest number such that at least half the
    numbers in the list are no greater than it. If
    the list has an odd number of entries, the median
    is the middle entry in the list after sorting the
    list into increasing order.
  • If the list has an even number of entries, the
    median is the sum of the two middle numbers
    (after sorting) divided by two.
  • The median can be estimated from a histogram by
    finding the smallest number such that the area
    under the histogram to the left of that number is
    50% of the total area.
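
The odd and even cases can be sketched in Python as follows. The helper `median_of` is a hypothetical name written for this illustration, and the six-value list adds an invented sixth student to the CD data. The standard library's `statistics.median` follows the same rule and is used as a cross-check.

```python
import statistics

def median_of(values):
    """Middle value of a list, averaging the two middle entries if needed."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # odd number of entries: take the middle entry
        return ordered[middle]
    # even number of entries: average the two middle entries
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_list = [1, 2, 3, 4, 6]        # CD data from the lecture
even_list = [1, 2, 3, 4, 6, 7]    # illustrative only: one extra value added

print(median_of(odd_list), statistics.median(odd_list))
print(median_of(even_list), statistics.median(even_list))
```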

12
Mode
  • For lists, the mode is the most common (frequent)
    value. A list can have more than one mode. In a
    histogram, a mode is the most frequently
    occurring interval (seen as a bump).
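
A short sketch of finding the mode of a list, including the case of more than one mode. The data are invented for this illustration. `statistics.multimode` requires Python 3.8 or later.

```python
from collections import Counter
from statistics import multimode

values = [3, 4, 4, 5, 7, 7]   # illustrative list, not from the lecture

counts = Counter(values)      # how often each value occurs
modes = multimode(values)     # every value tied for the highest count

print(counts.most_common())
print(modes)
```

Here both 4 and 7 occur twice, so the list has two modes.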

13
Normal distribution example (scores in a test)
14
Positive skew
15
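[Figure: normal distribution curve. The horizontal
axis is centred on the mean (μ) and marked at -3σ,
-2σ, -1σ, 1σ, 2σ and 3σ, where σ is the standard
deviation. Area under the curve on each side of the
mean: 34.1% within 1σ, 13.6% between 1σ and 2σ, 2.1%
between 2σ and 3σ, and 0.1% beyond 3σ.]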
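
The percentages 34.1, 13.6, 2.1 and 0.1 shown for the normal curve can be checked with the standard library's `statistics.NormalDist` (Python 3.8 or later). This is a sketch under the assumption that the figures are the areas under a normal curve in bands one standard deviation wide. The helper `band_percent` is a hypothetical name.

```python
from statistics import NormalDist

# standard normal: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def band_percent(lower, upper):
    """Percentage of values between lower and upper, measured in SDs."""
    return 100 * (standard_normal.cdf(upper) - standard_normal.cdf(lower))

bands = {
    "mean to 1 SD": band_percent(0, 1),
    "1 SD to 2 SD": band_percent(1, 2),
    "2 SD to 3 SD": band_percent(2, 3),
    "beyond 3 SD": 100 * (1 - standard_normal.cdf(3)),
}

for label, percent in bands.items():
    print(f"{label}: {percent:.1f}%")
```

Because the curve is symmetric, the same percentages apply on both sides of the mean.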
16
Find the mean, median, mode, and range of these
data.
  • Example: Three dice are rolled 12 times. The sum
    of the numbers after each roll is recorded
    below.
  • Numbers rolled:
  • 12, 11, 3, 7, 4, 4, 17, 13, 12, 5, 8, 12
  • Step 1: Rearrange the data elements in
    increasing order.
  • 3, 4, 4, 5, 7, 8, 11, 12, 12, 12, 13, 17
  • Step 2: Find the mean. Add all the numbers and
    divide by 12.
  • Mean $= 108/12 = 9$
  • Step 3: Find the median. The sample size is
    even, for there are 12 data elements, so the
    median is the average of the sixth and seventh
    elements.
  • Median $= (8 + 11)/2 = 9.5$

17
  • Step 4: Find the mode.
  • The numbers 3, 5, 7, 8, 11, 13 and 17 each occur
    once. The number 4 occurs twice. The number 12
    occurs three times.
  • Mode $= 12$
  • Step 5: Find the range. The highest value is 17
    and the lowest value is 3.
  • Range $= 17 - 3 = 14$
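
The five steps can be sketched in Python using the twelve recorded sums. The variable names are assumptions, and the standard `statistics` module stands in for the hand calculation.

```python
import statistics

rolls = [12, 11, 3, 7, 4, 4, 17, 13, 12, 5, 8, 12]

ordered = sorted(rolls)                # Step 1: rearrange the data
mean = sum(rolls) / len(rolls)         # Step 2: sum divided by 12
median = statistics.median(rolls)      # Step 3: average of 6th and 7th values
mode = statistics.mode(rolls)          # Step 4: most frequent value
data_range = max(rolls) - min(rolls)   # Step 5: highest minus lowest

print(ordered)
print(sum(rolls), mean, median, mode, data_range)
```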

18
Examples for practice
  • The weekly salaries of six employees at McDonald's
    are 70, 100, 90, 80, 70, 100. For these six
    salaries, find (a) the mean, (b) the median and
    (c) the mode.
  • List the data in order: 70, 70, 80, 90, 100, 100
  • Mean $= (70 + 70 + 80 + 90 + 100 + 100)/6 = 510/6 = 85$
  • Median: the two numbers that fall in the middle
    of 70, 70, 80, 90, 100, 100 need to be averaged:
    $(80 + 90)/2 = 85$
  • Mode: the numbers that appear most often are 70
    and 100, which each appear twice, so there are
    two modes.
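
As a way of checking practice answers, the following sketch applies Python's `statistics` module to the six salaries listed in the question. `multimode` (Python 3.8 or later) is used rather than `mode` so that tied values are all reported.

```python
from statistics import mean, median, multimode

salaries = [70, 100, 90, 80, 70, 100]   # values as listed in the question

ordered = sorted(salaries)
salary_mean = mean(salaries)
salary_median = median(salaries)
salary_modes = multimode(salaries)      # all values tied for most frequent

print(ordered)
print(salary_mean, salary_median, salary_modes)
```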