Psychology 9 - PowerPoint PPT Presentation


Title: Psychology 9


1
Psychology 9
  • Quantitative Methods in Psychology
  • Jack Wright
  • Brown University
  • Session 7

Note. These lecture materials are intended
solely for the private use of Brown University
students enrolled in Psychology 9, Spring
Semester, 2002-03. All other uses, including
duplication and redistribution, are unauthorized.

2
Agenda
  • Working with percentiles
  • Central tendencies
  • Variability
  • Assignment: remainder of Chapter 4

3
Percentiles: reprise and preview
  • Useful way of describing how high X is relative
    to the sample in question
  • To say X is at the 90th percentile is to say 90% of
    the sample falls at or below X
  • Percentiles provide the foundation for many of
    the methods we will develop later
  • E.g., making inferences about normal distributions
  • Making inferences about sample means

4
Finding percentiles using summary frequency
distributions
X      f   cf    %   c%
1-2    1    1   10   10
3-4    2    3   20   30
5-6    4    7   40   70
7-8    2    9   20   90
9-10   1   10   10  100
(Figure: cases by interval)
  • Note
  • 1. Intervals extend to upper and lower real
    limits (e.g., 4.5-6.5)
  • E.g., percentile rank for 6.5 = 70.
  • For other values of X, you must interpolate
  • E.g., what is the percentile rank for X = 5.5?
  • See next slide

5
Working with percentiles
X      f   cf    %   c%
1-2    1    1   10   10
3-4    2    3   20   30
5-6    4    7   40   70
7-8    2    9   20   90
9-10   1   10   10  100
(Figure: cases by interval)
  • What is the percentile rank for X = 5.5?
  • How far into the interval: (5.5 - 4.5)/2 = 1/2
    = .5
  • How many cases that far in: .50 × 4 = 2
  • Get cumulative percentage: (3 + 2)/10 = 5/10
    = 50%
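
The interpolation steps above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `percentile_rank` and the real limits for each interval (0.5-2.5, 2.5-4.5, and so on) are assumptions, extrapolated from the 4.5-6.5 example given for the 5-6 interval.

```python
def percentile_rank(x, intervals):
    """Percentile rank of x in a grouped frequency table (linear interpolation).

    intervals: list of (lower_real_limit, upper_real_limit, frequency),
    sorted in ascending order. Hypothetical helper for illustration.
    """
    n = sum(f for _, _, f in intervals)
    cases_below = 0.0
    for lower, upper, f in intervals:
        if x >= upper:
            # The whole interval lies at or below x.
            cases_below += f
        elif x > lower:
            # x falls inside this interval: take a proportional share of its cases.
            fraction = (x - lower) / (upper - lower)
            cases_below += fraction * f
            break
        else:
            break
    return 100 * cases_below / n


# Assumed real limits for the intervals 1-2, 3-4, 5-6, 7-8, 9-10.
table = [(0.5, 2.5, 1), (2.5, 4.5, 2), (4.5, 6.5, 4), (6.5, 8.5, 2), (8.5, 10.5, 1)]

print(percentile_rank(5.5, table))  # 50.0
print(percentile_rank(6.5, table))  # 70.0
```

The sketch assumes cases are spread evenly within each interval, which is the same assumption the hand calculation makes.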

6
A less obvious case of percentiles (optional)
X      f   cf    %   c%
1-2    1    1   10   10
3-4    2    3   20   30
5-6    4    7   40   70
7-8    2    9   20   90
9-10   1   10   10  100
(Figure: cases by interval)
  • What is the percentile rank for X = 5?
  • How far into the interval: (5 - 4.5)/2 = .5/2
    = .25
  • How many cases that far in: .25 × 4 = 1
  • Get cumulative percentage: (3 + 1)/10 = 4/10
    = 40%

7
Reversing the problem: find X for some percentile
X      f   cf    %   c%
1-2    1    1   10   10
3-4    2    3   20   30
5-6    4    7   40   70
7-8    2    9   20   90
9-10   1   10   10  100
(Figure: cases by interval)
  • What score would put you at the 70th percentile?
  • Answer: when the exact percentile is already available,
  • simply extract the value of X using true limits
  • X (at 70th percentile) = 6.5

8
A less obvious case of finding X (optional)
X      f   cf    %   c%
1-2    1    1   10   10
3-4    2    3   20   30
5-6    4    7   40   70
7-8    2    9   20   90
9-10   1   10   10  100
(Figure: cases by interval, with the unknown X marked "?")
N = 10
  • What score would put you at the 80th percentile?
  • Get the n needed to reach that percentile:
  • .80 × N = 8, or 1 case into the next interval
  • Get how much distance to cover in the interval to get
    this n:
  • (1 / freq. in interval) × interval width = (1/2) × 2
    = 1
  • Get X by adding to the lower limit: X = 6.5 + 1
    = 7.5
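
The reverse lookup can be sketched the same way. This is an illustrative sketch only: `score_at_percentile` is a hypothetical name, and the real limits in `table` are assumed from the 4.5-6.5 example rather than stated for every interval.

```python
def score_at_percentile(p, intervals):
    """Score X at the p-th percentile of a grouped frequency table.

    intervals: list of (lower_real_limit, upper_real_limit, frequency),
    sorted in ascending order. Hypothetical helper for illustration.
    """
    n = sum(f for _, _, f in intervals)
    target = p * n / 100  # number of cases that must fall at or below X
    cumulative = 0
    for lower, upper, f in intervals:
        if f > 0 and cumulative + f >= target:
            needed = target - cumulative  # cases still needed from this interval
            return lower + (needed / f) * (upper - lower)
        cumulative += f
    return intervals[-1][1]  # p = 100: upper real limit of the last interval


# Assumed real limits for the intervals 1-2, 3-4, 5-6, 7-8, 9-10.
table = [(0.5, 2.5, 1), (2.5, 4.5, 2), (4.5, 6.5, 4), (6.5, 8.5, 2), (8.5, 10.5, 1)]

print(score_at_percentile(80, table))  # 7.5
print(score_at_percentile(70, table))  # 6.5
```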

9
Note about text
  • Page 123, problem 3
  • Frequency distribution is incorrect
  • Interval is missing
  • Answer 4 is incorrect and/or is the solution to a
    table other than the table in problem 3.

10
From graphical to numeric summaries
  • So far, considered graphical methods
  • Advantage: rich, flexible, capitalize on power of
    visual cortex
  • Disadvantage: cumbersome, inefficient
  • Descriptive statistics and numeric summaries
  • Aim: to summarize central tendency, variability,
    shape
  • To do so more efficiently than graphical methods
  • Tradeoff
  • Easier to be misled if we are careless
  • Easier to forget what we are measuring

11
Three measures of central tendency
Mode: the most common value(s); the most
populous interval(s)
0l | 1
0t | 2 3
0f | 4 4 4 5
0s | 6 6 7
0h | 8
1l | 1
1t |
Median: the value that divides the sample into 2
equal parts (50th percentile) = 4.5
Mean = ΣX/N = 61/12 = 5.08
Note: decimal point is 1 digit to the right of the stem.
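
As a quick check, the three measures can be computed with Python's standard `statistics` module. The list `scores` is a reading of the stem-and-leaf display and is an assumption: in particular, the largest value is taken to be 11, which is the value consistent with ΣX = 61 and N = 12.

```python
from statistics import mean, median, multimode

# Scores as read from the stem-and-leaf display (assumed reading; see lead-in).
scores = [1, 2, 3, 4, 4, 4, 5, 6, 6, 7, 8, 11]

print(multimode(scores))        # [4]
print(median(scores))           # 4.5
print(round(mean(scores), 2))   # 5.08
```

`multimode` returns a list because the mode need not be unique.
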
12
Footnote on Mode
  • Two uses
  • 1. most frequent exact value
  • e.g., 4 in last example
  • useful with discrete variables
  • 2. most populous interval
  • e.g., interval 4-5 in last example
  • interval interpretation necessary when using
    continuous variables

13
Strengths and weaknesses of these measures: Ex. 1
Modes: 2.298, 2.310
2.29h | 8 8 8 8 8 8
2.30l | 1 1
2.30t |
2.30f |
2.30s |
2.30h | 9
2.31l | 0 0 0 0 0 0
2.31t | 2
Mean = 2.304
Median = 2.305
Note: neither mean nor median is a good
description of central tendency because there is
no one central tendency.
14
Strengths and weaknesses: Ex. 2
2.29h | 8 8 8 8 8 8
2.30l | 1 1
2.30t |
2.30f |
2.30s |
2.30h | 9
2.31l | 0 0 0 0 0 0
2.31t | 2
2.31f |
. . . etc. . . .
5.00  | 0
Modes: 2.298, 2.310. Not changed.
Median = 2.309. Almost no change.
Mean = 2.46. Now lies outside the range of all
but one value.
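
The effect of the single extreme score can be reproduced in a few lines of Python. The lists `ex1` and `ex2` are an assumed reading of the two stem-and-leaf displays (16 values, then the same values plus 5.000).

```python
from statistics import mean, median

# Assumed reading of the Ex. 1 display: stems like 2.29h with leaf 8 give 2.298.
ex1 = [2.298] * 6 + [2.301] * 2 + [2.309] + [2.310] * 6 + [2.312]
# Ex. 2 adds one extreme score.
ex2 = ex1 + [5.000]

for label, data in (("Ex. 1", ex1), ("Ex. 2", ex2)):
    print(label, "mean:", round(mean(data), 3), "median:", round(median(data), 3))
# Ex. 1 mean: 2.304 median: 2.305
# Ex. 2 mean: 2.463 median: 2.309
```

The median moves by 0.004 while the mean moves by about 0.16, which is the robustness contrast the two examples illustrate.
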
15
Summary
  • Mode
  • Strengths
  • Easy to identify (usually)
  • When based on intervals, will change as intervals
    are changed
  • Weaknesses
  • Not necessarily unique
  • Not necessarily central

16
Summary
  • Median
  • Strengths
  • Uses more information than mode
  • Not disturbed by extreme scores (robust)
  • Weaknesses
  • Does not use all information in sample: ranks
    matter, but not distances
  • Awkward to compute (e.g., when ties exist)
  • Insensitive to bimodality
  • Medians of two samples may not be combined to
    determine median of the combined samples.

17
Monday 6.10.03 ended here
18
Summary
  • Mean
  • Strengths
  • Uses all of the information in sample: distances
    matter
  • Mathematically appealing
  • E.g., means of two samples can be combined to
    determine mean of the combined samples
  • Weaknesses
  • Disturbed by extreme scores (not robust)
  • Therefore, can be misleading when data are skewed
  • Like median, insensitive to bimodality
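
The point about combining samples can be illustrated with a short Python sketch. The samples `a`, `b`, and `c` are made-up numbers chosen for illustration, and `combined_mean` is a hypothetical helper, not something from the lecture.

```python
from statistics import mean, median


def combined_mean(mean_a, n_a, mean_b, n_b):
    """Mean of two pooled samples, from their means and sizes alone."""
    return (n_a * mean_a + n_b * mean_b) / (n_a + n_b)


a = [1, 2, 9]      # mean 4, median 2
c = [1, 2, 3]      # mean 2, median 2 (same median and size as a)
b = [3, 4, 5, 20]  # mean 8, median 4.5

# Means combine: the weighted mean equals the mean of the pooled data.
print(combined_mean(mean(a), len(a), mean(b), len(b)), mean(a + b))

# Medians do not: a and c share a median and a size, yet pooling each
# with b gives different medians.
print(median(a + b), median(c + b))  # 4 3
```

Because `a` and `c` have identical medians and sizes, no formula based only on those summaries could return both 4 and 3.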

19
Thinking about central tendencies
(Figure: cases by interval)
  • Where is the middle?
  • How would we know if one description of middle
    is better than another?

20
Thinking about central tendencies
  • Clearly, we want our estimate of middle (M) to
    be as near as possible to the data
  • How are we going to define near?
  • Perhaps average distance from M to each datum?
  • Snag: negative distances
  • E.g., for X = 3, 4, 5, M = 6 and M = 2 are equally far
    from the data
  • But distances are + in one case, - in the other
  • So what to do?

21
Thinking about central tendencies
  • Two options
  • 1. Take the absolute value of the deviations
  • Data: 1, 2, 9
  • if guess = 9, |X - 9| = 8, 7, 0; sum = 15
  • if guess = 3, |X - 3| = 2, 1, 6; sum = 9
  • if guess = 2, |X - 2| = 1, 0, 7; sum = 8
  • 2. Take the square of the deviations
  • Data: 1, 2, 9
  • if guess = 9, (X - 9)^2 = 64, 49, 0; sum = 113
  • if guess = 3, (X - 3)^2 = 4, 1, 36; sum
    = 41
  • if guess = 2, (X - 2)^2 = 1, 0, 49; sum
    = 50
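
The two criteria can be checked with a few lines of Python. The function names `sum_abs` and `sum_sq` are illustrative choices, not names from the lecture.

```python
data = [1, 2, 9]


def sum_abs(guess, xs):
    """Sum of absolute deviations of xs from guess."""
    return sum(abs(x - guess) for x in xs)


def sum_sq(guess, xs):
    """Sum of squared deviations of xs from guess."""
    return sum((x - guess) ** 2 for x in xs)


for guess in (9, 3, 2):
    print(guess, sum_abs(guess, data), sum_sq(guess, data))
# 9 15 113
# 3 9 41
# 2 8 50
```

Note that the two criteria disagree about whether 2 or 3 is the better guess.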

22
Thinking about central tendencies
  • Now just imagine we make many guesses about where
    the middle is
  • data 1, 2, 9
  • For each guess, assess how far from data by these
    two criteria
  • sum of absolute differences
  • sum of squared differences
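
Making many guesses can be sketched as a simple grid search. The grid (0.0 to 10.0 in steps of 0.1) is an arbitrary choice for illustration.

```python
from statistics import mean, median

data = [1, 2, 9]

# Candidate guesses for the middle: 0.0, 0.1, ..., 10.0 (arbitrary grid).
guesses = [g / 10 for g in range(0, 101)]

# For each criterion, keep the guess with the smallest total distance.
best_abs = min(guesses, key=lambda m: sum(abs(x - m) for x in data))
best_sq = min(guesses, key=lambda m: sum((x - m) ** 2 for x in data))

print(best_abs, median(data))  # 2.0 2
print(best_sq, mean(data))     # 4.0 4
```

For these data the best guess under absolute distances coincides with the median, and the best guess under squared distances coincides with the mean.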

23
Absolute distances
(Figure: sum of absolute distances plotted against the guess of where the middle is; the minimum falls at the median)
24
Squared distances
(Figure: sum of squared distances plotted against the guess of where the middle is; the minimum falls at the mean)
25
Thinking about central tendencies
  • Implications
  • the middle we select depends on how we
    operationalize distance from data to middle
  • median minimizes sum of absolute distances
  • mean minimizes sum of squared distances
  • therefore, mean is known as the least squares
    estimate of central tendency
  • why prefer mean?
  • erratic behavior of sum of absolute distances,
    as we have seen
  • other problems already noted (e.g., combining
    samples)
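
Why the mean minimizes the sum of squared distances can be sketched with a little calculus. This derivation is a supplement, not part of the original slides; M denotes the guess and N the sample size.

```latex
S(M) = \sum_{i=1}^{N} (X_i - M)^2
\qquad
\frac{dS}{dM} = -2 \sum_{i=1}^{N} (X_i - M) = 0
\;\Longrightarrow\;
\sum_{i=1}^{N} X_i = N M
\;\Longrightarrow\;
M = \frac{1}{N} \sum_{i=1}^{N} X_i = \bar{X}
\qquad
\frac{d^2 S}{dM^2} = 2N > 0 \quad \text{(so the stationary point is a minimum)}
```

For the data 1, 2, 9 this gives M = 12/3 = 4, where the sum of squared distances is 9 + 4 + 25 = 38.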