Measuring Individual Differences - PowerPoint PPT Presentation

Provided by: Bri8194
1
Measuring Individual Differences
2
Overview
  • Measurement as a scientific process
  • Psychological tests
  • Statistical Concepts
  • Reliability
  • Validity

3
Psychological Measurement
  • Measurement: the process (rules) for assigning
    numbers to observations to represent quantities
    of attributes
  • Statistics: a body of procedures for organizing
    data, describing variation, and making inferences

4
Goals of Science
  • The goal of science is to describe, predict, and
    explain natural phenomena
  • It is necessary to make careful and precise
    observations of the phenomena (i.e., measurement)
  • These observations must be interpreted within an
    explanatory framework (i.e., theory)
  • The truthfulness of the explanation must be
    evaluated (i.e., validation)

5
  • Theory: the proposed interpretation, or
    explanation, of the interrelationships among
    variables found in nature
  • Constructs: the basic elements of a theory
  • Constructs are abstractions from our observations
    and are themselves unobservable.
  • Constructs are related to other constructs by
    hypotheses, which specify the ways in which
    variation in one construct will cause, accompany,
    or affect variation in another construct.

6
Stages of Scientific Inquiry
  • Observation: construct generation
  • Hypothesis generation: related to, associated
    with, predicts
  • Investigation of hypotheses / data collection:
    create operational definitions (measurement) and
    gather data. Because there may be many
    operational definitions of any single construct,
    the adequacy of the operational definition has to
    be ascertained (construct validation)
  • Verify/Refute Theory

7
Operational Definition
  • Each (abstract) construct must be translated into
    something that is directly observable, which
    serves as a proxy for the construct.
  • Operational Definition: the set of specified
    procedures that take the theoretical construct
    and reduce (map) it to a quantitative
    (real-world) level.

8
Scientific Method (Measurement)
  • Hypothesis based on Theory
  • Operational Definition of Constructs
  • Data Collection
  • Data Analysis, Summarization, Interpretation
  • Evaluate the fit of the results to either
    support or fail-to-support the stated
    (theoretical) hypothesis

9
Nomological Network
  • Visual representation of a theory that delineates
    the relations among the constructs
  • Path diagram conventions:
  • Circles represent latent or unobserved
    variables (constructs)
  • Double-headed arrows represent correlations
  • Rectangles represent observed (manifest)
    variables: measurements that serve as
    operational definitions of constructs
  • Single-headed arrows represent regression
    coefficients: change in the variable at the tail
    causes change in the variable at the head
10
[Figure: two-level path diagram. Construct level
("Theory Land"): C1 Intelligence, C2 School
Performance, C3 Learning Ability, joined by
hypothesized relationships. Observable level ("Data
Land"): O1 IQ scores, O2 Course Grades, O3 Speed of
Learning Paired Associates, joined by observed
relationships.]
11
Measurement Standardizes Meaning and Communication
  • Express general laws in precise ways
  • Allows for the use of math and stats
  • Greater descriptive flexibility
  • Use of numbers relays more precise information
  • Better characterization of relative position

12
  • Psychological measurement is less clear and far
    more complicated than physical measurement.
  • Physical measurements can be repeated without
    substantially changing the measurement.
  • Psychological measurements run the risk of
    changing the individual as a result of the
    measuring process.
  • Due to limitations of scale, the basis of
    psychological measurement is found in comparing
    an individual with the group (normative).

13
What is a Psychological Test?
  • 3 criteria
  • Sample of behavior
  • Obtained under standardized conditions
  • Established measurement and scoring rules

14
Sampling Behavior
  • Can't measure all relevant behavior; have to get
    a sample of behavior
  • Three types
  • Specific task tests of performance
  • Observation
  • Self-reports

15
Specific Task Tests
  • Most familiar type
  • Score based on success in performing the task
  • Generalizable?
  • Limited by testing situation
  • Examples?

16
Observation
  • Participant knowledge of being observed
  • Generalizable?
  • Again, limited by the testing situation
  • Examples?

17
Self-reports
  • Widely used
  • Description or report
  • Valid?
  • Truthful?
  • Faking
  • Examples?

18
Standardization
  • Key word: uniformity
  • Administration
  • Scoring
  • Why is uniformity important?
  • What factors could affect test results?

19
Test Scoring
  • Obvious: objective
  • Right/wrong
  • Not so obvious: subjective
  • Projective tests
  • Must set up clear criteria
  • Again, consistency is very important!

20
Back to Standardization
  • Key word: uniformity
  • Administration
  • Scoring
  • Establish norms
  • What are norms?
  • Terminology: norm group, normative sample,
    standardization sample

21
Test Norms
  • A conversion process
  • Raw scores are converted to scaled scores for
    comparisons
  • Example: percentiles or percentile ranks
  • Where do we get norms?
  • Standardization group
  • Needs to be a representative sample!

22
Statistics
  • Measurement is a set of procedures for assigning
    numbers to observations to represent quantities
    of attributes
  • Measurement yields data
  • Statistics is a set of procedures for summarizing
    data, describing variation, and making inference

23
Descriptive Statistics
  • Summarizes data and describes variation
  • Central tendency: mean, median, mode
  • Dispersion: variance, standard deviation, min
    and max
  • Distribution: skewness, kurtosis

24
Central Tendency
  • The typical or expected score
  • Mean: average, $\text{Mean} = \sum X_i / N$
  • Median: middle score of the distribution
  • Mode: most frequent score
  • If the distribution is symmetrical (e.g., a
    normal distribution), the mean, median, and mode
    will be the same value
  • The greater the skew or kurtosis, the more
    measures of central tendency will differ

25
Dispersion or Variability
  • Indication of how much scatter there is in the
    distribution of scores
  • Variability is absolutely essential to
    measurement and the study of individual
    differences: there is no reason to measure if
    everyone is the same
  • Generally want to maximize variability in a
    measuring instrument; this provides greater
    sensitivity, or ability to distinguish people

26
Variance
  • Average (squared) deviation from the mean
  • $\text{Variance} = \sum (X_i - \text{Mean})^2 / N$
  • Have to square the deviations; otherwise the sum
    of the deviations will equal zero
  • This puts the variance in a different metric than
    the mean
  • Just take the square root of the variance to get
    the standard deviation (SD)
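
As a minimal sketch of the central tendency and dispersion measures described above, the following Python computes them for a small set of scores. The `scores` list and the helper name `population_variance` are illustrative assumptions, not data from the slides; the variance divides by N, as in the slide formula.

```python
import math
from statistics import mean, median, mode

# Hypothetical raw scores (not from the slides).
scores = [2, 4, 4, 5, 7, 8]

def population_variance(xs):
    """Average squared deviation from the mean (divides by N)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

center = {"mean": mean(scores), "median": median(scores), "mode": mode(scores)}

# Unsquared deviations cancel out, which is why they are squared.
deviation_sum = sum(x - mean(scores) for x in scores)

variance = population_variance(scores)
sd = math.sqrt(variance)  # back in the metric of the raw scores

print(center, deviation_sum, variance, sd)
```

Because this distribution is slightly skewed, the mean, median, and mode do not coincide.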

27
Distribution statistics
  • Skewness: describes the tail of the distribution
  • If the distribution is symmetrical (e.g., normal)
    there is no skew
  • Positive skew: tail is in the high values, but
    most scores are in the low values
  • Negative skew: tail is in the low values, but
    most scores are in the high values
  • Kurtosis: how much the distribution bunches up
    around the mean
  • Usually want to minimize both skew and kurtosis

29
Meaning of scores
  • Raw scores of psychological tests usually have
    little inherent meaning
  • Meaning is derived by comparing scores to others
    (e.g., other members of a sample or a normative
    sample)
  • Percentiles
  • Z scores
  • T scores

30
Percentile/Percentile Rank
  • Percentile: relative position in the sample or
    reference group
  • Percentile rank: percentage of people who
    earned a raw score lower than the given score
  • Percentage of persons, not items
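
The percentile rank definition above can be sketched in a few lines of Python. The function name `percentile_rank` and the `norm_sample` scores are hypothetical and chosen only for illustration; ties are not counted as "lower", following the wording of the definition.

```python
def percentile_rank(score, reference_scores):
    """Percentage of people in the reference group scoring below `score`."""
    below = sum(1 for s in reference_scores if s < score)
    return 100.0 * below / len(reference_scores)

# Hypothetical normative sample of raw scores.
norm_sample = [10, 12, 15, 15, 18, 20, 22, 25, 27, 30]

pr = percentile_rank(20, norm_sample)
print(pr)
```

Note that the result is a percentage of persons in the reference group, not a percentage of items answered correctly.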

31
Standard scores
  • Express the distance of a score from the mean in
    SD units
  • Advantages of standard scores
  • Include information about the person's standing
    in the distribution (i.e., percentile rank)
  • Allow comparisons across tests that have
    different raw metrics

32
Z score
  • How far the score is from the mean in SD
    units
  • $Z = (X_i - \text{Mean}) / SD$
  • Z scores have mean = 0, SD = 1.0

33
Z scores and Percentile ranks
  • Z scores relate to percentile ranks (see figure
    2-7 in textbook)
  • For a normal distribution:
      Z score    Percentile rank
         2         97.5
         1         84
         0         50
        -1         16
        -2         2.5
  • Z scores between -1 and +1 are usually considered
    the average range
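
The mapping from Z scores to percentile ranks can be checked with the standard normal cumulative distribution. This is a small sketch using Python's standard-library `statistics.NormalDist`; the function name `percentile_rank_from_z` is an assumption made for illustration. The exact normal values for Z = 2 and Z = -2 are about 97.7 and 2.3, so the 97.5 and 2.5 figures above are rounded approximations.

```python
from statistics import NormalDist

def percentile_rank_from_z(z):
    """Percentage of a normal distribution falling below z."""
    return 100 * NormalDist().cdf(z)

for z in (2, 1, 0, -1, -2):
    print(z, round(percentile_rank_from_z(z), 1))
```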

34
T scores
  • T scores are linear transformations of Z scores
  • Why? For Z scores, half the scores are negative
    and fractional numbers are involved

35
T scores
  • T score = (Z score × 10) + 50
  • Mean = 50, SD = 10
  • If scores are normally distributed, nearly all
    will fall between 20 and 80
  • Scores no longer have negative or fractional
    components

36
Conversions of Standard Scores
37
T scores
  • MMPI scales are expressed in T score units
  • T score of 65 or higher is considered clinically
    significant
  • GRE and SAT subtests use a T score × 10 metric
  • Mean = 500, SD = 100
  • E.g., a Verbal score of 600 is 1 SD above the mean
  • A Quantitative score of 700 is 2 SD above the mean
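
To make the standard-score conversions concrete, here is a minimal Python sketch. The function names (`z_score`, `rescale`, `t_score`) and the raw score of 115 on a mean-100, SD-15 metric are illustrative assumptions; only the T-score and 500/100 metrics come from the slides.

```python
def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in SD units."""
    return (raw - mean) / sd

def rescale(z, new_mean, new_sd):
    """Linear transformation of a Z score onto another metric."""
    return z * new_sd + new_mean

def t_score(z):
    """T metric: mean 50, SD 10."""
    return rescale(z, 50, 10)

z = z_score(115, mean=100, sd=15)   # hypothetical raw score
t = t_score(z)
gre_like = rescale(z, 500, 100)     # mean 500, SD 100 metric

print(z, t, gre_like)
```

All three numbers describe the same standing in the distribution (1 SD above the mean); only the metric changes.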

38
Norms
  • A person's score is usually compared with a
    normative sample
  • The normative sample is some defined group
  • A person's score is interpreted in relation to
    the scores of this defined group

39
Types of Norms
  • Age-related
  • Average scores for persons of a certain
    chronological age
  • Grade equivalent
  • Average scores for persons of a certain grade
    level
  • Percentile
  • Relative position in the norm group
  • Standard score (Z and T scores)
  • Deviation score from the mean of the norm group

40
Scales of Measurement
  • We usually treat psychological tests as interval
    scales, but really they are ordinal
  • Interval: equal distances on the scale have the
    same meaning
  • Can only say how far apart in the distribution
    scores are from each other

41
Stats 2 Inferential Statistics
  • Statistics are tools that help us understand our
    observations by:
  • Summarizing and describing our data (descriptive
    stats)
  • Testing hypotheses (inferential stats)
  • We need inferential statistics to verify or
    refute hypotheses
  • These tests help to establish the validity of our
    theory and measuring instruments

42
Population vs. Sample
  • Population encompasses all the phenomena of
    interest
  • Parameters are the numbers used to describe the
    population
  • Sample is a subset of observations from the
    population
  • Sample statistics are the numbers used to
    describe the sample and to estimate the
    population parameters
  • Want to generalize or infer that what we observe
    in our sample also applies to the population
  • We do this by making probabilistic statements
    relating the population and sample

43
Population vs. Sample
  • This is exactly the same logic used in testing
  • The population is the construct of interest
  • The sample is the test
  • We then generalize from what we observe in the
    sample to the population using probabilistic
    statements

44
Correlation (r) Coefficient
  • Way to describe relationship between two
    variables
  • Magnitude
  • Direction
  • Many types
  • Pearson's r (Product Moment Correlation)

45
Pearson's r
  • Ranges from -1.0 to +1.0
  • Has no units of measurement
  • 0 indicates no linear relationship
  • -1 indicates a perfect, negative linear
    relationship
  • +1 indicates a perfect, positive linear
    relationship

46
Co-variance
  • Where does correlation come from?
  • Amount of overlapping variance; need variance to
    have covariance
  • Covariance:
  • $\text{Covariance} = \sum (X_i - \bar{X})(Y_i - \bar{Y}) / N$

47
Problems with Covariance
  • Same as raw scores, units typically have little
    intrinsic meaning and no upper limit
  • Also, two variables may be on different scales
  • Need an analog to standard scores
  • Standardized covariance

48
Correlation as Standardized Covariance
  • Doesn't matter which variable is x or y
  • $r = \text{Covariance} / (SD_x \, SD_y)$
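
The step from covariance to correlation can be sketched as follows. The data and helper names are hypothetical; the formulas divide by N, matching the slides. The example also shows why covariance alone is hard to interpret: rescaling one variable changes the covariance but not r.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    """Average cross-product of deviations (divides by N)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def sd(xs):
    return math.sqrt(covariance(xs, xs))

def pearson_r(xs, ys):
    """Covariance standardized by both standard deviations."""
    return covariance(xs, ys) / (sd(xs) * sd(ys))

# Hypothetical paired scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
y_rescaled = [10 * v for v in y]  # same variable on a different scale

print(covariance(x, y), covariance(x, y_rescaled))
print(pearson_r(x, y), pearson_r(x, y_rescaled), pearson_r(y, x))
```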

49
Examples of Correlations
  • Item 1 on Quiz 1 and total score for Quiz 1: r =
    .92, corrected r = .82
  • Cumulative quiz scores and total score on
    screening measure of IQ: r = .03
  • Difference score (Quiz 2 - Quiz 1) and Quiz 1
    score: r = -.81

50
r = .92, corrected r = .82
52
r = -.81
53
Null Hypothesis for Correlation Coefficient
  • Typically, the null hypothesis (NH) is that the
    correlation is zero
  • The bigger the sample, the more power to detect
    any difference from zero (reject NH)
  • A correlation can be different from zero but have
    little practical significance
  • $r^2$: coefficient of determination, or proportion
    of variance accounted for

54
Effect Sizes for Correlations
  • Small ES: r = .10 to .29
  • Medium ES: r = .30 to .49
  • Large ES: r = .50 to 1.00
  • Most psychological research works with effects in
    the small to medium range

55
Usual Correlation Disclaimer
  • Correlation does NOT equal causation
  • Reasons?
  • Chance
  • Third variable causes the relationship

56
Prediction
  • r describes how much two things go together
  • Therefore, can be used to predict y from x
  • If r = 1.0, what z score would you predict for y
    if you knew x?
  • Unlike correlation, in regression it matters
    which variable is x and which is y

57
Linear Regression
  • Describes the association between two variables
    using a straight line
  • Equation of a line
  • $\hat{y} = a + bx$
  • Where
  • x: predictor or independent variable
  • y: outcome or criterion variable
  • $\hat{y}$: predicted value of y
  • b: slope, the amount of change in y associated
    with a one-unit change in x
  • a: intercept, the value of y when x = 0

58
Conceptual Understanding of Linear Regression
  • a = mean of y (when x is expressed as a deviation
    from its mean)
  • If x = 0, the mean is your best guess of someone's
    score
  • x just gives you additional information to
    improve your prediction
  • The stronger the relationship between x and y
    (i.e., the correlation), the better your
    prediction gets

59
Linear Regression with z scores
  • b is simply r
  • Why?
  • a is zero
  • No adjustment needed for different scales
  • b is the change in y (in SD units) per 1 SD
    change in x
  • Equation
  • $\hat{z}_y = r z_x$
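
The claim that the slope equals r once both variables are standardized can be verified numerically. The sketch below fits a least-squares line to hypothetical data twice, first in raw units and then in z-score units; the data and helper names (`fit_line`, `standardize`) are assumptions for illustration.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y-hat = a + b*x."""
    b = covariance(xs, ys) / covariance(xs, xs)
    a = mean(ys) - b * mean(xs)
    return a, b

def standardize(xs):
    m, s = mean(xs), math.sqrt(covariance(xs, xs))
    return [(x - m) / s for x in xs]

# Hypothetical paired scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)                               # raw-score metric
a_z, b_z = fit_line(standardize(x), standardize(y)) # z-score metric
r = covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))

print(a, b)
print(a_z, b_z, r)
```

In the raw metric the slope depends on the units of x and y; in the z-score metric the intercept drops to zero and the slope matches r.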

60
Regression and ANOVA
  • Regression and ANOVA are really the same
  • General Linear Model (GLM)
  • $\hat{y} = a + bx$
  • If you have groups, x is group membership
  • Dummy code: 0 = group 1, 1 = group 2
  • Plot the means; the slope of the line is the
    mean difference, and in standardized form it is
    the correlation (point-biserial correlation)
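
A short numerical sketch of dummy coding follows. The heights and group sizes are hypothetical. With group membership coded 0/1, the regression intercept is the mean of the group coded 0, the raw slope is the difference between the two group means, and the correlation between the dummy code and the outcome is the point-biserial correlation.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical heights (in inches) for two groups.
group1 = [64, 65, 66]   # dummy code 0
group2 = [69, 70, 71]   # dummy code 1

x = [0] * len(group1) + [1] * len(group2)
y = group1 + group2

slope = covariance(x, y) / covariance(x, x)
intercept = mean(y) - slope * mean(x)
r_pb = covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))

print(intercept, slope, r_pb)
```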

61
Mean differences as a Correlation
[Figure: plot of group means; axis label: Height (in)]
62
GLM
  • ANOVA and regression both try to account for
    variance in a criterion
  • The only difference is the nature of the
    predictor variable: quantitative (continuous) or
    categorical (dichotomous)

63
Reliability
  • Definition: the proportion of variance in a set
    of test scores that is due to the real or true
    attributes of the persons being measured, rather
    than error
  • Also, repeatability, consistency, or stability

64
Reliability as Repeatability
  • Conceptually, any observation has some degree of
    error or imprecision
  • By taking multiple measurements it is presumed
    that these random errors will cancel each other
    out
  • Under certain assumptions the mean of repeated
    measurements is considered an estimate of the
    true score

65
Components of Reliability
  • Want a statistic of the proportion of total test
    score variance that is due to the true score
    variance
  • i.e., what proportion is not due to error
    variance?
  • Defining true score variance as the consistent,
    stable variance

66
Classical Test Theory (CTT) Reliability
  • Observed score = true score + error
  • $X = T + e$
  • $\sigma_X^2 = \sigma_T^2 + \sigma_e^2$
  • What is observed is a function of the variability
    in the true score and variability of the errors
    of measurement

67
Definition by Symbols
  • Reliability
  • $r_{xx} = \sigma_T^2 / \sigma_X^2 = \sigma_T^2 / (\sigma_T^2 + \sigma_e^2)$

68
Assumptions of True Score Theory
  • Error of measurement is the unsystematic or random
    deviation of an individual's score from a
    theoretically expected observed score
    (true score)
  • Observed score = true score + error
  • True score is an expected or mean score
  • Errors are not correlated with true score (i.e.,
    random)

69
Methods of Assessing Reliability
  • Test-retest
  • Alternate Forms
  • Split-half
  • Internal Consistency

70
Average Item Intercorrelation
  • Related to the last type of reliability we'll
    discuss, internal consistency reliability
  • An example: imagine two people who are taking an
    internally consistent test of extraversion

71
An Example cont.
  • Brittany is very extraverted, Hillary is not
  • For every item, Brittany always responds true
    and Hillary always responds false
  • So, within a sample of different people, the
    responses to items will be correlated
  • People who score high on item 1 will also score
    high on items 2, 3, ..., n
  • Internal consistency

72
Another Example
  • Imagine Brittany and Hillary take an internally
    consistent test of intelligence
  • Hillary is very intelligent; Brittany is not so
    bright
  • Hillary passes every item; Brittany fails nearly
    every item
  • Again, within a sample of different people, the
    item responses will be correlated
  • People who pass item 1 will tend to pass items 2,
    3, ..., n

73
Flipping Examples
  • Now imagine an internally inconsistent test
  • Responses would be random with respect to what
    the test is supposedly measuring (extraversion,
    intelligence)
  • What does this have to do with reliability?

74
Internal Consistency Reliability
  • Take the logic of split-half and parallel forms
    reliability to the extreme
  • Every ITEM is a parallel test of the construct
  • Therefore, the average correlation among items is
    an index of reliability

75
Cronbach's Coefficient Alpha (α)
  • Alpha is the average value of all possible
    split-half reliabilities
  • As number of items increases so will alpha
  • Some consider this a major flaw, claiming alpha
    is useless if more than 40 items are used
  • Use average interitem correlation instead
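
As an illustration, coefficient alpha can be computed from a persons-by-items response matrix. The sketch below assumes the commonly used variance-based formula, alpha = k/(k-1) × (1 - sum of item variances / variance of total scores), which the slides do not state; the response data and function names are hypothetical.

```python
def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(responses):
    """responses: one row per person, one column per item."""
    k = len(responses[0])
    item_columns = list(zip(*responses))
    item_var_sum = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Perfectly consistent responding: every item orders people identically.
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]

# Less consistent true/false (1/0) responses from five hypothetical people.
noisy = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
]

alpha_consistent = cronbach_alpha(consistent)
alpha_noisy = cronbach_alpha(noisy)
print(alpha_consistent, alpha_noisy)
```

The perfectly consistent matrix gives the maximum alpha, while the noisier one gives a clearly lower value.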

76
Standard Error of Measurement
  • Applying reliability to individuals
  • SEM: the standard deviation of the distribution
    of test scores you would expect if a test were
    administered repeatedly to the same person

77
SEM
  • If test scores are consequential, a small SEM is
    important
  • Normal curve reference
  • Standard deviation tells how far off you are in
    estimating the true score, on average

78
SEM Formula
  • $SEM = SD \sqrt{1 - r_{xx}}$
  • SD: standard deviation of test scores
  • $r_{xx}$: reliability coefficient

79
SEM example
  • IQ score: $r_{xx}$ = .90, SD = 15
  • $SEM = 15 \sqrt{1 - .90} = 4.74$
  • Get a confidence interval for a score of 110
  • 68% CI (± 1 SEM): 110 ± 4.7 = 105.3 to 114.7
  • 95% CI (± 2 SEM): 110 ± 9.5 = 100.5 to 119.5
  • 99% CI (± 3 SEM): 110 ± 14.2 = 95.8 to 124.2
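
The SEM example can be reproduced with a few lines of Python. The function names `sem` and `score_band` are assumptions; the bands use 1, 2, and 3 SEMs around the observed score, which is what the slide's numbers imply.

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - r_xx)."""
    return sd * math.sqrt(1 - reliability)

def score_band(score, sem_value, n_sems):
    """Interval of +/- n_sems standard errors around an observed score."""
    half_width = n_sems * sem_value
    return (score - half_width, score + half_width)

iq_sem = sem(sd=15, reliability=0.90)

for n_sems in (1, 2, 3):
    low, high = score_band(110, iq_sem, n_sems)
    print(n_sems, round(low, 1), round(high, 1))
```

Lower reliability widens every band: with the same SD, a reliability of .75 would give an SEM of 7.5 rather than about 4.7.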

80
Relationship between Reliability and Validity
  • Reliability places a limit on validity
  • Why?

81
Factors Influencing Reliability
  • Inter-item correlation
  • Number of items
  • The more items, the higher the reliability
    coefficient
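
One way to see how both factors act together is the standardized-alpha (Spearman-Brown type) formula, alpha = k × r̄ / (1 + (k - 1) × r̄), where k is the number of items and r̄ is the average inter-item correlation. The slides do not give this formula; it is brought in here as an assumption, and the values of k and r̄ below are hypothetical.

```python
def standardized_alpha(n_items, mean_inter_item_r):
    """Reliability implied by item count and average inter-item correlation."""
    k, r_bar = n_items, mean_inter_item_r
    return k * r_bar / (1 + (k - 1) * r_bar)

# Same average inter-item correlation, increasing test length.
for k in (5, 10, 20, 40):
    print(k, round(standardized_alpha(k, 0.20), 3))

# Same test length, higher average inter-item correlation.
print(round(standardized_alpha(10, 0.40), 3))
```

With a fixed average inter-item correlation, alpha keeps rising as items are added, which is why a long test can show a high alpha even when its items are only weakly related.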

82
Dependence of Reliability on the Sample Tested
  • Internal consistency reliability is dependent on
    observed item scores
  • Can't assume a reliability estimate in one sample
    will apply to a different sample

83
Dependence of Reliability on the Sample Tested
  • Also applies to SEM
  • Assumes:
  • Equal measurement precision across all levels of
    a trait
  • Individuals don't differ in the ability of the
    test to measure their trait level
  • SEM is dependent on the variability of sample
    scores

84
Validity
  • Does the test measure what it is supposed to
    measure?
  • Is the label put on the test and scores
    appropriate?
  • What inferences can you make about a test score?
  • Validity is multifaceted
  • Face, Content, Criterion, and Construct Validity

85
Face Validity
  • Does the test appear to measure what people
    responding to it think it does?
  • Subjective reaction to a test
  • Primarily a PR issue
  • Some don't consider it part of validity

86
Content Validity
  • Is the coverage of testing material an adequate
    sample of the construct of interest?
  • Have to cover everything
  • Structure of test should be the same as the
    construct
  • Factor analysis

87
Criterion related Validity
  • Can a test predict a criterion that is external
    to the test?
  • Concurrent validity
  • Can the test predict criteria measured at roughly
    the same time?
  • Predictive validity
  • Can the test predict criteria measured after the
    test was taken?

88
Construct Validity
  • Subsumes all types of validity
  • Determines the appropriateness of inferences
    about a construct
  • What is part of the construct?
  • What other constructs is it related to?
  • What other constructs is it NOT related to?

89
Construct Validation
  • An ongoing process
  • Interplay between hypothesis generation, data
    collection, and refining the construct
  • No construct validation index: no single value
    summarizes a test's construct validity

90
Evidence of Construct Validity
  • Group (mean) differences
  • Correlations
  • Factor analysis
  • Studies of internal structure
  • Studies of change over occasions
  • Studies of process (experimental manipulations)

91
Establishing Validity
  • Scores on the measuring instrument must behave in
    a way that is consistent with theory
  • Make measurements, test hypotheses
  • Validating a measuring instrument also validates
    (or refutes) a theory

92
Max Consumption as a measure of Alcoholism
  • Construct of Alcoholism
  • People all over the world consume alcohol
  • Individual differences in alcohol consumption
  • Some people's use of alcohol is considered
    pathological
  • Drink large quantities, spend excessive time
    drinking or pursuing alcohol, interferes with
    major life roles (work, parent), unable to stop
    drinking, withdrawal, medical problems and
    continued use despite medical problems

93
Alcoholism Definitions
  • DSM-IV criteria for Alcohol Dependence
  • 3 symptoms (or more) occurring in the same
    12-month period
  • Tolerance, withdrawal, drinking more than
    intended, unable to cut down, great deal of time
    spent obtaining, consuming, or recovering from
    substance use, important activities given up, use
    continued despite physical or psychological
    problem caused or exacerbated by the substance

94
Maximum Consumption
  • What is the largest amount of alcohol you have
    ever consumed in 24 hours?
  • Alternative measure of alcoholism?
  • Must demonstrate the same associations as would
    be predicted for Alcoholism

95
Max Consumption vs. Alc Dep
  • Advantages of Max Consumption
  • Objective number and easy to compare across
    people
  • More socially acceptable: people are reluctant to
    admit to Alc Dep symptoms
  • Quantitative, spans the full range of
    vulnerability to alcoholism
  • Alc Dep only measures the extreme range
  • Lose information, lose statistical power

96
Quantitative Measures and Alcoholism Severity
[Figure labels: Threshold, Liability]
97
Max Consumption vs. Alc Dep
  • Potential Disadvantages
  • Sufficient content validity?
  • How accurate are people at reporting?
  • False positives?
  • False negatives?
  • Most of these are no different for any
    other measure, including Alc Dep

98
Alcoholism's Nomological Network
[Figure: network diagram relating Alcoholism to Drug
Use, Tobacco Use, Adult Antisocial Behavior,
Delinquency, Risky Sexual Behavior, Depression,
Intelligence, and School Achievement]
99
Hypotheses connecting Alcoholism and other
constructs
  Construct                    Predicted relation
  Drug use                     Strong (+)
  Tobacco use                  Strong (+)
  Adult Antisocial Behavior    Strong (+)
  Delinquency                  Strong (+)
  Risky Sexual Behavior        Moderate (+)
  Depression                   Zero, small (+)
  Intelligence                 Zero, small (-)
  School Achievement           Small (-)

100
Does Max Consumption Reproduce the same
Nomological Network?
[Figure: the same network diagram with MAX CONS in
place of Alcoholism, related to Drug Use, Tobacco
Use, Adult Antisocial Behavior, Delinquency, Risky
Sexual Behavior, Depression, Intelligence, and School
Achievement]
101
Validate Max Consumption as measure of Alcoholism
  • Does Max Consumption exhibit the same relations
    with other constructs?
  • Have to measure each construct
  • Make observations in representative sample
  • Test hypotheses using statistics

102
Need to measure each construct
  Construct                    Measure
  Drug use                     DSM symptoms Drug Dependence
  Tobacco use                  Nicotine Dependence
  Adult Antisocial Behavior    Antisocial Personality Disorder
  Delinquency                  Conduct Disorder
  Risky Sexual Behavior        Life Events Interview
  Depression                   DSM Major Depression
  Intelligence                 Wechsler IQ scores
  School Achievement           Class Grades

103
Sample
  • Minnesota Twin Family Study
  • 17-year-old male and female twins
  • Born in MN, recruited from all over the state
  • Almost all white, IQ > 70, no mental or physical
    disability
  • Representative?

104
Statistics
  • Need to use statistics to test hypotheses
  • Rely on correlations
  • Correlation: index of association, ranges from -1
    to +1
  • +1: perfect positive relation
  • -1: perfect inverse relation
  • 0: no association

105
Convergent and Discriminant Relations
  • Convergent validity
  • Measure should be positively correlated with
    certain constructs
  • Discriminant Validity
  • Measure should be uncorrelated or negatively
    correlated with other constructs

106
Convergent Validity
  • Test should be positively correlated with other
    tests attempting to measure the same construct
  • Correlation between Max Consumption and Alc. Dep:
    r = .65
  • Big correlation
  • Measures a similar construct, but not the same
    construct
  • Which is a better measure of Alcoholism?

107
Convergent Validity
108
Discriminant Validity
109
Group (mean) differences
  • Alcoholism is more common in men than women
  • Mean Alc Dep symptoms
  • Men = .63, Women = .43
  • Mean Max Consumption
  • Men = 7.7, Women = 4.87
  • Using t-tests, means are significantly different
    for both

110
Evaluate Measure
  • How consistent are the observations with theory?
  • As measures of Alcoholism, both Max Cons and Alc
    Dep criteria are related to external constructs
    in a way consistent with theory
  • Therefore, both are valid measures of Alcoholism
  • It might seem simple, but if the observations are
    NOT as predicted, the test is not valid

111
Is Construct Validation over?
  • No, it's just the beginning
  • Continue to delineate relations with other
    constructs
  • Now that good measures of the construct of
    Alcoholism are available, we can study the
    etiology or causal processes of Alcoholism

112
Construct Elaboration
  • As new observations accrue, new criteria emerge to
    evaluate the construct validity of measures
  • For example, develop markers of the underlying
    processes of Alcoholism
  • Specific genes
  • Psychophysiological markers present before
    symptom onset
  • The test that has a stronger relation with these
    variables is the more valid measure of the
    construct

113
What to do when theory and data don't agree?
  • Have to determine whether the measure is invalid
    or the theory is wrong
  • Need multiple lines of evidence
  • Tests of hypotheses across different measures and
    samples
  • A body of evidence accumulates to make the
    determination