Unit 2 Review - PowerPoint PPT Presentation

Provided by: marciab7

1
Unit 2 Review
  • Chapters 7-9, 11

2
Theories of Intelligence
  • Two classical theories of intelligence
  • Spearman's g, or two-factor theory
  • Thurstone's 9 Primary Mental Abilities
  • Guilford's Structure of Intellect is another
    multi-factor theory of intelligence
  • Also known as the psychometric theories of
    intelligence because of reliance on data
    relationships

3
Chapter 7
  • Theories of Intelligence

4
Spearman
  • Developed first formal theory about human mental
    ability
  • One general, g, factor accounted for correlations
    among tests of simple sensory functions
  • Each test also had a specific component, s,
    unique to that test plus error
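One consequence of a single general factor can be checked numerically. The sketch below assumes each test's correlation with g (its "loading") is known; the loading values are hypothetical and are not taken from the slides. If tests share only g, the correlation between two tests is the product of their loadings, and the so-called tetrad differences come out to zero.

```python
from itertools import combinations

# Hypothetical g loadings for four tests (illustrative values only).
g_loadings = {"test_a": 0.8, "test_b": 0.7, "test_c": 0.6, "test_d": 0.5}


def implied_correlation(loading_j, loading_k):
    """Correlation between two tests whose only shared source is g."""
    return loading_j * loading_k


# Correlation for every pair of tests, keyed by the unordered pair of names.
correlations = {
    frozenset((j, k)): implied_correlation(g_loadings[j], g_loadings[k])
    for j, k in combinations(g_loadings, 2)
}


def tetrad_difference(r, a, b, c, d):
    """r_ab * r_cd - r_ac * r_bd; zero when one common factor explains r."""
    return (r[frozenset((a, b))] * r[frozenset((c, d))]
            - r[frozenset((a, c))] * r[frozenset((b, d))])


tetrad = tetrad_difference(correlations, "test_a", "test_b", "test_c", "test_d")
print(f"tetrad difference: {tetrad:.6f}")
```

With real data the tetrad differences are only approximately zero, because each test also carries its specific component s and error.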

5
Thurstone's Primary Mental Abilities
  • S: Spatial
  • P: Perceptual (esp. speed of visual perception)
  • N: Numerical (speed and accuracy of computation)
  • V: Verbal
  • M: Memory
  • W: Words (word fluency or disarranged words)
  • I: Induction (finding a rule)
  • R: Reasoning (arithmetic)
  • D: Deduction (application of rule)

6
Hierarchical Models
  • Are a compromise in the one vs. many argument
  • Acknowledge that there are many separate
    abilities, but these can be arranged so that only
    a few dominant factors are at the top of the
    hierarchy
  • Include Cattell (crystallized and fluid), Vernon
    (verbal-educational and spatial-mechanical), and
    Carroll (three-stratum theory with g as the top
    stratum)

7
Other Theories
  • Developmental (e.g., Piaget)
  • Information processing: based on elementary
    cognitive tasks (ECTs), such as reaction time
    (e.g., Jensen, Sternberg)
  • Biological theories (e.g., Gardner's theory of
    multiple intelligences)

8
Differences by Sex
  • Differences minimal on total scores
  • Males outperform females on tests of spatial
    ability (effect size of .5-.7)
  • Females outperform males on verbal tests during
    childhood and much of adolescence
  • Greater variability in intelligence for males

9
Differences by Racial/Ethnic Group
  • Compared to whites:
      • Blacks are about 1 SD below
      • Hispanics and Native Americans are .5-1 SD
        below on verbal and at the mean on performance
        tests
      • Asians have a similar verbal mean and are about
        1 SD above on non-verbal tests

10
Heredity and Environment
  • Intelligence results from an interaction of
    heredity and environment
  • Estimates of heritability range from .4 to .8
    (median of about .5 or .6)
  • Heritability increases with age
  • g has a higher heritability index than specific
    abilities

11
Correlations of IQ Scores from Bouchard & McGue
(1981)
  • Identical twins reared together: .86
  • Identical twins reared apart: .72
  • Same-sex fraternal twins reared together: .62
  • Opposite-sex fraternal twins reared together: .57
  • Non-twin siblings reared together: .47
  • Unrelated (adopted) siblings reared together: .30
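The twin correlations above can be turned into a rough heritability estimate. The sketch below uses Falconer's formula, h² = 2(r_MZ − r_DZ), which the slides do not name; it is brought in here as an assumption, and it rests on simplifications such as equal environments for both twin types.

```python
# Correlations as listed on the slide (Bouchard & McGue, 1981).
R_IDENTICAL_TOGETHER = 0.86
R_IDENTICAL_APART = 0.72
R_FRATERNAL_SAME_SEX_TOGETHER = 0.62


def falconer_heritability(r_identical, r_fraternal):
    """Falconer's estimate: twice the gap between the two twin correlations."""
    return 2 * (r_identical - r_fraternal)


h2_falconer = falconer_heritability(
    R_IDENTICAL_TOGETHER, R_FRATERNAL_SAME_SEX_TOGETHER
)

# The correlation for identical twins reared apart is often read as a
# second, more direct heritability estimate.
h2_reared_apart = R_IDENTICAL_APART

print(f"Falconer estimate:     {h2_falconer:.2f}")
print(f"Reared-apart estimate: {h2_reared_apart:.2f}")
```

Both figures (about .48 and .72) fall inside the .4 to .8 range quoted on the previous slide.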

12
Individual Tests of Intelligence
  • Chapter 8

13
Common Characteristics of Individual Intelligence
Tests
  • individually administered
  • administration requires advanced training
  • tests cover wide range of age and ability
  • examiner must establish rapport
  • immediate scoring of items
  • usually requires about one hour
  • allows opportunity for observation

14
Two Main Individually Administered Intelligence
Tests
  • Stanford-Binet
      • Binet wanted to create a process for
        identifying intellectually limited children so
        they could be removed from the regular
        classroom and placed in special education.
  • Wechsler scales
      • Developed in response to the perceived
        shortcomings of the Stanford-Binet

15
Early Binet Scales
  • 1905: 30 items ordered by difficulty. The test
    lacked:
      • adequate measuring units to express results
        (only used idiot, imbecile, and moron)
      • adequate normative data (only used 50
        subjects)
      • evidence of validity
  • 1908: Grouped items according to age level
    rather than simply by increasing difficulty.
    Introduced the concept of mental age.
      • Increased norm group to 203.
      • Criticized because it produced only one score,
        almost exclusively related to verbal,
        language, and reading ability

16
Modern Binet scale
  • Totally revised in 1986 by Thorndike et al.
  • Used Thurstone's multidimensional model (1938)
  • g is made up of crystallized abilities (verbal and
    quantitative reasoning), fluid-analytic abilities
    (abstract-visual reasoning), and short-term
    memory.

17
Structure of the SB-IV
  • Verbal reasoning included vocabulary test,
    comprehension test, absurdities test, and verbal
    relations test.
  • Abstract-visual reasoning included pattern
    analysis test, copying test, matrices test,
    paper-folding and cutting test.
  • Quantitative reasoning included quantitative
    test, number series test, equation-building test.
  • Short-term memory included bead memory, memory
    for sentences, memory for digits, and memory for
    objects
  • Composite included all areas combined

18
Psychometric properties of SB-IV
  • Standardization sample stratified based on the
    1980 census: geographic region, community size,
    ethnic group, age, and gender.
  • Internal consistency reliability is .98 for the
    composite and .93-.97 for area scores. Some
    individual test scores are lower; .73 for memory
    for objects is the lowest.
  • Test-retest reliabilities for the composite score
    were .91 and .90 for 5- and 8-year-olds.
  • Factor analysis supports the structure of the
    test.
  • Correlations with other IQ tests are generally in
    the .70s and .80s

19
Wechsler Scales
  • David Wechsler worked at New York's Bellevue
    Hospital. He wasn't happy with the Stanford-Binet,
    with its focus on children and its production of
    a single score.
  • In 1939, he created the Wechsler-Bellevue, later
    called the WAIS.
  • In 1949, he created the children's version, the
    WISC.
  • In 1967, he added the WPPSI for children ages
    2.5-7.

20
Structure of the WAIS
  • The WAIS yields separate verbal and performance
    IQs
  • The WAIS-III has four index scores: verbal
    comprehension, working memory, perceptual
    organization, and processing speed.

21
Scales and Norms for the WAIS
  • Determine the raw score for each subtest.
  • Convert raw scores to standard scores, called
    scaled scores (M = 10, SD = 3)
  • There are conversions for 13 age groups. This
    method of conversion obscures any differences in
    performance by age.
  • Subtest scaled scores are added, then converted
    to WAIS-III composite scores.
  • Three composite scores: verbal, performance, and
    full scale, each with M = 100, SD = 15
  • Four index scores: verbal comprehension,
    perceptual organization, working memory, and
    processing speed
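The scale conventions above (M = 10, SD = 3 for subtests; M = 100, SD = 15 for composites) can be illustrated with a simple linear standard-score transformation. This is only a sketch: actual WAIS scoring uses the publisher's age-group lookup tables, and the norm means and SDs below are hypothetical numbers chosen for illustration.

```python
def standard_score(raw, norm_mean, norm_sd, new_mean, new_sd):
    """Re-express a raw score on a scale with the given mean and SD."""
    z = (raw - norm_mean) / norm_sd  # distance from the norm mean in SD units
    return new_mean + new_sd * z


def subtest_scaled_score(raw, norm_mean, norm_sd):
    """Subtest scaled score on the M = 10, SD = 3 metric."""
    return round(standard_score(raw, norm_mean, norm_sd, 10, 3))


def composite_score(sum_of_scaled, norm_mean, norm_sd):
    """Composite score on the M = 100, SD = 15 metric."""
    return round(standard_score(sum_of_scaled, norm_mean, norm_sd, 100, 15))


# Hypothetical age-group norms: raw mean 28, raw SD 6 on one subtest.
scaled = subtest_scaled_score(raw=34, norm_mean=28, norm_sd=6)

# Hypothetical norms for the sum of scaled scores: mean 110, SD 20.
composite = composite_score(sum_of_scaled=130, norm_mean=110, norm_sd=20)

print(scaled, composite)
```

Because each age group has its own norm mean and SD, the same raw score maps to different scaled scores at different ages, which is why this method hides age differences in raw performance.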

22
Reliability of the WAIS
  • Internal consistency and test-retest
    reliabilities are about .95 or higher for full
    scale and verbal scores.
  • They're about .90 for performance and three other
    index scores: perceptual organization, working
    memory, and processing speed.
  • Internal consistency reliabilities for the
    subtests range from the upper .70s to the low
    .90s. Test-retest is about .83.
  • Generally, performance reliabilities are lower
    than verbal reliabilities on the subtests.

23
Validity of the WAIS
  • Great deal of information on criterion-related
    and construct validity.
  • Factor analyses support the use of 4 index scores.
  • Comparison studies show the pattern of WAIS-III
    scores for many special groups, e.g., Alzheimer's
    disease, Parkinson's disease, learning
    disabilities, brain injury.
  • Is the top test used today

24
Group Differences in IQ
  • Test scores that demonstrate differences among
    people may suggest that people are not created
    with the same basic abilities.
  • Biggest problem: some ethnic groups obtain lower
    average scores on some psychological tests. On
    average, African Americans score 15 points lower
    than whites on IQ tests.
  • Dispute is not whether differences occur but why
    they occur: environment vs. biology

25
Problems with Biology Argument
  • IQ scores are improving (called the Flynn
    effect), more so for African Americans than
    whites.
  • Victimization by stereotyping could affect test
    performance and grades.
  • Construct of race has no biological meaning based
    on evidence from studies in population genetics,
    the human genome and physical anthropology.

26
Criticisms related to Content Validity
  • Looking at specific items, it was thought that
    they might be biased because some children
    wouldn't have the opportunity to learn the
    material
  • Members of ethnic groups might answer some items
    differently but still correctly
  • Scores affected by language skills inculcated as
    part of a white, middle-class upbringing foreign
    to inner-city children

27
Responses to Content Validity Criticisms
  • Some evidence suggests that the linguistic bias
    in standardized tests does not cause the observed
    differences (Scheuneman, 1987).
  • Elimination of biased items from a test didn't
    change the test scores (Bianchini, 1976).
  • Can't find classes of items most likely to be
    missed by minority group members (Wild et al.,
    1989)

28
Group Tests of Mental Ability
  • Chapter 9

29
Characteristics of Group Mental Ability Tests
  • Administered to a large group
  • Composed of multiple choice items that can be
    machine-scored
  • Content similar to individual tests
  • Fixed time limit and number of items
  • Usually yield a total score and some subscores
  • Principal purpose is prediction

30
Advantages of individual tests
  • Provide information beyond the test score
  • Allow the examiner to observe behavior in a
    standard setting
  • Allow individualized interpretation of test scores

31
Advantages of group tests
  • Are cost-efficient
  • Minimize professional time for administration and
    scoring
  • Require less examiner skill and training
  • Have more objective and more reliable scoring
    procedures
  • Have a very broad application
  • Group tests far outnumber individual tests and
    group tests vary widely among themselves

32
Scoring Information for the OLSAT7
  • Yields Verbal, Nonverbal, and Total scores
  • Converted to School Ability Index (SAI) with
    M = 100 and SD = 16
  • SAIs determined separately for age groups at
    3-month intervals from ages 5-19
  • Score reports also include anticipated
    achievement comparisons (AAC) to predict
    performance on the Stanford tests

33
Psychometric Properties
  • About half a million cases are part of the
    research base for the OLSAT7
  • High internal consistency with nothing lower than
    .87 for total score (higher at higher grades)
  • KR-20 for Verbal and Nonverbal in the high .80s
    for upper grades and low .80s for lower grades
  • No test-retest reliability data
  • High correlations between the OLSAT7 and the
    Stanford tests, but other validity evidence is
    weak

34
College Admissions: The SAT
  • The College Board oversees the development of the
    test called the SAT
  • ETS (Educational Testing Service) actually
    develops the SAT
  • The SAT is a cluster of tests
  • SAT I includes the well-known Verbal and Math
    tests
  • SAT II has tests in 12 subject fields

35
SAT I Structure
  • Includes verbal (SAT-V) and math (SAT-M) summed
    to get total score
  • Uses correction for guessing
  • Range for each subtest is 200-800 with M = 500
    and SD = 100. Total M = 1000 and SD = 200
  • Norms based on test users, not any well-defined,
    predetermined population
  • Scaled score norms last determined in 1994.
    Percentile norms adjusted on an annual basis.
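The correction for guessing mentioned above is usually computed as a "formula score": rights minus a fraction of wrongs, with omitted items ignored. The sketch below assumes five answer options per item, which is not stated on the slide.

```python
def formula_score(num_right, num_wrong, options_per_item=5):
    """Rights minus wrongs / (k - 1); omitted items neither add nor subtract.

    options_per_item=5 is an assumption for illustration.
    """
    return num_right - num_wrong / (options_per_item - 1)


# 40 right, 12 wrong, remaining items omitted.
corrected = formula_score(num_right=40, num_wrong=12)

# Blind guessing on 60 five-option items: about 12 right and 48 wrong.
pure_guessing = formula_score(num_right=12, num_wrong=48)

print(corrected, pure_guessing)
```

The penalty is sized so that random guessing has an expected gain of zero, as the second call shows.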

36
Reliability and validity of the SAT
  • Internal consistencies of .91-.93
  • SEMs of about 30 points
  • Poor predictive power regarding grades of
    students scoring in middle ranges
  • Number of English or math units doesn't correlate
    significantly (maybe due to coaching)
  • Validity coefficients are about .40 with
    first-year grades
  • On old SAT, African-American and Latino students
    scored lower, sometimes by as much as 80 points.
    New test MAY have reduced that.

37
The ACT
  • ACT provides more emphasis on school-based skills
  • Provides scores for English, Math, Reading,
    Science Reasoning, and a Composite, which is an
    average of the 4 tests
  • Does NOT use a correction for guessing
  • Score range is 1-36 with M = 20 and SD = 5
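The Composite described above is a plain average of the four test scores. The sketch below assumes the average is rounded to the nearest whole number with halves rounded up; that rounding rule is an assumption here, since the slide says only "average".

```python
import math


def act_composite(english, mathematics, reading, science):
    """Average of the four test scores, rounded half up (assumed rule)."""
    average = (english + mathematics + reading + science) / 4
    # floor(x + 0.5) rounds halves up; Python's round() would round
    # halves to the nearest even number instead.
    return math.floor(average + 0.5)


composite_a = act_composite(24, 19, 22, 21)  # average 21.5
composite_b = act_composite(20, 20, 20, 21)  # average 20.25

print(composite_a, composite_b)
```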

38
Psychometric properties of the ACT
  • Norms based on users, usually about a million
    annually
  • Reliabilities range from .84 to .91, with the
    Composite score reliability being .96
  • SEMs are 1.5-2 points for each subtest and 1
    point for the Composite score
  • Correlates about .80 with SAT
  • Like SAT, high school GPA is generally as good a
    predictor as test scores

39
Graduate and Professional School Selection: The
GRE
  • Includes General Test, Subject Test, and Writing
    Assessment
  • General Test includes Verbal, Quantitative, and
    Analytical
  • General Test is intended to measure developed
    abilities that have been acquired over a long
    period of time
  • Gradually moving to computer adaptive testing

40
GRE Scores and Norms
  • Scaled score set at M = 500 and SD = 100, with a
    range of 200-800
  • Like SAT, average scores have gradually drifted
    downward
  • Norms are user norms
  • Internal consistency is in the low .90s
  • SEMs are about 30-40 points

41
Validity of the GRE
  • GRE tests correlate with first-year GPA in the
    range of the mid-.20s to low .30s.
  • Tests correlate with each other. Lowest
    correlation is V-Q at .45; highest is Q-A at .66
  • Undergraduate GPA is a better predictor than any
    one of the tests and about equal to the total
    test score in predictive validity.

42
Raven's Progressive Matrices
  • Designed to measure the fluid dimension of
    intelligence; may be the best measure of g
  • Not used more widely because of:
      • too many manuals and norm groups
      • conflicting evidence about what the test is
        measuring
  • Hasn't really eliminated differences between
    majority and minority group examinees

43
Achievement Tests
  • Chapter 11

44
Achievement vs. Aptitude tests
  • Achievement tests
      • Evaluate the effects of a known or controlled
        set of experiences
      • Evaluate the product of a course of training
      • Rely heavily on content validation procedures
  • Aptitude tests
      • Evaluate the effect of an unknown,
        uncontrolled set of experiences
      • Evaluate the potential to profit from a course
        of training
      • Rely heavily on predictive criterion
        validation procedures

45
Classification of Achievement Tests
  • Achievement battery
  • Single area achievement tests
  • Certification, licensing exams
  • State, national, international tests
  • Psycho-educational batteries
  • Teacher-made tests

46
Establishing cutscores
  • Norm-referenced approach
      • Select a percentile; everyone above that
        point is in
  • Criterion-referenced approach
      • Many approaches exist
      • Most popular approach is Angoff's, where a
        judge looks at each item and assesses the
        probability of a minimally competent person
        getting it right. Probabilities are summed to
        get the total score
      • Original judgments are often changed after
        results are seen
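The Angoff procedure described above reduces to a short calculation. The sketch below uses hypothetical ratings from three judges on a five-item test. Averaging the judges' totals into one cutscore is an assumption added here; the slide describes only the per-judge sum.

```python
from statistics import mean

# Hypothetical ratings: one row per judge, one value per item. Each value
# is the judged probability that a minimally competent examinee answers
# that item correctly.
ratings = [
    [0.9, 0.6, 0.7, 0.4, 0.8],
    [0.8, 0.5, 0.7, 0.5, 0.9],
    [0.9, 0.7, 0.6, 0.3, 0.8],
]


def judge_totals(ratings):
    """Sum each judge's item probabilities to get that judge's cutscore."""
    return [sum(judge) for judge in ratings]


def angoff_cutscore(ratings):
    """Average the per-judge totals (assumed way of combining judges)."""
    return mean(judge_totals(ratings))


totals = judge_totals(ratings)
cutscore = angoff_cutscore(ratings)

print([round(t, 1) for t in totals])
print(f"cutscore: {cutscore:.2f} of {len(ratings[0])} items")
```

If judges revise their ratings after seeing results, the same calculation is simply rerun on the revised table.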

47
TIMSS design
  • TIMSS 1999 used a matrix sampling technique to
    achieve broad coverage: the total of 308 items
    were systematically distributed across 8 test
    booklets, and the booklets were distributed
    randomly to students
  • Each student completed one 90-minute test
    booklet.
  • Approximately one-third of the items were
    constructed-response format, and the remaining
    items were multiple-choice
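The idea behind matrix sampling can be sketched in a few lines. This is a deliberately simplified illustration, not the actual TIMSS design: it deals each item into exactly one booklet in round-robin fashion, whereas an operational design may group items into blocks and place some blocks in more than one booklet. The function names and the seed are hypothetical.

```python
import random


def build_booklets(num_items, num_booklets):
    """Deal item indices round-robin so each item lands in one booklet."""
    booklets = [[] for _ in range(num_booklets)]
    for item in range(num_items):
        booklets[item % num_booklets].append(item)
    return booklets


def assign_booklets(student_ids, num_booklets, seed=0):
    """Give every student one randomly chosen booklet index."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return {student: rng.randrange(num_booklets) for student in student_ids}


booklets = build_booklets(num_items=308, num_booklets=8)
assignment = assign_booklets(range(1000), num_booklets=8)

print([len(b) for b in booklets])
print(len(assignment), "students assigned")
```

No student sees more than a fraction of the item pool, yet across the whole sample every item is answered by some students, which is how broad coverage is reached in a 90-minute session.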

48
Nagging questions about achievement tests
  • Is there some other way to measure content
    validity?
  • Is there really a difference between achievement
    and ability?
  • How motivated were examinees to perform well?
  • How can we get diagnostic information from a
    short test?
  • What is the difference between constructed
    response and selected response? Is it important?