Unit 2 Review presentation

1
Unit 2 Review

Chapters 7-9, 11

2
Theories of Intelligence

Two classical theories of intelligence
Spearman's g, or two-factor theory
Thurstone's 9 Primary Mental Abilities
Guilford's Structure of Intellect is another
multi-factor theory of intelligence
Also known as the psychometric theories of
intelligence because of their reliance on data
relationships

3
Chapter 7

Theories of Intelligence

4
Spearman

Developed first formal theory about human mental
ability
One general, g, factor accounted for correlations
among tests of simple sensory functions
Each test also had a specific component, s,
unique to that test plus error
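
The two-factor idea can be sketched numerically. The block below is a minimal illustration, assuming standardized tests that share nothing except g; under that assumption the correlation between two tests is the product of their g loadings, and whatever variance g does not explain is left to s plus error. The test names and loading values are hypothetical and do not come from the slides.

```python
from itertools import combinations

# Hypothetical g loadings for four tests (illustrative values only).
g_loadings = {"pitch": 0.8, "weight": 0.7, "light": 0.6, "classics": 0.5}


def implied_correlation(a_i: float, a_j: float) -> float:
    """Correlation between two standardized tests that share only g."""
    return a_i * a_j


def specific_variance(a_i: float) -> float:
    """Proportion of a standardized test's variance left to s plus error."""
    return 1.0 - a_i ** 2


# Every pairwise correlation is determined by the two loadings involved.
correlations = {
    (t1, t2): implied_correlation(g_loadings[t1], g_loadings[t2])
    for t1, t2 in combinations(g_loadings, 2)
}

for pair, r in correlations.items():
    print(pair, round(r, 2))

for test, loading in g_loadings.items():
    print(test, "specific + error variance:", round(specific_variance(loading), 2))
```

Tests with higher g loadings correlate more strongly with every other test, which is the pattern a single general factor is meant to explain.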

5
Thurstone's Primary Mental Abilities

S: Spatial
P: Perceptual (esp. speed of visual perception)
N: Numerical (speed and accuracy of computation)
V: Verbal
M: Memory
W: Words (word fluency or disarranged words)
I: Induction (finding a rule)
R: Reasoning (arithmetic)
D: Deduction (application of rule)

6
Hierarchical Models

Are a compromise in the one vs. many argument
Acknowledge there are many separate abilities, but
these can be arranged so only a few dominant factors
are at the top of the hierarchy.
Include Cattell (crystallized and fluid), Vernon
(verbal-educational and spatial-mechanical), and
Carroll (three-stratum theory with g as the top
stratum)

7
Other Theories

Developmental (e.g., Piaget)
Information Processing--based on elementary
cognitive tasks, ECT, such as reaction time
(e.g., Jensen, Sternberg)
Biological Theories (e.g., Gardner's theory of
multiple intelligences)

8
Differences by Sex

Differences minimal on total scores
Males outperform females on tests of spatial
ability (effect size of .5-.7)
Females outperform males on verbal tests during
childhood and much of adolescence
Greater variability in intelligence for males
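
An effect size expresses a mean difference in standard deviation units. The sketch below computes Cohen's d with a pooled standard deviation, which is one common definition; the slides do not say which formula was used, and the group means, standard deviations, and sample sizes here are made-up numbers for illustration.

```python
import math


def cohens_d(mean_a: float, sd_a: float, n_a: int,
             mean_b: float, sd_b: float, n_b: int) -> float:
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


# Hypothetical spatial-test results for two groups of 100 examinees each.
d = cohens_d(mean_a=53.0, sd_a=10.0, n_a=100,
             mean_b=47.0, sd_b=10.0, n_b=100)
print(round(d, 2))
```

A d of 0 means the group means coincide; a d of 1 means they differ by one full standard deviation.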

9
Differences by Racial/Ethnic Group

Compared to whites:
Blacks are about 1 SD below
Hispanics and Native Americans are .5-1 SD below
on verbal and at the mean on performance tests
Asians have a similar verbal mean and are about 1
SD above on non-verbal tests

10
Heredity and Environment

Intelligence results from an interaction of
heredity and environment
Estimates of heritability range from .4 to .8
(median of about .5 or .6)
Heritability increases with age
g has a higher heritability index than specific
abilities

11
Correlations of IQ Scores, from Bouchard & McGue
(1981)

Identical twins reared together .86
Identical twins reared apart .72
Same sex fraternal twins reared together .62
Opposite sex fraternal twins reared together .57
Non-twin siblings reared together .47
Unrelated (adopted) siblings reared together .30
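
One rough way to turn the twin correlations above into a heritability estimate is Falconer's formula, h2 = 2 * (r_MZ - r_DZ). The slides do not name this formula; it is used here as an assumption, and it relies on simplifying conditions (equal environments for both twin types, additive genetic effects). The correlations are the ones listed on this slide.

```python
def falconer_heritability(r_mz: float, r_dz: float) -> float:
    """Falconer's estimate of heritability from twins reared together."""
    return 2 * (r_mz - r_dz)


def shared_environment(r_mz: float, h2: float) -> float:
    """Variance attributed to shared environment under the same model."""
    return r_mz - h2


# Correlations from the Bouchard & McGue (1981) slide.
r_mz_together = 0.86      # identical twins reared together
r_dz_same_sex = 0.62      # same sex fraternal twins reared together
r_mz_apart = 0.72         # identical twins reared apart

h2 = falconer_heritability(r_mz_together, r_dz_same_sex)
c2 = shared_environment(r_mz_together, h2)

print("Falconer h2:", round(h2, 2))
print("Shared environment c2:", round(c2, 2))
# The reared-apart correlation is sometimes read as a direct estimate.
print("MZ reared apart:", r_mz_apart)
```

Both figures fall inside the .4 to .8 range of heritability estimates quoted on the Heredity and Environment slide.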

12
Individual Tests of Intelligence

Chapter 8

13
Common Characteristics of Individual Intelligence
Tests

individually administered
administration requires advanced training
tests cover wide range of age and ability
examiner must establish rapport
immediate scoring of items
usually requires about one hour
allows opportunity for observation

14
Two Main Individually Administered Intelligence
Tests

Stanford-Binet
Binet wanted to create a process for identifying
intellectually limited children so they could be
removed from the regular classroom and put in
special education.
Wechsler scales
Developed in response to the perceived
shortcomings of the Stanford-Binet

15
Early Binet Scales

1905: 30 items ordered by difficulty. The test
lacked:
adequate measuring units to express results
(only used idiot, imbecile, and moron)
adequate normative data (only used 50 subjects)
evidence of validity
1908: Grouped items according to age level
rather than simply increasing difficulty.
Introduced concept of mental age.
Increased norm group to 203.
Criticized because it produced only one score,
almost exclusively related to verbal, language,
and reading ability

16
Modern Binet scale

Totally revised in 1986 by Thorndike et al.
Used Thurstone's multidimensional model (1938)
g made up of crystallized abilities (verbal and
quantitative reasoning), fluid-analytic abilities
(abstract-visual reasoning), and short-term memory.

17
Structure of the SB-IV

Verbal reasoning included vocabulary test,
comprehension test, absurdities test, and verbal
relations test.
Abstract-visual reasoning included pattern
analysis test, copying test, matrices test,
paper-folding and cutting test.
Quantitative reasoning included quantitative
test, number series test, equation-building test.
Short-term memory included bead memory, memory
for sentences, memory for digits, and memory for
objects
Composite included all areas combined

18
Psychometric properties of SB-IV

Standardization sample stratified based on the 1980
census: geographic region, community size,
ethnic group, age, and gender.
Internal consistency reliability is .98 for
composite and .93-.97 for area scores. Some
individual test scores are lower; .73 for memory
for objects is the lowest.
Test-retest reliabilities for composite score
were .91 and .90 for 5 and 8-year-olds.
Factor analysis supports the structure of the
test.
Correlations with other IQ tests are generally in
the .70s and .80s

19
Wechsler Scales

David Wechsler worked at New York's Bellevue Hospital.
He wasn't happy with the Stanford-Binet, with its
focus on children and its production of a
single score.
In 1939, he created the Wechsler-Bellevue, later
called the WAIS.
In 1949, he created the children's version, the
WISC.
In 1967, he added the WPPSI for children ages
2.5-7.

20
Structure of the WAIS

The WAIS yields separate verbal and performance
IQs
The WAIS-III has four index scores: verbal
comprehension, working memory, perceptual
organization, and processing speed.

21
Scales and Norms for the WAIS

Determine raw score for each subtest.
Convert raw scores to standard scores, called
scaled scores (M = 10, SD = 3)
There are conversions for 13 age groups. This
method of conversion obscures any differences in
performance by age.
Subtest scaled scores are added, then converted
to WAIS-III composite scores.
Three composite scores: verbal, performance, and
full scale, each with M = 100, SD = 15
Four index scores: verbal comprehension,
perceptual organization, working memory, and
processing speed
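
The two score scales above can be illustrated with a simple linear transformation of z-scores. This is only a sketch of what the scales mean: the actual WAIS conversions come from age-based lookup tables in the test manual, and the raw-score norm mean and SD used below are hypothetical.

```python
def z_from_raw(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Standardize a raw score against a norm group's mean and SD."""
    return (raw - norm_mean) / norm_sd


def to_standard_score(z: float, mean: float, sd: float) -> float:
    """Place a z-score on a scale with the given mean and SD."""
    return mean + sd * z


# Hypothetical subtest norms for one age group: raw mean 30, raw SD 6.
z = z_from_raw(raw=36, norm_mean=30, norm_sd=6)

scaled_score = to_standard_score(z, mean=10, sd=3)      # subtest scale
composite_score = to_standard_score(z, mean=100, sd=15)  # composite scale

print("z:", z)
print("Scaled score (M = 10, SD = 3):", scaled_score)
print("Same standing on the composite scale (M = 100, SD = 15):", composite_score)
```

Because each age group has its own conversion, the same raw score maps to different scaled scores at different ages, which is why the scaled scores hide age differences in raw performance.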

22
Reliability of the WAIS

Internal consistency and test-retest
reliabilities are about .95 or higher for full
scale and verbal scores.
They're about .90 for performance and three other
index scores: perceptual organization, working
memory, and processing speed.
Internal consistency reliabilities for the
subtests range from the upper .70s to the low .90s.
Test-retest is about .83.
Generally, performance reliabilities are lower
than verbal reliabilities on the subtests.

23
Validity of the WAIS

Great deal of information on criterion-related
and construct validity.
Factor analyses support use of 4 index scores.
Comparison studies show the pattern of WAIS-III
scores for many special groups, e.g., Alzheimer's
disease, Parkinson's, learning disabled, brain
injury.
Is the top test used today

24
Group Differences in IQ

Test scores that demonstrate differences among
people may suggest that people are not created
with the same basic abilities.
Biggest problem: some ethnic groups obtain lower
average scores on some psychological tests. On
average, African Americans score 15 points lower
than whites on IQ tests.
Dispute is not whether differences occur but why
they occur: environment vs. biology

25
Problems with Biology Argument

IQ scores are improving (called the Flynn
effect), more so for African Americans than
whites.
Victimization by stereotyping could affect test
performance and grades.
Construct of race has no biological meaning based
on evidence from studies in population genetics,
the human genome and physical anthropology.

26
Criticisms related to Content Validity

Looking at specific items, it was thought that
they might be biased because some children
wouldn't have the opportunity to learn the
material
Members of ethnic groups might answer some items
differently but still correctly
Scores are affected by language skills inculcated as
part of a white, middle-class upbringing foreign
to inner-city children

27
Responses to Content Validity Criticisms

Some evidence suggests that the linguistic bias
in standardized tests does not cause the observed
differences (Scheuneman, 1987).
Elimination of biased items from a test didn't
change the test scores (Bianchini, 1976).
Can't find classes of items most likely to be
missed by minority group members (Wild et al.,
1989)

28
Group Tests of Mental Ability

Chapter 9

29
Characteristics of Group Mental Ability Tests

Administered to a large group
Composed of multiple choice items that can be
machine-scored
Content similar to individual tests
Fixed time limit and number of items
Usually yield a total score and some subscores
Principal purpose is prediction

30
Advantages of individual tests

Provide information beyond the test score
Allow the examiner to observe behavior in a
standard setting
Allow individualized interpretation of test scores

31
Advantages of group tests

Are cost-efficient
Minimize professional time for administration and
scoring
Require less examiner skill and training
Have more objective and more reliable scoring
procedures
Have a very broad application
Group tests far outnumber individual tests and
group tests vary widely among themselves

32
Scoring Information for the OLSAT7

Yields Verbal, Nonverbal, and Total scores
Converted to School Ability Index (SAI) with
M = 100 and SD = 16
SAIs determined separately for age groups at
3-month intervals from ages 5-19
Score reports also include anticipated
achievement comparisons (AAC) to predict
performance on the Stanford tests

33
Psychometric Properties

About half a million cases are part of the
research base for the OLSAT7
High internal consistency with nothing lower than
.87 for total score (higher at higher grades)
KR-20 for Verbal and Nonverbal in the high .80s
for upper grades and low .80s for lower grades
No test-retest reliability data
High correlations between the OLSAT7 and the
Stanford tests, but other validity evidence is
weak
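
KR-20 is an internal consistency coefficient for items scored right or wrong. The sketch below computes it from a small response matrix using the standard formula, KR-20 = (k / (k - 1)) * (1 - sum(p * q) / variance of total scores), with the population variance. The response data are invented for illustration and have nothing to do with the OLSAT7.

```python
def kr20(responses: list[list[int]]) -> float:
    """KR-20 for a matrix of 0/1 item scores (rows = examinees)."""
    n_people = len(responses)
    n_items = len(responses[0])

    # Variance of total scores across examinees (population form).
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_people
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people

    # Sum of item variances p * q, where p is the proportion correct.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / var_total)


# Hypothetical data: 4 examinees, 3 items, 1 = correct.
data = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
reliability = kr20(data)
print(round(reliability, 2))
```

Longer tests with items that rank examinees consistently push the coefficient toward 1, which fits the higher values reported for total scores than for the shorter Verbal and Nonverbal parts.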

34
College Admissions: The SAT

The College Board oversees the development of the
test called the SAT
ETS (Educational Testing Service) actually
develops the SAT
The SAT is a cluster of tests
SAT I includes the well-known Verbal and Math
tests
SAT II has tests in 12 subject fields

35
SAT I Structure

Includes verbal (SAT-V) and math (SAT-M) summed
to get total score
Uses correction for guessing
Range for each subtest is 200-800 with M = 500 and
SD = 100. Total M = 1000 and SD = 200
Norms based on test users, not any well-defined,
predetermined population
Scaled score norms last determined in 1994.
Percentile norms adjusted on an annual basis.
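
The correction for guessing mentioned above is usually computed with the standard formula score, R - W / (k - 1), where R is the number right, W the number wrong, and k the number of answer options; omitted items are not penalized. The slides do not give the formula, so treat this as an assumed illustration. The counts below are hypothetical, and the resulting raw score would still have to be converted to the 200-800 scale.

```python
def formula_score(n_right: int, n_wrong: int, n_options: int) -> float:
    """Raw score corrected for guessing; omitted items do not count."""
    return n_right - n_wrong / (n_options - 1)


# Hypothetical examinee on a 60-item, five-option section:
# 40 right, 12 wrong, 8 omitted.
corrected = formula_score(n_right=40, n_wrong=12, n_options=5)
print(corrected)
```

Under this formula, blind guessing on five-option items gains nothing on average: one expected right answer in five is offset by four quarter-point penalties.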

36
Reliability and validity of the SAT

Internal consistencies of .91-.93
SEMs of about 30 points
Poor predictive power regarding grades of
students scoring in middle ranges
Number of English or math units doesn't correlate
significantly (maybe due to coaching)
Validity coefficients are about .40 with first-year
grades
On old SAT, African-American and Latino students
scored lower, sometimes by as much as 80 points.
New test MAY have reduced that.
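
The reliability and SEM figures above are linked by the standard formula SEM = SD * sqrt(1 - reliability). The sketch below checks that link using the subtest SD of 100 from the SAT I structure slide; the observed score of 550 is a hypothetical example.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM from the score scale's SD and a reliability coefficient."""
    return sd * math.sqrt(1 - reliability)


sem_low = standard_error_of_measurement(sd=100, reliability=0.91)
sem_high = standard_error_of_measurement(sd=100, reliability=0.93)
print(round(sem_low, 1), round(sem_high, 1))

# Band of plus or minus one SEM around a hypothetical observed score.
observed = 550
band = (observed - sem_low, observed + sem_low)
print(tuple(round(x) for x in band))
```

A band of one SEM on either side of the observed score covers roughly two thirds of the scores an examinee would be expected to get on repeated testing, which is why small score differences between applicants should not be over-interpreted.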

37
The ACT

ACT provides more emphasis on school-based skills
Have scores for English, Math, Reading, Science
Reasoning, and Composite which is an average of
the 4 tests
Does NOT use a correction for guessing
Score range is 1-36 with M = 20 and SD = 5

38
Psychometric properties of the ACT

Norms based on users, usually about a million
annually
Reliabilities range from .84-.91, with the
Composite score reliability being .96
SEMs are 1.5-2 points for each subtest and 1
point for the Composite score
Correlates about .80 with SAT
Like SAT, high school GPA is generally as good a
predictor as test scores

39
Graduate and Professional School Selection: The
GRE

Includes General Test, Subject Test, and Writing
Assessment
General Test includes Verbal, Quantitative, and
Analytical
General Test is intended to measure developed
abilities that have been acquired over a long
period of time
Gradually moving to computer adaptive testing

40
GRE Scores and Norms

Scaled score set at M = 500 and SD = 100, with a
range of 200-800
Like SAT, average scores have gradually drifted
downward
Norms are user norms
Internal consistency is in the low .90s
SEMs are about 30-40 points

41
Validity of the GRE

GRE tests correlate with first-year GPA in the
range of mid-.20s to low .30s.
Tests correlate with each other. Lowest
correlation is V-Q at .45, highest is Q-A at .66
Undergraduate GPA is a better predictor than any
of the tests and about equal to total test score
in predictive validity.

42
Raven's Progressive Matrices

Designed to measure the fluid dimension of
intelligence; may be the best measure of g
Not used more widely because:
too many manuals and norm groups
conflicting evidence about what the test is
measuring
Hasn't really eliminated differences between
majority and minority group examinees

43
Achievement Tests

Chapter 11

44
Achievement vs. Aptitude tests

Achievement tests
Evaluate the effects of a known or controlled set
of experiences
Evaluate the product of a course of training
Rely heavily on content validation procedures
Aptitude tests
Evaluate the effect of an unknown, uncontrolled
set of experiences
Evaluate the potential to profit from a course of
training
Rely heavily on predictive criterion validation
procedures

45
Classification of Achievement Tests

Achievement battery
Single area achievement tests
Certification, licensing exams
State, national, international tests
Psycho-educational batteries
Teacher-made tests

46
Establishing cutscores

Norm-referenced approach
select a percentile and everyone above that point
is in
Criterion-referenced approach
many approaches
Most popular approach is Angoff's, where a judge
looks at each item and assesses the probability
of a minimally competent person getting it right.
Probabilities summed to get total score
Original judgments often changed after results
are seen
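
The Angoff procedure described above reduces to simple arithmetic: each judge's item probabilities are summed to give that judge's cut score. The sketch below also averages across judges to get a panel cut score, which is a common practice but an assumption here, since the slide does not say how multiple judges are combined. The ratings are hypothetical.

```python
def judge_cut_score(item_probabilities: list[float]) -> float:
    """Sum of one judge's probabilities that a minimally competent
    examinee answers each item correctly."""
    return sum(item_probabilities)


def panel_cut_score(ratings_by_judge: list[list[float]]) -> float:
    """Mean of the judges' individual cut scores (assumed combination rule)."""
    per_judge = [judge_cut_score(r) for r in ratings_by_judge]
    return sum(per_judge) / len(per_judge)


# Hypothetical ratings: two judges, five items.
ratings = [
    [0.9, 0.8, 0.6, 0.5, 0.7],
    [0.8, 0.7, 0.7, 0.4, 0.6],
]

cut = panel_cut_score(ratings)
print("Per-judge cut scores:", [round(judge_cut_score(r), 2) for r in ratings])
print("Panel cut score out of 5 items:", round(cut, 2))
```

When judges revise their ratings after seeing actual results, the same functions can simply be rerun on the revised ratings.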

47
TIMSS design

TIMSS 1999 used a matrix sampling technique to
achieve broad coverage: the total of 308 items
was systematically distributed across 8 test
booklets, and the booklets were distributed
randomly to students
Each student completed one 90-minute test
booklet.
Approximately one-third of the items were
constructed-response format, and the remaining
items were multiple-choice
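
Matrix sampling can be sketched as two steps: spread the item pool across booklets, then hand booklets to students at random. The block below is a simplified illustration that deals the 308 items into 8 non-overlapping booklets in round-robin order. That disjoint layout is an assumption made for brevity and is not claimed to match the actual TIMSS booklet design; the student IDs and the random seed are hypothetical.

```python
import random


def build_booklets(n_items: int, n_booklets: int) -> list[list[int]]:
    """Deal item IDs into booklets in round-robin order."""
    booklets: list[list[int]] = [[] for _ in range(n_booklets)]
    for item_id in range(n_items):
        booklets[item_id % n_booklets].append(item_id)
    return booklets


def assign_booklets(student_ids: list[str], n_booklets: int,
                    seed: int = 0) -> dict[str, int]:
    """Give each student one randomly chosen booklet number."""
    rng = random.Random(seed)
    return {student: rng.randrange(n_booklets) for student in student_ids}


booklets = build_booklets(n_items=308, n_booklets=8)
students = [f"student_{i}" for i in range(20)]  # hypothetical roster
assignment = assign_booklets(students, n_booklets=8)

print("Items per booklet:", [len(b) for b in booklets])
print("Booklet for student_0:", assignment["student_0"])
```

No student sees the whole item pool, yet across the sample every item is answered by some students, which is how the design gets broad content coverage from a single testing session per student.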

48
Nagging questions about achievement tests

Is there some other way to measure content
validity?
Is there really a difference between achievement
and ability?
How motivated were examinees to perform well?
How can we get diagnostic information from a
short test?
What is the difference between constructed
response and selected response? Is it important?
