State-NAEP Standard Mappings: Cautions and Alternatives

Transcript and Presenter's Notes



1
State-NAEP Standard Mappings: Cautions and Alternatives
  • Andrew Ho
  • University of Iowa

2
When People Want a Method
  • Give it to them?
  • So someone more dangerous won't?
  • But build in limitations?
  • Make strenuous cautions?
  • And try to change the subject?

3
State-NAEP Percent Proficient Comparisons
4
Visualizing Proficiency
  • Setting cut scores is a judgmental process.
  • Shifting the cut score changes the percent of
    proficient students in a nonlinear fashion (see
    the sketch below).
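A minimal Python sketch of that nonlinearity, assuming a purely hypothetical normal score distribution (mean 240, SD 35): equal 30-point shifts in the cut score change the percent proficient by very different amounts depending on where the cut sits.

    from statistics import NormalDist

    # Hypothetical NAEP-like score distribution; the mean and SD are
    # illustrative only.
    scores = NormalDist(mu=240, sigma=35)

    for cut in (180, 210, 240, 270, 300):
        pct_proficient = 100 * (1 - scores.cdf(cut))   # percent at or above the cut
        # Equal 30-point cut shifts: percent proficient moves 95.7 -> 80.4
        # -> 50.0 -> 19.6 -> 4.3, i.e., by about 15, 30, 30, and 15 points.
        print(cut, round(pct_proficient, 1))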

5
Percent Proficiency by Cut Score
6
Why a State-NAEP Link?
  • State A and state B both report that 50% of
    students are proficient.
  • Are states comparable in student proficiency, or
    are their definitions of proficient different?
  • If state A has a higher standard for proficiency,
    then state A students are more proficient.
  • How can we tell?
  • Rationale: If both states use a NAEP-like test,
    and state A students are more proficient on NAEP,
    then state A must have the higher standard for
    proficiency.

7
States A and B on NAEP
8
Map State Standards Onto NAEP
9
Interpretations
  • State A (250) has a higher standard for
    proficiency than state B (225).
  • Essentially adjusts or handicaps state
    proficiency standards based on NAEP performance.
  • All else being equal (!), higher scoring NAEP
    states will appear to have higher standards.
  • What percent of proficient students would state B
    have to report to have a standard as high as or
    higher than state A's? (See the sketch below.)
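A quick way to answer that question under hypothetical distributions: if state A's standard maps to 250 on the NAEP scale, the largest percent proficient state B could report while matching that standard is the share of its NAEP takers scoring at or above 250. The normal distribution below is invented for illustration.

    from statistics import NormalDist

    # Hypothetical NAEP score distribution for state B's test takers.
    state_b_naep = NormalDist(mu=235, sigma=30)
    state_a_mapped_cut = 250   # state A's standard, mapped onto the NAEP scale

    # To claim a standard at least as high as state A's, state B could
    # report at most this percent proficient.
    max_pct_for_b = 100 * (1 - state_b_naep.cdf(state_a_mapped_cut))
    print(round(max_pct_for_b, 1))   # about 31 under these made-up numbers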

10
Same Standards, Lower Proficient
11
A Closer Look (Braun, Qian, McLaughlin)
[Figure: NAEP distributions and state-test percents proficient for students who took both tests. Braun and Qian average percents proficient; McLaughlin averages scores.]
12
Strong Inferences, What Support?
  • Handicapping state percents proficient by NAEP
    performance logically requires a strong
    relationship between state tests and NAEP.
  • Does this method require a strong relationship
    between state tests and NAEP?
  • Does this method provide evidence for a strong
    relationship between state tests and NAEP?

13
Throw Percents, See What Falls Out
14
It Doesn't Matter What Percents
  • The method is blind to the meaning of the
    percentages.
  • We can map state football standards to NAEP.
  • States A and B both claim that 10% of the
    students who took NAEP are JV- or Varsity-ready.
  • Conclusion: State A has higher standards due to
    its higher NAEP performance. State B should let
    fewer students on its teams to match state A's
    standards.
  • We can map height to NAEP.
  • States A and B both claim that 60% of the
    students who took NAEP are tall.
  • Conclusion: State A has higher standards due to
    its higher NAEP performance. State B should
    consider its students shorter to match state A's
    standards.

15
Can the Method Check Itself?
Braun and Qian (2007)
16
Mapped Standard and Proficient
  • There is a strong negative correlation between
    the NAEP mapping of the state standard and the
    percent of proficient students.
  • Braun and Qian: "We assert that most of the
    observed differences among states in the
    proportions of students meeting states'
    proficiency standards are the result of
    differences in the stringency of their
    standards."
  • Perhaps true, but unfounded.

17
Look again at how the method works
18
Unfounded (see also Koretz, 2007)
  • The plot does not support the inference.
  • It doesn't matter whether you are throwing
    percents of proficient students, percents of
    football players, or percents of tall students:
    you will see a strong negative correlation (see
    the simulation sketch below).
  • It's standard setting and not...
  • Test Content?
  • Motivation?
  • State policies?
  • This method cannot make these distinctions.
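One way to see why: hand the mapping arbitrary percentages that have nothing to do with proficiency, and the mapped standard and the percentage still correlate strongly and negatively, because each cut is constructed as the (100 - p)th percentile of that state's NAEP distribution. A simulation sketch with invented states (statistics.correlation requires Python 3.10+):

    import random
    from statistics import NormalDist, correlation

    random.seed(1)
    cuts, pcts = [], []
    for _ in range(50):
        naep_mean = random.gauss(240, 8)      # hypothetical state NAEP mean
        pct = random.uniform(10, 90)          # arbitrary 'percent proficient'
        # Equipercentile-style mapping: the cut is the NAEP score with
        # pct percent of the state's NAEP distribution at or above it.
        cut = NormalDist(naep_mean, 35).inv_cdf(1 - pct / 100)
        cuts.append(cut)
        pcts.append(pct)

    print(round(correlation(cuts, pcts), 2))  # strongly negative, by construction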

19
If You Link It, They Will Come
  • Braun and Qian have built a provocative,
    speculative, casual mapping that will be
    dramatically overinterpreted.
  • Braun and Mislevy (2005) argue against Intuitive
    Test Theory tenets such as "A Test Measures What
    it Says at the Top of the Page" and "A Test is a
    Test is a Test."
  • Would you handicap your golf game based on your
    tennis performance? The analogy is largely
    untested.

20
Three Higher Order Issues
  • Mappings are unlikely to last over time.
  • NAEP and state trends are dissimilar.
  • Percent Proficiency is Horrible Anyway
  • Absolutely abysmal for reporting trends and gaps.
  • NAEP-State Content analyses are the way to
    proceed.
  • But the outcome variables should not be
    percent-proficient-based statistics.
  • And this is just not easy to do.

21
Proficiency Trends Over Time?
22
Proficiency Trends Are Not Linear
23
Revisiting the Distribution
24
Percent Proficiency by Cut Score
25
Two Time Points and a Cut Score
26
Another Perspective (Holland, 2002)
27
Proficiency Trends Depend on Cut Score
28
Five States Under NCLB
29
Sign Reversal?! Sure.
30
Six Blind Men and an Elephant
[Figure: six blind men and an elephant; labels: fan, spear, wall, rope, snake, trunk.]
31
NAEP R4 Trends by Cut Score
32
State R4 Trends by Cut Score
33
Percent Proficient-Based Trends
  • Focus on only a slice of the distribution.
  • Are not expected to be linear over time.
  • Depend on the choice of cut score (see the sketch
    below).

And now you want to compare them across tests?!
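A numerical sketch of the cut-score dependence (and the sign reversal from slide 29), with invented normal distributions in which the mean rises from one year to the next while the spread narrows:

    from statistics import NormalDist

    # Hypothetical score distributions at two time points.
    year1 = NormalDist(mu=230, sigma=36)
    year2 = NormalDist(mu=235, sigma=30)

    for cut in (190, 230, 270, 290):
        p1 = 1 - year1.cdf(cut)
        p2 = 1 - year2.cdf(cut)
        # The percent-proficient 'trend' at this cut: positive for low
        # cuts, negative for high ones, even though the mean went up.
        print(cut, round(100 * (p2 - p1), 1))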
34
[Figure: NAEP vs. State Trends; NAEP trend plotted against state trend, each as a scale-invariant effect size.]
35
Model Trend Discrepancies
  • As a function of content (Ho, 2005; Wei, Lukoff,
    Shen, Ho, and Haertel, 2006).
  • Overlapping content trends should be similar;
    nonoverlapping content trends should account for
    the discrepancies (a minimal illustration follows
    this list).
  • Doesn't work well in practice, so far.
  • The tests are just too different!
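A toy illustration of that logic; the strands, weights, and strand-level trends below are invented and do not come from the cited studies. The shared strands trend identically on both tests, so the overall discrepancy traces entirely to content the state test covers but NAEP does not.

    # Hypothetical content strands with trends expressed as effect sizes.
    shared_trends = {"number": 0.08, "algebra": 0.03, "geometry": 0.05}
    naep_weights  = {"number": 0.40, "algebra": 0.30, "geometry": 0.30}

    # The state test adds a state-specific strand that trends much more
    # steeply (say, heavily emphasized local content).
    state_weights = {"number": 0.30, "algebra": 0.20, "geometry": 0.20,
                     "state_specific": 0.30}
    state_trends  = dict(shared_trends, state_specific=0.30)

    naep_trend  = sum(naep_weights[s] * shared_trends[s] for s in naep_weights)
    state_trend = sum(state_weights[s] * state_trends[s] for s in state_weights)

    # If the logic held in practice, the nonoverlapping strand would
    # account for the whole discrepancy, as it does here by construction.
    print(round(naep_trend, 3), round(state_trend, 3),
          round(state_trend - naep_trend, 3))   # 0.056 0.13 0.074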

36
2005 NAEP Black-White Gap by PAC (Percent Above Cut)
37
Gap Trend Flipping
38
NAEP Math Grade 4 Gap Trends
39
Don't Forget the Elephant
  • Gaps Naturally Bow
  • Trends Naturally Bow, Occasionally Flip
  • Gap Trends NATURALLY Flip (see the sketch after
    this list)
  • High-Stakes Ambiguity
  • Students are learning more and less.
  • Teachers are teaching better and worse.
  • NCLB is working and not working.
  • There is and is not progress toward equity in
    educational opportunity.
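A sketch of the first three bullets, with two invented groups that gain the same five points, so the gap in average scores never moves. The gap in percent above a cut still bows with the cut, and its trend flips sign between low and high cuts.

    from statistics import NormalDist

    # Hypothetical group distributions at two time points; both groups
    # gain exactly 5 points, so the gap in means is constant.
    white = (NormalDist(245, 32), NormalDist(250, 32))
    black = (NormalDist(215, 32), NormalDist(220, 32))

    for cut in (180, 230, 280):
        gaps = []
        for year in (0, 1):
            pw = 1 - white[year].cdf(cut)
            pb = 1 - black[year].cdf(cut)
            gaps.append(100 * (pw - pb))      # gap in percent above the cut
        # The gap bows (small, large, small across cuts) and its trend
        # flips from negative at the low cut to positive at the high cut.
        print(cut, round(gaps[0], 1), round(gaps[1], 1),
              round(gaps[1] - gaps[0], 1))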

40
State vs. NAEP: Two Kids on Pogo Sticks
41
State vs. NAEP
  • Averages are better than PACs.
  • But even averages are weak.
  • Distorting the scale can reverse the sign of an
    average-based trend, too!
  • Not a problem for a single test, but it is for
    cross-test comparisons between tests with
    different scales.
  • How about a scale-free statistic? (See the sketch
    below.)
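A toy demonstration with invented scores: a strictly increasing rescaling of the same data flips the sign of an average-based trend, while a rank-based statistic, here the probability that a randomly chosen later score beats a randomly chosen earlier one, is untouched by any monotone rescaling. (That statistic is one example of a scale-free summary, not necessarily the one behind the slides.)

    from statistics import mean

    # Invented scores at two time points on a test's reported scale.
    year1 = [1, 9, 9]
    year2 = [4, 4, 4]
    print(mean(year2) - mean(year1))        # negative: year 2 looks worse

    # A strictly increasing rescaling that compresses the top of the scale.
    def rescale(x):
        return x if x <= 5 else 5 + 0.01 * (x - 5)

    r1 = [rescale(x) for x in year1]
    r2 = [rescale(x) for x in year2]
    print(mean(r2) - mean(r1))              # positive: the same trend flips sign

    # P(later score > earlier score), counting ties as half; invariant
    # under any strictly increasing rescaling.
    def prob_superiority(before, after):
        wins = sum((y > x) + 0.5 * (y == x) for x in before for y in after)
        return wins / (len(before) * len(after))

    print(prob_superiority(year1, year2), prob_superiority(r1, r2))  # identical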

42
Further Considerations
  • Which states have comparable 2003-05 trends?
  • Changed cut scores, tests, or scales?
  • Incorporate Alternate Assessments?
  • Fall vs. Spring Testing?
  • Grade 3 or 5 in place of Grade 4?
  • Reading vs. English and Language Arts?

43
References
  • Koretz, McCaffrey and Hamilton (2001)
  • Linn, Baker and Betebenner (2002)
  • Haertel, Thrash and Wiley (1978)
  • Spencer (1983)
  • Holland (2002)
  • Ho and Haertel (2006)
  • And thanks to Tracey Magda, my RA.

44
You Don't Need the State Distribution!
[Figure: NAEP proficiency curve and state proficiency curve.]
But a Mapping Cannot Validate Itself Alone!!!
45
Mapping State Football Standards Onto the NAEP
Scale
[Figure: state football proficiency curve mapped onto the NAEP proficiency curve; football levels labeled PeeWee, Pop, Jr., JV, HS, C, Pro.]
46
Mapping School Doorway Height Onto the NAEP Scale
[Figure: state height-cutoff curve mapped onto the NAEP proficiency curve; heights labeled 4'9" through 6'3".]
47
Tabular Equipercentile Mapping
  • For all students who take NAEP,
  • Sort their heights in descending order.
  • Sort their NAEP scores in descending order.
  • Mapping: pair heights and scores rank for rank (a
    minimal sketch follows).
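A sketch of such a table with invented data: sort each column separately and pair values rank for rank. Nothing in the procedure checks whether height has anything to do with NAEP performance, which is exactly the point of the satire.

    # Invented heights (inches) and NAEP scores for students who took NAEP.
    heights = [70, 63, 66, 59, 68, 61, 65]
    naep    = [251, 208, 244, 199, 230, 215, 222]

    # Tabular equipercentile-style mapping: sort each variable in
    # descending order and pair values rank for rank, so any height
    # cutoff 'maps' to the NAEP score holding the same rank.
    table = list(zip(sorted(heights, reverse=True),
                     sorted(naep, reverse=True)))
    for height, score in table:
        print(height, score)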