Title: Ethical Standards for Selecting Tests to Assess Educational Abilities and Needs
1. Ethical Standards for Selecting Tests to Assess Educational Abilities and Needs
- Dr. Mary E. Stafford
- Chair, ISPA Ethics Committee
- Editor, WorldGoRound
2. Overview
- Relevant definitions
- Historic perspective on the development of standards and classification systems
- Ethical standards for selecting tests
- Cultural issues to consider
- Audience participation: issues you face related to assessment in your country
3. Definition: Test versus Assessment
- Test
  - A procedure or method to determine the presence of a phenomenon
  - A standard set of questions to assess knowledge, skills, interests, or other characteristics of an examinee
  - A set of operations designed to determine the validity of a hypothesis
- Assessment
  - An overall investigation into one's functional capacities and limitations
  - Some are brief
  - Others are comprehensive
- (VandenBos, 2006)
4. Relevant Definitions: Principles versus Standards
- Ethical principles
  - Identify virtues to which practitioners strive
  - Are desired, but not required
- Ethical standards
  - Specify behaviors that members of the professional organization are expected to follow
  - Are required to be followed
- (Koocher & Keith-Spiegel, 1998)
5. Example of Standards for Testing
- Test users should select tests that meet the intended purpose and that are appropriate for the intended test takers.
- Test users should administer and score tests correctly and fairly.
- Test users should report and interpret test results accurately and clearly.
- Test users should inform test takers about the nature of the test, test taker rights and responsibilities, the appropriate use of scores, and procedures for resolving challenges to scores.
- (Joint Committee on Testing Practices, 2004, pp. 5-11)
6. History of the Development of Classification Systems
- International List of Causes of Death, 1893
- International Classification of Diseases-6 (ICD-6), 1948
- Current systems for classification
  - International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10), 1990
  - International Classification of Functioning, Disability and Health (ICF), 2001
- (World Health Organization [WHO], 2006)
7. History of the Development of Classification Systems
- Diagnostic and Statistical Manual of Mental Disorders (DSM-I), 1952
- DSM-II, 1968
- DSM-III, 1980
- DSM-III-R, 1987
- DSM-IV, 1994
- DSM-IV-TR, 2000
- (American Psychiatric Association, 2007, http://www.psych.org/research/dor/dsm/dsm_faqs/faq81301.cfm)
8. Historic Perspective on the Development of Standards for Assessment
- Hippocratic Oath, ca. 400 BCE
- Codes of ethics from caregiving professions
- Current codes used by school psychologists
  - International School Psychology Association, 1991
  - American Psychological Association, 2002
  - National Association of School Psychologists, 2000
- Numerous codes of ethics for school psychologists in various countries
9. Ethical Standards for Selecting Tests
- Standards for this presentation derive primarily from
  - NASP Code of Ethics
  - ISPA's Code of Ethics
  - American Psychological Association's Code of Ethics
  - American Educational Research Association
  - International Test Commission
11. Defining the Purpose of Testing
- Have you fully defined, in observable, measurable terms, the primary purpose or complaint that the patient has and for which you will do the assessment?
12. Evaluating Available Tests or Other Assessment Methods
- Before selecting a test or other assessment method, have you evaluated a representative sample of test questions and/or practice tests, directions, answer sheets, manuals, and score reports?
- For the tests you consider using, did their manuals adequately describe the development of the instrument and its norming and scaling processes?
13. Evaluating Available Tests or Other Assessment Methods
- Have you evaluated the test's technical qualities by reviewing information in the test manual, research articles, and test reviews?
- Specifically, does the test provide
  - Evidence of good reliability for measuring the constructs to be assessed?
  - Information about standard errors of measurement and confidence intervals?
  - Evidence of adequate validity to address the reasons for using the test, in light of the patient's demographic qualities?
  - Information about norms for the comparison group to which the patient belongs?
14. Reliability
- The trustworthiness or accuracy of a measure
- Typically estimated from the internal consistency and stability of a test's scores
  - Internal consistency refers to the degree to which all parts of a test measure the same construct
  - Stability refers to the degree to which a test measures the same quality at different times or in different situations
  - Test-retest reliability refers to the consistency of scores obtained from the same persons when tested on two or more occasions
- A test is considered reliable if its scores provide consistent information about a person
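The internal-consistency and test-retest ideas above can be sketched numerically. The Python sketch below uses invented score data and standard formulas (Cronbach's alpha for internal consistency, a Pearson correlation for test-retest); it is illustrative only and not tied to any published test.

```python
def cronbach_alpha(item_scores):
    """Internal consistency: do all parts of the test measure the same construct?
    item_scores: one list of scores per item, aligned across the same examinees."""
    k = len(item_scores)     # number of items
    n = len(item_scores[0])  # number of examinees

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    sum_item_var = sum(variance(item) for item in item_scores)
    totals = [sum(item[i] for item in item_scores) for i in range(n)]
    return (k / (k - 1)) * (1 - sum_item_var / variance(totals))


def pearson_r(first, second):
    """Test-retest reliability: correlate the same persons' scores
    from two testing occasions."""
    n = len(first)
    m1, m2 = sum(first) / n, sum(second) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(first, second))
    s1 = sum((a - m1) ** 2 for a in first) ** 0.5
    s2 = sum((b - m2) ** 2 for b in second) ** 0.5
    return cov / (s1 * s2)
```

A coefficient near 1.00 from either estimate indicates scores that provide consistent information about a person.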
15. Reliability
- Reliability coefficients range from 0 to 1.00
  - Intelligence tests have reliability coefficients in the high .90s
  - Personality tests may have reliability coefficients in the high .70s or low .80s
- Reliability estimates
  - Between .70 and .79: fair (clinical decisions should be supportable by other strong evidence)
  - Between .80 and .90: good
  - Above .90: excellent
- Internal reliability estimates below .70 generally are considered too unstable to be used with confidence.
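The bands above translate directly into a small helper. The cut points come from the slide; the function name and labels are illustrative choices.

```python
def interpret_reliability(coefficient):
    """Qualitative label for a reliability coefficient,
    using the bands from the slide."""
    if coefficient > 0.90:
        return "excellent"
    if coefficient >= 0.80:
        return "good"
    if coefficient >= 0.70:
        return "fair (support clinical decisions with other strong evidence)"
    return "too unstable to use with confidence"
```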
16. Standard Error of Measurement
- Based on the test's reliability
- An estimate of the error score
- Provides a confidence interval, i.e., a number used to determine the range around an obtained score in which the true score lies
- Report scores using confidence intervals rather than the observed score
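As a worked sketch, the SEM and a confidence interval follow from the test's standard deviation and reliability. The numbers below (SD 15, reliability .96, observed score 110) are illustrative, not taken from any particular test manual.

```python
def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * (1 - reliability) ** 0.5


def confidence_interval(observed, sd, reliability, z=1.96):
    """Band around an obtained score in which the true score likely lies
    (z = 1.96 corresponds to a 95% interval)."""
    margin = z * standard_error_of_measurement(sd, reliability)
    return observed - margin, observed + margin


# Example: an observed score of 110 on a scale with SD 15 and reliability .96
low, high = confidence_interval(observed=110, sd=15, reliability=0.96)
# Report the interval (roughly 104-116) rather than the bare 110.
```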
17. Interrater Reliability
- Refers to the degree to which scores obtained from ratings of the same behavior by two or more independent raters are consistent
- Used for nonstandardized measures
- Calculating interrater reliability
  - Compute the percentage of times the ratings agree by dividing the number of agreements by the total number of ratings
  - Correlate the scores from two or more ratings of the child's life-skills abilities
- Measures with higher rates of agreement or higher correlations are more reliable
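The first calculation above (percent agreement) is straightforward to sketch. The ratings below are invented categorical observations of the same behavior by two raters.

```python
def percent_agreement(rater_a, rater_b):
    """Proportion of occasions on which two independent raters agreed.
    For numeric ratings, a correlation coefficient would be used instead."""
    agreements = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return agreements / len(rater_a)


ratings_a = ["on-task", "off-task", "on-task", "on-task"]
ratings_b = ["on-task", "off-task", "off-task", "on-task"]
agreement = percent_agreement(ratings_a, ratings_b)  # 3 of 4 ratings agree
```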
18. Validity
- Refers to the extent to which test scores measure their targeted construct(s), as well as the extent to which they may be used meaningfully to guide decision making
- The process of test validation involves accumulating evidence to provide a sound scientific basis for the proposed score interpretations
- Validity coefficients range from 0 to 1.00
  - The higher the coefficient, the higher the validity, and thus the greater the confidence we have in using a test's scores to make decisions
19. Validity
- Types of validity
  - Construct validity: the extent to which a test measures the theoretical construct it intends to measure
  - Face validity: the degree to which items on the test are judged to appropriately measure the targeted construct
  - Content validity: the degree to which items on a test represent the tasks, behaviors, or knowledge of the domain of interest
  - Discriminative validity: the degree to which a test is able to effectively differentiate between clinical and nonclinical samples of people who take the test
  - Criterion-related validity: the degree of relationship between a new, targeted test and an already established test that purports to measure the same construct
20. Validity
- Types of criterion-related validity
  - Predictive validity: the criterion scores are obtained at a later time
  - Concurrent validity: the tests are administered at the same time
  - Convergent validity: both tests measure the same construct
  - Divergent validity: the tests measure different psychological constructs
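A concurrent-validity check can be sketched as a correlation between a hypothetical new test and an established one administered at the same time; all scores here are invented for illustration.

```python
def pearson_r(x, y):
    """Correlation between two score lists; here it serves as a
    criterion-related validity coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


new_test = [12, 15, 11, 18, 14, 16]     # scores on the new, targeted test
established = [48, 55, 45, 62, 50, 58]  # scores on the established test
validity_coefficient = pearson_r(new_test, established)
# A coefficient near 1.00 supports using the new test's scores for decisions.
```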
21. Test Norms
- Information about the test's average and typical range of scores
- Likely to pose the greatest problem for school psychologists in countries where there are no test developers who understand the culture and language
- Consider the relevance of the norms in light of the test's use
- Look for norms that are
  - Acquired recently
  - Representative of the general population, including persons
    - From the racial/ethnic/cultural group of the child
    - From the full range of socioeconomic levels
    - With disabilities, in proportion to their representation in the population
22. Evaluating Available Tests or Other Assessment Methods
- Following your review, have you confirmed that the test procedures and materials are not potentially offensive in content or language?
23. Selecting the Best Test or Other Assessment Method
- Have you selected a test or other assessment method that
  - Addresses the needs of the assessment in light of the test's content and skills?
  - Is appropriate in light of the child's age, gender, cultural/racial/ethnic background, and developmental level?
  - Has clear, accurate, and complete psychometric information?
  - Has the potential to provide information relevant to the development or evaluation of interventions for this child?
24. Providing Accommodations for Subgroups
- If the test taker has disabilities that require special accommodations, have you selected tests for which modified forms and/or administration procedures exist or can be developed?
- If the test takers are members of diverse subgroups, have you evaluated cultural learning factors relevant to test-taking behaviors and determined, to the extent feasible, which performance differences are likely to be caused by factors related to culture rather than to the skills being assessed?
25. Evaluating Your Administration Skills for the Selected Test
- Do you have the appropriate knowledge, skills, and training to properly administer the selected assessment method?
26. Cultural Issues to Consider
- Standards for ethnically appropriate test selection
  - Select tests that are fair to all test takers
  - Eliminate language, symbols, words, phrases, and content that generally are regarded as offensive by members of racial, ethnic, gender, or other groups, except when judged necessary for adequate representation of the domain
  - Minimize the linguistic or reading demands of the test to the level necessary for valid assessment when linguistic or reading ability is not critical to the assessment
  - Describe linguistic modifications, and the rationale for them, in detail in the test manual
- Biases from our own cultural learning
- Differences in worldview
27. Cultural Issues to Consider: Summary
- Norms may not be available for the child's group
- Translators may not be trained in test administration
- Test developers may not be willing to expend money without getting a return on it
- Interpretation of test data may be difficult
- It is difficult to recognize our own biases, and we tend to assume that others view the world the same way we do
28. Audience Participation
- What problems do you face related to the assessment of children in your country?
- What steps have you taken to try to deal with the problems you face in this area?
- What solutions would you recommend be considered in solving these problems?