Title: Applying Rigorous Standards to Program Evaluation
1 Applying Rigorous Standards to Program Evaluation
- Technical Assistance Workshop on Evaluation for Evidence-Based Programs
- Savannah, Georgia
- March 9, 2007
- Chris Ringwalt, DrPH
- Senior Scientist
2 What this presentation is about
- Ensure a common understanding of the evaluation requirements of the Partnerships in Character Education Program
- Based on the language of the enabling legislation
- What this presentation is not about
- Evaluation 101
3 What's required of you? (Part I)
- To conduct a comprehensive evaluation of your program based on a scientific research design
- To assess the integration of your program into your schools' classroom instruction and its consistency with state academic standards
- To assess program effects on
- Students (including students with disabilities)
- Teachers
- Administrators
- Parents
- Others
4 What's required of you? (Part II)
- To determine your program's effects on key elements of character education, such as
- Caring
- Civic virtue and citizenship
- Justice and fairness
- Respect
- Responsibility
- Trustworthiness
- Giving
- Other elements pertinent to your program
5 What's required of you? (Part III)
- Factors from which you may select to determine program effectiveness
- Discipline
- Academic achievement
- Participation in extra-curricular activities
- Faculty and administration involvement
- Student and staff morale
- School climate
- Parental and community involvement
6 The next slide displays one example of a research design
- Type of evaluation is indicated by
- PE = Process Evaluation
- OE = Outcome Evaluation
- Both types of evaluation are needed
7 (No transcript: slide shows the example research design)
8 So what's a scientifically-based research design?
- Guidance from Identifying and Implementing Educational Practices Supported by Rigorous Evidence: A User-Friendly Guide
- U.S. Department of Education
- Institute of Education Sciences
- http://www.ed.gov/rschstat/research/pubs/rigorousevid/rigorousevid.pdf
9 Well-designed evaluations address these questions
- Did your program have the effects desired?
- Are you certain that whatever effects you found (or didn't find) can be attributed to the program, and not to something else?
- Did you collect your data in a rigorous and defensible fashion?
10 Hierarchy of designs
- Gold standard: Randomized Controlled Trials (RCTs)
- Quasi-experimental designs that use very closely matched comparison groups
- Quasi-experimental designs that use comparison groups of convenience
- Single group designs
11 Now, RCTs only work if you
- Randomize at the right level (student, class, or entire school)
- Can avoid cross-group contamination
- Can secure the cooperation of the individuals or groups to be randomized
- Can prevent any manipulation
- Have enough units of randomization to
- achieve equivalence between groups
- address your research questions (i.e., power; see the sketch after this list)
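A minimal sketch of the sample-size arithmetic behind that last point, assuming a school-randomized design. The effect size, intraclass correlation (ICC), and students-per-school figures below are illustrative assumptions, not numbers from the presentation; the point is that clustering inflates the number of units you need.

from scipy.stats import norm

def schools_per_arm(effect_size, icc, students_per_school,
                    alpha=0.05, power=0.80):
    """Approximate number of schools needed per arm in a school-randomized trial."""
    z_alpha = norm.ppf(1 - alpha / 2)   # two-sided test
    z_beta = norm.ppf(power)
    # Students per arm if individuals were randomized (standard two-sample formula)
    n_students = 2 * (z_alpha + z_beta) ** 2 / effect_size ** 2
    # Inflate for clustering: design effect = 1 + (m - 1) * ICC
    design_effect = 1 + (students_per_school - 1) * icc
    return n_students * design_effect / students_per_school

# Illustrative inputs only: effect size d = 0.25, ICC = 0.05, 60 students per school
print(round(schools_per_arm(0.25, 0.05, 60), 1))   # roughly 17 schools per arm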
12 Quasi-experimental designs, however
- Depend on the quality of the match between intervention and control groups
- Are weakened with matches of convenience, which often fail
- May have groups that differ in many ways, both measurable and unobservable (e.g., motivation to change)
- Require careful adjustments, which only work if you can adjust on things that matter
13 Some design options for quasi-experimental designs
- Close matching of schools using all available data (e.g., from the Common Core of Data at http://nces.ed.gov/ccd/)
- Pairwise matching
- Propensity scores (see the sketch after this list)
- Comparisons of successive cohorts of students in a given grade or set of grades
- Time series designs that rely on archival achievement or disciplinary data
- Linkage of student survey to achievement data at the individual level
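A minimal propensity-score matching sketch, not taken from the presentation. The data frame, column names, and covariates are hypothetical; it only illustrates the workflow of modeling the probability of being a program school and matching on that score.

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

# Hypothetical school-level data: 1 = program school, 0 = comparison pool
schools = pd.DataFrame({
    "treated":     [1, 1, 1, 0, 0, 0, 0, 0],
    "enrollment":  [450, 600, 380, 470, 610, 900, 350, 520],
    "pct_frl":     [0.55, 0.40, 0.62, 0.50, 0.43, 0.20, 0.65, 0.48],
    "prior_score": [210, 225, 205, 212, 228, 240, 203, 218],
})
covariates = ["enrollment", "pct_frl", "prior_score"]

# Step 1: estimate each school's probability of being a program school
model = LogisticRegression().fit(schools[covariates], schools["treated"])
schools["pscore"] = model.predict_proba(schools[covariates])[:, 1]

# Step 2: match each program school to the comparison school with the
# nearest propensity score (1:1 nearest-neighbor matching)
treated = schools[schools["treated"] == 1]
controls = schools[schools["treated"] == 0]
nn = NearestNeighbors(n_neighbors=1).fit(controls[["pscore"]])
_, idx = nn.kneighbors(treated[["pscore"]])
matches = controls.iloc[idx.ravel()]

print(treated[["pscore"]].assign(matched_pscore=matches["pscore"].values))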
14 Here's what you are doing
15 Here are the types of archival data you're using (updated 2005 survey)
16 Focusing on . . .
17 Other important things to think about (pages 5-8)
- Describe how you implemented your program.
- Specify the chain of logic by which you expect your program to achieve results.
- Check for baseline differences between intervention and comparison groups (see the sketch after this list).
- Use reliable and valid measures.
- Collect data in a consistent and unbiased fashion.
- Minimize loss to follow-up (attrition).
- Report findings for everyone enrolled in the program, not just those exposed to it.
- Conduct analyses appropriate to the level of assignment.
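A minimal sketch of checking baseline equivalence, not taken from the presentation. The pretest scores below are hypothetical; in practice you would run the same comparison on your own baseline data.

import numpy as np
from scipy import stats

intervention_pre = np.array([72, 68, 75, 80, 66, 71, 74, 69, 77, 73])
comparison_pre   = np.array([70, 65, 78, 74, 67, 72, 69, 71, 75, 68])

# Two-sample t-test on pretest scores
t_stat, p_value = stats.ttest_ind(intervention_pre, comparison_pre)

# Standardized mean difference (Cohen's d with pooled SD), a scale-free
# gauge of how far apart the groups start out
pooled_sd = np.sqrt((intervention_pre.var(ddof=1) + comparison_pre.var(ddof=1)) / 2)
smd = (intervention_pre.mean() - comparison_pre.mean()) / pooled_sd

print(f"t = {t_stat:.2f}, p = {p_value:.3f}, SMD = {smd:.2f}")
# A large baseline gap signals that outcome analyses need to adjust for
# the pretest, or that the comparison group match should be revisited.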
18 More things to think about (pages 9-14)
- Report program effects in plain English.
- Be wary of reporting effects on sub-groups.
- Remember that multiple evaluations with consistent effects are required to generate confidence in program effects.
- Minimize selection bias in comparison groups.
- Maintain and measure program fidelity.
19 In summary
- RCTs are best, but
- Quasi-experimental designs can be very powerful tools.
- Strength of evidence depends on similarity of comparison groups.
- Outcome evaluations mean nothing without process evaluations to complement them.
- There is no substitute for an evaluation design carefully specified prior to program implementation.
- Close and ongoing communication between program directors and evaluators is key.
20 Good work!
- Let me know if you'd like to bounce any ideas off me.
- I can be reached at
- Phone: (919) 265-2613
- E-mail: ringwalt@pire.org