1. The Bumps and Bruises of the Evaluation Mine Field
Presented by Olivia Silber Ashley, Dr.P.H.
Presented to the Office of Adolescent Pregnancy Programs Care Grantee Conference, February 1-2, 2007, New Orleans, Louisiana
3040 Cornwallis Road, P.O. Box 12194
Research Triangle Park, NC 27709
Phone: 919-541-6427
E-mail: osilber_at_rti.org
Fax: 919-485-5555
RTI International is a trade name of Research Triangle Institute
2. Overview
- Core evaluation instruments
- Evaluation design
- Analysis
3. Background on Core Evaluation Instruments
- The Office of Management and Budget (OMB) recently examined the AFL program using its Program Assessment Rating Tool (PART)
  - Identified program strengths
    - Program purpose
    - Design
    - Management
  - Identified areas for improvement
    - Strategic planning
    - Program results/accountability
- In response, OPA
  - Developed baseline and follow-up core evaluation instruments
  - Developed performance measures to track demonstration project effectiveness
4. Staff and Client Advisory Committee
- Anne Badgley
- Leisa Bishop
- Doreen Brown
- Carl Christopher
- Cheri Christopher
- Audra Cummings
- Christina Diaz
- Amy Lewin
- David MacPhee
- Janet Mapp
- Ruben Martinez
- Mary Lou McCloud
- Charnese McPherson
- Alice Skenandore
- Jared Stangenberg
- Cherie Wooden
5. Capacity Assessment Methods
- Review of grant applications, annual reports, and other information from the 28 most recently funded programs
- Qualitative assessment involving program directors, evaluators, and staff in
  - 14 Title XX Prevention programs
  - 14 Title XX Care programs
- Telephone interviews
- Site visits
- Observations of data collection activities
- Document review
- Conducted between January 26, 2006, and March 16, 2006
- 31 interviews involving 73 interviewees across 28 programs
- 100% response rate
6. Selected Title XX Prevention and Care Programs
- Baptist Children's Home Ministries
- Boston Medical Center
- Emory University
- Freedom Foundation of New Jersey, Inc.
- Heritage Community Services
- Ingham County Health Department
- James Madison University
- Kings Community Action
- National Organization of Concerned Black Men
- Our Lady of Lourdes
- Red Cliff Band of Chippewas
- St. Vincent Mercy Medical Center
- Switchboard of Miami, Inc.
- Youth Opportunities Unlimited
- Children's Home Society of Washington
- Children's Hospital
- Choctaw Nation of Oklahoma
- Congreso de Latinos Unidos
- Hidalgo Medical Services
- Illinois Department of Human Services
- Metro Atlanta Youth for Christ
- Roca, Inc.
- Rosalie Manor Community Family Services
- San Mateo County Health Services Agency
- Truman Medical Services
- University of Utah
- Youth and Family Alliance/Lifeworks
- YWCA of Rochester and Monroe
7. Capacity Assessment Research Questions
- How and to what extent have AFL projects used the core evaluation instruments?
- What problems have AFL projects encountered with the instruments?
8. Difficulties with Core Evaluation Instruments among Care Programs
9. Difficulties with Core Evaluation Instruments among Prevention Programs
10. Expert Work Group
- Elaine Borawski
- Claire Brindis
- Meredith Kelsey
- Doug Kirby
- Lisa Lieberman
- Dennis McBride
- Jeff Tanner
- Lynne Tingle
- Amy Tsui
- Gina Wingood
11. Draft Revision of Core Evaluation Instruments
- Confidentiality statement
- 5th-grade reading level
- Instructions for adolescent respondents
- Re-ordering of questions
- Improved formatting
- Sensitivity to diverse family structures
- Consistency in response options
- Improved fidelity to original source items
- Eliminated birth control question for pregnant adolescents
- Modified birth control question for parenting adolescents
- Clarified reference child
- Separated questions about counseling/testing and treatment for STDs
- Modified living situation question
- Improved race question
- Added pneumococcal vaccine (PCV) item
12. Why Is a Rigorous Evaluation Design Important?
- Attribute changes to the program
- Reduce likelihood of spurious results
- OMB performance measure to improve evaluation quality
- Peer-reviewed publication
- Continued funding for your project and for the AFL program
- Ensure that program services are helpful to pregnant and parenting adolescents
13. Evaluation Design
- Appropriate to answer the evaluation research questions
- Begin with the most rigorous design possible
- A randomized experimental design is the gold standard for answering research questions about program effectiveness
- Units of study (such as individuals, schools, clinics, or geographical areas) are randomly allocated to groups exposed to different treatment conditions
14. Barriers to Randomized Experimental Design
- Costs
  - Consume a great deal of real resources
  - Costly in terms of time
  - Involve significant political costs
- Ethical issues raised by experimentation with human beings
- Limited in duration
- High attrition in either the treatment or control group
- Population enrolled in the treatment and control groups not representative of the population that would be affected by the treatment
- Possible program contamination across treatment groups
- Lack of experience using this design
- (Bauman, Viadro, & Tsui, 1994; Burtless, 1995)
15. Benefits of Randomized Experimental Design
- Able to infer causality
  - Assures the direction of causality between treatment and outcome
  - Removes any systematic correlation between treatment status and both observed and unobserved participant characteristics
  - Permits measurement of the effects of conditions that have not previously been observed
- Offers advantages in making results convincing and understandable to policy makers
  - Policymakers can concentrate on the implications of the results for changing public policy
  - The small number of qualifications to experimental findings can be explained in lay terms
- (Bauman, Viadro, & Tsui, 1994; Burtless, 1995)
16. Strategies for Implementing Randomized Experimental Design
- Read methods sections from evaluations using randomized experimental design
- Ask for evaluation technical assistance to implement this design
- Recruit all interested adolescents
- Ask parents/adolescents for permission to randomly assign to one of two conditions
- Divide program components into two conditions
- Overlay one component on top of others
- Focus outcome evaluation efforts on randomly assigned adolescents
- Include all adolescents in process evaluation
17. An Example
- Study examined whether
  - A home-based mentoring intervention prevented a second birth within 2 years of the first birth
  - Increased participation in the intervention reduced the likelihood of a second birth
- Randomized controlled trial involving first-time black adolescent mothers (n = 181) younger than age 18
- Intervention based on social cognitive theory, focused on interpersonal negotiation skills, adolescent development, and parenting
- Delivered bi-weekly until the infant's first birthday
- Mentors were black, college-educated single mothers
- Control group received usual care
- No differences in baseline contraceptive use or other measures of risk or family formation
- Follow-up at 6, 13, and 24 months after recruitment at first delivery
- Response rate was 82% at 24 months
- Intent-to-treat analysis showed that intervention mothers were less likely than control mothers to have a second infant
- Two or more intervention visits increased the odds of avoiding a second birth more than threefold
- Source: Black et al. (2006). Delaying second births among adolescent mothers: A randomized, controlled trial of a home-based mentoring program. Pediatrics, 118, e1087-e1099.
18. Obtaining and Maintaining a Comparison Group
- Emphasize the value of research
- Explain exactly what the responsibilities of the comparison group will be
- Minimize burden to the comparison group
- Ask for commitment in writing
- Provide incentives for data collection
- Provide non-related services/materials
- Meet frequently with people from participating community organizations and schools
- Provide school-level data to each participating school (after data are cleaned and de-identified)
- Work with organizations to help them obtain resources for other health problems they are concerned about
- Add questions that other organizations are interested in
- Explain the relationship of this project to the efforts of OAPP
- Adapted from Foshee, V.A., Linder, G.F., Bauman, K.E., Langwick, S.A., Arriaga, X.B., Heath, J.L., McMahon, P.M., & Bangdiwala, S. (1996). The Safe Dates Project: Theoretical basis, evaluation design, and selected baseline findings. American Journal of Preventive Medicine, 12, 39-47.
19. Analysis
- Include process measures in outcome analysis
- Attrition analysis
- Missing data
- Assessment of baseline differences between treatment groups
- Intent-to-treat analysis
- Multivariate analysis controlling for variables associated with baseline differences and attrition
20. Incorporate Process Evaluation Measures in Outcome Analysis
- Process evaluation measures assess qualitative and quantitative parameters of program implementation
  - Attendance data
  - Participant feedback
  - Program-delivery adherence to implementation guidelines
- Facilitate replication, understanding of outcome evaluation findings, and program improvement
- Avoids Type III error: concluding that a program is not effective when the program was not implemented as intended
- Source: USDHHS. (2002). Science-based prevention programs and principles, 2002. Rockville, MD: Author.
21. Attrition Analysis
- Number of participants lost over the course of a program evaluation
- Some participant loss is inevitable due to transitions among program recipients
- Extraordinary attrition rates generally lower the degree of confidence reviewers are able to place on outcome findings
- Not needed if imputing data for all respondent missingness
- Evaluate the relationship of study variables to dropout status (from baseline to follow-up)
- Report findings from the attrition analysis, including the direction of findings
- Control for variables associated with dropout in all multivariate outcome analyses
- Source: USDHHS. (2002). Science-based prevention programs and principles, 2002. Rockville, MD: Author.
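The step of evaluating how study variables relate to dropout status can be illustrated with a minimal sketch. The record fields (`age`, `followed_up`) are hypothetical; a real attrition analysis would compare many baseline variables and test whether the differences are statistically significant before deciding which to control for.

```python
def attrition_summary(records):
    """Summarize attrition and compare a baseline variable between
    completers and dropouts.

    Field names are illustrative assumptions for this sketch.
    """
    completers = [r for r in records if r["followed_up"]]
    dropouts = [r for r in records if not r["followed_up"]]
    mean = lambda xs: sum(xs) / len(xs)
    return {
        "attrition_rate": len(dropouts) / len(records),
        "mean_age_completers": mean([r["age"] for r in completers]),
        "mean_age_dropouts": mean([r["age"] for r in dropouts]),
    }

# Four hypothetical participants, two lost to follow-up
sample = [
    {"age": 16, "followed_up": True},
    {"age": 17, "followed_up": True},
    {"age": 15, "followed_up": False},
    {"age": 16, "followed_up": False},
]
summary = attrition_summary(sample)
```

Variables that differ between completers and dropouts are the ones the slide says to carry into the multivariate outcome models as controls.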
22. Missing Data
- Not the same as attrition (the rate at which participants prematurely leave an evaluation)
- Absence of or gaps in information from participants who remain involved
- A large amount of missing data can threaten the integrity of an evaluation
- Item-level missingness
  - Run frequency distributions for all items
  - Consider logical skips
  - Report missingness
  - Address more than 10% missingness
- Imputation procedures
  - Single-value imputation
  - Multiple imputation (SAS PROC MI) replaces missing values in a dataset with a set of plausible values
  - Full information maximum likelihood (FIML) estimation in a multilevel structural equation modeling (SEM) framework in Mplus 4.1 (Muthén & Muthén, 1998-2006)
- Source: USDHHS. (2002). Science-based prevention programs and principles, 2002. Rockville, MD: Author.
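Running frequency distributions for item-level missingness, and flagging items over the 10% threshold mentioned above, might look like the following sketch. The item names and the list-of-dicts layout are assumptions; in practice this check would run against the cleaned survey dataset, after accounting for logical skips.

```python
def item_missingness(responses, items):
    """Per-item missingness rates; None marks a missing answer.

    Flags items above the 10% rule of thumb from the slide.
    Item names and data layout are illustrative.
    """
    rates = {}
    for item in items:
        missing = sum(1 for r in responses if r.get(item) is None)
        rates[item] = missing / len(responses)
    flagged = [item for item, rate in rates.items() if rate > 0.10]
    return rates, flagged

# 10 hypothetical respondents; q1 is missing twice, q2 once
responses = [{"q1": None, "q2": 1}, {"q1": None, "q2": 2}, {"q1": 3, "q2": None}] + \
            [{"q1": i, "q2": i} for i in range(7)]
rates, flagged = item_missingness(responses, ["q1", "q2"])
```

Flagged items would then be candidates for the imputation procedures the slide lists (single-value imputation, multiple imputation, or FIML).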
23. Analysis
- Appropriateness of data analytic techniques for determining the success of a program
  - Employ state-of-the-art data analysis techniques to assess program effectiveness by participant subgroup
  - Use the most suitable current methods to measure outcome change
  - Subgroup (moderation) analyses allow evaluation of outcomes by participant age and ethnicity, for example
- Okay to start with descriptive statistics
- Report baseline and follow-up results for both treatment and comparison groups
- Conduct multivariate analysis with treatment condition predicting the difference of differences
- Control for variables associated with attrition
- Control for variables associated with differences at baseline
- Source: USDHHS. (2002). Science-based prevention programs and principles, 2002. Rockville, MD: Author.
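The "difference of differences" quantity above reduces, at its core, to a simple calculation on group means: the pre-to-post change in the treatment group minus the same change in the comparison group. The sketch below uses hypothetical scores; the slide's point is that a full analysis estimates this inside a multivariate model with controls, not as a bare subtraction.

```python
def diff_of_differences(t_base, t_follow, c_base, c_follow):
    """Change in treatment-group mean minus change in comparison-group
    mean. A positive value favors the treatment group (for an outcome
    where higher is better). Inputs here are hypothetical mean scores.
    """
    return (t_follow - t_base) - (c_follow - c_base)

effect = diff_of_differences(t_base=20, t_follow=35, c_base=21, c_follow=26)
```

Subtracting the comparison group's change nets out secular trends that would have occurred without the program, which is why both baseline and follow-up results should be reported for both groups.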
24. Assessment of Baseline Differences between Treatment and Comparison Groups
- Address the following research question: Are treatment and comparison group adolescents similar in terms of
  - Baseline levels of outcome variables (e.g., educational achievement, current school status)
  - Key demographic characteristics, such as
    - Age
    - Race/ethnicity
    - Pregnancy stage
    - Marital status
    - Living arrangements
    - SES
25. Test for Baseline Differences
- Test for statistically significant differences in the proportions of adolescents in each category
- If you decide to analyze potential mediators as short-term program outcomes, test for baseline differences on these mediators
- Report results from these tests in the end-of-year evaluation report for each year that baseline data are collected
- Important for peer-reviewed publication
- Control for variables associated with treatment condition in outcome analyses
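One standard way to test for a difference in proportions between the two groups is a two-proportion z-test with a pooled standard error; this is a generic statistical sketch, not a method the presentation prescribes, and the counts are hypothetical. Roughly, |z| > 1.96 indicates a two-tailed difference at p < .05.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for the difference between two group proportions
    (e.g., proportion married at baseline in treatment vs. comparison),
    using the pooled-proportion standard error.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                      # pooled proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical counts: 30 of 100 treatment vs. 20 of 100 comparison adolescents
z = two_proportion_z(30, 100, 20, 100)
```

Variables that do show significant baseline differences are the ones to carry into outcome analyses as controls, per the last bullet above.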
26. An Example: Children's Hospital Boston
- Study to increase parenting skills and improve attitudes about parenting among parenting teens through a structured psychoeducational group model
- All parenting teens (n = 91) were offered a 12-week group parenting curriculum
- Comparison group (n = 54) declined the curriculum but agreed to participate in the evaluation
- Pre-test and post-test measures included the Adult-Adolescent Parenting Inventory (AAPI), the Maternal Self-Report Inventory (MSRI), and the Parenting Daily Hassles Scale
- Analyses controlled for mother's age, baby's age, and race
- Results showed that program participants, or those who attended more sessions, improved their mothering role, perception of childbearing, developmental expectations of the child, and empathy for the baby, and reduced the frequency of hassles in child and family events
- Source: Woods et al. (2003). The parenting project for teen mothers: The impact of a nurturing curriculum on adolescent parenting skills and life hassles. Ambulatory Pediatrics, 3, 240-245.
27. Moderation and Mediation Analyses
- Test for moderation
  - Assess the interaction between treatment and demographic/baseline risk variables
  - When the interaction term is significant, stratify by levels of the moderator variable and re-run analyses for subgroups
- Test for mediation
  - Standard z-test based on the multivariate delta standard error for the estimate of the mediated effect (MacKinnon, Lockwood, Hoffman, West, & Sheets, 2002; Sobel, 1982)
  - Treatment condition beta value is attenuated by 20% or more after controlling for proposed mediators (Baron & Kenny, 1986)
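The Sobel test referenced above has a closed form: z = ab / sqrt(b²·sa² + a²·sb²), where a is the treatment-to-mediator path, b the mediator-to-outcome path, and sa, sb their standard errors. The coefficient values in this sketch are hypothetical.

```python
import math

def sobel_z(a, se_a, b, se_b):
    """Sobel z-test for the mediated (indirect) effect a*b, using the
    multivariate delta standard error. Inputs are the two path
    coefficients and their standard errors from the fitted models.
    """
    return (a * b) / math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)

# Hypothetical paths: treatment -> mediator (a), mediator -> outcome (b)
z = sobel_z(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
```

A |z| above 1.96 suggests the indirect effect through the proposed mediator is statistically significant at roughly p < .05.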
28. An Example: AFL Care Program Logic Model (flattened from a slide diagram)
- Process evaluation covers the program components; outcome evaluation covers the goal and outcomes
- Main effect pathway
  - Training curriculum → teacher characteristics (improved interactions with adolescent; positive messages about adolescents' capabilities) → improved adolescent self-efficacy to succeed academically → longer adolescent stay in school
- Mediating effect pathways (program content, program delivery, program activities)
  - Academic case management and family planning counseling → improved adolescent behavioral capability to use contraception and negotiate with partner → increased adolescent contraceptive use → reduced adolescent repeat pregnancy
  - Grandparent support group → grandparent characteristics (increased knowledge about immunization benefits; increased skills for avoiding conflict with adolescent) → improved adolescent outcome expectations about immunizations → increased immunizations
- Moderating effect
  - Demographic characteristics, family dysfunction, and adolescent age at first pregnancy
29. Intent-to-Treat Analysis
- Requires that all respondents initially enrolled in a given program condition be included in the first pass of an analysis strategy, regardless of whether respondents subsequently received the program treatment (Hollis & Campbell, 1999)
- Report findings from the intent-to-treat analysis
- Important for peer-reviewed publication
- Okay to re-run analyses, recoding respondents as not receiving the program or dropping them from analyses
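The first-pass intent-to-treat rule can be sketched as follows: every respondent is analyzed in the arm they were randomized to, whether or not they actually received the program. Field names and the binary outcome are illustrative assumptions.

```python
def itt_rates(participants):
    """First-pass intent-to-treat comparison of outcome rates by
    assigned arm. The `received` flag is deliberately ignored here,
    which is what makes this intent-to-treat.
    """
    rates = {}
    for arm in ("treatment", "control"):
        group = [p for p in participants if p["assigned"] == arm]
        rates[arm] = sum(p["outcome"] for p in group) / len(group)
    return rates

sample = [
    {"assigned": "treatment", "received": True,  "outcome": 0},
    {"assigned": "treatment", "received": False, "outcome": 1},  # never attended, still analyzed as treatment
    {"assigned": "control",   "received": False, "outcome": 1},
    {"assigned": "control",   "received": False, "outcome": 1},
]
rates = itt_rates(sample)
```

A secondary, per-protocol pass, recoding or dropping non-receivers as the last bullet allows, would filter on the `received` flag instead.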