Title: Measuring Learning with WBIEG Level-2 Evaluation Toolkit
- By Violaine Le Rouzic, Evaluation Officer, WBIEG
- March 27, 2005
What is a Level-2 evaluation?
- Objective
- To help refine a course by
- Measuring how much participants learned
- Assessing what participants learned
- Evaluation Design
- Test all participants' knowledge of course contents at the start and the end of the course
- Same difficulty on pre- and post-tests, but different items, to avoid pretest recall effect
- Main Indicator
- Group's Learning Gain = Post-test score - Pre-test score
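As a minimal sketch of the main indicator, the snippet below computes a group's learning gain as the mean post-test score minus the mean pre-test score, using only respondents who took both tests. The dictionary layout, the participant codes, and the function name `group_learning_gain` are assumptions made for illustration; they are not part of the Toolkit.

```python
from statistics import mean


def group_learning_gain(pre_scores, post_scores):
    """Return (gain, n_matched) for respondents present in both tests.

    pre_scores / post_scores: dict mapping a participant code to a
    percent-correct score (0-100). This layout is an assumption.
    """
    # Keep only participants whose code appears on both test forms.
    matched = sorted(set(pre_scores) & set(post_scores))
    if not matched:
        raise ValueError("no respondent took both tests")
    pre_mean = mean(pre_scores[code] for code in matched)
    post_mean = mean(post_scores[code] for code in matched)
    return post_mean - pre_mean, len(matched)


# Hypothetical scores: A03 missed the post-test, A04 missed the pre-test.
pre = {"A01": 40.0, "A02": 55.0, "A03": 30.0}
post = {"A01": 70.0, "A02": 65.0, "A04": 80.0}
gain, n = group_learning_gain(pre, post)
print(gain, n)
```

Restricting the calculation to matched respondents keeps the pre- and post-test means comparable, at the cost of dropping anyone who took only one test.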
What is the WBIEG Level-2 Evaluation Toolkit?
- A set of guidelines, templates, databases, and a macro enabling course teams to assess their participants' learning.
- Adapts psychometrics to the WB context to measure learning with fair confidence (tradeoff between science and feasibility)
Main Toolkit features
- To measure participant group learning at the course
- Practical, short, step-by-step
- Tasks divided between content experts and assistants
- Accounts for short preparation time, few test takers
- No evaluation knowledge required
- Need basic Word, Excel, and Internet browsing skills
- Test form templates in over ten languages
- Not theoretical, state of the art psychometrics
- Not for certification of individual participants
For whom is the Toolkit?
- World Bank course teams (for courses with external and/or internal participants)
- World Bank managers or any WB staff who want to compare learning outcomes data across various criteria (division, years, etc.)
- Course teams in other organizations can use the evaluation tools on the external WB web site (but the test items and results database is for World Bank users only).
When to use the Toolkit?
- Level-2 evaluation is feasible
- Learning knowledge/skills is the main objective
- Learning objectives are clear before the course
- Every participant follows the same curriculum
- Worth investing in Level-2 evaluation
- Course will be offered again
- Long enough (at least 1 week recommended)
- Many participants (30 or more recommended)
- Enough resources and commitment
- Two staff-weeks of time for the evaluation
- Commitment to use the evaluation results
Toolkit's 13 Steps
http://web.worldbank.org/WBSITE/EXTERNAL/WBI/0,,contentMDK:20270021~pagePK:209023~piPK:335094~theSitePK:213799,00.html
1. Plan the evaluation
- For course director
- Course team resources needed
- 1 week of the content expert team (can be split)
- 1 week of assistants
- Less time required on subsequent L2 evaluations
- Through the course cycle
- From early design stage to course re-design
- Most time for test development before delivery
- Start at early course design stage
http://siteresources.worldbank.org/WBIINT/Resources/Plan-for-Level-2.pdf
2. Map the test
- For course director
- Build a test specification matrix to determine
- Which content areas should be tested
- To which cognitive domain each area relates
- How many test items are needed per content area and cognitive domain (recommended minimum total: 20 item pairs per test)
- Objective
- Make the test representative of the course content
http://siteresources.worldbank.org/WBIINT/Resources/How-to-build-matrix.pdf
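A test specification matrix can be pictured as a table of item-pair counts indexed by content area and cognitive domain. The sketch below is illustrative only: the content areas, cognitive domains, and counts are hypothetical placeholders, and only the 20-pair minimum comes from the slide.

```python
# Recommended minimum total number of item pairs per test (from the slide).
MIN_ITEM_PAIRS = 20

# Hypothetical matrix: (content area, cognitive domain) -> number of item pairs.
spec = {
    ("Content area A", "Knowledge"): 4,
    ("Content area A", "Application"): 3,
    ("Content area B", "Knowledge"): 5,
    ("Content area B", "Application"): 4,
    ("Content area C", "Analysis"): 4,
}


def total_item_pairs(matrix):
    """Total number of item pairs planned across the whole matrix."""
    return sum(matrix.values())


def pairs_per_area(matrix):
    """Item pairs per content area, summed over cognitive domains."""
    totals = {}
    for (area, _domain), count in matrix.items():
        totals[area] = totals.get(area, 0) + count
    return totals


total = total_item_pairs(spec)
print(total, total >= MIN_ITEM_PAIRS)
print(pairs_per_area(spec))
```

Comparing the per-area totals with the share of course time spent on each area is one way to check that the test is representative of the course content.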
3. Review past items (optional)
- For content experts of World Bank only
- Consult a database with over 5,000 items used in over 100 WB courses with Level-2 evaluations
- Search for keywords in offering titles or items
- Potential benefits
- Save time if some items fit your needs
- Identify issues to avoid from past items
- Get ideas on writing new items
- Caution
- Item quality is context-specific, don't reuse blindly!
http://intranet.worldbank.org/WBSITE/INTRANET/UNITS/WBIINT/0,,contentMDK:20191931~pagePK:135700~piPK:135698~theSitePK:136975,00.html
4. Write items
- For content experts
- Match the content area and cognitive domain of the test specification matrix
- Test items use multiple-choice format
- All items have five response options (the last option is always "I don't know.")
- Average difficulty level
- Clearly stated
http://siteresources.worldbank.org/WBIINT/Resources/How-to-write-items.pdf
5. Pair items
- For content experts
- For each item, write an equivalent item
- Same difficulty level
- Same content area
- Same cognitive domain
- Same length
- Same format
- Examples in guidelines
- Objective: Make pre- and post-tests equivalent
http://siteresources.worldbank.org/WBIINT/Resources/How-to-pair-items.pdf
6. Pilot tests
- For content experts (with assistants)
- Have volunteers take the tests (or part of the test) before the course. Volunteers can be:
- Other content experts (to check key)
- Alumni
- Participant look-alikes
- Non-content experts
- BUT NOT the actual participants!
- Collect comments and demographics with test responses. Test without, then with key.
http://siteresources.worldbank.org/WBIINT/Resources/How-to-pilot-items.pdf
7. Review test items
- For content experts (with assistants)
- Use the pilot test responses and the Toolkit checklist to review each item for:
- Content
- Wording
- Statistical item analysis results (if any)
- Finalize the items
http://siteresources.worldbank.org/WBIINT/Resources/How-to-review-items.pdf
8. Produce test forms
- For assistants
- Use automated template to randomly assign items to either pre- or post-test
- Use test form templates (customize the templates, as needed)
- Use formatting and production guidelines
- Poor test form production can ruin all results!
http://siteresources.worldbank.org/WBIINT/Resources/How-to-prepare-test-forms.pdf
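The random assignment step can be sketched as follows: for each pair of equivalent items, a coin flip decides which item goes on the pre-test and which on the post-test. This is a stand-in written for illustration, not the logic of the Toolkit's automated template; the function name `assign_pairs`, the item labels, and the seed are all assumptions.

```python
import random


def assign_pairs(item_pairs, seed=None):
    """Randomly send one item of each pair to the pre-test, the other to the post-test.

    item_pairs: list of (item, equivalent_item) tuples.
    seed: optional value that makes the assignment reproducible.
    """
    rng = random.Random(seed)
    pre_test, post_test = [], []
    for first, second in item_pairs:
        # Coin flip: swap the two items of the pair half of the time.
        if rng.random() < 0.5:
            first, second = second, first
        pre_test.append(first)
        post_test.append(second)
    return pre_test, post_test


# Hypothetical item labels: "1a" and "1b" are two equivalent items, and so on.
pairs = [("1a", "1b"), ("2a", "2b"), ("3a", "3b")]
pre_form, post_form = assign_pairs(pairs, seed=2005)
print(pre_form)
print(post_form)
```

Randomizing within each pair means that any remaining difference in difficulty between paired items is not systematically loaded onto one test form.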
9 and 10. Collect test forms
- For any organizer on site
- Collect pre-test at course start and post-test at course end
- Have all participants answer
- Have all participants write their codes on both forms to match results by respondent
- Explain evaluation objectives and confidentiality
http://siteresources.worldbank.org/WBIINT/Resources/How-to-collect-pre-test.pdf
http://siteresources.worldbank.org/WBIINT/Resources/How-to-collect-post-test.pdf
11. Compute results
- For assistants
- Follow tabulation guidelines
- Enter responses in tabulation template
- Click macro for automatic item analysis
http://siteresources.worldbank.org/WBIINT/Resources/How-to-tabulate-responses.pdf
11A. Results example: a course learning gain and post-test score
- Learning gain: green if statistically significant; orange if not.
- Compare with other courses
- Pre- and post-test scores (matched respondents)
- Post-test scores (all respondents)
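The slides do not say which statistical test decides whether a learning gain is shown as significant. One common choice for matched pre- and post-test scores is a paired t-test, sketched below as an assumption rather than as the Toolkit's method. The function name and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(pre, post):
    """Return (t, degrees_of_freedom) for matched pre- and post-test scores.

    pre and post must list the same respondents in the same order.
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two matched respondents")
    sd = stdev(diffs)  # sample standard deviation of the individual gains
    if sd == 0:
        raise ValueError("all gains are identical; t is undefined")
    return mean(diffs) / (sd / sqrt(n)), n - 1


# Hypothetical percent-correct scores for five matched respondents.
pre_scores = [40, 55, 30, 60, 45]
post_scores = [70, 65, 50, 75, 60]
t, df = paired_t_statistic(pre_scores, post_scores)
print(round(t, 3), df)
```

The resulting t is then compared with the critical value for the chosen significance level and degrees of freedom, for example 2.776 for a two-sided 5% test with 4 degrees of freedom.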
11B. Results example: distribution of a course's pre- and post-test scores
- Post-test distribution to the right of pre-test: participants learned
11C. Results example: post-test responses by item
- Key
- Item text
- Responses
11D. Results example: a course item analysis
- Check item: confusing or misconception not overcome
- Check item: too hard or course did not teach this well
- Check pre-post equivalence
- Most items statistically OK
- Confused high scorers
- Compare reliability (WB only)
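To illustrate one piece of an item analysis, the sketch below computes the proportion of respondents who chose the keyed answer for each item and flags items that few respondents answered correctly. The statistics computed by the Toolkit macro are not described in these slides, so this is an assumption-laden example: the function names, the sample responses, and the 0.3 threshold are hypothetical.

```python
def item_difficulty(responses, key):
    """Proportion of respondents who chose the keyed option, per item.

    responses: one list of chosen options per respondent.
    key: the correct option for each item, in the same order.
    """
    n = len(responses)
    return [sum(r[i] == k for r in responses) / n for i, k in enumerate(key)]


def flag_hard_items(difficulty, threshold=0.3):
    """Indexes of items answered correctly by fewer than `threshold` of respondents."""
    return [i for i, p in enumerate(difficulty) if p < threshold]


# Hypothetical post-test data: options A-E, where E stands for "I don't know."
key = ["B", "D", "A"]
post_responses = [
    ["B", "D", "C"],
    ["B", "A", "E"],
    ["B", "D", "C"],
    ["C", "D", "A"],
]
difficulty = item_difficulty(post_responses, key)
print(difficulty)
print(flag_hard_items(difficulty))
```

An item flagged this way on the post-test is a candidate for the "too hard or course did not teach this well" check; the flag alone does not say which of the two explanations applies.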
12. Send results to WBIEG
- For assistants (WBI courses only)
- WBIEG will
- Check quality of processing
- Include test items and results in database
- Report evaluation efforts and results on request
http://web.worldbank.org/WBSITE/EXTERNAL/WBI/0,,contentMDK:20270039~pagePK:209023~piPK:335094~theSitePK:213799,00.html
13. Interpret results
- For course director and content experts
- Review and interpret results using
- Interpretation guidelines
- Results database (WB only)
- Toolkit glossary
- Decide how to improve next offering and next test
http://siteresources.worldbank.org/WBIINT/Resources/How-to-interpret-results.pdf
Thanks to
- Developers: Joy Behrens, Guangbin Liu, WBIEG staff
- Advisors: Marlaine Lockheed, William Eckert, Sukai Prom-Jackson, Zhengfang Shi, Gary Echternacht
- Main contact: Violaine Le Rouzic