Title: The Andes Intelligent Tutoring System: Five years of evaluations
1The Andes Intelligent Tutoring System: Five years of evaluations
- Kurt VanLehn
- Pittsburgh Science of Learning Center (PSLC)
- University of Pittsburgh
2The physics LearnLab course committee
- Andes development
- Anders Weinstein
- Brett van de Sande
- Kurt VanLehn (co-chair)
- U.S. Naval Academy
- Don Treacy (co-chair)
- Bob Shelby
- Mary Wintersgill
- Kay Schulze
- Experimenters
- Scotty Craig
- Sandy Katz
- Bob Hausmann
- Michael Ringenberg
- Meet weekly
- Thursdays, 3:30
3Funding
- The U.S. Office of Naval Research: Cognitive Science Program
- The U.S. National Science Foundation: Pittsburgh Science of Learning Center
4Research question
- Given
- Whole semester of instruction
- No change to content of course
- No change to lectures, labs, assignments
- Standard exams (not designed by experimenters)
- Can a homework helper increase learning?
5Prior work with answer-only tutoring systems
- Web-based homework grading systems
- E.g., Web-assign, CAPA, Mastering Physics
- Provide feedback and hints on the answer only
- Compared to ordinary paper-based homework
- Positive benefits
- When paper-based homework is collected and graded
- No benefits (Pascarella, 2002; Dufresne, Mestre & Rath, 2002)
- Interpretation
- Motivating students to do their homework provides benefits, but the answer-only tutoring system provides no additional benefits
6Prior work with tutoring systems that give feedback and hints on steps
- Lisp Tutor (Corbett, 2001) and many others
- Same homework problems and text
- Experimenters' exams only
- But not a whole semester (only 5 lessons)
- Pump curriculum and Pat tutor (Koedinger et al.)
- Whole year of high-school algebra
- Both experimenters' exams and standard exams
- But content confounded with tutoring system
- Earlier evaluations of Andes
- First half-semester only
- Experimenters' exams only
7Why does it matter?
- Ideally, an intelligent homework helper
- can increase learning without changing the course, and
- the increase is strong enough to show in final exam
- The diligent always do well; slackers always do poorly
- Cramming
- If not
- still useful if it facilitates content upgrades, and
- the upgrades cause robust increases in learning
8Outline
- Andes
- Evaluation
- Discussion
9What kind of physics?
- US university introductory physics courses
- US high school advanced physics courses
- A typical problem
A 2000 kg car at the top of a 20 degree inclined driveway 20 m long slips its parking brake and rolls down. If we ignore friction and drag, what is the magnitude of the velocity of the car when it hits the garage door?
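For readers who want to check the typical problem, here is a minimal sketch of one way to solve it, using conservation of energy. It is not taken from the slides. It assumes the car starts from rest, rolls the full 20 m of the driveway before reaching the garage door, and that g = 9.8 m/s^2. The function name `speed_at_bottom` is hypothetical. Note that the 2000 kg mass cancels out of the energy balance.

```python
import math

def speed_at_bottom(length_m, incline_deg, g=9.8):
    """Speed after rolling from rest down a frictionless incline.

    Energy conservation: m*g*h = (1/2)*m*v**2, with h = L*sin(theta),
    so v = sqrt(2*g*L*sin(theta)). The mass cancels.
    Assumed here: start from rest, no friction or drag, g = 9.8 m/s^2.
    """
    height = length_m * math.sin(math.radians(incline_deg))
    return math.sqrt(2 * g * height)

# Values from the problem statement: 20 m driveway, 20 degree incline.
print(round(speed_at_bottom(20, 20), 1))  # about 11.6 m/s
```

Under these assumptions the speed is about 11.6 m/s.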
10Andes user interface
Read a physics problem
Draw vectors
Type in equations
Type in answer
11Andes feedback and hints
What should I do next?
What's wrong with that?
Green means correct
Red means incorrect
Dialogue hints
12Major challenges
- Dealing with equations
- Giving red/green feedback
- Undoing algebraic combination
- For "What should I do next?"
- Analyzing errors in equations
- Scale-up
- 13 chapters, 500 textbook pages
- 350 problems
- 300 principles
13Outline
- Andes
- Evaluation
- Method
- Main results
- Which students benefited?
- Which knowledge benefited?
- Interpretation of results
- Discussion
14Evaluations of Andes at the US Naval Academy
- Fall semesters 2000, 2001, 2002 and 2003
- Only the homework modality was varied: Andes vs. paper-based
- Same textbook
- Similar lectures, labs, recitations
- Similar homework problems
- Same exams
- Students were motivated to do paper-based homework
- Either collected and graded
- Or 1 homework problem on each quiz
15Exams
- Midterm exam
- 1 hour, 4 problems
- Scored on derivation and answer
- Drawings (30%)
- Variable definitions (20%)
- Equations (40%)
- Answers (10%)
- Final exam
- 3 hours, 50 problems
- Multiple choice
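The midterm scoring rubric above can be read as a weighted sum. The following is only a sketch of that arithmetic, assuming the four numbers on the slide are percentage weights and that each component is graded as a fraction between 0 and 1. The names `WEIGHTS` and `midterm_problem_score` are hypothetical and do not come from the slides.

```python
# Rubric weights from the slide, read as percentages (an assumption).
WEIGHTS = {
    "drawings": 30,
    "variable_definitions": 20,
    "equations": 40,
    "answers": 10,
}

def midterm_problem_score(fractions):
    """Combine per-component fractions (0.0 to 1.0) into a 0-100 score."""
    return sum(WEIGHTS[name] * fractions[name] for name in WEIGHTS)

# Hypothetical student: correct drawings and variables, half the
# equations, wrong final answer.
example = {
    "drawings": 1.0,
    "variable_definitions": 1.0,
    "equations": 0.5,
    "answers": 0.0,
}
print(midterm_problem_score(example))
```

Under this reading, only 10% of a midterm problem's score depends on the final answer, whereas the multiple-choice final exam scores the answer alone.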
16Checking prior competence of Andes and control
students
- Grade-point averages equal
- Distribution of majors equal
- Engineering majors vs.
- Science majors vs.
- Other majors
17Midterm exam results (All differences reliable, p < .01)
How to calculate effect size?
18Calculating effect size over 4 different midterm exams
- Normalize each score
- z_score(student) = (raw_score(student) - mean(exam)) / standard_deviation(exam)
- For each condition, pool z-scores across years
- Effect size = 0.61
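The normalize-and-pool procedure above can be sketched in a few lines. This is an illustration under stated assumptions, not the analysis code used in the study: it assumes each exam's mean and standard deviation are taken over all students who sat that exam, uses the population standard deviation, and takes the effect size to be the difference between the two conditions' mean pooled z-scores. The slides do not spell out these details. The function names and the toy scores are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(raw_scores):
    """Normalize one exam's raw scores by that exam's mean and SD."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)  # population SD; an assumption
    return [(x - m) / sd for x in raw_scores]

def pooled_effect_size(exams):
    """exams: one (scores, conditions) pair per year.

    conditions[i] is "andes" or "control" for the student with scores[i].
    Z-scores are pooled per condition across years, and the effect size
    is taken as the difference of the pooled means (an assumption).
    """
    pooled = {"andes": [], "control": []}
    for scores, conditions in exams:
        for z, cond in zip(z_scores(scores), conditions):
            pooled[cond].append(z)
    return mean(pooled["andes"]) - mean(pooled["control"])

# Toy data for two years (made up, for illustration only).
toy_exams = [
    ([60, 70, 80, 90], ["control", "control", "andes", "andes"]),
    ([50, 60, 70, 80], ["control", "control", "andes", "andes"]),
]
print(pooled_effect_size(toy_exams))
```

Normalizing within each exam first is what allows scores from four different midterms to be pooled on a common scale.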
19Final exam
- Exam covers 100% of course, but Andes didn't
- Does now
- Use 2003 exam only; Andes covered 70%
- 89 Andes students
- 823 non-Andes students
20Prior competence not equal
- Majors not equally distributed
- Andes group had more engineering majors
- GPAs not equally distributed
- Andes group had marginally higher GPAs
- Factor out prior competence statistically
- For each major, regress final exam score on GPA
- Residual_score(student) = raw_score(student) - predicted_score(student's major, student's GPA)
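The residual-score adjustment above can be sketched as follows. This is a minimal illustration, assuming an ordinary least-squares line predicting final exam score from GPA, fitted separately for each major. The slides do not state the exact regression model. The helper names (`fit_line`, `residual_scores`) and the toy records are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def residual_scores(students):
    """students: dicts with 'major', 'gpa' and 'score' keys.

    Returns raw_score - predicted_score for each student, in input
    order, where the prediction comes from that student's major's line.
    """
    by_major = defaultdict(list)
    for s in students:
        by_major[s["major"]].append(s)
    lines = {
        major: fit_line([s["gpa"] for s in group],
                        [s["score"] for s in group])
        for major, group in by_major.items()
    }
    residuals = []
    for s in students:
        slope, intercept = lines[s["major"]]
        residuals.append(s["score"] - (slope * s["gpa"] + intercept))
    return residuals

# Toy records (made up, for illustration only).
toy_students = [
    {"major": "engineering", "gpa": 2.0, "score": 50},
    {"major": "engineering", "gpa": 3.0, "score": 60},
    {"major": "engineering", "gpa": 4.0, "score": 70},
    {"major": "science", "gpa": 2.0, "score": 40},
    {"major": "science", "gpa": 3.0, "score": 60},
    {"major": "science", "gpa": 4.0, "score": 50},
]
print(residual_scores(toy_students))
```

Comparing conditions on residuals rather than raw scores removes the part of the final exam score that major and GPA already predict.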
21Final exam results
Difference is reliable (p = 0.028)
Effect size = 0.25
22Outline
- Andes
- Evaluation
- Method
- Main results
- Which students benefited?
- Which knowledge benefited?
- Interpretation of results
- Discussion
23Benefits same regardless of GPA
24Benefits varied by major on final exam but not on
midterm exam
25Outline
- Andes
- Evaluation
- Method
- Main results
- Which students benefited?
- What knowledge benefited?
- Interpretation of results
- Discussion
26Effect sizes for subscores of midterm exam
27Interpretation of results
- Engineering and science majors learned the red path and prefer it
- Andes does not increase their final exam scores
- They use blue path on the midterm
- Andes increases their midterm exam scores
- Other majors do not have red path, so they use the blue path on both exams
- Andes increases both exams' scores
- On midterm exams, subscores measure components of blue path separately
- Biggest benefit for diagrams and variables
- Smaller on equations; none on answer
[Figure: path diagram with nodes Problem, Diagram and variables, Equations, Answer; annotations: Andes, Prior physics, Prior math and physics]
28Summary of results
- Main result: Andes provides benefits
- Midterm exam effect size: 0.61
- Final exam effect size: 0.25
- Andes helps students learn conceptual skills
- Effect sizes on conceptual subscores: 1.21 and 0.69
- Effect sizes on calculational subscores: 0.11 and -0.08
- Some students appear to have a non-conceptual method for solving problems
- Competes with the conceptual method taught by Andes
- They use it on the (answer-only) final exam
- This dilutes the benefit of Andes on final exam
29Outline
- Andes
- Evaluation
- Discussion
- Andes compared to others
- Why is Andes effective?
30Effect sizes on experimenters' and standard exams of 3 tutoring systems
31Interpretation of the comparison with other tutors
- Andes is about the same as other tutoring systems that give feedback and hints on steps
- Perhaps the Pump/Pat benefits are due solely to the tutoring system and not the content upgrade
32Summary: Studies of homework helpers when content is controlled
- Ordinary paper-based homework to motivated paper-based homework: large benefits
- Motivated paper-based homework to feedback and hints on answer only: no benefits
- Feedback and hints on answer only to feedback and hints on steps: large benefits
33Outline
- Andes
- Evaluation
- Discussion
- Andes compared to others
- Why are feedback and hints on steps so effective?
34Hypothesis: Andes increases the number of successful knowledge events
- Without feedback and hints on steps, students skip them
- Guess
- Copy a similar example's step and edit it
- Copy and edit a higher goal's outcome
- Doing a step correctly requires
- Figuring out how the first time (sense-making)
- Figuring out why the second and third times (refinement)
- Recalling why and how the other times (fluency building)
- This increases number of successful knowledge events
- Wherein a student constructs or applies a knowledge component
35Thanks for your attention!
- At www.andes.pitt.edu
- Download stand-alone version of Andes
- Try OLI version of Andes
- Download papers on Andes
- Sorry, but Andes only runs on Windows