Neil T. Heffernan: The Use of Intelligent Tutoring Systems as Research Platforms - PowerPoint PPT Presentation


Transcript and Presenter's Notes

Title: Neil T. Heffernan: The Use of Intelligent Tutoring Systems as Research Platforms


1
Neil T. Heffernan The Use of Intelligent
Tutoring Systems as Research Platforms.
  • Worcester Polytechnic Institute
  • Computer Science Department
  • Massachusetts

2
Outline
  • Who am I?
  • My dissertation work
  • Carnegie Learning Inc.
  • ASSISTments
  • I now spend a great deal of time working with
    teachers in their classroom to figure out the
    most effective ways of using data.
  • The Future
  • Databases and research schools
  • Computer assistance of learning what works.
  • I am always looking for collaborators (teachers,
    schools, researchers, psychometricians)
  • We give away our service to any teacher /
    researcher
  • We share our data anonymously with researchers

3
Why are algebra word problems difficult?
  • Ann is in a rowboat in a lake that is 2400 yards
    wide. She is 800 yards from the dock. She then
    rows for m minutes back towards the dock.
    Ann rows at a speed of 40 yards per minute.
    Write an expression for Ann's distance from the
    dock.
  • Symbolize: 800-40m
  • Articulate an expression (m=3): 800-40·3
  • Compute a value (m=3): 680
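The three steps above can be sketched in code; a minimal illustration (the function name is ours, not from the slides):

```python
# Minimal sketch of the three steps: symbolize, articulate, compute.
# The function name is illustrative, not from the presentation.

def distance_from_dock(m):
    """Symbolize: Ann starts 800 yards out and closes 40 yards per minute."""
    return 800 - 40 * m

# Articulate the expression at m = 3: 800 - 40*3
# Compute the value at m = 3:
print(distance_from_dock(3))  # 680
```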

Two Cognitive Science conference papers (1997,
1998) with Ken Koedinger
4
Over 20,000 Students Tutored
www.AlgebraTutor.org
5
Cognitive Tutor Carnegie Learning Inc.
6
300 Schools in 2000-01
7
Carnegie Learning Inc
  • I do not work for, nor did I create the Cognitive
    Tutoring Technology. My advisors did (Ken
    Koedinger).
  • Ken Koedinger's and my second project is called
    ASSISTments; it's a free web-based system
    that is more focused on assessment and less on
    being the primary delivery of instruction.
  • In Carnegie Learning Inc.'s successful model, on
    2 of the 5 days a week kids are on the computer.
  • ASSISTments' suggested use is once every two
    weeks.
  • ASSISTments is meant to be an add-on to existing
    curricula (textbooks) and not meant to be a
    curriculum itself.
  • We are more interested in using intelligent
    tutors as research platforms to produce sound
    research.

8
(No Transcript)
9
Blending Assessment and Instructional
Assisting: The Assistment Project
www.Assistment.org
This research was made possible by the US Dept
of Education, Institute of Education Science,
"Effective Mathematics Education Research"
program grant R305K03140, the Office of Naval
Research grant N00014-03-1-0221, NSF CAREER
award to Neil Heffernan, and the Spencer
Foundation. Authors Razzaq and Mercado were
funded by the National Science Foundation under
Grant No. 0231773. All the opinions in this
article are those of the authors, and not those
of any of the funders.
Leena RAZZAQ, Mingyu FENG, Goss NUZZO-JONES,
Neil T. HEFFERNAN, Kenneth KOEDINGER,
Brian JUNKER, Steven RITTER, Andrea KNIGHT,
Edwin MERCADO, Terrence E. TURNER, Ruta
UPALEKAR, Jason A. WALONOSKI, Michael A.
MACASEK, Christopher ANISZCZYK, Sanket CHOKSEY,
Tom LIVAK, Kai RASMUSSEN
Carnegie Learning
10
Research Questions
  • 1. Does the Assistment system work as an
    assessment tool?
  • Reliably predict MCAS performance?
  • Reliably advise teachers and students on what
    knowledge to focus on?
  • 2. Does the system help enhance student learning?
  • Does the system effectively teach as it assesses?
  • Can teachers use the advice of this system to
    lead to higher student learning?

11
Intro to the Assistment System
  • Massachusetts Comprehensive Assessment System
    MCAS
  • Required to graduate; 30% failed
  • Very challenging multi-step problems
  • 8th grade (13 year olds)
  • Click here -> New England Cable News

12
  • We break multi-step problems into scaffolding
    questions
  • Hint Messages: given on demand; they give hints
    about what step to do next - GREEN
  • Buggy Message: a context-sensitive feedback
    message
  • Knowledge Components: skills, strategies,
    concepts
  • The state reports to teachers on 5 areas
  • We seek to report on 100 knowledge components
  • Video of demo backup

13
The Assistment Builder A Rapid Development Tool
for ITS. By Terrence Turner, Michael Macasek, Goss
Nuzzo-Jones, Neil Heffernan, Ken Koedinger
Data Collection Results
What the teacher sees
What a student sees
Goal
It is reported [2] that creating one hour of
content for an Intelligent Tutoring System (ITS)
requires at least 200 hours of development time
by PhD-level AI programmers and cognitive
scientists. We wanted to create a tool that was
usable by non-programming school teachers and
that could reduce that time cost by a factor of
10. We report that we have certainly succeeded at
making a tool usable by non-programmers, and that
the cost, in time, is reduced by a factor of 5.
  • To analyze the effectiveness of the Builder, we
    developed a system to log and time stamp the
    actions of an author. Unfortunately, the logging
    was not in place until recently, capturing less
    than 1% of Builder usage, so caution is
    warranted.

This shows a student that first said 16, then got
the first scaffolding question correct with AC.
The student then clicked on ½8x and the
system spit out the bug message in red. The
student then asked for a hint twice in a row
shown in green
The original question
Uploaded image
The first scaffold
  • Above are results on the time it took 5 different
    authors to build a total of 14 different
    assistments. On average it took authors 25
    minutes per assistment. Given the fact that each
    assistment provides about 2 minutes of content,
    this suggests a ratio of 13:1, instead of the
    200:1 from the literature [2]. These results
    suggest a speed-up of over a factor of 15!
  • However, self-reports by our content creators on
    the amount of time it took to build and test an
    assistment were 90 minutes per assistment, which
    yields a ratio of 45:1 instead of the 13:1
    reported above. This suggests that the Builder
    allows content to be created 5 times faster than
    the literature suggests.
  • Both of these results ignore the time it takes to
    mark questions with skills. That process was
    done for the Assistment System as a whole:
    marking 250 items took 6 hours, an
    amortized cost of about 2 minutes per item.
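The ratios above follow from simple arithmetic; a sketch using the figures reported on this slide (variable names are ours):

```python
# Content-creation cost ratios from the figures reported above.
minutes_of_content_per_assistment = 2
logged_build_time = 25          # minutes per assistment, from the logging data
self_reported_time = 90         # minutes per assistment, from self-reports
literature_hours_ratio = 200    # 200 hours per 1 hour of content [2]

logged_ratio = logged_build_time / minutes_of_content_per_assistment
self_report_ratio = self_reported_time / minutes_of_content_per_assistment

print(logged_ratio)                           # 12.5, reported as roughly 13:1
print(self_report_ratio)                      # 45.0, i.e. the 45:1 ratio
print(literature_hours_ratio / logged_ratio)  # 16.0, the "over a factor of 15" speed-up
```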

The author clicked here to open up this window to
write the hint messages as shown above in green.
The second scaffold
Context
The three hint messages for the second scaffold.
After Building Assistments .
Teachers can select items to put into
experiments, and then assign them to classes.
Below shows a real experiment [4] that
investigated whether students would learn better
if asked to set up proportions. The author built
two different Assistments that differed only by
one extra scaffold. The author made a second
morphed version of each by changing the cover
story. Finally, the author selected two items
to posttest for far transfer.
  • The Assistment Builder is a web-based
    tutor-creation tool that has been used to create
    about 8 hours of instruction, which has been
    shown to lead to statistically significant
    learning (see [4]) with over 1,000 students.
  • This tool allows Assistments to be created that
    are behaviorally equivalent to rule-based
    cognitive tutors [1] but are not general. While
    our underlying representation [3] allows both
    rule-based and state-based tutors, the Builder
    creates a tutor for a single item.
  • Anecdotally, we observed a high school teacher
    create three items, each in under 30 minutes, but
    that was with the questions already prepared and
    stored in a text file. Two high-school teachers
    have together made over 50 assistments.

The bottom out hint.
The third scaffold
Conclusion
This dialog shows the author has tagged the third
scaffold with three models of different grain
sizes.
  • We have built a system in which non-programmers
    can easily build content in 1/5 the amount of
    time it has taken with other ITS authoring tools
    [1, 2]. The quality of the content has also been
    shown to lead to statistically significant
    learning [4].
  • The system will be freely available via the
    Assistment Project at www.assistment.org in
    September of 2005. Any teacher will be allowed to
    make their own content and/or use our existing
    content, and get live class reports.
  • Primary Contact: Neil Heffernan, nth_at_wpi.edu
  • Lead funding from an NSF CAREER award to Prof.
    Heffernan; US Dept of Education; support from the
    Spencer Foundation, Office of Naval Research, and
    the US Army.

Authors can then schedule when to be notified!
By tagging items with skills, teachers can 1) get
reports on which skills students are doing poorly
on, and 2) track them over time as discussed in
[6].
  • References
  • [1] Koedinger, K. R., Aleven, V., Heffernan, T.,
    McLaren, B., & Hockenberry, M. (2004). Opening the
    Door to Non-Programmers: Authoring Intelligent
    Tutor Behavior by Demonstration. Proceedings of
    the 7th Annual Intelligent Tutoring Systems
    Conference, Maceio, Brazil. pp. 162-173.
  • [2] Murray, T. (1999). Authoring intelligent
    tutoring systems: An analysis of the state of the
    art. International Journal of Artificial
    Intelligence in Education, 10, pp. 98-129.
  • [3] Nuzzo-Jones, G., Walonoski, J.A., Heffernan,
    N.T., & Livak, T. (2005). The eXtensible Tutor
    Architecture: A New Foundation for ITS. Poster in
    the 12th Annual Conference on Artificial
    Intelligence in Education 2005, Amsterdam.
  • [4] Razzaq, L., Feng, M., Nuzzo-Jones, G.,
    Heffernan, N.T., et al. (2005). The Assistment
    Project: Blending Assessment and Assisting. 12th
    Annual Conference on Artificial Intelligence in
    Education 2005.
  • [5] Turner, T.E., Macasek, M.A., Nuzzo-Jones, G.,
    Heffernan, N.T., & Koedinger, K. (2005). The
    Assistment Builder: A Rapid Development Tool for
    ITS. Poster in the 12th Annual Conference on
    Artificial Intelligence in Education 2005,
    Amsterdam.
  • [6] Feng, M., & Heffernan, N.T. (2005). Informing
    Teachers Live about Student Learning: Reporting
    in the Assistment System. 12th Annual Conference
    on Artificial Intelligence in Education 2005
    Workshop on Usage Analysis in Learning Systems,
    Amsterdam.

The system automatically does the analysis. This
shows the SetupRatio condition to have better
learning within the condition as well as better
learning on the posttest/transfer items (reported
in [4]).
The fourth and last Scaffold
14
The Web Based Architecture and Builder Web-App
Video <- Click Here
  • Web-based server-side system to build and
    deploy content that anyone can use.
  • We built 8 hours of content (240 items) in 375
    hours with no programming needed.
  • Murray [2] reports that ITSs take a minimum of
    200 hours of work to produce 1 hour of ITS
    content, typically requiring PhD-level
    programming and cognitive science expertise.
  • 5 times faster than the literature would suggest.

Turner, T.E., Macasek, M.A., Nuzzo-Jones, G.,
Heffernan, N.T., Koedinger, K. (2005). The
Assistment Builder: A Rapid Development Tool for
ITS. Poster in the 12th Annual Conference on
Artificial Intelligence in Education 2005,
Amsterdam
15
Research Questions
  • 1. Does the Assistment system work as an
    assessment tool?
  • Reliably predict MCAS performance?
  • Reliably advise teachers and students on what
    knowledge to focus on?
  • 2. Does the system help enhance student learning?
  • Does the system effectively teach as it assesses?
  • Can teachers use the advice of this system to
    lead to higher student learning?
  • We can predict MCAS scores well (Feng, Heffernan
    & Koedinger, 2006)
  • We can predict better if we look at how much
    assistance students needed. (A. Brown called this
    dynamic assessment.)
  • We predict MCAS scores better if we track 98
    skills.
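The idea behind dynamic assessment can be illustrated with a toy regression: add assistance features (e.g., hints requested) alongside percent correct, and see whether the fit improves. Everything below, data and feature names included, is a hypothetical sketch, not the project's actual model:

```python
import numpy as np

# Hedged sketch: predict a state-test score from percent correct alone versus
# percent correct plus an "assistance" feature (hints requested).
# The toy data and coefficients are ours, not the project's.
rng = np.random.default_rng(0)
n = 200
percent_correct = rng.uniform(0.2, 1.0, n)
hints = rng.poisson(3, n)
score = 40 * percent_correct - 1.5 * hints + rng.normal(0, 2, n)

def fit_r2(X, y):
    """Ordinary least squares with an intercept; return R^2 on the fit."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

r2_static = fit_r2(percent_correct[:, None], score)
r2_dynamic = fit_r2(np.column_stack([percent_correct, hints]), score)
print(r2_static < r2_dynamic)  # assistance features improve the fit on this toy data
```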

16
Do students learn from Assistments?
  • To determine if students are learning, we gave
    them problems testing the same concept within a
    given class period and compared their results.
  • Problems were presented randomly
  • - Example concept pairs
  • Supplementary Angles and Transversals of Parallel
    Lines,
  • Perimeter and Area,
  • Pythagorean Theorem,
  • Approximating Square Roots.

17
N = 595 students. Average Gain = 1.5
In the figure shown, lines l and m are parallel,
and triangle ABC is isosceles. What is the
measure of angle ACB?
In the figure above, lines CD and EF are
parallel. What is the measure of angle BHF?
18
N = 390 students. Average Gain = 4.3
The squares in the figure above are congruent.
The perimeter of the entire figure is 24 units.
What is the area of one small square?
In the figure, the perimeter of the equilateral
triangle is 24 inches. What is the area of the
square?
19
N = 329 students. Average Gain = 3
Manuel will use an old 16-foot fence for the
longest side and an old 8-foot fence for another
side as shown. What is the best estimate of the
amount of fencing he will need to the nearest
whole number for the third side?
What is the area of the smaller square?
20
N = 424 students. Average Gain = 7
Which is the best approximation of √72?
The square root of 55 is between which two whole
numbers?
21
That is, do students learn from Assistments?
  • 742 students
  • Example content gains:
  • Supplementary Angles and Transversals of Parallel
    Lines, 1.5
  • Perimeter and Area, 4
  • Pythagorean Theorem, 3
  • Approximating Square Roots, 7
  • 29 other pairs
  • Student item analysis reveals statistically
    reliable gains of 2 higher on 2nd opportunity at
    a cluster of skills
  • Student Level Analysis: p = 0.02
  • Item Level Analysis that weighted items
    proportionally: p = .04
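The within-class learning analysis above can be sketched as a paired comparison on simulated data (the numbers below are invented, not the study's):

```python
import numpy as np

# Hedged sketch of the learning analysis: each student answers two items on
# the same skill; the gain is the second-opportunity score minus the first.
# The data here is simulated, not the study's.
rng = np.random.default_rng(1)
n = 595
first = rng.binomial(1, 0.60, n).astype(float)
second = rng.binomial(1, 0.62, n).astype(float)  # slightly higher on 2nd try

diff = second - first
gain = diff.mean() * 100  # average gain, in percentage points

# Paired t statistic on the per-student differences.
t = diff.mean() / (diff.std(ddof=1) / np.sqrt(n))
print(f"gain = {gain:.1f} points, t = {t:.2f}")
```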

22
Experiments comparing different teaching
techniques
  • How the Assistments System can be used to run
    randomized controlled experiments.
  • Math Research Question:
  • Should students be scaffolded in their
    proportional reasoning problems by being taught
    to set up a proportion/ratio?
  • Two Conditions (SetUpRatio Condition)
  • Two items to test for transfer/posttest
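A minimal sketch of randomized assignment to the two conditions, assuming balanced random allocation (student IDs and condition labels below are placeholders):

```python
import random

# Hedged sketch of balanced random assignment to two experimental conditions.
# Student IDs and the allocation scheme are illustrative, not the system's.
random.seed(42)
students = [f"student_{i}" for i in range(8)]

random.shuffle(students)
half = len(students) // 2
assignment = {s: ("SetUpRatio" if i < half else "Control")
              for i, s in enumerate(students)}

for s, c in sorted(assignment.items()):
    print(s, "->", c)
```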

23
(No Transcript)
24
The Two Morphs
25
Transfer/Posttest Items
26
Tools to make an experiment
Built by Ema Holban and Shane Gibbons
27
Tools to make an experiment
Built by Ema Holban and Shane Gibbons
28
Tools to make an experiment
Built by Ema Holban and Shane Gibbons
29
Tools to make an experiment
Percent correct
Built by Ema Holban and Shane Gibbons
30
(No Transcript)
31
(No Transcript)
32
The Current Status
  • 4,000 students in towns in and around Worcester
    and Pittsburgh.
  • http://nth1.wpi.edu/zpardos/day-stats.html

33
(No Transcript)
34
(No Transcript)
35
(No Transcript)
36
What types of experiments? Both where the
conditions are online and offline.
  • Which is best for coached problem solving?
    Traditional classroom vs. computer-aided
    instruction (CAI) vs. intelligent tutoring
    systems (Mendicino, Heffernan & Razzaq, in
    submission)
  • Do students learn more from scaffolding questions
    compared to a traditional CAI approach? (Razzaq &
    Heffernan, 2006)
  • How about for students with less knowledge?
    (Razzaq, Heffernan & Lindeman, 2007)
  • The value of worked examples.

37
  • Jon Star at Harvard, Bethany Rittle-Johnson at
    Vanderbilt, and I are looking at whether middle
    school math students learn more if taught
    multiple solution strategies. These were
    classroom experiments, but the ASSISTment system
    is used for the testing (kids get feedback).

38
(No Transcript)
39
(No Transcript)
40
(No Transcript)
41
(No Transcript)
42
Databases: Class Summaries
43
Databases: Common Errors
44
Databases: What problems are hardest?
45
Databases: What skills are hardest?
46
Databases: What skills are slowest to learn?
47
Databases: What instructional interventions are
most effective in causing student learning?
48
Databases: Far Transfer?
49
(No Transcript)
50
Let teachers build this stuff rather than a
tiny group of smart researchers at CMU
51
What is ASSISTments?
  • Using Formative Assessment to drive instruction.
  • But the testing takes away valuable classroom
    time.
  • Who funded it?
  • What is it about?
  • What results do we have?
  • We can predict state test scores well
  • We can do so better if we track 98 different
    skills in 8th grade math
  • We assess better if we use how much assistance a
    student needed.
  • Kids are learning during the testing.

52
Wiki-ASSISTment
  • Suppose you

53
  •  
  •  
  • First International Conference on Educational
    Data Mining
  •   Data Mining and Statistics in Service of
    Education
  • Call for papers (preliminary)
  • http://www.EducationalDataMining.org
  • June 20-21, 2008
  • Co-located with International Conference on
    Intelligent Tutoring Systems (ITS 2008)
  •  
  • The First International Conference on Educational
    Data Mining brings together researchers from
    computer science, education, psychology,
    psychometrics, and statistics to analyze large
    data sets to answer educational research
    questions.  The increase in instrumented
    educational software, as well as state databases
    of student test scores, has created large
    repositories of data reflecting how students
    learn.  The EDM conference focuses on
    computational approaches for using those data to
    address important educational questions.  The
    broad collection of research disciplines ensures
    cross fertilization of ideas, with the central
    questions of educational research serving as a
    unifying focus.  This Conference emerges from
    preceding EDM workshops at the AAAI, AIED, ICALT,
    ITS, and UM conferences.
  •  
  • Topics of Interest
  • We welcome papers describing original work.
    Areas of interest include but are not limited to:
  • Improving educational software.  Many large
    educational data sets are generated by computer
    software.  Can we use our discoveries to improve
    the software's effectiveness?
  • Domain representation.  How do learners represent
    the domain?  Does this representation shift as a
    result of instruction?  Do different
    subpopulations represent the domain differently?
  • Evaluating teaching interventions.  Student
    learning data provides a powerful mechanism for
    determining which teaching actions are
    successful.  How can we best use such data?

54
(No Transcript)
55
(No Transcript)
56
(No Transcript)
57
(No Transcript)
58
Assessing Students' Performance Longitudinally:
Item Difficulty Parameter vs. Skill Learning
Tracking
  • Mingyu Feng, Zach Pardos, Neil T. Heffernan
  • Worcester Polytechnic Institute

59
Some of the ASSISTment Team
This research was made possible by the US Dept
of Education, Institute of Education Science,
"Effective Mathematics Education Research"
program grant R305K03140, the Office of Naval
Research grant N00014-03-1-0221, NSF CAREER
award to Neil Heffernan, and the Spencer
Foundation. Authors Razzaq and Mercado were
funded by the National Science Foundation under
Grant No. 0231773. All the opinions in this
article are those of the authors, and not those
of any of the funders.
Leena RAZZAQ, Mingyu FENG, Goss NUZZO-JONES,
Neil T. HEFFERNAN, Kenneth KOEDINGER,
Brian JUNKER, Steven RITTER, Meghan MYERS,
Elizabeth Ayers, T. TURNER
R. UPALEKAR, J. WALONOSKI
Z. PARDOS, Michael A. MACASEK,
Christopher ANISZCZYK, Sanket CHOKSEY, Tom LIVAK,
Kai RASMUSSEN
Carnegie Learning
60
The ASSISTment System
  • A web-based tutoring system that assists students
    in learning mathematics and gives teachers an
    assessment of their students' progress

61
An ASSISTment
Geometry
  • We break multi-step problems into scaffolding
    questions
  • Hint Messages: given on demand; they give hints
    about what step to do next
  • Buggy Message: a context-sensitive feedback
    message
  • (Feng, Heffernan Koedinger, 2006a)
  • Skills
  • The state reports to teachers on 5 areas
  • We seek to report on more and finer grain-sized
    skills

(Demo/movie)
The original question
a. Congruence
b. Perimeter
c. Equation-Solving
The 1st scaffolding question
Congruence
The 2nd scaffolding question
Perimeter
A buggy message
A hint message
62
The ASSISTment Project: What Level of Tutor
Interaction is Best? By Leena Razzaq, Neil
Heffernan & Robert Lindeman
Collaborators
Sponsors
Goal
Experiment Design
To determine the best level of tutor interaction
to help students learn the mathematics required
for a state exam based on their math proficiency.
Experiment Screen Shots
  • 3 levels of interaction:
  • Scaffolding + hints represents the most
    interactive experience: students must answer
    scaffolding questions, i.e. learning by doing.
  • Hints on demand are less interactive because
    students do not have to respond to hints, but
    they can get the same information as in the
    scaffolding questions by requesting hints.
  • Delayed feedback is the least interactive
    condition because students must wait until the
    end of the assignment to get any feedback.
  • 2 levels of math proficiency:
  • Students in Honors math classes.
  • Students in Regular math classes.

Background on ASSISTments
  • The Assistment System is a web-based assessment
    system that tutors students on math problems. The
    system is freely available at www.assistment.org
  • As of March 2007, 1000s of Worcester middle
    school students use ASSISTments every two weeks
    as part of their math class.
  • Teachers use the fine-grained reporting that the
    system provides to inform their instruction.

Students in this condition interact with the
tutor by answering scaffolding questions.
Students in this condition can get hints when
they ask for them by pressing the hint button.
Students in this condition get no feedback until
the end of the assignment when they get answers
and solutions.
Analysis and Conclusions
  • 566 8th grade students participated.
  • Results showed a significant interaction between
    condition and math proficiency (p < 0.05), a good
    case for tailoring tutor interaction to types of
    students.

Scaff. Q. 1
The Interaction Hypothesis
Hint 1
Students see the solution after they finish all
of the problems.
When one-on-one tutoring, either by a human tutor
or a computer tutor, is compared to a less
interactive control condition that covers the
same content, students will learn more in
the interactive condition than in the control
condition.
Scaff. Q. 2
Hint 2
Scaff. Q. 3
  • Is this hypothesis true?
  • We found evidence to support this hypothesis in
    some cases, not in others.
  • Based on the results of Razzaq & Heffernan
    (2006), we believe the difficulty of the material
    influences how effective interactive tutoring
    will be.

Hint 3
Hint 4
Hint 5
  • Regular students learned more with scaffolding +
    hints (p < 0.05): less-proficient students
    benefit from more interaction and coaching
    through each step to solve a problem.
  • Honors students learned more with delayed
    feedback (p = 0.075): more-proficient students
    benefit from seeing problems worked out and
    getting the big picture.
  • Delayed feedback performed better than hints on
    demand (p = .048) for both more- and
    less-proficient students: students don't do as
    well when we depend on student initiative.

Scaff. Q. 4
Our Hypothesis
Hint 6
  • More interactive intelligent tutoring will lead
    to more learning (based on post-test gains) than
    less interactive tutoring.
  • Differences in learning will be more significant
    for students who are less-proficient than
    students who are more-proficient.

Hint 7
Hints on Scaff. Q.
This work has been accepted for publication at
the 2007 Artificial Intelligence in Education
Conference in Los Angeles.
63
CAREER: Learning about Learning: Using
Intelligent Tutoring Systems as a Research
Platform to Investigate Human Learning. Free
researcher, teacher and student accounts for
7th-10th grade math preparation at
www.assistment.org
What the State MCAS test provides
What a student sees
What the teacher who builds the tutoring sees.
This shows a student that first guessed 16 (real
answer is 24), then got the first scaffolding
question correct with AC. The student then
clicked on ½8x and the system spit out the
bug message in red. The student, twice in a
row, asked for a hint shown in the green box.
The original question
Uploaded image
Teacher Reports
The first scaffold
The second scaffold
The author wrote this hint message, shown in the
green box, by typing it in here.
The three hint messages for the second scaffold.
Recent Results - 2006
Teachers get reports per student, per skill, and
per item.
This project has 5 main research thrusts. 1) For
the designing cognitive models thrust, we report
that we can do a better job of modeling students
by using finer-grained models (i.e., that track
more knowledge components) of students than with
coarser-grained models (Pardos et al., 2006; Feng
et al., 2006). 2) For the research thrust of
inferring what students know and are learning, we
can report two new results. First, we can do a
better job of assessing students (as measured by
predicting state test scores) by seeing how much
tutoring they need to solve a question (Feng et
al., 2006a). Secondly, we have shown that we can
do a better job of modeling students' learning
over time by building models that allow us to
model different rates of learning for different
skills (Feng et al., 2006a). 3) For the optimizing
learning thrust, we have new empirical results
showing that students learn more with the type
of tutoring we provide compared to a
traditional Computer-Aided Instruction (CAI)
control (Razzaq & Heffernan, 2006). 4) For the
thrust of informing educators, we have some
recent publications on the types of feedback we
give educators (Feng & Heffernan, 2005; 2006).
Additionally, we have work that shows we can
track student motivation and then inform
educators in novel manners that increase student
motivation (Walonoski & Heffernan, 2006a;
2006b). 5) Finally, for the thrust of allowing
user adaptation, we have shown that the authoring
tools we have built can be used by teachers to
quickly create content for their classes
(Heffernan, Turner et al., 2006). References
are at www.assistment.org
The bottom out hint.
The third scaffold
This dialog shows the author has tagged the third
scaffold with three models of different grain
sizes.
By tagging items with skills, teachers can 1) get
reports on which skills students are doing poorly
on, and 2) track them over time.
The fourth and last Scaffold
64
Scaling up a Server-Based Web Tutor. By Jozsef
Patvarczki & Neil Heffernan
Results
Introduction
Assistment Features
Our research team has built a web-based tutor,
located at www.ASSISTment.org [1], that is used
by hundreds of students a day in Worcester and
surrounding towns. The system's focus is to teach
8th and 10th grade mathematics and MCAS
preparation. Because it is easily accessible, it
helps lower the entry barrier for teachers and
enables both teachers and researchers to collect
data and generate reports. Scaling up a
server-based intelligent tutoring system requires
developers to care about speed and reliability.
We will present how the Assistment system can
improve performance and reliability with a
fault-tolerant, scalable architecture.
  • Since each public school class has about 20
    students, we noticed clusters (shown in ovals in
    the bottom left) of intervals where a single
    class was logged on.
  • The log-on procedure is the most expensive step
    in the process, and this data shows that this
    might be a good place for us to improve.
  • We noticed a second cluster of around 40 users,
    which most likely represents instances where two
    classes of students were using the system
    simultaneously.
  • There was no appreciable pattern towards a slower
    page creation time with more users.
  • Three simulated scenarios with a 10s random delay
    between student actions
  • In the first scenario we used 50 threads
    simulating 50 students working without a
    load-balancer, with one application server and
    one database
  • The second scenario added a load-balancer and two
    application servers

Users begin interacting with our system through
the Portal, which manages all activities
This problem uses a pseudo-tutor (state-based
implementation) with pre-made scaffolding and
hint questions selected based upon student input.
Incorrect responses are in red, and hints are in
green.
Example of a State-based Pseudo Tutor
  • Horizontally scaled configuration
  • Scalable
  • Fault-tolerant
  • Dynamically configurable

Architecture
Clients' actions represent the system's load
HTTP server as Load Balancer
System Scalability and Reliability
  • Two concerns when running the Intelligent Tutor
    on a central server are:
  • 1) building a scalable server architecture
  • 2) providing reliable service to researchers,
    teachers, and students.
  • We will answer several research questions:
  • 1) can we reduce the cost of authoring an ITS?
  • 2) how can we improve performance and reliability
    with a better server architecture?
  • In order to serve thousands of users, we must
    achieve high reliability and scalability at
    different levels.
  • Scalability at our first entry point: through the
    use of a virtual IP for www.assistment.org,
    provided by the CARP protocol.
  • Random and round-robin redirection algorithms can
    provide very effective load-sharing, and the
    load-balancer distributes load over multiple
    application servers.
  • This will allow us to redirect incoming web
    requests and build a web portal application in a
    multiple-server environment.
  • The monitoring system, which uses Selenium,
    allows us to send text messages to our
    administrators when the system goes down.
  • Multiple database servers with automatic
    synchronization, pooling, and fail-over
    detection.
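The round-robin redirection the load-balancer performs can be sketched in a few lines (server hostnames below are placeholders, not the project's actual machines):

```python
import itertools

# Toy sketch of round-robin load balancing as described above.
# Hostnames are placeholders, not the project's real servers.
app_servers = ["app1.example.org", "app2.example.org", "app3.example.org"]
round_robin = itertools.cycle(app_servers)

def redirect(request_id):
    """Pick the next application server for an incoming web request."""
    return next(round_robin)

targets = [redirect(i) for i in range(6)]
print(targets)  # each server receives every third request
```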

Additional application servers for load balancing
GRID computing Bayesian Network Application
Workflow Editor and Manager
Visualization and Resource Information System
WPI P-GRADE GRID Portal: http://pgrade.wpi.edu
  • Reference
  • Razzaq, L., Feng, M., Nuzzo-Jones, G., Heffernan,
    N.T., et al. (2005). The Assistment Project:
    Blending Assessment and Assisting. 12th Annual
    Conference on Artificial Intelligence in
    Education 2005, Amsterdam

Contact Neil Heffernan, nth_at_wpi.edu
65
How Were the Skill Models Created?
66
(No Transcript)
67
  • Fine grained skill models in reporting
  • Teachers get reports that they think are credible
    and useful. (Feng & Heffernan, 2005, 2006, 2007)

68
DataBases Logging Every Student Action
69
(No Transcript)
70
(No Transcript)
71
(No Transcript)
72
Research Question
  • In the ASSISTment project, which approach works
    better for assessing students' performance
    longitudinally?
  • Skill learning tracking?
  • Or using an item difficulty parameter?
    (unidimensional)

73
Data Source
  • 497 students at two middle schools
  • Students used the ASSISTment system every other
    week from Sep. 2004 to May 2005
  • Real state test scores in May 2005
  • Item-level online data
  • students' binary responses (1/0) to items that
    are tagged in different skill models
  • Some statistics:
  • Average usage: 7.3 days
  • Average questions answered: 250
  • 138,000 data points

74
Data Source
75
Item Difficulty Parameter
  • Fit a one-parameter logistic (1PL) IRT model
    (Rasch model) on our online data
  • The dependent variable: probability of a correct
    response for student i to item n
  • The independent variables: the person's trait
    score and the item's difficulty level.

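Under the Rasch model, the probability of a correct response depends only on the difference between the student's trait score and the item's difficulty. A minimal sketch of that formula (function and variable names are illustrative, not from the project's code):

```python
import math

def rasch_p_correct(theta: float, b: float) -> float:
    """Rasch (1PL) probability that a student with trait score
    theta answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# A student whose ability equals the item's difficulty has a
# 50% chance of a correct response.
print(rasch_p_correct(0.7, 0.7))   # 0.5
# A stronger student on the same item scores higher:
print(rasch_p_correct(1.7, 0.7))
```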
76
Longitudinal Modeling
  • Mixed-effects Logistic Regression Models
  • Models we fitted
  • Model-beta: time -> item response
  • Model-WPI5: time + skills in WPI5 -> item
    response
  • Model-WPI78: time + skills in WPI78 -> item
    response
  • Evaluation
  • The accuracy of the predicted MCAS test score
    was used to evaluate different approaches

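The models above can be read as logistic growth curves: each student's log-odds of a correct response rise over time, shifted by fixed effects for the skills an item taps. A simplified sketch of that model form (the coefficients below are hypothetical illustrations, not the fitted values):

```python
import math

def p_correct(student_intercept, student_slope, t, skill_effects):
    """Simplified mixed-effects logistic form: a per-student
    intercept and time slope (the random effects) plus fixed
    effects for the skills tagged to the item."""
    logit = student_intercept + student_slope * t + sum(skill_effects)
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical student whose log-odds grow by 0.1 per week,
# answering items tagged with one hard skill (effect -0.5):
for week in (0, 10, 20):
    print(round(p_correct(-0.2, 0.1, week, [-0.5]), 3))
# probabilities increase with time in the system
```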
Singer & Willett (2003). Applied Longitudinal
Data Analysis. Oxford University Press: New York.
Hedeker & Gibbons (in preparation).
Longitudinal Data Analysis.
77
How do we predict the MCAS?
  • The parameters learned from the online data are
    used to predict which questions a student will
    get correct or incorrect on the MCAS, based on
    the skill of the item, the time of the question,
    and that student's parameters for that skill.
  • If 3 items with skill A are on the test, and the
    predicted probability of success on that skill is
    .25, then we predict .75 MCAS points from those
    three questions.

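The expected-score arithmetic in the bullet above is just a sum of per-item success probabilities. A minimal sketch:

```python
def expected_mcas_points(item_probs):
    """Expected number of MCAS points: the sum of the predicted
    probability of a correct response for each test item."""
    return sum(item_probs)

# Three items tagged with skill A, each predicted at 0.25:
print(expected_mcas_points([0.25, 0.25, 0.25]))  # 0.75
```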
78
Results
P-values of both paired t-tests are below 0.05.
79
Conclusion
  • We have found evidence that skill learning
    tracking can predict MCAS scores better than
    simply using an item difficulty parameter, and
    that fine-grained models did even better than
    coarse-grained models.
  • Our skill mapping is good (maybe not optimal).
  • We are considering using these skill models to
    select the next best problem to present to a
    student.
  • Although we used the Rasch model to train the
    item difficulty parameter, we were not modeling
    students' responses with IRT. One interesting
    direction would be comparing our results to
    predictions made through an item response
    modeling approach.

80
Modeling Student Knowledge Using Bayesian
Networks to Predict Student Performance
By Zach Pardos; Advisor: Neil Heffernan,
Computer Science. Joint work with Brigham
Anderson and Cristina Heffernan
Goal
Predicting student responses within the
ASSISTment tutoring system
Student Test Score Prediction Process
To evaluate the predictive performance of various
fine-grained student skill models in the
ASSISTment tutoring system using Bayesian
networks.
Bayesian Belief Network
Result: The finer-grained the model, the better
the prediction accuracy. The finest-grained model,
the WPI-106, performed the best, with an average of
only 5.5% error in predicting student answers
within the system.
  • Skill probabilities are inferred from a
    student's responses to questions on the system

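Inferring a skill probability from a response is a Bayes update over the guess/slip observation model described on this poster (10% guess, 5% slip for tutor items). A minimal sketch, not the project's actual network code:

```python
def update_skill(p_know, correct, guess=0.10, slip=0.05):
    """Posterior P(skill known) after observing one response,
    given guess and slip probabilities (Bayes' rule)."""
    if correct:
        num = p_know * (1 - slip)              # knew it, no slip
        den = num + (1 - p_know) * guess       # ... or guessed
    else:
        num = p_know * slip                    # knew it, slipped
        den = num + (1 - p_know) * (1 - guess)
    return num / den

# Starting from a 50/50 prior, one correct answer is strong
# evidence that the skill is known:
print(round(update_skill(0.5, True), 3))   # 0.905
```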
The Skill Models
  • The skill models were created for use in the
    online tutoring system called ASSISTment, founded
    at WPI. They consist of skill names and
    associations (or tagging) of those skill names
    with math questions on the system. Models with 1,
    5, 39 and 106 skills were evaluated to represent
    varying degrees of concept generality. The skill
    models' ability to predict students' performance
    on the system, as well as on a standardized state
    test, was evaluated.
  • The four skill models used
  • WPI-106: 106 skill names were drafted and tagged
    to items in the tutoring system and to the
    questions on the state test by our subject-matter
    expert, Cristina.
  • WPI-5 and WPI-39: 5 and 39 skill names drafted
    by the Massachusetts Department of Education.
  • WPI-1: Represents unidimensional assessment.

Predicting student state test scores
  • Arrows represent associations of skills with
    question items. They also represent conditional
    dependence in the Bayesian Belief Network.
  • Probability of Guess is set to 10% (tutor
    questions are fill in the blank)
  • Probability of getting the item wrong even if the
    student knows it is set to 5%

Result: The finest-grained model, the WPI-106,
came in 2nd to the WPI-39, which may have
performed better than the 106 because 50% of its
skills are sampled on the MCAS Test vs. only 25%
of the WPI-106's.
2. Inferred skill probabilities from above are
used to predict the probability the student will
answer each test question correctly
Bayesian Networks
  • A Bayesian Network is a probabilistic machine
    learning method. It is well suited for making
    predictions about unobserved variables by
    incorporating prior probabilities with new
    evidence.

Background on ASSISTment
Conclusions
  • ASSISTment is a web-based assessment system for
    8th-10th grade math that tutors students on items
    they get wrong. There are 1,443 items in the
    system.
  • The system is freely available at
    www.assistment.org
  • Question responses from 600 students using the
    system during the 2004-2005 school year were
    used.
  • Each student completed around 260 items.
  • The ASSISTment fine-grained skill models excel at
    assessment of student skills (see Ming Feng's
    poster for a mixed-effects approach comparison)
  • Accurate prediction means teachers can know when
    their students have attained certain competencies.
  • Probabilities are summed to generate total test
    score.
  • Probability of Guess is set to 25% (MCAS
    questions are multiple choice)
  • Probability of getting the item wrong even if the
    student knows it is set to 5%

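Combining the guess/slip parameters with an inferred skill probability gives the predicted chance of a correct answer on each MCAS item, and summing those per-item probabilities gives the predicted total score. A minimal sketch using the stated test parameters (guess = 25%, slip = 5%):

```python
def p_correct_on_test(p_know, guess=0.25, slip=0.05):
    """Marginal probability of a correct answer on a test item:
    the student either knows the skill and doesn't slip, or
    doesn't know it and guesses."""
    return p_know * (1 - slip) + (1 - p_know) * guess

def predicted_test_score(skill_probs):
    """Predicted total test score: sum of per-item
    correct-answer probabilities."""
    return sum(p_correct_on_test(p) for p in skill_probs)

# A student certain to know one item's skill, certain not to
# know another's:
print(round(p_correct_on_test(1.0), 2))  # 0.95
print(round(p_correct_on_test(0.0), 2))  # 0.25
```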
This work has been accepted for publication at
the 2007 User Modeling Conference in Corfu,
Greece.
81
Tracking skill learning longitudinally
82
(No Transcript)
83
(No Transcript)
84
(No Transcript)
85
Future Work
  • Apply modern longitudinal data analysis to our
    data set.
  • Pivot table report
  • Further future work is a randomized controlled
    experiment assigning teachers to use, or not use,
    the system

86
Conclusions
  • The Assistment system seems to do a reasonable
    job of assessing and assisting.
  • We are releasing the system for beta testing
    today, with an official release in September for
    the state of Massachusetts.
  • www.Assistment.org

Nuzzo-Jones, G., Walonoski, J.A., Heffernan,
N.T., Livak, T. (2005). The eXtensible Tutor
Architecture: A New Foundation for ITS. Poster in
the 12th Annual Conference on Artificial
Intelligence in Education 2005, Amsterdam
87
Reporting to Teachers & Researchers