Neil T. Heffernan: The Use of Intelligent Tutoring Systems as Research Platforms

1
Neil T. Heffernan The Use of Intelligent
Tutoring Systems as Research Platforms.

Worcester Polytechnic Institute
Computer Science Department
Massachusetts

2
Outline

Who am I?
My dissertation work
Carnegie Learning Inc.
ASSISTments
I now spend a great deal of time working with
teachers in their classroom to figure out the
most effective ways of using data.
The Future
Databases and research schools
Computer assistance of learning what works.
I am always looking for collaborators (teachers,
schools, researchers, psychometricians)
We give away our service to any teacher /
researcher
We share our data anonymously with researchers

3
Why are algebra word problems difficult?

Ann is in a rowboat in a lake that is 2400 yards
wide. She is 800 yards from the dock. She then
rows for m(3) minutes back towards the dock.
Ann rows at a speed of 40 yards per minute.
Write an expression for Ann's distance from the
dock.
Symbolize: 800 - 40m
Articulate an expression (m = 3): 800 - 40*3
Compute a value (m = 3): 680
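
The three task types on this slide can be illustrated with a minimal Python sketch. The function name `distance_from_dock` and its parameter names are assumptions made for illustration; they do not come from the original slides.

```python
def distance_from_dock(minutes, start_yards=800, speed_yards_per_min=40):
    """Symbolize: Ann's distance from the dock after rowing for `minutes` minutes."""
    return start_yards - speed_yards_per_min * minutes


# Compute a value for m = 3, i.e. evaluate 800 - 40*3.
value_at_three = distance_from_dock(3)
print(value_at_three)
```

Writing the general function corresponds to the "symbolize" task, while calling it with a concrete number corresponds to the "compute a value" task.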

Two Cognitive Science conference papers (1997,
1998) with Ken Koedinger
4
Over 20,000 Students Tutored
www.AlgebraTutor.org
5
Cognitive Tutor Carnegie Learning Inc.
6
300 Schools in 2000-01
7
Carnegie Learning Inc

I do not work for Carnegie Learning Inc., nor did
I create the Cognitive Tutor technology. My
advisors (e.g., Ken Koedinger) did.
Ken Koedinger's and my 2nd project is called
ASSISTments; it is a free web-based system
that is more focused on assessment and less on
being the primary delivery of instruction.
Carnegie Learning Inc.'s suggested use: kids are
on the computer 2 of the 5 days a week.
ASSISTments' suggested use is once every two
weeks.
ASSISTments is meant to be an add-on to an existing
curriculum (textbooks) and not meant to be a
curriculum itself.
We are more interested in using intelligent
tutors as research platforms to produce sound
research.

8
(No Transcript)
9
Blending Assessment and Instructional
Assisting: The Assistment Project
www.Assistment.org
This research was made possible by the US Dept
of Education, Institute of Education Science,
"Effective Mathematics Education Research"
program grant R305K03140, the Office of Naval
Research grant N00014-03-1-0221, NSF CAREER
award to Neil Heffernan, and the Spencer
Foundation. Authors Razzaq and Mercado were
funded by the National Science Foundation under
Grant No. 0231773. All the opinions in this
article are those of the authors, and not those
of any of the funders.
Leena RAZZAQ, Mingyu FENG, Goss NUZZO-JONES,
Neil T. HEFFERNAN, Kenneth KOEDINGER,
Brian JUNKER, Steven RITTER, Andrea KNIGHT,
Edwin MERCADO, Terrence E. TURNER, Ruta
UPALEKAR, Jason A. WALONOSKI, Michael A.
MACASEK, Christopher ANISZCZYK, Sanket CHOKSEY,
Tom LIVAK, Kai RASMUSSEN
Carnegie Learning
10
Research Questions

1. Does the Assistment system work as an assessment
tool?
Reliably predict MCAS performance?
Reliably advise teachers and students on what
knowledge to focus on?
2. Does the system help enhance student learning?
Does the system effectively teach as it assesses?
Can teachers use the advice of this system to
lead to higher student learning?

11
Intro to the Assistment System

Massachusetts Comprehensive Assessment System
MCAS
Required to graduate- 30 failed
Very challenging multi-step problems
8th grade (13 year olds)
Click here -> New England Cable News

We break multi-step problems into scaffolding
questions
Hint Messages: given on demand, they give hints
about what step to do next (shown in GREEN)
Buggy Message: a context-sensitive feedback
message
Knowledge Components: skills, strategies,
concepts
The state reports to teachers on 5 areas
We seek to report on 100 knowledge components
Video of demo backup

13
The Assistment Builder: A Rapid Development Tool
for ITS. Terrence Turner, Michael Macasek, Goss
Nuzzo-Jones, Neil Heffernan, Ken Koedinger
Data Collection Results
What the teacher sees
What a student sees
Goal
It is reported [2] that creating one hour of
content for an Intelligent Tutoring System (ITS)
requires at least 200 hours of development time
by PhD-level AI programmers and cognitive
scientists. We wanted to create a tool that was
usable by non-programming school teachers and
that could reduce that time cost by a factor of
10. We report that we have certainly succeeded at
making a tool usable by non-programmers, and that
the cost, in time, is reduced by a factor of 5.

To analyze the effectiveness of the Builder, we
developed a system to log and time stamp the
actions of an author. Unfortunately, the logging
was not in place until recently, so it captured
less than 1% of Builder usage, and caution is
warranted.

This shows a student who first said 16, then got
the first scaffolding question correct with AC.
The student then clicked on ½8x and the
system spit out the bug message in red. The
student then asked for a hint twice in a row
(shown in green).
The original question
Uploaded image
The first scaffold

Above are results on the time it took 5 different
authors to build a total of 14 different
assistments. On average it took authors 25
minutes per assistment. Given that each
assistment provides about 2 minutes of content,
this suggests a ratio of 13:1, instead of the
200:1 from the literature [2]. These results
suggest a speed-up of over a factor of 15!
However, self-reports by our content creators put
the time to build and test an assistment at 90
minutes, which yields a ratio of 45:1 instead of
the 13:1 reported above. This suggests that the
Builder allows content to be created about 5 times
faster than the literature suggests.
Both of these results ignore the time it takes to
mark questions with skills. That process was
done for the Assistment System as a whole:
marking 250 items took 6 hours, resulting in an
amortized cost of about 2 minutes per item.
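
The ratio arithmetic in this passage can be checked with a short Python sketch. The function names are hypothetical and the inputs are the figures quoted on the poster; the poster rounds the unrounded results (12.5:1 is reported as roughly 13:1, and a speed-up of about 4.4 as roughly 5).

```python
# Hours of development per hour of content, as quoted from the literature.
LITERATURE_RATIO = 200.0


def authoring_ratio(build_minutes, content_minutes):
    """Minutes of authoring time spent per minute of tutoring content."""
    return build_minutes / content_minutes


def speedup(ratio, baseline=LITERATURE_RATIO):
    """How many times faster than the literature baseline a given ratio is."""
    return baseline / ratio


logged_ratio = authoring_ratio(25, 2)          # from the logged build times
self_reported_ratio = authoring_ratio(90, 2)   # from author self-reports
tagging_minutes_per_item = 6 * 60 / 250        # skill tagging, amortized

print(logged_ratio, speedup(logged_ratio))
print(self_reported_ratio, speedup(self_reported_ratio))
print(tagging_minutes_per_item)
```

The gap between the two ratios reflects what each measure counts: the logs capture only time spent in the Builder, while the self-reports also include testing.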

The author clicked here to open up this window to
write the hint messages as shown above in green.
The second scaffold
Context
The three hint messages for the second scaffold.
After Building Assistments
Teachers can select items to put into
experiments, and then assign them to classes.
Below is a real experiment [4] that
investigated whether students would learn better
if asked to set up proportions. The author built
two different Assistments that differed only by
one extra scaffold. The author made a second
morphed version of each by changing the cover
story. Finally, the author selected two items
to posttest for far transfer.

The Assistment Builder is a web-based
tutor-creation tool that has been used to create
about 8 hours of instruction, which has been
shown to lead to statistically significant
learning (see [4]) with over 1,000 students.
This tool allows Assistments to be created that
are behaviorally equivalent to rule-based
cognitive tutors [1] but are not general. While
our underlying representation [3] allows both
rule-based and state-based tutors, the Builder
creates a tutor for a single item.
Anecdotally, we observed a high school teacher
create three items, each in under 30 minutes, but
that was with the questions already prepared and
stored in a text file. Two high-school teachers
have together made over 50 assistments.

The bottom out hint.
The third scaffold
Conclusion
This dialog shows the author has tagged the third
scaffold with three skill models of different
grain sizes.

We have built a system in which non-programmers
can easily build content in 1/5 the amount of
time it has taken with other ITS authoring tools
[1,2]. The quality of the content has also been
shown to lead to statistically significant
learning [4].
The system will be freely available via the
Assistment Project at www.assistment.org in
September of 2005. Any teacher will be allowed to
make their own content and/or use our existing
content, and get live class reports.
Primary Contact: Neil Heffernan, nth_at_wpi.edu
Lead funding from an NSF CAREER award to Prof.
Heffernan and the US Dept. of Education; support
from the Spencer Foundation, Office of Naval
Research, and the US Army.

Authors can then schedule when to be notified!
By tagging items with skills, teachers can 1) get
reports on which skills students are doing poorly
on, and 2) track them over time as discussed in
[6].

References
[1] Koedinger, K. R., Aleven, V., Heffernan, N. T.,
McLaren, B., & Hockenberry, M. (2004). Opening the
Door to Non-Programmers: Authoring Intelligent
Tutor Behavior by Demonstration. Proceedings of
the 7th Annual Intelligent Tutoring Systems
Conference, Maceio, Brazil, pp. 162-173.
[2] Murray, T. (1999). Authoring intelligent
tutoring systems: An analysis of the state of the
art. International Journal of Artificial
Intelligence in Education, 10, pp. 98-129.
[3] Nuzzo-Jones, G., Walonoski, J.A., Heffernan,
N.T., & Livak, T. (2005). The eXtensible Tutor
Architecture: A New Foundation for ITS. Poster in
the 12th Annual Conference on Artificial
Intelligence in Education 2005, Amsterdam.
[4] Razzaq, L., Feng, M., Nuzzo-Jones, G.,
Heffernan, N.T., et al. (2005). The Assistment
Project: Blending Assessment and Assisting. 12th
Annual Conference on Artificial Intelligence in
Education 2005.
[5] Turner, T.E., Macasek, M.A., Nuzzo-Jones, G.,
Heffernan, N.T., & Koedinger, K. (2005). The
Assistment Builder: A Rapid Development Tool for
ITS. Poster in the 12th Annual Conference on
Artificial Intelligence in Education 2005,
Amsterdam.
[6] Feng, M., & Heffernan, N.T. (2005). Informing
Teachers Live about Student Learning: Reporting
in the Assistment System. 12th Annual Conference
on Artificial Intelligence in Education 2005,
Workshop on Usage Analysis in Learning Systems,
Amsterdam.

The system automatically does the analysis. This
shows the SetupRatio condition to have better
learning within the condition as well as better
learning on the posttest/transfer items (reported
in [4]).
The fourth and last Scaffold
14
The Web Based Architecture and Builder Web-App
Video <- Click Here

A web-based, server-side system to build and
deploy content that anyone can use.
We built 8 hours of content (240 items) in 375
hours with no programming needed.
Murray reports that it takes a minimum of 200
hours of work to produce 1 hour of ITS content.
This typically requires PhD-level programming and
cognitive science expertise.
5 times faster than the literature would suggest.

Turner, T.E., Macasek, M.A., Nuzzo-Jones, G.,
Heffernan, N.T., & Koedinger, K. (2005). The
Assistment Builder: A Rapid Development Tool for
ITS. Poster in the 12th Annual Conference on
Artificial Intelligence in Education 2005,
Amsterdam.
15
Research Questions

1. Does the Assistment system work as an assessment
tool?
Reliably predict MCAS performance?
Reliably advise teachers and students on what
knowledge to focus on?
2. Does the system help enhance student learning?
Does the system effectively teach as it assesses?
Can teachers use the advice of this system to
lead to higher student learning?

We can predict MCAS scores well (Feng, Heffernan
& Koedinger, 2006).
We can predict better if we look at how much
assistance they needed. (A. Brown called this
dynamic assessment.)
We predict MCAS scores better if we track 98
skills.

16
Do students learn from Assistments?

To determine if students are learning, we gave
them problems testing the same concept within a
given class period and compared their results.
Problems were presented randomly
- Example concept pairs
Supplementary Angles and Transversals of Parallel
Lines,
Perimeter and Area,
Pythagorean Theorem,
Approximating Square Roots.

17
N = 595 students; Average Gain: 1.5%
In the figure shown, lines l and m are parallel,
and triangle ABC is isosceles. What is the
measure of angle ACB?
In the figure above, lines CD and EF are
parallel. What is the measure of angle BHF?
18
N = 390 students; Average Gain: 4.3%
The squares in the figure above are congruent.
The perimeter of the entire figure is 24 units.
What is the area of one small square?
In the figure, the perimeter of the equilateral
triangle is 24 inches. What is the area of the
square?
19
N = 329 students; Average Gain: 3%
Manuel will use an old 16-foot fence for the
longest side and an old 8-foot fence for another
side as shown. What is the best estimate of the
amount of fencing he will need to the nearest
whole number for the third side?
What is the area of the smaller square?
20
N = 424 students; Average Gain: 7%
Which is the best approximation of √72?
The square root of 55 is between which two whole
numbers?
21
That is, do students learn from Assistments?

742 students
Example content gains
Supplementary Angles and Transversals of Parallel
Lines, 1.5%
Perimeter and Area, 4%
Pythagorean Theorem, 3%
Approximating Square Roots, 7%
29 other pairs
Student item analysis reveals statistically
reliable gains of 2% higher on the 2nd opportunity
at a cluster of skills
Student Level Analysis: p = 0.02
Item Level Analysis that weighted items
proportionally: p = .04

22
Experiments comparing different teaching
techniques

How the Assistments System can be used to run
randomized controlled experiments.
Math Research Question
Should students be scaffolded in their
proportional reasoning problems by being taught
to set up a proportion/ratio?
Two Conditions (SetUpRatio Condition)
Two items to test for transfer/posttest

23
(No Transcript)
24
The Two Morphs
25
Transfer/Posttest Items
26
Tools to make an experiment
Built by Ema Holban and Shane Gibbons
27
Tools to make an experiment
Built by Ema Holban and Shane Gibbons
28
Tools to make an experiment
Built by Ema Holban and Shane Gibbons
29
Tools to make an experiment
Percent correct
Built by Ema Holban and Shane Gibbons
30
(No Transcript)
31
(No Transcript)
32
The Current Status

4,000 students in towns around and in Worcester
and Pittsburgh.
http://nth1.wpi.edu/zpardos/day-stats.html

33
(No Transcript)
34
(No Transcript)
35
(No Transcript)
36
What types of experiments? Both where the
conditions are online or offline.

Which is best for coached problem solving?
Traditional classroom, computer aided instruction
(CAI) versus intelligent tutoring systems
(Mendocino, Heffernan & Razzaq, in submission)
Do students learn more from scaffolding questions
compared to a traditional CAI approach? (Razzaq &
Heffernan, 2006)
How about for students with less knowledge?
(Razzaq, Heffernan & Lindeman, 2007)
The value of worked examples.

Jon Star at Harvard, Bethany Rittle-Johnson at
Vanderbilt, and I are investigating whether middle
school math students learn more if taught
multiple solution strategies. These were
classroom experiments, but the ASSISTment system
is used for the testing (kids get feedback).

38
(No Transcript)
39
(No Transcript)
40
(No Transcript)
41
(No Transcript)
42
Databases: Class Summaries
43
Databases: Common Errors
44
Databases: What problems are hardest?
45
Databases: What skills are hardest?
46
Databases: What skills are slowest to learn?
47
Databases: What instructional interventions are
most effective in causing student learning?
48
Databases: Far Transfer?
49
(No Transcript)
50
Lets teachers build this stuff rather than a
tiny group of smart researchers at CMU
51
What is ASSISTments?

Using Formative Assessment to drive instruction.
But the testing takes away valuable classroom
time.
Who funded it?
What is it about?
What results do we have?
We can predict state test scores well
We can do so better if we track 98 different
skills in 8th grade math
We assess better if we use how much assistance
the student needed.
Kids are learning during the testing.

52
Wiki-ASSISTment

Suppose you

First International Conference on Educational
Data Mining
Data Mining and Statistics in Service of
Education
Call for papers (preliminary)
http://www.EducationalDataMining.org
June 20-21, 2008
Co-located with International Conference on
Intelligent Tutoring Systems (ITS 2008)
The First International Conference on Educational
Data Mining brings together researchers from
computer science, education, psychology,
psychometrics, and statistics to analyze large
data sets to answer educational research
questions. The increase in instrumented
educational software, as well as state databases
of student test scores, has created large
repositories of data reflecting how students
learn. The EDM conference focuses on
computational approaches for using those data to
address important educational questions. The
broad collection of research disciplines ensures
cross fertilization of ideas, with the central
questions of educational research serving as a
unifying focus. This Conference emerges from
preceding EDM workshops at the AAAI, AIED, ICALT,
ITS, and UM conferences.
Topics of Interest
We welcome papers describing original work.
Areas of interest include but are not limited to:
Improving educational software. Many large
educational data sets are generated by computer
software. Can we use our discoveries to improve
the software's effectiveness?
Domain representation. How do learners represent
the domain? Does this representation shift as a
result of instruction? Do different
subpopulations represent the domain differently?
Evaluating teaching interventions. Student
learning data provides a powerful mechanism for
determining which teaching actions are
successful. How can we best use such data?

54
(No Transcript)
55
(No Transcript)
56
(No Transcript)
57
(No Transcript)
58
Assessing Students' Performance Longitudinally:
Item Difficulty Parameter vs. Skill Learning
Tracking

Mingyu Feng, Zach Pardos, Neil T. Heffernan
Worcester Polytechnic Institute

59
Some of the ASSISTMENT TEAM
Leena RAZZAQ, Mingyu FENG, Goss NUZZO-JONES,
Neil T. HEFFERNAN, Kenneth KOEDINGER,
Brian JUNKER, Steven RITTER, Meghan MYERS,
Elizabeth Ayers, T. TURNER
R. UPALEKAR, J. WALONOSKI
Z. PARDOS Michael A. MACASEK,
Christopher ANISZCZYK, Sanket CHOKSEY, Tom LIVAK,
Kai RASMUSSEN
Carnegie Learning
60
The ASSISTment System

A web-based tutoring system that assists students
in learning mathematics and gives teachers
assessment of their students' progress

61
An ASSISTment
Geometry

We break multi-step problems into scaffolding
questions
Hint Messages: given on demand, they give hints
about what step to do next
Buggy Message: a context-sensitive feedback
message
(Feng, Heffernan & Koedinger, 2006a)
Skills
The state reports to teachers on 5 areas
We seek to report on more and finer grain-sized
skills

(Demo/movie)
The original question
a. Congruence
b. Perimeter
c. Equation-Solving
The 1st scaffolding question
Congruence
The 2nd scaffolding question
Perimeter
A buggy message
A hint message
62
The ASSISTment Project: What Level of Tutor
Interaction is Best? By Leena Razzaq, Neil
Heffernan & Robert Lindeman
Collaborators
Sponsors
Goal
Experiment Design
To determine the best level of tutor interaction
to help students learn the mathematics required
for a state exam based on their math proficiency.
Experiment Screen Shots

3 levels of interaction
Scaffolding + hints represents the most
interactive experience: students must answer
scaffolding questions, i.e., learning by doing.
Hints on demand are less interactive because
students do not have to respond to hints, but
they can get the same information as in the
scaffolding questions by requesting hints.
Delayed feedback is the least interactive
condition because students must wait until the
end of the assignment to get any feedback.
2 levels of math proficiency
Students in Honors math classes.
Students in Regular math classes.

Background on ASSISTments

The Assistment System is a web-based assessment
system that tutors students on math problems. The
system is freely available at www.assistment.org
As of March 2007, 1000s of Worcester middle
school students use ASSISTments every two weeks
as part of their math class.
Teachers use the fine-grained reporting that the
system provides to inform their instruction.

Students in this condition interact with the
tutor by answering scaffolding questions.
Students in this condition can get hints when
they ask for them by pressing the hint button.
Students in this condition get no feedback until
the end of the assignment when they get answers
and solutions.
Analysis and Conclusions

566 8th grade students participated.
Results showed a significant interaction between
condition and math proficiency (p < 0.05), a good
case for tailoring tutor interaction to types of
students.

Scaff. Q. 1
The Interaction Hypothesis
Hint 1
Students see the solution after they finish all
of the problems.
When one-on-one tutoring, either by a human tutor
or a computer tutor, is compared to a less
interactive control condition that covers the
same content, then students will learn more in
the interactive condition than the control
condition.
Scaff. Q. 2
Hint 2
Scaff. Q. 3

Is this hypothesis true?
We found evidence to support this hypothesis in
some cases, not in others.
Based on the results of Razzaq & Heffernan
(2006), we believe the difficulty of the material
influences how effective interactive tutoring
will be.

Hint 3
Hint 4
Hint 5

Regular students learned more with scaffolding +
hints (p < 0.05): less-proficient students
benefit from more interaction and coaching
through each step to solve a problem.
Honors students learned more with delayed
feedback (p = 0.075): more-proficient students
benefit from seeing problems worked out and
getting the big picture.
Delayed feedback performed better than hints on
demand (p = .048) for both more- and
less-proficient students: students don't do as
well when we depend on student initiative.

Scaff. Q. 4
Our Hypothesis
Hint 6

More interactive intelligent tutoring will lead
to more learning (based on post-test gains) than
less interactive tutoring.
Differences in learning will be more significant
for students who are less-proficient than
students who are more-proficient.

Hint 7
Hints on Scaff. Q.
This work has been accepted for publication at
the 2007 Artificial Intelligence in Education
Conference in Los Angeles.
63
CAREER: Learning about Learning: Using
Intelligent Tutoring Systems as a Research
Platform to Investigate Human Learning. Free
researcher, teacher and student accounts for
7th-10th grade math preparation at
www.assistment.org
What the State MCAS test provides
What a student sees
What the teacher who builds the tutoring sees.
This shows a student who first guessed 16 (the real
answer is 24), then got the first scaffolding
question correct with AC. The student then
clicked on ½8x and the system spit out the
bug message in red. The student, twice in a
row, asked for a hint shown in the green box.
The original question
Uploaded image
Teacher Reports
The first scaffold
The second scaffold
The author wrote this hint message, shown in the
green box, by typing it in here.
The three hint messages for the second scaffold.
Recent Results - 2006
Teachers get reports per student, per skill, and
per item.
This project has 5 main research thrusts.
1) For the designing cognitive models thrust, we
report that we can do a better job of modeling
students by using finer-grained models (i.e.,
models that track more knowledge components) than
by using coarser-grained models (Pardos et al.,
2006; Feng et al., 2006).
2) For the research thrust of inferring what
students know and are learning, we can report two
new results. First, we can do a better job of
assessing students (as measured by predicting
state test scores) by seeing how much tutoring
they need to solve a question (Feng et al.,
2006a). Second, we have shown that we can do a
better job of modeling student learning over time
by building models that allow us to model
different rates of learning for different skills
(Feng et al., 2006a).
3) For the optimizing learning thrust, we have new
empirical results showing that students learn
more with the type of tutoring we provide than
with a traditional Computer-Aided Instruction
(CAI) control (Razzaq & Heffernan, 2006).
4) For the thrust of informing educators, we have
some recent publications on the types of feedback
we give educators (Feng & Heffernan, 2005, 2006).
Additionally, we have work showing that we can
track student motivation and then inform
educators in novel manners that increase student
motivation (Walonoski & Heffernan, 2006a,
2006b).
5) Finally, for the thrust of allowing user
adaptation, we have shown that the authoring
tools we have built can be used by teachers to
quickly create content for their classes
(Heffernan, Turner et al., 2006).
References are at www.assistment.org
The bottom out hint.
The third scaffold
This dialog shows the author has tagged the third
scaffold with three skill models of different
grain sizes.
By tagging items with skills, teachers can 1) get
reports on which skills students are doing poorly
on, and 2) track them over time.
The fourth and last Scaffold
64
Scaling up a Server-Based Web Tutor. Jozsef
Patvarczki & Neil Heffernan
Results
Introduction
Assistment Features
Our research team has built a web-based tutor,
located at www.ASSISTment.org [1], that is used
by hundreds of students a day in Worcester and
surrounding towns. The system's focus is to teach
8th and 10th grade mathematics and MCAS
preparation. Because it is easily accessible, it
helps lower the entry barrier for teachers and
enables both teachers and researchers to collect
data and generate reports. Scaling up a
server-based intelligent tutoring system requires
developers to care about speed and reliability.
We will present how the Assistment system can
improve performance and reliability with a
fault-tolerant scalable architecture.

Since each public school class has about 20
students, we noticed clusters (shown in ovals in
the bottom left) of intervals where a single
class was logged on.
The log-on procedure is the most expensive step
in the process, and this data shows that this
might be a good place for us to improve.
We noticed a second cluster of around 40 users,
which most likely represents instances where two
classes of students were using the system
simultaneously.
There was no appreciable pattern towards a slower
page creation time with more users.
Three simulated scenarios with a 10 s random delay
between student actions
In the first scenario we used 50 threads
simulating 50 students working without a
load-balancer, with one application server and one
database
Second scenario with a load-balancer and two
application servers

Users begin interacting with our system through
the Portal that manages all activities
This problem uses a pseudo-tutor (state-based
implementation) with pre-made scaffolding and
hint questions selected based upon student input.
Incorrect responses are in red, and hints are in
green.
Example of a State-based Pseudo Tutor

Horizontal scaled configuration
Scalable
Fault-tolerant
Dynamically configurable

Architecture
Clients' actions represent the system's load
HTTP server as Load Balancer
System Scalability and Reliability

Two concerns when running the Intelligent Tutor
on a central server are
1) building a scalable server architecture
2) providing reliable service to researchers,
teachers, and students.
We will answer several research questions
1) can we reduce the cost of authoring ITS?
2) how can we improve performance and reliability
with a better server architecture?
In order to serve thousands of users, we must
achieve high reliability and scalability at
different levels.
Scalability at our first entry point comes through
the use of a virtual IP for www.assistment.org,
provided by the CARP protocol.
Random and round-robin redirection algorithms can
provide very effective load-sharing, and the
load-balancer distributes load over multiple
application servers.
This will allow us to redirect incoming web
requests and build a web portal application in a
multiple-server environment.
The monitoring system, which uses Selenium, has
allowed us to send text messages to our
administrators when the system goes down.
Multiple database servers with automatic
synchronization, pooling, and fail-over
detection.
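
The two redirection strategies named above, round-robin and random, can be sketched in Python as follows. This is only an illustration of the general technique, not the Assistment deployment itself: the class names and the server names `app1` and `app2` are hypothetical, and a real load balancer would run in the HTTP server layer rather than in application code.

```python
import itertools
import random


class RoundRobinBalancer:
    """Hands out application servers in a fixed rotating order."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class RandomBalancer:
    """Hands out application servers uniformly at random."""

    def __init__(self, servers, seed=None):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._rng = random.Random(seed)

    def pick(self):
        return self._rng.choice(self._servers)


# Hypothetical server names, matching the two-application-server scenario.
servers = ["app1", "app2"]
round_robin = RoundRobinBalancer(servers)
assignments = [round_robin.pick() for _ in range(4)]
print(assignments)
```

Round-robin gives an exactly even split over time, while random selection needs no shared counter and evens out only on average.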

Additional application servers for load balancing
GRID computing Bayesian Network Application
Workflow Editor and Manager
Visualization and Resource Information System
WPI P-GRADE GRID Portal http://pgrade.wpi.edu

Reference
Razzaq, L, Feng, M., Nuzzo-Jones, G., Heffernan,
N.T. et. al (2005). The Assistment Project
Blending Assessment and Assisting. 12th Annual
Conference on Artificial Intelligence in
Education 2005, Amsterdam

Contact Neil Heffernan, nth_at_wpi.edu
65
How Were the Skill Models Created?
66
(No Transcript)
67

Fine grained skill models in reporting
Teachers get reports that they think are credible
and useful. (Feng & Heffernan, 2005, 2006, 2007)

68
Databases: Logging Every Student Action
69
(No Transcript)
70
(No Transcript)
71
(No Transcript)
72
Research Question

In the ASSISTment project, which approach works
better for assessing students' performance
longitudinally?
Skill learning tracking?
Or using item difficulty parameter?
(unidimensional)

73
Data Source

497 students of two middle schools
Students used the ASSISTment system every other
week from Sep. 2004 to May 2005
Real state test score in May 2005
Item level online data
Students' binary responses (1/0) to items that are
tagged in different skill models

Some statistics
Average usage: 7.3 days
Average questions answered: 250
138,000 data points

74
Data Source
75
Item Difficulty Parameter

Fit a one-parameter logistic (1PL) IRT model (Rasch
model) on our online data
The dependent variable: the probability of a
correct response for student i to item n
The independent variables: the person's trait
score and the item's difficulty level.
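
The slide's own symbols did not survive in this transcript. As a minimal sketch, assuming the standard Rasch parameterization in which the probability of a correct response is the logistic function of the person's trait score minus the item's difficulty, the model can be written in Python as below. The names `theta` and `b` follow common IRT notation and are assumptions, not the slide's notation.

```python
import math


def rasch_probability(theta, b):
    """Probability of a correct response under the 1PL (Rasch) model.

    theta: the student's trait (ability) score, assumed notation.
    b:     the item's difficulty level, assumed notation.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# A student whose ability equals the item difficulty has even odds.
even_odds = rasch_probability(0.0, 0.0)
print(even_odds)
```

Under this form, raising the student's ability or lowering the item's difficulty by the same amount changes the predicted probability identically, which is what makes the model one-parameter per item.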

76
Longitudinal Modeling

Mixed-effects Logistic Regression Models
Models we fitted
Model-beta: time, beta -> item response
Model-WPI5: time, skills in WPI5 -> item
response
Model-WPI78: time, skills in WPI78 -> item
response
Evaluation
The accuracy of the predicted MCAS test score
was used to evaluate different approaches

Singer & Willett (2003). Applied Longitudinal
Data Analysis. Oxford University Press, New York.
Hedeker & Gibbons (in preparation).
Longitudinal Data Analysis.
77
How do we predict the MCAS

The parameters learned from the online data are
used to predict which questions students will get
correct or incorrect on the MCAS, based upon the
skill of the item, the time of the question, and
the n parameters per kid for that skill.
If 3 items with skill A are on the test, and the
prediction is .25 for that skill, then we predict
.75 MCAS points from those three questions.
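
The worked example above amounts to summing the predicted probability of a correct answer over the test items. A minimal Python sketch follows, with the function name and data layout being illustrative assumptions rather than the project's actual code.

```python
def predicted_mcas_points(item_skills, p_correct_by_skill):
    """Expected test points: the sum of per-item probabilities of a correct answer.

    item_skills:        the skill tagged on each test item, one entry per item.
    p_correct_by_skill: the student's predicted probability correct per skill.
    """
    return sum(p_correct_by_skill[skill] for skill in item_skills)


# Three test items tagged with skill A, each predicted at 0.25.
points = predicted_mcas_points(["A", "A", "A"], {"A": 0.25})
print(points)
```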

78
Results
P-values of both Paired t-tests are below 0.05
79
Conclusion

We have found evidence that skill learning
tracking can better predict MCAS scores than
simply using an item difficulty parameter, and
that fine-grained models did even better than
the coarse-grained model.
Our skill mapping is good (maybe not optimal)
We are considering using these skill models in
selecting the next best problem to present a
student with.
Although we used the Rasch model to train the item
difficulty parameter, we were not modeling
students' responses with IRT. One interesting
piece of future work will be comparing our results
to predictions made through an item response
modeling approach.

80
Modeling Student Knowledge Using Bayesian
Networks to Predict Student Performance. By Zach
Pardos. Neil Heffernan, Advisor, Computer
Science. Joint work with Brigham Anderson and
Cristina Heffernan
Sponsors
Collaborators
Goal
To evaluate the predictive performance of various
fine-grained student skill models in the
ASSISTment tutoring system using Bayesian
networks.
Student Test Score Prediction Process
Predicting student responses within the
ASSISTment tutoring system
Bayesian Belief Network
Result: The finer-grained the model, the better the
prediction accuracy. The finest-grained WPI-106
performed the best with an average of only 5.5%
error in prediction of student answers within the
system.

Skill probabilities are inferred from a
student's responses to questions on the system

The Skill Models

The skill models were created for use in the
online tutoring system called ASSISTment, founded
at WPI. They consist of skill names and
associations (or tagging) of those skill names
with math questions on the system. Models with 1,
5, 39 and 106 skills were evaluated to represent
varying degrees of concept generality. The skill
models' ability to predict performance of
students on the system as well as on a
standardized state test was evaluated.
The four skill models used
WPI-106: 106 skill names were drafted and tagged
to items in the tutoring system and to the
questions on the state test by our subject matter
expert, Cristina.
WPI-5 and WPI-39: 5 and 39 skill names drafted
by the Massachusetts Department of Education.
WPI-1: Represents unidimensional assessment.

Predicting student state test scores

Arrows represent associations of skills with
question items. They also represent conditional
dependence in the Bayesian Belief Network.
Probability of Guess is set to 10% (tutor
questions are fill-in-the-blank)
Probability of getting the item wrong even if the
student knows it is set to 5%
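
One way to read the two parameters above is as guess and slip probabilities on a question node. The sketch below assumes the simplest case of an item tagged with a single skill; how the poster's network combines several skills on one item is not stated here, so that case is left out. The function and argument names are hypothetical.

```python
def probability_correct(p_known: float, guess: float = 0.10, slip: float = 0.05) -> float:
    """Probability of a correct answer on a single-skill item.

    p_known: current probability that the student knows the skill
    guess:   chance of answering correctly without knowing the skill
    slip:    chance of answering incorrectly despite knowing the skill
    Defaults mirror the tutor-question settings quoted above.
    """
    return p_known * (1.0 - slip) + (1.0 - p_known) * guess

# Illustrative skill probabilities only.
p_if_unknown = probability_correct(0.0)
p_if_known = probability_correct(1.0)
p_if_unsure = probability_correct(0.5)
```

With these settings the predicted probability can never fall below the guess rate or rise above one minus the slip rate, whatever the skill estimate is.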

Result: The finest-grained model, the WPI-106,
came in 2nd to the WPI-39, which may have
performed better than the 106 because 50% of its
skills are sampled on the MCAS Test vs. only 25%
of the WPI-106's.
2. Inferred skill probabilities from above are
used to predict the probability the student will
answer each test question correctly
Bayesian Networks

A Bayesian Network is a probabilistic machine
learning method. It is well suited for making
predictions about unobserved variables by
incorporating prior probabilities with new
evidence.
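
To illustrate combining a prior with new evidence, here is a minimal sketch of a Bayes-rule update for one skill after one observed answer. It assumes a single skill node with guess and slip parameters; the function name, the prior of 0.5, and the default parameter values are illustrative assumptions rather than the poster's actual network.

```python
def posterior_known(prior: float, answered_correctly: bool,
                    guess: float = 0.10, slip: float = 0.05) -> float:
    """Update the probability that a student knows a skill after one response."""
    if answered_correctly:
        likelihood_known, likelihood_unknown = 1.0 - slip, guess
    else:
        likelihood_known, likelihood_unknown = slip, 1.0 - guess
    numerator = prior * likelihood_known
    return numerator / (numerator + (1.0 - prior) * likelihood_unknown)

# Starting from an assumed prior of 0.5:
after_correct = posterior_known(0.5, True)
after_incorrect = posterior_known(0.5, False)
```

A correct answer raises the estimate and an incorrect one lowers it, with the size of the shift governed by the guess and slip settings.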

Background on ASSISTment

ASSISTment is a web-based assessment system for
8th-10th grade math that tutors students on items
they get wrong. There are 1,443 items in the
system.
The system is freely available at
www.assistment.org
Question responses from 600 students using the
system during the 2004-2005 school year were
used.
Each student completed around 260 items.

Conclusions

The ASSISTment fine-grained skill models excel at
assessment of student skills (see Ming Feng's
poster for a Mixed-Effects approach comparison)
Accurate prediction means teachers can know when
their students have attained certain competencies.

Probabilities are summed to generate total test
score.
Probability of Guess is set to 25% (MCAS
questions are multiple choice)
Probability of getting the item wrong even if the
student knows it is set to 5%

This work has been accepted for publication at
the 2007 User Modeling Conference in Corfu,
Greece.
81
Tracking skill learning longitudinally
85
Future Work

Apply modern longitudinal data analysis to our
data set.
Pivot table report
Run a randomized controlled experiment that
assigns teachers to conditions with or without
the system

86
Conclusions

The Assistment system seems to do a reasonable
job of assessing and assisting.
We are releasing the system for beta testing today,
with an official release in September for the state
of Massachusetts.
www.Assistment.org

Nuzzo-Jones, G., Walonoski, J.A., Heffernan,
N.T., & Livak, T. (2005). The eXtensible Tutor
Architecture: A New Foundation for ITS. Poster in
the 12th Annual Conference on Artificial
Intelligence in Education 2005, Amsterdam.
87
Reporting to Teachers & Researchers
