Title: Neil T. Heffernan: The Use of Intelligent Tutoring Systems as Research Platforms
1Neil T. Heffernan The Use of Intelligent
Tutoring Systems as Research Platforms.
- Worcester Polytechnic Institute
- Computer Science Department
- Massachusetts
2Outline
- Who am I?
- My dissertation work
- Carnegie Learning Inc.
- ASSISTments
- I now spend a great deal of time working with
teachers in their classrooms to figure out the
most effective ways of using data.
- The Future
- Databases and research schools
- Computer assistance in learning what works.
- I am always looking for collaborators (teachers,
schools, researchers, psychometricians)
- We give away our service to any teacher /
researcher
- We share our data anonymously with researchers
3Why are algebra word problems difficult?
- Ann is in a rowboat in a lake that is 2400 yards
wide. She is 800 yards from the dock. She then
rows for m (or 3) minutes back towards the dock.
Ann rows at a speed of 40 yards per minute.
Write an expression for Ann's distance from the
dock.
- Symbolize: 800 - 40m
- Articulate an expression (m = 3): 800 - 40*3
- Compute a value (m = 3): 680
Two Cognitive Science conference papers (1997,
1998) with Ken Koedinger
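The three tasks in this problem (symbolize, articulate, compute) can be illustrated with a short sketch. This is only an illustration and not part of the original studies; the function name `distance_from_dock` and its parameter names are hypothetical.

```python
def distance_from_dock(minutes, start_yards=800, speed_yards_per_min=40):
    """Symbolized form of the rowboat problem: 800 - 40m."""
    return start_yards - speed_yards_per_min * minutes


# Articulate: substitute m = 3, giving the expression 800 - 40*3.
# Compute: evaluate that expression to a single value.
print(distance_from_dock(3))  # 680
```

Note that the lake width (2400 yards) never enters the expression; it is a distractor in the problem statement.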
4Over 20,000 Students Tutored
www.AlgebraTutor.org
5Cognitive Tutor Carnegie Learning Inc.
6300 Schools in 2000-01
7Carnegie Learning Inc
- I do not work for Carnegie Learning, nor did I create the Cognitive
Tutor technology. My advisors did (Ken
Koedinger).
- Ken Koedinger's and my second project is called
ASSISTments, and it's a free web-based system
that is more focused on assessment and less on
being the primary delivery of instruction.
- Carnegie Learning Inc. suggests kids are on the
computer 2 of the 5 days a week
- ASSISTments' suggested use is once every two
weeks.
- ASSISTments is meant to be an add-on to an existing
curriculum (textbooks) and not meant to be a
curriculum itself.
- We are more interested in using intelligent
tutors as research platforms to produce sound
research.
9Blending Assessment and Instructional
Assisting The Assistment Project
www.Assistment.org
This research was made possible by the US Dept
of Education, Institute of Education Science,
"Effective Mathematics Education Research"
program grant R305K03140, the Office of Naval
Research grant N00014-03-1-0221, NSF CAREER
award to Neil Heffernan, and the Spencer
Foundation. Authors Razzaq and Mercado were
funded by the National Science Foundation under
Grant No. 0231773. All the opinions in this
article are those of the authors, and not those
of any of the funders.
Leena RAZZAQ, Mingyu FENG, Goss NUZZO-JONES,
Neil T. HEFFERNAN, Kenneth KOEDINGER,
Brian JUNKER, Steven RITTER, Andrea KNIGHT,
Edwin MERCADO, Terrence E. TURNER, Ruta
UPALEKAR, Jason A. WALONOSKI, Michael A.
MACASEK, Christopher ANISZCZYK, Sanket CHOKSEY,
Tom LIVAK, Kai RASMUSSEN
Carnegie Learning
10Research Questions
- 1. Does the Assistment system work as an assessment
tool?
- Reliably predict MCAS performance?
- Reliably advise teachers and students on what
knowledge to focus on?
- 2. Does the system help enhance student learning?
- Does the system effectively teach as it assesses?
- Can teachers use the advice of this system to
lead to higher student learning?
11Intro to the Assistment System
- Massachusetts Comprehensive Assessment System
(MCAS)
- Required to graduate; 30% failed
- Very challenging multi-step problems
- 8th grade (13 year olds)
- Video: New England Cable News
12- We break multi-step problems into scaffolding
questions
- Hint Messages: given on demand, they give hints
about what step to do next (shown in GREEN)
- Buggy Message: a context-sensitive feedback
message
- Knowledge Components: skills, strategies,
concepts
- The state reports to teachers on 5 areas
- We seek to report on 100 knowledge components
- Video of demo backup
13The Assistment Builder: A Rapid Development Tool
for ITS. Terrence Turner, Michael Macasek, Goss
Nuzzo-Jones, Neil Heffernan, Ken Koedinger
Data Collection Results
What the teacher sees
What a student sees
Goal
It is reported [2] that creating one
hour of content for an Intelligent Tutoring
System (ITS) requires at least 200 hours of
development time by PhD-level AI programmers and
cognitive scientists. We wanted to create a tool
that was usable by non-programming school
teachers and that could reduce that time cost by
a factor of 10. We report that we have certainly
succeeded at making a tool usable by
non-programmers, and that the cost, in time, is
reduced by a factor of 5.
- To analyze the effectiveness of the Builder, we
developed a system to log and time stamp the
actions of an author. Unfortunately, the logging
was not in place until recently, capturing less
than 1% of Builder usage, so caution is
warranted.
This shows a student who first answered 16, then got
the first scaffolding question correct with AC.
The student then clicked on ½*8x and the
system displayed the bug message in red. The
student then asked for a hint twice in a row,
shown in green.
The original question
The first scaffold
- Above are results on the time it took 5 different
authors to build a total of 14 different
assistments. On average it took authors 25
minutes per assistment. Given that each
assistment provides about 2 minutes of content,
this suggests a ratio of 13:1, instead of the
200:1 from the literature [2]. These results
suggest a speed-up of over a factor of 15!
- However, self-reports by our content creators put
the time it took to build and test an
assistment at 90 minutes per assistment, which
yields a ratio of 45:1 instead of the 13:1
reported above. This suggests that the Builder
allows content to be created 5 times faster than
the literature suggests.
- Both of these results ignore the time it takes to
mark questions with skills. That process was
done for the Assistment System as a whole,
marking 250 items in 6 hours, resulting in an
amortized cost of about 2 minutes per item.
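The ratios quoted above follow from simple arithmetic. A minimal sketch, using only the numbers stated on the poster (25 logged minutes or 90 self-reported minutes per assistment, about 2 minutes of content each, and the 200:1 literature figure); the function and variable names are hypothetical:

```python
LITERATURE_RATIO = 200.0  # authoring hours per hour of content, from [2]


def authoring_ratio(build_minutes, content_minutes=2.0):
    """Minutes of authoring time per minute of tutoring content."""
    return build_minutes / content_minutes


logged = authoring_ratio(25)         # 12.5, reported as roughly 13:1
self_reported = authoring_ratio(90)  # 45.0, i.e. 45:1

print(LITERATURE_RATIO / logged)         # 16.0 -> "over a factor of 15"
print(LITERATURE_RATIO / self_reported)  # about 4.4 -> roughly 5 times faster
```

The gap between the two estimates comes entirely from which per-assistment time is used, which is why the poster reports the more conservative factor of 5.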
The author clicked here to open up this window to
write the hint messages as shown above in green.
The second scaffold
Context
The three hint messages for the second scaffold.
After Building Assistments
Teachers can select items to put into
experiments, and then assign them to classes.
Below is a real experiment [4] that
investigated whether students would learn better
if asked to set up proportions. The author built
two different Assistments that differed only by
one extra scaffold. The author made a second
morphed version of each by changing the cover
story. Finally, the author selected two items
to posttest for far transfer.
- The Assistment Builder is a web-based
tutor-creation tool that has been used to create
about 8 hours of instruction, which has been
shown to lead to statistically significant
learning (see [4]) with over 1,000 students.
- This tool allows Assistments to be created that
are behaviorally equivalent to rule-based
cognitive tutors [1] but are not general. While
our underlying representation [3] allows both
rule-based and state-based tutors, the Builder
creates a tutor for a single item.
- Anecdotally, we observed a high school teacher
create three items, each in under 30 minutes, but
that was with the questions already prepared and
stored in a text file. Two high-school teachers
have together made over 50 assistments.
The bottom out hint.
The third scaffold
Conclusion
This dialog shows that the author has tagged the third
scaffold in skill models of three different grain
sizes.
- We have built a system in which non-programmers
can easily build content in 1/5 the amount of
time it has taken with other ITS authoring tools
[1,2]. The content has also been
shown to lead to statistically significant
learning [4].
- The system will be freely available via the
Assistment Project at www.assistment.org in
September of 2005. Any teacher will be allowed to
make their own content and/or use our existing
content, and get live class reports.
- Primary Contact: Neil Heffernan, nth_at_wpi.edu
- Lead funding from an NSF CAREER award to Prof. Heffernan
and the US Dept of Education; support from the Spencer
Foundation, the Office of Naval Research, and the US
Army.
Authors can then schedule when to be notified!
By tagging items with skills, teachers can 1) get
reports on which skills students are doing poorly
on, and 2) track them over time as discussed in
[6].
- References
- [1] Koedinger, K. R., Aleven, V., Heffernan, N. T.,
McLaren, B., & Hockenberry, M. (2004). Opening the
Door to Non-Programmers: Authoring Intelligent
Tutor Behavior by Demonstration. Proceedings of the
7th Annual Intelligent Tutoring Systems
Conference, Maceio, Brazil, pp. 162-173.
- [2] Murray, T. (1999). Authoring intelligent
tutoring systems: An analysis of the state of the
art. International Journal of Artificial
Intelligence in Education, 10, pp. 98-129.
- [3] Nuzzo-Jones, G., Walonoski, J. A., Heffernan,
N. T., & Livak, T. (2005). The eXtensible Tutor
Architecture: A New Foundation for ITS. Poster in
the 12th Annual Conference on Artificial
Intelligence in Education 2005, Amsterdam.
- [4] Razzaq, L., Feng, M., Nuzzo-Jones, G., Heffernan,
N. T., et al. (2005). The Assistment Project:
Blending Assessment and Assisting. 12th Annual
Conference on Artificial Intelligence in
Education 2005.
- [5] Turner, T. E., Macasek, M. A., Nuzzo-Jones, G.,
Heffernan, N. T., & Koedinger, K. (2005). The
Assistment Builder: A Rapid Development Tool for
ITS. Poster in the 12th Annual Conference on
Artificial Intelligence in Education 2005,
Amsterdam.
- [6] Feng, M., & Heffernan, N. T. (2005). Informing
Teachers Live about Student Learning: Reporting
in the Assistment System. 12th Annual Conference
on Artificial Intelligence in Education 2005,
Workshop on Usage Analysis in Learning Systems,
Amsterdam.
The system automatically does the analysis. This
shows the SetupRatio condition to have better
learning within the condition as well as better
learning on the posttest/transfer items (reported
in [4]).
The fourth and last Scaffold
14The Web Based Architecture and Builder Web-App
- Web-based server-side system to build and
deploy content; anyone can use it.
- We built 8 hours of content (240 items) in 375
hours with no programming needed.
- Murray reports that ITS take a minimum of 200
hours of work to produce 1 hour of ITS content,
and typically require PhD-level programming and
cognitive science expertise.
- About 5 times faster than the literature would suggest.
Turner, T.E., Macasek, M.A., Nuzzo-Jones, G.,
Heffernan, N..T, Koedinger, K. (2005). The
Assistment Builder A Rapid Development Tool for
ITS. Poster in the 12th Annual Conference on
Artificial Intelligence in Education 2005,
Amsterdam
15Research Questions
- 1. Does the Assistment system work as an assessment
tool?
- Reliably predict MCAS performance?
- Reliably advise teachers and students on what
knowledge to focus on?
- 2. Does the system help enhance student learning?
- Does the system effectively teach as it assesses?
- Can teachers use the advice of this system to
lead to higher student learning?
- We can predict MCAS scores well (Feng, Heffernan &
Koedinger, 2006)
- We can predict better if we look at how much
assistance they needed. (A. Brown called this
dynamic assessment.)
- We predict MCAS scores better if we track 98
skills.
16Do students learn from Assistments?
- To determine if students are learning, we gave
them problems testing the same concept within a
given class period and compared their results.
- Problems were presented in random order
- Example concept pairs
- Supplementary Angles and Transversals of Parallel
Lines,
- Perimeter and Area,
- Pythagorean Theorem,
- Approximating Square Roots.
17N = 595 students, Average Gain 1.5%
In the figure shown, lines l and m are parallel,
and triangle ABC is isosceles. What is the
measure of angle ACB?
In the figure above, lines CD and EF are
parallel. What is the measure of angle BHF?
18N = 390 students, Average Gain 4.3%
The squares in the figure above are congruent.
The perimeter of the entire figure is 24 units.
What is the area of one small square?
In the figure, the perimeter of the equilateral
triangle is 24 inches. What is the area of the
square?
19N = 329 students, Average Gain 3%
Manuel will use an old 16-foot fence for the
longest side and an old 8-foot fence for another
side as shown. What is the best estimate of the
amount of fencing he will need to the nearest
whole number for the third side?
What is the area of the smaller square?
20N = 424 students, Average Gain 7%
Which is the best approximation of √72?
The square root of 55 is between which two whole
numbers?
21That is, do students learn from Assistments?
- 742 students
- Example content gains
- Supplementary Angles and Transversals of Parallel
Lines, 1.5%
- Perimeter and Area, 4%
- Pythagorean Theorem, 3%
- Approximating Square Roots, 7%
- 29 other pairs
- Student item analysis reveals statistically
reliable gains of 2% on the 2nd opportunity at
a cluster of skills
- Student Level Analysis: p = 0.02
- Item Level Analysis that weighted items
proportionally: p = .04
22Experiments comparing different teaching
techniques
- How the Assistments System can be used to run
randomized controlled experiments.
- Math Research Question
- Should students be scaffolded in their
proportional reasoning problems by being taught
to set up a proportion/ratio?
- Two Conditions (SetUpRatio Condition)
- Two items to test for transfer/posttest
24The Two Morphs
25Transfer/Posttest Items
26Tools to make an experiment
Built by Ema Holban and Shane Gibbons
27Tools to make an experiment
Built by Ema Holban and Shane Gibbons
28Tools to make an experiment
Built by Ema Holban and Shane Gibbons
29Tools to make an experiment
Percent correct
Built by Ema Holban and Shane Gibbons
32The Current Status
- 4,000 students in towns around and in Worcester
and Pittsburgh.
- http://nth1.wpi.edu/zpardos/day-stats.html
36What types of experiments? Both those where the
conditions are online and those where they are offline.
- Which is best for coached problem solving:
traditional classroom, computer-aided instruction
(CAI), or intelligent tutoring systems?
(Mendocino, Heffernan & Razzaq, in submission)
- Do students learn more from scaffolding questions
compared to a traditional CAI approach? (Razzaq &
Heffernan, 2006)
- How about for students with less knowledge?
(Razzaq, Heffernan & Lindeman, 2007)
- The value of worked examples.
37- Jon Star at Harvard, Bethany Rittle-Johnson at
Vanderbilt, and I are looking at whether middle
school math students learn more if taught
multiple solution strategies. These were
classroom experiments, but the ASSISTment system
is used for the testing (kids get feedback).
42Databases: Class Summaries
43Databases: Common Errors
44Databases: What problems are hardest?
45Databases: What skills are hardest?
46Databases: What skills are slowest to learn?
47Databases: What instructional interventions are
most effective in causing student learning?
48Databases: Far Transfer?
50Let's have teachers build this stuff rather than a
tiny group of smart researchers at CMU
51What is ASSISTments?
- Using Formative Assessment to drive instruction.
- But the testing takes away valuable classroom
time.
- Who funded it?
- What is it about?
- What results do we have?
- We can predict state test scores well
- We can do so better if we track 98 different
skills in 8th grade math
- We assess better if we use how much assistance
the student needed.
- Kids are learning during the testing.
52Wiki-ASSISTment
53- First International Conference on Educational
Data Mining
- Data Mining and Statistics in Service of
Education
- Call for papers (preliminary)
- http://www.EducationalDataMining.org
- June 20-21, 2008
- Co-located with International Conference on
Intelligent Tutoring Systems (ITS 2008)
- The First International Conference on Educational
Data Mining brings together researchers from
computer science, education, psychology,
psychometrics, and statistics to analyze large
data sets to answer educational research
questions. The increase in instrumented
educational software, as well as state databases
of student test scores, has created large
repositories of data reflecting how students
learn. The EDM conference focuses on
computational approaches for using those data to
address important educational questions. The
broad collection of research disciplines ensures
cross-fertilization of ideas, with the central
questions of educational research serving as a
unifying focus. This Conference emerges from
preceding EDM workshops at the AAAI, AIED, ICALT,
ITS, and UM conferences.
- Topics of Interest
- We welcome papers describing original work.
Areas of interest include but are not limited to:
- Improving educational software. Many large
educational data sets are generated by computer
software. Can we use our discoveries to improve
the software's effectiveness?
- Domain representation. How do learners represent
the domain? Does this representation shift as a
result of instruction? Do different
subpopulations represent the domain differently?
- Evaluating teaching interventions. Student
learning data provides a powerful mechanism for
determining which teaching actions are
successful. How can we best use such data?
58Assessing Students' Performance Longitudinally:
Item Difficulty Parameter vs. Skill Learning
Tracking
- Mingyu Feng, Zach Pardos, Neil T. Heffernan
- Worcester Polytechnic Institute
59 Some of the ASSISTMENT TEAM
Leena RAZZAQ, Mingyu FENG, Goss NUZZO-JONES,
Neil T. HEFFERNAN, Kenneth KOEDINGER,
Brian JUNKER, Steven RITTER, Meghan MYERS,
Elizabeth AYERS, T. TURNER,
R. UPALEKAR, J. WALONOSKI,
Z. PARDOS, Michael A. MACASEK,
Christopher ANISZCZYK, Sanket CHOKSEY, Tom LIVAK,
Kai RASMUSSEN
Carnegie Learning
60The ASSISTment System
- A web-based tutoring system that assists students
in learning mathematics and gives teachers an
assessment of their students' progress
61An ASSISTment
Geometry
- We break multi-step problems into scaffolding
questions
- Hint Messages: given on demand, they give hints
about what step to do next
- Buggy Message: a context-sensitive feedback
message
- (Feng, Heffernan & Koedinger, 2006a)
- Skills
- The state reports to teachers on 5 areas
- We seek to report on more and finer grain-sized
skills
(Demo/movie)
The original question
a. Congruence
b. Perimeter
c. Equation-Solving
The 1st scaffolding question
Congruence
The 2nd scaffolding question
Perimeter
A buggy message
A hint message
62The ASSISTment Project: What Level of Tutor
Interaction is Best? By Leena Razzaq, Neil
Heffernan & Robert Lindeman
Collaborators
Sponsors
Goal
Experiment Design
To determine the best level of tutor interaction
to help students learn the mathematics required
for a state exam based on their math proficiency.
Experiment Screen Shots
- 3 levels of interaction
- Scaffolding + hints represents the most
interactive experience: students must answer
scaffolding questions, i.e. learning by doing.
- Hints on demand are less interactive because
students do not have to respond to hints, but
they can get the same information as in the
scaffolding questions by requesting hints.
- Delayed feedback is the least interactive
condition because students must wait until the
end of the assignment to get any feedback.
- 2 levels of math proficiency
- Students in Honors math classes.
- Students in Regular math classes.
Background on ASSISTments
- The Assistment System is a web-based assessment
system that tutors students on math problems. The
system is freely available at www.assistment.org
- As of March 2007, thousands of Worcester middle
school students use ASSISTments every two weeks
as part of their math class.
- Teachers use the fine-grained reporting that the
system provides to inform their instruction.
Students in this condition interact with the
tutor by answering scaffolding questions.
Students in this condition can get hints when
they ask for them by pressing the hint button.
Students in this condition get no feedback until
the end of the assignment when they get answers
and solutions.
Analysis and Conclusions
- 566 8th grade students participated.
- Results showed a significant interaction between
condition and math proficiency (p < 0.05), a good
case for tailoring tutor interaction to types of
students.
Scaff. Q. 1
The Interaction Hypothesis
Hint 1
Students see the solution after they finish all
of the problems.
When one-on-one tutoring, either by a human tutor
or a computer tutor, is compared to a less
interactive control condition that covers the
same content, then students will learn more in
the interactive condition than the control
condition.
Scaff. Q. 2
Hint 2
Scaff. Q. 3
- Is this hypothesis true?
- We found evidence to support this hypothesis in
some cases, not in others.
- Based on the results of Razzaq & Heffernan
(2006), we believe the difficulty of the material
influences how effective interactive tutoring
will be.
Hint 3
Hint 4
Hint 5
- Regular students learned more with scaffolding +
hints (p < 0.05): less-proficient students
benefit from more interaction and coaching
through each step to solve a problem.
- Honors students learned more with delayed
feedback (p = 0.075): more-proficient students
benefit from seeing problems worked out and
getting the big picture.
- Delayed feedback performed better than hints on
demand (p = .048) for both more- and
less-proficient students: students don't do as
well when we depend on student initiative.
Scaff. Q. 4
Our Hypothesis
Hint 6
- More interactive intelligent tutoring will lead
to more learning (based on post-test gains) than
less interactive tutoring.
- Differences in learning will be more significant
for students who are less-proficient than
students who are more-proficient.
Hint 7
Hints on Scaff. Q.
This work has been accepted for publication at
the 2007 Artificial Intelligence in Education
Conference in Los Angeles.
63CAREER: Learning about Learning: Using
Intelligent Tutoring Systems as a Research
Platform to Investigate Human Learning. Free
researcher, teacher and student accounts for
7th-10th grade math preparation at
www.assistment.org
What the State MCAS test provides
What a student sees
What the teacher who builds the tutoring sees.
This shows a student who first guessed 16 (the real
answer is 24), then got the first scaffolding
question correct with AC. The student then
clicked on ½*8x and the system displayed the
bug message in red. The student, twice in a
row, asked for a hint, shown in the green box.
The original question
Teacher Reports
The first scaffold
The second scaffold
The author wrote this hint message, shown in the
green box, by typing it in here.
The three hint messages for the second scaffold.
Recent Results - 2006
Teachers get reports per student, per skill, and
per item.
This project has 5 main research thrusts.
1) For the designing cognitive models thrust, we
report that we can do a better job of modeling
students by using finer-grained models (i.e.,
models that track more knowledge components) than
coarser-grained models (Pardos et al, 2006; Feng
et al, 2006).
2) For the research thrust of inferring what
students know and are learning, we can report two
new results. First, we can do a better job of
assessing students (as measured by predicting
state test scores) by seeing how much tutoring
they need to solve a question (Feng et al,
2006a). Second, we have shown that we can do a
better job of modeling students' learning over
time by building models that allow us to model
different rates of learning for different skills
(Feng et al, 2006a).
3) For the optimizing learning thrust, we have new
empirical results that show that students learn
more with the type of tutoring we provide than
with a traditional Computer-Aided Instruction
(CAI) control (Razzaq & Heffernan, 2006).
4) For the thrust of informing educators, we have
some recent publications on the types of feedback
we give educators (Feng & Heffernan, 2005, 2006).
Additionally, we have work that shows we can
track student motivation and then inform
educators in novel manners that increase student
motivation (Walonoski & Heffernan, 2006a, 2006b).
5) Finally, for the thrust of allowing user
adaptation, we have shown that the authoring
tools we have built can be used by teachers to
quickly create content for their classes
(Heffernan, Turner et al, 2006).
References are at www.assistment.org
The bottom out hint.
The third scaffold
This dialog shows that the author has tagged the third
scaffold in skill models of three different grain
sizes.
By tagging items with skills, teachers can 1) get
reports on which skills students are doing poorly
on, and 2) track them over time.
The fourth and last Scaffold
64Scaling up a Server-Based Web Tutor. Jozsef
Patvarczki & Neil Heffernan
Results
Introduction
Assistment Features
Our research team has built a web-based tutor,
located at www.ASSISTment.org [1], that is used
by hundreds of students a day in Worcester and
surrounding towns. The system's focus is to teach
8th and 10th grade mathematics and MCAS
preparation. Because it is easily accessible, it
helps lower the entry barrier for teachers and
enables both teachers and researchers to collect
data and generate reports. Scaling up a
server-based intelligent tutoring system requires
developers to care about speed and reliability.
We will present how the Assistment system can
improve performance and reliability with a
fault-tolerant scalable architecture.
- Since each public school class has about 20
students, we noticed clusters (shown in ovals in
the bottom left) of intervals where a single
class was logged on.
- The log-on procedure is the most expensive step
in the process, and this data shows that this
might be a good place for us to improve.
- We noticed a second cluster of around 40 users,
which most likely represents instances where two
classes of students were using the system
simultaneously.
- There was no appreciable pattern towards a slower
page creation time with more users.
- Three simulated scenarios with 10s random delay
between student actions
- In the first scenario we used 50 threads
simulating 50 students working without a
load-balancer, with one application server and one
database
- Second scenario: with a load-balancer and two
application servers
Users begin interacting with our system through
the Portal that manages all activities
This problem uses a pseudo-tutor (state-based
implementation) with pre-made scaffolding and
hint questions selected based upon student input.
Incorrect responses are in red, and hints are in
green.
Example of a State-based Pseudo Tutor
- Horizontal scaled configuration
- Scalable
- Fault-tolerant
- Dynamically configurable
Architecture
Client actions represent the system's load
HTTP server as Load Balancer
System Scalability and Reliability
- Two concerns when running the Intelligent Tutor
on a central server are
- 1) building a scalable server architecture
- 2) providing reliable service to researchers,
teachers, and students.
- We will answer several research questions
- 1) can we reduce the cost of authoring ITS
- 2) how can we improve performance and reliability
with a better server architecture.
- In order to serve thousands of users, we must
achieve high reliability and scalability at
different levels.
- Scalability at our first entry point comes through the
use of a virtual IP for www.assistment.org,
provided by the CARP protocol.
- Random and round-robin redirection algorithms can
provide very effective load-sharing, and the
load-balancer distributes load over multiple
application servers.
- This will allow us to redirect incoming web
requests and build a web portal application in a
multiple-server environment.
- The monitoring system, which uses Selenium, has allowed
us to send text messages to our administrators
when the system goes down.
- Multiple database servers with automatic
synchronization, pooling, and fail-over
detection.
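The random and round-robin redirection mentioned above can be sketched in a few lines. This is a toy illustration only, not the project's HTTP-server load balancer; the class names and the server names `app1` and `app2` are hypothetical, and the 50 requests simply echo the 50 simulated students.

```python
import itertools
import random
from collections import Counter


class RoundRobinBalancer:
    """Hands out application servers in a fixed rotating order."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class RandomBalancer:
    """Picks an application server uniformly at random."""

    def __init__(self, servers, seed=None):
        self._servers = list(servers)
        if not self._servers:
            raise ValueError("at least one server is required")
        self._rng = random.Random(seed)

    def pick(self):
        return self._rng.choice(self._servers)


servers = ["app1", "app2"]  # hypothetical application servers
round_robin = RoundRobinBalancer(servers)
assignments = [round_robin.pick() for _ in range(50)]
counts = Counter(assignments)
print(counts["app1"], counts["app2"])  # 25 25
```

Round-robin guarantees an even split of requests; random redirection only evens out on average, but needs no shared counter between balancer instances.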
Additional application servers for load balancing
GRID computing Bayesian Network Application
Workflow Editor and Manager
Visualization and Resource Information System
WPI P-GRADE GRID Portal http://pgrade.wpi.edu
- Reference
- Razzaq, L, Feng, M., Nuzzo-Jones, G., Heffernan,
N.T. et. al (2005). The Assistment Project
Blending Assessment and Assisting. 12th Annual
Conference on Artificial Intelligence in
Education 2005, Amsterdam
Contact Neil Heffernan, nth_at_wpi.edu
65How Were the Skill Models Created?
67- Fine-grained skill models in reporting
- Teachers get reports that they think are credible
and useful. (Feng & Heffernan, 2005, 2006, 2007)
68DataBases Logging Every Student Action
72Research Question
- In the ASSISTment project, which approach works
better for assessing students' performance
longitudinally?
- Skill learning tracking?
- Or using item difficulty parameter?
(unidimensional)
73Data Source
- 497 students of two middle schools
- Students used the ASSISTment system every other
week from Sep. 2004 to May 2005
- Real state test score in May 2005
- Item level online data
- Students' binary response (1/0) to items that are
tagged in different skill models
- Some statistics
- Average usage: 7.3 days
- Average questions answered: 250
- 138,000 data points
74Data Source
75Item Difficulty Parameter
- Fit one-parameter logistic (1PL) IRT model (Rasch
model) on our online data
- The dependent variable: probability of correct
response for student i to item n
- The independent variables: the person's trait
score and the item's difficulty level.
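The slide's equation did not survive in this transcript. As a reminder of the standard 1PL (Rasch) form, the probability of a correct response is the logistic function of the difference between the person's trait score and the item's difficulty. A minimal sketch; the names `theta` (trait score of student i) and `difficulty` (difficulty of item n) are notation chosen here, not taken from the slide:

```python
import math


def rasch_probability(theta, difficulty):
    """P(correct) under the 1PL model: logistic(theta - difficulty)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


# A student whose trait score equals the item difficulty has a 50% chance.
print(rasch_probability(0.0, 0.0))  # 0.5
```

Harder items (larger difficulty) lower the probability for every student by the same shift on the logit scale, which is what makes the model unidimensional.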
76Longitudinally Modeling
- Mixed-effects Logistic Regression Models
- Models we fitted
- Model-beta: time + beta -> item response
- Model-WPI5: time + skills in WPI5 -> item
response
- Model-WPI78: time + skills in WPI78 -> item
response
- Evaluation
- The accuracy of the predicted MCAS test score
was used to evaluate different approaches
Singer & Willett (2003). Applied Longitudinal
Data Analysis. Oxford University Press, New York.
Hedeker & Gibbons (in preparation).
Longitudinal Data Analysis.
77How do we predict the MCAS?
- We use the parameters learned from the online data
to predict which questions each student will get correct
or incorrect on the MCAS, based upon the skill of the
item, the time of the question, and the
parameters per student for that skill.
- If 3 items with skill A are on the test, and the
prediction is .25 per item for that skill, then we predict .75
MCAS points from those three questions.
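The worked example above amounts to summing per-item probabilities of a correct answer. A minimal sketch of that bookkeeping, assuming each test item is tagged with a single skill; the function name and the dictionary layout are hypothetical:

```python
def predicted_test_points(item_skills, p_correct_by_skill):
    """Expected score: sum of predicted P(correct) over the test items."""
    return sum(p_correct_by_skill[skill] for skill in item_skills)


# Three test items tagged with skill A, each predicted at .25.
points = predicted_test_points(["A", "A", "A"], {"A": 0.25})
print(points)  # 0.75
```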
78Results
P-values of both Paired t-tests are below 0.05
79Conclusion
- We have found evidence that skill learning
tracking can better predict MCAS scores than
simply using the item difficulty parameter, and
fine-grained models did even better than the
coarse-grained model
- Our skill mapping is good (maybe not optimal)
- We are considering using these skill models in
selecting the next best problem to present to a
student.
- Although we used the Rasch model to train the item
difficulty parameter, we were not modeling
students' responses with IRT. One interesting piece of work
will be comparing our results to predictions made
through an item response modeling approach.
80Modeling Student Knowledge: Using Bayesian
Networks to Predict Student Performance. By Zach
Pardos; Neil Heffernan, Advisor, Computer
Science. Joint work with Brigham Anderson and
Cristina Heffernan
Sponsors
Collaborators
Goal
Predicting student responses within the
ASSISTment tutoring system
Student Test Score Prediction Process
To evaluate the predictive performance of various
fine-grained student skill models in the
ASSISTment tutoring system using Bayesian
networks.
Bayesian Belief Network
Result: The finer-grained the model, the better the
prediction accuracy. The finest-grained, WPI-106,
performed the best with an average of only 5.5%
error in prediction of student answers within the
system.
- 1. Skill probabilities are inferred from a
student's responses to questions on the system
The Skill Models
- The skill models were created for use in the
online tutoring system called ASSISTment, founded
at WPI. They consist of skill names and
associations (or tagging) of those skill names
with math questions on the system. Models with 1,
5, 39 and 106 skills were evaluated to represent
varying degrees of concept generality. The skill
models' ability to predict performance of
students on the system as well as on a
standardized state test was evaluated.
- The four skill models used
- WPI-106: 106 skill names were drafted and tagged
to items in the tutoring system and to the
questions on the state test by our subject matter
expert, Cristina.
- WPI-5 and WPI-39: 5 and 39 skill names drafted
by the Massachusetts Department of Education.
- WPI-1: Represents unidimensional assessment.
Predicting student state test scores
- Arrows represent associations of skills with
question items. They also represent conditional
dependence in the Bayesian Belief Network.
- Probability of Guess is set to 10% (tutor
questions are fill in the blank)
- Probability of getting the item wrong even if the
student knows it is set to 5%
Result: The finest-grained model, the WPI-106,
came in 2nd to the WPI-39, which may have
performed better than the 106 because 50% of its
skills are sampled on the MCAS Test vs. only 25%
of the WPI-106's.
2. Inferred skill probabilities from above are
used to predict the probability the student will
answer each test question correctly
Bayesian Networks
- A Bayesian Network is a probabilistic machine
learning method. It is well suited for making
predictions about unobserved variables by
incorporating prior probabilities with new
evidence.
Background on ASSISTment
Conclusions
- ASSISTment is a web-based assessment system for
8th-10th grade math that tutors students on items
they get wrong. There are 1,443 items in the
system.
- The system is freely available at
www.assistment.org
- Question responses from 600 students using the
system during the 2004-2005 school year were
used.
- Each student completed around 260 items.
- The ASSISTment fine-grained skill models excel at
assessment of student skills (see Mingyu Feng's
poster for a Mixed-Effects approach comparison)
- Accurate prediction means teachers can know when
their students have attained certain competencies.
- Probabilities are summed to generate total test
score.
- Probability of Guess is set to 25% (MCAS
questions are multiple choice)
- Probability of getting the item wrong even if the
student knows it is set to 5%
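The role of the guess and slip parameters can be illustrated with a single-skill sketch. This is a simplification under stated assumptions, not the poster's Bayesian network: it treats one item as depending on one skill, reads the parameter values as percentages (guess 10% for tutor items, 25% for MCAS items, slip 5%), and starts from a prior of 0.5, which is an assumption made here. All function names are hypothetical.

```python
def p_correct(p_known, guess, slip):
    """P(correct answer) given the probability that the skill is known."""
    return p_known * (1.0 - slip) + (1.0 - p_known) * guess


def update_skill(p_known, correct, guess, slip):
    """Posterior P(skill known) after one observed response (Bayes' rule)."""
    if correct:
        numerator = p_known * (1.0 - slip)
        denominator = p_correct(p_known, guess, slip)
    else:
        numerator = p_known * slip
        denominator = 1.0 - p_correct(p_known, guess, slip)
    return numerator / denominator


TUTOR = {"guess": 0.10, "slip": 0.05}  # fill-in-the-blank tutor items
MCAS = {"guess": 0.25, "slip": 0.05}   # multiple-choice test items

# Step 1: infer the skill probability from responses on the system.
p_skill = 0.5  # assumed prior
for response in [True, True, False]:
    p_skill = update_skill(p_skill, response, **TUTOR)

# Step 2: predict the probability of answering a test item on that skill.
predicted = p_correct(p_skill, **MCAS)
```

Summing `predicted` over all test items gives the total predicted test score, as described in the bullets above. Items tagged with several skills would need a rule for combining them, which this sketch leaves out.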
This work has been accepted for publication at
the 2007 User Modeling Conference in Corfu,
Greece.
81Tracking skill learning longitudinally
85Future Work
- Apply modern longitudinal data analysis to our
data set.
- Pivot table report
- More future work is to do a randomized controlled
experiment assigning teachers to conditions with or without
the system
86Conclusions
- The Assistment system seems to do a reasonable
job of assessing and assisting.
- We are releasing the system for beta testing today,
with an official release in Sept. for the state
of Massachusetts.
- www.Assistment.org
Nuzzo-Jones, G., Walonoski, J.A., Heffernan,
N.T., Livak, T. (2005). The eXtensible Tutor
Architecture A New Foundation for ITS. Poster in
the 12th Annual Conference on Artificial
Intelligence in Education 2005, Amsterdam
87Reporting to Teachers & Researchers