Using Mixed-Effects Modeling to Compare Different Grain-Sized Skill Models (presentation)

Transcript and Presenter's Notes

1
Using Mixed-Effects Modeling to Compare Different
Grain-Sized Skill Models

Mingyu Feng, Worcester Polytechnic Institute
Neil T. Heffernan, Worcester Polytechnic Institute
Murali Mani, Worcester Polytechnic Institute
Cristina Heffernan, Worcester Public Schools

2
The ASSISTment System

An e-assessment and e-learning system that does
both ASSISTing of students and assessMENT (movie)
Massachusetts Comprehensive Assessment System
(MCAS)

Web-based system built on the Common Tutoring Object
Platform (CTOP) [1]

We are giving away accounts!
[1] Nuzzo-Jones, G., Macasek, M.A., Walonoski, J.,
Rasmussen, K.P., Heffernan, N.T. Common Tutor
Object Platform, an e-Learning Software
Development Strategy. WPI technical report
WPI-CS-TR-06-08.
3
ASSISTment
Geometry

We break multi-step problems into scaffolding
questions
Hint messages: given on demand, they suggest
what step to do next
Buggy message: a context-sensitive feedback
message
Skills
The state reports to teachers on 5 areas
We seek to report on more and finer grain-sized
skills
Demo (two triangles problem)

(Demo/movie)
The original question
a. Congruence
b. Perimeter
c. Equation-Solving
The 1st scaffolding question
Congruence
The 2nd scaffolding question
Perimeter
A buggy message
A hint message
4
How Were the Skill Models Created?
5
How Were the Skill Models Created?
Multi-mapped model (WPI-5) vs. single-mapped
model (MCAS-5)?
6
Previous Work on Skill Models

Fine-grained skill models in reporting
Teachers get reports that they think are credible
and useful. [3]

[3] Feng, M., Heffernan, N.T. (in press).
Informing Teachers Live about Student Learning:
Reporting in the Assistment System. To be
published in Technology, Instruction, Cognition,
and Learning Journal, Vol. 3. Old City Publishing,
Philadelphia, PA. 2006.
9
Previous Work on Skill Models

Tracking skill performance over time [4, 5]

Number Sense
[4] Feng, M., Heffernan, N.T., Koedinger, K.R.
(2006). Addressing the Testing Challenge with a
Web-Based E-Assessment System that Tutors as it
Assesses. Proceedings of the Fifteenth
International World Wide Web Conference. pp.
307-316. ACM Press, New York, NY. 2006.
[5] Feng, M., Heffernan, N.T., Koedinger, K.R.
(2006). Predicting state test scores better with
intelligent tutoring systems: developing metrics
to measure assistance required. In Ikeda, Ashley,
& Chan (Eds.). Proceedings of the Eighth
International Conference on Intelligent Tutoring
Systems. Springer-Verlag, Berlin. pp. 31-40. 2006.
10

In this work, we compare different grain-sized
skill models by the accuracy of their predictions
of state test scores

11
Research Questions

RQ1: Would adding response data from scaffolding
questions help us do a better job of tracking
students' knowledge?

RQ2: How does the finer-grained skill model
(WPI-78) do at estimating external test scores
compared to the skill model with only 5
categories (WPI-5) and the one with only one
category (WPI-1)?

RQ3: Does introducing item difficulty information
help to build a better predictive model?

12
Data Source

497 students of two middle schools
Students used the ASSISTment system every other
week from Sep. 2004 to May 2005
Real state test score in May 2005
Item-level online data:
students' binary responses (1/0) to items that are
tagged in different skill models

Some statistics
Average usage: 7.3 days; minimum usage: 6 days
138,000 data points (43,000 original data points)
Average number of questions answered:
original 87, scaffolding 189

Online data of 700 8th grade students available
for researchers! If you want access, talk to Neil
Heffernan and Kenneth Koedinger.
13
How is the Data Organized?
14
Approach

Fit a mixed-effects logistic regression model on
the longitudinal online data
using skills as a factor
predicting prob(response = 1) on an item tagged
with a certain skill at a certain time
The fitted model gives learning parameters
(initial knowledge and learning rate) of each skill
for each individual student
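
One way to write such a model down, as a sketch only: the symbols and the exact random-effects structure below are assumptions, since the slides state only that skill enters as a factor and that the fit yields an initial-knowledge and a learning-rate parameter. Here y_{ijt} is student i's binary response to item j at time t, and k(j) is the skill tagged on item j.

```latex
\[
\operatorname{logit}\,\Pr(y_{ijt} = 1)
  = \underbrace{\bigl(\beta_{0} + \beta_{0,k(j)} + b_{0i}\bigr)}_{\text{initial knowledge}}
  + \underbrace{\bigl(\beta_{1} + \beta_{1,k(j)} + b_{1i}\bigr)}_{\text{learning rate}}\, t,
\qquad
(b_{0i},\, b_{1i})^{\top} \sim \mathcal{N}(\mathbf{0}, \Sigma)
\]
```

In this form the fixed effects \beta_{0,k} and \beta_{1,k} shift the intercept and slope per skill, and the random effects b_{0i} and b_{1i} shift them per student. A variant with student-by-skill random effects would be another reading of the same slide.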

Compare skill models by Mean Absolute Difference
(MAD) and Err (= MAD / full score)
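
Written out, the two measures look as follows. The symbol names are ours, not the slides': s_i is the real test score of student i, \hat{s}_i the predicted score, N the number of students, and S_max the full score of the test.

```latex
\[
\mathrm{MAD} = \frac{1}{N} \sum_{i=1}^{N} \bigl\lvert s_i - \hat{s}_i \bigr\rvert,
\qquad
\mathrm{Err} = \frac{\mathrm{MAD}}{S_{\max}}
\]
```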

15
Data Preprocessing Strategies

Scaffolding Credit:
Scaffolding questions are shown only after a wrong
answer to the original question
We assume correct responses to all scaffolding
questions if a student correctly answered the
original one
Partial Blame:
For a wrong answer, only blame the skill with the
worst overall performance
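
The two preprocessing rules can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the dictionary-based data layout, and the choice to measure "worst performance" as a student's fraction correct per skill are all assumptions made for the example.

```python
def apply_scaffolding_credit(original_correct, scaffold_ids):
    """Scaffolding credit: a correct original answer means the scaffolding
    questions were never shown, so each one is credited as correct (1)."""
    if original_correct:
        return {sid: 1 for sid in scaffold_ids}
    # Wrong original answer: the real scaffolding responses are logged
    # elsewhere, so nothing is imputed here.
    return {}


def skill_to_blame(item_skills, skill_history):
    """Partial blame: among the skills tagged on a wrongly answered item,
    return the one with the lowest fraction correct so far."""
    def fraction_correct(skill):
        responses = skill_history.get(skill, [])
        # Assumption: a skill with no history counts as 0.0 (weakest).
        return sum(responses) / len(responses) if responses else 0.0
    return min(item_skills, key=fraction_correct)


# Hypothetical history for one student, using the skills from the demo item.
history = {
    "Congruence": [1, 1, 0],
    "Perimeter": [0, 0, 1],
    "Equation-Solving": [1, 1, 1],
}
blamed = skill_to_blame(["Congruence", "Perimeter", "Equation-Solving"], history)
credited = apply_scaffolding_credit(True, ["scaffold_1", "scaffold_2"])
```

With this made-up history, the wrong answer is charged to Perimeter only, and the two scaffolding questions are both credited as correct.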

16
RQ1: Will Scaffolding Responses Help?

Real MCAS score vs. ASSISTment predicted score (WPI-78)

Student   Real MCAS   Predicted,   Predicted,
          score       Orig.        Orig. + Scaffolds
Mary      29          22.93        27.05
Tom       28          19.38        25.35

Sue       25          18.58        24.10
Dick      22          16.57        21.31
Harry     33          18.66        28.12

Absolute difference between real score and ASSISTment predicted score

Student   Orig.       Orig. + Scaffolds
Mary      6.06        1.95
Tom       8.62        2.65

Sue       6.42        0.90
Dick      5.43        0.69
Harry     14.34       4.88
MAD       6.03        4.121
Error     17.75%      12.12%
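
As a quick check, the per-student absolute differences can be recomputed directly from the score columns. The sketch below uses only the five sample rows shown on the slide; the reported MAD values do not equal the mean of these five rows, so they were presumably computed over the full sample. Variable and function names are illustrative.

```python
# (student, real MCAS score, predicted from original questions only,
#  predicted from original + scaffolding questions), WPI-78 model.
rows = [
    ("Mary", 29, 22.93, 27.05),
    ("Tom", 28, 19.38, 25.35),
    ("Sue", 25, 18.58, 24.10),
    ("Dick", 22, 16.57, 21.31),
    ("Harry", 33, 18.66, 28.12),
]


def abs_diff(real, predicted):
    """Absolute difference between real and predicted score, to 2 decimals."""
    return round(abs(real - predicted), 2)


# Map each student to (difference without scaffolds, difference with scaffolds).
diffs = {
    name: (abs_diff(real, orig), abs_diff(real, both))
    for name, real, orig, both in rows
}

# Students whose prediction got closer once scaffolding responses were used.
improved = [name for name, (d_orig, d_both) in diffs.items() if d_both < d_orig]

for name, (d_orig, d_both) in diffs.items():
    print(f"{name:6s} orig={d_orig:6.2f}  orig+scaffolds={d_both:5.2f}")
```

For all five sample students the prediction that uses scaffolding responses lands closer to the real score.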

Why?
Uses more training data
Deals with the credit-blame issue better
More identifiability per skill
Scaffolding questions provide valuable
information [4, 5, 6, 7]

Answer: Yes!
[6] Walonoski, J., Heffernan, N.T. (2006).
Detection and Analysis of Off-Task Gaming
Behavior in Intelligent Tutoring Systems. In
Ikeda, Ashley, & Chan (Eds.). Proceedings of the
Eighth International Conference on Intelligent
Tutoring Systems. Springer-Verlag, Berlin. pp.
382-391. 2006.
[7] Walonoski, J., Heffernan, N.T.
(2006). Prevention of Off-Task Gaming Behavior in
Intelligent Tutoring Systems. In Ikeda, Ashley,
& Chan (Eds.). Proceedings of the Eighth
International Conference on Intelligent Tutoring
Systems. Springer-Verlag, Berlin. pp. 722-724.
2006.
17
RQ2: Does a Finer-Grained Model Predict Better?

Real MCAS score vs. ASSISTment predicted score (scaffolding responses used)

Student   Real MCAS   Skill models
          score       WPI-1     WPI-5     WPI-78
Mary      29          28.59     27.65     27.05
Tom       28          27.58     26.43     25.35

Sue       25          26.56     24.94     24.10
Dick      22          23.70     22.78     21.31
Harry     33          27.54     26.37     28.12

Absolute difference between real score and ASSISTment predicted score

Student   WPI-1     WPI-5     WPI-78
Mary      0.41      1.35      1.95
Tom       0.42      1.57      2.65

Sue       1.56      0.06      0.90
Dick      1.70      0.78      0.69
Harry     5.46      6.63      4.88
MAD       4.552     4.343     4.121
Error     13.39%    12.77%    12.12%
Is 12.12% any good for assessment
purposes? MCAS-simulation result: 11.12%
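
The reported MAD and Error values for the three skill models can be cross-checked against the definition Err = MAD / full score. The sketch below is illustrative: the full score of 34 is not stated in the slides and is inferred here from the reported ratios, so treat it as an assumption.

```python
# Reported values for the three skill models (scaffolding responses used).
mad = {"WPI-1": 4.552, "WPI-5": 4.343, "WPI-78": 4.121}
err_pct = {"WPI-1": 13.39, "WPI-5": 12.77, "WPI-78": 12.12}

# Full score implied by each (MAD, Err) pair, since Err = MAD / full score.
implied_full = {m: mad[m] / (err_pct[m] / 100) for m in mad}

# Assumption: 34 is inferred from the ratios above, not given in the slides.
FULL_SCORE = 34
recomputed = {m: round(100 * mad[m] / FULL_SCORE, 2) for m in mad}

# Model with the smallest mean absolute difference.
best = min(mad, key=mad.get)

# Gap (in percentage points) between the best model and the
# MCAS-simulation result quoted on the slide.
gap_to_simulation = round(err_pct[best] - 11.12, 2)

print(implied_full, recomputed, best, gap_to_simulation)
```

Under that assumed full score, all three reported error percentages are reproduced to two decimals, and the finest-grained model has the smallest MAD.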
18
Conclusion

Recall RQ1, RQ2.
Positive answer to both RQ1 and RQ2.
RQ3: Item difficulty was introduced as a factor
to improve the predictive models. We ended up
with better internally fitted models but,
surprisingly, no significant improvement in the
prediction of state test scores.

19
Some of the ASSISTMENT TEAM (2004-2005)
This research was made possible by the US Dept
of Education, Institute of Education Sciences,
"Effective Mathematics Education Research"
program grant R305K03140, the Office of Naval
Research grant N00014-03-1-0221, NSF CAREER
award to Neil Heffernan, and the Spencer
Foundation. Authors Razzaq and Mercado were
funded by the National Science Foundation under
Grant No. 0231773. All the opinions in this
article are those of the authors, and not those
of any of the funders.
Leena RAZZAQ, Mingyu FENG, Goss NUZZO-JONES,
Neil T. HEFFERNAN, Kenneth KOEDINGER,
Brian JUNKER, Steven RITTER, Andrea KNIGHT,
Edwin MERCADO, Terrence E. TURNER, Ruta
UPALEKAR, Jason A. WALONOSKI, Michael A.
MACASEK, Christopher ANISZCZYK, Sanket CHOKSEY,
Tom LIVAK, Kai RASMUSSEN
Carnegie Learning
