Using Mixed-Effects Modeling to Compare Different Grain-Sized Skill Models


1
Using Mixed-Effects Modeling to Compare Different
Grain-Sized Skill Models
  • Mingyu Feng, Worcester Polytechnic Institute
  • Neil T. Heffernan, Worcester Polytechnic
    Institute
  • Murali Mani, Worcester Polytechnic Institute
  • Cristina Heffernan, Worcester Public Schools

2
The ASSISTment System
  • An e-assessment and e-learning system that does
    both ASSISTing of students and assessMENT
  • Massachusetts Comprehensive Assessment System
    (MCAS)
  • Web-based system built on Common Tutoring Object
    Platform (CTOP) [1]

We are giving away accounts!
[1] Nuzzo-Jones, G., Macasek, M.A., Walonoski, J.,
Rasmussen, K.P., Heffernan, N.T. Common Tutor
Object Platform, an e-Learning Software
Development Strategy. WPI technical report
WPI-CS-TR-06-08.
3
ASSISTment
Geometry
  • We break multi-step problems into scaffolding
    questions
  • Hint messages: given on demand, they hint at
    which step to do next
  • Buggy message: a context-sensitive feedback
    message
  • Skills
  • The state reports to teachers on 5 areas
  • We seek to report on more and finer grain-sized
    skills
  • Demo (two triangles problem)

[Demo screenshot labels: the original question (a. Congruence,
b. Perimeter, c. Equation-Solving); the 1st scaffolding question
(Congruence); the 2nd scaffolding question (Perimeter); a buggy
message; a hint message]
4
How Were the Skill Models Created?
5
How Were the Skill Models Created?
Multi-mapped model (WPI-5) vs. single-mapped
model (MCAS-5)?
6
Previous Work on Skill Models
  • Fine-grained skill models in reporting
  • Teachers get reports that they think are credible
    and useful. [3]

[3] Feng, M., Heffernan, N.T. (in press).
Informing Teachers Live about Student Learning:
Reporting in the Assistment System. To be
published in Technology, Instruction, Cognition,
and Learning Journal, Vol. 3. Old City Publishing,
Philadelphia, PA. 2006.
7
(No Transcript)
8
(No Transcript)
9
Previous Work on Skill Models
  • Tracking skill performance over time [4][5]

Number Sense
[4] Feng, M., Heffernan, N.T., Koedinger, K.R.
(2006). Addressing the Testing Challenge with a
Web-Based E-Assessment System that Tutors as it
Assesses. Proceedings of the Fifteenth
International World Wide Web Conference, pp.
307-316. ACM Press, New York, NY.
[5] Feng, M., Heffernan, N.T., Koedinger, K.R.
(2006). Predicting state test scores better with
intelligent tutoring systems: developing metrics
to measure assistance required. In Ikeda, Ashley
& Chan (Eds.), Proceedings of the Eighth
International Conference on Intelligent Tutoring
Systems, pp. 31-40. Springer-Verlag, Berlin.
10
  • In this work, we compare different grain-sized
    skill models
  • By comparing the accuracy of their predictions of
    state test scores

11
Research Questions
  • RQ1: Would adding response data from scaffolding
    questions help us do a better job of tracking
    students' knowledge?
  • RQ2: How does the finer-grained skill model
    (WPI-78) do at estimating external test scores
    compared to the skill model with only 5
    categories (WPI-5) and the one with only one
    category (WPI-1)?
  • RQ3: Does introducing item difficulty information
    help to build a better predictive model?

12
Data Source
  • 497 students from two middle schools
  • Students used the ASSISTment system every other
    week from Sep. 2004 to May 2005
  • Real state test score in May 2005
  • Item-level online data: students' binary
    responses (1/0) to items that are tagged in
    different skill models
  • Some statistics:
  • Average usage: 7.3 days; minimum usage: 6 days
  • 138,000 data points (43,000 original data points)
  • Average questions answered: original 87,
    scaffolding 189

Online data of 700 8th grade students available
for researchers! If you want access, talk to Neil
Heffernan and Kenneth Koedinger.
13
How is the Data Organized?
14
Approach
  • Fit a mixed-effects logistic regression model on
    the longitudinal online data
  • using skills as a factor
  • predicting prob(response = 1) on an item tagged
    with a certain skill at a certain time
  • The fitted model gives learning parameters
    (initial knowledge and learning rate) for each
    skill of each individual student
  • Compare skill models by Mean Absolute Difference
    (MAD) and Err (= MAD / full score)
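The per-student, per-skill learning parameters described above can be pictured as a logistic growth curve. The following is a minimal sketch, not the authors' fitted model: the function names, the split into fixed and random effects as simple tuples, the time unit (months), and every numeric value are assumptions made for illustration.

```python
import math


def prob_correct(initial_knowledge: float, learning_rate: float, months: float) -> float:
    """Probability that response = 1 on an item tagged with one skill.

    initial_knowledge is the intercept and learning_rate the slope,
    both on the log-odds scale; months is the time since first use.
    """
    log_odds = initial_knowledge + learning_rate * months
    return 1.0 / (1.0 + math.exp(-log_odds))


def student_parameters(fixed, random):
    """Add a student's random deviations to the skill-level fixed effects."""
    return fixed[0] + random[0], fixed[1] + random[1]


# Hypothetical values, chosen only to make the example run.
fixed_effects = (-0.5, 0.10)   # skill-level intercept and slope
random_effects = (0.3, 0.02)   # one student's deviations from them

intercept, slope = student_parameters(fixed_effects, random_effects)
for month in (0, 4, 8):
    print(month, round(prob_correct(intercept, slope, month), 3))
```

With a positive learning rate the predicted probability rises over time, which is what lets the fitted curve be evaluated at the date of the state test.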

15
Data Preprocessing Strategies
  • Scaffolding credit
  • Scaffolding questions are shown only after a
    wrong answer to the original question
  • We assume correct responses to all scaffolding
    questions if a student answered the original
    question correctly
  • Partial blame
  • Blame only the skill with the worst performance
    overall
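The two preprocessing strategies above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, "worst performance overall" is read here as the lowest percent correct per skill, and the percent-correct values are invented.

```python
from typing import Dict, List, Sequence


def scaffolding_responses(original_correct: bool,
                          observed: Sequence[int],
                          n_scaffolds: int) -> List[int]:
    """Scaffolding credit: responses (1/0) for one item's scaffolding questions.

    If the original question was answered correctly, the scaffolding
    questions were never shown, so all of them are assumed correct.
    """
    if original_correct:
        return [1] * n_scaffolds
    return list(observed)


def skill_to_blame(item_skills: Sequence[str],
                   percent_correct: Dict[str, float]) -> str:
    """Partial blame: pick the item's skill with the lowest percent correct."""
    return min(item_skills, key=lambda skill: percent_correct[skill])


# Hypothetical per-skill performance (assumed values, not from the data set).
percent_correct = {"Congruence": 0.62, "Perimeter": 0.48, "Equation-Solving": 0.55}
item_skills = ["Congruence", "Perimeter", "Equation-Solving"]

print(scaffolding_responses(True, [], 2))
print(scaffolding_responses(False, [1, 0], 2))
print(skill_to_blame(item_skills, percent_correct))
```

Blaming a single skill keeps one wrong answer on a multi-skill item from counting against every skill the item is tagged with.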

16
RQ1: Will Scaffolding Responses Help?
Real MCAS score vs. Assistment predicted score (WPI-78)

Student   Real MCAS score   Orig.    Orig. + Scaffolds
Mary      29                22.93    27.05
Tom       28                19.38    25.35
...
Sue       25                18.58    24.10
Dick      22                16.57    21.31
Harry     33                18.66    28.12

Absolute difference between real score and Assistment predicted score

Student   Orig.     Orig. + Scaffolds
Mary      6.06      1.95
Tom       8.62      2.65
...
Sue       6.42      0.90
Dick      5.43      0.69
Harry     14.34     4.88
MAD       6.03      4.121
Error     17.75%    12.12%
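As a worked illustration of the MAD measure, the sketch below applies it to the five sample students listed on this slide. The function name is an assumption. Because only five rows are shown, the result will not match the slide's MAD values, which presumably cover the full set of students.

```python
def mean_absolute_difference(real_scores, predicted_scores):
    """Mean of |real - predicted| over paired scores."""
    if not real_scores or len(real_scores) != len(predicted_scores):
        raise ValueError("need two non-empty score lists of equal length")
    total = sum(abs(r - p) for r, p in zip(real_scores, predicted_scores))
    return total / len(real_scores)


# The five sample rows shown on the slide (Mary, Tom, Sue, Dick, Harry).
real = [29, 28, 25, 22, 33]
originals_only = [22.93, 19.38, 18.58, 16.57, 18.66]
with_scaffolds = [27.05, 25.35, 24.10, 21.31, 28.12]

print(round(mean_absolute_difference(real, originals_only), 3))
print(round(mean_absolute_difference(real, with_scaffolds), 3))
```

For these five students the predictions that use scaffolding responses sit closer to the real scores, in line with the direction of the overall result.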
  • Why?
  • Using more training data
  • Dealing with the credit-blame issue better
  • More identifiability per skill
  • Scaffolding questions provide valuable
    information [4][5][6][7]

Answer: Yes!
[6] Walonoski, J., Heffernan, N.T. (2006).
Detection and Analysis of Off-Task Gaming
Behavior in Intelligent Tutoring Systems. In
Ikeda, Ashley & Chan (Eds.), Proceedings of the
Eighth International Conference on Intelligent
Tutoring Systems, pp. 382-391. Springer-Verlag,
Berlin.
[7] Walonoski, J., Heffernan, N.T.
(2006). Prevention of Off-Task Gaming Behavior in
Intelligent Tutoring Systems. In Ikeda, Ashley
& Chan (Eds.), Proceedings of the Eighth
International Conference on Intelligent Tutoring
Systems, pp. 722-724. Springer-Verlag, Berlin.
17
RQ2: Does a Finer-Grained Model Predict Better?
Real MCAS score vs. Assistment predicted score (scaffolding responses used)

Student   Real MCAS score   WPI-1    WPI-5    WPI-78
Mary      29                28.59    27.65    27.05
Tom       28                27.58    26.43    25.35
...
Sue       25                26.56    24.94    24.10
Dick      22                23.70    22.78    21.31
Harry     33                27.54    26.37    28.12

Absolute difference between real score and Assistment predicted score

Student   WPI-1     WPI-5     WPI-78
Mary      0.41      1.35      1.95
Tom       0.42      1.57      2.65
...
Sue       1.56      0.06      0.90
Dick      1.70      0.78      0.69
Harry     5.46      6.63      4.88
MAD       4.552     4.343     4.121
Error     13.39%    12.77%    12.12%

Is 12.12% any good for assessment purposes?
MCAS-simulation result: 11.12%
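The Error row is the MAD expressed as a share of the full test score. The sketch below redoes that arithmetic for the three skill models. The full score of 34 points is an assumption: it is not stated in the slides and is used here only because it is consistent with the listed MAD and Error values. The function name is also hypothetical.

```python
FULL_SCORE = 34  # assumed full score; not given in the slides


def percent_error(mad: float, full_score: float) -> float:
    """Err = MAD / full score, expressed as a percentage."""
    return 100.0 * mad / full_score


# MAD values as listed on the slide.
mad_by_model = {"WPI-1": 4.552, "WPI-5": 4.343, "WPI-78": 4.121}

for model, mad in mad_by_model.items():
    print(model, round(percent_error(mad, FULL_SCORE), 2))
```

Because every model is divided by the same full score, ranking the models by Error is the same as ranking them by MAD.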
18
Conclusion
  • Recall RQ1, RQ2.
  • Positive answer to both RQ1 and RQ2.
  • RQ3: Item difficulty was introduced as a factor
    to improve the predictive models. We ended up
    with better internally fitted models but,
    surprisingly, no significant improvement in the
    prediction of state test scores.

19
Some of the ASSISTMENT TEAM (2004-2005)
This research was made possible by the US Dept
of Education, Institute of Education Science,
"Effective Mathematics Education Research"
program grant R305K03140, the Office of Naval
Research grant N00014-03-1-0221, NSF CAREER
award to Neil Heffernan, and the Spencer
Foundation. Authors Razzaq and Mercado were
funded by the National Science Foundation under
Grant No. 0231773. All the opinions in this
article are those of the authors, and not those
of any of the funders.
Leena RAZZAQ, Mingyu FENG, Goss NUZZO-JONES,
Neil T. HEFFERNAN, Kenneth KOEDINGER,
Brian JUNKER, Steven RITTER, Andrea KNIGHT,
Edwin MERCADO, Terrence E. TURNER, Ruta
UPALEKAR, Jason A. WALONOSKI, Michael A.
MACASEK, Christopher ANISZCZYK, Sanket CHOKSEY,
Tom LIVAK, Kai RASMUSSEN
Carnegie Learning