1. Inferring Conceptual Knowledge from Unstructured Student Writing
Workshop on Personalizing Education with Machine Learning, Neural Information Processing Systems (NIPS) Conference, Lake Tahoe, CA, 8 December 2012
Vivienne L. Ming
2. The role of assessment in instruction
- Reveals what students already know and what they need to learn
- Provides feedback to students and teachers on the success of learning and instruction
- Timely and specific feedback can guide continued instruction (formative assessment)
Graphic from http://www.cmu.edu/teaching/assessment/basics/alignment.html
3. Challenges with assessment
- Large-scale assessment
  - Heavy on summative assessment
  - Standardized tests, academic analytics systems
  - Emphasizes performance, not conceptual understanding
  - Delayed, coarse-grained feedback
- Intrusive
  - Interrupts class to administer tests
  - Modifies instruction to adopt others' materials
- Alternatives
  - Teachers may lack training in designing and interpreting other kinds of assessment
  - Difficult to aggregate and calibrate
Printable sign available at http://www.pickens.k12.ga.us/assessment.html
4. Our goals
- Use continuous, passive assessment to elucidate conceptual knowledge
  - Wealth of unstructured data
  - Informal
  - Builds on teachers' existing instruction
- Align with formal assessment, e.g.:
  - Course grades
  - Standardized tests
  - Instructor qualitative assessment
5. Research questions
- Can topic models of unstructured student writing predict course outcomes?
- How does the accuracy of these predictions change over time as more student work is analyzed?
- What does learning the topic hierarchy add beyond conventional topic modeling in improving these predictions?
6. Dataset and Methods
- Online discussion forums
- 5- or 6-week courses
- 2 mandatory discussion questions per week
- Introductory courses at a large, for-profit university

                                        Biology (undergraduate)   Economics (MBA)
Course length (weeks)                   5                          6
Discussion-question threads per class   10                         12
Classes                                 17                         45
Students (after filtering)              230                        970
Posts by students                       9,118                      44,345
7. Analytical approach
- Outcome of interest: student conceptual understanding
- Proxy outcome: student course grade
- Compare possible data features:
  - Baseline
    - Mean course grade
  - Individual student posting characteristics
    - Word count
  - Conventional semantic modeling
    - Probabilistic Latent Semantic Analysis (pLSA; a minimal sketch follows this slide)
  - Feature of interest
    - Hierarchical Latent Dirichlet Allocation (hLDA)
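For concreteness, a minimal EM sketch of pLSA on a document-term count matrix is given below. This is an illustration under our own assumptions, not the implementation used in the study; the function name, inputs, and parameter choices are all hypothetical.

```python
# Minimal pLSA via EM (illustrative sketch, not the study's implementation).
# counts: (n_docs, n_words) term-frequency matrix built from student posts.
import numpy as np

def plsa(counts, n_topics=100, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # P(z|d): per-document topic mixture; P(w|z): per-topic word distribution
    p_z_d = rng.random((n_docs, n_topics)); p_z_d /= p_z_d.sum(1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words)); p_w_z /= p_w_z.sum(1, keepdims=True)
    for _ in range(n_iter):
        denom = p_z_d @ p_w_z + 1e-12          # P(w|d) under the current model
        nz_d = np.zeros_like(p_z_d)
        nw_z = np.zeros_like(p_w_z)
        for z in range(n_topics):              # loop avoids a docs*words*topics tensor
            # E-step: responsibility of topic z for each (d, w), weighted by counts
            resp = counts * (np.outer(p_z_d[:, z], p_w_z[z]) / denom)
            nz_d[:, z] = resp.sum(axis=1)
            nw_z[z] = resp.sum(axis=0)
        # M-step: renormalize the expected counts
        p_z_d = nz_d / (nz_d.sum(1, keepdims=True) + 1e-12)
        p_w_z = nw_z / (nw_z.sum(1, keepdims=True) + 1e-12)
    return p_z_d, p_w_z    # rows of P(z|d) serve as per-document topic features
```

The rows of P(z|d) play the role of the topic coefficients referred to on the next slide.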
8. Algorithms
- Proof of concept
  - Logistic regression on the accumulated topic coefficients from each week (see the sketch below)
  - Other supervised algorithms (e.g., SVMs) would surely perform better
  - Logistic regression chosen to isolate the contribution of hLDA
- Current work uses
  - Hidden Conditional Random Fields (HCRFs)
  - Improved weekly predictions
  - Allows forward prediction in course time
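The proof-of-concept classifier can be sketched with scikit-learn as below. `weekly_topics` (one topic-coefficient matrix per week) and `grades` are hypothetical inputs; the slides do not specify the evaluation protocol, so the 5-fold cross-validation here is our choice.

```python
# Sketch: logistic regression on topic coefficients accumulated week by week.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def weekly_accuracy(weekly_topics, grades):
    """weekly_topics: list of (n_students, n_topics) arrays, one per week.
    grades: (n_students,) array of letter-grade labels."""
    scores = []
    for week in range(len(weekly_topics)):
        # concatenate all topic coefficients observed up to and including this week
        X = np.hstack(weekly_topics[:week + 1])
        clf = LogisticRegression(max_iter=1000)
        scores.append(cross_val_score(clf, X, grades, cv=5).mean())
    return scores  # one accuracy per week; expected to rise as data accumulates
```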
9. Results: Biology course
- Prediction accuracy
  - Word count > mean (for 3 weeks)
  - pLSA >> word count
  - hLDA > pLSA
- With more data collected over time, all predictions improve.
10. Results: Economics course
- Prediction accuracy
  - Word count > mean (for 2 weeks)
  - pLSA > word count
  - hLDA >> pLSA
- With more data collected over time, all predictions improve.
11. Topic modeling can distinguish the topics discussed by students with different final grades
- Each point represents the posts by one student
- Posts projected into a 100-D pLSA concept space
- Local linear embedding (LLE) used to reduce to 2-D (sketched below)
[Figure: 2-D LLE projection of student posts; annotations read "Cs and Ds neglect these topics" and "Increasing final grades"]
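The projection behind these figures can be sketched with scikit-learn's LocallyLinearEmbedding. The plotting choices (colormap, marker size) below are ours, and `grades` is assumed to be numeric (e.g., A=4 ... F=0); the slides do not give these details.

```python
# Sketch: reduce 100-D pLSA topic mixtures to 2-D with LLE, colored by grade.
import matplotlib.pyplot as plt
from sklearn.manifold import LocallyLinearEmbedding

def plot_topic_space(p_z_d, grades, n_neighbors=10):
    lle = LocallyLinearEmbedding(n_neighbors=n_neighbors, n_components=2)
    xy = lle.fit_transform(p_z_d)                    # (n_points, 2) embedding
    sc = plt.scatter(xy[:, 0], xy[:, 1], c=grades, cmap="viridis", s=12)
    plt.colorbar(sc, label="final grade")
    plt.title("Student posts in pLSA concept space (LLE, 2-D)")
    plt.show()
```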
12. Comments by higher grade-earners reveal more structure
- Each point represents one post, color-coded by grade
- Ds and below cluster in the center
- Higher grades move in specific directions toward the periphery
- Directions may correspond to course structure or the instructor's guidance
- Not just depth or specificity, but particular concepts
13. Structure corresponds to course topics
- Same points, color-coded by week
- Different weeks lie on different branches
- Low grades stay in the center even when discussion topics invite more specific comments
14. What does hierarchical modeling add?
- Not all language is equal.
- Conventional topic modeling treats all topics as equal (and independent).
- Hierarchy implies a ranking:
  - Shallower: more frequent, generic language
  - Deeper: more infrequent, technical language
15. Examining hLDA results (Economics)
- Posts from students earning higher grades correlated with a higher mean depth in the hLDA tree (this statistic is sketched below)
  - C grades: most language at the shallowest level
  - A and B grades: more language at deeper levels
- More technically proficient language use
  - General language: more anecdotal comments
  - Specific language: greater conceptual depth
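The depth statistic can be sketched as follows, assuming per-word hLDA node assignments (and hence tree depths) are available; hLDA itself is typically fit with external tools, and `word_depths_by_student` is a hypothetical structure, not the study's data format.

```python
# Sketch: correlate each student's mean hLDA topic depth with final grade.
import numpy as np
from scipy.stats import spearmanr

def depth_vs_grade(word_depths_by_student, grades):
    """word_depths_by_student: {student_id: [tree depth of each word's topic node]}
    grades: {student_id: numeric final grade}"""
    ids = sorted(word_depths_by_student)
    mean_depth = np.array([np.mean(word_depths_by_student[s]) for s in ids])
    grade = np.array([grades[s] for s in ids])
    rho, p = spearmanr(mean_depth, grade)
    return rho, p   # positive rho would indicate deeper language with higher grades
```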
16. Summary of results
- Can topic models of unstructured student writing predict course outcomes?
  - YES: pLSA and hLDA both predict better than chance (and better than post length).
- How does the accuracy of these predictions change over time as more student work is analyzed?
  - Extra weeks of data improve predictions.
  - By the end of the course, pLSA predictions are within one letter grade.
- What does learning the topic hierarchy add beyond conventional topic modeling in improving these predictions?
  - hLDA > pLSA
  - Higher grades are associated with discussion of deeper topics in hLDA.
17. Conclusions and future work
- There is some collection of topics associated with higher grades (and some other collection of topics associated with lower grades).
- The deeper topics associated with low versus high grades could potentially differ; that analysis is yet to be done.
  - e.g., deep misconceptions such as inheriting acquired traits (Lamarckian evolution)
- Next steps
  - Create a topic map
    - Hierarchical relationships
    - Normative sources (e.g., textbook, exemplary student work)
    - Labeled, non-normative sources (common misconceptions)
18. Implications
- Extensions to other text data
  - Essays, short-answer test questions
  - Online tutoring
  - Informal learning environments (e.g., Quora, Evernote)
  - Annotations on e-texts
  - Wiki contributions
- Language mediates learning, and text is everywhere. Learn from it; improve it.