Title: Student Retention in Online Classrooms
How we developed a machine learning model to
predict usage persistence of digital classrooms.
https://www.playpowerlabs.com/
Online learning is everywhere these days, but
keeping teachers and students engaged on digital
platforms remains a challenge. Better user
experience design can lead to more engagement and
retention. When edtech platforms work to create
better experiences for their users, they can use
machine learning to identify potential issues
with platform engagement.
Historical usage data from edtech platforms can
help us identify patterns of usage that lead to
disengagement and dropout. We built a classroom
dropout prediction model for a large-scale edtech
platform that successfully identified online
classrooms at risk of discontinuing their use of
the online learning platform. Identifying such
classrooms was very important for the school
districts that the edtech platform was serving.
These classrooms needed additional help with
using the software and implementing the online
curriculum as planned.
We started by looking at historical data that
included examples of online classrooms stopping
their usage. We used K-Means clustering to group
the usage patterns of the classrooms and found
that some classrooms persisted throughout the
entire school year, while in others usage waned
over time and stopped before the school year
ended. We used the clustering results to generate
the training labels for our binary classification
model, which predicted whether a given classroom
would continue its online learning activities or
not.
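
A minimal sketch of how clustering can produce training labels, written in Python for illustration. The usage numbers, the choice of two clusters, and the rule for deciding which cluster counts as "dropout" (the one whose centroid has the lowest usage in the final period) are all assumptions made for this example, not details from the project.

```python
import random

def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm on equal-length numeric vectors."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_assignments == assignments:
            break  # Nothing moved, so the clustering has converged.
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids

# Made-up usage counts per classroom over six consecutive periods.
usage = {
    "class_a": [20, 22, 19, 21, 20, 18],
    "class_b": [18, 15, 9, 4, 0, 0],
    "class_c": [25, 24, 26, 23, 25, 24],
    "class_d": [22, 14, 6, 1, 0, 0],
}

names = list(usage)
assignments, centroids = kmeans([usage[n] for n in names], k=2)

# Assumed rule: the cluster with the lowest final-period usage is "dropout".
dropout_cluster = min(range(len(centroids)), key=lambda c: centroids[c][-1])

# Binary training labels: 1 = stopped using the platform, 0 = persisted.
labels = {n: int(a == dropout_cluster) for n, a in zip(names, assignments)}
print(labels)
```

The resulting labels can then be used as the target for a separate binary classifier trained on features available earlier in the school year.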
When modeling educational data, multi-level
models are often helpful because they capture the
natural hierarchy of the data. You typically have
data at the student/teacher level nested in
classrooms, which are in turn nested in schools,
which are part of districts.
Some schools/districts are more likely to drop
out than others because of their collective
attitudes towards certain platforms/technologies.
To capture these group-level differences, we
decided to use a multi-level model.
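
One common way to write such a model is a logistic regression with random intercepts for schools and districts. The form below is an illustrative assumption; the exact specification used in the project is not stated here. The indices (classroom i in school j in district k), the feature vector x, and the variance terms are notation introduced for this sketch.

```latex
\begin{aligned}
y_{ijk} \mid p_{ijk} &\sim \mathrm{Bernoulli}(p_{ijk}) \\
\operatorname{logit}(p_{ijk}) &= \beta_0 + \mathbf{x}_{ijk}^{\top}\boldsymbol{\beta} + u_{k} + v_{jk} \\
u_{k} &\sim \mathcal{N}(0, \sigma_u^2), \qquad v_{jk} \sim \mathcal{N}(0, \sigma_v^2)
\end{aligned}
```

Here y is 1 when the classroom drops out, u is a district-level intercept, and v is a school-level intercept. A district or school with a large positive intercept has a higher baseline dropout risk across all of its classrooms, which is how the model can represent collective attitudes towards a platform.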
Once our modeling exercise was done, we packaged
the model as an R package. This allowed us to put
the model into production easily. The R package
contained functions to calculate features from
the raw data, so downstream applications only
needed to fetch the raw data and could then get
predictions with two function calls.
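
The two-call pattern can be sketched as follows. This is a Python illustration, not the R package itself: the function names `compute_features` and `predict_dropout`, the two features, and the coefficient values are all hypothetical placeholders, and a simple logistic score stands in for the fitted model.

```python
import math
from collections import defaultdict

# Placeholder coefficients for illustration only. A real package would ship
# the fitted model object rather than hand-written numbers.
PLACEHOLDER_COEFFICIENTS = {"intercept": 1.0, "recent_share": -6.0, "idle_weeks": 0.8}

def compute_features(raw_rows, current_week, recent_window=2):
    """Call 1: turn raw (classroom_id, week, sessions) rows into features."""
    weekly_by_class = defaultdict(dict)
    for classroom_id, week, sessions in raw_rows:
        weekly = weekly_by_class[classroom_id]
        weekly[week] = weekly.get(week, 0) + sessions

    features = {}
    for classroom_id, weekly in weekly_by_class.items():
        total = sum(weekly.values())
        # Sessions that fall inside the most recent `recent_window` weeks.
        recent = sum(s for w, s in weekly.items() if w > current_week - recent_window)
        active_weeks = [w for w, s in weekly.items() if s > 0]
        last_active = max(active_weeks) if active_weeks else 0
        features[classroom_id] = {
            "recent_share": recent / total if total else 0.0,
            "idle_weeks": current_week - last_active,
        }
    return features

def predict_dropout(features, coefficients=PLACEHOLDER_COEFFICIENTS):
    """Call 2: map each feature row to a dropout probability (logistic link)."""
    predictions = {}
    for classroom_id, f in features.items():
        z = (
            coefficients["intercept"]
            + coefficients["recent_share"] * f["recent_share"]
            + coefficients["idle_weeks"] * f["idle_weeks"]
        )
        predictions[classroom_id] = 1.0 / (1.0 + math.exp(-z))
    return predictions

# The downstream application only fetches raw rows, then makes two calls.
raw = [
    ("class_a", 1, 20), ("class_a", 2, 22), ("class_a", 3, 19), ("class_a", 4, 21),
    ("class_b", 1, 18), ("class_b", 2, 9), ("class_b", 3, 0), ("class_b", 4, 0),
]
features = compute_features(raw, current_week=4)
predictions = predict_dropout(features)
print(predictions)
```

Keeping feature calculation inside the package means the calling application never has to reimplement feature logic, so training and production use the same definitions.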
At the end of the project, our machine learning
model was piloted through an email campaign in
which district leaders received information about
potential classroom dropout. They received a list
of classrooms that were likely to stop online
usage and needed help. Our data helped districts
take action and improve their digital learning
program implementation.