A Predictive Model of Inquiry to Enrollment

1
A Predictive Model of Inquiry to Enrollment
  • Cullen F. Goenner, PhD
  • Department of Economics
  • University of North Dakota
  • cullen.goenner_at_und.nodak.edu
  • www.business.und.edu/goenner
  • Kenton Pauls
  • Director of Enrollment Services
  • University of North Dakota
  • kenton.pauls_at_mail.und.nodak.edu

2
Issues Facing Enrollment Managers
  • Finding new markets
  • Increasing Tuition
  • Declining population (ND)
  • Increasing competition
  • Need to attract a particular type of student
  • Diversity/Quality
  • Data driven analysis
  • Accountability

3
Questions we will answer today
  • What is predictive modeling?
  • How does one build a predictive model?
  • How can predictive modeling be used by
    institutions of higher education to improve
    enrollment?

4
What is Predictive Modeling?
  • Predictive modeling uses statistical/econometric
    methods to quantitatively predict the future
    behavior of individuals.
  • Steps include
  • Data collection on the subject of interest
  • Build the model based on data analysis
  • Predictions made out of sample
  • Model validation/testing

5
College Choice
  • 3-stage process - Hossler and Gallagher (1987)
  • Predisposition/aspiration for higher education
  • Encouragement, coursework, and interest.
  • Search of potential schools
  • Counselors, campus contacts, program
    availability
  • Selection
  • SES, Ability, Fit, Geography

6
Factors Influencing Choice
  • Economic perspective
  • Education an investment in human capital
  • Cost vs Benefit calculus
  • Psychological perspective
  • Need of self to find sense of belonging and
    fulfillment of needs.
  • Sociological perspective
  • Social interaction dictated by societal/family
    norms.

7
Existing Empirical Work
  • Search Choice
  • Applications:
  • DesJardins, Dundar, and Hendel (1999)
  • Weiler (1994)
  • Interest: SAT scores sent
  • Toutkoushian (2001)

8
Existing Models of Enrollment Choice
  • Model a student's binary choice to enroll at a
    particular college while controlling for the
    student's characteristics.
  • Logistic models used
  • Conditional on students having:
  • Applied
  • Bruggink and Gambhir (1996)
  • Thomas, Dawes, and Reznik (2001)
  • Admitted
  • DesJardins (2002)
  • Leppel (1993)

9
Our Predictive Model
  • Builds on the models of DesJardins (2002) and
    Thomas, Dawes, and Reznik (2001)
  • Focus here is on prediction of enrollment of
    students that inquired of our institution.
  • Inquiry model is relevant because
  • Time of information exchange, opinion formation
  • Allows for early intervention in a student's
    decision-making process (Target Marketing)

10
Inquiry Model Challenges
  • Data collection
  • Data already collected on those who are admitted
    or apply. Typically not collected for inquiries.
  • Quality of data
  • Applicants provide detailed data describing
    themselves (demographic data, test scores, HSGPA,
    etc.), which are not available for most student
    inquiries.

11
Types of Inquiries We Recorded
  • Return of information card
  • Attendance at a college fair
  • Campus visit
  • Contact via e-mail
  • Contact via phone
  • Referral from faculty, coach, or alumni
  • ACT automatically submitted

12
How these data were captured
  • Enrollment Services Prospective Student Network
    relational database (ESPSN)
  • Customized system
  • SQL 2000/Visual Basic

13
Information Collected From Information Request Card
  • Name
  • High School attended
  • Interested Major (if any)
  • Address
  • Lacks the demographic data typical of
    application records and used in most predictive
    models.

14
Geodemography
  • Process of attaching demographic characteristics
    to geographic characteristics.
  • Notion is that "birds of a feather flock
    together," i.e., individuals living in the same
    neighborhood will tend to have similar behavior
    patterns.
  • Ex.: Neighborhoods are homogeneous in terms of
    household income, occupations, family size, and
    purchases.

15
Implementation
  • US Census data aggregated to zip code level
  • Geodemographic variables considered for our
    model specification
  • College age demographic
  • Population
  • Average Income
  • White demographic
  • Median age
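
A minimal sketch of the attachment step, assuming inquiry records carry a zip code and that census figures are already aggregated by zip code. The zip codes, field names, and values below are hypothetical placeholders, not data from this study.

```python
# Hypothetical zip-level census table; zip codes and values are placeholders.
census_by_zip = {
    "11111": {"population": 12000, "avg_income": 48000, "median_age": 34.5},
    "22222": {"population": 3500, "avg_income": 61000, "median_age": 41.2},
}


def attach_geodemographics(inquiry, census):
    """Return a copy of the inquiry record with zip-level census fields added."""
    enriched = dict(inquiry)
    # Inquiries from zip codes missing in the census table get no extra fields.
    enriched.update(census.get(inquiry["zip"], {}))
    return enriched


inquiry = {"name": "Student A", "high_school": "Example HS", "zip": "11111"}
record = attach_geodemographics(inquiry, census_by_zip)
print(record["avg_income"])
```

The lookup leaves the original inquiry record unchanged, so the same card data can be re-scored if the census table is updated.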

16
Building the model
  • Binary choice model: model whether students who
    inquire of UND enroll or do not enroll.
  • 15,827 students made inquiries for Fall 2003
    enrollment. Of these students, 2,067 actually
    enrolled.
  • Logistic regression model used.
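
A minimal sketch of how a logistic model turns inquiry characteristics into an enrollment probability. The variable names and coefficient values are hypothetical, chosen only for illustration; they are not the estimates from this study.

```python
import math


def enroll_probability(intercept, coefs, features):
    """Logistic model: P(enroll) = 1 / (1 + exp(-(b0 + sum_j b_j * x_j)))."""
    linear = intercept + sum(coefs[name] * features[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients (not fitted values from the presentation).
coefs = {"campus_visit": 1.2, "num_contacts": 0.4, "distance_100mi": -0.3}
inquiry = {"campus_visit": 1, "num_contacts": 2, "distance_100mi": 1.5}

p = enroll_probability(-2.5, coefs, inquiry)
print(f"Predicted enrollment probability: {p:.3f}")
```

The output is always strictly between 0 and 1, which is why the binary outcome has to be compared against a probability cutoff later on.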

17
Candidate Control Variables
  • Type and Frequency of Contact
  • Geographic
  • Academic
  • Geodemographic
  • Interaction Effects

18
Contact Variables
19
Geographic Variables
20
Academic/Geodemographic
21
Interaction Terms
22
Model Specification
  • Researchers typically assume their model
    specification is the true model which generates
    the data.
  • Difficult to justify a priori the choice of
    variables to include in model, given each by
    design is theoretically relevant.
  • With k candidate variables there are $2^k$ different
    linear models one could consider.
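
The count of candidate models can be checked directly: each variable is either in or out of a model, so k variables give 2 to the power k subsets. The sketch below enumerates the subsets for three hypothetical variable names and computes the count for 30 regressors.

```python
from itertools import combinations


def count_models(k):
    """Number of distinct variable subsets for k candidate variables."""
    return 2 ** k


def enumerate_models(variables):
    """Yield every subset of the candidate variables, including the empty model."""
    for size in range(len(variables) + 1):
        for subset in combinations(variables, size):
            yield subset


# Hypothetical variable names, for illustration only.
candidates = ["distance", "campus_visit", "avg_income"]
models = list(enumerate_models(candidates))
print(len(models))        # 8
print(count_models(30))   # 1073741824
```

Full enumeration is only practical for small k, which is why a pruning rule is needed once the candidate list grows.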

23
  • Consider the case in which several models $M_1,
    \dots, M_K$ are theoretically possible.
  • Basing inference on the results of a single model
    is risky.
  • Bayesian model averaging (BMA) allows us to
    account for this type of uncertainty.

24
BMA
  • The posterior distribution of the parameters
    given the data in the presence of uncertainty is
    the weighted average of the posterior
    distributions under each of the K models, with
    weights equal to the posterior model
    probabilities $P(M_k \mid D)$.
  • (1) $P(\beta \mid D) = \sum_{k=1}^{K} P(\beta \mid D, M_k)\, P(M_k \mid D)$

25
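  • Posterior Model Probability is
  • (2) $P(M_k \mid D) = \dfrac{P(D \mid M_k)\, P(M_k)}{\sum_{l=1}^{K} P(D \mid M_l)\, P(M_l)}$
  • where $P(D \mid M_k)$ is the likelihood and $P(M_k)$ is
    the prior probability that model $M_k$ is the true
    model, given one of the K models is the true
    model.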

26
Posterior Model Probability
  • Assuming a non-informative prior,
    $P(M_1) = \dots = P(M_K) = 1/K$
  • (3) $P(M_k \mid D) = \dfrac{P(D \mid M_k)}{\sum_{l=1}^{K} P(D \mid M_l)}$
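
A minimal sketch of computing posterior model probabilities under equal prior weights. It assumes the common BIC approximation, in which the likelihood of the data under a model is taken as proportional to exp(-BIC/2); that approximation and the BIC values below are assumptions for illustration, not figures from this study.

```python
import math


def posterior_model_probs(bics):
    """Equal-prior posterior model probabilities from BIC values.

    Assumes P(D | M_k) is proportional to exp(-BIC_k / 2). Subtracting the
    smallest BIC first avoids underflow and does not change the ratios.
    """
    best = min(bics)
    weights = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical BIC values for three candidate models.
bics = [1000.0, 1002.0, 1010.0]
pmp = posterior_model_probs(bics)
for b, p in zip(bics, pmp):
    print(f"BIC={b:.0f}  PMP={p:.3f}")
```

With equal priors the prior terms cancel, so only the relative fit of the models determines the weights.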

27
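  • The posterior mean and variance summarize the
    effects of the parameters on the dependent
    variable. Raftery (1995) reports
  • (9) $E[\beta \mid D] \approx \sum_{k} \hat{\beta}(k)\, P(M_k \mid D)$,
    $\mathrm{Var}[\beta \mid D] \approx \sum_{k} \left[\mathrm{Var}(k) + \hat{\beta}(k)^2\right] P(M_k \mid D) - E[\beta \mid D]^2$
  • where $\hat{\beta}(k)$ and $\mathrm{Var}(k)$ are the MLE and
    its variance under model k, and the summation is
    over models that include $\beta$.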
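
A minimal sketch of the model-averaged mean and variance for one coefficient, assuming the usual form in which each model's estimate is weighted by its posterior model probability and the variance combines within-model and between-model spread. The estimates, variances, and probabilities are hypothetical.

```python
def bma_mean_var(estimates, variances, pmps):
    """Model-averaged posterior mean and variance of a single coefficient.

    Inputs cover only the models that include the coefficient; models that
    exclude it contribute zero to both sums.
    """
    mean = sum(b * p for b, p in zip(estimates, pmps))
    second_moment = sum((v + b * b) * p
                        for b, v, p in zip(estimates, variances, pmps))
    return mean, second_moment - mean ** 2


# Hypothetical values: two models include the variable, with PMPs 0.6 and 0.3.
# The remaining 0.1 of posterior probability sits on models that exclude it.
mean, var = bma_mean_var([0.5, 0.7], [0.01, 0.02], [0.6, 0.3])
print(f"posterior mean={mean:.4f}, posterior variance={var:.4f}")
```

The averaged variance is larger than either within-model variance here because it also reflects disagreement between the models.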

28
BMA Implementation
  • S-Plus function bic.logit performs BMA on
    logistic regression models.
  • 30 regressors imply a summation in equation 1
    over more than 1 billion ($2^{30}$) models.
  • To manage the summation we use Occam's window.

29
Occam's Window
  • Exclude models that predict the data
    sufficiently less well than the best model, with
    predictions judged by the PMP of each model.
    Models in the set $A$ are included:
    $A = \{M_k : \max_l P(M_l \mid D) / P(M_k \mid D) \le C\}$
    for a chosen cutoff $C$.
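
A minimal sketch of the Occam's window filter: keep a model only if the best model's posterior probability is no more than a fixed multiple of its own, then renormalize. The cutoff of 20 and the model probabilities below are assumptions for illustration; the presentation does not state the cutoff used.

```python
def occams_window(pmps, cutoff=20.0):
    """Keep models whose PMP is within a factor `cutoff` of the best model.

    `pmps` maps model names to posterior model probabilities. The kept
    probabilities are renormalized to sum to one.
    """
    best = max(pmps.values())
    kept = {m: p for m, p in pmps.items() if p > 0 and best / p <= cutoff}
    total = sum(kept.values())
    return {m: p / total for m, p in kept.items()}


# Hypothetical posterior model probabilities.
pmps = {"M1": 0.50, "M2": 0.30, "M3": 0.19, "M4": 0.01}
window = occams_window(pmps)
print(sorted(window))
```

Here M4 is dropped because the best model is 50 times more probable, which exceeds the assumed cutoff.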

30
Results
  • 26 Models supported by the data
  • Model with highest PMP receives 21% of the total
    posterior probability.
  • Variables that receive strong support for
    inclusion include
  • Geographic: Distance, HY State, HY School,
    Competitor distance
  • Geodemographic: College Age, Average Income
  • Contacts: Number, Campus visit, Referral

31
(No Transcript)
32
Out of Sample Predictive Performance
  • Split the data into two equal parts
  • First part used to build/estimate the model
  • Second part used to test the model's predictions.
  • Outcome (enrollment) is binary, while our model
    generates a probability estimate.
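
A minimal sketch of splitting the inquiry records into a build half and a test half. A seeded random shuffle is assumed here; the presentation does not say how the two parts were chosen, and the integer record IDs stand in for real inquiry records.

```python
import random


def split_half(records, seed=0):
    """Shuffle a copy of the records and split it into two halves."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


# Stand-in record IDs, one per Fall 2003 inquiry.
records = list(range(15827))
build_set, test_set = split_half(records)
print(len(build_set), len(test_set))
```

Because 15,827 is odd, the two halves differ in size by one record.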

33
What is a successful prediction?
  • Greene (2001): there is no single correct choice
    for the probability cutoff. A typical value is 0.5.
  • Tradeoff in cutoff choice
  • A lower cutoff raises the share of actual
    enrollees who are predicted to enroll
    (sensitivity), at the expense of more inquiries
    that are predicted to enroll but do not (a higher
    false positive rate).
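
A minimal sketch of the cutoff tradeoff on a toy data set. The scores and outcomes are hypothetical; the point is only that lowering the cutoff raises sensitivity and the false positive rate together.

```python
def rates_at_cutoff(scores, enrolled, cutoff):
    """Return (sensitivity, false positive rate) when score >= cutoff predicts enroll."""
    tp = sum(1 for s, y in zip(scores, enrolled) if s >= cutoff and y == 1)
    fp = sum(1 for s, y in zip(scores, enrolled) if s >= cutoff and y == 0)
    positives = sum(enrolled)
    negatives = len(enrolled) - positives
    return tp / positives, fp / negatives


# Hypothetical predicted probabilities and actual outcomes (1 = enrolled).
scores = [0.9, 0.6, 0.4, 0.3, 0.2, 0.1, 0.05, 0.02]
enrolled = [1, 1, 0, 1, 0, 0, 0, 0]

for cutoff in (0.5, 0.25):
    sens, fpr = rates_at_cutoff(scores, enrolled, cutoff)
    print(f"cutoff={cutoff}: sensitivity={sens:.2f}, false positive rate={fpr:.2f}")
```

On this toy data the 0.5 cutoff misses one enrollee but flags no non-enrollees, while the 0.25 cutoff catches every enrollee and flags one non-enrollee.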

34
Predictive Performance Classification
35
Predictive performance
  • 89% of observations correctly classified
  • Specificity: 97%
  • Sensitivity: 36%
  • ROC curve describes the relation between
    sensitivity and 1 - specificity (false positive
    rate)
  • Area under ROC curve: 0.87
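
The reported figures can be cross-checked with a little arithmetic: overall accuracy is the enrollment rate times sensitivity plus the non-enrollment rate times specificity. The sketch below assumes the test half has the same enrollment rate as the full Fall 2003 sample (2,067 of 15,827) and uses the rounded figures from the slide. The pairwise area-under-curve function is run on hypothetical toy data only.

```python
def overall_accuracy(prevalence, sensitivity, specificity):
    """Share of all cases classified correctly."""
    return prevalence * sensitivity + (1.0 - prevalence) * specificity


def auc(scores, labels):
    """Area under the ROC curve as the chance that a randomly chosen enrollee
    outscores a randomly chosen non-enrollee (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


enroll_rate = 2067 / 15827          # about 13% of inquiries enrolled
accuracy = overall_accuracy(enroll_rate, 0.36, 0.97)
print(round(accuracy, 2))           # 0.89

# Hypothetical toy scores and outcomes, unrelated to the study data.
toy_auc = auc([0.9, 0.6, 0.4, 0.3, 0.2, 0.1, 0.05, 0.02],
              [1, 1, 0, 1, 0, 0, 0, 0])
print(round(toy_auc, 3))
```

High accuracy alongside modest sensitivity is expected here, since most inquiries do not enroll and are correctly classified as such.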

36
Another Predictive Performance Method
37
  • 79% of enrolled students found within 22% of the
    entire population (scores > 0.2)
  • Focused efforts without compromising enrollment
    numbers
  • Efficiency implications
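
A minimal sketch of this kind of capture calculation: for a score threshold, what share of all inquiries is flagged, and what share of eventual enrollees falls inside the flagged group. The scores and outcomes are hypothetical toy data, not the study's.

```python
def capture_at_threshold(scores, enrolled, threshold):
    """Return (share of population flagged, share of enrollees captured)."""
    flagged = [y for s, y in zip(scores, enrolled) if s > threshold]
    share_of_population = len(flagged) / len(scores)
    share_of_enrolled = sum(flagged) / sum(enrolled)
    return share_of_population, share_of_enrolled


# Hypothetical model scores and outcomes (1 = enrolled) for ten inquiries.
scores = [0.8, 0.5, 0.3, 0.25, 0.15, 0.1, 0.1, 0.05, 0.05, 0.02]
enrolled = [1, 1, 0, 1, 1, 0, 0, 0, 0, 0]

pop_share, enrolled_share = capture_at_threshold(scores, enrolled, 0.2)
print(f"{enrolled_share:.0%} of enrollees found within {pop_share:.0%} of inquiries")
```

The gap between the two shares is the efficiency gain: outreach limited to the flagged group reaches most enrollees at a fraction of the contact volume.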

38
Practical Applications
  • Effective regional market segmentation
  • Targeted tele-counseling efforts
  • Special projects

39
Regional Market Segmenting
  • Target Marketing and Segmentation
  • Prospect names purchased based on zip code.
  • Establish a predictive score for all zip codes
    in US based on census-level data

40
What the data indicated (WA)
41
Where enrolled students came from (WA)
42
  • 83% of enrolled WA students fell within top
    scoring zips over three years
  • Direct Mail Names Purchases
  • Prior years very open search criteria
  • MN, CO, SD, MT
  • This year, much more restrictive to get deeper
    into broader markets
  • Only key zips
  • CO, WA, OR, AZ, IL, MN, etc.

43
WA Search Names - 2003
44
WA Search Names - 2004
45
Targeted Tele-Counseling Efforts
  • Student calling program
  • Top 20% of all model scores identified
  • Fluid number excluding applicants
  • Prompt student to take action

46
Special Projects
  • Limited funds but targeted initiatives
  • Focus on as many of the top-scoring students as possible
  • Postcards, brochures, etc.

47
Possible Future Research
  • Cluster analysis for better market segmentation
  • Study of marginal effects

48
Thank You!
  • Questions?