International symposium on "Methodological tools for accountability systems in education", Ispra, 6-9 Feb 2006

1
International symposium on "Methodological tools
for accountability systems in education", Ispra,
6-9 Feb 2006
  • Multilevel models for the analysis of the
    transition from school to work
  • Leonardo Grilli
  • grilli_at_ds.unifi.it
  • www.ds.unifi.it/grilli
  • Dep. of Statistics "G. Parenti", University of
    Florence

2
Outline
  • Effectiveness and the role of multilevel models
  • Discrete-time survival models
  • Application: multilevel analysis of the time to
    obtain the first job for Italian graduates

3
Preamble
  • In this talk I consider the analysis of the
    transition from school (or university) to work
  • In particular, I focus on the time needed to get
    the first job
  • For the application I refer to the paper:
  • Biggeri L., Bini M. and Grilli L. (2001) The
    transition from university to work: a multilevel
    approach to the analysis of the time to obtain
    the first job. J.R. Statist. Soc. A, 164, 293-305.

4
Preamble
  • Aims of the analysis:
  • Assessing the influence of graduates'
    characteristics, such as gender, age,
    socio-economic status and curriculum studiorum,
    on the time to get the first job
  • Assessing the differences among educational
    institutions w.r.t. the time their graduates
    need to get the first job (external relative
    effectiveness)
  • Methodological tools:
  • Multilevel statistical models for discrete-time
    survival data

5
Effectiveness and the role of multilevel models
6
Framework
  • Assessment of the effectiveness of educational
    institutions (schools or universities)
  • Remark: the outcome of an educational institution
    cannot be defined in absolute terms, but only
    with respect to the effects on the students
  • Problem: the effects on the students are affected
    by the features of the students themselves (if
    two institutions of similar quality have students
    with rather different motivation and ability, the
    outcome of the two institutions is likely to be
    quite different)

How to make a fair assessment?
7
Effectiveness A and B
  • A potential student (or her family) and the
    government are interested in different types of
    effectiveness
  • A - Potential student: interested in comparing
    the results she can obtain by enrolling in
    different institutions, irrespective of the way
    such results are produced
  • B - Government: interested in assessing the
    production process in order to evaluate the
    ability of the institutions to exploit the
    available resources

The two types of effectiveness are called A and B
after Raudenbush & Willms (1995)
8
Effectiveness A and B
  • Type A effectiveness: performance of the
    institutions adjusted for the features of the
    students
  • Type B effectiveness: performance of the
    institutions adjusted for the features of:
  • the students
  • the institution itself and the context in which
    it operates (e.g. resources, procedures, local
    labour market)

In practice the adjustment required for the
assessment of Type B effectiveness is not easy
(many variables whose measurement is problematic)
9
Statistical issues
  • The statistical models for assessing the relative
    effectiveness of educational institutions must
    face two main problems
  • Adjustment: the measures must be adjusted at
    least for the features of the students -> this is
    necessary for a fair (ceteris paribus) comparison
  • Quantification of uncertainty -> this is
    necessary in order to formulate judgements
    strongly supported by the empirical evidence
    (avoiding judgements that may be driven by
    sampling variability or other sources of
    error)

The raw rankings (so-called League Tables)
ignore both issues (Goldstein & Spiegelhalter,
1996)
10
Statistical issues
  • Adjustment
  • Quantification of uncertainty

Regression models
But standard regression models are not adequate. In
particular, they make unsuitable assumptions about
the variance-covariance structure (in fact, the
results of the students of the same institution
are likely to be correlated) -> poor
quantification of uncertainty
11
Multilevel models
  • Multilevel models are well suited for assessing
    the relative effectiveness of institutions; in
    particular they allow
  • good representation of the variance-covariance
    structure -> good quantification of uncertainty
  • explicit representation of the concept of
    effectiveness by means of the random effects (on
    the intercept) u_j

[Diagram: two-level hierarchy. Institutions 1, ..., J
at the upper level; students 1, ..., n_j nested
within institution j]

12
Multilevel models
[Model equation with labelled terms: outcome of the
student; features of the student; features of the
institution and its context; effectiveness of the
institution]
i = student, j = institution
The type of effectiveness depends on which
covariates are in the model
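
A generic two-level random-intercept model matching the labels on this slide can be sketched as follows. Apart from $u_j$ (named on the previous slide), the symbols $\alpha$, $\boldsymbol{\beta}$, $\boldsymbol{\gamma}$, $\mathbf{x}_{ij}$, $\mathbf{z}_j$ and $e_{ij}$ are assumptions introduced here for illustration.

```latex
y_{ij} = \alpha
       + \mathbf{x}_{ij}'\boldsymbol{\beta}   % features of the student
       + \mathbf{z}_{j}'\boldsymbol{\gamma}   % features of the institution and its context
       + u_j                                   % effectiveness of the institution
       + e_{ij},
\qquad i = \text{student}, \; j = \text{institution}
```

In this sketch, leaving $\mathbf{z}_j$ out of the model makes $u_j$ a measure of Type A effectiveness; including it moves $u_j$ towards Type B.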
13
Discrete-time survival models
14
Nature of the outcome
  • Outcome variable: time elapsed from graduation to
    the beginning of the first job
  • Such a variable (measured through a survey) has
    two important features
  • Time is discrete, as it is recorded in months or
    larger time units -> many graduates have the same
    recorded time (so-called ties)
  • Time is censored for those who did not get the
    first job by the end of the observation period

15
Right censoring
$t_i$ = time recorded for graduate i
$d_i$ = event indicator for graduate i
$d_i = 1$ if $t_i$ is uncensored
$d_i = 0$ if $t_i$ is censored
Example:
($t_i = 42$, $d_i = 1$): graduate i got 1st job
after 42 months
($t_i = 42$, $d_i = 0$): graduate i did not get
1st job within 42 months
Censored times cannot be deleted, nor treated as
uncensored times. The standard estimation methods
for survival models use the censored times under
the assumption of non-informative censoring
16
Example: censored discrete times
17
Discrete-time survival models
  • The outcome is a time that is both discrete and
    censored

-> the appropriate statistical model is
a DISCRETE-TIME SURVIVAL MODEL
Judith D. Singer & John B. Willett (2003) Applied
Longitudinal Data Analysis: Modeling Change and
Event Occurrence. New York: Oxford University
Press.
18
Discrete-time survival models
Time is represented by a discrete random variable
T assuming values in the set $\{1, 2, \ldots, t_{\max}\}$
$h(t) = P(T = t \mid T \ge t)$: hazard (risk)
function = probability of getting the 1st job in
month t given that no job was obtained before
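
As an illustration of the hazard definition, the sketch below estimates h(t) from censored discrete times as (events in t) / (graduates still at risk in t). The toy sample and the function name `empirical_hazard` are hypothetical; a graduate censored at t is assumed to be at risk through period t.

```python
from collections import Counter

def empirical_hazard(times, events):
    """Estimate h(t) = P(T = t | T >= t) for discrete, right-censored times.

    times[i]  : recorded time t_i (1, 2, ...)
    events[i] : d_i, 1 if the first job was obtained at t_i, 0 if censored
    """
    n_events = Counter(t for t, d in zip(times, events) if d == 1)
    hazard = {}
    for t in range(1, max(times) + 1):
        # graduates who had not yet found a job at the start of period t
        at_risk = sum(1 for ti in times if ti >= t)
        hazard[t] = n_events[t] / at_risk
    return hazard

# Hypothetical toy sample of six graduates (not the survey data)
times = [1, 2, 2, 3, 3, 3]
events = [1, 1, 0, 1, 1, 0]
hazard = empirical_hazard(times, events)
print(hazard)
```

Censored graduates stay in the denominator up to their recorded time, which is how they contribute information without being treated as events.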
19
Model likelihood
The form of the likelihood for graduate i is
different if the time is uncensored or censored
  • Uncensored ($d_i = 1$)
  • Censored ($d_i = 0$)

Example ($t_i = 3$, $d_i = 1$)
Time   Likelihood factor
1      $1 - h_i(1)$
2      $1 - h_i(2)$
3      $h_i(3)$
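
The two cases can be written out as below. This is a sketch consistent with the example on this slide, assuming that a censored graduate is known not to have found a job in any period up to and including $t_i$; the indicator $y_{is}$ is a symbol introduced here.

```latex
L_i =
\begin{cases}
  h_i(t_i) \prod_{s=1}^{t_i - 1} \bigl[ 1 - h_i(s) \bigr]
    & \text{if } d_i = 1 \text{ (uncensored)} \\[4pt]
  \prod_{s=1}^{t_i} \bigl[ 1 - h_i(s) \bigr]
    & \text{if } d_i = 0 \text{ (censored)}
\end{cases}
\qquad\text{or, compactly,}\qquad
L_i = \prod_{s=1}^{t_i} h_i(s)^{y_{is}} \bigl[ 1 - h_i(s) \bigr]^{1 - y_{is}},
\quad y_{is} = \mathbb{1}\{ s = t_i,\ d_i = 1 \}
```

For ($t_i = 3$, $d_i = 1$) the first case gives $[1 - h_i(1)][1 - h_i(2)]\,h_i(3)$, the three factors listed in the example.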
20
Model with covariates
The building blocks of the likelihood are the
hazards -> a survival model is just a model for
the hazard function
To adjust for covariates one can specify a
functional form for the conditional hazard
$h(t \mid \mathbf{x}_{it}) = P(T_i = t \mid T_i \ge t, \mathbf{x}_{it})$
The vector $\mathbf{x}_{it}$ includes all the
covariates of graduate i at time t
The covariates can be time-invariant or
time-varying (the latter are rarely available
in practice)
21
Model with covariates
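Since the hazard function h is bounded between 0
and 1, a linear model for h itself is not
suitable, but one can apply a linear model to an
appropriate transformation of h
$g(h(t \mid \mathbf{x}_{it})) = \alpha_t + \mathbf{x}_{it}'\boldsymbol{\beta}$, $t \in \{1, 2, \ldots, t_{\max}\}$
g is the link function
$\boldsymbol{\beta}$ = regression coefficients
$(\alpha_1, \alpha_2, \ldots, \alpha_{t_{\max}})$ = time-specific intercepts
representing the baseline hazard ($\mathbf{x}_{it} = 0$)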
22
Logit survival model
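If the link g is the logit function, the
corresponding model is called logit or
proportional odds
$\log \dfrac{h(t \mid \mathbf{x}_{it})}{1 - h(t \mid \mathbf{x}_{it})} = \alpha_t + \mathbf{x}_{it}'\boldsymbol{\beta}$
or equivalently
$h(t \mid \mathbf{x}_{it}) = \dfrac{\exp(\alpha_t + \mathbf{x}_{it}'\boldsymbol{\beta})}{1 + \exp(\alpha_t + \mathbf{x}_{it}'\boldsymbol{\beta})}$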
23
Logit survival model
$\beta_k$ = change in the logit of the hazard following
a unit increase in the k-th covariate
If there are many time points it is advisable to
assume a (smooth) functional form for the
time-specific intercepts, e.g. a polynomial of
degree R
$\alpha_t = \gamma_0 + \gamma_1 t + \cdots + \gamma_R t^R$
The parameters $(\alpha_1, \alpha_2, \ldots, \alpha_{t_{\max}})$ are replaced by
$(\gamma_0, \gamma_1, \ldots, \gamma_R)$
To allow a time-invariant covariate $x_k$ to have a
time-varying effect, simply insert interactions
with time, e.g. $x_k \cdot t$, $x_k \cdot t^2$, $x_k \cdot t^3$
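
A minimal sketch of a logit hazard with a polynomial baseline follows. The coefficient values and the function names are hypothetical, chosen only to show the mechanics; they are not estimates from the application.

```python
import math

def logit_hazard(t, x, gammas, betas):
    """Hazard at time t under a logit link with a polynomial baseline.

    gammas : (gamma_0, ..., gamma_R), polynomial coefficients of the baseline
    betas  : regression coefficients, one per element of the covariate vector x
    """
    baseline = sum(g * t**r for r, g in enumerate(gammas))  # alpha_t
    linear_predictor = baseline + sum(b * xk for b, xk in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-linear_predictor))        # inverse logit

def hazard_with_interaction(t, x, gammas, beta_main, beta_time):
    """Time-varying effect of a time-invariant covariate x via the term x * t."""
    return logit_hazard(t, (x, x * t), gammas, (beta_main, beta_time))

# Hypothetical values: quadratic baseline (R = 2), one binary covariate
gammas = (-1.5, 0.1, -0.02)
betas = (0.4,)

h0 = [logit_hazard(t, (0,), gammas, betas) for t in range(1, 9)]
h1 = [logit_hazard(t, (1,), gammas, betas) for t in range(1, 9)]
```

Without interactions the odds ratio between the two groups is the same in every period, exp(0.4) here, which is the proportional odds property; adding `x * t` lets that ratio change over time.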
24
Estimation
likelihood of a binary response model on the
person-period dataset
=
likelihood of a discrete-time survival model on
the original dataset

Person-period dataset
Graduate  month  job
09424     1      0
09424     2      0
09424     3      0
09424     4      0
09424     5      0
13306     1      0
13306     2      0
13306     3      1

Original dataset
Graduate  month  d
09424     5      0
13306     3      1

So it is possible to use standard software for
binary response models
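
The expansion and the equality of the two likelihoods can be checked numerically with the sketch below. It uses the two graduates shown on this slide; the hazard function is a hypothetical placeholder, and the function names are introduced here for illustration.

```python
import math

def to_person_period(records):
    """Expand (graduate, t_i, d_i) records into one row per graduate and month.

    The binary response is 1 only in the last month of an uncensored record.
    """
    rows = []
    for graduate, t_i, d_i in records:
        for month in range(1, t_i + 1):
            job = 1 if (month == t_i and d_i == 1) else 0
            rows.append((graduate, month, job))
    return rows

def survival_loglik(records, hazard):
    """Log-likelihood of the discrete-time survival model on the original data."""
    total = 0.0
    for _, t_i, d_i in records:
        if d_i == 1:
            total += math.log(hazard(t_i))
            total += sum(math.log(1 - hazard(s)) for s in range(1, t_i))
        else:
            total += sum(math.log(1 - hazard(s)) for s in range(1, t_i + 1))
    return total

def bernoulli_loglik(rows, hazard):
    """Log-likelihood of a binary response model on the person-period rows."""
    return sum(
        math.log(hazard(month)) if job == 1 else math.log(1 - hazard(month))
        for _, month, job in rows
    )

def hazard(month):
    # Hypothetical hazard, used only to compare the two likelihoods
    return 0.10 + 0.02 * month

records = [("09424", 5, 0), ("13306", 3, 1)]
rows = to_person_period(records)
print(len(rows), survival_loglik(records, hazard), bernoulli_loglik(rows, hazard))
```

Because the two log-likelihoods coincide for any hazard function, any routine that fits a binary response model to `rows` also fits the survival model.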
25
Application: multilevel analysis of the time to
obtain the first job for Italian graduates
26
Data
  • Survey on Italian graduates of year 1992 carried
    out by Istat (Italian national statistical
    institute)
  • Retrospective study: postal questionnaire in
    December 1995 -> observation period from 37 to 48
    months after graduation
  • 13511 interviews
  • The dataset includes many features of the
    graduates (time-invariant covariates) but no
    features of the course programmes or universities

27
Data
Empirical hazard function
Time from graduation to the first job recorded in
months (from 1 to 48)
In order to reduce the dimension of the
person-period dataset to be constructed, for the
analysis time is collapsed into quarters (from 1
to 16)
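
Collapsing months into quarters can be sketched as below, assuming that months 1-3 form quarter 1, months 4-6 quarter 2, and so on up to months 46-48 for quarter 16. The function names are hypothetical, and treating a graduate censored mid-quarter as at risk for that whole quarter is a simplification made here.

```python
def month_to_quarter(month):
    """Map a month in 1..48 to a quarter in 1..16 (months 1-3 -> 1, ..., 46-48 -> 16)."""
    if not 1 <= month <= 48:
        raise ValueError("month must be between 1 and 48")
    return (month - 1) // 3 + 1

def collapse(record):
    """Recode a (graduate, t_i in months, d_i) record to quarters; d_i is unchanged."""
    graduate, t_i, d_i = record
    return (graduate, month_to_quarter(t_i), d_i)

# A graduate censored after 5 months needs 2 person-period rows instead of 5
print(collapse(("09424", 5, 0)))
```

Each graduate then contributes at most 16 rows to the person-period dataset rather than up to 48.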
28
Data
  • Subsample of interest: from the whole sample
    (13511 records) a subsample was taken (10338
    records) by eliminating the graduates who, at the
    date of the interview,
  • had the same job as before the degree or
  • declared that they were not interested in getting
    a job
  • 3-level hierarchy: graduates nested in 766 course
    programmes nested in 64 universities

29
The model
Discrete-time survival models
Multilevel models
i = graduate, j = course programme, k = university,
t = quarter (after degree)
Random effects: $u_{jk}$ at course programme level,
$v_k$ at university level
Barber, J. S., Murphy, S., Axinn, W. G. and
Maples, J. (2000) Discrete-time multi-level
hazard analysis. Sociological Methodology, 30,
201-235.
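
A three-level logit hazard model consistent with the indices and random effects named on this slide can be sketched as follows. The covariate vector $\mathbf{x}_{ijkt}$ (which may contain interactions with time) and the variance symbols $\sigma_u^2$ and $\sigma_v^2$ are notation assumed here.

```latex
\operatorname{logit} h(t \mid \mathbf{x}_{ijkt}, u_{jk}, v_k)
  = \alpha_t + \mathbf{x}_{ijkt}'\boldsymbol{\beta} + u_{jk} + v_k ,
\qquad
u_{jk} \sim N(0, \sigma_u^2), \quad
v_k \sim N(0, \sigma_v^2)
```

The two random intercepts shift the whole hazard curve of a course programme and of a university up or down on the logit scale.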
30
The model
  • Distribution of the random effects
  • in the application u_jk and v_k are assumed to be
    independent and zero-mean Gaussian
  • alternative: discrete distribution with unknown
    locations and masses (leading to
    Non-Parametric Maximum Likelihood)
  • Unobserved heterogeneity
  • leads to duration bias
  • the model controls for unobserved heterogeneity
    at the course programme and university levels
  • unobserved heterogeneity at the graduate level
    could still be present: in such a case a further
    random effect at the individual level should be
    added

31
Model fitting
  • Estimation procedure
  • Generate the person-period dataset (71143
    records)
  • Fit logit binary-response models (method PQL2 of
    MLwiN is used, but many programs can do the job)
  • Model selection: backward with 95% Wald tests

32
Covariates in final model
  • FINAL MARK is the only quantitative covariate:
    from 66 to 110 (in the model it is centred on 100)

All the other covariates are binary. The 0
category stands for "otherwise" and is the
reference category (defining the reference
individual)
  • FEMALE (1 = female)
  • EDUCATIONAL STATUS OF THE PARENTS (1 = at least
    one with secondary school certificate or degree)
  • OCCUPATIONAL STATUS OF THE PARENTS (1 = at least
    one working)

33
Covariates in final model
  • OCCUPATIONAL STATUS WHILE STUDYING (1 = graduate
    held at least one job during university studies)
  • INSTITUTIONAL TIME (1 = degree obtained within the
    institutional time established for the course)
  • AGE AT DEGREE (1 = over 30 years)
  • MILITARY SERVICE (1 = done after degree, 0 = done
    before degree or exempted)

34
Time pattern
  • R = 3, i.e. the polynomial for the baseline hazard
    has degree 3
  • $-1.70 + 0.04t - 0.02t^2 - 0.0001t^3$
  • Two covariates have time-varying effects
  • Female (f): $-0.32f - 0.08ft + 0.006ft^2$
  • Milit. serv. (s): $-1.22s - 0.47st + 1.18st^2 - 0.01st^3$
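
The time-varying effect of FEMALE can be tabulated by quarter as in the sketch below. It assumes that the signs lost in extraction read -0.32, -0.08 and +0.006, which is the reading under which the effect stays negative in every quarter and is close to zero by quarter 16; the function name is hypothetical.

```python
import math

def female_effect(t):
    """Effect of FEMALE on the logit of the hazard in quarter t.

    Assumed reading of the slide: -0.32 - 0.08 t + 0.006 t^2.
    """
    return -0.32 - 0.08 * t + 0.006 * t**2

effects = {t: female_effect(t) for t in range(1, 17)}
odds_ratios = {t: math.exp(e) for t, e in effects.items()}

for t in (1, 4, 8, 12, 16):
    print(t, round(effects[t], 3), round(odds_ratios[t], 3))
```

Under this reading the odds ratio of females versus the reference males is below 1 in all 16 quarters, so no single coefficient summarises the gender effect.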

35
Estimated hazard function for three types of
graduates
m = males who did military service before the
degree or were exempted from it
f = females
s = males who did military service after the degree

the other covariates are set to zero -> reference
individual
the random effects are set to zero -> mean course
programme, mean university
36
Gender and mark
  • Females are at a disadvantage w.r.t. the males
    without military service: their hazard is always
    lower (though the difference vanishes as time
    elapses). WARNING: as the effect of FEMALE is
    time-varying, one should look at the whole hazard
    curve (not merely the coefficient 0.32)
  • The effect of the FINAL MARK is larger for
    females than for males (due to an interaction)
  • 0.0063 (for each additional point) for males
  • 0.0165 (for each additional point) for females
  • For both genders, obtaining the degree within the
    INSTITUTIONAL TIME increases the hazard
    (coefficient 0.1157)
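
Assuming these coefficients are on the logit scale of the hazard, as in the logit survival model, exponentiating them gives odds ratios. The sketch below does this for the three values quoted on this slide; the dictionary keys and the helper name are hypothetical.

```python
import math

coefficients = {
    "FINAL MARK, males (per point)": 0.0063,
    "FINAL MARK, females (per point)": 0.0165,
    "INSTITUTIONAL TIME": 0.1157,
}

# Odds ratio for a one-unit increase in each covariate
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

def mark_odds_ratio(points, per_point):
    """Odds ratio for a difference of `points` in the final mark."""
    return math.exp(points * per_point)

ten_points_males = mark_odds_ratio(10, 0.0063)
ten_points_females = mark_odds_ratio(10, 0.0165)

for name, value in odds_ratios.items():
    print(name, round(value, 4))
```

Comparing a mark of 110 with a mark of 100 corresponds to `ten_points_males` and `ten_points_females`, which makes the gender difference in the mark effect easier to see than the per-point coefficients.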

37
Estimated hazard function for males without
military service and females. Final mark at 110
or 100, other covariates and random effects at 0
38
Residual variances
  • Residual variances on the intercept
  • at the university level: var(v_k) = 0.1186
  • at the course programme level: var(u_jk) = 0.2292
  • they are both statistically significant
  • such values are difficult to interpret in
    absolute terms
  • much more variability at the course programme
    level
  • No significant random slopes
  • Model residuals (shrunken) are computed at both
    levels and used to make comparisons
  • interpreted as estimates of type A effectiveness
39
Pairwise confidence intervals for residuals at
level 3 (universities)
Intervals for pairwise comparisons: ± 1.39 × s.e.,
e.g. Goldstein (2003)
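
The multiplier 1.39 can be motivated with a short sketch. Under the simplifying assumption of two independent residuals with equal standard errors, intervals of half-width c × s.e. fail to overlap exactly when the difference exceeds 2c × s.e., so a 5% two-sided test on the difference corresponds to c = 1.96 / sqrt(2). The residuals and standard errors below are hypothetical.

```python
import math

Z_95 = 1.959964  # 97.5th percentile of the standard normal distribution

def comparison_multiplier(z=Z_95):
    """Multiplier c such that two intervals r +/- c*se (equal se, independent)
    do not overlap exactly when |r1 - r2| > z * sqrt(2) * se."""
    return z / math.sqrt(2)

def comparison_interval(residual, se, c=1.39):
    """Interval used for pairwise comparisons of residuals."""
    return (residual - c * se, residual + c * se)

def differ(a, b):
    """True if the two intervals do not overlap."""
    return a[1] < b[0] or b[1] < a[0]

# Hypothetical residuals and standard errors for three universities
uni_a = comparison_interval(0.30, 0.10)
uni_b = comparison_interval(-0.05, 0.10)
uni_c = comparison_interval(0.10, 0.10)

print(round(comparison_multiplier(), 3))
print(differ(uni_a, uni_b), differ(uni_a, uni_c), differ(uni_b, uni_c))
```

Two universities are judged different only when their intervals do not overlap, which is a weaker requirement than non-overlap of the usual ± 1.96 × s.e. intervals.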
40
Estimated probability of getting a job in the first
quarter (reference graduate). Same type of degree
in different universities
41
Final remarks
  • Multilevel survival modelling makes it possible to
    model both the hierarchical and dynamic aspects of
    the transition from school to work
  • The model is implemented by means of standard
    software for multilevel models
  • The parametric specification of the baseline
    hazard is both parsimonious and flexible
    (especially for modelling time-varying effects)
  • Although time-varying covariates were not present
    in the dataset, they could be easily included in
    the model
  • More covariates would be needed to assess Type B
    effectiveness (available resources, unemployment
    rates, labour market structure)

42
A further reference
  • Grilli L. (2005) The random effects proportional
    hazards model with grouped survival data: a
    comparison between the grouped continuous and
    continuation ratio versions. J.R. Statist. Soc.
    A, 168, 83-94.
  • High school graduates
  • Comparisons of models
  • Several covariates with time-varying effects
  • Attempt to adjust for labour market conditions

grilli_at_ds.unifi.it www.ds.unifi.it/grilli