Title: Handling data on occupations, educational qualifications, and ethnicity
1 Handling data on occupations, educational qualifications, and ethnicity
- Paul Lambert and Vernon Gayle, Univ. Stirling
- Talk to the workshop Resources for Data Management and Handling Social Science Data
- ESRC Research Methods Festival, Oxford, 1 July 2008
2 Handling variables
- DAMES project (www.dames.org.uk)
- specialist data services on three major social science topics (occupations, education, ethnicity)
- GEDE: Grid Enabled Specialist Data Environments
- From www.geode.stir.ac.uk
3 Handling social science variables: general themes
- Common vs best practice
- Recording the derivation/variable construction process
- Reviewing alternative measures
- Comparability (between contexts: countries, times)
- Input or output harmonisation?
- Measurement or functional equivalence?
- See esp. Variable constructions in longitudinal research, http://www.longitudinal.stir.ac.uk/variables/
- Existing standards of National Statistics Institutes and international bodies (during data collection)
4 Handling variables: general themes, ctd.
- The unit of analysis
- Individual, spouse, household, etc.
- Current time, career summary, etc.
- Concept and measures
- Variety of academic preferences
- NSI standard measures
5 Key variables: concepts and measures
6 Key variables: comments and speculation (from www.longitudinal.stir.ac.uk/variables/Coefficients.html)
- a) Data manipulation skills and inertia
- I would speculate that around 80% of applications using key variables don't consult literature and evaluate alternative measures, but choose the first convenient and/or accessible variable in the dataset
- Data supply decisions (what is on the archive version) are critical
- Much of the explanation lies with lack of confidence in data manipulation / linking data
- Too many under-used resources, cf. www.esds.ac.uk
7 b) Software and key variables: a personal view
- Stata is the superior package for secondary survey data analysis
- Advanced data management and data analysis functionality
- Supports easy evaluation of alternative measures (e.g. est store)
- Culture of transparency of programming/data manipulation
- Problems with Stata
- Not available to all users
- Slow estimation times
8 c) Endogeneity and key variables
- "everything depends on everything else" (Crouchley and Fligelstone 2004)
- We know a lot about simple properties of key variables
- Key variables often change the main effects of other variables
- Simple decisions about contrast categories can influence interpretations
- Interaction terms are often significant and influential
- We have only scratched the surface of understanding key variables in multivariate context and interpretation
- Key variables are often endogenous (because they are key!)
- Work on standards / techniques for multi-process systems and/or comparing structural breaks involving key variables is attractive
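The point about contrast categories can be illustrated with a minimal sketch. It assumes a single categorical predictor, in which case the dummy-coded regression coefficients are simply differences between category means and the reference category mean. The categories and outcome values are invented for illustration, and `dummy_coefficients` is a hypothetical helper, not part of any package.

```python
from statistics import mean

# Hypothetical outcome values by educational category (invented numbers).
outcome_by_group = {
    "degree": [30.0, 34.0, 32.0],
    "school": [24.0, 26.0, 22.0],
    "none": [18.0, 20.0, 16.0],
}


def dummy_coefficients(groups, reference):
    """Intercept and effects for one dummy-coded categorical predictor.

    With a single categorical predictor, the OLS intercept equals the mean
    of the reference category, and each coefficient equals the difference
    between a category mean and the reference mean.
    """
    means = {name: mean(values) for name, values in groups.items()}
    intercept = means[reference]
    effects = {
        name: m - intercept for name, m in means.items() if name != reference
    }
    return intercept, effects


# Same data, two choices of contrast category.
for ref in ("none", "degree"):
    intercept, effects = dummy_coefficients(outcome_by_group, ref)
    print(ref, intercept, effects)
```

With these invented numbers, taking "none" as the contrast gives effects of 14 and 6, while taking "degree" gives -8 and -14. The fitted group means are identical in both cases; only the reported coefficients, and so the story that is easiest to tell from them, change.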
9 d) Social science variables and functional form
- Functional form: the way in which measures are arithmetically incorporated in quantitative analysis
- With occupations, education, ethnicity, and elsewhere, we tend to be too willing to make simplifying categorisations
- An alternative, scaling and relative positions, is better suited for complex analytical procedures
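As a rough sketch of what functional form means in practice, the same occupational variable can enter an analysis either as dummy indicators for collapsed classes or as a single scale score. The unit groups, class labels and scores below are invented assumptions and are not taken from any published scheme.

```python
# Hypothetical occupational unit groups with an invented class label and an
# invented scale score.
UNIT_GROUPS = {
    "oug_a": {"class": "I", "score": 71.5},
    "oug_b": {"class": "I", "score": 58.0},
    "oug_c": {"class": "II", "score": 49.0},
    "oug_d": {"class": "III", "score": 31.0},
}
CLASSES = ("I", "II", "III")


def categorical_terms(oug, reference="III"):
    """Dummy indicators: one 0/1 term per class, omitting the reference."""
    cls = UNIT_GROUPS[oug]["class"]
    return [1 if cls == c else 0 for c in CLASSES if c != reference]


def scale_term(oug):
    """A single continuous term giving the occupation's relative position."""
    return [UNIT_GROUPS[oug]["score"]]


for oug in UNIT_GROUPS:
    print(oug, categorical_terms(oug), scale_term(oug))
```

In this sketch `oug_a` and `oug_b` receive identical categorical terms but different scale terms, which is the sense in which categorisation simplifies away relative position.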
10 1. Data and research on occupations
- In the social sciences, occupation is seen as one of the most important things to know about a person
- Direct indicator of economic circumstances
- Proxy indicator of social class or stratification
- GEODE: how social scientists use data on occupations
- DAMES: extending GEODE resources
- Expanding range
- Improving usability
11 Stage 1 - Collecting Occupational Data (and making a mess)
12 www.geode.stir.ac.uk/ougs.html
13 Occupations: we agree on what we should do
- Preserve two levels of data
- Source data: occupational unit groups, employment status
- Social classifications and other outputs
- Use transparent (published) methods, i.e. occupational information resources (OIRs)
- for classifying index units
- for translating index units into social classifications
- for instance..
- Bechhofer, F. 1969. 'Occupations' in Stacey, M. (ed.) Comparability in Social Research. London: Heinemann.
- Jacoby, A. 1986. 'The Measurement of Social Class'. Proceedings from the Social Research Association seminar on "Measuring Employment Status and Social Class". London: Social Research Association.
- Lambert, P.S. 2002. 'Handling Occupational Information'. Building Research Capacity 4: 9-12.
- Rose, D. and Pevalin, D.J. 2003. 'A Researcher's Guide to the National Statistics Socio-economic Classification'. London: Sage.
14 In practice we don't keep to this...
- Inconsistent preservation of source data
- Alternative OUG schemes
- SOC-90, SOC-2000, ISCO, SOC-90 (my special version)
- Inconsistencies in other index factors
- employment status, supervisory status, number of employees
- Individual or household; current job or career
- Inconsistent exploitation of Occupational Information
- Numerous alternative occupational information files
- (time, country, format)
- Substantive choices over social classifications
- Inconsistent translations to social classifications: by file or by fiat
- Dynamic updates to occupational information resources
- Strict security constraints on users' micro-social survey data
- Low uptake of existing occupational information resources
15 GEODE provides services to help social scientists deal with occupational information resources
- disseminate, and access other, Occupational Information Resources
- Link together their (secure) micro-data with OIRs
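Linking micro-data with an OIR amounts to a keyed lookup that adds a derived classification while leaving the source data in place. The following is a minimal sketch under stated assumptions: the OIR contents, the field names (`oug`, `emp_status`, `social_class`) and the helper `link_to_oir` are all hypothetical and do not reproduce any GEODE service or published classification.

```python
# Hypothetical occupational information resource (OIR): a small lookup file
# keyed on occupational unit group (OUG) and employment status.
OIR = {
    ("1001", "employee"): "class_1",
    ("1001", "self-employed"): "class_2",
    ("2002", "employee"): "class_3",
}

# Hypothetical survey micro-data; the field names are assumptions.
survey = [
    {"id": 1, "oug": "1001", "emp_status": "employee"},
    {"id": 2, "oug": "1001", "emp_status": "self-employed"},
    {"id": 3, "oug": "9999", "emp_status": "employee"},  # not in the OIR
]


def link_to_oir(records, oir):
    """Add a derived classification while preserving the source fields."""
    linked, unmatched = [], []
    for rec in records:
        key = (rec["oug"], rec["emp_status"])
        derived = oir.get(key)  # None when the OIR has no entry for this key
        linked.append({**rec, "social_class": derived})
        if derived is None:
            unmatched.append(rec["id"])
    return linked, unmatched


linked, unmatched = link_to_oir(survey, OIR)
print(linked)
print("unmatched ids:", unmatched)
```

Keeping both the source fields and the derived class in each record preserves the two levels of data, and reporting the unmatched cases makes the translation step easier to document and check.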
16 Occupational information resources: small electronic files about OUGs
17 For example: ISCO-88 Skill levels classification
18 and UK 1980 CAMSIS scales and CAMCON classes
19 Summary on occupations and data management
- Extensive debate about occupation-based social classifications
- Document your procedures..
- ..as you may be asked to do something different..
- If you need to choose between occupation-based measures
- They all measure, mostly, the same things
- Don't assume concepts measure measures
- Lambert, P. S. and Bihagen, E. (2007). Concepts and Measures: Empirical evidence on the interpretation of ESeC and other occupation-based social classifications. Paper presented at the ISA RC28 conference, Montreal (14-17 August), www.camsis.stir.ac.uk/stratif/archive/lambert_bihagen_2007_version1.pdf
22 July 2008: Existing resources on occupations
- Popular websites
- http://www2.warwick.ac.uk/fac/soc/ier/publications/software/cascot/
- http://home.fsw.vu.nl/ganzeboom/pisa/
- www.iser.essex.ac.uk/esec/
- www.camsis.stir.ac.uk/occunits/distribution.html
- Emerging resource: http://www.geode.stir.ac.uk/
- Some papers
- Chan, T. W. and Goldthorpe, J. H. (2007). Class and Status: The Conceptual Distinction and its Empirical Relevance. American Sociological Review, 72, 512-532.
- Rose, D. and Harrison, E. (2007). The European Socio-economic Classification: A New Social Class Scheme for Comparative European Research. European Societies, 9(3), 459-490.
- Lambert, P. S., Tan, K. L. L., Gayle, V., Prandy, K. and Bergman, M. M. (2008). The importance of specificity in occupation-based social classifications. International Journal of Sociology and Social Policy, 28(5/6), 179-192.
23 Using data on occupations: further speculation
- Growing interest in longitudinal analysis and use of longitudinal summary data on occupations
- Intuitive measures (e.g. ever in Class I)
- Lampard, R. (2007). Is Social Mobility an Echo of Educational Mobility? Sociological Research Online, 12(5).
- Empirical career trajectories / sequences
- Halpin, B. and Chan, T. W. (1998). Class Careers as Sequences. European Sociological Review, 14(2), 111-130.
- Growing cross-national comparisons
- Ganzeboom, H. B. G. (2005). On the Cost of Being Crude: A Comparison of Detailed and Coarse Occupational Coding. In J. H. P. Hoffmeyer-Zlotnik and J. Harkness (Eds.), Methodological Aspects in Cross-National Research (pp. 241-257). Mannheim: ZUMA, Nachrichten Spezial.
- Treatment of the non-working populations
- Seldom adequate to treat non-working as a category
- Selection modelling approaches expanding
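The two kinds of longitudinal summary mentioned above, an intuitive measure such as "ever in Class I" and an empirical career sequence, can be sketched as follows. The career data are invented, and `ever_in` and `spells` are hypothetical helpers rather than functions from any sequence analysis package.

```python
# Hypothetical class careers: one class code per observed year per person.
careers = {
    "p1": ["III", "II", "I", "I"],
    "p2": ["III", "III", "II"],
    "p3": ["I", "II", "II"],
}


def ever_in(sequence, target="I"):
    """Intuitive summary: was the person ever observed in the target class?"""
    return target in sequence


def spells(sequence):
    """Collapse a yearly sequence into (class, duration) spells."""
    result = []
    for state in sequence:
        if result and result[-1][0] == state:
            # Same class as the previous year: extend the current spell.
            result[-1] = (state, result[-1][1] + 1)
        else:
            result.append((state, 1))
    return result


summary = {pid: (ever_in(seq), spells(seq)) for pid, seq in careers.items()}
for pid, (flag, career) in summary.items():
    print(pid, flag, career)
```

Here `p1` and `p3` both count as "ever in Class I", yet their spell sequences run in opposite directions, which is the kind of detail an intuitive summary measure discards.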
24 2. Data and research on education
- Although there have been standardisation attempts, data on an individual's level of education is notoriously difficult to collect and compare between studies
- Between countries
- Between regions
- Between time periods
- Even between short time periods (Example of the UK Youth Cohort Study)
25 In international research..
- There are two leading standards
- ISCED
- www.unesco.org/education/information/nfsunesco/doc/isced_1997.htm
- CASMIN education
- http://www.equalsoc.org/publications/show/40
- But not all researchers adopt them, or are satisfied with them when they do
26 In UK research..
- There are some recommended standard data collection schemes
- Simplified measure (other primary standard) at www.statistics.gov.uk/about/data/harmonisation/
- ..but many studies build up unstandardised data on highest levels of qualifications
- Often hundreds of unique qualification titles
- Little standardisation on relative levels
- Many surveys collect multiple response data (multiple qualifications held by an individual)
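Reducing multiple response data to a highest qualification depends on a documented lookup of relative levels. The sketch below is a minimal illustration: the qualification titles and level numbers in `LEVELS` are assumptions made for the example, not a published standard, and `highest_qualification` is a hypothetical helper.

```python
# Hypothetical lookup from qualification titles to relative levels.
LEVELS = {
    "degree": 4,
    "a_level": 3,
    "gcse": 2,
    "other_certificate": 1,
}


def highest_qualification(titles, levels=LEVELS):
    """Return ((title, level) or None, list of unrecognised titles).

    Unrecognised titles are returned separately so that they can be
    reviewed rather than silently dropped.
    """
    known = [t for t in titles if t in levels]
    unknown = [t for t in titles if t not in levels]
    if not known:
        return None, unknown
    best = max(known, key=lambda t: levels[t])
    return (best, levels[best]), unknown


# One respondent reporting several qualifications, one of them not in the
# lookup ("local_diploma" is an invented title).
print(highest_qualification(["gcse", "a_level", "local_diploma"]))
```

Returning the unrecognised titles alongside the result keeps a record of where the lookup is incomplete, which matters when a survey holds hundreds of unique titles.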
27 BHPS example
28 Family and Working Lives Survey (54 vars per educ record)
29 Data on education levels, cf. occupations
- Underlying qualification units
- There are few obvious educational unit groups
- There are many publicly defined alternative schemes
- Manipulation of educational data
- Few published educational information resources
- Many open-access sources of data about educational qualifications
- e.g. national statistics website reports
- There has been less previous recognition of value of standardisation
- Though this is emerging in comparative research
- Educational data is dynamic and rapidly expanding
30 Educational data and cohort change
- A critical consideration concerns cohort change in educational qualifications and distributions
- Appreciating relative value of education level given context
- Multivariate analytical procedures
- Mean benefit of education within cohort?
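One simple way to express the relative value of an education level given context is the share of a person's own birth cohort holding a lower level. This is only one of several possible definitions, chosen here for illustration; the records and the ordinal `educ` scale are invented, and `relative_position` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical records: birth cohort and an education level on an invented
# ordinal scale (higher means more education).
people = [
    {"id": 1, "cohort": 1940, "educ": 2},
    {"id": 2, "cohort": 1940, "educ": 1},
    {"id": 3, "cohort": 1940, "educ": 1},
    {"id": 4, "cohort": 1970, "educ": 2},
    {"id": 5, "cohort": 1970, "educ": 3},
    {"id": 6, "cohort": 1970, "educ": 3},
]


def relative_position(records):
    """Share of the same cohort with a strictly lower education level."""
    by_cohort = defaultdict(list)
    for rec in records:
        by_cohort[rec["cohort"]].append(rec["educ"])
    result = {}
    for rec in records:
        levels = by_cohort[rec["cohort"]]
        below = sum(1 for level in levels if level < rec["educ"])
        result[rec["id"]] = below / len(levels)
    return result


positions = relative_position(people)
print(positions)
```

In this invented example persons 1 and 4 hold the same absolute level, but person 1 is above two thirds of the 1940 cohort while person 4 is above nobody in the 1970 cohort.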
31 Summary on education and data management
- We should document measures because..
- Some way away from agreeing on preferred measures
- Dynamic nature of educational distributions
- Debate between categorisers and scorers
- Some useful resources
- Schneider, Silke L. (ed.) (2008), The International Standard Classification of Education (ISCED-97). An Evaluation of Content and Criterion Validity for 15 European Countries. Mannheim: MZES. ISBN 978-3-00-024388-2
- ISMF educational databases and recodes: http://home.fsw.vu.nl/hbg.ganzeboom/ISMF/ismf.htm
32 3. Data and research on ethnicity
- Rapid growth in social science interest, and data, on ethnic minority groups, immigration, immigrants
- Data includes
- Generic and specialist studies collecting ethnic referents
- ethnic identity, nationality, parents' nationality, country of birth, language spoken, religion, race
- National research and data management
- Most countries have evolving standard definitions of ethnic groups
- International research and data management
- Seen as highly problematic in many fields except immigration data
- Lambert, P.S. (2005). Ethnicity and the Comparative Analysis of Contemporary Survey Data. In J. H. P. Hoffmeyer-Zlotnik and J. Harkness (Eds.), Methodological Aspects in Cross-National Research (pp. 259-277). Mannheim: ZUMA-Nachrichten Spezial 11.
35 UK: ONS / ESDS data guides
- Input harmonisation within decades
- Output harmonisation between decades
- Bosveld, K., Connolly, H. and Rendall, M. S. (2006). A guide to comparing 1991 and 2001 Census ethnic group data. London: Office for National Statistics.
- Academic strategies: ad hoc black group, etc
- Addition of extra categories over time
- Mixed ethnicities, marriages
- UK: focus on ethnic identity, lack of attention to alternative referents
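Output harmonisation between decades can be pictured as two recorded mappings from each period's source codes onto a common, usually coarser, scheme. The sketch below uses entirely invented codes and labels. It does not reproduce the Census categories or any ONS guidance, and `harmonise` is a hypothetical helper.

```python
# Hypothetical category codes from two data collections, each mapped to a
# simplified common scheme. All codes and labels are invented.
EARLIER_TO_COMMON = {"a1": "group_x", "a2": "group_y", "a3": "group_y"}
LATER_TO_COMMON = {
    "b1": "group_x",
    "b2": "group_y",
    "b3": "group_y",
    "b4": "other",  # category added in the later collection only
}


def harmonise(code, mapping):
    """Map a source code to the common scheme, keeping the source code too."""
    if code not in mapping:
        # Fail loudly so an undocumented category is never recoded silently.
        raise KeyError(f"no harmonised category recorded for {code!r}")
    return {"source_code": code, "common": mapping[code]}


earlier = [harmonise(c, EARLIER_TO_COMMON) for c in ("a1", "a3")]
later = [harmonise(c, LATER_TO_COMMON) for c in ("b2", "b4")]
print(earlier)
print(later)
```

Retaining the source code next to the harmonised category means a different common scheme can be applied later without returning to the raw files.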
36 Comparative research solutions?
- Measurement equivalence might be achieved by
- Survey data collection
- Connecting related groups
- Longitudinal linkage
- Functional equivalence for categories
- Simplified categorical distinctions
- Immigrant cohorts
- Scaling ethnic categories
37 Ethnicity and the DAMES project
- Hard subject to collate information on
- Few recognisable ethnic unit groups
- Limited previous data management reflection
- Very few published databases on ethnicity
- Important question of sparse distributions
- Dynamic, rapidly expanding
- Likely role is to give new guidance on emerging
strategies for analysing and exploiting data
38 Concluding summary: Handling data on occupations, educational qualifications and ethnicity
- Principles for data management
- Keep clear records
- Recodes and transformations
- Use existing standards
- Do something, not nothing
- Distributional differences by cohorts
- Learn how to match files
- Exploiting wider resources / other research