Surfing the education wave with official statistics

1
Surfing the education wave with official
statistics
  • Sharleen Forbes
  • Statistics New Zealand
  • School of Government, Victoria University

2
To cover
  • The role of a National Statistics Office in
    education - why surf at all?
  • Prioritising - what can we afford and where
    should we invest?
  • Current initiatives
  • Community groups
  • Schools
  • Tertiary education
  • Playing with official statistics
  • Examples for classroom use
  • Where to in the future?
  • Providing more sets of real data
  • New ways of visualising data

3
Role of Statistics New Zealand
  • Lead the state sector in production of official
    statistics (official statistics system
    responsibility)
  • Employ large number of statisticians
  • Not funded specifically for education (promote,
    partner or facilitate rather than provide)
  • Need to provide easily understood statistics
    (Public Good requirement)
  • Should target informal / second chance education
    (NSO Workshop ICOTS 6, Singapore)
  • Focus on official statistics

4
  • Differences between official and other statistics

5
Prioritising - what can we afford
- where should we invest?
  • Need to balance external demands with internal
    training needs
  • Limited funds (need to pick the wave - where
    can we make a difference?)

6
Current initiatives -
community groups
  • State sector (Official Statistics System)
  • Certificate of Official Statistics (Level 4)
  • School of Government and ANZSOG courses
  • Workshops and seminars
  • Journalists
  • JTO compulsory statistics unit(s)
  • Statistics prize
  • Small businesses
  • GoStats!
  • Maori communities
  • Pilot projects

7
Current initiatives -
schools
  • Resources to support the new curriculum
  • Schools Corner on Statistics New Zealand website
    (http://www.stats.govt.nz/schools-corner)
  • CensusAtSchool
  • Joint funder (http://www.censusatschool.org.nz)
  • Dataset provision
  • Census
  • Official Statistics Surveys
  • Synthetic Unit Record Files (SURFs)

8
Current initiatives -
tertiary education
  • Network of Academics in Official Statistics
  • To provide training and research
  • Undergraduate student prizes ($1000)
  • Official Statistics Research Fund
  • Partnerships with researchers
  • Vice-Chancellors agreement
  • Confidentialised Unit Record Files (CURFs)
  • Half-time Professor of Official Statistics
  • School of Government, Victoria University

9
Playing with official statistics - Examples
  • Census data
  • Official Statistics Survey data
  • Specially constructed data sets
  • Confidentialised Unit Record Files (CURFs)
  • Synthetic Unit Record Files (SURFs)

10
The statistical investigation (PPDAC)
cycle (Creators: Wild and Pfannkuch, Auckland
University, 1999)
  • Problem: statement of the research
    questions
  • Plan: procedures used to carry out the study
  • Data: data collection process
  • Analysis: summaries and analyses of the data to
    answer the questions posed
  • Conclusion: about what has been learned.

11
1. Census data example
  • Problem (Question): Is Hamilton greener than
    Wellington?
  • Plan / Data: Use 2006 Census data on the way
    people travel to work to indicate how green a
    city is. (www.stats.govt.nz/census/)

12
  • Analysis

13
Definitions: (Re)classifications
  • How many and what classes of green shall we
    have?
  • Have defined green-ness by mode of travel to
    work
  • Let's have only 3 classes of green-ness
  • Not green: Driving private or company vehicles
  • Green: Passenger in private vehicle or using
    public transport
  • Very green: Walking, biking or working at home
  • Omit other categories
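
The reclassification above can be sketched in a few lines of Python. This is only an illustration: the travel-mode labels are assumed wording rather than the official census category names, and the counts are made up to show the calculation, not taken from the 2006 Census.

```python
from collections import Counter

# Mapping from travel-to-work mode to the three green-ness classes on the slide.
# The mode labels are assumed wording, not official census category names.
GREEN_CLASS = {
    "drove private vehicle": "not green",
    "drove company vehicle": "not green",
    "passenger in private vehicle": "green",
    "public transport": "green",
    "walked": "very green",
    "biked": "very green",
    "worked at home": "very green",
}


def green_shares(counts):
    """Return the share of commuters in each green-ness class.

    Modes missing from GREEN_CLASS (the "other" categories) are omitted,
    so the shares are taken over classified commuters only.
    """
    totals = Counter()
    for mode, n in counts.items():
        if mode in GREEN_CLASS:
            totals[GREEN_CLASS[mode]] += n
    classified = sum(totals.values())
    return {cls: totals[cls] / classified
            for cls in ("not green", "green", "very green")}


# Made-up counts for one hypothetical city, purely to show the calculation.
example_counts = {
    "drove private vehicle": 500,
    "drove company vehicle": 100,
    "passenger in private vehicle": 50,
    "public transport": 150,
    "walked": 100,
    "biked": 40,
    "worked at home": 60,
    "other": 25,  # dropped by the classification
}
shares = green_shares(example_counts)
```

Running the same function on the counts for each city gives three shares per city that can be compared side by side.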

14
More analysis

15
Conclusion (and classroom questions)
  • Conclusion
  • Wellington is greener than Hamilton
  • Questions
  • Is mode of travel to work a good indicator of
    green-ness?
  • What other variables might affect mode of
    travel?
  • Should we use more than one indicator?

16
Official Statistics Survey data
  • Problem (questions)
  • Are fewer people unemployed now than in previous
    years?
  • Are you less likely to be unemployed if you have
    a high level of education?
  • Plan / Data
  • Analyse time series data on national
    unemployment rates
  • Statistics New Zealand's Household Labour Force
    Survey (www.stats.govt.nz)

17
Analysis - Question a).
Time series plots

18
Conclusions (and classroom questions)
  • Conclusions
  • Unemployment has been lower since 2004 than in
    previous years
  • Since 2004 unemployment has stayed at roughly the
    same level (about 4%)
  • Seasonality is not marked
  • Questions
  • What was the cause of the peaks (1991-3 and 1999)
    in unemployment?
  • What do the small peaks in 2004 - 2007 reflect?
  • Should we answer a count question (number
    unemployed) with a rate (percent unemployed in
    the labour force)?
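
For the last question, a small sketch of the difference between a count and a rate may help. The figures below are made up for illustration; the only definition used is the one on the slide, the percent unemployed in the labour force.

```python
def unemployment_rate(unemployed, employed):
    """Unemployed as a percentage of the labour force (employed + unemployed)."""
    labour_force = unemployed + employed
    return 100 * unemployed / labour_force


# Made-up figures: the number unemployed rises from 80,000 to 88,000,
# yet the rate falls because the labour force has grown faster.
earlier_rate = unemployment_rate(unemployed=80_000, employed=1_520_000)
later_rate = unemployment_rate(unemployed=88_000, employed=2_112_000)
```

Here the count goes up while the rate drops from 5 percent to 4 percent, which is why the two ways of asking the question can give different answers.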

19
Analysis - Question b).
Time series plots

20
Conclusions (and classroom questions)
  • Conclusions
  • Pattern over time is similar for all
    qualification groups.
  • Unemployment rate always highest for workers with
    no educational qualifications.
  • Questions
  • Which group appears to be the most disadvantaged
    when unemployment is high?
  • What appears to be different in recent (compared
    to past) years between the qualification groups?

21
Another sample survey example - a
simple look at seasonality
  • Problem (question)
  • Is there an annual pattern in retail sales?
  • Plan / data
  • Check for seasonality in quarterly summary time
    series data for monthly retail trade sales (in
    dollars)
  • Statistics New Zealand's Retail Trade Survey
  • (www.stats.govt.nz)

22
Analysis: Time series plot

23
Conclusions (and classroom questions)
  • Conclusions
  • Annual seasonality - peak every December /
    January
  • Rising trend over time - plateau in last 3
    quarters
  • Questions
  • What components of retail trade would contribute
    most to the December peaks?
  • What does it mean when the seasonally adjusted
    and trend lines lie virtually on top of each
    other?
  • Easter fell in the March rather than June quarter
    in 2008. Is there any evidence that this affected
    the pattern of retail sales?
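
One simple way to check for an annual pattern numerically is to compare each quarter with the average of its own year. The sketch below assumes made-up quarterly sales figures and a hypothetical helper name; it is not the seasonal adjustment method used by Statistics New Zealand.

```python
from statistics import mean


def seasonal_indices(quarterly_sales):
    """Average ratio of each quarter to its own year's mean.

    quarterly_sales is a list of years, each a list of four quarterly
    totals ordered March, June, September, December.
    An index above 1 marks a quarter that tends to sit above the yearly average.
    """
    ratios = [[] for _ in range(4)]
    for year in quarterly_sales:
        year_mean = mean(year)
        for q, value in enumerate(year):
            ratios[q].append(value / year_mean)
    return [mean(r) for r in ratios]


# Made-up sales with a rising trend and a December-quarter peak.
sales = [
    [90, 95, 100, 115],
    [99, 104.5, 110, 126.5],
    [108, 114, 120, 138],
]
indices = seasonal_indices(sales)
peak_quarter = indices.index(max(indices))  # 3 means the December quarter
```

Dividing by the yearly mean first keeps the rising trend from distorting the comparison between quarters.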

24
3. Specially constructed data sets -
Confidentialised datasets
(e.g. 2004 Income Survey)

25
SURFING Classroom Examples (SURF creator:
Pauline Stuart, Statistics NZ)
  • Using 2004 Income Survey SURF data.
  • Data available on CD or downloaded from Schools
    Corner on the Statistics New Zealand website
    (www.stats.govt.nz/schoolscorner/).
  • Dataset has 200 records and seven variables:
  • gender (male, female)
  • highest education qualification (none, school,
    vocational, degree)
  • marital status (married, never, previously,
    other)
  • ethnic group (European, Maori, Other)
  • age (15-45)
  • hours worked weekly (0-79)
  • weekly income (0-2000).

26
Example
  • Background
  • In this example we let the SURF dataset represent
    a company's employees.
  • Every employee creates the same administration
    costs regardless of how many hours are worked.
  • The company is concerned that its staff
    administration costs are too high.
  • Problem (questions)
  • Do most employees work a normal (40 hour) week?
  • What variables are related to the number of hours
    worked?

27
Specific questions for secondary school classrooms
  • 1. What proportion of employees work at least 40
    hours per week? (Summary)
  • 2. Are these proportions different for males
    and females? (Comparison)
  • 3. Do males tend to work more hours per week
    than females? (Comparison)
  • 4. What is the relationship between hours
    worked and income? (Relationship between two
    measurement variables)

28
Plan / Data (a). Take a random sample of 35 from
the SURF
  • Analysis

Table: Sample summary statistics
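
A minimal sketch of this sampling step, assuming a made-up stand-in for the SURF (the real file is not reproduced here) and a hypothetical `summarise` helper:

```python
import random
from statistics import mean, median, stdev

random.seed(1)  # fixed seed so the sketch is repeatable

# Stand-in for the 200-record SURF: made-up (gender, weekly hours) pairs.
surf = [(random.choice(["male", "female"]), random.randint(0, 79))
        for _ in range(200)]


def summarise(records):
    """Summary statistics of weekly hours for a list of (gender, hours) records."""
    hours = [h for _, h in records]
    return {
        "n": len(hours),
        "mean": mean(hours),
        "median": median(hours),
        "sd": stdev(hours) if len(hours) > 1 else float("nan"),
        "prop_40_plus": sum(h >= 40 for h in hours) / len(hours),
    }


sample = random.sample(surf, 35)  # simple random sample without replacement
overall = summarise(sample)
by_gender = {
    g: summarise([r for r in sample if r[0] == g])
    for g in ("male", "female")
    if any(r[0] == g for r in sample)
}
```

Splitting 35 records by gender leaves small groups, so the per-gender figures will vary a lot from one sample to the next.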
29
Conclusions (and classroom questions)
  • Conclusions
  • Only half of all employees work 40 hours or more.
  • On average (mean) males work longer hours than
    females.
  • Hours females work vary (standard
    deviation, inter-quartile range) more than hours
    males work.
  • Questions
  • Are samples of size 17 and 18 large enough?
    (beware of categorical data)
  • What does it indicate when the mean and the
    median are different?

30
Plan / Data (b). - Resample
  • Compare between students' samples (summary
    statistics)
  • Combine students' samples and create new summary
    statistics
  • Sample (another 35 say) and compare (or combine)
    summary statistics
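
The compare-and-combine idea can be sketched as follows, again with a made-up stand-in for the hours variable and an assumed class of five students each drawing 35 records:

```python
import random
from statistics import mean

random.seed(2)  # fixed seed so the sketch is repeatable

# Made-up stand-in for the weekly hours of the 200 SURF records.
surf_hours = [random.randint(0, 79) for _ in range(200)]

# Each "student" draws their own sample of 35 without replacement.
student_samples = [random.sample(surf_hours, 35) for _ in range(5)]
sample_means = [mean(s) for s in student_samples]

# Compare: how far apart are the students' sample means?
spread_of_means = max(sample_means) - min(sample_means)

# Combine: pool the samples into one larger sample. The same record can
# appear more than once because the students sample independently.
combined = [h for s in student_samples for h in s]
combined_mean = mean(combined)
```

Because every sample has the same size, the mean of the pooled sample equals the average of the individual sample means.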

31
Plan / Data (c). - Use all the
SURF data
  • Analysis
  • How do sample statistics compare with total SURF?
  • Would a graph be easier to interpret than the
    table?

32
  • Analysis
  • Graphs of SURF data

33
Conclusions (and classroom questions)
  • Conclusions
  • Use tables for reference, graphs to tell a story.
  • Females bimodal? at 5-25 hours (part-time) and
    35-50 hours (full-time)?
  • Males tri-modal? small at 10-15 hours
    (part-time), large at 35-55 hours (full-time),
    small at 60-75 hours (maybe managers)?
  • Proportions of males and females working 40 hours
    or more are different. About half of the males do
    but only about a quarter of the females do.
  • Questions
  • What is the clumping at 40 hours?
  • Given the size of the SURF do you think the above
    patterns will be similar if other SURFs are
    taken?

34
Analysis - Question 4. Relationship between
hours worked and income?

35
Conclusion (and classroom questions)
  • Conclusion
  • Income increases as more hours are worked.
  • Questions
  • What is the estimated income for someone who
    doesn't work?
  • What extra income (on average) is expected if
    someone works an extra hour per week?
  • Is the (regression) line a good fit to the data?
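
These three questions map onto the intercept, the slope and R squared of a least-squares line. The sketch below fits such a line by hand; the data points are made up for illustration and `fit_line` is a hypothetical helper, not the software used in the presentation.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = a + b*x, plus R squared."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, 1 - ss_resid / ss_total


# Made-up weekly hours and weekly incomes for six people.
hours = [0, 10, 20, 30, 40, 50]
income = [20, 150, 380, 500, 700, 830]
intercept, slope, r_squared = fit_line(hours, income)
# intercept: estimated income at zero hours
# slope: expected extra income per extra hour worked
# r_squared: share of the variation in income explained by the line
```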

36
Other factors related to hours worked? (Sex /
Highest qualification / Ethnicity, etc.) Example
from a first-year university course (Creator:
John Harraway, Otago University)
  • Plan / Data
  • Recategorise highest qualification
  • Secondary (S): None OR Secondary (105)
  • Tertiary (T): Vocational OR Tertiary (95)
  • Do a linear regression in SPSS (equivalent to
    t-test for difference in means)
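
The equivalence mentioned in the last bullet can be seen directly: when the only explanatory variable is a 0/1 indicator, the fitted intercept is the mean of the 0 group and the slope is the difference between the two group means. A small sketch with made-up incomes (the helper name is hypothetical):

```python
from statistics import mean


def dummy_regression(indicator, y):
    """Least-squares fit of y = a + b*indicator, where indicator is 0 or 1."""
    x_bar, y_bar = mean(indicator), mean(y)
    sxx = sum((x - x_bar) ** 2 for x in indicator)
    sxy = sum((x - x_bar) * (yi - y_bar) for x, yi in zip(indicator, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Made-up data: 0 = secondary (S), 1 = tertiary (T).
tertiary = [0, 0, 0, 1, 1, 1, 1]
income = [300, 450, 480, 700, 760, 820, 760]

intercept, slope = dummy_regression(tertiary, income)
mean_s = mean([y for x, y in zip(tertiary, income) if x == 0])
mean_t = mean([y for x, y in zip(tertiary, income) if x == 1])
# intercept matches mean_s, and slope matches mean_t - mean_s
```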

37
Analysis: SPSS regression output
  • Weekly Income = 414 + 344 × Tertiary
  • 95% confidence interval for the increase in income
    with a tertiary qualification is 257 to 431
  • t = 7.8, p = 0.000..
  • R² = 0.24 (only about a quarter of the variation in
    the points is explained by the best-fitting line)
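
The figures on this slide can be checked against one another with a little arithmetic. The sketch below assumes a critical t value of about 1.97 (for 198 degrees of freedom) and uses the identity R squared = t² / (t² + df), which holds for a regression with a single explanatory variable.

```python
estimate = 344               # estimated increase in income for tertiary
ci_low, ci_high = 257, 431   # reported 95 percent confidence interval
t_reported = 7.8
n = 200                      # records in the SURF

t_crit = 1.97                # assumed: approx. 97.5th percentile of t, 198 df

# Half the interval width divided by the critical value gives the standard error.
std_error = (ci_high - ci_low) / (2 * t_crit)

# The t statistic is the estimate divided by its standard error (about 7.8).
t_implied = estimate / std_error

# With one explanatory variable, R squared follows from t and the degrees of freedom.
r_squared_implied = t_reported ** 2 / (t_reported ** 2 + (n - 2))
```

The implied t statistic and R squared agree with the reported values to rounding, so the output is internally consistent.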

38
Conclusion (and classroom question)
  • Conclusion
  • Income is higher on average (by 344) for those with
    a tertiary qualification.
  • Question
  • Is qualification a good explanatory variable for
    income earned?

39
Are there multiple factors related to income?
  • Problem (Question): Are both qualification and
    hours worked related to income?
  • Plan / Data: Do a multiple regression (main
    effects model - no interaction terms) in SPSS
    using SURF data

40
Analysis: Scatterplot of income by hours worked
and qualification
(S = secondary, T = tertiary)

41
SPSS regression output (values extracted and
rounded for all 3 models)

42
Conclusions
  • Weekly Income = -19 + 15 × Hours Worked
    + 183 × Tertiary
  • Conclusions
  • Both hours worked and highest qualification are
    related to weekly income earned
  • Mean increase in income per hour worked is
    reduced (from 17 to 15) if tertiary qualification
    is also considered
  • Mean increase in income with a tertiary
    qualification is also reduced (from 344 to 183)
    when adjusted for number of hours worked
  • 95% confidence interval for the intercept (income
    when no hours are worked) still contains zero
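
As a short illustration of how the main effects model is read, the rounded coefficients from this slide (-19, 15 and 183) can be wrapped in a prediction function. The function name is hypothetical, and the predictions are only as good as the rounded coefficients.

```python
def predicted_weekly_income(hours_worked, tertiary):
    """Prediction from the main effects model with the rounded slide coefficients.

    tertiary is True for a vocational or degree qualification, False otherwise.
    """
    return -19 + 15 * hours_worked + 183 * (1 if tertiary else 0)


# At the same hours, the two groups differ by the tertiary coefficient (183).
secondary_40 = predicted_weekly_income(40, tertiary=False)  # -19 + 600 = 581
tertiary_40 = predicted_weekly_income(40, tertiary=True)    # 581 + 183 = 764

# Each extra hour adds 15 regardless of qualification (no interaction term).
extra_hour = predicted_weekly_income(41, True) - tertiary_40
```

Because the model has no interaction term, the two groups are described by parallel lines 183 apart.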

43
Classroom questions
  • Questions
  • Is there any interaction between hours worked
    and qualification?
  • Which of the above models fits the data best?
  • Are there any outliers?
  • What does a scatterplot of the residuals
    (distances from the line) indicate?

44
More resampling
  • Use SURF as sample from CURF population
  • Bootstrapping
  • Take repeated samples with replacement (of same
    size as original, n = 200).
  • Jack-knifing
  • Take repeated samples dropping one value from
    original sample each time (n = 199).
  • Calculate mean and standard deviation of sample
    means
  • Compare summary statistics with CURF (or full
    2004 Income Survey).
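
Both resampling schemes can be sketched with the Python standard library. The incomes below are a made-up stand-in for the SURF, the number of bootstrap replicates (1000) is an arbitrary choice, and the function names are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(3)  # fixed seed so the sketch is repeatable

# Made-up stand-in for the weekly incomes of the 200 SURF records.
surf_income = [random.randint(0, 2000) for _ in range(200)]


def bootstrap_means(data, replicates=1000):
    """Means of repeated samples drawn with replacement, each of size len(data)."""
    n = len(data)
    return [mean(random.choices(data, k=n)) for _ in range(replicates)]


def jackknife_means(data):
    """Means of the n leave-one-out samples, each of size n - 1."""
    total, n = sum(data), len(data)
    return [(total - value) / (n - 1) for value in data]


boot = bootstrap_means(surf_income)
jack = jackknife_means(surf_income)

# Mean and standard deviation of the resampled means.
boot_mean, boot_sd = mean(boot), stdev(boot)
jack_mean, jack_sd = mean(jack), stdev(jack)

# Leave-one-out means barely differ from one another, so their standard
# deviation is rescaled by (n - 1) / sqrt(n) to estimate the standard error.
n = len(surf_income)
jack_se = (n - 1) / sqrt(n) * jack_sd
```

For the mean, the jackknife standard error reproduces the usual formula (sample standard deviation divided by the square root of n), and the bootstrap standard deviation should land close to it.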

45
Where to from here?
  • Continue and develop partnerships (academics,
    teachers, community groups)
  • More CURFs and SURFs (official launch 1 September
    2008 - 2001 Savings Survey SURF,
    www.stats.govt.nz/schools-corner)
  • Increased free access to data for post-graduate
    students
  • Data visualisation (dynamic graphs)
  • More across-discipline outputs

46
Animated population pyramids (Creator: Martin
Ralphs, Statistics NZ)

47
Economic structure population pyramid (Office for
National Statistics, UK)

48
Gapminder (www.gapminder.org): geography, history,
demography, econometrics (Creator: Hans Rosling)

49
Questions and comments
  • What are your ideas for the future?
  • Contact: sharleen.forbes_at_stats.govt.nz
  • Thank you.