Next on OPRAH - PowerPoint PPT Presentation

Description:

Before we get all shaken up about data and statistics, with warnings that such ... and darned near 100% of those injured in traffic accidents are people who move ... – PowerPoint PPT presentation

Slides: 58
Provided by: stauffer2

1
  • Next on OPRAH
  • Bringing Data Out of the Closet

OLA SuperConference, Friday, 1 February 2002
Walter Giesbrecht, Data Librarian, York University
Jeff Moon, Head, Documents Unit, Queen's University
2
Not this Data
3
but these kinds!
4
(No Transcript)
5
Let's take a look at Data and Statistical
Analysis. Have you ever seen the movie Twins?
6
Think of Arnie as the Data continuum:
A number, then Tables, Charts, Graphs, then Raw Survey Data
Aggregate Data (from books, journals, the web, etc.)
  • French Mother Tongue (1996) in Ontario
  • Employment levels by occupation class
  • Annual inflation rate from 1914 to present
Microdata
  • Coded responses of surveyed individuals
7
Aggregate Data
Canada - Employment, Telecommunication Equipment Industry: 479,285
A Number
Tables, Charts, Graphs
Time Series
8
Sources of Aggregate Data
  • Statistics Canada is generally the first stop for
    Canadian Data
  • The Canada Year Book (print)
  • The Daily (web)
  • Canadian Social Trends (web/print)
  • CANSIM / E-Stat (web): time series
  • Canadian Statistics (web)
  • Beyond 20/20 Files: multidimensional tables

9
Survey Data (microdata)
variables
respondents
Statistical analysis software (e.g., SPSS, SAS) is used
to generate meaningful results.
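A minimal sketch of what a microdata file looks like to software: each row is a respondent and each column is a variable. The variable names and coded values below are invented for illustration and are not taken from any real survey file.

```python
# Hypothetical microdata: one dictionary per respondent, one key per variable.
# All names and codes here are invented for illustration only.
respondents = [
    {"id": 1, "age_group": 1, "sex": "F", "used_chat": 1},
    {"id": 2, "age_group": 1, "sex": "M", "used_chat": 0},
    {"id": 3, "age_group": 2, "sex": "F", "used_chat": 1},
    {"id": 4, "age_group": 3, "sex": "M", "used_chat": 0},
]


def variable(rows, name):
    """Return one column (a variable) across all respondents."""
    return [row[name] for row in rows]


def count_where(rows, name, value):
    """Count the respondents whose coded answer equals the given value."""
    return sum(1 for row in rows if row[name] == value)


# Turning raw rows into "a number" is the step that SPSS or SAS performs.
n_chat = count_where(respondents, "used_chat", 1)
print(variable(respondents, "sex"), n_chat)
```

Packages such as SPSS and SAS do this same kind of counting at scale, with the codebook supplying the meaning of each coded value.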
10
Sources of Survey Data
  • Once again, Statistics Canada is generally the
    first stop for Canadian Data
  • The Data Liberation Initiative (DLI) provides
    access to hundreds of publicly released survey
    data files.
  • Polling Companies (Environics, CROP, etc.)
    produce microdata files as well.
  • For US & international data, the
    Inter-university Consortium for Political and
    Social Research (ICPSR)

11
Aggregate Data: Postcard (Fixed)
Survey Data: Camera (Flexible)
12
Think of Danny as the Statistical Analysis continuum:
Descriptive Statistics: Counts, Percentages, Averages, Standard Deviations
Inferential Statistics: Tests of Significance
13
Aggregate / Descriptive vs. Microdata / Inferential
Data continuum: A number, then Tables, Charts, Graphs, then Raw Survey Data
Statistical Analysis continuum: Counts, Percentages, Averages,
Standard Deviations, Significance testing
14
To review:
Data: Aggregate Data; Survey Data (Microdata)
Statistical Analysis: Counts, Percentages,
Averages, Standard Deviations, Cross-tabulations,
t-tests, Regression, etc.
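To make the two ends of the statistical analysis continuum concrete, here is a small sketch using only Python's standard library. The two groups and their values are invented. The t statistic uses Welch's formula, one common choice, written out by hand here; it is not necessarily what SPSS or SAS would report by default.

```python
import math
import statistics

# Invented weekly hours online for two made-up groups of respondents.
group_a = [2, 4, 6, 8]
group_b = [1, 2, 3, 4]

# Descriptive statistics: summarize what is in the file.
count_a = len(group_a)
mean_a = statistics.mean(group_a)
stdev_a = statistics.stdev(group_a)  # sample standard deviation
share_over_3_a = 100.0 * sum(1 for h in group_a if h > 3) / count_a


def welch_t(x, y):
    """Welch's t statistic for the difference between two sample means."""
    standard_error = math.sqrt(
        statistics.variance(x) / len(x) + statistics.variance(y) / len(y)
    )
    return (statistics.mean(x) - statistics.mean(y)) / standard_error


# Inferential statistics: is the gap between the groups larger than
# sampling noise alone would suggest? The statistic still has to be
# compared against a t distribution, which a statistics package does.
t_statistic = welch_t(group_a, group_b)
print(count_a, mean_a, stdev_a, share_over_3_a, t_statistic)
```

The descriptive values can be computed from a published table or from raw data. The significance test needs the individual responses, which is why it sits at the microdata end of the continuum.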
15
Reference Question Example
How many of you have had a patron arrive at the
Reference Desk with a newspaper article reporting
Statistics Canada data?
16
Globe & Mail, Dec 17, 2001, p. A15
71% of 15- to 17-year-olds use online chat
rooms, double the proportion of the only slightly
older 20- to 24-year-olds.
17
First, note that the article says "Statistics
Canada, in a study released last week ..." So
where do you go from here?
18
First, let's try
http://www.statcan.ca/start.html
19
Which leads you to the following
20
Which leads, in turn, to
Canadian Social Trends, Winter 2001
Here is the statistic quoted in the Globe
and here is the source
21
So how do we check out this source?
General Social Survey, 2000
DLI Web Site (or Local Data Centre)
http://www.statcan.ca/english/Dli/dli.htm
22
(No Transcript)
23
Documentation
and Data
24
So, going to your campus Data Centre
http://library.queensu.ca/webdoc/ssdc/key.htm
25
(No Transcript)
26
AGEGR5 less than or equal to 3
27
(No Transcript)
28
(No Transcript)
29
(No Transcript)
30
Results
31
Canadian Social Trends vs. our cross-tab?
32
Reply from Statistics Canada
The difference in the numbers is because I used
the variable H19 while your client is using the
variable H20. H19 asked respondents who had used
the Internet in the last year, if they had ever
used the Internet to connect to an ONLINE CHAT
SERVICE. H20 asked respondents how often they
used the Internet to connect to an online chat
service in the last month.
An errata will be issued for the table appearing
in CST because the table does not show
percentages for those who used the Net in the
last month but for those who used the Net in the
last year.
So let's try again with H19
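The effect described in the reply can be sketched with a tiny cross-tab. The records and the yes/no codes below are invented (the real codebook defines the actual values of H19 and H20). Only the variable names and the AGEGR5 filter come from the slides.

```python
# Invented records; the codes are assumptions, not the real codebook.
# H19: 1 = has ever used an online chat service, 2 = has not.
# H20: simplified here to 1 = used one in the last month, 2 = did not.
records = [
    {"AGEGR5": 1, "H19": 1, "H20": 1},
    {"AGEGR5": 1, "H19": 1, "H20": 2},
    {"AGEGR5": 1, "H19": 2, "H20": 2},
    {"AGEGR5": 1, "H19": 1, "H20": 1},
    {"AGEGR5": 3, "H19": 1, "H20": 2},
    {"AGEGR5": 3, "H19": 2, "H20": 2},
    {"AGEGR5": 3, "H19": 2, "H20": 2},
    {"AGEGR5": 3, "H19": 1, "H20": 2},
    {"AGEGR5": 5, "H19": 1, "H20": 1},
]


def percent_yes(rows, variable_name, group):
    """Percentage of one age group answering 1 ("yes") to a variable."""
    in_group = [r for r in rows if r["AGEGR5"] == group]
    yes = sum(1 for r in in_group if r[variable_name] == 1)
    return 100.0 * yes / len(in_group)


# Same selection as the slide: AGEGR5 less than or equal to 3.
subset = [r for r in records if r["AGEGR5"] <= 3]
groups = sorted({r["AGEGR5"] for r in subset})

# One cross-tab per variable: same respondents, different question.
crosstab = {
    name: {g: percent_yes(subset, name, g) for g in groups}
    for name in ("H19", "H20")
}
print(crosstab)
```

The respondents are identical in both columns. Only the question changes, and that alone is enough to produce two different sets of percentages.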
33
So we need
34
(No Transcript)
35
The numbers match!
AND you'll note the table now says "last 12
months"
36
Original Table (Dec 2001)
Revised Table (Jan 2002)
37
So: we can use survey files to verify published
results.
But: we can also use survey files to expand on
published results and explore new avenues of
research.
  • For example:
  • What is the influence of gender, education, or
    income on Internet use?
  • Are there differences between provinces? Between
    URBAN and RURAL dwellers?
  • Or any number of other dimensions: any question
    asked in the survey.
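As a rough sketch of that kind of exploration, the same few lines can break a result down by any variable in the file. The respondents, variable names (SEX, REGION, USES_NET) and codes below are invented. Published figures normally apply survey weights, which this sketch ignores.

```python
from collections import defaultdict

# Invented respondents; SEX, REGION and USES_NET are hypothetical variable
# names and codes, not taken from any survey codebook.
respondents = [
    {"SEX": "F", "REGION": "URBAN", "USES_NET": 1},
    {"SEX": "F", "REGION": "RURAL", "USES_NET": 0},
    {"SEX": "M", "REGION": "URBAN", "USES_NET": 1},
    {"SEX": "M", "REGION": "URBAN", "USES_NET": 0},
    {"SEX": "F", "REGION": "URBAN", "USES_NET": 1},
    {"SEX": "M", "REGION": "RURAL", "USES_NET": 0},
]


def percent_by(rows, dimension, outcome):
    """Unweighted percentage with outcome == 1, per value of a dimension."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += 1
        hits[row[dimension]] += row[outcome]
    return {group: 100.0 * hits[group] / totals[group] for group in totals}


# Any variable in the file can serve as the breakdown dimension.
by_sex = percent_by(respondents, "SEX", "USES_NET")
by_region = percent_by(respondents, "REGION", "USES_NET")
print(by_sex, by_region)
```

Swapping "SEX" for another variable name is all it takes to ask a new question. That is the flexibility the postcard and camera analogy points to.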

38
Aggregate Data: Postcard (Fixed)
Survey Data: Camera (Flexible)
39
Sources of Aggregate Data
  • print
      • e.g., Canada Year Book, STC print publications
  • CD-ROM
      • e.g., 1996 Census Profiles, LFHR, other DSP
        products
  • Web-based
      • The Daily
      • Canadian Statistics
      • PDF versions of print publications
  • Beyond 20/20 Files: multidimensional tables
  • CANSIM / E-Stat: time series

40
Beyond 20/20: what is it?
  • Used to display multidimensional data, i.e., more
    than 3 dimensions or characteristics at once
  • e.g., age, sex (usually 3!), geography, date,
    etc. ...
  • allows user to customize the display of the data
  • very useful for aggregate data, less so for
    microdata
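To picture what "multidimensional" means here, a table with four dimensions can be held as cells keyed by (age, sex, geography, year) and then collapsed to whichever two dimensions the user wants to see. This is only an illustration of the idea with invented counts. It is not how the Beyond 20/20 software stores or displays its files.

```python
from collections import defaultdict

# Invented counts keyed by four dimensions: (age, sex, geography, year).
DIMENSIONS = ("age", "sex", "geography", "year")
cells = {
    ("15-24", "F", "Ontario", 1996): 10,
    ("15-24", "M", "Ontario", 1996): 12,
    ("25-44", "F", "Ontario", 1996): 20,
    ("25-44", "M", "Quebec", 1996): 18,
    ("15-24", "F", "Quebec", 2001): 7,
}


def view(table, rows, cols, **fixed):
    """Collapse the table to two dimensions, optionally fixing others."""
    r, c = DIMENSIONS.index(rows), DIMENSIONS.index(cols)
    result = defaultdict(int)
    for key, value in table.items():
        # Keep only cells that match every fixed dimension, then sum
        # over the dimensions that are not displayed.
        if all(key[DIMENSIONS.index(d)] == v for d, v in fixed.items()):
            result[(key[r], key[c])] += value
    return dict(result)


# Two different displays of the same underlying aggregate data.
by_age_and_sex = view(cells, "age", "sex", year=1996)
by_geography_and_year = view(cells, "geography", "year")
print(by_age_and_sex)
print(by_geography_and_year)
```

The underlying cells never change. Only the choice of row, column, and fixed dimensions does, which is the kind of customized display described above.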

41
Beyond 20/20: what is it used for/in?
  • used in an increasing number of STC products,
  • many CD-ROM DSP products,
  • e.g., LFHR, ITC, Profiles, Nation Series,
    Dimensions, etc.
  • one of available formats on E-Stat

42
(No Transcript)
43
CANSIM
  • acronym for CANadian Socio-Economic Information
    Management System
  • time-series data
  • available:
      • direct from STC ()
      • via E-Stat (free to registered institutions)
      • via DLI (from UofT)

44
CANSIM II via E-Stat
45
(No Transcript)
46
(No Transcript)
47
(No Transcript)
48
(No Transcript)
49
(No Transcript)
50
(No Transcript)
51
(No Transcript)
52
(No Transcript)
53
(No Transcript)
54
(No Transcript)
55
Dealing with data really isn't that hard ...
56
Don't be afraid to ask for help!
57
(No Transcript)