Title: hints
1(No Transcript)
2Data Reference (the very, very basics)
3(No Transcript)
4Data-reference: what do we need?
- Tools
- Strategies
- Terminology
- Understanding of what we are looking for: not
books, articles, or facts.
5Data-reference: what do we need?
- Understanding of what we are looking for: not
books, articles, or facts.
- Terminology
- Strategies
- Tools
6(No Transcript)
7La trahison des images (The Treachery of Images),
René Magritte
8Ceci n'est pas les data.
C'est les statistiques!
(This is not the data. It's the statistics!)
9Data
- Raw (for analysis)
- Intended for use by computer
- Computer-readable
- Collected based on social science methodologies
or administrative procedures
Statistics
- Cooked (facts)
- For human use: eye-readable charts, tables,
graphs
- Can be print, micro, computer readable
- Produced from data
10Data
11Statistics
12Where do statistical babies come from?
13Data or Statistics: Why does it matter?
- Different search strategies and tools.
- Defines your goal.
- Helps you know when you've found it!
14Tip: Data or Statistics?
- Determine if the user wants (needs) statistics or
data.
- Do you want one number?
- Are you looking for a fact or figure?
- Do you want to know how many?
15Tip: Data or Statistics?
- Determine if the user wants (needs) statistics or
data.
- Or do you want a series of numbers?
- Do you want to identify trends, make comparisons,
model relationships?
- Will you be using statistical software (not
Excel)?
16(No Transcript)
17 http://factfinder.census.gov/
18 http://www.census.gov/compendia/statab/elections/election.pdf
19 http://www.census.gov/compendia/statab/tables/06s0405.xls
20 ftp://ftp.bls.gov/pub/special.requests/lf/aat44.txt
21 http://www.bls.gov/webapps/legacy/cpsatab7.htm
22(No Transcript)
23(No Transcript)
24From survey to data to statistics
Survey instrument
Q1. Enter zip code
Q2. Enter R's first name
Q3. Enter sex of R
Q4. What was your major in college?
Q5. What was your income last year?
Q6. Did you go to church last week?
25Answers to Questions
Zip Name Sex Major income church
29002 Wilma F lit 0 y
99005 Barney M engin 10 n
99005 Betty F . 0 n
92005 Ethel F theater 1000 y
12534 Fred M. M PE 10000 y
12534 Lucy F lit 700 y
25000 Ricky M music 11000 y
20000 Fred A. M dance 10500 n
15000 Ginger F math 9500 y
26Must anonymize the data!
Zip Name Sex Major income church
29002 Wilma F lit 0 y
99005 Barney M engin 10 n
99005 Betty F . 0 n
92005 Ethel F theater 1000 y
12534 Fred M. M PE 10000 y
12534 Lucy F lit 700 y
25000 Ricky M music 11000 y
20000 Fred A. M dance 10500 n
15000 Ginger F math 9500 y
27Must anonymize the data!
Zip Name Sex Major income church
29002 001 F lit 0 y
99005 002 M engin 10 n
99005 003 F . 0 n
92005 004 F theater 1000 y
12534 005 M PE 10000 y
12534 006 F lit 700 y
25000 007 M music 11000 y
20000 008 M dance 10500 n
15000 009 F math 9500 y
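The anonymizing step shown on these two slides (names replaced by sequence numbers) could be sketched in Python roughly as follows. This is only an illustration: the function name `anonymize` and the tuple layout are assumptions, and the three sample rows are copied from the slide.

```python
# Sample rows copied from the slide: (zip, name, sex, major, income, church).
# A missing major (shown as "." on the slide) is kept as "." here.
rows = [
    ("29002", "Wilma", "F", "lit", 0, "y"),
    ("99005", "Barney", "M", "engin", 10, "n"),
    ("99005", "Betty", "F", ".", 0, "n"),
]

def anonymize(rows):
    """Replace each respondent's name with a zero-padded sequence number."""
    return [
        (zip_code, f"{i:03d}", *rest)
        for i, (zip_code, _name, *rest) in enumerate(rows, start=1)
    ]

anonymized = anonymize(rows)
for row in anonymized:
    print(row)
```

The sequence number keeps each record distinguishable without carrying the respondent's name into the data file.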
28Change Text to Numeric Codes
Zip Name Sex Major income church
29002 001 F lit 0 y
99005 002 M engin 10 n
99005 003 F . 0 n
92005 004 F theater 1000 y
12534 005 M PE 10000 y
12534 006 F lit 700 y
25000 007 M music 11000 y
20000 008 M dance 10500 n
15000 009 F math 9500 y
29Change Text to Numeric Codes
Zip Name Sex Major income church
29002 001 1 lit 0 y
99005 002 2 engin 10 n
99005 003 1 . 0 n
92005 004 1 theater 1000 y
12534 005 2 PE 10000 y
12534 006 1 lit 700 y
25000 007 2 music 11000 y
20000 008 2 dance 10500 n
15000 009 1 math 9500 y
30Change Text to Numeric Codes
The codebook must document the numeric codes used!
For example:
Variable: sex
1 = female
2 = male
Zip Name Sex Major income church
29002 001 1 lit 0 y
99005 002 2 engin 10 n
99005 003 1 . 0 n
92005 004 1 theater 1000 y
12534 005 2 PE 10000 y
12534 006 1 lit 700 y
25000 007 2 music 11000 y
20000 008 2 dance 10500 n
15000 009 1 math 9500 y
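As a rough sketch of this recoding step, the codebook entry for sex can be written as a lookup table and applied to the text answers. The names `SEX_CODES` and `recode_sex` are made up for illustration; the codes themselves (1 = female, 2 = male) come from the slide.

```python
# Codebook entry from the slide: variable "sex", 1 = female, 2 = male.
SEX_CODES = {"F": 1, "M": 2}

def recode_sex(value):
    """Turn the text answer ("F" or "M") into its numeric code."""
    return SEX_CODES[value]

# The Sex column of the nine respondents, in slide order.
raw_sex = ["F", "M", "F", "F", "M", "F", "M", "M", "F"]
coded_sex = [recode_sex(v) for v in raw_sex]
print(coded_sex)
```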
31Change Text to Numeric Codes
Zip Name Sex Major income church
29002 001 1 0075 0 y
99005 002 2 0070 10 n
99005 003 1 . 0 n
92005 004 1 0076 1000 y
12534 005 2 0001 10000 y
12534 006 1 0075 700 y
25000 007 2 0077 11000 y
20000 008 2 0078 10500 n
15000 009 1 0050 9500 y
32Change Text to Numeric Codes
Zip Name Sex Major income church
29002 001 1 0075 0 1
99005 002 2 0070 10 2
99005 003 1 . 0 2
92005 004 1 0076 1000 1
12534 005 2 0001 10000 1
12534 006 1 0075 700 1
25000 007 2 0077 11000 1
20000 008 2 0078 10500 2
15000 009 1 0050 9500 1
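The same idea extends to several variables at once. The sketch below assumes a small dictionary-based codebook; the code values for major and church are read off by comparing this table with the earlier text version, and the names `CODEBOOK`, `MISSING`, and `recode` are illustrative only.

```python
# Codes inferred by comparing the text table with the coded table on the slides.
CODEBOOK = {
    "major": {
        "lit": "0075", "engin": "0070", "theater": "0076", "PE": "0001",
        "music": "0077", "dance": "0078", "math": "0050",
    },
    "church": {"y": "1", "n": "2"},
}
MISSING = "."  # the slides show a missing answer as "."

def recode(variable, value):
    """Look up a text answer in the codebook; pass missing values through."""
    if value == MISSING:
        return MISSING
    return CODEBOOK[variable][value]

# (major, church) answers for the first three respondents.
answers = [("lit", "y"), ("engin", "n"), (".", "n")]
coded = [(recode("major", m), recode("church", c)) for m, c in answers]
print(coded)
```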
33Change Text to Numeric Codes
Zip Name Sex Major income church
29002 001 1 lit 0 y
99005 002 2 engin 10 n
99005 003 1 . 0 n
92005 004 1 theater 1000 y
12534 005 2 PE 10000 y
12534 006 1 lit 700 y
25000 007 2 music 11000 y
20000 008 2 dance 10500 n
15000 009 1 math 9500 y
34Change Text to Numeric Codes
Zip Name Sex Major income church
29002 001 1 0075 0 y
99005 002 2 engin 10 n
99005 003 1 . 0 n
92005 004 1 theater 1000 y
12534 005 2 PE 10000 y
12534 006 1 0075 700 y
25000 007 2 music 11000 y
20000 008 2 dance 10500 n
15000 009 1 math 9500 y
35Change Text to Numeric Codes
Zip Name Sex Major income church
29002 001 1 0075 0 y
99005 002 2 0070 10 n
99005 003 1 . 0 n
92005 004 1 0076 1000 y
12534 005 2 0001 10000 y
12534 006 1 0075 700 y
25000 007 2 0077 11000 y
20000 008 2 0078 10500 n
15000 009 1 0050 9500 y
36Change Text to Numeric Codes
Sometimes, even numeric variables are encoded in
ranges. For example:
Variable: income
1 = less than 1000
2 = 1000 - 4999
3 = 5000 - 10000
4 = more than 10000
9 = not reported
Zip Name Sex Major income church
29002 001 1 0075 0 1
99005 002 2 0070 10 2
99005 003 1 . 0 2
92005 004 1 0076 1000 1
12534 005 2 0001 10000 1
12534 006 1 0075 700 1
25000 007 2 0077 11000 1
20000 008 2 0078 10500 2
15000 009 1 0050 9500 1
37Change Text to Numeric Codes
Sometimes, even numeric variables are encoded in
ranges. For example:
Variable: income
1 = less than 1000
2 = 1000 - 4999
3 = 5000 - 10000
4 = more than 10000
9 = not reported
Zip Name Sex Major income church
29002 001 1 0075 1 1
99005 002 2 0070 1 2
99005 003 1 . 1 2
92005 004 1 0076 2 1
12534 005 2 0001 3 1
12534 006 1 0075 1 1
25000 007 2 0077 4 1
20000 008 2 0078 4 2
15000 009 1 0050 3 1
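The range coding for income can be sketched as a small function. The boundaries follow the codebook entry on the slide; the function name `income_code` and the use of `None` for "not reported" are assumptions made for this example.

```python
def income_code(income):
    """Map a reported income to the range code from the slide's codebook."""
    if income is None:   # assumption: None stands for "not reported"
        return 9
    if income < 1000:
        return 1         # less than 1000
    if income < 5000:
        return 2         # 1000 - 4999
    if income <= 10000:
        return 3         # 5000 - 10000
    return 4             # more than 10000

# The income column of the nine respondents, in slide order.
incomes = [0, 10, 0, 1000, 10000, 700, 11000, 10500, 9500]
codes = [income_code(x) for x in incomes]
print(codes)
```

Once the incomes are stored as range codes, the original amounts cannot be recovered from the data file, which is one more reason the codebook has to record the ranges.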
38Data Files do not need headers
Zip Name Sex Major income church
29002 001 1 0075 1 1
99005 002 2 0070 1 2
99005 003 1 . 1 2
92005 004 1 0076 2 1
12534 005 2 0001 3 1
12534 006 1 0075 1 1
25000 007 2 0077 4 1
20000 008 2 0078 4 2
15000 009 1 0050 3 1
39Data Files do not need headers
29002 001 1 0075 1 1
99005 002 2 0070 1 2
99005 003 1 . 1 2
92005 004 1 0076 2 1
12534 005 2 0001 3 1
12534 006 1 0075 1 1
25000 007 2 0077 4 1
20000 008 2 0078 4 2
15000 009 1 0050 3 1
40Data Files do not need extra space
29002 001 1 0075 1 1
99005 002 2 0070 1 2
99005 003 1 . 1 2
92005 004 1 0076 2 1
12534 005 2 0001 3 1
12534 006 1 0075 1 1
25000 007 2 0077 4 1
20000 008 2 0078 4 2
15000 009 1 0050 3 1
41Data Files do not need extra space
290020011 0075 1 1
990050022 0070 1 2
990050031 . 1 2
920050041 0076 2 1
125340052 0001 3 1
125340061 0075 1 1
250000072 0077 4 1
200000082 0078 4 2
150000091 0050 3 1
42Data Files do not need extra space
2900200110075 1 1
9900500220070 1 2
990050031.    1 2
9200500410076 2 1
1253400520001 3 1
1253400610075 1 1
2500000720077 4 1
2000000820078 4 2
1500000910050 3 1
43Data Files do not need extra space
29002001100751 1
99005002200701 2
990050031.   1 2
92005004100762 1
12534005200013 1
12534006100751 1
25000007200774 1
20000008200784 2
15000009100503 1
44Data Files do not need extra space
290020011007511
990050022007012
990050031.   12
920050041007621
125340052000131
125340061007511
250000072007741
200000082007842
150000091005031
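Squeezing out the extra space amounts to writing each variable at a fixed width. A minimal sketch, assuming the widths that can be read off the records above (zip 5, id 3, sex 1, major 4, income 1, church 1); the names `LAYOUT` and `pack` are illustrative, and padding a missing major with blanks is an assumption.

```python
# (variable, width) in column order; widths are read off the slide's records.
LAYOUT = [("zip", 5), ("id", 3), ("sex", 1), ("major", 4), ("income", 1), ("church", 1)]

def pack(record):
    """Concatenate the fields, left-justifying each one to its fixed width."""
    return "".join(str(record[name]).ljust(width) for name, width in LAYOUT)

first = {"zip": "29002", "id": "001", "sex": 1, "major": "0075", "income": 1, "church": 1}
third = {"zip": "99005", "id": "003", "sex": 1, "major": ".", "income": 1, "church": 2}

line_first = pack(first)
line_third = pack(third)
print(line_first)
print(line_third)
```

Every record comes out 15 characters long, so a given variable always sits in the same columns.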
45Codebook must document locations
290020011007511
990050022007012
990050031.   12
920050041007621
125340052000131
125340061007511
250000072007741
200000082007842
150000091005031
For example:
Variable: sex
Location: column 9
Width: 1
46Codebook must document locations
123456789
290020011007511
990050022007012
990050031.   12
920050041007621
125340052000131
125340061007511
250000072007741
200000082007842
150000091005031
For example:
Variable: sex
Location: column 9
Width: 1
47Codebook documents question, location, codes.
290020011007511
990050022007012
990050031.   12
920050041007621
125340052000131
125340061007511
250000072007741
200000082007842
150000091005031
For example:
Q3. Enter sex of R
Variable: sex
Location: column 9
Width: 1
1 = female
2 = male
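To illustrate why the codebook must document locations, here is a hedged sketch of reading one variable out of a packed record. Only the entry for sex (column 9, width 1, codes 1 and 2) comes from the slide; the entries for zip and id are inferred from the record layout, and the names `CODEBOOK` and `read_variable` are hypothetical.

```python
# Column numbers are 1-based, as in the slide's codebook example.
CODEBOOK = {
    "zip": {"column": 1, "width": 5},   # inferred from the layout
    "id": {"column": 6, "width": 3},    # inferred from the layout
    "sex": {"column": 9, "width": 1, "codes": {"1": "female", "2": "male"}},
}

def read_variable(line, name):
    """Slice a variable out of a fixed-width record and decode it if possible."""
    entry = CODEBOOK[name]
    start = entry["column"] - 1         # convert to a 0-based index
    raw = line[start:start + entry["width"]]
    return entry.get("codes", {}).get(raw, raw)

line = "290020011007511"
zip_value = read_variable(line, "zip")
sex_value = read_variable(line, "sex")
print(zip_value, sex_value)
```

Without the location and the code list, the character in column 9 is just a digit with no recoverable meaning.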
48To Use Data You Need 3 Things
- Data the datafile (the raw numbers)
- Metadata the codebook (where the numbers are
and what they mean)
- Statistical Software (for reading the datafile
and analyzing the data)
49Data
290020011007511
990050022007012
990050031.   12
920050041007621
125340052000131
125340061007511
250000072007741
200000082007842
150000091005031
Codebook
Q3. Enter sex of R
Variable: sex
Location: column 9
Width: 1
1 = female
2 = male
Statistical software
Statistical software
50Student writes SPSS program to analyze data
SPSS commands
SPSS reads the program.
290020011007511
990050022007012
990050031.   12
920050041007621
125340052000131
125340061007511
250000072007741
200000082007842
150000091005031
SPSS reads the data.
And produces charts, tables, analysis, etc.
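The slide uses SPSS; as a stand-in, the following Python sketch shows the same data-to-statistics step under stated assumptions. It reads the packed records, uses the codebook location for sex (column 9), and produces a frequency table. The names `DATA`, `SEX_LABELS`, and `frequency` are illustrative, and the blank padding in the third record is assumed.

```python
from collections import Counter

# The nine packed records from the slides (one record per line).
DATA = [
    "290020011007511",
    "990050022007012",
    "990050031.   12",
    "920050041007621",
    "125340052000131",
    "125340061007511",
    "250000072007741",
    "200000082007842",
    "150000091005031",
]
SEX_LABELS = {"1": "female", "2": "male"}  # from the codebook entry for sex

def frequency(lines, column, labels):
    """Count the codes found in a 1-based column and label them."""
    counts = Counter(line[column - 1] for line in lines)
    return {labels[code]: n for code, n in counts.items()}

sex_counts = frequency(DATA, 9, SEX_LABELS)
print(sex_counts)
```

The resulting counts are statistics: facts produced from the data, which is the distinction the earlier slides draw.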
51(No Transcript)
52Codebook entry for variable PRES92
53Codebook entry for variable DEGREE
54Voted for Clinton
Junior college
55(No Transcript)
56(No Transcript)
57(No Transcript)
58(No Transcript)
59Tip: "Variables" contain the essential, important
content of data files
60Tip: Data-reference is not about searching for an
answer
- Data reference is often less about searching to
find an answer. (That's a statistical reference
question.)
- Data reference is often more about exploring to
find data that will enable users to ask a
question.
61What have we learned?
- Data and statistics are not the same
- Data reference leads to primary research
material, not facts or statistics.
- To use data, a user must have data, metadata, and
statistical software.
- A-and
62What have we learned?
- "Variables" are what contain critical, important
content of data files.
- And that means that the gold standard of
data-reference is variable-level searching.
63(No Transcript)
64 http://gort.ucsd.edu/calpol/
65(No Transcript)
66(No Transcript)