Survey Methodology - PowerPoint PPT Presentation

Provided by: chriswe

Transcript and Presenter's Notes


1
Survey Methodology
  • Thanks to Todd Williams for material.

2
What is a survey?
  • A survey is a systematic collection of data
    about a sample drawn from a specified larger
    population
  • Aim is generally to draw conclusions about the
    population from a sample
  • Can be quasi-experimental or non-experimental in
    nature
  • Quasi-experimental: manipulate some aspect of
    the survey or the population to whom it is given
  • Non-experimental: randomly sample the whole
    population

3
Major elements of a survey
4
Research Objectives and Constraints
  • Integrate your research goals with your financial
    and temporal constraints
  • Survey design should reflect your research
    objectives
  • This seems obvious but isn't always the case
  • Accuracy costs money: good surveys are
    unavoidably expensive

5
Sources of Error in Surveys
  • In surveys, error comes from two main sources
  • Sampling error: sample design error and
    estimation error
  • Non-sampling error: non-response error,
    measurement error, and processing error
  • All sources of error contribute to the overall
    error of the survey

6
1. Sample design error
  • Sample design error refers to inaccuracies caused
    by sampling a subset of the defined population
  • E.g., the 1936 U.S. presidential election between
    Alfred Landon and Franklin D. Roosevelt was
    famously miscalled by the Literary Digest, which
    mailed out about 10 million postcard ballots to
    U.S. residents, based on phonebooks and car
    registrations, and got over 2 million back
  • The problem: those who owned cars and had
    telephones were wealthier than most, and strongly
    biased towards conservatism.

7
Sample design error
  • Sampling non-probabilistically causes many
    problems
  • No statistical basis for evaluating how your
    sample represents the population
  • Confidence intervals and error margins are not
    applicable
  • Three main sampling methods are covered here
    (only the last two are probability methods)
  • Quota Sampling
  • Random Sampling
  • Cluster Sampling

8
Quota Sampling
  • A young researcher named George Gallup correctly
    predicted the Literary Digest fiasco in advance,
    advocating quota sampling
  • Quota sampling involves trying to define
    characteristics that are relevant to the attitude
    being measured and then constructing a sample
    that reflects the distribution of those
    characteristics in the population
  • Problems: it is difficult to know in advance what
    characteristics are relevant to the attitude
    being measured, and it may be difficult to
    identify people with the right characteristics.
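
The quota idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the group names, population shares, and the helper names `quota_targets` and `fill_quotas` are hypothetical and do not come from the slides.

```python
def quota_targets(population_shares, sample_size):
    """Turn population shares into a per-group recruitment quota.

    Simple rounding is used, so the quotas may not sum exactly to
    sample_size for awkward shares.
    """
    return {group: round(share * sample_size)
            for group, share in population_shares.items()}


def fill_quotas(respondents, targets):
    """Accept respondents in arrival order until each group's quota is full.

    respondents: iterable of (person_id, group) pairs.
    """
    remaining = dict(targets)
    accepted = []
    for person, group in respondents:
        if remaining.get(group, 0) > 0:
            accepted.append(person)
            remaining[group] -= 1
    return accepted


# Hypothetical population: 60% urban, 40% rural; we want 10 respondents.
targets = quota_targets({"urban": 0.6, "rural": 0.4}, 10)
arrivals = [(i, "urban" if i % 4 else "rural") for i in range(40)]
accepted = fill_quotas(arrivals, targets)
```

Note that nothing here is random: whoever turns up first in each group is accepted, which is exactly why quota samples give no statistical basis for error margins.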

9
Random Sampling
  • For this reason, most modern surveys use random
    sampling (can you think of a common exception?)
  • In random sampling, each person in the population
    has the same probability of being chosen.
  • So long as you can truly make all people
    equi-probable (or, more precisely, know the
    probabilities with which people are likely to be
    chosen, i.e. probability sampling), this is the
    best way to go
  • But it is not always easy to make all people
    equi-probable
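
A minimal sketch of equal-probability sampling, assuming a complete list of the population (a sampling frame) is available. The frame contents and the function name `simple_random_sample` are hypothetical.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every unit is equally likely."""
    rng = random.Random(seed)
    return rng.sample(list(frame), n)


# Hypothetical sampling frame of 10,000 residents.
frame = [f"resident_{i}" for i in range(10_000)]
sample = simple_random_sample(frame, 1000, seed=42)
```

The hard part in practice is not the draw but the frame: anyone missing from the list has a selection probability of zero.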

10
Random Sampling
  • One (annoyingly) common method is Random-Digit
    Dialing, in which computer-generated random
    phone numbers (or numbers just slightly different
    from known real numbers) are called
  • Benefits: truly random, and can get unlisted
    numbers
  • Drawbacks: may generate inappropriate numbers
    (unactivated phone numbers, cell phone numbers of
    people already contacted, fax numbers, business
    numbers); may have a 4-year-old explaining the
    family finances
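
Random-digit dialing can be sketched as drawing the last four digits at random within known exchanges. The area code, the prefix, and the function name `random_digit_numbers` below are placeholders chosen for illustration, not a description of how any real polling firm works.

```python
import random


def random_digit_numbers(area_code, prefixes, count, seed=None):
    """Generate `count` distinct numbers with random 4-digit line numbers."""
    if count > len(set(prefixes)) * 10_000:
        raise ValueError("count exceeds the number of distinct numbers available")
    rng = random.Random(seed)
    numbers = set()
    while len(numbers) < count:
        prefix = rng.choice(prefixes)
        numbers.add(f"{area_code}-{prefix}-{rng.randrange(10_000):04d}")
    return sorted(numbers)


# Placeholder area code and prefix.
numbers = random_digit_numbers("780", ["555"], 5, seed=0)
```

Nothing in the sketch knows whether a generated number is active, residential, or a fax line, which is the drawback listed above.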

11
Cluster Sampling
  • For pragmatic reasons, it is desirable to have
    participants in a survey clustered together so
    that many can be interviewed in a small
    geographical area
  • Cluster sampling allows random sampling within
    such clusters.
  • The idea is to stratify the population,
    randomizing units at each level
  • The population is divided into units at different
    levels and at each level one chooses randomly
    where to sample
  • For example, one might start with voting ridings
    or counties: choose some randomly, and discard
    the rest
  • Within each chosen riding or county, choose some
    blocks (or wards or streets) randomly, and
    discard the rest
  • Within each block, choose some households
    randomly, and discard the rest
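
The county, block, household procedure above can be sketched as nested random draws. The nested-dictionary frame, its sizes, and the name `multistage_sample` are assumptions made for the example.

```python
import random


def multistage_sample(population, n_counties, n_blocks, n_households, seed=None):
    """Randomly choose counties, then blocks within them, then households.

    population: dict mapping county -> dict mapping block -> list of households.
    """
    rng = random.Random(seed)
    chosen = []
    for county in rng.sample(sorted(population), n_counties):
        blocks = population[county]
        for block in rng.sample(sorted(blocks), min(n_blocks, len(blocks))):
            households = blocks[block]
            k = min(n_households, len(households))
            chosen.extend((county, block, h) for h in rng.sample(households, k))
    return chosen


# Hypothetical frame: 20 counties x 10 blocks x 30 households.
population = {
    f"county_{c}": {
        f"block_{b}": [f"hh_{c}_{b}_{h}" for h in range(30)]
        for b in range(10)
    }
    for c in range(20)
}
sample = multistage_sample(population, n_counties=4, n_blocks=3,
                           n_households=5, seed=1)
```

Interviewers then only need to visit 4 counties and 12 blocks to reach 60 households, which is the pragmatic appeal of the design.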

12
2. Estimation error
  • Estimation error is the standard statistical
    error involved in estimating anything.
  • Sampling error is what is commonly reported as
    the error margin in survey results (but we will
    see in a minute that it must be an
    under-estimate)
  • We have already seen how it is computed for
    surveys

13
SEproportions
  • SD = the square root of the variance
  • The variance for data with a proportion p of 1s
    and q = 1 - p of 0s is simply pq
  • So the SD = $\sqrt{pq}$
  • And $SE_{\text{proportions}} = \sqrt{pq} / \sqrt{N}$
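
For readers who want the missing step, here is a short derivation of why the variance of 0/1 data is pq. It assumes responses are coded 1 with proportion p and 0 with proportion q = 1 - p, so the mean is p.

```latex
\begin{align*}
\sigma^2 &= p\,(1-p)^2 + q\,(0-p)^2 = p\,q^2 + q\,p^2 = pq\,(q+p) = pq \\
SD &= \sqrt{pq} \\
SE_{\text{proportions}} &= \frac{SD}{\sqrt{N}} = \frac{\sqrt{pq}}{\sqrt{N}}
\end{align*}
```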

14
SEproportions Example
  • Example: 40% of a sample of 1000 Albertans vote
    to separate from Canada. What are the SE and the
    95% confidence interval?
  • $SE = \sqrt{p(1-p)} / \sqrt{N}$
  • $SE = \sqrt{0.40 \times 0.60} / \sqrt{1000}$
  • $\approx 0.015$
  • The 95% confidence interval is 40% ± 1.96 ×
    0.015 ≈ 40% ± 2.9 percentage points
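
The Albertan example can be checked numerically. This is a small sketch; the function names are illustrative, and z = 1.96 is the usual normal-approximation multiplier for a 95% interval.

```python
import math


def se_proportion(p, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


def margin_of_error(p, n, z=1.96):
    """Half-width of the normal-approximation confidence interval."""
    return z * se_proportion(p, n)


p, n = 0.40, 1000
se = se_proportion(p, n)
moe = margin_of_error(p, n)
print(f"SE = {se:.4f}, 95% CI = {p - moe:.3f} to {p + moe:.3f}")
# prints: SE = 0.0155, 95% CI = 0.370 to 0.430
```

Using the unrounded SE gives a margin of about 3.0 percentage points; the slide's 2.9 comes from rounding the SE to 0.015 before multiplying by 1.96.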

15
How important is sample size ?
  • Error diminishes non-linearly as a function of
    sample size
  • Example (assuming 60% in favour)
  • N = 100: SE = 4.9%
  • N = 500: SE = 2.2%
  • N = 1000: SE = 1.5%
  • N = 2000: SE = 1.1%
  • N = 4000: SE = 0.8%
  • (All other factors being held constant).
  • Most major surveys include at least 1000 people.
  • Notice that the size of the population doesn't
    matter (e.g., polling Edmonton versus Canada)
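
The diminishing return from larger samples follows directly from the square root in the SE formula. A quick sketch, assuming p = 0.60 as in the example (the function name is illustrative, and computed values may differ in the last digit from hand-rounded figures):

```python
import math


def se_percent(p, n):
    """Standard error of a proportion, expressed in percentage points."""
    return 100 * math.sqrt(p * (1 - p) / n)


for n in (100, 500, 1000, 2000, 4000):
    print(f"N = {n:>4}: SE = {se_percent(0.60, n):.1f}%")
```

Because the SE scales with 1/sqrt(N), quadrupling the sample (1000 to 4000) only halves the error. The population size does not appear in the formula at all, which is why it does not matter as long as the population is much larger than the sample.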

16
1. Non-Response Error
  • The first form of non-sampling error is
    non-response error
  • Non-response error refers to failure to obtain
    responses from individuals selected for the
    sample
  • Non-contacts: impact can be lessened by
    systematic sampling
  • Refusals: impact can be lessened by offering
    prizes, being very insistent
  • The response rate from mail surveys can be below
    30%, rising to over 50% with reminders or
    inducements of various kinds
  • Face-to-face interviews yield the highest
    response rates: close to 80%

17
2. Measurement error
  • The second kind of non-sampling error is
    measurement error
  • Measurement error refers to deviations between
    the actual attributes of the respondents and what
    was reported
  • It may be separated into three sources
  • Interviewer error: misunderstandings or errors in
    scoring by the interviewer
  • Respondent error: lies, demand characteristics,
    misunderstandings by the interviewee
  • Mode of data collection errors: errors due to the
    way data were collected

18
3. Processing error
  • The third kind of non-sampling error is
    processing error
  • This refers to errors that occur after data have
    been collected, e.g., data input errors,
    computer-read errors

19
What about other error?
  • Clearly, not all error can be measured by
    SEproportions
  • All non-sampling and coverage error is
    independent of N
  • You can have a full complement of subjects, and
    still have failed to properly cover the
    population
  • Unfortunately, it is difficult to accurately
    estimate these other sources of error, and
    impossible to eliminate them
  • "Unlike sampling error, the extent of
    non-sampling error cannot usually be assessed
    from the sample itself, even if the sample is a
    probability sample" (American Statistical
    Association)

20
A rule of thumb
  • According to the American Statistical
    Association: multiply the currently reported
    margin of error by 1.7 to obtain a more accurate
    estimate of the margin of error
  • It is common for political polls to quote error
    margins of 3%
  • 3% × 1.7 = 5.1%
  • However, the average actual deviation in Gallup
    polls is 1% to 2%.

21
Heuristics for conducting surveys
  • Question concerns (as in any other instrument)
  • Ease of question comprehension
  • Scale dispersion
  • Use of negative items
  • Wording Effects
  • Latent word loadings
  • Vague quantifiers
  • Normalization: "As you know, many people have
    been killing their husbands these days. Do you
    happen to have killed yours?"

22
Heuristics for conducting surveys
  • Issues of information accessibility for subjects
  • Recall and count vs. Frequency estimation
  • Recall cues
  • Framing effects
  • Rugg (1941): Changing the wording of a question
    from "Do you think the United States should allow
    public speeches against democracy?" to "Do you
    think the United States should forbid public
    speeches against democracy?" changed the
    proportion of respondents favoring free speech
    from 21% to 39%
  • Formatting (remember the butterfly ballot!)
  • Interviewer bias
  • Social desirability

23
Butterfly Ballot
24
Some Psychometric Miscellany
26
Psychopathy Checklist-Revised (PCL-R)
27
Dr. Robert Hare
  • Professor Emeritus at University of British
    Columbia.
  • Spent the last 30 years involved in research into
    the psychopathic mind.
  • PCL-R, PCL-YV (Youth Version)

28
Dr. Robert Hare
  • The PCL-R was developed by Robert Hare, Professor
    Emeritus at University of British Columbia.
  • Spent the last 30 years involved in research into
    the psychopathic mind.
  • Has written both popular books and textbooks
    about psychopathy
  • PCL-R, PCL-YV (Youth Version)

29
What does the PCL-R Measure?
  • The psychopathic personality, which is
    characterized by a pervasive pattern of
    disregard for and violation of the rights of
    others occurring since age 15 years (DSM-IV)
  • Diagnostic criteria include: repeated
    law-breaking, deceitfulness, impulsivity or
    failure to plan ahead, manifest aggression,
    disregard for the safety of self or others,
    irresponsibility, and lack of remorse at having
    hurt, mistreated, or stolen from another
  • The PCL-R also predicts whether the subject will
    re-offend when/if released.

30
Administration
  • The test consists of 20 traits, scored on a
    3-point scale: 0 (Not present), 1 (Probably
    present), 2 (Definitely present)
  • Cut-off score is 30 out of 40
  • Jailed criminals average scores around 22
  • Normal populations average around 5.
  • The traits are scored using information collected
    during a clinical interview and from the client's
    file.
  • Interview usually requires 1.5 - 2 hours
  • Two separate interviewers are officially
    recommended
  • Primarily used within male populations (the M:F
    ratio is at least 3:1)
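
The scoring arithmetic described above is a simple sum. The sketch below only illustrates that arithmetic: the function names are hypothetical, it does not handle missing items, and it is not a clinical scoring tool.

```python
N_ITEMS = 20   # number of traits on the checklist
CUTOFF = 30    # cut-off quoted above, out of a maximum of 40


def pclr_total(item_scores):
    """Sum 20 item ratings, each of which must be 0, 1, or 2."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores")
    if any(score not in (0, 1, 2) for score in item_scores):
        raise ValueError("each item must be scored 0, 1, or 2")
    return sum(item_scores)


def meets_cutoff(item_scores, cutoff=CUTOFF):
    """True when the total reaches the cut-off score."""
    return pclr_total(item_scores) >= cutoff


# Hypothetical ratings: 15 items rated 2 and 5 items rated 0.
example = [2] * 15 + [0] * 5
total = pclr_total(example)
```

Since the maximum is 20 × 2 = 40, a total of 30 requires a rating of "definitely present" on at least half of the items.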

31
Checklist items
  • Glibness/superficial charm
  • Grandiose sense of self-worth
  • Need for stimulation/proneness to boredom
  • Pathological lying
  • Conning/manipulative
  • Lack of remorse or guilt
  • Shallow affect
  • Callous/lack of empathy
  • Parasitic lifestyle
  • Poor behavioral controls
  • Promiscuous sexual behavior
  • Early behavior problems
  • Lack of realistic, long-term plans
  • Impulsivity
  • Irresponsibility
  • Failure to accept responsibility for own actions
  • Many short-term relationships
  • Juvenile delinquency
  • Revocation of conditional release
  • Criminal versatility

32
Factors
  • Two major factors (each with two sub-factors)
    reflect the characteristics of the psychopathic
    personality.
  • Factor 1 - The callous, selfish, remorseless use
    of others
  • Factor 1a - Interpersonal (4 items)
  • Factor 1b - Affective (4 items)
  • Factor 2 - The chronically unstable and
    antisocial lifestyle
  • Factor 2a - Impulsive Lifestyle (5 items)
  • Factor 2b - Antisocial Behavior (5 items)

33
Reliability of PCL-R Total
  • Alpha is around 0.85 - 0.87 overall
  • Inter-rater reliabilities are around 0.86-0.93
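
Assuming "alpha" here means Cronbach's alpha, the coefficient can be computed from a respondents-by-items score table as sketched below. The toy data are invented purely to exercise the formula and are not PCL-R data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance).

    rows: one list of item scores per respondent (all the same length).
    """
    k = len(rows[0])
    item_columns = list(zip(*rows))
    sum_item_vars = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum_item_vars / total_var)


# Invented data: three items that agree perfectly across three respondents.
perfect = [[0, 0, 0], [1, 1, 1], [2, 2, 2]]
alpha = cronbach_alpha(perfect)
```

When every item ranks respondents identically, as in the toy data, alpha reaches its maximum of 1; values around 0.85 indicate items that mostly, but not perfectly, move together.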

34
Validity
  • There is overlap between the PCL-R and the DSM-IV
    criteria for Antisocial Personality Disorder
    (psychopathy itself is not a recognized
    diagnosis).
  • Clinicians also are in agreement that this tool
    corresponds highly to their clinical concept of
    the psychopathic personality.
  • PCL-R scores correlate with self-report measures
    related to psychopathy, including the Pd
    (Psychopathic Deviate) and Ma (Hypomania) scales
    of the MMPI

35
Predictive Validity
  • PCL-R predicts response to treatment and is
    cross-culturally relevant (Hare et al., 2000)
  • The PCL-R is a reliable indicator of re-offending
    (Hart, 1988)
  • Good predictor of violent re-offending: a
    psychopath is 4 times more likely to violently
    re-offend than a non-psychopath
  • Hart et al. (1988): within 3 years, 75% of
    non-psychopaths but only 20% of psychopaths
    conditionally released from a federal prison were
    still out
  • Among psychopaths (PCL-R > 30) released from
    prison, the 5-year violent recidivism rate is
    about 70%
  • High scorers on the PCL-R (30 and above) are more
    likely than other offenders to commit repeated
    acts of violence and aggression following release
    from detention.

36
Q-sorting
37
Q-sort
  • The Q-sort (also called Q-technique) is a
    psychometric scaling technique, related to
    Kelly's Grid Method
  • It was first developed by British
    physicist/psychologist (double PhD!) William
    Stephenson (1935), who had studied with Charles
    Spearman, the inventor of factor analysis (which
    Q-sort was supposed to extend)
  • It was laid out fully in The Study of Behavior:
    Q-technique and Its Methodology (1953)
  • The method was then popularized by humanist
    psychotherapist Carl Rogers (1954) as a method
    for studying changes in self-concept

38
Q-technique and Its Methodology (1953)
  • This book was extremely controversial, in part
    because of its writing style (described as
    "belligerent, entertaining and megalomanic")
    and in part because Stephenson was trying to
    revolutionize factor analysis by claiming that
    two separate data matrices were at issue
  • One contained objective measures (the usual stuff
    of factor analysis, R)
  • The other contained subjective data (Q).

39
Q-technique and Its Methodology (1953)
  • We can think of Q as a set of weights on
    objective measures (Stephenson's "operant
    subjectivity")
  • Classic example is factor analysis of body parts,
    which reduces to a single factor, size
  • However, if you get people to put together body
    parts by "significance for me", you get quite
    different factors: the first placing emphasis on
    eyes, head, and mouth; the second emphasizing
    trunk, hips, and chest.

40
Q-sort
  • The subject is presented with a set of many
    (perhaps 100) cards containing statements
  • Subjects sort cards into a specified number of
    piles (usually 9 or 11), with the constraint that
    a certain number must go into each pile (the
    distribution is usually normal, but may also be
    equal), along some prescribed dimension (e.g.
    from "most descriptive" to "least descriptive",
    or from "unfavorable" to "favorable", or from
    "beautiful" to "ugly")
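
The forced-distribution constraint can be checked mechanically. In this sketch the 9-pile quota for 100 cards is one illustrative, roughly normal choice, and `is_valid_sort` is a hypothetical helper name.

```python
from collections import Counter

# Illustrative quasi-normal quotas for 9 piles; they sum to 100 cards.
QUOTAS = [5, 8, 12, 16, 18, 16, 12, 8, 5]


def is_valid_sort(pile_of_card, quotas=QUOTAS):
    """Check that every card is placed and each pile holds exactly its quota.

    pile_of_card: dict mapping card id -> pile index (0 .. len(quotas) - 1).
    """
    counts = Counter(pile_of_card.values())
    return (len(pile_of_card) == sum(quotas)
            and all(counts.get(i, 0) == q for i, q in enumerate(quotas)))


# Build a valid sort by filling the piles in order with cards 0..99.
valid_sort = {}
card = 0
for pile, quota in enumerate(QUOTAS):
    for _ in range(quota):
        valid_sort[card] = pile
        card += 1

# Moving one card to a neighbouring pile breaks two quotas at once.
broken_sort = dict(valid_sort)
broken_sort[0] = 1
```

Forcing the same pile sizes on everyone means two sorts always have the same mean and spread, so they differ only in which cards go where.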

41
Scoring the Q-sort
  • Q-sorts may be scored with reference to some
    established norms (Block, 1961)
  • They may also be scored with more subjective
    reference, e.g., to an "ideal sort" in which the
    subject is asked to sort by his ideal, rather
    than actual, self
  • Rogers correlated the actual-self and ideal-self
    sorts to get insight into his clients
  • A correlation approaching 1.0 means actual and
    ideal selves are aligned
  • A correlation approaching -1.0 means actual and
    ideal selves are almost perfectly mis-aligned
  • Psychotherapy clients were shown to increase
    their congruence (Rogers & Dymond, 1954), though
    adjusted control subjects scored higher.
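
The actual-versus-ideal comparison amounts to a Pearson correlation between two sets of pile placements for the same cards. A minimal sketch with invented placements (the variable and function names are illustrative):

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented pile placements (1-9) for nine cards.
actual_self = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ideal_aligned = list(actual_self)                # same placements
ideal_opposed = [10 - v for v in actual_self]    # exactly reversed

r_aligned = pearson_r(actual_self, ideal_aligned)   # 1.0
r_opposed = pearson_r(actual_self, ideal_opposed)   # -1.0
```

Real clients fall somewhere between these two extremes, and an increase in the correlation over the course of therapy is what is meant by increased congruence.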

42
Other applications
  • Q-sorts have been used to:
  • Describe patients with specific MMPI profiles
    (Marks & Seeman, 1963)
  • Derive characteristics associated with success at
    a task (Bem and Funder, 1978)
  • Get at perceived product characteristics:
    aesthetic pleasure, preference, etc.
  • In general, they can be used wherever you need to
    locate an entity within a (especially
    non-objective) high-dimensional space at one or
    more times