Title: Political Science 5 Lecture 8, 2/19/04
1Political Science 5 Lecture 8, 2/19/04
- Midterm 1 on Tuesday, 3/2
- Review next Thursday
- Homework 2 due a week from today
- Homework 1 will be returned on Tuesday
2What is a sample?
- A researcher usually wants to generalize about a
whole class of individuals. This is called the
population.
- However, studying the whole population is usually
impractical. Only part of it can be examined,
and this is called the sample.
- Researchers make generalizations from the part to
the whole. In technical terms, they make
inferences from the sample to the population.
3Sample vs. the Population
- An observational study simply observes cases,
without attempting to impose a treatment and
without requiring any quasi- or natural
experimental design.
- Researchers can ask their cases questions in
order to measure some variable.
- Most of the time, researchers look closely at a
small sample of the overall population.
4Sample vs. the Population
- A population is the entire group of cases about
which you want information.
- A sample is a subset of the population which is
used to gain information about the whole
population.
Population
Sample
5Parameters
- Usually, there are some numerical facts about the
population which the investigators want to know.
Such numerical facts are called parameters. In
forecasting a presidential election in the United
States, two relevant parameters are:
- The average age of all eligible voters
- The percentage of all eligible voters who are
currently registered to vote.
- Ordinarily, parameters like these cannot be
determined exactly, but can only be estimated
from a sample. Then a major issue is accuracy:
how close are the estimates going to be?
6Statistics
- Parameters are estimated by statistics, or
numbers which can be computed from a sample.
- For instance, with a sample of 10,000 Americans,
a researcher could calculate the following two
statistics in order to estimate the parameters
mentioned above:
- The average age of eligible voters in the sample
- The percentage of the eligible voters in the
sample who are currently registered to vote
- Statistics are what researchers know; parameters
are what they want to know.
7Sample vs. the Population
- A parameter is a number describing a population.
It is usually a mystery.
- A statistic is a number describing a sample.
Statistics vary from sample to sample.
- If our sample is representative of the
population, sample statistics will closely
approximate population parameters.
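The distinction between a parameter and a statistic can be sketched in a few lines of Python. The population below is made up (random ages between 18 and 90), and the names `population` and `sample_mean` are illustrative only. The point is simply that the parameter is one fixed number, while the statistic changes a little with every sample drawn.

```python
import random
import statistics

random.seed(5)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of 100,000 eligible voters (made-up data).
population = [random.randint(18, 90) for _ in range(100_000)]
parameter = statistics.mean(population)  # normally unknown to the researcher

def sample_mean(n):
    """Statistic: the average age in a simple random sample of size n."""
    return statistics.mean(random.sample(population, n))

# Five separate samples give five slightly different statistics.
estimates = [sample_mean(1_000) for _ in range(5)]
print(f"parameter (population mean age): {parameter:.2f}")
for i, est in enumerate(estimates, start=1):
    print(f"sample {i} statistic: {est:.2f}")
```

Because each sample here is drawn at random, every statistic should land close to the parameter, but no two need agree exactly.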
8Estimating Parameters from the Sample
- Estimating parameters from the sample is
justified when the sample represents the
population.
- This is impossible to check by just looking at
the sample. Why? To see whether the sample is
like the population, the researcher would have to
know the facts about the population that they are
trying to estimate, which is a vicious circle.
- So, you have to look at how the sample was
chosen. Some methods tend to do badly; others
are more likely to give representative samples.
- Thus, the method of choosing a sample matters a
lot, and the best methods involve the planned
introduction of chance.
9How to Draw a Good Sample
- Nonrandom methods of drawing a sample
- Haphazard (or convenience) sample
- Voluntary response sample
- Quota sample
- Example: Internet Polls
- Surveys of a Sub-Population
10Nonrandom methods of drawing a sample (Note:
These are Bad!)
- A voluntary response sample includes the members
of the population who voice their desire to be
included in the sample.
- The 1936 Literary Digest Poll mailed 10 million
ballots to magazine readers, inviting them to
volunteer to participate in its Presidential election
survey. 2 million surveys came back, predicting
that FDR would lose 43-57. In reality, FDR won,
61-39.
- The poll was plagued by both selection bias and
non-response bias.
- When a sampling procedure is biased, taking a
large sample does not help. This just repeats
the basic mistake on a larger scale.
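A small simulation, with made-up numbers, shows why a bigger biased sample does not help. It assumes a hypothetical electorate in which 61% support candidate A, and a sampling procedure in which A's supporters are only half as likely as everyone else to end up in the sample. The 0.5 weight and the name `biased_sample_share` are assumptions chosen for illustration, not figures from the Literary Digest poll.

```python
import random

random.seed(1936)  # fixed seed so the sketch is repeatable

# Hypothetical electorate: 61% support candidate A (1 = supports A).
population = [1] * 61_000 + [0] * 39_000

# Biased procedure (assumed): supporters of A are half as likely to be reached.
weights = [0.5 if voter == 1 else 1.0 for voter in population]

def biased_sample_share(n):
    """Share supporting A in a biased sample of size n (drawn with replacement)."""
    sample = random.choices(population, weights=weights, k=n)
    return sum(sample) / n

shares = {n: biased_sample_share(n) for n in (100, 10_000, 200_000)}
for n, share in shares.items():
    print(f"n = {n:>7,}: {100 * share:.1f}% support A")
```

As n grows, the estimates stop bouncing around, but they settle near 44% rather than the true 61%. A larger sample reduces variability, not bias.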
11Nonrandom methods of drawing a sample (Note:
These are Bad!)
- A haphazard sample studies the segment of the
population that is easiest for the researcher to
reach.
- Polls only of people who have telephones. (Less
of a problem than it used to be).
- Television call-in survey (self-selection).
- Studies by college students of their dormmates.
- We cannot trust the results of haphazard surveys.
Why?
12Nonrandom methods of drawing a sample (Note:
These are Bad!)
- A quota sample tries to obtain a group
representative of the population by setting
quotas for selecting various categories of people
based on their proportions in the population.
- First, divide population into categories on the
basis of variables from census data.
- Then, set sample selection quotas for each
category based on its proportion in the
population.
- Better than haphazard surveys, but still has
shortcomings.
- Too much interviewer discretion.
- Problems in getting accurate data on the
proportion of different groups in the
population.
- Dewey-Truman example.
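The quota-setting step can be sketched as a simple proportional allocation. The category names and shares below are invented for illustration, and `quota_targets` is a hypothetical helper, not part of any survey package.

```python
def quota_targets(proportions, sample_size):
    """Number of interviews to assign to each category, proportional to
    its share of the population (shares should sum to 1)."""
    return {category: round(share * sample_size)
            for category, share in proportions.items()}

# Hypothetical census shares, invented for illustration.
census_shares = {"urban": 0.50, "suburban": 0.30, "rural": 0.20}
print(quota_targets(census_shares, 1_000))
```

Note what the sketch does not decide: which individuals fill each quota. That choice is left to the interviewer, which is exactly where the bias can creep in.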
13Example: Internet Polls
- Some internet polls ask the opinions of those who
have logged on to sites such as:
- www.foxnews.com
- www.uclabruins.com
- www.peetscoffee.com
- www.pabst.com
- www.sfgate.com
- www.rogaine.com
- More professional internet polls advertise with
banners on a variety of web sites to recruit
people into their sample.
14Example: Internet Polls
- Knowledge Networks is an internet-administered
survey that recruited its sample by using random
digit dialing.
- To give those without an internet connection the
chance to participate, they offered a free Web
TV console to participants.
- Those in the 50,000 person sample are given the
chance to participate in polls about subjects
like hard liquor or politics.
15Probability Sampling
- A probability sample is a sample of a population
in which each person has a known chance of being
selected.
- Removes bias from the sample selection process
- Probability samples are more representative than
haphazard or quota samples.
- Can use probability theory to estimate the
accuracy of the sample (warning: math next
time!).
- We are talking about surveys, but this applies to
sampling other types of cases as well.
16How many people do you need in your sample?
- Depends on
- How much accuracy do you need in the survey
results?
- How much confidence do you want that your results
are actually within the specified range of
accuracy?
- How much variability is there in the variable?
17What Determines the Margin of Error (accuracy)
of a Poll?
- Margin of error tells us how close the sample
statistic is to the population parameter.
- If we have drawn a truly random sample:
- Sample Proportion = Population Proportion + Random Error
18What Determines the Margin of Error (accuracy)
of a Poll?
- The margin of error is calculated by
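A standard textbook version of the calculation, assumed here for a proportion estimated from a simple random sample, is shown below. The symbols are assumptions for illustration: n is the sample size, p is the proportion, and z is the multiplier for the chosen confidence level (about 1.96 for the .05 level).

```latex
\text{margin of error} = z \sqrt{\frac{p\,(1 - p)}{n}}
```

Setting p = 0.5 gives the largest possible margin for a given n, so pollsters often use it as a conservative default.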
19What Determines the Margin of Error (accuracy)
of a Poll?
- In a poll of 505 likely voters, the Field Poll
found 55% support for the recall.
20What Determines the Margin of Error (accuracy)
of a Poll?
- The degree of accuracy, or margin of error, is
usually stated in plus or minus percentage
terms.
- The margin of error for this poll was plus or
minus 4.4 percentage points.
- That means that the true percentage (the
population parameter, or the percentage that
would be obtained if the entire population, and
not just the sample, were surveyed) favoring the
recall could be anywhere between 50.6% and 59.4%.
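The 4.4-point figure can be checked with a short calculation. This is a sketch under stated assumptions, not necessarily how the Field Poll computed it: a simple random sample, the conservative choice p = 0.5, and a multiplier of 1.96 for the .05 level. The function name `margin_of_error` is illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random
    sample of size n. p=0.5 is the most conservative choice; z=1.96
    corresponds to the .05 level (both are assumptions of this sketch)."""
    return z * math.sqrt(p * (1 - p) / n)

moe = margin_of_error(505)  # 505 likely voters, as in the example
print(f"margin of error: +/- {100 * moe:.1f} points")
print(f"range for 55% support: {55 - 100 * moe:.1f}% to {55 + 100 * moe:.1f}%")
```

Under these assumptions the result rounds to plus or minus 4.4 points, matching the reported margin.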
21Confidence
- The confidence level is the probability that the
results are outside the specified level of
accuracy.
- Confidence level is stated in probability terms.
- A confidence level of .01 means that there is 1
chance out of 100 that the survey results are
outside the specified range of accuracy.
Researchers usually use the .05 level, which is
more often reported as 95% confidence.
- In the previous example, at the .05 confidence
level, we can say that if we took many samples
using the Field Poll's methods, 95% of the
samples would yield a statistic within plus or
minus 4.4 percentage points of the true
population parameter.
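The "many samples" idea can be tried out by simulation. The sketch below assumes a hypothetical population in which true support is exactly 55%, runs 2,000 simulated polls of 505 respondents each, and counts how many land within 4.4 points of the truth. The constants and the name `poll` are assumptions for illustration.

```python
import random

random.seed(505)  # fixed seed so the sketch is repeatable

TRUE_SUPPORT = 0.55   # hypothetical population parameter
N = 505               # sample size, as in the Field Poll example
MARGIN = 0.044        # plus or minus 4.4 percentage points
TRIALS = 2_000        # number of simulated polls

def poll():
    """One simulated poll: share of N random respondents in support."""
    return sum(random.random() < TRUE_SUPPORT for _ in range(N)) / N

within = sum(abs(poll() - TRUE_SUPPORT) <= MARGIN for _ in range(TRIALS))
coverage = within / TRIALS
print(f"{100 * coverage:.1f}% of simulated polls landed within the margin")
```

The share should come out close to 95%, with the remaining polls missing by more than the stated margin purely by chance.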
22Sample sizes needed for different levels of
accuracy at .05 confidence level
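The relationship between sample size and accuracy can be worked out by inverting the usual margin-of-error calculation. The sketch below is a calculation under stated assumptions (simple random sample, p = 0.5, multiplier 1.96 for the .05 level), not a copy of any published table, and `required_sample_size` is an illustrative name.

```python
import math

def required_sample_size(margin, p=0.5, z=1.96):
    """Smallest n whose margin of error is at most `margin` (a proportion,
    e.g. 0.03 for 3 points), assuming a simple random sample."""
    return math.ceil((z ** 2) * p * (1 - p) / margin ** 2)

for points in (1, 2, 3, 4, 5):
    n = required_sample_size(points / 100)
    print(f"+/- {points} points: about {n:,} respondents")
```

Under these assumptions, a 3-point margin needs roughly 1,100 respondents and a 1-point margin roughly 9,600. Cutting the margin in half requires about four times as many respondents.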
23Bias and Variability
- Sampling bias is consistent deviation of the
sample statistic from the parameter
- Sampling variability describes how far apart
statistics are over many samples.
24Random sample
- A random sample is one in which each person in
the population has an equal chance of being
selected throughout the selection process.
- Removes bias
- Need to have an entire list of the defined
population
25Surveys of a Sub-Population
- Many researchers do not want to generalize to the
population of all Americans.
- They begin by defining the population that they
want to study, such as likely voters,
Asian-American voters, or Lesbian, Gay,
Bisexual, and Transgendered California adults.
26Surveys of a Sub-Population
- Option 1. Take a random sample of the entire
population, ask the respondent if he or she fits
into the category, and then continue the
interview if you find a match.
- Option 2. Begin with a list that approximates
the entire subpopulation (registered voters with
Asian surnames) and then take a random sample.
27How to Draw a Random sample
- Draw names out of a hat, a really big hat
- Label every case in the population with a number,
then draw some random numbers
- In a telephone poll, random digit dialing uses a
random number generator to get even those with
unlisted numbers
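The "label every case with a number, then draw random numbers" method can be sketched as follows. The sampling frame here is a made-up list of 5,000 placeholder names; in practice it would be the complete list of the defined population.

```python
import random

random.seed(27)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: a complete list of the defined population.
frame = [f"person_{i:04d}" for i in range(1, 5001)]

# Label every case with a number (its position in the list), then draw
# random numbers without replacement so nobody is picked twice.
chosen_numbers = random.sample(range(len(frame)), k=10)
sample = [frame[i] for i in chosen_numbers]
print(sample)
```

Every person on the list has the same chance of being drawn, and the researcher has no say in who ends up in the sample.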
28How to Draw a Random sample
- What if the entire nation is your target
population? (no list available)
- Use multistage cluster sampling
- What is this?
29How to Draw a Random Sample
- Select the primary sampling units (counties,
cities, etc)
- Take a sample of smaller units from within the
primary sampling units (city blocks)
- Make a list of still smaller units (households) and
take a sample of them
- Select a sample of persons
- At this point, one person would be randomly
selected from each household. Interviewer has no
choice about the person selected.
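The stages above can be sketched with a small, invented frame of counties, blocks, households, and residents. This is a simplification: it gives every unit an equal chance at each stage, whereas real designs often select larger units with higher probability. All names and counts are hypothetical.

```python
import random

random.seed(29)  # fixed seed so the sketch is repeatable

# Hypothetical nested frame: county -> block -> household -> residents.
frame = {
    f"county_{c}": {
        f"block_{c}_{b}": {
            f"household_{c}_{b}_{h}": [f"person_{c}_{b}_{h}_{p}" for p in range(1, 4)]
            for h in range(1, 21)
        }
        for b in range(1, 11)
    }
    for c in range(1, 51)
}

def multistage_sample(frame, n_counties, n_blocks, n_households):
    """Sample counties, then blocks within them, then households,
    then one randomly chosen person per household."""
    people = []
    for county in random.sample(sorted(frame), n_counties):
        blocks = frame[county]
        for block in random.sample(sorted(blocks), n_blocks):
            households = blocks[block]
            for household in random.sample(sorted(households), n_households):
                # The interviewer has no say: the person is picked at random.
                people.append(random.choice(households[household]))
    return people

respondents = multistage_sample(frame, n_counties=5, n_blocks=2, n_households=3)
print(len(respondents), "respondents selected")
```

Only the sampled counties and blocks ever need a detailed list, which is what makes the method workable when no national list of individuals exists.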