Title: Assignment
1. Assignment
- After this lecture, start Assignment 3.
- It's in the course binder.
2. Lecture 6: Who are we talking about? Populations and samples
3. Overview
- Defining a population
- Taking a simple random sample
- How similar is the sample to the population?
4. Population and Samples
- Population
  - All the cases (individuals, objects, or groups) in which the researcher is interested.
- Sample
  - A relatively small subset from a population.
- Example
  - The US population: 300 million people
  - The General Social Survey (GSS)
    - a sample of the US population
    - about 3,000 people
  - Student version of the GSS
    - a sample from the GSS
    - about 1,500 people
5. The sampling problem
- We care about populations.
- We can only afford to look at samples.
- How do we know our sample is relevant?
6. Simple random sampling: Definition
- Define the population
- label every person
- Sample the labels
- randomly
- so everyone in the population has the same
probability of being sampled
7. Simple random sampling: Example
- Define the population: this class, 52 people
- label every person: give everyone a playing card
- Sample the labels: draw 5 cards from a second deck
- randomly: after shuffling
- so everyone in the population has the same probability of being sampled: everyone's card appears once in the deck.
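The card-drawing procedure can be sketched in a few lines of Python. This is only an illustration: the card labels and the helper name `draw_simple_random_sample` are made up for the example, and `random.sample` stands in for shuffling the second deck and dealing 5 cards.

```python
import random

# Hypothetical labels: one playing card per student in the 52-person class.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
labels = [f"{rank} of {suit}" for suit in suits for rank in ranks]

def draw_simple_random_sample(population_labels, n, seed=None):
    """Return n distinct labels; every label has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population_labels, n)  # sampling without replacement

sample = draw_simple_random_sample(labels, 5, seed=6)
print(sample)
```

Because each card appears exactly once in `labels`, each student has the same probability, 5/52, of ending up in the sample.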
8. Sampling: Bad examples
- Define the population: this class, 52 people
- label every person: put everyone in a seat
- Sample the labels: choose 5 people from the first row.
  - Problems
    - not random
    - not everyone has the same probability
    - only the first row has any chance
- Sample the labels: choose 5 volunteers.
  - Problem
    - not everyone has the same probability
    - favors extroverts
- Sample the labels: choose 5 people without a system
  - Problem
    - Is it random?
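A rough simulation makes the first-row problem concrete. It assumes, purely for illustration, that seats are numbered 0 to 51 and that the first row holds 8 seats; neither detail comes from the lecture. Even if the 5 people are picked at random within the first row, students elsewhere never get sampled.

```python
import random
from collections import Counter

rng = random.Random(0)
seats = list(range(52))   # hypothetical seat numbers, one per student
first_row = seats[:8]     # assumption: the first row holds 8 seats
trials = 20_000

srs_counts = Counter()
row_counts = Counter()
for _ in range(trials):
    srs_counts.update(rng.sample(seats, 5))      # simple random sample
    row_counts.update(rng.sample(first_row, 5))  # first-row-only scheme

# Estimated probability of being sampled, per seat.
srs_prob = [srs_counts[s] / trials for s in seats]
row_prob = [row_counts[s] / trials for s in seats]

print(f"SRS: min {min(srs_prob):.3f}, max {max(srs_prob):.3f}, target {5 / 52:.3f}")
print(f"First row only: front seat {row_prob[0]:.3f}, back seat {row_prob[-1]:.3f}")
```

Under simple random sampling every seat's estimated probability sits near 5/52; under the first-row scheme it is about 5/8 for front seats and exactly zero for everyone else.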
9. Sampling: Presidential election, 1936
- Literary Digest poll
- Sampled 10 million names from lists of car and phone owners
- Mailed 10 million questionnaires
- Got 2.3 million responses
- Results: 57% favor Landon (R), 43% favor Roosevelt (D)
- Election result
- What went wrong?
10. Myth: Simple random samples are representative
- Actually, they can be quite different from the population
- But
- we can usually place bounds on the difference
11. Notation
Mnemonics: Population measures are called Parameters. Sample measures are called
Statistics. The P words and S words go together.
Population parameters use Greek letters: $\mu$ (Greek m), $\pi$ (Greek p), $\sigma$ (Greek s).
The population is the source of the sample. The Greeks are the source of a lot of things.
12. Population
Population: GSS student version. Variable: CHILDS, "How many children have you ever
had?" $\mu_Y = 1.75$, $\sigma_Y = 1.62$.
13. Sampling error for a mean
A simple random sample of N = 4 cases.
The sample mean is not equal to the population
mean $\mu_Y = 1.75$. The difference is called sampling
error.
14. Sampling variation
A different simple random sample of N = 4 cases.
Again, the sample mean is different from the
population mean $\mu_Y = 1.75$, and different from the
mean of the first sample. The variation from one
sample to another is called sampling variation.
15. Conceptual definitions
- Sampling error: The sample mean is probably not the same as the population mean.
- Sampling variation: Take a different sample, get a different sample mean.
16. Technical definitions
- Sampling error: The difference between the sample mean and the population mean.
- Sampling variation: The variation of the sample mean from one sample to another.
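Both definitions can be seen in a short sketch. The population below is a hypothetical list of "number of children" values invented for the example (it is not the GSS data), and `sample_mean` is an illustrative helper.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population of "number of children" values (not the GSS data).
population = [0] * 30 + [1] * 15 + [2] * 30 + [3] * 15 + [4] * 7 + [5] * 3
mu = statistics.mean(population)  # population mean (a parameter)

def sample_mean(n):
    """Mean of one simple random sample of n cases (a statistic)."""
    return statistics.mean(rng.sample(population, n))

first = sample_mean(4)
second = sample_mean(4)

print("population mean:", mu)
print("sample 1 mean:", first, "sampling error:", first - mu)
print("sample 2 mean:", second, "sampling error:", second - mu)
```

Each printed difference is a sampling error; how much the sample mean moves from one draw to the next is sampling variation.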
17. Repeated sampling
[Figure: the population, "All US households", shown as a cloud of Y values, with samples of N = 4 drawn from it; the collection of such samples is labeled "All possible samples".]
Each Y represents the number of children in a
household.
18. Sampling distribution
Suppose 5000 different researchers took simple
random samples of N = 4 cases.
- Sampling distribution of the mean: The distribution of sample means over all possible samples.
19. Mean of the sampling distribution
The mean of the sample means is the population
mean $\mu_Y$, here 1.75. In other words, the mean of
the sampling errors is zero.
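For a small enough population, this claim can be checked by listing every possible sample. The sketch below uses a hypothetical population of 8 households, chosen so that its mean happens to be 1.75; it is not the GSS data.

```python
from itertools import combinations
from statistics import mean

# Hypothetical tiny population: number of children in 8 households.
population = [0, 0, 1, 2, 2, 3, 4, 2]
n = 4

# Every possible sample of 4 distinct households: C(8, 4) = 70 samples.
sample_means = [mean(s) for s in combinations(population, n)]
errors = [m - mean(population) for m in sample_means]

print("number of samples:", len(sample_means))
print("population mean:", mean(population))
print("mean of sample means:", mean(sample_means))
print("mean of sampling errors:", mean(errors))
```

Individual sample means range widely, but their average over all 70 samples equals the population mean, so the sampling errors average to zero.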
20. Variation of the sampling distribution
The variation of the sample means is less than
that of the original variable.
21. Standard error: Definition
The standard deviation of the sample means
is exactly $\sigma_{\bar{Y}} = \sigma_Y / \sqrt{N}$.
It is also the standard deviation of the sampling
errors. So it's called the standard error.
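One way to see where the usual standard error formula, $\sigma_Y / \sqrt{N}$, comes from is the short derivation below. It assumes the N sampled values $Y_1, \dots, Y_N$ are independent draws from the population (sampling with replacement, or a population much larger than the sample); that assumption is added here for the sketch.

```latex
\begin{align*}
\operatorname{Var}(\bar{Y})
  &= \operatorname{Var}\!\left(\frac{1}{N}\sum_{i=1}^{N} Y_i\right)
   = \frac{1}{N^2}\sum_{i=1}^{N}\operatorname{Var}(Y_i)
   = \frac{N\,\sigma_Y^2}{N^2}
   = \frac{\sigma_Y^2}{N}, \\
\sigma_{\bar{Y}}
  &= \sqrt{\operatorname{Var}(\bar{Y})}
   = \frac{\sigma_Y}{\sqrt{N}}.
\end{align*}
```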
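22. Standard error shrinks with sample size
Again the standard error is $\sigma_{\bar{Y}} = \sigma_Y / \sqrt{N}$.
Bigger N, smaller $\sigma_{\bar{Y}}$.
To halve $\sigma_{\bar{Y}}$, quadruple N.
[Figure: sampling distributions of the mean, each centered at 1.75, with standard errors .81, .405, and .2025. With $\sigma_Y = 1.62$, these correspond to N = 4, 16, and 64.]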
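The three standard errors on this slide (.81, .405, .2025) can be reproduced from the population standard deviation 1.62 given earlier. The sample sizes 4, 16, and 64 are inferred from those numbers rather than stated in the slides, and `standard_error` is an illustrative helper.

```python
import math

sigma_y = 1.62  # population SD of CHILDS, as given in the lecture

def standard_error(sigma, n):
    """Standard error of the mean for a sample of n cases."""
    return sigma / math.sqrt(n)

for n in (4, 16, 64):
    print(f"N = {n:2d}: SE = {standard_error(sigma_y, n):.4f}")
# N =  4: SE = 0.8100
# N = 16: SE = 0.4050
# N = 64: SE = 0.2025
```

Each fourfold increase in N cuts the standard error in half.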
23. Sample mean usually within 2 SEs of pop. mean
As in any distribution, most values are within 2 SDs
of the mean.
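A quick simulation, in the spirit of the 5000 researchers mentioned earlier, shows how often a sample mean lands within 2 standard errors of the population mean. The population is again a hypothetical list of values, not the GSS data, and samples are drawn with replacement so that the $\sigma_Y / \sqrt{N}$ formula applies directly.

```python
import math
import random
import statistics

rng = random.Random(2)

# Hypothetical population of "number of children" values (not the GSS data).
population = [0] * 30 + [1] * 15 + [2] * 30 + [3] * 15 + [4] * 7 + [5] * 3
mu = statistics.mean(population)
sigma = statistics.pstdev(population)  # population standard deviation

n = 4
se = sigma / math.sqrt(n)  # standard error of the mean

# 5000 samples of n cases, drawn with replacement (independent draws).
means = [statistics.mean(rng.choices(population, k=n)) for _ in range(5000)]
within = sum(abs(m - mu) <= 2 * se for m in means) / len(means)

print(f"SE = {se:.3f}; share of sample means within 2 SEs: {within:.1%}")
```

The share should come out well above one half, which is what "usually" means on this slide.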
24. Summary: Sampling distribution of the mean
- Across all possible samples
- the sample mean $\bar{Y}$ has
  - mean $\mu_Y$
  - and standard deviation $\sigma_Y / \sqrt{N}$
  - a.k.a. standard error
- Implications
  - sample mean usually within 2 SEs of pop. mean
  - In newspapers, +/- 2 SEs is often called the margin of error
  - larger samples have smaller SEs
25. Summary: More general
- If we take a simple random sample
- from a well-defined population
- we expect
- that the sample mean
- is probably close to the population mean
- By "close" we mean within 2 standard errors
- Larger samples have smaller standard errors.
- Next time we'll say what we mean by "probably"