Title: Excursions in Modern Mathematics Sixth Edition
Chapter 13: Collecting Statistical Data
- Censuses, Surveys, and Clinical Studies
Outline/Learning Objectives
- To identify whether a given survey or poll is
biased.
- To list and discuss the quality of several
sampling methods.
- To identify components of a well-constructed
clinical study.
- To define key terminology in the data collection
process.
- To estimate the size of a population using the
capture-recapture method.
- Population
- Every statistical statement refers, directly or
indirectly, to some group of individuals or
objects. That group is the population.
- N-value
- Given a specific population, an obviously
relevant question is: how many individuals or
objects are there in that population? That number
is the N-value.
- Census
- The process of collecting data by going through
every member of the population.
Over the last 45 years, the United States Fish
and Wildlife Service has been able to keep a
remarkably accurate tally of the number of bald
eagle breeding pairs in the lower 48 states.
A tremendous amount of effort has gone into
collecting and verifying these N-values, which,
for a wildlife population, are of remarkable
accuracy. The above figure summarizes the
population numbers over the period 1963-2000.
- Survey
- The practical alternative to a census is to
collect data only from some members of the
population and use that data to draw conclusions
and make inferences about the entire population.
- Poll
- A survey in which the data collection is done by
asking questions.
- Sample
- The subgroup chosen to provide the data.
- Sampling
- The act of selecting a sample.
- Target population
- The population to which the conclusions of the
survey apply. Identifying it is the most
important step in a survey.
- Sampling frame
- The actual subset of the population from which
the sample will be drawn.
- Public Opinion Polls
- Selection bias
- When the choice of the sample has a built-in
tendency to exclude a particular group or
characteristic within the population.
- Response rate
- The percentage of respondents out of the total
sample.
- Nonresponse bias
- The bias that can arise when the response rate
to a survey is low, because the people who
respond may differ from those who do not.
- Convenience Sampling
- In convenience sampling, the selection of which
individuals are in the sample is dictated by what
is easiest for the data collector.
- A classic example is when interviewers set up at
a fixed location such as a mall or outside a
supermarket and ask passersby to be a part of a
public opinion poll.
- Quota Sampling
- Quota sampling is a systematic effort to force
the sample to be representative of a given
population through the use of quotas: the sample
should have so many women, so many men, so many
blacks, so many whites, so many people living in
urban areas, so many people living in rural
areas, and so on.
- Random sampling
- Sampling methods that use randomness as part of
their design.
- Random sample
- Any sample obtained through random sampling.
- Simple Random Sampling
- It is based on the same principle as a lottery:
any set of numbers of a given size has the same
chance of being chosen as any other set of
numbers of that size.
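The lottery principle above can be sketched in a few lines of Python. This is only an illustration: the frame of 50 numbered members, the sample size of 6, and the helper name `simple_random_sample` are assumptions made for the example, not part of the original slides.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct members from the frame.

    random.Random.sample chooses without replacement, so every subset
    of size n is equally likely, as in a lottery drawing.
    """
    rng = random.Random(seed)  # seed is optional; it only makes runs repeatable
    return rng.sample(list(frame), n)

# Hypothetical sampling frame: 50 members labeled 1 through 50.
frame = list(range(1, 51))
sample = simple_random_sample(frame, 6, seed=13)
print(sorted(sample))
```

Changing or omitting the seed gives a different sample each time, which is the point of the method.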
- Stratified Sampling
- The alternative to simple random sampling used
nowadays for national surveys and public opinion
polls. The basic idea of stratified sampling is
to break the sampling frame into categories,
called strata, and then randomly choose a sample
from these strata.
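One possible reading of the idea above is to split the frame into strata and then draw a random sample inside each stratum. The sketch below follows that reading; real national surveys typically use several layers of strata, and the frame, the `stratum_of` function, and the `per_stratum` size are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, per_stratum, seed=None):
    """Group the frame into strata, then sample randomly within each one."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in frame:
        strata[stratum_of(member)].append(member)
    sample = {}
    for name, members in strata.items():
        size = min(per_stratum, len(members))  # a small stratum is taken whole
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical frame: (id, region) pairs, 20 urban and 10 rural.
frame = [(i, "urban" if i % 3 else "rural") for i in range(1, 31)]
sample = stratified_sample(frame, stratum_of=lambda m: m[1], per_stratum=4, seed=13)
for region, members in sample.items():
    print(region, sorted(members))
```

Sampling within each stratum guarantees that every category is represented, which a single simple random sample cannot promise.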
- 13.4 Sampling Terminology and Key Concepts
- Statistic
- Any kind of numerical information drawn from a
sample. A statistic serves as an estimate for a
parameter.
- Parameter
- Some unknown numerical measure of the
population.
- Sampling error
- The difference between a parameter and a
statistic used to estimate that parameter.
- Chance error
- The result of the basic fact that a sample,
being just a sample, can only give us approximate
information about the population.
- Sampling variability
- Different samples are likely to produce
different statistics for the same population,
even when the samples are chosen in exactly the
same way.
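The relationship between a parameter, a statistic, sampling error, and sampling variability can be illustrated with a small simulation. Everything here is made up for the sketch: the population of 1000 simulated measurements, the sample size of 50, and the choice to report the error as statistic minus parameter. In a real survey the parameter is unknown; it is available here only because the population is simulated.

```python
import random
from statistics import mean

rng = random.Random(13)  # fixed seed so the sketch is repeatable

# Hypothetical population: N = 1000 made-up measurements.
population = [rng.randint(0, 100) for _ in range(1000)]
parameter = mean(population)  # the population measure being estimated

def sample_statistic(n):
    """Mean of one simple random sample of size n."""
    return mean(rng.sample(population, n))

# Five samples chosen in exactly the same way.
statistics_seen = [sample_statistic(50) for _ in range(5)]
sampling_errors = [s - parameter for s in statistics_seen]

for s, e in zip(statistics_seen, sampling_errors):
    print(f"statistic = {s:.2f}, sampling error = {e:+.2f}")
```

The spread among the printed statistics is the sampling variability; each individual gap from the parameter is a sampling error due to chance.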
- Sample bias
- The result of choosing a bad sample. It is a
much more serious problem than chance error.
- Sample size
- The size of the sample, denoted by n (to
contrast with N, the size of the population).
- Sample proportion
- The ratio n/N.
- 13.5 The Capture-Recapture Method
- The Capture-Recapture Method
- Step 1. Capture (sample): Capture (choose) a
sample of size n1, tag (mark, identify) the
animals (objects, people), and release them back
into the general population.
- Step 2. Recapture (resample): After a certain
period of time, capture a new sample of size n2,
and take an exact head count of the tagged
individuals. Let's call this number k.
- Step 3. Estimate: The N-value of the population
is approximately n1 × n2/k.
- Small Fish in a Big Pond
- A large pond is stocked with catfish. As part
of a research project we need to estimate the
number of catfish in the pond.
- Step 1. For our first sample we capture a
predetermined number n1 of catfish, say
n1 = 200. The fish are tagged and released
unharmed back in the pond.
- Step 2. After giving enough time for the released
fish to mingle and disperse throughout the pond,
we capture a second sample of n2 catfish. While
n2 does not have to equal n1, it is a good idea
for the two samples to be of approximately the
same order of magnitude. Let's say that n2 = 250.
- Of the 250 catfish in the second sample, 35 have
tags (were part of the original sample).
- If we assume that the ratio of tagged fish in the
second sample is approximately the same as the
ratio of tagged fish in the pond, then
- 35/250 ≈ 200/N
- which in turn gives
- N ≈ 200 × 250/35 ≈ 1428.57
- A sensible conclusion is that there are
approximately N = 1400 catfish in the pond.
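The arithmetic of the catfish example can be checked with a short sketch. The function name `capture_recapture_estimate` and its input checks are assumptions added for illustration; the numbers are the ones used in the example.

```python
def capture_recapture_estimate(n1, n2, k):
    """Estimate the population size N from k/n2 ≈ n1/N, that is N ≈ n1 * n2 / k.

    n1: size of the first (tagged) sample
    n2: size of the second sample
    k:  number of tagged individuals found in the second sample
    """
    if k <= 0:
        raise ValueError("need at least one tagged individual in the second sample")
    if k > min(n1, n2):
        raise ValueError("k cannot exceed either sample size")
    return n1 * n2 / k

estimate = capture_recapture_estimate(n1=200, n2=250, k=35)
print(round(estimate, 2))   # 1428.57
print(round(estimate, -2))  # 1400.0, rounded to the nearest hundred
```

Rounding to the nearest hundred reflects that the method only gives an approximation, so quoting 1428.57 fish would suggest more precision than the data supports.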
- Clinical Studies Terminology
- Clinical study (trial). A study concerned with
determining whether a single variable or
treatment can cause a certain effect.
- Confounding variables. All other possible
contributing causes that could produce the same
effect in a clinical study.
- Controlled study. The subjects are divided into
two different groups.
- Treatment group. Subjects receiving the actual
treatment.
- Control group. Subjects that are not receiving
any treatment.
- Randomized controlled study. The subjects are
assigned to the treatment group or the control
group randomly.
- Placebo effect. A critical confounding variable.
It follows from the generally accepted principle
that just the idea that one is getting a
treatment can produce positive results.
- Placebo. A make-believe form of treatment: a
harmless pill, an injection of saline solution,
or any other fake type of treatment intended to
look like the real treatment.
- Controlled placebo study. A controlled study in
which the subjects in the control group are given
a placebo.
- Blind study. A study in which neither the members
of the treatment group nor the members of the
control group know to which of the two groups
they belong.
- Double-blind study. A controlled placebo study
in which neither the subjects nor the scientists
conducting the experiment know which subjects are
in the treatment group and which are in the
control group.
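The random assignment used in a randomized controlled study can be sketched as follows. The subject labels, the even split between groups, and the helper name `randomize_groups` are assumptions for illustration; actual studies may use other allocation schemes.

```python
import random

def randomize_groups(subjects, seed=None):
    """Shuffle the subjects and split them into treatment and control groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)  # chance alone decides who lands in which group
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical study with 20 subjects labeled 1 through 20.
subjects = list(range(1, 21))
treatment, control = randomize_groups(subjects, seed=13)
print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```

Because chance decides the assignment, confounding variables tend to be spread across both groups rather than concentrated in one.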
Conclusion
- Census
- Sample/Survey/Sample Bias
- Simple Random/Stratified Sampling
- Confounding Variables
- Controlled Study