Title: Stats 242.3(02)
1Stats 242.3(02)
- Statistical Theory and Methodology
3Text
- Dennis D. Wackerly, William Mendenhall III,
Richard L. Scheaffer, Mathematical Statistics
with Applications, 6th Edition, Duxbury Press
4Course Outline
5Introduction
6Sampling Distributions
- Chapter 7
- Sampling distributions related to the Normal
distribution
- The Central Limit Theorem
- The Normal approximation to the Binomial
7Estimation
- Chapter 8
- Properties of estimators
- Interval estimation
- Sample size determination
8Properties and Methods of Estimation
- Chapter 9
- The method of moments
- Maximum Likelihood estimation
- Sufficiency (Sufficient Statistics)
9Hypothesis testing
- Chapter 10
- Elements of a statistical test: type I and type
II errors
- The Z test: one and two samples
- Hypothesis testing for the means of the normal
distribution with small sample sizes
- Power and the Neyman-Pearson Lemma
- Likelihood ratio tests
10Linear and Nonlinear Models: Least Squares
Estimation
- Chapter 11
- Topics covered dependent on available time
11The Analysis of Variance
- Chapter 13
- Topics covered dependent on available time
12Nonparametric Statistical Methods
- Chapter 15
- Topics covered dependent on available time
13Introduction
14What is Statistics?
- It is the major mathematical tool of scientific
inference: methods for drawing conclusions from
data.
- The data are to some extent corrupted by some
component of random variation (random noise).
15Phenomena
- Deterministic
- Non-deterministic
16Deterministic Phenomena
- A mathematical model exists that allows accurate
prediction of outcomes of the phenomena (or
observations taken from the phenomena)
17Non-deterministic Phenomena
- Lack of perfect predictability
18Non-deterministic Phenomena
- Random
- Haphazard
19Random Phenomena
- No mathematical model exists that allows accurate
prediction of outcomes of the phenomena (or
observations).
- However, the outcomes (or observations) exhibit
statistical regularity in the long run, on the
average.
20Example
- Tossing of a Coin
- No mathematical model exists that allows accurate
prediction of the outcome of this phenomenon.
- However, in the long run, on the average,
approximately 50% of the time the coin is a head
and 50% of the time the coin is a tail.
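The long-run regularity described above can be illustrated with a small simulation. This is only a sketch: it assumes a fair coin and independent tosses, and the function name `running_proportion_of_heads` and the seed are choices made for this illustration.

```python
import random


def running_proportion_of_heads(n_tosses, seed=0):
    """Simulate fair coin tosses; return the proportion of heads after each toss."""
    rng = random.Random(seed)  # seeded so that the run is reproducible
    heads = 0
    proportions = []
    for toss in range(1, n_tosses + 1):
        heads += rng.random() < 0.5  # True counts as 1 head
        proportions.append(heads / toss)
    return proportions


props = running_proportion_of_heads(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>6} tosses: proportion of heads = {props[n - 1]:.4f}")
```

No single toss can be predicted, but the printed proportions should settle close to 0.5 as the number of tosses grows.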
21Haphazard Phenomena
- No mathematical model exists that allows accurate
prediction of outcomes of the phenomena (or
observations).
- No exhibition of statistical regularity in the
long run.
- Do such phenomena exist?
22- In both Statistics and Probability theory we are
concerned with studying random phenomena
23In probability theory
- The model is known and we are interested in
predicting the outcomes and observations of the
phenomena.
Diagram: model -> outcomes and observations
24In statistics
- The model is unknown.
- The outcomes and observations of the phenomena
have been observed.
- We are interested in determining the model from
the observations.
Diagram: outcomes and observations -> model
25Example - Probability
- A coin is tossed n = 100 times.
- We are interested in the observation, X, the
number of times the coin is a head.
- Assume the coin is balanced (i.e. p = the
probability of a head = ½).
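With the model known, probabilities for X can be computed directly. The sketch below assumes the tosses are independent, so that X follows a binomial distribution with n = 100 and p = ½; the helper name `binomial_pmf` and the particular events evaluated are choices made for this illustration.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P[X = x] for a binomial count of successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 100, 0.5

# Probability of exactly 50 heads, and of between 40 and 60 heads inclusive.
prob_exactly_50 = binomial_pmf(50, n, p)
prob_40_to_60 = sum(binomial_pmf(x, n, p) for x in range(40, 61))

print(f"P[X = 50]        = {prob_exactly_50:.4f}")
print(f"P[40 <= X <= 60] = {prob_40_to_60:.4f}")
```

This is the probability direction: the model (n and p) is fixed and statements are made about outcomes that have not yet been observed.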
26Example - Statistics
- We are interested in the success rate, p, of a
new surgical procedure.
- The procedure is performed n = 100 times.
- X, the number of times the procedure is performed
successfully, is 82.
- The success rate p is unknown.
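27- If the success rate p were known,
- then X would follow the binomial distribution
$P[X = x] = \binom{n}{x} p^{x} (1 - p)^{n - x}, \quad x = 0, 1, \ldots, n$
This equation allows us to predict the value of
the observation, X.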
28- In the case when the success rate p is unknown,
- the following equation is still true, with the
success rate p unknown:
$P[X = x] = \binom{n}{x} p^{x} (1 - p)^{n - x}, \quad x = 0, 1, \ldots, n$
We will want to use the value of the observation,
X = 82, to make a decision regarding the value of
p.
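One way to go from the observation back to the model is to ask which value of p makes X = 82 most probable. The sketch below assumes a binomial model with independent procedures and searches a grid of candidate values. The slide does not say which method is intended here, so maximising the likelihood (a topic listed for Chapter 9) is an assumption of this example, as are the names `binomial_likelihood`, `grid` and `best_p`.

```python
from math import comb


def binomial_likelihood(p, x=82, n=100):
    """Probability of observing x successes in n trials if the success rate is p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Candidate success rates 0.001, 0.002, ..., 0.999.
grid = [i / 1000 for i in range(1, 1000)]

# The candidate under which the observed X = 82 is most probable.
best_p = max(grid, key=binomial_likelihood)

print(f"most plausible success rate on the grid: {best_p}")
```

The grid search keeps the idea visible; under this model the maximising value is the observed proportion X / n.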
29Some definitions
30A population
- This is the complete collection of subjects
(objects) that are of interest in the study.
- There may be (and frequently are) more than one
population, in which case a major objective is
that of comparison.
31A case (elementary sampling unit)
- This is an individual unit (subject) of the
population.
32A variable
- a measurement or type of measurement that is made
on each individual case in the population.
33Types of variables
- Some variables may be measured on a numerical
scale while others are measured on a categorical
scale.
- The nature of the variables has a great influence
on which analysis will be used.
34- For variables measured on a numerical scale the
measurements will be numbers.
- Ex: Age, Weight, Systolic Blood Pressure
- For variables measured on a categorical scale the
measurements will be categories.
- Ex: Sex, Religion, Heart Disease
35Note
- Sometimes variables can be measured on both a
numerical scale and a categorical scale.
- In fact, variables measured on a numerical scale
can always be converted to measurements on a
categorical scale.
36Example
- The following variables were evaluated for a
study of individuals receiving head injuries in
Saskatchewan.
- Cause of the injury (categorical)
- Motor vehicle accident
- Fall
- Violence
- other
37- Time of year (date) (numerical or categorical)
- summer
- fall
- winter
- spring
- Sex of injured individual (categorical)
- male
- female
38- Age (numerical or categorical)
- < 10
- 10 - 19
- 20 - 29
- 30 - 49
- 50 - 65
- > 65
- Mortality (categorical)
- Died from injury
- alive
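Converting a numerical variable to a categorical one amounts to assigning each value to a group. The sketch below does this for age, using the six age groups of the head injury example. The function name `age_group`, the example ages, and the handling of the boundary at 65 (placed in the 50-65 group) are assumptions made for this illustration.

```python
def age_group(age):
    """Map a numerical age in years to one of six age categories."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age < 10:
        return "< 10"
    if age < 20:
        return "10-19"
    if age < 30:
        return "20-29"
    if age < 50:
        return "30-49"
    if age <= 65:  # assumption: exactly 65 belongs to the 50-65 group
        return "50-65"
    return "> 65"


ages = [4, 17, 29, 30, 65, 66, 81]  # hypothetical ages, not study data
print([age_group(a) for a in ages])
```

The conversion only works in one direction: the age group can be computed from the age, but the age cannot be recovered from the group.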
39Types of variables
- In addition some variables are labeled as
dependent variables and some variables are
labeled as independent variables.
40- This usually depends on the objectives of the
analysis.
- Dependent variables are output or response
variables while the independent variables are the
input variables or factors.
41- Usually one is interested in determining
equations that describe how the dependent
variables are affected by the independent
variables
42Example
- Suppose we are collecting data on
- Blood Pressure
- Height
- Weight
- Age
43- Suppose we are interested in how
- Blood Pressure
- is influenced by the following factors
- Height
- Weight
- Age
44- Then
- Blood Pressure
- is the dependent variable
- and
- Height
- Weight
- Age
- Are the independent variables
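One common form for an equation relating the dependent variable to the independent variables is a linear model. The form below is an assumption made for illustration and is not stated on the slides; the coefficients $\beta_0, \ldots, \beta_3$ and the random error term $\varepsilon$ are symbols introduced here.

```latex
\text{Blood Pressure} = \beta_0 + \beta_1\,\text{Height} + \beta_2\,\text{Weight} + \beta_3\,\text{Age} + \varepsilon
```

Under this sketch the statistical problem is to estimate the unknown coefficients from the observed cases, for example by least squares.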
45Example Head Injury study
- Suppose we are interested in how
- Mortality
- is influenced by the following factors
- Cause of head injury
- Time of year
- Sex
- Age
46- Then
- Mortality
- is the dependent variable
- and
- Cause of head injury
- Time of year
- Sex
- Age
- Are the independent variables
47- A dependent variable is also called a response
variable.
- An independent variable is also called a
predictor variable.
48A sample
- Is a subset of the population
49In statistics
- One draws conclusions about the population based
on data collected from a sample
50Reasons
- It is less costly to collect data from a sample
than from the entire population.
- Accuracy
51Accuracy
- Data from a sample sometimes leads to more
accurate conclusions than data from the entire
population.
- Costs saved from using a sample can be directed
to obtaining more accurate observations on each
case in the sample.
52Types of Samples
- different types of samples are determined by how
the sample is selected.
53Convenience Samples
- In a convenience sample the subjects that are
most convenient to the researcher are selected as
objects in the sample.
- This is not a very good procedure for inferential
Statistical Analysis but is useful for
exploratory preliminary work.
54Quota samples
- In quota samples subjects are chosen conveniently
until quotas are met for different subgroups of
the population.
- This also is useful for exploratory preliminary
work.
55Random Samples
- Random samples of a given size are selected in
such a way that all possible samples of that size
have the same probability of being selected.
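The definition can be checked empirically on a tiny population. The sketch below lists every possible sample of size 2 from a hypothetical population of five labelled cases, then draws many random samples with Python's `random.sample` and counts how often each possible sample appears. The population labels, seed and number of draws are assumptions made for this illustration.

```python
import random
from collections import Counter
from itertools import combinations

population = ["A", "B", "C", "D", "E"]  # hypothetical cases
sample_size = 2

# Every possible sample of the given size (order ignored): 10 in total.
all_samples = list(combinations(population, sample_size))

rng = random.Random(1)  # seeded so that the run is reproducible
n_draws = 100_000
counts = Counter()
for _ in range(n_draws):
    drawn = tuple(sorted(rng.sample(population, sample_size)))
    counts[drawn] += 1

# Each possible sample should turn up with relative frequency near 1/10.
for s in all_samples:
    print(s, round(counts[s] / n_draws, 3))
```

With 10 possible samples, each should be selected in roughly one tenth of the draws.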
56- Convenience samples and quota samples are useful
for preliminary studies. It is however difficult
to assess the accuracy of estimates based on these
types of sampling schemes.
- Sometimes however one has to be satisfied with a
convenience sample and assume that it is
equivalent to a random sampling procedure.
57Some other definitions
58A population statistic (parameter)
- Any quantity computed from the values of
variables for the entire population.
59A sample statistic
- Any quantity computed from the values of
variables for the cases in the sample.
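The difference between a parameter and a sample statistic can be seen by computing the same quantity twice, once over the whole population and once over a sample. The sketch below uses a simulated population of 10,000 systolic blood pressure values. The values are generated purely for illustration and are not real data, and the population size, sample size and seed are assumptions.

```python
import random
from statistics import mean

rng = random.Random(42)  # seeded so that the run is reproducible

# Hypothetical population: simulated systolic blood pressures (mm Hg).
population = [rng.gauss(120, 15) for _ in range(10_000)]

# Parameter: computed from every case in the population.
population_mean = mean(population)

# Sample statistic: computed only from the cases in a random sample.
sample = rng.sample(population, 100)
sample_mean = mean(sample)

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (statistic):     {sample_mean:.2f}")
```

The parameter is a fixed number for a given population, while the statistic changes from sample to sample; rerunning with a different seed shows the variation.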