zscore: how many units a value is above or below the mean

Provided by: University354
1
Test Statistics
z-score: how many standard deviation units a value is above or below
the population mean
z-statistic: how many standard error units a sample mean is
above/below the population mean
Logic of the z-test: it is based on the assumption that the null
hypothesis is right. If this is true, how likely am I to obtain a
sample average of the observed value? The smaller P is, the more
likely we are to reject the null hypothesis.
2
Test Statistics
General form: (obtained difference between data and hypothesis) /
(standard difference expected by chance)
= (sample mean - population mean) / standard error
Types of statistics: 1. z statistic 2. t statistic
When we know the population variance/standard deviation, we use z; if
not, we use t, OR if the sample size is > 25, we use z.
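The general form above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's own calculation: the numbers (population mean 100, sigma 15, a sample of 36 with mean 106) and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """(sample mean - population mean) / standard error, with SE = sigma / sqrt(n)."""
    standard_error = pop_sd / sqrt(n)
    return (sample_mean - pop_mean) / standard_error

def two_tailed_p(z):
    """Chance of a z at least this extreme, in either tail, if the null is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical data, chosen only to illustrate the formula.
z = z_statistic(sample_mean=106, pop_mean=100, pop_sd=15, n=36)
p = two_tailed_p(z)
print(round(z, 2), round(p, 4))
```

The smaller the printed P, the less plausible the observed sample mean is under the null hypothesis.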
3
Test Statistics
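z statistic: $z = \frac{\bar{X} - \mu}{SE}$
Our SE for the average calculation has been $SE = \frac{\sigma}{\sqrt{n}}$
Example: (worked equations not preserved in this transcript)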
4
Test Statistics
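Types of statistics: 1. z statistic 2. t statistic
The difference in calculation between the z and the t is in the
calculation of the SE: an exact value is computed in z, an estimated
value is computed in t.
z SE: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$, where $\sigma = \sqrt{SS/N}$
t SE: $s_{\bar{X}} = \frac{s}{\sqrt{n}}$, where $s = \sqrt{SS/(n-1)}$
and $SS = \sum (X - \bar{X})^2$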
5
Review of Equations for population data
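X      (X - μ)   (X - μ)²
75        0         0
86       11       121
63      -12       144
76        1         1
Variance = mean squared deviation: sum the squared deviations, then
obtain the mean: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
Standard deviation = r.m.s. (root mean square) of the deviation
scores: $\sigma = \sqrt{\sigma^2}$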
6
Review of Equations for population data
X:    75    86    63    76    ΣX = 300
X²:   5625  7396  3969  5776  ΣX² = 22766
Definitional formula for the sums of squares: $SS = \sum (X - \mu)^2$
Computational formula for the sums of squares: $SS = \sum X^2 - \frac{(\sum X)^2}{N}$
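As a check on the two formulas, here is a small Python sketch using the four scores from the slide. The variable names are illustrative only.

```python
from math import sqrt

scores = [75, 86, 63, 76]   # the X values from the slide
N = len(scores)
mu = sum(scores) / N        # population mean: 300 / 4

# Definitional formula: sum of squared deviations from the mean.
ss_definitional = sum((x - mu) ** 2 for x in scores)

# Computational formula: sum of X squared minus (sum of X) squared over N.
ss_computational = sum(x ** 2 for x in scores) - sum(scores) ** 2 / N

variance = ss_definitional / N   # mean squared deviation
sd = sqrt(variance)              # root mean square of the deviations

print(ss_definitional, ss_computational, variance, round(sd, 2))
```

Both routes give SS = 266, so the population variance is 266 / 4 = 66.5.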
7
Review of Equations for sample data
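X      (X - X̄)   (X - X̄)²
75        0         0
86       11       121
63      -12       144
76        1         1
Variance = mean squared deviation: sum the squared deviations, then
obtain the mean (dividing by n - 1 for sample data):
$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
Standard deviation = r.m.s. (root mean square) of the deviation
scores: $s = \sqrt{s^2}$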
8
Review of Equations for sample data
X:    75    86    63    76    ΣX = 300
X²:   5625  7396  3969  5776  ΣX² = 22766
Definitional formula for the sums of squares: $SS = \sum (X - \bar{X})^2$
Computational formula for the sums of squares: $SS = \sum X^2 - \frac{(\sum X)^2}{n}$
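The population and sample versions differ only in the divisor. A short sketch with Python's standard `statistics` module, treating the same four scores first as a population and then as a sample (the n - 1 divisor for sample data is the usual convention, assumed here):

```python
from math import sqrt
from statistics import pvariance, variance

scores = [75, 86, 63, 76]
n = len(scores)

pop_var = pvariance(scores)     # SS / N: scores treated as a whole population
sample_var = variance(scores)   # SS / (n - 1): estimate from a sample

# Estimated standard error of the mean, the SE used by the t statistic.
estimated_se = sqrt(sample_var / n)

print(pop_var, round(sample_var, 2), round(estimated_se, 2))
```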
9
Standard errors
s = 4, n = 100
Standard error for the sum of the draws: $\sqrt{n} \times s = \sqrt{100} \times 4 = 40$
Standard error for the average of the draws: SE for the sum / n
= 40 / 100 = 0.40
Standard error for the count of the draws from a 0 - 1 box:
$\sqrt{n} \times$ (SD of the box)
Standard error for the percent: (SE for the count from a 0 - 1 box /
the number of draws) x 100%
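The four standard errors can be sketched as small Python functions. The first print uses the slide's values (s = 4, n = 100); the 0 - 1 box with half of its tickets marked 1 is a hypothetical example, and the function names are illustrative.

```python
from math import sqrt

def se_sum(sd, n):
    """SE for the sum of n draws: sqrt(n) times the SD of the box."""
    return sqrt(n) * sd

def se_average(sd, n):
    """SE for the average of n draws: SE for the sum divided by n."""
    return se_sum(sd, n) / n

def se_count(p, n):
    """SE for the count from a 0-1 box where a fraction p of tickets are 1s."""
    box_sd = sqrt(p * (1 - p))
    return sqrt(n) * box_sd

def se_percent(p, n):
    """SE for the percent: SE for the count over the number of draws, times 100."""
    return se_count(p, n) / n * 100

print(se_sum(4, 100), se_average(4, 100))
# Hypothetical 0-1 box: half the tickets are 1s, 100 draws.
print(se_count(0.5, 100), se_percent(0.5, 100))
```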
10
Note
In this single sample design, if we knew the σ value OR N > 25, we
would use the z-statistic. The only difference would lie in the
manner in which we calculated the standard error for the test
statistic.
Independent Measures Research Designs
Most often, researchers do NOT have prior knowledge about the
population/s that they are studying.
  • experiments involve the comparison of two (or more) samples in an
    attempt to compare two (or more) populations
  • this type of research requires the comparison of two separate
    samples
  • the term separate implies independence between the two samples
  • known as a between-subjects design or between-groups design
The independent measures design evaluates differences between
treatments by looking at differences between two separate groups of
subjects.
11
Independent Measures Research Designs
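As before: t = (sample statistic - population parameter) / estimated
standard error
But now we need to accommodate the between groups design, so
$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}}$
known as the independent measures t statistic
Points to note: $(\mu_1 - \mu_2)$ is specified by the null hypothesis
and generally equals 0
So this independent measures t simplifies to
$t = \frac{\bar{X}_1 - \bar{X}_2}{s_{\bar{X}_1 - \bar{X}_2}}$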
12
Independent Measures Research Designs
The standard error for a sample mean difference: $\sqrt{a^2 + b^2}$,
where a = the SE for the first quantity and b = the SE for the
second quantity
Another perspective
Keep in mind, in the single sample case: $s_{\bar{X}} = \sqrt{\frac{s^2}{n}}$
Here we have two sources of error, because each sample mean estimates
its own population mean, and we need to know the total amount of
error involved in using these two sample statistics to estimate the
two population parameters.
In the independent measures t statistic, when the two groups have
equal sample sizes (n1 = n2), the standard error of the sample mean
difference is
$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
13
Independent Measures Research Designs
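Example:
sample A: n = 4, SS = 36
sample B: n = 4, SS = 40
Now $s_A^2 = 36/3 = 12$ and $s_B^2 = 40/3 \approx 13.33$, so
$s_{\bar{X}_A - \bar{X}_B} = \sqrt{\frac{12}{4} + \frac{13.33}{4}} = \sqrt{6.33} \approx 2.52$
Notice that this works out the same if we calculate in the other way,
as $\sqrt{a^2 + b^2}$, where a = the SE for the first quantity and
b = the SE for the second quantity:
$a = \sqrt{12/4} \approx 1.73$ and $b = \sqrt{13.33/4} \approx 1.83$
Which produces $\sqrt{1.73^2 + 1.83^2} \approx 2.52$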
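The arithmetic for this example can be checked with a short Python sketch. It assumes each sample variance is SS / (n - 1), which is the usual estimate; the variable names are illustrative.

```python
from math import sqrt

# Values from the slide: two samples of 4 with their sums of squares.
n_a, ss_a = 4, 36
n_b, ss_b = 4, 40

var_a = ss_a / (n_a - 1)   # sample variance of A
var_b = ss_b / (n_b - 1)   # sample variance of B

# Route 1: add the two variance-over-n terms, then take the root.
se_difference = sqrt(var_a / n_a + var_b / n_b)

# Route 2: find each SE first, then combine as sqrt(a^2 + b^2).
a = sqrt(var_a / n_a)
b = sqrt(var_b / n_b)
se_from_parts = sqrt(a ** 2 + b ** 2)

print(round(se_difference, 2), round(se_from_parts, 2))
```

Both routes agree because squaring each SE simply recovers the variance-over-n term it came from.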
14
Independent Measures Research Design
With equal n, the independent t:
$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Recall: larger samples = more precision
  • as n increases, precision increases
  • larger samples produce better estimates
  • with unequal n samples, we want to weight the samples accordingly:
    more weight to the larger sample, less weight to the smaller
    sample
With unequal n, we thus need to adjust the manner in which the SE is
estimated by weighting each sample variance by its respective df,
because larger df = larger n = more precision.
Thus we use a pooled variance term in this case instead of simply
adding the separate variances as in the equal n case.
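A minimal sketch of the pooled-variance approach follows. The slide's own equation is not preserved, so this assumes the usual pooled formula, (SS1 + SS2) / (df1 + df2). The unequal-n data (means 20 and 14, SS 36 and 84, n 4 and 8) are hypothetical.

```python
from math import sqrt

def pooled_variance(ss1, n1, ss2, n2):
    """Weight each sample by its df: (SS1 + SS2) / (df1 + df2)."""
    return (ss1 + ss2) / ((n1 - 1) + (n2 - 1))

def se_difference_pooled(ss1, n1, ss2, n2):
    """SE of the mean difference using the pooled variance for both samples."""
    sp2 = pooled_variance(ss1, n1, ss2, n2)
    return sqrt(sp2 / n1 + sp2 / n2)

def independent_t(mean1, mean2, ss1, n1, ss2, n2):
    """Independent measures t with the null difference taken to be 0."""
    return (mean1 - mean2) / se_difference_pooled(ss1, n1, ss2, n2)

# Hypothetical unequal-n data, for illustration only.
t = independent_t(20, 14, ss1=36, n1=4, ss2=84, n2=8)
print(round(t, 2))
```

With equal n the pooled version gives the same SE as adding the separate variance terms, so the pooled form can be used in both cases.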
15
Assumptions
Assumptions underlying the t statistic:
1. Independence of observations within each group: like draws with
   replacement from a box
2. Population data must be normally distributed
3. Homogeneity of variance: the variability must be equal in both
   populations
16
Dependent Measures Research Designs
  • Dependent measures designs: also called repeated measures designs
    or related samples designs
  • data are obtained by repeatedly measuring the same set of people
  • known as a within-subjects design
  • a single sample of subjects is used to compare 2 (or more)
    different treatment conditions; this provides 2 (or more) sets of
    scores FROM the same people
  • Advantages
  • reduces any risk that the two different sets of subjects differ in
    any way
  • could be important when not randomly assigning
  • Matched subjects design
  • match subjects on a variable related to the dependent variable
  • example: match on IQ for testing a new reading program

17
Two tailed Hypothesis Tests
The critical region is located in both tails of the distribution
because the null and alternative hypotheses do not specify a
direction.
[Figure: a pre-treatment distribution with possible post-treatment
distributions on either side; the effect could be positive or
negative]
Note: even if the treatment had no effect, we know that simply due to
chance we will not find exactly the same mean. The question is: is
the observed difference large enough to reject the H0?
18
Two tailed Critical regions
Alpha = .05: critical z = -1.96 and 1.96
Alpha = .01: critical z = -2.58 and 2.58
Alpha = .001: critical z = -3.30 and 3.30
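These two tailed cut-offs can be recovered from the standard normal distribution. A small sketch using Python's standard library (the function name is illustrative):

```python
from statistics import NormalDist

def two_tailed_critical_z(alpha):
    """Split alpha across both tails and return the positive cut-off."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.05, 0.01, 0.001):
    print(alpha, round(two_tailed_critical_z(alpha), 2))
```

The computed value for alpha = .001 is about 3.29, which rounds to the 3.30 used in the slides.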
19
One tailed Hypothesis Tests/Directional
Hypothesis Tests
The critical region is located in one tail of the distribution
because a specific effect is expected for the treatment, and this is
reflected in the null and alternative hypotheses.
[Figure: a pre-treatment distribution and the expected post-treatment
distribution; the effect is expected to be positive]
Note: again, even if the treatment had no effect, we know that simply
due to chance we will not find exactly the same mean. The question
is: is the observed difference large enough to reject the H0?
20
One tailed Critical regions
Alpha = .05: critical z = 1.65
Alpha = .01: critical z = 2.33
Alpha = .001: critical z = 3.09
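One tailed cut-offs put all of alpha in a single tail. A quick way to check them against the standard normal distribution (the function name is illustrative):

```python
from statistics import NormalDist

def one_tailed_critical_z(alpha):
    """All of alpha sits in the upper tail, so the cut-off is the 1 - alpha quantile."""
    return NormalDist().inv_cdf(1 - alpha)

for alpha in (0.05, 0.01, 0.001):
    print(alpha, round(one_tailed_critical_z(alpha), 3))
```

The computed values are approximately 1.645, 2.326, and 3.090.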
21
One tailed versus two tailed tests
Goals of hypothesis testing: to determine
  • whether or not a treatment/intervention has an effect on a
    population
  • whether two groups are different
Single sample example, when the population mean is known:
  • randomly select a sample
  • administer the treatment
  • compare the sample result with the known population mean
  • if the two differ significantly (by a large enough amount), reject
    the H0
  • if the two are similar, we fail to reject the H0
CRITICAL FACTOR: the size of the difference between the two groups.
One tailed tests require a smaller difference: a critical z of 1.65,
versus the two tailed tests, which have a critical z value of 1.96.
22
Statistical Power
Recall the grid of possible outcomes in hypothesis testing:
                         REALITY
DECISION             No effect (H0 true)    Effect exists (H0 false)
Reject H0            Type I error           Correct
Fail to reject H0    Correct                Type II error
Note: Type I errors are minimized by selecting a low alpha level
(alpha level = maximum probability of a Type I error).
POWER: the probability that the hypothesis test correctly rejects a
false H0. The more power, the greater the chance of correctly
rejecting a false H0.
23
Statistical Power
Recall the grid of possible outcomes in hypothesis testing:
                         REALITY
DECISION             No effect (H0 true)    Effect exists (H0 false)
Reject H0            Type I error           Correct
Fail to reject H0    Correct                Type II error
When a treatment effect exists in reality, then the H0 is false and
there are only two possible outcomes:
1. we fail to discover the effect (Type II error)
2. we correctly reject the false H0
Note: P(Type II error) = Beta (β). Because there are only two
possible outcomes, their probabilities must add to 1.0, so we know
that P(correctly rejecting a false H0) = 1 - β = POWER
24
Statistical Power
Power and Treatment Effect Size
  • power depends on the size of the effect that the treatment has in
    the population
  • as effect size increases, power increases; as effect size
    decreases, power decreases
  • thus we examine power NOT as a single value, but as a function of
    the magnitude of the effect size
Another look at power
  • power = P(correctly rejecting a false H0)
  • we reject the H0 when the sample mean falls in the critical region
SO power is the probability of obtaining sample means in the critical
region when the H0 is false
[Figure: null distribution with critical values -1.96 and 1.96,
alpha = .05]
25
Statistical Power: an example
Question: does caffeine influence reaction time?
Assume that in the untreated population reaction times are normally
distributed with μ = 200: the null distribution.
Also assume that caffeine does have an effect, such that people given
caffeine have reaction times that are 20 milliseconds lower: the
treatment distribution.
[Figure: null and treatment distributions with critical values -1.96
and 1.96 (alpha = .05); regions of the treatment distribution
labelled 1 - β and β = P(Type II error)]
We test the H0, which assumes the null distribution is true, so the
critical region is defined by the extreme, unlikely sample means in
the tails of the null distribution. Many of the possible sample means
in the treatment distribution fall in the critical region, and this
defines P(correctly rejecting a false H0) = power = 1 - β
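Power for this example can be sketched numerically. The slides give the two means (200 and 180) but no standard deviation or sample size for this step, so sigma = 40 and n = 25 below are assumptions made purely for illustration, as is the function name.

```python
from math import sqrt
from statistics import NormalDist

def power_two_tailed(mu_null, mu_treatment, sigma, n, alpha=0.05):
    """Probability that a sample mean from the treatment distribution
    lands in the critical region defined on the null distribution."""
    se = sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower = mu_null - z_crit * se   # lower boundary of the critical region
    upper = mu_null + z_crit * se   # upper boundary of the critical region
    sampling_dist = NormalDist(mu_treatment, se)
    return sampling_dist.cdf(lower) + (1 - sampling_dist.cdf(upper))

# sigma and n are hypothetical; the means come from the caffeine example.
power = power_two_tailed(200, 180, sigma=40, n=25)
beta = 1 - power   # P(Type II error)
print(round(power, 3), round(beta, 3))
```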
26
Statistical Power: an example
Question: does caffeine influence reaction time?
Assume that in the untreated population reaction times are normally
distributed with μ = 200: the null distribution.
Also assume that caffeine does have an effect, such that people given
caffeine have reaction times that are 10 milliseconds lower: the
treatment distribution.
[Figure: null and treatment distributions with critical values -1.96
and 1.96 (alpha = .05); regions of the treatment distribution
labelled 1 - β and β = P(Type II error)]
Again we test the H0, and the critical region is defined by the
extreme, unlikely sample means in the tails of the null distribution.
Here the treatment distribution is closer to the null distribution,
so fewer of the possible sample means in the treatment distribution
fall in the critical region, and power (1 - β) is smaller while
P(Type II error) is larger.
Note: the larger the effect size, the more power; the smaller the
effect size, the less power.
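To see power as a function of effect size, the same calculation can be repeated for several treatment effects. As before, sigma = 40 and n = 25 are hypothetical values chosen for illustration; only the null mean of 200 and the 10 and 20 millisecond effects come from the example.

```python
from math import sqrt
from statistics import NormalDist

def power_two_tailed(mu_null, mu_treatment, sigma, n, alpha=0.05):
    """Two tailed power: chance a treated sample mean falls in the critical region."""
    se = sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    sampling_dist = NormalDist(mu_treatment, se)
    lower = mu_null - z_crit * se
    upper = mu_null + z_crit * se
    return sampling_dist.cdf(lower) + (1 - sampling_dist.cdf(upper))

# Effects in milliseconds below the null mean of 200 (5 and 30 are extra
# hypothetical points to show the trend).
effects = [5, 10, 20, 30]
powers = [power_two_tailed(200, 200 - e, sigma=40, n=25) for e in effects]
for effect, pw in zip(effects, powers):
    print(effect, round(pw, 3))
```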
27
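Statistical Power: other factors influencing it
Alpha level
Same example: untreated population normal with μ = 200; treatment
population normal with μ = 180
[Figure: alpha = .05, critical values -1.96 and 1.96; regions
labelled 1 - β and β = P(Type II error)]
[Figure: alpha = .01, critical values -2.58 and 2.58; regions
labelled 1 - β and β = P(Type II error)]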
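The effect of the alpha level on power can be sketched the same way. The means (200 and 180) come from the example; sigma = 40 and n = 25 are assumed values used only to make the calculation concrete.

```python
from math import sqrt
from statistics import NormalDist

def power_two_tailed(mu_null, mu_treatment, sigma, n, alpha):
    """Two tailed power for a given alpha level."""
    se = sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    sampling_dist = NormalDist(mu_treatment, se)
    lower = mu_null - z_crit * se
    upper = mu_null + z_crit * se
    return sampling_dist.cdf(lower) + (1 - sampling_dist.cdf(upper))

power_05 = power_two_tailed(200, 180, sigma=40, n=25, alpha=0.05)
power_01 = power_two_tailed(200, 180, sigma=40, n=25, alpha=0.01)
print(round(power_05, 3), round(power_01, 3))
```

A stricter alpha pushes the critical boundaries outward, so fewer treated sample means reach the critical region and power drops.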
28
Statistical Power: other factors influencing it
One tailed versus two tailed tests
[Figure: alpha = .05; one tailed critical value 1.65 versus two
tailed critical values -1.96 and 1.96]
29
Statistical Power: other factors influencing it
One tailed versus two tailed tests
[Figure: alpha = .01; one tailed critical value 2.33 versus two
tailed critical values -2.58 and 2.58]
30
Statistical Power: other factors influencing it
One tailed versus two tailed tests
[Figure: alpha = .001; one tailed critical value 3.09 versus two
tailed critical values -3.30 and 3.30]
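A sketch comparing one tailed and two tailed power at each alpha level. It uses the caffeine example's means (200 and 180) with a one tailed test in the lower tail, since reaction times are expected to drop. Sigma = 40 and n = 25 are hypothetical, and the function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def power_two_tailed(mu_null, mu_treatment, se, alpha):
    """Critical region split across both tails of the null distribution."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    dist = NormalDist(mu_treatment, se)
    return dist.cdf(mu_null - z_crit * se) + (1 - dist.cdf(mu_null + z_crit * se))

def power_one_tailed_lower(mu_null, mu_treatment, se, alpha):
    """Whole critical region in the lower tail of the null distribution."""
    cutoff = mu_null - NormalDist().inv_cdf(1 - alpha) * se
    return NormalDist(mu_treatment, se).cdf(cutoff)

se = 40 / sqrt(25)   # hypothetical sigma and n
results = {}
for alpha in (0.05, 0.01, 0.001):
    one = power_one_tailed_lower(200, 180, se, alpha)
    two = power_two_tailed(200, 180, se, alpha)
    results[alpha] = (one, two)
    print(alpha, round(one, 3), round(two, 3))
```

When the effect is in the predicted direction, the one tailed test has more power at every alpha level because its cut-off sits closer to the null mean.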
31
Statistical Power: other factors influencing it
Sample size
Same example: untreated population normal with μ = 200; treatment
population normal with μ = 180
Smaller sample size means less precision and more variability
[Figure: N = 25; regions labelled 1 - β and β = P(Type II error)]
Larger sample size means more precision, less variability, and thus
more power
[Figure: N = 100; regions labelled 1 - β and β = P(Type II error)]
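The sample size comparison can be sketched with the slide's two values, N = 25 and N = 100. The population standard deviation is not given, so sigma = 40 is an assumption for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def power_two_tailed(mu_null, mu_treatment, sigma, n, alpha=0.05):
    """Two tailed power; a larger n shrinks the standard error."""
    se = sigma / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    dist = NormalDist(mu_treatment, se)
    return dist.cdf(mu_null - z_crit * se) + (1 - dist.cdf(mu_null + z_crit * se))

power_n25 = power_two_tailed(200, 180, sigma=40, n=25)
power_n100 = power_two_tailed(200, 180, sigma=40, n=100)
print(round(power_n25, 3), round(power_n100, 3))
```

Quadrupling n halves the standard error, which narrows both sampling distributions and moves most of the treatment distribution into the critical region.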