Elementary hypothesis testing - PowerPoint PPT Presentation

1
Elementary hypothesis testing
  • Purpose of hypothesis testing
  • Type of hypotheses
  • Type of errors
  • Critical regions
  • Significance levels
  • Power of tests
  • Hypothesis testing vs intervals
  • R commands for tests

2
Example
  • This example will be used throughout the lecture. The data set was taken from R (it is the shoes dataset in the MASS package).
  • The purpose of the experiment was to compare the wear of shoes made from two materials, A and B. For this purpose 10 boys were selected, and for each boy a pair of shoes was made using materials A and B: one material was used for the right shoe and the other for the left shoe. This type of experiment is called a paired design.
  • A 13.2 8.2 10.9 14.3 10.7 6.6 9.5 10.8 8.8 13.3
  • B 14.0 8.8 11.2 14.2 11.8 6.4 9.8 11.3 9.3 13.6
  • For simplicity, let us consider the difference between the A and B vectors:
  • -0.8 -0.6 -0.3 0.1 -1.1 0.2 -0.3 -0.5 -0.5 -0.3
  • The sample mean is -0.41 and the sample standard deviation is 0.39.
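These summary statistics can be checked directly. The short Python sketch below uses only the standard library (Python stands in here for the R session the slides assume):

```python
import statistics

# Shoe-wear differences, material A minus material B (MASS::shoes data)
d = [-0.8, -0.6, -0.3, 0.1, -1.1, 0.2, -0.3, -0.5, -0.5, -0.3]

mean_d = statistics.mean(d)   # sample mean
sd_d = statistics.stdev(d)    # sample standard deviation (n - 1 denominator)

print(round(mean_d, 2))  # -0.41
print(round(sd_d, 2))    # 0.39
```

Note that 0.39 is the standard deviation of the differences; the variance itself is about 0.15.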

3
Purpose of hypothesis testing
  • Statistical hypotheses are in general different from scientific ones. Scientific hypotheses deal with the behaviour of scientific subjects, such as the interactions between all particles in the universe, and in general they cannot be tested statistically. Statistical hypotheses deal with the behaviour of observable random variables: they are testable by observing some set of random variables, and they are usually related to the distribution(s) of the observed random variables.
  • For example, if we have observed two sets of random variables, x = (x1, x2, ..., xn) and y = (y1, y2, ..., ym), then one natural question is: are the means of these two sets different? That is a statistically testable hypothesis. Other questions may arise: do these two sets of random variables come from populations with the same variance? Do they come from populations with the same distribution? These questions can be tested using the observed samples.

4
Types of hypotheses
  • Hypotheses can in general be divided into two categories: a) parametric and b) non-parametric. Parametric hypotheses concern situations where the distribution of the population is known; they depend on the value of one or several parameters of this distribution. Non-parametric hypotheses concern situations where none of the parameters of the distribution is specified in the statement of the hypothesis. For example, the hypothesis that two sets of random variables come from the same distribution is a non-parametric one.
  • Parametric hypotheses can in turn be divided into two families. 1) Simple hypotheses are those in which all parameters of the distribution are specified; for example, the hypothesis that a set of random variables comes from a normally distributed population with known mean and known variance is a simple hypothesis. 2) Composite hypotheses are those in which some parameters of the distribution are specified and others remain unspecified; for example, the hypothesis that a set of random variables comes from a normally distributed population with a given mean but unknown variance is a composite hypothesis.

5
Errors in hypothesis testing
  • A hypothesis is usually not tested alone; it is tested against some alternative. The hypothesis being tested is called the null-hypothesis and is denoted by H0, and the alternative hypothesis is denoted by H1 (subscripts may differ and may reflect the nature of the alternative hypothesis). The null-hypothesis gets the benefit of the doubt. There are two possible conclusions: reject the null-hypothesis, or do not reject it. H0 is only rejected if the sample contains sufficiently strong evidence that it is not true. Usually testing a hypothesis comes down to evaluating some test statistic (a function of the sample points): if its value belongs to some region w, the hypothesis is rejected. This region is called the critical region; the region complementary to it, W-w, is called the acceptance region. By rejecting or accepting a hypothesis we can make two types of error:
  • Type I error: reject H0 when it is true.
  • Type II error: accept H0 when it is false.
  • Type I errors are usually considered to be more serious than Type II errors.
  • Type I errors define significance levels, and Type II errors define the power of the test. In an ideal world we would like to minimise both of these errors.

6
Power of a test
  • The probability of a Type I error is equal to the size of the critical region, α. The probability of a Type II error is a function of the alternative hypothesis (say H1); this probability is usually denoted by β. Using the notation of probability we can write:
  • P(x ∈ w | H0) = α and P(x ∈ W-w | H1) = β
  • where x is the sample point, w is the critical region and W-w is the acceptance region. If the sample point belongs to the critical region then we reject the null-hypothesis. The equations above are nothing other than the Type I and Type II errors written in probabilistic language.
  • The complementary probability of the Type II error, 1-β, is called the power of the test of the null hypothesis against the alternative hypothesis. β is the probability of accepting the null-hypothesis when the alternative hypothesis is true, and 1-β is the probability of rejecting H0 when H1 is true.
  • The power of the test is a function of α, the alternative hypothesis H1, and the probability distributions conditional on H0 and H1.

7
Critical region
  • Let us assume that we want to test whether some parameter θ of the population is equal to a given value, against an alternative hypothesis. Then we can write (for example):
  • H0: θ = θ0 against H1: θ < θ0
  • The test statistic is usually a point estimate of θ, or somehow related to it. If the critical region defined by this hypothesis is an interval (-∞, cu], then cu is called the critical value: it defines the upper limit of the critical region. All values of the statistic to the left of cu lead to rejection of the null-hypothesis; a value of the test statistic to the right of cu leads to not rejecting the hypothesis. This type of hypothesis is called a left one-sided hypothesis. The problem of hypothesis testing is either, for a given significance level, to find cu, or, for a given sample statistic, to find the observed significance level (p-value).

8
Significance level
  • It is common in hypothesis testing to set the probability of a Type I error, α, to certain values called significance levels; these are usually 0.1, 0.05 and 0.01. If the null hypothesis is true and the probability of observing a value of the test statistic at least as extreme as the current one is lower than the significance level, then the hypothesis is rejected.
  • Consider an example. Let us say we have a sample from a population with normal distribution N(μ, σ²). We want to test the following null-hypothesis against the alternative hypothesis:
  • H0: μ = μ0 and H1: μ < μ0
  • This is a left one-sided hypothesis. Because all parameters of the distribution (the mean and variance of the normal distribution) have been specified, it is a simple hypothesis. A natural test statistic for this case is the sample mean. We know that the sample mean has a normal distribution; under the null-hypothesis its mean is μ0 and its standard deviation is σ/√n. Then we can write:
  • P(x̄ ≤ cu | H0) = P(Z ≤ (cu - μ0)/(σ/√n)) = α, where Z = (x̄ - μ0)/(σ/√n)
  • Since Z has a standard normal distribution (mean equal to 0 and variance equal to 1), we can solve this equation using tables of the standard normal distribution.

9
Significance level Cont.
  • Let us define zα by the equation (solved using standard tables or programs):
  • P(Z ≤ -zα) = α
  • Having found zα, we can solve the equation with respect to cu:
  • cu = μ0 - zα σ/√n
  • If the sample mean is less than this value of cu, we reject at significance level α; if the sample mean is greater than this value, we do not reject the null-hypothesis. If we reject (the sample mean is smaller than cu), then we are saying: if the population mean were equal to μ0, the probability of observing a sample mean this small would be α.
  • To find the power of the test we need to find the corresponding probability under the condition that the alternative hypothesis is true.

10
Significance level An example.
  • Let us look at the example from the beginning of the lecture and consider the differences between A and B. The sample size is 10 and the sample mean is -0.41. We assume that this sample comes from a population with normal distribution with standard deviation 0.39. We want to test the following hypothesis:
  • H0: μ = 0 against H1: μ < 0
  • We have μ0 = 0. Let us set the significance level to 0.05. Then from the table we find z0.05 = 1.645, and we can find cu:
  • cu = μ0 - z0.05 × 0.39/√10 = 0 - 1.645 × 0.39/3.16 ≈ -0.2
  • Since the value of the sample mean (-0.41) belongs to the critical region (i.e. it is less than -0.2), we reject the null-hypothesis at significance level 0.05 (as well as at the level 0.01).
  • Note that we could have used an R function to get the same result:
  • qnorm(0.05, sd=0.39/sqrt(10))
  • The test we performed was a left one-sided test, i.e. we wanted to know whether the population mean is less than the assumed value (0). Similarly we can build right one-sided tests, and by combining the two we can build two-sided tests. A right-sided test looks like:
  • H0: μ = μ0 against H1: μ > μ0
  • Then the critical region consists of the interval [cl, ∞), where cl is the lower bound of the critical region.
  • A two-sided test looks like:
  • H0: μ = μ0 against H1: μ ≠ μ0
  • Then the critical region consists of the union of two intervals, (-∞, cu] and [cl, ∞).
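The critical-value calculation above can be reproduced outside R. A minimal Python sketch, using only the standard library (`phi` and `norm_quantile` are small helpers defined here, with bisection standing in for R's qnorm):

```python
import math

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_quantile(p, lo=-10.0, hi=10.0):
    """Invert phi by bisection (no scipy needed)."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

mu0, sigma, n = 0.0, 0.39, 10   # null mean, assumed known sd, sample size
alpha = 0.05
xbar = -0.41                    # observed sample mean

z_alpha = norm_quantile(1 - alpha)         # about 1.645
cu = mu0 - z_alpha * sigma / math.sqrt(n)  # upper end of the critical region

print(round(z_alpha, 3), round(cu, 2))                    # 1.645 -0.2
print("reject H0" if xbar < cu else "do not reject H0")   # reject H0
```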

11
Power of a test
  • The power of a test depends on the alternative hypothesis.
  • To see the power of the current test, let us use the normal distribution. The null hypothesis is that the true mean equals 0. Let the alternative hypothesis be that the true mean equals the observed difference, -0.41. Again we assume that the distribution is normal with known sd 0.39; the sample size is 10 and the significance level is 0.05, as in the previous case. We found that the upper bound of the critical region is -0.2. To find the power of the test we need to find P(x̄ < -0.2 | μ1 = -0.41, σ = 0.39). We can use an R function to calculate this:
  • pnorm(-0.2, mean=-0.41, sd=0.39/sqrt(10))
  • We divided the standard deviation by the square root of 10 because of the sample size. This test is very powerful: the power is about 0.96.
  • Again, if we do not know the standard deviation of the population then the t distribution is more appropriate (this is implemented in the command power.t.test).
  • Note that if the distribution of the population (and therefore of each sample point and of the sample statistics) is known, then the power of the test for a given difference can be calculated even before a sample is available. The power of the test depends on the alternative hypothesis; in the case of the mean of a normal distribution it is expressed through the difference between the alternative and null means (μ1 - μ0, where μ0 is the mean under the null and μ1 under the alternative hypothesis), the sample size and the standard deviation.
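The same power calculation can be sketched in Python with only the standard library (`phi`, the standard normal CDF built from math.erf, plays the role of R's pnorm):

```python
import math

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

mu1, sigma, n = -0.41, 0.39, 10   # alternative mean, known sd, sample size
cu = -0.2                         # critical value found for alpha = 0.05

# Power = P(xbar < cu | mu = mu1), i.e. pnorm(-0.2, mean=-0.41, sd=0.39/sqrt(10))
power = phi((cu - mu1) / (sigma / math.sqrt(n)))
print(round(power, 2))  # 0.96
```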

12
Power of a test
  • The power of a test can be used before as well as after the experimental data have been collected. Before the experiment it is used to find the sample size needed to detect a given effect; this can be part of the design of an experiment.
  • After the experiment, it uses the sample size, the effect (e.g. the observed difference between means) and the standard deviation to calculate the power of the test.
  • For example, if we want to detect a difference between means equal to 1 (delta) in a paired design, with power 0.8 at significance level 0.05 in a one-sided test, then we need around 8 observations.
  • This is done in R using the command power.t.test(delta=1, sd=1, power=0.8, type='paired', alt='one.sided')
  • The result of the R function:
  • Paired t test power calculation
  • n = 7.7276
  • delta = 1
  • sd = 1
  • sig.level = 0.05
  • power = 0.8
  • alternative = one.sided
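The shape of this sample-size calculation can be seen from the one-sided normal approximation n ≈ ((zα + zβ)·sd/δ)²; power.t.test then refines the answer using the t distribution, which is what pushes it up to the 7.73 shown above. A Python sketch of the normal-approximation step only (`phi` and `norm_quantile` are helpers defined here, not library calls):

```python
import math

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_quantile(p, lo=-10.0, hi=10.0):
    """Invert phi by bisection."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

delta, sd = 1.0, 1.0
alpha, power = 0.05, 0.8

# One-sided normal approximation: n >= ((z_alpha + z_beta) * sd / delta)^2
z_alpha = norm_quantile(1 - alpha)   # about 1.645
z_beta = norm_quantile(power)        # about 0.842
n_approx = ((z_alpha + z_beta) * sd / delta) ** 2
print(round(n_approx, 1))  # 6.2
```

The t-based answer is always somewhat larger than this normal approximation, because the sample standard deviation adds uncertainty.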

13
Critical regions and power
  • The table shows schematically the relation between the relevant probabilities under the null and the alternative hypothesis:

                   H0 true              H1 true
    Accept H0      1 - α (correct)      β (Type II error)
    Reject H0      α (Type I error)     1 - β (power)

14
Composite hypothesis
  • In the above example we assumed that the population variance is known; it was a simple hypothesis (all parameters of the normal distribution were specified). But in real life it is unusual to know the population variance. If the population variance is not known, the hypothesis becomes composite (it specifies the population mean but the population variance is unknown). In this case the variance is calculated from the sample and replaces the population variance. Then, instead of the normal distribution, the t distribution with n-1 degrees of freedom is used, and the value of zα is found from the table of the t(n-1) distribution. If n is large (n > 100) then, as can be expected, the normal distribution approximates the t distribution very well.
  • The above example is easily extended to testing the difference between the means of two samples. If we have two samples from populations with equal but unknown variances, then the test for the difference between the two means uses the t distribution with n1 + n2 - 2 degrees of freedom, where n1 is the size of the first sample and n2 the size of the second.
  • If the variances of both populations were known, then the test statistic for the difference between the two means would have a normal distribution.
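As a sketch of the composite case, the one-sample t statistic for the shoe-wear differences can be computed directly with the Python standard library (obtaining the p-value would additionally require the t CDF, which R's t.test supplies):

```python
import math
import statistics

# Shoe-wear differences (A minus B); H0: mu = 0
d = [-0.8, -0.6, -0.3, 0.1, -1.1, 0.2, -0.3, -0.5, -0.5, -0.3]
mu0 = 0.0

# The sample standard deviation replaces the unknown population value,
# so the statistic is compared against t with n - 1 = 9 degrees of freedom.
t = (statistics.mean(d) - mu0) / (statistics.stdev(d) / math.sqrt(len(d)))
print(round(t, 2))  # -3.35
```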

15
P-value of the test
  • Usually, instead of setting a pre-defined significance level, the observed p-value is reported; it is also called the observed significance level. Let us analyse it using the example above, where we had a sample of size 10 with sample mean -0.41 and assumed a known population standard deviation of 0.39. The p-value is calculated as follows:
  • p = P(x̄ ≤ -0.41 | μ = 0) = P(Z ≤ -0.41/(0.39/√10)) ≈ 0.0004
  • We would reject the null-hypothesis at significance level 0.05, 0.01, etc. If the population mean were 0 with standard deviation 0.39, then observing -0.41 or less would have probability 0.0004. In other words, if we drew a sample of size 10 ten thousand times, then around four times we would observe a mean value less than or equal to -0.41.
  • In this example we assumed that we know the variance of the population; that is why we used the normal distribution. If we do not know the variance we should use the t distribution (with 10 - 1 = 9 degrees of freedom), and the p-value becomes 0.004.
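The known-variance p-value can be reproduced with a short Python sketch (standard library only; `phi` is a helper defined here, playing the role of R's pnorm):

```python
import math

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# P-value for H0: mu = 0 vs H1: mu < 0 with known sigma,
# the same quantity as pnorm(-0.41, sd=0.39/sqrt(10)) in R.
xbar, mu0, sigma, n = -0.41, 0.0, 0.39, 10

p_value = phi((xbar - mu0) / (sigma / math.sqrt(n)))
print(round(p_value, 4))  # 0.0004
```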

16
Likelihood ratio test
  • The likelihood ratio test is one of the techniques for constructing test statistics. Let us assume that we have a sample of size n, x = (x1, ..., xn), and we want to estimate a parameter vector θ = (θ1, θ2); both θ1 and θ2 can themselves be vectors. We want to test the null-hypothesis against the alternative:
  • H0: θ1 = θ10 against H1: θ1 ≠ θ10 (θ2 unspecified under both)
  • Let us assume that the likelihood function is L(x; θ). Then the likelihood ratio test works as follows: 1) maximise the likelihood function under the null-hypothesis (i.e. fix the parameter(s) θ1 equal to θ10) and find the value of the likelihood at the maximum; 2) maximise the likelihood under the alternative hypothesis (i.e. unconditional maximisation) and find the value of the likelihood at the maximum; then form the ratio:
  • w = max L(x; θ10, θ2) / max L(x; θ1, θ2)
  • w is the likelihood ratio statistic. Tests carried out using this statistic are called likelihood ratio tests. In this case it is clear that 0 ≤ w ≤ 1.
  • If the value of w is small then the null-hypothesis is rejected. If g(w) is the density of the distribution of w, then the critical region can be calculated from P(w ≤ wα | H0) = ∫[0, wα] g(w) dw = α.
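As an illustration: for a normal mean with known σ, the likelihood ratio collapses to a simple formula, -2·log w = n(x̄ - μ0)²/σ², which under H0 is chi-square with 1 degree of freedom (here exactly, since it is the square of a standard normal; in general Wilks' theorem gives this only asymptotically). A Python sketch using the shoe-wear numbers:

```python
import math

# Likelihood ratio test for H0: mu = 0, normal model with known sigma.
# For this case -2*log(w) = n * (xbar - mu0)^2 / sigma^2, chi-square(1) under H0.
xbar, mu0, sigma, n = -0.41, 0.0, 0.39, 10

stat = n * (xbar - mu0) ** 2 / sigma ** 2

# Chi-square(1) survival function via erf: P(chi2_1 > x) = 1 - erf(sqrt(x/2))
p_value = 1.0 - math.erf(math.sqrt(stat / 2.0))
print(round(stat, 2), round(p_value, 4))  # 11.05 0.0009
```

The resulting p-value (about 0.0009) is the two-sided analogue of the one-sided 0.0004 found earlier, as expected.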

17
Hypothesis testing vs intervals
  • Some modern authors in statistics consider significance testing an overused procedure: it does not convey much once we have observed the sample, and it is then much better to work with confidence intervals. Since we can calculate a statistic related to the parameter we want to estimate, we can infer where the true value of the parameter may lie.
  • R commands produce predefined confidence intervals as well as p-values. Usually p-values are used in rejecting or not rejecting a hypothesis.

18
R commands for tests
  • t.test - one-sample, two-sample and paired t-tests
  • var.test - test for equality of variances
  • power.t.test - calculate the power (or required sample size) of a t-test
  • Some other tests (these are non-parametric):
  • wilcox.test - test for differences in location (works for one-sample, two-sample and paired cases)
  • ks.test - Kolmogorov-Smirnov test for equality of distributions

19
Further reading
  • Full exposition of hypothesis testing and other
    statistical tests can be found in
  • Stuart, A., Ord, J.K. and Arnold, S. (1991) Kendall's Advanced Theory of Statistics, Volume 2A: Classical Inference and the Linear Model. Arnold, London.
  • Box, G.E.P., Hunter, W.G. and Hunter, J.S. (1978) Statistics for Experimenters. Wiley.
  • Dalgaard, P. (2008) Introductory Statistics with R. Springer.