ARCH 2126/6126


Slides: 28
Provided by: anu9

Transcript and Presenter's Notes


1
ARCH 2126/6126
  • Session 7: Hypothesis-testing with numbers

2
Sequence of topics so far
  • Defining variables and values
  • Measuring and recording
  • Visualizing and summarizing data
  • Measures of central tendency
  • Measures of dispersion
  • In fact, descriptive statistics
  • Sampling
  • Probability

3
Descriptive statistics already begin the search
for pattern
  • More explicitly theoretical approaches extend
    that search
  • A hypothesis is a proposition put forward to
    explain observed facts
  • To be useful in empirical studies, a hypothesis
    must be testable
  • Testable means open to disproof
  • A theory is a set of hypotheses which have been
    tested and not disproved

4
M. R. Dawkins
  • Statistical description is the quest for
    patterns or rules which permit reduction in the
    quantity of data without undue loss of
    information. Some reduction is essential if the
    description is to be useful. One cannot publish
    one's field notebook. Clearly some data must be
    thrown away. (1974)

5
Inferential statistics
  • Depend on a notion of probability
  • Sometimes we may have a model where all outcomes
    are equally likely, e.g. dice
  • Even simple models have non-obvious outcomes when
    put together, e.g. 2 dice
  • Artificial examples, e.g. dice, are used because
    they are simple to understand, but there are
    real-world counterparts, e.g. sex ratio
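
The two-dice case can be checked by listing every outcome. The following is a minimal Python sketch, not part of the original slides; the function name `two_dice_distribution` is invented for illustration. Each die alone gives six equally likely faces, yet the totals of two dice are not equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_distribution(sides: int = 6) -> dict[int, Fraction]:
    """Exact probability of each total when two fair dice are rolled."""
    faces = range(1, sides + 1)
    # Count how many of the sides*sides equally likely pairs give each total.
    counts = Counter(a + b for a, b in product(faces, repeat=2))
    total_outcomes = sides * sides
    return {total: Fraction(n, total_outcomes) for total, n in sorted(counts.items())}


if __name__ == "__main__":
    # Text histogram: one '#' per outcome (out of 36) that gives the total.
    for total, prob in two_dice_distribution().items():
        print(f"{total:2d}  {str(prob):>5}  {'#' * int(prob * 36)}")
```

The printed histogram peaks in the middle and tails off on both sides, which is the pattern the next slide asks about.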

6
More complicated distributions
  • What kind of pattern emerges in dice-rolling?
  • Resemblance to the normal distribution?
  • Any resemblance not coincidental
  • Many real-world outcomes which result from
    multiple independent processes approximate this
    normal distribution

7
Distributions and models
  • Different variables may have different
    distributions
  • Different models may lead us to expect different
    distributions
  • Thus when testing the fit of a model to data, or
    when testing the resemblance of one data-set to
    another, our test will, where possible, take
    distributions into account, and may make
    assumptions about them
  • On many models, not all outcomes are equally
    likely, even with random sampling

8
A sample statistic estimates a population
parameter
  • E.g. a percentage or a mean
  • Can we get an idea how minor or major the error
    in estimation can be?
  • An idea/model of distribution can help
  • There are different ways in different cases to
    estimate confidence intervals

9
Confidence intervals
  • Confidence intervals (or error ranges) are ranges
    around the sample statistic
  • They have defined probabilities of covering the
    population parameter
  • The higher the confidence the user requires of
    covering the population parameter, the wider the
    confidence interval will be

10
In the case of a mean ...
  • We are helped particularly by the finding that
    the means of all possible re-samplings from a
    distribution have a normal distribution
    themselves
  • Central limit theorem
  • Using this, it can be shown that for large
    samples this distribution has an SD of s/√n
    (the standard error of the mean)

11
Confidence intervals for population mean
  • From SE of mean we can get confidence intervals
    for mean
  • 95% CI is given, in large samples, by mean ±
    (SE × 1.96)
  • 99% CI given by mean ± (SE × 2.58)
  • 99.9% CI given by mean ± (SE × 3.29)
  • These state the level of confidence with which we
    can say that the population mean lies within the
    stated limits
  • Conditions apply: SRS (simple random sampling)
    and large n
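
The multipliers 1.96, 2.58 and 3.29 are two-sided cut-offs of the standard normal distribution. The sketch below is an illustration added here, not from the slides; the helper name `z_multiplier` is invented. It recovers them with Python's standard library.

```python
from statistics import NormalDist


def z_multiplier(confidence: float) -> float:
    """Two-sided standard-normal multiplier for a confidence level such as 0.95."""
    tail = (1.0 - confidence) / 2.0          # probability left in each tail
    return NormalDist().inv_cdf(1.0 - tail)  # quantile that cuts off the upper tail


if __name__ == "__main__":
    for level in (0.95, 0.99, 0.999):
        print(f"{level:.1%} CI: mean ± SE × {z_multiplier(level):.2f}")
```

Asking for more confidence leaves less probability in the tails, so the multiplier grows and the interval widens.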

12
(No Transcript)
13
Central limit theorem
  • Special batch consists of the means of all
    possible samples of a given size that could be
    drawn from a given population
  • Mean of special batch = mean of population
  • SD of special batch = SD of population / square
    root of sample size
  • Special batch has normal distribution, for large
    sample size
  • Large in this case means over about 30
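
These properties can be checked by simulation. The sketch below is an added illustration under stated assumptions: the population is invented and deliberately skewed, the function name is hypothetical, and a few thousand random samples stand in for "all possible samples".

```python
import random
import statistics


def simulate_sample_means(population, sample_size, n_samples, seed=0):
    """Draw many random samples (with replacement) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]


if __name__ == "__main__":
    rng = random.Random(42)
    # Invented, skewed (exponential) population: not real data.
    population = [rng.expovariate(1 / 50) for _ in range(10_000)]
    n = 36  # "large" in the sense of the slide: over about 30
    means = simulate_sample_means(population, sample_size=n, n_samples=5_000)

    print("population mean         :", round(statistics.fmean(population), 2))
    print("mean of sample means    :", round(statistics.fmean(means), 2))
    print("population SD / sqrt(n) :", round(statistics.pstdev(population) / n ** 0.5, 2))
    print("SD of sample means      :", round(statistics.pstdev(means), 2))
```

The first two printed values should be close to each other, as should the last two. A histogram of `means` should look roughly normal even though the population is not.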

14
So let's use Excel to calculate a standard error
of a mean
  • Enter an invented data set: 20 numbers
  • Calculate mean and SD as before
  • Take square root of N
  • Divide SD by √N to get SE
  • Multiply SE by 1.96
  • Add that number to mean to get upper limit;
    subtract it to get lower limit
  • From upper to lower limit is the 95% confidence
    interval: for large N, the chance that the true
    population mean lies in between those numbers is
    95%
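
The same steps can be sketched in Python for readers without Excel. This is an illustration only: the function name and the 20 values are invented, and the sample SD (n - 1 divisor, as in Excel's STDEV) is assumed because the slides do not say which SD was used "before".

```python
import math
import statistics


def mean_confidence_interval(values, z=1.96):
    """Return (mean, standard error, lower limit, upper limit)."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)   # sample SD (n - 1 divisor); an assumption
    se = sd / math.sqrt(n)          # divide SD by the square root of N
    margin = z * se                 # multiply SE by 1.96 for a 95% interval
    return mean, se, mean - margin, mean + margin


if __name__ == "__main__":
    # Invented data set of 20 numbers (e.g. scraper lengths in mm).
    lengths = [52, 47, 61, 55, 49, 58, 63, 45, 50, 57,
               54, 60, 48, 53, 56, 51, 59, 46, 62, 54]
    mean, se, lower, upper = mean_confidence_interval(lengths)
    print(f"mean = {mean:.2f}, SE = {se:.2f}")
    print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

With only 20 values the 1.96 multiplier is an approximation. For small samples a multiplier from the t distribution would give a somewhat wider interval.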

15
So what kinds of hypothesis can we test?
  • A common kind of hypothesis posits average
    differences between groups, e.g. lengths of
    scrapers, statures of people
  • These might be important because they might
    suggest different artefact traditions, different
    nutrition/health conditions etc.
  • Will use these as examples but many of the
    points are more general
  • E.g. we might have a hypothesis that longer
    scrapers are broader, or made in a different stone

16
Types of hypothesis
  • In inferential statistics, we start from a base
    you may find counter-intuitive
  • We begin with the simplest possibility (the
    so-called Occam's razor)
  • This may be that, for example, two samples are
    similar enough to have been drawn from the same
    population
  • Thus we have the null hypothesis or H0: the
    proposition of no difference, no effect or
    no association

17
Testing the null hypothesis
  • If the null hypothesis can be disproved, then the
    (more interesting) alternative hypothesis H1 -
    that there is an effect, difference, association
    etc. - becomes the simplest available hypothesis
  • We cannot actually prove H1
  • We also cannot show causes - we may be right for
    the wrong reason
  • But we can test hypotheses

18
Imagine, then, two batches of numbers...
  • ... representing, say, lengths of scrapers
    excavated from two sites
  • Are they the same or different?
  • Well, you wouldn't expect them to be exactly the
    same, would you?
  • Why not? Sampling error, even if sampling from
    one population
  • So are they basically the same? Or are they
    different in more than just sampling error?

19
Statistical significance
  • Or as a statistician would say, are the lengths
    significantly different?
  • Significance has become a term of art in
    statistics, has a specific meaning, not
    equivalent to social or biological or
    archaeological significance
  • Refers to low probability that the two batches
    could have been drawn from the same population
    (or 2 populations with same mean)

20
Suppose we do re-sample from one population
  • We will find that sample means vary
  • But in a regular way that depends on sample size
    and population distribution
  • We can invent an artificial population and
    re-sample from it to simulate this
  • Can also show the mathematical properties of this
  • Sample means cluster around the population mean
    with given dispersion

21
So the question is ...
  • Are the two real sample means only as different
    as you would expect by sampling error? Or more
    different?
  • In reality this is a question of probability: how
    likely is it that the difference we observe could
    have arisen by re-sampling from the same
    population?
  • A dramatic difference is obvious by eye but
    often they aren't so dramatic, so we test
    statistically

22
Significance and p-values
  • If it is very unlikely that a difference as large
    as we observe could have been produced by
    re-sampling, we may be inclined to reject the
    null hypothesis and say there is a significant
    difference
  • Convention is that if the probability (p) of
    getting a difference at least this large when the
    null hypothesis is correct is less than (<) 5%
    (0.05), we say the result is significant at the
    5% level
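
One way to make "could have been produced by re-sampling" concrete is a permutation test. This is a technique swapped in here for illustration, not necessarily the test the course goes on to use, and the function name and data are invented. The two batches are pooled and reshuffled many times. The p-value is estimated as the share of shuffles whose difference in means is at least as large as the observed one.

```python
import random
import statistics


def permutation_p_value(batch_a, batch_b, n_shuffles=10_000, seed=0):
    """Estimate a two-sided p-value for the difference between two means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(batch_a) - statistics.fmean(batch_b))
    pooled = list(batch_a) + list(batch_b)
    n_a = len(batch_a)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        # Under H0 the batch labels are arbitrary, so reassign them at random.
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_large += 1
    return at_least_as_large / n_shuffles


if __name__ == "__main__":
    # Invented scraper lengths (mm) from two hypothetical sites.
    site_1 = [52, 47, 61, 55, 49, 58, 63, 45, 50, 57]
    site_2 = [58, 66, 61, 70, 59, 64, 68, 62, 57, 65]
    p = permutation_p_value(site_1, site_2)
    print(f"estimated p = {p:.4f}")
    print("significant at the 5% level" if p < 0.05 else "not significant at the 5% level")
```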

23
More on p-values
  • Significance at the 5% level: p < 0.05
  • Similarly for p < 0.01, p < 0.001 etc.
  • The smaller the number, the stronger the
    rejection of the null hypothesis
  • In many publications you will see results
    asterisked or bolded if they reach the 5% or
    another stated significance level, and ignored,
    omitted or marked N.S. if not

24
The problem of many tests
  • To accept any result where p < 0.05 as
    significant also implies that on about 1/20
    occasions a result will appear significant even
    if H0 is true
  • Psychiatrists seeking a difference between
    schizophrenics and others did 77 tests and found
    significant (p < 0.05) differences in 2 of these
    tests
  • Corrections can be made or we can decide to
    accept only p < 0.01 or less
  • But still beware seeking significance!
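
The arithmetic behind the 77-test example can be sketched as follows. The function names are invented for illustration, the tests are assumed to be independent, and the Bonferroni correction is shown as one common correction that the slide does not name. At the 5% level about 77 × 0.05 = 3.85 "significant" results are expected even if H0 is true throughout, so 2 out of 77 is no more than chance alone would produce.

```python
def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of 'significant' results if H0 is true in every test."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of one or more false positives, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test p-value threshold under the Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    n = 77
    print(f"expected false positives : {expected_false_positives(n):.2f}")
    print(f"P(at least one)          : {prob_at_least_one_false_positive(n):.3f}")
    print(f"Bonferroni threshold     : {bonferroni_threshold(n):.5f}")
```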

25
How likely is our test to mislead us?
  • Risk of rejecting H0 when it is true is known as
    a Type I Error
  • It is given by the significance level (p-value
    threshold) we choose
  • Risk of accepting H0 when it is false is
    known as a Type II Error
  • It is sometimes hard to calculate but clearly
    rises, the lower the significance level we choose
  • Its complement (1 minus the Type II error risk)
    is the power of the test
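
The trade-off between the two kinds of error can be illustrated for one simple case. The sketch below is an added example, not from the slides. It assumes a two-sided large-sample z-test comparing two means, with equal SD and equal group sizes; the function name and all numbers are invented.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def type_ii_error(true_difference, sd, n_per_group, alpha):
    """Approximate Type II error rate of a two-sided z-test on two means."""
    se_diff = sd * math.sqrt(2.0 / n_per_group)      # SE of the difference in means
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)  # rejection cut-off for this alpha
    shift = true_difference / se_diff                # true difference in SE units
    # Power: chance the test statistic lands beyond either cut-off.
    power = (1.0 - _STD_NORMAL.cdf(z_crit - shift)) + _STD_NORMAL.cdf(-z_crit - shift)
    return 1.0 - power


if __name__ == "__main__":
    # Invented scenario: true difference 5 mm, SD 10 mm, 50 scrapers per site.
    for alpha in (0.05, 0.01, 0.001):
        beta = type_ii_error(true_difference=5, sd=10, n_per_group=50, alpha=alpha)
        print(f"alpha = {alpha:<5}  Type II risk = {beta:.2f}  power = {1 - beta:.2f}")
```

Tightening the significance level reduces the Type I risk but, for the same data, raises the Type II risk and lowers the power.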

26
What is so magic about 5%?
  • Nothing
  • It's just the number Sir R.A. Fisher thought of,
    to represent low probability
  • He felt that if that kind of difference only
    occurred by chance once in 20 times, it was rare
    enough to indicate against H0
  • But of course this is arbitrary

27
Classical statistics versus exploratory data
analysis
  • The classical approach to hypothesis testing
    just sketched was developed 1920-1950 by Fisher
    and others
  • Gives a clear yes/no, accept/reject
  • But there is no certainty in statistics
  • To force yes/no from analysis which really says
    "highly likely" or "probably not" is artificial
  • Exploratory data analysis advocates simply
    stating the p-values, whatever they are