STATISTICAL ANALYSIS. - PowerPoint PPT Presentation

About This Presentation
Slides: 16
Provided by: off689

Transcript and Presenter's Notes

Title: STATISTICAL ANALYSIS.


1
STATISTICAL ANALYSIS.
  • Your introduction to statistics should not be
    like drinking water from a fire hose!!

2
What do you mean by data??
3
Statistics 101!!
  • Statistics
  • Measures of location: mean vs. median and why
  • Measures of scale: range, interquartile range,
    standard deviation (and variance)
  • Measures of position: percentiles, deciles,
    quartiles, median
  • Note. For categorical variables, we use
    proportions as the descriptive statistics
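As a small illustration of the measures listed on this slide, the sketch below computes them with Python's standard `statistics` module. The data values and the `pets` list are made up for the example and are not from the presentation.

```python
import statistics

# Hypothetical sample with one outlier (illustrative data, not from the slides).
data = [2, 3, 3, 4, 5, 6, 40]

# Measures of location: the mean is pulled upward by the outlier, the median is not.
mean = statistics.mean(data)
median = statistics.median(data)

# Measures of scale.
data_range = max(data) - min(data)
q1, q2, q3 = statistics.quantiles(data, n=4)  # quartiles; q2 is the median
iqr = q3 - q1                                 # interquartile range
variance = statistics.variance(data)          # sample variance (n - 1 denominator)
std_dev = statistics.stdev(data)              # sample standard deviation

# For a categorical variable, the descriptive statistic is a proportion.
pets = ["dog", "cat", "dog", "none"]          # hypothetical responses
proportion_dog = pets.count("dog") / len(pets)

print(mean, median, data_range, iqr, std_dev, proportion_dog)
```

With this sample the mean (9) sits well above the median (4), which is the usual reason for preferring the median when the data are skewed.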

4
Why does lack of normality cause problems?
  • When we calculate the p-value for an inference
    test, we find the probability of seeing a sample
    at least this different through sampling
    variability alone. Basically, we are trying to
    see if a recorded value occurred by chance and
    chance alone. When we look for a p-value, we are
    assuming that the means of all samples of the
    given sample size are normally distributed around
    the population mean. This is why the test
    statistic, which is the number of standard errors
    the sample mean lies away from the population
    mean, is able to be used. Therefore, without
    normality (or a sample large enough for the
    sample mean to be approximately normal), the
    p-value from such a test cannot be trusted.
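The assumption about sample means can be checked by simulation. The sketch below is only an illustration: the exponential population, the sample size of 50 and the 2000 repetitions are all assumptions chosen for the example. It draws repeated samples from a clearly non-normal population and looks at how the sample means behave.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

def sample_mean(n):
    """Mean of n draws from a skewed population (exponential, population mean 1)."""
    return statistics.mean([random.expovariate(1.0) for _ in range(n)])

sample_size = 50
means = [sample_mean(sample_size) for _ in range(2000)]

center = statistics.mean(means)   # should sit near the population mean, 1.0
spread = statistics.stdev(means)  # should sit near the standard error, 1 / sqrt(50)

# Share of sample means within 1.96 standard errors of the population mean.
# If the normal approximation is reasonable, this is roughly 95%.
standard_error = 1.0 / sample_size ** 0.5
within = sum(abs(m - 1.0) <= 1.96 * standard_error for m in means) / len(means)

print(center, spread, within)
```

Rerunning with a much smaller `sample_size` is a quick way to see the approximation weaken for skewed data.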

5
There are non-parametric tests which are similar
to the parametric tests. The following table
shows how some of the tests match up.
6
What is different about Non-Parametric Statistics?
  • Sometimes statisticians use what is called
    ordinal data. One way to obtain it is to take
    the raw data and give each observation a rank.
    These ranks are then used to create test
    statistics.
  • In non-parametric statistics, one deals with the
    median rather than the mean. Since a mean can be
    easily influenced by outliers or skewness, and we
    are not assuming normality, a mean no longer
    makes sense. The median is another measure of
    location, which makes more sense in a
    non-parametric test. The median is considered
    the center of a distribution.
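The ranking step can be sketched as follows. The helper name `average_ranks` and the raw values are hypothetical. Giving tied values the average of their ranks is a common convention assumed here, not something the slides specify.

```python
import statistics

def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

raw = [12, 7, 7, 150, 9]  # hypothetical raw data with one outlier
ranks = average_ranks(raw)

print(ranks)                                         # the outlier simply gets the top rank
print(statistics.mean(raw), statistics.median(raw))  # mean is dragged up, median is not
```

Once the data are ranked, the outlier of 150 is just "rank 5", so it has no more influence than any other largest value would.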

7
Drawing a histogram: the good, the bad and the
downright ugly!!
Many modern introductory texts confuse
frequency graphs, relative frequency graphs, and
histograms.
Bad
Good
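The figures for this slide did not survive in the transcript, so the sketch below shows one common way of drawing the distinction, which may not be exactly the one the presenter made. A frequency graph plots counts, a relative frequency graph plots proportions, and a histogram plots densities, so that the area of each bar equals the proportion of data in its bin. The data, the bin edges and the helper name `histogram_table` are hypothetical.

```python
def histogram_table(data, edges):
    """For each bin [edges[i], edges[i+1]), return (frequency, relative frequency, density)."""
    n = len(data)
    rows = []
    for lo, hi in zip(edges, edges[1:]):
        freq = sum(lo <= x < hi for x in data)
        rel = freq / n
        density = rel / (hi - lo)  # bar height chosen so that bar AREA equals rel
        rows.append((freq, rel, density))
    return rows

data = [1, 2, 2, 3, 4, 6, 7, 12, 15, 18]  # hypothetical observations
edges = [0, 5, 10, 20]                    # note the last bin is twice as wide
rows = histogram_table(data, edges)

for (lo, hi), (freq, rel, density) in zip(zip(edges, edges[1:]), rows):
    print(f"[{lo}, {hi}): frequency={freq}, relative={rel:.2f}, density={density:.3f}")

# The bar areas of the density version sum to 1.
total_area = sum(d * (hi - lo) for (_, _, d), (lo, hi) in zip(rows, zip(edges, edges[1:])))
```

With unequal bin widths the three versions look different: the wide last bin has more observations than the middle bin but a lower density.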
8
What's the difference between a bar chart and a
histogram??
9
Critical Values
  • For a given number of degrees of freedom, by the
    property of the t-distribution, we know how large
    the t-statistic must be in order to reject the
    null hypothesis at a chosen significance level.
  • We call that number the critical value of the
    t-statistic; it is typically looked up in a
    table of the t-distribution.
  • If the value of the t-statistic calculated from
    the data is greater than this critical value,
    then we reject the null hypothesis.
  • This is because, if the null hypothesis were
    true, a t-statistic greater than this critical
    value would occur only with a small probability
    (the significance level), so our probability of
    falsely rejecting the null hypothesis is very
    small.

10
Example
  • Suppose our null hypothesis is that the mean of X
    is less than or equal to 0.
  • The sample mean is 3.
  • The sample standard deviation is 2.
  • There are 121 observations.
  • Step 1. We need to establish our critical
    value.
  • We wish to reject the null hypothesis if we are
    95% certain that it is false. For 121
    observations and a one-tailed test, the
    critical value is 1.66 (which we look up in the
    table; this corresponds to a significance level
    of .05 with 120 degrees of freedom).
  • Step 2. The t-statistic is
    $t = (3 - 0) / (2 / \sqrt{121}) = 3 / 0.182 \approx 16.5$.
  • Step 3. Compare the t-statistic with the critical
    value. If the t-statistic is greater than the
    critical value, then you can reject the null
    hypothesis.
  • In this case, 16.5 is greater than 1.66, so we
    can reject the null hypothesis that the mean of X
    is less than or equal to zero.
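The three steps can be sketched in a few lines of Python. The function name `t_statistic` is hypothetical, and the critical value of 1.66 is taken from the slide's table lookup rather than computed. Carried out without rounding the standard error, the arithmetic gives a t-statistic of 16.5.

```python
import math

def t_statistic(sample_mean, null_mean, sample_sd, n):
    """Number of standard errors the sample mean lies from the hypothesized mean."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - null_mean) / standard_error

# Step 1: critical value for a one-tailed test at the .05 level with
# 120 degrees of freedom, as read from the t-table on the slide.
critical_value = 1.66

# Step 2: t-statistic from the figures in the example.
t = t_statistic(sample_mean=3, null_mean=0, sample_sd=2, n=121)

# Step 3: reject the null hypothesis if t exceeds the critical value.
reject_null = t > critical_value

print(round(t, 2), reject_null)
```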

11
Example
  • The table to the right is a sample cross-tab
  • Your research hypothesis is that dog ownership
    and gender are related.
  • How do you test this hypothesis?

12
Hypothesis Tests about tables
  • Step 1. Define null and research hypotheses.
  • The null hypothesis will usually be that there
    is no relationship between the rows and the
    columns.
  • Step 2. Determine your tolerance for falsely
    rejecting the null hypothesis of no relationship.
  • Step 3. Empirically analyse the data to determine
    if there is a relationship.

13
Example
  • To test for independence:
  • 1) Identify the number of respondents in each
    internal cell of the table
  • 2) Calculate the number of respondents who would
    be in each cell if the rows and columns were
    independent (corresponds to the second number
    under each total)
  • e.g. cell(1,1): .5 × .15 × 1000 = 75
  • cell(1,2): .5 × .85 × 1000 = 425
  • 3) Compute the chi-squared test statistic (next
    slide)
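Step 2 can be sketched as a short function. The marginal totals below (500 and 500 one way, 150 and 850 the other) are inferred from the proportions .5, .15 and .85 and the total of 1000 in the example. Which margin is gender and which is dog ownership is an assumption, since the cross-tab itself is not in the transcript. The name `expected_counts` is hypothetical.

```python
def expected_counts(row_totals, col_totals):
    """Expected cell counts under independence: row total * column total / grand total."""
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

row_totals = [500, 500]  # assumed: .5 * 1000 in each row
col_totals = [150, 850]  # assumed: .15 * 1000 and .85 * 1000
expected = expected_counts(row_totals, col_totals)

print(expected)  # first row matches the slide's 75 and 425
```

Multiplying the two marginal proportions by the grand total, as the slide does, is the same calculation as row total times column total divided by the grand total.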

14
The Chi-Square Test Statistic
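  • To test for independence:
  • 3) Compute the chi-squared test statistic
  • The chi-squared test statistic is simply
  • $\chi^2 = \sum_{\text{rows}} \sum_{\text{columns}} \frac{(\text{Observed}_{\text{row},\text{column}} - \text{Expected}_{\text{row},\text{column}})^2}{\text{Expected}_{\text{row},\text{column}}}$
  • Under the null hypothesis, the chi-squared
    statistic approximately follows a chi-squared
    distribution with degrees of freedom
    (rows - 1) × (columns - 1).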

15
Example
  • If we look at our table of the $\chi^2$
    distribution with 1 degree of freedom, the
    critical value for our test statistic is 3.84
    (at the .05 significance level).
  • $\chi^2 = (100 - 75)^2 / 75 + (400 - 425)^2 / 425 + (50 - 75)^2 / 75 + (450 - 425)^2 / 425 \approx 19.6$
  • In this case, we reject the null hypothesis that
    dog ownership and gender are statistically
    independent because our test statistic is greater
    than our critical value.
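The whole calculation can be reproduced from the observed counts alone. In the sketch below, the arrangement of the four observed counts into a 2 × 2 table is an assumption, because the original cross-tab is missing from the transcript. The function name `chi_squared` is hypothetical, and the critical value 3.84 is the one quoted on the slide.

```python
def chi_squared(observed):
    """Chi-squared statistic and degrees of freedom for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected count under independence
            stat += (obs - exp) ** 2 / exp
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, dof

observed = [[100, 400],
            [50, 450]]   # assumed layout of the counts used on the slide
stat, dof = chi_squared(observed)

critical_value = 3.84    # chi-squared table, 1 degree of freedom, .05 level
reject_null = stat > critical_value

print(round(stat, 1), dof, reject_null)
```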