STATISTICAL ANALYSIS. - PowerPoint PPT Presentation

About This Presentation

Description:

Median: midpoint with an equal half above and below (ordinal, interval and ratio) ... Sometimes statisticians use what is called 'ordinal' data. ...

Slides: 16

Provided by: off689

Transcript and Presenter's Notes

Title: STATISTICAL ANALYSIS.

1
STATISTICAL ANALYSIS.

Your introduction to statistics should not be
like drinking water from a fire hose!!

2
What do you mean by data??
3
Statistics 101!!

Statistics
Measures of location: mean vs. median, and why
Measures of scale: range, interquartile range,
standard deviation (and variance)
Measures of position: percentiles, deciles,
quartiles, median
Note. For categorical variables, we use
proportions as the descriptive statistics.
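
The measures listed above can be sketched in a few lines of Python using only the standard library. The sample values and the `pets` list are made up for illustration (they are not from the presentation), and the quartiles use the `statistics.quantiles` default method, which is one of several conventions.

```python
import statistics

# Hypothetical sample with one large outlier (illustrative values only).
data = [2, 3, 3, 4, 5, 6, 40]

# Measures of location
mean = statistics.mean(data)      # pulled upward by the outlier
median = statistics.median(data)  # middle value of the sorted data

# Measures of scale
data_range = max(data) - min(data)
q1, q2, q3 = statistics.quantiles(data, n=4)  # quartile cut points
iqr = q3 - q1                                 # interquartile range
std_dev = statistics.stdev(data)              # sample standard deviation
variance = statistics.variance(data)          # sample variance

# For a categorical variable, a proportion is the descriptive statistic.
pets = ["dog", "cat", "dog", "none", "dog"]   # hypothetical categories
dog_proportion = pets.count("dog") / len(pets)

print("mean:", mean, "median:", median)
print("range:", data_range, "IQR:", iqr)
print("std dev:", round(std_dev, 2), "variance:", round(variance, 2))
print("proportion of dog owners:", dog_proportion)
```

With this sample the mean (9) sits well above the median (4) because of the single outlier, which is the usual reason for preferring the median as a measure of location for skewed data.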

4
Why does lack of normality cause problems?

When we calculate the p-value for an inference
test, we find the probability of seeing a sample
at least as different as ours through sampling
variability alone, assuming the null hypothesis
is true.
Basically, we are trying to see if a recorded
value occurred by chance and chance alone. When
we look for a p-value with a parametric test, we
are assuming that the means of all samples of the
given sample size are normally distributed around
the population mean. This is why the test
statistic, which is the number of standard errors
(standard deviations of the sample mean) that the
sample mean lies from the population mean, can be
used. Therefore, without normality, the p-value
from such a test cannot be trusted.
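
As a rough illustration of how a test statistic is turned into a p-value under the normality assumption, the sketch below computes an upper-tail probability from the standard normal distribution. The function name `one_sided_p_value` is a hypothetical helper, and using the normal distribution (rather than the t-distribution) is a simplifying assumption that is reasonable only for large samples.

```python
from statistics import NormalDist

def one_sided_p_value(test_statistic):
    """Upper-tail probability P(Z >= test_statistic) for a standard normal Z.

    Assumes the test statistic is normally distributed under the null
    hypothesis, which is exactly the assumption that fails without normality.
    """
    return 1.0 - NormalDist().cdf(test_statistic)

for z in (0.0, 1.645, 3.0):
    print(f"z = {z}: p = {one_sided_p_value(z):.4f}")
```

A statistic of 0 gives a p-value of 0.5, and the p-value shrinks as the sample mean moves further from the hypothesized mean in standard-error units.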

5
There are non-parametric tests which are similar
to the parametric tests. The following table
shows how some of the tests match up.
6
What is different about Non-Parametric Statistics?

Sometimes statisticians use what is called
ordinal data. This data is obtained by taking
the raw data and giving each observation a rank.
These ranks are then used to create test
statistics.
In non-parametric statistics, one deals with the
median rather than the mean. Since a mean can be
easily influenced by outliers or skewness, and we
are not assuming normality, a mean no longer
makes sense. The median is another measure of
location, which makes more sense in a
non-parametric test. The median is considered
the center of a distribution.
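
A minimal sketch of the ranking step, and of why the median is preferred when outliers are present. The `rank` helper and the `raw` values are hypothetical; giving tied values the average of their ranks is a common convention, but it is an assumption here and not something the slides specify.

```python
import statistics

def rank(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend 'end' over a run of tied values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    return ranks

# Hypothetical raw measurements with one extreme outlier.
raw = [12.1, 9.8, 250.0, 10.4, 9.8]

print("ranks: ", rank(raw))              # the outlier just becomes rank 5
print("mean:  ", statistics.mean(raw))   # dragged far upward by 250.0
print("median:", statistics.median(raw)) # stays near the bulk of the data
```

Once the data are ranked, the outlier is simply the largest rank, so it cannot dominate a rank-based test statistic the way it dominates the mean.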

7
Drawing a histogram: the good, the bad and the
downright ugly!!
Many modern introductory texts confuse
frequency graphs, relative frequency graphs, and
histograms.
[Figures: a "Bad" histogram and a "Good" histogram]
8
What's the difference between a bar chart and a
histogram??
9
Critical Values

For a given number of degrees of freedom, by the
property of the t-distribution, we know how large
the t-statistic must be in order to reject the
null hypothesis at a chosen significance level.
We call that number the critical value of the
t-statistic; it is typically looked up in a
table of the t-distribution.
If the value of the t-statistic calculated from
the data is greater than this critical value,
then we reject the null hypothesis.
- This is because, for t-statistics greater than
this critical value, our probability of falsely
rejecting the null hypothesis is very small (no
more than the chosen significance level).

10
Example

Suppose our null hypothesis is that the mean of X
is no greater than 0.
The sample mean is 3.
The sample standard deviation is 2.
There are 121 observations.
Step 1. We need to establish our critical
value.
We wish to reject the null hypothesis if we are
95% certain that it is false. For 121
observations and a one-tailed test, the
critical value is 1.66 (which we look up in the
table; this corresponds to a significance level
of .05 with 120 degrees of freedom).
Step 2. Compute the t-statistic:
$t = (3 - 0) / (2 / \sqrt{121}) = 3 / 0.182 \approx 16.5$
Step 3. Compare the t-statistic with the critical
value. If the t-statistic is greater than the
critical value, then you can reject the null
hypothesis.
In this case, 16.5 is greater than 1.66, so we
can reject the null hypothesis that the mean of X
is no greater than zero.
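
The three steps of this example can be checked with a short sketch. The function name `t_statistic` is a hypothetical helper; the critical value 1.66 is taken from the slide as given and is not computed here.

```python
import math

def t_statistic(sample_mean, null_value, sample_sd, n):
    """One-sample t-statistic: distance from the null value in standard errors."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - null_value) / standard_error

# Step 1: critical value from the slide (one-tailed, .05 level, 120 d.f.).
critical_value = 1.66

# Step 2: t-statistic for sample mean 3, null value 0, s.d. 2, n = 121.
t = t_statistic(sample_mean=3, null_value=0, sample_sd=2, n=121)

# Step 3: reject the null hypothesis if t exceeds the critical value.
reject_null = t > critical_value

print(f"t = {t:.1f}, critical value = {critical_value}, reject: {reject_null}")
```

Since the standard error is 2 / 11, the statistic works out to 16.5, far above the critical value.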

11
Example

The table to the right is a sample cross-tab.
Your research hypothesis is that dog ownership
and gender are related.
How do you test this hypothesis?

12
Hypothesis Tests about tables

Step 1. Define null and research hypotheses.
The null hypothesis will usually be that there
is no relationship between the rows and the
columns.
Step 2. Determine your tolerance for falsely
rejecting the null hypothesis of no relationship.
Step 3. Empirically analyse the data to determine
if there is a relationship.

13
Example

To calculate independence
1) Identify the number of respondents in each
internal cell of the table
2) Calculate the number of respondents who would
be in each cell if independent (corresponds to
the second number under each total)
e.g. $\text{cell}_{1,1} = 0.5 \times 0.15 \times 1000 = 75$
$\text{cell}_{1,2} = 0.5 \times 0.85 \times 1000 = 425$
3) Compute the chi-squared test statistic (next
slide)
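
Step 2 can be sketched as follows. The row and column totals below are an assumption: they are inferred from the shares in the worked cells (.5, .15 and .85 of 1000 respondents), since the original table is not reproduced in this transcript.

```python
# Assumed marginal totals, inferred from the shares .5, .15, .85 and n = 1000.
row_totals = [500, 500]   # two rows, each half of the respondents
col_totals = [150, 850]   # 15% and 85% of the respondents
n = sum(row_totals)

# Under independence: expected count = row total * column total / n,
# which equals (row share) * (column share) * n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print(row)
# [75.0, 425.0]
# [75.0, 425.0]
```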

14
The Chi-Square Test Statistic

To calculate independence
3) Compute the chi-squared test statistic
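The chi-squared test statistic is simply
$$\chi^2 = \sum_{\text{rows}} \sum_{\text{columns}} \frac{(\text{Observed}_{\text{row},\text{column}} - \text{Expected}_{\text{row},\text{column}})^2}{\text{Expected}_{\text{row},\text{column}}}$$
The chi-squared statistic follows a chi-squared
distribution with degrees of freedom
$(\text{rows} - 1) \times (\text{columns} - 1)$.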
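
The statistic and its degrees of freedom translate almost directly into code. This is a minimal sketch: the function names are hypothetical, and the small table used for the sanity check is made up so that observed and expected counts coincide.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over every cell of the table."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for obs, exp in zip(obs_row, exp_row):
            total += (obs - exp) ** 2 / exp
    return total

def degrees_of_freedom(n_rows, n_cols):
    """(rows - 1) * (columns - 1) for a cross-tab."""
    return (n_rows - 1) * (n_cols - 1)

# Sanity check with a made-up table where observed equals expected:
# perfectly independent data should give a statistic of zero.
table = [[10, 20], [30, 60]]
print(chi_squared_statistic(table, table))  # 0.0
print(degrees_of_freedom(2, 2))             # 1
```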

15
Example

If we look at our table of the $\chi^2$ distribution
with 1 degree of freedom, the critical value for
our test statistic is 3.84.
$$\chi^2 = \frac{(100 - 75)^2}{75} + \frac{(400 - 425)^2}{425} + \frac{(50 - 75)^2}{75} + \frac{(450 - 425)^2}{425} \approx 19.6$$
In this case, we reject the null hypothesis that
dog ownership and gender are statistically
independent because our test statistic (19.6) is
greater than our critical value (3.84).
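
The arithmetic on this slide can be reproduced term by term. The observed and expected counts and the critical value 3.84 are the ones shown above; the variable names are illustrative.

```python
# (observed, expected) pairs for the four cells, as given on the slide.
cells = [(100, 75), (400, 425), (50, 75), (450, 425)]

# Chi-squared statistic: sum of (observed - expected)^2 / expected.
chi2 = sum((obs - exp) ** 2 / exp for obs, exp in cells)

critical_value = 3.84  # from the slide: chi-squared table, 1 degree of freedom
reject_null = chi2 > critical_value

print(f"chi-squared = {chi2:.1f}, reject independence: {reject_null}")
```

The two cells with an expected count of 75 contribute most of the statistic (about 8.3 each), because the same deviation of 25 is large relative to a small expected count.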