Week 2 An overview presentation

1
Week 2: An overview

Exposure and outcome (dependent and independent
variables)
Reliability and validity
What is statistical significance?
Relationships between variables
-continuous variables (t-tests and z-tests)
-continuous variables (correlations)
-the normal (Gaussian) distribution
-categorical variables (chi-square tests)
Two by two tables and confidence intervals
Review of the articles
Example 1: Children crossing streets
Measures of association between variables
For next week

A somewhat advanced society has figured how to
package basic knowledge in pill form. A student,
needing some learning, goes to the pharmacy and
asks what kind of knowledge pills are available.
The pharmacist says "Here's a pill for English
literature." The student takes the pill and
swallows it and has new knowledge about English
literature!
"What else do you have?" asks the student. "Well,
I have pills for art history, biology, and world
history," replies the pharmacist. The student
asks for these, and swallows them and has new
knowledge about those subjects!
Then the student asks, "Do you have a pill for
statistics?" The pharmacist says "Wait just a
moment," and goes back into the storeroom and
brings back a whopper of a pill that is about
twice the size of a jawbreaker and plunks it on
the counter. "I have to take that huge pill for
statistics?" inquires the student.
The pharmacist understandingly nods his head and
replies "Well, you know statistics always was a
little hard to swallow."

3
Epidemiologic study designs

1. Randomized controlled trial
Considered the gold standard
Exposure is assigned randomly
Participants followed over time to assess outcome
Analytic comparison of risk or benefit in exposed
vs. not exposed
Can be applied to program evaluation

4
Epidemiologic study design 2

2. Cohort study
One group exposed
Other group unexposed
Participants followed over time to assess outcome
Analytic comparison of risk in exposed vs. not
exposed
Can be applied to program evaluation

5
Epidemiologic study designs 3

3. Case-control study
Based on outcome
Exposure is compared in those with and without
outcome
Analytic comparison of risk in exposed vs. not
exposed
4. Descriptive study
Provides descriptive statistics of problem under
study
No analytic comparison of risk / benefit
Often precedes analytic studies

6
Dependent vs independent variables

Remember the exposure/outcome relationship
Another way to describe it is in terms of
dependent and independent variables: the outcome
(dependent variable) depends on the exposure
(independent) variables
It is the association between these variables
that leads us to statistical tests
The test we use depends on the type of variable

7
Statistical significance

What is statistical significance?
The probability that the observed relationship
could have happened by chance
The p-value and confidence interval are the usual
measures of significance
Set by tradition at 0.05 (p-value) or 95%
(confidence interval)
The higher the p value, the more likely it could
have happened by chance
The wider the confidence interval, the more
likely it could have happened by chance
Both driven by variability in the data and sample
size

8
Types of variables

Continuous variables
-variables for which there is a range of
responses
e.g., age, blood pressure, weight
Categorical variables
Variables that fall into categories
e.g., gender, smoking status

9
Hypothesis testing for continuous variables

Mean (the average number)
-calculated by summing all the numbers and
dividing by n
-Hypothesis testing usually done using a t-test
to compare the 2 means
-Significance of t-test based on sample size and
variability within the data
Median (the number in the middle)
-not usually tested
Mode (the most frequent response)
-not usually tested

10
Hypothesis testing for categorical variables

Counts (how many fall within each category)
Compare using 2X2 table
Proportions (what percentage fall within each
category)
Compare 2 proportions
Frequency distributions (comparing counts and
percentages between categories)
Compare using chi-square test
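
As a rough illustration of the chi-square comparison named above, the sketch below computes the Pearson chi-square statistic for a 2X2 table of counts. The function name and the counts are hypothetical, and no continuity correction is applied; a statistical package would normally do this for you.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for a 2X2 table.

    Rows: exposed / not exposed. Columns: outcome / no outcome.
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column total must be non-zero")
    stat = n * (a * d - b * c) ** 2 / denom
    # With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts, not taken from the course articles.
chi2_stat, chi2_p = chi_square_2x2(30, 70, 15, 85)
print(f"chi-square = {chi2_stat:.2f}, p = {chi2_p:.3f}")
```

With these made-up counts the p-value falls below the traditional 0.05 cut-off, so the difference in proportions would be called statistically significant.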

11
2X2 tables: the foundation
              Disease or other outcome   No disease or other outcome
Exposed       a                          b
Not exposed   c                          d
12
2X2 tables: estimating associations
              Disease or other outcome   No disease or other outcome   Total
Exposed       a                          b                             a+b
Not exposed   c                          d                             c+d
Total         a+c                        b+d                           a+b+c+d
13
Odds ratios and relative risks

Odds ratio (ad/bc): the odds of the outcome in
the exposed group relative to the odds in the
non-exposed group
Relative risk ((a/(a+b))/(c/(c+d))): the risk of
the outcome in the exposed group relative to the
non-exposed group
Statistical packages calculate confidence
intervals
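
A minimal sketch of these two calculations, assuming the usual 2X2 layout (a, b = exposed with and without the outcome; c, d = not exposed with and without the outcome). The function name and counts are hypothetical, and the 95% confidence intervals use the common large-sample approximation on the log scale, which is only one of several methods a statistical package might use.

```python
import math

def odds_ratio_and_relative_risk(a, b, c, d, z=1.96):
    """Return (OR, OR 95% CI) and (RR, RR 95% CI) for a 2X2 table."""
    odds_ratio = (a * d) / (b * c)
    relative_risk = (a / (a + b)) / (c / (c + d))

    # Large-sample standard errors of log(OR) and log(RR).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))

    or_ci = (odds_ratio * math.exp(-z * se_log_or),
             odds_ratio * math.exp(z * se_log_or))
    rr_ci = (relative_risk * math.exp(-z * se_log_rr),
             relative_risk * math.exp(z * se_log_rr))
    return (odds_ratio, or_ci), (relative_risk, rr_ci)

# Hypothetical counts, not taken from the course articles.
(or_value, or_ci), (rr_value, rr_ci) = odds_ratio_and_relative_risk(30, 70, 15, 85)
print(f"OR = {or_value:.2f}, 95% CI {or_ci[0]:.2f} to {or_ci[1]:.2f}")
print(f"RR = {rr_value:.2f}, 95% CI {rr_ci[0]:.2f} to {rr_ci[1]:.2f}")
```

In this made-up example neither interval includes 1, so both estimates would be read as statistically significant.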

14
Confidence intervals

Confidence intervals are used for hypothesis
testing in 2X2 tables (and others)
The width of a confidence interval is based on
the variability within the data and the sample
size
An OR or RR of 1 = no association
A confidence interval that crosses 1 is NOT
statistically significant

15
Regression lines and correlation

Correlation is the measure of the way one
variable is associated with another
Can be done with 2 continuous variables
The correlation coefficient ranges from -1 to 1
The regression line is the best fit between 2
variables
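
To make the two ideas concrete, here is a small sketch that computes the Pearson correlation coefficient and the least-squares regression line for two continuous variables. Pearson's r and ordinary least squares are assumed here as the most common choices; the function name and data are hypothetical.

```python
import math

def pearson_r_and_line(x, y):
    """Return (r, slope, intercept) for paired continuous data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    r = sxy / math.sqrt(sxx * syy)       # always between -1 and 1
    slope = sxy / sxx                    # least-squares best fit
    intercept = mean_y - slope * mean_x
    return r, slope, intercept

# Hypothetical paired measurements.
r, slope, intercept = pearson_r_and_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"r = {r:.2f}, line: y = {intercept:.1f} + {slope:.1f}x")
```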

16
Article review

Questions to consider
What is the research question?
What is their study design?
What is the exposure variable(s)?
What is the outcome variable?
What are the strengths and limitations?
Who funded the study?
How compelling are the findings?

Example 1
Statistical associations of the number of streets
crossed by children and
-socio-economic indicators
-child pedestrian injury rate

18
Background

Child pedestrian injury rate has been declining
in many countries, including Canada
Concern has been expressed that the decline is
due to a reduction in exposure to traffic (i.e.,
children are driven or bussed rather than walking)

19
Objective

The objective of this study was to measure the
number of streets children cross on one day
To see if the number of streets crossed varies by
socio-economic status
To see if the child pedestrian injury rate is
associated with the number of streets crossed

20
Variables

Number of streets crossed as reported by parents
from a random sample of schools in Montreal
Socio-economic status measured by
-car ownership
-parental education
-home ownership
Injury rate in police district as reported by the
police

21
Methods

Frequency distribution of average of streets
crossed presented by age and SES
Statistical testing for the differences between
means for categorical variables
Scatterplot generated and regression line
calculated

22
Table 1: Number of Streets Crossed by Age and
Socio-economic Indicators
                   N      Mean   SD
Age
  5-6              487    3.8    4.2
  7                730    4.2    5.0
  8-9              519    4.8    5.3
  10               657    5.5    5.8
  11-12            108    6.6    6.3
Number of cars
  0                467    5.9    5.8
  1                1191   4.8    5.3
  2                815    3.8    4.8
Home ownership
  Rent home        1213   5.5    5.6
  Own home         1210   3.8    4.7
23
Comparing average streets crossed by car ownership
                                 No car   1 car
Average streets crossed (mean)   5.9      4.8
Standard deviation               5.8      5.3
Sample size                      467      1171
Z test for difference between means: 13.8, p < 0.001
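
For readers who want to see the mechanics, the sketch below runs a standard two-sample z-test on the summary figures printed on this slide (means, standard deviations and sample sizes). The function name is hypothetical, and the unpooled standard error is an assumption; the slide does not say how its statistic was computed.

```python
import math

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample z-test from summary statistics (unpooled variances).

    Returns (z, two_sided_p).
    """
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2) / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Summary figures as printed on the slide: no car vs. 1 car.
z_stat, z_p = two_sample_z(5.9, 5.8, 467, 4.8, 5.3, 1171)
print(f"z = {z_stat:.2f}, p = {z_p:.5f}")
```

Under these assumptions the sketch gives a z of about 3.5, which is still significant at p < 0.001 but is not the 13.8 reported on the slide; the slide's statistic may come from a different test or from the full data.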
25
Measures of association between variables

Tied in to the concept of reliability and
validity
Sometimes we need to test a new variable in
relation to an old one
For example, a new questionnaire, faster blood
test, etc.
Several ways to measure association
Cronbach's alpha, kappa, sensitivity,
specificity, positive predictive value, negative
predictive value

26
Cronbach's alpha

Measures the reliability of a psychometric
instrument
Assesses the extent to which a set of test items
can be treated as measuring a single latent
variable
Mean correlation between a set of items with the
mean of all the other items
Looks at variation between individuals compared
to variation due to items
Can be between negative infinity and 1 (although
usually only between 0 and 1)
Usually considered good if > 0.8
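
A minimal sketch of the calculation, assuming the standard variance-based formula: alpha = k/(k-1) x (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the questionnaire scores are hypothetical.

```python
from statistics import variance

def cronbach_alpha(scores):
    """scores: one row per respondent, each row holding k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    items = list(zip(*scores))                      # one column per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical answers from 5 respondents to a 3-item questionnaire.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Here most of the variation in total scores comes from differences between respondents rather than between items, so alpha is high.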

27
Kappa

Measures the extent to which ratings given by 2
raters agree
Often used when experts are assigning scores
based on opinions (e.g., medication errors)
Gives credit when scores match exactly, takes
away agreement when they don't
Usually between 0 and 1 (values below 0 mean
agreement is worse than chance)
Usually considered good if > 0.7
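
The slide does not say which kappa statistic is meant; the sketch below assumes Cohen's kappa for two raters, which compares observed agreement with the agreement expected by chance: kappa = (observed - expected) / (1 - expected). The function name and the ratings are hypothetical.

```python
from collections import Counter

def cohen_kappa(rater1, rater2):
    """Cohen's kappa for two raters scoring the same cases."""
    n = len(rater1)
    observed = sum(r1 == r2 for r1, r2 in zip(rater1, rater2)) / n

    # Chance agreement from each rater's marginal category frequencies.
    counts1, counts2 = Counter(rater1), Counter(rater2)
    categories = counts1.keys() | counts2.keys()
    expected = sum(counts1[cat] * counts2[cat] for cat in categories) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of 10 charts for medication errors.
rater_a = ["error", "error", "ok", "ok", "error", "ok", "ok", "error", "ok", "ok"]
rater_b = ["error", "ok", "ok", "ok", "error", "ok", "error", "error", "ok", "ok"]
kappa = cohen_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")
```

The two raters agree on 8 of 10 charts, but about half of that agreement would be expected by chance, so kappa is noticeably lower than 0.8.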

28
Sensitivity and specificity

Sensitivity
Measures the extent to which a test agrees with a
gold standard
Often used when trying out a new diagnostic test
Reports how often the new test agrees with the
old when positive
Captures the false negatives
Calculated using a 2 X 2 table
Acceptability of score depends on test qualities

29
Sensitivity and specificity

Specificity
Measures the extent to which a test agrees with a
gold standard
Often used when trying out a new diagnostic test
Captures the false positives
Reports how often the new test agrees with the
old when negative (e.g., accurately reports the
absence of the condition)
Calculated using a 2 X 2 table
Acceptability of score depends on test qualities

30
2X2 tables revisited
             Gold standard (has condition)   Gold standard (does not have condition)
New test +   a                               b
New test -   c                               d
31
Calculating sensitivity and specificity

Sensitivity = number who are both disease positive
and test positive / number who are disease positive
= a/(a+c)
Specificity = number who are both disease
negative and test negative / number who are disease
negative
= d/(d+b)
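
The two formulas can be sketched in a few lines, assuming the 2X2 layout used in this deck (a = test positive with the condition, b = test positive without it, c = test negative with the condition, d = test negative without it). The function name and counts are hypothetical.

```python
def sensitivity_specificity(a, b, c, d):
    """Return (sensitivity, specificity) from 2X2 counts."""
    sensitivity = a / (a + c)   # true positives / all with the condition
    specificity = d / (d + b)   # true negatives / all without the condition
    return sensitivity, specificity

# Hypothetical counts for a new test compared with a gold standard.
sens, spec = sensitivity_specificity(a=90, b=30, c=10, d=170)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```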

32
Understanding sensitivity and specificity

Sensitivity is high when the test picks up a lot
of the true disease (has few false negatives)
High sensitivity is important for infectious
diseases (e.g., HIV)
Specificity is high when the test does not have
false positives. This is important when the
consequences of treating the disease are
significant (e.g., cancer)

33
Positive and negative predictive value

Tells you how good a test is at predicting
whether a patient actually has the disease
Positive predictive value is the probability that
the patient has the disease given a positive test
Depends on sensitivity, specificity and the
prevalence of the disease
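
The dependence on prevalence can be illustrated with Bayes' rule. The sketch below is one way to write it; the function name and the sensitivity, specificity and prevalence values are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Same hypothetical test, applied where the disease is rare vs. common.
ppv_rare, npv_rare = predictive_values(0.90, 0.85, prevalence=0.01)
ppv_common, npv_common = predictive_values(0.90, 0.85, prevalence=0.30)
print(f"prevalence 1%:  PPV = {ppv_rare:.2f}, NPV = {npv_rare:.3f}")
print(f"prevalence 30%: PPV = {ppv_common:.2f}, NPV = {npv_common:.3f}")
```

With the same test, a positive result is far more convincing when the disease is common than when it is rare.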

34
Overview

Different types of variables are measured and
presented differently
P values and confidence intervals are the measure
of statistical significance
Tell us the probability that these results could
have happened by chance
Cronbach's alpha, kappa, sensitivity and
specificity tell us about relationships between
measurements

35
For next week 1

Read Chapter 3 in the text
Read the ICES privacy document (www.ices.on.ca)
Think about privacy and confidentiality
What issues are relevant to you in your current
research?

37
For next week 2

Identify your data set
Where did it come from?
How was it collected?
What type of variables does it include?
What is your research question?
What are your exposure variables?
What is your outcome variable?
If you are not familiar with SPSS it is STRONGLY
recommended that you complete the tutorial
