Title: Variation: role of error, bias and confounding Raj Bhopal, Bruce and John Usher Professor of Public Health, Public Health Sciences Section, Division of Community Health Sciences, University of Edinburgh, Edinburgh EH89AG Raj.Bhopal@ed.ac.uk
2. Educational objectives
On completion of your studies you should understand that:
- Error is crucially important in applied sciences based on free-living populations, such as epidemiology
- Bias, considered as an error which affects comparison groups unequally, is particularly important in epidemiology
- The major causes of error and bias in epidemiology can be analysed based on the chronology of a research project
- Bias in posing the research question, stating hypotheses and choosing the study population is a relatively neglected but important topic in epidemiology
3. Educational objectives
- Errors and bias in data interpretation and publication are particularly important in epidemiology because of its health policy and health care applications
- Confounding is the mis-measurement of the relationship between a risk factor and disease, and arises in comparisons of groups which differ in ways that affect disease
- Different epidemiological study designs share most of the problems of error and bias
4. Exercise: Error and bias
- Reflect on the words error and bias. What is the difference, if any, between error and bias?
- Why might error and bias be particularly common and important in epidemiology?
5. Error
- An error is by definition an act, an assertion, or a belief that deviates from what is right, but what is right?
- The true length of a metre is arbitrarily decided by agreeing a definition
- The difference between a "correct" metre stick and an erroneous one can be accurately measured
- For health and disease the truth is usually unknown and cannot be defined in the way we define a metre
- Error should be considered as an inevitable and important part of human endeavour
- The Popperian view is that science progresses by the rejection of hypotheses (by falsification) rather than the establishing of so-called truths (by verification)
6. Bias
- A preference or an inclination
- Bias may be intentional or unintentional
- In statistics a bias is an error caused by
systematically favoring some outcomes over others
- Bias in epidemiology can be conceptualised as
error which applies unequally to comparison
groups.
7. Error and bias in biology
- Biological research is difficult because of the complexity and variety of living things
- Circadian and other natural rhythms cause change
- Measurement techniques are usually limited by technology, cost or ethical considerations
- Strict rules restrict what measurement is permissible ethically and what humans are willing to give their consent to
- Experimental manipulation to test a hypothesis is usually done late
8. Figure 4.1
(a) Error is unequal in one of these groups, leading to a false interpretation of the pattern of disease: falsely detecting differences
(b) Error is unequal in one of these groups, leading to a false interpretation of the pattern of disease: here, failure to detect differences
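The two situations described for Figure 4.1 can be illustrated with a little arithmetic. The sketch below is not taken from the figure: the prevalences, the detection fractions and the function name `observed_prevalence` are all hypothetical, chosen only to show how unequal error can either create or hide a difference between groups.

```python
def observed_prevalence(true_prevalence, detection_fraction):
    """Prevalence a study would record if only a fraction of true cases is detected."""
    return true_prevalence * detection_fraction


# Falsely detecting a difference: both groups truly have 10% prevalence,
# but (by assumption) only 60% of cases are detected in group B.
false_diff_a = observed_prevalence(0.10, 1.0)
false_diff_b = observed_prevalence(0.10, 0.6)

# Failing to detect a difference: the groups truly differ (10% vs 6%),
# but (by assumption) only 60% of cases are detected in group A.
hidden_diff_a = observed_prevalence(0.10, 0.6)
hidden_diff_b = observed_prevalence(0.06, 1.0)

print(f"False difference:  A={false_diff_a:.3f}  B={false_diff_b:.3f}")
print(f"Hidden difference: A={hidden_diff_a:.3f}  B={hidden_diff_b:.3f}")
```

In both cases the error applies to only one group, which is what turns an error into a bias.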
9. Error and bias in epidemiology
- Error and bias in epidemiology focus on (a) selection (of population), (b) information (collection, analysis and interpretation of data) and (c) confounding
- Error and bias are also inherent in the process of developing research questions and hypotheses, but this is seldom discussed
- Are questions of sex or racial differences in intelligence, disease, physiology or health biased questions?
10. The research question, theme or hypothesis
- Science is done by human beings who often have strong ideas and views
- They share in the social values and beliefs of their era, such as class, racial and sexual prejudice
- The question "Are men more intelligent (or healthy) than women?" could be considered a biased question
11. Research question
- Apparently the neutral hypothesis here would be that there are no gender differences in intelligence
- The underlying values of the researchers may be that men are more intelligent than women
- Such values are likely to be revealed at the analysis and interpretation stage by biased interpretation
- It is problematic to describe difference without conveying a sense of superiority and inferiority
12. The research question
- The Syphilis Study of the US Public Health Service followed up 600 African American men for some 40 years
- The question: does syphilis have different and, particularly, less serious outcomes in African Americans than in European-origin Americans?
- Investigators denied the study subjects treatment even when it was available and curative (penicillin)
13. Choice of population
- Known as selection bias
- Volunteers are a popular choice
- Volunteers tend to be different in their attitudes, behaviours and health status compared to those who do not volunteer
- Men have been more often selected than women
- Investigators are prone to exclude individuals and populations for reasons of convenience, cost or preference rather than for neutral, scientific reasons
14. Selection bias
- Selection bias is inevitable, simply because investigators need to make choices
- Captive populations are popular: some may be fairly representative, e.g. schoolchildren, others not at all, e.g. university students
- People are also missed, either inadvertently or because they actively do not participate
- Selection bias matters much more in epidemiology than in biologically based medical sciences
- Biological factors are usually generalisable between individuals and populations, so there is a prior presumption of generalisability
- If an anatomist describes the presence of a particular muscle, or cell type, based on one human being, it is likely to be present in all human beings (and possibly all mammals)
15. Non-participation
- Some subjects chosen for a study do not participate, causing selection bias
- The non-response in good studies is typically 30-40%
- Non-responders differ from those who respond
- The problem is compounded when the non-response differs greatly in two populations that are to be compared
- The effect may be understood if some information is available on those not participating, e.g. their age, sex, social circumstances and why they refused
- Non-response bias is an intrinsic limitation of the survey method and hence of epidemiology
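The effect of non-response can be sketched numerically. The example below is purely illustrative: the survey size, the true smoking prevalence, the response rates and the function name `prevalence_among_responders` are assumptions made for this sketch, not data from any study.

```python
def prevalence_among_responders(n_invited, true_prevalence,
                                response_if_exposed, response_if_unexposed):
    """Exposure prevalence seen among responders when response depends on exposure."""
    exposed = n_invited * true_prevalence
    unexposed = n_invited - exposed
    exposed_responders = exposed * response_if_exposed
    unexposed_responders = unexposed * response_if_unexposed
    return exposed_responders / (exposed_responders + unexposed_responders)


# Hypothetical survey: 1,000 people invited, 30% of whom truly smoke.
# Assumed response: 50% of smokers and 80% of non-smokers take part.
observed = prevalence_among_responders(1000, 0.30, 0.50, 0.80)

# If smokers and non-smokers responded equally, there would be no bias.
unbiased = prevalence_among_responders(1000, 0.30, 0.71, 0.71)

print(f"True prevalence 30.0%, observed {observed:.1%}, "
      f"with equal response {unbiased:.1%}")
```

The overall response is the same in both scenarios (about 71%), which suggests that it is the difference between responders and non-responders, not the response rate alone, that produces the bias.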
16. Figure 4.2
Diagram of a study population, an ignored population and a comparison population, illustrating:
- Ignoring populations
- Questions harming one population
- Measuring unequally
- Generalising from unrepresentative populations
17. Comparing risk factor-disease outcome relationships in populations which differ (confounding)
- Confounding is a difficult idea to explain and grasp
- It is the error in the measure of association between a specific risk factor and disease outcome, which arises when there are differences in the comparison populations other than the risk factor under study
- Confounding is derived from a Latin word meaning to mix up, a useful idea, for confounding mixes up causal and non-causal relationships
- The potential for it to occur is there whenever the cardinal rule, "compare like with like", is broken
18. Exercise: Confounding
- Imagine that a study follows up people who drink alcohol and observes the occurrence of lung cancer
- A group of people who do not drink and are of the same age and sex provide the comparison group
- The study finds that lung cancer is more common in alcohol drinkers, i.e. there is an association between alcohol consumption and lung cancer
- Did alcohol cause lung cancer?
19. Confounding
- In what other important ways might the study (alcohol drinking) and comparison (no alcohol drinking) populations be different?
- Could the association between alcohol and lung cancer be confounded?
- What might be the confounding variable?
- The first key analysis in all epidemiological studies is to compare the characteristics of the populations under study
20. Examples of confounding
21. Figure 4.3
Diagram linking three elements:
- The true cause (the confounding variable): one of the causes of the disease
- The apparent but spurious risk factor for disease: it has a statistical but not causal association with the disease
- The disease
The apparent risk factor and the causal factor are associated with each other.
22. Figure 4.4
Diagram linking smoking, alcohol drinking and lung cancer:
- Smoking is associated with the apparent risk factor, alcohol, and vice versa
- Smoking causes lung cancer
- Alcohol is statistically but not causally linked to lung cancer
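The alcohol, smoking and lung cancer example can be worked through with invented numbers. The sketch below assumes a hypothetical cohort in which lung cancer risk is 1% in smokers and 0.1% in non-smokers, alcohol has no effect at all, and drinkers are more likely to smoke. The counts and the helper `risk_ratio` are illustrative only. Comparing the crude result with the result within each smoking stratum, and with a Mantel-Haenszel summary risk ratio across strata, shows one common way of checking for confounding.

```python
def risk_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)


# Hypothetical cohort, stratified by smoking. Each stratum holds:
# (cases in drinkers, drinkers, cases in non-drinkers, non-drinkers)
strata = {
    "smokers": (60, 6000, 20, 2000),
    "non-smokers": (4, 4000, 8, 8000),
}

# Crude analysis: ignore smoking and pool the strata.
totals = [sum(column) for column in zip(*strata.values())]
crude_rr = risk_ratio(*totals)

# Stratified analysis: compare like with like within each smoking group.
stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}

# Mantel-Haenszel summary risk ratio across the strata.
numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata.values())
denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata.values())
mh_rr = numerator / denominator

print(f"Crude risk ratio: {crude_rr:.2f}")
for name, value in stratum_rr.items():
    print(f"Risk ratio in {name}: {value:.2f}")
print(f"Mantel-Haenszel risk ratio: {mh_rr:.2f}")
```

With these invented counts the crude analysis suggests that drinkers have more than twice the risk, yet within each smoking stratum the risk is identical in drinkers and non-drinkers.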
23. Possible actions to control confounding
24. Measurement errors in epidemiology
- Information bias
- Why are measurement errors in epidemiology likely to be more common and more important than in other scientific disciplines, say physics, anatomy, biochemistry or animal physiology?
- Assessing the presence of disease in living human beings requires a judgement
- Measuring socio-economic circumstances, ethnic group, cigarette smoking habits or alcohol consumption is a complex matter
- These errors are life-and-death matters, even in epidemiological research
25. Measurement errors
- Past exposures will need to be estimated, sometimes from contemporary measures
- Biological variation needs to be taken into account, e.g. blood pressure varies from moment to moment in response to physiological needs related to activity, in a 24-hour (circadian) cycle with lowered pressure in the night, and with the ambient temperature
- Some variables have natural variation so great that making estimates is extremely difficult, for example diet, alcohol consumption and the level of stress
- Machine imprecision is also inevitable
- Inaccurate observation by the investigator or diagnostician
26. Measurement errors and bias
- Measurement errors which occur unequally in the comparison populations (differential misclassification errors or bias) are likely to irreversibly destroy a study
- Such errors can increase the strength of the association in error
- Non-differential errors or biases, occurring in both comparison populations, are much more likely to occur
27. Misclassification bias
- Misclassification error (or bias) occurs when a person is put into the wrong category (or population sub-group), usually as a result of faulty measurement
- Some people who are hypertensive will be misclassified as normal
- Some who are not hypertensive will be misclassified as hypertensive
- The end result in terms of the prevalence of hypertension may be about right
- The degree to which a measure leads to a correct classification can be quantified using the concepts of sensitivity and specificity, which are discussed in relation to screening tests
- In measuring the strength of association between exposures and disease outcomes, non-differential misclassification error has an important and not always predictable effect
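The hypertension example can be made concrete with a small sketch. The counts below are hypothetical, chosen so that the two kinds of misclassification cancel out, and the function name `classification_summary` is an assumption of this example. Sensitivity is the proportion of truly hypertensive people classified as hypertensive, and specificity is the proportion of truly normotensive people classified as normal.

```python
def classification_summary(true_pos, false_neg, false_pos, true_neg):
    """Summarise how well a measurement classifies people against the truth."""
    total = true_pos + false_neg + false_pos + true_neg
    return {
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
        "true_prevalence": (true_pos + false_neg) / total,
        "measured_prevalence": (true_pos + false_pos) / total,
    }


# Hypothetical survey of 1,000 people, 200 of whom are truly hypertensive.
# Assumed errors: 40 hypertensive people are classified as normal and
# 40 normotensive people are classified as hypertensive.
summary = classification_summary(true_pos=160, false_neg=40,
                                 false_pos=40, true_neg=760)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Here 80 people are in the wrong category, yet the measured prevalence equals the true prevalence. Individuals are misclassified even though the population summary looks right, which is why the effect on measures of association needs separate consideration.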
28. Non-differential misclassification error
- Imagine a study of 20,000 women, 10,000 on the contraceptive pill and the rest not
- Say that over 10 years 20% of those on the pill develop a cardiovascular disease compared to 10% of those not on the pill
- The rate of disease in the oral contraceptive group is doubled (relative risk 2)
- Assume that misclassification in exposure occurs 10% of the time, so that 10% of women actually on the pill were classified as not on the pill, and that 10% who were not on the pill were classified as on the pill
29. Imaginary study of cardiovascular outcome and pill use: no misclassification
30. Pill and cardiovascular disease: 10% misclassification of pill use
31. Misclassification: the pill
- The risk of CVD in the "pill users group" with 10% misclassification is 1,900/10,000, and in the "not on the pill group" is 1,100/10,000, so the relative risk is 1,900/1,100, or about 1.7
- Misclassification will, inevitably, also arise in measurement of the disease outcome, further reducing the strength of the association
- Generally, non-differential misclassification bias lowers the relative risk
- This general principle may break down when misclassification occurs in confounding variables as well
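The arithmetic of the pill example can be checked with a short sketch. It assumes that the figures in the imaginary study are percentages (20% and 10% risk of cardiovascular disease, 10% misclassification of pill use), a reading consistent with the 1,900/10,000 and 1,100/10,000 quoted for the misclassified groups. The function names are illustrative.

```python
def classified_groups(n_users, n_non_users, risk_users, risk_non_users,
                      misclassified_fraction):
    """Return (cases, size) for pill users and non-users as classified,
    after the same fraction of each true group is put in the wrong group."""
    users_moved = n_users * misclassified_fraction
    non_users_moved = n_non_users * misclassified_fraction

    users_cases = ((n_users - users_moved) * risk_users
                   + non_users_moved * risk_non_users)
    users_size = n_users - users_moved + non_users_moved

    non_users_cases = ((n_non_users - non_users_moved) * risk_non_users
                       + users_moved * risk_users)
    non_users_size = n_non_users - non_users_moved + users_moved

    return (users_cases, users_size), (non_users_cases, non_users_size)


def relative_risk(exposed, unexposed):
    """Ratio of risks, each group given as (cases, size)."""
    return (exposed[0] / exposed[1]) / (unexposed[0] / unexposed[1])


# No misclassification: 2,000/10,000 versus 1,000/10,000.
true_rr = relative_risk(*classified_groups(10_000, 10_000, 0.20, 0.10, 0.0))

# 10% non-differential misclassification of pill use.
users, non_users = classified_groups(10_000, 10_000, 0.20, 0.10, 0.10)
observed_rr = relative_risk(users, non_users)

print(f"True relative risk: {true_rr:.2f}")
print(f"Classified pill users: {users[0]:.0f}/{users[1]:.0f}")
print(f"Classified non-users: {non_users[0]:.0f}/{non_users[1]:.0f}")
print(f"Observed relative risk: {observed_rr:.2f}")
```

Raising `misclassified_fraction` towards 0.5 moves the observed relative risk towards 1, in line with the general principle that non-differential misclassification of exposure weakens the association.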
32. Analysis and interpretation
- Usually the potential for data analysis is far greater than that actually done
- The choices will be informed by the prior interests (and biases) and expertise of the researcher
- External scrutiny of the research protocol at an early stage by objective advisors could reduce such biases
- Inclusion of objective, uninvolved people in the research team at the data analysis and interpretation stage is possible but unusual, so:
- Investigators should ensure their analysis is driven by hypotheses, research questions and an analysis strategy prepared in advance
- One proposal is that investigators should make public their data questionnaire, the analysis strategy, and other information required to replicate the analysis
33. Judgement and action
- The data and interpretation are examined by those who need to make decisions
- Interpretations, especially those which involve change that may threaten powerful interests, will be contested
- Interpretation is a matter of judgement, and judgement will depend on the prior values, beliefs and interests of the observer
- Epidemiologists are not the sole arbiters of the theory and data
- Epidemiologists, however, have responsibilities for minimising the impact of their own biases and preventing the misinterpretation of data and recommendations by those with vested interests
34. Study population bias: generalisation
- Much of epidemiology is concerned with population subgroups and comparisons between them
- The interpretation rests on the assumption that the results apply, at least, to the whole group as originally chosen, if not the whole population
- Error arises in the inappropriate generalisation of study data to another population
35. Controlling errors and bias
- Error control requires awareness and good scientific technique
- Bias control needs equal attention to error control in all the population sub-groups
- Error and bias cannot be fully controlled, so the most important need is for systematic, cautious and critical interpretation of data
36. Conclusion
- Bias is a central issue in epidemiology
- When epidemiological data are applied to provide health advice to individuals and to shape public health policy, error and bias are especially important
- I am not aware of an epidemiological theory on why error and bias occur
- Social sciences research on the nature of science indicates that the scientific endeavour is not wholly objective but open to the influence of society and context
- The framework provided by the chronology and structure of a research project offers a logical approach to analysis of bias and error
37. Conclusions
- The main principles are:
  - develop research questions and hypotheses which benefit all the population and will not lead to harm
  - study a representative population
  - measure accurately and with equal care across comparison groups
  - compare like with like
  - check for the main findings in subgroups before assuming that inferences and generalisations apply across all groups
  - findings of a single study should rarely be accepted at face value: first consider artefact
  - a critical attitude is essential