Statistical Errors in Publications

Statistical Errors in Publications

Description:

Title: BeST Back Skills Training Trial Author: IT Services Last modified by: mhsdaw Created Date: 1/17/2005 12:19:35 PM Document presentation format – PowerPoint PPT presentation

Slides: 36
Provided by: ITSe208

Transcript and Presenter's Notes


1
Statistical Errors in Publications
  • October 2010

2
  • OVERVIEW
  • Greater emphasis on sections dealing with
  • Design
  • Sample size
  • Statistical methodology
  • Results (Presentation/Interpretation)
  • Discussion/Conclusion.

3
  • SAMPLE PAPERS
  • Sample 1: Randomised controlled trial of the
    management of ankle sprains comparing elastic
    support bandage v. Aircast ankle brace (Br. J.
    Sports Med, 2005)
  • Sample 2: Study to assess variables which
    predict chronic neck pain disability (Arch Phys
    Med Rehabil, 2004).

4
  • PREVALENCE OF STATISTICAL ERRORS
  • Concerns of misuse of statistics dating back over
    70 years (Altman, 2004)
  • Despite greater awareness of statistical issues
    (e.g. CONSORT), such concerns have not
    diminished

5
  • Prevalence of Statistical Errors (contd)
  • Serious statistical errors were found in 40% of
    164 articles published in psychiatry (Altman,
    2002)
  • At least one serious statistical error occurred
    in 38% and 25% of papers in Nature and BMJ
    respectively (Garcia-Berthou and Alcaraz, 2004)
  • Many surveys of statistical errors report error
    rates ranging from 30% to 90% (Altman, 1991; Gore
    et al., 1976; Pocock et al., 1987; MacArthur,
    1984).

6
  • Why are there so many errors? (Altman, 2004)
  • Many investigators are not professional
    researchers; they are primarily clinicians
  • Training is usually a single course in statistics
  • Training focuses on data analysis, but issues
    such as statistical reporting and
    interpretation are not addressed
  • The statistical content and complexity of
    medical research has increased steadily over
    recent decades.

7
  • (Altman, 2004)
  • "... When I tell friends outside medicine
    that many papers published in medical journals
    are misleading because of methodological
    weaknesses they are rightly shocked."
  • Huge sums of money are spent annually on research
    that is seriously flawed through the use of
    inappropriate designs, unrepresentative samples,
    small samples, incorrect methods

8

  • [Diagram: the research process, laid out under
    the headings Process, Stage and Activity]
  • Stages: Observe (Natural Course of Disease),
    Hypothesize (Frame Research Question), Test
    (Conduct Experiment/Clinical Trial), Conclude
    (Validate or Modify Hypothesis)
  • Processes: Concept Development, Experimental
    Design, Statistical Inference
  • Activities: Personal Scientific Experience;
    Research Planning, Grant Writing, Protocol
    Development; Data Collection and Analysis;
    Journal Articles, Scientific Meetings

9
  • DESIGN
  • Population: a group of individuals (persons,
    objects, or items) from which samples are taken
  • Sample: a finite part of a statistical
    population whose properties are studied to gain
    information about the whole
  • Sampling: the process of selecting a suitable
    sample, or a representative part of a
    population, for the purpose of determining
    parameters or characteristics of the whole
    population
  • Purpose of sampling: to draw conclusions about
    populations from samples, we must use inferential
    statistics, which enable us to determine a
    population's characteristics by directly
    observing only a portion (or sample) of the
    population.

10
  • Design (contd)
  • Sampling error: What can make a sample
    unrepresentative of its population? One of the
    most frequent causes is sampling error.
  • Two types of sampling error:
  • Chance: the error that occurs just because of
    bad luck.
  • Bias: sampling bias is a tendency to favour the
    selection of units that have particular
    characteristics (as a result of a poor sampling
    plan).
  • To avoid sampling error: plan carefully!
  • Select participants at random.
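
As a rough illustration of random selection, the sketch below draws a simple random sample without replacement using Python's standard library. The function name `simple_random_sample`, the participant IDs and the seed are hypothetical and are not taken from the slides; a real study would follow its own sampling plan.

```python
import random


def simple_random_sample(population_ids, sample_size, seed=None):
    """Draw a simple random sample without replacement.

    Every unit has the same chance of selection, which guards against
    selection bias (chance error still remains).
    """
    rng = random.Random(seed)  # a seeded generator makes the draw reproducible
    return rng.sample(list(population_ids), sample_size)


# Hypothetical sampling frame of 1000 participant IDs
frame = range(1, 1001)
chosen = simple_random_sample(frame, 50, seed=2010)
print(sorted(chosen)[:5])
```

Fixing the seed is only a convenience for documenting how the sample was drawn; it does not change the statistical properties of the selection.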

11
  • SAMPLE SIZE
  • Sample size may be determined by various
    practical constraints:
  • Financial
  • Resources
  • Too small a sample is not representative of the
    population
  • Too large a sample is wasteful and can be
    unethical
  • The larger the sample size, the more likely the
    results are to reflect what will happen in the
    population

12
  • Sample size (Power Calculation) (contd)
  • Difference: the clinically important difference
  • Significance threshold (type I error):
    conventionally set at 0.01 or 0.05
  • Power (i.e. 1 - type II error): conventionally
    80% or 90%. How confident you are that the
    sample will detect a difference, if one really
    exists in the population
  • Variability: the less variability among patients
    within each group, the more likely they are to
    reflect the overall population.

13
  • Sample size (Power Calculation) (contd)
  • The required sample size increases with:
  • A smaller clinically relevant difference
  • An increase in power
  • Greater variability
  • A reduction in the type I error rate
  • An allowance for dropouts and/or withdrawals
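
How these inputs drive the required sample size can be sketched with the usual normal-approximation formula for comparing two means, n per group = 2 (z_{1-α/2} + z_{power})² σ² / δ². This is only an illustration under simplifying assumptions (two equal-sized groups, a common standard deviation, a two-sided test); the function name `n_per_group` and the input values are hypothetical, and a real trial may need adjustments (e.g. for clustering or unequal allocation) that this sketch ignores.

```python
import math
from statistics import NormalDist


def n_per_group(diff, sd, alpha=0.05, power=0.80, dropout=0.0):
    """Approximate participants per group for a two-sided comparison of means.

    diff    : smallest clinically relevant difference
    sd      : assumed common standard deviation
    alpha   : type I error rate (significance level)
    power   : 1 - type II error rate
    dropout : expected proportion lost to follow-up (inflates the result)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sd / diff) ** 2
    return math.ceil(n / (1 - dropout))


# Hypothetical inputs: difference of 5 units, standard deviation of 10
print(n_per_group(5, 10))                # baseline
print(n_per_group(2.5, 10))              # smaller difference -> larger n
print(n_per_group(5, 10, power=0.90))    # more power -> larger n
print(n_per_group(5, 15))                # more variability -> larger n
print(n_per_group(5, 10, alpha=0.01))    # stricter type I error -> larger n
print(n_per_group(5, 10, dropout=0.20))  # allowing for dropouts -> larger n
```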

14
  • Sample size (contd)
  • Review the two articles in terms of
  • Design
  • Sample size

15
  • Sample size (contd)
  • "A major concern in the design of studies is
    the almost universal lack of reporting of how the
    sample size was obtained" (Altman, 2000).
  • The basis of the power calculation is
    inadequately described (Malachy, 2004; Vail et
    al., 2003) (all sample papers).
  • Quite often sample size calculations are
    computed without allowing for dropouts
    (McGuigan, 1995) (all sample papers).

16
  • Sample size (contd)
  • Small studies
  • Small trials have low power and hence a high
    type II error rate
  • If no sample size calculation is provided, the
    conclusions of the study have little value (as
    in sample 2)
  • If the study is underpowered, the conclusions
    should be taken with caution and the results are
    inconclusive (as in sample 1)

17
  • Sample size (contd)
  • A description of the sample size in the
    literature should contain, for example:
  • The mean and SD of the RMQ under active
    management are 5.91 and 4.27 respectively
    (Oxfordshire Low Back Pain trial, BMJ, 2005). The
    smallest difference between the two therapies
    which is clinically relevant is approximately
    2.0. Using this information, the total number of
    participants required for this study will be 700,
    allowing for a 25% loss to follow-up and using
    90% power with a 1% type I error rate
    (significance level).

18
  • METHODS
  • "... All of the problems hinge on
    the understanding of what a statistical test is
    doing and what a p-value means ..."
  • (Murphy, 2004)

19
  • METHODS
  • A statistical test is a procedure used to
    compute a probability (the p-value) that
    measures the strength of the evidence against
    the null hypothesis

20
  • Methods (contd)
  • e.g. H0
  • H1
  • Test statistic: t-test
  • The test statistic is transformed into a p-value

21
  • Methods (contd)
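  • P-value: the probability, calculated assuming
    the null hypothesis is true, of a result at
    least as extreme as the one observed. It
    quantifies the strength of the evidence against
    the null hypothesis.
  • Neither the statistical test nor the p-value
    PROVES or DISPROVES the null hypothesis; they
    only provide EVIDENCE against it, which may be
    strong or weak.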
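
One way to see what a p-value measures is an exact permutation test, sketched below. Note that this is not the t-test named on the earlier slide: it is swapped in here because it needs only the standard library and makes the "assuming the null hypothesis is true" step explicit. The function name and the data are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means.

    Under the null hypothesis the group labels are interchangeable, so we
    enumerate every way of relabelling the pooled data and count how often
    the difference in means is at least as extreme as the one observed.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)

    extreme = 0
    count = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical outcome scores in two small treatment groups
p = permutation_p_value([4, 6, 5, 7, 3], [6, 8, 7, 9, 7])
print(f"p = {p:.2f}")
```

A small p-value says the observed difference would be unusual if the labels were arbitrary; it does not prove that the null hypothesis is false. Full enumeration is only practical for small groups.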

22
  • Methods (contd)
  • Review the two articles in terms of
  • Methods
  • Results (including figures and tables)

23
  • Methods (contd)
  • "A further issue is the copying of incorrect
    or inappropriate methods. Once incorrect
    procedures become common, it is hard to stop them
    from spreading through the medical literature
    like a genetic mutation" (Altman, 2002).
  • (as sample 1)
  • Schwartzer et al. (2000) found that most papers
    made important errors in the application of new
    technology such as models for longitudinal data
    (Altman, 2000).
  • (e.g. hierarchical models in sample 1; ROC curves
    in sample 2)

24
  • Methods (contd)
  • Most common errors in the Methods section:
  • Failure to check assumptions (Nature says that
    the most common error was not checking for a
    normal distribution and not stating how normality
    was tested)
  • Using linear regression analysis without first
    establishing that the relationship is linear
  • Ignoring paired data or ordered categories and
    therefore using an inappropriate test
  • Arbitrarily dividing continuous data into ordinal
    categories without explanation (data dredging)
  • Multiple comparisons (which increase the
    likelihood of a spuriously significant result)
    (sample 2)
  • And many more: sub-group analyses, ignoring a
    repeated measures design, non-matched analysis
    of matched data, incorrect modelling (e.g.
    interactions not included).
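
The multiple-comparisons problem can be made concrete with a small calculation. The sketch below assumes the tests are independent and every null hypothesis is true, in which case the chance of at least one false positive among k tests is 1 - (1 - α)^k. The Bonferroni threshold shown is one common remedy and is an addition for illustration; it is not mentioned in the slides, and the function names are hypothetical.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive among independent tests
    when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold that keeps the familywise
    error rate at or below alpha."""
    return alpha / n_tests


for k in (1, 5, 20):
    print(k, round(familywise_error_rate(k), 3))

# With the Bonferroni threshold the familywise rate stays below 0.05
print(familywise_error_rate(20, bonferroni_threshold(20)) <= 0.05)
```

Real analyses rarely involve fully independent tests, so these figures are indicative rather than exact.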

25
  • Methods (contd)
  • Begin a statistical analysis with data
    exploration
  • Check assumptions
  • Type of data: continuous, binary, ordinal,
    repeated over time, etc.
  • Missing values, outliers, no. of withdrawals
  • Be careful with computer output (often helps to
    do simple calculations by hand first).

26
  • RESULTS
  • "The results section must be written so that
    the average reader can understand the study
    findings" (Cummings, 2003).
  • Often poorly written, with excessive jargon
    (Byrne, 2000).
  • (sample 1 and sample 2)
  • "A major bias is cherry-picking results"
    (Malachy, 2004).

27
  • Results (contd)
  • Common Language pitfalls
  • Avoid non-technical uses of technical terms such
    as "normal", "significant", "sample"
  • "No difference" usually means only a lack of
    evidence of a statistically significant
    difference
  • (Sample 1)
  • P-values: report using 2-digit precision (e.g.
    p = 0.82)
  • Do not reduce p-values to "non-significant" or
    "NS"
  • Report a quantity to a precision that is
    scientifically relevant (e.g. a mean blood
    pressure of 115.73 mmHg should be reported as
    115.7 mmHg or even 116 mmHg)

28
  • Results (contd)
  • P-values
  • Over-emphasis on the p-value
  • An arbitrary division of the results into
    "significant" and "non-significant" according to
    the p-value was not the intention of the founders
    of statistical inference
  • Smaller p-values indicate stronger evidence
    against the null hypothesis.

29
  • Results (contd)
  • Confidence Intervals
  • A confidence interval is a range of values
    that is likely to enclose the population value
  • Confidence intervals are preferable to p-values,
    as they tell us the range of possible effect
    sizes compatible with the data
  • The larger the sample size, the narrower the
    confidence interval
  • A confidence interval for a difference (e.g. a
    treatment difference) that contains 0, or for a
    ratio (e.g. an odds ratio) that contains 1,
    implies a lack of evidence of a statistically
    significant difference.
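
The effect of sample size on the width of a confidence interval, and the "does it contain 0?" check, can be sketched as follows. This uses a large-sample normal approximation for the difference between two means, which is an assumption made for simplicity (small samples would normally use the t distribution); the function name `diff_ci` and the summary statistics are hypothetical.

```python
import math
from statistics import NormalDist


def diff_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, level=0.95):
    """Approximate confidence interval for mean_a - mean_b
    (large-sample normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    diff = mean_a - mean_b
    return diff - z * se, diff + z * se


# Same hypothetical means and SDs, two different sample sizes per group
small = diff_ci(12.0, 5.0, 10, 10.0, 5.0, 10)
large = diff_ci(12.0, 5.0, 100, 10.0, 5.0, 100)

for label, (lo, hi) in (("n = 10", small), ("n = 100", large)):
    contains_zero = lo <= 0 <= hi
    print(f"{label}: ({lo:.2f}, {hi:.2f}) contains 0: {contains_zero}")
```

With these inputs the smaller study's interval spans 0 while the larger study's does not, even though the estimated difference is the same in both.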

30
  • Results (contd)
  • And many more pitfalls:
  • testing baseline values (sample 1)
  • not reporting missing data
  • lack of statistical power not considered
  • misinterpreting and misunderstanding results
    from models e.g. no interactions included.

31
  • PRESENTATION
  • In tables that compare groups, include counts
    (of patients or events) and column percentages
  • Use appropriate statistics (e.g. the median
    rather than the mean for non-normal data)
  • In tables of column percentages, do not include a
    row of counts and percentages of missing data
    (doing this will distort the other percentages in
    the table)
  • Statistical software packages provide a large
    amount of output; be selective about what is
    presented
  • Use graphs as an alternative to tables with many
    entries; do not duplicate graphs and tables
  • Label graphs and tables correctly (sample 1
    and sample 2)

32
  • INTERPRETATION AND DISCUSSION
  • Put the study sample in the context of the
    population
  • Do not interpret studies with non-significant
    results and low statistical power as "negative"
    when they are inconclusive: the absence of proof
    is not proof of absence
  • Errors made in the design and analysis of a
    study can carry through to errors in
    interpretation (Rushton, 1999)
  • State the weaknesses and strengths of the study
    design so that a clear and accurate impression of
    the reliability of the data can be formed.

33
  • And finally...
  • The misuse of statistics is an important problem
  • Statisticians need to be involved in research
    at some stage, preferably as early as possible
  • Most errors are relatively unimportant
  • Some can have a major bearing on the validity
    of the study.
  • So...

34
  • "There are three kinds of lies: lies,
    damned lies, and statistics."
  • Attributed to Benjamin Disraeli.

35