Title: INTRODUCTION TO BIOSTATISTICS
1INTRODUCTION TO BIOSTATISTICS
- Dr. S. Shaffi Ahamed
- Asst. Professor
- Dept. of Family and Comm. Medicine
- KKUH
2This session covers
- Background and need to know Biostatistics
- Origin and development of Biostatistics
- Definition of Statistics and Biostatistics
- Types of data
- Graphical representation of data
- Frequency distribution of data
3- Statistics is the science which deals with
collection, classification and tabulation of
numerical facts as the basis for explanation,
description and comparison of phenomena.
- Lovitt
4BIOSTATISTICS
- (1) Statistics arising out of biological
sciences, particularly from the fields of
medicine and public health.
- (2) The methods used in dealing with statistics
in the fields of medicine, biology and public
health for planning and conducting investigations
in these branches and analyzing the data which
arise from them.
5Origin and development of statistics in Medical
Research
- In 1929, a huge paper on the application of statistics
was published in Physiology Journal by Dunn.
- In 1937, 15 articles on statistical methods by
Austin Bradford Hill were published in book
form.
- In 1948, an RCT of streptomycin for pulmonary TB
was published, in which Bradford Hill had a key
influence.
- Statistics in medicine then grew, with an
8-fold increase from 1952 to 1982.
6C.R. Rao
Ronald Fisher
Karl Pearson
Douglas Altman
Gauss
7 8Sources of Medical Uncertainties
- Intrinsic due to biological, environmental and
sampling factors
- Natural variation among methods, observers,
instruments etc.
- Errors in measurement or assessment or errors in
knowledge
- Incomplete knowledge
9Intrinsic variation as a source of medical
uncertainties
- Biological due to age, gender, heredity, parity,
height, weight, etc. Also due to variation in
anatomical, physiological and biochemical
parameters
- Environmental due to nutrition, smoking,
pollution, facilities of water and sanitation,
road traffic, legislation, stress and strains
etc.
- Sampling fluctuations because the entire world
cannot be studied and at least future cases can
never be included
- Chance variation due to factors that are unknown
or too complex to comprehend
10Natural variation despite best care as a source
of uncertainties
- In assessment of any medical parameter
- Due to partial compliance by the patients
- Due to incomplete information in conditions such
as the patient in coma
11Medical Errors that cause Uncertainties
- Carelessness of the providers such as physicians,
surgeons, nursing staff, radiographers and
pharmacists.
- Errors in methods such as in using incorrect
quantity or quality of chemicals and reagents,
misinterpretation of ECG, using inappropriate
diagnostic tools, misrecording of information
etc.
- Instrument error due to use of a non-standardized
or faulty instrument and improper use of the right
instrument.
- Not collecting full information
- Inconsistent response by the patients or other
subjects under evaluation
12Incomplete knowledge as a source of Uncertainties
- Diagnostic, therapeutic and prognostic
uncertainties due to lack of knowledge
- Predictive uncertainties such as in survival
duration of a patient with cancer
- Other uncertainties such as how to measure
positive health
13- Biostatistics is the science that helps in
managing medical uncertainties
14Reasons to know about biostatistics
- Medicine is becoming increasingly quantitative.
- The planning, conduct and interpretation of much
of medical research are becoming increasingly
reliant on statistical methodology.
- Statistics pervades the medical literature.
15CLINICAL MEDICINE
- Documentation of medical history of diseases.
- Planning and conduct of clinical studies.
- Evaluating the merits of different procedures.
- In providing methods for definition of normal
and abnormal.
16Role of Biostatistics in patient care
- In increasing awareness regarding diagnostic,
therapeutic and prognostic uncertainties and
providing rules of probability to delineate those
uncertainties
- In providing methods to integrate chances with
value judgments that could be most beneficial to
the patient
- In providing methods such as sensitivity-specificity
and predictivities that help choose valid
tests for patient assessment
- In providing tools such as scoring systems and
expert systems that can help reduce epistemic
uncertainties
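As a minimal sketch of the sensitivity, specificity and predictive values mentioned above, the Python below computes them from a 2x2 table of test result against true disease status. The function name `diagnostic_accuracy` and the counts are hypothetical, chosen only for illustration; they are not from the lecture.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Return standard validity measures of a diagnostic test.

    tp: diseased, test positive      fn: diseased, test negative
    fp: not diseased, test positive  tn: not diseased, test negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # P(test positive | diseased)
        "specificity": tn / (tn + fp),  # P(test negative | not diseased)
        "ppv": tp / (tp + fp),          # P(diseased | test positive)
        "npv": tn / (tn + fn),          # P(not diseased | test negative)
    }


# Hypothetical counts: 100 diseased and 200 non-diseased subjects.
result = diagnostic_accuracy(tp=90, fp=30, fn=10, tn=170)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, the predictive values depend on how common the disease is in the group tested, so they should not be carried over to a population with a different prevalence.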
17PREVENTIVE MEDICINE
- To provide the magnitude of any health problem
in the community.
- To find out the basic factors underlying the
ill-health.
- To evaluate the health programs which were
introduced in the community (success/failure).
- To introduce and promote health legislation.
18Role of Biostatistics in Health Planning and
Evaluation
- In carrying out a valid and reliable health
situation analysis, including proper
summarization and interpretation of data.
- In proper evaluation of the achievements and
failures of a health programme
19Role of Biostatistics in Medical Research
- In developing a research design that can minimize
the impact of uncertainties
- In assessing reliability and validity of tools
and instruments to collect the information
- In proper analysis of data
20Example: Evaluation of Penicillin (treatment A)
vs Penicillin + Chloramphenicol (treatment B) for
treating bacterial pneumonia in children < 2 yrs.
- What is the sample size needed to demonstrate the
significance of one group against the other?
- Is treatment A better than treatment B or
vice versa?
- If so, how much better?
- What is the normal variation in clinical
measurement (mild, moderate, severe)?
- How reliable and valid is the measurement
(clinical, radiological)?
- What is the magnitude and effect of laboratory
and technical error?
- How does one interpret abnormal values?
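The sample size question above can be illustrated with a rough sketch. It uses the common normal-approximation formula for comparing two proportions, n per group = (z_alpha/2 + z_beta)^2 [p1(1 - p1) + p2(1 - p2)] / (p1 - p2)^2. The cure rates of 70% and 85%, the 5% significance level and the 80% power are assumptions made only for this example, not figures from the lecture, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate number of subjects per group to compare two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)  # round up to whole subjects


# Hypothetical cure rates: 70% on treatment A, 85% on treatment B.
n_per_group = sample_size_two_proportions(0.70, 0.85)
print("Subjects needed per group:", n_per_group)
```

The smaller the difference one wants to detect, the larger the required sample, which is why the expected effect has to be stated at the planning stage.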
21WHAT DOES STATISTICS COVER ?
- Planning
- Design
- Execution (Data collection)
- Data Processing
- Data analysis
- Presentation
- Interpretation
- Publication
22BASIC CONCEPTS
Data: Set of values of one or more variables
recorded on one or more observational units
Sources of data:
1. Routinely kept records
2. Surveys (census)
3. Experiments
4. External source
Categories of data:
1. Primary data: observation, questionnaire, record form,
interviews, survey
2. Secondary data: census, medical record, registry
23TYPES OF DATA
- QUALITATIVE DATA
- DISCRETE QUANTITATIVE
- CONTINUOUS QUANTITATIVE
24QUALITATIVE
- Nominal
- Example: Sex (M, F)
- Exam result (P, F)
- Blood Group (A, B, O or AB)
- Color of Eyes (blue, green, brown, black)
25- ORDINAL
- Example:
- Response to treatment (poor, fair, good)
- Severity of disease (mild, moderate, severe)
- Income status (low, middle, high)
26- QUANTITATIVE (DISCRETE)
- Example: The no. of family members
- The no. of heart beats
- The no. of admissions in a day
- QUANTITATIVE (CONTINUOUS)
- Example: Height, Weight, Age, BP, Serum
Cholesterol and BMI
27Discrete data: gaps between possible values
(example: Number of Children)
Continuous data: theoretically, no gaps between
possible values
(example: Hb)
28- CONTINUOUS DATA converted to QUALITATIVE DATA
- Wt. (in kg): underweight, normal, overweight
- Ht. (in cm): short, medium, tall
30Scale of measurement
Qualitative variable: A categorical variable
- Nominal (classificatory) scale: gender, marital status, race
- Ordinal (ranking) scale: severity scale, good/better/best
31Scale of measurement
Quantitative variable: A numerical variable
(discrete or continuous)
- Interval scale: Data are placed in meaningful
intervals and order. The unit of measurement is
arbitrary.
- Temperature: the differences 37º C - 36º C and
38º C - 37º C are equal, but there is no
implication of ratio (30º C is not twice as hot
as 15º C)
32- Ratio scale
- Data are presented in a frequency distribution in
logical order. A meaningful ratio exists.
- Age, weight, height, pulse rate
- A pulse rate of 120 is twice as fast as 60
- A person with a weight of 80 kg is twice as heavy
as one with a weight of 40 kg.
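The difference between the interval and ratio scales can be checked with a few lines of arithmetic. This is only an illustrative sketch: the helper `celsius_to_kelvin` is introduced here for the example, and it uses the fact that the Kelvin scale has a true zero while the Celsius zero is arbitrary.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin (a scale with a true zero)."""
    return celsius + 273.15


# Interval scale: differences are meaningful and comparable.
diff_low = 37 - 36
diff_high = 38 - 37
print("Equal differences:", diff_low == diff_high)

# The ratio of Celsius readings is not meaningful, because 0 ºC is arbitrary.
ratio_celsius = 30 / 15
ratio_kelvin = celsius_to_kelvin(30) / celsius_to_kelvin(15)
print(f"Celsius ratio {ratio_celsius:.2f} vs Kelvin ratio {ratio_kelvin:.3f}")

# Ratio scale: pulse rate has a true zero, so the ratio is meaningful.
ratio_pulse = 120 / 60
print("Pulse ratio:", ratio_pulse)
```

The Celsius readings give a ratio of 2, but on the Kelvin scale the same two temperatures differ by only about 5%, so 30º C is not "twice as hot" as 15º C.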
33Scales of Measure
- Nominal: qualitative classification of equal
value (gender, race, color, city)
- Ordinal: qualitative classification which can
be rank ordered (socioeconomic status of
families)
- Interval: numerical or quantitative data which can
be rank ordered and sizes compared (temperature)
- Ratio: quantitative interval data along with
a meaningful ratio (time, age)
34CLINIMETRICS
- A science called clinimetrics in which qualities
are converted to meaningful quantities by using
a scoring system.
- Examples: (1) Apgar score, based on appearance,
pulse, grimace, activity and respiration, is used
for neonatal prognosis.
- (2) Smoking Index: no. of cigarettes, duration,
filter or not, whether pipe, cigar etc.
- (3) APACHE (Acute Physiology and Chronic Health
Evaluation) score to quantify the severity of
condition of a patient
38INVESTIGATION
39 Frequency Distributions
- data distribution: pattern of variability
- the center of a distribution
- the ranges
- the shapes
- simple frequency distributions
- grouped frequency distributions
- midpoint
40Tabulate the hemoglobin values of 30 adult male
patients listed below
Patient No Hb (g/dl) Patient No Hb (g/dl) Patient No Hb (g/dl)
1 12.0 11 11.2 21 14.9
2 11.9 12 13.6 22 12.2
3 11.5 13 10.8 23 12.2
4 14.2 14 12.3 24 11.4
5 12.3 15 12.3 25 10.7
6 13.0 16 15.7 26 12.5
7 10.5 17 12.6 27 11.8
8 12.8 18 9.1 28 15.1
9 13.2 19 12.9 29 13.4
10 11.2 20 14.6 30 13.1
41Steps for making a table
- Step 1: Find Minimum (9.1) and Maximum (15.7)
- Step 2: Calculate the difference: 15.7 - 9.1 = 6.6
- Step 3: Decide the number and width of
the classes (7 class intervals): 9.0-9.9,
10.0-10.9, ...
- Step 4: Prepare dummy table:
Hb (g/dl), Tally mark, No. of patients
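The four steps above can be sketched in Python using the 30 hemoglobin values listed earlier. This is one possible way to tabulate, assuming classes of width 1.0 g/dl starting at 9.0 as in Step 3; the variable names are illustrative.

```python
import math

# Hb values (g/dl) of the 30 adult male patients, in patient order.
hb = [12.0, 11.9, 11.5, 14.2, 12.3, 13.0, 10.5, 12.8, 13.2, 11.2,
      11.2, 13.6, 10.8, 12.3, 12.3, 15.7, 12.6, 9.1, 12.9, 14.6,
      14.9, 12.2, 12.2, 11.4, 10.7, 12.5, 11.8, 15.1, 13.4, 13.1]

# Step 1: minimum and maximum.
lowest, highest = min(hb), max(hb)

# Step 2: difference (rounded to avoid floating-point noise).
value_range = round(highest - lowest, 1)

# Step 3: classes of width 1.0 starting at the whole number below the minimum.
start = math.floor(lowest)
n_classes = math.floor(highest) - start + 1

# Step 4: fill the table; with width 1.0 the class index is floor(value) - start.
counts = [0] * n_classes
for value in hb:
    counts[math.floor(value) - start] += 1

print(f"Min {lowest}, Max {highest}, Range {value_range}, Classes {n_classes}")
print("Hb (g/dl)    Tally        No. of patients")
for i, count in enumerate(counts):
    label = f"{start + i}.0-{start + i}.9"
    print(f"{label:<12} {'|' * count:<12} {count}")
print(f"{'Total':<25} {sum(counts)}")
```

Checking that the class counts add up to the number of patients (30) is a quick way to catch a value that was missed or tallied twice.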
42 DUMMY TABLE
Tally Marks TABLE
43Table Frequency distribution of 30 adult male
patients by Hb
44Table Frequency distribution of adult patients
by Hb and gender
45Elements of a Table
An ideal table should have: Number, Title, Column headings, Foot-notes
- Number: Table number for identification in a report
- Title, place: Describe the body of the table,
variables, time period (what, how
classified, where and when)
- Column heading: Variable name, No., Percentages (%),
etc.
- Foot-note(s): to describe some
column/row headings, special cells,
source, etc.
46Table II. Distribution of 120 (Madras)
Corporation divisions according to annual death
rate based on registered deaths in 1975 and 1976
Figures in parentheses indicate percentages
47DIAGRAMS/GRAPHS
- Discrete data
- --- Bar charts (one or two groups)
- Continuous data
- --- Histogram
- --- Frequency polygon (curve)
- --- Stem-and-leaf plot
- --- Box-and-whisker plot
48Example data
68 63 42 27 30 36 28 32 79 27 22 28 24 25 44 65
43 25 74 51 36 42 28 31 28 25 45 12 57 51 12 32
49 38 42 27 31 50 38 21 16 24 64 47 23 22 43
27 49 28 23 19 11 52 46 31 30 43 49 12
49Histogram
Figure 1 Histogram of ages of 60 subjects
50Polygon
51Example data
68 63 42 27 30 36 28 32 79 27 22 28 24 25 44 65
43 25 74 51 36 42 28 31 28 25 45 12 57 51 12 32
49 38 42 27 31 50 38 21 16 24 64 47 23 22 43
27 49 28 23 19 11 52 46 31 30 43 49 12
52Stem and leaf plot
Stem-and-leaf of Age   N = 60
Leaf Unit = 1.0
   6   1  122269
  19   2  1223344555777788888
 (11)  3  00111226688
  13   4  2223334567999
   5   5  01127
   4   6  3458
   2   7  49
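A stem-and-leaf plot like the one above can be built with a short sketch in Python: the tens digit is the stem and the units digit is the leaf (leaf unit 1.0). The list assumes the value printed as "3 2" across a line break in the example data is the single age 32, which is consistent with N = 60 and with the leaves shown for stem 3. The function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

# The 60 ages from the example data.
ages = [68, 63, 42, 27, 30, 36, 28, 32, 79, 27, 22, 28, 24, 25, 44, 65,
        43, 25, 74, 51, 36, 42, 28, 31, 28, 25, 45, 12, 57, 51, 12, 32,
        49, 38, 42, 27, 31, 50, 38, 21, 16, 24, 64, 47, 23, 22, 43,
        27, 49, 28, 23, 19, 11, 52, 46, 31, 30, 43, 49, 12]


def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted leaves (units digits)."""
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        rows[stem].append(leaf)
    return {stem: "".join(str(leaf) for leaf in leaves)
            for stem, leaves in sorted(rows.items())}


plot = stem_and_leaf(ages)
print(f"Stem-and-leaf of Age   N = {len(ages)}")
for stem, leaves in plot.items():
    # Count of observations in the row, then stem, then leaves.
    print(f"{len(leaves):>4}  {stem}  {leaves}")
```

Unlike a histogram, the plot keeps every original value readable while still showing the shape of the distribution.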
53Box plot
54Descriptive statistics report: Boxplot
- minimum score
- maximum score
- lower quartile
- upper quartile
- median
- mean
- the skew of the distribution:
positive skew: mean > median, high-score whisker is
longer
negative skew: mean < median, low-score whisker is
longer
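The quantities a box plot reports can be computed for the 60 ages of the example data with Python's standard `statistics` module. This is a sketch: quartile conventions differ between textbooks and software, so the quartiles from `statistics.quantiles` (default "exclusive" method) may differ slightly from those of another package. The value split as "3 2" in the data listing is assumed to be 32.

```python
import statistics

# The 60 ages from the example data.
ages = [68, 63, 42, 27, 30, 36, 28, 32, 79, 27, 22, 28, 24, 25, 44, 65,
        43, 25, 74, 51, 36, 42, 28, 31, 28, 25, 45, 12, 57, 51, 12, 32,
        49, 38, 42, 27, 31, 50, 38, 21, 16, 24, 64, 47, 23, 22, 43,
        27, 49, 28, 23, 19, 11, 52, 46, 31, 30, 43, 49, 12]

minimum, maximum = min(ages), max(ages)
q1, median, q3 = statistics.quantiles(ages, n=4)  # lower quartile, median, upper quartile
mean = statistics.mean(ages)

# Rule of thumb from the slide: compare the mean with the median.
if mean > median:
    skew = "positive"
elif mean < median:
    skew = "negative"
else:
    skew = "none"

print(f"Min {minimum}, Q1 {q1}, Median {median}, Q3 {q3}, Max {maximum}")
print(f"Mean {mean:.2f}, skew: {skew}")
```

For these ages the mean exceeds the median, which matches the longer upper tail visible in the stem-and-leaf plot.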
55Pie Chart
- Circular diagram, total 100%
- Divided into segments, each representing a
category
- Decide adjacent category
- The size of each slice of the pie is proportional
to the amount for its category
The prevalence of different degrees of
Hypertension in the population
56Bar Graphs
Height of the bar indicates frequency
Frequency on the Y axis and categories of the variable on the X
axis
The bars should be of equal width and should not
touch the other bars
The distribution of risk factors among cases with
Cardiovascular Diseases
57HIV cases enrollment in USA by gender
Bar chart
58HIV cases enrollment in USA by gender
Stacked bar chart
59Graphic Presentation of Data
the frequency polygon (quantitative data)
the histogram (quantitative data)
the bar graph (qualitative data)
61General rules for designing graphs
- A graph should have a self-explanatory legend
- A graph should help the reader to understand the data
- Axes labeled, units of measurement indicated
- Scales are important. Start with zero (otherwise use a //
break)
- Avoid graphs with a three-dimensional impression;
they may be misleading (the reader visualizes them less easily)