Title: Introductory Biostatistics
1Introductory Biostatistics
- Runhua Shi MD, MPH, PhD
- Associate Professor of Medicine and Feist-Weiller
Cancer Center - rshi_at_lsuhsc.edu
- 813-1434
- Office FWCC B-444
2Review
[Figure: two distributions of Age (axis ticks at 18, 30 and 65), each with n = 13; the first has S1 = 6 and the second has S2 = 12.]
3Arithmetic Mean
The arithmetic mean is the sum of all the observations divided by the number of observations: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
4Measures of Spread/Dispersion
Sample Variance: A measure of the difference of each value from the mean value: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
Sample Standard Deviation: A measure of dispersion expressed in terms of the original units: $s = \sqrt{s^2}$
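The definitions of the mean, sample variance and sample standard deviation can be checked numerically. The sketch below assumes the usual n - 1 divisor for the sample variance; the list `ages` is illustrative data chosen for this example, not a data set from the lecture.

```python
import math
import statistics

# Illustrative ages in years (hypothetical data for this sketch).
ages = [23, 23, 24, 28, 30, 40, 43, 44, 48]

n = len(ages)
mean = sum(ages) / n  # arithmetic mean

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in ages) / (n - 1)

# Standard deviation: square root of the variance, in the original units.
sd = math.sqrt(variance)

print(f"mean = {mean:.2f}, variance = {variance:.2f}, SD = {sd:.2f}")

# The standard library should agree with the hand-written formulas.
assert math.isclose(variance, statistics.variance(ages))
assert math.isclose(sd, statistics.stdev(ages))
```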
6Median
The sample median is:
- if n is odd, the (n+1)/2 th largest observation
- if n is even, the average of the n/2 th and (n/2 + 1) th largest observations
Example: Arrange the values in an ordered array, and choose the middle value(s).
- 23, 23, 24, 28, 30, 40, 43, 44, 48: n = 9, so the median is the (9+1)/2 = 10/2 = 5th observation; median = 30
- 23, 23, 24, 28, 30, 38, 40, 43, 44, 48: n = 10, so the median is the average of the 10/2 = 5th and 10/2 + 1 = 6th observations: (30 + 38)/2 = 34
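The odd/even rule for the median can be sketched in a few lines of Python. The function name `sample_median` is a hypothetical helper introduced for this illustration; the two samples are the ones from the example on this slide.

```python
def sample_median(values):
    """Return the sample median using the odd/even rule."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    mid = n // 2
    if n % 2 == 1:
        # n odd: the (n + 1)/2 th observation (index mid when counting from 0)
        return ordered[mid]
    # n even: average of the n/2 th and (n/2 + 1) th observations
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_sample = [23, 23, 24, 28, 30, 40, 43, 44, 48]
even_sample = [23, 23, 24, 28, 30, 38, 40, 43, 44, 48]

print(sample_median(odd_sample))   # 30
print(sample_median(even_sample))  # 34.0
```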
7Geometric mean
8Chapter 3. Concepts of Basic Probability
- Definition of Probability
- Conditional Probability
- Screening Tests-Bayes Rule
- ROC curves
- Prevalence and Incidence
9Sensitivity: the probability of a positive test given that you have the disease.
sensitivity = $P(T^+ \mid D^+) = a/(a+c)$
Specificity: the probability of a negative test given that you do not have the disease.
specificity = $P(T^- \mid D^-) = d/(b+d)$
10sensitivity = $P(T^+ \mid D^+) = a/(a+c)$
specificity = $P(T^- \mid D^-) = d/(b+d)$
11What is PV+ in our example?
PV+ = $P(D^+ \mid T^+) = a/(a+b) = 132/1115 = 0.11838$, or 11.8%
Interpretation: 11.8% of the women who screened positive actually had cancer.
What is PV- in our example?
PV- = $P(D^- \mid T^-) = d/(c+d) = 63650/63695 = 0.99929$, or 99.9%
Interpretation: almost all of the women who screened negative really were disease free.
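The four screening measures can be computed together from a 2x2 table. This is a minimal sketch, assuming the cell labels implied by the formulas (a = test positive with disease, b = test positive without disease, c = test negative with disease, d = test negative without disease). The counts b and c are not quoted in the slides; they are derived here from the totals 1115 and 63695. The function name `screening_measures` is hypothetical.

```python
def screening_measures(a, b, c, d):
    """Screening measures from a 2x2 table.

    a = T+ and D+, b = T+ and D-, c = T- and D+, d = T- and D-.
    """
    return {
        "sensitivity": a / (a + c),   # P(T+ | D+)
        "specificity": d / (b + d),   # P(T- | D-)
        "pv_pos": a / (a + b),        # P(D+ | T+)
        "pv_neg": d / (c + d),        # P(D- | T-)
    }


a, d = 132, 63650
b = 1115 - a    # derived: all positive tests minus true positives
c = 63695 - d   # derived: all negative tests minus true negatives

measures = screening_measures(a, b, c, d)
for name, value in measures.items():
    print(f"{name}: {value:.5f}")
```

The predictive values printed here should match the 11.8% and 99.9% quoted above; the sensitivity and specificity follow from the derived counts and are only as reliable as that derivation.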
12Random Variable: A variable whose values arise as a result of chance factors and cannot be exactly predicted in advance.
Discrete Random Variable: A random variable that is characterized by gaps or interruptions in the values that it can assume (yes/no; positive/negative; racial categories).
Continuous Random Variable: A random variable that does not possess interruptions or gaps (age, concentration, BP).
13Chapter 4 Discrete Probability
- Definition of Random Variables.
- Mass function for a discrete random variable
- The expected value of a discrete random variable
- The variance of a discrete random variable
- Binomial Distribution
- Poisson Distribution
14Random Variable
- A random variable (RV) is a numeric function that assigns probabilities to different events in a sample space.
- An RV for which there exists a discrete set of values with specified probabilities is a discrete RV (e.g. the +/- result of a test).
- An RV whose possible values cannot be enumerated is a continuous RV (e.g. cumulative exposure to smoking over a lifetime).
17A discrete RV
18Cumulative distribution function of a Discrete RV
- The cumulative distribution function (cdf) of a random variable X is denoted by F(X) and, for a specific value x of X, is denoted by P(X ≤ x) or by F(x). For a discrete RV the cdf is a step function.
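19Permutation
- The number of permutations of n things taken k at a time is $_nP_k = n(n-1)\cdots(n-k+1) = \frac{n!}{(n-k)!}$
- Example: A, B, C taken 2 at a time give AB, AC, BC, BA, CA, CB, so $_3P_2 = 6$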
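20Combination
- The number of combinations of n things taken k at a time is $_nC_k = \binom{n}{k} = \frac{n!}{k!\,(n-k)!}$
- Example: A, B, C taken 2 at a time give AB, AC, BC; the reversed orderings BA, CA, CB are not counted separately, so $_3C_2 = 3$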
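The difference between permutations (order matters) and combinations (order ignored) can be seen by enumerating both for the letters A, B, C taken 2 at a time. This sketch uses Python's standard `itertools` and `math` modules (`math.perm` and `math.comb` need Python 3.8 or later); the variable names are illustrative.

```python
import math
from itertools import combinations, permutations

items = ["A", "B", "C"]
k = 2

# Ordered selections: AB and BA count as different.
ordered = ["".join(p) for p in permutations(items, k)]
# Unordered selections: AB and BA count as the same.
unordered = ["".join(c) for c in combinations(items, k)]

print(ordered)    # ['AB', 'AC', 'BA', 'BC', 'CA', 'CB']
print(unordered)  # ['AB', 'AC', 'BC']

# The counting formulas agree with the enumeration.
n = len(items)
n_perm = math.factorial(n) // math.factorial(n - k)   # n! / (n-k)!
n_comb = n_perm // math.factorial(k)                  # n! / (k! (n-k)!)
assert n_perm == len(ordered) == math.perm(n, k)
assert n_comb == len(unordered) == math.comb(n, k)
```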
21Properties of The Binomial Distribution
- A sample of n independent trials, each of which has only two possible outcomes, denoted as success or failure.
- Furthermore, the probability of a success at each trial is assumed to be some constant p, and hence the probability of a failure at each trial is 1 - p = q.
- Example: Flip a coin 100 times. On each flip the 2 possible outcomes are head or tail, the probability of a head is ½, and the probability of a tail is 1 - ½ = ½; here n = 100, p = ½, q = ½.
22The Binomial Distribution
- The distribution of the number of successes (k = 0, 1, 2, ..., n) in n statistically independent trials, where the probability of success on each trial is p, is known as the binomial distribution and has a probability-mass function (pmf) given by $P(X = k) = \binom{n}{k} p^k q^{n-k}$, where q = 1 - p.
23example
- What is the probability of obtaining 2 boys out of 5 children if the probability of a boy is 0.51 at each birth and the sexes of successive children are considered independent random variables?
- n = 5, k = 2, p = 0.51, q = 1 - p = 0.49
- $P(X = 2) = \binom{5}{2}(0.51)^2(0.49)^3 = 0.306$
24example
- What is the probability of obtaining at most 2 boys out of 5 children if the probability of a boy is 0.51 at each birth and the sexes of successive children are considered independent random variables?
- n = 5, k = 0, 1, 2, p = 0.51, q = 1 - p = 0.49
- P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.028 + 0.147 + 0.306 = 0.481
- P(X ≤ 2) = 0.481
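The two boys-out-of-five calculations can be reproduced with a short script. This is a sketch that assumes the standard binomial pmf, C(n, k) p^k (1-p)^(n-k); the function names `binomial_pmf` and `binomial_cdf` are hypothetical helpers, not part of any library.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with parameters n and p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(k, n, p):
    """P(X <= k): sum of the pmf over 0, 1, ..., k."""
    return sum(binomial_pmf(j, n, p) for j in range(k + 1))


n, p = 5, 0.51
p_exactly_2 = binomial_pmf(2, n, p)
p_at_most_2 = binomial_cdf(2, n, p)

print(f"P(X = 2)  = {p_exactly_2:.3f}")  # 0.306
print(f"P(X <= 2) = {p_at_most_2:.3f}")  # 0.481
```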
25Get probability from excel
- In Excel: BINOMDIST(k, n, p, FALSE)
- FALSE gives the probability mass function, P(X = k)
- TRUE gives the cumulative distribution function, P(X ≤ k)
27The Binomial Parameters (expected value and variance)
The binomial distribution has 2 parameters, n and p. The binomial distribution is a group of distributions, with each possible value of n and p designating a different member of the group. The measures of central tendency (mean) and dispersion (variance, SD) for the binomial distribution are
$E(X) = np$, $Var(X) = npq$, $SD(X) = \sqrt{npq}$, where q = 1 - p.
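The standard results for a binomial random variable, mean np and variance np(1 - p), can be checked against the definitions of expected value and variance by summing over the pmf. The sketch below uses the coin-flip setting from the earlier slide (n = 100, p = ½); the variable names are illustrative.

```python
from math import comb, sqrt

n, p = 100, 0.5
q = 1 - p

# Binomial pmf for k = 0, 1, ..., n.
pmf = [comb(n, k) * p**k * q ** (n - k) for k in range(n + 1)]

# Expected value and variance computed directly from their definitions.
mean_from_pmf = sum(k * pk for k, pk in enumerate(pmf))
var_from_pmf = sum((k - mean_from_pmf) ** 2 * pk for k, pk in enumerate(pmf))

print(f"mean: {mean_from_pmf:.4f} (np = {n * p})")
print(f"variance: {var_from_pmf:.4f} (npq = {n * p * q})")
print(f"SD: {sqrt(var_from_pmf):.4f} (sqrt(npq) = {sqrt(n * p * q)})")
```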
28The Poisson Distribution: rare events
- The probability of k events occurring in a time period t for a Poisson random variable with parameter λ is $P(X = k) = \frac{e^{-\mu}\mu^k}{k!}$, k = 0, 1, 2, ..., where μ = λt.
- e is approximately 2.71828.
- λ represents the expected number of events per unit time (the rate); μ represents the expected number of events over the time period t, μ = λt.
- Used for rare events, e.g. cancer counts by age group and race.
29example
- If A = 100 cm² and λ = 0.02 colonies per cm², calculate the probability distribution of the number of bacterial colonies.
- Assuming that the probability of finding 1 colony in an area of size ΔA at any point on an agar plate is λΔA for some λ, and that the numbers of bacterial colonies found at 2 different points of the plate are independent random variables, the probability of finding k bacterial colonies in an area of size A is $P(X = k) = \frac{e^{-\mu}\mu^k}{k!}$, where μ = λA.
- μ = λA = 0.02(100) = 2
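The probability distribution asked for in this example can be tabulated directly. This is a sketch assuming the standard Poisson pmf, e^(-μ) μ^k / k!, with μ = λA = 2; the function name `poisson_pmf` is a hypothetical helper.

```python
from math import exp, factorial


def poisson_pmf(k, mu):
    """P(X = k) for a Poisson random variable with mean mu."""
    return exp(-mu) * mu**k / factorial(k)


lam = 0.02   # colonies per cm^2
area = 100   # cm^2
mu = lam * area  # expected number of colonies on the plate

# Tabulate the first few probabilities; the remaining tail is small.
for k in range(6):
    print(f"P(X = {k}) = {poisson_pmf(k, mu):.3f}")
```

With μ = 2 the probabilities of 1 and 2 colonies are equal, which is a quick sanity check on the output.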
31Expected value and variance of Poisson Distribution
For a Poisson distribution with parameter μ, the mean and variance are both equal to μ.
Poisson approximation to the binomial distribution
The binomial distribution with large n and small p can be accurately approximated by a Poisson distribution with parameter μ = np. When n is very large, it is hard to compute $\binom{n}{k}$ and $(1-p)^{n-k}$ directly.
32example
- Suppose we are interested in the genetic susceptibility to breast cancer. We find that 4 out of 1000 women aged 40-49 whose mothers have had breast cancer also develop breast cancer (BCa) over the next 10 years of life. We would expect from large population studies that 1 in 1000 women of this age group will develop a new case of the disease over this period of time.
- How unusual is this event?
33Exact binomial distribution
- n = 1000, k = 4, p = 0.001
- P(X ≥ 4) = 1 - P(X ≤ 3) = 0.0189
- Using the Poisson distribution, μ = 1000(0.001) = 1
- k = 0, 1, 2, 3
- P(X ≥ 4) = 1 - P(X ≤ 3) = 0.0190
- This event is indeed unusual and suggests a genetic susceptibility to BCa among daughters of women who have had BCa (P < 0.05).
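The exact binomial tail and its Poisson approximation can be compared side by side. The sketch below assumes the standard binomial and Poisson pmfs and computes P(X ≥ 4) as 1 - P(X ≤ 3) in both cases; the variable names are illustrative.

```python
from math import comb, exp, factorial

n, p = 1000, 0.001
mu = n * p  # Poisson parameter used for the approximation

# Exact binomial: 1 - [P(X=0) + P(X=1) + P(X=2) + P(X=3)]
binom_tail = 1 - sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(4))

# Poisson approximation with the same mean
poisson_tail = 1 - sum(exp(-mu) * mu**k / factorial(k) for k in range(4))

print(f"binomial P(X >= 4) = {binom_tail:.4f}")    # 0.0189
print(f"Poisson  P(X >= 4) = {poisson_tail:.4f}")  # 0.0190
```

Python's arbitrary-precision integers make the exact binomial easy here; the approximation matters more when working by hand or with tools that overflow on large binomial coefficients.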
34A drug company designs a test of the potency of a bird flu vaccine on 2000 children. Previous studies have shown that one dose of the bird flu vaccine causes a side effect (i.e. fever) 0.2% of the time within the first 48 hours. What is the probability that no child will experience a side effect in the first 48 hours?
Using the Poisson distribution, μ = 2000(0.002) = 4, e ≈ 2.718, k = 0: $P(X = 0) = \frac{e^{-4}\,4^0}{0!} = e^{-4} \approx 0.018$
The probability that no child will experience a side effect in the first 48 hours is 1.8%.
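As a check on this answer, the Poisson value for zero events can be set beside the exact binomial value (1 - p)^n. This is a small sketch under the same assumptions as the slide (n = 2000 children, p = 0.002 per child, independent outcomes); the variable names are illustrative.

```python
from math import exp

n, p = 2000, 0.002
mu = n * p  # expected number of children with a side effect

poisson_p0 = exp(-mu)        # Poisson: e^(-mu) * mu^0 / 0!
binomial_p0 = (1 - p) ** n   # exact binomial: every child is unaffected

print(f"Poisson  P(X = 0) = {poisson_p0:.4f}")   # 0.0183
print(f"binomial P(X = 0) = {binomial_p0:.4f}")  # 0.0182
```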
35Binomial or Poisson distribution
- If the event is rare (e.g. the rate is < 2%) and the sample size is large: Poisson
- If the event is not rare (e.g. the rate is greater than 2%) and the sample size is not very large (e.g. n < 100): Binomial
37Get probability from excel
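- In Excel: POISSON(x, m, FALSE), where x is the number of events and m is the expected number of events (the mean)
- FALSE gives the probability mass function, P(X = x)
- TRUE gives the cumulative distribution function, P(X ≤ x)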
38Review
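- The probability of obtaining k successes out of n events, where each individual event has success probability p, is given by the binomial distribution: $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$
- The probability of finding k bacterial colonies in an area of size A is given by the Poisson distribution: $P(X = k) = \frac{e^{-\mu}\mu^k}{k!}$, where μ = λA (A can also be time t)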
39Review
- Definition of Random Variables.
- Mass function for a discrete random variable
- The expected value of a discrete random variable
- The variance of a discrete random variable
- Binomial Distribution
- Poisson Distribution (μ = λt; t can be time or area)
40Homework-1
- An experiment is designed to test the potency of a drug on 20 rats. Previous studies have shown that a 10-mg dose of the drug is lethal 5% of the time within the first 4 hours.
- What is the probability that 0 rats will die in the first 4 hours?
- What is the probability that 1 rat will die in the first 4 hours?
- What is the probability that 2 rats will die in the first 4 hours?
- What is the probability that 3 or more rats will die in the first 4 hours?
41Homework-2
- An experiment is designed to test the potency of a drug on 20 rats. Previous studies have shown that a 20-mg dose of the drug is lethal 10% of the time within the first 8 hours.
- What is the probability that 0 rats will die in the first 8 hours?
- What is the probability that 1 rat will die in the first 8 hours?
- What is the probability that 2 rats will die in the first 8 hours?
- What is the probability that 3 or more rats will die in the first 8 hours?
42Homework-3
- A drug company designs a test of the potency of a bird flu vaccine on 2000 children. Previous studies have shown that one dose of the bird flu vaccine causes a side effect (i.e. fever) 0.2% of the time within the first 48 hours.
- What is the probability that 0 children will experience a side effect in the first 48 hours?
- What is the probability that 1 child will experience a side effect in the first 48 hours?
- What is the probability that 2 children will experience a side effect in the first 48 hours?
- What is the probability that 3 or more children will experience a side effect in the first 48 hours?