Module 3: Characterizing Variability presentation

1
Module 3: Characterizing Variability

Probability

2
Outline

the probability framework
random variables
probability distributions and densities
expected values

3
Probability Framework

How would you design a framework to account for uncertainty?
values that can occur
groups of outcomes that can occur
frequency of occurrence of values and of groups

4
Probability Framework

Experiment
situation leading to a value
could be an actual experiment, or a circumstance
in which values are observed
e.g., gold recovery in a lab experiment
e.g., atmospheric concentration of phenol
e.g., product consumer preferences
an experiment has outcomes
Sample Space
the space of all possible outcomes from an
experiment
e.g., $S = \{\text{appealing}, \text{acceptable}, \text{unappealing}\}$
e.g., temperature: the real line
denoted by S

5
Probability Framework

Events
are collections of outcomes
a single outcome is also an event
e.g., $E = \{\text{chip is defective}\}$
e.g., $E = \{\text{car finish is metallic}, \text{car finish is matte}\}$
an event is said to have occurred if at least one
of the outcomes in the event has occurred
events are sets of outcomes, and we can talk
about intersection and union of events
e.g., for $E_1 = \{\text{car is red}, \text{car is green}, \text{car is blue}\}$ and $E_2 = \{\text{car is not red}\}$, $E_1 \cap E_2 = \{\text{car is green}, \text{car is blue}\}$

6
The situation being considered...

In considering an experiment with outcomes and events, we are trying to describe a physical system.
We will gain information about this system through observations.
The situation being considered is called a ...

7
Population

Broad definition -
all possible items or units possessing one or
more common characteristics under specified
experimental or observational conditions (Mason,
Gunst and Hess)
in other words, all possible outcomes from a
well-specified system
e.g., values from a process, where a process is a series of repeatable actions resulting in observable characteristics
See also Devore, page 3 and page 7
In identifying a population, we must be clear
about what is being considered.

8
Set Operations

Since events are subsets, we can use standard set
operations
union
union of two events is the set of outcomes
occurring in either event
intersection
intersection of two events is the set of outcomes
that occur in both events
complement
the complement of an event E is the set of outcomes in the sample space that are not in E; notation: $\bar{E}$ (or $E'$)

9
Visualizing Events

Since events are subsets, and the sample space is
a large set, we can use Venn diagrams to
visualize events

[Figure: Venn diagram of the sample space S containing two overlapping events E1 and E2; the overlap is labeled $E_1 \cap E_2$]
10
Mutually Exclusive Events

Two events are mutually exclusive if $E_1 \cap E_2 = \emptyset$,
i.e., both events can't occur together

11
Examples

Temperature in a reactor
sample space: $S = (-\infty, \infty)$
event E1 - temperature below 350 C: $E_1 = \{T < 350\}$
event E2 - temperature above 300 C: $E_2 = \{T > 300\}$
$E_1 \cap E_2 = \{300 < T < 350\}$
$E_1 \cup E_2 = (-\infty, \infty)$

Continuous Case
12
Examples

defects in samples of 5 from a chip foundry (n = not defective, d = defective)
sample space: $S = \{nnnnn, dnnnn, ndnnn, nndnn, nnndn, nnnnd, ddnnn, nddnn, \ldots, ddddd\}$
event E1 - one of the first two chips in the sample is defective and the rest are not: $E_1 = \{dnnnn, ndnnn\}$
event E2 - at most one chip is defective: $E_2 = \{dnnnn, ndnnn, nndnn, nnndn, nnnnd, nnnnn\}$
$E_1 \cap E_2 = \{dnnnn, ndnnn\}$
event E3 - no defective chips: $E_3 = \{nnnnn\}$
$E_3 \cap E_1 = \emptyset$ (mutually exclusive)

Discrete Case
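
The chip-foundry events above can be checked by enumeration. The following is a minimal Python sketch, not part of the original slides; the variable names are illustrative, and the letters n and d are assumed to mean "not defective" and "defective".

```python
from itertools import product

# Sample space: every string of 5 chips, each 'n' (not defective) or 'd' (defective).
S = {"".join(chips) for chips in product("nd", repeat=5)}

# E1: exactly one defective chip, and it is one of the first two chips.
E1 = {s for s in S if s.count("d") == 1 and "d" in s[:2]}
# E2: at most one defective chip.
E2 = {s for s in S if s.count("d") <= 1}
# E3: no defective chips.
E3 = {s for s in S if s.count("d") == 0}

# Events are sets, so intersection is the built-in & operator.
E1_and_E2 = E1 & E2
mutually_exclusive = (E3 & E1) == set()

print(len(S), sorted(E1_and_E2), mutually_exclusive)
```

Representing events as Python sets keeps the code close to the slide notation: `&` is intersection, `|` is union, and `S - E` is the complement of E.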
13
Probability Framework

Probability
provides a measure on the space of all possible
outcomes
indicates relative frequency, or likelihood, of a
certain event occurring
must obey a few rules to be consistent
Axioms of Probability

14
Axioms of Probability

required for consistency
$P(S) = 1$: the probability that something happens is always one (something always happens!)
$0 \le P(E) \le 1$ for any event $E$: probability provides a relative frequency of occurrence, a fractional value that should lie between 0 and 1
if $E_1$ and $E_2$ are mutually exclusive, $P(E_1 \cup E_2) = P(E_1) + P(E_2)$
This recognizes that we have to be careful about double counting, which is why the concept of mutually exclusive events is important.

15
Additional Probability Facts

Probability of nothing happening: $P(\emptyset) = 0$
Probability of an event NOT happening: $P(\bar{E}) = 1 - P(E)$
where the overbar denotes complement
alternative symbol: $E'$ (prime)

16
Additional Probability Facts

General case - probability of a union of events
$P(E_1 \cup E_2) = P(E_1) + P(E_2) - P(E_1 \cap E_2)$
Need to avoid double counting when an outcome in both events occurs
Note that if the events are mutually exclusive, their intersection is empty, so $P(E_1 \cap E_2) = 0$ and this term drops from the expression.
17
How can we determine probability functions?

by examining the sample space - how often can/do
values occur?
definition of sample space - enumeration of
values and outcomes
counting rules - permutations/combinations
physical observation
e.g., temperatures appear to occur in a pattern
that follows a normal probability distribution

18
Probability functions for discrete problems

Equally Likely Outcomes
If we have N equally likely outcomes, then $P(\text{outcome}_i) = 1/N$ for each outcome
If we have an event consisting of several outcomes, i.e., $E = \{\text{outcome}_1, \text{outcome}_2, \text{outcome}_3\}$, then $P(E) = 3/N$

19
Probability functions for discrete problems

More generally, if we have an event consisting of individual (equally likely) outcomes, then $P(E) = n(E)/n(S)$,
where n(E) is the number of outcomes in E, and n(S) is the number of outcomes in the sample space S.

20
Multiplication Rule

for counting numbers of possible outcomes.
If we have two operations that are independent, then if the first operation can be performed $n_1$ ways and the second operation can be performed $n_2$ ways, both operations can be performed $n_1 n_2$ ways.

21
Additional Counting Rules

for arrangements of n outcomes
Permutations
choosing r objects from a total of n when order is important: ${}_nP_r = \frac{n!}{(n-r)!}$
Combinations
choosing r objects from a total of n when order is not important: ${}_nC_r = \frac{n!}{r!\,(n-r)!}$

22
Example

Functional groups
suppose we have a set of 6 functional groups
F1, F2, F3, F4, F5, F6
what is the probability of obtaining F1-F2-F3-F4
when we are considering strings of 4 functional
groups?
order IS important here
number of outcomes in the sample space: n(S) = number of ways of choosing strings of 4 from the 6 groups = ${}_6P_4 = 6!/2! = 360$
only one outcome in the event
$P(E) = 1/360$

Important consequences in computational
chemistry.
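
The count in the functional-group example can be reproduced in a few lines of Python. This is an illustrative sketch rather than anything from the original slides; it assumes, as the 6P4 count does, that a group is not repeated within a string, and it relies on `math.perm` (Python 3.8 or later).

```python
import math
from itertools import permutations

groups = ["F1", "F2", "F3", "F4", "F5", "F6"]

# n(S): ordered strings of 4 distinct groups chosen from 6, i.e. 6P4 = 6!/2!.
n_S = math.perm(6, 4)

# Cross-check the formula by listing every ordered string explicitly.
strings = list(permutations(groups, 4))

# The event contains the single string F1-F2-F3-F4.
n_E = sum(1 for s in strings if s == ("F1", "F2", "F3", "F4"))
p_E = n_E / n_S

print(n_S, len(strings), n_E, p_E)
```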
23
Probability and Inter-relationships

between events
Conditional Probability
Independence
Bayes Theorem

24
Conditional Probability

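What is the likelihood of an event E1 occurring, given that event E2 has occurred?
$P(E_1 \mid E_2) = \frac{P(E_1 \cap E_2)}{P(E_2)}$ for $P(E_2) > 0$, where the vertical bar is read as "given"
Validity check - if events E1 and E2 are mutually exclusive, $P(E_1 \cap E_2) = 0$, and $P(E_1 \mid E_2) = 0/P(E_2) = 0$
if event E2 has occurred, event E1 can't occur --> conditional probability is zero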
25
Example

Galvanneal Line
Outcomes with probabilities -
O1: thickness off-spec, fails tape test -- 0.04
O2: thickness acceptable, fails tape test -- 0.10
O3: thickness off-spec, passes tape test -- 0.03
O4: thickness acceptable, passes tape test -- 0.83
Events -
E1: fails tape test
$P(E_1) = P(O_1) + P(O_2) = 0.14$
E2: fails thickness test
$P(E_2) = P(O_1) + P(O_3) = 0.07$

26
Galvanizing Line - Photos
[Photo: steel sheet going through a molten zinc bath]
27
Example

Conditional Probability
what is the probability that, given the zinc thickness is off-spec, the coil fails the tape test?
$E_1 \cap E_2 = \{\text{thickness off-spec, fails tape test}\}$, with probability 0.04
$P(E_1 \mid E_2) = P(E_1 \cap E_2)/P(E_2) = 0.04/0.07 \approx 0.57$
point of discussion - is zinc coating thickness a reliable indicator of tape test failure?
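
The Galvanneal calculation can be sketched in Python as a check on the arithmetic. The outcome probabilities are those given in the example; the dictionary layout and the helper `prob` are illustrative assumptions, not part of the original material.

```python
# Outcome probabilities from the Galvanneal example, keyed by (thickness, tape test).
outcomes = {
    ("off-spec", "fails"): 0.04,     # O1
    ("acceptable", "fails"): 0.10,   # O2
    ("off-spec", "passes"): 0.03,    # O3
    ("acceptable", "passes"): 0.83,  # O4
}

def prob(event):
    """Sum the probabilities of all outcomes for which the predicate `event` is true."""
    return sum(p for outcome, p in outcomes.items() if event(outcome))

p_E1 = prob(lambda o: o[1] == "fails")              # E1: fails tape test
p_E2 = prob(lambda o: o[0] == "off-spec")           # E2: thickness off-spec
p_both = prob(lambda o: o == ("off-spec", "fails")) # E1 and E2 together

# Addition rule for a union, and the conditional probability P(E1 | E2).
p_union = p_E1 + p_E2 - p_both
p_E1_given_E2 = p_both / p_E2

print(round(p_E1, 2), round(p_E2, 2), round(p_union, 2), round(p_E1_given_E2, 2))
```

Off-spec thickness raises the chance of a tape test failure from 0.14 overall to roughly 0.57, which is the quantity the discussion point asks about.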

28
Independent Events

Two events are independent if $P(E_1 \cap E_2) = P(E_1)\,P(E_2)$
intuitive interpretation
likelihood of one event occurring is not influenced by whether the other event has occurred
likelihood of both events occurring together is simply the product of the likelihood of each one occurring
Validity check - conditional probability for two independent events:
$P(E_1 \mid E_2) = \frac{P(E_1 \cap E_2)}{P(E_2)} = \frac{P(E_1)\,P(E_2)}{P(E_2)} = P(E_1)$
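
As an illustration (not from the original slides), the product test for independence can be applied to the two Galvanneal events from the earlier example. The sketch below only restates the probabilities given there.

```python
import math

# Probabilities from the Galvanneal example.
p_E1 = 0.14    # fails tape test
p_E2 = 0.07    # thickness off-spec
p_both = 0.04  # thickness off-spec and fails tape test

# Independence would require P(E1 and E2) to equal P(E1) * P(E2).
product_of_marginals = p_E1 * p_E2
independent = math.isclose(p_both, product_of_marginals, rel_tol=1e-9)

print(product_of_marginals, independent)
```

The product of the marginals is about 0.0098, well below the joint probability of 0.04, so under these numbers the two events are not independent.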

29
Bayes Theorem

useful for situations in which we have incomplete
probability knowledge
forms basis for statistical estimation
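suppose we have two events, A and B
from conditional probability, $P(A \mid B)\,P(B) = P(A \cap B) = P(B \mid A)\,P(A)$
so for $P(B) > 0$, $P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$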

30
Bayes Theorem

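we can generalize this to the case where we have some event B, and a range of mutually exclusive events $E_1, \ldots, E_n$ that cover the sample space (an exhaustive set of events)
now $P(B) = \sum_{j=1}^{n} P(B \mid E_j)\,P(E_j)$, so for $P(B) > 0$,
$P(E_i \mid B) = \frac{P(B \mid E_i)\,P(E_i)}{\sum_{j=1}^{n} P(B \mid E_j)\,P(E_j)}$
in this case, we have obtained P(B) from knowledge of how B occurs with the other events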

31
Bayes Theorem - Example

Drug Testing
Drug testing - reliability of analytical
procedure
Events - T -- positive test reading, D -- drug user
probability of true positive is 0.99 (correctly detects usage when individual is a drug user) -- $P(T \mid D) = 0.99$
probability of true negative is 0.94 (correctly detects non-usage when individual is not a drug user) -- $P(T' \mid D') = 0.94$
suppose that 5% of the population are drug users -- $P(D) = 0.05$
if a positive reading is obtained, what is the probability that the individual is in fact a drug user? -- $P(D \mid T)$

The prime denotes complement.
32
Bayes Theorem - Example

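From Bayes Theorem, $P(D \mid T) = \frac{P(T \mid D)\,P(D)}{P(T \mid D)\,P(D) + P(T \mid D')\,P(D')}$
given: $P(T \mid D) = 0.99$, $P(T' \mid D') = 0.94$, $P(D) = 0.05$
from sum to unity for probabilities,
$P(D') = 1 - P(D) = 1 - 0.05 = 0.95$
$P(T \mid D') = 1 - P(T' \mid D') = 1 - 0.94 = 0.06$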

33
Bayes Theorem - Example

putting it all together,
$P(D \mid T) = \frac{(0.99)(0.05)}{(0.99)(0.05) + (0.06)(0.95)} = \frac{0.0495}{0.1065} \approx 0.46$
with a positive detection rate of 99% and a false positive rate of 6%, there is a 46% chance that an individual is a drug user given a positive reading, when 5% of the population are drug users
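
The drug-testing numbers can be reproduced with a short Python sketch. The function name `posterior_user_given_positive` and its parameter names are illustrative assumptions; only the three input probabilities come from the example.

```python
def posterior_user_given_positive(p_true_pos, p_true_neg, p_user):
    """P(D | T) from Bayes Theorem with the two-event partition {D, D'}."""
    p_false_pos = 1.0 - p_true_neg   # P(T | D')
    p_non_user = 1.0 - p_user        # P(D')
    # Total probability of a positive reading, P(T).
    p_positive = p_true_pos * p_user + p_false_pos * p_non_user
    return p_true_pos * p_user / p_positive

p_d_given_t = posterior_user_given_positive(p_true_pos=0.99, p_true_neg=0.94, p_user=0.05)
print(round(p_d_given_t, 3))
```

Writing the denominator as the total probability of a positive reading makes it easy to vary one input at a time, for example the false positive rate.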

34
Bayes Theorem - Example

Policy implications
incidence of drug use fixed in the population -
given
reliability of test depends significantly on true
positive, false positive rate
e.g., how can we improve the reliability of the
test by minimizing the false positive rate?

Underscores the importance of analytical
procedures
35
Random Variables and Probability Distributions
36
Random Variable

is a means of attaching a numerical value
(label) to an outcome
in some instances, this occurs by definition -
e.g., temperature is inherently numerical
e.g., defective = 0, functional = 1 --> random variable that takes on the values 0 and 1
why do we need this notion?
to allow us to express probability and outcomes
in a mathematical setting

37
Types of Random Variables

reflect types of data
Discrete Random Variables
take on integer values - discrete set of values
Continuous Random Variables
take on values from a portion of the real line
continuum of values
implications for probability statements later

38
Random Variables - Notation

Standard Convention
Random variable denoted by capital -- X
Values assumed denoted by lower-case -- x

39
Discrete Random Variables
40
Discrete Random Variables

We have a probability function $P_X(x) = P(X = x)$
Example - sampling one chip from a batch of 30 (10 of which are defective)
defective = 0, functional = 1
$P_X(0) = 10/30 = 1/3$, $P_X(1) = 20/30 = 2/3$

41
Cumulative Distribution Function

We can also define a Cumulative Distribution Function as follows:
$F_X(x) = P(X \le x) = \sum_{x_i \le x} P_X(x_i)$
$F_X$ is the probability that we obtain an outcome less than or equal to a given number
$F_X$ is the accumulation of probabilities of outcomes less than or equal to the given number
more to come...

42
Probability Function - Example

Galvanneal Line
discrete random variable - attach score (number) to reflect outcomes - x = 0, 1, 2 -- acceptability score
O1: thickness off-spec, fails tape test - x = 0
O2: thickness acceptable, fails tape test - x = 1
O3: thickness off-spec, passes tape test - x = 1
O4: thickness acceptable, passes tape test - x = 2
interpretation - score reflects severity of situation in descending order (0 is the most severe, 2 is fully acceptable)
Probability Function
$P(X = 0) = 0.04$, $P(X = 1) = 0.10 + 0.03 = 0.13$, $P(X = 2) = 0.83$
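
The step from outcome probabilities to the probability function, and on to the cumulative distribution function, can be sketched in Python. This is an illustration only; the list `scored_outcomes` simply pairs each Galvanneal outcome's score with its probability.

```python
from collections import defaultdict
from itertools import accumulate

# (score x, probability) for outcomes O1..O4 of the Galvanneal example.
scored_outcomes = [(0, 0.04), (1, 0.10), (1, 0.03), (2, 0.83)]

# Probability function: add up the probabilities of all outcomes sharing a score.
pmf = defaultdict(float)
for x, p in scored_outcomes:
    pmf[x] += p

# Cumulative distribution function: running total of the pmf in increasing x.
xs = sorted(pmf)
cdf = dict(zip(xs, accumulate(pmf[x] for x in xs)))

for x in xs:
    print(x, round(pmf[x], 2), round(cdf[x], 2))
```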

43
Expected Value

What is the value of the random variable expected
on average?
Reasoning
we have probability function that indicates values occur $P_X(x)$ fraction of the time
if we had 1000 experiments, we would obtain an outcome of 1 in $P_X(1) \times 1000$ instances
we can carry this analysis for each outcome, and then take the average
we obtain $(0 \cdot P_X(0) \cdot 1000 + 1 \cdot P_X(1) \cdot 1000 + \ldots)/1000 = 0 \cdot P_X(0) + 1 \cdot P_X(1) + 2 \cdot P_X(2) + \ldots$
leads to definition of expected value for a discrete r.v.

44
Expected Value

The expected value of a discrete random variable X is defined as
$E(X) = \sum_{x} x\,P_X(x)$
The expected value is an important parameter that characterizes probability functions, and is given a symbol: $\mu = E(X)$

$\mu$ is the MEAN of the random variable X.
45
Example - Mean for Galvanneal Line

Using the probability function, $\mu = 0(0.04) + 1(0.13) + 2(0.83) = 1.79$

46
Variance

is defined using the expected value
what is the value of the squared deviation from the mean expected on average?
$\sigma^2 = \mathrm{Var}(X) = E[(X - \mu)^2] = \sum_{x} (x - \mu)^2\,P_X(x)$
Note - reminiscent of sample variance, which in fact is the statistic that estimates the parameter $\sigma^2$

47
Standard Deviation

is the square root of the variance: $\sigma = \sqrt{\mathrm{Var}(X)}$

The mean, variance and standard deviation are
parameters summarizing a probability
distribution for a random variable.
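
For the Galvanneal acceptability score, the three summary parameters can be computed directly from their definitions. The following Python sketch is illustrative and uses only the probability function given in the example.

```python
import math

# Probability function of the Galvanneal acceptability score X.
pmf = {0: 0.04, 1: 0.13, 2: 0.83}

# Mean: probability-weighted average of the values.
mean = sum(x * p for x, p in pmf.items())
# Variance: probability-weighted average squared deviation from the mean.
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(round(mean, 2), round(variance, 4), round(std_dev, 4))
```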
48
Expected Values

In general, if we have a function of a random variable, we can take the expected value: $E[g(X)] = \sum_{x} g(x)\,P_X(x)$
Examples
mean - $g(X) = X$
variance - $g(X) = (X - \mu)^2$

49
Linearity of Expectation

The Expected Value operation is LINEAR
1) Additivity
$E(X + Y) = E(X) + E(Y)$
2) Scaling
$E(kX) = k\,E(X)$
where k is a constant
e.g., $E(X + 6) = E(X) + 6 = \mu_X + 6$
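
A quick numerical check of these properties, using the Galvanneal score as the random variable, might look like the Python sketch below. The helper `expect` and the constant k = 3 are illustrative choices, not taken from the slides.

```python
# Probability function of the Galvanneal acceptability score X.
pmf = {0: 0.04, 1: 0.13, 2: 0.83}

def expect(g):
    """Expected value of g(X): the sum of g(x) * P_X(x) over all x."""
    return sum(g(x) * p for x, p in pmf.items())

mu = expect(lambda x: x)            # E(X)
shifted = expect(lambda x: x + 6)   # E(X + 6)
scaled = expect(lambda x: 3 * x)    # E(kX) with an assumed k = 3

print(round(mu, 2), round(shifted, 2), round(scaled, 2))
```

Computing the shifted and scaled expectations from the definition, rather than from the rules, is what makes this a check: the results should agree with $\mu_X + 6$ and $3\mu_X$.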

50
Probability Distributions for Discrete R.V.s

Recall - we can determine probability functions
by counting - enumeration given characteristics
of physical situation - or based on empirical
observations
specific types of problems occur frequently, and
motivate the labeling and study of generic
distributions
Binomial Distribution
Poisson Distribution

General Approach - build a library of standard
distributions.
51
Binomial Distribution

Suppose we are conducting a number of independent
trials, each with only one of two possible values
each trial is referred to as a Bernoulli trial
note that each trial is independent
outcomes -- 0, 1 -- True/False -- Success/Fail --
...
in each trial, P(1) p, and P(0) 1-p
if we have n trials, what is the probability that we obtain x outcomes of 1 (successes)?
in n trials, we have ${}_nC_x$ ways of having x successes
for each case of x successes, the probability is $p^x (1-p)^{n-x}$

52
Binomial Distribution

Putting it all together, the probability of having x successes in n independent trials is
$P_X(x) = {}_nC_x\, p^x (1-p)^{n-x}, \quad x = 0, 1, \ldots, n$

Binomial Probability Distribution Function
53
Binomial Distribution

Mean: $\mu = np$
Variance: $\sigma^2 = np(1-p)$

54
Using the Binomial Distribution

Sampling with Replacement -
Example -
On the microwave module line of a
telecommunications equipment maker, the
probability of a defective module is 0.21. From
each batch, one module is selected and tested,
and then returned to the batch. This procedure is
repeated 5 times, so that we have 5 independent
tests for defects. What is the probability of
having
a) 1 defect in the five tests?
b) 3 defects in the five tests?
c) why is it important that the module be
returned?

55
Binomial Example

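a) n = 5 (independent trials), x = 1 (success = defect identified - need to be clear on this!)
$P(X = 1) = {}_5C_1 (0.21)^1 (0.79)^4 \approx 0.409$
b) n = 5 (independent trials), x = 3
$P(X = 3) = {}_5C_3 (0.21)^3 (0.79)^2 \approx 0.058$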
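
Parts a) and b) can be evaluated with a small Python function for the binomial probability function. This is a sketch under the example's stated values (n = 5, p = 0.21); the name `binomial_pmf` is an illustrative choice, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 5, 0.21
p_a = binomial_pmf(1, n, p)  # a) one defect in five tests
p_b = binomial_pmf(3, n, p)  # b) three defects in five tests

# Sanity checks: the probabilities sum to one and the mean equals n * p.
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))

print(round(p_a, 3), round(p_b, 3), round(total, 6), round(mean, 6))
```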

56
Binomial Example

c) why is it necessary to return the module to
the batch before the next sample?
preserve independence
if module not returned, batch is one smaller, and
there is potentially one fewer defect -
underlying probability is influenced
Binomial distribution is appropriate in sampling
situations when there is sampling with
replacement
for sampling without replacement, we need to use
the Hypergeometric distribution
if the lot size is large relative to the number
of tests in the sample, binomial provides
reasonable approximation
e.g., 10 sampling tests for lot of 1000
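
To see how close the binomial approximation is for 10 tests from a lot of 1000, the two distributions can be compared numerically. The Python sketch below is an illustration under an assumed defect count: 210 defective modules in the lot (a 21% defect rate) is a hypothetical figure chosen to match the earlier example, not a value from the slides.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x defects in n draws with replacement, defect probability p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def hypergeom_pmf(x, n, K, N):
    """P(x defects in n draws without replacement from a lot of N containing K defects)."""
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)

# Lot of 1000 and 10 tests as in the text; K = 210 defects is an assumed value.
N, K, n = 1000, 210, 10

# Largest difference between the exact and approximate probabilities over all x.
max_gap = max(
    abs(hypergeom_pmf(x, n, K, N) - binomial_pmf(x, n, K / N))
    for x in range(n + 1)
)
print(max_gap)
```

A small `max_gap` supports using the binomial distribution here; shrinking the lot toward the sample size would make the gap grow.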
