Title: Stat 155, Section 2, Last Time
1Stat 155, Section 2, Last Time
- Big Rules of Probability
- Not Rule (1 - P{opposite})
- Or Rule (glasses & football)
- And rule (multiply conditional probs)
- Use in combination for real power
- Bayes Rule
- Turn around conditional probabilities
- Write hard ones in terms of easy ones
- Recall surprising disease testing result
2Reading In Textbook
- Approximate Reading for Today's Material
- Pages 266-271, 311-323, 277-286
- Approximate Reading for Next Class
- Pages 291-305, 334-351
3Midterm I
- Coming up Tuesday, Feb. 27
- Material: HW Assignments 1 - 6
- Extra Office Hours
- Mon. Feb. 26, 8:30 - 12:00, 2:00 - 3:30
- (Instead of Review Session)
- Bring Along
- One 8.5 x 11 sheet of paper with formulas
4Recall Pepsi Challenge
- In class taste test
- Removed bias with randomization
- Double blind approach
- Asked which was
- Better
- Sweeter
- which
5Recall Pepsi Challenge
- Results summarized in spreadsheet
- Eyeball impressions
- a. Perhaps no consensus preference between Pepsi and Coke?
- Is 54% "significantly different" from 50%? (will develop methods to understand this)
- Result of "marketing research"???
6Recall Pepsi Challenge
- b. Perhaps no consensus as to which is sweeter?
- Very different from the past, when Pepsi was noticeably sweeter
- This may have driven old Pepsi challenge phenomenon
- Coke figured this out, and matched Pepsi in sweetness
7Recall Pepsi Challenge
- c. Most people believe they know
- Serious cola drinkers, because now flavor driven
- In past, was sweetness driven, and there were many advertising-caused misperceptions!
- d. People tend to get it right or not??? (less clear)
- Overall 71% right. Seems like it, but again is that significantly different from 50%?
8Recall Pepsi Challenge
- e. Those who think they know tend to be right???
- People who thought they knew were right 71% of the time
- f. Those who don't think they know seem to be right as well. Wonder why?
- People who didn't were also right 70% of the time? Why? "Natural sampling variation"???
- Any difference between people who thought they knew, and those who did not think so?
9Recall Pepsi Challenge
- g. Coin toss was fair (or is 57% heads significantly different from 50%?)
- How accurate are those ideas?
- Will build tools to assess this
- Called hypothesis tests and P-values
- Revisit this example later
10Independence
- (Need one more major concept at this level)
- An event A does not depend on B, when
- Knowledge of B does not change
- chances of A
- P{A | B} = P{A}
11Independence
- E.g. I Toss a Coin, and somebody on South Pole does too.
- P{H(me) | T(SP)} = P{H(me)} = ½.
- (no way that can matter, i.e. independent)
12Independence
- E.g. I Toss a Coin twice
- (toss number indicated with subscript)
- Is P{H2 | H1} < ½?
- What if have 5 Heads in a row?
- (isn't it more likely to get a Tail?)
- (Wanna bet?!?)
13Independence
- E.g. I Toss a Coin twice,
- Rational approach
- Look at Sample Space
- Model all as equally likely
- Then
- So independence is good model for coin tosses
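The "rational approach" above can be checked by brute force. The sketch below lists the four outcomes of two tosses, treats them as equally likely, and compares P{H2} with P{H2 | H1}. The helper name `prob` and the H1 / H2 labels are illustrative choices, not notation taken from the slides.

```python
from fractions import Fraction
from itertools import product

# Sample space for two tosses: HH, HT, TH, TT, modeled as equally likely.
sample_space = list(product("HT", repeat=2))

def prob(event):
    """Probability of an event (a true/false test on outcomes) by counting."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))

p_h1 = prob(lambda o: o[0] == "H")             # Head on toss 1
p_h2 = prob(lambda o: o[1] == "H")             # Head on toss 2
p_h1_and_h2 = prob(lambda o: o == ("H", "H"))  # Heads on both tosses

# Conditional probability: P{H2 | H1} = P{H1 & H2} / P{H1}
p_h2_given_h1 = p_h1_and_h2 / p_h1

print(p_h2, p_h2_given_h1)  # 1/2 1/2
```

Knowing the first toss was a Head leaves the chance of a Head on the second toss at ½, which is what independence means.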
14New Ball Urn Example
- H → R R R R G G    T → R R G
- Again toss coin, and draw ball
- Same, so R & H are independent events
- Not true above, but works here, since proportions of R & G are same
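A quick check of this urn example, as a sketch under two assumptions: the coin is fair, and Heads selects the urn R R R R G G while Tails selects the urn R R G. The dictionary names are illustrative.

```python
from fractions import Fraction

# Assumption: fair coin; Heads draws from the first urn, Tails from the second.
urns = {"H": "RRRRGG", "T": "RRG"}
p_coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}

# P{R | H} and P{R | T}: proportion of red balls in each urn.
p_red_given = {side: Fraction(balls.count("R"), len(balls))
               for side, balls in urns.items()}

# P{R}: combine the And rule and the Or rule over the two coin outcomes.
p_red = sum(p_coin[side] * p_red_given[side] for side in urns)

print(p_red_given["H"], p_red_given["T"], p_red)  # 2/3 2/3 2/3
```

Since P{R | H} = P{R | T} = P{R}, knowing the coin result does not change the chance of drawing red.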
15Independence
- Note, when A is independent of B: P{A | B} = P{A}
- so P{A & B} = P{A | B} P{B} = P{A} P{B}
- And thus P{B | A} = P{A & B} / P{A} = P{B}
- i.e. B is independent of A
16Independence
- Note, when A is independent of B
- It follows that B is independent of A
- I.e. independence is symmetric in A and B
- (as expected)
- More formal treatments use symmetric version as definition
- (to avoid hassles with 0 probabilities)
17Independence
18Special Case of And Rule
- For A and B independent
- P{A & B} = P{A | B} P{B} = P{B | A} P{A}
- = P{A} P{B}
- i.e. When independent, just multiply probabilities
- Textbook: Calls this another rule
- Me: Only learn one, this is a special case
19Independent And Rule
- E.g. Toss a coin until the 1st Head appears, find P{3 tosses}
- Model: tosses are independent
- (saw this was reasonable last time, using equally likely sample space ideas)
- P{3 tosses} = P{T1 & T2 & H3}
- When have 3: group with parentheses
20Independent And Rule
- E.g. Toss a coin until the 1st Head appears, find P{3 tosses}
- P{3 tosses} = P{T1} P{T2} P{H3} (by indep)
- I.e. just multiply
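The "just multiply" step can be written out for any number of tosses. This is a minimal sketch assuming a fair coin. The function name `prob_first_head_on` is made up for illustration.

```python
from fractions import Fraction

p_head = Fraction(1, 2)   # assumption: fair coin
p_tail = 1 - p_head

def prob_first_head_on(n):
    """P{first Head appears on toss n}: n - 1 Tails, then a Head.

    Tosses are modeled as independent, so the probabilities multiply.
    """
    return p_tail ** (n - 1) * p_head

print(prob_first_head_on(3))  # 1/8
```

For 3 tosses this is ½ × ½ × ½ = 1/8.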
21Independent And Rule
- E.g. Toss a coin until the 1st Head appears, P{3 tosses}
- Multiplication idea holds in general
- So from now on will just say
- Since Independent, multiply probabilities
- Similarly for Exclusive Or rule,
- Will just add probabilities
22Independent And Rule
- HW
- 4.29 (hint: Calculate P{G1 & G2 & G3 & G4 & G5 & G6 & G7})
- 4.33
23Overview of Special Cases
- Careful: these can be tricky to keep separate
- OR works like adding,
- for mutually exclusive
- AND works like multiplying,
- for independent
24Overview of Special Cases
- Caution: special cases are different
- Mutually exclusive ≠ independent
- For A and B mutually exclusive
- P{A | B} = 0 ≠ P{A}
- Thus not independent
25Overview of Special Cases
- HW C15: Suppose events A, B, C all have probability 0.4, A & B are independent, and A & C are mutually exclusive.
- Find P{A or B} (0.64)
- Find P{A or C} (0.8)
- Find P{A and B} (0.16)
- Find P{A and C} (0)
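The four answers follow from the two special cases plus the general Or rule, P{A or B} = P{A} + P{B} - P{A and B}. A short sketch of the arithmetic, with illustrative variable names:

```python
p_a = p_b = p_c = 0.4

# A and B independent: And rule reduces to multiplying.
p_a_and_b = p_a * p_b

# A and C mutually exclusive: they cannot both happen.
p_a_and_c = 0.0

# General Or rule: add, then subtract the overlap.
p_a_or_b = p_a + p_b - p_a_and_b
p_a_or_c = p_a + p_c - p_a_and_c

print(f"{p_a_or_b:.2f} {p_a_or_c:.2f} {p_a_and_b:.2f} {p_a_and_c:.2f}")
```

Note that for the mutually exclusive pair P{A | C} = 0, which differs from P{A} = 0.4, so A and C are not independent.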
26Random Variables
- Text, Section 4.3 (we are currently jumping)
- Idea: take probability to next level
- Needed for probability structure of political
polls, etc.
27Random Variables
- Definition
- A random variable, usually denoted as X,
- is a quantity that
- takes on values at random
28Random Variables
- Two main types
- (that require different mathematical models)
- Discrete, i.e. counting
- (so look only at counting numbers, 1, 2, 3, ...)
- Continuous, i.e. measuring
- (harder math, since need all fractions, etc.)
29Random Variables
- E.g. X = # for Candidate A in a randomly selected political poll: discrete
- (recall all that means)
- Power of the random variable idea
- Gives something to get a hold of
- Similar in spirit to high school algebra
30High School Algebra
- Recall Main Idea?
- Rules for solving equations???
- No, major breakthrough is
- Give unknown(s) a name
- Find equation(s) with unknown
- Solve equation(s) to find unknown(s)
31Random Variables
- E.g. X = # that comes up, in die rolling
- Discrete
- But not very interesting
- Since can study by simple methods
- As done above
- Don't really need random variable concept
32Random Variables
- E.g. Measurement error
- Let X = measurement
- Continuous
- How to model probabilities???
33Random Variables
- HW on discrete vs. continuous
- 4.40 ((b) discrete, (c) continuous, (d)
could be either, but discrete is more common)
34And now for something completely different
- My idea about visualization last time
- 30% really liked it
- 70% less enthusiastic
- Depends on mode of thinking
- Visual thinkers loved it
- But didn't connect with others
- So hadn't planned to continue that
35And now for something completely different
- But here was another viewpoint
- Professor Marron,
- Could you focus on something more intelligent in
your "And now for something completely different"
section once every two weeks, perhaps, instead of
completely abolishing it? I really enjoyed your
discussion of how to view three dimensions in 2-D
today.
36And now for something completely different
- A fun example
- Faces as data
- Each data point is a digital image
- Data from U. Carlos III in Madrid
- (hard to do here for confidentiality reasons)
- Q: What distinguishes men from women?
37And now for something completely different
38And now for something completely different
- Context: statistical problem of classification, i.e. discrimination
- Basically automatic disease diagnosis
- Have measurements on sick & healthy cases
- Given new person, make measurements
- Closest to sick or healthy populations?
39And now for something completely different
- Approach: Distance Weighted Discrimination
- (Marron & Todd)
- Idea: find best separating direction in high dimensional data space
- Here
- Data are images
- Classes: Males & Females
- Given new image, classify: male - female
40And now for something completely different
- Fun visualization
- March through point clouds
- Along separating direction
- Captures Femaleness & Maleness
- Note relation to training data
41And now for something completely different
42Random Variables
- A die rolling example
- (where random variable concept is useful)
- Win 9 if 5 or 6, Pay 4 if 1, 2 or 3, otherwise (4) break even
- Notes
- Don't care about number that comes up
- Random Variable abstraction allows focusing on important points
- Are you keen to play? (will calculate)
43Random Variables
- Die rolling example
- Win 9 if 5 or 6, Pay 4 if 1, 2 or 3
- Let X = net winnings
- Note: X takes on values 9, -4 and 0
- Probability Structure of X is summarized by
- P{X = 9} = 1/3, P{X = -4} = 1/2, P{X = 0} = 1/6
- (should you want to play?, study later)
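The probability structure of X can be derived from the six equally likely faces rather than written down directly. The sketch below uses the game as first stated (win 9 on 5 or 6, pay 4 on 1, 2 or 3, break even on 4). The function name `net_winnings` is illustrative.

```python
from collections import defaultdict
from fractions import Fraction

def net_winnings(face):
    """Map the die face to X = net winnings."""
    if face in (5, 6):
        return 9
    if face in (1, 2, 3):
        return -4
    return 0  # face 4: break even

# Each face has probability 1/6; faces giving the same X are disjoint,
# so their probabilities add.
distribution = defaultdict(Fraction)
for face in range(1, 7):
    distribution[net_winnings(face)] += Fraction(1, 6)

for value, p in sorted(distribution.items(), reverse=True):
    print(value, p)
```

The face itself drops out: only the three values of X and their probabilities remain.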
44Random Variables
- Die rolling example, for X = net winnings
- Win 9 if 5 or 6, Pay 4 if 1, 2 or 3
- Probability Structure of X is summarized by
- P{X = 9} = 1/3, P{X = -4} = 1/2, P{X = 0} = 1/6
- Convenient form: a table
Winning 9 -4 0
Prob. 1/3 1/2 1/6
45Summary of Prob. Structure
- In general for discrete X, summarize distribution (i.e. full prob. structure) by a table
- Where
- All pi are between 0 and 1
- p1 + p2 + ... + pk = 1
- (so get a prob. function as above)
Values x1 x2 ... xk
Prob. p1 p2 ... pk
46Summary of Prob. Structure
- Summarize distribution, for discrete X,
- by a table
- Power of this idea
- Get probs by summing table values
- Special case of disjoint OR rule
Values x1 x2 ... xk
Prob. p1 p2 ... pk
47Summary of Prob. Structure
- E.g. Die Rolling game above
- P{X = 9} = 1/3
- P{X < 2} = P{X = 0} + P{X = -4} = 1/6 + 1/2 = 2/3
- P{X = 5} = 0 (not in table!)
Winning 9 -4 0
Prob. 1/3 1/2 1/6
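Getting probabilities "by summing table values" is easy to mechanize. In this sketch the table is stored as a dictionary and `prob` (an illustrative helper name) adds up the entries whose value satisfies the event.

```python
from fractions import Fraction

# Distribution table for the die rolling game: value -> probability.
table = {9: Fraction(1, 3), -4: Fraction(1, 2), 0: Fraction(1, 6)}

def prob(event):
    """Sum the table probabilities over the values where the event holds."""
    return sum((p for x, p in table.items() if event(x)), Fraction(0))

print(prob(lambda x: x == 9))  # 1/3
print(prob(lambda x: x < 2))   # 2/3
print(prob(lambda x: x == 5))  # 0
```

A value that is not in the table simply contributes nothing to the sum, which is why P{X = 5} = 0.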
48Summary of Prob. Structure
- E.g. Die Rolling game above
Winning 9 -4 0
Prob. 1/3 1/2 1/6
49Summary of Prob. Structure
- HW
- 4.41 (c) Find P{X = 3 | X > 2} (3/7)
- 4.52 (0.144, , 0.352)
50Probability Histogram
- Idea: Visualize probability distribution using a bar graph
- E.g. Die Rolling game above
Winning 9 -4 0
Prob. 1/3 1/2 1/6
51Probability Histogram
- Construction in Excel
- Very similar to bar graphs (done before)
- Bar heights = probabilities
- Example: Class Example 18
52Probability Histogram
53Random Variables
- Now consider continuous random variables
- Recall for measurements (not counting)
- Model for continuous random variables
- Calculate probabilities as areas,
- under probability density curve, f(x)
54Continuous Random Variables
- Model probabilities for continuous random variables, as areas under probability density curve, f(x)
- P{a < X < b} = Area under f(x) between a and b
- = ∫_a^b f(x) dx (calculus notation)
55Continuous Random Variables
- Note
- Same idea as idealized distributions above
- Recall discussion from
- Page 8, of Class Notes, Jan. 23
56Continuous Random Variables
- e.g. Uniform Distribution
- Idea: choose random number from [0,1]
- Use constant density f(x) = C
- Models equally likely
- To choose C, want
- 1 = P{X in [0,1]} = Area = C × 1
- So want C = 1.
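With constant density 1 on [0,1], every probability is the area of a rectangle of height 1. A minimal sketch, where `uniform_prob` is an illustrative name:

```python
def uniform_prob(a, b):
    """P{a < X < b} for X uniform on [0, 1]: area under the flat density."""
    height = 1.0                          # the constant C chosen above
    lo, hi = max(a, 0.0), min(b, 1.0)     # only the part inside [0, 1] counts
    return height * max(hi - lo, 0.0)

print(uniform_prob(0.0, 1.0))   # 1.0, total area
print(uniform_prob(0.25, 0.75)) # 0.5
print(uniform_prob(1.5, 2.0))   # 0.0, interval lies outside [0, 1]
```

Because areas of single points are 0, P{X = x} = 0 for any one value x, and it makes no difference whether interval endpoints are included.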
57Uniform Random Variable
- HW
- 4.54 (0.73, 0, 0.73, 0.2, 0.5)
- 4.56 (1, ½, 1/8)
58Continuous Random Variables
- e.g. Normal Distribution
- Idea: Draw at random from a normal population
- f(x) is the normal curve (studied above)
- Review some earlier concepts
59Normal Curve Mathematics
- The normal density curve is
- f(x) = (1 / (σ √(2π))) e^(-(x - μ)² / (2σ²))
- usual function of x
- π = circle constant ≈ 3.14
- e = natural number ≈ 2.7
60Normal Curve Mathematics
- Main Ideas
- Basic shape is e^(-x²/2)
- Shifted to mu
- Scaled by sigma
- Make Total Area = 1: divide by σ √(2π)
- f(x) → 0 as x → ±∞, but never = 0
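The main ideas can be checked numerically. The sketch below codes the standard normal density formula, f(x) = exp(-(x - mu)² / (2 sigma²)) / (sigma √(2π)), and approximates the total area with a simple midpoint sum. The function names, the choice mu = 1 and sigma = 0.5, and the integration range of 10 sigma on either side are illustrative assumptions.

```python
import math

def normal_density(x, mu=0.0, sigma=1.0):
    """Normal density: basic shape exp(-z^2 / 2), shifted by mu, scaled by sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def area(f, a, b, n=10_000):
    """Approximate the area under f between a and b with a midpoint sum."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

mu, sigma = 1.0, 0.5
total = area(lambda x: normal_density(x, mu, sigma), mu - 10 * sigma, mu + 10 * sigma)

print(round(total, 6))                        # total area, very close to 1
print(normal_density(mu + 10 * sigma, mu, sigma) > 0.0)  # tiny in the tails, but never 0
```

Dividing by sigma √(2π) is what brings the total area to 1. Far out in the tails the density is tiny but still positive.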
61Computation of Normal Areas
- EXCEL Computation
- works in terms of lower areas
- E.g. for N(1,0.5)
- Area < 1.3
62Computation of Normal Probs
- EXCEL Computation
- probs given by lower areas
- E.g. for X ~ N(1,0.5)
- P{X < 1.3} = 0.73
63Normal Random Variables
- As above, compute probabilities as areas,
- In EXCEL, use NORMDIST & NORMINV
- E.g. above X ~ N(1,0.5)
- P{X < 1.3} = NORMDIST(1.3,1,0.5,TRUE)
- = 0.73 (as in pic above)
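For readers without Excel, the same lower area can be reproduced with the error function from Python's standard library. This is a sketch and not the class's method: `normal_lower_area` is a made-up name, and 0.5 is read as the standard deviation, matching the third argument of NORMDIST.

```python
import math

def normal_lower_area(x, mean, sd):
    """Lower area P{X < x} for X ~ N(mean, sd), like NORMDIST(x, mean, sd, TRUE)."""
    z = (x - mean) / sd                  # standardize
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p = normal_lower_area(1.3, 1.0, 0.5)
print(round(p, 2))  # 0.73
```

Upper areas and areas between two points follow from lower areas by subtraction, e.g. P{X > 1.3} = 1 - P{X < 1.3}.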
64Normal Random Variables