Population Distributions - PowerPoint PPT Presentation

Slides: 37
Provided by: shish
Transcript and Presenter's Notes

Title: Population Distributions


1
Chapter 7
  • Population Distributions

2
Chapter 7 Population Distributions
  • 7.1 Describing the Distribution of Values in a
    Population
  • 7.2 Population Models for Continuous Numerical
    Variables
  • 7.3 Normal Distributions

3
7.1 Describing the Distribution of Values in a
Population
  • An Example: How well does an online
    registration system for college courses perform?
  • Population: All the students who use the system
  • Variable of Interest: Time to complete
    registration
  • A variable associates a value with each
    individual or object in a population.
  • The distribution of all the values of a numerical
    variable or all the categories of a categorical
    variable is called a population distribution.

4
Categorical Variables Example
  • In a study of factors related to air quality,
    monitors were posted at every entrance to a
    California university campus on October 10, 2003.
    From 6am to 10 pm, monitors recorded the mode of
    transportation for each person entering the
    campus. Based on the information collected, the
    population distribution of the variable
    x = mode of transportation was constructed.

5
Numerical Variables
  • Numerical variables can be either discrete or
    continuous.
  • A discrete numerical variable is one whose
    possible values are isolated points along the
    number line.
  • A continuous numerical variable is one whose
    possible values form an interval along the number
    line.
  • A discrete numerical variable can be summarized
    by a relative frequency histogram.
  • A continuous numerical variable can be summarized
    by a density histogram.

In a density histogram, density = (relative
frequency)/(interval width). Therefore, relative
frequency = (density)(interval width).
6
Example: Pet Ownership
  • The Department of Animal Regulation released
    information on pet ownership for the population
    of all households in a particular county. The
    variable considered was
  • x = number of licensed dogs or cats for a
    household.
  • Summarize the population distribution in a
    relative frequency histogram.
  • Is x discrete or continuous?
  • What is the probability of observing a household
    with 3 or more licensed dogs or cats?

Answer on the next slide
7
Answer to the Example: Pet Ownership
  • x = number of licensed dogs or cats for a
    household
  • Possible x values are 0, 1, 2, 3, 4 and 5. Because
    these are isolated points along the number line,
    x is a discrete variable.
  • The probability of observing a household with 3
    or more licensed dogs or cats is
  • P(x ≥ 3) = P(x = 3) + P(x = 4) + P(x = 5)
  • = .09 + .03 + .01 = .13
  • Exercises
  • What is the probability of observing a household
    with at most 2 licensed dogs or cats?
  • What is the probability of observing a household
    with at least 2 licensed dogs or cats?
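
The addition rule used on this slide can be checked with a short Python sketch. It takes the three probabilities for x = 3, 4 and 5 as .09, .03 and .01 (the values that sum to the stated .13); the probabilities for x = 0, 1 and 2 are not given in the transcript, so only quantities that follow from the upper tail are computed.

```python
# Probabilities for x = 3, 4, 5 licensed pets, as read from the slide.
# The values for x = 0, 1, 2 are not given in the transcript.
known = {3: 0.09, 4: 0.03, 5: 0.01}

# P(x >= 3): add the probabilities of the isolated values 3, 4 and 5.
p_at_least_3 = sum(known.values())

# P(x <= 2) is the complement, because all probabilities sum to 1.
p_at_most_2 = 1 - p_at_least_3

print(f"P(x >= 3) = {p_at_least_3:.2f}")
print(f"P(x <= 2) = {p_at_most_2:.2f}")
```

The "at least 2" exercise also needs P(x = 2), which would have to be read from the relative frequency histogram on the original slide.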

8
Example: Birth Weights
  • Birth weight was recorded for all full-term
    babies born during 2002 in a semirural county.
    The variable x = birth weight for a full-term
    baby in this county is an example of a continuous
    numerical variable.
  • We can construct a density histogram to describe
    the population distribution of x values.
  • Density = the height of each rectangle
  • Relative frequency = Probability
  • = (density)(interval width)
  • = (height)(interval width) = area
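
The relationship between density, interval width and area can be sketched in Python. The intervals and relative frequencies below are hypothetical, chosen only for illustration; they are not the birth-weight data from the slide.

```python
# Hypothetical class intervals (in pounds) and relative frequencies;
# these numbers are illustrative and do not come from the slides.
intervals = [(3.5, 5.5), (5.5, 6.5), (6.5, 7.5), (7.5, 8.5), (8.5, 10.5)]
rel_freq = [0.05, 0.15, 0.40, 0.30, 0.10]

# density = relative frequency / interval width
densities = [rf / (hi - lo) for (lo, hi), rf in zip(intervals, rel_freq)]

# area of each rectangle = density * width, which recovers the relative frequency
areas = [d * (hi - lo) for (lo, hi), d in zip(intervals, densities)]

for (lo, hi), d, a in zip(intervals, densities, areas):
    print(f"[{lo}, {hi}): density = {d:.3f}, area = {a:.2f}")
print(f"total area = {sum(areas):.2f}")
```

Dividing by the width is what keeps unequal intervals comparable: a wide interval with a small relative frequency gets a short rectangle, and the rectangle areas still add up to 1.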

9
The Mean (µ) and Standard Deviation (σ) of a
Numerical Variable
  • The mean value of a numerical variable x, denoted
    by µ, describes where the population distribution
    of x is centered.
  • The standard deviation of a numerical variable x,
    denoted by σ, describes variability in the
    population distribution.
  • When σ is near 0, the values of x tend to be close
    to µ (little variability); when σ is large, there
    is more variability in the population of x
    values.

10
(No Transcript)
11
Which distribution has the largest standard
deviation? Which distribution has the largest
mean? Which distribution has a mean of about
5? Which distribution has the smallest standard
deviation?
12
  • The area of any rectangle in a density
    histogram can be interpreted as the probability
    of observing a variable value in that interval.
  • In (a), P(4.5 < x < 5.5) = height (density) ×
    width = (.05)(1) = .05.

13
A smooth curve specifies a continuous probability
distribution
  • An observation of the four density histograms on
    the preceding slide shows:
  • A density histogram based on a small number of
    intervals can be quite jagged.
  • As the number of intervals increases, the
    resulting histograms become much smoother in
    appearance.
  • A smooth curve superimposed over a density
    histogram, such as the one shown on the right, is
    called a continuous probability distribution.

A smooth curve superimposed over the density
histogram (d) of the preceding slide
14
7.2 Population Models for Continuous Numerical
Variables
  • A continuous probability distribution is a smooth
    curve, called a density curve, that serves as a
    model for the population distribution of a
    continuous variable.
  • Properties of continuous probability
    distributions are:
  • The total area under the curve is equal to 1
  • The area under the curve and above any particular
    interval is interpreted as the (approximate)
    probability of observing a value in the
    corresponding interval when an individual or
    object is selected at random from the population.

15
  • Let x be a continuous numerical variable.
  • For any particular number a, P(x = a) = 0.
  • (Because there is 0 area under the density curve
    above a single x value.)
  • For any particular numbers a and b,
  • P(x ≤ b) = P(x < b)
  • P(x ≥ a) = P(x > a)
  • P(a < x < b) = P(a ≤ x ≤ b)
  • The above are NOT true for discrete numerical
    variables!

16
Example: Departure Delays of a Commuter Train
  • The length of time that elapses between the
    scheduled departure time and the actual departure
    time is recorded on 200 occasions, and the
    resulting observations are summarized in the
    density histogram.

Because the histogram in (a) is fairly flat, a
reasonable model for the population distribution
is the uniform distribution in (b). The height of
the density curve is chosen to be 0.1 so that the
total area under the curve is equal to 1.
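
Under a uniform model, every probability is a rectangle area. The Python sketch below assumes the density sits on the interval from 0 to 10 minutes; the transcript gives only the height 0.1, which implies a width of 10, so the endpoints themselves are an assumption. The function name uniform_prob is hypothetical.

```python
def uniform_prob(a, b, low=0.0, high=10.0):
    """P(a < x < b) for a uniform density on [low, high].

    The endpoints 0 and 10 are assumed; the slide states only that
    the height is 0.1, which forces the width to be 10.
    """
    height = 1.0 / (high - low)          # 0.1 for a width of 10
    left, right = max(a, low), min(b, high)
    return max(0.0, right - left) * height  # rectangle area

print(f"total area   = {uniform_prob(0, 10):.2f}")
print(f"P(2 < x < 5) = {uniform_prob(2, 5):.2f}")
```

Clipping a and b to the support means a request such as P(x < 20) simply returns the total area of 1.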
17
Example 7.6: Priority Mail Package Weights
  • 200 packages shipped using the
    Priority Mail rate for packages under 2 lb were
    weighed, resulting in the following density
    histogram.
  • Let x = package weight (in pounds).
  • 1. Develop a reasonable model for the population
    distribution.
  • The shape of the sample density histogram
    suggests that a reasonable model for the
    population is a triangular distribution. (See
    next slide.)
  • 2. How should the height of the triangle be
    chosen so that the total area under the
    probability distribution curve is 1?
  • Total area of triangle
  • = ½ (base)(height) = 1.
  • With base = 2.0, the height must be equal to
    1.

18
Example: Priority Mail Package Weights
Figure: (a) histogram of package weight values;
(b) continuous probability distribution for
package weight.
  • Example: Find the probability that a package
    selected at random weighs less than 1.5 lb.
  • First, use similar triangles to find that the
    height h at x = 1.5 is 0.75.
  • Then P(x < 1.5) = ½ (1.5)(0.75) = .5625.
  • Exercise Find P( x 1.5 ) and P( x 1.5 ).
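
The triangle-area calculation can be reproduced in Python. The sketch assumes the density is f(x) = x/2 on the interval from 0 to 2 lb, which is the line implied by the slide's numbers (height 1 at x = 2, height 0.75 at x = 1.5); the function names are hypothetical.

```python
def density(x):
    """Assumed triangular density f(x) = x/2 on [0, 2], zero elsewhere."""
    return x / 2 if 0 <= x <= 2 else 0.0

def prob_less_than(b):
    """P(x < b) = area of the triangle with base b and height f(b)."""
    b = min(max(b, 0.0), 2.0)   # clip b to the range of possible weights
    return 0.5 * b * density(b)

print(f"height at 1.5 = {density(1.5)}")
print(f"P(x < 1.5)    = {prob_less_than(1.5)}")
print(f"total area    = {prob_less_than(2.0)}")
```

Because x is continuous, including or excluding the endpoint 1.5 does not change the area.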

19
Example: Service Times of An Airline's Reservation
  • An airline's toll-free reservation number
    recorded the length of time required to provide
    service to each of 500 callers. Let x = service
    time.
  • What is the population?
  • Develop a model for the population distribution.

20
Example: Service Times of An Airline's Reservation
Figure: (a) histogram of service times; (b)
continuous distribution of service times.
  • 1. What is the probability that the service time
    for a randomly selected caller lasts less than 3
    minutes?
  • P(x < 3) = (height)(width) = (1/24)(3)
    = 1/8
  • 2. What is the probability that the service time
    for a randomly selected caller lasts greater than
    8 minutes?
  • 3. What is the probability that the service time
    for a randomly selected caller lasts between 2
    and 4 minutes?

21
Example: Telephone Registration Times
  • Students at a university use a telephone
    registration system to register for courses. The
    variable is
  • x = length of time required for a student to
    register.
  • The general form of the density histogram can be
    described as bell shaped and symmetric.
  • The probability model of this problem is an
    example of a type of symmetric bell-shaped
    distribution known as a normal probability
    distribution.


The superimposed smooth curve is a
reasonable model for the population distribution.
22
7.3 Normal Distributions
  • Normal distributions are widely used for two
    reasons:
  • They provide a reasonable approximation to the
    distribution of many different variables.
  • They play a central role in many of the
    inferential procedures.
  • Normal distributions are distinguished by two
    important parameters:
  • The mean µ: where the normal curve is centered.
  • The standard deviation σ: how much the curve
    spreads out around the center.

23
Normal Distributions
  • Normal distributions are continuous probability
    distributions that are (1) bell shaped and (2)
    symmetric, with (3) two tails that die out quickly.

24
The Standard Normal Distribution
  • The normal distribution with µ = 0 and σ = 1.
  • The term z curve is used for the standard normal
    curve.
  • P(z < z*) = the cumulative area to the left of z*.

25
Using Appendix Table 2 on pages 706 and 707
(also inside the back cover) to find standard
normal curve areas
  • For any number z* between -3.89 and 3.89 and
    rounded to two decimal places, Appendix Table 2
    gives
  • (area under z curve to the left of z*) = P(z <
    z*) = P(z ≤ z*)
  • where z represents a variable whose distribution
    is the standard normal distribution.
  • To find this probability, locate the following:
  • The row labeled with the sign of z* and the digits
    on either side of the decimal point (e.g., -1.7,
    0.5, 3.6, etc.)
  • The column identified with the second digit to
    the right of the decimal point in z* (e.g., .06
    if z* = -1.76)
  • The number at the intersection of this row and
    column is the desired probability, P(z < z*).
  • Example: Find P(z < -1.76).
  • Because -1.76 is a negative number, we
    use the table on page 706. First locate the row
    labeled -1.7 in the first column (the z*
    column). Then we find P(z < -1.76) = .0392 at
    the intersection of this row and the column
    labeled .06.
  • Note: You may use an online z table instead of
    Table 2, but you have to understand how to use it
    because it may have a different design.

26
Using Table 2 (p. 706 and p. 707) to Find Standard
Normal Probabilities (Cumulative z Curve Areas)
  • Examples: Find the following probabilities.
  • 1. P(z < -1.76) = .0392
  • Exercise: P(z ≤ 0.58)
  • P(z < -4.12)
  • P(z < 4.18)
  • 2. P(z > 1.96) = 1 - P(z < 1.96) = 1 - .9750
    = .0250
  • Another method: P(z > 1.96) = P(z < -1.96)
    = .0250
  • Exercise: P(z > -1.28)
  • P(z > 3.9)
  • 3. P(-1.76 < z < 0.58) = P(z ≤ 0.58) - P(z <
    -1.76)
  • = .7190 - .0392 = .6798
  • Exercise: P(-2.00 < z < 2.00)

Answers to the exercises: P(z ≤ 0.58) = .7190;
P(z < -4.12) ≈ 0; P(z < 4.18) ≈ 1; P(z >
-1.28) = .8997; P(z > 3.9) ≈ 0; P(-2.00 < z <
2.00) = .9544.
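
The table lookups above can be cross-checked in software. The sketch below uses NormalDist from Python's standard statistics module (available from Python 3.8) in place of Appendix Table 2; this is a substitute for the table, not the method the slides use.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Cumulative area to the left of -1.76
p_left = z.cdf(-1.76)

# Area to the right of 1.96, computed two ways
p_right = 1 - z.cdf(1.96)
p_right_by_symmetry = z.cdf(-1.96)

# Area between two values = difference of cumulative areas
p_between = z.cdf(0.58) - z.cdf(-1.76)
p_within_2 = z.cdf(2.00) - z.cdf(-2.00)

for label, p in [("P(z < -1.76)", p_left),
                 ("P(z > 1.96)", p_right),
                 ("P(-1.76 < z < 0.58)", p_between),
                 ("P(-2 < z < 2)", p_within_2)]:
    print(f"{label} = {p:.4f}")
```

The results agree with the table values to about three decimal places. The last one comes out near .9545 rather than .9544 because the table entries are themselves rounded before being subtracted.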
27
Example: Identifying Extreme Values
  • Find z* such that P(z < z*) = .02.
  • Figure (a) shows that the cumulative area
    for z* is .02. (The area .02 < 0.5, so z* must
    be a negative number.) Therefore, we look for an
    area of .02 in the body of Appendix Table 2. The
    closest area in the table is .0202 in the -2.0
    row and .05 column. So z* = -2.05.
  • Find z* such that P(z > z*) = .05.
  • P(z > z*) = .05 indicates that the area to
    the right of z* is .05 in Figure (b). The area to
    the left of z* is 1 - .05 = .95. In Table 2 on page
    707, .95 falls exactly between .9495
    (corresponding to a z value of 1.64) and .9505
    (corresponding to a z value of 1.65). So z* =
    ½(1.64 + 1.65) = 1.645.

28
  • Exercise: Find the values that make up the most
    extreme 5% of the standard normal distribution.
  • We need to separate the middle 95% from the
    extreme 5%. Because the standard normal
    distribution is symmetric, the most extreme 5% is
    equally divided between the high side and the low
    side of the distribution, resulting in an area of
    .025 for each of the tails of the z curve.
    Symmetry about 0 implies that if z* denotes the
    value that separates the largest 2.5%, the value
    that separates the smallest 2.5% is simply -z*.

Complete the problem by finding z*.
Answer: z* = 1.96
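
Going from a cumulative area back to a z value is the inverse of the table lookup. As a hedged cross-check, the sketch below uses NormalDist.inv_cdf from Python's standard statistics module instead of searching the body of Table 2; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal curve

# Value with cumulative area .02 to its left
z_lower_02 = z.inv_cdf(0.02)

# Value with area .05 to its right, so cumulative area .95 to its left
z_upper_05 = z.inv_cdf(1 - 0.05)

# Most extreme 5%: .025 in each tail, so cumulative area .975 on the high side
z_extreme = z.inv_cdf(1 - 0.025)

print(f"P(z < z*) = .02  gives z* = {z_lower_02:.3f}")
print(f"P(z > z*) = .05  gives z* = {z_upper_05:.3f}")
print(f"extreme 5% cutoffs: {-z_extreme:.2f} and {z_extreme:.2f}")
```

The software values differ slightly from the table answers (for example, about -2.054 rather than -2.05) because the table is limited to two-decimal z values.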
29
Other Normal Distributions
  • Let x be a variable whose behavior is described
    by a normal distribution with mean µ and standard
    deviation σ. To calculate probabilities for x, we
    standardize the relevant values and then use the
    table for z curve areas.
  • P(x < b) = P(z < b*)
  • P(a < x) = P(a* < z) (equivalently, P(x > a) =
    P(z > a*))
  • P(a < x < b) = P(a* < z < b*),
  • where a* = (a - µ)/σ and b* = (b - µ)/σ.

30
Example: Children's Heights
  • The height of a randomly selected 5-year-old
    child is normally distributed with a mean of µ =
    100 cm and standard deviation σ = 6 cm. What
    proportion of the heights is between 94 and 112
    cm?
  • Let x = the height of a randomly selected
    5-year-old child.
  • P(94 < x < 112) = P(-1.00 < z < 2.00)
    = .9772 - .1587 = .8185

About 82% of 5-year-old children have heights
between 94 and 112 cm. Exercise: What is the
probability that a randomly selected 5-year-old
child will be taller than 110 cm?
Answer to Exercise: about .0475 (4.75%)
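
The height calculations can be sketched in Python by standardizing by hand and by letting NormalDist (from the standard statistics module) work with µ = 100 and σ = 6 directly. The helper name standardize is hypothetical.

```python
from statistics import NormalDist

MU, SIGMA = 100, 6  # mean and standard deviation of height, in cm

def standardize(value, mu=MU, sigma=SIGMA):
    """Convert an x value to its z score: (value - mu) / sigma."""
    return (value - mu) / sigma

z = NormalDist()                 # standard normal curve
heights = NormalDist(MU, SIGMA)  # normal model for x itself

# P(94 < x < 112) via z scores -1 and 2, and directly from the x model
p_between_z = z.cdf(standardize(112)) - z.cdf(standardize(94))
p_between_x = heights.cdf(112) - heights.cdf(94)

# Exercise: P(x > 110)
p_taller = 1 - heights.cdf(110)

print(f"P(94 < x < 112) = {p_between_z:.4f}")
print(f"P(x > 110)      = {p_taller:.4f}")
```

Both routes give about .82 for the first probability. For the exercise the software value is close to .048, while the table-based answer of 4.75% comes from rounding the z score 1.666... to 1.67 before the lookup.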
31
Example: IQ Scores
  • A commonly used IQ scale has a mean of 100 and a
    standard deviation of 15, and scores are
    approximately normally distributed. (IQ score is
    actually a discrete variable, but its population
    distribution closely resembles a normal curve.)
  • 1. What proportion of the population would qualify
    for Mensa membership, which requires an IQ score
    above 130?
  • 2. What proportion of the population has an IQ
    score below 80?
  • 3. What proportion of the population has an IQ
    score between 75 and 125?

32
Solution to Example: IQ Scores
Let x = IQ score of a randomly selected
individual. Given µ = 100 and σ = 15.
  • 1. P(x > 130) = P(z > 2.00) = 1 - .9772 = .0228
  • 2. P(x < 80) = P(z < -1.33) = .0918
  • 3. Left as an exercise.

Answer to 3: P(75 < x < 125) = .9050
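
The three IQ questions can be cross-checked with a short Python sketch, assuming the normal model with mean 100 and standard deviation 15 stated in the example. It uses NormalDist from the standard statistics module rather than the z table.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)  # normal model for IQ scores

# 1. Proportion above 130 (upper-tail area)
p_above_130 = 1 - iq.cdf(130)

# 2. Proportion below 80 (cumulative area)
p_below_80 = iq.cdf(80)

# 3. Proportion between 75 and 125 (difference of cumulative areas)
p_75_to_125 = iq.cdf(125) - iq.cdf(75)

print(f"P(x > 130)      = {p_above_130:.4f}")
print(f"P(x < 80)       = {p_below_80:.4f}")
print(f"P(75 < x < 125) = {p_75_to_125:.4f}")
```

The third value comes out slightly below the table-based .9050 because the table method rounds the z scores to ±1.67 before looking them up.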
33
Example: Registration Times
  • The length of time (in minutes) required for
    students to complete telephone registration at a
    particular university can be well approximated by
    a normal distribution with mean µ = 12 min and
    standard deviation σ = 2 min. The university
    would like to disconnect students automatically
    after some amount of time has elapsed. Determine
    the amount of time that should be allowed before
    disconnecting a student if the university wants
    only the 1% of students with the longest
    registration times to be disconnected.

34
Solution of Registration Time Example
  • Let x be the length of registration time for a
    randomly selected student.
  • Given µ = 12 minutes and σ = 2 minutes. Let x*
    be the time (in minutes) after which the phone
    registration should be disconnected.
  • P(x > x*) = 1%, P(x < x*) = 99%.
  • The z value corresponding to the
    cumulative area .99 is z* = 2.33.
  • Therefore, x* = µ + z*σ = 12 + (2.33)(2) = 16.66
    minutes.

35
Example: Motor Vehicle Emissions
  • The EPA has determined that the emissions of
    nitrogen oxides, which are major constituents of
    smog, can be modeled using a normal distribution
    with µ = 1.6 and σ = 0.4. Suppose that the EPA
    wants to offer some sort of incentive to get the
    worst polluters off the road. What emission
    levels constitute the worst 10% of the vehicles?

36
Solution of Vehicle Emission Example
  • Let x be the emission level of pollutant for a
    randomly selected vehicle.
  • Given µ = 1.6 and σ = 0.4. Let x* be the
    emission level that marks off the worst 10%
    (the highest 10% of emission levels).
  • P(x > x*) = 10%, P(x < x*) = 90%.
  • The z value corresponding to area .9 is z* = 1.28.
  • Therefore, x* = µ + z*σ = 1.6 + (1.28)(0.4) = 2.112.
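
Both cutoff problems (registration times and vehicle emissions) follow the same pattern: find the z value for the required cumulative area, then convert back with x* = µ + z*σ. The Python sketch below illustrates that pattern with NormalDist.inv_cdf from the standard statistics module; the helper name upper_cutoff is hypothetical, and the software is a stand-in for the table lookup.

```python
from statistics import NormalDist

def upper_cutoff(mu, sigma, upper_tail):
    """Return x* such that P(x > x*) = upper_tail for a normal(mu, sigma) model."""
    z_star = NormalDist().inv_cdf(1 - upper_tail)  # z value for the cumulative area
    return mu + z_star * sigma                     # un-standardize: x* = mu + z* sigma

# Registration times: disconnect only the longest 1% of calls
registration_cutoff = upper_cutoff(mu=12, sigma=2, upper_tail=0.01)

# Vehicle emissions: level that marks off the worst 10% of vehicles
emission_cutoff = upper_cutoff(mu=1.6, sigma=0.4, upper_tail=0.10)

print(f"registration cutoff = {registration_cutoff:.2f} minutes")
print(f"emission cutoff     = {emission_cutoff:.2f}")
```

The results are close to the table-based values of 16.66 and 2.112; the small differences come from the table's two-decimal z values (2.33 and 1.28).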