Business Statistics Course Notes Rev. 2003 - PowerPoint PPT Presentation


1
Business Statistics: Course Notes Rev. 2003
  • Professor Joel M. Calabrese
  • San Francisco State University

2
Part 1: Introduction

3
What is Statistics?
  • Definitions
  • A statistic is a numerical fact or datum
    assembled, classified, or tabulated so as to
    present significant information about a given
    topic.
  • Statistics: The science of preparing and
    analyzing such data.
  • Descriptive Statistics: Preparing and presenting
    information.
  • Inferential Statistics: Analyzing information
    and drawing conclusions (inferences).

4
Inferential Statistics
  • Infer conclusions about a large population using
    a small sample from the population. Uses
    probabilities.
  • Example: TV ratings by Nielsen Media Research.

5
TELEVISION RATINGS: Nielsen Media Research
  • NIELSEN PEOPLE METER is programmed with the age
    and gender of each household member. Viewers
    enter their code when they begin watching;
    visitors can log their presence as well. The
    meter records which channels are tuned by sensing
    the frequencies emitted by the cable box, TV or
    videocassette recorder.
  • EVERY DAY, in some 5,000 homes throughout the
    U.S., People Meters gather data on who watched
    what, when and for how long.
  • AT STAGGERED TIMES throughout the night, all the
    meters call Nielsen's mainframe computer system
    in Dunedin, Fla., and transfer their daily
    viewing records via modem.
  • BY MORNING, Nielsen has assembled and processed
    its sample of the nation's viewing behavior. TV
    executives and other subscribers can log in to
    Nielsen's data network to learn which shows were
    hits and which flopped.
  • EVERY WEEK subscribers receive a detailed report
    chronicling how many Nielsen household viewers
    were watching television during any given quarter
    hour and how specific programs fared against
    their competition.

Source: Edgar W. Aust, senior vice president of
engineering and technology for Nielsen Media
Research in Dunedin, Fla.
6
Nielsen Data Collection
7
Nielsen Media Research
In 1936 engineer Arthur C. Nielsen, Sr., attended
a demonstration at the Massachusetts Institute of
Technology of a mechanical device that could keep
a record of the station to which a radio was
tuned at any given moment. Nielsen bought the
technology practically on the spot and six years
later launched the Nielsen Radio Index, which
analyzed the listening habits of 800 homes.
Later, he adapted the same technology to the new
medium of television, creating a ratings system
that nearly all American broadcasters use today
to help determine the popularity of their
programs. Over the years, Nielsen Media Research
has used several methods to collect viewing
information, including surveys and volunteer
diaries. In 1986 the company supplanted these
with an electronic device called a People Meter.
The meter is now connected to televisions and
telephone lines in about 5,000 households
throughout the U.S. Nielsen households are
selected from a sample that is statistically
representative of the television-viewing
population. Each household receives nominal
compensation--about $50 and occasional gifts--for
their cooperation. In order to keep the sample
representative, viewers can participate for only
two years. As they watch TV, volunteers press
buttons to indicate their presence. The People
Meter records the gender and age of each viewer,
as well as the time spent watching each channel
frequency. Every night the device transmits that
household's data by modem to Nielsen's central
computer in Florida, which assembles the data
into a ratings database. To meet the changing
needs of broadcasters and sponsors, the
technology continues to evolve. In 1986 Nielsen
introduced a system that uses computerized
pattern recognition to identify particular
commercials as they are broadcast. Future
versions of the People Meter now under
development will monitor codes embedded into
digital TV signals to verify which programs are
on the air. They will also use image-recognition
computers to identify viewers the moment they hit
the couch. Source: EDGAR W. AUST, senior vice
president of engineering and technology for
Nielsen Media Research in Dunedin, Fla.
8
How Accurate Are Statistical Samples?
  • Number of U.S. households in 1999: 103.9 million.
  • Number of households with TVs: 100 million.
  • Number of households in the Nielsen sample: about
    5,000. Nielsen samples about 1 out of every
    20,000 households.

9
Outline of Course Topics
  • Introduction: Data, Variables and Descriptive
    Measures (2 weeks)
  • Basic Probability Theory (2 weeks)
  • Probability Distributions (3 weeks)
  • Statistical Sampling and Estimation (2 weeks)
  • Hypothesis Testing (3 weeks)
  • Regression Analysis (3 weeks)

10
Definitions
  • Subjects: Persons or objects having specific
    characteristics we wish to measure.
  • Variable: A measurable characteristic of the
    subjects.
  • Population: Set of measurements from all
    subjects.
  • Sample: Selected subset of measurements.

11
Data and Variables
12
Levels of Measurement
  • Nominal Data: Numbers that label qualitative
    differences.
  • Example: Residence Variable
  • 1 = USA Citizen, but not CA Resident
  • 2 = USA Citizen, California Resident
  • 3 = Foreign Student
  • Ordinal Data: Assigned numbers that indicate
    rank order.
  • Example: Grade Points

13
Levels of Measurement
  • Interval Data -- Intervals between numbers can be
    compared, but not ratios (no natural zero point).
  • Examples: Calendar years, Fahrenheit
    temperatures
  • Ratio Data -- Ratios and intervals can be
    compared (data has a natural zero point).
  • Examples: Height, weight, length

14
Kinds of Variables
  • Discrete Variables: Values can be represented as
    separate, distinct points on a number line.
  • Example: The number of magazines subscribed to by
    a student. Possible values: 0, 1, 2, 3, ...
  • Continuous Variables: Possible values are
    represented as a continuum on a number line.
  • Examples: Measurements of height, length,
    weight, time.

15
Subscripts and Summations
  • A variable is a list of measurements or
    observations. A subscript identifies a
    particular observation in the list.
  • Examples: \(X_2\), \(X_7\), \(W_3\)
  • A summation sign (\(\sum\)) indicates addition.
  • Example: \(\sum X\) means the sum of all the values
    of X

16
Rules of Summation
  • \(\sum cX = c \sum X\)
  • \(\sum c = nc\)
  • \(\sum (X + Y) = \sum X + \sum Y\)
  • c = a constant
  • n = total number of observations
  • But note: \(\sum XY\) does not equal \(\sum X \cdot \sum Y\)
  • \(\sum X^2\) does not equal \((\sum X)^2\)
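
The rules above can be checked numerically. The following is a minimal sketch; the lists X and Y and the constant c are arbitrary illustrative values, not data from the course.

```python
# Illustrative values only (assumed, not from the slides).
X = [2, 5, 1, 4]
Y = [3, 0, 6, 2]
c = 10
n = len(X)

# Rule 1: sum of cX equals c times the sum of X.
rule1 = sum(c * x for x in X) == c * sum(X)

# Rule 2: summing the constant c over n observations gives n*c.
rule2 = sum(c for _ in range(n)) == n * c

# Rule 3: sum of (X + Y) equals sum of X plus sum of Y.
rule3 = sum(x + y for x, y in zip(X, Y)) == sum(X) + sum(Y)

# The two cautions: these pairs are generally NOT equal.
sum_xy = sum(x * y for x, y in zip(X, Y))   # sum of products
prod_of_sums = sum(X) * sum(Y)              # product of sums
sum_sq = sum(x ** 2 for x in X)             # sum of squares
sq_of_sum = sum(X) ** 2                     # square of the sum

print(rule1, rule2, rule3)        # True True True
print(sum_xy, prod_of_sums)       # 20 132
print(sum_sq, sq_of_sum)          # 46 144
```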

17
Applying the Rules
18
Using the Rules of Summation
19
Calculating the Variance
20
Part 2: Descriptive Statistics
21
Descriptive Statistics
  • Measures of Location: Where are values located?
  • Measures of Variation: How spread out are the
    data?
  • Summarizing, Classifying, and Presenting Data

22
Measures of Location
  • Mean (or Average): \(\sum X / n\)
  • Known as \(\bar{X}\) for a sample.
  • Known as \(\mu\) for a population.
  • Median: the middle value, \(X_{(n+1)/2}\)
  • Mode: the most frequently observed value
  • Percentiles: show the position of a value
  • The pth percentile is the value such that at
    least p percent of all values in the data set are
    at or below it and at least (100 - p) percent are
    at or above it.
  • (Note: n denotes the sample size; N is the
    population size.)

23
Example: Ages of playground visitors in years.
Raw Data: 2,1,6,2,3,3,7,5,2,4,5,4,6,6,7,
6,3,2,3,6,3,5,6,5,6,2,7,3
Mean: \(\bar{X} = \sum X / n = 120 / 28 = 4.29\) years
Sorted Data: 1,2,2,2,2,2,3,3,3,3,3,3,4,4,5,5,5,5,6,6,6,6,6,6,6,7,7,7
Median: \(X_{(n+1)/2} = X_{14.5} = 4.5\) years
Mode: 6 years (with secondary mode 3)
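
As a quick check of the playground example, the three measures of location can be computed with Python's standard `statistics` module. This is only a sketch; the variable names are illustrative.

```python
import statistics

# Raw playground ages from the slide (28 observations).
ages = [2, 1, 6, 2, 3, 3, 7, 5, 2, 4, 5, 4, 6, 6, 7,
        6, 3, 2, 3, 6, 3, 5, 6, 5, 6, 2, 7, 3]

mean_age = sum(ages) / len(ages)        # sum of X divided by n
median_age = statistics.median(ages)    # averages the 14th and 15th sorted values
mode_age = statistics.mode(ages)        # most frequently observed value

print(round(mean_age, 2), median_age, mode_age)  # 4.29 4.5 6
```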
24
Estimating Percentiles
  • Arrange the data in ascending order.
  • Compute the subscript of the percentile: \(i =
    (p/100)n + 1/2\), where p is the percentile to be
    calculated and n is the number of data items.
  • If i is an integer, then the pth percentile is \(X_i\).
  • If i is not an integer, then interpolate between
    the two X values with subscripts just above and
    below i.

25
Examples: Calculating Percentiles
1,2,2,2,2,2,3,3,3,3,3,3,4,4,5,5,5,5,6,6,6,6,6,6,6,
7,7,7
Estimate the 75th percentile: \(i = (p/100)n + 1/2 = (75/100)(28) + 1/2 = 21.5\)
75th percentile \(\approx (X_{21} + X_{22})/2 = (6 + 6)/2 = 6\)
Estimate the 20th percentile: \(i = (p/100)n + 1/2 = (20/100)(28) + 1/2 = 6.1\)
20th percentile \(\approx X_6 + (0.1)(X_7 - X_6) = 2 + (0.1)(3 - 2) = 2.1\)
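
The percentile procedure can be sketched as a small function. The function name `percentile` is hypothetical, and clamping the subscript to the range 1..n for extreme values of p is an added assumption that the slides do not address. Other textbooks and libraries use slightly different interpolation rules, so results may differ from theirs.

```python
import math

def percentile(data, p):
    """Estimate the pth percentile with subscript i = (p/100)*n + 1/2."""
    xs = sorted(data)                 # step 1: ascending order
    n = len(xs)
    i = (p / 100) * n + 0.5           # step 2: subscript (1-based)
    i = min(max(i, 1), n)             # assumption: clamp to valid subscripts
    lower = math.floor(i)
    frac = i - lower
    if frac == 0:                     # integer subscript: take X_i directly
        return xs[lower - 1]
    # otherwise interpolate between X_lower and X_(lower+1)
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])

ages = [1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4,
        5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7]

p75 = percentile(ages, 75)   # subscript 21.5, halfway between X21 and X22
p20 = percentile(ages, 20)   # subscript 6.1, one tenth of the way from X6 to X7
print(p75, round(p20, 2))
```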
26
Other Common Terms
  • Quartiles
  • The 25th percentile is the first quartile
  • The 50th percentile is the second quartile
  • The 75th percentile is the third quartile
  • The 100th percentile is the fourth quartile
  • Deciles
  • The 10th percentile is the first decile
  • The 20th percentile is the second decile,
  • etc.

27
Measures of Variation
  • Range: highest minus lowest value
  • Example: Range of ages in playground = 7 years
    (oldest) - 1 year (youngest) = 6 years
  • Mean Absolute Deviation (MAD): \(\text{MAD} = \sum |X - \bar{X}| / n\)

The MAD is the average distance from the mean.
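
A minimal sketch of the MAD for the playground ages, taking "average distance from the mean" literally as the mean of the absolute deviations. The slide's own worked table is not in the transcript, so the result here is computed, not quoted.

```python
# Playground ages (28 observations) from the earlier example.
ages = [2, 1, 6, 2, 3, 3, 7, 5, 2, 4, 5, 4, 6, 6, 7,
        6, 3, 2, 3, 6, 3, 5, 6, 5, 6, 2, 7, 3]

n = len(ages)
mean_age = sum(ages) / n

# Average of the absolute distances from the mean.
mad = sum(abs(x - mean_age) for x in ages) / n

print(round(mad, 2))  # 1.64
```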
28
Calculating the MAD
29
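  • Variance
  • Known as \(s^2\) for a sample. Formula is

\(s^2 = \sum (X - \bar{X})^2 / (n - 1)\)
or
\(s^2 = \left[ \sum X^2 - (\sum X)^2 / n \right] / (n - 1)\)
  • Known as \(\sigma^2\) for a population. Formula is

\(\sigma^2 = \sum (X - \mu)^2 / N\)
or
\(\sigma^2 = \left[ \sum X^2 - (\sum X)^2 / N \right] / N\)
n is the sample size; N is the population size.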
30
Calculating the Variance
31
Using the Computational Formula for Calculating
the Variance
32
The Standard Deviation
  • The standard deviation is the square root of the
    variance. It is called s for a sample, or \(\sigma\) for
    a population.
  • For the example: \(s = \sqrt{3.4} = 1.84\)
  • One use of the standard deviation is the 3-Sigma
    Rule. This rule says that it is very unusual to
    find any observations in the data greater than
    the mean plus 3 times s, and also any
    observations less than the mean minus 3 times s.
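
The sample variance, standard deviation, and 3-Sigma Rule can be sketched for the playground ages as follows. The code computes the variance both from squared deviations and from an algebraically equivalent computational form (sum of squares minus the squared sum over n); the exact form shown on the original slide is not in the transcript, so that form is an assumption.

```python
import statistics

ages = [2, 1, 6, 2, 3, 3, 7, 5, 2, 4, 5, 4, 6, 6, 7,
        6, 3, 2, 3, 6, 3, 5, 6, 5, 6, 2, 7, 3]
n = len(ages)
mean_age = sum(ages) / n

# Definitional form: squared deviations from the mean, divided by n - 1.
s2_def = sum((x - mean_age) ** 2 for x in ages) / (n - 1)

# Computational form: uses only the sum of X and the sum of X squared.
s2_comp = (sum(x ** 2 for x in ages) - sum(ages) ** 2 / n) / (n - 1)

s = s2_def ** 0.5   # sample standard deviation

# 3-Sigma Rule: list any observations more than 3 s away from the mean.
unusual = [x for x in ages if abs(x - mean_age) > 3 * s]

print(round(s2_def, 2), round(s, 2), unusual)  # 3.4 1.84 []
```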

33
Grouping and Presenting Data
  • Frequency Distributions
  • Absolute Frequencies f(X)
  • Relative Frequencies p(X)
  • Cumulative Frequencies
  • Histograms and Frequency Curves
  • Calculating the mean, variance, and standard
    deviation with grouped data.

34
Definitions
  • Absolute Frequency f(X)
  • A count of the number of times that a particular
    value of the variable X occurs.
  • Relative Frequency p(X)
  • The fraction or percentage of times that a
    particular value of X occurs.
  • Histograms and Frequency Curves
  • Graphs of frequencies of X.

35
Example: Playground Data
Children's ages: 1,2,2,2,2,2,3,3,3,3,3,3,4,4,5,5,5,
5,6,6,6,6,6,6,6,7,7,7
36
Graphing f(X)
37
Graphing p(X)
38
Graphing f(X): Part Two
39
Graphing f(X): The Frequency Distribution Curve
40
Cumulative Frequencies
Cumulative absolute frequency measures the number
of subjects at or below the indicated value of
X. Cumulative relative frequency measures the
proportion (or percentage) of subjects at or
below the indicated value of X. It also gives an
estimate of the percentile.
41
Calculating the Mean of a Frequency Distribution
Using Absolute Frequencies
42
Calculating the Mean of a Frequency Distribution
Using Relative Frequencies
43
Calculating the Variance and Standard Deviation
  • Using absolute frequencies, f(X):
  • \(s^2 = \sum (X - \bar{X})^2 \cdot f(X) / (n - 1)\)
  • Using relative frequencies, p(X):
  • \(\sigma^2 = \sum (X - \mu)^2 \cdot p(X)\) (population formula)
  • Note: The standard deviation is, as before, the
    square root of the variance.
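
The grouped-data formulas can be sketched with the playground frequency distribution. The dictionary `freq` below is tallied from the 28 ages listed earlier; the variable names are illustrative.

```python
# Absolute frequencies f(X) for the playground ages (value -> count).
freq = {1: 1, 2: 5, 3: 6, 4: 2, 5: 4, 6: 7, 7: 3}
n = sum(freq.values())

# Mean from absolute frequencies: sum of X * f(X), divided by n.
mean = sum(x * f for x, f in freq.items()) / n

# Sample variance from absolute frequencies.
s2 = sum((x - mean) ** 2 * f for x, f in freq.items()) / (n - 1)

# Relative frequencies p(X) = f(X) / n.
rel = {x: f / n for x, f in freq.items()}

# Mean and population variance from relative frequencies.
mu = sum(x * p for x, p in rel.items())
sigma2 = sum((x - mu) ** 2 * p for x, p in rel.items())

print(n, round(mean, 2), round(s2, 2), round(sigma2, 2))  # 28 4.29 3.4 3.28
```

The population formula divides by n where the sample formula divides by n - 1, so it gives a slightly smaller value on the same data.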

44
Example: Calculation of Variance
Using relative frequencies: Variance \(\sigma^2 = \sum (X - \mu)^2 \cdot p(X)\)
Note: The difference between 3.34 and the earlier
calculation of 3.4 is due to round-off error and
the use of the population formula.
45
Price to Earnings Ratios: Natural Resources
Companies
Source: Business Week, March 16, 1987
46
Frequency Distribution of PE Ratios
47
Histogram for PE Ratios
48
Relative Frequency Curve for PE Ratios
Example of a positively skewed curve.
49
Weighted Averages
Where x represents the values of the variable and
w represents the weight on each value.
Formula: \(\bar{X}_w = \sum wx / \sum w\)
Example: Calculating GPA
Course      Units   Grade   Grade Points
Comp. Sci   2       C       2
English     5       A       4
Math        3       B       3
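
A minimal sketch of the GPA example as a weighted average, with units as the weights and grade points as the values. The tuple layout is an illustrative choice.

```python
# (course, units, grade points) from the slide's table.
courses = [("Comp. Sci", 2, 2), ("English", 5, 4), ("Math", 3, 3)]

total_units = sum(units for _, units, _ in courses)

# Weighted average: sum of (weight * value) divided by the sum of weights.
gpa = sum(units * points for _, units, points in courses) / total_units

# Unweighted average of grade points, for comparison.
simple_avg = sum(points for _, _, points in courses) / len(courses)

print(total_units, gpa, simple_avg)  # 10 3.3 3.0
```

The weighted GPA is higher than the simple average because the A was earned in the course with the most units.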
50
A Note on Transforming Variables
  • Suppose you have two variables, x and y, such
    that \(y = ax + b\). Then:

\(\bar{y} = a\bar{x} + b\)
\(\text{VAR}(y) = a^2 \, \text{VAR}(x)\)
\(\text{STD DEV}(y) = |a| \, \text{STD DEV}(x)\)
  • Example 1: The average wholesale price of a
    bottle of wine at Kermit's Restaurant is $6, with a
    standard deviation of $2. The retail price that the
    customer pays is equal to the wholesale price plus a
    markup of 150% plus a $5 corkage fee. What are the
    mean and standard deviation of the retail prices?

51
Transforming Variables -- Example 2
  • The average daily temperature in June in a
    particular location is 68°F with a standard
    deviation of 7°F. What are the average and
    standard deviation in degrees centigrade?
  • Note: \(C = (5/9)(F - 32)\)
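
Both transformation examples can be worked with one small helper. The function name `transform_mean_sd` is hypothetical. For the wine example, the sketch assumes the markup is 150 percent of the wholesale price, so that retail = 2.5 × wholesale + 5; that reading is an interpretation of the slide, not something it states outright.

```python
def transform_mean_sd(mean_x, sd_x, a, b):
    """Mean and standard deviation of y = a*x + b."""
    mean_y = a * mean_x + b      # the shift b moves the mean
    sd_y = abs(a) * sd_x         # the shift b does not change the spread
    return mean_y, sd_y

# Example 2: C = (5/9)(F - 32) = (5/9)F - 160/9, so a = 5/9 and b = -160/9.
mean_c, sd_c = transform_mean_sd(68, 7, 5 / 9, -160 / 9)

# Example 1 (assumed reading): retail = wholesale + 1.5*wholesale + 5.
mean_retail, sd_retail = transform_mean_sd(6, 2, 2.5, 5)

print(round(mean_c, 2), round(sd_c, 2))   # 20.0 3.89
print(mean_retail, sd_retail)             # 20.0 5.0
```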

52
Don't Jump to Conclusions!
  • Statistics can be misleading. For example,
    suppose the two companies below have equivalent
    benefit packages. Which one pays more?
  • Company A
  • Mean wage: $34,000 a year
  • Median wage: $40,000 a year
  • Modal wage: $40,000 a year
  • Company B
  • Mean wage: $30,000 a year
  • Median wage: $24,000 a year
  • Modal wage: $24,000 a year

53
Look for Reasons Behind Differences
Company A
Employee Classification   Number of Employees   Total Payroll   Average
White Collar              600                   $24 m           $40,000
Skilled Trade             200                   $6 m            $30,000
Unskilled                 200                   $4 m            $20,000
Total                     1000                  $34 m           $34,000
Company B
Employee Classification   Number of Employees   Total Payroll   Average
White Collar              200                   $8.8 m          $44,000
Skilled Trade             200                   $6.8 m          $34,000
Unskilled                 600                   $14.4 m         $24,000
Total                     1000                  $30.0 m         $30,000
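
The two payroll tables can be recomputed to show why the overall means point the opposite way from the group averages. This is a sketch; the dictionary layout and function name are illustrative.

```python
# classification -> (number of employees, average wage in dollars)
company_a = {"White Collar": (600, 40_000),
             "Skilled Trade": (200, 30_000),
             "Unskilled": (200, 20_000)}
company_b = {"White Collar": (200, 44_000),
             "Skilled Trade": (200, 34_000),
             "Unskilled": (600, 24_000)}

def overall_mean(groups):
    """Weighted average wage: total payroll divided by total employees."""
    employees = sum(count for count, _ in groups.values())
    payroll = sum(count * wage for count, wage in groups.values())
    return payroll / employees

mean_a = overall_mean(company_a)
mean_b = overall_mean(company_b)

# Company B pays more in every classification...
b_pays_more_everywhere = all(company_b[k][1] > company_a[k][1] for k in company_a)

# ...yet its overall mean is lower, because most of its staff is unskilled.
print(mean_a, mean_b, b_pays_more_everywhere)  # 34000.0 30000.0 True
```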
54
Part 3: Basic Probability Theory

55
Probability
  • Definition: A probability is a number between 0
    and 1, representing the likelihood that a given
    event will occur. Examples of events:
  • Coin flip comes up heads
  • Roll a 7 throwing dice
  • It will rain on Memorial Day
  • You have an auto accident next year
  • The economy is good next year
  • Your new business succeeds

56
Calculating Probabilities
  • Three Alternatives
  • The Classical Approach
  • Experience with Long-Run Percentages
  • Subjective Judgment

57
The Classical Approach
  • Works only if the results of a random action or
    experiment can be broken down into a collection
    of equally likely outcomes.
  • The collection of all possible outcomes is called
    the sample space.
  • An event is thought of as an outcome or set of
    outcomes (a subset of the sample space)

58
Classical Probabilities
  • Probability of an event is \(P(E) = F / T\)
  • F = Number of outcomes favorable to E
  • T = Total number of possible outcomes
  • Examples of events:
  • A = a coin flip comes up heads
  • B = at least one head comes up in two coin
    flips
  • C = a jack is drawn from a deck of cards
  • D = a roll of a pair of dice comes up 7
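
The four example events can be worked by enumerating equally likely outcomes and counting the favorable ones. This is a minimal sketch; the helper name `classical` is hypothetical, and a standard 52-card deck is assumed.

```python
from fractions import Fraction
from itertools import product

def classical(outcomes, favorable):
    """P(E) = F / T over a list of equally likely outcomes."""
    outcomes = list(outcomes)
    f = sum(1 for o in outcomes if favorable(o))
    return Fraction(f, len(outcomes))

# A: one coin flip comes up heads.
p_a = classical("HT", lambda o: o == "H")

# B: at least one head in two flips (sample space HH, HT, TH, TT).
p_b = classical(product("HT", repeat=2), lambda o: "H" in o)

# C: a jack is drawn from a standard 52-card deck (assumed).
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
p_c = classical(product(ranks, suits), lambda card: card[0] == "J")

# D: a pair of dice sums to 7 (36 equally likely ordered rolls).
p_d = classical(product(range(1, 7), repeat=2), lambda roll: sum(roll) == 7)

print(p_a, p_b, p_c, p_d)  # 1/2 3/4 1/13 1/6
```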

59
Outcomes of Rolling Dice
60
Long-Run Percentages
  • Works if the action or experiment of interest is
    repeatable under the same general conditions
  • Probability of an event is \(P(E) = n / N\)
  • N = Total number of trials
  • n = Number of times the event E occurs
  • Examples
  • Flipping a bottle cap
  • Probability of an auto accident
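
The long-run approach can be illustrated by simulation. The sketch below substitutes a repeatable experiment with a known answer (rolling a 7 with two dice) for the bottle cap, since the cap's true probability is unknown. The function names, the seed, and the number of trials are all illustrative assumptions.

```python
import random

def long_run_frequency(trial, n_trials, seed=0):
    """Estimate P(E) as n / N: occurrences of E over N repeated trials."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    occurrences = sum(1 for _ in range(n_trials) if trial(rng))
    return occurrences / n_trials

def roll_seven(rng):
    """One trial: roll two dice and report whether they sum to 7."""
    return rng.randint(1, 6) + rng.randint(1, 6) == 7

estimate = long_run_frequency(roll_seven, 100_000)
print(estimate, 1 / 6)   # the estimate should be close to the classical 1/6
```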

61
Subjective Probabilities
  • A resort to human judgment -- the degree of
    belief of the decision maker. Examples:
  • The economy is good next year
  • Next year's sales will be high
  • A new TV show will be a hit
  • Used often in decision analysis, strategic
    decision making

62
Ad Campaign Example
  • A company sent out a questionnaire to 200
    consumers to study the effects of a recent
    advertising campaign. Results are:
  • 80 people saw the ad (event A)
  • 60 people bought the product (event B)
  • 40 people saw the ad, but did not purchase the
    product
  • Calculation of probabilities?

63
Organizing Ad Campaign Data
Joint Probability Table
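
The slide's table itself is not in the transcript, but it can be rebuilt from the three survey counts. The sketch below derives the remaining cells by subtraction; the variable names are illustrative.

```python
from fractions import Fraction

total = 200
saw = 80                 # event A
bought = 60              # event B
saw_not_bought = 40      # A and not B

# Fill in the remaining cells by subtraction.
saw_bought = saw - saw_not_bought                    # A and B
not_saw_bought = bought - saw_bought                 # not A and B
not_saw_not_bought = total - saw - not_saw_bought    # not A and not B

joint = {
    ("A", "B"): Fraction(saw_bought, total),
    ("A", "not B"): Fraction(saw_not_bought, total),
    ("not A", "B"): Fraction(not_saw_bought, total),
    ("not A", "not B"): Fraction(not_saw_not_bought, total),
}

for cell, p in joint.items():
    print(cell, p, float(p))
```

The four joint probabilities (0.2, 0.2, 0.1, 0.5) sum to 1, and the row and column totals recover P(A) = 0.4 and P(B) = 0.3.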
64
Combining Events
  • Intersection of events
  • Written (A and B), \((A \cap B)\), or (A, B).
  • The situation where both events occur. If \(C =
    (A \cap B)\), then C represents the event that both A
    and B occur.
  • Union of events
  • Written (A or B), \((A \cup B)\).
  • The situation where either event occurs (or
    both). If \(D = (A \cup B)\), then D is the event that
    either A or B or both occur.
  • Which is greater, P(C) or P(D)?

65
Ad Campaign Example (cont'd)
  • Write out the following events in symbols and
    find the probabilities:
  • E = Saw the ad and also purchased the product
  • F = Did not see the ad, but did purchase the
    product
  • G = Either saw the ad or purchased the product
    (or both)
  • H = Did not see the ad and did not purchase the
    product
  • J = Either did not see the ad, or did not
    purchase the product, or both

66
Conditional Probability
  • Conditional Probability -- The likelihood of an
    event, given the occurrence of another event.
    Written, for example, \(P(A \mid B)\) -- the probability of
    A given that the event B occurs.
  • Revising the probability of A after finding out
    that B has occurred.

67
Ad Campaign Example (cont'd)
  • Write out the following in symbols and find the
    probabilities
  • Suppose it is known that someone bought the
    product. Now what is the probability that s/he
    saw the ad?
  • If someone has seen the ad, what is the
    probability that s/he has also bought the
    product?
  • What is the chance that someone who has not seen
    the ad will buy the product?
  • Among the people who have not bought the product,
    what proportion has not seen the ad?
  • Given that a person saw the ad, what are the
    chances that the person did not buy the product?
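
The five questions above are all conditional probabilities, each a joint probability divided by the probability of the given event. The following sketch uses the joint probabilities implied by the survey counts (40, 40, 20, and 100 people out of 200); the helper names `prob` and `conditional` are hypothetical.

```python
from fractions import Fraction

# Joint probabilities: A = saw the ad, B = bought the product.
joint = {
    ("A", "B"): Fraction(40, 200),
    ("A", "not B"): Fraction(40, 200),
    ("not A", "B"): Fraction(20, 200),
    ("not A", "not B"): Fraction(100, 200),
}

def prob(*events):
    """Probability that all the named events occur together."""
    return sum(p for cell, p in joint.items() if all(e in cell for e in events))

def conditional(target, given):
    """P(target | given) = P(target and given) / P(given)."""
    return prob(target, given) / prob(given)

q1 = conditional("A", "B")           # saw the ad, given bought
q2 = conditional("B", "A")           # bought, given saw the ad
q3 = conditional("B", "not A")       # bought, given did not see the ad
q4 = conditional("not A", "not B")   # did not see, given did not buy
q5 = conditional("not B", "A")       # did not buy, given saw the ad

print(q1, q2, q3, q4, q5)  # 2/3 1/2 1/6 5/7 1/2
```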

68
Some Definitions
  • Two events A and B are mutually exclusive if the
    occurrence of one automatically rules out the
    occurrence of the other.
  • A group of events A, B, C, ... are collectively
    exhaustive if, taken together, they cover all the
    possible outcomes of an action or experiment.
  • Two events A and B are statistically independent
    if the occurrence of one does not affect the
    probability that the other occurs; this means
    that \(P(A) = P(A \mid B)\) and \(P(B) = P(B \mid A)\).

69
Note on checking for independence
  • Check if either
  • \(P(A) = P(A \mid B)\) or \(P(B) = P(B \mid A)\)
  • If either one is true, then A and B are
    independent -- no need to check both.
  • If either is false, then A and B are not
    independent.

70
Multiplication Rule for Probability
The rule: \(P(A \cap B) = P(A) \cdot P(B \mid A)\)
Dividing by P(A), we obtain an alternate formula: \(P(B \mid A) = P(A \cap B) / P(A)\)
Note: If A and B are independent, then \(P(A \cap B) = P(A) \cdot P(B)\)
71
Note on Multiplication Rule
  • Switching the roles of A and B, note that \((A \cap
    B) = (B \cap A)\); thus \(P(A \cap B) = P(B) \cdot
    P(A \mid B)\)
  • Dividing by P(B), we obtain \(P(A \mid B) = P(A \cap B) / P(B)\)

72
Addition Rule for Probability
  • The Rule:
  • \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\)
  • Note: If A and B are mutually exclusive, \(P(A \cup
    B) = P(A) + P(B)\)
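
The multiplication rule, the addition rule, and the independence check can all be verified against the ad campaign counts. The sketch assumes the survey figures given earlier (200 consumers, 80 saw the ad, 60 bought, 40 saw it without buying, so 40 both saw and bought); the variable names are illustrative.

```python
from fractions import Fraction

total, saw, bought, saw_and_bought = 200, 80, 60, 40

p_a = Fraction(saw, total)                 # P(A): saw the ad
p_b = Fraction(bought, total)              # P(B): bought the product
p_a_and_b = Fraction(saw_and_bought, total)

# Multiplication rule: P(A and B) = P(A) * P(B | A),
# with P(B | A) counted directly among the 80 people who saw the ad.
p_b_given_a = Fraction(saw_and_bought, saw)
multiplication_holds = (p_a * p_b_given_a == p_a_and_b)

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B),
# checked against a direct head count of people in A or B.
p_a_or_b = p_a + p_b - p_a_and_b
addition_holds = (p_a_or_b == Fraction(saw + bought - saw_and_bought, total))

# Independence check: A and B are independent only if P(B | A) equals P(B).
independent = (p_b_given_a == p_b)

print(p_a_or_b, multiplication_holds, addition_holds, independent)
# 1/2 True True False
```

Since P(B | A) = 1/2 while P(B) = 3/10, seeing the ad and buying the product are not independent in this sample.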

73
Solving Probability Problems
  • There is no general method. Some general
    guidelines are:
  • Given a random action or experiment, work out
    probabilities by laying out all the possible
    outcomes and applying the classical rule.
  • Given some probabilities, work out others by
    applying the multiplication and/or addition
    rules. Often joint probability tables or
    probability trees are useful for organizing the
    information.

74
Problem Solving Aids
  • Probability Trees.
  • Joint Probability Tables.
  • Venn Diagrams.
  • Formulas for Counting Outcomes
  • Multiplication rule for counting
  • Formulas for permutations and combinations

75
Venn Diagram for Ad Campaign Example
76
Probability Tree for Ad Campaign Example
77
Flipping the Tree
78
Tables vs. Trees
  • Given some simple probabilities and joint
    probabilities, the information can often be
    organized in a joint probability table and used
    to derive other probabilities.
  • A probability tree is often helpful if
  • You have some simple probabilities and a series
    of independent events or
  • You have some simple and some conditional
    probabilities with a sequence of events.

79
Example Exam Problem
A small brewery has two bottling machines.
Machine A produces 40 percent of the bottles;
Machine B produces the rest. On average, one out
of every 20 bottles produced by Machine A is
rejected for some reason, while one out of every
five bottles produced by Machine B is rejected.
a. (1 pt) Construct a complete probability tree for this problem to help answer the following questions. Label all the branches and show all the relevant probabilities.
b. (1 pt) Suppose a bottle is selected at random. What is the probability that it will be rejected?
c. (1 pt) Given that a bottle is rejected, what is the probability that it was produced by Machine B?
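
One way to work parts b and c is to multiply along each branch of the probability tree and then "flip" the tree by dividing a joint probability by the total. This is a sketch of one possible solution, not the instructor's answer key; the dictionary names are illustrative.

```python
from fractions import Fraction

# First-level branches: which machine produced the bottle.
prior = {"A": Fraction(40, 100), "B": Fraction(60, 100)}

# Second-level branches: P(rejected | machine).
p_reject_given = {"A": Fraction(1, 20), "B": Fraction(1, 5)}

# Multiply along each branch to get P(machine and rejected).
joint = {m: prior[m] * p_reject_given[m] for m in prior}

# Part b: add the branches that end in "rejected".
p_reject = sum(joint.values())

# Part c: P(B | rejected) = P(B and rejected) / P(rejected).
p_b_given_reject = joint["B"] / p_reject

print(p_reject, float(p_reject))   # 7/50 0.14
print(p_b_given_reject)            # 6/7
```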
80
Example Exam Problem
According to recent political polls (S.F.
Chronicle 2/5/98), 67 percent of Americans
approve of President Clinton's job performance
(event A); 53 percent think that he had an affair
with Monica Lewinsky (event M). Further analysis
shows that 25 percent think that Clinton had an
affair with Lewinsky and still approve of his job
performance.
a. (2 pts) Construct a 2 by 2 table to help answer the following questions.
b. (1 pt) What proportion of Americans does not think that President Clinton had an affair with Lewinsky, yet still does not approve of his job performance?
c. (1 pt) Given that an American thinks Clinton had an affair with Lewinsky, what is the probability that s/he approves of Clinton's job performance?
d. (1 pt) Find the probability that an American either does not approve of Clinton's job performance or thinks that he had an affair with Lewinsky, or both.
e. (1 pt) Is an American's opinion about the Lewinsky affair statistically independent from his/her opinion about Clinton's job performance? (Show your reasoning numerically.)
81
Example Exam Problem
The director of a large employment agency wishes
to study various characteristics of its job
applicants. In a sample of 8 applicants, 2
applicants have had their current jobs for at
least five years and 3 of the applicants are
college graduates; only one of the applicants who
have not graduated from college has had his/her
current job for at least five years.
a. (2 ½ pts) Suppose an applicant from this group is chosen at random. Find the following probabilities:
(1) The applicant is a college graduate.
(2) The applicant is a college graduate who has held his/her current job for at least five years.
(3) The applicant is either not a college graduate or someone who has not held his/her current job for at least five years.
(4) That a college graduate has held his/her job for at least five years.
(5) That someone who has held his/her job for at least five years is a college graduate.
b. (1 pt) Determine numerically whether being a college graduate and holding the current job for at least 5 years are statistically independent.
c. (1 ½ pt) Do you think that college graduates are more likely to remain in a job longer? BRIEFLY explain using appropriate probabilities.
82
Example Exam Problem
In drilling for oil in the Arctic, only 10
percent of the holes are "gushers" (very
successful wells), 40 percent are "squirters"
(moderately successful wells), and the rest are
dry holes. Before drilling, however, a geological
test can be done. The test is not perfect: the
probability of a favorable test result is 0.85
when a potential site will in fact be a gusher.
The test has a 60 percent chance of a favorable
result when a site will be a squirter and a 90
percent chance of an unfavorable result when the
site will in fact turn out to be a dry hole.
a. Draw a probability tree to represent this problem.
b. What is the probability that the test will be favorable?
c. If the test is favorable, what is the probability of a gusher?
d. If the test is unfavorable, what is the probability of a dry hole?
e. If the test is unfavorable, what is the probability of a squirter?
83
Example Exam Problem
  • The personnel manager at Megacorp classifies job
    applicants as either qualified or unqualified for
    the jobs they seek. The manager says that only
    25% of the job applicants are qualified. In this
    pool of qualified applicants, 20% list high
    school as their highest level of education, 50%
    list college, and the remaining 30% list trade
    school. The situation is different among the
    unqualified applicants: 40% of them list high
    school as their highest level of education,
    another 40% list trade school, and only 20% list
    college.
  • a. Draw an appropriate probability tree diagram of
    this situation, clearly labeling the event
    represented by each branch and its probability.
  • b. What is the probability that an applicant is both
    qualified and a trade school graduate?
  • c. What is the unconditional probability that an
    applicant comes from a trade school?
  • d. What is the probability that an applicant is
    qualified, given that s/he has a trade school
    education?

84
Practice Problem
85
Practice Problem: Independent Events 1
86
Practice Problem: Independent Events 2