Title: QUANTITATIVE METHODS 1
1SAMIR K. SRIVASTAVA
2Quantitative Analysis for Management - I
- Summary Descriptive Statistics
- Collection, Presentation and Analysis of Data 1
- Measures of Central Tendency and Variability 1
- Probability Theory 2
- Probability Distributions 2
- Decision Theory 2
- Correlation and Regression Analysis 2
- Quizzes 3-5
- Assignments/Attendance
3Statistics
- Coined by German scholar G. Achenwall.
- Two meanings
- (plural noun) Numerical data arising in any field
- Share prices, food grain production, road accidents, employee absenteeism.
- (singular noun) Body of scientific methods for collection, analysis and interpretation of numerical data.
- As managers, why should we be concerned with this field?
Decision
4 A Few Management Applications
- Selecting locations for new stores, restaurants, warehouses, fire stations etc.
- Deciding whether to continue or cancel a new television series.
- Forecasting whether an economic recession is imminent.
- Negotiating a labour contract.
- Setting up airline schedules.
- Establishing sales quotas for regional sales territories.
- Determining premium for different insurance covers.
- Selecting appropriate media for advertising.
- Planning production schedules and raw material purchases.
- Make or buy decisions.
- For management reporting.
5Descriptive Statistics
A set of facts expressed in quantitative form.
It includes procedures used to summarize masses
of data and present these in an understandable
way.
It includes computation of specialized averages,
measures of dispersion and other measures which
aid in making decisions.
6Data Summarization Tally Sheet and Frequency
Distribution
15,12,14,17,12,16,13,17,6,13,16,24,15,22,11,16,14,
23,12,15,16,12,7,27,15,14,13,14,17,10
Class    Tally marks             Freq.   Relative frequency (%)   Class mark
6-10     |||                     3       10.0                     8
11-15    ||||| ||||| ||||| |     16      53.3                     13
16-20    ||||| ||                7       23.3                     18
21-25    |||                     3       10.0                     23
26-30    |                       1       3.3                      28
Total                            30
Class limits: the lower limit (LL) and upper limit (UL) of a class, e.g. 6 and 10
Class width = 5
Class mark = (LL + UL)/2
How many classes? Between 5 and 20
Array: a data set sorted by size
Stem and Leaf Table: a convenient way to manually sort the data
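The tally above can be reproduced with a short Python sketch. The helper name `frequency_table` and its parameters (lowest class limit, class width, number of classes) are assumptions made for illustration; the data are the 30 values listed on this slide.

```python
from collections import Counter

data = [15, 12, 14, 17, 12, 16, 13, 17, 6, 13, 16, 24, 15, 22, 11, 16, 14,
        23, 12, 15, 16, 12, 7, 27, 15, 14, 13, 14, 17, 10]

def frequency_table(values, lowest=6, width=5, n_classes=5):
    """Return (lower, upper, class_mark, freq, rel_freq_percent) per class."""
    # Map each integer value to its class index, e.g. 6..10 -> 0, 11..15 -> 1.
    counts = Counter((v - lowest) // width for v in values)
    rows = []
    for k in range(n_classes):
        lower = lowest + k * width
        upper = lower + width - 1
        mark = (lower + upper) / 2          # class mark = (LL + UL) / 2
        freq = counts.get(k, 0)
        rows.append((lower, upper, mark, freq, 100 * freq / len(values)))
    return rows

table = frequency_table(data)
for lower, upper, mark, freq, rel in table:
    print(f"{lower}-{upper}  mark={mark}  freq={freq}  rel={rel:.1f}%")
```

The integer division only works as written for whole-number data and equal class widths; unequal classes would need explicit class limits.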
7Data Presentation
Histogram / Bar Chart
Pie Chart
8Stem and Leaf Diagram
- R&D expenditure for fifteen largest automakers as percentage of sales
- 4.4, 3.6, 4.4, 3.7, 9.6, 3.9, 3.6, 3.5, 3.0, 4.5, 5.1, 3.9, 2.4, 6.3, 5.4
- Each data value has two digits.
- Define classes based on the first digit.
- Rather than the tally mark, use the second digit to record the data in frequency table.
Stem | Leaves (in order of appearance)
2 | 4
3 | 6 7 9 6 5 0 9
4 | 4 4 5
5 | 1 4
6 | 3
7 |
8 |
9 | 6
Stem | Leaves (sorted)
2 | 4
3 | 0 5 6 6 7 9 9
4 | 4 4 5
5 | 1 4
6 | 3
7 |
8 |
9 | 6
- Original data values are preserved in the display
- Easy to sort the data manually
- Pictorial view, akin to a bar chart, is superimposed on numerical data
- Easy to generate by computer
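As a rough illustration of how such a display can be generated by computer, here is a minimal Python sketch. The function name `stem_and_leaf` is hypothetical, and the code assumes every value has exactly one decimal place, as in the R&D data on this slide.

```python
from collections import defaultdict

rd_percent = [4.4, 3.6, 4.4, 3.7, 9.6, 3.9, 3.6, 3.5, 3.0, 4.5,
              5.1, 3.9, 2.4, 6.3, 5.4]

def stem_and_leaf(values):
    """Map each stem (first digit) to its sorted leaves (second digit)."""
    leaves = defaultdict(list)
    for v in values:
        tenths = round(v * 10)          # 4.4 -> 44, avoids float digit errors
        stem, leaf = divmod(tenths, 10)
        leaves[stem].append(leaf)
    # Include empty stems so the shape of the distribution stays visible.
    return {s: sorted(leaves[s]) for s in range(min(leaves), max(leaves) + 1)}

display = stem_and_leaf(rd_percent)
for stem, leaf_list in display.items():
    print(f"{stem} | {' '.join(str(d) for d in leaf_list)}")
```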
9Density Histogram
- A frequency distribution is given, but class widths are not equal.
- How to draw a histogram? Plot density (relative frequency divided by class width) so that the area of each bar represents its relative frequency.
Income     Rel. Freq.   Density
0-60       0.10         0.0017
60-80      0.12         0.0060
80-100     0.15         0.0075
100-125    0.20         0.0080
125-150    0.16         0.0064
150-200    0.12         0.0024
200-250    0.10         0.0020
250-300    0.05         0.0010
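The density figures can be checked with a few lines of Python. This is only a sketch: the tuple layout (lower limit, upper limit, relative frequency) and the name `densities` are choices made here, not part of the slide.

```python
classes = [(0, 60, 0.10), (60, 80, 0.12), (80, 100, 0.15), (100, 125, 0.20),
           (125, 150, 0.16), (150, 200, 0.12), (200, 250, 0.10), (250, 300, 0.05)]

def densities(rows):
    """Density of each class = relative frequency / class width."""
    return [rel / (upper - lower) for lower, upper, rel in rows]

heights = densities(classes)
# The bar areas (height * width) add back up to the total relative frequency.
total_area = sum(h * (u - l) for h, (l, u, _) in zip(heights, classes))
for (l, u, _), h in zip(classes, heights):
    print(f"{l}-{u}  density={h:.4f}")
print(f"total area = {total_area:.2f}")
```

Because bar height times class width equals relative frequency, the total area of a density histogram is 1, which is what makes bars of different widths comparable.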
10Frequency Polygon
Represent each class with its class mark.
Convert the bar chart into a graph. How?
[Figure: frequency polygon plotted over the class marks 3, 8, 13, 18, 23, 28, 33]
Useful for comparing two distributions. It is easy to superimpose two or more polygons, unlike the bar charts.
11Area under frequency polygon? Does it mean
anything?
Area = fraction of values between $x_1$ and $x_2$
Most of the time, we'll use a smooth curve as an idealized distribution.
Total area under the curve = 1.0
Curve described by a mathematical function to make it amenable to mathematical analysis
[Figure: smooth distribution curve over an x-axis running from 0 to 7]
12A Very Special Distribution NORMAL
Has a symmetrical bell-shaped curve.
Extends indefinitely in both directions.
Described by a nice mathematical function.
Many real-life data sets very closely follow this pattern. (Hence the name Normal Distribution)
13Skewness
- A distribution is called skewed if it is not symmetrical.
- Tail is longer in one direction than the other.
Symmetrical
Right Skewed
Left Skewed
14Cumulative Frequency
Class    Cum. Freq. (less than)   Cum. Freq. (more than)
6-10     0.100                    1.000
11-15    0.633                    0.900
16-20    0.866                    0.366
21-25    0.966                    0.133
26-30    1.000                    0.033
Cumulative Freq. Polygon: Less than Ogive, More than Ogive
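The two cumulative columns can be derived from the class frequencies. The sketch below assumes the frequencies 3, 16, 7, 3, 1 from the earlier tally sheet, which reproduce the figures on this slide; note that the slide truncates 0.8667 and 0.3667 to 0.866 and 0.366, whereas rounding gives 0.867 and 0.367.

```python
from itertools import accumulate

freqs = [3, 16, 7, 3, 1]            # classes 6-10, 11-15, 16-20, 21-25, 26-30
n = sum(freqs)

# "Less than" ogive: running total up to and including each class.
less_than = [c / n for c in accumulate(freqs)]
# "More than" ogive: running total from each class to the last one.
more_than = [c / n for c in accumulate(reversed(freqs))][::-1]

for lt, mt in zip(less_than, more_than):
    print(f"{lt:.3f}  {mt:.3f}")
```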
15Practice Questions
The table below shows the percentage distribution of total income of males 18 years old or over in India in 2003. Using the table, answer the following.
Income (Rs '00)    % of Males
Under 1000         17.2
1000-1999          11.7
2000-2999          12.1
3000-3999          14.8
4000-4999          15.9
5000-5999          11.9
6000-9999          12.7
10000 and above    3.6
- What is the width or size of the second class interval? The seventh class interval?
- How many different class interval sizes are there?
- How many open class intervals are there?
16Practice Questions
- How should the first class interval be written so that its class width will equal that of the second class interval?
- What is the class mark of the second class interval? The seventh class interval?
- What are the class boundaries of the fourth class interval?
- What percentage of the males earn over Rs 400000? Under Rs 300000?
- What percentage of the males earn at least Rs 300000 but not more than Rs 500000?
- What percentage of the males earn between Rs 300000 and Rs 630000?
- Why don't the percentages total 100?
17Practice Questions
The data relating to sales of 100 companies is given below
Sales (Rs Lacs)   No. of Companies
5-10              5
10-15             12
15-20             13
20-25             20
25-30             18
30-35             15
35-40             10
40-45             7
- Draw less than and more than ogives.
- Determine the number of companies whose sales are
- Less than Rs 13 lacs,
- More than Rs 36 lacs, and
- Between Rs 13 and 36 lacs
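Reading a value off an ogive amounts to linear interpolation within a class. The following sketch shows one way to do that numerically, under the assumption that sales are spread evenly within each class; the function name `count_below` is hypothetical, and the results are estimates rather than exact counts.

```python
bounds = [5, 10, 15, 20, 25, 30, 35, 40, 45]     # class boundaries (Rs lacs)
freqs = [5, 12, 13, 20, 18, 15, 10, 7]           # number of companies

def count_below(x):
    """Estimated number of companies with sales below x (less-than ogive)."""
    total = 0.0
    for lower, upper, f in zip(bounds, bounds[1:], freqs):
        if x >= upper:
            total += f                                   # whole class lies below x
        elif x > lower:
            total += f * (x - lower) / (upper - lower)   # part of the class
    return total

below_13 = count_below(13)
above_36 = sum(freqs) - count_below(36)
between = count_below(36) - count_below(13)
print(round(below_13, 1), round(above_36, 1), round(between, 1))
```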
18Measures of Central Tendency
- Data is clustered around a central point.
- Let the central point represent the data set.
- What is the most appropriate value for that central point?
Some desirable properties:
- Easy to understand
- Simple to Compute
- Based on all observations
- Uniquely defined
- Capable of further algebraic treatment
- Not unduly affected by extreme values
19Arithmetic Mean
- First thing that comes to mind is AVERAGE or MEAN
- Let $x_i$ = value of the $i$th data point, and let $n$ be the number of data points
- Mean $\bar{x} = \sum x_i / n$
- Also called Arithmetic Mean
- Other measures of central tendency: Median, Mode
20Properties of Arithmetic Mean
- Total value property: Mean times $n$ equals the sum of all observations.
- (Avg. daily production) $\times$ (No. of days) = Total Production
- MARGINAL ADDITION
- Add a new item with $x_i > \bar{x}$. How will it affect the mean?
- It pulls the average up.
- What if $x_i$ is less than $\bar{x}$? The average is pushed down.
- The sum of deviations from the mean is always zero.
- The sum of squared deviations is minimum when taken about the mean.
- COMBINATION
- Arithmetic means of several sets of data may be
combined.
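These properties are easy to verify numerically. The sketch below uses made-up production figures and group sizes (all hypothetical), and the helper `combined_mean` is a name chosen here for combining group means weighted by group size.

```python
daily_output = [3, 6, 4, 7, 5]         # hypothetical daily production figures

def mean(values):
    return sum(values) / len(values)

m = mean(daily_output)
total_check = m * len(daily_output)                 # total value property
deviation_sum = sum(x - m for x in daily_output)    # sum of deviations
raised = mean(daily_output + [30])                  # marginal addition above the mean

def combined_mean(means, sizes):
    """Mean of the pooled data, given each group's mean and size."""
    return sum(mu * n for mu, n in zip(means, sizes)) / sum(sizes)

pooled = combined_mean([5.0, 8.0], [5, 10])         # hypothetical groups
print(m, total_check, deviation_sum, raised, pooled)
```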
21Properties of Arithmetic Mean
- Outliers ??
- 3, 6, 4, 7, 5, 30
- 33, 42, 39, 45, 37, 28, 8
- Outliers may significantly affect the value of arithmetic mean.
- It is a common practice to identify and remove outliers from the dataset.
- Weighted Average: Combining several averages into one.
- $\mu = (\sum w_i \mu_i) / \sum w_i$
- Geometric Mean = (Product of all values)$^{1/n}$
- Useful when calculating average percentage change in some variable over time.
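A small Python sketch of these ideas follows. The outlier comparison uses the first data set on this slide; the growth factors (a doubling followed by a halving) are a hypothetical example, and the function names are illustrative.

```python
import math

def weighted_mean(values, weights):
    """Sum of w_i * value_i divided by the sum of the weights."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))

# Effect of the outlier 30 on the arithmetic mean.
with_outlier = sum([3, 6, 4, 7, 5, 30]) / 6
without_outlier = sum([3, 6, 4, 7, 5]) / 5

# Hypothetical growth factors: +100% in one period, then -50% in the next.
avg_factor = geometric_mean([2.0, 0.5])

print(round(with_outlier, 2), without_outlier)
print(weighted_mean([10, 20], [1, 3]), avg_factor)
```

In the growth example the arithmetic average of +100% and -50% would suggest a 25% gain per period, while the geometric mean correctly reports a factor of 1.0, that is, no net change.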
22Median
- Middle value
- Half the data items have value below the median, and half above.
- Arrange $n$ observations in increasing order.
- If $n$ is odd, the $((n+1)/2)$th observation is the median.
- If $n$ is even, median = average of the $(n/2)$th and $(n/2+1)$th observations.
- 1, 3, 4, 9, 12, 17, 22: $n = 7$, $(n+1)/2 = 4$, Median = $x_4 = 9$
- 1, 3, 4, 9, 12, 17: $n = 6$, $n/2 = 3$, Median = $(x_3 + x_4)/2 = (4 + 9)/2 = 6.5$
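The odd and even cases translate directly into code. This is a minimal sketch of the rule on this slide; Python's standard library offers `statistics.median`, which behaves the same way for these inputs.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]            # the ((n+1)/2)th observation, counting from 1
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_case = median([1, 3, 4, 9, 12, 17, 22])
even_case = median([1, 3, 4, 9, 12, 17])
print(odd_case, even_case)
```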
23Approximating the Median for Grouped Data
Class          Freq.   Cum. Freq.
0.00-4.99      16      16
5.00-9.99      35      51
10.00-14.99    42      93
15.00-19.99    39
20.00-24.99    27
$n = 159$
- Can we find the precise value of the Median? Why not?
- Median is $x_{80}$. Which class is it in?
- What should be taken as the numerical value of the Median?
- Assume that 42 data points are evenly spread over the class width of 5 units. Then what is the location of 29th (?) data item?
- $10.00 + 29 \times 5/42 = 13.45$ (approx.).
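The interpolation can be sketched in Python as follows. It assumes, as the slide does, that values are evenly spread within the median class and that the median is the $((n+1)/2)$th item; the function name and argument layout are illustrative.

```python
def grouped_median(lower_limits, freqs, width):
    """Interpolate the median, assuming values are evenly spread within a class."""
    n = sum(freqs)
    position = (n + 1) / 2                 # x80 when n = 159
    cumulative = 0
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f >= position:
            # Items still needed inside this class, scaled to the class width.
            return lower + (position - cumulative) * width / f
        cumulative += f
    raise ValueError("empty frequency table")

estimate = grouped_median([0.0, 5.0, 10.0, 15.0, 20.0], [16, 35, 42, 39, 27], 5)
print(round(estimate, 2))
```

Here 51 items lie below the median class, so the median is the 80 - 51 = 29th of the 42 items in the class that starts at 10.00.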
24Properties of Median
- Not affected significantly by outliers.
- Sum of absolute deviations about a median is minimum.
- Easy to determine and explain.
- May be computed for open-end distributions.
- Positional average
- Not capable of algebraic treatment
25Percentiles
- A value is called the $p$th percentile of a data set if $p$% of the data points lie below that value.
- What percentile is the Median? The 50th percentile.
- First Quartile $Q_1$ = 25th percentile
- Second Quartile $Q_2$ = 50th percentile (same as Median)
- Third Quartile $Q_3$ = 75th percentile
- 12, 14, 14, 15, 17, 18, 18, 19, 19, 19, 21, 23, 26, 30, 35, 42
- $N = 16$
- $Q_1 = 16$, $Q_2 = 19$, $Q_3 = 24.5$
- Box and Whisker Plot
- Inter-quartile range IQR = $Q_3 - Q_1 = 8.5$
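Several quartile conventions are in use, and the slide does not say which one it follows. The sketch below assumes the "median of each half" convention, which reproduces the quoted values; interpolating methods such as those in `statistics.quantiles` can give slightly different answers for the same data.

```python
from statistics import median

def quartiles(values):
    """Q1, Q2, Q3 taken as the medians of the lower half, all data, upper half."""
    ordered = sorted(values)
    half = len(ordered) // 2
    if half == 0:
        raise ValueError("need at least two values")
    lower = ordered[:half]
    upper = ordered[-half:]     # for odd n the middle value is left out of both halves
    return median(lower), median(ordered), median(upper)

data = [12, 14, 14, 15, 17, 18, 18, 19, 19, 19, 21, 23, 26, 30, 35, 42]
q1, q2, q3 = quartiles(data)
iqr = q3 - q1
print(q1, q2, q3, iqr)
```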
26Determining Percentiles Graphically
[Figure: ogive with percentage cumulative frequency on the vertical axis and the X variable on the horizontal axis]
27The Mode
- French word meaning fashion.
- Here it means most frequent.
- Consider the following sorted data set.
- 26, 26, 27, 28, 28, 28, 28, 29, 29, 29, 30, 30, 31.
- Which value occurs most frequently?
- How to define mode for grouped data?
- Should we take the midpoint of the class with highest frequency?
Mode = $L + c \cdot d_1/(d_1 + d_2) = 11 + 5 \times 13/17 = 14.8$
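The slide does not define the symbols in the grouped-data formula. In the usual textbook form, $L$ is the lower limit of the modal class, $c$ the class width, $d_1$ the modal frequency minus the frequency of the preceding class, and $d_2$ the modal frequency minus the frequency of the following class. The sketch below takes that reading as an assumption; $d_2 = 4$ is not stated on the slide and is inferred from the denominator $d_1 + d_2 = 17$.

```python
from statistics import mode

# Ungrouped data: the most frequent value.
sorted_data = [26, 26, 27, 28, 28, 28, 28, 29, 29, 29, 30, 30, 31]
most_frequent = mode(sorted_data)

def grouped_mode(lower, width, d1, d2):
    """Mode = L + c * d1 / (d1 + d2) for the modal class."""
    return lower + width * d1 / (d1 + d2)

# Slide values: L = 11, c = 5, d1 = 13; d2 = 4 is inferred from d1 + d2 = 17.
estimate = grouped_mode(11, 5, 13, 4)
print(most_frequent, round(estimate, 1))
```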
28Locating Mode Graphically
29Mean, Median and Mode Relationship
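[Figure: positions of Mean, Median and Mode on a distribution]
Mode = 3 Median - 2 Mean (an empirical relationship for moderately skewed distributions)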
30Measures of Variability
- Is the Mean, Mode or Median an adequate representation of data?
- They are measures of central tendency, which totally ignore variability.
- Variability is a vital aspect of data.
- How much do the data points tend to deviate from the mean?
- Can we define a measure for it?
- Range: the difference of largest and smallest data values.
- Is it a good measure of variability? What's wrong?
- Inter-Quartile Range: $Q_3 - Q_1$. A little more robust, but still not satisfactory.
- We need something which takes into account the deviation of all data values. Deviation from what?
31Average Deviation?
- 1, 2, 3, 4, 5
- Mean $\mu = 3$
- Deviations? -2, -1, 0, 1, 2
- What is the average deviation? ZERO!
- Ignore the sign of deviation (absolute deviation)
- Average absolute deviation = $\sum |x_i - \mu| / n = 6/5 = 1.2$
- Looks reasonable, but absolute value function is mathematically inconvenient. Is there an alternative?
- Square the deviations and then take average.
- Variance $\sigma^2 = \sum (x_i - \mu)^2 / n = (4+1+0+1+4)/5 = 2.0$
- Since we squared the deviations before taking average, square root of this value is more meaningful.
- Standard Deviation $\sigma = \sqrt{\sum (x_i - \mu)^2 / n} = \sqrt{2} = 1.414$
- This measure has nice mathematical properties, and is universally accepted.
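The worked example can be reproduced with Python's standard `statistics` module. Note that `pvariance` and `pstdev` divide by $n$, matching the formulas on this slide, whereas `variance` and `stdev` divide by $n - 1$ and would give different numbers.

```python
from statistics import mean, pvariance, pstdev

data = [1, 2, 3, 4, 5]
mu = mean(data)
avg_deviation = sum(x - mu for x in data) / len(data)        # always zero
avg_abs_dev = sum(abs(x - mu) for x in data) / len(data)     # ignores the sign
variance = pvariance(data)      # population variance: divides by n
std_dev = pstdev(data)          # square root of the population variance
print(mu, avg_deviation, avg_abs_dev, variance, round(std_dev, 3))
```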
32Alternative formula
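- $\sigma^2 = \sum (x_i - \mu)^2 / n$
- $\sigma^2 = \sum x_i^2 / n - (\sum x_i / n)^2$
- $\sigma^2 = \sum x_i^2 / n - \mu^2$
- Approximating the Variance for grouped data
- $f_i$ = Frequency of class $i$
- $x_i$ = class mark of class $i$
- $\sigma^2 = \sum f_i x_i^2 / n - (\sum f_i x_i / n)^2$, where $n = \sum f_i$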
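As a check that the shortcut formula agrees with the definition, the sketch below applies both to a grouped distribution. The class marks and frequencies are borrowed from the tally-sheet example earlier in the deck; that choice is an assumption made for illustration, since this slide gives no data.

```python
marks = [8, 13, 18, 23, 28]     # class marks from the earlier frequency table
freqs = [3, 16, 7, 3, 1]
n = sum(freqs)

mean = sum(f * x for f, x in zip(freqs, marks)) / n
# Shortcut formula: mean of squares minus square of the mean.
var_shortcut = sum(f * x * x for f, x in zip(freqs, marks)) / n - mean ** 2
# Definition: frequency-weighted average squared deviation from the mean.
var_direct = sum(f * (x - mean) ** 2 for f, x in zip(freqs, marks)) / n
print(round(mean, 2), round(var_shortcut, 2), round(var_direct, 2))
```

Both routes give the same variance up to floating-point rounding; the result is only an approximation to the variance of the raw data because every value is replaced by its class mark.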
33Standard Deviation and the Normal Curve
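[Figure: normal curve with about 68% of values within $1\sigma$ of the mean, 95% within $2\sigma$, and 99.7% within $3\sigma$]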
34Chebyshev Inequality
- The 68-95-99.7 rule applies only to the Normal Distribution.
- Can we make a similar statement for other distributions?
- Chebyshev gave a rule that applies to all distributions.
- Let $z$ be greater than 1. Then in any data set, the proportion of data values lying in the interval $\mu \pm z\sigma$ is at least $1 - 1/z^2$.
- Thus, at least 75% of the data lies within $2\sigma$ of the mean.
- At least 89% lies within $3\sigma$.
- Can we say anything about the $1\sigma$ limit?
- Nothing can be said. The rule gives no information if $z \le 1$, because the bound $1 - 1/z^2$ is then zero or negative.
- Consider the data points -1, -1, -1, 1, 1, 1.
- $\mu = 0$, $\sigma = 1.0$. How many points lie strictly inside the $1\sigma$ limit?
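The six-point example can be checked directly. In this sketch the helper names are hypothetical, and "within" is taken to mean strictly inside the interval, as in the question on the slide.

```python
from statistics import mean, pstdev

def fraction_within(values, z):
    """Fraction of values lying strictly inside mean +/- z * sigma."""
    mu, sigma = mean(values), pstdev(values)
    inside = [x for x in values if z * sigma > abs(x - mu)]
    return len(inside) / len(values)

def chebyshev_bound(z):
    """Lower bound on the proportion within z standard deviations."""
    return 1 - 1 / z ** 2

points = [-1, -1, -1, 1, 1, 1]
inside_1 = fraction_within(points, 1)     # strictly inside the 1-sigma limit
inside_2 = fraction_within(points, 2)
print(inside_1, inside_2, chebyshev_bound(2), round(chebyshev_bound(3), 2))
```

Every point sits exactly one standard deviation from the mean, so none is strictly inside the $1\sigma$ limit, while all of them are inside $2\sigma$, comfortably above the 75% guarantee.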
35Effect of a constant on $\mu$ and $\sigma$
- Suppose a constant $c$ is added to all the data values.
- How will it affect $\mu$ and $\sigma$?
- Clearly, the new mean is $\mu + c$. Why?
- Will $\sigma$ be affected? Why or why not?
- $\sigma^2 = \sum (x_i - \mu)^2 / n$
- What if all values are multiplied by $c$?
- Clearly, the new mean is $c\mu$. Why?
- What about $\sigma$? It will be multiplied by $c$ (by $|c|$ when $c$ is negative). Why?
- Variance cannot be negative.
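A quick numerical experiment illustrates both effects. The data set and the constant 10 are hypothetical choices; any values would show the same pattern.

```python
from statistics import mean, pstdev

data = [1, 2, 3, 4, 5]
k = 10                                   # hypothetical constant

shifted = [x + k for x in data]
scaled = [x * k for x in data]
negated = [x * -k for x in data]

mean_shift = mean(shifted) - mean(data)          # mean moves by the constant
sd_change = pstdev(shifted) - pstdev(data)       # spread is unchanged by a shift
sd_ratio = pstdev(scaled) / pstdev(data)         # spread is multiplied by the constant
sd_ratio_neg = pstdev(negated) / pstdev(data)    # still positive for a negative multiplier
print(mean_shift, sd_change, round(sd_ratio, 6), round(sd_ratio_neg, 6))
```

Adding a constant moves every point and the mean by the same amount, so the deviations do not change. Multiplying scales every deviation, and because deviations are squared, a negative multiplier still leaves the standard deviation positive.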
36Relative Variability
- Risk associated with a stock is indicated by the variability in its price.
- Consider the following two stocks
- Stock A: $\mu = 20$, $\sigma = 5$
- Stock B: $\mu = 200$, $\sigma = 5$
- Both have same $\sigma$. Do you think they are equally risky? Why not?
- A true indicator of risk is relative variability $\sigma/\mu$: 0.25 for Stock A, 0.025 for Stock B.
Coefficient of Variation = $(\sigma/\mu)(100)$
37Problem
The stock of Highprice Inc. has been selling at $\mu$ = Rs 250 per share and $\sigma$ = Rs 40. Lowprice Inc. stock averages $\mu$ = Rs 20 with $\sigma$ = Rs 5. Which stock exhibits greater variability? Which stock seems to be more risky?
Coefficient of Variation = $(\sigma/\mu)(100)$
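The comparison can be worked through with a one-line function. This is a sketch only; the function name is illustrative and the inputs are the figures quoted in the problem.

```python
def coefficient_of_variation(mu, sigma):
    """Relative variability expressed as a percentage of the mean."""
    return 100 * sigma / mu

highprice = coefficient_of_variation(250, 40)
lowprice = coefficient_of_variation(20, 5)
print(highprice, lowprice)
```

Highprice has the larger standard deviation in rupees, but Lowprice has the larger coefficient of variation, so Lowprice is the more variable stock relative to its price.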
38Thank You !