Title: Lecture series two
1. Lecture series two
- Probability and Statistics
2. Recapitulation
- In the previous set of lectures we explored the link between sets of numbers and functions.
- Today we shall re-examine this link from the perspective of the concepts of uncertainty and probability theory.
3. Modelling uncertainty
- Probability theory is a mathematical model of uncertainty. Consider flipping a fair coin repeatedly. There are two possible outcomes of each coin flip, head or tail, but we do not know which one will occur in a single flip. The outcomes are uncertain.
4. Modelling uncertainty cntnd.
- It can be argued that the uncertainty in the world is fully contained in the selection of some hidden variable, say ω. If this variable is known, then nothing is uncertain anymore.
- Many choices are possible, but only one was made, and everything derives from it. In other words, everything that is uncertain is a function, say X(ω), of the hidden variable.
5. Modelling uncertainty cntnd.
- In order to explain the link between the hidden
variable and the function(s) defined over the
hidden variable we need to introduce some
associated concepts.
6. Some useful definitions
- It is customary in statistics to refer to any process of observation or measurement as an experiment. The results that one obtains from an experiment are called outcomes. The set S of all possible outcomes of some experiment is called the sample space. An event A is a set of outcomes, that is, a subset of the sample space S.
7. Examples
- Example one: Toss a die and observe the number (of dots) that appears on the top face. Let A be the event that an even number occurs, B that an odd number occurs and C that a number greater than 3 occurs. Find the events of A and C, and of A and B, occurring simultaneously.
- Example two: Toss a coin 3 times and observe the sequence of heads H and tails T that appears. Let A be the event that two or more heads appear consecutively, and B that all the tosses are the same. Find the event of A and B occurring simultaneously.
- Example three: A card is drawn from an ordinary deck of 52 cards. Let E be the event that a picture card was drawn and F the event of a heart. Find the event of E and F occurring simultaneously.
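On a small finite sample space, events like these can be checked by direct enumeration. A minimal Python sketch of examples one and two (the sets A, B, C, A2, B2 mirror the events named above):

```python
from itertools import product

# Example one: toss a die, S = {1, ..., 6}
S = set(range(1, 7))
A = {s for s in S if s % 2 == 0}   # even number
B = {s for s in S if s % 2 == 1}   # odd number
C = {s for s in S if s > 3}        # greater than 3

print(A & C)  # {4, 6}: A and C occur simultaneously
print(A & B)  # set(): A and B cannot occur together (the impossible event)

# Example two: three coin tosses
tosses = {''.join(t) for t in product('HT', repeat=3)}
A2 = {t for t in tosses if 'HH' in t}   # two or more consecutive heads
B2 = {'HHH', 'TTT'}                     # all tosses the same
print(A2 & B2)  # {'HHH'}
```

The intersection operator `&` on Python sets corresponds directly to "both events occur".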
8. The probability function
- In all these cases, we found the probability of an event. Probabilities are values of a set function, which assigns real numbers to various subsets of a sample space S. When the sample space is discrete:
- P1: For any event A, P(A) ≥ 0
- P2: For the certain event S, P(S) = 1
- P3: If A, B, C, ... is a finite or infinite sequence of mutually exclusive events, P(A ∪ B ∪ C ∪ ...) = P(A) + P(B) + P(C) + ...
9. The probability function cntnd.
- Theorems on probability spaces:
- T1: The impossible event, in other words the empty set ∅, has probability zero, that is P(∅) = 0
- T2: For any event A, P(Aᶜ) = 1 − P(A)
- T3: For any event A, 0 ≤ P(A) ≤ 1
- T4: For any two events A and B, P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
- T5: If an experiment can result in any one of N different equally likely outcomes, and if n of these outcomes together constitute event A, then the probability of event A is P(A) = n/N
10. Examples
- Example one: A card is selected at random from an ordinary deck of 52 playing cards. Let A be the event of a heart, and B the event of a face card. Find P(A), P(B), P(A ∩ B) and P(A ∪ B).
- Example two: A student is selected at random from 80 students, where 30 are taking maths, 20 chemistry and 10 both maths and chemistry. Find the probability that a student is taking either maths or chemistry.
- Example three: Let three coins be tossed and the number of heads observed. Find the probability of the event that at least one head appears. Find the probability of the event that all heads or all tails appear.
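Examples one and two both come down to the addition rule T4, P(A ∪ B) = P(A) + P(B) − P(A ∩ B). A short sketch using exact fractions (the deck representation is an illustrative choice):

```python
from fractions import Fraction

# Example one: draw one card from a 52-card deck
ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['hearts', 'diamonds', 'clubs', 'spades']
deck = [(r, s) for r in ranks for s in suits]

A = {c for c in deck if c[1] == 'hearts'}          # heart
B = {c for c in deck if c[0] in ('J', 'Q', 'K')}   # face card

P = lambda E: Fraction(len(E), len(deck))          # equiprobable space: T5
print(P(A))              # 1/4
print(P(B))              # 3/13
print(P(A & B))          # 3/52
print(P(A) + P(B) - P(A & B))  # equals P(A | B), i.e. the union, by T4

# Example two: 80 students, 30 maths, 20 chemistry, 10 both
p_union = Fraction(30, 80) + Fraction(20, 80) - Fraction(10, 80)
print(p_union)  # 1/2
```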
11. Conditional probability
- Example: Auto insurance rates usually depend on the probability that a random person is involved in an accident. It is well known that male drivers under 25 years of age get into accidents more often than the general public. That is, letting A denote the event of an accident and E the event that the driver is male and younger than 25, the data tell us that P(A) < P(A|E).
12. Conditional probability cntnd.
- Suppose E is an event in sample space S with P(E) > 0. The probability that an event A occurs once event E has occurred, or the conditional probability of A given E, is defined as
- P(A|E) = P(A ∩ E) / P(E)
13. Conditional probability cntnd.
- Example one: A pair of dice is tossed. Find the probability that one of the dice shows 2 if the sum is 6.
- Example two: A couple has two children. Find the probability that both children are boys if it is known that at least one is a boy.
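On an equiprobable space the definition P(A|E) = P(A ∩ E)/P(E) reduces to counting outcomes inside E. Both examples can be checked that way:

```python
from fractions import Fraction
from itertools import product

# Example one: a pair of dice, 36 equally likely outcomes
S = list(product(range(1, 7), repeat=2))
E = [(a, b) for (a, b) in S if a + b == 6]          # the sum is 6
A_and_E = [(a, b) for (a, b) in E if 2 in (a, b)]   # one die shows 2

p = Fraction(len(A_and_E), len(E))   # P(A|E) on an equiprobable space
print(p)  # 2/5

# Example two: two children, given at least one boy
children = list(product('BG', repeat=2))
E2 = [c for c in children if 'B' in c]       # at least one boy
both = [c for c in E2 if c == ('B', 'B')]    # both boys
p2 = Fraction(len(both), len(E2))
print(p2)  # 1/3
```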
14. Conditional probability cntnd.
- The Multiplication Theorem for Conditional Probability is a direct consequence of the definition of conditional probability:
- P(A ∩ E) = P(E) P(A|E)
15. Conditional probability cntnd.
- Example one: A lot contains 12 items of which 4 are defective. Three items are drawn at random from the lot one after the other. Find the probability that all 3 are non-defective.
- Example two: Suppose the following three boxes are given: box X has 10 light bulbs of which 4 are defective, box Y has 6 light bulbs of which 1 is defective, and box Z has 8 light bulbs of which 3 are defective. A box is chosen at random and then a bulb is randomly selected from the box. Find the probability that the bulb is non-defective. If a bulb is non-defective, find the probability that it came from box Z.
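Example one chains the multiplication theorem; example two combines the total-probability rule with the conditional-probability definition in reverse (Bayes' reasoning). A sketch with exact fractions:

```python
from fractions import Fraction

# Example one: 12 items, 4 defective; draw 3 without replacement
p_all_good = Fraction(8, 12) * Fraction(7, 11) * Fraction(6, 10)
print(p_all_good)  # 14/55

# Example two: boxes X, Y, Z chosen with probability 1/3 each
p_good = {'X': Fraction(6, 10), 'Y': Fraction(5, 6), 'Z': Fraction(5, 8)}
prior = Fraction(1, 3)

# Total probability: P(non-defective) = sum over boxes of P(box) P(good | box)
p_nd = sum(prior * p for p in p_good.values())
print(p_nd)  # 247/360

# Reverse conditioning: P(Z | non-defective) = P(Z) P(good | Z) / P(non-defective)
p_z_given_nd = prior * p_good['Z'] / p_nd
print(p_z_given_nd)  # 75/247
```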
16. Random variable
- We are now ready to define properly the concept of a random variable:
- If S is a sample space with a probability measure and X is a real-valued function defined over the elements of S, then X is called a random variable.
17. Random variable cntnd.
- Example one: A pair of fair dice is tossed. The sample space S consists of the 36 ordered pairs (a, b), where a and b can be any integers between 1 and 6, S = {(1,1), (1,2), ..., (6,6)}. Let X assign to each point the maximum of its numbers, X(a, b) = max(a, b). Then X is a random variable with range Rx = {1, 2, 3, 4, 5, 6}.
- Example two: A coin is tossed until a head appears. The sample space is S = {H, TH, TTH, TTTH, ...}. Let X denote the number of times the coin is tossed. Then X is a random variable with range space Rx = {1, 2, 3, 4, ...}.
18. Probability distribution of a discrete random variable
- If X is a discrete random variable, the function given by f(x) = P(X = x) for each x within the range of X is called the probability distribution of X. The set of ordered pairs is usually given in the form of a table.
19. Probability distribution of a finite random variable cntnd.
- This function f is called the probability distribution, or distribution, of the random variable X. It satisfies the following two conditions: f(xk) ≥ 0 and Σk f(xk) = 1.
20. Probability distribution of a finite random variable cntnd.
- Example: Let S be the sample space when a pair of fair dice is tossed. Then S is a finite equiprobable space consisting of the 36 ordered pairs (a, b). Let X and Y be random variables such that X = max(a, b) and Y = a + b. Find the distributions of X and Y.
- (a) One outcome, (1,1), has a maximum value of 1, hence f(1) = 1/36.
- Three outcomes, (1,2), (2,2), (2,1), have a maximum value of 2, hence f(2) = 3/36. Five outcomes, (1,3), (2,3), (3,3), (3,1), (3,2), have a maximum value of 3, hence f(3) = 5/36, and so on.
21. Probability distribution of a finite random variable cntnd.
22. Probability distribution of a finite random variable cntnd.
- Similarly, the distribution of Y is obtained by counting, for each value of a + b, the outcomes that produce it.
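The counting argument for both distributions can be carried out mechanically with a counter over the 36 outcomes:

```python
from fractions import Fraction
from itertools import product
from collections import Counter

S = list(product(range(1, 7), repeat=2))   # the 36 ordered pairs (a, b)

# Distribution of X = max(a, b)
fx = Counter(max(a, b) for (a, b) in S)
dist_X = {x: Fraction(n, 36) for x, n in sorted(fx.items())}
# f(1) = 1/36, f(2) = 3/36, f(3) = 5/36, ..., f(6) = 11/36, as on the slide
print(dist_X)

# Distribution of Y = a + b
fy = Counter(a + b for (a, b) in S)
dist_Y = {y: Fraction(n, 36) for y, n in sorted(fy.items())}
# values 2..12 with probabilities 1/36, 2/36, ..., 6/36, ..., 1/36
print(dist_Y)
```

Both distributions sum to 1, as the conditions on slide 19 require.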
23. More examples
- Example two: A fair coin is tossed three times. Let X be the random variable that assigns to each point in S the number of heads. Find the distribution of X.
- Example three: Suppose a coin is tossed three times, but now it is weighted so that P(H) = 2/3 and P(T) = 1/3. Find the distribution of X.
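Both coin examples have the same binomial form, P(X = k) = C(3, k) pᵏ (1 − p)³⁻ᵏ with p = P(H); only p changes between them. A sketch (the helper `dist` is an illustrative name):

```python
from fractions import Fraction
from math import comb

# X = number of heads in three tosses with P(H) = p
def dist(p):
    return {k: comb(3, k) * p**k * (1 - p)**(3 - k) for k in range(4)}

print(dist(Fraction(1, 2)))  # fair coin: 1/8, 3/8, 3/8, 1/8
print(dist(Fraction(2, 3)))  # weighted coin: 1/27, 6/27, 12/27, 8/27
```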
24. Expectation of a finite random variable cntnd.
- Let X be a finite random variable with values x1, x2, ..., xn and distribution f.
- Then the mean, or expectation (expected value), of X, denoted by E(X), is defined as E(X) = x1 f(x1) + x2 f(x2) + ... + xn f(xn).
- Exercise: Find the expected value of the random variable in example 1 above.
25. Variance and standard deviation of a discrete random variable
- The expectation of a random variable is a measure of its mean, or average, value. Suppose that X is a random variable with n distinct values and suppose that each value occurs with the same probability pi = 1/n. Then E(X) = x1(1/n) + x2(1/n) + ... + xn(1/n) = (x1 + x2 + ... + xn)/n, the ordinary average.
- The variance and the standard deviation, on the other hand, are measures of the spread or dispersion of the random variable.
26. Variance and standard deviation of a discrete random variable cntnd.
- Let X be a random variable with mean μ = E(X) and probability distribution f. The variance of X is defined by
- var(X) = (x1 − μ)² f(x1) + (x2 − μ)² f(x2) + ... + (xn − μ)² f(xn)
- T: var(X) = E(X²) − μ²
- The standard deviation of X is the square root of the variance.
- Exercise: Compute the variance in example 1 above.
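For the dice example (X = max(a, b)), the exercise can be worked exactly, and the shortcut var(X) = E(X²) − μ² can be checked against the definition:

```python
from fractions import Fraction
from itertools import product
from collections import Counter

# Distribution of X = max(a, b) for a pair of fair dice (example 1)
S = list(product(range(1, 7), repeat=2))
f = {x: Fraction(n, 36) for x, n in Counter(max(a, b) for (a, b) in S).items()}

mu = sum(x * p for x, p in f.items())                 # E(X)
var = sum((x - mu)**2 * p for x, p in f.items())      # definition of var(X)
var2 = sum(x**2 * p for x, p in f.items()) - mu**2    # shortcut E(X^2) - mu^2

print(mu)           # 161/36
print(var)          # 2555/1296
print(var == var2)  # True: both formulas agree
```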
27. Joint distribution of random variables
- Let X and Y be random variables on the same sample space S with respective range spaces Rx = {x1, x2, ..., xn} and Ry = {y1, y2, ..., ym}. The joint distribution or joint probability function of X and Y is the function h on the product space:
- h(xi, yj) ≡ P(X = xi, Y = yj) = P({s ∈ S : X(s) = xi, Y(s) = yj})
- It has the properties (i) h(xi, yj) ≥ 0 and (ii) Σi Σj h(xi, yj) = 1
28. Expectation of a finite random variable
- T1: Let X be a random variable and let k be a real number. Then E(kX) = kE(X) and E(X + k) = E(X) + k.
- Thus, for any real numbers a and b, E(aX + b) = aE(X) + b.
- T2: Let X and Y be random variables on the same sample space S. Then E(X + Y) = E(X) + E(Y).
29. Joint distribution of random variables cntnd.
- f(xi) = Σj h(xi, yj) and g(yj) = Σi h(xi, yj) are the marginal distributions of X and Y.
30. Covariance and correlation
- Let X and Y be random variables with joint distribution h(x, y) and respective means μx and μy. The covariance of X and Y, denoted by cov(X, Y), is defined by
- cov(X, Y) = Σi,j (xi − μx)(yj − μy) h(xi, yj) = E[(X − μx)(Y − μy)] = E(XY) − μx μy
- The correlation coefficient is ρ = cov(X, Y)/(σx σy), where σx and σy are the standard deviations of X and Y, and it satisfies
- −1 ≤ ρ ≤ 1
- Exercise: Calculate the joint and marginal distributions and the correlation coefficient for the random variables in example one.
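For X = max(a, b) and Y = a + b from example one, the joint distribution and the covariance shortcut cov(X, Y) = E(XY) − μx μy can be computed by enumeration (X and Y move together, so the covariance comes out positive):

```python
from fractions import Fraction
from itertools import product
from collections import Counter
from math import sqrt

# Joint distribution of X = max(a, b) and Y = a + b for a pair of fair dice
S = list(product(range(1, 7), repeat=2))
joint = {xy: Fraction(n, 36)
         for xy, n in Counter((max(a, b), a + b) for (a, b) in S).items()}

E = lambda g: sum(g(x, y) * p for (x, y), p in joint.items())
mx, my = E(lambda x, y: x), E(lambda x, y: y)
cov = E(lambda x, y: x * y) - mx * my            # cov(X,Y) = E(XY) - mu_x mu_y
vx = E(lambda x, y: x**2) - mx**2
vy = E(lambda x, y: y**2) - my**2
rho = float(cov) / sqrt(float(vx) * float(vy))   # correlation coefficient

print(mx, my, cov)
print(rho)  # positive, and between -1 and 1
```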
31. Independent random variables
- Let X, Y, ..., Z be random variables over the space S. They are said to be independent if
- P(X = xi, Y = yj, ..., Z = zk) = P(X = xi) P(Y = yj) ... P(Z = zk)
- T: If X and Y are independent random variables, then
- E(XY) = E(X)E(Y)
- var(X + Y) = var(X) + var(Y)
- cov(X, Y) = 0
32. Continuous random variables
- Suppose that X is a random variable on a sample space S whose range space Rx is a continuum of numbers, such as an interval. From the definition of a random variable, the set {a ≤ X ≤ b} is an event in S and therefore the probability P(a ≤ X ≤ b) is well defined. In calculus terms,
- P(a ≤ X ≤ b) = ∫ f(x) dx, the integral taken from a to b.
- The function f is called the distribution or continuous probability density function of X and satisfies the conditions f(x) ≥ 0 and ∫ f(x) dx = 1, the integral taken over the whole real line.
33. Continuous random variables cntnd.
- The expectation E(X) for a continuous random variable X is defined by the integral E(X) = ∫ x f(x) dx, taken over the whole real line,
- while the variance is var(X) = ∫ (x − μ)² f(x) dx.
- Example: Find the expectation and variance of a random variable X with the following distribution function.
34. Continuous random variables cntnd.
- A bivariate function with values f(x, y), defined over the xy-plane, is called the joint probability density function of the continuous random variables X and Y iff f(x, y) ≥ 0 and ∫∫ f(x, y) dx dy = 1 over the whole plane.
35. Continuous random variables cntnd.
- Example: Given the joint probability density function,
- find the probability P((X, Y) ∈ A), where A is the region {(x, y) : 0 < x < 1/2, 1 < y < 2}.
36. Continuous random variables cntnd.
- If X and Y are continuous random variables and f(x, y) is the value of their joint probability density at (x, y), the function given by fX(x) = ∫ f(x, y) dy is called the marginal density of X, while the function fY(y) = ∫ f(x, y) dx is called the marginal density of Y.
37. Continuous random variables cntnd.
- Example: Given the joint probability density, find the marginal densities of X and Y.
38. Conditional expectation
- A concept that has a special importance in econometrics is the concept of conditional expectation.
- In the discrete case it is given by E(X | Y = yj) = Σi xi h(xi, yj)/g(yj), where g is the marginal distribution of Y.
- In the continuous case it is given by E(X | y) = ∫ x f(x | y) dx, where f(x | y) = f(x, y)/fY(y).
39. Conditional expectation cntnd.
- While we shall spend more time on it tomorrow, here are a couple of examples:
- Example one: Find the conditional mean of X when y = 1 using the following joint distribution.
- Example two: Find the conditional mean of f(x, y) for y = 1/2.
40. Recapitulation
- It has been observed that one can discuss X and f(x) without referring to the original probability space S. In fact, there are many applications of probability theory which give rise to the same probability distribution.
- Some of the probability distribution and density functions widely used in finance are the Bernoulli/binomial, the uniform, etc.
- Overall, the material that we learnt during the past two series of lectures (in particular, the concept of optimization and those of expectation, variance, correlation coefficient, etc.) is sufficient to understand models as complicated as the portfolio theory model. Examples are given in your course pack.
- For the rest of today's class we will have a brief look at some useful characteristics of the normal distribution and will define some sampling distributions and the principles of hypothesis testing. This will enable us to understand well what comes in the rest of the class.
41. The normal distribution
- A random variable X has a normal distribution, and is referred to as a normal random variable, if and only if its probability density is given by
- f(x) = (1/(σ√(2π))) exp(−(x − μ)²/(2σ²)), for −∞ < x < ∞
42. The normal distribution cntnd.
- Suppose that X is any normal random variable, X ~ N(μ, σ²). One of the most useful representations of the normal distribution is the standardized random variable corresponding to X, defined as Z = (X − μ)/σ.
- Z is also normally distributed, with μ = 0 and σ = 1, i.e. Z ~ N(0, 1). Its density function is φ(z) = (1/√(2π)) exp(−z²/2).
43. The normal distribution cntnd.
- One property of the standard normal distribution that is most helpful in hypothesis testing is the so-called 68-95-99.7 rule, which gives the percentage of area under the standardized normal curve as follows:
- 68.2% for −1 ≤ z ≤ 1, i.e. for μ − σ ≤ x ≤ μ + σ
- 95.4% for −2 ≤ z ≤ 2, i.e. for μ − 2σ ≤ x ≤ μ + 2σ
- 99.7% for −3 ≤ z ≤ 3, i.e. for μ − 3σ ≤ x ≤ μ + 3σ
44. The normal distribution cntnd.
- Evaluating standard normal probabilities:
- Example one: Find (a) Φ(1.72), (b) Φ(0.34), (c) Φ(2.3), (d) Φ(4.3)
- Example two: Evaluate the probabilities P(−0.5 ≤ Z ≤ 1.1), P(0.2 ≤ Z ≤ 1.4), P(−1.5 ≤ Z ≤ −0.7)
- Example three: Evaluate the probabilities P(Z ≥ 0.75), P(Z ≥ −1.2), P(Z ≤ 0.60), P(Z ≤ −0.45)
- Example four: Evaluate the following probabilities for N(70, 4): P(68 ≤ X ≤ 74), P(72 ≤ X ≤ 75), P(63 ≤ X ≤ 68), P(X ≥ 73)
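Instead of a printed table, Φ can be evaluated with the error function. A sketch covering the examples above (reading N(70, 4) as μ = 70, σ² = 4, i.e. σ = 2, which is an assumption about the slide's notation):

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

print(round(Phi(1.72), 4))              # 0.9573, the table value for z = 1.72
print(round(Phi(1.1) - Phi(-0.5), 4))   # P(-0.5 <= Z <= 1.1)

# The 68-95-99.7 rule from the previous slide
for k in (1, 2, 3):
    print(k, round(Phi(k) - Phi(-k), 4))

# Example four: X ~ N(70, 4), standardize with mu = 70, sigma = 2
mu, sigma = 70, 2
p = Phi((74 - mu) / sigma) - Phi((68 - mu) / sigma)   # P(68 <= X <= 74)
print(round(p, 4))  # 0.8186
```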
45. Sampling distributions
- Definition: If X1, X2, ..., Xn are independent and identically distributed random variables, we say that they constitute a random sample from the infinite population given by their common distribution.
- Statistical inferences are typically based on statistics, i.e. on random variables that are functions of a set of random variables X1, X2, ..., Xn. Typically these statistics are the sample mean and the sample variance.
46. Sampling distributions cntnd.
- Definition one: If X1, X2, ..., Xn constitute a random sample, then the sample mean is defined as X̄ = (X1 + X2 + ... + Xn)/n.
- Definition two: If X1, X2, ..., Xn constitute a random sample, then the sample variance is defined as S² = Σi (Xi − X̄)²/(n − 1).
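Both definitions are implemented in Python's standard library; note that `statistics.variance` uses the n − 1 divisor, matching the definition above. As data, a sketch can reuse the five breaking strengths from the ribbon example later in the deck:

```python
from statistics import mean, variance

x = [171.6, 191.8, 178.3, 184.9, 189.1]   # ribbon breaking strengths (slide 65)
print(mean(x))       # 183.14
print(variance(x))   # about 67.55 (n - 1 divisor)

# The same quantities directly from the definitions
n = len(x)
xbar = sum(x) / n
s2 = sum((xi - xbar)**2 for xi in x) / (n - 1)
print(xbar, s2)
```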
47. Sampling distributions cntnd.
- The central limit theorem: If X1, X2, ..., Xn constitute a random sample from an infinite population with mean μ and variance σ², then the limiting distribution of Z = (X̄ − μ)/(σ/√n) as n → ∞ is the standard normal distribution.
48. Sampling distributions cntnd.
- If X1, X2, ..., Xn are independent random variables having the standard normal distribution, then
- χ² = X1² + X2² + ... + Xn²
- has the chi-square distribution with n degrees of freedom.
49. Sampling distributions and hypothesis testing cntnd.
- If Y and Z are independent random variables, Y has a chi-square distribution with ν degrees of freedom, and Z has the standard normal distribution, then the distribution of
- T = Z/√(Y/ν)
- is called the t distribution with ν degrees of freedom.
50. Sampling distributions cntnd.
- If U and V are independent random variables having chi-square distributions with ν1 and ν2 degrees of freedom, then
- F = (U/ν1)/(V/ν2)
- is a random variable with an F distribution.
51. Introduction to hypothesis testing
- Estimators such as the sample mean and variance of a distribution are point estimates, as they provide only a single (point) estimate of the unknown parameter that we are interested in.
- Instead of basing our inference about the true unknown parameters of interest on these single estimates, we can obtain two different estimates and argue with some confidence (probability) that the interval between these two values contains the true parameter.
- This is the logic behind interval estimation and hypothesis testing.
52. Introduction to hypothesis testing cntnd.
- The key concept underlying interval estimation is the notion of the sampling, or probability, distribution of an estimate. For instance, it can be shown that if a variable X is normally distributed, then the sample mean is also normally distributed, with mean μ and variance σ²/n. In other words, the sampling or probability distribution of the estimator is X̄ ~ N(μ, σ²/n).
- As a result, we can construct the interval X̄ − 1.96 σ/√n ≤ μ ≤ X̄ + 1.96 σ/√n and discuss the probability that an interval like this contains the true μ.
53. Introduction to hypothesis testing cntnd.
- More generally, in interval estimation we construct two estimators θ̂L and θ̂U, both functions of the sample X values, such that P(θ̂L ≤ θ ≤ θ̂U) = 1 − α.
- That is, we can say that the probability is 1 − α that the above interval, called the confidence interval of size 1 − α, contains the true value of our unknown parameter. If 1 − α is 0.95, we can argue that 95 out of 100 such intervals will contain the true parameter.
54. Introduction to hypothesis testing cntnd.
- Example: Suppose that the distribution of the height of men in a population is normal with mean μ inches and standard deviation σ = 2.5 inches. A sample of 100 men drawn randomly from this population had an average height of 67 inches. Establish a 95% confidence interval for the mean height in the population.
- Solution: Since X̄ ~ N(μ, σ²/n), and from the normal distribution table P(−1.96 ≤ Z ≤ 1.96) = 0.95, plugging in the relevant values we obtain the 95% confidence interval
- 66.51 ≤ μ ≤ 67.49.
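The arithmetic of the interval above is a one-liner; a sketch:

```python
from math import sqrt

# Height example: n = 100, sample mean 67, sigma = 2.5
n, xbar, sigma = 100, 67.0, 2.5
z = 1.96                        # P(-1.96 <= Z <= 1.96) = 0.95
half = z * sigma / sqrt(n)      # half-width of the 95% interval

lo, hi = xbar - half, xbar + half
print(lo, hi)  # 66.51 67.49, matching the slide
```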
55. Introduction to hypothesis testing cntnd.
- The problem of hypothesis testing can be stated as follows. Assume that we have a random variable X with a known PDF f(x; θ), where θ is the parameter of the distribution. Having obtained a random sample of size n and a point estimator θ̂, we can raise the question: is this estimator compatible with some hypothesized value of θ, say θ = θ*?
- In the language of statistics, θ = θ* is called the null hypothesis. It is tested against an alternative hypothesis, say θ ≠ θ*.
56. Introduction to hypothesis testing cntnd.
- The null hypothesis and the alternative hypothesis can each be simple or composite. A hypothesis is simple if it specifies the values of the parameters completely; otherwise it is composite.
- Example: H0: μ = 15 and σ = 2 is a simple hypothesis.
- H0: μ = 15 and σ > 2 is a composite hypothesis.
57. Introduction to hypothesis testing cntnd.
- To test the null hypothesis we use the sample information to obtain what is known as a test statistic. Very often this is a point estimator of the unknown parameter. We then find the sampling, or probability, distribution of the test statistic and use the confidence interval or test of significance approach to test the null hypothesis.
58. The confidence interval approach
- In the previous example, we established that the 95% confidence interval for the average male height in the population is 66.51 ≤ μ ≤ 67.49.
- Now let us test the null hypothesis H0: μ = 69 against the alternative hypothesis H1: μ ≠ 69.
- Clearly, 69 does not belong to the above interval, and we reject the null hypothesis.
59. The confidence interval approach cntnd.
- In the language of hypothesis testing, the confidence interval that we established is called the acceptance region. The area(s) outside this region is (are) called critical region(s). The lower and upper limits of the acceptance region are called critical values. If the hypothesized value falls within the acceptance region we cannot reject the null hypothesis; otherwise we reject it.
60. The confidence interval approach cntnd.
- In rejecting or accepting the null hypothesis we are liable to commit two types of errors: a Type I error, rejecting the null hypothesis when it is true, and a Type II error, accepting the null hypothesis when it is false.
- As it is difficult to minimize both errors, in practice we keep the probability of a Type I error at 0.01 or 0.05 and try to minimize the probability of a Type II error.
61. The confidence interval approach cntnd.
- In the language of statistics, the probability of a Type I error, α, is called the level of significance. The probability of committing a Type II error is designated β, and 1 − β is called the power of the test.
62. Test of significance approach
- Instead of constructing a confidence interval, we can substitute the given values directly in the test statistic Z = (X̄ − μ)/(σ/√n).
- For the hypothesis H0: μ = 69 versus H1: μ ≠ 69 we obtain
- Z = (67 − 69)/(2.5/√100) = −8.
63. Test of significance approach
- From the normal distribution table we observe that the probability of the Z value falling below −3 or above 3 is about 0.003, the probability of it falling below −1.96 or above 1.96 is 0.05, and so on. We therefore conclude that the computed value of Z = −8 is statistically significant, and reject the null hypothesis of μ = 69 at any acceptable significance level.
64. Examples
- Example one: Suppose that it is known from experience that the standard deviation of the weight of an 8-ounce package of cookies made by a certain bakery is 0.16 ounce. To check whether its production is under control on a given day, namely, to check whether the true average weight of the packages is 8 ounces, employees select a random sample of 25 packages and find that their mean weight is 8.091 ounces. Since the bakery will lose money when the average package exceeds 8 ounces and the customer loses money when it is smaller than 8 ounces, test the hypothesis that the average weight of a package is 8 ounces.
- Example two: Suppose that 100 tires made by a manufacturer lasted on average 21,819 miles with a standard deviation of 1,295 miles. Test the null hypothesis of μ = 22,000 against the alternative hypothesis of μ < 22,000.
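Both examples use the Z statistic from slide 62; example one is two-sided (losses occur in either direction) and example two is one-sided, as stated. A sketch (in example two the sample standard deviation stands in for σ, a common large-sample assumption):

```python
from math import erf, sqrt

def Phi(z):
    return 0.5 * (1 + erf(z / sqrt(2)))

# Example one: cookie packages, H0: mu = 8 vs H1: mu != 8 (two-sided)
z1 = (8.091 - 8) / (0.16 / sqrt(25))
p1 = 2 * (1 - Phi(abs(z1)))          # two-sided p-value
print(round(z1, 2), round(p1, 4))    # z = 2.84, significant at the 0.01 level

# Example two: tires, H0: mu = 22000 vs H1: mu < 22000 (one-sided)
z2 = (21819 - 22000) / (1295 / sqrt(100))
p2 = Phi(z2)                         # lower-tail p-value
print(round(z2, 2), round(p2, 4))    # z = -1.4, not significant at 0.05
```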
65. The t-test
- When the sample size is relatively small and σ² is unknown, the appropriate test statistic is t = (X̄ − μ)/(S/√n), with n − 1 degrees of freedom.
- Example: The specification of a certain kind of ribbon calls for a mean breaking strength of 185 pounds. If five pieces randomly selected from different rolls have breaking strengths of 171.6, 191.8, 178.3, 184.9 and 189.1 pounds, test the null hypothesis that μ = 185 pounds against the alternative hypothesis of μ < 185 pounds.
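The t statistic for the ribbon data can be computed directly from the definitions of the sample mean and variance:

```python
from math import sqrt

x = [171.6, 191.8, 178.3, 184.9, 189.1]   # breaking strengths in pounds
n = len(x)
xbar = sum(x) / n
s = sqrt(sum((xi - xbar)**2 for xi in x) / (n - 1))   # sample std deviation

t = (xbar - 185) / (s / sqrt(n))   # t statistic, n - 1 = 4 degrees of freedom
print(round(xbar, 2), round(t, 2))  # 183.14 -0.51
```

Since |t| is far below the usual one-sided critical value for 4 degrees of freedom (about 2.13 at the 0.05 level, from a t table), the null hypothesis is not rejected.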
66. The chi-square test
- Sometimes it is essential to test hypotheses related to the variance of a variable. The relevant test statistic in this case is χ² = (n − 1)S²/σ0², with n − 1 degrees of freedom.
67. Example
- Suppose that the thickness of a part used in a semiconductor is its critical dimension, and that measurements of the thickness of a random sample of 18 such parts have variance s² = 0.68 (thousandths of an inch squared). The process is considered to be under control if the variation of the thickness is given by a variance not greater than 0.36. Assuming that the measurements are a random sample from a normal distribution, test the hypothesis that σ² = 0.36 against the alternative σ² > 0.36.
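Plugging the numbers into the statistic from slide 66 gives:

```python
# H0: sigma^2 = 0.36 vs H1: sigma^2 > 0.36, with n = 18 and s^2 = 0.68
n, s2, sigma0_sq = 18, 0.68, 0.36
chi2 = (n - 1) * s2 / sigma0_sq    # chi-square statistic, n - 1 = 17 df
print(round(chi2, 2))  # 32.11
```

This exceeds the upper 0.05 critical value of the chi-square distribution with 17 degrees of freedom (about 27.59, from a chi-square table), so the null hypothesis is rejected: the process appears not to be under control.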