Computing Fundamentals 2 Lecture 7 Statistics

1
Computing Fundamentals 2, Lecture 7: Statistics
  • Lecturer: Patrick Browne
  • http://www.comp.dit.ie/pbrowne/
  • Room K408

2
Statistics
  • Raw data are just lists of facts and numbers. The
    branch of mathematics that organizes, analyzes
    and interprets raw data is called statistics.

3
Permutations & Combinations
  • $P(n,r) = \frac{n!}{(n-r)!}$
  • The number of permutations of a, b, and c taken 2 at a time is
    $P(3,2) = \frac{3 \cdot 2}{1} = 6$ (each one is a sequence):
  • <ab>, <ba>, <ac>, <ca>, <bc>, <cb>
  • $C(n,r) = \frac{n!}{r!\,(n-r)!}$
  • The number of combinations of a, b, and c taken 2 at a time is
    $C(3,2) = \frac{3 \cdot 2}{2 \cdot 1} = 3$ (each one is a set): {ab, ac, bc}
  • ab is the same combination as ba, but they are
    distinct permutations.
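
The counts on this slide can be checked with Python's standard library. This is only an illustrative sketch: the variable names are made up here, and `math.perm` and `math.comb` require Python 3.8 or later.

```python
import math
from itertools import combinations, permutations

items = ["a", "b", "c"]

# P(3,2) = 3!/(3-2)!  and  C(3,2) = 3!/(2!(3-2)!)
n_perms = math.perm(len(items), 2)
n_combs = math.comb(len(items), 2)

# Permutations are ordered sequences; combinations are unordered selections.
perm_list = ["".join(p) for p in permutations(items, 2)]
comb_list = ["".join(c) for c in combinations(items, 2)]

print(n_perms, perm_list)
print(n_combs, comb_list)
```

The enumeration makes the distinction concrete: "ab" and "ba" both appear among the permutations, but only "ab" appears among the combinations.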

4
Probability Calculations
  • Conditional probability
  • $P(A \mid E) = \frac{P(A \cap E)}{P(E)}$
  • Test for independence
  • $P(A \cap B) = P(A)\,P(B)$
  • Calculation of union
  • $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
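
As a small check of the three rules, the sketch below uses one roll of a fair die. The events A ("even") and B ("at most 4") are assumptions chosen for illustration and do not come from the lecture.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # one roll of a fair die (illustrative choice)
A = {2, 4, 6}            # event "even"
B = {1, 2, 3, 4}         # event "at most 4"

def prob(event):
    """Probability of an event in the equiprobable space S."""
    return Fraction(len(event & S), len(S))

# Conditional probability: P(A|B) = P(A and B) / P(B)
p_a_given_b = prob(A & B) / prob(B)

# Independence test: P(A and B) == P(A) * P(B)
independent = prob(A & B) == prob(A) * prob(B)

# Union: P(A or B) = P(A) + P(B) - P(A and B)
p_union = prob(A) + prob(B) - prob(A & B)

print(p_a_given_b, independent, p_union)
```

Here P(A|B) equals P(A), which is another way of seeing that these two particular events are independent.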

5
Frequency Table
  • One way of organizing raw data is to use a
    frequency table (or frequency distribution),
    which shows the number of times that an
    individual item occurs or the number of items
    that fall within a given range or interval.

6
Frequency Distribution
  • Suppose that a sample consists of the heights of
    100 male students at XYZ University. We arrange
    the data into classes or categories and determine
    the number of individuals belonging to each
    class, called the class frequency. The resulting
    table is called a frequency distribution or
    frequency table

7
Frequency Distribution
  • The first class or category, for example,
    consists of heights from 60 to 62 inches,
    indicated by 60–62, which is called a class
    interval. Since 5 students have heights belonging
    to this class, the corresponding class frequency
    is 5. Since a height that is recorded as 60
    inches is actually between 59.5 and 60.5 inches
    while one recorded as 62 inches is actually
    between 61.5 and 62.5 inches, we could just as
    well have recorded the class interval as
    59.5–62.5. In the class interval 59.5–62.5, the
    numbers 59.5 and 62.5 are often called class
    boundaries.

8
Frequency Distribution
  • The midpoint of the class interval, which can be
    taken as representative of the class, is called
    the class mark. A graph for the frequency
    distribution can be supplied by a histogram.
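
A frequency distribution with class intervals, class boundaries and class marks could be built along the following lines. The heights are hypothetical (the lecture's 100-student data set is not reproduced in the transcript), and the names `WIDTH`, `START` and `class_interval` are invented for this sketch.

```python
from collections import Counter

# Hypothetical heights in inches (NOT the lecture's data set).
heights = [60, 61, 62, 63, 63, 64, 65, 66, 66, 67, 68, 70]

WIDTH = 3    # each class covers 3 recorded values, e.g. 60-62
START = 60   # lower limit of the first class

def class_interval(h):
    """Return the (lower, upper) class limits of the class containing h."""
    lower = START + ((h - START) // WIDTH) * WIDTH
    return (lower, lower + WIDTH - 1)

# Class frequency = number of observations falling in each class.
freq = Counter(class_interval(h) for h in heights)

for (lo, hi), f in sorted(freq.items()):
    boundaries = (lo - 0.5, hi + 0.5)   # class boundaries
    mark = (lo + hi) / 2                # class mark (midpoint)
    print(f"{lo}-{hi}  boundaries={boundaries}  mark={mark}  f={f}")
```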

9
Frequency table class interval
10
Mean
  • The arithmetic mean is the sum of the values in a
    data set divided by the number of elements in
    that data set.

11
Mean
  • The arithmetic mean is the sum of the values in a
    data set divided by the number of elements in
    that data set.
  • $\bar{x} = \frac{\sum x_i}{n}$
  • For a frequency distribution, $\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$, where $f_i$
    denotes the frequency of $x_i$.

12
Variance & Standard Deviation
  • List A: 12, 10, 9, 9, 10
  • List B: 7, 10, 14, 11, 8
  • The mean ($\bar{x}$) of both A and B is 10, but the values of A
    are more closely clustered around the mean than
    those in B (there is greater dispersion, or
    spread, in B). We use the standard deviation to
    measure this spread: SD(A) ≈ 1.1, SD(B) ≈ 2.4.
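
The two figures quoted for lists A and B can be reproduced with a few lines of Python. The sketch assumes the standard deviation is computed with divisor n (the population form), which is the form that yields 1.1 and 2.4; the helper names are illustrative.

```python
import math

A = [12, 10, 9, 9, 10]
B = [7, 10, 14, 11, 8]

def mean(xs):
    return sum(xs) / len(xs)

def population_sd(xs):
    """Square root of the mean squared deviation from the mean (divisor n)."""
    m = mean(xs)
    variance = sum((x - m) ** 2 for x in xs) / len(xs)
    return math.sqrt(variance)

print(mean(A), round(population_sd(A), 1))
print(mean(B), round(population_sd(B), 1))
```

The standard library's `statistics.pstdev` computes the same quantity; the explicit version is shown only to keep the formula visible.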

13
Variance & Standard Deviation
  • The variance is never negative and is zero only
    when all values are equal.
  • variance $= \frac{\sum (x_i - \bar{x})^2}{n}$
  • standard deviation $= \sqrt{\text{variance}}$
  • Alternatively, variance $= \frac{\sum x_i^2}{n} - \bar{x}^2$
14
Variance of a frequency distribution
15
Median
  • The median is the middle value. If the n elements
    are sorted (positions counted from 1), the median is
  • Median = valueAt((n+1)/2) for odd n
  • Median = average(valueAt(n/2), valueAt(n/2 + 1)) for even n
  • Example: 1, 2, 3, 4, 5; Median = 3
  • Example: 1, 2, 3, 4, 5, 6; Median = 3.5
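
The odd and even cases can be written as one short function. This is a sketch with an invented name (`median`); Python lists are indexed from 0, so the 1-based positions on the slide are shifted by one.

```python
def median(values):
    """Middle value of the sorted data (mean of the two middle values if n is even)."""
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        return xs[mid]                    # 1-based position (n+1)/2
    return (xs[mid - 1] + xs[mid]) / 2    # 1-based positions n/2 and n/2 + 1

print(median([1, 2, 3, 4, 5]))
print(median([1, 2, 3, 4, 5, 6]))
```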

16
Mode
  • The mode is the class or class value which occurs
    most frequently. We can have bimodal or
    multimodal collections of data.

The height of the bars is the number of cases in
the category
17
Bernoulli Trials
  • Independent repeated trials with two outcomes are
    called Bernoulli trials. The probability of k
    successes in a binomial experiment is
  • $P(k) = C(n,k)\, p^k (1-p)^{n-k}$
  • where n is the number of trials, p is the probability of success
    in a single trial, and (n-k) is the number of failures.

18
Bernoulli Trials Example
  • John hits the target with probability p = 1/4.
  • John fires 6 times, so n = 6.
  • What is the probability that John hits the target exactly 2
    times?
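
One way to work this example is with the usual binomial formula, C(n,k) p^k (1-p)^(n-k). The sketch below uses exact fractions; the function name `binomial_pmf` is an assumption made for illustration.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# John: p = 1/4, n = 6, exactly k = 2 hits.
p_two_hits = binomial_pmf(2, 6, Fraction(1, 4))
print(p_two_hits, float(p_two_hits))
```

This evaluates to 1215/4096, roughly 0.297.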

19
Bernoulli Trials Example
  • John hits the target with probability p = 1/4.
  • John fires 6 times, so n = 6.
  • What is the probability that John hits the target at
    least once?

Probability that John does not hit the target (no successes, all failures):
$P(0) = C(6,0)\,(1/4)^0\,(3/4)^6 = (3/4)^6$
There is only 1 way to pick 0 from 6, and anything to the power of 0 is 1
(except 0 to the power 0, which is undefined).
Probability that John hits the target at least once:
$1 - (3/4)^6$; in Excel, =1-((3/4)^6)
20
Bernoulli Trials Example
  • Mary hits the target with probability p = 1/4.
  • Mary fires 6 times, so n = 6.
  • What is the probability that Mary hits the target more
    than 4 times?

$P(5) + P(6) = 6\,(1/4)^5 (3/4)^1 + (1/4)^6$
In Excel: =(6)*((1/4)^5)*((3/4)^1)+(1/4)^6
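
The "at least once" and "more than 4 times" questions are both sums of binomial terms, so they can be checked the same way. The sketch below assumes the usual binomial formula; `binomial_pmf` is an illustrative helper, not something from the slides.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 6, Fraction(1, 4)

# At least one hit = complement of "no hits at all".
p_at_least_once = 1 - binomial_pmf(0, n, p)

# More than 4 hits = exactly 5 hits or exactly 6 hits.
p_more_than_4 = sum(binomial_pmf(k, n, p) for k in range(5, n + 1))

print(p_at_least_once, float(p_at_least_once))
print(p_more_than_4, float(p_more_than_4))
```

The results are 3367/4096 (about 0.82) and 19/4096 (about 0.005).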
21
Random variables and probability distributions.
  • Suppose you toss a coin two times. There are four
    possible outcomes HH, HT, TH, and TT. Let the
    variable X represent the number of heads that
    result from this experiment. The variable X can
    take on the values 0, 1, or 2. In this example, X
    is a random variable because its value is
    determined by the outcome of a statistical
    experiment.

22
Random variables and probability distributions.
  • A probability distribution is a table or an
    equation that links each outcome of a statistical
    experiment with its probability of occurrence.
    The table below, which associates each outcome
    (the number of heads) with its probability, is
    an example of a probability distribution.

Number of heads  0     1     2
Probability      0.25  0.50  0.25

23
Random Variable
  • A random variable X on a finite sample space S is
    a function (or mapping) from S to the real
    numbers R.
  • Let S be the sample space of outcomes from tossing
    two coins. Then mapping a is
  • S = {HH, HT, TH, TT} (assume HT ≠ TH)
  • Xa(HH) = 1, Xa(HT) = 2, Xa(TH) = 3, Xa(TT) = 4
  • The range (image) of Xa is
  • {1, 2, 3, 4}

24
Random Variable
  • Let S be the sample space of outcomes from tossing
    two coins, where we are interested in the number
    of heads. Mapping b is
  • S = {HH, HT, TH, TT}
  • Xb(HH) = 2, Xb(HT) = 1, Xb(TH) = 1, Xb(TT) = 0
  • The range (image) of Xb is
  • {0, 1, 2}
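
A random variable is just a function on the sample space, so mapping b can be written directly as one. The sketch assumes the four outcomes are equally likely (fair coins) and collects the probability of each value of Xb; the names `X_b` and `distribution` are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

S = ["HH", "HT", "TH", "TT"]   # sample space, assumed equiprobable

def X_b(outcome):
    """Random variable: number of heads in the outcome."""
    return outcome.count("H")

# P(x) = sum of the probabilities of the outcomes whose image is x.
distribution = defaultdict(Fraction)
for outcome in S:
    distribution[X_b(outcome)] += Fraction(1, len(S))

print(sorted(distribution.items()))
```

The keys of `distribution` are the range {0, 1, 2}, with probabilities 1/4, 1/2 and 1/4.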

25
Random Variable
  • A random variable is a function that maps a
    finite sample space to numeric values. The
    set of values forms a finite probability space of
    real numbers, where probabilities are assigned to
    the new space according to the following rule:
  • $p_i = P(x_i)$ = sum of the probabilities of the points
    in S whose image is $x_i$.
  • Recall: a function F: Domain -> Range (Image)

26
Random Variable
  • The function assigning $p_i$ to $x_i$ can be given as a
    table called the distribution of the random
    variable.
  • In an equiprobable space,
    $p_i = P(x_i) = \frac{\text{number of points in } S \text{ whose image is } x_i}{\text{number of points in } S}$
  • for i = 1, 2, 3, ..., n, which gives the distribution of X.

27
Random Variable
  • The equiprobable space generated by tossing a pair
    of fair dice consists of 36 ordered pairs:
  • S = {<1,1>, <1,2>, <1,3>, ..., <6,6>}
  • Let X be the random variable which assigns to
    each element of S the sum of the two dice, an
    integer from 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12.

28
Random Variable
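  • Continuing with the sum of the two dice.
  • There is only one point whose image is 2, giving
    P(2) = 1/36.
  • There are two points whose image is 3, giving
    P(3) = 2/36. (<1,2> ≠ <2,1>, but their sums are equal.)
  • Below is the distribution of X.

xi  2     3     4     5     6     7     8     9     10    11    12
pi  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
The probabilities sum to 36/36.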
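
The distribution of the sum of two fair dice can be generated by enumerating all 36 ordered pairs and counting how many map to each sum. This is a sketch with illustrative names; note that `Fraction` reduces values automatically (2/36 prints as 1/18).

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The equiprobable space: all ordered pairs <a, b> with a, b in 1..6.
S = list(product(range(1, 7), repeat=2))

# Number of points in S whose image (the sum) is x.
counts = Counter(a + b for a, b in S)

# p_i = (points whose image is x_i) / (points in S)
dist = {x: Fraction(c, len(S)) for x, c in sorted(counts.items())}

for x, p in dist.items():
    print(x, p)
```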
29
Example Random Variable
  • A box contains 9 good items and 3 defective items
    (total 12 items). Three items are selected at
    random from the box. Let X be the random variable
    that counts the number of defective items in a
    sample. X has range space Rx =
    {0, 1, 2, 3}.
  • The sample space has 12-choose-3 = 220 different
    samples of size 3.
  • There are 9-choose-3 = 84 samples of size 3 with
    0 defective items.
  • There are 3 × 9-choose-2 = 108 samples of size
    3 with 1 defective item.
  • There are 3-choose-2 × 9 = 27 samples of size 3
    with 2 defective items.
  • There is 3-choose-3 = 1 sample of size 3 with 3
    defective items.
  • Here n-choose-r means the number of
    combinations C(n,r); in Excel, =COMBIN(12,3).

84 + 108 + 27 + 1 = 220
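
The four counts follow one pattern: choose k of the 3 defective items and the remaining 3 - k from the 9 good ones. A minimal sketch of that calculation, with constant names invented for illustration:

```python
from fractions import Fraction
from math import comb

GOOD, DEFECTIVE, SAMPLE = 9, 3, 3

# Total number of samples of size 3 from 12 items.
total = comb(GOOD + DEFECTIVE, SAMPLE)

# Samples with exactly k defective items: C(3, k) * C(9, 3 - k).
counts = {k: comb(DEFECTIVE, k) * comb(GOOD, SAMPLE - k) for k in range(SAMPLE + 1)}

# Distribution of X (fractions are reduced automatically, e.g. 84/220 -> 21/55).
dist = {k: Fraction(c, total) for k, c in counts.items()}

print(total, counts)
print(dist)
```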
30
Example Random Variable
  • A box contains 9 good items and 3 defective items
    (total 12 items). Three items are selected at
    random from the box. Let X be the random variable
    that counts the number of defective items in a
    sample. X can have values 0-3.
  • Below is the distribution of X.

xi  0       1        2       3
pi  84/220  108/220  27/220  1/220
The counts sum to 84 + 108 + 27 + 1 = 220, so the probabilities sum to 220/220.
31
Functions of a Random Variable
  • If X is a random variable then so is Y = f(X).
  • $P(y_k)$ = sum of the probabilities $P(x_i)$ such that
    $y_k = f(x_i)$

32
Expectation and variance of a random variable
  • Let X be a discrete random variable over sample
    space S.
  • X takes values $x_1, x_2, x_3, \ldots, x_t$ with respective
    probabilities $p_1, p_2, p_3, \ldots, p_t$.
  • An experiment which generates S is repeated n
    times and the numbers $x_1, x_2, x_3, \ldots, x_t$ occur with
    frequencies $f_1, f_2, f_3, \ldots, f_t$ ($\sum f_i = n$).
  • If n is large then
  • one expects $\frac{f_i}{n} \approx p_i$

33
Expectation of a random variable
  • So $\bar{x} = \frac{\sum f_i x_i}{n}$ becomes $\sum x_i p_i$.
  • The final formula is the population mean,
    expectation, or expected value of X, denoted
    $\mu$ or $E(X)$.

34
Variance of a random variable
  • The variance of X is denoted as $\sigma^2$ or Var(X).
  • $\sigma^2 = \mathrm{Var}(X) = \sum (x_i - \mu)^2 p_i$
  • The standard deviation is $\sigma = \sqrt{\mathrm{Var}(X)}$

35
Expected value, Variance, Standard Deviation
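  • $E(X) = \mu = \mu_x = \sum x_i p_i$
  • $\mathrm{Var}(X) = \sigma^2 = \sigma_x^2 = \sum (x_i - \mu)^2 p_i$
  • $\mathrm{SD}(X) = \sigma_x = \sqrt{\mathrm{Var}(X)}$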

36
Relation between population and sample mean.
  • If we select a sample of size N at random from a
    population, then it is possible to show that the
    expected value of the sample mean m is the
    population mean µ.
  • This rule differs slightly for variance. The
    expected value of the sample variance (computed
    with divisor N) is (N-1)/N times the population
    variance.
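
Both statements can be verified exactly on a tiny case by enumerating every possible sample. The sketch below makes assumptions that are not in the lecture: a hypothetical equiprobable population {1, 2, 3}, samples of size N = 2 drawn with replacement, and a sample variance computed with divisor N.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3]   # hypothetical equiprobable population
N = 2                    # sample size

def mean(xs):
    return Fraction(sum(xs), len(xs))

def variance(xs):
    """Mean squared deviation from the mean (divisor len(xs))."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Every ordered sample of size N, drawn with replacement, is equally likely.
samples = list(product(population, repeat=N))

expected_sample_mean = sum(mean(s) for s in samples) / len(samples)
expected_sample_var = sum(variance(s) for s in samples) / len(samples)

print(expected_sample_mean, mean(population))
print(expected_sample_var, Fraction(N - 1, N) * variance(population))
```

Each printed line shows two equal values, matching the two bullet points. The (N-1)/N factor is the reason the divisor N-1 is often used when estimating a population variance from a sample.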

37
Example Random Variable Expected Value
  • A box contains 9 good items and 3 defective
    items. Three items are selected at random from
    the box. Let X be the random variable that counts
    the number of defective items in a sample. X can
    have values 0-3.
  • Below is the distribution of X.

38
Example Random Variable Expected Value
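  • µ is the expected number of defective items
    in a sample of size 3.
  • $\mu = E(X)$
  • $= 0(84/220) + 1(108/220) + 2(27/220) + 3(1/220) = 165/220 = 0.75$
  • $\mathrm{Var}(X)$
  • $= 0^2(84/220) + 1^2(108/220) + 2^2(27/220) + 3^2(1/220) - \mu^2 = 225/220 - 0.5625 \approx 0.46$
  • $\mathrm{SD}(X) = \sqrt{\mathrm{Var}(X)} \approx 0.68$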
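
The arithmetic for this example is easy to get wrong by hand, so a quick exact check may help. The sketch takes the distribution 84/220, 108/220, 27/220, 1/220 as given and uses Var(X) = E(X²) - µ²; the variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

# Distribution of X = number of defective items in a sample of 3.
dist = {
    0: Fraction(84, 220),
    1: Fraction(108, 220),
    2: Fraction(27, 220),
    3: Fraction(1, 220),
}

mu = sum(x * p for x, p in dist.items())         # E(X)
e_x2 = sum(x**2 * p for x, p in dist.items())    # E(X^2)
var = e_x2 - mu**2                               # Var(X) = E(X^2) - mu^2
sd = sqrt(var)                                   # SD(X)

print(mu, var, round(sd, 2))
```

The weighted sum in the numerator is 0·84 + 1·108 + 2·27 + 3·1 = 165, so µ = 165/220 = 3/4.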

39
Fair Game 1?
  • If a prime number appears on a fair die the
    player wins that value. If a non-prime appears
    the player loses that value. Is the game
    fair, i.e. is E(X) = 0?
  • S = {1, 2, 3, 4, 5, 6}
  • $E(X) = 2(1/6) + 3(1/6) + 5(1/6) + (-1)(1/6) + (-4)(1/6) + (-6)(1/6) = -1/6$
  • Note: 1 is not prime. Since E(X) ≠ 0, the game is not fair.

40
Fair Game 2?
  • A player gambles on the toss of two fair coins.
    If 2 heads occur the player wins 2 Euro. If 1
    head occurs he wins 1 Euro. If no heads occur he
    loses 3 Euro. Is the game fair, i.e. is E(X) = 0?
  • S = {HH, HT, TH, TT}
  • X(HH) = 2, X(HT) = X(TH) = 1, X(TT) = -3
  • $E(X) = 2(1/4) + 1(2/4) - 3(1/4) = 0.25$, so the game is not fair (it favours the player).
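
Both games reduce to the same calculation: list each payoff with its probability and sum the products. A small sketch, with an invented helper `expected_value`, that checks the two results exactly:

```python
from fractions import Fraction

def expected_value(payoffs):
    """payoffs: iterable of (value, probability) pairs."""
    return sum(v * p for v, p in payoffs)

# Game 1: win the face value on a prime, lose it otherwise (1 is not prime).
PRIMES = {2, 3, 5}
die_game = [(face if face in PRIMES else -face, Fraction(1, 6)) for face in range(1, 7)]

# Game 2: two coins; +2 for two heads, +1 for one head, -3 for no heads.
coin_game = [(2, Fraction(1, 4)), (1, Fraction(2, 4)), (-3, Fraction(1, 4))]

print(expected_value(die_game))    # negative: unfavourable to the player
print(expected_value(coin_game))   # positive: favourable to the player
```

A game is fair only when this expected value is exactly zero, which neither game satisfies.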

41
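Mean (µ), Variance (σ²), Standard Deviation (σ)
xi  2    3    11
pi  1/3  1/2  1/6
$\mu = \sum x_i p_i = 2(1/3) + 3(1/2) + 11(1/6) = 4$
$E(X^2) = \sum x_i^2 p_i = 2^2(1/3) + 3^2(1/2) + 11^2(1/6) = 26$
$\sigma^2 = \mathrm{Var}(X) = E(X^2) - \mu^2 = 26 - 4^2 = 10$
$\sigma = \sqrt{\mathrm{Var}(X)} = \sqrt{10} \approx 3.2$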
43
Distribution Example(1)
  • Five cards are numbered 1 to 5. Two cards are
    drawn at random. Let X denote the sum of the
    numbers drawn. Find (a) the distribution of X and
    (b) the mean, variance, and standard deviation.
  • There are C(5,2) = 10 ways of drawing two cards
    at random.

44
Distribution Example(2)
  • Ten equiprobable sample points with their
    corresponding X-values are

points  1,2  1,3  1,4  1,5  2,3  2,4  2,5  3,4  3,5  4,5
xi      3    4    5    6    5    6    7    7    8    9
45
Distribution Example(3)
  • The distribution is

xi  3    4    5    6    7    8    9
pi  0.1  0.1  0.2  0.2  0.2  0.1  0.1
46
Distribution Example(4)
  • The distribution is

xi  3    4    5    6    7    8    9
pi  0.1  0.1  0.2  0.2  0.2  0.1  0.1
  • The mean is $3(0.1) + \ldots + 9(0.1) = 6$
  • $E(X^2)$ is $3^2(0.1) + \ldots + 9^2(0.1) = 39$
  • The variance is $39 - 6^2 = 3$
  • The SD is $\sqrt{3} \approx 1.7$
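
The whole card example, from the ten sample points to the standard deviation, can be reproduced by enumeration. This is a sketch with illustrative variable names; exact fractions are used so the intermediate values match the slide.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import sqrt

# The C(5,2) = 10 equiprobable ways of drawing two of the cards 1..5.
draws = list(combinations(range(1, 6), 2))

# Distribution of X = sum of the two numbers drawn.
counts = Counter(a + b for a, b in draws)
dist = {x: Fraction(c, len(draws)) for x, c in sorted(counts.items())}

mean = sum(x * p for x, p in dist.items())
e_x2 = sum(x**2 * p for x, p in dist.items())
var = e_x2 - mean**2
sd = sqrt(var)

print(dist)
print(mean, e_x2, var, round(sd, 1))
```

Each distinct sum appears once in `dist`, with the probabilities of repeated sums (5, 6 and 7) added together.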