Elementary Statistics - PowerPoint PPT Presentation

About This Presentation
Title:

Elementary Statistics

Description:

Divide the range of the data (smallest to largest) into ... Good choice for unimodal and roughly symmetric. Median: It is not influenced by extreme values. ... – PowerPoint PPT presentation

Slides: 40
Provided by: yaya7

Transcript and Presenter's Notes

Title: Elementary Statistics


1
Elementary Statistics
  • Summary

2
Where are we?
  • Scientific Method
  • Formulate a theory -- hypothesis
  • Collect data to test the theory -- sampling method
    and experimental design
  • Analyze the results --- ?
  • Interpret the results and make a decision
    --p-value approach

3
Histogram
  • Construct the frequency table
  • Find the range of data
  • Divide the range of the data (smallest to
    largest) into classes of equal width, without
    overlap
  • Count the number of observations that fall into
    each class--- frequency table
  • Plot the graph
  • Draw a horizontal axis and mark off the classes
    along this axis.
  • The vertical axis can be the count, the
    proportion, or the percentage.
  • Draw a rectangle (a vertical bar) above each
    class with the height equal to the count, the
    proportion, or the percentage.
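
The frequency-table step above can be sketched in Python. This is only an illustration: the `scores` list is made-up data, the function name `frequency_table` is hypothetical, and the choice to count the maximum value in the last class is an assumption, not something the slides specify.

```python
def frequency_table(data, low, width, n_classes):
    """Count observations in n_classes equal-width, non-overlapping classes.

    Each class is [start, start + width). As an assumption for this sketch,
    the last class also includes its upper endpoint so the maximum is counted.
    """
    counts = [0] * n_classes
    for x in data:
        idx = int((x - low) // width)
        if idx == n_classes and x == low + width * n_classes:
            idx = n_classes - 1  # put the maximum into the last class
        if 0 <= idx < n_classes:
            counts[idx] += 1
    return [(low + i * width, low + (i + 1) * width, c)
            for i, c in enumerate(counts)]


# Made-up exam scores; range 50-100 with class width 10
scores = [52, 67, 71, 75, 78, 83, 85, 88, 91, 100]
table = frequency_table(scores, low=50, width=10, n_classes=5)

# Text histogram: one star per observation in each class
for start, end, count in table:
    print(f"{start}-{end}: {'*' * count} ({count})")
```

Dividing each count by `len(scores)` would give the proportion version of the vertical axis.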

4
Use calculator to produce a histogram
  • Clear List (press STAT)
  • Enter data (press STAT)
  • Set the data range (press Window)
  • Xscl: increment of the data (length of each interval)
  • Set the stat plot options (press 2nd Y=)

5
More Exercise.
  • Construct a histogram of following data
  • Construct the frequency table
  • Range 50 -100
  • Scale 10

10
Example 4.10 Histogram versus Bar Graph
11
Guidelines for Plots, Graphs, and Pictures
  • Provide an appropriate title of the picture.
  • Include the source of the data and relevant
    details of the sample size and how the data
    were collected.
  • Be sure to label the axes appropriately.
  • Check to see if the frequency, proportion or
    percentage axis starts at zero.
  • Check to see if the axes maintain a constant
    scale.
  • Include the measuring units of the variables
    being summarized.

12
Measuring Center
  • Center of the set of data
  • Average gas mileage
  • Median income
  • GPA (grade point average)
  • Method
  • Mean
  • Weighted mean
  • Median
  • Mode

13
Mean
  • Example: The following data are the number of
    children in a household for a simple random
    sample of 10 households in a neighborhood:
  • 2, 3, 0, 2, 1, 0, 3, 0, 1, 4.
  • The mean of these 10 observations is
    (2 + 3 + 0 + 2 + 1 + 0 + 3 + 0 + 1 + 4) / 10 = 16 / 10 = 1.6

14
Is mean always the center?
  • A Mean Is Not Always Representative
  • Kim's test scores are 78, 98, 80, 0 and 81
  • Calculate Kim's mean test score. Explain why the
    mean does not do a very good job at summarizing
    Kim's test scores.
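
A quick check of Kim's scores, using Python's standard `statistics` module, shows how far the single 0 pulls the mean away from the typical score. The variable names are illustrative only.

```python
from statistics import mean, median

kim_scores = [78, 98, 80, 0, 81]

kim_mean = mean(kim_scores)      # pulled down by the single score of 0
kim_median = median(kim_scores)  # middle value of the sorted scores

print(f"mean = {kim_mean}, median = {kim_median}")
```

Four of the five scores are 78 or higher, yet the mean lands well below all of them.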

15
Think about mean?
  • Suppose a sample of size n = 10 observations is
    obtained.
  • Can the mean be larger than the maximum value or
    less than the minimum value?
  • Can the mean be the minimum value?
  • Can the mean be the maximum value?
  • Can the mean be exactly the midpoint?
  • Can the mean be exactly the second smallest
    value?

16
Combine means
  • Let's Do It! 5.2 Combining Means
  • We have seven students.
  • The mean score for three of these students is 54,
    and the mean score for the four other students is
    76.
  • What is the mean score for all seven students?
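
One way to approach this exercise: group means cannot simply be averaged, because the groups differ in size. Each mean has to be weighted by its group size. A minimal sketch, with the helper name `combined_mean` chosen for illustration:

```python
def combined_mean(groups):
    """groups: list of (group_size, group_mean) pairs."""
    total = sum(n * m for n, m in groups)   # recover each group's sum
    count = sum(n for n, _ in groups)       # total number of observations
    return total / count


overall = combined_mean([(3, 54), (4, 76)])
naive = (54 + 76) / 2  # ignores group sizes, so it is not the overall mean

print(round(overall, 2), naive)
```

The result, (3 x 54 + 4 x 76) / 7 = 466 / 7, is about 66.57, which sits closer to 76 because that group is larger.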

17
Weighted mean --- GPA
  • What's Kim's semester GPA?
  • ACSC145: 4.0 credits, grade C
  • AMTH140L: 1.0 credit, grade A
  • AMTH108: 3.0 credits, grade B
  • AMTH141: 4.0 credits, grade C
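
The GPA is a weighted mean in which each grade is weighted by the course's credits. The sketch below assumes the common 4-point scale (A = 4, B = 3, C = 2, D = 1, F = 0); the slides do not state the scale, so treat `GRADE_POINTS` as an assumption.

```python
# Assumed 4-point grading scale (not given in the slides)
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}

# (course, credits, letter grade) as listed on the slide
courses = [
    ("ACSC145", 4.0, "C"),
    ("AMTH140L", 1.0, "A"),
    ("AMTH108", 3.0, "B"),
    ("AMTH141", 4.0, "C"),
]


def gpa(course_list):
    """Weighted mean of grade points, weighted by credits."""
    total_credits = sum(credits for _, credits, _ in course_list)
    quality_points = sum(credits * GRADE_POINTS[grade]
                         for _, credits, grade in course_list)
    return quality_points / total_credits


kim_gpa = gpa(courses)
print(round(kim_gpa, 2))
```

Under that assumed scale the weighted mean is 29 / 12, noticeably lower than the unweighted average of the four grades, because the two C grades carry the most credits.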

18
Median
  • The median of a set of n observations, ordered
    from smallest to largest, is defined as follows:
  • the median is the middle observation, if the
    number of observations is odd
  • the median is any number between the two middle
    observations, if the number of observations is
    even (by convention, the average of the two)
  • The median is resistant; that is, it does not
    change, or changes very little, in response to
    extreme observations.
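
The definition above can be written out directly. This sketch uses the usual convention of averaging the two middle observations when n is even; the function name `median_of` is illustrative.

```python
def median_of(values):
    """Median by the ordered-position rule."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd n: the middle observation
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the two middle ones


with_zero = median_of([78, 98, 80, 0, 81])
without_zero = median_of([78, 98, 80, 75, 81])  # the 0 replaced by a typical score
even_case = median_of([1, 2, 3, 1000])

print(with_zero, without_zero, even_case)
```

Replacing the extreme score leaves the median unchanged, which is what "resistant" means here.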

20
Mode
  • The mode of a set of observations is the most
    frequently occurring value; it is the value
    having the highest frequency among the
    observations.
  • What would be the mode for 0, 0, 0, 0, 0,
    1, 2, 3, 4, 4, 4, 4, 5 ?
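
A small sketch for checking an answer to the question above: count each value and keep the one(s) with the highest count. Returning a list is a design choice for this illustration, since a data set can have more than one mode.

```python
from collections import Counter


def modes(values):
    """Return every value that attains the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


data = [0, 0, 0, 0, 0, 1, 2, 3, 4, 4, 4, 4, 5]
result = modes(data)
print(result, Counter(data)[result[0]])
```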

21
Three different methods (mean, median, mode): how to
use them?
22
Which measure of center to use?
23
Mean, Median, Mode
  • Mean
  • The point of equilibrium
  • Sensitive to outliers
  • Median
  • Resistant measure of center

25
A different distribution: LDI 5.6
26
Summary of Mean, Median, Mode
  • Mean
  • The most common measure of center, but it is
    affected by extreme observations. Good choice for
    unimodal and roughly symmetric distributions.
  • Median
  • It is not influenced by extreme values. Good
    choice for skewed distributions or distributions
    with outliers.
  • Mode
  • The value(s) that occur most often. For a
    distribution, the mode is the value associated
    with the highest peak.
  • Think about which average is computed!

28
Measuring Variation
  • Measures of center often give an incomplete
    picture of the data
  • List 1: 10, 20, 30, 40, 50, 50, 50, 60, 70, 80, 90
  • List 2: 46, 47, 48, 49, 50, 50, 50, 51, 52, 53, 54
  • Same mean, same median, same mode
  • But different variation
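
The claim about the two lists can be verified with Python's standard `statistics` module. The sketch uses the sample standard deviation (`stdev`, dividing by n - 1); that choice is an assumption, as the slides do not say which version is intended.

```python
from statistics import mean, median, mode, stdev

list1 = [10, 20, 30, 40, 50, 50, 50, 60, 70, 80, 90]
list2 = [46, 47, 48, 49, 50, 50, 50, 51, 52, 53, 54]


def summarize(data):
    """Center and spread measures for one list."""
    return {
        "mean": mean(data),
        "median": median(data),
        "mode": mode(data),
        "range": max(data) - min(data),
        "stdev": stdev(data),  # sample standard deviation (assumed)
    }


s1, s2 = summarize(list1), summarize(list2)
for name, s in (("List 1", s1), ("List 2", s2)):
    print(name, s)
```

All three measures of center agree at 50, while the range and standard deviation of List 1 are ten times those of List 2.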

29
Measure of variation
  • Range
  • Interquartile Range
  • Standard deviation
  • Q: Same range, but different variation?

30
Interquartile Range
  • The interquartile range measures the spread of
    the middle 50% of the data.
  • -- salary.com
  • Licensed Practical Nurse, Aiken, SC 29803:
    25th percentile 32,499; median 34,088;
    75th percentile 36,783
  • ----

31
How to calculate IQR
  • Find the median of all of the observations.
  • First quartile: Q1 = median of the observations that
    fall below the median.
  • Third quartile: Q3 = median of the observations that
    fall above the median.
  • IQR = Q3 - Q1
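
The procedure above can be sketched as follows. The halves are split by position in the sorted list, and the middle observation is left out of both halves when n is odd. The `data` values are illustrative only; they are not the data set from the slides.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]           # observations below the median
    upper = ordered[half + n % 2:]   # skips the middle observation when n is odd
    return median(lower), median(ordered), median(upper)


# Illustrative data (hypothetical, not from the slides)
data = [32, 40, 42, 43, 44, 46, 47, 51]
q1, med, q3 = quartiles(data)
iqr = q3 - q1
print(q1, med, q3, iqr)
```

Library routines often interpolate percentiles differently, so their quartiles may differ slightly from this method on small samples.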

32
Example
  • IQR = Q3 - Q1 = 46.5 - 41 = 5.5
  • We also call
  • Q1 the 25th percentile
  • Q2 (the median) the 50th percentile
  • Q3 the 75th percentile

33
Note
  • When the number of observations is odd, the
    middle observation is the median. This
    observation is not included in either of the two
    halves when computing Q1 and Q3.
  • Although different books, calculators, and
    computers may use slightly different ways to
    compute the quartiles, they are all based on the
    same idea.
  • In a left-skewed distribution, the first quartile
    will be farther from the median than the third
    quartile is. If the distribution is symmetric,
    the quartiles should be the same distance from
    the median.

34
Five-number summary when median is used
  • Five-number summary
  • Minimum, Q1, Median, Q3, Maximum

35
How to build box-plot
  • Five-number summary
  • min = 32, Q1 = 41, median = 43.5, Q3 = 46.5,
    max = 51

36
How to build boxplot by calculator
  • Step 1 clear list
  • Step 2 enter data in column L1
  • Step 3 Set the stat plot options for a box plot
    --- set the fourth type
  • Xlist: L1
  • Freq: 1

37
1.5x IQR Rule
  • Compute the quantity STEP = 1.5 x IQR
  • Find the location of the inner fences by taking 1
    step out from each of the quartiles
  • lower inner fence = Q1 - STEP
  • upper inner fence = Q3 + STEP
  • Observations that fall OUTSIDE the inner fences
    are considered potential outliers. If there are
    any outliers, plot them individually along the
    scale using a solid dot.
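
A minimal sketch of the rule, applied to the five-number summary quoted earlier in the slides (min 32, Q1 41, Q3 46.5, max 51). The full data set is not in the transcript, so only the minimum and maximum are checked here; the function names are illustrative.

```python
def inner_fences(q1, q3):
    """Lower and upper inner fences, one STEP out from each quartile."""
    step = 1.5 * (q3 - q1)
    return q1 - step, q3 + step


def potential_outliers(values, q1, q3):
    """Values falling outside the inner fences."""
    low, high = inner_fences(q1, q3)
    return [x for x in values if x < low or x > high]


low_fence, high_fence = inner_fences(41, 46.5)
# Only min and max are known from the summary, so only those are tested
flagged = potential_outliers([32, 51], 41, 46.5)
print(low_fence, high_fence, flagged)
```

With these quartiles STEP is 8.25, so the fences are 32.75 and 54.75. The minimum of 32 falls just below the lower fence and would be plotted as a potential outlier.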

38
LDI 5.9
39
Symmetric boxplot vs. symmetric distribution
  • Symmetric distribution => symmetric boxplot
  • But a symmetric boxplot does not imply a symmetric
    distribution
  • Here is a set of sample data
  • 1 1 3 5 5 7 7 9 9
  • Min
  • Q1
  • Median
  • Q3
  • Max
  • Side-by-side boxplot
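
The sample data above can be checked with a short sketch. It computes the five-number summary by the median-of-halves method (middle observation excluded when n is odd), then tests whether the boxplot is symmetric about the median and whether the data themselves are. The symmetry checks are one possible way to formalize the slide's point, not a standard library routine.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using median-of-halves quartiles."""
    ordered = sorted(values)
    n = len(ordered)
    lower, upper = ordered[:n // 2], ordered[(n + 1) // 2:]
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]


data = [1, 1, 3, 5, 5, 7, 7, 9, 9]
lo, q1, med, q3, hi = five_number_summary(data)

# Boxplot symmetry: whiskers and box halves are equal on both sides of the median
boxplot_symmetric = (med - lo == hi - med) and (med - q1 == q3 - med)

# Data symmetry: reflecting every value about the median gives the same data set
mirrored = sorted(2 * med - x for x in data)
data_symmetric = mirrored == sorted(data)

print((lo, q1, med, q3, hi), boxplot_symmetric, data_symmetric)
```

The summary is balanced around the median, yet the value 3 appears once while its mirror image 7 appears twice, so the data are not symmetric.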