Describing Data: Frequency Distributions and Graphic Presentation - PowerPoint PPT Presentation

Slides: 38
Provided by: joey62
1
Describing Data: Frequency Distributions and
Graphic Presentation
2
Frequency Distribution
  • A Frequency Distribution is a grouping of data
    into mutually exclusive categories showing the
    number of observations in each class.
  • (Explanation) You are just developing categories
    or classes based on a characteristic and then
    putting elements into categories based on that
    characteristic. No element appears in more than
    one class.

3
Frequency Distribution
  • Here is an analogy: we divide clothes to wash
    into WHAT THREE CATEGORIES???!!!
  • Whites, Lights/Colors, and Darks, right?
  • The freq. dist. of clothes is developed by
    counting how many articles of clothing are in
    each laundry bin.

4
Rule of Thumb for Developing a Frequency Dist.
  • Step 1: Decide on the number of classes (k) or
    containers. Hint: must be more than 1, but less
    than a million. Use 2^k > n, where k = number of
    classes, n = number of obs.
  • If obs = 50, 2^6 = 64 > 50, so we should use at least 6
    classes.
  • Step 2: Determine the class interval width (i).
  • Should be the same for all classes, and
  • Cover lowest (L) to highest (H) observation value.
  • i = (H-L)/k
  • This is a rule of thumb, folks; typically round up
    to the next convenient number for i, e.g., 8.9
    becomes 10 and 94 becomes 100.
  • Step 3: Set the individual class limits. Don't
    overlap at all. E.g., for dollars, use classes like
    50-59, 60-69, and so on. Don't have 50-60,
    60-70; if something is 60 it will appear in two
    classes.
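
Steps 1 to 3 can be sketched in a few lines of Python. This is only an illustration of the rule of thumb, not part of the original slides; the function names are made up here, and the inputs (n = 50, L = 52, H = 109, start at 50, width 10) are the ones used in the Hudson Auto example in this deck.

```python
def number_of_classes(n):
    """Smallest k such that 2**k > n (the rule of thumb from Step 1)."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


def class_width(low, high, k):
    """Raw width i = (H - L) / k, before rounding up to a convenient number."""
    return (high - low) / k


def class_limits(start, width, k):
    """Non-overlapping whole-number classes such as (50, 59), (60, 69), ..."""
    return [(start + j * width, start + (j + 1) * width - 1) for j in range(k)]


k = number_of_classes(50)          # 2**6 = 64 > 50, so k is 6
raw_i = class_width(52, 109, k)    # (109 - 52) / 6 = 9.5
limits = class_limits(50, 10, k)   # width rounded up to 10, classes start at 50
print(k, raw_i, limits)
```

Rounding the width up and picking the starting point are still judgment calls; the code leaves both to the caller.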

5
Rule of Thumb for Developing a Frequency Dist.
  • Step 4: Tally the items into classes.
  • Step 5: Count the number of items in each class.
  • Now you can graphically depict the counts with a
    histogram.
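
Tallying and counting (Steps 4 and 5) might look like the sketch below. The `costs` list is hypothetical, made up for illustration; it is not the Hudson Auto data, and the `tally` helper is a name chosen here.

```python
def tally(values, limits):
    """Count how many values fall into each (lower, upper) class, limits inclusive."""
    counts = [0] * len(limits)
    for v in values:
        for idx, (lower, upper) in enumerate(limits):
            if lower <= v <= upper:
                counts[idx] += 1
                break
        else:
            # No class matched: the classes do not cover the data.
            raise ValueError(f"{v} does not fit any class")
    return counts


# Hypothetical values, for illustration only.
costs = [52, 61, 68, 73, 75, 79, 84, 91, 97, 104]
limits = [(50, 59), (60, 69), (70, 79), (80, 89), (90, 99), (100, 109)]
counts = tally(costs, limits)
for (lower, upper), count in zip(limits, counts):
    print(f"{lower}-{upper}: {count}")
```

Because the classes do not overlap, each value is counted exactly once, and the counts add up to the number of observations.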

6
Example: Hudson Auto Repair
The manager of Hudson Auto would like to have a
better understanding of the cost of parts used in
the engine tune-ups performed in the shop. She
examines 50 customer invoices for tune-ups. The
costs of parts, rounded to the nearest dollar,
are listed on the next slide.
7
Example: Hudson Auto Repair
  • Sample of Parts Cost for 50 Tune-ups
  • Based on the rule of thumb, how many classes
    might we use? 2^k > n, where n is 50.
  • 2^6 = 64, which is juuuust greater than 50.
  • Based on the rule of thumb, what should the
    width of the classes be? i = (H-L)/k
  • (109-52)/6 = 9.5. Let's round up to 10 to make
    it easy, and let's start the classes at 50 (just
    lower than the lowest observation).

8
Tabular Summary: Frequency and Relative (or
Percent) Frequency

Parts Cost ($)   Parts Frequency   Relative Frequency (%)
50-59                    2                   4
60-69                   13                  26
70-79                   16                  32
80-89                    7                  14
90-99                    7                  14
100-109                  5                  10
Total                   50                 100

Note: the 4 in the first row is (2/50) x 100.
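
The percent column can be reproduced from the frequency column. A minimal sketch, using the six class frequencies from this slide (2, 13, 16, 7, 7, 5); the variable names are chosen here for illustration.

```python
classes = ["50-59", "60-69", "70-79", "80-89", "90-99", "100-109"]
frequencies = [2, 13, 16, 7, 7, 5]

n = sum(frequencies)                              # 50 tune-ups in total
relative = [f / n for f in frequencies]           # shares of the whole
percent = [f * 100 / n for f in frequencies]      # (f / n) x 100

for label, f, p in zip(classes, frequencies, percent):
    print(f"{label:>7}  {f:>3}  {p:>5.1f}")
```

The percents always add to 100, which is a quick check that no class was left out.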
9
Graphical Summary: Histogram
[Figure: histogram titled "Tune-up Parts Cost". Vertical axis: Frequency. Horizontal axis: Parts Cost ($), with ticks at 50, 60, 70, 80, 90, 100, 110, 120.]
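
Since the chart itself is not reproduced in this transcript, here is a rough text-only stand-in. It is a sketch, not the original figure: it draws one row of `#` marks per class, using the class frequencies from the tabular summary slide (2, 13, 16, 7, 7, 5), and the helper name is made up.

```python
classes = ["50-59", "60-69", "70-79", "80-89", "90-99", "100-109"]
frequencies = [2, 13, 16, 7, 7, 5]


def text_histogram(labels, counts):
    """Return one line per class: right-aligned label, a bar of '#', then the count."""
    width = max(len(label) for label in labels)
    return [f"{label:>{width}} | {'#' * c} {c}" for label, c in zip(labels, counts)]


lines = text_histogram(classes, frequencies)
print("\n".join(lines))
```

The bars run sideways here, but the reading is the same as for the real histogram: the longest bar (70-79) is the most common parts cost.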
10
Numerical Descriptive Statistics
  • The most common numerical descriptive statistic
    is the average (or mean).
  • Hudson's average cost of parts, based on the 50
    tune-ups studied, is $79 (found by summing the
    50 cost values and then dividing by 50).
  • In Excel there are several common ways for
    obtaining the mean. Three of the most common are:
  • AVERAGE()
  • SUM()/n, where n in this case is 50.
  • Tools > Data Analysis >
    Descriptive Statistics > Summary Statistics
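
The first two Excel routes have direct Python counterparts. A small sketch; the `costs` list is hypothetical (the 50 invoice values are not included in this transcript), so the result is not the Hudson figure.

```python
from statistics import mean

# Hypothetical values, for illustration only.
costs = [52, 61, 68, 73, 75, 79, 84, 91, 97, 104]

avg_builtin = mean(costs)               # plays the role of AVERAGE()
avg_manual = sum(costs) / len(costs)    # plays the role of SUM()/n

print(avg_builtin, avg_manual)
```

Both routes give the same number; the manual version just spells out "sum the values, then divide by how many there are."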

11
Statistical Inference
Population
- the set of all elements of interest in a
particular study
Sample
- a subset of the population
Statistical inference
- the process of using data obtained from a
sample to make estimates and test hypotheses
about the characteristics of a population
Census
- collecting data for a population
Sample survey
- collecting data for a sample
12
Process of Statistical Inference
1. Population consists of all tune-ups.
Average cost of parts is unknown.
2. A sample of 50 engine tune-ups is examined.
3. The sample data provide a sample average
parts cost of $79 per tune-up.
4. The sample average is used to estimate the
population average.
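
The four steps can be imitated with a simulation. Everything below is made up for illustration: the "population" is 5,000 randomly generated costs between 50 and 109, not real tune-up data. In practice the population average is unknown, which is why the sample is needed.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Step 1: a hypothetical population of tune-up parts costs.
population = [random.randint(50, 109) for _ in range(5000)]

# Step 2: examine a sample of 50 tune-ups.
sample = random.sample(population, 50)

# Step 3: compute the sample average.
sample_mean = mean(sample)

# Step 4: use it as an estimate of the population average.
population_mean = mean(population)  # only available because this is a simulation
print(f"sample: {sample_mean:.1f}  population: {population_mean:.1f}")
```

The two averages will usually be close but not identical; the gap is sampling error.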
13
Statistical Analysis Using Microsoft Excel
  • Statistical analysis typically involves working with
    large amounts of data.
  • Computer software is typically used to conduct the
    analysis.
  • Frequently the data that is to be analyzed resides in a
    spreadsheet (or, it will when you are done with it).
  • Modern spreadsheet packages are capable of data
    management, analysis, and presentation. The Analysis
    ToolPak is an add-in in Excel.
  • MS Excel is the most widely available spreadsheet
    software in business organizations.

14
Statistical Analysis Using Microsoft Excel
  • 3 tasks might be needed
  • Enter Data
  • Enter Functions and Formulas
  • Apply Tools

15
Statistical Analysis Using Microsoft Excel
  • Data Set

Note: Rows 10-51 are not shown.
16
Statistical Analysis Using Microsoft Excel
  • Formula Worksheet

Note: Columns A-B and rows 10-51 are not
shown. Neat Excel trick not taught in CISM: to
view a function instead of its result, press
<Ctrl> + ` (the grave accent key).
17
Statistical Analysis Using Microsoft Excel
  • Value Worksheet

Note: Columns A-B and rows 10-51 are not shown.
18
Pop Quiz!!!
  • You were just handed an Excel spreadsheet with
    two years of monthly sales data from Off Campus
    Liquor, a local beverage distributor.
  • Your manager says, make this data say something,
    our jobs are on the line. He then staggers out
    of the door and passes out in the parking lot.
  • Although you have very little actual experience
    in statistics, you know a few things about the
    data and how it might be presented. Right?!

19
Questions
  • Is the data cross section or time series?
    Answer: time series.
  • Is sales data Qualitative or Quantitative?
    Answer: Quantitative.
  • How many observations are there?
    Answer: 24.
20
Let's Look at Exercise 7 (page 31)
  • The data set is pb2-07.xls
  • The BiLo store is gathering info on its customer
    visits during each month.
  • You need to use the data to create a frequency
    distribution.
  • Start with 0 as the lower limit of the first
    class and use a class interval of 3.
  • Describe the distribution (see any clusters?)
  • Convert the distribution to a relative frequency
    distribution.
  • There are several ways to attack this
    problem; let's look at one.
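
Outside Excel, one possible way to attack it is sketched below. The `visits` list is hypothetical (the contents of pb2-07.xls are not part of this transcript), and the function name is made up. Classes are "0 up to 3", "3 up to 6", and so on, so a value of exactly 3 lands in the second class.

```python
from collections import Counter


def frequency_distribution(values, start=0, width=3):
    """Map each class lower limit to its count; a class runs from lower up to lower + width.

    Assumes every value is at least `start`.
    """
    counts = Counter(start + ((v - start) // width) * width for v in values)
    highest = max(counts)
    # Include empty classes so gaps in the distribution stay visible.
    return {lower: counts.get(lower, 0) for lower in range(start, highest + width, width)}


# Hypothetical monthly visit counts, for illustration only.
visits = [1, 2, 4, 5, 5, 7, 8, 8, 10, 14]
dist = frequency_distribution(visits)
relative = {lower: count / len(visits) for lower, count in dist.items()}

for lower, count in dist.items():
    print(f"{lower} up to {lower + 3}: {count}  ({relative[lower]:.0%})")
```

Dividing each count by the number of observations is all it takes to convert to a relative frequency distribution.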

21
Homework
  • For the next class period, try 4, 6 and 8 on
    pages 31-32

22
Other Graphical Depictions of Data
  • Pie Chart-for Relative Frequencies and Shares of
    the Whole
  • Line Graphs-for changes over time, trends, or
    differences between groups
  • Bar Charts-Similar to line graphs in their uses.
    Sometimes they make for better pair-wise
    comparisons.

23
The three commonly used graphic forms are
Histograms, Frequency Polygons, and a
Cumulative Frequency distribution.
A Histogram is a graph in which the class
midpoints or limits are marked on the horizontal
axis and the class frequencies on the vertical
axis. The class frequencies are represented by
the heights of the bars and the bars are drawn
adjacent to each other.
24
Example Histogram for Hours Spent Studying
Class widths are all the same
7.5 up to 12.5
12.5 up to 17.5
17.5 up to 22.5
22.5 up to 27.5
27.5 up to 32.5
32.5 up to 37.5
Midpoints of classes
How do you read this graphic? How many people
study around 20 hours per week? How many study
less than 32.5 hours per week?
25
Graphic Presentation of a Frequency Distribution
A Frequency Polygon consists of line segments
connecting the points formed by the class
midpoint and the class frequency.
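
The points of a frequency polygon are just (midpoint, frequency) pairs. A sketch using the class boundaries from the Hours Spent Studying histogram slide; the frequencies are hypothetical, because the chart's values are not reproduced in this transcript.

```python
classes = [(7.5, 12.5), (12.5, 17.5), (17.5, 22.5),
           (22.5, 27.5), (27.5, 32.5), (32.5, 37.5)]


def midpoints(class_bounds):
    """Midpoint of each class: halfway between its lower and upper boundary."""
    return [(lower + upper) / 2 for lower, upper in class_bounds]


def polygon_points(class_bounds, frequencies):
    """Pair each class midpoint with its frequency; connect the pairs with line segments."""
    return list(zip(midpoints(class_bounds), frequencies))


# Hypothetical frequencies, for illustration only.
frequencies = [2, 8, 10, 6, 3, 1]
points = polygon_points(classes, frequencies)
print(points)
```

Plotting these points and joining them in order gives the polygon.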
26
Frequency Polygon for Hours Spent Studying
27
Both on the same Chart
28
Cumulative Frequency Distribution
A Cumulative Frequency Distribution is used to
determine how many or what proportion of the data
values are below or above a certain value. You
are just adding up as you go along.
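
"Adding up as you go along" is a running total. A minimal sketch, borrowing the Hudson Auto class frequencies from earlier in this deck (2, 13, 16, 7, 7, 5), since the hours-studying table is not reproduced in the transcript.

```python
from itertools import accumulate

upper_limits = [59, 69, 79, 89, 99, 109]   # upper limit of each parts-cost class
frequencies = [2, 13, 16, 7, 7, 5]

cumulative = list(accumulate(frequencies))             # [2, 15, 31, 38, 45, 50]
n = cumulative[-1]
cumulative_percent = [c * 100 / n for c in cumulative]

for upper, c, p in zip(upper_limits, cumulative, cumulative_percent):
    print(f"{upper} or less: {c} tune-ups ({p:.0f}%)")
```

The last cumulative frequency always equals the total number of observations.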
29
Cumulative Frequency Table for Hours Spent
Studying
30
Cumulative Frequency Distribution For Hours
Studying
31
Line graphs are typically used to show the change
or trend in a variable over time.
32
Example 3 continued
33
A Bar Chart can be used to depict any of the
levels of measurement (nominal, ordinal,
interval, or ratio).
Construct a bar chart for the number of
unemployed per 100,000 population for selected
cities during 2001
34
Bar Chart for the Unemployment Data
35
A Pie Chart is useful for displaying a relative
frequency distribution. A circle is divided
proportionally to the relative frequency and
portions of the circle are allocated for the
different groups.
A sample of 200 runners were asked to indicate
their favorite type of running shoe. Draw a pie
chart based on the following information.
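
The size of each slice follows from its relative frequency: the slice angle is 360 degrees times the share. A sketch with hypothetical shoe counts; the brand labels and numbers are placeholders chosen to total 200, not the data from the slide.

```python
def pie_slices(counts):
    """Map each label to (relative frequency, slice angle in degrees)."""
    total = sum(counts.values())
    return {label: (count / total, 360 * count / total)
            for label, count in counts.items()}


# Hypothetical counts for 200 runners, for illustration only.
shoes = {"Brand A": 80, "Brand B": 60, "Brand C": 40, "Other": 20}
slices = pie_slices(shoes)

for label, (share, angle) in slices.items():
    print(f"{label}: {share:.0%} of runners, {angle:.0f} degrees")
```

The angles add up to 360, the same way the relative frequencies add up to 1.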
36
Pie Chart for Running Shoes
37
More homework
  • Do problems 28 and 32 on pages 48-52 in addition
    to 4, 6 and 8 on pages 31-32 assigned earlier.