Unit I: Descriptive Statistics

1
Unit I Descriptive Statistics
  • Introduction
  • Summarising and Describing Data
  • Graphical Methods of Presentation
  • Cross-Sectional and Time Series Data
  • Measures of Central Tendency
  • Measures of Variation
  • Measures of Shape

2
Introduction
  • What is Statistics
  • Basic Definitions

3
What is Statistics
  • Numerical facts
  • A group of methods used in the collection,
    analysis, presentation and interpretation of data
    in order to make decisions.

4
Key Definitions
  • A population is the collection of all items of
    interest or under investigation
  • N represents the population size
  • A sample is an observed subset of the population
  • n represents the sample size
  • A parameter is a specific characteristic of a
    population
  • A statistic is a specific characteristic of a
    sample

5
Population vs. Sample
Population: a b c d e f g h i j k l m n o p q r s t
u v w x y z
Sample: b c g i n o r u y
Values calculated using population data are
called parameters
Values computed from sample data are called
statistics
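
The parameter/statistic distinction can be sketched in a few lines of Python. The income figures below are hypothetical and chosen only for illustration; the point is that the same calculation (a mean) is a parameter when applied to all N items and a statistic when applied to a sample of n items.

```python
import random
from statistics import mean

# Hypothetical population: incomes (in thousands) of N = 10 families.
population = [42, 55, 38, 61, 47, 50, 66, 35, 58, 48]
N = len(population)

# Parameter: a characteristic computed from the whole population.
population_mean = mean(population)

# Statistic: the same characteristic computed from a sample of size n.
random.seed(1)  # fixed seed so the sketch is repeatable
n = 4
sample = random.sample(population, n)  # sampling without replacement
sample_mean = mean(sample)

print(f"N = {N}, population mean (parameter) = {population_mean}")
print(f"n = {n}, sample = {sample}, sample mean (statistic) = {sample_mean}")
```

A different seed would give a different sample and usually a different sample mean, while the population mean stays fixed.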
6
Examples of Populations
  • Names of all registered voters in Jamaica
  • Incomes of all families living in Kingston
  • Grade point averages of all the students in your
    university

7
Types of Statistics
  • Descriptive Statistics: methods used for
    organising, displaying and describing data
    using tables, graphs and summary measures.
  • Inferential Statistics: methods that use
    sample results to help make decisions or
    predictions (inferences) about a population.

8
Descriptive Statistics
  • Collect data (e.g., survey)
  • Present data (e.g., tables and graphs)
  • Summarize data (e.g., sample mean)

9
Inferential Statistics
  • Estimation, e.g., estimate the population mean
    weight using the sample mean weight
  • Hypothesis testing, e.g., test the claim that
    the population mean weight is 120 pounds

Inference is the process of drawing conclusions
or making decisions about a population based on
sample results
10
Descriptive vs. Inferential Statistics
  • Descriptive Statistics: collect, organize,
    summarize, display and analyze data
  • Inferential Statistics: predict and forecast
    values of a population, test hypotheses about
    values of a population, and make decisions

11
Basic Definitions
  • Data: numbers or measurements that are collected
  • Variables: characteristics or attributes that
    enable us to distinguish one individual from
    another; they take on different values when
    different individuals are observed (e.g. height)
  • Element: a single person, object or event in a
    data set

12
Types of Variables
  • Quantitative or Numerical: variables that can be
    measured numerically. Examples are number of
    children, age, weight and height.
  • Qualitative or Categorical: variables that
    cannot assume numerical values but can be
    classified into categories or groups. Examples
    are marital status, eye colour and opinions.

13
Summarising and Describing Data
14
Summarising and Describing Data
  • Describing the observed patterns in data is an
    important part of statistics
  • Distribution of a single variable: shape,
    outliers, centre and spread

15
Describing Data
16
Graphical Methods of Presenting Data
17
Graphical Methods of Presentation
  • Data in raw form are usually not easy to use for
    decision making
  • Some type of organization is needed: a table or
    a graph
  • Techniques reviewed here: bar charts and pie
    charts; frequency distributions, histograms and
    polygons; cumulative distributions and ogives

18
Graphical Methods of Presentation
  • Type of graphical representation depends on the
    type of data to be presented
  • When presenting quantitative data, use
    histograms, frequency polygons or cumulative
    frequency polygons
  • When presenting qualitative data, use pie
    charts or bar charts

19
Frequency Distributions
  • A frequency distribution is a table in which
    measurements are tallied; the frequency, or
    total number of times that each item occurs, is
    then recorded
  • Usually measurements are arranged in ascending or
    descending order
  • A frequency distribution has 3 columns: the data
    categories or classes, the tally column, and
    the corresponding frequencies
  • Used for both quantitative and qualitative data
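
As a rough sketch of the three-column layout described above, the following Python tallies a small qualitative data set. The eye-colour observations are hypothetical, invented for illustration.

```python
from collections import Counter

# Hypothetical qualitative data: eye colour of 12 respondents.
observations = ["brown", "blue", "brown", "green", "brown", "blue",
                "brown", "hazel", "blue", "brown", "green", "brown"]

# Counter records how many times each category occurs.
frequencies = Counter(observations)

# Print the three columns: category, tally marks, frequency.
print(f"{'Category':<10}{'Tally':<10}{'Frequency':>9}")
for category, count in sorted(frequencies.items()):
    print(f"{category:<10}{'|' * count:<10}{count:>9}")
```

The same approach works for ungrouped quantitative data, with the distinct values taking the place of the categories.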

20
Examples
  • Quantitative
  • Qualitative

21
Frequency Distribution Contd
  • Two main types of frequency distributions:
    those for ungrouped data and those for grouped
    data

22
Guidelines for Constructing Frequency
Distributions
  • Class boundaries never overlap - no element can
    belong to more than one class
  • Even if the frequency is zero, include each and
    every class
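  • Make all classes the same width; determine the
    width of each interval by dividing the range
    (largest value minus smallest value) by the
    number of classes, then rounding up to a
    convenient value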
  • Usually at least 5 but no more than 15 groupings,
    depending on the range and number of data points
  • Keep the limits as simple and as convenient as
    possible
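
A common rule of thumb, assumed here rather than taken from the slide, is to divide the range of the data by the chosen number of classes and round up to a convenient width. The sketch below applies it to the 20 values used in the frequency distribution example on slide 28; the choice of 5 classes is an assumption within the 5-to-15 guideline.

```python
import math

# Data set from the frequency distribution example (slide 28).
data = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
        32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

num_classes = 5  # assumption: any value from 5 to 15 fits the guideline

data_range = max(data) - min(data)    # largest value minus smallest value
raw_width = data_range / num_classes  # usually not a convenient number
class_width = math.ceil(raw_width)    # round up so every value is covered

print(f"range = {data_range}, raw width = {raw_width}, class width = {class_width}")
```

With a width of 10, convenient class limits such as 10-19, 20-29, and so on cover every value without overlap.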

23
Definitions
  • Class Intervals/Limits: the smallest and largest
    numbers which can actually belong to each class
  • Each class has a lower class limit and an upper
    class limit

24
Definitions
  • Class Boundaries: the numbers which separate
    classes
  • They lie halfway between the upper limit of one
    class and the lower limit of the next

25
Definitions
  • Class Mark: the midpoint of a class (aka
    Midpoint)
  • May be used in the calculation of other
    statistics
  • Found by taking the average of the two class
    limits or of the two class boundaries

26
Definitions
  • Class Width
  • aka class size or class length
  • Two ways of calculating
  • Method 1: the difference between corresponding
    class limits of two consecutive classes (e.g.
    lower limit to lower limit)
  • Method 2: the difference between the upper and
    lower class boundaries of one class
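
The definitions of class limits, boundaries, marks and width can be tied together with a short sketch. The classes 10-19 through 50-59 are hypothetical, chosen for illustration.

```python
# Hypothetical classes, each given as (lower limit, upper limit).
class_limits = [(10, 19), (20, 29), (30, 39), (40, 49), (50, 59)]

# Gap between the upper limit of one class and the lower limit of the next.
gap = class_limits[1][0] - class_limits[0][1]

# Boundaries sit halfway across that gap on each side of a class.
boundaries = [(lo - gap / 2, hi + gap / 2) for lo, hi in class_limits]

# Class mark: average of the two class limits.
marks = [(lo + hi) / 2 for lo, hi in class_limits]

# Method 1: difference between corresponding limits of consecutive classes.
width_from_limits = class_limits[1][0] - class_limits[0][0]

# Method 2: difference between the two boundaries of one class.
width_from_boundaries = boundaries[0][1] - boundaries[0][0]

for (lo, hi), (lb, ub), mark in zip(class_limits, boundaries, marks):
    print(f"limits {lo}-{hi}  boundaries {lb}-{ub}  mark {mark}")
print(f"width: {width_from_limits} (method 1), {width_from_boundaries} (method 2)")
```

Both methods give the same width, and averaging the boundaries instead of the limits gives the same class marks.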

27
Relative Frequency Distribution
  • This gives the frequency of each class interval
    as a proportion of the total frequencies
  • The relative frequencies MUST sum to 1
  • Sometimes expressed as a percentage
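
A minimal sketch of the calculation, using hypothetical class frequencies: each frequency is divided by the total, and the results are checked to sum to 1.

```python
import math

# Hypothetical class frequencies, for illustration only.
frequencies = {"10-19": 3, "20-29": 6, "30-39": 5, "40-49": 4, "50-59": 2}

total = sum(frequencies.values())

# Relative frequency = class frequency / total frequency.
relative = {label: count / total for label, count in frequencies.items()}

for label, value in relative.items():
    print(f"{label}: {value:.2f} ({value * 100:.0f}%)")

# Floating-point sums can be off by a tiny amount, so compare with a tolerance.
assert math.isclose(sum(relative.values()), 1.0)
```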

28
Frequency Distribution Example
  • Consider the following data set:
    24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
    32, 13, 12, 38, 41, 43, 44, 27, 53, 27
  • Group these figures into a frequency distribution
  • What is the class interval (width)?
  • Calculate the class boundaries
  • Calculate the class midpoints
  • Calculate the relative frequencies

29
Bar and Pie Charts
  • Bar charts and Pie charts are used for
    qualitative data

30
Bar Charts
  • Bars (columns) are separated from each other
  • Similar to a histogram (which we will soon meet)
  • Height of bar shows the frequency for each
    category
  • Used to represent qualitative data

31
Constructing a Bar Chart
  • Divide the data into groups (also called
    segments, bins or classes)
  • Label the vertical axis (y-axis) - Frequency (the
    number of counts for each bin)
  • Label the horizontal axis (x-axis) - the group
    names of your response variables
  • Determine the number of data points that are in
    each bin from the frequency and construct the bar
    chart

32
Pie Charts
  • A pie chart is a circle which is divided into
    portions where each portion represents the
    percentage of a population or sample that belongs
    to different categories.
  • A pie chart is mostly used to display percentages
    even though it can also be used to display
    frequencies or relative frequencies.

33
Constructing a Pie Chart
  • Segment the range of the data into groups (also
    called segments or classes)
  • Determine the number of data points that are
    within each group from the frequency.
  • Express the number of data points in each
    category as a percentage of the total.
  • Now construct the pie chart, each slice of the
    pie should be representative of the percentage
    of the data that lies within each category.
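
The percentage step can be sketched with the amounts from the Current Investment Portfolio example in this deck. The conversion to slice angles (share of 360 degrees) is an added assumption about how the slices would be drawn by hand.

```python
# Amounts (in thousands) from the Current Investment Portfolio example.
amounts = {"Stocks": 46.5, "Bonds": 32.0, "CD": 15.5, "Savings": 16.0}

total = sum(amounts.values())

# Each category as a percentage of the total, to 2 decimal places.
percentages = {k: round(v / total * 100, 2) for k, v in amounts.items()}

# Assumption: each slice's angle is the same share of a full 360-degree circle.
angles = {k: v / total * 360 for k, v in amounts.items()}

for label in amounts:
    print(f"{label:<8}{amounts[label]:>6.1f}{percentages[label]:>8.2f}%"
          f"{angles[label]:>8.1f} degrees")
```

The percentages reproduce the figures in the example table, and the angles add up to 360 degrees.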

34
Bar Chart Example
Current Investment Portfolio
Investment Type   Amount (in thousands)   Percentage (%)
Stocks                    46.5                 42.27
Bonds                     32.0                 29.09
CD                        15.5                 14.09
Savings                   16.0                 14.55
Total                    110.0                100.0
35
Pie Chart Example
Current Investment Portfolio
Investment Type   Amount (in thousands)   Percentage (%)
Stocks                    46.5                 42.27
Bonds                     32.0                 29.09
CD                        15.5                 14.09
Savings                   16.0                 14.55
Total                    110.0                100.0
Pie slices: Stocks 42%, Bonds 29%, Savings 15%, CD 14%
Percentages are rounded to the nearest percent
36
Histograms
  • Similar to a bar chart
  • Bars (columns) are joined together
  • Used to present quantitative data
  • This method shows the location (measures of
    centre), spread (scale) and shape of the data,
    and the presence of outliers

37
Histograms
  • A graph of the data in a frequency distribution
    is called a histogram
  • The class boundaries (or class limits) are shown
    on the horizontal axis
  • the vertical axis is either frequency, relative
    frequency, or percentage
  • Bars are drawn where the base of each bar covers
    the class while the height of each bar represents
    the frequency (relative frequency or percentage)
  • The bars are joined
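
As a rough text-only sketch, the following counts the slide 28 data into classes and prints one bar per class, labelled by its class boundaries. The classes 10-19 through 50-59 are an assumption, and the bars are drawn sideways with `#` characters instead of as joined vertical columns.

```python
# Data set from the frequency distribution example (slide 28).
data = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
        32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

# Assumed classes: 10-19, 20-29, 30-39, 40-49, 50-59.
lower_limit, width, num_classes = 10, 10, 5

# Count how many values fall in each class.
counts = [0] * num_classes
for value in data:
    index = (value - lower_limit) // width
    counts[index] += 1

# One row per class: the boundaries mark the base, the bar length is the frequency.
for i, count in enumerate(counts):
    lower_boundary = lower_limit + i * width - 0.5
    upper_boundary = lower_boundary + width
    print(f"{lower_boundary:>4.1f} to {upper_boundary:>4.1f} | {'#' * count} ({count})")
```

Because each class's upper boundary is the next class's lower boundary, the bars share edges, which is why a histogram's bars touch.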

38
Histogram - Example
  • Construct a histogram using the frequency
    distribution constructed from the example on
    slide 28 above.

39
Frequency Polygons
  • This is a line graph of a frequency distribution.
  • It is a line graph formed by joining the
    midpoints of the tops of the bars in a
    histogram.
  • We plot the midpoint of each class against the
    frequency for that class.
  • The midpoint could also be plotted against the
    relative frequency for each class; the result is
    called a relative frequency polygon.

40
Constructing a Frequency Polygon
  • Mark a dot above the midpoint of each class at a
    height equal to the frequency (or relative
    frequency) of that class. Equivalently, mark the
    midpoint of the top of each bar in the histogram.
  • Imagine that two more classes exist, one before
    the first class and one after the final class.
    Plot the midpoints for these classes as well,
    remembering that the frequency of each of these
    two classes is zero.
  • Join the points using straight lines.
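
The construction steps above can be sketched as a list of (midpoint, frequency) points. The class marks and frequencies are hypothetical, and the plotting itself is left out; the sketch only builds the points that would be joined by straight lines.

```python
# Hypothetical class marks (midpoints) and frequencies, equal class widths.
class_marks = [14.5, 24.5, 34.5, 44.5, 54.5]
frequencies = [3, 6, 5, 4, 2]

# Distance between neighbouring midpoints equals the class width.
width = class_marks[1] - class_marks[0]

# Add one imaginary class before the first and one after the last,
# each with frequency zero, so the polygon closes on the horizontal axis.
points = ([(class_marks[0] - width, 0)]
          + list(zip(class_marks, frequencies))
          + [(class_marks[-1] + width, 0)])

for x, y in points:
    print(f"midpoint {x:>5.1f}  frequency {y}")
```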

41
Frequency Polygon - Example
  • Using the histogram constructed on slide 38
    above, construct a frequency polygon.

42
Cumulative Frequency Distribution
  • A cumulative frequency distribution contains the
    total number of observations whose values are
    either less than or greater than the upper
    boundary for each interval.
  • A cumulative frequency distribution which tallies
    the total number of observations whose values are
    less than the upper boundary is known as a less
    than cumulative frequency distribution.
  • A cumulative frequency distribution which tallies
    the total number of observations whose values are
    greater than the upper boundary is known as a
    more than cumulative frequency distribution.

43
Less than Cumulative Frequency Ogive
  • This is a plot of a less than cumulative
    frequency distribution.
  • Similar to the histogram, the less than
    cumulative frequency ogive can be plotted against
    either frequency, relative frequency, or
    percentage
  • Using Frequencies: the first value in the
    distribution is ALWAYS zero and the last value
    is ALWAYS the total number
  • Using Relative Frequencies: the first value is
    ALWAYS zero and the last value is ALWAYS 1
  • Using Percentages: the first value is ALWAYS
    zero and the last value is ALWAYS 100
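
A short sketch of the less than cumulative values, using hypothetical class boundaries and frequencies. It shows the three scales starting at zero and ending at the total, 1 and 100 respectively.

```python
from itertools import accumulate

# Hypothetical class boundaries and class frequencies.
boundaries = [9.5, 19.5, 29.5, 39.5, 49.5, 59.5]
frequencies = [3, 6, 5, 4, 2]

# Number of observations below each boundary; nothing lies below the first one.
less_than = [0] + list(accumulate(frequencies))
total = less_than[-1]

less_than_relative = [c / total for c in less_than]
less_than_percent = [100 * c / total for c in less_than]

for b, c, r, p in zip(boundaries, less_than, less_than_relative, less_than_percent):
    print(f"less than {b:>4.1f}: {c:>2}  {r:.2f}  {p:.0f}%")
```

Plotting each boundary against its cumulative value and joining the points gives the less than ogive.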

44
More than Cumulative Frequency Ogive
  • This is a plot of a more than cumulative
    frequency distribution.
  • Similar to the histogram, the more than
    cumulative frequency ogive can be plotted against
    either frequency, relative frequency, or
    percentage
  • Using Frequencies: the first value in the
    distribution is ALWAYS the total number and the
    last value is ALWAYS zero
  • Using Relative Frequencies: the first value is
    ALWAYS 1 and the last value is ALWAYS zero
  • Using Percentages: the first value is ALWAYS 100
    and the last value is ALWAYS zero
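
The more than cumulative values can be sketched the same way, again with hypothetical class boundaries and frequencies: start from the total and subtract each class frequency in turn.

```python
from itertools import accumulate

# Hypothetical class boundaries and class frequencies.
boundaries = [9.5, 19.5, 29.5, 39.5, 49.5, 59.5]
frequencies = [3, 6, 5, 4, 2]
total = sum(frequencies)

# Number of observations above each boundary; everything lies above the first one.
more_than = [total] + [total - c for c in accumulate(frequencies)]

more_than_relative = [c / total for c in more_than]
more_than_percent = [100 * c / total for c in more_than]

for b, c, r, p in zip(boundaries, more_than, more_than_relative, more_than_percent):
    print(f"more than {b:>4.1f}: {c:>2}  {r:.2f}  {p:.0f}%")
```

At every boundary the less than and more than counts add up to the total, so the two ogives mirror each other.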

45
Example - Cumulative Frequency Distribution
  • Using the example on slide 28,
  • Calculate less than cumulative frequencies and
    construct a less than cumulative frequency ogive
  • Calculate more than cumulative frequencies and
    construct a more than cumulative frequency ogive