Review - PowerPoint PPT Presentation

About This Presentation
Title:

Review


Slides: 15
Provided by: nipis4
Tags: review | unimodal


Transcript and Presenter's Notes

Title: Review


1
Review
  • Statistics: the science of collecting,
    organizing, and interpreting data.
  • Data collection.
  • Data analysis: organize and summarize data to
    bring out main features and clarify their
    underlying structure.
  • Inference and decision theory: extract the
    information provided by the data, which may be
    used as a guide for further action.

2
Fundamental concepts
  • Population: the entire group of individuals that
    we want information about.
  • Sample: a part of the population that we actually
    examine in order to gather information.
  • Sample size: the number of observations/individuals
    in a sample.
  • Statistical inference: making an inference about
    a population based on the information contained
    in a sample.
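
To make the population/sample distinction concrete, here is a minimal Python sketch. The population is simulated and entirely hypothetical (10,000 made-up measurements), and the sample mean is used as a simple estimate of the population mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated measurements (an assumption).
population = [random.gauss(50, 10) for _ in range(10_000)]

# The sample is the part of the population we actually examine.
sample_size = 100
sample = random.sample(population, sample_size)

# Statistical inference: use the sample mean to estimate the population mean.
population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean       = {population_mean:.2f}")
print(f"sample mean (n = {sample_size}) = {sample_mean:.2f}")
```

In practice the population mean is unknown, which is why the sample is needed; it is computed here only to show how close the estimate lands.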

3
  • Data contains:
  • Individuals: the objects described by the data.
  • Variable: any characteristic of an individual. A
    variable can take different values for different
    individuals.
  • A categorical variable places an individual into
    one of several groups or categories.
  • A quantitative variable takes numerical values
    for which arithmetic operations such as adding
    and averaging make sense.

4
  • Distributions of Variables
  • The distribution of a variable indicates what
    values a variable takes and how often it takes
    these values.
  • For a categorical variable, the distribution
    lists the categories and the count/percent for
    each category.
  • For a quantitative variable, the distribution
    is the pattern of variation of its values.
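
The distribution of a categorical variable can be tabulated in a few lines of Python. The blood-type data below are made up for illustration.

```python
from collections import Counter

# Hypothetical categorical data: blood types of 10 individuals.
blood_types = ["A", "O", "B", "O", "A", "AB", "O", "A", "O", "B"]

counts = Counter(blood_types)
n = len(blood_types)

# Distribution of a categorical variable: count and percent per category.
distribution = {cat: (count, 100 * count / n) for cat, count in counts.items()}

for cat, (count, percent) in sorted(distribution.items()):
    print(f"{cat:>2}: count = {count}, percent = {percent:.0f}%")
```

The same counts are what a bar graph or pie chart would display.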

5
Examining distribution
  • Overall Pattern
  • Shape
  • Graphical techniques to display distributions
  • Bar graph
  • Pie chart
  • Stemplot
  • Histogram
  • Modes: peaks of a distribution.
  • Unimodal or Bimodal
  • Symmetric or skewed (left/right)?
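
Of the graphical techniques listed, the stemplot is the easiest to produce as plain text. The sketch below assumes non-negative integer data and uses the tens digit as the stem; the function name `stemplot` and the scores are hypothetical.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot lines for non-negative integers, with tens as stems."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    lines = []
    # Include empty stems so gaps in the distribution stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = "".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem:>2} | {row}")
    return lines

scores = [62, 65, 71, 73, 73, 78, 80, 84, 85, 85, 88, 91, 97]  # made-up data
for line in stemplot(scores):
    print(line)
```

Turned on its side, the output reads like a histogram: here it shows a single peak in the 80s, so the distribution is unimodal.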

6
Examining distribution
  • Center
  • Mean
  • easily calculated
  • easy to work with algebraically
  • highly affected by outliers
  • Not a resistant measure
  • Median
  • can be time-consuming to calculate
  • more resistant to a few extreme observations
    (sometimes outliers)
  • robust
  • Mode, Mean and Median
  • Relative locations for skewed/symmetric
    distributions
  • Which one to use
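
A small numerical check shows why the mean is not resistant while the median is. The income values are hypothetical.

```python
import statistics

incomes = [30, 32, 35, 38, 40]   # hypothetical incomes, in thousands
with_outlier = incomes + [400]   # add one extreme observation

mean_before = statistics.mean(incomes)
median_before = statistics.median(incomes)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(f"mean:   {mean_before} -> {mean_after:.2f}")
print(f"median: {median_before} -> {median_after}")
```

One extreme value pulls the mean far to the right while the median barely moves. This is also why, in a right-skewed distribution, the mean typically lies to the right of the median.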

7
Examining distribution
  • Spread
  • Standard deviation and variance
  • Definition and calculation
  • Sum of the deviations from the mean is always 0
  • Why (n-1)?
  • Quartiles
  • Definition and calculation
  • IQR
  • Rule to identify outliers
  • The five-number summary
  • Boxplots
  • Comparison with histograms and stemplots
  • Range, IQR, S.D.
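
The spread measures on this slide can be sketched in Python as follows. The data are made up, and the quartile convention (median of the lower and upper halves, excluding the overall median when n is odd) is an assumption; other textbooks and software use slightly different rules.

```python
import statistics

data = [2, 4, 4, 5, 7, 9, 11]  # hypothetical observations
n = len(data)
mean = sum(data) / n

# Deviations from the mean always sum to 0, so only n - 1 of them
# are free to vary; this motivates dividing by n - 1.
deviations = [x - mean for x in data]
sum_of_deviations = sum(deviations)

variance = sum(d ** 2 for d in deviations) / (n - 1)
std_dev = variance ** 0.5

def five_number_summary(values):
    """Min, Q1, median, Q3, max for at least two observations."""
    xs = sorted(values)
    half = len(xs) // 2
    lower = xs[:half]    # observations below the median position
    upper = xs[-half:]   # observations above the median position
    return (xs[0], statistics.median(lower), statistics.median(xs),
            statistics.median(upper), xs[-1])

minimum, q1, median, q3, maximum = five_number_summary(data)
iqr = q3 - q1
value_range = maximum - minimum

print(f"variance = {variance}, s.d. = {std_dev:.3f}")
print(f"five-number summary = {(minimum, q1, median, q3, maximum)}")
print(f"range = {value_range}, IQR = {iqr}")
```

A boxplot is a drawing of the five-number summary: the box spans Q1 to Q3 with a line at the median.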

8
Examining distribution
  • Deviations
  • Outliers: values that fall outside the
    overall pattern.
  • IQR can help to identify outliers
  • Modified boxplots
  • Strategies
  • Detect them, investigate their causes, correct
    them, or delete them, or give them individual
    attention.
  • Use resistant methods such as median to reduce
    the influence of the outliers.
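
The slides do not spell out the IQR-based rule. A common choice, assumed here, is the 1.5 × IQR rule: flag any value more than 1.5 × IQR below Q1 or above Q3. The function name `iqr_outliers` and the data are hypothetical.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed 1.5 x IQR rule)."""
    xs = sorted(values)
    half = len(xs) // 2
    q1 = statistics.median(xs[:half])    # median of the lower half
    q3 = statistics.median(xs[-half:])   # median of the upper half
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in xs if x < low or x > high]

data = [30, 32, 35, 38, 40, 41, 45, 120]  # made-up data with one extreme value
suspects = iqr_outliers(data)
print(suspects)
```

A modified boxplot plots the flagged values as individual points and ends the whiskers at the most extreme values that are not flagged. Flagging is only the detection step; the cause of each suspect value still has to be investigated.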

9
Linear transformation
  • form: X* = a + bX
  • its effects on shape, center and spread
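
The effect of a linear transformation a + bX on center and spread can be checked numerically. The example below uses the Celsius-to-Fahrenheit conversion (a = 32, b = 1.8) on made-up temperatures.

```python
import statistics

celsius = [10.0, 15.0, 20.0, 25.0, 30.0]  # hypothetical temperatures
a, b = 32.0, 1.8                          # Fahrenheit = 32 + 1.8 * Celsius

fahrenheit = [a + b * x for x in celsius]

mean_c, mean_f = statistics.mean(celsius), statistics.mean(fahrenheit)
median_c, median_f = statistics.median(celsius), statistics.median(fahrenheit)
sd_c, sd_f = statistics.stdev(celsius), statistics.stdev(fahrenheit)

# Measures of center are transformed like the data: new = a + b * old.
print(f"mean:   {mean_f:.2f} vs a + b*mean   = {a + b * mean_c:.2f}")
print(f"median: {median_f:.2f} vs a + b*median = {a + b * median_c:.2f}")
# Measures of spread are multiplied by |b|; adding a has no effect.
print(f"s.d.:   {sd_f:.3f} vs |b|*s.d.      = {abs(b) * sd_c:.3f}")
```

The shape of the distribution is unchanged: the values are only shifted and rescaled.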

10
Density curves
  • probability density function (pdf)
  • properties of a pdf
  • comparison with histograms
  • mode, median, mean, quartiles and s.d. of density
    curves
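
As a numerical illustration of the properties of a density curve, the sketch below uses a simple triangular density, f(x) = 2x on [0, 1]. The choice of density is an assumption made for illustration; the integration is a basic midpoint rule.

```python
def density(x):
    """Hypothetical density curve: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

def integrate(f, lo, hi, steps=100_000):
    """Approximate the area under f between lo and hi (midpoint rule)."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

# Property: a density is never negative and the total area under it is 1.
total_area = integrate(density, 0, 1)

# Mean: the balance point of the curve, the integral of x * f(x).
mean = integrate(lambda x: x * density(x), 0, 1)

# Median: the equal-areas point; here m**2 = 1/2, so m = sqrt(1/2).
median = 0.5 ** 0.5
area_below_median = integrate(density, 0, median)

print(f"total area = {total_area:.6f}")
print(f"mean = {mean:.4f}, median = {median:.4f}")
print(f"area below median = {area_below_median:.6f}")
```

This density is skewed to the left, so the mean (about 0.667) falls below the median (about 0.707), and the mode is at the peak, x = 1.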

11
The normal distributions
  • normal distributions
  • shape
  • Symmetric around mean
  • Single-peaked (Unimodal)
  • Bell-shaped.
  • center and spread
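
A normal distribution is fully described by its center (mean) and spread (standard deviation). The sketch below uses Python's `statistics.NormalDist` with hypothetical parameters to check the symmetry and the well-known 68-95-99.7 rule, which is not stated on the slide but follows from the normal curve.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=15)  # hypothetical mean and standard deviation

def prob_within(k):
    """Probability of falling within k standard deviations of the mean."""
    upper = dist.cdf(dist.mean + k * dist.stdev)
    lower = dist.cdf(dist.mean - k * dist.stdev)
    return upper - lower

for k in (1, 2, 3):
    print(f"within {k} s.d.: {prob_within(k):.4f}")

# Symmetry around the mean: equal density at equal distances on either side.
left_height = dist.pdf(dist.mean - 10)
right_height = dist.pdf(dist.mean + 10)
print(f"density at mean - 10: {left_height:.6f}, at mean + 10: {right_height:.6f}")
```

Changing `mu` slides the curve along the axis, and changing `sigma` makes it wider or narrower; the bell shape stays the same.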

12
The normal distributions
  • Standardizing and z-Scores
  • Effects of Standardizing
  • Standardizing is a linear transformation.
  • The standardized values for any distribution
    always have mean 0 and standard deviation 1.
  • Effects on shape, center and spread.
  • A linear transformation turns a normal
    distribution into another normal distribution.
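
Standardizing can be checked on any data set, normal or not. The sketch below computes z-scores for made-up data and confirms that they have mean 0 and standard deviation 1.

```python
import statistics

data = [2, 4, 4, 5, 7, 9, 11]  # hypothetical observations
mean = statistics.mean(data)
sd = statistics.stdev(data)

# z = (x - mean) / sd is a linear transformation a + b*x
# with a = -mean / sd and b = 1 / sd.
z_scores = [(x - mean) / sd for x in data]

z_mean = statistics.mean(z_scores)
z_sd = statistics.stdev(z_scores)

print([round(z, 3) for z in z_scores])
print(f"mean of z = {z_mean:.6f}, s.d. of z = {z_sd:.6f}")
```

A z-score tells how many standard deviations an observation lies from the mean, and in which direction. Because the transformation is linear, the shape of the distribution does not change.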

13
The normal distributions
  • Standard normal distribution
  • Standard normal table
  • Normal probability calculation
  • Normal quantile plot
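
A normal probability calculation with the standard normal table amounts to standardizing and then looking up the cumulative proportion. The sketch below uses `statistics.NormalDist` in place of the printed table; the mean, standard deviation, and cutoff are hypothetical.

```python
from statistics import NormalDist

mu, sigma = 170.0, 8.0  # hypothetical heights in cm
x = 180.0

# Step 1: standardize the value.
z = (x - mu) / sigma

# Step 2: look up the area to the left of z under the standard normal curve.
standard = NormalDist()  # mean 0, standard deviation 1
p_below = standard.cdf(z)
p_above = 1 - p_below

# The same probability computed directly on the original scale.
p_direct = NormalDist(mu, sigma).cdf(x)

# Reverse lookup: the z-value with 90% of the area to its left.
z_90 = standard.inv_cdf(0.90)

print(f"z = {z:.2f}, P(X < {x}) = {p_below:.4f}, P(X > {x}) = {p_above:.4f}")
print(f"90th percentile: z = {z_90:.4f}, x = {mu + sigma * z_90:.1f}")
```

The table lookup and the direct calculation agree because standardizing is a linear transformation, which maps one normal distribution onto another.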

14
Relationships between variables
  • Scatterplot
  • Response and explanatory variables
  • Form, direction, strength
  • Adding categorical variable
  • Correlation r
  • -1 ≤ r ≤ 1
  • The sign indicates positive/negative association
  • What it means when r = 0, -1, 1
  • Invariant to scaling
  • Sensitive to outliers
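
The properties of the correlation r listed above can be verified with a short Python sketch. The helper `correlation` and all data are hypothetical; the formula is the usual one, the sum of products of deviations divided by the square root of the product of the sums of squared deviations.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient r between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]  # explanatory variable (made-up data)
ys = [2, 4, 5, 4, 5]  # response variable (made-up data)

r = correlation(xs, ys)

# Invariant to scaling: changing units (positive rescaling, shifting) leaves r as is.
r_rescaled = correlation([2.54 * x for x in xs], [10 + 3 * y for y in ys])

# r = 1 or r = -1 only when the points lie exactly on a straight line.
r_perfect_pos = correlation(xs, [3 + 2 * x for x in xs])
r_perfect_neg = correlation(xs, [3 - 2 * x for x in xs])

# Sensitive to outliers: one extreme point changes r drastically.
r_outlier = correlation(xs + [20], ys + [0])

print(f"r = {r:.4f}, after rescaling = {r_rescaled:.4f}")
print(f"perfect lines: {r_perfect_pos:.1f}, {r_perfect_neg:.1f}")
print(f"with one outlier: r = {r_outlier:.4f}")
```

A value of r near 0 indicates a weak linear association, which does not rule out a strong curved relationship. This is why the scatterplot should be examined first.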