Title: Elementary Statistics
1Elementary Statistics
2Where are we?
- Scientific Method
- Formulate a theory: the hypothesis
- Collect data to test the theory: sampling method
and experimental design
- Analyze the results: ?
- Interpret the results and make a decision:
the p-value approach
3Histogram
- Construct the frequency table
- Find the range of the data
- Divide the range of the data (smallest to
largest) into classes of equal width, without
overlap
- Count the number of observations that fall into
each class (this gives the frequency table)
- Plot the graph
- Draw a horizontal axis and mark off the classes
along this axis.
- The vertical axis can be the count, the
proportion, or the percentage.
- Draw a rectangle (a vertical bar) above each
class with the height equal to the count, the
proportion, or the percentage.
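The frequency-table step above can be sketched in Python. This is a minimal illustration, not the slide's own method: the scores are hypothetical, and `frequency_table` is a helper name introduced here. It assumes classes of the form [lo, lo + width), with the last class also including the upper end of the range.

```python
# Hypothetical exam scores, invented purely for illustration.
scores = [52, 58, 61, 67, 68, 70, 73, 75, 75, 79, 82, 84, 88, 91, 97]


def frequency_table(data, start, stop, width):
    """Count observations per class [lo, lo + width).

    The last class also includes `stop`, so the maximum is not dropped.
    """
    n_classes = (stop - start) // width
    counts = [0] * n_classes
    for x in data:
        if not start <= x <= stop:
            raise ValueError(f"{x} lies outside the range {start}-{stop}")
        index = min(int((x - start) // width), n_classes - 1)
        counts[index] += 1
    return [(start + i * width, start + (i + 1) * width, count)
            for i, count in enumerate(counts)]


table = frequency_table(scores, 50, 100, 10)
for lo, hi, count in table:
    # One '#' per observation gives a sideways text histogram.
    print(f"{lo:>3}-{hi:<3} {'#' * count} ({count})")
```

Dividing each count by `len(scores)` would give the proportion version of the same table.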
4Use a calculator to produce a histogram
- Clear the list (press STAT)
- Enter the data (press STAT)
- Set the data range (press WINDOW)
- Xscl: the increment of the data, i.e. the length
of each class interval
- Set the stat plot options (press 2nd Y=)
5More Exercise
- Construct a histogram of the following data
- Construct the frequency table
- Range: 50-100
- Scale (class width): 10
10Example 4.10 Histogram versus Bar Graph
11Guidelines for Plots, Graphs, and Pictures
- Provide an appropriate title for the picture.
- Include the source of the data and relevant
details such as the sample size and how the data
were collected.
- Be sure to label the axes appropriately.
- Check to see if the frequency, proportion, or
percentage axis starts at zero.
- Check to see if the axes maintain a constant
scale.
- Include the measuring units of the variables
being summarized.
12Measuring Center
- Center of the set of data
- Average gas mileage
- Median income
- GPA (grade point average)
- Method
- Mean
- Weighted mean
- Median
- Mode
13Mean
- Example: The following data are the number of
children in a household for a simple random
sample of 10 households in a neighborhood:
- 2, 3, 0, 2, 1, 0, 3, 0, 1, 4.
- The mean of these 10 observations is
(2 + 3 + 0 + 2 + 1 + 0 + 3 + 0 + 1 + 4) / 10 = 16 / 10 = 1.6
14Is the mean always the center?
- A Mean Is Not Always Representative
- Kim's test scores are 78, 98, 80, 0 and 81
- Calculate Kim's mean test score. Explain why the
mean does not do a very good job at summarizing
Kim's test scores.
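A short Python check of Kim's scores, offered as a sketch: it computes the mean with and without the single score of 0, and counts how many scores lie above the overall mean. The variable names are introduced here for illustration.

```python
from statistics import mean

kim_scores = [78, 98, 80, 0, 81]

mean_all = mean(kim_scores)                                  # 337 / 5
mean_without_zero = mean([s for s in kim_scores if s != 0])  # 337 / 4
scores_above_mean = [s for s in kim_scores if s > mean_all]

print(mean_all)            # 67.4
print(mean_without_zero)   # 84.25
print(len(scores_above_mean), "of", len(kim_scores), "scores are above the mean")
```

Four of the five scores sit well above 67.4, so the single 0 pulls the mean away from where most of the data lie.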
15Think about the mean
- Suppose a sample of n = 10 observations is
obtained.
- Can the mean be larger than the maximum value or
less than the minimum value?
- Can the mean be the minimum value?
- Can the mean be the maximum value?
- Can the mean be exactly the midpoint?
- Can the mean be exactly the second smallest
value?
16Combine means
- Let's Do It! 5.2 Combining Means
- We have seven students.
- The mean score for three of these students is 54,
and the mean score for the four other students is
76.
- What is the mean score for all seven students?
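One way to work the exercise above is to weight each group mean by its group size. The sketch below does that; `combined_mean` is a helper name introduced here, not something from the slides.

```python
def combined_mean(group_sizes, group_means):
    """Overall mean = (sum of size * mean over groups) / (total size)."""
    total = sum(n * m for n, m in zip(group_sizes, group_means))
    return total / sum(group_sizes)


overall = combined_mean([3, 4], [54, 76])   # (3*54 + 4*76) / 7 = 466 / 7
naive = (54 + 76) / 2                       # ignores the group sizes

print(round(overall, 2))  # 66.57
print(naive)              # 65.0
```

The simple average of the two means (65) is too low because it ignores that the higher-scoring group has more students.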
17Weighted mean: GPA
- What's Kim's semester GPA?
- Course, credits, grade:
- ACSC145, 4.0, C
- AMTH140L, 1.0, A
- AMTH108, 3.0, B
- AMTH141, 4.0, C
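A sketch of the GPA as a weighted mean, with credit hours as the weights. Two assumptions are not stated on the slide: that the numeric column is credit hours, and that grades map to the common 4-point scale (A = 4, B = 3, C = 2). `GRADE_POINTS` and `gpa` are names introduced here.

```python
# Assumed 4-point scale; the slide does not state the grade-to-point mapping.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}

# (course, credit hours, letter grade) as listed on the slide.
kim_courses = [
    ("ACSC145", 4.0, "C"),
    ("AMTH140L", 1.0, "A"),
    ("AMTH108", 3.0, "B"),
    ("AMTH141", 4.0, "C"),
]


def gpa(courses):
    """Weighted mean of grade points, weighted by credit hours."""
    total_points = sum(credits * GRADE_POINTS[grade] for _, credits, grade in courses)
    total_credits = sum(credits for _, credits, _ in courses)
    return total_points / total_credits


kim_gpa = gpa(kim_courses)   # (8 + 4 + 9 + 8) / 12 = 29 / 12
print(round(kim_gpa, 2))     # 2.42 under the assumed scale
```

The unweighted mean of the four grades would be (2 + 4 + 3 + 2) / 4 = 2.75; the one-credit A counts for less once credits are used as weights.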
18Median
- For a set of n observations, ordered from
smallest to largest:
- the median is the middle observation, if the
number of observations is odd
- the median is any number between the two middle
observations (by convention, their average), if
the number of observations is even
- The median is resistant; that is, it does not
change, or changes very little, in response to
extreme observations.
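The definition above can be sketched directly in Python. For an even number of observations this version returns the average of the two middle values, which is the usual convention. The `extreme` list is hypothetical: it is the household data from the Mean slide with the largest value 4 replaced by 40, to illustrate resistance.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


children = [2, 3, 0, 2, 1, 0, 3, 0, 1, 4]    # data from the Mean slide
extreme = [2, 3, 0, 2, 1, 0, 3, 0, 1, 40]    # hypothetical: 4 replaced by 40

print(median([78, 98, 80, 0, 81]))           # odd n: 80
print(median(children), median(extreme))     # even n: 1.5 1.5
print(sum(children) / 10, sum(extreme) / 10) # means: 1.6 5.2
```

The median stays at 1.5 when the largest value becomes extreme, while the mean moves from 1.6 to 5.2.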
20Mode
- The mode of a set of observations is the most
frequently occurring value; it is the value
having the highest frequency among the
observations.
- What would be the mode for 0, 0, 0, 0, 0,
1, 2, 3, 4, 4, 4, 4, 5?
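A small Python sketch for finding the mode of the data in the question above. The helper `modes` is a name introduced here; it returns every value tied for the highest frequency, since a data set can have more than one mode.

```python
from collections import Counter


def modes(values):
    """Return all values that share the highest frequency, in sorted order."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


data = [0, 0, 0, 0, 0, 1, 2, 3, 4, 4, 4, 4, 5]
print(Counter(data)[0], Counter(data)[4])  # 5 4
print(modes(data))                         # [0]
```

The value 0 occurs five times and 4 occurs four times, so 0 is the mode even though it is also the smallest observation.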
21Three different methods (mean, median, mode):
how to use them?
22Which measure of center to use?
23Mean, Median, Mode
- Mean
- The point of equilibrium
- Sensitive to outliers
- Median
- Resistant measure of center
25A different distribution: LDI 5.6
26Summary of Mean, Median, Mode
- Mean
- The most common measure of center, but it is
affected by extreme observations. A good choice
for unimodal and roughly symmetric distributions.
- Median
- It is not influenced by extreme values. A good
choice for skewed distributions or distributions
with outliers.
- Mode
- The value(s) that occur most often. For a
distribution, the mode is the value associated
with the highest peak.
- Think about which average is computed!
28Measures of Variation
- Measures of center often give an incomplete
picture of the data
- List1: 10 20 30 40 50 50 50 60 70 80 90
- List2: 46 47 48 49 50 50 50 51 52 53 54
- Same mean, same median, same mode
- But different variation
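The two lists above can be compared with Python's standard `statistics` module. This sketch uses the sample standard deviation (`stdev`, dividing by n - 1); `summarize` is a helper name introduced here.

```python
from statistics import mean, median, mode, stdev

list1 = [10, 20, 30, 40, 50, 50, 50, 60, 70, 80, 90]
list2 = [46, 47, 48, 49, 50, 50, 50, 51, 52, 53, 54]


def summarize(data):
    """Collect center and spread measures for one data set."""
    return {
        "mean": mean(data),
        "median": median(data),
        "mode": mode(data),
        "range": max(data) - min(data),
        "stdev": stdev(data),   # sample standard deviation
    }


s1, s2 = summarize(list1), summarize(list2)
for name in ("mean", "median", "mode", "range", "stdev"):
    print(f"{name:>6}: List1 = {s1[name]:.2f}, List2 = {s2[name]:.2f}")
```

All three measures of center equal 50 for both lists, while the range (80 versus 8) and the standard deviation (about 24.49 versus about 2.45) differ by a factor of ten.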
29Measure of variation
- Range
- Interquartile Range
- Standard deviation
- Q: Same range, but different variation?
30Interquartile Range
- The interquartile range measures the spread of
the middle 50% of the data.
- Example from salary.com: Licensed Practical Nurse,
Aiken, SC 29803
- 25th percentile: 32,499
- Median: 34,088
- 75th percentile: 36,783
31How to calculate IQR
- Find the median of all of the observations.
- First quartile Q1 = median of the observations
that fall below the median.
- Third quartile Q3 = median of the observations
that fall above the median.
- IQR = Q3 - Q1
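The steps above can be sketched in Python using List1 and List2 from the Measures of Variation slide. This sketch splits the ordered data by position and, when n is odd, leaves the middle observation out of both halves. Other books and software may use slightly different quartile rules, so results can differ a little. `quartiles` is a helper name introduced here.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n, skip the middle observation so it is in neither half.
    upper = ordered[half + n % 2:]
    return median(lower), median(ordered), median(upper)


list1 = [10, 20, 30, 40, 50, 50, 50, 60, 70, 80, 90]
list2 = [46, 47, 48, 49, 50, 50, 50, 51, 52, 53, 54]

for name, data in (("List1", list1), ("List2", list2)):
    q1, med, q3 = quartiles(data)
    print(f"{name}: Q1 = {q1}, median = {med}, Q3 = {q3}, IQR = {q3 - q1}")
```

List1 gives an IQR of 40 and List2 an IQR of 4, which matches the visual impression that List2 is much less spread out.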
32Example
- IQR = Q3 - Q1 = 46.5 - 41 = 5.5
- We also call
- Q1 the 25th percentile
- Q2 (the median) the 50th percentile
- Q3 the 75th percentile
33Note
- When the number of observations is odd, the
middle observation is the median. This
observation is not included in either of the two
halves when computing Q1 and Q3.
- Although different books, calculators, and
computers may use slightly different ways to
compute the quartiles, they are all based on the
same idea.
- In a left-skewed distribution, the first quartile
will be farther from the median than the third
quartile is. If the distribution is symmetric,
the quartiles should be the same distance from
the median.
34Five-number summary when median is used
- Five-number summary
- Minimum, Q1, Median, Q3, Maximum
35How to build a boxplot
- Five-number summary
- min = 32, Q1 = 41, median = 43.5, Q3 = 46.5,
max = 51
36How to build a boxplot by calculator
- Step 1: clear the list
- Step 2: enter the data in column L1
- Step 3: set the stat plot options for a box plot
(select the fourth plot type)
- Xlist: L1
- Freq: 1
37 1.5 x IQR Rule
- Compute the quantity STEP = 1.5 x (IQR)
- Find the location of the inner fences by taking 1
step out from each of the quartiles
- lower inner fence = Q1 - STEP
- upper inner fence = Q3 + STEP
- Observations that fall OUTSIDE the inner fences
are considered potential outliers. If there are
any outliers, plot them individually along the
scale using a solid dot.
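The rule can be applied to the five-number summary given on the boxplot slide (min 32, Q1 41, Q3 46.5, max 51). The sketch below computes the inner fences and checks the two extremes; `inner_fences` is a helper name introduced here. Only the minimum and maximum are checked because the full data set is not given.

```python
def inner_fences(q1, q3):
    """Return (lower, upper) inner fences, one STEP = 1.5 * IQR beyond each quartile."""
    step = 1.5 * (q3 - q1)
    return q1 - step, q3 + step


q1, q3 = 41, 46.5
minimum, maximum = 32, 51

lower, upper = inner_fences(q1, q3)
print(lower, upper)   # 32.75 54.75

# Flag whichever extremes fall outside the fences.
flagged = [x for x in (minimum, maximum) if x < lower or x > upper]
print(flagged)        # [32]
```

With IQR = 5.5 the step is 8.25, so the minimum of 32 falls just below the lower fence of 32.75 and would be plotted as a potential outlier, while the maximum of 51 is inside the upper fence.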
38LDI 5.9
39Symmetric boxplot vs. symmetric distribution
- Symmetric distribution => symmetric boxplot
- But a symmetric boxplot does not imply a
symmetric distribution
- Here is a set of sample data:
- 1 1 3 5 5 7 7 9 9
- Min
- Q1
- Median
- Q3
- Max
- Side-by-side boxplot
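One way to check the sample data 1 1 3 5 5 7 7 9 9 is sketched below. It computes the five-number summary with the median-of-halves method (middle observation excluded when n is odd), then tests two things separately: whether the boxplot is symmetric about the median, and whether the data themselves are. The variable names are introduced here for illustration.

```python
from collections import Counter
from statistics import median

data = [1, 1, 3, 5, 5, 7, 7, 9, 9]

ordered = sorted(data)
n = len(ordered)
half = n // 2
med = median(ordered)
q1 = median(ordered[:half])           # lower half: 1 1 3 5
q3 = median(ordered[half + n % 2:])   # upper half: 7 7 9 9
lo, hi = ordered[0], ordered[-1]
print(lo, q1, med, q3, hi)            # 1 2.0 5 8.0 9

# Boxplot symmetry: matching distances on both sides of the median.
boxplot_symmetric = (med - q1 == q3 - med) and (q1 - lo == hi - q3)

# Data symmetry: the deviations from the median mirror each other exactly.
data_symmetric = Counter(x - med for x in data) == Counter(med - x for x in data)

print(boxplot_symmetric, data_symmetric)  # True False
```

The value 3 appears once but its mirror image 7 appears twice, so the data are not symmetric even though the five-number summary is.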