Title: What is Statistics
1What is Statistics?
- Statistics is the scientific discipline that provides methods to help us make sense of data.
- Math 146 focuses on applied statistics.
2Why Study Statistics
- Being informed: Are claims based on numerical data reasonable?
- Making informed judgments: Is the data adequate? Is it displayed correctly?
- Evaluating decisions that affect your life: What statistical information is relevant? What studies have been done? Have the studies been done correctly?
3Variability of Data
4Definitions
- Population: The entire collection of individuals or objects about which information is desired.
- Sample: A subset of the population selected for study in some prescribed manner.
- Descriptive statistics: The branch of statistics that includes methods for organizing and summarizing data.
- Inferential statistics: The branch of statistics that involves generalizing from a sample to the population from which it was selected, and assessing the reliability of such generalizations.
5The Data Analysis Process
- Understanding the nature of the problem
- Deciding what to measure and how to measure it
- Data collection
- Data summarization and preliminary analysis
- Formal data analysis
- Interpretation of results
6Problems
7Definitions
- Variable: A variable is any characteristic whose value may change from one individual to another.
- Examples:
  - Brand of television
  - Height of a building
  - Number of students in a class
8Definitions
- Data results from making observations either on a single variable or simultaneously on two or more variables.
- A univariate data set consists of observations on a single variable made on individuals in a sample or population.
9Definitions
- A bivariate data set consists of observations on two variables made on individuals in a sample or population.
- A multivariate data set consists of observations on two or more variables made on individuals in a sample or population.
10Definitions
- A univariate data set is categorical (or qualitative) if the individual observations are categorical responses.
- A univariate data set is numerical (or quantitative) if the individual observations are numerical responses where numerical operations generally have meaning.
11Categorical Variables
- The brand of TV owned by the six people that work in a small office:
  RCA Magnavox Zenith Phillips GE RCA
- The hometowns of the 6 students in the first row of seats:
  Mendon Victor Bloomfield Victor Pittsford Bloomfield
- The zip codes (of the hometowns) of the 6 students in the first row of seats.
  Since numerical operations with zip codes make no sense, the zip codes are categorical rather than numerical.
12Definitions
- Numerical data is discrete if the possible values are isolated points on the number line.
- Numerical data is continuous if the set of possible values forms an entire interval on the number line.
13Examples of Discrete Data
- The number of customers served at a diner lunch counter over a one-hour time period is observed for a sample of seven different one-hour time periods:
  13 22 31 18 41 27 32
- The number of textbooks bought by students at a given school during a semester for a sample of 16 students:
  5 3 6 8 6 1 3 6 12 3 5 7 6 7 5 4
14Continuous Data
- The heights of students who are taking a Data Analysis course at a local university are studied by measuring the heights (in inches) of a sample of 10 students:
  72.1 64.3 68.2 74.1 66.3 61.2 68.3 71.1 65.9 70.8
- Note: Even though the heights are only measured accurately to one tenth of an inch, the actual height could be any value in some reasonable interval.
15Problems
16Definition
- The relative frequency for a particular category is the fraction or proportion of the time that the category appears in the data set. It is calculated as
  relative frequency = frequency / (number of observations in the data set)
- When a frequency table includes relative frequencies, it is sometimes referred to as a relative frequency distribution.
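As a minimal sketch of the definition above, the following Python computes relative frequencies for the TV brand data from the Categorical Variables slide. The function name `relative_frequencies` is a hypothetical helper chosen for illustration, not something from the course materials.

```python
from collections import Counter

# TV brands owned by the six people in the small office (Categorical Variables slide)
brands = ["RCA", "Magnavox", "Zenith", "Phillips", "GE", "RCA"]


def relative_frequencies(observations):
    """Map each category to frequency / number of observations (hypothetical helper)."""
    counts = Counter(observations)  # frequency of each category
    n = len(observations)           # number of observations in the data set
    return {category: count / n for category, count in counts.items()}


rel_freq = relative_frequencies(brands)
for category, value in rel_freq.items():
    print(f"{category}: {value:.3f}")
```

RCA appears twice among the six observations, so its relative frequency is 2/6; the relative frequencies over all categories sum to 1.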
18Dot Plots for Numerical Variables
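The slide content did not survive in this transcript. As a rough illustration only, a dot plot places one dot per observation at its value on a number line. The sketch below prints a text version for the textbook counts from the discrete data example, with one row per value rather than the usual vertical stacks; `dot_plot_lines` is a hypothetical helper, not part of the course materials.

```python
from collections import Counter

# Number of textbooks bought by each of the 16 sampled students (discrete data example)
textbooks = [5, 3, 6, 8, 6, 1, 3, 6, 12, 3, 5, 7, 6, 7, 5, 4]


def dot_plot_lines(values):
    """Build one text row per integer value, with one dot per observation (hypothetical helper)."""
    counts = Counter(values)
    lines = []
    # Walk every integer from the smallest to the largest observed value,
    # so gaps in the data show up as empty rows.
    for v in range(min(values), max(values) + 1):
        lines.append(f"{v:>3} | " + "." * counts[v])
    return lines


lines = dot_plot_lines(textbooks)
print("\n".join(lines))
```

Empty rows (for example at 9, 10, and 11) make the gap between most of the data and the largest value, 12, easy to see.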
19Problems
20Karl Pearson: Father of Modern Statistics
In the 20th century, the role of mathematics has
become increasingly decisive, and studies of
these new statistical tools and practices are
gradually being written, episode by episode,
discipline by discipline. In the end, a picture
will emerge of a powerful body of mathematics,
allied to schemes for data gathering
and designing experiments, that has become one of
the most important sources of scientific
expertise and guarantors of objectivity in the
modern world. It is the narrow gate through
which must pass new pharmaceuticals,
manufacturing processes, official measures of all
descriptions, and empirical findings of
psychologists, economists, biologists and many
others. In that sense, its import goes far
beyond the history of a mathematical discipline.
Statistics has functioned as no narrow specialty,
but as a vital if often invisible element of the
cultural history of government, business, and the
professions, as well as science.
Karl Pearson: The Scientific Life in a
Statistical Age, by Theodore Porter, 2004, page 4.