Title: Department of Statistics
1Statistics and Design of Experiments: Role in
Research
- Imad Khamis, PhD
- Department of Mathematics
- Southeast State University
- Cape Girardeau, Missouri
2Statistics: A collection of procedures and
processes to enable researchers in the unbiased
pursuit of Knowledge. Statistics is an important
part of the Scientific Method:
State a Hypothesis
Design a Study and Collect Data
Analyze the Data
Interpret the Results / Draw Conclusions
3State a Hypothesis: The OBJECTIVE or OBJECTIVES
of the Study. A HYPOTHESIS OR SET OF HYPOTHESES
should state exactly what you want to DO or LEARN
or STUDY. It SHOULD ANSWER: What are the factors
to be studied and what relationships are to be
investigated? What is the experimental material?
Etc.
4The area of STATISTICS would not be needed if
each time you measured an experimental unit you
obtained the same response or value
BUT THE RESPONSES ARE NOT THE SAME, SINCE THERE
IS VARIABILITY or NOISE IN THE SYSTEM
STATISTICAL METHODS EXTRACT THE SIGNAL FROM THE
NOISE TO PROVIDE INFORMATION
One of the Statistician's JOBS is to make sense
of DATA in the presence of VARIABILITY or noise
by using DATA ANALYSIS TOOLS
5DESIGN VS. ANALYSIS
The PURPOSE OF DATA COLLECTION is to GAIN
INFORMATION OR KNOWLEDGE!!
Collecting Data does not guarantee that
information is obtained. INFORMATION ≠ DATA. At
best, INFORMATION = DATA + ANALYSIS
6If data are collected such that they contain NO
information in the first place, then the analysis
phase cannot find it!!!
The best way to ensure that appropriate
information is contained in the collected data is
to DESIGN (plan) and Carefully Control the DATA
COLLECTION PROCESS
The measured variables must relate to the stated
OBJECTIVES of the study
7If you have a good design and process for data
collection, it is quite often straightforward to
construct an analysis that extracts all of the
available information from the data
The ROLE of a STATISTICIAN is to work with the
RESEARCH TEAM (or researcher) from the START of
the study
8The MOST IMPORTANT TIME for the statistician to
become involved with a research study is in the
very BEGINNING
A STATISTICIAN CAN HELP OBTAIN THE MAXIMUM AMOUNT
OF INFORMATION FROM AVAILABLE RESOURCES
9HOW???
HELP WITH THE DESIGN OF THE EXPERIMENT
DETERMINE SAMPLE SIZE NEEDED
DEVELOP PROCESS OF COLLECTING DATA
DISCUSS VARIABLES TO BE MEASURED AND HOW THEY
RELATE TO THE OBJECTIVES OF THE STUDY
PROVIDE METHODS OF ANALYZING THE DATA
HELP TRANSLATE STATISTICAL CONCLUSIONS INTO
SUBJECT MATTER CONCLUSIONS
10THE CORE HELP FROM THE STATISTICIAN IS IN THE
DESIGN OF THE EXPERIMENT
Help with selecting conditions that relate to the
objectives of the study
Selecting the Experimental Units
Deciding when REPLICATIONS exist
Determining the ORDER in which the experiment is
to be carried out
THE DESIGN OF THE EXPERIMENT IS CRITICAL
11COMPONENTS OF DESIGNED EXPERIMENTS
TREATMENT STRUCTURE: Factors or Populations or
Treatments related to the objectives of the
experiment. Examples: Brands of Product, Types of
Uses of Product
DESIGN STRUCTURE OR EXPERIMENTAL UNITS: Factors
used in blocking the experimental units as well
as characteristics of the experimental units.
Examples: Washing Machine, Person Using Machine,
Products evaluated in Session by Taste Panelist
12Complete Designed Experiment
Treatment Structure
Design Structure
RANDOMIZE: a randomization plan to assign the
Treatments of the TS to the Experimental Units in the DS
13RANDOMIZATION IS THE INSURANCE POLICY AGAINST
INTRODUCING BIAS INTO THE STUDY
Selecting an appropriate Treatment Structure,
necessary Design Structure, and required
Randomization Process provides the Statistician
the information needed to construct an
appropriate model
APPROPRIATE MODEL → BEST ANALYSIS
14Key to the Design of the Experiment is the
Concept of REPLICATION
REPLICATION: The independent observation of a
treatment
An Experimental Unit provides a Replication of
the level of a Factor if the level is randomly
assigned to the Experimental Unit and observed
independently of the other Experimental Units
Must make sure that Sub-samples are not
considered to be Replications
15The Variability among Experimental Units that are
independently treated alike provides the estimate
of the variance (or Standard Error) to be used as
the measuring stick for comparing the levels of
treatments randomly assigned to those
Experimental Units
Between Sub-sample variance is generally much
less than between Replication variance. It is
critical that the Replications are appropriately
Identified
Treatment Structure, Design Structure (with
experimental units and replication) and
Randomization describe the total Design
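Why Sub-samples cannot stand in for Replications can be seen from a short variance sketch. The symbols are assumptions introduced here, not taken from the slides: each treatment is applied to r Experimental Units, s Sub-samples are measured on each unit, the between-unit variance is \sigma_{EU}^2 and the between-Sub-sample variance is \sigma_{sub}^2. Under those assumptions the variance of a treatment mean is:

```latex
\operatorname{Var}\left(\bar{y}_{\text{trt}}\right)
  = \frac{\sigma_{EU}^{2}}{r} + \frac{\sigma_{sub}^{2}}{r\,s}
```

Taking more Sub-samples (larger s) shrinks only the second term; only more Replications (larger r) shrink the first. Treating the r s Sub-sample values as if they were independent Replications gives roughly (\sigma_{EU}^2 + \sigma_{sub}^2)/(r s), which understates the first term by about a factor of s.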
16ANALYZE THE DATA: Use the COMPLETED DESIGNED
EXPERIMENT and the data type to construct an
appropriate analysis
Use Statistical Software (SAS, RS/1, JMP): a
software package you know will provide valid
results
17The Statistician will provide the STATISTICAL
interpretation of the results from the analyses
STATISTICAL ANALYSES CONCLUSIONS
The Statistician will help the Researcher
TRANSLATE the statistical analyses conclusions
into subject matter conclusions
Discuss how the statistical analyses provide
results that relate to the STATED OBJECTIVES of
the study. The expected results should be
written along with the objectives. Results that
are not expected should be looked at carefully
18Washing Machine Example
4 brands or models -- one machine each
3 types of laundry: Whites, Wash/wear, Denim
3 persons to operate the Machines
For each person: Randomly assign the order of
Brands
For each Brand: randomly assign the order of
Types
19Random Order of Brands for Person 1: Brand D,
Brand B, Brand A, Brand C
[Diagram: four Machines, one per Brand in the
order above, each showing its own random order of
the three Types (Denim, White, W/W)]
Random Order of Types within each Machine
Re-Randomize for each Person
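The two-stage randomization described for the washing machine example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the brand labels A to D and the function name `randomization_plan` are hypothetical, and the seed is there only to make the plan reproducible.

```python
import random

BRANDS = ["A", "B", "C", "D"]            # one machine per brand (labels assumed)
TYPES = ["White", "W/W", "Denim"]        # the three types of laundry
PERSONS = [1, 2, 3]                      # persons act as blocks


def randomization_plan(seed=None):
    """Return {person: [(brand, [type order]), ...]} with fresh random orders."""
    rng = random.Random(seed)
    plan = {}
    for person in PERSONS:
        # Stage 1: random order of brands (machines) for this person.
        brand_order = rng.sample(BRANDS, k=len(BRANDS))
        # Stage 2: a separate random order of types within each machine.
        plan[person] = [
            (brand, rng.sample(TYPES, k=len(TYPES))) for brand in brand_order
        ]
    return plan


plan = randomization_plan(seed=2024)
for person, runs in plan.items():
    for brand, type_order in runs:
        print(f"Person {person}  Brand {brand}  loads: {type_order}")
```

Because the brand order is drawn once per person and the type order once per machine, the plan itself records the two sizes of experimental unit: the machine for Brand and the load for Type.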
20Persons are Blocks of Machines
Machines are the Experimental Units for Brands, and
the Variance is computed by Person × Brand
Compare BRANDS by using the variability among
Machines Treated Alike
21The Machines within a Person are Blocks for Types
Three Loads per Machine
The Loads within a machine are the Experimental
Units for Type and Brand × Type
Variability among Loads treated alike provides
the measuring stick for comparing the levels of
Type and Brand × Type
This Design Involves Persons as Blocks and Two
Sizes of Experimental Unit: Machine and Load
22If you ignore that this design involves TWO sizes
of Experimental Units and that there are Two Error
Terms, the resulting error term is a combination
of these two error terms
The combined error term is Too Large for making
comparisons involving Type and Brand × Type: you
won't find things that are there
The combined error term is Too Small for making
comparisons involving Brand: you will declare
things to be different when they are not
Statistical Conclusions can be very misleading
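The two error terms can be written out in a model sketch. The notation is an assumption introduced for illustration, not taken from the slides: P_i is the Person (block) effect, B_j the Brand effect, T_k the Type effect, m_{ij} the Machine-level error (Person × Brand) and \ell_{ijk} the Load-level error, with 3 persons, 4 brands and 3 types as in the example.

```latex
y_{ijk} = \mu + P_i + B_j + m_{ij} + T_k + (BT)_{jk} + \ell_{ijk},
\qquad m_{ij} \sim N(0, \sigma_{m}^{2}), \quad \ell_{ijk} \sim N(0, \sigma_{\ell}^{2})

\operatorname{Var}\left(\bar{y}_{\cdot j \cdot} - \bar{y}_{\cdot j' \cdot}\right)
  = \frac{2\left(\sigma_{\ell}^{2} + 3\,\sigma_{m}^{2}\right)}{9},
\qquad
\operatorname{Var}\left(\bar{y}_{\cdot \cdot k} - \bar{y}_{\cdot \cdot k'}\right)
  = \frac{2\,\sigma_{\ell}^{2}}{12}
```

Brand comparisons carry the Machine-level variance, while in Type comparisons the Machine errors cancel because every Type is run in every Machine. A single pooled error term estimates something between \sigma_{\ell}^{2} and \sigma_{\ell}^{2} + 3\sigma_{m}^{2}, which is why it is too large for Type and too small for Brand.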
23STATISTICIAN'S JOB: to figure out how the study
is being run and help identify the type of design
that is being used, which includes determining if
more than one size of experimental unit is
involved
This is accomplished BEST when the Statistician
is involved at the Beginning of the Study
24SALSA TASTING EXPERIMENT
NINE TYPES OR BRANDS OF SALSA
A PERSON CAN TASTE ONLY THREE SALSAS DURING THE
SESSION
TWELVE PERSONS WILL BE USED IN THE STUDY
25ASSIGNMENT OF PRODUCTS TO PERSONS with order
Person   Order 1   Order 2   Order 3
1        C         A         B
2        H         I         G
3        E         B         H
4        G         B         F
5        I         E         A
6        C         G         E
7        F         D         E
8        A         G         D
9        C         I         F
10       D         H         C
11       F         A         H
12       B         D         I
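The balance of this assignment can be checked with a short script. The sketch below transcribes the assignment from this slide into a dictionary (the name `ASSIGNMENT` is hypothetical) and counts how often each product is tasted and how often each pair of products is tasted by the same person.

```python
from collections import Counter
from itertools import combinations

# Person -> products in tasting order, transcribed from the slide.
ASSIGNMENT = {
    1: "CAB", 2: "HIG", 3: "EBH", 4: "GBF", 5: "IEA", 6: "CGE",
    7: "FDE", 8: "AGD", 9: "CIF", 10: "DHC", 11: "FAH", 12: "BDI",
}

# How many persons taste each product (the number of replications).
times_tasted = Counter(p for products in ASSIGNMENT.values() for p in products)

# How many persons taste each pair of products together.
pair_counts = Counter(
    frozenset(pair)
    for products in ASSIGNMENT.values()
    for pair in combinations(products, 2)
)

print("Replications per product:", dict(sorted(times_tasted.items())))
print("Distinct pairs tasted together:", len(pair_counts))
print("Times each pair occurs:", set(pair_counts.values()))
```

With this assignment every product is tasted 4 times and each of the 36 possible pairs shares exactly one person, so every pair of products is compared within a person equally often.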
26Each Product is Tasted 4 times: there are Four
Replications of each product
Since each person tastes only Three of the
products, how do we compare the products?
Want to compare the Products as if each Person
had tasted all of the Products
The Analysis obtains predicted values for each
Product for each Person
27The Product Means of these Predicted Values are
the ADJUSTED MEANS for each Product, called
LEAST SQUARES MEANS by SAS
The LSMEANS are the Predicted Means as if Each of
the Persons had Tasted and evaluated all of the
products
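A minimal sketch of how such adjusted means can be computed for the salsa design follows. It assumes an additive person-plus-product model and uses the closed form that the least-squares fit reduces to when every pair of products shares the same number of persons (here one). The ratings are hypothetical and noise-free, built from assumed person and product effects so the result can be checked; the function name `adjusted_means` is also an assumption.

```python
# Person -> products tasted, transcribed from the assignment slide.
ASSIGNMENT = {
    1: "CAB", 2: "HIG", 3: "EBH", 4: "GBF", 5: "IEA", 6: "CGE",
    7: "FDE", 8: "AGD", 9: "CIF", 10: "DHC", 11: "FAH", 12: "BDI",
}
V, K, LAMBDA = 9, 3, 1  # products, products per person, persons shared per pair


def adjusted_means(scores):
    """Return {product: adjusted mean} from scores keyed by (person, product)."""
    grand_mean = sum(scores.values()) / len(scores)
    person_total, product_total = {}, {}
    for (person, product), y in scores.items():
        person_total[person] = person_total.get(person, 0.0) + y
        product_total[product] = product_total.get(product, 0.0) + y
    result = {}
    for product, total in product_total.items():
        tasters = [p for p, prods in ASSIGNMENT.items() if product in prods]
        # Product total minus what its tasters' own average ratings predict.
        q = total - sum(person_total[p] for p in tasters) / K
        result[product] = grand_mean + K * q / (LAMBDA * V)
    return result


# Hypothetical, noise-free ratings: 5 + person leniency + product effect.
PRODUCT_EFFECT = dict(
    zip("ABCDEFGHI", [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0])
)
PERSON_EFFECT = {p: 0.4 * (p - 6.5) for p in ASSIGNMENT}
scores = {
    (person, product): 5.0 + PERSON_EFFECT[person] + PRODUCT_EFFECT[product]
    for person, products in ASSIGNMENT.items()
    for product in products
}

# Raw means mix in the leniency of whichever persons happened to taste a product.
raw = {
    prod: sum(y for (_, p), y in scores.items() if p == prod) / 4
    for prod in PRODUCT_EFFECT
}
adj = adjusted_means(scores)
for prod in sorted(adj):
    print(prod, round(raw[prod], 2), round(adj[prod], 2))
```

In this made-up data the raw mean of each product is shifted by the leniency of the four persons who tasted it, while the adjusted mean recovers 5 plus the product effect, which is the value expected if every person had tasted every product.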
28Sometimes characteristics of experimental units
are measured to be used as possible covariates
Example: Study the effect of three types of Drugs
on a person's heart rate
Randomly Assign 12 persons to each of the Drugs
-- person is the experimental unit
Dose the person with the assigned drug and
measure the heart rate after 15 minutes
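The random assignment of 12 persons to each drug can be sketched as below. The person IDs 1 to 36, the drug labels and the function name `assign_drugs` are assumptions for illustration.

```python
import random


def assign_drugs(person_ids, drugs=("Drug 1", "Drug 2", "Drug 3"),
                 per_drug=12, seed=None):
    """Randomly split the persons into equal-sized groups, one per drug."""
    person_ids = list(person_ids)
    if len(person_ids) != per_drug * len(drugs):
        raise ValueError("need exactly per_drug persons for every drug")
    rng = random.Random(seed)
    # A random permutation of the persons, cut into consecutive groups.
    shuffled = rng.sample(person_ids, k=len(person_ids))
    return {
        drug: sorted(shuffled[i * per_drug:(i + 1) * per_drug])
        for i, drug in enumerate(drugs)
    }


groups = assign_drugs(range(1, 37), seed=7)
for drug, persons in groups.items():
    print(drug, persons)
```

Each person is drawn into a group independently of any characteristic such as initial heart rate, which is what makes the person the experimental unit for Drug.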
29Persons do not have identical heart rates before
being given the respective drug
Measure the initial heart rate: the heart rate
before giving the drug
We want to compare the Drugs as if all
experimental units (persons) had the same initial
heart rate
30Analysis of Covariance uses a regression model to
obtain predicted after-drug heart rate values as
if all persons had initial heart rates of, say,
74 beats per minute
The Drug Means of these predicted heart rates are
used to compare the Drugs. These means of
Predicted Values are called LSMEANS
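The adjustment can be sketched as follows, assuming the simplest form of the regression model: a separate intercept for each drug and one common slope on initial heart rate. The data are hypothetical and noise-free (four persons per drug rather than twelve, to keep the listing short), and the function name `ancova_adjusted_means` is an assumption.

```python
def ancova_adjusted_means(data, x0):
    """data: {drug: [(initial_rate, after_rate), ...]}; x0: common initial rate.

    Returns (pooled within-drug slope, {drug: mean predicted at x0}).
    """
    group_means = {}
    sxy = sxx = 0.0
    for drug, pairs in data.items():
        x_bar = sum(x for x, _ in pairs) / len(pairs)
        y_bar = sum(y for _, y in pairs) / len(pairs)
        group_means[drug] = (x_bar, y_bar)
        # Pool the within-drug cross-products to estimate one common slope.
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in pairs)
        sxx += sum((x - x_bar) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    adjusted = {
        drug: y_bar + slope * (x0 - x_bar)
        for drug, (x_bar, y_bar) in group_means.items()
    }
    return slope, adjusted


# Hypothetical data: after = intercept + 0.5 * initial, no noise.
initial = {"A": [66, 70, 78, 82], "B": [70, 74, 76, 80], "C": [72, 76, 80, 84]}
intercept = {"A": 40.0, "B": 44.0, "C": 42.0}
data = {
    drug: [(x, intercept[drug] + 0.5 * x) for x in xs]
    for drug, xs in initial.items()
}

slope, adjusted = ancova_adjusted_means(data, x0=74)
print("common slope:", slope)
for drug, mean in adjusted.items():
    print(drug, mean)
```

In this made-up data the drugs differ in their average initial heart rate, so the raw after-drug means are not directly comparable; the adjusted means place every drug at an initial rate of 74 beats per minute.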
31LSMEANS are adjusted means and occur in several
settings
1. Obtain treatment means when not all treatments
are observed the same number of times by each
person
2. Obtain treatment means when the experimental
units do not have identical values of the
covariates
32Another Role of the Statistician is to provide
appropriate models for the analysis of the data
from a given study in order to take into account
the Design Structure and covariates to provide
estimates of the treatment effects as if all
experimental units had observed all treatments or
all experimental units had the same value of the
covariate -- provide appropriate LSMEANS
33Involving the Statistician in the Beginning of
the Study will
1. Improve the chance of conducting a successful
experiment
2. Speed up the turnaround of the analyses, since
the Statistician was involved with the design
3. Reduce the costs associated with the
experiment -- making sure the sample size is
adequate to provide the needed detectable
differences
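The link between sample size and detectable difference can be illustrated with a common normal-approximation formula for comparing two treatment means. This is only a sketch: the slides do not say which method is used, and the standard deviation, difference, significance level and power below are hypothetical inputs.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(sigma, delta, alpha=0.05, power=0.80):
    """Approximate replications per treatment to detect a difference `delta`
    between two means with a two-sided test, given standard deviation `sigma`.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance level
    z_beta = z.inv_cdf(power)            # desired power
    return ceil(2 * (sigma * (z_alpha + z_beta) / delta) ** 2)


print(n_per_group(sigma=10, delta=5))    # hypothetical inputs
print(n_per_group(sigma=10, delta=2.5))  # a smaller difference needs many more units
```

Halving the difference to be detected roughly quadruples the number of experimental units needed, which is why settling the detectable difference at the design stage controls the cost of the experiment.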
34THE END
THANK YOU FOR LISTENING