Title: Experimentation
1 Experimentation
2 Agenda
- Experimentation in computer science and information systems research
- Basic experimentation concepts
- Some widely used experimental designs in the CS and IS fields
- Analyzing data from experimental studies
3 History
- Experiments in natural science
- Systematic acquisition of new knowledge, testing theories about nature
- Agriculture
- Chemistry
- Experimentation in social, psychological and economic studies
- Study people's behavior
- E.g., fairness studies
4 Experiment in computer science research
- Derived from natural science experimentation
- Computer systems performance analysis
- Hardware
- Software
- Algorithm
- Network
5 Experimentation in information systems research
- Derived from social and economic experimentation
- Subject under study is usually human
- Human behavior with regard to information systems
- Trust transferred through hyperlinks
- Which subjects are most suitable for distance learning
6 Purpose of experiment
- Discover and confirm causal relationships
- Examine the possible influence that one factor or condition may have on another factor or condition
7 Basic experimentation concepts
- Independent variable
- Cause
- The researcher sets (manipulates) the independent variable by creating a condition or situation
- Manipulating the independent variable creates different treatments
- Event manipulation
- Affecting the independent variable by altering the events that subjects experience
- Presence versus absence
- Instructional manipulation
- Varying the independent variable by giving different sets of instructions to the subjects
8 Basic experimentation concepts (cont.)
- Dependent variable
- Effect (outcome)
- Physical conditions, behaviors, attitudes, feelings, or beliefs of subjects that change in response to a treatment
- How to measure
- IS research: various data collection methods
- Questionnaires, interviews, observation, tests
- CS research: metrics in the field
- Performance time, rate, error rate, time to failure and duration
9 The importance of control
- Internal validity: the extent to which we can accurately state that the independent variable produced the observed effect
10 Experiment cases
- A marketing researcher wants to study how humor in television commercials affects sales. To do so, the researcher studies the effectiveness of two commercials that have been developed for a new soft drink called Zowie. One commercial, in which a well-known but serious television actor describes how Zowie has a zingy and refreshing taste, airs during the months of March, April and May. The other commercial, a humorous scenario in which several teenagers throw Zowie at one another on a hot summer day, airs during the months of June, July and August. The researcher finds that in June through August, Zowie sales are almost double what they were in the preceding three months. "Humor boosts sales," the researcher concludes.
- Many alternative explanations are possible, e.g., soft drink sales may simply rise in the summer months
11 Strategies to achieve control
- Keep some things constant
- What are the variables that need to be held constant in most experiments?
- Include a control group
- Treatment group (experimental group)
- Between-subjects design
- Randomly assign people to groups
- Use matched pairs
- Matched-subjects design
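The two assignment strategies above can be sketched in a few lines of Python. This is only an illustration: the subject IDs, the `score` function used for matching, and the function names are hypothetical and not part of the slides.

```python
import random

def assign_groups(subjects, seed=None):
    """Between-subjects design: randomly split subjects into two groups."""
    rng = random.Random(seed)      # the seed only makes the sketch reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]   # (treatment, control)

def assign_matched_pairs(subjects, score, seed=None):
    """Matched-subjects design: pair subjects with similar scores,
    then randomly send one member of each pair to each group."""
    rng = random.Random(seed)
    ranked = sorted(subjects, key=score)
    treatment, control = [], []
    # Walk the ranked list two at a time; an odd leftover subject is dropped.
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)
        treatment.append(pair[0])
        control.append(pair[1])
    return treatment, control

subjects = [f"S{i}" for i in range(1, 21)]    # hypothetical subject IDs
treatment, control = assign_groups(subjects, seed=1)
print(treatment)
print(control)
```

Random assignment spreads unknown subject differences evenly across groups on average; matching controls one known difference directly.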
12 Between-subjects and matched-subjects designs
13 Steps in conducting an experiment
- Identify the relevant variables
- State hypotheses
- Decide on an experimental design
- Decide how to manipulate the independent variables
- Develop a valid and reliable measure for the dependent variable
- Pilot test the treatment and the dependent variable measures
- Recruit subjects (or locate cases)
- Assign subjects to groups
- Introduce the treatment to the treatment groups
- Gather data for the measures of the dependent variables
- Test the hypotheses
14 Experimental design
- One-shot case study
- True experimental design
- Factorial design
- Block design
15 Classic true experimental design
- Pretest-posttest
- Treatment versus control group
- Randomized
- Experimental design
- http://trochim.human.cornell.edu/kb/desintro.htm
16 Factorial design
- Two or more independent variables are manipulated in a single experiment
- They are referred to as factors
- The major purpose of the research is to explore their effects jointly
- Factorial designs produce efficient experiments: each observation supplies information about all of the factors
17 A simple example
- Investigate an education program with a variety of variations to find the best combination
- Amount of time receiving instruction
- 1 hour per week vs. 4 hours per week
- Setting
- In-class vs. pull-out
- 2 x 2 factorial design
- The number of numbers tells how many factors there are
- The number values tell how many levels each factor has
- Multiplying the numbers gives the number of treatment groups in the factorial design
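The 2 x 2 notation can be checked by enumerating the treatment groups. The sketch below uses the two factors from this example; the dictionary keys are illustrative names, not terms from the slides.

```python
from itertools import product

# Two factors, each with two levels, as in the education example.
factors = {
    "instruction_time": ["1 hour/week", "4 hours/week"],
    "setting": ["in-class", "pull-out"],
}

# Every combination of one level per factor is one treatment group.
groups = [dict(zip(factors, levels)) for levels in product(*factors.values())]

for group in groups:
    print(group)
print(len(groups))   # 2 levels x 2 levels
```

Adding a third level to either factor would turn this into a 3 x 2 design with six treatment groups.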
18 Factorial designs in computer system performance analysis
- Personal workstation design
- Processor: 68000, Z80, 8086
- Memory size: 512K, 2M or 8M bytes
- Number of disks: one, two or three
- Workload: secretarial, managerial or scientific
- User education: high school, college, post-graduate level
- Dependent variables
- Throughput, response time
19 2^2 factorial design
- Two factors, each at two levels
- Example: workstation design
- Factor 1: memory size
- Factor 2: cache size
- DV: performance in MIPS
Cache size   Memory size 4M bytes   Memory size 8M bytes
1K           15                     45
2K           25                     75
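One common way to analyze a 2^2 table like this is the sign-table method, which fits y = q0 + qA*xA + qB*xB + qAB*xA*xB with each factor coded as -1 (low) or +1 (high). The slide does not show the computation, so the coding and variable names below are assumptions made for illustration.

```python
# Responses in MIPS from the table, keyed by (memory level, cache level):
# -1 = low level (4M bytes, 1K), +1 = high level (8M bytes, 2K).
y = {(-1, -1): 15, (1, -1): 45, (-1, 1): 25, (1, 1): 75}

n = len(y)
q0 = sum(y.values()) / n                                   # overall mean
q_mem = sum(a * v for (a, b), v in y.items()) / n          # memory effect
q_cache = sum(b * v for (a, b), v in y.items()) / n        # cache effect
q_inter = sum(a * b * v for (a, b), v in y.items()) / n    # interaction

print(q0, q_mem, q_cache, q_inter)   # 40.0 20.0 10.0 5.0
```

Under this coding, memory size has the largest effect, cache size the next largest, and the interaction the smallest.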
20 2^K factorial design
- K factors, each at two levels
- 2^K experiments
- 2^3 design example
- In designing a personal workstation, the three factors to be studied are cache size, memory size and number of processors
Factor                 Level -1   Level 1
Memory size            4 Mbytes   16 Mbytes
Cache size             1 Kbyte    2 Kbytes
Number of processors   1          2
Cache size (Kbytes)   4 Mbytes, 1 proc   4 Mbytes, 2 proc   16 Mbytes, 1 proc   16 Mbytes, 2 proc
1                     14                 46                 22                  58
2                     10                 50                 34                  86
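The same sign-table idea extends to any 2^K design. The sketch below is a generic helper (the name `effects` and the tuple ordering memory, cache, processors are assumptions) applied to the eight responses in the table above.

```python
from itertools import combinations
from math import prod

def effects(runs):
    """runs maps a tuple of -1/+1 factor levels to the observed response.
    Returns the mean (key ()) and every main and interaction effect,
    keyed by the tuple of factor indices involved."""
    k = len(next(iter(runs)))
    n = len(runs)
    result = {(): sum(runs.values()) / n}
    for size in range(1, k + 1):
        for idx in combinations(range(k), size):
            result[idx] = sum(
                prod(x[i] for i in idx) * y for x, y in runs.items()
            ) / n
    return result

# Levels ordered (memory, cache, processors); -1 = low, +1 = high.
runs = {
    (-1, -1, -1): 14, (-1, -1, 1): 46, (1, -1, -1): 22, (1, -1, 1): 58,
    (-1, 1, -1): 10, (-1, 1, 1): 50, (1, 1, -1): 34, (1, 1, 1): 86,
}

e = effects(runs)
for key, value in e.items():
    print(key, value)
```

Here index 0 is memory size, 1 is cache size and 2 is number of processors, so `e[(2,)]` is the processor effect and `e[(0, 1)]` is the memory-cache interaction.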
21 Full and fractional factorial design
- Full factorial design
- Study all combinations
- Can find effect of all factors
- Fractional (incomplete) factorial design
- Leave some treatment groups empty
- Less information
- May not get all interactions
- No problem if interaction is negligible
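To make the trade-off concrete, here is a small sketch of one possible half fraction of a 2^3 design. The choice of keeping the runs where C = A*B is an assumption for illustration; the slides do not specify how the fraction is chosen.

```python
from itertools import product

# Full factorial: all 8 combinations of three two-level factors A, B, C.
full = list(product([-1, 1], repeat=3))

# Half fraction: keep only the runs in which the level of C equals A*B.
half = [(a, b, c) for a, b, c in full if c == a * b]

for run in half:
    print(run)
print(len(full), len(half))   # 8 4
```

In the retained runs the column for C is identical to the column for the A-B interaction, so the two cannot be told apart. That is the lost information, and it does no harm only if the interaction is negligible.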
22 Two-factor full factorial design
- Used where there are two factors that are carefully controlled
- Examples in computer system performance analysis
- To compare several processors using several workloads
- To determine two configuration parameters such as cache and memory size
23 Two-factor full factorial design (cont.)
Workload    Two caches   One cache   No cache
ASM         54.0         55.0        106.0
TECO        60.0         60.0        123.0
SIEVE       43.0         43.0        120.0
DHRYSTONE   49.0         52.0        111.0
SORT        49.0         50.0        108.0
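A table like this is often summarized with an additive model, y = grand mean + column effect + row effect + error. The slide does not state the model, so treat the sketch below as one plausible analysis; the variable names are illustrative.

```python
# Measurements from the table: one row per workload,
# columns ordered two caches, one cache, no cache.
columns = ["Two caches", "One cache", "No cache"]
data = {
    "ASM": [54.0, 55.0, 106.0],
    "TECO": [60.0, 60.0, 123.0],
    "SIEVE": [43.0, 43.0, 120.0],
    "DHRYSTONE": [49.0, 52.0, 111.0],
    "SORT": [49.0, 50.0, 108.0],
}

values = [v for row in data.values() for v in row]
grand = sum(values) / len(values)

# Column effect: column mean minus grand mean (cache configuration).
col_effect = {
    name: sum(row[j] for row in data.values()) / len(data) - grand
    for j, name in enumerate(columns)
}
# Row effect: row mean minus grand mean (workload).
row_effect = {w: sum(row) / len(row) - grand for w, row in data.items()}

print(round(grand, 1))
print({k: round(v, 1) for k, v in col_effect.items()})
print({k: round(v, 1) for k, v in row_effect.items()})
```

The column effects are much larger in magnitude than the row effects, which suggests the cache configuration matters more than the workload for these measurements.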
24 Field and controlled laboratory experiments
- Field experiment
- Experiments conducted in real-life or field settings
- Researcher has less control over the experimental conditions
- Greater external validity but lower internal validity
- Controlled laboratory experiment
- Conducted under the controlled conditions of a laboratory
- Greater internal validity but lower external validity
- Practical considerations
- Planning and pilot testing
- Instructions to subjects
- Post-experiment interview
25 Example of field and controlled laboratory experiments
- Field experiment
- The case in slide 10
- A controlled laboratory version
- Ask two groups of subjects (students) to view tapes of two different ads (event manipulation)
- Use a questionnaire to collect their intentions to buy the product
- Compare the responses from the two groups
26 Analyzing data from a between-subjects design
- Problem
- You want to measure the acquisition of mathematical skills by distance learning and traditional classroom learning. The study compares 20 students, ten taught in a classroom and ten taught by a distance learning program. The final test scores were collected as the dependent variable.
       DL (distance learning)   CL (classroom learning)
       94                       90
       89                       91
       76                       83
       85                       81
       88                       74
       65                       60
       70                       69
       72                       63
       68                       62
       64                       63
Mean   77.1                     73.6
27 Why can't we just compare the means
- The difference between the means is the same in all three cases
- They tell very different stories
- When we are looking at the differences between scores for two groups, we have to judge the difference between their means relative to the spread or variability of their scores
28 t-test
- t-test
- Assesses whether the means of two groups are statistically different from each other
- Sample size is small
- Approximately normal distribution of the measure in the two groups is assumed
29 Perform t-test
30 Interpret result
- Set a significance level
- Degrees of freedom
- N1 + N2 - 2
- Compare the t-value with the critical value from the t-distribution to see if it is large enough to be significant
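As a worked sketch, the steps above can be applied to the DL and CL scores from slide 26. The pooled-variance form of the t statistic is assumed here because the formula on slide 29 is not reproduced in the text; with equal group sizes it gives the same t as the unpooled form. The critical value 2.101 is the two-tailed table value for a 0.05 significance level and 18 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

# Final test scores from slide 26.
dl = [94, 89, 76, 85, 88, 65, 70, 72, 68, 64]   # distance learning
cl = [90, 91, 83, 81, 74, 60, 69, 63, 62, 63]   # classroom learning

n1, n2 = len(dl), len(cl)
df = n1 + n2 - 2                                 # degrees of freedom

# Pooled sample variance (assumes similar spread in both groups).
pooled_var = ((n1 - 1) * variance(dl) + (n2 - 1) * variance(cl)) / df
std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
t = (mean(dl) - mean(cl)) / std_error

critical = 2.101   # two-tailed, alpha = 0.05, df = 18 (from a t table)
print(round(t, 2), df, abs(t) > critical)
```

The t-value of about 0.68 is well below the critical value, so at the 0.05 level the 3.5-point difference between the group means is not statistically significant given the spread of the scores.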
31 Analyzing data from a matched-subjects design
- Problem
- You want to compare the hit rates of two cache algorithms. The simulated cache algorithms are run on 5 benchmarks and the hit rates are recorded.
       Cache 1   Cache 2
       0.91      0.95
       0.67      0.65
       0.85      0.90
       0.73      0.80
       0.93      0.97
Mean   0.818     0.854
32 Suitable test: paired t-test
- Calculation of t-value
- Degrees of freedom
- N - 1
        Cache 1   Cache 2   Difference   D^2
B1      0.91      0.95      -0.04        0.0016
B2      0.67      0.65       0.02        0.0004
B3      0.85      0.90      -0.05        0.0025
B4      0.73      0.80      -0.07        0.0049
B5      0.93      0.97      -0.04        0.0016
Total                       -0.18        0.0110
Avg                         -0.036
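The paired t-value can be computed directly from the per-benchmark differences in the table. This is a minimal sketch using the standard paired formula, t = mean(D) / (sd(D) / sqrt(N)); the critical value 2.776 is the two-tailed table value for a 0.05 significance level and 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Hit rates on benchmarks B1..B5 from the table.
cache1 = [0.91, 0.67, 0.85, 0.73, 0.93]
cache2 = [0.95, 0.65, 0.90, 0.80, 0.97]

# Work on the differences, since each benchmark is measured under both caches.
d = [a - b for a, b in zip(cache1, cache2)]
n = len(d)
df = n - 1

t = mean(d) / (stdev(d) / sqrt(n))

critical = 2.776   # two-tailed, alpha = 0.05, df = 4 (from a t table)
print(round(t, 2), df, abs(t) > critical)
```

With only five benchmarks the t-value of about -2.39 does not reach the critical value, so the higher average hit rate of Cache 2 is not significant at the 0.05 level.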
33 Analyzing data from a factorial design
- Problem
- The memory-cache experiments were repeated three times each. The results are shown in the table below, with cell means in parentheses
- What we want to find out
- Which factor contributes most to the performance
- What is the joint effect of the two factors
Cache size   Memory size 4M   Memory size 8M
1K           15 18 12 (15)    45 48 51 (48)
2K           25 28 19 (24)    75 75 81 (77)
34 Suitable test: ANOVA
- Two-way ANOVA (analysis of variance)
- F-value
- Between-sample variation / within-sample variation
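A minimal two-way ANOVA sketch for the replicated memory-cache data on slide 33 follows. It assumes a balanced design (three replications in every cell) and uses the standard sum-of-squares decomposition; the variable names are illustrative.

```python
from statistics import mean

# cells[(cache, memory)] = the three replicated measurements from slide 33.
cells = {
    ("1K", "4M"): [15, 18, 12], ("1K", "8M"): [45, 48, 51],
    ("2K", "4M"): [25, 28, 19], ("2K", "8M"): [75, 75, 81],
}
caches = sorted({c for c, _ in cells})
memories = sorted({m for _, m in cells})
r = 3                                   # replications per cell

grand = mean(v for obs in cells.values() for v in obs)
cache_mean = {c: mean(v for (ci, _), obs in cells.items() if ci == c for v in obs)
              for c in caches}
mem_mean = {m: mean(v for (_, mi), obs in cells.items() if mi == m for v in obs)
            for m in memories}

# Between-sample variation, split by source.
ss_cache = r * len(memories) * sum((cache_mean[c] - grand) ** 2 for c in caches)
ss_mem = r * len(caches) * sum((mem_mean[m] - grand) ** 2 for m in memories)
ss_cells = r * sum((mean(obs) - grand) ** 2 for obs in cells.values())
ss_inter = ss_cells - ss_cache - ss_mem
# Within-sample variation (experimental error).
ss_error = sum((v - mean(obs)) ** 2 for obs in cells.values() for v in obs)

df_error = len(cells) * (r - 1)
ms_error = ss_error / df_error
df = {"memory": len(memories) - 1, "cache": len(caches) - 1}
df["interaction"] = df["memory"] * df["cache"]
ss = {"memory": ss_mem, "cache": ss_cache, "interaction": ss_inter}
f = {name: (ss[name] / df[name]) / ms_error for name in ss}

ss_total = ss_mem + ss_cache + ss_inter + ss_error
for name in ss:
    print(name, round(ss[name], 1), round(100 * ss[name] / ss_total, 1), round(f[name], 2))
```

Each F-value is compared with the critical value of the F distribution for the corresponding degrees of freedom (about 5.32 for 1 and 8 degrees of freedom at the 0.05 level). In this data set memory size accounts for the largest share of the variation, and the interaction term answers the question about the joint effect of the two factors.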
35 Statistical package
36 References
- Paul D. Leedy and Jeanne Ellis Ormrod, Practical Research: Planning and Design, 7th edition
- Robert B. Burns, Introduction to Research Methods, 4th edition
- Raj Jain, The Art of Computer Systems Performance Analysis
- www.socialresearchmethods.net
- http://www.statsoft.com/textbook/stathome.html