Title: Classroom Data Analysis
1. Workflow Analysis of Student Data
John R. Gilbert and Viral Shah
2. Goals and Caveats
- Our goal is to develop methods for using measured data to evaluate hypotheses about workflow.
- These are preliminary experiments using data from one pilot classroom study (UCSB CS240A, spring 2004).
- Don't put too much trust in this data! (We are still learning what data to gather and how.)
- Therefore, this talk won't defend any particular conclusions about workflows. Rather, we aim to:
  - show that a data-based analytical approach is promising for further development
  - inform data-gathering in upcoming studies.
3. Background
- The system asked students for a reason for each compile
- We didn't trust the answers . . .
- But we captured the full source, etc., at each compile/run
- So, we completely re-ran the student experience
  - here, one assignment from one class
  - 17 student histories
  - about one day of 32-processor cluster time
- Used heuristics to assign reasons for compiles
4. Scripted questionnaire
- What is the reason for this compile/run?
- Learn / experiment with compiler
- Adding serial functionality
- Parallelizing
- Performance tuning
- Fixing compile time error
- Fixing run time error
5. Heuristics to deduce answers
- What is the reason for this compile/run?
  - Learn / experiment with compiler: few MPI calls, LOC unchanged
  - Adding serial functionality: few MPI calls, LOC changes
  - Parallelizing: number of MPI calls changes
  - Performance tuning: correct, run time changes
  - Fixing compile-time error: previous compile failed
  - Fixing run-time error: previous run failed on random test case
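The heuristics above can be expressed as a short decision procedure. The sketch below is hypothetical: the snapshot field names, the "few MPI calls" threshold, and the rule ordering are our assumptions for illustration, not the study's actual classifier.

```python
def classify_compile(prev, curr):
    """Guess why a student compiled, from two consecutive snapshots.

    Each snapshot is a dict with (assumed) keys:
      mpi_calls - number of MPI call sites in the source
      loc       - lines of code
      compiled  - did this snapshot's compile succeed?
      run_ok    - did the run pass the random test case? (None if no run)
      run_time  - wall-clock run time (None if it did not run)
    """
    # Error-fixing rules look at the previous snapshot's outcome first.
    if not prev["compiled"]:
        return "fixing compile-time error"
    if prev["run_ok"] is False:
        return "fixing run-time error"
    # A change in the number of MPI calls signals parallelization work.
    if curr["mpi_calls"] != prev["mpi_calls"]:
        return "parallelizing"
    # "Few MPI calls" threshold is an assumption (here: at most 2).
    if curr["mpi_calls"] <= 2 and curr["loc"] == prev["loc"]:
        return "learn / experiment with compiler"
    if curr["mpi_calls"] <= 2:
        return "adding serial functionality"
    # Otherwise: previous version was correct, so tuning is the best guess.
    return "performance tuning"
```

Ordering matters: error-fixing takes priority, since a failed previous compile or run explains the new compile regardless of what else changed.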
6. Example of student trace
- Student 2: 70 runs in 7 days, 12 hr, 0 sec
  - rev 1 (after 0s): seq, runerr
  - rev 2 (after 4d 4h 27m 7s): seq par, cplerr
  - rev 3 (after 41s): seq, cplerr
  - rev 4 (after 19s): cplerr
  - rev 5 (after 21s): cplerr
  - . . .
  - rev 14 (after 21m 49s): cplerr
  - rev 15 (after 6m 32s): seq, cplerr
  - rev 16 (after 38s): seq, run, time 7.47479
  - rev 17 (after 57s): seq, run, time 4.58393
  - rev 18 (after 2m 25s): run, time 7.68748
  - rev 19 (after 1m 14s): run, time 4.54306
  - . . .
  - rev 64 (after 2m 22s): par, runerr
  - rev 65 (after 5m 0s): par, crash
  - rev 66 (after 6m 43s): run, time 10.6737
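A trace like this is the raw material for fitting: counting transitions and averaging dwell times gives empirical estimates of prob(B | A) and time(A, B). A minimal sketch, using a made-up toy trace (not Student 2's actual revisions):

```python
from collections import defaultdict

# Toy trace of (state, dwell_seconds) events ending in "done".
# The states and times here are invented for illustration.
trace = [("compile", 41), ("compile", 19), ("test", 38),
         ("debug", 57), ("compile", 25), ("test", 74), ("done", 0)]

counts = defaultdict(lambda: defaultdict(int))     # counts[a][b] = # of a->b
dwells = defaultdict(lambda: defaultdict(list))    # dwells[a][b] = dwell samples

# Each consecutive pair (a, b) in the trace is one observed transition;
# the dwell recorded with a is the time spent in a before moving to b.
for (a, t), (b, _) in zip(trace, trace[1:]):
    counts[a][b] += 1
    dwells[a][b].append(t)

for a in counts:
    total = sum(counts[a].values())
    for b in counts[a]:
        p = counts[a][b] / total
        mean = sum(dwells[a][b]) / len(dwells[a][b])
        print(f"{a} -> {b}: prob {p:.2f}, mean dwell {mean:.0f}s")
```

These per-edge relative frequencies and sample means are the natural point estimates for the model parameters introduced on the timed-Markov-process slide below.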
7. Summary of student experience
- Student 1: 324 runs in 22h 41m 37s, best 3.386
- Student 2: 70 runs in 8h 35m 11s, best 2.346
- Student 3: 36 runs in 4h 59m 54s, best 1.854
- Student 4: 216 runs in 11h 42m 54s, best 6.509
- Student 5: 173 runs in 14h 57m 35s, best 5.459
- Student 6: 122 runs in 8h 27m 10s, best 2.606
- Student 7: 174 runs in 18h 17m 8s, best 3.576
- Student 8: 536 runs in 27h 10m 16s, best 7.540
- Student 9: 72 runs in 17h 23m 32s, best 5.533
- Student 10: 110 runs in 11h 43m 29s, best 5.183
- Student 11: 325 runs in 41h 18m 12s, best 1.684
- Student 12: 188 runs in 18h 24m 39s, best 4.428
8. Compiles, Runs, Correct runs
- Distribution varies significantly by programmer
9. LOC profiles
- All kinds of workflows are observed in the class
10. Timed Markov processes
- A timed Markov process is a Markov process with associated state dwell times
- The dwell times depend on how the state is exited:
  - prob(B | A) is the probability that the next state is B, given that the current state is A
  - time(A, B) is the dwell time spent in state A, given that the next state is B
  - The dwell time is a random variable in general
From Burton Smith and David Mizell, CRAY
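This definition fits in a few lines of code. Below is a minimal simulator sketch; the states, probabilities, and dwell times are illustrative, not the fitted classroom values, and the exponential dwell distribution is our modeling assumption (the definition only says the dwell is some random variable).

```python
import random

# Each entry maps a state to its outgoing edges:
#   next_state -> (transition probability, mean dwell time in seconds).
# Illustrative numbers only.
TMP = {
    "compile": {"test": (1.0, 49.0)},
    "test":    {"debug": (0.7, 5.0), "done": (0.3, 5.0)},
    "debug":   {"compile": (1.0, 380.0)},
}

def simulate(tmp, state="compile", rng=random):
    """Walk the timed Markov process until 'done'; return total elapsed time."""
    total = 0.0
    while state != "done":
        edges = tmp[state]
        r = rng.random()
        for nxt, (p, dwell) in edges.items():
            r -= p
            if r <= 0:
                # Dwell time is a random variable; here we assume it is
                # exponentially distributed with the given mean.
                total += rng.expovariate(1.0 / dwell)
                state = nxt
                break
    return total
```

Running `simulate(TMP)` many times gives the distribution of total completion time implied by the model, which can then be compared against the observed student totals.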
11. General model of researcher workflow
- express new scientific theory in informal notation
- express new theory in VHLL (Matlab, Perl, OS360 JCL, etc.)
- debug code on small data sets
- test new theory on small, medium data sets; compare against known results, previous models
- redesign program for HPC system
- write program for HPC system
- compile for debug
- test HPC code
- debug HPC code
- select medium-to-large data set for performance testing/optimization
- optimize HPC code for performance
- do performance test run for HPC code
- select, obtain large-scale data set
- structure data set for large-scale computation
- test large-scale data set arrangement for correctness
- test for expected performance
- design/implement visualization approach for larger-scale problems
- run performance-tuned version against large-scale data set
- visualize results
Items in red: a new system could shorten these times; we'll try to model them.
Items in orange: could also be sped up by a new system, but not yet part of the model.
From Burton Smith and David Mizell, CRAY
12. Researcher workflow model
From Burton Smith and David Mizell, CRAY
13. Fitting data to Cray model
- Transition probabilities (percent, current state to next state):
  - program → compile1: 100.0
  - compile1 → test: 100.0
  - test → program: 23.7; → debug: 71.3; → run: 4.8; → done: 0.2
  - debug → compile1: 100.0
  - run → debug: 26.6; → optimize: 69.9; → done: 3.5
  - optimize → compile2: 100.0
  - compile2 → run: 100.0
- Average dwell times (time in the current state, given the next state):
  - program → compile1: 4m 28s
  - compile1 → test: 49s
  - test → program: 5s; → debug: 5s; → run: 9s; → done: 5s
  - debug → compile1: 6m 20s
  - run → debug: 5s; → optimize: 4s; → done: 3s
  - optimize → compile2: 10m 29s
  - compile2 → run: 30s
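One thing the fitted model immediately yields is an expected time to completion: the expected time from each state satisfies E[s] = Σ prob(b | s) · (time(s, b) + E[b]), with E[done] = 0. The sketch below transcribes the fitted probabilities and dwell times from the table above (as we have reconstructed it) and solves the system by fixed-point iteration; the solver itself is ours, not part of the study.

```python
# Fitted edges, transcribed from the table: state -> {next: (prob, dwell_s)}.
EDGES = {
    "program":  {"compile1": (1.0, 268)},
    "compile1": {"test": (1.0, 49)},
    "test":     {"program": (0.237, 5), "debug": (0.713, 5),
                 "run": (0.048, 9), "done": (0.002, 5)},
    "debug":    {"compile1": (1.0, 380)},
    "run":      {"debug": (0.266, 5), "optimize": (0.699, 4),
                 "done": (0.035, 3)},
    "optimize": {"compile2": (1.0, 629)},
    "compile2": {"run": (1.0, 30)},
}

def expected_times(edges, sweeps=20000):
    """Iterate E[s] = sum_b p(b|s) * (time(s,b) + E[b]) to its fixed point."""
    E = {s: 0.0 for s in edges}
    E["done"] = 0.0  # absorbing state
    for _ in range(sweeps):
        for s, outs in edges.items():
            E[s] = sum(p * (dwell + E[nxt]) for nxt, (p, dwell) in outs.items())
    return E

E = expected_times(EDGES)
print(f"expected time from 'program' to 'done': {E['program'] / 3600:.1f} hours")
```

With these numbers the expected completion time comes out at roughly 17 to 18 hours, in the same range as the per-student totals on the summary slide, which is a useful sanity check on the fit.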
14. Fitting data to Cray model
[Diagram: the fitted workflow as a timed Markov process. States: Formulate, Program, Compile, Debug, Test, Compile (second), Optimize, Run. Edges are labeled "probability / average dwell time": 1.0 / 268s, 1.0 / 380s, .237 / 5s, 1.0 / 49s, .713 / 5s, .002 / 5s, .048 / 9s, 1.0 / 629s, .266 / 5s, 1.0 / 30s, .699 / 4s, .035 / 3s.]
15. Conclusions
- Remember all the caveats!
  - This is very preliminary; no conclusions about workflows yet
- Fitting measured data to hypothesized workflows looks promising
- We're getting better at knowing what to measure and how
  - See Vic's talk yesterday
  - Formulation time, programming vs. debugging, . . . ?
  - Measure automatically when possible
- Lots more data coming soon from classroom experiments
  - Should be able to do this with professional data too
- Want to compare different languages / apps / . . .
- Want to use a principled approach to estimating statistics of dwell times, evaluating competing state models, etc.