Title: Science in the Clouds: History, Challenges, and Opportunities
1Science in the Clouds: History, Challenges, and Opportunities
- Douglas Thain
- University of Notre Dame
- GeoClouds Workshop
- 17 September 2009
2http://www.cse.nd.edu/ccl
3The Cooperative Computing Lab
- We collaborate with people who have large scale computing problems.
- We build new software and systems to help them achieve meaningful goals.
- We run a production computing system used by people at ND and elsewhere.
- We conduct computer science research, informed by real world experience, with an impact upon problems that matter.
4Clouds in the Hype Cycle
Gartner Hype Cycle Report, 2009
5What is cloud computing?
- A cloud provides rapid, metered access to a virtually unlimited set of resources.
- Two significant impacts on users:
- End users must have an economic model for the work that they want to accomplish.
- Apps must be flexible enough to work with an arbitrary number and kind of resources.
6Example: Amazon EC2, Sep 2009 (simplified slightly for discussion)
- Small: 1 core, 1.7GB RAM, 160GB disk
- 10 cents/hour
- Large: 2 cores, 7.5GB RAM, 850GB disk
- 40 cents/hour
- Extra Large: 4 cores, 15GB RAM, 1690GB disk
- 80 cents/hour
- And the Simple Storage Service:
- 15 cents per GB-month stored
- 17 cents per GB transferred (outside of EC2)
- 1 cent per 1000 write operations
- 1 cent per 10000 read operations
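The price list above is enough to sketch the kind of economic model an end user would need. The following is only an illustration: the function names and the example project (10,000 core-hours, 500 GB) are assumptions, and only the prices come from the slide.

```python
# Prices copied from the slide (Amazon EC2 and S3, Sep 2009, simplified).
INSTANCE_PRICE_PER_HOUR = {"small": 0.10, "large": 0.40, "xlarge": 0.80}
INSTANCE_CORES = {"small": 1, "large": 2, "xlarge": 4}
S3_PER_GB_MONTH = 0.15
S3_PER_GB_TRANSFERRED = 0.17


def compute_cost(core_hours, instance="small"):
    """Dollar cost of buying `core_hours` of CPU time on one instance type."""
    instance_hours = core_hours / INSTANCE_CORES[instance]
    return instance_hours * INSTANCE_PRICE_PER_HOUR[instance]


def storage_cost(gigabytes, months):
    """Dollar cost of keeping `gigabytes` in S3 for `months`."""
    return gigabytes * months * S3_PER_GB_MONTH


def transfer_cost(gigabytes):
    """Dollar cost of moving `gigabytes` across the EC2 boundary."""
    return gigabytes * S3_PER_GB_TRANSFERRED


if __name__ == "__main__":
    # Hypothetical project: 10,000 core-hours, 500 GB stored for 6 months,
    # and the same 500 GB transferred out once.
    total = compute_cost(10_000) + storage_cost(500, 6) + transfer_cost(500)
    print(f"estimated bill: ${total:,.2f}")
```

Request charges (the per-operation prices) are left out to keep the sketch short; a real model would add them.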
7Is Cloud Computing New?
- Not entirely, but a combination of the old ideas of utility computing and distributed computing.
- 1960 MULTICS
- 1980 The Cambridge Ring
- 1987 Condor Distributed Batch System
- 1989 SETI@home
- 1990s Clusters, Beowulf, MPI, NOW
- 1995 Globus, Grid Computing
- 2001 TeraGrid
- 2004 Sun Rents CPUs at $1/hour
- 2006 Amazon EC2 and S3
8Clouds Trade CapEx for OpEx
[Figure: cost plotted against time, with labels for the Capital Expense of Ownership and the OpEx of Ownership.]
9What about grid computing?
- A vision much like clouds:
- A worldwide framework that would make massive scale computing as easy to use as an electrical socket.
- The more modest realization:
- A means for accessing remote computing facilities in their native form, usually for CPU-intensive tasks.
- The social context:
- Large collaborative efforts between computer scientists and computer-savvy fields, particularly physics and astronomy.
10Clouds vs Grids
- Grids provide a job execution interface:
- Run program P on input A, return the output.
- Allows the system to maximize utilization and hide failures, but provides few performance guarantees and inaccurate metering.
- Clouds provide resource allocation:
- Create a VM with 2GB of RAM for 7 days.
- Gives predictable performance and accurate metering, but exposes problems to the user.
- Can be used to build interactive services.
- How do I run 1M jobs on 100 servers?
11Grid Computing Layer Provides Job Execution
Cloud Computing Layer Provides Resource Allocation
12Create a Condor Pool with 100 Nodes
Allocate 100 Cores
Run 1M Jobs
13Clouds Solve Some Grid Problems
- Application compatibility is simplified.
- You provide a VM for Linux 2.3.4.1.2.
- Performance is reasonably predictable.
- 10% variations rather than orders of magnitude.
- Fewer administrative headaches for the lone user.
- A credit card swipe instead of a certificate.
14But, Problems New and Old
- How do I reliably execute 1M jobs?
- Can I share resources and data with others in the cloud?
- How do I authenticate others in the cloud?
- Unfortunately, location still matters.
- Can we make applications efficiently span multiple cloud providers?
- Can we join existing centers with clouds?
- (These are all problems contemplated by grid.)
15More Open Questions
- Can I afford to move my data into the cloud?
- Can I afford to get it out?
- Do I trust the cloud to secure my data?
- How do I go about constructing an economic model for my research?
- Are there social/technical dangers in putting too many eggs in one basket?
- Is pay-as-you-go the proper model for research?
- Should universities get out of the data center business?
16Clusters, clouds, and grids give us access to unlimited CPUs.
How do we write programs that can run effectively in large systems?
17MapReduce( S, M, R )
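[Figure: MapReduce dataflow. The mapper M turns each element of set S into (K,V) pairs; the values V are grouped by key (Key0, Key1, ... KeyN); the reducer R turns each group into an output (O0, O1, O2).]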
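The MapReduce( S, M, R ) pattern can be sketched sequentially in a few lines. This is a minimal single-machine illustration, not a distributed implementation; the word-count mapper and reducer are made-up examples.

```python
from collections import defaultdict


def map_reduce(S, M, R):
    """Apply mapper M to each element of S, group values by key, reduce each group with R."""
    groups = defaultdict(list)
    for item in S:
        for key, value in M(item):  # M yields zero or more (K, V) pairs
            groups[key].append(value)
    # One call to R per key; each call produces one output.
    return {key: R(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Illustrative mapper: emit (word, 1) for every word in a line."""
    for word in line.split():
        yield word.lower(), 1


def count_reducer(key, values):
    """Illustrative reducer: add up the counts for one word."""
    return sum(values)


if __name__ == "__main__":
    lines = ["the cloud", "the grid and the cloud"]
    print(map_reduce(lines, word_mapper, count_reducer))
```

In a real system the calls to M and the calls to R are independent of each other, which is what lets them be spread over many machines.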
18Of course, not all science fits into the
Map-Reduce model!
19Example: Biometrics Research
- Goal: Design a robust face comparison function.
20Similarity Matrix Construction
1.0 0.8 0.1 0.0 0.0 0.1
    1.0 0.0 0.1 0.1 0.0
        1.0 0.0 0.1 0.3
            1.0 0.0 0.0
                1.0 0.1
                    1.0
Challenge Workload:
- 60,000 iris images
- 1MB each
- .02s per F
- 833 CPU-days
- 600 TB of I/O
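The 833 CPU-days figure follows from the other numbers in the challenge workload, assuming every image is compared against every image (60,000 x 60,000 calls to F). A quick check, with a hypothetical helper name:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def all_pairs_cpu_days(n_images, seconds_per_comparison):
    """CPU-days needed to compare every image against every image."""
    comparisons = n_images * n_images
    return comparisons * seconds_per_comparison / SECONDS_PER_DAY


if __name__ == "__main__":
    # 60,000 images at 0.02 s per call to F, as on the slide.
    print(round(all_pairs_cpu_days(60_000, 0.02)), "CPU-days")
```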
21Now What?
25Non-Expert User Using 500 CPUs
26Observation
- In a given field of study, many people repeat the same pattern of work many times, making slight changes to the data and algorithms.
- If the system knows the overall pattern in advance, then it can do a better job of executing it reliably and efficiently.
- If the user knows in advance what patterns are allowed, then they have a better idea of how to construct their workloads.
27Abstractions for Distributed Computing
- Abstraction: a declarative specification of the computation and data of a workload.
- A restricted pattern, not meant to be a general purpose programming language.
- Uses data structures instead of files.
- Provide users with a bright path.
- Regular structure makes it tractable to model and predict performance.
28Working with Abstractions
[Figure: a compact data structure (sets A1..An and B1..Bn plus a function F) is handed as AllPairs( A, B, F ) to a custom workflow engine, which runs it on a cloud or grid.]
29All-Pairs Abstraction
- AllPairs( set A, set B, function F )
- returns matrix M where
- M[i,j] = F( A[i], B[j] ) for all i,j
[Figure: the command "allpairs A B F.exe" applies F to every pairing of A1..An with B1..Bn to fill the matrix AllPairs(A,B,F).]
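A sequential sketch of the All-Pairs definition follows, together with one way the matrix could be cut into blocks so that each block becomes a single job. The blocking scheme and all function names are illustrative assumptions, not the production engine's actual strategy.

```python
def all_pairs(A, B, F):
    """Return matrix M with M[i][j] = F(A[i], B[j]) for all i, j."""
    return [[F(a, b) for b in B] for a in A]


def blocks(n_rows, n_cols, block_size):
    """Yield (row_range, col_range) sub-blocks; each could be dispatched as one job."""
    for r in range(0, n_rows, block_size):
        for c in range(0, n_cols, block_size):
            yield (range(r, min(r + block_size, n_rows)),
                   range(c, min(c + block_size, n_cols)))


def all_pairs_blocked(A, B, F, block_size):
    """Same result as all_pairs, computed block by block."""
    M = [[None] * len(B) for _ in A]
    for rows, cols in blocks(len(A), len(B), block_size):
        # In a distributed setting this inner loop nest would run remotely.
        for i in rows:
            for j in cols:
                M[i][j] = F(A[i], B[j])
    return M


if __name__ == "__main__":
    distance = lambda x, y: abs(x - y)  # stand-in for a real comparison function
    print(all_pairs_blocked([1, 2, 3], [1, 4], distance, block_size=2))
```

Larger blocks mean fewer jobs and less dispatch overhead, while smaller blocks expose more parallelism; choosing that size is one of the decisions the slide leaves to the engine.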
30How Does the Abstraction Help?
- The custom workflow engine:
- Chooses the right data transfer strategy.
- Chooses the right number of resources.
- Chooses the blocking of functions into jobs.
- Recovers from a larger number of failures.
- Predicts overall runtime accurately.
- All of these tasks are nearly impossible for arbitrary workloads, but are tractable (not trivial) to solve for a specific abstraction.
32Choose the Right Number of CPUs
33Resources Consumed
34All-Pairs in Production
- Our All-Pairs implementation has provided over 57 CPU-years of computation to the ND biometrics research group over the last year.
- Largest run so far: 58,396 irises from the Face Recognition Grand Challenge. The largest experiment ever run on publicly available data.
- Competing biometric research relies on samples of 100-1000 images, which can miss important population effects.
- Reduced computation time from 833 days to 10 days, making it feasible to repeat multiple times for a graduate thesis. (We can go faster yet.)
36Are there other abstractions?
37Wavefront( matrix M, function F(x,y,d) )
- returns matrix M such that
- M[i,j] = F( M[i-1,j], M[i,j-1], M[i-1,j-1] )
[Figure: Wavefront(M,F) applies F across the matrix M.]
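The recurrence can be sketched sequentially as below. It assumes that row 0 and column 0 of M already hold boundary values, which the slide does not state. The helper `anti_diagonals` is a hypothetical addition showing which cells are independent of each other and could therefore run in parallel.

```python
def wavefront(M, F):
    """Fill M in place so that M[i][j] = F(M[i-1][j], M[i][j-1], M[i-1][j-1]).

    Assumes row 0 and column 0 already contain boundary values.
    """
    n_rows, n_cols = len(M), len(M[0])
    for i in range(1, n_rows):
        for j in range(1, n_cols):
            M[i][j] = F(M[i - 1][j], M[i][j - 1], M[i - 1][j - 1])
    return M


def anti_diagonals(n_rows, n_cols):
    """Group interior cells by i + j; cells in one group do not depend on each other."""
    for s in range(2, n_rows + n_cols - 1):
        yield [(i, s - i) for i in range(1, n_rows) if 1 <= s - i < n_cols]


if __name__ == "__main__":
    M = [[1, 1, 1],
         [1, 0, 0],
         [1, 0, 0]]
    print(wavefront(M, lambda x, y, d: x + y + d))
    print(list(anti_diagonals(3, 3)))
```

Each group produced by `anti_diagonals` depends only on earlier groups, so the computation sweeps across the matrix like a wave.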
38Applications of Wavefront
- Bioinformatics:
- Compute the alignment of two large DNA strings in order to find similarities between species. Existing tools do not scale up to complete DNA strings.
- Economics:
- Simulate the interaction between two competing firms, each of which has an effect on resource consumption and market price. E.g. When will we run out of oil?
- Applies to any kind of optimization problem solvable with dynamic programming.
39Problem: Dispatch Latency
- Even with an infinite number of CPUs, dispatch latency controls the total execution time: O(n) in the best case.
- However, job dispatch latency in an unloaded grid is about 30 seconds, which may outweigh the runtime of F.
- Things get worse when queues are long!
- Solution: Build a lightweight task dispatch system. (Idea from Falkon@UC)
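A rough model makes the O(n) bound concrete. It assumes an n-by-n matrix of cells, unlimited CPUs, and that each anti-diagonal of the wavefront is dispatched as one round, so there are 2n - 1 rounds on the critical path. The 1-second runtime of F and the 0.1-second lightweight latency are hypothetical values; only the 30-second latency comes from the slide.

```python
def wavefront_lower_bound_seconds(n, dispatch_latency, f_runtime):
    """Best-case wall-clock time for an n-by-n wavefront with unlimited CPUs."""
    rounds = 2 * n - 1  # number of anti-diagonals, each waiting on the previous one
    return rounds * (dispatch_latency + f_runtime)


if __name__ == "__main__":
    n = 500
    slow = wavefront_lower_bound_seconds(n, dispatch_latency=30, f_runtime=1)
    fast = wavefront_lower_bound_seconds(n, dispatch_latency=0.1, f_runtime=1)
    print(f"grid dispatch (30 s):       {slow / 3600:.1f} hours")
    print(f"lightweight dispatch (0.1 s): {fast / 3600:.1f} hours")
```

Under these assumptions the latency term dominates the runtime of F, which is the motivation for a lightweight dispatcher.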
40Thousands of Workers Dispatched to the Cloud
[Figure: the wavefront engine queues tasks in a work queue, which feeds the workers and reports tasks done. Detail of one task for F with files in.txt and out.txt: put F.exe, put in.txt, exec F.exe <in.txt >out.txt, get out.txt.]
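The master and worker arrangement in the figure can be imitated on one machine with threads standing in for remote workers. This is only a sketch of the general pattern (long-lived workers pulling tasks from a shared queue); it is not the lab's work queue software, and `run_work_queue` is a hypothetical name.

```python
import queue
import threading


def run_work_queue(tasks, F, n_workers=4):
    """Run F on every task using a pool of long-lived worker threads."""
    todo = queue.Queue()
    done = queue.Queue()
    for task_id, arg in enumerate(tasks):
        todo.put((task_id, arg))

    def worker():
        # A worker keeps pulling tasks until the queue is empty, so the
        # start-up cost of the worker is paid once, not once per task.
        while True:
            try:
                task_id, arg = todo.get_nowait()
            except queue.Empty:
                return
            done.put((task_id, F(arg)))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Collect results back into task order.
    results = [None] * len(tasks)
    while not done.empty():
        task_id, value = done.get()
        results[task_id] = value
    return results


if __name__ == "__main__":
    print(run_work_queue([1, 2, 3, 4, 5], lambda x: x * x, n_workers=3))
```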
41Problem: Performance Variation
- Tasks can be delayed for many reasons:
- Heterogeneous hardware.
- Interference with disk/network.
- Policy based suspension.
- Any delayed task in Wavefront has a cascading effect on the rest of the workload.
- Solution: Fast Abort. Keep statistics on task runtimes, and abort those that lie significantly outside the mean. Prefer to assign jobs to machines with a fast history.
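One possible reading of Fast Abort is sketched below. The slide does not say how "significantly outside the mean" is measured, so the threshold used here (a fixed multiple of the mean runtime, applied only after a minimum number of samples) is an assumption, as are the class and function names.

```python
import statistics


class FastAbort:
    """Track completed task runtimes and flag tasks that run far longer than usual."""

    def __init__(self, multiplier=3.0, min_samples=5):
        self.runtimes = []
        self.multiplier = multiplier    # assumed threshold: multiplier * mean
        self.min_samples = min_samples  # do not judge until enough history exists

    def record(self, runtime):
        """Record the runtime of a task that finished normally."""
        self.runtimes.append(runtime)

    def should_abort(self, elapsed):
        """True if a running task has exceeded the threshold and should be retried elsewhere."""
        if len(self.runtimes) < self.min_samples:
            return False
        return elapsed > self.multiplier * statistics.mean(self.runtimes)


def fastest_worker(history):
    """history maps worker name -> list of past runtimes; return the lowest average."""
    return min(history, key=lambda w: statistics.mean(history[w]))


if __name__ == "__main__":
    fa = FastAbort()
    for runtime in [10, 11, 9, 10, 10]:
        fa.record(runtime)
    print(fa.should_abort(25), fa.should_abort(31))
    print(fastest_worker({"node-a": [10, 12], "node-b": [5, 6]}))
```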
42Wavefront (500x500) on 200 CPUs
43Wavefront on a 200-CPU Cluster
44Wavefront on a 32-Core CPU
45The Genome Assembly Problem
AGTCGATCGATCGATAATCGATCCTAGCTAGCTACGA
AGTCGATCGATCGAT
          TCGATAATCGATCCTAGCTA
                         AGCTAGCTACGA
46Sample Genomes
Genome                 Reads  Data   Pairs  Sequential Time
A. gambiae scaffold    101K   80MB   738K   12 hours
A. gambiae complete    180K   1.4GB  12M    6 days
S. bicolor simulated   7.9M   5.7GB  84M    30 days
47Some-Pairs Abstraction
- SomePairs( set A, list (i,j), function F(x,y) )
- returns
- list of F( A[i], A[j] )
[Figure: SomePairs(A,L,F) applies F only to the listed pairs of A1..An, here L = (1,2) (2,1) (2,3) (3,3).]
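A sequential sketch of Some-Pairs follows, using 1-based indices as on the slide. The `overlap` function is a made-up stand-in for a real alignment function, and the reads are the three fragments from the genome assembly slide.

```python
def some_pairs(A, pairs, F):
    """Return [F(A[i], A[j]) for each (i, j) in pairs], with 1-based indices."""
    return [F(A[i - 1], A[j - 1]) for i, j in pairs]


def overlap(x, y):
    """Length of the longest suffix of x that is also a prefix of y."""
    for k in range(min(len(x), len(y)), 0, -1):
        if x.endswith(y[:k]):
            return k
    return 0


if __name__ == "__main__":
    reads = ["AGTCGATCGATCGAT", "TCGATAATCGATCCTAGCTA", "AGCTAGCTACGA"]
    candidate_pairs = [(1, 2), (2, 3), (1, 3)]
    print(some_pairs(reads, candidate_pairs, overlap))
```

Unlike All-Pairs, only the listed pairs are evaluated, which matters when most of the possible pairs are known in advance to be uninteresting.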
48Distributed Genome Assembly
100s of workers dispatched to Notre Dame, Purdue, and Wisconsin
[Figure: the somepairs master queues tasks, such as the pairs (1,2) (2,1) (2,3) (3,3) over A1..An, in a work queue that feeds the workers and collects tasks done. Detail of a single worker, with files F, in.txt, and out.txt: put align.exe, put in.txt, exec F.exe <in.txt >out.txt, get out.txt.]
49Small Genome (101K reads)
50Medium Genome (180K reads)
51Large Genome (7.9M reads)
52What's the Upshot?
- We can do full-scale assemblies as a routine matter on existing conventional machines.
- Our solution is faster (wall-clock time) than the next fastest assembler run on 1024x BG/L.
- You could almost certainly do better with a dedicated cluster and a fast interconnect, but such systems are not universally available.
- Our solution opens up research in assembly to labs with NASCAR instead of Formula-One hardware.
53What if your application doesn't fit a regular pattern?
54Makeflow
part1 part2 part3: input.data split.py
    ./split.py input.data

out1: part1 mysim.exe
    ./mysim.exe part1 >out1

out2: part2 mysim.exe
    ./mysim.exe part2 >out2

out3: part3 mysim.exe
    ./mysim.exe part3 >out3

result: out1 out2 out3 join.py
    ./join.py out1 out2 out3 > result
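The workflow on this slide is a set of make-style rules: targets, the files they need, and a command. How such rules could be turned into rounds of independent commands is sketched below. This is an illustration of the dependency idea only and does not describe how Makeflow itself is implemented; `RULES` and `schedule` are hypothetical names.

```python
RULES = [
    # (targets, sources, command), transcribed from the workflow on the slide
    (["part1", "part2", "part3"], ["input.data", "split.py"], "./split.py input.data"),
    (["out1"], ["part1", "mysim.exe"], "./mysim.exe part1 >out1"),
    (["out2"], ["part2", "mysim.exe"], "./mysim.exe part2 >out2"),
    (["out3"], ["part3", "mysim.exe"], "./mysim.exe part3 >out3"),
    (["result"], ["out1", "out2", "out3", "join.py"], "./join.py out1 out2 out3 > result"),
]


def schedule(rules, existing_files):
    """Group commands into rounds; commands within one round do not depend on each other."""
    available = set(existing_files)
    pending = list(rules)
    rounds = []
    while pending:
        ready = [r for r in pending if all(s in available for s in r[1])]
        if not ready:
            raise ValueError("some rules can never run: missing inputs or a cycle")
        rounds.append([command for _, _, command in ready])
        for targets, _, _ in ready:
            available.update(targets)  # outputs become inputs for later rounds
        pending = [r for r in pending if r not in ready]
    return rounds


if __name__ == "__main__":
    start = ["input.data", "split.py", "mysim.exe", "join.py"]
    for number, commands in enumerate(schedule(RULES, start), start=1):
        print(f"round {number}: {commands}")
```

The three mysim.exe commands land in the same round, so they are the ones that could be sent to different workers at once.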
55Makeflow Implementation
100s of workers dispatched to the cloud
[Figure: the makeflow master queues tasks in a work queue that feeds the workers and collects tasks done. Example rule: "bfile: afile prog" with command "prog afile >bfile". Detail of a single worker, with files prog, afile, and bfile: put prog, put afile, exec prog afile > bfile, get bfile.]
Two optimizations: Cache inputs and output. Dispatch tasks to nodes with data.
56Experience with Makeflow
- Still in initial deployment, so no big results to show just yet.
- Easy to test and debug on a desktop machine or a multicore server.
- The workload says nothing about the distributed system. (This is good.)
- Graduate students in bioinformatics running codes at production speeds on hundreds of nodes in less than a week.
57Abstractions as a Social Tool
- Collaboration with outside groups is how we encounter the most interesting, challenging, and important problems in computer science.
- However, often neither side understands which details are essential or non-essential:
- Can you deal with files that have upper case letters?
- Oh, by the way, we have 10TB of input, is that ok?
- (A little bit of an exaggeration.)
- An abstraction is an excellent chalkboard tool:
- Accessible to anyone with a little bit of mathematics.
- Makes it easy to see what must be plugged in.
- Forces out essential details: data size, execution time.
58Conclusion
- Grids, clouds, and clusters provide enormous computing power, but are very challenging to use effectively.
- An abstraction provides a robust, scalable solution to a narrow category of problems; each requires different kinds of optimizations.
- Limiting expressive power results in systems that are usable, predictable, and reliable.
- Is there a menu of abstractions that would satisfy many consumers of clouds?
59Acknowledgments
- Cooperative Computing Lab
- http://www.cse.nd.edu/ccl
- Faculty
- Patrick Flynn
- Nitesh Chawla
- Kenneth Judd
- Scott Emrich
- Grad Students
- Chris Moretti
- Hoang Bui
- Li Yu
- Mike Olson
- Michael Albrecht
- Undergrads
- Mike Kelly
- Rory Carmichael
- Mark Pasquier
- Christopher Lyon
- Jared Bulosan
- NSF Grants CCF-0621434, CNS-0643229