Title: Cloud Computing Skepticism
1. Cloud Computing Skepticism
- Abhishek Verma, Saurabh Nangia
2. Outline
- Cloud computing hype
- Cynicism
- MapReduce Vs Parallel DBMS
- Cost of a cloud
- Discussion
3. Recent Trends
Amazon S3 (March 2006)
Amazon EC2 (August 2006)
Salesforce AppExchange (March 2006)
Google App Engine (April 2008)
Microsoft Azure (Oct 2008)
Facebook Platform (May 2007)
4. Tremendous Buzz
5. Gartner Hype Cycle
From http://en.wikipedia.org/wiki/Hype_cycle
6. Blind Men and an Elephant
7.
- "Cloud computing is simply a buzzword used to repackage grid computing and utility computing, both of which have existed for decades."
whatis.com, Definition of Cloud Computing
8.
- "The interesting thing about cloud computing is that we've redefined cloud computing to include everything that we already do."
- "The computer industry is the only industry that is more fashion-driven than women's fashion."
- "Maybe I'm an idiot, but I have no idea what anyone is talking about. What is it? It's complete gibberish. It's insane. When is this idiocy going to stop?"
Larry Ellison, during Oracle's Analyst Day
From http://blogs.wsj.com/biztech/2008/09/25/larry-ellisons-brilliant-anti-cloud-computing-rant/
9.
From http://geekandpoke.typepad.com
10. Reliability
- Many enterprises (necessarily or unnecessarily) set their SLA uptimes at 99.99% or higher, which cloud providers have not yet been prepared to match
- Amazon's cloud outages receive a lot of exposure ...
  - July 20, 2008: Failure due to stranded zombies, lasts 5 hours
  - Feb 15, 2008: Authentication overload leads to two-hour service outage
  - October 2007: Service failure lasts two days
  - October 2006: Security breach where users could see other users' data
- ... and their current SLAs don't match those of enterprises
  - Amazon EC2: 99.95%
  - Amazon S3: 99.9%
- Not clear that all applications require such high service levels
- IT shops do not always deliver on their SLAs, but their failures are less public and customers can't switch easily
SLAs expressed in monthly uptime percentages
Source: McKinsey & Company
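To make the gap between these SLA levels concrete, the sketch below converts a monthly uptime percentage into the downtime it permits. The 30-day month and the function name are assumptions made for illustration; they are not taken from the McKinsey report.

```python
def allowed_downtime_minutes(uptime_percent, days_in_month=30):
    """Minutes of downtime per month permitted by a monthly uptime percentage.

    days_in_month=30 is an assumed, illustrative month length.
    """
    minutes_in_month = days_in_month * 24 * 60
    return minutes_in_month * (100.0 - uptime_percent) / 100.0


if __name__ == "__main__":
    slas = [("Enterprise target", 99.99), ("Amazon EC2", 99.95), ("Amazon S3", 99.9)]
    for name, uptime in slas:
        minutes = allowed_downtime_minutes(uptime)
        print(f"{name}: {uptime}% uptime allows {minutes:.1f} min/month of downtime")
```

Under these assumptions a 99.99% target leaves only a few minutes of slack per month, while 99.9% leaves roughly ten times as much.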
11. A Comparison of Approaches to Large-Scale Data Analysis
- Andrew Pavlo, Erik Paulson, Alexander Rasin, Daniel J. Abadi, David J. DeWitt, Samuel Madden, Michael Stonebraker
- To appear in SIGMOD '09
Basic ideas from "MapReduce: A major step backwards", D. DeWitt and M. Stonebraker
12. MapReduce: A major step backwards
- A giant step backward
  - No schemas, Codasyl instead of Relational
- A sub-optimal implementation
  - Uses brute force sequential search, instead of indexing
  - Materializes O(m·r) intermediate files
  - Does not account for data skew
- Not novel at all
  - Represents a specific implementation of well known techniques developed nearly 25 years ago
- Missing most of the common current DBMS features
  - Bulk loader, indexing, updates, transactions, integrity constraints, referential integrity, views
- Incompatible with DBMS tools
  - Report writers, business intelligence tools, data mining tools, replication tools, database design tools
13.
Architectural Element | Parallel Databases | MapReduce
Schema Support | Structured | Unstructured
Indexing | B-Trees or hash based | None
Programming Model | Relational | Codasyl
Data Distribution | Projections before aggregation | Logic moved to data, but no optimizations
Execution Strategy | Push | Pull
Flexibility | No, but Ruby on Rails, LINQ | Yes
Fault Tolerance | Transactions have to be restarted in the event of a failure | Yes: replication, speculative execution
14. MapReduce II
- "MapReduce didn't kill our dog, steal our car, or try and date our daughters."
- MapReduce is not a database system, so don't judge it as one
  - Both analyze and perform computations on huge datasets
- MapReduce has excellent scalability; the proof is Google's use
  - Does it scale linearly?
  - No scientific evidence
- MapReduce is cheap and databases are expensive
- We are the old guard trying to defend our turf/legacy from the young turks
  - Propagation of ideas between sub-disciplines is very slow and sketchy
  - Very little information is passed from generation to generation
http://www.databasecolumn.com/2008/01/mapreduce-continued.html
15. Tested Systems
- Hadoop
  - 0.19 on Java 1.6, 256MB block size, JVM reuse
  - Rack-awareness enabled
- DBMS-X (unnamed)
  - Parallel DBMS from a major relational DB vendor
  - Row based, compression enabled
- Vertica (co-founded by Stonebraker)
  - Column oriented
- Hardware configuration: 100 nodes
  - 2.4 GHz Intel Core 2 Duo
  - 4GB RAM, 2 x 250GB SATA hard disks
  - GigE ports, 128Gbps switching fabric
16. Data Loading
- Hadoop
  - Command line utility
- DBMS-X
  - LOAD SQL command
  - Administrative command to re-organize data
- Grep Dataset
  - Record: 10-byte key + 90-byte random value
  - 5.6 million records = 535MB/node
  - Another set: 1TB/cluster
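The per-node size of the grep dataset follows from the record layout. The sketch below is a quick arithmetic check, assuming that "MB" on the slide means 2^20 bytes; the constant and function names are made up for illustration.

```python
KEY_BYTES = 10                 # key size from the slide
VALUE_BYTES = 90               # random value size from the slide
RECORDS_PER_NODE = 5_600_000   # records per node from the slide


def dataset_size_mib(records, record_bytes=KEY_BYTES + VALUE_BYTES):
    """Dataset size in MiB (2**20 bytes), an assumed reading of 'MB'."""
    return records * record_bytes / 2**20


if __name__ == "__main__":
    size = dataset_size_mib(RECORDS_PER_NODE)
    print(f"{size:.1f} MiB per node")
```

With 100-byte records the result lands within about a megabyte of the 535MB/node quoted on the slide.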
17. Grep Task Results
SELECT * FROM Data WHERE field LIKE '%XYZ%';
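For comparison with the SQL above, here is a minimal sketch of how the same filter might look as a MapReduce-style map function. It is plain Python, not Hadoop code. The record layout (10-character key followed by the value) follows the dataset description, and the function names are hypothetical. A pure filter like this needs no real reduce step, so the sketch simply collects the map output.

```python
def grep_map(record, pattern="XYZ"):
    """Map step: emit (key, value) only if the value contains the pattern."""
    key, value = record[:10], record[10:]
    if pattern in value:
        yield key, value


def run_grep(records, pattern="XYZ"):
    """Apply the map step to every record and collect the emitted pairs."""
    return [pair for record in records for pair in grep_map(record, pattern)]


if __name__ == "__main__":
    hit = "key0000001" + "a" * 40 + "XYZ" + "b" * 47
    miss = "key0000002" + "c" * 90
    print(run_grep([hit, miss]))
```

Every record is scanned sequentially, which is the "brute force instead of indexing" behavior the DBMS camp criticizes.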
18. Select Task Results
SELECT pageURL, pageRank FROM Rankings WHERE pageRank > X;
19. Join Task
SELECT INTO Temp sourceIP,
       AVG(pageRank) AS avgPageRank,
       SUM(adRevenue) AS totalRevenue
  FROM Rankings AS R, UserVisits AS UV
 WHERE R.pageURL = UV.destURL
   AND UV.visitDate BETWEEN Date('2000-01-15') AND Date('2000-01-22')
 GROUP BY UV.sourceIP;
SELECT sourceIP, totalRevenue, avgPageRank
  FROM Temp
 ORDER BY totalRevenue DESC LIMIT 1;
20. Concluding Remarks
- DBMS-X 3.2 times, Vertica 2.3 times faster than Hadoop
- Parallel DBMSs win because of
  - B-tree indices to speed the execution of selection operations
  - novel storage mechanisms (e.g., column-orientation)
  - aggressive compression techniques with ability to operate directly on compressed data
  - sophisticated parallel algorithms for querying large amounts of relational data
- Ease of installation and use
- Fault tolerance?
- Loading data?
21. The Cost of a Cloud: Research Problems in Data Center Networks
- Albert Greenberg, James Hamilton, David A. Maltz, Parveen Patel
- MSR Redmond
Presented by Saurabh Nangia
22. Overview
- Cost of cloud service
- Improving low utilization
- Network agility
- Incentive for resource consumption
- Geo-distributed network of DCs
23. Cost of a Cloud?
- Where does the cost go in today's cloud service data centers?
24. Cost of a Cloud
Amortized costs (one-time purchases amortized over reasonable lifetimes, assuming 5% cost of money)
~45%  Servers
~25%  Infrastructure
~15%  Power
~15%  Network
25. Are Clouds Any Different?
- Can existing solutions for the enterprise data
center work for cloud service data centers?
26. Enterprise DC vs Cloud DC (1)
- In enterprise
  - Leading cost: operational staff
  - Automation is partial
  - IT staff : servers = 1:100
- In cloud
  - Staff costs under 5%
  - Automation is mandatory
  - IT staff : servers = 1:1000
27. Enterprise DC vs Cloud DC (2)
- Large economies of scale
  - Cloud DCs leverage economies of scale
  - But up-front costs are high
- Scale out
  - Enterprise DCs scale up
  - Cloud DCs scale out
28. Types of Cloud Service DC (1)
- Mega data centers
  - Tens of thousands (or more) servers
  - Drawing tens of megawatts of power (at peak)
- Massive data analysis applications
  - Huge RAM, massive CPU cycles, disk I/O operations
- Advantages
  - Cloud service applications build on one another
  - Eases system design
  - Lowers cost of communication needs
29. Types of Cloud Service DC (2)
- Micro data centers
  - Thousands of servers
  - Drawing power peaking in the 100s of kilowatts
- Highly interactive applications
  - Query/response, office productivity
- Advantages
  - Used as nodes in a content distribution network
  - Minimize speed-of-light latency
  - Minimize network transit costs to user
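The speed-of-light argument for placing micro data centers near users can be quantified with a rough lower bound on round-trip time. The sketch below assumes that light in optical fiber travels at about two thirds of its vacuum speed and ignores routing, queuing, and processing delays. The distances are illustrative, not taken from the paper.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum, km/s
FIBER_FACTOR = 2 / 3               # assumed slowdown of light in optical fiber


def min_rtt_ms(distance_km):
    """Lower bound on round-trip time in milliseconds over a fiber path."""
    one_way_seconds = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_seconds * 1000


if __name__ == "__main__":
    for km in (100, 1000, 4000):  # illustrative user-to-DC distances
        print(f"{km} km: at least {min_rtt_ms(km):.1f} ms round trip")
```

Each 100 km of fiber adds about a millisecond of unavoidable round-trip delay, so a nearby micro data center removes tens of milliseconds compared with a distant mega data center.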
30. Cost Breakdown
31. Server Cost (1)
- Example
  - 50,000 servers
  - $3000 per server
  - 5% cost of money
  - 3 year amortization
- Amortized cost = 50,000 x $3000 x 1.05 / 3
  - = $52.5 million per year!!
- Utilization remarkably low, ~10%
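The arithmetic on this slide can be reproduced in a few lines. The sketch below uses the simple formula shown on the slide (purchase price times one plus the cost of money, divided by the lifetime). The function names are illustrative, and the per-utilized-server figure is an extra derived here, not a number from the paper.

```python
def amortized_annual_cost(purchase_price, cost_of_money, years):
    """Simple amortization as on the slide: price * (1 + rate) / lifetime."""
    return purchase_price * (1 + cost_of_money) / years


def cost_per_utilized_server(annual_cost, servers, utilization):
    """Annual cost spread over only the fraction of servers doing useful work."""
    return annual_cost / (servers * utilization)


if __name__ == "__main__":
    servers, price = 50_000, 3_000
    annual = amortized_annual_cost(servers * price, cost_of_money=0.05, years=3)
    print(f"Amortized server cost: ${annual / 1e6:.1f} million per year")
    effective = cost_per_utilized_server(annual, servers, utilization=0.10)
    print(f"Per fully utilized server-year at 10% utilization: ${effective:,.0f}")
```

At 10% utilization each server-year of useful work effectively costs ten times the nominal amortized price, which is why utilization is the first target for savings.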
32. Server Cost (2)
- Uneven Application Fit
- Uncertainty in demand forecasts
- Long provisioning time scales
- Risk Management
- Hoarding
- Virtualization short-falls
33. Reducing Server Cost
- Solution: Agility
  - to dynamically grow and shrink resources to meet demand, and
  - to draw those resources from the most optimal location.
- Barrier: Network
  - Increases fragmentation of resources
  - Therefore, low server utilization
34. Infrastructure Cost
- Infrastructure is the overhead of a cloud DC
- Facilities dedicated to
  - Consistent power delivery
  - Evacuating heat
- Large scale generators, transformers, UPS
- Amortized cost: $18.4 million per year!!
  - Infrastructure cost $200M
  - 5% cost of money
  - 15 year amortization
35. Reducing Infrastructure Cost
- Reason for high cost: requirement for delivering consistent power
- Relaxing the requirement implies scaling out
  - Deploy larger numbers of smaller data centers
  - Resilience at data center level
  - Layers of redundancy within data center can be stripped out (no UPS & generators)
- Geo-diverse deployment of micro data centers
36. Power
- Power Usage Efficiency (PUE)
  - = (Total Facility Power) / (IT Equipment Power)
  - Typically PUE ~ 1.7
  - Inefficient facilities: PUE of 2.0 to 3.0
  - Leading facilities: PUE of 1.2
- Amortized cost: $9.3 million per year!!
  - PUE = 1.7
  - $0.07 per kWh
  - 50,000 servers, each drawing 180W on average
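The power figure can be checked from the PUE definition: total facility power is IT power multiplied by PUE, billed for every hour of the year. The sketch below assumes a 365-day year and round-the-clock draw at the stated average; the function name is illustrative.

```python
HOURS_PER_YEAR = 24 * 365  # assumed: non-leap year, servers always on


def annual_power_cost(servers, watts_per_server, pue, dollars_per_kwh):
    """Yearly electricity bill for a facility with the given PUE."""
    it_kw = servers * watts_per_server / 1000   # IT equipment power
    facility_kw = it_kw * pue                   # PUE = facility power / IT power
    return facility_kw * HOURS_PER_YEAR * dollars_per_kwh


if __name__ == "__main__":
    typical = annual_power_cost(50_000, 180, pue=1.7, dollars_per_kwh=0.07)
    leading = annual_power_cost(50_000, 180, pue=1.2, dollars_per_kwh=0.07)
    print(f"PUE 1.7: ${typical / 1e6:.2f} million per year")
    print(f"PUE 1.2: ${leading / 1e6:.2f} million per year")
```

With the slide's inputs this gives about $9.38 million per year, in line with the quoted $9.3 million. The second line shows how much a leading-edge PUE of 1.2 would save under the same assumptions.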
37. Reducing Power Costs
- Decreasing power cost -> decreases need for infrastructure
- Goal: energy proportionality
  - a server running at N% load consumes N% power
- Hardware innovation
  - High efficiency power supplies
  - Voltage regulation modules
- Reduce amount of cooling for data center
  - Equipment failure rates increase with temperature
  - Make network more mesh-like & resilient
38. Network
- Capital cost of networking gear
  - Switches, routers and load balancers
- Wide area networking
  - Peering: traffic handed off to ISPs for end users
  - Inter-data center links between geo-distributed DCs
  - Regional facilities (backhaul, metro-area connectivity, co-location space) to reach interconnection sites
- Back-of-the-envelope calculations difficult
39. Reducing Network Costs
- Sensitive to site selection & industry dynamics
- Solution
  - Clever design of peering & transit strategies
  - Optimal placement of micro & mega DCs
  - Better design of services (partitioning state)
  - Better data partitioning & replication
40. Perspective
- On is better than off
  - Servers should be engaged in revenue production
  - Challenge: Agility
- Build in resilience at systems level
  - Stripping out layers of redundancy inside each DC, and instead using other DCs to mask DC failure
  - Challenge: Systems software & network research
41. Cost of Large Scale DC
http://perspectives.mvdirona.com/2008/11/28/CostOfPowerInLargeScaleDataCenters.aspx
42. Solutions!
43. Improving DC Efficiency
- Increasing network agility
- Appropriate incentives to shape resource consumption
- Joint optimization of network & DC resources
- New mechanisms for geo-distributing state
44. Agility
- Any server can be dynamically assigned to any service anywhere in the DC
- Conventional DCs
  - Fragment network & server capacity
  - Limit dynamic growth and shrinking of server pools
45. Networking in Current DC
- DC network: two types of traffic
  - Between external end systems & internal servers
  - Between internal servers
- Load balancer
  - Virtual IP address (VIP)
  - Direct IP address (DIP)
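As a rough illustration of the VIP/DIP relationship, the sketch below models a load balancer that maps each externally visible VIP to a pool of server DIPs and hands out DIPs in turn. The round-robin policy, the class name, and all addresses are assumptions for illustration; the slides do not say how real load balancers choose a DIP.

```python
import itertools


class LoadBalancer:
    """Toy load balancer: one pool of DIPs behind each VIP."""

    def __init__(self, pools):
        # pools maps a VIP string to the list of DIPs serving it.
        self._next_dip = {vip: itertools.cycle(dips) for vip, dips in pools.items()}

    def route(self, vip):
        """Return the DIP that should receive the next request sent to this VIP."""
        return next(self._next_dip[vip])  # assumed policy: round-robin


if __name__ == "__main__":
    # Hypothetical addresses, chosen only for the example.
    lb = LoadBalancer({"203.0.113.10": ["10.0.1.11", "10.0.1.12", "10.0.1.13"]})
    print([lb.route("203.0.113.10") for _ in range(4)])
```

External clients only ever see the VIP. Which servers sit in the pool behind it is what determines how freely capacity can be moved between services.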
46. Conventional Network Architecture
47. Problems (1)
- Static network assignment
  - Individual applications mapped to specific physical switches & routers
  - Adv: performance & security isolation
  - Disadv: works against agility
    - Policy-overloaded (traffic, security, performance)
    - VLAN spanning concentrates traffic on links high in the tree
48. Problems (2)
- Load balancing techniques
  - Destination NAT
    - All DIPs in a VIP's pool must be in the same layer 2 domain
    - Under-utilization & fragmentation
  - Source NAT
    - Servers spread across layer 2 domains
    - But server never sees the client IP
    - Client IP required for data mining & response customization
49. Problems (3)
- Poor server to server connectivity
  - Connections between servers in different layer 2 domains must go through layer 3
  - Links oversubscribed
    - Capacity of links between access routers & border routers < output capacity of servers connected to the access router
  - Ensure no saturation in any of the network links!
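Oversubscription can be expressed as the ratio between what the servers below a router could send and what the uplinks above it can carry. The sketch below computes that ratio; the server count and link speeds are hypothetical, not figures from the paper.

```python
def oversubscription_ratio(num_servers, server_gbps, uplink_gbps):
    """Aggregate server output capacity divided by uplink capacity.

    A ratio above 1.0 means the uplink saturates if all servers send at once.
    """
    return (num_servers * server_gbps) / uplink_gbps


if __name__ == "__main__":
    # Hypothetical example: 40 servers with 1 Gbps NICs behind a 10 Gbps uplink.
    ratio = oversubscription_ratio(num_servers=40, server_gbps=1, uplink_gbps=10)
    print(f"Oversubscription {ratio:.0f}:1")
```

The higher the ratio, the more carefully services must be placed to keep their traffic below the congested links, which works against assigning any server to any service.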
50. Problems (4)
- Proprietary hardware scales up, not out
- Load balancers used in pairs
- Replaced when load becomes too much
51. DC Networking Design Objectives
- Location-independent addressing
  - Decouple a server's location in the DC from its address
- Uniform bandwidth & latency
  - Servers can be distributed arbitrarily in the DC without fear of running into bandwidth choke points
- Security & performance isolation
  - One service should not affect others' performance
  - DoS attack
52. Incenting Desirable Behavior (1)
- Yield management
  - to sell the right resources to the right customer at the right time for the right price
- Trough filling
  - Cost determined by height of peaks, not area
  - Bin packing opportunities
    - Leasing committed capacity with fixed minimum cost
    - Prices varying with resource availability
    - Differentiate demands by urgency of execution
53. Incenting Desirable Behavior (2)
- Server allocation
  - Large unfragmented server pools & agility
  - Fewer requests for servers
- Eliminating hoarding of servers
  - Cost for having a server
  - Seasonal peaks
  - Internal auctions may be fairest
  - But, how to design!
54. Geo-Distribution
- Speed & latency matter
  - Google: 20% revenue loss for 500ms delay!!
  - Amazon: 1% sales decrease for 100ms delay!!
- Challenges
  - Where to place data centers
  - How big to make them
  - Using it as a source of redundancy to improve availability
55. Optimal Placement & Sizing (1)
- Importance of geographical diversity
  - Decreasing latency between user and DC
  - Redundancy (earthquakes, riots, outages, etc.)
- Size of data center
  - Mega DC
    - Extracting maximum benefit from economies of scale
    - Local factors like tax, power concessions, etc.
  - Micro DC
    - Enough servers to provide statistical multiplexing gains
    - Given a fixed budget, place close to each desired population
56. Optimal Placement & Sizing (2)
- Network cost
  - Performance vs cost
  - Latency vs Internet peering & dedicated lines between data centers
- Optimization should also consider
  - Dependencies of services offered
    - Email -> buddy list maintenance, authentication, etc.
  - Front end: micro data centers (low latency)
  - Back end: mega data centers (greater resources)
57. Geo-Distributing State (1)
- Turning geo-diversity into geo-redundancy
  - Distribute critical state across sites
- Facebook
  - Single master data center replicating data
- Yahoo! Mail
  - Partitions data across DCs based on user
- Different solutions for different data
  - Buddy status: replicated, weak consistency assurance
  - Email: mailbox by user IDs, strong consistency
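Partitioning state by user can be sketched as a deterministic mapping from a user ID to a home data center. The hash-based scheme, the function name, and the data center names below are assumptions for illustration only. The slides do not describe how Yahoo! Mail actually assigns users to data centers.

```python
import hashlib


def home_data_center(user_id, data_centers):
    """Map a user ID to one data center, the same one on every call."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(data_centers)
    return data_centers[index]


if __name__ == "__main__":
    dcs = ["dc-west", "dc-east", "dc-europe"]  # hypothetical site names
    for user in ("alice@example.com", "bob@example.com"):
        print(user, "->", home_data_center(user, dcs))
```

Because each mailbox lives in exactly one place, strong consistency is easy to provide, but a user far from their home site pays the latency, and the site's failure makes that mailbox unavailable unless it is also replicated. A plain modulo mapping like this one also reshuffles most users when the set of data centers changes.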
58. Geo-Distributing State (2)
- Tradeoffs
  - Load distribution vs service performance
    - e.g., Facebook's single master coordinates replication
    - Speeds up lookup but loads the master
  - Communication cost vs service performance
    - Data replication: more inter-data center communication
    - Longer latency
    - Higher cost of messages over inter-DC links
59. Summary
- Data center costs
- Server, Infrastructure, Power, Networking
- Improving efficiency
- Network Agility
- Resource Consumption Shaping
- Geo-diversifying DC
60. Opinions
61.
- Richard Stallman, GNU founder
  - Cloud computing is a trap
  - "... cloud computing was simply a trap aimed at forcing more people to buy into locked, proprietary systems that would cost them more and more over time."
  - "It's stupidity. It's worse than stupidity: it's a marketing hype campaign."
62.
- Open Cloud Manifesto
  - a document put together by IBM, Cisco, AT&T, Sun Microsystems and over 50 others to promote interoperability
  - "Cloud providers must not use their market position to lock customers into their particular platforms and limit their choice of providers."
  - Failed? Google, Amazon, Salesforce and Microsoft, four very big players in the area, are notably absent from the list of supporters
63.
- Larry Ellison, Oracle founder
  - "fashion-driven" and "complete gibberish"
  - "What is it? What is it? ... Is it - 'Oh, I am going to access data on a server on the Internet.' That is cloud computing?"
  - "Then there is a definition: What is cloud computing? It is using a computer that is out there. That is one of the definitions: 'That is out there.' These people who are writing this crap are out there. They are insane. I mean it is the stupidest."
64.
- Sam Johnston, strategic consultant specializing in cloud computing
  - Oracle would be out badmouthing cloud computing as it has the potential to disrupt their entire business.
  - "Who needs a database server when you can buy cloud storage like electricity and let someone else worry about the details? Not me, that's for sure - unless I happen to be one of a dozen or so big providers who are probably using open source tech anyway."
65.
- Marc Benioff, head of salesforce.com
  - "Cloud computing isn't just candyfloss thinking: it's the future. If it isn't, I don't know what is. We're in it. You're going to see this model dominate our industry."
  - Is data really safe in the cloud? "All complex systems have planned and unplanned downtime. The reality is we are able to provide higher levels of reliability and availability than most companies could provide on their own," says Benioff
66.
- John Chambers, Cisco Systems CEO
  - "a security nightmare."
  - Said cloud computing was inevitable, but that it would shake up the way that networks are secured
67.
- James Hamilton, VP Amazon Web Services
  - "any company not fully understanding cloud computing economics and not having cloud computing as a tool to deploy where it makes sense is giving up a very valuable competitive edge"
  - "No matter how large the IT group, if I led the team, I would be experimenting with cloud computing and deploying where it makes sense"
68. To Cloud or Not to Cloud?
69. References
- Clearing the air on cloud computing, McKinsey & Company
- http://geekandpoke.typepad.com/
- Clearing the Air - Adobe Air, Google Gears and Microsoft Mesh, Farhad Javidi
- http://en.wikipedia.org/wiki/Hype_cycle
- A Comparison of Approaches to Large-Scale Data Analysis, Pavlo et al.
- MapReduce - a major step backwards, D. DeWitt and M. Stonebraker