Introduction to cloud computing - PowerPoint PPT Presentation

About This Presentation

Title:

Introduction to cloud computing

Description:

Jiaheng Lu Department of Computer Science Renmin University of China www.jiahenglu.net Advanced MapReduce Application Reference: Jimmy Lin http://www.umiacs.umd.edu ... – PowerPoint PPT presentation

Transcript and Presenter's Notes

Title: Introduction to cloud computing

1
Introduction to cloud computing

Jiaheng Lu
Department of Computer Science
Renmin University of China
www.jiahenglu.net

Advanced MapReduce Application
Reference: Jimmy Lin
http://www.umiacs.umd.edu/jimmylin/cloud-2008-Fall/schedule.html

3
Managing Dependencies

Remember: Mappers run in isolation
You have no idea in what order the mappers run
You have no idea on what node the mappers run
You have no idea when each mapper finishes
Tools for synchronization
Ability to hold state in reducer across multiple
key-value pairs
Sorting function for keys
Partitioner
Cleverly-constructed data structures

4
Motivating Example

Term co-occurrence matrix for a text collection
M = N x N matrix (N = vocabulary size)
M_ij = number of times i and j co-occur in some context (for concreteness, let's say context = sentence)
Why?
Distributional profiles as a way of measuring
semantic distance
Semantic distance useful for many language
processing tasks

e.g., Mohammad and Hirst (EMNLP, 2006)
5
MapReduce: Large Counting Problems

Term co-occurrence matrix for a text collection = specific instance of a large counting problem
A large event space (number of terms)
A large number of events (the collection itself)
Goal: keep track of interesting statistics about the events
Basic approach
Mappers generate partial counts
Reducers aggregate partial counts

6
First Try: "Pairs"

Each mapper takes a sentence
Generate all co-occurring term pairs
For all pairs, emit (a, b) → count
Reducers sum up counts associated with these pairs
Use combiners!
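
The pairs approach can be sketched outside Hadoop as plain Python. This is only an illustration: the function names, the whitespace tokenization, and the choice to emit both (a, b) and (b, a) are assumptions, not part of the slides. Because the reducer is a plain sum, the same function could also serve as a combiner.

```python
from collections import Counter
from itertools import permutations


def map_pairs(sentence):
    """Mapper: emit ((a, b), 1) for every ordered pair of distinct word positions."""
    terms = sentence.split()  # whitespace tokenization is an assumption
    for a, b in permutations(terms, 2):
        yield (a, b), 1


def reduce_pairs(emitted):
    """Reducer (usable as a combiner too): sum the counts for each pair key."""
    totals = Counter()
    for pair, count in emitted:
        totals[pair] += count
    return totals


sentences = ["a b c", "a b"]
emitted = [kv for s in sentences for kv in map_pairs(s)]
counts = reduce_pairs(emitted)
print(counts[("a", "b")])
```

Every co-occurrence becomes its own key-value pair, which is what makes the sort and shuffle expensive.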

7
Pairs Analysis

Advantages
Easy to implement, easy to understand
Disadvantages
Lots of pairs to sort and shuffle around (upper
bound?)

8
Another Try: "Stripes"

Idea: group together pairs into an associative array
Each mapper takes a sentence
Generate all co-occurring term pairs

The pairs (a, b) → 1, (a, c) → 2, (a, d) → 5, (a, e) → 3, (a, f) → 2
become one stripe: a → { b: 1, c: 2, d: 5, e: 3, f: 2 }
9
Another Try: "Stripes"

Reducers perform element-wise sum of associative arrays

  a → { b: 1, d: 5, e: 3 }
+ a → { b: 1, c: 2, d: 2, f: 2 }
= a → { b: 2, c: 2, d: 7, e: 3, f: 2 }
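
A minimal Python sketch of the stripes idea, assuming a stripe is represented as a collections.Counter (the slides only say "associative array"). The function names are hypothetical; the reducer input reuses the numbers from the slide.

```python
from collections import Counter, defaultdict


def map_stripes(sentence):
    """Mapper: emit one stripe (term -> neighbor counts) per distinct term."""
    terms = sentence.split()  # whitespace tokenization is an assumption
    stripes = defaultdict(Counter)
    for i, a in enumerate(terms):
        for j, b in enumerate(terms):
            if i != j:
                stripes[a][b] += 1
    return list(stripes.items())


def reduce_stripes(stripes_for_term):
    """Reducer (or combiner): element-wise sum of all stripes for one term."""
    total = Counter()
    for stripe in stripes_for_term:
        total.update(stripe)  # Counter.update adds counts key by key
    return total


merged = reduce_stripes([
    Counter({"b": 1, "d": 5, "e": 3}),
    Counter({"b": 1, "c": 2, "d": 2, "f": 2}),
])
print(dict(merged))
```

Each term now yields one key-value pair per sentence instead of one per co-occurrence, at the price of a heavier value object.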

10
Stripes Analysis

Advantages
Far less sorting and shuffling of key-value pairs
Can make better use of combiners
Disadvantages
More difficult to implement
Underlying object is more heavyweight
Fundamental limitation in terms of size of event
space

11
Cluster size: 38 cores
Data source: Associated Press Worldstream (APW) of the English Gigaword Corpus (v3), which contains 2.27 million documents (1.8 GB compressed, 5.7 GB uncompressed)
12
Conditional Probabilities

How do we compute conditional probabilities from
counts?
Why do we want to do this?
How do we do this with MapReduce?

13
P(B|A): "Pairs"
(a, *) → 32
Reducer holds this value in memory

(a, b1) → 3, (a, b2) → 12, (a, b3) → 7, (a, b4) → 1
(a, b1) → 3 / 32, (a, b2) → 12 / 32, (a, b3) → 7 / 32, (a, b4) → 1 / 32

For this to work:
Must emit extra (a, *) for every b_n in mapper
Must make sure all a's get sent to same reducer (use Partitioner)
Must make sure (a, *) comes first (define sort order)
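
The three requirements can be simulated in plain Python. This is a sketch under stated assumptions, not Hadoop code: partition, sort_key, and reduce_relative_frequencies are hypothetical stand-ins for a Partitioner, a key comparator, and a reducer, and crc32 is just a convenient deterministic hash. The counts are the ones on the slide, which lists only four of the b's, so they do not add up to the marginal of 32.

```python
import zlib

MARGINAL = "*"  # stands for the special (a, *) key


def partition(key, num_reducers):
    """Partition on the left term only, so (a, *) and every (a, b) share a reducer."""
    a, _ = key
    return zlib.crc32(a.encode("utf-8")) % num_reducers


def sort_key(key):
    """Sort order: within each a, the (a, *) key comes before any (a, b)."""
    a, b = key
    return (a, b != MARGINAL, b)


def reduce_relative_frequencies(sorted_counts):
    """Reducer: remember the marginal, then divide each joint count by it."""
    marginal = None
    result = {}
    for (a, b), count in sorted_counts:
        if b == MARGINAL:
            marginal = count  # held in memory across the following keys
        else:
            result[(a, b)] = count / marginal
    return result


counts = {("a", "b1"): 3, ("a", "b2"): 12, ("a", MARGINAL): 32,
          ("a", "b3"): 7, ("a", "b4"): 1}
ordered = sorted(counts.items(), key=lambda kv: sort_key(kv[0]))
freqs = reduce_relative_frequencies(ordered)
```

The reducer never buffers the joint counts: because the marginal arrives first, each P(b|a) can be emitted as soon as its key is seen.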
14
P(B|A): "Stripes"
a → { b1: 3, b2: 12, b3: 7, b4: 1, … }

Easy!
One pass to compute (a, *)
Another pass to directly compute P(B|A)
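
With stripes, both passes happen inside one reducer call, since the whole stripe for a is in hand. A small sketch, with a hypothetical function name; only the four counts shown on the slide are used, so the marginal here is 23 rather than the slide's 32.

```python
def stripe_to_conditional(stripe):
    """Turn a stripe of counts for term a into P(b | a) for every b."""
    total = sum(stripe.values())  # first pass: the marginal count (a, *)
    return {b: n / total for b, n in stripe.items()}  # second pass


probs = stripe_to_conditional({"b1": 3, "b2": 12, "b3": 7, "b4": 1})
```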

15
Synchronization in Hadoop

Approach 1: turn synchronization into an ordering problem
Sort keys into correct order of computation
Partition key space so that each reducer gets the
appropriate set of partial results
Hold state in reducer across multiple key-value
pairs to perform computation
Approach 2: construct data structures that bring the pieces together
Each reducer receives all the data it needs to
complete the computation

16
Issues and Tradeoffs

Number of key-value pairs
Object creation overhead
Time for sorting and shuffling pairs across the
network
Size of each key-value pair
De/serialization overhead
Combiners make a big difference!
RAM vs. disk and network
Arrange data to maximize opportunities to
aggregate partial results

17
Data Types in Hadoop
Writable
Defines a de/serialization protocol. Every data type in Hadoop is a Writable.
WritableComparable
Defines a sort order. All keys must be of this type (but not values).
IntWritable, LongWritable, Text
Concrete classes for different data types.
18
Complex Data Types in Hadoop

How do you implement complex data types?
The easiest way
Encode it as Text, e.g., (a, b) = "a:b"
Use regular expressions to parse and extract data
The hard way
Define a custom implementation of WritableComparable
Must implement readFields, write, compareTo
Computationally efficient, but slow for rapid
prototyping
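
The trade-off between the two ways can be illustrated in Python. Nothing here is Hadoop API: the ":" delimiter, the function names, and the length-prefixed layout are assumptions chosen for the sketch. The text encoding is quick to write but breaks if a term contains the delimiter; the binary encoding mirrors what a custom write/readFields pair would do.

```python
import re
import struct

PAIR_RE = re.compile(r"^([^:]*):([^:]*)$")  # assumes ":" never occurs in a term


def encode_text(a, b):
    """Easy way: pack the pair into one delimited string."""
    return f"{a}:{b}"


def decode_text(s):
    """Easy way: recover the pair with a regular expression."""
    m = PAIR_RE.match(s)
    if m is None:
        raise ValueError(f"not a pair: {s!r}")
    return m.group(1), m.group(2)


def write_pair(a, b):
    """Hard way: length-prefixed binary encoding of both terms."""
    ea, eb = a.encode("utf-8"), b.encode("utf-8")
    return struct.pack(">I", len(ea)) + ea + struct.pack(">I", len(eb)) + eb


def read_pair(buf):
    """Hard way: decode the bytes produced by write_pair."""
    (la,) = struct.unpack_from(">I", buf, 0)
    a = buf[4:4 + la].decode("utf-8")
    (lb,) = struct.unpack_from(">I", buf, 4 + la)
    b = buf[8 + la:8 + la + lb].decode("utf-8")
    return a, b
```

Python tuples already compare element by element, which plays the role that compareTo plays for a custom key type.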

19
Yahoo! PNUTS and Hadoop

20
Search Results of the Future
yelp.com
Gawker
babycenter
New York Times
epicurious
LinkedIn
answers.com
webmd
21
What's in the Horizontal Cloud?
Simple Web Service APIs
Horizontal Cloud Services
Edge Content Services, e.g., YCS, YCPI
Provisioning & Virtualization, e.g., EC2
Batch Storage & Processing, e.g., Hadoop & Pig
Operational Storage, e.g., S3, MObStor, Sherpa
Other Services: Messaging, Workflow, virtual DBs & Webserving
ID & Account Management
Shared Infrastructure
Metering, Billing, Accounting
Monitoring & QoS
Common Approaches to QA, Production Engineering, Performance Engineering, Datacenter Management, and Optimization
22
Yahoo! Cloud Stack
Horizontal Cloud Services at each layer:
EDGE: YCS, YCPI, Brooklyn
WEB: VM/OS, yApache, PHP, App Engine
APP: VM/OS, Serving Grid
STORAGE: Sherpa, MObStor
BATCH: Hadoop
Across layers: Provisioning (Self-serve), Monitoring/Metering/Security, Data Highway
23
Yahoo! CCDI Thrust Areas

Fast Provisioning and Machine Virtualization: On demand, deliver a set of hosts imaged with desired software and configured against standard services
Multiple hosts may be multiplexed onto the same physical machine.
Batch Storage and Processing: Scalable data storage optimized for batch processing, together with computational capabilities
Operational Storage: Persistent storage that supports low-latency updates and flexible retrieval
Edge Content Services: Support for dealing with network topology, communication protocols, caching, and BCP

Rest of today's talk
24
Web Data Management

Structured record storage (PNUTS/Sherpa)
CRUD
Point lookups and short scans
Index organized table and random I/Os
per latency

Large data analysis (Hadoop)
Scan oriented workloads
Focus on sequential disk I/O
per cpu cycle

Blob storage (SAN/NAS)
Object retrieval and streaming
Scalable file storage
per GB
25
The World Has Changed

Web serving applications need
Scalability!
Preferably elastic
Flexible schemas
Geographic distribution
High availability
Reliable storage
Web serving applications can do without
Complicated queries
Strong transactions

26
PNUTS / SHERPA To Help You Scale Your Mountains
of Data
27
Yahoo! Serving Storage Problem

Small records - 100KB or less
Structured records - lots of fields, evolving
Extreme data scale - Tens of TB
Extreme request scale - Tens of thousands of
requests/sec
Low latency globally - 20 datacenters worldwide
High Availability - outages cost millions
Variable usage patterns - as applications and
users change

28
The PNUTS/Sherpa Solution

The next generation global-scale record store
Record-orientation: Routing, data storage optimized for low-latency record access
Scale out: Add machines to scale throughput (while keeping latency low)
Asynchrony: Pub-sub replication to far-flung datacenters to mask propagation delay
Consistency model: Reduce complexity of asynchrony for the application programmer
Cloud deployment model: Hosted, managed service to reduce app time-to-market and enable on demand scale and elasticity

29
What is PNUTS/Sherpa?
CREATE TABLE Parts ( ID VARCHAR, StockNumber
INT, Status VARCHAR )
Structured, flexible schema
Geographic replication
Parallel database
Hosted, managed infrastructure
30
What Will It Become?
Indexes and views
CREATE TABLE Parts ( ID VARCHAR, StockNumber
INT, Status VARCHAR )
Geographic replication
Parallel database
Structured, flexible schema
Hosted, managed infrastructure
31
What Will It Become?
Indexes and views
32
Design Goals

Scalability
Thousands of machines
Easy to add capacity
Restrict query language to avoid costly queries
Geographic replication
Asynchronous replication around the globe
Low-latency local access
High availability and fault tolerance
Automatically recover from failures
Serve reads and writes despite failures

Consistency
Per-record guarantees
Timeline model
Option to relax if needed
Multiple access paths
Hash table, ordered table
Primary, secondary access
Hosted service
Applications plug and play
Share operational cost

33
Technology Elements
Applications
Tabular API
PNUTS API

PNUTS
Query planning and execution
Index maintenance

Distributed infrastructure for tabular data
Data partitioning
Update consistency
Replication

YCA: Authorization
YDOT FS: Ordered tables
YDHT FS: Hash tables
Tribble: Pub/sub messaging
Zookeeper: Consistency service
34
Data Manipulation

Per-record operations
Get
Set
Delete
Multi-record operations
Multiget
Scan
Getrange
Web service (RESTful) API

35
Tablets: Hash Table
Name        Description                            Price
0x0000
Grape       Grapes are good to eat                 12
Lime        Limes are green                        9
Apple       Apple is wisdom                        1
Strawberry  Strawberry shortcake                   900
0x2AF3
Orange      Arrgh! Don't get scurvy!               2
Avocado     But at what price?                     3
Lemon       How much did you pay for this lemon?   1
Tomato      Is this a vegetable?                   14
0x911F
Banana      The perfect fruit                      2
Kiwi        New Zealand                            8
0xFFFF
36
Tablets: Ordered Table
Name        Description                            Price
A
Apple       Apple is wisdom                        1
Avocado     But at what price?                     3
Banana      The perfect fruit                      2
Grape       Grapes are good to eat                 12
H
Kiwi        New Zealand                            8
Lemon       How much did you pay for this lemon?   1
Lime        Limes are green                        9
Orange      Arrgh! Don't get scurvy!               2
Q
Strawberry  Strawberry shortcake                   900
Tomato      Is this a vegetable?                   14
Z
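
Finding the tablet for a key is an interval lookup over the tablet boundaries, for both table types. The sketch below uses the boundaries from the two tablet slides; the crc32-based 16-bit hash and the function names are assumptions, since the slides do not say which hash function is used.

```python
import bisect
import zlib

HASH_BOUNDS = [0x0000, 0x2AF3, 0x911F, 0xFFFF]  # boundaries from the hash table slide
ORDERED_BOUNDS = ["A", "H", "Q", "Z"]           # boundaries from the ordered table slide


def tablet_index(bounds, point):
    """Return i such that bounds[i] <= point < bounds[i + 1], clamped to a valid tablet."""
    i = bisect.bisect_right(bounds, point) - 1
    return max(0, min(i, len(bounds) - 2))


def hash_tablet(key):
    """Hash table: the tablet is chosen by the hash of the key (assumed hash function)."""
    h = zlib.crc32(key.encode("utf-8")) & 0xFFFF
    return tablet_index(HASH_BOUNDS, h)


def ordered_tablet(key):
    """Ordered table: the tablet is chosen by the key itself."""
    return tablet_index(ORDERED_BOUNDS, key)
```

With the ordered boundaries, Apple through Grape land in tablet 0, Kiwi through Orange in tablet 1, and Strawberry and Tomato in tablet 2, matching the slide. Neighboring keys stay together, which is what makes range scans cheap in an ordered table and scattered in a hash table.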
37
Flexible Schema
Posted date  Listing id  Item   Price
6/1/07       424252      Couch  570
6/1/07       763245      Bike   86
6/3/07       211242      Car    1123
6/5/07       421133      Lamp   15
Optional columns, filled in for only some listings: Condition (Good, Fair), Color (Red)

38
Detailed Architecture
Local region
Remote regions
Clients
REST API
Routers
Tribble
Tablet Controller
Storage units
39
Tablet Splitting and Balancing
Each storage unit has many tablets (horizontal
partitions of the table)
Storage unit may become a hotspot
Tablets may grow over time
Overfull tablets split
Shed load by moving tablets to other servers
40
QUERY PROCESSING
41
Accessing Data
Get key k
SU
SU
SU
42
Bulk Read
SU
SU
SU
43
Range Queries in YDOT

Clustered, ordered retrieval of records

Apple Avocado Banana Blueberry
Cantaloupe Grape Kiwi Lemon
Lime Mango Orange
Strawberry Tomato Watermelon
44
Updates
Write key k
Sequence for key k
Routers
Message brokers
Write key k
Sequence for key k
SUCCESS
Write key k
45
ASYNCHRONOUS REPLICATION AND CONSISTENCY
46
Asynchronous Replication
47
Consistency Model

Goal: Make it easier for applications to reason about updates and cope with asynchrony
What happens to a record with primary key "Alice"?

(Figure: timeline of the record within Generation 1, from record insert through a series of updates to delete, with versions v. 1 to v. 8 along the time axis)
As the record is updated, copies may get out of sync.
48
Example: Social Alice
(Figure: Alice's User/Status record as seen in the West and East regions, next to the record timeline. The status goes from ___ to Busy to Free; the last two copies are shown as ???.)
49
Consistency Model
Read
(Figure: timeline v. 1 to v. 8 in Generation 1, with stale versions and the current version marked)
In general, reads are served using a local copy
50
Consistency Model
Read up-to-date
(Figure: timeline v. 1 to v. 8 in Generation 1, with stale versions and the current version marked)
But application can request and get current version
51
Consistency Model
Read ≥ v.6
(Figure: timeline v. 1 to v. 8 in Generation 1, with stale versions and the current version marked)
Or variations such as "read forward": while copies may lag the master record, every copy goes through the same sequence of changes
52
Consistency Model
Write
(Figure: timeline v. 1 to v. 8 in Generation 1, with stale versions and the current version marked)
Achieved via per-record primary copy protocol
(To maximize availability, record masterships are automatically transferred if a site fails)
Can be selectively weakened to eventual consistency (local writes that are reconciled using version vectors)
53
Consistency Model
Write if = v.7
ERROR
(Figure: timeline v. 1 to v. 8 in Generation 1, with stale versions and the current version marked)
Test-and-set writes facilitate per-record transactions
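
The read and write variants on the consistency model slides can be mimicked with a toy model of one record: a master copy plus a single replica that may lag. The class and method names are hypothetical and do not reproduce the PNUTS API; the status values come from the Alice example.

```python
class Record:
    """Toy model: one master copy plus one replica that may lag behind it."""

    def __init__(self, value):
        self.master = (1, value)   # (version, value)
        self.replica = (1, value)

    def write(self, value):
        """All writes go through the master, which assigns the next version."""
        version, _ = self.master
        self.master = (version + 1, value)
        return version + 1

    def test_and_set_write(self, required_version, value):
        """Write only if the current version is exactly required_version."""
        version, _ = self.master
        if version != required_version:
            raise ValueError(f"version is {version}, not {required_version}")
        return self.write(value)

    def replicate(self):
        """Asynchronous replication catching up (called explicitly here)."""
        self.replica = self.master

    def read_any(self):
        """Plain read: served from the local copy, possibly stale."""
        return self.replica

    def read_latest(self):
        """Read up-to-date: always returns the current version."""
        return self.master

    def read_at_least(self, required_version):
        """Read a copy that is at required_version or newer."""
        if self.replica[0] >= required_version:
            return self.replica
        return self.master


rec = Record("___")
rec.write("Busy")        # the master is now at version 2
stale = rec.read_any()   # the replica has not caught up yet
```

A test-and-set write against an outdated version fails instead of overwriting, which is how the ERROR on the slide arises.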
54
Consistency Techniques

Per-record mastering
Each record is assigned a master region
May differ between records
Updates to the record forwarded to the master
region
Ensures consistent ordering of updates
Tablet-level mastering
Each tablet is assigned a master region
Inserts and deletes of records forwarded to the
master region
Master region decides tablet splits
These details are hidden from the application
Except for the latency impact!

55
Mastering
A  42342  E
B  42521  W
C  66354  W
D  12352  E
E  75656  C
F  15677  E
(Figure: the same six records are shown in three copies, one of them labeled "Tablet master". The last column appears to give each record's master region: E, W, or C.)
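
Per-record mastering can be sketched as a routing step followed by sequencing at the master. The sketch assumes that the last column of the Mastering table is each record's master region; the function name and the per-record counter are hypothetical.

```python
from collections import defaultdict

# Record key -> master region, read off the Mastering slide (assumed meaning)
RECORD_MASTER = {"A": "E", "B": "W", "C": "W", "D": "E", "E": "C", "F": "E"}

sequence = defaultdict(int)  # per-record sequence numbers, assigned at the master


def apply_update(key, origin_region):
    """Route an update to the record's master region and assign its sequence number.

    Returns (master_region, was_forwarded, sequence_number).
    """
    master = RECORD_MASTER[key]
    forwarded = master != origin_region  # a forwarded update pays cross-region latency
    sequence[key] += 1                   # one master per record gives one ordering
    return master, forwarded, sequence[key]
```

An update to A from region E is handled locally, while the same update issued in region W is forwarded to E, which is the latency impact mentioned on the Consistency Techniques slide.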
56
Bulk Insert/Update/Replace

Client feeds records to bulk manager
Bulk loader transfers records to SUs in batches
Bypass routers and message brokers
Efficient import into storage unit

Client
Bulk manager
Source Data
57
Bulk Load in YDOT

YDOT bulk inserts can cause performance hotspots
Solution: preallocate tablets

58
Index Maintenance

How to have lots of interesting indexes and
views, without killing performance?
Solution: Asynchrony!
Indexes/views updated asynchronously when base
table updated
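
A minimal sketch of asynchronous index maintenance, assuming a single in-memory base table, one secondary index on price, and a queue of pending index updates. All names are hypothetical, and the queue is drained by an explicit call where a real system would use a background process.

```python
from collections import deque

base_table = {}     # key -> price
price_index = {}    # price -> set of keys (a secondary index)
pending = deque()   # index updates not yet applied


def write(key, price):
    """Update the base table now; only enqueue the index change."""
    old = base_table.get(key)
    base_table[key] = price
    pending.append((key, old, price))


def apply_pending():
    """Drain the queue, bringing the index up to date with the base table."""
    while pending:
        key, old, new = pending.popleft()
        if old is not None:
            keys = price_index.get(old, set())
            keys.discard(key)
            if not keys:
                price_index.pop(old, None)
        price_index.setdefault(new, set()).add(key)
```

The write path stays short because it never touches the index; the cost is that index reads may briefly return stale results.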

59
SHERPA IN CONTEXT
60
Types of Record Stores

Query expressiveness, from simple to feature rich:
S3: Object retrieval
PNUTS: Retrieval from single table of objects/records
Oracle: SQL
61
Types of Record Stores

Consistency model, from best effort to strong guarantees:
S3: Eventual consistency
PNUTS: Timeline consistency
Oracle: ACID
Also labeled on the axis: Object-centric consistency, Program centric consistency
62
Types of Record Stores

Data model:
PNUTS, CouchDB: Flexibility, schema evolution; object-centric consistency
Oracle: Optimized for fixed schemas; consistency spans objects
63
Types of Record Stores

Elasticity (ability to add resources on demand), from inelastic to elastic:
Oracle: Limited (via data distribution)
PNUTS, S3: VLSD (Very Large Scale Distribution/Replication)
64
Data Stores Comparison

Versus PNUTS:

User-partitioned SQL stores (Microsoft Azure SDS, Amazon SimpleDB)
More expressive queries
Users must control partitioning
Limited elasticity

Multi-tenant application databases (Salesforce.com, Oracle on Demand)
Highly optimized for complex workloads
Limited flexibility to evolving applications
Inherit limitations of underlying data management system

Mutable object stores (Amazon S3)
Object storage versus record management

65
Application Design Space
(Figure: systems placed on two axes, "Get a few things" versus "Scan everything" and "Records" versus "Files". Systems shown: Sherpa, MObStor, YMDB, MySQL, Oracle, Filer, BigTable, Hadoop, Everest.)
66
Alternatives Matrix
Columns: Consistency model, Structured access, Global low latency, SQL/ACID, Availability, Operability, Updates, Elastic
Rows: Sherpa, Y! UDB, MySQL, Oracle, HDFS, BigTable, Dynamo, Cassandra
67
Further Reading
Efficient Bulk Insertion into a Distributed Ordered Table (SIGMOD 2008). Adam Silberstein, Brian Cooper, Utkarsh Srivastava, Erik Vee, Ramana Yerneni, Raghu Ramakrishnan
PNUTS: Yahoo!'s Hosted Data Serving Platform (VLDB 2008). Brian Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Phil Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, Ramana Yerneni
Asynchronous View Maintenance for VLSD Databases. Parag Agrawal, Adam Silberstein, Brian F. Cooper, Utkarsh Srivastava and Raghu Ramakrishnan. SIGMOD 2009 (to appear)
Cloud Storage Design in a PNUTShell. Brian F. Cooper, Raghu Ramakrishnan, and Utkarsh Srivastava. Beautiful Data, O'Reilly Media, 2009 (to appear)
68
QUESTIONS?