Title: Megastore: Scalable, Highly Available Storage for Interactive Systems
1 Megastore: Scalable, Highly Available Storage for Interactive Systems
- Jason Baker, Chris Bond, James C. Corbett, JJ Furman, Andrey Khorlin, James Larson, Jean-Michel Leon, Yawei Li, Alexander Lloyd, Vadim Yushprakh
Presented by Hamid Seyedmoradi Ayoub Hamidi Ehsan Mohamad Nezamian
Advanced Database Systems, SRBIAU, Kurdistan Campus, 10 May 2012
2 Megastore
- Motivation
- Introduction
- NoSQL and RDBMS
- Megastore
- Paxos
3 Megastore
- More than 3 billion write and 20 billion read transactions daily
- Key Contributions
- Data Model and Storage System
- Paxos Replication
- Report on Experience
4 AVAILABILITY AND SCALE
- Replication
- For availability, we implemented a synchronous, fault-tolerant log replicator optimized for long-distance links
- Partitioning and Locality
- For scale, we partitioned data into a vast space of small databases
5 AVAILABILITY AND SCALE
- Replication
- Strategies
- Asynchronous Master/Slave
- Synchronous Master/Slave
- Optimistic Replication
- We decided to use Paxos
6 Technology Options
7 Technology Options
8 AVAILABILITY AND SCALE
- Partitioning and Locality
- Replication
- Entity groups partition the datastore
- ACID semantics within an entity group
- Looser consistency across entity groups
- Each entity group is synchronously replicated across datacenters
- Entity group data and replication metadata are stored in scalable NoSQL datastores
9 AVAILABILITY AND SCALE
- Partitioning and Locality
- Operations
- Entities are the units of data
- Most transactions are within a single entity group
- Each entity group has its own local index
- Global indexes span entity groups but have weaker consistency
- Cross-entity-group transactions are supported via two-phase commit
- Asynchronous communication between entity groups is supported by queues
[Figure: Entity Group 1 sends a message through a queue that Entity Group 2 receives; each group has a local index]
10 AVAILABILITY AND SCALE
- Partitioning and Locality
- Entity Groups
- Selecting Entity Group Boundaries
- Example
- Email
- Blogs
- Physical Layout
11 Megastore
- API Design Philosophy
- Data Model
- Pre-Joining with Keys
- SCATTER
- Indexes
- STORING Clause
- Repeated Indexes
- Inline Indexes
- Mapping to Bigtable
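The idea behind pre-joining with keys and mapping to Bigtable can be sketched as follows. This is a minimal illustration, not Megastore's schema language or the Bigtable API: `SortedStore`, `row_key`, and the user and photo identifiers are hypothetical names chosen for the example. The assumption is that a child entity's key starts with its root entity's key, so that all rows of one entity group sit next to each other in a key-ordered store.

```python
from bisect import insort


def row_key(*parts):
    """Encode key components as a tuple so rows sort by (root key, child key)."""
    return tuple(parts)


class SortedStore:
    """Toy stand-in for a key-ordered store (hypothetical, not the Bigtable API)."""

    def __init__(self):
        self._keys = []   # row keys kept in sorted order
        self._rows = {}   # row key -> row contents

    def put(self, key, value):
        if key not in self._rows:
            insort(self._keys, key)
        self._rows[key] = value

    def scan_prefix(self, prefix):
        """Return (key, row) pairs whose key starts with `prefix`, in key order."""
        n = len(prefix)
        return [(k, self._rows[k]) for k in self._keys if k[:n] == prefix]


store = SortedStore()
# Root entities (users) and child entities (photos keyed by user id first).
store.put(row_key(101), {"name": "John"})
store.put(row_key(102), {"name": "Mary"})
store.put(row_key(101, 500), {"tag": "vacation"})
store.put(row_key(101, 502), {"tag": "family"})
store.put(row_key(102, 600), {"tag": "office"})

# One contiguous range scan returns user 101 together with that user's photos.
user_101 = store.scan_prefix(row_key(101))
all_keys = [k for k, _ in store.scan_prefix(())]
```

Because the child key is prefixed by the root key, reading a whole entity group needs no join, only a scan over adjacent rows.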
12 Megastore
13 Megastore
- Transactions and Concurrency Control
- Read
- current
- snapshot
- inconsistent
- Transaction Lifecycle
- 1. Read
- 2. Application logic
- 3. Commit
- 4. Apply
- 5. Clean up
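The five lifecycle steps can be sketched on a single toy replica. This is an assumption-heavy illustration: `EntityGroup` and `run_transaction` are hypothetical names, and the commit step uses a plain position check as a stand-in for Paxos consensus on the next log slot. It only shows why two transactions that read the same log position cannot both commit.

```python
class EntityGroup:
    """Toy single-replica entity group: a write-ahead log plus applied state."""

    def __init__(self):
        self.log = []      # committed log entries, one per transaction
        self.data = {}     # entity state after applying the log
        self.applied = 0   # number of log entries already applied to `data`

    def read_position(self):
        # 1. Read: find the log position of the last committed transaction.
        return len(self.log)

    def commit(self, read_pos, mutations):
        # 3. Commit: stand-in for Paxos. The next log slot is won only if
        # nothing else was committed since `read_pos` was observed.
        if len(self.log) != read_pos:
            return False
        self.log.append(dict(mutations))
        return True

    def apply(self):
        # 4. Apply: write committed mutations into the entity state.
        while self.applied < len(self.log):
            self.data.update(self.log[self.applied])
            self.applied += 1


def run_transaction(group, logic):
    pos = group.read_position()             # 1. Read
    mutations = logic(dict(group.data))     # 2. Application logic
    if not group.commit(pos, mutations):    # 3. Commit
        return False                        # lost the race; caller may retry
    group.apply()                           # 4. Apply
    mutations.clear()                       # 5. Clean up (toy: drop scratch data)
    return True


group = EntityGroup()
ok = run_transaction(group, lambda snap: {"balance": snap.get("balance", 0) + 10})

# Two writers observe the same position; only the first commit succeeds.
pos_a = group.read_position()
pos_b = group.read_position()
first = group.commit(pos_a, {"balance": 20})
second = group.commit(pos_b, {"balance": 99})
group.apply()
```

The losing writer sees `False` and would restart from the read step, which is the optimistic concurrency behavior the lifecycle implies.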
14 Megastore
15 REPLICATION
- Brief Summary of Paxos
- Megastore's Approach
- Fast Reads
- Fast Writes
- Replica Types
- Witness Replica
- Architecture
16 Architecture
17 Data Structures and Algorithms
18 Data Structures and Algorithms
- Reads
- Query Local
- Find Position
- Local read
- Majority read
- Catchup
- Validate
- Query Data
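The read steps listed above can be sketched roughly as below. This is a simplified model, not Megastore's implementation: `Replica` and `current_read` are hypothetical names, the coordinator is reduced to an `up_to_date` flag, and catchup simply copies missing log entries from a peer instead of running Paxos for unknown positions.

```python
class Replica:
    """Toy replica: a log of committed entries and the state derived from it."""

    def __init__(self, name):
        self.name = name
        self.log = []             # committed entries (dicts), in log order
        self.up_to_date = True    # what the local coordinator would report

    def state(self):
        merged = {}
        for entry in self.log:
            merged.update(entry)
        return merged


def current_read(local, replicas, key):
    # 1. Query Local: ask the coordinator whether the local replica is current.
    if local.up_to_date:
        # 2a. Local read: the local log already holds the highest position.
        position = len(local.log)
    else:
        # 2b. Majority read: take the highest position seen by a majority.
        majority = replicas[: len(replicas) // 2 + 1]
        position = max(len(r.log) for r in majority)
        # 3. Catchup: copy the missing entries from a replica that has them.
        source = max(majority, key=lambda r: len(r.log))
        local.log.extend(source.log[len(local.log):position])
        # 4. Validate: mark the local replica as current again.
        local.up_to_date = True
    # 5. Query Data: read from the (now caught-up) local replica.
    return local.state().get(key), position


a, b, c = Replica("a"), Replica("b"), Replica("c")
for r in (a, b, c):
    r.log.append({"title": "draft"})
# A later write reached only a majority (a and b); c was invalidated.
for r in (a, b):
    r.log.append({"title": "final"})
c.up_to_date = False

value, position = current_read(c, [c, a, b], "title")
```

Any majority overlaps the majority that accepted the last write, which is why the stale replica `c` still discovers position 2 when it polls only itself and `a`.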
19 Data Structures and Algorithms
20 Data Structures and Algorithms
- Writes
- Accept Leader
- Prepare
- Accept
- Invalidate
- Apply
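The write steps above can be sketched for one log position as follows. This is a toy model under stated assumptions: `Replica`, `write`, and the returned status strings are hypothetical, network failures are modeled by a `reachable` flag, the proposer peeks at replica state to pick a proposal number, and a failed accept round returns `"retry"` instead of backing off and looping.

```python
class Replica:
    """Toy Paxos acceptor for the slots of one entity group's log."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.promised = {}    # log position -> highest proposal number promised
        self.accepted = {}    # log position -> (proposal number, value)
        self.applied = {}     # log position -> applied value
        self.coordinator_valid = True

    def prepare(self, pos, n):
        """Promise to ignore proposals below n; report any accepted value."""
        if not self.reachable or n <= self.promised.get(pos, -1):
            return None
        self.promised[pos] = n
        return self.accepted.get(pos, (-1, None))

    def accept(self, pos, n, value):
        """Accept the value unless a higher-numbered promise was made."""
        if not self.reachable or n < self.promised.get(pos, -1):
            return False
        self.promised[pos] = n
        self.accepted[pos] = (n, value)
        return True


def write(replicas, leader, pos, value):
    quorum = len(replicas) // 2 + 1
    n, chosen = 0, value
    # 1. Accept Leader: ask the leader to accept the value as proposal zero.
    acks = {leader.name} if leader.accept(pos, 0, value) else set()
    if not acks:
        # 2. Prepare: use a higher proposal number (toy shortcut: read replica
        # state directly) and adopt the highest-numbered accepted value, if any.
        n = 1 + max(r.promised.get(pos, 0) for r in replicas)
        promises = [p for p in (r.prepare(pos, n) for r in replicas) if p is not None]
        if len(promises) < quorum:
            return "retry"
        best = max(promises, key=lambda p: p[0])
        if best[1] is not None:
            chosen = best[1]
    # 3. Accept: ask the remaining replicas to accept the value.
    for r in replicas:
        if r.name not in acks and r.accept(pos, n, chosen):
            acks.add(r.name)
    if len(acks) < quorum:
        return "retry"
    # 4. Invalidate: mark coordinators of replicas that did not accept.
    for r in replicas:
        if r.name not in acks:
            r.coordinator_valid = False
    # 5. Apply: apply the chosen value at the replicas that accepted it.
    for r in replicas:
        if r.name in acks:
            r.applied[pos] = chosen
    return "committed" if chosen == value else "conflict"


a, b, c = Replica("a"), Replica("b"), Replica("c", reachable=False)
status = write([a, b, c], leader=a, pos=1, value={"title": "final"})
```

In this run the leader accepts proposal zero, so the prepare round is skipped entirely; the unreachable replica `c` is invalidated and would need catchup before serving a current read.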
21 Feedback
22 END