Paxos Commit - PowerPoint PPT Presentation

About This Presentation
Title:

Paxos Commit

Description:

Paxos Commit, by Jim Gray and Leslie Lamport (Microsoft Research). Preview of a paper in preparation, presented at Microsoft Research Techfest, 3 March 2004, Redmond, WA.


Transcript and Presenter's Notes

1
Paxos Commit
  • Jim Gray
  • Leslie Lamport
  • Microsoft Research
  • Preview of a paper in preparation
  • Presented at Microsoft Research Techfest,
    3 March 2004, Redmond, WA
  • Article MSR-TR-2003-96: Consensus on Transaction
    Commit
  • http://research.microsoft.com/research/pubs/view.aspx?tr_id=701

2
Commit is Common
  • Marriage ceremony: Do you? I do. I now pronounce
    you
  • Theater: Ready on the set? Ready! Action!
  • Contract law: Offer. Signature. Deal / lawsuit.

3
The Common Picture
[Diagram: the director asks each actor "Ready?"; each actor replies "Ready"; the director then tells all actors "Action!"]
4
All or Nothing: if any actor says no, the deal is off.
[Diagram: the director asks each actor "Ready?"; one actor replies "No!"; the director then tells all actors "No deal!"]
5
The Database Version
[Diagram: the same picture with the TM in the director's role and the RMs in the actors' roles: "Ready?", "Ready", then "Commit" to every RM.]
TM = Transaction Manager, RM = Resource Manager
6
Two Phase Commit
  • N Resource Managers (RMs)
  • Want all RMs to commit or all to abort.
  • Coordinated by a Transaction Manager (TM)
  • TM sends Prepare, then Commit or Abort
  • RM responds Prepared or Aborted
  • 3N+1 messages
  • N+1 stable writes
  • Delay: 4 message delays, 2 stable-write delays
  • Blocking: if the TM fails, Commit-Abort stalls
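
The protocol on this slide can be sketched as a small in-memory simulation. This is a minimal sketch, not the authors' code: the `ResourceManager` class, the `two_phase_commit` function, and the message counter are hypothetical names introduced for illustration. It assumes a single client that sends one RequestCommit message, no failures, and a TM that sends its decision to every RM.

```python
from dataclasses import dataclass


@dataclass
class ResourceManager:
    """Hypothetical RM: votes on Prepare, then applies the TM's decision."""
    name: str
    can_commit: bool = True
    state: str = "working"

    def prepare(self) -> str:
        # A real RM would force a stable write before answering Prepared.
        self.state = "prepared" if self.can_commit else "aborted"
        return "Prepared" if self.can_commit else "Aborted"

    def finish(self, decision: str) -> None:
        self.state = decision


def two_phase_commit(rms):
    """Run one failure-free 2PC round; return (decision, messages sent)."""
    messages = 1                          # client -> TM: RequestCommit
    votes = []
    for rm in rms:
        messages += 2                     # TM -> RM: Prepare; RM -> TM: vote
        votes.append(rm.prepare())
    # All or nothing: a single Aborted vote aborts the whole transaction.
    decision = "committed" if all(v == "Prepared" for v in votes) else "aborted"
    for rm in rms:
        messages += 1                     # TM -> RM: Commit or Abort
        rm.finish(decision)
    return decision, messages
```

With N RMs the counter comes to N Prepare messages, N votes, N decisions, and one RequestCommit. The sketch has no answer for a TM that stops between the two loops, which is the blocking case the slide points out.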

7
The Problem With 2PC
  • Atomicity: all or nothing
  • Consistency: does the right thing
  • Isolation: no concurrency anomalies
  • Durability / Reliability: state survives
    failures
  • Availability: always up

2PC blocks if the TM fails.
8
Problem Statement
  • ACID Transactions make error handling easy.
  • One fault can make 2-Phase Commit block.
  • Goal: ACID and Available. Non-blocking despite F
    faults.

9
Fault-Tolerant Two Phase Commit
[Diagram: the client sends RequestCommit to the TM; the TM sends Prepare to each RM; each RM replies Prepared.]
If the 2PC Transaction Manager (TM) fails, the
transaction blocks.

Solution: add a spare Transaction Manager
(non-blocking commit, 3-phase commit).
10
Fault-Tolerant Two Phase Commit
[Diagram: with a spare TM, both TMs exchange Prepare and Prepared with the RMs; one TM then sends commit while the other sends abort. Inconsistent! Now what?]
If the 2PC Transaction Manager (TM) fails, the
transaction blocks.
Solution: add a spare Transaction Manager
(non-blocking commit, 3-phase commit).
But what if...?
The complexity is a mess.
11
Fault Tolerant 2PC
  • Several workarounds proposed in database
    community
  • Often called "3-phase" or "non-blocking" commit.
  • None with complete algorithm and correctness
    proof.

12
Reaching Agreement in the Presence of Faults
Shostak, Pease, Lamport
JACM, 1980
  • 25 years of theory
  • Now called the Consensus problem
  • N processes want to agree on a value, even if F
    of them have failed.

13
Consensus
[Diagram: clients send "Propose X" and "Propose W" to the consensus box; every client learns "W Chosen".]
The consensus box:
  • Collects proposed values
  • Picks one proposed value
  • Remembers it forever
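
The three properties of the consensus box can be illustrated with a write-once register. This is only a single-process stand-in under stated assumptions: the `ConsensusBox` class and its `propose` method are hypothetical names, the box here picks the first value it receives (any proposed value would be a legal choice), and nothing in it is fault tolerant.

```python
class ConsensusBox:
    """Hypothetical write-once register standing in for the consensus box."""

    def __init__(self):
        self.proposals = []       # every value ever proposed
        self.chosen = None        # the single value that was picked
        self._decided = False

    def propose(self, value):
        """Record a proposal and return the chosen value."""
        self.proposals.append(value)          # collects proposed values
        if not self._decided:                 # picks one proposed value
            self.chosen = value
            self._decided = True
        return self.chosen                    # remembers it forever
```

For example, after `box.propose("W")` a later `box.propose("X")` still returns `"W"`: every client learns the same chosen value.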

14
Consensus for Commit: The Obvious Approach
[Diagram: the client sends Request Commit to the TM; the TM sends Prepare to each RM; each RM replies Prepared; the TM sends "Propose Prepared" to the consensus box and learns "Prepared Chosen"; the TM then sends Commit to each RM. A second TM learns the same "Prepared Chosen" from the box.]
  • Get consensus on the TM's decision.
  • TM just learns consensus value.
  • TM is stateless

15
Consensus for Commit: The Paxos Commit Approach
[Diagram: the client sends Request Commit to the TM; the TM sends Prepare to each RM; each RM sends its own proposal ("Propose RM1 Prepared", "Propose RM2 Prepared") to its own consensus box; the TMs learn "RM1 Prepared Chosen" and "RM2 Prepared Chosen" and send Commit to each RM.]
  • Get consensus on each RM's choice.
  • TM just combines consensus values.
  • TM is stateless
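
Because the TM only combines values that the consensus boxes already remember, its rule can be written as a pure function. The following is a rough sketch under assumptions: `combine_outcomes` is a hypothetical name, each RM's instance is assumed to choose either "Prepared" or "Aborted", and `None` stands for an instance where nothing has been chosen yet.

```python
from typing import Dict, Optional


def combine_outcomes(chosen: Dict[str, Optional[str]]) -> Optional[str]:
    """Combine the per-RM consensus values into one transaction outcome.

    Returns "Commit", "Abort", or None while the outcome is still open.
    """
    values = list(chosen.values())
    if any(v == "Aborted" for v in values):
        return "Abort"            # one Aborted instance aborts everything
    if any(v is None for v in values):
        return None               # some RM's instance is still undecided
    return "Commit"               # every instance chose Prepared
```

Since the function keeps no state of its own, any TM that reads the same chosen values computes the same outcome, which is one way to read "TM is stateless".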

16
The Obvious Approach vs. Paxos Commit
[Diagram, side by side. The Obvious Approach: Prepare, Prepared, Propose Prepared, Prepared Chosen, Commit. Paxos Commit: Prepare, Propose RM1 Prepared and Propose RM2 Prepared, RM1 Prepared Chosen and RM2 Prepared Chosen, Commit.]
Paxos Commit has one fewer message delay.
17
Consensus in Action
[Diagram: inside the consensus box, the RM sends "Propose RM Prepared" to each of three acceptors; each acceptor sends "Vote RM Prepared" to the TMs; the TMs learn "RM Prepared Chosen".]
  • The normal (failure-free) case
  • Two message delays
  • Can optimize

18
Consensus in Action
[Diagram: the TMs query the three acceptors inside the RM's consensus box.]
If a majority of acceptors are working, a TM can
always learn what was chosen, or get Aborted
chosen if nothing has been chosen yet.
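
One way to picture this recovery rule is as the value a TM should push in a fresh ballot after polling the acceptors. This is a sketch under assumptions, not the paper's algorithm text: `value_to_propose` is a hypothetical name, `replies` holds only the acceptors that answered, and each reply is either `None` (nothing accepted) or a `(ballot, value)` pair, following the usual Paxos rule of adopting the value with the highest ballot.

```python
from typing import Dict, Optional, Tuple

# (ballot, value) last accepted by an acceptor, or None if it accepted nothing.
Accepted = Optional[Tuple[int, str]]


def value_to_propose(replies: Dict[str, Accepted],
                     total_acceptors: int) -> Optional[str]:
    """Return the value a recovering TM should propose, or None if stuck."""
    if len(replies) <= total_acceptors // 2:
        return None                       # no majority reachable: must wait
    accepted = [r for r in replies.values() if r is not None]
    if accepted:
        # Something may already be chosen: adopt the highest-ballot value.
        return max(accepted, key=lambda r: r[0])[1]
    return "Aborted"                      # nothing chosen yet: safe to abort
```

With three acceptors, two replies are enough to proceed; a single reply returns `None`, which is the "stalls but stays safe" behaviour.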

19
The Complete Algorithm
  • Subtle.
  • More weird cases than most people imagine.
  • Proved correct.

20
Paxos Commit
  • N RMs
  • 2F+1 acceptors (2F+1 TMs)
  • If F+1 acceptors see all RMs prepared, then the
    transaction is committed.
  • 2F(N+1) + 3N + 1 messages, 5 message delays, 2
    stable-write delays.

21
                      Two-Phase Commit   Paxos Commit (tolerates F faults)
Messages              3N+1               3N + 2F(N+1) + 1
Stable writes         N+1                N + 2F + 1
Message delays        4                  5
Stable-write delays   2                  2

Same algorithm when F = 0 and TM = Acceptor.
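
The comparison can be checked with a few lines of arithmetic. This is an illustrative sketch that takes the slide's counts as 3N+1 and N+1 for Two-Phase Commit and 3N + 2F(N+1) + 1 and N + 2F + 1 for Paxos Commit; the function names and dictionary keys are hypothetical.

```python
def two_phase_commit_cost(n: int) -> dict:
    """Cost of 2PC with n resource managers, per the slide's counts."""
    return {
        "messages": 3 * n + 1,
        "stable_writes": n + 1,
        "message_delays": 4,
        "stable_write_delays": 2,
    }


def paxos_commit_cost(n: int, f: int) -> dict:
    """Cost of Paxos Commit with n RMs tolerating f faults, per the slide."""
    return {
        "messages": 3 * n + 2 * f * (n + 1) + 1,
        "stable_writes": n + 2 * f + 1,
        "message_delays": 5,
        "stable_write_delays": 2,
    }
```

For N = 5, tolerating one fault costs 28 messages and 8 stable writes against 16 and 6 for Two-Phase Commit, and setting F = 0 makes the message and stable-write counts coincide.
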
22
Summary
  • Commit is common
  • Two-Phase Commit is good, but it is the
    un-availability protocol
  • Paxos Commit is non-blocking if there are at
    most F faults.
  • When F = 0 (no fault tolerance), Paxos Commit =
    2PC

24
Paxos Consensus
  • Group has a leader known to all
  • Leader election is a subroutine
  • A process proposes a value v to the leader.
  • Leader sends proposal (phase 2) (ballot, value)
    to all acceptors
  • Acceptors respond with max(ballot, value) they
    have seen
  • If leader gets no higher ballot, and gets at
    least F+1 responses, then leader can announce
    (ballot, value)
  • Full protocol is 3-phase
  • Phase 1: leader starts new ballot
  • Phase 2: leader proposes value
  • Phase 3: if value accepted by F+1 acceptors,
    then value is accepted. If not, leader tries to
    get majority value accepted.

6F+4 messages, 2F+1 stable writes, 4 message
delays, and 2 stable-write delays
25
Using Consensus: Have a consensus for each RM
[Diagram: the client sends RequestCommit to the TM; the TM sends Prepare to each RM; each RM reports Prepared through its own consensus box; the TMs send Commit to each RM.]
26
[Diagram: the RM sends "Propose X" and a TM sends "Propose W" to the consensus box; the RM and both TMs learn "X Chosen".]
27
Paxos Commit (success case)
[Diagram with lanes labeled Acceptors and Commit Leader.]
28
Consensus
  • The distributed systems theory community has
    thought about this a lot.
  • They call it Consensus: N processes want to
    agree on a value
  • Want to tolerate F faults
  • Tolerate F processes stopping
  • Tolerate F messages delayed or lost
  • If there are fewer than F faults in a window,
    then consensus is achieved.
  • Byzantine faults need 3F+1 acceptors
  • Benign faults need 2F+1 acceptors: stalls but
    safe if more than F faults