1
Sinfonia A New Paradigm for Building Scalable
Distributed Systems
Marcos K. Aguilera, Arif Merchant, Mehul Shah,
Alistair Veitch, Christos Karamanolis
Presented by Phil Huynh
2
The problem
  • Distributed systems (DS) require fault tolerance,
    scalability, consistency, and reasonable
    performance
  • Building such systems on message passing involves
    complicated protocols for handling distributed
    state (protocols for replication, file management,
    cache consistency, membership, ...)

HARD TO ACHIEVE
3
Sinfonia comes to help!
  • Sinfonia is claimed to be a service that supports
    application data sharing in a fault-tolerant,
    scalable, and consistent manner
  • Provides the ACID abstraction of minitransactions
    instead of low-level message passing
  • Transforms the problem of protocol design into
    one of data structure design, which is much easier

4
Overview
  • Design
  • Minitransactions
  • Protocol
  • Features
  • Sample Applications
  • Evaluation
  • Conclusion

5
Design
  • Assumptions: a data center environment
  • fairly well-connected machines
  • small network latencies
  • trustworthy participants
  • (not valid assumptions for WAN, P2P)
  • Goals
  • Provide a framework for building distributed
    infrastructure applications

6
Design
  • Principles
  • Make components reliable before scaling them
  • Reduce coupling to obtain scalability: no
    assumptions about the structure of the data, just
    memory addresses and raw binary

7
Design
  • Components: application nodes and memory nodes
  • Separate address spaces
  • Items addressed as <mem-node-id, address>

8
Minitransaction
  • Conceptually, a combination of multiple operations
    for handling distributed data
  • Has ACID properties
  • Each minitransaction contains compare items, read
    items, and write items
  • Examples: swap, compare-and-swap, atomically read
    many data items, acquire a lease, acquire multiple
    leases atomically, change data if a lease is held

9
Minitransaction
10
Protocol
11
Protocol
12
Features
  • Fault tolerance
  • Consistent backup
  • Replication
  • Modes of operation

13
Fault tolerance
  • Recovery from coordinator crashes
  • Uses a third-party recovery coordinator
  • The recovery coordinator asks participants to
    abort
  • Participant: if it has not yet voted, it votes
    abort; otherwise, it resends its vote to the
    recovery coordinator
  • The recovery coordinator then acts as if it were
    the minitransaction's coordinator
  • Recovery from participant crashes
  • On restart, a participant replays its redo-log
  • If a minitransaction is not in the decided list,
    it asks the relevant participants for the status
    (committed / aborted)
  • The participant still appears offline until this
    process finishes
  • Recovery from whole-system crashes
  • Participants send the final status of their recent
    minitransactions to all other nodes, then start
    the procedure above
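The coordinator-crash case can be sketched as follows. This is a minimal model under stated assumptions (names hypothetical): a participant that never voted is forced to vote abort so it can no longer commit later, a participant that did vote resends its vote, and the recovery coordinator then decides exactly as the original coordinator would.

```python
class Participant:
    """Toy participant holding per-transaction votes and outcomes."""
    def __init__(self, votes=None):
        self.votes = dict(votes or {})   # tx_id -> "COMMIT" / "ABORT"
        self.outcomes = {}               # tx_id -> final decision

def recover(participants, tx_id):
    """Recovery-coordinator sketch for a crashed coordinator.

    Participants that never voted are forced to vote ABORT (blocking
    any late commit vote); the rest resend their existing votes.  The
    decision rule is the usual one: COMMIT only if everyone voted COMMIT.
    """
    votes = []
    for p in participants:
        vote = p.votes.get(tx_id)
        if vote is None:
            vote = "ABORT"
            p.votes[tx_id] = vote      # recorded: this tx can never commit
        votes.append(vote)
    outcome = "COMMIT" if all(v == "COMMIT" for v in votes) else "ABORT"
    for p in participants:
        p.outcomes[tx_id] = outcome
    return outcome
```

Forcing the abort vote is the key step: it makes the recovery coordinator's decision safe even if the crashed coordinator later comes back.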

14
Consistent backups
  • Lock all addresses of all nodes
    (without blocking, to avoid deadlock)
  • Update each disk image up to the last committed
    minitransaction
  • Copy or snapshot the disk image
  • Release the locks
  • The backup is made from the copy or snapshot
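The steps above can be sketched as one procedure. This is a single-process toy (the `Node` class and its methods are hypothetical stand-ins): locks are acquired with non-blocking try-locks and released on any failure, each disk image is brought up to date from its redo log, and the snapshot is taken while locks are held so all nodes agree on a consistent point.

```python
import copy

class Node:
    """Toy memory node: a disk image plus an in-memory redo log."""
    def __init__(self):
        self.disk = {}
        self.redo = []       # committed (addr, value) pairs not yet on disk
        self.locked = False

    def try_lock_all(self):
        if self.locked:
            return False
        self.locked = True
        return True

    def unlock_all(self):
        self.locked = False

    def flush(self):
        for addr, value in self.redo:
            self.disk[addr] = value
        self.redo.clear()

def consistent_backup(nodes):
    # 1. Try-lock all addresses on all nodes; never block while holding
    #    locks (avoids deadlock) -- back off and retry instead.
    while True:
        acquired = [n for n in nodes if n.try_lock_all()]
        if len(acquired) == len(nodes):
            break
        for n in acquired:
            n.unlock_all()
    # 2. Bring each disk image up to the last committed minitransaction.
    for n in nodes:
        n.flush()
    # 3. Snapshot the images, then release the locks; the actual backup
    #    is made from the snapshots, off the critical path.
    snapshots = [copy.deepcopy(n.disk) for n in nodes]
    for n in nodes:
        n.unlock_all()
    return snapshots
```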

15
Replication
  • Replicate the redo-log, decide-log, and
    forced-abort log
  • The primary copy sends updates on these logs to
    the replica and waits for acks
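A primary-copy scheme like this can be sketched in a few lines (a toy, synchronous model with hypothetical names, not the paper's implementation): every log append on the primary is mirrored to the replica and is treated as stable only once the replica has acknowledged it.

```python
class Replica:
    """Backup copy: applies log updates and acknowledges them."""
    def __init__(self):
        self.logs = {"redo": [], "decide": [], "forced-abort": []}

    def apply(self, log_name, entry):
        self.logs[log_name].append(entry)
        return "ack"

class Primary:
    """Primary copy: mirrors every log append to the replica and
    treats it as stable only once the replica has acknowledged."""
    def __init__(self, replica):
        self.replica = replica
        self.logs = {"redo": [], "decide": [], "forced-abort": []}

    def append(self, log_name, entry):
        self.logs[log_name].append(entry)
        if self.replica.apply(log_name, entry) != "ack":
            raise RuntimeError("no ack from replica")
        return True
```

Waiting for the ack is what lets the replica take over with identical logs if the primary fails.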

16
Modes of operation
17
Review
  • Fault tolerance: a smart strategy
  • Consistent backup: lock all addresses of all
    nodes, bring the disk image at each memory node up
    to the last committed minitransaction, then
    release the locks
  • Replication: nothing special
  • Load balancing: not supported
  • Caching: not supported; applications must handle
    it themselves
  • Modes of operation

18
Sample Applications
  • SinfoniaFS: a distributed file system
  • SinfoniaGCS: a group communication system

19
SinfoniaFS
  • A scalable, fault-tolerant file system
  • Cluster nodes are the application nodes of
    Sinfonia
  • Each Sinfonia memory node holds data and metadata
  • Sinfonia runs in LOG or LOG-REPL mode
  • Data blocks of 16 KB
  • The inode, chaining list, and file content of the
    same file are stored on the same memory node
    (locality)
  • Each cluster (app) node writes to a preferred
    memory node (load balancing)
  • Memory nodes need not even know of each other's
    existence (scalability)
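The locality and load-balancing points can be made concrete with a small sketch. Everything here is hypothetical illustration (the modulo mapping and function names are not from the paper); the point is only that a file's pieces share one memory node, and that each cluster node has its own preferred target.

```python
BLOCK_SIZE = 16 * 1024  # SinfoniaFS uses 16 KB data blocks

def preferred_memory_node(cluster_node_id, num_memory_nodes):
    """Load balancing sketch: each cluster node writes new files to
    its own preferred memory node (hypothetical modulo mapping)."""
    return cluster_node_id % num_memory_nodes

def file_layout(memory_node, inode_addr, block_addrs):
    """Locality sketch: a file's inode, chaining list, and content
    blocks all live on one memory node, as (node, address) pairs."""
    return {
        "inode": (memory_node, inode_addr),
        "chaining_list": (memory_node, inode_addr + BLOCK_SIZE),
        "content": [(memory_node, a) for a in block_addrs],
    }
```

Because all pieces of a file sit on one node, most file operations become single-node minitransactions, and memory nodes never need to talk to each other.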

20
Implementation
  • As easy as implementing a local file system

21
Implementation
22
SinfoniaGCS
  • Broadcast m: the member adds m to its queue, finds
    the end of the global list, and updates the global
    tail to point to m
  • Receive new messages: the member follows the next
    pointers in the global list
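The broadcast step amounts to a compare-and-swap on the shared tail pointer. A minimal single-process sketch (names hypothetical, and the real system would do this CAS as a Sinfonia minitransaction against memory nodes):

```python
class GlobalList:
    """Toy GCS global list: messages linked in order, with a shared
    tail pointer updated by a compare-and-swap-style step."""
    def __init__(self):
        self.messages = []   # position i implicitly points to i + 1
        self.tail = -1       # index of the last message, -1 if empty

    def _cas_tail(self, expected_tail, msg):
        # Stand-in for a minitransaction: compare the tail and, only
        # if it is unchanged, append msg and swing the tail to it.
        if self.tail != expected_tail:
            return False
        self.messages.append(msg)
        self.tail = len(self.messages) - 1
        return True

    def broadcast(self, msg):
        # Find the end of the global list, then atomically update the
        # tail to point at msg; retry if another member got there first.
        while not self._cas_tail(self.tail, msg):
            pass
        return self.tail

    def receive(self, cursor):
        """A member follows next pointers from its last-read position;
        returns the new messages plus the updated cursor."""
        new = self.messages[cursor + 1 : self.tail + 1]
        return new, self.tail
```

Using CAS on the tail is what gives all members the same total order of messages without any central sequencer.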

23
SinfoniaGCS
  • Join: the member acquires a global lock, updates
    the latest view, finds the global tail, broadcasts
    a join message, and releases the lock
  • Leave: same as join

24
Evaluation
  • Sinfonia is compared with Berkeley DB 4.5 on one
    memory node

25
Evaluation
  • Scalability tests

26
Evaluation
  • Scalability tests compared with Berkeley DB 4.5

27
Conclusion
  • Strengths
  • ACID minitransactions
  • Fault-tolerance strategies
  • Concerns
  • Load balancing and memory nodes are not
    transparent to users
  • Backups lock all addresses, which may be
    inefficient on a big system