Title: The User is the Computer: From Decentralized Systems to Social Computing
1The User is the Computer: From Decentralized Systems to Social Computing
Peter Druschel
2Course overview
- Today's computer systems augment a wide range of human activity, including cooperation among individuals, organizations, and businesses
- This course deals with some of the technology underlying this trend, as well as the challenges and opportunities that come with it
3Course overview
- Decentralized systems (2 hours)
- Overlays, object lookup, routing
- Shared state and coordination
- Applications
- Challenges
- Accountability for distributed systems (1.5 hours)
- Why and what is accountability?
- How can we implement it?
- How well does it work?
- Social computing and applications (1.5 hours)
- Exploiting social networks for distributed computing
- Example: enhancing Web search
- Example: thwarting unwanted communication
4Credits
- Colleagues
- Krishna Gummadi, MPI-SWS
- Rodrigo Rodrigues, MPI-SWS
- Anne-Marie Kermarrec, INRIA
- Ant Rowstron, MSRC
- Miguel Castro, MSRC
- Ion Stoica, UC Berkeley
- John Kubiatowicz, UC Berkeley
- Frank Dabek, Google
- Y. Charlie Hu, Purdue
- Group members
- Andreas Haeberlen
- Jeff Hoye
- Petr Kuznetsov
- Alan Mislove
- Animesh Nandi
- Ansley Post
- Atul Singh
- Jim Stewart
- Funding
- Max Planck Society
- National Science Foundation
- Intel Research
- Microsoft Research
- Texas ATP
5Decentralized (p2p) systems
- Distributed computer system with
- Symmetric components
- Decentralized control and state
- Self-organization
- Promise
- Organic growth
- Low barrier to deployment
- Resilience to faults, attack
- Resource abundance, diversity
6Partly vs. fully decentralized systems
- Partly decentralized systems have a dedicated controller node
- Organic growth, abundant/diverse resources
- Limited scalability, resilience
- Fully decentralized systems
- Some fully decentralized systems have powerful supernodes
- Increased efficiency, but reduced resilience
7Decentralized systems deployment
- Self-organization enables deployment in dynamic networks
- Ad hoc wireless networks
- Mobile wireless devices
- Delay-tolerant networks
- Devices with intermittent connectivity
- Overlay networks (most common)
- Internet-connected devices
8Outline
- Decentralized systems state-of-the-art
- Overlays, object lookup, routing
- Example Pastry
- Shared state and coordination: DHTs and Scribe/DOLR
- Challenges
- Putting it all together ePOST
- Accountability for distributed systems
- Social computing and applications
9Overlay networks
Overlay network
Internet
- Overlay links rely on unicast service in the Internet
- Topology can be structured or unstructured
10Why overlays?
- Overcome limitations of Internet architecture
- group communication, content-oriented networking
- enable innovation
- Low barrier to deployment
- resource sharing enables organic growth
- self-organization simplifies operation
- Robustness to faults, attacks, unexpected workloads
- decentralization
- resource diversity, wealth
11Decentralized (p2p) systems What do they enable?
- Cooperative computing
- Content sharing/distribution (Kazaa, BitTorrent)
- Streaming media (SOPcast, PPLive, Joost, iPlayer)
- Telephony (Skype), popular scientific computing
- Low barrier to deployment and market entry fosters innovation
- Digital preservation
- Diversity, abundance of resources provide durability
- Autonomous distributed systems
- Self-managing networks of little or mobile devices
- Decentralization is necessary for autonomy
12Popular decentralized systems
- File sharing, bulk content distribution
- BitTorrent, eDonkey dominate Internet traffic
- Streaming media distribution
- PPLive, CoolStreaming, Joost, iPlayer, LiveStation
- Skype
- Volunteer computing
- BOINC apps perform 1 PFLOPS on average
13Decentralized (p2p) systems State-of-the-art
- Decentralized state management
- Object location
- Replication
- Availability, Durability
- Load balancing
- Efficient, consistent lookup routing in Internet overlays
- Efficient cooperative content distribution
- Dependable storage from untrusted components
- Security: secure routing, content integrity, incentives
14Key problem Object location
- Objects partitioned among participating nodes
- Mapping from objects to nodes is dynamic
- Unicast routing doesn't help
- don't know who to talk to
- don't know where to store objects
- want to address (data) objects, not nodes!
15Solution 1 Unstructured overlay
- No assumptions about overlay graph structure
- New node is assumed to know one participant
- Performs random walk to find more nodes to attach to
- Object placement
- Inserting node or random walk target
- May leave references along random path
- Object lookup
- Scoped flooding or random walk
- Examples: Gnutella, Kazaa, eDonkey
16Unstructured object location
- I inserts an object
- Leave reference on R
- S floods a request
- Finds reference at R
- Tradeoff between scalability and recall
- Popular objects are easy to find
17Solution 2 structured overlay networks
- Overlay graph conforms to a specific graph structure
- Key-based routing primitive (KBR)
- KBR(M, X): route message M to the live node that is currently responsible for the object associated with numerical id X
- Basis for content-oriented networking
- Examples: Chord, CAN, Pastry, Tapestry, Bamboo, Kademlia, SkipNet, Kelips, Accordion, etc.
18Structured vs. unstructured overlays
- Structured
- Pre-determined routes
- Efficient identity lookup, tree formation
- More susceptible to churn
- Unstructured
- Simple overlay formation
- Tradeoff between recall and efficiency
- Robust to churn
- Can be combined
- Stable nodes form structure
- Others attach randomly
19Outline
- Decentralized systems state-of-the-art
- Overlays, object lookup, routing
- Example Pastry
- Shared state and coordination: DHTs and Scribe/DOLR
- Challenges
- Putting it all together ePOST
- Accountability for distributed systems
- Social computing and applications
20Pastry Identifier space
- Consistent hashing [Karger et al. 97]
- 160-bit circular id space
- nodeIds (uniform random)
- keys (uniform random)
- Each key is mapped to the live node with the closest nodeId
[Figure: circular id space from 0 to 2^160 - 1, showing nodeIds and a key]
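The mapping from keys to nodes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: ids are derived with SHA-1 (the slides only say ids are uniform random in a 160-bit space), and the names `make_id`, `ring_distance` and `responsible_node` are hypothetical helpers, not Pastry's API.

```python
import hashlib

ID_BITS = 160
ID_SPACE = 1 << ID_BITS  # ids live in [0, 2^160 - 1]


def make_id(name: str) -> int:
    """Derive a 160-bit id from a name (SHA-1 is an assumption of this sketch)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def ring_distance(a: int, b: int) -> int:
    """Shortest distance between two ids on the circular id space."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)


def responsible_node(key: int, live_node_ids: list[int]) -> int:
    """Return the live nodeId that is numerically closest to the key."""
    # Ties are broken toward the smaller id so the result is deterministic.
    return min(live_node_ids, key=lambda n: (ring_distance(n, key), n))
```

Because the space is circular, a key just above 0 can be closest to a node whose id is just below 2^160; `ring_distance` accounts for that wrap-around.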
21Pastry lookup
- Msg with key X is routed to the live node with nodeId closest to X
- Problem: complete routing table not scalable
[Figure: circular id space from 0 to 2^160 - 1; KBR(M, X) is delivered to the node closest to X]
22Pastry prefix-based routing
- Properties
- log_16 N steps
- O(log N) state
[Figure: id ring with nodes 65a1fc, d13da3, d4213f, d462ba, d467c4 and d471f1, and the message KBR(M, d46a1c)]
23Pastry routing table (node 65a1fcx)
[Figure: routing table of node 65a1fcx, rows 0 to 3 shown; log_16 N rows in total]
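Prefix-based routing can be illustrated with a small sketch. It is a simplification under stated assumptions: ids are hex strings, each node's routing state is a plain list of known ids (the `tables` dictionary below is hypothetical example data, not taken from the slides), and the final step that uses the neighbor set to reach the numerically closest node is omitted.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading hex digits that two ids have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(local_id: str, key: str, known_nodes: list[str]) -> str:
    """Forward to a known node that matches the key in more digits than we do."""
    here = shared_prefix_len(local_id, key)
    better = [n for n in known_nodes if shared_prefix_len(n, key) > here]
    if not better:
        return local_id  # no progress possible on prefixes alone
    # Take the smallest improvement, mirroring a lookup in the next table row.
    return min(better, key=lambda n: shared_prefix_len(n, key))


def route(start: str, key: str, tables: dict[str, list[str]]) -> list[str]:
    """Follow next_hop until no node with a longer matching prefix is known."""
    path = [start]
    while True:
        nxt = next_hop(path[-1], key, tables.get(path[-1], []))
        if nxt == path[-1]:
            return path
        path.append(nxt)


# Hypothetical routing state for illustration only.
tables = {
    "65a1fc": ["d13da3"],
    "d13da3": ["d4213f"],
    "d4213f": ["d462ba"],
}
```

Each hop resolves at least one more digit of the key, which is where the log_16 N bound on the number of steps comes from when ids are uniformly distributed.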
24Pastry prefix-based routing
- Similar to Plaxton Trees [Plaxton et al. 97]
- But added
- Neighbor sets for consistency, robustness, security
- Consistent routing
- Self-organization (dynamic joins, fault tolerance)
- Proximity neighbor selection for efficiency
- Secure routing to defend against malicious nodes
25Neighbor sets
- Stabilization protocol ensures eventual consistency
- aids routing consistency
- enables secure routing
- localizes fault detection within neighbor sets
- enables application-specific local coordination (e.g., object replica management)
[Figure: neighbor sets of nodes A and B on the id ring]
26Challenge Inconsistent routing
- Routing consistency
- At any time, at most one overlay node accepts messages with a given key
- Necessary for consistency of mutable data
- Complicated by Internet routing anomalies
[Figure: new node N has informed X, but not yet Y, of its arrival; a key lies near N, X and Y]
27Ensuring routing consistency
- To accept a message with key k, a node Y requires a lease from its neighbors, for an interval X < k < Z
- Lease can be issued if grantor has a valid lease and previous lease has expired
- Assumption
- Any live node can be reached via one of its neighbor set members
- Ensures
- properly formed ring (eventually)
- at most one node at a time accepts messages with key k
- => routing consistency
[Figure: nodes X, Y, Z on the ring with leases L1 (Y-Z) and L2 (X-Y)]
28Challenge Self-organization
- Initializing and maintaining node state (overlay construction and maintenance)
- Node addition
- Node departure (failure)
29Pastry Node join
[Figure: new node d46a1c joins via KBR(Join, d46a1c); id ring with nodes 65a1fc, d13da3, d4213f, d462ba, d467c4 and d471f1]
30Pastry Node departure (failure)
- Neighbor set members exchange keep-alive messages (failure detection, neighbor set stabilization)
- Neighbor set repair (eager): request set from farthest live node in set
- Routing table repair (lazy): get table from peers in the same row, then higher rows
31Challenge Overlay route efficiency
[Figure: overlay route across Internet hosts (OR-DSL, CMU, MIT, MA-Cable, Cisco, Cornell, CA-T1, CCI, NYU, Aros, Utah), with hops annotated 20x, 81x, 89x and 80x]
- Nodes close in id space, but far away in Internet
- Goal: choose routing table entries that yield few hops and low latency
32Proximity neighbor selection (PNS)
- Assumptions
- scalar proximity metric (e.g., RTT)
- a node can probe distance to any other node
- Proximity invariant
- Each routing table entry refers to a node
close to the local node (in the physical
network), among all nodes with the appropriate
nodeId prefix.
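A minimal sketch of how one routing table entry could be chosen under the proximity invariant. The function name `pns_entry`, the candidate list and the RTT values are hypothetical; in a real overlay the distances would come from probes, and the candidates from nodes learned during join and maintenance.

```python
from typing import Optional


def pns_entry(local_id: str, row: int, digit: str,
              candidates: list[str], rtt_ms: dict[str, float]) -> Optional[str]:
    """Pick the entry for (row, digit): the nearest node with the required prefix.

    The required prefix is the first `row` digits of the local id followed by
    `digit`. Among all candidates with that prefix, the one with the lowest
    measured round-trip time is selected.
    """
    prefix = local_id[:row] + digit
    eligible = [n for n in candidates if n.startswith(prefix) and n != local_id]
    if not eligible:
        return None  # the entry stays empty
    return min(eligible, key=lambda n: rtt_ms[n])


# Hypothetical measurements for illustration only.
candidates = ["d13da3", "d4213f", "7abcde"]
rtt_ms = {"d13da3": 80.0, "d4213f": 12.0, "7abcde": 5.0}
```

Rows with short prefixes have many eligible nodes, so their entries tend to be very close; later rows have fewer candidates and therefore longer hops.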
33PNS Routes in delay space
34PNS Properties
- Low-delay routes: Average delay stretch, relative to IP, is a small constant (1.3 - 2.2) and can be derived from the physical network's delay distribution
- Route convergence: Routes of messages sent by nearby nodes with the same key converge at a node near the source nodes
- Details in [Castro et al. MSR-TR-2002-82]
35Outline
- Decentralized systems state-of-the-art
- Overlays, object lookup, routing
- Example Pastry
- Shared state and coordination: DHTs and Scribe/DOLR
- Challenges
- Putting it all together ePOST
- Accountability for distributed systems
- Social computing and applications
36Sharing state Distributed hash tables (DHT)
- Hashtable API: put(obj, key), obj <- get(key)
- Layered on top of a structured overlay
- Scalability, Robustness
- Persistent storage
- High availability
- Examples: Chord/CFS, Pastry/PAST, Bamboo, Kelips, Kademlia
37Distributed hash table (DHT)
- Operations: insert(k,v), v = lookup(k)
- Structured overlay maps keys to nodes
- Decentralized and self-organizing
- Scalable, robust
[Figure: tuples (k1,v1) through (k6,v6) spread across the nodes of an overlay network]
38DHT Insertion and replication
- Storage invariant: tuple replicas are stored on the r nodes with nodeIds closest to key
[Figure: Insert(key, value, r) with r = 4]
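The storage invariant can be sketched as a toy, single-process DHT. Everything here is an assumption made for illustration: the class `ToyDHT`, the integer ids, and the fact that a lookup consults the current replica set directly instead of routing through an overlay. Re-replication after a failure is deliberately left out.

```python
ID_SPACE = 1 << 160


def ring_distance(a: int, b: int) -> int:
    """Shortest distance between two ids on the circular id space."""
    d = abs(a - b) % ID_SPACE
    return min(d, ID_SPACE - d)


def replica_set(key: int, live_node_ids: list[int], r: int) -> list[int]:
    """The r live nodes whose nodeIds are closest to the key."""
    return sorted(live_node_ids, key=lambda n: (ring_distance(n, key), n))[:r]


class ToyDHT:
    """In-memory stand-in for a DHT; each node is just a dictionary."""

    def __init__(self, node_ids: list[int], r: int):
        self.store = {n: {} for n in node_ids}
        self.r = r

    def insert(self, key: int, value: str) -> None:
        for n in replica_set(key, list(self.store), self.r):
            self.store[n][key] = value

    def lookup(self, key: int):
        for n in replica_set(key, list(self.store), self.r):
            if key in self.store[n]:
                return self.store[n][key]
        return None

    def fail(self, node_id: int) -> None:
        """Simulate a node failure; its replicas are lost."""
        del self.store[node_id]
```

After a failure the replica set shifts to the next closest node, which does not yet hold the tuple; a real system would restore the invariant by copying the tuple there.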
39DHT Lookup
- Object located in log_16 N steps (expected)
- Usually locates the replica nearest to client C
[Figure: Lookup(key) issued by client C reaches one of the r replicas of Key]
40DHT Dynamic caching
- Nodes cache tuples in the unused portion of their allocated disk space
- Tuples cached on nodes along the route of lookup and insert messages
- Goals
- maximize query throughput for popular tuples
- balance query load
- improve client latency
41DHT Dynamic caching
[Figure: Lookup(key) routes toward Key, shown in delay space]
42Coordination Decentralized group management
- E.g., SCRIBE [Rowstron et al., JSAC 02]
- Spanning trees embedded in structured overlay
- Multicast, anycast primitives
- Scalable: large numbers of groups and members, wide range of members per group, dynamic membership
43Cooperative group communication
- Operations: create(g), join(g), leave(g), multicast(g,m), anycast(g,m)
- groupId g mapped to n0
- decentralized membership
- robust, scalable
[Figure: nodes n0 to n4; n0 records g: n1, n2; n2 records g: n3, n4; n1, n3 and n4 are members of g]
44Scribe
[Figure: Join(groupId) messages route toward groupId, shown in delay space]
45Structured overlay APIs
- DHT: insert(k,v), v = lookup(k)
- SCRIBE / DOLR: create(g), join(g), leave(g), multicast(g,m), anycast(g,m)
- KBR: route(M, X)
[Dabek et al., IPTPS 05]
46Outline
- Decentralized systems state-of-the-art
- Overlays, object lookup, routing
- Example Pastry
- Shared state and coordination: DHTs and Scribe/DOLR
- Challenges: malicious participants
- Putting it all together ePOST
- Accountability for distributed systems
- Social computing and applications
47Malicious participants threats
- Prevent messages from reaching root
- drop or corrupt
- bias routing tables
- Cause objects to be placed on faulty nodes
- choose nodeId values
- use many identities (Sybil attack)
- impersonate root
[Figure: overlay nodes A, B, C, F, I, J, L and a key]
52Securing routing
- Secure node identifier assignment
- thwarts Sybil and id choosing attacks
- Secure membership protocol
- Prevents routing table bias attacks
- Secure routing primitive
- Prevents root impersonation
- Can tolerate up to 25% malicious nodes
[Figure: overlay nodes A, B, C, F, I, J and a key]
53Securing routing
- Secure routing primitive
- Prevents root impersonation
[Castro et al., OSDI 02]
[Figure: overlay nodes A, B, C, F, I, J, K, L, M and a key, with the annotation "F is my neighbor"]
54Other threats
- Freeloading: incentive mechanisms
- Data corruption: crypto
- Denial-of-service
- Several defenses needed
55Outline
- Decentralized systems state-of-the-art
- Overlays, object lookup, routing
- Example Pastry
- Shared state and coordination: DHTs and Scribe/DOLR
- Challenges: malicious participants
- Putting it all together ePOST
- Accountability for distributed systems
- Social computing and applications
56Putting it all together ePOST
- Decentralized, cooperative email service
- Based on users' desktops/notebooks
- Messages transmitted and stored securely
- Standard mail clients (IMAP/POP)
- Interoperability via SMTP
- Nodes may fail arbitrarily
- Users only trust their local node
[Mislove et al., EuroSys 06]
57Why Email?
- Demanding user expectations
- Privacy
- Integrity
- Durability
- Availability
- Goal: Demonstrate that a decentralized, cooperative email service can be built that users can entrust with their production email
58ePOST Single-copy store
- Emails split into MIME components, stored in the DHT
- Each component uses its content hash as the key
- Self-certifying (integrity)
- Identical items stored once
- Convergent encryption
- Items replicated thrice for availability
- Additional erasure-coded replicas for durability (Glacier [Haeberlen et al., NSDI 05])
[Figure: Email Data consisting of Header, Body and Attachment components]
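A content-addressed store of the kind described for the single-copy store can be sketched as follows. This is not ePOST's code: the class `ContentStore`, the use of SHA-256 and the in-memory dictionary standing in for the DHT are assumptions, and convergent encryption is omitted.

```python
import hashlib


class ContentStore:
    """Toy single-copy store: the key of an item is the hash of its content."""

    def __init__(self):
        self.items = {}  # stands in for the DHT

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Identical content hashes to the same key, so it is stored only once.
        self.items.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        data = self.items[key]
        # Self-certifying: the reader recomputes the hash and compares it
        # with the key, so a corrupted or substituted item is detected.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("content does not match its key")
        return data
```

With this scheme a message only needs to carry the keys of its components; an attachment sent to many recipients occupies a single entry.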
59ePOST Single-writer log
- Per-user metadata (folders, inbox, etc.) stored as an update log
- All updates performed by owner
- Stored in the DHT
- Entries form a hash chain
- Log head is signed with owner's key
- Periodic snapshots stored in log
[Figure: Log Head followed by Log Entries ("Insert msg x", "Insert msg y", "Mark msg y read") that reference Email Data (Header, Body, Attachments)]
60ePOST Message Delivery
- Message notifications are signed and contain encrypted headers and keys to the message's components
- Each user has a Scribe group
- Node joins user's group if it has a message for the user
- User announces to the group when online
- Pending notifications delivered
61ePOST Security
- Users have certificates (public key, node id)
- Secure communication (SSL)
- All content stored in the DHT is protected
- Authenticity
- Integrity
- Privacy
- Incentives to prevent freeloading (Scrivener [Nandi, Middleware 05])
- Secure KBR
62Deployment and Experience
- Rice / MPI rings: reserved for internal members
- PlanetLab ring: open membership ring, backed by PlanetLab
- Usage
- 26 internal users (16 used ePOST as primary email) over more than two years
- 40 DHT nodes (Rice / MPI ring), 350 nodes (PlanetLab ring)
- Several times, ePOST was available when Rice or MPI-SWS email had failed
- No system-wide outages after initial testing phase
- Shut down due to overhead of tracking spam filtering
63Decentralized systems challenges
- Maintaining mutable distributed state remains hard
- Fortunately, lots of useful applications don't require it
- Incentives are basis for cooperation
- Strategy-proof protocols (e.g. tit-for-tat)
- Accountability
- Need to control membership
- Certified identities (background check or fee)
- proof-of-work, social networks?
64Decentralized systems challenges
- Need to protect data
- Durability requires non-decreasing membership
- Scalable storage, high availability, churn resilience: pick two [Blake & Rodrigues, HotOS-IX]
- Manageability
- Self-organization reduces administrative effort
- Hardware management is decentralized
- BUT: Evidence that lack of centralized control may make it difficult to manage system-wide disruptions
65Outline
- Decentralized systems state-of-the-art
- Accountability for distributed systems
- Why accountability?
- What is accountability?
- How can we implement it?
- How well does it work?
- Accountable virtual machines
- Social computing and applications
66Byzantine faults occur in practice
- Not all faults cause a node to stop
- The faulty node continues to operate, but its behavior deviates from that of a correct node
- Examples
- Hardware malfunction
- Misconfiguration
- Software error
- External security attack
- Intentional software modification
67Example LAX airport outage
- Aug 2007: 17,000 passengers stranded at LAX
- Cause: intermittent fault of a network card
68Example Botnets in the Internet
- Compromised computer targets different domain
- Admin A must localize fault, then convince admin B that her machine is faulty
[Figure: administrative domains A and B]
69Example Insider attack
- Mar 2002: UBS PaineWebber admin disrupts trade for days to weeks
- Difficult to detect, defuse logic bombs
[Figure: a single administrative domain]
70Why is detecting faults difficult?
- How to detect faults?
- How to identify the faulty node?
- How to convince others that a node is (not) faulty?
[Figure: an incorrect message reaches a node; the responsible admin must be found]
71Learning from the 'offline' world
- Relies on accountability
- Example: Banks
- Record can be used to (manually) detect, identify and convince
- Is accountability useful in distributed systems?
- Is it practical?
72What does accountability mean?
- Accountability = tamper-evident record + automated, reliable fault detection
73Is accountability alone useful?
- No, if faults are severe and irrecoverable
- need Byzantine fault tolerance (see Lorenzo's course)
- Yes, for
- systems that provide best-effort service
- systems that assume crash failures
- systems that mask severe/irrecoverable faults
- Accountability
- reliably detects and localizes faults
- provides incentives to avoid faults
- builds trust, reputation
74Which Systems can benefit?
- Internet services (BGP, DNS, NTP, NNTP, SMTP)
- Web services
- Content distribution networks (CDN)
- Grid computing
- Peer-to-peer systems
- Multi-player games
- Cloud computing
75Butler Lampson on accountability
- "Don't forget that in the real world, security depends more on police than on locks, so detecting attacks, recovering from them, and punishing the bad guys are more important than prevention."
-- Butler Lampson, "Computer Security in the Real World", ACSAC 2000
76Outline
- Decentralized systems state-of-the-art
- Accountability for distributed systems
- Why accountability?
- What is accountability?
- How can we implement it?
- How well does it work?
- Accountable virtual machines
- What's next? Social computing and applications
77Ideal accountability
- Fault: node deviates from expected behavior
- Our goal is to automatically
- detect faults
- identify the faulty nodes
- convince others that a node is (or is not) faulty
- Can we build a system that provides the following guarantee?
- Whenever a node is faulty in any way, the system generates a proof of misbehavior against that node
78Can we detect all faults?
- Problem: faults that affect only a node's internal state
- Would require online trusted probes at each node
- Focus on observable faults
- Faults that affect a correct node
- Can detect observable faults without requiring trusted components
[Figure: nodes A and C]
79Can we always get a proof?
- Problem: He-said-she-said
- Three possible causes
- A never sent X
- B refuses to acknowledge X
- X was lost by the network
- Cannot get proof of misbehavior!
- Generalize to verifiable evidence
- a proof of misbehavior, or
- a challenge that a faulty node cannot answer
- What if the challenged node does not respond?
- Does not prove a fault, but node is suspected until it responds
[Figure: A says "I sent X!", B says "I never received X!", and C cannot tell which is true]
80Practical accountability
- We propose the following requirement for an accountable distributed system
- Whenever a fault is observed by a correct node, the system eventually generates verifiable evidence against a faulty node
- This is useful
- Any (!) fault that affects a correct node is eventually detected and linked to a faulty node
- It can be implemented in practice
81Outline
- Decentralized systems state-of-the-art
- Accountability for distributed systems
- Why accountability?
- What is accountability?
- How can we implement it?
- How well does it work?
- Accountable virtual machines
- Social computing and applications
82An implementation PeerReview
- Adds accountability to a given system
- Implemented as a library
- Provides tamper-evident record
- Detects faults via state-machine replay
- Assumptions
- Nodes can be modeled as deterministic state machines
- Nodes have reference implementations of the state machines
- Correct nodes can eventually communicate
- Nodes can sign messages
83PeerReview from 10,000 feet
- All nodes keep logs of their inputs & outputs
- Including all messages
- Each node has a set of witnesses, which audit the node periodically
- If the witnesses detect misbehavior, they
- generate evidence
- make the evidence available to other nodes
- Other nodes check evidence, report fault
[Figure: nodes A and B with their logs and a message M; A's witnesses C, D and E announce "A is faulty"]
84PeerReview detects tampering
- What if a node modifies its log entries?
- Log entries form a hash chain
- Inspired by secure histories [Maniatis 02]
- Hash is included with every message (authenticator) -> node commits to its current state -> changes are evident
[Figure: A sends Message with Hash(log) to B; B returns ACK with Hash(log)]
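How a hash chain makes changes evident can be shown with a short sketch. It is not PeerReview's log format: the entry layout, SHA-256, and the names `HashChainLog` and `matches_commitment` are assumptions, and the signature that would turn a hash into an authenticator is left out.

```python
import hashlib

GENESIS = b"\x00" * 32  # hash value that precedes the first entry


def entry_hash(prev_hash: bytes, seq: int, payload: bytes) -> bytes:
    """Each entry's hash covers the previous hash, so it commits to the whole prefix."""
    return hashlib.sha256(prev_hash + seq.to_bytes(8, "big") + payload).digest()


class HashChainLog:
    def __init__(self):
        self.payloads = []
        self.top = GENESIS

    def append(self, payload: bytes) -> bytes:
        """Append an entry and return the new top hash.

        In a real system this hash would be signed and attached to the
        outgoing message as the authenticator.
        """
        self.top = entry_hash(self.top, len(self.payloads), payload)
        self.payloads.append(payload)
        return self.top


def matches_commitment(payloads: list, seq: int, committed_hash: bytes) -> bool:
    """Recompute the chain up to entry `seq` and compare with an earlier commitment."""
    if seq >= len(payloads):
        return False
    h = GENESIS
    for i in range(seq + 1):
        h = entry_hash(h, i, payloads[i])
    return h == committed_hash
```

A witness that holds a hash received earlier can later fetch the log and check it; if the node rewrote any entry at or before that point, the recomputed hash no longer matches.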
85PeerReview detects omission
- What if a node omits log entries?
- While inspecting A's log, A's witnesses send msg authenticators signed by B to B's witnesses
- Thus, witnesses learn about all messages their node has ever sent or acknowledged
- Omission of a message from the log is a fault
[Figure: authenticators for message M signed by B flow from A's log via A's witnesses to B's witnesses]
86PeerReview detects inconsistencies
- What if a node
- keeps multiple logs?
- forks its log?
- Witnesses check whether all msg authenticators form a single hash chain
- Two authenticators not connected by a log segment indicate a fault
87PeerReview detects faults
- How to recognize faults?
- Assumption
- Nodes can be modeled as deterministic state machines
- To audit a node, witness
- Fetches signed log
- Replays inputs to a trusted copy of the state machine
- Checks outputs against the log
[Figure: the node's state machine (Module A, Module B) records Network inputs and outputs in the Log; the witness feeds the logged Input to its own copy of Module A and Module B and compares the resulting Output with the log]
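The audit step can be sketched as a replay loop. The log format (a list of input/output pairs), the `Counter` reference state machine and the function `audit` are hypothetical and only illustrate the idea of replaying inputs against a trusted, deterministic copy.

```python
from typing import Callable, Optional


class Counter:
    """Hypothetical reference state machine: replies with the running total."""

    def __init__(self):
        self.total = 0

    def step(self, x: int) -> int:
        self.total += x
        return self.total


def audit(log: list, reference_factory: Callable[[], Counter]) -> Optional[int]:
    """Replay logged inputs on a fresh reference copy.

    Returns the index of the first entry whose logged output differs from
    what the reference implementation produces, or None if the log is
    consistent with correct behavior.
    """
    ref = reference_factory()
    for i, (logged_input, logged_output) in enumerate(log):
        if ref.step(logged_input) != logged_output:
            return i
    return None
```

Replay only works because the state machine is deterministic: given the same inputs in the same order, a correct node and the witness's copy must produce identical outputs.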
88PeerReview guarantees
- Observable faults will be detected
- Good nodes cannot be accused
- Formal definitions and proof in the TR
- If a node commits a fault and has a correct witness, then the witness obtains
- a proof of misbehavior (PoM), or
- a challenge that the faulty node cannot answer
- If node is correct
- there can never be a PoM, and
- it can answer any challenge
89PeerReview is widely applicable
- App 1: NFS server in the Linux kernel
- Many small, latency-sensitive requests
- Tampering with files
- Lost updates
- Metadata corruption
- Incorrect access control
- App 2: Overlay multicast
- Transfers large volume of data
- Freeloading
- Tampering with content
- App 3: P2P email
- Complex, large, decentralized
- Denial of service
- Attacks on DHT routing
- More information in [Haeberlen et al., SOSP 07]
90Outline
- Decentralized systems state-of-the-art
- Accountability for distributed systems
- Why accountability?
- What is accountability?
- How can we implement it?
- How well does it work?
- Accountable virtual machines
- Social computing and applications
91How much does PeerReview cost?
- Log storage
- 10 - 100 GByte per month, depending on application
- Message signatures
- Message latency (e.g. 1.5ms RTT with RSA-1024)
- CPU overhead (embarrassingly parallel)
- Log/authenticator transfer, replay overhead
- Depends on witnesses
- Can be deferred to exploit bursty/diurnal load patterns
92P2p email, dedicated witnesses
[Figure: average traffic (Kbps/node, axis from 0 to 100) for the baseline and for W = 1 to 5 dedicated witnesses, broken down into baseline traffic, signatures and ACKs, and checking logs]
- Dominant cost depends on number of witnesses W
- O(W^2) component
93P2p email, mutual auditing
- Small probability of error is inevitable
- Example: Replication
- Can use this to optimize PeerReview
- Accept that an instance of a fault is found only with high probability
- Asymptotic complexity: O(N^2) -> O(log N)
[Figure: a small random sample of peers is chosen as witnesses for each node]
94PeerReview is scalable
[Figure: average traffic (Kbps/node) vs. system size (nodes) for the email system without accountability, with PeerReview (P=1.0) and with PeerReview (P=0.999999), compared against DSL/cable upstream capacity; curves annotated O(log N) and O((log N)^2)]
- Assumption: up to 10% of nodes can be faulty
- Probabilistic guarantees provide scalability
- Example: email system scales to over 10,000 nodes with P=0.999999
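A back-of-the-envelope calculation shows why a small random witness set can suffice. It rests on a simplifying assumption that is not taken from the slides: witnesses are drawn independently and uniformly at random, so a set of W witnesses is entirely faulty with probability f^W when a fraction f of all nodes is faulty. The function `witnesses_needed` is a hypothetical helper; exact fractions are used to avoid floating-point rounding at the boundary.

```python
from fractions import Fraction


def witnesses_needed(faulty_fraction: Fraction, target: Fraction) -> int:
    """Smallest W such that at least one witness is correct with probability >= target.

    Under the independence assumption, all W witnesses are faulty with
    probability faulty_fraction ** W, which must not exceed 1 - target.
    """
    if not (0 < faulty_fraction < 1) or not (0 < target < 1):
        raise ValueError("both arguments must lie strictly between 0 and 1")
    miss = 1 - target
    w = 1
    while faulty_fraction ** w > miss:
        w += 1
    return w


# With 10% faulty nodes and a target of 0.999999, six witnesses are enough
# under this simplified model.
w = witnesses_needed(Fraction(1, 10), Fraction(999999, 1000000))
```

In this model the required witness count depends on the faulty fraction and the target probability rather than on the system size, in contrast to a deterministic guarantee that must cover every possible set of faulty witnesses.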
95PeerReview summary
- Accountability is a new approach to handling faults in distributed systems
- detects faults
- identifies the faulty nodes
- produces evidence
- PeerReview: A system that enforces accountability
- Offers provable guarantees and is widely applicable
- Details in [Haeberlen et al., SOSP 07]
96Challenges
- Tension between accountability and privacy
- PeerReview (PR) requires disclosure to witnesses
- Zero-knowledge proofs?
- Fault detection
- PR uses state-machine replay for fault detection
- Can't detect deterministic software bugs
- Different implementations of underspecified protocols may diverge
- Protocol specification or abstract model?
97Challenges (cont'd)
- Message signatures
- PR assumes a public-key infrastructure
- Web-of-trust (physical network, social network)?
- Partial deployment
- Accountability zones, gateways?
- PR requires source code modifications
- To enable deterministic replay
- Accountable virtual machines?
98NetReview
- Accountability applied to inter-domain routing
- Fault detection based on a spec of the routing policy
- Web-of-trust-based certificates
- Auditing limited to peering partners
- Partial deployment: accountability zones
- Details in [Haeberlen et al., NSDI 09]
99Outline
- Decentralized systems state-of-the-art
- Accountability for distributed systems
- Why accountability?
- What is accountability?
- How can we implement it?
- How well does it work?
- Accountable virtual machines
- What's next? Social computing and applications
100Accountable virtual machines (AVM)
- Make unmodified binary VMs accountable
- VMM provides deterministic logging/replay
[Figure: an AVM runs an unmodified binary VM on top of an accountable VMM, which keeps a log and attaches authenticators to packets]
101What are AVMs good for?
- Accountability for proprietary/legacy software
- Accountable cloud computing
- Customer can verify correct execution
- Making an entire host computer accountable
- Check for compromised software
- Forensics
102Trusted network probes
- Making the Internet accountable, one host at a time
- Chain of authenticators validates log
[Figure: an accountable workstation with a secure log sends packets with authenticators through the cable/DSL modem or the ISP's DSLAM to the Internet]
103Related Work
- Accountability: [Lampson 00], [Yumerefendi & Chase 05], [Yumerefendi et al. 07], [Argyraki et al. 07], [Michalakis et al. 07]
- Practical Byzantine fault tolerance: [Castro & Liskov 00], [Ramasamy 07]
- General fault detection: [Kihlstrom et al. 07], [Doudou et al. 99], [Malkhi & Reiter 97]
- Intrusion detection, reputation systems: [Denning 87], [Ko et al. 94], [Kamvar et al. 03]
- Trusted computing: [Garfinkel et al. 02]
- Fault-specific defenses: [Cox & Noble 03], [Waldman & Mazieres 03]
- Tamper-evident logs: [Schneier & Kelsey 98], [Maniatis & Baker 02]
104Conclusion
- Byzantine faults in distributed systems are real
- Accountability is a new approach to handling faults
- detects observable faults
- identifies the faulty node
- produces verifiable evidence
- Presented a practical definition of accountability
- Practical implementations exist
- Many challenges remain
105Outline
- Decentralized systems state-of-the-art
- Accountability for distributed systems
- Social computing and applications
- Exploiting social networks for distributed computing
- Example: enhancing Web search
- Example: thwarting unwanted communication
106From service-centric to user-centric computing
- Collaborative, social computing and communication
- In peer-to-peer, users share technical resources
- In social computing, users share knowledge,
opinions, referrals, ratings
107User-centric, social computing
- Mass collaboration, enabled by technology
- Human intelligence aggregated through technology
- User contribution is the most important resource
- (Underutilized resource of enormous scale?)
- BUT Outcome depends on user behavior
- depends on cooperation, good will
- vulnerable to spoilers
108Social networks two concepts
- Users contribute
- Content
- Opinions, recommendations, ratings (explicit or implicit)
- Users form social networks
- Graph connecting users (explicit or implicit)
- Links imply shared interest or trust
109What are social networks?
- Graphs connecting people
- Edges connect friends
- Imply shared interest or trust
- Online friends may have never met in real life
- E.g., email, Skype, IM
- Online social networking sites
- Network hosted by a Web site
- Often used to share opinions, advice, ratings,
multimedia content
Social Network
Online Social Network
110Huge opportunity
- to leverage collective user input, e.g.
- to deal with unwanted communication
- to thwart security attacks
- to enable better organization, filtering, search, ranking, and distribution of content
- may provide an answer to the ever-increasing flood of information
111Outline
- Decentralized systems state-of-the-art
- Accountability for distributed systems
- What's next? Social computing and applications
- Exploiting social networks for distributed computing
- Example: enhancing Web search
- Example: thwarting unwanted communication
112What's it got to do with Systems?
- Social networks enhance distributed systems
- Sybil attacks
- Unwanted communication
- Personalization
- Social computing may need distribution
- Privacy
- Avoid dependence on a single provider
113Leveraging social networks to enhance systems
- Trust can help thwart security problems
- Sybil attacks [SybilGuard, SIGCOMM 06]
- Clones unlikely to have diverse links
- Trust can help block unwanted communication
- Friends unlikely to send spam [RE, NSDI 06]
- Using social networks to thwart spam (Ostra)
- Shared interest can improve search
- Web search [PeerSpective, HotNets 06]
- Related users likely to visit relevant content
114Leveraging social networks More ideas
- Sharing solutions and problem fixes
- Configurations that work
- Fixes that others have found
- Copy what works for others
- Combine technology and social networks to truly stand on the shoulders of giants
- Answer to the increasing complexity of the information age?
115Outline
- Decentralized systems state-of-the-art
- Accountability for distributed systems
- What's next? Social computing and applications
- Exploiting social networks for distributed computing
- Example: enhancing Web search
- Example: thwarting unwanted communication
116Example social network based Web search
- PeerSpective experiment
- Idea: users can query their friends' previously viewed pages
- Results from friends appear alongside Google results
[Figure: screenshot with Google results and PeerSpective results side by side]
117PeerSpective implementation
- Prototype is a lightweight HTTP proxy
- Runs on the user's desktop and indexes all browsed content
- When a Google search is performed,
- query other PeerSpective proxies in parallel with Google
- present PeerSpective results alongside Google results
[Figure labels: PeerSpective (three instances)]
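The query flow on this slide can be sketched as follows. This is a minimal illustration, not the PeerSpective prototype: the names `LocalIndex` and `search_all`, the word-matching index, and the callable-based peer interface are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


class LocalIndex:
    """Hypothetical per-user index of browsed pages (word-level matching only)."""

    def __init__(self):
        self.pages = {}  # url -> set of lowercase words seen on that page

    def add_page(self, url, text):
        self.pages[url] = set(text.lower().split())

    def search(self, query):
        words = query.lower().split()
        return [url for url, content in self.pages.items()
                if all(w in content for w in words)]


def search_all(query, web_search, peer_searches):
    """Query the Web search engine and all peer indexes in parallel.

    web_search and every entry of peer_searches are callables that take a
    query string and return a list of URLs (an assumed interface).
    """
    with ThreadPoolExecutor(max_workers=1 + len(peer_searches)) as pool:
        web_future = pool.submit(web_search, query)
        peer_futures = [pool.submit(search, query) for search in peer_searches]
        web_results = web_future.result()
        peer_results = []
        for future in peer_futures:
            try:
                urls = future.result()
            except Exception:
                continue  # skip peers that are offline or fail
            for url in urls:
                if url not in peer_results:  # drop duplicates across peers
                    peer_results.append(url)
    # The two result lists are kept separate so they can be shown side by side.
    return {"web": web_results, "peers": peer_results}
```

Keeping the Web and peer results in separate lists mirrors the slide's "alongside" presentation; how the real system ranks or merges them is not specified here.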
118PeerSpective results summary
- Explored potential of integrating Web and social network search
- Evidence that PeerSpective added value
- Additional coverage for viewed sites
- Improved ranking of results
- Aided in finding content serendipitously
- However, just an experiment
- Many challenges remain
- Opportunities as well
- Details in Mislove et al., HotNets '06
119Outline
- Decentralized systems state-of-the-art
- Accountability for distributed systems
- What's next? Social computing and applications
- Exploiting social networks for distributed computing
- Example: enhancing Web search
- Example: thwarting unwanted communication
120Unwanted communication
- Well-known problem
- Email spam
- Increasingly affects other systems
- Search-engine spam
- Mislabeled videos plaguing YouTube
- Unwanted invitations in Skype
- Existing solutions insufficient
- Content filtering for videos?
121Known defenses
- Content filtering
- Works very well for email, but
- False positives reduce communication reliability
- Doesn't work for multimedia
- Holding senders accountable
- Requires strong user identities
- Imposing a per-communication cost
- Refunded if communication is wanted
- Requires micro-payments/quota market
122Ostra: Using social relationships
- Assumptions
- Cost for acquiring and maintaining social links
- Cannot create links arbitrarily fast
- Cannot maintain arbitrary number of links
- Receivers are willing to classify content
- Explicit (Junk button)
- Implicit (Deletion, response)
123Ostra: Pair-wise credit exchange
[Figure: link annotated "-2 0 2"]
- Credit balance/bound associated with each link
- Credit balances decay at constant rate (10/day)
- Sum of all credit = 0 (invariant)
124Ostra: Pair-wise credit exchange
[Figure: link to the receiver annotated "-2 0 2"]
- Message unwanted -> sender pays receiver one credit
- Sending spam exhausts sender's link balance
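The pair-wise rule above can be modeled in a few lines. This is a simplified sketch, not Ostra's actual protocol: the class name `Link`, the symmetric bound of 2 (taken from the values in the figure), and the omission of credit decay are assumptions made for illustration.

```python
class Link:
    """Credit state of one social link between two users (simplified model)."""

    def __init__(self, a, b, bound=2):
        self.bound = bound
        # One balance per endpoint; the two always sum to zero (the invariant).
        self.balance = {a: 0, b: 0}

    def can_send(self, sender):
        # The sender must still be able to pay for one more unwanted message.
        return self.balance[sender] > -self.bound

    def send(self, sender, receiver, unwanted):
        """Return True if the message is delivered.

        If the receiver classifies it as unwanted, the sender pays the
        receiver one credit on this link.
        """
        if not self.can_send(sender):
            return False
        if unwanted:
            self.balance[sender] -= 1
            self.balance[receiver] += 1
        return True
```

With a bound of 2, a sender whose messages are all classified as unwanted gets two of them delivered before the link refuses further messages; wanted messages cost nothing in this model.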
125Ostra: End-to-end credit exchange
[Figure: links annotated "-2 -1 2", "-2 0 2", "-2 1 2", "-2 0 2"]
Rate of spam a user can send is proportional
to number of links (s)he has
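One way to picture the end-to-end case is to apply the pair-wise charge on every link of a path between sender and receiver. The sketch below assumes that reading; the function names, the `balances` dictionary keyed by directed user pairs, and the bound of 2 are hypothetical, and credit decay is ignored.

```python
from collections import defaultdict


def send_unwanted(balances, path, bound=2):
    """Deliver one unwanted message along path (sender first, receiver last).

    balances[(u, v)] is u's credit balance on its link with v.
    Returns False if any link on the path is already exhausted.
    """
    hops = list(zip(path, path[1:]))
    if any(balances[(u, v)] <= -bound for u, v in hops):
        return False
    for u, v in hops:
        balances[(u, v)] -= 1  # upstream user pays one credit ...
        balances[(v, u)] += 1  # ... to the downstream user
    return True


def net_credit(balances, user):
    """Sum of a user's balances over all of their links."""
    return sum(b for (u, _), b in balances.items() if u == user)


balances = defaultdict(int)
results = [send_unwanted(balances, ["S", "A", "R"]) for _ in range(3)]
```

In this model the intermediate user A gains on one link what it loses on the other, so only the sender ends up in debt, and the sender's first-hop link is what limits further messages. A sender with more links has more first hops to draw on, which matches the slide's claim that the spam rate grows with the number of links.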
126Sybil attacks are not effective
Sybils
Total unwanted communication by Sybils is bounded
by the number of links with other users
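The bound stated on this slide can be checked with a small count. The sketch assumes that links among Sybils cost the attacker nothing, that only links to other users are enforced, and that credit decay is ignored; `max_spam` and its parameters are hypothetical names.

```python
def max_spam(num_sybils, attack_links, bound=2):
    """Count unwanted messages Sybils can deliver before every link to
    another user is exhausted.

    attack_links: list of (sybil, other_user) pairs. Links among Sybils are
    treated as free, since the attacker controls both ends.
    """
    balance = {link: 0 for link in attack_links}  # Sybil side's balance
    delivered = 0
    for _ in range(num_sybils):
        for link in attack_links:
            # Every message must cross one of these links and costs a credit.
            while balance[link] > -bound:
                balance[link] -= 1
                delivered += 1
    return delivered
```

For three such links and a bound of 2, the count is 6 whether the attacker runs one Sybil or a thousand: adding identities does not add credit.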
127Ostra
- Thwarts unwanted communication in existing systems
- Examples: Email, Skype, IM, YouTube
- Uses existing relationships among users
- Online social networks
- Graph of email/IM/Skype users
- Does not require strong user identities
- Does not rely on automatic content classification
- Respects recipient's idea of wanted/unwanted communication
- Details in Mislove et al., NSDI '08
128SN and applications research agenda
- Measurement/Analysis
- Theory of complex networks
- Empirical study of social networks
- Understanding SN evolution
- Understanding SN information flow
- Design
- Personalized search, filtering, content distribution
- Using social networks to thwart unwanted behavior
- Online social networks and privacy
129Outline
- Decentralized systems state-of-the-art
- Accountability for distributed systems
- Social computing and applications
- Exploiting social networks for distributed computing
- Example: enhancing Web search
- Example: thwarting unwanted communication
130Max Planck Institute for Software Systems (MPI-SWS)
- Part of Max Planck Society
- Academic research institute, publicly funded
- Focus on basic research
- Kick-off in Aug 2005
- 17 faculty positions (tenure-track)
- 100 doctoral/post-doc positions
- Administrative and technical support staff
- Top international research institution
131 MPI-SWS Faculty
- Research areas: Dependable systems; Program analysis and verification; Networked systems; Large scale Internet systems; Functional Programming; Security and Cryptography
- Faculty: Rodrigo Rodrigues, Andrey Rybalchenko, Krishna Gummadi, Paul Francis, Peter Druschel, Derek Dreyer, Michael Backes (Fellow)
132Graduate program (MS/PhD)
- Advised by MPI-SWS faculty
- Stimulating, competitive environment
- International, diverse student body (80)
- English language
- Financial aid
- Internships available
http://www.mpi-sws.org
133Thanks for your attention!