Last class review - PowerPoint PPT Presentation

About This Presentation
Title:

Last class review

Slides: 54
Provided by: Daniel84

Transcript and Presenter's Notes

Title: Last class review


1
Last class review
  • Parallel processing
  • MPI

2
Today
  • Grid Computing
  • Peer to peer computing (P2P)

3
Software Trends
  • Figure: application complexity plotted against time (years),
    1970 to 2000
  • Chart labels: monolithic, structured programming,
    object-oriented programming, classes, client-server,
    component programming, multi-tier server-side,
    P2P computing, ubiquitous/pervasive, Grid computing
4
Grid Computing
  • High performance distributed applications in
    large-scale internetworks
  • Coordinated resource sharing and problem solving
    in dynamic, multi-institutional virtual
    organizations

5
Network Exponentials
  • Network vs. computer performance
      • Computer speed doubles every 18 months (Moore's
        law)
      • Network speed doubles every 9 months
      • Difference: an order of magnitude every 5 years
  • 1986 to 2000
      • Computers: x 500
      • Networks: x 340,000
  • 2001 to 2010
      • Computers: x 60
      • Networks: x 4000
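
The "order of magnitude per 5 years" figure follows directly from the two doubling periods. A minimal sketch of the arithmetic, assuming clean exponential growth (the function name `growth_factor` is illustrative, not from the slides):

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Factor by which a quantity grows if it doubles every given period."""
    return 2 ** (months / doubling_period_months)


FIVE_YEARS = 60  # months

computers = growth_factor(FIVE_YEARS, 18)  # doubling every 18 months
networks = growth_factor(FIVE_YEARS, 9)    # doubling every 9 months
ratio = networks / computers               # how far networks pull ahead

print(f"computers x{computers:.1f}, networks x{networks:.1f}, gap x{ratio:.1f}")
```

Over five years computers improve roughly tenfold and networks roughly a hundredfold, so the gap between them is about a factor of ten.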

6
The 13.6 TF TeraGrid: Computing at 40 Gb/s
TeraGrid/DTF: NCSA, SDSC, Caltech, Argonne
www.teragrid.org
7
International Virtual Data Grid Laboratory (iVDGL)
U.S. PIs: Avery, Foster, Gardner, Newman, Szalay
www.ivdgl.org
8
The Grid Problem
  • Flexible, secure, coordinated resource sharing
    among dynamic collections of individuals,
    institutions, and resources
  • Enable communities (virtual organizations) to
    share geographically distributed resources as
    they pursue common goals, assuming the absence
    of:
      • central location,
      • central control,
      • omniscience,
      • existing trust relationships.

9
One View of Requirements
  • Identity authentication
  • Authorization policy
  • Resource discovery
  • Resource characterization
  • Resource allocation
  • (Co-)reservation, workflow
  • Distributed algorithms
  • Remote data access
  • High-speed data transfer
  • Performance guarantees
  • Monitoring
  • Adaptation
  • Intrusion detection
  • Resource management
  • Accounting and payment
  • Fault management
  • System evolution
  • Etc.

10
Elements of the Problem
  • Resource sharing
      • Computers, storage, sensors, networks, ...
      • Sharing is always conditional: issues of trust,
        policy, negotiation, payment, ...
  • Coordinated problem solving
      • Beyond client-server: distributed data analysis,
        computation, collaboration, ...
  • Dynamic, multi-institutional virtual orgs
      • Community overlays on classic org structures
      • Large or small, static or dynamic

11
Resource Sharing Requirements
  • Members should be trustful and trustworthy.
  • Sharing is conditional.
  • Should be secure.
  • Sharing should be able to change dynamically over
    time.
  • Need for discovery and registering of resources.
  • Can be peer to peer or client/server.
  • Same resource may be used in different ways.
  • All these point to well defined architecture and
    protocols.
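
As a rough illustration of two of these requirements (registration/discovery and conditional sharing), the sketch below keeps an in-memory registry in which every resource carries its own access policy. All names here (`Resource`, `Registry`, the example resources and members) are hypothetical; this is not the Globus API, only a toy model of the idea.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Resource:
    name: str
    kind: str    # e.g. "compute" or "storage"
    owner: str
    # Conditional sharing: the owner's policy decides who may use the resource.
    policy: Callable[[str], bool]


class Registry:
    """Toy registry: resources can be registered, withdrawn, and discovered."""

    def __init__(self) -> None:
        self._resources: Dict[str, Resource] = {}

    def register(self, resource: Resource) -> None:
        self._resources[resource.name] = resource

    def unregister(self, name: str) -> None:
        # Sharing can change over time: owners may withdraw a resource.
        self._resources.pop(name, None)

    def discover(self, kind: str, member: str) -> List[str]:
        # Only resources whose policy admits this member are visible.
        return sorted(
            r.name for r in self._resources.values()
            if r.kind == kind and r.policy(member)
        )


registry = Registry()
registry.register(Resource("cluster-a", "compute", "lab1",
                           policy=lambda m: m in {"alice", "bob"}))
registry.register(Resource("archive-b", "storage", "lab2",
                           policy=lambda m: m == "alice"))

print(registry.discover("compute", "bob"))
print(registry.discover("storage", "bob"))
```

A real Grid deployment would replace the lambda policies with authenticated identities and negotiated policies, which is exactly why well-defined protocols are needed.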

12
The Globus Project: Making Grid computing a
reality
  • Close collaboration with real Grid projects in
    science and industry
  • Development and promotion of standard Grid
    protocols to enable interoperability and shared
    infrastructure
  • Development and promotion of standard Grid
    software APIs and SDKs to enable portability and
    code sharing
  • The Globus Toolkit: open source, reference
    software base for building grid infrastructure
    and applications
  • Global Grid Forum: development of standard
    protocols and APIs for Grid computing

13
Globus Toolkit
  • A software toolkit addressing key technical
    problems in the development of Grid enabled
    tools, services, and applications
  • Offer a modular bag of technologies
  • Enable incremental development of grid-enabled
    tools and applications
  • Implement standard Grid protocols and APIs
  • Make available under liberal open source license

14
Layered Grid Architecture
15
Key Protocols
  • The Globus Toolkit centers around four key
    protocols
  • Connectivity layer
      • Security: Grid Security Infrastructure (GSI)
  • Resource layer
      • Resource Management
      • Information Services
      • Data Transfer
  • Also key collective layer protocols
      • Info Services, Replica Management, etc.

16
Peer to Peer Computing
17
What is Peer-to-Peer?
  • A model of communication where every node in the
    network acts alike.
  • As opposed to the Client-Server model, where one
    node provides services and other nodes use the
    services.

18
Advantages of P2P Computing
  • No central point of failure
      • E.g., the Internet and the Web do not have a
        central point of failure.
      • Most internet and web services use the
        client-server model (e.g. HTTP), so a specific
        service does have a central point of failure.
  • Scalability
      • Since every peer is alike, it is possible to add
        more peers to the system and scale to larger
        networks.

19
Disadvantages of P2P Computing
  • Decentralized coordination
      • How to keep global state consistent?
      • Need for distributed coherency protocols.
  • Not all nodes are created equal.
      • Computing power and bandwidth have an impact on
        overall performance.
  • Programmability
      • Harder, as a corollary of decentralized
        coordination.

20
P2P Computing Applications
  • File sharing
  • Process sharing
  • Collaborative environments
  • Instant messaging
  • New forms of content delivery, distribution

21
P2P File Sharing Applications
  • Improves data availability
  • Replication to compensate for failures.
  • E.g., Napster, Gnutella, Freenet, KaZaA
    (FastTrack)

22
P2P Process Sharing Applications
  • For large-scale computations
  • Data analysis, data mining, scientific computing
  • E.g., SETI@home, Folding@home, distributed.net,
    World-Wide Computer

23
P2P Collaborative Applications
  • For remote real-time human collaboration.
  • Instant messaging, virtual meetings, shared
    whiteboards, teleconferencing, tele-presence.
  • E.g., talk, IRC, ICQ, AOL Messenger, Yahoo!
    Messenger, Jabber, MS Netmeeting, NCSA Habanero,
    Games

24
P2P Technical Challenges
  • Peer identification
  • Routing protocols
  • Network topologies
  • Peer discovery
  • Communication/coordination protocols
  • Quality of service
  • Security
  • Fine-grained resource management

25
P2P Topologies
  • Centralized
  • Ring
  • Hierarchical
  • Decentralized
  • Hybrid

26
Centralized
  • Client/server
  • Web servers
  • Databases
  • Napster search
  • Instant Messaging
  • Popular Power

27
Ring
  • Fail-over clusters
  • Simple load balancing
  • Assumption: single owner

28
Hierarchical
  • DNS
  • NTP
  • Usenet (sort of)

29
Decentralized
  • Gnutella
  • Freenet
  • Hive
  • Internet routing

30
Centralized + Centralized
  • N-tier apps
  • Database heavy systems
  • Web services gateways
  • Grand Central

31
Centralized + Ring
  • Serious web applications
  • High availability servers

32
Centralized + Decentralized
  • Clip2 Gnutella Reflector
  • FastTrack / KaZaA
  • Morpheus
  • Email

33
What about other topologies?
  • Centralized + Hierarchical?
      • Back end tree of information
      • Caching architectures
  • Decentralized + Ring?
      • P2P network of fail-over clusters
  • Decentralized + Hierarchical?
  • Decentralized + Centralized?

34
Strengths and Weaknesses
  • Plenty of topologies to choose from
  • What is each kind good for?
  • Need a set of properties to measure

35
Things to Measure
  • Manageability
  • How hard is it to keep working?
  • Information coherence
  • How authoritative is info? (Auditing,
    non-repudiation)
  • Extensibility
  • How easy is it to grow?
  • Fault tolerance
  • How well can it handle failures?
  • Security
  • How hard is it to subvert?
  • Resistance to legal or political intervention
  • How hard is it to shut down? (Can be good or bad)
  • Scalability
  • How big can it grow?

36
Centralized
  • Manageable: system is all in one place
  • Coherent: all information is in one place
  • Extensible: no one can add on to the system
  • Fault tolerant: single point of failure
  • Secure: simply secure one host
  • Lawsuit-proof: easy to shut down
  • Scalable: one machine. But in practice?

37
Ring
  • Manageable: simple rules for relationships
  • Coherent: easy logic for state
  • Extensible: only the ring owner can add
  • Fault tolerant: fail-over to next host
  • Secure: as long as the ring has one owner
  • Lawsuit-proof: shut down the owner
  • Scalable: just add more hosts
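
The ring's fault-tolerance rule, fail-over to the next host, can be sketched in a few lines. This is a minimal illustration, assuming the ring owner already knows which hosts are alive; the function name `next_live_host` and the host names are hypothetical.

```python
from typing import List, Optional, Set


def next_live_host(ring: List[str], alive: Set[str], start: int) -> Optional[str]:
    """Walk the ring from position `start` and return the first live host.

    Returns None if the ring is empty or every host is down.
    """
    for step in range(len(ring)):
        host = ring[(start + step) % len(ring)]  # wrap around the ring
        if host in alive:
            return host
    return None


ring = ["host-a", "host-b", "host-c", "host-d"]
alive = {"host-a", "host-c", "host-d"}  # host-b has failed

# A request aimed at host-b (index 1) fails over to host-c.
print(next_live_host(ring, alive, 1))
```

Adding a host is just appending to `ring`, which matches the "just add more hosts" scalability note, but only the single owner of the ring is in a position to do so.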

38
Hierarchical
  • Manageable: chain of authority
  • Coherent: cache consistency
  • Extensible: add more leaves, rebalance
  • Fault tolerant: root is vulnerable
  • Secure: too easy to spoof links
  • Lawsuit-proof: just shut down the root
  • Scalable: hugely scalable (e.g., DNS)

39
Decentralized
  • Manageable: very difficult, many owners
  • Coherent: difficult, unreliable peers
  • Extensible: anyone can join in!
  • Fault tolerant: redundancy
  • Secure: difficult, open research
  • Lawsuit-proof: no one to sue!
  • Scalable: in theory yes, in practice no

40
Centralized + Ring
  • Manageable: just manage the ring
  • Coherent: as coherent as the ring
  • Extensible: no more than the ring
  • Fault tolerant: ring is a huge win
  • Secure: as secure as the ring
  • Lawsuit-proof: still a single place to shut down
  • Scalable: ring is a huge win

Common architecture for web applications
41
Centralized + Decentralized
  • Manageable: same as decentralized
  • Coherent: better than decentralized
  • Extensible: anyone can still join!
  • Fault tolerant: plenty of redundancy
  • Secure: same as decentralized
  • Lawsuit-proof: still no one to sue
  • Scalable: looking very hopeful

Best architecture for P2P networks?
42
Centralized vs. Decentralized
  • Centralized is pretty good!
      • Manageable
      • Coherent
      • Secure
  • Decentralized is exciting
      • Extensible
      • Massive fault tolerance
      • Lawsuit-proof
  • Scalability is the big question

43
Conclusions
  • Centralized is easy to deal with
      • Major architecture for distributed systems
      • Combines well with rings
  • Decentralized is good, but needs research
      • Coherence, manageability, security
      • Scalability
  • Hierarchical is overlooked
  • Combining architectures is powerful
      • P2P does not have to be decentralized when
        centralized is good

44
Napster
  • The P2P concept has existed since the early 90s
  • Napster renewed interest in P2P systems
  • Napster features
      • Central indexing and searching service
      • File downloading in a peer-to-peer, point-to-point
        manner
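
The split between a central index and peer-to-peer downloads can be sketched as follows. This is a toy model, not Napster's actual protocol: the class `CentralIndex`, its methods, and the peer addresses are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


class CentralIndex:
    """Toy central server: it knows who has which file, but stores no files."""

    def __init__(self) -> None:
        self._peers_by_file: Dict[str, Set[str]] = defaultdict(set)

    def publish(self, peer: str, filenames: Iterable[str]) -> None:
        # A peer announces the files it is willing to share.
        for name in filenames:
            self._peers_by_file[name.lower()].add(peer)

    def unpublish(self, peer: str) -> None:
        # Called when a peer disconnects.
        for peers in self._peers_by_file.values():
            peers.discard(peer)

    def search(self, keyword: str) -> Dict[str, List[str]]:
        # Substring match on file names; returns the peers to download from.
        keyword = keyword.lower()
        return {
            name: sorted(peers)
            for name, peers in self._peers_by_file.items()
            if keyword in name and peers
        }


index = CentralIndex()
index.publish("peer1:6699", ["Blue_Song.mp3", "notes.txt"])
index.publish("peer2:6699", ["blue_song.mp3"])

print(index.search("blue"))
```

The search result only names peers; the download itself would then happen directly between the requesting peer and one of the listed peers. If the index server goes away, search stops working even though every file is still out on the peers.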

45
Gnutella
  • Peer-to-peer indexing and searching service.
  • Peer-to-peer point-to-point file downloading
    using HTTP.
  • A Gnutella node needs a server (or a set of
    servers) to start up. gnutellahosts.com
    provides a service with reliable initial
    connection points. This introduces a single
    point of failure.

46
The Gnutella protocol (v0.4)
  • PING: Notify a peer of your existence
  • PONG: Reply to a PING request
  • QUERY: Find a file in the network
  • RESPONSE: Give the location of a file
  • PUSHREQUEST: Request a server behind a firewall
    to push a file out to a client.
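
For a feel of what these messages look like on the wire, the sketch below packs and unpacks the fixed 23-byte descriptor header as I understand it from the v0.4 specification: a 16-byte descriptor ID, a 1-byte payload type, TTL, hops, and a 4-byte little-endian payload length. The type codes and layout should be checked against the specification before relying on them. The specification calls the slide's RESPONSE a QueryHit and the PUSHREQUEST a Push; the helper names `pack_header` and `unpack_header` are my own.

```python
import struct
import uuid
from typing import Optional, Tuple

# Payload type codes, as recalled from the v0.4 specification (verify before use).
PAYLOAD_TYPES = {
    "PING": 0x00,
    "PONG": 0x01,
    "PUSH": 0x40,        # "PUSHREQUEST" on the slide
    "QUERY": 0x80,
    "QUERY_HIT": 0x81,   # "RESPONSE" on the slide
}

# 16-byte ID, type, TTL, hops (1 byte each), 4-byte little-endian payload length.
HEADER = struct.Struct("<16sBBBI")


def pack_header(kind: str, ttl: int, hops: int, payload_len: int,
                descriptor_id: Optional[bytes] = None) -> bytes:
    if descriptor_id is None:
        descriptor_id = uuid.uuid4().bytes  # random 16-byte message ID
    return HEADER.pack(descriptor_id, PAYLOAD_TYPES[kind], ttl, hops, payload_len)


def unpack_header(data: bytes) -> Tuple[bytes, str, int, int, int]:
    descriptor_id, code, ttl, hops, payload_len = HEADER.unpack(data[:HEADER.size])
    kind = next(name for name, value in PAYLOAD_TYPES.items() if value == code)
    return descriptor_id, kind, ttl, hops, payload_len


header = pack_header("QUERY", ttl=7, hops=0, payload_len=12)
print(len(header), unpack_header(header)[1:])
```

The descriptor ID is what lets a node recognize a message it has already seen and route a reply back along the path the request took.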

47
Gnutella Decentralized Model
48
Gnutella Research Directions
  • Download failures
  • Scalability
  • Fragmented development
  • Encouragement of content sharing
  • Reducing browsing downtime
  • Reducing unnecessary network traffic
  • Creating and maintaining a healthy network
    structure: rebalancing, different TTL strategies,
    priorities
  • Addressing security concerns.
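
To see why TTL strategies and unnecessary traffic are research topics, the sketch below simulates a flooded query over a small in-memory overlay and counts forwarded messages. It is a simplification under stated assumptions: nodes forward to every neighbor (including the one the query came from), duplicates are detected and dropped on arrival, and the graph, file placement, and function name `flood_query` are invented for illustration.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def flood_query(graph: Dict[str, List[str]], files: Dict[str, Set[str]],
                origin: str, filename: str, ttl: int) -> Tuple[List[str], int]:
    """Breadth-first flood of a query. Returns (nodes with the file, messages sent)."""
    seen = {origin}
    hits: List[str] = []
    messages = 0
    frontier = deque([(origin, ttl)])
    while frontier:
        node, remaining = frontier.popleft()
        if filename in files.get(node, set()):
            hits.append(node)
        if remaining == 0:
            continue  # TTL exhausted: do not forward any further
        for neighbor in graph[node]:
            messages += 1             # every forwarded copy costs a message
            if neighbor not in seen:  # duplicates are dropped by the receiver
                seen.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return hits, messages


graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
files = {"c": {"x.mp3"}, "e": {"x.mp3"}}

print(flood_query(graph, files, "a", "x.mp3", ttl=1))
print(flood_query(graph, files, "a", "x.mp3", ttl=3))
```

With a TTL of 1 the query costs 2 messages and finds only the copy on `c`; with a TTL of 3 it also finds the copy on `e` but costs 9 messages, several of them duplicates. That trade-off between reach and traffic is what different TTL strategies try to tune.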

49
JXTA
  • Connecting devices and applications by providing
    common P2P services to heterogeneous devices,
    operating systems, programming languages, and
    applications
  • Open source
  • www.jxta.org

50
JXTA
  • JXTA defines a set of Protocols
  • JXTA defines XML message formats and protocols,
    for communication between peers
  • Protocols are used to discover peers, advertise
    and discover resources, communicate and route
    messages, and provide monitoring
  • Asynchronous, based on a query/response model. Can
    be implemented in any language and sent across
    different networks

51
JXTA Architecture
  • JXTA Applications
  • JXTA Services: Search, Indexing, Discovery,
    Membership
  • JXTA Core: Peer groups, Peer Pipes, Peer
    Monitoring, Peer Advertisements, Peer IDs, Security
52
Freenet
  • Peer-to-peer indexing and searching service.
  • Peer-to-peer file downloading.
  • Files served use the same route as searches (not
    point-to-point)
  • Provides for anonymity.
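
The "same route as searches" point can be illustrated with a much-simplified sketch: a request is forwarded hop by hop, and the data travels back along the reverse path, with each node on the way keeping a copy. This is not Freenet's real routing (which chooses the next hop by key closeness); here neighbors are simply tried in list order, and the graph, keys, and function name `fetch` are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple


def fetch(graph: Dict[str, List[str]], stores: Dict[str, Dict[str, bytes]],
          node: str, key: str, hops_to_live: int,
          visited: Optional[Set[str]] = None) -> Tuple[Optional[bytes], List[str]]:
    """Depth-first request for `key`. Returns (data, path taken) or (None, [])."""
    if visited is None:
        visited = set()
    visited.add(node)
    if key in stores[node]:
        return stores[node][key], [node]
    if hops_to_live == 0:
        return None, []
    for neighbor in graph[node]:
        if neighbor in visited:
            continue
        data, path = fetch(graph, stores, neighbor, key, hops_to_live - 1, visited)
        if data is not None:
            stores[node][key] = data  # cache the file on the way back
            return data, [node] + path
    return None, []


graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
stores = {"a": {}, "b": {}, "c": {"k1": b"data"}}

data, path = fetch(graph, stores, "a", "k1", hops_to_live=5)
print(data, path)
```

Node `a` only ever talks to `b`, so it cannot tell whether `b` held the file or merely relayed it, which is the basis of the anonymity claim. The cost is that the file crosses every hop of the search path instead of moving point-to-point.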

53
KaZaA/Morpheus
  • Hybrid indexing/searching model
  • Not centralized like Napster, not decentralized
    like Gnutella.
  • Peer-to-peer file downloading using HTTP.
  • SmartStream for incomplete file downloads.
  • FastStream for partial file downloads.
  • SuperNodes are elected dynamically if they have
    sufficient bandwidth and processing power (hybrid
    topology model).
  • A central server keeps user registrations, logs
    usage, and helps bootstrap peer discovery.
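
The SuperNode idea can be sketched as a simple eligibility test followed by attaching ordinary peers to the elected SuperNodes. FastTrack is proprietary, so everything below is an assumption for illustration: the thresholds, the `cpu_score` metric, the round-robin assignment, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

MIN_BANDWIDTH_KBPS = 512  # hypothetical threshold
MIN_CPU_SCORE = 50        # hypothetical threshold


@dataclass(frozen=True)
class Peer:
    name: str
    bandwidth_kbps: int
    cpu_score: int


def is_supernode(peer: Peer) -> bool:
    """A peer qualifies only if it has enough bandwidth AND processing power."""
    return (peer.bandwidth_kbps >= MIN_BANDWIDTH_KBPS
            and peer.cpu_score >= MIN_CPU_SCORE)


def build_overlay(peers: List[Peer]) -> Dict[str, List[str]]:
    """Map each SuperNode to the ordinary peers whose file lists it indexes."""
    supernodes = [p for p in peers if is_supernode(p)]
    ordinary = [p for p in peers if not is_supernode(p)]
    overlay: Dict[str, List[str]] = {s.name: [] for s in supernodes}
    for i, peer in enumerate(ordinary):
        if not supernodes:
            break  # no eligible SuperNode: nothing to attach to
        overlay[supernodes[i % len(supernodes)].name].append(peer.name)
    return overlay


peers = [
    Peer("p1", 2000, 80),
    Peer("p2", 56, 90),    # fast CPU but slow link: stays ordinary
    Peer("p3", 1500, 60),
    Peer("p4", 256, 20),
    Peer("p5", 128, 40),
]
overlay = build_overlay(peers)
print(overlay)
```

Searches then run among the SuperNodes only, which is what makes the model a hybrid: centralized within each SuperNode's group, decentralized across SuperNodes.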