Advanced Operating Systems - PowerPoint PPT Presentation

About This Presentation
Title:

Advanced Operating Systems

Description:

Advanced Operating Systems Lecture 9: Distributed Systems Architecture University of Tehran Dept. of EE and Computer Engineering By: Dr. Nasser Yazdani – PowerPoint PPT presentation

Transcript and Presenter's Notes

Title: Advanced Operating Systems


1
Advanced Operating Systems
Lecture 9 Distributed Systems Architecture
  • University of Tehran
  • Dept. of EE and Computer Engineering
  • By
  • Dr. Nasser Yazdani

2
Covered topic
  • Distributed Systems Architectures
  • References
  • Chapter 2 of the text book
  • Anatomy of Grid

3
Outline
  • Distributed Systems Architecture
  • Client-server
  • Grid computing
  • Peer to peer Computing
  • Cloud Computing

4
Architectural Models
  • Concerned with
  • The placement of the components across a network
    of computers
  • The interrelationships between the components
  • Common Architectures
  • Client server, Web
  • Grid
  • Peer to peer
  • Cloud

5
Clients and Servers
  • General interaction between a client and a server.

1.25
6
Processing Level
  • The general organization of an Internet search
    engine into three different layers

1-28
7
Multitiered Architectures (1)
  • Alternative client-server organizations (a) (e).

1-29
8
Multitiered Architectures (2)
  • An example of a server acting as a client.

1-30
9
Client-Server
  • Creating for example a hotmail? What are the
    options?
  • One server?
  • Several servers?

10
Multiple Servers
11
HTTP Basics (Review)
  • HTTP layered over bidirectional byte stream
  • Almost always TCP
  • Interaction
  • Client sends request to server, followed by
    response from server to client
  • Requests/responses are encoded in text
  • Stateless
  • Server maintains no information about past client
    requests

12
How to Mark End of Message? (Review)
  • Size of message → Content-Length
  • Must know size of transfer in advance
  • Delimiter → MIME-style Content-Type
  • Server must escape delimiter in content
  • Close connection
  • Only server can do this
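
Two of the options above (Content-Length and closing the connection) can be sketched in a few lines. This is a minimal illustration, not a real HTTP client: `read_body` is a hypothetical helper, and `io.BytesIO` stands in for a TCP connection.

```python
import io


def read_body(stream, headers):
    """Read a message body from a byte stream.

    If Content-Length is present, read exactly that many bytes;
    otherwise read until the peer closes the connection (EOF).
    """
    length = headers.get("Content-Length")
    if length is not None:
        return stream.read(int(length))
    return stream.read()  # EOF marks the end of the message


# BytesIO plays the role of the socket in this sketch.
with_length = read_body(io.BytesIO(b"hello, world"), {"Content-Length": "5"})
until_close = read_body(io.BytesIO(b"hello, world"), {})
```

With Content-Length the connection can stay open for the next message; with close-delimited bodies it cannot, which matters for the persistent connections discussed later.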

13
HTTP Request (review)
  • Request line
  • Method
  • GET: return the resource named by the URI
  • HEAD: return only the headers of the GET response
  • POST: send data to the server (forms, etc.)
  • URL (relative)
  • E.g., /index.html
  • HTTP version

14
HTTP Request (cont.) (review)
  • Request headers
  • Authorization: authentication info
  • Accept: acceptable document types/encodings
  • From: user email
  • If-Modified-Since
  • Referer: what caused this page to be requested
  • User-Agent: client software
  • Blank-line
  • Body

15
HTTP Request (review)
16
HTTP Request Example (review)
  • GET / HTTP/1.1
  • Accept: */*
  • Accept-Language: en-us
  • Accept-Encoding: gzip, deflate
  • User-Agent: Mozilla/4.0 (compatible; MSIE 5.5;
    Windows NT 5.0)
  • Host: www.intel-iris.net
  • Connection: Keep-Alive
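
The request structure from the previous slides (request line, headers, blank line, body) can be made concrete with a small parser. This is a sketch only: `parse_request` is a hypothetical helper, it assumes well-formed input, and it ignores details such as repeated headers.

```python
def parse_request(raw):
    """Split a raw HTTP request into (method, target, version, headers)."""
    # The blank line (CRLF CRLF) separates the head from the body.
    head, _, _body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line: method, relative URL, HTTP version.
    method, target, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, target, version, headers


raw = (
    "GET / HTTP/1.1\r\n"
    "Accept: */*\r\n"
    "Host: www.intel-iris.net\r\n"
    "Connection: Keep-Alive\r\n"
    "\r\n"
)
method, target, version, headers = parse_request(raw)
```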

17
HTTP Response (review)
  • Status-line
  • HTTP version
  • 3 digit response code
  • 1XX informational
  • 2XX success
  • 200 OK
  • 3XX redirection
  • 301 Moved Permanently
  • 302 Moved Temporarily
  • 304 Not Modified
  • 4XX client error
  • 404 Not Found
  • 5XX server error
  • 505 HTTP Version Not Supported
  • Reason phrase
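
Since the first digit of the response code determines its class, a client can classify any status code with integer division. A minimal sketch (`status_class` is a hypothetical helper name):

```python
def status_class(code):
    """Map a 3-digit HTTP status code to its class via the first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")
```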

18
HTTP Response (cont.) (review)
  • Headers
  • Location: for redirection
  • Server: server software
  • WWW-Authenticate: request for authentication
  • Allow: list of methods supported (GET, HEAD,
    etc.)
  • Content-Encoding: e.g., x-gzip
  • Content-Length
  • Content-Type
  • Expires
  • Last-Modified
  • Blank-line
  • Body

19
HTTP Response Example (review)
  • HTTP/1.1 200 OK
  • Date: Tue, 27 Mar 2001 03:49:38 GMT
  • Server: Apache/1.3.14 (Unix) (Red-Hat/Linux)
    mod_ssl/2.7.1 OpenSSL/0.9.5a DAV/1.0.2
    PHP/4.0.1pl2 mod_perl/1.24
  • Last-Modified: Mon, 29 Jan 2001 17:54:18 GMT
  • ETag: "7a11f-10ed-3a75ae4a"
  • Accept-Ranges: bytes
  • Content-Length: 4333
  • Keep-Alive: timeout=15, max=100
  • Connection: Keep-Alive
  • Content-Type: text/html
  • ...

20
Typical Workload (Web Pages)
  • Multiple (typically small) objects per page
  • File sizes
  • Heavy-tailed
  • Pareto distribution for tail
  • Lognormal for body of distribution
  • -- For reference/interest only --
  • Embedded references
  • Number of embedded objects
  • Pareto: $p(x) = a k^{a} x^{-(a+1)}$
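
The Pareto density used for the tail can be evaluated directly. In this sketch `a` is the shape parameter and `k` the minimum value, following the symbols in the formula; the function name is an assumption.

```python
def pareto_pdf(x, a, k):
    """Pareto density p(x) = a * k**a * x**-(a + 1) for x >= k, else 0."""
    if x < k:
        return 0.0
    return a * k**a * x ** -(a + 1)
```

Because the density falls off only polynomially in `x`, very large objects remain far more likely than under an exponentially decaying tail, which is what "heavy-tailed" refers to.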

21
HTTP 0.9/1.0 (mostly review)
  • One request/response per TCP connection
  • Simple to implement
  • Disadvantages
  • Multiple connection setups → three-way handshake
    each time
  • Several extra round trips added to transfer
  • Multiple slow starts

22
Single Transfer Example
  • Timeline between client and server (one TCP
    connection per object)
  • 0 RTT: client opens TCP connection (SYN, SYN,
    ACK)
  • 1 RTT: client sends HTTP request for HTML (DAT);
    server reads from disk and returns the data
    (ACK, DAT), then closes (FIN, ACK)
  • 2 RTT: client parses HTML and opens a new TCP
    connection (FIN, ACK, SYN, SYN, ACK)
  • 3 RTT: client sends HTTP request for image (DAT);
    server reads from disk (ACK)
  • 4 RTT: image begins to arrive (DAT)

23
More Problems
  • Short transfers are hard on TCP
  • Stuck in slow start
  • Loss recovery is poor when windows are small
  • Lots of extra connections
  • Increases server state/processing
  • Server also forced to keep TIME_WAIT connection
    state
  • -- Things to think about --
  • Why must server keep these?
  • Tends to be an order of magnitude greater than
    the # of active connections, why?

24
Persistent Connection Solution (review)
  • Multiplex multiple transfers onto one TCP
    connection
  • How to identify requests/responses
  • Delimiter → Server must examine response for
    delimiter string
  • Content-length and delimiter → Must know size of
    transfer in advance
  • Block-based transmission → send in multiple
    length delimited blocks
  • Store-and-forward → wait for entire response and
    then use content-length
  • Solution → use existing methods and close
    connection otherwise

25
Persistent Connection Example (review)
  • Timeline between client and server (TCP
    connection already open)
  • 0 RTT: client sends HTTP request for HTML (DAT);
    server reads from disk (ACK, DAT)
  • 1 RTT: client parses HTML and sends HTTP request
    for image (ACK, DAT); server reads from disk
    (ACK, DAT)
  • 2 RTT: image begins to arrive

26
Persistent HTTP (review)
  • Nonpersistent HTTP issues
  • Requires 2 RTTs per object
  • OS must work and allocate host resources for each
    TCP connection
  • But browsers often open parallel TCP connections
    to fetch referenced objects
  • Persistent HTTP
  • Server leaves connection open after sending
    response
  • Subsequent HTTP messages between same
    client/server are sent over connection
  • Persistent without pipelining
  • Client issues new request only when previous
    response has been received
  • One RTT for each referenced object
  • Persistent with pipelining
  • Default in HTTP/1.1
  • Client sends requests as soon as it encounters a
    referenced object
  • As little as one RTT for all the referenced
    objects
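
The RTT counts above can be compared with a small calculation. This is a back-of-the-envelope sketch: `page_rtts` is a hypothetical helper, it counts one RTT for TCP setup and one per request/response, and it ignores transmission time, slow start, and parallel connections.

```python
def page_rtts(num_objects, mode):
    """Round trips to fetch one HTML page plus num_objects referenced objects."""
    if mode == "nonpersistent":
        # One RTT for TCP setup plus one for request/response, per object.
        return 2 * (1 + num_objects)
    if mode == "persistent":
        # One setup, one RTT for the HTML, then one RTT per referenced object.
        return 1 + 1 + num_objects
    if mode == "pipelined":
        # One setup, one RTT for the HTML, one RTT for all pipelined requests.
        return 1 + 1 + (1 if num_objects else 0)
    raise ValueError(f"unknown mode: {mode}")


for mode in ("nonpersistent", "persistent", "pipelined"):
    print(mode, page_rtts(10, mode))
```

For a page with one image, the nonpersistent count matches the single transfer example, where the image starts to arrive after 4 RTTs.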

27
HTTP Caching
  • Clients often cache documents
  • Challenge: update of documents
  • If-Modified-Since requests to check
  • HTTP 0.9/1.0 used just date
  • HTTP 1.1 has an opaque entity tag (could be a
    file signature, etc.) as well
  • When/how often should the original be checked for
    changes?
  • Check every time?
  • Check each session? Day? Etc?
  • Use Expires header
  • If no Expires, often use Last-Modified as estimate
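
A cache's freshness decision along these lines might look as follows. This is a sketch under stated assumptions: `is_fresh` is a hypothetical helper, and the `fraction=0.1` heuristic (treat a document as fresh for 10% of its age at fetch time) is one common choice, not something the slides specify.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(now, fetched_at, expires=None, last_modified=None, fraction=0.1):
    """Decide whether a cached copy can be served without revalidation."""
    if expires is not None:
        # Explicit Expires header: trust it.
        return now < expires
    if last_modified is not None:
        # No Expires: estimate a lifetime from how old the document
        # already was when it was fetched (assumed heuristic).
        lifetime = (fetched_at - last_modified) * fraction
        return now < fetched_at + lifetime
    return False  # nothing to go on: check with the origin


fetched = datetime(2001, 3, 27, 3, 49, 38, tzinfo=timezone.utc)
modified = datetime(2001, 1, 29, 17, 54, 18, tzinfo=timezone.utc)
```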

28
Example Cache Check Request
  • GET / HTTP/1.1
  • Accept: */*
  • Accept-Language: en-us
  • Accept-Encoding: gzip, deflate
  • If-Modified-Since: Mon, 29 Jan 2001 17:54:18 GMT
  • If-None-Match: "7a11f-10ed-3a75ae4a"
  • User-Agent: Mozilla/4.0 (compatible; MSIE 5.5;
    Windows NT 5.0)
  • Host: www.intel-iris.net
  • Connection: Keep-Alive
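
On the server side, a cache check request like this one can be answered with 304 Not Modified when the client's copy is still current. The sketch below is simplified: `check_conditional` is a hypothetical helper, it compares a single entity tag exactly, and it gives If-None-Match priority over If-Modified-Since.

```python
from email.utils import parsedate_to_datetime


def check_conditional(request_headers, etag, last_modified):
    """Return 304 if the client's cached copy is still valid, else 200."""
    if "If-None-Match" in request_headers:
        # Opaque entity tag comparison (HTTP/1.1).
        return 304 if request_headers["If-None-Match"] == etag else 200
    if "If-Modified-Since" in request_headers:
        # Date comparison (all that HTTP 0.9/1.0 offered).
        since = parsedate_to_datetime(request_headers["If-Modified-Since"])
        current = parsedate_to_datetime(last_modified)
        return 304 if current <= since else 200
    return 200


request = {
    "If-Modified-Since": "Mon, 29 Jan 2001 17:54:18 GMT",
    "If-None-Match": '"7a11f-10ed-3a75ae4a"',
}
status = check_conditional(
    request, '"7a11f-10ed-3a75ae4a"', "Mon, 29 Jan 2001 17:54:18 GMT"
)
```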

29
Ways to cache
  • Client-directed caching
  • Web Proxies
  • Server-directed caching
  • Content Delivery Networks (CDNs)

30
Web Proxy Caches
  • User configures browser: Web accesses via cache
  • Browser sends all HTTP requests to cache
  • Object in cache: cache returns object
  • Else: cache requests object from origin server,
    then returns object to client

(Figure: two clients exchange HTTP requests and responses with a proxy server, which in turn exchanges HTTP requests and responses with two origin servers.)
31
Caching Example (1)
  • Assumptions
  • Average object size = 100,000 bits
  • Avg. request rate from institution's browsers to
    origin servers = 15/sec
  • Delay from institutional router to any origin
    server and back to router = 2 sec
  • Consequences
  • Utilization on LAN = 15%
  • Utilization on access link = 100%
  • Total delay = Internet delay + access delay +
    LAN delay
  • = 2 sec + minutes + milliseconds

(Figure: origin servers on the public Internet, connected by a 1.5 Mbps access link to an institutional network with a 10 Mbps LAN.)
32
Caching Example (2)
  • Possible solution
  • Increase bandwidth of access link to, say, 10
    Mbps
  • Often a costly upgrade
  • Consequences
  • Utilization on LAN = 15%
  • Utilization on access link = 15%
  • Total delay = Internet delay + access delay +
    LAN delay
  • = 2 sec + msecs + msecs

(Figure: origin servers on the public Internet, connected by a 10 Mbps access link to an institutional network with a 10 Mbps LAN.)
33
Caching Example (3)
  • Install cache
  • Suppose hit rate is 0.4
  • Consequence
  • 40% of requests will be satisfied almost
    immediately (say 10 msec)
  • 60% of requests satisfied by origin server
  • Utilization of access link reduced to 60%,
    resulting in negligible delays
  • Weighted average of delays
  • = 0.6 × 2 sec + 0.4 × 10 msec < 1.3 sec

(Figure: origin servers on the public Internet, connected by a 1.5 Mbps access link to an institutional network with a 10 Mbps LAN and an institutional cache.)
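
The numbers in the three caching examples follow from utilization = request rate × object size / link bandwidth. The short calculation below reproduces them; the constant and function names are illustrative, and the delay estimate ignores the (small) access and LAN delays, as the slide does.

```python
OBJECT_BITS = 100_000        # average object size
REQUESTS_PER_SEC = 15        # request rate to origin servers
LAN_BPS = 10_000_000         # 10 Mbps LAN
ACCESS_BPS = 1_500_000       # 1.5 Mbps access link
INTERNET_DELAY_SEC = 2.0     # router to origin server and back
CACHE_DELAY_SEC = 0.010      # "say 10 msec" for a cache hit


def utilization(rate, bits, bps):
    """Fraction of link capacity consumed by the offered traffic."""
    return rate * bits / bps


lan_util = utilization(REQUESTS_PER_SEC, OBJECT_BITS, LAN_BPS)
access_util = utilization(REQUESTS_PER_SEC, OBJECT_BITS, ACCESS_BPS)
upgraded_util = utilization(REQUESTS_PER_SEC, OBJECT_BITS, 10_000_000)

hit_rate = 0.4
# Only cache misses cross the access link.
cached_util = utilization(REQUESTS_PER_SEC * (1 - hit_rate), OBJECT_BITS, ACCESS_BPS)
avg_delay = (1 - hit_rate) * INTERNET_DELAY_SEC + hit_rate * CACHE_DELAY_SEC
```

The cache brings the access link down to about 60% utilization without the costly link upgrade, and the average delay to roughly 1.2 seconds.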
34
Problems
  • Over 50% of all HTTP objects are uncacheable;
    why?
  • Not easily solvable
  • Dynamic data → stock prices, scores, web cams
  • CGI scripts → results based on passed parameters
  • Obvious fixes
  • SSL → encrypted data is not cacheable
  • Most web clients don't handle mixed pages well
    → many generic objects transferred with SSL
  • Cookies → results may be based on passed data
  • Hit metering → owner wants to measure # of hits
    for revenue, etc.
  • What will be the end result?

35
Content Distribution Networks (CDNs)
  • The content providers are the CDN customers.
  • Content replication
  • CDN company installs hundreds of CDN servers
    throughout Internet
  • Close to users
  • CDN replicates its customers' content in CDN
    servers. When provider updates content, CDN
    updates servers

(Figure: an origin server in North America feeds a CDN distribution node, which feeds CDN servers in S. America, Asia, and Europe.)
36
Content Distribution Networks Server Selection
  • Replicate content on many servers
  • Challenges
  • How to replicate content
  • Where to replicate content
  • How to find replicated content
  • How to choose among known replicas
  • How to direct clients towards replica

37
Server Selection
  • Which server?
  • Lowest load → to balance load on servers
  • Best performance → to improve client performance
  • Based on Geography? RTT? Throughput? Load?
  • Any alive node → to provide fault tolerance
  • How to direct clients to a particular server?
  • As part of routing → anycast, cluster load
    balancing
  • As part of application → HTTP redirect
  • As part of naming → DNS

38
Application Based
  • HTTP supports simple way to indicate that Web
    page has moved (30X responses)
  • Server receives GET request from client
  • Decides which server is best suited for
    particular client and object
  • Returns HTTP redirect to that server
  • Can make informed application specific decision
  • May introduce additional overhead → multiple
    connection setup, name lookups, etc.
  • OK solution in general, but
  • HTTP Redirect has some flaws especially with
    current browsers
  • Incurs many delays, which operators may really
    care about

39
Naming Based
  • Client does DNS name lookup for service
  • Name server chooses appropriate server address
  • A-record returned is best one for the client
  • What information can name server base decision
    on?
  • Server load/location → must be collected
  • Information in the name lookup request
  • Name service client → typically the local name
    server for client

40
How Akamai Works
  • Clients fetch html document from primary server
  • E.g. fetch index.html from cnn.com
  • URLs for replicated content are replaced in html
  • E.g. <img src=http://cnn.com/af/x.gif> replaced
    with <img
    src=http://a73.g.akamaitech.net/7/23/cnn.com/af/x.gif>
  • Client is forced to resolve aXYZ.g.akamaitech.net
    hostname

41
How Akamai Works
  • How is content replicated?
  • Akamai only replicates static content (*)
  • Modified name contains original file name
  • Akamai server is asked for content
  • First checks local cache
  • If not in cache, requests file from primary
    server and caches file
  • (* At least, the version we're talking about
    today. Akamai actually lets sites write code
    that can run on Akamai's servers, but that's a
    pretty different beast)

42
How Akamai Works
  • Root server gives NS record for akamai.net
  • Akamai.net name server returns NS record for
    g.akamaitech.net
  • Name server chosen to be in region of client's
    name server
  • TTL is large
  • G.akamaitech.net nameserver chooses server in
    region
  • Should try to choose server that has file in
    cache - How to choose?
  • Uses aXYZ name and hash
  • TTL is small → why?

43
Simple Hashing
  • Given document XYZ, we need to choose a server to
    use
  • Suppose we use modulo
  • Number servers from 1..n
  • Place document XYZ on server (XYZ mod n)
  • What happens when a server fails? n → n-1
  • Same if different people have different measures
    of n
  • Why might this be bad?
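
To see why modulo placement is fragile, count how many documents change servers when n drops from 10 to 9. This is a small experiment with made-up integer document ids; `server_for` is a hypothetical helper.

```python
def server_for(doc_id, n):
    """Modulo placement: document goes to server (doc_id mod n)."""
    return doc_id % n


doc_ids = range(1000)
# A document "moves" if its server differs after one server fails.
moved = sum(1 for d in doc_ids if server_for(d, 10) != server_for(d, 9))
moved_fraction = moved / len(doc_ids)
print(f"{moved_fraction:.0%} of documents map to a different server")
```

Nearly every document is remapped, so almost all cached copies become useless after a single failure. The same happens when two clients disagree about n.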

44
Consistent Hash
  • View: subset of all hash buckets that are
    visible
  • Desired features
  • Balanced: in any one view, load is equal across
    buckets
  • Smoothness: little impact on hash bucket
    contents when buckets are added/removed
  • Spread: small set of hash buckets that may hold
    an object regardless of views
  • Load: across all views, # of objects assigned to
    hash bucket is small

45
Consistent Hash Example
  • Construction
  • Assign each of C hash buckets to random points on
    a mod 2^n circle, where hash key size = n.
  • Map object to random position on circle
  • Hash of object = closest clockwise bucket

(Figure: circle with positions 0, 4, 8, 12, and 14 marked, with buckets placed at points on it.)
  • Smoothness → addition of bucket does not cause
    movement between existing buckets
  • Spread, Load → small set of buckets that lie
    near object
  • Balance → no bucket is responsible for large
    number of objects
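
The construction above can be sketched with a sorted list of bucket points and a binary search. This is an illustration under assumptions: SHA-1 stands in for the "random" placement, the circle has 2^32 points, "clockwise" means increasing position with wrap-around, and the bucket and object names are made up.

```python
import bisect
import hashlib


def point(name, bits=32):
    """Map a name to a pseudo-random position on a mod 2**bits circle."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


class Ring:
    def __init__(self, buckets):
        # Each bucket sits at the point its name hashes to.
        self.points = sorted((point(b), b) for b in buckets)
        self.keys = [p for p, _ in self.points]

    def lookup(self, obj):
        """Return the closest bucket clockwise from the object's position."""
        i = bisect.bisect_left(self.keys, point(obj))
        return self.points[i % len(self.points)][1]  # wrap past the top


buckets = [f"cache-{i}" for i in range(8)]
objects = [f"object-{i}" for i in range(200)]

full, reduced = Ring(buckets), Ring(buckets[:-1])  # "cache-7" removed
before = {o: full.lookup(o) for o in objects}
after = {o: reduced.lookup(o) for o in objects}
moved = [o for o in objects if before[o] != after[o]]
```

Only objects that lived on the removed bucket move, which is the smoothness property. Real deployments usually place each bucket at several points to improve balance; this sketch uses one point per bucket.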

46
How Akamai Works
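(Figure, steps numbered 1 to 12: the end-user gets index.html from cnn.com (the content provider); the Akamai hostname is resolved through the DNS root server, the Akamai high-level DNS server, and the Akamai low-level DNS server; the end-user then sends "Get /cnn.com/foo.jpg" to the nearby matching Akamai server, and an Akamai server gets foo.jpg from cnn.com.)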
47
Akamai Subsequent Requests
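(Figure, same components as the previous slide with only steps 1, 2, and 7 to 10 remaining: the end-user gets index.html from cnn.com, queries the Akamai low-level DNS server, and sends "Get /cnn.com/foo.jpg" to the nearby matching Akamai server; the DNS root server and the Akamai high-level DNS server are not contacted.)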
48
Impact on DNS Usage
  • DNS is used for server selection more and more
  • What are reasonable DNS TTLs for this type of use?
  • Typically want to adapt to load changes
  • Low TTL for A-records → what about NS records?
  • How does this affect caching?
  • What do the first and subsequent lookup do?

49
HTTP (Summary)
  • Simple text-based file exchange protocol
  • Support for status/error responses,
    authentication, client-side state maintenance,
    cache maintenance
  • Workloads
  • Typical documents structure, popularity
  • Server workload
  • Interactions with TCP
  • Connection setup, reliability, state maintenance
  • Persistent connections
  • How to improve performance
  • Persistent connections
  • Caching
  • Replication

50
Grid
  1. What is Grid?
  2. Grid Projects & Applications
  3. Grid Technologies
  4. Globus
  5. CompGrid

53
Definition
  • A type of parallel and distributed system that
    enables the sharing, selection, and aggregation
    of geographically distributed resources
  • Computers: PCs, workstations, clusters,
    supercomputers, laptops, notebooks, mobile
    devices, PDAs, etc.
  • Software: e.g., ASPs renting expensive special
    purpose applications on demand
  • Catalogued data and databases: e.g., transparent
    access to human genome database
  • Special devices/instruments: e.g., radio
    telescope; SETI@Home searching for life in
    galaxy.
  • People/collaborators.
  • depending on their availability, capability,
    cost, and user QoS requirements
  • for solving large-scale problems/applications.
  • thus enabling the creation of virtual
    organization (VOs)

54
Resources: assets, capabilities, and knowledge
  • Capabilities (e.g. application codes, analysis
    tools)
  • Compute Grids (PC cycles, commodity clusters,
    HPC)
  • Data Grids
  • Experimental Instruments
  • Knowledge Services
  • Virtual Organisations
  • Utility Services

55
Why go Grid?
  • Hot subject
  • Try it, experience it to learn the potential
  • Will enable true ubiquitous computing in future
  • Today, proven in some areas: intraGrids
  • But still long way to World Wide Grid
  • State of art techniques, tools are difficult
  • Short term goals? Use another technology
  • Does your system have Grid characteristics?
  • Distributed users, large scale and heterogeneous
    resources, across domains

56
Grid's main idea
  • To treat CPU cycles and software like
    commodities.
  • Enable the coordinated use of geographically
    distributed resources in the absence of central
    control and existing trust relationships.
  • Computing power is produced much like utilities
    such as power and water are produced for
    consumers.
  • Users will have access to power on demand
  • "When the network is as fast as the computer's
    internal links, the machine disintegrates across
    the net into a set of special purpose
    appliances" (Gilder Technology Report, June 2000)

57
Computational Grids and Electric Power Grids
63
What do users want ?
  • Grid Consumers
  • Execute jobs for solving varying problem size and
    complexity
  • Benefit by selecting and aggregating resources
    wisely
  • Tradeoff: timeframe and cost
  • Grid Providers
  • Contribute (idle) resource for executing
    consumer jobs
  • Benefit by maximizing resource utilisation
  • Tradeoff: local requirements & market opportunity

66
Grid Applications
  • Distributed HPC (Supercomputing)
  • Computational science.
  • High-Capacity/Throughput Computing
  • Large scale simulation/chip design parameter
    studies.
  • Content Sharing (free or paid)
  • Sharing digital contents among peers (e.g.,
    Napster)
  • Remote software access/renting services
  • Application service providers (ASPs) & Web
    services.
  • Data-intensive computing
  • Drug Design, Particle Physics, Stock
    Prediction...
  • On-demand, real-time computing
  • Medical instrumentation & mission critical.
  • Collaborative Computing
  • Collaborative design, data exploration,
    education.
  • Service Oriented Computing (SOC)
  • Towards economic-based Utility Computing: new
    paradigm, new applications, new industries, and
    new business.

67
Grid Projects
  • Australia
  • Nimrod-G
  • Gridbus
  • GridSim
  • Virtual Lab
  • DISCWorld
  • GrangeNet
  • ..new coming up
  • Europe
  • UNICORE
  • Cactus
  • UK eScience
  • EU Data Grid
  • EuroGrid
  • MetaMPI
  • XtremeWeb
  • and many more.
  • India
  • I-Grid
  • USA
  • Globus
  • Legion
  • OGSA
  • Sun Grid Engine
  • AppLeS
  • NASA IPG
  • Condor-G
  • Jxta
  • NetSolve
  • AccessGrid
  • and many more...
  • Cycle Stealing & .com Initiatives
  • Distributed.net
  • SETI@Home, ...
  • Entropia, UD, Parabon, ...
  • Public Forums
  • Global Grid Forum
  • Australian Grid Forum

68
Grid Requirements
  • Identity & authentication
  • Authorization & policy
  • Resource discovery
  • Resource characterization
  • Resource allocation
  • (Co-)reservation, workflow
  • Distributed algorithms
  • Remote data access
  • High-speed data transfer
  • Performance guarantees
  • Monitoring & adaptation
  • Intrusion detection
  • Resource management
  • Accounting & payment
  • Fault management
  • System evolution
  • Etc.

69
Resource Management Problem
  • Enabling secure, controlled remote access to
    computational resources and management of remote
    computation
  • Authentication and authorization
  • Resource discovery & characterization
  • Reservation and allocation
  • Computation monitoring and control

70
Grid-based Computation Challenges
  • Locate suitable computers
  • Authenticate with appropriate sites
  • Allocate resources on those computers
  • Initiate computation on those computers
  • Configure those computations
  • Select appropriate communication methods
  • Compute with suitable algorithms
  • Access data files, return output
  • Respond appropriately to resource changes

71
Leading Grid Middleware Developments
  • Globus Toolkit (mainly developed at ANL and USC)
  • Service-oriented toolkit from the Globus
    project, to be used in Grid applications, not
    targeted at end-user
  • Services for resource selection and allocation,
    authentication, file system access and file
    transfer
  • Largest user-base in projects worldwide
  • Open-source software, commercial support by IBM
    and Platform Computing

72
The Globus Alliance
  • Globus Project, since 1996
  • Ian Foster (Argonne National Lab),
  • Carl Kesselman (University of Southern
    California's Information Sciences Institute)
  • Develop protocols, middleware and tools for Grid
    computing
  • Globus Alliance, since Sept 2003
  • International scope
  • University of Edinburgh's EPCC
  • Swedish Center for Parallel Computers (PDC)
  • Advisory council of Academic Affiliates from
    Asia-Pacific, Europe, US

73
Globus Toolkit
  • GT2 (2.4 released in 2002): reference
    implementation of Grid fabric protocols
  • GRAM for job submissions
  • MDS for resource discovery
  • GridFTP for data transfer
  • GSI security
  • GT3 (3.0 released July 2003): redesign
  • OGSI based
  • Grid services, built on SOAP and XML
  • GT3.2 released March 31, 2004

74
Globus Toolkit Services
  • Job submission and management (GRAM)
  • Uniform Job Submission
  • Security (GSI)
  • PKI-based Security (Authentication) Service
  • Information services (MDS)
  • LDAP-based Information Service
  • Remote file management (GASS) and transfer
    (GridFTP)
  • Remote Storage Access Service
  • Remote Data Catalogue and Management Tools
  • Support by Globus 2.0 released in 2002
  • Resource selection and allocation (GIIS, GRIS)

75
Resource Specification Language
  • Common notation for exchange of information
    between components
  • Syntax similar to MDS/LDAP filters
  • RSL provides two types of information
  • Resource requirements: machine type, number of
    nodes, memory, etc.
  • Job configuration: directory, executable, args,
    environment
  • API provided for manipulating RSL

76
Some Useful Definitions
  • Network Protocol
  • A formal description of message formats and a set
    of rules for exchange of messages
  • Rules define sequences of message exchange, and
    potentially resulting behavior
  • Protocol may define state-change in endpoint
  • Network Enabled Services
  • Defines a set of capabilities
  • Protocol defines interaction with service
  • All services require protocols, although not all
    protocols are to services

77
More definitions
  • Resource
  • Entity that is to be shared
  • Provides some capabilities, that can be accessed
    via interface (API) or protocol
  • Application Programmer Interface (API)
  • Software Development Kit (SDK)
  • Package that enables application development,
    consisting of one or more APIs, and programming
    tools

78
Protocols Make the Grid
  • Protocols and APIs
  • Protocols enable interoperability
  • APIs enable portability
  • Sharing is about interoperability, so
  • Grid architecture should be about protocols

79
Grid Services Architecture Previous Perspective
a rich variety of applications ...
Applns
Appln Toolkits
Remote data toolkit
Remote sensors toolkit
Async. collab. toolkit
Remote viz toolkit
Remote comp. toolkit
...
Protocols, authentication, policy, resource
management, instrumentation, discovery, etc.,
etc.
Grid Services
Grid Fabric
Grid-enabled archives, networks, computers,
display devices, etc. associated local services

80
Characteristics of Grid Services Architecture
  • Identifies separation of concerns
  • Isolates Grids from languages and specific
    programming environments
  • Makes provisions for generic and application
    specific functionality
  • Protocols not explicit in architecture
  • fails to make clear distinction between language,
    service and networking issues

81
Layered Grid Protocol Architecture
Application
User
Grid
Resource
Connectivity
Fabric
82
Important Points
  • Being Grid-enabled requires speaking appropriate
    protocols
  • Protocol only requirement, not reachability
  • Protocols can be used to bridge local resources
    or local Grids
  • Intergrid as analog to Internet
  • Built on Internet protocols
  • Independent of language and implementation
  • Focus on interaction over network
  • Services exist at each level

83
Protocols, services and interfaces
Applications
Languages/Frameworks
Connectivity APIs
Connectivity Protocols
Local Access APIs and protocols
Fabric Layer
84
How does Globus fit in?
  • Defines connectivity and resource protocols
  • Enables definition of grid and user protocols
  • Globus provides some of these, others defined by
    other groups
  • Defines range of APIs and SDKs that leverage
    Resource, Grid and User protocols

85
Fabric
  • Local access to logical resource
  • May be real component, e.g. CPU, software module,
    filesystem
  • May be logical component, e.g. Condor pool
  • Protocol or API mediated
  • Fabric elements include
  • SSP, ASP, peer-to-peer, Entropia-like, and
    enterprise level solutions

86
Connectivity Protocols
  • Two classes of connectivity protocols underlie
    all other components
  • Internet communication
  • Application, transport and internet layer
    protocols
  • I.e., transport, routing, DNS, etc.
  • Security
  • Authentication and delegation
  • Discussed below

87
Security
  • Protocols
  • TLS with delegation
  • Services
  • K5ssl, Globus Authorization Service
  • APIs
  • GSS-API, GAA, SASL, gss_assist
  • SDKs
  • GlobusIO

88
Resource Protocols
  • Resource management,
  • Storage system access
  • Network quality of service
  • Data movement
  • Resource information

89
Resource Management
  • Protocols
  • GRAM+GARA (on HTTP)
  • Resource services
  • Gatekeeper, JobManager, SlotManager
  • APIs and SDKs
  • GRAM API, JavaCog Client, DUROC

90
Data Transport
  • Protocols
  • GridFTP, LDAP for replica catalog
  • Services
  • FTP, LDAP replica catalog
  • APIs and SDKs
  • GridFTP client library, copy URL API, replica
    catalog access, replica selection

91
Resource Information
  • Protocol
  • LDAP V3, Registration/Discovery protocol
  • Service
  • GRIS
  • APIs & SDKs
  • C API, JNDI, PerlLDAP, ...

92
Grid Protocols
  • Grid Information Index Services
  • LDAP and Service registration protocol,
  • GIIS service
  • LDAP APIs and specialized information API
  • Co-allocation and brokering
  • GRAM (HTTP + RSL)
  • DUROC service
  • DUROC client API, end-to-end reservation API

93
Grid Protocols (cont)
  • Online authentication, authorization services
  • HTTP
  • MyProxy, Group policy servers
  • Myproxy API, GAA API,
  • Many others (e.g.)
  • Resource discovery (Matchmaker)
  • Fault recovery

94
User Protocols
  • In general, there are many of these; they tend to
    be one-off, and not well defined
  • Examples
  • Portal toolkits (e.g. Hotpage)
  • Netsolve
  • Cactus framework

95
Why Study Peer to peer systems?
  • To understand how they work
  • To build your own peer to peer system
  • To understand the techniques and principles
    within them
  • To modify, adapt, reuse these techniques and
    principles in other related areas
  • Cloud computing
  • Sensor networks
  • To grow the body of knowledge about distributed
    systems

96
Searching & Fetching
  • Human: "I want to watch that great 80s cult
    classic Better Off Dead"
  • Search: "better off dead" -> better_off_dead.mov
    or -> 0x539fba83ajdeadbeef
  • Locate sources of better_off_dead.mov
  • Download the file from them

97
Searching
(Figure: nodes N1 to N6 connected through the Internet. A publisher holds Key=title, Value=MP3 data; a client issues Lookup(title) and does not know which node to ask.)
98
Search Approaches
  • Centralized
  • Flooding
  • A hybrid: flooding between supernodes
  • Structured

99
Different types of searches
  • Needles vs. Haystacks
  • Searching for top 40, or an obscure punk track
    from 1981 that nobody's heard of?
  • Search expressiveness
  • Whole word? Regular expressions? File names?
    Attributes? Whole-text search?
  • (e.g., p2p gnutella or p2p google?)

100
Framework
  • Common Primitives
  • Join: how do I begin participating?
  • Publish: how do I advertise my file?
  • Search: how do I find a file?
  • Fetch: how do I retrieve a file?

101
Centralized
  • Centralized Database
  • Join: on startup, client contacts central server
  • Publish: reports list of files to central server
  • Search: query the server -> return node(s) that
    store the requested file
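
The centralized design amounts to one index mapping file names to the peers that hold them. A minimal in-memory sketch (the class name, file names, and peer addresses are illustrative; fetching then happens directly between peers and is not shown):

```python
from collections import defaultdict


class CentralIndex:
    """Toy central server: file name -> set of peer addresses."""

    def __init__(self):
        self.files = defaultdict(set)

    def publish(self, peer, names):
        """A joining peer reports the list of files it stores."""
        for name in names:
            self.files[name].add(peer)

    def search(self, name):
        """Return the node(s) that store the requested file."""
        return sorted(self.files.get(name, set()))


index = CentralIndex()
index.publish("123.2.21.23", ["X", "Y", "Z"])
index.publish("123.2.0.18", ["A", "X"])
```

A search is one dictionary lookup, but the server holds an entry for every published file, which is the O(1) search and O(N) state trade-off discussed for Napster.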

102
Napster Example Publish
I have X, Y, and Z!
123.2.21.23
103
Napster Search
123.2.0.18
Where is file A?
104
Napster Discussion
  • Pros
  • Simple
  • Search scope is O(1) for even complex searches
    (one index, etc.)
  • Controllable (pro or con?)
  • Cons
  • Server maintains O(N) State
  • Server does all processing
  • Single point of failure
  • Technical failures + legal (Napster shut down
    2001)

105
Query Flooding
  • Join: must join a flooding network
  • Usually, establish peering with a few existing
    nodes
  • Publish: no need, just reply
  • Search: ask neighbors, who ask their neighbors,
    and so on... when/if found, reply to sender.
  • TTL limits propagation
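
TTL-limited flooding can be simulated as a breadth-first walk over the overlay graph. The sketch below is a simplification: it runs in one process, suppresses duplicate queries with a `seen` set, and uses a made-up five-node topology.

```python
from collections import deque


def flood_search(graph, files, start, wanted, ttl):
    """Return the nodes within ttl hops of start that hold the wanted file."""
    seen = {start}
    queue = deque([(start, ttl)])
    hits = []
    while queue:
        node, remaining = queue.popleft()
        if wanted in files.get(node, ()):
            hits.append(node)  # this node would reply to the sender
        if remaining == 0:
            continue  # TTL exhausted: do not forward further
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return hits


graph = {
    "N1": ["N2", "N3"],
    "N2": ["N1", "N4"],
    "N3": ["N1"],
    "N4": ["N2", "N5"],
    "N5": ["N4"],
}
files = {"N5": {"A"}}
```

With `ttl=2` the query from N1 never reaches N5, so the file is not found even though it exists; raising the TTL to 3 finds it at the cost of contacting more nodes.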

106
Example Gnutella
Where is file A?
107
Flooding Discussion
  • Pros
  • Fully de-centralized
  • Search cost distributed
  • Processing @ each node permits powerful search
    semantics
  • Cons
  • Search scope is O(N)
  • Search time is O(???)
  • Nodes leave often, network unstable
  • TTL-limited search works well for haystacks.
  • For scalability, does NOT search every node. May
    have to re-issue query later

108
Supernode Flooding
  • Join: on startup, client contacts a supernode
    ... may at some point become one itself
  • Publish: send list of files to supernode
  • Search: send query to supernode, supernodes flood
    query amongst themselves.
  • Supernode network just like prior flooding net

109
Supernode Network Design
110
Supernode File Insert
I have X!
123.2.21.23
111
Supernode File Search
Where is file A?
112
Supernode: Which nodes?
  • Often, bias towards nodes with good
  • Bandwidth
  • Computational Resources
  • Availability!

113
Stability and Superpeers
  • Why superpeers?
  • Query consolidation
  • Many connected nodes may have only a few files
  • Propagating a query to a sub-node would take more
    b/w than answering it yourself
  • Caching effect
  • Requires network stability
  • Superpeer selection is time-based
  • How long you've been on is a good predictor of
    how long you'll be around.

114
Superpeer results
  • Basically, just better than flood to all
  • Gets an order of magnitude or two better scaling
  • But still fundamentally O(search) × O(per-node
    storage) = O(N)
  • Central: O(1) search, O(N) storage
  • Flood: O(N) search, O(1) storage
  • Superpeer: can trade between

115
Structured Search: Distributed Hash Tables
  • Academic answer to p2p
  • Goals
  • Guaranteed lookup success
  • Provable bounds on search time
  • Provable scalability
  • Makes some things harder
  • Fuzzy queries / full-text search / etc.
  • Read-write, not read-only
  • Hot topic in networking since introduction in
    2000/2001

116
Searching Wrap-Up
Type         O(search)   Storage     Fuzzy?
Central      O(1)        O(N)        Yes
Flood        O(N)        O(1)        Yes
Super        < O(N)      > O(1)      Yes
Structured   O(log N)    O(log N)    Not really
117
DHT: Overview
  • Abstraction: a distributed hash-table (DHT)
    data structure
  • put(id, item)
  • item = get(id)
  • Implementation: nodes in system form a
    distributed data structure
  • Can be Ring, Tree, Hypercube, Skip List,
    Butterfly Network, ...

118
DHT: Overview (2)
  • Structured Overlay Routing
  • Join: On startup, contact a bootstrap node and
    integrate yourself into the distributed data
    structure; get a node id
  • Publish: Route publication for file id toward a
    close node id along the data structure
  • Search: Route a query for file id toward a close
    node id. Data structure guarantees that query
    will meet the publication.
  • Important difference: get(key) is for an exact
    match on key!
  • search("spars") will not find file("briney
    spars")
  • We can exploit this to be more efficient
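The exact-match point can be seen in a few lines. This is a single-process stand-in for the put/get abstraction, assuming names are hashed to IDs with SHA-1; the `key_id`, `put`, and `get` helpers and the stored value are illustrative only.

```python
import hashlib

def key_id(name: str) -> int:
    """Map a name to a 160-bit ID by hashing it (SHA-1 chosen for illustration)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")

table = {}  # stand-in for the distributed table: id -> item

def put(name, item):
    table[key_id(name)] = item

def get(name):
    return table.get(key_id(name))  # None when no exact match exists

put("briney spars", "stored at some node")
print(get("briney spars"))  # exact key: found
print(get("spars"))         # partial key hashes to an unrelated ID: None
```

Because the hash scatters similar names to unrelated IDs, a substring query has nowhere to be routed, which is why fuzzy search is hard on a DHT.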

119
DHT Example: Chord (from MIT, 2001)
  • Associate with each node and file a unique id in a
    one-dimensional space (a ring)
  • E.g., pick from the range 0...2^m
  • Usually the hash of the file or IP address
  • Properties
  • Routing table size is O(log N), where N is the
    total number of nodes
  • Guarantees that a file is found in O(log N) hops
120
DHT: Consistent Hashing
[Figure: circular ID space with nodes N32, N90, N105 and keys K5, K20, K80; "K5" means key 5 and "N105" means node 105]
A key is stored at its successor: the node with the next-higher ID
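The successor rule can be written as a sorted-list lookup with wraparound. A minimal sketch using the node and key numbers from the figure; the `successor` helper is an assumed name, and a key whose ID equals a node ID is assigned to that node.

```python
from bisect import bisect_left

def successor(node_ids, key):
    """First node ID >= key on the ring, wrapping around to the smallest ID."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key)      # index of the first node ID >= key
    return ring[i % len(ring)]      # i == len(ring) means wrap to ring[0]

nodes = [32, 90, 105]
for key in (5, 20, 80, 110):
    print(key, "->", successor(nodes, key))
# 5 -> 32, 20 -> 32, 80 -> 90, 110 -> 32 (wraps around)
```

When a node joins or leaves, only the keys between it and its predecessor change owner, which is the appeal of consistent hashing.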
121
DHT: Chord Basic Lookup
[Figure: ring with nodes N10, N32, N60, N90, N105, N120; the query "Where is key 80?" is passed around the ring and the answer is "N90 has K80"]
122
DHT: Chord Finger Table
[Figure: fingers of node N80 spanning 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/128 of the ring]
  • Entry i in the finger table of node n is the
    first node that succeeds or equals n + 2^i
  • In other words, the i-th finger points 1/2^(m-i) of
    the way around the ring, where the ID space has
    2^m points
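The finger-table definition translates directly into code. A small sketch, assuming a 3-bit ID space with nodes 0, 1, 2, and 6 chosen for illustration; `finger_table` is a hypothetical helper and IDs wrap modulo 2^m.

```python
def finger_table(n, node_ids, m):
    """Rows (i, n + 2^i mod 2^m, first node that succeeds or equals that ID)."""
    ring = sorted(node_ids)
    rows = []
    for i in range(m):
        target = (n + 2 ** i) % (2 ** m)
        # First node at or after the target; wrap to the smallest ID if none.
        succ = next((x for x in ring if x >= target), ring[0])
        rows.append((i, target, succ))
    return rows

for row in finger_table(1, [0, 1, 2, 6], m=3):
    print(row)
# (0, 2, 2)
# (1, 3, 6)
# (2, 5, 6)
```

Each node stores only m entries, which is where the O(log N) routing-table size comes from.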

123
Node Join
  • Compute ID
  • Use an existing node to route to that ID in the
    ring.
  • Finds s = successor(id)
  • Ask s for its predecessor, p
  • Splice self into ring just like a linked list
  • p->successor = me
  • me->successor = s
  • me->predecessor = p
  • s->predecessor = me
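The splice can be sketched as a circular doubly linked list. This is a simplification: it finds the successor by walking the ring one node at a time rather than via finger tables, it ignores concurrent joins, and the `Node`, `find_successor`, and `join` names are assumptions.

```python
class Node:
    def __init__(self, node_id):
        self.id = node_id
        self.successor = self      # a lone node points at itself
        self.predecessor = self

def in_arc(x, a, b):
    """True if x lies in the ring interval (a, b]; a == b means the whole ring."""
    if a < b:
        return a < x <= b
    return x > a or x <= b

def find_successor(start, node_id):
    """Walk the ring from `start` until node_id falls between a node and its successor."""
    cur = start
    while not in_arc(node_id, cur.id, cur.successor.id):
        cur = cur.successor
    return cur.successor

def join(me, existing):
    s = find_successor(existing, me.id)
    p = s.predecessor
    p.successor = me
    me.successor = s
    me.predecessor = p
    s.predecessor = me

n1 = Node(1)
n2, n0, n6 = Node(2), Node(0), Node(6)
for newcomer in (n2, n0, n6):
    join(newcomer, n1)

ring_order, cur = [], n0
for _ in range(4):
    ring_order.append(cur.id)
    cur = cur.successor
print(ring_order)  # [0, 1, 2, 6]
```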

124
DHT: Chord Join
  • Assume an identifier space 0..7 (IDs are taken mod 8)
  • Node n1 joins

Succ. Table (node 1)
i   id+2^i   succ
0   2        1
1   3        1
2   5        1
[Figure: ring with positions 0 to 7; only node 1 is present]
125
DHT: Chord Join
  • Node n2 joins

Succ. Table (node 1)
i   id+2^i   succ
0   2        2
1   3        1
2   5        1
Succ. Table (node 2)
i   id+2^i   succ
0   3        1
1   4        1
2   6        1
[Figure: ring with positions 0 to 7; nodes 1 and 2 are present]
126
DHT: Chord Join
  • Nodes n0, n6 join

Succ. Table (node 0)
i   id+2^i   succ
0   1        1
1   2        2
2   4        6
Succ. Table (node 1)
i   id+2^i   succ
0   2        2
1   3        6
2   5        6
Succ. Table (node 6)
i   id+2^i   succ
0   7        0
1   0        0
2   2        2
Succ. Table (node 2)
i   id+2^i   succ
0   3        6
1   4        6
2   6        6
[Figure: ring with positions 0 to 7; nodes 0, 1, 2, 6 are present]
127
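DHT: Chord Join
  • Nodes: n1, n2, n0, n6
  • Items: f7, f2

Succ. Table (node 0)
i   id+2^i   succ
0   1        1
1   2        2
2   4        6
Succ. Table (node 1)
i   id+2^i   succ
0   2        2
1   3        6
2   5        6
Succ. Table (node 6)
i   id+2^i   succ
0   7        0
1   0        0
2   2        2
Succ. Table (node 2)
i   id+2^i   succ
0   3        6
1   4        6
2   6        6
[Figure: ring with positions 0 to 7; the Items boxes show 7 beside node 0's table and 1 beside node 1's table]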
128
DHT: Chord Routing
  • Upon receiving a query for item id, a node
  • Checks whether it stores the item locally
  • If not, forwards the query to the largest node in
    its successor table that does not exceed id

Succ. Table (node 0)
i   id+2^i   succ
0   1        1
1   2        2
2   4        6
Succ. Table (node 1)
i   id+2^i   succ
0   2        2
1   3        6
2   5        6
Succ. Table (node 6)
i   id+2^i   succ
0   7        0
1   0        0
2   2        2
Succ. Table (node 2)
i   id+2^i   succ
0   3        6
1   4        6
2   6        6
[Figure: ring with positions 0 to 7; the Items boxes show 7 beside node 0's table and 1 beside node 1's table; query(7) is shown entering at node 1]
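The forwarding rule can be traced in code for query(7) starting at node 1. A sketch under stated assumptions: the ring has nodes 0, 1, 2, 6 in a 3-bit ID space, each finger list is computed from the definition "first node that succeeds or equals n + 2^i", item 7 lives at node 0 (its successor), and "largest node that does not exceed id" is read ring-wise as the farthest finger that does not pass the key. The `route` and `on_arc` names are hypothetical.

```python
def on_arc(x, a, b):
    """True if x lies on the ring arc (a, b], walking clockwise from a (a != b)."""
    if a < b:
        return a < x <= b
    return x > a or x <= b

def route(start, key, fingers, items, size):
    """Return (path of visited nodes, whether the item was found)."""
    path, cur = [start], start
    while key not in items[cur] and cur != key:
        # Fingers that move toward the key without overshooting it.
        ahead = [f for f in fingers[cur] if on_arc(f, cur, key)]
        if not ahead:
            # No finger lies between us and the key, so our immediate
            # successor is responsible for it: make the final hop.
            cur = fingers[cur][0]
            path.append(cur)
            break
        cur = max(ahead, key=lambda f: (f - cur) % size)  # farthest such finger
        path.append(cur)
    return path, key in items[cur]

fingers = {0: [1, 2, 6], 1: [2, 6, 6], 2: [6, 6, 6], 6: [0, 0, 2]}
items = {0: {7}, 1: set(), 2: set(), 6: set()}
print(route(1, 7, fingers, items, size=8))  # ([1, 6, 0], True)
```

Node 1 jumps to node 6, its farthest finger that does not pass 7; node 6 has no finger before 7, so it hands the query to its successor, node 0, which holds the item.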
129
DHT: Chord Summary
  • Routing table size?
  • Log N fingers
  • Routing time?
  • Each hop is expected to halve the distance to the
    desired id => expect O(log N) hops.

130
DHT: Discussion
  • Pros
  • Guaranteed Lookup
  • O(log N) per node state and search scope
  • Cons
  • This line used to say "not used." But DHTs are now
    being used in a few apps, including BitTorrent.
  • Supporting non-exact match search is (quite!) hard

131
The limits of search: A Peer-to-peer Google?
  • Complex intersection queries ("the" + "who")
  • Billions of hits for each term alone
  • Sophisticated ranking
  • Must compare many results before returning a
    subset to user
  • Very, very hard for a DHT / p2p system
  • Need high inter-node bandwidth
  • (This is exactly what Google does - massive
    clusters)
  • But maybe many file sharing queries are okay...

132
Fetching Data
  • Once we know which node(s) have the data we
    want...
  • Option 1: Fetch from a single peer
  • Problem: Have to fetch from a peer who has the whole
    file.
  • Peers are not useful sources until they d/l the whole file
  • At which point they probably log off. :)
  • How can we fix this?

133
Chunk Fetching
  • More than one node may have the file.
  • How to tell?
  • Must be able to distinguish identical files
  • Not necessarily same filename
  • Same filename not necessarily same file...
  • Use hash of file
  • Common: MD5, SHA-1, etc.
  • How to fetch?
  • Get bytes 0..8000 from A, 8001...16000 from B
  • Alternative: Erasure Codes
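Both ideas above, identifying a file by its hash and splitting the download into byte ranges, fit in a short sketch. The helper names and the round-robin assignment are assumptions for illustration; real clients pick sources more carefully.

```python
import hashlib

def file_id(data: bytes) -> str:
    """Identify a file by content, not by name (SHA-1 used as on the slide)."""
    return hashlib.sha1(data).hexdigest()

def plan_chunks(size, chunk_size, peers):
    """Assign inclusive byte ranges to peers round-robin: (start, end, peer)."""
    plan = []
    for n, start in enumerate(range(0, size, chunk_size)):
        end = min(start + chunk_size, size) - 1
        plan.append((start, end, peers[n % len(peers)]))
    return plan

# Same content under different names gets the same ID.
print(file_id(b"same bytes") == file_id(b"same bytes"))  # True

# A 16001-byte file in 8001-byte chunks from peers A and B.
print(plan_chunks(16001, 8001, ["A", "B"]))
# [(0, 8000, 'A'), (8001, 16000, 'B')]
```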

134
BitTorrent Overview
  • Swarming
  • Join: contact centralized tracker server, get a
    list of peers.
  • Publish: Run a tracker server.
  • Search: Out-of-band. E.g., use Google to find a
    tracker for the file you want.
  • Fetch: Download chunks of the file from your
    peers. Upload chunks you have to them.
  • Big differences from Napster
  • Chunk based downloading (sound familiar? :)
  • "Few large files" focus
  • Anti-freeloading mechanisms

135
BitTorrent
  • Periodically get list of peers from tracker
  • More often
  • Ask each peer for what chunks it has
  • (Or have them update you)
  • Request chunks from several peers at a time
  • Peers will start downloading from you
  • BT has some machinery to try to bias towards
    helping those who help you
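One simple way such a bias could work is to serve the peers that have recently sent us the most data. This is a deliberately simplified illustration, not BitTorrent's actual choking algorithm; the function name, the `slots` parameter, and the byte counts are made up.

```python
def pick_peers_to_serve(bytes_received_from, slots=4):
    """Rank peers by how much they uploaded to us and serve the top `slots`."""
    ranked = sorted(bytes_received_from, key=bytes_received_from.get, reverse=True)
    return ranked[:slots]

received = {"a": 10, "b": 500, "c": 0, "d": 120}
print(pick_peers_to_serve(received, slots=2))  # ['b', 'd']
```

A peer that uploads nothing (here "c") is never picked, which is the anti-freeloading effect the slide alludes to.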

136
BitTorrent: Publish/Join
[Figure: peers and the tracker]
137
BitTorrent: Fetch
138
BitTorrent: Summary
  • Pros
  • Works reasonably well in practice
  • Gives peers incentive to share resources; avoids
    freeloaders
  • Cons
  • Central tracker server needed to bootstrap swarm
  • (Tracker is a design choice, not a requirement,
    as you know from your projects. Modern
    BitTorrent can also use a DHT to locate peers.
    But the approach still needs a search mechanism.)

139
Writable, persistent p2p
  • Do you trust your data to 100,000 monkeys?
  • Node availability hurts
  • Ex: Store 5 copies of data on different nodes
  • When someone goes away, you must replicate the
    data they held
  • Hard drives are huge, but cable modem upload
    bandwidth is tiny - perhaps 10 Gbytes/day
  • Takes many days to upload contents of 200GB hard
    drive. Very expensive leave/replication
    situation!
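The "many days" claim follows from the slide's own numbers. A back-of-the-envelope sketch, assuming a single node re-uploads the whole drive at the full 10 GB/day with nothing else competing for the link; the function name is illustrative.

```python
def days_to_rereplicate(data_gb, upload_gb_per_day):
    """Days needed to push `data_gb` through an uplink of `upload_gb_per_day`."""
    return data_gb / upload_gb_per_day

print(days_to_rereplicate(200, 10))  # 20.0
```

At roughly 20 days per departed 200 GB node, frequent churn can easily outpace the network's ability to restore the 5 copies.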

140
What's out there?
              Central      Flood      Super-node flood            Route
Whole File    Napster      Gnutella                               Freenet
Chunk Based   BitTorrent              KaZaA (bytes, not chunks)   DHTs, eDonkey2000
141
P2P Summary
  • Many different styles; remember pros and cons of
    each
  • Centralized, flooding, swarming, unstructured and
    structured routing
  • Lessons learned
  • Single points of failure are bad
  • Flooding messages to everyone is bad
  • Underlying network topology is important
  • Not all nodes are equal
  • Need incentives to discourage freeloading
  • Privacy and security are important
  • Structure can provide theoretical bounds and
    guarantees

142
Some Questions
  • Why do people get together?
  • to share information
  • to share and exchange resources they have
  • books, class notes, experiences, videos, music
    CDs
  • How can computers help people
  • find information
  • find resources
  • exchange and share resources

143
Cloud Computing Infrastructure: Take a seat &
prepare to fly
  • Anh M. Nguyen
  • CS525, UIUC, Spring 2009

144
What is cloud computing?
  • "I don't understand what we would do differently
    in the light of Cloud Computing other than
    change the wordings of some of our ads"
  • Larry Ellison, Oracle's CEO
  • "I have not heard two people say the same thing
    about it [cloud]. There are multiple definitions
    out there of the cloud"
  • Andy Isherwood, HP's Vice President of European
    Software Sales
  • "It's stupidity. It's worse than stupidity: it's a
    marketing hype campaign."
  • Richard Stallman, Free Software Foundation founder

145
Next Lecture
  • Communication among distributed systems.
  • Remote Procedure Call (RPC)
  • References
  • Chapter 4 of the book