LINF2345: Languages and Algorithms for Distributed Applications - PowerPoint PPT Presentation

1
LINF2345 Languages and Algorithms for
Distributed Applications
  • Seif Haridi
  • Peter Van Roy

2
Overview
  • Course organization and objectives
  • Course overview
  • General concepts
  • Language-based distribution
  • Some advanced concepts
  • Motivation for distributed systems
  • Introduction to distributed systems
  • Distributed algorithms
  • Basic underlying concepts
  • Programming language, operating system, and
    network basics
  • Addressing Internet hosts, process on a host,
    Web document

3
Course organization and objectives
4
Objectives
  • Understand some of the fundamental aspects of
    distributed systems
  • Course in three parts
  • General Concepts
  • Language-Based Distribution
  • Advanced Concepts

5
LINF2345 organization
  • Evaluation
  • Test: 25% (dispensatory), around the seventh week
  • Final exam: 50%, or 75% if the test is redone
  • Project: 25%
  • Web page
  • http://www.info.ucl.ac.be/notes_de_cours/LINF2345
  • Most material will be put there
  • People
  • Teacher: Peter Van Roy (pvr_at_info.ucl.ac.be)
  • Assistant: Yves Jaradin (yjaradin_at_info.ucl.ac.be)

6
Project
  • There will be one project, to be done in groups
    of two, around the 10th-12th week
  • Practical sessions will build toward the project
  • Exercises to review Oz
  • Exercises to introduce Distributed Oz
  • What will the project be?
  • Maybe a peer-to-peer application, using our
    peer-to-peer middleware?

7
Lecture structure
  • Reminder of last lecture
  • Overview
  • Content
  • Summary
  • Reading suggestions

8
Material
  • Lectures are based mainly on the following:
  • Andrew S. Tanenbaum, Maarten van Steen,
    Distributed Systems: Principles and Paradigms,
    Prentice-Hall, 2002.
  • Randy Chow and Theodore Johnson, Distributed
    Operating Systems & Algorithms, Addison Wesley,
    1997, ISBN 0-201-49838-3.
  • Peter Van Roy and Seif Haridi, Concepts,
    Techniques, and Models of Computer Programming,
    chapter 11, MIT Press, 2004.
  • Some research papers!
  • The books are available in the INGI library
  • Research papers will be hand-outs

9
Other recommended material
  • Coulouris, Dollimore, Kindberg, Distributed
    Systems: Concepts and Design, Addison-Wesley
    (3rd Edition)
  • M.L. Liu, Distributed Computing: Principles and
    Applications, Addison Wesley
  • Nancy Lynch, Distributed Algorithms

10
Questions and using brakes!
  • Please do ask questions during the lectures
  • To repeat an explanation
  • To give a better explanation
  • To give an example
  • Please say when things go too fast!
  • Please say when things go too slow!

11
Background knowledge
  • I assume some basic knowledge about
  • Programming languages: knowledge of C/Java
  • Operating systems: basic concepts
  • Networking: basic concepts
  • Algorithms and data structures
  • I will try to be as elementary as possible
  • Ask me to explain if I assume something you don't
    know

12
Course overview
13
Languages and algorithms for distributed
applications (1)
  • Part 1: general concepts of distributed systems
  • Inter-process communication
  • Processes, threads, client/servers, code
    migration, software agents
  • Naming services
  • Clocks and synchronization

14
Languages and algorithms for distributed
applications (2)
  • Part 2: language-based distribution
  • Network-transparent distribution
  • Review of Oz language
  • Practical introduction to Distributed Oz
  • Open distribution
  • Connecting independent computations
  • Foundations of Distributed Oz
  • Language entities and their protocols
  • Fault tolerance

15
Languages and algorithms for distributed
applications (3)
  • Part 3: some advanced concepts
  • Decentralized (peer-to-peer) systems
  • As contrasted with centralized (client/server)
    systems
  • Peer-to-peer library in Mozart
  • Introduction to the programming project
  • Introduction to distributed algorithms
  • Introduction to distributed transactions
  • Some systems
  • Grid, CORBA, Web Services, Erlang, etc.

16
Language-based distribution
  • How can we make distributed programming really
    simple?
  • First approximation: network transparency
  • Second approximation: extend with control over
    network communication (network awareness)
  • Third approximation: add failure detection
    ability and connection ability (fault tolerance
    and openness)
  • (Security is also important, but is outside the
    scope of this course)

17
Distributed Oz
  • Keep same language semantics while executing over
    a network
  • Can this work? Waldo et al. say it can't!
  • It depends on the language design
  • The main problems are state and concurrency
  • Global state is very expensive in a distributed
    system
  • A distributed system is naturally concurrent
  • The language has to do two things
  • Make it easy to program without global state
  • Make concurrency easy and efficient
  • Both of these are hard in Java but easy in Oz
  • You will see how simple it really is!

18
Some advanced concepts
  • Distributed programs have two extremes
  • Centralized (client/server)
  • Decentralized (peer-to-peer)
  • We will explore this spectrum!
  • We will use the peer-to-peer library of Mozart
  • Distributed algorithms
  • Leader election, mutual exclusion, snapshots
  • Can be very subtle in a distributed setting!
  • Some systems and middleware
  • Grid, CORBA, Web Services, Erlang, etc.

19
Examples of middleware
  • Distributed systems
  • Grid: Globus
  • CORBA
  • Distributed COM
  • Web Services
  • GLOBE
  • Erlang
  • Distributed coordination-based systems
  • JavaSpaces
  • Security
  • E

20
Distributed algorithms
  • Model of distributed computations
  • Techniques for coordination of processes
  • Techniques for high availability
  • Fault tolerance
  • Reliable group communication
  • Distributed agreement
  • Techniques for scalability
  • Consistency models
  • Replication techniques

21
Motivation for distributed systems
22
Distributed system and distributed computing
  • Early computing was performed on a single
    processor. Uni-processor computing can be called
    centralized computing.
  • A distributed system is a collection of
    independent computers, interconnected via a
    network, capable of collaborating on a task.
  • Distributed computing is computing performed in a
    distributed system.

23
Distributed systems
24
Examples of distributed systems
  • Network of workstations (NOW): a group of
    networked personal workstations connected to one
    or more server machines.
  • The Internet
  • An intranet: a network of computers and
    workstations within an organization, segregated
    from the Internet via a protective device (a
    firewall).

25
Computers in a distributed system
  • Workstations: computers used by end-users to
    perform computing (desktops or laptops)
  • Server machines: computers which provide
    resources and services
  • Personal Digital Assistants (PDAs): handheld
    computers connected to the system via a wireless
    communication link.

26
Centralized vs. distributed computing
27
Evolution of paradigms
  • Client-server: socket API, remote method
    invocation
  • Distributed objects
  • Object broker: CORBA
  • Network service: Jini
  • Object space: JavaSpaces
  • Mobile agents
  • Message-oriented middleware (MOM): Java Message
    Service
  • Collaborative applications

28
Cooperative distributed computing projects
  • Cooperative distributed computing projects
    (also called distributed computing in some
    literature): these are projects that parcel out
    large-scale computing to workstations, often
    making use of surplus CPU cycles. Example: the
    SETI@home project, which scans data retrieved by
    a radio telescope to search for radio signals
    from another world.

29
Why distributed computing?
  • Economics: distributed systems allow the pooling
    of resources, including CPU cycles, data storage,
    input/output devices, and services
  • Reliability: distributed systems allow
    replication of resources and/or services, thus
    reducing service outage due to failures
  • Universality: the Internet has become a universal
    platform for distributed computing, i.e., it is
    available everywhere with substantially identical
    standards

30
The strengths and weaknesses of distributed
computing
  • In any form of computing, there is always a
    tradeoff in advantages and disadvantages
  • Some of the reasons for the popularity of
    distributed computing
  • The affordability of computers and availability
    of network access
  • Resource sharing
  • Scalability
  • Fault tolerance

31
The strengths and weaknesses of distributed
computing
  • The disadvantages of distributed computing
  • Multiple points of failure: the failure of one
    or more participating computers, or one or more
    network links, can spell trouble.
  • Security concerns: in a distributed system, there
    are more opportunities for unauthorized attack.
  • Programming difficulty: complex APIs, many issues
    that have to be handled at the same time

32
Introduction to distributed systems
33
What is a distributed system?
34
A distributed system
  • A simplified view

[Figure legend: processor, communication medium,
process, thread, communication channel; a node is
a processor or a process]
35
A distributed system
  • Set of computing nodes that cooperate in order to
    achieve a well defined goal
  • Nodes cooperate through communication
  • Communication is by message passing at the
    fundamental level

36
A distributed system
  • A distributed system is one or more applications
    running on a collection of independent computers
    that appears to its users as a single coherent
    system

37
What is a distributed system?
  • Hardware is distributed
  • n processing elements (PE), each a processor +
    memory
  • Interconnected by some network
  • No shared memory
  • Software is distributed
  • No centralized OS, each PE has its own copy of OS
  • No physically centralized file system
  • Means for inter-process communication is message
    passing at the lowest level

38
Why distributed systems?
  • Information exchange (collaborative work)
  • Resource sharing (e.g. printer, backup storage,
    disk units, etc.)
  • Resource sharing (applications, information,
    media, services)
  • Cost reduction
  • Increase of availability (partial failure)
  • Increase of performance through parallelism, ...

39
Main characteristics
  • No shared memory between nodes
  • Each node has its memory
  • Communication by message passing
  • No global clock
  • Each node has its own clock
  • Impossible for a node to obtain an instantaneous
    global state of the system

40
Examples of distributed systems
  • Airline reservation system
  • Bank automated teller machine network
  • CSCW (Computer Supported Cooperative Work)
  • Intranet
  • Internet: TCP/IP-based communications
    infrastructure
  • Mobile computing

41
A typical portion of the Internet
[Figure labels: server, network link]
42
A typical intranet
43
How are distributed systems built?
  • A number of computers connected by a network
  • Distribution middleware services layer that gives
    a uniform view of the nodes, and hides some of
    the network and distribution aspects
  • Applications on top of the middleware service
    layer (using a programming system or combination
    of programming systems)

44
Middleware view
45
Middleware view
  • A distributed system is often organized as a
    layer on the top of local operating systems

46
Goals of a distributed system
  • Transparency
  • Hide the fact that processes and resources are
    physically distributed
  • Scalability
  • Distributed systems should be easy to expand
  • Availability
  • Distributed systems should be continuously
    available
  • Openness
  • Adding new users/components into the system
  • Adding new functionality, incrementally and
    independently by independent developer teams

47
Transparency
  • Ideally a distributed application (system) should
    look like a conventional centralized system, with
    no distinction between local and remote resources
  • This is the user view
  • The developer view is different
  • Network aware, knows the cost of distribution of
    programming entities (e.g. objects)
  • Have means to control the distribution behavior

48
Transparency
  • Access Transparency
  • Hide differences in data representation and how a
    resource is accessed
  • Hides heterogeneity of underlying nodes
  • Location Transparency
  • Hide where a resource/service is located
  • Migration Transparency
  • Hides that resources/services may be moved to
    another location without affecting how they are
    accessed

49
Transparency
  • Relocation Transparency
  • Hides that a resource may be moved to another
    location while in use
  • Failure Transparency
  • Hide the failure and recovery of a resource
  • Concurrency Transparency
  • Hides that a resource may be shared by a number
    of competing users/processes

50
Transparency
51
Scalability
  • Size
  • Add more users and resources/components
  • Distance
  • Cope with geographically separate resources and
    users
  • Management
  • Spanning over independent administrative
    organizations
  • Local management

52
Scalability problems (Size)
Examples of scalability limitations
53
Scaling techniques (1)
Figure 1.4: Offload the server by sending
form-processing procedures to the client
54
Scaling techniques (2)
  • Distributed algorithms
  • No process has complete information of the system
  • Process decisions are based on local information
  • Failure of one process does not ruin the whole
    system
  • No assumptions about exactly synchronized clocks
    (no global clock)

55
Scaling techniques (3)
Figure 1.5: An example of dividing the DNS name
space into zones
56
Scalability problems (distance)
  • Long communication delays
  • Programming techniques for local area networks
    (LANs) do not really work for wide area networks
    (WANs)
  • Synchronous communication like a Remote Procedure
    Call (RPC) is not suitable
  • Asynchronous message passing is more appropriate
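
The contrast between synchronous calls and asynchronous message passing over a slow link can be sketched as follows. This is only an illustration: `remote_call` is a hypothetical stand-in for a remote operation, and the 10 ms delay is an assumed latency, not a measurement.

```python
import asyncio

async def remote_call(x, delay=0.01):
    # Hypothetical remote operation; the sleep stands in for network latency.
    await asyncio.sleep(delay)
    return x * 2

async def synchronous_style(values):
    # RPC style: wait for each reply before issuing the next request.
    return [await remote_call(v) for v in values]

async def asynchronous_style(values):
    # Message-passing style: issue every request, then collect the replies.
    return await asyncio.gather(*(remote_call(v) for v in values))

results_sync = asyncio.run(synchronous_style([1, 2, 3]))
results_async = asyncio.run(asynchronous_style([1, 2, 3]))
print(results_sync, results_async)
```

Both versions compute the same replies. With n requests, the first waits for roughly n delays in sequence, while the second overlaps them, which matters more as the delay grows.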

57
Scalability problems (distance)
  • WAN has unreliable communication media
  • Cannot exploit broadcast communication
  • Only point-to-point communication
  • Locating a service on a WAN is more difficult
    than on a LAN
  • On LAN just broadcast a service identifier, and
    wait for response

58
Scalability problems (different administrative
organizations)
  • Different and conflicting policies for
  • Resource usage
  • Management of the system
  • Security policies
  • WHO has access to WHAT resources
  • Can I trust a non-local system administrator?

59
Scalability problems (different administrative
organizations)
[Figure: a distributed system (DS) spanning
Admin Domain 1 and Admin Domain 2]
  • Protect the DS from domains 1 and 2
  • Protect domains 1 and 2 from the DS
  • Example: Grid computing (GGF)

60
Distributed algorithms
61
Distributed algorithms
  • How to design distributed algorithms
  • Study of some fundamental problems
  • Analysis of distributed algorithms
  • How to achieve fault tolerance in a distributed
    system
  • Fault tolerance: the ability of a system to
    provide useful service despite the failure of
    some of its components
  • Very important for high availability

62
Why study distributed algorithms?
  • Distributed algorithms are the backbone of
    distributed computing systems
  • They are essential for the implementation of
    distributed systems
  • Distributed operating systems
  • Distributed databases
  • Distributed communication systems
  • Real-time process-control systems
  • Transportation systems, etc.

63
Classes of distributed algorithms
  • Fully decentralized
  • Fault tolerant
  • More difficult in general
  • With a centralized coordinator
  • Conceptually simpler
  • Single point of failure, bottleneck
  • Require efficient mechanisms for selecting a new
    coordinator if the current one fails

64
Distributed algorithms
  • Models of distributed computation
  • Causality
  • Ordering of events, logical clocks (timestamps)
  • Causal communication
  • Distributed snapshots
  • Detecting stable properties, diffusing
    computation
  • Modeling a distributed computation
  • Expressing correctness properties of a
    distributed algorithm
  • Failures in a distributed system
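
The "ordering of events, logical clocks (timestamps)" item can be illustrated with a minimal sketch of a Lamport-style logical clock. The class and method names are illustrative assumptions, not taken from the course material.

```python
class LamportClock:
    """Logical clock: a counter that respects causality, not wall-clock time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event: advance the clock by one."""
        self.time += 1
        return self.time

    def send(self):
        """Sending is an event; the returned timestamp travels with the message."""
        return self.tick()

    def receive(self, msg_time):
        """On receipt, move past both the local time and the message timestamp."""
        self.time = max(self.time, msg_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()                      # a's clock is now 1
stamp = a.send()              # a's clock is now 2; the message carries 2
b.tick()                      # b's clock is now 1
received = b.receive(stamp)   # b's clock becomes max(1, 2) + 1 = 3
print(stamp, received)
```

The send timestamp is always smaller than the receive timestamp, so timestamps never contradict the causal order of events, even though no node has access to a global clock.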

65
Distributed algorithms outline
  • Synchronization
  • Distributed mutual exclusion: needed to regulate
    accesses to a common resource that can be used
    only by one process at a time
  • Election
  • Used, for instance, to designate a new coordinator
    when the current coordinator fails
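
A rough sketch of both ideas, assuming the simplest centralized scheme: a coordinator grants the resource to one process at a time, and an election picks the highest-numbered surviving node as the replacement coordinator. All names are hypothetical, and message exchange is collapsed into plain function calls.

```python
from collections import deque

class Coordinator:
    """Centralized mutual exclusion: one process holds the resource at a time."""

    def __init__(self):
        self.holder = None       # process currently in the critical section
        self.waiting = deque()   # processes queued for the resource

    def request(self, pid):
        """Return True if access is granted now, False if the request is queued."""
        if self.holder is None:
            self.holder = pid
            return True
        self.waiting.append(pid)
        return False

    def release(self, pid):
        """Release the resource and return the next holder (or None)."""
        if pid != self.holder:
            raise ValueError("only the current holder can release")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


def elect(node_ids, alive):
    """Election stand-in: pick the highest-numbered node that is still alive."""
    candidates = [n for n in node_ids if n in alive]
    return max(candidates) if candidates else None


c = Coordinator()
granted_p1 = c.request("p1")    # True: the resource was free
granted_p2 = c.request("p2")    # False: p2 has to wait
next_holder = c.release("p1")   # "p2" now holds the resource

new_leader = elect([1, 2, 3, 4], alive={1, 2, 3})   # node 4 has failed
print(granted_p1, granted_p2, next_holder, new_leader)
```

The coordinator is a single point of failure, which is why an election mechanism is needed alongside it.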

66
Distributed algorithms outline
  • Distributed agreement
  • How to get a set of nodes to agree on a value
  • Distributed agreement is used, for instance,
  • To determine which nodes are alive in the system
  • To confine malicious behavior of some components
  • In distributed databases to determine when to
    commit a transaction
  • (Fault tolerance again!)
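
For the transaction-commit case, one common approach is two-phase commit. The slides do not name a protocol, so that choice is an assumption here, and the sketch below ignores timeouts and failures entirely. Participant names are hypothetical.

```python
def two_phase_commit(participants):
    """participants maps a node name to a function returning True (yes) or False (no)."""
    # Phase 1 (voting): ask every participant whether it can commit.
    votes = {name: can_commit() for name, can_commit in participants.items()}
    # Phase 2 (decision): commit only if the vote is unanimous.
    decision = "commit" if all(votes.values()) else "abort"
    return decision, votes


all_yes = {"db1": lambda: True, "db2": lambda: True}
one_no = {"db1": lambda: True, "db2": lambda: False}

decision_yes, _ = two_phase_commit(all_yes)
decision_no, votes_no = two_phase_commit(one_no)
print(decision_yes, decision_no)
```

A single "no" vote is enough to abort, so all nodes end up with the same outcome for the transaction.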

67
Distributed algorithms outline
  • Replicated data management
  • A key for high availability is to replicate
    components (data/files, servers, etc.)
  • We shall be concerned with
  • Techniques for maintaining replicated data in a
    distributed system (database techniques)
  • Atomic broadcast/multicast
  • Membership

68
Distributed algorithms outline
  • Check-pointing and recovery
  • Error recovery is essential for fault-tolerance
  • When a processor fails and then is repaired, it
    will need to recover its state of the computation
  • To enable recovery, check-pointing (recording of
    the state into a stable storage) is needed
  • We will be concerned with techniques used for
    this, in the context of distributed systems
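
A minimal single-node sketch of check-pointing, assuming that a local file plays the role of stable storage and that the state is JSON-serializable. The file name and state layout are hypothetical.

```python
import json
import os
import tempfile

def checkpoint(state, path):
    """Write state to a temporary file, then rename it over the old checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())     # ask the OS to push the data to the device
    os.replace(tmp_path, path)   # swap in the new checkpoint in one step

def recover(path, default):
    """Return the last checkpointed state, or default if none exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default


workdir = tempfile.mkdtemp()
ckpt = os.path.join(workdir, "state.json")

state = recover(ckpt, {"processed": 0})       # first run: no checkpoint yet
state["processed"] += 10
checkpoint(state, ckpt)
restored = recover(ckpt, {"processed": 0})    # what a restarted process would read
print(restored)
```

Writing to a temporary file and renaming it means a crash during the write leaves the previous checkpoint untouched. In a distributed setting the harder part, which this sketch does not address, is making the checkpoints of different nodes mutually consistent.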

69
References
  • Text book
  • Distributed Operating Systems & Algorithms
  • Randy Chow and Theodore Johnson, Addison Wesley,
    1997
  • Others
  • Distributed Algorithms
  • Nancy A. Lynch, 1996
  • Research papers

70
Basic underlying concepts
71
Basics in three areas
  • As a prerequisite for the rest of the course, we
    introduce some basics from other areas as they
    relate to distributed systems
  • Some of the notations and concepts from these
    areas will be employed from time to time in this
    course
  • Programming languages
  • Operating systems
  • Networks

72
Programming language basics
73
Sequential languages versus concurrent languages
  • Most popular languages were designed first as
    sequential, and concurrency was added on later
  • Examples: Java, C, etc.
  • This makes concurrency complicated!
  • Some recent languages make concurrency much
    easier
  • Example: Erlang, which is based on message
    passing between active objects. This is
    important not only for concurrency, but also for
    fault tolerance.
  • Example: Oz, which allows programming with
    dataflow concurrency
  • Example: E, which uses message passing to make
    security easier

74
Centralized versus distributed languages
  • Most popular languages were designed first as
    centralized (running on one machine), and
    distribution was added on later
  • Examples: Java, C, etc.
  • This makes distribution complicated!
  • This is why most distributed programs are
    client/servers
  • Some languages make distribution much easier
  • So far, these are research languages (no
    widespread industrial use) except perhaps Erlang
  • This is an old and respectable field: Emerald, SR
    (1980s), Erlang (early 1990s), Oz (end 1990s),
    etc.
  • A recent development is to support peer-to-peer
    applications
  • Pushed from the applications side, with file
    sharing and collaboration applications such as
    Napster, Gnutella, Kazaa, Skype

75
Concurrency and state
  • The basic lesson is that concurrency and state
    are difficult when used together in a program
  • Keep them separate!
  • Unfortunately, a distributed system is by nature
    concurrent
  • This points toward message passing, in which
    active entities send each other asynchronous
    messages, as the right approach since it does not
    suppose shared state
  • State is important for modularity
  • Use it for that purpose and not for collaboration
  • With care, concurrency and state can keep out of
    each other's hair
  • Erlang and Oz (again) are the canonical examples
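
The message-passing style described above can be approximated in Python with threads and queues. This is a sketch of the idea only, not of how Erlang or Oz implement it; the message names "inc", "get", and "stop" are made up for the example.

```python
import queue
import threading

def counter_actor(mailbox, replies):
    """Active entity: owns its state and changes it only in response to messages."""
    count = 0                       # private state, never shared with other threads
    while True:
        msg = mailbox.get()
        if msg == "inc":
            count += 1
        elif msg == "get":
            replies.put(count)
        elif msg == "stop":
            break

mailbox, replies = queue.Queue(), queue.Queue()
actor = threading.Thread(target=counter_actor, args=(mailbox, replies))
actor.start()

def client():
    for _ in range(1000):
        mailbox.put("inc")          # asynchronous send: no waiting for a reply

clients = [threading.Thread(target=client) for _ in range(4)]
for t in clients:
    t.start()
for t in clients:
    t.join()

mailbox.put("get")
total = replies.get()               # 4000: the actor handles one message at a time
mailbox.put("stop")
actor.join()
print(total)
```

The counter's state is concurrent with its clients but never shared with them, so no locking appears in the client code.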

76
Procedural versus object-oriented programming
  • In the olden days, when distribution was still
    embryonic (early 1980s), languages were divided
    into two classes: procedural languages and
    object-oriented languages.
  • Procedural languages, with the C language being
    the primary example, use procedures (functions)
    to break down the complexity of the tasks that an
    application entails.
  • Object-oriented languages, exemplified by Java,
    use objects to encapsulate the details. Each
    object carries state data as well as behaviors.
    State data are represented as instance data.
    Behaviors are represented as methods.
  • Both classes of languages lead to synchronous
    calls (RPC and RMI) when extended for
    distribution
  • This is highly error-prone and non-transparent
  • Too simplistic thinking!

77
Operating system basics
78
Operating system basics
  • A process consists of an executing program, its
    current values, state information, and the
    resources used by the operating system to manage
    its execution.
  • A program is an artifact constructed by a
    software developer; a process is a dynamic entity
    which exists only when a program is run.

79
Process state transition diagram
80
Example: Java processes
  • There are three types of Java programs:
    applications, applets, and servlets. All are
    written as a class
  • A Java application program is run as an
    independent (standalone) process
  • An applet is run using a browser or the applet
    viewer
  • A servlet is run in the context of a web server
  • A Java program is compiled into byte code, a
    universal object code. When run, the byte code
    is interpreted by the Java Virtual Machine (JVM).
    A just-in-time compiler (JIT) improves execution
    speed.

81
Three types of Java programs
  • Applications
  • A program whose byte code can be run on any
    system which has a Java Virtual Machine. An
    application may be standalone (monolithic) or
    distributed (if it interacts with another
    process).
  • Applets
  • A program whose byte code is downloaded from a
    remote machine and is run in the browser's Java
    Virtual Machine.
  • Servlets
  • A program whose byte code resides on a remote
    machine and is run at the request of an HTTP
    client (a browser).

82
Three types of Java programs
83
Concurrent processing
  • On modern day operating systems, multiple
    processes appear to be executing concurrently on
    a machine by timesharing resources.

84
Concurrent processing within a process
  • It is often useful for a process to have parallel
    threads of execution, each of which timeshares
    the system resources in much the same way as
    concurrent processes.

85
Thread-safe programming
  • Consider two threads that independently access
    and update the same data object, such as a
    counter
  • It is possible for one of the updates to be
    overwritten by the other due to the sequencing of
    the machine instructions in the two threads (race
    condition)
  • This leads to unpredictable behavior that is not
    explained by the counter's specification
    (observable nondeterminism)
  • To protect against this, a synchronized method
    can be used to provide mutual exclusion. The
    counter update executes inside a critical region
    that can be entered by only one thread at a time.
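
The slide has Java's synchronized methods in mind. The same idea can be sketched in Python, where an explicit `threading.Lock` plays the role of the critical region. The class and function names are illustrative.

```python
import threading

class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Critical region: only one thread at a time can hold the lock,
        # so the read-modify-write of self.value is never interleaved.
        with self._lock:
            self.value += 1


def run(counter, n_threads=4, n_increments=10_000):
    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


final = run(Counter())
print(final)
```

With the lock in place the result is always the number of threads times the number of increments. Without it, updates could be lost, and the outcome would depend on how the threads happen to be scheduled.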

86
Race condition (observable nondeterministic
behavior)
87
Network basics
88
Network standards and protocols
  • On public networks such as the Internet, it is
    necessary for a common set of rules to be
    specified for the exchange of data
  • Such rule sets, called protocols, specify matters
    such as the formatting and semantics of data,
    flow control, and error correction
  • Software can share data over the network using
    network software which supports a standard set of
    protocols

89
Protocols
  • A protocol is a set of rules that must be
    observed by the participants
  • Message formats, meanings of message components
  • Allowable sequences of message interchange
  • Protocols must be formally defined and precisely
    implemented. For each protocol, there must be
    rules that specify the following
  • How is the exchanged data encoded?
  • How are events (sending, receiving) synchronized
    so that the participants can send and receive in
    a coordinated order?
  • Implementation independence
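
To make "how is the exchanged data encoded?" concrete, here is a sketch of a made-up message format: a fixed header carrying a message type and a body length, followed by a UTF-8 body. The header layout and the two message types are assumptions for illustration and do not correspond to any real protocol.

```python
import struct

# Hypothetical header: 1-byte message type, 4-byte body length, network byte order.
HEADER = struct.Struct("!BI")

MSG_REQUEST, MSG_REPLY = 1, 2

def encode(msg_type, text):
    """Build one message: header followed by the UTF-8 encoded body."""
    body = text.encode("utf-8")
    return HEADER.pack(msg_type, len(body)) + body

def decode(data):
    """Return (msg_type, text, remaining_bytes) for the first message in data."""
    msg_type, length = HEADER.unpack_from(data)
    start = HEADER.size
    body = data[start:start + length]
    if len(body) < length:
        raise ValueError("incomplete message")
    return msg_type, body.decode("utf-8"), data[start + length:]


# Two messages back to back, as they might arrive on a byte stream.
stream = encode(MSG_REQUEST, "time?") + encode(MSG_REPLY, "12:00")
first_type, first_text, rest = decode(stream)
second_type, second_text, rest = decode(rest)
print(first_type, first_text, second_type, second_text)
```

Because the format is defined in terms of bytes rather than any language's data structures, a sender and a receiver written in different languages can interoperate, which is the point of implementation independence.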

90
Layered network architecture
  • Network hardware transfers electronic
    signals, which represent a bit stream, between
    two devices
  • Modern day network applications require an
    application programming interface (API) which
    masks the underlying complexities of data
    transmission
  • A layered network architecture allows the
    functionalities needed to mask the complexities
    to be provided incrementally, layer by layer
  • Actual implementation of the functionalities may
    not actually follow this structure, i.e., it may
    not be clearly divided by layer

91
The OSI seven-layer network architecture
92
Conceptual layering
  • The division of the layers is conceptual: the
    implementation of the functionalities need not be
    clearly divided as such in the hardware and
    software that implements the architecture.
  • The conceptual division serves at least two
    useful purposes:
  • 1. Systematic specification of protocols: it
    allows protocols to be specified systematically.
  • 2. Conceptual data flow: it allows programs to be
    written in terms of logical data flow.

93
The TCP/IP protocol suite
  • The Transmission Control Protocol/Internet
    Protocol suite is a set of network protocols
    which supports a four-layer network architecture
  • It is the protocol suite currently used on the
    Internet

94
The TCP/IP protocol suite (2)
  • The Internet layer implements the Internet
    Protocol, which provides the functionalities for
    allowing data to be transmitted between any two
    hosts on the Internet (best-effort packet
    routing)
  • The Transport layer delivers the transmitted data
    to a specific process running on an Internet host
    (reliable byte stream)
  • The Application layer supports the programming
    interface used for building a program
    (application API)

95
Network resources
  • Network resources are resources available to the
    participants of a distributed computing community
  • Network resources include hardware such as
    computers and equipment, and software such as
    processes, email mailboxes, files, web documents
  • An important class of network resources is
    network services such as the World Wide Web and
    file transfer (FTP), which are provided by
    specific processes running on computers

96
Identification of network resources (naming
problem)
  • One of the key challenges in distributed
    computing is the unique identification of
    resources available on the network, such as
    e-mail mailboxes, and web documents
  • Addressing an Internet Host
  • Addressing a process running on a host
  • Email Addresses
  • Addressing web content: URL

97
Addressing Internet hosts
98
The Internet topology
99
The Internet topology
  • The Internet consists of a hierarchy of networks,
    interconnected via a network backbone
  • Each network has a unique network address
  • Computers, or hosts, are connected to a network.
    Each host has a unique ID within its network.
  • Each process running on a host is associated with
    zero or more ports. A port is a logical entity
    for data transmission.

100
The Internet addressing scheme
  • In IP version 4, each address is 32 bits long.
  • The address space accommodates 2^32 (about 4.3
    billion) addresses in total.
  • Addresses are divided into 5 classes (A through E)

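[Figure: layout of the five address classes across
bytes 0 to 3, identified by the leading bits of
byte 0]
  • Class A address: leading bit 0, then the network
    address, then the host portion
  • Class B address: leading bits 10, then the
    network address, then the host portion
  • Class C address: leading bits 110, then the
    network address, then the host portion
  • Multicast addresses (class D): leading bits 1110,
    then the multicast group
  • Reserved addresses (class E): leading bits 1111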
101
The Internet addressing scheme (2)
Subdividing the host portion of an Internet
address
[Figure: a class B address (leading bits 10,
network address, host portion) in which the host
portion is subdivided into a subnet address and a
local host address]
A class A or class C address space can also be
similarly subdivided.
Which portion of the host address is used for the
subnet identification is determined by a subnet
mask.
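
How a subnet mask separates the subnet from the local host can be sketched with Python's standard `ipaddress` module. The address 129.65.24.50 and the mask 255.255.255.0 are chosen for illustration only; the mask in particular is an assumption, since the slide does not give one.

```python
import ipaddress

def split_address(address, mask):
    """Split an IPv4 address into (network + subnet part, local host part)."""
    addr = int(ipaddress.IPv4Address(address))
    m = int(ipaddress.IPv4Address(mask))
    network_part = ipaddress.IPv4Address(addr & m)   # bits selected by the mask
    host_part = addr & ~m & 0xFFFFFFFF               # bits the mask leaves out
    return str(network_part), host_part


subnet, host = split_address("129.65.24.50", "255.255.255.0")
print(subnet, host)
```

With this mask, the third byte of the class B host portion identifies the subnet and only the last byte is left for the local host address.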
102
Example
  • Suppose the dotted-decimal notation for a
    particular Internet address is 129.65.24.50. The
    32-bit binary expansion of the notation is as
    follows:
  • 10000001.01000001.00011000.00110010
  • Since the leading bit sequence is 10, the address
    is a class B address. Within the class, the
    network portion is identified by the remaining
    bits in the first two bytes, that is,
    00000101000001, and the host portion is the
    value of the last two bytes, or
    0001100000110010.
  • For convenience, the binary prefix for class
    identification is often included as part of the
    network portion of the address, so that we would
    say that this particular address is at network
    129.65 and then at host address 24.50 on that
    network.

103
Another example
  • Given the address 224.0.0.1, one can expand it as
    follows:
  • 11100000.00000000.00000000.00000001
  • The binary prefix of 1110 signifies that this is
    a class D, or multicast, address. Data packets
    sent to this address should therefore be
    delivered to the multicast group
    0000000000000000000000000001.
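
The two worked examples can be reproduced with a short sketch. The function names are illustrative; the prefix lengths follow the classful scheme (class A: 7 network bits after the prefix, class B: 14, class C: 21).

```python
def to_binary(address):
    """Dotted-decimal IPv4 address -> 32-character bit string."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError("not a dotted-decimal IPv4 address: " + address)
    return "".join(format(o, "08b") for o in octets)

# (class name, leading bits, number of network bits after the prefix)
CLASSES = [("A", "0", 7), ("B", "10", 14), ("C", "110", 21)]

def classify(address):
    """Return (class, network bits or multicast group bits, host bits)."""
    bits = to_binary(address)
    for name, prefix, net_bits in CLASSES:
        if bits.startswith(prefix):
            split = len(prefix) + net_bits
            return name, bits[len(prefix):split], bits[split:]
    if bits.startswith("1110"):
        return "D", bits[4:], ""    # the remaining 28 bits name the multicast group
    return "E", bits[4:], ""        # leading bits 1111: reserved


print(to_binary("129.65.24.50"))
print(classify("129.65.24.50"))
print(classify("224.0.0.1"))
```

Running it on 129.65.24.50 gives class B with the 14 network bits and 16 host bits quoted in the first example, and 224.0.0.1 comes out as class D with a 28-bit multicast group.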

104
The Internet addressing scheme (3)
  • For human readability, Internet addresses are
    written in a dotted decimal notation
  • nnn.nnn.nnn.nnn, where each nnn group is
    a decimal value in the range of 0 through 255
  • Internet host table (found in the /etc/hosts file):
  • 127.0.0.1 localhost
  • 129.65.242.5 falcon.csc.calpoly.edu falcon loghost
  • 129.65.241.9 falcon-srv.csc.calpoly.edu falcon-srv
  • 129.65.242.4 hornet.csc.calpoly.edu hornet
  • 129.65.241.8 hornet-srv.csc.calpoly.edu hornet-srv
  • 129.65.54.9 onion.csc.calpoly.edu onion
  • 129.65.241.3 hercules.csc.calpoly.edu hercules
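
A host table of this shape is simple enough to parse by hand. The sketch below reads hosts-file text (an address followed by one or more names per line) into a lookup dictionary. It uses a few of the entries above as sample data, and the function name is illustrative.

```python
HOSTS_TEXT = """\
127.0.0.1     localhost
129.65.242.5  falcon.csc.calpoly.edu falcon loghost
129.65.242.4  hornet.csc.calpoly.edu hornet
"""

def parse_hosts(text):
    """Map every host name and alias in hosts-file text to its address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and surrounding space
        if not line:
            continue                           # skip blank lines
        address, *names = line.split()
        for name in names:
            table.setdefault(name, address)    # the first entry for a name wins
    return table


hosts = parse_hosts(HOSTS_TEXT)
print(hosts["falcon"], hosts["loghost"], hosts["localhost"])
```

A static table like this only scales to a handful of machines, which is the motivation for a distributed naming service.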

105
IP version 6 addressing scheme
  • Each address is 128 bits long.
  • There are three types of addresses:
  • Unicast: an identifier for a single interface.
  • Anycast: an identifier for a set of interfaces
    (typically belonging to different nodes). A
    packet sent to an anycast address is delivered
    to one of the interfaces identified by that
    address.
  • Multicast: an identifier for a set of interfaces
    (typically belonging to different nodes). A
    packet sent to a multicast address is delivered
    to all interfaces identified by that address.
  • See Request for Comments (RFC) 2373,
    http://www.faqs.org/rfcs/ (link is in the book's
    references)

106
The Domain Name System (DNS)
  • Each Internet address is mapped to a symbolic
    name, using the DNS, in the format of
  • <computer-name>.<subdomain
    hierarchy>.<organization>.<sector name>.<country
    code>
  • e.g., www.csc.calpoly.edu.us

107
The Domain Name System
  • For network applications, a domain name must be
    mapped to its corresponding Internet address.
  • Processes known as domain name system servers
    provide the mapping service, based on a
    distributed database of the mapping scheme.
  • The mapping service is offered by thousands of
    DNS servers on the Internet, each responsible for
    a portion of the name space, called a zone.
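
The idea that each server answers only for its own zone and refers other queries onward can be sketched with a toy, in-memory model. The zone contents below are hypothetical, and real DNS resolution involves considerably more (caching, record types, multiple servers per zone).

```python
# Hypothetical zones: each one either holds the record or names the zone to ask next.
ZONES = {
    "root":        {"delegate": {"edu": "edu"}},
    "edu":         {"delegate": {"calpoly.edu": "calpoly.edu"}},
    "calpoly.edu": {"records": {"falcon.csc.calpoly.edu": "129.65.242.5"}},
}

def resolve(name):
    """Follow delegations from the root zone; return (address, zones visited)."""
    zone, visited = "root", []
    while True:
        visited.append(zone)
        data = ZONES[zone]
        if name in data.get("records", {}):
            return data["records"][name], visited
        for suffix, next_zone in data.get("delegate", {}).items():
            if name == suffix or name.endswith("." + suffix):
                zone = next_zone       # this zone refers the query onward
                break
        else:
            return None, visited       # no record and no delegation: unknown name


address, path = resolve("falcon.csc.calpoly.edu")
print(address, path)
```

No single zone holds the whole mapping, yet the lookup succeeds by visiting the root, then "edu", then "calpoly.edu".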

108
Domain name hierarchy
109
Name lookup and resolution
  • If a domain name is used to address a host, its
    corresponding IP address must be obtained for the
    lower-layer network software.
  • The mapping, or name resolution, must be
    maintained in some registry.
  • For runtime name resolution, a network service is
    needed, and a protocol must be defined for the
    naming scheme and for the service. Examples: the
    DNS service supports the DNS; the Java RMI
    registry supports RMI object lookup; JNDI (the
    Java Naming and Directory Interface) is an API
    for looking up network services.

110
Addressing a process running on a host
111
Logical ports
[Figure: processes on host A and host B, each
attached to ports, communicating across the
Internet]
Each host has 65536 ports.
112
Well known ports
  • Each Internet host has 2^16 (65,536) logical
    ports. Each port is identified by a number
    between 0 and 65535, and can be allocated to a
    particular process
  • Port numbers between 1 and 1023 are reserved for
    processes which provide well-known services such
    as finger, FTP, HTTP, and email.

113
Well known ports
114
Choosing a port to run your program
  • For programming, when a port is needed, choose a
    random number above the well-known ports, in the
    range 1,024-65,535
  • To provide a network service for the community,
    arrange to have a port assigned to and reserved
    for your service
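
As an alternative to picking a number by hand, a program can ask the operating system for a free port by binding to port 0. This is a different technique from the one on the slide, shown here as a sketch using Python's standard `socket` module; the function name is illustrative.

```python
import socket

def open_listening_socket(port=0):
    """Bind a TCP socket on the loopback address; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", port))
    sock.listen()
    return sock


server = open_listening_socket()
chosen_port = server.getsockname()[1]   # the port number the OS assigned
server.close()
print(chosen_port)
```

The assigned number varies from run to run, so a client has to learn it some other way, for example through a naming service.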

115
Addressing a Web document
116
The Uniform Resource Identifier (URI)
  • Resources to be shared on a network need to be
    uniquely identifiable.
  • On the Internet, a URI is a character string
    which allows a resource to be located.
  • There are two types of URIs
  • URL (Uniform Resource Locator) points to a
    specific resource at a specific location
  • URN (Uniform Resource Name) points to a specific
    resource at a nonspecific location.

117
URL
  • A URL has the format of
  • protocol://host address[:port]/directory
    path/file name[#section]

118
More on URL
  • The path in a URL is relative to the document
    root of the server.
  • A URL may appear in a document in a relative
    form
  • <a href="another.html">
  • and the actual URL referred to will be
    another.html preceded by the protocol, hostname,
    and directory path of the document.
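
Both the URL format and the resolution of a relative reference can be tried out with Python's standard `urllib.parse` module. The URL below is a hypothetical example.

```python
from urllib.parse import urljoin, urlsplit

url = "http://www.example.com:8080/docs/intro/index.html#summary"   # hypothetical URL

parts = urlsplit(url)
# parts.scheme   -> the protocol
# parts.hostname -> the host address
# parts.port     -> the port
# parts.path     -> the directory path and file name
# parts.fragment -> the section
print(parts.scheme, parts.hostname, parts.port, parts.path, parts.fragment)

# A relative reference is resolved against the URL of the containing document.
resolved = urljoin(url, "another.html")
print(resolved)
```

The relative reference keeps the protocol, host, port, and directory path of the base document and replaces only the file name.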

119
Summary
120
Summary (1)
  • We discussed the following topics
  • What is meant by distributed computing
  • Distributed systems
  • Distributed algorithms
  • Basic concepts in operating systems: processes
    and threads
  • Basic concepts in programming languages

121
Summary (2)
  • Basic concepts in networks
  • Layered network architectures: the OSI model and
    the Internet model
  • TCP/IP protocol suite
  • Connection-oriented communication vs.
    connectionless communication
  • Internet addressing
  • Naming schemes for network resources
  • Domain Name System (DNS)
  • Protocol port numbers
  • Web documents and Uniform Resource Identifiers
    (URI)