Socket Options - PowerPoint PPT Presentation

Slides: 57
Provided by: david520

Transcript and Presenter's Notes

Title: Socket Options


1
  • Socket Options
  • Four functions define the API for getting and
    setting socket options. We will cover those
    aspects which address generic, IPv4, TCP, UDP and
    raw socket options.
  • getsockopt and setsockopt functions.
  • int getsockopt (int sockfd, int level, int
    optname, void *optval, socklen_t *optlen)
  • sockfd must refer to an OPEN socket descriptor
  • level specifies the code that interprets the
    option (generic socket code, IPv4, IPv6, TCP).
  • optval is a ptr to a var into which the current
    value of the option is stored by getsockopt.
    optlen is a value-result argument (size of the
    buffer going in, size of the stored value coming
    out).
  • getsockopt is effectively a query. In an OO
    language a get is typically an operation to
    retrieve a reference to a class and its methods.

2
  • Socket Options
  • int setsockopt (int sockfd, int level, int
    optname, const void *optval, socklen_t optlen)
  • optval is a pointer to the datatype for each
    option (typically an int or a struct).
  • Options are basically a mechanism whereby certain
    communication features either in the kernel or in
    IPv4, or TCP may be enabled / disabled / modified
    / queried.
  • Two primary option datatypes
  • Binary options used to enable or disable features
    (flag settings)
  • Return options that retrieve and return specific
    values for either examination or modification.
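
A minimal sketch of the two calls for the common case of an int-valued option. The helper names set_int_option and get_int_option are made up for illustration; only getsockopt and setsockopt themselves come from the slides.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Set an int-valued option. Returns 0 on success, -1 on error (errno set). */
int set_int_option(int sockfd, int level, int optname, int value)
{
    return setsockopt(sockfd, level, optname, &value, sizeof(value));
}

/* Fetch an int-valued option into *value. Returns 0 on success, -1 on error. */
int get_int_option(int sockfd, int level, int optname, int *value)
{
    /* Value-result argument: buffer size going in, bytes stored coming out. */
    socklen_t len = sizeof(*value);
    return getsockopt(sockfd, level, optname, value, &len);
}
```

A binary (flag) option such as SO_KEEPALIVE is enabled by passing any nonzero value and disabled by passing 0. When queried, 0 means off and nonzero means on.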

3
  • Socket Options
  • Assignment for ALL (assignment 2).
  • Write a C program to determine all of the options
    that are available on your particular system
    (page 180 sockopt/checkopts.c)
  • email your results to the T/A Mr. Venkatarhagavan
    by 9/27.
  • Not all implementations support all socket
    options. Therefore we must (should) employ the
    #ifdef / #else / #endif compilation directives of
    lines 34 to 39 of Figure 7.2.
  • These compiler directives must (should) be used
    on EVERY socket option.
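
A sketch of the guarding pattern, assuming three sample options. This is not the textbook's checkopts.c, and the function names show_option and show_known_options are hypothetical. Each option sits behind its own #ifdef so the file still compiles on a system that lacks one of them.

```c
#include <stdio.h>
#include <sys/socket.h>

/* Print the current value of one int-valued SOL_SOCKET option. */
static void show_option(int fd, const char *name, int optname)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, SOL_SOCKET, optname, &val, &len) == -1)
        printf("%-14s getsockopt failed\n", name);
    else
        printf("%-14s %d\n", name, val);
}

/* Walk a few options, each behind its own guard.
 * Returns how many of them this system defines. */
int show_known_options(int fd)
{
    int count = 0;

#ifdef SO_BROADCAST
    show_option(fd, "SO_BROADCAST", SO_BROADCAST);
    count++;
#else
    printf("%-14s (undefined)\n", "SO_BROADCAST");
#endif

#ifdef SO_KEEPALIVE
    show_option(fd, "SO_KEEPALIVE", SO_KEEPALIVE);
    count++;
#else
    printf("%-14s (undefined)\n", "SO_KEEPALIVE");
#endif

#ifdef SO_REUSEPORT
    show_option(fd, "SO_REUSEPORT", SO_REUSEPORT);
    count++;
#else
    printf("%-14s (undefined)\n", "SO_REUSEPORT");
#endif

    return count;
}
```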

4
  • Socket Options
  • Some socket options must be set on the LISTENING
    socket. accept does not return the connected
    socket until the three-way handshake has
    completed, so an option that has to be in effect
    during the handshake cannot be set afterwards.
  • Some socket options are inherited from the
    listening socket (SO_DEBUG, SO_DONTROUTE,
    SO_KEEPALIVE, SO_LINGER, SO_OOBINLINE, SO_RCVBUF,
    SO_SNDBUF).
  • Generic socket options: they are handled not by
    protocol code but rather by the kernel (but some
    may apply to only certain types of protocols).

5
  • Socket Options
  • SO_BROADCAST socket option
  • enables/disables ability to send broadcast
    messages (datagram sockets only, and only on
    networks that support broadcast such as 802.3,
    802.4, 802.5) (which is most of the world).
  • SO_DEBUG option (TCP only).
  • When enabled the kernel keeps track of detailed
    information about all segments sent or received
    by that socket.
  • SO_DONTROUTE option
  • This option, when enabled, causes outgoing
    packets to bypass the normal routing mechanisms
    of the underlying protocol (the routing table
    lookup).
  • Often used by routing daemons (routed and gated)
    to bypass the routing table and force a packet to
    be sent out a particular interface.
  • Avoid enabling this option in ordinary
    applications. In most cases a destination that is
    not on a directly attached network cannot be
    reached, and the connection fails.

6
  • Socket Options
  • SO_ERROR option
  • This option can be queried but cannot be set. A
    very important option.
  • When a socket error occurs the kernel sets
    so_error to one of the standard Exxx values.
    (pending error for that socket).
  • The process affected can be notified in one of
    two ways. If the proc is blocked in select,
    select returns with the appropriate read / write
    condition set.
  • Or, if signal driven I/O is being employed, the
    SIGIO signal is generated.
  • In either case the value of so_error can be
    obtained by querying the SO_ERROR socket option.
  • After this query the value of so_error is reset
    to zero by the kernel.
  • If so_error is nonzero when read (with no data
    to return) or write is called, the call returns
    -1 with errno set to the value of so_error, which
    is then reset to zero.
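
A small sketch of the SO_ERROR query described above. The helper name fetch_pending_error is an assumption for illustration.

```c
#include <sys/socket.h>

/* Fetch and clear the socket's pending error.
 * Returns the pending Exxx value (0 if none),
 * or -1 if getsockopt itself failed (errno set). */
int fetch_pending_error(int sockfd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(sockfd, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
        return -1;
    return err;   /* the kernel has reset so_error by now */
}
```

A typical place to call it is right after select reports the socket readable or writable, for example when a non-blocking connect completes.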

7
  • Socket Options
  • SO_KEEPALIVE socket option.
  • Generally cannot employ at the user level.
  • When set for a TCP socket, if no data is
    exchanged in either direction for 2 hours, a
    keepalive probe is sent to the peer. Three
    outcomes are possible:
  • Peer responds with the expected ACK (nothing is
    reported to the application).
  • Peer responds with RST: the pending error is set
    to ECONNRESET and the socket is closed.
  • No response to the keepalive probe: BSDs will
    send 8 more probes. If there is still no
    response, the pending error is set to ETIMEDOUT
    and the socket is closed.
  • If an ICMP error is received in response to the
    probes then that error is returned instead, for
    example EHOSTUNREACH.

8
  • Socket Options
  • SO_KEEPALIVE
  • Most common question regarding SO_KEEPALIVE is
    whether the two hour inactive period can be
    modified.
  • Generally NOT. This is a kernel issue and not a
    socket issue. If changed then it may end up
    being changed for ALL sockets.
  • SO_KEEPALIVE is designed to detect a peer host
    crash, not a process crash (process crash will
    result in a FIN being sent).
  • SO_KEEPALIVE is generally used by servers to
    prevent de facto half-open connections.
  • Rlogin and Telnet servers use SO_KEEPALIVE to
    terminate the connection on hang-up or power
    down.
  • FTP handles the timeout within the application.
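
A sketch of turning keepalive on, with the helper name enable_keepalive made up for illustration. The slides treat the two-hour idle period as a kernel-wide setting. Some systems also offer a per-socket override, and the code below assumes Linux's TCP_KEEPIDLE as one such non-portable option, which is why it is guarded.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Enable keepalive probes on a TCP socket.
 * idle_secs > 0 additionally requests a per-socket idle time
 * where the system defines TCP_KEEPIDLE (an assumption here;
 * it is not part of the generic socket options).
 * Returns 0 on success, -1 on error. */
int enable_keepalive(int sockfd, int idle_secs)
{
    int on = 1;

    if (setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1)
        return -1;

#ifdef TCP_KEEPIDLE
    if (idle_secs > 0 &&
        setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof(idle_secs)) == -1)
        return -1;
#else
    (void)idle_secs;   /* no per-socket override on this system */
#endif
    return 0;
}
```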

9
  • Socket Options
  • TCP is sending - Peer process crashes
  • Peer TCP sends a FIN, which we can detect using
    select.
  • TCP is sending - Peer host crashes
  • Our TCP will time out and the pending error is
    set to ETIMEDOUT.
  • TCP is sending - peer host becomes unreachable
  • Our TCP will time out and the pending error is
    set to EHOSTUNREACH
  • TCP is receiving - Peer process crashes
  • Peer will send a FIN (which we read as EOF).
  • TCP is receiving - peer host crashes
  • We will stop receiving data and the connection
    goes idle (if SO_KEEPALIVE is set, probes are
    sent after 2 hours).
  • TCP is receiving - peer host is unreachable
  • We will stop receiving data.

10
  • Socket Options
  • Connection is idle - SO_KEEPALIVE is set. Peer
    process crashes.
  • Peer TCP sends FIN, detect using select.
  • Connection is idle - SO_KEEPALIVE is set. Peer
    host crashes.
  • Keepalive probes are sent after 2 hours. If
    no answer then the pending error is set to
    ETIMEDOUT.
  • Connection is idle - SO_KEEPALIVE is set. Peer
    host is unreachable.
  • Keepalive probes are sent after 2 hours. If
    no answer then the pending error (so_error) is
    set to EHOSTUNREACH.
  • Connection is idle - SO_KEEPALIVE is not set.
  • If peer proc crashes then a FIN is sent by
    peer.
  • If peer host crashes then we are lost (nothing
    is ever detected).
  • If peer becomes unreachable then we are lost.

11
  • Socket Options
  • SO_LINGER socket option.
  • Default on a close() is to return immediately but
    the kernel will attempt to send any data in the
    socket send buffer.
  • SO_LINGER allows modification of the default
    behavior for a close().
  • SO_LINGER requires the passing of the linger
    struct to the kernel.
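  • struct linger {
        int l_onoff;    /* 0 = off, nonzero = on */
        int l_linger;   /* linger time, Posix specs in seconds */
    };
  • If l_onoff is nonzero and l_linger is 0, TCP
    aborts the connection on close: it discards the
    send buffer and sends a RST (no normal
    four-segment termination).
  • This avoids the TIME-WAIT state, but it can cause
    problems: a new incarnation of the connection can
    be created while old duplicate segments are still
    in the network, and those segments may be
    delivered to it.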
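
A sketch of passing the linger struct to the kernel. The helper name set_linger is an assumption for illustration.

```c
#include <sys/socket.h>

/* Configure SO_LINGER.
 *   onoff == 0              : default close() behavior
 *   onoff != 0, seconds == 0: close() aborts the connection with a RST
 *   onoff != 0, seconds  > 0: close() blocks up to `seconds`
 * Returns 0 on success, -1 on error. */
int set_linger(int sockfd, int onoff, int seconds)
{
    struct linger lg;

    lg.l_onoff = onoff;
    lg.l_linger = seconds;
    return setsockopt(sockfd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```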

12
  • Socket Options
  • SO_LINGER option (continued).
  • If l_onoff is nonzero and l_linger is nonzero,
    then on close the kernel will first attempt to
    send any data in the send buffer until the linger
    time expires (meanwhile the process is blocked).
  • If the linger time expires before all data is
    sent and acknowledged, then EWOULDBLOCK is
    returned and the data in the send buffer is lost.
  • The correct way for a client to terminate a
    connection is to set the SO_LINGER option for
    some finite time and to use shutdown in
    conjunction with this option.
  • This way the client can KNOW that the server has
    read any transferred data.
  • The problem is that close() returns immediately
    (or can linger until the ACK of its FIN).
  • But shutdown(), followed by a read(), waits until
    the peer's FIN is received.
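
A sketch of the shutdown-then-read sequence, assuming a blocking stream socket. The function name close_after_peer_fin is hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send our FIN, wait until the peer's FIN arrives (read returns 0),
 * then close the descriptor.
 * Returns 0 if EOF was seen, -1 on error. The socket is closed either way. */
int close_after_peer_fin(int sockfd)
{
    char buf[512];
    ssize_t n;
    int rc = 0;

    if (shutdown(sockfd, SHUT_WR) == -1) {   /* half-close: sends FIN */
        close(sockfd);
        return -1;
    }
    for (;;) {
        n = read(sockfd, buf, sizeof(buf));
        if (n > 0)
            continue;          /* discard anything the peer still sends */
        if (n == 0)
            break;             /* EOF: the peer's FIN has been received */
        if (errno == EINTR)
            continue;          /* interrupted by a signal, try again */
        rc = -1;
        break;
    }
    close(sockfd);
    return rc;
}
```

EOF here tells us the peer has closed its end. It does not by itself prove that the peer application processed the data.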

13
  • Stevens suggests an application layer ACK. To
    make this work there has to be an agreed upon
    data send size or an agreed upon end-of-record
    marker.
  • My commentary is that this is NOT a sound
    engineering approach as it violates the OSI stack
    model: it mixes the responsibilities and methods
    of the application and network layers.
  • The approach of application layer hacks (or
    acks) reflects the old-school approach to Unix
    coding. Do whatever you have to do without
    regard to later consequences.
  • The result of this approach is hard to
    understand, difficult to modify software.
    Though it does create many jobs.

14
  • Some comments on buffer sizes: they should be at
    least three times the MSS (most systems use 8192
    bytes for a send/receive buffer versus a typical
    MSS of 512 (some 1460)).
  • Buffer size can be a problem on systems with
    large MTUs (some ATMs have MTUs as large as
    4096 bytes).
  • Recommended that the user not employ the
    SO_RCVBUF or the SO_SNDBUF socket options.
    Unless serious network programming is being done.
  • Most important concept is the relationship
    between a full-duplex pipe and the socket buffer
    sizes on the machines using the pipe.

15
  • SO_RCVBUF and SO_SNDBUF socket options
  • Receive buffers are used to hold received data
    until it is read by the application.
  • The available room in the receive buffer is the
    window that TCP advertises to the other end.
  • Therefore it cannot overflow
  • This is the core of TCP flow control.
  • If a peer ignores the window and sends data
    beyond it, the receiving TCP discards the data.
    UDP has no flow control: a datagram that does not
    fit in the receive buffer is discarded.
  • SO_RCVBUF and SO_SNDBUF allow the buffer sizes to
    be modified.
  • Modern TCP systems have buffers between 8k and
    64k bytes.
  • Modern UDP systems have 9k byte send buffers and
    about 40k byte receive buffers.

16
  • Socket Options
  • SO_RCVBUF must be set prior to the connect()
    call, because the window scale option is
    exchanged with the SYN segments.
  • For a server the receive buffer size must be set
    on the listening socket prior to the listen()
    call.
  • TCP socket buffers should be an even multiple of
    the MSS.
  • BANDWIDTH-DELAY PRODUCT
  • Multiply the bandwidth in bits/sec by the RTT in
    seconds and then convert the bits to bytes.
  • RTT can be determined with a ping.
  • Example: a T1 with a bandwidth of 1.536 Mb/s and
    an RTT of 60 ms gives a bandwidth-delay product
    of 11,520 bytes.
  • If the socket buffers are smaller than this then
    the pipe will not stay full and performance will
    be less than optimal.
  • Large buffers are required with high bandwidth or
    long RTT.
  • When the bandwidth-delay product is greater than
    TCP's max window size (64k bytes), the long fat
    pipe options are used.
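
The T1 arithmetic above can be sketched as follows. The function names are made up, and reading "even multiple of the MSS" as a multiple of 2 x MSS is my interpretation of the slide.

```c
#include <stdint.h>

/* Bandwidth-delay product in bytes:
 * bits/sec * RTT in ms, divided by 1000 (ms -> s) and by 8 (bits -> bytes). */
uint64_t bdp_bytes(uint64_t bits_per_sec, uint64_t rtt_ms)
{
    return bits_per_sec * rtt_ms / 1000 / 8;
}

/* Round a buffer size up to the next multiple of 2 * MSS. */
uint64_t round_up_even_mss(uint64_t bytes, uint64_t mss)
{
    uint64_t unit = 2 * mss;
    return (bytes + unit - 1) / unit * unit;
}
```

For the T1 example, bdp_bytes(1536000, 60) gives 11520. With an MSS of 1460 that rounds up to 11680 bytes as a candidate value for SO_RCVBUF and SO_SNDBUF.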

17
  • Socket Options
  • SO_RCVLOWAT and SO_SNDLOWAT.
  • Allow the changing of the low-water marks.
  • Receive low water mark is the amount of data that
    must be in the socket receive buffer for select()
    to return readable.
  • Defaults to 1 for TCP and UDP sockets.
  • The send low water mark is the amount of free
    space that must be available in the socket send
    buffer for select() to return writable.
  • Normally defaults to 2048 for TCP
  • With UDP the low-water mark is used, but since
    the number of bytes of available space in the UDP
    send buffer never changes, the UDP socket is
    always writable as long as the UDP socket send
    buffer size is greater than the low water mark.
  • UDP does NOT keep a copy of the datagrams sent by
    the application.
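
A sketch of changing the receive low-water mark, with hypothetical helper names. Whether a given system lets SO_RCVLOWAT be changed, and whether it caps the value, varies, so treat the setter as something to error-check rather than rely on.

```c
#include <sys/socket.h>

/* Ask select() to report the socket readable only once at least
 * `bytes` bytes are queued. Returns 0 on success, -1 on error. */
int set_recv_low_water(int sockfd, int bytes)
{
    return setsockopt(sockfd, SOL_SOCKET, SO_RCVLOWAT, &bytes, sizeof(bytes));
}

/* Return the current receive low-water mark, or -1 on error. */
int get_recv_low_water(int sockfd)
{
    int bytes = 0;
    socklen_t len = sizeof(bytes);

    if (getsockopt(sockfd, SOL_SOCKET, SO_RCVLOWAT, &bytes, &len) == -1)
        return -1;
    return bytes;
}
```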

18
  • SO_RCVTIMEO and SO_SNDTIMEO
  • Allow the setting of timeouts on socket receives
    and sends.
  • Argument is a pointer to a timeval struct
    identical to the one used with select().
  • Disable a timeout by setting both fields of the
    struct to zero (0 seconds, 0 microseconds).
  • Both timeouts are disabled by default.
  • POSIX.1g does not require support for these
    options.
  • SO_REUSEADDR option.
  • allows a listening server to start and bind() its
    well known port even if previously established
    connections exist that use this port as their
    local port.
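
A sketch covering both options on this slide. The helper names set_recv_timeout and allow_address_reuse are assumptions for illustration.

```c
#include <sys/socket.h>
#include <sys/time.h>

/* Set the receive timeout. sec == 0 and usec == 0 disables it.
 * Returns 0 on success, -1 on error. */
int set_recv_timeout(int sockfd, long sec, long usec)
{
    struct timeval tv;

    tv.tv_sec = sec;
    tv.tv_usec = usec;
    return setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/* Enable SO_REUSEADDR. Call this before bind() on a listening server.
 * Returns 0 on success, -1 on error. */
int allow_address_reuse(int sockfd)
{
    int on = 1;
    return setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
}
```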

20
Class 8 - CSE 7348/5348
  • Goals for Class 8
  • Understand hierarchical IP addressing
  • Subnetting
  • Gateways.
  • Anagram? Abendego? Or abendego, or abnormal
    end to ego. Carolyn Meinel? Attrition.org.

21
Class 8 - CSE 7348/5348
  • Five classes of addresses in the original RFC-950
  • Network Working Group, Request for Comments 950,
    J. Mogul (Stanford), J. Postel (ISI), August
    1985
  • Internet Standard Subnetting Procedure
  • Status Of This Memo
  • This RFC specifies a protocol for the
    ARPA-Internet community. If subnetting is
    implemented it is strongly recommended that these
    procedures be followed. Distribution of this memo
    is unlimited.

22
Class 8 - CSE 7348/5348
Class   Leading bits   Remaining fields
A       0              network ID (7 bits), host ID (24 bits)
B       10             network ID (14 bits), host ID (16 bits)
C       110            network ID (21 bits), host ID (8 bits)
D       1110           multicast group (28 bits)
E       11110          reserved for future use (27 bits)

23
Class 8 - CSE 7348/5348
  • Most IP addresses are now classless.
  • What is assigned is a 32 bit network address and
    a corresponding 32 bit mask.
  • Bits of 1 in the mask cover the NETWORK address.
  • Bits of 0 in the mask cover the HOST address.
  • Bits of 1 in the mask are ALWAYS CONTIGUOUS FROM
    THE LEFT.
  • Bits of 0 in the mask are ALWAYS CONTIGUOUS FROM
    THE RIGHT.
  • This allows the mask to be specified as a PREFIX
    length, i.e. a class A address can be specified
    as having a PREFIX length of 8.
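
The mask/prefix relationship can be sketched like this, with addresses held as 32-bit integers in host byte order. The function names are made up for illustration.

```c
#include <stdint.h>

/* Build the 32-bit mask for a prefix length (0..32):
 * `prefix_len` one bits from the left, zero bits to the right. */
uint32_t prefix_to_mask(int prefix_len)
{
    if (prefix_len <= 0)
        return 0;
    if (prefix_len >= 32)
        return 0xFFFFFFFFu;
    return 0xFFFFFFFFu << (32 - prefix_len);
}

/* Return the prefix length of a mask, or -1 if its one bits
 * are not contiguous from the left. */
int mask_to_prefix(uint32_t mask)
{
    uint32_t inverted = ~mask;
    int len = 0;

    /* A valid inverted mask looks like 0...01...1, so adding 1
     * carries through every set bit and the AND comes out zero. */
    if ((inverted & (inverted + 1)) != 0)
        return -1;
    while (mask & 0x80000000u) {
        len++;
        mask <<= 1;
    }
    return len;
}
```

A class A network corresponds to prefix_to_mask(8), which is 255.0.0.0.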

24
Class 8 - CSE 7348/5348
  • The advantage of a classless address space is
    that we are no longer restricted by using fixed
    PREFIX lengths.
  • IPv4 addresses are sometimes written as a dotted
    decimal number followed by a slash, followed by
    the prefix length.
  • Using classless addresses requires classless
    routing, which is called CIDR (RFC 1519).
  • The purpose of CIDR was to reduce the size of the
    Internet routing tables and to reduce the rate of
    IPv4 address depletion.
  • IPv4 addresses are normally SUBNETTED.

25
Class 8 - CSE 7348/5348
  • SUBNETTING.
  • Extends addressing Hierarchy
  • network ID (assigned to site)
  • subnet ID (chosen by site)
  • host ID (chosen by site)
  • The boundary between the subnet ID and the
    network ID is fixed by the PREFIX length of the
    assigned NETWORK id.
  • The prefix length is normally assigned by the
    ISP.
  • The boundary between the subnet ID and the host
    ID is chosen by the site.
  • All hosts on a given subnet share a common subnet
    MASK.
  • The subnet MASK specifies the boundary between
    the subnet ID and the host ID.

26
Class 8 - CSE 7348/5348
  • Subnetting (continued)
  • Example
  • 206.62.226.0/24 - an entire class C network.
  • The user then divides the remaining 8 bits into a
    3 bit subnet ID and a 5 bit host ID.
  • The subnet mask is therefore 255.255.255.224
    (0xE0 for the last octet).
  • So how does this allow the more efficient use of
    the available address space?
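
One way to see the answer is to compute the pieces: the single /24 becomes up to 8 separate physical networks of 30 usable hosts each, all behind one assigned prefix. A sketch with hypothetical function names, addresses as 32-bit integers in host byte order:

```c
#include <stdint.h>

/* Base address of subnet number `index` within a network.
 * Assumes prefix_len + subnet_bits is between 1 and 31. */
uint32_t subnet_base(uint32_t network, int prefix_len,
                     int subnet_bits, uint32_t index)
{
    int host_bits = 32 - prefix_len - subnet_bits;
    return network | (index << host_bits);
}

/* Usable host addresses per subnet: all-zeros and all-ones
 * host IDs are excluded. Assumes 1 <= host_bits <= 31. */
uint32_t usable_hosts(int host_bits)
{
    return (1u << host_bits) - 2u;
}
```

With network 206.62.226.0 (0xCE3EE200), subnet 1 starts at 206.62.226.32 and usable_hosts(5) is 30.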

27
Class 8 - CSE 7348/5348
  • Gateways
  • From RFC 950
  • 2.2 Changes to Host Software to Support Subnets
  • In most implementations of IP, there is code in
    the module that handles outgoing datagrams to
    decide if a datagram can be sent directly to the
    destination on the local network or if it must be
    sent to a gateway.
  • Generally the code is something like this:
      IF ip_net_number(dg.ip_dest) = ip_net_number(my_ip_addr)
      THEN
          send_dg_locally(dg, dg.ip_dest)
      ELSE
          send_dg_locally(dg,
              gateway_to(ip_net_number(dg.ip_dest)))

28
Class 8 - CSE 7348/5348
  • Gateways
  • To support subnets, it is necessary to store one
    more 32-bit quantity, called my_ip_mask. This is
    a bit-mask with bits set in the fields
    corresponding to the IP network number, and
    additional bits set corresponding to the subnet
    number field. (or the subnet MASK).
  • The code then becomes:
      IF bitwise_and(dg.ip_dest, my_ip_mask)
             = bitwise_and(my_ip_addr, my_ip_mask)
      THEN
          send_dg_locally(dg, dg.ip_dest)
      ELSE
          send_dg_locally(dg,
              gateway_to(bitwise_and(dg.ip_dest, my_ip_mask)))
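
The masked comparison at the heart of the RFC 950 pseudocode, as a small C sketch. The function name is made up, and addresses are assumed to be 32-bit integers in host byte order.

```c
#include <stdint.h>

/* Returns nonzero if `dest` is on the same subnet as `my_addr`,
 * i.e. the datagram can be sent directly rather than via a gateway. */
int is_on_local_subnet(uint32_t dest, uint32_t my_addr, uint32_t my_mask)
{
    return (dest & my_mask) == (my_addr & my_mask);
}
```

With mask 255.255.255.224, hosts 206.62.226.35 and 206.62.226.40 are on the same subnet, while 206.62.226.66 has to go through a gateway.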

29
Class 8 - CSE 7348/5348
  • RFC 1812, the mother of all RFCs, defines the
    addressing, connections and routing of the IP
    layer.
  • F. Baker, Editor Cisco Systems, June 1995
  • 2.2.5 Addressing Architecture
  • An IP datagram carries 32-bit source and
    destination addresses, each of which is
    partitioned into two parts - a constituent
    network prefix and a host number on that network.
    Symbolically
  • IP-address ::= { <Network-prefix>, <Host-number> }
  • To finally deliver the datagram, the last router
    in its path must map the Host-number (or rest)
    part of an IP address to the host's Link Layer
    address.
  • ASSIGNMENT: READ SECTION 2 OF RFC 1812 IN ITS
    ENTIRETY. THIS IS TESTABLE MATERIAL.

30
Class 8 - CSE 7348/5348
  • RFC 1812 (continued)
  • This simple notion (the notion of class based
    addressing) has been extended by the concept of
    subnets. These were introduced to allow arbitrary
    complexity of interconnected LAN structures
    within an organization, while insulating the
    Internet system against explosive growth in
    assigned network prefixes and routing complexity.
    Subnets provide a multi-level hierarchical
    routing structure for the Internet system. The
    subnet extension, described in [INTERNET:2], is a
    required part of the Internet architecture. The
    basic idea is to partition the <Host-number>
    field into two parts: a subnet number, and a true
    host number on that subnet:
  • IP-address ::= { <Network-number>,
    <Subnet-number>, <Host-number> }

31
Class 8 - CSE 7348/5348
  • RFC 1812
  • The interconnected physical networks within an
    organization use the same network prefix but
    different subnet numbers. The distinction between
    the subnets of such a subnetted network is not
    normally visible outside of that network. Thus,
    routing in the rest of the Internet uses only the
    <Network-prefix> part of the IP destination
    address. Routers outside the network treat
    <Network-prefix> and <Host-number> together as an
    uninterpreted rest part of the 32-bit IP address.
    Within the subnetted network, the routers use the
    extended network prefix:
  • { <Network-number>, <Subnet-number> }

32
Class 8 - CSE 7348/5348
  • RFC 1812
  • The bit positions containing this extended
    network number have historically been indicated
    by a 32-bit mask called the subnet mask. The
    <Subnet-number> bits SHOULD be contiguous and
    fall between the <Network-number> and the
    <Host-number> fields. More up to date protocols
    do not refer to a subnet mask, but to a prefix
    length; the "prefix" portion of an address is
    that which would be selected by a subnet mask
    whose most significant bits are all ones and the
    rest are zeroes. The length of the prefix equals
    the number of ones in the subnet mask. This
    document assumes that all subnet masks are
    expressible as prefix lengths.

33
Class 8 - CSE 7348/5348
  • RFC 1812
  • 2.2.5.2 Classless Inter Domain Routing (CIDR)
  • The explosive growth of the Internet has forced a
    review of address assignment policies. The
    traditional uses of general purpose (Class A, B,
    and C) networks have been modified to achieve
    better use of IP's 32-bit address space.
    Classless Inter Domain Routing (CIDR)
    [INTERNET:15] is a method currently being
    deployed in the Internet backbones to achieve
    this added efficiency. CIDR depends on deploying
    and routing to arbitrarily sized networks. In
    this model, hosts and routers make no assumptions
    about the use of addressing in the internet. The
    Class D (IP Multicast) and Class E (Experimental)
    address spaces are preserved, although this is
    primarily an assignment policy.
  • By definition, CIDR comprises three elements:
  • topologically significant address assignment,
  • routing protocols that are capable of aggregating
    network layer reachability information, and
  • consistent forwarding algorithm ("longest
    match").

34
Class 8 - CSE 7348/5348
  • RFC-1812
  • 2.2.6 IP Multicasting
  • IP multicasting is an extension of Link Layer
    multicast to IP internets. Using IP multicasts, a
    single datagram can be addressed to multiple
    hosts without sending it to all. In the extended
    case, these hosts may reside in different address
    domains. This collection of hosts is called a
    multicast group. Each multicast group is
    represented as a Class D IP address. An IP
    datagram sent to the group is to be delivered to
    each group member with the same best-effort
    delivery as that provided for unicast IP traffic.
    The sender of the datagram does not itself need
    to be a member of the destination group.
  • The semantics of IP multicast group membership
    are defined in [INTERNET:4]. That document
    describes how hosts and routers join and leave
    multicast groups. It also defines a protocol, the
    Internet Group Management Protocol (IGMP), that
    monitors IP multicast group membership.

35
Class 8 - CSE 7348/5348
  • RFC 1812
  • Forwarding of IP multicast datagrams is
    accomplished either through static routing
    information or via a multicast routing protocol.
    Devices that forward IP multicast datagrams are
    called multicast routers. They may or may not
    also forward IP unicasts. Multicast datagrams are
    forwarded on the basis of both their source and
    destination addresses. Forwarding of IP multicast
    packets is described in more detail in Section
    5.2.1. Appendix D discusses multicast routing
    protocols.
  • 2.2.8.1 Embedded Routers
  • A router may be a stand-alone computer system,
    dedicated to its IP router functions.
    Alternatively, it is possible to embed router
    functions within a host operating system that
    supports connections to two or more networks. The
    best-known example of an operating system with
    embedded router code is the Berkeley BSD system.
    The embedded router feature seems to make
    building a network easy, but it has a number of
    hidden pitfalls

36
Class 8 - CSE 7348/5348
  • Classless Inter-Domain Routing (CIDR)
  • Conceptually collapses a block of contiguous
    Class C addresses into a single routing table
    entry.
  • A routing table entry consists of (Network
    Address, Count): one (Network Address, Count)
    pair is 1 table entry.
  • Network Address - smallest network address in the
    block.
  • Count - total number of network addresses in the
    block.
  • Example: the entry (192.5.48.0, 3) covers
    192.5.48.0, 192.5.49.0 and 192.5.50.0
  • In practice, CIDR does not restrict network
    numbers to Class C addresses
  • CIDR does not use an integer count for the block
    size.
  • CIDR requires the size of each block of addresses
    to be a power of two
  • CIDR uses bit masking to identify the block size.
37
Class 8 - CSE 7348/5348
  • Coordination of Address Allocation
  • To equally distribute the remaining available
    Class C addresses, they have been broken up into
    address groups. Each continent receives a block
    of addresses to be administered by a
    continent-level authority. Sub-authorities are
    delegated to disperse addresses within each
    country. All addresses were divided so that each
    address group differs by two numbers in the first
    octet.

38
Class 8 - CSE 7348/5348
  • Address allocation

39
Class 8 - CSE 7348/5348
  • CIDR Address Range Values
  • Two values are required to specify the range
  • Lowest address
  • 32-bit mask (which operates like a subnet mask)
  • Lowest - First address in the range
  • Highest - Last valid address in range, usually
    broadcast.
  • Mask - Similar to subnet masks
  • Example (2048 contiguous addresses starting at
    234.170.168.0)
  • Lowest: 234.170.168.0
  • Highest: 234.170.175.255
  • CIDR mask: 255.255.248.0

40
Class 8 - CSE 7348/5348
  • CIDR Router Functionality
  • Routers at a site with classless addresses must
    be changed to correctly route datagrams.
  • Address Class Interpretation within the router
    must be disabled.
  • Each entry in the routing table is a pair of
    address and mask. To determine the correct
    destination, the routing software uses a
    "longest-match" paradigm to select a route.
  • A given block of addresses can be subdivided and
    separate routes can be setup for each
    subdivision.
  • All nodes on a given network will be assigned
    addresses from the same fixed range.
  • Hosts and routers that use supernetting need
    unconventional routing software that understands
    ranges of addresses.
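
A sketch of longest-match selection over (address, mask) entries. The struct layout and names are assumptions for illustration, and a linear scan stands in for whatever data structure a real router would use.

```c
#include <stddef.h>
#include <stdint.h>

struct route {
    uint32_t network;    /* lowest address of the block */
    uint32_t mask;       /* contiguous mask for the block */
    uint32_t next_hop;
};

/* Return the matching entry with the longest prefix, or NULL if
 * nothing matches. With contiguous masks a longer prefix is also
 * a numerically larger mask, so masks can be compared directly. */
const struct route *longest_match(const struct route *table, size_t n,
                                  uint32_t dest)
{
    const struct route *best = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        if ((dest & table[i].mask) != table[i].network)
            continue;                      /* entry does not cover dest */
        if (best == NULL || table[i].mask > best->mask)
            best = &table[i];              /* more specific match */
    }
    return best;
}
```

An entry with network 0 and mask 0 matches every destination and so acts as a default route. A subdivided block simply appears as a second, longer-prefix entry that wins for the addresses it covers.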

41
Class 8 - CSE 7348/5348
  • Subnet interpretation can be chosen independently
    for each physical network. The standard specifies
    that a site using subnet addressing must choose a
    32-bit subnet mask for each network.
  • It is recommended that sites use contiguous
    subnet masks and that all physical networks
    sharing the same IP network address use the same
    subnet mask.

42
Class 8 - CSE 7348/5348
  • Subnet Addressing and Routing
  • This is the most accepted IP extension method to
    date. It has been standardized by the IAB.
  • Subnet addressing involves dividing the HOSTID
    part of an IP address into two sub-parts that
    identify
  • A physical network (usually within an autonomous
    system)
  • A host on that network.

44
Class 8 - CSE 7348/5348
  • Subnet routing
  • Subnet Routing Uses a modified IP routing
    algorithm that includes subnet masks as well as
    NETID and Next-hop addresses.
  • A subnet routing table entry is made of (subnet
    mask, network address, next-hop address)

45
Class 8 - CSE 7348/5348
  • Subnet 10 -> looking ahead
  • Anyone building a huge corporate intranet knows
    that life would be a lot simpler if the InterNIC
    would simply fork over one of those big Class A
    IP network addresses, the kind that supports 16
    million hosts or thousands of subnetworks.
  • Barring that near-miracle, however, there's
    always Plan B: Create a huge corporate intranet
    by subdividing a special Class A address
    space--network 10.0.0.0--into smaller
    subnetworks, thereby protecting corporate
    information assets while providing the
    flexibility of Internet access and address
    management and security.
  • These special network IP addresses are the ones
    designated in the Internet's RFCs (Request for
    Comment documents) for use as private
    networks--those TCP/IP networks not connected to
    the Internet. (Of course, you can use these
    addresses and still have full Internet
    connectivity.)

46
Class 8 - CSE 7348/5348
  • Not only can you use as much address space as you
    need (actually, smaller networks may find special
    Class B or C addresses adequate), the Subnet 10
    strategy lets you build protected intranets,
    isolated behind firewalls and proxy servers, and
    manage the networks' IP address space any way you
    like.
  • The Subnet 10 strategy also allows you to create
    a large-scale template for a comprehensive set of
    Internet and intranet services. It frees you from
    the restraints of configuring and managing a
    number of small Class C networks.
  • And even though it's a large-scale design capable
    of supporting a campuswide or regionwide network,
    it can be scaled down to individual offices,
    small networks or a tightly controlled intranet
    structure.

47
Class 8 - CSE 7348/5348
  • The key to building a Subnet 10 is that unlike
    networks with unique, Internet service provider-
    or InterNIC-assigned IP addresses, you must keep
    every host, router or workstation that uses these
    special addresses hidden from the Internet behind
    a firewall and a proxy host. They'll still be
    able to reach the Internet through the proxy, but
    you'll be free to tailor the internal network
    addresses any way you like.
  • There also must be a second, external network
    that is directly accessible from the Internet.
    Hosts that are publicly available--World Wide Web
    sites, an anonymous FTP host and the DNS (Domain
    Naming System) host, for example--reside on the
    external network.

48
Class 8 - CSE 7348/5348
  • The hosts on the internal, protected network have
    a unique, special identity. Their signature is
    those special Class A, B or C network addresses
    that mark them as members of the private network.
    Their special network addresses help to protect
    the intranet from intruders. An intruder will
    find it difficult to target a private network
    host by forging a private network address,
    because any IP datagrams bearing the special
    addresses will be discarded by external routers.
  • Behind the firewall and the proxy, intranet users
    can communicate freely with each other. When
    users venture out onto the Internet, they connect
    to the proxy, which establishes the connection
    for them.

49
Class 8 - CSE 7348/5348
  • The design has three elements: the inner network,
    the proxy hosts (sometimes called "bastion"
    hosts) and the outer network.
  • All user workstations, servers, E-mail hosts and
    a DNS server that knows about the inner network
    machines go on the inner network or its subnets.
    E-mail hosts or post offices are usually hidden
    from Internet E-mail systems by an SMTP gateway
    and a mail host on the outside network, which is
    visible to the Internet.
  • The proxy hosts sit on the border between the
    inner and outer networks. They're the line of
    defense that protects the hosts on the inner
    network. Proxies on the border connect to both
    the inner and the outer networks, so they know
    about both worlds, as well as what each should
    know about the other.

50
Class 8 - CSE 7348/5348
  • The outside network is the public address
    space--a separate network that is not part of
    Subnet 10. It has a real, registered and fully
    routable InterNIC or ISP-assigned network
    address. The outer network only has a few hosts,
    so a Class C address will do.
  • The outer network contains the hosts that
    Internet wanderers can see, contact and (since we
    live in dangerous times) attempt to hack. These
    include the organization's Web site, its
    anonymous FTP host, its E-mail gateway and the
    external DNS. The outer network portions of the
    proxy servers also reside on this network.
  • The DNS only knows the identities of the hosts on
    the outside network, along with the outside
    network identities of the proxies. The DNS on the
    inner network points to the external network DNS
    to resolve Internet host names.

51
Class 8 - CSE 7348/5348
  • Network Address Translation (NAT) is a vitally
    important Internet technology for a variety of
    reasons. It can provide load balancing for
    parallel processing, it can provide several types
    of strong access security, and it can provide
    fault-tolerance and high-availability. Finally,
    it can simplify some basic network administration
    functions. Below, we sketch the possible uses,
    and then follow up with Linux-specific
    applications.
  • RFC 1631
  • RFC 1631 describes the "traditional" NAT
    (Network Address Translation) that can be used
    for this kind of task. Basically, the idea
    behind NAT is to re-write the IP headers and
    substitute one numeric address for another. This
    document discusses some basic implementation
    issues, such as computing header checksums, and
    mentions problems with packet encryption, and
    ICMP. It does not discuss load-balancing or
    masquerading issues.
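
The header rewrite and checksum step described above can be sketched in C. This is a minimal illustration, not the RFC's reference code: the function names `ipv4_checksum` and `nat_rewrite_source` are hypothetical, and the sketch assumes the caller passes a raw IPv4 header as a byte array in network byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One's-complement sum of the header's 16-bit words, folded to 16 bits
 * and inverted. An IPv4 header length is always a multiple of 4 bytes. */
uint16_t ipv4_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((hdr[i] << 8) | hdr[i + 1]);
    while (sum >> 16)                      /* fold carries back in */
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Substitute a new source address (header bytes 12..15) and recompute
 * the header checksum (header bytes 10..11). */
void nat_rewrite_source(uint8_t *hdr, const uint8_t new_src[4])
{
    size_t ihl = (size_t)(hdr[0] & 0x0F) * 4;   /* header length in bytes */
    assert(ihl >= 20);                          /* minimum IPv4 header */

    memcpy(hdr + 12, new_src, 4);
    hdr[10] = 0;                                /* zero the field first */
    hdr[11] = 0;
    uint16_t csum = ipv4_checksum(hdr, ihl);
    hdr[10] = (uint8_t)(csum >> 8);
    hdr[11] = (uint8_t)(csum & 0xFF);
}
```

A header with a valid checksum sums to zero when `ipv4_checksum` is run over it, which gives a quick sanity check after rewriting. The sketch covers only the IP header; TCP and UDP checksums also cover the IP addresses through a pseudo-header, so a real translator has to adjust those as well.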

52
Class 8 - CSE 7348/5348
  • Masquerading
  • One variation of NAT, called masquerading, is
    already available in stock Linux kernels. The
    theory, tools and installation procedure are
    discussed in the IP Masquerade mini-HOWTO.
    Masquerading is designed to provide security. It
    is intended for use as a type of a firewall,
    hiding many hosts behind one IP address, and
    relabeling all packets from behind the firewall
    so that they appear to be coming from one
    location, the firewall itself.
  • IP Masq is very powerful and flexible in this
    respect, and the filter and accounting rules can
    be configured to handle complex network topologies.
    However, it does not currently support the
    inverse operation of distributing incoming
    packets to multiple servers.

53
Class 8 - CSE 7348/5348
  • Since NAT gateways operate at the IP packet level,
    most of them have built-in internetwork routing
    capability. The internetwork they are serving can
    be divided into several separate subnetworks
    (either using different backbones or sharing the
    same backbone) which further simplifies network
    administration and allows more computers to be
    connected to the network.
  • NAT and Proxies
  • A proxy is any device that acts on behalf of
    another. The term is most often used to denote
    Web proxying. A Web proxy acts as a "half-way"
    Web server: network clients make requests to the
    proxy, which then makes requests on their behalf
    to the appropriate Web server. Proxy technology
    is often seen as an alternative way to provide
    shared access to a single Internet connection.
    The main benefits of Web proxying are:

54
Class 8 - CSE 7348/5348
  • Local caching: a proxy can store
    frequently-accessed pages on its local hard disk;
    when these pages are requested, it can serve them
    from its local files instead of having to
    download the data from a remote Web server.
    Proxies that perform caching are often called
    caching proxy servers.
  • Network bandwidth conservation: if more than one
    client requests the same page, the proxy can make
    one request only to a remote server and
    distribute the received data to all waiting
    clients.
  • Both these benefits only become apparent in
    situations where multiple clients are very likely
    to access the same sites and so share the same
    data.
  • Unlike NAT, Web proxying is not a transparent
    operation: it must be explicitly supported by its
    clients. Due to early adoption of Web proxying,
    most browsers, including Internet Explorer and
    Netscape Communicator, have built-in support for
    proxies, but this must normally be configured on
    each client machine, and may be changed by the
    naive or malicious user.
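
The caching benefit listed above can be sketched in C as a small in-memory cache. This is only an illustration under stated assumptions: `struct proxy`, `proxy_get`, and `fetch_from_origin` are hypothetical names, the "fetch" is a stub standing in for a real HTTP request, and the cache lives in memory rather than on disk.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS  8
#define URL_MAX_LEN  128
#define BODY_MAX_LEN 256

struct cache_entry {
    int  used;
    char url[URL_MAX_LEN];
    char body[BODY_MAX_LEN];
};

struct proxy {
    struct cache_entry cache[CACHE_SLOTS];
    int next_slot;          /* round-robin replacement position */
    int upstream_requests;  /* times the remote server was contacted */
};

/* Hypothetical stand-in for downloading a page from the remote server. */
void fetch_from_origin(const char *url, char *body, size_t cap)
{
    snprintf(body, cap, "<html>content of %s</html>", url);
}

/* Serve a client request from the cache when possible; otherwise fetch
 * the page once, store it, and serve the stored copy. */
const char *proxy_get(struct proxy *p, const char *url)
{
    assert(p != NULL && url != NULL);

    for (int i = 0; i < CACHE_SLOTS; i++)
        if (p->cache[i].used && strcmp(p->cache[i].url, url) == 0)
            return p->cache[i].body;            /* cache hit */

    struct cache_entry *e = &p->cache[p->next_slot];
    p->next_slot = (p->next_slot + 1) % CACHE_SLOTS;
    e->used = 1;
    snprintf(e->url, sizeof e->url, "%s", url); /* long URLs are truncated */
    fetch_from_origin(url, e->body, sizeof e->body);
    p->upstream_requests++;
    return e->body;
}
```

Starting from a zero-initialized `struct proxy`, two calls to `proxy_get` with the same URL leave `upstream_requests` at 1, which is the bandwidth saving the slide describes. A real caching proxy would also honor expiry and validation rules, which this sketch ignores.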

55
Class 8 - CSE 7348/5348
  • The basic purpose of NAT is to multiplex traffic
    from the internal network and present it to the
    Internet as if it were coming from a single
    computer having only one IP address.
  • A modern NAT gateway must change the Source
    address on every outgoing packet to be its single
    public address. It therefore also renumbers the
    Source Ports to be unique, so that it can keep
    track of each client connection. The NAT gateway
    uses a port mapping table to remember how it
    renumbered the ports for each client's outgoing
    packets. The port mapping table relates the
    client's real local IP address and source port
    plus its translated source port number to a
    destination address and port. The NAT gateway can
    therefore reverse the process for returning
    packets and route them back to the correct
    clients.

56
Class 8 - CSE 7348/5348
  • When any remote server responds to a NAT client,
    incoming packets arriving at the NAT gateway will
    all have the same Destination address, but the
    destination Port number will be the unique Source
    Port number that was assigned by the NAT. The NAT
    gateway looks in its port mapping table to
    determine which "real" client address and port
    number a packet is destined for, and replaces
    these numbers before passing the packet on to the
    local client.
  • This process is completely dynamic. When a packet
    is received from an internal client, NAT looks
    for the matching source address and port in the
    port mapping table. If the entry is not found, a
    new one is created, and a new mapping port
    allocated to the client.
  • Incoming packet received on non-NAT port:
  • Look for source address, port in the mapping
    table
  • If found, replace source port with previously
    allocated mapping port
  • If not found, allocate a new mapping port
  • Replace source address with NAT address, source
    port with mapping port
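
The steps above, together with the reverse lookup for returning packets, can be sketched in C. This is a simplified illustration, not any particular gateway's implementation: all names (`struct nat_table`, `nat_outbound`, `nat_inbound`, `NAT_FIRST_PORT`) are hypothetical, the starting port 61000 is an assumption, and the table omits the destination address and port as well as entry expiry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NAT_TABLE_SIZE 64
#define NAT_FIRST_PORT 61000u   /* assumed start of the mapping-port range */

struct nat_entry {
    uint32_t client_addr;   /* client's real local IP address */
    uint16_t client_port;   /* client's real source port */
    uint16_t mapped_port;   /* translated source port seen outside */
};

struct nat_table {
    struct nat_entry entries[NAT_TABLE_SIZE];
    size_t   count;
    uint32_t public_addr;   /* the gateway's single public address */
};

/* Outgoing packet: find or allocate a mapping port, then replace the
 * source address and port in place. Returns 0, or -1 if the table is full. */
int nat_outbound(struct nat_table *t, uint32_t *src_addr, uint16_t *src_port)
{
    assert(t != NULL && src_addr != NULL && src_port != NULL);
    struct nat_entry *e = NULL;

    for (size_t i = 0; i < t->count; i++) {
        if (t->entries[i].client_addr == *src_addr &&
            t->entries[i].client_port == *src_port) {
            e = &t->entries[i];             /* found: reuse mapping port */
            break;
        }
    }
    if (e == NULL) {                        /* not found: allocate one */
        if (t->count == NAT_TABLE_SIZE)
            return -1;
        e = &t->entries[t->count];
        e->client_addr = *src_addr;
        e->client_port = *src_port;
        e->mapped_port = (uint16_t)(NAT_FIRST_PORT + t->count);
        t->count++;
    }
    *src_addr = t->public_addr;
    *src_port = e->mapped_port;
    return 0;
}

/* Returning packet: use the destination port to recover the "real"
 * client address and port. Returns 0, or -1 if no mapping exists. */
int nat_inbound(const struct nat_table *t, uint32_t *dst_addr, uint16_t *dst_port)
{
    assert(t != NULL && dst_addr != NULL && dst_port != NULL);
    for (size_t i = 0; i < t->count; i++) {
        if (t->entries[i].mapped_port == *dst_port) {
            *dst_addr = t->entries[i].client_addr;
            *dst_port = t->entries[i].client_port;
            return 0;
        }
    }
    return -1;
}
```

A linear scan keeps the sketch short; a gateway handling many connections would more likely index the table by hash in both directions.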