Title: Hyper Text Transfer Protocol
2HTTP
- HTTP defines how Web pages are requested and served on the Internet
- Early servers and browsers used an ad-hoc approach
- A standardized protocol, called HTTP/1.0, was derived from this
- The earlier approach is now called HTTP/0.9
- Later, HTTP/1.0 was extended to HTTP/1.1
- The protocol versions are backward compatible
- servers and browsers which can handle HTTP/1.1 can also handle HTTP/1.0 and HTTP/0.9
3History HTTP/0.9
- HTTP/0.9 was very simple
- A browser would send a request like this to a server
- GET /hobbies.html
- In response, the server would send the contents of the requested file.
- Only GET requests were supported
- Only a file path and name could appear in a GET request
- The response had to be an HTML document.
5History (contd.)
- Different browsers/servers soon extended this basic scheme in various ways
- To achieve some standardization, the HTTP/1.0 protocol was specified, in 1996, in a document called RFC 1945
- (for historical reasons, an Internet standard spec is called a Request for Comments, or RFC)
- This was soon extended to HTTP/1.1, in RFC 2068, released in January 1997
- An update to RFC 2068 was produced in June 1999, as RFC 2616
- Various other protocols, based on HTTP, have been produced from time to time
- we will see a cookie protocol, based on HTTP, which was specified in February 1997, in RFC 2109
6How HTTP Works
- HTTP sits on TCP, which, in turn, sits on IP
- Usually, HTTP servers are configured to listen on TCP port 80
- although sometimes a different port is used,
- particularly if two HTTP servers are running on one machine
- You can see how HTTP works by pretending to be a browser yourself
- Using telnet to connect to a server, you can issue a request and see the response
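The telnet experiment can also be scripted. The following is a minimal sketch, not a full HTTP client: the function names build_request and send_raw are made up for illustration, and the sketch assumes the server closes the connection once it has replied.

```python
import socket


def build_request(method: str, path: str, version: str = "HTTP/1.0") -> bytes:
    """Return a request line followed by an empty header list.

    Each RETURN typed into telnet corresponds here to CRLF, the line
    terminator that HTTP expects.
    """
    return f"{method} {path} {version}\r\n\r\n".encode("ascii")


def send_raw(host: str, request: bytes, port: int = 80) -> bytes:
    """Open a TCP connection, send the request, read until the server closes."""
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(request)
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling send_raw(host, build_request("GET", "/")) against a reachable HTTP server should return the same bytes that telnet would display.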
7Example
- If you were to point a browser at the URL
- http://student.cs.ucc.ie
- you would get an HTML home page which provides links to various pages for students, etc.
- The server on student.cs.ucc.ie uses the standard HTTP port, Port 80, so you can get the same page by
- telnetting to Port 80 on student.cs.ucc.ie
- and typing a GET request
8Connecting to the HTTP server on student.cs.ucc.ie
- On any machine, say interzone, specify the address and port in a telnet command
- interzone.ucc.ie> telnet student.cs.ucc.ie 80
- You will get the following response
- Trying 143.239.211.125...
- Connected to student.cs.ucc.ie.
- Escape character is '^]'.
- The HTTP server is now listening
9Requesting the home page
- Issue the following HTTP/1.0 request, noting that you must press RETURN twice
- GET / HTTP/1.0 RETURN
- RETURN
- The response consists of
- a status line,
- a sequence of headers and
- the requested home page
- Then you are told that the telnet connection was closed by the server,
- as you will see on the next slide
11The reply to your request
- The server's response
- HTTP/1.1 200 OK
- ...
- Content-Type: text/html
-
- <HTML>
- ...
- </HTML>
- Then your local telnet program tells you that the connection was closed by the server
- Connection closed by foreign host.
- interzone.ucc.ie>
12Getting a different page
- Consider the page whose URL is
- http://student.cs.ucc.ie/cs1064/jabowen/
- Telnet to the server
- interzone.ucc.ie> telnet student.cs.ucc.ie 80
- When the server is listening, ask for the page like this
- GET /cs1064/jabowen/ HTTP/1.0 RETURN
- RETURN
13What was going on above
- Once connected to an HTTP server, we can
- send an HTTP request line,
- optionally followed by request headers.
- In the cases above,
- GET / HTTP/1.0
- and
- GET /cs1064/jabowen/ HTTP/1.0
- were request lines
- Each request line was terminated by pressing RETURN
- In each case, the second RETURN marked the end of an empty list of request headers
14GET requests
- In GET / HTTP/1.0
- the / is the resource the client wants to get
- the HTTP/1.0 tells the server that the client is using the HTTP/1.0 protocol
- In GET /cs1064/jabowen/ HTTP/1.0
- the /cs1064/jabowen/ is the resource the client wants to get
- the HTTP/1.0 tells the server that the client is using the HTTP/1.0 protocol
- In each case, the server responds by sending a status line, a number of response headers and the content of the requested resource.
15Consider the response
- HTTP/1.1 200 OK
- ...
- Content-Type: text/html
-
- <HTML>
- ...
- </HTML>
- The first line, HTTP/1.1 200 OK, is a status line
- The next few lines, ending in the line Content-Type: text/html, are header lines
- The lines bounded by <HTML> and </HTML> form the content of the requested resource.
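The three parts of a response can be separated mechanically, because an empty line always ends the header block. The sketch below assumes the whole response is already in memory; split_response is an illustrative name, and repeated or folded headers are not handled.

```python
def split_response(raw: bytes):
    """Split a raw HTTP response into (status_line, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")  # blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<HTML>...</HTML>"
)
status_line, headers, body = split_response(sample)
print(status_line)              # HTTP/1.1 200 OK
print(headers["content-type"])  # text/html
print(body)                     # b'<HTML>...</HTML>'
```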
16HEAD requests
- HEAD requests were new in HTTP/1.0
- A HEAD request is similar to a GET, the only difference being the use of the word HEAD instead of the word GET, for example
- HEAD /cs1064/jabowen/ HTTP/1.0 RETURN
- RETURN
- The server sends the same status line and the same response headers as if it had received a GET request,
- but does not send the actual content of the resource mentioned in the request.
- Thus, human clients can use HEAD requests to
- easily access information about a resource on a server
- without being overwhelmed by the mass of detail that would be received if the resource content were sent in the response
17Example HEAD request
- Suppose, for example, we wanted to see information about
- http://student.cs.ucc.ie/cs1064/jabowen/
- such as its size, when it was last edited, etc.
- We can send the request
- HEAD /cs1064/jabowen/ HTTP/1.0
18 Response to example HEAD request
- HTTP/1.1 200 OK
- Date: Wed, 13 Dec 2000 12:21:35 GMT
- Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1
- Last-Modified: Thu, 07 Dec 2000 13:16:18 GMT
- ETag: "2160-29c6-3a2f8da2"
- Accept-Ranges: bytes
- Content-Length: 10694
- Connection: close
- Content-Type: text/html
19 Analysis of response
- The first line in the response
- HTTP/1.1 200 OK
- is the status line in which
- HTTP/1.1 indicates that the server can use HTTP/1.1 (although it can accept requests in earlier HTTP forms)
- 200 is a code which indicates the outcome of the request, as determined by the server
- OK is an English language phrase giving the meaning of the status code
- The other lines in the response give information either about the server or the resource
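The three fields of a status line can be pulled apart with a single split. This is a small sketch; StatusLine and parse_status_line are names chosen here for illustration only.

```python
from typing import NamedTuple


class StatusLine(NamedTuple):
    version: str  # protocol version used by the server, e.g. HTTP/1.1
    code: int     # numeric status code, e.g. 200
    phrase: str   # English phrase explaining the code, e.g. OK


def parse_status_line(line: str) -> StatusLine:
    # Split on the first two spaces only, since the phrase may contain spaces.
    version, code, phrase = line.split(" ", 2)
    return StatusLine(version, int(code), phrase)


print(parse_status_line("HTTP/1.1 200 OK"))
```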
20 Analysis (contd.)
- Date: Wed, 13 Dec 2000 12:21:35 GMT
- gives date/time of the response
- Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1
- gives details on server
- Last-Modified: Thu, 07 Dec 2000 13:16:18 GMT
- says when resource was last modified
- ETag: "2160-29c6-3a2f8da2"
- provides a supposedly-unique string to identify this entity
- Accept-Ranges: bytes
- says that this server could serve up pieces of this resource, pieces specifiable to the nearest byte
- Content-Length: 10694
- gives the size of the resource, in bytes
- Connection: close
- says that the server does not regard this as a persistent connection
- Content-Type: text/html
- gives the type of data in the resource
21Another example
- Suppose we wanted to learn about the resource with URL
- http://student.cs.ucc.ie/cs1064/jabowen/vh40.gif
- We can send the request
- HEAD /cs1064/jabowen/vh.gif HTTP/1.0
- Response is
- HTTP/1.1 200 OK
- Date: Wed, 13 Dec 2000 12:23:04 GMT
- Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1
- Last-Modified: Fri, 24 Nov 2000 11:46:00 GMT
- ETag: "3133-361-3a1e54f8"
- Accept-Ranges: bytes
- Content-Length: 865
- Connection: close
- Content-Type: image/gif
23HTTP/1.1
- A (fairly) detailed description
24
- We have just seen some example HTTP/1.0 interactions
- The same kinds of concepts we saw in these interactions will arise as we examine HTTP/1.1 in more detail
- The versions of HTTP have a great deal in common, so, in what follows, much of what is said will be true of all three versions
- Therefore, any mention of just HTTP will mean that the statement applies to HTTP/0.9, HTTP/1.0 and HTTP/1.1
25Overall Operation of HTTP
- The HTTP protocol is a request/response protocol.
- request
- An HTTP message sent by a client to a server
- response
- An HTTP message sent by a server to a client which has made a request.
- client
- A program that establishes connections for the purpose of sending requests.
- server
- A program that accepts connections in order to service requests by sending back responses.
- As we shall see, a program may act as both a client and a server.
26 Message from a client
- A client sends, over a connection, to a server
- a request line in the form of
- a request method,
- a URI (Uniform Resource Identifier), and
- a protocol version,
- possibly followed by a message containing
- request modifiers,
- information about the client,
- and (possibly) body content.
27Response from a server
- The server responds with
- a status line, in the form of
- the message's protocol version,
- a success or error code and
- an English phrase explaining the code
- possibly followed by a message containing
- server information,
- information about the entity in the body content (if any)
- and (possibly) body content.
28HTTP Communication
- Most communication
- is started by a user agent and
- consists of a request to be applied to a resource on some origin server.
- user agent
- A client (browser, spider, etc.) which initiates a request.
- resource
- A data object or service that can be identified by a URI.
- origin server
- The server on which a resource resides or is to be created.
30Simple communication
- Involves a single connection between user agent (UA) and origin server (O)
- This connection is denoted, in diagrams on this and future slides, by -------
- request chain -->
- UA ----------------------------------- O
- <-- response chain
31More complicated case
- Intermediaries present in request/response chain.
- request chain -->
- UA ----------- A ----------- B ----------- C ----------- O
- <-- response chain
- Above, 3 intermediaries (A, B, and C) lie between user agent and origin server.
- Intermediaries act as both clients and servers
- A request or response message that travels the whole chain passes through 4 separate connections
- UA-A connection
- A-B connection
- B-C connection
- C-O connection
32Simple versus complicated
- Distinction is important because some HTTP options may apply
- only to the connection with the nearest neighbour,
- only to the end-points of the chain,
- or to all connections along the chain.
333 forms of intermediary
- proxy, an agent which
- receives a request for a resource whose URI is in its absolute form and,
- if necessary, rewrites all or part of the message and forwards the reformatted request toward the server identified by the URI.
- gateway, an agent which
- acts as a translation interface to a server for another protocol, such as WAP, etc.
- tunnel, an agent which
- acts as a relay point between two connections without changing messages
- tunnels are used, for example, in security firewalls
34Caching
35Caching
- User agents, proxies and gateways (but not tunnels) may use a local cache to handle requests, instead of forwarding them on to an origin server
- A request/response chain is shortened if one of the parties along the chain has a cached response applicable to the request.
36Example Network topology
- The example caching scenarios in the next few slides will use this network
- UA3 ---------- D
-                  \
- UA2 -----         \
-           \        \
- UA1 ----- A ------ B -------- C --------- O
37Caching Example 1
- request chain -->
- UA1 ----------- A ----------- B -------- C --------- O
- <-- response chain
- In the example above
- the user has made a request for a resource on origin server O
- neither UA1 nor any of the proxies A, B or C has an appropriate cached response
- so the request has been forwarded all the way to O
- Four connections are involved in servicing the request
38Caching Example 2
- request
- chain
- UA1.... A ... B .. C O
- response
- chain
- In the example above
- the user has repeated the same request for a resource on O
- UA1 has a cached response to the earlier request and gives this to the user without sending the request anywhere
- No connection is involved in servicing the request
39Caching Example 3
- request chain -->
- UA2 -----------------
- UA1 ........ A .. B .. C ... O
- <-- response chain
- In the example above
- the user at UA2 has requested the same resource on origin server O that was earlier requested by the user at UA1
- UA2 has forwarded the request to proxy A
- proxy A has an appropriate cached response, from when it serviced the earlier request from UA1
- Only one connection is involved in servicing the request
40Caching Example 4
- request chain -->
- UA3 ---------- D --------
-
- UA1 ..... A .. B .. C ... O
- <-- response chain
- In the example above
- the user at UA3 has requested the same resource on origin server O that was earlier requested by the user at UA1
- UA3 has forwarded the request to proxy D, which has forwarded it to proxy B
- proxy B has an appropriate cached response, from when it serviced the earlier request from UA1
- Two connections are involved in servicing the request
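The four caching examples can be replayed with a toy model of the example network. This sketch rests on assumptions that are not in the slides: every node on the path keeps a copy of each response that passes through it, and cached copies never expire. NEXT_HOP and connections_used are illustrative names.

```python
# Next hop towards origin server O for each node in the example network.
NEXT_HOP = {"UA1": "A", "UA2": "A", "UA3": "D", "D": "B", "A": "B", "B": "C", "C": "O"}


def connections_used(user_agent: str, cached_at: set) -> int:
    """Count connections needed for a request, then cache along the path.

    cached_at holds the nodes that already have a usable cached response;
    it is updated in place so that later requests can benefit.
    """
    path = [user_agent]
    while path[-1] != "O" and path[-1] not in cached_at:
        path.append(NEXT_HOP[path[-1]])
    # Every node the response passes through keeps a copy (O is the source).
    cached_at.update(node for node in path if node != "O")
    return len(path) - 1


cache = set()
for ua in ["UA1", "UA1", "UA2", "UA3"]:
    print(ua, connections_used(ua, cache))
# UA1 4
# UA1 0
# UA2 1
# UA3 2
```

The four counts match Caching Examples 1 to 4 in order.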
41To cache or not?
- Not all responses are usefully cacheable
- As we will see later, some requests may contain modifiers which place special requirements on cache behavior.
- The same is true of responses
43Caching/Proxy architectures
- A wide variety of cache and proxy architectures/configurations exist, including
- national hierarchies of proxy caches to save international and/or intercontinental bandwidth,
- systems that broadcast or multicast cache entries,
- organizations that distribute subsets of cached data via CD-ROM,
- and so on.
44Connections
45Temporary Connections
- In most implementations of HTTP/1.0, a server closed a connection after it had serviced the request received on that connection
- We saw this earlier, when the server on student.cs.ucc.ie closed the telnet connection that we had established, after it had sent its response to the HTTP/1.0 GET request we had sent
- The use of inline images, sound files, etc., in web pages often requires a client to make multiple requests of the same server when loading one document
- Thus the temporary connections provided by HTTP/1.0 meant that loading even one web page required many separate TCP connections (one to fetch each inline image, each sound file, etc.)
- This imposed a significant unnecessary load on HTTP servers and caused congestion on the Internet.
46Advantages of Persistent Connections
- Persistent HTTP connections offer a number of advantages
- By opening and closing fewer TCP connections, CPU time is saved
- HTTP requests and responses can be pipelined on a connection, allowing a client to make multiple requests without waiting for each response
- Network congestion is reduced by reducing the number of packets caused by TCP opens
- Latency on subsequent requests is reduced since there is no time spent in TCP's connection-opening handshake.
47Persistent Connections in HTTP/1.1
- Unlike in HTTP/1.0 and earlier, persistent connections are the default behavior in HTTP/1.1.
- This means that, in HTTP/1.1, when a connection has been opened to service a request, it is kept open for further possible requests from the same client
- This is true even if the initial request triggered an error response from the server
- But, when no further request has been received after some time-out period, the server may close the connection
- However, a client can indicate, when making a request, that it wants the connection closed after the request is serviced
48Connection Persistency Negotiation
- HTTP/1.1 provides a mechanism by which a client and a server can signal the close of a TCP connection:
- the Connection header field.
- If an HTTP/1.1 client wants a connection closed after it receives a response to its request, it should include, in the request, a Connection header containing the token "close".
- Similarly, if an HTTP/1.1 server intends to close a connection after it sends a response to a request, it should include, in the response, a Connection header containing the token "close".
- If either the client or the server sends the close token in a Connection header, that request becomes the last one for the connection.
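The rule for deciding whether a connection stays open can be written as a short predicate. This is a sketch under stated assumptions: header names have already been lower-cased into a dict, and the keep-alive extensions that some HTTP/1.0 implementations offered are deliberately ignored. connection_persists is an illustrative name.

```python
def connection_persists(version: str, headers: dict) -> bool:
    """Decide whether the connection stays open after this message."""
    # A Connection header may carry several comma-separated tokens.
    tokens = [t.strip().lower() for t in headers.get("connection", "").split(",")]
    if "close" in tokens:
        return False  # either side may ask for the connection to be closed
    return version == "HTTP/1.1"  # persistent by default only in HTTP/1.1


print(connection_persists("HTTP/1.1", {}))                       # True
print(connection_persists("HTTP/1.1", {"connection": "close"}))  # False
print(connection_persists("HTTP/1.0", {}))                       # False
```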
49Example 1 Introduction
- A human, using a telnet client, sends an HTTP/1.0 request to an HTTP/1.1 server
- The server assumes that the client, because it is using HTTP/1.0, cannot handle persistent connections and, in its response, signals its intention to close the connection
- After printing the response, the telnet client says that the connection was closed by the foreign host
50Example 1
- interzone.ucc.ie> telnet student.cs.ucc.ie 80
- Trying 143.239.211.125...
- Connected to student.cs.ucc.ie.
- Escape character is '^]'.
- HEAD /cs1064/jabowen/ HTTP/1.0
-
- HTTP/1.1 200 OK
- Date: Sat, 06 Jan 2001 17:56:44 GMT
- Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1
- Last-Modified: Wed, 20 Dec 2000 11:34:46 GMT
- ETag: "2160-2dee-3a409956"
- Accept-Ranges: bytes
- Content-Length: 11758
- Connection: close
- Content-Type: text/html
-
- Connection closed by foreign host.
51Example 2 Introduction
- A human, using a telnet client, sends an HTTP/1.1 request to an HTTP/1.1 server
- The server assumes that the client, because it is using HTTP/1.1, wants a persistent connection
- thus, there is no Connection header in the response
- The telnet client prints the response for the human to see
- After a significant delay (the time-out period), the server concludes that the client has no further request and closes the connection
- The telnet client then tells the human that the connection was closed by the foreign host
52Example 2
- interzone.ucc.ie> telnet student.cs.ucc.ie 80
- Trying 143.239.211.125...
- Connected to student.cs.ucc.ie.
- Escape character is '^]'.
- HEAD /cs1064/jabowen/ HTTP/1.1
- Host: student.cs.ucc.ie
-
- HTTP/1.1 200 OK
- Date: Sat, 06 Jan 2001 17:57:08 GMT
- Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1
- Last-Modified: Wed, 20 Dec 2000 11:34:46 GMT
- ETag: "2160-2dee-3a409956"
- Accept-Ranges: bytes
- Content-Length: 11758
- Content-Type: text/html
- (A time-out period elapses before the server closes the connection)
- Connection closed by foreign host.
53Example 3 Introduction
- A human, using a telnet client, sends an HTTP/1.1 request to an HTTP/1.1 server
- The client knows that, because it is using HTTP/1.1, the server will think it wants a persistent connection
- Since the client does not want a persistent connection, it sends a Connection header with a close token in the request
- Seeing this, the server indicates its intention to close the connection immediately, by including a Connection header with a close token in its response
- The telnet client prints the response for the human to see and, immediately thereafter, tells the human that the connection was closed by the foreign host
54Example 3
- interzone.ucc.ie> telnet student.cs.ucc.ie 80
- Trying 143.239.211.125...
- Connected to student.cs.ucc.ie.
- Escape character is '^]'.
- HEAD /cs1064/jabowen/ HTTP/1.1
- Host: student.cs.ucc.ie
- Connection: close
-
- HTTP/1.1 200 OK
- Date: Sat, 06 Jan 2001 17:57:58 GMT
- Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1
- Last-Modified: Wed, 20 Dec 2000 11:34:46 GMT
- ETag: "2160-2dee-3a409956"
- Accept-Ranges: bytes
- Content-Length: 11758
- Connection: close
- Content-Type: text/html
-
- Connection closed by foreign host. (No time-out delay before this from telnet client)
55Pipelining Requests
- A client that supports persistent connections may "pipeline" its requests (i.e., send multiple requests without waiting for each response).
- A server must send its responses to those requests in the same order that the requests were received.
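On the wire, pipelining just means writing several complete requests one after another before reading anything back. The sketch below builds such a byte stream for HEAD requests; pipeline_heads is an illustrative name, and putting the close token only on the last request is one reasonable choice rather than a requirement.

```python
def pipeline_heads(host: str, paths: list) -> bytes:
    """Build back-to-back HEAD requests; only the last one asks to close."""
    parts = []
    for i, path in enumerate(paths):
        lines = [f"HEAD {path} HTTP/1.1", f"Host: {host}"]
        if i == len(paths) - 1:
            lines.append("Connection: close")
        # Each request ends with an empty line, exactly as when sent alone.
        parts.append("\r\n".join(lines) + "\r\n\r\n")
    return "".join(parts).encode("ascii")


print(pipeline_heads("student.cs.ucc.ie", ["/cs1064/jabowen/", "/cs4400/jabowen/"]).decode())
```

Because the server answers in request order, the client can match the first response to the first path, the second response to the second path, and so on.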
56Example 4 Introduction
- A human, using a telnet client, sends two HTTP/1.1 requests to an HTTP/1.1 server, sending the second request before even receiving a response to the first request
- Since there are only two requests, the client sends a Connection header with a close token in the second request
- The server responds to both requests and, because of the close token in the 2nd request, indicates its intention to close the connection immediately, by including a Connection header with a close token in its response to the 2nd request.
- The telnet client prints the responses for the human to see and, immediately thereafter, tells the human that the connection was closed by the foreign host
57Example 4 the pipelined requests
- interzone.ucc.ie> telnet student.cs.ucc.ie 80
- Trying 143.239.211.125...
- Connected to student.cs.ucc.ie.
- Escape character is '^]'.
- HEAD http://student.cs.ucc.ie/cs1064/jabowen/ HTTP/1.1
- Host: student.cs.ucc.ie
-
-
- HEAD http://student.cs.ucc.ie/cs4400/jabowen/ HTTP/1.1
- Host: student.cs.ucc.ie
- Connection: close
58 Example 4 the sequence of responses
- HTTP/1.1 200 OK
- Date: Wed, 31 Jan 2001 20:01:41 GMT
- Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1
- Last-Modified: Thu, 25 Jan 2001 13:26:32 GMT
- ETag: "2160-2e25-3a702988"
- Accept-Ranges: bytes
- Content-Length: 11813
- Content-Type: text/html
-
- HTTP/1.1 200 OK
- Date: Wed, 31 Jan 2001 20:01:41 GMT
- Server: Apache/1.3.14 (Unix) PHP/4.0.3pl1
- Last-Modified: Wed, 20 Dec 2000 12:42:39 GMT
- ETag: "13d3a-2b60-3a40a93f"
- Accept-Ranges: bytes
- Content-Length: 11104
- Connection: close
- Content-Type: text/html
59Pipelining Requests (contd.)
- Clients which assume persistent connections and pipeline immediately after connection establishment should be prepared to retry their connection if the first pipelined attempt fails.
- If a client does such a retry, it must NOT pipeline before it knows the connection is persistent.
- Clients must also be prepared to resend their requests if the server closes the connection before sending all of the corresponding responses.
60Pipelining Requests (contd.)
- Care must be taken when pipelining
- because some requests (called non-idempotent requests) may change the state of the server (for example, by changing a database used by the server)
- Clients should NOT pipeline such requests
- Otherwise, a premature termination of the transport connection could lead to indeterminate results.
- A client wishing to send a non-idempotent request should wait to send that request until it has received the response status for the previous request.
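One way a client might respect this rule is to group its queued requests into batches, pipelining only within a batch and waiting for all responses before starting the next. The sketch below is an illustration, not a prescribed algorithm: pipeline_batches is a made-up name, the idempotent set follows RFC 2616 section 9.1.2, and giving each non-idempotent request a batch of its own is a deliberately conservative choice.

```python
# Methods that RFC 2616 (section 9.1.2) treats as idempotent.
IDEMPOTENT = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"}


def pipeline_batches(methods: list) -> list:
    """Group request methods into batches that may be sent without waiting.

    A non-idempotent request always gets a batch of its own, so it is sent
    only after every earlier response has arrived, and nothing is queued
    behind it.
    """
    batches, current = [], []
    for method in methods:
        if method.upper() in IDEMPOTENT:
            current.append(method)
        else:
            if current:
                batches.append(current)
            batches.append([method])
            current = []
    if current:
        batches.append(current)
    return batches


print(pipeline_batches(["GET", "GET", "POST", "HEAD", "GET"]))
# [['GET', 'GET'], ['POST'], ['HEAD', 'GET']]
```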
62Uniform Resource Identifiers
63Uniform Resource Identifiers
- URIs have been known by many names
- WWW addresses,
- Universal Document Identifiers,
- Universal Resource Identifiers,
- Uniform Resource Locators (URL)
- Uniform Resource Names (URN).
- For HTTP, URIs are simply formatted strings
which identify (by name, location, or any other
characteristic) a resource.
65General URI Syntax
- URIs in HTTP can be represented in absolute form or relative to some known base URI.
- The two forms are differentiated by the fact that absolute URIs always begin with a scheme name followed by a colon.
66http-scheme URIs
- A URI which is based on the http scheme must be of the syntactic form
- "http:" "//" host [ ":" port ] [ abs_path [ "?" query ] ]
- where items enclosed in [ ] are optional
- If the port is not given, Port 80 is assumed.
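The components named in this syntax can be extracted with Python's standard urllib.parse module. The sketch below is illustrative: http_url_parts is a made-up helper name, and it applies the port-80 default and treats an empty abs_path as "/".

```python
from urllib.parse import urlsplit


def http_url_parts(url: str):
    """Return (host, port, abs_path, query) for an http-scheme URL."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError("not an http-scheme URL")
    port = parts.port if parts.port is not None else 80  # default port
    abs_path = parts.path or "/"  # an empty abs_path is equivalent to "/"
    return parts.hostname, port, abs_path, parts.query


print(http_url_parts("http://student.cs.ucc.ie/cs1064/jabowen/"))
# ('student.cs.ucc.ie', 80, '/cs1064/jabowen/', '')
```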
67Meaning of a http-scheme URI
- The meaning of an http-scheme URL is that the identified resource is on the server at that port of the host, and the Request-URI for the resource is abs_path.
- Thus, for example, pointing a browser at
- http://student.cs.ucc.ie/cs1064/jabowen/
- is the same as opening a TCP/IP connection to Port 80 on student.cs.ucc.ie and sending either
- the HTTP/1.0 request
- GET /cs1064/jabowen/ HTTP/1.0
- or the HTTP/1.1 request
- GET /cs1064/jabowen/ HTTP/1.1
- Host: student.cs.ucc.ie
- (As we shall see later, all HTTP/1.1 requests must include a Host header field)
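The translation from a URL to the request sent to the origin server can be sketched in a few lines. request_for is an illustrative name, and the sketch covers only GET requests sent directly to the server named in the URL (no proxy in between).

```python
from urllib.parse import urlsplit


def request_for(url: str, version: str = "HTTP/1.1") -> str:
    """Return the request text a client would send to the origin server."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    lines = [f"GET {target} {version}"]
    if version == "HTTP/1.1":
        lines.append(f"Host: {parts.netloc}")  # mandatory in HTTP/1.1
    return "\r\n".join(lines) + "\r\n\r\n"


print(request_for("http://student.cs.ucc.ie/cs1064/jabowen/", "HTTP/1.0"))
print(request_for("http://student.cs.ucc.ie/cs1064/jabowen/"))
```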
68Meaning of a http-scheme URI (contd.)
- The lectures on HTML forms given earlier in this course used a method called POST to send user-supplied data to a server
- The POST method was not defined in HTTP/0.9, which only provided one method, the GET method
- The convention used in HTTP/0.9 to send data to a server was to encode the data in the Request-URI, in the form of a query at the end
- The convention is still supported in HTTP/1.1. Consider the following form
- <form action="http://student.cs.ucc.ie/myProg.cgi" method="get">
- Home town <input type="text" name="hometown">
- <button type="submit"> Send data </button>
- </form>
- If the user entered cork in the input box and submitted the form, the browser's request would include this request line
- GET http://student.cs.ucc.ie/myProg.cgi?hometown=cork HTTP/1.1
69Meaning of a http-scheme URI (contd.)
- The query in a URL can include several equations. Consider the following form
- <form action="http://student.cs.ucc.ie/myProg.cgi" method="get">
- Surname <input type="text" name="surname">
- Home town <input type="text" name="hometown">
- <button type="submit"> Send data </button>
- </form>
- If the user entered sullivan and cork in the input boxes and submitted the form, the browser's request would include this request line
- GET http://student.cs.ucc.ie/myProg.cgi?surname=sullivan&hometown=cork HTTP/1.1
- Equations in a query are separated by the & character
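The way a browser assembles such a query can be reproduced with Python's standard urllib.parse module. This is a small sketch using the field names and values from the form above.

```python
from urllib.parse import urlencode, parse_qs

fields = {"surname": "sullivan", "hometown": "cork"}
query = urlencode(fields)  # name=value pairs joined by "&"
print(query)  # surname=sullivan&hometown=cork

# The server side can recover the fields by reversing the process.
print(parse_qs(query))  # {'surname': ['sullivan'], 'hometown': ['cork']}
```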
70Meaning of a http-scheme URI (contd.)
- Some characters in user-supplied data have to be specially handled when a browser is writing the query in a Request-URI
- The following characters, called the reserved characters, have a special usage in URIs
- : / @ ? = & ;
- They have to be URL-encoded to be sent in a URI query
- Consider the following form
- <form action="http://abc.com/prog.cgi" method="get">
- Name of company <input type="text" name="company">
- Home town <input type="text" name="place">
- <button type="submit"> Send data </button>
- </form>
- If the user entered Black&Decker and Cork in the input boxes, the browser's request would include this request line
- GET http://abc.com/prog.cgi?company=Black%26Decker&place=Cork HTTP/1.1
- where the URL-encoded form of & is %26, the 26 being the hexadecimal ASCII code for &
71Meaning of a http-scheme URI (contd.)
- URL escape codes for the reserved characters
- colon (:) %3A
- slash (/) %2F
- at (@) %40
- question-mark (?) %3F
- equals (=) %3D
- ampersand (&) %26
- semi-colon (;) %3B
73Meaning of a http-scheme URI (contd.)
- The following characters, called the unsafe characters, should also be URL-encoded in URIs, using the hex codes specified
- space %20
- quotation mark (") %22
- less than (<) %3C
- greater than (>) %3E
- hash (#) %23
- percent (%) %25
- left brace ({) %7B
- right brace (}) %7D
- pipe (|) %7C
- backslash (\) %5C
- caret (^) %5E
- tilde (~) %7E
- left square bracket ([) %5B
- right square bracket (]) %5D
- grave accent (`) %60
- These characters are unsafe for different reasons
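URL encoding of individual values can be tried out with quote() and unquote() from Python's standard urllib.parse module. This is a small sketch; note that recent Python versions leave the tilde unencoded, following a later URI specification, so it is omitted from the examples.

```python
from urllib.parse import quote, unquote

encoded = quote("Black&Decker", safe="")
print(encoded)           # Black%26Decker
print(unquote(encoded))  # Black&Decker

# safe="" asks quote() to encode "/" as well, which it leaves alone by default.
print(quote(":/@?=&;", safe=""))  # %3A%2F%40%3F%3D%26%3B
print(quote(' "<>#%', safe=""))   # %20%22%3C%3E%23%25
```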
74Length of URI
- Since a browser which is sending user-supplied data to a server includes these data in the query part of a URL, URLs can get quite long
- The HTTP protocol does not place any a priori limit on the length of a URI
- Servers must be able to handle the URI of any resource they serve up
- Servers should be able to handle URIs of unbounded length if they serve up GET-based forms that could generate such URIs.
- A server should return 414 (Request-URI Too Long) status if a URI is longer than the server can handle.
75Host names in http-scheme URIs
- A fully-qualified host name of a host means
- either the fully-qualified domain name (i.e., a completely specified domain name ending in a top-level domain such as .com or .ie),
- or the numeric Internet Protocol (IP) address of the host.
- The fully-qualified domain name is preferred; use of numeric IP addresses in URIs is strongly discouraged and should be avoided whenever possible.
76Proxy handling of host names
- If a proxy receives a fully qualified domain name, the proxy must NOT change the host name.
- But, if a proxy receives a host name which is not a fully qualified domain name, it may add its domain to the host name it received.
77Example
- Suppose we use the host name cosmos in a URL sent to the proxy student.cs.ucc.ie
- Then the proxy can extend this to cosmos.cs.ucc.ie
- E.g.
- http://cosmos/jabowen/prog1.php
- becomes
- http://cosmos.cs.ucc.ie/jabowen/prog1.php
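The proxy's rule can be sketched as a small function. qualify_host is an illustrative name, and treating "contains a dot" as "already fully qualified" is a simplification made for this sketch, not something the protocol specifies.

```python
def qualify_host(host: str, proxy_domain: str) -> str:
    """Append the proxy's domain to a host name that has no domain part."""
    if "." in host:
        return host  # treated as fully qualified: must not be changed
    return f"{host}.{proxy_domain}"


print(qualify_host("cosmos", "cs.ucc.ie"))                 # cosmos.cs.ucc.ie
print(qualify_host("www.independent.co.uk", "cs.ucc.ie"))  # www.independent.co.uk
```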
78
- http://www.independent.co.uk
- http://www.independent.co.uk/printer.php?storyID=14356