Title: HTTP - Hypertext Transfer Protocol
1HTTP - Hypertext Transfer Protocol
- Arthur Yigal Eliaspur
- Date 28.1.2001
2HTTP Overview
- The Web's application-layer protocol
- in use by the WWW since 1990
- client/server paradigm
- on the Web:
- clients are browsers (Internet Explorer, Netscape, ...)
- servers are web servers (Apache, IIS, ...)
- Request/Response Protocol
- Web servers usually listen on TCP port 80
[Figure: client C sends a request to server S; S returns a response]
3HTTP Overview (cont.)
- Stateless protocol - the HTTP server keeps no
information about the client between requests.
4HTTP Versions
- HTTP 0.9
- Simple GET protocol for the Web
- limits on data transfer (1024 characters)
- HTTP 1.0
- Headers give information about the data transferred.
- Greater data type/quantity transfer in both directions
- HTTP 1.1
- Supports hierarchical proxy servers
- caching
- persistent connections
5HTTP 0.9 GET example
- telnet www.cs.huji.ac.il 80
- GET /dbsi/index.html <CRLF>
- output:
- <HTML><HEAD>.......</HEAD><BODY>...............</BODY></HTML>
- Connection closed by foreign host
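The telnet session above can be reproduced in a few lines of Python. This is a minimal sketch: the function names are made up for illustration, and many current servers no longer answer HTTP 0.9 style requests, so `fetch` may return an error page or nothing at all.

```python
import socket


def build_simple_request(path: str) -> bytes:
    """Return an HTTP 0.9 style request: 'GET', the path, then CRLF."""
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    return f"GET {path}\r\n".encode("ascii")


def fetch(host: str, path: str, port: int = 80) -> bytes:
    """Send the request and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_simple_request(path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # connection closed by the server
                break
            chunks.append(data)
    return b"".join(chunks)


# Only the request is built here; call fetch(...) to actually contact a server.
print(build_simple_request("/dbsi/index.html"))
```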
6HTTP 1.0
- developed between 1992 and 1996.
- Exchange more than simple text
- Headers allowed in both requests and responses
- Extends GET request to allow headers
- Adds the HEAD request, which gets information about a resource
- Adds the POST request, which sends information with the
request
7Request message format
8Response message format
9HTTP Request/Response example
10Response-codes
11Headers types
- General
- Date, Pragma ..
- Request
- Authorization, From, If-Modified-Since, Referer, User-Agent ..
- Response
- Location, Server, WWW-Authenticate ...
- Entity
- Allow, Content-Encoding, Content-Length,
Content-Type, Expires, Last-Modified,
extension-header...
12POST and HEAD messages
- POST
- sends information with the request in the Entity Body.
- Useful when the user fills out a form.
- HEAD
- returns only the request result without the data
itself (i.e. only the Status line and the Header lines)
- used for debugging HTTP servers and for checking
whether a page has been updated.
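A minimal sketch of how the two request messages might be assembled as raw HTTP/1.0 text. The helper name, the form fields and the `/cgi-bin/register` path are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlencode


def build_request(method, path, headers=None, body=b""):
    """Assemble a request line, header lines, a blank line, and the body."""
    lines = [f"{method} {path} HTTP/1.0"]
    headers = dict(headers or {})
    if body:
        # The entity body length is announced so the server knows where it ends.
        headers["Content-Length"] = str(len(body))
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


# POST: the form data travels in the entity body.
form = urlencode({"name": "Alice", "course": "dbsi"}).encode("ascii")
post = build_request(
    "POST",
    "/cgi-bin/register",  # hypothetical form handler
    {"Content-Type": "application/x-www-form-urlencoded"},
    form,
)

# HEAD: same shape as GET, but the server answers with status and headers only.
head = build_request("HEAD", "/dbsi/index.html")

print(post.decode("ascii"))
print(head.decode("ascii"))
```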
13Upgrade Header
- allows the client to specify what additional
communication protocols it supports
- The server may choose to switch protocols, but
this is not mandatory.
- Example
- Upgrade: HTTP/2.0, SHTTP/1.3, IRC/6.9, RTA/x11
14Caching
- Why?
- Reduces response time
- Request is satisfied from the cache closest to the
browser
- Takes less time to get the page and display it
- Reduces traffic
- Each page is fetched from the origin server only once
- Reduces bandwidth used by browser
- Saves money if client is paying by traffic
- Keeps bandwidth requirements down
15Caching (cont.)
- Risks?
- Might not be "semantically transparent"
- the response may differ from what would have
been returned by the origin server.
16Caching in HTTP/1.0
- simple caching mechanism
- Origin server may mark a response with an expiration
time, using the Expires header
- cache validity checking uses a conditional
request that carries If-Modified-Since, set from the
cached Last-Modified header.
- server responds with either
- 304 (Not Modified), or
- 200 (OK) with the new entry.
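The server side of this validity check can be sketched as follows. The function name and return shape are assumptions made for illustration; only the comparison between the resource's modification time and the If-Modified-Since value comes from the slide.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(last_modified, if_modified_since=None):
    """Return (status, headers) for a GET that may carry If-Modified-Since.

    last_modified must be a timezone-aware UTC datetime.
    """
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        # HTTP dates have one-second resolution, so drop the microseconds.
        if last_modified.replace(microsecond=0) <= since:
            return 304, headers  # Not Modified: the cached entry is still valid
    return 200, headers  # OK: the full, current entry is sent


modified = datetime(2001, 1, 28, 10, 0, 0, tzinfo=timezone.utc)
print(conditional_get(modified, "Sun, 28 Jan 2001 10:00:00 GMT")[0])
print(conditional_get(modified, "Sat, 27 Jan 2001 10:00:00 GMT")[0])
```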
17Caching in HTTP/1.0 (cont.)
- The Pragma: no-cache request header indicates that
a request should not be satisfied from a cache.
- PROBLEM - origin servers/clients can't give full
and explicit instructions to caches (will be
explained later)
18Caching in HTTP/1.1
- retains the basic HTTP/1.0 design
- new features
- more careful specifications of the existing
features.
- An entry starts as fresh.
- It becomes stale when it reaches its expiration time.
- A stale entry must be revalidated with the origin server.
19Caching in HTTP/1.1 (cont.)
- cache validator string: the entity tag.
- two responses for a resource with the same entity tag
must be identical.
- Can include a fine-grained timestamp, internal
database pointer . . .
- If-None-Match header with one or more entity
tags.
- Much stronger than If-Modified-Since.
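A minimal sketch of entity-tag validation. Deriving the tag from a hash of the body is an assumption made here for illustration (the slide mentions timestamps or database pointers as other options), and the function names are hypothetical.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Build a quoted entity tag; identical bodies always get the same tag."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def revalidate(current_body: bytes, if_none_match=None):
    """Return (status, etag) for a request that may carry If-None-Match."""
    etag = make_etag(current_body)
    if if_none_match:
        # The header may list several entity tags separated by commas.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if etag in candidates or "*" in candidates:
            return 304, etag  # one of the client's copies is still current
    return 200, etag


page_v1 = b"<HTML>version 1</HTML>"
tag_v1 = make_etag(page_v1)
print(revalidate(page_v1, tag_v1))
print(revalidate(b"<HTML>version 2</HTML>", tag_v1))
```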
20Caching in HTTP/1.1 (cont.)
- Cache-Control header
- server/client explicit directives to caches
- directive examples
- max-age - relative expiration time.
- the HTTP/1.0 Expires header, an absolute time, can
fail under clock skew.
- no-transform - prevents proxies from transforming
responses,
- such as reducing image complexity over a slow link
(WAP)
- private, no-store - prevent the storage of some
or all of a response.
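How a cache might read these directives and decide whether an entry is still fresh can be sketched as below. The function names are made up, the comma split is deliberately naive (it would mishandle quoted lists such as private="a,b"), and treating a missing max-age as "not fresh" is a simplifying assumption.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def is_fresh(cache_control: str, age_seconds: int) -> bool:
    """Decide freshness from max-age, a lifetime relative to the response."""
    directives = parse_cache_control(cache_control)
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    if max_age is None:
        return False  # assumption: without max-age, always revalidate
    return age_seconds < int(max_age)


print(parse_cache_control("max-age=3600, no-transform"))
print(is_fresh("max-age=3600", 120))
```

Because max-age is relative, the cache only needs its own clock to measure the entry's age, which avoids the clock skew problem of an absolute Expires time.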
21Caching in HTTP/1.1 (cont.)
- Vary header - lists the request headers that,
besides the URL, identify the response.
- For example Accept-Language, Accept-Charset
...
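One way a cache could use Vary is to extend its lookup key with the listed request headers. This is a sketch only; the function name and key layout are assumptions, not something the slides prescribe.

```python
def cache_key(url: str, request_headers: dict, vary: str) -> tuple:
    """Combine the URL with the values of the request headers named in Vary."""
    names = [name.strip().lower() for name in vary.split(",") if name.strip()]
    lowered = {name.lower(): value for name, value in request_headers.items()}
    # Header names are case-insensitive; sorting keeps the key order stable.
    return (url,) + tuple((name, lowered.get(name, "")) for name in sorted(names))


key_en = cache_key("/index.html", {"Accept-Language": "en"}, "Accept-Language")
key_he = cache_key("/index.html", {"Accept-Language": "he"}, "Accept-Language")
print(key_en)
print(key_en == key_he)
```

Two requests for the same URL with different Accept-Language values get different keys, so the cache does not hand one language variant to a client that asked for another.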
22Cooperative Caching
23Cooperative Caching (cont.)
- Higher level cache (e.g. national cache)
- larger user population
- higher hit rates.
- Multiple Web caches which cooperate => improve
overall performance.
- Cooperative caches are usually built from clusters
- divide the traffic overhead
- improve storage capacity
24Cooperative Caching (cont.)
- which of the caches should we ask for a particular
doc?
- Hash routing (of URLs) - an object won't be
present in more than one cache.
- HTTP/1.1 introduces the concept of hop-by-hop
headers
- message headers that apply only to a given
connection, and not to the entire path.
- This enables much more powerful use of proxies
(caches).
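The hash routing of URLs mentioned on this slide can be sketched as a function that maps each URL to exactly one cache in the cluster. The cache names are hypothetical, and SHA-256 with a simple modulo is an assumption; real deployments may use other hash schemes.

```python
import hashlib


def pick_cache(url: str, caches: list) -> str:
    """Map a URL to one cache, so each object lives in a single place."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(caches)
    return caches[index]


caches = ["cache-a", "cache-b", "cache-c"]  # hypothetical cluster members
for url in ("http://www.w3.org/Protocols/", "http://www.ietf.org/rfc/rfc2068.txt"):
    print(url, "->", pick_cache(url, caches))
```

Every client that applies the same function asks the same cache for a given URL, which is why the object is not duplicated across the cluster.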
25Cooperative Caching (cont.)
- HTTP 1.1 hop-by-hop headers
- Connection
- options that are desired for that particular
connection (e.g. Connection: close)
- Public
- lists the set of methods supported by the server
- Proxy-Authenticate
- enables authentication between two hops.
- Transfer-Encoding
- compression method between two hops.
- Upgrade
- additional communication protocols supported.
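A proxy has to drop hop-by-hop headers before passing a message to the next hop. The sketch below uses only the five headers listed on this slide, plus any header named inside Connection; the function name and the sample headers are illustrative assumptions.

```python
# Hop-by-hop header names as listed on the slide, lowercased for comparison.
HOP_BY_HOP = {"connection", "public", "proxy-authenticate",
              "transfer-encoding", "upgrade"}


def forwardable_headers(headers: dict) -> dict:
    """Return only the headers a proxy should pass on to the next hop."""
    drop = set(HOP_BY_HOP)
    for name, value in headers.items():
        if name.lower() == "connection":
            # Tokens listed in Connection apply to this connection only.
            drop.update(t.strip().lower() for t in value.split(",") if t.strip())
    return {n: v for n, v in headers.items() if n.lower() not in drop}


incoming = {
    "Host": "www.cs.huji.ac.il",
    "Connection": "close, X-Trace",
    "X-Trace": "1",  # hypothetical header, named in Connection above
    "Upgrade": "HTTP/2.0",
    "Accept-Language": "en",
}
print(forwardable_headers(incoming))
```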
26Persistent and Non-Persistent Connections
- Non-persistent connections
- Open a new TCP connection for each request.
- For example, a web page with 10 images needs 11
new TCP connections.
- Used in HTTP/1.0
- Persistent connections
- one TCP connection can serve more than one
request/response pair.
- Less connection-establishment overhead, smaller
slow-start delay.
- Used as default in HTTP/1.1
27Persistent and Non-Persistent Connections (cont.)
- Persistent connections, two types
- without pipelining
- the client issues a new request only when the
previous response has arrived.
- with pipelining
- the client sends a request as soon as it encounters
a reference.
- Multiple requests/responses can share the same TCP packet.
- Or travel in back-to-back packets.
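The difference between the connection styles can be made concrete with a rough count. This is a simplified model, not a measurement: it assumes one round trip per TCP handshake and one per request/response, fetches objects one after another, and ignores transmission time, slow start and parallel connections. The function name and mode strings are made up.

```python
def fetch_cost(images: int, mode: str) -> dict:
    """Estimate TCP connections and round trips for a page with inline images."""
    objects = 1 + images  # the base page plus its images
    if mode == "non-persistent":
        # Every object pays for a handshake and a request/response.
        return {"connections": objects, "round_trips": 2 * objects}
    if mode == "persistent":
        # One handshake, then one request/response per object, in sequence.
        return {"connections": 1, "round_trips": 1 + objects}
    if mode == "pipelined":
        # One handshake, one trip for the page, one for all images together.
        return {"connections": 1, "round_trips": 2 + (1 if images else 0)}
    raise ValueError(f"unknown mode: {mode}")


for mode in ("non-persistent", "persistent", "pipelined"):
    print(mode, fetch_cost(10, mode))
```

For the 10-image page from the previous slide, the model gives 11 connections without persistence and a single connection otherwise.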
28Compression
- most image formats (GIF, JPEG, MPEG) are
precompressed.
- many other data types used on the Web are not.
- compression could save almost 40% of the bytes
sent via HTTP
- this requires negotiating the use of codings.
29Compression (cont.)
- Client sends an Accept-Encoding header
- indicates which content-codings it can handle, and
which ones it prefers.
- Server sends
- Content-Encoding header - for end-to-end coding
indication.
- Transfer-Encoding header - for hop-by-hop
coding indication (supported only in HTTP/1.1).
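The negotiation can be sketched from the server's point of view. The function names are hypothetical, only gzip is offered as a supported coding, and the q-value parsing is simplified (it expects the form `coding;q=0.5` without extra spaces).

```python
import gzip


def parse_accept_encoding(value: str) -> dict:
    """Return {coding: preference} from an Accept-Encoding header value."""
    prefs = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        coding, _, params = part.partition(";")
        params = params.strip()
        quality = float(params[2:]) if params.startswith("q=") else 1.0
        prefs[coding.strip().lower()] = quality
    return prefs


def choose_coding(accept_encoding: str, supported=("gzip",)):
    """Pick the supported coding the client prefers most, or None."""
    prefs = parse_accept_encoding(accept_encoding)
    usable = [(prefs[c], c) for c in supported if prefs.get(c, 0) > 0]
    return max(usable)[1] if usable else None


def encode_body(body: bytes, accept_encoding: str):
    """Return (body, headers), compressing only if the client accepts gzip."""
    if choose_coding(accept_encoding) == "gzip":
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}


page = b"<HTML><BODY>" + b"hello web " * 200 + b"</BODY></HTML>"
encoded, headers = encode_body(page, "gzip;q=1.0, compress;q=0.5")
print(headers, len(page), "->", len(encoded))
```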
30W3C Performance Measurements
- "Microscape" Benchmark, 43 inline images
- Scenarios
- HTTP/1.0 using 4 simultaneous connections
- HTTP/1.1 using 1 persistent connection
- HTTP/1.1 with pipelining using 1 persistent connection
- HTTP/1.1 with pipelining and compression using 1
connection
31W3C Performance Measurements (cont.)
32Authentication
- Many sites require users to provide a username
and password in order to access the documents
housed on the server.
- Provides a mechanism for keeping track of users
(more than a security mechanism).
- How does it work?
- Client sends
- an ordinary request message
- server responds with
- 401 Authorization Required status code
- WWW-Authenticate header, which specifies how to
perform authentication
33Authentication (cont.)
- Client resends
- the request message, this time including an
Authorization header (e.g. user name and password).
- The client continues to add this header to each
following request to that server.
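The client's side of this exchange can be sketched as below, assuming the server asks for the Basic scheme (user name and password joined by a colon and base64-encoded). The slides do not name a scheme, so that choice, the function names and the realm are assumptions.

```python
import base64


def basic_authorization(user: str, password: str) -> str:
    """Build the value of an Authorization header for the Basic scheme."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def retry_headers(status: int, response_headers: dict, user: str, password: str):
    """Return the extra headers for the resent request, or None if not needed."""
    challenge = response_headers.get("WWW-Authenticate", "")
    if status == 401 and challenge.lower().startswith("basic"):
        return {"Authorization": basic_authorization(user, password)}
    return None


# Hypothetical challenge from the server, followed by the client's answer.
challenge = {"WWW-Authenticate": 'Basic realm="dbsi"'}
print(retry_headers(401, challenge, "Aladdin", "open sesame"))
```

Base64 is an encoding, not encryption, so anyone who can read the request can recover the password.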
34Cookies
- Another site mechanism for keeping track of
users.
- Example
- Client contacts a web site for the first time.
- Server responds with
- Set-Cookie: 1678453 header
- the client stores the cookie value and the server name
in a special cookie file.
- For each further request to that server the
client will add the
- Cookie: 1678453 header
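The client-side bookkeeping in this example can be sketched with a small class standing in for the cookie file. The class name and server names are hypothetical, and the bare value follows the slide; real cookies are name=value pairs with optional attributes.

```python
class CookieFile:
    """Remember one cookie value per server name, as in the example above."""

    def __init__(self):
        self._by_server = {}

    def store(self, server: str, response_headers: dict) -> None:
        """Record the Set-Cookie value sent by this server, if any."""
        value = response_headers.get("Set-Cookie")
        if value is not None:
            # Keep only the part before any ';' attributes.
            self._by_server[server] = value.split(";", 1)[0].strip()

    def headers_for(self, server: str) -> dict:
        """Return the Cookie header to add to a request for this server."""
        value = self._by_server.get(server)
        return {"Cookie": value} if value is not None else {}


jar = CookieFile()
jar.store("www.example.com", {"Set-Cookie": "1678453"})  # first response
print(jar.headers_for("www.example.com"))    # later requests to the same server
print(jar.headers_for("other.example.org"))  # a different server gets nothing
```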
35Cookies (cont.)
- Usage
- server requires authentication but doesn't want
to hassle a user with a user name and password.
- Remembering user preferences for advertising.
- Enables creating a virtual shopping cart.
- Problems
- users who access the same site from different
machines.
36References
- http://www.ietf.org/rfc/rfc2068.txt
- http://www.ietf.org/rfc/rfc1945.txt
- http://www.w3.org/Protocols/
- http://www8.org/w8-papers/5c-protocols/key/key.html
- Computer Networking by James F. Kurose and Keith
W. Ross.