Title: HTTP messages Entities and Encoding
1. HTTP Messages: Entities and Encoding
2. Outline
- The format and behavior of HTTP message entities as HTTP containers
- How HTTP describes the size of entity bodies, and what HTTP requires in the way of sizing
- The entity headers used to describe the format, alphabet, and language of content, so clients can process it properly
3. Outline (continued)
- Reversible content encodings transform the data format to take up less space or be more secure
- Transfer encoding modifies how HTTP ships data to enhance the communication of some kinds of data
- Chunked encoding chops data into multiple pieces to deliver content of unknown length safely
4. Outline (continued)
- The assortment of tags, labels, times, and checksums that help clients get the latest version of requested content
- Ranges are useful for continuing aborted downloads where they left off
- Delta encoding extensions allow clients to request just those parts of a web page that have actually changed since a previously viewed revision
5. Outline (continued)
- Checksums of entity bodies are used to detect changes in entity content as it passes through proxies
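6. Message is made up of header and body
    HTTP/1.0 200 OK
    Server: Netscape-Enterprise/3.6
    Date: Sun, 17 Sep 2000 00:01:05 GMT
    Content-type: text/plain
    Content-length: 18

    Hi! I'm a message!
Entity headers: the Content-type and Content-length lines
Entity body: the text after the blank line
Entity: the entity headers together with the entity body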
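7. HTTP/1.1 defines 10 entity headers
- Content-Type
- Content-Length
- Content-Language
- Content-Encoding
- Content-Location
- Content-Range
- Content-MD5
- Last-Modified
- Expires
- Allow
Two closely related headers are often listed with them, although RFC 2616 classifies them differently:
- ETag (a response header)
- Cache-Control (a general header)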
8. Entity Bodies
9. Why is Content-Length important?
- Detecting truncation
- Incorrect Content-Length problems
- When the connection is persistent, Content-Length tells the receiver where one entity body ends and the next message begins.
- Chunked encoding is an alternative: the data is sent in a series of chunks, each with a specified chunk size.
- When a content encoding is applied, Content-Length refers to the encoded body, not the length of the original, unencoded body.
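To make the persistent-connection point concrete, here is a minimal Python sketch that splits a byte stream holding two back-to-back responses. The function name `split_messages` and the sample responses are illustrative assumptions, and the sketch handles only messages that carry Content-Length (no chunked encoding).

```python
def split_messages(stream: bytes) -> list[tuple[bytes, bytes]]:
    """Split back-to-back HTTP messages into (header_block, body) pairs.

    Assumes every message carries a Content-Length header; chunked
    messages are not handled in this sketch.
    """
    messages = []
    pos = 0
    while pos < len(stream):
        # The header block ends at the first blank line (CRLF CRLF).
        end = stream.index(b"\r\n\r\n", pos)
        head = stream[pos:end]
        length = None
        for line in head.split(b"\r\n")[1:]:  # skip the start line
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        if length is None:
            raise ValueError("message has no Content-Length")
        body_start = end + 4
        body = stream[body_start:body_start + length]
        if len(body) < length:
            # Fewer bytes arrived than the header promised: truncation.
            raise ValueError("truncated body")
        messages.append((head, body))
        pos = body_start + length  # the next message starts right here
    return messages


stream = (
    b"HTTP/1.1 200 OK\r\nContent-Length: 18\r\n\r\nHi! I'm a message!"
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
)
bodies = [body for _, body in split_messages(stream)]
```

Without the length, the receiver could not tell that the second status line is a new message rather than more body data; the same length check is what exposes a truncated body.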
10. Entity Digests
- Content-MD5
- Is used to check message integrity
- Can also be used as a key into a hash table to quickly locate documents and reduce duplicate storage of content
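As a hedged illustration of how a Content-MD5 value is produced and checked, the sketch below follows RFC 1864 (base64 of the 128-bit MD5 digest of the body). The helper names `content_md5` and `body_is_intact` are made up for this example.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return a Content-MD5 value: base64 of the 128-bit MD5 digest."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def body_is_intact(body: bytes, header_value: str) -> bool:
    """Recompute the digest on receipt and compare it with the header."""
    return content_md5(body) == header_value.strip()


body = b"Hi! I'm a message!"
digest = content_md5(body)          # value the sender would put in Content-MD5
ok = body_is_intact(body, digest)   # check the receiver would perform
```

The digest detects accidental modification in transit; MD5 is not collision resistant, so it should not be treated as protection against deliberate tampering.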
11. Media Type and Charset
- Content-Type describes the type of the original entity body, before any content encoding.
- It supports optional parameters that further specify the content type.
- Character encodings for text media, for example:
- Content-Type: text/html; charset=iso-8859-4
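A small sketch of how a receiver might separate the media type from its charset parameter, using Python's standard `email.message.Message` class (HTTP borrows this header syntax from MIME). The helper name `parse_content_type` is an assumption for illustration.

```python
from email.message import Message
from typing import Optional


def parse_content_type(value: str) -> tuple[str, Optional[str]]:
    """Return (media_type, charset) from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = value
    # Both accessors normalize their result to lowercase.
    return msg.get_content_type(), msg.get_content_charset()


media_type, charset = parse_content_type("text/html; charset=iso-8859-4")
# The charset tells the client how to turn body bytes into characters.
text = b"<p>hi</p>".decode(charset)
```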
12. Common media types
13. Multipart Media Types
- MIME multipart email messages contain multiple messages stuck together and sent as a single, complex message.
- Each component is self-contained, with its own headers describing its contents; the different components are concatenated together and delimited by a boundary string.
- HTTP also supports multipart bodies; however, they are used in only two cases: fill-in form submissions and range responses carrying pieces of a document.
14. Multipart Form Submissions
    <form action="http://xxx/cgi"
          enctype="multipart/form-data"
          method="POST">
    <p>
    Your Name? <input type="text" name="submit-name"><br>
    Your File to send? <input type="file" name="files"><br>
    <input type="submit" value="send"> <input type="reset">
    </form>
15. If the user enters "John" and selects the text file hello.txt
    Content-Type: multipart/form-data; boundary=AaBo3x

    --AaBo3x
    Content-Disposition: form-data; name="submit-name"

    John
    --AaBo3x
    Content-Disposition: form-data; name="files"; filename="hello.txt"
    Content-Type: text/plain

    ... contents of hello.txt ...
    --AaBo3x--
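The layout of such a submission can be sketched in a few lines of Python. The function `build_form_data` and the sample file contents are assumptions for illustration; a real client would also choose a boundary that is guaranteed not to occur in the data.

```python
def build_form_data(name: str, filename: str, file_bytes: bytes,
                    boundary: str = "AaBo3x") -> tuple[str, bytes]:
    """Return (Content-Type header value, body) for a two-part form."""
    delimiter = b"--" + boundary.encode("ascii")
    lines = [
        delimiter,
        b'Content-Disposition: form-data; name="submit-name"',
        b"",                       # blank line ends the part headers
        name.encode("utf-8"),
        delimiter,
        b'Content-Disposition: form-data; name="files"; filename="'
        + filename.encode("utf-8") + b'"',
        b"Content-Type: text/plain",
        b"",
        file_bytes,
        delimiter + b"--",         # closing delimiter ends the whole body
        b"",
    ]
    return f"multipart/form-data; boundary={boundary}", b"\r\n".join(lines)


content_type, body = build_form_data("John", "hello.txt", b"Hello, world!")
```

Each part carries its own headers, and the receiver recovers the parts by splitting on the delimiter lines.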
16. If the user selects the text file hello.txt and a second file, the image image.gif
    Content-Type: multipart/form-data; boundary=AaBo3x

    --AaBo3x
    Content-Disposition: form-data; name="submit-name"

    John
    --AaBo3x
    Content-Disposition: form-data; name="files"
    Content-Type: multipart/mixed; boundary=BbC04y

    --BbC04y
    Content-Disposition: file; filename="hello.txt"
    Content-Type: text/plain

    ... contents of hello.txt ...
    --BbC04y
    Content-Disposition: file; filename="image.gif"
    Content-Type: image/gif
    Content-Transfer-Encoding: binary

    ... contents of image.gif ...
    --BbC04y--
    --AaBo3x--
17. Multipart Range Response
    HTTP/1.0 206 Partial Content
    Server: Microsoft-IIS/5.0
    Content-Location: http://xxx/hello.txt
    Content-Type: multipart/x-byteranges; boundary=abcdefghikz

    --abcdefghikz
    Content-Type: text/plain
    Content-Range: bytes 0-174/1441

    ... Part I content ...
    --abcdefghikz
    Content-Type: text/plain
    Content-Range: bytes 1344-1440/1441

    ... Part II content ...
    --abcdefghikz--
18. Content-Encoding
- HTTP applications sometimes want to encode content before sending it, to help lessen the time it takes to transmit the data.
- Content-Type is the type of the original format, before encoding.
- Content-Length is the length of the encoded body.
19. Content Encoding
[Figure: the original content (Content-Type: text/html, Content-Length: 17571) passes through a gzip content encoder, producing the content-encoded content (Content-Type: text/html, Content-Length: 5746, Content-Encoding: gzip); a gzip content decoder at the receiver restores the original content.]
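The encoder/decoder pair in the figure can be approximated with Python's standard `gzip` module. This is only a sketch: the function names `content_encode` and `content_decode` and the sample HTML are assumptions, and the sizes it produces will differ from the 17571 and 5746 bytes shown in the slide.

```python
import gzip


def content_encode(body: bytes) -> tuple[dict[str, str], bytes]:
    """Gzip the body and describe the result with entity headers."""
    encoded = gzip.compress(body)
    headers = {
        "Content-Type": "text/html",            # type of the ORIGINAL body
        "Content-Encoding": "gzip",
        "Content-Length": str(len(encoded)),    # length of the ENCODED body
    }
    return headers, encoded


def content_decode(headers: dict[str, str], encoded: bytes) -> bytes:
    """Undo the content encoding named in the headers, if any."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(encoded)
    return encoded


original = b"<html><body>" + b"<p>Hello, HTTP!</p>" * 500 + b"</body></html>"
headers, encoded = content_encode(original)
restored = content_decode(headers, encoded)
```

Note that Content-Type still says text/html after encoding, while Content-Length counts the compressed bytes.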
20. Content-encoding tokens
21. Accept-Encoding Headers
Request message (client to server):
    GET /logo.gif HTTP/1.1
    Accept-encoding: gzip
Response message (server to client; the server gzips the image and the client gunzips it):
    HTTP/1.1 200 OK
    Content-type: image/gif
    Content-encoding: gzip
The server compresses the image with gzip to transport a smaller file over the thin network connection between itself and the client. This saves network bandwidth and reduces the amount of time that the client waits for the transfer, though the client will have to spend time decompressing the image once it is served.
22. Client can indicate preferred encodings by attaching Q values
- Accept-Encoding: compress, gzip
- Accept-Encoding:
- Accept-Encoding: *
- Accept-Encoding: compress;q=0.5, gzip;q=1.0
- Accept-Encoding: gzip;q=1.0, identity;q=0.5, *;q=0
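One way a server might interpret these headers is sketched below. The function names and the tie-breaking rule (earlier entries in the server's `supported` list win) are assumptions of this example, not something the slides specify; an unlisted coding falls back to the `*` entry, and `identity` is treated as acceptable unless explicitly excluded.

```python
from typing import Optional


def parse_accept_encoding(value: str) -> dict[str, float]:
    """Map each listed content-coding to its q value (default 1.0)."""
    prefs = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        coding, _, params = item.partition(";")
        q = 1.0
        for param in params.split(";"):
            key, _, val = param.partition("=")
            if key.strip().lower() == "q":
                q = float(val.strip())
        prefs[coding.strip().lower()] = q
    return prefs


def choose_encoding(value: str, supported: list[str]) -> Optional[str]:
    """Pick the supported coding with the highest q, or None if all are q=0."""
    prefs = parse_accept_encoding(value)
    best, best_q = None, 0.0
    for coding in supported:
        # Unlisted codings fall back to "*"; identity is allowed by default.
        default = 1.0 if coding == "identity" else 0.0
        q = prefs.get(coding, prefs.get("*", default))
        if q > best_q:
            best, best_q = coding, q
    return best


choice = choose_encoding("compress;q=0.5, gzip;q=1.0", ["compress", "gzip"])
```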
23. Transfer Encoding
- Content encodings transform the entity content, to save space or for security reasons, and are tightly associated with the content format.
- In comparison, transfer encodings are applied for architectural reasons and are independent of the content format.
24. Content encoding vs. transfer encoding
[Figure: a content-encoded response consists of a normal header block followed by a normal entity that is just encoded; a transfer-encoded response consists of a basic header followed by encoded blocks.]
A content-encoded message encodes only the entity section of the message. With transfer-encoded messages, the encoding is a function of the entire message, changing the structure of the message itself.
25. Transfer-Encoding Headers
- TE: used in the request header to tell the server which extension transfer encodings are okay to use.
- Transfer-Encoding: used in the response header to tell the receiver (client) which encoding has been performed.
26. Example
Request:
    GET /1.html HTTP/1.1
    Host: www.csie.ncnu.edu.tw
    User-Agent: Mozilla/4.61
    TE: trailers, chunked
Response:
    HTTP/1.1 200 OK
    Transfer-Encoding: chunked
    Server: Apache/3.0
27. Chunked Encoding
28. Chunked Encoding (continued)
- Chunking and persistent connections
- Trailers in chunked messages
- Combining content and transfer encodings
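A minimal sketch of the chunked wire format: each chunk is its size in hexadecimal, CRLF, the data, CRLF, and a zero-size chunk ends the body. The function names and the tiny chunk size are illustrative assumptions; the decoder skips chunk extensions and ignores any trailers.

```python
def encode_chunked(body: bytes, chunk_size: int = 8) -> bytes:
    """Encode a body as a series of chunks followed by the last-chunk."""
    out = bytearray()
    for i in range(0, len(body), chunk_size):
        chunk = body[i:i + chunk_size]
        # Chunk size is written in hexadecimal, then CRLF, data, CRLF.
        out += f"{len(chunk):x}".encode("ascii") + b"\r\n" + chunk + b"\r\n"
    out += b"0\r\n\r\n"  # zero-size chunk, no trailers, final CRLF
    return bytes(out)


def decode_chunked(data: bytes) -> bytes:
    """Reassemble the body from chunked data (trailers are ignored)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size_field = data[pos:line_end].split(b";")[0]  # drop extensions
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break  # last chunk reached
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)


wire = encode_chunked(b"Hi! I'm a message!")
decoded = decode_chunked(wire)
```

Because every chunk announces its own size, the sender never needs to know the total length in advance, and the receiver still knows exactly where the body ends on a persistent connection.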
29. Combining Content and Transfer Encodings
[Figure: an entity with Content-Type: text/html is first content-encoded (Content-Type: text/html, Content-Encoding: gzip); the encoded body is then transfer-encoded into chunks (Content-Type: text/html, Content-Encoding: gzip, Transfer-Encoding: chunked).]
30. Time-Varying Instances
- Web objects usually are not static.
- The same URL can, over time, point to different versions of an object.
- For example, the home page of a media company such as CNN or the BBC.
31. Time-Varying Instances
32. Validators and Freshness
- In the previous CNN example, the client got the initial resource V1 and can cache this copy, but for how long?
- Once the document has expired at the client, the client must request a fresh copy from the server.
- The client can use a conditional request to tell the server which version it currently has, identified by a validator, and ask for a copy to be sent only if its current copy is no longer valid.
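The server side of such a conditional request can be sketched as follows, using an entity tag as the validator. The function `respond` and the tag values are assumptions for illustration, and the comparison is a plain string match (weak validators and quoted commas are not handled).

```python
def respond(request_headers: dict[str, str], current_etag: str, body: bytes):
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    client_tags = [
        tag.strip()
        for tag in request_headers.get("If-None-Match", "").split(",")
        if tag.strip()
    ]
    if "*" in client_tags or current_etag in client_tags:
        # The client's cached copy is still valid: send no body.
        return 304, {"ETag": current_etag}, b""
    # The copy is stale (or the request was unconditional): send the entity.
    headers = {"ETag": current_etag, "Content-Length": str(len(body))}
    return 200, headers, body


# The client cached version "v1"; the server has since published "v2".
status, headers, payload = respond({"If-None-Match": '"v1"'}, '"v2"', b"new page")
```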
33. Cache-Control header directives
34. Cache-Control header directives
35. Conditional request types
36. Range Requests
- HTTP allows clients to request just part, or a range, of a document.
- Applications:
  - Requesting a region of interest (RoI)
  - Media indexing and access
  - Streaming applications
37. Range Requests
Request message (client to www.csie.ncnu.edu.tw):
    GET /bigfile.html HTTP/1.1
Response message (the transfer is interrupted partway through the body):
    HTTP/1.1 200 OK
    Content-Type: text/html
    Content-Length: 65537
    Accept-Ranges: bytes
Range request message:
    GET /bigfile.html HTTP/1.1
    Range: bytes=20224-
Range response message:
    HTTP/1.1 206 Partial Content
    Content-Type: text/html
    Content-Range: bytes 20224-65536/65537
    Accept-Ranges: bytes
The client's original request was interrupted, but a second request for the part of the message that was not received allows the client to resume from the point of the interruption.
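A hedged sketch of how a server could answer the resume request: it handles only a single `bytes=first-` or `bytes=first-last` range, and falls back to sending the whole resource for any other form. The function name `serve_range` and the all-zero sample resource are assumptions for illustration.

```python
def serve_range(range_header: str, resource: bytes):
    """Return (status, headers, body) for one 'bytes=first-[last]' range."""
    total = len(resource)
    unit, _, spec = range_header.partition("=")
    first_text, _, last_text = spec.partition("-")
    first_text, last_text = first_text.strip(), last_text.strip()
    valid = (
        unit.strip() == "bytes"
        and first_text.isdecimal()
        and (not last_text or last_text.isdecimal())
    )
    if valid:
        first = int(first_text)
        last = int(last_text) if last_text else total - 1
        valid = not last_text or first <= last
    if not valid:
        # Unsupported or malformed range: ignore it and send everything.
        return 200, {"Content-Length": str(total)}, resource
    if first >= total:
        # The range starts past the end of the resource.
        return 416, {"Content-Range": f"bytes */{total}"}, b""
    last = min(last, total - 1)      # byte positions are zero-based, inclusive
    body = resource[first:last + 1]
    headers = {
        "Content-Range": f"bytes {first}-{last}/{total}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body


resource = bytes(65537)  # stand-in for /bigfile.html
status, headers, body = serve_range("bytes=20224-", resource)
```

Byte positions are inclusive and start at zero, so the last byte of a 65537-byte resource is byte 65536.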
38. Delta Encoding
- An extension to the HTTP protocol that optimizes transfers by communicating changes instead of entire objects.
- RFC 3229 describes delta encoding.
39. Delta Encoding
40. Delta Encoding
41. Delta-encoding headers
- ETag
- If-None-Match
- A-IM
- IM
- Delta-Base
42. IANA-registered types of instance manipulations
43. For More Information
- http://www.ietf.org/rfc/rfc2616.txt
- Hypertext Transfer Protocol -- HTTP/1.1
- http://www.ietf.org/rfc/rfc3229.txt
- Delta encoding in HTTP
- http://www.ietf.org/rfc/rfc1521.txt
- MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies
- http://www.ietf.org/rfc/rfc2045.txt
- Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies
- http://www.ietf.org/rfc/rfc1864.txt
- The Content-MD5 Header Field
- http://www.ietf.org/rfc/rfc3230.txt
- Instance Digests in HTTP