Title: Content Negotiation and Transcoding
Outline
- A single URL may need to correspond to different resources, for example to support users who request different languages.
- HTTP provides content-negotiation methods that allow clients and servers to make such determinations. The different resources behind a single URL (e.g., the French and English versions of a page) are called variants.
- Servers can also make other types of decisions about what content is best to send to a client for a particular URL.
- Servers can even generate customized pages automatically, for instance by converting an HTML page into a WML page for your handheld device.
- These kinds of dynamic content transformations are called transcodings.
Content-Negotiation Techniques
- There are three distinct methods for deciding which page at a server is the right one for a client:
  - Present the choice to the client.
  - Decide automatically at the server.
  - Ask an intermediary to select.
Summary of content-negotiation techniques
Client-Driven Negotiation
- The client makes a request.
- The server sends a list of choices to the client.
- The client chooses.
- Disadvantage: two requests are needed, one to get the list and a second to get the selected copy. This increases latency, and the decision is made manually at the client side, in the browser, which is tedious.
For servers, there are two ways to present the choices (the choice itself is made manually)
- Send back an HTML page with links to the different versions of the page and a description of each.
- Send back an HTTP/1.1 response with the 300 Multiple Choices response code.
- The client browser may receive this response and display a page with the links, as in the first method, or it may pop up a dialog asking the user to make a selection.
- Another problem: multiple URLs are required, one for the main page and one for each specific page.
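The second method can be sketched as follows. This is a minimal illustration, not a complete server: the variant URLs and descriptions are hypothetical, and the function only builds the raw response text.

```python
# Hypothetical variants for one page; the URLs and labels are illustrative only.
VARIANTS = [
    ("/index.en.html", "English version"),
    ("/index.fr.html", "French version"),
]


def multiple_choices_response(variants):
    """Build a raw HTTP/1.1 300 response whose HTML body lists the variants."""
    links = "\n".join(
        f'<li><a href="{url}">{label}</a></li>' for url, label in variants
    )
    body = f"<html><body><ul>\n{links}\n</ul></body></html>"
    head = (
        "HTTP/1.1 300 Multiple Choices\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(body.encode('utf-8'))}\r\n"
        "\r\n"
    )
    return head + body


response = multiple_choices_response(VARIANTS)
print(response.split("\r\n")[0])
```

The client still has to issue a second request for whichever link the user picks, which is the latency cost noted above.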
Server-Driven Negotiation
- The client-driven approach has several drawbacks, as discussed previously. The most significant one is the increased communication between client and server needed to decide on the best page.
- Why not let the server decide which page to send back?
- For this to work, the client must send enough information about its preferences.
Two mechanisms to evaluate the proper response
- Examining the set of content-negotiation headers.
  - The server looks at the client's Accept headers and tries to match them with the corresponding response headers.
- Varying on other (non-content-negotiation) headers.
  - For example, the server could send responses based on the client's User-Agent header.
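A rough sketch of the first mechanism: each Accept-style request header is compared against the corresponding header of a candidate document. The pairing table and the sample variants are assumptions made for illustration, and the sketch deliberately ignores q values and partial wildcards such as text/*.

```python
# Request header -> document (response) header it is compared against.
# The pairing is an assumption for this sketch, not a complete list.
MATCHING_HEADERS = {
    "accept": "content-type",
    "accept-language": "content-language",
    "accept-encoding": "content-encoding",
}


def acceptable(request_headers, variant_headers):
    """True if no negotiation header sent by the client rules the variant out."""
    for request_name, document_name in MATCHING_HEADERS.items():
        wanted = request_headers.get(request_name)
        actual = variant_headers.get(document_name)
        if wanted is None or actual is None:
            continue  # nothing to compare for this pair
        # Drop parameters such as ";q=0.8" and keep only the bare values.
        offered = [item.split(";")[0].strip().lower() for item in wanted.split(",")]
        if actual.lower() not in offered and "*" not in offered and "*/*" not in offered:
            return False
    return True


request = {"accept": "text/html, text/plain;q=0.8", "accept-language": "fr"}
french = {"content-type": "text/html", "content-language": "fr"}
english = {"content-type": "text/html", "content-language": "en"}

print(acceptable(request, french))
print(acceptable(request, english))
```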
Content-Negotiation Headers
Accept and matching document headers
- We discussed entity headers in Chapter 15; they are like a shipping label that describes the attributes of the message body.
- Content-negotiation headers, on the other hand, are used by clients and servers to exchange preference information and to choose between different versions of a document, so that the version that best matches the client's preferences, or the closest match as ranked by q values, is served.
Content-Negotiation Header Quality Values
- For example, a client sends an Accept-Language header such as:
  Accept-Language: en;q=0.5, fr;q=0.0, nl;q=1.0, tr;q=0.0
- The q value ranges from 0.0 to 1.0 (the highest preference).
- In this case, the client prefers to receive a Dutch (nl) version, but an English (en) version will do. Under no circumstances does the client want a French (fr) or Turkish (tr) version.
- The order in which the preferences are listed is not important.
- Occasionally, the server may not have any documents that match any of the client's preferences.
- In this case, the server may change, or transcode, a document to match the client's preferences (discussed later).
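The Accept-Language example above (en at 0.5, nl at 1.0, fr and tr at 0.0) can be evaluated with a short sketch like the one below. The function names are hypothetical, and the sketch makes simplifying assumptions: it ignores the * wildcard, does not match language ranges such as en-US against en, and treats a malformed q value as 0.0.

```python
def parse_accept_language(header):
    """Return {language: q} from an Accept-Language header value."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        lang = parts[0].lower()
        if not lang:
            continue
        q = 1.0  # a missing q parameter means the default, 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: a malformed q means "not acceptable"
        prefs[lang] = q
    return prefs


def best_language(header, available):
    """Pick the available language with the highest non-zero q, or None."""
    prefs = parse_accept_language(header)
    candidates = [(prefs.get(lang, 0.0), lang) for lang in available]
    q, lang = max(candidates, default=(0.0, None))
    return lang if q > 0.0 else None


header = "en;q=0.5, fr;q=0.0, nl;q=1.0, tr;q=0.0"
print(best_language(header, ["en", "fr"]))        # en
print(best_language(header, ["nl", "en", "fr"]))  # nl
print(best_language(header, ["fr", "tr"]))        # None
```

The last call returns None because the server only has versions the client rated 0.0; that is the situation in which a server might fall back to transcoding.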
Varying on Other Headers
- Servers can also attempt to match responses with other client request headers, such as User-Agent.
- A server may know that old versions of a browser, or certain browser types, do not support JavaScript, for example, and may therefore send back a version of the page without JavaScript.
- In this case, there is no q value to use in looking for an approximate best match. The server either looks for an exact match or simply serves whatever it has.
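As a small illustration of this exact-match-or-default behavior, the sketch below maps a User-Agent string to a page variant. The browser name and file names are hypothetical.

```python
# Hypothetical table: exact User-Agent string -> variant to serve.
VARIANTS_BY_AGENT = {
    "OldBrowser/1.0": "index.nojs.html",  # assumed not to support JavaScript
}
DEFAULT_VARIANT = "index.html"


def select_by_user_agent(user_agent):
    """Serve the exact match if there is one; otherwise serve whatever we have."""
    return VARIANTS_BY_AGENT.get(user_agent, DEFAULT_VARIANT)


print(select_by_user_agent("OldBrowser/1.0"))
print(select_by_user_agent("SomeOtherBrowser/5.0"))
```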
- Because caches must attempt to serve the correct "best" version of a cached document, HTTP defines a Vary header that the server sends in responses.
- The Vary header tells caches (and clients, and any downstream proxies) which headers the server is using to determine the best version of the response to send (discussed later).
Content Negotiation on Apache
- Suppose a web site content provider, Joe, wants to provide different versions of Joe's index page. Joe must put all his index page files in the appropriate directory of the Apache server. There are two ways to enable content negotiation:
  - In the web site directory, create a type-map file for each URI in the web site that has variants.
  - Enable the MultiViews directive, which causes Apache to create type-map files for the directory automatically.
Using a type-map file
- The server configuration must set a handler for the type-map file suffix:
    AddHandler type-map .var
- Here is a sample type-map file:
    URI: joes-hardware.html

    URI: joes-hardware.en.html
    Content-type: text/html
    Content-language: en

    URI: joes-hardware.fr.de.html
    Content-type: text/html;charset=iso-8859-2
    Content-language: fr, de
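To make the structure of a type-map concrete, here is a small sketch that reads one into records and looks up variants by language. It is not how Apache itself parses the file. The sample restates the type-map above with the header colons written out, and the third file name, joes-hardware.fr.de.html, is an assumption.

```python
SAMPLE = """\
URI: joes-hardware.html

URI: joes-hardware.en.html
Content-type: text/html
Content-language: en

URI: joes-hardware.fr.de.html
Content-type: text/html;charset=iso-8859-2
Content-language: fr, de
"""


def parse_type_map(text):
    """Split a type-map into records: one dict of lower-cased headers per block."""
    records = []
    for block in text.strip().split("\n\n"):
        record = {}
        for line in block.splitlines():
            name, _, value = line.partition(":")
            record[name.strip().lower()] = value.strip()
        if record:
            records.append(record)
    return records


def variants_for_language(records, lang):
    """Return the URIs of the variants whose Content-language lists `lang`."""
    uris = []
    for record in records:
        langs = [l.strip() for l in record.get("content-language", "").split(",")]
        if lang in langs:
            uris.append(record["uri"])
    return uris


records = parse_type_map(SAMPLE)
print(variants_for_language(records, "fr"))  # ['joes-hardware.fr.de.html']
```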
Using MultiViews
- Use the Options directive to enable MultiViews for the directory, in the appropriate section of the configuration (<Directory>, <Location>, or <Files>).
- The server looks for all files with joes-hardware in the name and creates a type-map file for them.
- Based on the names, the server guesses the appropriate content-negotiation headers to which the files correspond.
- Two other ways to implement content negotiation at the server:
  - A server-side extension, such as Microsoft's Active Server Pages (ASP).
  - Any CGI program, i.e., doing it yourself.
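The name-based guessing described for MultiViews can be approximated as below. This is only a sketch of the idea, not Apache's algorithm: the set of language codes and the helper name are assumptions, and the media type comes from Python's standard mimetypes module.

```python
import mimetypes

# Assumption: a small set of language codes that may appear as file extensions.
KNOWN_LANGUAGES = {"en", "fr", "de", "es", "nl"}


def guess_variant_headers(filename, base):
    """Guess Content-type and Content-language from the extensions after `base`."""
    if not filename.startswith(base + "."):
        return None  # not a variant of this resource
    extensions = filename[len(base) + 1:].split(".")
    languages = [ext for ext in extensions if ext in KNOWN_LANGUAGES]
    content_type, _ = mimetypes.guess_type(filename)
    return {"content-type": content_type, "content-language": languages}


print(guess_variant_headers("joes-hardware.en.html", "joes-hardware"))
print(guess_variant_headers("joes-hardware.fr.de.html", "joes-hardware"))
```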
Transparent Negotiation
- Transparent negotiation seeks to move the load of server-driven negotiation away from the server, while minimizing message exchanges with the client, by having an intermediary proxy negotiate on the client's behalf.
- The proxy is assumed to have knowledge of the client's expectations and to be capable of performing the negotiations on its behalf.
Caching and Alternates
Caches use content-negotiation headers to send back correct responses to clients
[Figure: a French-speaking user, a cache, and a web server. The server holds three variants of the page: "Hi! Welcome to Joe's Hardware Store.", "Hola! Bienvenido a Joe's Hardware Store.", and "Bonjour! Bienvenue a Joe's Hardware Store." The user sends the request below and the French variant ("Bonjour") is returned through the cache.]
GET / HTTP/1.1
Host: www.joes-hardware.com
User-Agent: spiffy multimedia browser
Accept-Language: fr;q=1.0
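Caches use content-negotiation headers to send back correct responses to clients (continued)
[Figure: a Spanish-speaking user sends the request below through the same cache to the web server. The cache, which already holds the French copy ("Bonjour"), now also holds the Spanish copy ("Bienvenido"). The user receives "Hola! Bienvenido a Joe's Hardware Store."]
GET / HTTP/1.1
Host: www.joes-hardware.com
User-Agent: spiffy multimedia browser
Accept-Language: es;q=1.0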
The Vary Header
- The huge number of different User-Agent and Cookie values could generate many variants:
  Vary: User-Agent, Cookie
Caches match request headers
[Figure: French-speaking user 1 sends the request below through the cache to the web server.]
GET / HTTP/1.1
Host: www.joes-hardware.com
User-Agent: spiffy multimedia browser
Accept-Language: fr;q=1.0
Web server: "I need to send her the French document. Since she has such a cool browser, I'll send her a media-rich version of the page."
HTTP/1.1 200 OK
Content-Language: fr
Vary: User-Agent
[body: "Bonjour", media-rich content]
Caches match request headers (continued)
[Figure: French-speaking user 2 sends the request below through the same cache.]
GET / HTTP/1.1
Host: www.joes-hardware.com
User-Agent: wimpy wireless device
Accept-Language: fr;q=1.0
Cache: "He wants a French copy of the document and I have it in my cache, but I'd better not send it to him. The server said my cached copy was for a spiffy browser. This guy has a wimpy wireless one. I had better ask the server for a French version for the wireless browser."
The web server responds:
HTTP/1.1 200 OK
Content-Language: fr
Vary: User-Agent
[body: "Bonjour", simple text content]
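The cache's reasoning in the two exchanges above can be sketched as a toy cache that records, for each stored response, the request-header values named in its Vary header. The class and method names are hypothetical, header names are assumed to be lower-cased dictionary keys, and the special value Vary: * is not handled.

```python
class VaryCache:
    """Toy cache keyed by URL that honors the Vary response header."""

    def __init__(self):
        # url -> list of (vary_header_names, stored_request_values, body)
        self._entries = {}

    def store(self, url, request_headers, response_headers, body):
        vary = response_headers.get("vary", "")
        names = [n.strip().lower() for n in vary.split(",") if n.strip()]
        values = {n: request_headers.get(n) for n in names}
        self._entries.setdefault(url, []).append((names, values, body))

    def lookup(self, url, request_headers):
        for names, values, body in self._entries.get(url, []):
            # A stored copy is usable only if every header named in Vary
            # has the same value in the new request as in the original one.
            if all(request_headers.get(n) == values[n] for n in names):
                return body
        return None  # miss: the cache must ask the server


cache = VaryCache()
spiffy = {"user-agent": "spiffy multimedia browser", "accept-language": "fr;q=1.0"}
wimpy = {"user-agent": "wimpy wireless device", "accept-language": "fr;q=1.0"}

cache.store("/", spiffy, {"content-language": "fr", "vary": "User-agent"},
            "Bonjour (media-rich content)")
print(cache.lookup("/", spiffy))  # Bonjour (media-rich content)
print(cache.lookup("/", wimpy))   # None
```

The second lookup misses even though a French copy is cached, because the stored copy was selected on a different User-Agent value.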
Transcoding
- We have discussed the mechanism by which clients and servers can choose between a set of documents for a URL and send the one that best matches the client's needs.
- What happens, however, when a server does not have a document that matches the client's needs at all?
- One option is to respond to the client with an error.
- Another solution is transcoding: transforming the unsatisfactory document into something that the client can use.
Three categories of Transcoding
- Format conversion
  - Compatibility problems
  - Bandwidth issues
- Information synthesis
  - Information summaries
  - Advertisement removal
- Content injection (increasing the amount of content)
  - Automatic ad generators
  - User-tracking systems, which collect statistics about how the page is viewed and how clients surf the Web
Hypothetical transcoding
Transcoding Versus Static Pregeneration
- An alternative to transcoding is to build different copies of web pages at the web server.
- For example, one in HTML, one in WML, one with high-resolution images, and one with low-resolution images.
- Is this practical?
- It raises storage costs and creates a management problem.
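The storage concern is easy to see with a little arithmetic: the number of pregenerated copies is the product of the options along each dimension. The dimensions and values below are hypothetical.

```python
from itertools import product

# Hypothetical dimensions along which one page might be pregenerated.
formats = ["html", "wml"]
resolutions = ["high", "low"]
languages = ["en", "fr", "es"]

copies = list(product(formats, resolutions, languages))
print(len(copies))  # 12 stored copies for a single page
```

Every additional dimension multiplies the number of files to store and keep in sync, for every page on the site.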
Content transformation or transcoding at a proxy cache
[Figure: a French-speaking user, a cache, a "Transmogrifier", and a web server. The user sends the request below.]
GET / HTTP/1.1
Host: www.joes-hardware.com
User-Agent: wimpy wireless device
Accept-Language: fr;q=1.0
Cache: "I have a French copy of the document that he wants, but my copy is very media-rich and he has a wimpy wireless browser. I will strip out all of the multimedia content and send it to him."
HTTP/1.1 200 OK
Content-Language: fr
Vary: User-Agent
[body: "Bonjour", simple text content]
Cache: "Since I have transformed this document for a wireless device, I will store the transformed copy as an alternate in case someone else wants it as well."
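A minimal sketch of the "strip out the multimedia content" step, using Python's standard html.parser module. Which tags count as multimedia is an assumption of this sketch, and it also drops comments and doctype declarations for brevity; a real transcoder would need to handle far more cases.

```python
from html.parser import HTMLParser

# Assumption: these tags are treated as "multimedia" in this sketch.
VOID_MEDIA = {"img", "embed", "source", "track"}   # no closing tag
CONTAINER_MEDIA = {"video", "audio", "object"}     # dropped with their contents


class MediaStripper(HTMLParser):
    """Copy HTML through, dropping multimedia elements and their contents."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.depth = 0  # nesting depth inside a dropped container element

    def handle_starttag(self, tag, attrs):
        if tag in CONTAINER_MEDIA:
            self.depth += 1
        elif tag not in VOID_MEDIA and self.depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <img ... />; it never opens a container.
        if tag not in VOID_MEDIA and tag not in CONTAINER_MEDIA and self.depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in CONTAINER_MEDIA:
            self.depth = max(0, self.depth - 1)
        elif tag not in VOID_MEDIA and self.depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.depth == 0:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self.depth == 0:
            self.out.append(f"&#{name};")


def strip_media(html):
    """Return `html` with multimedia elements removed."""
    parser = MediaStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


page = ('<p>Bonjour! <img src="logo.png"> Bienvenue.</p>'
        '<video><source src="a.mp4"></video>')
print(strip_media(page))
```

A proxy cache could store the result of strip_media as an alternate for the same URL, keyed on the User-Agent value, as the cache in the figure does.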
For More Information
- RFC 2616, Hypertext Transfer Protocol -- HTTP/1.1
- RFC 2295, Transparent Content Negotiation in HTTP
- RFC 2296, HTTP Remote Variant Selection Algorithm -- RVSA/1.0
- RFC 2936, HTTP MIME Type Handler Detection
- http://www.imc.org/ietf-medfree/index.html, a link to the Content Negotiation (CONNEG) working group