Title: Caching XML Web Services to Support Disconnected Operation
1 Caching XML Web Services to Support Disconnected Operation
- Venugopalan Ramasubramanian
- Cornell University
- Doug Terry
- Microsoft Research, Silicon Valley
2 Web Services
- method of providing and accessing services on the Internet
- consumer services
  - Hotmail, Orbitz, MapQuest, eBay, ...
- B to B services
  - supply chain management
- request-response paradigm
  - RPCs on the Internet
3 XML Web Services
- W3C (World Wide Web Consortium) standards
  - Microsoft, IBM, HP, ...
- Microsoft .NET web services (HailStorm)
  - myContacts, myProfile, myFavoriteWebSites
  - TerraServer, CoolRooster
- SOAP (Simple Object Access Protocol)
  - standard representation of web service requests/responses (SOAP-RPC)
- WSDL (Web Services Description Language)
  - description of web services
4 Availability of Web Services
- GOAL
  - make web services available despite frequent disconnections and limited bandwidth!
- web service clients reside on all kinds of devices
  - desktop, laptop, PDA, smart phone
- network outages (especially wireless)
- bandwidth restriction
5 Governing Principles
- cannot modify web services
- cannot modify access protocols
- can perhaps modify client
  - must also comply with existing clients
- can interpose storage and computation
Client-side caching is a solution to improve availability!
6 XML Standards: SOAP
- SOAP-RPC standard
  - encoding definitions for data types
  - success, failure definitions
- SOAP-Envelope
  - outermost element
- SOAP-Body
  - obligatory
  - request: operation name, parameters
  - response: status, return value, failure
- SOAP-Header
  - optional, multiple header blocks
  - supplementary information, e.g. Kerberos ticket
- HTTP binding
  - HTTP request and response messages
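The envelope structure listed above can be illustrated with a minimal Python sketch that parses a message, checks that the Body is present, and lists the header blocks and the requested operation. The message, the `urn:example:addressbook` namespace, and the `describe` helper are assumptions made up for this illustration; they are not taken from the talk or from any real service.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

# Hypothetical request; the "urn:example:addressbook" namespace is made up.
MESSAGE = """\
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"
            xmlns:a="urn:example:addressbook">
  <s:Header>
    <a:ticket>3240</a:ticket>
  </s:Header>
  <s:Body>
    <a:queryRequest select="/contacts/contact"/>
  </s:Body>
</s:Envelope>
"""


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def describe(message):
    """Return (names of the header blocks, name of the first Body child)."""
    envelope = ET.fromstring(message)
    if envelope.tag != f"{{{SOAP_NS}}}Envelope":
        raise ValueError("outermost element must be a SOAP Envelope")
    header = envelope.find(f"{{{SOAP_NS}}}Header")  # optional
    body = envelope.find(f"{{{SOAP_NS}}}Body")      # obligatory
    if body is None:
        raise ValueError("SOAP Body is missing")
    blocks = [local_name(child.tag) for child in header] if header is not None else []
    operation = local_name(body[0].tag) if len(body) else None
    return blocks, operation


header_blocks, operation = describe(MESSAGE)
print(header_blocks, operation)
```

A cache that sits between client and service has to do at least this much parsing before it can decide anything about a request.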
7 example SOAP request
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"
            xmlns:m="http://schemas.microsoft.com/hs/2001/10/myContacts"
            xmlns:c="http://schemas.microsoft.com/hs/2001/10/core"
            xmlns:mp="http://schemas.microsoft.com/hs/2001/10/myProfile">
  <s:Header>
    <licenses xmlns="http://schemas.xmlsoap.org/soap/security/2000-12">
      <c:identity>
        <c:kerberos>3240</c:kerberos>
      </c:identity>
    </licenses>
    <path xmlns="http://schemas.xmlsoap.org/rp/">
      <action>http://schemas.microsoft.com/hs/2001/10/core#request</action>
      <to>http://terry.microsoft.com</to>
      <fwd><via /></fwd>
      <rev><via /></rev>
      <id>b55528a4-5d63-49f1-87a2-5fab8d76f658</id>
    </path>
    <c:request service="myContacts" document="content" method="insert" genResponse="always">
      <key puid="3240" instance="1" cluster="1" />
    </c:request>
  </s:Header>
  <s:Body>
    <c:insertRequest select="/m:myContacts/m:contact[mp:name/mp:givenName='Terry']/mp:emailAddress">
      <mp:email>terry@microsoft.com</mp:email>
    </c:insertRequest>
  </s:Body>
</s:Envelope>
8 XML Standards: WSDL
- concrete definition of the web service
  - data structures
  - interface offered by the web service
    - operation names and parameters
  - message formats (components of a message)
  - protocol binding (SOAP)
- automatic generation of client-side stubs
  - Visual Studio .NET
9 Experiments with Web Cache
- experiment with existing clients and services (Microsoft .NET web services)
- check feasibility by building a cache to store HTTP requests/responses
10 Issues in Caching
- web services are active
  - default HTTP cache directive is no-cache!
- web services are diverse
  - unlike files and databases, web services have custom interfaces
- fundamental questions
  - which requests are cacheable?
  - which operations have permanent side effects?
  - how to understand requests/responses?
    - services use different formats for requests/responses
11 example SOAP request
(same request as on slide 7)
12 Issues in Caching (contd.)
request 1: query request
    <queryRequest select="myContacts/contact[name='terry']" />
request 2: delete request
    <deleteRequest select="myContacts/contact[name='terry']/phone[@cat='cell']" />
- consistency
  - later requests might invalidate responses cached earlier
  - read/write, write/write conflicts
- how to specify consistency requirements for generic web services?
13 More Issues
- user experience
  - user unaware of web service cache
  - operations reportedly successful could fail!
- hoarding
  - keeping the cache hot
  - user-controlled hoard requests
- security
  - enforce access control
14 Our Approach
- annotate WSDL description of web services to define cache properties
  - published by service providers or third party
  - no changes to server-side code required
- transparent cache for web services
  - acts as a web proxy on the client machine
  - no modifications of the client program necessary
- custom cache managers for each web service
  - generated automatically from the annotated WSDL description
15 Architecture
[Figure: architecture diagram; labeled components: Proxy Server, Cache, CCM1, WBQ]
Legend: CCM1 = Custom Cache Manager 1; WBQ = Write Back Queue
16 WSDL Annotations for each Operation
- cacheable: the operation can be cached
- lifetime: the duration for which replies are cached
- play-back: the operation has side effects and must be played back when connection is restored
- default-response: a default response will be sent when connection is not available
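To make the four per-operation annotations concrete, here is a small Python sketch of how a cache manager might turn them into a decision. The `OperationAnnotation` class, the `plan` function, and the action strings are hypothetical names chosen for this illustration, and the policy of failing on an expired entry while disconnected is an assumption rather than something the slides specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationAnnotation:
    """The four per-operation properties (field names are illustrative)."""
    cacheable: bool = False
    lifetime_s: float = 0.0        # how long a cached reply stays valid
    play_back: bool = False        # has side effects; replay after reconnect
    default_response: bool = False


def plan(ann, connected, cached_age_s=None):
    """Pick an action; cached_age_s is None when no reply is cached."""
    fresh = (
        ann.cacheable
        and cached_age_s is not None
        and cached_age_s <= ann.lifetime_s
    )
    if fresh:
        return "serve-from-cache"
    if connected:
        return "forward-to-server"
    if ann.play_back:
        # Disconnected update: queue it, optionally answer with a default reply.
        return "queue-and-default-response" if ann.default_response else "queue-only"
    return "fail"  # assumption: expired or missing entries are not served


# Values for 'insert' follow the annotated WSDL snippet in the talk;
# the 300 s lifetime for 'query' is made up.
insert_ann = OperationAnnotation(play_back=True, default_response=True)
query_ann = OperationAnnotation(cacheable=True, lifetime_s=300)

print(plan(insert_ann, connected=False))
print(plan(query_ann, connected=False, cached_age_s=60))
```

Keeping the annotations as plain data is what would allow a cache manager to be generated from the WSDL file instead of being written by hand for each service.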
17 WSDL Annotations for each Service
- identify the operation (operationName)
  - XPath (XML query language) expression to extract the name of the operation
- extract the request message (identifier)
  - portions of the request message should be ignored while caching (e.g. date)
  - XPath expression to extract relevant parts of the message for identification
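The two service-level annotations can be sketched in Python as a cache-key computation: one function derives the operation name from the first Body element, and another hashes only the parts of the message selected by the identifier paths, so that volatile fields do not defeat the cache. The message layout, the `urn:example:addressbook` namespace, and all function names are assumptions for illustration; ElementTree supports only a subset of XPath, so the paths here are simpler than the ones a real annotation could use.

```python
import hashlib
import xml.etree.ElementTree as ET

# Prefix map for the path expressions; "urn:example:addressbook" is made up.
NS = {
    "s": "http://schemas.xmlsoap.org/soap/envelope/",
    "a": "urn:example:addressbook",
}


def operation_name(envelope):
    """Local name of the first Body child, without a trailing 'Request'."""
    first = envelope.find("s:Body", NS)[0]
    local = first.tag.rsplit("}", 1)[-1]
    return local[: -len("Request")] if local.endswith("Request") else local


def fingerprint(element):
    """Order-stable summary of an element: tag, attributes, text, children."""
    return (
        element.tag,
        tuple(sorted(element.attrib.items())),
        (element.text or "").strip(),
        tuple(fingerprint(child) for child in element),
    )


def cache_key(message, identifier_paths):
    """Hash only the parts of the message selected by identifier_paths."""
    envelope = ET.fromstring(message)
    selected = [
        fingerprint(element)
        for path in identifier_paths
        for element in envelope.findall(path, NS)
    ]
    digest = hashlib.sha256(repr(selected).encode("utf-8")).hexdigest()
    return operation_name(envelope), digest


def make_request(message_id, select):
    """Build a hypothetical request; message_id stands for a volatile field."""
    return (
        '<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/" '
        'xmlns:a="urn:example:addressbook">'
        f"<s:Header><a:ticket>3240</a:ticket><a:id>{message_id}</a:id></s:Header>"
        f'<s:Body><a:queryRequest select="{select}"/></s:Body>'
        "</s:Envelope>"
    )


IDENTIFIER = ["s:Header/a:ticket", "s:Body"]  # a:id is deliberately left out
key_a = cache_key(make_request("msg-1", "/contacts/contact"), IDENTIFIER)
key_b = cache_key(make_request("msg-2", "/contacts/contact"), IDENTIFIER)
key_c = cache_key(make_request("msg-3", "/contacts/group"), IDENTIFIER)
print(key_a == key_b, key_a == key_c)
```

Two requests that differ only in the ignored message id map to the same key, while a different query expression yields a different one.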
18 snippet from annotated myContacts.wsdl
<binding name="myContactsBinding" type="tns:myContactsPort"
         operationName="substring-before(local-name(/senv:Envelope/senv:Body/*[1]), 'Request')"
         Identifier="/senv:Envelope/senv:Header/s0:licenses
                   | /senv:Envelope/senv:Header/s1:request
                   | /senv:Envelope/senv:Body">
  <s:binding transport="http://schemas.xmlsoap.org/soap/http" style="document" />
  <operation name="insert" cacheable="false" playback="true"
             defaultResponse="true" cacheHeader="true">
    <s:operation soapAction="http://schemas.microsoft.com/hs/2001/10/crequest" />
19 Annotations for Consistency
- when does request 2 invalidate the response of an earlier request 1 in the cache?
  - an insert could invalidate an earlier query response
- consider requests to be functions with signatures
  - req1 = op1(param_{1,1}, param_{1,2}, ..., param_{1,n})
  - req2 = op2(param_{2,1}, param_{2,2}, ..., param_{2,m})
- invalidate condition is an expression of req1 and req2
  - f(op1, op2, param_{1,1}, ..., param_{2,1}, ...)
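One possible shape for such an invalidation condition is sketched below in Python: a cached query reply is invalidated when a later update touches an overlapping part of the document. The function names are hypothetical, and the overlap test is a deliberate simplification: it handles only slash-separated paths without slashes inside predicates, and it assumes that two different predicates on the same step select disjoint nodes, which is not true in general.

```python
WRITES = {"insert", "delete", "replace"}


def path_steps(select):
    """Split a simple path such as a/b[c='d']/e into its steps."""
    return [step for step in select.split("/") if step]


def split_step(step):
    """Separate a step into (element name, predicate text or '')."""
    name, _, predicate = step.partition("[")
    return name, predicate.rstrip("]")


def steps_may_match(step1, step2):
    name1, pred1 = split_step(step1)
    name2, pred2 = split_step(step2)
    # Assumption: different predicates on the same element are disjoint.
    return name1 == name2 and (not pred1 or not pred2 or pred1 == pred2)


def overlaps(select1, select2):
    """True if one path may select a node inside the other's subtree."""
    a, b = path_steps(select1), path_steps(select2)
    return all(steps_may_match(x, y) for x, y in zip(a, b))


def invalidates(op1, select1, op2, select2):
    """f(op1, op2, params): does later request 2 invalidate reply 1?"""
    return op1 == "query" and op2 in WRITES and overlaps(select1, select2)


terry = "myContacts/contact[name='terry']"
print(invalidates("query", terry, "delete", terry + "/phone[@cat='cell']"))
print(invalidates("query", terry, "insert", "myContacts/contact[name='doug']/email"))
```

A plain boolean like this can only keep or drop the cached reply.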
20 Annotations for Consistency: XSL Transformations
- Extensible Stylesheet Language (XSL)
  - transforms XML documents into HTML/text/XML
  - Turing-complete language
- cache transform: transforms a cached response
  - input: request1, reply1, request2, reply2
  - output: transformed reply1 (null if invalidated)
- more powerful than just specifying invalidations
  - can actually transform the old response
21 Cache Transform Example
request 1: query request
    <queryRequest select="myContacts/contact[name='terry']" />
request 2: delete request
    <deleteRequest select="myContacts/contact[name='terry']/phone[@cat='cell']" />
A smart cache transform would delete the cell phone number from the cached query response.
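The effect of such a transform can be sketched in Python, used here in place of XSL only to keep the example short and runnable. The cached reply format, the phone numbers, and the `apply_delete` helper are made up for illustration; the real myContacts response schema is not shown in the talk.

```python
import xml.etree.ElementTree as ET

# Hypothetical cached reply to the query for contact 'terry'.
CACHED_REPLY = """\
<queryResponse>
  <contact>
    <name>terry</name>
    <phone cat="cell">555-0100</phone>
    <phone cat="home">555-0101</phone>
  </contact>
</queryResponse>
"""


def apply_delete(cached_reply, parent_path, child_path):
    """Remove the nodes a later delete request targets from a cached reply.

    Returns the transformed reply and the number of nodes removed.
    """
    root = ET.fromstring(cached_reply)
    removed = 0
    for parent in root.findall(parent_path):
        # findall returns a list, so removing while looping is safe here.
        for child in parent.findall(child_path):
            parent.remove(child)
            removed += 1
    return ET.tostring(root, encoding="unicode"), removed


transformed, removed = apply_delete(
    CACHED_REPLY, "contact[name='terry']", "phone[@cat='cell']"
)
print(removed)
print(transformed)
```

The cached entry stays usable after the delete, which matters most while the client is disconnected and cannot re-fetch the query result.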
22
<xsl:template match="/">
  <xsl:variable name="service1" select="$req1/s:Header/c:request/@service"/>
  <xsl:variable name="service2" select="$req2/s:Header/c:request/@service"/>
  <xsl:variable name="opName1"
                select="substring-before(local-name($req1/s:Body/*[1]), 'Request')"/>
  <xsl:variable name="opName2"
                select="substring-before(local-name($req2/s:Body/*[1]), 'Request')"/>
  <xsl:choose>
    <xsl:when test="$service1 = $service2">
      <xsl:choose>
        <xsl:when test="$opName2 = 'query' and ($opName1 = 'insert' or $opName1 = 'delete' or $opName1 = 'replace')">
          <xsl:variable name="cleanQuery1">
            <xsl:call-template name="StripSegment">
              <xsl:with-param name="xpQuery"
                              select="substring-after($req1/s:Body/c:*/@select, '/')"/>
            </xsl:call-template>
          </xsl:variable>
          <xsl:variable name="cleanQuery2">
            <xsl:call-template name="StripSegment">
              <xsl:with-param name="xpQuery"
                              select="substring-after($req2/s:Body/c:queryRequest/c:xpQuery/@select, '/')"/>
            </xsl:call-template>
          </xsl:variable>
          <xsl:call-template name="CheckIntersection">
            <xsl:with-param name="xpQuery1" select="$cleanQuery1"/>
            <xsl:with-param name="xpQuery2" select="$cleanQuery2"/>
          </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="$rep2"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$rep2"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
23 Picking Level of Consistency
- user freedom in choosing consistency guarantees
  - multiple consistency transforms
- strong consistency
  - less availability
  - better user experience
- weak consistency
  - user experience could deteriorate
  - operations reportedly successful could fail!
  - optional cache header
  - better availability
24 More Transforms
- response transform
  - response from the cache may have to be changed before returning to the client
  - adding time-stamp, unique identifiers etc.
- default response transform
  - generates a default response for a request
  - default responses are returned when disconnected but request is queued for play-back
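The interplay between default responses and play-back can be sketched in Python as a small write-back queue. The class, the dictionary-shaped requests and replies, and the fake server are all assumptions for illustration; in the prototype the default response is produced by a transform over the SOAP request, not by a Python function.

```python
from collections import deque


def default_response(request):
    """Stand-in for the default response transform (reply format is made up)."""
    return {
        "operation": request["operation"],
        "status": "success",
        "defaultResponse": True,
        "toPlayback": True,
    }


class WriteBackQueue:
    """Holds side-effecting requests until the server is reachable again."""

    def __init__(self, send):
        self._send = send  # callable that forwards one request to the server
        self._pending = deque()

    def __len__(self):
        return len(self._pending)

    def submit(self, request, connected):
        if connected:
            return self._send(request)
        self._pending.append(request)     # remember it for play-back
        return default_response(request)  # answer the client right away

    def play_back(self):
        """Replay queued requests in order; call once the connection is back."""
        replies = []
        while self._pending:
            request = self._pending[0]
            replies.append(self._send(request))  # if this raises, request stays queued
            self._pending.popleft()
        return replies


server_log = []


def fake_server(request):
    """Hypothetical server that records what it receives."""
    server_log.append(request)
    return {"operation": request["operation"], "status": "success"}


queue = WriteBackQueue(fake_server)
reply = queue.submit({"operation": "insert", "email": "terry@example.com"}, connected=False)
queued_before = len(queue)
played = queue.play_back()
print(reply["defaultResponse"], queued_before, len(server_log))
```

The real reply obtained during play-back may differ from the default response the client already saw, which is the "reportedly successful operations could fail" risk raised earlier in the talk.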
25 Optional Cache Header
- cache provides information to the client using cache header
  - response from cache or server
  - age of cached response
  - request will be played back in the future
- no changes to the definition of WSDL
  - would not affect existing clients in any way
- cache-aware clients can provide additional information to the user
26 example default response and cache header
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"
            xmlns:hs="http://schemas.microsoft.com/hs/2001/10/core">
  <s:Header>
    <path xmlns="http://schemas.xmlsoap.org/rp/">
      <action>http://schemas.microsoft.com/hs/2001/10/core#response</action>
      <rev />
      <from>http://terry.microsoft.com</from>
      <relatesTo>d978b559-aceb-4e9e-9747-b8a306234bc8</relatesTo>
    </path>
    <response xmlns="http://schemas.microsoft.com/hs/2001/10/core" />
    <cacheHeader defaultResponse="true" toPlayback="true"
                 xmlns="http://localhost/wsdlannotation" />
  </s:Header>
  <s:Body>
    <hs:insertResponse status="success" selectedNodeCount="1" newChangeNumber="0" />
  </s:Body>
</s:Envelope>
27 Conclusion
- built a prototype web services cache
- experimented with HailStorm web services and clients
  - annotated HailStorm WSDL files
- the prototype demonstrates custom cache managers in action for HailStorm
  - couldn't give a demo
28 Work for the Future
- WSDL annotations for more web services
  - hard to find interesting web services with WSDL descriptions yet!
- hoarding to enhance availability
  - specify user-controlled hoard queries
  - hoard transform to obtain response from cached hoard requests
- incorporate security constraints
- tune cache performance