Title: Tools for Automated Verification of Web Services
1Tools for Automated Verification of Web Services
Modeling Interactions of Web Software
Analyzing Conversations of Web Services
- Tevfik Bultan
- Department of Computer Science
- University of California, Santa Barbara
- bultan_at_cs.ucsb.edu
- http://www.cs.ucsb.edu/~bultan
- Joint work with
- Xiang Fu, Georgia Southwestern State University
- Jianwen Su, University of California, Santa
Barbara
2Going to Lunch at UCSB
- Before Xiang graduated from UCSB, Xiang, Jianwen and I were using the following protocol for going to lunch
  - Sometime around noon one of us would call another one by phone and tell him where and when we would meet for lunch.
  - The receiver of this first call would call the remaining peer and pass the information.
- Let's call this protocol the First Caller Decides (FCD) protocol.
3Implementation of the FCD Protocol
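[Figure: state machines for the three peers Tevfik, Xiang, and Jianwen. Each peer has send transitions for the first call (e.g. !tx1, !tj1), receive transitions for the first call (e.g. ?xt1, ?jt1), send transitions for the relayed call (e.g. !tx2, !tj2), and receive transitions for the relayed call (e.g. ?xt2, ?jt2).]
Legend: ! denotes send, ? denotes receive
Message labels: in tx1, t is the sender (from Tevfik), x is the receiver (to Xiang), and 1 marks the 1st message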
4FCD Protocol does not Work with Voicemail
- When the university installed a voicemail system, the FCD protocol started causing problems
  - We were showing up at different restaurants at different times!
- Example scenario: tx1, jx1, xj2
  - The messages jx1 and xj2 are not consumed
- Note that this scenario is not possible without voicemail!
5A Different Lunch Protocol
- Jianwen suggested that we change our lunch protocol as follows
  - As the most senior researcher among us, Jianwen would make the first call to either Xiang or Tevfik and tell when and where we would meet for lunch.
  - Then, the receiver of this call would pass the information to the other peer.
- Let's call this protocol the Jianwen Decides (JD) protocol
6Implementation of the JD Protocol
[Figure: state machines for the three peers. Tevfik: ?jt, ?xt, !tx. Xiang: ?jx, ?tx, !xt. Jianwen: !jt, !jx.]
- JD protocol works fine with voicemail!
7Conversation Protocols
- The FCD and JD protocols specify a set of conversations
  - The implementations I showed are supposed to generate the set of conversations specified by these protocols
- We can specify the set of conversations without showing how the peers implement them; we call such a specification a conversation protocol
8FCD and JD Conversation Protocols
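[Figure: FCD protocol automaton with first-call transitions tj1, tx1, xt1, xj1, jt1, jx1, each followed by one relayed-call transition xj2, jx2, tj2, jt2, tx2, xt2. JD protocol automaton with transitions jt, jx followed by tx, xt.]
FCD Protocol
Conversation set: (tx1, xj2), (tj1, jx2), (xt1, tj2), (xj1, jt2), (jt1, tx2), (jx1, xt2)
JD Protocol
Conversation set: (jt, tx), (jx, xt)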
9Observations & Questions
- The implementation of the FCD protocol behaves differently with synchronous and asynchronous communication whereas the implementation of the JD protocol behaves the same.
  - Can we find a way to identify such implementations?
- The implementation of the FCD protocol does not obey the FCD protocol if asynchronous communication is used whereas the implementation of the JD protocol obeys the JD protocol even if asynchronous communication is used.
  - Given a conversation protocol, can we figure out if there is an implementation which generates the same conversation set?
10Synchronizability and Realizability Analyses
- We formalized these observations and questions using synchronizability and realizability analyses
- The implementation of the JD protocol is synchronizable but the implementation of the FCD protocol is not synchronizable
- The JD protocol is realizable but the FCD protocol is not realizable
11Outline
- Web Service Composition Model
- Capturing Global Behaviors
- Conversations
- Top-Down vs. Bottom-Up Specification and Verification
- Realizability vs. Synchronizability
- XML messaging
- MSL, XPath
- Translation to Promela
- Web Service Analysis Tool
- Conclusions and Future Work
12Characteristics of Web Services
- Loosely coupled, interaction through standardized interfaces
- Standardized data transmission via XML
- Asynchronous messaging
- Platform independent (.NET, J2EE)
Web Service Standards
- Data: XML
- Type: XML Schema
- Message: SOAP
- Interface: WSDL
- Behavior: BPEL4WS
- Interaction: WS-CDL
- Implementation Platforms: Microsoft .Net, Sun J2EE
13Challenges in Verification of Web Services
- Distributed nature, no central control
- How do we model the global behavior?
- How do we specify the global properties?
- Asynchronous messaging introduces undecidability in analysis
- How do we check the global behavior?
- How do we enforce the global behavior?
- XML data manipulation
- How do we specify the XML messages?
- How do we verify properties related to data?
14A Model for Composite Web Services
- A composite web service consists of
  - a finite set of peers
    - Lunch example: T, X, J
  - and a finite set of message classes
    - Lunch example (JD protocol): jt, tx, jx, xt
[Figure: Peer T, Peer X, and Peer J connected by the message classes tx, xt, jx, and jt.]
15Communication Model
- We assume that the messages among the peers are exchanged using reliable and asynchronous messaging
  - FIFO and unbounded message queues
[Figure: Peer T sends tx messages into the input queue of Peer X.]
- This model is similar to industry efforts such as
- JMS (Java Message Service)
- MSMQ (Microsoft Message Queuing Service)
16Conversations
- A virtual watcher records the messages as they are sent
[Figure: Peer T, Peer X, and Peer J exchange messages while the watcher records the sequence tx, jt.]
- A conversation is a sequence of messages the watcher sees during an execution [Bultan, Fu, Hull, Su WWW'03]
17Effects of Asynchronous Communication
- Question: Given a composite web service, is the set of conversations a regular set?
- Even when messages do not have any content and the peers are finite state machines, the conversation set may not be regular
- Reason: asynchronous communication with unbounded queues
- Bounded queues or synchronous communication
  - ⇒ Conversation set always regular
18Properties of Conversations
- The notion of conversation enables us to reason about temporal properties of the composite web services
- LTL framework extends naturally to conversations
  - LTL temporal operators
    - X (neXt), U (Until), G (Globally), F (Future)
  - Atomic properties
    - Predicates on message classes (or contents)
  - Example: G(payment ⇒ F receipt)
- Model checking problem: Given an LTL property, does the conversation set satisfy the property?
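To make the example property concrete, here is a minimal Python sketch that checks G(payment ⇒ F receipt) on one finite conversation. The function name `holds_globally_response`, the sample traces, and the finite-trace reading of G and F are assumptions made for illustration; LTL is normally interpreted over infinite sequences, and a model checker examines every conversation in the set rather than a single trace.

```python
def holds_globally_response(conversation, trigger, response):
    """Finite-trace check of G(trigger => F response).

    Every occurrence of `trigger` must be followed, at the same or a
    later position, by an occurrence of `response`.
    """
    for i, message in enumerate(conversation):
        if message == trigger and response not in conversation[i:]:
            return False
    return True


# Hypothetical conversations, written as lists of message class names.
ok_trace = ["order", "payment", "receipt"]
bad_trace = ["payment", "receipt", "payment"]  # last payment gets no receipt

print(holds_globally_response(ok_trace, "payment", "receipt"))
print(holds_globally_response(bad_trace, "payment", "receipt"))
```

The second trace fails because its final payment is never followed by a receipt, which is the kind of violation the property is meant to rule out.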
19Bottom-Up vs. Top-Down
- Bottom-up approach
  - Specify the behavior of each peer
  - The global communication behavior (conversation set) is implicitly defined based on the composed behavior of the peers
  - Global communication behavior is hard to understand and analyze
- Top-down approach
  - Specify the global communication behavior (conversation set) explicitly as a protocol
  - Ensure that the conversations generated by the peers obey the protocol
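20[Figure: top-down vs. bottom-up verification. Top-down: a conversation schema (Peer T, Peer X, Peer J with messages tx, xt, jt, jx) and a conversation protocol are checked against an LTL property, G(tx ⇒ F xt). Bottom-up: the peer implementations (transitions ?xt, ?tx, ?jt, ?jx, !jt, !jx, !tx, !xt), each with an input queue, are observed by a virtual watcher, and the resulting conversations are checked against the same LTL property.]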
21Conversation Protocols
- Conversation Protocol
  - An automaton that accepts the desired conversation set
- A conversation protocol is a contract agreed by all peers
  - Each peer must act according to the protocol
- For reactive protocols with infinite message sequences we use
  - Büchi automata which accept infinite strings
- For specifying message contents, we use
  - Guarded automata
  - Guards are constraints on the message contents
22Synthesize Peer Implementations
- Conversation protocol specifies the global communication behavior
  - How do we implement the peers?
- How do we obtain the contracts that peers have to obey from the global contract specified by the conversation protocol?
- Project the global protocol to each peer
  - By dropping unrelated messages for each peer
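The projection step can be sketched in a few lines of Python, using the JD conversation set from the lunch example. This is only an illustration under stated assumptions: the helper names `project` and `project_protocol` are hypothetical, message names follow the lunch-example convention (first letter is the sender, second letter is the receiver), and the sketch projects an enumerated conversation set, whereas a tool would project the protocol automaton itself.

```python
def project(conversation, peer):
    """Keep only the messages that `peer` sends (!) or receives (?)."""
    local = []
    for message in conversation:
        sender, receiver = message[0], message[1]
        if peer == sender:
            local.append("!" + message)
        elif peer == receiver:
            local.append("?" + message)
        # Messages between the other peers are dropped.
    return tuple(local)


def project_protocol(conversations, peers):
    """Project every conversation to every peer."""
    return {p: {project(c, p) for c in conversations} for p in peers}


# JD conversation set from the lunch example.
JD_CONVERSATIONS = {("jt", "tx"), ("jx", "xt")}

projections = project_protocol(JD_CONVERSATIONS, "jtx")
for p in "jtx":
    print(p, sorted(projections[p]))
```

Each projected set is the local contract for one peer: Jianwen only sends, while Tevfik and Xiang either receive and forward, or receive the forwarded message.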
23Interesting Question
[Figure: Conversations specified by the conversation protocol =? Conversations generated by the projected services]
- If this equality holds, the conversation protocol is realizable
- Are there conditions which ensure the equivalence?
24Realizability Problem
- Not all conversation protocols are realizable!
[Figure: conversation protocol with the transition A → B: m1 followed by the transition C → D: m2.]
- The conversation m2 m1 will be generated by all peer implementations which follow the protocol
25Another Non-Realizable Protocol
[Figure: a conversation protocol over peers A, B, and C with the transitions A → B: m1, B → A: m2, and A → C: m3, shown together with its projections to the peers and a watcher.]
Generated conversation: m2 m1 m3
26Realizability Conditions
- Three sufficient conditions for realizability (no message content) [Fu, Bultan, Su, CIAA'03, TCS'04]
  - Lossless join
    - Conversation set should be equivalent to the join of its projections to each peer
  - Synchronous compatible
    - When the projections are composed synchronously, there should not be a state where a peer is ready to send a message while the corresponding receiver is not ready to receive
  - Autonomous
    - At any state, each peer should be able to do only one of the following: send, receive or terminate (a peer can still choose among multiple messages)
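As an illustration of the autonomous condition, the following Python sketch checks it on a single peer given as a small state machine. The dictionary encoding, the function name `is_autonomous`, and the two peer definitions (`JD_T` and `FCD_T`, my reading of Tevfik's role in the JD and FCD implementations) are assumptions for this example; the published definition is stated on the projected automata, so treat this as a simplified reading rather than the exact check.

```python
def is_autonomous(transitions, final_states):
    """Return True if every state offers only one kind of behavior.

    `transitions` maps a state to a list of (action, target) pairs, where
    an action is "!m" (send m) or "?m" (receive m). A final state may
    terminate, which counts as a third kind of behavior.
    """
    for state, outgoing in transitions.items():
        kinds = {action[0] for action, _target in outgoing}  # "!" and/or "?"
        if state in final_states:
            kinds.add("terminate")
        if len(kinds) > 1:
            return False
    return True


# Hypothetical encoding of Tevfik in the JD implementation.
JD_T = {
    "s0": [("?jt", "fwd"), ("?xt", "end")],
    "fwd": [("!tx", "end")],
    "end": [],
}

# Hypothetical encoding of Tevfik in the FCD implementation: in the
# initial state he may either place a call or receive one.
FCD_T = {
    "s0": [("!tx1", "end"), ("!tj1", "end"),
           ("?xt1", "to_j"), ("?jt1", "to_x"),
           ("?xt2", "end"), ("?jt2", "end")],
    "to_j": [("!tj2", "end")],
    "to_x": [("!tx2", "end")],
    "end": [],
}

print(is_autonomous(JD_T, {"end"}))
print(is_autonomous(FCD_T, {"end"}))
```

Under this encoding the JD peer passes and the FCD peer fails in its initial state, where sending and receiving are both enabled.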
27Realizability Conditions
- Each of the following protocols fails one of the three conditions but satisfies the other two
[Figure: three protocols labeled "Not lossless join", "Not autonomous", and "Not synchronous compatible", built from the transitions A → B: m1, B → A: m2, C → D: m2, C → A: m2, and A → C: m3.]
28Bottom-Up Approach
- We know that analyzing conversations of composite web services is difficult due to asynchronous communication
  - Model checking for conversation properties is undecidable even for finite state peers
- The question is
  - Can we identify the composite web services where asynchronous communication does not create a problem?
29Three Examples, Example 1
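[Figure: requester and server state machines. The requester sends r1, r2, and e (!r1, !r2, !e) and receives a1 and a2 (?a1, ?a2); the server receives r1, r2, and e (?r1, ?r2, ?e) and sends a1 and a2 (!a1, !a2).]
- Conversation set is regular: (r1a1 | r2a2)* e
- During all executions the message queues are bounded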
30Example 2
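[Figure: requester and server state machines. The requester sends r1, r2, and e (!r1, !r2, !e) and receives a1 and a2 (?a1, ?a2); the server receives r1, r2, and e (?r1, ?r2, ?e) and sends a1 and a2 (!a1, !a2).]
- Conversation set is not regular
- Queues are not bounded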
31Example 3
[Figure: requester and server state machines. The requester has transitions !r1, !r2, !r, ?a, and !e; the server has transitions ?r1, ?r2, ?r, !a, and ?e.]
- Conversation set is regular: (r1 | r2 | ra)* e
- Queues are not bounded
32State Spaces of the Three Examples
[Figure: number of states (in thousands) plotted against queue length for the three examples.]
- Verification of Examples 2 and 3 is difficult even if we bound the queue length
- How can we distinguish Examples 1 and 3 (with regular conversation sets) from 2?
  - Synchronizability Analysis
33Synchronizability Analysis
- A composite web service is synchronizable if its conversation set does not change when asynchronous communication is replaced with synchronous communication
- If a composite web service is synchronizable, we can check the properties about its conversations using synchronous communication semantics
  - For finite state peers this is a finite state model checking problem
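The definition can be tried out on the lunch example with a brute-force Python sketch that enumerates conversations under both semantics. Everything here is an assumption made for illustration: the dictionary encoding of peers, the helper names `fcd_peer` and `conversations`, the convention that the second letter of a message names its receiver, and the choice to record a conversation whenever no peer can move. The analysis described in the talk relies on sufficient conditions rather than on exhaustive exploration, which only terminates here because the peers are acyclic.

```python
def fcd_peer(me, a, b):
    """Hypothetical FCD implementation of peer `me` with partners a and b."""
    return {
        "s0": [(f"!{me}{a}1", "end"), (f"!{me}{b}1", "end"),          # I call first
               (f"?{a}{me}1", "to_" + b), (f"?{b}{me}1", "to_" + a),  # I am called first
               (f"?{a}{me}2", "end"), (f"?{b}{me}2", "end")],         # I am called last
        "to_" + a: [(f"!{me}{a}2", "end")],
        "to_" + b: [(f"!{me}{b}2", "end")],
        "end": [],
    }

FCD = {"t": fcd_peer("t", "x", "j"), "x": fcd_peer("x", "t", "j"),
       "j": fcd_peer("j", "t", "x")}
JD = {
    "j": {"s0": [("!jt", "end"), ("!jx", "end")], "end": []},
    "t": {"s0": [("?jt", "fwd"), ("?xt", "end")], "fwd": [("!tx", "end")], "end": []},
    "x": {"s0": [("?jx", "fwd"), ("?tx", "end")], "fwd": [("!xt", "end")], "end": []},
}

def conversations(peers, asynchronous):
    """Enumerate the send sequences of all maximal runs."""
    names = sorted(peers)
    found = set()

    def step(states, index, target):
        return states[:index] + (target,) + states[index + 1:]

    def explore(states, queues, trace):
        moved = False
        for i, name in enumerate(names):
            for action, target in peers[name][states[i]]:
                kind, msg = action[0], action[1:]
                if kind == "!":
                    r = names.index(msg[1])  # receiver of the message
                    if asynchronous:
                        # Append to the receiver's FIFO queue (voicemail).
                        grown = queues[:r] + (queues[r] + (msg,),) + queues[r + 1:]
                        explore(step(states, i, target), grown, trace + (msg,))
                        moved = True
                    else:
                        # Synchronous: the receiver must take the message now.
                        for r_action, r_target in peers[names[r]][states[r]]:
                            if r_action == "?" + msg:
                                both = step(step(states, i, target), r, r_target)
                                explore(both, queues, trace + (msg,))
                                moved = True
                elif asynchronous and queues[i] and queues[i][0] == msg:
                    # Consume the message at the head of this peer's queue.
                    shrunk = queues[:i] + (queues[i][1:],) + queues[i + 1:]
                    explore(step(states, i, target), shrunk, trace)
                    moved = True
        if not moved:
            found.add(trace)

    explore(tuple("s0" for _ in names), tuple(() for _ in names), ())
    return found

print(conversations(JD, False) == conversations(JD, True))
print(conversations(FCD, False) == conversations(FCD, True))
print(("tx1", "jx1", "xj2") in conversations(FCD, True))
```

With these encodings the JD implementation produces the same two conversations under both semantics, while the FCD implementation gains extra asynchronous runs such as tx1, jx1, xj2, the voicemail scenario from the lunch example.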
34Synchronizability Analysis
- A composite web service is synchronizable if it satisfies the synchronous compatible and autonomous conditions [Fu, Bultan, Su WWW'04, TSE]
- Connection between realizability and synchronizability
  - A conversation protocol is realizable if its projections to peers are synchronizable and the protocol itself satisfies the lossless join condition
35Are These Conditions Too Restrictive?
Problem Set                              Size                        Pass?
Source                      Name         #msg   #states   #trans.
ISSTA'04                    SAS          9      12        15        yes
IBM Conv. Support Project   CvSetup      4      4         4         yes
IBM Conv. Support Project   MetaConv     4      4         6         no
IBM Conv. Support Project   Chat         2      4         5         yes
IBM Conv. Support Project   Buy          5      5         6         yes
IBM Conv. Support Project   Haggle       8      5         8         no
IBM Conv. Support Project   AMAB         8      10        15        yes
BPEL spec                   shipping     2      3         3         yes
BPEL spec                   Loan         6      6         6         yes
BPEL spec                   Auction      9      9         10        yes
Collaxa.com                 StarLoan     6      7         7         yes
Collaxa.com                 Cauction     5      7         6         yes
36Web Service Analysis Tool (WSAT)
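[Figure: WSAT architecture, organized as front end, analysis on an intermediate representation, and back end.
- Front end: web services written in BPEL are translated to guarded automata (BPEL to GFSA); conversation protocols written as guarded automata are read by the GFSA parser.
- Intermediate representation: guarded automata (bottom-up) or a single guarded automaton (top-down).
- Analysis: synchronizability analysis for bottom-up specifications and realizability analysis for top-down specifications; each analysis reports success or fail, and it can be skipped.
- Back end (verification language Promela): GFSA to Promela with synchronous communication, GFSA to Promela with bounded queues, and GFSA to Promela as a single process with no communication.]
http://www.cs.ucsb.edu/~su/WSAT/
[Fu, Bultan, Su CAV'04]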
37Guarded Automata Model
- Uses XML messages
- Uses MSL for declaring message types
  - MSL (Model Schema Language) is a compact formal model language which captures core features of XML Schema
- Uses XPath expressions for guards
  - XPath is a language for writing expressions (queries) that navigate through XML trees and return a set of answer nodes
38The Guarded Automata Model
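// type declaration
request[ id[int] ]
// message declaration
r2 request
// local variable declaration
last request
[Figure: requester automaton with transitions !r1, !r2, ?a1, ?a2, and !e; one transition is annotated with the guard below.]
Guard: a2/id = last/id ⇒ r2/id := last/id + 1, last/id := last/id + 1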
39XML (eXtensible Markup Language)
- XML is a markup language like HTML
- Similar to HTML, XML tags are written as
  - <tag> followed by </tag>
- HTML vs. XML
  - In HTML, tags are used to describe the appearance of the data
    - <b> </b> <i> </i> <br> <p> ...
  - In XML, tags are used to describe the content of the data rather than the appearance
    - <date> </date> <address> </address>
40An XML Document and Its Tree
<Register>
  <investorID> VIP01 </investorID>
  <requestList>
    <stockID> 0001 </stockID>
    <stockID> 0002 </stockID>
  </requestList>
  <payment>
    <accountNum> 0425 </accountNum>
  </payment>
</Register>
- XML documents can be modeled as trees
  - where each internal node corresponds to a tag and leaf nodes correspond to basic types
41XML Schema
- XML provides a standard way to exchange data over the Internet.
- However, the parties which exchange XML documents still have to agree on the type of the data
  - What are the tags that will appear in the document, in what order, etc.
- XML Schema is a language for defining XML data types
- MSL (Model Schema Language) is a compact formal model language which captures core features of XML Schema
42MSL (Model Schema Language)
- Basic MSL syntax
  - g → ε | b | t[g] | g{m,n} | g , g | g & g | g | g
- g is an XML type (i.e., an MSL type expression)
- ε is the empty sequence
- b is a basic type such as string, boolean, int, etc.
- t is a tag
- m and n are positive integers
- [ ], { }, the comma, &, and | are MSL type constructors
43MSL Semantics
- t[g] denotes a type with root node labeled t with children of type g
- g{m,n} denotes a sequence of size at least m and at most n where each member is of type g
- g1 , g2 denotes an ordered sequence where the first member is of type g1 and the second member is of type g2
- g1 & g2 denotes an unordered sequence where one member is of type g1 and the other member is of type g2
- g1 | g2 denotes a choice between type g1 and type g2, i.e., either type g1 or type g2, but not both
44An MSL Type Declaration and an Instance
<Register>
  <investorID> VIP01 </investorID>
  <requestList>
    <stockID> 0001 </stockID>
    <stockID> 0002 </stockID>
  </requestList>
  <payment>
    <accountNum> 0425 </accountNum>
  </payment>
</Register>

Register[
  investorID[string] ,
  requestList[ stockID[int]{1,3} ] ,
  payment[ creditCardNum[int] | accountNum[int] ]
]
45Translating Guarded Automata to Promela
- We used the SPIN model checker to verify the properties of conversations
  - SPIN is a finite state model checker
  - We restricted XML message contents to finite domains
- We translate guarded automata models to Promela (input language of the SPIN model checker)
  - First, translate MSL type declarations to Promela type declarations
  - Then, translate XPath expressions to Promela code
46Mapping MSL types to Promela
- Basic types
  - integer and boolean types are mapped to Promela basic types int and bool
  - We only allow constant string values and strings are mapped to enumerated type (mtype) in Promela
- Other type constructors are handled using
  - structured types (declared using typedef) in Promela
  - or arrays
47Mapping MSL type constructors to Promela
- t[g] is translated to a typedef declaration
- g{m,n} is translated to an array declaration
- g1 , g2 is translated to a sequence of type declarations
- g1 | g2 is translated to a sequence of type declarations and an enumerated variable which is used to record which type is chosen
- g1 & g2 is not handled! We do not handle unordered type sequence (it can cause state-space explosion)
48Example
MSL type:

Register[
  investorID[string] ,
  requestList[ stockID[int]{1,3} ] ,
  payment[ creditCardNum[int] | accountNum[int] ]
]

Promela translation:

typedef t1_investorID { mtype stringvalue; }
typedef t2_stockID { int intvalue; }
typedef t3_requestList {
    t2_stockID stockID[3];
    int stockID_occ;
}
typedef t4_accountNum { int intvalue; }
typedef t5_creditCard { int intvalue; }
mtype = { m_accountNum, m_creditCard }
typedef t6_payment {
    t4_accountNum accountNum;
    t5_creditCard creditCard;
    mtype choice;
}
typedef Register {
    t1_investorID investorID;
    t3_requestList requestList;
    t6_payment payment;
}
49XPath
- In order to write specifications or programs that manipulate XML documents we need
  - an expression language to access values and nodes in XML documents
- XPath is a language for writing expressions (queries) that navigate through XML trees and return a set of answer nodes
- An XPath query defines a function which
  - takes an XML tree and a context node (in the same tree) as input and
  - returns a set of nodes (in the same tree) as output
50XPath Syntax
- Basic XPath syntax
  - q → . | .. | * | b | t | /q | //q | q / q | q // q | q [ q ] | q [ exp ]
- q is an XPath query
- exp denotes a predicate on basic types, i.e., on the leaf nodes of the XML tree
- b denotes a basic type such as string, boolean, int, etc.
- t denotes a tag
51XPath Semantics
- Given an XML tree and a node n as a context node
- . returns n
- .. returns the parent of n
- Given an XML tree and a set of nodes
- * returns all the nodes
- b returns the nodes that are of basic type b
- t returns the nodes which are labeled with tag
t
52XPath Semantics Contd.
- Starting at the context node
  - /q returns the nodes that match q
  - //q returns the nodes that match q starting at any descendant
- q1 / q2 returns each node which matches q2 starting at a child of a node which matches q1
- q1 // q2 returns each node which matches q2 starting at a descendant of a node which matches q1
- q1 [ q2 ] applies q2 to the children of the nodes which match q1
- q [ exp ] returns the nodes that match q and for children of which the expression exp evaluates to true
53Examples
- //payment/* returns the node labeled accountNum
- /Register/requestList/stockID/int returns the nodes labeled 0001 and 0002
- //stockID[int > 1]/int returns the node labeled 0002
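The three queries can be approximated in Python with the standard library module `xml.etree.ElementTree`, as a rough illustration. Two caveats, both assumptions of this sketch: ElementTree implements only a subset of XPath and has no `int` node test, so the queries are written relative to the root element and return the tag elements rather than their typed leaves, and the numeric predicate is evaluated in Python instead of inside the path expression. The document text is the Register instance used earlier, with whitespace removed.

```python
import xml.etree.ElementTree as ET

DOCUMENT = (
    "<Register>"
    "<investorID>VIP01</investorID>"
    "<requestList><stockID>0001</stockID><stockID>0002</stockID></requestList>"
    "<payment><accountNum>0425</accountNum></payment>"
    "</Register>"
)

root = ET.fromstring(DOCUMENT)  # root is the Register element

# //payment/* : all children of any payment node
payment_children = [node.tag for node in root.findall(".//payment/*")]

# /Register/requestList/stockID/int : values under each stockID
stock_ids = [node.text for node in root.findall("./requestList/stockID")]

# //stockID[int > 1]/int : values of stockID nodes whose integer exceeds 1
above_one = [node.text for node in root.iter("stockID") if int(node.text) > 1]

print(payment_children)
print(stock_ids)
print(above_one)
```

The results line up with the three examples: the accountNum child of payment, both stockID values, and only the stockID whose value is greater than 1.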
54XPath to Promela
- Generate code that evaluates the XPath expression [Fu, Bultan, Su ISSTA'04]
- Traverse the XPath expression from left to right
  - Code generated in each step is inserted into the BLANK spaces left in the code from the previous step
  - A tree representation of the MSL type is used to keep track of the context of the generated code
- Uses two data structures
  - Type tree shows the structure of the corresponding MSL type
  - Abstract statements which are mapped to Promela code
55Statement / Promela Code
- IF(v): if :: v -> BLANK :: else -> skip fi
- FOR(v,l,h): v = l - 1; do :: v < h -> BLANK; v++ :: else -> break od
- EMPTY: BLANK
- INC(v): v++
- SET(v,a): v = a
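The way abstract statements expand to Promela text and nest through BLANK can be sketched with a small Python string generator. This is a sketch under assumptions: the function names `emit` and `insert` are hypothetical, the templates follow my reading of the statement table (which is damaged in this transcript), and the loop header assumes that FOR(v,l,h) counts from l - 1 up to h - 1 so that it can index a zero-based Promela array.

```python
def emit(kind, *args):
    """Return the Promela text for one abstract statement."""
    if kind == "IF":
        (v,) = args
        return f"if\n:: {v} -> BLANK\n:: else -> skip\nfi"
    if kind == "FOR":
        v, low, high = args
        return (f"{v} = {low - 1};\ndo\n"
                f":: {v} < {high} -> BLANK; {v}++\n"
                f":: else -> break\nod")
    if kind == "EMPTY":
        return "BLANK"
    if kind == "INC":
        (v,) = args
        return f"{v}++"
    if kind == "SET":
        v, a = args
        return f"{v} = {a}"
    raise ValueError(f"unknown statement kind: {kind}")


def insert(outer, inner):
    """Fill the first BLANK left in `outer` with the code `inner`."""
    return outer.replace("BLANK", inner, 1)


# Nest FOR(i1,1,3) > IF(cond) > SET(bRes1,true), one step at a time.
code = emit("FOR", "i1", 1, 3)
code = insert(code, emit("IF", "cond"))
code = insert(code, emit("SET", "bRes1", "true"))
print(code)
```

Each step fills the BLANK left by the previous one, mirroring the left-to-right traversal of the XPath expression; the final text contains no BLANK.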
56Type Tree
Register[ investorID[string] , requestList[ stockID[int]{1,3} ] , payment[ creditCardNum[int] | accountNum[int] ] ]

Type tree (node numbers as in the figure):
1 Register
  2 investorID
    3 string
  4 requestList
    5 stockID (idx i1)
      6 int
  7 payment
    8 creditCard
      9 int
    10 accountNum
      11 int
57Generated Statements
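$register // stockID / [int()>5] / [position() = last()] / int()
[Figure: the sequence of abstract statements generated while traversing the expression over type-tree nodes 1, 5, and 6: EMPTY, FOR(i1,1,3), IF(i2 = i3), EMPTY. Each statement is inserted into the BLANK left by the previous one, where cond ≡ v_register.requestList.stockID[i1] > 5.]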
58$request//stockID = $register//stockID/[int()>5]/[position()=last()]
/* result of the XPath expression */
bool bResult = false;
/* results of the predicates 1, 2, and 1 resp. */
bool bRes1, bRes2, bRes3;
/* index, position(), last(), index, position() */
int i1, i2, i3, i4, i5;
i2 = 1;
/* pre-calculate the value of last(), store in i3 */
i4 = 0; i5 = 1; i3 = 0;
do
:: i4 < v_register.requestList.stockID_occ ->
     /* compute first predicate */
     bRes3 = false;
     if
     :: v_register.requestList.stockID[i4].intvalue > 5 -> bRes3 = true
     :: else -> skip
     fi;
     if
     :: bRes3 -> i5++; i3++
     :: else -> skip
     fi;
     i4++
:: else -> break
od;
59$request//stockID = $register//stockID/[int()>5]/[position()=last()]
i1 = 0;
do
:: i1 < v_register.requestList.stockID_occ ->
     bRes1 = false;
     if
     :: v_register.requestList.stockID[i1].intvalue > 5 -> bRes1 = true
     :: else -> skip
     fi;
     if
     :: bRes1 ->
          bRes2 = false;
          if
          :: (i2 == i3) -> bRes2 = true
          :: else -> skip
          fi;
          if
          :: bRes2 ->
               if
               :: (v_request.stockID.intvalue == v_register.requestList.stockID[i1].intvalue) -> bResult = true
               :: else -> skip
               fi
          :: else -> skip
          fi;
          i2++
     :: else -> skip
     fi;
     i1++
:: else -> break
od;
60Model Checking Using Promela
- Found subtle errors in an example
- SAS: Stock Analysis Service [Fu, Bultan, Su ISSTA'04]
  - 3 peers: Investor, Broker, ResearchDept.
  - Investor → Broker: a registerList of stockIDs
  - Broker → ResearchDept.
    - relay request (1 stockID per request)
    - find the stockID in the latest request, send its subsequent stockID in registerList
- Repeating stockID will cause error.
  - Only discoverable by analysis of XPath expressions
61Related Work
- Conversation specification
  - IBM Conversation support project http://www.research.ibm.com/convsupport/
  - Conversation support for business process integration [Hanson, Nandi, Kumaran EDOC'02]
  - Orchestrating computations on the world-wide web [Choi, Garg, Rai, Misra, Vin EuroPar'02]
- Realizability problem
  - Realizability of Message Sequence Charts (MSC) [Alur, Etessami, Yannakakis ICSE'00, ICALP'01]
62Related Work
- Verification of web services
  - Simulation, verification, composition of web services using a Petri net model [Narayanan, McIlraith WWW'02]
  - BPEL verification using a process algebra model and Concurrency Workbench [Koshkina, van Breugel TAV-WEB'03]
  - Using MSC to model BPEL web services which are translated to labeled transition systems and verified using model checking [Foster, Uchitel, Magee, Kramer ASE'03]
  - Model checking Web Service Flow Language specifications using SPIN [Nakajima ICWE'04]
63Current and Future Work
- Extending the source and target languages
- Symbolic analysis
  - [Fu, Bultan, Su ICWS'04, JWSR]
- Abstraction
- Design for verification for web services
  - [Betin-Can, Bultan WWW'05, ICWS'05]
64Current and Future Work
[Figure: planned WSAT architecture, organized as front end, analysis on an intermediate representation, and back end.
- Web service specification languages (front end): BPEL, DAML-S, WS-CDL, and others through a translator for bottom-up specifications; conversation protocols and others through a translator for top-down specifications.
- Intermediate representation: guarded automata (bottom-up) or a single guarded automaton (top-down).
- Analysis: synchronizability analysis, realizability analysis, and automated abstraction; each analysis reports success or fail, and it can be skipped.
- Back end translations: with synchronous communication, with bounded queue, and with a single process and no communication.
- Verification languages: Promela, SMV, Action Language, and others.]
65THE END