Title: Web Services and Data Integration
1. Web Services and Data Integration
- Zachary G. Ives
- University of Pennsylvania
- CIS 455 / 555 Internet and Web Systems
- May 13, 2018
Some slides by Berthier Ribeiro-Neto
2. Reminders & Announcements
- Assignment 3 now officially released
- Midterm next Wednesday, 3/31
3. How Do We Declare Functions?
- WSDL is the interface definition language for web services
  - Defines notions of protocol bindings, ports, and services
  - Generally describes data types using XML Schema
- In CORBA, this was called an IDL
- In Java, the interface uses the same language as the Java code
4. A WSDL Service
[Diagram: a Service contains several Ports; each Port ties a PortType, which groups Operations, to a Binding]
5. Web Service Terminology
- Service: the entire Web Service
- Port: maps a set of port types to a transport binding (a protocol, frequently SOAP, COM, CORBA, ...)
- Port Type: abstract grouping of operations, i.e., a class
- Operation: the type of operation (request/response, one-way)
  - Input message and output message; maybe also fault message
- Types: the XML Schema type definitions
6. Example WSDL
  <service name="POService">
    <port binding="my:POBinding">
      <soap:address location="http://yyy:9000/POSvc"/>
    </port>
  </service>
  <binding xmlns:my="..." name="POBinding">
    <soap:binding style="rpc" transport="http://www.w3.org/2001/..." />
    <operation name="POrder">
      <soap:operation soapAction="POService/POBinding" style="rpc" />
      <input name="POrder">
        <soap:body use="literal" namespace="POService" />
      </input>
      <output name="POrderResult">
        <soap:body use="literal" namespace="POService" />
      </output>
    </operation>
  </binding>
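To connect the terminology to the example, here is a minimal Python sketch that walks a WSDL fragment shaped like the one above and lists each service's endpoint and each binding's operations. The namespace URIs are the usual WSDL 1.1 ones, which the slide does not show; the port name, portType name, `my` namespace, and transport URI are placeholders; and `service_endpoints` and `list_operations` are hypothetical helpers, not part of any WSDL toolkit.

```python
import xml.etree.ElementTree as ET

# Usual WSDL 1.1 namespace URIs (assumed; the slide does not show them).
WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
SOAP_NS = "http://schemas.xmlsoap.org/wsdl/soap/"
NS = {"w": WSDL_NS, "soap": SOAP_NS}

# Port name, portType name, "my" namespace, and transport URI are placeholders.
WSDL_FRAGMENT = f"""
<definitions xmlns="{WSDL_NS}" xmlns:soap="{SOAP_NS}" xmlns:my="urn:example:po">
  <service name="POService">
    <port name="POPort" binding="my:POBinding">
      <soap:address location="http://yyy:9000/POSvc"/>
    </port>
  </service>
  <binding name="POBinding" type="my:POPortType">
    <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
    <operation name="POrder">
      <soap:operation soapAction="POService/POBinding" style="rpc"/>
      <input name="POrder"><soap:body use="literal" namespace="POService"/></input>
      <output name="POrderResult"><soap:body use="literal" namespace="POService"/></output>
    </operation>
  </binding>
</definitions>
"""


def service_endpoints(wsdl_text):
    """Return (service name, binding reference, address) for every port."""
    root = ET.fromstring(wsdl_text.strip())
    endpoints = []
    for service in root.findall("w:service", NS):
        for port in service.findall("w:port", NS):
            address = port.find("soap:address", NS)
            endpoints.append(
                (service.get("name"), port.get("binding"), address.get("location"))
            )
    return endpoints


def list_operations(wsdl_text):
    """Map each binding name to its (operation, input, output) names."""
    root = ET.fromstring(wsdl_text.strip())
    result = {}
    for binding in root.findall("w:binding", NS):
        operations = []
        for op in binding.findall("w:operation", NS):
            operations.append(
                (op.get("name"),
                 op.find("w:input", NS).get("name"),
                 op.find("w:output", NS).get("name"))
            )
        result[binding.get("name")] = operations
    return result


print(service_endpoints(WSDL_FRAGMENT))
print(list_operations(WSDL_FRAGMENT))
```

The two functions follow the split in the terminology: the service and port say where to call, and the binding says which operations exist and how their messages are encoded.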
7. JAX-RPC: Java and Web Services
- To write a JAX-RPC web service "endpoint", you need two parts:
  - An endpoint interface: this is basically like the IDL statement
  - An implementation class: your actual code

  public interface BookQuote extends java.rmi.Remote {
      public float getBookPrice(String isbn) throws java.rmi.RemoteException;
  }

  public class BookQuote_Impl_1 implements BookQuote {
      public float getBookPrice(String isbn) { return 3.22f; }
  }
8. Different Options for Calling
- The conventional approach is to generate a stub, as in the RPC model described earlier
- You can also dynamically generate the call to the remote interface, e.g., by looking up an interesting function to call
- Finally, the DII (Dynamic Invocation Interface) method allows you to assemble the SOAP call on your own
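As a rough illustration of the last option, assembling the SOAP call yourself, the following Python sketch builds an RPC-style SOAP 1.1 request envelope for a `getBookPrice` call. It only constructs the message; nothing is sent over the network. The service namespace `urn:example:bookquote`, the ISBN value, and the helper `build_soap_request` are made up for illustration and are not part of JAX-RPC.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap_request(service_ns, operation, params):
    """Assemble an RPC-style SOAP 1.1 request envelope and return it as bytes."""
    ET.register_namespace("soapenv", SOAP_ENV_NS)
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    # In RPC style, the body holds one element named after the operation,
    # with one child element per parameter.
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


# Hypothetical namespace and ISBN, for illustration only.
request = build_soap_request(
    "urn:example:bookquote", "getBookPrice", {"isbn": "1234567890"}
)
print(request.decode("utf-8"))
```

A generated stub hides exactly this step: it builds the envelope from the method name and arguments, posts it to the endpoint, and unpacks the response.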
9. Creating a Java Web Service
- A compiler called wscompile is used to generate your WSDL file and stubs
- You need to start with a configuration file that says something about the service you're building and the interfaces that you're converting into Web Services
10. Example Configuration File
  <?xml version="1.0" encoding="UTF-8"?>
  <configuration xmlns="http://java.sun.com/xml/ns/jax-rpc/ri/config">
    <service name="StockQuote"
             targetNamespace="http://example.com/stockquote.wsdl"
             typeNamespace="http://example.com/stockquote/types"
             packageName="stockqt">
      <interface name="stockqt.StockQuoteProvider"
                 servantName="stockqt.StockQuoteServiceImpl"/>
    </service>
  </configuration>
11. Starting a WAR
- The Web Service version of a Java JAR file is a Web Archive, WAR
- There's a tool called wsdeploy that generates WAR files
- Generally this will automatically be called from a build tool such as Ant
- Finally, you may need to add the WAR file to the appropriate location in Apache Tomcat (or WebSphere, etc.) and enable it
- See http://java.sun.com/developer/technicalArticles/WebServices/WSPack2/jaxrpc.html for a detailed example
12. Finding a Web Service
- UDDI: Universal Description, Discovery, and Integration registry
- Think of it as DNS for web services
- It's a replicated database, hosted by IBM, HP, SAP, MS
- UDDI takes SOAP requests to add and query web service interface data
13. What's in UDDI
- White pages:
  - Information about business names, contact info, Web site name, etc.
- Yellow pages:
  - Types of businesses, locations, products
  - Includes predefined taxonomies for location, industry, etc.
- Green pages (what we probably care the most about):
  - How to interact with business services; business process definitions; etc.
  - Pointer to WSDL file(s)
  - Unique ID for each service
14. Data Types in UDDI
- businessEntity: top-level structure describing info about the business
- businessService: name and description of a service
- bindingTemplate: how to access the service
- tModel (t = type/technical): unique identifier for each service-template specification
- publisherAssertion: describes relationship between businessEntities (e.g., department, division)
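15. Relationships between UDDI Structures
[Diagram: cardinalities between the UDDI structures]
- publisherAssertion (n) relates businessEntity (2)
- businessEntity (1) contains businessService (n)
- businessService (1) contains bindingTemplate (n)
- bindingTemplate (m) references tModel (n)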
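One way to picture how these structures reference each other is a small Python sketch using dataclasses. The containment and reference directions follow my reading of the UDDI data model (an entity holds services, a service holds binding templates, a binding template points at tModels, and an assertion links two entities). All field names, keys, and sample values are simplified and hypothetical, and `find_access_points` is an illustrative helper, not a UDDI API call.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TModel:
    key: str   # unique identifier for a service-type specification
    name: str


@dataclass
class BindingTemplate:
    access_point: str                                      # how to reach the service
    tmodel_keys: List[str] = field(default_factory=list)  # many-to-many with tModels


@dataclass
class BusinessService:
    name: str
    description: str
    bindings: List[BindingTemplate] = field(default_factory=list)


@dataclass
class BusinessEntity:
    business_key: str
    name: str
    services: List[BusinessService] = field(default_factory=list)


@dataclass
class PublisherAssertion:
    from_key: str      # businessKey of one entity
    to_key: str        # businessKey of the other entity
    relationship: str  # e.g. "division"


def find_access_points(entity, tmodel_key):
    """Return access points of every binding that implements the given tModel."""
    return [
        binding.access_point
        for service in entity.services
        for binding in service.bindings
        if tmodel_key in binding.tmodel_keys
    ]


# Hypothetical sample data.
po_spec = TModel(key="uuid:example-po", name="Purchase order interface")
my_books = BusinessEntity(
    business_key="0123",
    name="My Books",
    services=[
        BusinessService(
            name="POService",
            description="Accepts purchase orders",
            bindings=[BindingTemplate("http://yyy:9000/POSvc", [po_spec.key])],
        )
    ],
)
print(find_access_points(my_books, po_spec.key))
```

The lookup shows why tModels matter: a client that knows which interface specification it wants can find every endpoint that claims to implement it.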
16. Example UDDI businessEntity
  <businessEntity businessKey="0123" xmlns="urn:uddi-org:api_v2">
    <discoveryURLs>
      <discoveryURL useType="businessEntity">
        http://uddi.ibm.com/registery/uddiget?businessKey=0123...
      </discoveryURL>
    </discoveryURLs>
    <name>My Books</name>
    <description>Technical Book Wholesaler</description>
    ...
    <businessServices>
      ...
    </businessServices>
    <identifierBag>
      <!-- keyedReferences to tModels -->
    </identifierBag>
    <categoryBag> ... </categoryBag>
  </businessEntity>
17. UDDI in Perspective
- Original idea was that it would just organize itself in a way that people could find anything they wanted
- Today UDDI is basically a very simple catalog of services, which can be queried with standard APIs
- It's not clear that it really does what people want: they want to find services "like Y" or "that do Z"
18. The Problem
- There's no universal, unambiguous way of describing "what I mean"
- Relational database idea of "normalization" doesn't convert concepts into some normal form; it just helps us cluster our concepts in meaningful ways
- "Knowledge representation" tries to encode definitions clearly, but even then, much is up to interpretation
- The best we can do: describe how things relate
19. This Brings Us to XQuery, Whose Main Role Is to Relate XML
- Suppose we define an XML schema for our target data and our source data
- XQuery allows us to define mappings from input XPath matches to output trees
- Can directly translate between XML schemas or structures
  - Describes a relationship between two items
  - Transform 2 into 6 by "add 4" operation
  - Convert from S1 to S2 by applying the query described by view V
- Often, we don't need to transfer all data; instead, we want to use the data at one source to help answer a query over another source
20. Let's Look at Some Simple Mappings
- Beginning with examples of using XQuery to convert from one schema to another, e.g., to import data
- First let's review what our XQuery mappings need to accomplish
21. Challenges of Mapping Schemas
- In a perfect world, it would be easy to match up items from one schema with another
  - Each element would have a simple correspondence to an element in the other schema
  - Every value would clearly map to a value in the other schema
- Real world: as with human languages, things don't map clearly!
  - Different decompositions into elements
  - Different structures
  - Tag name vs. value
  - Values may not exactly correspond
  - It may be unclear whether a value is the same
- It's a tough job, but often things can be mapped
22. Example Schemas
- Bob's Movie Database
  <movie> <title></title> <year></year> <director></director> <editor></editor> <star></star> </movie>
- Mary's Art List
  <workOfArt> <id></id> <type></type> <artist></artist> <subject></subject> <title></title> </workOfArt>
Want to map data from one schema to the other
23. Mapping Bob's Movies → Mary's Art
- Start with the schema of the output as a template:
  <workOfArt> <id>$i</id> <type>$y</type> <artist>$a</artist> <subject>$s</subject> <title>$t</title> </workOfArt>
- Then figure out where to find the values in the source, and create XPaths
24. The Final Schema Mapping
- Mary's Art ← Bob's Movies
  for $m in doc("movie.xml")//movie,
      $a in $m/director/text(),
      $i in $m/title/text(),
      $t in $m/title/text()
  return <workOfArt>
           <id>{ $i }</id> <type>movie</type>
           <artist>{ $a }</artist> <title>{ $t }</title>
         </workOfArt>
Note the absence of subject. We had no reasonable source, so we are leaving it out.
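For readers without an XQuery engine at hand, here is a rough Python equivalent of this mapping using `xml.etree.ElementTree`. It is a sketch, not the course's implementation: the sample movie data, the `artList` wrapper element, and the function name `movies_to_art` are made up for illustration. As in the query, the director becomes the artist, the title serves as both id and title, the type is fixed to movie, and subject is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data in Bob's schema.
MOVIES_XML = """
<db>
  <movie><title>Example Film</title><year>1999</year>
         <director>A. Director</director><editor>E. Editor</editor>
         <star>S. Star</star></movie>
  <movie><title>No Director Listed</title><year>2001</year></movie>
</db>
"""


def movies_to_art(movies_xml):
    """Translate <movie> elements into <workOfArt> elements (Mary's schema)."""
    source = ET.fromstring(movies_xml.strip())
    target = ET.Element("artList")  # wrapper element, assumed for this sketch
    for movie in source.iter("movie"):
        titles = [t.text for t in movie.findall("title") if t.text]
        directors = [d.text for d in movie.findall("director") if d.text]
        # Like the nested "for" bindings: one output per combination of values,
        # so a movie with no director produces no workOfArt at all.
        for artist in directors:
            for ident in titles:
                for title in titles:
                    work = ET.SubElement(target, "workOfArt")
                    ET.SubElement(work, "id").text = ident
                    ET.SubElement(work, "type").text = "movie"
                    ET.SubElement(work, "artist").text = artist
                    ET.SubElement(work, "title").text = title
    return target


art = movies_to_art(MOVIES_XML)
print(ET.tostring(art, encoding="unicode"))
```

The second sample movie shows a side effect of binding every value in the for clause: items missing any bound field drop out of the result silently.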
25. Mapping Values
- Sometimes two schemas use different representations for the same thing
  - ID ↔ SSN
  - English ↔ Hungarian
- We typically use an intermediate table defining correspondences: a "concordance table"
  - It can be generated automatically, and then corrected by hand (since there will often be exceptions)
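The "generate automatically, then correct by hand" workflow might look like the following Python sketch. The matching rule (compare values after lowercasing and dropping punctuation), the sample course codes, and the function `build_concordance` are all assumptions chosen for illustration; real concordance tables often need far more sophisticated matching.

```python
def normalize(value):
    """Crude normalization: lowercase and keep only letters and digits."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def build_concordance(source_values, target_values, corrections):
    """Auto-match values whose normalized forms agree, then apply hand fixes."""
    by_norm = {normalize(t): t for t in target_values}
    table = {}
    for s in source_values:
        match = by_norm.get(normalize(s))
        if match is not None:
            table[s] = match
    # Manual corrections override or extend the automatic matches.
    table.update(corrections)
    return table


# Hypothetical values from two schemas that name the same courses differently.
source = ["cse330", "cis455", "math240"]
target = ["CSE-330", "CIS-555", "MATH-240"]
# An exception the automatic step cannot find (assumed cross-listing).
manual = {"cis455": "CIS-555"}

concordance = build_concordance(source, target, manual)
print(concordance)
```

Keeping the hand corrections in a separate input means the automatic step can be rerun on new data without losing the exceptions.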
26. An Example Value Mapping Problem
- Penn student enrollment DB
  <student><pennid>12346</pennid> <name>Mary McDonald</name> <taking><sem>F03</sem> <class>cse330</class></taking> </student>
  <student><pennid>12345</pennid> <name>Jon Doh</name> </student>
- Penn dental plan
  <patient><ssn>323-468-1212</ssn> <treatment>Dental sealant</treatment> </patient>
- Want to output student names and treatments
27. Translating Values with a Concordance Table
  return <student> <name> { $n } </name> <treatment> { $tr } </treatment> </student>
28. Translating Values with a Concordance Table
Variables: $pid = PennID, $n = name
student.xml:
  <student><pennid>12346</pennid> <name>Mary McDonald</name> <taking><sem>F03</sem> <class>cse330</class></taking> </student>
Query:
  for $p in doc("student.xml")/db/student,
      $pid in $p/pennid/text(),
      $n in $p/name/text(),
      $m in doc("concord.xml")/db/mapping,
      $f in $m/from/text(),
      $t in $m/to/text(),
      $d in doc("dental.xml")/db/patient,
      $s in $d/ssn/text(),
      $tr in $d/treatment/text()
  where ____________________
  return <student> <name> { $n } </name> <treatment> { $tr } </treatment> </student>
29. Translating Values with a Concordance Table
Variables: $pid = PennID, $n = name, $s = ssn, $tr = treatment
student.xml:
  <student><pennid>12346</pennid> <name>Mary McDonald</name> <taking><sem>F03</sem> <class>cse330</class></taking> </student>
dental.xml:
  <patient><ssn>323-468-1212</ssn> <treatment>Dental sealant</treatment> </patient>
Query:
  for $p in doc("student.xml")/db/student,
      $pid in $p/pennid/text(),
      $n in $p/name/text(),
      $d in doc("dental.xml")/db/patient,
      $s in $d/ssn/text(),
      $tr in $d/treatment/text(),
      $m in doc("concord.xml")/db/mapping,
      $f in $m/from/text(),
      $t in $m/to/text()
  where ____________________
  return <student> <name> { $n } </name> <treatment> { $tr } </treatment> </student>
30. Translating Values with a Concordance Table
Variables: $pid = PennID, $n = name, $s = ssn, $tr = treatment, $f = PennID, $t = ssn
student.xml:
  <student><pennid>12346</pennid> <name>Mary McDonald</name> <taking><sem>F03</sem> <class>cse330</class></taking> </student>
dental.xml:
  <patient><ssn>323-468-1212</ssn> <treatment>Dental sealant</treatment> </patient>
concord.xml:
  <mapping> <from>12346</from> <to>323-468-1212</to> </mapping>
Query:
  for $p in doc("student.xml")/db/student,
      $pid in $p/pennid/text(),
      $n in $p/name/text(),
      $d in doc("dental.xml")/db/patient,
      $s in $d/ssn/text(),
      $tr in $d/treatment/text(),
      $m in doc("concord.xml")/db/mapping,
      $f in $m/from/text(),
      $t in $m/to/text()
  where ____________________
  return <student> <name> { $n } </name> <treatment> { $tr } </treatment> </student>
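The where clause is left blank in the slides. One plausible reading, given the variable annotations, is that it joins the student's PennID to the concordance entry's from value and the entry's to value to the patient's SSN. The Python sketch below works under that assumption, using the sample documents from the slide wrapped in a `<db>` root; the wrapping and the function name `student_treatments` are assumptions as well.

```python
import xml.etree.ElementTree as ET

STUDENT_XML = """<db>
  <student><pennid>12346</pennid><name>Mary McDonald</name>
    <taking><sem>F03</sem><class>cse330</class></taking></student>
  <student><pennid>12345</pennid><name>Jon Doh</name></student>
</db>"""

DENTAL_XML = """<db>
  <patient><ssn>323-468-1212</ssn><treatment>Dental sealant</treatment></patient>
</db>"""

CONCORD_XML = """<db>
  <mapping><from>12346</from><to>323-468-1212</to></mapping>
</db>"""


def student_treatments(student_xml, dental_xml, concord_xml):
    """Join students to treatments through the concordance table."""
    students = ET.fromstring(student_xml)
    patients = ET.fromstring(dental_xml)
    concord = ET.fromstring(concord_xml)

    # Concordance table as a lookup: PennID -> SSN.
    # Assumes each PennID appears in at most one mapping.
    pennid_to_ssn = {
        m.findtext("from"): m.findtext("to") for m in concord.findall("mapping")
    }

    # Treatments grouped by SSN; a patient may have several.
    treatments_by_ssn = {}
    for patient in patients.findall("patient"):
        ssn = patient.findtext("ssn")
        for treatment in patient.findall("treatment"):
            treatments_by_ssn.setdefault(ssn, []).append(treatment.text)

    results = []
    for student in students.findall("student"):
        ssn = pennid_to_ssn.get(student.findtext("pennid"))
        for treatment in treatments_by_ssn.get(ssn, []):
            results.append((student.findtext("name"), treatment))
    return results


print(student_treatments(STUDENT_XML, DENTAL_XML, CONCORD_XML))
```

Jon Doh has no concordance entry, so he produces no output row, which matches the inner-join behavior of the for clauses in the query.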
32. Summary: Mapping, Integrating, and Sharing Data
- Mappings based on XQuery rather than XSLT
- Can do point-to-point mappings to exchange data
- UDDI versus this approach?
- What about search and its relationship to integration? In particular, search over Amazon, Google Maps, Google, Yahoo, ...