Title: State of the Art in Web Services
1. State of the Art in Web Services and Commercial Databases
- Outline
  - Web Services Technologies
    - JDBC
    - XML, SVG, PNG
    - XMLP, SOAP, WSDL, UDDI
  - Commercial Databases
    - IBM DB2
    - Oracle 9i
    - Microsoft SQL Server 2000
  - Information Integration
    - Access to web data sources
2. Data Management for GIS
- Database Implementation (e.g. IBM GFIS)
GIS Information
How does it look? (graphical data)
Where is it? (coordinates)
What is it? (attribute data)
Spatial Database
3. Why use a database for spatial data?
- Better data management for spatial data. Users gain access to
  full-function spatial information systems based on industry
  standards with an open interface to their data (e.g. SQL).
- Spatial data is now stored in an enterprise-wide database, which
  makes it easier to spatially enable many more applications.
- Reduced complexity of systems management by eliminating the hybrid
  or file-based architectures of traditional GIS-based data
  management schemes.
4. JDBC (Java Database Connectivity)
- Why use JDBC?
  - Establish a connection with a database or access any tabular
    data source.
  - Send SQL statements.
  - Process the results.
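JDBC itself is a Java API, so the following is only an analogy: a minimal sketch of the same three steps (connect, send SQL, process results) using Python's DB-API with an in-memory SQLite database as a stand-in. The `borehole` table, its columns, and the function name are hypothetical.

```python
import sqlite3


def query_boreholes(min_depth_m):
    """Return (id, depth) rows at least min_depth_m deep (hypothetical table)."""
    # Step 1: establish a connection. In-memory SQLite stands in for a server.
    conn = sqlite3.connect(":memory:")
    try:
        # Create a small made-up table so the sketch is self-contained.
        conn.execute("CREATE TABLE borehole (id TEXT, depth_m REAL)")
        conn.executemany(
            "INSERT INTO borehole VALUES (?, ?)",
            [("B-1", 12.5), ("B-2", 30.0), ("B-3", 45.2)],
        )
        # Step 2: send an SQL statement, with the value passed as a parameter.
        cursor = conn.execute(
            "SELECT id, depth_m FROM borehole WHERE depth_m >= ? ORDER BY id",
            (min_depth_m,),
        )
        # Step 3: process the results row by row.
        return [(row[0], row[1]) for row in cursor]
    finally:
        conn.close()
```

In JDBC the corresponding objects would be a connection, a statement, and a result set; only the pattern carries over, not the names used here.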
5. JDBC Architecture
- The JDBC API contains two major sets of interfaces: the first is
  the JDBC API for application writers, and the second is the
  lower-level JDBC driver API for driver writers. JDBC technology
  drivers fit into one of four categories. Applications and applets
  can access databases via the JDBC API using pure Java JDBC
  technology-based drivers, ODBC drivers, and existing database
  client libraries, as shown in the following figures.
6. XML (eXtensible Markup Language)
- DTD (Document Type Definition)
  - A DTD defines a class of XML documents using a language that is
    essentially a context-free grammar with several restrictions.
- RDF (Resource Description Framework)
  - It provides a general method to describe metadata for XML
    documents.
  - It describes resources, which are objects identified using
    Uniform Resource Identifiers (URIs).
- XSL (eXtensible Stylesheet Language)
  - It is a language for transforming and formatting XML.
7. XML (cont.)
- XPath
  - XPath defines the syntax and semantics of path expressions such
    as the following, which matches the last report child (in
    document order) of the weather descendants of the node with
    unique identifier favorites:
      id("favorites")/descendant::weather/child::report[position()=last()]
- XPointer
  - The XML Pointer Language (XPointer) extends XPath to support
    applications by adding two new axes to specify basis steps in
    XPath.
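A rough sketch of the path expression above, using the limited XPath subset in Python's `xml.etree.ElementTree`. The sample document is made up, and because ElementTree has no `id()` function, the lookup is approximated by matching an `id` attribute; this is an assumption, not how a DTD-declared ID is resolved.

```python
import xml.etree.ElementTree as ET

# Made-up sample document for illustration only.
SAMPLE = """
<root>
  <list id="favorites">
    <weather city="LA"><report>sunny</report><report>cloudy</report></weather>
    <weather city="SF"><report>fog</report></weather>
  </list>
  <list id="other">
    <weather city="NY"><report>rain</report></weather>
  </list>
</root>
"""


def last_reports(xml_text):
    """Text of the last report child of every weather descendant of 'favorites'."""
    root = ET.fromstring(xml_text)
    # Approximates id("favorites") by an attribute test.
    favorites = root.find(".//*[@id='favorites']")
    if favorites is None:
        return []
    # descendant::weather/child::report[position()=last()]
    return [r.text for r in favorites.findall(".//weather/report[last()]")]


print(last_reports(SAMPLE))
```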
8. Querying XML Data
- Requirements for a query language for XML
  - Precise Semantics: An XML query language should have a formal
    semantics.
  - Rewritability, Optimizability: XML data will often be generated
    automatically from other formats (relational, object-oriented,
    special-purpose formats), so XML queries should be rewritable
    into queries over those sources and amenable to optimization.
  - XML Output: An XML query should yield XML output.
  - Compositional Semantics: Expressions in the XML query language
    should have referential transparency.
  - No Schema Required: An XML query language should be usable on
    XML data when there is no schema (DTD) known in advance.
9. An example: XML-QL
- Selection and Extraction: Selection in XML-QL is done with patterns
  and conditions. The example below selects all books published by
  Addison-Wesley after 1991.
    WHERE <bib> <book year=$y>
            <publisher><name>Addison-Wesley</name></publisher>
            <title> $t </title>
            <author> $a </author>
          </book> </bib> IN "www.a.b.c/bib.xml", $y > 1991
    CONSTRUCT $a
- Reduction and Restructuring: The following query retrieves the same
  data as above but groups the results differently.
    WHERE <bib> <book year=$y>
            <publisher> <name>Addison-Wesley </> </>
            <title> $t </>
            <author> $a </>
          </> </> IN "www.a.b.c/bib.xml", $y > 1991
    CONSTRUCT <result> <author> $a </>
                       <title> $t </> </>
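To make the pattern-and-condition idea concrete, here is a minimal Python sketch that evaluates the second query by hand with `xml.etree.ElementTree`. The bibliography data is invented, and the enclosing `results` element is an assumption added so the output is a single well-formed document; XML-QL itself is not being executed.

```python
import xml.etree.ElementTree as ET

# Invented bibliography standing in for "www.a.b.c/bib.xml".
BIB = """
<bib>
  <book year="1995">
    <publisher><name>Addison-Wesley</name></publisher>
    <title>Book A</title>
    <author>Author One</author>
    <author>Author Two</author>
  </book>
  <book year="1990">
    <publisher><name>Addison-Wesley</name></publisher>
    <title>Book B</title>
    <author>Author Three</author>
  </book>
  <book year="1999">
    <publisher><name>Other Press</name></publisher>
    <title>Book C</title>
    <author>Author Four</author>
  </book>
</bib>
"""


def select_results(bib_xml, publisher="Addison-Wesley", after_year=1991):
    """Build one <result> per (author, title) binding that satisfies the query."""
    bib = ET.fromstring(bib_xml)
    results = ET.Element("results")
    for book in bib.findall("book"):
        # Pattern: <publisher><name>Addison-Wesley</name></publisher>
        if book.findtext("publisher/name", "").strip() != publisher:
            continue
        # Condition: $y > 1991
        if int(book.get("year", "0")) <= after_year:
            continue
        title = book.findtext("title", "").strip()
        # CONSTRUCT <result><author>$a</><title>$t</></>
        for author in book.findall("author"):
            result = ET.SubElement(results, "result")
            ET.SubElement(result, "author").text = (author.text or "").strip()
            ET.SubElement(result, "title").text = title
    return results


print(ET.tostring(select_results(BIB), encoding="unicode"))
```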
10. Web Services
The World Wide Web is increasingly used for
application-to-application communication. The
programmatic interfaces made available are
referred to as Web services.
11. Web Services (cont.)
- A Web Service: a service accessible via XML messages over Internet
  protocols.
- SOAP (Simple Object Access Protocol): a lightweight protocol to
  exchange information in a distributed environment, based on XML.
- WSDL (Web Services Description Language): a language to describe
  the functions provided by a web service.
- UDDI (Universal Description, Discovery and Integration): a meta
  service to locate web services (www.uddi.org).
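As an illustration of "XML messages", the sketch below builds a SOAP 1.1 request envelope with Python's standard library. The service namespace `http://example.org/borehole`, the `GetBorehole` operation, and its `id` parameter are hypothetical; only the envelope namespace comes from SOAP 1.1.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/borehole"  # hypothetical service namespace


def build_request(operation, params):
    """Serialize a SOAP envelope whose body calls `operation` with `params`."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The operation element and its children live in the service's namespace.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


print(build_request("GetBorehole", {"id": "B-1"}))
```

In practice the operation names and message shapes would be read from the service's WSDL rather than hard-coded.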
12. SVG (Scalable Vector Graphics)
- What is SVG?
  - SVG is a language for describing two-dimensional graphics in
    XML.
- Types of graphic objects
  - Vector graphic shapes, images, and text.
- Its relationship with our project
  - We can use SVG to describe the graphs in civil engineering XML
    documents.
13. PNG (Portable Network Graphics)
- What is PNG?
  - An extensible file format for the lossless, portable,
    well-compressed storage of raster images.
- Its relationship with our project
  - We can use it to store the images of boreholes and other related
    geotechnical data in our documents.
14. IBM DB2 Universal Database
- Platform
  - Windows NT, 2000, IBM AIX
- Embedding ESRI
  - IBM's collaborative relationship with ESRI, which includes
    embedding ESRI technology within DB2 Spatial Extender, is one of
    its GIS solutions.
- IBM DB2 Spatial Extender
  - DB2 Spatial Extender manages the updating, structuring and
    insertion of geospatial data and enables development of the
    spatial database schema.
15. San Francisco maps city services with IBM DB2 and
DB2 Spatial Extender
16. Oracle 9i Database: Spatial
- Oracle Spatial
  - Address-based data
    - Customer Lists
    - Locations of Assets
  - Map data
    - Roads
    - Parcels
  - Remotely Sensed data
    - Satellite Imagery
    - GPS
    - Aerial Photography
17. Oracle 9i Database (cont.)
Oracle9i applies spatial indexes to any data in
relational databases. Oracle Locator includes
R-tree indexing in addition to quadtree indexing
capability. R-tree indexes can be used in place
of quadtree indexes, or in conjunction with them.
In addition, R-tree indexing can be used for
3D and 4D indexing of data critical to solving
problems in oil exploration, architecture,
engineering, and many other scientific
applications.
Queries can be spatially constrained, as defined
by an "area of interest" chosen by the user.
Eliminating data outside the area of interest
from consideration during queries improves
query performance.
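The area-of-interest idea can be sketched in a few lines of Python. This is a plain linear scan over made-up point features, not Oracle's implementation; the `Box` type, the feature dictionaries, and the function name are assumptions for illustration. A spatial index such as an R-tree would avoid visiting out-of-area features at all.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle used as the area of interest."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def contains(self, x, y):
        return self.min_x <= x <= self.max_x and self.min_y <= y <= self.max_y


def query_in_area(features, area, predicate=lambda f: True):
    """Evaluate the attribute predicate only on features inside the area."""
    return [
        f for f in features
        # The cheap spatial test runs first; the predicate is skipped
        # for every feature that falls outside the area of interest.
        if area.contains(f["x"], f["y"]) and predicate(f)
    ]
```

For example, `query_in_area(features, Box(0, 0, 3, 3), lambda f: f["kind"] == "road")` would return only the roads located inside that rectangle.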
18. Microsoft SQL Server 2000
- Geographic data: MapPoint works against any ODBC-compliant data
  source, including MS Excel, MS Access, and MS SQL Server 2000.
- Ease of use: A simple user interface (UI) and straightforward
  documentation help ensure that everyone in your organization can
  use MapPoint. Users can be productive quickly, without the downtime
  or cost typically associated with training them in new business
  intelligence tools.
- Integration with Office: As part of the Office product family,
  MapPoint is well integrated with the other Office applications.
  This gives MapPoint a strong advantage over competing products with
  regard to users' personal productivity.
19. Information Integration
Research by Craig Knoblock, Cyrus Shahabi, Steve
Benton, José Luis Ambite (USC/Information Sciences
Institute)
20. Information Integration: Single Interface to
Multiple Sources
21. Motivation: TheaterLoc Entertainment Agent
[Figure: an agent connected to Hollywood.com Trailers, Tiger Map
Server, Etak Geocoder, Zagat, CuisineNet, and Yahoo Movies]
22. TheaterLoc
23. Information Integration
- The problem of providing
  - uniform (sources transparent to user)
  - access to (query, and eventually updates too)
  - multiple (even 2 is a problem!)
  - autonomous (without affecting the behavior of sources)
  - heterogeneous (different data models, schemas)
  - structured (at least semistructured)
  - data sources (not only databases)
24. Principal Dimensions of Information Integration
- Virtual vs. materialized architecture
- Access: query only, or query and update?
- Mediated schema
  - A mediated schema requires schema integration and then query
    reformulation. Two main approaches:
    - Global as View
    - Local as View
- Language for descriptions and queries: conjunctive queries (CQs),
  union of CQs, Datalog (recursion), first-order logic, description
  logics
- Types of Sources
  - Structured (DBs) vs. semi-structured (Web)
  - Source capabilities: positive and negative
25. Materialized Architecture: Data Warehouse
26. Virtual Architecture: Mediator
27. Mediator Architecture
- Users pose queries in the global (mediator) schema
- The mediator translates and decomposes each user query
  into multiple source queries
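A toy sketch of the mediator steps above, in the Global-as-View style: the global relation `Restaurant(name, cuisine)` is defined as a union of views over two sources. The source rows, their attribute names, and all function names are invented for illustration; real wrappers would query live sites.

```python
# Invented source contents; each source uses its own attribute names.
ZAGAT_ROWS = [{"restaurant": "Chinois on Main", "food": "Pacific New Wave"}]
CUISINENET_ROWS = [{"name": "California Pizza Kitchen", "cuisine": "Pizza"}]


def zagat_query(cuisine=None):
    """Source query in the Zagat-style schema (restaurant, food)."""
    return [r for r in ZAGAT_ROWS if cuisine is None or r["food"] == cuisine]


def cuisinenet_query(cuisine=None):
    """Source query in the CuisineNet-style schema (name, cuisine)."""
    return [r for r in CUISINENET_ROWS if cuisine is None or r["cuisine"] == cuisine]


# Global as View: each entry pairs a source query with the mapping that
# turns a source row into a row of the global relation.
GAV_VIEWS = [
    (zagat_query,
     lambda r: {"name": r["restaurant"], "cuisine": r["food"], "source": "Zagat"}),
    (cuisinenet_query,
     lambda r: {"name": r["name"], "cuisine": r["cuisine"], "source": "CuisineNet"}),
]


def mediator_query(cuisine=None):
    """Decompose a global query into one query per source and merge the answers."""
    answers = []
    for source_query, to_global in GAV_VIEWS:
        for row in source_query(cuisine):
            answers.append(to_global(row))
    return answers


print(mediator_query("Pizza"))
```

The user only sees `name` and `cuisine`; which source answered, and under what local attribute names, stays hidden behind the views.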
28. Wrapper Building Tools
- Wrappers provide a uniform query language for data access
- Wrapping Web pages by non-experts
  - Demonstration-oriented user interface enables users to show the
    system what to extract by example
  - System automatically induces extraction patterns
  - Simplifies wrapper maintenance
29. Example of Extraction Rule
Muslea et al. 1999
Rule: sequence of landmarks (e.g., Cuisine)
Page:
    <b>Name</b>Chinois on Main<b> Cuisine <p> </b> Pacific New Wave <br>
Start: SkipTo(Cuisine) SkipTo(</b>)
End: SkipTo(<br>)
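The rule above can be exercised with a minimal Python sketch. It treats each landmark as a literal substring and scans forward for the end rule starting from the extracted start position; both are simplifying assumptions, and the function names are not from Muslea et al.

```python
def skip_to(text, position, landmarks):
    """Consume each landmark in order; return the index just past the last one."""
    for landmark in landmarks:
        found = text.find(landmark, position)
        if found == -1:
            return None  # the rule does not match this page
        position = found + len(landmark)
    return position


def extract(page, start_rule, end_rule):
    """Return the text between the start rule and the end rule, or None.

    Both rules are non-empty lists of landmark strings.
    """
    start = skip_to(page, 0, start_rule)
    if start is None:
        return None
    past_end = skip_to(page, start, end_rule)
    if past_end is None:
        return None
    # The item ends where the final end landmark begins.
    end = past_end - len(end_rule[-1])
    return page[start:end].strip()


PAGE = "<b>Name</b>Chinois on Main<b> Cuisine <p> </b> Pacific New Wave <br>"

# Start: SkipTo(Cuisine) SkipTo(</b>)   End: SkipTo(<br>)
print(extract(PAGE, ["Cuisine", "</b>"], ["<br>"]))
```

Skipping to "Cuisine" first is what keeps the second landmark from matching the earlier `</b>` that follows "Name".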
30. Example of Rule Induction
Muslea et al. 1999
Training Examples:
    <p>Cuisine<p><b>Thai</b><p>Review<p> <b> Good <p>Review<p><br><p> <b> Excellent
Candidate rules:
    SkipTo( <b> )  SkipTo(<p> <b>)  ...  SkipTo( )
    SkipTo(<b>)  ...  SkipTo(<p>)SkipTo(<b>)
    SkipTo(Review ) SkipTo( <b> )
    ...
31. Matching Objects Across Sources
- Problem: how to decide that objects in two sources refer to the
  same object in the real world
- Example:
    Zagat     Fodors
    CPK       California Pizza Kitchen
    Ralphs    Ralphs Grill
- Information Retrieval techniques for similarity joins
  (Cohen, SIGMOD-98; Tejada & Knoblock)
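A minimal sketch of a similarity join over the example names, using Jaccard overlap of word tokens. This is a deliberately simple stand-in chosen for brevity, not the weighting used in the cited work; the threshold of 0.5 and the function names are assumptions.

```python
import re


def tokens(name):
    """Lower-cased alphanumeric word tokens of a name."""
    return set(re.findall(r"[a-z0-9]+", name.lower()))


def jaccard(a, b):
    """Shared tokens divided by all distinct tokens of the two names."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def similarity_join(left, right, threshold=0.5):
    """Return (left, right, score) pairs scoring at least `threshold`, best first."""
    pairs = []
    for a in left:
        for b in right:
            score = jaccard(a, b)
            if score >= threshold:
                pairs.append((a, b, score))
    return sorted(pairs, key=lambda p: -p[2])


ZAGAT = ["CPK", "Ralphs"]
FODORS = ["California Pizza Kitchen", "Ralphs Grill"]

print(similarity_join(ZAGAT, FODORS))
```

Token overlap pairs "Ralphs" with "Ralphs Grill" but gives "CPK" and "California Pizza Kitchen" a score of zero, which shows why abbreviations need more than plain string similarity.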