Title: Grid Wrapper for IBM Websphere Information Integrator
1Grid Wrapper for IBM Websphere Information
Integrator
- Patrick Dantressangle 8/2005
- IBM Hursley Emerging Technology Services
- dantress_at_uk.ibm.com
2Agenda
- The Data GRID virtualization of data
- OGSA-DAI and DAIS
- Websphere II GRID
- OGSA-DAI Wrapper features an example of GRID
technology with Websphere II. - Conclusion/questions
3The Grid
Storage
Operating System
I/O
Data
Processing
Applications
Grid Computing
Distributed Computing Over a Network of
Heterogeneous Resources Using Open Standards
4Desired World for Customers
- Customers need access to information from
anyplace, anytime - Based upon Policy-managed quality of service
(QoS) - Which will drive an automatic data placement
- Federation of data stores to utilize existing
resources - Access to all relevant resources structured,
semi-structured, unstructured
Request QoS
Response Delivered QoS
Virtualized Information
Placement Manager
Data Policy
5Virtualization of data on the Grid
WS
J2EE
SQL API
WS
OGSA-DAI
OQL API
OGSA-DAI
Data Virtualization
IBM Websphere information integrator
OGSA DQP? (open source distributed query
processor)
Schema integration Federation
GRID registry
DataSource access Over SOA
Oracle
DB2
MS
MySQL
XML
Global policy
MicrosoftWindows VMWare/MSCS
Linux(Intel) VMWare
File system access (SAN FS)
IBM AIXHACMP
SunSolaris
IBMBladeCenter VMWare/Windows/Linux OPM/FCS
HP/UX
SAN
Global Mngt
Disk Access
TAPE
EMC
XXX
IBM
6OGSA-DAI
- DAIS The GGF Database Access and Integration
Working Group - http//cs.man.ac.uk/grid-db
- OGSA-DAI The UK e-Science funded reference
implementation of DAIS - http//ogsadai.org.uk/
- IBM Emerging Technology Services
- http//tes.hursley.ibm.com/
7OGSA-DAI - Objective
Define open standards and open source based
uniform service interfaces for accessing
heterogeneous data sources within the Open Grid
Services Architecture (OGSA) Since WS-RF,
This is now moving to a full web services
architecture.
8Grid and Web Services Convergence(an aside)
Grid
Web
The definition of WSRF means that Grid and Web
communities can move forward on a common base
9Some IBM Grid Data Activities
GGF - OGSA
OGSA Data Architecture
GGF OGSI Working Group
OASIS WSRF
GGF - DAIS Working Group
OGSA Data Services
UK eScience - OGSA DAI
2005
2002
2003
2004
Today
10OGSA-DAI Project Partners
11GDS Internals
response document
perform document
The Engine
element
element
element
Query Activity
Delivery Activity
Transform Activity
data
data
credentials
data
query
connection
Role Mapper
role
Data Resource Implementation
connection
12OGSA-DAI Supported Data Resources
13Some OGSA-DAI Customers more here
http//www.ogsadai.org.uk/projects/
Bridges (http//www.brc.dcs.gla.ac.uk/projects/bri
dges/)
N2Grid (http//www.cs.univie.ac.at/institute/index
.html?project-8080)
BioSimGrid (http//www.biosimgrid.org/)
AstroGrid (http//www.astrogrid.org/)
BioGrid (http//www.biogrid.jp/)
OGSA-DAI (http//www.ogsadai.org.uk)
GEON (http//www.geongrid.org/)
eDiaMoND (http//www.ediamond.ox.ac.uk/)
OGSA-WebDB (http//www.gtrc.aist.go.jp/dbgrid/)
GeneGrid (http//www.qub.ac.uk/escience/projects.p
hpgenegrid)
FirstDig (http//www.epcc.ed.ac.uk/firstdig/)
myGrid (http//www.mygrid.org.uk/)
IU RGRBench (http//www.cs.indiana.edu/plale/proj
ects/RGR/OGSA-DAI.html)
ODD-Genes (http//www.epcc.ed.ac.uk/oddgenes/)
14Websphere Information Integrator
15 Information Virtualization
Applications
Enterprise Service Bus
Applications
Web Services Requests
SQL, SQL/XML, XPath, XQuery (soon)
Event Consumption
Can mix and match
JDBC, ODBC, XQJ, Web Services, JMS
WebSphere Information Integrator
Can mix and match
Heterogeneous Data Sources
Event Publishing
Web Services
Applications
Enterprise Service Bus
16Heterogeneous Data Sources
DB2 UDB
DB2 UDB on z/OS
Integrated SQL or XML View
Sybase
WebSphere II
VSAM
SQL, SQL/XML
O D B C
Informix
IMS
Federation Engine
SQL Server
Wrappers and functions
Software AG Adabas
Oracle
CA-Datacom
Teradata
CA-IDMS
ODBC
IBM Extended Search
Packaged Apps
Content Management systems
Excel
Text
WebSphere MQ
WWW, email,
XML
Web services
17WebSphere II Web Services Overview
WebSphere II
WebSphere
HTTP/SOAP
DB2 Web Service Provider (WORF)
HTTP/SOAP
HTTP/GET
SQL
SQL Applications
Web Browser
Soap Client
Soap Client
WebSphere II provides Web Services data
WebSphere II applications consume Web Services
data
18Accessing Web Services from DB2 Through User
Defined Functions
- Can integrate SQL statements and Web Service
invocations - Support for generating SQL scalar and table UDFs
(User Defined Functions) based on WSDL (web
service description) - Command line version
- Tool support integrated into Web Sphere Studio
SELECT city, GetTemperature(city) FROM location
19DB2 Tables
Today XML Support in DB2
1. Publish
XML
- 1. Publish
- SQL/XML Functions
- XML Extender Composition
- DB2 Web Services
2. Transform Manipulate
- 2. Transform Manipulate
- Validate (schema and DTD)
- XSL
- Extract (fragments or values through XPath)
- Update
XML
3. Store
- 3. Store
- Intact as CLOB or as XML Extender Type
- Shredded into relational through XML Extender
Decomposition - DB2 Web Services
DB2 Tables
For references and more information see
http//www.ibm.com/developerworks/db2/library/tec
harticle/0212malaika/0212malaika.html
20Integration of XML Relational Capabilities
soon, beta in 2005
- Native XML data type (server client side)
- (not Varchar, not CLOB, not object-relational !)
- XML Capabilities in all DB2 components
- Applications combine XML relational data
SERVER
CLIENT
SQL/XML
DB2 Storage
Relational Interface
DB2 Client / Customer Client Application
DB2 Engine
XQuery
XML Interface
21Websphere Information Integration Wrapping it
up!
- Encapsulate data source information inside a
wrapper - Models data as tables
- Expose unique function
- Wrapper participates in query planning and
execution - Easy to add for unusual data sources
Websphere Information Integrator
Query Plan
Query Planning
Query Execution
Execution Plan
Plan request
Wrapper
Wrapper Plan
Generic/ODBC, Oracle, SQL Server, Sybase,
Informix, Teradata, SQL Anywhere, Flat file,
Excel, XML, Documentum, BLAST,
22A Typical Environment, Today
Web Clients
Business Partners
Enterprise Portals
Internal Applications
Data Feeds
XML
HTTP Server
JMS
XML or Web Service
Websphere AS
Web Services
EJBs
Servlets and JSPs
Worflows
Windows
Unix
zOS
Web Server
Reporting Analysis
Customers
Inventory System
Content Manager
Purchase Orders
Web Service
Digital media
DB2/MVS
IMS
Oracle
DB2
Excel
DB2
- The goal to achieve an Integrated, Service-Based
Architecture
23An application produces a Report of the Sales
- grouping sales, per region, and showing current
percentage of quota attained. - The information comes from 3 systems the EJB
needs to federate this data.
Application
Monthly Sales Report
EJBs
Servlets and JSPs
Purchase Orders
Customers
Sales Quotas
Oracle
DB2/MVS
Excel
- The Application needs connectivity to each
system load and merge the data and aggregate it
for the report.
24...the application requires complex logic to
implement federation
Thousands of lines of Java code!
Only 100 regions with sales
MBs of memory required!
SELECT custid, qty, price FROM ORDERS ORDER BY
custid into ords
Monthly Report Service
SELECT custid, region FROM CUSTOMES ORDER BY
custid into custs
100 rows
// Merge the data for each customer in custs
for each order in ords if order.custid
customer.custid region.sales
(order.qty order.price) ii1
. . . etc ...
Federate
Websphere AS
JDBC
Java file i/o
JDBC
2 000 records
10 000 rows
2 000 000 rows
Excel SALES Quotas
DB2 CUSTOMERS
Oracle ORDERS
- The EJB needs a connection to each back-end!
- JDBC helps accessing Oracle/DB2 data, but what
about Excel? - Millions of rows loaded into the application,
stressing the JVM! - Joining large sets of data complex and slow!
25Our Environment, with Information Integration
Web Clients
Business Partners
Enterprise Portals
Internal Applications
Data Feeds
HTTP Server
XML
XML or Web Service
application
Web Services
EJBs
Servlets and JSPs
Information Integrator
Unified Data Model
meta-data
Federation
Unix
zOS
Web Server
Windows
Reporting Analysis
Customers
Inventory System
Content Manager
Purchase Orders
Web Service
Digital media
DB2/MVS
Oracle
IMS
DB2
Excel
DB2
- All Information Sources appear as if from the
same database!
26Our EJB using Information Integration
- Use SQL to query the virtual database, into which
the Excel Spreadsheet appears as just another
table!
Application
Monthly Sales Report
EJBs
Servlets and JSPs
JDBC
nicknames
Information Integrator
Unified Data Model
CUSTOMERS
Federation
ORDERS
Unix
SALESQUOTA
Native APIs
Purchase Orders
Customers
Sales Quotas
Oracle
DB2/MVS
Excel
- Each data source appears as a local table via a
nickname - Information Integration efficiently merges and
aggregate data from all sources, transparently!
27Federation hides the Location and Source of the
data behind a nickname
Application SELECT c.region, SUM(o.price
o.qty) FROM CUSTOMERS c, ORDSTATS o WHERE
c.custid o.custid GROUP BY c.region
DBA CREATE NICKNAME ORDERS FOR
ORACLE8.bourbon.ORDERS CREATE VIEW ORDSTATS
... or, if table does not exist CREATE TABLE
ORDERS(id primary key... ... OPTIONS(REMOTE_SERVER
'ORACLE8' REMOTE_SCHEMA
'bourbon')
Unified Data Model
ORDSTATS view
CUSTOMERS nickname
ORDERS nickname
QUOTAS nickname
ORDERS table
CUSTOMERS table
QUOTAS spreadsheet
Oracle
Excel
DB2/MVS
- A nickname, for most cases, is just like any
other local table!
28With Information Integration the federation chore
moves out of the application
Monthly Report Service
SELECT region, SUM(o.price o.qty) FROM SALES
WHERE month(sdate) CURRENT MONTH GROUP BY
c.region
100 rows
A few 100 bytes of memory required!
A single simple database query
100 rows
Application
JDBC
Single View
SALES view
CUSTOMERS nickname
ORDERS nickname
SALESQUOTA nickname
Excel SALES Quotas
DB2 CUSTOMERS
Oracle ORDERS
- The application still uses JDBC, but only worries
about 1 data model - Nicknames allow automatic type mappings and
transformations
29Variations of the same Report are queried
repeatedly WII can cache it!
Monthly Report Service
SELECT region, SUM(o.price o.qty) FROM SALES
WHERE month(sdate) CURRENT MONTH GROUP BY
c.region
A single simple database query
Application
JDBC
No need to re-fetch the data!
SALES view
CUSTOMERS nickname
ORDERS nickname
Cache
MQT
SALESQUOTA nickname
Excel SALES Quotas
DB2 CUSTOMERS
Oracle ORDERS
- DB2 will cache query results for Materialized
Query Table (MQT) - DB2 optimizer also picks MQT for other similar
queries!
30WebSphere II Replication
Introduction to Replication ftp//ftp.software.
ibm.com/ps/products/db2/info/vr82/pdf/en_US/db2gpe
80.pdf
Federation Engine
Heterogeneous
WebSphere MQ
- Highlights
- Release independence (install new replication
over old UDB) - Add Oracle log-based capture for Q Replication
- Enable Classic sources to participate in Q
Replication (VSAM, IMS, IDMS) - Enable Heterogeneous targets
- Replication of schema changes (starting with
Alter Table/Column) - Expand peer-to-peer conflict resolution schemes
(merge, user-defined)
31WebSphere II Event Publishing
WebSphere Business Integration
Workflow
- Asynchronous communication between data and
application layers to build powerful solutions - Key capability of DB2 II for both distributed and
classic platforms - New addition of IDMS sources
XML message
JDBC Client
ODBC Client
WS II Event Publisher
WS Classic EP For z/OS
IDMS
32Summary Federated Technology
- Infrastructure for extracting information from
data - Wrappers allow access to key data sources
- Query engine provides cross-source queries
- An extension of industrial-strength DB2
technology - Robust, high function, high performance
- Benefit from R D in relational DBMS
- Benefit from experience in content management
- Eases application development
- Transparency, heterogeneity, high function
- APIs for modern environments (XML, J2EE, Web
Services) - Application Autonomy
- Does not disrupt existing applications
33The Grid wrapper for Websphere Information
Integrator V8
34What is the Grid Wrapper ?
- New technology developed here in Hursley
- Bridging Websphere II v8 with the standard based
Grid world to hide data heterogeneity on a
datagrid as well as introducing new access
patterns. - Released as an IBM Alphaworks technology in
August 2005 - http//www.alphaworks.ibm.com/
35OGSA-DAI - today
External World
External resources
External resources represented as grid services
OGSA/OGSI Grid
Grid enabled application
Grid services
Note The application has to integrate/federate
all the data from the different GRID services.
!!!! Schema integration and federation of Data
grid services were features required by OGSA-DAI
users in the 2004 user group meeting in
Edinburgh.!!!!!
36OGSA-DAI Grid Wrapper
External World
External resources
External resources represented as grid services
OGSI/WS RF Grid
OGSA-DAI Websphere II
Standard DB2 enabled application
Grid/WS-RF services
37Data and GRID/Web Services access patternsThe
Diamond graph.
Registry
Registration, Discovery
Dynamic, Discovery
3
Websphere II
Other Services
Client
2
Orchestration, data format, federation
Service
1
Virtualisation
Access, Performance
Data
Arrow indicates interaction
38Leveraging Both Worlds
SQL
OGSA-DAI
Application
WAS Application
SQL
SQL
SQL
Schema integration
Websphere II
OGSA-DAI Wrapper for instance
Reuse DBMS specific wrapper where possible
WS/Grid Service Discovery/Creation
Boundary (Firewall or adminstrative, Technology)
OGSADAI
Grid
DBMS
39Grid/OGSADAI and Websphere II
- Wrapper registry which is pretty static less
likely to change - Server type of Resource knows how to map
information from this type of resource, not tied
to a specific resource but is tied to the
virtualisation of a particular resource - Nickname tied to particular table schema or set
of data which can be returned from a service,
ties it to a domain and data set but not specific
set of data in a specific location.
40Some Customer Scenarios, leveraging OGSA-DAI
Websphere II
- SELECT Get data from first available or
Preferred GRID Data Resource (GRID-DR) - ? OR or PREFERRED QoS
- UNION ALL
- Select and consolidate all result sets from all
GRID-DR - ? AND QoS
- UNION ALL WITH PARTIAL RESULT SETS
- Select and consolidate all result sets from
GRID-DR Data Resource, and it may not matter if
one or many GRID-DR is down, but get a warning
back that not all data were fetched. (see
EDiaMoND type scenario) - ? AND and PARTIAL AND QoS
- Third Party delivery
- results sets will be delivered to another place
(FTP site, WEB Server) and result set holds URL
of where to fetch the data from.
41Quality of Service AND
?
9.20.6.100
select from table
9.20.6.1009.20.6.1019.20.6.102
Result
select from table
9.20.6.101
select from table
9.20.6.102
42Quality of Service PARTIAL AND
?
9.20.6.100
select from table
9.20.6.1009.20.6.1019.20.6.102
Result
select from table
9.20.6.101
select from table
9.20.6.102
43An example for Partial And EDiaMoND Scenario
(Part I) As it is today.
???
Training Services
OGSA DAI
Websphere II Data Federation (static union all
view)
44An example for Partial And EDiaMoND Scenario
(Part II) with the WebSphere II wrapper.
???
Training Application
Core API
Training API
Training Services
Websphere II Data Federation (using OGSA-DAI
wrapper fully dynamic no union all view!! )
OGSA DAI
Registry
Database
Local Files
45Quality of Service PREFERRED
?
9.20.6.100
9.20.6.1009.20.6.1019.20.6.102
Result
9.20.6.101
select from table
9.20.6.102
46Quality of Service OR
?
9.20.6.100
9.20.6.1009.20.6.1019.20.6.102
Result
select from table
9.20.6.101
select from table
9.20.6.102
47Quality of Service Third Party Delivery
?
9.20.6.100
9.20.6.1009.20.6.1019.20.6.102
Result
URLs
select from table
URLs
9.20.6.101
URLs
select from table
Client
9.20.6.102
FTP/GRIDFTP/HTTP Server holding result sets
48Wrapper Security Components
- Makes use of
- Grid Security Infrastructure (GSI)
http//www-unix.globus.org/toolkit/docs/3.2/secur
ity.html - Grid Certificate Service (GCS)http//gcs.globus.o
rg8080/gcs/index.html - Java Commodity Grid Kit (Java CoG
Kit)http//www.cogkit.org
49Create Wrapper statement
- CREATE WRAPPER ogsadai" LIBRARY 'db2qgjava.dll
- OPTIONS( DB2_FENCED 'Y', FENCED_WRAPPER_CLASS
'com.ibm.db2.wrapper.onDemand.OnDemandFencedWrappe
r', - FILENAME 'C\gridwrapper\bin\registry.xml',
- REGISTRY_TYPE 'XMLFILE,
- UNFENCED_WRAPPER_CLASS 'com.ibm.db2.wrapper.on
Demand.OnDemandUnfencedWrapper - )
-
- CREATE WRAPPER ogsadai LIBRARY db2qgjava.dll
- States the wrapper name and the wrapper library
to be used (part of Websphere II). In this case a
Java non-relational wrapper. - UNFENCED_WRAPPER_CLASS and FENCED_WRAPPER_CLASS
- Directs DB2 to the Java classes implementing the
necessary logic. - REGISTRY_TYPE and FILENAME
- Points to the XML registry in this case.
- Can also point to a relational registry
50Create Server
- CREATE SERVER ogsadai_server" WRAPPER
"THEGRIDWRAPPERBASIC
- Simple statement isnt it?
- CREATE SERVER ogsadai_server WRAPPER ogsadai
- States name of server and which wrapper it
belongs to. - Options about service quality is possible here
but we prefer to locate them in the registry.
51Create Nickname statement
- CREATE NICKNAME WRAP.basetbl (pkey INT NOT
NULL, chr CHAR(1),
chr20 CHAR(20),
vchr20 VARCHAR(20), intgr
INT, flt DOUBLE,
mny DECIMAL(19,4), sdt
TIMESTAMP, bt SMALLINT) - FOR SERVER ogsadai_server
- OPTIONS (TABLE_NAME WRAP.basetbl')
- CREATE NICKNAME WRAP.basetbl
- Nickname identifier , here in schema WRAP
- pkey INT NOT NULL, chr CHAR(1),chr20 CHAR(20)
- Defines the nickname columns, this can be done
dynamically. - FOR SERVER ogsadai_server OPTIONS (TABLE_NAME
'basetbl') - Indicates which server will be used as well as
the logical table name, the one to be matched
with information stored in the registry.
52DEMO NOW!
53A basic grid wrapper XML registry
- lt?xml version"1.0" encoding"UTF-8"?gt
- ltregistry version"0.0.0" expireafter"0"gt
- ltclass id"DB2" vendor"IBM" product"DB2"
version"-"/gt - ltclass id"MySQL" vendor"MySQL"
product"MySQL" version"-"/gt - lttransform target"DB2"
fname/gridwrapper/TransformPatterns/db2.transfor
m kind"patterns"/gt - lttransform target"MySQL" fname/gridwrapper/
TransformPatterns/mysql.transform"
kind"patterns"/gt -
- ltconnectivity id"DB2 Test DB JDBC"
type"JDBC" implementation"com.ibm.db2.jcc.DB2D
river - url"jdbcdb2//myhost50000/MYDB
userid"db2localuser password"db2localpass
/gt - ltconnectivity id"MySQL Test DB OGSADAI
type" OGSADAI " - url"http//localhost8080/ogsa/services/
ogsadai/DBSGDSF /gt - lttable logicalnameWRAP.demotest"
assemblage"all-ns"gt - lttable assemblage"direct" using"MySQL
Test DB OGSADAI " physicalnameDB1.demotest
class"MySQL /gt - lttable assemblage"direct using"DB2
Test DB JDBC physicalnameMyDB.demotest
class"DB2 /gt - lt/tablegt
54Possibilities of this wrapper
- Using GRID registry as policy enforcement
- Move DB from one QoS to another by moving them in
and out of registries.(domain specific
registries) - Change data space topology by just interacting
with registries. No impact on the
applications/SQL logic. No need to recreate
nicknames if DB move from one system onto
another. - DBA liked this idea very much. (see demo for GUI)
- Switch between local access (JDBC..) to SOA
depending of application requirement/infrastructur
e needs. see diamond graph in a previous slide. - Leverage distribution of data by fetching from
multiple GRID data sources in parallel. - Many more scenario where loose coupling is
involved
55Conclusion
- Websphere II can really be GRID enabled and be
the unique simple front end for all your data on
the GRID - It is available on IBM AlphaWorks technology for
free - http//alphaworks.ibm.com/tech/gridwrapper
- A tutorial is provided for easy set-up and
training - Read the article Bridging the integration gap,
Part 1 Federating grid data on IBM
DeveloperWorks web site - Its about why we built it and the vision behind
it. - http//www-128.ibm.com/developerworks/grid/library
/gr-feddata/ - We will add more features as customers ask for
them.
56Questions/ Comments
- Patrick Dantressangle
- dantress_at_uk.ibm.com