Title: Distributed Heterogeneous Data Warehouse For Grid Analysis
1 Distributed Heterogeneous Data Warehouse For Grid Analysis
- Harvey B. Newman, Julian Bunn, Saima Iqbal
- CALTECH (California Institute of Technology)
2 OUTLINE
- Introduction
- What is a relational data warehouse?
- Distributed Heterogeneous Relational Data Warehouse Databases (DHRD) and Grid
- How DHRD could be integrated with the Grid
- Why web services?
- Building blocks of Web Services
- Vital parts of Web Services
- How DHRD could be integrated with the Grid as a Web Service
- Grid services
- Grid services architecture (GGF draft, 16 Feb. 2003)
- Grid services client infrastructure (GGF draft, 5 Jun. 2003)
- Proposed web services architecture based on Grid services to use DHRD in a Grid environment
- Technologies employed
- UDDI compliant registry service
3 INTRODUCTION
- Can databases be integrated with the Grid?
- Most existing and proposed Grid applications are file based.
- Very little work has been done on how Distributed Heterogeneous Databases can be made available on the Grid.
- Web Services can help in accessing Distributed Heterogeneous Databases as a single Virtual Database across the Grid.
4 Distributed Data Warehouse
- A distributed database system allows applications to access data from both local and remote databases.
- It helps to move some of the data and some of the users to separate servers and databases.
- It allows data used by a particular workgroup to be kept at Tier 2 and Tier 3, on a nearby server.
- It reduces the need for massive central computing and reduces network delays.
5 Distributed Heterogeneous Relational Data Warehouse (DHRD) Databases and Grid
- Is it possible to access DHRD databases across the Grid by adopting the existing Grid services that handle files?
- Relational databases offer a much richer set of operations than files, such as queries and transactions.
- Different DBMSs differ from one another far more than different file systems do.
- Even within one paradigm, different database products (Oracle, MS-SQL, DB2) vary in their functionality and interfaces.
6 How DHRD Could Be Integrated with The Grid
- The diversity of DHRD makes it difficult to design a single solution for integrating DHRD databases with the Grid.
- The Open Grid Services Architecture (OGSA) for distributed systems provides the concept of Grid Services (similar to Web Services) to access resources across distributed and heterogeneous environments.
- These Grid Services/Web Services can help make the distributed databases available across the Grid as a Virtual Database System.
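As a rough illustration of the Virtual Database idea, the Python sketch below puts one query interface in front of two independent databases. It is only a local stand-in: both back ends are in-memory SQLite databases rather than Oracle or MS-SQL servers, no Grid or Web Service layer is involved, and the names `VirtualDatabase`, `make_site` and the `events` table are hypothetical.

```python
import sqlite3


def make_site(rows):
    """Create one stand-alone database holding a small 'events' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (run INTEGER, energy REAL)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    conn.commit()
    return conn


class VirtualDatabase:
    """Presents several independent databases as a single query target."""

    def __init__(self, sites):
        self.sites = sites  # mapping: site name -> DB-API connection

    def query(self, sql, params=()):
        """Run the same query at every site and merge the result rows."""
        merged = []
        for name, conn in self.sites.items():
            for row in conn.execute(sql, params):
                # Tag each row with the site it came from.
                merged.append((name,) + tuple(row))
        return merged


# Two hypothetical sites, each with its own local data.
vdb = VirtualDatabase({
    "tier2": make_site([(1, 10.5), (2, 99.0)]),
    "tier3": make_site([(3, 120.0)]),
})
result = vdb.query("SELECT run, energy FROM events WHERE energy > ?", (50.0,))
print(result)
```

With genuinely heterogeneous products, the same SQL text would not necessarily be accepted everywhere, so a per-product adapter would probably be needed behind `query`.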
7 Why Web Services?
- Web Services are centered on the service definition and messages
- Web Services build on a set of well-established technologies and protocols
- - XML is used for service description and data interchange
- - HTTP is used as the transport protocol
- - Widely deployed, with trusted security features
- Web Services standards are structured and extensible
- - Interfaces can evolve without breaking what is already working
- Provide a solution for access to heterogeneous, web-wide resources
8 Building Blocks of Web Services
- Web Services are modular software components wrapped inside a specific set of Internet communication protocols, and they can be run over the Internet.
- At the heart of the web services architecture is the need for program-to-program communication.
- Key roles in the web services architecture are:
- - a service provider
- - a service registry
- - a service requestor
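9 Building Blocks of Web Services (contd)
- - Together they perform three operations on web services: Publish, Find and Bind
[Figure: Service Provider, Service Registry and Service Requestor connected by three numbered steps]
- 1 Publish (Service Provider to Service Registry): make the service description publicly available
- 2 Find (Service Requestor to Service Registry): discover the service
- 3 Bind (Service Requestor to Service Provider): allows the service to be used by the requestor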
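The Publish, Find and Bind steps can be sketched in a few lines of Python. This is a toy, in-process model under simplifying assumptions: the registry is a dictionary, the "service description" is a plain dict that carries the callable itself, and all class and function names are hypothetical rather than taken from any Web Services toolkit.

```python
class ServiceRegistry:
    """Toy registry: maps a service name to its published description."""

    def __init__(self):
        self._entries = {}

    def publish(self, name, description):
        # Step 1 (Publish): the provider makes its description available.
        self._entries[name] = description

    def find(self, name):
        # Step 2 (Find): the requestor discovers the service description.
        return self._entries.get(name)


class ServiceProvider:
    """Toy provider that offers one callable operation."""

    def __init__(self, name, operation):
        self.name = name
        self.operation = operation

    def describe(self):
        # Stand-in for a service description; here it just carries the callable.
        return {"name": self.name, "endpoint": self.operation}


def bind(description):
    # Step 3 (Bind): the requestor obtains something it can invoke.
    return description["endpoint"]


registry = ServiceRegistry()
provider = ServiceProvider("row_count", lambda table: {"events": 3}.get(table, 0))
registry.publish(provider.name, provider.describe())

found = registry.find("row_count")   # requestor side: discover
call = bind(found)                   # requestor side: bind
print(call("events"))
```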
10 Vital Parts of Web Services
- SOAP (Simple Object Access Protocol) is the protocol through which the service provider, service registry and service requestor communicate.
- WSDL (Web Services Description Language) is the language used to create service descriptions.
- UDDI (Universal Description, Discovery and Integration) is the directory technology used by service registries; a registry contains the descriptions of web services and allows the directory to be searched for a particular web service.
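To make the role of SOAP more concrete, the following Python sketch builds and reads back a SOAP 1.1 envelope using only the standard library. The envelope namespace is the standard SOAP 1.1 one; the application namespace `urn:example:dhrd`, the `runQuery` operation and the `sql` element are assumptions made up for illustration and do not come from the prototype.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:dhrd"  # hypothetical application namespace


def build_request(sql):
    """Wrap a query string in a SOAP 1.1 envelope and return it as bytes."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}runQuery")  # hypothetical operation
    ET.SubElement(call, f"{{{APP_NS}}}sql").text = sql
    return ET.tostring(envelope, encoding="utf-8")


def read_request(message):
    """Extract the query string from a message built by build_request."""
    root = ET.fromstring(message)
    node = root.find(f"{{{SOAP_ENV}}}Body/{{{APP_NS}}}runQuery/{{{APP_NS}}}sql")
    return node.text


message = build_request("SELECT COUNT(*) FROM events")
print(message.decode("utf-8"))
print(read_request(message))
```

In a deployed service the envelope would travel in the body of an HTTP POST; the sketch stops short of any network call.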
11 How DHRD Could Be Integrated with The Grid As A Web Service
- The Distributed Heterogeneous Relational Databases can register themselves as web services in a UDDI registry.
- A client can then access these web services through a web application by using their WSDL descriptions.
- The client is very important in this architecture because it can dynamically discover services and configure the remote calls on the basis of the inputs it receives from the HTTP call.
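One way a client might configure its remote calls from a service description is sketched below in Python: it parses a WSDL 1.1 document and extracts the operation names and the endpoint address. The WSDL text is a hypothetical, heavily trimmed example (a complete description would also contain messages and a binding), and the service name, operations and URL are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Standard WSDL 1.1 and WSDL SOAP-binding namespaces.
NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}

# Hypothetical, heavily trimmed WSDL 1.1 description of a database service.
WSDL_TEXT = """<?xml version="1.0"?>
<definitions name="DhrdQuery"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/">
  <portType name="QueryPort">
    <operation name="runQuery"/>
    <operation name="listTables"/>
  </portType>
  <service name="DhrdQueryService">
    <port name="QueryPort">
      <soap:address location="http://example.org/dhrd/query"/>
    </port>
  </service>
</definitions>
"""


def configure_client(wsdl_text):
    """Read the operations and endpoint a client needs from a WSDL document."""
    root = ET.fromstring(wsdl_text)
    operations = [op.get("name")
                  for op in root.findall("wsdl:portType/wsdl:operation", NS)]
    address = root.find("wsdl:service/wsdl:port/soap:address", NS)
    return {"endpoint": address.get("location"), "operations": operations}


config = configure_client(WSDL_TEXT)
print(config)
```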
12 Grid Services
- The OGSA integrates key Grid technologies (including the Globus toolkit) with Web Services mechanisms to create a distributed system framework around the OGSI (Open Grid Services Infrastructure).
- A Grid Service is a Web Service that conforms to a set of conventions (interfaces and behaviors) that define how a client interacts with a service available across the Grid.
13 Grid Services Architecture (contd) (Grid Database Service specification (GGF))
[Figure: Requester Using Grid Data Service Ports. A Requester interacts with a GridDataService through three ports: GridServicePort (FindServiceData, returning <ServiceData>), GridDataServicePort (Perform, returning <Response>) and GridDataTransportPort (Put/get, returning <Response>).]
14 Grid Services Architecture (Grid Database Service specification (GGF))
[Figure: Creating a Grid Data Service. Elements shown: Requester; GridServiceRegistry (FindServiceData, returning a GSH (Grid Service Handle)); GridDataServiceFactory (CreateService, with <ServiceInformation>); the factory creates a GridDataService in front of the Database Servers.]
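The registry, factory and service roles named in this figure can be mimicked with a small Python sketch. It is an in-process analogy only, not the GGF interfaces: the method names merely echo the labels in the figure, the handle string format is made up, and an in-memory SQLite database stands in for the database servers.

```python
import itertools
import sqlite3


class GridDataService:
    """Stand-in for a service instance bound to one database."""

    def __init__(self, handle, connection):
        self.handle = handle
        self._connection = connection

    def perform(self, sql):
        return self._connection.execute(sql).fetchall()


class GridDataServiceFactory:
    """Creates a new service instance per request and hands back its handle."""

    def __init__(self, connection):
        self._connection = connection
        self._counter = itertools.count(1)
        self.instances = {}

    def create_service(self):
        handle = f"gsh://example/gds/{next(self._counter)}"  # made-up handle format
        self.instances[handle] = GridDataService(handle, self._connection)
        return handle


class GridServiceRegistry:
    """Lets a requester look up a factory by name."""

    def __init__(self):
        self._factories = {}

    def register(self, name, factory):
        self._factories[name] = factory

    def find_service_data(self, name):
        return self._factories[name]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (id INTEGER)")
conn.executemany("INSERT INTO runs VALUES (?)", [(1,), (2,)])

registry = GridServiceRegistry()
registry.register("warehouse", GridDataServiceFactory(conn))

factory = registry.find_service_data("warehouse")  # requester asks the registry
handle = factory.create_service()                  # requester asks the factory
service = factory.instances[handle]                # resolve the handle to an instance
rows = service.perform("SELECT COUNT(*) FROM runs")
print(handle, rows)
```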
15 Grid Services Client Infrastructure (Grid Database Service specification (GGF))
[Figure: A client-side runtime architecture. Elements shown: Client Application; a client-server interface; Proxy; Binding Selection; Protocol 1 (binding) specific stub; Protocol 2 (binding) specific stub; invocation of the Web Service.]
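A minimal Python sketch of the proxy, binding selection and protocol-specific stubs named in this figure is given below. It makes no network calls: the two stub classes are hypothetical placeholders that only format a string, and the preference-ordered selection rule is an assumption chosen for illustration.

```python
class SoapHttpStub:
    """Hypothetical stub for a SOAP-over-HTTP binding (no network used here)."""
    protocol = "soap-http"

    def invoke(self, operation, argument):
        return f"[{self.protocol}] {operation}({argument})"


class LocalStub:
    """Hypothetical stub for an in-process binding."""
    protocol = "local"

    def invoke(self, operation, argument):
        return f"[{self.protocol}] {operation}({argument})"


class Proxy:
    """Client-side proxy: picks a binding, then forwards calls to its stub."""

    def __init__(self, stubs, preferred):
        self._stubs = {stub.protocol: stub for stub in stubs}
        self._preferred = preferred

    def select_binding(self):
        # Use the first preferred protocol for which a stub is available.
        for protocol in self._preferred:
            if protocol in self._stubs:
                return self._stubs[protocol]
        raise LookupError("no usable binding")

    def call(self, operation, argument):
        return self.select_binding().invoke(operation, argument)


proxy = Proxy([SoapHttpStub(), LocalStub()], preferred=["local", "soap-http"])
reply = proxy.call("runQuery", "SELECT 1")
print(reply)
```

The client application only ever talks to `Proxy.call`, so a binding can be swapped without changing application code.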
16 Proposed Web Services Architecture Based on Grid Services To Use DHRD In Grid Environment
[Figure: proposed architecture. Elements shown:
- Service Registry: UDDI Registry Server
- Service Provider: Web Server with HTTP Server, SOAP Processor and WSDL file; Java XML API to connect with the database server
- Distributed database: server with master database and server with materialized view database; ORACLE9i servers (data and metadata) and MS-SQL (data and metadata); data replication through SSL
- Service Requestor: client web application to connect with the database
- MonALISA
- Interactions: UDDI SOAP request and response; SOAP; bind with the provided service]
17 Technologies Employed
- Java Web Services Developer Pack 1.0 (JWSDP)
- Apache Tomcat 4.1.2 for Java Web Services Developer Pack 1.0
- - Apache web server
- - Tomcat servlet engine
- Java API for XML Registries (JAXR) 1.0_02
- Java API for XML-based RPC (JAX-RPC) 1.0_01
- Web Application Deployment Tool for JWSDP
- XRPCC tool to generate WSDL
- JWSDP Registry Server 1.0_02
- - Xindice database, the repository for registry data
- - Implements Version 2.00 of Universal Description, Discovery and Integration (UDDI)
18 UDDI Compliant Service Registry
- A standardized, transparent mechanism for describing the service
- A simple mechanism for invoking the service
- An accessible central registry of services
- Makes use of XML and SOAP
- Provides a service discovery platform on the WWW
- Suitable for a black-box web environment
- Allows as much detail about a service and its implementation to be stored as desired
- The UDDI version 2.0 API defines approx. 40 messages to perform inquiry and publishing functions against any UDDI compliant service registry
- The schema defines 25 requests and 15 responses
19 Working of Web Services Prototype
[Figure: prototype workflow with numbered steps 1 to 10. Elements shown: Web Service Requester; Web server; Find-service via JAXR against the Registry Server; Program Interface; Stubs; JAX-RPC Runtime (on both sides); SOAP Messages over HTTP; Ties; Program Implementation; Database Server.]
22 Conclusion
- It appears possible to make Distributed Heterogeneous Relational Data Warehouse Databases available across the Grid in the form of Web Services/Grid Services.
23 Future Work
- Integration of MonALISA (Grid monitoring tool) to locate the required web service with optimal network resources
- Exploit the full functionality of UDDI
- Provide an API to integrate this Grid Services based Web Services prototype into the Globus toolkit