Title: XML Metadata Services


1
XML Metadata Services
  • SKG06 http://www.culturegrid.net/SKG2006/
  • Guilin, China, November 3, 2006
  • Mehmet S. Aktas, Sangyoon Oh, Geoffrey C. Fox and Marlon Pierce
  • Presented by Geoffrey Fox (Computer Science, Informatics, Physics)
  • Pervasive Technology Laboratories
  • Indiana University, Bloomington, IN 47401
  • gcf@indiana.edu
  • http://www.infomall.org

2
Different Metadata Systems
  • There are many WS-* specifications addressing metadata, defined broadly:
  • WS-MetadataExchange
  • WS-RF
  • UDDI
  • WS-ManagementCatalog
  • WS-Context
  • ASAP
  • WBEM
  • WS-GAF
  • And many different implementations, from (extended) UDDI through MCAT of the Storage Resource Broker
  • And of course representations including RDF and OWL
  • Further, there is system metadata (such as UDDI for core services) and metadata catalogs for each application domain, such as WFS (Web Feature Service) for GIS (Geographical Information Systems)
  • They have different scope and different QoS trade-offs
  • e.g. Distributed Hash Tables (Chord) to achieve scalability in large-scale networks

3
Different Trade-offs
  • It has never been clear how a poor lonely service is meant to know where to look up metadata, and whether the metadata is meant to be thought of as a database (UDDI, WS-Context) or as the contents of a message (WS-RF, WS-MetadataExchange)
  • We identified two very distinct QoS trade-offs:
  • 1) Large-scale, relatively static metadata, as in a (UDDI) catalog of all the world's services
  • 2) Small-scale, highly dynamic metadata, as in dynamic workflows for sensor integration and collaboration
  • Fault tolerance and the ability to support dynamic changes with a few milliseconds' delay
  • But only a modest number of involved services (up to 1000s in a session)
  • Need Session, NOT Service/Resource, metadata, so don't use WS-RF

4
Hybrid WS-Context Service Architecture and Prototype
5
WS-Context compliant XML Metadata Services
  • We designed and built a WS-Context compliant XML Metadata service supporting distributed or central paradigms. This service
  • supports the extensive metadata requirements of rich interacting systems, such as
  • correlating activities of widely distributed services (e.g. workflow-style GIS Service Oriented Architectures), AND
  • optimizing Grid/Web Service messaging performance (e.g. mobile computing environments), AND
  • managing dynamic events, especially in multimedia collaboration (e.g. collaborative Grid/Web service applications), AND
  • providing information to enable session failure recovery capabilities.

6
Context as Service Metadata
  • We define all metadata (static, semi-static, dynamic) relevant to a service as Context.
  • Context can be associated with a single service, a session (service activity), or both.
  • Context can be independent of any interaction
  • slowly varying, quasi-static context
  • e.g. the type or endpoint of a service, which is less likely to change
  • Context can be generated as a result of service interactions
  • dynamic, frequently updated context
  • information associated with an activity or session
  • e.g. a session-id, or the URI of the coordinator of a workflow session (a minimal sketch of such a Context record follows below)
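
As a rough illustration of this Context notion, here is a minimal Java sketch of a context record holding an identifier, an optional session association, and a payload; the class and field names (Context, contextId, sessionId, value) are illustrative assumptions, not the actual WS-Context schema.

    // Minimal sketch of a Context record; field names are illustrative,
    // not the real WS-Context schema.
    public final class Context implements java.io.Serializable {
        public final String contextId;   // unique name of this context
        public final String sessionId;   // null if the context is session-independent
        public final String value;       // metadata payload, e.g. an XML fragment

        public Context(String contextId, String sessionId, String value) {
            this.contextId = contextId;
            this.sessionId = sessionId;
            this.value = value;
        }

        public boolean isSessionScoped() {
            return sessionId != null;
        }
    }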

7
Hybrid XML Metadata Services = WS-Context + extended UDDI
  • We combine the functionalities of these two services, WS-Context AND extended UDDI, in one hybrid service to manage Context (service metadata):
  • WS-Context controlling a workflow
  • (Extended) UDDI supporting semantic service discovery
  • This approach enables uniform query capabilities on the service metadata catalog (see the interface sketch below).
  • http://www.opengrids.org/wscontext/index.html
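
One way to picture the uniform query capability is a single Java facade that exposes both session-context operations (WS-Context style) and service-discovery operations (extended-UDDI style); the interface and method names below are hypothetical and do not reproduce the real WS-Context or UDDI APIs.

    // Illustrative-only facade combining session-context (WS-Context style)
    // and service-discovery (extended-UDDI style) operations in one service.
    // Method names are hypothetical, not the specification operations.
    import java.util.List;

    public interface HybridMetadataService {
        // WS-Context style: dynamic, session-scoped metadata
        void setContext(String sessionId, Context context);
        Context getContext(String sessionId, String contextId);

        // Extended-UDDI style: quasi-static service-discovery metadata
        List<String> findServices(String keyword);   // returns service keys/endpoints
    }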

8
Distributed Hybrid WS-Context XML Metadata Services
[Figure: Publisher and Subscriber clients connect over HTTP(S) to N identical Replica Servers (Replica Server-1 … Replica Server-N). All Replica Servers are identical in their capabilities; the figure illustrates the system from the perspective of one Replica Server.]
9
Key Features
  • Publish-Subscribe is exploited to support replicated storage, e.g.
  • Initial storage of context
  • Updates to keep copies consistent
  • Access to context
  • Use of a JavaSpaces cache running in memory on each WS-Context node
  • Naturally supports "Get Context by name" requests
  • Backed up every 30 milliseconds to a MySQL database
  • If a query can be satisfied by the JavaSpaces cache, it can be answered in < 1 ms plus the few milliseconds of Web service overhead (see the sketch below)
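
A much-simplified sketch of the cache-first lookup described above, using a plain ConcurrentHashMap as a stand-in for the JavaSpaces cache and empty placeholders for the MySQL backup; all class and method names are assumptions for illustration, not the actual implementation.

    // Simplified stand-in for the per-node cache: queries hit the in-memory map
    // first and fall back to the database; a timer flushes the cache to MySQL
    // every 30 ms (the backup interval quoted on this slide).
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class ContextCache {
        private final Map<String, Context> cache = new ConcurrentHashMap<>();
        private final ScheduledExecutorService backup =
                Executors.newSingleThreadScheduledExecutor();

        public ContextCache() {
            // Back up the in-memory cache to persistent storage every 30 ms.
            backup.scheduleAtFixedRate(this::flushToDatabase, 30, 30, TimeUnit.MILLISECONDS);
        }

        public void put(Context c) {
            cache.put(c.contextId, c);            // sub-millisecond in-memory write
        }

        public Context getByName(String contextId) {
            Context c = cache.get(contextId);     // "Get Context by name"
            return (c != null) ? c : loadFromDatabase(contextId);
        }

        private void flushToDatabase() { /* JDBC insert/update into MySQL (omitted) */ }
        private Context loadFromDatabase(String contextId) { return null; /* JDBC lookup (omitted) */ }
    }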

10
TupleSpaces-Based Caching Strategies
  • TupleSpaces is a communication paradigm
  • asynchronous communication
  • pioneered by David Gelernter
  • first described in the Linda project in 1982 at Yale
  • communication units are tuples
  • a data structure consisting of one or more typed fields
  • The Hybrid WS-Context Service employs/extends TupleSpaces
  • all memory accesses; overhead is negligible (less than 1 ms for inquiries)
  • data sharing - mutually exclusive access to tuples
  • associative lookup - content-based search, appropriate for key-based caching
  • temporal and spatial uncoupling of communicating parties
  • e.g. a tuple ("context_id", Context). This indicates a tuple with two fields: a) a string, "context_id", and b) a Java object, "Context" (see the JavaSpaces-style example below)
  • back-up at frequent time intervals for fault-tolerance
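
The tuple example above can be sketched with the standard JavaSpaces API, where an Entry with public fields is the tuple and a partially filled Entry is the template for associative (content-based) lookup; the ContextEntry class below is illustrative only, and locating the JavaSpace via Jini lookup is omitted.

    // Sketch of the tuple idea with the JavaSpaces API (net.jini.*).
    import net.jini.core.entry.Entry;
    import net.jini.core.lease.Lease;
    import net.jini.space.JavaSpace;

    public class ContextEntry implements Entry {
        public String contextId;    // typed field 1: the key
        public String contextXml;   // typed field 2: the context payload
        public ContextEntry() { }   // JavaSpaces entries need a public no-arg constructor

        static void demo(JavaSpace space) throws Exception {
            // Write a tuple ("context_id", Context) into the space.
            ContextEntry tuple = new ContextEntry();
            tuple.contextId = "session-42/coordinator";
            tuple.contextXml = "<context>...</context>";
            space.write(tuple, null, Lease.FOREVER);

            // Associative lookup: match on contextId, leave other fields null.
            ContextEntry template = new ContextEntry();
            template.contextId = "session-42/coordinator";
            ContextEntry found = (ContextEntry) space.read(template, null, 1000L);
        }
    }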

11
Managing Context: UDDI vs. WS-Context

purpose
  UDDI: standard way of publishing and discovering generic Web Service information
  WS-Context: standard way of maintaining distributed session state information
metadata characteristics
  UDDI: interaction-independent, rarely changing, small size
  WS-Context: interaction-dependent, highly dynamic, small size
types of typical queries
  UDDI: high degree of complexity in inquiry arguments, to improve the selectivity and increase the precision of the search results
  WS-Context: simple inquiry arguments, mostly key-based retrieval queries; the selectivity of queries is one
scalability
  UDDI: the whole Grid; UDDI is a domain-independent service for generic service metadata
  WS-Context: sub-Grids; a modest number of interacting Web Services participating in an activity
desired features
  UDDI: better expressive power of service metadata (e.g., RDF-enabled UDDI Registries), up-to-date service entries (e.g., leasing-capable UDDI Registries), domain-specific capabilities (e.g., geospatial query capabilities), persistent storage
  WS-Context: notification (members of an activity should be notified of the distributed state information), synchronous callback (loose coupling of services), high performance, light-weight storage
12
A general performance evaluation of the most
recent implementation of the Hybrid WS-Context
Service
13
Prototype Evaluation - I
  • Performance Experiment: We investigate the practical usefulness of the system by exploring the following research questions.
  • What is the baseline performance of the hybrid
    WS-Context Service implementation for given
    standard operations?
  • What is the effect of the network latency on the
    baseline performance of the system?
  • How does the performance compare with previous
    metadata management solutions?

14
PERFORMANCE TEST
15
TESTBED: Cluster node configuration
Processor: Intel Xeon CPU (2.40GHz)
RAM: 2GB total
Network Bandwidth: 900 Mbits/sec (among the cluster nodes)
OS: GNU/Linux (kernel release 2.4.22)
Java Version: Java 2 Platform, Standard Edition (1.4.2-beta-b19)
SOAP Engine: Axis 2 (in Tomcat 5.5.8)

Metadata Service       Avg. latency for inquiries
hybrid WS-Context      8.41 ms
extended UDDI          17.5 ms
JUDDI                  40 ms
UDDI-MT                20.37 ms
JWSD                   18.99 ms

Test 2 - Test 1 is the JavaSpaces overhead.
The experimental study indicates that the proposed system provides performance for standard operations comparable to existing metadata management services.
16
Prototype Evaluation - II
  • Scalability Experiment: We investigate the scalability of the system by finding answers to the following research questions.
  • What is the performance degradation of the
    system for standard operations under increasing
    message sizes?
  • What is the performance degradation of the
    system for standard operations under increasing
    message rates?
  • What is the scalability gain (both in numbers
    and in performance) of moving from a centralized
    system to a distributed system under the same
    workload?

17
SCALABILITY TEST-1
[Figure: a single-threaded WS-Context client (1 user, 100 transactions) invokes the Hybrid WS-Context service through its WSDL interface.]
TEST-1 - Hybrid WS-Context inquiry/publication with increasing message sizes
TEST-2 - Hybrid WS-Context inquiry/publication with increasing message rates (# of messages per second)
18
TESTBED: Cluster node configuration for hybrid WS-Context tests
Processor: Intel Xeon CPU (2.40GHz)
RAM: 2GB total
Network Bandwidth: 900 Mbits/sec (among the cluster nodes)
OS: GNU/Linux (kernel release 2.4.22)
Java Version: Java 2 Platform, Standard Edition (1.4.2-beta-b19)
SOAP Engine: Axis 2 (in Tomcat 5.5.8)

Metadata Service       Avg. latency for inquiries (64 KByte data retrieval)
hybrid WS-Context      14.55 ms
OGSA-DAI WSRF 2.1      232 ms

  • OGSA-DAI results are from
    http://www.ogsadai.org.uk/documentation/scenarios/-performance
  • Both the OGSA-DAI and WS-Context test cases were conducted on a tightly coupled network.

The results indicate that the cost of inquiry and publication operations remains roughly the same as the context's payload size increases from 100 bytes up to 10 KBytes. We also see that hybrid WS-Context presents better performance than the OGSA-DAI approach, although the latter technology is more powerful.
19
TESTBED: Cluster node configuration
Processor: Intel Xeon CPU (2.40GHz)
RAM: 2GB total
Network Bandwidth: 900 Mbits/sec (among the cluster nodes)
OS: GNU/Linux (kernel release 2.4.22)
Java Version: Java 2 Platform, Standard Edition (1.4.2-beta-b19)
SOAP Engine: Axis 2 (in Tomcat 5.5.8)

The results indicate that the proposed system can scale up to 940 simultaneous querying clients or 222 simultaneous publishing clients, with each client sending one query per second, for small-size context payloads with a 30-millisecond fault-tolerance (backup) interval. Multi-core hosts will improve performance dramatically.
20
4 cores give about 3000 messages per second, i.e. roughly one message per millisecond per core for the Opteron and one message per 2 ms for a Sun Niagara core.
21
DISTRIBUTION TEST
[Figure: clients distributed to cluster nodes 1 to 5, each running 1 to 15 threads firing messages over HTTP(S) at randomly selected servers.]
  • We investigate scalability when moving from a centralized server to a distributed one under heavy workloads.
  • Numbered rectangle shapes correspond to an N-node FTHPIS system with various Publish-Subscribe topologies (the topology does NOT affect performance).
  • Five different FTHPIS systems were tested, with N ranging from 1 to 5, under the same workload.
  • In each test case, the same volume of data is evenly distributed among the nodes.

22
TESTBED: Cluster node configuration
Processor: Intel Xeon CPU (2.40GHz)
RAM: 2GB total
Network Bandwidth: 900 Mbits/sec (among the cluster nodes)
OS: GNU/Linux (kernel release 2.4.22)
Java Version: Java 2 Platform, Standard Edition (1.4.2-beta-b19)
SOAP Engine: Axis 2 (in Tomcat 5.5.8)

Hybrid WS-Context inquiry operation
# of nodes   message rate (messages/sec)   mean (ms)   error (ms)   stdev (ms)
1            940                           47.05       0.24         33.52
2            1005                          40.76       0.43         38.22
3            1082                          38.58       0.45         34.93
4            1148                          36.28       0.42         32.24
5            1221                          34.13       0.40         30.76

The caching algorithm is non-optimal, as it performs the database access BEFORE Publish-Subscribe. Reversing this choice should lead to throughput linear in the number of nodes; the Pub-Sub overhead is about 2 ms.

The results indicate that the scalability of the metadata store can be increased by moving from a centralized service to a distributed system.
23
Prototype Evaluation - III
  • Fault-Tolerance Experiment: We investigate the empirical cost of providing fault-tolerance by finding answers to the following research questions.
  • What is the cost of fault-tolerance in terms of the execution time of standard operations on a tight cluster?
  • How does the cost of fault-tolerance change when the replica servers are separated by significant network distances?

24
FAULT-TOLERANCE TEST
25
FAULT-TOLERANCE EXPERIMENT TEST BED
Summary of machine configurations:
  gf6.ucs.indiana.edu (Bloomington, IN, USA): Intel Xeon CPU (2.40GHz), 2GB RAM, GNU/Linux (kernel release 2.4.22), Java 2 SE (1.4.2-beta-b19)
  complexity.ucs.indiana.edu (Indianapolis, IN, USA): Sun-Fire-880, sun4u sparc SUNW, 16GB RAM, SunOS 5.9, Java HotSpot(TM) 64-Bit Server VM (1.4.2-01)
  lonestar.tacc.utexas.edu (Austin, TX, USA): Intel(R) Xeon(TM) CPU 3.20GHz, 4GB RAM, GNU/Linux (kernel release 2.6.9), Java 2 SE (1.4.2-beta-b19)
  tg-login.sdsc.teragrid.org (San Diego, CA, USA): GenuineIntel IA-64, Itanium 2, 4 processors, 8GB RAM, GNU/Linux, Java 2 SE (1.4.2-beta-b19)
  vlab2.scs.fsu.edu (Tallahassee, FL, USA): Dual Core AMD Opteron(tm) Processor 270, 2GB RAM, GNU/Linux (kernel release 2.6.16), Java 2 SE (1.4.2-beta-b19)
26
FAULT-TOLERANCE TEST RESULTS
The results point out the inevitable trade-off between fault-tolerance (degree of replication, i.e. high availability of data) and performance: the lower the level of fault-tolerance, the higher the performance for publication operations. These results also indicate that a high degree of replication can be achieved (by utilizing an asynchronous communication model such as the publish-subscribe paradigm) without increasing the cost of fault-tolerance.
27
An Application Case Scenario and an
application-specific performance evaluation of
the Hybrid WS-Context Service
28
Application: Context Store usage in the communication of mobile Web Services
  • Handheld Flexible Representation (HHFR) is open source software for fast communication in mobile Web Services. HHFR supports
  • streaming messages, separation of message contents, and usage of a context store.
  • http://www.opengrids.org/hhfr/index.html
  • We use the WS-Context service as a context-store for the redundant message parts of SOAP messages.
  • The redundant data is the static XML fragments encoded in every SOAP message.
  • The redundant metadata is stored as context associated with the service conversation in its place (see the sketch below).
  • The empirical results show that we gain 83% in message size and on average 41% in transit time by using the WS-Context service.
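
A sketch of the context-store optimization just described: the static XML fragment of a conversation is published once as a context, and subsequent messages carry only a reference to it. The helper class and its methods are hypothetical, building on the earlier illustrative HybridMetadataService and Context sketches rather than the actual HHFR API.

    // Illustrative-only sketch: store the unchanging part of a SOAP message
    // once in the context-store, then send only a reference to it.
    public class RedundantPartOptimizer {
        private final HybridMetadataService store;   // from the earlier interface sketch

        public RedundantPartOptimizer(HybridMetadataService store) {
            this.store = store;
        }

        // Sender side, first message of a conversation: save the static XML fragment as context.
        public String publishStaticPart(String sessionId, String staticXmlFragment) {
            String contextId = sessionId + "/static-part";
            store.setContext(sessionId, new Context(contextId, sessionId, staticXmlFragment));
            return contextId;                        // later messages carry only this key
        }

        // Receiver side: resolve the reference back into the full XML fragment.
        public String resolveStaticPart(String sessionId, String contextId) {
            return store.getContext(sessionId, contextId).value;
        }
    }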

29
Optimizing Grid/Web Service Messaging Performance
The performance and efficiency of Web Services
can be greatly increased in conversational and
streaming message exchanges by removing the
redundant parts of the SOAP message.
30
Performance with and without Context-store
  • Experiments ran over HHFR
  • Optimized messages exchanged over HHFR after saving redundant/unchanging parts to the Context-store
  • Save on average
  • 83% of message size, 41% of transit time

Summary of the Round Trip Time (T_RTT)

Message Size          Without Context-store         With Context-store
                      Avg ± error (sec)   Stddev    Avg ± error (sec)   Stddev
Medium (513 bytes)    2.76 ± 0.034        0.187     1.75 ± 0.040        0.217
Large (2.61 KB)       5.20 ± 0.158        0.867     2.81 ± 0.098        0.538
31
System Parameters
  • T_access: time to access a Context-store (i.e. save a context to, or retrieve a context from, the Context-store) from a mobile client
  • T_RTT: Round Trip Time to exchange a message through an HHFR channel
  • N: number of simultaneous streams supported, summed over ALL mobile clients
  • T_wsctx: time to process the setContext operation
  • T_axis: time consumed by Axis processing
  • T_trans: transmission time through the network
  • T_stream: stream length

32
Context-store System Parameters
33
Summary of T_axis and T_wsctx measurements
  • T_access = T_wsctx + T_axis + T_trans
  • Data binding overhead at the Web Service container is the dominant factor in message processing

34
Performance Model and Measurements
  • C_hhfr = n * t_hhfr + O_a + O_b
  • C_soap = n * t_soap
  • Breakeven point:
  • n_be * t_hhfr + O_a + O_b = n_be * t_soap, i.e. n_be = (O_a + O_b) / (t_soap - t_hhfr)
  • O_a(WS) is roughly 20 milliseconds (a worked example follows below)

O_a: overhead for accessing the Context-store Service
O_b: overhead for negotiation

                              Average ± error (sec)   Stddev (sec)
Context-store Access (O_a)    4.127 ± 0.042           0.516
Negotiation (O_b)             5.133 ± 0.036           0.825
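
To make the breakeven relation concrete, the small calculation below plugs the measured O_a and O_b from the table into n_be = (O_a + O_b) / (t_soap - t_hhfr); the per-message times t_soap and t_hhfr are placeholder values chosen only to show how the formula is used, not measurements from this study.

    // Breakeven calculation for the performance model on this slide.
    // oA and oB come from the table above; tSoap and tHhfr are HYPOTHETICAL
    // per-message times inserted only to show how the formula is applied.
    public class BreakevenDemo {
        public static void main(String[] args) {
            double oA = 4.127;      // Context-store access overhead (sec, measured)
            double oB = 5.133;      // negotiation overhead (sec, measured)
            double tSoap = 2.0;     // assumed per-message time over plain SOAP (sec)
            double tHhfr = 1.0;     // assumed per-message time over HHFR (sec)

            // n_be * tHhfr + oA + oB = n_be * tSoap  =>  n_be = (oA + oB) / (tSoap - tHhfr)
            double nBe = (oA + oB) / (tSoap - tHhfr);
            System.out.printf("Breakeven at about %.1f messages per stream%n", nBe);
        }
    }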
35
String Concatenation
  • Measure the total time to process a stream (see the measurement sketch below)
  • Independent variables:
  • Number of messages per stream
  • Size of the message
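
A minimal Java sketch of this measurement: total processing time as a function of the two independent variables, number of messages per stream and message size; the processMessage placeholder stands in for the real string-concatenation/stream work and is not part of HHFR.

    // Measure total stream-processing time against the two independent variables.
    public class StreamTimer {
        public static long timeStream(int messagesPerStream, int messageSizeBytes) {
            byte[] payload = new byte[messageSizeBytes];     // synthetic message body
            long start = System.nanoTime();
            for (int i = 0; i < messagesPerStream; i++) {
                processMessage(payload);                     // placeholder for the per-message work
            }
            return (System.nanoTime() - start) / 1_000_000;  // elapsed time in ms
        }

        private static void processMessage(byte[] payload) { /* e.g. string concatenation step */ }
    }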