Apresenta - PowerPoint PPT Presentation

1
FEDERAL UNIVERSITY OF RIO DE JANEIRO
Spatial Query Broker in a Grid Environment
Author: Wladimir S. Meyer
Advisors: Jano M. Souza, Milton R. Ramirez
2
Outline
  • Motivation and Goal
  • The Problem
  • Related works
  • The Proposal
  • SQB Architecture
  • Preliminary Tests
  • Remarks

4
Motivation
  • The dissemination of GIS systems, together with
    the improvement in channel bandwidth, is
    increasing quickly, and the interactions between
    data producers and consumers are becoming more
    frequent, complex and dynamic.
  • Some hot points in these relationships:
    • Huge amounts of data spread across many
      different geographic places
    • Complexity of spatial data
    • Demand for sophisticated services delivered
      over the web
    • The high price that shared resources may have
      in some federations (CPU time, storage space, ...)
    • Integration problems (many levels of
      heterogeneity)

Distributed spatial operations, and methods to
improve their efficiency, play an important role
in this context. There is a lot of work on
spatial operations in a centralized context, but
little in a distributed one. The Grid computing
paradigm brings together many characteristics
that can improve the execution of distributed
spatial operations.
5
Goal
This work aims at improving the efficiency of
distributed spatial joins by means of an
architecture that allows non-specialized
computers to be allocated to the execution of the
operation, reducing the overall response time.
The focus is on the spatial join because it is a
very common operation in GIS systems and has a
high processing cost. The architecture also
offers the conditions to experiment with new
algorithms (filter/refine, scheduler, ...).
7
The Problem
How can a spatial join be carried out over a pool
of data providers that share a huge amount of
spatial data, so that the response time stays
below a limit set by some quality criterion? The
data fragmentation may be spatial and/or thematic
(i.e., a hybrid schema), and there are local
spatial indexes on each dataset. This scenario
could be illustrated by a pool of regional
governmental agencies responsible for generating
cartographic data and offering query services
that run over their data through the internet.
9
Related Work
  • Many important works in spatial query processing
    are related to the filter/refine strategy [5].
    Some of them are mentioned below:
    • Multi-step processing of spatial joins:
      Brinkhoff et al. [6]
    • Raster signatures in spatial joins (4CRS):
      Zimbrão et al. [30]
    • Multi-step with remote indexes (MR2): Ramirez
      and Souza [26]
  • On the other hand, the execution of the query
    plan in a distributed context may emphasize
    parallelism as a way to reduce the overall
    response time:
    • MR2: Ramirez [26]
    • Grid Greedy Node: Porto et al. [25]
    • OGSA-DQP: Smith et al. [27]
  • Some of these strategies need a scheduler
    module, which should guarantee an adequate load
    balance among the selected local SDBMSs.

11
The Proposal
In this work, the grid's ability to offer
resources on demand is used to reduce the overall
response time of distributed spatial join
queries in databases. The parallelism in
previous works involves only the nodes that
store the spatial data mentioned in the query.
Our proposal is to also involve generic
computational resources in the most expensive
step of the filter/refine strategy: the exact
geometry processing.
[Figure: multi-step filter/refine strategy [6]]
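The two steps of the filter/refine strategy can be illustrated with a minimal Python sketch. It is not the SQB implementation: the function names are hypothetical, the filter step is a plain nested loop over bounding rectangles (a real system would use a spatial index such as an R-tree), and the refine step assumes convex polygons (for example triangles) and treats touching boundaries as intersecting.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]  # assumed convex, vertices given in order
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def mbr(poly: Polygon) -> Box:
    """Minimum bounding rectangle of a polygon."""
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    return min(xs), min(ys), max(xs), max(ys)


def mbrs_intersect(a: Box, b: Box) -> bool:
    """Cheap test used by the filter step."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def convex_intersect(p: Polygon, q: Polygon) -> bool:
    """Exact test for convex polygons (separating axis theorem)."""
    for poly in (p, q):
        n = len(poly)
        for i in range(n):
            x1, y1 = poly[i]
            x2, y2 = poly[(i + 1) % n]
            ax, ay = y1 - y2, x2 - x1  # normal of the edge
            proj_p = [ax * x + ay * y for x, y in p]
            proj_q = [ax * x + ay * y for x, y in q]
            if max(proj_p) < min(proj_q) or max(proj_q) < min(proj_p):
                return False  # this axis separates the two polygons
    return True


def filter_refine_join(r: List[Polygon], s: List[Polygon]):
    """Return (candidate pairs after filtering, pairs confirmed by refinement)."""
    # Filter step: compare bounding rectangles only.
    candidates = [(i, j) for i, a in enumerate(r) for j, b in enumerate(s)
                  if mbrs_intersect(mbr(a), mbr(b))]
    # Refine step: exact geometry test, the expensive part of the join.
    results = [(i, j) for i, j in candidates if convex_intersect(r[i], s[j])]
    return candidates, results
```

In this sketch only the refine step touches the exact geometries, so it is the step that could be handed to other machines once the candidate pairs are known.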
12
The Proposal
The following picture gives an overview of the
context.
13
The Proposal
A specialized meta-scheduler, named Spatial Query
Broker (SQB), is proposed to handle all spatial
query processing, much as conventional Resource
Brokers do in grid environments.
Item                 SQB        OGSA-DQP   GridWay       WMS
Unit of work         Query      Query      Job           Job
App domain           Databases  Databases  Generic jobs  Generic jobs
Dynamic scheduling   Yes        No         Yes           No
Spatial queries?     Yes        No         -             -
Use generic nodes?   Yes        No         -             -
15
SQB Architecture
The SQB is composed of the following modules:
16
SQB Architecture
Steps managed by the optimizer
17
SQB Architecture
  • The Execution Monitor builds two queues to store
    the inconclusive pairs in order to deliver them
    to the CEs.
  • One of them is shared among the faster CEs,
    while the other is shared among the slower ones.
  • The total number of vertices is adopted as an
    indicator of the complexity of the processing.
  • A throughput indicator is collected beforehand
    from the CEs and registered in the information
    server (MDS).

It isn't necessary to sort the pairs.
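One way to read the two-queue scheme is sketched below in Python. The slide does not say how pairs are assigned to queues or how CEs are classed as faster or slower, so the rules used here are assumptions made for illustration: pairs whose total vertex count reaches a hypothetical `vertex_threshold` go to the queue served by the faster CEs, and a CE counts as fast when its throughput indicator is at or above the median. All names are hypothetical.

```python
from collections import deque
from statistics import median
from typing import Deque, Dict, List, Tuple

# (object id in dataset R, object id in dataset S, total number of vertices)
Pair = Tuple[str, str, int]


def build_queues(pairs: List[Pair],
                 vertex_threshold: int) -> Tuple[Deque[Pair], Deque[Pair]]:
    """Split inconclusive pairs into two queues in a single pass (no sorting)."""
    heavy: Deque[Pair] = deque()  # assumed to be served by the faster CEs
    light: Deque[Pair] = deque()  # assumed to be served by the slower CEs
    for pair in pairs:
        if pair[2] >= vertex_threshold:
            heavy.append(pair)
        else:
            light.append(pair)
    return heavy, light


def classify_ces(throughput: Dict[str, float]) -> Tuple[List[str], List[str]]:
    """Split CEs into fast and slow groups around the median throughput."""
    cut = median(throughput.values())
    fast = sorted(ce for ce, t in throughput.items() if t >= cut)
    slow = sorted(ce for ce, t in throughput.items() if t < cut)
    return fast, slow
```

Because each pair is only compared against a threshold, the pairs never need to be sorted, which matches the remark on the slide.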
18
SQB Architecture
Simplified sequence diagram
20
Preliminary Tests
Although a prototype is still under construction,
a few tests were done with synthetic spatial
datasets consisting of polygons, to give us some
relative parameters to guide our work on spatial
joins among polygons (overlap predicate). Spatial
join operations were performed on servers that
hold both datasets, indexed with R-trees. The
original datasets were partitioned into four and
nine regular parts, and the response time (RT) in
each situation was measured:
$RT = T_{MSG} \cdot \text{messages} + T_{TX} \cdot \text{bytes} + T_{CPU} + T_{I/O}$
  • Objects that cross boundaries were replicated in
    the datasets involved (they weren't split).
  • The tests were executed in three situations:
    • The whole query at once in a single SDBMS
    • The query over the same region broken into four
      parts and executed by four identical machines
    • The query over the same region broken into nine
      parts and executed by nine identical machines
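The partitioning described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used in the tests: objects are represented only by their bounding boxes, the region is cut into a k x k regular grid (k = 2 gives four parts, k = 3 gives nine), and an object that crosses a boundary is copied into every cell it touches. The function names are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
Cell = Tuple[int, int]                   # (column, row) in the grid


def cells_for(box: Box, extent: Box, k: int) -> List[Cell]:
    """Cells of a k x k regular grid over `extent` that `box` touches."""
    xmin, ymin, xmax, ymax = extent
    width = (xmax - xmin) / k
    height = (ymax - ymin) / k

    def index(value: float, low: float, size: float) -> int:
        # Clamp so that objects on the outer border stay inside the grid.
        return min(k - 1, max(0, int((value - low) // size)))

    c0, c1 = index(box[0], xmin, width), index(box[2], xmin, width)
    r0, r1 = index(box[1], ymin, height), index(box[3], ymin, height)
    return [(c, r) for c in range(c0, c1 + 1) for r in range(r0, r1 + 1)]


def partition(objects: Dict[str, Box], extent: Box, k: int) -> Dict[Cell, List[str]]:
    """Assign object ids to cells; boundary-crossing objects are replicated."""
    parts: Dict[Cell, List[str]] = {}
    for oid, box in objects.items():
        for cell in cells_for(box, extent, k):
            parts.setdefault(cell, []).append(oid)
    return parts


def remove_replicas(partial_results: Iterable[Iterable[Tuple[str, str]]]) -> Set[Tuple[str, str]]:
    """Merge per-partition join results, dropping pairs reported more than once."""
    merged: Set[Tuple[str, str]] = set()
    for part in partial_results:
        merged.update(part)
    return merged
```

Replication keeps each partition self-contained, at the price of a final pass that removes pairs found by more than one machine.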

21
Preliminary Tests
[Figure panels: Theme 1, Theme 2]
22
Preliminary Tests
This operation is CPU-bound, and the communication
cost has a low impact on the final response time.
$RT = T_{MSG} \cdot \text{messages} + T_{TX} \cdot \text{bytes} + T_{CPU} + T_{I/O} + T_{\text{remove replicas}}$
Communication cost is based on a bandwidth of
256 kbps.
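The cost terms listed on this slide can be combined in a small Python helper. This is a sketch of one plausible reading, in which the per-message and per-byte terms are multiplied by the number of messages and bytes and all terms are summed; the function name and its parameters are hypothetical, and 256 kbps is taken to mean 256,000 bits per second.

```python
def response_time(messages: int,
                  bytes_sent: int,
                  t_msg: float,
                  t_cpu: float,
                  t_io: float,
                  t_remove_replicas: float = 0.0,
                  bandwidth_bps: float = 256_000.0) -> float:
    """Estimated response time in seconds under the assumed additive model."""
    message_cost = t_msg * messages                   # fixed cost per message
    transfer_cost = bytes_sent * 8.0 / bandwidth_bps  # time to move the bytes
    return message_cost + transfer_cost + t_cpu + t_io + t_remove_replicas
```

With made-up inputs such as 10 messages at 0.05 s each, 64,000 bytes, 30 s of CPU and 4 s of I/O, the communication terms add 2.5 s to 34 s of local work, which is the kind of ratio a CPU-bound operation would show.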
23
Preliminary Tests
[Figure: results for 1, 4 and 9 servers]
The processing cost and the communication cost
tend to reach the same magnitude as the number of
servers increases.
The superlinear speedup means, in this case, that
the computational resources available in a single
machine were insufficient to reach a good
response time.
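Speedup and superlinearity can be checked with a few lines of Python. The definitions below are the usual ones (speedup as the ratio of single-server to parallel response time); the function names are hypothetical and no measured values from the tests are assumed.

```python
def speedup(rt_single: float, rt_parallel: float) -> float:
    """How many times faster the parallel run is than the single-server run."""
    return rt_single / rt_parallel


def efficiency(rt_single: float, rt_parallel: float, servers: int) -> float:
    """Speedup per server; values above 1.0 indicate superlinear speedup."""
    return speedup(rt_single, rt_parallel) / servers


def is_superlinear(rt_single: float, rt_parallel: float, servers: int) -> bool:
    """True when the speedup exceeds the number of servers used."""
    return speedup(rt_single, rt_parallel) > servers
```

For instance, a made-up drop from 100 s on one server to 20 s on four servers is a speedup of 5, which exceeds the number of servers and therefore counts as superlinear.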
24
Test conditions
  • The preliminary tests were executed under the
    following conditions:
    • Spatial database: Secondo
    • Grid middleware: Globus GT4
    • Datasets: two indexed datasets composed of
      10060 triangles
    • Hardware: Sempron 2800, 1GB RAM, 80GB HD
    • OS: Fedora Linux
  • The overall architecture is under construction
    and is based on web services (WSRF).

26
Remarks
  • This work presents an architecture based on grid
    infrastructure, tailored to cover some needs of a
    distributed geographic information system.
  • The focus was on offering a strategy to execute
    spatial queries over spatial databases managed by
    several organizations gathered in a federation.
  • The filter/refine approach was adopted, and it
    tries to use pre-existing spatial indexes on the
    datasets.
  • A global ID structure must be proposed in order
    to:
    • Easily reduce the repeated processing of
      objects that cross boundaries after the
      filtering step (avoiding moving them
      unnecessarily to CEs)
    • Isolate the processing in the SQB from local
      IDs, improving scalability
  • As next steps:
    • Specify new cost models to help the optimizer
      and the scheduler take into account the
      dynamics of the environment
    • Research the scheduling process in order to
      improve the reliability of the architecture
    • Compare the response time of a join executed
      over a benchmark dataset with that of the same
      join executed in similar distributed
      environments

27
References
1. Adzigogov, L., Soldatos, J., and Polymenakos, L. (2005). "EMPEROR: An OGSA Grid Meta-Scheduler based on Dynamic Resource." Journal of Grid Computing, 3, 19-37.
2. Afgan, E. (2004). "Role of the Resource Broker in the Grid." ACM, Huntsville, Alabama, USA.
3. Andretto, P., et al. (2004). "Practical approaches to Grid workload and resource management in the EGEE project."
4. Azevedo, L. G., Monteiro, R. S., Zimbrão, G., and Souza, J. M. (2004). "Approximate Spatial Query Processing Using Raster Signature."
5. Brinkhoff, T., Kriegel, H., and Seeger, B. (1993). "Efficient Processing of Spatial Joins Using R-Trees." In Proceedings of the 1993 ACM SIGMOD, Washington, DC.
6. Brinkhoff, T., Kriegel, H., and Schneider, R. (1994). "Multi-Step Processing of Spatial Joins." Washington, DC, USA, 237-246.
7. Buyya, R., and Venugopal, S. (2004). "The Gridbus Toolkit for Service Oriented Grid and Utility Computing: An Overview and Status Report."
8. Câmara, G., and Queiroz, G. (2002). "GeoBR: Intercâmbio Sintático e Semântico de Dados Espaciais."
9. Di, L., Chen, A., Yang, W., and Zhao, P. (2003). "The Integration of Grid Technology with OGC Web Services (OWS) in NWGISS for NASA EOS Data."
10. EGEE (2006). "GLite - Installation and Configuration Guide v 3.0 (rev 2)." European Union.
11. Egenhofer, M. J., and Herring, J. R. (1994). "Categorizing Binary Topological Relations Between Regions, Lines and Point in Geographical Databases." NCGIA.
12. "Globus Toolkit 4." (2005). www.gridbus.org/escience/051205GlobusTutorialeScience.ppt, July/2006.
13. Foster, I., and Kesselman, C. (1999). "Computational grids." The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann.
14. Foster, I., Kesselman, C., and Tuecke, S. (2001). "The Anatomy of the Grid: Enabling Scalable Virtual Organizations." Lecture Notes in Computer Science, 2150.
15. Gustafson, J. L. (1990). "Fixed Time, Tiered Memory, and Superlinear Speedup."
28
References
16. GridWay Team (2006). "GridWay 5 Documentation: User Guide." Madrid, Spain, Universidad Complutense de Madrid.
17. Güting, R. H., Behr, T., Almeida, V., Ding, Z., Hoffmann, F., and Spiekermann, M. (2004). "Secondo: An Extensible DBMS Architecture and Prototype." Hagen, Germany, Fernuniversität Hagen.
18. Hanssen, G. (2005). "The Filter/Refine Strategy: A Study on the Land-Use Resource Dataset in Norway."
19. Ilya, Z., Memon, A., Petropoulos, M., and Baru, C. (2003). "Online Querying of Heterogeneous Distributed Spatial Data on a Grid." Brno, Cz, 813-823.
20. Kang, M.-S., and Choy, Y.-C. (2002). "Deploying parallel spatial join algorithm for network environment." IEEE, 177-181.
21. Meyer, W. S., and Souza, J. M. (2006). "Overlapped Regions with Distributed Spatial Databases in a Grid Environment." Rio de Janeiro, Brazil.
22. Meyer, W. S., Souza, J. M., and Ramirez, M. R. (2005). "Secondo-grid: An Infrastructure to Study Spatial Databases in Computational Grids." Campos do Jordão, SP, Brazil.
23. Mondal, A., Goda, K., and Kitsuregawa, M. (2003). "Effective Load-Balancing via Migration and Replication in Spatial Grids." Lecture Notes in Computer Science, 2736, 202-211.
24. Özsu, M. T., and Valduriez, P. (2001). "Principles of Distributed Database Systems." Prentice-Hall.
25. Porto, F., Silva, V. F. V., Dutra, M. L., and Schulze, B. (2005). "An adaptive distributed query processing grid service." Trondheim, Norway.
26. Ramirez, M. R. (2001). "Spatial Distributed Query Processing." Rio de Janeiro, RJ, COPPE/UFRJ.
27. Smith, J., Gounaris, A., Watson, P., Paton, N. W., Fernandes, A. A. A., and Sakellariou, R. (2002). "Distributed Query Processing on the Grid."
28. "OGSA-DQP 3.1 User's Documentation." (2006). http://www.ogsadai.org.uk/documentation/ogsa-dqp_3.1/, July/2006.
29. Venugopal, S., Buyya, R., and Winton, L. (2004). "A Grid Service Broker for Scheduling Distributed Data-Oriented Applications on Global Grids."
30. Zimbrão, G., and Souza, J. M. (1998). "A Raster Approximation for the Processing of Spatial Joins." New York, USA, 558-569.
29
Thank You !