Title: Grid Platform for Geospatial Applications
1Grid Platform for Geospatial Applications
Fine Granule Scheduler
- Presented by Bin Zhou
- Bin Zhou, Jibo Xie, Chaowei Yang
- Joint Center for Intelligent Spatial Computing
- George Mason University
2Agenda
- Grid Computing Introduction
- CISC SURA Grid
- Geospatial Applications Require Grid
- CISC Fine Granule Scheduler
- Architecture, Strategy
- Progress Status
3Grid Computing Introduction
- Definition
- Grid computing is an emerging computing
infrastructure that treats all resources as a
collection of manageable entities with common
interfaces to such functionality as lifetime
management, discoverable properties and
accessibility via open protocols (Wikipedia)
- Popular Grid Middleware
- Condor
- Globus
- Condor-G
- Unicore
5GMU grid environment
GMU
CISC GMU Grid can access the computing resources
contributed by SURAgrid member universities
7GMU grid environment
GMU
CISC Grid can set up 1-10 Gbps connections to any of
the LambdaRail-supported universities, agencies,
and centers, such as GSFC and SDSC
8CISC Computing Pool
9Geospatial Requirements
- Large Data Set
- Map Data, Sensor Data, in Terabytes
- Reliability, Interoperability
- Collaboration
- Intensive Computation
- More Complex Algorithms
- Adaptive Algorithms
- Intelligent Processing
10Grid Computing Could Satisfy these requirements
- Reliable File Transfer
- Resource Management and Allocation
- Authorization Control
- Job Control
- Web Service Oriented
11Detecting Watersheds from multi-scale DEM
- Watershed boundaries are not known before processing massive data
- Extract coarse watershed boundaries from multi-scale DEM
- Use the boundaries to decompose the massive data with some redundancy
Figure labels: Extraction, resample (Xie 2006)
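The decomposition idea above can be sketched in a few lines. This is only an illustration under stated assumptions, not the method of Xie 2006: the block-mean `resample`, the helper `fine_window`, and the parameters `factor` and `halo` are all hypothetical names introduced here. The sketch coarsens a DEM, then maps a bounding box found on the coarse grid back to a padded window on the full-resolution grid, where the padding plays the role of the redundancy.

```python
def resample(dem, factor):
    """Coarsen a DEM (list of rows) by averaging factor x factor blocks.

    Block averaging is an assumption; the slide only says "resample".
    Trailing rows/columns that do not fill a whole block are dropped.
    """
    rows, cols = len(dem), len(dem[0])
    coarse = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [dem[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            out_row.append(sum(block) / len(block))
        coarse.append(out_row)
    return coarse


def fine_window(coarse_bbox, factor, halo, fine_shape):
    """Map a coarse-grid bounding box to a padded full-resolution window.

    coarse_bbox: (row_min, col_min, row_max, col_max), inclusive, coarse cells.
    factor:      fine cells per coarse cell along each axis.
    halo:        extra fine cells on every side (the redundancy, hypothetical).
    fine_shape:  (rows, cols) of the full-resolution DEM.
    Returns (row_start, col_start, row_stop, col_stop) with exclusive stops,
    clipped to the DEM extent.
    """
    r0, c0, r1, c1 = coarse_bbox
    rows, cols = fine_shape
    row_start = max(0, r0 * factor - halo)
    col_start = max(0, c0 * factor - halo)
    row_stop = min(rows, (r1 + 1) * factor + halo)
    col_stop = min(cols, (c1 + 1) * factor + halo)
    return row_start, col_start, row_stop, col_stop
```

Neighbouring windows overlap by up to `2 * halo` cells, so each unit can be processed independently even where the coarse boundary is slightly off.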
12Use 24 units to test the speedup (each unit is 3.08 MB)
(Xie 2006)
13CISC Test Applications
Real Time Routing Test Result
CPUs    Job Amount    Executing Time    Speed Up    Efficiency
1       30            1686s             1           1
10      30            374s              4.5         0.45
20      30            322s              5.2         0.26
30      30            293s              5.75        0.19
Efficiency decreases as the number of CPUs grows
because overhead increases, but the major
problem is that Condor cannot handle small jobs
efficiently. This demonstrates the need for a fine granule
scheduler
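The speedup and efficiency figures follow from the measured times. As a quick check, the sketch below recomputes them with the usual definitions, speedup = T(1 CPU) / T(p CPUs) and efficiency = speedup / p. The pairing of CPU counts with times is inferred from the slide's numbers, and the function names are illustrative.

```python
# Wall-clock times from the Real Time Routing test (30 jobs per run).
# Mapping of CPU count -> seconds is inferred from the slide's figures.
times = {1: 1686, 10: 374, 20: 322, 30: 293}


def speedup(t_serial, t_parallel):
    """Ratio of single-CPU time to multi-CPU time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, cpus):
    """Speedup per CPU; 1.0 would mean perfect scaling."""
    return speedup(t_serial, t_parallel) / cpus


if __name__ == "__main__":
    t1 = times[1]
    for cpus in sorted(times):
        s = speedup(t1, times[cpus])
        e = efficiency(t1, times[cpus], cpus)
        print(f"{cpus:>2} CPUs: speedup {s:.2f}, efficiency {e:.2f}")
```

The computed values agree with the reported ones to within rounding. Going from 10 to 30 CPUs saves only 81 s, which is consistent with per-job overhead dominating the run.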
14Specific Applications: Fine-Grained Near Real Time Jobs
- Fine-Grained
- Very Short Executing Time
- Huge Amount
- Job Similarity
- Near Real Time
- Sensitive to scheduling latency
- Examples: real-time routing, short-term stock prediction
- Condor cannot be used for tasks that require less than 3.5 min to complete
  (Gregg Cooke, IT Technical Council, "Evaluating Condor for Enterprise Use: A UBS Case Study")
15CISC Scheduler
- Purpose
- Improve near real time job response time
- Improve throughput for large numbers of fine-granularity jobs
- Scheduling Strategy
- Short Communicating Message
- Simple Match-Making Function
- Dynamic Index
- Multi-Dispatch
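The slide does not define these strategy terms, so the following is only one possible reading, not the CISC scheduler itself. It assumes "Simple Match-Making" means a single cheap predicate per job/worker pair and "Multi-Dispatch" means handing several queued jobs to a worker in one pass, so that per-message latency is amortized over the batch. The names `match`, `dispatch`, `mem_mb`, and `batch_size` are hypothetical.

```python
from collections import deque


def match(job, worker):
    """Simple match-making (assumed): worker is idle and has enough memory."""
    return worker["idle"] and worker["mem_mb"] >= job["mem_mb"]


def dispatch(jobs, workers, batch_size):
    """Assign up to batch_size queued jobs to each matching worker.

    Returns (plan, remaining): plan maps worker name -> list of job ids,
    remaining is the list of jobs that found no worker in this pass.
    Workers that receive a batch are marked busy (their dict is mutated).
    """
    queue = deque(jobs)
    plan = {}
    for worker in workers:
        batch = []
        skipped = deque()
        while queue and len(batch) < batch_size:
            job = queue.popleft()
            if match(job, worker):
                batch.append(job["id"])
            else:
                skipped.append(job)
        # Return non-matching jobs to the front of the queue in original order.
        queue.extendleft(reversed(skipped))
        if batch:
            plan[worker["name"]] = batch
            worker["idle"] = False
    return plan, list(queue)
```

With a batch size of 1 this degenerates to one message per job, which is the regime where scheduling latency dominates very short jobs.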
16System Architecture
Figure labels: Central Manager, Worker, User Interface; Abstract Interface/APIs; Services, Container, Algorithm module, Collector, Submitter, Lib, Dispatcher, Resource Manager, File Transfer, Process, Other; Message passing, Memory, TCP/UDP Socket, System Function
17Components
18Job Work Flow
19Prototype Overhead Test
- Test Case
- Insertion sort of 200,000 integers
- Dataset: 5.56 MB
- Executable file: 1.8 MB
- Test Platform
- OS: Ubuntu 6.10; Network: 100 Mbps
- CPU: Celeron M 1.6 GHz; Memory: 1 GB
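A test job of this kind might look like the sketch below. It is not the prototype's actual executable: the helper `run_job`, the random input, and the timing method are assumptions, and the input here is far smaller than the slide's 200,000 integers so that it finishes quickly in an interpreter. Timing the sort by itself gives the pure compute time; scheduler overhead would then be the end-to-end turnaround minus this figure.

```python
import random
import time


def insertion_sort(values):
    """Sort a list in place with insertion sort (O(n^2) worst case)."""
    for i in range(1, len(values)):
        key = values[i]
        j = i - 1
        # Shift larger elements one slot right to open a gap for key.
        while j >= 0 and values[j] > key:
            values[j + 1] = values[j]
            j -= 1
        values[j + 1] = key
    return values


def run_job(n, seed=0):
    """Generate n pseudo-random integers, sort them, return (data, seconds)."""
    rng = random.Random(seed)
    data = [rng.randint(0, 1_000_000) for _ in range(n)]
    start = time.perf_counter()
    insertion_sort(data)
    elapsed = time.perf_counter() - start
    return data, elapsed


if __name__ == "__main__":
    # The slide's test uses 200,000 integers; 2,000 keeps this demo fast.
    data, elapsed = run_job(2_000)
    print(f"sorted {len(data)} integers in {elapsed:.3f} s")
```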
20Thanks