Taiwan UniGrid - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Taiwan UniGrid


1
Taiwan UniGrid
  • Yeh-Ching Chung
  • Department of Computer Science
  • National Tsing Hua University
  • Hsin-Chu, 300, Taiwan

2
Outline
  • Introduction
  • Portal
  • Broker and Scheduler
  • Resource Information Service
  • Storage Service
  • Applications
  • Conclusion

3
Introduction (1)
  • The purpose of grid computing is to integrate
    various resources within a large network
    environment.
  • The purpose of the UniGrid project is to build a
    platform for academic research using grid-related
    technologies in Taiwan.

4
Introduction (2)
  • Nine institutes are jointly developing the system
  • ????
  • ???????
  • ??????
  • ???????
  • ???????
  • ???????
  • ???????
  • ????????????
  • ?????????

5
Introduction (3)
  • All institutes that participate in the UniGrid
    project contribute some resources.
  • These resources can be used in collaboration for
    large scale applications.

6
Introduction (4)
  • System Architecture

7
Outline
  • Introduction
  • Portal
  • Broker and Scheduler
  • Resource Information Service
  • Storage Service
  • Applications
  • Conclusion

8
Portal
  • The UniGrid portal provides an interface for
    UniGrid users to use the resources available in
    the UniGrid system.
  • Functionalities of the portal
  • System status monitoring
  • Single sign-on
  • User workflow management
  • Project information

9
System Status Monitoring (1)
  • UniGrid users can examine the status of system
    resources through the portal.
  • The portal gathers the current system information
    from the information service and presents this
    information to the users.

10
System Status Monitoring (2)
  • Screenshot of the system status monitoring web
    page

11
Single Sign-On (1)
  • Single sign-on is a mechanism whereby a single
    authentication can permit a user to access all
    resources where he has access permission, without
    the need to enter multiple passwords.
  • All user account information is kept in a
    database at the portal site.
  • When a user requests a service, his verification
    data is passed to that service.
  • The request is granted only if the identity
    is verified by the verification web service.
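
The single sign-on flow described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the portal's actual implementation: the in-memory account table, the random token format, and the names `Portal`, `verify`, and `request_service` are all hypothetical.

```python
import hashlib
import hmac
import secrets


class Portal:
    """Hypothetical portal: keeps the account table and verifies tokens."""

    def __init__(self):
        self._accounts = {}  # user -> (salt, password hash)
        self._tokens = {}    # token -> user

    @staticmethod
    def _hash(password, salt):
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    def register(self, user, password):
        salt = secrets.token_bytes(16)
        self._accounts[user] = (salt, self._hash(password, salt))

    def login(self, user, password):
        """Authenticate once; return a token, or None on failure."""
        record = self._accounts.get(user)
        if record is None:
            return None
        salt, stored = record
        if not hmac.compare_digest(stored, self._hash(password, salt)):
            return None
        token = secrets.token_hex(16)
        self._tokens[token] = user
        return token

    def verify(self, token):
        """Verification service: map a token back to a user, or None."""
        return self._tokens.get(token)


def request_service(portal, token, service_name):
    """A service grants the request only if the portal verifies the token."""
    user = portal.verify(token)
    if user is None:
        return f"{service_name}: denied"
    return f"{service_name}: granted to {user}"
```

The user enters a password once at `login`; every later service call carries only the token, which each service checks against the portal's verification function.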

12
Single Sign-On (2)
  • User identity verification through single sign-on
    service

13
User Workflow Management (1)
  • A UniGrid user can design and save his own
    workflows at the UniGrid portal.
  • A user can select any workflow he designed and
    execute the workflow through the UniGrid portal.
  • A user can also monitor the status of his
    workflow through the UniGrid portal.

14
User Workflow Management (2)
  • Structure of a workflow

[Figure: workflow diagram; labels: parallel execution, sequential execution]

15
User Workflow Management (3)
  • Each user's workflows are stored in the
    portal storage in XML format.

    <flow name="testflow" numstages="3">
      <stage name="stage1" numjobs="1">
        <job id="0">
          <sortkey>1</sortkey> <runtype>mpi</runtype>
          <workdir>/home/test/</workdir>
          <filename>mm_mpi</filename>
          <runrp>true</runrp> <datafile/> <argu>256</argu>
          <otherurl/> <cpuno>4</cpuno>
        </job>
      </stage>
    </flow>
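
As a rough illustration of how a consumer such as the broker might read this format, the sketch below parses a trimmed copy of the workflow above with Python's standard `xml.etree.ElementTree`. The function name `load_jobs` and the keys of the returned dictionaries are assumptions; only the element and attribute names come from the slide.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the slide's example (only the first stage is shown there).
WORKFLOW_XML = """
<flow name="testflow" numstages="3">
  <stage name="stage1" numjobs="1">
    <job id="0">
      <sortkey>1</sortkey>
      <runtype>mpi</runtype>
      <workdir>/home/test/</workdir>
      <filename>mm_mpi</filename>
      <argu>256</argu>
      <cpuno>4</cpuno>
    </job>
  </stage>
</flow>
"""


def load_jobs(xml_text):
    """Return one dictionary per job, in document order."""
    root = ET.fromstring(xml_text.strip())
    jobs = []
    for stage in root.findall("stage"):
        for job in stage.findall("job"):
            jobs.append({
                "stage": stage.get("name"),
                "id": int(job.get("id")),
                "runtype": job.findtext("runtype"),
                "workdir": job.findtext("workdir"),
                "filename": job.findtext("filename"),
                "args": job.findtext("argu"),
                "cpus": int(job.findtext("cpuno")),
            })
    return jobs


if __name__ == "__main__":
    for job in load_jobs(WORKFLOW_XML):
        print(job)
```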

16
User Workflow Management (4)
  • Screenshot of the workflow editing web page

17
User Workflow Management (5)
  • When a user submits a workflow, the portal
    passes the selected workflow information to the
    broker.
  • Upon receiving an execution request, the resource
    broker will find the required resource for that
    workflow and schedule its execution.

18
User Workflow Management (6)
19
User Workflow Management (7)
  • A user can examine the execution status of his
    workflow through the portal's workflow monitoring
    system.
  • All workflow execution information is stored in
    a database on the machine where the resource
    broker is installed.
  • The portal queries the database and obtains the
    current status of a particular workflow.
  • The status information is processed and presented
    in the form of web pages.

20
User Workflow Management (8)
  • Screenshot of the workflow monitoring web page

21
User Workflow Management (9)
  • Screenshot of the UniGrid workflow management web
    page

22
Outline
  • Introduction
  • Portal
  • Broker and Scheduler
  • Resource Information Service
  • Storage Service
  • Applications
  • Conclusion

23
Broker and Scheduler (1)
  • The broker provides a uniform interface to access
    available resources in the UniGrid system.
  • The broker uses the resource information service
    to obtain the current status of the resources in
    the system.
  • After this information is gathered, the broker
    allocates the resources that meet the
    requirements of the current job.
  • The jobs are then passed to the corresponding
    local schedulers to be executed locally.
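
The broker steps listed on this slide (query status, allocate, hand off to a local scheduler) can be sketched as below. This is a simplified illustration, not the UniGrid broker: the status snapshot as a `{site: free_cpus}` dictionary, the "most free CPUs" tie-breaking rule, and the name `broker_submit` are all assumptions.

```python
def broker_submit(job, query_status, local_queues):
    """Place one job on a site and hand it to that site's local queue.

    job          -- dict with at least a "cpus" entry (hypothetical shape)
    query_status -- callable returning {site_name: free_cpus}; stands in
                    for the resource information service
    local_queues -- {site_name: list}; stands in for the local schedulers
    Returns the chosen site name, or None if no site can run the job.
    """
    status = query_status()
    candidates = [site for site, free in status.items() if free >= job["cpus"]]
    if not candidates:
        return None
    # Assumed policy: prefer the site with the most free CPUs.
    site = max(candidates, key=lambda name: status[name])
    local_queues[site].append(job)
    return site
```

A job that fits nowhere is left unplaced, so a caller would have to decide whether to retry later or queue it globally.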

24
Broker and Scheduler (2)
  • Broker workflow

25
Broker and Scheduler (3)
  • Each participating organization has a local
    scheduler (Condor) installed to schedule the jobs
    assigned to that organization.
  • Condor
  • A scheduler for large collections of
    distributively owned computing resources
  • Developed by researchers at the University of
    Wisconsin
  • Specialized for compute-intensive jobs
  • Uses the ClassAd mechanism to match job
    requirements to machine status and schedule the
    jobs according to the matching results
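
The idea behind ClassAd matchmaking can be illustrated with a toy model in which both the job and the machine advertise attributes plus a requirements test, and the job ranks the machines that pass. This sketch does not use Condor's ClassAd language or API; the dictionary layout, the attribute names (`arch`, `memory_mb`, `owner`), and the function `match` are hypothetical.

```python
def match(job_ad, machine_ads):
    """Return the name of the best compatible machine, or None.

    A machine is compatible when the job's requirements accept the
    machine and the machine's requirements accept the job. Among
    compatible machines, the one with the highest job rank wins.
    """
    compatible = [
        machine for machine in machine_ads
        if job_ad["requirements"](machine) and machine["requirements"](job_ad)
    ]
    if not compatible:
        return None
    return max(compatible, key=job_ad["rank"])["name"]


# Hypothetical machine advertisements.
machines = [
    {"name": "uninode11", "arch": "x86", "memory_mb": 1024,
     "requirements": lambda job: job["owner"] != "guest"},
    {"name": "uninode12", "arch": "x86", "memory_mb": 4096,
     "requirements": lambda job: True},
]

# Hypothetical job advertisement: needs x86 and 512 MB, prefers more memory.
job = {
    "owner": "test",
    "requirements": lambda m: m["arch"] == "x86" and m["memory_mb"] >= 512,
    "rank": lambda m: m["memory_mb"],
}

if __name__ == "__main__":
    print(match(job, machines))
```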

26
Related Research (1)
  • Tools have been developed to simulate different
    load sharing and scheduling policies on computing
    grids and to analyze their performance
  • Queuing methods
  • Independent clusters
  • Multiple queues
  • Forwarding to no-need-to-wait site
  • Forwarding to shortest-queue site
  • Forwarding to least-load site

27
Related Research (2)
  • Queuing methods (contd.)
  • Single queue
  • Multi-pool centralized queue
  • Single-pool centralized queue
  • One big cluster
  • Two-level scheduling
  • Empty queue only
  • Shortest queue first
  • Least load first
  • Two-level local queues
  • Forwarding to shortest-queue site
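
The forwarding policies named on these two slides differ only in how a target site is chosen. The sketch below shows one possible reading of three of them. The site record fields and the definition of load as queued work per processor are assumptions, since the slides do not give them.

```python
def no_wait_site(sites, job_cpus):
    """Forward to the first site that can start the job immediately."""
    for site in sites:
        if site["free_cpus"] >= job_cpus:
            return site["name"]
    return None


def shortest_queue_site(sites):
    """Forward to the site with the fewest queued jobs."""
    return min(sites, key=lambda s: s["queued_jobs"])["name"]


def least_load_site(sites):
    """Forward to the site with the least load.

    Load is assumed here to be queued work divided by processor count.
    """
    return min(sites, key=lambda s: s["queued_work"] / s["cpus"])["name"]


# Hypothetical snapshot of three sites.
SITES = [
    {"name": "A", "cpus": 16, "free_cpus": 0, "queued_jobs": 2, "queued_work": 64.0},
    {"name": "B", "cpus": 8, "free_cpus": 4, "queued_jobs": 3, "queued_work": 16.0},
    {"name": "C", "cpus": 32, "free_cpus": 2, "queued_jobs": 5, "queued_work": 32.0},
]
```

With this snapshot the three policies pick three different sites, which is why the choice of policy matters for the comparison.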

28
Related Research (3)
  • Scheduling policies
  • Non-FCFS
  • Multi-pool centralized queue
  • Single-pool centralized queue
  • FCFS
  • Two-level scheduling
  • The performance of Non-FCFS is three times better
    than FCFS

29
Related Research (4)
  • Implementation Approaches
  • Multi-Pool Centralized Queue
  • Global queue scheduling in the broker, no local
    queuing system
  • Global queue scheduling in the broker, with
    processor availability confirmed through the
    local queuing system
  • Single-Pool Centralized Queue
  • Global queue scheduling in the broker, no local
    queuing system

30
Related Research (5)
  • Two-Level Scheduling (Empty-Queue-Only
    Multi-Pool Grid)
  • Global queue in the broker, local queues in the
    local queuing systems

31
Related Research (6)
  • Simulation results

32
Related Research (7)
  • Simulation results (contd.)

33
Related Research (8)
  • Discussion
  • Non-FCFS methods can effectively improve the
    overall system utilization and performance.
  • The smallest first non-FCFS policy outperforms
    all other policies in terms of waiting time and
    waiting ratio.
  • As far as the worst case is concerned, the
    backfilling policy is superior because it does
    not allow jobs to be delayed by the backfilling
    activities.
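
To see why reordering the queue can shorten waiting time, the toy simulation below compares FCFS with a smallest-first order on a single pool. It is a sketch under strong assumptions and does not reproduce the simulation results cited on these slides: all jobs are submitted at time 0, jobs start strictly in queue order with no backfilling, and "smallest" is taken to mean fewest requested CPUs.

```python
import heapq


def average_wait(jobs, total_cpus, smallest_first=False):
    """Average waiting time for jobs given as (cpus, runtime) pairs."""
    queue = sorted(jobs, key=lambda j: j[0]) if smallest_first else list(jobs)
    running = []  # min-heap of (finish_time, cpus)
    free = total_cpus
    now = 0.0
    total_wait = 0.0
    for cpus, runtime in queue:
        if cpus > total_cpus:
            raise ValueError("job needs more CPUs than the pool has")
        # Wait for running jobs to finish until enough CPUs are free.
        while free < cpus:
            finish, released = heapq.heappop(running)
            now = max(now, finish)
            free += released
        total_wait += now  # every job was submitted at time 0
        free -= cpus
        heapq.heappush(running, (now + runtime, cpus))
    return total_wait / len(queue)


if __name__ == "__main__":
    jobs = [(8, 10), (2, 1), (2, 1), (2, 1)]
    print("FCFS:", average_wait(jobs, total_cpus=8))
    print("smallest first:", average_wait(jobs, total_cpus=8, smallest_first=True))
```

In this example one wide job at the head of the FCFS queue blocks three narrow ones, while the smallest-first order lets the narrow jobs run at once.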

34
Outline
  • Introduction
  • Portal
  • Broker and Scheduler
  • Resource Information Service
  • Storage Service
  • Applications
  • Conclusion

35
Resource Information Services
  • The resource information service provides
    information about current resource status. This
    information can be used by other services of the
    system.
  • Functionalities of the resource information
    service
  • Information system
  • Performance visualization of MPI parallel
    program execution

36
Information System (1)
  • Provides an interface for other services to query
    various information about computing nodes
  • The statistics about the individual nodes are
    obtained using MDS (Monitoring and Discovery
    Service) provided by the Globus Toolkit
  • The current network status between machines is
    gathered using NWS (Network Weather Service)
  • Automatic update of node information
  • When a computing node is added or removed
37
Information System (2)
  • The Network Weather Service (NWS)
  • A distributed system that periodically monitors
    and dynamically forecasts the performance various
    network and computational resources can deliver
    over a given time interval
  • Developed by the researchers at UCSB
  • It uses numerical models to generate forecasts of
    what the conditions will be for a given time
    frame
  • Because this functionality is analogous to
    weather forecasting, the system is called Network
    Weather Service
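
The forecasting idea can be illustrated with a small sketch that runs several simple predictors over a measurement history and reports the forecast of whichever predictor has had the lowest cumulative error so far. This is written in the spirit of the description above and is not NWS code: the predictor set, the error measure, and the sample values are assumptions.

```python
from statistics import mean, median

# Hypothetical predictor set; each maps a non-empty history to a forecast.
PREDICTORS = {
    "last": lambda history: history[-1],
    "mean": lambda history: mean(history),
    "median5": lambda history: median(history[-5:]),
}


def forecast(history):
    """Return (predictor_name, predicted_next_value).

    Each predictor is replayed over the history and scored by its
    cumulative absolute error; the best scorer makes the forecast.
    """
    if len(history) < 2:
        raise ValueError("need at least two measurements")
    errors = {name: 0.0 for name in PREDICTORS}
    for i in range(1, len(history)):
        past, actual = history[:i], history[i]
        for name, predict in PREDICTORS.items():
            errors[name] += abs(predict(past) - actual)
    best = min(errors, key=errors.get)
    return best, PREDICTORS[best](history)


if __name__ == "__main__":
    # Hypothetical bandwidth samples with one outlier.
    print(forecast([10, 10, 10, 10, 50, 10, 10]))
```

For a series with a single spike, the sliding median recovers fastest, so it is the predictor selected for the next forecast.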

38
Information System (3)
39
Information System (4)
  • Screenshot of the node status webpage

40
Performance Visualization of MPI Programs (1)
  • Input: any application (depending on the
    availability of a compiler on the grid platform)
  • Output: performance visualization of the
    execution of this application

41
Performance Visualization of MPI Programs (2)
  • Execution of a Parallel Application using 4
    computing nodes

42
Related Research (1)
  • Communication localization and data partitioning
    techniques in cluster-based grid systems
  • Localized communication enhances the performance
    of parallel applications on the grid
  • Adaptive data partitioning for identical-cluster
    and non-identical-cluster grid topologies
  • In-core and out-of-core applications

43
Related Research (2)
  • Communication localization techniques for
    identical cluster

[Figure panels: original communication patterns; localized communication patterns]

44
Related Research (3)
  • Communication localization techniques for
    non-identical cluster

Original communication table
45
Related Research (4)
  • Communication localization techniques for
    non-identical cluster (contd.)

Localized communication table
46
Outline
  • Introduction
  • Portal
  • Broker and Scheduler
  • Resource Information Service
  • Storage Service
  • Applications
  • Conclusion

47
Storage Service
  • The goal of storage service is to provide a
    collaborative space where UniGrid users can share
    their data and resources with others.
  • Components of the storage service
  • Virtual storage system
  • Data management system

48
Virtual Storage System (1)
  • Virtual storage system architecture

49
Virtual Storage System (2)
  • The virtual storage system is implemented in
    Java as a web service
  • UniGrid services access the virtual storage
    system when they need to fetch or modify users'
    data files
  • A client program is available for each user to
    manage his own storage space
  • The files are stored in a master file server and
    replicas of the files are distributed to other
    machines

50
Virtual Storage System (3)
51
Virtual Storage System (4)
  • Screenshot of the storage service client program

52
Data Management (1)
  • The data management system is a web-based
    replica access and management system
  • It consists of the registration, search, and
    manager systems
  • The registration system manages the users who
    access the UniGrid system
  • The search system combines the RLS with web
    technology
  • The manager system offers a friendly interface
    that makes it easy for the manager to maintain
    the contents of the database
  • Structure of the Data Management System

53
Data Management (2)
  • Replica Location Service

54
Data Management (3)
  • The Registration System
  • Security
  • We designed a web registration system
  • Users need to be registered in the portal and
    logged in through the CA (proxy-init)
  • Account management
  • Administrator
  • User
  • The detailed structure of Web Service System

55
Data Management (4)
  • The Search System
  • Replica Index and Replica Location
  • On the LRC server, we can execute the basic
    commands.
  • We can update the information on the RLI server
    using batch commands
  • Services
  • We offer job submission, file listing, file
    upload, and data replication services on a
    single server
  • The detailed structure of Web Service System
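
The division of labor between a local replica catalog and an index server can be sketched as follows. This is an in-memory illustration only and does not use the Globus RLS client API: the class names, method names, and the example file locations are hypothetical.

```python
class LocalReplicaCatalog:
    """Per-site catalog: logical file name -> physical locations."""

    def __init__(self, site):
        self.site = site
        self._mappings = {}

    def add(self, logical_name, physical_location):
        self._mappings.setdefault(logical_name, set()).add(physical_location)

    def lookup(self, logical_name):
        return sorted(self._mappings.get(logical_name, ()))

    def logical_names(self):
        return set(self._mappings)


class ReplicaIndex:
    """Index server: logical file name -> catalogs that know about it."""

    def __init__(self):
        self._index = {}

    def batch_update(self, catalog):
        """Record every logical name the given catalog currently holds."""
        for name in catalog.logical_names():
            self._index.setdefault(name, {})[catalog.site] = catalog

    def locate(self, logical_name):
        """Ask each indexed catalog for the physical locations of a file."""
        locations = []
        catalogs = self._index.get(logical_name, {})
        for site in sorted(catalogs):
            locations.extend(catalogs[site].lookup(logical_name))
        return locations
```

A search first asks the index which catalogs hold a logical name, then asks only those catalogs for physical locations, so the index itself never has to store full paths.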

56
Data Management (5)
  • The Manager System
  • We plan to design a friendly interface that
    makes it easy for the manager to maintain the
    contents of the metadata database, update the
    RLS database, and manage user accounts
  • The detailed structure of Web Service System

57
Outline
  • Introduction
  • Portal
  • Broker and Scheduler
  • Resource Information Service
  • Storage Service
  • Applications
  • Conclusion

58
Applications 1
  • Simulations of atmospheric circulations with the
    NTU/Purdue nonhydrostatic numerical model
  • Model characteristics
  • Nonhydrostatic
  • Explicit forward-backward integration for both
    high-frequency waves and gravity waves
  • Implicit diffusion scheme with a TKE prognostic
    equation
  • Time split schemes for high-frequency waves,
    gravity waves, diffusion, and surface processes
  • Physical processes
  • Cloud microphysics
  • Surface similarity equation
  • 3-layer soil model
  • Coriolis force

59
Performance with the UniGrid
12 hr    50 sec    30 min    5 hr    17 min
12 hr    50 sec    31 min    5 hr    35 min

60
Commands for submitting jobs
/opt/mpich/pgi/bin/mpirun -nolocal -machinefile host -np 8 nonh3d.exe > test

host (NTU)
uninode11:2
uninode12:2
uninode14:2
uninode15:2

makefile
OBJ = nonh3d.o tograds.o copy.o update.o sound.o adv.o cloud1.o dampini.o \
      initial.o restart.o nbr2d.o startend.o tkeeq.o updtrp.o \
      sprogi4.o sprogi2.o diffxy.o diffz.o pbl.o
EXE = ../nonh3d.exe
OPT = -O3 -Mextend -Msave -Bstatic -byteswapio
OPT = -O3 -static -ffixed-line-length-80
OPT = -O3 -static
OPT = -O3
OPT = -static
$(EXE): $(OBJ)
	/opt/mpich/pgi/bin/mpif77 $(OPT) -o $(EXE) $(OBJ)
.f.o:
	/opt/mpich/pgi/bin/mpif77 $(OPT) -c $<
clean:
	rm -f *.o ../nonh3d.exe

host1
uninode11
uninode11
uninode10
uninode10
uninode12
uninode12
uninode14
uninode14
uninode15
uninode15
uninode5
uninode5
uninode7
uninode7
uninode9
uninode9

61
Three-dimensional simulation of a thermal bubble
in an isentropic environment
The initial spherical bubble develops into a
mushroom-like shape. Two isentropic surfaces are
shown. The isentropic surface corresponding to a
higher potential temperature is in pink.

62
Two-dimensional simulation of a sea breeze
[Figure: x axis from 0 to 30 km; z axis from 0 to 10 km; the sea breeze front is labeled SBF]
The figure shows the total water mixing ratio
(vapor plus liquid) over land after 2.5 hr. The
label under the x-axis is the distance from the
coastline. Water vapor is pumped up from the
ground surface in the convective boundary layer
(with the red/orange color representing high
water vapor content in the air). The location of
the sea breeze front (SBF) is shown.
63
Applications 2
  • FASTA
  • Compares a protein sequence to another protein
    sequence or to a protein database, or a DNA
    sequence to another DNA sequence or a DNA library

64
Applications 3
  • ClustalW
  • A general purpose multiple sequence alignment
    program for DNA or proteins.

65
Conclusions and Future Work
  • A prototype of the UniGrid system has been developed
  • Enhance the data grid part of UniGrid
  • Promote the UniGrid system to universities in
    Taiwan