Title: QoS-based Scheduling of e-Research Application Workflows on Global Grids
1QoS-based Scheduling of e-Research Application
Workflows on Global Grids
Grid Computing and Distributed Systems (GRIDS) Laboratory
Dept. of Computer Science and Software Engineering
The University of Melbourne, Australia
www.gridbus.org
Gridbus Sponsors
2GRIDS Lab @ Melbourne
Education
R&D
- The youngest and one of the most rapidly growing research labs in our School/University
- Founded in 2002
- Houses
  - Research Fellows/PostDocs
  - Research Programmers
  - PhD candidates
  - Honours/Masters students
- Funding
  - National and international organizations
  - Australian Research Council, DEST
  - Many industries (Sun, StorageTek, Microsoft, IBM)
- University-wide collaboration
  - Faculties of Science, Engineering, and Medicine
- Many national and international collaborations
  - Academics
  - Industries
- Software
  - Widely used by academic and industrial users
- Publication
Community Services, e.g., IEEE TC for Scalable Computing
3Agenda
- Introduction
  - Utility Networks and Grid Computing
  - Application Drivers and Various Types of Grid Services
- Global Grids and Challenges
  - Security, resource management, pricing models
- Service-Oriented Grid Architecture and Gridbus
  - Market-based Management and Gridbus Software Stack
- Grid Workflows and QoS Scheduling
  - Architecture, Design and Implementation
  - Performance Evaluation: simulation-based workflows
- SLA-based Resource Allocation
  - Utility-based allocation, pricing, performance results
- Summary and Conclusion
4Power Grid Inspiration: Seamlessly delivering electricity as a utility to users
5(5) Computing Grid: Delivering IT services as the 5th utility after water, gas, electricity, and telephone
eScience eBusiness eGovernment eHealth Multilingual eEducation
6Grid-like Vision
- In 1969, Leonard Kleinrock, one of the chief scientists of the original ARPA project which seeded the Internet, wrote:
  - "As of now, computer networks are still in their infancy, but as they grow up and become sophisticated, we will probably see the spread of 'computer utilities', which, like present electric and telephone utilities, will service individual homes and offices across the country."
- Despite major advances in hardware and software systems over the past 35 years, we are yet to realize this vision. How far are we still from delivering computing as a utility?
7Computing and Communication Technologies Evolution 1960-2010!
[Figure: timeline from 1960 to 2010. Computing: Mainframes, Minicomputers, Crays, MPPs, PCs, Workstations, XEROX PARC worm, PC Clusters, WS Clusters, HTC, P2P, PDAs, Grids, e-Science, e-Business, Computing as Utility. Communication: Sputnik, ARPANET, Email, Ethernet, TCP/IP, IETF, Internet Era, HTML, Mosaic, WWW Era, W3C, XML, Web Services, SocialNet. Control: Centralised, Decentralised.]
8What is Grid? (It means different things to
different people)
- IBM
- On Demand Computing
- Microsoft
- .NET
- Oracle
- 10g
- Sun
- N1 Sun Grid Engine
- HP
- Adaptive Enterprise
- Amazon
- Electric Cloud Services
- United Devices and related companies
- Harvesting Unused Desktop resources
9What is Grid? (Buyya et al.)
- A type of parallel and distributed system that enables the sharing, exchange, selection, and aggregation of geographically distributed autonomous resources:
  - Computers: PCs, workstations, clusters, supercomputers, laptops, notebooks, mobile devices, PDAs, etc.
  - Software: e.g., ASPs renting expensive special-purpose applications on demand
  - Catalogued data and databases: e.g., transparent access to the human genome database
  - Special devices/instruments: e.g., a radio telescope (SETI@Home searching for life in the galaxy)
  - People/collaborators
- depending on their availability, capability, cost, and user QoS requirements.
Wide area
10What does a Grid look like? A Bird's Eye View of a Global Grid
[Figure: an Application, Grid Resource Brokers, Grid Information Services, and resources R1 to R6 ... RN.]
11Classes of Grid Services / Types of Grids
- Computational Services: CPU cycles
  - Pooling computing power: SETI@Home, TeraGrid, AusGrid, ChinaGrid, IndiaGrid, UK Grid, ...
- Data Services
  - Collaborative sharing of data generated by instruments, sensors, and persons: LHC Grid, Napster
- Application Services
  - Access to remote software/libraries and license management: NetSolve
- Interaction Services
  - eLearning, Virtual Tables, Group Communication (Access Grid), Gaming
- Knowledge Services
  - The way knowledge is acquired, processed, and managed: data mining
- Utility Computing Services
  - Towards market-based Grid computing: leasing and delivering Grid services as ICT utilities
[Figure: stack of grid types: Utility Grid, Knowledge Grid, Interaction Grid, ASP Grid, Data Grid, Computational Grid. Other labels: Users, infrastructure.]
12How Are Grids Used?
Utility computing
High-performance computing
Collaborative design
Financial modeling
High-energy physics
E-Business
Drug discovery
Life sciences
Data center automation
E-Science
Natural language processing
Data Mining
Collaborative data-sharing
13e-Science Environment Supporting Collaborative Science
[Figure: an E-Scientist and peers sharing ideas and collaborative interpretation of data/results over a Cyberinfrastructure of distributed data, remote visualization, distributed computation, and distributed instruments. Other label: Data Compute Service.]
15Grid Challenges
16Some Grid Initiatives Worldwide
- Australia
- Nimrod-G
- Gridbus
- DISCWorld
- GrangeNet.
- APACGrid
- ARC eResearch
- Brazil
- OurGrid, EasyGrid
- LNCC-Grid, many others
- China
- ChinaGrid Education
- CNGrid - application
- Europe
- UK eScience
- EU Grids..
- and many more...
- India
- Garuda
- USA
- Globus
- GridSec
- AccessGrid
- TeraGrid
- Cyberinfrastructure
- and many more...
- Industry Initiatives
- IBM On Demand Computing
- HP Adaptive Computing
- Sun N1
- Microsoft - .NET
- Oracle 10g
- Infosys Enterprise Grid
- Satyam Business Grid
- StorageTek Grid..
- and many more
- Public Forums
- Global Grid Forum
27 million
1.3 billion 3 yrs
2? billion
120million 5 yrs
450million 5 yrs
486million 5 yrs
1.3 billion (Rs)
1 billion 5 yrs
http://www.gridcomputing.com
17Open-Source Grid Middleware Projects
18Driving Theme: Community Grids vs. Utility Grids
Feature / Type       Community Grids                 Utility Grids
User QoS             Best effort                     Contract/SLA
Service Pricing      Not considered / free access    Usage, QoS level, market supply and demand
Example Middleware   Globus, Condor, OMII, Unicore   Nimrod-G, Gridbus, many inspired efforts
19The Gridbus Project @ Melbourne: Enable Leasing of ICT Services on Demand
WWG
Gridbus
Pushes Grid computing into mainstream computing
22What do Grid players want?
- Grid Consumers
  - Execute jobs for solving problems of varying size and complexity
  - Benefit by utilizing distributed resources wisely
  - Tradeoff: timeframe and cost
  - Strategy: minimise expenses
- Grid Providers
  - Contribute resources for executing consumer jobs
  - Benefit by maximizing resource utilisation
  - Tradeoff: local requirements and market opportunity
  - Strategy: maximise return on investment
23What do Grid players require?
- They need tools and technologies that help them in value expression, value translation, and value enforcement.
- Grid Service Consumers (GSCs)
  - How do I express QoS requirements?
  - How do I trade off between timeframe and cost?
  - How do I map jobs to resources to meet my QoS needs?
  - How do I manage Grid dynamics and get my work done?
- Grid Service Providers (GSPs)
  - How do I decide service pricing models?
  - How do I specify them?
  - How do I translate them into resource allocations?
  - How do I enforce them?
  - How do I advertise and attract consumers?
  - How do I do accounting and handle payments?
24Principle 1: Service Oriented Architecture (SOA)
- A SOA is a contractual architecture for offering and consuming software as services.
- There are four entities that make up an SOA:
  - service provider,
  - service registry,
  - service consumer (also known as service requestor), and
  - service definition or contract, which defines the functions or tasks that the service provider offers, along with other functional and technical information required for consumption.
[Figure: registry, contract, provider, consumer.]
25Principle 2: Market-Oriented (Grid) Computing: (a) Sustained Resource Sharing and (b) Effective Management of Shared Resources
Grid Economy
26Market-based Systems: Self-managed and Self-regulated Systems
- The complexity present in Grid systems is similar to that present in human economies.
27Service-Oriented Grid Architecture
[Figure: architecture diagram spanning a Grid Service Consumer, Core Middleware Services, and Grid Service Providers. Components shown: Applications, Programming Environments, Grid Explorer, Grid Resource Broker, Job Control Agent, Schedule Advisor, Trade Manager, Deployment Agent; Secure Sign-on, Information Service, Health Monitor, Data Catalogue, Grid Market Services, Grid Bank; Grid Node 1 to Grid Node N with Trade Server, Pricing Algorithms, Trading, QoS, Accounting, Resource Reservation, Resource Allocation, JobExec, Storage, and miscellaneous services; resources R1, R2, ..., Rm.]
28Gridbus and Complementary Technologies realizing Utility Grid
[Figure: layered software stack. Layers: Grid Applications, User-Level Middleware, Core Grid Middleware, Grid Fabric Software, Grid Fabric Hardware (Worldwide Grid). Components shown: Portals, Science, Commerce, Engineering, Collaboratories; X-Parameter Sweep Lang., Workflow, ExcellGrid, Gridscape, MPI; Grid Brokers, Gridbus Data Broker, Workflow Engine, Nimrod-G; Grid Exchange Federation, Grid Market Directory, Grid Storage Economy, GridBank, Grid Economy; Globus, Unicore, Alchemi, NorduGrid, XGrid; .NET, JVM, Condor, SGE, Tomcat, PBS, Libra; Mac, AIX, Solaris, Windows, Linux, IRIX, OSF1.]
30Workflow-based Applications
- Workflow applications
  - Scientific and engineering domains (e.g., biology, astronomy, chemistry)
  - Task execution is based on their control and data dependencies.
(Protein annotation workflow, London e-Science Centre)
31Workflow for VR-Based Respiratory Treatment
Planning System
32Driving Theme: Community Grids vs. Utility Grids
Feature / Type             Community Grids                                           Utility Grids
User QoS                   Best effort                                               Contract/SLA
Service Pricing            Not considered / free access                              Usage, QoS level, market supply and demand
Example Workflow Systems   Triana, MyGrid, Askalon, DAGMan, Pegasus, GrADS, Kepler   Gridbus Grid Workflow Engine
33Workflow Scheduling
- Scheduling on Community Grids
  - Minimize the execution time on a best-effort basis (ignores factors such as the monetary cost of resource access and users' various QoS satisfaction levels).
- Scheduling on Utility Grids
  - Focuses on mapping workflow tasks onto services to satisfy users' QoS constraints (e.g., deadline, the quality of produced data).
  - Supports negotiation and establishment of an SLA as a contract between users and providers.
  - Optimizes performance under the most important QoS constraints imposed by users:
    - Minimize execution cost while meeting a specified deadline.
    - Minimize execution time while meeting a specified budget.
  - Supports SLA-based allocation of resources so that multiple competing demands from users can be managed with the aim of enhancing providers' profit.
34Cost-based Workflow Scheduling
- Objective Function
  - Minimize the execution cost while meeting the time constraints imposed by users.
[Figure: task diagram.]
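The objective stated on this slide can be written as a constrained optimization. This is only a sketch in notation introduced here, not taken from the slides: T is the set of workflow tasks, S_i the set of candidate services for task i, sigma an assignment of tasks to services, c(i, s) the cost of running task i on service s (including data transfer), finish(sigma) the completion time of the exit task under sigma, and D the user's deadline.

```latex
\min_{\sigma}\ \sum_{i \in T} c\bigl(i, \sigma(i)\bigr)
\quad \text{subject to} \quad
\sigma(i) \in S_i \ \ \forall i \in T,
\qquad \mathrm{finish}(\sigma) \le D .
```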
35Workflow Management Systems
- Support composition, deployment, and execution management of workflow applications
  - Workflow language
  - Graphical environment for workflow composition and monitoring
  - Grid middleware integration
  - Data management
  - Fault tolerance
  - QoS-based SLA negotiation
  - Scheduling
  - ...
37Architecture
GSP: Grid Service Provider
SLA: Service Level Agreement
[Figure: Workflow Management System architecture. Components: Workflow Specification, Workflow Scheduling, Workflow Planning, Service Discovery, Performance Estimator, Advance Reservation, Workflow Execution, Executor, QoS Monitor, Grid Market Directory, and Grid Services offered by GSPs in the marketplace. Labelled interactions: QoSRequest, ReservationRequest(SLA), ServiceRequest(SLA), contract violation, Feedback.]
38Methodology
- Discover available services and estimate execution time for every task.
- Group workflow tasks into task partitions.
- Distribute the user's overall deadline over the task partitions.
- Query available time slots, generate an optimized schedule for each task partition, and make advance reservations.
- Start workflow execution and reschedule when the initial schedule is violated at run-time.
39Predicting Execution Time
- Reservation-enabled Utility Services
  - Resource services
    - Provide proportions of hardware resources (e.g., computing processors, network bandwidth, storage, memory) as a service for remote client access.
    - Simulation, analytical modeling, empirical and historical data.
  - Application services
    - Allow remote clients to use their specialized applications.
    - Provide estimated service times based on the metadata of users' service requests.
40Workflow Task Partitioning
[Figure: a workflow of tasks T1 to T14 shown before partitioning and after partitioning. Legend: simple task, branch, synchronization task.]
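The partitioning step can be sketched in Python. The slides do not define the three legend terms, so the definitions used here are assumptions: a synchronization task has more than one parent or more than one child, every other task is a simple task, and a branch is a maximal chain of consecutive simple tasks. The function name partition_workflow and the edge-list input format are hypothetical.

```python
from collections import defaultdict


def partition_workflow(edges):
    """Split a workflow DAG into synchronization tasks and branches.

    edges: list of (parent, child) pairs.
    Returns (sync_tasks, branches); each branch is a list of simple tasks
    in execution order.
    """
    parents, children = defaultdict(list), defaultdict(list)
    tasks = []  # keeps first-seen order so the output is deterministic
    for p, c in edges:
        children[p].append(c)
        parents[c].append(p)
        for t in (p, c):
            if t not in tasks:
                tasks.append(t)

    def is_sync(t):
        # Assumed definition: more than one parent or more than one child.
        return len(parents[t]) > 1 or len(children[t]) > 1

    sync_tasks = [t for t in tasks if is_sync(t)]
    branches = []
    for t in tasks:
        if is_sync(t):
            continue
        # Only start a branch at a simple task that has no simple parent.
        if parents[t] and not is_sync(parents[t][0]):
            continue
        branch = [t]
        # Extend the chain while the single child is also a simple task.
        while children[branch[-1]] and not is_sync(children[branch[-1]][0]):
            branch.append(children[branch[-1]][0])
        branches.append(branch)
    return sync_tasks, branches


example_edges = [("T1", "T2"), ("T1", "T3"), ("T2", "T4"),
                 ("T3", "T5"), ("T4", "T6"), ("T5", "T6")]
sync, branches = partition_workflow(example_edges)
print(sync)      # tasks that fork or join
print(branches)  # chains of simple tasks between them
```

In this small example T1 forks and T6 joins, so both are synchronization tasks, and the two chains between them form the branches.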
41Deadline Assignment/Distribution
- P1. Any assigned sub-deadline must be greater than or equal to the minimum processing time of the corresponding task partition.
- P2. The overall deadline is divided over task partitions in proportion to their minimum processing time.
- P3. The cumulative sub-deadline of any independent path between two synchronization tasks must be the same.
- P4. The cumulative sub-deadline of any path from the entry task to the exit task is equal to the overall deadline.
[Figure: example workflow annotated with the value 350 and the parenthesized values (43), (53), (120), (152), (187), (217), (253), (269), (284), (350) on its task partitions.]
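For a single path of task partitions, policies P1, P2, and P4 can be illustrated with a short Python sketch. It assumes the partitions lie on one entry-to-exit path, so P3 does not come into play; the function name distribute_deadline and the sample numbers are hypothetical and are not the values from the figure.

```python
def distribute_deadline(min_times, overall_deadline):
    """Divide an overall deadline over sequential task partitions.

    min_times: minimum processing time of each partition, in path order.
    Returns (shares, cumulative): the time share of each partition and the
    cumulative sub-deadline by which each partition must finish.
    """
    total = sum(min_times)
    if overall_deadline < total:
        # P1 cannot hold: some partition would get less than its minimum time.
        raise ValueError("deadline is shorter than the minimum processing time")
    # P2: shares proportional to the minimum processing times.
    shares = [overall_deadline * m / total for m in min_times]
    cumulative, elapsed = [], 0.0
    for share in shares:
        elapsed += share
        cumulative.append(elapsed)
    # P4: the last cumulative sub-deadline equals the overall deadline
    # (up to floating-point rounding).
    return shares, cumulative


shares, cumulative = distribute_deadline([10, 30, 60], 200)
print(shares)
print(cumulative)
```

Because every share is scaled by the same factor overall_deadline / total, which is at least 1 whenever the deadline is feasible, each partition receives at least its minimum processing time, which is what P1 requires.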
42Planning
- Generates an optimized schedule for advance reservation and run-time execution.
- Solves the problem by divide-and-conquer:
  - Generate an optimized schedule for each partition based on its assigned sub-deadline.
  - A locally optimized schedule minimizes execution cost while meeting its assigned sub-deadline.
  - The overall optimized schedule is constructed from the local schedules.
- Task partition optimization
  - Synchronization Task Scheduling
  - Branch Task Scheduling
43Task Partition Scheduling
- Synchronization task scheduling
  - Only one task.
  - Solution: select the cheapest service that can process the task and transfer data within the assigned sub-deadline.
- Branch task scheduling
  - One simple task in a branch.
  - Multiple tasks in a branch.
  - Model a branch as a Markov Decision Process (MDP).
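The rule for a synchronization task (pick the cheapest service that finishes processing and data transfer within the sub-deadline) can be sketched as follows. The Offer record, its field names, and the 50-second transfer time are assumptions made for illustration; the processing times and costs reuse the values of Table I.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Offer:
    """A hypothetical service offer for one task."""
    service_id: str
    processing_time: float  # seconds
    transfer_time: float    # seconds to move the input data
    cost: float             # processing cost plus transfer cost


def cheapest_within_deadline(offers: Sequence[Offer],
                             sub_deadline: float) -> Optional[Offer]:
    """Return the cheapest offer that meets the sub-deadline, or None."""
    feasible = [o for o in offers
                if o.processing_time + o.transfer_time <= sub_deadline]
    return min(feasible, key=lambda o: o.cost, default=None)


offers = [
    Offer("1", 1200, 50, 300),
    Offer("2", 600, 50, 600),
    Offer("3", 400, 50, 900),
    Offer("4", 300, 50, 1200),
]
print(cheapest_within_deadline(offers, 700))
print(cheapest_within_deadline(offers, 100))
```

With a sub-deadline of 700 seconds the cheapest offer is too slow, so the next cheapest one that fits is chosen; with 100 seconds no offer is feasible and the function returns None, which a scheduler would treat as a reason to renegotiate the deadline.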
44Experiments
- Different Workflow Structures
  - Pipeline
  - Parallel
  - Hybrid structure
(fMRIs neuroscience workflow)
(Protein annotation workflow, London e-Science Centre)
45(Simulation) Experiments
- MI (Million Instructions) represents the length of tasks.
- MIPS (Million Instructions per Second) represents the processing capability of services.
- Service type represents different types of services.
- 15 types of services, each supported by 10 different service providers with different processing capability.

Table I. Service speed and corresponding price for executing a task.
Service ID   Processing Time (sec)   Cost (G)
1            1200                    300
2            600                     600
3            400                     900
4            300                     1200

Table II. Transmission bandwidth and corresponding price.
Bandwidth (Mbps)   Cost/sec (G/sec)
100                1
200                2
512                5.12
1024               10.24
46Experiments
- Compared heuristics
  - Greedy cost
    - sorts services by their prices.
    - assigns as many tasks as possible to the cheapest services without exceeding the deadline.
  - Deadline-level
    - divides workflow tasks into levels based on their depth in the workflow graph.
    - assigns sub-deadlines to each task level equally.
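The Deadline-level heuristic can be sketched as below. The slides do not say how depth is measured, so this sketch assumes that a task's level is the length of the longest path from an entry task; the function name deadline_levels and the edge-list input are hypothetical.

```python
from collections import defaultdict


def deadline_levels(edges, overall_deadline):
    """Assign each task the cumulative sub-deadline of its level.

    edges: list of (parent, child) pairs of a workflow DAG.
    Returns a dict mapping each task to the time by which it must finish.
    """
    parents = defaultdict(set)
    tasks = set()
    for p, c in edges:
        parents[c].add(p)
        tasks.update((p, c))

    depth = {}

    def level(t):
        # Assumed depth: 1 for entry tasks, else 1 + deepest parent.
        if t not in depth:
            depth[t] = 1 + max((level(p) for p in parents[t]), default=0)
        return depth[t]

    for t in tasks:
        level(t)

    # Every level receives an equal share of the overall deadline.
    share = overall_deadline / max(depth.values())
    return {t: depth[t] * share for t in sorted(tasks)}


plan = deadline_levels(
    [("T1", "T2"), ("T1", "T3"), ("T2", "T4"), ("T3", "T4")], 300)
print(plan)
```

Unlike the proportional policy P2, this baseline ignores how long each level actually needs, which is one reason it serves as a comparison point rather than the proposed method.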
47Results
48Results
50Utility-driven Cluster RMS Architecture for GSPs
51Economy-based Admission Control and Resource Allocation
- Uses the pricing function to compute the cost of satisfying the QoS of a job as a means of admission control.
- Regulates submission of workload into the cluster to prevent overloading.
- Provides incentives
  - Deadline --
  - Execution Time --
  - Cluster Workload --
- Cost acts as a means of feedback for the user to respond to.
52Impact of Penalty Function on Utility
53Normalised Comparison of FCFS, Libra Libra
54Impact of Increasing Dynamic Pricing Factor on
GSP Profitability
56Summary and Conclusion
- Grids exploit synergies that result from the cooperation of autonomous entities:
  - Resource sharing, dynamic provisioning, and aggregation at a global level -> Great Science and Great Business!
- Grids have emerged as an enabler for the Cyberinfrastructure that powers e-Science and e-Business applications.
- SOA + Market-based Grid Management = Utility Grids
  - Grids allow users to dynamically lease Grid services at runtime based on their quality, cost, availability, and users' QoS requirements.
  - Delivering ICT services as computing utilities.
- QoS scheduling of workflows and SLA-based resource allocation enable Grids to serve as an IT backbone for delivering utility computing services.
57Thanks for your attention!
We Welcome Cooperation in Research and Development!
http://www.gridbus.org
eScience2007.org
58Backup
59Markov Decision Process (MDP)
- Effective for solving sequential decision problems.
- An MDP model contains
  - A set of possible system states
  - A set of possible actions
  - A real-valued reward (penalty) function
  - A transition model describing each action's effects in each state
60MDP Model
- States
  - A state consists of the current execution task, ready time, and current location.
- Actions
  - An action in the MDP is to allocate a time slot on a service to a task.
  - t: input data transmission time plus the processing time of the service.
  - c: transmission cost plus the service cost.
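61MDP Model
- Immediate penalty
  - Obtained from taking action a in state s and transitioning to state s'.
- Expected penalty
  - The sum of immediate penalties from the current state to a terminal state.
- The optimal action for state s is the action that minimizes the expected penalty.
[Equation not recoverable from the transcript: a two-case definition, with one case conditioned on the sub-deadline and the other labelled "otherwise".]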
62Implementation
- Value iteration
  - is a standard dynamic programming algorithm.
  - computes a new value function for each state based on the current value of its next state.
  - proceeds in an iterative fashion and can converge to the optimal solution quickly.
  - records a number of candidate solutions while finding the optimal time slot.
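The dynamic-programming idea can be illustrated for one branch with a small Python sketch. It is not the authors' implementation: it uses a memoized backward recursion, which is what value iteration reduces to when the branch is a simple chain with deterministic transitions. It also assumes that the penalty of an action is its cost c when the branch still finishes within the sub-deadline and infinite otherwise, and it ignores time-slot availability and location. The name schedule_branch and the input format are hypothetical; the sample options reuse Table I.

```python
import math
from functools import lru_cache


def schedule_branch(options, sub_deadline):
    """Choose one service option per task of a branch at minimum cost.

    options[i] is a list of (time, cost) pairs for the i-th task.
    Returns (min_cost, chosen_indices); min_cost is math.inf when no
    assignment finishes within sub_deadline.
    """
    n = len(options)

    @lru_cache(maxsize=None)
    def expected_penalty(i, elapsed):
        # State: index of the next task and the time already consumed.
        if i == n:
            return 0.0, ()  # terminal state: every task has been placed
        best = (math.inf, ())
        for k, (t, c) in enumerate(options[i]):
            finish = elapsed + t
            if finish > sub_deadline:
                continue  # assumed infinite penalty for a missed sub-deadline
            future, plan = expected_penalty(i + 1, finish)
            if c + future < best[0]:
                best = (c + future, (k,) + plan)
        return best

    return expected_penalty(0, 0)


table_one = [(1200, 300), (600, 600), (400, 900), (300, 1200)]
cost, plan = schedule_branch([table_one, table_one], 1800)
print(cost, plan)
```

For two tasks and a sub-deadline of 1800 seconds, the cheapest feasible plan pairs the slowest service with the second slowest one; running both tasks on the slowest service would be cheaper but would take 2400 seconds.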
63Rescheduling
- Re-adjust sub-deadlines and re-compute optimal schedules for unexecuted task partitions.
- Reschedule the minimum number of tasks.