Grid meets Economics: A Market Paradigm for Resource Management and Scheduling for WorldWide Grid Co - PowerPoint PPT Presentation

Provided by: rajk66
1
Grid meets Economics: A Market Paradigm for
Resource Management and Scheduling for
World-Wide Grid Computing
  • Rajkumar Buyya

Melbourne, Australia
www.buyya.com/ecogrid
2
(No Transcript)
3
Need Honest Answers!
  • I want to have access to your Grid resources and
    want to know how many of you are willing to give
    me access (in the following cases):
  • I am unable to give you access to our Australian
    machines, but I want to have access to yours!
  • Want to solve academic problems
  • Want to solve business problems
  • I am willing to gift you Kangaroos! (bartering)
  • I am willing to give you access to my machines,
    if you want. (sharing, but no measure and no QoS)
  • I am willing to pay you dollars on a usage basis.
    (economic incentive, market-based, and QoS)

4
Overview
  • A quick glance at today's Grid computing
  • Resource management challenges for next
    generation Grid computing
  • A glance at approaches to Grid computing
  • Grid Architecture for Computational Economy
  • Economy Grid = Globus + GRACE
  • Nimrod-G: Grid Resource Broker
  • Scheduling Experiments
  • Case Study: Drug Design Application on Grid
  • Conclusions

5
Scalable HPC: Breaking Administrative Barriers and
New Challenges
PERFORMANCE
Administrative Barriers
  • Individual
  • Group
  • Department
  • Campus
  • State
  • National
  • Globe
  • Inter Planet
  • Universe

Desktop
SMPs or SuperComputers
Local Cluster
Global Cluster/Grid
Inter Planetary Grid!
Enterprise Cluster/Grid
6
Why Grids? Large Scale Explorations need
them: Killer Applications.
  • Solving grand challenge applications using
    modeling, simulation and analysis

Aerospace
Internet and Ecommerce
Life Sciences
CAD/CAM
Digital Biology
Military Applications
7
(No Transcript)
8
What is Grid ?
  • An infrastructure that logically couples
    distributed resources:
  • Computers: PCs, workstations, clusters,
    supercomputers, laptops, notebooks,
    mobile devices, PDAs, etc.
  • Software: e.g., ASPs renting expensive special
    purpose applications on demand
  • Catalogued data and databases: e.g., transparent
    access to the human genome database
  • Special devices: e.g., a radio telescope
    (SETI@Home searching for life in the galaxy)
  • People/collaborators
  • and presents them as an integrated global
    resource.
  • It enables the creation of virtual enterprises
    (VEs) for resource sharing.

Wide area
9
Grid Applications-Drivers
  • Distributed HPC (Supercomputing)
  • Computational science.
  • High-throughput computing
  • Large-scale simulation/chip design and parameter
    studies.
  • Content Sharing (free or paid)
  • Sharing digital content among peers (e.g.,
    Napster)
  • Remote software access/renting services
  • Application service providers (ASPs).
  • Data-intensive computing
  • Data mining, particle physics (CERN), drug
    design.
  • On-demand, real-time computing
  • Medical instrumentation and network-enabled
    solvers.
  • Collaborative
  • Collaborative design, data exploration, education.

10
Building and Using Grids requires
  • Services that make our systems Grid-ready!
  • Security mechanisms that permit resources to be
    accessed only by authorized users.
  • (New) programming tools that make our
    applications Grid-ready!
  • Tools that can translate the requirements of an
    application/user into the requirements of
    computers, networks, and storage.
  • Tools that perform resource discovery, trading,
    selection/allocation, scheduling and distribution
    of jobs, and collect results.

Globus
?
11
Players in Grid Computing
12
What do users want? Users in the Grid Economy and their Strategy
  • Grid Consumers
  • Execute jobs for solving problems of varying size and
    complexity
  • Benefit by selecting and aggregating resources
    wisely
  • Trade-off: timeframe and cost
  • Strategy: minimise expenses
  • Grid Providers
  • Contribute idle resources for executing consumer
    jobs
  • Benefit by maximizing resource utilisation
  • Trade-off: local requirements and market opportunity
  • Strategy: maximise return on investment

13
Challenges for Next Generation Grid Technology
Development
14
Sources of Complexity in Resource Management for
World Wide Grid Computing
  • Size (large number of nodes, providers,
    consumers)
  • Heterogeneity of resources (PCs, Workstations,
    clusters, and supercomputers, instruments,
    databases, software)
  • Heterogeneity of fabric management systems
    (single system image OS, queuing systems, etc.)
  • Heterogeneity of fabric management policies
  • Heterogeneity of application requirements (CPU,
    I/O, memory, and/or network intensive)
  • Heterogeneity in resource demand patterns (peak,
    off-peak, ...)
  • Applications need different QoS at different
    times (time critical results). The utility of
    experimental results varies from time to time.
  • Geographical distribution of users located in
    different time zones
  • Differing goals (producers and consumers have
    different objectives and strategies)
  • Insecure and unreliable environment

15
Traditional approaches to resource management and
scheduling are NOT useful for the Grid?
  • They use a centralised policy that needs
  • complete state information and
  • a common fabric management policy, or a decentralised
    consensus-based policy.
  • Due to too many heterogeneous parameters in the
    Grid, it is impossible to define/get
  • a system-wide performance matrix and
  • a common fabric management policy that is
    acceptable to all.
  • The economics paradigm has proved to be an effective
    institution for managing the decentralization and
    heterogeneity that is present in human economies!
  • Fall of the USSR and emergence of the US as world
    superpower! (monopoly?)
  • So, we propose/advocate the use of computational
    economics principles in the management of resources
    and the scheduling of computations on the world wide Grid.
  • Think locally and act globally approach to grid
    computing!

16
Benefits of Computational Economies
  • It provides a nice paradigm for managing
    self-interested and self-regulating entities (resource
    owners and consumers)
  • Helps in regulating supply and demand of
    resources.
  • Services can be priced in such a way that
    equilibrium is maintained.
  • User-centric / Utility driven
  • Scalable
  • No need for a central coordinator (during
    negotiation)
  • Resources (sellers) and users (buyers) can
    make their own decisions and try to maximize
    utility and profit.
  • Adaptable
  • It helps in offering different QoS (quality of
    service) to different applications depending on the
    value users place on them.
  • It improves the utilisation of resources
  • It offers an incentive for resource owners to be
    part of the grid!
  • It offers an incentive for resource consumers to
    be good citizens
  • There is a large body of proven economic principles
    and techniques available that we can easily
    leverage.

17
New challenges of Computational Economy
  • Resource Owners
  • How do I decide prices? (economic models?)
  • How do I specify them?
  • How do I enforce them?
  • How do I advertise and attract consumers?
  • How do I do accounting and handle payments?
  • ...
  • Resource Consumers
  • How do I decide expenses?
  • How do I express QoS requirements?
  • How do I trade off between timeframe and cost?
  • ...
  • Are any tools, traders and brokers available to
    automate the process?

18
mix-and-match
Object-oriented
Internet/partial-P2P
Grid Computing Approaches
Network enabled Solvers
Market/Computational Economy
Nimrod-G
19
Many Grid Projects and Initiatives
  • Australia
  • Economy Grid
  • Nimrod-G
  • Virtual Lab
  • Active Sheets
  • DISCWorld
  • ..new coming up
  • Europe
  • UNICORE
  • MOL
  • Lecce GRB
  • Poland MC Broker
  • EU Data Grid
  • EuroGrid
  • MetaMPI
  • Dutch DAS
  • XW, JaWS
  • and many more...
  • Japan
  • USA
  • Globus
  • Legion
  • Javelin
  • AppLeS
  • NASA IPG
  • Condor
  • Harness
  • NetSolve
  • AccessGrid
  • GrADS
  • and many more...
  • Cycle Stealing and .com Initiatives
  • Distributed.net
  • SETI@Home, ...
  • Entropia, UD, Parabon, ...
  • Public Forums
  • Global Grid Forum
  • P2P Working Group

http://www.gridcomputing.com
20
Many Testbeds? Who pays? Who regulates
demand and supply?
GUSTO (decommissioned)
World Wide Grid
Legion Testbed
NASA IPG
21
Testbeds so far -- observations
  • Who contributed resources and why?
  • Volunteers: for fun, challenge, fame, charismatic
    apps, public good, like the distributed.net and
    SETI@Home projects.
  • Collaborators: sharing resources while developing
    new technologies of common interest (Globus,
    Legion, Ninf, MC Broker, Lecce GRB, ...).
    Unless you know the lab leaders, it is impossible to
    get access!
  • How long?
  • Short term: excitement is lost, too much
    admin overhead (Globus inst.), no incentive,
    policy change, ...
  • What do we need? A Grid Marketplace!
  • Regulates supply and demand, offers incentive for
    being players, simple, scalable solution,
    quasi-deterministic: a proven model in the real world.

22
Building an Economy Grid (Next Generation Grid
Computing!)
To enable the creation of Grid Marketplace
(competitive) ASP Service Oriented Computing ...
and let users focus on their own work (science,
engineering, or commerce)!
23
GRACE: A Reference Grid Architecture for
Computational Economy
Grid Bank
Information Server(s)
Grid Market Services
Sign-on
Health Monitor
Info ?
Grid Node N

Grid Explorer

Application
Secure
Job Control Agent
Grid Node1
Schedule Advisor
QoS
Pricing Algorithms
Trade Server
Trading
Trade Manager
Accounting
Resource Reservation
Misc. services

Deployment Agent
JobExec
Resource Allocation
Storage
Grid User
Grid Resource Broker

R1
R2
Rm
Grid Middleware Services
Grid Service Providers
See PDPTA 2000 paper!
24
Economic Models for Trading
  • Commodity Market Model
  • Posted Prices Models
  • Bargaining Model
  • Tendering (Contract Net) Model
  • Auction Model
  • English, first-price sealed-bid, second-price
    sealed-bid (Vickrey), and Dutch
    (consumer: low, high, rate; producer: high, low,
    rate)
  • Proportional Resource Sharing Model
  • Shareholder Model
  • Partnership Model

See SPIE ITCom 2001 paper! with Heinz
Stockinger, CERN!
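
To make the difference between two of the sealed-bid auction models listed above concrete, here is a minimal sketch. The bidder names and bid values are invented for illustration, and the functions are not part of GRACE or Nimrod-G.

```python
from typing import Dict, Tuple


def first_price_sealed_bid(bids: Dict[str, float]) -> Tuple[str, float]:
    """Highest bidder wins and pays their own bid."""
    winner = max(bids, key=bids.get)
    return winner, bids[winner]


def second_price_sealed_bid(bids: Dict[str, float]) -> Tuple[str, float]:
    """Vickrey auction: highest bidder wins but pays the second-highest bid."""
    if len(bids) < 2:
        raise ValueError("a second-price auction needs at least two bids")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price


# Hypothetical consumers bidding (in grid currency units) for one resource slot.
bids = {"alice": 12.0, "bob": 9.5, "carol": 7.0}
first = first_price_sealed_bid(bids)
vickrey = second_price_sealed_bid(bids)
```

In both cases the same consumer wins; only the price paid differs, which is why the Vickrey model is often described as encouraging bidders to reveal their true valuation.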
25
Grid Components
Applications and Portals
Grid Apps.

Prob. Solving Env.
Collaboration
Engineering
Web enabled Apps
Scientific
Grid Tools
Development Environments and Tools

Web tools
Languages
Libraries
Debuggers
Resource Brokers
Monitoring
Grid Middleware
Distributed Resources Coupling Services

QoS
Security
Information
Process
Resource Trading
Market Info
Local Resource Managers
TCP/IP UDP

Operating Systems
Queuing Systems
Libraries App Kernels
Grid Fabric
Networked Resources across Organisations

Clusters
Data Sources
Scientific Instruments
Storage Systems
Computers
26
Economy Grid = Globus + GRACE
Applications
Grid Apps.

Science
Engineering
Commerce
Portals
ActiveSheet
High-level Services and Tools
Grid Status

Grid Tools
DUROC
globusrun
MPI-G
Nimrod/G
CC
Core Services
Heartbeat Monitor
Nexus
GRACE-TS
GRAM
Grid Middleware
Globus Security Interface
GASS
DUROC
MDS
GARA
GBank
GMD
Grid Fabric
Local Services
GRD
QBank
JVM
Condor
TCP
UDP
eCash
LSF
PBS
Solaris
Irix
Linux
See IPDPS HWC 2001 paper!
27
GRACE components
  • A resource broker (e.g., Nimrod/G)
  • Various resource trading protocols for different
    economic models
  • A mediator for negotiating between users and grid
    service providers (Grid Market Directory)
  • A deal template for specifying resource
    requirements and services offers
  • Grid Trading Server
  • Pricing policy specification
  • Accounting (e.g., QBank) and payment management
    (GridBank, not yet implemented)

28
Grid Open Trading Protocols
Trade Manager
Trade Server
Get Connected
Pricing Rules
Reply to Bid (DT)
Negotiate Deal(DT)
.
API
Confirm Deal(DT, Y/N)
DT - Deal Template
- resource requirements (BM)
- resource profile (BS)
- price (any one can set)
- status
- change the above values
- negotiation can continue
- accept/decline
- validity period
Cancel Deal(DT)
Change Deal(DT)
Get Disconnected
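
The deal template exchange can be sketched as a small data structure plus a negotiation loop. This is only an illustration under stated assumptions: the field names, the fixed price step, and the accept/decline rule are hypothetical and do not reproduce the actual GRACE trading API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DealTemplate:
    requirements: str         # resource requirements stated by the consumer side
    profile: str              # resource profile stated by the provider side
    price: float              # either party may set this during negotiation
    status: str = "open"      # "open", "accepted" or "declined"
    valid_for_s: int = 300    # validity period of the offer, in seconds


def negotiate(dt, buyer_limit, seller_floor, step=1.0, max_rounds=20):
    """Seller lowers the price step by step; buyer accepts once it is affordable."""
    for _ in range(max_rounds):
        if dt.price <= buyer_limit:
            return replace(dt, status="accepted")   # Confirm Deal(DT, Y)
        lowered = dt.price - step
        if lowered < seller_floor:
            return replace(dt, status="declined")   # Confirm Deal(DT, N)
        dt = replace(dt, price=lowered)             # Change Deal(DT)
    return replace(dt, status="declined")


offer = DealTemplate("10 CPU-hours", "Linux cluster", price=12.0)
deal = negotiate(offer, buyer_limit=10.0, seller_floor=8.0)
no_deal = negotiate(offer, buyer_limit=5.0, seller_floor=8.0)
```

The template is immutable, so every round produces a new offer; that mirrors the idea that each message carries a complete, self-describing deal.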
29
Pricing, Accounting, Allocations and Job
Scheduling Flow @ each site/Grid Level
Pricing Policy
GRID Bank (digital transactions)
0
0
2
DB @ Each Site
QBank
Trade Server
1
3
5
8
0. Make Deposits, Transfers, Refunds,
   Queries/Reports
1. Client negotiates for access cost.
2. Negotiation is performed per owner-defined policies.
3. If client is happy, TS informs QB about access deal.
4. Job is submitted.
5. Check with QB for go-ahead.
6. Job starts.
7. Job completes.
8. Inform QB about resource utilization.
Resource Manager
4
IBM-LL/PBS/.
6
7
Compute Resources clusters/SGI/SP/...
30
Service Items to be Charged
  • CPU - User and System time
  • Memory
  • maximum resident set size - page size
  • amount of memory used
  • page faults with/without physical I/O
  • Storage size, r/w/block IO operations
  • Network msgs sent/received
  • Signals received, context switches
  • Software and Libraries accessed
  • Data Sources (e.g. Protein Data Bank)
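
Most of the chargeable items above correspond to the per-process counters that Unix exposes through `getrusage`. The sketch below, using Python's standard `resource` module (Unix only), shows one way a bill could be computed from such a record. The rates are invented for illustration and are not taken from the presentation.

```python
import resource  # Unix-only standard-library module
from types import SimpleNamespace

# Hypothetical rates, in grid currency units per measured unit.
RATES = {
    "ru_utime": 2.0,    # user CPU seconds
    "ru_stime": 2.0,    # system CPU seconds
    "ru_majflt": 0.5,   # page faults that required physical I/O
    "ru_inblock": 0.0,  # block input operations (free in this sketch)
    "ru_oublock": 0.0,  # block output operations (free in this sketch)
}


def charge(usage, rates=RATES):
    """Sum rate * quantity over every metered item present in `usage`."""
    return sum(rate * getattr(usage, field, 0) for field, rate in rates.items())


# A made-up usage record for one finished job.
sample = SimpleNamespace(ru_utime=300.0, ru_stime=2.0, ru_majflt=4)
sample_bill = charge(sample)  # 600 + 4 + 2

# The same function works on the live counters of the current process.
own_bill = charge(resource.getrusage(resource.RUSAGE_SELF))
```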

31
How to decide Price ?
  • Fixed price model (like today's Internet)
  • Dynamic/Demand and Supply (like tomorrow's
    Internet)
  • Usage Period
  • Loyalty of Customers (like Airlines favoring
    frequent flyers!)
  • Historical data
  • Advance Agreement (high discount for
    corporations)
  • Usage Timing (peak, off-peak, lunch time)
  • Calendar based (holiday/vacation period)
  • Bulk Purchase (register 100 .com domains at
    once!)
  • Voting -- trade unions decide pricing structure
  • Resource capability as benchmarked in the market!
  • Academic R&D/public-good application users can be
    offered a cheaper rate compared to commercial
    use.
  • Customer Type: quality or price sensitive
    buyers.
  • Can be prescribed by regulating (Govt.)
    authorities
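
Two of the pricing criteria above, usage timing and a cheaper rate for academic users, can be combined in a very small posted-price function. This is a sketch only: the peak window, the factor of two, and the 25% discount are assumptions chosen for the example.

```python
def price_per_cpu_sec(base, hour, peak_hours=range(9, 18), peak_factor=2.0,
                      academic=False, academic_discount=0.25):
    """Return the posted price for one CPU-second at a given local hour (0-23)."""
    # Usage timing: charge more during the (assumed) local peak window.
    price = base * (peak_factor if hour in peak_hours else 1.0)
    # Customer type: academic/public-good users get a discounted rate.
    if academic:
        price *= 1.0 - academic_discount
    return price


peak = price_per_cpu_sec(10.0, hour=10)
off_peak = price_per_cpu_sec(10.0, hour=22)
academic_off_peak = price_per_cpu_sec(10.0, hour=22, academic=True)
```

Other criteria from the list (loyalty, bulk purchase, calendar-based rates) would slot in as further multipliers of the same kind.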

32
Payments: Options and Automation
  • Buy credits in advance / GSPs bill the user
    later (pay as you go)
  • Pay by electronic currency via Grid Bank
  • NetCash (anonymity), NetCheque, and PayPal
  • NetCheque - http://www.isi.edu/gost/info/netcheque/
  • Users register with NC accounting servers, can
    write electronic cheques and send them (e.g., by email).
    When deposited, the balance is transferred from
    the sender's to the receiver's account.
  • NetCash - http://www.isi.edu/gost/info/netcash/
  • It supports anonymity and uses the NetCheque
    system to clear payments between currency
    servers.
  • PayPal.com: account (email) is linked to a credit
    card.
  • Enter the recipient's email address and the
    amount you wish to request.
  • The recipient gets an email notification and pays
    you at www.PayPal.com

33
Nimrod-G: The Grid Resource Broker
  • Soft Deadline and Budget-based Economy Grid
    Resource Broker for Parameter Processing on P2P
    Grids

34
Parametric Computing (What Users think of Nimrod
Power)
Parameters
Magic Engine
Multiple Runs Same Program Multiple Data
Killer Application for the Grid!
See IPDPS 2000 paper!
Courtesy Anand Natrajan, University of Virginia
35
P-study Applications -- Characteristics
  • Code (Single Program sequential or threaded)
  • High Resource Requirements
  • Long-running Instances
  • Numerous Instances (Multiple Data)
  • High Computation-to-Communication Ratio
  • Embarrassingly/Pleasantly Parallel

36
Sample P-Sweep Applications
Bioinformatics: Drug Design / Protein
Modelling
Combinatorial Optimization: Meta-heuristic
parameter estimation
Ecological Modelling: Control Strategies for
Cattle Tick
Sensitivity experiments on smog formation
Data Mining
Electronic CAD: Field Programmable Gate Arrays
High Energy Physics: Searching for Rare Events
Computer Graphics: Ray Tracing
Finance: Investment Risk Analysis
VLSI Design: SPICE Simulations
Civil Engineering: Building Design
Network Simulation
Automobile: Crash Simulation
Aerospace: Wing Design
Astrophysics
37
Thesis
  • Perform a parameter sweep (bag of tasks), utilising
    distributed resources, within T hours or earlier
    and at a cost not exceeding M.
  • Three Options/Solutions
  • Using pure Globus commands
  • Build your own Distributed App Scheduler
  • Use Nimrod-G (Resource Broker)

38
Executing Remotely
Choose Resource
Transfer Input Files
Set Environment
Start Process
Pass Arguments
Monitor Progress
Summary View Job View Event View
Read/Write Intermediate Files
Transfer Output Files
Resource Discovery, Trading, Scheduling,
Predictions, Rescheduling, ...
39
Using Pure Globus commands
Do all yourself! (manually)
Total Cost???
40
Build Distributed Application Scheduler
Build app on a case-by-case basis: complicated
construction
E.g., AppLeS/MPI based
Total Cost???
41
Use Nimrod-G
Aggregate Job Submission Aggregate View
Submit and Play!
42
Nimrod and Associated Family of Tools
Remote Execution Server (on-demand Nimrod Agent)
P-sweep App. Composition: Nimrod/enFuzion
Resource Management and Scheduling: Nimrod-G Broker
Design Optimisations: Nimrod-O
App. Composition and Online Visualization: Active Sheets
Grid Simulation in Java: GridSim
Drug Design on Grid: Virtual Lab
File Transfer Server
Upcoming? HEPGrid (U. Melbourne),
GAVE (Rutherford Appleton Lab):
Grid (Un)Aware Virtual Engineering
43
Nimrod/G: A Grid Resource Broker
  • A resource broker for managing and steering task
    farming (parametric sweep) applications on
    computational Grids based on deadline and
    computational economy.
  • Key Features
  • A single window to manage and control experiments
  • Resource Discovery
  • Resource Trading
  • Scheduling Predictions
  • Transportation of data and results
  • Steering and data management
  • It allows the user to study the behaviour of some of the
    output variables against a range of different
    input scenarios.

44
A Glance at Nimrod-G Broker
Nimrod/G Client
Nimrod/G Client
Nimrod/G Client
Nimrod/G Engine
Schedule Advisor
Trading Manager
Grid Store
Grid Dispatcher
Grid Explorer
Grid Middleware
TM TS
Globus, Legion, Condor, etc.
GE GIS
Grid Information Server(s)
RM TS
RM TS
RM TS
G
C
L
G
Legion enabled node.
Globus enabled node.
L
G
C
L
RM: Local Resource Manager, TS: Trade Server
Condor enabled node.
See HPCAsia 2000 paper!
45
Nimrod/G Grid Broker Architecture
Legacy Applications
Customised Apps (Active Sheet)
Monitoring and Steering Portals
Nimrod Clients
P-Tools (GUI/Scripting) (parameter_modeling)
XML?
Farming Engine
Meta-Scheduler
XML
Algorithm1
Programmable Entities Management
Schedule Advisor
. . .
Resources
Jobs
Tasks
Channels
AlgorithmN
Nimrod Broker
Agents
AgentScheduler
JobServer
IP hourglass ?
Trading Manager
Grid Explorer
Database (Postgres)
Dispatcher Actuators
. . .
Legion-A
P2P-A
Globus-A
. . .
Condor
GMD
Globus
Legion
P2P
GTS
Middleware
G-Bank
. . .
Computers
Storage
Networks
Instruments
Local Schedulers
Fabric
. . .
PC/WS/Clusters
Radio Telescope
Condor/LL/Mosix/
Database
46
A Nimrod/G Monitor
Deadline
Legion hosts
Globus Hosts
Bezek is in both Globus and Legion Domains
47
User Requirements Deadline/Budget
48
Active Sheet: Spreadsheet Processing on Grid
Nimrod Proxy
Nimrod/G
See HPC 2001 paper!
49
(No Transcript)
50
Nimrod/G Interactions
Resource Discovery
Grid Info servers
Scheduler
Grid Trade Server
Resource allocation (local)
Farming Engine
Queuing System
User process
Nimrod Agent
Dispatcher
Process server
Do this in 30min. for 10?
Gatekeeper node
Computational node
Root node
51
Adaptive Scheduling Algorithms
See HPDC AMS 2001 paper!
Discover More Resources
Discover Resources
Establish Rates
Compose Schedule
Evaluate Reschedule
Meet requirements ? Remaining Jobs, Deadline,
Budget ?
Distribute Jobs
52
Cost Model
  • Without cost, ANY shared system becomes
    unmanageable
  • Charge users more for remote facilities than
    their own
  • Choose cheaper resources before more expensive
    ones
  • Cost units (G) may be
  • Dollars
  • Shares in global facility
  • Stored in bank

53
Cost Matrix @ Grid site X
  • Non-uniform costing
  • Encourages use of local resources first
  • Real accounting system can control machine usage

Resource Cost = Function (cpu, memory, disk,
network, software, QoS, current demand, etc.)
Simple: price based on peak time, off-peak,
discount when less demand, ...
54
Deadline and Budget-based Cost Minimization
Scheduling
  • Sort resources by increasing cost.
  • For each resource in order, assign as many jobs
    as possible to the resource, without exceeding
    the deadline.
  • Repeat all steps until all jobs are processed.

55
Deadline-based Cost-minimization Scheduling
  • M - Resources, N - Jobs, D - Deadline
  • Note: the cost of any Ri is less than that of any of
    Ri+1, ..., or Rm
  • RL - Resource List, which needs to be maintained in
    increasing order of cost
  • Ct - Time when accessed (time now)
  • Ti - Job runtime (average) on resource i (Ri),
    updated periodically
  • Ti acts as a load profiling parameter.
  • Ai - number of jobs assigned to Ri, where
  • Ai = Min(No. of unassigned jobs, No. of jobs Ri can
    complete by the remaining deadline)
  • No. of unassigned jobs (at Ri) = Diff(N, (A1 + ... + Ai-1))
  • No. of jobs Ri can consume = RemainingTime (D - Ct) DIV Ti
  • ALG: Invoke Job Assignment() periodically until
    all jobs are done.
  • Job Assignment()/Reassignment():
  • Establish (RL, Ct, Ti, Ai) dynamically
    (Resource Discovery).
  • For all resources (i = 1 to M): assign Ai jobs
    to Ri, if required
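
A single pass of the job assignment step can be sketched as follows. The sketch assumes that each resource runs its jobs one after another, so that Ri can finish (D - Ct) DIV Ti jobs before the deadline. The resource names, costs and runtimes are invented for the example, and the periodic reassignment loop is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Resource:
    name: str
    cost_per_job: float   # used only to order the resource list RL
    avg_job_time: float   # Ti, in seconds


def assign_cost_min(resources: List[Resource], n_jobs: int,
                    deadline: float, now: float = 0.0) -> Tuple[Dict[str, int], int]:
    """Fill the cheapest resources first, up to what each can finish by the deadline."""
    assignment: Dict[str, int] = {}
    unassigned = n_jobs
    for r in sorted(resources, key=lambda res: res.cost_per_job):  # RL by cost
        capacity = max(int((deadline - now) // r.avg_job_time), 0)  # (D - Ct) DIV Ti
        a_i = min(unassigned, capacity)                             # Ai
        assignment[r.name] = a_i
        unassigned -= a_i
    return assignment, unassigned


pool = [
    Resource("dear", cost_per_job=5.0, avg_job_time=100.0),
    Resource("cheap", cost_per_job=1.0, avg_job_time=300.0),
    Resource("mid", cost_per_job=2.0, avg_job_time=150.0),
]
plan, left_over = assign_cost_min(pool, n_jobs=40, deadline=3600.0)
```

With a one-hour deadline the cheapest resource takes 12 jobs, the next 24, and only the remaining 4 go to the expensive one. A non-zero `left_over` would signal that the deadline cannot be met with the resources discovered so far.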

56
Deadline and Budget-based Time Minimization
Scheduling
  • For each resource, calculate the next completion
    time for an assigned job, taking into account
    previously assigned jobs.
  • Sort resources by next completion time.
  • Assign one job to the first resource for which
    the cost per job is less than the remaining
    budget per job.
  • Repeat all steps until all jobs are processed.
    (This is performed periodically or at each
    scheduling-event.)
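
The time-minimization steps above can be sketched as a loop that hands out one job per iteration. As assumptions for this example, resources run jobs sequentially, "next completion time" is estimated as (jobs already assigned + 1) x average job time, and a resource is eligible when its cost per job does not exceed the remaining budget per job. Names and numbers are hypothetical.

```python
from collections import namedtuple

Resource = namedtuple("Resource", "name cost_per_job avg_job_time")


def assign_time_min(resources, n_jobs, budget):
    """Give each job to the soonest-finishing resource that is still affordable."""
    assigned = {r.name: 0 for r in resources}
    remaining_budget = budget
    for remaining_jobs in range(n_jobs, 0, -1):
        budget_per_job = remaining_budget / remaining_jobs
        # Sort by estimated completion time, counting previously assigned jobs.
        by_completion = sorted(
            resources, key=lambda r: (assigned[r.name] + 1) * r.avg_job_time
        )
        choice = next(
            (r for r in by_completion if r.cost_per_job <= budget_per_job), None
        )
        if choice is None:
            break  # nothing affordable: the budget cannot cover the remaining jobs
        assigned[choice.name] += 1
        remaining_budget -= choice.cost_per_job
    return assigned, remaining_budget


pool = [Resource("fast", 4.0, 100.0), Resource("slow", 1.0, 250.0)]
generous = assign_time_min(pool, n_jobs=4, budget=16.0)
tight = assign_time_min(pool, n_jobs=4, budget=7.0)
```

With the generous budget most jobs land on the fast, expensive resource; with the tight budget the scheduler is pushed towards the slow, cheap one until enough budget per job has accumulated.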

57
Deadline and Budget-based Time and Cost Min.
Scheduling
  • Split resources by whether cost per job is less
    than budget per job.
  • For the cheaper resources, assign jobs in inverse
    proportion to the job completion time (e.g. a
    resource with completion time 5 gets twice as
    many jobs as a resource with completion time
    10).
  • For the dearer resources, repeat all steps (with
    a recalculated budget per job) until all jobs are
    assigned.
  • Schedule/Reschedule: Repeat all steps until all
    jobs are processed.
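
The "inverse proportion" rule for the cheaper resources is easy to get subtly wrong with floating point, so the sketch below uses exact fractions. The function name and the tie-breaking rule for leftover jobs (fastest resources first) are assumptions made for the example.

```python
from fractions import Fraction


def split_inverse_to_time(completion_times, n_jobs):
    """Share n_jobs among resources in inverse proportion to completion time."""
    weights = {name: Fraction(1) / Fraction(t) for name, t in completion_times.items()}
    total = sum(weights.values())
    # Whole-number share for each resource (rounded down).
    shares = {name: int(n_jobs * w / total) for name, w in weights.items()}
    # Rounding down can leave a few jobs over; give them to the fastest resources.
    leftover = n_jobs - sum(shares.values())
    for name in sorted(completion_times, key=completion_times.get)[:leftover]:
        shares[name] += 1
    return shares


even_split = split_inverse_to_time({"r5": 5, "r10": 10}, n_jobs=30)
uneven_split = split_inverse_to_time({"r5": 5, "r10": 10}, n_jobs=10)
```

The first call reproduces the example from the slide: the resource with completion time 5 receives twice as many jobs as the one with completion time 10.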

58
Evaluation of Scheduling Heuristics
  • A Hypothetical Application on
  • World Wide Grid

59
World Wide Grid (WWG)
Australia
North America
ANL: SGI/Sun/SP2; USC-ISI: SGI; UVa: Linux
Cluster; UD: Linux cluster; UTK: Linux cluster
Monash Uni.
Nimrod/G
Linux cluster
GlobusLegion GRACE_TS
Globus/Legion GRACE_TS
Solaris WS
Internet
Asia/Japan
Europe
Tokyo I-Tech.; ETL, Tsukuba
ZIB/FUB: T3E/Mosix; Cardiff: Sun E6500; Paderborn:
HPCLine; Lecce: Compaq SC; CNR: Cluster; Calabria:
Cluster; CERN: Cluster; Poznan: SGI/SP2
Linux cluster
Globus GRACE_TS
Chile Cluster
Globus GRACE_TS
Globus GRACE_TS
South America
60
Experiment-1 Setup
  • Workload
  • 165 jobs, each needing 5 minutes of CPU time
  • Deadline: 1 hr. and budget: 800,000 units
  • Strategy: minimise cost and meet deadline
  • Execution cost with cost optimisation
  • AU peak time: 471205 (G)
  • AU off-peak time: 427155 (G)

61
Resources Selected Price/CPU-sec.
62
Execution @ AU Peak Time
63
Execution @ AU Offpeak Time
64
AU peak Resources/Cost in Use
After the calibration phase, note the difference
in pattern of two graphs. This is when scheduler
stopped using expensive resources.
65
AU offpeak Resources/Cost in Use
66
Experiment-2 Setup
  • Workload
  • 165 jobs, each needing 5 minutes of CPU time
  • Deadline: 2 hrs. and budget: 396000 units
  • Strategy: minimise time / cost
  • Execution cost with cost optimisation
  • Optimise Cost: 115200 (G) (finished in 2 hrs.)
  • Optimise Time: 237000 (G) (finished in 1 hr.)
  • In this experiment, the time-optimised scheduling run
    costs about double that of the cost-optimised run.
  • Users can now trade off between time and cost.
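
The figures of this experiment can be cross-checked with a few lines of arithmetic. The only assumption is that every job consumed exactly its nominal 5 minutes of CPU time, which lets the totals be expressed as an average price per CPU-second.

```python
jobs = 165
cpu_sec_per_job = 5 * 60
total_cpu_sec = jobs * cpu_sec_per_job            # nominal CPU-seconds of the workload

budget = 396000
cost_optimised = 115200                           # finished in 2 hrs.
time_optimised = 237000                           # finished in 1 hr.

ratio = time_optimised / cost_optimised           # how much dearer the fast run was
budget_per_cpu_sec = budget / total_cpu_sec       # what the user was willing to pay
avg_price_cost_opt = cost_optimised / total_cpu_sec
avg_price_time_opt = time_optimised / total_cpu_sec
```

Both runs stay well inside the budget; halving the completion time roughly doubles the average price paid per CPU-second.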

67
Resources Selected Price/CPU-sec.
68
Scheduling for Time Optimization
69
Scheduling for Cost Optimization
70
Application Case Study
  • The Virtual Laboratory Project "Molecular
    Modelling for Drug Design" on Peer-to-Peer Grid

71
Drug Design: Data Intensive Computing on Grid
  • A Virtual Laboratory for Molecular Modelling for
    Drug Design on Peer-to-Peer Grid.
  • It provides tools for examining millions of
    chemical compounds (molecules) in the Protein
    Data Bank (PDB) to identify those having
    potential use in drug design.
  • In collaboration with
  • Kim Branson, Structural Biology, Walter and Eliza
    Hall Institute (WEHI)

http://www.csse.monash.edu.au/rajkumar/dd@home/
72
DesignDrug@Home Architecture: A Virtual Lab for
Molecular Modeling for Drug Design on P2P Grid
Grid Info. Service
Grid Market Directory
Data Replica Catalogue
Give me the list of PDB sources of type aldrich_300?
service cost?
service providers?
GTS
Resource Broker
Screen 2K molecules in 30min. for 10
mol.5 please?
(RB maps suitable Grid nodes and Protein DataBank)
get mol.10 from pdb1 screen it.
PDB2
mol.10 please?
(GTS - Grid Trade Server)
PDB1
73
Software Tools
  • Molecular Modelling Tools (DOCK)
  • Parameter Modelling Tools (Nimrod/enFuzion)
  • Grid Resource Broker (Nimrod-G)
  • Data Grid Broker
  • Protein Data Bank (PDB) Management and
    Intelligent Access Tools
  • PDB database Lookup/Index Table Generation.
  • PDB and associated index-table Replication.
  • PDB Replica Catalogue (that helps in Resource
    Discovery).
  • PDB Servers (that serve PDB client requests).
  • PDB Brokering (Replica Selection).
  • PDB Clients for fetching Molecule Record (Data
    Movement).
  • Grid Middleware (Globus and GrACE)
  • Grid Fabric Management (Fork/LSF/Condor/Codine/)

74
DOCK code (Enhanced by WEHI, U of Melbourne)
  • A program to evaluate the chemical and geometric
    complementarities between a small molecule and a
    macromolecular binding site.
  • It explores ways in which two molecules, such as
    a drug and an enzyme or protein receptor, might
    fit together.
  • Compounds which dock to each other well, like
    pieces of a three-dimensional jigsaw puzzle, have
    the potential to bind.
  • So, why is it important to be able to identify small
    molecules which may bind to a target
    macromolecule?
  • A compound which binds to a biological
    macromolecule may inhibit its function, and thus
    act as a drug.
  • For example, by disabling the ability of a virus (such as HIV)
    to attach itself to a molecule/protein!
  • With system-specific code changed, we have been
    able to compile it for Sun-Solaris, PC Linux, SGI
    IRIX, Compaq Alpha/OSF1

Original Code: University of California, San
Francisco, http://www.cmpharm.ucsf.edu/kuntz/
75
Dock input file
  • score_ligand yes
  • minimize_ligand yes
  • multiple_ligands no
  • random_seed 7
  • anchor_search no
  • torsion_drive yes
  • clash_overlap 0.5
  • conformation_cutoff_factor 3
  • torsion_minimize yes
  • match_receptor_sites no
  • random_search yes
  • . . . . . .
  • . . . . . .
  • maximum_cycles 1
  • ligand_atom_file S_1.mol2
  • receptor_site_file ece.sph
  • score_grid_prefix ece
  • vdw_definition_file parameter/vdw.defn
  • chemical_definition_file parameter/chem.defn

76
Parameterized Dock input file
score_ligand                $score_ligand
minimize_ligand             $minimize_ligand
multiple_ligands            $multiple_ligands
random_seed                 $random_seed
anchor_search               $anchor_search
torsion_drive               $torsion_drive
clash_overlap               $clash_overlap
conformation_cutoff_factor  $conformation_cutoff_factor
torsion_minimize            $torsion_minimize
match_receptor_sites        $match_receptor_sites
random_search               $random_search
. . . . . .
. . . . . .
maximum_cycles              $maximum_cycles
ligand_atom_file            ${ligand_number}.mol2
receptor_site_file          $HOME/dock_inputs/$receptor_site_file
score_grid_prefix           $HOME/dock_inputs/$score_grid_prefix
vdw_definition_file         vdw.defn
chemical_definition_file    chem.defn
chemical_score_file         chem_score.tbl
flex_definition_file        flex.defn
flex_drive_file             flex_drive.tbl
ligand_contact_file         dock_cnt.mol2
ligand_chemical_file        dock_chm.mol2
ligand_energy_file          dock_nrg.mol2
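
The idea of a parameterized input file is that each job receives its own copy with the placeholders filled in. The sketch below imitates that step with Python's standard `string.Template`, assuming `$name` style placeholders. It illustrates the substitution idea only and is not the mechanism Nimrod itself uses; the three-line template and the values are made up.

```python
from string import Template

# A shortened, hypothetical version of the parameterized DOCK input file.
dock_base = Template(
    "score_ligand $score_ligand\n"
    "random_seed $random_seed\n"
    "ligand_atom_file ${ligand_number}.mol2\n"
)

# Parameter values chosen for one particular job.
job_values = {"score_ligand": "yes", "random_seed": 7, "ligand_number": 42}

# substitute() raises KeyError if any placeholder is left without a value.
dock_run = dock_base.substitute(job_values)
```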
77
Dock PlanFile (contd.)
parameter database_name label "database_name" text select oneof
    "aldrich" "maybridge" "maybridge_300" "asinex_egc" "asinex_epc"
    "asinex_pre" "available_chemicals_directory"
    "inter_bioscreen_s" "inter_bioscreen_n"
    "inter_bioscreen_n_300" "inter_bioscreen_n_500"
    "biomolecular_research_institute"
    "molecular_science" "molecular_diversity_preservation"
    "national_cancer_institute" "IGF_HITS"
    "aldrich_300" "molecular_science_500" "APP" "ECE"
    default "aldrich_300"
parameter score_ligand text default "yes"
parameter minimize_ligand text default "yes"
parameter multiple_ligands text default "no"
parameter random_seed integer default 7
parameter anchor_search text default "no"
parameter torsion_drive text default "yes"
parameter clash_overlap float default 0.5
parameter conformation_cutoff_factor integer default 5
parameter torsion_minimize text default "yes"
parameter match_receptor_sites text default "no"
parameter random_search text default "yes"
. . . . . .
parameter maximum_cycles integer default 1
parameter receptor_site_file text default "ece.sph"
parameter score_grid_prefix text default "ece"
parameter ligand_number integer range from 1 to 200 step 1
Molecules to be screened
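
A plan file of this kind describes a parameter sweep: every combination of parameter values becomes one job. The following sketch expands a few of the parameters above into a job list. The dictionary layout and the `expand` helper are illustrative assumptions, not Nimrod's internal representation.

```python
from itertools import product

# Parameters with a single default contribute one value; ranges contribute many.
parameters = {
    "database_name": ["aldrich_300"],
    "random_seed": [7],
    "ligand_number": range(1, 201),   # "range from 1 to 200 step 1"
}


def expand(params):
    """Yield one dict of parameter values per job (the cross product)."""
    names = list(params)
    for combo in product(*(params[name] for name in names)):
        yield dict(zip(names, combo))


jobs = list(expand(parameters))
```

Here only `ligand_number` varies, so the sweep yields 200 jobs, one per molecule to be screened.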
78
Dock PlanFile
task nodestart
    copy ./parameter/vdw.defn node:.
    copy ./parameter/chem.defn node:.
    copy ./parameter/chem_score.tbl node:.
    copy ./parameter/flex.defn node:.
    copy ./parameter/flex_drive.tbl node:.
    copy ./dock_inputs/get_molecule node:.
    copy ./dock_inputs/dock_base node:.
endtask
task main
    node:substitute dock_base dock_run
    node:substitute get_molecule get_molecule_fetch
    node:execute sh ./get_molecule_fetch
    node:execute $HOME/bin/dock.$OS -i dock_run -o dock_out
    copy node:dock_out ./results/dock_out.$jobname
    copy node:dock_cnt.mol2 ./results/dock_cnt.mol2.$jobname
    copy node:dock_chm.mol2 ./results/dock_chm.mol2.$jobname
    copy node:dock_nrg.mol2 ./results/dock_nrg.mol2.$jobname
endtask
79
Nimrod/TurboLinux enFuzion GUI tools for
Parameter Modeling
80
Docking Experiment Preparation
  • Setup PDB DataGrid
  • Index PDB databases
  • Pre-stage (all) Protein Data Bank (PDB) on
    replica sites
  • Start PDB Server
  • Create Docking GridScore (receptor surface
    details) for a given receptor on home node.
  • Pre-Staging Large Files required for Docking
  • Pre-stage Dock executables and PDB access client
    on Grid nodes, if required (e.g., dock.Linux,
    dock.SunOS, dock.IRIX64, and dock.OSF1 on Linux,
    Sun, SGI, and Compaq machines respectively). Use
    globus-rcp.
  • Pre-stage/Cache all data files (3-13MB each)
    representing receptor details on Grid nodes.
  • This can be done on demand by Nimrod/G for each
    job, but a few input files are too large and they
    are required for all jobs. So,
    pre-staging/caching at the http-cache or broker level
    is necessary to avoid the overhead of copying the
    same input files again and again!

81
Protein Data Bank
  • Databases consist of small molecules from
    commercially available organic synthesis
    libraries, and natural product databases.
  • There is also the ability to screen virtual
    combinatorial databases, in their entirety.
  • This methodology allows only the required
    compounds to be subjected to physical screening
    and/or synthesis, reducing both time and expense.

82
Target Testcase
  • The target for the test case is endothelin
    converting enzyme (ECE). This is involved in
    heart stroke and other transient ischemia.
  • Ischemia: A decrease in the blood supply to a
    bodily organ, tissue, or part caused by
    constriction or obstruction of the blood vessels.

83
DataGrid Brokering
(Figure: DataGrid brokering. Request: "Screen 2K
molecules in 30min. for 10". Components: Nimrod/G
Computational Grid Broker (Algorithm1 ... AlgorithmN),
Data Replica Catalogue, PDB Broker, Grid Info.
Service, PDB Service (e.g., PDB2), and Grid Service
Providers GSP1 ... GSPn. Numbered interactions 1-7,
with labels: "advise PDB source?", "PDB replicas
please?", "selection advise: use GSP4!", "Is GSP4
healthy?", "Screen mol.5 please?", "mol.5 please?",
"process & send results".)
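
One plausible reading of the DataGrid Brokering interactions is sketched below in Python. The order of the seven numbered steps cannot be recovered from this transcript, so the flow is an assumption, and `Provider`, `replica_sites`, `select_source`, and `screen` are hypothetical names, not Nimrod/G APIs. The sketch asks which sites hold a PDB replica, checks their health, and sends molecule requests to the advised site.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class Provider:
    """A Grid Service Provider (GSP) as seen by the broker (hypothetical model)."""
    name: str
    healthy: bool
    has_pdb_replica: bool


def replica_sites(providers: Iterable[Provider]) -> List[Provider]:
    """Stand-in for the Data Replica Catalogue: sites holding a PDB replica."""
    return [p for p in providers if p.has_pdb_replica]


def select_source(providers: Iterable[Provider]) -> Optional[Provider]:
    """Stand-in for the PDB Broker: advise the first healthy replica site."""
    for p in replica_sites(providers):
        if p.healthy:  # stand-in for the "Is GSP4 healthy?" query
            return p
    return None


def screen(molecule_ids: Iterable[int], providers: Iterable[Provider]) -> Dict[int, str]:
    """Map each molecule to the PDB source that would serve it."""
    source = select_source(list(providers))
    if source is None:
        raise RuntimeError("no healthy PDB replica available")
    return {m: source.name for m in molecule_ids}


if __name__ == "__main__":
    grid = [
        Provider("GSP1", healthy=True, has_pdb_replica=False),
        Provider("GSP2", healthy=False, has_pdb_replica=True),
        Provider("GSP4", healthy=True, has_pdb_replica=True),
    ]
    print(screen([5], grid))
```

A real broker would also weigh cost and deadline when picking a source (the slide's request carries both). This sketch only covers replica lookup and the health check.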
84
Nimrod/G in Action: Screening on World-Wide Grid
85
Any Scientific Discovery? Did your collaborator
invent a new drug for xxxx?
Not Yet?
86
Conclude with a comparison with the Electrical
Grid...
  • Where are we?

Courtesy: Domenico Laforenza
87
Alessandro Volta, in Paris in 1801 at the French
National Institute, shows the battery in the
presence of Napoleon I
  • Fresco by N. Cianfanelli (1841)
  • (Zoological Section "La Specola" of the Natural
    History Museum of Florence University)

88
...and in the future, I imagine a worldwide Power
(Electrical) Grid ...
Oh, mon Dieu!
What?!?! This is a mad man
89
2001 - 1801 = 200 Years
90
Grid Computing: A New Wave?
Can we Predict its Future?
"I think there is a world market for about five
computers." Thomas J. Watson Sr., IBM Founder,
1943
91
What Enron, World Leader in Power and Natural Gas
Distribution Business, Thinks of Economy Grid!...
  • -------- Original Message --------
  • Subject: Your papers on Economics Grid
    allocation
  • Date: Wed, 14 Mar 2001 12:10:20 -0800
  • From: Lance_Norskog_at_enron.net
  • To: rajkumar_at_csse.monash.edu.au,
    davida_at_csse.monash.edu.au, jon_at_csse.monash.edu.au
  • Hello-
  • I am researching mass computation
    infrastructures.
  • The company I work for, Enron, is a worldwide
    commodity company. Our business model is to find
    mid-sized commodity markets that don't work like
    large-scale commodity markets (like wheat, gold,
    orange juice, etc.) and to restructure them. The
    division I work in is working to do this for
    long-distance fiber optic bandwidth. Other
    divisions are pursuing metals, paper pulp, etc.
    (We even have "weather derivatives", which is a
    betting parlor for industries that depend far too
    much on hot or cold weather. Somehow, this is
    legal!) So, my perspective on mass computation is
    from the point of view of how to make it a
    large-scale market.

92
What Enron, World Leader in Power and Natural Gas
Distribution Business, Thinks of Economy Grid!...
  • Our customers are large corporations. They have
    a need not to solve some particular problem now
    and then, but every day. For example, take a
    department store chain that routes one million
    different items from warehouses into department
    stores every day, and wants to do it in a
    near-optimal way. They need to run this
    computation, with different input vectors, every
    business day!
  • If that company is to commit to running its
    business with the Grid, it needs a few guarantees
    around its Grid use:
  • 1) complete reliability and availability
  • 2) a known price for every use, set far in
    advance
  • 3) an active spot market in case its regular
    supplier fails
  • 4) enforceable contracts guaranteeing quality of
    service
  • This last is a killer: the quantity of processing
    power you get can't be squishy. It has to be a
    measurable unit: "Java MIPS" is computer-driven.
  • Number 2, a known price for every use, is also
    missing from your analysis.
  • A large, active, "liquid" commodity market can
    supply not merely spot

93
What Enron, World Leader in Power and Natural Gas
Distribution Business, Thinks of Economy Grid!...
  • Large commodity markets have the vast majority of
    their product change hands under such long-term
    agreements, rather than on the spot market. The
    strategy is to buy part of your needs in
    long-term contracts, part in medium-term
    contracts, and most of the rest on short-term,
    just to avoid getting stuck 5 years from now with
    vast amounts of a commodity you don't want. Every
    day you might have a missing 3-5 that you need to
    buy on the spot market. (I'm in California. We're
    having so much trouble because our utilities were
    banned from buying a heterogeneous basket of
    contracts and were required to buy all
    electricity on the spot market.)
  • I think my point is that while economics is a
    fine metaphor for making operational decisions in
    scheduling resources, those numbers will not be
    visible to end customers. Instead, the customers
    will buy blocks of resource for future delivery
    with pricing based on standard macro-economic
    factors like interest rates, falling machine
    prices, rising electricity prices, etc. The
    producers will use your economic-based techniques
    to direct their day-to-day operations and to make
    the interproducer spot market function.
  • Lance Norskog
  • Sr. Software Engineer
  • Enron Broadband Systems

94
Conclusions
  • Grid computing is emerging as a next-generation
    computing platform.
  • Using an economics paradigm to manage resources
    in Grid computing is essential to push the Grid
    into mainstream computing!
  • Adaptive, scalable, and easy-to-use systems and
    tools are essential to make end users' lives
    easier.
  • It is projected that the impact of the World-Wide
    Grid on the 21st-century economy will be similar
    to the impact the electric power grid had on the
    20th-century economy.
  • To achieve this goal, in my humble opinion, use
    the Nimrod family of tools (not to mention the
    Nimrod-G broker) along with Globus, of course!
  • Enjoy the excitement of World-Wide Grid computing!

95
Download Software Information
  • Nimrod Parametric Computing
  • http://www.csse.monash.edu.au/davida/nimrod/
  • Economy Grid Nimrod/G
  • http://www.buyya.com/ecogrid/
  • Virtual Laboratory/DesignDrug_at_Home
  • http://www.buyya.com/dd_at_home/
  • Grid Simulation (Java based)
  • http://www.buyya.com/gridsim/
  • World Wide Grid testbed
  • http://www.buyya.com/ecogrid/wwg/
  • Looking for new volunteers to grow?
  • Please contact me to barter your and our machines!
  • Want to build on our work/collaborate?
  • Talk to me now or email
    rajkumar_at_csse.monash.edu.au

96
Acknowledgements
  • Special thanks to the following colleagues for
    sharing ideas/works:
  • David Abramson, Monash University
  • Jack Dongarra, University of Tennessee
  • Wolfgang Gentzsch, Sun Microsystems
  • Jon Giddy, DSTC _at_ Monash University
  • Domenico Laforenza, CNR/CNUCE, Italy
  • Globus Team!
  • Unable to mention all names explicitly here,
    however, their efforts are recognized by
    featuring their work in this presentation.
  • Colleagues from Asia (Japan), Europe (Italy,
    Germany, Switzerland, Poland, UK), the US, and
    Chile for providing access to their machines -->
    WWG testbed.

97
(No Transcript)
98
Further Information
  • Books
  • High Performance Cluster Computing, V1, V2,
    R.Buyya (Ed), Prentice Hall, 1999.
  • The GRID, I. Foster and C. Kesselman (Eds),
    Morgan-Kaufmann, 1999.
  • IEEE Task Force on Cluster Computing
  • http://www.ieeetfcc.org
  • Global Grid Forum
  • www.gridforum.org
  • IEEE/ACM CCGridxy www.ccgrid.org
  • CCGrid 2002, Berlin ccgrid2002.zib.de
  • Grid workshop - www.gridcomputing.org

99
Further Information
  • Cluster Computing Info Centre
  • http://www.buyya.com/cluster/
  • Grid Computing Info Centre
  • http://www.gridcomputing.com
  • IEEE DS Online - Grid Computing area
  • http://computer.org/dsonline/gc
  • Compute Power Market Project
  • http://www.ComputePower.com