Title: Grid meets Economics: A Market Paradigm for Resource Management and Scheduling for WorldWide Grid Co
1Grid meets Economics: A Market Paradigm for Resource Management and Scheduling for World-Wide Grid Computing
Melbourne, Australia
www.buyya.com/ecogrid
3Need Honest Answers!
- I want to have access to your Grid resources and want to know how many of you are willing to give me access? (following cases)
- I am unable to give you access to our Australian machines, but I want to have access to yours!
- Want to solve academic problems
- Want to solve business problems
- I am willing to gift you Kangaroos! (bartering)
- I am willing to give you access to my machines, if you want. (sharing, but no measure, no QoS)
- I am willing to pay you dollars on usage basis. (economic incentive, market-based, and QoS)
4Overview
- A quick glance at today's Grid computing
- Resource Management challenges for next generation Grid computing
- A glance at approaches to Grid computing
- Grid Architecture for Computational Economy
- Economy Grid Globus GRACE
- Nimrod-G -- Grid Resource Broker
- Scheduling Experiments
- Case Study: Drug Design Application on Grid
- Conclusions
5Scalable HPC: Breaking Administrative Barriers and new challenges
[Figure: PERFORMANCE plotted against Administrative Barriers (Individual, Group, Department, Campus, State, National, Globe, Inter Planet, Universe). Systems shown: Desktop, SMPs or SuperComputers, Local Cluster, Enterprise Cluster/Grid, Global Cluster/Grid, Inter Planetary Grid!]
6Why Grids? Large Scale Explorations need them: Killer Applications.
- Solving grand challenge applications using
modeling, simulation and analysis
Aerospace
Internet Ecommerce
Life Sciences
CAD/CAM
Digital Biology
Military Applications
8What is Grid ?
- An infrastructure that logically couples distributed resources
- Computers: PCs, workstations, clusters, supercomputers, laptops, notebooks, mobile devices, PDAs, etc.
- Software: e.g., ASPs renting expensive special-purpose applications on demand
- Catalogued data and databases: e.g., transparent access to the human genome database
- Special devices: e.g., radio telescope (SETI@Home searching for life in the galaxy)
- People/collaborators
- and presents them as an integrated global resource.
- It enables the creation of virtual enterprises (VEs) for resource sharing.
Widearea
9Grid Applications-Drivers
- Distributed HPC (Supercomputing)
- Computational science.
- High-throughput computing
- Large scale simulation/chip design and parameter studies.
- Content Sharing (free or paid)
- Sharing digital contents among peers (e.g., Napster)
- Remote software access/renting services
- Application service providers (ASPs).
- Data-intensive computing
- Data mining, particle physics (CERN), Drug Design.
- On-demand, realtime computing
- Medical instrumentation and network-enabled solvers.
- Collaborative
- Collaborative design, data exploration, education.
10Building and Using Grids require
- Services that make our systems Grid Ready!
- Security mechanisms that permit resources to be accessed only by authorized users.
- (New) programming tools that make our applications Grid Ready!
- Tools that can translate the requirements of an application/user into the requirements of computers, networks, and storage.
- Tools that perform resource discovery, trading, selection/allocation, scheduling and distribution of jobs, and collect results.
Globus
?
11Players in Grid Computing
12What users want? Users in Grid Economy and their Strategy
- Grid Consumers
- Execute jobs for solving problems of varying size and complexity
- Benefit by selecting and aggregating resources wisely
- Tradeoff: timeframe and cost
- Strategy: minimise expenses
- Grid Providers
- Contribute idle resources for executing consumer jobs
- Benefit by maximizing resource utilisation
- Tradeoff: local requirements and market opportunity
- Strategy: maximise return on investment
13Challenges for Next Generation Grid Technology
Development
14Sources of Complexity in Resource Management for
World Wide Grid Computing
- Size (large number of nodes, providers, consumers)
- Heterogeneity of resources (PCs, workstations, clusters, supercomputers, instruments, databases, software)
- Heterogeneity of fabric management systems (single system image OS, queuing systems, etc.)
- Heterogeneity of fabric management policies
- Heterogeneity of application requirements (CPU, I/O, memory, and/or network intensive)
- Heterogeneity in resource demand patterns (peak, off-peak, ...)
- Applications need different QoS at different times (time critical results). The utility of experimental results varies from time to time.
- Geographical distribution of users located in different time zones
- Differing goals (producers and consumers have different objectives and strategies)
- Insecure and unreliable environment
15Traditional approaches to resource management and scheduling are NOT useful for Grid?
- They use a centralised policy that needs
- complete state information and
- a common fabric management policy, or a decentralised consensus-based policy.
- Due to too many heterogeneous parameters in the Grid, it is impossible to define/get
- a system-wide performance matrix and
- a common fabric management policy that is acceptable to all.
- The economics paradigm has proved to be an effective institution for managing the decentralization and heterogeneity present in human economies!
- Fall of USSR and emergence of US as world superpower! (monopoly?)
- So, we propose/advocate the use of computational economics principles in the management of resources and the scheduling of computations on the world-wide Grid.
- Think locally and act globally approach to grid computing!
16Benefits of Computational Economies
- It provides a nice paradigm for managing self-interested and self-regulating entities (resource owners and consumers)
- Helps in regulating supply and demand of resources.
- Services can be priced in such a way that equilibrium is maintained.
- User-centric / Utility driven
- Scalable
- No need of central coordinator (during negotiation)
- Resources (sellers) and also Users (buyers) can make their own decisions and try to maximize utility and profit.
- Adaptable
- It helps in offering different QoS (quality of service) to different applications depending on the value users place on them.
- It improves the utilisation of resources
- It offers incentive for resource owners for being part of the grid!
- It offers incentive for resource consumers for being good citizens
- There is a large body of proven economic principles and techniques available; we can easily leverage it.
17New challenges of Computational Economy
- Resource Owners
- How do I decide prices ? (economic models?)
- How do I specify them ?
- How do I enforce them ?
- How do I advertise and attract consumers?
- How do I do accounting and handle payments?
- ...
- Resource Consumers
- How do I decide expenses?
- How do I express QoS requirements?
- How do I trade off between timeframe and cost?
- ...
- Any tools, traders and brokers available to automate the process?
18mix-and-match
Object-oriented
Internet/partial-P2P
Grid Computing Approaches
Network enabled Solvers
Market/Computational Economy
Nimrod-G
19Many Grid Projects Initiatives
- Australia
- Economy Grid
- Nimrod-G
- Virtual Lab
- Active Sheets
- DISCWorld
- ..new coming up
- Europe
- UNICORE
- MOL
- Lecce GRB
- Poland MC Broker
- EU Data Grid
- EuroGrid
- MetaMPI
- Dutch DAS
- XW, JaWS
- and many more...
- Japan
- USA
- Globus
- Legion
- Javelin
- AppLeS
- NASA IPG
- Condor
- Harness
- NetSolve
- AccessGrid
- GrADS
- and many more...
- Cycle Stealing and .com Initiatives
- Distributed.net
- SETI@Home, ...
- Entropia, UD, Parabon, ...
- Public Forums
- Global Grid Forum
- P2P Working Group
http://www.gridcomputing.com
20Many Testbeds ? who pays ?, who regulates
demand and supply ?
GUSTO (decommissioned)
World Wide Grid
Legion Testbed
NASA IPG
21Testbeds so far -- observations
- Who contributed resources and why?
- Volunteers: for fun, challenge, fame, charismatic apps, public good, like the distributed.net and SETI@Home projects.
- Collaborators: sharing resources while developing new technologies of common interest (Globus, Legion, Ninf, MC Broker, Lecce GRB, ...). Unless you know the lab leaders, it is impossible to get access!
- How long?
- Short term: excitement is lost, too much admin overhead (Globus installation), no incentive, policy change, ...
- What do we need? Grid Marketplace!
- Regulates supply and demand, offers incentive for being players, simple, scalable solution, quasi-deterministic, proven model in the real world.
22Building an Economy Grid (Next Generation Grid Computing!)
To enable the creation of: Grid Marketplace (competitive), ASP, Service Oriented Computing, ...
And let users focus on their own work (science, engineering, or commerce)!
23GRACE: A Reference Grid Architecture for Computational Economy
[Figure: GRACE architecture diagram. Labels in the figure: Grid User; Application; Grid Resource Broker; Grid Explorer; Schedule Advisor; Trade Manager; Job Control Agent; Deployment Agent; Grid Middleware Services; Sign-on; Info?; Secure; Trading; QoS; JobExec; Storage; Grid Bank; Information Server(s); Grid Market Services; Health Monitor; Grid Service Providers; Grid Node 1 ... Grid Node N; Trade Server; Pricing Algorithms; Accounting; Resource Reservation; Resource Allocation; Misc. services; R1, R2, ... Rm.]
See PDPTA 2000 paper!
24Economic Models for Trading
- Commodity Market Model
- Posted Prices Models
- Bargaining Model
- Tendering (Contract Net) Model
- Auction Model
- English, first-price sealed-bid, second-price sealed-bid (Vickrey), and Dutch (consumer: low, high, rate; producer: high, low, rate)
- Proportional Resource Sharing Model
- Shareholder Model
- Partnership Model
See SPIE ITCom 2001 paper! (with Heinz Stockinger, CERN)
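As an illustration of one of the auction models listed above, here is a minimal sketch of a second-price sealed-bid (Vickrey) auction: the highest bidder wins but pays the second-highest bid. The function name, the bidder names and the reserve price are assumptions made for this example; they are not taken from GRACE or from the cited paper.

```python
from typing import Dict, Tuple


def second_price_sealed_bid(bids: Dict[str, float],
                            reserve: float = 0.0) -> Tuple[str, float]:
    """Return (winner, price) for a Vickrey auction.

    bids maps a bidder name to a sealed bid. Bids below the (hypothetical)
    reserve price are ignored. The winner pays the second-highest valid bid,
    or the reserve price if there is no other valid bid.
    """
    valid = {bidder: bid for bidder, bid in bids.items() if bid >= reserve}
    if not valid:
        raise ValueError("no bid meets the reserve price")
    # Highest bid first; ties keep the order in which bids were given.
    ranked = sorted(valid.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else reserve
    return winner, price


if __name__ == "__main__":
    # Three consumers bid (in G) for one time slot on a resource.
    print(second_price_sealed_bid({"userA": 10.0, "userB": 7.0, "userC": 12.0}))
```

In this example userC wins and pays 10.0, the bid of userA.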
25Grid Components
Applications and Portals
Grid Apps.
Prob. Solving Env.
Collaboration
Engineering
Web enabled Apps
Scientific
Grid Tools
Development Environments and Tools
Web tools
Languages
Libraries
Debuggers
Resource Brokers
Monitoring
Grid Middleware
Distributed Resources Coupling Services
QoS
Security
Information
Process
Resource Trading
Market Info
Local Resource Managers
TCP/IP UDP
Operating Systems
Queuing Systems
Libraries App Kernels
Grid Fabric
Networked Resources across Organisations
Clusters
Data Sources
Scientific Instruments
Storage Systems
Computers
26Economy Grid Globus GRACE
Applications
Grid Apps.
Science
Engineering
Commerce
Portals
ActiveSheet
High-level Services and Tools
Grid Status
Grid Tools
DUROC
globusrun
MPI-G
Nimrod/G
CC
Core Services
Heartbeat Monitor
Nexus
GRACE-TS
GRAM
Grid Middleware
Globus Security Interface
GASS
DUROC
MDS
GARA
GBank
GMD
Grid Fabric
Local Services
GRD
QBank
JVM
Condor
TCP
UDP
eCash
LSF
PBS
Solaris
Irix
Linux
See IPDPS HWC 2001 paper!
27GRACE components
- A resource broker (e.g., Nimrod/G)
- Various resource trading protocols for different economic models
- A mediator for negotiating between users and grid service providers (Grid Market Directory)
- A deal template for specifying resource requirements and service offers
- Grid Trading Server
- Pricing policy specification
- Accounting (e.g., QBank) and payment management (GridBank, not yet implemented)
28Grid Open Trading Protocols
Trade Manager
Trade Server
Get Connected
Pricing Rules
Reply to Bid (DT)
Negotiate Deal(DT)
.
API
Confirm Deal(DT, Y/N)
DT - Deal Template
- resource requirements (BM)
- resource profile (BS)
- price (any one can set)
- status
- change the above values
- negotiation can continue
- accept/decline
- validity period
Cancel Deal(DT)
Change Deal(DT)
Get Disconnected
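The exchange above (bid, negotiate, confirm) can be sketched as a small loop over a deal template. This is only an illustration: the field names, the fixed concession step and the accept/decline rule are assumptions made for the sketch, not the actual GRACE trading API.

```python
from dataclasses import dataclass


@dataclass
class DealTemplate:
    """Hypothetical deal template (DT) with the fields named on the slide."""
    requirements: str             # resource requirements, set by the Trade Manager
    profile: str                  # resource profile, set by the Trade Server
    price: float                  # either side may set the price
    status: str = "negotiating"   # "negotiating", "accepted" or "declined"
    valid_for_s: int = 60         # validity period in seconds


def negotiate(dt: DealTemplate, buyer_limit: float, seller_floor: float,
              step: float, max_rounds: int = 10) -> DealTemplate:
    """Seller starts at dt.price and concedes `step` per round.

    The buyer accepts as soon as the price is within its limit; the seller
    never goes below its floor. Both rules are assumptions of this sketch.
    """
    for _ in range(max_rounds):
        if dt.price <= buyer_limit:
            dt.status = "accepted"      # Confirm Deal(DT, Y)
            return dt
        next_price = dt.price - step
        if next_price < seller_floor:
            break                       # seller cannot concede further
        dt.price = next_price           # Negotiate Deal(DT) with a new price
    dt.status = "declined"              # Confirm Deal(DT, N)
    return dt


if __name__ == "__main__":
    offer = DealTemplate("10 CPUs for 1 hour", "Linux cluster", price=20.0)
    print(negotiate(offer, buyer_limit=14.0, seller_floor=10.0, step=2.0))
```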
29Pricing, Accounting, Allocations and Job Scheduling Flow @ each site/Grid Level
[Figure: flow diagram. Labels in the figure: Pricing Policy; GRID Bank (digital transactions); DB @ Each Site; QBank; Trade Server; Resource Manager (IBM-LL/PBS/...); Compute Resources (clusters/SGI/SP/...). Arrows are numbered 0 to 8 as listed below.]
0. Make Deposits, Transfers, Refunds, Queries/Reports
1. Client negotiates for access cost.
2. Negotiation is performed per owner-defined policies.
3. If client is happy, TS informs QB about access deal.
4. Job is submitted.
5. Check with QB for go-ahead.
6. Job starts.
7. Job completes.
8. Inform QB about resource utilization.
30Service Items to be Charged
- CPU - User and System time
- Memory
- maximum resident set size - page size
- amount of memory used
- page faults with/without physical I/O
- Storage size, r/w/block IO operations
- Network msgs sent/received
- Signals received, context switches
- Software and Libraries accessed
- Data Sources (e.g. Protein Data Bank)
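Several of the items above (user and system CPU time, maximum resident set size, page faults, block I/O, messages, signals, context switches) correspond to fields reported by the POSIX getrusage call. A minimal sketch of turning such a usage record into a charge is given below, using Python's standard resource module (Unix only). All rates are made-up values in Grid units (G), chosen only for illustration.

```python
import resource

# Hypothetical rates in Grid units (G) per unit of each item.
RATES = {
    "cpu_user_s": 2.0,
    "cpu_system_s": 2.0,
    "max_rss": 0.0001,        # unit is platform dependent (kilobytes on Linux)
    "major_faults": 0.01,     # page faults that needed physical I/O
    "block_io_ops": 0.001,
    "messages": 0.001,
    "context_switches": 0.00001,
}


def usage_items(ru):
    """Map a getrusage record to the chargeable items used in RATES."""
    return {
        "cpu_user_s": ru.ru_utime,
        "cpu_system_s": ru.ru_stime,
        "max_rss": ru.ru_maxrss,
        "major_faults": ru.ru_majflt,
        "block_io_ops": ru.ru_inblock + ru.ru_oublock,
        "messages": ru.ru_msgsnd + ru.ru_msgrcv,
        "context_switches": ru.ru_nvcsw + ru.ru_nivcsw,
    }


def charge(items, rates=RATES):
    """Total charge: sum of rate * amount over the items that were used."""
    return sum(rates[name] * amount for name, amount in items.items())


if __name__ == "__main__":
    items = usage_items(resource.getrusage(resource.RUSAGE_SELF))
    print(items)
    print("charge (G):", charge(items))
```

Software, library and data-source access are not visible to getrusage and would need separate accounting.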
31How to decide Price ?
- Fixed price model (like today's Internet)
- Dynamic/Demand and Supply (like tomorrow's Internet)
- Usage Period
- Loyalty of Customers (like airlines favoring frequent flyers!)
- Historical data
- Advance Agreement (high discount for corporations)
- Usage Timing (peak, off-peak, lunch time)
- Calendar based (holiday/vacation period)
- Bulk Purchase (register 100 .com domains at once!)
- Voting -- trade unions decide pricing structure
- Resource capability as benchmarked in the market!
- Academic R&D/public-good application users can be offered a cheaper rate compared to commercial use.
- Customer Type: quality- or price-sensitive buyers.
- Can be prescribed by regulating (Govt.) authorities
32Payments - Options and Automation
- Buy credits in advance / GSPs bill the user later -- pay as you go
- Pay by Electronic Currency via Grid Bank
- NetCash (anonymity), NetCheque, and Paypal
- NetCheque - http://www.isi.edu/gost/info/netcash/
- Users register with NC accounting servers, can write electronic cheques and send them (e.g., by email). When deposited, the balance is transferred from the sender's to the receiver's account.
- NetCash - http://www.isi.edu/gost/info/netcheque/
- It supports anonymity and uses the NetCheque system to clear payments between currency servers.
- Paypal.com: account (email) is linked to a credit card.
- Enter the recipient's email address and the amount you wish to request.
- The recipient gets an email notification and pays you at www.PayPal.com
33Nimrod-G: The Grid Resource Broker
- Soft Deadline and Budget-based Economy Grid Resource Broker for Parameter Processing on P2P Grids
34Parametric Computing(What Users think of Nimrod
Power)
Parameters
Magic Engine
Multiple Runs Same Program Multiple Data
Killer Application for the Grid!
See IPDPS 2000 paper!
Courtesy Anand Natrajan, University of Virginia
35P-study Applications -- Characteristics
- Code (Single Program sequential or threaded)
- High Resource Requirements
- Long-running Instances
- Numerous Instances (Multiple Data)
- High Computation-to-Communication Ratio
- Embarrassingly/Pleasantly Parallel
36Sample P-Sweep Applications
Bioinformatics: Drug Design / Protein Modelling
Combinatorial Optimization: Meta-heuristic parameter estimation
Ecological Modelling: Control Strategies for Cattle Tick
Sensitivity experiments on smog formation
Data Mining
Electronic CAD: Field Programmable Gate Arrays
High Energy Physics: Searching for Rare Events
Computer Graphics: Ray Tracing
Finance: Investment Risk Analysis
VLSI Design: SPICE Simulations
Civil Engineering: Building Design
Network Simulation
Automobile: Crash Simulation
Aerospace: Wing Design
Astrophysics
37Thesis
- Perform parameter sweep (bag of tasks), utilising distributed resources, within T hours or earlier and at a cost not exceeding M.
- Three Options/Solutions
- Using pure Globus commands
- Build your own Distributed App Scheduler
- Use Nimrod-G (Resource Broker)
38Executing Remotely
Choose Resource
Transfer Input Files
Set Environment
Start Process
Pass Arguments
Monitor Progress
Summary View Job View Event View
Read/Write Intermediate Files
Transfer Output Files
Resource Discovery, Trading, Scheduling,
Predictions, Rescheduling, ...
39Using Pure Globus commands
Do all yourself! (manually)
Total Cost???
40Build Distributed Application Scheduler
Build App case by case basis Complicated
Construction
E.g., AppLeS/MPI based
Total Cost???
41Use Nimrod-G
Aggregate Job Submission Aggregate View
Submit Play!
42Nimrod and Associated Family of Tools
Remote Execution Server (on demand Nimrod Agent)
P-sweep App. Composition: Nimrod/Enfusion
Resource Management and Scheduling: Nimrod-G Broker
Design Optimisations: Nimrod-O
App. Composition and Online Visualization: Active Sheets
Grid Simulation in Java: GridSim
Drug Design on Grid: Virtual Lab
File Transfer Server
Upcoming? HEPGrid (U. Melbourne), GAVE (Rutherford Appleton Lab): Grid (Un)Aware Virtual Engineering
43Nimrod/G: A Grid Resource Broker
- A resource broker for managing and steering task farming (parametric sweep) applications on computational Grids, based on deadline and computational economy.
- Key Features
- A single window to manage and control experiments
- Resource Discovery
- Resource Trading
- Scheduling and Predictions
- Transportation of data and results
- Steering and data management
- It allows the user to study the behaviour of some of the output variables against a range of different input scenarios.
44A Glance at Nimrod-G Broker
[Figure: Nimrod-G broker diagram. Labels in the figure: Nimrod/G Client (three clients); Nimrod/G Engine; Schedule Advisor; Trading Manager; Grid Store; Grid Dispatcher; Grid Explorer; Grid Middleware (Globus, Legion, Condor, etc.); TM TS; GE GIS; Grid Information Server(s); Grid nodes, each with RM and TS, marked G (Globus enabled node), L (Legion enabled node) or C (Condor enabled node).]
RM: Local Resource Manager, TS: Trade Server
See HPCAsia 2000 paper!
45Nimrod/G Grid Broker Architecture
Legacy Applications
Customised Apps (Active Sheet)
Monitoring and Steering Portals
Nimrod Clients
P-Tools (GUI/Scripting) (parameter_modeling)
XML?
Farming Engine
Meta-Scheduler
XML
Algorithm1
Programmable Entities Management
Schedule Advisor
. . .
Resources
Jobs
Tasks
Channels
AlgorithmN
Nimrod Broker
Agents
AgentScheduler
JobServer
IP hourglass ?
Trading Manager
Grid Explorer
Database (Postgres)
Dispatcher Actuators
. . .
Legion-A
P2P-A
Globus-A
. . .
Condor
GMD
Globus
Legion
P2P
GTS
Middleware
G-Bank
. . .
Computers
Storage
Networks
Instruments
Local Schedulers
Fabric
. . .
PC/WS/Clusters
Radio Telescope
Condor/LL/Mosix/
Database
46A Nimrod/G Monitor
Deadline
Legion hosts
Globus Hosts
Bezek is in both Globus and Legion Domains
47User Requirements Deadline/Budget
48Active Sheet Spreadsheet Processing on Grid
Nimrod Proxy
Nimrod/G
See HPC 2001 paper!
50Nimrod/G Interactions
Resource Discovery
Grid Info servers
Scheduler
Grid Trade Server
Resource allocation (local)
Farming Engine
Queuing System
User process
Nimrod Agent
Dispatcher
Process server
Do this in 30min. for 10?
Gatekeeper node
Computational node
Root node
51Adaptive Scheduling Algorithms
See HPDC AMS 2001 paper!
Discover More Resources
Discover Resources
Establish Rates
Compose Schedule
Evaluate Reschedule
Meet requirements ? Remaining Jobs, Deadline,
Budget ?
Distribute Jobs
52Cost Model
- Without cost, ANY shared system becomes unmanageable
- Charge users more for remote facilities than their own
- Choose cheaper resources before more expensive ones
- Cost units (G) may be
- Dollars
- Shares in global facility
- Stored in bank
53Cost Matrix @ Grid site X
- Non-uniform costing
- Encourages use of local resources first
- Real accounting system can control machine usage
Resource Cost = Function (cpu, memory, disk, network, software, QoS, current demand, etc.)
Simple: price based on peak time, off-peak, discount when less demand, ...
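A simple price function of the kind described above could look like the following sketch. The peak hours, the off-peak factor and the low-demand discount are all hypothetical values; the slides do not give the actual pricing rules used at any site.

```python
def price_per_cpu_sec(base: float, hour: int, demand: float = 1.0,
                      peak_hours=range(9, 18), offpeak_factor: float = 0.5,
                      low_demand_threshold: float = 0.3,
                      low_demand_discount: float = 0.2) -> float:
    """Price (in G) of one CPU-second at a site.

    base   -- peak-time price at this site
    hour   -- local hour of day, 0..23
    demand -- current load as a fraction of capacity, 0.0..1.0
    All defaults are assumptions made for this sketch.
    """
    # Peak-time price during local working hours, cheaper otherwise.
    price = base if hour in peak_hours else base * offpeak_factor
    # Extra discount when the resource is mostly idle.
    if demand < low_demand_threshold:
        price *= 1.0 - low_demand_discount
    return price


if __name__ == "__main__":
    for hour, demand in [(10, 0.9), (22, 0.9), (22, 0.1)]:
        print(hour, demand, price_per_cpu_sec(10.0, hour, demand))
```

Because the peak hours are local to each site, a resource can be at peak price in one time zone while resources elsewhere are off-peak, which is what a cost-minimising broker can exploit.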
54Deadline and Budget-based Cost Minimization Scheduling
- Sort resources by increasing cost.
- For each resource in order, assign as many jobs as possible to the resource, without exceeding the deadline.
- Repeat all steps until all jobs are processed.
55Deadline-based Cost-minimization Scheduling
- M: number of resources, N: number of jobs, D: deadline
- Note: the cost of any Ri is less than that of Ri+1, ..., Rm
- RL: the Resource List needs to be maintained in increasing order of cost
- Ct: time when accessed (time now)
- Ti: job runtime (average) on resource i (Ri), updated periodically
- Ti acts as a load profiling parameter.
- Ai: number of jobs assigned to Ri, where
- Ai = Min(No. of unassigned jobs, No. of jobs Ri can complete by the remaining deadline)
- No. of unassigned jobs (i) = Diff(N, (A1 + ... + Ai-1))
- No. of jobs Ri can consume = RemainingTime (D - Ct) DIV Ti
- ALG: invoke Job Assignment() periodically until all jobs are done.
- Job Assignment()/Reassignment():
- Establish (RL, Ct, Ti, Ai) dynamically (Resource Discovery).
- For all resources (i = 1 to M): assign Ai jobs to Ri, if required
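One pass of the job assignment step can be sketched as follows. The sketch assumes that each resource runs its jobs one after another, so that it can finish (D - Ct) DIV Ti jobs before the deadline, and it does not check the budget. The Resource class and its field names are illustrative, not part of Nimrod/G.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Resource:
    name: str
    cost_per_job: float   # price of running one job on this resource (G)
    avg_runtime: float    # Ti: average job runtime on this resource (s)


def assign_cost_min(resources: List[Resource], n_jobs: int,
                    deadline: float, now: float) -> Dict[str, int]:
    """Return Ai, the number of jobs given to each resource in this pass.

    Resources are visited in increasing order of cost; each takes as many
    of the still unassigned jobs as it can complete by the deadline.
    """
    remaining_time = max(deadline - now, 0.0)        # D - Ct
    unassigned = n_jobs
    assignment: Dict[str, int] = {}
    for r in sorted(resources, key=lambda res: res.cost_per_job):   # RL
        capacity = int(remaining_time // r.avg_runtime)   # (D - Ct) DIV Ti
        a_i = min(unassigned, capacity)                   # Ai
        assignment[r.name] = a_i
        unassigned -= a_i
    return assignment


if __name__ == "__main__":
    pool = [Resource("dear", 5.0, 300.0),
            Resource("cheap", 1.0, 300.0),
            Resource("mid", 2.0, 600.0)]
    # 20 jobs, one hour to the deadline.
    print(assign_cost_min(pool, n_jobs=20, deadline=3600.0, now=0.0))
```

In this example the cheapest resource takes 12 jobs, the next 6, and only the remaining 2 go to the most expensive one. Calling the function again later with updated runtimes and the jobs still waiting gives the periodic reassignment.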
56Deadline and Budget-based Time Minimization Scheduling
- For each resource, calculate the next completion time for an assigned job, taking into account previously assigned jobs.
- Sort resources by next completion time.
- Assign one job to the first resource for which the cost per job is less than the remaining budget per job.
- Repeat all steps until all jobs are processed. (This is performed periodically or at each scheduling-event.)
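The steps above can be sketched as a greedy loop. The sketch assumes sequential execution on each resource (so the next completion time is the number of jobs already assigned plus one, times the average runtime) and treats a cost equal to the remaining budget per job as affordable. The names are illustrative only.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Resource:
    name: str
    cost_per_job: float   # G per job
    avg_runtime: float    # seconds per job


def assign_time_min(resources: List[Resource], n_jobs: int,
                    budget: float) -> Dict[str, int]:
    """Assign jobs one at a time to the fastest affordable resource."""
    assigned = {r.name: 0 for r in resources}
    remaining_budget = budget
    for jobs_left in range(n_jobs, 0, -1):
        budget_per_job = remaining_budget / jobs_left
        # Sort by the time at which one more job would complete.
        by_completion = sorted(
            resources,
            key=lambda r: (assigned[r.name] + 1) * r.avg_runtime)
        chosen = next((r for r in by_completion
                       if r.cost_per_job <= budget_per_job), None)
        if chosen is None:
            break   # nothing affordable: the remaining jobs stay unassigned
        assigned[chosen.name] += 1
        remaining_budget -= chosen.cost_per_job
    return assigned


if __name__ == "__main__":
    pool = [Resource("fast", 4.0, 100.0), Resource("slow", 1.0, 200.0)]
    print(assign_time_min(pool, n_jobs=4, budget=10.0))
```

With a budget of 10 G for 4 jobs, the first two jobs go to the slow, cheap resource because the fast one costs more than the budget per job; the savings then make the fast resource affordable for the last two jobs.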
57Deadline and Budget-based TimeCost Min.
Scheduling
- Split resources by whether cost per job is less than budget per job.
- For the cheaper resources, assign jobs in inverse proportion to the job completion time (e.g., a resource with completion time 5 gets twice as many jobs as a resource with completion time 10).
- For the dearer resources, repeat all steps (with a recalculated budget per job) until all jobs are assigned.
- Schedule/Reschedule: repeat all steps until all jobs are processed.
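The "inverse proportion" rule for the cheaper resources can be sketched as below. Exact fractions are used so that the shares are not disturbed by floating-point rounding, and leftover jobs are handed to the resources with the largest fractional share; that tie-breaking rule is an assumption of this sketch, not something the slides specify.

```python
from fractions import Fraction
from typing import Dict


def split_inverse_to_time(completion_times: Dict[str, float],
                          n_jobs: int) -> Dict[str, int]:
    """Split n_jobs among resources in inverse proportion to completion time."""
    weights = {name: Fraction(1) / Fraction(t)
               for name, t in completion_times.items()}
    total = sum(weights.values())
    shares = {name: n_jobs * w / total for name, w in weights.items()}
    # Whole jobs first ...
    alloc = {name: int(share) for name, share in shares.items()}
    # ... then hand out the leftover jobs by largest fractional part.
    leftover = n_jobs - sum(alloc.values())
    by_fraction = sorted(shares, key=lambda name: shares[name] - alloc[name],
                         reverse=True)
    for name in by_fraction[:leftover]:
        alloc[name] += 1
    return alloc


if __name__ == "__main__":
    # The example from the slide: completion time 5 versus completion time 10.
    print(split_inverse_to_time({"R1": 5, "R2": 10}, n_jobs=30))
```

For 30 jobs this gives 20 jobs to R1 and 10 to R2, i.e. the resource with completion time 5 gets twice as many jobs.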
58Evaluation of Scheduling Heuristics
- A Hypothetical Application on
- World Wide Grid
59World Wide Grid (WWG)
Australia
North America
ANL SGI/Sun/SP2 USC-ISI SGI UVa Linux
Cluster UD Linux cluster UTK Linux cluster
Monash Uni.
Nimrod/G
Linux cluster
GlobusLegion GRACE_TS
Globus/Legion GRACE_TS
Solaris WS
Internet
Asia/Japan
Europe
Tokyo I-Tech. ETL, Tuskuba
ZIB/FUB T3E/Mosix Cardiff Sun E6500 Paderborn
HPCLine Lecce Compaq SC CNR Cluster Calabria
Cluster CERN Cluster Pozman SGI/SP2
Linux cluster
Globus GRACE_TS
Chile Cluster
Globus GRACE_TS
Globus GRACE_TS
South America
60Experiment-1 Setup
- Workload
- 165 jobs, each needing 5 minutes of CPU time
- Deadline: 1 hr. and budget: 800,000 units
- Strategy: minimise cost and meet deadline
- Execution Cost with cost optimisation
- AU Peak time: 471205 (G)
- AU Off-peak time: 427155 (G)
61Resources Selected Price/CPU-sec.
62Execution @ AU Peak Time
63Execution @ AU Offpeak Time
64AU peak Resources/Cost in Use
After the calibration phase, note the difference
in pattern of two graphs. This is when scheduler
stopped using expensive resources.
65AU offpeak Resources/Cost in Use
66Experiment-2 Setup
- Workload
- 165 jobs, each needing 5 minutes of CPU time
- Deadline: 2 hrs. and budget: 396000 units
- Strategy: minimise time / cost
- Execution Cost with cost optimisation
- Optimise Cost: 115200 (G) (finished in 2 hrs.)
- Optimise Time: 237000 (G) (finished in 1 hr.)
- In this experiment, the Time-optimised scheduling run costs about double that of the Cost-optimised run.
- Users can now trade off between Time vs. Cost.
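The figures of this experiment can be checked with a few lines of arithmetic. The only assumption added here is that every job is billed for exactly its nominal 5 minutes (300 seconds) of CPU time; the averages below are derived under that assumption and are not results reported on the slides.

```python
jobs = 165
cpu_seconds_per_job = 5 * 60      # nominal CPU time per job (assumed billed in full)
budget = 396_000                  # G
cost_optimised = 115_200          # G, finished in 2 hrs.
time_optimised = 237_000          # G, finished in 1 hr.

total_cpu_seconds = jobs * cpu_seconds_per_job          # 49,500 CPU-seconds
budget_per_cpu_second = budget / total_cpu_seconds      # 8.0 G per CPU-second
cost_ratio = time_optimised / cost_optimised            # about 2.06

# Average price actually paid per CPU-second in each run.
avg_price_cost_run = cost_optimised / total_cpu_seconds
avg_price_time_run = time_optimised / total_cpu_seconds

print(f"total CPU-seconds:          {total_cpu_seconds}")
print(f"budget per CPU-second:      {budget_per_cpu_second:.2f} G")
print(f"time/cost run cost ratio:   {cost_ratio:.2f}")
print(f"avg price, cost-optimised:  {avg_price_cost_run:.2f} G per CPU-second")
print(f"avg price, time-optimised:  {avg_price_time_run:.2f} G per CPU-second")
print(f"budget used, cost run:      {cost_optimised / budget:.0%}")
print(f"budget used, time run:      {time_optimised / budget:.0%}")
```

Halving the completion time roughly doubles the bill, and both runs stay well inside the budget.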
67Resources Selected Price/CPU-sec.
68Scheduling for Time Optimization
69Scheduling for Cost Optimization
70Application Case Study
- The Virtual Laboratory Project "Molecular
Modelling for Drug Design" on Peer-to-Peer Grid
71Drug Design: Data Intensive Computing on Grid
- A Virtual Laboratory for Molecular Modelling for Drug Design on Peer-to-Peer Grid.
- It provides tools for examining millions of chemical compounds (molecules) in the Protein Data Bank (PDB) to identify those having potential use in drug design.
- In collaboration with
- Kim Branson, Structural Biology, Walter and Eliza
Hall Institute (WEHI)
http//www.csse.monash.edu.au/rajkumar/dd_at_home/
72DesignDrug@Home Architecture: A Virtual Lab for Molecular Modeling for Drug Design on P2P Grid
Grid Info. Service
Grid Market Directory
Data Replica Catalogue
Give me list PDBs sources Of type aldrich_300?
service cost?
service providers?
GTS
Resource Broker
Screen 2K molecules in 30min. for 10
mol.5 please?
(RB maps suitable Grid nodes and Protein DataBank)
get mol.10 from pdb1 screen it.
PDB2
mol.10 please?
(GTS - Grid Trade Server)
PDB1
73Software Tools
- Molecular Modelling Tools (DOCK)
- Parameter Modelling Tools (Nimrod/enFusion)
- Grid Resource Broker (Nimrod-G)
- Data Grid Broker
- Protein Data Bank (PDB) Management and Intelligent Access Tools
- PDB database Lookup/Index Table Generation.
- PDB and associated index-table Replication.
- PDB Replica Catalogue (that helps in Resource Discovery).
- PDB Servers (that serve PDB clients' requests).
- PDB Brokering (Replica Selection).
- PDB Clients for fetching Molecule Record (Data Movement).
- Grid Middleware (Globus and GrACE)
- Grid Fabric Management (Fork/LSF/Condor/Codine/)
74DOCK code (Enhanced by WEHI, U of Melbourne)
- A program to evaluate the chemical and geometric complementarities between a small molecule and a macromolecular binding site.
- It explores ways in which two molecules, such as a drug and an enzyme or protein receptor, might fit together.
- Compounds which dock to each other well, like pieces of a three-dimensional jigsaw puzzle, have the potential to bind.
- So, why is it important to be able to identify small molecules which may bind to a target macromolecule?
- A compound which binds to a biological macromolecule may inhibit its function, and thus act as a drug.
- Thus disabling the ability of the (HIV) virus to attach itself to a molecule/protein!
- With system-specific code changed, we have been able to compile it for Sun Solaris, PC Linux, SGI IRIX, and Compaq Alpha/OSF1.
Original Code: University of California, San Francisco http://www.cmpharm.ucsf.edu/kuntz/
75Dock input file
- score_ligand yes
- minimize_ligand yes
- multiple_ligands no
- random_seed 7
- anchor_search no
- torsion_drive yes
- clash_overlap 0.5
- conformation_cutoff_factor 3
- torsion_minimize yes
- match_receptor_sites no
- random_search yes
- . . . . . .
- . . . . . .
- maximum_cycles 1
- ligand_atom_file S_1.mol2
- receptor_site_file ece.sph
- score_grid_prefix ece
- vdw_definition_file parameter/vdw.defn
- chemical_definition_file parameter/chem.defn
76Parameterized Dock input file
score_ligand score_ligand
minimize_ligand minimize_ligand
multiple_ligands multiple_ligands
random_seed random_seed
anchor_search anchor_search
torsion_drive torsion_drive
clash_overlap clash_overlap
conformation_cutoff_factor conformation_cutoff_factor
torsion_minimize torsion_minimize
match_receptor_sites match_receptor_sites
random_search random_search
. . . . . .
. . . . . .
maximum_cycles maximum_cycles
ligand_atom_file ligand_number.mol2
receptor_site_file HOME/dock_inputs/receptor_site_file
score_grid_prefix HOME/dock_inputs/score_grid_prefix
vdw_definition_file vdw.defn
chemical_definition_file chem.defn
chemical_score_file chem_score.tbl
flex_definition_file flex.defn
flex_drive_file flex_drive.tbl
ligand_contact_file dock_cnt.mol2
ligand_chemical_file dock_chm.mol2
ligand_energy_file dock_nrg.mol2
77Dock PlanFile (contd.)
parameter database_name label "database_name" text select oneof "aldrich" "maybridge" "maybridge_300" "asinex_egc" "asinex_epc" "asinex_pre" "available_chemicals_directory" "inter_bioscreen_s" "inter_bioscreen_n" "inter_bioscreen_n_300" "inter_bioscreen_n_500" "biomolecular_research_institute" "molecular_science" "molecular_diversity_preservation" "national_cancer_institute" "IGF_HITS" "aldrich_300" "molecular_science_500" "APP" "ECE" default "aldrich_300"
parameter score_ligand text default "yes"
parameter minimize_ligand text default "yes"
parameter multiple_ligands text default "no"
parameter random_seed integer default 7
parameter anchor_search text default "no"
parameter torsion_drive text default "yes"
parameter clash_overlap float default 0.5
parameter conformation_cutoff_factor integer default 5
parameter torsion_minimize text default "yes"
parameter match_receptor_sites text default "no"
parameter random_search text default "yes"
. . . . . .
. . . . . .
parameter maximum_cycles integer default 1
parameter receptor_site_file text default "ece.sph"
parameter score_grid_prefix text default "ece"
parameter ligand_number integer range from 1 to 200 step 1
Molecules to be screened
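The effect of the plan file's parameter section is that the fixed parameters take their defaults while ligand_number ranges from 1 to 200, giving one docking job per molecule. A rough sketch of that expansion is shown below; it is an illustration of the parameter-sweep idea only and not how Nimrod itself parses or expands plan files. Only a few of the parameters are included.

```python
from itertools import product
from typing import Dict, Iterable, Iterator


def expand_sweep(fixed: Dict[str, object],
                 ranges: Dict[str, Iterable]) -> Iterator[Dict[str, object]]:
    """Yield one job description per combination of the ranged parameters."""
    names = list(ranges)
    for values in product(*(ranges[name] for name in names)):
        job = dict(fixed)                  # copy the default values
        job.update(zip(names, values))     # fill in the swept values
        yield job


# A subset of the defaults from the plan file above.
fixed = {
    "database_name": "aldrich_300",
    "random_seed": 7,
    "clash_overlap": 0.5,
    "receptor_site_file": "ece.sph",
    "score_grid_prefix": "ece",
}
# "range from 1 to 200 step 1" (both ends included)
ranges = {"ligand_number": range(1, 201, 1)}

jobs = list(expand_sweep(fixed, ranges))

if __name__ == "__main__":
    print(len(jobs), "jobs")
    print(jobs[0])
```

Adding a second ranged parameter would multiply the number of jobs, since every combination becomes a separate job.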
78Dock PlanFile
task nodestart
copy ./parameter/vdw.defn node.
copy ./parameter/chem.defn node.
copy ./parameter/chem_score.tbl node.
copy ./parameter/flex.defn node.
copy ./parameter/flex_drive.tbl node.
copy ./dock_inputs/get_molecule node.
copy ./dock_inputs/dock_base node.
endtask
task main
nodesubstitute dock_base dock_run
nodesubstitute get_molecule get_molecule_fetch
nodeexecute sh ./get_molecule_fetch
nodeexecute HOME/bin/dock.OS -i dock_run -o dock_out
copy nodedock_out ./results/dock_out.jobname
copy nodedock_cnt.mol2 ./results/dock_cnt.mol2.jobname
copy nodedock_chm.mol2 ./results/dock_chm.mol2.jobname
copy nodedock_nrg.mol2 ./results/dock_nrg.mol2.jobname
endtask
79Nimrod/TurboLinux enFuzion GUI tools for
Parameter Modeling
80Docking Experiment Preparation
- Setup PDB DataGrid
- Index PDB databases
- Pre-stage (all) Protein Data Bank (PDB) on replica sites
- Start PDB Server
- Create Docking GridScore (receptor surface details) for a given receptor on home node.
- Pre-Staging Large Files required for Docking
- Pre-stage Dock executables and PDB access client on Grid nodes, if required (e.g., dock.Linux, dock.SunOS, dock.IRIX64, and dock.OSF1 on Linux, Sun, SGI, and Compaq machines respectively). Use globus-rcp.
- Pre-stage/Cache all data files (3-13MB each) representing receptor details on Grid nodes.
- This can be done on demand by Nimrod/G for each job, but a few input files are too large and they are required for all jobs. So, pre-staging/caching at http-cache or broker level is necessary to avoid the overhead of copying the same input files again and again!
81Protein Data Bank
- Databases consist of small molecules from commercially available organic synthesis libraries, and natural product databases.
- There is also the ability to screen virtual combinatorial databases, in their entirety.
- This methodology allows only the required compounds to be subjected to physical screening and/or synthesis, reducing both time and expense.
82Target Testcase
- The target for the test case is endothelin converting enzyme (ECE). This is involved in heart stroke and other transient ischemia.
- Ischemia: A decrease in the blood supply to a bodily organ, tissue, or part caused by constriction or obstruction of the blood vessels.
83DataGrid Brokering
Screen 2K molecules in 30min. for 10
Nimrod/G Computational Grid Broker
Algorithm1
Data Replica Catalogue
PDB Broker
. . .
AlgorithmN
3
PDB replicas please?
advise PDB source?
5
1
4
2
Grid Info. Service
process send results
selection advise use GSP4!
Screen mol.5 please?
Is GSP4 healthy?
7
6
mol.5 please?
PDB2
PDB Service
PDB Service
GSP1
GSP2
GSPm
GSP4
GSP3(Grid Service Provider)
GSPn
84Nimrod/G in Action: Screening on World-Wide Grid
85Any Scientific Discovery ? Did your collaborator
invent new drug for xxxx?
Not Yet?
?
86Conclude with a comparison with the Electrical
Grid..
Courtesy Domenico Laforenza
87Alessandro Volta in Paris in 1801 inside French
National Institute shows the battery while in the
presence of Napoleon I
- Fresco by N. Cianfanelli (1841)
- (Zoological Section "La Specula" of National
History Museum of Florence University)
88.and in the future, I imagine a worldwide Power
(Electrical) Grid ...
Oh, mon Dieu !
What ?!?! This is a mad man
892001 - 1801 = 200 Years
90Grid Computing A New Wave ?
Can we Predict its Future ?
I think there is a world market for about five
computers. Thomas J. Watson Sr., IBM Founder,
1943
91What Enron, World Leader in Power and Natural Gas
Distribution Business, Think of Economy Grid!...
- -------- Original Message --------
- Subject: Your papers on Economics Grid allocation
- Date: Wed, 14 Mar 2001 12:10:20 -0800
- From: Lance_Norskog_at_enron.net
- To: rajkumar_at_csse.monash.edu.au, davida_at_csse.monash.edu.au, jon_at_csse.monash.edu.au
- Hello-
- I am researching mass computation infrastructures.
- The company I work for, Enron, is a worldwide commodity company. Our business model is to find mid-sized commodity markets that don't work like large-scale commodity markets (like wheat, gold, orange juice, etc.) and to restructure them. The division I work in is working to do this for long-distance fiber optic bandwidth. Other divisions are pursuing metals, paper pulp, etc. (We even have "weather derivatives", which is a betting parlor for industries that depend far too much on hot or cold weather. Somehow, this is legal!) So, my perspective on mass computation is from the point of view of how to make it a large-scale market.
92What Enron, World Leader in Power and Natural Gas
Distribution Business, Thinks of Economy Grid!...
- Our customers are large corporations. They have a need not to solve some particular problem now and then, but every day. For example, take a department store chain that routes one million different items from warehouses into department stores every day, and wants to do it in a near-optimal way. They need to run this computation, with different input vectors, every business day!
- If that company is to commit to running its business with the Grid, it needs a few guarantees around its Grid use:
- 1) complete reliability and availability
- 2) a known price for every use, set far in advance
- 3) an active spot market in case its regular supplier fails
- 4) enforceable contracts guaranteeing quality of service
- This last is a killer: the quantity of processing power you get can't be squishy. It has to be a measurable unit: "Java MIPS" is computer-driven.
- Number 2, a known price for every use, is also missing from your analysis.
- A large, active, "liquid" commodity market can supply not merely spot
93What Enron, World Leader in Power and Natural Gas
Distribution Business, Thinks of Economy Grid!...
- Large commodity markets have the vast majority of their product change hands under such long-term agreements, rather than on the spot market. The strategy is to buy part of your needs in long-term contracts, part in medium-term contracts, and most of the rest on short-term, just to avoid getting stuck 5 years from now with vast amounts of a commodity you don't want. Every day you might have a missing 3-5% that you need to buy on the spot market. (I'm in California. We're having so much trouble because our utilities were banned from buying a heterogeneous basket of contracts and were required to buy all electricity on the spot market.)
- I think my point is that while economics is a fine metaphor for making operational decisions in scheduling resources, those numbers will not be visible to end customers. Instead, the customers will buy blocks of resource for future delivery with pricing based on standard macro-economic factors like interest rates, falling machine prices, rising electricity prices, etc. The producers will use your economic-based techniques to direct their day-to-day operations and to make the interproducer spot market function.
- Lance Norskog
- Sr. Software Engineer
- Enron Broadband Systems
- --------------------------------------------------
94Conclusions
- Grid Computing is emerging as a next-generation computing platform.
- The use of an economics paradigm for managing resources in Grid Computing is essential to push the Grid into mainstream computing!
- Adaptive, scalable, and easy-to-use systems and tools are essential to make end users' lives easier.
- It is projected that the impact of the World-Wide Grid on the 21st-century economy will be similar to the impact of the electric power grid on the 20th-century economy.
- To achieve this goal, in my humble opinion, use the Nimrod family of tools (not to mention the Nimrod-G Broker) along with Globus, of course!
- Enjoy the excitement of World-Wide Grid Computing!
95Download Software Information
- Nimrod Parametric Computing
- http://www.csse.monash.edu.au/davida/nimrod/
- Economy Grid Nimrod/G
- http://www.buyya.com/ecogrid/
- Virtual Laboratory/DesignDrug@Home
- http://www.buyya.com/dd_at_home/
- Grid Simulation (Java based)
- http://www.buyya.com/gridsim/
- World Wide Grid testbed
- http://www.buyya.com/ecogrid/wwg/
- Looking for new volunteers to grow?
- Please contact me to barter your and our machines!
- Want to build on our work/collaborate?
- Talk to me now or email rajkumar_at_csse.monash.edu.au
96Acknowledgements
- Special thanks to the following colleagues for sharing ideas/works:
- David Abramson, Monash University
- Jack Dongarra, University of Tennessee
- Wolfgang Gentzsch, Sun Microsystems
- Jon Giddy, DSTC _at_ Monash University
- Domenico Laforenza, CNR/CNUCE, Italy
- Globus Team!
- Unable to mention all names explicitly here; however, their efforts are recognized by featuring their work in this presentation.
- Colleagues from Asia/Japan, Europe (Italy, Germany, Switzerland, Poland, UK), US, and Chile for providing access to their machines --> WWG testbed.
97(No Transcript)
98Further Information
- Books
- High Performance Cluster Computing, V1, V2, R. Buyya (Ed), Prentice Hall, 1999.
- The GRID, I. Foster and C. Kesselman (Eds), Morgan-Kaufmann, 1999.
- IEEE Task Force on Cluster Computing
- http://www.ieeetfcc.org
- Global Grid Forum
- www.gridforum.org
- IEEE/ACM CCGridxy: www.ccgrid.org
- CCGrid 2002, Berlin: ccgrid2002.zib.de
- Grid workshop - www.gridcomputing.org
99Further Information
- Cluster Computing Info Centre
- http://www.buyya.com/cluster/
- Grid Computing Info Centre
- http://www.gridcomputing.com
- IEEE DS Online - Grid Computing area
- http://computer.org/dsonline/gc
- Compute Power Market Project
- http://www.ComputePower.com