Grid Computing: an introduction

Slides: 199

Provided by: joscc


1
Grid Computing: an introduction

José C. Cunha, DI-FCT/UNL

2
Distributed and Parallel Computing

Distributed Computing
Parallel Computing
Grid Computing

3
(No Transcript)
4
Distributed Computing

Physically distributed computations and data
Goals
Adapt to geographical application distribution
Provide appropriate levels of transparency
Geographical distribution (LAN or WAN)
Users / Access / Processing / Archiving Sites
Availability and Reliability
Fault tolerance / Redundancy

5
Transparency

Depends on the layer
Failure
Communication (message, RPC, memory)
Design choices can be revised
Interactions: events, uncertainty, causality
Loose / tight interactions / collaboration
Pessimistic / Optimistic Choices (DBs)
Sometimes there is no choice
mobility, disconnected operation

6
Transparency and Virtualisation

Transparency and Awareness
The concept of transparency has been revised as time passes
Raw hardware, Assembly, High-Level Languages, etc. ... Operating Systems, ... Text editors and processing tools, ...
The Grid is one of the current revisions ...

7
(No Transcript)
8
Parallel Computing

Goal: to reduce execution time, compared to sequential execution.
Computer System Architectures
Supercomputers
Shared / Distributed memory multiprocessors
LANs and Clusters of PCs
Parallel Programming requires
Decomposing the application into parts
Launching tasks in parallel processes
Planning the cooperation between tasks
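
The three requirements listed above (decompose, launch, plan the cooperation) can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: the function names are made up for the example, and a thread pool is used only to keep the sketch portable. For CPU-bound work in CPython, `ProcessPoolExecutor` from the same module would be the usual choice.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(data, parts):
    """Split data into `parts` nearly equal contiguous chunks."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def partial_sum_of_squares(chunk):
    """The task that runs independently on each chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    chunks = decompose(data, workers)                      # 1. decompose
    with ThreadPoolExecutor(max_workers=workers) as pool:  # 2. launch the tasks
        partials = list(pool.map(partial_sum_of_squares, chunks))
    return sum(partials)                                   # 3. cooperate: combine results


total = parallel_sum_of_squares(list(range(10)))
```

Here the cooperation between tasks is the simplest possible one: each task works on its own chunk and the partial results are combined at the end.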

9
(No Transcript)
10
In 2006... there are more reasons to exploit parallelism
11
Classical Application Areas

Science and Engineering
Fluid Dynamics
Particle Systems in Physics
Weather Forecast and Climate
Simulation of VLSI systems
Parallel Search in Databases
Artificial Intelligence
...

12
Great Application Successes
The development of scalable massively parallel
computers was motivated largely by a set of Grand
Challenge applications (courtesy Prof. David
Walker)

Climate modelling to understand the Earth's
climate and predict future changes
Computational fluid dynamics to design aerospace
vehicles and cars
Numerical turbulence to develop realistic fluid
and particle simulations of plasma turbulence to
optimise performance of fusion devices
Rational drug design to discover / design new
drugs with simulations of macro-molecular
structure

But the application profiles have changed

14
Evolution of Application Characteristics

Complex model simulations
Large volumes of input / generated data
Difficult interpretation and classification
High degree of User interaction
Offline / online data processing / visualisation
Distinct user interfaces
Computational steering
Multidisciplinary
Heterogeneous models / components
Interactions among multiple users /
collaboration
Require parallel and distributed processing

15
Heterogeneous Components

Sequential, Parallel, Distributed Problem Solvers
(simulators, mathematical packages, etc.)
Tools for data / result processing,
interpretation and visualisation
Online access to scientific data sets and
databases
Interactive (online) computational steering

16
(No Transcript)
17
Ambitious application requirements

Distinct operation modes (offline/online)
Distinct user interfaces
User / Agent driven control
Dynamic modification of operation modes and
interactions
Multiple users concurrently join ongoing
experiments with distinct roles (observers,
controllers)

18
Complex cycle of user activities

Problem specification
Configuration of the environment
Component selection (simulation, control,
visualisation) and configuration
Component activation and mapping
Initial set up of simulation parameters
Start of execution, possibly with monitoring,
visualisation and steering
Analysis of intermediate / final results

Requirements
To meet more complex applications
To ease the cycle of application development,
deployment and control
To integrate heterogeneous components into an
environment
To allow transparent access to parallel and
distributed resources
To support collaborative modeling and simulation

20
Problem-solving perspective

Integrated environments for solving a class of
related problems in an application domain

21
Problem-Solving Environments

A different approach
-- specific methods for each problem domain
are encapsulated in components (libraries,
packages, class OO repositories)
-- development and runtime support tools
are also made available.
Application components and computational tools
are integrated into a single unified environment
(PSE)
Easy-to-use by the end-user

22
PSE Functionalities

Support for problem specification
Resource management
Execution support services

23
PSEs: Users and Developers

End-users (scientists, engineers, etc.)
Solve a particular problem in a specific
application domain
Perform experiments
PSE Developers
Develop new algorithms and techniques
Integrate them into components and place them in
component repositories
Develop tools to support problem specification
and application composition
Develop tools to help the user choose the best
solutions and to locate the resources
Use the services and interfaces built by the
System Developers

24
PSEs: Problem Specification / Solution

Use either
A visual programming environment to link software components
A high-level language specification
Recommender systems can be used to help the user choose the best way to solve a problem and locate the required software
They are very important to enable use of a complex computing environment

25
Software Components

Components specified in terms of their input and
output interfaces
User ignores internal details of components
Components can be interconnected within a visual
development environment

26
Plug and Play Components (courtesy Prof. David
Walker)

Can link the output of one component to the input
of another.
Store components in a repository.

See Triana, for example: http://triana.co.uk/
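
The plug-and-play idea (components stored in a repository, outputs linked to inputs) can be illustrated with a small Python sketch. Everything here is hypothetical and chosen for the example: the `repository` dictionary, the `component` decorator and the three sample components are not the API of Triana or of any other PSE.

```python
# Hypothetical component repository: name -> callable.
repository = {}


def component(name):
    """Register a function in the repository under `name`."""
    def register(func):
        repository[name] = func
        return func
    return register


@component("read")
def read_samples(text):
    return [float(token) for token in text.split()]


@component("normalise")
def normalise(values):
    peak = max(values)
    return [v / peak for v in values]


@component("mean")
def mean(values):
    return sum(values) / len(values)


def link(*names):
    """Build a pipeline: the output of each component feeds the next one."""
    def pipeline(data):
        for name in names:
            data = repository[name](data)
        return data
    return pipeline


workflow = link("read", "normalise", "mean")
result = workflow("1 2 4")
```

The user of `link` only needs the component names and their input/output types, which is the sense in which internal details are ignored.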
27
An Example
28
Impact of PSEs in many areas (1990-1999-2000...)

Fully developed PSEs in the Industry, e.g.
Automotive, Aerospace
Many applications in Science and Engineering
Design optimisation
Application behavior studies (parameter sweeping)
Rapid prototyping
Decision support
Process control
Emerging areas: Education, Environment, Health, Finance
A new profile of end-user, beyond the scientist
and engineer

29
(No Transcript)
30
Computer technology advances
31
Storage evolution (Carl Kesselman)

Storage density doubles every 12 months
Dramatic growth in online data (1 petabyte = 1000 terabytes = 1,000,000 gigabytes)
2000: 0.5 petabyte
2005: 10 petabytes
2010: 100 petabytes
2015: 1000 petabytes?
Transforming entire disciplines in physical and,
increasingly, biological sciences etc.

32
Networks evolution (Carl Kesselman)

Network vs. computer performance
Computer speed doubles every 18 months
Network speed doubles every 9 months
Difference: an order of magnitude per 5 years
1986 to 2000:
Computers x 500
Networks x 340,000
2001 to 2010:
Computers x 60
Networks x 4000

Moore's Law vs. storage improvements vs. optical improvements. Graph from Scientific American (Jan-2001) by Cleo Vilett, source: Vinod Khosla, Kleiner, Caufield and Perkins.
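
The "order of magnitude per 5 years" figure follows directly from the two doubling periods quoted on this slide. A small Python check, assuming simple exponential growth (the helper name `growth_factor` is made up for the example):

```python
def growth_factor(years, doubling_months):
    """Growth multiplier after `years` if capacity doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)


computers_5y = growth_factor(5, 18)   # roughly a factor of 10
networks_5y = growth_factor(5, 9)     # roughly a factor of 100
gap_5y = networks_5y / computers_5y   # roughly a factor of 10: one order of magnitude
```

The multi-year factors listed on the slide come from the cited source and are not reproduced exactly by this idealised model.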
33
Enabling factors

The Internet
Broadband communications (e.g. optical-based)
Faster processors / HPC using standard / open OS
World Wide Web infrastructure and services

34
Major Phases

Networking: TCP/IP
Communications: Internet and e-mail
Information: the Web
Computing: the Grid

35
Computing milestones

Mainframes, time-sharing, Unix, minicomputers: 1960-70
PCs, commercial Unix, Crays, Workstations, MPPs: 1980s
Clusters, PVM, Linux, PDAs, Open Source, P2P: 1990s
Globus Project, 1G Grids: 1995-2000
2G Grids, OGSI/OGSA, 3G Grids: 2000-05
Mainframes, 1960s: focus on efficient exploitation of shared resources, virtual monitor concept, time-sharing
Minicomputers, microcomputers, desktops, 1970-80s: large dissemination of computing power
Client-server computing, 1990: distributed functionality to the endpoints, the clients
Developments on networks and interconnects, 1980-90s: rise of commercial Internet
Now is the time of the Grid

36
Communication milestones

Packet switching, e-mail, ARPAnet, LANs/Ethernet, TCP/IP: 1960-70-80
Internet era, broadband, WWW, wireless: 1990s
Fibre channel, Gigabit Ethernet, Web services, XML: 1995-2000
Internet: from a military project (DARPA) to academic NSF projects
1969: ARPAnet had 4 nodes
mid-1970s: around 30 university, military, and government sites
1974/78: TCP/IP (Transmission Control Protocol / Internet Protocol)
1983: ARPAnet had hundreds of nodes
NSFnet, 1980s: scientific communication network to access NSF supercomputer centers
mid-1980s: NSF and DARPA joined efforts; the IETF (Internet Engineering Task Force) shaped the modern Internet

37
Communication milestones

WWW: mid-late 1980s, goal: to share information
HyperText Markup Language (HTML): a standard to create/organise docs
HyperText Transfer Protocol (HTTP), browsers and servers: to link and access docs online, transparently
W3C(onsortium), mid-1990s: new standards for information interchange (XML etc.)
SONET/DWDM (Synchronous Optical Network / Dense Wavelength Division Multiplexing): optical technology, late 1990s, early 2000s
- Provides broadband connectivity and services at reasonable prices
- Corporate WANs at 155 Mbps vs USA 56 Kbps in the mid-1980s
- at 2.5 Gbps since mid-1990s
- OC-768 (about 40 Gbps)
- A single fiber in the range of 1 Tbps using high-density DWDM, but still out of reach of individual organisations

38
Communication milestones

Recent past
Theoretical WAN performance doubled every 9-12
months, supported by optical technology
But commercial user-available bandwidth (BW) has grown at a much slower rate

39
How about the theoretical maximum communication
speed?

1. In general, the maximum available speed is not affordable by the end-user
Communication speed/cost will continue to increase
But quality high-speed BW will hardly ever be free
Providers must react to the continuous pressure for more multiplexing of wavelengths in DWDM products
And profits are in order
The individual end-user will hardly afford the theoretical performance
Cf. the actual cost of a bottle of mineral water
2. Plus the effects of the overheads due to the communication protocol layers
3. And also: it all depends on the application profile. Is it CPU-bound or I/O-bound?
A grid job splits into multiple components which are spread on the grid
- need to locate one another
- need to establish communication connections
- need to send data
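
Whether a job profits from being spread on the grid depends on how its compute time compares with the time needed to move its data. The following back-of-the-envelope model is a sketch under stated assumptions: a single transfer, a fixed bandwidth, no protocol overhead beyond an optional latency term. The function names and the sample numbers are illustrative only.

```python
def transfer_seconds(data_bytes, bandwidth_bits_per_s, latency_s=0.0):
    """Time to ship `data_bytes` over a link of the given bandwidth."""
    return latency_s + data_bytes * 8 / bandwidth_bits_per_s


def worth_offloading(local_compute_s, remote_compute_s, data_bytes,
                     bandwidth_bits_per_s, latency_s=0.0):
    """True if remote execution, including the data transfer, beats local execution."""
    remote_total = remote_compute_s + transfer_seconds(
        data_bytes, bandwidth_bits_per_s, latency_s)
    return remote_total < local_compute_s


one_gigabyte = 10**9
link = 100e6  # 100 Mbit/s, an assumed user-available bandwidth

# CPU-bound profile: one hour locally, ten minutes remotely, 1 GB of input.
cpu_bound = worth_offloading(3600, 600, one_gigabyte, link)
# I/O-bound profile: one minute locally, ten seconds remotely, same 1 GB of input.
io_bound = worth_offloading(60, 10, one_gigabyte, link)
```

With these numbers the transfer alone takes 80 seconds, so only the CPU-bound job gains from running remotely.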

40
History of sharing

1965: MIT Multics operating system (multi-user time-sharing system)
A computer facility should operate like a power company or water company
Late 1960s-early 1970s: when computers were first linked by data communication networks, the ARPAnet supported early experiments on exploiting unused remote machine cycles
1973: Xerox PARC worm program replicated itself in about 100 Ethernet-connected computers
Each worm used idle resources to perform a computation
Could replicate and send clones to other nodes
Since the 1990s: parallel and distributed computing
Widely available PCs and workstations
High-speed networks such as Gigabit Ethernet
Clusters for HPC

41
History of sharing

Clusters motivated interest in
Aggregating distributed resources to solve
complex problems via parallel computing and also
to support reliability via redundancy
2002: NSF installed the TeraGrid, a transcontinental (virtual) supercomputer; set up HPC clusters at 4 sites (NCSA/ANL in Illinois, and Caltech/SDSC in California)
aimed at problems with requirements in the TFLOPS range

42
(No Transcript)
43
Modern applications demanding more ambitious
goals

Enable heavy applications in science and
engineering
Complex simulations with visualisation and
steering
Access and analysis of large remote datasets
Access to remote data sources and special
instruments (satellite data, particle
accelerators)
distributed in wide-area networks, and
accessed through collaborative and
multi-disciplinary PSE, via Web Portals.

44
(No Transcript)
45
The Grid

Treat CPU cycles and software like commodities.
Enable the coordinated use of geographically
distributed resources in the absence of central
control and existing trust relationships.
Computing power is produced much like utilities
such as power and water are produced for
consumers.
Users will have access to power on demand
"When the Network is as fast as the computer's internal links, the machine disintegrates across the Net into a set of special purpose appliances"
(Gilder Technology Report, June 2000)

This slide is courtesy of Professor Jack Dongarra
46
US Software Infrastructure, 1998
The Grid is a computational and network
infrastructure providing pervasive, uniform, and
reliable access to distributed resources.

Globus provides core services for grid-enabled computing: http://www.globus.org/

47
Concept of a Grid

Gathers a diversity of resources, distributed at
large-scale
supercomputers and parallel machines, and
clusters
massive storage systems
databases and data sources
special devices
Provides globally unified access to virtual
resources
Transient to support experiments
(computation, data, scientific
instruments)
Persistent
(databases, catalogues, archives)
Collaboration spaces

48
What is a Grid Computing System?

A virtualised computing environment
Enabling dynamic runtime selection, sharing, and aggregation of geographically distributed autonomous resources
Based on their availability, capability, performance and cost
Based on an application's or organisation's requirements
Relies on a highly interconnected networking
infrastructure

49
(No Transcript)
50
Related concepts

Virtualisation
IBM: allows several OSs to run simultaneously on one large computer (Virtual Machine Monitor)
Generic approach to
Allow logical access to types of remote,
heterogeneous, and distributed resources
As if they were a single larger homogeneous
resource, locally available
Applies to computation, storage, and network
resources and to any other LOGICAL RESOURCE
Dynamically adjust resource mappings to match
application demands

51
Virtualisation

The logical functions of the server, storage and
network resources are separated from their
physical functions and representations
(processors, memories, I/O devices, switches).
Resources are aggregated into pools
Elements from the pools are allocated,
provisioned, managed, manually or automatically,
to meet application demands
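
The pooling idea described above can be sketched as a tiny Python class. This is an illustration only: `ResourcePool`, its methods and the application names are hypothetical, and a real virtualisation layer would also track which physical device backs each allocation.

```python
class ResourcePool:
    """A logical pool that hides which physical machine provides each unit."""

    def __init__(self, physical_capacities):
        # e.g. CPUs per physical server; only the aggregate is exposed
        self.capacity = sum(physical_capacities)
        self.allocations = {}

    @property
    def free(self):
        return self.capacity - sum(self.allocations.values())

    def allocate(self, app, amount):
        """Provision `amount` units for `app`, or fail if the pool is exhausted."""
        if amount > self.free:
            raise RuntimeError(
                f"pool exhausted: {amount} requested, {self.free} free")
        self.allocations[app] = self.allocations.get(app, 0) + amount

    def release(self, app):
        """Return everything held by `app` to the pool."""
        return self.allocations.pop(app, 0)


pool = ResourcePool([8, 16, 8])   # three physical servers seen as one pool of 32
pool.allocate("billing", 20)      # more than any single server could offer
pool.allocate("reporting", 10)
```

The first allocation exceeds the size of the largest physical server, which is only possible because applications see the aggregated pool rather than the individual machines.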

52
Virtualisation examples

Processes
Server
Network
Storage
Data center: groups of servers, storage, and network resources can be reallocated on the fly
Software resources

53
Cluster computing

Aggregate processors locally in parallel-based
configurations, integrate them and provide access
as a single unified resource
Central resource manager and scheduler
Centralised control and knowledge of system and
user states
Typically owned by single organisation

54
Cluster vs Grids

Clusters: focus on the datacenter, single organisation
Grids: focus on geographically distributed, multi-organisation, utility-based (outsourced) networking

55
Changing perspectives - Grid Views

The Grid. Use distributed hardware and software infrastructure to provide reliable, pervasive, inexpensive access to computational resources irrespective of physical location or access point.
The Consumer Grid. Services and resources
anywhere. Issues of dynamic resource discovery,
trust, and digital reputation.
Application Service Provider. Provide or sell
computational or data services via Web.
Virtual Organisation. Group of people or institutions with some common purpose that need to share resources.

56
Grids: Towards uniform and standard large-scale computing environments

Analogy to the Electrical Power Grid
Simple local interface
Transparency
Pervasive access
Secure
Dependable
Efficient
Inexpensive
The Computational, Data, and Interaction Grids
Not really true (yet!?)

57
The Transparent Grid

Transparency: the user is not aware of (and doesn't care) which computing resources are used to solve their problem
Similarly, in an electrical grid we ignore the source of the power

Heterogeneity
Resource discovery
Scheduling

Distributed computing issues
58
(No Transcript)
59
EGEE: Enabling Grids for E-science in Europe
www.eu-egee.org
EU IST project
60
The Grid metaphor
61
(No Transcript)
62
the future Grid!
63
The Pervasive Grid

Pervasive: the Grid can be accessed from any networked device, e.g. laptop, mobile phone, PDA, etc.
In the electrical analogy, any appliance can access power through a standard interface, e.g. a wall socket.

Standard interfaces
Protocols
Legacy software

64
The transparent grid access
65
Grid is an evolving field

Multiple views, perspectives
Concepts, models and architectures still being
defined and tested
Applications still emerging
Wide variety of interests

66
The main questions

Grid benefits, challenges, status and directions
Grid architectures
Portal and UI, User and node security, Brokers,
Schedulers, Data managers, Job and resource
managers
Standardisation efforts
Architecture: OGSA/OGSI (Open Grid Services Architecture / Infrastructure)
Execution Models: Workflows, Events, Transactions
System services: Security, Monitoring, Billing and Accounting, Implementation (Globus Toolkit)
Economics
Grid deployment
Local, national, and global grids

67
Applications and benefits

The Grid can be seen as an evolution of
Parallel and Distributed computing
The Web
And Virtualisation concepts
As such, the Grid will probably improve existing
application types, and will enable new types of
applications

68
Applications example

Virtual access to special instruments
electron microscopes, particle accelerators, wind
tunnels,
coupled with remote supercomputers, DBs,
to enable
interactive use,
online scenario comparisons,
and collaborative data analysis

69
Applications example

Virtual access to distributed supercomputing
For complex computations
Migrate CPU-bound operations to more powerful
remote computing resources supported by large
virtual supercomputers, assembled to solve
problems too large to fit on a single computer
system

70
Applications example

Collaborative engineering
Design of complex systems
Based on highly interactive environments
Relying on high-bandwidth access to shared
virtual spaces, supporting
Interactive manipulation of shared datasets
Management of complex simulations

71
Applications example

Parameter studies
Rapid, large-scale parametric studies
A single program is run many times
To explore a multidimensional parameter space

72
Summary Grid applications

Distributed supercomputing: for computational science and engineering
High-capacity throughput: large-scale simulation/chip design, and parameter studies
Content sharing: digital contents
Data-intensive: drug design, particle physics, stock prediction, etc.
On-demand: real-time medical instrumentation, mission-critical
Collaborative: e-science, e-engineering, design, data exploration, education, e-learning
Remote software access/renting services (ASP and Web services)
Utility/service-oriented computing

73
Question: Is this just an academic exercise? No!

Real application needs
Solve new or larger problems by aggregating
available resources at large-scale
for bigger, longer experiments, and more accurate
models
Easier access to remote resources
a large diversity of computation, data and
information services
Increased levels of interaction for increased
productivity and capability to analyse and react
enable coordinated resource sharing and
collaboration across virtual organisations

74
Applications and User Profiles

Computational Grids
provide a single point of access to a
high-performance computing service
Scientific Data Grids
Access large datasets with optimized data
transfers and interactions for data processing
Virtual Organisations and Interactions
Access to virtual environments for resource
sharing, user interaction and collaboration
Real-time interactions for decision support
Information and Knowledge services
Access large geographically distributed data
repositories, e.g. for data mining applications

75
Grid benefits

Resource sharing
Transparent access to remote resources
Efficient exploitation of resources: reduce execution time, large-scale data processing, support load smoothing across the network, exploit time and work differences
Enable the concept of a virtual data center
Access to remote DB and software
Reduce the local services needed
On-demand aggregation of resources, to meet
dynamic needs (including real-time response)
Fault-tolerance and dependability

76
Ultimate goal

Allows an organisation to
Integrate and share heterogeneous pools of
resources (physical and logical)
Presenting them as one large, cohesive, virtual,
transparent computing system
In order to deliver agreed services at specified
levels of quality (application functionality,
efficiency and performance)

77
Grid mechanisms

To enable online discovery and access to
distributed resources
And online collaboration

78
Grid ideas

Internet a network of communication
Grid a network of cooperation / computation
Grid relies on the ability to negotiate
resource-sharing among partners (providers and
consumers) and using the resulting resource pool
for some specific application goal

79
Grid Views
80
View - Computational Grids

Service-oriented view
NetSolve: an example

81
View - Grids as Frameworks for Application
Service Providers

Application Service Provider. Provide or sell
computational services via web interface.
Provide remote services such as compute cycles,
specific applications, or storage.
Selective outsourcing: certain functions are performed remotely.
Application hosting: remote sites act as application servers.
Browser-based computing: online applications accessible through a web site.
(Courtesy Prof. David Walker)

82
An Example: NetSolve as a Scientific ASP

A client-server system for remote solutions of
complex scientific problems
On request performs computational tasks on a set
of servers
Searches for computational resources on a
network, chooses the best one available, and
returns the answers to the user.
Based on agents or resource brokers
Developed by Professor Jack Dongarra and
colleagues at University of Tennessee, Knoxville
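
The agent's role (search the available servers, choose the best one, hand the request over) can be sketched generically in Python. This is not NetSolve's API or its actual selection policy: the `Server` fields, the scoring rule (speed scaled by idle capacity) and the sample servers are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    problems: frozenset   # problem types this server can solve
    speed: float          # relative speed (higher is faster)
    load: float           # fraction of capacity in use, 0.0 to 1.0


def choose_server(servers, problem):
    """Return the server expected to answer `problem` fastest."""
    candidates = [s for s in servers
                  if problem in s.problems and s.load < 1.0]
    if not candidates:
        raise LookupError(f"no server available for {problem!r}")
    # Assumed score: raw speed scaled by the share of capacity still idle.
    return max(candidates, key=lambda s: s.speed * (1.0 - s.load))


servers = [
    Server("S1", frozenset({"linear_solve", "fft"}), speed=2.0, load=0.9),
    Server("S2", frozenset({"linear_solve"}), speed=1.0, load=0.1),
    Server("S3", frozenset({"fft"}), speed=4.0, load=0.0),
]
chosen = choose_server(servers, "linear_solve")
```

In this example the nominally faster S1 loses to S2 because it is almost fully loaded, which is the kind of decision the client is spared by going through an agent.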

83
NetSolve: The Big Picture (David Walker)
[Figure: NetSolve architecture. Labels: Client; AGENT(s); Schedule Database; client interfaces Matlab, Mathematica, C, Fortran, Java, Excel; nodes labelled C, A, S1, S2, S3, S4]
84
Data grids

Aggregate underused/unused storage
Into a larger virtual data store
For improved performance and reliability and for
increased capacity

Storage
a file or a DB can span multiple physical
devices
a unifying distributed file system can solve
this problem
storage hierarchy
- primary (attached to a CPU)
- secondary (on hard disks, such as RAID)
- tertiary (on near-real-time accessible media, such as tape)
--- distributed
Using mountable network file systems such as Network File System (NFS), Distributed File System (DFS) or General Parallel File System (GPFS)
DB management software can federate a group of
individual DBs and files to build a larger DB

Grid file systems can manage automatic replication of files or data sets, for performance and reliability
Applications may require different semantics for synchronous replication of data files, and so require specific data placement decisions exploiting locality of access; this may critically affect the resulting performance
- an intelligent grid data scheduler can consider not only the computational requirements of an application but also its data requirements, based on usage patterns and replication needs, and can then schedule jobs closer to the data and/or on processors with direct SAN access to storage devices
- Need to revise traditional scheduling strategies and models, typically based on computational requirements only
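
The difference between a compute-only scheduler and a data-aware one can be made concrete with a small Python sketch. The cost model (compute time plus time to stage any dataset that has no local replica) and all names and numbers are assumptions for illustration, not a description of any particular grid scheduler.

```python
def estimated_seconds(job, node, speed, replicas, bandwidth_bytes_per_s):
    """Compute time on `node` plus time to stage datasets it does not hold."""
    compute = job["work"] / speed
    missing_bytes = sum(size for name, size in job["datasets"].items()
                        if node not in replicas.get(name, set()))
    return compute + missing_bytes / bandwidth_bytes_per_s


def schedule(job, nodes, replicas, bandwidth_bytes_per_s):
    """Pick the node with the lowest estimated completion time."""
    return min(nodes, key=lambda node: estimated_seconds(
        job, node, nodes[node], replicas, bandwidth_bytes_per_s))


nodes = {"fast": 10.0, "near_data": 5.0}           # work units per second
replicas = {"events": {"near_data"}}               # dataset -> nodes holding a copy
job = {"work": 1000.0, "datasets": {"events": 50e9}}  # 50 GB of input

placement = schedule(job, nodes, replicas, bandwidth_bytes_per_s=100e6)
compute_only = max(nodes, key=nodes.get)           # what a CPU-only policy would pick
```

With these numbers the slower node wins (200 s against 600 s) because it already holds a replica of the input data.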

87
View - Scientific Data Grids

EU DataGrid projects
Large-scale environment for accessing and
analysing large amounts of data
High energy physics, Biology, Earth observation
Petabytes of data (1 petabyte = 1 000 000 gigabytes)
Thousands of researchers
Scalable storage of datasets: replicated, catalogued, distributed in distinct sites

88
Distributed Computing Grid Experiences in CMS
Data Challenge
A.Fanfani Dept. of Physics and INFN, Bologna

Introduction about LHC and CMS
CMS Production on Grid
CMS Data challenge

89
Large Hadron Collider (LHC)
bunch-crossing rate: 40 MHz
~20 p-p collisions for each bunch crossing
p-p collisions: ~10^9 evt/s (Hz)
90
CMS detector
91
CMS Data Acquisition
Bunch crossing: 40 MHz
1 event is ~1 MB in size
~GHz (~PB/sec)
Online system
Level 1 Trigger - special hardware

multi-level trigger to
filter out uninteresting events
reduce data volume

75 kHz (75 GB/sec)
100 Hz (100 MB/sec)
data recording
Offline analysis
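
The rates quoted for the LHC and the CMS trigger chain can be checked with a few lines of arithmetic. The sketch assumes decimal units (1 MB = 10^6 bytes) and uses the values from the slides: a 40 MHz bunch-crossing rate, about 20 collisions per crossing and about 1 MB per event. Variable names are chosen for the example.

```python
MB = 10**6

bunch_crossing_hz = 40e6
collisions_per_crossing = 20
event_size_bytes = 1 * MB


def data_rate(event_rate_hz, event_size=event_size_bytes):
    """Bytes per second produced at a given event rate."""
    return event_rate_hz * event_size


collision_rate_hz = bunch_crossing_hz * collisions_per_crossing  # 8e8, of order 1e9
raw_rate = data_rate(collision_rate_hz)   # 8e14 B/s, i.e. of order 1 PB/s
level1_rate = data_rate(75e3)             # 7.5e10 B/s = 75 GB/s after the Level 1 trigger
recorded_rate = data_rate(100)            # 1e8 B/s = 100 MB/s written to storage
reduction = collision_rate_hz / 100       # events discarded per event kept, about 8e6
```

The multi-level trigger therefore has to reduce the event rate by roughly seven orders of magnitude before data recording.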
92
CMS Computing

Large amounts of events will be available when the detector starts collecting data
Large-scale distributed computing and data access
Must handle petabytes per year
Tens of thousands of CPUs
Tens of thousands of jobs
Heterogeneity of resources: hardware, software, architecture and personnel
Physical distribution of the CMS Collaboration

93
CMS Computing Hierarchy
(1 PC = ~PIII 1 GHz)
Online system: ~PB/sec
Offline farm: ~100 MB/sec recorded data

Tier 0: CERN Computer center, ~10K PCs
Filter -> raw data
Data Reconstruction
Data Recording
Distribution to Tier-1

Tier 1: Regional Centers (Italy, Fermilab, France, ...), ~2K PCs, ~2.4 Gbits/sec
Permanent data storage and management
Data-heavy analysis
Re-processing
Simulation, Regional support

Tier 2: Tier2 Centers, ~500 PCs, ~0.6-2. Gbits/sec
Well-managed disk storage
Simulation
End-user analysis

Tier 3: Institute A, Institute B, workstations, ~100-1000 Mbits/sec
94
View - Virtual Organisations

Resource sharing and collaboration between
dynamically changing collections of individuals
and organisations
e.g. a consortium of companies collaborating in the design of a new product
Sharing design data, collaborative simulations, etc.
e.g. scientists collaborating in common experiments via a distributed virtual laboratory

95
Example: Collaborative Immersive Visualisation

Scientific simulations, experiments, and
observations generate vast amounts of data that
often overwhelm data management, analysis, and
visualization capabilities.
Observer appears to be in the same space as the
visualised data and can navigate within the
visualisation space relative to the data.
Important in interpreting and extracting insights
from the data.
Several observers can co-exist in the same
visualisation space - ideal for remote
collaboration.
CAVE: a fully immersive environment. Systems with stereoscopic projections onto 3 walls and the floor.
ImmersaDesk or stereoscopic workstation: projects stereoscopic images onto a single flat panel display.

96
CAVE
97
(No Transcript)
98
Virtual organisations (VO)

A set of entities (individuals and institutions)
defining a set of resource sharing and access
rules
Highly controlled sharing
What is shared
Who is allowed access
Conditions to allow such sharing
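
The three questions above (what is shared, who is allowed access, under which conditions) map naturally onto a rule record and a check function. The following Python sketch is hypothetical: the `SharingRule` fields, the member names and the off-peak condition are invented for illustration and do not correspond to any specific grid security model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class SharingRule:
    resource: str                       # what is shared
    members: frozenset                  # who is allowed access
    condition: Callable[[dict], bool]   # conditions under which sharing is allowed


def is_allowed(rules, member, resource, context):
    """True if at least one rule grants `member` access to `resource` right now."""
    return any(rule.resource == resource
               and member in rule.members
               and rule.condition(context)
               for rule in rules)


# Example VO policy: the cluster is shared with two partners, off-peak only.
rules = [
    SharingRule("cluster-a", frozenset({"alice", "bob"}),
                lambda ctx: ctx["hour"] >= 18),
]

evening_ok = is_allowed(rules, "alice", "cluster-a", {"hour": 20})
morning_ok = is_allowed(rules, "alice", "cluster-a", {"hour": 9})
outsider_ok = is_allowed(rules, "carol", "cluster-a", {"hour": 20})
```

Access is denied by default: a request succeeds only if some rule explicitly covers the member, the resource and the current conditions.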

99
Keys

Resource sharing and problem-solving in dynamic
multi-institutional VOs
Service providers
Application
Storage
Machine-cycles (computation)
Collaboration in industry consortia

100
Commercial, IT, data center applications

First grid generations had limitations, namely for database interoperability
This has motivated approaches for
business-centric solutions, developed by
commercial software and DB suppliers

101
Commercial and financial

Enabling
Data-mining, pattern-detection, scenario-modeling
processes
Applied to banks, credit card processing,
financial institutions
Improve the financial transaction flow, better
understanding of customer profitability, and risk
modeling done in real time (knowledge-based
analysis and simulation are common in financial
firms)

102
Financial applications

Instead of
Manually subdividing algorithms
Running them on separate machines
Manually merging and integrating the results
Exploit grid tools to do the same, more or less automatically, in a virtualised environment

103
Business goals

Improve
Utilisation
Responsiveness
Reduce IT costs

104
Traditionally

Business applications
Dedicated platforms of servers and storage
devices associated to each server
Not able to share resources
Not exploiting abilities to predict, anticipate,
and exploit expected levels of processing loads
Design for excess capacity to handle excess peak
loads
Higher overall costs

105
Virtualisation of resources

Exploit
Synergistic integration
Economies of scale
Load smoothing
Due to the sharing and aggregation of distributed
resources
And the delivery of services in a highly
transparent way to the end-user
Several solutions
Dedicated local Clusters
Grids

106
Cost savings

Cluster computing
Aggregating processors in parallel-based
configurations
Reductions in IT costs and costs of operations: confirmed.
Enterprise grids
Middleware-based, to exploit unused CPU cycles, avoiding growth/expansion costs
Expected savings.

107
Expectations

2005-06: the Grid will become commercially viable
Early adoption for enterprise applications, at single-site and multi-site
Exploitation of solutions from Web services and utility computing
By 2005, a significant share (50%) of companies were already aware of the IT utility model for outsourcing (IT services from Service Providers as a commodity)
A significant percentage of companies have some sort of utility computing and a significant percentage of IT services are being delivered from offshore centers
Uncertainties remain about cost, security, and
integration with existing IT systems

108
Grid for enterprises

Obtain computing services over networks from
remote Service Providers
Aggregate an organisation's dispersed set of independent resources into one unified single virtual environment

109
Data grids

Connection
connect DBs at different locations in a single
company
Significant savings in finding information, leading to staff efficiency gains
Requires large investment in broadband links
to connect remote data centers

110
Cluster Computational Grid

Processing power for HPC
Big saving in processing time, leading to efficiency and savings in R&D costs
No initial impact on broadband until cluster
computing evolves to an enterprise grid

111

Cluster/Local Grid
few homogeneous processors connected in a data center on a LAN or SAN (Storage Area Network): more a cluster than a grid
under the same OS and a central administration
Enterprise or IntraGrids
heterogeneous processors and OSs, geographically distributed and interconnected by Intranet links (or high-quality, high-throughput, high-security communications)
owned by different departments of a single organisation
may be structured as a hierarchy: cluster of clusters

112
Enterprise Grid

Processing power and connection within a single company; links R&D centers at different geographic locations
Efficiency due to processing power and access to data
Savings on R&D times and time-to-market
Investment in broadband links: requires very high speed due to the large amount of data transmitted

113
Partner Grid

Processing power and connection for multiple companies
Savings in design time and R&D time, and time-to-market
More efficient collaboration between partners in
a supply chain relationship
Significant investment in secure,
high-performance, broadband links between the
companies

114

Enterprise Grids
require policies and operations to control
actual use of grid resources, based on
priorities, and kinds of applications
also requiring security control across distinct
departments
Global Grids
cross organisation borders
more critical security
allow sharing, trading, brokering resources over global pools

115
Web Services

Provide secure Internet access to new services
for consumers and business
Develops closely with cluster and data grids
Big gain in productivity
savings in cost of offering services and
time-to-market new services
requires a data grid-like structure to provide
rapid updating of information
Large spending on broadband to link data centers
Significant spending on software and integration
services
Example: Bank of America over the Internet

116
How to evolve to a Grid?

Transform individual components (computers,
storage, networks) into an aggregated, virtual
pool of resources, to be allocated and monitored
automatically
Provide defined business services on the basis of
specified goals and priorities: develop and
automate policies and service-level objectives to
manage the needed applications and resources
Build an enterprise grid infrastructure and use
open-source and vendors' proprietary tools
Enable these tools to comply with new standards,
and combine components together.

117
How?

Concept of outsourcing
Delegate the provision of a service to an
external reliable and trusted supplier
Install the concept of utility computing
-> Expected to be a major trend in the 2010s
Virtualisation of resources
Dynamically manage and adjust a logical pool of
resources and their mappings to share the
physical infrastructure

118
Virtualisation without limit

-> Application software and licenses
Specific business software may be installed on a
few designated grid processors and be shared
among clients,
possibly limiting the number of concurrent users
-> virtual licenses
Compare with installing the same licensed software on
thousands of servers
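
The virtual-license idea can be sketched as a pool of seats shared by all clients of one installed application. This is only a minimal illustration: the class name LicensePool and the seat count are assumptions, not part of any grid toolkit.

```python
import threading


class LicensePool:
    """Hypothetical pool of virtual licenses for one shared application."""

    def __init__(self, seats: int) -> None:
        # A bounded semaphore caps the number of concurrent users.
        self._seats = threading.BoundedSemaphore(seats)

    def try_acquire(self) -> bool:
        # Non-blocking: returns False when every seat is already taken.
        return self._seats.acquire(blocking=False)

    def release(self) -> None:
        self._seats.release()


pool = LicensePool(seats=2)
granted = [pool.try_acquire() for _ in range(3)]  # the third client is refused
pool.release()                                    # one client finishes
granted.append(pool.try_acquire())                # the freed seat is reused
print(granted)
```

Two seats serve any number of clients over time, which is the contrast with installing the licensed software on every server.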

119
Grid requirements include

Online negotiation of access to services: who,
what, why, when, how
Establishment of applications and systems able to
deliver multiple qualities of service
Autonomic management of infrastructure elements
Dynamic formation and management of virtual
organisations
Open, extensible, evolvable infrastructure

120
(No Transcript)
121
More Complex Applications and Environments

Large number of components
Complex interactions
Dynamic configuration

122
Software Engineering Challenges

Suitable levels of flexibility in all stages of
the software lifecycle
Application specification and design
Program transformation and refinement
Simulation
Code generation
Configuration and deployment
Coordination and control of the execution

123
Issues - 1

Clear separation and representation of concepts
Computation and interaction
Structure and behaviour
Specification of multiple components
Enabling alternative mappings
Varying degrees of automated processing
Supported by pattern and template repositories
with relevant attributes

124
Issues - 2

Mapping the programming models into the
underlying computing platforms
Interacting with resource descriptions and
discovery services
For flexible configuration and deployment
Coordination of distributed execution
Allowing workflow descriptions
With adaptability and dynamic reconfiguration

125
Component Based Development / Software
Architecture

Repositories (Skeletons/Templates/Patterns)
Abstract Description Language
specify, design, compose
For structure, behaviour, computation, and
interaction
Mappings
verify, analyse, evaluate, predict
Programming Levels (Models)
Resource Description and Discovery
Deploy and Configure
Grid Execution Environments
control, coordinate, execute, reconfigure
Methodology

126
Global conceptual layers

Software architectures
Coordination models
Resource management
Execution, monitoring and control
Support infrastructures

127
(No Transcript)
128
1 - Software Architectures

Specification of components, their composition
and interactions
Modeling and reasoning on global structure and
behavior
Specification languages
for structure and behavior
incremental refinement and dynamic composition

129
2 - Coordination models

Represent and manage interaction patterns among
components
Communication and cooperation models
Consistency guarantees
Abstract, logical, dynamic organisation models
Dynamic application structure, interaction
patterns and operation modes

130
Handle dynamic characteristics

Looking at the past
Fault tolerance, Load balancing, Task spawning
At present and in the future
Changes in the configuration and availability of
resources, variations of characteristics and
behaviour
Changes at the application level: user control of
a dynamic experiment
Flexibility to build PSEs
Mobility of agents and devices

131
3 - Resource management

Configuration of parallel and distributed virtual
machines
Resource discovery, scheduling, and reservation
Execution and monitoring at local and large
scales
Quality of service

132

Need to be fair and efficient in
locating software resources
negotiating for use of resources
scheduling components on distributed resources to
achieve
Minimum execution time
Maximum throughput
Need to be able to monitor resource usage and
level of availability
Need for Resource Specification Languages
A difficult problem in a dynamic environment.
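
As a rough illustration of scheduling for minimum execution time, the sketch below uses a greedy heuristic: the longest task goes first, and each task goes to the resource that would finish it earliest. The task sizes, resource speeds and the function name schedule are all assumed for the example. The heuristic is not guaranteed to be optimal.

```python
def schedule(tasks, speeds):
    """tasks: {name: work units}; speeds: {resource: work units per second}."""
    finish = {r: 0.0 for r in speeds}  # time at which each resource becomes free
    plan = {}
    # Longest tasks first, each placed where it would complete earliest.
    for name, work in sorted(tasks.items(), key=lambda kv: -kv[1]):
        best = min(speeds, key=lambda r: finish[r] + work / speeds[r])
        finish[best] += work / speeds[best]
        plan[name] = best
    return plan, max(finish.values())


tasks = {"a": 8, "b": 4, "c": 4, "d": 2}      # assumed workloads
speeds = {"fast": 2.0, "slow": 1.0}           # assumed resource speeds
plan, makespan = schedule(tasks, speeds)
print(plan, makespan)
```

In a dynamic environment the speeds and availability change while jobs run, so a real scheduler would have to re-evaluate such a plan continuously.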

133
New challenges

New problem-solving strategies with adaptive
behaviour
Awareness of Quality of Service factors
Management at intermediate layers
By intermediate agents / planners
Contract negotiation
Dynamic revision of plans
Reconfiguration
Specify, compose, develop, understand dynamic
distributed large-scale applications: models,
languages, and tools

134
Two Views of Components

A component as an executable that runs on a
certain specified machine.
A component can be viewed as a contract. It says:
"If you give me these inputs then I'll give you
these outputs."

In the second case the component is not tied to
any particular executable. Problem specification
is separate from service provision.
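
The contract view can be sketched with a structural interface: the caller states the problem against the contract, and any provider that honours it can be bound later. All names here (Solver, LocalSolver, RemoteSolverStub, run) are hypothetical.

```python
from typing import Protocol, Sequence


class Solver(Protocol):
    """The contract: given a sequence of numbers, return their sum."""

    def solve(self, values: Sequence[float]) -> float: ...


class LocalSolver:
    def solve(self, values):
        return sum(values)


class RemoteSolverStub:
    """Stands in for a provider on another machine (no network here)."""

    def solve(self, values):
        total = 0.0
        for v in values:
            total += v
        return total


def run(problem: Sequence[float], provider: Solver) -> float:
    # The problem specification does not name any particular executable.
    return provider.solve(problem)


results = [run([1.0, 2.0, 3.5], p) for p in (LocalSolver(), RemoteSolverStub())]
print(results)
```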

135
Binding Service Requests to Resources

In a fully transparent system the scheduler would
decide where components execute based on
Availability and performance of resources
Cost and time constraints
This is a hard problem.
A possible solution is to supply hints about
where it can run, e.g., in a component's XML
specification.
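
A sketch of hint-based binding, assuming an invented XML layout (the hint and constraint elements and their attributes are illustrative, not a real schema) and a hand-written resource table. The scheduler considers only hinted hosts and then applies the availability, cost and time constraints.

```python
import xml.etree.ElementTree as ET

# Hypothetical component specification carrying placement hints.
SPEC = """
<component name="solver">
  <hint host="clusterA"/>
  <hint host="clusterB"/>
  <constraint max-cost="10" max-seconds="60"/>
</component>
""".strip()

# Assumed resource table: host -> (available, cost, estimated seconds).
RESOURCES = {
    "clusterA": (False, 4, 30),
    "clusterB": (True, 8, 45),
    "clusterC": (True, 2, 20),  # cheap and fast, but not among the hints
}


def bind(spec_xml, resources):
    root = ET.fromstring(spec_xml)
    limit = root.find("constraint")
    max_cost = float(limit.get("max-cost"))
    max_secs = float(limit.get("max-seconds"))
    candidates = []
    for hint in root.findall("hint"):
        host = hint.get("host")
        if host not in resources:
            continue
        available, cost, secs = resources[host]
        if available and cost <= max_cost and secs <= max_secs:
            candidates.append((cost, host))
    # Cheapest admissible hinted host, or None when nothing fits.
    return min(candidates)[1] if candidates else None


chosen = bind(SPEC, RESOURCES)
print(chosen)
```

Hints shrink the search space at the price of transparency: clusterC is never considered even though it would be the better choice.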

136
High Level View of Network Computing

Services are advertised on the network
A service typically consists of
A component that actually provides the service,
and
An agent that mediates access to the service.
Scheduler must be able to locate services and
then schedule use.

137
Service-oriented architecture

Defines how two entities interact so that one
performs a unit of work for the other
- the unit of work is a service
- service interactions are defined in a
description language
- each interaction is self-contained,
loosely coupled, and independent of other
interactions
- applications are assembled as collections
of services, each with different functions,
and are exposed as services on the network, to
be (re)used
- different users can communicate with the
services differently
- an intermediate layer between providers and
consumers
- building applications means
identifying required components, finding them,
and gluing them together
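
The identify / find / glue steps can be sketched with an in-memory registry. A real service-oriented system would use a network directory and a description language; the registry and the two services below are assumptions made for illustration.

```python
registry = {}


def advertise(name, func):
    # A provider exposes a service under a known name.
    registry[name] = func


def find(name):
    # A consumer looks the service up by name.
    return registry[name]


advertise("parse", lambda text: [float(x) for x in text.split(",")])
advertise("mean", lambda xs: sum(xs) / len(xs))


def compose(*names):
    """Glue the named services into one application, applied in order."""
    services = [find(n) for n in names]

    def pipeline(value):
        for service in services:
            value = service(value)
        return value

    return pipeline


app = compose("parse", "mean")
result = app("2,4,9")
print(result)
```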

138
Service Providers and Brokers

NetSolve is an example of an ASP providing
numerical software, still limited to
client-server style.
Trend to network-based computing paradigm.
Nodes offer sets of computing services with known
advertised interfaces.
Software seen as a pay-as-you-go service
rather than a product that you buy once
-> Computational Economies
Open Service-Oriented Architectures
Shifting paradigms towards master-slave and
tighter cooperation models

139
Grids: Key components

Resource management
Security
Data management
Services management

140
Grid types

Space scale: local, metropolitan, regional,
national, global
Time scale: logically aggregate resources for
long or short periods of time
Crossing borders: resources can span a single or
multiple organisations, or a service provider
space

141
Very complex systems

Aim at providing unifying abstractions to the
end-user
Large-scale universe of distributed,
heterogeneous, and dynamic resources
Critical aspects
Distributed
Large-scale
Multiple administrative domains
Security and access control
Heterogeneity
Dynamic

142
Layers of a Grid Architecture

User Interfaces, Applications, PSEs
Programming Models, Development Tools and
Environments
Grid middleware: Services and Resource
Management
Heterogeneous Resources and Infrastructure

143
Elements of a Grid Architecture

Applications, User interfaces, Grid portals and
PSEs
Models, tools and environments for application
composition, programming and deployment
Grid operating environment (middleware)
Services and resource management, discovery and
scheduling
Information registration and querying
Authentication, Security
Computation, data management, and communication
Monitoring, Quality of Service
Heterogeneous resources and infrastructure

144
Grid tools (1)

Infrastructure: includes hardware and software
components (file systems, resource managers,
messaging systems, security applications,
certificate authorities, file transfer
mechanisms)
Middleware: software plug-ins that facilitate
using the Grid
open source: Globus Toolkit 3 (GT3), the first
implementation of OGSI, as a set of services and
software libraries
based on a security model plus a mechanism
for hierarchically collecting data about the grid
includes support for
security
information infrastructure
resource management
data management
communication
fault detection
portability

145
Grid tools (2)

Directory services: to discover available
services, to define and monitor the grid topology
generally based on the Lightweight Directory
Access Protocol (LDAP)
and the Domain Name System (DNS)
Schedulers and load balancers: ensure job
completion under priority, deadline or urgency
constraints and distribute tasks and data across
systems to reduce the chance of bottlenecks
Developer tools: for file transfer,
communications, environment control, ranging from
utilities to APIs
Security: authenticate and authorise, control
who/what can access a grid's resources. Includes
message integrity
message confidentiality
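
Message integrity can be illustrated with a keyed hash (HMAC) from the Python standard library. This is a simplification under a stated assumption: both sides share a secret key. Grid security usually relies on public key technology instead. The key and messages are invented.

```python
import hashlib
import hmac

KEY = b"shared-secret-for-illustration-only"  # assumption: pre-shared key


def sign(message: bytes) -> bytes:
    # The tag changes if even one byte of the message changes.
    return hmac.new(KEY, message, hashlib.sha256).digest()


def verify(message: bytes, tag: bytes) -> bool:
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(sign(message), tag)


msg = b"submit job 42 to node-7"
tag = sign(msg)
ok = verify(msg, tag)
tampered = verify(b"submit job 42 to node-9", tag)
print(ok, tampered)
```

This covers integrity only. Confidentiality would additionally require encrypting the message.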

146
Grid architecture concepts

Influenced by the Globus Toolkit
a de facto standard for security, information
discovery, resource and data management,
communication, fault detection, and portability
Driven by the Global Grid Forum (GGF)
An industry advisory group for community-driven
development of new standards
Grid architectures heavily dependent on former
Internet protocols and services (for
communication, routing, name resolution, ...)

147
Grid logical hierarchy

L1: Grid fabric resources (computers, storage,
networks, special devices) -> managed by a local
resource manager (RM) with a local policy, and
interconnected in a LAN, MAN, or WAN
L2: Security infrastructure (authentication, secure
connectivity, access to resources)
L3: Core Grid middleware (job management, storage
access, accounting) -> uniform access to the
fabric resources, hiding partitioning,
distribution, and load-balancing
L4: User-level middleware: resource aggregators
(scheduling services and resource brokers)
L5: Grid programming environments and tools
(languages, libraries, compilers, and support
tools)
L6: Applications (commercial, scientific,
engineering)

148
GGF layered architecture

Fabric: controlling things locally
Connectivity: talking to things
communication (Internet protocols) and security
Resource: sharing single resources
negotiating access, controlling use
Collective: coordinating multiple resources
ubiquitous infrastructure services,
application-specific distributed services
Applications: putting things to work

149
Critical Grid Issues

Security: When resources are shared across
organisation boundaries, security is an important
issue.
Dependability: The Grid must be robust and
resilient to failure.
Efficiency: Resources should not be wasted; good
load balancing is needed.
Cost: For broad impact, the Grid should be
inexpensive.
Portability: Grid applications should be able to
run on a wide range of hardware.

150
Functional perspective of a Grid

a) Grid Portal UI
interface to launch applications
with transparent access to resources and
services
b) Grid Security
b1) USER's view
- provides authentication, authorization, data
confidentiality, data integrity, and
availability, from the user's view
- a single sign-on run-anywhere uniform
authentication service
- a user job requires on-the-fly confidential
message-passing services
or may require a long-lived service
- user must be allowed to check availability of
such security services

151

provide security across organisation borders
with support for local control over access
rights and mapping
uniform authentication, authorization and
message-protection
with delegation of credentials for computations
involving multiple geographically distributed resources
usually relies on public key technology
b2) SYSTEM's view
- the user needs to be authenticated, but remote
resources do too!
- secure (authenticated and confidential)
communication between internal grid components
- a Certificate Authority establishes the
identity of users and grid resources

152

c) Broker and Directory
a user's request to launch an application ->
requires identifying suitable resources
based on the application's parameters
->
-- informs about available resources and
working status
-- allows defining and monitoring the grid
topology/resources
-> supported by a Directory mechanism (LDAP
and/or DNS)
d) Scheduler
to coordinate the concurrent execution of a job's
components
in a simple case
- selection of suitable processor
- grid request to send the job code and data to
the selected processor
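
The simple case can be sketched as two steps: the broker filters directory entries against the application's parameters, and the scheduler picks one of the suitable processors. The directory contents and field names are assumptions. A real directory would be queried through LDAP or a similar service.

```python
# Hypothetical directory entries describing grid nodes.
DIRECTORY = [
    {"host": "n1", "cpus": 4, "mem_gb": 8, "status": "up", "load": 0.7},
    {"host": "n2", "cpus": 16, "mem_gb": 64, "status": "up", "load": 0.2},
    {"host": "n3", "cpus": 32, "mem_gb": 128, "status": "down", "load": 0.0},
    {"host": "n4", "cpus": 16, "mem_gb": 32, "status": "up", "load": 0.5},
]


def broker(directory, min_cpus, min_mem_gb):
    """Return the working entries that satisfy the application's parameters."""
    return [
        entry for entry in directory
        if entry["status"] == "up"
        and entry["cpus"] >= min_cpus
        and entry["mem_gb"] >= min_mem_gb
    ]


def select(candidates):
    """Simple scheduler: pick the least loaded suitable processor."""
    if not candidates:
        return None
    return min(candidates, key=lambda entry: entry["load"])["host"]


candidates = broker(DIRECTORY, min_cpus=8, min_mem_gb=16)
target = select(candidates)
print([c["host"] for c in candidates], target)
```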

153

in general cases
- a scheduler must dynamically react to grid
load
by getting measurement information obtained by
grid monitoring and resource management
scheduler strategies
- simple round-robin (cf. the PVM default)
- usually, try to find the most appropriate
processor(s)
- hierarchical scheduling
a metascheduler submits a job to a cluster
scheduler
cluster scheduler manages a cluster as a
single resource and uses an internal scheduling
strategy
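
A minimal sketch of hierarchical scheduling, with assumed class names and policies: the metascheduler treats each cluster as a single resource and picks the one with the fewest queued jobs, while each cluster scheduler applies its own internal strategy (round-robin here).

```python
from itertools import cycle


class ClusterScheduler:
    """Presents one cluster as a single resource; round-robin inside it."""

    def __init__(self, name, nodes):
        self.name = name
        self._next_node = cycle(nodes)
        self.queued = 0

    def submit(self, job):
        self.queued += 1
        return f"{self.name}/{next(self._next_node)}"


class MetaScheduler:
    """Sends each job to the cluster with the fewest queued jobs."""

    def __init__(self, clusters):
        self.clusters = clusters

    def submit(self, job):
        target = min(self.clusters, key=lambda c: c.queued)
        return target.submit(job)


meta = MetaScheduler([
    ClusterScheduler("A", ["a1", "a2"]),
    ClusterScheduler("B", ["b1"]),
])
placements = [meta.submit(f"job{i}") for i in range(5)]
print(placements)
```

The queue count never decreases in this sketch because job completion is not modelled. A real scheduler would react to measured load from grid monitoring.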

154

schedulers also monitor job progression
- to automatically resubmit to other nodes, in
case of losses
- to check for job completion (e.g., with
timeouts)
some use a static resource reservation system
-- a calendar-based mechanism (as in old batch
processing)
managing pools of resources
- processors automatically report their
availability to grid management
-> allows reassignment of jobs to such
processors
- local nodes may report the start of local NON-GRID
work
-> reduces node availability for grid work
may cause unpredictable completion times
-- suggests use of DEDICATED grid resources
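
Monitoring with automatic resubmission can be sketched as a loop over candidate nodes. The failure modes are simulated with exceptions, and the function names (run_with_resubmission, fake_execute) are hypothetical.

```python
def run_with_resubmission(job, nodes, execute, max_attempts=3):
    """Try nodes in order; resubmit elsewhere when a node loses the job."""
    history = []
    for node in nodes[:max_attempts]:
        try:
            result = execute(job, node)
            history.append((node, "done"))
            return result, history
        except (TimeoutError, ConnectionError) as exc:
            # The job was lost or timed out: record it and move on.
            history.append((node, type(exc).__name__))
    raise RuntimeError(f"{job} failed on all attempted nodes: {history}")


def fake_execute(job, node):
    # Simulated grid: n1 times out, n2 drops out, n3 completes the job.
    if node == "n1":
        raise TimeoutError("no progress report before the deadline")
    if node == "n2":
        raise ConnectionError("node switched to local non-grid work")
    return f"{job} finished on {node}"


result, history = run_with_resubmission("job7", ["n1", "n2", "n3"], fake_execute)
print(result, history)
```

Each resubmission adds delay, which is one way non-dedicated nodes lead to unpredictable completion times.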

155

e) Grid data management
reliable and secure method for moving files and
data
f) Grid job/resource management
Grid Resource Allocation and Management (GRAM)
f1) keeps track of grid available resources,
node capacities and current utilisation levels,
and of current grid users
-> passes this information to the Scheduler, for
deciding where to submit jobs
-> also uses this to monitor the grid's
unpredictable incidents: outages, congestion
-> and for administration: overall usage
patterns, statistics, logging resource usage for
accounting purposes
f2) services to launch a job on a set of
resources
to check status
to get results when the job is complete