Session 2 Overview of eScience and Distributed Systems

Slides: 94

Provided by: jone1

1
Session 2 Overview of e-Science and
Distributed Systems
7 July 2008

Malcolm Atkinson

2
Overview

e-Science and computational thinking
A turning point in the history of science
Modern challenges
Combined approaches
Grids in context
What can a grid do?
What can't it do?
Principles
Scenarios

3
New modes in Research, Thought and Collaboration
4
Vision

We are undergoing a transition in
the power of affordable computing
the wealth of accessible data and
the capacity of digital communication
e-Science provides leadership in
interdisciplinary collaboration
By combining these we will provide unprecedented
ability to address pressing research challenges

5
Definition of e-Science
Computing has become a fundamental tool in all
research disciplines, which often proceed by
assembling and managing large data collections
and exploiting computer models and simulations
(a topic called e-Science). Phil Wadler, 2008
e-Science is the invention and application of
computer-enabled methods to achieve new, better,
faster or more efficient research in any
discipline. It draws on advances in mathematical
sciences, informatics, computation and digital
communications. As such it has been an important
tool for researchers for many decades. The data
deluge and the scale and complexity of today's
research challenges have greatly increased its
importance for researchers. As a consequence, in
2001 the UK led the world by initiating a
coordinated e-Science research programme to
stimulate the development of e-Science across all
fields of research.
6
Strengths of e-Science
Communities and e-Infrastructure supporting
research and innovation
7
Computational thinking

Transforming the way we think
Incremental refinement
Solution by composition
Layers of abstractions
Process models
Notations
Recursive thinking
Simulation, Randomisation
Enabled by ubiquitous computers
Analogue of the printing press

Jeannette Wing, Computational Thinking,
Communications of the ACM, March 2006, Vol 49,
No. 3, pp. 33-35
8
WWW acting

The Long Tail
Data is the Next Intel Inside
Users Add Value
Network Effects by Default
Some Rights Reserved
The Perpetual Beta
Cooperate, Don't Control
Software Above the Level of a Single Device

Transforming the way we act
Data is key ingredient
Community action
Global collaboration
Community thinking
Minimal (?) control
Minimal reserved rights
Composition via wikis
Mash ups
Enabled by ubiquitous digital communication
Analogue of the radio

http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html
9
Is e-Science making a difference?
10
Tremendous global challenges
11
Scale, Urgency, Complexity,
12
Achieving the CI Vision requires synergy between
3 types of Foundation-wide activities
Transformative Application - to enhance discovery
and learning
Provisioning - creation, deployment and operation
of advanced CI
R&D to enhance technical and social dimensions of
future CI systems
Cyberinfrastructure Vision for 21st Century
Discovery, NSF Cyberinfrastructure Council, March
2007
13
The Information Explosion
161 EB (2006, by IDC)
988 EB (2010)
1 ZB
[Remaining figure labels lost in extraction, apart from "GRID" and "ITS"]
Slide: Satoshi Matsuoka
14
The 21st Century
This is the century of information
Prime Minister Gordon Brown, University of
Westminster, 25 October 2007
Thanks for images to Mark Birkin (MoSeS
Genysis projects) and Michael Batty (GeoVue
project)
15
Historical perspective
16
Timeline
Foundations for Collaborative Behaviour
Today
Wellbeing: the global-scale killer app., Sir
Robin Saxby, Oct. 2006
17
Healthcare @ Home
[Diagram labels: ILLNESS, REFERRAL, CASE, VARIABLES ACCESS MATRIX]
Patient: Home-mobile-clinic via TV-PDA-laptop-PC-Paper
GP: Home-mobile-clinic via PDA-laptop-PC-Paper
Diabetician: Home-mobile-clinic via PDA-laptop-PC-Paper
Diabetes Specialist / Other Specialist Nurses: Home-mobile-clinic via TV-PDA-laptop-PC-Paper
Various Clinical Specialists (Distributed), e.g.
Ophthalmologist, Podiatrist, Vascular Surgeons,
Renal Specialists, Wound clinic, Foot care
clinic, Neurologists, Cardiologists
Dietitian
Biochemist
Community Nurses / Health Visitors
Slide from Alex Hardisty
18
Distributed Systems History
[Timeline figure: 1960 to 2000 in decades, beginning with ARPANET]
19
Distributed Systems to Grids
[Timeline figure: 1960 to 2000 in decades]
20
e-Infrastructure

A shared resource
That enables science, research, engineering,
medicine, industry,
It will improve UK/European/ productivity
Lisbon Accord 2000
e-Science Vision SR2000 John Taylor
Commitment by UK government
Sections 2.23-2.25
Always there multi-purpose
c.f. telephones, transport, power
OSI report

www.nesc.ac.uk/documents/OSI/index.html
21
A Grid Computing Timeline
US Grid Forum forms at SC 98
Grid Forums merge, form GGF
European and AP Grid Forums
I-WAY at SuperComputing 95
OGSA-WG formed
Physiology paper
Anatomy paper
GGF and EGA form OGF
OGSA v1.0
Source: Hiro Kishimoto, GGF17 Keynote, May 2006
22
What is a Grid?

A grid is a system consisting of
Distributed but connected resources and
Software and/or hardware that provides and
manages logically seamless access to those
resources to meet desired objectives

Handheld
Supercomputer
Server
Data Center
Cluster
Workstation
Source: Hiro Kishimoto, GGF17 Keynote, May 2006
23
Grid Related Paradigms

Cluster
Tightly coupled
Homogeneous
Cooperative working

Distributed Computing
Loosely coupled
Heterogeneous
Single Administration

Grid Computing
Large scale
Cross-organizational
Geographical distribution
Distributed Management

Source: Hiro Kishimoto, GGF17 Keynote, May 2006
24
Views of Grids
25
Grids: integrating and providing homogeneity

Grids are (potentially) generic and industry
supported
Grids combine many heterogeneous distributed
resources
Data and information
Computation and software
Instruments, sensors and actuators
Research processes and procedures
System operations processes and procedures
Grids restrict choices
Harder for provider to make localised decisions
Deployment can be challenging
Grids provide virtual homogeneity through
virtualisation
Should be easier to compose services
More opportunity to amortise costs
A component of e-Infrastructure

Deliberately choosing consistent interfaces,
protocols and management controls across a set of
compatible services. Giving up some freedom to
differ.
26
Grids as a Foundation for Solutions

The grid per se doesn't provide
Supported e-Science methods
Supported data and information resources
Computations
Convenient access
Collaborative behaviour
Grids help organisations provide
International and national secure e-Infrastructure
Standards for interoperation
Standard APIs to promote re-use
But Research Support must be built
What is needed?
Who should build it?

27
Grids as a Foundation for Solutions
Much to be done by developers of applications and
services, and by resource providers


28
Grids as a Foundation for Solutions
Much to be done by developers of applications and
services, and by resource providers

Must support many categories of user
Application and service developers
Tool builders
Deployers and operations teams
Gateway developers
App, tool and gateway users

The grid per se doesn't provide
Supported e-Science methods
Supported data and information resources
Computations
Convenient access
Grids help providers of these
International and national secure e-Infrastructure
Standards for interoperation
Standard APIs to promote re-use
But Research Support must be built
What is needed?
Who should do it?

29
Motives for Grids
30
Why use / build Grids?

Research Arguments
Enables new ways of working
New distributed collaborative research
Unprecedented scale and resources
Economic and Ecological Arguments
Reduced system management costs
Shared resources → better utilisation
Pooled resources → increased capacity
Greener / less power consumption →
environmentally acceptable computing
Load sharing and utility computing
Cheaper disaster recovery

31
Why use / build Grids?

Computer Science Arguments
New attempt at an old hard problem
Frustrating ignorance about existing results
New scale, new dynamics, new scope
Engineering Arguments
Enable autonomous organisations to
Write complementary software components
Set up, run and use complementary services
Share operational responsibility
General and consistent environment for Abstraction,
Automation, Optimisation and Tools
Generally available code mobility

32
Why use / build Grids?

Political and Management Arguments
Stimulate innovation
Promote intra-organisation collaboration
Promote inter-enterprise collaboration

33
Collaboration is key
34
Biomedical Research Informatics Delivered by Grid
Enabled Services
Portal
http://www.brc.dcs.gla.ac.uk/projects/bridges/
Slide by Richard Sinnott
35
eDiaMoND Screening for Breast Cancer
1 Trust → Many Trusts
Collaborative Working
Audit capability
Epidemiology

Other Modalities
MRI
PET
Ultrasound

Better access to case information and digital
tools
Supplement mentoring with access to digital
training cases and sharing of information
across clinics
Provided by eDiaMoND project: Prof. Sir Mike
Brady et al.
36
climateprediction.net and GENIE

Largest climate model ensemble
>45,000 users, >1,000,000 model years

Response of Atlantic circulation to freshwater
forcing
10K
2K
37
Integrative Biology

Tackling two Grand Challenge research questions
What causes heart disease?
How does a cancer form and grow?
Together these diseases cause 61% of all UK
deaths

Will build a powerful, fault-tolerant Grid
infrastructure for biomedical science, enabling
biomedical researchers to use distributed
resources such as high-performance computers,
databases and visualisation tools to develop
complex models of how these killer diseases
develop.
Slide: David Gavaghan and IB team, Oxford
38
Foundations of Collaboration

Strong commitment by individuals
To work together
To take on communication challenges
Mutual respect and mutual trust
Strong leadership
Distributed technology
To support information interchange
To support resource sharing
To support data integration
To support trust building
Sufficient time
Common goals
Complementary knowledge, skills and data

Can we predict when it will work? Can we
find remedies when it doesn't?
39
Grid Collaboration Questions

Without collaboration little is achievable
Must collaboration precede successful grid
applications?
Or will persistently and pervasively available
grids stimulate collaborations?
If we deliver support for collaborative teams,
will we also support the individual researcher?
Can we use grids to democratise computation?
Broadening access
Open science

40
CARMEN - Scales of Integration
Understanding the brain may be the greatest
informatics challenge of the 21st century
See talk by Paul Watson at Google Scalability
conf., Seattle, June 2008: www.youtube.com/watch?v=2m4EvnlgL8Q
Slide from Colin Ingram and Paul Watson
41
CARMEN Consortium
Leadership and e-Infrastructure
Colin Ingram
Paul Watson
Leslie Smith
Jim Austin
Slide from Colin Ingram and Paul Watson
42
CARMEN Consortium
International Partners
Slide from Colin Ingram and Paul Watson
43
CARMEN Consortium
Commercial Partners
- applications in the pharmaceutical sector
- interfacing of data acquisition software
- application of infrastructure
- commercialisation of tools
Slide from Colin Ingram and Paul Watson
44
Summary
45
Grids in context

Technology is transforming research
Computer power, network speed, data bonanza,
pervasive devices
Social and commercial impact of web-based
computing
Part of a long-term drive for distributed
computing
A new and ambitious form
Search for trade-offs and multiple uses
Leads to many varieties
Multiple stakeholders
Many good reasons for building and using grids
Questions
Will we have many grids?
A consistent general purpose foundation grid?
What are the minimum standards across the grids?
Collaboration is a key driver and enabler

46
Minimum Grid Functionalities

Supports distributed computation
Data and computation
Over a variety of
Hardware components (servers, data stores, ...)
Software components (services: resource managers,
computation and data services)
With regularity that can be exploited
By applications
By other middleware tools
By providers and operations
It will normally have security mechanisms
To develop and sustain trust regimes

Users want uniform and consistent access to
computing and data: desktop, cloud, cluster,
institutional and regional grids, national and
international facilities
47
Distributed Systems: Introduction, Principles
and Foundations
48
Principles of Distributed Computing

Issues you can't avoid
Lack of Complete Knowledge (LoCK)
Latency
Heterogeneity
Autonomy
Unreliability
Change
A Challenging goal
balance technical feasibility
against virtual homogeneity, stability and
reliability
Balance between usability and productivity
Affordable
Wide user base to amortise costs
Manageable and maintainable

This is NOT easy
49
Lack of Complete Knowledge

Technical origins of LoCK
Dynamics of systems involve very large state
spaces
Can't track or explore all the states
Latency prevents up-to-date knowledge being
available
By the time a notification of a state change
arrives the state may have changed again
Failures inhibit information propagation
Unanticipated failure modes
If you ask a remote system
By the time the answer arrives it may be wrong

Never assume you know the state of a remote
system
50
Lack of Complete Knowledge 2

Human origins of LoCK
Lack of understanding
Incomplete and simplified models
Intractable models
Poor and incomplete descriptions
Erroneous descriptions
Socio-Economic effects generate LoCK
Autonomous owners do not reveal all
About services, resources and performance
Intermediaries aggregate and simplify
Present the services they want to sell, or want
you to use, favourably

51
LoCK Counter Strategies

Improve the quality of the available knowledge
Better static information
Better information collection and dissemination
Improve quality of Distributed System Models
Prove invariants that algorithms can exploit
Test axioms with real systems
Build algorithms that behave reasonably well
When they have incomplete knowledge
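
One way to read "behave reasonably well with incomplete knowledge" is optimistic update: treat anything read from a remote system as possibly stale, and let the remote side reject a write that was based on an out-of-date view. The sketch below is an illustration only. RemoteCounter is a hypothetical in-process stand-in for a remote service, and the version-number scheme is an assumption, not something prescribed by the slides.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """What a client believes about the remote value; it may already be stale."""
    value: int
    version: int


class RemoteCounter:
    """Hypothetical stand-in for a remote service holding one versioned value."""

    def __init__(self):
        self.value = 0
        self.version = 0

    def read(self) -> Observation:
        return Observation(self.value, self.version)

    def compare_and_set(self, expected_version: int, new_value: int) -> bool:
        # Reject the write if the state changed since the caller last looked.
        if expected_version != self.version:
            return False
        self.value = new_value
        self.version += 1
        return True


def increment(remote: RemoteCounter, max_attempts: int = 5) -> bool:
    """Re-read and retry instead of assuming the earlier observation still holds."""
    for _ in range(max_attempts):
        seen = remote.read()
        if remote.compare_and_set(seen.version, seen.value + 1):
            return True
    return False
```

A write based on an old observation is refused rather than silently overwriting newer state, which is the "never assume you know the state of a remote system" rule in miniature.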

52
Latency

It is always going to be there
Consequence of signal transmission times
Consequence of messages / packets in queues
Consequence of message processing time
Errors cause retries → multiplied delays
It gets worse
Geographic scale increases latency
System complexity increases number of queues
Scale and complexity increase processing time
Think about
How many operations a system can do while a
message it sent reaches its destination, a reply
is formed and the reply travels back
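
The "think about" exercise can be made concrete with a little arithmetic. The figures below are illustrative assumptions (a machine doing 1,000 operations per microsecond, a 500 microsecond local round trip and a 150 millisecond intercontinental one), not measurements from the talk.

```python
def ops_during_round_trip(round_trip_us: int, ops_per_us: int) -> int:
    """Operations a machine could perform while waiting for one reply."""
    return round_trip_us * ops_per_us


OPS_PER_MICROSECOND = 1_000        # assumed: roughly 10^9 operations per second
LOCAL_ROUND_TRIP_US = 500          # assumed: 0.5 ms within one site
WIDE_AREA_ROUND_TRIP_US = 150_000  # assumed: 150 ms between continents

local_ops = ops_during_round_trip(LOCAL_ROUND_TRIP_US, OPS_PER_MICROSECOND)
wide_area_ops = ops_during_round_trip(WIDE_AREA_ROUND_TRIP_US, OPS_PER_MICROSECOND)
```

Under these assumptions a single wide-area round trip costs as much as 150 million local operations, which is why the number of round trips dominates the cost of a distributed algorithm.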

53
Latency Counter Strategies

Design algorithms that require fewer round trips
This is THE complexity measure!
Batch requests and responses
Shorten distance to get information
Caching, pre-fetching and replication
But may be stale data!
Move data to computation
But be smart about which data and when
Move computation to data
Succinct computation and large volumes of data
But safety and privacy issues arise

Communication is very expensive
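
Batching is the simplest of these strategies to illustrate. In the sketch below, Channel is a hypothetical stand-in for a remote key-value service that merely counts round trips; no real network or library API is implied.

```python
class Channel:
    """Hypothetical remote key-value service that counts round trips."""

    def __init__(self, store):
        self._store = dict(store)
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1  # one request, one reply
        return self._store[key]

    def get_many(self, keys):
        self.round_trips += 1  # one batched request, one batched reply
        return {key: self._store[key] for key in keys}


def fetch_one_by_one(channel, keys):
    """N keys cost N round trips."""
    return {key: channel.get(key) for key in keys}


def fetch_batched(channel, keys):
    """N keys cost a single round trip."""
    return channel.get_many(keys)
```

Both functions return the same data; only the round-trip count differs, and that count is the complexity measure the slide asks designers to minimise.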
54
Heterogeneity
Some of the variation is wanted and exploited

Hardware variation
Different computer architectures
Big endians v little endians
Number representation
Address length
Performance
Different Storage systems
Architectures
Technologies
Available operations
Different Instrument systems
Accepting different control inputs
Generating different output data streams
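
The byte-order item can be shown in a few lines using Python's standard struct module: the same 32-bit integer has two different byte layouts, and reading bytes with the wrong assumption yields a different number. The value 0x01020304 is an arbitrary example.

```python
import struct

value = 0x01020304

big = struct.pack(">I", value)     # big-endian: most significant byte first
little = struct.pack("<I", value)  # little-endian: least significant byte first

# Reading big-endian bytes as if they were little-endian gives a different number.
misread = struct.unpack("<I", big)[0]

# Reading them with the agreed byte order recovers the original value.
recovered = struct.unpack(">I", big)[0]
```

Agreeing a wire format (for example, always big-endian on the network) is a small instance of the virtual homogeneity discussed on the following slides.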

55
Heterogeneity 2
Some of the variation is unnecessary

Operating System variation
Different O/S architectures
Unix families and versions
Windows families and versions
Specialised O/S, e.g. for instruments and mobile
devices
Implementation system variation
Programming languages
Scripting languages
Workflow systems
Data models
Description languages
Grid systems
Many implementations of same functionality

56
Heterogeneity Counter Measures

Invest in virtual Homogeneity
Agree standards (formally or de facto)
Introduce intermediate code
That hides unwanted variation
Presenting it in standard form
But this has high cost
Developing the standard
Developing the intermediate code
Executing the intermediate code
It may hide variations some want
Provide direct access to facilities as well
But this may inhibit optimisation and automation
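
"Intermediate code that hides unwanted variation" is often an adapter layer. The sketch below assumes two hypothetical storage back ends with different native operations (load_text and fetch are invented names) and wraps each so that callers see one standard read operation. It also shows the stated cost: the native, back-end-specific calls are no longer visible through the standard form.

```python
class FileLikeStore:
    """Hypothetical back end 1: keyed by path, returns text."""

    def __init__(self, files):
        self._files = dict(files)

    def load_text(self, path):
        return self._files[path]


class BlobStore:
    """Hypothetical back end 2: keyed by (bucket, key), returns bytes."""

    def __init__(self, blobs):
        self._blobs = dict(blobs)

    def fetch(self, bucket, key):
        return self._blobs[(bucket, key)]


class FileLikeAdapter:
    """Presents FileLikeStore in the standard form: read(name) -> bytes."""

    def __init__(self, inner):
        self._inner = inner

    def read(self, name):
        return self._inner.load_text(name).encode("utf-8")


class BlobAdapter:
    """Presents BlobStore in the standard form, fixing the bucket once."""

    def __init__(self, inner, bucket):
        self._inner = inner
        self._bucket = bucket

    def read(self, name):
        return self._inner.fetch(self._bucket, name)


def read_everywhere(stores, name):
    """Client code written once against the standard form."""
    return [store.read(name) for store in stores]
```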

57
Heterogeneity Counter Measures 2

Automatically manage diversity
Manual agreement and construction of virtual
homogeneity will not scale and compose
Develop abstract and higher level models
Describe each component
Generate adaptations as needed from descriptions
Not yet achievable for general complete systems
Relevant for specific domains

58
Autonomy and Change

Necessary
To persuade organisations and individuals to engage
They need to control their own facilities
They have best knowledge to develop their
services
Their business opportunity
Because coordinated change is unachievable
Systems and workloads are busy
Service commitments must be met
Large-scale scheduling of work is very hard
To correct errors
To plug vulnerabilities
To obtain new capabilities

59
Autonomy and Change 2

What changes: local decisions
The underlying technology delivering a service
The operations available from a service
The semantics of the operations
Policy changes, e.g. authorisation rules, costs, ...
What changes: corporate decisions
Some agreed standard is changed
E.g. a new version of a protocol is introduced

60
Autonomy and change Counter Measures

Users and other providers expect stability
Agree some standards that are rarely changed
As a platform and framework
As a means of communicating change
Introduce change-absorbing technology
Mark the protocols and services with version
information
Transform between protocols when changes occur
Anneal the change out of the system
Develop algorithms tolerant to change
Revalidate dependencies where they may change
Handle failures due to change

Change is an asset. Embrace and manage it. Ignore
it at your peril.
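
Two of these counter measures, marking messages with version information and transforming between protocol versions, can be sketched together. The message fields below (cpu, cores, memory_mb) and the default value are hypothetical, chosen only to show an upgrade path from an old version to the current one.

```python
CURRENT_VERSION = 2


def upgrade_v1_to_v2(message):
    """Hypothetical change: v2 renames "cpu" to "cores" and adds "memory_mb"."""
    upgraded = {k: v for k, v in message.items() if k != "cpu"}
    upgraded["cores"] = message["cpu"]
    upgraded.setdefault("memory_mb", 1024)  # assumed default for old senders
    upgraded["version"] = 2
    return upgraded


# One transformation per version step; longer chains compose automatically.
UPGRADES = {1: upgrade_v1_to_v2}


def normalise(message):
    """Bring a versioned message up to the current protocol version."""
    message = dict(message)
    while message["version"] != CURRENT_VERSION:
        upgrade = UPGRADES.get(message["version"])
        if upgrade is None:
            raise ValueError(f"unsupported protocol version {message['version']}")
        message = upgrade(message)
    return message
```

Old clients keep working while providers move on, and an unknown version fails loudly instead of being misinterpreted.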
61
Unreliability

Failures are inevitable
Equipment, software and operations errors
Network outages, power outages, ...
Their effects must be localised
Cannot afford total system outages
This is not easy
Each error may occur when system is in any state
The system is an unknown composition of
subsystems
Errors often occur while other errors are still
active
Errors often occur during error recovery actions
Errors may be caused by deliberate attack
Attackers may continue their attack

62
Unreliability Counter Measures

Requires much R&D
Continuous arms race as the scale of Grids grows
Ideal of a continuously available stable service
Not achievable: recognise that drops in response
and local failures must be dealt with
Design resilient architectures
Design resilient algorithms
Improve reliability of each component
Distribute the responsibility
For failure detection
For recovery action

Invest heavily in error detection and recovery
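
A minimal sketch of client-side failure detection and recovery, assuming that each replica of a service is just a callable and that failure shows up as a ServiceUnavailable exception (both are assumptions made for illustration). The caller retries with exponential backoff, then moves to the next replica, so one local failure does not become a total outage.

```python
import time


class ServiceUnavailable(Exception):
    """Raised by a (hypothetical) replica that cannot serve the request."""


def call_with_recovery(replicas, request, attempts_per_replica=2,
                       base_delay=0.0, sleep=time.sleep):
    """Try each replica in turn, retrying with exponential backoff."""
    failures = 0
    for replica in replicas:
        for attempt in range(attempts_per_replica):
            try:
                return replica(request)
            except ServiceUnavailable:
                failures += 1                       # failure detected locally
                sleep(base_delay * (2 ** attempt))  # back off before retrying
    raise ServiceUnavailable(f"all replicas failed after {failures} attempts")
```

The sleep function is a parameter so that tests can run without waiting; in real use base_delay would be set above zero.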
63
Service Oriented Architectures
64
Three Components
Registries
Register an available service: send name and
description
Service Consumers
Services
65
Three Components
Registries
Request a service: send a description
Service Consumers
Services
66
Three Components
Registries
Set (possibly empty) of matching services
Service Consumers
Services
67
Three Components
Registries
Service Consumers
Request service operation
Services
68
Three Components
Registries
Service Consumers
Services
Return result or Error
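
The five interactions above (register, request, matching set, invoke, result or error) can be sketched in-process. Everything here is hypothetical: descriptions are modelled as sets of keywords and services as plain functions, which is far simpler than any real registry standard.

```python
class Registry:
    """Toy registry: services register a name and a keyword description."""

    def __init__(self):
        self._entries = []  # (name, description, service)

    def register(self, name, description, service):
        self._entries.append((name, frozenset(description), service))

    def find(self, wanted):
        """Return the (possibly empty) list of services matching a description."""
        wanted = frozenset(wanted)
        return [(name, service)
                for name, description, service in self._entries
                if wanted <= description]


def squaring_service(request):
    """A service returns either a result or an error, never nothing."""
    try:
        return {"result": request["x"] ** 2}
    except (KeyError, TypeError) as exc:
        return {"error": f"bad request: {exc!r}"}


registry = Registry()
registry.register("square", {"math", "square"}, squaring_service)  # service -> registry
matches = registry.find({"square"})                                # consumer -> registry
name, service = matches[0]                                         # registry -> consumer
reply = service({"x": 7})                                          # consumer -> service
```

The consumer never hard-codes which service it talks to; it only knows the description it asked for, which is what makes later substitution and composition possible.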
69
Composed behaviour

Services are themselves consumers
They may compose and wrap other services
The registry is itself a consumer
A federation of registries may deal with registry
services reliability performance
Observer services may report on quality of
services and help with diagnostics
Agreements between services may be set up
Service-Level Agreements
Permitting sustained interaction

70
Composed behaviour

Requires Organising as an Architecture
71
Scenarios
72
Why Scenarios

Abstraction of what people want to do
Catches the essence of their requirement
Framework for
Discussion
Comparison
Elaboration
Check how technologies cover scenarios
Scenarios should not be about implementation
Scenario can be decomposed into steps
Possibly in many ways
These are less abstract requirements

73
Job submission scenario
1 Create or revise a job description
Q: In what language?
Q: What must it / can it say?
2 Submit the job description
Q: How?
Q: With what extra parameters?
3 Ask about progress
Q: How?
Q: What can they learn and when?
Q: Is the reply in user or system terms?
4 Retrieve results
Q: How?
Q: Where can they be found?
Q: Are there helpful diagnostics?
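
The four steps can be sketched against a toy, in-memory job system. JobSystem and its method names are hypothetical and answer the slide's questions in the simplest possible way: the job description is a Python dict, progress is reported as a state string, and diagnostics are the text of any exception.

```python
import itertools


class JobSystem:
    """Toy job service: submit, ask about progress, retrieve results."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}

    def submit(self, description):
        """Step 2: accept a job description and return a job identifier."""
        job_id = next(self._ids)
        self._jobs[job_id] = {"description": description, "state": "QUEUED",
                              "result": None, "diagnostics": ""}
        return job_id

    def run_pending(self):
        """Stand-in for the scheduler: run every queued job once."""
        for job in self._jobs.values():
            if job["state"] != "QUEUED":
                continue
            description = job["description"]
            try:
                job["result"] = description["task"](*description.get("args", ()))
                job["state"] = "DONE"
            except Exception as exc:
                job["state"] = "FAILED"
                job["diagnostics"] = repr(exc)

    def status(self, job_id):
        """Step 3: report progress in system terms."""
        return self._jobs[job_id]["state"]

    def retrieve(self, job_id):
        """Step 4: return (result, diagnostics)."""
        job = self._jobs[job_id]
        return job["result"], job["diagnostics"]
```

Usage follows the scenario: build a description such as {"task": sum, "args": ([1, 2, 3],)}, submit it, poll status, then retrieve the result and any diagnostics.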
77
Job submission scenario
Q: Who provides and runs this system?
Q: How does it get paid for?
Q: What are its policies for allocating resources to JD submissions?
Q: How reliable and efficient is it? User's view? Manager's view?
78
Job submission scenario
Q: How much effort does it take to submit the same job to another system?
Q: How does the code for the application get to be executed?
Q: How are data read or created during the computation handled?
Q: How will this system evolve? Will users need to learn new tricks?
79
Ensemble run scenario
Components: computing resources (any type, anywhere), a coordination system and a results store
1 Create plan for the ensemble run, e.g. parameter space to sweep and sampling method
2 Initiate the production and submission of jobs
3 Result accumulation
4 Researcher monitors and steers progress
5 Researcher recovers and analyses results; computes derivatives
6 Researcher completes analyses; discards or archives results
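
A small sketch of steps 1 to 3 and 5, using Python's standard concurrent.futures thread pool as a stand-in for "computing resources" and a dict as the results store. The toy model function and its parameter names (forcing, sensitivity) are invented for illustration, and a full grid sweep is assumed as the sampling method.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def model(forcing, sensitivity):
    """Toy stand-in for one ensemble member's simulation."""
    return forcing * sensitivity


def plan(forcings, sensitivities):
    """Step 1: the parameter space to sweep (here, a full grid)."""
    return list(product(forcings, sensitivities))


def run_ensemble(members, workers=4):
    """Steps 2 and 3: submit one job per member and accumulate results."""
    results_store = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(model, *member): member for member in members}
        for future, member in futures.items():
            results_store[member] = future.result()
    return results_store


members = plan([1, 2], [10, 20])
store = run_ensemble(members)
largest = max(store, key=store.get)  # step 5: a trivial analysis of the results
```

Keying the results store by the parameter tuple keeps each output traceable to the member that produced it, which matters once an ensemble has thousands of members.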
87
Ensemble run scenario with context
Computing resources: any type, anywhere
Everything as before, plus interleaved requests
for context data from each job as it runs
Runs draw data from context stores: boundary
conditions, pre-computed data, observations
88
Ensemble run scenario with metadata
Computing resources: any type, anywhere
Everything as before, plus use and generate
metadata as each job runs
Runs organised using metadata, and jobs generate
metadata: helps manage 1000s of files
89
Repetition of Scenario

Normally, users repeatedly perform the same
scenario
Analysis of the next sample
Re-analysis by other researchers and designers
Calibration and normalisation of the latest
observational run
Re-verification against the latest data
Evaluation of the risk of the next share purchase
(Revising the) design of an(other similar) engine
component
Often with parametric variations
Often with progressive refinements
A better pattern recogniser
A refinement in calibration
Code fixes, updates to reference data,
How well do the solutions on offer support
repetition?

90
Data integration scenario
Researcher wants to obtain specified data from
multiple distributed data sources and to supply
the result to a process and then view its output.
1 Researcher formulates query
2 Researcher submits query
3 Query system transforms and distributes query
4 Data services send back local results
5 Query system combines these to form requested
data
6 Query system sends data to process
7 Process system sends derived data to researcher
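
The seven steps above can be sketched with in-memory data services. The row format, the predicate-style query and the averaging process are all hypothetical simplifications; a real query system would also rewrite the query for each source's schema.

```python
def data_service(rows):
    """A data source that answers a query with its local matches (step 4)."""
    def answer(predicate):
        return [row for row in rows if predicate(row)]
    return answer


def integrate(services, predicate):
    """Steps 3 to 5: distribute the query, then combine the local results."""
    combined = []
    for service in services:
        combined.extend(service(predicate))
    return combined


def process(rows):
    """Steps 6 and 7: derive something from the combined data (here, a mean)."""
    return sum(row["value"] for row in rows) / len(rows)


site_a = data_service([{"site": "A", "year": 2007, "value": 10},
                       {"site": "A", "year": 2008, "value": 20}])
site_b = data_service([{"site": "B", "year": 2008, "value": 40}])

query = lambda row: row["year"] == 2008       # steps 1 and 2: formulate and submit
requested = integrate([site_a, site_b], query)
derived = process(requested)
```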
91
Summary and Conclusions
92
Grids

Many reasons motivating investment in grids
Collaboration for Global Science and Business
Resource integration and sharing
New approach to large-scale distributed systems
Large coordinated effort necessary
Industry and Academia
Economic and Creative niches
Can they be assembled to provide all that is
needed?
Many technical and socio-economic challenges
Work for you all
Many new opportunities
Work for you all

93
Summary: Take-home message

e-Infrastructure is arriving
Built on Grids and Web Services
Data and Information grow in importance
Must include user support
Must be based on good socio-economic
understanding
There is a dramatic rate of change
An opportunity for everyone

Can you ride the wave?
94
?
Picture composition by Luke Humphry based on
prior art by Frans Hals
www.omii.ac.uk
