1
What is e-Science? What is the Grid?
W T Hewitt
UCISA Meeting, Edinburgh
Monday, November 23, 2009
2
Agenda
  • What is e-Science? What is the Grid?
  • The Global Programme
  • The UK eScience Programme
  • Impacts

3
  • What is e-Science? What is the Grid?

4
Why Grids?
  • Large-scale science and engineering are done
    through the interaction of people, heterogeneous
    computing resources, information systems, and
    instruments, all of which are geographically and
    organizationally dispersed.
  • The overall motivation for Grids is to
    facilitate the routine interactions of these
    resources in order to support large-scale science
    and engineering.

From Bill Johnston 27 July 01
5
The Grid
  • "is the web on steroids."
  • "is Napster for Scientists" of data grids
  • "is the solution to all your problems."
  • "is evil." a system manager, of Globus
  • "is distributed computing re-badged."
  • "is distributed computing across multiple
    administrative domains"
  • Dave Snelling, senior architect of UNICORE

6
  • provides "Flexible, secure, coordinated
    resource sharing among dynamic collections of
    individuals, institutions, and resource"
  • From The Anatomy of the Grid Enabling Scalable
    Virtual Organizations
  • "enables communities (virtual organizations)
    to share geographically distributed resources as
    they pursue common goals -- assuming the absence
    of central location, central control,
    omniscience, existing trust relationships."

7
CERN Large Hadron Collider (LHC)
  • Raw data: 1 Petabyte/sec
  • Filtered: 100 Mbyte/sec, 1 Petabyte/year
    (roughly 1 million CD-ROMs)
  • CMS Detector
8
Why Grids?
  • A biochemist exploits 10,000 computers to screen
    100,000 compounds in an hour
  • A biologist combines a range of diverse and
    distributed resources (databases, tools,
    instruments) to answer complex questions
  • 1,000 physicists worldwide pool resources for
    petaop analyses of petabytes of data
  • Civil engineers collaborate to design, execute,
    analyze shake table experiments

From Steve Tuecke 12 Oct. 01
9
Why Grids? (contd.)
  • Climate scientists visualize, annotate, analyze
    terabyte simulation datasets
  • An emergency response team couples real time
    data, weather model, population data
  • A multidisciplinary analysis in aerospace couples
    code and data in four companies
  • A home user invokes architectural design
    functions at an application service provider

From Steve Tuecke 12 Oct. 01
10
Broader Context
  • Grid Computing has much in common with major
    industrial thrusts:
    • business-to-business, peer-to-peer, application
      service providers, storage service providers,
      distributed computing, Internet computing
  • Sharing issues not adequately addressed by
    existing technologies
  • Complicated requirements: run program X at site
    Y subject to community policy P, providing access
    to data at Z according to policy Q (see the
    policy sketch after this list)
  • High performance: unique demands of advanced
    high-performance systems
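
To make the "complicated requirements" bullet concrete, here is a toy Python sketch (all names are hypothetical, not any real Grid API) that treats each policy as a predicate over a request and grants the request only if every applicable policy admits it:

```python
# Toy policy check for "run program X at site Y under community policy P,
# with access to data at Z under policy Q". Names are illustrative only.
from dataclasses import dataclass

@dataclass(frozen=True)
class Request:
    user: str
    program: str       # "X"
    compute_site: str  # "Y"
    data_site: str     # "Z"

# Each policy is a simple predicate; a real Grid would instead evaluate
# signed credentials against per-site authorization services.
def community_policy(r: Request) -> bool:        # "P"
    return r.program in {"simulate", "analyse"}

def data_policy(r: Request) -> bool:             # "Q"
    return (r.user, r.data_site) in {("alice", "manchester")}

def authorise(request: Request, policies) -> bool:
    """Grant the request only if every policy admits it."""
    return all(policy(request) for policy in policies)

req = Request("alice", "analyse", "edinburgh", "manchester")
print(authorise(req, [community_policy, data_policy]))  # True
```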

11
What is the Grid?
  • "Grid computing is distinguished from
    conventional distributed computing by its focus
    on large-scale resource sharing, innovative
    applications, and, in some cases,
    high-performance orientation... we review the
    'Grid problem', which we define as flexible,
    secure, coordinated resource sharing among
    dynamic collections of individuals, institutions,
    and resources - what we refer to as virtual
    organizations."
  • From "The Anatomy of the Grid: Enabling Scalable
    Virtual Organizations" by Foster, Kesselman and
    Tuecke

12
New Book
13
What is the Grid?
  • Resource sharing and coordinated problem solving
    in dynamic, multi-institutional virtual
    organizations
  • On-demand, ubiquitous access to computing, data,
    and all kinds of services
  • New capabilities constructed dynamically and
    transparently from distributed services
  • No central location, No central control, No
    existing trust relationships, Little
    predetermination
  • Uniformity
  • Pooling Resources

14
e-Science and the Grid
  • "e-Science is about global collaboration in key
    areas of science, and the next generation of
    infrastructure that will enable it."
  • "e-Science will change the dynamic of the way
    science is undertaken."
  • John Taylor, Director General of Research
    Councils, Office of Science and Technology

15
Why GRID?
  • VERY VERY IMPORTANT
  • The GRID is one way to realise the e-Science
    vision.
  • WE ARE TRYING TO DO E-SCIENCE!

16
  • Grid Middleware

(Diagram: layered middleware stack, with diverse
global services built on Grid services, which in
turn sit on the local OS)
17
Common principles
  • Single sign-on
    • often implying Public Key Infrastructure (PKI);
      see the certificate sketch after this list
  • Standard protocols and services
  • Respect for autonomy of resource owner
  • Layered architectures
    • higher-level infrastructures hiding
      heterogeneity of lower levels
  • Interoperability is paramount
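
As a minimal illustration of the PKI that single sign-on usually implies, the sketch below loads an X.509 certificate and prints its identity fields; Grid single sign-on schemes such as the Globus Grid Security Infrastructure build on certificates like this, plus short-lived proxy credentials. The file path is hypothetical and the third-party `cryptography` package is assumed:

```python
# Inspect a PEM-encoded X.509 certificate, the basic credential behind
# PKI-based single sign-on. "usercert.pem" is a hypothetical path.
import datetime
from cryptography import x509

with open("usercert.pem", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())

print("Subject:", cert.subject.rfc4514_string())  # who the holder is
print("Issuer: ", cert.issuer.rfc4514_string())   # which CA vouches for them
# A relying party would also verify the CA chain and the validity window:
print("Expired:", cert.not_valid_after < datetime.datetime.utcnow())
```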

18
Grid Middleware
  • Middleware
    • Globus
    • UNICORE
    • Legion and Avaki
  • Scheduling
    • Sun Grid Engine
    • Load Sharing Facility (LSF), from Platform
      Computing
    • OpenPBS and PBS(Pro), from Veridian
    • Maui scheduler
    • Condor (could also go under middleware)
  • Data
    • Storage Resource Broker (SRB)
    • Replica Management
    • OGSA-DAI
  • Web services (WSDL, SOAP, UDDI)
    • IBM WebSphere
    • Microsoft .NET
    • Sun Open Net Environment (Sun ONE)
  • PC Grids
    • Peer-to-Peer computing

19
  • Data-oriented Grids

20
Data-oriented middleware
  • Wide-area distributed file systems (e.g. AFS)
  • Storage Resource Broker (SRB)
    • UCSD and SDSC
    • Provides transparent access to data storage
    • Centralised architecture
    • Motivated by experiences of HPC users, not
      database users
    • Little enthusiasm from UK e-Science programme
  • OGSA-DAI
    • Database Access and Integration
    • Strategic contribution of UK e-Science programme
    • Universities of Edinburgh, Manchester,
      Newcastle; IBM; Oracle
    • Alpha release January 2003
  • Globus Replica Management software
    • Next up!

21
Data Grids for High Energy Physics
22
Data Intensive Issues Include
  • Harness potentially large numbers of data,
    storage, network resources located in distinct
    administrative domains
  • Respect local and global policies governing what
    can be used for what
  • Schedule resources efficiently, again subject to
    local and global constraints
  • Achieve high performance, with respect to both
    speed and reliability
  • Catalog software and virtual data

23
Desired Data Grid Functionality
  • High-speed, reliable access to remote data
  • Automated discovery of best copy of data (a toy
    replica-selection sketch follows this list)
  • Manage replication to improve performance
  • Co-schedule compute, storage, network
  • Transparency with respect to delivered performance
  • Enforce access control on data
  • Allow representation of global resource
    allocation policies
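
The "best copy" item can be illustrated with a deliberately simple heuristic: probe every replica and fetch from the one with the lowest connect latency. This is a toy sketch, not the behaviour of any particular replica manager, and the hostnames are made up:

```python
# Pick the "best" replica of a dataset by measuring TCP connect latency.
import socket
import time

REPLICAS = ["data1.example.org", "data2.example.org", "data3.example.org"]

def latency(host: str, port: int = 80, timeout: float = 1.0) -> float:
    """Rough reachability/latency probe: time a TCP connect."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return float("inf")  # unreachable replicas sort last

def best_replica(hosts):
    return min(hosts, key=latency)

print("Fetching from:", best_replica(REPLICAS))
```

A production data grid would fold in storage load, bandwidth forecasts and policy constraints rather than raw latency alone.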

24
Grid Standards
  • Grid Standards Bodies
    • IETF: home of the network infrastructure
      standards
    • W3C: home of the Web
    • GGF: home of the Grid
  • GGF defines the Open Grid Services Architecture
    (OGSA)
    • OGSI is the infrastructure part of OGSA
    • OGSI public comment draft submitted 14 February
      2003
  • Key OGSA Areas of Standards Development
    • Job management interfaces
    • Resource discovery
    • Security
    • Grid economy and brokering

25
What is OGSA?
Web Services with Attitude!
Also known as "Open Grid Services Architecture"
26
Aside: What are Web Services?
  • Loosely coupled distributed computing
    • think Java RMI or C remote procedure calls
  • Text-based serialization
    • XML: human-readable serialization of objects
  • IBM and Microsoft lead
    • Web Services Description Language (WSDL)
  • W3C standardization
  • Three parts (see the client sketch below):
    • Messages (SOAP)
    • Definition (WSDL)
    • Discovery (UDDI)
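
A minimal sketch of those three parts from a client's point of view, using the third-party Python `zeep` SOAP library; the WSDL URL and the `GetTemperature` operation are hypothetical:

```python
# WSDL (definition) tells the client what operations exist; SOAP
# (messages) carries the actual call. Discovery (UDDI) would have
# supplied the WSDL URL in the first place.
from zeep import Client

client = Client("http://example.org/weather?wsdl")  # hypothetical WSDL
result = client.service.GetTemperature(city="Manchester")
print(result)
```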

27
Web Services in Action
28
Enter Grid Services
  • Experiences of Grid computing (and business
    process integration) suggest similar extensions
    to Web Services:
  • State
    • Service Data Model
  • Persistence and Naming
    • Two-level naming (GSH, GSR) allows dynamic
      migration and QoS adaptation (see the sketch
      after this list)
  • Lifetime Management
    • Self-healing and soft garbage collection
  • Standard PortTypes
    • Guarantee of a minimal level of service
    • Beyond P2P is federation through mediation
  • Explicit Semantics
    • Grid Services specify semantics on top of Web
      Service syntax
  • PortType Inheritance
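
The two-level naming scheme can be modelled in a few lines: clients hold a stable Grid Service Handle (GSH) and resolve it, at call time, to the current Grid Service Reference (GSR), so a service can migrate without clients changing the name they hold. This is an illustrative Python model, not the OGSI API, and all identifiers are made up:

```python
# GSH (stable name) -> GSR (current binding). Migration rewrites only
# the registry entry; every client keeps using the same handle.
registry = {
    "gsh://vo.example.org/analysis": "http://node7.example.org:8080/soap",
}

def resolve(gsh: str) -> str:
    """Return the current GSR for a handle; call again after migration."""
    return registry[gsh]

print(resolve("gsh://vo.example.org/analysis"))

# The service migrates to another node: only the binding changes.
registry["gsh://vo.example.org/analysis"] = "http://node9.example.org:8080/soap"
print(resolve("gsh://vo.example.org/analysis"))
```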

29
  • If one GRID is good, then many GRIDs must be
    better

30
US Grid Projects
  • NASA Information Power Grid
  • DOE Science Grid
  • NSF National Virtual Observatory
  • NSF GriPhyN
  • DOE Particle Physics Data Grid
  • NSF DTF TeraGrid
  • DOE ASCI DISCOM Grid
  • DOE Earth Systems Grid
  • DOE FusionGrid
  • NEESGrid
  • NIH BIRN
  • NSF iVDGL

31
National Grid Projects
  • Japan: Grid Data Farm, ITBL
  • Netherlands: VLAM, DutchGrid
  • Germany: UNICORE, Grid proposal
  • France: Grid funding approved
  • Italy: INFN Grid
  • Eire: Grid-Ireland
  • Poland: PIONIER Grid
  • Switzerland: Grid proposal
  • Hungary: DemoGrid, Grid proposal
  • ApGrid: Asia-Pacific Grid proposal

32
EU Grid Projects
  • DataGrid (CERN, ..)
  • EuroGrid (Unicore)
  • DataTag (TTT)
  • Astrophysical Virtual Observatory
  • GRIP (Globus/Unicore)
  • GRIA (Industrial applications)
  • GridLab (Cactus Toolkit)
  • CrossGrid (Infrastructure Components)
  • EGSO (Solar Physics)
  • COG (Semantic Grid)

33
  • UK e-Science Programme

34
UK e-Science Programme
(Organisation chart: DG Research Councils; Grid TAG;
e-Science Steering Committee; Director, with
management and awareness/co-ordination roles)
  • Generic Challenges: EPSRC (£15M), DTI (£15M)
  • Academic Application Support Programme: Research
    Councils (£74M), DTI (£5M)
    • PPARC (£26M), BBSRC (£8M), MRC (£8M), NERC
      (£7M), ESRC (£3M), EPSRC (£17M), CLRC (£5M)
  • £80M collaborative projects
  • Industrial Collaboration (£40M)
From Tony Hey 27 July 01
35
Key Elements
  • Development of Generic Grid Middleware
  • Network of Grid Core Programme e-Science Centres
    • National Centre: http://www.nesc.ac.uk
    • Regional Centres, e.g. http://www.esnw.ac.uk/
  • Grid IRC Grand Challenge Project
  • Support for e-Science Pilots
  • Short-term funding for e-Science demonstrators
  • Grid Network Team
  • Grid Engineering Team
  • Grid Support Centre
  • Task Forces
    • Database, led by Norman Paton
    • Architecture, led by Malcolm Atkinson
  • International Involvement

Adapted from Tony Hey 27 July 01
36
National and Regional Centres
  • Centres donate equipment to make a Grid
(Map of centres: Edinburgh, Glasgow, Newcastle,
Belfast, Manchester, DL, Cambridge, Oxford, Hinxton,
RAL, Cardiff, London, Southampton)
37
e-Science Demonstrators
  • Dynamic Brain Atlas
  • Biodiversity
  • Chemical Structures
  • Mouse Genes
  • Robotic Astronomy
  • Collaborative Visualisation
  • Climateprediction.com
  • Medical Imaging/VR

38
Grid Middleware R&D
  • £16M funding available for industrial
    collaborative projects
  • £11M allocated to Centres projects plus £5M for
    Open Call projects
  • Set up Task Forces
  • Database Task Force
  • Architecture Task Force
  • Security Task Force

39
Grid Network Team
  • Expert group to identify end-to-end network
    bottlenecks and other network issues
  • e.g. problems with multicast for Access Grid
  • Identify e-Science project requirements
  • Funding £0.5M traffic engineering/QoS project
    with PPARC, UKERNA and Cisco
  • investigating MPLS using SuperJANET network
  • Funding DataGrid extension project investigating
    bandwidth scheduling with PPARC
  • Proposal for UKLight lambda connection to
    Chicago and Amsterdam

40
UK e-Science Pilot Projects
  • GRIDPP (PPARC)
  • ASTROGRID (PPARC)
  • Comb-e-Chem (EPSRC)
  • DAME (EPSRC)
  • DiscoveryNet (EPSRC)
  • GEODISE (EPSRC)
  • myGrid (EPSRC)
  • RealityGrid (EPSRC)
  • Climateprediction.com (NERC)
  • Oceanographic Grid (NERC)
  • Molecular Environmental Grid (NERC)
  • NERC DataGrid (OST-CP)
  • Biomolecular Grid (BBSRC)
  • Proteome Annotation Pipeline (BBSRC)
  • High-Throughput Structural Biology (BBSRC)
  • Global Biodiversity (BBSRC)

41
e-Science Centres of Excellence
  • Birmingham/Warwick: Modelling
  • Bristol: Media
  • UCL: Networking
  • White Rose Grid (Leeds, York, Sheffield)
  • Lancaster: Social Science
  • Leicester: Astronomy
  • Reading: Environment

42
UK e-Science Grid
(Map of the Grid: Edinburgh, Glasgow, Newcastle,
Belfast, Manchester, DL, Cambridge, Oxford, RL,
Hinxton, Cardiff, London, Soton)
43
UK e-Science Funding
  • First Phase: 2001-2004
    • Application Projects: £74M, all areas of
      science and engineering
    • Core Programme: £15M plus £20M (DTI),
      collaborative industrial projects
  • Second Phase: 2003-2006
    • Application Projects: £96M, all areas of
      science and engineering
    • Core Programme: £16M, core Grid middleware
    • DTI follow-on?

44
  • EPSRC Computer Science for e-Science
    • £9M, 18 projects so far
  • ESRC National e-Social Science Centre: 3 hubs
    • £6M
  • PPARC
  • MRC
  • BBSRC

45
Core Programme Phase 2
  • UK e-Science Grid/Centres and e-Science Institute
  • Grid Operation Centre and Network Monitoring
  • Core Middleware engineering
  • National Data Curation Centre
  • e-Science Exemplars/New Opportunities
  • Outreach and International involvement

46
Other Activities
  • Security Task Force
    • Jointly fund key security projects with EPSRC
      JCSR, coordinated with the NSF NMI and
      Internet2 projects
    • JCSR £2M call in preparation
  • UK Digital Curation Centre
    • £3M, Core e-Science plus JCSR
  • JCSR
    • £3M per annum

47
SR2004 e-Science Infrastructure
  • Persistent UK e-Science Research Grid
  • Grid Operations Centre
  • UK Open Middleware Infrastructure Institute
  • National e-Science Institute
  • UK Digital Curation Centre
  • AccessGrid Support Service
  • e-Science/Grid collaboratories Legal Service
  • International Standards Activity

48
  • Conclusions

49
Today's Grid
  • A Single System Image
    • Transparent wide-area access to large data banks
    • Transparent wide-area access to applications on
      heterogeneous platforms
    • Transparent wide-area access to processing
      resources
  • Security, certification, single sign-on
    authentication, AAA
    • Grid Security Infrastructure
  • Data access, transfer and replication
    • GridFTP, Giggle
  • Computational resource discovery, allocation and
    process creation
    • GRAM, Unicore, Condor-G

50
Reality Checks!!
  • "The technology is ready"
    • Not true: it is emerging
    • Building middleware, advancing standards,
      developing dependability, building
      demonstrators
  • The computational grid is in advance of the
    data-intensive middleware
    • Integration and curation are probably the
      obstacles
  • But!! It doesn't have to be all there to be
    useful
    • We know how we will use grid services
    • No disruptive technology
    • Lower the barriers to entry

51
Grid Evolution
  • 1st Generation Grid
    • Computationally intensive, file access/transfer
    • Bag of various heterogeneous protocols and
      toolkits
    • Recognises the Internet, ignores the Web
    • Academic teams
  • 2nd Generation Grid
    • Data intensive -> knowledge intensive
    • Services-based architecture
    • Recognises Web and Web services
    • Global Grid Forum
    • Industry participation

We are here!
52
Impacts
  • It's all about interoperability, really.
  • Web and Grid Services are creating a new
    marketplace for components
  • If you're concerned with systems integration or
    internet delivery of services, embrace Web
    Services technologies now. You'll be ready for
    Grid Services when they're ready for you.
  • If you're a developer, get Web Services on your
    CV
  • If you're an IT manager, collect Web Service
    expertise through hiring or training
  • Software license models must adapt

53
I don't want to share! Do I need a grid?
54
In conclusion
  • The GRID is not, and will not be, free
    • users must pay for resources
  • What have we to show for £250M?

55
Acknowledgements
  • Carole Goble
  • Stephen Pickles
  • Paul Jeffreys
  • University of Manchester
  • Academic collaborators
  • Industrial collaborators
  • Funding Agencies DTI, EPSRC, NERC, ESRC, PPARC

56
SVE @ Manchester Computing
World Leading Supercomputing Service, Support and
Research. Bringing Science and Supercomputers
Together. www.man.ac.uk/sve  sve@man.ac.uk