What are grid computing and e-Science? - PowerPoint PPT Presentation

Slides: 40
Provided by: MikeMi5

Title: What are grid computing and e-Science?


1
What are grid computing and e-Science?
  • Mike Mineter
  • mjm@nesc.ac.uk

2
Policy for re-use
  • This presentation can be re-used for academic
    purposes.
  • However, if you do so then please let us know. We
    need to gather statistics of re-use: number of
    events, number of people trained. Thank you!!
  • If you use a significant part of the course,
    send mail to training-support@nesc.ac.uk with
    the subject NGSI_course
  • If just this module, please mail
    training-support@nesc.ac.uk with the subject
    NGSI_intro
  • THANK YOU!!!!

3
Acknowledgements
  • This talk was prepared by Mike Mineter of NeSC
    and includes slides from previous tutorials and
    talks delivered by:
  • Dave Berry, Richard Hopkins (National e-Science
    Centre)
  • the EDG training team
  • Ian Foster, Argonne National Laboratory
  • Jeffrey Grethe, SDSC
  • EGEE colleagues

4
Goals of this module
  • To introduce the concepts of e-Science and Grid
    computing, assuming no previous knowledge

5
Contents
  • The Grid vision
  • What is a grid?
  • Drivers of grid computing
  • Some examples
  • Current status of grids
  • Are grids for you?!

6
The Grid Metaphor
7
The grid vision
  • The grid vision is of "virtual computing" (plus
    information services to locate computation and
    storage resources)
  • Compare the web: "virtual documents" (plus a
    search engine to locate them)
  • MOTIVATION: collaboration through sharing
    resources (and expertise) to expand horizons of:
  • Research
  • Commerce: engineering, the knowledge economy
  • Public service: health, environment, ...

8
Before Grids
9
The Grid Vision
Slide derived from EDG / LCG tutorials
10
Grid projects
  • Many Grid development efforts all over the
    world
  • UK: OGSA-DAI, RealityGrid, GeoDise,
    Comb-e-Chem, DiscoveryNet, DAME, AstroGrid,
    GridPP, MyGrid, GOLD, eDiamond, Integrative
    Biology, ...
  • Netherlands: VLAM, PolderGrid
  • Germany: UNICORE, Grid proposal
  • France: Grid funding approved
  • Italy: INFN Grid
  • Eire: Grid proposals
  • Switzerland: Network/Grid proposal
  • Hungary: DemoGrid, Grid proposal
  • Norway, Sweden: NorduGrid
  • NASA Information Power Grid
  • DOE Science Grid
  • NSF National Virtual Observatory
  • NSF GriPhyN
  • DOE Particle Physics Data Grid
  • NSF TeraGrid
  • DOE ASCI Grid
  • DOE Earth Systems Grid
  • DARPA CoABS Grid
  • NEESGrid
  • DOH BIRN
  • NSF iVDGL
  • DataGrid (CERN, ...)
  • EuroGrid (Unicore)
  • DataTag (CERN, ...)
  • Astrophysical Virtual Observatory
  • GRIP (Globus/Unicore)
  • GRIA (Industrial applications)
  • GridLab (Cactus Toolkit)
  • CrossGrid (Infrastructure Components)
  • EGSO (Solar Physics)

12
A grid
  • The initial vision: "The Grid"
  • The present reality: many "grids"
  • Each grid is an infrastructure enabling one or
    more "virtual organisations" (VOs) to share
    computing resources
  • What's a VO?
  • People in different organisations seeking to
    cooperate and share resources across their
    organisational boundaries
  • Why establish a Grid?
  • Share data
  • Pool computers
  • Collaborate

[Diagram labels: VO, Institute, Desktop]
13
A computer
  • The Operating System enables easy use of
  • Input devices
  • Processor
  • Disks
  • Display

14
An institute's resources on a LAN
  • Middleware runs on each computer
  • To allow sharing of disks and printers (using,
    e.g. Samba)
  • To share processors for computation (e.g. Condor)
  • User just perceives shared resources, with no
    regard to location in the building
  • Authenticated by username / password
  • Authorised to use own files,

15
Typical current grid
  • Grid middleware runs on each shared resource
  • Data storage
  • (Usually) batch jobs on pools of processors
  • Users join VOs
  • Virtual organisation negotiates with sites to
    agree access to resources
  • Distributed services (both people and middleware)
    enable the grid, allow single sign-on

16
What is a Grid?
  • "An infrastructure that enables flexible, secure,
    coordinated resource sharing among dynamic
    collections of individuals, institutions and
    resources" (Ian Foster and Carl Kesselman)

17
Key concepts
  • Virtual organisation: people and resources
    collaborating across administrative and
    organisational boundaries
  • Single sign-on
  • I connect to one machine; some sort of digital
    credential is passed on to any other resource I
    use
  • Authentication: how do I identify myself to a
    resource without a username/password for each
    resource I use?
  • Authorisation: what can I do? Determined by
  • My role in a VO (role-based in the near future)
  • VO negotiations with resource providers
  • Grid middleware runs on each resource
  • User just perceives shared resources, with no
    awareness of location or owning organisation
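
The authorisation idea on this slide (what I can do is determined by my role in a VO and by what the VO has negotiated with each resource provider) can be sketched roughly as below. This is an illustrative toy, not the API of any grid middleware: the `User` class, the `SITE_POLICY` table, the VO name and the action names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    identity: str  # hypothetical: e.g. the subject name in a digital credential
    vo: str        # the virtual organisation the user has joined
    role: str      # the user's role within that VO


# Hypothetical result of one VO's negotiations with one resource provider:
# (VO, role) -> actions that the site agrees to permit.
SITE_POLICY = {
    ("astro-vo", "member"): {"submit_job", "read_data"},
    ("astro-vo", "admin"): {"submit_job", "read_data", "write_data"},
}


def is_authorised(user, action, policy=SITE_POLICY):
    """Return True if the user's VO and role permit the action at this site."""
    return action in policy.get((user.vo, user.role), set())


alice = User("alice", "astro-vo", "member")
print(is_authorised(alice, "submit_job"))  # permitted for members
print(is_authorised(alice, "write_data"))  # not permitted for members
```

Note that the site never needs a per-user entry here: it only trusts the VO's statement of membership and role, which is what lets users from other organisations be admitted without a local username/password.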

18
What is Grid computing not?
  • "Grid computing" is a trendy phrase!
  • It's therefore also a misused term!
  • Sometimes in industry, "grids" = clusters
  • Motivations: better use of resources; scope for
    commercial services
  • Also used to refer to the harvesting of unused
    compute cycles, e.g.
  • SETI@home, Climateprediction.net

20
The first driver: e-Science
  • What is e-Science? Collaborative science that is
    made possible by the sharing across the Internet
    of resources (data, instruments, computation,
    people's expertise...)
  • Often very compute intensive
  • Often very data intensive (both creating new data
    and accessing very large data collections): data
    deluges from new technologies
  • Crosses organisational boundaries

21
Other major drivers for grids
  • e-Research, not just e-Science
  • Also curation and digital data libraries
    (DILIGENT, DELOS, GRACE)
  • Commerce, not just academia!!
  • Politics: the knowledge economy
  • e-Infrastructure: a shared resource
  • That enables science, research, engineering,
    medicine, industry, ...
  • It will improve UK / European / ... productivity

[Diagram labels: Collaboration; Grid; Operations, support and training; Network infrastructure linking resource centres]
23
Astronomy
  • Number and sizes of data sets as of mid-2002,
    grouped by wavelength
  • 12 waveband coverage of large areas of the
    sky
  • Total about 200 TB of data
  • Doubling every 12 months
  • Largest catalogues near 1 billion objects

Data and images courtesy of Alex Szalay, Johns
Hopkins University
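
To get a feel for what "about 200 TB, doubling every 12 months" implies, the short sketch below extrapolates the total under the assumption that the doubling rate simply continues. That assumption is an extrapolation made here for illustration, not a claim from the slide, and the function names are made up for this example.

```python
import math


def projected_size_tb(years, start_tb=200.0, doubling_months=12.0):
    """Data volume after `years`, assuming steady exponential doubling."""
    return start_tb * 2 ** (years * 12.0 / doubling_months)


def years_to_reach(target_tb, start_tb=200.0, doubling_months=12.0):
    """Years until the volume reaches `target_tb` under the same assumption."""
    return math.log2(target_tb / start_tb) * doubling_months / 12.0


for y in range(0, 6):
    print(f"mid-{2002 + y}: about {projected_size_tb(y):,.0f} TB")

print(f"1 PB (1000 TB) reached after about {years_to_reach(1000):.1f} years")
```

Under this assumption the archive passes a petabyte a little over two years after mid-2002.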

24
Earth Observation
  • ESA missions
  • 100s of Gbytes of data per day
  • Grid contribution to EO
  • Enhance the ability to access high level products
  • Allow reprocessing of large historical archives
  • Improve Earth science complex applications (data
    fusion, data mining, modelling )

Derived from L. Fusco, June 2001
Federico Carminati, EU review presentation, 1
March 2002
25
DAME: Grid based tools and Infrastructure for
Aero-Engine Diagnosis and Prognosis
  • "A significant factor in the success of the
    Rolls-Royce campaign to power the Boeing 7E7 with
    the Trent 1000 was the emphasis on the new
    aftermarket support service for the engines
    provided via DSS. Boeing personnel were shown
    DAME as an example of the new ways of gathering
    and processing the large amounts of data that
    could be retrieved from an advanced aircraft such
    as the 7E7, and they were very impressed." (DSS,
    2004)

Companies: Rolls-Royce, DSS, Cybula
Universities: York, Leeds, Sheffield, Oxford
[Diagram labels: XTO, Engine Model, Case Based Reasoning, Signal Data Explorer]
26
Large Hadron Collider at CERN
  • Data Challenge
  • 10 Petabytes/year of data !!!
  • 20 million CDs each year!
  • Simulation, reconstruction, analysis
  • LHC data handling requires computing power
    equivalent to 100,000 of today's fastest PC
    processors!
  • Operational challenges
  • Reliable and scalable through project lifetime of
    decades

[Image labels: Mont Blanc (4810 m), downtown Geneva]
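
The two headline figures on this slide can be checked against each other with a little arithmetic. The sketch below assumes decimal units (1 PB = 10^15 bytes) and, for comparison, a nominal 700 MB CD; both are assumptions made here, not figures from the slide.

```python
PETABYTE = 10**15   # assumption: decimal petabyte
MEGABYTE = 10**6

annual_bytes = 10 * PETABYTE   # "10 Petabytes/year"
cds_per_year = 20_000_000      # "20 million CDs each year"

# CD capacity implied by the slide's own two numbers.
implied_cd_mb = annual_bytes / cds_per_year / MEGABYTE

# CD count if a nominal 700 MB disc is assumed instead.
cds_at_700mb = annual_bytes / (700 * MEGABYTE)

# Average sustained data rate over a 365-day year.
seconds_per_year = 365 * 24 * 3600
rate_mb_per_s = annual_bytes / seconds_per_year / MEGABYTE

print(f"Implied CD capacity: {implied_cd_mb:.0f} MB")
print(f"CDs at 700 MB each: {cds_at_700mb / 1e6:.1f} million")
print(f"Average rate: {rate_mb_per_s:.0f} MB/s")
```

So the "20 million CDs" figure corresponds to roughly 500 MB per disc, and 10 PB/year averages out to a little over 300 MB every second, all year round.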
28
If "The Grid" vision leads us here,
then where are we now?
29
Current status
  • Many key concepts identified and known
  • Many grid projects have tested these
  • Major efforts now on establishing:
  • Standards (a slow process) (Global Grid Forum,
    http://www.gridforum.org/)
  • Production grids for multiple VOs
  • "Production" = reliable, sustainable, with
    commitments to quality of service
  • New user communities
  • ... whilst research and development continues
  • In Europe, EGEE
  • In UK, NGS
  • In US, TeraGrid

30
1997 to present: Globus
  • A software toolkit addressing certain technical
    problems in the development of Grid-enabled
    tools, services, and applications
  • Offers a modular "bag of technologies"
  • Made available under liberal open source license
  • Not turnkey solutions, but building blocks and
    tools for application developers and system
    integrators
  • Tools built on the Grid Security Infrastructure
    include:
  • Job submission (GRAM): run a job on a remote
    computer
  • Information services: so I know which computer to
    use
  • File transfer (GridFTP): so large data files can
    be transferred
  • Replica management: so I can have multiple
    versions of a file "close" to the computers where
    I want to run jobs
  • Production grids are (currently) based on release
    2
  • http://www.globus.org/
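
How information services and replica management work together can be pictured with a toy example: given what is known about each site, pick one that has enough free processors and, if possible, already holds a copy of the input file. This sketch does not use or imitate the Globus API; the `Site` class, the `choose_site` function, the site names and the file name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    free_cpus: int        # what an information service might report
    replicas: frozenset   # logical file names with a copy held at this site


def choose_site(sites, needed_file, cpus_required):
    """Pick a site with enough CPUs, preferring one that holds the file."""
    candidates = [s for s in sites if s.free_cpus >= cpus_required]
    if not candidates:
        return None
    # Sort key: holding a replica comes first, then the number of free CPUs.
    return max(candidates,
               key=lambda s: (needed_file in s.replicas, s.free_cpus))


sites = [
    Site("site-a", 50, frozenset()),
    Site("site-b", 10, frozenset({"run42.dat"})),
    Site("site-c", 2, frozenset({"run42.dat"})),
]

print(choose_site(sites, "run42.dat", 8).name)   # site-b: has the file and enough CPUs
print(choose_site(sites, "run42.dat", 20).name)  # site-a: only site with 20 free CPUs
```

In the second call the job would need the file transferred to site-a first, which is the role the slide assigns to file transfer.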

31
Computing Resources, Feb 2005
  • [Map legend: country providing resources;
    country anticipating joining]
  • In LCG-2:
  • 113 sites, 30 countries
  • >10,000 CPUs
  • 5 PB storage
  • Includes non-EGEE sites:
  • 9 countries
  • 18 sites

32
Grid security and trust - 1
  • Providers of resources (computers, databases, ...)
    need risks to be controlled: they are asked to
    trust users they do not know
  • Users need single sign-on: log on to a machine
    that can pass the user's identity to other
    resources
  • Build middleware on a layer providing:
  • Authentication: know who wants to use a resource
  • Authorisation: know what the user is allowed to
    do
  • Security: reduce vulnerability, e.g. from outside
    the firewall
  • Non-repudiation: knowing who did what
  • GSI from the Globus toolkit does this for NGS

33
Grid security and trust - 2
  • Currently achieved by certification:
  • User's identity has to be certified by one of the
    national Certification Authorities (CAs)
  • CAs are mutually recognized: http://www.gridpma.org/
  • In UK, go to
    http://www.grid-support.ac.uk/ca/ralist.htm
  • Resources (node machines) have to be certified
    by CAs
  • Digital certificate installed on the machine
    accessed by the user: basis of AA
    (authentication and authorisation)
  • User joins a VO
  • Identity passed to other resources you use, where
    it is mapped to a local account; the mapping is
    maintained by the VO
  • Common agreed policies establish rights for a
    Virtual Organization to use resources
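
The step where a certified identity is "mapped to a local account" can be pictured as a simple lookup table held at each resource. The sketch below assumes a plain-text format of one quoted certificate subject name followed by a local account name per line; that format, the subject names and the account names are illustrative assumptions, not a specification of any particular middleware.

```python
# Hypothetical mapping text: "<certificate subject>" <local account>
EXAMPLE_MAPPING = '''
# illustrative entries only
"/C=UK/O=ExampleOrg/CN=alice example" grid001
"/C=UK/O=ExampleOrg/CN=bob example" grid002
'''


def parse_mapping(text):
    """Parse mapping text into a dict of subject name -> local account."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # The account is the last space-separated field; the subject may
        # itself contain spaces, so split from the right.
        subject, _, account = line.rpartition(" ")
        mapping[subject.strip('"')] = account
    return mapping


def local_account(subject, mapping):
    """Return the local account for a certified identity, or None if unmapped."""
    return mapping.get(subject)


mapping = parse_mapping(EXAMPLE_MAPPING)
print(local_account("/C=UK/O=ExampleOrg/CN=alice example", mapping))
print(local_account("/C=UK/O=ExampleOrg/CN=unknown user", mapping))
```

An identity with no entry maps to `None`, which a resource would treat as "not authorised here", however valid the certificate itself may be.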

34
The key for new VOs
  • Application development environment, portals
  • Insulate applications from changing middleware

35
Contents
  • Definitions of e-Science and a grid
  • Exploring the definitions
  • Why now?!
  • Some examples
  • Current status of grids
  • Are grids for you?!

36
Are Grids for you?!
  • IF a community effort is vital to achieving
    goals, by sharing services of data and
    computation,
  • AND that effort crosses organisation boundaries
  • THEN yes!
  • In the UK, plan to join the NGS!
  • OR if you wish to use computation/storage/data
    services provided on a Grid then YES!

37
Summary
  • Collaboration across multiple organisations
  • Single sign-on to resources in multiple
    organisations
  • Need for people-services as well as middleware
    services to enable this, e.g. to run:
  • Enabling services (e.g. info service)
  • Certification authority for AA
  • VO management to negotiate with sites
  • Helpdesk, ...
  • Drives are towards:
  • Production services
  • In the UK, the NGS
  • In Europe, EGEE
  • Standards (tomorrow)
  • e-Infrastructure: integration of networking
    and middleware to support collaboration

38
Additional slides
39
Exponential Growth
[Chart: performance per dollar spent against number of years (0 to 5), with doubling times in months]
  • Optical fibre (bits per second): Gilder's Law
    (32X in 4 years)
  • Data storage (bits per square inch): Storage Law
    (16X in 4 years)
  • Chip capacity (number of transistors): Moore's
    Law (5X in 4 years)
Source: "Triumph of Light", Scientific American,
George Stix, January 2001
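
The chart quotes growth factors over four years; the equivalent doubling times follow from doubling time = period / log2(factor). A minimal sketch of that conversion, using only the three factors quoted on the slide (the function name is made up for this example):

```python
import math


def doubling_time_months(factor, period_months=48.0):
    """Months needed to double, given growth by `factor` over `period_months`."""
    return period_months / math.log2(factor)


# Growth factors over 4 years, as quoted on the chart.
chart = {
    "Optical fibre (Gilder's Law)": 32,
    "Data storage (Storage Law)": 16,
    "Chip capacity (Moore's Law)": 5,
}

for name, factor in chart.items():
    months = doubling_time_months(factor)
    print(f"{name}: {factor}X in 4 years = doubling every {months:.1f} months")
```

This gives roughly 9.6, 12 and 20.7 months respectively: network capacity doubles more than twice as fast as chip capacity, which is part of the case for moving computation to wherever resources are available.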
40
How Different 2005 is from 1995
  • Enormous quantities of data: petabytes
  • For an increasing number of communities
  • Constraint is not collection but analysis
  • Ubiquitous Internet
  • >100 million hosts
  • Security and trust are crucial issues
  • Ultra-high-speed networks: >10 Gb/s
  • Global optical networks
  • Bottlenecks: last kilometre, firewalls
  • Huge quantities of computing: >100 Top/s
  • Moore's law gives us all supercomputers
  • Organising their effective use is the challenge
  • Moore's law everywhere
  • Instruments, detectors, sensors, scanners, ...
  • Organising their effective use is the challenge