The Grid: UK Practice and Experience - PowerPoint PPT Presentation


1
The Grid: UK Practice and Experience
  • Mark Baker
  • Distributed Systems Group
  • University of Portsmouth
  • Email: Mark.Baker@computer.org
  • Web: http://dsg.port.ac.uk/mab

2
Outline
  • Characterisation of the Grid,
  • The UK e-Science programme,
  • Commercial case studies,
  • GridCast: BBC/Belfast e-Science Centre,
  • OGSA-DAI,
  • Other e-Science projects.
  • Summary and conclusions

3
Characterisation of the Grid
  • In 2001, Foster, Kesselman and Tuecke refined
    their original definition of a grid to
  • "co-ordinated resource sharing and problem
    solving in dynamic, multi-institutional virtual
    organizations"
  • This definition is the one most commonly used
    today to abstractly define a grid.

4
Characterisation of the Grid
  • Foster later produced a three-part checklist that
    can be used to help decide what can be identified
    as a grid system.
  • First, co-ordinated resource sharing with no
    centralised point of control, where the users
    reside within different administrative domains.
  • If this is not true, it is probably the case that
    this is not a grid system!
  • Second, standard, open, general-purpose protocols
    and interfaces.
  • If not, it is unlikely that system components
    will be able to communicate or inter-operate, and
    it is likely that we are dealing with an
    application-specific system, and not the Grid.

5
Characterisation of the Grid
  • Third, delivering non-trivial qualities of
    service: here we consider how the components
    that make up a grid can be used in a co-ordinated
    way to deliver combined services that are
    appreciably greater than the sum of the
    individual components.
  • These services may be associated with throughput,
    response time, mean time between failures,
    security, or many other facets.
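
The three-part checklist can be read as a simple predicate: a system counts as a grid only if all three items hold. The sketch below is illustrative only; the `SystemProfile` type, its field names and the example judgements are assumptions made for this example, not part of Foster's checklist or of the original slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemProfile:
    """Answers to the three checklist questions for one system (hypothetical type)."""
    name: str
    decentralised_control: bool    # sharing across admin domains, no central control
    open_standard_protocols: bool  # standard, open, general-purpose protocols
    nontrivial_qos: bool           # combined service greater than the sum of parts


def failed_items(profile: SystemProfile) -> list:
    """Return the names of the checklist items the system does not satisfy."""
    checks = {
        "decentralised control": profile.decentralised_control,
        "open standard protocols": profile.open_standard_protocols,
        "non-trivial qualities of service": profile.nontrivial_qos,
    }
    return [item for item, ok in checks.items() if not ok]


def is_grid(profile: SystemProfile) -> bool:
    """A system passes only if every checklist item holds."""
    return not failed_items(profile)


# Example judgement (an assumption for illustration): a single departmental
# cluster is centrally administered, so it fails the first item.
cluster = SystemProfile("departmental cluster", False, True, False)
print(is_grid(cluster), failed_items(cluster))
```

Returning the failed items, rather than only a yes/no answer, makes it easier to see which part of the checklist rules a system out.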

6
Characterisation of the Grid
  • From a commercial viewpoint, IBM define a grid
    as
  • "a standards-based application/resource sharing
    architecture that makes it possible for
    heterogeneous systems and applications to share
    compute and storage resources transparently"

7
What is not a Grid!
  • A cluster, a network attached storage device, a
    desktop PC, a scientific instrument, a network:
    these are not grids.
  • Each might be an important component of a grid,
    but by itself, it does not constitute a grid.
  • Screen saver/cycle stealers
  • SETI@home, Folding@home, etc.,
  • Other applications specific to distributed
    computing.
  • Most of the current Grid providers
  • Proprietary technology with closed model of
    operation.
  • Globus
  • It is a toolkit to build a system that might work
    as or within a grid.
  • Sun Grid Engine, Platform LSF and related.
  • Almost anything referred to as a grid by
    marketeers!

8
e-Science
  • e-Science is about global collaboration in key
    areas of science, and the next generation of
    infrastructure that will enable it.
  • e-Science will change the dynamics of the way
    science is undertaken.
  • John Taylor
  • Director General of Research Councils
  • Office of Science and Technology

9
The Drivers for e-Science
  • More data
  • Instrument resolution and laboratory automation,
  • Storage capacity and data sources.
  • More computation
  • Computations available, simulations
    doubling every year
  • Faster networks
  • Bandwidth,
  • Need to schedule.
  • More inter-play and collaboration
  • Between scientists, engineers, computer
    scientists etc.,
  • Between computation and data.

10
The Drivers for e-Science
  • Collaboration,
  • Data Deluge,
  • Digital Technology
  • Ubiquity,
  • Cost reduction,
  • Performance increase.
  • In summary
  • Shared data, information and computation by
    geographically dispersed communities.

11
The UK e-Science Programme
  • Second Phase: 2003-2006
  • Application Projects
  • £96M,
  • All areas of science and engineering.
  • Core Programme
  • £16M Research infrastructure,
  • DTI Technology Fund.
  • First Phase: 2001-2004
  • Application Projects
  • £74M,
  • All areas of science and engineering.
  • Core Programme
  • £15M Research infrastructure,
  • £40M Collaborative industrial projects.

12
The UK e-Science Programme
  • An exciting portfolio of Research Council
    e-Science projects
  • Beginning to see e-Science infrastructure deliver
    some early wins in several areas,
  • Astronomy, Chemistry, Bioinformatics,
    Engineering, Environment, Healthcare.
  • The UK is unique in its strong industrial
    component
  • Over 60 UK companies contributing over £30M,
  • Engineering, Pharmaceutical, Petrochemical, IT
    companies, Commerce, Media.

13
And the future
  • Grid Operations Centre, National Grid Service and
    AAA services,
  • Open Middleware Infrastructure Institute,
  • National e-Science Institute,
  • Digital Curation Centre,
  • International Standards Activity,
  • Needs continued support from Research Councils
    with identifiable e-Science funding lines post
    2006.

14
E-Science Case Studies
  • The GridCast Project: Grid-based Broadcast
    Infrastructures
  • http://www.qub.ac.uk/escience

15
The Grid Scenario: The BBC Nations (BBC NI,
Scotland and Wales)
The focus of the project is distribution of
stored media files and their management in
multiple sites.
  • BBC Nations provide customised services in each
    nation.
  • Television programmes are distributed to BBC
    Nations from BBC Network (London) using
    dedicated leased ATM circuits.

16
Grid Infrastructure
  • Technical
  • High-bandwidth network connections inter-connect
    broadcast locations,
  • Network bandwidth means geography is less of an
    issue.
  • Organisational
  • Less centralised.

17
Overview
  • The aim was to develop a baseline media grid to
    support a broadcaster
  • Manage distributed collections of stored media,
  • Prototype security and access mechanisms,
  • Integrate processing and technical resources,
  • Integrate with media standards and hardware.
  • To analyse Quality of Service issues
  • Analyse remote content distribution
    infrastructures,
  • Analyse remote service provision,
  • To analyse reactivity, reliability and resilience
    issues in a grid-based broadcast infrastructure

18
Characteristics
  • Stored media files are Gbytes in size and
    increasing
  • 1 hour of media is 200 Gbytes; 1 petabyte/year
    is distributed
  • Management and distribution is significant
    technically,
  • Metadata, which includes location, timings,
    artists and storage formats, is an integral part
    of the broadcast structure,
  • Content is a valuable commodity: access,
    modification and copying must be controlled,
  • High levels of quality required.
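
To put the figures on this slide in proportion, the short calculation below converts 1 petabyte/year into hours of media and an average sustained transfer rate. It assumes decimal units (1 PB = 10^6 GB) and a 365-day year; neither assumption is stated on the slide.

```python
GB_PER_HOUR_OF_MEDIA = 200   # from the slide: 1 hour of media is 200 Gbytes
GB_PER_PETABYTE = 1_000_000  # assumption: decimal units, 1 PB = 10**6 GB
DAYS_PER_YEAR = 365          # assumption: non-leap year

# Hours of stored media represented by 1 PB distributed per year.
hours_per_year = GB_PER_PETABYTE / GB_PER_HOUR_OF_MEDIA
hours_per_day = hours_per_year / DAYS_PER_YEAR

# Average sustained rate if the petabyte were spread evenly over the year.
seconds_per_year = DAYS_PER_YEAR * 24 * 3600
avg_rate_mbyte_s = GB_PER_PETABYTE * 1000 / seconds_per_year

print(f"{hours_per_year:.0f} hours of media per year")
print(f"{hours_per_day:.1f} hours of media per day")
print(f"{avg_rate_mbyte_s:.1f} Mbyte/s average sustained rate")
```

Under these assumptions the petabyte corresponds to 5000 hours of media a year, or a little under 14 hours a day.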

19
A Virtualised Infrastructure
[Figure: virtualised infrastructure diagram; only the label "Sound Improvement" survives]

20
Model Grid Service Operation
  • A schedule is registered with the schedule
    (network) management service,
  • The schedule is automatically distributed to
    the (nation) schedule management component,
  • Local controller receives notification of
    schedule availability,
  • The Nation Controller registers the schedule
    with local (nation) schedule management,
  • Transport services develop a transport plan for
    content movement,
  • Scheduled transport service moves content as
    defined in transport plan.
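
The sequence of steps on this slide can be sketched as a small in-memory simulation. This is a minimal sketch under stated assumptions, not the GridCast implementation: the class names (`Schedule`, `NationSite`, `NetworkScheduleService`) and the idea that a transport plan is simply the list of content a site does not yet hold are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Schedule:
    name: str
    content_ids: list


@dataclass
class NationSite:
    """One BBC Nation: local schedule management plus a local content store."""
    name: str
    content: set = field(default_factory=set)
    local_schedules: list = field(default_factory=list)

    def notify(self, schedule):
        # The local controller is notified and registers the schedule locally.
        self.local_schedules.append(schedule)
        # Transport plan: content named in the schedule that is not yet held here.
        return [cid for cid in schedule.content_ids if cid not in self.content]

    def receive(self, content_id):
        self.content.add(content_id)


class NetworkScheduleService:
    """Network-level schedule management that distributes schedules to nations."""

    def __init__(self, sites):
        self.sites = sites

    def register(self, schedule):
        plans = {}
        for site in self.sites:
            plan = site.notify(schedule)   # distribute and register the schedule
            for content_id in plan:        # scheduled transport follows the plan
                site.receive(content_id)
            plans[site.name] = plan
        return plans


wales = NationSite("Wales", content={"news"})
service = NetworkScheduleService([wales])
print(service.register(Schedule("evening", ["news", "drama"])))
```

In this sketch transport happens immediately; a real transport service would schedule the movement for a later time.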

21
Grid Service Operation
  • Index (registry) services track grid sites and
    available services,
  • Discovery services locate available copies of
    broadcast content
  • Services for nearest, or least busy or
  • Discovery services identify best transport
    service to use
  • Cross mounted file systems, 3rd party or ftp-type
    transport.
  • Transport services move workflows associated
    with content
  • The necessary operation(s) when content is
    delivered.
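
Choosing the nearest or the least busy copy of a piece of content can be sketched as a minimum over the known replicas. The `Replica` type and its two metrics (`distance_hops`, `active_transfers`) are hypothetical stand-ins; the slides do not say how distance or load were actually measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    site: str
    distance_hops: int     # hypothetical measure of network distance
    active_transfers: int  # hypothetical measure of how busy the site is


def nearest(replicas):
    """Pick the replica with the smallest network distance."""
    return min(replicas, key=lambda r: r.distance_hops)


def least_busy(replicas):
    """Pick the replica whose site is serving the fewest transfers."""
    return min(replicas, key=lambda r: r.active_transfers)


replicas = [
    Replica("London", distance_hops=1, active_transfers=12),
    Replica("Belfast", distance_hops=3, active_transfers=2),
]
print(nearest(replicas).site, least_busy(replicas).site)
```

The two policies can disagree, as they do for the example data, which is why a discovery service would need to expose the selection policy as a choice.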

22
Grid Service Operation
  • Transport planner incorporates a model of
    network load
  • High cost at peak times and low cost at off-peak,
  • Other models in development.
  • Content archives are managed as replica archives
  • Content locations are tracked; content can be
    withdrawn.
  • Content archives permit automatic replication
  • For resilience and/or QoS.
  • Public and private services facilitate operation
    with public and private networks
  • Co-ordinating security policies with internal BBC
    policies.
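
A network load model with high cost at peak times and low cost off-peak can be sketched as an hourly tariff, with the planner picking the cheapest start time that still meets a deadline. The peak window (08:00 to 20:00) and the cost values are assumptions chosen for illustration; the slides do not give the actual model.

```python
PEAK_HOURS = range(8, 20)  # assumption: 08:00-19:59 counts as peak
PEAK_COST = 10             # assumption: arbitrary cost units per hour
OFF_PEAK_COST = 1


def hourly_cost(hour):
    """Cost of using the network during the given hour (hours may exceed 24)."""
    return PEAK_COST if hour % 24 in PEAK_HOURS else OFF_PEAK_COST


def transfer_cost(start_hour, duration_hours):
    """Total cost of a transfer occupying consecutive whole hours."""
    return sum(hourly_cost(start_hour + i) for i in range(duration_hours))


def cheapest_start(earliest, deadline, duration_hours):
    """Cheapest start hour that finishes by the deadline; earliest wins ties."""
    candidates = range(earliest, deadline - duration_hours + 1)
    if not candidates:
        raise ValueError("transfer cannot finish before the deadline")
    return min(candidates, key=lambda s: transfer_cost(s, duration_hours))


# A 3-hour transfer that may start at 16:00 and must finish by 06:00 next day.
print(cheapest_start(16, 30, 3))  # 20: the first start entirely off-peak
```

Because the tariff is just a function of the hour, other load models could be swapped in by replacing `hourly_cost`.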

23
Broadcast grid issues
  • Business change
  • A revised organisational model (services and
    resources),
  • Each broadcast location gains control; no network
    schedule.
  • Resilience
  • Resource sharing and no single programme
    repository,
  • A BBC Nation can be anywhere!
  • Reliability
  • Use resources available in other BBC sites or
    from 3rd party suppliers.
  • Cost
  • Better use of resources and less need for backup
    resources,
  • Less dependence on particular vendors or
    suppliers.
  • Customisation
  • Schedule, local resources, local capabilities.
  • Interoperability
  • Business model facilitates sharing with other
    broadcasters.

24
GridCast: A Summary
  • Television programme distribution
  • Using a grid architecture to distribute
    programmes between broadcast sites
  • Concentrating initially on recorded material.
  • Television programme production
  • Using a grid architecture to monitor and
    facilitate programme production.
  • Television production technical assets
  • Using a grid architecture to facilitate access
    and use of broadcasting resources in television
    programme production.

25
OGSA Data Access and Integration
  • Middleware for distributed data access over the
    Grid.
  • UK e-Science: Edinburgh, Manchester and
    Newcastle.
  • Industry partners: IBM, Oracle and Microsoft.
[Figure: layered architecture diagram. Surviving labels: OGSA-DAI, DBMS, XML, Dist. SQL, Dist. Query, OGSA/WSRF, TCP/IP]

26
OGSA-DAI Project
  • OGSA-DAI is one of the Grid Middleware Centre
    Projects
  • Collaboration between
  • EPCC,
  • IBM (Oracle in phase 1),
  • National e-Science Centre,
  • Manchester University,
  • Newcastle University.
  • Project funding
  • OGSA-DAI, 2002-03
  • £3.3 million from the UK Core e-Science funding
    programme,
  • DAIT (DAI Two), 2003-06
  • £1.3 million from the UK e-Science Core Programme
    II.
  • "OGSA-DAI" is a trade mark.

Funded by the UK's Department of Trade and Industry and the
Engineering and Physical Sciences Research Council
as part of the e-Science Core Programme.

27
Example Projects Using OGSA-DAI
  • Bridges (http://www.brc.dcs.gla.ac.uk/projects/bridges/)
  • N2Grid (http://www.cs.univie.ac.at/institute/index.html?project-8080)
  • BioSimGrid (http://www.biosimgrid.org/)
  • AstroGrid (http://www.astrogrid.org/)
  • BioGrid (http://www.biogrid.jp/)
  • GEON (http://www.geongrid.org/)
  • OGSA-DAI (http://www.ogsadai.org.uk)
  • eDiaMoND (http://www.ediamond.ox.ac.uk/)
  • OGSA-WebDB (http://www.gtrc.aist.go.jp/dbgrid/)
  • FirstDig (http://www.epcc.ed.ac.uk/firstdig/)
  • GeneGrid (http://www.qub.ac.uk/escience/projects.phpgenegrid)
  • INWA (http://www.epcc.ed.ac.uk/)
  • myGrid (http://www.mygrid.org.uk/)
  • ODD-Genes (http://www.epcc.ed.ac.uk/oddgenes/)
  • IU RGRBench (http://www.cs.indiana.edu/plale/projects/RGR/OGSA-DAI.html)

28
OGSA-DAI User Project classification
  • AstroGrid
  • ODD-Genes
  • Bridges

Physical Sciences
  • BioSimGrid
  • GEON
  • BioGrid
  • eDiamond
  • myGrid

Biological Sciences
  • GeneGrid

OGSA-DAI
  • N2Grid
  • MCS
  • OGSA Web-DB
  • GridMiner
  • IU RGBench
  • FirstDig
  • INWA

Computer Sciences
Commercial Applications

29
The FirstDIG Project
  • The FirstDIG (First Data Investigation on the
    Grid) project deployed OGSA-DAI within the First
    South Yorkshire bus operational environment
  • First plc are the UK's largest public transport
    operator,
  • Within their UK bus operations they have a huge
    range of data sources - vehicle mileage, fuel
    consumption, maintenance records, revenue,
    reliability, etc.
  • A generic Grid Data Service Browser has been
    built and used to interrogate and combine data
    from OGSA-DAI enabled data sources to answer
    business questions posed by First South
    Yorkshire.
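
The idea of combining data from separate sources to answer a business question can be sketched with two independent in-memory SQLite databases. This does not use OGSA-DAI or its API; plain `sqlite3` stands in for the data sources, and the table layouts, vehicle names and figures are made up for illustration.

```python
import sqlite3


def make_source(create_sql, insert_sql, rows):
    """Build one stand-alone in-memory database acting as a data source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(create_sql)
    conn.executemany(insert_sql, rows)
    return conn


# Two separate sources, as if owned by different departments (made-up data).
mileage_db = make_source(
    "CREATE TABLE mileage (vehicle TEXT, km REAL)",
    "INSERT INTO mileage VALUES (?, ?)",
    [("bus-1", 700.0), ("bus-1", 500.0), ("bus-2", 900.0)],
)
fuel_db = make_source(
    "CREATE TABLE fuel (vehicle TEXT, litres REAL)",
    "INSERT INTO fuel VALUES (?, ?)",
    [("bus-1", 400.0), ("bus-2", 450.0)],
)


def km_per_litre(mileage_conn, fuel_conn):
    """Query each source separately, then join the results on vehicle."""
    km = dict(mileage_conn.execute(
        "SELECT vehicle, SUM(km) FROM mileage GROUP BY vehicle"))
    litres = dict(fuel_conn.execute(
        "SELECT vehicle, SUM(litres) FROM fuel GROUP BY vehicle"))
    return {v: km[v] / litres[v] for v in km if litres.get(v, 0) > 0}


print(km_per_litre(mileage_db, fuel_db))
```

Each source is queried on its own and the join happens in the client, which is the simplest form of the integration the slide describes.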

30
Other e-Science Projects
  • Comb-e-Chem, a combinatorial chemistry
    application - http://www.combechem.org/
  • The system allows students and researchers to
    virtually mix chemicals together and then try to
    identify the compounds they produce and the
    particular benefits these compounds may have.
  • Chemistry, CS, Maths, and IT Innovation.
  • DAME - Distributed Aircraft Maintenance
    Environment - http://www.cs.york.ac.uk/dame/
  • Aims to produce sensors that measure temperature,
    vibration, and pressure of airplane engines as
    they fly from one location to another.
  • Instead of waiting until a plane lands, sensor
    data will be sampled in flight and compared with
    existing patterns.
  • If problems are detected, mechanics can replace
    the damaged or faulty engine parts as soon as the
    plane lands and before anything drastic occurs.
  • Universities, Rolls-Royce, Data Systems
    Solutions, and Cybula.
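
Comparing in-flight sensor samples with existing patterns can be illustrated with a nearest-pattern lookup. The sketch below uses plain Euclidean distance, which is a stand-in chosen for simplicity and not the matching technique used by DAME; the pattern labels and readings are invented.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length sequences of readings."""
    if len(a) != len(b):
        raise ValueError("sample and pattern must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_pattern(sample, patterns):
    """Return the label of the stored pattern nearest to the sample."""
    return min(patterns, key=lambda label: distance(sample, patterns[label]))


# Invented reference patterns: three consecutive vibration readings each.
known_patterns = {
    "normal": [1.0, 1.0, 1.0],
    "bearing wear": [1.0, 2.0, 3.0],
}

in_flight_sample = [1.1, 1.9, 3.2]
print(closest_pattern(in_flight_sample, known_patterns))  # bearing wear
```

A match against a fault pattern is what would let maintenance staff prepare replacement parts before the aircraft lands.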

31
Other e-Science Projects
  • The Geodise project is a grid-enabled
    optimisation and design search program for
    engineers - http://www.geodise.org/
  • The project will allow aerospace companies to
    speed up the design process of their vehicles by
    capturing knowledge from previous designs and
    putting it together for simulations.
  • Universities, BAE Systems, Rolls-Royce and
    Fluent.
  • Discovery Net - http://www.discovery-on-the.net/
  • This project is producing high-throughput sensing
    applications such as environmental sensors and
    bioinformatic monitors.
  • The aim is for doctors to someday be able to
    monitor the blood pressure, temperature, and drug
    intake of all their patients.
  • A sensor on the patient's body will communicate
    the data through a mobile wireless communication
    device to the doctor's office.
  • Universities, InforSense, deltaDOT, and
    HydroVenturi

32
Summary
  • The e-Science programme has pump-primed the
    take-up of the Grid in the UK.
  • The programme is perceived as being a great
    success: it has given the UK a lead in e-Science.
  • It has not been without its problems, not least
    of which was the move to WSRF and the take-up of
    the various WS specifications.
  • Output from the programme has led to a number of
    other projects that will address the current gaps
    in grid technologies.
  • New funding related to infrastructure (JISC)
    supports implementing and deploying the
    technologies (VREs).
  • All the projects are collaborations between
    academia and industry.

33
Some Further Work!
  • Robust, reliable and inter-operable middleware
    that can scale to support a global
    infrastructure
  • UK: OMII is meant to be hardening existing
    software.
  • Funding for implementation and deployment,
    rather than just research
  • UK: JISC for academia,
  • UK: DTI for commerce/industry.
  • Security and trust mechanisms
  • Take-up of Semantic Web technologies to speed the
    automation of component interaction.
  • Open source software and agreed standards
  • GGF, Oasis, EGA, IETF, W3C etc.
  • Educational aspects
  • Undergraduate, graduate and other courses.

34
Summary: Successful Grid Areas
  • Distributed database integration: intelligent
    queries and data-mining across heterogeneous data
    sources.
  • Parameter sweeps: run sequential tasks many
    times with different input data.
  • Coupled simulations: the output of one
    simulation is the input of another.
  • Distributed resources: sensors and equipment,
    processing, data silos, and visualisation at
    different remote sites.
  • Application Service Provision: services on
    demand!
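
Two of the patterns above, parameter sweeps and coupled simulations, are easy to show in miniature. The sketch uses a local thread pool from the Python standard library in place of grid resources, and the `simulate` and `post_process` functions are placeholders invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(x):
    """Placeholder for one sequential simulation run."""
    return x * x


def post_process(y):
    """Placeholder for a second simulation that consumes the first one's output."""
    return y + 1


def sweep(task, inputs, workers=4):
    """Parameter sweep: run the same task once per input, results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(task, inputs))


def couple(first, second, value):
    """Coupled simulations: the output of the first is the input of the second."""
    return second(first(value))


print(sweep(simulate, [1, 2, 3]))         # [1, 4, 9]
print(couple(simulate, post_process, 3))  # 10
```

On a grid the individual runs of a sweep are independent, so they could be farmed out to different sites, whereas a coupled pair has to be ordered.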

35
Acknowledgements and links
  • Prof Ron Perrott and the Belfast e-Science
    Centre
  • http://www.qub.ac.uk/escience
  • Prof Malcolm Atkinson, NeSC
  • The OGSA-DAI Project Site
  • http://www.ogsadai.org.uk