Title: The Grid: UK Practice and Experience
1The Grid: UK Practice and Experience
- Mark Baker
- Distributed Systems Group
- University of Portsmouth
- Email: Mark.Baker_at_computer.org
- Web: http://dsg.port.ac.uk/mab
2Outline
- Characterisation of the Grid,
- The UK e-Science programme,
- Commercial case studies:
- GridCast: BBC/Belfast e-Science Centre,
- OGSA-DAI,
- Other e-Science projects.
- Summary and conclusions.
3Characterisation of the Grid
- In 2001, Foster, Kesselman and Tuecke refined their original definition of a grid to:
- "co-ordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations".
- This definition is the one most commonly used today to abstractly define a grid.
4Characterisation of the Grid
- Foster later produced a three-part checklist that can be used to help identify exactly what is a grid system:
- Co-ordinated resource sharing with no centralised point of control, where the users reside within different administrative domains:
- If not true, it is probably the case that this is not a grid system!
- Standard, open, general-purpose protocols and interfaces:
- If not, it is unlikely that system components will be able to communicate or inter-operate, and it is likely that we are dealing with an application-specific system, and not the Grid.
5Characterisation of the Grid
- Delivering non-trivial qualities of service: here we are considering how the components that make up a grid can be used in a co-ordinated way to deliver combined services, which are appreciably greater than the sum of the individual components:
- These services may be associated with throughput, response time, mean time between failures, security, or many other facets.
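The three-part checklist can be read as a simple conjunction of tests. The sketch below is only an illustration of that reading: the `SystemProfile` type, its field names and the example system are assumptions made for this example and do not come from Foster's checklist itself.

```python
from dataclasses import dataclass


@dataclass
class SystemProfile:
    """Answers to the three checklist questions for one system (illustrative)."""
    name: str
    decentralised_multi_domain: bool  # no central control, users in different domains
    open_standard_protocols: bool     # standard, open, general-purpose interfaces
    nontrivial_qos: bool              # combined service greater than the sum of parts


def failed_checks(profile: SystemProfile) -> list[str]:
    """Return the checklist items that the system does not satisfy."""
    checks = {
        "co-ordinated sharing without centralised control": profile.decentralised_multi_domain,
        "standard, open, general-purpose protocols": profile.open_standard_protocols,
        "non-trivial qualities of service": profile.nontrivial_qos,
    }
    return [item for item, ok in checks.items() if not ok]


def is_grid(profile: SystemProfile) -> bool:
    """In this sketch a system is a grid only if all three items hold."""
    return not failed_checks(profile)


if __name__ == "__main__":
    # Hypothetical example: a single-site cluster fails two of the three items.
    cluster = SystemProfile("single-site cluster", False, True, False)
    print(cluster.name, is_grid(cluster), failed_checks(cluster))
```

Returning the failed items, rather than a bare yes/no, keeps the reason for the verdict visible, which matches the way the checklist is used to explain why something is not a grid.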
6Characterisation of the Grid
- From a commercial viewpoint, IBM defines a grid as:
- "a standards-based application/resource sharing architecture that makes it possible for heterogeneous systems and applications to share compute and storage resources transparently".
7What is not a Grid!
- A cluster, a network attached storage device, a desktop PC, a scientific instrument, a network: these are not grids:
- Each might be an important component of a grid, but by itself, it does not constitute a grid.
- Screen saver/cycle stealers:
- SETI@home, Folding@home, etc.,
- Other applications specific to distributed computing.
- Most of the current Grid providers:
- Proprietary technology with a closed model of operation.
- Globus:
- It is a toolkit to build a system that might work as or within a grid.
- Sun Grid Engine, Platform LSF and related.
- Almost anything referred to as a grid by marketeers!
8e-Science
- "e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it."
- "e-Science will change the dynamics of the way science is undertaken."
- John Taylor, Director General of Research Councils, Office of Science and Technology
9The Drivers for e-Science
- More data
- Instrument resolution and laboratory automation,
- Storage capacity and data sources.
- More computation
- Computations available, simulations doubling every year.
- Faster networks
- Bandwidth,
- Need to schedule.
- More inter-play and collaboration
- Between scientists, engineers, computer scientists etc.,
- Between computation and data.
10The Drivers for e-Science
- Collaboration,
- Data Deluge,
- Digital Technology
- Ubiquity,
- Cost reduction,
- Performance increase.
- In summary
- Shared data, information and computation by
geographically dispersed communities.
11The UK e-Science Programme
- Second Phase: 2003-2006
- Application Projects
- £96M,
- All areas of science and engineering.
- Core Programme
- £16M Research Infrastructure,
- DTI Technology Fund.
- First Phase: 2001-2004
- Application Projects
- £74M,
- All areas of science and engineering.
- Core Programme
- £15M Research infrastructure,
- £40M Collaborative industrial projects.
12The UK e-Science Programme
- An exciting portfolio of Research Council e-Science projects:
- Beginning to see e-Science infrastructure deliver some early wins in several areas,
- Astronomy, Chemistry, Bioinformatics, Engineering, Environment, Healthcare.
- The UK is unique in its strong industrial component:
- Over 60 UK companies contributing over £30M,
- Engineering, Pharmaceutical, Petrochemical, IT companies, Commerce, Media.
13And the future
- Grid Operations Centre, National Grid Service and AAA services,
- Open Middleware Infrastructure Institute,
- National e-Science Institute,
- Digital Curation Centre,
- International Standards Activity,
- Needs continued support from Research Councils
with identifiable e-Science funding lines post
2006.
14E-Science Case Studies
- The GridCast Project: Grid-based Broadcast Infrastructures
- http://www.qub.ac.uk/escience
15The Grid Scenario: The BBC Nations - BBC NI, Scotland and Wales
The focus of the project is the distribution of stored media files and their management at multiple sites.
- BBC Nations provide customised services in each nation.
- Television programmes are distributed to BBC Nations from BBC Network (London) using dedicated leased ATM circuits.
16Grid Infrastructure
- Technical
- High-bandwidth network connections inter-connect broadcast locations,
- Network bandwidth means geography is less of an issue.
- Organisational
- Less centralised.
17Overview
- The aim was to develop a baseline media grid to support a broadcaster:
- Manage distributed collections of stored media,
- Prototype security and access mechanisms,
- Integrate processing and technical resources,
- Integrate with media standards and hardware.
- To analyse Quality of Service issues:
- Analyse remote content distribution infrastructures,
- Analyse remote service provision.
- To analyse reactivity, reliability and resilience issues in a grid-based broadcast infrastructure.
18Characteristics
- Stored media files are Gbytes in size and increasing:
- 1 hour of content is about 200 Gbytes; about 1 petabyte is distributed per year.
- Management and distribution is significant technically,
- Metadata, which includes location, timings, artists and storage formats, is an integral part of the broadcast structure,
- Content is a valuable commodity: access, modification and copying must be controlled,
- High levels of quality required.
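To give a feel for the two figures on this slide, the short calculation below converts them into hours of content per year and an average network rate. It assumes decimal units (1 Gbyte = 10^9 bytes, 1 petabyte = 10^15 bytes) and traffic spread evenly over a 365-day year; neither assumption is stated on the slide, so the results are rough orders of magnitude only.

```python
GB = 10**9   # decimal gigabyte (assumption, not stated on the slide)
PB = 10**15  # decimal petabyte (assumption, not stated on the slide)

BYTES_PER_HOUR_OF_CONTENT = 200 * GB   # slide figure: 1 hour is about 200 Gbytes
BYTES_DISTRIBUTED_PER_YEAR = 1 * PB    # slide figure: about 1 petabyte per year
SECONDS_PER_YEAR = 365 * 24 * 3600


def hours_of_content_per_year() -> float:
    """Hours of stored media that one petabyte per year corresponds to."""
    return BYTES_DISTRIBUTED_PER_YEAR / BYTES_PER_HOUR_OF_CONTENT


def average_rate_mbit_per_s() -> float:
    """Average sustained rate if the yearly volume were spread evenly."""
    bits_per_year = BYTES_DISTRIBUTED_PER_YEAR * 8
    return bits_per_year / SECONDS_PER_YEAR / 10**6


if __name__ == "__main__":
    print(f"{hours_of_content_per_year():.0f} hours of content per year")
    print(f"{average_rate_mbit_per_s():.1f} Mbit/s average rate")
```

Under these assumptions the yearly volume corresponds to 5000 hours of content and an average rate of roughly a quarter of a gigabit per second; real traffic would be burstier than this average suggests.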
19A Virtualised Infrastructure
[Figure: virtualised infrastructure diagram; surviving label: Sound Improvement]
20Model Grid Service Operation
- A schedule is registered with the (network) schedule management service,
- The schedule is automatically distributed to the (nation) schedule management component,
- Local controller receives notification of schedule availability,
- The Nation Controller registers the schedule with (nation) local schedule management,
- Transport services develop a transport plan for content movement,
- Scheduled transport service moves content as defined in the transport plan.
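The sequence of steps above can be sketched as a small in-memory model. This is not GridCast code: the class names, method names and the shape of the schedule dictionary are all hypothetical, chosen only to make the order of operations explicit.

```python
from dataclasses import dataclass, field


@dataclass
class NationController:
    """Hypothetical local controller for one BBC Nation site."""
    nation: str
    local_schedules: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def notify(self, schedule: dict) -> list:
        # Step 3: the local controller is told a schedule is available.
        self.log.append("notified")
        # Step 4: the schedule is registered with local schedule management.
        self.local_schedules.append(schedule)
        self.log.append("registered locally")
        # Step 5: build a transport plan, one (item, source, destination) per programme.
        plan = [(item, "network", self.nation) for item in schedule["programmes"]]
        self.log.append("transport plan built")
        # Step 6: the scheduled transport service moves content as planned.
        moved = [item for item, _source, _destination in plan]
        self.log.append("content moved")
        return moved


class NetworkScheduleService:
    """Hypothetical network-level schedule management service."""

    def __init__(self) -> None:
        self.schedules: list = []
        self.controllers: list = []

    def subscribe(self, controller: NationController) -> None:
        self.controllers.append(controller)

    def register(self, schedule: dict) -> dict:
        # Step 1: register the schedule at network level.
        self.schedules.append(schedule)
        # Step 2: distribute it automatically to every nation.
        return {c.nation: c.notify(schedule) for c in self.controllers}


if __name__ == "__main__":
    service = NetworkScheduleService()
    service.subscribe(NationController("BBC NI"))
    print(service.register({"programmes": ["news", "drama"]}))
```

In a real deployment each step would be a separate grid service call; collapsing them into method calls here only shows which component triggers which.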
21Grid Service Operation
- Index (registry) services track grid sites and available services,
- Discovery services locate available copies of broadcast content:
- Services for nearest, or least busy or
- Discovery services identify the best transport service to use:
- Cross-mounted file systems, 3rd party or ftp-type transport.
- Transport services move work flows associated with content:
- The necessary operation(s) when content is delivered.
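A discovery step that picks one copy of some content by policy might look like the sketch below. The `Replica` fields, the policy names and the index layout are assumptions for illustration; the slide only says that discovery can select the nearest or the least busy copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One stored copy of a piece of content (hypothetical fields)."""
    site: str
    distance_km: float      # assumed measure of "nearest"
    active_transfers: int   # assumed measure of "least busy"


# Each policy maps a replica to a score; the lowest score wins.
POLICIES = {
    "nearest": lambda replica: replica.distance_km,
    "least_busy": lambda replica: replica.active_transfers,
}


def locate(index: dict[str, list[Replica]], content_id: str,
           policy: str = "nearest") -> Replica:
    """Return the best available copy of content_id under the given policy."""
    replicas = index.get(content_id, [])
    if not replicas:
        raise LookupError(f"no copy of {content_id!r} is registered")
    return min(replicas, key=POLICIES[policy])


if __name__ == "__main__":
    # Made-up index contents, for illustration only.
    index = {
        "programme-1": [
            Replica("Belfast", distance_km=5.0, active_transfers=7),
            Replica("Cardiff", distance_km=400.0, active_transfers=1),
        ]
    }
    print(locate(index, "programme-1", "nearest").site)
    print(locate(index, "programme-1", "least_busy").site)
```

Keeping the policies in a table makes it straightforward to add further selection rules without changing the lookup itself.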
22Grid Service Operation
- Transport planner incorporates a model of network load:
- High cost at peak times and low cost at off-peak,
- Other models in development.
- Content archives are managed as replica archives:
- Content locations are tracked; content can be withdrawn.
- Content archives permit automatic replication:
- For resilience and/or QoS.
- Public and private services facilitate operation with public and private networks:
- Co-ordinating security policies with internal BBC policies.
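The peak/off-peak load model used by the transport planner can be illustrated with a very small cost function. Everything numeric here is an assumption: the slide gives no peak hours or cost values, so the 08:00 to 20:00 peak window and the 10:1 cost ratio are placeholders.

```python
PEAK_HOURS = range(8, 20)  # assumed peak window, 08:00 to 19:59
PEAK_COST = 10             # assumed relative cost per hour at peak
OFF_PEAK_COST = 1          # assumed relative cost per hour off-peak


def hourly_cost(hour: int) -> int:
    """Cost of using the network during the given hour (hours count from midnight)."""
    return PEAK_COST if hour % 24 in PEAK_HOURS else OFF_PEAK_COST


def plan_transfer(duration_hours: int, earliest: int, deadline: int) -> tuple[int, int]:
    """Pick the cheapest contiguous window that finishes by the deadline.

    Returns (start_hour, total_cost); ties go to the earliest start.
    """
    candidates = range(earliest, deadline - duration_hours + 1)
    if not candidates:
        raise ValueError("transfer cannot finish before the deadline")

    def window_cost(start: int) -> int:
        return sum(hourly_cost(h) for h in range(start, start + duration_hours))

    best_start = min(candidates, key=window_cost)
    return best_start, window_cost(best_start)


if __name__ == "__main__":
    # A 3-hour transfer that may start at 06:00 and must finish by 06:00 next day.
    print(plan_transfer(duration_hours=3, earliest=6, deadline=30))
```

With these placeholder values the planner defers the transfer to the first fully off-peak window, which is the behaviour the cost model is meant to encourage.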
23Broadcast grid issues
- Business change:
- A revised organisational model (services and resources),
- Each broadcast location gains control; no network schedule.
- Resilience:
- Resource sharing and no single programme repository,
- A BBC Nation can be anywhere!
- Reliability:
- Use resources available in other BBC sites or from 3rd party suppliers.
- Cost:
- Better use of resources and less need for backup resources,
- Less dependence on particular vendors or suppliers.
- Customisation:
- Schedule, local resources, local capabilities.
- Interoperability:
- Business model facilitates sharing with other broadcasters.
24GridCast: A Summary
- Television programme distribution:
- Using a grid architecture to distribute programmes between broadcast sites,
- Concentrating initially on recorded material.
- Television programme production:
- Using a grid architecture to monitor and facilitate programme production.
- Television production technical assets:
- Using a grid architecture to facilitate access and use of broadcasting resources in television programme production.
25OGSA Data Access and Integration
- Middleware for distributed data access over the Grid.
- UK e-Science: Edinburgh, Manchester and Newcastle.
- Industry partners: IBM, Oracle and Microsoft.
[Figure: OGSA-DAI diagram; surviving labels: DBMS, XML, Dist. SQL, Dist. Query, OGSA-DAI, OGSA/WSRF, TCP/IP]
26OGSA-DAI Project
- OGSA-DAI is one of the Grid Middleware Centre Projects.
- Collaboration between:
- EPCC,
- IBM (and Oracle in phase 1),
- National e-Science Centre,
- Manchester University,
- Newcastle University.
- Project funding:
- OGSA-DAI, 2002-03:
- £3.3 million from the UK Core e-Science funding programme,
- DAIT (DAI Two), 2003-06:
- £1.3 million from the UK e-Science Core Programme II.
- "OGSA-DAI" is a trade mark.
Funded by the UK's Department of Trade and Industry and the Engineering and Physical Sciences Research Council as part of the e-Science Core Programme.
27Example Projects Using OGSA-DAI
Bridges (http://www.brc.dcs.gla.ac.uk/projects/bridges/)
N2Grid (http://www.cs.univie.ac.at/institute/index.html?project-8080)
BioSimGrid (http://www.biosimgrid.org/)
AstroGrid (http://www.astrogrid.org/)
BioGrid (http://www.biogrid.jp/)
GEON (http://www.geongrid.org/)
OGSA-DAI (http://www.ogsadai.org.uk)
eDiaMoND (http://www.ediamond.ox.ac.uk/)
OGSA-WebDB (http://www.gtrc.aist.go.jp/dbgrid/)
FirstDig (http://www.epcc.ed.ac.uk/firstdig/)
GeneGrid (http://www.qub.ac.uk/escience/projects.phpgenegrid)
INWA (http://www.epcc.ed.ac.uk/)
myGrid (http://www.mygrid.org.uk/)
ODD-Genes (http://www.epcc.ed.ac.uk/oddgenes/)
IU RGRBench (http://www.cs.indiana.edu/plale/projects/RGR/OGSA-DAI.html)
28OGSA-DAI User Project classification
Physical Sciences
Biological Sciences
OGSA-DAI
Computer Sciences
Commercial Applications
29The FirstDIG Project
- The FirstDIG (First Data Investigation on the Grid) project deployed OGSA-DAI within the First South Yorkshire bus operational environment.
- First plc are the UK's largest public transport operator,
- Within their UK bus operations they have a huge range of data sources: vehicle mileage, fuel consumption, maintenance records, revenue, reliability, etc.
- A generic Grid Data Service Browser has been built and used to interrogate and combine data from OGSA-DAI enabled data sources to answer business questions posed by First South Yorkshire.
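The idea of interrogating separate data sources and combining the answers can be shown with a toy example. The sketch below does not use OGSA-DAI or its API; it stands in two independent in-memory SQLite databases for a mileage source and a fuel source, and every table name, column name and value is made up for illustration.

```python
import sqlite3


def make_source(ddl: str, insert_sql: str, rows: list) -> sqlite3.Connection:
    """Create one independent in-memory database holding a single table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn


def miles_per_litre(mileage_db: sqlite3.Connection,
                    fuel_db: sqlite3.Connection) -> dict[str, float]:
    """Query each source separately, then combine the results per vehicle."""
    miles = dict(mileage_db.execute(
        "SELECT vehicle, SUM(miles) FROM mileage GROUP BY vehicle"))
    litres = dict(fuel_db.execute(
        "SELECT vehicle, SUM(litres) FROM fuel GROUP BY vehicle"))
    # Only vehicles present in both sources, with non-zero fuel use, are reported.
    common = miles.keys() & litres.keys()
    return {v: miles[v] / litres[v] for v in common if litres[v]}


if __name__ == "__main__":
    # Made-up rows, for illustration only.
    mileage = make_source(
        "CREATE TABLE mileage (vehicle TEXT, miles REAL)",
        "INSERT INTO mileage VALUES (?, ?)",
        [("bus1", 100.0), ("bus1", 50.0), ("bus2", 80.0)],
    )
    fuel = make_source(
        "CREATE TABLE fuel (vehicle TEXT, litres REAL)",
        "INSERT INTO fuel VALUES (?, ?)",
        [("bus1", 30.0), ("bus3", 10.0)],
    )
    print(miles_per_litre(mileage, fuel))
```

The combination step happens outside either database, which is the situation a data-integration layer addresses when the sources are owned and hosted separately.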
30Other e-Science Projects
- Comb-e-Chem, a combinatorial chemistry application - http://www.combechem.org/
- The system allows students and researchers to virtually mix chemicals together and then try to identify the compounds they produce and the particular benefits these compounds may have.
- Chemistry, CS, Maths, and IT Innovation.
- DAME - Distributed Aircraft Maintenance Environment - http://www.cs.york.ac.uk/dame/
- Aims to produce sensors that measure temperature, vibration, and pressure of airplane engines as they fly from one location to another.
- Instead of waiting until a plane lands, sensor data will be sampled in flight and compared with existing patterns.
- If problems are detected, mechanics can replace the damaged or faulty engine parts as soon as the plane lands and before anything drastic occurs.
- Universities, Rolls Royce, Data Systems Solutions, and Cybula.
31Other e-Science Projects
- The Geodise project is a grid-enabled optimisation and design search program for engineers - http://www.geodise.org/
- The project will allow aerospace companies to speed up the design process of their vehicles by capturing knowledge from previous designs and putting it together for simulations.
- Universities, BAE Systems, Rolls-Royce and Fluent.
- Discovery Net - http://www.discovery-on-the.net/
- This project is producing high-throughput sensing applications such as environmental sensors and bioinformatic monitors.
- The aim is for doctors to someday be able to monitor the blood pressure, temperature, and drug intake of all their patients.
- A sensor on the patient's body will communicate the data through a mobile wireless communication device to the doctor's office.
- Universities, InforSense, deltaDOT, and HydroVenturi.
32Summary
- The e-Science programme has pump-primed the take-up of the Grid in the UK.
- The programme is perceived as being a great success: it has given the UK a lead in e-Science.
- It has not been without its problems; not least of these was the move to WSRF and the take-up of the various WS specifications.
- Output from the programme has led to a number of other projects that will address the current gaps in grid technologies.
- New funding (JISC) supports infrastructure by implementing and deploying the technologies (VREs).
- All the projects are collaborations between academia and industry.
33Some Further Work!
- Robust, reliable and inter-operable middleware that can scale to support a global infrastructure:
- UK OMII is meant to be hardening existing software.
- Funding for implementation and deployment, rather than just research:
- UK JISC for academia,
- UK DTI for commerce/industry.
- Security and trust mechanisms.
- Take-up of Semantic Web technologies to speed the automation of component interaction.
- Open source software and agreed standards:
- GGF, OASIS, EGA, IETF, W3C etc.
- Educational aspects:
- Undergraduate, graduate and other courses.
34Summary: Successful Grid Areas
- Distributed database integration: intelligent queries and data-mining across heterogeneous data sources.
- Parameter sweeps: run sequential tasks many times with different input data.
- Coupled simulations: the output of one simulation is the input of another.
- Distributed resources: sensors and equipment, processing, data silos, and visualisation at different remote sites.
- Application Service Provision: services on demand!
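Two of the patterns listed above, parameter sweeps and coupled simulations, are simple enough to sketch on a single machine. The example below uses a local thread pool as a stand-in for grid resources, and `simulate` is a hypothetical placeholder for a real sequential task; none of this is tied to a particular grid middleware.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def simulate(param: float) -> float:
    """Hypothetical stand-in for one sequential task with a single input."""
    return param * param


def parameter_sweep(task: Callable, inputs: Iterable, max_workers: int = 4) -> list:
    """Run the same task once per input; results keep the order of the inputs."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(task, inputs))


def coupled(first: Callable, second: Callable, value):
    """Coupled simulation: the output of the first run is the input of the second."""
    return second(first(value))


if __name__ == "__main__":
    print(parameter_sweep(simulate, [1, 2, 3]))
    print(coupled(simulate, lambda x: x + 1, 3))
```

On a grid, each call to the task would be submitted as an independent job to a remote resource; the sweep is attractive precisely because the runs do not depend on one another, whereas the coupled case imposes an ordering.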
35Acknowledgements and links
- Prof Ron Perrott and the Belfast e-Science Centre:
- http://www.qub.ac.uk/escience
- Prof Malcolm Atkinson, NeSC
- The OGSA-DAI Project Site:
- http://www.ogsadai.org.uk