Grid Computing at the University of Arkansas - PowerPoint PPT Presentation

Grid Computing at the University of Arkansas

Description:

Title: Project Goal Author: Compaq Last modified by: Amy Apon Created Date: 11/24/2002 2:38:08 PM Document presentation format: On-screen Show Company – PowerPoint PPT presentation

1
Grid Computing at the University of Arkansas
  • Amy Apon, Ph.D.
  • Faculty Seminar Series
  • March 29, 2004

2
What is Grid Computing
  • Grid computing is a way of organizing computing
    resources
  • So that they can be flexibly and dynamically
    allocated and accessed
  • Central processors, storage, network bandwidth,
    databases, applications, sensors and so on
  • The objective of grid computing is to share
    information and processing capacity so that it
    can be more efficiently exploited
  • Offer QoS guarantees (security, workflow and
    resource management, fail-over, problem
    determination, ...)

3
Elements of Grid Computing
  • Resource sharing
  • Computers, storage, sensors, networks, ...
  • Sharing is always conditional: issues of trust,
    policy, negotiation, payment, ...
  • Coordinated problem solving
  • Beyond client-server: distributed data analysis,
    computation, collaboration, ...
  • Dynamic, multi-institutional virtual
    organizations
  • Community overlays on classic org structures
  • Large or small, static or dynamic

4
Online Access to Scientific Instruments
  • Advanced Photon Source
  • Wide-area dissemination
  • Desktop and VR clients with shared controls
  • Real-time collection
  • Archival storage
  • Tomographic reconstruction
  • DOE X-ray grand challenge: ANL, USC/ISI, NIST,
    U.Chicago
5
Data Grids for High Energy Physics
Image courtesy Harvey Newman, Caltech
6
Network for Earthquake Engineering Simulation
  • NEESgrid: national infrastructure to couple
    earthquake engineers with experimental
    facilities, databases, computers, and each other
  • On-demand access to experiments, data streams,
    computing, archives, collaboration

NEESgrid: Argonne, Michigan, NCSA, UIUC, USC
7
Counterexample: Oracle Real Application Clusters
Architecture
A cool architecture, but not Grid
Users
Network
Centralized Management Console
No Single Point Of Failure
Low Latency Interconnect: VIA or Proprietary
High Speed Switch or Interconnect
Clustered Database Servers
Hub or Switch Fabric
Storage Area Network
Drive and Exploit Industry Advances in Clustering
Mirrored Disk Subsystem
8
Broader Context
  • Grid Computing has much in common with major
    industrial thrusts
  • Business-to-business, Peer-to-peer, Application
    Service Providers, Storage Service Providers,
    Distributed Computing, Internet Computing, Web
    Services, ...
  • Sharing issues not adequately addressed by
    existing technologies
  • Complicated requirements: "run program X at site
    Y subject to community policy P, providing access
    to data at Z according to policy Q"
  • High performance: unique demands of advanced
    high-performance systems

9
Brief Grid Computing History
  • First generation Grids (mid 1980s to 1990s)
  • Local "Metacomputers"
  • Basic services such as distributed file systems,
    site-wide single sign on
  • Custom distributed applications with custom
    communications protocols: everything custom
  • Gigabit test beds extended Grid across distance

10
Brief Grid Computing History
  • Second generation Grids (late 1990s to now)
  • Condor, I-WAY (the origin of Globus) and Legion
    (origin of Avaki)
  • Underlying software services and communications
    protocols are basis for developing distributed
    applications and services
  • Basic building blocks, but deployment involves
    significant customization and filling in lots of
    gaps.
  • No standards, no interoperability
  • Standards don't seem to matter (enough) to the
    developers: Europe is crazy over this stuff,
    Acxiom, ...

11
Brief Grid Computing History
  • Third Generation Grids (recent past and present)
  • Global Grid Forum begins working on standards,
    1999
  • Open Grid Services Architecture (OGSA) published
    in June, 2002
  • Open Grid Services Infrastructure (OGSI), Version
    1.0 published in July, 2003
  • Globus Toolkit 3 (GT3) first reference
    implementation of OGSI, available June, 2003,
    based on Java
  • Several other implementations have been developed
    in Python, Perl, Microsoft .NET, ...

13
Grid Services
  • Common interface specification supports the
    interoperability of discrete, independently
    developed services
  • Concept similar to Remote Procedure Call (RPC),
    Remote Method Invocation (RMI), only applied over
    HTTP
  • Based on extensions of Web Services

14
Web Services
15
Web Services Architecture
The Web Services Architecture is specified and
standardized by the World Wide Web Consortium,
the same organization responsible for XML, HTML,
CSS, etc.
16
Web Services missing features
  • At the time the OGSI V1.0 spec was published,
    there was a gap between the need to define
    stateful Web Services and what was provided by
    the latest version of Web Services: in WSDL 1.1,
    Web Services were stateless and non-transient
  • The result was the definition in OGSI of Service
    Data: a common mechanism to expose a service
    instance's state data for query, update, and
    change notification
  • Also, Grid Services uses a Factory to manage
    instances to allow transient and private
    instances
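
The two ideas on this slide, Service Data and the Factory, can be illustrated with a plain-Java sketch. It does not use OGSI or GT3 classes. The names ServiceDataSketch, CounterInstance, and Factory are hypothetical, chosen only for illustration. The sketch assumes a single integer as the exposed state, and it shows query, update, change notification, and per-client transient instances.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntConsumer;

public class ServiceDataSketch {

    /** A stateful instance; its state is reachable only through operations. */
    static class CounterInstance {
        private int value = 0;                                  // the exposed state
        private final List<IntConsumer> listeners = new ArrayList<>();

        int query() {                                           // query
            return value;
        }

        void update(int newValue) {                             // update, then notify
            value = newValue;
            for (IntConsumer listener : listeners) {
                listener.accept(newValue);
            }
        }

        void subscribe(IntConsumer listener) {                  // register for changes
            listeners.add(listener);
        }
    }

    /** Creates and destroys named instances, so each client can hold a private one. */
    static class Factory {
        private final Map<String, CounterInstance> instances = new HashMap<>();

        CounterInstance create(String name) {
            CounterInstance instance = new CounterInstance();
            instances.put(name, instance);
            return instance;
        }

        CounterInstance lookup(String name) {
            return instances.get(name);
        }

        void destroy(String name) {                             // transient lifetime
            instances.remove(name);
        }
    }

    public static void main(String[] args) {
        Factory factory = new Factory();
        CounterInstance mine = factory.create("alice");
        CounterInstance yours = factory.create("bob");

        mine.subscribe(v -> System.out.println("alice's instance changed to " + v));
        mine.update(7);

        System.out.println("alice=" + mine.query() + " bob=" + yours.query());
        factory.destroy("alice");
        System.out.println("alice still exists? " + (factory.lookup("alice") != null));
    }
}
```

Running main shows that updating one instance notifies its subscriber and leaves the other instance untouched, and that a destroyed instance can no longer be looked up.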

17
Grid Services Factory
18
Grid Services
  • The declared state of a service is accessed only
    through service operations that are defined as a
    part of the service interface
  • (For those who know JavaBeans, Service Data is
    similar to JavaBean properties)
  • I will show an example using GT3. Since GT3 uses
    Java, the whole example is in Java.

19
Grid Services Example Using GT3
  • Step 1: Define the Service interface using Java

    public interface Math {
        public void add(int a);
        public void subtract(int a);
        public int getValue();
    }

  • In this example there is a value and it can be
    modified via add or subtract, and can be accessed
    via getValue.
  • GT3 provides tools for converting the Java to WSDL

20
Step 2: Implement the Service

    public class MathImpl extends GridServiceImpl implements MathPortType {
        private int value = 0;

        public MathImpl() {
            super("Math Factory Service");
        }

        public void add(int a) throws RemoteException {
            value = value + a;
        }

        public void subtract(int a) throws RemoteException {
            value = value - a;
        }

        public int getValue() throws RemoteException {
            return value;
        }
    }

21
Step 3: Write the Deployment Descriptor using
Web Service Deployment Descriptor (WSDD) format

    <?xml version="1.0"?>
    <deployment name="defaultServerConfig"
        xmlns="http://xml.apache.org/axis/wsdd/"
        xmlns:java="http://xml.apache.org/axis/wsdd/providers/java">
      <service name="tutorial/core/factory/MathFactoryService"
          provider="Handler" style="wrapped">
        <parameter name="name" value="MathService Factory"/>
        <parameter name="instance-name" value="MathService Instance"/>
        <parameter name="instance-schemaPath"
            value="schema/gt3tutorial.core.factory/Math/MathService.wsdl"/>
        <parameter name="instance-baseClassName"
            value="gt3tutorial.core.factory.impl.MathImpl"/>
        <!-- Start common parameters -->
        <parameter name="allowedMethods" value="*"/>
        <parameter name="persistent" value="true"/>
        <parameter name="className" value="org.gridforum.ogsi.Factory"/>
        <parameter name="baseClassName"
            value="org.globus.ogsa.impl.ogsi.PersistentGridServiceImpl"/>
        <parameter name="schemaPath"
            value="schema/ogsi/ogsi_factory_service.wsdl"/>
        <parameter name="handlerClass"
            value="org.globus.ogsa.handlers.RPCURIProvider"/>
        <parameter name="factoryCallback"
            value="org.globus.ogsa.impl.ogsi.DynamicFactoryCallbackImpl"/>
        <parameter name="operationProviders"
            value="org.globus.ogsa.impl.ogsi.FactoryProvider"/>
      </service>
    </deployment>
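
Because the descriptor is ordinary XML, its settings can be inspected with the XML parser that ships with Java. The following is a minimal sketch, not part of GT3: the class name WsddParameters and the method readParameters are hypothetical, and the embedded descriptor is an abbreviated copy of the one on this slide that keeps only three of its parameters.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class WsddParameters {

    /** Returns the name/value pairs of every parameter element, in document order. */
    static Map<String, String> readParameters(String wsdd) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(wsdd)));
            NodeList nodes = doc.getElementsByTagName("parameter");
            Map<String, String> params = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element p = (Element) nodes.item(i);
                params.put(p.getAttribute("name"), p.getAttribute("value"));
            }
            return params;
        } catch (Exception e) {
            // Malformed XML ends up here; report it as a bad argument.
            throw new IllegalArgumentException("descriptor is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // Abbreviated copy of the slide's descriptor (assumption: three parameters suffice).
        String wsdd =
              "<?xml version=\"1.0\"?>"
            + "<deployment name=\"defaultServerConfig\""
            + " xmlns=\"http://xml.apache.org/axis/wsdd/\">"
            + "<service name=\"tutorial/core/factory/MathFactoryService\""
            + " provider=\"Handler\" style=\"wrapped\">"
            + "<parameter name=\"instance-name\" value=\"MathService Instance\"/>"
            + "<parameter name=\"instance-baseClassName\""
            + " value=\"gt3tutorial.core.factory.impl.MathImpl\"/>"
            + "<parameter name=\"persistent\" value=\"true\"/>"
            + "</service></deployment>";

        readParameters(wsdd).forEach(
            (name, value) -> System.out.println(name + " = " + value));
    }
}
```

A check like this catches a descriptor that is not well-formed before it is handed to the container, which is one of the easier deployment mistakes to rule out.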

22
Step 4: Compile and deploy the Service using ant
  • [aapon@kite tutorial]$ ./tutorial_build.sh
    gt3tutorial/core/factory/impl/Math.java
  • You can see the gar and jar files that ant creates
    from the source files.
  • [aapon@kite]$ newgrp globus
  • [aapon@kite]$ cd $GLOBUS_LOCATION
  • [aapon@kite]$ ant deploy
    -Dgar.name=/home/aapon/tutorial/build/lib/gt3tutorial.core.factory.Math.gar

23
Step 5: Write and compile the client

    import java.net.URL;

    public class MathClient {
        public static void main(String[] args) {
            try {
                // Get command-line arguments
                URL GSH = new java.net.URL(args[0]);
                int a = Integer.parseInt(args[1]);

                // Get a reference to the MathService instance
                MathServiceGridLocator myServiceLocator =
                    new MathServiceGridLocator();
                MathPortType myprog = myServiceLocator.getMathService(GSH);

                // Call remote method 'add'
                myprog.add(a);
                System.out.println("Added " + a);

                // Get current value through remote method 'getValue'
                int value = myprog.getValue();
                System.out.println("Current value " + value);
            } catch (Exception e) {
                // Report the failure instead of silently ignoring it
                e.printStackTrace();
            }
        }
    }

24
Step 6: Start the Service and execute the client
  • Start the Service
  • [aapon@kite]$ globus-start-container -p 8081
  • Create the service instance. This client does not
    create a new instance when it runs; thus, the
    instance needs to be created the first time.
  • [aapon@kite]$ ogsi-create-service
    http://localhost:8081/ogsa/services/tutorial/core/factory/MathFactoryService
    myprog
  • ogsi-create-service takes two arguments: the
    service handle (GSH) of the factory and the name
    of the instance we want to create.
  • Execute the client
  • [aapon@kite tutorial]$ java
    gt3tutorial.core.factory.client.MathClient
    http://localhost:8081/ogsa/services/tutorial/core/factory/MathFactoryService/myprog
    4
  • You will see the following result: Added 4
    Current value 4
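
In the two commands on this slide, the handle passed to the client is the factory's handle followed by the instance name. A small sketch of that composition follows. The class InstanceHandle and its method are hypothetical names, and the sketch only mirrors the pattern visible in this example; it is not a statement about how GT3 assigns handles in general.

```java
import java.net.URI;

public class InstanceHandle {

    /** Appends the instance name to the factory handle as one more path segment. */
    static URI instanceHandle(URI factoryHandle, String instanceName) {
        String base = factoryHandle.toString();
        if (!base.endsWith("/")) {
            base = base + "/";
        }
        return URI.create(base + instanceName);
    }

    public static void main(String[] args) {
        URI factory = URI.create(
            "http://localhost:8081/ogsa/services/tutorial/core/factory/MathFactoryService");
        // Prints the handle that the client in this example receives as its first argument.
        System.out.println(instanceHandle(factory, "myprog"));
    }
}
```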

25
Problems with GT3 and OGSI
  • I didn't tell you the whole story: there are a
    lot of environment variables, and a lot of setup
    is required!
  • You have to be very proficient at Java to use
    GT3.
  • Not only that, it is quite slow.
  • Oops, OGSI is not completely interoperable with
    Web Services!

26
Recent Changes to Grid Standards
  • Introduction of Web Services Resource Framework
    (WSRF), January, 2004
  • Web services vendors recognized the importance of
    OGSI concept but would not adopt OGSI as it was
    defined (summer 2003)
  • Globus Alliance teamed up with Web services
    architects and came up with WSRF
  • Add the ability to create, address, inspect,
    discover, and manage stateful resources

27
WSRF changes the terms slightly
  • WS-Resource (instead of Grid services)
  • The concepts are the same
  • Grid service has an identity, service data, and a
    lifetime management mechanism
  • WS-Resource has a name, resource properties, and
    a lifetime management mechanism
  • So, the tutorial is still relevant!

28
Distributed computing is complex
  • There are many advantages to working within a
    standard framework
  • Single sign-on
  • Remote deployment of executables
  • Computation management, data movement
  • Benefits of working with an international
    community of developers and users
  • A framework enables the definition of
    higher-level services

29
Higher Level Globus Toolkit Services
  • Data Services include
  • Replica Management
  • Base Services include
  • Managed Job Service
  • Index Service
  • Reliable FTP
  • Many documents define GT3 Security Services

30
Grid at the NSF
  • The NSF Middleware Initiative funded large
    portions of the Globus Toolkit development
  • Next NSF Middleware Solicitation, May 14, 2004
  • Requires familiarity with emerging standards for
    Grid computing

31
UofA Grid Computing Possibilities
  • CLEANER Proposal, submitted January, 2004
  • "Cyberinfrastructure for Environmental Water
    Resource Management: Building on the Resources of
    the Great Plains Network," Planning Grant
    Proposal
  • Ralph Davis, UA Geosciences, is PI, large
    collaboration
  • The vision is a grid computing system that allows
    various users access at multiple levels, and
    facilitates large-scale database management,
    rapid processing of image data, and seamless
    integration of complex, dispersed data sets and
    model applications.
  • Real-time measurement, accurate modeling,
    effective decision support systems

32
UofA Grid Computing Possibilities
  • Recent Acxiom proposal: "Self-Regulation of the
    Acxiom Grid Environment"
  • Computational chemistry exploit 10,000 computers
    to screen 100,000 compounds in an hour
  • DNA computational scientists visualize, annotate,
    analyze terabyte simulation datasets
  • Environmental scientists share volcanic activity
    sensing data that has been collected from a
    widely dispersed sensor grid

33
UofA Grid for Sharing Digital Map Data
  • GeoStor: digital map data delivery system
  • http://www.cast.uark.edu/cast/geostor/
  • Contains all publicly available geographic data
    for the state of Arkansas
  • Oracle database is used for access to metadata
    and some maps

34
UofA Grid for Sharing Digital Map Data
  • GeoSurf
  • A Java based product
  • User queries and downloads data from GeoStor
  • User specifies geographic clip boundaries,
    projection, data format
  • Current system asks the user to submit an email
    address for processing; an online link is then
    emailed to the user
  • Could be a Grid service

35
UofA Grid Education Efforts
  • D. Thompson, A. Apon, Y. Yara, J. Mache, and R.
    Deaton, "Training a Grid Workforce," Oklahoma
    Supercomputing Symposium, Sept, 2004.
  • A. Apon, J. Mache, Y. Yara, K. Landrus,
    "Classroom Exercises for Grid Services," Linux
    Cluster Institute Conference on High Performance
    Computing, May, 2004 (to appear).
  • CSCE 490 Cluster and Grid Computing, Fall, 2004

36
More Grid Education Efforts
  • International Workshop on Grid Education,
    http://csce.uark.edu/aapon/grid.edu2004, A. Apon
    and J. Mache, Co-Chairs, Chicago, April, 2004.
  • I would like to see the development of ARCHIE:
    ARkansas Center for High End Computing
  • Modeled after the Oklahoma Supercomputing Center
    for Education and Research (OSCER)
  • Focal point for research collaboration and
    education in all aspects of high-end computing,
    including high-performance parallel, cluster, and
    grid computing

37
Training a Grid Workforce
Dale R. Thompson, Amy Apon, Yuriko Yara, Jens
Mache, and Russell Deaton, University of
Arkansas and Lewis & Clark College
Motivation: The growing capability of Grids as
viable compute resource brokers is largely
responsible for their acceptance beyond the
traditional high performance computing (HPC)
research community. As applications become more
grid-enabled, the business community is expected
to increase significantly its investments in Grid
Computing over the next decade. These increasing
demands for Grid Computing, coupled with
continued advancements in middleware and
networking technologies, have raised concerns
about the availability of a qualified workforce
to build and use the Grid.
Diagram of Experimental System
  • What is the Grid?
  • The Grid allows sharing of computational and
    data resources across diverse platforms and
    different organizations.
  • The Grid uses the Internet for worldwide
    communications.
  • The Grid provides worldwide computation.

Four levels of Grid
  • Approach
  • An experiment to connect two institutions on the
    grid
  • Ingredients
  • One undergraduate from Lewis and Clark College
  • One graduate student from the University of
    Arkansas
  • Funding to support the students from Lewis &
    Clark College and the University of Arkansas
  • Four eager faculty not on summer support
  • Several PCs and some Fast Ethernet switches
  • One month of time
  • Free software packages
  • Encouragement
  • Goal: connect the two institutions using Grid
    protocols, and run a Grid service and an MPI
    program across the Grid in one month.
  • Result: a much better understanding of the
    knowledge and training required to build and use
    the Grid!

Future Grid workers must have knowledge in
several specialized areas
  • Knowledge of Linux Administration
  • the ability to configure the network
  • the location of various configuration files
  • software installation and setting up permissions
  • the use of tools such as ifconfig,
    iptables/ipchains for firewall configuration,
    rpm, make, and ant.
  • Knowledge of Order for Installing Software
  • ANT
  • jakarta-oro
  • jdk
  • bison
  • Junit
  • Globus 3
  • SimpleCA
  • CA and add trust
  • Request Host certificate
  • Sign user certificates
  • Ability and Knowledge for Solving Common
    Difficulties
  • Establishing trust at the host level and the user
    level
  • Synchronizing clocks: clocks should be within 5
    minutes of each other, or else a "handshake
    failed" error is obtained
  • Configuring firewall to permit Globus port
    numbers
  • Software Packages Used
  • The Globus 3.0 Toolkit (version 3.0.1)
  • Java 2 Platform Standard Edition (j2sdk-1.4.2)
  • OSCAR (version 2.2.1)
  • MPICH-G2 (version 1.2.5-1)
  • RedHat Linux 7.3, with default configurations
  • At the time of this work, these tools were the
    latest releases that would interoperate with each
    other.
  • Activities that Build Knowledge
  • Install Linux on a PC
  • Install the Globus Toolkit
  • Build and use a Grid Service
  • Build a cluster using PCs and OSCAR
  • Run a simple MPI program across the Grid
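
The clock-synchronization difficulty listed above comes down to a tolerance check on two clock readings. A minimal sketch follows, assuming the 5-minute limit quoted in the list. The class ClockSkewCheck is a hypothetical name, and how the remote host's time is obtained is left out of scope.

```java
import java.time.Duration;
import java.time.Instant;

public class ClockSkewCheck {

    // Tolerance taken from the list above: clocks should agree to within 5 minutes.
    static final Duration MAX_SKEW = Duration.ofMinutes(5);

    /** True when the two clock readings differ by no more than MAX_SKEW. */
    static boolean withinTolerance(Instant local, Instant remote) {
        Duration skew = Duration.between(local, remote).abs();
        return skew.compareTo(MAX_SKEW) <= 0;
    }

    public static void main(String[] args) {
        Instant local = Instant.parse("2004-03-29T12:00:00Z");
        Instant closeEnough = Instant.parse("2004-03-29T12:04:00Z"); // 4 minutes ahead
        Instant tooFar = Instant.parse("2004-03-29T11:53:00Z");      // 7 minutes behind

        System.out.println(withinTolerance(local, closeEnough));
        System.out.println(withinTolerance(local, tooFar));
    }
}
```

Taking the absolute value of the difference makes the check symmetric, so it does not matter which of the two hosts is ahead.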

User-level Initialization Instructions
  • Initialize environment
      cd $GLOBUS_LOCATION
      source setenv.sh
      $GLOBUS_LOCATION/etc/globus-user-env.sh
  • Request user certificate
      grid-cert-request
  • Sign user certificate (you must be administrator
    or CA)
      grid-ca-sign -in usercert_request.pem -out usercert.pem
  • Start a valid proxy
      grid-proxy-init
  • Delete a proxy
      grid-proxy-destroy
A website is being developed for Grid training
and access to Grid training activities and
materials at http://gotgrid.uark.edu/
38
Questions and discussion
  • Opportunities for collaboration
  • New interfaces for accessing large shared data
    repositories
  • New protocols for managing widely distributed
    data, or aspects of data sharing
  • Various application possibilities