Title: Grid Technology and The Latin American Grid
1Grid Technology (and The Latin American Grid)
- Jaime Seguel
- Department of Electrical and Computer Engineering
2Outline
- Cyber-infrastructure and Grids
- Overview of Grid Technologies
- Current and future Grid Technology research
- The Latin American Grid
3 Distributed Science and Technology
- Most scientific advancements and breakthroughs occur in isolation: they result from the knowledge, imagination and inspiration of a single scientist or a small group of scientists.
- Knowledge and expertise are distributed geographically and conceptually in specialties.
- Data sources are distributed geographically and in specialties.
- Computer systems are distributed geographically and systems-wise.
- The potential benefits of some degree of integration have been voiced in several forums.
4Cyber-infrastructure
- Term coined by NSF to describe ideal research environments in which Information Technology capabilities are made available to researchers in an interoperable computer network.
- Requires orchestrating intelligent searching, semantic integration, visualization of information, scientific databases and high performance computing.
- The expected result is an integrative environment for enabling transformative advances in research and education.
5Grid Computing: a cyber-infrastructure incarnation
- Grid Computing consists of a very specific set of technologies for uniting computing partners and systems in a virtual organization.
- Should not be confused with High-Performance, Distributed or peer-to-peer Computing.
- Grid computing is intended for inter-organizational computations.
- Grid Technology is still a work in progress. Current Grid systems fall short of the Grid goals set by the Global Grid Forum (GGF).
6Grid Technologies
- Grid technology is the software used to create Grids, not the network technology.
- A specific set of Grid technologies is the outgrowth of the Global Grid Forum (Globus, GSI, etc.).
- Grids are frequently purpose-built and normally built component by component in tight collaboration between scientists and IT researchers and implementers.
- Turn-key grid systems are not yet available.
7A Snapshot of the Evolution of Information Technology
- Internet - Communications backbone based on communications protocols (TCP/IP, FTP, HTTP, etc.)
- Web - Human-Systems interaction for document transfer (HTTP) and rendering (HTML, CGI, URI)
- Grid - Proposes Human-Systems and Systems-Systems interaction for collaborative science, commerce and education.
8 World Wide Grid: the ultimate cyber-infrastructure goal
- The WWW is a fine facility for presenting HTML pages to people and having them respond, but its core facilities are not adequate for robust computer-to-computer interactions.
- To address this deficiency, computer scientists began combining concepts from the Web, like structured documents and open standards, with traditional distributed computing concepts (like RPC and CORBA).
- What follows is a series of steps that complement the World Wide Web by providing structured data services that are consumed by software rather than by humans, and that would some day link software systems around the globe in a uniform manner: the World Wide Grid.
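One way to picture the combination of structured documents with RPC is XML-RPC, which ships in the Python standard library. The slides do not name XML-RPC; it is used here only as a convenient stand-in, and the remote procedure `add` is hypothetical. The sketch marshals a call into XML and parses it back, with no network involved.

```python
import xmlrpc.client

# Marshal a call to a hypothetical remote procedure add(2, 3) into an XML document.
request_xml = xmlrpc.client.dumps((2, 3), methodname="add")
print(request_xml)

# The receiving side (possibly written in another language) parses the document
# back into a method name and a tuple of parameters.
params, method = xmlrpc.client.loads(request_xml)

# The reply travels back as a structured document as well.
response_xml = xmlrpc.client.dumps((sum(params),), methodresponse=True)
(result,), _ = xmlrpc.client.loads(response_xml)
print(method, params, result)
```

The message is consumed by software rather than rendered for a person, which is the shift the slide describes.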
9Current Status (1) XML Protocols
- The foundation for the standards is XML, which describes data and behavior that can easily move between languages, platforms and hardware.
- The XML family set the stage for a primitive mechanism to perform distributed systems calls.
- Initial rules for leveraging XML and current Internet and Web standards were created (SOAP bindings for HTTP, SMTP).
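A minimal sketch of what a SOAP binding for HTTP amounts to: an XML envelope carried as the body of an HTTP POST. The envelope namespace is the SOAP 1.1 one; the service namespace, URL, operation name `SubmitJob` and its parameters are hypothetical and only for illustration. The request is built but never sent.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SVC_NS = "http://example.org/grid"  # hypothetical service namespace


def build_envelope(operation, params):
    """Wrap an operation and its parameters in a SOAP 1.1 envelope (bytes)."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SVC_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SVC_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


def bind_to_http(url, payload, action):
    """Build (but do not send) the HTTP POST that would carry the envelope."""
    return urllib.request.Request(
        url,
        data=payload,
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": f'"{action}"',
        },
        method="POST",
    )


payload = build_envelope("SubmitJob", {"executable": "/bin/hostname", "count": 4})
request = bind_to_http("http://example.org/grid/jobs", payload, f"{SVC_NS}/SubmitJob")
print(payload.decode("utf-8"))
```

The same envelope could in principle be carried over SMTP instead; only the transport wrapper would change.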
10Current Status (2) Web Services
- XML was well within reach since it had the same basic roots as HTML. However, creating a distributed computing infrastructure required a special set of protocols.
- A new set of protocols extended XML, adding some missing computer-to-computer features. Microsoft referred to them as GXA (Global XML Architecture), while IBM called it all Web Services.
- This new infrastructure for system-to-system communications serves as the foundation for the Grid model.
11Current Status (3) The Service Network
- As in the Internet, a series of participants have to play a role to make the network perform. Routers, gateways, directories, proxies: they are all needed, and not the kind that speak TCP/IP; they need to understand the new web service protocols. This network (sometimes called The Service Network) works above the Internet, yet relies on it at the most basic level.
12Evolution Summary
13Just a few examples
- NeuroGrid - Analysis of brain activity data gathered from the MEG instruments (Japan and Australia).
- Particle Physics Data Grid - Integrates experiment-specific applications, Grid technologies, and facility computation and storage resources to form effective end-to-end capabilities (ANL, BNL, Caltech, FLNL, Wisconsin, UCSD, and others)
- AstroGrid - UK scientific grid
14Current Grid software research
- A meta-data catalogue to find things - GGF provides one tailored to the application. It takes a predicate and returns a logical list of needed resources; a Replica Catalogue then takes that list and returns physical filenames.
- Planner - No GGF standard (no automation). Most implementations use a Directed Acyclic Graph.
- Scheduler - In GGF, a Scheduler moves files to compute hosts and works with an Executor to run jobs. Condor is a popular scheduler; there is also GRAM (Globus).
- Executor - Provides a means to run a job at the operating system level. GGF has no standard, but several are available.
- Data mover - Moves data between systems. Examples are GridFTP (Argonne National Laboratory), DataMover (Lawrence Berkeley National Laboratory), SABUL (U. Illinois), and tools like ftp and scp. Low-level data movers are often used by more sophisticated ones.
- Security - GGF only offers single-sign-on authentication. GGF offers the GSI-GridMap, a Kerberos-like, certificate-based authentication strategy. GGF does not address the subject of permissions - only authentication!
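The first two components in the list above can be sketched as a small pipeline: a metadata catalogue maps a predicate to logical names, a replica catalogue maps logical names to physical locations, and a planner orders workflow steps held in a Directed Acyclic Graph. This is an illustrative sketch only, not the interface of any GGF or Globus component; the catalogue contents, host names and step names are all hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical metadata catalogue: logical name -> descriptive attributes.
METADATA = {
    "run-001": {"instrument": "MEG", "year": 2005},
    "run-002": {"instrument": "MEG", "year": 2006},
    "run-003": {"instrument": "MRI", "year": 2006},
}

# Hypothetical replica catalogue: logical name -> physical copies.
REPLICAS = {
    "run-001": ["gsiftp://site-a.example.org/data/run-001.dat"],
    "run-002": [
        "gsiftp://site-a.example.org/data/run-002.dat",
        "gsiftp://site-b.example.org/data/run-002.dat",
    ],
    "run-003": ["gsiftp://site-b.example.org/data/run-003.dat"],
}


def find_logical(predicate):
    """Metadata catalogue step: predicate -> sorted list of logical names."""
    return sorted(name for name, attrs in METADATA.items() if predicate(attrs))


def resolve_replicas(logical_names):
    """Replica catalogue step: logical names -> physical file locations."""
    return {name: REPLICAS[name] for name in logical_names}


def plan(dependencies):
    """Planner step: order workflow steps so every step follows its prerequisites."""
    return list(TopologicalSorter(dependencies).static_order())


logical = find_logical(lambda attrs: attrs["instrument"] == "MEG")
physical = resolve_replicas(logical)

# Each key depends on the steps in its value set (a diamond-shaped DAG).
workflow = {
    "filter": {"stage_in"},
    "calibrate": {"stage_in"},
    "analyze": {"filter", "calibrate"},
    "stage_out": {"analyze"},
}
order = plan(workflow)
print(logical, physical, order, sep="\n")
```

The relative order of `filter` and `calibrate` is not fixed by the graph, which is exactly the freedom a scheduler can exploit to run them in parallel.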
15(Near) Future Grid Systems Research
- Coordination and Status - A consolidated view of the system
- Management - Grid-wide management tools
- Monitoring - Live tracking of processes/objects
- Error recovery and workspace cleanup - Policies and standards
- Process restartability
- Process management - Grid-wide awareness of processes
- Planning (workflow) - Grid-wide awareness and automation
- Saving of results - Policy, standards and automated transports
- Tracking - Recording of processing, object lineage and/or user activity
- Performance metrics
- Data (disk) cache
- Permissions and Object Security
16Latin American Grid (LA Grid)
17LA Grid Mission
Education
Research
Collaboration
Talent Development
18Why a Latin American Grid?
- The Hispanic minority group is the largest and fastest growing ethnic segment in the U.S. (currently 14%, growing to 25% by 2050).
- Hispanic participation in Computer Science and Engineering is disproportionately low at the Bachelor's (3.9%), Master's (1.3%) and PhD (1.0%) levels.
19Technology Model (I)
- HW Configuration
- 14 Node Blade Center
- IBM Subsystems
- Dual 32 port SAN switch
- Dual 10 GigE back base switches
- Back end UNIX servers (p/x Series)
- Leverage existing clusters at FIU and UPRM
20Technology Model (II)
- SW Platform
- Grid Aware
- - Middleware
- - Autonomic
- - Fault Tolerance
- - QOS
- Tools
- - Grid Toolkit
- - DB2
- - LoadLeveler
- - Eclipse
- - WebSphere
- - Linux
- - Rational Suites
21Organization (I)
- Advisory Board of High Level Executives
- Provide Business, Technology, Diversity Direction
- - Chair: Pete Martinez
- Nick Bowen
- Irving Wladawsky-Berger
- Ted Childs
22Organization (II)
- Governance Board
- Provide Operational Guidance
- - Chair: Pete Martinez
- Yi Deng, Steve Luis (FIU)
- Jaime Seguel (UPRM)
- Monterrey Tech
- Rosa Badía (Barcelona SC)
- Jean-Pierre Prost (IBM Watson)
23Current IBM Sponsored Projects Applications Layer
- Bioinformatics on Grid (IBM project leaders: Howard Ho and Wen-Syan Li)
- - 2 FTE (Full Time Employee) for IBM researchers, and
- - 80K, covering 4 summer student internships at Almaden Research
- Grid Enablement of Hurricane Mitigation (IBM project leader: Len Berman)
- - 40K for a full-time student from FIU.
- - This work will start in 2006, and it is expected to evolve into a PhD thesis
24Current IBM Sponsored Projects Systems Layer
- Autonomic Resource Management for Heterogeneous Grid Environments (IBM project leader: Vijay Naik)
- - 1 FTE for IBM Research.
- - FIU and UPRM students (4) and faculty (4)
- Meta-scheduling and Job Flow (IBM project leader: Liana Fong)
- - 1.75 FTE for IBM Research.
- - FIU and UPRM students (6) and faculty (6).
25The Ultimate Aim: Talent Development
Highest Potential
Top Talent