Title: Development & Implementation of an Inter-institutional Multi-purpose Grid
1 Development & Implementation of an
Inter-institutional Multi-purpose Grid
SURAgrid, 11/22/05
UNC-Charlotte Grid Computing, ITSC 4010-001
- Mary Fran Yafchak, SURA
- Jim Jokl, University of Virginia
- Art Vandenberg, Georgia State University
2 Presentation agenda
- About SURAgrid - Mary Fran Yafchak
- SURAgrid build/portal - MF Yafchak
- SURAgrid authN/authZ - Jim Jokl
- SURAgrid applications - Art Vandenberg
- Q&A - All
- This is a living, breathing project. Exchange of
ideas encouraged!
3 About SURAgrid
- A beyond-regional initiative in support of SURA
regional strategy
- Mini-About SURA
- SURA region: 16 states & DC, Delaware to Texas
- SURA membership: 62 SE research universities
- SURA mission: foster excellence in scientific
research, strengthen capabilities, provide
training opportunities
- Evolved from the NMI Testbed Grid project, part
of the NMI Integration Testbed Program
- http://www1.sura.org/3000/NMI-Testbed.html
4 SURAgrid Goals
- SURAgrid: organizations collaborating to bring
grids to the level of seamless, shared
infrastructure
- Goals
- To develop grid infrastructure that is scalable
and that leverages local identity and
authorization while managing access to shared
resources
- To promote use of this infrastructure for the
broad research and education community
- To provide a forum for participants to share
experience with grid technology, and participate
in collaborative project development
5 SURAgrid Participants
Resources on grid SURA member Project planning
- University of Alabama at Birmingham
- University of Alabama in Huntsville
- University of Arkansas
- University of Florida
- George Mason University
- Georgia State University
- Great Plains Network
- University of Kentucky
- University of Louisiana at Lafayette
- Louisiana State University
- University of Michigan
- Mississippi Center for SuperComputing Research
- University of North Carolina, Charlotte
- North Carolina State University
- Old Dominion University
- University of South Carolina
- University of Southern California
- Southeastern Universities Research Association
(SURA)
- Texas A&M University
- Texas Advanced Computing Center (TACC)
- Texas Tech University
- Tulane University
- Vanderbilt University
- University of Virginia
6 Focus Areas
- Authentication & Authorization
- Themes: maintain local autonomy, leverage
enterprise infrastructure
- Grid-Building
- Themes: heterogeneity, flexibility,
interoperability, scalability
- Application Development
- Themes: immediate benefit to applications,
applications drive development
- Project Planning
- Themes: cooperative, representative, sustainable
7 In the Coming Months
- Continue evolving key areas
- Grow and solidify grid infrastructure
- Continue expanding and exploring authN/authZ
- Identify & grid-enable new applications
- Formal work on organizational definition
- Charter, membership, policies, governance
- Develop funding & collaboration opportunities
- Some areas of interest: scalable mechanisms for
shared, dynamic access; interoperability in grid
products; grid-enabling applications; grids for
education; broadening participation; support and
management of large-scale grid operations
8 (Ashok Adiga, Texas Advanced Computing Ctr.)
Building SURAgrid & SURAgrid portal
9 SURAgrid Software Requirements
- SURAgrid supports dedicated & non-dedicated
compute nodes
- Non-dedicated nodes are typically shared across
multiple grids
- Could have constraints on the software that can
be installed
- Must allow resource owner to set usage policies
- Dedicated nodes run only SURAgrid jobs
- Common software stack being defined for dedicated
nodes
- Will consider using packaged Grid solutions
- Virtual Data Toolkit (VDT)
- NSF Middleware Initiative (NMI Grids)
10 Configuring Non-dedicated nodes
- Non-dedicated nodes support basic grid services
- Document simple process to add resources to the
grid
- Job & data management
- Install Globus (pre-web services GRAM & gridftp)
- Authentication
- Cross sign CA certificates with Bridge CA
- Work with individual resource owners to get
authorized
- Resource monitoring
- Install GPIR perl provider scripts on resource
and add resource description to User Portal
11 SURAgrid Resource Status
- Number of Compute Clusters: 14
- Total number of CPUs: 611
- Peak GigaFlops: 1,367
- Memory (GigaBytes): 621
- Storage (GigaBytes): 5,645
12 Motivation for User Portals
- Make joining the SURAgrid easier for users
- Single place for users to find user information
and get user support
- Certain information can be displayed better in a
web page than in a command shell
- Allow novice users to start using grid resources
securely through a Web interface
- Increase productivity of SURAgrid researchers:
do more science!
13 What is a Grid User Portal?
- In general: a gateway to a set of distributed
services accessible from a Web browser
- Provides
- Aggregation of different services as a set of Web
pages
- Single URL
- Single Sign-On
- Personalization
- Customization
14 Characteristics of a User Portal
- A User Portal can include the following services
- Documentation Services
- Notification Services
- User Support Services
- Allocations
- Accounts
- Training
- Consulting
15 User Portal Characteristics (cont'd)
- Collaborative Services
- Calendar
- Chat
- Resource sharing
- Information Services
- Resource
- Grid-wide
- Interactive Services
- Manage Jobs & Data
- Doesn't replace the command shell but provides a
simpler, alternative interface
16 Service Aggregation
[Diagram: the Client Browser connects to the User Portal over HTTP/SSL; the User Portal reaches the aggregated services over HTTP/SSL/SOAP and GSI]
- User Support: Consulting
- Notification: User News
- Collaborative: Calendar, Chat
- Documentation: User Guides
- Information: Resource, Grid
- Interactive: Job Submission, File Transfer
17 Portal Built Using GridPort 4
- Developed at TACC & San Diego State
- Interface to grid technologies
- GRAM, GridFTP, MyProxy, WSRF, science
applications
- Includes
- Portal framework-independent portlets
- Expose backend services as customizable web
interfaces
- Small changes allow portlets to run in any
JSR-168 compliant portal framework (e.g.,
uPortal, WebSphere, Jetspeed; installs into
Gridsphere by default)
- Portal services
- Run in the same web container as portlets
- Provide portlet cohesion and portal framework
level support
18
- Single sign-on to access all grid resources
- Documentation tab has details on
- Adding resources to the grid
- Setting up user ids and uploading proxy
certificates
19 Information Services
- Resource-level view
- State information about individual resources
- Queue, Status, Load, OS Version, Uptime,
Software, etc.
- Grid-level view
- Grid-wide network performance
- Aggregated capability
- GPIR information Web Service
- Collects and provides information above
20 Resource Monitoring
http://gridportal.sura.org
21 Interactive Services
- Security
- Hidden from the user as much as possible
- File Management
- Upload
- Download
- Transfer between resources
- Job Submission to a single resource
- Job Submission to a grid meta-scheduler (future)
- Composite Job Sequencing (future)
22 Proxy Management
- Upload proxy certificates to MyProxy server
- Portal provides support for selecting a proxy
certificate to be used in a user session
23 File Management
- List directories, Move files between grid
resources, Upload/download files from local
machine
24 Job Management
- Submit jobs for execution on remote grid
resources
- Check status of, cancel and delete submitted jobs
25 Future Directions
- User Portal currently offers basic user,
informational and interactive services.
- Build on other services such as user support
- Need to expand services as grid grows
- Resource broker to automatically select resource
for job execution
- Workflow support for automation and better
utilization of grid resources
- Reliable file transfer services
- Build customized application portlets
26 Jim Jokl, University of Virginia
SURAgrid authN/authZ
27 SURAgrid Authentication
- Goal
- Develop a scalable inter-campus solution
- Preferred mechanisms
- Leverage campus middleware activities
- Researchers should not need to operate their own
authentication systems
- Use local campus credentials inter-institutionally
- Rely on existing higher education
inter-institutional authentication efforts
28 Inter-campus Globus Authentication
- Globus uses PKI credentials for authentication
- Leverage native campus PKI credentials on
SURAgrid
- Users do all of their work using local campus PKI
credentials
- How do we create the inter-campus trust fabric?
- Standard inter-campus PKI trust mechanisms
include
- Operating a single Grid CA or trusting other
campus CAs
- Cross-certification and Bridge PKIs
- How well does Globus operate in a bridged PKI?
- OpenSSL PKI in Globus is not bridge-aware
- Known to work from NMI Testbed project
- Decision: intercampus trust based on a PKI Bridge
- Leverage EDUCAUSE Higher Education Bridge CA
(HEBCA) when ready
29 Background: Cross-certification
- Top section
- Traditional hierarchical validation example
- Bottom section
- Validation using cross certification example
- UVA signed a certificate request from the UAB CA
- UAB signed a certificate request from the UVA CA
- This pair of cross certificates enables each
school to trust certs from the other using only
their own root as a trust anchor
- An n^2 problem
[Figure: certificates labeled by issuer (I) and subject (S)]
Top section (hierarchical):
I: UAB, S: UAB | I: UVA, S: UVA
I: UVA, S: User-1 | I: UAB, S: User-2
Bottom section (cross-certified):
I: UAB, S: UAB | I: UVA, S: UVA
Cross certs: I: UAB, S: UVA | I: UVA, S: UAB
I: UVA, S: User-1 | I: UAB, S: User-2
30 Background: Bridged PKI
- Used to enable trust between multiple
hierarchical CAs
- Generally more infrastructure than just the
cross-certificate pairs
- Typically involves strong policy practices
- Solves the n^2 problem
- For SURAgrid we preload cross-certs
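A quick way to see why a bridge addresses the n^2 problem: with n campus CAs, direct cross-certification needs one cross-certificate pair for every pair of campuses, while a bridge needs one pair per campus. The sketch below only counts pairs; the function names are illustrative and are not part of any SURAgrid tooling.

```python
def mesh_cross_cert_pairs(n: int) -> int:
    """Cross-cert pairs when every CA cross-certifies directly with every other CA."""
    return n * (n - 1) // 2


def bridge_cross_cert_pairs(n: int) -> int:
    """Cross-cert pairs when every CA cross-certifies only with the bridge CA."""
    return n


# Each pair consists of two certificates (one signed in each direction).
for sites in (4, 8, 16):
    print(sites, mesh_cross_cert_pairs(sites), bridge_cross_cert_pairs(sites))
```

For 8 sites this gives 28 direct pairs versus 8 pairs through a bridge, and the gap widens as sites are added.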
31 SURAgrid Authentication Schematic
[Diagram: the SURAgrid Bridge CA sits at the center and is joined by cross-cert pairs to each campus PKI (A's PKI through F's PKI); each campus PKI serves its own campus grid (Campus A Grid through Campus F Grid)]
32 SURAgrid Authentication Status
- SURAgrid Bridge CA
- Off-line system
- Used Linux and OpenSSL to build bridge
- Cross-certifications with the bridge complete or
in progress for 8 SURAgrid sites
- Several more planned in near future
- SURAgrid Bridge Web Site
- Interesting PKI issues discussed in paper
33 Higher Education Bridge Certification Authority
(HEBCA)
- A project of EDUCAUSE
- Implement a bridge for higher education based on
the Federal PKI bridge model
- Support both campus PKIs and sector hierarchical
PKIs
- Cross-certify with the Federal bridge (and others
as appropriate)
- Should form an excellent permanent trust fabric
for a bridge-based Grid
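34 Model: SURAgrid Authentication
[Diagram: same layout as the SURAgrid Authentication Schematic, with HEBCA at the center in place of the SURAgrid Bridge CA; cross-cert pairs join HEBCA to each campus PKI (A's PKI through F's PKI), and each campus PKI serves its own campus grid (Campus A Grid through Campus F Grid)]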
35 Bridge to Bridge Context
- A federal view on how the inter-bridge
environment is likely to develop
- FBCA: Federal Bridge
- SAFE: Pharmaceutical
- HEBCA: Higher Ed
- Commercial - aerospace and defense
- Grid extensible across PKI bridges?
36 SURAgrid AuthN/AuthZ Status
- Bridge CA and cross-certification process
- Forms the basic AuthN infrastructure
- Builds a trust fabric that enables each site to
trust the certificates issued by the other sites
- The grid-mapfile
- Controls the basic (binary) AuthZ process
- Sites add certificate Subject DNs from remote
sites to their grid-mapfile based on email from
SURAgrid sites
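The manual step described above (adding a remote user's certificate Subject DN to the local grid-mapfile) can be sketched in Python. This is a minimal sketch, assuming the common one-mapping-per-line layout of a quoted DN followed by a single local account name; the DNs and account names are made up for illustration, and a real site would also want locking and backups before rewriting the file.

```python
import shlex


def parse_grid_mapfile(text: str) -> dict[str, str]:
    """Parse lines of the form: "<subject DN>" <local account>."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honors the quotes around the DN
        if len(parts) == 2:
            dn, account = parts
            mapping[dn] = account
    return mapping


def add_mapping(text: str, dn: str, account: str) -> str:
    """Return the mapfile text with one more DN, unless the DN is already present."""
    if dn in parse_grid_mapfile(text):
        return text
    separator = "" if (not text or text.endswith("\n")) else "\n"
    return f'{text}{separator}"{dn}" {account}\n'


# Hypothetical entries, for illustration only.
current = '"/C=US/O=Example University/CN=Jane Doe" uva_jdoe\n'
updated = add_mapping(current, "/C=US/O=Example College/CN=John Roe", "uab_jroe")
print(updated, end="")
```

Because the check is binary (a DN is either listed or not), this matches the "basic AuthZ" role the slide gives the grid-mapfile.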
37 SURAgrid AuthZ Development
- Grid-mapfile automation
- Sites that use a recent version of Globus will
use an LDAP callout that replaces the grid-mapfile
- For other sites there will be some software that
provides and updates a grid-mapfile for their
gatekeeper
38 SURAgrid AuthZ Development
- LDAP AuthZ Directory
- Web interface for site administrators to add and
remove their SURAgrid users
- Directory holds and coordinates
- Certificate Subject DN
- Unix login name (prefixed by school initials)
- Allocated Unix UID (high numbers)
- Some Unix GIDs? (high numbers)
- Perhaps SSH public key, perhaps gsissh only
- Other (tbd)
- Reliability
- Replication to sites that want local copies
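To make the directory contents concrete, here is a minimal Python sketch of one user record and of how a site without the LDAP callout might render it as a grid-mapfile line. Everything specific is an assumption for illustration: the field names, the underscore between school initials and login name, and the UID base of 60000 are not taken from the SURAgrid schema.

```python
from dataclasses import dataclass

UID_BASE = 60000  # hypothetical start of the "high numbers" UID range


@dataclass(frozen=True)
class GridUser:
    subject_dn: str   # certificate Subject DN
    school: str       # school initials, e.g. "uva"
    local_name: str   # login name at the home campus
    uid: int          # allocated Unix UID

    @property
    def login(self) -> str:
        """Unix login name prefixed by school initials (separator is assumed)."""
        return f"{self.school.lower()}_{self.local_name}"

    def grid_mapfile_line(self) -> str:
        """Render this user as a grid-mapfile entry."""
        return f'"{self.subject_dn}" {self.login}'


def allocate_uid(used: set[int], base: int = UID_BASE) -> int:
    """Return the lowest UID at or above base that is not already taken."""
    uid = base
    while uid in used:
        uid += 1
    return uid


# Hypothetical user, for illustration only.
user = GridUser("/C=US/O=Example University/CN=Jane Doe", "UVA", "jdoe",
                allocate_uid({60000, 60001}))
print(user.uid, user.grid_mapfile_line())
```

Coordinating login prefixes and UIDs in one place is what lets the same user map to a consistent account across sites.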
39 SURAgrid AuthZ Development
- Sites contributing non-dedicated resources to
SURAgrid greatly complicate the equation
- We will provide a code template for editing
grid-mapfiles to manage SURAgrid users
- Publish our LDAP schema
- Sites may query LDAP to implement their own
SURAgrid AuthZ/AuthN interface
40 Likely SURAgrid AuthZ Directions and Research
- User directory or directory access
- Group management
- Person attributes
- VO names
- Store per-person, per-group allocations
- Integrate with accounting
- Local and remote stop-lists
- Resource directory
- Hold resource usage policies
- Time of day, classifications, etc
- Mapping users to resources within resource policy
constraints
- We'll learn a lot more about what is actually
required as we work with the early user groups
41 Art Vandenberg, Georgia State University
Applications on SURAgrid
42 SURAgrid Applications
- Need applications to inform and drive development
- Want to be of immediate service to real
applications
- Believe in grids as infrastructure
- but not "if you build it they will come"
- Identifying & Fostering Applications
43 Proposed Application Process
- Continuing survey of applications
- Catalog of Grid Applications; similar agency and
partner databases; survey of SURA membership
- Identify target applications
- Regional significance, multi-institutional,
intersection with other e-Science
- Illustrating grid benefits
- Test it
- Globus, authN-Z/BridgeCA, compilers, portal and
more
- Implementation options
- 1) Immediate deployment
- 2) Demonstration deployment opportunities
- 3) Combined with proposal development
44 Catalog of Grid Applications
- http://art11.gsu.edu:8080/grid_cat/index5.jsp
- Researchers of grid, grid potential applications
- Initial intent just to see who's doing what
- Potentially larger resource (collaboration,
regional perspective, overall trends)
- 20 sites, 475 researchers
- Current focus
- Automated maintenance
- Improved search, browse
45 Identify an Applications Base
- Build from application activities already
underway in SURAgrid
- Integrate with regional strategy (SURA HPC-Grid
Initiatives Planning Group)
- Apply additional resources
- Seeking additional collaboration, external
funding
- Achieve critical mass
- Seek FUNDING
46 SURAgrid Applications
- SCOOP/ADCIRC (UNC, RENCI, MCNC, SCOOP partners,
SURAgrid partners)
- Multiple Genome Alignment (GSU, UAB, UVA)
- ENDYNE (TTU)
- Task Farming (LSU)
- Data Mining on the Grid (UAH)
- BLAST (UAB)
- and more
47 SCOOP/ADCIRC - UNC, RENCI, MCNC, SCOOP Partners,
SURAgrid Participants
- SURA program to create infrastructure for
distributed Integrated Ocean Observing System
(IOOS) in the southeast
- Shared means for acquisition of observational
data
- Enables modeling, analysis and delivery of
real-time data
- SCOOP will serve as a model for national effort
- http://www1.sura.org/3000/3300_Coastal.html
- SCOOP/ADCIRC: forecast storm surge
- resource selection (query MDS)
- build package (application data)
- send package to resource (gridftp)
- run adcirc in mpi mode (globus rsl qsub)
- retrieve results from resource (gridftp)
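The send / run / retrieve steps above can be sketched as a small Python planner that builds the command lines without executing them. This is a sketch under stated assumptions: it uses the `globus-url-copy` and `globusrun` command-line clients with a pre-web-services RSL string, assumes a PBS jobmanager contact of the form `host/jobmanager-pbs`, and uses a made-up host name and paths. Resource selection (the MDS query) and building or unpacking the package are left out.

```python
import posixpath


def build_mpi_rsl(executable: str, directory: str, cpus: int) -> str:
    """Build a pre-web-services GRAM RSL string for an MPI job."""
    return (
        f"&(executable={executable})"
        f"(directory={directory})"
        f"(count={cpus})"
        "(jobtype=mpi)"
    )


def plan_run(host: str, remote_dir: str, package: str, local_dir: str,
             executable: str, cpus: int, results: list[str]) -> list[list[str]]:
    """Return the argument lists for staging in, submitting, and staging out."""
    name = posixpath.basename(package)
    stage_in = ["globus-url-copy", f"file://{package}",
                f"gsiftp://{host}{remote_dir}/{name}"]
    submit = ["globusrun", "-r", f"{host}/jobmanager-pbs",
              build_mpi_rsl(executable, remote_dir, cpus)]
    stage_out = [["globus-url-copy", f"gsiftp://{host}{remote_dir}/{r}",
                  f"file://{local_dir}/{r}"] for r in results]
    return [stage_in, submit, *stage_out]


# Hypothetical host and paths, for illustration only.
for step in plan_run("grid.example.edu", "/home/howard/run",
                     "/tmp/scoop/bundle.tar", "/tmp/scoop",
                     "/home/howard/run/padcirc.x", 8, ["fort.63", "fort.64"]):
    print(" ".join(step))
```

Keeping the plan as data (lists of arguments) makes it easy to log or review each step before handing it to a process runner.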
48 SCOOP/ADCIRC
49 SCOOP/ADCIRC
- Results: SURAgrid U. Kentucky (CCS-UKY, 48 CPU/230
Gflops/48G RAM, 500G Disk)
-rwx------ 1 howard howard   1458444 Sep 14 13:39 adcirc.x
-rwx------ 1 howard howard        12 Sep 14 13:39 adcpost.inp
-rwx------ 1 howard howard    843813 Sep 14 13:39 adcpost.x
-rw------- 1 howard howard        29 Sep 14 13:39 adcprep.inp
-rwx------ 1 howard howard   1150926 Sep 14 13:39 adcprep.x
-rwx------ 1 howard howard       915 Sep 14 13:39 execute_parallel_bundle.sh
-rwx------ 1 howard howard   3042520 Sep 14 13:39 fort.14
-rw------- 1 howard howard     64545 Sep 14 13:39 fort.15
-rw------- 1 howard howard  19804050 Sep 14 13:39 fort.22
-rw-rw-r-- 1 howard howard   1444457 Sep 14 16:17 fort.61
-rw-rw-r-- 1 howard howard    202457 Sep 14 16:17 fort.62
-rw-rw-r-- 1 howard howard 105626297 Sep 14 16:18 fort.63
-rw-rw-r-- 1 howard howard 169753697 Sep 14 16:19 fort.64
-rw------- 1 howard howard   1257568 Sep 14 13:39 fort.68
-rw-rw-r-- 1 howard howard   1326004 Sep 14 13:40 fort.80
-rw------- 1 howard howard   3940266 Sep 14 13:40 metis_graph.txt
-rwx------ 1 howard howard   1802370 Sep 14 13:39 padcirc.x
-rw-rw-r-- 1 howard howard       403 Sep 14 13:39 pbs_sub-howard
Results stored in fort.61 - 64
50 SCOOP/ADCIRC - Challenges
- resource selection (query MDS)
- Expect MDS to be hosted on resource being
queried. CCS-UKY actually pointed to NCSA for
their MDS; needed to implement MDS on CCS-UKY as
well (essentially CCS-UKY part of multiple MDS)
- build package (application data)
- Must address incompatibility between GT3 and GT2
style proxies; must use -old option to GT3's
grid-proxy-init to get GT2 style proxy which
ADCIRC currently expects
- send package to resource (gridftp)
- Staff availability
- run adcirc in mpi mode (globus rsl qsub)
- retrieve results from resource (gridftp)
51 Multiple Genome Alignment - GSU, UAB, U. Virginia,
U. Southern CA, TACC
- Demoed March 2005 SURA IT Comm (used BridgeCA)
- SMP cluster UAB grid SURAgrid
- Iteratively advance understanding (algorithm, UAB
grid, Bridge CA, multiple clusters, SURAgrid
portal)
- USC baseline testing Mar-Jun 2005
- TACC Bandera: MPI running, submit to Portal in
process
52 ENDYNE - Texas Tech
- Run on SURAgrid, September 2005
- Electron Nuclear Dynamics simulations
- Trajectory calculation in quantum phase space
- Using grid enables real-time solutions
53 Task Farming - Louisiana State U.
- Demo: Nov SC2004; Mar 2005 - SURA IT Comm
(BridgeCA)
- Pluggable components to use different
technologies
- Application independent: no need to recompile
- Grid enabled, supports task scheduling
- HTTP interface: monitor progress, steer
individual TFM
54 Data Mining - U. Alabama at Huntsville
- Linked Environments for Atmospheric Discovery
(LEAD)
- NSF Program
- Grid-based cyber-infrastructure
- Real time, on-demand and dynamically-adaptive
- Mesoscale weather research
- Vastly disparate high volume, high bandwidth data
- Tremendous computational demand for models and
data assimilation
55 BLAST - U. Alabama at Birmingham
- Nearing SURAgrid deployment
- Database search application for protein and
nucleotide sequences
- Globus job staging, submission, retrieval
- ncbiBLAST for computation
- Pubcookie initial login, myproxy grid login
- Simplified web interface
- Sequence database pre-staged on nodes
56 Funded applications
- EnLIGHTened Computing - MCNC, RENCI, LSU, Cisco,
AT&T, SURA, Naval Research Lab
- Funded project to develop advanced toolkits, Grid
middleware and underlying optical control plane
technologies. Provide awareness of Grid
environment; applications have dynamic, adaptive
and optimized use of networks connecting high end
resources.
- UCoMS Reservoir Simulation via Task Farming - CCT
at LSU, Petroleum Department at LSU, CASCS at
ULL, and CS Department at SUBR
- Ubiquitous Computing and Monitoring System. DOE
funded project addresses key research issues for
technical solutions in the areas of wireless
networked systems, grid computing, and
application software.
- The workflow of a typical reservoir simulation
includes geostatistical modeling, reservoir
simulation, and result analysis.
57 Potential apps & grants
- NSF CRI proposal pending (SURAgrid team)
- Improved 2D gel statistical analysis
- Dr. Alan Shih, ME, Dr. Sreelatha Meleth, Dept.
Med., U. Alabama at Birmingham
- Asynchronous iterative algorithms
- Dr. Jim Browne, CS, U. Texas at Austin
- Protein structure prediction
- Dr. Yi Pan, CS, Dr. Robert Harrison, CS/Biol,
Georgia State U.
- Configurable grid testbed
- Dr. Ashok Adiga, Dr. Warren Smith, Texas Advanced
Computing Center
- Application discovery and knowledge management
- Dr. Vijay Vaishnavi, CIS/CS, Art Vandenberg,
Georgia State U.
58 Potential applications
- Turbomachinery Flow Field Simulation (Dr. Alan
Shih, UAB)
- Computational fluid dynamics (Tulane, TACC, UAB)
59 More Potential applications
- Bioportal Phylip (PHYLogeny Inference Package) -
RENCI
- Bioportal application that includes a variety of
tools for determining the phylogenetic
relationship between sets of related nucleic acid
and protein sequences.
- GeoScience Grid - George Mason University
- Grid platform for supporting research,
development, and operational needs of spatial
computing infrastructure focusing on GeoScience
interoperability
60 Applications drive infrastructure
- Contributed nodes
- Defining software stack (evolving)
- Bridge CA
- Portal
- Policy, meta-scheduling
- CHALLENGE
- Pragmatic Managed
- Experimental Production
61 Challenges
- Essentially persistent application use
- Meeting broad objectives of SURAgrid in context
of size & diversity of SURA
- Busy people, multiple priorities, tight resources
- Application implementation template
- Collaboration is key
62 SURAgrid Summary
- Fulfilling SURA mission to foster excellence in
scientific research, strengthen capabilities,
provide training opportunities
- Evolving beyond regional initiative
- Growing infrastructure, moving to production
- Identifying applications & participants
- Collaborative research activities
- AuthN/Z, Portals, Applications, Metascheduling
- Grid middleware services
- Funding opportunities
63 Additional questions or comments?
- For more information
- http://www1.sura.org/SURAgrid.html