Title: DATA-CENTRIC COMPUTING, SCIENCE GATEWAYS, AND THE TERAGRID
1 DATA-CENTRIC COMPUTING, SCIENCE GATEWAYS, AND THE TERAGRID
Kurt A. Seiffert, seiffert_at_indiana.edu, http://rtinfo.indiana.edu/
April 2008
2 Presentation Outline
- What is the TeraGrid?
- Indiana University's data-centric computing focus
- HPSS
- Lustre
- Data collections
- Science Gateways
- Bringing it all together
3 What is the TeraGrid?
- An instrument (cyberinfrastructure) that delivers high-end IT resources: storage, computation, visualization, and data/service hosting. Almost all of these are UNIX-based under the covers; some are hidden by Web interfaces
- A data storage and management facility: over 20 Petabytes of storage (disk and tape), over 100 scientific data collections
- A computational facility: over 750 TFLOPS in parallel computing systems, and growing
- (Sometimes) an intuitive way to do very complex tasks, via Science Gateways, or get data via data services
- A service: help desk and consulting, Advanced Support for TeraGrid Applications (ASTA), education and training events and resources
- The largest individual cyberinfrastructure facility funded by the NSF, which supports the national science and engineering research community
- Allocated via peer review (and without double jeopardy)
4 TeraGrid: 11 Resource Partners, 1 Instrument
5 HPSS Configuration
[Diagram labels: IUB Subsystem; IUPUI Subsystem; HPSS Core Servers; Research Network (shown twice); FC SAN (shown twice)]
6 What's a Data Capacitor, Really?
- 12 pairs of Dell PowerEdge 2950 servers
- 2 x 3.0 GHz Dual Core Xeon
- Myrinet 10G Ethernet
- Dual port Qlogic 2432 HBA (4 x FC)
- 2.6 Kernel (RHEL 4)
- 6 DDN S2A9550 Controllers
- Over 2.4 GB/sec measured throughput each
- 535 Terabytes of spinning SATA disk
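The figures above invite a quick back-of-the-envelope check. The sketch below is not from the slides: it assumes the six controllers scale perfectly, treats "over 2.4 GB/sec" as a lower bound, and uses decimal units (1 TB = 1000 GB). The function names are made up for illustration.

```python
# Rough arithmetic on the Data Capacitor hardware list.
# Assumptions: perfect scaling across controllers, decimal units.
CONTROLLERS = 6          # DDN S2A9550 controllers
GB_PER_SEC_EACH = 2.4    # "over 2.4 GB/sec" each, so a lower bound
DISK_TB = 535            # spinning SATA disk


def aggregate_gb_per_sec(controllers: int, each: float) -> float:
    """Sum of per-controller throughput, assuming perfect scaling."""
    return controllers * each


def hours_to_stream(capacity_tb: float, gb_per_sec: float) -> float:
    """Hours needed to read the full capacity once at the given rate."""
    seconds = (capacity_tb * 1000) / gb_per_sec
    return seconds / 3600


total = aggregate_gb_per_sec(CONTROLLERS, GB_PER_SEC_EACH)
print(f"aggregate throughput: {total:.1f} GB/sec")
print(f"one full pass over the disk: {hours_to_stream(DISK_TB, total):.1f} hours")
```

Under these assumptions the aggregate comes to roughly 14.4 GB/sec, which would put one full pass over the disk at a little over ten hours. Real workloads would likely see less than this idealized rate.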
7 Bandwidth Challenge
- Annual event at the SC Conference in November
- This year's venue: Reno, Nevada
- This year's theme: Serving as a Model
- Can others do what you're doing?
- Criteria for Judging
- Did you fill a single 10 Gigabit connection?
- How are you supporting science?
- Did you use your production network?
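The first criterion comes down to a unit conversion: storage rates are quoted in gigabytes per second, network links in gigabits per second. A minimal sketch of that check follows. The helper names and the 7.5 Gbit/sec sample are hypothetical, not reported results.

```python
LINK_GBIT_PER_SEC = 10.0  # the single 10 Gigabit connection from the criteria


def gbyte_to_gbit(gb_per_sec: float) -> float:
    """Convert gigabytes/sec to gigabits/sec (8 bits per byte)."""
    return gb_per_sec * 8


def link_utilization(achieved_gbit: float,
                     link_gbit: float = LINK_GBIT_PER_SEC) -> float:
    """Fraction of the link that is filled, capped at 1.0."""
    if link_gbit <= 0:
        raise ValueError("link capacity must be positive")
    return min(achieved_gbit / link_gbit, 1.0)


# One storage controller at 2.4 GB/sec (figure from the Data Capacitor slide)
offered = gbyte_to_gbit(2.4)
print(f"offered: {offered:.1f} Gbit/sec, utilization: {link_utilization(offered):.0%}")

# Hypothetical measurement, for illustration only
print(f"sample run: {link_utilization(7.5):.0%} of the link")
```

On paper, a single controller offers more than the link can carry, so the hard part is presumably everything between the disk and the far end of the wide-area network.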
8 The Challenge: Five Applications Simultaneously
- Acquisition and Visualization
- Live Instrument Data
- Chemistry
- Rare Archival Material
- Humanities
- Acquisition, Analysis, and Visualization
- Trace Data
- Computer Science
- Simulation Data
- Life Science
- High Energy Physics
9 Bandwidth Challenge Configuration
10 Digitization of Sarvamoola Granthas
- Sarvamoola Granthas: the teachings of Shri Madhvacharya (1238-1317), a great Indian philosopher and proponent of Dvaita philosophy
- The Sarvamoola Granthas are a collection of works with commentaries on various important scriptures, such as the Vedas, Upanishads, Itihasas, Puranas, Tantras, and Prakaranas
- All of the original manuscripts of the Sarvamoola Granthas were incised on palm leaves
- Mathas, or monasteries
- Keepers of the palm leaf manuscripts
Shri Madhvacharya
11 Digitization of Sarvamoola Granthas
Post-processed images of the palm leaves
Sample images of a palm leaf of the Sarvamoola Granthas illustrating the performance of the image processing algorithms. (a) Stitched 8-bit grayscale image without normalization and contrast enhancement, (b) Final image after contrast enhancement
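The slides do not say which normalization or contrast enhancement algorithm was used on the palm leaf images. As an illustration of the general idea only, here is a plain min-max contrast stretch on an 8-bit grayscale image, which is an assumed stand-in and not the project's actual method. The function name and the sample patch are hypothetical.

```python
def stretch_contrast(image):
    """Linearly rescale 8-bit grayscale pixels to span the full 0..255 range.

    `image` is a list of rows, each row a list of integer pixel values.
    This is a simple min-max stretch, assumed here for illustration.
    """
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # uniform image: nothing to stretch
        return [row[:] for row in image]
    scale = 255 / (hi - lo)
    return [[round((p - lo) * scale) for p in row] for row in image]


# Tiny hypothetical low-contrast patch (values bunched between 100 and 140)
patch = [[100, 110, 120],
         [120, 130, 140]]
enhanced = stretch_contrast(patch)
print(enhanced)
```

After stretching, the darkest pixel maps to 0 and the brightest to 255, so faint incisions that differ by only a few gray levels become easier to tell apart.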
12 MutDB (www.mutdb.org)
13 Science Gateways
- A Science Gateway is a domain-specific computing environment, typically accessed via the Web, that provides a scientific community with end-to-end support for a particular scientific workflow
- Science Gateways are distinguished from Web portals (http://en.wikipedia.org/wiki/Web_portal) in that portals present information from diverse sources in a unified way.
- Hides complexity (pay no attention to the grid behind the curtain)
14 LEAD (http://portal.leadproject.org)
15 LEAD (portal.leadproject.org)
- Simple enough that an undergraduate can use it!
- The National Center for Supercomputing Applications (NCSA) and IU teamed up to support the WxChallenge weather forecast competition: 64 teams, 1,000 students, 16,000 CPU hours on Big Red
16 Purdue's NanoHUB (www.nanohub.org)
17 But you don't care: TeraGrid Architecture
[Diagram labels: RP 1; RP 2; RP 3; TeraGrid Infrastructure (Accounting, Network, Authorization, ...); Network, Accounting, ...; Compute Service]
18 Acknowledgements
- IU's involvement as a TeraGrid Resource Partner is supported in part by the National Science Foundation under Grants No. ACI-0338618l, OCI-0451237, OCI-0535258, and OCI-0504075.
- The IU Data Capacitor is supported in part by the National Science Foundation under Grant No. CNS-0521433.
- The Grid Infrastructure Group management of the TeraGrid, and Dane Skow's leadership thereof, is funded by NSF grant 0503697.
- Purdue's involvement as a TeraGrid Resource Partner is supported in part by the National Science Foundation under Grant No. OCI-050399.
- This research was supported in part by the Pervasive Technology Labs and the Indiana METACyt Initiative. Both Indiana University initiatives are supported by the Lilly Endowment, Inc.
- This work was supported in part by Shared University Research grants from IBM, Inc. to Indiana University.
- The LEAD portal is developed under the leadership of IU Professors Dr. Dennis Gannon and Dr. Beth Plale, and supported by NSF grant 331480. Marcus Christie and Suresh Marru of the Extreme! Computing Lab contributed the LEAD graphics.
- The ChemBioGrid Portal is developed under the leadership of IU Professor Dr. Geoffrey C. Fox and Dr. Marlon Pierce and funded via the Pervasive Technology Labs (supported by the Lilly Endowment, Inc.) and the National Institutes of Health grant P20 HG003894-01.
- Many of the ideas presented in this talk were developed under a Fulbright Senior Scholars award to Stewart, funded by the US Department of State and the Technische Universitaet Dresden.
- Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF), National Institutes of Health (NIH), Lilly Endowment, Inc., or any other funding agency.
- This work is made possible by the dedicated efforts of the expert staff of the Research Technologies Division of University Information Technology Services, the faculty and staff of the Pervasive Technology Labs, and the staff of UITS generally. Steve Simms, Erik Cornet, Mike Lowe, Scott Tiege, Michael Grobe, and Malinda Lingwall helped with this presentation.
- Thanks to the faculty and staff with whom we collaborate locally at IU and globally (within the US via the TeraGrid, and internationally via collaboration with Technische Universitaet Dresden).