Title of Presentation Here - PowerPoint PPT Presentation

About This Presentation
Title:

Title of Presentation Here

Description:

Shuttle @ NewEgg.com. Sun HPC10000. Cray Y-MP C916. System. 2005. 1998. 1991 ... 8 Opteron, 20 Gflops workstation/cluster for O($10,000) ...

Slides: 16
Provided by: spea64

Transcript and Presenter's Notes

1
(No Transcript)
2
Future of Scientific Computing
  • Marvin Theimer
  • Software Architect
  • Windows Server High Performance Computing Group
  • Microsoft Corporation

3
Supercomputing Goes Personal
4
Molecular Biologists Workstation
  • High-end workstation with internal cluster nodes
  • 8 Opteron, 20 Gflops workstation/cluster for
    O($10,000)
  • Turn-key system purchased from a standard OEM
  • Pre-installed set of bioinformatics applications
  • Run interactive workstation applications that
    offload computationally intensive tasks to
    attached cluster nodes
  • Run workflows consisting of visualization and
    analysis programs that process the outputs of
    simulations running on attached cluster nodes

5
The Future: Supercomputing on a Chip
  • IBM Cell processor
  • 256 Gflops today
  • 4-node personal cluster: 1 Tflops
  • 32-node personal cluster: Top100
  • Intel many-core chips
  • 100s of cores on a chip in 2015 (Justin
    Rattner, Intel)
  • 4 cores/Tflop, so 25 Tflops/chip for 100 cores
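
The arithmetic behind these bullets can be checked with a short sketch. The per-chip figures are the slide's own; the helper names, the assumption of one Cell processor per node, and the reading of "100s of cores" as exactly 100 are illustrative assumptions, and all numbers are peak rates rather than measured performance.

```python
# Figures quoted on the slide.
CELL_GFLOPS = 256        # IBM Cell processor peak, per the slide
CORES_PER_TFLOP = 4      # Intel many-core projection, per the slide


def cluster_tflops(nodes: int, gflops_per_node: float = CELL_GFLOPS) -> float:
    """Peak Tflops of a cluster, assuming one processor per node (hypothetical helper)."""
    return nodes * gflops_per_node / 1000.0


def chip_tflops(cores: int, cores_per_tflop: float = CORES_PER_TFLOP) -> float:
    """Peak Tflops of a many-core chip at the given cores-per-Tflop ratio."""
    return cores / cores_per_tflop


if __name__ == "__main__":
    print("4-node personal cluster :", cluster_tflops(4), "Tflops")
    print("32-node personal cluster:", cluster_tflops(32), "Tflops")
    # 100 cores is an assumed reading of "100s of cores".
    print("100-core chip           :", chip_tflops(100), "Tflops")
```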

6
The Continuing Trend Towards Decentralized, Dedicated Resources
Grids of personal & departmental clusters
Personal workstations & departmental servers
Minicomputers
Mainframes
7
The Evolving Nature of HPC
8
Exploding Data Sizes
  • Experimental data: TBs → PBs
  • Modeling data
  • Today
  • 10s to 100s of GB per simulation is the common
    case
  • Applications mostly run in isolation
  • Tomorrow
  • 10s to 100s of TBs, all of it to be archived
  • Whole-system modeling and multi-application
    workflows

9
How Do You Move A Terabyte?
Material courtesy of Jim Gray
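
The figures from this slide did not survive in the transcript, but the underlying question reduces to dividing data size by link bandwidth. The sketch below is a minimal illustration of that calculation, not a reconstruction of Jim Gray's material: the link speeds, the names, and the choice of 1 TB = 10^12 bytes are all assumptions, and protocol overhead is ignored.

```python
# 1 TB taken as 10^12 bytes (assumption), expressed in bits.
TERABYTE_BITS = 8 * 10**12

# Illustrative link speeds in megabits per second (assumed, not from the slide).
LINKS_MBPS = {
    "1.5 Mbps line": 1.5,
    "100 Mbps LAN": 100.0,
    "1 Gbps LAN": 1000.0,
}


def transfer_hours(mbps: float, size_bits: int = TERABYTE_BITS) -> float:
    """Hours needed to move size_bits over a link sustaining mbps megabits/s."""
    if mbps <= 0:
        raise ValueError("link speed must be positive")
    seconds = size_bits / (mbps * 1_000_000)
    return seconds / 3600.0


if __name__ == "__main__":
    for name, mbps in LINKS_MBPS.items():
        hours = transfer_hours(mbps)
        print(f"{name:>14}: {hours:10.1f} hours ({hours / 24:7.1f} days)")
```

Under these assumptions a terabyte takes about a day at a sustained 100 Mbps and roughly two months at 1.5 Mbps, which is why shipping physical media can be competitive for wide-area transfers.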
10
Anticipated HPC Grid Topology
  • Islands of high connectivity
  • Simulations done on personal & workgroup
    clusters
  • Data stored in data warehouses
  • Data analysis best done inside the data
    warehouse
  • Wide-area data sharing/replication via FedEx?

Personal cluster
Workgroup cluster
Data warehouse
11
Data Analysis and Mining
  • Traditional approach
  • Keep data in flat files
  • Write C or Perl programs to compute specific
    analysis queries
  • Problems with this approach
  • Imposes significant development times
  • Scientists must reinvent DB indexing and query
    technologies
  • Have to copy the data from the file system to the
    compute cluster for every query
  • Results from the astronomy community
  • Relational databases can yield speed-ups of one
    to two orders of magnitude
  • SQL and application/domain-specific stored
    procedures greatly simplify creation of analysis
    queries
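
The contrast between the two approaches can be sketched in a few lines. This is a toy illustration only: it uses Python's built-in sqlite3 module as a stand-in for a relational data warehouse, and the table name, column names, and sample rows are hypothetical. It shows the difference in how a query is expressed, not the speed-ups reported by the astronomy community.

```python
import sqlite3

# Hypothetical catalog rows: (object_id, ra_deg, dec_deg, magnitude).
ROWS = [
    (1, 10.5, -5.0, 17.2),
    (2, 185.0, 2.1, 21.4),
    (3, 185.3, 2.4, 18.9),
    (4, 186.1, 1.9, 19.5),
    (5, 300.0, 45.0, 16.0),
]


def flat_file_query(rows, ra_min, ra_max, max_mag):
    """Traditional approach: a hand-written loop that scans every record."""
    return [r[0] for r in rows if ra_min <= r[1] <= ra_max and r[3] <= max_mag]


def build_warehouse(rows):
    """Load the rows into an in-memory database and index the search column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE objects (id INTEGER PRIMARY KEY, ra_deg REAL, dec_deg REAL, mag REAL)"
    )
    conn.executemany("INSERT INTO objects VALUES (?, ?, ?, ?)", rows)
    # The database maintains the index; the scientist does not reinvent it.
    conn.execute("CREATE INDEX idx_objects_ra ON objects (ra_deg)")
    conn.commit()
    return conn


def sql_query(conn, ra_min, ra_max, max_mag):
    """Database approach: the same question stated declaratively."""
    cur = conn.execute(
        "SELECT id FROM objects WHERE ra_deg BETWEEN ? AND ? AND mag <= ? ORDER BY id",
        (ra_min, ra_max, max_mag),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    conn = build_warehouse(ROWS)
    print(flat_file_query(ROWS, 184.0, 187.0, 20.0))
    print(sql_query(conn, 184.0, 187.0, 20.0))
    conn.close()
```

Both functions return the same object ids; the difference is that the SQL version runs where the data lives and leaves indexing and query planning to the database.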

12
Is That the End of the Story?
Personal cluster
Relational Data warehouse
Workgroup cluster
13
Too Much Complexity
2004 NAS supercomputing report: O(35) new
computational scientists graduated per year
  • Parallel application development
  • Chip-level, node-level, cluster-level, LAN
    grid-level, WAN grid-level parallelism
  • OpenMP, MPI, HPF, Global Arrays, ...
  • Component architectures
  • Performance & configuration tuning
  • Debugging/profiling/tracing/analysis
  • Digital experimentation
  • Experiment management
  • Provenance (data workflows)
  • Version management (data workflows)

Domain science
  • Distributed systems issues
  • Security
  • System management
  • Directory services
  • Storage management

Personal cluster
Relational Data warehouse
Workgroup cluster
14
Separating the Domain Scientist from the Computer Scientist
Computer scientist
Parallel/distributed file systems, relational
data warehouses, dynamic systems management,
Web Services & HPC grids
Concrete concurrency
Concrete workflow
Abstract concurrency
Computational scientist
Parallel domain application development
Abstract workflow
Domain scientist
(Interactive) scientific workflow, integrated
with collaboration-enhanced office automation
tools
Example
Write scientific paper (Word)
Record experiment data (Excel)
Individual experiment run (Workflow orchestrator)
Analyze data (SQL Server)
Share paper with co-authors (SharePoint)
Collaborate with co-authors (NetMeeting)
15
Scientific Information Worker: Past and Future
  • Past
  • Buy lab equipment
  • Keep lab notebook
  • Run experiments by hand
  • Assemble & analyze data (using stat pkg)
  • Collaborate by phone/email; write up results with
    LaTeX
  • Metaphor
  • Physical experimentation
  • Do it yourself
  • Lots of disparate systems/pieces
  • Future
  • Buy hardware & software
  • Automatic provenance
  • Workflow with 3rd party domain packages
  • Excel & Access/SQL Server
  • Office tool suite with collaboration support
  • Metaphor
  • Digital experimentation
  • Turn-key desktop supercomputer
  • Single integrated system