Title: Dr. Frederica Darema
1 Research and Technology Advances in Systems Software for Emerging Computer Systems
EDGE 2006 Workshop
Dr. Frederica Darema, Senior Science and Technology Advisor, NSF
2 Outline
- The BIG PICTURE
- Applications Directions
- Computing Platforms Directions
- Research and Technology Directions
- Examples of some advances
- Future Challenges and Opportunities
3 Science, Engineering, and Commercial Applications Environments: how are they shaping up in the future?
- What does it entail for Multicore Processors
- and
- for Computing in the Larger Scale?
4 Applications Directions
Past
- Computation Intensive
- Batch
- Hours/days
- Mostly monolithic
- Mostly one programming language
Present / Future
- Computation Intensive
- Data Intensive
- Real Time
- Few Minutes/hours
- Visualization
- Interactive Steering
- Integrated Simulations and Experiments: Dynamic Data Driven Applications Systems
- Multi-Modular
- Multi-Language
- Multi-Developers
- Multi-Source Data
5 Platforms Directions
Past
- Vector Processors
- SIMD MPPs
- Distributed Memory MPs
- Shared Memory MPs
Present/Future
- Distributed Platforms, Heterogeneous Computers and Networks
- Heterogeneity
- architecture (computer and network)
- node power (supernodes, MCP)
- Latencies
- variable (internode, intranode)
- Bandwidths
- different for different links
- different based on traffic
(Figure labels: GiBs; Grids; Petaflops Platform (Grid-in-a-Box); Distributed Platform; MPP; NOW; SP)
6 EXAMPLE OF EMERGING DIRECTIONS: Dynamic Data Driven Application Systems (DDDAS)
New direction for applications/simulations and measurement methodology
Multi-agency DDDAS program: NSF, NIH, NOAA, in cooperation with the EU/IST e-Sciences Programs (www.cise.nsf.gov/dddas)
7 What is DDDAS (Symbiotic Measurement and Simulation Systems)
Simulations (Math. Modeling, Phenomenology, Observation Modeling, Design)
Theory (First Principles)
Simulations (Math. Modeling, Phenomenology)
Theory (First Principles)
Experiment, Measurements, Field-Data (on-line/archival), User
Measurements, Experiments, Field-Data, User
Dynamic Feedback Control Loop
Challenges: Application Simulations Development, Algorithms, Measurement Instruments Interfaces, Computing Systems Support
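The dynamic feedback control loop named on this slide can be illustrated with a toy example. This is only a sketch under stated assumptions, not a method from the talk: the function `run_dddas_loop`, the nudging update, and the interval-halving rule are all hypothetical choices made for illustration. Measurements are injected into the running simulation, and the simulation in turn steers how often the instrument is sampled.

```python
def run_dddas_loop(truth, steps=50, gain=0.5, tolerance=0.1, max_interval=8):
    """Toy symbiotic measurement/simulation loop (illustrative assumption).

    truth: callable step -> value, standing in for a real instrument.
    Returns the final simulation estimate and the number of measurements taken.
    """
    estimate = 0.0                 # state held by the "simulation"
    interval = max_interval        # steps between measurements, steered below
    next_sample = 0
    measurements_taken = 0

    for step in range(steps):
        if step != next_sample:
            continue               # simulation runs without new data
        # Data -> simulation: assimilate the measurement by simple nudging.
        observed = truth(step)
        error = observed - estimate
        estimate += gain * error
        measurements_taken += 1
        # Simulation -> measurement: sample more often while the model is
        # far from the data, less often once they agree.
        if abs(error) > tolerance:
            interval = max(1, interval // 2)
        else:
            interval = min(max_interval, interval * 2)
        next_sample = step + interval

    return estimate, measurements_taken


if __name__ == "__main__":
    final_estimate, taken = run_dddas_loop(lambda step: 10.0)
    print(f"estimate={final_estimate:.4f} after {taken} measurements")
```

The point of the sketch is the two-way coupling: the loop takes far fewer measurements than simulation steps, yet converges because sampling is concentrated where the model disagrees with the data.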
8 Beyond Grid Computing: Extended Grid
The Application Platform is the computational and measurement system
Applications
Archival/ Stored Data
Computational Platforms
Instruments
Sensors
Measurements
Computational Grids
9 LEAD: Users INTERACTING with Weather
Interaction Level II: Tools and People Driving Observing Systems, Dynamic Adaptation
NWS National Static Observations Grids
Virtual/Digital Resources and Services
ADaM
ADAS
Mesoscale Weather
Tools
Remote Physical (Grid) Resources
Local Physical Resources
Local Observations
11 Examples of Computational Reqs: examples from DDDAS applications; results often needed in Real-Time or near-RT
- Water Pollution/Contaminant Transport/Detection
- Today's problem: 500 nodes, 4.4 Pflops, 1.2 GB mem, .02 GB/s ->
- Large/Projected problem: 10,000 nodes, 212 Pflops, 10.2 GB mem, .9 GB/sec
- Chemical Pollution/Contaminant Transport/Detection
- Today's problem: 2000 nodes (Lemieux), 4 TB mem, 5 hrs ->
- Large/Projected problem: 10K nodes, 20 TB mem, 1 hr
- results in Real-Time (or near RT): 50-100K nodes
- Protein Folding
- Today's problem: 1024 nodes (IBM BlueGene/L), 6/7 days for 1 protein (w/ 150 amino acids)
- Electric Power Grid
- Today's problem: 100 Gflops, 50 MB mem
- Aircraft modeling
- Today's problem: Full FEM/CFD, 384,000 cpu-hrs, 320 GB mem
- ROM: 72 secs, 78 KB
- Fire Propagation
- Today's problem: FireModel, 100 procs (BG, Teragrid clusters), 30 GB mem, 1hr-5hrs
- Coupled Weather/Fire: 100-1000 nodes, 200-400 GB mem
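As a quick check on how these requirements scale, the Chemical Pollution figures above can be compared in node-hours and memory per node. The helper names below are hypothetical, and the conversion of 1 TB = 1000 GB is an assumption; the slide does not say which convention it uses.

```python
def node_hours(nodes, hours):
    """Total machine time consumed, in node-hours."""
    return nodes * hours


def mem_per_node_gb(total_tb, nodes, gb_per_tb=1000):
    """Average memory per node in GB (assumes decimal terabytes)."""
    return total_tb * gb_per_tb / nodes


# Figures taken from the Chemical Pollution entries on this slide.
today = {"nodes": 2000, "hours": 5, "mem_tb": 4}
projected = {"nodes": 10_000, "hours": 1, "mem_tb": 20}

if __name__ == "__main__":
    for label, case in (("today", today), ("projected", projected)):
        print(
            label,
            node_hours(case["nodes"], case["hours"]), "node-hours,",
            mem_per_node_gb(case["mem_tb"], case["nodes"]), "GB per node",
        )
```

Under these assumptions both cases come to the same node-hours and the same memory per node, so the projected case asks for a five times larger problem solved five times faster on five times the nodes, which is a scalability requirement rather than a per-node one.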
12 So
- Processing at multiple levels
- Computation and data processing both at the application and the instruments/sensors side
- Multicores in high-end platforms, workstations, visualization servers, data servers, etc.
- Multicores at application side
- Multicores at the data collection side
- MULTICORES EVERYWHERE!!!
13 Some Challenges
- Programmability
- models of concurrency, multiple heterogeneous models
- optimized performance
- application
- system
- scalability across multiple levels
- application algorithms
- systems software
- fault tolerance, recovery, reliability, security
- power management
- verification, validation
- ...
- These challenges have been articulated for years on the past and present platforms
- Multicores add to the complexity of all the above
14 The need for a holistic approach
- Large-Scale Systems do not entail only flops (Giga-, Tera-, Peta-, Zetta-, ...)
- Large-scale parallel systems are the POWERFUL nodes/platforms, in balance with other resources in the system
- Analogy: the stars and the galaxy within the cosmos
- Methods and Tools are needed at all levels, and they need to work together synergistically
15 Large-Scale Systems (e.g. Enabling DDDAS)
Applications Modeling Measurements
Dynamic Data-Driven Application Systems: Symbiotic Measurement and Simulation Systems
Systems Software (NGS 1998-2004) (CSR/AES and SMA, 2004 to date)
Dynamic Compilers and Application Composition
Performance Engineering
16 System Modeling and Analysis (SMA) (a component of the Computer Systems Research Program) (CSR Program)
- Develop methods and tools for modeling, measuring, analyzing, evaluating, and predicting the performance and dependability of complex computing and communications systems, taking a system-level view
- Topics of Interest
- Hardware and Software modeling
- methods, tools, and measurements providing multimodal, hierarchical, or multilevel modeling and analysis capabilities for such systems
- methods that describe components of the system, but also the system as a whole, and enable assessment of the effects of individual hardware and software layers and components of these systems
- ability to describe the system in multiple levels of detail (characteristics and time-scales)
- combine different methods of describing components and layers
17 System Modeling and Analysis (SMA)
- Topics of Interest (contd)
- Novel modeling and measurement approaches
- Develop capabilities to describe, analyze, and predict the behavior of the components as well as the systems; analysis and prediction of the effects of changes in the application, system software, and hardware; multilevel and multi-modal approaches
- Performance Frameworks
- combine tools in plug-and-play fashion
- multiple views of the system
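One way to picture a performance framework that combines models in plug-and-play fashion is sketched below. Everything here is a hypothetical illustration: the class `PerformanceFramework`, the layer names, and the rate and latency constants are assumptions, and summing per-layer times assumes the layers do not overlap in time.

```python
from typing import Callable, Dict, Tuple

# A layer model maps a workload description to predicted seconds.
Model = Callable[[dict], float]


class PerformanceFramework:
    """Registry of per-layer models that can be swapped independently."""

    def __init__(self) -> None:
        self._models: Dict[str, Model] = {}

    def plug(self, layer: str, model: Model) -> None:
        """Install or replace the model used for one layer."""
        self._models[layer] = model

    def predict(self, workload: dict) -> Tuple[Dict[str, float], float]:
        """Return the per-layer view and the whole-system total."""
        per_layer = {name: m(workload) for name, m in self._models.items()}
        return per_layer, sum(per_layer.values())


def cpu_model(w: dict) -> float:
    return w["flops"] / 1e9  # assumed 1 Gflop/s sustained rate


def network_model(w: dict) -> float:
    # assumed 100 microsecond latency per message and 1 GB/s bandwidth
    return w["messages"] * 1e-4 + w["bytes"] / 1e9


if __name__ == "__main__":
    fw = PerformanceFramework()
    fw.plug("cpu", cpu_model)
    fw.plug("network", network_model)
    work = {"flops": 2e9, "messages": 1000, "bytes": 1e8}
    print(fw.predict(work))
    # Swap the analytic network model for a (hypothetical) measured value.
    fw.plug("network", lambda w: 0.5)
    print(fw.predict(work))
```

Replacing one layer's model leaves the others untouched, which is the property the plug-and-play bullet asks for; the per-layer dictionary gives the component view and the total gives the system view.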
18 Multiple views of the system: the applications view
(Diagram labels: Distributed Applications; Collaboration Environments; Visualization; Authentication/Authorization; Scalable I/O; Data Management; Archiving/Retrieval; Dependability Services; Other Services; Distributed Systems Management; OS; Scheduler; IO/File; Distributed, Heterogeneous, Dynamic, Adaptive Computing Platforms and Networks; Architecture/Network; Memory; Device; CPU; Technology; Models)
19 Advanced Execution Systems (AES) (a component of the Computer Systems Research Program) (CSR Program)
- Seeks to create systems software to facilitate the development and runtime support of complex applications executing on large, heterogeneous high-end computing and grid platforms
- AES emphasizes runtime compiling systems and application composition systems that interface with the underlying operating systems services and incorporate systems modeling and analysis methods and tools.
- Topics of Interest
- Novel Compiler Technology that goes beyond the standard static notion of a compiler
- for example by embedding a portion of the compiler in the runtime and endowing the system with resource awareness and adaptive mapping capabilities
- new compiler techniques for determining functional and data dependencies across multiple levels of memory hierarchy and across platforms
- mechanisms for matching an application's resource needs to underlying resources when both are changing as the application executes
20 Advanced Execution Systems (AES)
- Topics of Interest
- Programming models and tools
- expressing application partitioning across distributed, heterogeneous computing platforms; application-level checkpointing and recovery
- Application composition system (ACS) technology
- constructing applications to fit the available resources and to adapt to changes in the underlying execution environment
- methods for automatically selecting application components
- creating knowledge bases for application components; interfacing with the underlying computing platform models to determine suitable application components
- and developing appropriate application component libraries and interfaces so the run-time portion of the RCS can link to such libraries.
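The idea of automatically selecting application components from a knowledge base, given a model of the platform, can be sketched as follows. This is an illustrative assumption rather than the ACS design: the `Component` record, the two solver entries, and the Amdahl-style cost model are all hypothetical stand-ins for whatever a real knowledge base and platform model would provide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Component:
    """One knowledge-base entry describing an interchangeable component."""
    name: str
    min_cores: int
    min_mem_gb: float
    work: float               # abstract work units
    parallel_fraction: float  # share of the work that scales with cores


def predicted_time(c: Component, cores: int) -> float:
    """Amdahl-style estimate (an assumed, deliberately simple platform model)."""
    serial = c.work * (1.0 - c.parallel_fraction)
    parallel = c.work * c.parallel_fraction / cores
    return serial + parallel


def select_component(
    candidates: List[Component], cores: int, mem_gb: float
) -> Optional[Component]:
    """Pick the feasible component with the lowest predicted time, if any."""
    feasible = [
        c for c in candidates
        if c.min_cores <= cores and c.min_mem_gb <= mem_gb
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda c: predicted_time(c, cores))


# Hypothetical knowledge base with two interchangeable solvers.
KNOWLEDGE_BASE = [
    Component("direct_solver", 1, 8.0, 100.0, 0.5),
    Component("iterative_solver", 4, 2.0, 160.0, 0.95),
]

if __name__ == "__main__":
    # Re-run the selection whenever the available resources change.
    for cores, mem in ((2, 16.0), (16, 16.0), (16, 1.0)):
        choice = select_component(KNOWLEDGE_BASE, cores, mem)
        print(cores, mem, choice.name if choice else "no feasible component")
```

Calling `select_component` again each time the resource picture changes is what lets the composed application adapt to the execution environment; a runtime would then link the chosen component from a library.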
21 The AES component develops technology for integrated feedback control: Runtime Compiling System (RCS) and Dynamic Application Composition
Application Model
Dynamic Analysis Situation
Distributed Programming Model
Application Program
Compiler Front-End
Application Intermediate Representation
Compiler Back-End
Launch Application (s)
Performance Measurements and Models
Dynamically Link and Execute
Application Components Frameworks
Distributed Computing Resources
Distributed Platform
Adaptable computing Systems Infrastructure
22 Examples of areas funded
- Programming Models, Languages, Environments
- legacy models (MPI), to high-level, domain, hierarchical multithreading, software component libraries, dynamic workflow, streaming environments (languages/compilers)
- Compiler methods and tools
- program analysis methods; program transformation methods
- program phase detection; dynamic detection
- combine static, dynamic, and feedback methods; continuous optimization methods
- scheduling, scalability across hierarchies
- checkpoint recovery (system level, application level)
- Real-Time systems and integration (with server, high-end, etc. environments)
- Systems management including power management
- optimization constraints (performance/power optimization)
- Validation, Verification, Testing
- System Modeling and Analysis
- Modeling of applications, algorithms, platforms (at all levels)
- performance, dependability (performability), reliability
- multi-modal modeling, power modeling (at all levels: application, computational platforms, processor/multicores), performance specification (languages, compilers), performance frameworks
- Fast real-time or near-real-time simulation methods
Have seen an increase in all these areas with respect to MULTICORES
23 Summary Thoughts
- MultiCore Processors provide an opportunity for enhanced capabilities in computation, communication and data management
- Multicores present the promise of populating all levels of computational platforms and environments
- They should be viewed in the presence of other resources: heterogeneity, dynamicity, adaptivity
- Multicores cannot exist in isolation: they will be nodes in other systems, high-end platforms, servers, real-time systems, instruments, and grids (InformationPowerGrid, TeraGrid)
- Complexity of applications and platforms presents a significant opportunity for innovative research and technology in systems software (methods and tools)
- Multicores will resurrect and build upon ideas/methods started in the 80s for shared memory parallel processing and the recent advances for distributed systems
- Need to advance the technologies that will automate the mapping of such complex and dynamic applications on complex platforms with multiple and heterogeneous levels of processors, memory, and networks
- An important item: do we nurture a critical mass of people that will work on these challenges?
- (where are the compiler people to address/contribute to these challenges?!!!)
- I personally hope that the opportunities of MultiCore Processors will attract the attention and the people needed