Title: Priority Research Direction (I/O Models, Abstractions and Software)
Key challenges
- Programming and abstraction: how is I/O viewed from 1M processes? The file I/O abstraction is neither good enough nor scalable.
- Make I/O independent of the number of processes, with predictable performance.
Summary of research direction
- What will you do to address the challenges?
  - Develop newer I/O models and higher-level abstractions (dataset-based techniques that exploit specialized applications)
  - Purpose-driven and customizable I/O (e.g., checkpointing, analytics, external communication in a workflow)
  - Incorporate I/O into programming models and languages
  - Utilize I/O delegation for offloading I/O within user space, caching, data reorganization
  - Integrate online analytics and data management
Potential impact on software component
- What capabilities will result?
  - Higher-level abstraction (e.g., datasets, specialized data management)
  - Purpose-driven I/O (e.g., checkpointing, analytics, external communication in a workflow)
  - Customizable I/O
  - I/O Delegation and Active Storage with I/O and processing as a service
Potential impact on usability, capability, and breadth of community
- How will this impact the range of applications that may benefit from exascale systems?
  - More control and significantly reduced complexity in I/O (3-5 years)
  - Portability of applications with respect to I/O (3-5 years)
  - Predictable performance (5 years)
  - Maximize use of data while available (3-5 years)
  - Real-time Knowledge Discovery and Insights (10 years)
4.x I/O Models, Abstractions and Software
- Technology drivers
  - File systems with traditional semantics are not scalable
  - I/O architectures as an independent and separate component do not scale
- Alternative R&D strategies
  - Extend current file systems
  - Develop newer layers on top of current file systems
  - Develop newer I/O models and higher-level abstractions (dataset-based techniques that exploit application domains)
  - Purpose-driven and customizable I/O (e.g., checkpointing, analytics, external communication in a workflow)
  - Develop techniques to concurrently exploit the data and perform analytics when it is created, that is, embed online analytics
  - Incorporate I/O into programming models and languages
  - Use databases
  - I/O Delegation and Active Storage with I/O and processing as a service
- Recommended research agenda
  - Develop newer I/O models and higher-level abstractions (dataset-based techniques that exploit specialized applications)
  - Purpose-driven and customizable I/O (e.g., checkpointing, analytics, external communication in a workflow)
  - Incorporate I/O into programming models and languages
  - Active Storage with I/O and processing as a service
  - Utilize I/O delegation for offloading I/O within user space, caching, data reorganization, etc.
  - Develop techniques to concurrently exploit the data and perform analytics when it is created, that is, integrate data analytics, online analysis and data management
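The I/O delegation item in the agenda above can be illustrated with a minimal sketch. It assumes a single user-space delegate thread that receives write requests from compute threads through a queue and performs the file I/O on their behalf. The class name `IODelegate`, its `write_at` method, and the `demo` helper are hypothetical names introduced only for this illustration; they do not come from the slides or from any particular I/O library.

```python
import os
import queue
import tempfile
import threading


class IODelegate:
    """Hypothetical delegate that performs writes on behalf of compute threads."""

    def __init__(self, path):
        self._queue = queue.Queue()
        self._file = open(path, "wb")
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def write_at(self, offset, data):
        # Returns immediately; the delegate thread does the actual I/O.
        self._queue.put((offset, bytes(data)))

    def _drain(self):
        while True:
            item = self._queue.get()
            if item is None:  # sentinel: no more requests
                break
            offset, data = item
            self._file.seek(offset)
            self._file.write(data)

    def close(self):
        # Flush all pending requests, then stop the delegate.
        self._queue.put(None)
        self._worker.join()
        self._file.close()


def demo():
    """Four 'compute' threads hand fixed-size blocks to one delegate."""
    block = 8
    path = os.path.join(tempfile.mkdtemp(), "out.bin")
    delegate = IODelegate(path)

    def compute(rank):
        payload = bytes([ord("A") + rank]) * block
        delegate.write_at(rank * block, payload)

    workers = [threading.Thread(target=compute, args=(r,)) for r in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    delegate.close()

    with open(path, "rb") as f:
        return f.read()
```

In this sketch the compute threads never touch the file, so the number of writers seen by the storage layer stays at one regardless of how many compute threads exist. A real delegation layer would also need flow control, error reporting, and several delegates per node, all of which are omitted here.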
Priority Research Direction (Newer Storage Devices (SCM/SSD) and I/O Hierarchies)
Key challenges
- Brief overview of the barriers and gaps
  - Performance, energy footprint and scalability of current storage devices are limiting
  - Incorporation of newer storage devices such as SCM, SSD
  - Optimizations for managing newer hierarchies
Summary of research direction
- What will you do to address the challenges?
  - Develop balanced architectures with newer devices embedded within the system
  - Develop new I/O models, software, runtime systems and libraries to exploit these hierarchies
  - Develop new file systems or special-purpose data management layers
  - Intelligent and proactive caching mechanisms
Potential impact on software component
- What capabilities will result?
  - Orders of magnitude faster I/O and performance
  - Significant potential for power optimizations in the I/O subsystem
- What new methods and components will be developed?
  - Software layers for managing newer devices and memory hierarchy
Potential impact on usability, capability, and breadth of community
- How will this impact the range of applications that may benefit from exascale systems?
  - Much faster I/O and highly optimized sustained performance (3 years)
  - Significant reduction in the cost of checkpointing (3 years)
  - Real-time knowledge discovery and insights (6 years)
  - Much simpler data management (5 years)
  - This timeline is relative to the time these devices are incorporated into the architectures
4.x Newer Devices and Hierarchies
- Technology drivers
  - Disk-based storage systems are not scalable
  - Newer storage devices such as SCM and SSD provide a potential to significantly improve performance and reduce power consumption by orders of magnitude
- Alternative R&D strategies
  - Build balanced architectures with newer devices embedded within the system
  - Develop new I/O models, software, runtime systems and libraries to exploit these hierarchies
  - Develop new file systems or special-purpose data management layers
  - O/S manages the new memory hierarchy (for I/O purposes)
  - Intelligent and proactive caching mechanisms
- Recommended research agenda
  - Develop balanced architectures with newer devices embedded within the system
  - Develop new I/O models, software, runtime systems and libraries to exploit these hierarchies
  - Develop new file systems or special-purpose data management layers
  - Intelligent and proactive caching mechanisms
- Crosscutting considerations
  - Power optimizations
  - Potential to significantly enhance resiliency
  - Architectures
  - Operating System
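As a rough illustration of a software layer that manages a storage hierarchy with caching, the sketch below assumes a two-level store: a small fast tier (standing in for SCM/SSD) in front of a large slow tier (standing in for disk). It uses least-recently-used eviction with write-back, plus a simple `prefetch` call as the proactive element. The class `TieredStore` and all of its methods are hypothetical names chosen for this example, and plain in-memory dictionaries stand in for real devices.

```python
from collections import OrderedDict


class TieredStore:
    """Hypothetical two-level store: a small fast tier over a large slow tier."""

    def __init__(self, fast_capacity):
        self.fast = OrderedDict()  # stand-in for SCM/SSD, kept in LRU order
        self.slow = {}             # stand-in for disk-based storage
        self.fast_capacity = fast_capacity
        self.slow_reads = 0        # demand reads that had to go to the slow tier

    def write(self, key, value):
        # Writes land in the fast tier and reach the slow tier on eviction.
        self.fast[key] = value
        self.fast.move_to_end(key)
        self._evict_if_needed()

    def read(self, key):
        if key in self.fast:
            self.fast.move_to_end(key)  # mark as most recently used
            return self.fast[key]
        self.slow_reads += 1
        value = self.slow[key]
        self._promote(key, value)
        return value

    def prefetch(self, key):
        # Proactively pull a block into the fast tier before it is requested.
        if key not in self.fast and key in self.slow:
            self._promote(key, self.slow[key])

    def _promote(self, key, value):
        self.fast[key] = value
        self._evict_if_needed()

    def _evict_if_needed(self):
        while len(self.fast) > self.fast_capacity:
            old_key, old_value = self.fast.popitem(last=False)
            self.slow[old_key] = old_value  # write back the evicted block
```

The sketch treats every evicted block as dirty and always writes it back, which keeps the logic short at the cost of redundant slow-tier writes. Deciding what to prefetch, and when, is the hard part of a proactive policy and is left to the caller here.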
4.x <External Communication>
- Technology drivers
  - Data movement from/to systems is sequential (single-node based) even with multiple streams
  - Protocol conversion
- Alternative R&D strategies
  - Develop parallel data movement software and tools
  - Special-purpose network protocols for parallelism
  - Scalable scheduling
  - Integration of external networks with local file systems
- Recommended research agenda
  - Develop parallel data movement software and tools
  - Special-purpose network protocols for parallelism
  - Scalable scheduling
  - Integration of external networks with local file systems
- Crosscutting considerations
  - Scheduler
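The parallel data movement item can be sketched as follows, under the assumption that a transfer is split into independent byte ranges handled by a pool of workers. Local files stand in for the source and destination systems, and threads stand in for transfer nodes. The function names `copy_chunk` and `parallel_copy` and their parameters are hypothetical and introduced only for this example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def copy_chunk(src, dst, offset, length):
    """Copy one byte range; each worker opens its own file handles."""
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        fout.write(fin.read(length))
    return length


def parallel_copy(src, dst, chunk_size=1 << 20, workers=4):
    """Move src to dst as independent byte ranges handled by a worker pool."""
    size = os.path.getsize(src)
    # Pre-size the destination so workers can write disjoint ranges in any order.
    with open(dst, "wb") as fout:
        fout.truncate(size)
    offsets = range(0, size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(copy_chunk, src, dst, off, min(chunk_size, size - off))
            for off in offsets
        ]
        # Total number of bytes moved across all chunks.
        return sum(f.result() for f in futures)
```

Because each range is self-contained, the order in which chunks complete does not matter. In a real system the same decomposition would let multiple nodes and network paths carry parts of one transfer, with the scheduler deciding chunk placement.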
I/O, Storage and Data Management
<I/O Models and Abstractions>
Roadmap milestones (timeline 2010 to 2019):
- Integrated with newer Programming Models and Languages
- SDM for Peta/Exa-bytes
- Real-time Knowledge Discovery and Insights
- Accelerated Scientific insights from Petabytes of Data
- Purpose-driven I/O and Active Storage, Integration of Analytics and I/O
- Power-optimized, Customizable I/O
- I/O Runtime systems for SCM/SSD devices, Newer I/O abstractions
- I/O delegation