Title: The Next Generation Cluster/Cyberinfrastructure Computing Management Standard -- SMASH
1The Next Generation Cluster/Cyberinfrastructure
Computing Management Standard -- SMASH
Yung-Chin Fang, Tong Liu (Dell Inc.)
Chokchai Leangsuksun (Louisiana Tech University)
Stephen L. Scott (Oak Ridge National Laboratory)
2TOC
- Motivation
- HPCC/HA Scale Out Fact
- The Problem
- The Facts
- The Idea
- The Action
- The Flexible Architecture
- Result
3Motivation
- Driven by demand
- The success of HPCC/HA cluster computing and grid computing
research was driven by the federal budget
- For FY06, the federal government planned $2.2 B for
supercomputing research
- Cyberinfrastructure computing is a focus; HPCC/HA scaling out
is a fact
- When a node in a grid goes down, the user has to ask the
administrator to reboot the box, which makes full automation
difficult
- A hardware-independent management framework is the foundation
that a Cyberinfrastructure framework needs in order to succeed
4HPCC/HA Scale Out Fact Manageability Need
5The Problem
Weak interoperability and manageability across:
- Dell Management Framework
- IBM Management Framework
- Cray Management Framework
- Intel Management Framework
- SGI Management Framework
- HP Management Framework
- Ad Hoc Management Framework
- De Facto Management Framework
6The Facts
- A unified management architecture/framework is needed to
improve productivity
- It is unlikely that all vendors will drop their established
frameworks to work on a new management architecture/framework
- It is also unlikely that research institutes and academia will
develop a management framework that covers every hardware
architecture and every existing management framework
- Hardware-level manageability and interoperability can help
Cyberinfrastructure achieve a higher degree of success
7The Idea
- Create an industry standard on top of well-established
architectures/frameworks so that vendors will buy in
- Flexibility for research institutes and academia to use the
architecture, enhance or modify the implementation, and
import/export data to fit their needs
- Compatibility with current and future management
architectures/frameworks
- No extra overhead (memory footprint or CPU cycles) on any
compute node
8The Action
- The Distributed Management Task Force (DMTF) forms a
working group to investigate this idea
- The group is called the Server Management Working
Group (SMWG)
- DMTF/SMWG initiates the Systems Management
Architecture for Server Hardware (SMASH)
- SMASH development members include Dell, IBM,
Intel, HP, etc.
- SMASH is a suite of specifications that delivers
architectural semantics, industry-standard
protocols, and profiles to unify the management
environment
9The Flexible Architecture
10The Flexible Architecture
There is more than one way to implement the
architecture and achieve the same desired result
11Result
- Improve operational efficiency in the deployment and
operational phases
- Provide hardware-independent RASS manageability
- Reduce the total cost of ownership
- Set the foundation for managing heterogeneous
Cyberinfrastructure (consisting of HPCC/HA clusters)
- Enable info-structure-load-sensitive resource
managers
- Enable scientists to investigate even larger-scale
problems and obtain higher-precision results
faster than ever
12Backup
- SMASH CLP -- Server Management Command Line
Protocol specification
- Example
- Prompt> show -display associations -o format=clpxml /system1/powersup3
- SMASH can input/output text, XML, and CIM for binary
portability and data exchangeability
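
The example command on this slide follows a verb, options, target pattern. A minimal Python sketch of assembling such a command line is shown here. The helper name `build_clp_command` is hypothetical, and the option spellings (`-display`, `-o`, `format=clpxml`) are reconstructed from the slide's example rather than checked against the CLP specification. The helper only formats text; it does not talk to a managed system.

```python
from typing import Iterable, Optional, Tuple


def build_clp_command(
    verb: str,
    target: str,
    options: Iterable[Tuple[str, Optional[str]]] = (),
) -> str:
    """Assemble a CLP-style command line: verb, options, then target address.

    Hypothetical helper for illustration only. It does not validate the verb
    or the options against the CLP specification.
    """
    parts = [verb]
    for name, argument in options:
        parts.append(f"-{name}")          # option name, e.g. -display
        if argument is not None:
            parts.append(argument)        # option argument, e.g. associations
    parts.append(target)                  # target address comes last
    return " ".join(parts)


command = build_clp_command(
    "show",
    "/system1/powersup3",
    options=[("display", "associations"), ("o", "format=clpxml")],
)
print(command)
```

Keeping the options as an ordered list of pairs preserves the order in which they appear on the command line.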
13- SMASH Profiles. Server management profiles
provide the object model definitions for
manageability content and the architecture models for
mapping computer hardware to fully connected
association graphs
- SMASH Managed Element Addressing Specification.
This specification defines how to form
command target addresses that resemble
hierarchical file system naming conventions
- SMASH CLP-to-CIM Mapping Specification. This mapping is
defined to support existing CIM-based
management consoles
- SMASH CLP Discovery covers:
- 1) How a client discovers which managed elements
the MAP manages
- 2) How a client discovers the capabilities of the MAP
itself
- 3) How a client discovers the service access points of the
MAP
- The MAP (Manageability Access Point) is a network-accessible
service for managing a managed system
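
Because target addresses resemble file system paths, a client can split one into its components. The sketch below is a minimal Python illustration using the address `/system1/powersup3` from the CLP example. The function name `parse_target_address` is hypothetical, and the assumed component shape (a lowercase tag followed by an instance number) is inferred from that one example, not taken from the addressing specification.

```python
import re
from typing import List, Tuple

# Assumed component shape: a lowercase tag followed by an instance number,
# as in "system1" or "powersup3". This is inferred from the slide's example.
_COMPONENT = re.compile(r"^([a-z]+)(\d+)$")


def parse_target_address(address: str) -> List[Tuple[str, int]]:
    """Split an absolute target address into (tag, instance number) pairs."""
    if not address.startswith("/"):
        raise ValueError(f"expected an absolute address, got {address!r}")
    trimmed = address.strip("/")
    if not trimmed:
        return []  # the root address has no components
    components = []
    for part in trimmed.split("/"):
        match = _COMPONENT.match(part)
        if match is None:
            raise ValueError(f"unrecognized address component: {part!r}")
        components.append((match.group(1), int(match.group(2))))
    return components


print(parse_target_address("/system1/powersup3"))
```

Each pair names one level of the hierarchy, so the example address reads as power supply 3 inside system 1.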
14FIN
SMASH THIS