Performance Technology for Component Software - TAU


1
Performance Technology for Component Software -
TAU
  • Allen D. Malony (U. Oregon)
  • Sameer Shende (U. Oregon)
  • Craig Rasmussen (LANL)
  • Jaideep Ray (SNL, CA)
  • Matt Sottile (LANL)

2
Overview
  • Complexity and performance technology
  • TAU performance system
  • Developing performance interfaces for CCA
  • Performance modeling and prediction issues
  • Conclusions

3
Focus on Component Technology
  • Emerging component technology for HPC and Grid
  • Component: software object embedding
    functionality
  • Component architecture (CA): how components
    connect
  • Component framework: implements a CA
  • Common Component Architecture (CCA)
  • Standard foundation for scientific component
    architecture
  • Component descriptions
  • Scientific Interface Description Language (SIDL)
  • CCA ports for component interactions
  • CCA framework services (CCAFEINE)

4
Problem Statement
  • How do we create robust and ubiquitous
    performance technology for the analysis and
    tuning of component software in the presence of
    (evolving) complexity challenges?
  • How do we apply performance technology
    effectively for the variety and diversity of
    performance problems that arise in the context of
    CCA components?

5
TAU Performance System
  • Tuning and Analysis Utilities
  • Performance system framework for scalable
    parallel and distributed high-performance
    computing
  • Targets a general complex system computation
    model
  • nodes / contexts / threads
  • Multi-level system / software / parallelism
  • Measurement and analysis abstraction
  • Integrated toolkit for performance
    instrumentation, measurement, analysis, and
    visualization
  • Portable, configurable performance
    profiling/tracing facility
  • Open software approach
  • University of Oregon, LANL, FZJ Germany
  • http://www.cs.uoregon.edu/research/paracomp/tau

6
TAU Performance System Architecture
[Figure: architecture diagram; surviving labels: Paraver, EPILOG]
7
TAU Instrumentation
  • Flexible instrumentation mechanisms at multiple
    levels
  • Source code
  • Manual (TAU API, CCA Measurement Port API)
  • automatic using Program Database Toolkit (PDT),
    OPARI (for OpenMP programs), Babel SIDL compiler
    (proposed)
  • Object code
  • pre-instrumented libraries (e.g., MPI using PMPI)
  • statically linked
  • dynamically linked (e.g., Virtual machine
    instrumentation)
  • fast breakpoints (compiler generated)
  • Executable code
  • dynamic instrumentation (pre-execution) using
    DynInstAPI

8
Program Database Toolkit
9
Program Database Toolkit (PDT)
  • Program code analysis framework for developing
    source-based tools for C99, C++ and F90
    (U. Oregon, LANL, FZJ Germany)
  • High-level interface to source code information
  • Widely portable
  • IBM, SGI, Compaq, HP, Sun, Linux
    clusters, Windows, Apple, Hitachi, Cray T3E...
  • Integrated toolkit for source code parsing,
    database creation, and database query
  • commercial grade front end parsers (EDG for
    C99/C++, Mutek for F90)
  • Intel/KAI C++ headers for std. C++ library
    distributed with PDT
  • portable IL analyzer, database format, and access
    API
  • open software approach for tool development
  • Target and integrate multiple source languages
  • Used in CCA for automated generation of SIDL
    CHASM
  • Used in TAU to build automated performance
    instrumentation tools (tau_instrumentor)
  • Can be used to generate code for performance
    ports in CCA

10
Extended Component Design
generic component
  • PKC: Performance Knowledge Component
  • POC: Performance Observability Component

11
Performance Observation
  • Ability to observe execution performance is
    important
  • Empirically-derived performance knowledge
  • Does not require measurement integration in
    component
  • Monitor during execution to make dynamic
    decisions
  • Measurement integration is key
  • Performance observation integration
  • Component integration core and variant
  • Runtime measurement and data collection
  • On-line and off-line performance analysis

12
Performance Observation Component (POC)
  • Performance observation in a
    performance-engineered component model
  • Functional extension of original component design
  • Include new component methods and ports for
    other components to access measured performance
    data
  • Allow original component to access performance
    data
  • Encapsulate as tightly-coupled and co-resident
    performance observation object
  • POC provides ports that allow use of optimized
    interfaces to access "internal" performance
    observations

13
Performance Observation Component
Performance Component
Measurement Port
  • One performance component per context
  • Performance component provides a Measurement Port
  • Measurement Port allows a user to create and
    access
  • Timer (start/stop, set name/type/group)
  • Event (trigger)
  • Control (enable/disable groups)
  • Query (get functions, metrics, counters, dump to
    disk)

14
Measurement Port in CCAFEINE
Performance Component API
    namespace performance {
    namespace ccaports {
      class Measurement : public virtual classic::gov::cca::Port {
      public:
        virtual ~Measurement() { }

        /* Create a Timer */
        virtual performance::Timer* createTimer(void) = 0;
        virtual performance::Timer* createTimer(string name) = 0;
        virtual performance::Timer* createTimer(string name, string type) = 0;
        virtual performance::Timer* createTimer(string name, string type,
                                                string group) = 0;

        /* Create a Query interface */
        virtual performance::Query* createQuery(void) = 0;

        /* Create a User Defined Event interface */
        virtual performance::Event* createEvent(void) = 0;
        virtual performance::Event* createEvent(string name) = 0;

        /* Create a Control interface for selectively enabling and
           disabling the instrumentation based on groups */
        virtual performance::Control* createControl(void) = 0;
      };
    }
    }

15
CCA Timer Interface
    namespace performance {
      class Timer {
      public:
        virtual ~Timer() { }

        /* Start the Timer. Implement these methods in a derived class
           to provide required functionality. */
        virtual void start(void) = 0;

        /* Stop the Timer. */
        virtual void stop(void) = 0;

        virtual void setName(string name) = 0;
        virtual string getName(void) = 0;
        virtual void setType(string name) = 0;
        virtual string getType(void) = 0;

        /* Set the group name associated with the Timer
           (e.g., all MPI calls can be grouped into an "MPI" group) */
        virtual void setGroupName(string name) = 0;
        virtual string getGroupName(void) = 0;
        virtual void setGroupId(unsigned long group) = 0;
        virtual unsigned long getGroupId(void) = 0;
      };
    }
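The Timer interface is abstract, so a measurement tool has to supply a derived class. The following is a minimal sketch of such a class, not TAU's implementation. It assumes a reduced copy of the interface (start, stop, setName, getName only), uses `std::chrono::steady_clock` as the time source, and introduces the hypothetical names `ChronoTimer`, `calls()` and `seconds()`.

```cpp
#include <cassert>
#include <chrono>
#include <string>

namespace performance {

// Reduced copy of the abstract Timer interface from the slide
// (type and group accessors omitted to keep the sketch short).
class Timer {
public:
    virtual ~Timer() {}
    virtual void start() = 0;
    virtual void stop() = 0;
    virtual void setName(std::string name) = 0;
    virtual std::string getName() = 0;
};

// Hypothetical implementation: accumulates wall-clock time and a call
// count across start/stop pairs. Not part of TAU or CCA.
class ChronoTimer : public Timer {
public:
    explicit ChronoTimer(const std::string& name) : name_(name) {}

    void start() override {
        assert(!running_);  // nested start() is not supported here
        begin_ = std::chrono::steady_clock::now();
        running_ = true;
    }

    void stop() override {
        assert(running_);   // stop() must follow a start()
        total_ += std::chrono::steady_clock::now() - begin_;
        running_ = false;
        ++calls_;
    }

    void setName(std::string name) override { name_ = name; }
    std::string getName() override { return name_; }

    long calls() const { return calls_; }
    double seconds() const {
        return std::chrono::duration<double>(total_).count();
    }

private:
    std::string name_;
    std::chrono::steady_clock::time_point begin_;
    std::chrono::steady_clock::duration total_{};  // zero-initialized
    bool running_ = false;
    long calls_ = 0;
};

}  // namespace performance
```

A component that holds only a `performance::Timer*` can call `start()` and `stop()` without knowing which derived class sits behind the pointer, which is the point of keeping the interface abstract.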

16
Control Class Interface
CCA Instrumentation Control Interface
    namespace performance {
      class Control {
      public:
        virtual ~Control() { }

        /* Control instrumentation. Enable group Id. */
        virtual void enableGroupId(unsigned long id) = 0;

        /* Control instrumentation. Disable group Id. */
        virtual void disableGroupId(unsigned long id) = 0;

        /* Control instrumentation. Enable group name. */
        virtual void enableGroupName(string name) = 0;

        /* Control instrumentation. Disable group name. */
        virtual void disableGroupName(string name) = 0;

        /* Control instrumentation. Enable all groups. */
        virtual void enableAllGroups(void) = 0;

        /* Control instrumentation. Disable all groups. */
        virtual void disableAllGroups(void) = 0;
      };
    }
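One plausible way to back the enable/disable calls is a bitmask in which every group owns one bit of an `unsigned long`. The sketch below makes that assumption explicit; the slide does not say how group ids are encoded. The class name `GroupControl` and the helpers `idFor()` and `isEnabled()` are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical group registry: each group name is assigned one bit of an
// unsigned long, and the enabled set is the bitwise OR of those bits.
// This encoding is an assumption made for illustration.
class GroupControl {
public:
    // Return the id for a group name, allocating a new bit on first use.
    unsigned long idFor(const std::string& name) {
        auto it = ids_.find(name);
        if (it != ids_.end()) return it->second;
        assert(ids_.size() < 8 * sizeof(unsigned long));  // one bit per group
        unsigned long id = 1UL << ids_.size();
        ids_[name] = id;
        return id;
    }

    void enableGroupId(unsigned long id) { enabled_ |= id; }
    void disableGroupId(unsigned long id) { enabled_ &= ~id; }
    void enableGroupName(const std::string& name) { enableGroupId(idFor(name)); }
    void disableGroupName(const std::string& name) { disableGroupId(idFor(name)); }
    void enableAllGroups() { enabled_ = ~0UL; }
    void disableAllGroups() { enabled_ = 0UL; }

    // A timer would consult this before recording anything.
    bool isEnabled(unsigned long id) const { return (enabled_ & id) != 0; }

private:
    std::map<std::string, unsigned long> ids_;
    unsigned long enabled_ = ~0UL;  // everything enabled by default
};
```

With this layout a timer's `start()` can test a single mask before doing any work, so disabled groups cost one bitwise AND per call.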

17
Query Class Interface
CCA Performance Query Interface
    namespace performance {
      class Query {
      public:
        virtual ~Query() { }

        /* Get the list of Timer names */
        virtual void getTimerNames(const char **& functionList,
                                   int& numFuncs) = 0;

        /* Get the list of Counter names */
        virtual void getCounterNames(const char **& counterList,
                                     int& numCounters) = 0;

        /* getTimerData. Returns lists of metrics. */
        virtual void getTimerData(const char **& inTimerList, int numTimers,
                                  double **& counterExclusive,
                                  double **& counterInclusive,
                                  int*& numCalls, int*& numChildCalls,
                                  const char **& counterNames,
                                  int& numCounters) = 0;

        virtual void dumpProfileData(void) = 0;
        virtual void dumpProfileDataIncremental(void) = 0;  // timestamped dump
        virtual void dumpTimerNames(void) = 0;
        virtual void dumpTimerData(const char **& inTimerList,
                                   int numTimers) = 0;
        virtual void dumpTimerDataIncremental(const char **& inTimerList,
                                              int numTimers) = 0;
      };
    }
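The query interface reports exclusive and inclusive values, a call count and a child-call count per timer. The sketch below shows one common way such numbers can be derived from nested start/stop pairs; it is an illustration under stated assumptions, not TAU's bookkeeping. The names `Profile`, `TimerData` and `Frame` are hypothetical, timestamps are passed in explicitly so the example is deterministic, and recursive calls to the same timer are not handled.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Per-timer metrics, named after the parameters on the slide.
struct TimerData {
    double inclusive = 0.0;  // time between start and stop
    double exclusive = 0.0;  // inclusive time minus time spent in children
    int numCalls = 0;
    int numChildCalls = 0;
};

// Hypothetical profile store driven by a stack of active timers.
class Profile {
public:
    void start(const std::string& name, double now) {
        // The timer currently on top of the stack gains one child call.
        if (!stack_.empty()) data_[stack_.back().name].numChildCalls++;
        stack_.push_back(Frame{name, now, 0.0});
    }

    void stop(double now) {
        assert(!stack_.empty());
        Frame f = stack_.back();
        stack_.pop_back();
        double elapsed = now - f.begin;
        TimerData& d = data_[f.name];
        d.inclusive += elapsed;
        d.exclusive += elapsed - f.childTime;
        d.numCalls++;
        // Charge this timer's elapsed time to its parent as child time.
        if (!stack_.empty()) stack_.back().childTime += elapsed;
    }

    const TimerData& get(const std::string& name) const {
        return data_.at(name);
    }

private:
    struct Frame {
        std::string name;
        double begin;
        double childTime;
    };
    std::vector<Frame> stack_;
    std::map<std::string, TimerData> data_;
};
```

For example, if `integrate` runs from time 0 to 10 and calls `evaluate` twice for a total of 3 time units, `integrate` ends up with an inclusive time of 10 and an exclusive time of 7.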

18
Event Class Interface
CCA User Defined Event Interface
    namespace performance {
      class Event {
      public:
        /* Destructor */
        virtual ~Event() { }

        /* Register the name of the event */
        virtual void trigger(double data) = 0;
        /* e.g., size of a message, error in an iteration,
           memory allocated */
      };
    }
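A user-defined event receives one `double` per trigger, such as a message size. A natural thing to do with those values is to keep running statistics. The sketch below assumes that behaviour; the slide does not specify what an implementation stores. `StatEvent` and its accessors are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <string>

// Hypothetical user-defined event: every trigger() folds one sample into
// running count / sum / min / max. The choice of statistics is an
// assumption made for illustration.
class StatEvent {
public:
    explicit StatEvent(const std::string& name) : name_(name) {}

    void trigger(double data) {
        ++count_;
        sum_ += data;
        min_ = std::min(min_, data);
        max_ = std::max(max_, data);
    }

    const std::string& name() const { return name_; }
    long count() const { return count_; }
    double minValue() const { assert(count_ > 0); return min_; }
    double maxValue() const { assert(count_ > 0); return max_; }
    double mean() const { assert(count_ > 0); return sum_ / count_; }

private:
    std::string name_;
    long count_ = 0;
    double sum_ = 0.0;
    double min_ = std::numeric_limits<double>::infinity();
    double max_ = -std::numeric_limits<double>::infinity();
};
```

Triggering the event with message sizes 100, 200 and 300 gives a count of 3, a minimum of 100, a maximum of 300 and a mean of 200.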

19
Measurement Port Implementation
  • TAU component implements the MeasurementPort
  • Implements Timer, Event, Query and Control
    classes
  • Registers the port with the CCAFEINE framework
  • Components target the generic MeasurementPort
    interface
  • Runtime selection of TAU component during
    execution
  • Instrumentation code independent of underlying
    tool
  • Instrumentation code independent of measurement
    choice
  • TauMeasurement_CCA port implementation uses a
    specific TAU measurement library
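The claim that instrumentation code stays independent of the underlying tool can be illustrated with a small port lookup. This is a sketch under stated assumptions, not the CCAFEINE API: `Port`, `MeasurementPort`, `Services`, `NullMeasurement`, `CountingMeasurement` and `recordCall()` are all hypothetical, simplified stand-ins. The component function is written once against the generic port and runs unchanged with either backend connected.

```cpp
#include <cassert>
#include <map>
#include <string>

// Simplified stand-ins for a framework port and a measurement port.
struct Port {
    virtual ~Port() {}
};

struct MeasurementPort : Port {
    virtual void recordCall(const std::string& timer) = 0;
    virtual int calls(const std::string& timer) const = 0;
};

// Backend 1: discards all measurements.
struct NullMeasurement : MeasurementPort {
    void recordCall(const std::string&) override {}
    int calls(const std::string&) const override { return 0; }
};

// Backend 2: counts how often each named timer was hit.
struct CountingMeasurement : MeasurementPort {
    void recordCall(const std::string& timer) override { ++counts_[timer]; }
    int calls(const std::string& timer) const override {
        auto it = counts_.find(timer);
        return it == counts_.end() ? 0 : it->second;
    }
private:
    std::map<std::string, int> counts_;
};

// Hypothetical framework services: ports are looked up by name at runtime.
class Services {
public:
    void connect(const std::string& name, Port* p) { ports_[name] = p; }
    Port* getPort(const std::string& name) const {
        auto it = ports_.find(name);
        return it == ports_.end() ? nullptr : it->second;
    }
private:
    std::map<std::string, Port*> ports_;
};

// Component code: midpoint-rule integral of f(x) = x over [0, 1].
// It only knows the generic MeasurementPort interface.
double integrate(const Services& svc, int count) {
    assert(count > 0);
    auto* m = dynamic_cast<MeasurementPort*>(svc.getPort("MeasurementPort"));
    double sum = 0.0;
    for (int i = 0; i < count; ++i) {
        if (m) m->recordCall("evaluate");
        sum += (i + 0.5) / count;  // f evaluated at the interval midpoint
    }
    return sum / count;
}
```

Swapping `CountingMeasurement` for `NullMeasurement` changes what is recorded but not the numerical result, and `integrate` is never recompiled.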

20
Using MeasurementPort
Using the Timer Interface: An Example
    #include "ports/Measurement_CCA.h"

    double MonteCarloIntegrator::integrate(double lowBound, double upBound,
                                           int count)
    {
      classic::gov::cca::Port* port;
      double sum = 0.0;

      // Get Measurement port
      port = frameworkServices->getPort("MeasurementPort");
      if (port)
        measurement_m =
          dynamic_cast<performance::ccaports::Measurement*>(port);
      if (measurement_m == 0) {
        cerr << "Connected to something other than a Measurement port";
        return -1;
      }

      static performance::Timer* t =
        measurement_m->createTimer(string("IntegrateTimer"));
      t->start();
      for (int i = 0; i < count; i++) {
        double x = random_m->getRandomNumber();
        sum = sum + function_m->evaluate(x);
      }
      t->stop();
      ...
    }

21
TAU Component in CCAFEINE
    repository get TauMeasurement
    repository get Driver
    repository get MidpointIntegrator
    repository get MonteCarloIntegrator
    repository get RandomGenerator
    repository get LinearFunction
    repository get NonlinearFunction
    repository get PiFunction
    create LinearFunction lin_func
    create NonlinearFunction nonlin_func
    create PiFunction pi_func
    create MonteCarloIntegrator mc_integrator
    create RandomGenerator rand
    create TauMeasurement tau
    connect mc_integrator RandomGeneratorPort rand RandomGeneratorPort
    connect mc_integrator FunctionPort nonlin_func FunctionPort
    connect mc_integrator MeasurementPort tau MeasurementPort

22
SIDL interface for Timers
    //
    // File: performance.sidl
    //
    version performance 1.0;
    package performance {
      class Timer {
        void start();
        void stop();
        void setName(in string name);
        string getName();
        void setType(in string name);
        string getType();
        void setGroupName(in string name);
        string getGroupName();
        void setGroupId(in long group);
        long getGroupId();
      }
    }

23
Using SIDL Interface for Timers
    // SIDL:
    #include "performance_Timer.hh"
    int main(int argc, char* argv[])
    {
      performance::Timer t = performance::Timer::_create();
      ...
      t.setName("Integrate timer");
      t.start();
      // Computation
      for (int i = 0; i < count; i++) {
        double x = random_m->getRandomNumber();
        sum = sum + function_m->evaluate(x);
      }
      ...
      t.stop();
      return 0;
    }

24
Performance Knowledge Component
  • Describe and store known component
    performance
  • Benchmark characterizations in performance
    database
  • Empirical or analytical performance models
  • Saved information about component performance
  • Use for performance-guided selection and
    deployment
  • Use for runtime adaptation
  • Representation must be in common forms with
    standard means for accessing the performance
    information

25
Performance Knowledge Repository
  • Component performance repository
  • Implement in component architecture framework
  • Similar to CCA component repository (Alexandria)
  • Access by component infrastructure
  • View performance knowledge as component (PKC)
  • PKC ports give access to performance knowledge
  • to other components, back to original
    component
  • Store performance model for performance
    prediction
  • Component composition performance knowledge

26
Component Performance Model
  • User specified
  • Inferred automatically by performance tool
  • Prior performance data
  • Expression
  • Parametric model
  • Estimate performance of a single component by
  • Querying runtime performance data
  • Passing this to performance model for evaluation
  • Integration of performance observation and
    knowledge components key to runtime selection of
    components
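The estimation steps above (take prior or runtime performance data, pass it to a parametric model, use the prediction to choose a component) can be sketched as follows. Everything here is an assumption made for illustration: the linear form t(n) = a + b·n, the least-squares fit, the names `LinearModel`, `fit`, `Candidate` and `selectFastest`, and any coefficients a caller supplies. The slides do not prescribe a model form.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumed parametric model: predicted time t(n) = a + b * n for problem size n.
struct LinearModel {
    double a;
    double b;
    double predict(double n) const { return a + b * n; }
};

// Ordinary least-squares fit of the model to prior (size, time) observations.
LinearModel fit(const std::vector<double>& n, const std::vector<double>& t) {
    assert(n.size() == t.size() && n.size() >= 2);
    double sn = 0.0, st = 0.0, snn = 0.0, snt = 0.0;
    for (std::size_t i = 0; i < n.size(); ++i) {
        sn += n[i];
        st += t[i];
        snn += n[i] * n[i];
        snt += n[i] * t[i];
    }
    double m = static_cast<double>(n.size());
    double denom = m * snn - sn * sn;
    assert(denom != 0.0);  // requires at least two distinct sizes
    double b = (m * snt - sn * st) / denom;
    double a = (st - b * sn) / m;
    return LinearModel{a, b};
}

// A component name paired with its stored performance model.
struct Candidate {
    std::string name;
    LinearModel model;
};

// Performance-guided selection: pick the candidate with the lowest
// predicted time for the given problem size.
std::string selectFastest(const std::vector<Candidate>& cs, double n) {
    assert(!cs.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < cs.size(); ++i) {
        if (cs[i].model.predict(n) < cs[best].model.predict(n)) best = i;
    }
    return cs[best].name;
}
```

Observations (1, 3), (2, 5), (3, 7) fit to a = 1 and b = 2. With two candidates whose made-up models are 5 + n and 2n, the second is predicted faster at n = 2 and the first at n = 10, so the selection depends on the problem size that is queried at runtime.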

27
Applications: Uintah (U. Utah)
Scalability analysis
28
Applications: VTF (ASCI ASAP Caltech)
  • C++, C, F90, Python
  • PDT, MPI

29
Applications: SAMRAI (LLNL)
  • C++
  • PDT, MPI
  • SAMRAI timers (groups)

30
TAU Status
  • Instrumentation supported
  • Source, preprocessor, compiler, MPI, runtime,
    virtual machine
  • Languages supported
  • C, C++, F90, Java, Python
  • HPF, ZPL, HPC++, pC++...
  • Packages supported
  • PAPI [UTK], PCL [FZJ] (hardware performance
    counter access),
  • Opari, PDT [UO, LANL, FZJ], DyninstAPI [U. Maryland]
    (instrumentation),
  • EXPERT, EPILOG [FZJ], Vampir [Pallas], Paraver
    [CEPBA] (visualization)
  • Platforms supported
  • IBM SP, SGI Origin, Sun, HP Superdome, HP/Compaq
    Tru64 ES,
  • Linux clusters (IA-32, IA-64, PowerPC, Alpha),
    Apple, Windows,
  • Hitachi SR8000, NEC SX, Cray T3E ...
  • Compiler suites supported
  • GNU, Intel KAI (KCC, KAP/Pro), Intel, SGI, IBM,
    Compaq, HP, Fujitsu, Hitachi, Sun, Apple,
    Microsoft, NEC, Cray, PGI, Absoft
  • Thread libraries supported
  • Pthreads, SGI sproc, OpenMP, Windows, Java, SMARTS

31
Concluding Remarks
  • Complex component systems pose challenging
    performance analysis problems that require robust
    methodologies and tools
  • New performance problems will arise
  • Instrumentation and measurement
  • Data analysis and presentation
  • Diagnosis and tuning
  • Performance engineered components
  • Performance knowledge, observation, query and
    control
  • Integration of performance technology

32
Support Acknowledgement
  • TAU and PDT support
  • Department of Energy (DOE)
  • DOE 2000 ACTS contract
  • DOE MICS contract
  • DOE ASCI Level 3 (LANL, LLNL)
  • U. of Utah DOE ASCI Level 1 subcontract
  • DARPA
  • NSF National Young Investigator (NYI) award