An Overview of TSFCore

1
An Overview of TSFCore
  • Roscoe A. Bartlett
  • 9211, Optimization and Uncertainty Estimation

Sandia is a multiprogram laboratory operated by
Sandia Corporation, a Lockheed Martin
Company, for the United States Department of
Energy under contract DE-AC04-94AL85000.
2
TSFCore SAND Reports
Get most recent copy at Trilinos/doc/TSFCore
3
Nonlinear Equations: Foundation for All Our Work!
  • Applications
  • Discretized PDEs (e.g. finite element, finite
    volume, finite difference etc.)
  • Network problems (e.g. Xyce)

4
Nonlinear Equations: Sensitivities
  • Related Algorithms
  • Gradient-based optimization
  • SAND
  • NAND
  • Nonlinear equations (NLS)
  • Multidisciplinary analysis
  • Linear (matrix) analysis
  • Block iterative solvers
  • Eigenvalue problems
  • Uncertainty quantification
  • SFE
  • Stability analysis / continuation
  • Transients (ODEs, DAEs)

B. van Bloemen Waanders, R. A. Bartlett, K. R.
Long and P. T. Boggs. Large Scale Non-Linear
Programming PDE Applications and Dynamical
Systems, Sandia National Laboratories,
SAND2002-3198, 2002
5
Applications, Algorithms, Linear-Algebra Software
  • Key points
  • Complex algorithms
  • Complex software
  • Complex interfaces
  • Complex computers
  • Duplication of effort?

  • APP: Application (e.g. MPSalsa, Xyce, SIERRA, NEVADA etc.)
  • LAL: Linear-Algebra Library (e.g. Petra/Ifpack, PETSc, Aztec etc.)
  • ANA: Abstract Numerical Algorithm (e.g. optimization, nonlinear
    solvers, stability analysis, SFE, transient solvers etc.)
6
TSFCore
  • Key points
  • Maximizing development impact
  • Software can be run on more sophisticated
    computers
  • Fosters improved algorithm development

7
Requirements for TSFCore
  • TSFCore should
  • Be portable to ASCI platforms
  • Provide for stable and accurate numerical
    computations
  • Represent a minimal but complete interface that
    will result in implementations that are
  • Near optimal in computational speed
  • Near optimal in storage
  • Be independent of computing environment (SPMD,
    MS, CS etc.)
  • Be easy to develop adapters for existing
    libraries (e.g. Epetra, PETSc etc.)

8
Example ANA: Linear Conjugate Gradient Solver
9
TSFCore: Basic Linear Algebra Interfaces
[UML class diagram of the basic interfaces; creation relationships are marked <<create>>]
Warning! Unified Modeling Language (UML) Notation!
13
TSFCore: Basic Linear Algebra Interfaces
  • The Key to success!
  • Reduction/Transformation Operators
  • Supports all needed vector operations
  • Data/parallel independence
  • Optimal performance

R. A. Bartlett, B. G. van Bloemen Waanders and M.
A. Heroux. Vector Reduction/Transformation
Operators, Accepted to ACM TOMS, 2003
14
Background for TSFCore
  • 1996: Hilbert Class Library (HCL), Symes and Gockenbach
    • Abstract vector spaces, vectors, linear operators
  • 2000: Epetra, Heroux
    • Concrete multi-vectors
  • 2001: Trilinos Solver Framework (TSF) 0.1, Long
  • 2001: AbstractLinAlgPack (ALAP) (MOOCHO LA interfaces), Bartlett
    • Reduction/transformation operators (RTOp)
    • Abstract multi-vectors

15
TSFCore: Basic Linear Algebra Interfaces
16
TSFCore Details
  • All interfaces are templated on Scalar type
    (support real and complex)
  • Smart reference-counted pointer class
    Teuchos::RefCountPtr<> used for all dynamic
    memory management
  • Many operations have default implementations
    based on very few pure virtual methods
  • RTOp operators (and wrapper functions) are
    provided for many common level-1 vector and
    multi-vector operations
  • Default implementation provided for MultiVector
    (MultiVectorCols)
  • Default implementations provided for serial
    computation: VectorSpace (SerialVectorSpace),
    VectorSpaceFactory (SerialVectorSpaceFactory),
    Vector (SerialVector)
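
The bullets about Scalar templating and default implementations built on very few pure virtual methods can be illustrated with a small sketch. Everything here is hypothetical: `MiniVector` and `MiniSerialVector` are made-up names, and element access through `get`/`set` is an assumption chosen for brevity, not the set of pure virtual methods TSFCore actually uses.

```cpp
#include <cassert>
#include <complex>   // only needed when Scalar is std::complex<...>
#include <cstddef>
#include <vector>

// Hypothetical interface (not the TSFCore API): three pure virtual methods.
template<class Scalar>
class MiniVector {
public:
  virtual ~MiniVector() {}
  virtual std::size_t dim() const = 0;
  virtual Scalar get(std::size_t i) const = 0;
  virtual void set(std::size_t i, Scalar value) = 0;

  // Default implementations, written only in terms of the pure virtuals.
  // A subclass may override them with something faster.
  virtual Scalar sum() const {
    Scalar result = Scalar(0);
    for (std::size_t i = 0; i < dim(); ++i) result += get(i);
    return result;
  }
  virtual void assign(Scalar alpha) {
    for (std::size_t i = 0; i < dim(); ++i) set(i, alpha);
  }
};

// Serial implementation: supplies only the three pure virtual methods.
template<class Scalar>
class MiniSerialVector : public MiniVector<Scalar> {
public:
  explicit MiniSerialVector(std::size_t n) : data_(n, Scalar(0)) {}
  std::size_t dim() const override { return data_.size(); }
  Scalar get(std::size_t i) const override {
    assert(i < data_.size());
    return data_[i];
  }
  void set(std::size_t i, Scalar value) override {
    assert(i < data_.size());
    data_[i] = value;
  }
private:
  std::vector<Scalar> data_;
};
```

The same class template works for `double` and for `std::complex<double>`, which is the point of templating on the Scalar type: the subclass author writes three methods and inherits `sum()` and `assign()` for free.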

17
Vector-Vector Operations Provided with TSFCore
namespace TSFCore {
template<class Scalar> Scalar sum( const Vector<Scalar>& v );            // result = sum(v(i))
template<class Scalar> Scalar norm_1( const Vector<Scalar>& v );         // result = ||v||1
template<class Scalar> Scalar norm_2( const Vector<Scalar>& v );         // result = ||v||2
template<class Scalar> Scalar norm_inf( const Vector<Scalar>& v_rhs );   // result = ||v||inf
template<class Scalar> Scalar dot( const Vector<Scalar>& x
                                  ,const Vector<Scalar>& y );            // result = x'y
template<class Scalar> Scalar get_ele( const Vector<Scalar>& v, Index i ); // result = v(i)
template<class Scalar> void set_ele( Index i, Scalar alpha
                                    ,Vector<Scalar>* v );                // v(i) = alpha
template<class Scalar> void assign( Vector<Scalar>* y, const Scalar& alpha ); // y = alpha
template<class Scalar> void assign( Vector<Scalar>* y
                                   ,const Vector<Scalar>& x );           // y = x
template<class Scalar> void Vp_S( Vector<Scalar>* y, const Scalar& alpha );   // y += alpha
template<class Scalar> void Vt_S( Vector<Scalar>* y, const Scalar& alpha );   // y *= alpha
template<class Scalar> void Vp_StV( Vector<Scalar>* y, const Scalar& alpha
                                   ,const Vector<Scalar>& x );           // y = alpha*x + y
template<class Scalar> void ele_wise_prod( const Scalar& alpha
  ,const Vector<Scalar>& x, const Vector<Scalar>& v, Vector<Scalar>* y ); // y(i) += alpha*x(i)*v(i)
template<class Scalar> void ele_wise_divide( const Scalar& alpha
  ,const Vector<Scalar>& x, const Vector<Scalar>& v, Vector<Scalar>* y ); // y(i) = alpha*x(i)/v(i)
template<class Scalar> void seed_randomize( unsigned int );              // Seed for randomize()
template<class Scalar> void randomize( Scalar l, Scalar u
                                      ,Vector<Scalar>* v );              // v(i) = random(l,u)
} // end namespace TSFCore
18
TSFCore: Vectors and Vector Spaces
Mathematical notation: [equations shown as an image on the slide]
C++ code:
template<class Scalar>
Scalar foo( const VectorSpace<Scalar>& S )
{
  Teuchos::RefCountPtr<Vector<Scalar> >
    x = S.createMember(),       // create x
    y = S.createMember();       // create y
  assign( &*x, 1.0 );           // x = 1
  randomize( -1.0, 1.0, &*y );  // y = rand(-1,1)
  Vp_StV( &*y, -2.0, *x );      // y += -2.0 * x
  Scalar gamma = dot(*x,*y);    // gamma = x'*y
  return gamma;
}
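
The sequence of operations on this slide can be tried without Trilinos using stand-ins. In this sketch, `std::vector<double>` replaces the abstract vector, the `sketch` namespace and its free functions are hypothetical look-alikes of the operations named above, and the explicit `seed` argument is an addition for reproducibility; none of it is the TSFCore implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

namespace sketch {

typedef std::vector<double> Vec;  // stand-in for an abstract vector (assumption)

// y = alpha
void assign(Vec* y, double alpha) {
  for (std::size_t i = 0; i < y->size(); ++i) (*y)[i] = alpha;
}

// v(i) = uniform random value in [l, u)
void randomize(double l, double u, Vec* v, unsigned seed) {
  std::mt19937 gen(seed);
  std::uniform_real_distribution<double> dist(l, u);
  for (std::size_t i = 0; i < v->size(); ++i) (*v)[i] = dist(gen);
}

// y = alpha*x + y
void Vp_StV(Vec* y, double alpha, const Vec& x) {
  assert(y->size() == x.size());
  for (std::size_t i = 0; i < x.size(); ++i) (*y)[i] += alpha * x[i];
}

// result = x'y
double dot(const Vec& x, const Vec& y) {
  assert(x.size() == y.size());
  double result = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) result += x[i] * y[i];
  return result;
}

// Mirrors foo() from the slide for vectors of dimension n.
double foo(std::size_t n, unsigned seed) {
  Vec x(n), y(n);
  assign(&x, 1.0);                 // x = 1
  randomize(-1.0, 1.0, &y, seed);  // y = rand(-1,1)
  Vp_StV(&y, -2.0, x);             // y += -2.0 * x
  return dot(x, y);                // gamma = x'y
}

} // namespace sketch
```

Since every entry of y ends up in [-3, -1) and x is all ones, `sketch::foo(n, seed)` returns a value between -3n and -n for any seed.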
19
TSFCore: Applying a Linear Operator
C++ Prototype:
namespace TSFCore {
enum ETransp { NOTRANS, TRANS, CONJTRANS };
template<class Scalar>
class LinearOp : public virtual OpBase<Scalar> {
public:
  virtual void apply(
    ETransp M_trans, const Vector<Scalar>& x, Vector<Scalar>* y
    ,Scalar alpha = 1.0, Scalar beta = 0.0
    ) const = 0;
};
} // end namespace TSFCore
Example:
template<class Scalar>
void myOp( const Vector<Scalar>& x, const LinearOp<Scalar>& M
  ,Vector<Scalar>* y )
{
  M.apply( NOTRANS, x, y );
}
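
To make the roles of `alpha`, `beta` and the transpose flag concrete, here is a small sketch of a concrete operator. It assumes the usual BLAS-style meaning y = alpha*op(M)*x + beta*y, which the slide does not spell out. `DenseOp` is a hypothetical class over `std::vector<double>`, it does not derive from the TSFCore `LinearOp`, and `CONJTRANS` is left out because only real scalars are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum ETransp { NOTRANS, TRANS };  // CONJTRANS omitted: real scalars only

// Hypothetical dense operator stored row-major (illustration only).
class DenseOp {
public:
  DenseOp(std::size_t rows, std::size_t cols, const std::vector<double>& a)
    : rows_(rows), cols_(cols), a_(a) {
    assert(a_.size() == rows_ * cols_);
  }

  // y = alpha*op(M)*x + beta*y, where op(M) is M or its transpose.
  void apply(ETransp trans, const std::vector<double>& x,
             std::vector<double>* y,
             double alpha = 1.0, double beta = 0.0) const {
    const std::size_t m = (trans == NOTRANS) ? rows_ : cols_;  // dim of y
    const std::size_t n = (trans == NOTRANS) ? cols_ : rows_;  // dim of x
    assert(x.size() == n && y->size() == m);
    for (std::size_t i = 0; i < m; ++i) {
      double s = 0.0;
      for (std::size_t j = 0; j < n; ++j) {
        const double mij =
          (trans == NOTRANS) ? a_[i * cols_ + j] : a_[j * cols_ + i];
        s += mij * x[j];
      }
      // With beta == 0 the old content of y is ignored, as in BLAS.
      (*y)[i] = (beta == 0.0) ? alpha * s : alpha * s + beta * (*y)[i];
    }
  }

private:
  std::size_t rows_, cols_;
  std::vector<double> a_;
};
```

With the default arguments a call reduces to y = M*x, matching the `M.apply( NOTRANS, x, y )` call in the example above; passing `beta = 1.0` turns it into an accumulation.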
20
Example ANA: Linear Conjugate Gradient Solver
21
Multi-vector Conjugate-Gradient Solver: Single
Iteration
template<class Scalar>
void CGSolver<Scalar>::doIteration(
  const LinearOp<Scalar> &M, ETransp opM_notrans
  ,ETransp opM_trans, MultiVector<Scalar> *X, Scalar a
  ,const LinearOp<Scalar> *M_tilde_inv
  ,ETransp opM_tilde_inv_notrans, ETransp opM_tilde_inv_trans
  ) const
{
  const Index m = currNumSystems_;
  int j;
  if( M_tilde_inv )
    M_tilde_inv->apply( opM_tilde_inv_notrans, *R_, &*Z_ );
  else
    assign( &*Z_, *R_ );
  dot( *Z_, *R_, &rho_[0] );
  if( currIteration_ == 1 ) {
    assign( &*P_, *Z_ );
  }
  else {
    for(j=0;j<m;++j) beta_[j] = rho_[j]/rho_old_[j];
    update( *Z_, &beta_[0], 1.0, &*P_ );
  }
  M.apply( opM_notrans, *P_, &*Q_ );
  dot( *P_, *Q_, &gamma_[0] );
  for(j=0;j<m;++j) alpha_[j] = rho_[j]/gamma_[j];
  update( &alpha_[0], +1.0, *P_, X );
  update( &alpha_[0], -1.0, *Q_, &*R_ );
}
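
For readers who want to run the iteration, the following is a single-system sketch of the same steps (precondition, rho, beta, search direction, alpha, updates of x and r) wrapped in a loop. It is an assumption-laden stand-in: `cgSolve`, `Op`, `Vec` and `dotProd` are hypothetical names, operators are plain `std::function` objects over `std::vector<double>`, and the stopping test on the residual norm is added here and is not taken from the slide.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

typedef std::vector<double> Vec;
typedef std::function<void(const Vec&, Vec*)> Op;  // computes y = M*x

double dotProd(const Vec& a, const Vec& b) {
  assert(a.size() == b.size());
  double s = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
  return s;
}

// Conjugate gradients for M*x = b with M symmetric positive definite.
// precond, if given, applies an approximate inverse of M.
// Returns the number of iterations performed; convergence is checked
// before each iteration.
int cgSolve(const Op& M, const Vec& b, Vec* x, double tol, int maxIter,
            const Op* precond = nullptr) {
  const std::size_t n = b.size();
  assert(x->size() == n);
  Vec r(n), z(n), p(n), q(n);
  M(*x, &q);
  for (std::size_t i = 0; i < n; ++i) r[i] = b[i] - q[i];  // r = b - M*x
  double rho_old = 0.0;
  for (int iter = 1; iter <= maxIter; ++iter) {
    if (std::sqrt(dotProd(r, r)) <= tol) return iter - 1;
    if (precond) (*precond)(r, &z); else z = r;  // z = M_tilde_inv*r or z = r
    const double rho = dotProd(z, r);
    if (iter == 1) {
      p = z;
    } else {
      const double beta = rho / rho_old;
      for (std::size_t i = 0; i < n; ++i) p[i] = z[i] + beta * p[i];
    }
    M(p, &q);                                    // q = M*p
    const double gamma = dotProd(p, q);
    const double alpha = rho / gamma;
    for (std::size_t i = 0; i < n; ++i) {
      (*x)[i] += alpha * p[i];                   // x += alpha*p
      r[i] -= alpha * q[i];                      // r -= alpha*q
    }
    rho_old = rho;
  }
  return maxIter;
}
```

The slide's version carries m systems at once, which is why its scalars are arrays (`rho_[j]`, `beta_[j]`, `alpha_[j]`) and its vector updates are multi-vector `update` calls; the sketch is the m = 1 case.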
22
The TSFCore Trilinos package
  • packages/TSFCore
    • src
      • interfaces
        • Core: VectorSpace, Vector, LinearOp etc.
        • Solvers: Iterative linear solver interfaces (unofficial!)
        • Nonlin: Nonlinear problem interfaces (unofficial!)
      • utilities
        • Core: Testing etc.
        • Solvers: Some iterative solvers (CG, BiCG, GMRES)
        • Nonlin: Testing etc.
      • adapters
        • mpi-base: Node classes for MPI-based vector spaces
        • Epetra: EpetraVectorSpace, EpetraVector etc.
    • examples

23
TSFCoreNonlin: Interfaces to Nonlinear Problems
  • Supported Areas
  • NAND optimization
  • SAND optimization
  • Nonlinear equations
  • Multidisciplinary analysis
  • Stability analysis / continuation
  • SFE

24
TSFCoreNonlin: Interfaces to Nonlinear Problems
State constraints and response functions
  • Supported Areas
  • SAND
  • Nonlinear equations
  • Multidisciplinary analysis
  • Stability analysis / continuation
  • SFE

25
Summary
SAND Reports:
  • R. A. Bartlett, M. A. Heroux and K. R. Long. TSFCore: A Package of
    Light-Weight Object-Oriented Abstractions for the Development of
    Abstract Numerical Algorithms and Interfacing to Linear Algebra
    Libraries and Applications, Sandia National Laboratories,
    SAND2003-1378, 2003
  • R. A. Bartlett. TSFCoreNonlin: An Extension of TSFCore for the
    Development of Nonlinear Abstract Numerical Algorithms and
    Interfacing to Nonlinear Applications, Sandia National
    Laboratories, SAND2003-1377, 2003
Location: Trilinos/doc/TSFCore
26
The End
Thank You!
27
Extra Slides
28
Examples of Non-Standard Vector Operations
Currently in MOOCHO: > 40 vector operations!
29
Goals for a Vector Interface
  • Compute efficiency => Near optimal performance
  • Optimization developers add new operations => Independence of
    linear algebra library developers
  • Compute environment independence => Flexible optimization software
  • Minimal number of methods => Easy to write adapters
30
Approaches to Developing Vector Interfaces
(1) Linear algebra library allows direct access to vector elements
(2) Optimizer-specific interfaces
(3) General-purpose primitive vector operations
31
Vector Reduction/Transformation Operators Defined
  • Reduction/Transformation Operators (RTOp) Defined
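  • $z^1_i \ldots z^q_i \leftarrow \mathrm{op}_t( i, v^1_i \ldots v^p_i, z^1_i \ldots z^q_i )$:
    element-wise transformation
  • $\beta \leftarrow \mathrm{op}_r( i, v^1_i \ldots v^p_i, z^1_i \ldots z^q_i )$:
    element-wise reduction
  • $\beta_2 \leftarrow \mathrm{op}_{rr}( \beta_1, \beta_2 )$:
    reduction of intermediate reduction objects
  • $v^1 \ldots v^p \in \mathbb{R}^n$: $p$ non-mutable input vectors
  • $z^1 \ldots z^q \in \mathbb{R}^n$: $q$ mutable input/output vectors
  • $\beta$: reduction target object (may be non-scalar
    (e.g. $\{y_k, k\}$), or NULL)
  • Key to Optimal Performance
  • $\mathrm{op}_t()$ and $\mathrm{op}_r()$ applied to entire sets of
    subvectors ($i = a \ldots b$) independently:
  • $z^1_{a:b} \ldots z^q_{a:b}, \beta \leftarrow \mathrm{op}( a, b, v^1_{a:b} \ldots v^p_{a:b}, z^1_{a:b} \ldots z^q_{a:b}, \beta )$
  • Communication between sets of subvectors only for
    $\beta \neq$ NULL, $\mathrm{op}_{rr}( \beta_1, \beta_2 ) \rightarrow \beta_2$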
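
The reduction half of this definition can be sketched in a few lines. The operator below carries a non-scalar reduction target (a sum of squares and a maximum absolute value), is applied to whole subvectors independently, and has its partial targets merged by a separate combine step standing in for the reduction of intermediate reduction objects. All names (`NormTarget`, `TwoNormsOp`, `applyChunked`) are hypothetical, the chunks simulate per-process subvectors in a single process, and the transformation half is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Non-scalar reduction target: two quantities gathered in one pass.
struct NormTarget {
  double sumSq;   // sum of v(i)^2
  double maxAbs;  // max |v(i)|
};

// Hypothetical reduction operator (illustration, not the RTOp C++ API).
struct TwoNormsOp {
  NormTarget init() const { return NormTarget{0.0, 0.0}; }

  // Reduction over one subvector of length len, accumulated into *beta.
  void applySub(const double* v, std::size_t len, NormTarget* beta) const {
    for (std::size_t i = 0; i < len; ++i) {
      beta->sumSq += v[i] * v[i];
      beta->maxAbs = std::max(beta->maxAbs, std::fabs(v[i]));
    }
  }

  // Merge of two reduction targets: the result is stored in *beta2.
  void reduce(const NormTarget& beta1, NormTarget* beta2) const {
    beta2->sumSq += beta1.sumSq;
    beta2->maxAbs = std::max(beta2->maxAbs, beta1.maxAbs);
  }
};

// Applies op to numChunks independent subvectors, then merges the partial
// targets. In a parallel setting only the merge step needs communication.
template<class Op>
NormTarget applyChunked(const Op& op, const std::vector<double>& v,
                        std::size_t numChunks) {
  assert(numChunks > 0);
  const std::size_t n = v.size();
  NormTarget total = op.init();
  for (std::size_t k = 0; k < numChunks; ++k) {
    const std::size_t a = k * n / numChunks;
    const std::size_t b = (k + 1) * n / numChunks;
    NormTarget local = op.init();
    op.applySub(v.data() + a, b - a, &local);
    op.reduce(local, &total);
  }
  return total;
}
```

The vector implementation (`applyChunked` here) knows nothing about what the operator computes, and the operator knows nothing about how the data is partitioned, which is the data/parallel independence the slides refer to.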

32
Object-Oriented Design for User Defined RTOp
Operators
  • Advantages
  • Functionality
  • Linear-algebra implementations can be changed
    with no impact on optimizer
  • Optimizer developers can unilaterally add new
    vector operations
  • Performance
  • Near optimal performance (large subvectors)
  • Multiple simultaneous global reductions => no
    sequential bottlenecks
  • No unnecessary temporary vectors or multiple
    vector read/writes
  • Disadvantages
  • New concepts, initially harder to understand
    interfaces?

33
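RTOp vs. Primitives: Communication
128 processors on CPlant
  • Compare
  • RTOp (all-at-once reduction (i.e. ISIS++ QMR solver))
  • $\{\alpha, \gamma, \xi, \rho, \epsilon\} \leftarrow \{(x^T x)^{1/2}, (v^T v)^{1/2}, (w^T w)^{1/2}, w^T v, v^T t\}$
  • Primitives (5 separate reductions)
  • $\alpha \leftarrow (x^T x)^{1/2}$, $\gamma \leftarrow (v^T v)^{1/2}$, $\xi \leftarrow (w^T w)^{1/2}$, $\rho \leftarrow w^T v$, $\epsilon \leftarrow v^T t$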

34
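RTOp vs. Primitives: Multiple Ops and Temporaries
1 processor (gcc 3.1 under Linux)
  • Compare
  • RTOp (all-at-once reduction)
  • $\max\{ \alpha : x + \alpha d \geq \beta \} \Rightarrow \min\{ \max( (\beta - x_i)/d_i, 0 ), \text{ for } i = 1 \ldots n \} \rightarrow \alpha$
  • Primitives (5 temporaries, 6 vector operations)
  • $-x_i \rightarrow u_i$, $u_i + \beta \rightarrow v_i$, $v_i / d_i \rightarrow w_i$, $0 \rightarrow y_i$, $\max\{w_i, y_i\} \rightarrow z_i$, $\min\{z_i, i = 1 \ldots n\} \rightarrow \alpha$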
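
The two variants compared on this slide can be written side by side. This is a serial sketch over `std::vector<double>` with hypothetical function names; it assumes every d(i) is nonzero and makes no claim about the timings the slide reported.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

typedef std::vector<double> Vec;

// All-at-once version: one pass over x and d, no temporary vectors.
double maxStepFused(const Vec& x, const Vec& d, double beta) {
  assert(x.size() == d.size());
  double alpha = std::numeric_limits<double>::infinity();
  for (std::size_t i = 0; i < x.size(); ++i) {
    assert(d[i] != 0.0);
    alpha = std::min(alpha, std::max((beta - x[i]) / d[i], 0.0));
  }
  return alpha;
}

// Primitive-operations version: 5 temporaries (u, v, w, y, z) and
// 6 separate vector operations, each a full pass over the data.
double maxStepPrimitives(const Vec& x, const Vec& d, double beta) {
  const std::size_t n = x.size();
  assert(d.size() == n);
  Vec u(n), v(n), w(n), y(n), z(n);
  for (std::size_t i = 0; i < n; ++i) u[i] = -x[i];                  // -x -> u
  for (std::size_t i = 0; i < n; ++i) v[i] = u[i] + beta;            // u + beta -> v
  for (std::size_t i = 0; i < n; ++i) w[i] = v[i] / d[i];            // v / d -> w
  for (std::size_t i = 0; i < n; ++i) y[i] = 0.0;                    // 0 -> y
  for (std::size_t i = 0; i < n; ++i) z[i] = std::max(w[i], y[i]);   // max{w,y} -> z
  double alpha = std::numeric_limits<double>::infinity();
  for (std::size_t i = 0; i < n; ++i) alpha = std::min(alpha, z[i]); // min{z} -> alpha
  return alpha;
}
```

Both functions return the same value; the difference is memory traffic, since the primitive version reads and writes five extra vectors of length n.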

35
Parallel Scalability of MOOCHO
Question: Does OO C++ allow for good scalability
for massively parallel computing (i.e. 100 to
10000 processors)?
Where is the parallel bottleneck? Is it OO C++
or MPI?
Serial overhead of MOOCHO (n=2, Np=1):
0.41 milliseconds per rSQP iteration
Overhead of MPI communication (Np=4):
0.42 milliseconds per global reduction
Answer => MPI
  • Red Hat Linux cluster (4 nodes)
  • 2.0 GHz Intel P4 processors
  • MPICH 1.2.2.1