Title: Parallel Processing for Structural Inelasticity
1. Parallel Processing for Structural Inelasticity
Frank McKenna
Department of Civil and Environmental Engineering
University of California, Berkeley
- Sponsored by the National Science Foundation through the Pacific Earthquake Engineering Research Center
2. Parallel Processing for Nonlinear FE
Parallel programming involves decomposing a problem into smaller tasks and assigning those tasks to the processors in such a way as to solve the problem as fast as possible (AFAP).
So what is the problem we wish to parallelize?
And what's the difference between linear and nonlinear analysis?
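The decomposition and assignment described above can be sketched in a few lines. This is a minimal illustration, assuming a simple block partition of independent work items over threads; `blockRange` and `parallelSum` are hypothetical names and are not part of OpenSees.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Half-open index range [first, last) of the tasks assigned to worker `rank`.
// Hypothetical helper for illustration only.
std::pair<std::size_t, std::size_t> blockRange(std::size_t numTasks,
                                               std::size_t numWorkers,
                                               std::size_t rank) {
    assert(numWorkers > 0 && rank < numWorkers);
    std::size_t base = numTasks / numWorkers;
    std::size_t extra = numTasks % numWorkers;  // the first `extra` workers get one more task
    std::size_t first = rank * base + (rank < extra ? rank : extra);
    std::size_t last = first + base + (rank < extra ? 1 : 0);
    return {first, last};
}

// Sum the work items in parallel: each thread handles its own block of tasks
// and writes only to its own slot, so no locking is needed.
double parallelSum(const std::vector<double>& work, std::size_t numWorkers) {
    std::vector<double> partial(numWorkers, 0.0);
    std::vector<std::thread> threads;
    for (std::size_t r = 0; r < numWorkers; ++r) {
        threads.emplace_back([&work, &partial, numWorkers, r] {
            std::pair<std::size_t, std::size_t> range = blockRange(work.size(), numWorkers, r);
            for (std::size_t i = range.first; i < range.second; ++i) partial[r] += work[i];
        });
    }
    for (std::thread& t : threads) t.join();
    double total = 0.0;
    for (double p : partial) total += p;  // serial reduction of the partial results
    return total;
}
```

With 10 tasks and 4 workers the blocks are [0,3), [3,6), [6,8), [8,10). Real FE tasks are rarely this uniform, which is why the choice of decomposition matters.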
3.
- Nonlinear FE analysis involves some iteration at each load step
And where is all the time being spent?
- Most of the time is spent in the solver and in element state determination
- How much time is spent in each DEPENDS ON THE MODEL AND THE ANALYSIS
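The iteration at each load step can be illustrated on a single degree of freedom. This is a minimal sketch, assuming a hypothetical hardening spring with resisting force k*u + c*u^3 and a Newton-Raphson iteration under load control; the names are illustrative and do not come from OpenSees.

```cpp
#include <cassert>
#include <cmath>

struct StepResult {
    double u;        // displacement after the last load step
    int iterations;  // total number of Newton-Raphson updates
};

// Hypothetical 1-DOF hardening spring: resisting force and tangent stiffness.
double resistingForce(double u, double k, double c) { return k * u + c * u * u * u; }
double tangentStiffness(double u, double k, double c) { return k + 3.0 * c * u * u; }

// Apply the load in `numSteps` equal increments and iterate to equilibrium at each step.
StepResult loadControl(double totalLoad, int numSteps, double k, double c,
                       double tol = 1e-10) {
    assert(numSteps > 0 && k > 0.0 && c >= 0.0);
    double u = 0.0;
    int iterations = 0;
    for (int step = 1; step <= numSteps; ++step) {
        double load = totalLoad * step / numSteps;
        for (int it = 0; it < 50; ++it) {
            // "element state determination": evaluate the resisting force
            double residual = load - resistingForce(u, k, c);
            if (std::fabs(residual) < tol) break;
            // "solver": here a 1x1 system, in a real model a large linear solve
            u += residual / tangentStiffness(u, k, c);
            ++iterations;
        }
    }
    return {u, iterations};
}
```

With c = 0 (linear) each load step converges after a single update; with c > 0 several updates are needed per step, and each one repeats both the state determination and the solve.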
4. Running a Finite Element Program
So how do we decompose the task?
5. Method 1: Parallel Solver
- Direct
  - Blocked Standard
  - Multifrontal
  - SuperNodal
- Iterative
[Figure: processes P1, P2, P3, P4]
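To make the idea of a solver split across P1 to P4 concrete, here is a minimal sketch of an iterative solver with row-block ownership. The slide does not say which iterative method is meant; Jacobi iteration is swapped in here only because every row update reads the old iterate and writes its own entry, so disjoint row blocks could run on separate processes. The blocks are visited in a plain loop in this sketch, and all names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// One Jacobi sweep over rows [first, last). It reads xOld and writes xNew only,
// so sweeps over disjoint row blocks are independent of each other.
void jacobiRows(const Matrix& A, const std::vector<double>& b,
                const std::vector<double>& xOld, std::vector<double>& xNew,
                std::size_t first, std::size_t last) {
    for (std::size_t i = first; i < last; ++i) {
        assert(A[i][i] != 0.0);
        double sum = b[i];
        for (std::size_t j = 0; j < xOld.size(); ++j) {
            if (j != i) sum -= A[i][j] * xOld[j];
        }
        xNew[i] = sum / A[i][i];
    }
}

// Solve A x = b with the rows divided into `numBlocks` contiguous blocks.
// Each block stands in for the share of the system owned by one process.
std::vector<double> jacobiSolve(const Matrix& A, const std::vector<double>& b,
                                std::size_t numBlocks, double tol = 1e-12,
                                int maxIter = 500) {
    const std::size_t n = b.size();
    assert(numBlocks > 0 && A.size() == n);
    std::vector<double> x(n, 0.0), xNew(n, 0.0);
    for (int iter = 0; iter < maxIter; ++iter) {
        for (std::size_t blk = 0; blk < numBlocks; ++blk) {
            jacobiRows(A, b, x, xNew, blk * n / numBlocks, (blk + 1) * n / numBlocks);
        }
        double change = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            change = std::max(change, std::fabs(xNew[i] - x[i]));
        }
        x.swap(xNew);  // in a distributed setting this is where iterates are exchanged
        if (change < tol) break;
    }
    return x;
}
```

Jacobi converges only for suitable matrices (for example, strictly diagonally dominant ones); the direct methods listed above need a different, more involved partitioning of the factorization.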
6. Method 2: Domain Decomposition
[Figure: P0 (Integrator, Domain, SolutionAlgorithm, LinearSOE) with processes P1, P2, P3, P4]
- No interprocess communication for P1 to Pn-1
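One common way to realize domain decomposition is substructuring by static condensation: each subdomain eliminates its interior unknowns on its own, and only the small interface system is assembled and solved centrally. The sketch below assumes, for brevity, one interior and one interface degree of freedom per subdomain (scalar blocks); the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Stiffness and load of one subdomain, split into interior (i) and interface (b) parts.
struct Subdomain {
    double kii, kib, kbb;  // scalar blocks of the symmetric subdomain stiffness
    double fi, fb;         // interior and interface loads
};

struct Condensed {
    double k, f;  // contribution of the subdomain to the interface system
};

// Done independently on each subdomain: eliminate the interior unknown.
Condensed condense(const Subdomain& s) {
    assert(s.kii != 0.0);
    return {s.kbb - s.kib * s.kib / s.kii, s.fb - s.kib * s.fi / s.kii};
}

// Done on the coordinating process: assemble and solve the interface system.
double solveInterface(const std::vector<Subdomain>& subs) {
    double K = 0.0, F = 0.0;
    for (const Subdomain& s : subs) {
        Condensed c = condense(s);
        K += c.k;
        F += c.f;
    }
    assert(K != 0.0);
    return F / K;
}

// Done independently on each subdomain: recover the interior unknown.
double recoverInterior(const Subdomain& s, double ub) {
    return (s.fi - s.kib * ub) / s.kii;
}
```

For a chain of four springs of stiffness 2 fixed at both ends, split at the middle node and loaded there with 4, each half has kii = 4, kib = -2, kbb = 2. The interface displacement is 2 and each interior displacement is 1, matching the solution of the full 3x3 system.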
7. Method 3: Combine Methods 1 & 2
[Figure: P0 (Integrator, SolutionAlgorithm, Domain, LinearSOE, Analysis) with processes P1, P2, P3, P4]
8. Main Abstractions in OpenSees
- ModelBuilder: constructs the objects in the model and adds them to the domain.
- Domain
- Analysis: moves the model from its state at time t to its state at time t + dt.
- Recorder: monitors user-defined parameters in the model during the analysis.
9. New Classes to Domain for Parallel FE
Domain
TimeSeries
NonlinearBeamColumn, BeamWithHinges, Quad (std, bbar, ...), Brick (std, bbar), Shell
10. New Classes to Analysis for Parallel FE
Analysis: StaticAnalysis, TransientAnalysis
EquiSolnAlgo: Linear, NewtonRaphson, ModifiedNewton, Broyden, BFGS, KrylovNewton
StaticIntegrator: LoadControl, DispControl, ArcLength, MinUnbalDispNorm
TransientIntegrator: Newmark, HHT
SystemOfEqn: BandGeneral, BandSPD, ProfileSPD, SparseGeneral, SparseSymmetric
Solver: Lapack(Gen, Band, ..), ProfileSPD, SuperLU, Umfpack, SparseSym
ParallelNumberer
11. Differences in Domain Decomposition Analysis
SubstructuringAnalysis: doesAnalysis(), analyze(), formTangent(), newStep(dT)
StaticDDAnalysis: doesAnalysis(), analyze(), formTangent(), newStep(dT)
return theIntegrator->newStep(dT);
return this->analyze();
return ERROR_FLAG;
return ERROR_FLAG;
return false;
return true;
12. New Classes to Framework for Parallel FE
- Channel objects for communicating between processes
- ObjectBroker for creating blank objects upon which recvSelf() is called
- Shadow (Proxy) objects to hide parallelism from existing objects
- Actor objects to sit on a remote process and process the tasks requested
- Machine objects to start and manage processes (returns a Channel to Shadow objects)
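How these pieces fit together can be sketched with threads standing in for processes. This is a toy illustration under stated assumptions, not the OpenSees implementation: the `Channel` here is an in-memory queue rather than MPI or sockets, and `Spring`, `ShadowSpring`, `runActor`, the `Action` codes, and `remoteSpringForce` are hypothetical names. Only `recvSelf()` appears in the slide; `sendSelf()` is assumed as its counterpart.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

using Message = std::vector<double>;

// Toy Channel: a thread-safe FIFO standing in for an MPI or socket connection.
class Channel {
public:
    void send(const Message& m) {
        { std::lock_guard<std::mutex> lock(mutex_); queue_.push_back(m); }
        ready_.notify_one();
    }
    Message recv() {  // blocks until a message is available
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty(); });
        Message m = queue_.front();
        queue_.pop_front();
        return m;
    }
private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<Message> queue_;
};

// A movable object: it can write itself to, and rebuild itself from, a Channel.
struct Spring {
    double k = 0.0;
    void sendSelf(Channel& c) const { c.send({k}); }
    void recvSelf(Channel& c) { k = c.recv().at(0); }
    double force(double u) const { return k * u; }
};

// Broker: maps a class tag to a blank object that recvSelf() then fills in.
struct ObjectBroker {
    enum { SpringTag = 1 };
    std::unique_ptr<Spring> getBlank(int classTag) const {
        assert(classTag == SpringTag);
        return std::unique_ptr<Spring>(new Spring());
    }
};

enum Action { Shutdown = 0, ReceiveObject = 1, ComputeForce = 2 };

// Actor: sits on the "remote" side and processes requests until told to stop.
void runActor(Channel& in, Channel& out) {
    ObjectBroker broker;
    std::unique_ptr<Spring> obj;
    for (;;) {
        Message m = in.recv();
        int action = static_cast<int>(m.at(0));
        if (action == Shutdown) return;
        if (action == ReceiveObject) {
            obj = broker.getBlank(static_cast<int>(m.at(1)));
            obj->recvSelf(in);
        } else if (action == ComputeForce) {
            assert(obj != nullptr);
            out.send({obj->force(m.at(1))});
        }
    }
}

// Shadow: looks like a local Spring but forwards each request to the Actor.
class ShadowSpring {
public:
    ShadowSpring(const Spring& s, Channel& toActor, Channel& fromActor)
        : to_(toActor), from_(fromActor) {
        to_.send({static_cast<double>(ReceiveObject),
                  static_cast<double>(ObjectBroker::SpringTag)});
        s.sendSelf(to_);
    }
    double force(double u) {
        to_.send({static_cast<double>(ComputeForce), u});
        return from_.recv().at(0);
    }
    void shutdown() { to_.send({static_cast<double>(Shutdown)}); }
private:
    Channel& to_;
    Channel& from_;
};

// Start an actor, ship a spring of stiffness k to it, and evaluate its force remotely.
double remoteSpringForce(double k, double u) {
    Channel toActor, fromActor;
    std::thread actor(runActor, std::ref(toActor), std::ref(fromActor));
    Spring local;
    local.k = k;
    ShadowSpring shadow(local, toActor, fromActor);
    double f = shadow.force(u);
    shadow.shutdown();
    actor.join();
    return f;
}
```

Code that calls `shadow.force(u)` does not need to know that the computation happens elsewhere, which is the sense in which the Shadow hides the parallelism.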
13. Parallel FE code in OpenSees: Method 1
    int main(int argc, char **argv) {
        ObjectBroker theBroker;
        MPI_MachineBroker theMachine(theBroker, argc, argv);
        int rank = theMachine.getPID();
        int np = theMachine.getNP();
        if (rank != 0) {
            // on slave processes we spin waiting to create & run actors
            theMachine.runActors();
        } else {
            // on process 0 we create domain & analysis objects
            // ...
            LinearSolver *theSolver = new DistributedSuperLU(theSOE, theMachine, np);
            // ...
            theTransientAnalysis->analyze(2000, 0.01);
        }
    }
14. Parallel FE code in OpenSees: Method 2
    int main(int argc, char **argv) {
        ObjectBroker theBroker;
        MPI_MachineBroker theMachine(theBroker, argc, argv);
        int rank = theMachine.getPID();
        int np = theMachine.getNP();
        if (rank != 0) {
            // on slave processes we spin waiting to create & run actors
            theMachine.runActors();
        } else {
            // on process 0 we create domain & analysis
            LinearSolver *theSolver = new SuperLU();
            // create some subdomains & partition the domain
            for (int i = 1; i < np; i++) {
                Subdomain *theSub = new ShadowSubdomain(i, theMachine);
                theSubSolver = new StaticCondensationProfileSPD();
                theSubAnalysis = new SubstructuringAnalysis(..., theSubSolver, ...);
            }
            theGraphPartitioner = new Metis();
            theDomainPartitioner = new DomainPartitioner(theGraphPartitioner);
            theDomain->setPartitioner(theDomainPartitioner);
            theDomain->partition(np-1);
            theTransientAnalysis->analyze(2000, 0.01);
        }
    }
15. Parallel FE code in OpenSees: Method 3
    int main(int argc, char **argv) {
        ObjectBroker theBroker;
        MPI_MachineBroker theMachine(theBroker, argc, argv);
        int rank = theMachine.getPID();
        int np = theMachine.getNP();
        if (rank != 0) {
            // on slave processes we spin waiting to create & run actors
            theMachine.runActors();
        } else {
            // on process 0 we create domain & analysis
            LinearSolver *theSolver = new DistributedSuperLU(theSOE);
            // create some subdomains & partition the domain
            for (int i = 1; i < np; i++) {
                Subdomain *theSub = new ShadowSubdomain(i, theMachine);
                theSubSolver = 0;
                theSubAnalysis = new TransientDDAnalysis(..., theSolver, ...);
            }
            theGraphPartitioner = new Metis();
            theDomainPartitioner = new DomainPartitioner(theGraphPartitioner);
            theDomain->setPartitioner(theDomainPartitioner);
            theDomain->partition(np-1);
            theTransientAnalysis->analyze(2000, 0.01);
        }
    }
16. Example: Cantilever on Green
- GREEN: an SGI Origin 3800 with 512 MIPS R12000 processors and 512 GB total memory
Element type: stdBrick
Material type: feapJ2
Number of elements: 50 x 10 x 10
Parallel Method 3: Metis, StaticDomainDecomposition, DistributedSuperLU
[Figure: speedup plot]
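The slide reports speedup without defining it. The usual definition, assumed here, compares the run time T_1 on one processor with the run time T_p on p processors; the parallel efficiency E_p is the speedup per processor:

```latex
S_p = \frac{T_1}{T_p}, \qquad E_p = \frac{S_p}{p}
```

Ideal (linear) speedup corresponds to S_p = p, that is, E_p = 1.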