Reconnect

About This Presentation

Transcript and Presenter's Notes

Title: Reconnect

1
Reconnect 04Introduction to PICO

Cynthia Phillips, Sandia National Laboratories
Joint work with
Jonathan Eckstein, Rutgers
William E. Hart, Sandia National Laboratories

2
Parallel Computing Systems

A set of processors (from 2 up to tens of
thousands) working together on a problem
communicating by messages (even if hidden from
user)
Architectures
Grid
Network of workstations (LAN)
Beowulf cluster
Tightly-coupled system

3
Parallelism in Branch and Bound

Two sources of parallelism in BB
Within subproblems
Across subproblems
Warning
Can solve problems otherwise unsolvable but
A constant-factor increase in processors (even
10,000) cannot overcome exponential growth.
We still have to be clever

4
Parallelism Issues for Branch and Bound

In the best of cases, all the processors are busy
all the time doing useful, independent work
Overhead (coordination, exchange of data)
Load balancing
What to do when the tree is small?
Tree shape depends on order of node evaluation
Can lead to slowdown anomalies
Try to emulate a good serial ordering
Wed do a lot better with a single processor
1000x faster

5
Parallel Experimental Algorithmics/Engineering
Issues

Inherent nondeterminism
Parallel random number generators
e.g. for randomized algorithms
Debugging

6
Solution Options for Integer Programming

Commercial codes (ILOGs cplex)
Good and getting better
Expensive
Serial (or modest SMP)
Free serial codes (ABACUS, MINTO, BCP)
Modest-level parallel codes (Symphony)
Grid parallelism (FATCOP)
In development ALPS/BiCePs/BLIS
Massive parallelism PICO (Parallel Integer and
Combinatorial Optimizer)
Note Parallel BB for simple bounding PUBB,
BoB/BOB, PPBB-lib, Mallba, Zram

7
Parallel Integer and Combinatorial Optimizer
(PICO)

Distributed memory (MPI), C
Massively parallel (scalable)
General parallel Branch Bound environment
Portable, flexible
Serial, small LAN, Cplant, ASCI Red, Red Storm
Allows exploitation of problem-specific
knowledge/structure
Open Source release
Always support a free LP solver

8
PICO Features for Efficient Parallel BB

Efficient processor use during ramp-up
Integration of heuristics to generate good
solutions early
Efficient work storage/distribution
Load balancing
Non-preemptive proportional-share thread
scheduler
Flexible hub/worker interaction
Subproblem states with flexible search strategy
Correct termination
Early output

9
What To Do With 9000 Processors and One
Subproblem?

Option 1 Presplitting
Make log P branching choices and expand all ways
(P problems)
P processors
BAD!
Expands many problems that would be fathomed in a
serial solution.

10
PICO MIP Ramp-up

Serialize tree growth
All processors work in parallel on a single node
Parallelize
LP bounding
Preprocessing
Cutting plane generation
Incumbent Heuristics
Pseudocost (gradient) initialization
Work division by processor ID/rank
Crossover to parallel with perfect load balance
When there are enough subproblems to keep the
processors busy
When single subproblems cannot effectively use
parallelism

11
Parallel Incumbent Search

Genetic algorithms
Decomposition-based methods (general)
Pivot, cut, and dive general heuristic
Custom Methods

12
Interior-Point Method for Solving the Root Problem

Mehrotras predictor-corrector (primal-dual)
method
Iterative method where the computational core of
each iteration is the solution of a linear system
with constraint matrix
A is the original LP constraint matrix.
D is a diagonal matrix that changes each
iteration.
Direct Cholesky Solvers OK for moderate
parallelism
Iterative methods
Preconditioning is a big issue
Support theory can help if the matrix has network
structure

13
Resolving LP on Subproblems
Cutting plane (valid inequality or branch
constraint)
Original LP Feasible region
LP optimal solution
Integer optimal

Dual simplex is much faster than starting over
Need parallel dual simplex!

14
Hubs and Workers

Each hub controls some number of workers (can
work itself)
Setting parameters, can go from fully centralized
to fully distributed
Subproblem pools at both the hub and workers
Heap (best-first), stack (depth-first), queue
(breadth-first), custom
Hubs only keep tokens

15
Subproblem Movement

Hub Worker
When worker has low load or low-quality local
pool
Worker Hub
Draw back when hub out of work and cluster
unbalanced
Send new subproblem tokens to hub
(probabilistically) depending on load
Probabilistically scatter tokens to a random hub.
If load in cluster is high relative to others,
scatter probability increases.
Setting parameters, go from pure master-slave to
local
Tradeoffs Communication, Processor utilization,
approximation of serial search order

16
Subproblem Movement/Data Storage
T
Hub
Worker
T
T
T
T
SP
SP
T
SP
SP
SP
SP
SP Server
SP
SP
SP Receiver
SP
SP
SP
SP Server
17
Load Balancing

Hub pullback
Random scattering
Rendezvous
Hubs determine load (function of quantity and
quality)
Use binary tree of hubs
Determine donors and receivers, match them,
exchange

18
Non-Preemptive Scheduler is Sufficient for PICO

Processes are cooperating
Control is returned voluntarily so data
structures left in clean state
No memory access conflicts, no locks
PICO has its own thread scheduler
High priority, short threads are round robin and
done first
Hub communications, incumbent broadcasts, sending
subproblems
If these are delayed by long tasks could lead to
Idle processors
Processors working on low-quality work
Compute threads are proportional share (stride)
scheduling
Adjust during computation (e.g. between lower and
upper-bounding)

19
(No Transcript)
20
Subproblem States
Boundable
Being Bounded
Bounded
Being Separated
Separated
Dead
Handlers lazy, eager, hybrid, build your own
21
Early Output

Problem If you have to abort a long run, want to
know variable settings for the incumbent
May be good enough to stop
Otherwise seed new search with the incumbent
value
PICO will save a new incumbent if
It is a strict improvement over the last saved
value (or is the first)
A sufficient time has passed since the last write
Requires a new message-triggered thread in
parallel
Hub, incumbent holder, I/O processor

22
Serial Class Structure - Inheritance

Branching classes - control search
Branchsub classes - subproblems (tree nodes)
Problem data classes - derived only

PICO Core
AMPL Interface (optional)
Nonlinear BranchPrune
Knapsack
PICO MIP CORE
PICO BC CORE
MIP Application
23
Required Methods for Derived Node Class

bGlobal( ) - subproblem pointer to branching
(search control)
setRootComputation( ) - create the root of the
search tree
boundComputation( ) - compute subproblem lower
bound (for min)
splitComputation( ) - determine how to partition
the subproblem
makeChild( ) - create a child subproblem from a
split parent
candidateSolution( ) - determine whether a
proposed solution is viable candidate for
optimality

24
Optional Customizations

Incumbent heuristic
Incumbent representation/update
Solution output
Solution validation
Preprocessing
Override default parameters
In MIP
Custom cutting planes
Adjust branching priorities
Plan to add more complex branching strategies

25
All PICOs Parallelism Comes (Almost) For Free
PICO serial Core
PICO parallel Core
Serial application
Parallel application

User must
Define serial application (debug in serial)
Describe how to pack/unpack data (using a generic
packing tool)
C inheritance gives parallel management
User may add threads to
Share global data
Exploit problem-specific parallelism
MIP pseudocosts

26
Utilib

Predates STL
Abstract data types arrays, heaps, hash tables,
balanced trees
Random number generators
Hash tables
work well for doubles very close in value
Arrays offer
Protected access (bounds checking)
Sharing
PackBuffer methods facilitate parallelization

27
Pieces of PICO

PICO requires
utilib (for data structures, math, etc)
COIN (an IBM-sponsored optimization interface
standard)
Base interface to LP solvers
We add more PICO-specific functionality
Cut generation library
An LP solver
Currently support cplex, soplex, CLP

28
Using A Math Programming Language

How easily can one bring up applications?
In our world, applications are a moving target
need agility

29
A Mathematical Programming Language (AMPL)

AMPL builds the matrix.
Nice cross between programming language and LaTeX
(math view)

30
AMPL-PICO Interface
Data Files
IP
Exact
Solver PICO
AMPL
LP
Model Files
Compute Approximate Solution
Cutting Planes

Write cutting-plane and approximate-solution code
using AMPL variables
Mapping transparent

31
AMPL-PICO Interface

Standard AMPL interfaces
Customized PICO Interface

32
Availability

PICO will be free under GNU lesser public license
MIP Requires serial LP solver
Cplex is expensive, but many companies/universitie
s have it
CLP is free (through COIN)
Part of ACRO (A Common Repository for Optimizers)
http//software.sandia.gov/Acro/
Need password for CVS checkout otherwise
tarballs

33
Open Problems (Wish List)

Tools wed like to see
Parallel matrix generation from a
math-programming interface
Parallel (sparse) dual simplex solver for linear
programming
Open algorithms questions
Ramp up management multiple subproblems in
parallel

34
Development Team

Core Team
Jonathan Eckstein (RUTCOR) PICO core
Bill Hart (Sandia) scheduler, utilib, AMPL
interface, design, etc
Cindy Phillips (Sandia) MIP layer, MIP
applications
Other Developers
Harvey Greenberg (UCD) preprocessor design
Vitus Leung (Sandia) preprocessor
Tod Morrison (UCD, student) soplex interface,
porting
Mikhail Nediak (RUTCOR student, now McMaster)
MIP heuristic
Konrad Borys (RUTCOR student) core
templatization, heuristic integration
Mike Eldred (Sandia) DAKOTA optimization
framework
Ojas Parekh (ex-CMU student, Sandia) soplex
interface
Mario Alleva (Sandia) porting

Write a Comment

User Comments (0)

About PowerShow.com

Reconnect PowerPoint PPT Presentation