Machine Learning and Grid Optimization - PowerPoint PPT Presentation

Description:

Writing an application-tuning fitness function is similar to wrapping or service ... 9/17/2004. DSD Department Meeting. Thanks! Have a good weekend! ... – PowerPoint PPT presentation


Transcript and Presenter's Notes


1
Machine Learning and Grid Optimization
  • Noah Edelson
  • Distributed Systems Department

2
Motivation
  • Computation is important to science.
  • Search is one of the most important problems in
    computation.
  • Uninformed blind search is the most general
    search technique.
  • Uninformed searches often perform better than
    uninformed programmers.

3
The Black Art of Fussing with Parameters
  > Rather than cranking the buffers all the way up, it may be better to
  > use multiple streams to fill the pipe. It may not be possible to saturate
  > a fat pipe with just one connection for the above reasons, but also
  > because of traffic shapers and various other limits. You can always
  > try of course :-) For parallel connections use some reasonable
  > estimate, e.g. divide the maximum "pipe width" (bandwidth * rtt) by what
  > you seem to be able to get out from a single stream, give or take a few
  > more streams. It would probably help to be multi-threaded if using
  > multiple streams (to avoid txqueuelen exhaustion). Alternatively,
  > simply copying several files in parallel achieves the same.

4
Blind Search
    Queue <- MakeQueue(Initial-State)
    while( Not Empty(Queue) )
        Node <- Remove-Front(Queue)
        if( Goal-Test(Node) )
            return Solution
        Queue <- Queuing-FN( Expand(Node), Queue )
    return Failure
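
The pseudocode above can be sketched in Python roughly as follows. This is a minimal illustration, not the presenter's code: the names `blind_search`, `goal_test`, `expand` and `depth_first` are hypothetical, and the `seen` set (which assumes hashable states) is an addition that keeps the loop from revisiting states. The choice of queuing function is what separates breadth-first from depth-first search.

```python
from collections import deque


def blind_search(initial_state, goal_test, expand, depth_first=False):
    """Uninformed search: return the first goal state found, or None on failure."""
    queue = deque([initial_state])
    seen = {initial_state}  # addition to the slide's pseudocode; states must be hashable
    while queue:
        node = queue.popleft()  # Remove-Front(Queue)
        if goal_test(node):
            return node
        children = []
        for child in expand(node):
            if child not in seen:
                seen.add(child)
                children.append(child)
        if depth_first:
            # Queuing-FN for depth-first search: new nodes go to the front.
            queue.extendleft(reversed(children))
        else:
            # Queuing-FN for breadth-first search: new nodes go to the back.
            queue.extend(children)
    return None


def toy_expand(n):
    """Toy successor function: from n, reach n + 1 and 2 * n, with a bound."""
    return [n + 1, n * 2] if n < 20 else []


if __name__ == "__main__":
    print(blind_search(1, lambda n: n == 10, toy_expand))
```

Nothing in the loop knows anything about the problem beyond `goal_test` and `expand`, which is what makes the technique general and also what makes it slow.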

5
Search Taxonomy
6
Evolutionary Computation
  • Selection
  • Recombination (crossover)
  • Mutation
  • Replacement
  • Borrowed from Alba & Troya, "Improving flexibility and efficiency by adding parallelism to genetic algorithms", Statistics and Computing 12
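
The four operators listed on this slide can be strung together into a loop like the one below. This is a sketch under stated assumptions rather than the GA used in the talk: binary tournament selection, one-point crossover, per-gene random-reset mutation and single-individual elitism are illustrative choices, and `evolve` and its parameters are hypothetical names. Genes are floats in [0, 1).

```python
import random


def evolve(fitness, n_genes, pop_size=20, generations=40,
           mutation_rate=0.1, rng=None):
    """Minimal generational GA that maximizes `fitness` over lists of floats."""
    rng = rng or random.Random()
    pop = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]

    def select():
        # Selection: binary tournament between two distinct individuals.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        best = max(pop, key=fitness)
        children = []
        while len(children) < pop_size - 1:
            mum, dad = select(), select()
            # Recombination: one-point crossover.
            cut = rng.randrange(1, n_genes) if n_genes > 1 else 0
            child = mum[:cut] + dad[cut:]
            # Mutation: reset each gene with probability mutation_rate.
            child = [rng.random() if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        # Replacement: children replace the population, but the best survives.
        pop = [best] + children
    return max(pop, key=fitness)


if __name__ == "__main__":
    target = lambda genes: -sum((g - 0.5) ** 2 for g in genes)
    print(evolve(target, 3, rng=random.Random(0)))
```

The sketch calls `fitness` repeatedly on the same individuals. When one evaluation is a real file transfer, caching each individual's score would be the first thing to add.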
7
Genetic Algorithms
  • Figure: the population at generation t yields the population at generation t + 1 through selection and reproduction.
  • Borrowed from Alba & Troya, "Improving flexibility and efficiency by adding parallelism to genetic algorithms", Statistics and Computing 12
8
Combinatorial Explosion
  • A search, by definition, picks 1 of N solutions
    from the solution space.
  • Given a bitstring of length N, there are
    2^N possible states to search.
  • A well-known AI professor at Cal claims that GAs
    ignore the combinatorial explosion of the search
    space. He is wrong.
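
To put numbers on the 2^N growth, the short sketch below prints the state count for a few bitstring lengths, along with how long an exhaustive scan would take. The rate of one billion evaluations per second is an assumption chosen only for illustration, and both function names are hypothetical.

```python
def search_space_size(n_bits):
    """Number of distinct bitstrings of length n_bits."""
    return 2 ** n_bits


def exhaustive_years(n_bits, evals_per_second=1e9):
    """Years needed to evaluate every state at an assumed evaluation rate."""
    seconds = search_space_size(n_bits) / evals_per_second
    return seconds / (365.25 * 24 * 3600)


if __name__ == "__main__":
    for n in (8, 16, 32, 64):
        print(f"N={n:2d}  states={search_space_size(n):>20d}  "
              f"years={exhaustive_years(n):.3g}")
```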

9
The Black Art of Fitness Functions
  • A good fitness function converges quickly.
  • Minimize genes (degrees of freedom)
  • Avoid Deceptive Trap Functions
  • Encourage smooth fitness functions
  • Decouple fitness dependence between genes
  • -- What happens when your fitness function never halts?

10
A Solution to the Entscheidungsproblem!!!
  • When all else fails, give up!
  • A solution whose evaluation does not halt in time is decidedly
    unfit.
  • This boils down to throwing out the outliers.
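
One way to "give up" in practice is to put a wall-clock limit on each evaluation. The sketch below is an assumption about how that could look, not the presenter's `lazy_forker`: `timed_fitness` is a hypothetical helper built on the standard `subprocess.run` timeout, and scoring a timed-out or failed run as 0.0 is an illustrative choice.

```python
import subprocess
import sys
import time


def timed_fitness(cmd, timeout_s):
    """Return 1/elapsed seconds for cmd, or 0.0 if it fails or overruns timeout_s."""
    start = time.perf_counter()
    try:
        subprocess.run(cmd, check=True, timeout=timeout_s,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError):
        # The child is killed on timeout; the candidate is simply scored as unfit.
        return 0.0
    return 1.0 / (time.perf_counter() - start)


# Stand-in commands for illustration: one finishes at once, one stalls.
FAST_CMD = [sys.executable, "-c", "pass"]
SLOW_CMD = [sys.executable, "-c", "import time; time.sleep(30)"]

if __name__ == "__main__":
    print(timed_fitness(FAST_CMD, timeout_s=10))
    print(timed_fitness(SLOW_CMD, timeout_s=0.5))
```

Because the limit is only a threshold on running time, it does not decide anything about halting; it just discards evaluations that are outliers.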

11
Parameter Tuning by way of Search (2 steps!)
  • An optimizer simply performs a search for
    optimal performance.
  • Step 1) Encode your input parameters
  • Step 2) Invent a fitness metric.
  • I used ExecutionTime^-1 (the reciprocal of the execution time).
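
The two steps might be sketched as below. The helper names `decode` and `inverse_time_fitness`, the use of `round`, and the example bounds are assumptions made for illustration; the only part taken from the slide is that fitness is the reciprocal of execution time.

```python
import time


def decode(genes, bounds):
    """Step 1: map genes in [0, 1] onto integer parameters within inclusive bounds."""
    return [lo + round(g * (hi - lo)) for g, (lo, hi) in zip(genes, bounds)]


def inverse_time_fitness(run, params):
    """Step 2: time one call of run(*params) and return 1 / elapsed seconds."""
    start = time.perf_counter()
    run(*params)
    elapsed = time.perf_counter() - start
    return 1.0 / max(elapsed, 1e-9)  # guard against a zero reading


if __name__ == "__main__":
    # Hypothetical bounds: 1-11 streams and two buffer sizes of 8-263 KiB.
    bounds = [(1, 11), (8, 263), (8, 263)]
    print(decode([0.0, 0.2, 1.0], bounds))
    print(inverse_time_fitness(time.sleep, [0.05]))
```

Keeping genes normalized to [0, 1] lets the optimizer stay ignorant of units, so the same search code can tune any application once its bounds are written down.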

12
Example Fitness Function
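    import time

    # lazy_forker, num_samples, src, dst and threshHold are defined elsewhere.
    def testFunc(data, args=''):
        if not data:
            return 3, 1.0, 0.0
        # p: 1-11 parallel streams; b, t: 8 KiB plus up to 255 KiB
        # (the "+" offsets are assumed).
        p, b, t = int(1 + data[0] * 10.0), \
                  8 * 1024 + int(int(data[1] * 255.0)) * 1024, \
                  8 * 1024 + int(int(data[2] * 255.0) * 1024)
        startTime = time.time()
        for samples in range(num_samples):
            lazy_forker('globus-url-copy' + ' -p ' + str(p) + ' -bs ' + str(b) +
                        ' -tcp-bs ' + str(t) + ' ' + src + ' ' + dst, threshHold)
        endTime = time.time() - startTime
        # Fitness is the reciprocal of the total elapsed time.
        return 1.0 / endTime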

13
Step 3?
  • Step 3) Grin in satisfaction, as your GA will
    explore multiple dimensions of your
    solution-space at once.

14
Results
  • Performance Boost
      Size (mb)   Globus-url-copy (2x avg)   pyGlobus-url-copy (4x avg)
      64          35                         19
      128         22                         15
      192         08                         00
      256         03                         06

15
Results
  • Globus-url-copy and pyGlobus-url-copy are similar
    in terms of performance given input parameters
    and running environment.
  • Parameter tuning affords both utilities a
    noticeable increase in performance given little
    knowledge of the problem domain.

16
Why an optimizer service?
  • Diagram labels:
  • Optimizer
  • TuneService(FF)
  • TestEnv
  • Save/Load Tunings
  • Env
  • FitnessFunction(x,y,z)
  • GridServiceA(x,y,z)
  • GridServiceB(w,q,r)

17
Too Much Work?
  • Writing an application-tuning fitness function is
    similar to wrapping or service-ifying the
    application.
  • If you wrap a well-used app whose performance is
    sensitive to the parameters you support, you
    might as well provide a mechanism for tuning
    those parameters for higher performance.

18
Future Work
  • Service-ify the GA
  • - What standard to use?
  • Production runs on WAN?
  • - PlanetLab, EMULab?
  • What besides pyGlobus-url-copy and
    globus-url-copy?
  • - Grid Schedulers?
  • - RLS Strategies?
  • - Parameter tuning for important applications
    and services?

19
Thanks!
  • Have a good weekend!