Some Applications II: Constrained Optimization

Transcript and Presenter's Notes
1
Lecture 7
  • Some Applications II: Constrained Optimization.
  • Although the TSP included some constraints (e.g.,
    that the descendants created remain valid tours,
    thereby constraining the set of possible
    solutions), that is not what is usually meant by
    "constrained."
  • We will deal primarily with numerical problems:
    we will be given a real-valued function of
    several variables (thus defined on some subset of
    a multi-dimensional Euclidean space), and we will
    also be given a set of equality and inequality
    constraints that define the region of Euclidean
    space where all the solutions must reside. The
    search space S will be divided into
    nonoverlapping subsets: a feasible search space F
    and an infeasible search space U. The problem
    will consist of maximizing or minimizing the
    given function in the region defined by the
    constraints.

2
Lecture 7
  • The discussion will follow Michalewicz & Fogel,
    Ch. 9. Again, much of the material appears in
    the earlier book by Michalewicz, but is now
    updated and expanded.
  • Some of the discussion uses the GENOCOP and
    GENOCOP III programs; GENOCOP III can be found
    online.
  • We now look at some geometry and start a long
    series of approaches to the general problem,
    often presented as special examples.

3
Lecture 7
  • Search space with feasible and infeasible
    subsets. A major problem is the determination of
    the boundary between the feasible and infeasible
    regions in terms that are easy to use. The
    feasible region need not satisfy any convenient
    geometric properties (e.g., be convex, be a
    polyhedron, be connected, etc.).

4
Lecture 7
  • Here is an example of a typically small
    population, only some of whose members belong to
    the set of feasible individuals - those
    individuals that fall in the feasible regions.
    Infeasible ones can still play a useful role, by
    providing ways to bridge the different disjoint
    components of the feasible region.

5
Lecture 7
  • In general, we will want to have two evaluation
    functions, evalf(s) and evalu(s). The first
    determines the fitness of feasible individuals,
    and is thus the function whose maximum or
    minimum we are searching for. The second is a
    penalty function applied to infeasible
    individuals. The penalty function could be a
    death sentence (no infeasible individual can
    exist in the population) or a true penalty that
    alters the fitness this individual would possess
    if evaluated by evalf(s). The decision on what
    strategy is appropriate must follow from a
    detailed understanding of the problem, the data
    structures used, etc. There is no a priori
    methodology to guide us.
  • We now take a look at a series of questions that
    need to be addressed in this context.

6
Lecture 7
  • How should we design evalf(s) - to compare two
    feasible individuals?
  • How should we design evalu(s) - to compare two
    infeasible individuals?
  • How are the functions evalf(s) and evalu(s)
    related?
  • Should infeasible individuals be rejected from
    the population?
  • Should we repair infeasible solutions? How? Do
    we simply move each one to the nearest feasible
    point?
  • If we use repair procedures, does the repaired
    individual replace the original or should we just
    use it for evaluation purposes?
  • Should we penalize infeasible individuals over
    feasible ones?
  • Should we start with only feasible individuals
    and then always enforce feasibility?

7
Lecture 7
  • Should we change the topology of the search
    space by using decoders that always translate
    infeasible solutions into feasible ones?
  • Should we extract a set of constraints that
    define the feasible search space and process
    individuals and constraints separately?
  • Should we concentrate on searching a boundary
    between feasible and infeasible parts of the
    search space?
  • How do we go about finding a feasible solution?
  • Other questions will arise as we proceed and
    acquire some experience of the kinds of problems
    that arise in practice.

8
Lecture 7
  • Designing evalf(s). In some problems (knapsack,
    TSP, set-covering), the evaluation function is
    obvious (though the obvious choice may not be
    the best); in others, it isn't. An example of how
    complex the problem may be is given by the
    picture below: both paths connect the desired
    points, and they appear to have close to the same
    length (path 1 is shorter, path 2 is smoother).
    They are both solutions to a connect-the-dots
    problem, but determining which is better is not
    obvious.

9
Lecture 7
  • Another example that illustrates the difficulty
    of evaluating feasible individuals is given by
    the SAT problem. Assume the SAT formula F(x) =
    (x1 ∨ ¬x2 ∨ x3) ∧ (¬x1 ∨ x3) ∧ (x2 ∨ x3).
  • Two feasible individuals are p = (0, 0, 0) and
    q = (1, 0, 0) (true = 1, false = 0). For both,
    F(p) = F(q) = 0, so the formula itself does not
    provide a convenient evalf. What else could we
    do? We would like to find out whether the
    formula is satisfiable, so a reasonable evalf(s)
    should give higher fitness for p than for q: let
    it denote the ratio of the number of conjuncts
    that evaluate to true to the number of conjuncts
    in the formula.
  • evalf(p) = 0.666..., evalf(q) = 0.333.... In
    this case we try to maximize evalf(s). We could
    also use other expressions.

10
Lecture 7
  • evalf(x) = |x1 - 1|·|x2 + 1|·|x3 - 1| +
    |x1 + 1|·|x3 - 1| + |x2 - 1|·|x3 - 1|, or
    evalf(x) = (x1 - 1)²(x2 + 1)²(x3 - 1)² +
    (x1 + 1)²(x3 - 1)² + (x2 - 1)²(x3 - 1)²
    (here with true = 1, false = -1; both are
    checked in the sketch below).
  • In both of these cases, the solution to the SAT
    problem corresponds to a global minimum of the
    evaluation function: F(x) = true is equivalent
    to evalf(x) = 0. There may be a number of
    situations where a reasonable evaluation function
    is too coarse to provide adequate guidance for
    fitness proportional reproduction. In that case,
    one might be able to construct a total order
    function on the search space so that we can
    always decide whether one proposed individual is
    better than another. In this case, we can make
    use of tournament selection, and the
    corresponding reproduction scheme.
  • We might be able to construct a partial order
    function (the feasible solutions form a lattice),
    or find a way to compare feasible with
    infeasible, or two infeasible individuals.
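  • These evaluation functions are small enough to
    check directly; a minimal sketch in Python (0/1
    truth values for the ratio version, +1/-1 values
    for the continuous version; all names are ours):

    def evalf_ratio(x1, x2, x3):
        # Fraction of satisfied conjuncts; maximize (1.0 iff F holds).
        clauses = [x1 or (not x2) or x3, (not x1) or x3, x2 or x3]
        return sum(bool(c) for c in clauses) / len(clauses)

    def evalf_abs(x1, x2, x3):
        # Continuous form; minimize (0 iff F holds, with x in {-1, +1}).
        return (abs(x1 - 1) * abs(x2 + 1) * abs(x3 - 1)
                + abs(x1 + 1) * abs(x3 - 1)
                + abs(x2 - 1) * abs(x3 - 1))

    print(evalf_ratio(0, 0, 0), evalf_ratio(1, 0, 0))  # 0.666..., 0.333...
    print(evalf_abs(-1, -1, 1))                        # 0.0 at a satisfying x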

11
Lecture 7
  • Designing evalu(s). This is usually hard;
    regardless of anything else, evalf(s) and
    evalu(s) must be related. There are both
    intrinsic difficulties - see the picture below
    (which of the two infeasible paths is better?) -
    and difficulties in finding a useful relationship
    between feasible and infeasible individuals. A
    possible solution involves defining evalu(s) =
    evalf(s) + Q(s), where Q(s) represents either a
    penalty associated with an infeasible individual
    or the cost of repairing it (i.e., changing it
    into a feasible one).

12
Lecture 7
  • evalf(s) and evalu(s). Assume we have two
    evaluation functions - for feasible and
    infeasible individuals, respectively. We can
    generate a single evaluation function, applicable
    to the whole population, by choosing two
    constants, say q1 and q2, and defining
  • With both the addition of a penalty function and
    the current method, we introduce the possibility
    that an infeasible individual will have higher
    fitness than a feasible one; we might well
    converge to an infeasible individual. One might
    try to solve the problem by introducing dynamic
    penalty functions (which change from generation
    to generation), by setting things up so that
    feasible individuals always evaluate better than
    infeasible ones, etc.

13
Lecture 7
  • This example adds to the complication: path 1 is
    infeasible, path 2 is feasible, and yet claiming
    that path 2 is better than path 1 leaves us with
    an unpleasant aftertaste.
  • Trying to solve the problem by introducing a
    penalty function that adds the cost of making an
    infeasible individual into a feasible one might
    help, but there are too many ways in which
    exceptional configurations can arise.

14
Lecture 7
  • Rejecting infeasible individuals. No need to
    construct evalu(s); no need to deal with possible
    convergence to an infeasible individual.
    Downside: if the feasible space is small, the
    initial population may be made up entirely of
    infeasible individuals. Forcing only feasible
    individuals may constrain us to too small an
    initial portion of the search space. There may be
    large infeasible regions that have to be
    crossed - gradual crossings may be possible,
    while randomly generated jumps may be very
    unlikely to take us to a feasible portion of the
    search space.
  • The method is likely to be acceptable if the
    feasible region is convex (the segment connecting
    two feasible individuals is composed of feasible
    individuals) and large enough.

15
Lecture 7
  • Repairing infeasible individuals. We must
    construct a function R that takes an infeasible
    individual y and produces a feasible individual
    x: R(y) = x. In this case, we can define
    evalu(y) = evalf(x). The process of repair can be
    considered analogous to learning that is
    transferred to the genetics of the system - or
    an occurrence of Lamarckian evolution. Although
    Lamarckian evolution (the inheritance of acquired
    characteristics) does not appear to occur in
    strictly biological systems (sometimes political
    ideology interferes - the career of Lysenko comes
    to mind), it does occur in cultural ones, which
    makes its use legitimate in our context.
  • Unfortunately, the choice of R is not easy (why
    would you expect it to be?). There may also be
    too many ways of constructing such a function,
    leading us into a morass of possibilities. Use
    it, if you can.

16
Lecture 7
  • There are two distinct possibilities for the use
    of repair functions: the first uses the repair
    function only to assign a fitness to an
    infeasible individual; the second actually
    replaces the infeasible individual with the
    repaired one. Only the latter action can be
    claimed as an instance of Lamarckian evolution.
  • Another variant involves replacing only a
    fraction of the infeasible individuals with their
    repaired versions - this guarantees that we
    maintain a more diverse population, covering a
    larger region and delaying convergence. GENOCOP
    III is claimed to repair about 15% of infeasible
    individuals; others have tried varying the
    percentage. It appears that keeping the repair
    percentage small (5% - 15%) has good effects on
    the search.

17
Lecture 7
  • Penalizing Infeasible Individuals. The usual
    approach: evalu(s) = evalf(s) + Q(s). How do we
    define Q(s)? There are many criteria that could
    be used, singly or in combination (think of
    more):
  • Ratio between the sizes of the feasible and
    infeasible regions of the search space.
  • Topological properties of the feasible search
    space.
  • Type of evaluation function.
  • Number of variables.
  • Number of constraints.
  • Number of active constraints at the optimum (the
    ones that are equalities there).
  • Self-adaptive penalties.

18
Lecture 7
  • Maintaining a Feasible Population with Special
    Representations and Variation Operators. This
    should be self-explanatory. The major problem is
    that one has to ensure that the successive
    populations can cover all of the feasible space,
    and provide the possibility for convergence.
    Some problems are conducive to this approach,
    some are not.

19
Lecture 7
  • Using Decoders. The data structure that
    represents an individual does not encode a
    solution directly, but provides the instructions
    for how to build a feasible solution.
  • Let's consider the 0-1 knapsack problem. We are
    given a set of weights wi > 0, 1 ≤ i ≤ n, and a
    set of values vi > 0. The binary condition
    constrains us to take all of wi or none, and we
    usually sort the items in decreasing order of
    vi/wi. The goal is to maximize the value obtained
    by collecting a set of items of specified maximum
    weight. We can represent an individual in our
    population as a binary string, say
    (1100110001001110101001010111010101000110), to
    mean: take the first item from the list if it
    fits in the knapsack; continue with the second,
    fifth, sixth, etc., until the knapsack is full
    (i.e., you have reached the weight bound) or no
    more items are available.

20
Lecture 7
  • The sequence of all 1s corresponds to the greedy
    solution; any sequence of bits translates to a
    feasible solution; every feasible solution could
    have many strings representing it (the tail end,
    never used, can be arbitrary). We can apply
    mutation and crossover; any offspring is
    feasible. Whether they lead us much of anywhere
    is another story.
  • The decoder is the algorithm that maps an
    individual in the population (an encoded
    solution) to a feasible solution; a sketch is
    given below.

21
Lecture 7
  • Multiple choices of decoder are possible, each
    choice imposing a relationship T between encoded
    solutions and feasible ones. Several conditions
    should be satisfied:
  • For each solution s ∈ F there is an encoded
    solution d with T(s) = d.
  • Each encoded solution d corresponds to a
    feasible solution s (∅ ≠ T⁻¹(d) ∋ s).
  • All solutions in F should be represented by the
    same number of encodings d (this may be hard to
    do and, in some circumstances, undesirable).
  • T and T⁻¹ are computationally fast.
  • T⁻¹ satisfies the condition that nearby
    encodings result in nearby solutions
    (continuity of some sort).

22
Lecture 7
  • Separation of Individuals and Constraints. If f
    is an evaluation function and f1, ..., fm are
    constraint violation measures, we can use the
    (m + 1)-dimensional vector (f, f1, ..., fm) and
    attempt to optimize its components - either
    simultaneously or in some preassigned order.
  • Boundary Exploration. If the proximity of an
    individual to the boundary between the feasible
    and infeasible regions can be computed with
    sufficient ease, one can introduce a strategy of
    approaching (and possibly crossing) the
    boundary, where the excursion of the new
    individual is carefully controlled (wide
    amplitude changes around the boundary, or small
    amplitude ones, in some sequence).

23
Lecture 7
  • Finding Feasible Solutions. There may be
    problems - especially those with complex
    constraints - where finding any feasible
    solution would be valuable.
  • An example of such a problem is the N-queens
    problem, usually solved by backtracking rather
    than by genetic algorithm methods.
  • A typical Prolog solution can be found on the
    next slide (see Sterling & Shapiro, The Art of
    Prolog, MIT Press, 1986). This CPU (667MHz G4)
    can generate a 16-queens solution using
    OpenProlog in a few seconds, and it can continue
    generating all solutions, as desired.

24
Lecture 7
  • queens(N, Qs) :-
        range(1, N, Ns), queens(Ns, [], Qs).
  • queens(UnplacedQs, SafeQs, Qs) :-
        select(Q, UnplacedQs, UnplacedQs1),
        not(attack(Q, SafeQs)),
        queens(UnplacedQs1, [Q|SafeQs], Qs).
  • queens([], Qs, Qs).
  • range(M, N, [M|Ns]) :-
        M < N, M1 is M + 1, range(M1, N, Ns).
  • range(N, N, [N]).
  • select(X, [X|Xs], Xs).
  • select(X, [Y|Ys], [Y|Zs]) :- select(X, Ys, Zs).
  • attack(X, Xs) :- attack(X, 1, Xs).
  • attack(X, N, [Y|Ys]) :- X is Y + N.
  • attack(X, N, [Y|Ys]) :- X is Y - N.
  • attack(X, N, [Y|Ys]) :- N1 is N + 1,
        attack(X, N1, Ys).

25
Lecture 7
  • Numerical Optimization. There seem to be at least
    five categories of methods developed to handle
    numerical optimization:
  • Preserving the feasibility of solutions.
  • Penalty functions.
  • Distinguishing between feasible and infeasible
    solutions.
  • Decoders.
  • Hybrid methods.
  • We will take a brief look at examples of each.

26
Lecture 7
  • Preserving the Feasibility of Solutions.
  • a) Use of Specialized Operators. Assume that we
    have only linear constraints - over a linear
    space. One of the consequences is that, if the
    feasible region is not empty, it is convex. A
    property of convex sets (actually, a definition
    of convexity) is that, given two points x and y
    in a convex set, the interval ax + (1 - a)y,
    0 ≤ a ≤ 1, is contained in the set. This provides
    us with a ready-made algorithm for crossover:
    instead of breaking and splicing, just compute
    a (random) point in the interval between the two
    parents (this assumes real rather than integer
    representation for the feasible individuals; see
    the sketch after the example). An example is
    given by the desire to minimize the function
    subject to the constraints

27
Lecture 7
  • given by the inequalities and bounds.
  • The problem has 13 variables and 9 linear
    constraints. G1 is quadratic, with global
    minimum at x = (1, 1, 1, 1, 1, 1, 1, 1, 1, 3,
    3, 3, 1), where G1(x) = -15. Notice that x is
    interior to the half-spaces -8x1 + x10 ≤ 0,
    -8x2 + x11 ≤ 0, -8x3 + x12 ≤ 0.
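  • A minimal sketch of the convex-combination
    (arithmetic) crossover described above, assuming
    individuals are real-valued vectors:

    import random

    def arithmetic_crossover(x, y):
        # With only linear constraints the feasible region is convex, so
        # every point a*x + (1-a)*y, 0 <= a <= 1, between two feasible
        # parents is itself feasible.
        a = random.random()
        return [a * xi + (1 - a) * yi for xi, yi in zip(x, y)]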

28
Lecture 7
  • GENOCOP required fewer than 1000 generations to
    find the global optimum. Question: how does this
    compare to a standard linear programming
    approach?
  • The method can be generalized (unlike linear
    programming) to convex spaces defined via
    nonlinear inequalities. But you have to know
    that the nonlinear inequalities define a convex
    space - which may be the hardest part. There is
    no way to generalize it to non-convex spaces.

29
Lecture 7
  • Searching the Boundary I. Can we introduce
    operators (modified crossover and mutation) that
    let us search the boundary between the feasible
    and infeasible regions efficiently? After all,
    that's what linear programming lets us do.
  • Consider the problem of maximizing the function
    subject to constraints.
  • The function G2 is nonlinear, with an unknown
    maximum near the origin. The problem has one
    nonlinear and one linear constraint. The linear
    constraint is always satisfied near the origin.

30
Lecture 7
  • The boundary between the feasible and infeasible
    regions is given by the equation
    ∏(i=1..n) xi = 0.75. Where is the maximum? The
    decision was made to search only near or on the
    boundary - and that required developing
    initialization, crossover, and mutation
    procedures that kept the descendants of boundary
    points on the boundary.
  • Initialization: randomly choose a positive value
    for xi, and use its reciprocal for xi+1. The
    last variable is either 0.75 (if n is odd), or
    is multiplied by 0.75 (if n is even).
  • Crossover: (xi), (yi) → (xi^a · yi^(1-a)), with
    a randomly chosen in 0 ≤ a ≤ 1.

31
Lecture 7
  • Mutation: pick two variables randomly, multiply
    one by a random factor q > 0 and the other by 1/q
    (just don't exceed the bounds on the variables;
    see the sketch below).
  • Reproduction: use standard proportional
    selection along with an elitist rule (preserve
    the best from one generation to the next).
  • Results: for n = 20, the system produced a best
    value of 0.80 in fewer than 4000 generations
    (population size 30, pc = 1.0, pm = 0.06); best
    value found 0.803553, worst value 0.802964.
    For n = 50, 30,000 generations gave a best value
    of 0.8331937. High rates of crossover and low
    rates of mutation worked best. Apparently, the
    results were better than those obtained by any
    other method tried.
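  • A sketch of these boundary-preserving operators,
    assuming a real-vector representation (the range
    used for q and the variable bound are our
    assumptions, not from the slides):

    import random

    def geometric_crossover(x, y):
        # If prod(x) = prod(y) = 0.75, the offspring satisfies
        # prod(z) = 0.75^a * 0.75^(1-a) = 0.75: it stays on the boundary.
        a = random.random()
        return [xi ** a * yi ** (1 - a) for xi, yi in zip(x, y)]

    def boundary_mutation(x, upper=10.0):
        # Multiply one coordinate by q > 0 and another by 1/q; the
        # product of all coordinates is unchanged. Skip the move if it
        # would exceed the (assumed) upper bound on a variable.
        i, j = random.sample(range(len(x)), 2)
        q = random.uniform(0.5, 2.0)
        z = x[:]
        if z[i] * q <= upper and z[j] / q <= upper:
            z[i] *= q
            z[j] /= q
        return z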

32
Lecture 7
  • Searching the Boundary II: a sphere. Maximize
    G3(x) = (√n)^n · ∏(i=1..n) xi, subject to
    Σ(i=1..n) xi² = 1, 0 ≤ xi ≤ 1.
  • It is easy to see that G3(x) has a global maximum
    of 1 at xi = 1/√n, i = 1, ..., n.
  • We now have to describe the algorithm.

33
Lecture 7
  • Initialization: randomly generate n variables
    yi, calculate s = √(Σ(i=1..n) yi²), and
    initialize an individual (xi) by xi = yi/s for
    i = 1, ..., n.
  • Crossover: from two parents (xi) and (yi),
    generate an offspring whose components combine
    the parents while preserving Σ xi² = 1 (sketched
    below).
  • Mutation: transform (xi) by selecting two
    indices, i ≠ j, and a random number p ∈ [0, 1],
    and set xi ← p·xi, xj ← q·xj, where q is chosen
    so that xi² + xj² is unchanged.
  • The evolutionary algorithm (with proportional
    selection and elitism), for n = 20 and
    population size 30, reached 0.99 in fewer than
    6000 generations, with pc = 1.0, pm = 0.06.
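  • A sketch of the three operators; the explicit
    crossover and mutation formulas are our
    reconstruction of operators that preserve
    Σ xi² = 1 (signs are ignored in the crossover
    for simplicity):

    import math, random

    def init_on_sphere(n):
        # Normalize a random vector so that sum(x_i^2) = 1.
        y = [random.uniform(-1.0, 1.0) for _ in range(n)]
        s = math.sqrt(sum(v * v for v in y))
        return [v / s for v in y]

    def sphere_crossover(x, y):
        # z_i = sqrt(a*x_i^2 + (1-a)*y_i^2) gives
        # sum(z_i^2) = a*sum(x_i^2) + (1-a)*sum(y_i^2) = 1.
        a = random.random()
        return [math.sqrt(a * xi * xi + (1 - a) * yi * yi)
                for xi, yi in zip(x, y)]

    def sphere_mutation(x):
        # Scale x_i by p and x_j by the q that restores x_i^2 + x_j^2.
        i, j = random.sample(range(len(x)), 2)
        if x[j] == 0.0:
            return x              # q would be undefined; skip the move
        p = random.random()
        q = math.sqrt((x[j] ** 2 + (1.0 - p * p) * x[i] ** 2) / x[j] ** 2)
        z = x[:]
        z[i] *= p
        z[j] *= q
        return z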

34
Lecture 7
  • Penalty Functions. We use the constraints to
    construct penalty functions that degrade the
    fitness of infeasible solutions: replace
    constraints by penalties. We define eval(x) =
    f(x) + penalty(x), where penalty(x) is a
    function (a sum?) of the amount of violation of
    each constraint.
  • We look at a few examples below, under the
    assumption that we are attempting to minimize a
    function subject to constraints. The paper
    Hamida & Petrowski 2000 has a fairly extensive -
    but compact - overview of the subject up to
    2000, and some other results.

35
Lecture 7
  • Static Penalties. For each constraint, construct
    a family of intervals to determine the penalty
    coefficients.
  • For each constraint, create l levels of
    violation.
  • For each level of violation and for each
    constraint, define a penalty coefficient Rij,
    1 ≤ i ≤ l, 1 ≤ j ≤ m. Higher levels of violation
    require larger (for minimization) values of the
    coefficient.
  • Start with a random population - individuals can
    be feasible or infeasible.
  • Evolve the population (mutation and crossover,
    maybe with elitism) using the fitness function
    eval(x) = f(x) + Σ(j=1..m) Rij·fj²(x).
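  • A sketch of the resulting fitness computation
    (0-based indexing; levels[j] holds the interval
    boundaries for constraint j, R the l-by-m table
    of coefficients; all names are ours):

    def eval_static(x, f, violations, levels, R):
        # f(x) plus R[i][j] * fj(x)^2 for each violated constraint j,
        # where i is the violation level into which fj(x) falls.
        total = f(x)
        for j, fj in enumerate(violations):
            v = fj(x)          # amount of violation; <= 0 means satisfied
            if v > 0.0:
                i = sum(1 for boundary in levels[j] if v > boundary)
                total += R[i][j] * v * v
        return total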

36
Lecture 7
  • One of the serious problems with this method is
    that it requires managing m(2l + 1) parameters:
    for each of the m constraints, l + 1 parameters
    define the levels of violation (the interval
    boundaries need to be assigned) and l parameters
    define the Rij. For m = 5 and l = 4, the number
    of parameters is 45. Some experiments were run
    with the function, constraints, and bounds below.

37
Lecture 7
  • The results were not overly encouraging. The
    optimum is known to occur at x = (78.0, 33.0,
    29.995, 45.0, 36.776), with value G4(x) =
    -30665.5. The best solution obtained (over 10
    trials) via the GA method just presented was
    x = (80.49, 35.07, 32.05, 40.33, 33.34) with
    G4(x) = -30005.7.
  • This points out the need to find good penalty
    coefficients; finding them is not trivial, and
    may not really be feasible much of the time. One
    might do better (a suggestion by Michalewicz) by
    changing the coefficients dynamically, on-line,
    as the evolution progresses.

38
Lecture 7
  • Dynamic Penalties. Rather than have a large,
    static number of parameters devoted to penalty
    evaluation, we can have a simpler penalty
    function that takes into account the number of
    generations since the beginning: eval(x) = f(x) +
    (C·t)^a · Σ(j=1..m) fj^b(x), where C, a, and b
    are constants. One choice for the parameters was
    C = 0.5, a = b = 2. As t grows larger, the
    coefficient of the penalty part grows larger.
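  • As a sketch (fj(x) is assumed to return the
    nonnegative amount of violation of constraint j):

    def eval_dynamic(x, t, f, violations, C=0.5, alpha=2.0, beta=2.0):
        # The penalty pressure (C*t)^alpha grows with the generation t.
        penalty = sum(fj(x) ** beta for fj in violations)
        return f(x) + (C * t) ** alpha * penalty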
  • Example:

39
Lecture 7
  • The best solution known is x = (679.9453,
    1026.067, 0.1188764, -0.3962336), with G5(x) =
    5126.4981. The best solution attained by the GA
    just described was evaluated at 5126.6652. No
    solution was fully feasible, due to the equality
    constraints, but the sum of the violated
    constraints was small (about 10^-4). The factor
    (C·t)^a appears to grow too quickly to be
    effective.
  • The method was claimed to have provided good
    results for quadratic evaluation functions.

40
Lecture 7
  • Annealing Penalties. This uses a variant of a
    method known as simulated annealing.
  • Divide all constraints into four subsets: linear
    equations, linear inequalities, nonlinear
    equations, nonlinear inequalities.
  • Select a random point as a starting point - the
    population consists of copies of this
    individual, which satisfies all the linear
    constraints.
  • Set the initial temperature t = t0.
  • eval(x, t) = f(x) + (1/(2t))·Σ(j=1..m) fj²(x).
  • If t < tf, stop; otherwise:
  • - decrease t,
  • - use the best available solution as the start
    of the next population,
  • - repeat the previous step of the algorithm.
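  • A skeleton of the loop, with evolve() standing
    for an unspecified GA phase that runs with the
    given fitness and returns the best individual
    found (our placeholder, not GENOCOP II's actual
    code):

    def anneal(f, violations, evolve, x0, t0=1.0, tf=1e-6, factor=0.1):
        t, best = t0, x0
        while t >= tf:
            def eval_x(x):
                # The coefficient 1/(2t) grows as the temperature drops.
                return f(x) + (1.0 / (2.0 * t)) * sum(
                    fj(x) ** 2 for fj in violations)
            best = evolve(eval_x, seed=best)  # restart population from best
            t *= factor                       # decrease the temperature
        return best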

41
Lecture 7
  • The algorithm maintains the feasibility of all
    linear constraints using a set of closed
    operators that convert a solution feasible with
    respect to the linear constraints into another
    such solution. Active constraints are considered
    once per generation, and the selective pressure
    (decrease in temperature) on infeasible
    solutions increases once per generation.
  • The method starts from a single point, and
    requires setting a start temperature t0 and a
    freeze temperature tf. In general, one might
    start with t0 = 1, ti+1 = 0.1·ti, tf = 0.000001.
    Example:

42
Lecture 7
  • The known global solution is x = (14.095,
    0.84296), with G6(x) = -6961.81381. Both
    constraints are active at the optimum. The
    (infeasible) starting point was x0 = (20.1, 5.84).

43
Lecture 7
  • The table reports the progress of GENOCOP II on
    this test case.

44
Lecture 7
  • Adaptive Penalties I. We replace a fixed,
    decreasing temperature schedule by incorporating
    feedback from the search into the adjustment of
    the penalties. We start with the formula
    eval(x) = f(x) + λ(t)·Σ(j=1..m) fj²(x), where
    λ(t) is updated at every generation t according
    to a rule (restated on the next slide) in which
    bi denotes the best individual in generation i;
    β1, β2 > 1 and β1 ≠ β2, to avoid cycling.
    Strictly speaking, you may want β2^m ≠ β1^n for
    all m, n > 0, but this may be overkill.

45
Lecture 7
  • This can be restated as:
  • if all the best individuals in the last k
    generations were feasible, the method decreases
    the penalty component in generation t + 1;
  • if all the best individuals in the last k
    generations were infeasible, the method
    increases the penalty component in generation
    t + 1;
  • if some of the best individuals in the last k
    generations were feasible and some infeasible,
    leave well enough alone - the penalty component
    is probably just right (you hope).
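  • A direct transcription of the three cases (the β
    values are our assumptions; best_was_feasible
    lists the feasibility of the best individual in
    each of the last k generations):

    def update_lambda(lam, best_was_feasible, beta1=2.0, beta2=3.0):
        # beta1, beta2 > 1 and beta1 != beta2, to avoid cycling.
        if all(best_was_feasible):
            return lam / beta1        # all feasible: relax the penalty
        if not any(best_was_feasible):
            return lam * beta2        # all infeasible: strengthen it
        return lam                    # mixed: leave well enough alone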

46
Lecture 7
  • Adaptive Penalties II. Given m constraints,
    define a near-feasible threshold qj for each
    constraint, 1 ≤ j ≤ m. The thresholds indicate
    reasonable distances from the feasible region
    F. We define the evaluation function as
  • eval(x, t) = f(x) + (Ffeas(t) - Fall(t)) ·
    Σ(j=1..m) (fj(x)/qj(t))^k, where Fall(t) denotes
    the unpenalized value of the best solution found
    so far (up to generation t), Ffeas(t) denotes
    the value of the best feasible solution found so
    far, and k is a constant. Note further that the
    thresholds qj(t) are dynamic (= generation
    dependent). An example could be the functions
    qj(t) = qj(0)/(1 + bj·t), increasing the penalty
    component over time.
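  • A sketch of this evaluation for minimization
    (q0[j] and b[j] stand for qj(0) and bj; all
    names are ours):

    def eval_adaptive2(x, t, f, violations, q0, b, F_feas, F_all, k=2.0):
        penalty = 0.0
        for j, fj in enumerate(violations):
            qjt = q0[j] / (1.0 + b[j] * t)  # shrinking threshold qj(t)
            penalty += (fj(x) / qjt) ** k
        return f(x) + (F_feas - F_all) * penalty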

47
Lecture 7
  • Death Penalty. Reject infeasible solutions.
    Example

48
Lecture 7
  • The problem has three linear and five nonlinear
    constraints; G7 is quadratic and has a global
    minimum at x = (2.171996, 2.363683,
    8.773926, 5.095984, 0.9906548, 1.430574,
    1.321644, 9.828726, 8.280092, 8.375927), with
    G7(x) = 24.3062091. GENOCOP did reasonably well,
    but the requirement that only feasible solutions
    be considered makes comparisons to other methods
    difficult; the standard deviation of the
    solution values was also high.

49
Lecture 7
  • Segregated Evolution. Create two penalty
    functions, one deliberately too small and one
    deliberately too large. There are two evaluation
    functions fi(x) = f(x) + pi(x). Rank the
    solutions according to both, and create the next
    generation by choosing the best individual from
    each list in turn, removing it from the list.
  • This should give a population that will converge
    to the optimum from both sides of the boundary.

50
Lecture 7
  • Search for Feasible Solutions I: Behavioral
    Memory Method.
  • Start with a random population (feasible and
    infeasible individuals).
  • Set j = 1 (a constraint counter, 1 ≤ j ≤ m).
  • Evolve the population with eval(x) = fj(x) (the
    j-th constraint-violation measure) until a given
    percentage of the population (the flip threshold
    φ) is feasible for this constraint.
  • Set j = j + 1.
  • The current population is the starting
    population, evolving with eval(x) = fj(x).
    During this phase, any individual that does not
    satisfy the first j - 1 constraints is
    eliminated from the population. Continue until
    the flip threshold is reached.
  • If j < m, repeat steps 4 and 5. Otherwise,
    optimize with eval(x) = f(x), rejecting
    infeasible individuals.
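  • A skeleton of the method, under heavy
    assumptions: evolve(pop, fitness, stop) stands
    for an unspecified GA driver that minimizes
    fitness and runs until stop holds, and each
    constraint is written as c(x) <= 0:

    def behavioral_memory(constraints, f, evolve, population, flip=0.9):
        def satisfied(pop, c):
            return sum(1 for x in pop if c(x) <= 0.0) / len(pop)
        for j, c in enumerate(constraints):
            # Eliminate individuals violating already-processed constraints.
            population = [x for x in population
                          if all(ck(x) <= 0.0 for ck in constraints[:j])]
            # Evolve on the violation of constraint j until the flip
            # threshold is reached.
            population = evolve(population,
                                fitness=lambda x: max(0.0, c(x)),
                                stop=lambda pop: satisfied(pop, c) >= flip)
        # Final phase: optimize f itself, rejecting infeasible individuals.
        return evolve(population, fitness=f, stop=None)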

51
Lecture 7
  • One of the immediate problems is posed by the
    ordering of the constraints: one can expect that
    different orderings will produce different
    results and different times to get there (the
    example of the ordering of database queries
    should tip us off). Another problem is that the
    sequence of constraint satisfactions may so
    limit the diversity of the population that the
    last optimizations can be defeated. Some
    mechanism (sharing) needs to be introduced to
    guarantee diversity.
  • For the original paper see Schoenauer &
    Xanthakis, 1993.

52
Lecture 7
  • Interlude: Sharing. In population biology one
    has the competitive exclusion principle: two
    distinct species cannot share the same
    ecological niche (= cannot make a living in
    exactly the same way). This observation-based
    biological principle has also been mathematically
    confirmed via models of competing predators by
    R. McGehee and others in the 1970s (the
    mathematical results indicate that competing
    predators occupying the same niche must be
    functionally identical). In our terms, we are
    going to diminish the fitness of any individual
    that has a large number of other individuals
    nearby, thus encouraging spread. In particular,
    for each individual solution i we define

53
Lecture 7
  • F'(i) = F(i) / Σ(j=1..n) sh(d(i, j)), where n is
    the cardinality of the population, F(i) is the
    fitness as given by the evaluation function,
    d(i, j) is a distance between solutions i and j
    (maybe asymmetrical), and sh(d(i, j)) is a
    sharing function, usually given as sh(d) = 1 -
    (d/σshare)^α for d < σshare, and 0 otherwise,
    where σshare defines the size of the
    neighborhood around the i-th solution and α is a
    scaling parameter.
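  • A sketch of the shared-fitness computation
    (σshare and α must be tuned, as noted below; F
    and d are passed in as functions):

    def shared_fitness(pop, F, d, sigma_share=1.0, alpha=1.0):
        def sh(dist):
            # Goldberg-style sharing function.
            if dist < sigma_share:
                return 1.0 - (dist / sigma_share) ** alpha
            return 0.0
        # Divide raw fitness by the niche count; the count includes the
        # individual itself (d(i, i) = 0, sh(0) = 1), so it is >= 1.
        return [F(i) / sum(sh(d(i, j)) for j in pop) for i in pop]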
  • Drawbacks: you must choose σshare and α; you
    must have a good distance function between
    individuals (and this is often representation
    dependent). σshare needs to be chosen to
    differentiate between nearby peaks without
    having any information about them. Solution: try
    several.
  • All the computations and comparisons required
    may also make the method impractical.

54
Lecture 7
  • The problem: maximize, subject to constraints.
  • We first examine some of the drawbacks of the
    use of penalty functions. We first observe that
    G8(x) has many local optima, mostly located
    along the edges of the x1-x2 plane; e.g.,
    G8(0.00015, 0.0225) > 1540. In the feasible
    region, G8 has two maxima of roughly equal
    fitness, about 0.1. If we were to use a penalty
    method, we would exploit the function G8(x) -
    a·(c1(x) + c2(x)) and explore various values
    for a. The pictures on the next slide point out
    some of the difficulties with the penalty
    method.

55
Lecture 7
  • Penalty Method.

56
Lecture 7
  • Behavioral Memory Method. This will first
    provide us with a population lying primarily in
    the feasible region for the first constraint
    (step 1), followed by a restriction of the
    population to one mostly in the intersection of
    the regions. At this point, we have a reasonable
    population with which to begin the search for a
    maximum. Notice that we now use a death penalty
    method along with the sharing methodology.

57
Lecture 7
  • Search for Feasible Solutions II: Superiority of
    Feasible Points. This is another variant of the
    penalty approach: eval(x) = f(x) + r·Σ(j=1..m)
    fj(x) + q(t, x), where r is a constant, t is the
    generation number, and q(t, x) is a
    generation-dependent function that alters the
    evaluation of infeasible solutions. The main
    goal is to construct a function eval(x) such
    that, for any feasible individual x and
    infeasible y, eval(x) < eval(y). This can be
    achieved in many ways; one of them is sketched
    below.
  • Infeasible individuals are penalized so they
    cannot be any better than the worst feasible one
    (max over x ∈ F of f(x)). Note that such
    individuals are worst only relative to the
    current population - that's how t comes in.
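  • One way to realize eval(x) < eval(y), sketched
    for minimization against the current population
    (the lifting rule is our assumption, in the
    spirit of the text):

    def eval_superiority(pop, f, violations, r=1.0):
        def viol(x):
            return sum(max(0.0, fj(x)) for fj in violations)
        feas = [f(x) for x in pop if viol(x) == 0.0]
        worst_feas = max(feas) if feas else 0.0
        scores = []
        for x in pop:
            v = viol(x)
            if v == 0.0:
                scores.append(f(x))
            else:
                # q(t, x) enters through the current population's worst
                # feasible value; since v > 0, this always exceeds it.
                scores.append(max(f(x) + r * v, worst_feas + r * v))
        return scores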

58
Lecture 7
  • Example. Minimize the function, subject to
    constraints and bounds.
  • The function has a global minimum at
  • x = (2.330499, 1.951372, -0.4775414,
    4.365726, -0.6244870, 1.038131, 1.594227),
    where it takes the value 680.6300573. The
    method just described produced a solution with
    value 680.934. Other variants are possible.

59
Lecture 7
  • Search for Feasible Solutions III: Repairing
    Infeasible Individuals, or GENOCOP III. This
    program possesses the ability to repair
    infeasible solutions, as well as some concept of
    co-evolution. Some of its strategies are as
    follows:
  • It eliminates linear equations, thus reducing
    the number of variables.
  • All linear inequalities are modified
    accordingly.
  • It replaces nonlinear equations by two-sided
    nonlinear inequalities (h(x) = 0 is replaced by
    -γ ≤ h(x) ≤ γ).
  • All points in the initial population must
    satisfy the linear constraints.
  • Denote by Fl the set of solutions that satisfy
    the linear inequalities.

60
Lecture 7
  • The feasible set, which also satisfies the
    nonlinear inequalities, is denoted by F. GENOCOP
    III coevolves two distinct populations. Pr is a
    population of reference points: points that are
    fully feasible - they satisfy all constraints.
    Ps is the population of search points, which
    must satisfy only the linear inequalities. The
    reference points are evaluated directly by the
    evaluation function, eval(r) = f(r); the search
    points are repaired via the following process:
    if the search point s is feasible (s ∈ F), then
    nothing is done and it is evaluated via f(s); if
    not, choose a reference point r from Pr, and
    create a sequence of random points z along the
    segment, z = a·s + (1 - a)·r. Eventually, you
    will generate a point close enough to r to
    satisfy all constraints. Then eval(s) = eval(z)
    = f(z). Furthermore, if f(z) is better than
    f(r), then z replaces r and, with some
    probability, can replace s.
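  • A sketch of the repair step, assuming a
    real-vector representation and minimization;
    feasible() tests all constraints, and
    replace_prob is the "some probability" of the
    text:

    import random

    def repair(s, reference_pop, feasible, f, replace_prob=0.2):
        r = random.choice(reference_pop)    # a fully feasible point
        while True:
            a = random.random()
            # Random point on the segment between s and r; small a means
            # close to r, so a feasible z is eventually found.
            z = [a * si + (1 - a) * ri for si, ri in zip(s, r)]
            if feasible(z):
                break
        if f(z) < f(r):
            r[:] = z                        # z replaces the reference point
        if random.random() < replace_prob:
            s[:] = z                        # ...and possibly the search point
        return f(z)                         # eval(s) = eval(z) = f(z)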

61
Lecture 7
  • The structure of GENOCOP III is given by the
    program on the right. Note that Pr is modified
    only every several generations.

62
Lecture 7
  • Evaluation of the search population.

63
Lecture 7
  • Example. Minimize, subject to constraints and
    bounds.
  • G10 has its global minimum at x = (579.3167,
    1359.943, 5110.071, 182.0174, 295.5985,
    217.9799, 286.4162, 395.5979), with G10(x) =
    7049.330923. GENOCOP III obtained 7286.650,
    better than various other attempts. The standard
    deviation of the solutions appears low.

64
Lecture 7
  • Search for Feasible Solutions IV: Decoder-Based
    Methods. The idea is, in principle, quite
    simple: if you start with a star-like domain as
    your feasible region (star-like domain: there
    exists at least one point in the domain that can
    see every point on the boundary - a line
    connecting this point to any boundary point is
    made up of points in the domain), you can
    transform the star-like domain into a cube
    (smoothness will be compromised at corner or
    edge points, but not at interior points) and use
    the linear geometry of the cube to create curves
    of approach to the boundary in the feasible
    region.

65
Lecture 7
  • An obvious advantage of this method is that we
    can always deal with feasible solutions ONLY: no
    repairs, no constraints, no penalties.
  • A disadvantage is that a complex region
    delimited by several surfaces may be very
    difficult to analyze in order to determine that
    it is star-like; even if we can do that, finding
    a universal visibility point may not be easy
    (the universal visibility neighborhood may make
    up a very small percentage of the region); and
    constructing a usable mapping (it must be fast,
    among other criteria it needs to satisfy) from
    the original feasible region to a cube (or
    sphere) may not be easy.
  • Nevertheless, it can be useful.

66
Lecture 7
  • The method may also be usable when we want to
    scale different portions of the feasible region
    in different ways. The figure below shows what
    can happen: the small region to the left of the
    vertical axis can be mapped into half of the
    cube. If the desired optimum is known to exist
    there, this will give us a better chance of
    finding it.

67
Lecture 7
  • Hybrid Methods. Combine genetic
    algorithms/genetic programming with anything
    else (e.g., gradient methods, greedy methods,
    etc.).
  • There is no guarantee that any particular search
    methodology will work over a large set of
    application domains - it is quite possible that
    a multi-strategy method will work better than a
    single-strategy one.