ICT619 Intelligent Systems Topic 5: Genetic Algorithms

Slides: 44
Provided by: drsham
1
ICT619 Intelligent Systems
Topic 5: Genetic Algorithms
2
Genetic Algorithms
  • Introduction
  • How GAs work
  • The TSP as an example
  • Business Applications of GA
  • Advantages of GA systems
  • Some issues related to GA based systems
  • Case Study

3
What is a genetic algorithm?
  • GA part of the broader soft computing (aka
    "computational intelligence") paradigm known as
    evolutionary computation
  • First introduced by Holland (1975)
  • Inspired by possibility of problem solving
    through a process of evolution

4
What is a GA? (contd)
  • GA mimics biological evolution to generate
    better solutions from existing solutions through:
  • survival of the fittest
  • crossbreeding and
  • mutation

5
What is a GA? (contd)
  • A GA is capable of finding solutions for many
    problems for which no usable algorithmic
    solutions exist
  • GA methodology particularly suited for
    optimization
  • Optimization searches a solution space consisting
    of a large number of possible solutions
  • GA reduces the search space through evolution of
    solutions, conceived as individuals in a
    population

6
Intelligence and Evolution
  • One way of understanding intelligence is as the
    capability of a creature to adapt itself to an
    ever-changing environment
  • We normally think of adaptation as changes in the
    characteristics (including behaviours) of a
    single animal in response to experiences over its
    history
  • But adaptation is also change in the
    characteristics of a species, over the
    generations, in response to environmental change
  • An individual creature is in competition with
    other individuals of the same species for
    resources, mates etc.
  • There is also rivalry from other species which
    may be a direct (predator) or indirect (food,
    water, land, etc.) threat
  • In nature, evolution operates on populations of
    organisms, ensuring by natural selection that
    characteristics that serve the members well tend
    to be passed on to the next generation, while
    those that don't die out

7
Evolution as Optimisation
  • Evolution can be seen as a process leading to the
    optimisation of a population's ability to survive
    and thus reproduce in a specific environment.
  • Evolutionary fitness - the measure of the ability
    to respond adequately to the environment, is the
    quantity that is actually optimised in natural
    life
  • Consider a normal population of rabbits. Some
    rabbits are naturally faster than others. Any
    characteristic has a range of variation that is
    due to i) sexual reproduction and ii) mutation
  • We may say that the faster rabbits possess
    superior fitness, since they have a greater
    chance of avoiding foxes, surviving and then
    breeding
  • If two parents have superior fitness, there is a
    good chance that a combination of their genes
    will produce an offspring with even higher
    fitness. We say that there is crossover between
    the parents' genes
  • Over successive generations, the entire
    population of rabbits tends to become faster to
    meet the environmental challenge posed by
    foxes

8
How GAs work
  • A population of candidate solutions -
    mathematical representations - is repeatedly
    altered until an acceptable solution is found
  • The GA evolutionary cycle:
  1. Starts with a randomly generated initial
    population of solutions (1st generation)
  2. Selects a population of better solutions (next
    generation) by using a measure of 'goodness' (a
    fitness evaluation function)
  3. Alters new generation population through
    crossbreeding and mutation
  • Processes of selection (step 2) and alteration
    (step 3) lead to a population with a higher
    proportion of better solutions

9
How GAs work (contd)
  • The GA evolutionary cycle continues until an
    acceptable solution is found in the current
    population,
  • or
  • some control parameter such as the maximum number
    of generations is exceeded

10
How solutions are represented
  • A series of genes, known as a chromosome,
    represents one possible solution
  • Each gene in the chromosome represents one
    component of the solution pattern
  • Each gene can have one of a number of possible
    values known as alleles
  • The process of converting a solution from its
    original form into a chromosome is known as
    coding

11
How solutions are represented (contd)
  • The most common form of representing a solution
    as a chromosome is a string of binary digits (aka
    a binary vector) eg 1010110101001
  • Each bit in this string is a gene with two
    alleles 0 and 1
  • Other forms of representation are also used, eg,
    integer vectors
  • Solution bit strings are decoded to enable them
    to be evaluated using a fitness measure
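
The decoding step above can be sketched in Python. This is a minimal illustration, not part of the original slides: the value range [0, 4] and the fitness function are assumptions chosen only to show how a bit string becomes a number that a fitness measure can score.

```python
def decode(bits: str, low: float, high: float) -> float:
    """Map a binary chromosome onto a real value in [low, high]."""
    as_int = int(bits, 2)                 # read the bit string as an integer
    max_int = 2 ** len(bits) - 1          # largest value the string can hold
    return low + (high - low) * as_int / max_int


def fitness(x: float) -> float:
    """Hypothetical fitness measure (an assumption): peaks at x = 2."""
    return -(x - 2.0) ** 2


chromosome = "1010110101001"              # the example string from the slide
x = decode(chromosome, 0.0, 4.0)
print(x, fitness(x))
```

Other codings (eg, integer vectors) would need their own decode function, but the fitness measure would be applied in the same way.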

12
GA Selection
  • Selection in GA based on a process analogous to
    that of biological evolution
  • Only the fittest survive and contribute to the
    gene pool of the next generation
  • Fitness proportional selection
  • Each chromosome's likelihood of being selected is
    proportional to its fitness value.
  • Solutions failing selection are bad, and are
    discarded
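
Fitness proportional selection is often pictured as a roulette wheel in which each chromosome owns a slice sized by its fitness. A minimal sketch follows, assuming non-negative fitness values; the function name and the sample population are illustrative only.

```python
import random


def roulette_select(population, fitnesses, rng):
    """Pick one chromosome with probability proportional to its fitness.

    Assumes every fitness value is non-negative and at least one is positive.
    """
    total = sum(fitnesses)
    spin = rng.uniform(0, total)          # where the wheel stops
    running = 0.0
    for chromosome, fit in zip(population, fitnesses):
        running += fit
        if spin < running:
            return chromosome
    return population[-1]                 # guard against rounding error


rng = random.Random(0)                    # fixed seed so runs are repeatable
population = ["0001", "0110", "1111"]
fitnesses = [1.0, 2.0, 7.0]               # "1111" owns 70% of the wheel
picks = [roulette_select(population, fitnesses, rng) for _ in range(10_000)]
share = picks.count("1111") / len(picks)
print(f"share of picks for '1111': {share:.2f}")
```

Weak chromosomes can still be picked occasionally, which keeps some diversity in the gene pool.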

13
Alteration: Crossover and Mutation
  • Alteration refines good solutions from current
    generation to produce next generation of
    solutions
  • Carried out by performing crossover and mutation
  • Crossover by splicing two chromosomes at a
    crossover point and swapping the spliced parts
  • A better chromosome may be created by combining
    genes with good characteristics from one
    chromosome with some good genes in the other
    chromosome
  • Crossover carried out with a probability,
    typically 0.7
  • Chromosomes not crossed over are cloned
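
Single-point crossover on bit strings can be sketched as follows. The function name and the use of Python strings as chromosomes are assumptions made for illustration.

```python
import random


def single_point_crossover(parent_a, parent_b, pc, rng):
    """Return two offspring; with probability 1 - pc the parents are cloned."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if rng.random() >= pc:
        return parent_a, parent_b         # no crossover: clone both parents
    point = rng.randint(1, len(parent_a) - 1)   # cut strictly inside the string
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


rng = random.Random(1)
print(single_point_crossover("11111111", "00000000", 0.7, rng))
```

Each child keeps the head of one parent and the tail of the other, so no gene is lost from the pair.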

14
Crossover and Mutation
  • Mutation
  • A random adjustment in the genetic composition
  • Can be useful for introducing new characteristics
    in a population
  • May be counterproductive
  • Probability kept low, typically 0.001 to 0.01
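
For a binary chromosome, mutation is commonly implemented as an independent bit flip per gene. The sketch below assumes that form; other codings would need a different mutation operator.

```python
import random


def mutate(chromosome, pm, rng):
    """Flip each bit independently with probability pm."""
    flipped = {"0": "1", "1": "0"}
    return "".join(
        flipped[bit] if rng.random() < pm else bit
        for bit in chromosome
    )


rng = random.Random(2)
original = "0" * 1000
mutant = mutate(original, 0.01, rng)
print(mutant.count("1"), "of", len(original), "bits flipped")
```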

15
An albino is a common mutation
16
The typical Genetic Algorithm
  1. Represent the solution as a chromosome of fixed
    length, choose size of population N, crossover
    probability pc and mutation probability pm.
  2. Define a fitness function f for measuring fitness
    of chromosomes.
  3. Create an initial solution population randomly of
    size N: x1, x2, ..., xN
  4. Use the fitness function f to evaluate the
    fitness value of each solution in the current
    generation: f(x1), f(x2), ..., f(xN)

17
The typical Genetic Algorithm (contd)
  5. Select good solutions based on fitness value.
    Discard rest of the solutions.
  6. If acceptable solution(s) found in the current
    generation or maximum number of generations is
    exceeded then stop.
  7. Alter the solution population using crossover and
    mutation to create a new generation of solutions
    with population size N.
  8. Go to step 4.
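
The whole cycle can be put together in a short Python sketch. Everything specific here is an assumption made for illustration: the toy "count the 1 bits" fitness function, the parameter defaults, and the choice to remember the best chromosome seen so far. It is one possible reading of the steps above, not the slides' own implementation.

```python
import random


def fitness(chromosome):
    """Toy fitness (an assumption): the number of 1 bits in the string."""
    return chromosome.count("1")


def select(population, fitnesses, rng):
    """Fitness-proportional choice of one parent."""
    # A small offset keeps the weights valid even if every fitness is zero.
    weights = [f + 1e-9 for f in fitnesses]
    return rng.choices(population, weights=weights, k=1)[0]


def crossover(a, b, pc, rng):
    """Single-point crossover with probability pc, otherwise clone."""
    if rng.random() >= pc:
        return a, b
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(chromosome, pm, rng):
    """Flip each bit independently with probability pm."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < pm else bit
        for bit in chromosome
    )


def run_ga(length=20, n=30, pc=0.7, pm=0.01, max_generations=200, seed=0):
    rng = random.Random(seed)
    # Random initial population of n fixed-length chromosomes.
    population = ["".join(rng.choice("01") for _ in range(length))
                  for _ in range(n)]
    best = max(population, key=fitness)
    for generation in range(max_generations):
        # Evaluate the current generation.
        fitnesses = [fitness(c) for c in population]
        best = max(population + [best], key=fitness)
        # Stop once an acceptable solution has been found.
        if fitness(best) == length:
            return best, generation
        # Select parents, then alter them to build the next generation.
        next_population = []
        while len(next_population) < n:
            a = select(population, fitnesses, rng)
            b = select(population, fitnesses, rng)
            for child in crossover(a, b, pc, rng):
                next_population.append(mutate(child, pm, rng))
        population = next_population[:n]
    # Generation limit reached without an acceptable solution.
    return best, max_generations


best, generations = run_ga()
print(best, fitness(best), generations)
```

Swapping in a different fitness function and decode step is all that is needed to point the same loop at another problem.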

18
The typical Genetic Algorithm (contd)
19
The Travelling Salesperson Problem
  • Given a set of n cities (A, B, C, ...) find a
    closed tour of all cities with the shortest total
    distance d
  • Tour 'cost' may be something other than distance
    d
  • This is an optimization problem with following
    constraints
  • 1. Each city to be visited once and only once
  • 2. Total distance travelled must be shortest
    possible
  • The time required to find a solution by
    exhaustive search increases exponentially - the
    problem is NP-hard
  • Possible number of tours for n cities: n!/(2n) =
    (n-1)!/2
  • For 50 cities this is about 3 x 10^62 tours: far
    more than 1 million centuries of searching at the
    rate of 1 billion tours per sec!
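
The size of the search space can be checked with a few lines of Python. The sketch assumes a 365.25-day year and uses the rate of 1 billion tours per second quoted above; the city counts other than 50 are added only for comparison.

```python
import math


def tour_count(n: int) -> int:
    """Distinct closed tours of n cities: n! / (2n) = (n - 1)! / 2."""
    return math.factorial(n - 1) // 2


SECONDS_PER_YEAR = 365.25 * 24 * 3600     # assumption: Julian year
RATE = 1e9                                # tours checked per second

for n in (5, 10, 25, 50):
    tours = tour_count(n)
    years = tours / RATE / SECONDS_PER_YEAR
    print(f"n={n:2d}  tours={tours:.3e}  years={years:.3e}")
```

The division by 2n reflects that a closed tour is the same whichever city it starts from (n choices) and whichever direction it is travelled in (2 choices).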

20
The Travelling Salesperson Problem
In 1987, Martin Groetschel and Olaf Holland found
an optimal tour of 666 interesting places in the
world. Source: http://www.tsp.gatech.edu//index.html
21
The TSP example (contd)
  • Representation and coding of TSP solutions
  • The representation might be an ordered list of
    numbers each representing a city nominally (known
    as order-based GA)
  • 1) London 3) Dunedin 5) Beijing 7) Tokyo
  • 2) Venice 4) Singapore 6) Phoenix 8) Victoria
  • CityList1 (3 5 7 2 1 6 4 8)
  • CityList2 (2 5 7 6 8 1 3 4)
  • Alternatively, the representation of the solution
    may be encoded in binary on a matrix...
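
With the order-based representation, evaluating a chromosome amounts to walking the list and adding up the legs of the tour. The sketch below uses made-up planar coordinates for cities 1 to 8, since the slides give no distances; only the two city lists come from the slide.

```python
import math

# Hypothetical planar coordinates for cities numbered 1..8; a real system
# would use actual distances or travel costs between the named cities.
coords = {
    1: (0, 0), 2: (1, 0), 3: (2, 0), 4: (2, 1),
    5: (2, 2), 6: (1, 2), 7: (0, 2), 8: (0, 1),
}


def tour_length(tour):
    """Total length of the closed tour, returning to the starting city."""
    total = 0.0
    for here, there in zip(tour, tour[1:] + tour[:1]):
        total += math.dist(coords[here], coords[there])
    return total


def is_valid(tour):
    """Each city must appear once and only once."""
    return sorted(tour) == sorted(coords)


city_list_1 = [3, 5, 7, 2, 1, 6, 4, 8]
city_list_2 = [2, 5, 7, 6, 8, 1, 3, 4]
for tour in (city_list_1, city_list_2):
    print(tour, is_valid(tour), round(tour_length(tour), 3))
```

A shorter tour is a fitter chromosome, so a GA would typically use the negative or the reciprocal of this length as its fitness value.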

22
The TSP example (contd)
  • Representation and coding of TSP solutions
  • A solution to the TSP problem is an ordered list
    of the n cities
  • Each city is assigned 1 out of n possible
    positions
  • Representation of the solution may be visualised
    as a table
  • Each row represents a city
  • Each column associated with a tour position for
    cities

23
The TSP example (contd)
  • The tour represented above is CAEBDC
  • One possible bit string code for this solution
  • 01000 00010 10000 00001 00100
  • (rows written end to end)
  • Binary bit strings can produce "faulty"
    chromosomes needing repair
  • An integer vector scheme produced a 100 city tour
    9.4% above optimal cost
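
The row-per-city, column-per-position coding can be reproduced in Python, along with a check that shows how splicing two valid bit strings can yield a "faulty" chromosome. The function names and the second tour ABCDE are assumptions made for illustration.

```python
def encode(tour, cities):
    """Row per city, column per tour position; rows written end to end."""
    n = len(cities)
    rows = []
    for city in cities:
        position = tour.index(city)
        rows.append("".join("1" if col == position else "0" for col in range(n)))
    return " ".join(rows)


def is_valid(code, n):
    """Valid tours have exactly one 1 in every row and in every column."""
    rows = code.split()
    if len(rows) != n or any(len(r) != n for r in rows):
        return False
    one_per_row = all(r.count("1") == 1 for r in rows)
    one_per_col = all(sum(r[c] == "1" for r in rows) == 1 for c in range(n))
    return one_per_row and one_per_col


cities = "ABCDE"
code = encode("CAEBD", cities)            # the closed tour C A E B D C
print(code, is_valid(code, 5))

# Splice the first two rows of one valid tour onto the rest of another.
other = encode("ABCDE", cities)
spliced = " ".join(code.split()[:2] + other.split()[2:])
print(spliced, is_valid(spliced, 5))
```

In the spliced string two cities claim the same tour position and one position is left empty, which is why such chromosomes need repair before they can be evaluated.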

24
An Optimal 100-City Tour
25
Business Applications of GA
  • Increasing number of industrial and business
    applications of GA since late 1980s
  • In business, applications include (Kingdon 1997)
  • Portfolio optimisation
  • Bankruptcy prediction
  • Financial forecasting
  • Fraud detection
  • Scheduling
  • Design of complex machines, eg, jet engines

26
Business Applications of GA (contd)
  • First Quadrant - investment firm in California
  • Started using GA technique in 1993
  • Uses GA to manage US$5 billion worth of
    investments in 17 different countries
  • Their evolved model earns, on average, $255 for
    every $100 invested over six years, as opposed to
    $205 for other types of modeling systems (Begley
    & Beals, 1995)

27
Advantages of GA systems
  • Useful when no algorithms or heuristics are
    available for solving a problem
  • No formulation of the solution is required - only
    "recognition" of a good solution
  • A GA system can be built as long as a solution
    representation and an evaluation scheme can be
    worked out
  • So minimal domain expert access is required

28
Advantages of GA systems
  • GA can act as an alternative to -
  • Expert Systems if
  • number of rules is too large or
  • the nature of the knowledge-base too dynamic
  • Traditional optimization techniques if
  • constraints and objective functions are
    non-linear and/or discontinuous

29
Advantages of GA systems (cont'd)
  • GA does not guarantee optimal solutions, but
    produces near optimal solutions which are likely
    to be very good
  • Solution time with GA is highly predictable.
    Determined by:
  • Size of the population
  • Time taken to decode and evaluate a solution and
  • Number of generations of population
  • GA uses simple operations to solve problems that
    are computationally prohibitive otherwise
  • Example: the TSP
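
The three factors that determine solution time multiply together, which is what makes the run time easy to predict. A rough sketch of the arithmetic follows; the figures are hypothetical and the estimate ignores the comparatively small cost of selection, crossover and mutation.

```python
def estimated_run_time(population_size, seconds_per_evaluation, generations):
    """Rough bound: every solution is decoded and evaluated each generation."""
    return population_size * seconds_per_evaluation * generations


# Hypothetical figures, chosen only to illustrate the arithmetic.
seconds = estimated_run_time(population_size=100,
                             seconds_per_evaluation=0.002,
                             generations=500)
print(f"about {seconds:.0f} seconds")
```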

30
Advantages of GA systems (cont'd)
  • Because of simplicity, GA software is
  • Reasonably sized and self-contained
  • Easier to embed as a module in another
    system
  • GA can also aid in developing intelligent
    business systems that use other methodologies,
    eg,
  • Building the rule base of an expert system
  • Finding optimal neural networks

31
Some issues related to GA based systems
  • Level of explainability
  • Capability to explain why a particular solution
    was arrived at is practically nil
  • The system does not know what a fitness value
    really means
  • Scalability
  • Moderately scalable
  • Accommodates increased number of variables by
    increasing the length of the chromosome
  • But
  • A longer chromosome means a larger search
    space (more potential combinations of genes)
  • More time required for decoding and fitness
    evaluation

32
Some issues related to GA based systems (contd)
  • Data requirements
  • In general, GAs do not require extensive access to
    data but some applications may need it to
    evaluate solutions
  • This makes the quality and quantity of data
    important
  • Local maxima
  • Local maxima are regions that hold good solutions
    relative to regions around them, but which do not
    necessarily contain the best overall solutions
  • The region(s) that contain the best solutions are
    called global maxima
  • GAs are less prone to being trapped in local
    maxima because of the use of mutation and
    crossover

33
Some issues related to GA based systems (contd)
  • Premature convergence
  • A GA is said to have converged prematurely if it
    explores a local maximum extensively
  • It may be then dominated by very similar
    solutions within the region
  • Most significant factor leading to such
    convergence is a mutation rate which is too low
  • Mutation interference is an effect opposite to
    that of premature convergence

34
Some issues related to GA based systems (contd)
  • Mutation interference
  • Finding a mutation rate which allows the GA to
    converge but which also allows adequate
    exploration of the solution space is essential
    for satisfactory performance
  • Mutation interference occurs when mutation rates
    in a GA are too high, and as a result
  • Solutions are frequently or drastically mutated
  • The algorithm never manages to explore any region
    of the space thoroughly
  • Any good solutions found tend to be destroyed
    rapidly

35
Case Study - Help Desk Task Scheduling (Dhar &
Stein 1997, pp. 219-227)
  • GA based system developed at Moody's for
    scheduling service tasks to its customer service
    representatives (CSRs)
  • Major constraints
  • The system
  • Must minimise computer downtime and customer
    dissatisfaction
  • Must integrate with existing database system
    which kept track of help desk requests.

36
Case study constraints (contd)
  • Must be flexible to
  • Accommodate new types of task definitions and
    changes in employee, training etc.
  • Allow administrator to modify solutions
  • Must generate and reevaluate schedules quickly
    (under 15 minutes) and consistently
  • Must not take administrator or CSRs away from
    their jobs for any extended period of time
  • Must be developed quickly

37
Case study constraints (contd)
  • Should be scalable in case of future growth in
    number of requests for help and the number of
    CSRs
  • Must not be too complicated for its users: the
    administrator and CSRs
  • The main difficulties in meeting the constraints
  • the large number of tasks
  • the large number of CSRs
  • the varying capabilities of CSRs, and
  • the wide variety of tasks

38
Case Study - Variables and issues needing to be
considered
  • The priority of a task, which is determined by
    the severity of the problem
  • The length of time required to perform the task
    and how it would affect the servicing of other
    users
  • The ability of various CSRs to perform different
    levels of tasks (expertise must match the
    complexity)
  • Low priority tasks must not be kept waiting
    indefinitely
  • The measure of goodness of a schedule to be based
    on the amount of downtime each schedule costs the
    organisation.

39
Possible solution methodologies considered
  • Traditional linear programming (a numerical
    optimisation technique)
  • A rule based expert system
  • A GA based system
  • ES ruled out because
  • Expertise to solve this problem not expressible
    as a set of rules
  • Help desk administrator not available for
    knowledge extraction
  • Linear programming ruled out because
  • It fails if no optimal solution can be found
  • It does not produce any sub-optimal solutions,
    whereas a GA does.

40
Case Study - the solution
  • SOGA (Schedule Optimising for GA)
  • A hybrid system consisting of GA and fuzzy system
    components
  • The GA component deals with the scheduling task.
  • Each task in the queue is represented by a gene
  • The entire task list forms the chromosome
  • Each chromosome is decoded by a scheduling module
    that assigns tasks to available CSRs who can
    perform them

41
Case Study - the solution (contd)
  • Fitness of each chromosome is determined by
    calculating the amount of downtime that would
    result based on the schedule represented by the
    chromosome.
  • Schedules generated by the GA component are
    modified by the FS component
  • SOGA runs in the background behind the help
    request tracking system
  • Updates schedules based upon a predefined time
    interval (eg, every 10 or 15 minutes)
  • CSRs access their current job queue through their
    interface to accept jobs.
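
The decode-then-score idea behind the GA component can be illustrated with a small Python sketch. Nothing below is taken from the actual SOGA system: the task records, CSR skill levels, the greedy "earliest available capable CSR" rule and the downtime cost formula are all assumptions, and the fuzzy system component is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    minutes: int          # time needed to perform the task
    level: int            # expertise the task requires
    cost_per_minute: int  # assumed downtime cost until the task is finished


# Hypothetical data; the real SOGA task and CSR records are not described.
TASKS = [
    Task("server down", 30, 3, 10),
    Task("password reset", 5, 1, 1),
    Task("printer jam", 10, 1, 2),
    Task("database restore", 40, 3, 8),
]
CSR_LEVELS = {"csr_a": 3, "csr_b": 1}


def decode(chromosome):
    """Turn an ordering of task indices into (task, csr, start, end) rows."""
    free_at = {csr: 0 for csr in CSR_LEVELS}
    schedule = []
    for index in chromosome:
        task = TASKS[index]
        # Only CSRs whose expertise matches the task's complexity qualify.
        able = [c for c, lvl in CSR_LEVELS.items() if lvl >= task.level]
        csr = min(able, key=lambda c: free_at[c])   # earliest available
        start = free_at[csr]
        free_at[csr] = start + task.minutes
        schedule.append((task.name, csr, start, free_at[csr]))
    return schedule


def downtime(chromosome):
    """Total downtime cost: each task costs until it is finished."""
    cost_of = {t.name: t.cost_per_minute for t in TASKS}
    return sum(end * cost_of[name]
               for name, _csr, _start, end in decode(chromosome))


print(downtime([0, 1, 2, 3]), downtime([1, 2, 3, 0]))
```

Here each gene is a task and the chromosome is the order in which the scheduling module considers them; a lower downtime figure means a fitter chromosome.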

42
Case Study - Results
  • The system is timely, generating schedules in
    about 5 minutes.
  • The solutions are found to be good by the help
    desk administrator
  • The system is flexible enough to allow for new
    task definitions
  • The system scales up well to larger domains
    (higher number of tasks)
  • The SOGA system was developed in two months using
    one programmer overseen by its designers

43
REFERENCES
  • Begley, S. and Beals, G. "Software au naturel."
    Newsweek, May 8, 1995, p.70
  • Dhar, V., Stein, R., Seven Methods for
    Transforming Corporate Data into Business
    Intelligence, Prentice Hall 1997, pp. 126-148,
    203-210.
  • Goldberg, D. E., Genetic and Evolutionary
    Algorithms Come of Age, Communications of the
    ACM, Vol.37, No.3, March 1994, pp.113-119.
  • Holland, J. H., Adaptation in Natural and
    Artificial Systems, Univ. of Michigan Press,
    1975.
  • Kingdon, J., Intelligent Systems and Financial
    Forecasting, Springer Verlag, London 1997.
  • Medsker, L., Hybrid Intelligent Systems, Kluwer
    Academic Press, Boston 1995.
  • Michalewicz, Z., Genetic Algorithms + Data
    Structures = Evolution Programs, Springer-Verlag,
    Berlin 1996.
  • Negnevitsky, M., Artificial Intelligence: A Guide
    to Intelligent Systems, Addison-Wesley 2005.