1Class 2: Basic Search Concepts, Hill Climbing
Gabriela Ochoa http://www.ldc.usb.ve/gabro/
2Contents
- Review: Optimization and Search
- Fitness Landscape
- Basic Concepts
- Case Studies: SAT, TSP and NLP
- Neighbourhoods and Local Search
- Hillclimbing
3Optimization
- Objective: find the global optimum (or optima) of any problem
- No essential difference between maximizing and minimizing
- There is no single optimization approach that is good for all types of optimization problems (see the No Free Lunch (NFL) theorems)
- Better to classify approaches based on special features of problems
4Optimization Problems
- Numerical Optimization
  - Decision variables are real numbers
  - Objective function has an algebraic expression (several variables)
- Combinatorial Optimization
  - Decision variables are discrete
  - Solutions usually take the form of permutations
  - Objective function has a more complex expression: summations, products, etc.
  - Graph problems, travelling salesman, partitioning
5Fitness Landscape (2 traits)
6Basic Concepts
- Representation: encodes alternative candidate solutions for manipulation
- Objective: describes the purpose to be fulfilled
- Evaluation function: returns a specific value that indicates the quality of any particular solution given the representation
7Search Problem
- Definition: given a search space $S$ and its feasible part $F \subseteq S$, find $x \in F$ such that
  - $\mathrm{eval}(x) \le \mathrm{eval}(y)$ for all $y \in F$ (minimization)
- The point $x$ that satisfies the above condition is called a global solution
- The terms "search problem" and "optimization problem" are considered synonymous. The search for the best solution is the optimization problem
8Boolean satisfiability problem (SAT)
- An instance of the problem is defined by a Boolean expression written using only AND, OR, NOT, variables, and parentheses.
- The question is: given the expression, is there some assignment of TRUE and FALSE values to the variables that will make the entire expression true?
- SAT is of central importance in various areas of computer science, including theoretical computer science, algorithmics, artificial intelligence, hardware design and verification.
9Computational Complexity of SAT
- SAT is NP-complete. In fact, it was the first known NP-complete problem, as proved by Stephen Cook in 1971
- The problem remains NP-complete even if all expressions are written in conjunctive normal form with 3 variables per clause (3-CNF):
  - (x11 OR x12 OR x13) AND
  - (x21 OR x22 OR x23) AND
  - (x31 OR x32 OR x33) AND ...
- where each x is a variable, with or without a NOT in front of it, and each variable can appear multiple times in the expression.
10Problem Formulation (SAT)
- Let us consider a problem of size 30 (i.e. 30 variables)
- Representation: 1 = TRUE, 0 = FALSE; a binary string of length 30
- Search space: 2 choices for each variable, taken over 30 variables, generates $2^{30}$ possibilities
- Objective: to find the vector of bits such that the compound Boolean statement is satisfied (made true)
- Evaluation function? The objective alone (satisfied or not satisfied) does not give enough information to compare candidate solutions
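The slide raises the question of the evaluation function for SAT without settling it. As a minimal sketch, one commonly used choice is to count the satisfied clauses of a CNF formula; this choice is an assumption, not something the slides state. The clause encoding (signed integers, `+i` for x_i and `-i` for NOT x_i) and the function names are hypothetical.

```python
from typing import List, Sequence

# Hypothetical encoding: +i means variable x_i, -i means NOT x_i (1-based)
Clause = Sequence[int]


def clause_satisfied(clause: Clause, bits: Sequence[int]) -> bool:
    """True if at least one literal of the clause is true under the assignment."""
    return any((bits[abs(lit) - 1] == 1) == (lit > 0) for lit in clause)


def is_satisfied(cnf: List[Clause], bits: Sequence[int]) -> bool:
    """The objective: TRUE only when every clause is satisfied."""
    return all(clause_satisfied(c, bits) for c in cnf)


def count_satisfied(cnf: List[Clause], bits: Sequence[int]) -> int:
    """Assumed evaluation function: number of satisfied clauses (to be maximized)."""
    return sum(clause_satisfied(c, bits) for c in cnf)


# (x1 OR NOT x2 OR x3) AND (NOT x1 OR x2 OR x3) AND (NOT x1 OR NOT x2 OR NOT x3)
cnf = [(1, -2, 3), (-1, 2, 3), (-1, -2, -3)]
print(is_satisfied(cnf, [1, 1, 1]), count_satisfied(cnf, [1, 1, 1]))  # False 2
print(is_satisfied(cnf, [1, 1, 0]), count_satisfied(cnf, [1, 1, 0]))  # True 3
```

The objective only reports TRUE or FALSE, whereas the clause count distinguishes an assignment that violates one clause from one that violates many, which gives a search method something to improve.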
11Travelling salesman problem (TSP)
- Given a number of cities and the costs of travelling from one to the other, what is the cheapest round-trip route that visits each city and then returns to the starting city?
- An equivalent formulation in terms of graph theory is: find the Hamiltonian cycle with the least weight in a weighted graph.
12Problem Formulation (TSP)
- Let us consider a problem of size 30 (i.e. 30 cities)
- Representation: a permutation of the natural numbers 1, ..., 30, where each number corresponds to a city to be visited in sequence
- Search space: permutations of all cities. For the symmetric TSP, where the circuit is the same regardless of the starting city, there are $(n-1)!/2$ distinct tours
- Objective: minimize the total distance traversed, visiting each city once and returning to the starting city: $\min \sum \mathrm{dist}(x, y)$, summed over consecutive cities $x, y$ in the tour
- Evaluation function: map each tour to its corresponding total distance
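The evaluation function for the TSP can be sketched as follows. This is a minimal illustration, assuming distances are given as a matrix; the function name `tour_length` and the 4-city matrix are hypothetical and not taken from the slides.

```python
from typing import List, Sequence


def tour_length(tour: Sequence[int], dist: List[List[float]]) -> float:
    """Total distance of a closed tour; cities are numbered from 1."""
    total = 0.0
    for k in range(len(tour)):
        a = tour[k]
        b = tour[(k + 1) % len(tour)]  # wrap around to return to the start
        total += dist[a - 1][b - 1]
    return total


# Hypothetical symmetric distance matrix for 4 cities
dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
print(tour_length([1, 2, 4, 3], dist))  # 2 + 4 + 3 + 9 = 18.0
print(tour_length([2, 4, 3, 1], dist))  # same circuit, different start: 18.0
```

The second call shows why the starting city does not matter in the symmetric case: rotating the permutation describes the same circuit.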
13Neighbourhoods and local Optima
- Region of the search space that is near to some
particular point in that space
[Figure: a search space S, a potential solution x, and its neighbourhood N(x)]
14Defining Neighbourhoods 1
- Define a distance function dist on the search space $S$
  - $\mathrm{dist}: S \times S \to \mathbb{R}$
  - $N(x) = \{\, y \in S : \mathrm{dist}(x, y) \le \epsilon \,\}$
- Examples
  - Euclidean distance, for search spaces defined over continuous variables
  - Hamming distance, for search spaces defined over binary strings (e.g. SAT)
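A distance-based neighbourhood over binary strings can be sketched as follows. This is only an illustration under assumed names (`hamming`, `neighbourhood`); the brute-force enumeration is chosen for clarity and is practical only for short strings.

```python
from itertools import product
from typing import List, Sequence, Tuple


def hamming(x: Sequence[int], y: Sequence[int]) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    if len(x) != len(y):
        raise ValueError("strings must have the same length")
    return sum(a != b for a, b in zip(x, y))


def neighbourhood(x: Sequence[int], eps: int) -> List[Tuple[int, ...]]:
    """All bit strings y with hamming(x, y) <= eps.

    Brute force: scans all 2^n candidates, so only usable for small n.
    """
    return [y for y in product((0, 1), repeat=len(x)) if hamming(x, y) <= eps]


x = (1, 1, 0, 0, 1)
print(hamming(x, (0, 1, 0, 0, 1)))  # 1
print(len(neighbourhood(x, 1)))     # x itself plus its 5 one-bit variants: 6
```

Note that with this definition x belongs to its own neighbourhood, since dist(x, x) = 0.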
15Defining Neighbourhoods, TSP
- Use a mapping $m$ that defines a neighbourhood for any point $x \in S$
- The 2-swap mapping generates a new set of potential solutions from a given solution $x$
- Solutions are generated by swapping two cities in a given tour
- Every solution has $n(n-1)/2$ neighbours
- Example: 2 4 5 3 1 → 2 3 5 4 1 (cities 4 and 3 swapped)
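The 2-swap mapping can be sketched in a few lines. The function name `two_swap_neighbours` is hypothetical; the tour used is the one from the slide's example.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def two_swap_neighbours(tour: Sequence[int]) -> List[Tuple[int, ...]]:
    """All tours obtained by exchanging the cities at two positions."""
    neighbours = []
    for i, j in combinations(range(len(tour)), 2):
        candidate = list(tour)
        candidate[i], candidate[j] = candidate[j], candidate[i]
        neighbours.append(tuple(candidate))
    return neighbours


tour = (2, 4, 5, 3, 1)
neighbours = two_swap_neighbours(tour)
print(len(neighbours))                # n(n-1)/2 = 10 for n = 5
print((2, 3, 5, 4, 1) in neighbours)  # True: the slide's example
```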
16Defining Neighbourhoods, SAT
- The 1-flip mapping generates a new set of potential solutions from a given solution $x$
- Solutions are generated by flipping a single bit in the given bit string
- Every solution has $n$ neighbours
- Example: 1 1 0 0 1 → 0 1 0 0 1 (1st bit flipped)
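A minimal sketch of the 1-flip mapping, with the bit string stored as a tuple; the function name `one_flip_neighbours` is an assumption made for illustration.

```python
from typing import List, Tuple

Bits = Tuple[int, ...]


def one_flip_neighbours(bits: Bits) -> List[Bits]:
    """All bit strings that differ from `bits` in exactly one position."""
    return [
        bits[:i] + (1 - bits[i],) + bits[i + 1:]
        for i in range(len(bits))
    ]


x = (1, 1, 0, 0, 1)
flips = one_flip_neighbours(x)
print(len(flips))  # n = 5 neighbours
print(flips[0])    # (0, 1, 0, 0, 1): the 1st bit flipped
```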
17Defining Neighbourhoods, Real Numbers
- A Gaussian distribution for each variable defines a neighbourhood
- Mean: the current point; std. dev.: 1/6 of the range of the variable
- $x = (x_1, \ldots, x_n)$, where $l_i \le x_i \le u_i$
- $x_i' = x_i + N(0, \sigma_i)$, where $\sigma_i = (u_i - l_i)/6$
- $N(0, \sigma_i)$ is an independent random Gaussian number with mean zero and std. dev. $\sigma_i$
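The Gaussian neighbourhood for real-valued variables can be sketched as below. The function name `gaussian_neighbour` is hypothetical, and clipping the perturbed value back into its range is an assumption: the slide does not say how out-of-range values are handled.

```python
import random
from typing import List, Sequence, Tuple


def gaussian_neighbour(
    x: Sequence[float],
    bounds: Sequence[Tuple[float, float]],
    rng: random.Random,
) -> List[float]:
    """Perturb each variable with Gaussian noise of std. dev. (u_i - l_i) / 6."""
    y = []
    for xi, (lo, hi) in zip(x, bounds):
        sigma = (hi - lo) / 6.0
        yi = xi + rng.gauss(0.0, sigma)
        # Assumption: clip to [lo, hi] so the neighbour stays feasible
        y.append(min(max(yi, lo), hi))
    return y


rng = random.Random(0)  # fixed seed so the run is repeatable
bounds = [(-1.0, 1.0), (0.0, 10.0)]
print(gaussian_neighbour([0.0, 5.0], bounds, rng))
```

With a standard deviation of one sixth of the range, a point at the centre of the range almost always stays inside the bounds, since they lie three standard deviations away on each side.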
18Local Optimum
- A potential solution $x \in S$ is a local optimum with respect to the neighbourhood $N$ if and only if
  - $\mathrm{eval}(x) \le \mathrm{eval}(y)$ for all $y \in N(x)$ (minimization)
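The definition translates directly into a check over the neighbourhood. The sketch below assumes minimization; the toy search space (positions in a short list of values, with adjacent positions as neighbours) and all names are made up for illustration.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def is_local_optimum(
    x: T,
    evaluate: Callable[[T], float],
    neighbours: Callable[[T], Iterable[T]],
) -> bool:
    """True if no neighbour of x has a strictly smaller evaluation (minimization)."""
    fx = evaluate(x)
    return all(fx <= evaluate(y) for y in neighbours(x))


# Toy search space: positions 0..4, each evaluated by a made-up value
values = [5, 3, 4, 1, 2]


def evaluate(i: int) -> float:
    return values[i]


def neighbours(i: int) -> list:
    """Adjacent positions that exist in the list."""
    return [j for j in (i - 1, i + 1) if 0 <= j < len(values)]


print([i for i in range(len(values)) if is_local_optimum(i, evaluate, neighbours)])
# [1, 3]: position 3 is the global optimum, position 1 is only a local one
```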
19Hillclimbing Methods - 1
- Use an iterative improvement technique
- Start from a point (the current point) in the search space
- At each iteration, a new point is selected from the neighbourhood of the current point
- If the new point is better, it becomes the current point; otherwise another neighbouring point is selected and evaluated
- The method terminates when there are no improvements, or when a predefined number of iterations is reached
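The steps above can be sketched as a simple hill climber. This is an illustration under stated assumptions, not the lecture's own code: it maximizes, tries the neighbours in random order, and moves to the first one that improves. The stand-in objective (number of ones) and all names are hypothetical.

```python
import random
from typing import Callable, List, Tuple

Bits = Tuple[int, ...]


def hill_climb(
    start: Bits,
    evaluate: Callable[[Bits], float],
    neighbours: Callable[[Bits], List[Bits]],
    max_iters: int,
    rng: random.Random,
) -> Bits:
    """Move to an improving neighbour until none exists or max_iters is reached."""
    current = start
    for _ in range(max_iters):
        candidates = neighbours(current)
        rng.shuffle(candidates)  # assumption: neighbours are tried in random order
        for cand in candidates:
            if evaluate(cand) > evaluate(current):
                current = cand  # the better point becomes the current point
                break
        else:
            return current  # no neighbour is better: stop
    return current


def flip_neighbours(bits: Bits) -> List[Bits]:
    """1-flip neighbourhood of a bit string."""
    return [bits[:i] + (1 - bits[i],) + bits[i + 1:] for i in range(len(bits))]


# Stand-in objective for the demo: the number of ones in the string
result = hill_climb((0,) * 8, sum, flip_neighbours, max_iters=100, rng=random.Random(1))
print(result)  # (1, 1, 1, 1, 1, 1, 1, 1)
```

Both stopping rules from the slide appear: the `else` branch of the inner loop handles "no improvement", and `max_iters` caps the number of iterations.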
20Hillclimbing Methods - 2
- May converge to local optima
- Usually have to start the search from various starting points
- Initial starting points may be chosen
  - randomly
  - according to some regular pattern
  - based on other information (e.g. results of a prior search)
21Hillclimbing Methods - 3
- Variations of hillclimbing algorithms differ in the way a new string is selected for comparison with the current string
- One version of the simple (iterated) hillclimbing method is steepest ascent hillclimbing
22Hillclimbing Methods - 4
- Example problem
- The search space is a set of binary strings $v$ of length 30
- The objective function $f$ (to be maximized):
  - $f(v) = |11 \cdot \mathrm{one}(v) - 150|$
  - where $\mathrm{one}(v)$ returns the number of ones in $v$.
- e.g. $v_1 = (110111101111011101101111010101)$
  - $f(v_1) = |11 \cdot 22 - 150| = 92$
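The arithmetic of the example can be checked with a short script. The objective is read here as f(v) = |11 · one(v) − 150|; the absolute value is an assumption about the original slide (for v1 it makes no difference, since 11 · 22 − 150 is already positive).

```python
def one(v: str) -> int:
    """Number of ones in the bit string v."""
    return v.count("1")


def f(v: str) -> int:
    """Example objective, assuming the absolute-value reading."""
    return abs(11 * one(v) - 150)


v1 = "110111101111011101101111010101"
print(len(v1), one(v1), f(v1))    # 30 22 92
print(f("1" * 30), f("0" * 30))   # 180 150
```

Under this reading the all-ones string gives the largest value (180), while the all-zeros string gives 150 and cannot be improved by flipping any single bit, which makes it a local optimum.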
23Hillclimbing Methods - 5
procedure iterated hillclimber
begin
  t ← 0
  repeat
    local ← FALSE
    select a current string vc at random
    evaluate vc
    repeat
      form 30 new strings in the neighbourhood of vc by flipping single bits of vc
      select vn from the set of new strings with the largest value of the objective function f
      if f(vc) < f(vn) then vc ← vn
      else local ← TRUE
    until local
    t ← t + 1
  until t = MAX
end
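The procedure above can be sketched in Python as follows. Two things are assumptions: the objective is read as f(v) = |11 · one(v) − 150| (absolute value assumed), and the best string over all restarts is returned, which the pseudocode leaves unspecified. Function names are hypothetical.

```python
import random
from typing import Callable, Optional, Tuple

Bits = Tuple[int, ...]


def f(v: Bits) -> int:
    """Example objective, assuming f(v) = |11 * one(v) - 150|."""
    return abs(11 * sum(v) - 150)


def steepest_ascent(vc: Bits, objective: Callable[[Bits], float]) -> Bits:
    """Inner loop: keep moving to the best 1-flip neighbour while it improves."""
    while True:
        neighbours = [vc[:i] + (1 - vc[i],) + vc[i + 1:] for i in range(len(vc))]
        vn = max(neighbours, key=objective)
        if objective(vc) < objective(vn):
            vc = vn
        else:
            return vc  # local = TRUE


def iterated_hillclimber(
    objective: Callable[[Bits], float],
    length: int,
    max_restarts: int,
    rng: random.Random,
) -> Bits:
    """Outer loop: restart from MAX random strings, keep the best result (assumption)."""
    best: Optional[Bits] = None
    for _ in range(max_restarts):
        vc = tuple(rng.randint(0, 1) for _ in range(length))
        vc = steepest_ascent(vc, objective)
        if best is None or objective(vc) > objective(best):
            best = vc
    assert best is not None  # holds whenever max_restarts >= 1
    return best


best = iterated_hillclimber(f, length=30, max_restarts=20, rng=random.Random(0))
print(f(best))
```

With this objective, a start with 13 or fewer ones climbs to the all-zeros string (f = 150) and a start with 14 or more ones climbs to the all-ones string (f = 180), so a single run can end in either optimum depending on where it begins.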
24Hillclimbing Methods - 6
- Success or failure of each iteration depends on the starting point
- Success is defined as returning a local or a global optimum
- In problems with many local optima, a global optimum may not be found
25Hillclimbing Methods - 7
- Weaknesses
  - Usually terminate at solutions that are local optima
  - No information as to how much the discovered local optimum deviates from the global optimum (or even from other local optima)
  - Obtained optimum depends on the starting point
  - Usually no upper bound on computation time
26Hillclimbing Methods - 8
- Advantages
  - Very easy to apply (only a representation, the evaluation function and a measure that defines the neighbourhood around a point are needed)
27Search Techniques Revisited - 1
- Effective search techniques provide a mechanism to balance exploration and exploitation
  - exploiting the best solutions found so far
  - exploring the search space
28Search Techniques Revisited - 2
- Hillclimbing methods exploit the best available solution for possible improvement but neglect exploring a large portion of the search space
- Random search (points in the search space are sampled with equal probability) explores the search space thoroughly but misses exploiting promising regions.
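For contrast with hillclimbing, pure random search can be sketched in a few lines. This is a minimal illustration over bit strings; the function name and the stand-in objective (number of ones, to be maximized) are assumptions.

```python
import random
from typing import Callable, Tuple

Bits = Tuple[int, ...]


def random_search(
    objective: Callable[[Bits], float],
    length: int,
    samples: int,
    rng: random.Random,
) -> Bits:
    """Sample bit strings uniformly at random and keep the best one seen."""
    best = tuple(rng.randint(0, 1) for _ in range(length))
    for _ in range(samples - 1):
        v = tuple(rng.randint(0, 1) for _ in range(length))
        if objective(v) > objective(best):
            best = v
    return best


# Stand-in objective for the demo: the number of ones in the string
best_50 = random_search(sum, length=30, samples=50, rng=random.Random(1))
best_100 = random_search(sum, length=30, samples=100, rng=random.Random(1))
print(sum(best_50), sum(best_100))
```

Each sample is drawn independently of the previous ones, so nothing learned from a good string is used to guide the next draw; this is the "misses exploiting promising regions" part of the slide.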